Welcome back from #AWSReInvent2019! We hope everyone had an enjoyable visit and safe travels. If you are still there – lucky you!

While visiting the vendor pavilion at the Venetian resort, we came across many vendors, some well known and some less so. Gremlin is one of the gems we found, and one we thought was well worth sharing with you.

First – some background – what is Chaos Engineering and Chaos Monkeys?

Chaos engineering is the discipline of experimenting on a software system in production in order to build confidence in the system’s capability to withstand turbulent and unexpected conditions.

In software development, a system’s ability to tolerate failures while still ensuring adequate quality of service (often generalized as resilience) is typically specified as a requirement. However, development teams often fail to meet this requirement due to factors such as short deadlines or lack of knowledge of the field. Chaos engineering is a technique for meeting the resilience requirement.
It can be used to build resilience against infrastructure, network, and application failures.

Chaos Monkey is a tool invented in 2011 by Netflix to test the resilience of its IT infrastructure. It works by intentionally disabling computers in Netflix’s production network to test how remaining systems respond to the outage. Chaos Monkey is now part of a larger suite of tools called the Simian Army designed to simulate and test responses to various system failures and edge cases.
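To make the idea concrete, here is a toy sketch of a Chaos Monkey style experiment. It is not Netflix's code: the instance names, the `run_experiment` helper, and the `min_healthy` threshold are all made up for illustration, and nothing is actually shut down. It only simulates picking a random instance, removing it from the pool, and checking whether enough instances remain.

```python
import random


def pick_victim(instances, rng):
    """Choose one instance at random (sorted first so a seeded run is repeatable)."""
    return rng.choice(sorted(instances))


def run_experiment(instances, min_healthy, rng):
    """Simulate terminating one random instance.

    Returns the victim, the surviving instances, and whether at least
    `min_healthy` instances are still left (a hypothetical health rule).
    """
    victim = pick_victim(instances, rng)
    survivors = set(instances) - {victim}
    return victim, survivors, len(survivors) >= min_healthy


if __name__ == "__main__":
    rng = random.Random(2011)  # fixed seed keeps the demo repeatable
    fleet = {"web-1", "web-2", "web-3"}  # hypothetical instance names
    victim, survivors, healthy = run_experiment(fleet, min_healthy=2, rng=rng)
    print(f"terminated {victim}; survivors={sorted(survivors)}; healthy={healthy}")
```

In a real experiment the "terminate" step would call your cloud provider or orchestration tooling, and the health check would come from monitoring rather than a simple count.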


Now that we’ve covered some of the basics, back to Gremlin. Gremlin is a “failure-as-a-service” platform built to make the Internet more reliable. It turns failure into resilience by offering engineers a fully hosted solution for safely experimenting on complex systems, so they can identify weaknesses before those weaknesses impact customers and cause revenue loss.

During our flight back from Las Vegas, I decided to run a trial of Gremlin for a very simple use case: CPU saturation and its impact on a web application. What better test than our own VVL Systems website?

The video below walks you through a simple CPU saturation test and measures its impact with New Relic. Gremlin offers many other failure simulations for Chaos Engineering, which we’ll dive deeper into as we get more familiar with the platform.
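If you want a feel for what CPU saturation means without any tooling, the sketch below is one rough way to approximate it in plain Python. It is not how Gremlin implements its CPU experiment; `burn_cpu` and `saturate` are hypothetical helpers that simply run a busy loop on each core for a fixed time. Only try it on a test machine you control, and watch the effect in whatever monitoring you already have.

```python
import multiprocessing as mp
import time


def burn_cpu(seconds):
    """Busy-loop on one core for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def saturate(cores, seconds):
    """Start one busy-loop process per core, then wait for all of them to finish."""
    workers = [mp.Process(target=burn_cpu, args=(seconds,)) for _ in range(cores)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


if __name__ == "__main__":
    # Assumption for the demo: load every core for 5 seconds, then stop.
    saturate(cores=mp.cpu_count(), seconds=5)
```

The fixed duration matters: a chaos experiment should always have a clear end (and ideally a way to halt early) so the system can return to normal.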

About the author:

Vinnie Lima

Vinnie Lima is the Managing Director for VVL Systems & Consulting, a small business focusing on IT Optimization for Cloud, Infrastructure, and End Users. Based in Baltimore, Maryland, Vinnie Lima has over 21 years of experience in IT Automation, Orchestration, and Cloud. Mr. Lima’s career has focused on helping customers drive value from their IT investments through leading-edge technologies and approaches, driving innovation across a wide spectrum of industries such as DoD, Federal, Health Care, and Financial.
