Talk Title: Time for Some Chaos!
How do you test your infrastructure for resilience and reliability? Create some chaos!
Let us understand Chaos Engineering, a branch of Site Reliability Engineering (SRE) that simulates failures in your infrastructure by injecting faults. You proactively test your infrastructure by breaking things, with the ultimate goal of improving the stability of your systems. The talk also includes a demo using popular Chaos Engineering tools such as Gremlin and Litmus.
The demo covers the fundamentals of a Chaos Engineering platform running on Google Kubernetes Engine: understanding and designing Chaos Workflows, and simulating random pod deletion, network traffic and latency degradation, and disk fill on ephemeral and persistent storage. We shall also cover the role of observability in Chaos Engineering and some important metrics for determining the resiliency score of your applications.
Watch the recorded session of this talk: