# PRINCIPLES OF CHAOS ENGINEERING - Principles of Chaos Engineering

URL:: https://principlesofchaos.org/
Author:: principlesofchaos.org
## Highlights
> How much confidence can we have in the complex systems that we put into production? ([View Highlight](https://instapaper.com/read/1386512250/15501443))
> An empirical, systems-based approach addresses the chaos in distributed systems at scale and builds confidence in the ability of those systems to withstand realistic conditions. We learn about the behavior of a distributed system by observing it during a controlled experiment. We call this Chaos Engineering. ([View Highlight](https://instapaper.com/read/1386512250/15501447))
> These experiments follow four steps:
> 1. Start by defining ‘steady state’ as some measurable output of a system that indicates normal behavior.
> 2. Hypothesize that this steady state will continue in both the control group and the experimental group.
> 3. Introduce variables that reflect real world events like servers that crash, hard drives that malfunction, network connections that are severed, etc.
> 4. Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group.
>
> The harder it is to disrupt the steady state, the more confidence we have in the behavior of the system. If a weakness is uncovered, we now have a target for improvement before that behavior manifests in the system at large. ([View Highlight](https://instapaper.com/read/1386512250/15501450))
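
The four steps in the highlight above can be sketched as a toy simulation. This is only an illustration under stated assumptions: `simulate_requests`, `run_experiment`, the success-rate metric, the baseline failure rate, and the tolerance are all hypothetical and do not come from the source. A real experiment would read the steady-state metric from a monitoring system rather than a random number generator.

```python
import random


def simulate_requests(n, failure_rate, seed):
    """Return the success rate of n simulated requests (hypothetical toy system)."""
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n) if rng.random() >= failure_rate)
    return successes / n


def run_experiment(n=10_000, tolerance=0.01, injected_failure_rate=0.0, seed=0):
    """Run one control-vs-experimental comparison and report the outcome."""
    # Step 1: steady state is a measurable output, here the request success rate.
    baseline_failure_rate = 0.001  # assumed value, for illustration only

    # Step 2: hypothesize that both groups stay within `tolerance` of each other.
    control = simulate_requests(n, baseline_failure_rate, seed)

    # Step 3: introduce a variable that stands in for a real-world event,
    # modelled here as extra failure probability in the experimental group.
    experimental = simulate_requests(
        n, baseline_failure_rate + injected_failure_rate, seed + 1
    )

    # Step 4: try to disprove the hypothesis by comparing the two groups.
    deviation = abs(control - experimental)
    return {
        "control": control,
        "experimental": experimental,
        "hypothesis_holds": deviation <= tolerance,
    }


if __name__ == "__main__":
    print(run_experiment(injected_failure_rate=0.0))
    print(run_experiment(injected_failure_rate=0.2))
```

When the injected fault has no visible effect on the success rate, the hypothesis survives and confidence grows. When the deviation exceeds the tolerance, the experiment has uncovered a weakness to fix.
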
> Build a Hypothesis around Steady State Behavior
> Focus on the measurable output of a system, rather than internal attributes of the system. ([View Highlight](https://instapaper.com/read/1386512250/15501454))
> Vary Real-world Events
> Chaos variables reflect real-world events. Prioritize events either by potential impact or estimated frequency. ([View Highlight](https://instapaper.com/read/1386512250/15501457))
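
One way to act on this prioritization is to keep a small catalogue of candidate events and sort it by whichever criterion matters most. The sketch below is hypothetical: the `ChaosEvent` type, the 1 to 5 impact scale, and the yearly frequency figures are made-up placeholders, not data from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChaosEvent:
    name: str
    impact: int        # hypothetical severity score, 1 (minor) to 5 (severe)
    frequency: float   # hypothetical estimated occurrences per year


# Placeholder catalogue; the numbers are illustrative, not measurements.
EVENTS = [
    ChaosEvent("server crash", impact=4, frequency=12.0),
    ChaosEvent("hard drive malfunction", impact=3, frequency=2.0),
    ChaosEvent("severed network connection", impact=5, frequency=6.0),
]


def prioritize(events, by="impact"):
    """Return events sorted from highest to lowest on the chosen criterion."""
    if by not in ("impact", "frequency"):
        raise ValueError("by must be 'impact' or 'frequency'")
    return sorted(events, key=lambda e: getattr(e, by), reverse=True)


if __name__ == "__main__":
    for criterion in ("impact", "frequency"):
        print(criterion, [e.name for e in prioritize(EVENTS, by=criterion)])
```
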
> Run Experiments in Production ([View Highlight](https://instapaper.com/read/1386512250/15501460))
> To guarantee both authenticity of the way in which the system is exercised and relevance to the current deployed system, Chaos strongly prefers to experiment directly on production traffic. ([View Highlight](https://instapaper.com/read/1386512250/15501461))
> Automate Experiments to Run Continuously
> Running experiments manually is labor-intensive and ultimately unsustainable. ([View Highlight](https://instapaper.com/read/1386512250/15501462))
> Minimize Blast Radius ([View Highlight](https://instapaper.com/read/1386512250/15501464))
> While there must be an allowance for some short-term negative impact, it is the responsibility and obligation of the Chaos Engineer to ensure the fallout from experiments is minimized and contained. ([View Highlight](https://instapaper.com/read/1386512250/15501466))
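
A common way to contain fallout is to expose only a small fraction of traffic to the fault and to stop as soon as an error budget is exceeded. The following is a minimal sketch of that idea, not a method described in the source. The function name, the traffic fraction, the abort threshold, and the simulated fault are all assumptions chosen for illustration.

```python
import random


def run_with_blast_radius(
    requests,
    traffic_fraction=0.01,   # assumed share of requests exposed to the fault
    abort_error_rate=0.05,   # assumed error rate at which the experiment stops
    min_samples=20,          # wait for this many exposed requests before judging
    fault_error_rate=0.5,    # simulated chance that the fault breaks a request
    seed=0,
):
    """Simulate a fault-injection run that aborts once errors exceed the limit."""
    rng = random.Random(seed)
    exposed = errors = 0
    for _ in range(requests):
        if rng.random() >= traffic_fraction:
            continue  # request is outside the experiment and served normally
        exposed += 1
        if rng.random() < fault_error_rate:
            errors += 1
        # Abort early so the remaining traffic is never exposed to the fault.
        if exposed >= min_samples and errors / exposed > abort_error_rate:
            return {"aborted": True, "exposed": exposed, "errors": errors}
    return {"aborted": False, "exposed": exposed, "errors": errors}


if __name__ == "__main__":
    print(run_with_blast_radius(100_000, fault_error_rate=1.0))
    print(run_with_blast_radius(100_000, fault_error_rate=0.0))
```

The two knobs work together: `traffic_fraction` limits how many users can be affected at all, and the abort check limits how long they stay affected.
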
---
Title: PRINCIPLES OF CHAOS ENGINEERING - Principles of Chaos Engineering
Author: principlesofchaos.org
Tags: readwise, articles
date: 2024-01-30
---