%% date:: [[2023-02-23]] parent:: [[Exploratory testing]], [[Performance Testing]], [[Mental models]] %%

# [[Performance test heuristics]]

Here are some [[Mental models|heuristics]] for use in [[Exploratory testing]] for [[Performance Testing|performance]]. Go through the list or roll on them the way you would on [[Random Tables]]. This approach emphasises a way to do structured and productive [[Play]] when [[Software Testing|testing]] an application.

<iframe width="560" height="315" src="https://www.youtube.com/embed/6wVwdlMwsJ8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

## Code

`dice: [[Performance test heuristics#^code]]`

| Problem | Description |
| ------- | ----------- |
| [[Race condition]] | Are there areas of the application code that are prone to race conditions? What does a user see when one occurs? |
| [[Error handling]] | Does the code catch known errors? What error messages are displayed to users? Are they human-readable? |
| [[Prebuilding]] | Could the code anticipate user requests before they're made? |
| TODO | Do a search for `TODO` in the code. Any hits? |
| [[Asynchronicity]] and [[Parallelism]] | Is there anything in the code that is synchronous when it could/should be handled in parallel? |
| [[Load minimum necessary components first]] | Could you load lighter, essential elements first before heavier ones? |
| [[Distributed runqueue]] | If your code is executed simultaneously, could it benefit from shared global queues to pass information between instances? |
| Embedded resources | Does your code call resources that return 404s because they've been moved? |
| [[Principle of Atomicity]] | Are you reusing code that could be modularized for more efficient reuse? |
^code

## Script

`dice: [[Performance test heuristics#^script]]`

| Problem | Description |
| ------- | ----------- |
| [[The 1 Thread=1 Virtual User Paradigm is flawed]] | How are you generating the traffic? Is there a perceivable ceiling to your test throughput? |
| [[User concurrency is an ambiguous measure of throughput]] | Are you measuring throughput in terms of user concurrency? Is that how the rest of the team measures it too? |
| [[Dynamic think time and pacing]] | Are you staggering throughput by using static think time? How much think time do the users take? See the sketch below this table. |
| [[Test Scenarios]] | What scenarios are you *not* covering with your tests? |
| [[Cache management]] | Does your script handle a cache the way your application does? |
| Historical data | Does history concur with your interpretation of the scenarios and volumes you're testing? How else could you confirm your test design with hard data? |
| [[Correlating dynamic data]] | Are you passing any hard-coded parameters in your script? What are the implications of doing so? |
^script
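As an illustration of the think-time row above, here is a minimal sketch of dynamic think time. It assumes a Locust-based script and a hypothetical `/products` endpoint; any load tool with randomised waits expresses the same idea.

```python
from locust import HttpUser, between, task


class BrowsingUser(HttpUser):
    # Dynamic think time: each virtual user waits a random 2-8 seconds
    # between tasks instead of a fixed interval, so requests stop
    # arriving in artificial lockstep.
    wait_time = between(2, 8)

    @task
    def browse(self):
        # Hypothetical endpoint; substitute a recorded user journey.
        self.client.get("/products")
```

A fixed `wait_time` makes every virtual user pace identically, which flattens the arrival pattern into something no real user population produces.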
## Infrastructure

`dice: [[Performance test heuristics#^infra]]`

| Problem | Description |
| ------- | ----------- |
| [[Thundering herd problem]] | Which components are most likely to experience the thundering herd problem? What can you do about it? |
| [[Circuit breaker pattern]] | Are there any circuit breakers built into the system? |
| [[Dimming]] | Are there dimming or brownout mechanisms in place to protect the system from load beyond expected limits? |
| [[Chaos Engineering]] | How much damage could a monkey do if they were left in your data centre with a wrench? |
| [[Retry storm]] | What safeties are in place to prevent retry storms? See the backoff sketch below this table. |
| [[Failover Test]] | When was the last time you did a failover test? How did the system behave? |
| Watching the watchers | Have you load tested your [[Observability]] stack? Are there redundancies in place? |
| Redundancies | What *doesn't* have redundancies in the system? |
| [[Infrastructure as cattle, not pets]] | Do you have pets among components? How could you treat them more like cattle instead? |
| Queues and messaging | Which queues fill up the fastest? What happens when they do? |
| Indexing | Are your databases indexed to support the queries you run? |
| [[Sharding]] | Could you shard data to reduce traffic and resources? |
| Resource reuse | Are there any opportunities for reusing existing resources instead of starting new instances? |
| [[Thread priority\|Priority]] | Which components are mission critical? Could you prioritize some processes over others in the case of extreme load? |
| [[Dynamic scaling]] | What are the limits for scaling components? Have you tested them? Do they scale *back* when not needed as well? How much does it all cost? |
| [[Distributed tracing]] | How much insight do you have into the processes or components that make up the majority of the response time as a request travels through your system? |
| [[Monitoring]] and [[Observability]] | How much of the stack have you instrumented? How confident are you about being able to identify the root causes of production incidents? |
^infra
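One way to read the retry-storm row above: clients need a backoff policy so a single hiccup doesn't multiply into thousands of simultaneous retries. A minimal sketch in plain Python, with a hypothetical `call_service` callable standing in for your real client:

```python
import random
import time


def call_with_backoff(call_service, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Retry a flaky call with exponential backoff and full jitter.

    Jitter spreads retries out in time so thousands of clients do not
    all retry at the same instant and turn one failure into a retry storm.
    """
    for attempt in range(max_attempts):
        try:
            return call_service()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Exponential backoff capped at max_delay, with full jitter.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            time.sleep(delay)
```

Layering a retry budget or a [[Circuit breaker pattern]] on top of this stops retries entirely once the dependency is clearly down.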
## Psychology

`dice: [[Performance test heuristics#^psych]]`

| Problem | Description |
| ------- | ----------- |
| [[Cognitive Biases]] | What cognitive biases are you and your team susceptible to? |
| [[Load minimum necessary components first]] | Could you speed up perceived performance by loading essential elements first? |
| Error messages | Are the error messages human-readable? |
| [[Real User Monitoring]] | Have you considered using RUM to glean insights about actual user behavior? |
| [[Roleplaying]] | Put yourself into the persona of a particular kind of user. How are they feeling? Why are they using the application? |
| Above the table | What's going on beyond the computer? What are humans likely to do with the information the application provides? |
| Other teams | What other teams might have to work overtime if a performance incident occurs? Are there real-life processes in place to loop them in? |
| Consequences of speed | If application performance surpasses expectations, can the rest of the business keep up the pace? What are the SLAs for Customer Support? |
^psych

## Principles

`dice: [[Principles of testing#^principle]]`

## Resources

- [[Principles of improving work performance]]
- [[Factors affecting application performance]]