When the Average is a Poor Metric for Measuring Application Performance | Loggly (2024)

Systematically measuring application performance is the best way to make steady improvements that translate into a much better customer experience. If you are measuring and monitoring the performance of distributed applications, you may be inclined to use average values. In this blog post, which is the first in a mini-series, I’ll discuss three reasons why this may not be a great idea, especially when tracking latency.

For identifying performance issues, latency is essential. Paraphrasing performance expert Greg Brendan, latency is time spent waiting, and it has a direct impact on performance when caused by a synchronous process within an application request, thus making interpretation straightforward — the higher the latency, the worse the application performance. This is in contrast to other metrics used in performance analysis — utilization, IOPS (I/O per second) and throughput — which are better suited for capacity planning and understanding the nature of workloads.

Here are three reasons why average values are not always the best way to understand latency for an internet application:

  1. You may have outliers, which will distort your average values.
  2. Even if you do not have outliers, you may be averaging different populations.
  3. Even if you have a single population, that population may have a skewed distribution (which in fact is normally the case for Internet applications).

1. Outliers will distort average performance values, but throw them out at your peril.

Outliers in latency data can result from multiple causes, from a very unusual event to poor logging practices. For example, a user’s poorly crafted query can generate a very high response time that shows up in your logs. Poor application logging practices can also introduce outliers. In one case, when a query took longer than a certain threshold, the developer simply printed the timestamp out to the log, instead of the actual response time, generating response times on the order of years!

Care must be taken, however, when hardcoding thresholds separating “valid” data from outliers because you run the risk of discarding valuable data with the bad. To avoid this, I try to remind myself how NASA missed the discovery of the ozone hole above Antarctica. Their scientists had programmed their software to discard satellite data showing values below 180 dobson units for ozone concentration, flagging these readings as instrument errors (Green, P. and Gabor, G, “Misleading Indicators: How to Reliably Measure your Business, p. 77).

2. You don’t want to average different populations.

Even if you do not have outliers, you could be mixing data from different populations. If the underlying differences are large enough, the average will not be representative of the entire population. For instance, you may have a reasonable average for your application response time. But if the 10% slowest times all come from the your most profitable customers, the average is hiding the fact that your business it at risk.

Exploring the distribution of data using visualization and classification in order to find clusters that make sense is key. Here is an example of the height distribution among professional basketball players:

Distribution of professional basketball players by height (NBA and WNBA)

The above histogram masks the fact that the distribution for WNBA and NBA players is quite distinct, so the average for the group is not representative of either population. In fact, the tallest WNBA player would have average height in the NBA league:

Different populations need to be identified in order to make sense of data distributions
Source: https://wagesofwins.com

In the realm of latency performance, oftentimes responses resulting from cached data will have a very different distribution from those that need to be fetched from disk. The type and characteristics of a query will also impact latency. Click herefor an example by Mike Bostock from Github data ().

3. Skewed distributions aren’t well represented by averages.

While the bell curve, or normal distribution, is of fundamental importance to statistics, there are other distributions that are universally applicable in the realm of real things, and also turn out to be highly skewed. These include Benford’s Law, the Pareto Distribution and Zipf’s Law.

Lets focus on Benford’s Law. We naively assume that in any real-world dataset, each first digit from 1 to 9 has the same probability of occurring — at about a 11%. On the contrary, Benford Law states that the distribution of first digits will have 1 as the most common digit and 9 as the least common, in a precise logarithmic relationship.

For example, here is leading digit frequency in Twitter follower counts:

Skewed distributions describe the real world in surprising ways
Source: www.testingbenfordslaw.com

Benford’s law states that “1” occurs 30% of the time as first digit, digit “2” 17.6% of the time, digit “3” 12.4% of the time, and so forth, with the expected proportion of numbers beginning with leading digit n being . Correspondence with Benford’s Law gives us a higher degree of confidence that a dataset represents real phenomena and is not manipulated, rounded or truncated. For the law to hold, the data must also take values as positive numbers, range over many different orders of magnitude, and arise from a combination of largely independent factors.

Benford’s Law is used to detect potential fraud in several instances, such as: elections, tax returns, macroeconomic data, scientific results, etc. In other instances, such as budget planning, it can help detect poor practices, also known as wild guesses.

Long-tailed Distributions are Common in Internet Applications

So having established that skewed distributions are also universal using Benford’s Law as one example, it’s important to note that latency performance metrics for Internet-based applications are usually not normally distributed, but rather have a distribution with a heavy tail.

The exact cause is not well defined, but is related to the fact that there are complex relationships between system components which are not well understood . For example: (i) load distribution among nodes in a distributed application is not always even, and initial variations can have long-range effects; (ii) at the application and hardware level, some data is fetched from memory cache, other directly from disk; (iii) Resource contention causes latency in components, in patterns which are not well understood, for instance, in disk arrays, etc.

Poor latency performance has two important dimensions which need to be addressed. There is the objective fact — the hard numbers — which need to be explored in a high degree of detail in order to find root causes.

Yet performance in customer-facing applications is also subjective — what is acceptable today, may not be tomorrow, and what is acceptable in some contexts or by some customers, is not acceptable by others. In my next blog post I will look at a performance metric to measure latency that takes this subjective factor into account.

Are you interested in how to apply data to your optimization efforts?

I’ll be participating in a VentureBeat webinar on this topic this Thursday, December 11, 2014 at 9:30 a.m. PST. VentureBeat has rounded up a great panel of big data stars to discuss strategies for taking mountains of data and turn it into actionable insights. So register now and don’t miss out!

Further Reading

The Loggly and SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.
When the Average is a Poor Metric for Measuring Application Performance | Loggly (2024)
Top Articles
PowerOfStocks_5EMA — Indicator by Keanu_ritz
Set up a card for contactless payments - Android
English Bulldog Puppies For Sale Under 1000 In Florida
Katie Pavlich Bikini Photos
Gamevault Agent
Pieology Nutrition Calculator Mobile
Hocus Pocus Showtimes Near Harkins Theatres Yuma Palms 14
Hendersonville (Tennessee) – Travel guide at Wikivoyage
Doby's Funeral Home Obituaries
Compare the Samsung Galaxy S24 - 256GB - Cobalt Violet vs Apple iPhone 16 Pro - 128GB - Desert Titanium | AT&T
Vardis Olive Garden (Georgioupolis, Kreta) ✈️ inkl. Flug buchen
Craigslist Dog Kennels For Sale
Things To Do In Atlanta Tomorrow Night
Non Sequitur
Crossword Nexus Solver
How To Cut Eelgrass Grounded
Pac Man Deviantart
Alexander Funeral Home Gallatin Obituaries
Shasta County Most Wanted 2022
Energy Healing Conference Utah
Aaa Saugus Ma Appointment
Geometry Review Quiz 5 Answer Key
Hobby Stores Near Me Now
Icivics The Electoral Process Answer Key
Allybearloves
Bible Gateway passage: Revelation 3 - New Living Translation
Yisd Home Access Center
Home
Shadbase Get Out Of Jail
Gina Wilson Angle Addition Postulate
Celina Powell Lil Meech Video: A Controversial Encounter Shakes Social Media - Video Reddit Trend
Walmart Pharmacy Near Me Open
Marquette Gas Prices
A Christmas Horse - Alison Senxation
Ou Football Brainiacs
Access a Shared Resource | Computing for Arts + Sciences
Vera Bradley Factory Outlet Sunbury Products
Pixel Combat Unblocked
Cvs Sport Physicals
Mercedes W204 Belt Diagram
'Conan Exiles' 3.0 Guide: How To Unlock Spells And Sorcery
Teenbeautyfitness
Where Can I Cash A Huntington National Bank Check
Topos De Bolos Engraçados
Sand Castle Parents Guide
Gregory (Five Nights at Freddy's)
Grand Valley State University Library Hours
Holzer Athena Portal
Hello – Cornerstone Chapel
Stoughton Commuter Rail Schedule
Selly Medaline
Latest Posts
Article information

Author: Margart Wisoky

Last Updated:

Views: 5953

Rating: 4.8 / 5 (78 voted)

Reviews: 85% of readers found this page helpful

Author information

Name: Margart Wisoky

Birthday: 1993-05-13

Address: 2113 Abernathy Knoll, New Tamerafurt, CT 66893-2169

Phone: +25815234346805

Job: Central Developer

Hobby: Machining, Pottery, Rafting, Cosplaying, Jogging, Taekwondo, Scouting

Introduction: My name is Margart Wisoky, I am a gorgeous, shiny, successful, beautiful, adventurous, excited, pleasant person who loves writing and wants to share my knowledge and understanding with you.