Be careful how you average – a retail example

simulate · on Aug 24, 2012

Stanford's Sam Savage has written an excellent book on the topic of disaggregating average data and mistakes made from using averages. Here's a summary of the book on his website: http://www.stanford.edu/~savage/flaw/

and here's a link to his book, titled The Flaw of Averages: http://www.amazon.com/Flaw-Averages-Underestimate-Risk-Uncer...

Evbn · on Aug 25, 2012

Cool book, but sad that the public on average seems hopeless at understanding that variance exists.

true_religion · on Aug 24, 2012

I was never a math kingpin, but my last startup was stock market/trading related, so I got to rub shoulders with some brilliant analysts.

Their advice to me was "anytime you think you want to do a simple average, you'd be better served by displaying a histogram of averages".

I think this completely applies here too, since it would help you quickly see whether (a) the bulk of your customers have a low repeat price and the average is buoyed by a few large purchases, or (b) one customer orders a whole bunch of tiny items at a low price, dragging the averages down.
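
To make case (a) concrete, here is a minimal sketch in Python. The order values, the `histogram` helper, and the bucket width are all made up for illustration; none of it comes from the article.

```python
from collections import Counter
from statistics import mean

# Hypothetical order values in dollars: many small orders, two very large ones.
orders = [12, 15, 9, 14, 11, 13, 10, 16, 480, 520]

def histogram(values, bucket_width):
    """Count how many values fall into each bucket of the given width."""
    counts = Counter((v // bucket_width) * bucket_width for v in values)
    return dict(sorted(counts.items()))

avg = mean(orders)                # one number, dominated by the two big orders
buckets = histogram(orders, 100)  # bucket start -> number of orders

print(f"mean = {avg}")
for start, count in buckets.items():
    print(f"{start:>4}-{start + 99:<4} {'#' * count}")
```

The mean lands in a bucket that contains no orders at all, which is exactly what the histogram makes visible and the single average hides.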

jbeda · on Aug 24, 2012

I see this type of thing come up all the time when monitoring complex production systems.

Say you have 10 servers in each of 3 datacenters and you are looking at request latency. Averaging across all 30 servers is very different from averaging within each datacenter first and then comparing/alerting on a dc-by-dc basis.
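
A rough sketch of what I mean, in Python. The latency numbers, datacenter names, and alert threshold are hypothetical, and a real monitoring system would likely use request-weighted data or percentiles rather than plain per-server means.

```python
from statistics import mean

# Hypothetical per-server request latencies in milliseconds, 10 servers per datacenter.
latency_ms = {
    "dc-a": [100] * 10,
    "dc-b": [110] * 10,
    "dc-c": [90] * 5 + [700] * 5,  # half of this datacenter is degraded
}
ALERT_THRESHOLD_MS = 300  # hypothetical alerting threshold

# View 1: one average across all 30 servers.
all_servers = [ms for servers in latency_ms.values() for ms in servers]
overall = mean(all_servers)

# View 2: average within each datacenter, then alert per datacenter.
per_dc = {dc: mean(servers) for dc, servers in latency_ms.items()}
alerting = [dc for dc, avg in per_dc.items() if avg > ALERT_THRESHOLD_MS]

print(f"overall: {overall:.1f} ms, alert: {overall > ALERT_THRESHOLD_MS}")
print(f"per dc: {per_dc}, alerting: {alerting}")
```

With these numbers the 30-server average stays under the threshold while the per-datacenter view flags dc-c, because the two healthy datacenters dilute the degraded one.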

binarysolo · on Aug 24, 2012

TL;DR - use weighted averages. And there's a reason why people use median and mode. :)
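
A quick Python sketch of the difference, with made-up customer data (the tuples and the list of order values below are assumptions for illustration only):

```python
from statistics import mean, median, mode

# Hypothetical (average_order_value, number_of_orders) per customer.
customers = [(20, 50), (25, 40), (200, 1), (22, 30)]

# Unweighted: every customer counts equally, however few orders they placed.
unweighted = mean(value for value, _ in customers)

# Weighted: each customer's average counts in proportion to their order count.
total_orders = sum(n for _, n in customers)
weighted = sum(value * n for value, n in customers) / total_orders

# Median and mode on hypothetical raw order values resist the single outlier.
order_values = [20, 20, 22, 25, 200]
mid = median(order_values)
most_common = mode(order_values)

print(f"unweighted={unweighted}, weighted={weighted:.2f}")
print(f"median={mid}, mode={most_common}")
```

The one-order customer at 200 pulls the unweighted average far above what a typical order looks like; weighting by order count, or using the median or mode, keeps that single purchase in proportion.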

Evbn · on Aug 25, 2012

Yeah, personalized analysis beats treating the population as uniform.