Hacker News
Be careful how you average – a retail example (custora.com)
45 points by cpierson on Aug 24, 2012 | 6 comments


Stanford's Sam Savage has written an excellent book on the topic of disaggregating average data and mistakes made from using averages. Here's a summary of the book on his website: http://www.stanford.edu/~savage/flaw/

and here's a link to his book, titled The Flaw of Averages: http://www.amazon.com/Flaw-Averages-Underestimate-Risk-Uncer...


Cool book, but sad that the public on average seems hopeless at understanding that variance exists.


I was never a math kingpin, but my last startup was stock market/trading related so I got to brush shoulders with some brilliant analysts.

Their advice to me was: "anytime you think you want to do a simple average, you'd be better served by displaying a histogram of averages".

I think this completely applies here too, since it would help you quickly see whether (a) the bulk of your customers have a low repeat price and the average is buoyed up by a few large purchases, or (b) one customer orders a whole bunch of tiny items at a low price, dragging the average down.
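
To make case (a) concrete, here is a rough sketch in Python. The order values are made up for illustration, and `histogram` is a hypothetical helper, not something from the linked article. It only shows how a couple of large purchases can pull the mean far away from what a typical customer spends, and how a crude text histogram exposes that.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical order values in dollars (assumed data, not from the article):
# eight small orders and two very large ones.
order_values = [12, 15, 9, 14, 11, 13, 10, 16, 480, 520]


def histogram(values, bin_width):
    """Count values per bin, keyed by each bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


avg = mean(order_values)
mid = median(order_values)
bins = histogram(order_values, 100)

print(f"mean={avg:.1f} median={mid}")
for lower, count in bins.items():
    # One '#' per order that falls in the bin [lower, lower + 99].
    print(f"{lower:>4}-{lower + 99:<4} {'#' * count}")
```

The mean lands in a range where no customer actually sits, while the histogram shows one big cluster of small orders plus two outliers.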


I see this type of thing come up all the time when monitoring complex production systems.

Say you have 10 servers in each of 3 datacenters and you are looking at request latency. Averaging all 30 servers is very different from averaging within each datacenter and then averaging/alerting on a DC-by-DC basis.
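
A small sketch of that difference in Python, under invented assumptions: the latency numbers, the datacenter names, and the 100 ms threshold are all hypothetical. It shows one way a fleet-wide average can stay under an alert threshold while a single datacenter is clearly degraded.

```python
from statistics import mean

# Hypothetical per-server mean latency in ms (assumed numbers),
# 10 servers in each of 3 datacenters. dc3 is degraded across the board.
latency_ms = {
    "dc1": [40] * 10,
    "dc2": [45] * 10,
    "dc3": [150] * 10,
}

THRESHOLD_MS = 100  # assumed alert threshold

# Option 1: pool all 30 servers into one average.
all_servers = [ms for servers in latency_ms.values() for ms in servers]
fleet_avg = mean(all_servers)
fleet_alert = fleet_avg > THRESHOLD_MS

# Option 2: average within each datacenter, then alert per datacenter.
dc_avg = {dc: mean(servers) for dc, servers in latency_ms.items()}
alerts = [dc for dc, avg in dc_avg.items() if avg > THRESHOLD_MS]

print(f"fleet average: {fleet_avg:.1f} ms, alert: {fleet_alert}")
print(f"per-DC averages: {dc_avg}, alerting: {alerts}")
```

With equal server counts the average of the three datacenter averages equals the pooled average, so the gain here comes from alerting on each datacenter separately rather than from the second round of averaging.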


TL;DR - use weighted averages. And there's a reason why people use median and mode. :)
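
For what it's worth, a quick sketch of what that looks like in Python. The segment and order numbers are invented for illustration. It compares a naive average of per-segment averages with one weighted by customer count, then shows median and mode on a skewed list of orders.

```python
from statistics import median, mode

# Hypothetical customer segments (assumed data):
# (average order value in dollars, number of customers in the segment)
segments = [(20.0, 900), (200.0, 100)]

# Naive: treat both segments as equally important.
naive_avg = sum(avg for avg, _ in segments) / len(segments)

# Weighted: each segment counts in proportion to its customer count.
total_customers = sum(n for _, n in segments)
weighted_avg = sum(avg * n for avg, n in segments) / total_customers

# Median and mode are much less sensitive to one extreme order.
orders = [10, 10, 10, 12, 15, 20, 300]
typical_median = median(orders)
typical_mode = mode(orders)

print(f"naive={naive_avg} weighted={weighted_avg}")
print(f"median={typical_median} mode={typical_mode}")
```

The naive figure is pulled up by a segment that holds only a tenth of the customers, while the weighted figure reflects where most customers actually are.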


Yeah, personalized analysis beats treating the population as uniform.



