I used to work at a company that cached international shipping rates. Those rates are BASICALLY set by country, because most of the fuel is burned moving the package the gross distance from one country to another.
But every once in a while we'd end up paying double for a shipment because the customer was way out in the sticks or whatever. Had we done something more dynamic, like taking the whole address into account when quoting a price to a customer, we'd never have been bitten by this kind of problem. But it only happened a couple of times a month, so I didn't worry too much about it.
My replacement found that this bothered him a lot, and he figured he'd score points by fixing it. So he did exactly that: he transitioned the entire quote system from local database lookups to remote UPS/FedEx/USPS/etc calls. That's 2-4 rates per shipper (Ground, Air, etc) for a total of about 10-15 calls every time a customer wanted a quote. And because we would repackage stuff (it was a logistics company), we often didn't know the exact weight, so we'd quote 3-4 prices so people could get a feel for which choice would get them the best rate without delaying everything by an extra day or two to get a hard quote.
We cached these rates by country and weight (up to 1000lbs), so between all the service offerings and whatnot it came to about 100,000 entries in our actual (but occasionally incorrect) database. So there were two choices:
1. Don't do any caching and just look everything up in realtime for customers. They're web APIs, so there's latency on every lookup.
2. Cache, but on a per-address basis. We had an address book for our customers, so we knew the handful of addresses they would want to ship to, and we could aggressively warm the cache so that all the rates would already be there. But there were about 10k unique addresses in the database * 100k total rates = 1 billion rates that needed to be cached.
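The back-of-the-envelope math for option 2 is easy to sketch. The entry counts come from the numbers above; the per-entry byte size is a made-up figure purely for illustration:

```python
# Back-of-the-envelope for per-address rate caching.
# Entry counts come from the discussion above; byte size is hypothetical.
addresses = 10_000            # unique addresses in the customer address book
rates_per_address = 100_000   # country x weight x service rate combinations

total_entries = addresses * rates_per_address
print(f"{total_entries:,} cached rates")  # 1,000,000,000

# Even at a modest (assumed) 50 bytes per entry, that's ~50 GB of cache:
bytes_per_entry = 50
print(f"~{total_entries * bytes_per_entry / 1e9:.0f} GB")
```

One billion entries is the kind of number that should end a design discussion on its own, which is exactly why it went into the meeting.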
When I presented this back-of-the-envelope calculation in a meeting, do you know how he blew it all off?
"Premature optimization is the root of all evil" -- Donald Knuth
I was so flabbergasted that someone could be so aggressively ignorant, and yet somehow twist Knuth's words to support their own position, that I simply gave up. I was dealing with a powerful stupidity and it was stronger than me.
I later heard that during the transition it was touch-and-go for about a week, and they had to issue a lot of credit to pissed-off customers. The rate quoting page went from about 20ms to render (and maybe 300ms to load) to about 4 seconds.
If my system requirements say something must run in 100ms, dismissing options that have a minimum 2s latency (however useful they are) is not "premature optimization". It's a system requirement, and options B, C and D don't make the cut. Simple as that.
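That filtering step is trivial to express in code. This is a hypothetical illustration only; the option names and latency figures are made up to show the shape of the decision, not taken from the actual system:

```python
# A hard latency requirement acts as a filter on design options,
# not as an optimization target. All names and numbers are invented.
REQUIREMENT_MS = 100

options = {
    "A: local cached lookup": 20,
    "B: live carrier API": 2000,
    "C: live API with retry": 2500,
    "D: aggregated live quotes": 4000,
}

viable = {name: ms for name, ms in options.items() if ms <= REQUIREMENT_MS}
print(viable)  # only option A survives the cut
```

The point is that the expensive options never even reach the "is this optimization premature?" question; they're eliminated by the requirement itself.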