
Re: assumptions about distribution

But if the assumption is that they are all using only a couple of slots of that root table, then the root table is not a performance advantage. It would be faster to just start the trie immediately and branch on the first bit.

The point of having a structure like that at the top is that it is a cheap-o form of level compression, and increases the fan-out of the root node, allowing you to cut the search space down much faster given a single very cheap lookup.
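A minimal sketch of what that level compression buys you: a root table that consumes the top K bits of a 32-bit key in a single cheap lookup gives the root a fan-out of 2**K instead of 2 (one bit per trie level). The names and constants here are illustrative, not from any particular implementation.

```python
K = 5                      # bits consumed by the root table
ROOT_SIZE = 1 << K         # 32 slots at the root

def root_index(key: int) -> int:
    """Top K bits of a 32-bit key select the root slot,
    replacing K single-bit branches with one array index."""
    return (key >> (32 - K)) & (ROOT_SIZE - 1)

# One lookup cuts the search space by a factor of 32:
assert root_index(0x80000000) == 16   # leading 1 bit -> upper half
assert root_index(0x00000001) == 0    # small key -> slot 0
```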

As for segregated fit, the bucket array itself probably suffices for that. What /could/ make it interesting is if you /also/ stated that "small" (many-leading-zero) keys happened to be much more prevalent, and in fact twice as prevalent as those in the next bucket up.

Otherwise, it is just further unbalancing his trie while getting roughly the same tradeoff as if he were using a PATRICIA trie (which will use path compression to eat long runs of 0s, and will otherwise be roughly as unbalanced, without the cost of the bucket array, which amounts to many internal nodes' worth of memory).
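To make the path-compression point concrete, here is a toy sketch (an assumed structure, not any specific library) of how a PATRICIA/radix node eats leading zeros: each node records the bit index it tests, so bit positions that never distinguish any pair of stored keys cost no internal nodes at all. A real PATRICIA lookup also does a final full-key comparison at the leaf, which this sketch omits.

```python
class PatriciaNode:
    def __init__(self, bit=None, left=None, right=None, key=None):
        self.bit = bit      # bit index tested here (may skip ahead many bits)
        self.left = left
        self.right = right
        self.key = key      # leaf payload (None for internal nodes)

# Keys 0b0001 and 0b0011 first differ at bit index 1 (from the LSB),
# so a single node testing bit 1 separates them -- all the leading
# zero bits are skipped rather than walked one trie level at a time.
leaf_a = PatriciaNode(key=0b0001)
leaf_b = PatriciaNode(key=0b0011)
root = PatriciaNode(bit=1, left=leaf_a, right=leaf_b)

def lookup(node: PatriciaNode, key: int) -> int:
    """Descend by testing only the recorded bit at each node."""
    while node.key is None:
        node = node.right if (key >> node.bit) & 1 else node.left
    return node.key

assert lookup(root, 0b0001) == 0b0001
assert lookup(root, 0b0011) == 0b0011
```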



In fact, it is even worse than that: for random keys (which he admits in the documentation are important for trie balance, without helping the user obtain them), he'd be much, much better off just chopping the first five bits off the key and using them to index into his top-level array.

Doing that would let the top-level array segment the search space perfectly into 32 sub-buckets, avoiding the problem that the higher-order array elements are increasingly meaningless (as the key ranges they cover are exponentially smaller).
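A quick simulation makes the contrast visible. This sketch assumes the bucket array being discussed groups keys by leading-zero count (my reading of the scheme, not something stated outright): for random 32-bit keys, half of them land in the single bucket for "top bit set", while indexing by the top five bits spreads keys nearly evenly across 32 sub-buckets.

```python
import random

def clz_bucket(key: int) -> int:
    """Bucket by count of leading zeros in a 32-bit key
    (assumed to model the bucket-array scheme)."""
    return 32 - key.bit_length() if key else 32

def top5_bucket(key: int) -> int:
    """Bucket by the key's top five bits: 32 equal-size sub-buckets."""
    return key >> 27

random.seed(0)
keys = [random.getrandbits(32) for _ in range(100_000)]

clz_counts = [0] * 33
eq_counts = [0] * 32
for k in keys:
    clz_counts[clz_bucket(k)] += 1
    eq_counts[top5_bucket(k)] += 1

# Roughly half of all random keys pile into clz bucket 0, while the
# top-five-bit buckets each hold about 1/32 of the keys.
assert clz_counts[0] > 45_000
assert max(eq_counts) < 2 * min(eq_counts)
```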

At that point, however, the data structure is now a level-compressed trie ;P. (LPC tries seem to be the state of the art for this kind of structure.)



