It's pretty cool how a simple encoding scheme can have profound effects on compression. This is a big part of what makes DCT-style block compression work as well as it does. You rig the game such that all of your higher frequency components wind up at the end - ideally as zeroes after quantization. So, a simple RLE scheme can do a lot more work than it otherwise could.
The idea behind this is to scan diagonally by frequency (the zigzag scan, in JPEG terms) so that the low-frequency coefficients, which tend to be the big numbers, are grouped together at the start. You can choose to zero out some of the coefficients near the end, since those only contain high-frequency components that are harder to notice. RLE can then collapse the trailing zeroes, or other methods can be used instead.
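
Here's a rough sketch of the diagonal scan plus a zero-run RLE, just to make the idea concrete. The function names (`zigzag_order`, `zigzag_scan`, `rle_zeros`) and the sample block are made up for illustration, and the `"EOB"` marker is a simplification: real codecs like JPEG treat the first (DC) coefficient separately, cap run lengths, and follow this with entropy coding.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    # Group positions by anti-diagonal (row + col), which roughly tracks
    # frequency, and alternate the walking direction on each diagonal.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square block of quantized coefficients in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def rle_zeros(seq):
    """Encode as (zeros_before, value) pairs, ending with an "EOB" marker.

    Any zeros after the last nonzero value are implied by "EOB"
    (end of block) and cost nothing extra.
    """
    out, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            out.append((run, v))
            run = 0
    out.append("EOB")
    return out


# Hypothetical quantized 8x8 block: energy sits in the top-left corner.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1] = 12, 5
block[1][0], block[1][1] = -3, 1
block[2][0] = 2

scanned = zigzag_scan(block)
encoded = rle_zeros(scanned)
print(scanned[:8])  # [12, 5, -3, 2, 1, 0, 0, 0]
print(encoded)      # [(0, 12), (0, 5), (0, -3), (0, 2), (0, 1), 'EOB']
```

In this toy case 64 coefficients shrink to five pairs plus the end marker, because the scan pushed all 59 zeroes into one run at the tail.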