A little hacky, yes, but extremely effective. I wrote an image processing application for fairly large time-series datasets (>1 TB) on Linux that took liberal advantage of these details to run very nicely on v2 Xeon processors. It also worked quite well for GUIs that interacted with the datasets.
Sometimes worth it, but often not. The ability to do shared-memory multi-threading is one of the things that tempts me away from Python. Message passing is great and all, but sometimes you want your messages to pass around control of a shared 4 GB data structure instead of copying it.
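For what it's worth, here is a rough sketch of the "pass control, not data" idea using only the Python standard library (`multiprocessing.shared_memory`, available since 3.8). The message on the queue is just the block's name and size, and the worker attaches to the same memory instead of receiving a copy. The function names (`increment_in_place`, `worker`, `run_demo`) and the tiny 16-byte buffer are made up for illustration; this is an approximation with processes, not real shared-memory threading.

```python
from multiprocessing import Process, Queue
from multiprocessing.shared_memory import SharedMemory


def increment_in_place(name: str, size: int) -> None:
    """Attach to an existing block by name and modify it without copying."""
    shm = SharedMemory(name=name)  # attach only; the payload is not copied
    try:
        buf = shm.buf
        for i in range(size):
            buf[i] = (buf[i] + 1) % 256
    finally:
        shm.close()  # detach; the creator remains responsible for unlink()


def worker(inbox: Queue, done: Queue) -> None:
    """Hypothetical worker: receives a handle, edits the data, hands it back."""
    name, size = inbox.get()   # the message is a small handle, not the data
    increment_in_place(name, size)
    done.put(name)             # signal that control returns to the sender


def run_demo(size: int = 16) -> list:
    """Create a block, lend it to a worker process, and read the result."""
    shm = SharedMemory(create=True, size=size)
    try:
        shm.buf[:size] = bytes(range(size))
        inbox, done = Queue(), Queue()
        proc = Process(target=worker, args=(inbox, done))
        proc.start()
        inbox.put((shm.name, size))  # only the name and size cross the queue
        done.get()                   # wait until the worker is finished
        proc.join()
        return list(shm.buf[:size])
    finally:
        shm.close()
        shm.unlink()  # free the block once nobody needs it
```

Call `run_demo()` from under an `if __name__ == "__main__":` guard so it also works with the spawn start method. Nothing here stops two processes from touching the block at the same time; the queue handoff is the only thing expressing who "owns" it, which is exactly the sort of discipline you have to enforce yourself.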
(But it's a little hacky...)