Thanks for the tip, always nice to see tool authors commenting on HN :-)
The (old) version of parallel packaged with Ubuntu 16.04 (Windows Subsystem for Linux) doesn't have --pipe-part, but running the upstream version, the speed is more reasonable:
$ time (./parallel-20170522/src/parallel -a ngrams.tsv \
--pipe-part --block -1 -j4 mawk -f map.awk \
| mawk -f reduce.awk )
max_key: 2006 sum: 22569013
real 0m2.265s
user 0m4.672s
sys 0m1.672s
(I tried a few variants with and without -jN, and this seems typical for the fast end of the spectrum.) For comparison, the same pipeline without parallel:
$ time (cat ngrams.tsv \
| mawk -f map.awk \
| mawk -f reduce.awk )
max_key: 2006 sum: 22569013
real 0m3.472s
user 0m2.891s
sys 0m2.406s
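map.awk and reduce.awk aren't shown here, so for anyone who wants to try the same map/reduce shape, here is a rough sketch with hypothetical stand-ins. Everything in it is an assumption: the input layout (tab-separated ngram, year, count), the aggregation (sum counts per year, report the year with the largest sum), and the sample data. The two chunk files just mimic the way --pipe-part hands each job a slice of the input.

```shell
# Hypothetical stand-ins for map.awk / reduce.awk (the real scripts are not shown).
# Assumed input format: ngram<TAB>year<TAB>count

# Map step: per-chunk partial sums keyed by year.
map='BEGIN { FS = "\t" }
{ sum[$2] += $3 }
END { for (k in sum) print k "\t" sum[k] }'

# Reduce step: merge the partial sums, then report the key with the largest total.
reduce='BEGIN { FS = "\t" }
{ sum[$1] += $2 }
END {
  for (k in sum) if (sum[k] > max) { max = sum[k]; max_key = k }
  print "max_key: " max_key " sum: " max
}'

# Made-up sample data, split into two chunks in a temp directory.
tmp=$(mktemp -d)
printf 'a\t2005\t3\nb\t2006\t5\n' > "$tmp/chunk1.tsv"
printf 'c\t2006\t4\nd\t2005\t1\n' > "$tmp/chunk2.tsv"

# Run the map step once per chunk and feed all partial results to one reducer.
result=$( { awk "$map" "$tmp/chunk1.tsv"; awk "$map" "$tmp/chunk2.tsv"; } | awk "$reduce" )
echo "$result"

rm -r "$tmp"
```

The point of the shape is that the map step only emits a small partial result per chunk, so the single reducer at the end of the pipe has very little to do regardless of how many jobs run in parallel.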
[ed: btw, I did a double-take when I saw your GNU Privacy Guard key ID: 0x88888888 :-) ]
Try --pipe-part instead: