I've not written a VM before, but the comments in perf_takehome.py and problem.py explain the basics of this.
I gleaned about half of this comment in a few minutes of just skimming the code and reading the comments on the functions and classes. There's only 500 lines of code really (the rest is the benchmark framework).
Same thought. I doubt they provided additional explanation to candidates - it seems that basic code literacy within the relevant domain is one of the first things being tested.
On the whole I don't think I'd perform all that well on this task given a short time limit but it seems to me to be an extremely well designed task given the stated context. The reference kernel easily fits on a single screen and even the intrinsic version almost does. I think this task would do a good job filtering the people they don't want working for them (and it seems quite likely that I'm borderline or maybe worse by their metric).
I gleaned about half of this comment in a few minutes of just skimming the code and reading the comments on the functions and classes. There's only 500 lines of code really (the rest is the benchmark framework).