> One shouldn't have to enable filesystem-wide W|X just to run one application T...

near · on May 28, 2016

An emulator is a very different use case than Firefox's JIT.

Take Dolphin (the GameCube / Wii emulator) for example: you have a 4GiB DVD with the entire engine code of a massive 80-hour game on it. You cannot recompile the entire disc at startup. Even if you could (you really can't), these games tend to push code into RAM to execute, which a static recompiler cannot handle.

The way dynamic recompilers handle the tremendous burden is to recompile small blocks at a time, and track which ones are hot and cold. When the buffers fill up, they start dropping old, stale code.
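To make the bookkeeping concrete, here is a minimal sketch in C. It is not Dolphin's code: `block_cache_t`, `cache_lookup`, and the four-slot size are made up for illustration, and it uses plain least-recently-used eviction as a stand-in for whatever hot/cold heuristic a real recompiler uses. Actual code generation is left out; a miss just bumps a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 4  /* hypothetical, tiny on purpose so eviction is easy to trigger */

/* One translated block. A real recompiler would also keep a pointer to
 * the generated host code; that part is omitted in this sketch. */
typedef struct {
    uint32_t guest_pc;   /* guest address the block was translated from */
    uint64_t last_used;  /* logical timestamp of the most recent lookup */
    int      valid;      /* 0 = empty slot */
} block_t;

typedef struct {
    block_t  slots[CACHE_SLOTS];
    uint64_t clock;         /* incremented on every lookup */
    unsigned translations;  /* how many times a block had to be (re)compiled */
} block_cache_t;

/* Return the slot for guest_pc. On a miss, reuse an empty slot if there
 * is one, otherwise evict the least recently used block. */
static block_t *cache_lookup(block_cache_t *c, uint32_t guest_pc)
{
    assert(c != NULL);
    block_t *victim = &c->slots[0];
    c->clock++;

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        block_t *b = &c->slots[i];
        if (b->valid && b->guest_pc == guest_pc) {
            b->last_used = c->clock;  /* hit: the block stays hot */
            return b;
        }
        /* Keep the first empty slot; otherwise track the stalest block. */
        if (victim->valid && (!b->valid || b->last_used < victim->last_used))
            victim = b;
    }

    /* Miss: this is where the expensive recompilation would happen. */
    victim->guest_pc  = guest_pc;
    victim->last_used = c->clock;
    victim->valid     = 1;
    c->translations++;
    return victim;
}
```

With a zero-initialized cache, looking up guest addresses 1, 2, 3, 4, then 1 again, then 5 leaves `translations` at 5: the repeat of 1 is a hit, and 5 evicts block 2, the stalest one. Looking up 2 afterwards forces a sixth translation, which is the cost the comment above is worried about.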

It's hard to say exactly what the performance impact would be, and it'd probably vary per game title. But it'd be a lot worse than a web browser recompiling jQuery plus another 200KiB of custom JavaScript once on page load.

Plus, I am sure there are many more use cases for W|X than just emulators and JITs. It would be a shame to try and eradicate them all from existence.

bzbarsky · on May 28, 2016

Browser JITs don't just recompile things on page load. They have to add inline cache entries whenever the existing IC is missed.

Now in practice ICs usually hit (that's the point!) so once you've been running for a bit things should hopefully not need more recompiling. Which is a significant difference from the situation you describe.

hyperpape · on May 28, 2016

How often is too often? When I output compilation logs for HotSpot, it's almost never not compiling (in terms of human perception of how fast the messages are output).

lisivka · on May 28, 2016

Map the page twice at separate addresses: once for executing and once for writing. See http://nullprogram.com/blog/2016/04/10/ for details.
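A rough sketch of the idea in C, assuming Linux and `memfd_create` as the source of shareable memory (on other systems `shm_open` or an unlinked temporary file would presumably play the same role). The names `dual_map_t`, `dual_map_create`, and `dual_map_destroy` are made up for illustration. Neither mapping is ever writable and executable at once.

```c
#define _GNU_SOURCE            /* for memfd_create (Linux-specific assumption) */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    unsigned char *w;   /* read/write view: the JIT would emit code here */
    unsigned char *x;   /* read/execute view: the CPU would run code here */
    size_t len;
} dual_map_t;

/* Map the same anonymous memory at two addresses with different
 * protections. Returns 0 on success, -1 on failure. */
static int dual_map_create(dual_map_t *m, size_t len)
{
    assert(m != NULL && len > 0);

    int fd = memfd_create("jit-buffer", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *w = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *x = mmap(NULL, len, PROT_READ | PROT_EXEC,  MAP_SHARED, fd, 0);
    close(fd);  /* the mappings keep the memory alive */

    if (w == MAP_FAILED || x == MAP_FAILED) {
        if (w != MAP_FAILED) munmap(w, len);
        if (x != MAP_FAILED) munmap(x, len);
        return -1;
    }

    m->w = w;
    m->x = x;
    m->len = len;
    return 0;
}

static void dual_map_destroy(dual_map_t *m)
{
    munmap(m->w, m->len);
    munmap(m->x, m->len);
}
```

After a successful `dual_map_create(&m, 4096)`, a byte stored through `m.w[0]` can be read back through `m.x[0]` even though `m.w != m.x`. A real JIT would emit machine code through `w` and jump into `x`; that step is left out because it is architecture-specific. Whether a given OS policy permits the executable mapping is a separate question.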

yoklov · on May 28, 2016

Extremely clever. Wouldn't this still violate W^X though? I would have thought the OS would see through a trick like this.

nanofortnight · on May 29, 2016

It doesn't violate W^X, because the same virtual memory address isn't writable and executable at the same time.

microcolonel · on May 28, 2016

If it's writing code into memory, and you're going to compile it and rewrite the jump when it's done, isn't the part where the current code is writing in W mode already?

The process would be: 1. compile the code when you fault on the basic block exit. 2. mark that basic block executable. 3. Optionally only after return or other jump: mark the jumping basic block as writable, patch the jump, then mark it as executable.

In this case, there would be three changes. JS JITs are doing this a lot more often than Dolphin is: they often have more than one level of JIT in addition to an interpreter, so they will end up doing this dance more than once per basic block.

Until I see at least some microbenchmarks and concrete estimates, I don't think I'll worry too much about this. Though it is unfortunate to have to modify all of this code.

PeterisP · on May 28, 2016

An emulator is a particular niche application, and a prime example of an exception for which one could enable W|X - acknowledging that most apps don't need W|X doesn't mean that W|X apps would be eradicated.

near · on May 28, 2016

I agree completely, but from the linked article:

> One day far in the future upstream software developers will understand that W^X violations are a tremendously risky practice and that style of programming will be banished outright.

I feel that whoever wrote that they should be banned outright is being very short-sighted. It would be like banning cars because it's possible to seriously injure a person with one.

tj later replied to me on Twitter saying they meant for it to become per-process in the future. Once that happens, I'll be okay with this change. Right now, I think filesystem-level is far too broad. I will often want an emulator on the same filesystem where I want W^X protections for other applications.

Nyan · on May 28, 2016

Just FYI, here's an algorithm [https://github.com/animetosho/ParPar/blob/master/xor_depends...] which explicitly relies on JIT being a fast operation.

It's different from your typical language interpreter type JIT in that it's used as an optimisation for generating an optimal processing kernel. Once generated, it's only used once before it's thrown away, as a different input requires a different processing kernel.

I have actually tested using syscalls (not for W^X, but as a potential workaround for newer Intel CPUs exhibiting weird SMC detection) and found the overhead to be way too much, even for just one syscall (whilst switching between W and X would require two).

jandrese · on May 28, 2016

How does Wine work on OpenBSD? I would think it would suffer quite badly under this restriction.