SSE is not automatically faster than x87. GCC compiles to x87 on x86-32 by default, even with -O3.


False.

For the type of vector arithmetic you do in games, SSE (to say nothing of later flavors like SSE2/3/etc.) is a big win. SIMD instructions are huge for number crunching.

Your example there is more GCC being shitty than anything else.


You're both wrong. iso-8859-1 is wrong that SSE isn't inherently faster. Individual SSE instructions are not necessarily faster (in a latency sense) than their x87 equivalents, but the cleaner register architecture (flat registers instead of a stack) means the CPU can issue them in parallel more easily, and code generated to use them does less spilling and filling to memory. SSE is just plain better, though not overwhelmingly so.

And angersock is missing the point: you can't take scalar code and rebuild it into SIMD (except via the very limited, never-works-as-well-as-you-think-it-should auto-vectorization features in modern compilers); the parallelism needs to be designed in. That's not possible here without a rewrite of the game engine.
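
To make "designed in" concrete, here is a rough sketch of what that tends to look like. Everything in it is hypothetical (the `Vec3Array` layout, the `DotAll` name, the `SOA_HAVE_SSE` macro); none of it comes from the engine under discussion. The idea is that the data is stored structure-of-arrays, so four independent dot products line up in one register and come out of each loop iteration. It assumes `a` and `b` hold the same number of elements and falls back to scalar code where SSE isn't available.

```cpp
#include <cstddef>
#include <vector>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SOA_HAVE_SSE 1
#else
#define SOA_HAVE_SSE 0
#endif

// Hypothetical structure-of-arrays layout: all x components are contiguous,
// then all y, then all z, so four neighbouring vectors fill one SSE register.
struct Vec3Array {
    std::vector<float> x, y, z;
};

// out[i] = dot(a[i], b[i]) for every i. Assumes a and b have the same length.
void DotAll(const Vec3Array& a, const Vec3Array& b, std::vector<float>& out) {
    const std::size_t n = a.x.size();
    out.resize(n);
    std::size_t i = 0;
#if SOA_HAVE_SSE
    // Four dot products per iteration; unaligned loads because std::vector
    // does not guarantee 16-byte alignment.
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_mul_ps(_mm_loadu_ps(&a.x[i]), _mm_loadu_ps(&b.x[i]));
        d = _mm_add_ps(d, _mm_mul_ps(_mm_loadu_ps(&a.y[i]), _mm_loadu_ps(&b.y[i])));
        d = _mm_add_ps(d, _mm_mul_ps(_mm_loadu_ps(&a.z[i]), _mm_loadu_ps(&b.z[i])));
        _mm_storeu_ps(&out[i], d);
    }
#endif
    // Scalar tail (and the whole loop when SSE is unavailable).
    for (; i < n; ++i) {
        out[i] = a.x[i] * b.x[i] + a.y[i] * b.y[i] + a.z[i] * b.z[i];
    }
}
```

Note that nothing here can be retrofitted by flipping a compiler flag: the callers have to hand over their data in this layout to begin with.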


So, I specifically said that SSE was better for the vector arithmetic that games do. Almost any game you pick will, in the source somewhere, have Vector3::Add(), Vector4::Dot(), etc. functions.

Scalar code is not trivially fixed by using SSE, true, but the majority of really obnoxious math being done (skinning, vector arithmetic, etc.) should be really easy to make really fast.
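
For illustration, a minimal sketch of the kind of vector helper I mean, written with SSE1 intrinsics. The `Vector4` struct, the free functions `Add` and `Dot`, and the `VEC_HAVE_SSE` macro are made up for this example and are not taken from any particular engine; there is a scalar fallback so it still builds on targets without SSE.

```cpp
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define VEC_HAVE_SSE 1
#else
#define VEC_HAVE_SSE 0
#endif

// Hypothetical 4-float vector. The 16-byte alignment permits aligned SSE loads.
struct alignas(16) Vector4 {
    float v[4];
};

// Component-wise sum: one packed add instead of four scalar adds.
Vector4 Add(const Vector4& a, const Vector4& b) {
    Vector4 r;
#if VEC_HAVE_SSE
    _mm_store_ps(r.v, _mm_add_ps(_mm_load_ps(a.v), _mm_load_ps(b.v)));
#else
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}

// Dot product: one packed multiply, then a horizontal sum using SSE1 shuffles.
float Dot(const Vector4& a, const Vector4& b) {
#if VEC_HAVE_SSE
    __m128 m = _mm_mul_ps(_mm_load_ps(a.v), _mm_load_ps(b.v));   // (p0, p1, p2, p3)
    __m128 shuf = _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1)); // (p1, p0, p3, p2)
    __m128 sums = _mm_add_ps(m, shuf);                           // (p0+p1, p0+p1, p2+p3, p2+p3)
    shuf = _mm_movehl_ps(shuf, sums);                            // low lane is now p2+p3
    sums = _mm_add_ss(sums, shuf);                               // low lane is the full sum
    return _mm_cvtss_f32(sums);
#else
    float s = 0.0f;
    for (int i = 0; i < 4; ++i) s += a.v[i] * b.v[i];
    return s;
#endif
}
```

Usage is just `Dot(Add(a, b), c)`. The horizontal sum in `Dot` is the awkward part, and it is where a lot of the theoretical win gets eaten when you only process one vector at a time.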


Right, which is missing the point. The use case at hand is rebuilding some particular part of the engine (honestly it's unclear to me exactly what was done) with different flags, not reworking the vector librar{y,ies} to use SSE.

SSE wouldn't be a flags issue. Building a vectorized SSE library in "debug mode" would still produce vector instructions.



