You can even use NumPy from Nim: https://forum.nim-lang.org/t/4102
But why use that when you can use CUDA and OpenCL accelerated vector math? https://github.com/mratsim/Arraymancer
If that's too much speed, you can just roll your own for loops. Because Nim compiles to C and then hands off to a battle-tested backend compiler (GCC, Clang/LLVM, or VC++), that compiler will try to auto-vectorize your loops with SSE if you pass the right switches. If you know what CPU your computer/server has, you can target newer high-performance instruction sets like AVX2, or even AVX-512 VNNI...
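For what it's worth, here is a rough sketch of the "roll your own loop" approach. The file name `saxpy.nim` and the proc name `saxpy` are made up for illustration, and the compile line in the comment assumes a GCC or Clang backend. Whether the loop actually gets vectorized depends on your compiler version and flags, so treat this as a starting point, not a guarantee.

```nim
# saxpy.nim (hypothetical file name)
#
# Possible build command, assuming GCC or Clang as the C backend:
#   nim c -d:danger --passC:"-O3 -march=native" saxpy.nim
# -d:danger turns off runtime checks (bounds, overflow), which otherwise
# tend to get in the way of vectorization. --passC forwards flags to the
# C compiler; -march=native lets it use whatever your CPU supports.

proc saxpy(a: float32, x: openArray[float32], y: var openArray[float32]) =
  ## Computes y[i] = a * x[i] + y[i] for every element of x.
  ## A plain counted loop over contiguous float32 data like this one is
  ## the kind of code backend compilers can usually vectorize.
  for i in 0 ..< x.len:
    y[i] = a * x[i] + y[i]

when isMainModule:
  var x = newSeq[float32](1024)
  var y = newSeq[float32](1024)
  for i in 0 ..< x.len:
    x[i] = float32(i)
    y[i] = 1.0'f32
  saxpy(2.0'f32, x, y)
  echo y[3]
```

If you want to check what the compiler did, GCC's `-fopt-info-vec` (passed through `--passC`) should report which loops it vectorized.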