Gaussian random number generators using SIMD
From !1794 (merged) by @raymond.knopp
A couple of years ago we compared Ziggurat implemented with SIMD against the classical method implemented with SIMD (i.e. the vector extension of the one used in OAI today). Several such implementations exist in open source, and so do comparisons between them. The best was the classical method with SIMD. Ziggurat with SIMD didn't parallelize as well, and its running time is random because the method is based on rejection.
The student put the code here; I don't know whether it was ever integrated into the codebase: https://github.com/lfarizav/pseudorandomnumbergenerators
After looking, they are basically the same, with SSE/AVX optimizations. So my comment above is more or less valid: Ziggurat doesn't parallelize as well, because its scalar version is already quite a bit better.
I would say that if we go to the trouble of changing the RN generator, use an SIMD version if you can.
This [the new Ziggurat] is the standard scalar Ziggurat implementation. In the link I shared you can see the speedup with SIMD. Whether you take Box-Muller (gaussdouble) or Ziggurat, with AVX (256-bit) you will get at least a 7x speedup from SIMD. So it would be worth doing this. There may be even better ones out there now; these were the best we found at the time.
Originally, I had wanted to implement a quantized Gaussian generator, since we are using fixed point. This would probably be the fastest, but I didn't find the best method for it. It would probably be a table lookup based on a pair of uniform binary RVs, a bit like Box-Muller but without the need for trigonometric functions.