Speed up RFsimulator using SIMD
This MR speeds up the RFsimulator by about 30% on my system by using SIMD instructions to copy samples inside the RFsimulator.
The drawback: it currently works only for SISO; 2x2 and 4x4 do not work yet. I am opening this MR to get feedback and to make both 2x2 and 4x4 work. I think it is useful work.
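
For reviewers who want a picture of the general idea, here is a minimal sketch of copying samples with SIMD. It is not the code in this MR: the `sample_t` type and the `copy_samples` function are hypothetical names chosen for illustration, and I am assuming one sample is a pair of 16-bit integers and that the samples of a single antenna are contiguous in memory. It uses SSE2 intrinsics when the compiler targets them and falls back to a scalar loop otherwise.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics: _mm_loadu_si128, _mm_storeu_si128 */
#endif

/* Hypothetical sample layout (assumption): 16-bit real and imaginary parts. */
typedef struct {
  int16_t r;
  int16_t i;
} sample_t;

/* Copy n samples from src to dst. dst and src must not overlap.
 * With SSE2, each iteration moves 16 bytes, i.e. 4 samples, using
 * unaligned loads and stores so no alignment is required. */
static void copy_samples(sample_t *dst, const sample_t *src, size_t n)
{
  size_t k = 0;
#if defined(__SSE2__)
  for (; k + 4 <= n; k += 4) {
    __m128i v = _mm_loadu_si128((const __m128i *)(src + k));
    _mm_storeu_si128((__m128i *)(dst + k), v);
  }
#endif
  /* Scalar tail: the remaining 0..3 samples, or everything without SSE2. */
  for (; k < n; k++)
    dst[k] = src[k];
}
```

Calling `copy_samples(dst, src, n)` gives the same result as a plain element-by-element loop. The sketch only covers a contiguous copy for one antenna and says nothing about how the 2x2 or 4x4 cases should be handled.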