Enhancements for AVX2
This set of enhancements updates openair1 DSP routines for AVX2. It primarily concerns files in openair1/PHY/TOOLS and openair1/PHY/CODING.
Completed
- Radix-2/4 FFTs and IFFTs. These are fully functional and run 1.5x to 1.8x faster than the SSE4 implementation.
- Turbo encoder. Only the interleaver has been ported to AVX2, and this gives no noticeable improvement, probably because the bottleneck is the turbo encoder itself, which will not benefit from AVX2. Moreover, the interleaver requires some AVX2 instructions that are not truly 256-bit (i.e. they operate on two independent 128-bit lanes) and therefore lose some efficiency.
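The 128-bit lane restriction mentioned for the interleaver can be illustrated with a small scalar model. This is only a sketch in plain C, not openair1 code: `model_shuffle_epi8_256` and `full_permute_32` are hypothetical names introduced here, and the first function mimics the documented per-lane behaviour of the AVX2 byte shuffle (`_mm256_shuffle_epi8`), assuming that this is the kind of instruction the interleaver relies on.

```c
#include <stdint.h>

/* Scalar model (hypothetical helper, not from openair1) of the documented
 * behaviour of the AVX2 byte shuffle _mm256_shuffle_epi8: the 256-bit
 * register is treated as two independent 128-bit lanes.
 * src and dst must not overlap. */
void model_shuffle_epi8_256(const uint8_t src[32], const uint8_t idx[32],
                            uint8_t dst[32])
{
    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 16; i++) {
            uint8_t sel = idx[16 * lane + i];
            /* Bit 7 set: output byte is zeroed. Otherwise the low 4 bits
             * select a byte from the SAME lane only. */
            dst[16 * lane + i] =
                (sel & 0x80) ? 0 : src[16 * lane + (sel & 0x0F)];
        }
    }
}

/* What an interleaver ideally wants (hypothetical helper): an arbitrary
 * permutation over all 32 bytes, free to cross the lane boundary.
 * src and dst must not overlap. */
void full_permute_32(const uint8_t src[32], const uint8_t idx[32],
                     uint8_t dst[32])
{
    for (int i = 0; i < 32; i++) {
        dst[i] = src[idx[i] & 0x1F];
    }
}
```

With an index vector that reverses all 32 bytes (`idx[i] = 31 - i`), `full_permute_32` places `src[31]` at position 0, whereas the lane-wise model places `src[15]` there, because byte 31 lives in the other lane. On real hardware an additional cross-lane step (for example `_mm256_permute2x128_si256`) would be needed to complete such a permutation, which is one plausible source of the lost efficiency.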
Under development
- 16-bit Turbo decoder. The AVX2 version is implemented and is under test.
- 8-bit Turbo decoder. Not yet implemented with AVX2.
Currently merged into bugfix-48-L1L2signaling