Optimizations of PDSCH Resource Mapping in nr_dlsch.c/nr_modulation.c

knopp requested to merge resource_mapping_optim into develop

These changes add SIMD optimizations (Neon, AVX2, AVX-512) to the PDSCH transmit path. The timings below come from the benchmark

nr_dlsim -e25 -R273 -b273 -s30 -x "layers" -y 4 -z 4 -P

run with "layers" set to 2, 3, and 4, comparing the "PHY proc tx" time on this branch against develop:

273 PRBs, MCS 25, 64QAM

peafowl (gcc11,AMD EPYC 9374F)

  • 2-layer, 4 TX : 431 us (develop 565 us)
  • 3-layer, 4 TX : 692 us (develop 849 us)
  • 4-layer, 4 TX : 963 us (develop 1172 us)

stupix (gcc10, Xeon Gold 6354)

  • 2-layer, 4 TX : 568 us (develop 652 us)
  • 3-layer, 4 TX : 901 us (develop 1030 us)
  • 4-layer, 4 TX : 1250 us (develop 1396 us)

matix (gcc14, Ryzen 9 PRO 7945)

  • 2-layer, 4 TX : 317 us (develop 505 us)
  • 3-layer, 4 TX : 538 us (develop 779 us)
  • 4-layer, 4 TX : 767 us (develop 1233 us)
Edited by knopp
