
Optimizations of PDSCH Resource Mapping in nr_dlsch.c/nr_modulation.c

knopp requested to merge resource_mapping_optim into develop

These changes add SIMD optimizations (Neon/AVX2/AVX512) to the PDSCH transmit path. The timings below were measured with the benchmark

nr_dlsim -e25 -R273 -b273 -s30 -x "layers" -y 4 -z 4 -P

with "layers" set to 2, 3, and 4:

273 PRBs, MCS 25, 64QAM

peafowl (gcc11,AMD EPYC 9374F)

  • 2-layer, 4 TX : 36.42us (develop 86.68us)
  • 3-layer, 4 TX : 54.77us (develop 124.75us)
  • 4-layer, 4 TX : 75.10us (develop 166.93us)

stupix (gcc10, Xeon Gold 6354)

  • 2-layer, 4 TX : 55.55us (develop 119.07us)
  • 3-layer, 4 TX : 64.25us (develop 159.34us)
  • 4-layer, 4 TX : 138.72us (develop 219.84us)

matix (gcc14, Ryzen 9 PRO 7945)

  • 2-layer, 4 TX : 22.75us (develop 70.37us)
  • 3-layer, 4 TX : 33.99us (develop 93.20us)
  • 4-layer, 4 TX : 62.07us (develop 130.33us)

armix (gcc11, ARM Ampere aarch64)

  • 2-layer, 4 TX : 58.78us (develop 176.99us)
  • 3-layer, 4 TX : 106.32us (develop 270.82us); note that the 3-layer mapping function is not SIMD-optimized
  • 4-layer, 4 TX : 149.13us (develop 378.96us)
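
To read the numbers above as speedups, divide the develop time by the time measured on this branch. The C sketch below does that for every listed configuration. It is not part of the merge request: the names timing_t, timings, speedup, and print_speedups are made up for illustration, and the values are copied from the lists above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One measurement from the lists above; times are in microseconds. */
typedef struct {
  const char *host;   /* benchmark machine */
  int layers;         /* number of PDSCH layers (4 TX in all cases) */
  double branch_us;   /* time on resource_mapping_optim */
  double develop_us;  /* time on develop */
} timing_t;

/* Values copied verbatim from the merge request description. */
const timing_t timings[] = {
  {"peafowl", 2,  36.42,  86.68},
  {"peafowl", 3,  54.77, 124.75},
  {"peafowl", 4,  75.10, 166.93},
  {"stupix",  2,  55.55, 119.07},
  {"stupix",  3,  64.25, 159.34},
  {"stupix",  4, 138.72, 219.84},
  {"matix",   2,  22.75,  70.37},
  {"matix",   3,  33.99,  93.20},
  {"matix",   4,  62.07, 130.33},
  {"armix",   2,  58.78, 176.99},
  {"armix",   3, 106.32, 270.82},
  {"armix",   4, 149.13, 378.96},
};

const size_t num_timings = sizeof(timings) / sizeof(timings[0]);

/* Speedup factor: how many times faster the branch is than develop. */
double speedup(const timing_t *t)
{
  assert(t->branch_us > 0.0);
  return t->develop_us / t->branch_us;
}

/* Print one line per configuration, e.g. for pasting into a comment. */
void print_speedups(void)
{
  for (size_t i = 0; i < num_timings; i++) {
    const timing_t *t = &timings[i];
    printf("%-8s %d-layer: %7.2fus -> %7.2fus (x%.2f)\n",
           t->host, t->layers, t->develop_us, t->branch_us, speedup(t));
  }
}
```

Calling print_speedups() from a main() prints one line per machine and layer count. With the values listed here, the factors range from about 1.6x (stupix, 4 layers) to about 3.1x (matix, 2 layers).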
Edited by knopp
