gold_cache: replace linear search with open-addressing hash table

When many UEs are active, the gold sequence cache is looked up frequently, once per scrambling operation for each UE. The current linear scan gets slower as the table grows, and its periodic table reorder causes unpredictable latency spikes. This MR replaces the linear search with an open-addressing hash table.
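For readers unfamiliar with the technique, here is a minimal sketch of an open-addressing table with linear probing. It is not the code of this MR: the names (`sketch_table_t`, `sketch_find`, `sketch_insert`), the capacity, the hash function and the use of a single 32-bit seed as key are all assumptions made for illustration, and eviction is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizing: a power-of-two capacity lets us wrap with a mask. */
#define SKETCH_LOG2 10u
#define SKETCH_CAPACITY (1u << SKETCH_LOG2)
#define SKETCH_MASK (SKETCH_CAPACITY - 1u)

typedef struct {
  bool used;
  uint32_t seed;  /* assumed key: the scrambling init value */
  uint32_t value; /* stand-in for the cached sequence (or a handle to it) */
} sketch_entry_t;

typedef struct {
  sketch_entry_t slots[SKETCH_CAPACITY];
  size_t count;
} sketch_table_t;

/* Multiplicative hash: keep the top SKETCH_LOG2 bits of the product. */
static inline uint32_t sketch_hash(uint32_t seed)
{
  return (seed * 2654435769u) >> (32u - SKETCH_LOG2);
}

/* Lookup: probe consecutive slots until the key or an empty slot is found. */
sketch_entry_t *sketch_find(sketch_table_t *t, uint32_t seed)
{
  uint32_t i = sketch_hash(seed);
  for (size_t n = 0; n < SKETCH_CAPACITY; n++) {
    sketch_entry_t *e = &t->slots[i];
    if (!e->used)
      return NULL; /* empty slot: the key was never inserted */
    if (e->seed == seed)
      return e;
    i = (i + 1u) & SKETCH_MASK;
  }
  return NULL;
}

/* Insert or update. No entry is ever moved, so there is no reorder step. */
sketch_entry_t *sketch_insert(sketch_table_t *t, uint32_t seed, uint32_t value)
{
  uint32_t i = sketch_hash(seed);
  while (t->slots[i].used && t->slots[i].seed != seed)
    i = (i + 1u) & SKETCH_MASK;
  if (!t->slots[i].used) {
    /* Keep the load factor at or below 0.5 so probe chains stay short. */
    assert(t->count < SKETCH_CAPACITY / 2u);
    t->slots[i].used = true;
    t->slots[i].seed = seed;
    t->count++;
  }
  t->slots[i].value = value;
  return &t->slots[i];
}
```

With a bounded load factor, the expected lookup cost stays roughly constant as the number of cached sequences grows, whereas a linear scan grows with the table size. A zero-initialized `sketch_table_t` is a valid empty table.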

Benchmark with 128 UEs, run with the tool from !4028 adapted to work with !3902 (merged).

Before:

=== Results: 128 UEs, 273 RBs, MCS 28, 10000 slots (warmup 5) ===
  Avg PDSCH/slot: 0.3 / 128 UEs
  Slot budget: 500 us (mu=1)

  Phase            mean      p50      p90      p99      max
                   (us)     (us)     (us)     (us)     (us)
  ------------------------------------------------------------------
  Scheduler       137.0    103.3    356.7    458.0    873.3
  PHY TX           30.9     13.4     55.6    546.4   1053.6
  Total           167.9    116.0    421.4    689.1   1126.1
  ------------------------------------------------------------------
  Max total 1126.1 us at slot 3 (iter 115)
  WARNING: max total (1126 us) exceeds slot budget (500 us)!

  Breakdown                            /slot     /call       max   calls
                                        (us)      (us)      (us)
  ---------------------------------------------------------------------------
  Scheduler:
    Total                              136.9     136.8     873.0   10000
      RA scheduling                        -           -           -       0
      UL scheduling                      0.0       0.0       0.1   10000
      DL scheduling (PDCCH+PDSCH)       79.3      79.3     537.9   10000
        RLC data req                     0.3       1.1       7.4    2357
  PHY TX:
    Total                                  -           -           -       0
      DCI generation                       -           -           -       0
      DLSCH encoding                     7.3      25.6      90.2    2852
        segmentation                     0.1       0.2       1.0    2967
        rate matching                    3.0       9.9      22.0    2967
        scrambling                       6.1      20.7     689.4    2967
      DLSCH modulation                   0.9       3.2       9.7    2967
      layer mapping                      1.7       5.8       8.5    2967
      precoding                          1.7       0.5       1.1   30827
      resource mapping                   3.5       1.1       1.4   30827
      phase compensation                 3.4       3.4       4.9   10000
  ---------------------------------------------------------------------------

Done.

After:

=== Results: 128 UEs, 273 RBs, MCS 28, 10000 slots (warmup 5) ===
  Avg PDSCH/slot: 1.1 / 128 UEs
  Slot budget: 500 us (mu=1)

  Phase            mean      p50      p90      p99      max
                   (us)     (us)     (us)     (us)     (us)
  ------------------------------------------------------------------
  Scheduler        56.4     47.3     97.9    133.1    274.7
  PHY TX           53.3     47.9     72.9     87.6    320.3
  Total           109.7     97.1    142.6    187.3    435.8
  ------------------------------------------------------------------
  Max total 435.8 us at iter 8528

  Breakdown                            /slot     /call       max   calls
                                        (us)      (us)      (us)
  ---------------------------------------------------------------------------
  Scheduler:
    Total                               60.6      48.5     223.5   12498
      RA scheduling                        -           -           -       0
      UL scheduling                      0.0       0.0       0.1   12498
      DL scheduling (PDCCH+PDSCH)       41.4      33.1     111.4   12498
        RLC data req                     0.8       0.9       6.9    9150
  PHY TX:
    Total                                  -           -           -       0
      DCI generation                       -           -           -       0
      DLSCH encoding                    24.2      24.2     289.2   10000
        segmentation                     0.2       0.2       0.8   11440
        rate matching                   10.5       9.2     275.5   11440
        scrambling                       0.8       0.7      16.5   11440
      DLSCH modulation                   2.9       2.6       9.0   11440
      layer mapping                      6.3       5.5       8.2   11440
      precoding                          6.6       0.5       1.0   138760
      resource mapping                   2.2       0.2       3.0   138760
      phase compensation                 3.3       3.3       3.5   10000
  ---------------------------------------------------------------------------
