Fix IQ recorder/player for nrUE (!1958) · Merge requests · oai / openairinterface5G

Mongazon requested to merge iqfixes_nrue into develop Feb 17, 2023

Provide support of IQ record/replay for nrUE.

This encompasses the following changes:

Modification of the exit_function() prototype with an additional parameter, which can be EXIT_ASSERT or EXIT_NORMAL. Because exit_function() needs to be implemented by each main program, it is up to each main program to take further action depending on the EXIT_ASSERT or EXIT_NORMAL value. In the general case, EXIT_ASSERT leads to a call to abort(), possibly generating a core file, while EXIT_NORMAL leads to a call to exit(). The main program remains free to set the exit "status".

Modification of AssertFatal() to call exit_function() with an additional parameter indicating EXIT_ASSERT. The call to exit_function() allows the xxx_end() function of the device to be called before the main program exits. In the case of the IQ recorder, xxx_end() is where the IQs stored in memory at run time are written to the IQ file. Without the call to xxx_end(), the stored IQs are lost, which makes the IQ recorder unusable for reproducing, analyzing and debugging situations where AssertFatal() is raised in the code for good reasons. This would be a major gap in the usefulness of the IQ recorder/player.

Modification of the existing exit_function() implementations in the various OAI programs to adapt to the extended interface. In the case of EXIT_ASSERT, abort() is called without the previously existing calls to sleep(1). In the case of EXIT_NORMAL, a call to exit(EXIT_SUCCESS) is preferred (this may not have been changed in all programs); where a sleep(1) call already existed, it is kept.
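As a rough illustration of the extended interface described above, here is a minimal sketch of what a main program's exit_function() could look like. The signature, the numeric values of EXIT_NORMAL and EXIT_ASSERT, and the device_end() helper (standing in for the device's xxx_end() hook) are assumptions made for this example, not the actual OAI code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Names taken from the description above; the numeric encoding is an assumption. */
#define EXIT_NORMAL 0
#define EXIT_ASSERT 1

/* Hypothetical stand-in for the device's xxx_end() hook, e.g. the place
 * where an IQ recorder would write its in-memory samples to the IQ file. */
static int device_end_calls = 0;
static void device_end(void)
{
  device_end_calls++;
}

/* Returns 1 when the exit path should finish with abort(), 0 for exit(). */
static int ends_with_abort(int reason)
{
  return reason == EXIT_ASSERT;
}

/* Illustrative signature, not necessarily the one used in OAI. */
void exit_function(const char *file, const char *function, int line,
                   const char *msg, int reason)
{
  fprintf(stderr, "%s:%d %s() exiting: %s\n", file, line, function,
          msg ? msg : "");
  device_end();        /* let the device save its state before leaving */
  if (ends_with_abort(reason))
    abort();           /* assertion path: may produce a core file */
  exit(EXIT_SUCCESS);  /* normal path */
}
```

Keeping the abort()/exit() decision in a small helper makes the policy easy to change per program, which matches the idea that each main program decides what to do with the reason code.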

Modification of IQ player/recorder:

Adapt to record/replay a variable number of samples. This was not needed in 4G, but the nrUE may read a different number of samples on each read, e.g. during re-synchronization. The number of bytes in an IQ record now counts the number of samples actually read. The IQ records are still written with a fixed size equal to BELL_LABS_IQ_BYTES_PER_SF. For now, the IQ recorder/player considers a subframe (SF) to be a slot for numerology 1. These values are hard-coded as follows (this can be changed in the future):

#define BELL_LABS_IQ_PER_SF 23040 // 23040 => 46080/2 slots => 3/4 40MHz (106 PRBs)
#define BELL_LABS_IQ_BYTES_PER_SF (BELL_LABS_IQ_PER_SF * 4)
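To make the "fixed-size record, variable sample count" idea concrete, the sketch below fills a record whose payload is always BELL_LABS_IQ_BYTES_PER_SF bytes (23040 * 4 = 92160) while the header stores the number of bytes actually read. The struct layout, the field name nbBytes and the fill_record() helper are hypothetical, and the factor 4 is assumed to mean 16-bit I plus 16-bit Q per sample.

```c
#include <stdint.h>
#include <string.h>

#define BELL_LABS_IQ_PER_SF 23040
#define BELL_LABS_IQ_BYTES_PER_SF (BELL_LABS_IQ_PER_SF * 4)

/* Hypothetical record layout: a small header plus a fixed-size payload. */
typedef struct {
  uint32_t nbBytes; /* bytes of valid IQ data in this record */
} iq_record_header_t;

typedef struct {
  iq_record_header_t header;
  uint8_t samples[BELL_LABS_IQ_BYTES_PER_SF];
} iq_record_t;

/* Copy nsamps samples (4 bytes each, by assumption) into a record.
 * The record keeps its fixed size; unused payload bytes are zeroed.
 * Returns 0 on success, -1 if the read does not fit in one record. */
static int fill_record(iq_record_t *rec, const void *iq, uint32_t nsamps)
{
  if (nsamps > BELL_LABS_IQ_PER_SF)
    return -1;
  memset(rec, 0, sizeof(*rec));
  rec->header.nbBytes = nsamps * 4;
  memcpy(rec->samples, iq, rec->header.nbBytes);
  return 0;
}
```

On replay, a reader following this layout would return only header.nbBytes / 4 samples for the record, even though every record occupies the same space in the file.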

Adapt the IQ read delay to the IQ recording times. When IQs are written (as part of the USRP read), the epoch time is also written in the record header. When IQs are read, in the trx_iqplayer_read() function, the read delay is no longer taken from the subframes-read-delay configuration parameter. Instead, the delay before delivering the samples of step N is automatically computed as the difference between the recording times of step N and step N-1. This allows IQ replay to run at the same rate at which the IQs were recorded.

The configuration has been enhanced to distinguish IQ play and IQ record. This is done to allow checking specific behavior in the code. IS_SOFTMODEM_IQPLAYER is kept for IQ play for backward compatibility. IS_SOFTMODEM_IQRECORDER is added for IQ record specific handling where needed.
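The replay pacing described above (delay at step N equals recording time N minus recording time N-1) can be sketched as follows. The function name, the use of microseconds as the timestamp unit, and the clamping of negative gaps to zero are assumptions for illustration; the actual trx_iqplayer_read() code may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* For each record, compute how long a player would wait before delivering
 * it: the gap between its recording time and that of the previous record.
 * Timestamps are in microseconds since the epoch (unit is an assumption). */
static void replay_delays_us(const int64_t *ts_us, size_t n, int64_t *delay_us)
{
  for (size_t i = 0; i < n; i++) {
    if (i == 0) {
      delay_us[i] = 0; /* nothing to wait for before the first record */
      continue;
    }
    int64_t d = ts_us[i] - ts_us[i - 1];
    delay_us[i] = d > 0 ? d : 0; /* guard against non-monotonic timestamps */
  }
}
```

For example, records stamped at 1000, 1500 and 2000 microseconds would be replayed with waits of 0, 500 and 500 microseconds, reproducing the original cadence without any configured delay.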

Modification of nrUE:

The design of UE synchronization relies on 2 threads: the UE_thread and a TPool thread running UE_sync. Both threads run in parallel. This design was chosen because the UE_sync calculation takes an arbitrary amount of time while the UE_thread needs to read from RF continuously, trashing frames until UE_sync provides a result. Both threads run at the same priority and use the same Linux scheduling policy. As a consequence, this design cannot guarantee that the number of frames trashed during the UE_sync calculation is the same in recording and replay mode. For this reason, we fix the number of trashed frames when running in record or replay mode. In the current fix, the number of trashed frames is set to 28, which corresponds to 280 ms for handling the synchronization process (this value will be reviewed in the future, possibly to be derived rather than hard-coded).

When running in record/replay mode, the seed used in RA procedures is set to the fixed value 0x1 rather than a random value, i.e. rdtsc_oai(). Given the same seed, the rand_r() library function produces the same sequence of values on every run, so the "random" values are in fact reproducible.

When running in record/replay mode, some controls are bypassed (e.g. contention resolution identity...).
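The fixed-seed behaviour for RA can be illustrated with a short sketch. The helper names ra_seed() and draw_sequence(), the RA_FIXED_SEED macro and the record_or_replay flag are hypothetical; only the 0x1 value and the use of rand_r() come from the description above.

```c
#include <stdlib.h>

#define RA_FIXED_SEED 0x1u /* hypothetical macro name; value from the text above */

/* Pick the seed: fixed in record/replay mode, caller-provided entropy
 * (e.g. a cycle counter) otherwise. */
static unsigned int ra_seed(int record_or_replay, unsigned int entropy)
{
  return record_or_replay ? RA_FIXED_SEED : entropy;
}

/* Draw n values from rand_r() starting from the given seed.
 * The same starting seed always yields the same sequence. */
static void draw_sequence(unsigned int seed, int *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = rand_r(&seed);
}
```

Because rand_r() keeps all of its state in the caller's seed variable, two runs that start from the same seed make identical draws, which is what lets a replay follow the same RA decisions as the recording.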

Other changes:

Some printf have been changed to LOG_D or LOG_I. Some LOG_D have been changed to LOG_I. Some LOG_I have been changed to LOG_D.

Regarding warnings introduced by the change to Assert_Exit macro:

gcc was issuing warnings because, in Assert_Exit, I replaced the call to abort() with a call to exit_function(). This raised a lot of warnings wherever AssertFatal is called. The reason is that, without the abort() call in the macro, the flow analyzed by gcc is different: a call to abort() in the expanded code terminates the flow, whereas without it the flow continues and exhibits legitimate warnings. This is certainly a design issue: whatever assert mechanism is implemented, the code shall not presuppose that the C code flow will be interrupted. The typical example is a variable which is assigned a value in a switch statement where the default case does not assign a value but calls AssertFatal. From gcc's point of view, and without a call to abort() or exit() in AssertFatal, a warning is legitimate. The ideal fix would be to assign a value either at initialization or in the default case, so as to let gcc know the programming intention. There are several other warning cases in the code which, at the end of the day, shall be fixed independently of the assert mechanism.

For the time being, I have added a call to abort() after the call to exit_function() to make gcc happy. Logically, this solution is valid since exit_function() will call either abort() or exit(). Thus the added abort() is never reached unless the app-specific exit_function() calls neither abort() nor exit(). Eventually, a better-defined assertion mechanism should be put in place and the induced (native) warnings fixed. I suggest doing this in a different merge request. Currently there are ~100 native warnings which fall into this category.
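The switch/default situation described above can be sketched as follows. The macro ASSERT_FATAL_SKETCH, the exit_function_stub() helper and bits_for_mode() are hypothetical names for this example only; they mimic the pattern, not the actual OAI macro.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for an application-provided exit function. The compiler cannot
 * assume that a function like this never returns. */
static void exit_function_stub(const char *msg)
{
  fprintf(stderr, "fatal: %s\n", msg);
}

/* Assertion macro in the spirit of the workaround: call the exit function,
 * then abort() so that the flow visibly ends for the compiler. */
#define ASSERT_FATAL_SKETCH(cond, msg) \
  do {                                 \
    if (!(cond)) {                     \
      exit_function_stub(msg);         \
      abort();                         \
    }                                  \
  } while (0)

static int bits_for_mode(int mode)
{
  int bits = 0; /* the "ideal fix": a defined value on every path */
  switch (mode) {
    case 0:
      bits = 2;
      break;
    case 1:
      bits = 4;
      break;
    default:
      ASSERT_FATAL_SKETCH(0, "unknown mode");
  }
  return bits;
}
```

If both the initialization of bits and the trailing abort() are removed, gcc can legitimately report that bits may be used uninitialized on the default path, which is the class of warning discussed above.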

The overall changes have been validated using the nrUE in front of a Nokia gNB to reproduce the following AssertFatal:

Assertion (k2 >= ((4))) failed!
In get_k2() ../../../openair2/LAYER2/NR_MAC_UE/nr_ue_scheduler.c:140
Slot offset K2 (2) cannot be less than DURATION_RX_TO_TX (4). K2 set according to min_rxtxtime in config file.

In addition, the nrUE can be compiled with -g and run in replay mode for the same use case, allowing for advanced debugging.

Edited Feb 23, 2023 by Mongazon
