UE re-sync issues
Summary
Branch: NR_UE_resync, NR_UE_RRC_Reestablishment
Resync issues:
-
Assertion during RRCReestablishment due to missing SDAP reconfiguration: handled in !2571 (merged) -
RFSIM and channel simulation documentation outdated: handled in !2576 (merged) -
UE timer T310 does not expires, UE re-sync does not happen -> !2677 (merged) -
Segfault at UE side during fallback RRCSetup: possibly handled in !2552 (merged) or in a later MR -
SEGFAULT on-> this issue does not seem to occur in the most recent testsRRCReconfiguration
generation after RRCReestablishmentComplete at gNB (encode_cellGroupConfig
)
RLC/PDCP issues:
Summary: resync is OK but traffic takes long to restore due to missing reset of PDCP counters at gNB side
-
RLC: UE transmits data on DRB1 before RLC Entity re-establishment, leading to SN mismatch with gNB, leading to discarded PDUs -> solved in NR_UE_resync -
PDCP: UE resets the PDCP counters, but gNB does not, leading to discarded PDUs: see !2668 (merged)
UPLINK
UE does not resync
- gNB telnet: set a high
noise_power_dB
orploss
- UE UL failure on gNB side
- UE's timer T310 never expires, sync detection doesnt start but it keeps looping on log messages:
[NR_MAC] [UE 0] Release all SRs
[NR_MAC] [UE 0] PDCCH Tick : MAC BSR Triggered ReTxBSR Timer expiry at frame 601 slot 18
- eventually UE ends with an assert, e.g.:
UE:
- Assertion (is_nr_UL_slot(tdd_config, slot_tx, mac->frame_type) != 0) failed:
[NR_PHY] RSRP = -50 dBm
[NR_MAC] Couldn't find SRS aperiodic resource with trigger 2
Assertion (is_nr_UL_slot(tdd_config, slot_tx, mac->frame_type) != 0) failed!
In lockGet_ul_config() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_UE/nr_ue_scheduler.c:67
UL config_request called at wrong slot 13
Exiting execution
mac b8 b7 5b d2
mac 1c b5 8e eb
[LIBCONFIG] nas.noS1: 3/3 parameters successfully set, (3 to default value)
[LIBCONFIG] nas.noS1: 3/3 parameters successfully set, (3 to default value)
Assertion (is_nr_UL_slot(tdd_config, slot_tx, mac->frame_type) != 0) failed!
In lockGet_ul_config() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_UE/nr_ue_scheduler.c:67
UL config_request called at wrong slot 13
- Assertion (tda_index < tdalist->list.count) failed:
[NR_MAC] [UE 0] Release all SRs
[NR_MAC] [UE 0] PDCCH Tick : MAC BSR Triggered ReTxBSR Timer expiry at frame 731 slot 18
[NR_PHY] RSRP = -50 dBm
[NR_PHY] RSRP = -51 dBm
mac ed d7 87 87
mac 80 27 da 8
[LIBCONFIG] nas.noS1: 3/3 parameters successfully set, (3 to default value)
[LIBCONFIG] nas.noS1: 3/3 parameters successfully set, (3 to default value)
Assertion (tda_index < tdalist->list.count) failed!
In set_tda_info_from_list() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_COMMON/nr_mac_common.c:601
TDA index from DCI 3 exceeds TDA list array size 3
Exiting execution
Assertion (tda_index < tdalist->list.count) failed!
In set_tda_info_from_list() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_COMMON/nr_mac_common.c:601
TDA index from DCI 3 exceeds TDA list array size 3
- Assertion (n_dmrs_cdm_groups == 1 || n_dmrs_cdm_groups == 2) failed:
[NR_PHY] RSRP = -22 dBm
[NR_MAC] Couldn't find SRS aperiodic resource with trigger 2
Assertion (n_dmrs_cdm_groups == 1 || n_dmrs_cdm_groups == 2) failed!
In nr_dlsch_extract_rbs() /home/guido/repo/openairinterface5g/ue-resync/openair1/PHY/NR_UE_TRANSPORT/nr_dlsch_demodulation.c:1221
n_dmrs_cdm_groups 0 is illegal
Exiting execution
/home/guido/repo/openairinterface5g/ue-resync/openair1/PHY/NR_UE_TRANSPORT/nr_dlsch_demodulation.c:1221 nr_dlsch_extract_rbs() Exiting OAI softmodem: _Assert_Exit_
Aborted
Example logs noise_power_dB
set to 20 for channel 1 (UL): _mod1_noise20.log
logs
DOWNLINK
Issues with sync loss due to high pathloss/noise
!2571 (merged)
Missing SDAP reconfiguration solved in- UE telnet: set very high ploss (> 50) or noise_power_dB (>10)
- sync loss due to low SNR: UE timer T310 expires and sync detections starts
- UE telnet: restore usual channel condition
- UE reconnect, does re-establishment but ends with SDAP reconfiguration assert:
[NR_RRC] Logical Channel DL-DCCH (SRB1), Received RRCReestablishment
[NR_RRC] [RAPROC] Logical Channel UL-DCCH (SRB1), Generating RRCReestablishmentComplete (bytes 2)
[NR_RRC] radio Bearer Configuration is present
Assertion (0) failed!
In nr_pdcp_reconfigure_drb() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/nr_pdcp/nr_pdcp_oai_api.c:1168
Function to reconfigure SDAP entity not implemented yet
Exiting execution
With !2571 (merged) integration re-sync is successful but we have 3 scenarios with other issues:
1) traffic takes longer to be restored due to RLC/PDCP issues
- after UE attachment to network, start a DL ping
- UE telnet: set very high ploss (> 50) or noise_power_dB (>10)
- sync loss due to low SNR: UE timer T310 expires and sync detections starts, ping is stopped at this point
- UE telnet: restore normal channel condition
- UE reconnect, does re-establishment
- after roughly 30 seconds, ping starts again. This is possibly related to these errors on UE side:
[RLC] max RETX reached on DRB 1
[RLC] ack_sn (26) not valid (tx_next_ack 0 tx_next 3), discard control PDU
Issue with RLC Entity re-establishment
UE transmits data on DRB1 before RLC Entity re-establishment, leading to SN mismatch with gNB
- a) UE last SN of DRB 1 before re-establishment is SN = 8
- b) RRC Re-establishment
- c) UE does
nr_rlc_entity_am_reestablishment
for SRB 0 innr_rlc_reestablish_default_srb
-> clear_entity - d) UE TX PDU with SN = 9 --> DRB RLC entity is not cleared yet, data present in RLC from before sync loss
- e) RRCReestablishmentComplete
- f) after a few frames UE does nr_rlc_entity_am_reestablishment for DRB 1 in
nr_rlc_reestablish_entity
-> clear_entity - g) gNB sends NACK for SN = 9 ----> failing
** Issue with missing reset of PDCP counters for AM entity**
Test to address the RLC re-establishment issue: With RLC re-establishment at the same time as nr_rlc_reestablish_default_srb
, the issue is then on gNB side:
[0m[93m[PDCP] discard NR PDU rcvd_count=1, entity->rx_deliv 21,sdu_in_list 0
possibly due to missing PDCP entity suspend
according to 5.1.4 of 3GPP TS 38.323.
However, the PDCP rx_deliv
, rx_next
and tx_next
are set to 0 only in 3 occasions:
- the PDCP Re-establishment for UM DRBs
- the PDCP Re-establishment for SRBs
- the PDCP suspend
in our case, we do the PDCP suspend for DRBs at UE, therefore resetting rx_deliv
, rx_next
and tx_next
, however we don't reset at gNB side since the DRB are AM and TS 38.323 does not requires the reset for AM DRBs.
Eventually, we would still have mismatch of PDCP counters at gNB side and PDUs will be discarded at PDCP layer.
2. encode_cellGroupConfig SEGFAULT on gNB
- RRCReestablishmentComplete OK on UE side, however there is a Segfault on NR_RRCReconfiguration generation on gNB
- it does not occur systematically
- log trace:
[NR_RRC] [RAPROC] UE 2c0a Logical Channel DL-DCCH, Generating NR_RRCReestablishment (bytes 2)
[NR_RRC] Set PDCP security UE 1 RNTI 2c0a nca 0 nia 2 in RRCReestablishment
[NR_MAC] No CU UE ID stored for UE RNTI 2c0a, adding CU UE ID 1
[NR_MAC] Remove NR rnti 0x0177
[RLC] remove UE 2c0a
[RLC] Update old UE RNTI 0177 context to RNTI 2c0a
[NR_MAC] UE 2c0a Generate msg4: feedback at 249.17, payload 21 bytes, next state WAIT_Msg4_ACK
[NR_MAC] (UE RNTI 0x2c0a) Received Ack of RA-Msg4. CBRA procedure succeeded!
[RLC] SRB0 already exists for UE with RNTI 0x2c0a, do nothing
[PDCP] discard NR PDU rcvd_count=0, entity->rx_deliv 1,sdu_in_list 0
[RLC] /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/nr_rlc/nr_rlc_oai_api.c:789:nr_rlc_add_srb: SRB 1 already exists for UE with RNTI 2c0a, do nothing
[NR_RRC] UE 1 Processing NR_RRCReestablishmentComplete from UE
[NR_RRC] UE 1 RNTI 2c0a: Generate NR_RRCReconfiguration after reestablishment complete (bytes 179)
Thread 22 "L1_rx_thread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffa24e5640 (LWP 300455)]
send_initial_ul_rrc_message (rnti=375, sdu=0x55555a4ed529 "\356?\253\261Z/z\033\266Z\264n\352\030\216b\024\346\336\aj\rx\220\351\367\270\247r\f\203.q&\334\177(\276\342\016\346&|\212", <incomplete sequence \367>, sdu_len=6, data=0x7fff93be1010) at /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_gNB/gNB_scheduler_primitives.c:3014
3014 int encoded = encode_cellGroupConfig(UE->CellGroup, du2cu, sizeof(du2cu));
(gdb) bt
#0 send_initial_ul_rrc_message (rnti=375,
sdu=0x55555a4ed529 "\356?\253\261Z/z\033\266Z\264n\352\030\216b\024\346\336\aj\rx\220\351\367\270\247r\f\203.q&\334\177(\276\342\016\346&|\212", <incomplete sequence \367>, sdu_len=6, data=0x7fff93be1010)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_gNB/gNB_scheduler_primitives.c:3014
#1 0x000055555661441b in nr_rlc_entity_tm_recv_pdu (_entity=0x7fff9805e580, buffer=<optimized out>, size=6)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/nr_rlc/nr_rlc_entity_tm.c:44
#2 0x0000555556606aae in mac_rlc_data_ind (module_idP=module_idP@entry=0, rntiP=<optimized out>,
eNB_index=eNB_index@entry=0 '\000', frameP=frameP@entry=251, enb_flagP=enb_flagP@entry=true, MBMS_flagP=MBMS_flagP@entry=false,
channel_idP=0,
buffer_pP=0x55555a4ed529 "\356?\253\261Z/z\033\266Z\264n\352\030\216b\024\346\336\aj\rx\220\351\367\270\247r\f\203.q&\334\177(\276\342\016\346&|\212", <incomplete sequence \367>, tb_sizeP=6, num_tbP=1, crcs_pP=0x0)
SEGFAULT on UE with RRCSetup fallback
- after UE attachment to network, start a DL ping
- UE telnet: set very high ploss (> 50) or noise_power_dB (>10)
- sync loss due to low SNR: UE timer T310 expires and sync detections starts, ping is stopped at this point
- wait for gNB to send RRCRelease after UL failure timer expiration
- UE telnet: restore normal channel condition
- UE sends RRCReestablishmentRequest, gNB fallback to RRCSetup
- segfault on UE side:
[MAC] Initialization of 4-step contention-based random access procedure
[NR_MAC] PRACH scheduler: Selected RO Frame 23, Slot 19, Symbol 0, Fdm 0
[PHY] PRACH [UE 0] in frame.slot 23.19, placing PRACH in position 2828, msg1 frequency start 0 (k1 0), preamble_offset 3, first_nonzero_root_idx 0
[NR_MAC] [UE 0][RAPROC] Got BI RAR subPDU 5 ms
[NR_MAC] [UE 0][RAPROC] Got RAPID RAR subPDU
[NR_MAC] [UE 0][RAPROC][24.7] Found RAR with the intended RAPID 13
[MAC] received TA command 31
[PHY] RAR-Msg2 decoded
[NR_MAC] [RAPROC][24.17] RA-Msg3 transmitted
[MAC] [UE 0][RAPROC] Frame 25 : received contention resolution identity: 0x5935e0004194 Terminating RA procedure
[MAC] [UE 0][25.10][RAPROC] RA procedure succeeded. CB-RA: Contention Resolution is successful.
[NR_RRC] [UE0][RAPROC] Logical Channel DL-CCCH (SRB0), Received NR_RRCSetup
[NR_RRC] [UE 0] Recived RRC Setup in response to re-establishment request
[PDCP] release DRB 1 of UE 0
[RLC] Added srb 1 to UE with RNTI 0x0
[MAC] Applying CellGroupConfig from gNodeB
Thread 17 "TASK_RRC_NRUE" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbf7fe640 (LWP 1022334)]
0x0000555555920973 in nr_rrc_mac_config_req_cg (module_id=<optimized out>, cc_idP=<optimized out>, cell_group_config=0x7fffa80b5540) at /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_UE/config_ue.c:1953
1953 if (mac->current_UL_BWP->rach_ConfigCommon)
(gdb) bt
#0 0x0000555555920973 in nr_rrc_mac_config_req_cg (module_id=<optimized out>, cc_idP=<optimized out>,
cell_group_config=0x7fffa80b5540)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_UE/config_ue.c:1953
#1 0x00005555558f45d0 in nr_rrc_ue_process_masterCellGroup (rrc=0x55555638b7d0,
rrcNB=<optimized out>, masterCellGroup=<optimized out>, fullConfig=<optimized out>)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/RRC/NR_UE/rrc_UE.c:848
#2 0x00005555558f5158 in nr_rrc_process_rrcsetup (rrcSetup=<optimized out>, gNB_index=0 '\000',
rrc=0x55555638b7d0)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/RRC/NR_UE/rrc_UE.c:933
#3 nr_rrc_ue_decode_ccch (rrc=0x55555638b7d0, ind=<optimized out>, gNB_index=0 '\000')
at /home/guido/repo/openairinterface5g/ue-resync/openair2/RRC/NR_UE/rrc_UE.c:1000
#4 0x00005555558fc8ec in rrc_nrue (notUsed=<optimized out>)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/RRC/NR_UE/rrc_UE.c:1608
#5 0x00005555558feebf in rrc_nrue_task (args_p=<optimized out>)
at /home/guido/repo/openairinterface5g/ue-resync/openair2/RRC/NR_UE/rrc_UE.c:1538
#6 0x00007ffff7294ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#7 0x00007ffff7326850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
the reason is that SIB1 is not decoded yet and mac->current_UL_BWP
is not allocated by nr_rrc_mac_config_req_sib1
and it is cleared by release_mac_configuration
which is called twice:
a) handle_rlf_detection
-- > nr_rrc_initiate_rrcReestablishment
--> nr_rrc_mac_config_req_reset
--> release_mac_configuration
which works fine.
b) Then it is called when receiving the RRCSetup
message in response to the RRCReestablishmentRequest
message:
nr_rrc_process_rrcsetup
--> nr_rrc_mac_config_req_reset
--> calling release_mac_configuration
Since there is this check if (cause == RE_ESTABLISHMENT) {
in release_mac_configuration the current_UL_BWP
is not allocated when RRCSetup is received but only when RRCReestablishment
is received.
Issues after sync loss before re-sync
- Set medium noise levels in downlink channel (e.g. between 5/10 dB)
- UE has an UL failure on gNB side
- at UE side timer T310 does not expire and sync detection does not start. Same issue as uplink tests:
[NR_MAC] [UE 0] Release all SRs
[NR_MAC] [UE 0] PDCCH Tick : MAC BSR Triggered ReTxBSR Timer expiry at frame 639 slot 18
it ends with this assert:
[NR_MAC] Couldn't find SRS aperiodic resource with trigger 2
Assertion (is_nr_UL_slot(tdd_config, slot_tx, mac->frame_type) != 0) failed!
In lockGet_ul_config() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_UE/nr_ue_scheduler.c:67
UL config_request called at wrong slot 12
[NR_MAC] Couldn't find SRS aperiodic resource with trigger 3
Assertion (is_nr_UL_slot(tdd_config, slot_tx, mac->frame_type) != 0) failed!
In lockGet_ul_config() /home/guido/repo/openairinterface5g/ue-resync/openair2/LAYER2/NR_MAC_UE/nr_ue_scheduler.c:67
UL config_request called at wrong slot 16
Notes on RFSIM channel simulation (to be investigated)
Channel model application
In apply_channelmod.c
log:
- Inconsistencies in the computed output power, which increases proportionally with the pathloss. The higher is output power value the worst is channel condition (counterintuitive):
noise_power_dB -4: Input power -inf, output power: 43.157185, channel path loss 20.000000, noise coeff: 20.082400
noise_power_dB 15: Input power -inf, output power: 81.159243, channel path loss 20.000000, noise coeff: 39.082400
- Output power increases slowly with a pathloss increase:
pathloss 20: Input power -inf, output power: 43.177918, channel path loss 20.000000, noise coeff: 20.082400
pathloss 40: Input power -inf, output power: 44.358762, channel path loss 40.000000, noise coeff: 20.082400
pathloss starts having an impact when ploss > 40
.