
The same head PDCP SDU is popped twice by two threads in TDD, causing a memory leak and eventually crashing the UE on a memory allocation failure

**symptom:**
The UE asserts in get_free_mem_block() because the MEM_BLOCK allocation fails. This happens after running an iperf downlink test for a few tens of minutes.


**cause:**
The two UE_thread_rxn_txnp4() threads access the same head SDU of pdcp_sdu_list when both are running pdcp_fifo_flush_sdus() at the same time. The log below captures the issue: the SDU with inst=62004 is flushed by both the "odd" and the "even" thread.

[PDCP][I][pdcp_data_ind] inst=62004 size=1460
[PDCP][I][pdcp_data_ind] inst=62005 size=58
[PDCP][I][pdcp_data_ind] inst=62006 size=1460
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_odd] inst=62004 size=1460
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] inst=62004 size=1460
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] inst=62005 size=58
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] inst=62006 size=1460
[PDCP][I][pdcp_fifo_flush_sdus] 1 skip free_mem_block: pdcp_output_sdu_bytes_to_write = -58
[PDCP][I][pdcp_fifo_flush_sdus] 6 skip free_mem_block: bytes_wrote = -58

It happens when the "even" UE_thread_rxn_txnp4() is processing a downlink subframe and the "odd" UE_thread_rxn_txnp4() is processing a special subframe. The "even" thread reaches pdcp_fifo_flush_sdus() later in its TTI because it must first finish phy_procedures_UE_RX(), whereas the "odd" thread goes straight to pdcp_fifo_flush_sdus() at the beginning of its TTI because it skips all the phy_procedures_UE_*() calls. The collision occurs only by chance, but once it happens the memory leak begins and does not stop until memory allocation fails in get_free_mem_block().

When the collision occurs, the global variable pdcp_output_sdu_bytes_to_write is changed unexpectedly by the other thread. pdcp_fifo_flush_sdus() then never reaches the part that calls free_mem_block(), which causes the memory leak.
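
The effect on the shared counter can be illustrated with a deterministic, single-threaded replay. This is only a simplified sketch: `flush_state_t`, `step_load()`, `step_write()` and `replay_collision()` are hypothetical names, the interleaving is one assumed ordering rather than a reconstruction of the -58 seen in the log, and the real pdcp_fifo_flush_sdus() is not reproduced. It shows just how two threads working on the same head SDU can push the counter past zero so that the branch guarding the free is not taken.

```c
#include <stddef.h>

/* Simplified, hypothetical model of the state shared by both threads. */
typedef struct {
  int head_size;      /* size of the SDU at the head of the list */
  int bytes_to_write; /* shared counter, like pdcp_output_sdu_bytes_to_write */
  int frees;          /* how many times the free branch was taken */
} flush_state_t;

/* First half of a flush: load the head SDU size into the shared counter. */
static void step_load(flush_state_t *s)
{
  s->bytes_to_write = s->head_size;
}

/* Second half: account for written bytes; free only if the counter is 0. */
static void step_write(flush_state_t *s, int bytes_written)
{
  s->bytes_to_write -= bytes_written;
  if (s->bytes_to_write == 0) {
    s->frees++; /* stands in for removing the head and free_mem_block() */
  }
}

/* Replays one assumed bad interleaving of two threads on the same SDU. */
int replay_collision(flush_state_t *s)
{
  step_load(s);                /* odd:  counter = head_size */
  step_load(s);                /* even: counter = head_size (same SDU) */
  step_write(s, s->head_size); /* odd:  counter = 0, free branch taken */
  step_write(s, s->head_size); /* even: counter < 0, free branch skipped */
  return s->bytes_to_write;
}
```

With `head_size` set to 1460, as for the doubly flushed SDU in the log, the replay leaves the counter negative and the free branch is taken only once for two flush attempts. How that skipped branch turns into a persistent leak depends on the real function and is not modeled here.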


**fix:**
Add mutex protection to pdcp_fifo_flush_sdus(). The protection can be disabled by commenting out the macro PDCP_SDU_FLUSH_LOCK.

A log message is also added for when such a collision occurs, as follows.

[PDCP][I][pdcp_data_ind] inst=698474 size=1460
[PDCP][I][pdcp_data_ind] inst=698475 size=58
[PDCP][I][pdcp_data_ind] inst=698476 size=1460
[PDCP][I][pdcp_data_ind] inst=698477 size=58
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] SFN/SF=77808/0 inst=698474 size=1460
[PDCP][W][pdcp_fifo_flush_sdus] [rxn_txnp4_odd] at SFN/SF=77808/1 wait for PDCP FIFO to be unlocked
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] SFN/SF=77808/0 inst=698475 size=58
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] SFN/SF=77808/0 inst=698476 size=1460
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_even] SFN/SF=77808/0 inst=698477 size=58
[PDCP][I][pdcp_fifo_flush_sdus] [rxn_txnp4_odd] at SFN/SF=77808/1 PDCP FIFO is unlocked
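
A minimal sketch of how such a guard and the contention messages could look, assuming a POSIX mutex that is first tried with pthread_mutex_trylock() so that the waiting thread can log before it blocks. Only the macro name PDCP_SDU_FLUSH_LOCK comes from this change; `sdu_flush_mutex`, `flush_body()` and `flush_sdus_locked()` are hypothetical names, and the fprintf() calls stand in for the project's own logging. This is not the patch itself.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Comment this macro out to build the flush path without locking. */
#define PDCP_SDU_FLUSH_LOCK

#ifdef PDCP_SDU_FLUSH_LOCK
/* Hypothetical name: one mutex guarding the SDU list and its counters. */
static pthread_mutex_t sdu_flush_mutex = PTHREAD_MUTEX_INITIALIZER;
#endif

/* Stand-in for the real flush loop: only counts how often it ran. */
static int flush_calls = 0;
static void flush_body(void)
{
  flush_calls++;
}

/* Returns 0 on success, or the pthread error code if locking failed. */
int flush_sdus_locked(const char *thread_name)
{
#ifdef PDCP_SDU_FLUSH_LOCK
  int rc = pthread_mutex_trylock(&sdu_flush_mutex);
  if (rc == EBUSY) {
    /* The other thread is flushing: report the collision, then block. */
    fprintf(stderr, "[%s] wait for PDCP FIFO to be unlocked\n", thread_name);
    rc = pthread_mutex_lock(&sdu_flush_mutex);
    if (rc == 0) {
      fprintf(stderr, "[%s] PDCP FIFO is unlocked\n", thread_name);
    }
  }
  if (rc != 0) {
    return rc; /* lock not held: do not touch the shared list */
  }
#else
  (void)thread_name;
#endif

  flush_body(); /* only one thread at a time gets here when locking is on */

#ifdef PDCP_SDU_FLUSH_LOCK
  pthread_mutex_unlock(&sdu_flush_mutex);
#endif
  return 0;
}
```

Each UE_thread_rxn_txnp4() instance would call the wrapper with its own name, for example `flush_sdus_locked("rxn_txnp4_odd")`. Trying the lock first costs one extra call in the uncontended case, but it lets the warning be printed only when a collision actually happens.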


**other:**
This issue is expected to be unlikely in FDD. In TDD, it is expected to be prominent only when PDSCH is disabled in the special subframe, i.e. when special-subframe-conf is 0 or 5.