After some testing with 2024.w04, LITEON, Quectel RM500Q, Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-1050-realtime x86_64) on a Xeon(R) Gold 6442Y, I suggest considering the following changes:
1. Add a configurable system_core, for systems where CPU 0 cannot reasonably be isolated from kernel threads.
(DPDK uses system_core for its control threads: rte_mp_handle, eal-intr-thread, iavf-event-thread.)
2. Add an explicit affinity assignment of ru_thread after initialization of the XRAN library.
+
+  cpu = sched_getcpu();
+
+  if (cpu != ru->ru_thread_core)
+  {
+    cpu_set_t cpuset;
+    CPU_ZERO(&cpuset);
+    CPU_SET(ru->ru_thread_core, &cpuset);
+    //printf("AAA2 cpu %d instead of %d\n", cpu, ru->ru_thread_core);
+    int ret = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+    AssertFatal(ret == 0, "Error in pthread_setaffinity_np(): ret: %d, errno: %d", ret, errno);
+  }
+
   if (setup_RU_buffers(ru)!=0) {
     printf("Exiting, cannot initialize RU Buffers\n");
     exit(-1);
3. XRAN library: initialize EAL without telemetry, to skip the creation of one or two more (DPDK 21.11.5) DPDK control threads.
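To make suggestions 1 and 3 concrete, here is a minimal sketch of an EAL argument list that names a main lcore and disables telemetry. `-l`, `--main-lcore` and `--no-telemetry` are standard DPDK EAL options. Everything else is an assumption for illustration: the `struct eal_args` and `build_eal_args()` helpers are hypothetical, and the idea that system_core would be passed to DPDK as the main lcore is not confirmed anywhere in this thread. How the XRAN library actually assembles its EAL arguments may differ.

```c
#include <assert.h>
#include <stdio.h>

#define EAL_MAX_ARGS 8

/* Hypothetical container: owns the strings that argv points into. */
struct eal_args {
  char core_list[32];
  char main_lcore[16];
  const char *argv[EAL_MAX_ARGS];
  int argc;
};

/* Build an EAL argument vector with an explicit main lcore and no telemetry.
 * Assumption: system_core is the core reserved for DPDK housekeeping and
 * worker_core is a packet-processing core. */
void build_eal_args(struct eal_args *a, int system_core, int worker_core)
{
  assert(a != NULL && system_core >= 0 && worker_core >= 0);

  /* The main lcore must be part of the lcore list given with -l. */
  snprintf(a->core_list, sizeof(a->core_list), "%d,%d", system_core, worker_core);
  snprintf(a->main_lcore, sizeof(a->main_lcore), "%d", system_core);

  int n = 0;
  a->argv[n++] = "nr-softmodem";   /* argv[0], placeholder program name */
  a->argv[n++] = "-l";
  a->argv[n++] = a->core_list;
  a->argv[n++] = "--main-lcore";
  a->argv[n++] = a->main_lcore;
  a->argv[n++] = "--no-telemetry"; /* do not start the telemetry thread(s) */
  a->argc = n;
  a->argv[n] = NULL;

  /* With DPDK available, the vector would then be handed to
   * rte_eal_init(a->argc, (char **)a->argv). */
}
```

For example, `build_eal_args(&a, 2, 4)` yields the arguments `-l 2,4 --main-lcore 2 --no-telemetry`.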
I agree with 1 and 3. Regarding 2, I have the following remarks:
We should set this in the oran fhlib library, which we call into from ru_thread() via the function openair0_transport_load() just above. The reason is that so far this only seems necessary for 7.2; many other radios, e.g., USRP, run with threads floating across all cores, so we should not force it for that use case.
In fact, such a code snippet was present in earlier versions, but I did not really understand why, so I commented it out. The code is actually still there; see the commented-out function set_main_core() in radio/fhi_72/oran-init.c. Sorry, I did not realize this earlier.
I have an open MR !2559; if you don't mind, I will add it there tomorrow and inform you when it is done.
I don't have an explanation, but it seems that after eal_init the affinity of the parent thread (ru_thread) is changed. It might be a local problem: I have observed it on one system at every run (see printout) and plan to verify it on another.
> (DPDK uses system_core for its control threads: rte_mp_handle, eal-intr-thread, iavf-event-thread).
Do you have more information on that? I can find neither how xran tells DPDK which core is the system_core, nor any references online to a system_core used in DPDK. I will add it, but I still don't understand why it is necessary.
DPDK eal_init is usually called from the main thread, which is not supposed to be a packet-processing thread. In our case, DPDK decides (compute_ctrl_threads_cpuset) to spread its control threads according to the affinity of the current thread (ru_thread). Running ru_thread and the DPDK control threads on the same CPU causes problems on my setup.
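One way to guard against a library init call changing the caller's affinity is to save the affinity mask beforehand and restore it afterwards. The following is only a sketch under stated assumptions: `init_preserving_affinity()` and `narrow_to_first_allowed_cpu()` are hypothetical names, and the latter is merely a stand-in that mimics an init routine altering the calling thread's affinity. It does not call DPDK or xran.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

typedef int (*init_fn)(void);

/* Stand-in for a library init that changes the caller's affinity:
 * it pins the calling thread to the first CPU it is allowed to run on. */
int narrow_to_first_allowed_cpu(void)
{
  cpu_set_t cur, one;
  int ret = pthread_getaffinity_np(pthread_self(), sizeof(cur), &cur);
  if (ret != 0)
    return ret;
  CPU_ZERO(&one);
  for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
    if (CPU_ISSET(cpu, &cur)) {
      CPU_SET(cpu, &one);
      break;
    }
  }
  return pthread_setaffinity_np(pthread_self(), sizeof(one), &one);
}

/* Run init() and put the calling thread's affinity mask back afterwards.
 * Returns a pthread error number if saving/restoring fails, otherwise
 * whatever init() returned. Note that pthread_*affinity_np() report errors
 * through their return value, not through errno. */
int init_preserving_affinity(init_fn init)
{
  assert(init != NULL);

  cpu_set_t saved;
  int ret = pthread_getaffinity_np(pthread_self(), sizeof(saved), &saved);
  if (ret != 0)
    return ret;

  int init_ret = init(); /* may change this thread's affinity */

  ret = pthread_setaffinity_np(pthread_self(), sizeof(saved), &saved);
  if (ret != 0)
    return ret;
  return init_ret;
}
```

Restoring the saved mask only repairs the calling thread. Threads that the library created during init keep whatever affinity they were given, so this does not by itself move DPDK control threads off the ru_thread core.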
@asergeev You are running OAI under taskset, whereas we don't do that. That is why we see a difference. I don't know whether what you are doing is good or not, because the threads we don't pin will take the threadpool threads. I don't know how it will affect the KPIs. You can see the difference between your output and mine.
Can you provide the output of the script below? It prints the same kind of listing as mine above, i.e., the allowed affinity for each thread.
You need to pass the nr-softmodem process ID, as it is the parent.
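The script itself is not reproduced in this thread. As a rough illustration of the kind of listing meant here, the sketch below walks `/proc/<pid>/task` on Linux and prints the allowed CPUs of every thread of a process. It is not the original script; the function name `print_thread_affinities()` and the output format are assumptions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Print "name tid cpus: ..." for every thread of process pid (Linux only).
 * A pid <= 0 means the calling process. Returns the number of threads
 * listed, or -1 if /proc/<pid>/task cannot be opened. */
int print_thread_affinities(pid_t pid, FILE *out)
{
  assert(out != NULL);
  if (pid <= 0)
    pid = getpid();

  char path[64];
  snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
  DIR *dir = opendir(path);
  if (dir == NULL)
    return -1;

  int count = 0;
  struct dirent *ent;
  while ((ent = readdir(dir)) != NULL) {
    char *end;
    long tid = strtol(ent->d_name, &end, 10);
    if (*end != '\0' || tid <= 0)
      continue; /* skips "." and ".." */

    /* Thread name as shown by the kernel, e.g. "eal-intr-thread". */
    char name[32] = "?";
    char comm_path[96];
    snprintf(comm_path, sizeof(comm_path), "/proc/%d/task/%ld/comm", (int)pid, tid);
    FILE *f = fopen(comm_path, "r");
    if (f != NULL) {
      if (fgets(name, sizeof(name), f) != NULL)
        name[strcspn(name, "\n")] = '\0';
      fclose(f);
    }

    cpu_set_t set;
    if (sched_getaffinity((pid_t)tid, sizeof(set), &set) != 0)
      continue; /* the thread may have exited in the meantime */

    fprintf(out, "%-16s tid %ld cpus:", name, tid);
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
      if (CPU_ISSET(cpu, &set))
        fprintf(out, " %d", cpu);
    }
    fputc('\n', out);
    count++;
  }
  closedir(dir);
  return count;
}
```

Called with the nr-softmodem process ID, e.g. `print_thread_affinities(pid, stdout)`, it lists one line per thread, which makes it easy to see whether ru_thread and the DPDK control threads share a core.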
@schmidtr So the behaviour on our machines is not the same. In our case, ru_thread and the DPDK threads use the same CPU core. We need to see how to use system_core to force the use of cores other than core 0.