IRQ Balancing


On Linux, interrupts are handled automatically by the kernel. In particular, a daemon named irqbalance is responsible for distributing interrupts across processors. Unfortunately the default is to let all processors handle all interrupts, and the resulting performance is suboptimal, in particular on multi-core systems. This is because modern NICs have multiple RX queues that work well only if cache coherency is respected: interrupts for queue Y of ethX must be sent to one core, or at most to a core and its Hyper-Threaded (HT) sibling. If multiple processors handle the same RX queue, the cache is repeatedly invalidated and performance degrades. For this reason IRQ balancing is key to performance: what I suggest is to have one core (or two, in the case of HT) handle each interrupt.

By default, however, Linux sends interrupts to all processors, i.e. /proc/irq/X/smp_affinity is set to ffffffff, which means all processors. Instead, as stated above, it is better to avoid having every processor handle every interrupt. Example:

~# grep eth /proc/interrupts
191: 0 0 3 0 0 0 2 310630615 454 0 0 0 0 0 2 0 PCI-MSI-edge eth5-rx-3
192: 0 3 0 0 0 2 0 314774529 0 0 0 0 0 2 0 0 PCI-MSI-edge eth5-rx-2
193: 3 0 0 0 2 309832652 454 0 0 0 0 0 2 0 0 0 PCI-MSI-edge eth5-rx-1
194: 0 0 0 2 0 314283930 0 0 0 0 0 2 0 0 0 3 PCI-MSI-edge eth5-TxRx-0
195: 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 PCI-MSI-edge eth5
196: 0 3 0 311226806 0 0 0 0 0 2 0 0 0

where


# cat /proc/irq/19[12345]/smp_affinity
00008080
00008080
00002020
00002020
ffffffff
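Each smp_affinity value is a hexadecimal CPU bitmask: 00008080 sets bits 7 and 15 (a core and its HT sibling on this box), 00002020 sets bits 5 and 13. As a sketch, assuming a 16-thread machine where the HT sibling of core N is core N+8 (check /sys/devices/system/cpu/cpuX/topology/thread_siblings_list on your own hardware), such a mask can be built like this:

```shell
#!/bin/sh
# Build an smp_affinity bitmask for one core plus its Hyper-Threaded
# sibling. Assumption: 16 hardware threads, sibling of core N is N+8.
core=7
sibling=$(( core + 8 ))
mask=$(printf '%08x' $(( (1 << core) | (1 << sibling) )))
echo "$mask"    # 00008080, i.e. CPUs 7 and 15

# Pin IRQ 191 (eth5-rx-3 above) to that pair (requires root):
# echo "$mask" > /proc/irq/191/smp_affinity
```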

This setup maximizes performance, in particular when using PF_RING and TNAPI. Please disable irqbalance when manually tweaking interrupt affinity, as irqbalance will otherwise redistribute interrupts the way it likes, jeopardizing all your work.
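Putting it together, a minimal sketch of pinning the eth5 queue IRQs to the masks shown above (IRQ numbers and masks are from this example; take yours from /proc/interrupts, and stop irqbalance first, e.g. with "service irqbalance stop"):

```shell
#!/bin/sh
# Pin each eth5 RX/TxRx IRQ to a dedicated core pair, mirroring the
# smp_affinity values shown earlier. Adjust IRQ numbers and masks to
# your own /proc/interrupts output.
for pair in 191:00008080 192:00008080 193:00002020 194:00002020; do
  irq=${pair%%:*}
  mask=${pair##*:}
  echo "IRQ $irq -> mask $mask"
  # echo "$mask" > /proc/irq/$irq/smp_affinity   # requires root
done
```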

Further reading: