Until last month, I struggled to reach 7 Mpps packet capture using TNAPI. This week I still see users asking how to handle 2 x 1 Gbit wire rate on commodity hardware.
I believe it’s now time to move to the next level and achieve full 10 Gbit wire rate on both RX and TX, using so few CPU cycles that we can not just capture but also process traffic. Together with Silicom we have developed a 10 Gbit PF_RING DNA driver that we’ll soon introduce to the Linux community. We have been amazed to see how efficient modern commodity hardware is when it is programmed properly for high-speed networking. The tests below were run on a Supermicro server PC with a low-end X3450 Xeon processor, with the 82599-based NIC configured to use just one queue (you can imagine what you can do with multiple queues).
TX on one 10 Gbit port
# ./pfsend -i dna:eth5 -g 1 -l 60 -n 0 -r 10
Sending packets on dna:eth5
Using PF_RING v.4.6.5
Estimated CPU freq: 2397347000 Hz
Number of 64-byte Packet Per Second at 10.00 Gbit/s: 14880952.38
TX rate: [current 14754203.44 pps/8104.73 Mbps][average 14752836.88 pps/8103.98 Mbps]
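For readers wondering where the 14880952.38 figure comes from, here is a minimal sketch of the usual wire-rate arithmetic. It assumes the standard Ethernet per-frame overhead of 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap); the function and constant names are mine, not part of PF_RING.

```python
# Per-frame overhead on the wire that is not part of the frame itself:
# 8 bytes preamble + start-of-frame delimiter, 12 bytes inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum packets per second for a given link speed and frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


if __name__ == "__main__":
    # Minimum-size (64-byte) frames on a 10 Gbit link.
    print(f"{line_rate_pps(10e9, 64):.2f} pps")
```

With 64-byte frames each packet occupies 84 bytes (672 bits) of wire time, which gives the 14.88 Mpps ceiling that pfsend reports.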
RX on one 10 Gbit port
# ./pfcount -i dna:eth5
...
Actual Stats: 14879572 pkts [1000.0 ms][14879259.5 pkt/sec]
RX on two 10 Gbit ports simultaneously
# ./pfcount -i dna:eth5
# ./pfcount -i dna:eth6
Aggregate throughput 25.940 Mpps
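To put these numbers in perspective, the sketch below expresses each measurement as a fraction of the theoretical 64-byte line rate. The rates are copied from the outputs above; the helper name is mine, and I am assuming the two-port test used the same minimum-size packets.

```python
# 64-byte line rate at 10 Gbit/s, as reported by pfsend.
LINE_RATE_PPS = 14_880_952.38


def fraction_of_line_rate(measured_pps: float, ports: int = 1) -> float:
    """Measured rate divided by the combined line rate of `ports` links."""
    return measured_pps / (ports * LINE_RATE_PPS)


if __name__ == "__main__":
    results = {
        "TX, one port (average)": fraction_of_line_rate(14_752_836.88),
        "RX, one port": fraction_of_line_rate(14_879_259.5),
        "RX, two ports (aggregate)": fraction_of_line_rate(25.940e6, ports=2),
    }
    for label, fraction in results.items():
        print(f"{label}: {fraction:.2%}")
```

In other words, single-port RX is essentially at wire rate, TX is at about 99%, and the two-port aggregate is at about 87% of what two 10 Gbit links can carry.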
Besides the performance figures, what is really striking is how little load this puts on the CPU. Let’s consider this simple example (RX):
Actual Stats: 7470601 pkts [1'000.04 ms][7'470'242.42 pkt/sec]
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1474 root 20 0 8536 604 496 S 52 0.0 0:39.26 pfcount
So with 52% load on one core (the X3450 exposes 8 of them to the OS: 4 physical cores with Hyper-Threading) you can capture over 7 Mpps with pfcount.
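A rough way to read the figures above is as a per-packet budget. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes the clock frequency printed by pfsend applies to the core running pfcount, and it takes the %CPU value from top at face value, which is only approximate.

```python
CPU_HZ = 2_397_347_000      # "Estimated CPU freq" from the pfsend output
CORE_LOAD = 0.52            # %CPU reported by top for pfcount
CAPTURE_PPS = 7_470_242.42  # pkt/sec from the pfcount "Actual Stats" line


def cycles_per_packet(cpu_hz: float, core_load: float, pps: float) -> float:
    """CPU cycles spent per captured packet at the given core utilization."""
    return cpu_hz * core_load / pps


def nanoseconds_per_packet(core_load: float, pps: float) -> float:
    """CPU time spent per captured packet, in nanoseconds."""
    return core_load / pps * 1e9


if __name__ == "__main__":
    cycles = cycles_per_packet(CPU_HZ, CORE_LOAD, CAPTURE_PPS)
    nanos = nanoseconds_per_packet(CORE_LOAD, CAPTURE_PPS)
    print(f"~{cycles:.0f} cycles, ~{nanos:.0f} ns of CPU time per packet")
```

Under those assumptions the capture path costs on the order of 170 cycles (about 70 ns) per packet, which is what leaves room for actual traffic processing on the same core.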
It looks like packet capture is no longer a problem, and once PCIe gen-3 is out we can likely support 40 Gbit adapters using DNA (as soon as they become available, of course).
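Why wait for PCIe gen-3? A quick sketch of the bus arithmetic makes the point. It assumes an x8 slot (my assumption about a future 40 Gbit adapter) and counts only line-encoding overhead, so real usable throughput is lower still once PCIe packet headers and flow control are accounted for.

```python
# (transfer rate per lane in transfers/s, encoding efficiency)
PCIE_GENERATIONS = {
    2: (5e9, 8 / 10),     # gen-2: 5 GT/s with 8b/10b encoding
    3: (8e9, 128 / 130),  # gen-3: 8 GT/s with 128b/130b encoding
}


def pcie_bandwidth_bps(generation: int, lanes: int) -> float:
    """Raw per-direction bandwidth after line encoding, in bits per second."""
    transfers_per_second, efficiency = PCIE_GENERATIONS[generation]
    return transfers_per_second * efficiency * lanes


if __name__ == "__main__":
    for generation in (2, 3):
        gbps = pcie_bandwidth_bps(generation, lanes=8) / 1e9
        verdict = "enough" if gbps >= 40 else "not enough"
        print(f"PCIe gen-{generation} x8: {gbps:.1f} Gbit/s, {verdict} for 40 Gbit")
```

A gen-2 x8 slot tops out at 32 Gbit/s per direction before protocol overhead, so it cannot feed a 40 Gbit port at wire rate, whereas gen-3 x8 offers roughly 63 Gbit/s.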