Improving snort performance using PF_RING and multi-queue adapters


As of PF_RING 4.5.x, the user-space tools shipped with PF_RING include native snort support. Since version 2.9, snort sits on top of a library called DAQ (Data Acquisition library), which provides a transparent layer between snort and the packet capture modules.

PF_RING is now a first-class citizen in DAQ: the PF_RING DAQ module lives in PF_RING/userland/snort. This module not only lets snort take advantage of PF_RING acceleration, it also lets snort offload some of its processing tasks to PF_RING. For instance, when snort marks a communication flow as “bad” (i.e. all packets of the flow must be dropped because they are suspicious), the PF_RING DAQ module drops those packets directly inside PF_RING instead of wasting precious CPU cycles moving them off the adapter to snort, which would only drop them anyway. In practice, the module intercepts the snort “verdict”: if snort decides that a flow must be dropped, PF_RING transparently adds a filtering rule to the kernel module so that future packets belonging to the same flow are dropped in the kernel. As you can imagine, this is a major performance improvement, because packets marked as bad never reach snort, nor are they moved off PF_RING: they are dropped immediately.
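To make the verdict mechanism concrete, here is a minimal Python sketch of the idea. It is a toy model only, not the PF_RING or DAQ API: the names ToyCapture, Flow and toy_inspector are hypothetical, a plain set stands in for the in-kernel filtering rules, and a trivial function plays the role of snort.

```python
from typing import Callable, NamedTuple, Set


class Flow(NamedTuple):
    """Five-tuple identifying a communication flow."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


PASS, BLOCK = "pass", "block"


class ToyCapture:
    """Toy stand-in for the capture layer (hypothetical, not the PF_RING API)."""

    def __init__(self, inspect: Callable[[Flow, bytes], str]) -> None:
        self.inspect = inspect               # plays the role of snort
        self.drop_rules: Set[Flow] = set()   # plays the role of kernel filtering rules
        self.delivered = 0                   # packets handed to the inspector
        self.dropped_early = 0               # packets dropped before the inspector

    def receive(self, flow: Flow, payload: bytes) -> None:
        # Packets of an already blocked flow never reach the inspector.
        if flow in self.drop_rules:
            self.dropped_early += 1
            return
        self.delivered += 1
        # Intercept the verdict and install a drop rule for the whole flow.
        if self.inspect(flow, payload) == BLOCK:
            self.drop_rules.add(flow)


def toy_inspector(flow: Flow, payload: bytes) -> str:
    """Hypothetical detection logic: block any flow carrying the word 'attack'."""
    return BLOCK if b"attack" in payload else PASS


cap = ToyCapture(toy_inspector)
bad = Flow("10.0.0.1", 4444, "10.0.0.2", 80, "tcp")
good = Flow("10.0.0.3", 5555, "10.0.0.2", 80, "tcp")

cap.receive(bad, b"attack payload")  # inspected, verdict BLOCK, rule installed
cap.receive(bad, b"more data")       # dropped before the inspector
cap.receive(bad, b"even more")       # dropped before the inspector
cap.receive(good, b"hello")          # inspected, verdict PASS

print(cap.delivered, cap.dropped_early)  # prints "2 2"
```

Only the first packet of the bad flow costs inspection time; every later packet of that flow is discarded by the rule lookup alone, which is where the saving comes from.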

This is not all you can do with PF_RING. When you combine multi-queue adapters such as the Intel ET (1 Gbit) and X520 (10 Gbit) with PF_RING and TNAPI, you can run multiple snort instances on top of the same ethernet port, with each instance receiving a portion of the traffic. The magic is done by PF_RING, which avoids merging the NIC RX queues: it preserves the queue information and exposes each queue as a virtual ethernet interface (e.g. eth0@3 is RX queue 3 of the eth0 adapter). This lets users parallelize snort execution (snort is single-threaded, so this is a good feature to have): several instances can run simultaneously on different CPU cores, fully unleashing the power of multi-core computers because all cores are busy processing packets.
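As a rough illustration of the one-instance-per-queue setup, the Python sketch below builds one command line per RX queue and pins each instance to its own core. Treat it as a sketch under assumptions: the helper names are hypothetical, queues are assumed to be numbered from 0, the DAQ module is assumed to be registered as pfring, and the DAQ directory and snort.conf path are placeholders to adapt to your installation.

```python
import shlex
from typing import List


def queue_interfaces(device: str, num_queues: int) -> List[str]:
    """Virtual interface names, one per RX queue (assumes queues start at 0)."""
    return [f"{device}@{queue}" for queue in range(num_queues)]


def snort_command(interface: str, core: int,
                  daq_dir: str = "/usr/local/lib/daq",        # placeholder path
                  config: str = "/etc/snort/snort.conf") -> str:  # placeholder path
    """Command line for one snort instance pinned to one CPU core."""
    argv = [
        "taskset", "-c", str(core),   # pin the instance to a single core
        "snort",
        "--daq-dir", daq_dir,         # where the DAQ modules are installed
        "--daq", "pfring",            # assumed name of the PF_RING DAQ module
        "-i", interface,              # per-queue virtual interface
        "-c", config,
    ]
    return " ".join(shlex.quote(arg) for arg in argv)


# One instance per queue, queue N on core N.
commands = [
    snort_command(iface, core)
    for core, iface in enumerate(queue_interfaces("eth0", 4))
]

for command in commands:
    print(command)
```

Mapping queue N to core N is just the simplest choice; on a real box you would probably also want the queue's interrupt handled on the same core.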

If this does not seem to be enough, you can do more. As most of you know, 82599-based adapters can assign traffic to cores by means of a component named flow director (FD). Because FD is implemented in hardware, the dirty work is performed inside the NIC and not in the kernel. FD also allows packets to be dropped in the NIC (i.e. at wire speed, in hardware) by means of filtering rules (up to 32k perfect filters, or an effectively unlimited number of hash filters, where the price you pay is false positives). PF_RING lets you take advantage of FD: you set a filtering rule via PF_RING and, if your NIC is FD-aware, packets are filtered in the NIC without wasting any CPU cycles. If you want, you can read all the details in this article.
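The trade-off between perfect and hash filters can be illustrated with a small Python model. This is a conceptual sketch only and does not reproduce how the 82599 implements flow director: the class names are hypothetical, CRC32 stands in for whatever hash the NIC uses, and the capacity and bucket counts are kept tiny so the effect is visible.

```python
import zlib
from typing import Set, Tuple

FlowKey = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


class PerfectFilter:
    """Exact-match rules: no false positives, but a hard capacity limit."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rules: Set[FlowKey] = set()

    def add(self, flow: FlowKey) -> bool:
        if flow in self.rules:
            return True
        if len(self.rules) >= self.capacity:
            return False  # table full, rule rejected
        self.rules.add(flow)
        return True

    def matches(self, flow: FlowKey) -> bool:
        return flow in self.rules


class HashFilter:
    """Only hash buckets are stored: never full, but colliding flows match too."""

    def __init__(self, num_buckets: int) -> None:
        self.num_buckets = num_buckets
        self.buckets: Set[int] = set()

    def _bucket(self, flow: FlowKey) -> int:
        # CRC32 is an arbitrary stand-in hash for this illustration.
        return zlib.crc32(repr(flow).encode()) % self.num_buckets

    def add(self, flow: FlowKey) -> bool:
        self.buckets.add(self._bucket(flow))
        return True

    def matches(self, flow: FlowKey) -> bool:
        return self._bucket(flow) in self.buckets


blocked = ("10.0.0.1", 4444, "10.0.0.2", 80)
innocent = [("192.168.1.1", port, "10.0.0.2", 80) for port in range(1000, 2000)]

perfect = PerfectFilter(capacity=2)  # the article's figure would be 32 * 1024
hashed = HashFilter(num_buckets=4)   # deliberately tiny to provoke collisions
perfect.add(blocked)
hashed.add(blocked)

perfect_hits = sum(perfect.matches(flow) for flow in innocent)
false_positives = sum(hashed.matches(flow) for flow in innocent)
print("innocent flows hit by perfect filter:", perfect_hits)
print("innocent flows hit by hash filter:", false_positives)
```

The perfect filter never touches an innocent flow but refuses new rules once it is full. The hash filter accepts any number of rules, and the fewer buckets it has relative to the traffic mix, the more innocent flows it drops by mistake.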

Imagine that snort encounters a bad flow and decides to drop it. Thanks to PF_RING, if you have an 82599-based adapter, you can block those bad packets directly in the NIC. Transparently. At 10 Gbit wire speed. All this using a commodity network adapter.

Is the time of costly FPGA-based NICs over? Maybe. Have fun with PF_RING for now.