2. Intel FM10K nBPF Support

This module allows setting filtering rules directly on the NIC using the BPF-like syntax supported by nBPF filters. Since the complexity of a filter expression affects whether it can be translated into NIC-specific rules, we define a set of constraints and allowed expressions.

2.1. Supported cards

  • Silicom PE3100G2DQIR [chip Intel FM10000 (code-name Red Rock Canyon)]

2.2. Requirements

This library is part of libpfring. In order to compile libpfring with Red Rock Canyon (RRC) filtering support, you have to install the RDIF software and configure/make the PF_RING fm10k driver.

2.2.1. RDIF software

The RDIF software provides a daemon (rdifd) and a control tool (rdifctl). RDIF is available in the “RRC_100G_1R1b” package provided by Silicom, Inc.

To compile and install the RDIF software in the system directory do:

cd RRC_100G_1R1b/Linux/Redirect/RD_RRC_Control
tar xzvf rdif-*.tar.gz
cp fm_platform_attributes.cfg rdif-*/driver/
cd rdif-*/
./clean
sudo ./install

The rdifd daemon does not start automatically: you need to start it manually, or via the load_driver.sh script discussed in the following section.
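
For a manual start, something along these lines should work (a sketch: we assume rdifd needs no mandatory options; check the Silicom documentation for the exact invocation):

sudo rdifd    # start the RDIF daemon by hand (assumed invocation)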

2.2.2. PF_RING fm10k driver

The fm10k driver is open-source and distributed with PF_RING. To get, compile, and install it do:

git clone https://www.github.com/ntop/PF_RING
cd PF_RING
make
cd drivers/intel/fm10k/fm10k-0.20.1-zc/src/
make
sudo ./load_driver.sh

load_driver.sh loads the fm10k driver, starts the rdifd daemon, and applies a default configuration to the NIC switch, ready to receive nBPF filters.

The output of load_driver.sh also reports the names of the detected fm10k interfaces.

sudo ./load_driver.sh
[...]
Configuring ens9
[...]
Configuring enp3s0
[...]

Take note of these names as you will need them.

2.3. Example

The pfcount and pfsend tools bundled with PF_RING are used to send traffic and test an nBPF hardware filter on the FM10000 RRC.

Following are the steps to build the tools:

cd PF_RING/userland
make
cd examples

To carry out the following test we loop-connect the two interfaces of the NIC, which on our system are ens9 and enp3s0.

Let’s say we want to send traffic at 30 Gbps from interface ens9, with the additional requirements that the packet length is 1500 bytes, the source IP address is 192.168.0.1, and the destination IP address is 192.168.0.2. The pfsend command to execute is the following:

sudo ./pfsend -i zc:ens9 -r 30 -b 1 -l 1500 -S 192.168.0.1 -D 192.168.0.2
[...]
TX rate: [current 2'372'059.48 pps/28.92 Gbps][average 2'370'550.71 pps/28.90 Gbps][total 196'761'818.00 pkts]
TX rate: [current 2'372'061.83 pps/28.92 Gbps][average 2'370'568.70 pps/28.90 Gbps][total 199'133'951.00 pkts]
[...]

This will be our traffic generator. Let’s move on to traffic capture with nBPF hardware filters. To capture without filters, open another console and run:

sudo ./pfcount -i zc:enp3s0
[...]
Actual Stats: [2'445'073 pkts rcvd][1'000.08 ms][2'444'874.96 pps][29.81 Gbps]
Actual Stats: [2'443'444 pkts rcvd][1'000.06 ms][2'443'294.95 pps][29.79 Gbps]

A look at the process highlights that pfcount is consuming 34% of a CPU.

ps aux | grep pfcount
root     17465 34.2  0.0  92248  2988 pts/1    R+   10:45   0:58 ./pfcount -i zc:enp3s0

Let’s now try to add a ‘capture-all’ filter to our pfcount (the expression matches all the generated traffic, since pfsend is sending from 192.168.0.1).

sudo ./pfcount -i zc:enp3s0  -f "src host 192.168.0.1"

A look at the process highlights a slight increase in the CPU load which, however, stays below 40%:

ps aux | grep pfcount
root     18465 38.1  0.0  92248  2984 pts/1    S+   10:50   0:16  ./pfcount -i zc:enp3s0 -f src host 192.168.0.1

Now we can try to add a ‘drop-all’ filter to our pfcount by changing the source IP address (no generated packet has source 192.168.0.2).

sudo ./pfcount -i zc:enp3s0  -f "src host 192.168.0.2"

This time, the process CPU occupancy is less than 1%, confirming that our hardware filters are doing the heavy lifting and leaving the CPU available for other activities.

ps aux | grep pfcount
root     18911  0.5  0.0  92248  3100 pts/1    S+   10:53   0:00  ./pfcount -i zc:enp3s0 -f src host 192.168.0.2

2.4. API

The API of the nBPF module for Intel RRC includes the following functions (a usage sketch follows the list):

  • int nbpf_rdif_reset(int unit)

    The nbpf_rdif_reset function sets the NIC in MON2 mode. In MON2 mode every port of the switch is unlinked and no traffic passes between the ports. Input parameter:

    • “unit” -> Intel NIC card identifier [range from 0 to (MAX_INTEL_DEV - 1)]

    Return value:

    • 0 on failure
    • 1 on success

    Suggestion: call this function just once, during the initialization phase of the NIC card.

  • nbpf_rdif_handle_t *nbpf_rdif_init(char *ifname)

    The nbpf_rdif_init function initializes the switch in order to put the ports in inline mode (port 1 with port 3 for interface 0, port 2 with port 4 for interface 1). Input parameter:

    • “ifname” -> Interface name (for example “eth0”, “ens9”, etc.)

    Return value:

    • NULL on failure
    • handle pointer on success. Please use the handle with the “nbpf_rdif_set_filter” and “nbpf_rdif_destroy” functions.

  • int nbpf_rdif_set_filter(nbpf_rdif_handle_t *handle, char *bpf)

    If possible, the nbpf_rdif_set_filter function transforms the BPF filter into rules for the switch. Not all BPF filters can be set (please read README.md). Input parameters:

    • handle -> data structure that contains the BPF RDIF data. This handle is returned by the nbpf_rdif_init function.
    • bpf -> BPF filter

    Return value:

    • 0 on failure
    • 1 on success

  • void nbpf_rdif_destroy(nbpf_rdif_handle_t *handle)

    The nbpf_rdif_destroy function frees the dynamic memory of the handle and deletes the rules on the switch for an interface (putting it back in inline mode). Input parameter:

    • handle -> data structure that contains the BPF RDIF data. This handle is returned by the nbpf_rdif_init function. Before exiting, the function frees this dynamic memory.
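
Putting the pieces together, here is a minimal usage sketch of the API above. The header name (“nbpf.h”) is an assumption; check the nBPF sources for the actual include and link flags.

/* Minimal sketch of the nbpf_rdif_* workflow documented above.
   The "nbpf.h" include is an assumption; see the nBPF sources. */
#include <stdio.h>
#include <stdlib.h>
#include "nbpf.h"

int main(void) {
  nbpf_rdif_handle_t *handle;

  /* Reset unit 0 (MON2 mode): do this just once, at NIC initialization. */
  if (nbpf_rdif_reset(0) == 0) {
    fprintf(stderr, "nbpf_rdif_reset failed\n");
    return EXIT_FAILURE;
  }

  /* Put the ports of this interface in inline mode. Use one of the
     interface names reported by load_driver.sh. */
  handle = nbpf_rdif_init("ens9");
  if (handle == NULL) {
    fprintf(stderr, "nbpf_rdif_init failed\n");
    return EXIT_FAILURE;
  }

  /* Translate the BPF expression into switch rules, when possible. */
  if (nbpf_rdif_set_filter(handle, "src host 192.168.0.1") == 0)
    fprintf(stderr, "filter not supported by the NIC\n");

  /* ... capture traffic here ... */

  /* Delete the rules, put the interface back in inline mode,
     and free the handle. */
  nbpf_rdif_destroy(handle);
  return EXIT_SUCCESS;
}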

2.5. Testing

You can use some commands to get statistics about the packets matching the rules you have set on the switch. The Silicom PE3100G2DQIR has two interfaces, and two rule groups are used for the BPF filters (group 1 for the first interface and group 2 for the second one).

Commands:

  • rdifctl query_list 1 returns how many rules the BPF filter has set on the switch for group 1 (interface 1)
  • rdifctl query_list 2 returns how many rules the BPF filter has set on the switch for group 2 (interface 2)
  • sudo rdifctl rule_stat 1 1 returns how many packets of group 1 matched rule 1
  • sudo rdifctl rule_stat 1 2 returns how many packets of group 1 matched rule 2
  • etc.

Typically, when a BPF filter is set, you have two rules. For instance, for the “src host 10.0.0.1” BPF filter you have:

rule 1: permit traffic with src ip 10.0.0.1
rule 2: deny all
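
As a quick sanity check, assuming the “src host 10.0.0.1” filter has been set on the first interface (rule group 1), you can inspect the rules and their hit counters like this (a sketch; counter values depend on your traffic):

sudo ./pfcount -i zc:ens9 -f "src host 10.0.0.1" &
rdifctl query_list 1          # should report 2 rules (permit + deny all)
sudo rdifctl rule_stat 1 1    # packets matching rule 1 (permit src 10.0.0.1)
sudo rdifctl rule_stat 1 2    # packets matching rule 2 (deny all)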