Can you believe that moving network processing out of the kernel can actually increase its performance?
It’s still hard to believe, but Robert Graham is making this point. In his own words:
The way network stacks work today is to let the kernel do all the heavy lifting. It starts with kernel drivers for Ethernet cards, which pass packets to the kernel’s TCP/IP stack. Upon reception, the packet must make an arduous climb up the network stack until it finally escapes to user-mode.
Instead of moving everything into the kernel, we can move everything into user-mode. This is done first by rewriting the network driver. Instead of a network driver that hands off packets to the kernel, you change the driver so that it doesn’t. Instead, you map the packet buffers into user-mode space.
In a recent benchmark, Intel demonstrated a system using an 8-core, 2.0-GHz, 1-socket server forwarding packets at a rate of 80 million packets/second. That means receiving the packet, processing it in user-mode (outside the kernel), and retransmitting it. That works out to 200 clock cycles per packet.
For example, I was having a discussion about DNS servers on Twitter. I was throwing around the number of “10 million DNS requests per second”. The other person said that this was impossible, because you’d hit the packets-per-second performance limit of the system. As the Intel benchmark shows, that rate is actually only about 12.5% of the system’s packet limit.
The general concept we are working toward is the difference between the “control plane” and the “data plane”. A hundred years ago, telephones were just copper wires. You needed a switchboard operator to connect your copper wire to your destination’s copper wire. With the digital revolution of the late 1960s and early 1970s, wires became streams of bits, and switchboards became computers. AT&T designed Unix to control how data was transmitted, but not to handle data transmission itself. Thus, operating system kernels are designed to carry the slow rate of control information, not the fast rate of data. That’s why you get a 100-to-1 performance difference between custom drivers and kernel drivers.
For example, back in the year 2000 at DefCon, I brought up the fact that my intrusion detection system (IDS) running on Windows on my tiny 11-inch notebook could handle a full 148,800 packets/second. Back then, kernel drivers caused an interrupt for every packet, and hardware was limited to about 15,000 interrupts per second. People had a hard enough time accepting the 10x performance increase, that a tiny notebook could outperform the fastest big-iron server hardware, and that Windows could outperform their favorite operating system like Solaris or Linux. They couldn’t grasp the idea of simply turning off interrupts, and that if you bypass the operating system, it doesn’t matter whether you are running Windows or Linux. These days, with PF_RING having a similar architecture to the BlackICE drivers, this is more understandable, but back then, it was unbelievable.