Zero-copy BPF update (was: Re: Aggregating many ports into one for tcpdump server.)



On Wed, 5 Dec 2007, Vlad GALU wrote:

I've had several reports of significantly improved packet capture rates at high speeds with it, but it's not yet in the tree because we feel it needs more evaluation and review. I hope to ship some form of zero-copy BPF buffer support in FreeBSD 8, and possibly even MFC it. Any feedback you might have would be most helpful.

Having sent you the patch, I should have let you know that you'll need to:

- Add options BPF_ZEROCOPY to your kernel configuration to enable the
zero-copy buffering mode.

- Make sure the kernel and libpcap are rebuilt following the application of
the patch and dropping in the tarball.

- setenv BPF_ZERO_COPY before running tcpdump or other BPF-based tools to
enable the zero-copy buffer mode.

The patch includes both kernel changes (abstract the buffer model, add a new buffer model) and user space changes (updated libpcap to speak the new model, selected right now with the environment variable). Presumably, if merged, zero-copy BPF buffers would be used by default via libpcap if present in the kernel, but right now this is all for evaluation purposes.
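
As a rough illustration of the user space side, selecting the buffer model from the environment could look something like the sketch below. This is not code from the patch: the enum, the function name, and the idea of passing the variable name as a parameter are all made up for the example. The only thing taken from the description above is that the mere presence of the variable turns the zero-copy mode on.

```c
#include <stdlib.h>

/* Hypothetical names for illustration; none of these come from the patch. */
enum buf_mode {
	BUF_MODE_COPY,		/* traditional read()-based BPF buffering */
	BUF_MODE_ZEROCOPY	/* shared-memory zero-copy buffering */
};

/*
 * Pick the zero-copy model only when the named variable is present in the
 * environment. The value is ignored, so a bare "setenv NAME" is enough.
 */
static enum buf_mode
choose_buf_mode(const char *envname)
{
	return (getenv(envname) != NULL ? BUF_MODE_ZEROCOPY : BUF_MODE_COPY);
}
```

A caller would pass whatever variable name the installed libpcap checks, for example choose_buf_mode("BPF_ZERO_COPY").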

Thanks, Robert! I'll start running a few tests next week; I'm waiting for some hardware to arrive first.

I've put up an updated tarball based on some recent changes here:

http://www.watson.org/~robert/freebsd/20071226-zcopybpf.tgz

The main changes since the last drop are:

- BPF_ZERO_COPY environment variable renamed to BPF_ZEROCOPY to match kernel
option name.

- libpcap support for zero-copy BPF buffers reworked to avoid unconditional
call to select() for each buffer when there's already a pending buffer
available to use; in general, avoid system calls entirely when there's data
already waiting, only use system calls when there isn't a completed buffer
to work on next.

- Comments cleanup and some code cleanup.

- A README to provide a little more guidance on getting it working. :-)
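
The libpcap rework mentioned in the list above (skip the system call when a completed buffer is already waiting) can be sketched roughly as follows. This is a simplified model, not the patch's code: the struct layout, the generation counters, the two-buffer arrangement, and every name here are assumptions for illustration, and fake_wait() merely stands in for a blocking select() on the BPF descriptor.

```c
#include <stddef.h>

/* Hypothetical shared-buffer header; the fields are assumptions. */
struct zbuf {
	volatile unsigned kernel_gen;	/* bumped by the producer on fill */
	volatile unsigned user_gen;	/* set by the consumer on release */
	size_t len;			/* bytes of captured data */
};

/* A buffer is ready if it was filled since the consumer last released it. */
static int
zbuf_ready(const struct zbuf *zb)
{
	return (zb->kernel_gen != zb->user_gen);
}

/* Hand the buffer back to the producer. */
static void
zbuf_release(struct zbuf *zb)
{
	zb->user_gen = zb->kernel_gen;
}

/*
 * Return the next completed buffer of the pair. The wait function (the
 * stand-in for select()) is called only when neither buffer is ready.
 */
static struct zbuf *
next_buffer(struct zbuf bufs[2], int *next, int (*wait_fn)(void *), void *arg)
{
	for (;;) {
		for (int i = 0; i < 2; i++) {
			int idx = (*next + i) % 2;

			if (zbuf_ready(&bufs[idx])) {
				*next = (idx + 1) % 2;
				return (&bufs[idx]);
			}
		}
		if (wait_fn(arg) < 0)
			return (NULL);
	}
}

/* Demo stand-in for the kernel side: counts waits and fills buffer 0. */
struct fake_kernel {
	struct zbuf *bufs;
	int waits;
};

static int
fake_wait(void *arg)
{
	struct fake_kernel *k = arg;

	k->waits++;
	k->bufs[0].kernel_gen++;
	k->bufs[0].len = 64;
	return (0);
}
```

With a buffer already filled, next_buffer() returns it without calling the wait function at all; the wait is reached only after both buffers have been released and nothing new has arrived.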

You will need to "make clean ; make ; make install" in the modified libpcap again, as the size of pcap_t has changed. In principle "make ; make install" should DTRT, but it appears not to for me.

Robert N M Watson
Computer Laboratory
University of Cambridge