Re: mbuf changes



On 9/27/10 6:14 AM, Andre Oppermann wrote:
On 27.09.2010 15:18, Luigi Rizzo wrote:
On Mon, Sep 27, 2010 at 02:55:45PM +0200, Andre Oppermann wrote:
...
my idea was to have an extra field in the mbuf to tell how much room
should be reserved/used for metadata (such as mtags) after
the payload area so you don't need to change the allocator, and
possibly can even modify this on an existing mbuf.
Almost always mbufs have spare room (e.g. incoming pkts have all
data in the cluster and mostly empty mdata; outgoing, except
for rare cases, tend to be in a similar situation).
So this approach would allow taking an already allocated
mbuf and putting the mtag in the spare area after the data.

For incoming data this approach could work, as usually 2K mbuf clusters
are used and they have trailing space available, or rather the normal
mbuf referencing the cluster leaves its own data section unused.

If trailing space is to be used, M_TRAILINGSPACE() needs modifications
and a full tree audit is required to make sure that all mbuf consumers are
using it correctly and not some version of their own that directly assumes
certain mbuf sizes, etc. A lot of work.
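
To make the trailing-space concern concrete, here is a minimal userland
sketch, not kernel code. struct toy_buf, its meta_rsvd field and both
helper functions are hypothetical names invented for illustration; they
only model the idea of an extra field that reserves room for metadata
after the payload, and why a trailing-space calculation that ignores
that field would hand out too much room.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a buffer; this is NOT the kernel's struct mbuf. */
struct toy_buf {
	char	*buf;		/* start of the storage area */
	size_t	 size;		/* total bytes of storage */
	char	*data;		/* start of valid payload inside buf */
	size_t	 len;		/* bytes of valid payload */
	size_t	 meta_rsvd;	/* hypothetical: tail bytes reserved for metadata */
};

/*
 * Bytes that may still be appended to the payload without running into
 * the reserved metadata area at the end of the storage.
 */
static size_t
toy_trailingspace(const struct toy_buf *b)
{
	const char *end_of_data = b->data + b->len;
	const char *end_of_buf = b->buf + b->size;
	size_t free_tail;

	assert(end_of_data <= end_of_buf);
	free_tail = (size_t)(end_of_buf - end_of_data);
	return (free_tail > b->meta_rsvd ? free_tail - b->meta_rsvd : 0);
}

/*
 * Try to reserve n more tail bytes for metadata on an existing buffer.
 * Returns 1 on success, 0 if the payload leaves too little room.
 */
static int
toy_reserve_meta(struct toy_buf *b, size_t n)
{
	if (toy_trailingspace(b) < n)
		return (0);
	b->meta_rsvd += n;
	return (1);
}
```

With a 2048-byte buffer holding 1500 bytes of payload, the sketch reports
548 free bytes, and 484 after 64 bytes are reserved for metadata. Any
consumer that computes the free tail on its own, without going through
the one helper, would still see 548 and could overwrite the reserved area.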

For locally generated mbufs and socket buffers we try to use the mbufs to
their maximal extent. When the socket buffer data is packetized it is
normally referenced, so we again get the normal mbuf with its data portion
unused. So that could work.

A complication is the m_tag_free() field and function which puts the memory
deallocation into the hands of the mtag user. That means all mtag consumers
have to be made aware of provided storage, so that they do not return the
memory directly to the memory allocator (malloc/UMA).

So the only way I realistically see is to make use of the mbuf's unused
data portion when it has external storage attached. This should probably
cover about 98% of all cases. The rest have to malloc() the mtag storage
as usual.
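
A rough userland sketch of that fallback scheme follows. Everything in it
(struct toy_mbuf, struct toy_tag, TOY_F_EXT, TOY_INLINE_SIZE and the
functions) is a hypothetical stand-in, not the real mbuf or m_tag API. It
only illustrates carving tag storage out of the unused inline data area
when the payload lives in external storage, falling back to malloc()
otherwise, and using a per-tag free hook so that inline storage is never
handed back to the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_INLINE_SIZE	200	/* illustrative size, not the real MLEN */
#define TOY_F_EXT	0x1	/* payload lives in external storage */

struct toy_tag {
	void	(*tag_free)(struct toy_tag *);	/* per-tag free hook */
	size_t	  len;				/* payload bytes after header */
};

struct toy_mbuf {
	int	flags;
	size_t	inline_used;	/* inline bytes already given to tags */
	_Alignas(max_align_t) char inline_area[TOY_INLINE_SIZE];
};

/* Inline tags live inside the mbuf: nothing to give back to malloc. */
static void
toy_tag_free_inline(struct toy_tag *t)
{
	(void)t;
}

/* Heap tags were obtained from malloc() and go back to it. */
static void
toy_tag_free_heap(struct toy_tag *t)
{
	free(t);
}

static struct toy_tag *
toy_tag_alloc(struct toy_mbuf *m, size_t payload)
{
	size_t align = _Alignof(max_align_t);
	size_t need = sizeof(struct toy_tag) + payload;
	struct toy_tag *t;

	/* Round up so the next inline tag stays suitably aligned. */
	need = (need + align - 1) / align * align;

	if ((m->flags & TOY_F_EXT) != 0 &&
	    need <= TOY_INLINE_SIZE - m->inline_used) {
		/* Common case: the inline data area is otherwise unused. */
		t = (struct toy_tag *)(void *)(m->inline_area + m->inline_used);
		m->inline_used += need;
		t->tag_free = toy_tag_free_inline;
	} else {
		/* Remaining cases: allocate the tag storage as usual. */
		t = malloc(need);
		if (t == NULL)
			return (NULL);
		t->tag_free = toy_tag_free_heap;
	}
	t->len = payload;
	return (t);
}
```

Callers release a tag only through t->tag_free(t), which is the point of
the complication above: a consumer that calls free() on a tag directly
would corrupt the heap whenever the tag happens to live inside the mbuf.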

so it wouldn't be bad -- i cannot judge the numbers, but definitely
it would work for all incoming traffic, plus all tcp data packets
(as the payload is in the cluster), plus all pure acks (which are small),
plus all UDP above some 200 bytes...

Yes, about that.

I could whip up a prototype for review in the next weeks.

I seem to remember that jeffr had already something done in Perforce.

That's a more general overhaul of the way mbufs are structured and
allocated with UMA. I'm not sure it provides for the mtag issue. Will
check though.

I'd like to see if we can go over his stuff and any other suggested changes
before 9.0 and see if we can agree on a change for 9.0.

Jeff, we discussed this a year ago. Do you still have your suggested changes?



_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"


