Re: sort(1) memory usage



Jeremy Chadwick <koitsu@xxxxxxxxxxx> writes:
As you said: the code shows that when no files are specified (e.g. read
off a pipe), sort will make some assumptions regarding the initial
buffer size to read data into. The buffer size allocated in that case
is fairly large, rather than basing it off of the first line off stdin;
it looks like this is done to save CPU time in the long run (otherwise
you'd have to rellocate more later and take a hit; initbuf() is
responsible for that).

Oh give me a break, the self-starting exponential algorithm for growth
of dynamically allocated buffers has been known for decades. In case
any GNU sort developers are reading this, here it comes, free of charge:

static char *buf = NULL;
static size_t bufsz = 0;
static size_t buflen = 0;

int
buf_append(const char *str)
{
size_t len;

len = strlen(str);
if (buflen + len + 1 > bufsz) {
size_t nbufsz = bufsz;
char *nbuf;

while (buflen + len + 1 > nbufsz)
nbufsz = nbufsz * 2 + 1;
nbuf = realloc(buf, nbufsz);
if (nbuf == NULL)
return (-1);
buf = nbuf;
bufsz = nbufsz;
}
memcpy(buf + buflen, str, len);
buf[buflen + len] = '\0';
buflen += len;
return (0);
}

With a good allocator - and depending to some extent on the memory usage
pattern of the rest of your program - if you jump-start it by initally
allocating 16 kB or so (and setting bufsz accordingly), realloc() will
never need to copy anything - but even in the worst case, the amortized
cost will be O(2n), IIRC. This is practically unnoticeable next to the
cost of the sorting algorithm itself, which will be O(n log n) at best
and O(n*n) at worst.

Looking at the code, it seems to go to extreme lengths to get it
absolutely wrong. For instance, if hw.physmem / 8 > hw.usermem, it will
pick the former, which means it's pretty much guaranteed to either fail
or hose your system (or both).
Can you expand on this? Looking at the code, it doesn't appear that's
possible. The code in question is default_sort_size(), which is used
when no -S or --buffer-size argument is specified.

I looked at how it computes the cap, which is MAX(total / 8, avail) - in
other words, never mind what's actually feasible, I want more! More!
More, I say!

Count this as a vote for ditching GNU sort in favor of a BSD-licensed
implementation (from {Net,Open}BSD for instance).
In this specific case, I think you're bashing GNU just because you feel
like it. Come on man... =/

No, I'm bashing GNU because it's bloated crap, as this example clearly
shows. It wouldn't be the first time a BSD rewrite of a GNU tool ended
up performing better; see for instance bsdtar. Besides, the FreeBSD
project has a tradition of replacing GNU tools with BSD-licensed
equivalents as long as no functionality is lost.

DES
--
Dag-Erling Smørgrav - des@xxxxxx
_______________________________________________
freebsd-hackers@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: Scrolling through large view area.
    ... the spike was at the allocator. ... buffer has to be", so they called the allocator to allocate a buffer of the correct ... In a third case, I had put a "bubble sort" into my program, knowing I could replace it ... Ten minutes later, it was still sorting. ...
    (microsoft.public.vc.mfc)
  • Re: How to correctly buffer (FIFO) a number of samples to create a time-shifting filter ?
    ... If you do not have your own allocator you are at the mercy of downstream ... It seams that this will happen anyway also with transinplace filters ... real copy the samples and manage my own buffer:(... ...
    (microsoft.public.win32.programmer.directx.video)
  • Re: Custom filter gets "These filters cannot agree on a connection
    ... I have a very different kind of problem with my parser filter. ... Somhow when I am trying to set the Allocator for IAsyncReader to have above ... multiple of 512 for above mentioned size with buffer alignment value 512. ... Well documentation says output pin does't really have to agree on allocator ...
    (microsoft.public.win32.programmer.directx.video)
  • Re: Benchmark: STLs list vs. hand-coded one
    ... And having 64MB buffer allocated you will need to write your own new, ... or malloc() implementations to manage memory in the buffer and do ... unlikely that a custom memory allocator can be any faster than a generic ...
    (comp.arch.embedded)
  • Re: [RFC v13][PATCH 05/14] x86 support for checkpoint/restart
    ... Using the slab allocator would: ... allow an optimization that will reduce application downtime during checkpoint. ... all the pieces together as a long buffer. ... I do insist that the wrappers remain. ...
    (Linux-Kernel)

Loading