Re: sort(1) memory usage



On Sun, Feb 03, 2008 at 04:31:34PM +0100, Dag-Erling Smørgrav wrote:
Dag-Erling Smørgrav <des@xxxxxx> writes:
Erik Trulsson <ertr1013@xxxxxxxxxxxxx> writes:
Yep, it seems that GNU sort allocates a quite large buffer by default when
the size of the input is unknown (such as when it reads input from stdin.)
A quick check in the source code indicates that it tries to size this buffer
according to how much memory the system has (and according to any limits set
on how much memory the process is allowed to use.)

Uh, OK. This scaling doesn't seem to work correctly. It seems to
allocate 27 MB on 32-bit machines and 54 MB on 64-bit machines,
regardless of memory size.

I said it *tries* to size the buffer according to the amount of memory
available. I didn't say it succeeded in doing so, or that it even made a
good attempt at it.

Those 27MB/54MB figures are probably the result of it hitting some kind of limit.
On a machine with only 64MB RAM, sort(1) "only" allocated 21MB of address space.

I suspect the scaling algorithm was designed for older machines which
rarely, if ever, had more than maybe 64MB RAM (and usually less than that),
and that little thought was given to multi-gigabyte machines like those
common today.



Looking at the code, it seems to go to extreme lengths to get it
absolutely wrong. For instance, if hw.physmem / 8 > hw.usermem, it will
pick the former, which means it's pretty much guaranteed to either fail
or hose your system (or both).
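
To make the rule described above concrete, here is a minimal Python sketch.
It is not taken from the GNU sort source: the function name pick_buffer_size
and the example figures are hypothetical, and it models only the single
comparison mentioned here (hw.physmem / 8 versus hw.usermem), ignoring any
resource limits that might be applied afterwards.

```python
MIB = 1024 * 1024


def pick_buffer_size(physmem: int, usermem: int) -> int:
    """Return the larger of usermem and physmem / 8 (all values in bytes).

    Hypothetical model of the selection rule described in the thread,
    not the actual GNU sort implementation.
    """
    return max(usermem, physmem // 8)


if __name__ == "__main__":
    # Made-up machine: 1024 MiB of physical memory, of which only
    # 64 MiB is currently reported as hw.usermem.
    physmem = 1024 * MIB
    usermem = 64 * MIB
    chosen = pick_buffer_size(physmem, usermem)
    print(f"usermem: {usermem // MIB} MiB, chosen buffer: {chosen // MIB} MiB")
```

With these made-up numbers, physmem / 8 is 128 MiB, so the rule asks for
twice the 64 MiB reported as hw.usermem, which is the failure mode
complained about above.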

In the immortal words of Blazing Star: YOU FAIL IT

Count this as a vote for ditching GNU sort in favor of a BSD-licensed
implementation (from {Net,Open}BSD for instance).


If any such implementation were a true drop-in replacement for GNU sort
(supporting all the same options etc.) and did not have noticeably worse
performance, then I certainly would not raise any objections to that.



--
<Insert your favourite quote here.>
Erik Trulsson
ertr1013@xxxxxxxxxxxxx
_______________________________________________
freebsd-hackers@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@xxxxxxxxxxx"


