Re: ten thousand small processes

From: Jeff Roberson (jroberson_at_chesapeake.net)
Date: 06/25/03

  • Next message: Terry Lambert: "Re: ten thousand small processes"
    Date: Wed, 25 Jun 2003 03:06:54 -0400 (EDT)
    To: "D. J. Bernstein" <djb@cr.yp.to>
    
    

    On 25 Jun 2003, D. J. Bernstein wrote:

    > As I said, I don't particularly care about the text segment. I'm not
    > talking about ten thousand separate programs.
    >
    > Why does the memory manager keep the stack separate from data? Suppose a
    > program has 1000 bytes of data+bss. You could organize VM as follows:
    >
    > 0x7fffac18 0x7fffb000 0x80000000
    > <---- stack data+bss text, say 5 pages heap ---->

    This layout is chosen by some 64-bit architectures, Alpha for example.
    The difference is that on Alpha you have a LOT of address space, so there
    are many options for placing shared libraries. On x86, if you place them
    roughly in the middle, you take space away from heap and stack equally.

    Furthermore, text is typically linked to run relative to address 0. This
    isn't up to the operating system. This is up to the tool chain and object
    format. In some cases it is up to the ABI. The other problem with this
    arrangement is that it restricts the heap size. On FreeBSD this would
    leave you with 1GB of heap and nearly 2GB of stack. Perhaps you use your
    stack differently than I do but that does not sound so appealing.

    >
    > As long as the stack doesn't chew up more than 3096 bytes and the heap
    > isn't used, there's just one page per process.

    Except that the operating system needs a stack too. That's several pages.
    And the uarea adds another page. And the proc structure, and the vm
    space, and the file desc table, and the thread structures now that FreeBSD
    is multithreaded. That's probably another 20 KB or so on x86. The minor
    savings in user space are far outweighed by the kernel usage. Amdahl
    would have something to say about that.

    Furthermore, the VM treats stack pages and data pages differently. It
    also treats bss pages differently. Sure you could fit them all in if you
    wrote special case code to handle this situation, but how often does it
    really occur? I'm guessing just about never for almost all applications
    that FreeBSD is used for. This is a general purpose operating system that
    needs to work for normal cases.

    > As for page tables: Instead of allocating space for a bunch of nearly
    > identical page tables, why not overlap page tables, with the changes
    > copied on a process switch?

    They aren't nearly identical. They point at different pages. You can't
    overlap them unless you have 4MB of aligned mapped pages that are
    identical across two processes as is the case with large shared memory
    segments. Again, I think you would do well to read up on MMUs and paging
    hardware.

    If I gave two processes the same page directory and page tables they would
    overwrite each other's memory!
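
    To put numbers on the 4MB figure, here is a minimal sketch, assuming
    classic 32-bit x86 paging without PAE (4 KB pages, 1024 entries per page
    table). The macro and function names are made up for illustration; they
    are not FreeBSD's.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u     /* 4 KB pages (assumed) */
#define PT_ENTRIES 1024u   /* entries per page table (assumed, non-PAE x86) */
#define PT_SPAN    ((uint32_t)PT_ENTRIES << PAGE_SHIFT) /* bytes mapped by one page table */

static_assert(PT_SPAN == 4u * 1024u * 1024u, "one page table maps 4 MB");

/* Which page-directory entry (and therefore which page table) covers va. */
uint32_t pde_index(uint32_t va)
{
    return va >> 22;
}

/* Which entry inside that page table covers va. */
uint32_t pte_index(uint32_t va)
{
    return (va >> PAGE_SHIFT) & (PT_ENTRIES - 1u);
}

/*
 * A mapping can reuse a whole page table only if it starts on a
 * PT_SPAN boundary and covers at least one full span. This checks the
 * alignment condition only; the pages behind it must also be identical
 * in both processes.
 */
int covers_whole_page_table(uint32_t start, uint32_t len)
{
    return (start % PT_SPAN) == 0 && len >= PT_SPAN;
}
```

    Under these assumptions, a 1000-byte data+bss region never comes close to
    filling a page table on its own, so there is nothing to overlap.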

    > As for 39 pages of VM, mostly stack: Can the system actually allocate
    > 390000 pages of VM? I'm only mildly concerned with the memory-management

    There is no special allocation for virtual address space that is
    contiguous with another region. It is simply the upper bound on an
    address. The system can allocate more vm than the system has swap and
    physical memory. The system can allocate more vm than available disk
    space if you ask for the right thing in the right number of processes.
    390000 is only 1.5 gigs. You could allocate that many pages in one
    process on x86.
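
    The arithmetic behind that figure, as a small sketch assuming 4 KB pages
    (the helper names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of virtual address space covered by `pages` pages of `page_size` bytes. */
uint64_t vm_bytes(uint64_t pages, uint64_t page_size)
{
    assert(page_size != 0);
    return pages * page_size;
}

/* The same quantity in whole mebibytes, rounded down. */
uint64_t vm_mib(uint64_t pages, uint64_t page_size)
{
    return vm_bytes(pages, page_size) / (1024u * 1024u);
}
```

    With 390000 pages of 4096 bytes, vm_bytes() gives 1,597,440,000 bytes and
    vm_mib() gives 1523, which is just under 1.5 GiB.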

    > time; what bothers me is the loss of valuable address space. I hope that
    > this 128-kilobyte stack carelessness doesn't reflect a general policy of
    > dishonest VM allocation (``overcommitment''); I need to be able to
    > preallocate memory with proper error detection, so that I can guarantee
    > the success of subsequent operations.

    You need to look at the situation realistically. FreeBSD is not being
    developed for your mythical one page process. It's developed for real
    applications that use up stack space. That limit is set so that in the
    common case we don't have to do an expensive operation to grow the stack's
    map. Make the common case fast, right? I don't appreciate your tone
    here, especially coming from someone who obviously is not familiar with
    VMs.

    >
    > As for malloc()'s careless use of memory: Is it really asking so much
    > that a single malloc(1) not be expanded by a factor of 16384?

    Yes, when in the common case that extra allocation will be used later.
    The size of the allocation from the back end dramatically impacts the
    performance of malloc and the vm system. It also affects fragmentation.

    > Here's a really easy way to improve malloc(). Apparently, right now,
    > there's no use of the space between the initial brk and the next page
    > boundary. Okay: allocate that space in the simplest possible way---

    This is fairly extreme hackery to save a half page of memory on average
    and take a branch mispredict the rest of the time.

    [code removed]
    >
    > ---with no waste of space and practically no waste of time. Maybe add

    Except for the most important time: the developers'. This is an absurd
    suggestion.

    > 8192 to wherewenormallystart; this is lots of room for people who know
    > how to write small programs, and the cost is unnoticeable for people who
    > don't.

    People who know how to write really small programs would know not to use
    the standard libc or at least not the standard malloc implementation. It
    is designed for average programs for real systems.

    > (Quite a few of my programs simulate this effect by checking for space
    > in a bss array, typically 2K. But setting aside the right amount of
    > space would mean compiling, inspecting the brk alignment, and
    > recompiling. I also feel bad chewing up space on systems where malloc()
    > actually knows what it's doing.)

    I'm sure your programs are very small. Our userland malloc is actually
    quite good. We have phk to thank for that. I'm sure he'd love to hear
    your critiques and suggestions.
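
    For readers unfamiliar with the bss-array trick described in the quoted
    paragraph, a rough sketch follows. It is not the poster's code (that was
    removed above); the names small_alloc, small_owns and small_free are
    hypothetical, the 2K pool size comes from the quoted text, and C11 is
    assumed for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SMALL_POOL_SIZE 2048u   /* "typically 2K" in the quoted text */

/* Zero-initialized static storage, so it lives in bss; the union forces alignment. */
static union {
    unsigned char bytes[SMALL_POOL_SIZE];
    max_align_t   align;
} small_pool;
static size_t small_used;       /* bytes already handed out from the pool */

/* Serve small requests from the static pool; fall back to malloc() when it is full. */
void *small_alloc(size_t n)
{
    const size_t a = _Alignof(max_align_t);

    assert(small_used <= SMALL_POOL_SIZE);
    if (n == 0)
        n = 1;
    if (n <= SMALL_POOL_SIZE) {
        size_t rounded = (n + a - 1) / a * a;   /* keep later blocks aligned */
        if (rounded <= SMALL_POOL_SIZE - small_used) {
            void *p = small_pool.bytes + small_used;
            small_used += rounded;
            return p;
        }
    }
    return malloc(n);
}

/* True if p points into the static pool rather than the malloc() heap. */
int small_owns(const void *p)
{
    uintptr_t v = (uintptr_t)p;
    uintptr_t base = (uintptr_t)small_pool.bytes;
    return v >= base && v < base + SMALL_POOL_SIZE;
}

/* Pool blocks are never reclaimed in this sketch; heap blocks go back to free(). */
void small_free(void *p)
{
    if (!small_owns(p))
        free(p);
}
```

    This keeps tiny programs from touching the heap at all, at the cost of
    a fixed 2K of bss in every program that links it, which is the trade-off
    the quoted paragraph complains about.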

    > As for the safety of writing code that makes malloc() fail horribly:
    > After the Solaris treatment of BSD sockets, and the ``look, Ma, I can
    > make an only-slightly-broken imitation of poll() using select()!''
    > epidemic, I don't trust OS distributors to reserve syscall names for
    > actual syscalls. I encounter more than enough portability problems
    > without going out of my way to look for them.

    The man pages specifically warn against using brk and sbrk yourself if
    you're going to use malloc() and free(). You get what you deserve if you
    do that.

    > ---D. J. Bernstein, Associate Professor, Department of Mathematics,
    > Statistics, and Computer Science, University of Illinois at Chicago
    >

    As I said before, it sounds like your application is better suited for
    DOS. I'm sure you'll find that you have much more control over the
    address layout of your system.

    Cheers,
    Jeff

    _______________________________________________
    freebsd-performance@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-performance
    To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org"

