Re: malloc problems with MySQL?



Paul Allen wrote:
From Jason Evans <jasone@xxxxxxxxxxx>, Wed, Apr 26, 2006 at 07:29:01AM -0700:

Allocations that are larger than the chunk size (2 MB by default) are allocated using mmap(2), rather than sbrk(2). Most likely, your problems will go away if you reduce the heap size, so that mmap has more address space to work with.

Which raises the question of what this code is checking for...
1167 #ifdef USE_BRK
1168 else if ((uintptr_t)ret >= (uintptr_t)brk_base
1169 && (uintptr_t)ret < (uintptr_t)brk_max) {
1170 /*
1171 * We succeeded in mapping memory, but at a location that could
1172 * be confused with brk. Leave the mapping intact so that this
1173 * won't ever happen again, then try again.
1174 */
1175 assert(addr == NULL);
1176 goto AGAIN;
1177 }
1178 #endif

This code was based on my misunderstanding of MAXDSIZ. I thought it was an upper bound on how large the data segment could be, but it's actually only the default value. I will remove this code, and modify the semantics of brk_max to deal with a variable data segment size limit.

1212 /*
1213 * Check for address ranges that were previously chunks and try
1214 * to use them.
1215 */
1216 1217 tchunk = RB_MIN(chunk_tree_s, &old_chunks);
1218 while (tchunk != NULL) {
1219 /* Found an address range. Try to recycle it. */
1220 1221 chunk = tchunk->chunk;
1222 delchunk = tchunk;
1223 tchunk = RB_NEXT(chunk_tree_s, &old_chunks, delchunk);
1224 1225 /* Remove delchunk from the tree. */
1226 RB_REMOVE(chunk_tree_s, &old_chunks, delchunk);
1227 base_chunk_node_dealloc(delchunk);
1228 1229 #ifdef USE_BRK
1230 if ((uintptr_t)chunk >= (uintptr_t)brk_base
1231 && (uintptr_t)chunk < (uintptr_t)brk_max) {
1232 /* Re-use a previously freed brk chunk. */
1233 ret = chunk;
1234 goto RETURN;
1235 }
1236 #endif
1237 if ((ret = pages_map(chunk, size)) != NULL) {
1238 /* Success. */
1239 goto RETURN;
1240 }
1241 }
1242 There is something rather scary going on in this part of the code.
Perhaps it just needs better commenting... You are scanning the tree
headed by old_chunks looking for an address range. If this is in the
mmap region you attempt to mmap it again... to "reuse address space"

To which I have to say: huh? In chunk_dealloc you explicitly call munmap on the address. Therefore, what is the point of intentionally "reusing" it by insisting the kernel return an address you had before. Perhaps
this would make sense if you had some perchunk structures allocated that
you wanted to reuse but if this is the case then I question whether you
really mean to just RB_REMOVE the chunk. Surely some other cleanup would
be appropriate...

This code behaves as intended. Note that in order to get chunk-aligned memory via mmap(), it is necessary to over-allocate, then trim. By keeping a cache of unmapped chunks, we can avoid the extra system calls most of the time, and we are also able to reduce memory fragmentation in many cases. (Over-allocation and trimming fails to utilize chunk-sized holes in the memory map.)

For that matter the brk allocation code should be changed to permit an
allocation greater than chunk_size. i.e., by assigning incr from size
and pulling it out of the if block. If you do this, of course you can
only be strictly better off than you are now because a program that
mainly allocs without freeing will get the benefit of the larger address
space. Whereas given that you allocate chunk_size from brk now,
when you go to reuse brk space you can decide to only do so to fullfil
chunk_size requests without losing anything relative to the current implementation.

There is one issue with this change that you don't mention: the current implementation never returns sbrk()ed memory. I did this in order to avoid races in multi-threaded programs that use brk() or sbrk(). However, in hindsight, such obscure (not to mention poorly designed) programs aren't worth supporting at the expense of all other programs. As such, I don't have a problem with making this change (along with a change that attempts to shrink the data segment during chunk deallocation).

That you do not permit kegs larger than chunk_size though suggests a
defect in your implementation. one solution to this would be to adjust
chunk_size before giving up on the allocation...

I don't understand what you are saying here. Perhaps you were being confused by the chunk_size variable masking in huge_malloc() that you mention below.

Also it is quite confusing that you have a global variable "chunk_size"
that you mask with a local variable in huge_malloc.

Indeed, it is. I'll fix this.

Thanks,
Jason
_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: Translating VM address to physical addresses in a non-root app
    ... pages you might get away with a contiguous allocation. ... Locking pages in memory is not an issue. ... > to mmap create sequential VM addresses in user space. ...
    (comp.os.linux.development.system)
  • Re: sbrk(2) broken
    ... Until he does, the datasize limit is a joke anyway, as anyone can circumvent it by either using mmap() instead of mallocor setting _malloc_options before calling malloc. ... When I e-mailed Jason Evans and Alan Cox about it, I suggested that we actually teach mallocto enforce an allocation limit itself by querying a limit once at process startup, and then using its own accounting to decide when to start failing requests. ... As an alternative model that would require some more infrastructural changes, I suggested a new mmapflag that hinted to the kernel that the page should count against a swap/anonymous memory limit, but that we should avoid more serious changes at the last minute before a release. ... The right answer is presumably to introduce a new LIMIT_SWAP, which limits the allocation of anonymous memory by processes, and size it to something like 90% of swap space by default. ...
    (freebsd-current)
  • Re: Translating VM address to physical addresses in a non-root app
    ... > driver do the allocation of memory. ... If multiple apps mmap simultaneously the memory ...
    (comp.os.linux.development.system)
  • Re: Whats the substitution of brk system call in POSIX standard?
    ... brk() depends on a process memory layout that's particular to traditional Unix implementations, but may not be used in other environments and would be difficult to emulate. ... Programs that need to call these directly are extremely rare, and it's hard to imagine them being portable to OSes with different memory layouts. ... malloc() is a library routine and is originally implemented with brksystem call. ... Now that brkis removed, if its substitution is mmap(), then, all memory allocations have to have something to do with virtual memory since this is natural implementation of mmap. ...
    (comp.unix.programmer)
  • Re: Whats the substitution of brk system call in POSIX standard?
    ... |>> I read from the brk manual page that brkand sbrk"are deliberately ... |> malloc() and mmap(), I guess. ... |> imagine them being portable to OSes with different memory layouts. ...
    (comp.unix.programmer)