Re: C++ Garbage Collector on VMS?




"Bill Todd" <billtodd@xxxxxxxxxxxxx> wrote in message
news:Y42dnV5sy5_-iq7bnZ2dnUVZ_g-dnZ2d@xxxxxxxxxxxxxxxxxxxxxxxxxxx
Michael D. Ober wrote:

...

The GC runs whenever a memory allocation fails for insufficient memory.

That is often a sub-optimal strategy: a flexible GC can instead run in
the background proactively and incrementally doing its job - that way,
while latency may still not be completely deterministic (e.g., if the
background scan falls behind and an allocation must wait for the GC to
run), allocations are *usually* unaffected by garbage collection and
almost *never* have to wait for a *full* GC scan.


My example was that of a basic GC. Modern GCs are far more sophisticated
and actually do what you describe.

When it runs, it identifies all objects in memory that can be accessed
directly or indirectly from the stacks and global data spaces (or in the
case of a reference counting system, have a non-zero reference count) as
well as all the pointers to those objects. The GC then moves the objects
to the beginning of the heap, updating the pointers to the objects as it
does so.

Ah - so you are familiar with reference-counting after all: your earlier
statements to the effect that "a GC is the only way to go since it
determines when the memory is no longer acessible by the program"
suggested otherwise.

Use of reference counting to deallocate (e.g., in its destructor) an
object when no one retains any interest in it is a perfectly good
alternative to the kind of GC that you describe, at least in solving the
specific problem of determining when an object is no longer accessible by
other parts of the program that you claimed in your quoted statement above
general-purpose GC was *required* to solve. It does depend upon proper
maintenance of the reference counts (see also Arne's example elsewhere of
an otherwise-isolated reference loop), and doesn't defragment the heap
(segregating allocation areas by type/size can help a lot there, though -
and can significantly speed up allocation and even
construction/destruction in the process), but then it also doesn't have
anything like the complexity of more powerful GC, either (which itself
requires at least the discipline on the part of the programmer not to
perform explicit allocations outside the scope of the GC that interact in
any way with GC-controlled memory).


This is why early GCs were reference counted based. However, reference
counting, even on hardware designed to support it, can take up to half the
CPU time a program requires to run. For instance, one of the early
Smalltalk GCs that was based on refernce counting was measured to actually
take up to half the CPU cycles. Also, since reference counting systems
usually don't compact free heap space, you can still run out of memory
quicker than a compacting GC.

As a confirmed programming Neanderthal, I tend to use C as a better
assembler and C++ as an even better one: one can write object-oriented
code in all three, after all (see the Linux kernel for copious examples).
If I were going to use a GC I think I'd want it to incorporate use of
scannable indirect descriptors for *every* allocation, such that it could
be absolutely sure of what it was doing (rather than attempting to infer
what *might* be a pointer from examination of the stack, for example - a
process that I'd expect might be even more prone to memory leaks than a
good programmer is).

- bill

Your observation of ensuring the descriptors are scannible is the basis for
modern GCs. You are absolutely correct in that you don't want to have to
guess "is this value an integer or is it a pointer to the heap?"

What I described was a very simplified description of a
mark/sweek/compaction GC. Modern GCs don't wait until a allocation fails,
although they still trigger on this event. Instead, they use idle time to
compress memory as well as track the rate of new memory allocation to
predict when a GC will be required. The other major enhancement is the use
of a generational heap. Most allocated memory is used for a very short
time, which translates into the memory no longer being accesible by the
program by the time the GC runs again. In a generational GC, objects are
moved from the initial generation (Gen(0)) to the next generation (Gen(1))
during the compaction pass. At the end of Gen(0) compaction, the Gen(0)
heap is completely empty. Gen(1) only gets collected when it runs out of
space and then it goes to Gen(2) and so on. The only time a full compaction
is required is when the last generation is full.

Mike.


.



Relevant Pages

  • Re: GC performance - GC fragility
    ... I take care ). ... memory in an endless loop doesn't eat all the systems memory. ... reference an object that has been collected in the same way that you can ... otherwise allocation a single object would result in 100 ...
    (borland.public.delphi.non-technical)
  • Re: Whats the difference between the heap and the freestore?
    ... >> In general, when discussing dynamically allocated memory, I hear people ... As of a couple of days ago, I though the heap ... > When you use the term heap in the context of memory allocation it annoys ... > without saying that they have to be allocated on a stack or anything else. ...
    (comp.lang.cpp)
  • Memory problems - WinDbg and SOS: Who recognizes this pattern?
    ... Last night I managed to get a memory dump using ADPlus and I analyzed it ... ephemeral segment allocation context: none ... Large object heap starts at 0x0a0d0030 ...
    (microsoft.public.dotnet.framework.performance)
  • Re: Tip Required on Dynamic Memory Allocation in Server Applications ...
    ... a memory allocation failure after 300 usages of the string ... Suggestions where given to pre-reserve the size of the heap, ... My program works untill 200-300 requests are ...
    (microsoft.public.win32.programmer.kernel)
  • Re: to dispose or not ?
    ... always have the system memory roots available. ... collector for the Gen0 and Gen1 heaps and a mark/sweep/compact collector for ... heap or compacts free space out of the heap. ... the reference from the variable to the object in the heap. ...
    (microsoft.public.dotnet.languages.vb)