I have collected some arguments (not my own) from the literature and from discussions - both on the net and in private. I am not naive enough to believe that this religious war will end soon, but this collection may at least give some common ground on which to base discussions.
Also, it may prevent useless questions, flames etc. being sent to me in the future.
For those of you who know Smalltalk/GC, it may not give you any new information - just skip & forget about it.
Some of the stuff below may sound a bit drastic, cynical or whatever - think of a big smiley :-) after every such statement ...
Except in the rare cases when those objects (not in the OO-sense) are used and needed for the whole running time of the program, sooner or later the storage used by them must be freed and given back to some management facility - otherwise your program would continue to grow, eating up tons of unused storage space.
Although this is a bit of a simplification, programming systems can be roughly divided into two classes:
- those that require the programmer to explicitly request the freeing of an unused storage area (i.e. by calling a "dispose", "free" or "destroy" operation);
- those that find unused storage automatically and return it to some free-memory management facility. This is called "garbage collection".
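The explicit style can be sketched with a toy storage manager (a hypothetical Python model with invented names, not real memory management): every allocation the program makes must eventually be matched by a free call written by the programmer.

```python
class Arena:
    """Toy storage manager: allocate() hands out cell ids,
    free() puts them back onto a free list."""
    def __init__(self, size):
        self.cells = {}                    # cell id -> contents
        self.free_list = list(range(size))

    def allocate(self, value):
        cell = self.free_list.pop()        # take a cell from the free list
        self.cells[cell] = value
        return cell

    def free(self, cell):                  # explicit style: the caller
        del self.cells[cell]               # must remember to do this
        self.free_list.append(cell)

arena = Arena(4)
a = arena.allocate("temporary result")
arena.free(a)                              # forget this call -> a leak
assert len(arena.free_list) == 4           # all storage recovered
```

A garbage collector replaces the explicit free() call by finding unreferenced cells automatically.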
Many believe that in a large programming project it is almost impossible to tell for sure when an object is no longer needed ("large" being something where many people work for years, creating code megabytes in size). Some even state that it is impossible to create a large system which will not have memory leaks (without GC).
Therefore, in practice, one of the following situations arises:
- storage is freed too late or not at all - this results in so-called "memory leaks" and leads to ever-growing storage needs.
Take the X window server as an example, which - even after years of development - is still known to contain some of these leaks. Also, many windowing toolkits (such as the Motif library) suffer from this problem.
"Garbage collection" is then typically done by terminating the program and restarting from scratch - this suggestion can even be found in some vendors' user manuals (this is not a joke).
- storage is freed too early, while references to it still exist - this may result in a variety of errors, from going totally unnoticed (if the data did not get overwritten in the meantime), to invalid data, to total crashes of the program. It is also possible that a program runs pretty well on one machine, but crashes badly on another - due to different reallocation strategies in the memory allocator. Or that a simple change in the program makes a perfectly running program suddenly crash - as a consequence of some storage address changes.
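The second failure mode can be made concrete with a toy arena (a hypothetical Python simulation; a real C program would corrupt raw memory instead of a Python list): after a premature free, the old reference still "works" until the allocator happens to reuse the cell.

```python
class Arena:
    """Toy storage manager; free() only recycles the cell id,
    the old contents are left in place - as in a real allocator."""
    def __init__(self, size):
        self.cells = [None] * size
        self.free_list = list(range(size))

    def allocate(self, value):
        cell = self.free_list.pop()
        self.cells[cell] = value
        return cell

    def free(self, cell):
        self.free_list.append(cell)     # contents are left in place

arena = Arena(4)
a = arena.allocate("invoice #1")
arena.free(a)                           # 'a' is now a dangling reference

stale = arena.cells[a]                  # bug goes unnoticed: data intact
assert stale == "invoice #1"

b = arena.allocate("invoice #2")        # allocator reuses the same cell
assert b == a                           # (depends on allocation strategy!)
assert arena.cells[a] == "invoice #2"   # silent corruption through 'a'
```

Whether b reuses a's cell depends on the allocator's strategy - which is exactly why such a program may run fine on one machine and crash on another.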
To make it clear:
I REALLY do NOT think (and do NOT want to give the impression) that all these programmers are incapable of good programming - every one of us has (and had) many errors of this kind - even the very best gurus make those errors !!! It is simply the complexity of big systems (especially when created by a big group) which makes these errors appear again and again.
When separate subsystems (libraries, especially: binary libraries) are involved, things tend to become even harder, since it is often unclear (from the specification and documentation) who is responsible for the freeing of objects. ((just think of a container-class which keeps references to some other objects; the question is whether the container should free its referred-to objects when it is freed, or leave this to the user of the container. If left to the user, how can the user know whether other parts of the program still hold references to those objects ?))
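The container ownership question can be illustrated with a hypothetical sketch (all names invented for illustration): a container that frees its elements on destruction invalidates any other reference to a shared element, while one that does not may leak them.

```python
class Container:
    """Toy container; frees_elements models the ownership policy
    that is so often left unspecified in library documentation."""
    def __init__(self, frees_elements):
        self.items = []
        self.frees_elements = frees_elements

    def add(self, obj):
        self.items.append(obj)

    def destroy(self, freed):
        if self.frees_elements:         # policy A: free referred-to objects
            freed.update(self.items)
        self.items.clear()              # policy B: leave that to the user

shared = "config record"
freed = set()                           # models the set of freed objects
a = Container(frees_elements=True)
b = Container(frees_elements=False)
a.add(shared)
b.add(shared)
a.destroy(freed)                        # a frees 'shared' ...
assert shared in freed
assert shared in b.items                # ... while b still references it
```

Neither policy is safe without global knowledge of who else holds references - which is exactly the knowledge a GC computes automatically.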
The most well known and most often used strategies are:
mark & sweep: first, starting from a set of known root objects, all reachable objects are visited and marked; then a sweep is done over the whole memory, looking for unmarked storage areas - these are put back onto some free list or equivalent.
Most of the early GC implementations used the mark & sweep collection scheme, which (I think) is responsible for many people's opposition against GC. The basic mark & sweep algorithm leads to a pause, making interaction with the program impossible for a while. This disruptive pause becomes extreme if paging is involved in virtual memory systems (since all of the virtual memory must be processed during the sweep phase, many page faults may occur in systems where the virtual memory in use exceeds the real memory).
Considering early Lisp systems (timesharing 20 users on a DEC20, running a megabyte lisp program in 128-256 k of real memory) you can imagine how long these pauses could become. Many references in the literature describe pause times of seconds and even minutes.
This may be the reason for many people's prejudice against GC - having an idea of large, disruptive pauses in their mind :-).
The cpu-time overhead created by mark & sweep (ignoring paging) is in the range of 5% to 10%. If the time were not spent in one big pause, mark & sweep would not be too bad (especially considering modern workstations, where 16-32Mb of memory are standard and an overhead of a few percent can be tolerated).
Mark & sweep does not compact the memory - this means that even though enough memory is available in total, it may not be usable for an allocation due to fragmentation.
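The two phases can be sketched over a toy object graph (a hypothetical representation: each object is a dict with a list of outgoing references):

```python
def mark(obj, marked):
    """Phase 1: recursively mark everything reachable from obj."""
    if id(obj) in marked:
        return
    marked.add(id(obj))
    for ref in obj["refs"]:
        mark(ref, marked)

def sweep(heap, marked):
    """Phase 2: walk the WHOLE heap; unmarked objects are reclaimed."""
    return [o for o in heap if id(o) in marked]

a = {"name": "a", "refs": []}
root = {"name": "root", "refs": [a]}
junk = {"name": "junk", "refs": [a]}    # unreachable from the root
heap = [a, root, junk]

marked = set()
mark(root, marked)
heap = sweep(heap, marked)
assert sorted(o["name"] for o in heap) == ["a", "root"]
```

Note that the sweep phase touches the whole heap, not just the live objects - this is the source of the paging behaviour described above.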
At first look, reference counting seems to overcome those problems - distributing the GC overhead over the execution time of the program. However, reference counting too creates some overhead: the counts must be updated on every store of a reference.
The bad news with reference counting is that it is not sufficient - self-referencing objects and objects containing reference cycles cannot be freed by a reference counting GC. Thus there is still a need for another technique to get rid of those. Alternatively, cycles could be broken manually by the programmer - but that again leads to possible omissions and therefore hard-to-find errors.
[[ At first, some might argue that cycles are used relatively seldom, but this is NOT true ! Just consider the data structure for a window on the screen, containing references to its subviews, which contain a reference to their parent view; voila, a cycle. If modern programming techniques, AI, multiple lightweight processes etc. are involved, even more indirect cycles will definitely be present - especially in a language such as Smalltalk, which presents even stack frames, processes and executable code as objects ... ]]
[[ Literature notes that reference counting GCs in early Smalltalk systems led to some very obscure errors. These occurred whenever a programmer forgot to break up the MVC-cycle in their view closing code. ]]
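Since Python itself happens to use reference counting with a backup cycle collector, the problem can be demonstrated directly with the standard gc module:

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

gc.disable()                 # leave only reference counting active
a, b = Node(), Node()
a.ref, b.ref = b, a          # a <-> b : a reference cycle
del a, b                     # counts drop to 1 each, never to 0 -> leaked

collected = gc.collect()     # a tracing collector is needed to break it
assert collected >= 2        # both Nodes (at least) are now reclaimed
gc.enable()
```

This is exactly the MVC-cycle situation noted above: forget to break the cycle, and a counting collector alone will never reclaim those objects.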
Copying collectors are among the fastest available GC techniques; especially generation scavenging (which uses a hierarchy of semispaces) is to date (to my knowledge) unbeaten in its low average cpu-time overhead.
In generation scavenging, all memory is divided into separate areas for generations of objects - all objects are first created in a relatively small area called newSpace, Eden or similar. When this area becomes full, a scavenge operation is performed on this area only, which can be done relatively fast. Measurements showed that usually most of these new objects are no longer referenced; since only the living objects are investigated and copied, this leads to short pause times for this space (20ms to 50ms), which can easily be hidden between two key strokes or a file-access operation in timesharing systems (as in unix systems, where a sync operation may produce comparable delays from time to time).
To avoid copying objects around forever, objects which survived these scavenges for some time will be copied to another area (generation) called oldSpace. This action is called tenuring (aging).
Systems may have two or more of these spaces.
Once an object has been moved into a higher generation, it will no longer create any copying overhead in the younger generation. Of course, these generations may fill up too, but this happens much less often.
Measurements show that generation scavenging produces an overhead of some 3% to 5% (but see the discussion of worst-case situations below).
Copying collectors offer the additional benefit that cyclic references are handled correctly, AND that the storage space is always compacted after a GC.
Interestingly, pause times are shorter if lots of garbage is reclaimed, and longer if many objects survive.
The disadvantage of the copying techniques is that they require an additional amount of (unusable) memory, ranging from 50% for the basic Baker algorithm down to 20% for Ungar's generation scavenging. (This memory needs to be virtual only - it is only needed while switching semispaces. During program execution, the other semispace can be paged out.)
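The copying idea can be sketched in a few lines (a hypothetical Python model using list indices as addresses; real collectors copy raw memory and store forwarding pointers in the old objects): live objects are copied from fromspace to tospace, garbage is simply left behind, and the survivors end up compacted at the start of tospace.

```python
def copy_collect(fromspace, roots):
    """Cheney-style semispace copy: returns (tospace, new root addresses)."""
    tospace, forward = [], {}          # forward: old address -> new address

    def copy(addr):
        if addr not in forward:            # copy each object once,
            forward[addr] = len(tospace)   # then reuse its forwarding address
            tospace.append(dict(fromspace[addr]))
        return forward[addr]

    new_roots = [copy(r) for r in roots]
    scan = 0                           # scan pointer: fix up references,
    while scan < len(tospace):         # copying referenced objects as needed
        obj = tospace[scan]
        obj["refs"] = [copy(r) for r in obj["refs"]]
        scan += 1
    return tospace, new_roots

fromspace = [
    {"name": "root", "refs": [2]},     # address 0
    {"name": "garbage", "refs": [0]},  # address 1: unreachable
    {"name": "leaf", "refs": []},      # address 2
]
tospace, roots = copy_collect(fromspace, [0])
assert [o["name"] for o in tospace] == ["root", "leaf"]   # compacted
assert tospace[0]["refs"] == [1]       # reference rewritten to new address
```

The work done is proportional only to the amount of live data copied - which is why pauses are shorter when most objects are garbage.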
Of course, the memory could be divided into areas which are collected with different algorithms, for example, it is possible to use a copying collector for young objects, and move objects into another area collected by mark & sweep or reference counting.
Both mark & sweep and copying algorithms can be made incremental and run in the background, at idle times or whenever another object gets allocated. There are even algorithms for multi-CPU systems, where background collection is done by a separate processor. These incremental algorithms are useful as an additional strategy, to use the cpu while nothing else can be done - for example, in an interactive application, even with a fast typist, there is a lot of cpu time available even between two key strokes (some 100ms), which is enough time to do a scavenge operation or incrementally mark some objects.
Others use tools which (more or less successfully) find invalid references to already freed objects, multiple freeing of the same object and so on. To my knowledge, none of these is perfect or can guarantee that all such situations are handled: - analysis of the source code is (in the general case) impossible, - analysis of the executing process requires that every possible flow of control through the program be executed (at least once) to detect all possible memory bugs.
For many applications, this is simply not possible. Also - strictly speaking - these check-runs have to be repeated for every change in the program, however small that change was.
I doubt that these are very useful for modest-sized applications involving a team of many programmers (not talking about toy programs here :-).
There is finally a very radical, puristic group of GC enemies which simply suggests the following:
"if you cannot get your memory organized, give up programming - since you are simply too dumb"
(this is no joke; this statement really occurred in the comp.lang.c++ newsgroup) Big smily here :-)
(*) Notes:
There are other GCs possible; a relatively popular algorithm is the so-called conservative GC, which scans memory for things which "look like a pointer" to track reachable objects.
I do not know what ST/V and Enfin use.
ST/X uses a modified generation scavenging scheme for new objects (with adaptive aging) and a Baker algorithm for old objects, which is started on demand. There is also an incremental mark & sweep running over the old objects at idle times or (optionally) as a low-priority background process.
The implementation is prepared for (and it is planned) to add a third, intermediate generation, and to replace the semispace Baker algorithm by some in-place compressor. (The compressor will be somewhat slower, but relaxes the virtual memory need.)
Oldspace collections happen very infrequently - if no special memory needs arise (as when doing image processing/viewing), the system may run forever without one (especially with the incremental background collector running).
Thus, reclamation of this space is now the responsibility of the oldspace collector, which, for various reasons, produces more cpu-time overhead. Normally, the incremental GC can free this memory easily; however, if the incremental GC cannot keep up with the oldspace allocation produced by tenuring, the oldSpace will finally fill up, and a noticeable pause cannot be avoided (when the oldSpace is reclaimed and/or compacted).
The adaptive aging usually prevents this, but whenever big chunks of data are allocated, there is a danger of those being moved to the oldspace early or, if a chunk's size exceeds some limit, being directly allocated in the oldspace.
With today's high performance computers, this is an adequate price to pay for the added security and - not to forget - the time savings of the programmer, who would otherwise spend a lot of his/her time in the debugger instead of doing productive work.
Of course, there are always special situations in which one algorithm performs better than others - memory allocation patterns vary over different applications. Generation scavenging provides the best average performance over different allocation patterns.
Or, to summarize (and estimate/simplify):
if a programmer spends 80% of his/her time debugging code, and 50% of all errors are dangling pointers, bad frees or other GC-avoidable bugs, getting rid of those bugs rewards you with a development cost saving of 40% !!!
Now, those savings will pretty soon amortize the added cost of a CPU that is 5% to 10% faster (to make up for the GC overhead).
The exact percentages may vary and may not be correct in every situation, but you see the point, don't you ?
Copyright © Claus Gittinger Development & Consulting, all rights reserved
(cg@ssw.de)