There are two entries on garbage collection, the first is an excellent entry written for the FAQ by Paul Johnson and the second is from the FAQ author's company on the necessity of garbage collection for object-oriented programming and technique.
From: Paul Johnson (paj@gec-mrc.co.uk)
Garbage collection (GC) is a facility in the run-time system associated with a language which will automatically reclaim objects which are no longer used. OO Languages which require garbage collection include Eiffel, Smalltalk and CLOS. C and C++ can have garbage collection retrofitted (see [3] and [4] below). [Ada has switchable GC, too -bob]
Without GC programmers must explicitly deallocate dynamic storage when it is no longer needed (in C this is done by a call to free(3)). There are a number of problems with this:
1: Bugs due to errors in storage deallocation are very hard to find, although products are available which can help.
2: In some circumstances the decision about whether to deallocate storage cannot be made by the programmer. Drawing editors and interpreters often suffer from this. The usual result is that the programmer has to write an application-specific garbage collector.
3: An object which is responsible for deallocating storage must be certain that no other object still needs that storage. Thus many modules must co-operate closely. This leads to a tight binding between supposedly independent modules.
4: Libraries with different deallocation strategies are often incompatible, hindering reuse.
5: In order to avoid problems 3 and 4, programmers may end up copying and comparing whole objects rather than just references. This is a particular problem with temporary values produced by C++ overloaded operators.
6: Because keeping track of storage is extra work, programmers often resort to statically allocated arrays. This in turn leads to arbitrary restrictions on input data which can cause failure when the assumptions behind the chosen limits no longer apply. For instance many C compilers limit expression nesting, identifier length, include file nesting and macro stack depth. This causes problems for programs that generate C.
One partial solution to a lack of GC is reference counting. In this scheme each object keeps a count of references to it. When this count drops to zero the object is automatically deallocated. However this is inefficient (swapping two references will result in three decrements, three increments and six comparisons) and cannot reclaim circular data structures. Two systems that use a reference count GC are the Interviews C++ graphics library and the Unix file system (the link count).
Opponents of GC reply that it introduces an overhead which is unacceptable in some applications. However the overhead of manual storage deallocation is probably as high as GC. GC algorithms are also available with good real-time behaviour.
[Further, GC can perform compaction improving locality of reference.]
Further Reading:
[1] "Object-Oriented Software Construction" by Meyer puts the argument for GC.
[2] "Uniprocessor Garbage Collection Techniques," by Paul R. Wilson, in Memory Management (proceedings of 1992 Int'l Workshop on Memory Management, Sept. 1992, St. Malo, France, Yves Bekkers and Jacques Cohen, eds.), Springer Verlag Lecture Notes in Computer Science #637.
This is an excellent summary of the state of the art in GC algorithms. This and other papers about garbage collection are available in PostScript via anonymous ftp (cs.utexas.edu:pub/garbage/gcsurvey.ps. [See APPENDIX E]
[3] "Garbage Collection in an Uncooperative Environment" by Boehm and Weiser. Software --- Practise and Experience vol 18(9), pp 807-820. Sept 1988. This describes GC in C and C++. ftp://parcftp.xerox.com/pub/gc/gc.html
[4] Geodesic Systems provides GC for C and C++. See http://www.geodesic.com and Appendix G.
3.9b) Why is Garbage Collection Necessary for Object-Oriented Programming? -------------------------------------------------------------------------- Michael Spertus Geodesic Systems mps@geodesic.com
There are several reasons why true object-oriented programming requires garbage collection.
1. Manual Memory Management Breaks Encapsulation.
Program components frequently need knowledge of an entire program to determine the last use of an object and provide deletion. This makes reuse of the component nearly impossible. For example, methods and functions taking a container as an argument need to know of or make assumptions about the rest of the program to determine ownership of the objects in the container.
Attempts to encapsulate memory management with reference counting, the "poor man's garbage collector", are usually misguided. Reference counting has worse performance than GC, awkward syntax, and poor semantics, which typically include failure to reclaim cycles, inability to handle stack and static objects, lack of polymorphism, and problems with interior pointers (e.g. arrays and multiple inheritance). Intensive research [1] in garbage collection has completely solved the above problems and made reference counting an inadequate substitute for true GC.
2. Implementation Hiding is not Compatible with Manual Memory Management
Implementation hiding is a pillar of object-oriented programming, but explicit memory management requires implementation-dependent low-level knowledge of how memory is structured. For example, programmers often use "copy on write" to efficiently implement pass-by-value semantics. However, to manage memory explicitly, a program has to know if it has a copy of an object or the object itself. Programmers sometimes use reference counting to encapsulate copy-on-write memory management. However, this only works well in simple cases like strings where the data is not polymorphic and does not contain pointers.
3. Message Passing Leads to Dynamic Execution Paths
Manual memory management must make assumptions about a program's order of execution to make a compile-time determination of the last user of an object. While this is often possible in procedural languages, the object- oriented paradigm of objects sending messages to each other (possibly from different threads) makes it impossible to statically determine the last user of an object. For example, event driven GUI programs frequently have no clear order of execution. Other dynamic control structures, such as exceptions, also make static analysis of memory usage at compile-time impossible.
4. Differing Memory Management Schemes Hinder Reuse
Because no memory management scheme is universal enough for all applications, manually managed components and libraries often use incompatible memory management schemes. For example, there are common container libraries using each of the following schemes:
a) Doubly specified empty and remove methods with one including a memory delete, placing the memory management burden on the client, who must call the appropriate method. b) Switches indicating deletion. Many applications must clear the switch to support long-lived data and keep track of ownership of data shared by multiple containers, leaving many memory management issues unaddressed.
c) Value semantics store objects rather than references, inhibiting data sharing and carrying expensive performance costs when complex objects are copied by value.
d) Reference counting, which was already discussed. Any components or libraries that use containers with different memory management strategies are difficult to use with each other.
5. Garbage Collection Works
It is not enough to merely find fault with manual memory management. One also has to show that garbage collection provides a better alternative. Early versions of garbage collection were merely crude implementations of mark-and-sweep that left much to be desired. However, garbage collection has advanced as rapidly as most computer-related technologies and is now a robust, mature technology.[1] Many object-oriented languages specify garbage collection for all or part of their memory. Even C and C++ have at least one commercially supported garbage collector that can transparently and compatibly manage both new and existing programs. [2]
Garbage collected programs are usually as fast and responsive as their manually managed brethren. [3] In fact, multi-media programmers sometimes choose treadmill collectors [4] over hand-management because of its superior real-time performance as manual management usually has difficulty scheduling destructor calls smoothly. Of course, garbage collected programs are generally more reliable and easier to develop, maintain, and reuse than manually managed programs. Finally, garbage collection can be mixed with manual management to provide the programmer with the broadest set of tools, and garbage collection is much too important a tool to be absent from any object-oriented programmer's toolbox.
References [1] Paul R. Wilson, "Uniprocessor Garbage Collection Techniques", 1992 International Workshop on Memory Management, Springer-Verlag Lecture Notes in Computer Science series.
[2] Geodesic Systems, Great Circle(TM) Automatic Memory Management System. http://www.geodesic.com/GreatCircle/index.html.
[3] Detlefs, Dosser, and Zorn, "Memory Allocation Costs in Large C and C++ Programs". ftp://cs.colorado.edu/pub/techreports/zorn/CU-CS-665-93-ps.Z.
[4] Henry Baker, "The Treadmill: Real-Time Garbage Collection without Motion Sickness". ftp://ftp.netcom.com/pub/hb/hbaker/NoMotionGC.html
This document was translated by ms2html v1.8 on 04.06.96.