Object-Orientation FAQ

3.9) Why is Garbage Collection A Good Thing?

There are two entries on garbage collection, the first is an excellent entry
written for the FAQ by Paul Johnson and the second is from the FAQ author's
company on the necessity of garbage collection for object-oriented programming
and technique.

  From: Paul Johnson (paj@gec-mrc.co.uk)

Garbage collection (GC) is a facility in the run-time system associated with a
language which will automatically reclaim objects which are no longer used.
OO Languages which require garbage collection include Eiffel, Smalltalk and
CLOS.  C and C++ can have garbage collection retrofitted (see [3] and [4]
below).  [Ada has switchable GC, too -bob]

Without GC programmers must explicitly deallocate dynamic storage when
it is no longer needed (in C this is done by a call to free(3)).
There are a number of problems with this:

1: Bugs due to errors in storage deallocation are very hard to find,
   although products are available which can help.

2: In some circumstances the decision about whether to deallocate
   storage cannot be made by the programmer.  Drawing editors and
   interpreters often suffer from this.  The usual result is that the
   programmer has to write an application-specific garbage collector.

3: An object which is responsible for deallocating storage must be
   certain that no other object still needs that storage.  Thus many
   modules must co-operate closely.  This leads to a tight binding
   between supposedly independent modules.

4: Libraries with different deallocation strategies are often
   incompatible, hindering reuse.

5: In order to avoid problems 3 and 4, programmers may end up copying
   and comparing whole objects rather than just references.  This is a
   particular problem with temporary values produced by C++ overloaded
   operators.

6: Because keeping track of storage is extra work, programmers often
   resort to statically allocated arrays.  This in turn leads to
   arbitrary restrictions on input data which can cause failure when
   the assumptions behind the chosen limits no longer apply.  For
   instance many C compilers limit expression nesting, identifier
   length, include file nesting and macro stack depth.  This causes
   problems for programs that generate C.

One partial solution to a lack of GC is reference counting.  In this
scheme each object keeps a count of references to it.  When this count
drops to zero the object is automatically deallocated.  However this
is inefficient (swapping two references will result in three
decrements, three increments and six comparisons) and cannot reclaim
circular data structures.  Two systems that use a reference count GC
are the Interviews C++ graphics library and the Unix file system (the
link count).

Opponents of GC reply that it introduces an overhead which is
unacceptable in some applications.  However the overhead of manual
storage deallocation is probably as high as GC.  GC algorithms are
also available with good real-time behaviour.

[Further, GC can perform compaction improving locality of reference.]

Further Reading:

[1] "Object-Oriented Software Construction" by Meyer puts the argument
for GC.

[2] "Uniprocessor Garbage Collection Techniques," by Paul R. Wilson,
in Memory Management (proceedings of 1992 Int'l Workshop on Memory
Management, Sept. 1992, St. Malo, France, Yves Bekkers and Jacques Cohen,
eds.), Springer Verlag Lecture Notes in Computer Science #637.

This is an excellent summary of the state of the art in GC algorithms.  This
and other papers about garbage collection are available in PostScript via
anonymous ftp (cs.utexas.edu:pub/garbage/gcsurvey.ps.  [See APPENDIX E]

[3] "Garbage Collection in an Uncooperative Environment" by Boehm and
Weiser.  Software --- Practise and Experience vol 18(9), pp 807-820.
Sept 1988.  This describes GC in C and C++.
ftp://parcftp.xerox.com/pub/gc/gc.html

[4] Geodesic Systems provides GC for C and C++.  See http://www.geodesic.com
  and Appendix G.

3.9b) Why is Garbage Collection Necessary for Object-Oriented Programming?
--------------------------------------------------------------------------
			  Michael Spertus
			  Geodesic Systems
			  mps@geodesic.com

There are several reasons why true object-oriented programming requires garbage
collection.

1. Manual Memory Management Breaks Encapsulation.

  Program components frequently need knowledge of an entire program to
  determine the last use of an object and provide deletion.  This makes reuse
  of the component nearly impossible. For example, methods and functions
  taking a container as an argument need to know of or make assumptions
  about the rest of the program to determine ownership of the objects in the
  container.

  Attempts to encapsulate memory management with reference counting, the "poor
  man's garbage collector", are usually misguided. Reference counting has worse
  performance than GC, awkward syntax, and poor semantics, which typically
  include failure to reclaim cycles, inability to handle stack and static
  objects, lack of polymorphism, and problems with interior pointers (e.g.
  arrays and multiple inheritance). Intensive research [1] in garbage
  collection has completely solved the above problems and made reference
  counting an inadequate substitute for true GC.

2. Implementation Hiding is not Compatible with Manual Memory Management

  Implementation hiding is a pillar of object-oriented programming,
  but explicit memory management requires implementation-dependent
  low-level knowledge of how memory is structured. For example, programmers
  often use "copy on write" to efficiently implement pass-by-value semantics.
  However, to manage memory explicitly, a program has to know if it has a copy
  of an object or the object itself. Programmers sometimes use reference
  counting to encapsulate copy-on-write memory management. However, this only
  works well in simple cases like strings where the data is not polymorphic
  and does not contain pointers.

3. Message Passing Leads to Dynamic Execution Paths

   Manual memory management must make assumptions about a program's order of
   execution to make a compile-time determination of the last user of an
   object. While this is often possible in procedural languages, the object-
   oriented paradigm of objects sending messages to each other (possibly from
   different threads) makes it impossible to statically determine the last user
   of an object. For example, event driven GUI programs frequently have no
   clear order of execution. Other dynamic control structures, such as
   exceptions, also make static analysis of memory usage at compile-time
   impossible.

4. Differing Memory Management Schemes Hinder Reuse

  Because no memory management scheme is universal enough for all applications,
  manually managed components and libraries often use incompatible memory
  management schemes. For example, there are common container libraries using
  each of the following schemes:

  a) Doubly specified empty and remove methods with one including a memory
     delete, placing the memory management burden on the client, who must call
     the appropriate method.
  b) Switches indicating deletion. Many applications must clear the switch to
     support long-lived data and keep track of ownership of data shared by
     multiple containers, leaving many memory management issues unaddressed.

  c) Value semantics store objects rather than references, inhibiting data
     sharing and carrying expensive performance costs when complex objects are
     copied by value.

  d) Reference counting, which was already discussed.
  Any components or libraries that use containers with different memory
  management strategies are difficult to use with each other.

5. Garbage Collection Works

  It is not enough to merely find fault with manual memory management. One also
  has to show that garbage collection provides a better alternative. Early
  versions of garbage collection were merely crude implementations of
  mark-and-sweep that left much to be desired. However, garbage collection has
  advanced as rapidly as most computer-related technologies and is now a robust,
  mature technology.[1] Many object-oriented languages specify garbage
  collection for all or part of their memory. Even C and C++ have at least one
  commercially supported garbage collector that can transparently and
  compatibly manage both new and existing programs. [2]

  Garbage collected programs are usually as fast and responsive as
  their manually managed brethren. [3] In fact, multi-media programmers
  sometimes choose treadmill collectors [4] over hand-management because of its
  superior real-time performance as manual management usually has difficulty
  scheduling destructor calls smoothly. Of course, garbage collected programs
  are generally more reliable and easier to develop, maintain, and reuse than
  manually managed programs. Finally, garbage collection can be mixed with
  manual management to provide the programmer with the broadest set of tools,
  and garbage collection is much too important a tool to be absent from any
  object-oriented programmer's toolbox.

References
[1] Paul R. Wilson, "Uniprocessor Garbage Collection Techniques", 1992
  International Workshop on Memory Management, Springer-Verlag Lecture Notes in
  Computer Science series.

[2] Geodesic Systems, Great Circle(TM) Automatic Memory Management System.
  http://www.geodesic.com/GreatCircle/index.html.

[3] Detlefs, Dosser, and Zorn, "Memory Allocation Costs in Large C and C++
  Programs".  ftp://cs.colorado.edu/pub/techreports/zorn/CU-CS-665-93-ps.Z.

[4] Henry Baker, "The Treadmill: Real-Time Garbage Collection without Motion
  Sickness".  ftp://ftp.netcom.com/pub/hb/hbaker/NoMotionGC.html

This document was translated by ms2html v1.8 on 04.06.96.