List/Sublist Projects: Garbage Collector
(To remind you of the context of all this:)
These projects revolve around the program for reading and printing
lists and sublists, as discussed in class.
(Note: this includes an updated Tokenizer -- please use the new copy.)
(Note: requires Java 1.5 or later.)
Overview:
- First, learn how the provided program and data structure work.
- Provide a mechanism for handling the allocation of Token objects,
leading eventually to a kind of "garbage-collection" scheme.
- Implement some arithmetic functions, leading eventually to a
simple programming language with variables.
Due dates:
- #13: phase 1, week of Nov 26
- #15: phase 2, Wednesday Dec 5
- #16: phase 3, or #17: eval pt 2, Wednesday Dec 12
- Final date for any homework: Friday, Dec 14
After Friday, you should be concentrating on exams for this and your
other classes. No homework will be accepted after Friday, Dec 14,
2007.
- Final exam, Tuesday Dec. 18, 3-6 pm, M-1-619
This page documents the garbage collector project. Alternatively,
you can work on the augmented eval project (#17).
Assignment 16: Garbage Collector for Item/List
In this phase, we pull the pieces together and implement a simple
"garbage collector" for the List/Sublist world. Most of the work has
already been done! We'll use a "mark and sweep" algorithm.
Add a "status" field to Token, if you have not already done so.
Getting Organized:
Start the garbage collector by visiting all the Tokens and setting
their status to "unknown".
(This is why we'd like to have an array of all existing Tokens: it
makes them easy to find.)
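As a rough illustration of this first step, here is a minimal sketch.
The names Status, allTokens, GcReset, and resetAll are assumptions made
up for this example; your Token class and your array of Tokens will
have their own names and extra fields.

```java
// Hypothetical sketch: Status, allTokens, and GcReset are illustrative
// names, not part of the provided program.
enum Status { UNKNOWN, IN_USE, FREE }

class Token {
    Status status = Status.FREE;   // the "status" field added for this phase
}

class GcReset {
    // Getting organized: forget what we believed about every Token.
    static void resetAll(Token[] allTokens) {
        for (Token t : allTokens) {
            t.status = Status.UNKNOWN;
        }
    }
}
```

Note that this also resets Tokens that were already free, so the later
sweep step has to account for them.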
Mark:
Next, we'll want to find all the Tokens that might possibly be in use.
For the basic program, that's the Item (probably an IList) currently
under construction, anchored in a variable called "theList" in class
IList. Recursively visit all the Items belonging to theList, and
whenever you find a Token, mark it by setting its "status" field to
"inUse".
(Note that some Tokens, such as those for "(" and ")", never actually
make it into theList.)
If your program has any other place where Tokens might be attached,
you'll want to visit that also. (Notice that the readList function
carefully attaches new sublists to the parent before adding elements,
so that theList always provides a path to everything we've built so
far.)
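The recursive visit might be sketched as follows. The shape of the
data structure here is an assumption: an Item is taken to be either a
Leaf holding one Token or an IList holding further Items. Leaf,
elements, Marker, and mark are hypothetical names; adapt the idea to
the classes in the provided program.

```java
import java.util.ArrayList;
import java.util.List;

enum Status { UNKNOWN, IN_USE, FREE }

class Token {
    final String text;               // assumed payload, for illustration only
    Status status = Status.UNKNOWN;
    Token(String text) { this.text = text; }
}

// Assumed structure: an Item is either a Leaf (one Token) or an IList.
abstract class Item { }

class Leaf extends Item {
    final Token token;
    Leaf(Token token) { this.token = token; }
}

class IList extends Item {
    static IList theList;            // root of everything built so far
    final List<Item> elements = new ArrayList<Item>();
}

class Marker {
    // Visit every Item reachable from 'item' and mark each Token found.
    static void mark(Item item) {
        if (item == null) {
            return;                  // nothing under construction yet
        }
        if (item instanceof Leaf) {
            ((Leaf) item).token.status = Status.IN_USE;
        } else if (item instanceof IList) {
            for (Item child : ((IList) item).elements) {
                mark(child);         // recurse into sublists
            }
        }
    }
}
```

The mark phase would then be started with Marker.mark(IList.theList).
Because sublists form a tree, the recursion terminates without any
extra bookkeeping; a structure with cycles would need a check for
Tokens that are already marked.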
Sweep:
Now sweep through the array of all Tokens. Any Token whose "status" is
still "unknown" must actually be free, as there is no existing path
leading to it. Change its status to "free" and add it to the freeList.
You're done.
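A minimal sketch of the sweep, under some assumptions: the free list is
a singly linked chain through a "nextFree" field, and Sweeper, sweep,
and the static freeList variable are names invented for this example.
Since the first step set every Token to "unknown", including those
already on the free list, this sketch rebuilds the free list from
scratch rather than appending to the old one.

```java
enum Status { UNKNOWN, IN_USE, FREE }

class Token {
    Status status = Status.UNKNOWN;
    Token nextFree;                  // used only while the Token is free
}

class Sweeper {
    static Token freeList;           // head of the free list (assumed name)

    // Sweep: every Token still UNKNOWN is unreachable, so it is free.
    // Returns the number of Tokens on the rebuilt free list.
    static int sweep(Token[] allTokens) {
        freeList = null;             // rebuild, to avoid linking a Token twice
        int freeCount = 0;
        for (Token t : allTokens) {
            if (t.status == Status.UNKNOWN) {
                t.status = Status.FREE;
                t.nextFree = freeList;   // push onto the front of the list
                freeList = t;
                freeCount++;
            }
        }
        return freeCount;
    }
}
```

Tokens marked "inUse" are left alone, so everything reachable from
theList survives the collection.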
Commentary:
Java uses a system very much like this to take care of free memory.
There is an initial pool. Requests for "new" items are satisfied by
taking memory from the pool. When the pool is exhausted, the garbage
collector marks all currently accessible objects, and then sweeps
through the pool, reclaiming anything that's not reachable.
- Java can find all the variables and other places where there are
references to objects. It keeps track of variables while compiling the
program, of course. Since it also knows the type of every object, it
can follow trees, lists, and similar structures to track down
everything that's reachable.
- Java, since it can work behind the scenes, can also see the "stack"
of activation records for all active functions. In that way, it can
locate all active variables, arguments, etc.
- When we linked the free objects into a list, we had to add a
"nextFree" link field to each Token object. Adding such a field to all
Java objects could take up a lot of space. Happily, the nextFree links
are only needed when the object is on the free list, at which time the
rest of the object does not contain valid data; and vice versa.
Working behind the scenes, Java uses that space for the free list
links.
- Not all Java objects are the same size, of course. (Unlike in our
simple example, where we have only objects of type Token.) That makes
the handling of space much more interesting, but the garbage collection
strategy still works fine.
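To make the pool-and-collect cycle from this commentary concrete at the
scale of our Token world, here is a small sketch. It is not how the
Java runtime is implemented. TokenPool, allocate, collect, and the
"roots" set (a stand-in for everything reachable from theList) are all
hypothetical, and a simple boolean replaces the three-valued status to
keep the example short.

```java
import java.util.HashSet;
import java.util.Set;

class Token {
    boolean inUse;                   // simplified status for this sketch
    Token nextFree;                  // used only while the Token is free
}

class TokenPool {
    private final Token[] all;       // every Token that exists
    private Token freeList;          // head of the free list
    // Stand-in for "all Tokens reachable from theList".
    final Set<Token> roots = new HashSet<Token>();

    TokenPool(int size) {
        all = new Token[size];
        for (int i = 0; i < size; i++) {
            all[i] = new Token();
            all[i].nextFree = freeList;  // initially every Token is free
            freeList = all[i];
        }
    }

    // Satisfy a request from the pool; collect only when it runs dry.
    Token allocate() {
        if (freeList == null) {
            collect();
        }
        if (freeList == null) {
            throw new IllegalStateException("Token pool exhausted");
        }
        Token t = freeList;
        freeList = t.nextFree;
        t.nextFree = null;
        t.inUse = true;
        return t;
    }

    // Runs only when the free list is empty, so no Token is linked twice.
    private void collect() {
        for (Token t : all) { t.inUse = false; }     // getting organized
        for (Token t : roots) { t.inUse = true; }    // mark
        for (Token t : all) {                        // sweep
            if (!t.inUse) {
                t.nextFree = freeList;
                freeList = t;
            }
        }
    }
}
```

With a pool of two Tokens, allocating twice and keeping only the first
Token in roots means a third allocate() call triggers a collection and
hands back the second Token again.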