About garbage collection

Garbage collection is triggered when nb_create_string or nb_create_array are called (and only those two functions, either directly or indirectly) and there are not enough free cells to satisfy the request in any block in the free list. As we've seen, some strings or arrays held in the heap may be orphaned and considered garbage. In this case, a garbage collection is triggered in order to try to return some unused objects to the heap, and this is done in two phases.

Phase 1, Mark and sweep.

The first phase is to mark all accessible objects in the heap and then reconstruct the free list from all unreferenced (and therefore free) objects. Adjacent free objects are coalesced into single blocks. Once the free list is reconstructed, another attempt is made to see whether there is a block that can satisfy the request. If there is, that is all well and good, and the block of cells is then removed from the free list and put to use. This phase is particularly efficient and doesn't require any costly movement of objects in memory.

Phase 2, Compact.

Following mark and sweep, if there is no single free block large enough to satisfy the request, the garbage collector tries even harder by compacting the used cells. In the second phase, all used blocks are known (from the first phase) and the garbage collector slides them to the beginning of object memory adjusting all references to moved blocks as it does so. In this way it creates a contiguous run of cells beyond the last used cell. These unused cells are now returned to the free list as a single block and are then used to allocate the new object (or fail if there is still not enough room for the object). Thus, the CoreBASIC garbage collector is a compacting garbage collector.

Of course, there is a down side to this. For a start, garbage collection takes time, so your program is halted whilst garbage is collected, but this isn't a big problem as the first phase of collection is very fast. Only when there is severe pressure on object memory with many live references to objects will the second compacting phase be triggered, and this is slightly slower.

The biggest down side of the compacting algorithm is that you need to take special care when creating an array or reference that all arrays and strings you want to keep around are rooted. The best place to keep objects alive so they're not collected is to push a reference to them onto the expression stack.

In a similar vein, it's imperative that you do not dereference an array or string and keep that pointer locally if there is a possibility that a garbage collection will happen before you are done with the object. This is because the garbage collector may move objects in memory, and so the string or array elements will move in memory also. If you need to refer to an object where there is the possibility of a garbage collection, you must use the reference and dereference when you need to look at the string contents or array elements.

Some places in the interpreter are marked with a comment "*GC*" which indicates that the code following is written in a specific way so that it doesn't get tangled up with the garbage collector.