
Chapter 16
MMTk

The garbage collectors for Jikes RVM are provided by MMTk. The document MMTk: The Memory Manager Toolkit describes MMTk, gives a tutorial on how to use and edit it, and is the best place to start; an updated version of the tutorial is available in this guide. A detailed description of the call chain from the compilers through to MMTk is another good place to start understanding how MMTk integrates with Jikes RVM. Anatomy of a Garbage Collector describes the major building blocks of an MMTk collector, and Scanning Objects in Jikes RVM describes how objects are scanned for their pointer fields during GC. MMTk also has a pure Java test harness that allows development of garbage collectors in an IDE such as Eclipse.

Jikes RVM can be configured to employ various allocation managers taken from the MMTk memory management toolkit. Managers divide the available space up as they see fit, but normally subdivide the available address range into a number of distinct spaces, each managed according to its own policy.

Virtual memory pages are lazily mapped into Jikes RVM’s memory image as they are needed.

The main class which is used to interface to the memory manager is called Plan. Each flavor of the manager is implemented by substituting a different implementation of this class. Most plans inherit from class StopTheWorldGC which ensures that all active mutator threads (i.e. ones which do not perform the job of reclaiming storage) are suspended before reclamation is commenced. The argument passed to -X:gc:threads determines the number of parallel collector threads that will be used for collection.

Generational collectors employ a plan which inherits from class Generational. Inter alia, this class ensures that a write barrier is employed so that updates from old to new spaces are detected.

Jikes RVM may also use the GCSpy visualization framework. GCSpy allows developers to observe the behavior of the heap and related data structures.

16.1 Anatomy of a Garbage Collector

** Work in progress, contributions appreciated **

This page gives a brief outline of the major control flows in the execution of a garbage collector in MMTk. For simplicity, we focus on the MarkSweep collector, although much of the discussion will be relevant to other collectors.

This page assumes you have a basic knowledge of garbage collection. For those that don’t, please see one of the standard texts such as The Garbage Collection Handbook.

16.1.1 Structure of a Plan

An MMTk Plan is required to provide five classes. They must have consistent names, each starting with the plan's name and ending with a suffix that indicates which framework class it inherits from. In the case of the MarkSweep plan, the name is "MS".

The basic architecture of MMTk is that virtual address space is divided into chunks (of 4MB in a 32-bit memory model) that are managed according to a specific policy. A policy is implemented by an instance of the Space class, and it is in the policy class that the mechanics of a particular mechanism (like mark-sweep) is implemented. The task of a Plan is to create the policy (Space) objects that manage the heap, and to integrate them into the MMTk framework. MMTk exposes some of this memory management policy to the host VM, by allowing the VM to specify an allocator (represented by a small integer) when allocating space. The interface exposed to the VM allows it to choose whether an object will move during collection or not, whether the object is large enough to require special handling etc. The MMTk plan is free (within the semantic guarantees exposed to the VM) to direct each of these allocators to a particular policy.

16.1.2 Policies

A policy describes how a range of virtual address space is managed. The base class of all policies is org.mmtk.policy.Space, and a particular instance of a policy is known generically as a space. The static initializer of a Plan and its subclasses define the spaces that make up an MMTk plan.


MS.java
public static final MarkSweepSpace msSpace = new MarkSweepSpace("ms", VMRequest.discontiguous());
public static final int MARK_SWEEP = msSpace.getDescriptor();

In this code fragment, we see the MS plan defined. Note that we generally also define a static final space descriptor. This is an optimization that allows some rapid operations on spaces.
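
The point of the descriptor can be sketched with a toy model. The encoding below is illustrative, not MMTk's actual layout: packing a contiguous space's bounds into a single int lets a membership test run with a shift and two compares, without ever dereferencing the Space object.

```java
// Illustrative sketch (hypothetical encoding): pack a contiguous space's
// start chunk and chunk count into one int so membership tests are cheap.
final class Descriptor {
  static final int CHUNK_SHIFT = 22;                // 4MB chunks, as in the 32-bit model

  static int create(long start, long extent) {
    int startChunk = (int) (start >>> CHUNK_SHIFT);
    int chunks = (int) (extent >>> CHUNK_SHIFT);
    return (startChunk << 10) | chunks;             // assumes fewer than 1024 chunks
  }

  static boolean isInSpace(int descriptor, long addr) {
    int chunk = (int) (addr >>> CHUNK_SHIFT);       // which chunk does addr fall in?
    int startChunk = descriptor >>> 10;
    int chunks = descriptor & 0x3ff;
    return chunk >= startChunk && chunk < startChunk + chunks;
  }
}
```

A test against a space at 1GB of size 8MB shows the fast path: no object load, just integer arithmetic on the descriptor.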

A Space is a global object, shared among multiple mutator threads. Each policy will also have one or more thread-local classes which provide unsynchronized allocation. These classes are subclasses of org.mmtk.utility.alloc.Allocator, and in the case of MarkSweep, it is called MarkSweepLocal. Instances of MarkSweepLocal are created as part of a mutator context, like this


MSMutator.java
protected MarkSweepLocal ms = new MarkSweepLocal(MS.msSpace);

The design pattern is that the local Allocator will allocate space from a thread-local buffer, and when that is exhausted it will allocate a new buffer from the global Space, performing appropriate locking. The constructor of the MarkSweepLocal specifies the space from which the allocator will allocate global memory.
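
This local/global split can be sketched in miniature. The classes below are hypothetical stand-ins, not MMTk's API: the local allocator bump-allocates from a private buffer and takes the global lock only when the buffer is exhausted.

```java
// Sketch of the thread-local fast path / synchronized slow path pattern.
final class GlobalSpace {
  private long cursor = 0;
  synchronized long acquireBuffer(int bytes) {      // slow path: synchronized
    long start = cursor;
    cursor += bytes;
    return start;
  }
}

final class LocalAllocator {
  private static final int BUFFER_BYTES = 1 << 16;  // hypothetical buffer size
  private final GlobalSpace space;
  private long cursor = 0, limit = 0;

  LocalAllocator(GlobalSpace space) { this.space = space; }

  long alloc(int bytes) {
    if (cursor + bytes > limit) {                   // fast path exhausted:
      cursor = space.acquireBuffer(BUFFER_BYTES);   // refill from the global space
      limit = cursor + BUFFER_BYTES;
    }
    long result = cursor;                           // unsynchronized bump allocation
    cursor += bytes;
    return result;
  }
}
```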

16.1.3 Allocation

MMTk provides two methods for allocating an object. These are provided by the MSMutator class, to give each plan the opportunity to use fast, unsynchronized thread-local allocation before falling back to a slower synchronized slow-path.

The version implemented in MarkSweep looks like this:


MSMutator.java
public Address alloc(int bytes, int align, int offset, int allocator, int site) { 
  if (allocator == MS.ALLOC_DEFAULT) { 
    return ms.alloc(bytes, align, offset); 
  } 
  return super.alloc(bytes, align, offset, allocator, site); 
}

The basic structure of this method is common to all MMTk plans. First they decide whether the operation applies to this level of abstraction (if (allocator == MS.ALLOC_DEFAULT)), and if so, delegate to the appropriate place, otherwise pass it up the chain to the super-class. In the case of MarkSweep, MSMutator delegates the allocation to its thread-local MarkSweepLocal object ms.

The alloc method of MarkSweepLocal is inherited from SegregatedFreeListLocal (mark-sweep is not the only way of managing free-list allocation), and looks like this


SegregatedFreeListLocal.java (simplified)
public final Address alloc(int bytes, int align, int offset) {
  int sizeClass = getSizeClass(bytes);
  Address cell = freeList.get(sizeClass);
  if (!cell.isZero()) {
    freeList.set(sizeClass, cell.loadAddress());
    // Clear the free list link
    cell.store(Address.zero());
    return cell;
  }
  return allocSlow(bytes, align, offset);
}

This is a standard pattern for thread-local allocation: first we look up the free list for the appropriate size class and, if a free cell is available, unlink and return it. If unsuccessful, we request space from the global policy via the method Allocator.allocSlow. This is the common interface that all Allocators use to request space from the global policy. This will eventually call the allocator-specific allocSlowOnce method. The workings of allocSlowOnce are very policy-specific, so not appropriate to examine at this stage, but eventually all policies will attempt to acquire fresh virtual memory via the Space.acquire method.

Space.acquire is the only correct way for a policy to allocate new virtual memory for its own use.


Space.java (simplified)
public final Address acquire(int pages) {
  pr.reservePages(pages);
  // Poll, either fixing budget or requiring GC
  if (VM.activePlan.global().poll(false, this)) {
    VM.collection.blockForGC();
    return Address.zero(); // GC required, return failure
  }
  // Page budget is ok, try to acquire virtual memory
  Address rtn = pr.getNewPages(pagesReserved, pages, zeroed);
  if (rtn.isZero()) {  // Failed, so force a GC
    boolean gcPerformed = VM.activePlan.global().poll(true, this);
    VM.collection.blockForGC();
    return Address.zero();
  }
  return rtn;
}

The logic of Space.acquire is: reserve the pages; poll the plan, blocking for a GC (and returning failure) if one is required; otherwise try to acquire fresh virtual memory, forcing a GC and returning failure if none is available. The retry on failure happens in the caller, Allocator.allocSlowInline.


Allocator.java (simplified)
public final Address allocSlowInline(int bytes, int alignment, int offset) {
  boolean emergencyCollection = false;
  while (true) {
    Address result = allocSlowOnce(bytes, alignment, offset);
    if (!result.isZero()) {
      return result;
    }
    if (emergencyCollection) {
      VM.collection.outOfMemory();
    }
    emergencyCollection = Plan.isEmergencyCollection();
  }
}

This code fragment shows the retry logic in the allocator. We try allocating using allocSlowOnce, which may recycle partially-used blocks and eventually call Space.acquire. If a GC occurred, we try again. Eventually the plan will request an emergency collection which will (for example) cause soft references to be dropped. If this fails we throw an OutOfMemoryError.

16.1.4 Collection

Scheduling

In a stop-the-world garbage collector like MarkSweep, the mutator threads run until memory is exhausted, then all mutator threads are suspended, the collector threads are activated, and they perform a garbage collection. After the GC is complete, the collector threads are suspended and the mutator threads resume. MMTk also has some support for concurrent collectors, in which one or more collector threads can be scheduled to run alongside the mutator, either exclusively or in addition to (hopefully briefer) stop-the-world phases.

Thread scheduling in MMTk is handled by a GC controller thread, implemented in the singleton class org.mmtk.plan.ControllerCollectorContext held in the static field Plan.controlCollectorContext. Whenever a collection is initiated, it is done by calling methods on this object.
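
The request/park handshake can be modeled with a toy controller. All names here are illustrative, not MMTk's internals: a collector thread parks on a semaphore, and the controller's request() wakes it and blocks until the collection completes, just as mutators block for GC.

```java
import java.util.concurrent.Semaphore;

// Toy model of the controller handshake between mutators and a collector thread.
final class ToyController {
  private final Semaphore wake = new Semaphore(0);
  private final Semaphore done = new Semaphore(0);
  volatile int collections = 0;

  ToyController() {
    Thread collector = new Thread(() -> {
      while (true) {
        wake.acquireUninterruptibly();   // park() until a collection is requested
        collections++;                   // collect()
        done.release();                  // let the blocked mutator resume
      }
    });
    collector.setDaemon(true);
    collector.start();
  }

  void request() {                       // called when memory is exhausted
    wake.release();
    done.acquireUninterruptibly();       // mutator blocks for GC
  }
}
```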

Initiating

As mentioned above, every attempt to allocate fresh virtual memory calls the current plan’s poll(...) method. This initiates a GC by calling controlCollectorContext.request(), which in a stop-the-world collector like MarkSweep pauses the mutator threads and then wakes the collector threads. The main loop of the garbage collector is simply the run() method of ParallelCollector, shown below.


ParallelCollector
public void run() {
  while (true) {
    park();
    collect();
  }
}

The collect() method is specific to the type of collector, and in StopTheWorldCollector it looks like this


StopTheWorldCollector
public void collect() {
  Phase.beginNewPhaseStack(Phase.scheduleComplex(global().collection));
}

Collector Phases

Every garbage collection consists of a series of steps. Each step is either executed once (e.g. updating the mark state before marking the heap), or in parallel on all available collector threads (e.g. the parallel mark phase). The actual work of a step is done by the collectionPhase method of the global, collector or mutator class of a plan.

In early versions of MMTk, the main collection method was a template method, calling individual methods for each phase of the collection. As the number of collectors in MMTk grew, this became unwieldy and has been replaced with a configurable mechanism of phases.

The class org.mmtk.plan.Simple defines the basic structure of most of MMTk’s garbage collectors. First it defines the phases themselves,


Simple.java
public static final short SET_COLLECTION_KIND = Phase.createSimple("set-collection-kind", null);
public static final short INITIATE            = Phase.createSimple("initiate", null);
public static final short PREPARE             = Phase.createSimple("prepare");
...

Each phase of the collection is represented by a 16-bit integer, an index into a table of Phase objects. Simple phases are scheduled, and combined into sequences, or complex phases.


Simple.java
// Ensure stacks are ready to be scanned
protected static final short prepareStacks = Phase.createComplex("prepare-stacks", null,
    Phase.scheduleMutator    (PREPARE_STACKS),
    Phase.scheduleGlobal     (PREPARE_STACKS));
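
The mechanism can be sketched in miniature. The code below is a deliberately simplified model, not MMTk's Phase class: short ids index a global table, a complex phase is a sequence of ids, and processing expands complex phases depth-first into their simple steps.

```java
import java.util.ArrayList;
import java.util.List;

// Toy phase table: simple phases are names, complex phases are id sequences.
final class Phases {
  static final List<Object> table = new ArrayList<>();

  static short createSimple(String name) {
    table.add(name);
    return (short) (table.size() - 1);            // the id is just a table index
  }

  static short createComplex(short... ids) {
    table.add(ids.clone());
    return (short) (table.size() - 1);
  }

  static void process(short id, List<String> log) {
    Object p = table.get(id);
    if (p instanceof String) {
      log.add((String) p);                        // run a simple phase
    } else {
      for (short sub : (short[]) p) process(sub, log);  // expand a complex phase
    }
  }
}
```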

A simple phase can be scheduled in one of four ways, according to which component of the plan runs it: for example, Phase.scheduleMutator runs the phase on every mutator context, while Phase.scheduleGlobal runs it once on the global plan object.

Between every phase of a collection, the collector threads rendezvous at a synchronization barrier. The actual execution of a collector’s phases is done in the method Phase.processPhaseStack. This method handles resuming a concurrent collection as well as running a full stop-the-world collection.
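
The per-phase rendezvous can be illustrated with a standard barrier. This is a sketch using java.util.concurrent, not MMTk's own rendezvous implementation: each collector thread does its share of a phase, then waits at the barrier so that no thread starts the next phase early.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Each of `threads` workers runs `phases` phases, rendezvousing between phases.
final class PhaseBarrierDemo {
  static int run(int threads, int phases) {
    CyclicBarrier barrier = new CyclicBarrier(threads);
    AtomicInteger work = new AtomicInteger();
    Thread[] ts = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      ts[i] = new Thread(() -> {
        try {
          for (int p = 0; p < phases; p++) {
            work.incrementAndGet();       // this thread's share of the phase
            barrier.await();              // rendezvous before the next phase
          }
        } catch (Exception e) { throw new RuntimeException(e); }
      });
      ts[i].start();
    }
    for (Thread t : ts) {
      try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
    }
    return work.get();                    // total units of phase work completed
  }
}
```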

The actual work of a collection phase is done (as mentioned above) in the collectionPhase method of the major Plan classes.


MS.java
@Inline
@Override
public void collectionPhase(short phaseId) {
  if (phaseId == PREPARE) {
    super.collectionPhase(phaseId);
    msTrace.prepare();
    msSpace.prepare(true);
    return;
  }
  if (phaseId == CLOSURE) {
    msTrace.prepare();
    return;
  }
  if (phaseId == RELEASE) {
    msTrace.release();
    msSpace.release();
    super.collectionPhase(phaseId);
    return;
  }
  super.collectionPhase(phaseId);
}

This excerpt shows how the global MS plan implements collectionPhase, illustrating the key phases of a simple stop-the-world collector. The prepare phase performs tasks such as changing the mark state, the closure phase performs a transitive closure over the heap (the mark phase of a mark-sweep algorithm) and the release phase performs any post-collection steps. Where possible, a plan is structured so that each layer of inheritance deals only with the objects it creates, i.e. the MS class operates on the msSpace and delegates work on all other spaces to the super-class where they are defined. By convention the PREPARE phase is performed outside-in (super-class preparation first) and RELEASE is done inside-out (local first, super-class second).
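
The outside-in/inside-out convention can be demonstrated with a two-level toy hierarchy. The class names below are illustrative: PREPARE delegates to the super-class before doing local work, while RELEASE does local work first.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the phase-ordering convention between a plan and its super-class.
class BasePlan {
  final List<String> log = new ArrayList<>();
  void collectionPhase(String phase) { log.add("base-" + phase); }
}

class MSPlan extends BasePlan {
  @Override void collectionPhase(String phase) {
    if (phase.equals("prepare")) {
      super.collectionPhase(phase);     // outside-in: outer spaces prepare first
      log.add("ms-prepare");
      return;
    }
    if (phase.equals("release")) {
      log.add("ms-release");            // inside-out: local space releases first
      super.collectionPhase(phase);
      return;
    }
    super.collectionPhase(phase);       // everything else: delegate upward
  }
}
```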

Tracing the heap

The main operation of a tracing collector is the transitive closure operation where all (or a subset) of the object graph is visited. Some collectors such as generational collectors perform these operations in more than one way, e.g. a nursery collection in a generational collector does not trace through pointers into the mature space, while a full-heap collection does. All MMTk collectors are designed to run using several parallel threads, using data structures that have unsynchronized thread-local and synchronized global components in the same way as MMTk’s policy classes.

MMTk’s trace operation uses the following terminology:

Each distinct transitive closure operation is defined as a subclass of TraceLocal. The closure is performed in the collectionPhase method of the plan-specific CollectorContext class


MSCollector.java
public void collectionPhase(short phaseId, boolean primary) {
  ...
  if (phaseId == MS.CLOSURE) {
    fullTrace.completeTrace();
    return;
  }
  ...
}

The initial starting point for the closure is computed by the STACK_ROOTS and ROOTS phases, which add root locations to a buffer by calling TraceLocal.reportDelayedRootEdge. The closure operation proceeds by invoking traceObject on each root location (in method processRootEdge), and then invoking scanObject on each heap object encountered. Note that the CLOSURE operation is performed multiple times in each GC, due to processing of reference types.
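
The mark-and-scan loop can be modeled over a toy object graph. This is purely illustrative, with objects as integer ids rather than heap addresses: traceObject marks each object at most once and queues it, and draining the worklist scans each object's fields until a fixed point is reached.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy transitive closure: heap maps each object id to its pointer fields.
final class ToyTrace {
  final Map<Integer, int[]> heap = new HashMap<>();
  final Set<Integer> marked = new HashSet<>();
  private final Deque<Integer> worklist = new ArrayDeque<>();

  void traceObject(int obj) {
    if (marked.add(obj)) worklist.add(obj);   // mark once, remember for scanning
  }

  void completeTrace() {                      // drain until a fixed point
    while (!worklist.isEmpty())
      for (int field : heap.getOrDefault(worklist.poll(), new int[0]))
        traceObject(field);                   // "scanObject" visits each field
  }
}
```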

16.2 Memory Allocation in Jikes RVM

The way that objects are allocated in Jikes RVM can be difficult to grasp for someone new to the code base. This document provides a detailed look at some of the paths through the JikesRVM - MMTk interface code to help bootstrap understanding of the process. The process and code illustrated below is current as of March 2011, svn revision 16052 (between JikesRVM 3.1.1 and 3.1.2).

16.2.1 Memory Manager Interface

The best starting place to understand the allocation sequence is in the class org.jikesrvm.mm.mminterface.MemoryManager, which is a facade class for the MMTk allocators. MMTk provides a variety of memory management plans which are designed to be independent of the actual language being implemented. The MemoryManager class orchestrates the services of MMTk to allocate memory, and adds the structure necessary to make the allocated memory into Java objects.

The method allocateScalar is where all scalar (i.e. non-array) objects are allocated. The parameters of this method specify the object to be allocated in sufficient detail that when this method is compiled by the opt compiler, all of the parameters are compile-time constants, allowing maximum optimization. Working through the body of the method,

Selected.Mutator mutator = Selected.Mutator.get();

As mentioned above, MMTk provides many different memory management plans, one of which is selected at build time. This call acquires a pointer to the thread-local per-mutator component of MMTk. Much of MMTk’s performance comes from providing unsynchronized thread-local data structures for the frequently used operations, so rather than provide a single interface object, it provides a per-thread interface object for both mutator and collector threads.

allocator = mutator.checkAllocator(org.jikesrvm.runtime.Memory.alignUp(size, MIN_ALIGNMENT), align, allocator);

An MMTk plan in general provides several spaces where objects can be allocated, each with their own characteristics. Jikes RVM is free to request allocation in any of these spaces, but sometimes there are constraints, known only at allocation time, that force MMTk to override Jikes RVM's request. For example, Jikes RVM may specify that objects allocated by a particular class are allocated in MMTk's non-moving space. At execution time, one such object may turn out to be too large for allocation in the general non-moving space provided by that particular plan, and so MMTk needs to promote the object to the Large Object Space (LOS), which is also non-moving, but has high space overheads. This call will generally compile down to 0 or a small handful of instructions.
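
A sketch of this per-allocation override might look as follows. The allocator ids and the size threshold here are made up for illustration: a request for the default space is redirected to the large object space when the size exceeds what the free-list space can handle.

```java
// Hypothetical sketch of checkAllocator-style promotion to the LOS.
final class AllocatorCheck {
  static final int ALLOC_DEFAULT = 0, ALLOC_LOS = 1;
  static final int MAX_FREELIST_BYTES = 8 * 1024;   // illustrative limit

  static int checkAllocator(int bytes, int allocator) {
    if (allocator == ALLOC_DEFAULT && bytes > MAX_FREELIST_BYTES)
      return ALLOC_LOS;                             // promote oversized request
    return allocator;                               // otherwise honor the request
  }
}
```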

Address region = allocateSpace(mutator, size, align, offset, allocator, site);

This calls a method of MemoryManager, common to all allocation methods (for Arrays and other special objects), that calls

Address region = mutator.alloc(bytes, align, offset, allocator, site);

to actually allocate memory from the current MMTk plan.

Object result = ObjectModel.initializeScalar(region, tib, size);

Now we call the Jikes RVM object model to initialize the allocated region as a scalar object, and then

mutator.postAlloc(ObjectReference.fromObject(result), ObjectReference.fromObject(tib), size, allocator);

we call MMTk’s postAlloc method to perform initialization that can only be performed after an object has been initialized by the virtual machine.
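
The steps above can be condensed into a toy pipeline. All names here are hypothetical stand-ins for the real MemoryManager, ObjectModel and Mutator methods; the point is only the fixed order: allocate raw space, let the VM's object model lay the object out, then give the memory manager its post-initialization hook.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the allocateScalar sequence: alloc -> initialize -> postAlloc.
final class AllocSequence {
  static final List<String> trace = new ArrayList<>();

  static long alloc(int bytes)           { trace.add("alloc");      return 0x1000; }
  static Object initializeScalar(long r) { trace.add("initialize"); return new Object(); }
  static void postAlloc(Object o)        { trace.add("postAlloc"); }

  static Object allocateScalar(int bytes) {
    long region = alloc(bytes);               // raw memory from the plan
    Object result = initializeScalar(region); // VM object model lays it out
    postAlloc(result);                        // memory manager's final hook
    return result;
  }
}
```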

16.2.2 Compiler integration

The allocateScalar method discussed above is only actually called from one place, the method resolvedNewScalar(int ...) in the class org.jikesrvm.runtime.RuntimeEntrypoints. This class provides methods that are accessed directly by the compilers, via fields in the org.jikesrvm.runtime.Entrypoints class. The ’resolved’ part of the method name indicates that the class of object being allocated is resolved at compile time (recall that the Java Language Spec requires that classes are only loaded, resolved etc when they are needed - sometimes it’s necessary to compile code that performs classloading and then allocate the object).

RuntimeEntrypoints also contains an overload, resolvedNewScalar(RVMClass), that is used by the reflection API to allocate objects. It’s instructive to look at this method, as it performs essentially the same operations as the compiler when compiling the call to resolvedNewScalar(int...).

Working backwards from this point requires delving into the individual compilers.

Baseline Compiler

There is a different baseline compiler for each architecture. The relevant code in the baseline compiler for the ia32 architecture is in the class org.jikesrvm.compilers.baseline.ia32.BaselineCompilerImpl. The method emit_resolved_new(RVMClass) is responsible for generating code to execute the new bytecode when the target class is already resolved. Looking at this method, you can see it does essentially what the resolvedNewScalar(RVMClass) method in RuntimeEntrypoints does, then generates machine code to perform the call to the resolvedNewScalar entrypoint. Note how the work of calculating the size, alignment etc of the object is performed by the compiler, at compile time.

Similar code exists in the PPC baseline compiler.

Optimizing Compiler

The optimizing compiler is paradoxically somewhat simpler than the baseline compiler, in that injection of the call to the entrypoint is done in an architecture independent level of compiler IR. (An overview of the Jikes RVM optimizing compiler can be found in the paper The Jalapeño Dynamic Optimizing Compiler for Java).

In HIR (the high-level Intermediate Representation), allocation is expressed as a ’new’ opcode. During the translation from HIR to LIR (Low-level IR), this and other opcodes are translated into instructions by the class org.jikesrvm.compilers.opt.hir2lir.ExpandRuntimeServices. The method perform(IR) performs this translation, selecting particular operations via a large switch statement. The NEW_opcode case performs the task we’re interested in, doing essentially the same job as the baseline compiler, but generating IR rather than machine instructions. The compiler generates a ’call’ operation, and then (if the compilation policy decides it’s required) inlines it.

At this point in code generation, all the methods called by RuntimeEntrypoints.resolvedNewScalar(int...) which are annotated @Inline are also inlined into the current method. This inlining extends through to the MMTk code so that the allocation sequence can be optimized down to a handful of instructions.

It can be instructive to look at the various levels of IR generated for object allocation using a simple test program and the OptTestHarness utility described elsewhere in the user guide.

16.3 Scanning Objects in Jikes RVM

One of the services that MMTk expects a virtual machine to perform on its behalf is the scanning of objects, i.e. identifying and processing the pointer fields of the live objects it encounters during collection. In principle the implementation of this interface is simple, but there are two moderately complex optimizations layered on top of this.

From MMTk’s point of view, each time an object requires scanning it passes it to the VM, along with a TransitiveClosure object. The VM is expected to identify the pointers and invoke the processEdge method on each of the pointer fields in the object. The rationale for the current object scanning scheme is presented in this paper.

16.3.1 JikesRVM to MMTk Interface

MMTk requires its host virtual machine to provide an implementation of the class org.mmtk.vm.Scanning as its interface to scanning objects. Jikes RVM’s implementation of this class is found under the source tree MMTk/ext/vm/jikesrvm, in the class org.jikesrvm.mm.mmtk.Scanning. The methods we are interested in are scanObject(TransitiveClosure, ObjectReference) and specializedScanObject(int, TransitiveClosure, ObjectReference).

In MMTk, each plan defines one or more TransitiveClosure operations. Simple full-heap collectors like MarkSweep only define one TransitiveClosure, but complex plans like GenImmix or the RefCount plans define several. MMTk allows the plan to request specialized scanning on a closure-by-closure basis: closures that are specialized call specializedScanObject, while unspecialized ones call scanObject. Specialization is covered in more detail below.

In the absence of hand-inlined scanning, or if specialization is globally disabled, scanning reverts to the fallback method in org.jikesrvm.mm.mminterface.SpecializedScanMethod. This method can be regarded as the basic underlying mechanism, and is worth understanding in detail.

RVMType type = ObjectModel.getObjectType(objectRef.toObject());
int[] offsets = type.getReferenceOffsets();

This code fetches the array of offsets that Jikes RVM uses to identify the pointer fields in the object. This array is constructed by the classloader when a class is resolved.

if (offsets != REFARRAY_OFFSET_ARRAY) { 
  for(int i=0; i < offsets.length; i++) { 
    trace.processEdge(objectRef, objectRef.toAddress().plus(offsets[i])); 
  }

One distinguished value (actually null) is used to identify arrays of reference objects, and this block of code scans scalar objects by tracing each of the fields at the offsets given by the offset array.

} else { 
   for(int i=0; i < ObjectModel.getArrayLength(objectRef.toObject()); i++) { 
    trace.processEdge(objectRef, objectRef.toAddress().plus(i << LOG_BYTES_IN_ADDRESS)); 
  } 
}

The other case is reference arrays, for which we fetch the array length and scan each of the elements.

The internals of trace.processEdge vary by collector and by collection type (e.g. nursery/full-heap in a generational collector), and the details need not concern us here.
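
The two cases above can be modeled together in a self-contained sketch. This is illustrative only, using a null offset array to mark a reference array (as the distinguished REFARRAY_OFFSET_ARRAY value does) and plain longs in place of Address:

```java
import java.util.ArrayList;
import java.util.List;

// Toy scanning dispatch: null offsets => reference array, else scalar fields.
final class ToyScanner {
  static List<Long> scan(long objAddr, int[] offsets, int arrayLength, int bytesInAddress) {
    List<Long> edges = new ArrayList<>();
    if (offsets != null) {
      for (int off : offsets)
        edges.add(objAddr + off);                      // one edge per scalar field
    } else {
      for (int i = 0; i < arrayLength; i++)
        edges.add(objAddr + (long) i * bytesInAddress); // one edge per array element
    }
    return edges;                                       // where processEdge would be called
  }
}
```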

16.3.2 Hand Inlining

Hand inlining was introduced in February 2011, and uses a cute technique to encode 3 bits of metadata into the TIB pointer in an object’s header. The 7 most frequent object patterns are encoded into these bits, and then special-case code is written for each of them.
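
The underlying bit trick is easy to demonstrate. The sketch below is not Jikes RVM's actual encoding, just the general idea: with at least 8-byte alignment, the low 3 bits of the TIB pointer are always zero and are therefore free to carry a scanning-pattern code.

```java
// Sketch: stash 3 bits of metadata in the low bits of an 8-byte-aligned pointer.
final class TibBits {
  static long encode(long alignedTib, int pattern) {
    return alignedTib | (pattern & 0x7);   // pack the 3-bit pattern code
  }
  static long tib(long word)     { return word & ~0x7L; }      // strip metadata
  static int  pattern(long word) { return (int) (word & 0x7); } // recover metadata
}
```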

Hand inlining produces an average-case speedup slightly better than specialization, but performs poorly on some benchmarks. This is why we use it in combination with specialization.

16.3.3 Specialized Scanning

Specialized Scanning was introduced in September 2007. It speeds up GC by removing the process of fetching and interpreting the offset array that describes each object, by jumping directly to a hard-coded method for scanning objects with a particular pattern.

The departure point from "standard" Java into the specialized scanning method is SpecializedScanMethod.invoke(...), which looks like this

@SpecializedMethodInvoke
@NoInline
public static void invoke(int id, Object object, TransitiveClosure trace) {
  // By default we call a non-specialized fallback
  fallback(object, trace);
}

The @SpecializedMethodInvoke annotation signals to the compiler that it should dispatch to one of the specialized method slots in the TIB.

Creation of specialized methods is handled by the class org.jikesrvm.classloader.SpecializedMethodManager.

16.4 Using GCSpy

16.4.1 The GCspy Heap Visualisation Framework

GCspy is a visualisation framework that allows developers to observe the behaviour of the heap and related data structures. For details of the GCspy model, see GCspy: An adaptable heap visualisation framework by Tony Printezis and Richard Jones, OOPSLA'02. The framework comprises two components that communicate across a socket: a client and a server incorporated into the virtual machine of the system being visualised. The client is usually a visualiser (written in Java) but the framework also provides other tools (for example, to store traces in a compressed file). The GCspy server implementation for Jikes RVM was contributed by Richard Jones of the University of Kent.

GCspy is designed to be independent of the target system. Instead, it requires the GC developer to describe their system in terms of four GCspy abstractions: spaces, streams, tiles and events. This description is transmitted to the visualiser when it connects to the server.

A space is an abstraction of a component of the system; it may represent a memory region, a free-list, a remembered-set or whatever. Each space is divided into a number of blocks which are represented by the visualiser as tiles. Each space will have a number of attributes – streams – such as the amount of space used, the number of objects it contains, the length of a free-list and so on.

In order to instrument a Jikes RVM collector with GCspy:

  1. Provide a startGCspyServer method in that collector’s plan. That method initialises the GCspy server with the port on which to communicate and a list of event names, instantiates drivers for each space, and then starts the server.
  2. Gather data from each space for the tiles of each stream (e.g. before, during and after each collection).
  3. Provide a driver for each space.

Space drivers handle communication between collectors and the GCspy infrastructure by mapping information collected by the memory manager to the space’s streams. A typical space driver will:

  1. Create a GCspy space.
  2. Create a stream for each attribute of the space.
  3. Update the tile statistics as the memory manager passes it information.
  4. Send the tile data along with any summary or control information to the visualiser.
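
The driver steps above can be sketched with a minimal model. The API below is hypothetical: the driver maps addresses to tiles, accumulates a single "used space" stream as the memory manager passes it objects, and then "transmits" one value per tile.

```java
import java.util.Arrays;

// Toy space driver: one stream (used bytes) over fixed-size tiles.
final class ToyDriver {
  final int tileSize;
  final int[] usedBytes;                 // the stream: one datum per tile

  ToyDriver(int spaceBytes, int tileSize) {
    this.tileSize = tileSize;
    this.usedBytes = new int[(spaceBytes + tileSize - 1) / tileSize];
  }

  void object(int addr, int bytes) {     // called as the manager scans the space
    usedBytes[addr / tileSize] += bytes; // attribute the object to its tile
  }

  int[] transmit() {                     // would be sent to the visualiser
    return usedBytes.clone();
  }
}
```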

The Jikes RVM SSGCspy plan gives an example of how to instrument a collector. It provides GCspy spaces, streams and drivers for the semi-spaces, the immortal space and the large object space, and also illustrates how performance may be traded for the gathering of more detailed information.

16.4.2 Installation of GCspy with Jikes RVM

Building GCSpy

The GCspy client code makes use of the Java Advanced Imaging (JAI) API. The build system will attempt to download and install the JAI component when required but this is only supported on the ia32-linux platform. The build system will also attempt to download and install the GCSpy server when required.

Building Jikes RVM to use GCspy

To build Jikes RVM with GCspy support, the configuration parameter config.include.gcspy must be set to true, as in the BaseBaseSemiSpaceGCspy configuration. You can also have the Jikes RVM build process create a script to start the GCspy client tool if GCspy was built with support for the client component; to achieve this, the configuration parameter config.include.gcspy-client must also be set to true.

The following steps build the Jikes RVM with support for GCspy on the ia32-linux platform.

$ cd $RVM_ROOT 
$ ant -Dhost.name=ia32-linux -Dconfig.name=BaseBaseSemiSpaceGCspy -Dconfig.include.gcspy-client=1

It is also possible to build the Jikes RVM with GCSpy support but link it against a fake stub implementation rather than the real GCSpy implementation. This is achieved by setting the configuration parameter config.include.gcspy-stub to true. This is used in the nightly testing process.

Running Jikes RVM with GCspy

To start Jikes RVM with GCSpy enabled you need to specify the port the GCSpy server will listen on.

$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux 
$ ./rvm -Xms20m -X:gc:gcspyPort=3000 -X:gc:gcspyWait=true &

Then you need to start the GCspy visualiser client.

$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux 
$ ./tools/gcspy/gcspy

After this you can specify the port and host to connect to (e.g. localhost:3000) and click the "Connect" button in the bottom right-hand corner of the visualiser.

16.4.3 Command line arguments

Additional GCspy-related arguments to the rvm command include -X:gc:gcspyPort=<port>, the port on which the GCspy server listens (0, the default, disables GCspy), and -X:gc:gcspyWait=<true|false>, which determines whether the VM waits for a visualiser to connect before proceeding.

16.4.4 Writing GCspy drivers

To instrument a new collector with GCspy, you will probably want to subclass your collector and to write new drivers for it. The following sections explain the modifications you need to make and how to write a driver. You may use org.mmtk.plan.semispace.gcspy and its drivers as an example.

The recommended way to instrument a Jikes RVM collector with GCspy is to create a gcspy subdirectory in the directory of the collector being instrumented, e.g. MMTk/src/org/mmtk/plan/semispace/gcspy. In that directory, we need five classes:

SSGCspy is the plan for the instrumented collector. It is a subclass of SS.

SSGCspyConstraints extends SSConstraints to provide methods boolean needsLinearScan() and boolean withGCspy(), both of which return true.

SSGCspyTraceLocal extends SSTraceLocal to override methods traceObject and willNotMove to ensure that tracing deals properly with GCspy objects: the GCspyTraceLocal file will be similar for any instrumented collector.

The instrumented collector, SSGCspyCollector, extends SSCollector. It needs to override collectionPhase.

Similarly, SSGCspyMutator extends SSMutator and must also override its parent’s methods collectionPhase, to allow the allocators to collect data; and its alloc and postAlloc methods to allocate GCspy objects in GCspy’s heap space.

The Plan

SSGCspy.startGCspyServer is called immediately before the "main" method is loaded and run. It initialises the GCspy server with the port on which to communicate, adds event names, instantiates a driver for each space, and then starts the server, forcing the VM to wait for a GCspy client to connect if necessary.
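This start-up sequence might be sketched as follows; ServerStub and DriverStub are stand-in stubs for illustration, not the real GCspy ServerInterpreter and driver types, and the event and space names are hypothetical:

```java
// Stand-in for the GCspy server interpreter
class ServerStub {
  final int port;
  final java.util.List<String> events = new java.util.ArrayList<>();
  boolean started, waitedForClient;
  ServerStub(int port) { this.port = port; }
  void addEvent(String name) { events.add(name); }
  void start(boolean waitForClient) { started = true; waitedForClient = waitForClient; }
}

// Stand-in for a GCspy space driver, registered against the server
class DriverStub {
  final String spaceName;
  DriverStub(ServerStub server, String spaceName) { this.spaceName = spaceName; }
}

class GCspyPlanSketch {
  static ServerStub startGCspyServer(int port, boolean wait) {
    ServerStub server = new ServerStub(port);   // communicate on the given port
    server.addEvent("Before GC");               // add event names
    server.addEvent("Semispace copied");
    server.addEvent("After GC");
    new DriverStub(server, "Semispace");        // one driver per space
    new DriverStub(server, "Immortal");
    server.start(wait);                         // start, optionally blocking
    return server;
  }
}
```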

Drivers extend AbstractDriver and register their space with the ServerInterpreter. In addition to the server, drivers will take as arguments the name of the space, the MMTk space, the tilesize, and whether this space is to be the main space in the visualiser.

The Collector and Mutator

Instrumenters will typically want to add data collection points before, during and after a collection by overriding collectionPhase in SSGCspyCollector and SSGCspyMutator.

SSGCspyCollector deals with data in the semi-spaces that the collector itself has allocated there (i.e. objects it has copied). It only does real work at the end of the collector's last tracing phase, FORWARD_FINALIZABLE.

SSGCspyMutator is more complex: as well as gathering data for objects that it allocated in From-space at the start of the PREPARE_MUTATOR phase, it also deals with the immortal and large object spaces.

At a collection point, the collector or mutator will typically

  1. Return if the GCspy port number is 0 (as no client can be connected).
  2. Check whether the server is connected at this event. If so, the compensation timer (which discounts the time taken by GCspy to gather the data) should be started before gathering data and stopped after it.
  3. After gathering the data, have each driver call its transmit method.
  4. SSGCspyCollector does not call the GCspy server’s serverSafepoint method, as the collector phase is usually followed by a mutator phase. Instead, serverSafepoint can be called by SSGCspyMutator to indicate that this is a point at which the server can pause, play one event, etc.
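The four steps above might be sketched as follows; GcspyServer, Driver and CollectionPoint are simplified stand-ins for illustration, not the real MMTk/GCspy classes:

```java
// Stand-in for the GCspy server
class GcspyServer {
  final int port;
  boolean connected;
  GcspyServer(int port) { this.port = port; }
  boolean isConnected(int event) { return connected; }
  void startCompensationTimer() { /* discount GCspy's own overhead */ }
  void stopCompensationTimer()  { }
}

// Stand-in for a space driver; scan() accumulates per-tile statistics
class Driver {
  int gathered;
  void scan(long objectAddress) { gathered++; }
  void transmit(int event) { /* send summary, control and stream data */ }
}

class CollectionPoint {
  // returns true if data was gathered and transmitted at this event
  static boolean gather(GcspyServer server, Driver driver, int event, long[] objects) {
    if (server.port == 0) return false;            // step 1: no client can connect
    if (!server.isConnected(event)) return false;  // step 2: nobody listening
    server.startCompensationTimer();
    for (long obj : objects) driver.scan(obj);     // gather the data
    server.stopCompensationTimer();
    driver.transmit(event);                        // step 3: send it
    // step 4: a mutator (not a collector) would call serverSafepoint here
    return true;
  }
}
```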

Gathering data will vary from MMTk space to space. It will typically be necessary to resize a space before gathering data. For a space,

The Driver

GCspy space drivers extend AbstractDriver. This class creates a new GCspy ServerSpace and initializes the control values for each tile in the space. Control values indicate whether a tile is used, unused, a background, a separator or a link. The constructor for a typical space driver will:

  1. Create a GCspy Stream for each attribute of a space.
  2. Initialise the tile statistics in each stream.

Some drivers may also create a LinearScan object to handle call-backs from the VM as it sweeps the heap (see above).

The chief roles of a driver are to accumulate tile statistics, and to transmit the summary and control data and the data for all of their streams. Their data gathering interface is the scan method (to which an object reference or address is passed).

When the collector or mutator has finished gathering data, it calls the transmit method of the driver for each space that needs to send its data. Streams may send values of type byte, short or int, implemented through classes ByteStream, ShortStream or IntStream. A driver’s transmit method will typically:

  1. Determine whether a GCspy client is connected and interested in this event, e.g. server.isConnected(event)
  2. Set up the summaries for each stream, e.g. stream.setSummary(values...);
  3. Set up the control information for each tile, e.g.
    controlValues(CONTROL_USED, start, numBlocks); 
    controlValues(CONTROL_UNUSED, end, remainingBlocks);
  4. Set up the space information, e.g. setSpace(info);
  5. Send the data for all streams, e.g. send(event, numTiles);

Note that AbstractDriver.send takes care of sending the information for all streams (including control data).
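As a rough model of this flow, the sketch below replays the constructor and transmit steps over plain arrays; TileStream and SchematicDriver are illustrative stand-ins, not the real GCspy stream and driver classes:

```java
// Stand-in for a per-attribute stream: one value per tile, plus a summary
class TileStream {
  final int[] tiles;
  int summary;
  TileStream(int numTiles) { tiles = new int[numTiles]; }
  void setSummary(int total) { summary = total; }
}

class SchematicDriver {
  static final byte CONTROL_USED = 1, CONTROL_UNUSED = 2;
  final byte[] control;     // one control value per tile
  final TileStream used;    // one stream per attribute of the space

  SchematicDriver(int numTiles) {
    control = new byte[numTiles];   // constructor step 2: initialise tile stats
    used = new TileStream(numTiles);
  }

  // mark a run of tiles with the given control value
  void controlValues(byte tag, int start, int count) {
    for (int i = start; i < start + count; i++) control[i] = tag;
  }

  // transmit: summarise, set control info, then the data would be sent
  int transmit(int numBlocks) {
    int total = 0;
    for (int i = 0; i < numBlocks; i++) total += used.tiles[i];
    used.setSummary(total);                                      // summaries
    controlValues(CONTROL_USED, 0, numBlocks);                   // control info
    controlValues(CONTROL_UNUSED, numBlocks, control.length - numBlocks);
    return total;   // the real driver would now call send(event, numTiles)
  }
}
```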

Subspaces

Subspace provides a useful abstraction of a contiguous region of a heap, recording its start and end address, the index of its first block, the size of blocks in this space and the number of blocks in the region. In particular, Subspace provides methods to: