Chapter 6 - Java for Beginners Course

Garbage Collection

Java is a language that provides automatic memory management, which in practical terms means it will perform garbage collection for us.

This concept ties into the life-cycle of objects in our Java application. From Chapter 1, we know that when an object is created, e.g. new Airplane(), a portion of our memory will be allocated to store that new object.

Assume that our Java application has been running for a while. Chances are that many objects will have been created during the lifetime of our application, and there is only so much memory we can allocate.

In Java, the memory space where objects are allocated is called the heap, and it has a defined size. When we start a Java application, we can change the initial and the maximum size of the heap. We can’t go above that maximum, so how do we remove objects we don’t need anymore?
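To make the heap limits a bit more concrete, here is a minimal sketch that prints them from inside a running program using the standard Runtime class. The class name HeapInfo and the helper toMiB are our own, chosen for illustration only. The -Xms and -Xmx options mentioned in the comments are the usual JVM flags for the initial and maximum heap size.

```java
// A minimal sketch (HeapInfo and toMiB are hypothetical names).
// Try running it with different heap settings, for example:
//   java -Xms64m -Xmx256m HeapInfo.java
class HeapInfo {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();

        long max = runtime.maxMemory();     // upper limit the JVM will try to use
        long total = runtime.totalMemory(); // memory currently reserved by the JVM
        long free = runtime.freeMemory();   // unused part of the reserved memory

        System.out.println("Max heap:      " + toMiB(max) + " MiB");
        System.out.println("Reserved heap: " + toMiB(total) + " MiB");
        System.out.println("Free heap:     " + toMiB(free) + " MiB");
    }
}
```

The exact numbers depend on your machine and JVM, so no sample output is shown here.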

Object de-allocation

In Java, object de-allocation, the process of freeing the memory space occupied by an object, is done automatically for us. Java does this by performing garbage collection on our heap.

The garbage collection process, at a very high level, identifies which objects are no longer needed and removes them from memory.

Once an object has been de-allocated, the memory space is freed for later use by our Java application.

Garbage collection and memory management could be a course on their own, so this section only gives an introduction. The details of how the heap and garbage collection work depend on the garbage collector that has been configured, as Java offers several garbage collector implementations.

Some other languages also offer automatic memory management; others do not. In languages without automatic memory management, the developer is responsible for freeing unused objects/memory.

What objects are considered unused?

In general, objects that have no object references pointing to them are considered unused by the JVM. In reality, this is more complex than the explanation we just gave, but this is the general idea.
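As a rough illustration of this idea, the sketch below creates two objects and then drops some of the references to them. The Airplane class with a name field and the ReachabilitySketch class are assumptions made for this illustration only. The comments mark the point at which each object would become a candidate for garbage collection.

```java
// A minimal sketch; Airplane's name field and ReachabilitySketch are hypothetical.
class Airplane {
    final String name;

    Airplane(String name) {
        this.name = name;
    }
}

class ReachabilitySketch {

    static Airplane keepOnlySecond() {
        Airplane first = new Airplane("first");
        Airplane second = new Airplane("second");
        Airplane alias = second; // two references now point to the same object

        first = null;  // nothing points to the "first" airplane: eligible for collection
        second = null; // the "second" airplane is still reachable through alias

        return alias;
    }

    public static void main(String[] args) {
        Airplane survivor = keepOnlySecond();
        System.out.println("Still reachable: " + survivor.name);
    }
}
```

Note that becoming eligible doesn't mean the object is removed right away. The JVM decides when, and whether, a collection actually runs.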

For the purpose of this course, we won’t go into full details, but we will give an example of an object that should be garbage collected below to help clarify this.

Garbage Collection roots (GC roots) are out of the scope of this course, but we invite the reader to find more about this concept if it’s of interest.

finalize method

One of the methods inherited from the Object class is the finalize method, which the JVM may invoke before an object is garbage collected.

Its signature is:

protected void finalize() throws Throwable

We haven’t introduced the throws keyword yet; it will be covered in a future chapter.

You can use this method to ensure any resources used by the object are freed or to do general clean up of the object.

One caveat is that you shouldn’t rely on this method being invoked, as there is no guarantee it will be called. For example, if your application is forcefully terminated by the operating system, these methods won’t be invoked. Note also that finalize has been deprecated since Java 9, so newer code should avoid depending on it.

The default implementation of this method in the Object class is empty: it doesn’t perform any operation.
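Because finalize may never run, cleanup that must happen is usually done explicitly instead. One common way to do that in Java is the AutoCloseable interface together with a try-with-resources block, neither of which has been introduced in this course yet. The sketch below is only an illustration of that alternative; the TrackedResource and CleanupSketch classes are hypothetical.

```java
// A minimal sketch of explicit cleanup (all class names are hypothetical).
class TrackedResource implements AutoCloseable {
    private final String name;
    private boolean closed;

    TrackedResource(String name) {
        this.name = name;
    }

    boolean isClosed() {
        return closed;
    }

    // Called automatically when the try-with-resources block ends.
    @Override
    public void close() {
        closed = true;
        System.out.println(name + " closed");
    }
}

class CleanupSketch {

    static TrackedResource useAndRelease() {
        TrackedResource resource = new TrackedResource("demo");
        try (TrackedResource r = resource) {
            System.out.println("working with the resource");
        }
        // At this point close() has already run, with no help from the garbage collector.
        return resource;
    }

    public static void main(String[] args) {
        TrackedResource resource = useAndRelease();
        System.out.println("Closed: " + resource.isClosed());
    }
}
```

Unlike finalize, the close method here runs at a known point in the program, namely when the try block is left.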

Example of Garbage Collection

For the purpose of this example, we’ll focus on the creation and destruction of a SimpleObject class that defines a finalize method:

public class SimpleObject {
    private int id;

    public SimpleObject(int id) {
        this.id = id;
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Object " + this + " finalized!");
    }

    @Override
    public String toString() {
        return "SimpleObject [id=" + id + "]";
    }
}

We’ll be creating several objects of this type and assigning each object an id. For this, we’ll use a GarbageProducer class:

public class GarbageProducer {
    private SimpleObject lastObject;

    public void produceObjects(int numberOfObjects) {
        for (int i = 0; i < numberOfObjects; i++) {
            lastObject = new SimpleObject(i);
        }
    }

    public SimpleObject getLastObject() {
        return lastObject;
    }
}

This class exposes a produceObjects method that receives as input the numberOfObjects we want to create. We use a for loop to create the objects.

It also keeps track of the lastObject that was created inside the for loop.

Now, let’s give this a try and see what happens behind the scenes:

In the example code, run the JavaGarbageCollectionApp:

GarbageProducer producer = new GarbageProducer();
producer.produceObjects(2);
System.out.println("Last object created was: " + producer.getLastObject());

Output:

Last object created was: SimpleObject [id=1]

In this case, we’re asking our GarbageProducer to produce 2 objects, which means our for loop will do 2 iterations:

  1. On the first iteration, with i = 0, a SimpleObject with id = 0 will be created. The lastObject reference will be pointing to that object. At this stage, this object with id = 0 can’t be garbage collected as it is still in use.

  2. On the second iteration, with i = 1, a SimpleObject with id = 1 is created. We change our lastObject reference to point to this new object with id = 1 instead of the object with id = 0. At this stage, our first object with id = 0 is now a candidate for garbage collection as it has nothing pointing to it.

Why wasn’t the finalize method called for the first object then?

Chances are you will get the same output shown above.

This was a very small example (we only asked our garbage producer to create 2 objects), and with so few objects it is unlikely that the JVM will need to perform garbage collection. Our example application most likely completed before any garbage collection was required.

Can we force it for the purposes of the example?

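Strictly speaking, we can’t force the JVM to collect garbage, but for example purposes we can make a collection very likely by creating far more objects. Of course, this isn’t a realistic example. To achieve this, let’s change the number of objects to be created to 10_000_000 instead of 2, like so: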

GarbageProducer producer = new GarbageProducer();
producer.produceObjects(10_000_000);
System.out.println("Last object created was: " + producer.getLastObject());

Output (your output may very well vary):

//...
Object SimpleObject [id=6151778] finalized!
Last object created was: SimpleObject [id=9999999]
Object SimpleObject [id=6151779] finalized!
//...

We know that only the last object created in the for loop will have an object reference. For example, by the time we’ve created 1000 objects, only the object with id=999 will have an object reference pointing to it (lastObject); the other 999 objects will be eligible for garbage collection, since they are no longer used and have no object reference pointing to them.

The output you get will surely be different to the one we got (we are only showing a few sample lines above). Some of the reasons you might get a different output:

  • Depending on the maximum size of your Java heap, the JVM might not need to trigger garbage collection processes.

  • If your heap size is too small, you’ll get an OutOfMemoryError in your output and the application will terminate.

  • The order in which objects are finalized isn’t guaranteed.

The only things we can guarantee from the example above are that:

  • If the application terminates successfully, at some point in our output we’ll have the following line: Last object created was: SimpleObject [id=9999999]. It is the only object that will not be eligible for Garbage Collection and is the lastObject created by our GarbageProducer.

  • Or, the application will fail due to an OutOfMemoryError.

The order and number of finalized objects aren’t deterministic.

Try changing the number of objects that are created and see if you can reproduce the different scenarios mentioned above.
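While experimenting, you may also come across System.gc(). It asks the JVM to run the garbage collector, but it is only a request and gives no guarantee about what gets collected or when. The sketch below, with the hypothetical class GcRequestSketch and helper methods of our own, creates some short-lived objects, makes the request and prints the used heap before and after.

```java
// A minimal sketch (GcRequestSketch and its helpers are hypothetical names).
class GcRequestSketch {

    // Approximate number of heap bytes currently in use.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    // Creates objects that nothing refers to once each iteration ends.
    static void createGarbage(int count) {
        for (int i = 0; i < count; i++) {
            byte[] scratch = new byte[1024];
        }
    }

    public static void main(String[] args) {
        createGarbage(100_000);

        long before = usedHeapBytes();
        System.gc(); // a request only; the JVM decides whether and when to act on it
        long after = usedHeapBytes();

        System.out.println("Used heap before the request: " + before + " bytes");
        System.out.println("Used heap after the request:  " + after + " bytes");
    }
}
```

The numbers will differ from run to run, and the second value is not guaranteed to be smaller than the first.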

Memory leaks, what are they?

The fact that Java performs garbage collection for us doesn’t prevent our applications from suffering memory leaks. This is important to understand, as it is an issue that many applications can run into, regardless of the language they are written in.

Memory leaks, in their basic form, occur when objects that are not needed any longer in the application aren’t destroyed/cleaned up, and the number of these unnecessary objects increases as our application runs. Eventually, our application will run out of memory and will stop or become unresponsive.

In the case of Java, let’s take the example above. Assume that instead of discarding the references to previously created SimpleObject instances, we kept object references for all of the 10 million objects we created in the for loop.

As all of these objects have references pointing to them, they wouldn’t be eligible for garbage collection, and, depending on the maximum size of our heap we could eventually run out of memory.
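The scenario just described can be sketched as follows. This is an illustration under our own assumptions rather than part of the example code: LeakyProducer is a hypothetical variant of GarbageProducer that stores every object in an array, and plain Object instances stand in for SimpleObject to keep the sketch short.

```java
// A minimal sketch (LeakyProducer is a hypothetical class).
class LeakyProducer {
    // Every created object is stored here, so none of them becomes unreachable.
    private final Object[] retained;
    private int count;

    LeakyProducer(int capacity) {
        retained = new Object[capacity];
    }

    void produceObjects(int numberOfObjects) {
        for (int i = 0; i < numberOfObjects && count < retained.length; i++) {
            retained[count] = new Object(); // the reference is kept, not overwritten
            count++;
        }
    }

    int retainedCount() {
        return count;
    }

    public static void main(String[] args) {
        LeakyProducer producer = new LeakyProducer(1_000);
        producer.produceObjects(1_000);
        System.out.println("Objects still referenced: " + producer.retainedCount());
    }
}
```

Running it prints "Objects still referenced: 1000". As long as the producer itself is reachable, none of those objects can be garbage collected. With 10 million objects and a small enough heap, this is the kind of code that could end in an OutOfMemoryError.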

In the next chapter, we’ll introduce data structures which will allow us to manage variable numbers of objects in memory.