JAVA: ArrayList and ensureCapacity()

📢 Check out my latest blog post on optimizing Java ArrayList performance with ensureCapacity() method! 💪🚀 Learn how to efficiently manage array resizing and improve your application's speed.

JAVA: ArrayList and ensureCapacity()

This post is inspired by another blog post regarding memory preallocation when using std::vector in C++.

C++: std::vector and reserve()
📢 In this article, you’ll learn how to improve the performance of your C++ programs by utilizing the reserve method in std::vector. Discover the benefits of reserving memory upfront and avoiding unnecessary reallocations. 💡🖥️

We already know that preallocating memory in std::vector positively impacts execution time. This time, we would like to find out whether the same effect can be achieved when using ArrayList in JAVA. Its interface exposes the ensureCapacity() method that reserves upfront a requested amount of memory. Before digging into benchmarks, let's see the growth pattern when not preallocating the memory.

ArrayList's growth

Determining the capacity of underlying memory isn't as straightforward as in the case of std::vector. The major issue here is ArrayList doesn't expose any method that would provide that insight for us. We have to find the other way around. First of all, we have to determine where elements are stored. Therefore, we can have a look at ArrayList's add() method as a starting point.

public boolean add(E o) {
  ensureCapacity(size + 1);  // Increments modCount!!
  elementData[size++] = o;
  return true;
}

Add() method not only inserts passed object E into elementData, but also ensures it has enough capacity to hold all the data. Nevertheless, it is elementData we are interested in.

private transient E[] elementData;

Looking at the elementData declaration, we already know that it is a plain array. Next, we have to find a way to extract the necessary information from it. Thankfully JAVA has reflection.

Using Java Reflection
Reflection is a feature in the JAVA language that allows an executing program to examine or "introspect" upon itself, and manipulate internal properties of the program. For instance, it's possible for a JAVA class to obtain the names of all its members and display them.
The ability to examine and manipulate a JAVA class from within itself may not sound like very much, but in other programming languages this feature is nonexistent. For example, there is no way in a Pascal, C, or C++ program to obtain information about the functions defined within that program.

Let's define a helper class that will provide us some insights into ArrayList.

package dev.slys.example;
  
import java.lang.reflect.*;
import java.util.*;

public final class ArrayListHelper {
   
    private ArrayListHelper() {}
    private static final String ELEMENT_DATA_NAME = "elementData";
    private static Field ELEMENT_DATA_FIELD;
    static {
        try {
            ELEMENT_DATA_FIELD = ArrayList.class.getDeclaredField(ELEMENT_DATA_NAME);
            ELEMENT_DATA_FIELD.setAccessible(true);
        } catch (Exception ignore) {
        }
    }

    @SuppressWarnings("unchecked")
    public static <E> int capacity(ArrayList<E> arrayList) {
        try {
            final E[] elementData = (E[]) ELEMENT_DATA_FIELD.get(arrayList);
            return elementData.length;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }

    }
}

The helper binds to a Field named elementData, then it obtains a reference to the bounded field. Once we have it, we are able to check the length to find out the elementData's current size.

Now let's find the growth pattern. For that reason, we will write a small loop to insert elements into ArrayList.

package dev.slys.example;
import java.util.*;

public class Main {
  public static void main(String[] args) {
    System.out.println("Hello world!");
    ArrayList<Integer> ints = new ArrayList<>();
    System.out.println("capacity: " + ArrayListHelper.capacity(ints));
    for (int i = 0; i < 99; ++i) {
      ints.add(i);
      System.out.println("capacity: " + ArrayListHelper.capacity(ints));
    }
  }
}

Each time we add an element, we print the capacity (or length of elementData). Let's run it.

Hello world!
capacity: 0
capacity: 10
...
capacity: 10
capacity: 15
...
capacity: 15
capacity: 22
...
capacity: 22
capacity: 33
...
capacity: 33
capacity: 49
...
capacity: 49
capacity: 73
...
capacity: 73
capacity: 109
...
capacity: 109

We start with 0 capacity when it is extended to 10, then to 22, 33, 49, ..., and so on. The following diagram depicts memory allocation in consecutive steps.

Initially, there is no element (step 0). When we add the first element (step 1) allocation occurs. Space for 10 elements is allocated and "0" is added. In steps 2 to 10, we just add other elements as there is plenty of space for them. Once the memory is exhausted (step 11) new chunk of memory is allocated and elements from the old memory region are copied into a new one. Then "10" is added.

It seems like 10 is an arbitrary number, but another value in the sequence is a result of increasing current capacity by half to obtain new capacity, i.e. 10 / 2 = 5, then 10 + 5 = 15, and 15 / 2 = 7 (rounded down), then 15 + 7 = 22, and so on.

public void ensureCapacity(int minCapacity) {
  modCount++;
  int oldCapacity = elementData.length;
  if (minCapacity > oldCapacity) {
    Object oldData[] = elementData;
    int newCapacity = (oldCapacity * 3)/2 + 1;
    if (newCapacity < minCapacity)
      newCapacity = minCapacity;
    elementData = (E[])new Object[newCapacity];
    System.arraycopy(oldData, 0, elementData, 0, size);
  }
}

When we dig a bit more into ArrayList implementation and take a look into ensureCapacity(), actually we can find what it takes to obtain capacities. Capacity is depicted by the following equation.

$$capacity_{new} = (capacity_{old} * 3) / 2 + 1$$

Benchmark

Once we know the growth pattern of ArrayList we are ready to perform benchmarks to find out what is the impact of preallocating the memory.