C++: std::vector and reserve()

📢 In this article, you'll learn how to improve the performance of your C++ programs by utilizing the reserve method in std::vector. Discover the benefits of reserving memory upfront and avoiding unnecessary reallocations. 💡🖥️

C++: std::vector and reserve()

When dealing with a collection of elements, we have a couple of options to choose from. One of them is of course a plain, old C-style array.

int* ints = new int[10];
for (unsigned i = 0; i < 10; ++i)
  ints[i] = i;

In the example above, we create an array of 10 integers and then assign consecutive numbers from 0 to 9. But what happens when the allocated space is exhausted? Well, if we just go out of array boundaries, we will likely either corrupt our memory or in the best case get a segmentation fault.

int* old = ints;
ints = new int[20];

for (unsigned i = 0; i < 10; ++i)
  ints[i] = old[i];

delete[] old;

We need to manually resize the array, so it takes the following steps to fulfill:

  • allocate a new array with a size greater comparing to the old one,
  • copy elements from the old one to the new one,
  • tear down the old one.

Now we are ready to add new integers to the collection.

for (unsigned i = 10; i < 20; ++i)
  ints[i] = i + 100;

In the end, printing the content of the array will give the following output.

0 1 2 3 4 5 6 7 8 9 110 111 112 113 114 115 116 117 118 119

Our naive solution is working fine, but this implementation is error-prone. It requires tracking the number of elements, maintaining memory, and copying things around. In the meantime, there is a more suitable way of handling dynamic memory.

std::vector

C++ standard library provides a container with contiguous storage out of the box. It is called std::vector. The storage of the vector is handled automatically, being expanded as needed. Vectors usually occupy more space than static arrays, because more memory is allocated to handle future growth. This way a vector does not need to reallocate each time an element is inserted, but only when the additional memory is exhausted.

std::vector<int> ints;

for (unsigned i = 0; i < 10; ++i)
  ints.push_back(i);

We declare a vector and then push into it 10 integers.

std::vector<int> ints;

for (unsigned i = 0; i < 10; ++i)
  ints.push_back(i);

for (auto&& i : ints)
  std::cout<< i << ' ';
std::cout<<std::endl;

for (unsigned i = 10; i < 20; ++i)
  ints.push_back(100 + i);

for (auto&& i : ints)
  std::cout<< i << ' ';
std::cout<<std::endl;

When we want to add new elements, we just need to push back another round of integers. And that's it! Memory allocation is done automatically. Printing the vector's content will give the following output.

0 1 2 3 4 5 6 7 8 9 
0 1 2 3 4 5 6 7 8 9 110 111 112 113 114 115 116 117 118 119

Vector's growth

Vector automatically extends allocated memory. Let's try to find out how memory grows over time. For that purpose, we will use the dedicated capacity() method.

std::vector<int> ints;

for (unsigned i = 0; i < 16; ++i) {
  std::cout<< "capacity: " << ints.capacity() << std::endl;
  ints.push_back(i);
}

We sequentially add 16 integers and print the capacity each time. We can run it now.

capacity: 0
capacity: 1
capacity: 2
capacity: 4
capacity: 4
capacity: 8
capacity: 8
capacity: 8
capacity: 8
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16

Initial capacity is 0. Then it grows to 1, 2, 4, 8, 16, and so on. Each time there is a need to allocate more memory, capacity doubles.

The following diagram depicts memory allocation in consecutive steps.

Initially, there is no element (step 0), so allocation occurs. A space for a single element is allocated and "0" is assigned to it (step 1). When we try to add another element, another allocation occurs as space is already exhausted (step 2). A memory that can keep two elements is allocated. Elements from the old memory region are copied into newly allocated memory - in this particular case, it is a single element - "0". In the end, "1" is assigned to the free slot. When we add another integer, a new allocation has to happen (step 3). We double the allocated space, so memory for four elements is allocated. Then elements from the old memory are copied into the new memory chunk. "0" and "1" are copied and then "2" is added. At this point, there is still one free slot. When we add another integer into the container (step 4), no new allocation is made as capacity hasn't been exhausted yet. We simply add "3" into the vector. When adding another integer (step 6), space is already exhausted. A new memory chunk is allocated, and elements from the old memory are copied into a new one, and then "5" is added, and so on.

Preallocation

When we know the number of elements upfront it is worth reserving the necessary memory. There is a dedicated method for this purpose - reserve(...). We can slightly modify the code and reserve the memory for 16 elements upfront.

std::vector<int> ints;
ints.reserve(16);

for (unsigned i = 0; i < 16; ++i) {
  std::cout<< "capacity: " << ints.capacity() << std::endl;
  ints.push_back(i);
}

Now we can compile and execute modified code.

capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16
capacity: 16

We can see each time we push back an integer capacity remains the same. Of course, it is until we don't exhaust all the allocated memory. So what happens underneath?

Initially, no memory is allocated (step 0). Then we call the reserve(...) method to allocate the space (step 1). We push back "0" into the vector. As we have plenty of space, no reallocation occurs. Again, we push back another integer "1" and no reallocation and copying are done, and so on.

Benchmark

Now it is worth asking what the costs are.