C++ Smart Pointer Part 01: Unique Pointer

February 28, 2023

Acknowledgment

This blog post is inspired by two notable talks on smart pointers given at CppCon.

I am deeply grateful to Arthur O'Dwyer and David Olsen for their illuminating CppCon talks, which strengthened my understanding of smart pointers in C++.

Their expertise is invaluable in developing my knowledge of smart pointers. I highly recommend watching these talks to improve your own understanding.

This post summarizes some key points and examples from their talks and my insights. I hope it serves as a helpful resource for those seeking to learn about smart pointers in C++.

Introduction

Have you ever had trouble managing dynamic memory allocation in C++? If so, you are not alone.

Memory management is a challenging aspect of C++ programming, especially for beginners. It requires careful attention to avoid common pitfalls, such as memory leaks and dangling pointers, which can cause crashes and bugs. One solution to this problem is to use smart pointers, which automate memory management and make your code more robust and readable.

In this blog post, I will focus on one type of smart pointer: the unique pointer. A unique pointer ensures exclusive ownership of an object and prevents copying or sharing. This makes unique pointers useful for expressing ownership transfer with move semantics and for customizing resource cleanup.

By reading this blog post, you will learn:

Why unique pointers are useful and when to use them
How unique pointers work under the hood
How to use unique pointers in your own code with examples
How to use unique pointers with custom deleters

Why we need the Unique Pointer

Pitfall 01: Dangling Pointer

When using a dynamically allocated object like matrix without a smart pointer such as unique pointer, it is easy to create a dangling pointer. A dangling pointer is a pointer that points to an object that has already been deallocated. This can happen when a pointer to a dynamically allocated object is copied, and the original pointer is deleted, leaving the copied pointer pointing to invalid memory.

In the following C++ example code, I create a matrix<int> object m1 using the new operator, and then copy the pointer to another variable m2. After deleting m1, m2 becomes a dangling pointer pointing to an object that no longer exists. When I try to call get_row() using m2, it results in heap-use-after-free.

template<typename T>
class matrix {
public:
  matrix(int rows, int cols)
    : m_rows(rows)
    , m_cols(cols) {
    m_data = new T*[m_rows];
    for (int i = 0; i < m_rows; i++) {
      m_data[i] = new T[m_cols];
    }
  }

  ~matrix() {
    for (int i = 0; i < m_rows; i++) {
      delete[] m_data[i];
    }
    delete[] m_data;
  }

  int get_row() {
    return m_rows;
  }

  friend std::ostream& operator<<(std::ostream& os, matrix<T>& mat) {
    os << "(rows, column) = (" << mat.m_rows << ", " << mat.m_cols << ")";
    return os;
  }

private:
  int m_rows = 0;
  int m_cols = 0;
  T** m_data = nullptr;
};

void test_case_dangling_pointer(void) {
  matrix<int>* m1 = new matrix<int>(3, 4);
  std::cout << m1->get_row() << "\n";

  // This does not call matrix<int>::operator=. It only copies the pointer value,
  // so m2 points to the same memory location as m1.
  matrix<int>* m2 = m1;
  delete m1;

  // m2 is a dangling pointer, resulting in heap-use-after-free
  std::cout << m2->get_row() << "\n";
}

To resolve the issue of dangling pointers without using unique pointers, you can perform a deep copy of the matrix data instead of simply assigning m1 to m2. This ensures that m2 has its own independent copy of the matrix. One way to achieve this is to define a copy assignment operator for the matrix class that allocates new memory for the copy and copies the data from the original matrix into it.

It is important to note that both m1 and m2 are pointers and therefore they do not actually contain the objects themselves, but rather the memory addresses of the objects. The delete operator deallocates the memory that the pointers point to, not the pointers themselves.
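To make the distinction between a pointer and the object it points to concrete, here is a minimal sketch. The function name pointers_share_one_object is hypothetical and exists only for this illustration. It shows that copying a raw pointer copies the address only, so both pointers refer to the same single object.

```cpp
#include <cassert>

// Hypothetical helper for illustration only.
// Returns true when a copied raw pointer aliases the original object.
bool pointers_share_one_object() {
  int* p1 = new int(42);
  int* p2 = p1;  // copies the address, not the int

  *p2 = 7;  // a write through p2 ...
  bool aliased = (p1 == p2) && (*p1 == 7);  // ... is visible through p1

  delete p1;     // frees the single int; p2 would now dangle
  p1 = nullptr;  // clear both copies so neither can be used by accident
  p2 = nullptr;
  return aliased;
}
```

Calling pointers_share_one_object() returns true, because there was only ever one int, reachable through two addresses that compare equal.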

Here is an example of how you can implement the copy assignment for the matrix class:

template<typename T>
class matrix {
public:
  // Default constructor and other member functions

  // Copy assignment operator
  matrix& operator=(const matrix& other) {
    if (this != &other) {
      // deallocate the old memory
      for (int i = 0; i < m_rows; i++) {
        delete[] m_data[i];
      }
      delete[] m_data;
  
      // copy the new matrix data
      m_rows = other.m_rows;
      m_cols = other.m_cols;
      m_data = new T*[m_rows];
      for (int i = 0; i < m_rows; i++) {
        m_data[i] = new T[m_cols];
        std::copy(other.m_data[i], other.m_data[i] + m_cols, m_data[i]);
      }
    }
    return *this;
  }

private:
  // as before
};

In the example above, the operator= function first checks if this and other are different objects. If they are the same object, then there is no need to copy the matrix. Next, the function deallocates the memory of the old matrix by deleting each row of m_data and then deleting m_data itself. After that, the function copies the data of other into the new matrix by allocating new memory for m_data, copying each row of other.m_data into m_data, and finally returning the updated matrix object.

Here is an example of how you can use the copy assignment to avoid dangling pointers:

void test_case_dangling_pointer(void) {
  // Create a new matrix m1
  matrix<int>* m1 = new matrix<int>(3, 4);
  std::cout << m1->get_row() << "\n";
 
  // Create a new matrix m2 and copy m1 into it 
  matrix<int>* m2 = new matrix<int>(0, 0);
  *m2             = *m1; // Copy assignment operator is called here
  delete m1;
 
  // Use m2 without any issues 
  std::cout << m2->get_row() << "\n";
  delete m2;
}

In the example above, the copy assignment operator is called when *m2 = *m1 is executed, which ensures that m2 has its own independent copy of the matrix data. Therefore, deleting m1 does not cause m2 to become a dangling pointer, and using m2 after deleting m1 is safe.

When using raw pointers, it is crucial to be cautious about copying them and to delete the object exactly once, only after confirming that no copy of the pointer will be used again.

Pitfall 02: Memory Leaks

One common mistake in programming is forgetting to free memory allocated with new or malloc, which can result in memory leaks. Memory leaks occur when memory is allocated but never deallocated, leading to a gradual loss of available memory and eventual program failure.

In C++, one way to dynamically allocate memory is to use the new operator. When using new to create objects, it's important to remember to delete them when they are no longer needed to avoid memory leaks.

Consider the following code, which defines three classes: packet, player_packet, and npc_packet. The latter two inherit from the base class packet.

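class packet {
public:
  // Virtual destructor, so that deleting a derived packet through a packet*
  // is well defined.
  virtual ~packet() = default;
};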

class player_packet : public packet {
public:
};

class npc_packet : public packet {
public:
};

void test_case_memory_leak(void) {
  std::vector<packet*> system_packets;
  system_packets.push_back(new player_packet());
  system_packets.push_back(new npc_packet());
}

In the test_case_memory_leak() function, a vector of packet pointers called system_packets is created. Two dynamically allocated objects, one of type player_packet and one of type npc_packet, are then added to the vector.

However, there is no code to free the allocated memory, resulting in a memory leak. When the vector system_packets goes out of scope, it destroys only the raw pointers it stores, not the objects those pointers refer to.

To avoid memory leaks, it's important to free the allocated memory when it is no longer needed. One way to do this is to use a for loop to iterate through the vector and delete each object:

void test_case_memory_leak(void) {
  std::vector<packet*> system_packets;
  system_packets.push_back(new player_packet());
  system_packets.push_back(new npc_packet());

  for (auto& packet : system_packets) {
    delete packet;
  }
}

In summary, memory leaks can cause serious problems in programs if left unchecked. To prevent memory leaks, always remember to free dynamically allocated memory using delete when it is no longer needed.

Introduction to unique pointers

Definition

C++ unique pointers are a type of smart pointer that was introduced in the C++11 standard library. Unique pointers provide automatic memory management and help prevent memory-related bugs by ensuring that dynamically allocated objects are properly deleted when they are no longer needed.

Unique pointers are called unique because they ensure that the object they point to is owned by only one unique pointer at a time. If you try to copy a unique pointer, the compiler reports an error because the copy constructor and copy assignment operator are both deleted. Instead, you transfer ownership of the object to another unique pointer with move semantics. std::move() itself only casts its argument to an rvalue reference; the actual transfer happens in the move constructor or move assignment operator of the destination unique pointer, which leaves the source empty.
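The copy and move rules described above can be checked with a short sketch. The function name transfer_ownership is hypothetical; the type traits and std::move are standard. I use new instead of std::make_unique here only so that the snippet also compiles as C++11.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Compile-time checks: std::unique_ptr cannot be copied, only moved.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must not be copy constructible");
static_assert(!std::is_copy_assignable<std::unique_ptr<int>>::value,
              "unique_ptr must not be copy assignable");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must be move constructible");

// Hypothetical helper: moves ownership from one unique_ptr to another and
// returns the value seen through the new owner.
int transfer_ownership() {
  std::unique_ptr<int> first(new int(42));
  // std::unique_ptr<int> copy = first;  // would not compile: copy is deleted
  std::unique_ptr<int> second = std::move(first);
  assert(first == nullptr);  // the moved-from unique_ptr is empty
  return *second;
}
```

Calling transfer_ownership() returns 42: the int is never copied, only its owner changes.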

The basic syntax for declaring a unique pointer in C++ is as follows:

std::unique_ptr<T> ptr = std::make_unique<T>(args...);

Here, T is the type of object that the pointer will point to, and args... is a parameter pack that is used to pass arguments to the object's constructor, if any. The std::make_unique() function, available since C++14, allocates the object on the heap and initializes it with the specified arguments.
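As a small illustration of how the arguments reach the constructor, the sketch below builds a std::string through std::make_unique and through the C++11 spelling with new. The function name make_unique_demo is hypothetical, and the snippet assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: constructs the same string in two ways and returns
// its length.
std::size_t make_unique_demo() {
  // The arguments are forwarded to std::string(size_type count, char ch).
  std::unique_ptr<std::string> dashes = std::make_unique<std::string>(5, '-');

  // Equivalent C++11 spelling: construct with new, then hand over ownership.
  std::unique_ptr<std::string> same(new std::string(5, '-'));

  assert(*dashes == *same);
  return dashes->size();
}
```

Both pointers own a string of five dashes, so make_unique_demo() returns 5.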

Benefit

To avoid creating dangling pointers and memory leaks, it is recommended to use a smart pointer like std::unique_ptr. std::unique_ptr automatically deallocates the object when it goes out of scope, which removes the most common causes of dangling pointers and memory leaks. The benefits of std::unique_ptr become clear when examining the issues in the previous example code.

In the previous section, Pitfall 01: Dangling Pointer, we saw how raw pointers can lead to dangling pointers when one pointer deletes an object but the other still points to the same memory location. To avoid this, the following test code uses std::unique_ptr. Each matrix has exactly one owner, m2 receives a deep copy of the data through the copy assignment operator, and each object is deallocated automatically when its owner is reset or goes out of scope.

void test_case_no_dangling_pointer() {
  std::unique_ptr<matrix<int>> m1 = std::make_unique<matrix<int>>(3, 4);
  std::cout << m1->get_row() << "\n";

  std::unique_ptr<matrix<int>> m2 = std::make_unique<matrix<int>>(0, 0);
  *m2                             = *m1;

  m1.reset();

  std::cout << m2->get_row() << "\n";
}

In the previous section, Pitfall 02: Memory Leaks, we saw how using raw pointers can lead to memory leaks when we forget to deallocate the object. To prevent this pitfall, the test code below stores std::unique_ptr objects in the vector, so the packets are deallocated automatically when the vector goes out of scope.

void test_case_no_memory_leak(void) {
  std::vector<std::unique_ptr<packet>> system_packets;
  system_packets.push_back(std::make_unique<player_packet>());
  system_packets.push_back(std::make_unique<npc_packet>());
}
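To observe the automatic cleanup rather than take it on trust, here is a minimal sketch that counts live objects. The types tracked_packet and tracked_player_packet and the function live_packets_after_scope are hypothetical stand-ins for the packet classes above, introduced only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the packet hierarchy; counts live objects.
struct tracked_packet {
  static int live;
  tracked_packet() { ++live; }
  virtual ~tracked_packet() { --live; }
};
int tracked_packet::live = 0;

struct tracked_player_packet : tracked_packet {};

// Returns the number of live packets after the vector has been destroyed.
int live_packets_after_scope() {
  {
    std::vector<std::unique_ptr<tracked_packet>> packets;
    packets.push_back(
      std::unique_ptr<tracked_packet>(new tracked_player_packet()));
    packets.push_back(std::unique_ptr<tracked_packet>(new tracked_packet()));
    assert(tracked_packet::live == 2);
  }  // the vector is destroyed here; each unique_ptr deletes its packet
  return tracked_packet::live;
}
```

live_packets_after_scope() returns 0, with no explicit delete anywhere in the function. The virtual destructor in the base class is what makes deleting the derived object through a base pointer well defined.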

Using unique pointers

Here are examples of the most common operations on std::unique_ptr.

Initialization
To initialize a std::unique_ptr, you can use its constructor or the std::make_unique function.

// Initialization
std::unique_ptr<matrix<int>> uptr1(new matrix<int>(3, 4));
auto uptr2 = std::make_unique<matrix<int>>(3, 4);

Accessing raw pointer
You can access the raw pointer held by a std::unique_ptr by calling the get member function.

// Accessing raw pointer
matrix<int>* raw_ptr = uptr1.get();
std::cout << raw_ptr->get_row() << "\n";

Releasing ownership
You can release ownership of the pointer held by a std::unique_ptr by calling the release member function. This returns the raw pointer and leaves the std::unique_ptr empty, so the std::unique_ptr no longer manages the memory and you become responsible for deleting the object.
```
// Releasing ownership
matrix<int>* raw_ptr2 = uptr2.release();
std::cout << raw_ptr2->get_row() << "\n";
delete raw_ptr2;
```
Moving ownership
You can move ownership of a std::unique_ptr to another std::unique_ptr by using either move assignment or move construction.
```
std::unique_ptr<matrix<int>> uptr3 = std::move(uptr1);
std::cout << uptr3->get_row() << "\n";
```

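Resetting pointer
You can replace the object held by a std::unique_ptr by calling the reset member function. The previously owned object is deleted, and the std::unique_ptr takes ownership of the new one.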

uptr3.reset(new matrix<int>(17, 19));
std::cout << uptr3->get_row() << "\n";

Putting it all together:

void test_unique_ptr_matrix() {
  // Initialization
  std::unique_ptr<matrix<int>> uptr1(new matrix<int>(3, 4));
  auto uptr2 = std::make_unique<matrix<int>>(3, 4);

  // Accessing raw pointer
  matrix<int>* raw_ptr = uptr1.get();
  std::cout << raw_ptr->get_row() << "\n";

  // Releasing ownership
  matrix<int>* raw_ptr2 = uptr2.release();
  std::cout << raw_ptr2->get_row() << "\n";
  delete raw_ptr2;

  // Moving ownership
  std::unique_ptr<matrix<int>> uptr3 = std::move(uptr1);
  std::cout << uptr3->get_row() << "\n";

  // Resetting pointer
  uptr3.reset(new matrix<int>(17, 19));
  std::cout << uptr3->get_row() << "\n";
}

int main(void) {
  test_unique_ptr_matrix();
  return 0;
}

Enabling the -fsanitize=address flag when you compile the test code is a good practice for detecting memory-related issues such as memory leaks, heap-use-after-free, and buffer overflows. This flag adds runtime checks that help identify these issues before they cause problems in your code.

To enable this flag, you can add it to your compiler command as follows:

$ g++ -fsanitize=address -o app.out main.cpp
$ ./app.out
3
3
3
17

How Unique Pointer Works

A unique pointer is a move-only type that owns an object exclusively and releases it automatically when it goes out of scope. Let me explain them in detail.

Ownership
```
{
  std::unique_ptr<matrix<int>> uptr1(new matrix<int>(3, 4));
}
```
The code snippet creates a unique pointer uptr1 which owns a dynamically allocated matrix<int> object with dimensions 3 by 4.

When a unique pointer is created, it takes ownership of the object and becomes responsible for deallocating the object when it goes out of scope or is reset. This ensures that the object is properly cleaned up and avoids memory leaks.

In this case, uptr1 owns the matrix<int> object, so the object is deleted automatically at the closing brace, where uptr1 goes out of scope.

Move-Only Type
```
{
  std::unique_ptr<matrix<int>> uptr1(new matrix<int>(3, 4));  
  std::unique_ptr<matrix<int>> uptr2 = std::move(uptr1);
}
```
The code creates a unique pointer uptr1 which owns a dynamically allocated matrix<int> object with dimensions 3 by 4. Then, a second unique pointer uptr2 is created and initialized with the result of calling std::move on uptr1.

Unique pointers are move-only types, which means they cannot be copied, but can be moved. When a unique pointer is moved, ownership of the dynamically allocated object is transferred from the source unique pointer to the destination unique pointer. After the move, the source unique pointer no longer owns the object, and the object can only be accessed through the destination unique pointer.

In this case, uptr1 is moved into uptr2, transferring ownership of the matrix<int> object from uptr1 to uptr2. After the move, uptr1 is empty (it holds nullptr), and dereferencing it with the * or -> operator results in undefined behavior. uptr2, on the other hand, now owns the object and can access it using the -> operator or by dereferencing it with the * operator.

The use of move semantics in unique pointers allows for efficient and safe transfer of ownership of dynamically allocated objects, without the need for expensive copying or risking multiple owners attempting to delete the same object.
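Ownership transfer is most visible at function boundaries. The sketch below shows a factory that returns ownership and a sink that takes it by value. The names make_row, consume_row, and factory_and_sink_demo are hypothetical, and a std::vector<int> stands in for the matrix class to keep the snippet short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Factory: hands ownership of a new row to the caller.
std::unique_ptr<std::vector<int>> make_row(std::size_t cols) {
  return std::unique_ptr<std::vector<int>>(new std::vector<int>(cols, 0));
}

// Sink: takes ownership by value; the row is freed when `row` is destroyed
// at the end of this function.
std::size_t consume_row(std::unique_ptr<std::vector<int>> row) {
  return row->size();
}

// Hypothetical driver tying the two together.
std::size_t factory_and_sink_demo() {
  std::unique_ptr<std::vector<int>> row = make_row(4);
  std::size_t n = consume_row(std::move(row));  // explicit hand-over
  assert(!row);  // the caller no longer owns anything
  return n;
}
```

factory_and_sink_demo() returns 4. Writing std::move at the call site makes the hand-over visible to the reader; passing row without it would not compile.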

RAII
```
{
  std::unique_ptr<matrix<int>> uptr1(new matrix<int>(3, 4));  
  uptr1.reset(new matrix<int>(17, 19));
}
```
The code creates an instance of std::unique_ptr named uptr1 and initializes it with a dynamically allocated matrix<int> object of size 3 by 4.

std::unique_ptr is a smart pointer that provides automatic memory management for dynamically allocated objects. It uses the RAII (Resource Acquisition Is Initialization) principle to ensure that the object it points to is properly destroyed when the std::unique_ptr goes out of scope.

In the code snippet, uptr1 takes ownership of the dynamically allocated matrix<int> object using its constructor. This means that uptr1 is responsible for deleting the matrix<int> object when it goes out of scope.

Later in the code, the reset() function of uptr1 is called with a new matrix<int> object of a different size. uptr1 takes ownership of the new object and deletes the original matrix<int> object right away; the deletion happens inside reset(), not at the end of the scope. The new object is deleted in turn when uptr1 goes out of scope.

The use of std::unique_ptr in this way helps to prevent memory leaks and other memory-related issues by ensuring that objects are automatically destroyed when they are no longer needed or when the std::unique_ptr goes out of scope. This simplifies memory management and reduces the risk of bugs that can occur when manually managing memory.
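The RAII behavior described in this section can be observed with a small probe object that records when its destructor runs. The type probe and the three functions below are hypothetical and exist only for this sketch. The first check covers cleanup during exception unwinding, the second covers the timing of reset().

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that records whether its destructor has run.
struct probe {
  explicit probe(bool& flag) : destroyed(flag) { destroyed = false; }
  ~probe() { destroyed = true; }
  bool& destroyed;
};

void work_that_throws(bool& flag) {
  std::unique_ptr<probe> guard(new probe(flag));
  throw std::runtime_error("simulated failure");
  // No delete needed: stack unwinding destroys guard, which deletes the probe.
}

// Returns true when the probe was destroyed even though an exception escaped.
bool destroyed_despite_exception() {
  bool destroyed = false;
  try {
    work_that_throws(destroyed);
  } catch (const std::runtime_error&) {
    // By the time the handler runs, guard's destructor has already executed.
  }
  return destroyed;
}

// Returns true when reset() destroys the old object right away while the
// new object stays alive.
bool reset_destroys_old_object_immediately() {
  bool first_destroyed  = false;
  bool second_destroyed = false;
  std::unique_ptr<probe> p(new probe(first_destroyed));
  p.reset(new probe(second_destroyed));
  return first_destroyed && !second_destroyed;
}
```

Both functions return true. With a raw pointer, the exception in work_that_throws would skip any delete placed after the throw and leak the probe.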

Custom Deleter

std::unique_ptr also accepts a custom deleter type as its second template argument, which defines how the owned object should be released. We need a custom deleter when the standard delete cannot handle the resource properly, for example when the resource is a file pointer that must be closed with fclose. By supplying a custom deleter, we tell the unique pointer how to release its resource.

The example code uses a std::unique_ptr object to manage a file pointer. It defines a custom deleter class called file_closer, which has a function call operator that takes a file pointer as an argument and closes it using fclose. It also prints a message to the console saying that the file was closed.

In the test_file_io function, I create a std::unique_ptr object with two arguments: fopen("dataset.txt", "a"), which opens a file named dataset.txt for appending data, and file_closer(), which creates an instance of the custom deleter class. The code then writes some text to the file using fprintf and passes uptr.get() as the first argument. uptr.get() returns the raw file pointer that uptr manages. When the function ends, the std::unique_ptr object goes out of scope and calls its deleter on the file pointer, closing the file and printing the message. If fopen fails and returns a null pointer, the deleter is not called at all.

class file_closer {
public:
  void operator()(FILE* fp) const {
    fclose(fp);
    std::cout << "[" << __PRETTY_FUNCTION__ << "] File closed\n";
  }
};

void test_file_io(void) {
  std::unique_ptr<FILE, file_closer> uptr(fopen("dataset.txt", "a"), file_closer());
  if (uptr) {
    fprintf(uptr.get(), "I go to school by bus!\n");
  } else {
    std::cerr << "Failed to open dataset.txt\n";
  }
}
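A deleter can also carry state, and std::unique_ptr invokes it only when it actually owns something. The sketch below illustrates both points with a hypothetical counting_deleter; the names counting_deleter and deleter_calls_demo are mine and not part of any library.

```cpp
#include <cassert>
#include <memory>

// Hypothetical deleter that counts how often it is invoked.
struct counting_deleter {
  int* calls;  // not owned; must outlive every unique_ptr that uses it
  void operator()(int* p) const {
    ++*calls;
    delete p;
  }
};

// Returns how many times the deleter ran for one owning and one empty pointer.
int deleter_calls_demo() {
  int calls = 0;
  int* none = nullptr;
  {
    std::unique_ptr<int, counting_deleter> owned(new int(1),
                                                 counting_deleter{&calls});
    std::unique_ptr<int, counting_deleter> empty(none,
                                                 counting_deleter{&calls});
  }  // owned invokes the deleter once; empty holds nullptr and skips it
  return calls;
}
```

deleter_calls_demo() returns 1. This is the same rule that keeps file_closer from calling fclose on a null pointer when fopen fails.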

Building your own unique pointer

To improve your C++ memory management skills and write better code, you should learn how to create your own unique pointer. This will help you understand how it works and how to use it effectively.

This can also be useful when working on projects without the standard library. For example, in some embedded systems programming contexts, it might not be feasible to rely on the standard library, and knowing how to implement basic memory management functionality like std::unique_ptr can be a valuable skill.

One possible way to define a custom implementation of unique pointer is the following code.

#include <algorithm> // std::copy
#include <iostream>
#include <memory>    // std::default_delete
#include <utility>

namespace theta {

template<class T>
struct remove_reference {
  using type = T;
};

template<typename T>
struct remove_reference<T&> {
  using type = T;
};

template<typename T>
struct remove_reference<T&&> {
  using type = T;
};

template<class T>
using remove_reference_type = typename remove_reference<T>::type;

template<typename T>
typename remove_reference<T>::type&& move(T&& arg) {
  return static_cast<typename remove_reference<T>::type&&>(arg);
}

template<class T>
T&& forward(remove_reference_type<T>& arg) noexcept {
  return static_cast<T&&>(arg);
}

template<typename T>
void swap(T& one, T& other) {
  T cache = theta::move(one);
  one     = theta::move(other);
  other   = theta::move(cache);
}

template<typename T, typename Deleter = std::default_delete<T>>
class unique_ptr {
public:
  explicit unique_ptr(T* ptr = nullptr, Deleter deleter = Deleter())
    : m_ptr(ptr)
    , m_deleter(deleter) {
  }

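  ~unique_ptr() {
    // Like std::unique_ptr, only invoke the deleter when an object is owned.
    if (m_ptr != nullptr) {
      m_deleter(m_ptr);
    }
  }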

  unique_ptr(const unique_ptr&) = delete;

  unique_ptr& operator=(const unique_ptr&) = delete;

  unique_ptr(unique_ptr&& other) noexcept
    : m_ptr(other.m_ptr)
    , m_deleter(theta::move(other.m_deleter)) {
    other.m_ptr = nullptr;
  }

  unique_ptr& operator=(unique_ptr&& other) noexcept {
    if (this != &other) {
      // Release the currently owned object, if any, before taking over.
      if (m_ptr != nullptr) {
        m_deleter(m_ptr);
      }
      m_ptr       = other.m_ptr;
      m_deleter   = theta::move(other.m_deleter);
      other.m_ptr = nullptr;
    }
    return *this;
  }

  T* release() noexcept {
    T* ptr = m_ptr;
    m_ptr  = nullptr;
    return ptr;
  }

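  void reset(T* ptr = nullptr) noexcept {
    // Store the new pointer first, then delete the old object if there was one.
    T* old = m_ptr;
    m_ptr  = ptr;
    if (old != nullptr) {
      m_deleter(old);
    }
  }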

  void swap(unique_ptr& other) noexcept {
    theta::swap(m_ptr, other.m_ptr);
    theta::swap(m_deleter, other.m_deleter);
  }

  T* get() const noexcept {
    return m_ptr;
  }

  explicit operator bool() const noexcept {
    return m_ptr != nullptr;
  }

  T& operator*() const {
    return *m_ptr;
  }

  T* operator->() const noexcept {
    return m_ptr;
  }

private:
  T* m_ptr = nullptr;
  Deleter m_deleter;
};

template<typename T, typename... Args>
unique_ptr<T> make_unique(Args&&... args) {
  return unique_ptr<T>(new T(theta::forward<Args>(args)...));
}

template<typename T>
class matrix {
public:
  matrix(int rows, int cols)
    : m_rows(rows)
    , m_cols(cols) {
    m_data = new T*[m_rows];
    for (int i = 0; i < m_rows; i++) {
      m_data[i] = new T[m_cols];
    }
  }

  ~matrix() {
    for (int i = 0; i < m_rows; i++) {
      delete[] m_data[i];
    }
    delete[] m_data;
  }

  // assignment operator
  matrix& operator=(const matrix& other) {
    if (this != &other) {
      // deallocate the old memory
      for (int i = 0; i < m_rows; i++) {
        delete[] m_data[i];
      }
      delete[] m_data;

      // copy the new matrix data
      m_rows = other.m_rows;
      m_cols = other.m_cols;
      m_data = new T*[m_rows];
      for (int i = 0; i < m_rows; i++) {
        m_data[i] = new T[m_cols];
        std::copy(other.m_data[i], other.m_data[i] + m_cols, m_data[i]);
      }
    }
    return *this;
  }

  friend std::ostream& operator<<(std::ostream& os, matrix<T>& mat) {
    os << "(rows, column) = (" << mat.m_rows << ", " << mat.m_cols << ")";
    return os;
  }

private:
  int m_rows = 0;
  int m_cols = 0;
  T** m_data = nullptr;
};

} // namespace theta

void test_unique_ptr_matrix() {
  // Initialization
  theta::unique_ptr<theta::matrix<int>> uptr1(new theta::matrix<int>(3, 4));
  auto uptr2 = theta::make_unique<theta::matrix<int>>(3, 4);

  // Accessing raw pointer
  theta::matrix<int>* raw_ptr = uptr1.get();
  std::cout << *raw_ptr << "\n";

  // Releasing ownership
  theta::matrix<int>* raw_ptr2 = uptr2.release();
  std::cout << *raw_ptr2 << "\n";
  delete raw_ptr2;

  // Moving ownership
  theta::unique_ptr<theta::matrix<int>> uptr3 = theta::move(uptr1);
  std::cout << *uptr3 << "\n";

  // Resetting pointer
  uptr3.reset(new theta::matrix<int>(17, 19));
  std::cout << *uptr3 << "\n";
}

int main(void) {
  test_unique_ptr_matrix();
  return 0;
}

I compiled with g++ with the option -fsanitize=address to enable address sanitizer.

$ g++ -fsanitize=address -o app.out main.cpp

This is the output result I got:

$ ./app.out
(rows, column) = (3, 4)
(rows, column) = (3, 4)
(rows, column) = (3, 4)
(rows, column) = (17, 19)

Conclusion

In this post, we learned about unique pointers in C++, which are smart pointers that manage a single object's lifetime and help prevent memory leaks. We saw how to create and use them, how they work internally with move semantics and ownership transfer, how to customize their behavior with custom deleters, and how to build our own unique pointer from scratch. I hope this post helped you understand unique pointers in C++. Thank you for reading, and see you soon.