RAII: A Cornerstone of C++ Programming

January 31, 2023

Introduction

This post is a personal study guide to a memorable talk: Back to Basics: RAII and the Rule of Zero, by Arthur O'Dwyer, at CppCon 2019.

Cornerstone

RAII stands for Resource Acquisition Is Initialization. It ties the acquisition of a resource to the lifetime of an object, so the resource is released automatically when the object is destroyed.
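
To make the definition concrete, here is a minimal sketch of an RAII wrapper around the C malloc/free pair. The class name heap_buffer and the live_buffers() counter are hypothetical and not from the talk; the counter exists only so that the automatic release can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical RAII wrapper around malloc/free (for illustration only).
class heap_buffer {
  void* _data = nullptr;

public:
  // Acquire the resource in the constructor.
  explicit heap_buffer(std::size_t size) {
    assert(size > 0);
    _data = std::malloc(size);
    if (_data == nullptr) {
      throw std::bad_alloc();
    }
    ++live_buffers();
  }

  // Release the resource in the destructor.
  ~heap_buffer() {
    std::free(_data);
    --live_buffers();
  }

  // Copying is disabled so that two objects never free the same memory.
  heap_buffer(const heap_buffer&)            = delete;
  heap_buffer& operator=(const heap_buffer&) = delete;

  void* data() const {
    return _data;
  }

  // Number of heap_buffer objects currently alive (demo bookkeeping only).
  static int& live_buffers() {
    static int count = 0;
    return count;
  }
};

// Returns the live count observed while a buffer is in scope.
int live_count_inside_scope() {
  heap_buffer buf(64);
  return heap_buffer::live_buffers();
} // buf is destroyed here, which calls free() automatically
```

Calling live_count_inside_scope() observes one live buffer inside the function, and the count is back to zero once the function returns, without any explicit call to free at the call site.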

  1. What are resources?

    A resource is anything that must be acquired before use and explicitly released afterwards. For example:

    • Heap Memory Allocation
      class packet {};
      
      {
        packet* pkt = (packet*) malloc(sizeof(packet));
        free(pkt);
      }
      
      {
        packet* pkt = new packet;
        delete pkt;
      }
      
      {
        packet* pkt = new packet[n];
        delete[] pkt;
      }
    • POSIX File Operations
      int fd = open(filename, O_RDONLY);
      close(fd);
    • POSIX Mutex
      pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      
      pthread_mutex_lock(&lock);
      pthread_mutex_unlock(&lock);
    • C++ Thread
      void on_worker() {
      }
      
      std::thread worker(on_worker);
      worker.join();
  2. How do we acquire resources?

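    Once we have identified the resource, we know exactly how to acquire and release it. Each acquisition above is paired with a release: malloc/free, new/delete, new[]/delete[], open/close, lock/unlock, and constructing a thread/join.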

  3. What is the object's lifetime?

    In this post, we focus on two points in an object's lifetime: the constructor (acquire) and the destructor (release). For the details, see the object lifetime documentation on cppreference.com.

  4. Why automatic cleanup?

    Let's look at an example that does not use automatic cleanup:

    #include <vector>
    
    class packet {
    public:
    };
    
    class player_packet : public packet {
    public:
    };
    
    class npc_packet : public packet {
    public:
    };
    
    int main(void) {
      {
        std::vector<packet*> system_packets;
        system_packets.push_back(new player_packet());
        system_packets.push_back(new npc_packet());
      }
      return 0;
    }
    $ rm -f app.out; clang++ -O0 -fsanitize=address -fno-omit-frame-pointer -o app.out main.cpp; ./app.out && echo $?
    ==59057==ERROR: LeakSanitizer: detected memory leaks
    
    Direct leak of 1 byte(s) in 1 object(s) allocated from:
        #0 0x514a37 in operator new(unsigned long) (/home/gapry/Workspaces/Demo/app.out+0x514a37)
        #1 0x517848 in main (/home/gapry/Workspaces/Demo/app.out+0x517848)
    
    Direct leak of 1 byte(s) in 1 object(s) allocated from:
        #0 0x514a37 in operator new(unsigned long) (/home/gapry/Workspaces/Demo/app.out+0x514a37)
        #1 0x5177ea in main (/home/gapry/Workspaces/Demo/app.out+0x5177ea)
    
    SUMMARY: AddressSanitizer: 2 byte(s) leaked in 2 allocation(s).

    This is a memory leak. system_packets lives on the stack and is destroyed when it leaves the scope, but destroying a vector of raw pointers destroys only the pointers, not the objects they point to. The player_packet and npc_packet objects remain on the heap with nothing left to delete them.

    We can use std::unique_ptr to solve it:

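    #include <memory>
    #include <vector>
    
    // packet, player_packet and npc_packet are defined as in the previous example.
    // Note: deleting a derived object through a packet* is only well-defined
    // if packet declares a virtual destructor.
    
    int main(void) {
      {
        // Each unique_ptr deletes its packet when the vector is destroyed.
        std::vector<std::unique_ptr<packet>> system_packets;
        system_packets.push_back(std::make_unique<player_packet>());
        system_packets.push_back(std::make_unique<npc_packet>());
      }
      return 0;
    }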
    $ rm -f app.out; clang++ -O0 -fsanitize=address -fno-omit-frame-pointer -o app.out main.cpp; ./app.out && echo $?
    0
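
The sanitizer run shows that nothing leaks; the sketch below makes the cleanup itself visible by counting destructor calls. It assumes a virtual destructor on packet, and the int& counter passed to each packet is a hypothetical addition used only to observe the cleanup.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class packet {
public:
  virtual ~packet() = default; // needed for deletion through a packet*
};

class player_packet : public packet {
  int& _destroyed; // hypothetical counter, only for observing cleanup

public:
  explicit player_packet(int& destroyed) : _destroyed(destroyed) {}
  ~player_packet() override { ++_destroyed; }
};

class npc_packet : public packet {
  int& _destroyed; // hypothetical counter, only for observing cleanup

public:
  explicit npc_packet(int& destroyed) : _destroyed(destroyed) {}
  ~npc_packet() override { ++_destroyed; }
};

// Returns how many packets were destroyed when the vector left its scope.
int destroyed_after_scope() {
  int destroyed = 0;
  {
    std::vector<std::unique_ptr<packet>> system_packets;
    system_packets.push_back(std::make_unique<player_packet>(destroyed));
    system_packets.push_back(std::make_unique<npc_packet>(destroyed));
    assert(destroyed == 0); // nothing has been released yet
  } // the vector destroys each unique_ptr, which deletes its packet
  return destroyed;
}
```

Both destructors run at the closing brace of the inner scope, so the function reports two destroyed packets without a single explicit delete.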

Example: Class Vector

Let's consider the following example:

#include <iostream>
#include <algorithm>

class Vector {
  int*   _ptr    = nullptr;
  size_t _offset = 0;

  friend std::ostream& operator<<(std::ostream& os, const Vector& vec) {
    for(int i = 0; i < vec._offset; ++i) {
      os << vec[i] << " ";
    }
    os << "\n";
    return os;
  }

public:
  Vector() = default;

  ~Vector() {
    delete[] _ptr;
  }

  void push_back(int new_value) {
    int* buff = new int[_offset + 1];
    std::copy(_ptr, _ptr + _offset, buff);
    delete[] _ptr;
    _ptr            = buff;
    _ptr[_offset++] = new_value;
  }

  int& operator[](int idx) const {
    return _ptr[idx];
  }
};

int main(void) {
  {
    Vector v;
    v.push_back(-1);
    v.push_back(0);
    v.push_back(1);
    {
      // our test
    }
    std::cout << v;
  }
  return 0;
}
$ rm -f app.out; clang++ -O0 -fsanitize=address -fno-omit-frame-pointer -o app.out main.cpp; ./app.out
-1 0 1

It seems good at the moment, but it's not.

Pitfall #1

{
  // our test
  Vector w = v; 
}
$ rm -f app.out; clang++ -O0 -fsanitize=address -fno-omit-frame-pointer -o app.out main.cpp; ./app.out
=================================================================
==79679==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000050 at pc 0x00000051ae88 bp 0x7ffd427acf00 sp 0x7ffd427acef8

We must remember that initialization is not assignment:

  • Case #1
    Vector w = v; // it invokes the copy constructor since it's an initialization.
  • Case #2
    Vector w; 
    w = v; // it invokes the assignment operator since it's an assignment to the existing object.
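
The two cases can be checked with a small sketch. The tracer type below is hypothetical: it only records which special member function ran last, so the difference between initialization and assignment becomes observable.

```cpp
#include <cassert>
#include <string>

// Hypothetical type that records which special member function ran last.
struct tracer {
  std::string last = "default constructor";

  tracer() = default;

  tracer(const tracer&) : last("copy constructor") {}

  tracer& operator=(const tracer&) {
    last = "copy assignment";
    return *this;
  }
};

// Case #1: w is being created, so this is an initialization.
std::string initialize_from(const tracer& v) {
  tracer w = v;
  return w.last;
}

// Case #2: w already exists, so this is an assignment.
std::string assign_from(const tracer& v) {
  tracer w;
  assert(w.last == "default constructor");
  w = v;
  return w.last;
}
```

Even though both cases use the = token, initialize_from reports the copy constructor and assign_from reports the copy assignment operator.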

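In our test case, Vector w = v invokes the copy constructor. Since we don't implement one, the compiler generates it implicitly, and it copies _ptr member-wise, so w and v share the same heap memory. When w is destroyed at the end of the inner scope, it deletes that memory, and the later std::cout << v reads freed memory. To avoid this, we need to implement the copy constructor so that each Vector owns its own buffer.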

Vector(const Vector& rhs) {
  _ptr    = new int[rhs._offset];
  _offset = rhs._offset;
  std::copy(rhs._ptr, rhs._ptr + _offset, _ptr);
}

Pitfall #2

{
  // our test
  Vector w;
  w = v;
  
  v = v;
}
$ rm -f app.out; clang++ -O0 -fsanitize=address -fno-omit-frame-pointer -o app.out main.cpp; ./app.out
=================================================================
==81921==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000050 at pc 0x00000051aea8 bp 0x7ffc990afba0 sp 0x7ffc990afb98

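The same reasoning as in Pitfall #1 applies here: w = v invokes the copy assignment operator, and the implicitly generated one also copies _ptr member-wise, so both objects end up sharing one buffer (and any buffer the target already owned would leak). We need to implement the assignment operator ourselves. Copying rhs first and then swapping, known as the copy-and-swap idiom, also keeps self-assignment such as v = v safe.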

Vector& operator=(const Vector& rhs) {
  Vector copy = rhs;
  copy.swap(*this);
  return *this;
}

void swap(Vector& rhs) {
  std::swap(_ptr,    rhs._ptr);
  std::swap(_offset, rhs._offset);
}

A Little Better: Support Move Semantics

  • Move Constructor
    Vector(Vector&& rhs) {
      _ptr    = std::exchange(rhs._ptr, nullptr);
      _offset = std::exchange(rhs._offset, 0);
    }
    {
      // our test
      Vector w(std::move(v));
    }
  • Move Assignment
    Vector& operator=(Vector&& rhs) {
      Vector copy(std::move(rhs));
      copy.swap(*this);
      return *this;
    }
    {
      // our test
      Vector w;
      w = std::move(v);
    }
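
To see what "moving" does to both objects, here is a compact sketch of the move operations with some observation helpers. The size() and data() accessors and the noexcept specifiers are my additions, not part of the code above; the copy operations are left out for brevity, and declaring the move operations means the compiler defines the implicit copy operations as deleted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Vector {
  int*        _ptr    = nullptr;
  std::size_t _offset = 0;

public:
  Vector() = default;

  ~Vector() {
    delete[] _ptr;
  }

  // Move constructor: take over rhs's buffer and leave rhs empty.
  Vector(Vector&& rhs) noexcept
    : _ptr(std::exchange(rhs._ptr, nullptr)),
      _offset(std::exchange(rhs._offset, 0)) {
  }

  // Move assignment: move rhs into a temporary, then swap with *this.
  Vector& operator=(Vector&& rhs) noexcept {
    Vector copy(std::move(rhs));
    copy.swap(*this);
    return *this; // copy's destructor releases the old buffer
  }

  void swap(Vector& rhs) noexcept {
    std::swap(_ptr, rhs._ptr);
    std::swap(_offset, rhs._offset);
  }

  void push_back(int new_value) {
    int* buff = new int[_offset + 1];
    std::copy(_ptr, _ptr + _offset, buff);
    delete[] _ptr;
    _ptr            = buff;
    _ptr[_offset++] = new_value;
  }

  // Hypothetical accessors, added only to observe ownership transfer.
  std::size_t size() const {
    return _offset;
  }

  const int* data() const {
    return _ptr;
  }
};
```

After Vector w(std::move(v)), w.data() is the very buffer v used to own, and v is left empty with a null pointer, so its destructor has nothing to free. No elements are copied at any point.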

Conclusion

  • We need to know the difference between lvalue references and rvalue references
    • T& is an lvalue reference to a T. The copy constructor takes const T&, which can bind to both lvalues and rvalues.
    • T&& is an rvalue reference to a T. The move constructor takes T&&, which binds only to rvalues and is preferred over const T& for them.
  • Copy constructor, T(const T&) and Copy assignment operator, T& operator=(const T&): copy the resource
  • Move constructor, T(T&&) and Move assignment operator, T& operator=(T&&): transfer ownership of the resource
  • The talk covers more details that I don't mention in this post, such as exceptions and shared pointers.
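
The binding rules in the first bullet can be checked with a pair of overloads. This is a small sketch; widget and which() are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct widget {};

// Chosen for lvalues, and for const rvalues that widget&& cannot bind to.
std::string which(const widget&) {
  return "const T&";
}

// Chosen for non-const rvalues, including the result of std::move.
std::string which(widget&&) {
  return "T&&";
}

// Walks through the common cases and checks the selected overload.
void check_bindings() {
  widget w;
  assert(which(w) == "const T&");            // a named object is an lvalue
  assert(which(widget{}) == "T&&");          // a temporary is an rvalue
  assert(which(std::move(w)) == "T&&");      // std::move casts to an rvalue

  const widget cw{};
  assert(which(std::move(cw)) == "const T&"); // const rvalue cannot bind to widget&&
}
```

The last case is why moving from a const object silently falls back to a copy: the move constructor's T&& parameter cannot bind to it, so the const T& overload is selected.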


Written by Gapry, 魏秋