Memory Management

A deep dive into the memory management system utilized by WebKit.

Overview

In WebKit, when an object is owned by another object, we typically use std::unique_ptr to express that ownership. For objects in other cases, WebKit uses two primary management strategies: garbage collection and reference counting.

Reference counting in WebKit

Overview

Most WebCore objects are not managed by JavaScriptCore’s garbage collector. Instead, we use reference counting. We have two reference counting pointer types: RefPtr and Ref. RefPtr is intended to behave like a C++ pointer whereas Ref is intended to behave like a C++ reference, meaning that the former can be set to nullptr but the latter cannot.

Ref<A> a1; // This will result in compilation error.
RefPtr<A> a2; // This is okay.
Ref<A> a3 = A::create(); // This is okay.
a3->f(); // Calls f() on an instance of A.
A* a4 = a3.ptr();
a4 = a2.get();

Unlike C++’s std::shared_ptr, the implementation of reference counting is a part of the managed object itself. The requirements for an object to be used with RefPtr and Ref are as follows:

  • It implements ref() and deref() member functions.
  • Each call to ref() increments its internal reference counter, and each call to deref() decrements it.
  • The initial call to ref() is implicit in new, after the object has been allocated and its constructor has been called; that is, the reference count starts at 1.
  • When deref() is called and the internal reference counter reaches 0, “this” object is destructed and deleted.

There is a convenience base class template, RefCounted<T>, which implements this behavior automatically for any class T that inherits from it.
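The requirements above can be sketched in standalone C++. This is a minimal illustration, not WebKit's RefCounted: the names SketchRefCounted, Counted and lifetimeFollowsCount are hypothetical, and the real class has additional checks and features.

```cpp
#include <cassert>

// Hypothetical stand-in for RefCounted<T>; not WebKit's implementation.
template<typename T>
class SketchRefCounted {
public:
    void ref() const { ++m_refCount; }
    void deref() const
    {
        assert(m_refCount > 0);
        if (!--m_refCount)
            delete static_cast<const T*>(this); // Count reached 0: destroy the object.
    }
    unsigned refCount() const { return m_refCount; }

protected:
    SketchRefCounted() = default;
    ~SketchRefCounted() = default;

private:
    mutable unsigned m_refCount { 1 }; // The implicit initial ref().
};

class Counted : public SketchRefCounted<Counted> {
public:
    explicit Counted(bool& destroyed) : m_destroyed(destroyed) { }
    ~Counted() { m_destroyed = true; }

private:
    bool& m_destroyed;
};

// Returns true if the object stays alive at count 1 and dies at count 0.
bool lifetimeFollowsCount()
{
    bool destroyed = false;
    auto* object = new Counted(destroyed); // count == 1
    object->ref();                         // count == 2
    object->deref();                       // count == 1
    bool aliveAtOne = !destroyed && object->refCount() == 1;
    object->deref();                       // count == 0, object deleted
    return aliveAtOne && destroyed;
}
```

Storing the counter inside the object is what lets a raw reference or pointer be turned back into an owning one at any time, which std::shared_ptr cannot do without extra machinery.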

How to use RefPtr and Ref

When an object which implements the semantics required by RefPtr and Ref is created via new, we must immediately adopt it into Ref type using adoptRef as follows:

class A : public RefCounted<A> {
public:
    int m_foo;

    int f() { return m_foo; }

    static Ref<A> create() { return adoptRef(*new A); }
private:
    A() = default;
};

This will create an instance of Ref without calling ref() on the newly created object, avoiding an unnecessary extra increment of a reference count that already starts at 1. WebKit’s coding convention is to make the constructor private and add a static create function which returns an instance of a ref counted object after adopting it.
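To make the difference between adopting and ordinary referencing concrete, here is a simplified standalone sketch. SketchRef, sketchAdoptRef and Thing are hypothetical names for illustration only; WebKit's Ref and adoptRef have more features and assertions.

```cpp
#include <cassert>

struct Thing {
    unsigned refCount { 1 }; // Starts at 1, matching the convention described above.
    void ref() { ++refCount; }
    void deref() { if (!--refCount) delete this; }
};

// Hypothetical, heavily simplified stand-in for Ref<T>.
template<typename T>
class SketchRef {
public:
    struct AdoptTag { };

    explicit SketchRef(T& object) : m_ptr(&object) { m_ptr->ref(); } // Takes a new reference.
    SketchRef(T& object, AdoptTag) : m_ptr(&object) { }              // Adopts the initial one.
    SketchRef(SketchRef&& other) : m_ptr(other.m_ptr) { other.m_ptr = nullptr; }
    SketchRef(const SketchRef&) = delete;
    ~SketchRef() { if (m_ptr) m_ptr->deref(); }

    T& get() const { return *m_ptr; }

private:
    T* m_ptr;
};

template<typename T>
SketchRef<T> sketchAdoptRef(T& object)
{
    return SketchRef<T>(object, typename SketchRef<T>::AdoptTag { });
}

// Adopting takes over the initial count of 1 without incrementing it.
unsigned countAfterAdopt()
{
    auto adopted = sketchAdoptRef(*new Thing);
    return adopted.get().refCount;
}

// A second, non-adopting SketchRef increments the count.
unsigned countAfterSecondRef()
{
    auto adopted = sketchAdoptRef(*new Thing);
    SketchRef<Thing> second(adopted.get());
    return second.get().refCount;
}
```

In this sketch, skipping adoption after new would leave the count one too high, so the object would never be deleted.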

Note that returning RefPtr or Ref is efficient thanks to copy elision in C++11, so the following example does not create a temporary Ref object using the copy constructor:

Ref<A> a = A::create();

When passing the ownership of a ref-counted object to a function, use an rvalue reference with WTFMove (equivalent to std::move with some safety checks). When the caller is guaranteed to keep the object alive, use a regular reference instead. For example:

class B {
public:
    void setA(Ref<A>&& a) { m_a = WTFMove(a); }
private:
    Ref<A> m_a;
};

...

void createA(B& b) {
    b.setA(A::create());
}

Note that there is no WTFMove on A::create because its result is already a temporary (an rvalue), which binds directly to the Ref<A>&& parameter.

Forwarding ref and deref

As mentioned above, objects that are managed with RefPtr and Ref do not necessarily have to inherit from RefCounted. One common alternative is to forward ref and deref calls to another object which has the ownership. In the following example, the Parent class owns the Child class. When someone stores Child in Ref or RefPtr, the reference count of Parent is incremented and decremented on behalf of Child. Both Parent and Child are destructed when the last Ref or RefPtr to either object goes away.

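class Parent;

class Child {
public:
    // Defined below, once Parent is a complete type.
    void ref();
    void deref();

private:
    explicit Child(Parent& parent) : m_parent(parent) { }
    friend class Parent;

    Parent& m_parent;
};

class Parent : public RefCounted<Parent> {
public:
    static Ref<Parent> create() { return adoptRef(*new Parent); }

    Child& child()
    {
        // Child's constructor is private, so Parent (a friend) constructs it directly.
        if (!m_child)
            m_child = std::unique_ptr<Child>(new Child(*this));
        return *m_child;
    }

private:
    std::unique_ptr<Child> m_child;
};

// Forward reference counting to the owning Parent.
inline void Child::ref() { m_parent.ref(); }
inline void Child::deref() { m_parent.deref(); }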
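The effect of forwarding can be checked with a small standalone sketch that does not depend on WTF. SketchParent and SketchChild are hypothetical names, and the hand-written counter is a stand-in for RefCounted.

```cpp
#include <cassert>
#include <memory>

class SketchChild;

class SketchParent {
public:
    explicit SketchParent(bool& destroyed);
    ~SketchParent();

    void ref() { ++m_refCount; }
    void deref() { if (!--m_refCount) delete this; }
    SketchChild& child();

private:
    unsigned m_refCount { 1 };
    bool& m_destroyed;
    std::unique_ptr<SketchChild> m_child;
};

class SketchChild {
public:
    explicit SketchChild(SketchParent& parent) : m_parent(parent) { }
    // The child has no counter of its own; it forwards to its owner.
    void ref() { m_parent.ref(); }
    void deref() { m_parent.deref(); }

private:
    SketchParent& m_parent;
};

SketchParent::SketchParent(bool& destroyed)
    : m_destroyed(destroyed)
    , m_child(std::make_unique<SketchChild>(*this))
{
}

SketchParent::~SketchParent() { m_destroyed = true; }

SketchChild& SketchParent::child() { return *m_child; }

// Returns true if a reference taken through the child keeps the parent alive.
bool childKeepsParentAlive()
{
    bool destroyed = false;
    auto* parent = new SketchParent(destroyed); // count == 1
    SketchChild& child = parent->child();
    child.ref();     // forwarded: parent count == 2
    parent->deref(); // original owner lets go: count == 1
    bool survived = !destroyed;
    child.deref();   // last reference: parent and child are both deleted
    return survived && destroyed;
}
```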

Reference Cycles

A reference cycle occurs when an object X holds a Ref or RefPtr to another object Y, which in turn holds a Ref or RefPtr to X. For example, the following code causes a trivial memory leak because A holds a strong reference to B, and B in turn holds a Ref to A:

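class B;

class A : public RefCounted<A> {
public:
    static Ref<A> create() { return adoptRef(*new A); }
    B& b(); // Defined below, once B is a complete type.
private:
    RefPtr<B> m_b; // Created lazily, so it must be nullable.
};

class B : public RefCounted<B> {
public:
    static Ref<B> create(A& a) { return adoptRef(*new B(a)); }

private:
    B(A& a) : m_a(a) { }
    Ref<A> m_a; // Strong reference back to A: this completes the cycle.
};

inline B& A::b()
{
    if (!m_b)
        m_b = B::create(*this);
    return *m_b;
}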
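The leak can be observed in a standalone sketch. It uses std::shared_ptr and std::weak_ptr as stand-ins for strong and weak references (an assumption made for portability; they are not WebKit types), and CycleA, CycleB and liveObjectsAfterScope are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct CycleB;

struct CycleA {
    explicit CycleA(int& live) : m_live(live) { ++m_live; }
    ~CycleA() { --m_live; }
    std::shared_ptr<CycleB> b;
    int& m_live;
};

struct CycleB {
    explicit CycleB(int& live) : m_live(live) { ++m_live; }
    ~CycleB() { --m_live; }
    std::shared_ptr<CycleA> strongA; // Forms a cycle when set.
    std::weak_ptr<CycleA> weakA;     // Does not keep A alive.
    int& m_live;
};

// Returns how many objects are still alive after all outside references are gone.
int liveObjectsAfterScope(bool useStrongBackReference)
{
    int live = 0;
    CycleA* rawA = nullptr;
    {
        auto a = std::make_shared<CycleA>(live);
        rawA = a.get();
        a->b = std::make_shared<CycleB>(live);
        if (useStrongBackReference)
            a->b->strongA = a;
        else
            a->b->weakA = a;
    }
    int stillAlive = live;
    if (stillAlive) {
        // Break the cycle by hand so that the sketch itself does not leak.
        auto lastReference = std::move(rawA->b->strongA);
    }
    return stillAlive;
}
```

With the strong back reference, both objects outlive every outside reference; with the weak one, both are destroyed as soon as the scope ends.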

We need to be particularly careful in WebCore with regard to garbage collected objects because they often keep other ref counted C++ objects alive without having any Ref or RefPtr in C++ code. Because of this, it’s almost always incorrect to strongly keep a JS value alive in WebCore code.

ProtectedThis Pattern

Because many objects in WebCore are managed by tree data structures, a function that operates on a node of such a tree data structure can end up deleting itself (this object). This is highly undesirable as such code often ends up having a use-after-free bug.

To prevent these kinds of bugs, we often add a local variable named protectedThis, of type Ref or RefPtr, and store this object in it as follows:

ExceptionOr<void> ContainerNode::removeChild(Node& oldChild)
{
    // Check that this node is not "floating".
    // If it is, it can be deleted as a side effect of sending mutation events.
    ASSERT(refCount() || parentOrShadowHostNode());

    Ref<ContainerNode> protectedThis(*this);

    // NotFoundError: Raised if oldChild is not a child of this node.
    if (oldChild.parentNode() != this)
        return Exception { NotFoundError };

    if (!removeNodeWithScriptAssertion(oldChild, ChildChange::Source::API))
        return Exception { NotFoundError };

    rebuildSVGExtensionsElementsIfNecessary();
    dispatchSubtreeModifiedEvent();

    return { };
}

In this code, the act of removing oldChild can execute arbitrary JavaScript and delete this object. Without protectedThis, rebuildSVGExtensionsElementsIfNecessary or dispatchSubtreeModifiedEvent might be called after this object has already been freed. protectedThis guarantees that this object’s reference count is at least 1 until the function returns, because Ref’s constructor increments the reference count by 1.
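The mechanism can be reduced to a standalone sketch. SketchNode, Protector and runAndTouch are hypothetical names; Protector plays the role that Ref plays in the WebKit code above.

```cpp
#include <cassert>
#include <functional>

class SketchNode {
public:
    explicit SketchNode(bool& destroyed) : m_destroyed(destroyed) { }
    ~SketchNode() { m_destroyed = true; }

    void ref() { ++m_refCount; }
    void deref() { if (!--m_refCount) delete this; }

    // Runs a callback that may drop the last outside reference, then keeps using `this`.
    int runAndTouch(const std::function<void()>& callback)
    {
        Protector protectedThis(*this); // count is now at least 1 until we return
        callback();
        return ++m_touchCount;          // safe: the object is still alive here
    }

private:
    // Minimal RAII guard: ref() on construction, deref() on destruction.
    struct Protector {
        explicit Protector(SketchNode& node) : m_node(node) { m_node.ref(); }
        ~Protector() { m_node.deref(); }
        SketchNode& m_node;
    };

    unsigned m_refCount { 1 };
    int m_touchCount { 0 };
    bool& m_destroyed;
};

// Returns true if destruction is deferred until runAndTouch has finished.
bool protectedThisDefersDestruction()
{
    bool destroyed = false;
    auto* node = new SketchNode(destroyed);
    bool destroyedInsideCallback = true;
    int touched = node->runAndTouch([&] {
        node->deref(); // the only outside reference goes away mid-call
        destroyedInsideCallback = destroyed;
    });
    return touched == 1 && !destroyedInsideCallback && destroyed;
}
```

Without the guard, the deref() inside the callback would delete the node and the later member access would be a use-after-free.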

This pattern can be used for other objects that need to be protected from destruction inside a code block. In the following code, childToRemove was passed in using a C++ reference. Because this function is going to remove this child node from this container node, the child can get destructed while the function is still running. To rule out use-after-free bugs, this function stores it in a Ref (protectedChildToRemove), which guarantees that the object stays alive until the function returns control to the caller:

ALWAYS_INLINE bool ContainerNode::removeNodeWithScriptAssertion(Node& childToRemove, ChildChangeSource source)
{
    Ref<Node> protectedChildToRemove(childToRemove);
    ASSERT_WITH_SECURITY_IMPLICATION(childToRemove.parentNode() == this);
    {
        ScriptDisallowedScope::InMainThread scriptDisallowedScope;
        ChildListMutationScope(*this).willRemoveChild(childToRemove);
    }
    ..

Also see Darin’s RefPtr Basics for further reading.

Weak Pointers in WebKit

In some cases, it’s desirable to express a relationship between two objects without necessarily tying their lifetimes together. In those cases, WeakPtr is useful. Like std::weak_ptr, this class creates a non-owning reference to an object. There is a lot of legacy code which uses a raw pointer for this purpose, but there is an ongoing effort to always use WeakPtr instead, so use WeakPtr in any new code you write.

To create a WeakPtr to an object, we need to make its class inherit from CanMakeWeakPtr as follows:

class A : public CanMakeWeakPtr<A> { };

...

void foo(A& a) {
    WeakPtr<A> weakA = a;
}

Dereferencing a WeakPtr will return nullptr once the referenced object has been deleted. Because creating a WeakPtr allocates an extra WeakPtrImpl object, you’re still responsible for disposing of the WeakPtr at an appropriate time.
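One way to picture this is a small shared control block that the object clears when it is destroyed. The sketch below is an assumption about the general shape only: SketchWeakImpl, SketchCanMakeWeak, SketchWeakPtr and Widget are hypothetical names, std::shared_ptr is used to share the block, and WebKit's actual WeakPtrImpl differs in detail.

```cpp
#include <cassert>
#include <memory>

// Hypothetical control block shared by the object and all weak pointers to it.
template<typename T>
struct SketchWeakImpl {
    T* object;
};

template<typename T>
class SketchCanMakeWeak {
public:
    std::shared_ptr<SketchWeakImpl<T>> weakImpl()
    {
        if (!m_impl) // Allocated lazily, on the first weak pointer.
            m_impl = std::make_shared<SketchWeakImpl<T>>(SketchWeakImpl<T> { static_cast<T*>(this) });
        return m_impl;
    }

protected:
    ~SketchCanMakeWeak()
    {
        if (m_impl)
            m_impl->object = nullptr; // Weak pointers now observe nullptr.
    }

private:
    std::shared_ptr<SketchWeakImpl<T>> m_impl;
};

template<typename T>
class SketchWeakPtr {
public:
    SketchWeakPtr(T& object) : m_impl(object.weakImpl()) { }
    T* get() const { return m_impl->object; }

private:
    std::shared_ptr<SketchWeakImpl<T>> m_impl; // Keeps the block, not the object, alive.
};

struct Widget : SketchCanMakeWeak<Widget> {
    explicit Widget(int v) : value(v) { }
    int value;
};

// Returns true if the weak pointer sees the object first and nullptr after deletion.
bool weakPtrClearsOnDestruction()
{
    auto widget = std::make_unique<Widget>(42);
    SketchWeakPtr<Widget> weak(*widget);
    bool sawObject = weak.get() && weak.get()->value == 42;
    widget.reset(); // Destroys the Widget; the control block survives.
    return sawObject && !weak.get();
}
```

In this sketch the control block lives for as long as any weak pointer does, which is why a weak pointer that is never disposed of still costs memory.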

WeakHashSet

While ordinary HashSet does not support having WeakPtr as its elements, there is a specialized WeakHashSet class, which supports referencing a set of elements weakly. Because WeakHashSet does not get notified when the referenced object is deleted, the users / owners of WeakHashSet are still responsible for deleting the relevant entries from the set. Otherwise, WeakHashSet will hold onto WeakPtrImpl until computeSize is called or rehashing happens.
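The lingering-entry behavior can be illustrated with a standalone sketch. It is not a hash set and does not use WTF: SketchWeakSet is a hypothetical vector of std::weak_ptr entries that are pruned only when computeSize is called, which mirrors the cleanup point named above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

template<typename T>
class SketchWeakSet {
public:
    void add(const std::shared_ptr<T>& object) { m_entries.push_back(object); }

    // Number of stored entries, including stale ones.
    std::size_t rawEntryCount() const { return m_entries.size(); }

    // Drops stale entries, then reports how many live objects remain.
    std::size_t computeSize()
    {
        m_entries.erase(
            std::remove_if(m_entries.begin(), m_entries.end(),
                [](const std::weak_ptr<T>& entry) { return entry.expired(); }),
            m_entries.end());
        return m_entries.size();
    }

private:
    std::vector<std::weak_ptr<T>> m_entries;
};

// Returns true if a dead object's entry stays in the set until computeSize runs.
bool staleEntriesLingerUntilComputeSize()
{
    SketchWeakSet<int> set;
    auto kept = std::make_shared<int>(1);
    auto dropped = std::make_shared<int>(2);
    set.add(kept);
    set.add(dropped);
    dropped.reset(); // The set is not notified of this.
    bool lingered = set.rawEntryCount() == 2;
    return lingered && set.computeSize() == 1 && set.rawEntryCount() == 1;
}
```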

WeakHashMap

Like WeakHashSet, WeakHashMap is a specialized class to map a WeakPtr key with a value. Because WeakHashMap does not get notified when the referenced object is deleted, the users / owners of WeakHashMap are still responsible for deleting the relevant entries from the map. Otherwise, the memory used by WeakPtrImpl and its value will not be freed until the next rehash or amortized cleanup cycle (which is triggered based on the total number of read or write operations).

Reference Counting of DOM Nodes

Node is a reference counted object but with a twist. It has a separate boolean flag indicating whether it has a parent node or not. A Node object is not deleted so long as it has a reference count above 0 or this boolean flag is set. The boolean flag effectively functions as a RefPtr from a parent Node to each of its child Nodes. We do this because a Node only knows its first child and its last child, and sibling nodes are linked together as a doubly linked list to allow efficient insertion, removal, and traversal.
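The deletion rule described above (count is 0 and the parent flag is clear) can be sketched in isolation. SketchDomNode and its member names are hypothetical, and the real Node class packs this state differently.

```cpp
#include <cassert>

class SketchDomNode {
public:
    explicit SketchDomNode(bool& destroyed) : m_destroyed(destroyed) { }
    ~SketchDomNode() { m_destroyed = true; }

    void ref() { ++m_refCount; }
    void deref() { --m_refCount; deleteIfUnowned(); }

    // The flag stands in for a strong reference held by the parent.
    void didAttachToParent() { m_hasParent = true; }
    void didDetachFromParent() { m_hasParent = false; deleteIfUnowned(); }

private:
    void deleteIfUnowned()
    {
        if (!m_refCount && !m_hasParent)
            delete this;
    }

    unsigned m_refCount { 1 };
    bool m_hasParent { false };
    bool& m_destroyed;
};

// Returns true if the parent flag alone keeps the node alive.
bool parentFlagKeepsNodeAlive()
{
    bool destroyed = false;
    auto* node = new SketchDomNode(destroyed);
    node->didAttachToParent();
    node->deref();               // count == 0, but the flag is still set
    bool aliveWhileAttached = !destroyed;
    node->didDetachFromParent(); // flag cleared with count == 0: deleted
    return aliveWhileAttached && destroyed;
}
```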

Conceptually, each Node is kept alive by its root node and external references to it, and we use the root node as an opaque root of each Node's JS wrapper. Therefore the JS wrapper of each Node is kept alive as long as either the node itself or any other node which shares the same root node is visited by the garbage collector.

On the other hand, a Node does not keep its parent or any of its shadow-including ancestor Node alive either by reference counting or via the boolean flag even though the JavaScript API requires this to be the case. In order to implement this DOM API behavior, WebKit will create a JS wrapper for each Node which is being removed from its parent if there isn't already one. A Node which is a root node (of the newly removed subtree) is an opaque root of its JS wrapper, and the garbage collector will visit this opaque root if there is any JS wrapper in the removed subtree that needs to be kept alive. In effect, this keeps the new root node and all its descendant nodes alive if the newly removed subtree contains any node with a live JS wrapper, preserving the API contract.

It's important to recognize that storing a Ref or a RefPtr to another Node in a Node subclass or an object directly owned by the Node can create a reference cycle, or a reference that never gets cleared. It's not guaranteed that every node is disconnected from a Document at some point in the future, and some Nodes may always have a parent node or a child node for as long as they exist. The only permissible circumstance in which a Ref or a RefPtr to another Node can be stored in a Node subclass, or in other data structures owned by it, is when the reference is temporally limited. For example, it's okay to store a Ref or a RefPtr in an enqueued event loop task. In all other circumstances, WeakPtr should be used to reference another Node, and JS wrapper relationships such as opaque roots should be used to preserve the lifecycle ties between Node objects.

It's equally crucial to observe that keeping a C++ Node object alive by storing a Ref or RefPtr in an enqueued event loop task does not keep its JS wrapper alive, and can result in the JS wrapper of a conceptually live object being erroneously garbage collected. To avoid this problem, use GCReachableRef instead to temporarily hold a strong reference to a node over a period of time. For example, HTMLTextFormControlElement::scheduleSelectEvent() uses GCReachableRef to fire an event in an event loop task:

void HTMLTextFormControlElement::scheduleSelectEvent()
{
    document().eventLoop().queueTask(TaskSource::UserInteraction, [protectedThis = GCReachableRef { *this }] {
        protectedThis->dispatchEvent(Event::create(eventNames().selectEvent, Event::CanBubble::Yes, Event::IsCancelable::No));
    });
}

Alternatively, we can make it inherit from an active DOM object, and use one of the following functions to enqueue a task or an event:

The Document node has one more quirk, because every Node can access a document via its ownerDocument property whether or not the Node is connected to the document. Every document has a regular reference count used by external clients and a referencing node count. The referencing node count of a document is the total number of nodes whose ownerDocument is the document. A document is kept alive so long as either its reference count or its referencing node count is above 0. In addition, when the regular reference count is about to become 0, the document clears various states, including its internal references to the Nodes it owns, to sever any reference cycles with them. A document is special in the sense that it can store RefPtr to other nodes. Note that whilst the referencing node count acts like a Ref from each Node to its owner Document, storing a Ref or a RefPtr to the same document or any other document will create a reference cycle and should be avoided unless it's temporally limited as noted above.