DOM¶

A deep dive into the Document Object Model.

Introduction¶

The Document Object Model (often abbreviated as DOM) is the tree data structure that results from parsing HTML. It consists of one or more instances of subclasses of Node and represents the document tree structure. Parsing a simple HTML document like this:

<!DOCTYPE html>
<html>
<body>hi</body>
</html>

will generate the following six distinct DOM nodes:

Document
- DocumentType
- HTMLHtmlElement
  - HTMLHeadElement
  - HTMLBodyElement
    - Text with the value of “hi”

Note that HTMLHeadElement (i.e. <head>) is created implicitly by WebKit, per the way the HTML parser is specified.

Broadly speaking, DOM nodes divide into the following categories:

Container nodes such as Document, Element, and DocumentFragment.
Leaf nodes such as DocumentType, Text, and Attr.
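
As an illustration of the container/leaf split, here is a minimal standalone sketch that rebuilds the six-node tree from the example above. `ToyNode`, `ToyContainerNode`, `countNodes`, and `buildExampleTree` are hypothetical names invented for this sketch; they are not WebKit's actual classes and omit nearly everything the real Node hierarchy does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration; NOT WebKit's real Node classes.
struct ToyNode {
    explicit ToyNode(std::string nodeName) : name(std::move(nodeName)) { }
    virtual ~ToyNode() = default;
    virtual bool isContainer() const { return false; } // Leaf by default.
    std::string name;
};

struct ToyContainerNode : ToyNode {
    using ToyNode::ToyNode;
    bool isContainer() const override { return true; }

    // Takes ownership of the child and returns a reference to it.
    template<typename T>
    T& append(std::unique_ptr<T> child) {
        T& ref = *child;
        children.push_back(std::move(child));
        return ref;
    }

    std::vector<std::unique_ptr<ToyNode>> children;
};

// Counts a node plus all of its descendants.
inline std::size_t countNodes(const ToyNode& node) {
    std::size_t total = 1;
    if (node.isContainer()) {
        for (const auto& child : static_cast<const ToyContainerNode&>(node).children)
            total += countNodes(*child);
    }
    return total;
}

// Builds the tree shown for the "hi" document, including the implicit <head>.
inline std::unique_ptr<ToyContainerNode> buildExampleTree() {
    auto document = std::make_unique<ToyContainerNode>("Document");
    document->append(std::make_unique<ToyNode>("DocumentType"));
    auto& html = document->append(std::make_unique<ToyContainerNode>("HTMLHtmlElement"));
    html.append(std::make_unique<ToyContainerNode>("HTMLHeadElement"));
    auto& body = html.append(std::make_unique<ToyContainerNode>("HTMLBodyElement"));
    body.append(std::make_unique<ToyNode>("Text"));
    return document;
}
```

Calling `countNodes(*buildExampleTree())` walks the whole toy tree. Only container nodes carry a child list, which mirrors why leaf nodes such as Text can never have children.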

A Document node, as the name suggests, represents a single HTML, SVG, MathML, or other XML document, and is the owner of every node in the document. It is the very first node in any document that gets created and the very last node to be destroyed.

Note that a single web page may consist of multiple documents, since iframe and object elements may contain a child frame and so form a frame tree. Because JavaScript can open a new window under a user gesture and retain access back to its opener, multiple web pages across multiple tabs might be able to communicate with one another via JavaScript APIs such as postMessage.

Node’s Type and State flags¶

Each node has a set of TypeFlag values, which are set at construction time and are immutable, and a set of StateFlag values, which can be set or unset throughout the Node’s lifetime. Node also makes use of EventTargetFlag to indicate ownership and relationships with other objects. For example, TypeFlag::IsElement is set whenever a Node is a subclass of Element. StateFlag::IsParsingChildren is set whenever a Node’s child nodes are being parsed. EventTargetFlag::IsConnected is set whenever a Node is connected. These flags are updated by each subclass of Node throughout its lifetime. Note that each flag is set or unset within a specific function. For example, EventTargetFlag::IsConnected is set in Node::insertedIntoAncestor. This means that any code which runs before Node::insertedIntoAncestor has run on a given Node will observe an outdated value of EventTargetFlag::IsConnected.
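
The timing hazard described above can be sketched with a small standalone model. Everything here (`ToyNode`, the `Toy*Flag` enums, `linkUnderConnectedParent`) is a hypothetical simplification for illustration, not WebKit's actual flag storage. It assumes only what the text states: type flags are fixed at construction, and the connected flag is refreshed inside `insertedIntoAncestor`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag enums; the real WebKit enums have many more members.
enum class ToyTypeFlag : std::uint8_t { IsElement = 1 << 0, IsText = 1 << 1 };
enum class ToyStateFlag : std::uint8_t { IsParsingChildren = 1 << 0 };
enum class ToyEventTargetFlag : std::uint8_t { IsConnected = 1 << 0 };

class ToyNode {
public:
    explicit ToyNode(ToyTypeFlag type) : m_typeFlags(static_cast<std::uint8_t>(type)) { }

    bool hasTypeFlag(ToyTypeFlag flag) const { return (m_typeFlags & static_cast<std::uint8_t>(flag)) != 0; }
    bool hasStateFlag(ToyStateFlag flag) const { return (m_stateFlags & static_cast<std::uint8_t>(flag)) != 0; }
    bool isConnected() const { return (m_eventTargetFlags & static_cast<std::uint8_t>(ToyEventTargetFlag::IsConnected)) != 0; }

    // State flags may be toggled at any point in the node's lifetime.
    void setStateFlag(ToyStateFlag flag, bool value) {
        if (value)
            m_stateFlags |= static_cast<std::uint8_t>(flag);
        else
            m_stateFlags &= static_cast<std::uint8_t>(~static_cast<std::uint8_t>(flag));
    }

    // Structural step: the node now sits under a connected parent, but the
    // IsConnected flag has not been refreshed yet.
    void linkUnderConnectedParent() { m_parentIsConnected = true; }

    // Notification step: only here does the flag catch up with the tree.
    void insertedIntoAncestor() {
        if (m_parentIsConnected)
            m_eventTargetFlags |= static_cast<std::uint8_t>(ToyEventTargetFlag::IsConnected);
    }

private:
    const std::uint8_t m_typeFlags; // Immutable after construction.
    std::uint8_t m_stateFlags = 0;
    std::uint8_t m_eventTargetFlags = 0;
    bool m_parentIsConnected = false;
};

// Returns true if the flag was stale before the notification and fresh after.
inline bool observesStaleConnectedFlag() {
    ToyNode node(ToyTypeFlag::IsElement);
    node.linkUnderConnectedParent();
    bool before = node.isConnected(); // Still false: notification has not run.
    node.insertedIntoAncestor();
    bool after = node.isConnected();
    return !before && after;
}
```

Between `linkUnderConnectedParent` and `insertedIntoAncestor`, the toy node is structurally in the tree but still reports itself as disconnected, which is the window the paragraph warns about.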

Insertion and Removal of DOM Nodes¶

In order to construct a DOM tree, we create a DOM Node and insert it into a ContainerNode such as Document or Element. The insertion of a node starts with validation, followed by removal of the node from its old parent if there is one. Either of these two steps can synchronously execute JavaScript via mutation events and can therefore synchronously mutate the tree’s state. Because of that, we need to check the validity again before proceeding with the insertion.
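
The need for the second validity check can be shown with a toy model. In this sketch, `ToyNode`, `isValidInsertion`, `detach`, and `appendChild` are hypothetical names, the only validity rule modelled is "a node may not become its own ancestor", and the `script` callback stands in for author JavaScript run by a mutation event.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical toy node; pointers are non-owning to keep the sketch short.
struct ToyNode {
    ToyNode* parent { nullptr };
    std::vector<ToyNode*> children;
};

// Valid only if `child` is not `parent` itself or one of its ancestors.
inline bool isValidInsertion(const ToyNode& parent, const ToyNode& child) {
    for (const ToyNode* ancestor = &parent; ancestor; ancestor = ancestor->parent) {
        if (ancestor == &child)
            return false;
    }
    return true;
}

inline void detach(ToyNode& child) {
    if (!child.parent)
        return;
    auto& siblings = child.parent->children;
    siblings.erase(std::remove(siblings.begin(), siblings.end(), &child), siblings.end());
    child.parent = nullptr;
}

// `script` models author code that may run while the old parent is updated.
inline bool appendChild(ToyNode& parent, ToyNode& child, const std::function<void()>& script) {
    if (!isValidInsertion(parent, child))
        return false; // First validation.
    if (child.parent) {
        script(); // May mutate the tree arbitrarily.
        detach(child);
    }
    // Second validation: the tree may no longer look the way it did above.
    if (child.parent || !isValidInsertion(parent, child))
        return false;
    parent.children.push_back(&child);
    child.parent = &parent;
    return true;
}

// Script moves `b` under `c` while `c` is being moved under `b`.
inline bool scriptInvalidatesInsertion() {
    ToyNode a, b, c;
    appendChild(a, c, [] { });
    bool inserted = appendChild(b, c, [&] { appendChild(c, b, [] { }); });
    return !inserted && b.parent == &c && !c.parent;
}
```

Without the second check, the last call would make `b` and `c` each other's parent and create a cycle; with it, the insertion is rejected after the script has run.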

An actual insertion of a DOM Node is implemented using executeNodeInsertionWithScriptAssertion or executeParserNodeInsertionIntoIsolatedTreeWithoutNotifyingParent. To start off, these functions instantiate a RAII-style ScriptDisallowedScope object, which forbids JavaScript execution during its lifetime, do the insertion, and then notify the child and its descendants with insertedIntoAncestor.

Note that insertedIntoAncestor can be called either when a given Node becomes connected to a Document or when it is inserted into a disconnected subtree. It is not correct to assume that this Node is always connected to a Document in insertedIntoAncestor. To run code only when a Node becomes connected to a document, check InsertionType’s connectedToDocument boolean. It is also not necessarily true that this Node’s immediate parent node changed; it could be this Node’s ancestor that got inserted into a new parent. To run code only when this Node’s immediate parent has changed, check whether the node’s parent node matches parentOfInsertedTree. There are also cases in which code must run whenever the Node’s TreeScope (ShadowRoot or Document) has changed. In this case, check InsertionType’s treeScopeChanged boolean.

In all cases, it is vital that no code invoked by insertedIntoAncestor attempts to execute JavaScript synchronously, for example, by dispatching an event. Doing so will result in a release assert (i.e. a crash). If an element must dispatch events or otherwise execute arbitrary author JavaScript, return NeedsPostInsertionCallback from insertedIntoAncestor. This will result in a call to didFinishInsertingNode which, unlike insertedIntoAncestor, allows script execution (it gets called only after ScriptDisallowedScope has gone out of scope). But note that the tree’s state may have been mutated by other scripts between the call to insertedIntoAncestor and the call to didFinishInsertingNode, so it is not safe to assume that any tree state condition which was true during insertedIntoAncestor is still true in didFinishInsertingNode. It is also not safe to leave a Node in an inconsistent state at the end of insertedIntoAncestor, because JavaScript may invoke any API on such a Node between insertedIntoAncestor and didFinishInsertingNode.

After invoking insertedIntoAncestor, these functions invoke childrenChanged on the new parent. This function is the first opportunity to execute any JavaScript in response to a child node being inserted. HTMLScriptElement, for example, may execute its script in its childrenChanged. Finally, the functions invoke didFinishInsertingNode on Nodes which returned NeedsPostInsertionCallback from their insertedIntoAncestor and trigger mutation events such as DOMNodeInsertedEvent.
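
The ordering described above (notify inside the script-free scope, then childrenChanged, then the post-insertion callback, then mutation events) can be sketched as a standalone model. All `Toy*` names, `runScript`, and `insertAndNotify` are hypothetical stand-ins for this sketch. In particular, the real engine hits a release assert when script runs inside the scope, whereas this sketch throws an exception so that the failure can be observed.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for ScriptDisallowedScope: script is forbidden while
// at least one instance is alive.
class ToyScriptDisallowedScope {
public:
    ToyScriptDisallowedScope() { ++depth(); }
    ~ToyScriptDisallowedScope() { --depth(); }
    ToyScriptDisallowedScope(const ToyScriptDisallowedScope&) = delete;
    ToyScriptDisallowedScope& operator=(const ToyScriptDisallowedScope&) = delete;
    static bool isScriptAllowed() { return depth() == 0; }
private:
    static int& depth() { static int value = 0; return value; }
};

// Stand-in for anything that may run author JavaScript.
inline void runScript(std::vector<std::string>& log, const std::string& what) {
    if (!ToyScriptDisallowedScope::isScriptAllowed())
        throw std::logic_error("script run while disallowed: " + what);
    log.push_back(what);
}

struct ToyInsertionType {
    bool connectedToDocument;
    bool treeScopeChanged;
};

enum class ToyInsertedResult { Done, NeedsPostInsertionCallback };

struct ToyElement {
    bool connected = false;

    // Runs inside the scope: update internal state only, never run script.
    ToyInsertedResult insertedIntoAncestor(ToyInsertionType type) {
        if (!type.connectedToDocument)
            return ToyInsertedResult::Done; // Inserted into a detached subtree.
        connected = true;
        return ToyInsertedResult::NeedsPostInsertionCallback;
    }

    // Runs after the scope has ended, so script is allowed again.
    void didFinishInsertingNode(std::vector<std::string>& log) {
        runScript(log, "didFinishInsertingNode");
    }
};

// Returns the script-visible steps in the order they ran.
inline std::vector<std::string> insertAndNotify(ToyElement& element, ToyInsertionType type) {
    std::vector<std::string> log;
    bool needsCallback = false;
    {
        ToyScriptDisallowedScope scope;
        // The actual tree mutation would happen here.
        needsCallback = element.insertedIntoAncestor(type) == ToyInsertedResult::NeedsPostInsertionCallback;
    }
    runScript(log, "childrenChanged");
    if (needsCallback)
        element.didFinishInsertingNode(log);
    runScript(log, "DOMNodeInsertedEvent");
    return log;
}
```

The toy element requests the post-insertion callback only when `connectedToDocument` is true, so inserting it into a detached subtree produces just the childrenChanged and mutation event steps.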

The removal of a DOM Node from its parent is implemented using ContainerNode::removeAllChildrenWithScriptAssertion and ContainerNode::removeChildWithScriptAssertion. These functions first dispatch mutation events and check whether the child’s parent is still the same container node. If it is not, they stop and exit early. Next, they disconnect any subframes in the subtree to be removed. They then instantiate a RAII-style ScriptDisallowedScope object, which forbids JavaScript execution during its lifetime just like the insertion counterparts, and notify Document of the node’s removal so that objects such as NodeIterator and Range can be updated. The functions then do the removal and notify the child and its descendants with removedFromAncestor.

Note that removedFromAncestor can be called either when a given Node becomes disconnected from a Document or when it is removed from an already disconnected subtree. It is not correct to assume that this Node used to be connected to a Document in removedFromAncestor. To run code only when a Node becomes disconnected from a document, check RemovalType’s disconnectedFromDocument boolean. It is also not necessarily true that this Node’s immediate parent node changed; it could be this Node’s ancestor that got removed from its old parent. To run code only when this Node’s immediate parent has changed, check whether the node’s parent node is nullptr. To run code whenever the Node’s TreeScope (ShadowRoot or Document) has changed, check RemovalType’s treeScopeChanged boolean.

In all cases, it is vital that no code invoked by removedFromAncestor attempts to execute JavaScript synchronously, for example, by dispatching an event. Doing so will result in a release assert (i.e. a crash). If an element must dispatch events or otherwise execute arbitrary author JavaScript, queue a task to do so. After invoking removedFromAncestor, these functions invoke childrenChanged on the old parent.
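
The removal-side checks can be sketched the same way. In this standalone model, `ToyNode`, `ToyRemovalType`, `ToyTaskQueue`, `removeChild`, and `demoRemoval` are hypothetical names. The sketch assumes only what the text states: the removed node sees a null parent, its descendants keep theirs, and any event dispatch is deferred by queueing a task. The script-disallowed scope itself is not modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct ToyRemovalType {
    bool disconnectedFromDocument;
    bool treeScopeChanged;
};

using ToyTaskQueue = std::queue<std::function<void()>>;

// Hypothetical toy node; pointers are non-owning to keep the sketch short.
struct ToyNode {
    explicit ToyNode(std::string nodeName) : name(std::move(nodeName)) { }

    // Runs while script is disallowed: record state and defer any events.
    void removedFromAncestor(ToyRemovalType type, ToyTaskQueue& tasks, std::vector<std::string>& events) {
        immediateParentChanged = !parent; // Only true for the removed root.
        if (type.disconnectedFromDocument)
            tasks.push([this, &events] { events.push_back("disconnected:" + name); });
    }

    std::string name;
    ToyNode* parent = nullptr;
    std::vector<ToyNode*> children;
    bool immediateParentChanged = false;
};

inline void notifySubtree(ToyNode& node, ToyRemovalType type, ToyTaskQueue& tasks, std::vector<std::string>& events) {
    node.removedFromAncestor(type, tasks, events);
    for (ToyNode* child : node.children)
        notifySubtree(*child, type, tasks, events);
}

inline void removeChild(ToyNode& parent, ToyNode& child, bool parentIsConnected, ToyTaskQueue& tasks, std::vector<std::string>& events) {
    auto& siblings = parent.children;
    siblings.erase(std::remove(siblings.begin(), siblings.end(), &child), siblings.end());
    child.parent = nullptr;
    notifySubtree(child, { parentIsConnected, false }, tasks, events);
}

struct ToyRemovalResult {
    bool removedRootParentChanged;
    bool descendantParentChanged;
    std::size_t eventsBeforeTasksRan;
    std::vector<std::string> events;
};

// Removes <div><span></span></div> from a connected <body>.
inline ToyRemovalResult demoRemoval() {
    ToyNode body("body"), div("div"), span("span");
    body.children.push_back(&div);
    div.parent = &body;
    div.children.push_back(&span);
    span.parent = &div;

    ToyTaskQueue tasks;
    std::vector<std::string> events;
    removeChild(body, div, true, tasks, events);
    std::size_t eventsBeforeTasksRan = events.size();

    // Later, once script is allowed again, the queued tasks run.
    while (!tasks.empty()) {
        tasks.front()();
        tasks.pop();
    }
    return { div.immediateParentChanged, span.immediateParentChanged, eventsBeforeTasksRan, events };
}
```

In `demoRemoval`, both nodes are notified that they were disconnected, but only the removed root observes a null parent, and no event is recorded until the queued tasks run.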

Additionally, certain StateFlag and EventTargetFlag values might be outdated in insertedIntoAncestor and removedFromAncestor. For example, the EventTargetFlag::IsConnected flag is not set or unset until Node::insertedIntoAncestor or Node::removedFromAncestor is called. Accessing other nodes’ state and member functions is even trickier. Because insertedIntoAncestor or removedFromAncestor may not yet have been called on such nodes, functions like getElementById and rootNode will return wrong results for those nodes. Code which runs inside these functions must carefully avoid these pitfalls.