Yjs Fundamentals - Part 1: Theory (2024)

Dovetail Engineering

Why we chose Yjs for Dovetail over other CRDTs

It’s super robust and fast. It allows collaborators to make conflicting edits, even offline, and always brings each collaborator’s view of the document back into sync, while preserving each collaborator’s intention. It can do this no matter what order edits are received, even if the same edit is accidentally received multiple times. It blows alternative CRDT technologies out of the water in terms of robustness and performance.

Using plain English terminology, Yjs treats “edits” as first-class citizens, rather than the “document.” Technically, it consists of a data structure for representing edits, an algorithm for assembling edits together into a document, and a protocol for sharing edits between clients. In addition, it also has a data structure and protocol for sharing which clients are viewing documents, which is called “awareness.”

Read on to build your intuition for the internals of Yjs.

Let’s start with the two concepts mentioned above: “documents” and “edits.”

Documents

The “document” is some form of data that a web editor renders and modifies — it could be a ProseMirror document, a note grid, a canvas board, etc. This is the concept that application engineers are concerned about when building features for customers.

Documents are ephemeral — The current state of a document is constructed in memory on the fly by assembling edits. It is not permanently stored anywhere. A combination of Yjs library code and our code provides a view of the current state of the document data with methods to edit it.

Everything is a list — Under the surface, Yjs represents all compound data types as a list. An array is a list of items. A map is a list of key-value pairs. Strings are a list of characters (but are usually broken up lazily). And then there are primitive values, like numbers and characters/strings. Yjs provides useful read/write interfaces to compose documents with these data types without being concerned about the underlying set of edits or their structure. For further information see this list of interfaces that Yjs provides.

Edits

An “edit” is any change to a document. In Yjs terminology, these are called “updates,” but here we’ll stick with the plain English term for this introductory post.

Edits are first-class citizens — They are data and are stored permanently by both clients and the server. This facilitates offline editing. We only need to be aware of the nature of edits when implementing Yjs services, such as storage persistence, and when implementing the edit-sharing protocol. If you are only working on web UI, you don’t need to think about this.

Once an edit is created, it can never be destroyed. They just accumulate over time. But identical edits can be de-duplicated, and content in an edit that has subsequently been removed from the document by another edit can be truncated.

Edits are only inserts or deletes — The information in an edit consists of the edit operation type, what content is added/removed, and the edited position in the document relative to other edits. I have added a reference table below with a more detailed description of the data structures.

If you add an item to a list, that is an insert (the name of the Yjs data structure is called Item). An insert has an id, content, and properties that store its position in the document. The positional properties allow inserts to be assembled like a doubly linked list. If two clients insert something in the same position at the same time, the new inserts will link to the same left and right inserts. In the next section, I’ll explain how these conflicts are reconciled by using the origin and originRight properties. Sometimes inserts have another reference called parent which is used for items in nested data types, like maps, to refer to their parent.

While normally every new character or other primitive inserted into a list would be its own insert, later on I will explain an optimization where some subsequent inserts are merged together into a “compound insert.”

If you remove some content, a delete edit is created. It represents what insert id should be hidden from the document, and length, which is how much of the content was deleted (in the case of “compound inserts”). Remember that edits are never destroyed, so creating a delete doesn’t remove the relevant insert. The insert isn’t ever entirely deleted so a user’s intent to add an item between A and B will be preserved, even if the content of B gets entirely removed. However, for performance reasons, the insert content will be truncated as much as possible to save space. If all of its content is removed, it becomes a “tombstone”. This is the same as a normal insert but only with an ID and positional properties.

See an example of creating a top-level data type here
And the resulting data output here

If you modify an item, that is represented by a delete + insert. There are two types of modifications: a modification of a whole insert, and a modification of a part insert. A whole modification is achieved by the delete.length covering the whole original insert.

See an example of modifying an item here
And the resulting data output here

Bonus Yjs optimization: compound inserts — If Yjs stored a distinct insert for every character in a 100,000-character document, the size of the document would blow up significantly. It would need to store the ID and position metadata for each character. It would also make the process of assembling the edits into a document very slow.

To solve this problem, Yjs does some clever stuff with inserts when content is appended. For example, if you typed the characters “ABC” in sequential order, what actually happens is that the second and third inserts are merged with the first one. The whole merging algorithm and context is too much to explain here, but if you’re interested you can take a look at the code here.

Another interesting optimization is that if you delete part of the content in the compound insert, the insert gets split up. Yjs has figured out how to make this work correctly in practice by only applying them in the right conditions. This surprised me because I thought that edits were immutable. Make sure that the code you write takes advantage of these optimizations, and doesn’t break them by making assumptions about the nature of edits.

See an example of inserting subsequent characters here
And the results data output here

inserts are sorted in order to link up left and right references. deletes simply get applied to the referenced insert’s content. The only conflicts that arise from the Yjs model of edits boil down to insertion order conflicts. If two clients try to insert content between the same two inserts, how should they be ordered? No conflicts arise from deletes.

The goal that the Yjs algorithm aims for is to preserve the user’s intention. Their intention can be thought of as “I want to insert this content between this left and this right item, but if someone else has already inserted something here, lean towards the origin item.” If both inserts have the same origin, then the inserts are ordered by client_id.

The following diagram shows how an insertion order conflict is resolved by using just one origin property. In reality, Yjs uses both origin and originRight.

This is where Yjs shines. Other technologies have a lot of conflicts to resolve, require expensive algorithms to reconcile or struggle when clients are very out of sync. The simplicity of the Yjs data model makes this really simple and cheap.

Stay tuned for further posts on how the sync protocol works, how awareness works, how our Yjs architecture works, and some optimizations we’ve made at Dovetail.

Yjs Fundamentals - Part 1: Theory (2024)

Why we chose Yjs for Dovetail over other CRDTs

Documents

Edits

Further reading

References