Published in · 7 min read · Oct 27, 2022
--
By Peter Wooden — Software Engineer
Dovetail uses Yjs to create a real-time collaborative editing experience for our View and Canvas features. It’s a complex technology and there’s no one-stop guide to understand how it works — to understand it myself I had to read the docs, academic articles, and source code to put together a mental model. I’m writing this guide to help everyone else build their mental model with less effort. After reading this series you should be better able to make performance improvements and debug issues with your Yjs implementation. This post will focus on internal Yjs theory and concepts.
Why we chose Yjs for Dovetail over other CRDTs
It’s super robust and fast. It allows collaborators to make conflicting edits, even offline, and always brings each collaborator’s view of the document back into sync, while preserving each collaborator’s intention. It can do this no matter what order edits are received, even if the same edit is accidentally received multiple times. It blows alternative CRDT technologies out of the water in terms of robustness and performance.
Using plain English terminology, Yjs treats “edits” as first-class citizens, rather than the “document.” Technically, it consists of a data structure for representing edits, an algorithm for assembling edits together into a document, and a protocol for sharing edits between clients. In addition, it also has a data structure and protocol for sharing which clients are viewing documents, which is called “awareness.”
Read on to build your intuition for the internals of Yjs.
Let’s start with the two concepts mentioned above: “documents” and “edits.”
Documents
The “document” is some form of data that a web editor renders and modifies — it could be a ProseMirror document, a note grid, a canvas board, etc. This is the concept that application engineers are concerned about when building features for customers.
Documents are ephemeral — The current state of a document is constructed in memory on the fly by assembling edits. It is not permanently stored anywhere. A combination of Yjs library code and our code provides a view of the current state of the document data with methods to edit it.
Everything is a list — Under the surface, Yjs represents all compound data types as a list. An array is a list of items. A map is a list of key-value pairs. Strings are a list of characters (but are usually broken up lazily). And then there are primitive values, like numbers and characters/strings. Yjs provides useful read/write interfaces to compose documents with these data types without being concerned about the underlying set of edits or their structure. For further information see this list of interfaces that Yjs provides.
Edits
An “edit” is any change to a document. In Yjs terminology, these are called “updates,” but here we’ll stick with the plain English term for this introductory post.
Edits are first-class citizens — They are data and are stored permanently by both clients and the server. This facilitates offline editing. We only need to be aware of the nature of edits when implementing Yjs services, such as storage persistence, and when implementing the edit-sharing protocol. If you are only working on web UI, you don’t need to think about this.
Once an edit is created, it can never be destroyed. They just accumulate over time. But identical edits can be de-duplicated, and content in an edit that has subsequently been removed from the document by another edit can be truncated.
Edits are only inserts
or deletes
— The information in an edit consists of the edit operation type, what content is added/removed, and the edited position in the document relative to other edits. I have added a reference table below with a more detailed description of the data structures.
If you add an item to a list, that is an insert
(the name of the Yjs data structure is called Item
). An insert
has an id
, content
, and properties that store its position in the document. The positional properties allow inserts
to be assembled like a doubly linked list. If two clients insert something in the same position at the same time, the new inserts
will link to the same left
and right
inserts
. In the next section, I’ll explain how these conflicts are reconciled by using the origin
and originRight
properties. Sometimes inserts
have another reference called parent
which is used for items in nested data types, like maps, to refer to their parent.
While normally every new character or other primitive inserted into a list would be its own insert
, later on I will explain an optimization where some subsequent inserts are merged together into a “compound insert.”
If you remove some content, a delete
edit is created. It represents what insert
id
should be hidden from the document, and length
, which is how much of the content was deleted (in the case of “compound inserts”). Remember that edits are never destroyed, so creating a delete
doesn’t remove the relevant insert
. The insert
isn’t ever entirely deleted so a user’s intent to add an item between A and B will be preserved, even if the content of B gets entirely removed. However, for performance reasons, the insert
content will be truncated as much as possible to save space. If all of its content is removed, it becomes a “tombstone”. This is the same as a normal insert
but only with an ID and positional properties.
- See an example of creating a top-level data type here
- And the resulting data output here
If you modify an item, that is represented by a delete
+ insert
. There are two types of modifications: a modification of a whole insert, and a modification of a part insert. A whole modification is achieved by the delete.length
covering the whole original insert
.
- See an example of modifying an item here
- And the resulting data output here
Bonus Yjs optimization: compound inserts
— If Yjs stored a distinct insert
for every character in a 100,000-character document, the size of the document would blow up significantly. It would need to store the ID and position metadata for each character. It would also make the process of assembling the edits into a document very slow.
To solve this problem, Yjs does some clever stuff with inserts
when content is appended. For example, if you typed the characters “ABC” in sequential order, what actually happens is that the second and third inserts
are merged with the first one. The whole merging algorithm and context is too much to explain here, but if you’re interested you can take a look at the code here.
Another interesting optimization is that if you delete
part of the content in the compound insert
, the insert
gets split up. Yjs has figured out how to make this work correctly in practice by only applying them in the right conditions. This surprised me because I thought that edits were immutable. Make sure that the code you write takes advantage of these optimizations, and doesn’t break them by making assumptions about the nature of edits.
- See an example of inserting subsequent characters here
- And the results data output here
inserts
are sorted in order to link up left
and right
references. deletes
simply get applied to the referenced insert
’s content. The only conflicts that arise from the Yjs model of edits boil down to insertion order conflicts. If two clients try to insert content between the same two inserts
, how should they be ordered? No conflicts arise from deletes
.
The goal that the Yjs algorithm aims for is to preserve the user’s intention. Their intention can be thought of as “I want to insert this content between this left
and this right
item, but if someone else has already inserted something here, lean towards the origin
item.” If both inserts have the same origin, then the inserts are ordered by client_id
.
The following diagram shows how an insertion order conflict is resolved by using just one origin
property. In reality, Yjs uses both origin
and originRight
.
This is where Yjs shines. Other technologies have a lot of conflicts to resolve, require expensive algorithms to reconcile or struggle when clients are very out of sync. The simplicity of the Yjs data model makes this really simple and cheap.
Stay tuned for further posts on how the sync protocol works, how awareness works, how our Yjs architecture works, and some optimizations we’ve made at Dovetail.
Further reading
If you’d love the opportunity to work with Yjs, check out our open roles — we’re hiring!