Conceptual question: how does yjs avoid race conditions?

theahura · July 12, 2023, 6:15pm

This is more of a conceptual, high-level question, about both Yjs and CRDTs in general. Hopefully this is the right place to ask!

I was trying to wrap my head around how Yjs avoids race conditions when persisting to disk, for example using LevelDB, MongoDB, IndexedDB, or localStorage. Do y-{adapter} implementations have to use a db transaction or lock the underlying data in some way? Or does the Yjs networking layer somehow automagically combine data coming from multiple sources at the client, such that there is only one input?

More broadly, how would Yjs work in a distributed system that also has persistence? Say user A is connected to container A, and user B is connected to container B, and both users are editing the same document. Would Yjs's p2p functionality ensure that updates to this backend only ever come from one user?

Appreciate any thoughts in advance! I would also love links to further reading: papers, lectures, or anything like that would be helpful here.

raine · July 12, 2023, 8:52pm

Race conditions are not an issue with CRDTs because updates are commutative. That is, they can be applied in any order, and the end result is the same. Updates from multiple concurrent sources are integrated (i.e. merged) in-memory on each client. The result is said to be “arbitrary but deterministic” and will “eventually converge”. To understand the specific details of how this works, you’d have to look at the specific implementation of the CRDT.
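To make "commutative" and "arbitrary but deterministic" concrete, here is a minimal sketch using a toy last-writer-wins map. This is not the algorithm Yjs uses, and every name in it (`Update`, `wins`, `applyUpdate`, `replay`) is made up for illustration. Two clients write the same key concurrently, and the tie is broken by client id, so every replica picks the same winner no matter which update it sees first.

```typescript
// Toy last-writer-wins map. NOT Yjs's algorithm; all names are hypothetical.
type Update = { key: string; value: string; clock: number; clientId: number };
type State = Map<string, Update>;

// True if `a` should replace `b`: higher clock wins, then higher clientId.
function wins(a: Update, b: Update): boolean {
  return a.clock !== b.clock ? a.clock > b.clock : a.clientId > b.clientId;
}

// Integrate one update into the in-memory state.
function applyUpdate(state: State, u: Update): void {
  const current = state.get(u.key);
  if (current === undefined || wins(u, current)) state.set(u.key, u);
}

// Apply a sequence of updates to an empty state and return plain key/value pairs.
function replay(updates: Update[]): Record<string, string> {
  const state: State = new Map();
  for (const u of updates) applyUpdate(state, u);
  const out: Record<string, string> = {};
  for (const key of Array.from(state.keys()).sort()) {
    out[key] = state.get(key)!.value;
  }
  return out;
}

// Clients 1 and 2 set "title" concurrently (same clock); client 2 also sets "body".
const fromA: Update = { key: "title", value: "Draft", clock: 1, clientId: 1 };
const fromB: Update = { key: "title", value: "Notes", clock: 1, clientId: 2 };
const fromB2: Update = { key: "body", value: "hello", clock: 2, clientId: 2 };

const order1 = replay([fromA, fromB, fromB2]);
const order2 = replay([fromB2, fromB, fromA]);

// Both arrival orders converge on the same state.
console.log(JSON.stringify(order1) === JSON.stringify(order2)); // true
```

The choice of "higher client id wins" is arbitrary, but because every replica applies the same rule, the outcome is deterministic.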

A Yjs Doc can be persisted as a list of updates, so saving it to a storage medium is just a matter of writing those updates to disk. If a new update from another client arrives, it just gets appended to the list on disk. Because of commutativity, it doesn’t matter in what order the updates arrive. But yes, each individual write still has to be atomic, which is typically handled by a lock or by the storage engine’s own transactions.

It’s also worth noting that applying Yjs updates is idempotent, i.e. applying the same update more than once has no further effect. This makes it safer to batch apply lots of updates from different clients and not have to worry about redundant updates.
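As a rough sketch of why an append-only update log tolerates reordering and redelivery, here is a toy model built on a grow-only set. It assumes each update is just a list of added items; a real provider would store opaque binary Yjs updates instead, and `UpdateLog` is a hypothetical name rather than part of any y-{adapter} package.

```typescript
// Toy model of update-log persistence. Hypothetical names; not a real provider.
type Update = string[]; // the items added by one update

class UpdateLog {
  // Stands in for rows or files on disk: one serialized update per entry.
  private entries: string[] = [];

  // Persisting an update only ever appends; nothing is read or overwritten.
  append(update: Update): void {
    this.entries.push(JSON.stringify(update));
  }

  // Rebuild the document by replaying every stored update in stored order.
  load(): string[] {
    const doc = new Set<string>();
    for (const entry of this.entries) {
      for (const item of JSON.parse(entry) as Update) doc.add(item);
    }
    return Array.from(doc).sort();
  }
}

// Server A receives the two updates in one order.
const logA = new UpdateLog();
logA.append(["x"]);
logA.append(["y", "z"]);

// Server B receives them in the opposite order, and one of them twice.
const logB = new UpdateLog();
logB.append(["y", "z"]);
logB.append(["x"]);
logB.append(["y", "z"]); // redelivered duplicate

console.log(logA.load()); // [ 'x', 'y', 'z' ]
console.log(logB.load()); // [ 'x', 'y', 'z' ]
```

Because `append` never reads existing data, two writers cannot overwrite each other's work; the only requirement left is that each individual append is atomic.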

german-jablo · January 4, 2025, 8:27pm

I think this is a good question and one that hasn’t been fully answered.

Yes, the operations are commutative. But as far as I know, the Yjs backends don’t check, when writing to the database, whether another user’s update was stored after they read the document and started merging.

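A YDoc is usually stored as a single row in the database, so it seems perfectly possible that an update could be lost if two users update the document at the same time: both writers read the same row, each merges its own change, and the second write overwrites the first.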

It’s not very likely, but it’s not impossible, especially if there are multiple people working on the same document at the same time.
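The scenario can be sketched with a toy single-row store. This is an illustration under stated assumptions, not the code of any actual Yjs backend: `SingleRowStore`, `write`, `writeIfUnchanged`, and `merge` are hypothetical names, and the document is modelled as a plain list of items. The first half replays the interleaving that loses an update; the second half shows one possible guard, a version check with a retry.

```typescript
// Toy single-row store. Hypothetical names; not any real Yjs provider's code.
type Row = { version: number; items: string[] };

class SingleRowStore {
  private row: Row = { version: 0, items: [] };

  read(): Row {
    return { version: this.row.version, items: this.row.items.slice() };
  }

  // Blind overwrite: whoever writes last wins at the storage level.
  write(items: string[]): void {
    this.row = { version: this.row.version + 1, items };
  }

  // Optimistic write: rejected if the row changed since `expectedVersion`.
  writeIfUnchanged(expectedVersion: number, items: string[]): boolean {
    if (this.row.version !== expectedVersion) return false;
    this.write(items);
    return true;
  }
}

// Stand-in for CRDT merging: a sorted set union.
const merge = (base: string[], update: string[]): string[] =>
  Array.from(new Set(base.concat(update))).sort();

// 1) Read-modify-write without a check: both writers read before either writes.
const unsafe = new SingleRowStore();
const seenByA = unsafe.read();
const seenByB = unsafe.read();
unsafe.write(merge(seenByA.items, ["a1"]));
unsafe.write(merge(seenByB.items, ["b1"])); // overwrites A's result
const unsafeResult = unsafe.read().items;
console.log(unsafeResult); // [ 'b1' ]  (a1 was lost)

// 2) Same interleaving with a version check and one retry.
const safe = new SingleRowStore();
const a = safe.read();
const b = safe.read();
const aStored = safe.writeIfUnchanged(a.version, merge(a.items, ["a1"]));
let bStored = safe.writeIfUnchanged(b.version, merge(b.items, ["b1"]));
if (!bStored) {
  const fresh = safe.read(); // re-read, merge again, retry
  bStored = safe.writeIfUnchanged(fresh.version, merge(fresh.items, ["b1"]));
}
const safeResult = safe.read().items;
console.log(safeResult); // [ 'a1', 'b1' ]
```

The merge itself is correct in both halves; the loss in the first half comes entirely from the storage layer overwriting a row it did not re-read. Storing each update as its own appended record, rather than rewriting one row, sidesteps the problem without a retry loop.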