Locking using Yjs

ijdickinson · October 18, 2021, 11:07am

New Yjs user here, so apologies if this is a question with an obvious or existing answer.

Suppose I’m working on a whiteboard app, which shows a collection of whiteboard notes that any user can move or edit. Suppose on the board there is a note N1, and that both Alice and Bob want to move it to somewhere else on the board.

Clearly only one user can move N at any time, so I’ll need some sort of distributed lock. My initial thought would be a Y.Map, which maps from note ID (N1) to either null or the ID of the currently locking user. Once a user has a lock, they are free to move the note, and no other users can get the lock until it becomes free again.

So: both Alice’s client and Bob’s client both try to set a value into the Y.Map at the same time. At this point, my understanding gets a bit fuzzy. Does one update get rejected because another get there first? Or are both updates accepted, and the latest one wins? Or something else? And what’s the best/correct way for my app to know that potentially conflicting updates have been resolved and that, say, Alice now safely has the lock and can proceed with her move operation (and Bob has to wait)?

bminer · November 11, 2021, 3:47am

Yjs cannot provide distributed locks because CRDTs are eventually consistent by nature. A Y.Map will eventually converge to a known state once all users synchronize their shared Y.doc, but this does not provide the guarantees needed for a locking mechanism.

Distributed locks require linearizability, which is, in some ways, the opposite of what you get with Yjs. To be linearizable, you need a central authoritative node in the system or a consensus among nodes.

philip · November 12, 2021, 11:31pm

I’m a newbie too.

My understanding with Y.Map is that it works with last write wins, so both updates will get accepted, but the final value will be the value that is the latest/most recent. So if Bob says ‘x=5’ and 20 ms later Alice says ‘x=6’, then after that they both sync with eachother then the final value will be x=6.

A way to do this would be to: Bob selects Note, Bob sets set(‘note1-locked’, ‘Bob’); Alice recieves this update 100ms later, then the client will gray out the note so that Alice cannot move it.

What happens if Alice locks the note between the time that Bob locks it and the 100ms it takes for the update to reach Alice? Then Alice will lock it with set(‘note1-locked’,‘Alice’). Then everyone will sync. Now Bob will end up with ‘note1-locked’: ‘Alice’, so Bob has lost it, and will drop the note and it will be grayed out. Thought it is important to note that this is an edge case. Very rarely will someone select the same note within 100ms of eachothers!

dmonad · November 16, 2021, 1:58pm

Thank you @bminer, Yjs indeed doesn’t use locking. It is an eventual-consistent “database” that synchronizes without central coordination. The whole concept of time is problematic in decentralized, eventual-consistent databases.

Most CRDTs implementations don’t rely on the concept of physical time (as in time based on a unix timestamp; or a GMT timestamp). Mostly, they have a relative understanding of time, based on Lamport Timestamps. In Yjs, and many other CRDTs, LWW means that when concurrent writes happen, one of the writes will win. It doesn’t necessarily have to be the one that was created “before” based on our physical experience of time. Once clients synchronize, their relative clocks are in order and we can, once again, detect whether a client performed an action concurrently.

The reason for this is that it is impossible to synchronize time-clocks. Some computers around the world have broken time devices, or are simply desynchronized with NTP servers. NTP servers broadcast a fairly accurate timestamp (based on physical time) to computers and try to keep them more-or-less in sync. However, this is unreliable and doesn’t always work. It’s normal that clients are desynchronized by a few seconds. So a decentralized last writer wins implementation based on physical time is likely to favor clients that don’t synchronize with NTP servers, which is a huge problem. Hence, we use the relative concept of time instead because it is impossible to synchronize time.

The last time I explained this people cited some article from how Amazon synchronizes their times. Note that these are huge server-farms that are located very close to each other through a reliable network connection. This can’t be done with consumer computers that are often located at places with bad internet access.

ijdickinson · November 16, 2021, 6:09pm

Thanks @dmonad and @bminer. I can see I’ll need to add a separate signalling mechanism to achieve the locking that I need.

Very rarely will someone select the same note within 100ms of each others!

I agree, @philip, that conflicts won’t happen very often. But they could, and will sometimes, so if I don’t account for that properly in the app I’m going to end up with some nasty Heisenbugs to scratch my head over one day

bminer · November 17, 2021, 3:15am

I’m a bit new to this stuff, so please correct me if I get something wrong.

I’d highly recommend Kleppmann’s book Data-intensive applications. In it, he describes some of these subtleties. Y.js provides a total ordering of events using Lamport timestamps, but somewhat counterintuitively, this is not enough to solve many problems of decentralized systems. That’s because there is no recency guarantee in Y.js; that is, it is eventually consistent. For example, node A could make a change concurrently with node B, but they may not communicate with each other for years. Each node maintains a total order of events it has seen so far, but to guarantee consistency, events may get rearranged.

When nodes eventually communicate (perhaps years later), they are guaranteed to be consistent with one another. But for a distributed lock, you need to make a decision now (i.e. you need to answer: “do I hold the lock or not?”) You cannot wait years for the communication to eventually occur, and you cannot guarantee that all nodes are communicating with each other. That’s why Y.js allows you to make writes offline.

To implement a distributed lock, you need a recency guarantee called linearizability. This requires a consensus between all nodes in the system; at a bare minimum, a quorum among a majority of the nodes is required before a distributed lock can be taken. Because consensus is generally expensive to achieve, especially from a latency perspective, a leader node is often elected using a consensus algorithm. Since there is a single leader node, it can instantly make decisions about holding or releasing a lock.

dmonad · November 17, 2021, 12:01pm

Nailed it @bminer.

@ijdickinson I recommend implementing a Whiteboard app without locking. You can have two users manipulate a Y.Map concurrently and both users will eventually synchronize the position of the moved object. In the case of concurrent moves, the moves from one user will automatically be overwritten by the “most recent” changes eventually. Locking approaches are very hard to implement correctly and yield a bad user experience.