Handling slow mergeUpdates on server

TerranceN · April 29, 2022, 7:08pm

Hello! We’re developing a collaborative vector drawing application and had some questions about our usage of YJS, specifically what we can do to mergeUpdates to make it block less.

On the server our usage of Yjs is somewhat similar to the ‘Syncing clients without loading the Y.Doc’ example in the readme: our server both broadcasts updates to all the clients and uses mergeUpdates to keep a running snapshot of the entire document, which is then stored. If that’s not intended/recommended, let us know!

On the client we’re naively/recursively converting our data structure into a Y.Doc, which for more complicated canvases with more complicated shapes (specifically text) ends up with both a lot more data (10s of MB) and a lot more structs (yjs.decodeUpdate(...).structs.length reaches a few million after we merge down the updates).

We’re using y-websocket to transmit this data to our backend, which then uses a pub-sub system to keep all clients in sync. We’ve found that with the size of the updates we’re handling, mergeUpdates is becoming a problem.

Running some benchmarks with updates from an example canvas, where I duplicated all elements in the canvas a bunch of times, resulted in the following: yjs-mergeUpdates-example.ts · GitHub.

~1.3 seconds doesn’t seem like a lot on the surface, but it’s not the largest example we could generate, and that time is all spent synchronously merging that update, meaning nothing else can happen (like distributing new updates) until it’s done. It also uses a decent amount of memory (and obviously more if we make the updates even larger).

In order to keep updates flowing, I had the process that distributes updates not do the merge, but just store the update on a queue of updates to be merged, and then had a queue worker actually do the merging. This still has issues: when a merge takes long enough, our health endpoint (these are running in a cloud environment) doesn’t always respond in time.
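For readers following along, the queue arrangement described above might look roughly like the sketch below. The class name MergeQueue and its fields are hypothetical, and the merge function is passed in (in this setup it would presumably be mergeUpdates from Yjs). Note that each individual merge call still blocks the event loop for its full duration, which matches the health-check problem described above.

```javascript
// Hypothetical helper: none of these names come from Yjs itself.
class MergeQueue {
  // `merge` takes an array of Uint8Array updates and returns a single update,
  // for example Y.mergeUpdates (an assumption about how it would be wired up).
  constructor (merge) {
    this.merge = merge
    this.snapshot = null // running merged snapshot
    this.pending = [] // updates received but not merged yet
    this.draining = null // promise of the active drain loop, if any
  }

  // Called from the broadcast path; it never merges synchronously.
  enqueue (update) {
    this.pending.push(update)
    if (!this.draining) this.draining = this.drain()
    return this.draining
  }

  async drain () {
    try {
      while (this.pending.length > 0) {
        // Yield so queued I/O (broadcasts, health checks) runs before each merge.
        await new Promise(resolve => setImmediate(resolve))
        const batch = this.pending.splice(0)
        this.snapshot = this.merge(this.snapshot ? [this.snapshot, ...batch] : batch)
      }
    } finally {
      this.draining = null
    }
  }
}
```

The yield only helps between merges; a single multi-second merge is not interrupted by it.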

We’ve considered a few ways forward:

1. Have fewer structs by serializing updates to our data structure past some depth. Since right now we recursively convert everything to Yjs types, every number of every part of every path becomes its own struct in the final Yjs doc (at least in my understanding). If instead we pick a level of non-mergeability, like a path in our vector drawing, and never recurse deeper than that, we should have fewer structs to merge, and so maybe mergeUpdates could be much faster? Basically we don’t know if having millions of structs is reasonable / should be addressed.
2. Find a way to break a single mergeUpdates call into an async operation that can yield to other processes. We saw someone else asking about something similar-ish (Split update into smaller updates), and I can see why it’s not recommended: to use the LazyStructReader/Writer it seems like we’d have to either fork Yjs, or at least copy in a lot of logic from mergeUpdates to add some setImmediate calls and a Promise wrapper.
3. Throw mergeUpdates into a Node.js worker thread, which should be able to pass references to Uint8Arrays (i.e. without copying), and since it’s actually another thread, should not block responding to health checks.
4. Make sure no updates from clients are larger than some size. Right now every user modification to the canvas can cause a number of updates, which we yjs.transact together. If instead we batched updates less to make sure they’re no larger than some size, maybe mergeUpdates could be a lot faster?

Curious to hear people’s thoughts on it, thanks!

dmonad · April 30, 2022, 11:05am

Hi @TerranceN,

Drawing applications can generate a lot of updates and “Items” that we need to retain to resolve potential conflicts. This will blow up your merged document and result in slower loading times. If you generate Items while dragging an object, for example, then you can easily end up with many millions of “Item” objects, which will greatly slow down your application. Generally, I recommend designing your application so that high-volume operations (e.g. operations generated from moving or drawing) can be merged into a single Item object. This doesn’t require deep knowledge about how Yjs represents changes. It all boils down to preventing insertions in “alternating order”:

A common issue is that developers often modify Y.Map objects in alternating order. For example, you could change the position of an object on your canvas by modifying the “x” and “y” coordinates:

const ymap = ydoc.getMap()

ymap.set('x', 0)
ymap.set('y', 0)

// while dragging the object you will modify the x, and y coordinates:
ymap.set('x', 10)
ymap.set('y', 10)

ymap.set('x', 20)
ymap.set('y', 20)

If you now look into ydoc.store.clients, you will notice that you created 6 Item objects because Yjs was not able to represent the changes efficiently. However, if you do the following instead:

const ymap = ydoc.getMap()

// modify the coordinate 1 million times
for (let i = 0; i < 1000000; i++) {
  ymap.set('coordinate', { x: i, y: i })
}

This approach will only generate two Item objects in ydoc.store.clients because we only modified a single property on the same object. My first advice is to optimize frequent operations to only modify a single property. Do not modify properties in alternating order because that will generate more metadata.

Although it seems intriguing, I don’t recommend using Yjs as a general JSON store. React-like applications can get away with this because they only generate a fairly small number of operations. However, if you develop a drawing application (or 3D, or any other kind of visual app that generates a lot of operations), then you should think more about how you represent data in Yjs. It will help to perform some benchmarks before settling on an approach.

Btw, the alternating insertion performance bottleneck also applies when you modify different objects in alternating order. One example for this is the following:

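// Y.Map's constructor takes an iterable of [key, value] entries, not a plain object
ymap.set('value', new Y.Map([['x', 0], ['y', 0]])) // generates 3 distinct "Item" objects
ymap.set('value', new Y.Map([['x', 0], ['y', 0]]))
ymap.set('value', new Y.Map([['x', 0], ['y', 0]]))
ymap.set('value', new Y.Map([['x', 0], ['y', 0]]))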

We will end up with 12 Item objects. My advice is to only generate Y.Map objects when you really need the operations on the Y.Map to merge. It is often preferable to simply insert JSON objects instead and replace the whole JSON object whenever you want to change anything.

// only generate a single Item object with less metadata to keep track of
ymap.set('value', { x: 0, y: 0 })

If your application is heavy on the usage of Y.Array, then you should try to append to the previous insertion instead of prepending or inserting at a random position.

// The following will always generate three distinct Item objects
yarray.insert(0, [1])
yarray.insert(0, [2])
yarray.insert(0, [3])

// The following will always generate a single Item object because content was inserted from left to right:
yarray.insert(0, [1])
yarray.insert(1, [2])
yarray.insert(2, [3])

In the future, I plan to publish more shared data types that are not that prone to the problem of inserting in “alternating order”. The Y.KeyValue data type is not prone to this problem as the benchmarks confirm. This might be something you use to replace your current usage of Y.Map.

Another approach to speed up your backend is to use the experimental Ywasm package instead of Yjs for merging updates. See y-crdt/y-crdt (Rust port of Yjs) on GitHub and ywasm on npm.

There is a 3D application that is using Yjs as a data model. They optimize for alternating order; however, in their case, it is really not possible to always prevent it. They ended up regenerating the Yjs document every few weeks to get rid of the unneeded metadata. Of course, a downside is that clients can’t merge offline changes once the model is regenerated. I don’t think that this is necessary in your case, as you can generally model drawing applications really well with Yjs.

dmonad · April 30, 2022, 11:08am

Just one last word on Y.mergeUpdates. It is crazy fast compared to loading a document into a Yjs document. However, it does not perform garbage collection. Your clients will spend some extra time garbage-collecting information that is not needed anymore. This can actually get pretty expensive over time if the server never performs garbage collection.

You should try to load the document on the server into a Yjs document from time to time to perform garbage collection.
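One way this periodic step might look, as a minimal sketch: apply the merged snapshot to a fresh Y.Doc (garbage collection is enabled by default) and re-encode the full state. The function name compactUpdate is hypothetical, and the Y parameter exists only so the library can be passed in explicitly; by default it loads the installed yjs package. Loading the document is the slow, blocking operation mentioned above, so this is meant to run occasionally rather than on every update.

```javascript
// Hypothetical helper: loads a merged update into a temporary Y.Doc so that
// Yjs can garbage-collect deleted content, then re-encodes the whole state.
function compactUpdate (mergedUpdate, Y = require('yjs')) {
  const doc = new Y.Doc() // gc defaults to true
  Y.applyUpdate(doc, mergedUpdate)
  const compacted = Y.encodeStateAsUpdate(doc)
  doc.destroy() // release the temporary document
  return compacted
}
```

The returned update can then replace the stored snapshot, and later mergeUpdates calls continue from it.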