I have a collaborative application in which the clients and the server communicate through WebSocket messages. Each client receives an update from the server and applies it directly to its Y.Doc with Y.applyUpdate(doc, update); the server only sends incremental updates, never the whole state of the Y.Doc.
I’ve run a test that starts from an empty Y.Doc (created with new Y.Doc()) and applies updates that are typically around 600 bytes each. As more updates are applied, the time consumed by applyUpdate grows from about 1 ms per update to about 100 ms per update. The test uses around 3,800 updates captured from real user collaboration in our testing environment.
Is there any way to optimize the applyUpdate time? Spending around 50 ms on the client for every incoming update is really unacceptable.
Yes, it’s quite bad. I have a 1 MB Doc that takes about 10 seconds to load from disk.
I can’t speak to the potential for optimization, but I can share a couple of workarounds I’ve considered.
You may be able to reduce the time by combining updates with Y.mergeUpdates before applying them. The merged update is always smaller than the sum of its parts.
A more obtrusive workaround is to completely reset the history of the Doc when it grows too big by populating a new Doc from JSON. Of course, you lose the ability to merge changes from offline clients when this happens, so it’s a big tradeoff, and you’d need some kind of migration strategy to move clients to the new Doc.
The other workaround is to keep the Doc in a web worker. That offloads applyUpdate to a separate thread so that it doesn’t block the UI. Still, it’s a linear improvement at best: the work takes just as long, it just happens off the main thread.
I’ve been disappointed with Yjs’s handling of large Docs and its absence of lazy loading. It too often assumes a single, prose-like document and neglects the large-data realities of many other use cases.