Persisting and retrieving state from a database

Guys, I cloned the y-websocket server and I’m trying to implement a PostgreSQL adapter with bindState and writeState. However, when bindState loads the data from the database and applies it with Y.applyUpdate, the clients go out of sync.

Has anyone already used Y.applyUpdate inside bindState? Also, I’m converting the doc from a Uint8Array to base64 with buffer.toBase64() to persist it in the database.
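For reference, the base64 round trip mentioned above can be sketched as follows. This is a minimal sketch that assumes a Node.js server and uses Node's built-in Buffer rather than the buffer.toBase64() helper from the post; the helper names encodeUpdate and decodeUpdate are hypothetical.

```javascript
// Hypothetical helpers (not part of Yjs or y-websocket): convert a binary
// document update to a base64 string for a text column, and back again.

// Uint8Array -> base64 string
function encodeUpdate (update) {
  // Wrap the same bytes without copying, respecting the view's offset and length.
  return Buffer.from(update.buffer, update.byteOffset, update.byteLength).toString('base64')
}

// base64 string -> plain Uint8Array (copies the decoded bytes)
function decodeUpdate (base64) {
  return new Uint8Array(Buffer.from(base64, 'base64'))
}
```

The decoded Uint8Array is the value that would be passed to Y.applyUpdate; the bytes must be identical to what was encoded, so the round trip is worth checking in isolation before debugging the sync behavior.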

Here is a code example with a client and an API that persists the document: https://github.com/iamgbayer/collaborative-poc/blob/master/packages/api/src/persistence.js#L23

When I comment out the Y.applyUpdate call, all clients stay synced.

Yjs is making an awesome environment to build collaborative systems, thank you all.

Hi @iamgbayer,

I’m going to insert my answer from the gitter channel here:

The current concept is that bindState is called when the Y.Doc is created, so you can listen to document updates (ydoc.on('update', update => …)) and then write them incrementally to a database. writeState is called right before the Y.Doc is destroyed, i.e. after all clients have left the room and no more data is expected.

Please read the section in the Yjs readme again that explains how document updates work. If you understand how to write the complete state (Y.encodeStateAsUpdate) and how to get incremental updates (using the update observer), you might come up with a concept that works for you. If you are using Postgres, it might not be a good idea to write incremental updates; instead, write the complete state after a debounce (e.g. 3 seconds after the last update message). Use bindState to register an observer that listens for changes.
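The approach described above (register an observer in bindState, write the complete state a few seconds after the last update, and write once more in writeState) could look roughly like the sketch below. It is only an illustration under stated assumptions: createPersistence, encodeState, saveState and delayMs are hypothetical names, and the database write is injected so that any Postgres client can be used.

```javascript
// Sketch of debounced full-state persistence. All names here are hypothetical.
// encodeState(ydoc) should return the complete state, e.g. with Yjs:
//   ydoc => Y.encodeStateAsUpdate(ydoc)
// saveState(docName, state) should write that state to the database.
function createPersistence ({ encodeState, saveState, delayMs = 3000 }) {
  const timers = new Map() // docName -> pending timeout handle

  // Cancel any pending write and store the complete state right away.
  const flush = (docName, ydoc) => {
    clearTimeout(timers.get(docName))
    timers.delete(docName)
    return saveState(docName, encodeState(ydoc))
  }

  return {
    // Called when the document is created: register the update observer.
    bindState (docName, ydoc) {
      ydoc.on('update', () => {
        // Restart the countdown on every update (debounce).
        clearTimeout(timers.get(docName))
        timers.set(docName, setTimeout(() => {
          Promise.resolve()
            .then(() => flush(docName, ydoc))
            .catch(err => console.error('persisting failed', err))
        }, delayMs))
      })
    },
    // Called right before the document is destroyed: write immediately.
    writeState (docName, ydoc) {
      return Promise.resolve(flush(docName, ydoc))
    }
  }
}
```

With the cloned y-websocket server, the returned object would presumably be registered through the server's persistence hook, passing ydoc => Y.encodeStateAsUpdate(ydoc) as encodeState. Writing the full state keeps a single row per document, at the cost of rewriting the whole document on each flush.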

Hi @iamgbayer

I am trying to set up persistence to Postgres now.

Did you manage to set it up in the end? Would love to piggy-back on your work, or help you get there if you need help!

Why is it a good idea to persist the whole state? Wouldn’t it just become bigger and bigger as the data structure grows?
If only the updates are saved and the clients query the ones after their last received update, wouldn’t it be more efficient?

That’s how CRDTs work. Yjs is based on an append-only array data structure, with some garbage collection to avoid buildup. Yes, documents can get large, and handling that can be non-trivial. But that is how they are able to offer the guarantees that they do.