Custom persistence provider based on y-websocket

danielsavinoff · July 4, 2024, 8:55pm

The library is great and thanks for maintaining such a useful project. If somebody asked me what could be improved, I would answer documentation, particularly the custom provider page.

Back to the topic: I am struggling and would greatly appreciate detailed answers.

I am trying to persist a Yjs document (Y.Doc) in PostgreSQL, and here are some questions that I can’t answer myself:

Why is bindState() being used, and why do we need to store the data in a NoSQL database? Is it to keep the history? Would it be feasible to store only the result, calling writeState only when no connections are present, with the history kept only client-side and sent to the WebSocket server for merging with other versions and propagation to other clients?
RAM is limited in my case. Should I use LevelDB then, if anything?

And I am also not sure how to implement that correctly, even though I spent a few days reading the codebase and even read Kevin Jahns’s research.

Do I just call setPersistence and set provider, bindState, and writeState?
And what is setContentInitializor() used for? The type of the argument function is @type {(ydoc: Y.Doc) => Promise<void>}, which is confusing. It should return void, and the Y.Doc interface doesn’t have a content property.

upd:
This answered my questions
To implement persistence and WebSocket communication in your existing infrastructure, import setupWSConnection and setPersistence from 'y-websocket/bin/utils'. Then register setupWSConnection as the listener function that runs on each new connection; it handles the state propagation between peers. Lastly, for persistence, call setPersistence and pass an object with the bindState and writeState functions found in the topic above.
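
For anyone landing here later, this is roughly the shape of what I ended up with. It is only a sketch under assumptions, not the library's own code: createPersistence, createMemoryStore and the store.load / store.save adapter are names I made up, and the yjs module is passed in as the Y argument so the snippet has no hard dependency. A PostgreSQL-backed store would implement the same two methods, for example over a BYTEA column.

```javascript
// Sketch only. `Y` is the yjs module, passed in by the caller.
// `store` is a hypothetical adapter with two async methods:
//   store.load(docName)        -> Promise<Uint8Array | null>
//   store.save(docName, bytes) -> Promise<void>
function createPersistence (Y, store) {
  return {
    provider: store,
    // Called when the server creates the in-memory document for docName.
    bindState: async (docName, ydoc) => {
      const persisted = await store.load(docName)
      if (persisted) {
        // Merge the stored state into the live document.
        Y.applyUpdate(ydoc, persisted)
      }
    },
    // Called after the last connection to the document has closed.
    writeState: async (docName, ydoc) => {
      await store.save(docName, Y.encodeStateAsUpdate(ydoc))
    }
  }
}

// In-memory stand-in for the database, for local experiments only.
function createMemoryStore () {
  const rows = new Map()
  return {
    load: async (docName) => rows.get(docName) ?? null,
    save: async (docName, bytes) => { rows.set(docName, bytes) }
  }
}

// Wiring, assuming y-websocket v1 (bin/utils) and ws v8:
//   const Y = require('yjs')
//   const { WebSocketServer } = require('ws')
//   const { setupWSConnection, setPersistence } = require('y-websocket/bin/utils')
//   setPersistence(createPersistence(Y, createMemoryStore()))
//   const wss = new WebSocketServer({ port: 1234 })
//   wss.on('connection', setupWSConnection)
```

Note that this variant only writes when the last client disconnects, so a server crash while clients are connected would lose everything since the previous write.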

dmonad · July 10, 2024, 12:18pm

Hi @danielsavinoff ,

Why is bindState() being used, and why do we need to store the data in a NoSQL database? Is it to keep the history? Would it be feasible to store only the result, calling writeState only when no connections are present, with the history kept only client-side and sent to the WebSocket server for merging with other versions and propagation to other clients?

There are many ways to persist a Yjs document. You could persist the whole document (or only incremental updates) on every keystroke, or only after all connections have closed. However, I recommend storing the document at least at short intervals to ensure that you won’t encounter data loss.
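
One way to get "short intervals" without writing on every keystroke is to debounce the save. The following is a minimal sketch, not part of y-websocket: createDebouncedSaver, delayMs and maxWaitMs are hypothetical names, and save stands for whatever async function writes the encoded document to your database.

```javascript
// Sketch: coalesce bursts of changes into one save per quiet period,
// but never wait longer than maxWaitMs while changes keep arriving.
// `save` is a hypothetical async function supplied by the caller.
function createDebouncedSaver (save, { delayMs = 2000, maxWaitMs = 10000 } = {}) {
  let timer = null
  let firstPendingAt = null // time of the oldest unsaved change

  // Save immediately if anything is pending, and cancel the timer.
  const flush = async () => {
    if (timer !== null) clearTimeout(timer)
    timer = null
    if (firstPendingAt === null) return // nothing to save
    firstPendingAt = null
    await save()
  }

  // Record that a change happened and (re)start the timer.
  const schedule = () => {
    const now = Date.now()
    if (firstPendingAt === null) firstPendingAt = now
    if (timer !== null) clearTimeout(timer)
    const remaining = Math.max(0, firstPendingAt + maxWaitMs - now)
    timer = setTimeout(() => {
      flush().catch(console.error)
    }, Math.min(delayMs, remaining))
  }

  return { schedule, flush }
}
```

In a bindState implementation you could register ydoc.on('update', saver.schedule), and in writeState call await saver.flush() so that the last changes are written before the document is dropped from memory.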

How exactly are you distinguishing between “history” and “data”? If you are asking whether you need to persist the encoded Yjs document (or the incremental updates) on the backend - then yes, you should absolutely do that. Think of Yjs as a Git repository. Two Git repositories that contain the same content but different histories will still have merge conflicts. If you re-initialize the Yjs document, then you will get unintended merge results. E.g. if you re-initialize a Y.Text, the content will get duplicated.

RAM is limited in my case. Should I use LevelDB then, if anything?

If you specify any persistence, then the document will be removed from memory once all clients have disconnected. If you use y-redis, then the client won’t keep anything in memory.

And I am also not sure how to implement that correctly, even though I spent a few days reading the codebase and even read Kevin Jahns’s research.

Welcome to this rabbit hole ^^ There is no single way to implement a backend correctly. There are always tradeoffs. Some existing implementations might fit your requirements better. There are many different backends for Yjs now (even some cloud providers like y-sweet, LiveBlocks, and Hocuspocus). My recommendation is to use something that works for now. You can always switch if your requirements change - that’s a very nice part of the Yjs ecosystem.