Thinking the Yjs 'way'

johnnygri · September 7, 2021, 7:45am

I would like to understand how to better think in the Yjs ‘way’.

I’m excited to understand Yjs better as I see how it enables me to bring collaboration to my app. Amazing. I managed to get a basic collaborative editor up and running over WebRTC in 30 mins with Quill and cursors, which is a fantastic testament to how good this library is. Great work!

Currently I’m using a very standard approach: store document in Firestore, retrieve for editing, save when done (I’m using Quill and am storing Delta objects).

Using Yjs, I believe it might be important to rethink the way document data is persisted. There have been a few discussions on how to initialise a Ydoc from an existing, standard JSON object. Perhaps this is the wrong approach?

‘Traditional’ thinking approach:

1. Fetch stored JSON object
2. Set Ydoc initial content from JSON object ← this presents problems: which client initialises the Ydoc?

‘Yjs thinking’ approach?:

1. Create new JSON object, create Ydoc, update DB with Ydoc content
2. Retrieve Ydoc update content for editing, initialise with this value ← each client will get latest update and then sync, no problem
Long term storage may require a doc → Ydoc conversion prior to editing, and a Ydoc → doc conversion before saving. But maybe this is traditional thinking and wrong.
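
A minimal sketch of the ‘Yjs thinking’ flow above, under stated assumptions: the `db` map is a hypothetical stand-in for a real database field that can hold binary data, the shared type name `'quill'` is an arbitrary choice, and `openDocument` is an illustrative helper rather than part of Yjs. The existing Delta is converted exactly once, and from then on every client loads the same persisted bytes.

```typescript
import * as Y from 'yjs'

// Hypothetical stand-in for a real database (e.g. a binary field per document).
const db = new Map<string, Uint8Array>()

// One-time migration: convert the stored Quill Delta into a Y.Doc exactly once.
const legacyDelta = [{ insert: 'Hello world\n' }]
const seed = new Y.Doc()
seed.getText('quill').applyDelta(legacyDelta)
db.set('doc-1', Y.encodeStateAsUpdate(seed))

// Illustrative helper: every client loads the same persisted bytes
// instead of re-creating the content from JSON.
function openDocument(id: string): Y.Doc {
  const doc = new Y.Doc()
  const stored = db.get(id)
  if (stored !== undefined) Y.applyUpdate(doc, stored)
  return doc
}

const clientA = openDocument('doc-1')
const clientB = openDocument('doc-1')

// Applying the same update a second time does not duplicate content.
Y.applyUpdate(clientA, db.get('doc-1')!)

// A plain Delta can still be derived at any time, e.g. for export.
const exported = clientA.getText('quill').toDelta()
```

If each client instead inserted the JSON content into its own fresh Ydoc, the two inserts would count as separate edits and the text would appear twice after syncing, which is the ‘which client initialises the Ydoc?’ problem.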

‘Concepts’ Documentation Addition
Lastly, may I humbly suggest it would be good to add a page to the docs which describes the ‘Yjs way’ of thinking, if that’s even a thing. I think new users would benefit from this. Just as learning RxJS required some rethinking, perhaps using Yjs is the same? I’m very happy to contribute once my understanding is solid.

Apologies if I’ve missed existing documentation that explains this!

leehsueh · November 21, 2023, 8:19pm

I’m in the same boat. I would love to hear from others who have gone through this process about what shifts in thinking are needed when it comes to adding real-time collaboration to existing data.

From what I’ve been piecing together, it seems like it’s important that the Yjs CRDT data structure is persisted for long term storage, because that’s the only way all clients will have the full state. It also means that fetching/loading the data for clients means fetching the CRDT, and not the traditional way of fetching the content itself.

@johnnygri have you had any revelations in this journey that you’re able to share? Thanks so much.

johnnygri · November 21, 2023, 9:10pm

Yeah, I decided to use liveblocks.io. I haven’t had any responses to any of my posts on this forum.

leehsueh · November 29, 2023, 12:18am

How has your experience with Liveblocks been? Do you use it for authentication as well?

leehsueh · November 29, 2023, 12:22am

@raine any chance you could weigh in on this question? What are some mindset shifts a person new to CRDTs and collaborative workflows would need, coming from a traditional pattern of manually saving content to a database without the need to maintain state?

Thanks so much for your time/input.

raine · November 29, 2023, 3:18am

As you mentioned, the binary data has to be the source of truth since the full history is needed to resolve potential conflicts. That’s an important one, and it can be hard if you’re used to JSON. The decoding methods are easy to use… if you know which one to use when. Decoding an update is different from decoding a message, for example, and if you’re slightly off then you won’t get a clear error message; nothing will happen.

Somewhat related, you must load the entire Doc into memory in order to read any content at all. (Queries? What queries?)
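
To make the two points above concrete, here is a small sketch, assuming a hypothetical append-only `persisted` array in place of a real store and an arbitrary shared map named `'meta'`. It shows binary updates being collected as the source of truth, and that reading even a single key means applying those bytes to a whole Doc first.

```typescript
import * as Y from 'yjs'

// Writer side: collect the incremental binary updates that Yjs emits.
const writer = new Y.Doc()
const persisted: Uint8Array[] = [] // hypothetical append-only store
writer.on('update', (update: Uint8Array) => {
  persisted.push(update)
})

const meta = writer.getMap('meta')
meta.set('title', 'Draft')
meta.set('title', 'Final')
meta.set('wordCount', 120)

// Optional compaction: merge the stored updates into a single blob.
const compacted = Y.mergeUpdates(persisted)

// Reader side: the bytes are opaque until they are applied to a Y.Doc,
// so the whole document is loaded before any value can be read.
const reader = new Y.Doc()
Y.applyUpdate(reader, compacted)
const snapshot = reader.getMap('meta').toJSON()
```

The JSON in `snapshot` is a derived view for reading or display; in this sketch only the binary updates are ever written back to storage.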

The other big mental shift has to do with the peer-to-peer nature of Yjs. Another client can always be making a concurrent edit that just hasn’t synced yet. In other words, there is no such thing as being “fully synced”. So in general there is no point where any one client has a global view of all the data. It’s reactive; you’re dealing with events more than you’re dealing with data, so you have to think in terms of continuous streams of events and observability.
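
A rough illustration of that event-driven view, assuming two in-memory docs whose updates are exchanged by hand rather than through a network provider (the shared type name `'content'` is arbitrary). Each client edits before hearing from the other, and client A learns about both its own and the remote change through the same observer.

```typescript
import * as Y from 'yjs'

const docA = new Y.Doc()
const docB = new Y.Doc()
const textA = docA.getText('content')
const textB = docB.getText('content')

// React to changes as events; record whether each one originated locally.
const seenByA: boolean[] = []
textA.observe((event) => {
  seenByA.push(event.transaction.local)
})

// Two concurrent edits, made before either client has heard from the other.
textA.insert(0, 'A')
textB.insert(0, 'B')

// Later the updates arrive (exchanged by hand here, not via a provider).
const fromA = Y.encodeStateAsUpdate(docA)
const fromB = Y.encodeStateAsUpdate(docB)
Y.applyUpdate(docA, fromB)
Y.applyUpdate(docB, fromA)

// Both replicas converge on the same string; which letter comes first
// depends on the randomly assigned client IDs.
const converged = textA.toString() === textB.toString()
```

Between the local insert and the arrival of `fromB`, client A had no way of knowing that B’s edit existed, which is why application code is usually written against the observer rather than against a presumed final state.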

The community is not the most active (to put it mildly), but Yjs is powerful and it’s rewarding to get to know how it works on a deeper level (not that I venture into the inner workings of the CRDT algorithm myself). The Hocuspocus team is great, so definitely check them out.

Of course, I hear that Automerge is getting better, so there’s always that if things are too quiet around here!

joakim · November 30, 2023, 12:16pm

SyncedStore is another way to hide some of the intricacies of Yjs.

leehsueh · December 11, 2023, 6:41pm

Thanks for the response!

Yjs, as far as I can tell, has a lot more of an ecosystem built around it compared to Automerge (editor bindings, network providers, even third-party services like Liveblocks). Hocuspocus also looks like a great middle ground between extending y-websocket and a fully managed service like Liveblocks or Tiptap Cloud.