What's the correct way to set default content for y-prosemirror?

Mirone · May 13, 2022, 6:56am

Hi there, I’m struggling to set the default content of a ProseMirror editor with Yjs.

First Approach

Use ProseMirror’s doc property when creating the state.

const state = EditorState.create({
    schema,
    doc,
    plugins,
})

This doesn’t work: Yjs doesn’t respect the doc property and erases the editor content when it syncs.

Second Approach

Use prosemirrorToYDoc to create the Y.Doc, and initialize ySyncPlugin with this Y.Doc.

This creates an editor with the correct default content. However, every time I refresh the page or a new user joins the room, the content is inserted again and users end up with duplicated content.

After a new user joins the room: [screenshot]

Third Approach

I tried to update the editor content programmatically through editorView.dispatch(newTr). However, this affects the undo manager’s history, and when the user calls the redo command, the content is erased, which is not what I expect.

So, what’s the correct way to create an editor with default content?
The behavior I expect is:

When a user joins an empty room, the editor is created with the default content.
When a second user joins the room, the editor content is simply synced from the other users.

dmonad · May 13, 2022, 11:47am

Hi @Mirone,

Only one peer should populate the document with content. Populating content is an insertion. Therefore, duplicate insertions of “default content” will always lead to duplication of content.

Your first approach doesn’t work because y-prosemirror prefers the state of the Yjs type and overrides the existing content. Otherwise, we’d also get duplication of content.

My recommendation is to only initialize a document once. This can happen on the first client that creates a document. You will avoid a lot of complicated issues if you simply keep the Yjs document around instead of re-initializing it every time you open a document.
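
The duplication described above can be reproduced with a toy model. This is only a sketch and not Yjs's actual data structures or merge algorithm: it assumes that every insertion carries a unique (client, clock) id and that merging is a union keyed by that id. The names ToyDoc, Item, encode, and apply are made up for illustration.

```typescript
// Toy model only: NOT Yjs's real data structures or merge algorithm.
// Each inserted chunk carries a unique id (client, clock).
type Item = { client: number; clock: number; text: string };

class ToyDoc {
  private items = new Map<string, Item>();
  private clock = 0;
  private client: number;

  constructor(client: number) {
    this.client = client;
  }

  // A local insertion always creates a brand-new id.
  insert(text: string): void {
    const item: Item = { client: this.client, clock: this.clock++, text };
    this.items.set(`${item.client}:${item.clock}`, item);
  }

  // Export the full state, loosely analogous to encoding a document as an update.
  encode(): Item[] {
    return [...this.items.values()];
  }

  // Merging is a union keyed by id, so re-applying known items changes nothing.
  apply(update: Item[]): void {
    for (const item of update) {
      this.items.set(`${item.client}:${item.clock}`, item);
    }
  }

  // Deterministic order so every peer renders the same text.
  toString(): string {
    return this.encode()
      .sort((a, b) => a.client - b.client || a.clock - b.clock)
      .map((item) => item.text)
      .join("");
  }
}

// Case 1: two peers each populate the "default content" on their own.
const peerA = new ToyDoc(1);
const peerB = new ToyDoc(2);
peerA.insert("Hello");
peerB.insert("Hello");
peerA.apply(peerB.encode());
const duplicated = peerA.toString(); // both insertions survive the merge

// Case 2: one peer populates; others load the stored state (even twice).
const creator = new ToyDoc(1);
creator.insert("Hello");
const stored = creator.encode();
const joiner = new ToyDoc(2);
joiner.apply(stored);
joiner.apply(stored);
const loadedTwice = joiner.toString(); // the same ids are merged only once
```

In this model the two independent insertions have different ids, so neither can be recognized as a copy of the other, while re-applying the same stored state is harmless.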

Mirone · May 13, 2022, 2:46pm

Some links might be useful:

Mirone · May 13, 2022, 2:57pm

Thanks for the clarification. How can I know whether another client has already populated the content?

vojto · May 13, 2022, 3:00pm

You could ask the server for all the updates and then check whether the document is empty. If it is empty, no one has initialized the content yet, and you can initialize it.

I’m dealing with the same problem, but I can’t just wait to go online, because our app needs to work offline.

So if your app is always online, you can just wait until the connection is established.

dmonad · May 14, 2022, 11:38am

As I said, it makes more sense to design your application in a way so that only the client that creates a document populates it with content.

I often see developers working around this, trying to populate the document from a JSON representation of their data instead of simply storing the Yjs document (possibly alongside the JSON representation). This leads to all kinds of problems and complexities that you really want to avoid.

Waiting for a “sync” event is not good enough. You could have a client with an older version of the document rejoining the session after a short disconnection. In the best case, your re-populated content gets duplicated; in the worst case, newer changes get overwritten by the old version.

If you go this route, you need to implement some kind of session management. y-websocket does not support this. You also want to “populate” the content on the backend instead of the client. You need to have some process to elect a peer that initializes content. This can happen, for example, through a proper lock implementation (e.g. redlock - super complex, hard to really understand). Electing a client-peer, with a potentially bad network connection, will lead to problems that are impossible to debug, so choose a peer with a good network connection (server).

Mirone · May 14, 2022, 12:44pm

I think what I want to implement here is something like a template. I tried implementing it like this:

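import { applyUpdate, encodeStateAsUpdate } from 'yjs';
import { prosemirrorToYDoc, yDocToProsemirror } from 'y-prosemirror';

// when a user connects
wsProvider.once('synced', async (isSynced: boolean) => {
    if (isSynced) {
        collabService.applyTemplate(doc, template).connect();
    }
});


class CollabService {
    applyTemplate(doc, template) {
        // Render the current shared state to check whether it is still empty.
        const yDocNode = yDocToProsemirror(schema, doc);

        if (yDocNode.textContent.length === 0) {
            // Build a temporary Y.Doc from the template and merge it into the shared doc.
            const templateDoc = prosemirrorToYDoc(template);
            // Use a distinct name so the `template` parameter is not shadowed.
            const templateUpdate = encodeStateAsUpdate(templateDoc);
            applyUpdate(doc, templateUpdate);
            templateDoc.destroy();
        }

        // Return the service so the caller can chain .connect() (defined elsewhere).
        return this;
    }
}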

What do you think of this solution?

dmonad · May 14, 2022, 4:43pm

@Mirone This approach is still vulnerable to the duplication of content. As I explained in the original thread, you should store the base64 somewhere.

Mirone · May 14, 2022, 4:51pm

If I were building an app or website, I’d store the base64. However, I’m working on the milkdown editor framework, which is similar to tiptap but has markdown support.

So I can recommend that users store the base64, but I also need to provide a way for them to use markdown as a template.

Also, what’s the difference if users store the ProseMirror JSON instead of the base64 of the Y.Doc? IMO, they can be transformed into each other.

dmonad · May 14, 2022, 5:21pm

If this is executed twice, you still duplicate content:

const templateDoc = prosemirrorToYDoc(template);
const templateUpdate = encodeStateAsUpdate(templateDoc);
applyUpdate(doc, templateUpdate);
templateDoc.destroy();

You need another way to prevent duplication of content.

Waiting for the sync event is also not good enough, because sometimes clients are just disconnected for a short period of time. Once they rejoin… BAM… duplication of content.

I hear you that you want to store everything as markdown. I’m just saying that it is highly recommended to store the Yjs document instead somewhere. Once the editing history is lost, Yjs will duplicate content.

It doesn’t matter which framework for conflict resolution you use (CRDTs, OT, Git, …), you must always retain the history, otherwise, you can’t resolve conflicts.

Content should be initialized only once by the client that initially creates a document. After that, you should keep the Yjs document lying around somewhere (IndexedDB, file system, database, wherever). Otherwise, you need to find your own way around the duplication issue (you asked for the recommended approach).
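
A minimal sketch of the "initialize once, then keep the stored document" idea, under stated assumptions: UpdateStore, MemoryStore, loadOrCreate, and buildTemplate are hypothetical names, and the in-memory Map stands in for whatever storage is actually used. The stored bytes represent an encoded Yjs document (for example the result of encodeStateAsUpdate); the placeholder bytes here are not a real update.

```typescript
// Hypothetical storage interface; a real app might back it with a database,
// IndexedDB, or the file system.
interface UpdateStore {
  get(docId: string): Uint8Array | undefined;
  // Stores `bytes` only if nothing is stored yet; returns what ends up stored.
  putIfAbsent(docId: string, bytes: Uint8Array): Uint8Array;
}

// In-memory stand-in for illustration. With shared storage, putIfAbsent
// would have to be atomic on the storage side.
class MemoryStore implements UpdateStore {
  private data = new Map<string, Uint8Array>();

  get(docId: string): Uint8Array | undefined {
    return this.data.get(docId);
  }

  putIfAbsent(docId: string, bytes: Uint8Array): Uint8Array {
    const existing = this.data.get(docId);
    if (existing !== undefined) return existing;
    this.data.set(docId, bytes);
    return bytes;
  }
}

// `createInitialUpdate` stands in for building the template exactly once,
// e.g. encoding a Y.Doc created from the template into update bytes.
function loadOrCreate(
  store: UpdateStore,
  docId: string,
  createInitialUpdate: () => Uint8Array
): Uint8Array {
  const existing = store.get(docId);
  if (existing !== undefined) return existing; // never re-populate an existing document
  return store.putIfAbsent(docId, createInitialUpdate());
}

// Usage: the template is built only for the first open of "room-1".
let templateBuilds = 0;
const buildTemplate = (): Uint8Array => {
  templateBuilds++;
  return Uint8Array.from([1, 2, 3]); // placeholder bytes, not a real Yjs update
};

const store = new MemoryStore();
const firstOpen = loadOrCreate(store, "room-1", buildTemplate);
const secondOpen = loadOrCreate(store, "room-1", buildTemplate);
```

Every caller would then apply the returned bytes to its own document with applyUpdate. Because all callers apply the same stored state rather than a freshly built copy of the template, applying it repeatedly should not insert the content again.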

Mirone · May 14, 2022, 5:29pm

Thanks for the clarification. So what I had missed is that a Y.Doc stores not only the document structure but also the editing history. That makes sense.

I think it’s better to let users decide the sync timing and the storage structure. I’ll just provide some utilities for them.

dmonad · May 14, 2022, 5:38pm

Great, I think that’s the way to go. Maybe I should explain that better in the docs.

It makes sense for framework authors to build only the bindings to Yjs. Users can then choose existing backend and storage solutions from the Yjs ecosystem.

Mirone · May 14, 2022, 5:40pm

Thanks for your patience. Great work! Will let you know once I finish my work in milkdown.