YJS document / map emptied in some exceptional situations

I’m developing a SaaS that uses YJS to handle document synchronization. The application is in beta and we have about 700 users. Everything works fine 99.3% of the time… but occasionally a YDocument / YMap is completely emptied for no apparent reason.

  • We use the WebSocket and IndexedDB providers.
  • We persist the data locally with IndexedDB and in a Postgres database on the server.
  • Every document contains a single YMap that holds all the needed data. Either the document or the YMap is emptied (an empty document would result in a new empty map). It happens with very different documents, so it’s not a “bad manipulation” on my side but a bug at the core of the program.

I have absolutely no idea where the bug comes from. Since it happens so rarely, I am unable to reproduce it.

Do you have any idea what could potentially empty a YDocument / YMap? A race condition? Are there any similar precedents?

In the meantime, I’m going to add a lot of logs, saves and security checks to find out what can cause the issue.


I do hope you find what is causing this. I also have seen very occasional strange behaviour - not quite the same but possibly related. It goes like this: user 1 adds some data; user 2 views this data (let’s call this ‘state 1’) and then exits (e.g. closes their browser). User 1 changes the data to state 2. Some weeks later, user 2 views the data again (but does not change it); the data is found to have reverted to state 1 without any intervention from either user 1 or user 2. Unfortunately, this happens very occasionally and unpredictably and I have not found it possible to reproduce (although I have screenshots of each stage, so I am not dreaming it).


Are you guys both using the IndexedDB provider to store the data locally? I have seen a similar bug: if you don’t wait for the Yjs document to fully load before mounting the ProseMirror editor, you end up with ProseMirror’s empty initial state erasing your whole document. Something similar might be happening here, where the IndexedDB state is loaded after the document is initialized. Yjs then sees this as a new update and discards the fetched state.


Good to know @TeemuKoivisto!

It is not the case in my situation, I’m working with YJS shared types without necessarily connecting with my ProseMirror editor (some emptied documents did not contain textual information at all).

I’ve added many logs and I think I discovered what is causing the bug. It comes from the websocket provider rather than the IndexedDB provider (though I wonder whether it can also happen with the IndexedDB provider).

It is indeed a race condition:

  1. an empty document is created and starts synchronizing to load its content
  2. a websocket connection opens
  3. a call to the database is made to load the content (asynchronously)
  4. the websocket connection closes almost instantly (42ms later), before the database call is resolved, which triggers a save to the database
  5. since the document is actually empty (it has not been loaded yet), it overwrites the previous document value in the database

I’ve added a guard to prevent the writing of empty documents to the database, but this raises two questions:

  • why do some websocket connections open and close almost instantly (most of the time to reopen a few ms later)? ← this is the root of the bug
  • can the same situation happen with the IndexedDB provider?
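
The race in steps 1 to 5 and the effect of the guard can be replayed in a minimal sketch. It uses plain objects and an in-memory `Map` instead of real Yjs documents and Postgres; `database`, `saveToDatabase` and `openDocument` are hypothetical names introduced only for illustration, and the pending load is modelled as an explicit `finishLoad()` call so the ordering is deterministic.

```javascript
// Hypothetical in-memory stand-in for the Postgres table (documentId -> content).
const database = new Map([['doc-1', { title: 'Quarterly plan' }]]);

// Save handler with the guard: never persist a document that has no content.
function saveToDatabase(id, content, guardEmpty) {
  if (guardEmpty && Object.keys(content).length === 0) {
    return false;
  }
  database.set(id, { ...content });
  return true;
}

// Steps 1-3: create the empty in-memory document; the load stays pending
// until finishLoad() is called.
function openDocument(id, guardEmpty) {
  const content = {};
  return {
    content,
    finishLoad: () => Object.assign(content, database.get(id)),
    close: () => saveToDatabase(id, content, guardEmpty),
  };
}

// Steps 4-5 with the guard enabled: the close arrives before the load resolves.
const session = openDocument('doc-1', true);
const wasSaved = session.close(); // false: the empty state is not written
session.finishLoad();             // the stored content is still intact
```

Calling `openDocument('doc-1', false)` and closing it before `finishLoad()` reproduces the overwrite: the stored entry is replaced by an empty object.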

@micrology Which providers do you use?

All right! Good you resolved it. But I’m not sure what you mean by

“an empty document is created and start synchronizing to load its content”

Are you creating a completely new document or just fetching an old document that has previously been synced to Yjs? When you create new docs, I’d recommend immediately updating the Yjs state with the default contents, in case you are not using the default doc > paragraph > text as your initial doc.

Otherwise, it sounds like the same bug I had, where you are not waiting for the Yjs doc to fully load. You should add a provider.on('synced', ...) callback to your provider and set your ProseMirror doc inside it, e.g.:

    provider.on('synced', () => {
      const pmDoc = yDocToProsemirrorJSON(yDoc, 'pm-doc')
      viewProvider.setDoc(pmDoc)
    })

and then start syncing through Yjs.
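
A slightly more defensive variant of that callback, as a sketch: it assumes a y-websocket-style provider that exposes a boolean `synced` property and `on`/`off` methods, and whose 'synced' event may also fire with `false` when the sync state is lost. `onceSynced` is a hypothetical helper name, not part of any Yjs package.

```javascript
// Runs `callback` exactly once, as soon as the provider reports a completed sync.
// Assumption: `provider.synced` is a boolean and the 'synced' event passes the
// new state as its first argument (anything other than `false` counts as synced).
function onceSynced(provider, callback) {
  if (provider.synced) {
    callback(); // already synced, no need to wait
    return;
  }
  const handler = (isSynced) => {
    if (isSynced === false) return; // sync was lost, keep waiting
    provider.off('synced', handler);
    callback();
  };
  provider.on('synced', handler);
}
```

The body of the callback would be the same as above (build the ProseMirror doc, then mount the editor); the helper only ensures that it runs once and never on an un-synced document.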

I’m using y-websocket and y-indexeddb.

If I’ve understood this correctly, it doesn’t matter whether the document (see step 1) is empty or not. It just happens that you have noticed and tested the case when it is empty. But the same could occur even for a non-empty document. Am I right? If so, your ‘guard’ might help but would not completely resolve it.
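
One way to cover the non-empty case as well is to key the guard on whether the stored state has been loaded, rather than on whether the in-memory document is empty. The sketch below uses plain objects and a `Map` instead of Yjs and a real database; `DocumentSession` and its methods are hypothetical names, and with real Yjs documents the stored state would be merged in with `Y.applyUpdate` rather than `Object.assign`.

```javascript
// Hypothetical server-side session for one document; not part of Yjs or y-websocket.
class DocumentSession {
  constructor(id, store, initialContent = {}) {
    this.id = id;
    this.store = store;                   // Map standing in for the database
    this.content = { ...initialContent }; // may be non-empty (e.g. default values)
    this.loaded = false;                  // true only after the stored state is applied
  }

  // Called when the asynchronous database read resolves.
  applyStoredState(stored) {
    Object.assign(this.content, stored);
    this.loaded = true;
  }

  // Called when the connection closes. Returns true if a write happened.
  persistOnClose() {
    if (!this.loaded) return false; // nothing authoritative to write yet
    this.store.set(this.id, { ...this.content });
    return true;
  }
}
```

A session created with default content and closed before `applyStoredState()` runs writes nothing, whereas a guard based only on emptiness would let those defaults overwrite the stored document.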

Loading data with YJS is different than loading data with a classic system.

In a conventional setup, you would do:

    const document = await loadDocumentFromServer(documentId)

But with YJS, you need to create an empty document with the given id, and synchronization then happens automatically through the attached providers:

    const document = new Y.Doc({ guid: documentId });
    // document is presently empty, but synchronization starts as soon as a provider is attached...

This is what I’m calling “Step 1”. I don’t put data in the document when I want to load it, but indeed, if you put data in your document before starting the synchronization, the same bug could happen.

Since I never put data in a document when I’m trying to load it, the guard should work.

But I just had another case of disappearing data today, so I guess the issue is not fixed by the guard. I need to investigate further…

This time, it’s not an overwrite with an empty document, but an overwrite with a document reset to its default state.