How to debug a corrupted document?

Hello,

One of my users lost their document after refreshing the page. In the database, I can see that the updates are stored correctly, but when they are applied, the document doesn’t update at all.

I tried applying only the first few updates (even just the first one), but the document is still blank. I decoded the first update and can see some content in it (the doc structure). From what I can tell, all triggered updates have been saved correctly in the database.

In what cases could this kind of issue happen? Is there a way to know whether an update is missing or whether something went wrong?

Is there any way to potentially detect an issue like that before a page refresh?

Thank you for your help!

That’s a tough one. If even the first update does not get integrated into the Doc, maybe there are earlier updates that are missing.

Do you have a backup of the database from before the issue? That’s the only way I can think of to compare what is different.

Hi Raine,
Thanks for your reply!

I double-checked, and the first update is actually always the same across my documents and looks correct. As it’s just the document structure, nothing is visible, so I assumed it didn’t work, but it seems it does. So the issue is probably in another update. However, looking at the logs, it seems that no update is missing. The data in IndexedDB is also the same :neutral_face:

Not sure what went wrong but I’m worried that it can happen again to other users.

Lastly, how does YJS know that an update is missing? I’m assuming that YJS fails to apply the updates. Do you know if there is any way to catch the error? I’m not really sure what I should do in my case.

I believe that YJS stores out-of-order updates and waits until missing updates arrive. I assume that the clock is expected to increase by 1 each update, but would need to confirm. Maybe you can check the clocks of the updates in the corrupted Doc.

It looks like doc.store.pendingStructs.missing could be related but that’s just a guess.
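To make that guess easier to test, here is a minimal sketch of a check that could run after updates are applied, for example before relying on a page refresh. It assumes that doc.store.pendingStructs is null when nothing is pending and otherwise holds a `missing` Map from clientId to a clock the Doc is still waiting for. This is an internal field rather than documented API, so it may differ between Yjs versions. The helper name describePending is hypothetical.

```javascript
// Sketch: summarize what a Y.Doc-like object is still waiting for.
// Assumption: doc.store.pendingStructs is either null or an object with a
// `missing` Map of clientId -> clock (internal Yjs state, not public API).
function describePending (doc) {
  const pending = doc.store.pendingStructs
  if (pending === null || pending === undefined) {
    return { hasPending: false, missing: [] }
  }
  // Turn the Map into a plain array that is easy to log or send to a server.
  const missing = Array.from(pending.missing, ([client, clock]) => ({ client, clock }))
  return { hasPending: true, missing }
}
```

If hasPending is true after all stored updates have been applied, that would suggest at least one update from the listed clients never reached the Doc.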

Thank you so much for your input. Do you know how I can access the clock of an update? I tried to use logUpdate but it doesn’t give me that information.

The clock is part of the ID (clientId + clock) stored in the structs array.
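As a rough illustration of reading those IDs, the sketch below looks for per-client clock gaps. It assumes the input is an array of objects shaped like { id: { client, clock }, length }, which is the shape of the entries in Y.decodeUpdate(update).structs. A struct covers `length` clocks, so the next struct from the same client is expected at clock + length rather than clock + 1. The helper name findClockGaps is hypothetical.

```javascript
// Sketch: report clock ranges that no struct covers, per client.
// structs: [{ id: { client, clock }, length }, ...], e.g. collected from
// Y.decodeUpdate(update).structs over every stored update.
// Caveat: placeholder (Skip) structs in merged updates count as covered here.
function findClockGaps (structs) {
  const byClient = new Map()
  for (const s of structs) {
    const ranges = byClient.get(s.id.client) || []
    ranges.push({ start: s.id.clock, end: s.id.clock + s.length })
    byClient.set(s.id.client, ranges)
  }
  const gaps = []
  for (const [client, ranges] of byClient) {
    ranges.sort((a, b) => a.start - b.start)
    let expected = 0 // every client's clock starts at 0
    for (const r of ranges) {
      if (r.start > expected) gaps.push({ client, from: expected, to: r.start })
      expected = Math.max(expected, r.end)
    }
  }
  return gaps
}
```

Because ranges are grouped by client, a different clientId restarting at clock 0 is not reported as a gap. Only a hole within one client's own sequence is.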

I made a demo that prints out the clock and shows how missing updates are stored in pendingStructs when updates are applied out of order:

yjs-clocks-8npcy3 (CodeSandbox)

Aside: When cloning a Doc by applying updates, top-level shared types will only show up in doc.toJSON() once doc2.getMap(...) is called. I made this mistake while putting together the demo and it created some confusion.

yjs-cloning-top-level-shared-types (CodeSandbox)

Thanks again for your reply and sharing the demo!!

I looked at the clocks and noticed a few things (I don’t know exactly how the YJS clocks work):

  • there are gaps (I’m assuming it’s expected as the client id also changes?)
  • at the end, there are clocks starting with 0 again

Regarding the last point, I tried to load the document without the updates where the clock starts with 0 (just in case) but it doesn’t change anything.

Here are the clocks: https://codesandbox.io/p/devbox/sharp-pare-hl2t5m?file=%2Fupdates.json

Take the first client: 3827980336. What happens when you only apply their 45 updates? What gets integrated into the Doc? What shows up in pendingStructs?

I’m guessing the multiple clientIds come from loading the Doc from a provider after a refresh. You’d have to do some experimenting to see what clientIds and clock gaps are expected.
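One way to run that single-client experiment is sketched below. It assumes `updates` is an array of Uint8Array updates in stored order and that `Y` is the Yjs module, passed in as a parameter so the helper does not hard-code an import. It uses Y.Doc, Y.decodeUpdate, and Y.applyUpdate, plus the internal doc.store.pendingStructs field. The helper name replayForClient is hypothetical.

```javascript
// Sketch: apply only the updates written by one client, then report what
// the Doc is still waiting for.
// Usage (assuming the real library): import * as Y from 'yjs'
//   replayForClient(Y, updates, 3827980336)
function replayForClient (Y, updates, clientId) {
  const doc = new Y.Doc()
  let applied = 0
  for (const update of updates) {
    const { structs } = Y.decodeUpdate(update)
    // Keep updates whose structs all come from the chosen client.
    // Updates without structs (for example, deletions only) are skipped.
    if (structs.length > 0 && structs.every(s => s.id.client === clientId)) {
      Y.applyUpdate(doc, update)
      applied++
    }
  }
  // Internal state: null when nothing is pending (assumption, see above).
  const pending = doc.store.pendingStructs
  return { doc, applied, missing: pending ? Array.from(pending.missing) : [] }
}
```

To see what was integrated, call the getter your app uses (for example doc.getMap(...)) on the returned doc before doc.toJSON(). A non-empty `missing` list would point at the client and clock the Doc is still waiting for.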