Repair a doc when some updates are missing

I store the updates of a Y.Doc remotely in a DB. If we occasionally fail to store an update, or an update goes missing for some reason, how can I recover, or keep storing new updates, without breaking the final doc?

How can you miss an update? If the client is still connected, they should resend the update.

If the histories diverge between client and server, the doc should work again once the client quits their session and reloads the doc from the DB. However, I have noticed that if the client and server go out of sync, the Yjs doc on the server stops triggering update events until the client gets the missing updates. So if you are saving updates based on those events, you’ll miss a whole lot more than a single update.

So you should definitely send an error event or something similar to the client when this happens, to force a reconnect and a reload of the doc. That’s probably not necessary if only a DB save fails; even if an update goes missing, you should be able to handle it inside the same Yjs session. And make sure your updates are saved at least 99.99% of the time.

Actually, I’m not missing any updates; I still store every update. For some reason, the update just does not change the doc.

import { onStoreDocumentPayload } from "@hocuspocus/server";
import { db, updateCache } from "./config";
import { encodeStateAsUpdate, mergeUpdates } from "yjs";
import { FieldValue } from "firebase-admin/firestore";
import { User } from "./updates-cache";

const getUniqueUsers = (users: User[]) => {
  if (!users || users.length === 0) return [];
  const uniqueUsers: Map<string, User> = new Map();
  users.forEach((user) => {
    if (user && user.uid) {
      uniqueUsers.set(user.uid, user);
    }
  });
  return Array.from(uniqueUsers.values());
};

export const onStoreDocument = async ({
  documentName,
  document,
}: onStoreDocumentPayload) => {
  // Nothing was cached since the last store, so there is nothing to persist.
  if (updateCache.updates.length === 0) {
    console.log("Updates empty, skipping store");
    return true;
  }

  const docRefUpdates = db.collection(`docs/${documentName}/updates`);

  const megaUpdate = mergeUpdates(
    updateCache.updates.map((data) => data.update)
  );

  try {
    if (megaUpdate) {
      await docRefUpdates.add({
        data: megaUpdate,
        createdDate: FieldValue.serverTimestamp(),
        size: megaUpdate.byteLength,
        createdBy: getUniqueUsers(
          updateCache.updates.map((data) => data.createdBy) as User[]
        ),
      });

      updateCache.clear();
    }

    // Now store the full state plus a JSON view of the doc for viewing
    const docRef = db.doc(`docs/${documentName}`);
    const blocks = document.getArray("blocks");
    const theme = document.getMap("theme");
    const state = Buffer.from(encodeStateAsUpdate(document));

    console.log(updateCache.totalUpdates);

    await docRef.update({
      data: state,
      blocks: JSON.stringify(blocks.toJSON()),
      theme: JSON.stringify(theme.toJSON()),
      lastUpdatedDate: FieldValue.serverTimestamp(),
    });
  } catch (error) {
    console.log(error);
  }
};

I have code like this. Sometimes, for some reason, certain updates don’t produce any change in the original doc. For example, looking at one such update:

{
  "structs": [
    {
      "id": {
        "client": 3462415570,
        "clock": 6
      },
      "length": 1,
      "origin": {
        "client": 943006925,
        "clock": 5
      },
      "left": null,
      "right": null,
      "rightOrigin": null,
      "parent": null,
      "parentSub": null,
      "redone": null,
      "content": {
        "arr": [
          "#b31412"
        ]
      },
      "info": 2
    }
  ],
  "ds": {
    "clients": {}
  }
}

As you can see, I changed a color to #b31412, but this change isn’t reflected in the document, and from that point onwards none of the changes I make are reflected.

So I was curious why this happens and how I can avoid it.

Oh I see. Yes, that’s indeed quite troublesome.

Does this happen after you reload the doc from the DB or during editing? And if you have two people editing at the same time, do their updates stop showing up for one another?

Also, just to make sure, try using the y-websocket example server to make similar updates, to check that the problem doesn’t lie client-side (I tried this a couple of weeks ago and it should work: https://github.com/TeemuKoivisto/yjs-demos/blob/main/demo-server/demo-server.js ).

If you’re getting dropped updates, it might be worth considering the Yjs sync protocol:

Core Yjs defines two message types:

  • YjsSyncStep1: Includes the State Set of the sending client. When received, the client should reply with YjsSyncStep2.
  • YjsSyncStep2: Includes all missing structs and the complete delete set. When received, the client is assured that it received all information from the remote client.

So in SyncStep1, client A tells client B which updates it already has, and in SyncStep2, Client B sends the missing updates. This occurs in both directions to get the Docs in sync.
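To make the two steps concrete, here is a minimal sketch of the same exchange done in-process with the core Yjs functions `encodeStateVector`, `encodeStateAsUpdate` and `applyUpdate`. The helper names `syncFrom` and `syncBothWays` are made up for illustration, and a real provider would send these byte arrays over the network rather than pass them directly.

```typescript
import * as Y from "yjs";

// Hypothetical helper: one direction of the handshake, run in-process.
function syncFrom(source: Y.Doc, target: Y.Doc): void {
  // SyncStep1: the target announces which structs it already has.
  const targetStateVector = Y.encodeStateVector(target);
  // SyncStep2: the source answers with only the missing structs
  // (plus its delete set), computed against that state vector.
  const missing = Y.encodeStateAsUpdate(source, targetStateVector);
  Y.applyUpdate(target, missing);
}

// Hypothetical helper: run the exchange in both directions.
function syncBothWays(a: Y.Doc, b: Y.Doc): void {
  syncFrom(a, b);
  syncFrom(b, a);
}

// Two docs that were edited independently.
const docA = new Y.Doc();
const docB = new Y.Doc();
docA.getText("t").insert(0, "hello");
docB.getText("t").insert(0, "world");

syncBothWays(docA, docB);

const textA = docA.getText("t").toString();
const textB = docB.getText("t").toString();
```

After the exchange both docs should hold the same text; the order of the two words depends on the client IDs, so it is not fixed.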

If an update is dropped, this will indeed prevent later updates that depend on it from taking effect. An update is only valid for a given start state, so it doesn’t make sense to apply an update that is valid at time X to the state at time Y. Yjs keeps such updates pending until the missing one arrives.
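A small experiment can show this effect. The sketch below records three incremental updates from one doc, then replays them into a second doc while skipping the middle one. The variable names are illustrative only, and the "drop" is simulated by not applying `captured[1]`.

```typescript
import * as Y from "yjs";

// Record every incremental update the source doc emits.
const source = new Y.Doc();
const captured: Uint8Array[] = [];
source.on("update", (update: Uint8Array) => {
  captured.push(update);
});

// Three separate edits, each in its own transaction,
// so each one produces its own update.
const text = source.getText("t");
text.insert(0, "a");
text.insert(1, "b");
text.insert(2, "c");

// Replay into a replica, but "drop" the second update.
const replica = new Y.Doc();
Y.applyUpdate(replica, captured[0]);
Y.applyUpdate(replica, captured[2]);
const beforeRepair = replica.getText("t").toString();

// Deliver the dropped update late; the pending one can now integrate.
Y.applyUpdate(replica, captured[1]);
const afterRepair = replica.getText("t").toString();
```

Before the late delivery the replica should only contain the first edit, because the third update is waiting on the second. Once the missing update is applied, the replica should catch up with the source.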

A client should not update its state vector until the updates are successfully stored. y-websocket handles this for you. If you’re building a custom provider, you should have a similar mechanism in place.
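If a gap has already made it into the stored updates, one possible way to repair it is sketched below, assuming the server still holds a complete in-memory doc. The helper `computeRepairUpdate` is hypothetical: it rebuilds a doc from the stored updates, reads the state vector of what actually integrated, and asks the complete doc for everything beyond that. The result could then be stored as one more update.

```typescript
import * as Y from "yjs";

// Hypothetical helper: returns an update containing whatever the
// stored updates fail to reproduce, relative to a complete live doc.
function computeRepairUpdate(
  storedUpdates: Uint8Array[],
  liveDoc: Y.Doc
): Uint8Array {
  const rebuilt = new Y.Doc();
  for (const update of storedUpdates) {
    Y.applyUpdate(rebuilt, update);
  }
  // Only integrated structs count here; pending ones are excluded.
  const persistedStateVector = Y.encodeStateVector(rebuilt);
  return Y.encodeStateAsUpdate(liveDoc, persistedStateVector);
}

// Simulate a live doc whose second update never reached storage.
const liveDoc = new Y.Doc();
const emitted: Uint8Array[] = [];
liveDoc.on("update", (update: Uint8Array) => {
  emitted.push(update);
});
const liveText = liveDoc.getText("t");
liveText.insert(0, "a");
liveText.insert(1, "b");
liveText.insert(2, "c");
const stored: Uint8Array[] = [emitted[0], emitted[2]];

// Append the repair update to storage, then reload from storage alone.
stored.push(computeRepairUpdate(stored, liveDoc));
const reloaded = new Y.Doc();
for (const update of stored) {
  Y.applyUpdate(reloaded, update);
}
const reloadedText = reloaded.getText("t").toString();
```

This only helps while some peer still has the full history. If the missing update exists nowhere, the edits that depend on it cannot be reconstructed this way.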

It happens after I reload. At that point no edits are happening, and I’m just reconstructing the Y.Doc from the DB.

Will check it out, thanks.