Creating & applying patches for Yjs shared data types

Background

My goal is to allow users to checkout the state of a sidebar to a branch, make changes to it, including reordering and moving children between parents, and ultimately apply only their changes to the upstream state. I’m using this type of data structure to represent the sidebar state: { ParentId: ChildId[] }.

Strategy

When the user checks out a branch, we store the start state of the sidebar. When we want to apply the user’s changes to upstream we generate two YDoc's to compare its current state against the start state to generate a diff. The structure of the YDoc is YMap<ParentId, YArray<ChildId>>, the diff is generated from Y.encodeStateVector and Y.encodeStateAsUpdate, and the diff is applies with Y.applyUpdate.

Example

Upstream starts as { Category1: [Page1], Page1: [Page2, Page3], Category2: [Page4] } and looks like:

  • Category 1
    • Page 1
      • Page 2
      • Page 3
  • Category 2
    • Page 4

User A checks out the branch and decides to reorder Page2 & Page3 so its state becomes { Category1: [Page1], Page1: [Page3, Page2], Category2: [Page4] } and looks like:

  • Category 1
    • Page 1
      • Page 3
      • Page 2
  • Category 2
    • Page 4

Meanwhile upstream someone merges in a change that simply deletes Category 2. So the upstream state becomes { Category1: [Page1], Page1: [Page2, Page3] } and looks like:

  • Category 1
    • Page 1
      • Page 2
      • Page 3

Ultimately when User A goes to merge in their changes we should expect their diff to only represent the reorder because that is the only thing that changed on their branch. The expected end result is { Category1: [Page1], Page1: [Page3, Page2] } and should look like:

  • Category 1
    • Page 1
      • Page 3
      • Page 2

Code

This is roughly what my code looks like to get a better sense for implementations details:

function serializeSidebarToCRDT(sidebar, docKey = 'root') {
  const doc = new Y.Doc();
  Object.entries(sidebar).reduce((acc, [key, value]) => {
    const children = new Y.Array();
    children.push(value);
    acc.set(key, children);
    return acc;
  }, doc.getMap(docKey));
  return doc;
}

function getSidebarDiff (prev, next) {
  const prevCRDT = serializeSidebarToCRDT(prev);
  const nextCRDT = serializeSidebarToCRDT(next);
  const stateVector = Y.encodeStateVector(prev);
  const diff = Y.encodeStateAsUpdate(next, stateVector);
  return diff;
}

function applySidebarDiff(upstream, prev, next) {
  const doc = serializeSidebarToCRDT(upstream);
  const diff = getSidebarDiff(prev, next);
  Y.applyUpdate(doc, diff);
  return doc;
}

// Start state of the branch
const prev = { Category1: [Page1], Page1: [Page2, Page3], Category2: [Page4] };
// Current state of the branch
const next = { Category1: [Page1], Page1: [Page3, Page2], Category2: [Page4] };
// Upstream state, which has diverged
const upstream = { Category1: [Page1], Page1: [Page2, Page3] };

const nextUpstream = applySidebarDiff(prev, next, upstream);
expect(nextUpstream).toStrictEqual({ Category1: [Page1], Page1: [Page3, Page2] });

Issue

While the expected end state should be { Category1: [Page1], Page1: [Page3, Page2] }, in the above example the actual end state ends up being { Category1: [Page1], Page1: [Page3, Page2], Category2: [Page4] } where the branch’s diff seems to revert some of upstream’s changes.

I would note that some scenarios seem work out fine. For example if we are adding new children to a specific parent on both a branch and upstream, those appear to merge without issue.

Questions

  • Is this a scenario that Yjs could support?
  • If so, what could I do differently to generate and apply the diff to end up with the desired end state?

To somewhat answer my own question, my assumption here is that I’m using CRTDs in a way they aren’t designed for.

In this example each CRDT is constructed from a JS Object and we are using the different structures to generate a diff/patch to apply to another CRDT. As they are created independently and they would have different Lamport timestamps referring to their internal Items. Even thought they contain the same or similar data then there’s no way for the transaction history to successfully generate/apply diffs between these CRDT if they don’t share share the same origin and therefore have mismatching Lamport timestamps.

Likely the solution here would be to encode upstream as a YDoc, on the creation of a branch create another YDoc, make changes to that document as desired and finally just apply the state of the branch to upstream.

Practically I’m not sure that this solution will work in this application as the upstream sidebar is a derived state from information stored in database collections, so to store this state as YDoc would create another source of truth and get somewhat messy, but it’s something we could continue to investigate.

Not sure if it’s possible, but I’d be interested if there’s a solution that doesn’t involve having to maintain the same YDoc instance on upstream. Otherwise a better solution to this problem might be to find library that generates and applies git-like patches to JSON structures. That said, not having to deal with conflicts is definitely one major benefit of a CRDT approach and in general Yjs has been a pleasure to work with :tada:

1 Like

Welcome to the discussion board @ilias-t,

unfortunately, that’s not how Yjs works. Yjs documents start empty. Any insertion is then regarded as an operation. If, for example, you’d initialize two Yjs document exactly the same way using the same content. The merged document would contain the same content twice:

  const prevCRDT = serializeTextToCRDT('abc');
  const nextCRDT = serializeTextToCRDT('abc');
  merge(prevCRDT, nextCRDT) // => 'abcabc'

Yjs does not work like a git-diff. Yjs documents are more like git branches. Two branches may share some common history. The conflicting operations will be automatically merged. When two branches applied the same commits (in any order) they will show the same content.

You actually need to retain the Yjs history to be able to compute the differences using Yjs.

1 Like

Ok great yeah that answers my question, I’ll try to fix my code with this in mind—thanks so much!