Lean-yjs: formally proving the Yjs conflict resolution algorithms

Hello,

I’m working on iasakura/lean-yjs, a Lean4 project to formally model and prove the correctness of Yjs algorithms.
My goal is to contribute to the theoretical reliability of Yjs by providing formal proofs, and hopefully to share observations from the verification process, in case they might be useful for those interested in the correctness of the Yjs algorithm.

So far I’ve proved the commutativity of the Yjs conflict resolution algorithm, which is modeled as the integrate function from the reference implementation by Joseph Gentle.
Currently, my work is still at the algorithmic level of verification, but my plan is to gradually move towards proofs that are closer to the actual implementation.

Along the way, I also noticed that the original YATA insert algorithm described in “Near Real-Time Peer-to-Peer Shared Editing on Extensible Data Types” admits a potential non-convergent case.
This issue does not affect Yjs at all, because the Yjs algorithm already addresses this problem. In fact, my verification shows that my formal model of Yjs does not exhibit such non-convergent cases.
I’m sharing this because I believe it might be useful for those interested in understanding the correctness of the YATA algorithm.

Non-convergent example

(based on Section 3.4 insert algorithm)

Setup (three clients, concurrent inserts):

  • Initial document: empty.
  • Client 1 inserts "a" at position 0 concurrently with Client 2 inserting "b" at position 0.
    Hence: a.origin = ⊥, b.origin = ⊥.
  • Client 3 observes "b" and then inserts "c" after "b".
    Hence: c.origin = b.

We compare two scenarios that lead to two different results:


Order A: abc

  • A-1: apply a"a"
  • A-2: apply b
    • compare with a: a.origin = b.origin = ⊥ and a.id < b.id → place b after a
    • result: "ab"
  • A-3: apply c (with c.origin = b) → place c after b
    • result: "abc"

Order B: bca

  • B-1: apply b"b"
  • B-2: apply c (with c.origin = b) → "bc"
  • B-3: apply a
    • loop #1 vs b: a.origin = b.origin = ⊥ and a.id < b.id → no matching “place after” condition in this pass, keep scanning
    • loop #2 vs c: a.origin < c.origin (since c.origin = b and a.origin = ⊥) → place a after c
    • result: "bca"

Thus, the two scenarios diverge: "abc" vs "bca".

(This non-convergent case arises from the subtlety in the scanning process of the integration loop (e.g., comparing a vs. b), where the insert position does not move until an item appears whose origin makes the ordering unambiguous. This turned out to be one of the most challenging parts of my formal verification.)

If I’m misunderstanding something here, I’d be happy to be corrected.
I would greatly appreciate any feedback or comments.

1 Like

Hi @iasakura,

I really appreciate you working on this! :slight_smile:

Nine years later, you are actually the first to point out the issue in the pseudocode in the paper! The algorithm for item.integrate in the Yjs codebase did not change at all (aside from some optimizations I made). It seems I translated the code incorrectly to pseudocode. Thanks for going the extra mile and using the correct algorithm.

It’s unfortunate that the code in the paper is incorrect. I’m going to reference your repository and this discussion in the Yjs readme. I’m sure you will write a paper about lean-yjs. I would love to reference that as well.

1 Like

Hi @dmonad,

Thank you so much for your kind and detailed reply!

I’m glad that formal verification helped uncover a subtle issue in the pseudocode that went unnoticed for nine years. I’m also grateful for your clarification regarding the pseudocode. I now understand that the discrepancy is not because YATA and Yjs differ, but rather due to an error in the pseudocode itself. If I mention this point in the future (e.g., in a paper or in documentation), I’ll make sure to describe it that way.

I’m really excited to see lean-yjs referenced in the Yjs README. Thank you so much! I’ll keep improving lean-yjs, and at the same time, I’d love to work towards a paper, even if it may take quite some time since I’m not in academia right now. When a paper is published (maybe on arXiv), I’ll be sure to share it here.

1 Like

No worries about the paper! I just assumed that you’re coming from academia. Keep me updated on the repo, and thanks for working on this!

1 Like