Modeling Slate split_node behavior in Yjs

BitPhinix · November 14, 2020, 11:02am

Is there a way to perform a Slate-like split_node operation in Yjs?

Currently I’m modeling a split node operation by removing the 2nd part of the split text from the “origin” node and creating a new text with the removed part, but this leads to issues:

Let’s say we have 2 clients with the same state (a paragraph with the text hello world):
{type: "paragraph", children: [{ text: "hello world"}]}

Client 1 performs a split_node operation inside the paragraph with offset 5. This results in the following operations:

remove text " world" from [0, 0]
insert new text node containing " world" at [0, 1]

Client 2 simultaneously performs a split_node operation inside the paragraph with offset 6. This results in the following operations:

remove text “world” from [0, 0]
insert new text node containing “world” at [0, 1]

If the clients are now synced, the remove-text operations will be combined and both insert-node operations will be applied, so both the “world” and “ world” text nodes end up being inserted. Yjs can’t know that the new nodes are the result of split_node operations.
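To make the duplication concrete, here is a minimal sketch in TypeScript. It is a toy model, not the Yjs API: the names `SplitOps`, `splitAsOps`, and `mergeSplits` are hypothetical, and the merge rule (deletions are unioned, every inserted node survives) is an assumption taken from the description above. In a real Yjs document the order of the two inserted nodes would be decided by the CRDT, not by array order.

```typescript
// Toy model (NOT the Yjs API): a paragraph is a list of text nodes.
type TextNode = { text: string };

// A split expressed as described in the post: delete the tail of the
// origin node, then insert a new node that carries the deleted text.
interface SplitOps {
  deleteFrom: number; // offset in the origin node where the deletion starts
  inserted: TextNode; // new node holding the removed tail
}

function splitAsOps(text: string, offset: number): SplitOps {
  return { deleteFrom: offset, inserted: { text: text.slice(offset) } };
}

// Assumed merge rule: a character deleted by either client stays deleted,
// and every inserted node is kept.
function mergeSplits(text: string, ops: SplitOps[]): TextNode[] {
  // Each deletion runs from its offset to the end of the node, so the
  // union of all deletions starts at the smallest offset.
  const cut = Math.min(...ops.map((o) => o.deleteFrom));
  return [{ text: text.slice(0, cut) }, ...ops.map((o) => o.inserted)];
}

const originText = "hello world";
const mergedNodes = mergeSplits(originText, [
  splitAsOps(originText, 5), // Client 1
  splitAsOps(originText, 6), // Client 2
]);

console.log(mergedNodes.map((n) => JSON.stringify(n.text)).join(", "));
// prints: "hello", " world", "world"
```

The origin node keeps only “hello”, while the tail shows up twice, once per client.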

Any ideas on how to solve this? I’m sure that’s an issue other bindings have to deal with as well.

dmonad · November 16, 2020, 9:29pm

There are two answers to this:

Answer 1: Use Y.Text

It is possible if you model Slate content as Y.Text. The Quill editor binding, for example, only uses text attributes to format paragraphs (specifically, these attributes are applied to the “\n” newline character). But designing editor models using only Y.Text is pretty limiting and overly complex. If this is what you are looking to achieve, it is probably the right solution, though.

You can certainly design abstract tree structures on top of Y.Text. Quill/Parchment and Google Docs also represent changes on linear structures. You should definitely look into Quill’s delta format (which is supported by Y.Text) to model tree document structures in a linear structure. This will allow you to implement your desired split_node behavior.
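As a rough illustration of why the linear representation avoids duplication, the sketch below models the document as one string in which “\n” ends a paragraph, so a split is a single insertion of “\n”. This is a toy model in TypeScript, not Y.Text or the Quill delta API; `Insert` and `applyConcurrent` are hypothetical names, and the merge simply applies every concurrent insertion against the shared base text.

```typescript
// Toy linear model (NOT Y.Text or Quill): "\n" terminates a paragraph,
// so splitting a paragraph is one insertion of "\n" at an offset.
type Insert = { offset: number; text: string };

// Apply inserts that were all made concurrently against the same base text.
// Working from the highest offset down keeps the lower offsets valid.
function applyConcurrent(baseText: string, inserts: Insert[]): string {
  return [...inserts]
    .sort((a, b) => b.offset - a.offset)
    .reduce(
      (doc, ins) => doc.slice(0, ins.offset) + ins.text + doc.slice(ins.offset),
      baseText
    );
}

const linearBase = "hello world\n";
const linearMerged = applyConcurrent(linearBase, [
  { offset: 5, text: "\n" }, // Client 1 splits at offset 5
  { offset: 6, text: "\n" }, // Client 2 splits at offset 6
]);

// Drop the empty string that follows the final "\n".
const linearParagraphs = linearMerged.split("\n").slice(0, -1);
console.log(JSON.stringify(linearParagraphs));
// prints: ["hello"," ","world"]
```

Both splits are preserved and no text is duplicated, at the cost of an extra paragraph that contains only the space between the two split points.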

Answer 2: There is no right solution for splitting nodes

Sync conflicts are resolved almost immediately. So in the unlikely case that two users really split the same node concurrently, the users will easily manage to undo one of the splits and continue working together. Shared editing cannot be implemented perfectly and it is impossible to model every intention. Most users will avoid working on the same paragraph anyway when they see the cursor location of another user. So implementing shared cursors already solves this issue.

When you consider offline-editing (users performing changes without a network connection, and later they merge changes), there are no good solutions:

Continuing your example: Let’s say Client 1 splits the paragraph and prepends “my” to the second paragraph, so we end up with “hello \nmy world”. Client 2 simultaneously splits the paragraph and prepends “your” to the second paragraph, so we end up with “hello \nyour world”. In the best-case scenario (which is what you get with the current split behavior), the merge yields “hello \nmy world \nyour world”. This is readable and every paragraph makes sense. In your scenario you would end up with “hello \nmy \nyour world”.
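The two merge outcomes in this example can be compared side by side with a small TypeScript sketch. It is a simplified model under stated assumptions, not Yjs code: `mergeWithDuplication` and `mergeWithTrueSplit` are hypothetical helpers, both clients are assumed to split at offset 6, and the prefixes carry a trailing space (“my ”, “your ”).

```typescript
// Simplified model (NOT Yjs): compare the two merge outcomes from the example.

// Duplication: every client deletes the tail and re-inserts it, with its own
// prefix, in a fresh paragraph. After merging, the tail exists once per client.
function mergeWithDuplication(text: string, offset: number, prefixes: string[]): string[] {
  const tail = text.slice(offset);
  return [text.slice(0, offset), ...prefixes.map((p) => p + tail)];
}

// True split (linear model): every client inserts "\n" plus its prefix at the
// split offset. The tail is never copied, so it ends up in the last paragraph only.
function mergeWithTrueSplit(text: string, offset: number, prefixes: string[]): string[] {
  const inserted = prefixes.map((p) => "\n" + p).join("");
  return (text.slice(0, offset) + inserted + text.slice(offset)).split("\n");
}

const offlineBase = "hello world";
const clientPrefixes = ["my ", "your "]; // Client 1, Client 2

const duplicated = mergeWithDuplication(offlineBase, 6, clientPrefixes);
const trueSplit = mergeWithTrueSplit(offlineBase, 6, clientPrefixes);

console.log(JSON.stringify(duplicated)); // prints: ["hello ","my world","your world"]
console.log(JSON.stringify(trueSplit));  // prints: ["hello ","my ","your world"]
```

With duplication every paragraph reads as a complete phrase; with the true split the “my ” paragraph is left dangling.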

When we duplicate content in the split_node scenario, we always end up with more content after a merge. In most cases, this results in content that makes sense after merging. This is why I prefer the node representation of paragraphs. In some cases, when splitting nodes using the method suggested in Answer 1, you end up with weird, dysfunctional text that can’t be read. Another advantage of using duplication: later, when you implement suggestions and snapshot diffs, users will have an easier time reverting or accepting specific changes.

That said, there are certainly scenarios where true splitting of nodes is preferable. From my personal experience (working on shared editing since 2015) I believe now that duplicating content is the way to go. Maybe you can model some scenarios and write down the advantages of using true split_node as well. My points are 1. that the feature is irrelevant when users have a real-time connection and 2. that duplicating content is preferable when implementing “showing the differences between versions”.

In the future, I plan to implement functionality to move ranges of text. With this feature, you might implement true node_split on Y.Xml structures. But this feature will come with additional complexity and computational overhead.

An interesting side effect of using tree structures (e.g. Y.Xml) instead of representing everything as Y.Text is that Yjs can garbage-collect more content, resulting in smaller documents. The reason is that deleted paragraphs can be efficiently garbage-collected. Quill documents are always larger than ProseMirror documents because Y.Text needs to preserve more information that is potentially still relevant.

(I’ve been a bit unclear on a lot of the terminology. So let me know if you need me to clarify something.)

Clarifications:

Yjs nodes (e.g. Y.XmlElement) are usually split by deleting part of the content and then inserting the deleted content again in a fresh node. Yjs doesn’t support moving of text-ranges yet (it might never support this feature).
“true node_split” ⇒ I mean being able to split nodes without duplicating content.

BitPhinix · November 17, 2020, 9:06pm

Hi dmonad,

thank you so much for this extensive answer. It makes total sense.

dmonad · November 20, 2020, 1:53pm

Phew, I composed this text pretty late at night and really didn’t expect that answer