Duplicate TextEvents when syncing a new client

Hello @dmonad

I’ve got another curious bug that I’m trying to sort out. Same context as before, with content being modeled as a YArray where nodes contain text. I’m using y-websocket to handle communication. Mostly things work fine – if I have multiple clients, changes from one client are propagated to the other, etc.

When I create a new client and connect with the server, the initial update returned from the server always contains an ArrayEvent that represents the entire current state of the document – this gets applied to the application level representation, and things are fine.

But here’s the weird part: for any YArray element that has had one or more characters removed from its text contents at any point during its lifetime, the initial update also contains a TextEvent that inserts the element’s entire text contents into the same node. Since I propagate the effect of all events to the application level, I end up with duplicated text in the application-level representation of those YArray elements: the first copy comes from the content contained in the ArrayEvent, and the second comes from the TextEvent.

I only see this behavior for YArray elements where one or more characters were deleted from their text contents at some point – for elements for which no characters were ever deleted, I only see them showing up in the ArrayEvent – the update message doesn’t contain the corresponding TextEvent.

I’ve attempted to watch what’s happening on the server side through judicious use of .observeDeep() and logging of events there, but I don’t see anything untoward – just the expected TextEvents as I add and remove characters. And when I try and reproduce the problem in a unit test by replicating the sequence of manual events (and validating that I see the same sequence of TextEvents on the simulated server document), it of course Just Works – i.e., I don’t see the extra TextEvents and the resulting text duplication in the application level.

Questions:

  • Does this ring any bells with you?
  • Can you think of any reason why the initial update contains the extra TextEvents in these conditions, even though their effect already seems to be fully represented in the ArrayEvent that is also contained in the same update?
  • If this is intended/expected behavior from Yjs, is there any straightforward policy that I can employ at the application level to filter out the extra TextEvents when appropriate?

Thanks!

Hey @kjohnson,

The way you described this issue led me to a bug in the event system.

Any kind of change must happen during a transaction. When a type is added (e.g. yarray.insert(0, [new Y.Text()])) and then the inner type is modified (yarray.get(0).insert(0, 'some content')), all during the same transaction, we expect that no TextEvent is fired, because the Text type was just added during the transaction.

But there is a bug in Yjs when content is deleted (e.g. yarray.get(0).delete(0, 5)). In this case, we send an event for adding the Y.Text element to Y.Array (this is where you compute the initial content), and also a change event for Y.Text (which is superfluous, because Y.Text was just added - hence probably the duplicated content).

I suspect that this is the reason for your bug.
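
As a stopgap at the application level, one possible policy is to ignore any event whose target was itself inserted by another event in the same batch. The following is only a minimal sketch, assuming events shaped like the ones observeDeep delivers (a target with a parent chain, and changes.delta carrying array inserts). The helper name is hypothetical and not part of Yjs.

```javascript
// Hypothetical helper, not part of Yjs: drop events whose target (or one of
// its ancestors) was inserted by another event in the same batch.
function dropEventsForNewlyInsertedTypes (events) {
  const inserted = new Set()
  for (const event of events) {
    for (const op of event.changes.delta) {
      // Array-style deltas carry inserted values as an array; text deltas
      // carry strings or embeds, which are skipped here.
      if (Array.isArray(op.insert)) {
        for (const value of op.insert) {
          if (value !== null && typeof value === 'object') inserted.add(value)
        }
      }
    }
  }
  return events.filter(event => {
    // Walk up the parent chain: a change inside a freshly inserted subtree
    // is already covered by the event that inserted the subtree.
    for (let type = event.target; type != null; type = type.parent) {
      if (inserted.has(type)) return false
    }
    return true
  })
}
```

This would be called at the top of the observeDeep callback, before the events are applied to the application model, since event.changes is meant to be read while the handler is running. It only looks at array-style inserts, so types added through a Y.Map are not covered.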

I just published Y̶j̶s̶@̶v̶1̶3̶.̶4̶.̶2̶ Yjs@v13.4.3 with a fix for this behavior. Please pull the latest version and let me know if this fixed the problem.

Hello @dmonad

Thanks for the quick reply! Turns out I was already using Yjs@v13.4.2, so that wasn’t the issue.

But after a bit more digging, I’ve been able to chase things down a bit further.

In the client-server setup that I had been using, the Y.Doc on the server was being created with GC enabled but gcFilter: (() => false) (this is inherited code, so I don’t yet have a good understanding of why the filter is set that way). But in your reference y-websocket server, I notice that gcFilter is left unset and thus defaults to () => true. Indeed, if I make some small application-specific mods to your server and run with it instead, the client-server test works fine – no more duplicated text.

And when I take that insight back to my attempt to reproduce this in a unit test – voila! If I create the Y.Doc without specifying gcFilter, I am unable to repro the duplicated text. But if I change it to match the server setup I had been using – i.e., gcFilter: (() => false) – then the unit test shows the same behavior I’m seeing in the full client-server setup: when I create a second client and simulate the initial client-server sync for it, I end up seeing the duplicated text at the application level.

For what it’s worth, I am also able to repro the problem if I disable GC (i.e., gc: false) and leave gcFilter unspecified.

At this point, I’m not clear whether we need GC enabled or disabled – my guess is that it’s okay to enable GC (meaning it looks like we won’t see the duplication issue). But I’m guessing that it’s probably undesirable to send the extra TextEvent if GC is disabled, so I suspect there is still a bug lurking here – even if it doesn’t affect us directly/immediately.

What do you think?

Whoops. I meant v13.4.3. I published it just before I wrote my comment - so it’s unlikely you already have it.

When you specify gcFilter (or enable/disable GC), you may end up with different events, possibly fired in a different order. So maybe that is why this change fixed your code.

Hello @dmonad

I confirm that upgrading to Yjs@13.4.3 appears to resolve the problem, regardless of how GC is configured.

Thanks!