Identify ydoc update event

folencao · April 6, 2023, 4:24am

@dmonad - We use the ydoc update event on the backend (i.e. y-websocket) to record which user made a change and when. We use this to build a Version History like the one in Google Docs.

There is a problem when a user first loads the page (the client creates the ydoc and binds it to ProseMirror): the first sync also triggers the ydoc update event. We can’t treat this event as a user change or activity, because it is not a real user update. Do you have any suggestion for how we can identify the update event of the first sync and ignore it?

raine · April 6, 2023, 1:11pm

On the backend, you could try waiting until the doc is initially synced before subscribing to update.

folencao · April 6, 2023, 1:48pm

@raine - I appreciate your reply, but I don’t think that works. For example, suppose the first client has already opened the page, so the backend has already loaded the doc data and subscribed to the update event. When a second client later opens that doc, its first/initial sync with the backend triggers the backend update event (the one we want to avoid), because the backend is already subscribed.

raine · April 6, 2023, 1:57pm

Ah, interesting.

So the clients are not making any changes to the doc on load?

And the update on the server is non-empty?

Makes me wonder what the update actually contains.

folencao · April 6, 2023, 2:26pm

Yeah, the clients didn’t make any changes.
(1) When the second user first accesses the page, it triggers one backend update event.
(2) But if the second user refreshes the page, it seems to trigger the update event multiple times. Weird; it feels like these are the multiple updates the first user made earlier.

And yes, the update is non-empty. That is the weird part, and it’s why we need to identify these updates.

raine · April 6, 2023, 3:52pm

Yeah, that does seem strange. I’m not sure what a non-empty update is doing if there are no changes. What is in e.changes.delta?

You could also track lastUpdated in a separate shared type. Though I’m not sure how to integrate that with your Version History.

folencao · April 7, 2023, 12:35am

Thanks, @raine

For now, our solution is that when the editor gets focus, the client sends a message/notification to the backend, and only then does the backend start accepting updates from this user. This works in most cases, but it is not ideal for us.
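One way to read the focus-based workaround above is as a small gate on the server. This is only a sketch under assumptions: `ActivityGate`, `markFocused`, and `record` are hypothetical names, and it assumes the server's update handler can tell which connection an update came from.

```javascript
// Hypothetical helper: only updates from connections that have reported
// editor focus are recorded as user activity.
class ActivityGate {
  constructor() {
    this.focused = new Set(); // connection ids that sent the focus message
    this.history = [];        // recorded version-history entries
  }

  // Call when the (custom) focus notification arrives from a client.
  markFocused(connId) {
    this.focused.add(connId);
  }

  // Call when the connection closes.
  disconnect(connId) {
    this.focused.delete(connId);
  }

  // Call from the update handler. Returns true if the update was recorded.
  record(connId, userId, update, now = Date.now()) {
    if (!this.focused.has(connId)) return false; // likely an initial sync
    this.history.push({ userId, at: now, bytes: update.length });
    return true;
  }
}
```

Updates that arrive before the focus message are simply skipped, which matches the "works in most cases" caveat: a change made without ever focusing the editor would not be recorded.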

NataliaMolchanova · July 12, 2023, 5:43am

Hi @folencao, I am looking for exactly the same thing. Did you find a better approach? I am comparing Y.snapshots to eliminate sync/update events, but this seems to be overkill. Also, I noticed that the state I receive after sync differs from the state before sync even if the snapshots are equal.

folencao · July 12, 2023, 7:00am

@NataliaMolchanova - I will share our findings and solution here for anyone who needs them:

Real user update:
You can extract the clientID from the result of Y.decodeUpdate(update). Ideally, your server stores a map from this Yjs clientID to the real user in your application.
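A minimal sketch of that lookup, assuming the decoded update has the shape `{ structs: [{ id: { client, clock } }, ...], ds }`. The helper names and the `clientToUser` map are hypothetical, and the decoded object is passed in so the snippet has no dependencies.

```javascript
// `decoded` is expected to look like the result of Y.decodeUpdate(update).
// Collects the Yjs clientIDs of all inserted structs in the update.
function clientIdsOfUpdate(decoded) {
  const ids = new Set();
  for (const struct of decoded.structs) {
    ids.add(struct.id.client);
  }
  return ids;
}

// Hypothetical: `clientToUser` is a Map the server maintains from
// Yjs clientID to an application user id.
function usersOfUpdate(decoded, clientToUser) {
  const users = new Set();
  for (const clientId of clientIdsOfUpdate(decoded)) {
    users.add(clientToUser.get(clientId) ?? "unknown");
  }
  return users;
}

// Usage on the server (assumes `Y` is the yjs module):
//   const users = usersOfUpdate(Y.decodeUpdate(update), clientToUser);
```

One caveat to verify for your setup: an update that contains only deletions may have no structs, so this sketch would return an empty set for it.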

Sync/update events when the data is first loaded from LevelDB
The problem (we use LevelDB to persist the update data):

1. The first user accesses a ydoc whose data hasn’t been loaded from LevelDB yet.
2. The server (i.e. bindState) tries to load the ydoc data from LevelDB asynchronously.
3. The server sends this ydoc data to the client as the initial sync step, which the client code does not treat as a user update.
  (See more details about the sync steps here https://github.com/yjs/y-protocols/blob/ba21a9c92990743554e47223c49513630b7eadda/sync.js#L15)

The problem is in step #3. Because the loading in step #2 is asynchronous, the ydoc data is not ready yet when step #3 runs, so the client initializes with an empty ydoc. Later, when the data from step #2 has loaded, it is sent to the client and treated as a normal user update. This triggers the client’s update event, and the client sends the update back to the server. This is why we see ydoc update events on the server during the initial sync, even though the user didn’t change any content.

Solution:
We made some changes to utils.js and the client y-websocket.js.

When the “messageYjsSyncStep1” message from the client (handled in messageListener) triggers the bindState method to load data from LevelDB, we first just store this connection and data (#1), then wait for the LevelDB read to finish, and then send all the data to the client (#2). We defined a new message type to do this.
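The idea of holding the connection until the load finishes can be sketched roughly as below. This is not the actual patch: `DeferredInitialSync`, `onSyncStep1`, `onLoaded`, and the `sendInitialState` callback are hypothetical names, and the custom message type is left out.

```javascript
// Hypothetical per-document gate: connections that ask for the initial sync
// before the persisted data is loaded are parked, then served once it is ready.
class DeferredInitialSync {
  // sendInitialState(conn) is supplied by the caller and should send the
  // document's current state to that connection as an initial sync.
  constructor(sendInitialState) {
    this.sendInitialState = sendInitialState;
    this.loaded = false;
    this.waiting = [];
  }

  // Call when a client's sync step 1 message arrives.
  onSyncStep1(conn) {
    if (this.loaded) {
      this.sendInitialState(conn);
    } else {
      this.waiting.push(conn); // reply later, after the load completes
    }
  }

  // Call once the persistence layer has finished loading the document.
  onLoaded() {
    this.loaded = true;
    for (const conn of this.waiting.splice(0)) {
      this.sendInitialState(conn);
    }
  }
}
```

With this shape the client never initializes from an empty doc, so the loaded data is not replayed to it later as if it were a user edit.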

I know this may be hard to follow as a whole, and I am not sure how to explain all the details, as I can’t simply share the entire code.

NataliaMolchanova · July 17, 2023, 4:51am

Thank you! You pushed me in the right direction, and I was able to get rid of the snapshot comparison.