Validation, security and middleware

Hey everyone!

I’ve been exploring the Y ecosystem and have built an integration with Django (channels-yroom). I’m wondering how best to implement validation, authorization/authentication and, more generally, some sort of middleware business logic for a Y doc server-like peer.

Right now I’m decoding the binary sync protocol and can e.g. restrict ‘write’ updates to authenticated connections. But what if I want to restrict the creation or validate the contents of types in the document? As far as I can tell a rogue JS client can create arbitrarily named types with arbitrary content and sync them to others. It seems like other peers have no easy or blessed way to look into an update before applying it. My current attempt is to copy the current state, apply the update to the copied doc, observe the changes and then decide if I want to apply the update on the synced doc. This is made more difficult by the fact that types in observable callbacks don’t seem to know their own names.
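
For reference, the copy-then-check approach described above might look roughly like this in JavaScript. It's a minimal sketch, not a blessed pattern: `validateUpdate`, `applyIfValid` and `allowedRoots` are made-up names, `Y` is the yjs module passed in as a parameter, and it assumes that `doc.share` lists every root-level type name once a remote update has been applied.

```javascript
// Sketch: trial-apply an incoming update on a scratch copy before touching the live doc.
// `Y` is the yjs module (e.g. from `import * as Y from 'yjs'`), passed in by the caller.
// `validateUpdate`, `applyIfValid` and `allowedRoots` are hypothetical names.
function validateUpdate(Y, liveDoc, update, allowedRoots) {
  const scratch = new Y.Doc();
  // Copy the current state of the live doc into the scratch doc.
  Y.applyUpdate(scratch, Y.encodeStateAsUpdate(liveDoc));
  // Apply the candidate update to the copy only.
  Y.applyUpdate(scratch, update);
  // Assumption: doc.share maps every root-level type name to its type,
  // including names that were first introduced by a remote update.
  const rejected = [];
  for (const name of scratch.share.keys()) {
    if (!allowedRoots.has(name)) rejected.push(name);
  }
  scratch.destroy();
  return { ok: rejected.length === 0, rejected };
}

// Apply the update to the live doc only if the trial run passed.
function applyIfValid(Y, liveDoc, update, allowedRoots) {
  const result = validateUpdate(Y, liveDoc, update, allowedRoots);
  if (result.ok) Y.applyUpdate(liveDoc, update);
  return result;
}
```

Since root types are looked up by name in `doc.share` here, the check does not depend on observer callbacks knowing their own names. The obvious cost is a full state copy per incoming update, and this only checks names; content checks would have to read the scratch doc before it is destroyed.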

I feel like I’m missing something. What’s the right pattern here? Can I only authenticate the WebSocket connection and then have to trust the actions of other peers?

That’s the best approach that I can think of. Yjs doesn’t currently have anything like middleware.

One thing to be aware of is that updates can arrive out-of-order. An out-of-order update may not result in an observable change. It will only be noticeable when the missing updates are applied. So when your validator fails, it’s possible that the invalid update is not the current one, but several updates back. However, if you avoid applying the new update, the integrity of the Doc should be preserved since the out-of-order updates have no effect.

The other question is about where to go once you do discover an invalid update. Is the Doc frozen in time? Are all updates that depend on the invalid update rejected, but other updates permitted? I’m unsure what behavior will be impaired if an update is simply omitted. It would be worth doing some experiments with that.

A more efficient solution would be to use Y.decodeUpdate and validate at the level of Yjs Items. However, that involves getting into Yjs internals. I’m not sure how much complexity is involved.
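
To give an idea of what validating at the Item level could look like, here is a rough sketch built on Y.decodeUpdate. It leans on internals, so treat the details as assumptions: it assumes that a decoded Item without an origin carries the name of its root-level parent as a plain string in `parent`, and that the update uses the default (v1) encoding. `rootNamesInUpdate` and `findDisallowedRoots` are hypothetical helper names, and `Y` is the yjs module passed in by the caller.

```javascript
// Sketch: inspect an update without applying it to any doc.
// `Y` is the yjs module, passed in by the caller.
function rootNamesInUpdate(Y, update) {
  const { structs } = Y.decodeUpdate(update);
  const names = new Set();
  for (const struct of structs) {
    // Assumption (Yjs internals): an Item whose parent is a root-level type
    // and which has no origin stores that parent's name as a string.
    // Other structs have an ID, null or no parent at all, and are skipped.
    if (typeof struct.parent === 'string') names.add(struct.parent);
  }
  return names;
}

// Return the root-type names mentioned in the update that are not allowed.
function findDisallowedRoots(Y, update, allowedRoots) {
  const disallowed = [];
  for (const name of rootNamesInUpdate(Y, update)) {
    if (!allowedRoots.has(name)) disallowed.push(name);
  }
  return disallowed;
}
```

Items that are positioned relative to an existing item do not repeat the parent name, so an empty result only means that the update names no disallowed root explicitly. It says nothing about the content being inserted.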

Thanks for taking the time to reply.

Ideally, client-side validation on top of Yjs only permits valid transactions. Assuming that only malicious clients circumvent client-side validation, it is acceptable if those clients’ behavior is impaired.

I’m using yrs – the Rust port – and the public API there is currently too limited to dig deeper.

Anyway, I’m surprised that this lack of validation is not an issue for more users.
As an example, a malicious user could spam a ydoc with hidden data:

doc.transact(() => { doc.getText("hidden-data").insert(0, incompressibleGarbage) }) // incompressibleGarbage: any large, high-entropy string

which would make all updates and future syncs larger, possibly escalating to a denial of service, if I understand correctly. A middleware or just some filtering function would be able to prevent the distribution of such updates.
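
A filtering function of this kind does not need to understand the update at all to blunt the spam case. Below is a minimal sketch of a per-connection byte budget; `createUpdateBudget`, its option names and the limits are all invented for illustration, and the update is assumed to arrive as a `Uint8Array`.

```javascript
// Sketch: reject oversized updates and cap the bytes a connection may send per time window.
// All names and limits here are hypothetical.
function createUpdateBudget({ maxUpdateBytes, maxBytesPerWindow, windowMs }) {
  const windows = new Map(); // connectionId -> { start, bytes }
  return function accept(connectionId, update, now = Date.now()) {
    const size = update.byteLength;
    // A single update above the hard limit is rejected outright.
    if (size > maxUpdateBytes) return false;
    let w = windows.get(connectionId);
    // Start a fresh window for new connections or once the old window has expired.
    if (!w || now - w.start >= windowMs) {
      w = { start: now, bytes: 0 };
      windows.set(connectionId, w);
    }
    // Reject if this update would push the connection over its budget.
    if (w.bytes + size > maxBytesPerWindow) return false;
    w.bytes += size;
    return true;
  };
}
```

A server would call `accept(connectionId, update)` before applying or relaying an update and drop it (or close the connection) on `false`. This only bounds how fast a document can grow; it cannot tell hidden garbage from legitimate content.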

Yjs is still pretty barebones. It lacks validation, encryption, read-only permissions, schema migrations, lazy loading, graph support, etc.

I imagine validation should be performed on the server, since that is where poisoned updates can be spread to other users. It seems relatively straightforward to add a validation hook to the updateHandler. Not sure what that would be in Rust though.
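
As a rough illustration of such a hook, here is a sketch of a small validator chain that runs before an update is applied or relayed. Nothing in it is an existing Yjs or y-websocket API: `createUpdateMiddleware`, the validator signature and the `context` object are assumptions made up for this example.

```javascript
// Sketch: run a list of validators before an update is applied or rebroadcast.
// Each validator returns `true` to accept, or a string describing why it rejects.
// `applyUpdate` is whatever the server would normally do with an accepted update.
function createUpdateMiddleware(validators, applyUpdate) {
  return function handle(update, context) {
    for (const validate of validators) {
      const verdict = validate(update, context);
      // Anything other than an explicit `true` stops the chain.
      if (verdict !== true) return { applied: false, reason: String(verdict) };
    }
    applyUpdate(update, context);
    return { applied: true, reason: null };
  };
}
```

The server's update handler would then call `handle(update, { authenticated, connectionId })` instead of applying the update directly, so that a rejected update is never relayed to other peers.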

"Anyway, I’m surprised that this lack of validation is not an issue for more users. [...] could spam a ydoc with hidden data, e.g. [...] which would make all updates and future syncs larger, possibly escalating to a denial of service, if I understand correctly."

That’s also exactly my concern: a rogue client could just trash your Y.Doc and potentially crash your server (or make you poor if you’re on the cloud). Have you found a viable solution for this and the validation problem yet? @stefanw

I’m also currently thinking about using Zod for client- and server-side validation, but of course decoding and applying the full Y.Doc in memory every time a new update is received adds quite a bit to server runtime costs. If validation were a built-in feature of the protocol, getting the full state probably wouldn’t be necessary.

I also found this for decentralized authentication/permissions.

I’m not sure how interoperable it is with Yjs, though. Maybe @raine knows more?

Sorry, I’m not familiar with it.

I’m also very interested in this. Being able to use Yjs would have vastly simplified my project, but I ended up going with a completely custom approach because I couldn’t find a good solution to this problem. Enforcing a data structure (a thread I made on this earlier)