Globally unique client id

dmonad · December 16, 2020, 7:26pm

To be honest, I don’t understand the argument that you can’t use numbers because you need to cryptographically sign … document updates?

With BigInts, you’ll have arbitrary precision for numbers. So you could even use a precision of 1024 or 4096 bits. I guess you want your client-id to be some kind of certificate, not just a GUID. If you convert your certificate to a Uint8Array, you can use that as a source for the BigInt client-id.
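
To make the Uint8Array-to-BigInt idea concrete, here is a minimal sketch. The function name `bytesToBigInt` and the three sample bytes are made up for illustration (a real certificate would be much longer), and the sketch only shows the conversion itself, not how the resulting value would be handed to Yjs.

```javascript
// Interpret a byte array as one big-endian unsigned integer.
// `bytesToBigInt` is a hypothetical helper, not part of Yjs.
function bytesToBigInt(bytes) {
  let value = 0n;
  for (const byte of bytes) {
    // Shift the accumulated value left by one byte and append the next byte.
    value = (value << 8n) | BigInt(byte);
  }
  return value;
}

// Stand-in for certificate bytes (assumption: any Uint8Array works the same way).
const certificateBytes = new Uint8Array([0x01, 0x00, 0xff]);
const bigClientId = bytesToBigInt(certificateBytes);
console.log(bigClientId.toString(16)); // "100ff"
```

The value grows with the input, so a 512-byte certificate yields a 4096-bit BigInt without any loss of precision.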

Seph maps each string to a number to improve hashing and referencing. He built this concept as a convenient way to reference client-ids (which are strings in his case). In Rust you don’t want to manage pointers, so you often come up with tricks like this. In Yjs I would use references to locations in memory (with BigInt, a client-id would be a reference/pointer to a location in memory). In Yrs, I would do something similar to what Seph does.
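
The string-to-number mapping can be sketched roughly like this. It is an illustration of the general interning idea only, not Seph’s actual code; the class name `ClientInterner` and its methods are hypothetical.

```javascript
// Assign each distinct string id a small integer index the first time it is seen.
// Internal structures can then store and compare the integers instead of the strings.
class ClientInterner {
  constructor() {
    this.indexByName = new Map(); // string id -> integer index
    this.names = [];              // integer index -> string id
  }

  // Return the index for `name`, allocating a new one if needed.
  intern(name) {
    let index = this.indexByName.get(name);
    if (index === undefined) {
      index = this.names.length;
      this.names.push(name);
      this.indexByName.set(name, index);
    }
    return index;
  }

  // Recover the original string for an index.
  nameOf(index) {
    return this.names[index];
  }
}

const interner = new ClientInterner();
const a = interner.intern("alice-session");
const b = interner.intern("bob-session");
console.log(a, b, interner.intern("alice-session")); // 0 1 0
```

The string is hashed once on the way in; everything after that works with the compact integer.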

I don’t know about your application. But I don’t think that you ever want to map user identifiers (which are strings/certificates in your case) to client-ids. A couple of reasons:

A user may connect to the same document several times. A user also sometimes makes concurrent changes (e.g. if they point two browsers to the same document and then make changes).
Although I call them client-ids, they are really session identifiers. They are not designed to be used for anything else.
If you want to preserve the editing history and associate it with a user, you should use PermanentUserData instead (it helps you to map users to specific edits on the document).
You should not reuse client-ids. You should start with a fresh client-id for every session. Reusing client-ids is error-prone and may lead to unintended side-effects if not done correctly: https://docs.yjs.dev/api/faq
Deletes are not associated with client-ids (which may be counterintuitive for you).
The only minuscule advantage of reusing client-ids is that you might slightly decrease the size of the Yjs document. But by using strings as client-ids, you will actually increase the size of the document.
By using strings instead of numbers, you will decrease performance quite heavily. Lookups should be reduced as much as possible. Especially in Yjs, the process of looking up two adjacent structs by number is ~90% of the work when applying document updates. In these scenarios, memory-locality is very important.
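
To illustrate why numeric keys matter for the lookup in the last point, here is a simplified sketch of finding a struct by client-id and clock. This is not the actual Yjs implementation; the names `structsByClient`, `findStructIndex` and `lookup`, and the `{ clock, length }` shape, are assumptions chosen for the example.

```javascript
// Each struct covers the clock range [clock, clock + length).
// Structs of one client are kept in an array sorted by clock,
// so a binary search finds the struct containing a given clock.
function findStructIndex(structs, clock) {
  let left = 0;
  let right = structs.length - 1;
  while (left <= right) {
    const mid = (left + right) >>> 1;
    const struct = structs[mid];
    if (clock < struct.clock) {
      right = mid - 1;
    } else if (clock >= struct.clock + struct.length) {
      left = mid + 1;
    } else {
      return mid;
    }
  }
  return -1; // no struct covers this clock
}

// Numeric client-id -> sorted struct array (sample data, made up).
const structsByClient = new Map();
structsByClient.set(1234, [
  { clock: 0, length: 3 },
  { clock: 3, length: 1 },
  { clock: 4, length: 5 },
]);

function lookup(clientId, clock) {
  const structs = structsByClient.get(clientId);
  if (structs === undefined) return undefined;
  const index = findStructIndex(structs, clock);
  return index === -1 ? undefined : structs[index];
}

console.log(lookup(1234, 3)); // { clock: 3, length: 1 }
```

Every applied update repeats this kind of lookup many times, so a key that hashes and compares as cheaply as a number keeps the hot path short; a long string key would add hashing and comparison cost to each call.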

My recommendation is just to ignore the concept of client-ids in Yjs - it should be kept internal to Yjs. If you still want to associate users with client-ids, you can map from each certificate to the Set of all client-ids it owns, which supports the use-case of a single user having several concurrent Yjs sessions.
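
A minimal sketch of that certificate-to-Set mapping, kept outside of Yjs. The names `clientIdsByCertificate`, `registerSession` and `ownerOf`, as well as the sample certificate strings and ids, are hypothetical; in practice the client-id would come from each freshly created session.

```javascript
// certificate (as a string key) -> Set of client-ids owned by that user
const clientIdsByCertificate = new Map();

// Record that a new session (client-id) belongs to a certificate.
function registerSession(certificate, clientId) {
  let owned = clientIdsByCertificate.get(certificate);
  if (owned === undefined) {
    owned = new Set();
    clientIdsByCertificate.set(certificate, owned);
  }
  owned.add(clientId);
}

// Find which certificate owns a client-id, if any.
function ownerOf(clientId) {
  for (const [certificate, owned] of clientIdsByCertificate) {
    if (owned.has(clientId)) return certificate;
  }
  return undefined;
}

registerSession("cert-alice", 1001); // first browser tab
registerSession("cert-alice", 2002); // second tab, fresh client-id
registerSession("cert-bob", 3003);
console.log(ownerOf(2002)); // "cert-alice"
```

Each session keeps its own fresh client-id, and the user identity lives only in this side table.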