Globally unique client id

dmonad · December 14, 2020, 5:55pm

It is not about the encoding overhead of using strings. A variable-length integer is basically a special variable-length string. I chose numbers because they are natively comparable, which is important for CRDTs. String comparison is not as well defined (e.g. comparing by UTF-8 bytes can give a different order than the native string comparison on another platform). The other reason is that numbers (doubles) are primitive data types in JavaScript. Primitive values don’t need a reference and can be stored on the stack. If they are stored on an object (e.g. the ID object), then they are stored on the ID instance itself, not somewhere else on the heap. There are more arguments, but it boils down to performance and compatibility. I don’t see an advantage in using strings.
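To illustrate the comparison point, here is a minimal sketch (not Yjs code; all function names are hypothetical). It shows two strings that sort one way under JavaScript's native comparison, which works on UTF-16 code units, and the opposite way under code point order, which is the order you get when comparing UTF-8 bytes. Numbers have no such ambiguity.

```typescript
// Native JS string comparison: compares UTF-16 code units.
function compareByCodeUnit(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Decode a string into Unicode code points, joining surrogate pairs.
function toCodePoints(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    const hi = s.charCodeAt(i);
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
        i++; // skip the low surrogate we just consumed
        continue;
      }
    }
    out.push(hi);
  }
  return out;
}

// Code point order, which matches byte-wise comparison of UTF-8.
function compareByCodePoint(a: string, b: string): number {
  const x = toCodePoints(a);
  const y = toCodePoints(b);
  const n = Math.min(x.length, y.length);
  for (let i = 0; i < n; i++) {
    if (x[i] !== y[i]) return x[i] < y[i] ? -1 : 1;
  }
  return x.length === y.length ? 0 : x.length < y.length ? -1 : 1;
}

// Numbers compare the same way everywhere.
function compareNumbers(a: number, b: number): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

const fullwidthTilde = "\uFF5E"; // U+FF5E, one UTF-16 code unit
const emoji = "\uD83D\uDE00"; // U+1F600, encoded as a surrogate pair

console.log(compareByCodeUnit(fullwidthTilde, emoji)); // 1: U+FF5E sorts after
console.log(compareByCodePoint(fullwidthTilde, emoji)); // -1: U+FF5E sorts before
console.log(compareNumbers(5, 10)); // -1
```

Two peers that each use their platform's default string ordering could therefore disagree on which ID is "smaller", which is exactly what a CRDT cannot tolerate.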

You can find more information about Yjs internals on the documentation website: Internals | Yjs Docs

In the video, we briefly go through the encoding approach, but I didn’t really describe it in detail. It is rather complicated and probably deserves a dedicated article. I developed a dedicated library for encoding/decoding document updates that contains different RLE encoders and an opinionated approach to working with data (GitHub - dmonad/lib0: Monorepo of isomorphic utility functions - encoding.js/decoding.js).
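For readers who have not seen these techniques, here is a rough sketch of the two basic ideas mentioned above: variable-length unsigned integers and run-length encoding (RLE). This is a simplified illustration written for this post, not the code from lib0. The function names and the exact byte layout are assumptions, and the real encoders are more elaborate.

```typescript
// Append n as a variable-length unsigned integer: 7 payload bits per byte,
// high bit set on every byte except the last. Uses arithmetic instead of
// bitwise shifts so that values up to 2^53 - 1 survive intact.
function writeVarUint(out: number[], n: number): void {
  if (n < 0 || n > 9007199254740991 || Math.floor(n) !== n) {
    throw new RangeError("expected a non-negative safe integer");
  }
  while (n > 127) {
    out.push(128 | (n % 128));
    n = Math.floor(n / 128);
  }
  out.push(n);
}

// Read one variable-length unsigned integer starting at pos.i and advance pos.i.
function readVarUint(bytes: number[], pos: { i: number }): number {
  let n = 0;
  let mult = 1;
  while (pos.i < bytes.length) {
    const b = bytes[pos.i++];
    n += (b % 128) * mult;
    mult *= 128;
    if (b < 128) return n; // high bit clear: this was the last byte
  }
  throw new Error("unexpected end of input");
}

// Run-length encode: each run becomes (value, runLength), both as var-uints.
function rleEncode(values: number[]): number[] {
  const out: number[] = [];
  let i = 0;
  while (i < values.length) {
    let run = 1;
    while (i + run < values.length && values[i + run] === values[i]) run++;
    writeVarUint(out, values[i]);
    writeVarUint(out, run);
    i += run;
  }
  return out;
}

// Inverse of rleEncode.
function rleDecode(bytes: number[]): number[] {
  const out: number[] = [];
  const pos = { i: 0 };
  while (pos.i < bytes.length) {
    const value = readVarUint(bytes, pos);
    const run = readVarUint(bytes, pos);
    for (let k = 0; k < run; k++) out.push(value);
  }
  return out;
}

console.log(rleEncode([7, 7, 7, 300])); // [7, 3, 172, 2, 1]
console.log(rleDecode([7, 3, 172, 2, 1])); // [7, 7, 7, 300]
```

RLE pays off when the same value repeats many times in a row, for example when most operations in an update come from the same client.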

Yeah, in any case, we need to ensure this, even in Yjs. 53 bits should be more than enough, especially with a relatively low number of collaborators.
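As a back-of-the-envelope check on "more than enough", here is a small sketch using the birthday approximation. It assumes that client ids are drawn uniformly at random from the full range, which is my assumption for this illustration. The function name is hypothetical.

```typescript
// Largest integer a JavaScript number (double) represents exactly: 2^53 - 1.
const MAX_SAFE = 9007199254740991;

// Approximate probability that at least two of `clients` random ids collide
// when ids are drawn uniformly from a space of 2^bits values.
// Birthday approximation: p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)).
function collisionProbability(clients: number, bits: number): number {
  const space = Math.pow(2, bits);
  const x = (clients * (clients - 1)) / (2 * space);
  // For tiny x, 1 - exp(-x) loses precision; x itself is the better estimate.
  return x < 1e-8 ? x : 1 - Math.exp(-x);
}

console.log(collisionProbability(1000, 53)); // roughly 5.5e-11
console.log(collisionProbability(1000, 32)); // roughly 1.2e-4
```

Under that assumption, even a thousand clients on one document have a negligible chance of colliding in a 53-bit space, while a 32-bit space is noticeably tighter.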