A basic concept is that ALL the history of the Doc is stored, and that this is necessary for the CRDT to do its job. It’s a shift from thinking of data synchronically, as a single snapshot of the current state (JSON).
I would also say to start thinking about memory usage early. Yjs was built as a general-purpose realtime CRDT framework, but its case study has always been a single word processing doc. Many real-time applications have different volumes of data and lifespans than word processing docs. Efficient binary encoding only gets you so far when you’re working with hundreds of thousands or millions of objects. Load time is the main thing that suffers, as you have mentioned elsewhere. Splitting into multiple Docs decreases load time, but at the cost of atomicity.
Subdocs are half-baked, so managing any nontrivial number of Docs requires a robust, custom solution that can load/destroy/update them appropriately.
Patterns for handling offline and multiplayer: if a peer goes offline for a long time and then comes back in sync, how do you get fine-grained control as an application developer? Maybe you want some intermittent offline support but don’t want your users to be able to go offline for weeks.
Handling versions or backups in a database: persisting a Y.Doc as a binary blob vs. storing updates in an append-only log, and the pros and cons of each approach.
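To make the offline question above a bit more concrete, here is a minimal sketch of one possible application-level policy: the server remembers when each client last synced, and a client that has been away longer than some window is told to reload from the server instead of merging its local edits. The names `OfflinePolicy`, `decideSync`, and `maxOfflineMs` are made up for illustration. None of this is a Yjs API, and Yjs itself will merge arbitrarily old updates if you let it.

```typescript
// Hypothetical application-level policy; not part of Yjs.
type SyncDecision = "merge" | "reset";

interface OfflinePolicy {
  // Longest time a client may stay offline and still have its local edits merged.
  maxOfflineMs: number;
}

// Decide what to do with a reconnecting client's pending local updates.
function decideSync(
  policy: OfflinePolicy,
  lastSyncedAt: number, // epoch ms of the client's last successful sync
  now: number           // epoch ms at reconnect time
): SyncDecision {
  const offlineFor = now - lastSyncedAt;
  return offlineFor <= policy.maxOfflineMs ? "merge" : "reset";
}

const DAY_MS = 24 * 60 * 60 * 1000;
const policy: OfflinePolicy = { maxOfflineMs: 3 * DAY_MS };
const now = Date.parse("2024-01-31T00:00:00Z");

// A client that was away for one day is inside the three-day window.
const shortTrip = decideSync(policy, now - 1 * DAY_MS, now);
// A client that was away for three weeks is outside it.
const longTrip = decideSync(policy, now - 21 * DAY_MS, now);
```

What "reset" means is a product decision: the client could throw away its local Doc and load the server state, or offer the user an export of the unsynced edits first.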
I was super excited when I saw that tutorials for the first two existed on docs.yjs.dev, then super disappointed when I saw that they were just left as “TODO.”
It’s gonna take me a bit to hash out ideas and decide what the focus will be for the first video. In the meantime, I can answer this question: it really depends on your app. Realistically, for LegendKeeper, live collaboration is relatively rare, but customers like that they have the option. As far as scale, I’m honestly not sure. Since everything in LK is a separate Y.Doc, it’s rare for any one of them to have more than 3-4 people on it at a time. We also use the differential updates API for a stateless approach, rather than holding any Y.Docs in memory on the server, so I’d expect it would scale pretty well.
Just from a product design perspective: Ironically, an activity feed that shows evidence of asynchronous activity is probably far more important when it comes to user engagement. At least, that’s how it’s been for us.
I can echo some of the other thoughts raised here.
For example, what’s it like to deal with growing sizes of data structures over time, and how much do they grow in practice for your Yjs structure and user patterns? Unexpected side effects such as load time would be very interesting to hear about; I hadn’t considered that until now. I would also love to hear about “Storing yjs db in sql on the server side (including the behaviour in a real time system, like update an yjs db and sync into sql every x minute or so)” as bgervan mentioned above.
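As a rough illustration of the "sync into SQL every x minutes" idea, here is a sketch that buffers incoming updates in memory, writes them as one log row per flush, and folds the log into a single snapshot blob once it gets long. Everything here is hypothetical (`UpdateLog`, `MergeFn`, `compactAfter`), and plain arrays stand in for database tables. The merge function is injected so the block has no dependencies; with Yjs you would presumably pass `Y.mergeUpdates` and apply each loaded chunk with `Y.applyUpdate`. The `concat` helper below is only a stand-in for the demo and is not a valid way to merge real Yjs updates.

```typescript
// Hypothetical storage sketch; arrays stand in for SQL tables.
type MergeFn = (updates: Uint8Array[]) => Uint8Array;

class UpdateLog {
  private merge: MergeFn;
  private compactAfter: number;
  private snapshot: Uint8Array | null = null; // single compacted blob
  private log: Uint8Array[] = [];             // append-only rows since last compaction
  private pending: Uint8Array[] = [];         // in-memory buffer, not yet persisted

  constructor(merge: MergeFn, compactAfter: number) {
    this.merge = merge;
    this.compactAfter = compactAfter;
  }

  // Called for every incoming update; no I/O happens here.
  append(update: Uint8Array): void {
    this.pending.push(update);
  }

  // Called on a timer: persist the buffered updates as one log row.
  flush(): void {
    if (this.pending.length === 0) return;
    this.log.push(this.merge(this.pending));
    this.pending = [];
    if (this.log.length >= this.compactAfter) this.compact();
  }

  // Fold the snapshot and all log rows into a new snapshot, then clear the log.
  private compact(): void {
    const parts = this.snapshot ? [this.snapshot, ...this.log] : this.log;
    this.snapshot = this.merge(parts);
    this.log = [];
  }

  // Everything a server needs to rebuild the document: snapshot first, then log rows.
  load(): Uint8Array[] {
    return this.snapshot ? [this.snapshot, ...this.log] : [...this.log];
  }

  get rowCount(): number {
    return this.log.length;
  }
}

// Stand-in merge for the demo: byte concatenation (NOT a real Yjs merge).
const concat: MergeFn = (updates) => {
  const out = new Uint8Array(updates.reduce((n, u) => n + u.length, 0));
  let offset = 0;
  for (const u of updates) {
    out.set(u, offset);
    offset += u.length;
  }
  return out;
};

const store = new UpdateLog(concat, 3);
store.append(Uint8Array.of(1));
store.append(Uint8Array.of(2));
store.flush(); // first log row, holding both buffered updates
store.append(Uint8Array.of(3));
store.flush(); // second log row
const rowsBeforeCompaction = store.rowCount;
store.append(Uint8Array.of(4));
store.flush(); // third log row reaches compactAfter and triggers compaction
const rowsAfterCompaction = store.rowCount;
const loaded = store.load();
```

The trade-off this tries to show: the log keeps writes cheap and gives you a history to roll back through, while the snapshot keeps load time bounded. Anything still in the buffer is lost if the process dies between flushes, so the flush interval is effectively your durability window.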
Great initiative @braden. I just entered the Y.js realm and I feel the lack of some basic best practice guides (maybe I just don’t know where to browse).
I actually composed an entire post that’s mostly about asking for Common Concepts & Best Practices.
(additionally, scalable structures and deep object updates in an efficient way would also be interesting).
I see a lack of examples with y-webrtc and would love to explore more on that. It would also be great if there were a tutorial on how to bind drawing whiteboards with providers, like the examples that exist for code editors.
As a newcomer to Y.js - now with some experience after integrating Y.js into my open-source “post it board” called OurBoard, here’s my 5c.
I’ve been a bit confused by how Y.js documentation is organized. Like, there’s the https://yjs.dev/ site, and there’s https://docs.yjs.dev/ which I’ve found the most useful documentation site. But they don’t seem linked and also some information is incorrect. To me it seems that the best thing would be to get the information up to date and correct, and make it clearer where to look for information. Also, if it was clear how to contribute, I would be willing to create some Pull Requests for documentation improvements.
I’m sorry this is a bit off topic for the tutorial content questions, but anyway, I think it would be great to have the tutorials nicely linked or even incorporated in the main documentation site.
Anyways, here are a couple things that I had to find out the Hard Way - and good tutorials might have helped.
Only top-level objects are truly conflict-free, in the sense that if user A and user B create a nested Y.Map at the same key inside a top-level map, these two operations are actually in conflict (while if both use top-level getMap, their maps will converge).
Adding a nested Y.Map or Y.Array, then moving it to a different position in the “tree”, will fail, because nested structures cannot be re-attached (or I have understood something wrong).
The server component of y-websocket is really a starting point rather than a production-grade server, and it is not TypeScript (which personally I would prefer for everything JS). Maybe hocuspocus is the answer (?) and if so, it could even be recommended instead of y-websocket. Haven’t tried it myself but it looks nice on the surface. I also refactored the y-websocket server into an (opinionated) better-structured TypeScript version that I might publish in case that makes sense.
Generally though I’ve found Y.js documentation very useful and the quality quite good.
Sorry, yjs.dev is quite outdated. Someone actually designed a new awesome website for yjs.dev, which will link to docs.yjs.dev for documentation. I’ll work on finally publishing it.
There are many backends to Yjs now. I try to list them in the readme without giving preferences. But I see your point that it is confusing for a newcomer.
@braden I’m really looking forward to this. Let me know if you want feedback!