Understanding memory requirements for production usage

braden · August 28, 2020, 8:46pm

Hello! I’ve been following YJS development for a few months now and am super impressed with this tech. I run a collaborative platform for writers and RPG gamemasters (legendkeeper.com). I think I have a decent idea of how I’d like to implement YJS for our prosemirror-based editor, backed up by postgres, but I’d like to further my understanding.

My biggest remaining question is about memory requirements on the server side when using y-websocket. In this post, you talk about a large YDoc consuming 40 MB of memory. Is the full document always required to be in memory when manipulating YDocs, or does that depend on the persistence method? Or is it only fully in memory during moments of mutation? Still trying to wrap my head around the mental model of it all. I think YJS fits well into our tech stack; I'm just unsure whether it fits within our budget. We have 500 users online at a time and are growing quite a bit, and these users like to open multiple documents simultaneously.

dmonad · September 2, 2020, 2:17pm

Hi @braden,

The main advantage of using Yjs is that you can scale indefinitely using a simple pubsub server to propagate document updates.
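
To make the pubsub idea concrete, here is a minimal in-memory sketch of a relay that forwards update bytes without ever decoding them. `UpdateRelay`, `Subscriber`, and the room naming are hypothetical names for illustration and are not part of y-websocket. It assumes clients apply the received bytes themselves, which works because Yjs updates can be applied in any order and more than once.

```typescript
// Hypothetical relay for illustration only; not the y-websocket API.
type Subscriber = (update: Uint8Array) => void;

class UpdateRelay {
  // Room name -> subscribers currently connected to that document.
  private rooms = new Map<string, Set<Subscriber>>();

  // Register a subscriber and return a function that removes it again.
  subscribe(room: string, subscriber: Subscriber): () => void {
    let subscribers = this.rooms.get(room);
    if (subscribers === undefined) {
      subscribers = new Set<Subscriber>();
      this.rooms.set(room, subscribers);
    }
    subscribers.add(subscriber);
    const joined = subscribers;
    return () => {
      joined.delete(subscriber);
      // Drop empty rooms so idle documents hold no server state.
      if (joined.size === 0) this.rooms.delete(room);
    };
  }

  // Forward the opaque update bytes to everyone in the room except the sender.
  publish(room: string, update: Uint8Array, sender?: Subscriber): number {
    let delivered = 0;
    const subscribers = this.rooms.get(room);
    if (subscribers === undefined) return delivered;
    subscribers.forEach((subscriber) => {
      if (subscriber === sender) return;
      subscriber(update);
      delivered++;
    });
    return delivered;
  }

  roomCount(): number {
    return this.rooms.size;
  }
}
```

The relay's memory use grows with the number of connections rather than with document size, since it never holds document content. A real deployment would still need persistence so that late joiners can fetch the current state.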

Representing a conference paper (see this post) takes about 2 MB of memory (260k edits), although there currently seems to be a bug in y-websocket that blows up memory usage.
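
For budgeting, a back-of-the-envelope estimate can be sketched from the numbers in this thread. Only the 500 concurrent users and the roughly 2 MB per document come from the posts above; `docsPerUser` and `sharingFactor` are hypothetical inputs that should be replaced with measurements, and the estimate ignores the memory bug as well as per-connection overhead.

```typescript
// Rough estimate only; every input is an assumption to replace with measured values.
interface LoadEstimate {
  concurrentUsers: number; // 500 in the question above
  docsPerUser: number;     // hypothetical: "multiple documents" per user
  mbPerDoc: number;        // about 2 MB quoted for a 260k-edit paper
  sharingFactor: number;   // hypothetical: average number of users viewing the same document
}

// Memory needed if every open document is fully loaded on the server.
function estimateServerMemoryMb(load: LoadEstimate): number {
  const openDocs = (load.concurrentUsers * load.docsPerUser) / load.sharingFactor;
  return openDocs * load.mbPerDoc;
}

// Worst case: nobody shares a document, so every open tab is a distinct document.
const worstCaseMb = estimateServerMemoryMb({
  concurrentUsers: 500,
  docsPerUser: 3,
  mbPerDoc: 2,
  sharingFactor: 1,
});

// Same load, but on average five users collaborate on each document.
const sharedMb = estimateServerMemoryMb({
  concurrentUsers: 500,
  docsPerUser: 3,
  mbPerDoc: 2,
  sharingFactor: 5,
});

console.log(worstCaseMb, sharedMb); // 3000 600
```

The estimate treats every document as being as large as a 260k-edit paper, which is probably pessimistic for typical documents.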

I plan to rework y-websocket server not to load the document to memory at all. I explained my plans in the post you mentioned. Eventually, I’d like to be able to handle 100k connections on a single websocket server.

braden · September 3, 2020, 8:51pm

Thanks for the reply! That clarifies things for me.

CoCreator-Frank · September 19, 2020, 9:51pm

Hey @dmonad, how is this rework going? Has this bug been fixed? We seem to be affected by it as well.

Thank you

dmonad · September 30, 2020, 1:47pm

Hi @CoCreator-Frank,

As I explained in the ticket you mentioned, I’m hoping for someone to come in and provide a fix. I really can’t handle this project and my day job on my own. A couple of people have offered their support; let’s see if they can fix it first.

ellisonbg · October 7, 2020, 2:28am

The memory leak in y-websocket is fixed.

tommoor · October 8, 2020, 4:45pm

Would love to hear more about how this would work – might be able to contribute here.

dmonad · October 12, 2020, 9:50pm

@tommoor I plan to keep the document in the compressed binary format in the persistence plugin (I already developed an interface for it).

Part 1: I want to compute the sync steps directly on the Yjs binary format. This is currently only simulated in y-leveldb by temporarily loading the Yjs document. The API can be considered stable though.
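
As a rough illustration of what computing a sync step on stored data means, here is a toy model. It is not the Yjs binary format: real updates are compact binary structures, while this sketch uses plain arrays of hypothetical `Block` records, each covering a contiguous range of one client's edits. The idea it shows is that the server derives a state vector and the missing ranges from the stored blocks alone, without building a document.

```typescript
// Toy model only: Block and StateVector are hypothetical stand-ins,
// not the real Yjs encoding.
interface Block {
  client: number; // id of the client that produced the edits
  clock: number;  // first clock value covered by this block
  length: number; // number of consecutive edits in the block
}

// Client id -> next clock value that peer expects from this client.
type StateVector = Map<number, number>;

// Derive the state vector from stored blocks without instantiating a document.
function stateVectorOf(blocks: Block[]): StateVector {
  const sv: StateVector = new Map<number, number>();
  blocks.forEach((b) => {
    const end = b.clock + b.length;
    if (end > (sv.get(b.client) ?? 0)) sv.set(b.client, end);
  });
  return sv;
}

// Sync step: keep only what the remote peer is missing,
// trimming blocks the peer already knows in part.
function diffBlocks(blocks: Block[], remote: StateVector): Block[] {
  const missing: Block[] = [];
  blocks.forEach((b) => {
    const known = remote.get(b.client) ?? 0;
    const end = b.clock + b.length;
    if (end <= known) return; // the peer already has the whole block
    const start = Math.max(b.clock, known);
    missing.push({ client: b.client, clock: start, length: end - start });
  });
  return missing;
}

const stored: Block[] = [
  { client: 1, clock: 0, length: 5 },
  { client: 1, clock: 5, length: 3 },
  { client: 2, clock: 0, length: 4 },
];

// A peer that has seen client 1 up to clock 6 and nothing from client 2.
const peerState: StateVector = new Map([[1, 6]]);
const toSend = diffBlocks(stored, peerState);
// toSend: [{ client: 1, clock: 6, length: 2 }, { client: 2, clock: 0, length: 4 }]
```

In this model the work is proportional to the number of stored blocks, and nothing resembling an editable document has to stay in memory between requests.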

Part 2: write a new y-websocket server to use that new API. I’d love to just have an alternative server implementation that uses the new persistence API. I will probably keep the old server around because some people depend on the document content being loaded to memory to periodically send the document content to a database for indexing.

The server implementation is currently quite a mess. It grew over the years. It is not adaptable and doesn’t provide a nice solution for handling authentication… I’d like to have a minimalistic server that others can adapt. On the other hand, I know that a lot of people would prefer a more opinionated server that handles everything for them. This is open to discussion.

Would you like to work on part 2?