Using LeveldbPersistence for a replicated cache

edwardrem · March 2, 2023, 6:08pm

Hi. Great project!

I plan on having many clients requiring a shared cache (just a key/value). Each client will have its own cache and can write and remove key/value entries from their cache. This cache should be replicated to all other clients. I will have them all talk to a WebsocketProvider (maybe peer to peer in the future).

My question is: Is using LeveldbPersistence for the replicated cache a good idea?

I’m thinking of something like the following:

const Y = require('yjs');
const { WebsocketProvider } = require('y-websocket');
const { LeveldbPersistence } = require('y-leveldb');

const persistence = new LeveldbPersistence('./mydb');

const myDoc = new Y.Doc();

// Sync myDoc with the other clients through the websocket server
const wsProvider = new WebsocketProvider('ws://localhost:1234', 'myCacheRoom', myDoc, { WebSocketPolyfill: require('ws') });

// Persist every document update to LevelDB
myDoc.on('update', (update) => {
  persistence.storeUpdate('myDoc', update);
});

// One Y.Map per cache entry
const entry = myDoc.getMap('myCacheKey1');
entry.set('value', { foo: 123 });
entry.set('expiration', 1678644313);

Will the myDoc instance hold the entire cache in memory? I would not want that, as the cache will be massive.
Does this approach make sense?

Thanks!

raine · March 3, 2023, 4:08pm

Yes, the entire Doc would be loaded into memory. You could split it up into one subdocument per client, but the entire cache for that client would still need to be loaded into memory to read any value from it.

I’m not sure Yjs is the right choice here. Do you even need a CRDT? If each cache is only writeable by a single client, you don’t need the collaborative aspect, which has significant overhead. In my experience, Yjs works well with human-sized documents, and not as well with large hash tables.

You may be better off with a traditional key-value store with separate auth for each user for their part of the cache.
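
To make that suggestion concrete, here is a minimal in-memory sketch of a key-value cache partitioned by client, with the expiration timestamp from the original question. The class name NamespacedCache and its methods are hypothetical, and clientId simply stands in for an already authenticated identity; real auth, persistence, and replication are left out.

```javascript
// Hypothetical sketch: one namespace per client, entries expire lazily on read.
class NamespacedCache {
  constructor() {
    this.namespaces = new Map(); // clientId -> Map(key -> { value, expiration })
  }

  // expiration is a Unix timestamp in seconds, as in the question's example
  set(clientId, key, value, expiration) {
    if (!this.namespaces.has(clientId)) {
      this.namespaces.set(clientId, new Map());
    }
    this.namespaces.get(clientId).set(key, { value, expiration });
  }

  // nowSeconds can be passed explicitly, which keeps the behaviour testable
  get(clientId, key, nowSeconds = Math.floor(Date.now() / 1000)) {
    const namespace = this.namespaces.get(clientId);
    const entry = namespace ? namespace.get(key) : undefined;
    if (entry === undefined) return undefined;
    if (entry.expiration <= nowSeconds) {
      namespace.delete(key); // drop the expired entry
      return undefined;
    }
    return entry.value;
  }

  delete(clientId, key) {
    const namespace = this.namespaces.get(clientId);
    return namespace ? namespace.delete(key) : false;
  }
}

const cache = new NamespacedCache();
cache.set('client-a', 'myCacheKey1', { foo: 123 }, 1678644313);
console.log(cache.get('client-a', 'myCacheKey1', 1678644312)); // one second before expiry
console.log(cache.get('client-a', 'myCacheKey1', 1678644313)); // at expiry
```

Because each client writes only to its own namespace, there are no concurrent writes to the same key to merge, which is why a CRDT may be more than this use case needs.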

edwardrem · March 3, 2023, 4:22pm

Thanks for your response. It makes sense.

I really just like the fact that Yjs takes care of all the syncing seamlessly. Plus, when a client disconnects and reconnects, it “gets caught up” automatically. I wish I could find something similar (and lightweight) to keep a key-value db synchronized in the same manner. Anything you know of?

raine · March 3, 2023, 5:53pm

Yeah, the syncing is really easy in Yjs.

Do you need offline-first functionality? Also, how much data are you expecting?

edwardrem · March 6, 2023, 2:28pm

Hi Raine,

I do need offline-first functionality. The amount of data could be tens or hundreds of thousands of records. I have found a library that fits the bill very well. It’s called Hypercore (see https://holepunch.to).

raine · March 6, 2023, 3:03pm

Hundreds of thousands of records can feasibly be handled in-memory with Yjs. Millions requires some kind of lazy memory management (such as with Subdocuments).

I’m glad you found the right tool for the job. Hypercore looks interesting. Reminds me of OrbitDB.