Using y.js for distributed storage

Hi,
I’m a newbie at y.js, but looking at the leveldb thread, I understand that it is possible to distribute and save data in LevelDB without necessarily materializing the doc (and consequently having a constant memory footprint). I was wondering if it is possible to apply this concept to distributed redundant storage: many nodes, each with its own leveldb instance and a constant memory footprint, eventually converging to the same data in all the dbs.
Would that be feasible? Or am I just misunderstanding how it works? If it is feasible, what data would be in the databases? A list of deltas, or the latest snapshot of the doc?

Hi @janesconference,

Sure, that would be possible. I completed the groundwork for this feature: https://github.com/yjs/yjs/pull/274

y-websocket cannot yet sync without loading the Yjs state into memory. This is what I’m working on next.

The y-leveldb database contains a list of small incremental updates. When a client syncs with y-leveldb, all contained updates are merged into a single document state (using the new Y.mergeUpdates function). Then we sync with the client using Y.diffUpdate(mergedUpdate, stateVector), where stateVector is the state vector sent by the client.
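
To make that flow concrete, here is a minimal sketch. It assumes a plain in-memory array (`storedUpdates`, a made-up name) stands in for the rows kept in y-leveldb, and it uses a throwaway Y.Doc only to produce some updates. The sync step itself never builds a Y.Doc on the storage side.

```typescript
import * as Y from 'yjs'

// Hypothetical stand-in for the y-leveldb rows: a list of incremental updates.
const storedUpdates: Uint8Array[] = []

// A throwaway document, used only to generate two incremental updates.
const producer = new Y.Doc()
producer.on('update', (update: Uint8Array) => storedUpdates.push(update))
const producerText = producer.getText('content')
producerText.insert(0, 'Hello')
producerText.insert(5, ' world')

// A client that has only received the first update so far.
const client = new Y.Doc()
Y.applyUpdate(client, storedUpdates[0])

// Storage side: merge the stored updates without loading them into a Y.Doc.
const mergedUpdate = Y.mergeUpdates(storedUpdates)

// The client sends its state vector; the storage side answers with only
// the part of the merged update that the client is missing.
const clientStateVector = Y.encodeStateVector(client)
const diff = Y.diffUpdate(mergedUpdate, clientStateVector)

// Client side: apply the diff to catch up.
Y.applyUpdate(client, diff)
const syncedText = client.getText('content').toString()
console.log(syncedText)
```

In a real setup the state vector and the diff would travel over the network, and the updates would be read from the database rather than from an array.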

When the database contains too many updates (~100-1000 entries), we simply merge all contained updates into a single entry. This reduces overhead when querying the database. The same approach is used by y-indexeddb.
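
The compaction step can be sketched in the same way. This is only an illustration: `UpdateLog` and `COMPACTION_THRESHOLD` are hypothetical names, the log lives in memory instead of in LevelDB, and the threshold of 100 is simply the lower end of the range mentioned above. It is not how y-leveldb is actually implemented.

```typescript
import * as Y from 'yjs'

// Hypothetical threshold; the post mentions roughly 100-1000 entries.
const COMPACTION_THRESHOLD = 100

// Hypothetical in-memory update log standing in for the database.
class UpdateLog {
  private entries: Uint8Array[] = []

  // Store one incremental update; compact once the log grows too long.
  append (update: Uint8Array): void {
    this.entries.push(update)
    if (this.entries.length >= COMPACTION_THRESHOLD) {
      // Replace all entries with a single merged entry (no Y.Doc needed).
      this.entries = [Y.mergeUpdates(this.entries)]
    }
  }

  get size (): number {
    return this.entries.length
  }

  // Return the whole stored state as one update.
  read (): Uint8Array {
    return Y.mergeUpdates(this.entries)
  }
}

// Generate 250 small updates and store them in the log.
const doc = new Y.Doc()
const log = new UpdateLog()
doc.on('update', (update: Uint8Array) => log.append(update))
const numbers = doc.getArray<number>('numbers')
for (let i = 0; i < 250; i++) {
  numbers.push([i])
}

// A fresh document restored from the compacted log holds the same data.
const restored = new Y.Doc()
Y.applyUpdate(restored, log.read())
const restoredLength = restored.getArray<number>('numbers').length
console.log(log.size, restoredLength)
```

Because merged updates can be merged again with later ones, compaction can run at any time without changing what a syncing client ends up with.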
