How to sync thousands of documents and have a local persistent store?

Update: some successful code below!
I implemented this intermediate MultiDocProvider class to help sync multiple docs. The class assumes the underlying store provides simple append/overwrite/read functions. I think such a “middleware” approach may be useful for all the existing y-* providers, to avoid duplicating code for doc/subdoc update tracking and [in the future] other common code such as debouncing. Here’s the TypeScript code; anyone feel free to use it under the MIT, Apache, or CC0 license:

import * as Y from 'yjs'

export interface UpdateStore {
    append(docName: string, arr: Uint8Array): void;
    overwrite(docName: string, arr: Uint8Array): void;
    read(docName: string): Promise<Uint8Array[]>;
}

export class MultiDocProvider {
    private store: UpdateStore;
    private trimOpsCount: number;
    constructor(store: UpdateStore, trimOpsCount?: number) {
        this.store = store;
        this.trimOpsCount = trimOpsCount ?? 500;
    }

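    public trackDoc(docName: string, doc: Y.Doc): void {
        // Replay stored updates, tagging them with this provider as the origin
        // so that onUpdate below does not write them straight back to the store.
        this.store.read(docName).then((updates: Uint8Array[]) => {
            updates.forEach(update => Y.applyUpdate(doc, update, this));
        });

        let docUpdateCount: number = 0;
        const onUpdate = (update: Uint8Array, origin: any, doc: Y.Doc) => {
            if (origin === this) {
                return; // came from the store, already persisted
            }
            docUpdateCount++;
            if (docUpdateCount > this.trimOpsCount) {
                // Compact: replace the accumulated updates with one full-state update.
                const fullUpdate: Uint8Array = Y.encodeStateAsUpdate(doc);
                this.store.overwrite(docName, fullUpdate);
                docUpdateCount = 0;
            }
            else {
                this.store.append(docName, update);
            }
        };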

        const onSubdocs = this.onSubdocs.bind(this);
        const onDestroy = (doc: Y.Doc): void => {
            doc.off('update', onUpdate);
            doc.off('subdocs', onSubdocs);
            doc.off('destroy', onDestroy);
        };

        doc.on('update', onUpdate);
        doc.on('subdocs', onSubdocs);
        doc.on('destroy', onDestroy);
    }

    private onSubdocs(docs: { added: Set<Y.Doc>, removed: Set<Y.Doc>, loaded: Set<Y.Doc> }): void {
        docs.loaded.forEach((subDoc: Y.Doc) => {
            this.trackDoc(subDoc.guid, subDoc);
        });
    }
}

Here’s a very simple in-memory store, useful for testing:

class MemoryStore implements UpdateStore {
    private store: { [docName: string]: Uint8Array[] } = {};
    public append(docName: string, arr: Uint8Array): void {
        const data: Uint8Array[] = this.store[docName];
        if (data) {
            data.push(arr);
        }
        else {
            this.store[docName] = [arr];
        }
    }

    public overwrite(docName: string, arr: Uint8Array): void {
        this.store[docName] = [arr];
    }

    public read(docName: string): Promise<Uint8Array[]> {
        return Promise.resolve(this.store[docName] ?? []);
    }
}

I’m planning to link this multi-doc provider with different stores: either the filesystem directly or LevelDB.
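
For the filesystem case, a store might look roughly like the sketch below. This is only an illustration under some assumptions: FileStore, frame, unframe, and readSync are hypothetical names (not part of Yjs), there is one file per doc, and each update is written with a 4-byte length prefix so appended updates can be split apart again on read. It uses synchronous Node.js fs calls for brevity; a real store would probably want async I/O and error handling.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Same shape as the UpdateStore interface above, repeated so the sketch is self-contained.
interface UpdateStore {
    append(docName: string, arr: Uint8Array): void;
    overwrite(docName: string, arr: Uint8Array): void;
    read(docName: string): Promise<Uint8Array[]>;
}

// Prefix an update with its byte length (4 bytes, big-endian).
function frame(arr: Uint8Array): Uint8Array {
    const out = new Uint8Array(4 + arr.byteLength);
    new DataView(out.buffer).setUint32(0, arr.byteLength);
    out.set(arr, 4);
    return out;
}

// Split a file's bytes back into the individual updates.
function unframe(buf: Uint8Array): Uint8Array[] {
    const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
    const updates: Uint8Array[] = [];
    let pos = 0;
    while (pos + 4 <= buf.byteLength) {
        const len = view.getUint32(pos);
        pos += 4;
        if (pos + len > buf.byteLength) break; // truncated tail, e.g. interrupted append
        updates.push(buf.slice(pos, pos + len));
        pos += len;
    }
    return updates;
}

// Hypothetical filesystem store: one length-prefixed file per doc.
class FileStore implements UpdateStore {
    constructor(private dir: string) {
        fs.mkdirSync(dir, { recursive: true });
    }

    private file(docName: string): string {
        // Encode the name so characters like '/' cannot escape the directory.
        return path.join(this.dir, encodeURIComponent(docName) + '.bin');
    }

    public append(docName: string, arr: Uint8Array): void {
        fs.appendFileSync(this.file(docName), frame(arr));
    }

    public overwrite(docName: string, arr: Uint8Array): void {
        // Write to a temp file, then rename over the old one, so the old
        // file is never left half-overwritten.
        const tmp = this.file(docName) + '.tmp';
        fs.writeFileSync(tmp, frame(arr));
        fs.renameSync(tmp, this.file(docName));
    }

    public readSync(docName: string): Uint8Array[] {
        const file = this.file(docName);
        if (!fs.existsSync(file)) return [];
        // Copy the Buffer into a plain Uint8Array before splitting it.
        return unframe(new Uint8Array(fs.readFileSync(file)));
    }

    public read(docName: string): Promise<Uint8Array[]> {
        return Promise.resolve(this.readSync(docName));
    }
}

// Usage: a store in a fresh temp directory.
const fileStore = new FileStore(fs.mkdtempSync(path.join(os.tmpdir(), 'yjs-store-')));
fileStore.append('my-doc', Uint8Array.of(1, 2, 3));
fileStore.append('my-doc', Uint8Array.of(4));
```

Such a store could be passed to MultiDocProvider in place of MemoryStore. The length prefix is needed because append only ever adds bytes to the end of the file, so without it the boundaries between updates would be lost.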
