Is YKeyValue Still Necessary with Yjs Update Version 2 Optimizations?

csbenjamin · October 2, 2024, 2:32pm

Hello Yjs Community,

I’ve been exploring the performance optimizations in Yjs and came across an interesting observation when using the two different update handlers. Here’s the test I conducted:

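import * as Y from 'yjs';

const doc = new Y.Doc();

// Log the size of every update in both encodings.
doc.on('update', u => {
    console.log('V1:', u.length, 'bytes');
});
doc.on('updateV2', u => {
    console.log('V2:', u.length, 'bytes');
});
// Alternate writes to two keys inside a single transaction.
doc.transact(() => {
    const map = doc.getMap();
    for (let i = 0; i < 100000; i++) {
        map.set('a', 0);
        map.set('b', 0);
    }
}, doc.clientID);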

// Log Output
V1: 1983505 bytes
V2: 62 bytes

As the log shows, the V2 update (updateV2) is dramatically smaller than the V1 update (update): 62 bytes versus roughly 2 MB. This made me question whether the YKeyValue class from the y-utility library is still necessary. It is designed for efficient key-value storage and mitigates the size growth that Y.Map suffers when keys are updated in alternating order.

Here are some benchmark results comparing Y.Map and YKeyValue:

Operations	Keys	YKeyValue Doc Size (bytes)	Y.Map Doc Size (bytes)	JSON Size (bytes)
100k	10	271	524,985	121
100k	100	2,817	578,231	1,291
100k	1,000	30,017	593,834	13,891
500k	10	329	2,684,482	131
500k	100	3,013	2,954,249	1,391
500k	1,000	31,005	2,992,244	14,891

My Question:
With the significant optimizations introduced in updateV2, does this reduce or eliminate the need for using YKeyValue for managing key-value stores in scenarios where frequent and alternating updates occur? In other words, can updateV2 handle the optimizations that YKeyValue was initially designed to address, thereby simplifying the implementation by relying solely on Y.Map?

Thank you!

dmonad · October 4, 2024, 1:13pm

Hi @csbenjamin

The YKeyValue feature is still useful for reducing metadata overhead, memory usage, and encoding size (and for improving decoding speed).

v2 encoding compresses your data. Repetitive information can often be encoded into just a few bytes. But Yjs still needs to allocate memory once the bytes are decoded.

csbenjamin · October 13, 2024, 4:39pm

Thank you for the clarification, @dmonad!

I wasn’t fully aware that the Y.Doc instance can occupy significantly more memory than the updates themselves, even with the optimizations introduced in updateV2. While updateV2 does an impressive job of compressing repetitive data into smaller byte sizes, I now understand that the actual memory footprint of the Y.Doc after decoding the updates can still be substantial. Thanks again for highlighting this distinction!