"How to minimize Yjs store size for an editor like Figma?"

Hello everyone, I am using Yjs for an editor like Figma.
My data structure is an adjacency list stored as a Y.Map, and every vertex is again a Y.Map.
Below is a demo code snippet:

import * as Y from "yjs";

const identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];

const graph = new Y.Doc();
const adjacencyList = graph.getMap("adjacencyList");

for (let index = 0; index < 5000; index++) {
  // create matrix for vertex
  // use for positioning in 2D space
  const matrix = new Y.Array();
  matrix.insert(0, [...identity]);

  // create edges
  const edges = new Y.Array();

  // generate id for vertex
  const id = crypto.randomUUID();

  // create vertex
  const vertex = new Y.Map();

  vertex.set("id", id);
  vertex.set("matrix", matrix);
  vertex.set("edges", edges);

  // then add the vertex to adjacencyList
  adjacencyList.set(id, vertex);
}

function calcMatrix(array, movementX, movementY) {
  // ....
  return []; // new matrix
}

// drag listener
function drag(event) {
  const { movementX, movementY } = event;
  // let's drag all vertices simultaneously
  const selectedIds = [...adjacencyList.keys()];
  graph.transact(() => {
    for (const id of selectedIds) {
      const YMatrix = adjacencyList.get(id).get("matrix");
      const newMatrix = calcMatrix(YMatrix.toArray(), movementX, movementY);

      // replace the old matrix contents with the new ones
      YMatrix.delete(0, YMatrix.length);
      YMatrix.insert(0, newMatrix);
    }
  });
}

After a few seconds of dragging, my store grows to about 10 MB and
graph.store.clients.get(graph.clientID).length is about 356000.

How can I minimize the store?

I would recommend throttling updates to the Y.Doc so they are saved only once a second or so. Then for real-time synchronization use the Awareness API. This works because only active users need to share real-time coordinates with each other. When a user joins for the first time, syncing to the last 1s or more should be fine.

If you don’t need a complete history of drag movements, you may be able to get away with only using the Awareness API for real-time sharing, and relying on a non-CRDT storage mechanism for persistence of just the last saved position.

Also if it’s possible to only save the x,y coordinate and not the whole matrix that would help, but I may be missing a reason for them to be saved.
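
To illustrate the x,y idea, here is a minimal sketch that keeps only a position and derives the matrix when it is needed. It assumes a 4x4 column-major matrix (as in WebGL or CSS matrix3d) and a vertex that is only translated; the names matrixFromPosition and positionFromMatrix are made up for this example, and a real editor with rotation or scale would need more fields.

```javascript
// Assumption: 4x4 column-major layout, so the translation sits at
// indices 12 (x) and 13 (y). Adjust the indices for a row-major layout.
const IDENTITY = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];

// Build the full matrix from a stored { x, y } position.
function matrixFromPosition({ x, y }) {
  const m = [...IDENTITY];
  m[12] = x;
  m[13] = y;
  return m;
}

// Read the position back out of a matrix.
function positionFromMatrix(m) {
  return { x: m[12], y: m[13] };
}

// With Yjs, the vertex could then hold two numbers instead of sixteen, e.g.
//   vertex.set("position", { x: 10, y: 20 }); // "position" is a hypothetical key
```

Two numbers per vertex are cheaper to store and to replace than a 16-element Y.Array that is deleted and reinserted on every move.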

As the docs say, the Awareness API is designed for small amounts of state, and its API is too small to handle deep updates and the like. The data I showed is a very small sample; in reality it is much more complex.

I think it’s less about size, and more about whether you need the full history or not. The Awareness API is good at ephemeral, real-time conflict resolution where you don’t need to persist the full history.

Yjs is heavily optimized, but CRDTs use a lot of memory. If you want a smaller storage footprint, you need throttling and periodic truncation.

As I understand it, by throttling updates to the Y.Doc you mean what was discussed in "Throttling Yjs updates with garbage collection". But the live doc still has graph.store.clients.get(graph.clientID).length ~ 356000, which is going to slow down any operation on the doc.
My problem right now is the single-user, single-session editing experience. After that I can think about merging updates somehow and delivering them to peers.

I think I need some kind of function to throttle the Items in the list “doc.store.clients.get(doc.clientID)”.

I was suggesting throttling calls to YMatrix.delete/insert to avoid accumulating all the updates to begin with.

If you need to track the drag in real time, store it in a variable outside Yjs, then flush the changes to Yjs at an appropriate interval.
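
A minimal sketch of that idea, assuming you are fine with losing intermediate positions: keep the latest value per vertex in a plain Map and hand the batch to a commit callback at most once per interval. createThrottledWriter is a made-up helper, not part of Yjs, and the 1000 ms default is only an example.

```javascript
// Hypothetical helper: buffers the latest value per id outside the Y.Doc and
// passes the whole batch to `commit` at most once per `intervalMs`.
function createThrottledWriter(commit, intervalMs = 1000) {
  const pending = new Map(); // id -> latest value
  let timer = null;

  // Send everything buffered so far and cancel any scheduled flush.
  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (pending.size === 0) return;
    const batch = new Map(pending);
    pending.clear();
    commit(batch);
  }

  // Record a value; later writes for the same id overwrite earlier ones.
  function write(id, value) {
    pending.set(id, value);
    if (timer === null) timer = setTimeout(flush, intervalMs);
  }

  return { write, flush };
}

// Possible wiring with the code from the question (untested sketch):
//   const writer = createThrottledWriter((batch) => {
//     graph.transact(() => {
//       for (const [id, matrix] of batch) {
//         adjacencyList.get(id).set("matrix", matrix);
//       }
//     });
//   });
//   // in the drag listener: writer.write(id, newMatrix);
//   // on mouseup:           writer.flush();
```

However many drag events fire, the Y.Doc only sees one write per vertex per interval, so the struct list grows far more slowly.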

Just chiming in to +1 the recommendation to put transient updates elsewhere. Intermediate states don’t belong in your YDoc; they should use awareness or something external. Think of writing to the YDoc as your “COMMIT” action. For example, if I clicked and dragged a shape, the hundreds of drag events would be funneled through Awareness. It’s not until I let go of the mouse that I commit those changes to the YDoc.
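
Roughly, the commit-on-release pattern could look like the sketch below. createDragSession and its preview/commit callbacks are made-up names for this example; preview is where an Awareness update would go and commit is where the single YDoc write would go.

```javascript
// Hypothetical helper: accumulates drag deltas, calls `preview` on every move
// (ephemeral state) and `commit` exactly once when the drag ends.
function createDragSession({ preview, commit }) {
  let dx = 0;
  let dy = 0;
  let active = false;

  return {
    // Call from the mousemove/pointermove handler.
    move(movementX, movementY) {
      active = true;
      dx += movementX;
      dy += movementY;
      preview({ dx, dy });
    },
    // Call from the mouseup/pointerup handler.
    end() {
      if (!active) return;
      const total = { dx, dy };
      dx = 0;
      dy = 0;
      active = false;
      preview(null); // clear the ephemeral state
      commit(total); // the one durable write
    },
  };
}

// Possible wiring (untested sketch; applyOffset is a hypothetical function):
//   const session = createDragSession({
//     preview: (offset) => awareness.setLocalStateField("drag", offset),
//     commit: (offset) => graph.transact(() => applyOffset(offset)),
//   });
```

A drag with hundreds of move events then produces hundreds of Awareness updates but only one transaction on the YDoc.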

This pattern is both:
a) important to understand
b) easy to miss when you first pick up yjs

Perhaps the Yjs framework itself should abstract this sort of thing away into the shared types API; Yjs would differentiate between high-frequency ephemeral updates and low-frequency action/event/commit-type updates without burdening the developer with thinking about how to implement it…

It could be, e.g., a plugin that optionally extends the shared type implementation or something similar. The important part is that it should be easy to opt into, work across the Yjs ecosystem without breaking API changes, and be entirely optional.

@meatflavourdev That’s not a bad idea.

You could mirror your document schema in Awareness, and then flush state periodically to the Doc. The throttle rate would be easily configurable.

If real-time updates are not needed, throttling the Doc transaction up front has the least overhead.

Can I ask a question, by the way?

I am wondering how you get the size of the store. That would help me a lot, thank you!
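
I can't say how the number in the original post was measured, but one way to get a rough figure is to look at the byte length of the encoded document together with the struct counts per client. A sketch, where describeStore is a made-up helper and the encoder is passed in so the snippet has no imports (with Yjs you would pass Y.encodeStateAsUpdate):

```javascript
// Hypothetical helper: reports the encoded size of a doc and how many structs
// each client has in doc.store.clients.
function describeStore(doc, encodeStateAsUpdate) {
  const structsPerClient = {};
  for (const [clientId, structs] of doc.store.clients) {
    structsPerClient[clientId] = structs.length;
  }
  return {
    // Size of the serialized document, not the in-memory footprint.
    encodedBytes: encodeStateAsUpdate(doc).byteLength,
    structsPerClient,
  };
}

// With Yjs (untested sketch):
//   import * as Y from "yjs";
//   console.log(describeStore(graph, Y.encodeStateAsUpdate));
```

Note that the encoded size is usually smaller than what the doc occupies in memory; for the in-memory figure a heap snapshot in the browser dev tools is probably the more reliable tool.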