Refreshing page causes y-indexeddb to accumulate db entries

I have a standard call to y-indexeddb in a web app, e.g.

	const persistence = new IndexeddbPersistence(room, doc)
	// once the map is loaded, it can be displayed
	persistence.once('synced', () => {
		console.log(' local content loaded')
	})

When I look in DevTools at Application > IndexedDB > [room] > updates, I see 2 rows, one for a Uint8Array[0,0,buffer: arrayBuffer(2)...] and one long Uint8Array.

If I now refresh the page in the browser (without making any changes to the doc), another two rows are added (similar to the first two). Each refresh adds another two rows. Is this the expected behaviour? Since the data to be stored hasn’t changed, I wasn’t expecting any change to the database.

(My motive for asking is that I am seeing some unreliability in retrieving the contents of a yDoc when I use both y-indexeddb and y-websocket, and I am trying to track down the cause.)

@micrology, are you using Firefox? This is probably related to the issue "Y-websocket-server connection event emitted twice on page reload".

This occurs on Chrome and Safari as well as Firefox (on a Mac, Monterey), so I don’t think there is a link to that issue. Also I am logging connection events and I don’t see it emitted twice.

Here is a minimal example:

	import * as Y from 'yjs'
	import {IndexeddbPersistence} from 'y-indexeddb'

	const doc = new Y.Doc()
	const persistence = new IndexeddbPersistence('test11-2', doc)
	persistence.once('synced', () => {
		console.log(` ${yMap1.get('prop1')} loaded from IDB`)
	})
	const yMap1 = doc.getMap('map1')
	yMap1.set('prop1', 'foo')

If you run this and look at the IndexedDB view in the browser's developer tools, you will see that every time the page is reloaded, a new key and value are added to the updates object store of the test11-2 database. This is true in both Chrome and Firefox.

This happens here as well, but it doesn’t seem to cause any issues. What kinds of unreliabilities are you experiencing?

I have a yDoc that can often be 3MB or more. Reloading this a good few times uses a lot of memory since 3MB is added each time. I suspect that the problem a client had with my app was caused by IndexedDB being ‘out of memory’ (or out of disk space), but it was not possible to reproduce the issue. Whether or not that was the case, I don’t understand why we are seeing this behaviour.

@micrology did you find a solution for this? I’m also facing the same issue and checking how to resolve it.

I didn’t! Sorry I have nothing to help.

Reg

You can think of Yjs document as a git repository. Every time you change or create a value, you create a new commit. The only relevant difference is that in Yjs conflicts are automatically merged.

When you insert a value every time you start the app, you are creating a commit on an empty document. Of course, the change will automatically be merged (if the same value is set by a remote client/server, then the values will be merged - in the case of Y.Map, one will overwrite the other).

y-indexeddb notices that you created a change and stores the “commit” in a database. It squashes commits into a single entry every now and then, but the produced metadata can never be deleted… So you should avoid making unnecessary changes.

I talked about this a number of times on this discussion board (search for “initial content”). You should only initialize the content once (not every time you load the document). Firstly, it is extremely inefficient (Yjs needs to store all data that was ever produced, even metadata of content that was overwritten). Secondly, there is a good chance that you overwrite the content that is currently used by all other clients which might include new changes. If you manipulate the Yjs document (even if you overwrite content with the exact same content), then Yjs needs to store it in the database (indexeddb) and propagate the change to all clients.
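One way to follow the "initialize only once" advice is to write defaults only for keys that are still missing. This is a minimal sketch, not part of Yjs or y-indexeddb: `initializeOnce` is a hypothetical helper, and it only assumes that the target exposes `has` and `set` (which a Y.Map does, as does a plain `Map`).

```javascript
// Hypothetical helper: write default values only for keys that are absent.
// `map` is anything with has/set (for example a Y.Map or a plain Map);
// `defaults` is a plain object of key/value pairs.
function initializeOnce (map, defaults) {
  let written = 0
  for (const [key, value] of Object.entries(defaults)) {
    if (!map.has(key)) {
      map.set(key, value) // only this branch produces a new change
      written++
    }
  }
  return written // number of keys that were actually initialized
}
```

With the example from earlier in the thread, this would be called as `initializeOnce(yMap1, { prop1: 'foo' })` inside the `persistence.once('synced', ...)` callback, so the check runs after the stored content has loaded. When a server is involved as well, the check probably has to wait for that sync too, otherwise the defaults can still overwrite remote content.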

I am seeing the same behavior when there is no initialization or changes to the doc. It still persists a new update.

Minimal example:

Notice that I never update the Doc, or even create a shared type on it for that matter.

I have traced the behavior to the following commit. Anybody know why this was added? It seems like a mistake that it grows the object store without any changes to the Doc.

New to Yjs and still learning, so I kept trying to figure out what I was doing to cause this issue. My updates were growing without me understanding why.

@micrology thanks for setting up the minimal example. And @raine, hopefully that leads to a fix. Nicely done!

ChatGPT4 thinks that there is a logic error in the storeState method. Really not sure how accurate that is atm…

The y-indexeddb module, as shown in your code, is a persistence layer that uses IndexedDB to store the state of a Yjs document. The module listens for updates to the Yjs document, and stores these updates in an IndexedDB database. When the Yjs document is loaded, it fetches these updates from the database and applies them to the document, effectively restoring its state.

The storeState function is called periodically (as defined by _storeTimeout) or when the number of updates in the database reaches a certain threshold (PREFERRED_TRIM_SIZE). This function fetches all updates from the database, applies them to the Yjs document, and then stores the resulting state of the document in the database. If the number of updates has reached the threshold, it also deletes the oldest updates from the database, ensuring that the total number of updates does not exceed the threshold.

Refreshing the page causes a new instance of the Yjs document to be created, and this new instance fetches and applies all updates from the database. However, these updates are not deleted from the database, because the storeState function only deletes updates when the number of updates reaches the threshold. As a result, each refresh of the page causes the database to accumulate more updates.
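The accumulation described here can be pictured with a toy in-memory model. This is not the y-indexeddb code: `UpdateLog`, `simulateReloads` and the injected `merge` function are hypothetical names, and `trimSize` only stands in for the PREFERRED_TRIM_SIZE threshold mentioned above.

```javascript
// Toy model of an append-only update log with threshold-based compaction.
// All names are hypothetical; this only illustrates the described behaviour.
class UpdateLog {
  constructor (trimSize, merge) {
    this.trimSize = trimSize // stands in for PREFERRED_TRIM_SIZE
    this.merge = merge       // combines all rows into a single entry
    this.rows = []
  }

  append (update) {
    this.rows.push(update)
    // Rows are only squashed once the threshold is reached.
    if (this.rows.length >= this.trimSize) {
      this.rows = [this.merge(this.rows)]
    }
  }
}

// Simulate page loads that each write one row although nothing changed.
function simulateReloads (log, reloads) {
  for (let i = 0; i < reloads; i++) {
    log.append('unchanged')
  }
  return log.rows.length // rows currently held in the log
}
```

In this model, with a threshold of 5, three reloads leave three rows in the log, and the rows only collapse to one when the fifth row arrives. With a large threshold and a multi-megabyte document, many reloads would fit in before any squashing happens.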

To prevent this accumulation of updates, you could modify the storeState function to delete all updates from the database after they have been applied to the Yjs document, regardless of the number of updates. Alternatively, you could decrease the value of PREFERRED_TRIM_SIZE to a smaller number, so that updates are deleted more frequently.

Please note that this is a high-level explanation based on the provided code and may not cover all edge cases or specific implementation details.

Interesting. That’s more or less correct. However, it doesn’t explain why empty updates are being stored on load. That code was only added in September, so it was not part of the core logic, and presumably it was addressing a newly discovered edge case.

A couple of steps that might be worth a try:

  1. See which (if any) tests fail when that commit is reverted.
  2. Add a condition that only stores the update if it is non-empty.
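A rough sketch of what the second step could look like. It assumes that an empty update in the default (v1) encoding is exactly two zero bytes, which matches the Uint8Array[0,0] row reported at the top of the thread. `isEmptyUpdate` and `storeIfNonEmpty` are hypothetical helpers, not existing y-indexeddb functions, and `store` stands in for whatever actually writes to the updates object store.

```javascript
// Assumption: an update that carries no changes and no deletions is
// encoded as two zero bytes (as seen in the [0,0] row in DevTools).
function isEmptyUpdate (update) {
  return update.length === 2 && update[0] === 0 && update[1] === 0
}

// Hypothetical guard: call `store` only when the update carries content.
// Returns true if the update was passed on, false if it was skipped.
function storeIfNonEmpty (update, store) {
  if (isEmptyUpdate(update)) {
    return false
  }
  store(update)
  return true
}
```

Whether skipping these writes is safe depends on why the commit added them in the first place, so the first step (checking which tests fail on revert) is still worth doing.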