I have a standard call to y-indexeddb in a web app, e.g.
const persistence = new IndexeddbPersistence(room, doc)
// once the map is loaded, it can be displayed
persistence.once('synced', () => {
  console.log(' local content loaded')
})
When I look in DevTools at Application > IndexedDB > [room] > updates, I see 2 rows, one for a Uint8Array[0,0,buffer: arrayBuffer(2)...] and one long Uint8Array.
If I now refresh the page in the browser (without making any changes to the doc), another two rows are added (similar to the first two). Each refresh adds another two rows. Is this the expected behaviour? Since the data to be stored hasn't changed, I wasn't expecting any change to the database.
(The motive for asking this question is that I am experiencing some unreliability in retrieving the contents of a yDoc when I use both y-indexedDB and y-websocket, and I am trying to track down the cause).
This occurs on Chrome and Safari as well as Firefox (on a Mac, Monterey), so I don't think there is a link to that issue. Also, I am logging connection events and I don't see it emitted twice.
import * as Y from 'yjs'
import { IndexeddbPersistence } from 'y-indexeddb'

const doc = new Y.Doc()
const persistence = new IndexeddbPersistence('test11-2', doc)
persistence.once('synced', () => {
  console.log(` ${yMap1.get('prop1')} loaded from IDB`)
})
// set a value on a shared map (this runs on every page load)
const yMap1 = doc.getMap('map1')
yMap1.set('prop1', 'foo')
If you run this, and look in the Debugger at the display for IndexedDB, every time the page is reloaded, a new key and value is generated for database test11-2, object store updates. This is true in both Chrome and Firefox.
I have a yDoc that can often be 3MB or more. Reloading this a good few times uses a lot of memory since 3MB is added each time. I suspect that the problem a client had with my app was caused by IndexedDB being "out of memory" (or out of disk space), but it was not possible to reproduce the issue. Whether or not that was the case, I don't understand why we are seeing this behaviour.
You can think of a Yjs document as a git repository. Every time you change or create a value, you create a new commit. The only relevant difference is that in Yjs conflicts are automatically merged.
When you insert a value every time you start the app, you are creating a commit on an empty document. Of course, the change will automatically be merged (if the same value is set by a remote client/server, then the values will be merged - in the case of Y.Map, one will overwrite the other).
y-indexeddb notices that you created a change and stores the "commit" in a database. It squashes commits into a single entry every now and then, but the produced metadata can never be deleted... So you should avoid making unnecessary changes.
I talked about this a number of times on this discussion board (search for "initial content"). You should only initialize the content once (not every time you load the document). Firstly, it is extremely inefficient (Yjs needs to store all data that was ever produced, even metadata of content that was overwritten). Secondly, there is a good chance that you overwrite the content that is currently used by all other clients, which might include new changes. If you manipulate the Yjs document (even if you overwrite content with the exact same content), then Yjs needs to store it in the database (indexeddb) and propagate the change to all clients.
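One way to picture the "initialize only once" advice is a guard that writes a default value only when the key is missing. This is a minimal sketch, not something taken from Yjs or y-indexeddb: `initOnce` is a hypothetical helper name, and a plain `Map` stands in for the shared map so the snippet runs on its own. It assumes only that the map-like object exposes `has` and `set`.

```javascript
// Hypothetical helper: set a default only when the key is absent.
// Works with any object exposing has/set (a plain Map is used below).
function initOnce (map, key, value) {
  if (map.has(key)) {
    return false // content already exists, so no new change is created
  }
  map.set(key, value)
  return true
}

// Demonstration with a plain Map standing in for the shared type.
const demoMap = new Map()
const firstLoad = initOnce(demoMap, 'prop1', 'foo') // key missing: value is written
const secondLoad = initOnce(demoMap, 'prop1', 'foo') // key present: nothing is written
console.log(firstLoad, secondLoad) // true false
```

In the example from the question, a call such as `initOnce(yMap1, 'prop1', 'foo')` would presumably go inside the `persistence.once('synced', ...)` callback, so the check runs after the local content has loaded. When y-websocket is also in use, remote content may not have arrived at that point, so this guard alone may not be enough.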
I am seeing the same behavior when there is no initialization or changes to the doc. It still persists a new update.
Minimal example:
Notice that I never update the Doc, or even create a shared type on it for that matter.
I have traced the behavior to the following commit. Anybody know why this was added? It seems like a mistake that it grows the object store without any changes to the Doc.
New to Yjs and still learning, so I kept trying to figure out what I was doing to cause this issue. My updates were growing without me understanding why.
@micrology thanks for setting up the minimal example. And @raine, hopefully that leads to a fix. Nicely done!
ChatGPT4 thinks that there is a logic error in the storeState method. Really not sure how accurate that is atm...
The y-indexeddb module, as shown in your code, is a persistence layer that uses IndexedDB to store the state of a Yjs document. The module listens for updates to the Yjs document, and stores these updates in an IndexedDB database. When the Yjs document is loaded, it fetches these updates from the database and applies them to the document, effectively restoring its state.
The storeState function is called periodically (as defined by _storeTimeout) or when the number of updates in the database reaches a certain threshold (PREFERRED_TRIM_SIZE). This function fetches all updates from the database, applies them to the Yjs document, and then stores the resulting state of the document in the database. If the number of updates has reached the threshold, it also deletes the oldest updates from the database, ensuring that the total number of updates does not exceed the threshold.
Refreshing the page causes a new instance of the Yjs document to be created, and this new instance fetches and applies all updates from the database. However, these updates are not deleted from the database, because the storeState function only deletes updates when the number of updates reaches the threshold. As a result, each refresh of the page causes the database to accumulate more updates.
To prevent this accumulation of updates, you could modify the storeState function to delete all updates from the database after they have been applied to the Yjs document, regardless of the number of updates. Alternatively, you could decrease the value of PREFERRED_TRIM_SIZE to a smaller number, so that updates are deleted more frequently.
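The threshold-based trimming described in this answer can be pictured with a toy model. This is only an illustration of the idea, not y-indexeddb's code: `ToyUpdateLog`, its `trimSize` of 5, and the string "updates" are all made up for the sketch.

```javascript
// Toy model only: names and numbers are illustrative, not taken from y-indexeddb.
class ToyUpdateLog {
  constructor (trimSize) {
    this.trimSize = trimSize // row count at which the log is compacted
    this.rows = []
  }

  // Every stored update becomes a new row, even if it changes nothing.
  append (update) {
    this.rows.push(update)
    if (this.rows.length >= this.trimSize) {
      this.compact()
    }
  }

  // Replace all rows with a single merged row.
  compact () {
    this.rows = [this.rows.flat()]
  }
}

const log = new ToyUpdateLog(5)
for (let i = 1; i <= 4; i++) {
  log.append([`load-${i}`]) // one row per simulated page load
}
const beforeCompaction = log.rows.length // rows accumulate below the threshold
log.append(['load-5']) // reaching the threshold triggers compaction
const afterCompaction = log.rows.length
console.log(beforeCompaction, afterCompaction) // 4 1
```

In this model the row count grows by one per load until the threshold is hit, which matches the growth reported in the thread for stores that are still below their trim size.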
Please note that this is a high-level explanation based on the provided code and may not cover all edge cases or specific implementation details.
Interesting. That's more or less correct. However it doesn't clarify why empty updates are being stored on load. That code was just added in September, so it was not part of the core logic, and presumably was addressing a newly discovered edge case.
A couple steps that might be worth a try:
See which (if any) tests fail when that commit is reverted.
Add a condition that only stores the update if it is non-empty.
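The second step could look something like the sketch below. It assumes that an empty update is the two-byte array `[0, 0]`, which matches the short `Uint8Array` row described in the first post; that assumption would need checking against the encoding actually in use. `isEmptyUpdate` and `storeIfNonEmpty` are hypothetical names, and a plain array stands in for the `updates` object store.

```javascript
// Assumption: an empty update is the two-byte array [0, 0],
// matching the short row described in the first post.
function isEmptyUpdate (update) {
  return update.length === 2 && update[0] === 0 && update[1] === 0
}

// Hypothetical store: a plain array stands in for the 'updates' object store.
function storeIfNonEmpty (rows, update) {
  if (isEmptyUpdate(update)) {
    return false // skip the write, so the store does not grow
  }
  rows.push(update)
  return true
}

const rowsDemo = []
storeIfNonEmpty(rowsDemo, new Uint8Array([0, 0])) // skipped
storeIfNonEmpty(rowsDemo, new Uint8Array([1, 1, 200, 3])) // stored
console.log(rowsDemo.length) // 1
```

Whether skipping these writes is safe depends on why the store-on-load code was added in the first place, which is what the first step (reverting the commit and running the tests) would help establish.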