Approximation of the uncompressed update size

timlock · September 3, 2023, 4:30pm

Hello, I’m trying to approximate the size of an uncompressed update for research purposes. My current approach is to print the update to the console via Y.logUpdate() and then capture the output in an overridden console.log method. Afterwards I convert the captured log to JSON and calculate its size in bytes.
For an update with a compressed size of 115 bytes that contains a struct with two items, I obtain an uncompressed size of 490 bytes. For a struct with three items, I get an uncompressed size of 820 bytes and a compressed size of 175 bytes.
Is my approach suitable or is there a better way to do this?
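
A minimal sketch of the capture step described above, for readers who want to reproduce it. The helper name `measureLoggedJsonSize` and the sample object are made up for illustration and are not part of Yjs; the sample object is only a stand-in and does not claim to match what Y.logUpdate() actually prints.

```javascript
// Hypothetical helper (not part of Yjs): runs `fn`, captures every call it
// makes to console.log, and returns the captured arguments as JSON together
// with the UTF-8 byte length of that JSON.
function measureLoggedJsonSize(fn) {
  const captured = [];
  const originalLog = console.log;
  console.log = (...args) => {
    captured.push(args); // one entry per console.log call
  };
  try {
    fn();
  } finally {
    console.log = originalLog; // always restore the real console.log
  }
  // JSON.stringify throws on circular structures; that case is not handled here.
  const json = JSON.stringify(captured);
  // Count bytes rather than characters, so non-ASCII content is sized correctly.
  const bytes = new TextEncoder().encode(json).length;
  return { json, bytes };
}

// Stand-in for whatever gets logged; the shape is invented for this example.
const sample = { structs: [{ content: 'a' }, { content: 'b' }] };
const result = measureLoggedJsonSize(() => console.log(sample));
console.log(`approximate uncompressed size: ${result.bytes} bytes`);

// With Yjs imported as Y, the callback would presumably be:
//   measureLoggedJsonSize(() => Y.logUpdate(update))
```

Because the captured calls are wrapped in nested arrays, the result includes a few bytes of bracket overhead, which should not matter for a rough approximation.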

raine · September 7, 2023, 6:25pm

That seems reasonable as an approximation.

However, you may run into a methodological problem if you compare compressed and uncompressed sizes, because they are not directly comparable in your case. For example, a 400-byte text file is exactly twice as big as a 200-byte text file, but a 400-byte uncompressed Yjs update is not exactly twice as big as a 200-byte compressed Yjs update, because the JSON is structurally different. Using the same unit could be misleading, depending on how you present the data. I would either stick to making comparisons within the same type of compression, or express the uncompressed update size as a unitless value.
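
As a rough illustration of comparing only within the same type, the sketch below takes the four sizes reported earlier in this thread and computes how much each representation grows from the two-item to the three-item update. The object layout and the `growth` helper are assumptions made for this example.

```javascript
// Sizes in bytes, as reported earlier in this thread.
const sizes = {
  twoItems: { compressed: 115, uncompressed: 490 },
  threeItems: { compressed: 175, uncompressed: 820 },
};

// Hypothetical helper: growth factor from two items to three items,
// computed within a single representation so the result is unitless.
const growth = (kind) => sizes.threeItems[kind] / sizes.twoItems[kind];

console.log(growth('compressed').toFixed(2));   // prints "1.52"
console.log(growth('uncompressed').toFixed(2)); // prints "1.67"
```

Each factor relates a representation only to itself, so neither depends on treating compressed and uncompressed bytes as the same unit.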

Not sure if this is for academic or personal research purposes though, so that may not be relevant for you!

timlock · September 8, 2023, 9:32am

Thanks for the reply. To give a bit more context: I’m currently developing two spreadsheet applications with different consistency guarantees for comparison. One application exchanges operations in JSON format, while the other uses Yjs. A rough approximation of the message size is sufficient for my comparison of the two approaches.