crazygringo 5 days ago

Working with protobuf has been a huge pain for me, personally -- sites that make API data available only in older versions of protobuf where there's no available library to decode in a popular language. JSON and zip are easy, universal, accessible standards in ways that protobuf simply isn't.

So that's why I'm saying, there's really something to be said for zipped JSON. You point out that you "usually need to decompress the whole thing in memory", and that's precisely what most of my comment was about -- handling it efficiently so you don't.

And it's not like protobuf is inherently any better at that anyways -- if you want to access a record in the middle of the file, you've got to stream the whole file up to that point to get to at it. It doesn't support random access in any native way.

So I'm kind of bummed out my original comment is being downvoted -- I'm making an actual serious proposal in it. I think zipped JSON should actually be taken seriously as a file format in its own right, especially with much more memory-efficient decoding.

2
IshKebab 4 days ago

> And it's not like protobuf is inherently any better at that anyways -- if you want to access a record in the middle of the file, you've got to stream the whole file up to that point to get to at it.

Not true. Libraries may not commonly expose the functionality but Protobuf uses tag-length-value so you can very quickly skip to the part of a file you want. In the worst case it will still be orders of magnitude faster than JSON.

Your proposal sounds very complicated and fragile, and doesn't solve significant issues with JSON like zero-copy, storing numbers and byte arrays directly, etc.

crazygringo 4 days ago

Protobuf doesn't include any kind of indexing. You can't just jump to a 5,000th record without traversing through the previous 4,999 ones. So no, you can't generally skip to some specific part. You've got to stream from the beginning, just like JSON.

And my proposal is literally the opposite of complicated and fragile. It's just taking two extremely robust and relatively simple existing standards.

It's definitely not perfect, but the entire point is its simplicity and robustness and support, unlike protobuf. And the fact that a library could actually make it much more performant to use, without sacrificing any compatibility with zip or JSON.

jlouis 5 days ago

Zipped JSON is just massive waste of energy in a battery. We don't want that.

crazygringo 4 days ago

Of all of the drains on batteries, I think it's close to the bottom...