bob1029 5 days ago

> The key insight: when a binary string has a low density of 1s (specifically below p* ≈ 0.32453), we can encode just the positions of those 1s more efficiently than storing the raw string.

Much of what JPEG/MPEG are doing is rearranging the problem such that it is possible to create long runs of zeroes. The way in which a DCT block is scanned relative to the location of its AC/DC components is potentially one of the most innovative aspects of many video & image compression techniques.

akoboldfrying 5 days ago

Agree.

OP's approach is actually terrible for video compression, because it actively throws away the locality of pixel changes present in typical video.

A nicer way of putting it would be that nothing about OP's technique is specific to video frames -- the same idea could be used to compress the diff between any two equal-length bit sequences. Might this nevertheless turn out to be better than existing forms of compression for this problem, like gzipping the concatenated pair of blocks? No, because the only way you get compression at all is if the distribution of inputs (here, sets of differing bit positions) is highly predictable, i.e., nonrandom -- and pushing data through a hash function actively destroys that (especially if it's a cryptographically strong hash, the whole point of which is to produce output that is indistinguishable from random).
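A quick way to see the hash point, as a rough sketch rather than anything from OP's code: take two inputs that differ in a single bit and compare how many bits differ before and after hashing. The message contents and the helper name `bit_diff_fraction` are made up for illustration, and SHA-256 stands in for whatever hash is actually used.

```python
import hashlib

def bit_diff_fraction(a: bytes, b: bytes) -> float:
    """Fraction of bit positions that differ between two equal-length byte strings."""
    assert len(a) == len(b)
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))

# Hypothetical inputs: ASCII "0" and "1" differ in exactly one bit.
msg1 = b"frame-0000"
msg2 = b"frame-0001"

raw = bit_diff_fraction(msg1, msg2)
hashed = bit_diff_fraction(
    hashlib.sha256(msg1).digest(),
    hashlib.sha256(msg2).digest(),
)

print(f"raw inputs:    {raw:.4f} of bits differ")
print(f"SHA-256 output: {hashed:.4f} of bits differ")
```

The raw diff is a single bit out of 80, which is very sparse and very compressible. The hashed diff should land somewhere near one half, which is about as incompressible as a diff gets.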

cogman10 5 days ago

I don't believe this is correct.

What the DCT does, along with the color representation transformation, is turn fine details into high frequencies and core details into low frequencies. From there, trading image quality for compression ratio is as simple as dropping high-frequency components.

And besides that, jpegs use a Huffman table to further reduce the size of the image.

AFAIK, it doesn't do anything special to reduce runs. So lining up zeros really doesn't help much.

IshKebab 5 days ago

This is true, but OP was also correct. The DCT components are quantised and encoded in an order such that you get a long string of 0s at the end (the high frequencies).

Retr0id 5 days ago

Dropping (or rather, heavily quantizing) the high frequency components does create runs of zeroes with high probability. The order the components are stored (a diagonal zig-zag) pushes the likely-to-be-zero elements together. At higher quality settings you might not have actual runs of zeroes, but you'll at the least have runs of low-entropy values.
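To make the zig-zag point concrete, here is a minimal sketch. The 8x8 block is a toy: the coefficient magnitudes and quantisation steps are invented formulas (magnitude falling off with frequency, step size growing with it), not a real JPEG quantisation table. It compares the trailing zero run under a plain row-major scan with the run under the diagonal zig-zag scan.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in diagonal zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        # All cells on the anti-diagonal row + col == s, top-right first.
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def trailing_zeros(seq):
    """Length of the run of zeros at the end of seq."""
    count = 0
    for v in reversed(seq):
        if v != 0:
            break
        count += 1
    return count

N = 8
# Toy quantised block (assumed values): coefficient 200 / (1 + r + c)^2,
# divided by a step of 1 + 4 * (r + c), then rounded.
block = [
    [round(200 / (1 + r + c) ** 2 / (1 + 4 * (r + c))) for c in range(N)]
    for r in range(N)
]

row_major = [block[r][c] for r in range(N) for c in range(N)]
zigzag = [block[r][c] for r, c in zigzag_order(N)]

print(trailing_zeros(row_major))  # 39
print(trailing_zeros(zigzag))     # 54
```

Both scans contain the same 10 nonzero values; the zig-zag order just packs them at the front so that everything after them is one uninterrupted run of zeros.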

brigade 5 days ago

Dropping the high frequencies does create zero runs, and even JPEG encodes zero runs as a run-length (RRRR in the spec).

But DCT isn't very useful for lossless since any lossless frequency domain representation requires more range than the source, and you can't quantize to counter it.
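A small illustration of the range problem, using a 2-point integer sum/difference transform as a stand-in (this is an assumption for the sake of a short example, not the 8x8 DCT that JPEG uses): two 8-bit samples go in, and each output needs 9 bits if the transform is to stay exactly invertible.

```python
from itertools import product

def sum_diff(a, b):
    """Forward 2-point transform: low-frequency sum and high-frequency difference."""
    return a + b, a - b

def inverse(s, d):
    """Exact inverse of sum_diff (s + d and s - d are always even)."""
    return (s + d) // 2, (s - d) // 2

# Exhaustively check every pair of 8-bit samples.
pairs = list(product(range(256), repeat=2))
outputs = [sum_diff(a, b) for a, b in pairs]

sums = [s for s, _ in outputs]
diffs = [d for _, d in outputs]

sum_bits = (max(sums) - min(sums)).bit_length()
diff_bits = (max(diffs) - min(diffs)).bit_length()

print(min(sums), max(sums), sum_bits)     # 0 510 9
print(min(diffs), max(diffs), diff_bits)  # -255 255 9
```

So 16 bits of input become a nominal 18 bits of output before any entropy coding, and without quantisation there is no step that claws that back.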

Sesse__ 4 days ago

JPEG even has a special code for “the rest from here is all zeros, so we stop this block now”. The entire format is pretty much organized around trying to get as many zero coefficients as possible.
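Roughly how the zero-run and end-of-block codes fit together, as a simplified sketch: real JPEG pairs the run length (RRRR) with a size category (SSSS) and then Huffman-codes the result, whereas this version keeps plain `(zero_run, value)` tuples so the structure is easy to see. The function names are made up; the `(15, 0)` and `(0, 0)` markers mirror JPEG's 16-zero run (ZRL) and end-of-block (EOB) symbols.

```python
EOB = (0, 0)   # "everything from here on is zero"
ZRL = (15, 0)  # a run of 16 zeros with no value attached

def rle_encode(ac):
    """Encode a list of AC coefficients as (zero_run, value) pairs."""
    # Index of the last nonzero coefficient, or -1 if the block is all zeros.
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    out, run = [], 0
    for v in ac[: last + 1]:
        if v == 0:
            run += 1
            if run == 16:  # run field is only 4 bits wide, so split long runs
                out.append(ZRL)
                run = 0
        else:
            out.append((run, v))
            run = 0
    if last + 1 < len(ac):  # trailing zeros collapse into a single EOB
        out.append(EOB)
    return out

def rle_decode(pairs, n=63):
    """Inverse of rle_encode for a block of n AC coefficients."""
    out = []
    for pair in pairs:
        if pair == EOB:
            break
        if pair == ZRL:
            out.extend([0] * 16)
        else:
            run, v = pair
            out.extend([0] * run + [v])
    out.extend([0] * (n - len(out)))  # fill whatever EOB skipped
    return out

ac = [5] + [0] * 20 + [3] + [0] * 41  # 63 AC coefficients
encoded = rle_encode(ac)
print(encoded)  # [(0, 5), (15, 0), (4, 3), (0, 0)]
assert rle_decode(encoded) == ac
```

Sixty-three coefficients shrink to four symbols here, and the 41 trailing zeros cost exactly one of them.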