bilsbie 5 days ago

So how does the bloom filter help vs something like a hash table?

3
hxtk 4 days ago

It sounds like your instinct here is that we need to store (x,y,r,g,b) tuples for everything that changed, and leave other pixels unchanged from the previous frame.

You could store all of that in a single map from (x,y) to (r,g,b), which would cost you, say, 4 bytes per pixel for the (x,y) data and then, say, 30 bits or (rounding up) 4 bytes per pixel for the (r,g,b) with 10-bit color. A hash map would optimize lookup, but for efficient storage you really just want an array of 8-byte chunks where the first two bytes are x, next 2 are y, next 10 bits are r, then g, then b. To decode, you just start with the previous frame and then iterate down the list and apply each diff in the list.

If you iterate over (x,y) in a consistent order, you avoid the need to store it because you can simply say that the next (r,g,b) tuple in your list is the pixel data for the next (x,y) value that passes the bloom filter. This effectively cuts the amount of data you have to store per pixel in half, but increases the number of pixels you have to store by a probabilistically small amount because bloom filters have false-positives.

johanvts 4 days ago

You start out with the position of all the 1s (positions of changed pixels). A hashtable would be great for speeding up the query time, but does not compress anything. The bloom filter takes up much less space. The price is that you can get false positives.

returningfory2 5 days ago

You don't need to store the diffs for the coordinates that are not in the bloom filter. If the number of coordinates that changed is small, the size of the bloom filter will be small, and this is a significant optimization.

raincole 5 days ago

So it's just compressing consecutive 0s? Like most of compression algorithms do...?

grumbelbart2 4 days ago

It's more consistent.

As a rule of thumb, Bloom filters require 10 bits per element for a 1% false positive rate. Say 100 pixels changed between the frames, that 1000 bits or ~125 bytes to store the which-pixel-changed-map.

Runlength encoding of the (in)active bits can use anything from ~10-200 bytes for 100 pixels (Say 1 byte per run, 200 runs of active vs. inactive in the worst case).