Parallelism isn't speed. Zstandard decompression is insanely fast on a single core, so there's no need to occupy multiple cores.
Bloom filter lookups are embarrassingly parallel, to the point that you could occupy zero CPU cores and do them in a GPU fragment shader.
There are probably other reasons why this is a bad idea but I'm curious to try it.
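
To make the "embarrassingly parallel" part concrete, here's a rough CPU-side sketch in Python. It isn't anyone's actual implementation: the filter size, the number of probes, and the SHA-256 double-hashing scheme are all assumptions I picked for illustration. The point is only that each lookup reads the shared bit array plus its own key and nothing else, which is the same shape as one shader invocation per key.

```python
import hashlib

M_BITS = 1 << 16   # filter size in bits (assumed for illustration)
K_HASHES = 4       # probes per key (assumed for illustration)

def bit_positions(key: bytes):
    """Derive K_HASHES bit positions from one SHA-256 digest via double hashing."""
    digest = hashlib.sha256(key).digest()
    h1 = int.from_bytes(digest[:8], "little")
    h2 = int.from_bytes(digest[8:16], "little") | 1  # force odd so probes differ
    return [(h1 + i * h2) % M_BITS for i in range(K_HASHES)]

def build(keys) -> bytearray:
    """Set the bits for every key; this is the only step that writes."""
    bits = bytearray(M_BITS // 8)
    for key in keys:
        for pos in bit_positions(key):
            bits[pos >> 3] |= 1 << (pos & 7)
    return bits

def lookup(bits, key: bytes) -> bool:
    """Read-only test: True means 'possibly present', False means 'definitely absent'."""
    return all(bits[pos >> 3] & (1 << (pos & 7)) for pos in bit_positions(key))

def lookup_batch(bits, keys):
    # No lookup depends on any other, so this loop could be split across
    # threads, SIMD lanes, or GPU invocations without any coordination.
    return [lookup(bits, k) for k in keys]
```

In a shader the bit array would presumably live in a texture or buffer and the hash would be something much cheaper than SHA-256, but the per-key independence is the same.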