I tried to find some use cases, this paper has listed some, although I think it's not obvious to me what makes the trees uniquely useful compared to other schemes (https://users.dcc.uchile.cl/~gnavarro/ps/cpm12.pdf seems to be the same Navarro as referenced in the article).
The use cases listed in that pdf are revolving around compression, e.g. graph adjacency list is listed as one. I myself found the last use case listed as smelling interesting (colored range queries), but I would need to dig into the references on that one to see what's actually going on with that one and is it truly anything interesting.
I would be interested in things like what's the unique advantage wavelets trees have compared to e.g. stuffing roaring bitmaps or other kinds of bitmaps into a tree. The RRR has rank-and-select queries which I think roaring bitmap won't do, so that might tie into something. Maybe a problem where the wavelet tree is the only known efficient way to solve it, or maybe it is uniquely really easy to throw at some types of problems or something else.
Anyone know real-world examples of wavelet trees used somewhere? I got interested enough to dig a bit deeper but on the spot as I'm writing this comment, I'm not smart enough to immediately see do these things have killer applications in any niches.
Succinct data structures such as wavelet trees are widely used in bioinformatics. There you often have strings that cannot be tokenized or parsed meaningfully, so you just have to deal with sequences of symbols. And because the strings are often very long and/or there can be a huge number of them, the data structures have to be space-efficient.
A wavelet tree is best seen as an intermediate data structure. It doesn't do anything particularly interesting on its own, but it can be used as a building block for higher-level data structures. For example, you can create an FM-index by storing the Burrows–Wheeler transform in a wavelet tree. (Though there are better options when the alphabet is small.) And then you can use the FM-index to find exact matches of any length between the pattern and the indexed strings.
People working with succinct data structures often talk about bitvectors rather than bitmaps. The difference is that bitmaps tend to focus on set operations, while bitvectors are more about random access with rank, select, and related queries. Then you could see wavelet trees as a generalization of bitvectors from a binary alphabet to larger alphabets. And then you have people talking about wavelet trees, when they really mean a wider class of conceptually and functionally similar data structures, regardless of the actual implementation.