Thanks Jeff -- can you point me to something written up about rANS? All I find on line is turbulence modeling solutions; I presume this is not what you're referring to.
As we know, quantizations are a critical tool for local LLM runners; RAM is typically the gating factor. Are you aware of other better lossless compression of BF16 weights out there?
The reason I ask is this Dfloat11 seems relatively easy to plug in to existing quantization workflows, but you seem dismissive of the paper -- I presume it's my gap in understanding, and I'd like to understand.
I don't know of any great write-ups unfortunately, but the rANS you're looking for is range asymmetric numeral systems.