Yeah, I considered a lot of that, and I haven't completely ruled it out, but I'm setting a few artificial deadlines for this just to make sure I actually make some progress. I am nearly 100% sure that if I hacked at this long enough (e.g. more inlining, getting rid of a few extra loops, maybe abusing concurrency a bit more), it could go considerably faster, and maybe it would be fast enough to do what I need.
Or I just do it in Rust, where even a shitty version I hack up in a day or two will almost certainly be more than fast enough.
This isn't to shit on Erlang at all. Erlang is great, and it's pretty fast for most stuff I want to do (generally more network-heavy stuff), and I'm not really going to complain about it not being able to do something it wasn't designed to do.
Though upon typing this, I am wondering if I could get NIFs working with GraalVM so I could do it in Clojure....
Unless you are encoding very, very small snippets of audio, you are almost certainly better off writing your encoding in a best-of-breed language and shelling out to that program. NIFs aren't for "things that Erlang is slow at"; they're for "things that Erlang is slow at that you need in the same process". You don't need your audio encoding to be in the same process, and you really don't want to try to write audio encoding within the restrictions of NIFs.
I had some trouble getting the NIF stuff working, so I ended up using ports with Rust. The overhead for something like that is basically nothing, and you're right, I do not need this to be in-process.
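For anyone curious what the Rust side of a port can look like, here is a rough sketch, not my actual code. It assumes the Erlang side opens the port with something like `open_port({spawn_executable, Path}, [{packet, 4}, binary])`, so every message in both directions is prefixed with a 4-byte big-endian length. The `process` function is a hypothetical placeholder for the real work (the encoding, in my case); here it just echoes the input back.

```rust
use std::io::{self, Read, Write};

/// Read one frame: a 4-byte big-endian length followed by that many bytes.
/// Returns Ok(None) when the input ends while reading the header,
/// which is how the program notices that the Erlang side closed the port.
fn read_frame<R: Read>(input: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut header = [0u8; 4];
    match input.read_exact(&mut header) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let len = u32::from_be_bytes(header) as usize;
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(Some(payload))
}

/// Write one frame using the same 4-byte big-endian length prefix.
fn write_frame<W: Write>(output: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    output.write_all(&(payload.len() as u32).to_be_bytes())?;
    output.write_all(payload)?;
    output.flush()
}

/// Hypothetical stand-in for the real work; it only copies the input.
fn process(chunk: &[u8]) -> Vec<u8> {
    chunk.to_vec()
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    let mut input = stdin.lock();
    let mut output = stdout.lock();
    // One request frame in, one reply frame out, until stdin closes.
    while let Some(chunk) = read_frame(&mut input)? {
        write_frame(&mut output, &process(&chunk))?;
    }
    Ok(())
}
```

The nice part of doing it this way is that if the Rust program crashes, the Erlang VM just sees the port close instead of going down with it, which is not something you get with a NIF.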