People seem to ignore the cost and accuracy aspects of a phone listening to you 24/7. At least with today’s constraints, it is highly unlikely to be happening.
First, transcribing audio is not free; it is computationally expensive. An ad network or any other at-scale service could not afford it, especially in orgs that care about unit economics.
Secondly, the accuracy would be horrible. Most of the time, your phone is in your pocket and would pick up almost nothing. Moreover, it’s not like you are talking about anything of value to advertisers in most cases. Google is a money printing machine because people search with an intent to buy. The signal-to-noise ratio (SNR) of normal conversation is much much much lower. That makes the unit economics of doing this much worse.
Third, it would be pretty hard not to notice this was happening. Your phone would get hot, your battery would deplete very quickly, and you’d be using a lot of data. Moreover, on iOS you could see that the mic is being used, and the OS would likely kill the app if it was using too many resources in the background.
So until we find an example of this actually happening, it’s not worth worrying about.
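To make the unit-economics point concrete, here is a rough back-of-envelope sketch in Python. The function name and every input are hypothetical placeholders that you supply yourself; none of them are measured figures. It only shows how a low share of ad-relevant speech inflates the effective cost per useful hour of audio.

```python
def transcription_cost(
    hours_per_day: float,
    cost_per_audio_hour: float,
    useful_fraction: float,
) -> dict:
    """Rough per-user cost of always-on transcription.

    All inputs are assumptions supplied by the caller:
      hours_per_day        hours of audio captured per user per day
      cost_per_audio_hour  cost to transcribe one hour of audio
      useful_fraction      share of audio containing ad-relevant speech
    """
    if not 0 < useful_fraction <= 1:
        raise ValueError("useful_fraction must be in (0, 1]")
    daily_cost = hours_per_day * cost_per_audio_hour
    return {
        "daily_cost": daily_cost,
        "yearly_cost": daily_cost * 365,
        # You pay for every hour, but only a fraction carries signal.
        "cost_per_useful_hour": cost_per_audio_hour / useful_fraction,
    }


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the function.
    print(transcription_cost(hours_per_day=2, cost_per_audio_hour=0.5,
                             useful_fraction=0.25))
```

Whatever numbers you plug in, the yearly cost has to come in under the extra ad revenue per user for this to make sense, and a small `useful_fraction` multiplies the effective cost accordingly.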
For all of these reasons, audio snooping is much more likely to be something done by wired, stationary devices that maybe have a decent amount of RAM + a fair bit of usually-idle processing capacity (to run the transcription model locally and just push the resulting text), and which are expected to draw a decent amount of power and use the Internet at vaguely-arbitrary times.
Like a smart TV, for example.
First thing I do is disable that feature on every TV I buy.
Second thing I do is block the TV’s access to the internet after one firmware update.
It doesn’t need to listen all the time… just grab a few words after you put it down or hit the lock button. Or listen while you are actively using it.
Building a word cloud would be trivial, with minimal battery impact.
These are all points that were brought up in the article as to why voice recording is less useful than all of the other tracking mechanisms advertisers have available.
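For what it's worth, the word-cloud step on its own really is small. A minimal sketch in Python, assuming a text transcript of the snippet already exists; the `STOPWORDS` list and the function name are hypothetical. It covers only the counting step and says nothing about the cost of the speech-to-text that would have to come first.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "i", "we",
             "is", "it", "for", "need"}


def word_counts(transcript: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Return the top_n most frequent non-stopword terms in a transcript."""
    # Lowercase, then keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


if __name__ == "__main__":
    print(word_counts("We need a new stroller and a car seat for the stroller"))
```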
While I think that audio recording is not a thing, your economic argument is not complete.
What if only the audio of "high value" targets is recorded, meaning people who buy a lot of stuff? It might be worthwhile to record only them, which would explain why random testing (usually with new/clean phones) never succeeds in detecting a recording event.
I think this is a genuine concern for prominent people. If you are Mark Zuckerberg, a bad actor has a material interest in installing malware on your laptop. But for a random person, where you get low value data that may or may not let you better target some low value ads? That is much harder to justify. We would have to reevaluate as things change and the cost of compute goes down.