This explains nothing, since it applies to any set of keybinds. If you had to type "word back" in normal mode to move by word, or "PRETTY PLEASE LET ME OUT" to exit, you could still make the same irrelevant point: "but it's a power tool smashing convention!"
Yep, but Vim’s keybinds are composable and carry less cognitive load than the conventional way. That’s the selling point. Not knowing how to use it isn’t a good argument for not using it. Riding a motorcycle isn’t natural, but the speed improvement is real.
Cognitive load is somewhat subjective. The composability advantage is what's hard to wrap your head around if you've never developed the skill. Vim / Helix / Kakoune are fundamentally more powerful keybinding systems due to their composability ("grammar"). Learning Vim early on in my career is easily one of the greatest skill investments I've ever made. Every minute saved doing small edits adds up over years, and I've pulled off many large-scale refactors in literally 1% of the time it takes other people with non-composable keybindings.
I imagine a lot of people will increasingly lean on AI to handle most editing tasks, but until people literally stop using keyboards, Vim will still be worth learning.
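To make the "grammar" point above a bit more concrete, here's a rough Python sketch. It's only an illustration of the idea, not how Vim is implemented: the keys are real Vim normal-mode operators and motions, but the names `OPERATORS`, `MOTIONS`, and `compose` are made up for this example, and the descriptions are informal paraphrases.

```python
from itertools import product

# Real Vim normal-mode keys; the English glosses are informal paraphrases.
OPERATORS = {"d": "delete", "c": "change", "y": "yank"}
MOTIONS = {
    "w": "to next word start",
    "b": "to previous word start",
    "$": "to end of line",
    "}": "to end of paragraph",
}


def compose(operators, motions):
    """Pair every operator with every motion (hypothetical helper)."""
    return {
        op + mv: f"{verb} {target}"
        for (op, verb), (mv, target) in product(operators.items(), motions.items())
    }


commands = compose(OPERATORS, MOTIONS)
keys_learned = len(OPERATORS) + len(MOTIONS)

# Learning cost grows additively, expressive power multiplicatively.
print(f"{keys_learned} keys learned -> {len(commands)} commands")
for keys, meaning in commands.items():
    print(f"{keys:>3}  {meaning}")
```

Seven keys give twelve commands here, and every new motion you learn works with every operator you already know. That's the part a flat list of shortcuts can't give you.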