You could probably make an equally powerful modeless editor, but I don't see how it could be anywhere near vim-like. Maybe if you had a foot pedal to indicate which mode to interpret input sequences in.
Use of this silly pedal will keep the user from learning the various insert commands like a, A, o and O, and the ones that combine motions with insert like C, cw, and so on.
A switch built into the chair seat which generates ESC when the user gets up could be useful, though.
But isn't it interesting that people would find that easy to use, despite being modal? This suggests that the conclusion from the article - to avoid modes - misses the point.
IMO not any more than hitting Ctrl-F in Notepad to bring up the search dialog is a mode trigger. Meaning, I guess technically it is one, but I don't parse it as such.
Note that on Mac and Windows, at least, you can trivially do this remapping. I do it on all my computers.
On a Mac, the Keyboard preference pane in System Preferences has a "Modifier Keys" section, which, bizarrely, is configured separately for the built-in keyboard and a USB keyboard on their laptops. On Linux, the configuration is different for the virtual terminal (VT) vs. window managers.
It's configurable per-keyboard because not all keyboards have modifiers laid out the same way. Mac keyboards have the bottom row ordered Ctrl, Alt, Cmd; Windows keyboards are ordered as Ctrl, Win, Alt. So, if you're using a Windows USB keyboard on an Apple laptop, you'll often want to swap Cmd (= Win) and Alt on the USB keyboard, but leave them alone on the internal one.
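To make the per-keyboard point concrete, here is a rough Python sketch of the idea. It is only a toy model, not how any OS actually implements this: the names `DEFAULT_ROLE`, `PER_KEYBOARD_SWAPS`, `effective_row`, and the keyboard labels `internal` and `usb_windows` are all made up for illustration.

```python
# Toy model of per-keyboard modifier remapping (all names are hypothetical).

# Role each physical key gets by default; a Win key acts as Cmd.
DEFAULT_ROLE = {"Ctrl": "Ctrl", "Alt": "Alt", "Win": "Cmd", "Cmd": "Cmd"}

# Swaps configured separately for each keyboard.
PER_KEYBOARD_SWAPS = {
    "internal": {},                               # Apple layout, leave alone
    "usb_windows": {"Cmd": "Alt", "Alt": "Cmd"},  # swap Cmd (= Win) and Alt
}


def effective_row(keyboard, physical_row):
    """Return the modifier roles, left to right, after that keyboard's swaps."""
    swaps = PER_KEYBOARD_SWAPS.get(keyboard, {})
    roles = []
    for key in physical_row:
        role = DEFAULT_ROLE[key]
        roles.append(swaps.get(role, role))
    return roles


print(effective_row("internal", ["Ctrl", "Alt", "Cmd"]))
print(effective_row("usb_windows", ["Ctrl", "Win", "Alt"]))
```

With the swap applied only to the USB keyboard, both bottom rows end up behaving as Ctrl, Alt, Cmd from left to right, which is why a single global setting wouldn't do.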