I've built an app that has the same goal (not operating a mouse) but approaches it completely differently.
Rather than trying to simulate moving the mouse itself, Shortcat [https://shortcat.app/] indexes the user interface (buttons, text fields, links, menus, etc.) and enables fast fuzzy search of the interface. Type a word, abbreviation, or hint and hit Enter to click or action the element. Works almost everywhere on macOS, including browsers, Electron apps, and even iOS apps!
The goal is to minimise the cognitive overhead of achieving a particular intent, so being able to type a word to hit a button, or activate a deep menu item when you don't know the shortcut, is quick and easy.
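To make the "index, then fuzzy search" idea concrete, here is a minimal sketch in Python. It is not Shortcat's actual implementation: the `Element` type, the scoring weights, and the greedy subsequence matching are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """A hypothetical indexed UI element (role and visible label only)."""
    role: str
    label: str


def fuzzy_score(query: str, label: str) -> Optional[int]:
    """Score `label` against `query`, or return None if it does not match.

    The query must appear in the label as a subsequence (greedy, left to
    right). Consecutive characters and word starts earn a bonus; the
    weights are arbitrary.
    """
    q, text = query.lower(), label.lower()
    score, pos, prev = 0, 0, -2
    for ch in q:
        idx = text.find(ch, pos)
        if idx == -1:
            return None
        score += 1
        if idx == prev + 1:
            score += 2  # directly follows the previous matched character
        if idx == 0 or not text[idx - 1].isalnum():
            score += 3  # matched at the start of a word
        prev, pos = idx, idx + 1
    return score


def search(query: str, elements: List[Element]) -> List[Element]:
    """Return matching elements, best score first, shorter labels on ties."""
    scored = [(fuzzy_score(query, e.label), e) for e in elements]
    hits = [(s, e) for s, e in scored if s is not None]
    hits.sort(key=lambda pair: (-pair[0], len(pair[1].label)))
    return [e for _, e in hits]
```

With an index containing "Save", "Save As...", "Cancel" and "Settings", calling `search("sa", index)` keeps only the two Save entries and puts the shorter label first. Pressing Enter would then act on the top hit.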
I'm currently working on a modal option which enables staying within Shortcat to navigate an interface, as well as chords for simulating scrolling and arrow keys.
Shortcat relies on the Accessibility API to index UI elements, however, so it depends on how well an app or website has implemented it. One of the goals is to help improve accessibility implementations by exposing more people to them and pushing for developers to fix broken or incorrectly implemented accessibility tagging.
Shortcat is macOS only for now as I haven't been able to investigate how viable doing this on Windows or Linux would be, especially on Linux considering all the different toolkits that exist.
I love this new wave of tools coming out for mouseless computer use. Chronic mouse use has destroyed my wrist so I have to avoid using it as much as possible.
I love Shortcat's approach in general, indexing the UI. However, the reliance on the Accessibility API is actually a significant downside in the real world in my experience since so many apps don't properly implement it. I feel like Warpd is a good complement to this, you could use Hint or Grid mode as a fallback when the indexing approach fails.
I wish I could use Shortcat or Warpd, but unfortunately I'm on Windows. Curious if anyone has any good tool recommendations for Windows? Currently, I'm using:
1. Vimium for Chrome (so good, wish I could just use it across the OS).
2. Hunt and Peck: https://github.com/zsims/hunt-and-peck has been my favorite for OS-level use, a simple version of Shortcat for Windows. But it's not maintained and not as slick as some of these newer tools.
I have used a trackball for many years since my wrist started bothering me, and I love it. I am right-handed, and I use Logitech's 575 and MX Ergo. I prefer the Ergo, even though it is more expensive. I keep it beside me on the couch where I sit. That way my elbow makes a 90 degree angle. Very comfortable. My keyboard is on my lap and my monitor at eye level.
Nice, I've actually tried a trackball myself, but with the way I use my desk (sit/stand) it caused more problems than it solved (shoulder issues). Ergonomics is an art I suppose.
Oooooh, Hunt and Peck indicates that it's possible to make a Shortcat for Windows!
I would probably need to pay someone to build that particular version though cause the last time I built anything for Windows was like 15+ years ago
If you're already using Vimium, I suggest trying qutebrowser, which takes keyboard accessibility to a whole new level, by making it a first-class feature for the entire browser.
It does basically cut out the mouse, and had a several-days learning curve for me, but after that it's pretty great. Here are some cool features, off the top of my head:
* Python-scriptable, though I haven't figured out how to use this yet.
* Bind javascript bookmarklets to a keyboard shortcut (use :bind with the jseval command)
* Toggle not only javascript, but image loading and a whole slew of other features, with a keyboard macro.
* Vertical tabs.
* All config is adjustable via commands.
* Keyboard macros like "pop tab into a new window", "clone tab", "close all other tabs", etc.
* Text selection using the keyboard.
* Quite similar keyboard dynamics to vim.
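As a rough sketch of the bookmarklet bullet above, the snippet below could be added to an existing qutebrowser `config.py`. `config.bind` and the `jseval` command are qutebrowser's own; the `jseval_command` helper, the sample bookmarklet, and the `,t` key are hypothetical choices for illustration.

```python
def jseval_command(bookmarklet: str) -> str:
    """Turn a `javascript:` bookmarklet into a qutebrowser jseval command.

    Hypothetical helper: it only strips the URL scheme and does not
    URL-decode the bookmarklet.
    """
    prefix = "javascript:"
    js = bookmarklet[len(prefix):] if bookmarklet.startswith(prefix) else bookmarklet
    return "jseval --quiet " + js.strip()


# Example bookmarklet, made up for this sketch.
SCROLL_TOP = "javascript:window.scrollTo(0, 0)"

# qutebrowser injects `config` when it loads config.py; the guard lets the
# file also be run or imported outside the browser.
if "config" in globals():
    config.bind(",t", jseval_command(SCROLL_TOP))
```

The same binding can be made interactively with `:bind ,t jseval --quiet window.scrollTo(0, 0)`.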
It has a built-in ad blocker, and you should run :adblock-update when you first use it.
Another browser which is similar, but which I haven't gotten into as much, is Luakit.
A bit late to the party, but I’ve just released v1.0 of TPMouse, for Windows: https://github.com/EsportToys/TPMouse
Please post this on Show HN. This looks cool!
I never comment on HN however I just want to say I've downloaded your app and it's very impressive - I'm going to try and incorporate this into my workflow the best I can. Thanks!
This app is quality. I can tell you have been working on it for years. Why not charge for it?
Haha, thanks :)
I did charge for it a couple of years ago, however I rebuilt the whole thing from scratch after a long hiatus and hadn't bothered to reimplement licensing because the existing options all kinda suck, and figured I'd focus the time on features and usability first. I think the modal mode in the next release will bring it much closer to a 1.0 release.
If you bundle it and release a paid for application on the App Store, I would totally buy it and even roll it out to my staff. The magic of the App Store allows you to do company wide roll-outs quite easily.
I'm not sure if an app like Shortcat can be released on the App Store given it uses the Accessibility APIs (sandboxing etc), also the 15-30% cut they take is a bit ooooof, but I do have plans to support company/teams licensing!
+1 this is awesome! I'd like to donate if I can :)
edit: nvm me, found the option in settings (on activation show shortcuts immediately).
Quick question, I've been playing around with Shortcat for a while. When I press the activation hot-key it takes about 4 seconds for the yellow two-letter highlights to show up, despite the app's text stating "found n elements in ~0.20s". Is there a config option to instantly show the yellow highlights?
Thanks! I don't have a way to take tips yet, but you can support by pushing for developers to improve their accessibility implementations when you run into issues!
I see you found the setting for that. It was a deliberate default initially as the intended way to use Shortcat is to activate Shortcat and type what you want without waiting to see hints, as this is generally faster and less mental overhead IMO, especially for fast typists and well-structured interfaces.
However, some people prefer minimal keystrokes and I get that. I'm trying to figure out the right set of defaults to make it friendly to new users while nudging people to how Shortcat is designed to be used and will be tweaking it as I go.
Oh my! I came to the comment section to ask about a Mac app that I've seen a long time ago that did this. Lo and behold, you, the author, have written the first comment. :-)
Thank you for Shortcat, I used it a long time ago and loved it. Excited to give it another go!
No worries! Glad you love Shortcat :D
Shortcat is utterly amazing. I really hope I can work this into my entire MacOS usage. You should be really proud of what you've made because this is fantastic!!
Any plans to add scrolling functionality to Shortcat? I'd be able to move over completely from vimac if that gets added.
Working on that right now :)
Excellent! I'd love to have that on Linux.
Very interesting thing. I wonder if the gap of apps not supporting a11y could be reduced by using Tesseract to OCR the text.
That's what https://superkey.app/ does.
Excited to try this out! Is it planned to open source? I would love to try integrating this into Raycast
Is Raycast open source? Could only find that the plugins are on GitHub.
It won't be open source, but I will be adding an API so it can be integrated with other apps and scripted
I have been meaning to build something like this for myself, albeit for Linux. Does anybody know if there are any existing efforts there?
Given that Linux doesn't have anything like an accessibility API, I think the only option is training ML models.
This is what CMD-Shift-? should be.
I think this paradigm along with more app developers putting all the important functions in menus is a strong contender for Maximum Intuitive Productivity
Have you considered using ML/OCR to figure out the position of the text relative to the screen? Seems much simpler than relying on accessibility APIs
Thank you for your hard work!
I have plans to use ML/OCR to augment results down the road but the AX APIs and ecosystem on most apps (that I encounter, at least) are generally decent. Also, OCR means it won’t understand buttons with just icons, whereas AX APIs can grab em just fine.
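For the OCR idea discussed here, a minimal Python sketch of the "find where the text is on screen" step might look like the following. It assumes the word boxes come from Tesseract via `pytesseract.image_to_data`; `locate_text` and the confidence threshold are hypothetical, and this is not how Shortcat or Superkey are known to work.

```python
from typing import Dict, List, Tuple


def locate_text(ocr: Dict[str, list], query: str,
                min_conf: float = 60.0) -> List[Tuple[int, int]]:
    """Return (x, y) centres of OCR'd words that contain `query`.

    `ocr` is assumed to use the layout produced by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT):
    parallel lists under "text", "conf", "left", "top", "width", "height".
    Coordinates are in screenshot pixels.
    """
    needle = query.lower()
    centres = []
    for i, word in enumerate(ocr["text"]):
        # Skip layout rows (conf of -1) and low-confidence guesses.
        if float(ocr["conf"][i]) < min_conf:
            continue
        if needle in word.lower():
            x = ocr["left"][i] + ocr["width"][i] // 2
            y = ocr["top"][i] + ocr["height"][i] // 2
            centres.append((x, y))
    return centres


# With pytesseract and Pillow installed, the input could be obtained with:
#   import pytesseract
#   from PIL import Image
#   ocr = pytesseract.image_to_data(Image.open("screenshot.png"),
#                                   output_type=pytesseract.Output.DICT)
```

This also shows the limitation mentioned above: an icon-only button produces no word box, so there is nothing for the query to match.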
Thanks! It’s easily my longest running project at a decade
Should be doable on Linux for most mainstream apps due to the toolkits having a11y support, but obviously not all apps use mainstream toolkits.
This is excellent. I will be trying this out.
I've been playing around with Shortcat recently - really cool app! Keep up the good work.
Thanks :D
Looks very cool, have you considered adding voice input or is it already possible?
There’s already a very sophisticated system on MacOS for voice input so I feel like it’d probably be superfluous.
Looks really cool. Is it able to select text for copying?
I'm working on a version that allows sending arrow keys with modifiers to the targeted application, so soon!
Looks awesome! Wish I had this for windows!
https://github.com/GavinPen/AhkCoordGrid
I just tried that, this is excellent.