_chendo_ 3 years ago

I've built an app that has the same goal (not operating a mouse) but approaches it completely differently.

Rather than trying to simulate moving the mouse itself, Shortcat [https://shortcat.app/] indexes the user interface (buttons, text fields, links, menus, etc.) and enables fast fuzzy search of the interface. Type a word, an abbreviation, or a hint and hit Enter to click or action the element. Works almost everywhere on macOS, including browsers, Electron apps, and even iOS apps!
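To make the "index, then fuzzy search" idea concrete, here is a minimal sketch in Python. It is not Shortcat's actual algorithm or data model: the `Element` class, the scoring weights, and the greedy subsequence matcher are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """A hypothetical indexed UI element (not Shortcat's real data model)."""
    role: str   # e.g. "button", "menu item"
    label: str  # visible or accessibility label


def fuzzy_score(query: str, label: str) -> Optional[int]:
    """Score a case-insensitive subsequence match; None means no match.

    Higher is better. A matched character scores extra when it starts a
    word or directly follows the previous match. Matching is greedy
    (leftmost), which is a simplification of real fuzzy finders.
    """
    q, text = query.lower(), label.lower()
    score, pos, prev = 0, 0, -2
    for ch in q:
        idx = text.find(ch, pos)
        if idx == -1:
            return None  # query is not a subsequence of the label
        score += 1
        if idx == 0 or text[idx - 1] in " -_/":
            score += 3  # word-start bonus (illustrative weight)
        if idx == prev + 1:
            score += 2  # consecutive-character bonus (illustrative weight)
        prev, pos = idx, idx + 1
    return score


def search(query: str, elements: List[Element]) -> List[Element]:
    """Return matching elements, best score first, shorter labels on ties."""
    scored = [(fuzzy_score(query, e.label), e) for e in elements]
    hits = [(s, e) for s, e in scored if s is not None]
    hits.sort(key=lambda pair: (-pair[0], len(pair[1].label)))
    return [e for _, e in hits]
```

With an index containing "Save As", "Open Recent", and "Export as PDF", typing `or` ranks "Open Recent" first because both letters land on word starts, while "Export as PDF" still matches but scores lower.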

The goal is to minimise the cognitive overhead of achieving a particular intent, so being able to type a word to hit a button, or activate a deep menu item when you don't know the shortcut, is quick and easy.

I'm currently working on a modal option which enables staying within Shortcat to navigate an interface, as well as chords for simulating scrolling and arrow keys.

However, Shortcat relies on the Accessibility API to index UI elements, so it depends on how well an app or website has implemented it. One of the goals is to help improve accessibility implementations by exposing more people to them and pushing developers to fix broken or incorrectly implemented accessibility tagging.

Shortcat is macOS only for now as I haven't been able to investigate how viable doing this on Windows or Linux would be, especially on Linux considering all the different toolkits that exist.

Daynil 3 years ago

I love this new wave of tools coming out for mouseless computer use. Chronic mouse use has destroyed my wrist so I have to avoid using it as much as possible.

I love Shortcat's approach in general, indexing the UI. However, the reliance on the Accessibility API is actually a significant downside in the real world in my experience since so many apps don't properly implement it. I feel like Warpd is a good complement to this, you could use Hint or Grid mode as a fallback when the indexing approach fails.

I wish I could use shortcat or Warpd, but unfortunately I'm on windows. Curious if anyone has any good tool recommendations for windows? Currently, I'm using:

1. Vimium for Chrome (so good, wish I could just use it across the OS).

2. Hunt and Peck: https://github.com/zsims/hunt-and-peck has been my favorite for OS-level use, a simple version of shortcat for windows. But, it's not maintained and not as slick as some of these newer tools.

  • mjcohen 3 years ago

    I have used a trackball for many years since my wrist started bothering me, and I love it. I am right-handed, and I use Logitech's 575 and MX Ergo. I prefer the Ergo, even though it is more expensive. I keep it beside me on the couch where I sit. That way my elbow makes a 90 degree angle. Very comfortable. My keyboard is on my lap and my monitor at eye level.

    • Daynil 3 years ago

      Nice, I've actually tried a trackball myself, but with the way I use my desk (sit/stand) it caused more problems than it solved (shoulder issues). Ergonomics is an art I suppose.

  • _chendo_ 3 years ago

    Oooooh, Hunt and Peck indicates that it's possible to make a Shortcat for Windows!

    I would probably need to pay someone to build that particular version though cause the last time I built anything for Windows was like 15+ years ago

  • forgotmypw17 3 years ago

    If you're already using Vimium, I suggest trying qutebrowser, which takes keyboard accessibility to a whole new level, by making it a first-class feature for the entire browser.

    It does basically cut out the mouse, and had a several-days learning curve for me, but after that it's pretty great. Here are some cool features, off the top of my head:

    * Python-scriptable, though I haven't figured out how to use this yet.

    * Bind javascript bookmarklets to a keyboard shortcut (use :bind with the jseval command)

    * Toggle not only javascript, but image loading and a whole slew of other features, with a keyboard macro.

    * Vertical tabs.

    * All config is adjustable via commands.

    * Keyboard macros like "pop tab into a new window", "clone tab", "close all other tabs", etc.

    * Text selection using the keyboard.

    * Quite similar keyboard dynamics to vim.

    It has a built-in ad blocker, and you should run :adblock-update when you first use it.

    Another browser which is similar, but which I haven't gotten into as much, is Luakit.

m1r5h1 3 years ago

I never comment on HN however I just want to say I've downloaded your app and it's very impressive - I'm going to try and incorporate this into my workflow the best I can. Thanks!

Luc 3 years ago

This app is quality. I can tell you have been working on it for years. Why not charge for it?

  • _chendo_ 3 years ago

    Haha, thanks :)

    I did charge for it a couple of years ago, however I rebuilt the whole thing from scratch after a long hiatus and hadn't bothered to reimplement licensing because the existing options all kinda suck, and figured I'd focus the time on features and usability first. I think the modal mode in the next release will bring it much closer to a 1.0 release.

    • justusw 3 years ago

      If you bundle it and release a paid for application on the App Store, I would totally buy it and even roll it out to my staff. The magic of the App Store allows you to do company wide roll-outs quite easily.

      • _chendo_ 3 years ago

        I'm not sure if an app like Shortcat can be released on the App Store given it uses the Accessibility APIs (sandboxing etc), also the 15-30% cut they take is a bit ooooof, but I do have plans to support company/teams licensing!

    • tfsh 3 years ago

      +1 this is awesome! I'd like to donate if I can :)

      edit: nvm me, found the option in settings (on activation show shortcuts immediately).

      Quick question, I've been playing around with Shortcat for a while. When I press the activation hot-key it takes about 4 seconds for the yellow two-letter highlights to show up, despite the app's text stating "found n elements in ~0.20s". Is there a config option to instantly show the yellow highlights?

      • _chendo_ 3 years ago

        Thanks! I don't have a way to take tips yet, but you can support by pushing for developers to improve their accessibility implementations when you run into issues!

        I see you found the setting for that. It was a deliberate default initially as the intended way to use Shortcat is to activate Shortcat and type what you want without waiting to see hints, as this is generally faster and less mental overhead IMO, especially for fast typists and well-structured interfaces.

        However, some people prefer minimal keystrokes and I get that. I'm trying to figure out the right set of defaults to make it friendly to new users while nudging people to how Shortcat is designed to be used and will be tweaking it as I go.

fcoury 3 years ago

Oh my! I came to the comment section to ask about a Mac app that I've seen a long time ago that did this. Lo and behold, you, the author, have written the first comment. :-)

Thank you for Shortcat, I used it a long time ago and loved it. Excited to give it another go!

  • _chendo_ 3 years ago

    No worries! Glad you love Shortcat :D

bloopernova 3 years ago

Shortcat is utterly amazing. I really hope I can work this into my entire MacOS usage. You should be really proud of what you've made because this is fantastic!!

abalaji 3 years ago

Any plans to add scrolling functionality to shortcat? I'd be able to move over completely from vimac if that gets added.

  • _chendo_ 3 years ago

    Working on that right now :)

zvmaz 3 years ago

Excellent! I'd love to have that on Linux.

hawski 3 years ago

Very interesting thing. I wonder if the gap of apps not supporting a11y could be reduced by using Tesseract to OCR the text.

discodachshund 3 years ago

Excited to try this out! Is it planned to open source? I would love to try integrating this into Raycast

  • punnerud 3 years ago

    Is Raycast open source? Could only find that the plugins are on GitHub.

  • _chendo_ 3 years ago

    It won't be open source, but I will be adding an API so it can be integrated with other apps and scripted

yewenjie 3 years ago

I have been meaning to build something like this for myself, albeit for Linux. Does anybody know if there are any existing efforts there?

Given that Linux doesn't have anything like an accessibility API, I think the only option is training ML models.

joshspankit 3 years ago

This is what CMD-Shift-? should be.

I think this paradigm along with more app developers putting all the important functions in menus is a strong contender for Maximum Intuitive Productivity

kache_ 3 years ago

Have you considered using ML/OCR to figure out the position of the text relative to the screen? Seems much simpler than relying on accessibility APIs

Thank you for your hard work!

  • _chendo_ 3 years ago

    I have plans to use ML/OCR to augment results down the road but the AX APIs and ecosystem on most apps (that I encounter, at least) are generally decent. Also, OCR means it won’t understand buttons with just icons, whereas AX APIs can grab em just fine.

    Thanks! It’s easily my longest running project at a decade

pabs3 3 years ago

Should be doable on Linux for most mainstream apps due to the toolkits having a11y support, but obviously not all apps use mainstream toolkits.

LASR 3 years ago

This is excellent. I will be trying this out.

smcleod 3 years ago

I've been playing around with ShortCat recently - really cool app! Keep up the good work.

Tepix 3 years ago

Looks very cool, have you considered adding voice input or is it already possible?

  • least 3 years ago

    There’s already a very sophisticated system on MacOS for voice input so I feel like it’d probably be superfluous.

madacol 3 years ago

Looks really cool. Is it able to select text for copying?

  • _chendo_ 3 years ago

    I'm working on a version that allows sending arrow keys with modifiers to the targeted application, so soon!

ta988 3 years ago

I just tried that, this is excellent.