Demo Video and General Introduction

Installing

  • Clone the repo into your Talon user directory.

  • Clone the Talon community repository for general Talon commands

    • This is the sole OS-agnostic dependency of this project.

Run git pull frequently to keep the scripts up to date.

OS-Specific Dependencies

  • Linux

    • Install spd-say to play standard TTS.
    • Install piper to use the ONNX model for more natural speech.
      • Run pipx install piper to install it (this makes pipx a dependency).
  • Windows

    • Using NVDA:
      • Install the NVDA addon file from the repo releases page
      • If you do not want to install the addon, disable Speech interrupt for typed characters in NVDA settings to make sure typing text from Talon is not interrupted with every typed character.
  • Mac

    • No extra dependencies

Installing Talon

You should have already installed Talon and the community repository. If not, see the Talon docs for instructions. In order to best use this repository you should be familiar with the basics of Talon and basic commands from the community repository.

This document is not a replacement for the wiki; it is intended as a quick overview of the most important commands and the most relevant behavior.

Talon Brief Overview

Talon is a voice control engine. It has no behavior on its own: you need to install scripts, and the standard set comes from the community repository. Each time you say a command, the voice model tries to match it to the closest command that is both defined and contextually in scope. For instance, if there are commands specific to Gmail but you are not in Gmail, those commands are out of scope and your phrase will likely be misrecognized as something else. For the same reason, it is very important to be in the proper mode. By default, there are two main modes: dictation mode and command mode. The former is for dictating raw text and the latter is for calling specific commands. If you say a command that is defined in a different mode from the one you are in, Talon will likely still try to interpret the phrase, but it will match it against whatever is available in the current mode.
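As a rough illustration of contextual scope, here is a minimal sketch of a .talon file whose commands are only available in command mode while the focused window title contains "Gmail". The file name, the spoken command names, and the key bindings are hypothetical and are not taken from this repository or the community repository.

```talon
# Hypothetical file: gmail.talon
# Every condition above the "-" line must match for the commands below to be in scope.
mode: command
title: /Gmail/
-
# Illustrative commands only; they press single keys in the focused window.
archive message: key(e)
compose message: key(c)
```

With a file like this, saying "archive message" in any other window, or while in dictation mode, would not match these commands, so Talon would try to match the phrase against whatever else is in scope.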

Debugging Talon Issues

If you are not getting the proper behavior within Talon, it is most often due to a poor microphone signal or an error in your scripts. You do not need a fancy microphone to get good performance with Talon; however, too much background noise, static, or fan noise is likely to cause issues. If you are getting poor recognition, enable the "save recordings" option in the Talon tray icon menu. This lets you listen to what Talon heard for a given phrase.

Helpful Standard Talon Commands

| Command | Description |
| --- | --- |
| command mode | Switches Talon into command mode, where your words are interpreted as commands |
| dictation mode | Switches Talon into dictation mode, where your words are interpreted as raw text |
| launch | Launches the specified application |
| focus | Focuses the specified application |
| talon wake | Wakes Talon up if it is asleep |
| talon sleep | Puts Talon to sleep |
| press | Presses the specified key |
| air bat cap drum each fine gust harp sit jury crunch look made near odd pit quench red sun trap urge vest whale plex yank zip | The Talon phonetic alphabet |
| sentence | Dictate a sentence with the first word capitalized |
| title | Dictate a sentence with all words capitalized |
| word | Dictate a single word |
| scratch that | Undoes the last thing you said |
| wipe | Presses backspace |

Commands Specific to Sight-Free Talon

Sight-Free Talon has a series of voice commands and settings that make Talon easier to use alongside screen readers. General commands for dictating text or controlling your computer can be found in the central community repo, which you should also have installed.

Commands

TODO

Settings

All settings can be set within .talon files and contextually scoped to specific applications.

| Setting | Description | Default Value |
| --- | --- | --- |
| user.echo_dictation | Echo the subtitles from Talon back via TTS | true |
| user.tts_speed | How fast to play back text-to-speech, from -10 to 10 | 8 |
| user.tts_volume | How loud to play back text-to-speech, from 0 to 100 | 80 |
| user.echo_context | Automatically echo the context of the focused window when switching applications/tabs | false |
| user.tts_via_screenreader | If a screen reader is enabled, use it for TTS instead of the TTS engine in Talon | true |
| user.nvda_key | Key used for the NVDA modifier; change to 'insert' if that is your NVDA modifier | 'capslock' |
| user.start_screenreader_on_startup | Start your screen reader automatically when Talon starts | false |
| user.braille_output | Output dictated text to a braille display through your screen reader | false |
| user.sound_on_keypress | To prevent errors from accidental key presses, play a sound each time a key is pressed | false |
| user.disable_keypresses | Disable keypresses from Talon in high-risk contexts that cannot afford typos | false |
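To make the scoping concrete, here is a minimal sketch of a .talon file that overrides a few of these settings for a single application. The file name and the chosen values are hypothetical, and the app name `vscode` is an assumption: it only matches if your installed scripts define an application with that name.

```talon
# Hypothetical file: editor_tts.talon
# Assumption: an app named "vscode" is defined by your installed scripts.
app: vscode
-
settings():
    # Slow speech down from the default of 8 (range -10 to 10)
    user.tts_speed = 5
    # Raise the volume from the default of 80 (range 0 to 100)
    user.tts_volume = 90
    # Announce the window context when focus changes (default false)
    user.echo_context = true
```

Removing the header lines and the "-" separator would apply the same settings everywhere rather than in one application.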

Contributing

My goal is to make contributing as easy as possible. Please directly reach out to me if you are interested in contributing to this repository. I am happy to help you get started and answer any questions you may have.

I can be reached either through the Talon Slack or my website, https://colton.place.

Technical Contributions

The project repository is structured such that every screen reader or unique feature gets its own folder. Each folder contains a .talon file with the commands for that screen reader or feature. Any features related to global scope or settings are in the root settings .talon file. If you would like to add support for a new screen reader, I encourage you to follow the format of the other screen readers and implement similar function overrides. All baseline declarations that are contextually overridden are in the core folder.

Testing with Your Own Setup

I do not have the resources to test certain combinations of screen readers and operating systems. If you would like to contribute to this repository, I encourage you to test the commands on your own setup and provide feedback. If you are not familiar with GitHub, you can directly get in contact with me.

To check for errors, you can send me a copy of your Talon log.

Non-Technical Contribution

I greatly benefit from general qualitative design feedback and learning more about the particular workflows of users. My intention is for this repository to be useful for people of all abilities and technical skill levels, so I am very interested in hearing about any difficulties you may have with the repository or Talon in general.

If you are a user with a vision impairment, I am curious to hear how you have interacted with voice dictation software in the past. I could also use qualitative feedback regarding things like alternative computer feedback mechanisms, such as braille, haptic feedback, or pitch-based audio feedback. I am curious to explore different ways of providing information to the user, and am excited about exploring more experimental ideas.

Philosophy

In this repository, we want to create a solution that is low friction and feels natural and unintrusive for all users. For users with eyestrain or a vision impairment, we want our repository not to feel like a begrudgingly accepted accommodation, but an exciting improvement that unlocks new forms of human-computer interaction. This is similar to what Cursorless has accomplished with voice programming. It is not simply a way to code when you do not have access to your hands; it is an entirely new way to think about coding, one that is often more efficient to begin with. By using Talon's scripting potential and various community tools and AI integrations, we have the potential to realize this within low-vision tools as well. After all, the most accessible tasks are the ones that can be automated away and do not need to be done in the first place.

As such, the repository is designed around a series of core principles:

  • Keep as much behavior directly in Talon as possible and make as few screen-reader-specific changes as possible.
    • This makes development easier and more maintainable.
  • Make our Talon code well integrated with the rest of the Talon community.
    • This means using the same conventions and style as the rest of the community and not dynamically loading specialized libraries or doing low level hacks if it can be avoided.
  • Create a solution that can be used by people of all abilities.
    • This means making sure that the solution is usable for people who are blind, low vision, or sighted.
    • Make sure that the solution is usable for people who are new to Talon and people who are experienced.
  • On install, the solution should work out of the box with minimal configuration.
    • Settings should not change other parts of the user's Talon configuration.
  • All settings should be located in a central settings file.
  • Application-specific voice commands should be located in their own specific file and contextually scoped.
  • Feature creep is bad and hurts the long term maintainability of the project.
    • If a feature is not used by a large number of people, it should be removed.
    • Focus on a few popular screen readers and make sure they work well.

Sources & Inspirations

Design Patterns

Emacspeak

NVDA

JAWS

Libraries

VoiceOver