GitHub - synesthesiam/voice2json: Command-line tools for speech and inten...
source link: https://github.com/synesthesiam/voice2json
voice2json is a collection of command-line tools for offline speech/intent recognition on Linux. It is free, open source (MIT), and supports 17 human languages.
From the command-line:
$ voice2json transcribe-wav \
      < turn-on-the-light.wav | \
  voice2json recognize-intent | \
  jq .
produces a JSON event like:
{
    "text": "turn on the light",
    "intent": {
        "name": "LightState"
    },
    "slots": {
        "state": "on"
    }
}
when trained with this template:
[LightState]
states = (on | off)
turn (<states>){state} [the] light
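To make the template concrete, the following Python sketch enumerates the sentences it appears to accept. It assumes, based only on the example above, that (on | off) is a choice between alternatives, that [the] marks an optional word, and that {state} tags the matched alternative as the "state" slot. The function name expand_light_state is hypothetical, and the code is a hand-written model of this one template, not a parser for the voice2json template language.

```python
from itertools import product

# Hand-written model of the example template (an assumption about its
# meaning, not voice2json's own parser).
STATES = ["on", "off"]      # states = (on | off)
OPTIONAL_THE = ["the", ""]  # [the] may be present or absent


def expand_light_state():
    """Yield (sentence, slots) pairs covered by the LightState template."""
    for state, the in product(STATES, OPTIONAL_THE):
        words = ["turn", state, the, "light"]
        # Drop the empty string left behind when the optional word is absent.
        sentence = " ".join(word for word in words if word)
        yield sentence, {"state": state}


for sentence, slots in expand_light_state():
    print(sentence, slots)
```

Under these assumptions the template covers four sentences, each carrying a "state" slot of either "on" or "off".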
voice2json is optimized for:
- Sets of voice commands that are described well by a grammar
- Commands with uncommon words or pronunciations
- Commands or intents that can vary at runtime
It can be used to:
Supported speech to text systems include:
- CMU's pocketsphinx
- Dan Povey's Kaldi
- Mozilla's DeepSpeech 0.6
- Kyoto University's Julius
Unique Features
voice2json is more than just a wrapper around open source speech to text systems!
- Training produces both a speech and intent recognizer. By describing your voice commands with voice2json's templating language, you get more than just transcriptions for free.
- Re-training is fast enough to be done at runtime (usually < 5s), even up to millions of possible voice commands. This means you can change referenced slot values or add/remove intents on the fly.
- All of the available commands are designed to work well in Unix pipelines, typically consuming/emitting plaintext or newline-delimited JSON. Audio input/output is file-based, so you can receive audio from any source.
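Because events arrive as newline-delimited JSON, a downstream pipeline stage can be very small. The Python sketch below reads one JSON event per line and maps it to an action string. The event shape is taken from the LightState example earlier in this README; the names describe_event and handle_stream, and the action strings they return, are hypothetical and chosen only for illustration.

```python
import json


def describe_event(line):
    """Turn one JSON event line into a short action description."""
    event = json.loads(line)
    intent = event.get("intent", {}).get("name", "")
    slots = event.get("slots", {})
    if intent == "LightState":
        return f"light -> {slots.get('state', 'unknown')}"
    return f"unhandled intent: {intent or 'none'}"


def handle_stream(lines):
    """Yield one action per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield describe_event(line)


# Sample input shaped like the event shown earlier in this README.
sample = [
    '{"text": "turn on the light", "intent": {"name": "LightState"}, '
    '"slots": {"state": "on"}}\n',
]
for action in handle_stream(sample):
    print(action)
```

To use this as a real pipeline stage, pass sys.stdin to handle_stream in place of the sample list and pipe the output of voice2json recognize-intent into the script.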
Commands
- print-profile - Print profile settings
- train-profile - Generate speech/intent artifacts
- transcribe-wav - Transcribe WAV file to text
- transcribe-stream - Transcribe live audio stream to text
- recognize-intent - Recognize intent from JSON or text
- wait-wake - Listen to live audio stream for wake word
- record-command - Record voice command from live audio stream
- pronounce-word - Look up or guess how a word is pronounced
- generate-examples - Generate random intents
- record-examples - Generate and record speech examples
- test-examples - Test recorded speech examples
- show-documentation - Run HTTP server locally with documentation
- print-downloads - Print profile file download information
- print-files - Print user profile files for backup
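The transcribe-wav and recognize-intent commands can also be chained from a program instead of a shell. The Python sketch below mirrors the pipeline shown at the top of this README: WAV bytes go to transcribe-wav on standard input, and its output is passed unchanged to recognize-intent. It assumes voice2json is installed and on the PATH with a trained profile, and it uses no command-line flags beyond the two subcommand names. The function names recognize_wav and run_command are hypothetical.

```python
import json
import subprocess


def run_command(cmd, stdin_bytes):
    """Run a command, feed it stdin_bytes, and return its stdout as bytes."""
    result = subprocess.run(
        cmd, input=stdin_bytes, stdout=subprocess.PIPE, check=True
    )
    return result.stdout


def recognize_wav(wav_bytes, run=run_command):
    """Transcribe WAV audio, then recognize the intent of the transcription.

    The run parameter exists so the command runner can be swapped out,
    for example in tests on a machine without voice2json installed.
    """
    transcription = run(["voice2json", "transcribe-wav"], wav_bytes)
    intent_json = run(["voice2json", "recognize-intent"], transcription)
    return json.loads(intent_json)
```

A call such as recognize_wav(open("turn-on-the-light.wav", "rb").read()) would then return the event as a Python dictionary, provided the commands emit a single JSON object as in the example above.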