Are You Talking to Me? Speech on Mac OS X
Can Speech Adapt to My Mistakes?
For Newcomers
Surprisingly, yes. Indeed, to a certain extent, Speech will try to understand your command even if you do not get it right immediately. For example, "Get my mails" and "Get my mail" will work the same way.
However, you should not expect Speech to understand sentences that are too different from what the developer intended. If you think that a command is so unnatural that you won't be able to learn it, you may want to create a custom command that will be more natural to you.
For the Cutting-Edge Addicts
If you're ready to explore the latest developments of the Speech technology, you can turn on Panther's Semantic Inference feature. Under this strange-sounding name hides a technology that allows Speech to understand what you say, even if you do not speak the predefined command.
When this is turned on, you can replace "What time is it?" with "What is the time?", "Tell me the time," or even "How late is it?"
Since this technology is still at an early stage of development, Apple chose to turn it off by default. Its accuracy may not be perfect (yet) and it may slow the speech-recognition engine down a bit. In my experience, however, everything worked perfectly well, so I would encourage you to give it a try.
To do so, follow these steps:
- Open the Speech Preferences pane by saying "Open the Speech Preferences."
- Go to the Speech Recognition tab.
- Select the Commands sub-tab.
- Highlight "Global speakable items" and click on Configure.
- In the sheet that appears, uncheck the box to turn the feature on (I know, I know...).
To test it, read the sentences suggested by the activation sheet and be amazed.
Going One Step Further
Now that you have discovered the joy of Speech, it's time to go one step further and learn how to almost completely get rid of your keyboard and mouse.
Front Window and Menu Bar Control
For now, you may have noticed that many commands are still out of your reach, including menu items, toolbar buttons, etc. The good news is that you can control them with Speech too, making your keyboard and mouse almost obsolete.
In order to turn this option on, follow these steps:
- In the Universal Access preferences pane, click on "Enable access for assistive devices."
- In the Speech preferences pane, click on Commands.
- Select Front Window and Menu Bar.
Now a whole new world is open to you. Try to say the following commands to show or hide the volume in the menu bar:
- Switch to System Preferences.
- Show all.
- Sound.
- Show volume in menu bar.
This gives you a lot of power over your applications and dialog boxes. Unfortunately, some nonstandard controls will not work with this method. Also, you probably will not be able to pick items in complex lists by using Speech. However, most of the functionality of most applications will be available via voice commands.
Even more powerful and more universal is the menu bar, which you can also control by voice. Since almost all menus are standard, you can access most of your applications' menu commands without any issue.
To shut your Mac down, you would say:
- Switch to Finder.
- Apple menu.
- Shut down.
- Shut down.
Define Keyboard Shortcuts
This is all very nice but, sometimes, giving a menu and a menu-item name to perform a simple action can be a bit bothersome. That's why the Speech development team introduced a very nifty command that allows you to enter any keyboard shortcut simply by saying "Define new keyboard shortcut."
A palette will then pop up, allowing you to enter the keyboard shortcut and the voice command you wish to associate with it. You can use such a command to, for example, create a "Close tab" command in Safari or a "New chat with" feature in iChat. Users with disabilities could create a custom command for "Zoom in" and "Zoom out."
Of course, since Panther allows you to define custom shortcuts through the Keyboard preferences pane, this feature is even more powerful than one could think at first sight.
Better Interactions with Your Mac
Spending your day in front of your screen isn't always fun, as enjoyable as using a Mac can be. You may therefore wish, from time to time, to step away from your computer -- when a long task is running, for example -- without losing contact with your Mac in case something important happens.
That's pretty simple. Mac OS X now features "talking alerts": your Mac reads aloud the alert messages that pop up on your screen if you do not respond to them within a predefined delay.
This feature can also be very useful in an environment where multiple computers run at the same time -- a print shop or a computer lab in a school. Wouldn't it be nice to hear in a clear, distinctive voice "The PowerMac G5 next to the window needs your attention. The printer is out of paper," instead of a "Bong!" that you would need to track down?
In order to benefit from this feature, use the "Spoken User Interface" tab of the "Speech" preferences.
You can then define what the computer will do and after how long it will talk. I wouldn't recommend a short delay, since having the Mac read the alert while you are already reading and reacting to it may be annoying. Setting it to 10 seconds gives you time to react if you are already in front of the screen.
Your Mac can also read alert windows that, for any reason, would pop up behind your current application or working document.
The "Announce when an application requires your attention" option can also be a time saver. Indeed, while you are working, you may not notice the icons furiously bouncing in your Dock but will certainly hear "Safari needs your attention."
Adding Commands, Folders, or Files
Like many users, your workflow may require you to access documents that are buried in your folder hierarchy. Luckily, you can easily create a "command" that tells Speech to open them in the blink of an eye.
In order to do that, simply create an alias of the folders that you commonly use in the following folder:
[Home] -> Library -> Speech -> Speakable Items
Now, wherever you are, you simply need to say the name of the folder to open it. To make the alias creation process easier, remember that holding the Option and Apple keys while dragging an icon creates an alias.
Making your own items able to be invoked by speech can itself be achieved by speech. Merely click on the item in the Finder and say, "Make this speakable." Speech will take care of making the alias, putting it in the Speakable Items folder, and removing the word "alias" from the alias.
Of course, you have to be careful not to drop any alias with a name that would match the one of an existing command too closely. Otherwise, you may end up opening this folder unwillingly. To avoid this, simply change the name of the alias and all will be well again.
Even cooler, you can also put aliases there to documents that you open often, or to the HTTP files that Mac OS X creates when you drag a URL from a browser's address bar onto the desktop. Just make sure that you give these files names that are relatively easy to pronounce -- for example, remove the extensions if possible, or you will have to pronounce "filename dot extension."
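As a rough sketch of the naming advice above, here is a small shell helper that turns a file path into a name that is easy to say, by dropping the folders and the last extension. The function names speakable_name and speakable_path are made up for this illustration; only the Speakable Items location comes from the text above. It does not create the alias itself, which you would still do in the Finder or with "Make this speakable."

```shell
#!/bin/sh
# Hypothetical helper (not part of Mac OS X): derive an easy-to-pronounce
# alias name from a path by dropping the directories and the last extension.
speakable_name() {
    base=${1##*/}                             # strip leading directories
    case $base in
        ?*.*) printf '%s\n' "${base%.*}" ;;   # "Report.pdf" -> "Report"
        *)    printf '%s\n' "$base" ;;        # no extension: keep the name as-is
    esac
}

# Hypothetical helper: where an alias with that name would live,
# using the [Home] -> Library -> Speech -> Speakable Items folder.
speakable_path() {
    printf '%s/Library/Speech/Speakable Items/%s\n' "$HOME" "$(speakable_name "$1")"
}

# Example: show the suggested alias location for a document.
speakable_path "$HOME/Documents/Quarterly Report.pdf"
```

Only the last extension is removed, so a name such as "archive.tar.gz" would still need a manual touch before it is pleasant to pronounce.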
When All This Is not Enough
When adding aliases and interacting with buttons or menu items simply is not enough, keep in mind that both AppleScript and the Terminal can work closely with the Speech technology.
For example, here is how to write a script that will read a string of text ...
... in AppleScript:
say "This is something very cool very cool very cool this is something very cool that every Mac can do!" using "Cellos"
... in the Panther Terminal:
say -v Cellos "This is something very cool very cool very cool this is something very cool that every Mac can do"
Note that the voice you pick will be ignored by AppleScript if Speech Recognition is turned on. This is a feature that allows users to enjoy consistency in the dialog they have with their computer.
When using the "Saving to file" option, however, the voice you pick is used, since the consistency of the interaction with the user is no longer a concern.
The ability to interact with the Speech Synthesizer even if you are not a developer will allow you to add speech capabilities to the Terminal scripts or AppleScripts that you already use in your daily workflow without having to learn a whole new set of commands or language.
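To give an idea of what this can look like in a Terminal script, here is a minimal sketch that wraps a long-running command and announces how it ended. The function names announce and run_and_announce are invented for this example, and the spoken messages are placeholders. The script always prints the message, and it only speaks when a say command is actually present, so it degrades quietly on a machine without speech synthesis.

```shell
#!/bin/sh
# announce MESSAGE...: print the message, and also speak it when `say` exists.
# (Hypothetical helper name; only the `say` command itself comes from the system.)
announce() {
    printf '%s\n' "$*"
    if command -v say >/dev/null 2>&1; then
        say "$*" || true      # a speech failure should not break the script
    fi
}

# run_and_announce COMMAND [ARGS...]: run a task, then report the outcome aloud.
run_and_announce() {
    if "$@"; then
        announce "Task finished"
    else
        announce "Task failed"
    fi
}

# Example: stand-in for a long task.
run_and_announce sleep 1
```

You could replace sleep 1 with a backup, a compile job, or any other command you would rather not watch.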
Getting your Mac to Listen
Now that your existing scripts have gained the ability to speak to interact with you, wouldn't it be even better if they could listen? Well, Apple already thought of it and all the information that you need to create complex listen-and-tell scripts can be found on this page.
That way, you can create even more complex speakable items that will start a true dialog with you and react depending on your needs and answers.
Have Some Suggestions to Make it all more Exciting?
Indeed, I do! The first thing to do is to over-use the "Show me what to say" command and to try to do as much as you can with Speech. At first, it may look like you are actually losing time since you need to learn the commands and sometimes learn to speak into the microphone.
However, very quickly, you will see that you can do almost everything with Speech and get completely rid of meaningless alert sounds, creating a true dialog with your computer.
Many applications are speech-ready. iChat, for example, can read aloud the names of the people who invite you to a chat, but this option is turned off by default. It is worth taking the time to learn what each one can -- and cannot -- do.
After a few days of practice, I am glad to say that I now can use my Mac without a keyboard or mouse for most of the day, except when typing, of course.
I Want to Create Sounds from Speech Synthesis
On some occasions, you may want to create a sound file from the speech generated by the speech engine. The easiest way to do so is to use an AppleScript command like this:
say "This is something very cool very cool very cool this is something very cool that every Mac can do!" using "Cellos" saving to "Cool.aiff"
When you run this script, it creates a file at the root level of your hard drive, containing the sound that you would hear if the synthesis had happened on-the-fly.
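The Terminal's say command should be able to do the same through its -o option; this sketch assumes your version of say supports it, so check man say first. The function name speech_to_file and the DRY_RUN switch are my own inventions for illustration. With DRY_RUN=1 the function only prints the command it would run, which is handy for checking a script before it starts producing files.

```shell
#!/bin/sh
# speech_to_file OUTPUT TEXT...: write synthesized speech to OUTPUT (an AIFF file).
# Hypothetical wrapper; assumes `say` accepts -o for the output file.
speech_to_file() {
    out=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Dry run: show the command instead of running it.
        printf 'say -o %s %s\n' "$out" "$*"
        return 0
    fi
    if ! command -v say >/dev/null 2>&1; then
        echo "say: command not found on this system" >&2
        return 1
    fi
    say -o "$out" "$*"
}

# Example: preview the command without creating any file.
DRY_RUN=1 speech_to_file Cool.aiff "This is something very cool"
```

Unlike the AppleScript version, a relative file name here is resolved against the Terminal's current directory, so the file lands wherever you ran the command.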
Other Technologies
To achieve the same effect, you can also use the demo pages of the AT&T "Natural voices" technologies. Indeed, to demonstrate their system, AT&T allows you to type text into a web form and to download the resulting file. The main advantage of it is that it allows you to read text in many languages.
Here is the demo page. Of course, since there are certain limitations and copyrights that apply, I encourage you to read the Terms and conditions first. You should also keep in mind that this system is targeted at professional frameworks and that it runs on powerful servers.
Author's Note
During the preparation of this article, I had the opportunity to talk with Kim Silverman, principal research scientist, manager, spoken language technologies at Apple. May he find here the expression of my gratitude for the information he so kindly provided.
Needless to say, any errors or inaccuracies in the preceding pages remain entirely my responsibility.
FJ de Kermadec is an author, stylist and entrepreneur in Paris, France.
-
What's really what, re
2007-12-21 13:26:50 a_chameleon
Unfortunately, the article leads folks to believe that the speech recognition in Apple's OS X can do far more than it actually can do.
Effectively, it can only do some limited command and control scenarios. It is unable to perform dictation, create speech models, or learn, and its voices are years behind the times.
Bill Burke
http://wirelessspeech.blogspot.com/2007/11/comparing-apples-to-gold-nuggets.html
-
Speech Recognition window not showing?
2006-10-15 02:02:46 odiddy
Hey there! I found your article quite interesting. I was dabbling with my speech recognition and then for some odd reason the speech recognition window just disappears, and when I hold on to esc it's as if there is no program there. I have it turned on in my system preferences, any idea what it could be?
Thanks,
Odiddy
-
Speech recognition issues in OS X 10.3.5
2004-09-11 15:36:04 AndyGoundry
I really enjoyed this article and really decided to have a play with speech again, which i'd left alone for quite a long time as i found it relatively useless.
Today, i have to say i like the speech options and have setup a keyboard shortcut to speak selected text. Nice.
However, i am having huge trouble using speech recognition. For some reason, when i turn this on and leave it in 'listen continuously' mode, it decides by itself to switch off completely without asking. it has so far done this every time now for 5 attempts. It normally switches off within 5 minutes. I also set the control to not accept commands until i pressed the hot key, but then it decided to automatically switch to 'continuous' mode again and the problems reoccurred.
Very strange. This does, as you'd imagine, render the facility useless.
Oh well.
thoughts and experiences welcome.
Andy
-
Please help
2004-05-19 11:26:50 lsonna
The article only talks about recognition of spoken COMMANDS. That is the least of my worries. Can I dictate my articles and books into a word processor? Can I correct recognition errors? Are the corrections stored in user files so that accuracy improves over time? Or do I need to buy Dragon Speaks or Via Voice?
The speed even with my PC Celeron (P) 2.40 Ghz, 224 meg of RAM is much too slow both with Win XP Pro built in speech recognition and Dragon Speaks --about 15 words per minute max if there are no errors. I'm trying to decide whether to buy a Mac. Can I hope for better speed? Should I buy separate software?
I am a writer, now handicapped from RSI and desperate. -
Please help
2004-07-30 13:35:44 JayBee
Just in case you decide for a Mac, I´d refrain from using the IBM solution. With the programme iListen at www.macspeech.com you can dictate anywhere directly into the applications, the update cycles of the software are more "dedicated", recognition even works fluently on my old 450 MHz PPC. Cost: 99.- bucks for downloading the programme or 149.- boxed with a microphone included. I do use the cheap Jabra headset, which was in my Via Voice bundle I used before - works fine! ;-P
You can also speak commands to boss your programmes around, but you have to buy a "ScriptPack" for every app you want to control :-( Most of these cost between 10.- to 20.- $, some cost more. As you do writing business, you would want to get packs for Microsoft Word or Mariner Write and maybe Apples Mail programme.
For opening files and jumping around in Finder folders you don´t need to buy a script pack.
Hope this helps...
-JayBee- -
Please help
2004-05-20 02:08:45 FJ de Kermadec
Hi !
This article focuses on the Speech technologies built into Mac OS X. Currently, these technologies are centered around the recognition of commands and the interaction between the computer and the user. By creating a real vocal interaction, they can be a lifesaver for visually impaired users.
In order to dictate text to a text editor you would currently need to use an application like Via Voice indeed. Although I haven't used it for a while, the last time I saw it, it seemed to be quite responsive and allowed the demonstrator I observed to talk at (almost) normal speed.
Considering Apple's commitment to help users with disabilities, I am sure that the Mac would be an ideal computing platform for you. If you live near an Apple Store or Apple Authorized Reseller, you can call and ask whether they can perform a demo for you, so that you can see for yourself.
Here is a link to their "Accessibility (Vision)" website :
http://www.apple.com/accessibility/vision/
New cool accessibility features have already been announced for the next release of Mac OS X, "Tiger".
Let me know if this helps !
F.J.
-
Two nitpicks
2004-03-23 11:56:52 fofer
The article starts by saying that Apple's speech synthesis was innovative when it came out in 1993... but the original Mac spoke when it was first introduced in 1984:
http://www.folklore.org/StoryView.py?project=Macintosh&story=Intro_Demo.txt&sortOrder=Sort%20by%20Date
Secondly that "My voice is my password" protection in OS 8.x was a joke. My friends and I used to fake it out all the time just by mimicking each other's voices. I don't see why a recording wouldn't get by it either.
Lastly, the state of speech synthesis has come a LONG way since OS 7, yet Apple's voices haven't seemed to improve all that much. The samples at AT&T Labs bear this out. I'd really like to see them step it up a notch and improve the "Macintalk Pro" voices a LOT, or just license the AT&T Labs work for better speech synthesis.
-
Apple junk technology
2004-03-20 01:31:15 mac_held
Indeed, spoken output is great - in theory.
But unfortunately (as many things done by apple) their effort is minimal to make their "great innovation" multilanguage.
"The main advantage of it is that it allows you to read text in many languages."
This is not an advantage, it is a pure base condition if want to work with that feature (the quoted sentence above is somewhat arrogant, you maybe want to imagine, what happens, if apple forces YOU to speak/write in german only to make some of the features work. Crazy, isn't it?)
I'm living in germany and so speakable items is junk to the same degree that the speak synthesis does not work for me. If you know more than apple, then you realize, that this is true for the great majority of the european continent (if you know much more, than you realize, that this is very true all over the world).
This is the reason, why I'm not using it. Maybe apple will correct their behaviour (very unlikely) and start with spoken interface their apology to all not english and not japanese speaking people on the world (I only believe this, when I see that, not even a petasecond earlier) but...
-
Amiga, ATT
2004-03-18 23:07:20 rofl
Amiga used the same voice (Bruce), I think Apple just bought it (or the pros).
http://www.naturalvoices.att.com/demos/ here you can find some snippets which good speech synthesis can do nowadays. And Apple uses still the same voices as in System 7.
So this is not anymore innovative.
As long as I can't dictate Text, this is not useful. -
Amiga, ATT
2004-03-19 01:32:49 FJ de Kermadec
Hi !
Just to let you know, the fact that older voices still ship with Panther does not mean that Apple still uses them !
In fact, they continue adding extremely good-sounding voices, like Victoria or, more recently, Vicki.
F.J. -
Amiga, ATT
2004-03-20 00:54:43 rofl
Go to www.naturalvoices.att.com use Language US English, with Crystal, put in this sentence "Isn't it nice, to have a computer, that will talk to you?" and now compare to Vicki.
I think Vicki is a fine tuned Victoria. With the DevPackage from Apple you can do this by your own.
-
Not really innovative
2004-03-18 20:04:21 mchaput
"Once again, this shows how innovative Apple can be."
The Amiga personal computer had speech synthesis built into its OS in version 1.0, in 1985.
-
VOIP to Speech Commands
2004-03-18 15:46:16 adolph
I wonder how long it will take to have some glue between voice over IP and Speech Commands? You could call in to your computer and have it tell you knock-knock jokes! -
VOIP to Speech Commands
2004-03-19 01:30:30 FJ de Kermadec
Hi !
Calling a computer from afar to discuss knock-knock jokes could sure be a fun feature ! ;-)
Maybe a shareware to write ?
F.J.
-
java
2004-03-18 10:16:36 ashields
I'm rather disappointed that apple have not bothered to implement the official java speech api (especially as they had a hand in creating the specification).
There is the cocoa based java speech api but that's not cross platform so at the moment I'm having to do all my speech based development on a pc.





