How useful are speech-to-type apps?
Adrian McDermott
January 31st, 2009
I’m quite a fan of speech typing software, which I’m using right now, so I was very interested when my colleague Ralf Haller blogged recently about his experiences using Google’s voice recognition for search on his iPhone. It seemed to cope pretty well with everyday words but was rather challenged by place names and the like. I have just come across another review, in the Dallas Morning News, of Google’s and Vlingo’s competing speech typing software for the iPhone (and, in Vlingo’s case, the BlackBerry):
Pros: Very accurate for speakers with American accents in quiet places. Easy to use. Reduce the need for typing.
Cons: Struggle with background noise and non-American accents. Don’t work for SMS or e-mail on iPhone.
Bottom line: Breakthrough applications that will change the way you use your iPhone or BlackBerry. Download immediately.
These products are clearly maturing nicely. However, according to the BBC, they’re going to be blown out of the water before the end of the year by a small British company based in Hereford, who are planning to offer the world’s first ‘fully accurate’ totally voice-controlled phone. The BBC video shows a presenter standing in a rather low-tech-looking factory, holding a prototype of the Zumba: a small, very plain-looking box with a flat clip that loops over the ear. It speaks text to the user and transmits voice replies to be converted to text by the Zumba server (which the company claims is 100% secure). “Whatever happens, this is very exciting tech indeed!” comments the dialaphone blog, and even The Register gave it a straight-faced report. Wired, however, is less convinced, put off not only by the “100% accurate, 100% secure” tag, but also by the fact that the company would not let the presenter actually test a prototype for “security reasons”:
Is your snake-oil sense a-tinglin’? It should be. This video further charts the descent of the Beeb from an internationally respected and neutral reporting machine into a populist tabloid of a TV company.
Ouch!
I’ve actually been using MacSpeech (which is based on Nuance’s Dragon NaturallySpeaking engine) to write most of this post. Oddly enough, the only words it has really struggled with were “nuances” (it refused to give me the option of a capital letter and an apostrophe) and Zumba (for obvious enough reasons). I cut and pasted the URLs using a mouse, I have to confess, but even so, I’m pretty impressed with the performance of this software. It doesn’t enable me to produce a mountain of prose at breakneck speed, but that’s because I just don’t think that fast; this thing is way faster than my typing speed.
Given that I had to train MacSpeech for about 15 minutes so that it could recognize my speech patterns, that they recommend a processor speed of at least 1GHz and 1GB of RAM, and that the support folders also require over 1GB of hard disk space, you can understand why getting this sort of thing to work on a mobile phone is something of a challenge, and why I tend to agree with Wired’s view of the Zumba phone!








