Thursday, April 18, 2024

Dragon Naturally Speaking

We haven't reviewed voice dictation software for several years. The last version of Dragon Naturally Speaking that we looked at was probably somewhere around version 6. Nuance, the company that acquired the Dragon software system, has recently released version 10 of the Dragon Naturally Speaking Professional series. It was received with great anticipation. We eagerly installed it on a test system, after which I sat there quietly thinking to myself, "What am I going to say about this product?" And that's when it hit me. "Saying" is what it's all about with speech recognition software.


This minor epiphany led me to decide that I would give it the same test an average user would give it. I would ignore most of the instructions while maintaining high expectations for success. I installed it and immediately started dictating. To say the least, I was astounded. Accuracy was over 95% out-of-the-box. And to be completely honest, I did a few things that aren't necessarily recommended out-of-the-box.

I did follow all the basic quick start directions, which I would hope every new user would do. This meant doing the basic recognition training. I like to use an excerpt supplied by author Dave Barry (always humorous), and then jump merrily into dictating a document. The level of accuracy achieved in out-of-the-box voice recognition has improved dramatically since last we looked at this market space. The article you are reading is being dictated in Dragon Naturally Speaking 10 Professional. With the exception of formatting for the website, you are seeing exactly what was dictated*.

  *including corrections which were also done through the program

 

 A Word About Hardware

You may have noticed that the heading above is in bold characters. This was also accomplished using simple commands embedded in the software. But I digress.

Voice recognition software is notorious for requiring a lot of processing power. The test machine being used incorporates a quad-core AMD processor and 2 GB of RAM on a Windows XP platform. This is not because the software requires that much power. According to specifications, it only requires a 1 GHz processor and 512 MB of RAM in a Windows XP environment. It's just that I'm impatient for the results; hence the bigger processor. I will tell you that the text is showing up virtually in real time.

Nuance packages Dragon Naturally Speaking 10 with a nice headset/microphone combination from Andrea Labs. Andrea Labs has been in the sound processing business for many years, and we have enjoyed seeing their developments in sound envelope and sound beam technology. To that end, I decided to go one better and use their SoundMax Superbeam array microphone. I chose to do this not only because of the phenomenal performance of this desktop field array microphone, but also because I just don't like something stuck on my ear for any length of time. With the field array mic, I can sit back and dictate with no encumbrances. It also helps prevent me from looking like a complete idiot when the phone rings and I try to answer it on the covered ear.

I ran the audio tests as part of the setup, and the accuracy and sound reproduction were excellent. I must stress that this is a critical element of the speech recognition process. It is probably the best real-world example of GIGO (garbage in, garbage out). Nuance made an excellent choice in selecting Andrea Labs to supply the headset/input device.

It's Not What You Say

Dragon Naturally Speaking does a good job of incorporating the vast majority of words and commands that are commonly used. The included vocabulary is quite extensive, and in version 10 they have greatly increased the command and control capabilities. You can now easily initiate complex commands to control programs in Microsoft Office products as well as your browser, including the popular Mozilla Firefox. If you use Google Desktop, you can speak commands to search the files on your computer or find something in your e-mails. If you're the adventurous type, you can also script your own commands to alleviate the tedium of repetitive tasks.
 

  It's How You Say It

Certainly consistency is important with any speech recognition software. Dragon Naturally Speaking gives you all the tools to train and improve the speech recognition abilities of the software. As I stated earlier, out-of-the-box accuracy is surprising, and with minimal initial training, plus the time taken to make corrections in the document as you develop it, you can improve accuracy to virtually 100%.


Like anything else in life, practice makes perfect. In the early days of speech recognition, training was a true tedium. Dragon Naturally Speaking 10 has eliminated that hurdle. With their improved speech algorithms, my fear of having to spend hours talking into a microphone to get reasonable results has been eliminated.

 
A word of caution. No matter how focused you may be while dictating a letter, white paper, or simple e-mail, life happens. Phones will ring, people will walk in talking, and extraneous noises will interfere with the process. There's no way to avoid this; however, with Dragon Naturally Speaking's error correction tools, cleaning up those little annoyances takes but a moment.


Conclusions


Working with Dragon Naturally Speaking 10 is not only simple, it's addictive. Being able to spew out words at a rate far superior to my typing abilities has me constantly looking for ways to apply the voice recognition system to other applications I use daily. Fortunately for me, Dragon Naturally Speaking has taken the time to set up the majority of commands I want to use for my Office and browser applications. Nuance has proven their mastery of speech recognition on the personal computer by getting it to respond to my verbal requests and commands. Now if I could just get my staff to listen the same way.

Dragon Naturally Speaking 10.1 Preferred-Boxed Shipment - $199.99  

Dragon Naturally Speaking is also available for Medical and Legal professionals.