I have been interested in the voice control of computers for a long time. My first attempt was around 10 years ago, and I had some success with it. In the right environment, I was able to say commands to my computer and it would respond based on what I said. The problem was that I didn’t have a practical use for it yet. It was clear in this early testing that using a keyboard and mouse was far more convenient, more reliable and quicker than using voice. It will remain that way for many of the standard interactions we have with computers (e.g. email, Facebook), at least in the short term.
The day Microsoft Kinect was launched in Australia, I saw the promotional video showing people waving their arms around to navigate through their media centre. It seemed to me that this would be a fairly unreliable and exhausting way to control anything, apart from games specifically designed for the technology. I was way too lazy to consider ever using it myself.
I concluded that voice is the simplest way to control anything, and that it always will be. This led me to start playing around with voice control again. I ran through the voice tutorials and was able to get the computer to understand my voice some of the time. It did stuff up on me a whole lot, but it was clearly much more reliable than software I had used in the past.
Now, around six months on, I have written an AutoHotkey script and a WSR macro that interact with Windows Media Center and the Windows Speech Recognition software, allowing my media centre to be controlled completely by voice. This is a practical use for voice control. I can navigate faster with my voice than I can with a remote control. Instead of needing to know which button to press on my remote (or remotes), I simply speak my mind. I no longer use a remote at all. This is something I have wanted for a long time and I am excited about this outcome.
This system far exceeds any other voice control setup on the market today in terms of reliability and practicality. Most of the reasons systems haven’t worked in the past have had nothing to do with the software being inadequate for the task (the software has worked fine for many years). Most of the problems are environmental, and my solution tackles these environmental issues. Rather than trying to make technology that works in our environment, my solution changes the environment to enable the technology to work. I believe it is inevitable that all future voice control systems will need to take this approach in order to work.
This article will give you all the information you need to control your Windows Media Center home theatre PC with your voice. I will provide the easy-to-edit scripts and show you how to install them on your PC. I will also explain what works and what doesn’t, and why previous attempts have not been successful. The more I explain how it all works, the easier it will be for you to set it up and get it working reliably. This will not be as easy as installing the software and having the results you want right away. You will need to train it to recognise your voice, and you will need to learn the correct commands to use. A solution that can understand the whole English language is a long way off. It is much more difficult for a computer to synthesize human understanding than it is for it to take dictation. That is why we need to have set commands.
There is a video of my home theatre PC running this system after the jump.