Talk to your gadgets

AT&T's Watson leads a pack of new gadgets that understand spoken instructions.

By Stephen Cass | Monday, November 26, 2012
Courtesy AT&T

The builders of mobile gadgets face a paradox. They want to make the most powerful device they can, squeezed into the smallest box possible. But for a device to be useful, human beings have to be able to interact with all its features. More and more functions mean more and more buttons—and humans have stubbornly remained the same size and shape. A button can be made only so small before it becomes impossible to press, putting a tough limit on miniaturization. Different devices confront this paradox in different ways: Cell phone keypad buttons routinely do double, triple, and even quadruple duty, while devices like tablet computers use touch screens and gesture recognition.

AT&T is developing another solution. It wants you to be able simply to talk to an electronic device and have it follow your instructions. While some cell phones already offer voice recognition for basic tasks, such as looking up phone numbers in a contact list, AT&T envisions devices that can handle much more complicated voice commands, such as “Tell me where I can find the nearest ATM” or “Order me a pepperoni pizza.”

For decades AT&T has been working on a voice recognition system that can handle just such requests. Known as Watson, it is so complex that it is more practical to run the software on centralized servers than to install, manage, and maintain it on countless mobile devices. Fortunately, today’s mobile devices have Internet connectivity in spades. With some very basic hardware and software to capture and compress speech (which phones already possess), any device can be given the gift of voice recognition. Captured speech is sent, via the Internet or a cell phone network, to AT&T computers running Watson. The Watson software analyzes the speech and sends back a digital response that the device can translate into commands. To demonstrate the principle, AT&T researchers have built a voice-operated television remote control. Designed to work with AT&T’s Internet TV service, U-verse, the remote lets you ask it to find any comedies that are on TV now or to search the listings for movies starring Bruce Willis.

AT&T is already working with developers to create prototypes for other real-world applications (a yellow pages application for the iPhone, for instance) and expects to make more announcements about the future of this technology in the next few months.


How it Works
AT&T’s networked voice recognition system is a mash-up: software that uses the Internet to glue together different programs with different capabilities. Here, the goal is to merge a general voice recognition application, Watson, with things like information databases or the specialized software that runs a cable television or digital video recorder. In the example of a TV remote control, the remote captures speech from the user (“I want to see Channel 114”), compresses it, and uses a wireless connection to send it to a server running Watson. Watson not only recognizes individual words but can also be programmed to extract some meaning from simple sentences. It does this using sets of rules that can digest a variety of naturally spoken sentences into standardized text; for example, “What is the time?” means the same thing as “Tell me the time.” The text can then be translated by software running on the device into actual machine commands, such as transmitting to a television the signal to select a particular channel.
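The last two steps, turning varied phrasings into standardized text and then turning that text into a device command, can be illustrated with a toy sketch in Python. The article does not describe Watson's actual rules or interfaces, so everything here is an assumption made for illustration: the rule table, the standardized strings GET_TIME and SET_CHANNEL, and the function names are all hypothetical. Speech capture, compression, and the network hop are left out, and the input is assumed to be already-recognized words.

```python
import re

# Hypothetical rule table (not Watson's real rules): each pattern maps
# several ways of saying something to one standardized text form.
RULES = [
    (re.compile(r"^(?:what is the time|tell me the time)\??$"), "GET_TIME"),
    (re.compile(r"^i want to see channel (\d+)$"), "SET_CHANNEL {0}"),
]


def normalize(transcript):
    """Server side: reduce a recognized sentence to standardized text."""
    cleaned = transcript.strip().lower().rstrip(".!")
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            # Fill any captured values (such as a channel number) into the template.
            return template.format(*match.groups())
    return "UNKNOWN"


def to_device_command(standard_text):
    """Device side: translate standardized text into a machine command."""
    parts = standard_text.split()
    if parts[0] == "SET_CHANNEL":
        return {"action": "tune", "channel": int(parts[1])}
    if parts[0] == "GET_TIME":
        return {"action": "show_clock"}
    return {"action": "none"}
```

In this sketch, "What is the time?" and "Tell me the time." both normalize to the same standardized text, and "I want to see Channel 114" ends up as a tune command for channel 114. Keeping the sentence rules on the server and only a small translation table on the device mirrors the split the article describes, where the complex software stays centralized.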

From left courtesy Nuance Communications; G. G. Electronics; Magellan

Buy it Now
Voice recognition is already proving itself in places where people don’t have, or can’t use, a keyboard.

Magellan Maestro 4250
This GPS navigator understands a selection of common questions, such as “Where am I?” and “Nearest gas?” so you can keep your hands on the wheel.
http://www.magellangps.com/

Vocally Infinity
When connected to a phone, this device will retain up to 60 phone numbers and will dial them in response to a spoken name. It is aimed at people who have difficulty using a numeric keypad, such as some elderly people.

Dragon NaturallySpeaking 10
For those who need to create a lot of text without a keyboard—office workers with repetitive strain injury, for instance—NaturallySpeaking 10 translates speech to printed words on a desktop or laptop computer. www.nuance.com
