Which Way Is Voice Recognition Technology Going?

From Voice User Interface to AI and Intelligent Agent

  • July 9, 2012
  • Text by Tatsuya Yamaji

Voice-recognition-based user interfaces, including the iPhone’s Siri, have been attracting a lot of attention lately. Voice recognition apps are responding faster than ever through the use of big data, and new technologies and applications are enabling machines to listen to the human voice and gauge the speaker’s emotions and stress levels. What’s behind this recent popularity of voice recognition, and what potential does the technology have? To find some answers, we need to trace the history of voice recognition research and development.

Mathematical modeling of human voice

Voice user interfaces (VUIs) are rapidly becoming popular these days. The technology itself is not new. In Japan, voice input software for personal computers has existed since the 1990s, and mobile phones with a voice-recognition-based dialing feature were commercially available from DoCoMo, but neither seems to have stirred much popular interest.

In the U.S., products and services featuring voice recognition technology have been on the market since the 1990s, but the first one that truly proved popular was Siri, a VUI on Apple’s iPhone. Users can simply talk to the phone and tell it to do many things, such as entering appointments, sending messages, and searching the web. When asked about the meaning of life, Siri answers “42” (a reference to a joke in The Hitchhiker’s Guide to the Galaxy). Users seem to enjoy this Siri character, which is both intelligent and slightly flaky.

Back in 1987, Apple announced the concept of a futuristic information terminal called the Knowledge Navigator. Some Apple enthusiasts claim that Siri is the very embodiment of this concept.

Android phones have the Google Voice Actions feature, which enables voice commands, and Siri-like personal assistant apps are also available for the Android platform. Google’s mobile search app also offers a voice search feature.

It has finally become possible to control digital devices at will with voice commands. This makes it feel as though the emergence of an AI assistant may be close at hand.

Why have VUIs suddenly become so popular, after remaining in the shadows for so long? Has there been a technological breakthrough?