
Inf@Vis!

The digital magazine of InfoVis.net

Visualisation or Vocalisation?
by Juan C. Dürsteler [message nº 87]

There is no general consensus on the idea that the future of human-computer interaction relies on speech recognition. As with many other things, this is probably true for certain applications and false for others.

Many research institutions worldwide are investing substantial sums of money in the development of speech recognition and synthesis. Speech was once considered the future of human-machine interaction, but that promise has yet to come true in any general way, despite the many years of work already invested in it.

The keyboard and mouse are still our ubiquitous workmates, even though the keyboard is a bothersome device that few people really use efficiently. How many fingers do we use to type on it?

Ben Shneiderman

In an interview with the Washington Post, Ben Shneiderman argues that the dream shown in the movie “2001: A Space Odyssey”, with the astronauts speaking to the paranoid HAL 9000 computer, is far from reality.

According to studies performed at the Human-Computer Interaction Lab (HCIL), when generating speech the brain uses auditory memory, which is located in the same place as short-term memory and working memory. In other words, it is not easy to speak and simultaneously do concentrated work, since speech borrows important areas of short-term memory that we need in order to attend to a particular task.

For this reason, although it’s obvious that voice recognition and speech synthesis will find a place in the interface between humans and machines, it doesn’t seem that they will be the preferred mode of interaction.

Shneiderman considers that, both in assimilating information and in human-machine interaction, the information-seeking mantra will rule:

“Overview first, zoom and filter, then details-on-demand”

This mantra will be supported by advanced visualisation systems. These systems would be the only ones with the ability to show huge amounts of information in an easily understandable way. 
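
As a rough illustration of the three steps of the mantra, here is a minimal sketch in Python. It is not taken from Shneiderman's work or from any HCIL system: the record list and the function names (`overview`, `zoom_and_filter`, `details_on_demand`) are hypothetical, chosen only to show how each step narrows what the user sees.

```python
from collections import Counter

# Hypothetical collection of items: (name, topic, year). Invented for illustration.
RECORDS = [
    ("Item A", "visualisation", 1992),
    ("Item B", "visualisation", 1994),
    ("Item C", "speech", 1999),
    ("Item D", "speech", 2001),
    ("Item E", "visualisation", 1994),
]


def overview(records):
    """Overview first: summarise the whole collection as a count per topic."""
    return Counter(topic for _, topic, _ in records)


def zoom_and_filter(records, topic, first_year, last_year):
    """Zoom and filter: keep one topic within an inclusive range of years."""
    return [
        r for r in records
        if r[1] == topic and first_year <= r[2] <= last_year
    ]


def details_on_demand(records, name):
    """Details on demand: return the full record for one selected item."""
    for r in records:
        if r[0] == name:
            return {"name": r[0], "topic": r[1], "year": r[2]}
    return None  # nothing selected, nothing shown
```

A visual interface would draw each result rather than return it, but the order of the calls is the point: the user sees the whole collection first, narrows it, and only then asks for a single item.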

Nevertheless, the interesting Washington Post article presents a dichotomy, verbal vs. visual, that is probably excessively Manichean, as shown by the increasing interest in multimodal interaction systems, where the visual, auditory and tactile senses come into play simultaneously.

Major companies such as IBM and Microsoft are investing large amounts of money in speech recognition and synthesis without neglecting the study of visualisation and the interaction between the two interfaces. An interesting example can be seen in Emily Benedek’s article for Think Research magazine. It describes experimental systems capable of displaying vast amounts of information, combined with tracking of the user’s position and gestures and recognition of his/her voice. The system can identify where the user is pointing with his/her finger and place a particular image at the indicated location when the person says “place it here”.

The day when we say goodbye, not without some sadness, to the keyboard and mouse is still far away, but there are firm plans to replace these bothersome travelling companions with more natural ways of interaction.

Reality is usually neither black nor white, but a shade of grey. In this case, however, the grey appears quite bright.


Links of this issue:

http://www.washingtonpost.com/wp-dyn/articles/A56499-2002May8.html  
http://www.infovis.net/printRec.php?rec=persona&lang=2#Shneiderman  
http://www.cs.umd.edu/hcil/research/visualization.shtml  
http://domino.research.ibm.com/comm/wwwr_thinkresearch.nsf/pages/interface198.html  
© Copyright InfoVis.net 2000-2014