
Inf@Vis!

The digital magazine of InfoVis.net

Exploratory Search
by Juan C. Dürsteler [message nº 185]

We don't always know precisely what we are looking for and, many times, we don't even know what it is called. In these conditions exploratory search is a strategy that allows us, with the help of visualisation, to refine our search and reach the goal of our exploration through successive iterations.
WorldWideMediaExchange, a photograph website with many ways to explore it.
Source: Screenshot of the wwmx.org website.

Part of the daily work of many of us consists of connecting to the Internet in search of information. Sometimes we know exactly what we are looking for and where it is, but on many other occasions we have only a faint idea of what we are trying to find and we don't even know where that information, if it exists, could be.

We know that it is possible to search the Internet for information without knowing its exact name or location. But how do I find information about a field in which I have no specific knowledge, when I only have a vague idea of its nature?

What we usually do in these cases is turn to a search engine like Google or Yahoo!, type in more or less fortunate queries, and then explore the results, refining the search according to what we learn and discover from them. In other words, we use an exploratory technique. But with current browsers and search engines this technique ends up being slow and cumbersome.

One of the answers to this problem could come from visualisation, hand in hand with what is beginning to be called Exploratory Search.

Several groups of researchers, in fields that range from information retrieval and human machine interface to information visualisation, are working to provide appropriate support for this type of search. In fact, although in the past we haven't talked formally about Exploratory Search as such, we have reviewed many applications that were already heading in that direction.

Examples of this are Marti Hearst's TileBars and Flamenco, Shneiderman and colleagues' TreeMaps, and initiatives like KartOO, Grokker and Autofocus, for textual information. For musical information we reviewed Musicplasma (now known as livePlasma) and others in Visualising Music, and Islands of Music in number 168. These are only some of the articles published in InfoVis.net on this matter. Consequently this topic or, better said, this need, is not new.

What is new is that people are beginning to speak about Exploratory Search in an integrated way, as a discipline in its own right. This discipline recognises that visual interfaces, together with the integration of different search methodologies, are key to providing users with tools that let them find the pieces of information they can't find by just issuing a precise query.

One of the areas where this is important is the search for personal information. The information about, for example, a business meeting can be summarised in our contact list, included in an e-mail message or maybe in a commercial report. Should we look for it in only one channel, we can fail completely. Add to this that our memory tends to be vague and that in many cases we don't follow a rigorous criterion when deciding where to store our data, partly because we can't anticipate how we will want to retrieve it in the future.

Phlat. Image of the interface.
Source: As can be seen on the Microsoft Research website.

Phlat is an exploratory search interface developed by the Interaction and Adaptive Systems group of Microsoft Research, aimed at searching throughout the Windows desktop. You can download it for free. It searches for information with queries that cut across multiple categories of documents (music, text, e-mail, images, etc.) using several kinds of filters.

Openvideo.org. The "Relation Browser" interface provides a search page that combines a window for information retrieval with a partitioning of the corpus of videos by genre, duration, contents, etc.
Source: Screenshot of the openvideo.org website.

Openvideo.org. One of the ways of examining a video before deciding to download it is the "storyboard", which shows one image from each of the scenes that make up the video, allowing the user to get a good idea of its contents.
Source: Screenshot of the openvideo.org website.

Openvideo, for its part, uses as its database a collection of extensively annotated videos related to research and education, and offers multiple ways of selecting and assessing a video before deciding whether to download it. Openvideo lets you issue a traditional search (lookup), a browsing search, or a combination of both.

Moreover, besides retrieving the usual information (author, date, duration, etc.), Openvideo offers several formats that help the user understand the nature of the video and decide whether to download it. The storyboard is one of them. Each scene of the video contributes one image, and the images appear in order like the storyboard of a film. Looking at the whole set we get quite a good idea of its contents.

Clusty. The results are grouped in "clusters" of documents that have a common topic.
Source: Screenshot of the clusty.com website.

mspace. A combination of partitioning with traditional search.
Source: Screenshot of the mspace website.

Clusty uses clustering techniques to group the results according to certain words they have in common. Besides the typical list of results à la Google, Clusty offers a partitioning scheme that allows us to dive into the results, which are clustered by similarity and labelled with a particular cluster tag. For example, if we look for "InfoVis" we get a series of results and also a list of topics or clusters, determined automatically from the analysis of the results, like "resources", "blogs", etc., so that we can refine our search by diving into the clusters.
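The idea of grouping results under automatically derived topic labels can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Clusty's actual algorithm: the stopword list, the sample result titles and the function names `tokens` and `cluster_results` are all hypothetical, and the "clustering" is simply grouping by terms that at least two results share.

```python
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def tokens(text, query):
    """Return the lowercase words of a snippet, minus stopwords and the query."""
    words = (w.strip(".,:;!?()\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS and w != query.lower()}


def cluster_results(results, query, max_clusters=3):
    """Group result snippets under the terms that several of them share."""
    token_sets = [tokens(r, query) for r in results]
    # Count in how many results each term occurs (document frequency).
    doc_freq = Counter(t for ts in token_sets for t in ts)
    # Candidate labels: terms shared by at least two results,
    # most frequent first, ties broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    labels = [t for t, n in ranked if n >= 2][:max_clusters]
    clusters = {}
    for result, ts in zip(results, token_sets):
        for label in labels:
            if label in ts:
                clusters.setdefault(label, []).append(result)
    return clusters


# Invented result titles for the query "InfoVis".
results = [
    "InfoVis resources and links",
    "InfoVis blogs roundup",
    "Resources for InfoVis teaching",
    "Blogs on InfoVis research",
    "InfoVis conference programme",
]
clusters = cluster_results(results, "InfoVis")
for label, members in clusters.items():
    print(label, "->", members)
```

A result that shares no term with any other result (the conference programme here) stays out of every cluster and would remain visible only in the plain result list. Real systems use far more elaborate similarity measures and label selection.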

mspace is another tool that exemplifies the combination of partitioning, sorting and previewing (or pre-listening, for musical files) with traditional search. To introduce the tool, its authors offer an example based on searching for pieces of classical music.

These are just a few examples of what is beginning to flourish on the web. If we look deeper into them, we notice that some of the most commonly used techniques in these applications are:

  • Clustering. Identification of sets of results from a traditional search that have elements in common and, hence, are similar to one another.

  • Partitioning or slicing. Segmentation of the results into different views (slices) of the data set we want to search, obtained by restricting the number of dimensions of the characteristic variables. In many cases the set contains metadata or annotations that provide the information needed to create the view. This is the case of Openvideo or mspace, where, besides the data, there is additional information that allows the user to see the data set from different semantic perspectives.

  • Sorting. Ordering of the results according to the additional information or metadata stored with the data, for example by author or date of publication in the case of books.

Most of these applications combine several or all of those techniques with other, less usual ones, building a multimodal search space.
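As a rough illustration of how partitioning, sorting and traditional lookup can be combined over annotated data, consider the following Python sketch. It is not the implementation of Openvideo or mspace: the `Video` record, the sample catalogue and the helper functions `facet_counts`, `slice_by` and `lookup` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    title: str
    genre: str
    minutes: int
    year: int


# Hypothetical catalogue; titles and metadata are invented for illustration.
CATALOGUE = [
    Video("Bridge engineering lecture", "lecture", 45, 1998),
    Video("Apollo footage", "documentary", 12, 1969),
    Video("Typography basics", "lecture", 8, 2003),
    Video("Ocean currents", "documentary", 30, 2001),
]


def facet_counts(items, facet):
    """Count how many items fall under each value of one metadata field."""
    counts = {}
    for item in items:
        value = getattr(item, facet)
        counts[value] = counts.get(value, 0) + 1
    return counts


def slice_by(items, **facets):
    """Keep only the items whose metadata matches every selected facet value."""
    return [i for i in items if all(getattr(i, k) == v for k, v in facets.items())]


def lookup(items, keyword):
    """Traditional keyword lookup, here a substring match on the title."""
    return [i for i in items if keyword.lower() in i.title.lower()]


# Partitioning: show the user how the corpus splits by genre.
print(facet_counts(CATALOGUE, "genre"))

# Slicing plus sorting: only lectures, shortest first.
lectures = slice_by(CATALOGUE, genre="lecture")
by_length = sorted(lectures, key=lambda v: v.minutes)
print([v.title for v in by_length])

# Combining browsing with lookup: search for a keyword inside one slice.
print([v.title for v in lookup(slice_by(CATALOGUE, genre="documentary"), "ocean")])
```

The facet counts are what lets a user who has no precise query see which slices exist before choosing one, while the lookup remains available for the cases where a keyword is known.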

In the end, as opposed to traditional search, which lets us retrieve information by providing precise queries with meaningful keywords, other more eclectic systems have begun to appear that combine different strategies with elements of information visualisation to provide an exploratory search experience: the kind of search where sometimes we don't even know what we are looking for until we have it in front of our eyes.


Other resources:

In June 2005 a workshop was organised at the University of Maryland with the goal of gathering researchers from different specialties, like information retrieval, human computer interface and information visualisation, to explore in an interdisciplinary way the interfaces that can help to consolidate and shape exploratory search.

In addition, the April 2006 special issue of Communications of the ACM is devoted to this discipline. The articles in this interesting issue, which focus on solving the problem of exploratory search, are well worth reading.

Links of this issue:

http://wwmx.org/   World Wide Media Exchange website
http://www.infovis.net/printMag.php?num=104&lang=2   Number 104 on Tilebars
http://www.infovis.net/printMag.php?num=107&lang=2   Number 107 on Flamenco
http://www.infovis.net/printMag.php?num=51&lang=2   Number 51 on TreeMaps
http://www.infovis.net/printMag.php?num=97&lang=2   Number 97 on KartOO
http://www.infovis.net/printMag.php?num=138&lang=2   Number 138 on Grokker, or Visual Navigation
http://www.infovis.net/printMag.php?num=151&lang=2   Number 151 on Autofocus
http://www.liveplasma.com/   Liveplasma website
http://www.infovis.net/printMag.php?num=161&lang=2   Number 161 on Visualising Music
http://www.infovis.net/printMag.php?num=168&lang=2   Number 168 on The Landscape Metaphor
http://research.microsoft.com/adapt/phlat/   Phlat search interface for Windows desktop
http://www.open-video.org/index.php   Openvideo
http://www.clusty.com/   Clusty
http://www.mspace.fm/   mspace
http://www.umiacs.umd.edu/~ryen/xsi   Workshop on Exploratory Search Interfaces
http://portal.acm.org/toc.cfm?id=1121949&type=issue&coll=GUIDE&dl=GUIDE&CFID=783193&CFTOKEN=90026442#1121977   Special issue of Communications of the ACM on Exploratory Search
© Copyright InfoVis.net 2000-2014