Tag Clouds
by Juan C. Dürsteler [message nº 196]

Tag Clouds are widespread on the Internet nowadays. Are they a form of navigation or a way to approach the semantics of a text? Are they as useful as their ubiquity suggests, or are they just cool?
Figure: Tag Cloud created with the text of this Infovis issue. Note the quite uninformative nature of this Tag Cloud.
Source: generated with the TagCrowd program; screenshot by the author.

Tag Clouds apparently began to proliferate on websites after Flickr, the photo-sharing website, adopted them.

Tagclouds.com attributes the first publication of a Tag Cloud to the novel Microserfs by Douglas Coupland in 1995, although the Welsh poet Doug Lang had already published a poem in the form of weighted text in 1980, in his book Magic Fire Chevrolet.

The process of tagging consists of assigning to a piece of information tags or indices that define or summarise it. A tag is a clear example of metadata, i.e. data that references other data and gives it some meaning. In the end we can consider a tag as the name of a category, as a keyword.

A Tag Set is just a collection of tags where each one of them appears only once. 

A Tag Cloud is a collection of tags where each tag can appear more than once, i.e. the multiplicity of each tag can be higher than 1. The multiplicity can thus be considered a parameter that weights the tag.
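The difference between a Tag Set and a Tag Cloud can be sketched in a few lines of Python. This is only an illustration: the tag list is invented, and modelling the cloud as a `collections.Counter` is an assumption of this sketch, not something prescribed by any particular website.

```python
from collections import Counter

# Hypothetical tags assigned to a handful of photos (illustrative data only).
assigned_tags = ["sunset", "beach", "sunset", "family", "beach", "sunset"]

# Tag Set: each tag appears only once.
tag_set = set(assigned_tags)

# Tag Cloud: each tag keeps its multiplicity, which acts as its weight.
tag_cloud = Counter(assigned_tags)

print(sorted(tag_set))          # the distinct tags
print(tag_cloud.most_common())  # (tag, multiplicity) pairs, heaviest first
```

Here "sunset" has multiplicity 3 in the cloud but appears just once in the set, which is exactly the information a Tag Set throws away.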

Tags can be assigned automatically or manually. Many of the websites where Tag Clouds began to appear use social tagging: users create their own unstructured tags and apply them freely to the contents.

From the standpoint of Information Visualisation, a Tag Cloud is a visual representation of a list of tags weighted by some of their properties.

In many Tag Clouds, the font size of each tag is proportional to its relative frequency of appearance in the text. Nevertheless, both font size and colour can be tied to metrics other than frequency, which greatly widens the possibilities beyond using frequency alone as the weight.
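As a rough sketch of that proportional rule, the following Python function scales font sizes by relative frequency. The function name, the point sizes and the minimum-size floor (added so that rare tags stay legible) are all assumptions made for illustration; real Tag Cloud generators may use other scalings.

```python
def font_sizes(counts, max_pt=36.0, min_pt=8.0):
    """Return a font size (in points) per tag, proportional to relative frequency.

    max_pt and min_pt are hypothetical defaults chosen for this sketch.
    """
    if not counts:
        return {}
    total = sum(counts.values())
    relative = {tag: n / total for tag, n in counts.items()}
    top = max(relative.values())
    # The most frequent tag gets max_pt; the others scale down proportionally,
    # but never below the min_pt floor.
    return {tag: max(min_pt, max_pt * f / top) for tag, f in relative.items()}


# Invented counts, for illustration only.
counts = {"visualisation": 12, "tag": 6, "cloud": 3}
sizes = font_sizes(counts)
print(sizes)
```

Passing a different metric instead of raw counts (for instance recency or popularity scores) would reuse the same mapping with another weight.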

Since in the end a Tag Cloud is just a text made of a list of tags ordered and weighted in some way, we can classify Tag Clouds as a function of their text structure and/or ordering.

On the other hand we can use different visual variables to express the weighting, typically font size and font colour.

The combination of these two criteria provides us with the following table, partially based on the classification proposed by the interesting Smashing Magazine article "Tag Clouds Gallery: Examples and good practices".

Classification of Tag Clouds

The two visual variables are:

  • Font size: font size is proportional to the weight.

  • Colour (of font or background): colour is used to highlight the keywords with higher weight. Font size is not the primary visual variable associated with weight.

The ordering criteria are listed below, each with an example for font size followed by an example for colour:

  • Unordered. Examples: A better tomorrow; Bureau SLA.

  • Alphabetical. Examples: Flickr; isit2.0.

  • By weight: tags with greater weight appear first. Examples: Dream Scaper by C.S. Ling; Abduzeedo.

  • By similarity: similar tags appear closer together in the two-dimensional space that represents the Tag Cloud. Examples: Webtopmania; Markus Angermeier's Web 2.0 tag cloud.

  • Tags are not ordered and the linear appearance of a text disappears. Example: Wordle.net.


In practice, many Tag Clouds mix font size and colour, sometimes redundantly to convey the same information, as you can see in some of the examples that illustrate this article.

The use of Tag Clouds has become widespread in the last few years. Many websites contain Tag Clouds, often, judging by their quality, just to look "cool" rather than for their informative value.

What are they for?

From the Information Visualisation standpoint, the idea underlying a Tag Cloud should be to improve the navigational experience of users by exposing them to the most descriptive keywords (because they are popular or significant) of the contents of the site or page. This way the visitor should be able to get a snapshot of the main topics the site deals with.

In this sense Tag Clouds are not predominantly a navigation aid. Rather, they should give the visitor, at a glance, an awareness of the situation and context regarding the semantics of the site, even though the links then send us to other parts of the document or site.

Whether Tag Clouds really achieve these visualisation goals is quite debatable. In many cases it is very difficult to extract from them precise information that allows us to understand, at a glance and with some accuracy, the basic semantics buried in the contents that the cloud tries to visualise.

Are they a good visualisation?

As Marti Hearst and Daniela Rosner accurately point out in their article "Tag Clouds: Data Analysis Tool or Social Signaller?", Tag Clouds combine good and bad elements of visualisation design:

Positive Aspects

  • Compact representation.

  • Draws attention to the most important elements.

  • At least three dimensions are represented simultaneously:

    • The words themselves

    • Their relative weight

    • Their ordering

Negative Aspects

  • Comparing tags of similar size is difficult.

  • Words are two-dimensional objects, but the only dimension that is coded is font size. Consequently, between two words with equal weight, the one with more letters will be perceived as more important since its area is bigger.

  • Redundant assignments, such as ordering by weight and also using font size to depict the weight, are useless.

  • Terms with similar or closely related meanings can appear far apart, or close to unrelated terms. This causes confusion.
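The point about word length and perceived importance can be checked with a little arithmetic. The sketch below is a rough approximation only: the assumption that an average glyph is about 0.6 em wide is ours, real fonts vary, and the function name is hypothetical.

```python
# Assumption for this sketch: an average glyph is about 0.6 em wide.
GLYPH_WIDTH_EM = 0.6


def approx_area(word, font_pt):
    """Rough bounding-box area (in square points) of a word set at font_pt."""
    width = len(word) * GLYPH_WIDTH_EM * font_pt
    height = font_pt
    return width * height


# Two tags with equal weight, hence drawn at the same font size.
same_size = 20.0
short_area = approx_area("web", same_size)
long_area = approx_area("visualisation", same_size)

# Ratio of the areas: depends only on the number of letters.
print(long_area / short_area)
```

With equal font size the ratio reduces to the ratio of letter counts (13 to 3 here), so the longer word covers more than four times the area although both tags carry the same weight.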

According to the article, there are, all in all, few usability studies, and the existing ones do not appear to support the hypothesis that Tag Clouds improve navigation or insight into the contents of a web page or a text.

From their qualitative study of 140 web discussions on the topic and personal interviews with 20 experts in web design and Information Visualisation research, the following conclusions can be drawn about what the experts think and write regarding the usefulness of Tag Clouds for understanding and processing information (summarised by the author for the sake of brevity):

  • They are inferior to the standard alphabetically ordered list. This could be improved by suitably adjusting white space and font size and, possibly, by changing the layout.

  • It seems that the true perceived value of this type of visualisation lies in its acting as a signaller or marker of individual or social interaction with the contents of a collection of information.

  • It works more as an attractive device that suggests the interest a certain site arouses than as a true representation that provides understanding of the navigation or the semantic contents.

  • In relation to the previous point, concepts such as fun and the dynamic and social aspects that mark a site as suitable for exploration, exchange and social tagging are relevant.

Therefore, should Tag Clouds become a true tool for navigation and understanding, they would lose the social appeal of being cool, fashionable and avant-garde that has probably made them so popular. If we are right, their time will pass; if not, the clouds will remain in the sky of the Internet.

There are various online programs that allow you to extract a Tag Cloud from any text just by producing a histogram of the most frequent words. A good list of these tools can be found at Technacular.
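The histogram step itself is simple enough to sketch in Python. This is not how TagCrowd or any specific tool works internally; the stop-word list, the function name and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; real tools use much longer ones.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def word_histogram(text, top_n=10):
    """Return the top_n most frequent words of text as (word, count) pairs."""
    # Lower-case the text and keep runs of letters only (no digits, no punctuation).
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


sample = "Tag clouds are clouds of tags. A tag is the name of a category."
histogram = word_histogram(sample, top_n=3)
print(histogram)
```

Note that such a naive count treats "tag" and "tags" as different words and knows nothing about meaning, which may help explain why a cloud built this way can be rather uninformative.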

