También disponible en Español

Inf@Vis!

The digital magazine of InfoVis.net

Matrix Representations
by Juan C. Dürsteler [message nº 202]

Social networks have found their way into the lives of Internauts. Facebook, Linkedin, Flickr and a countless number of other networks have spread under the warmth of Internet including hundreds of thousands of users and millions of connections.
Matrix representation of a graph. Each dimension is represented on an axis parallel to the rest of the dimensions. In this case each row and column contain members of Xerorx PARC  and you can see who collaborates with whom
Source
: PhD thesis of Natalie Henry.
Click on the image to enlarge it.

In the real world but specially in Internet, the relationship between individuals, objects and concepts are increasingly important. Just as an example, who is not connected today to one or more social networks?. These have become ubiquitous and constitute a vast truss of professional and leisure relations where we knit together our interests.

Consequently, the networks have ended up as important sources of information. How can we visualise them in an effective way? 

From a mathematical standpoint a network can be considered as a graph (see also issue number 137). Each element of a graph (also called node) can be related to other elements of the same. Each relation is called arc, edge or also link.

The traditional representation of a graph consists of a set of points representing the nodes, linked by lines that bind the nodes that hold a relationship. Nevertheless, when the number of nodes begins to become high (above about 20 nodes and 20-30 links depending on the authors*), the problems of occlusion between links and even between nodes make understanding and interacting with the representation very difficult.

An alternative representation that, despite the fact that it's relatively unknown, appears to be very useful is the matrix representation.

For our purposes the matrix representation of a graph is an array of rows and columns where each row and column represent a node (for example a person) and in the intersections (cells) between them a 0 or a 1 (a coloured square or its absence) denotes that there exists a relation (or not) between the corresponding nodes.

So what we are actually painting is a boolean matrix of connectivity, also called and adjacency matrix. Notice that this allows us to visualise one to one, one to many and many to one links in a very simple way.

Obviously the matrix paradigm can be extended beyond the adjacency matrix by assigning a visual variable like the colour to each cell as a function of the value of a given numerical value like, for example, the traffic of a web link or the number of publications of which two nodes are coauthors.

The matrix layout guarantees that there's no occlusion neither between links nor between nodes. On the other hand the study of the visual patterns that arise let you identify clusters and "communities" by permuting the ordering of rows and columns so that the most mutually linked nodes become closer. 

To be more systematic, the advantages and inconveniences of such a representation , according to the PhD Thesis by Nathalie Henry and the articles mentioned at the footer of this page, are the following:

Advantages
  • Absence of occlusion amongst nodes, you can always read their label.
     
  • No crossing of links, enabling you to easily identify the origin and end of a link.
     
  • Easy  identification of the absence of connections.
     
  • It systematically outperforms graphs in several tasks like counting nodes, finding links, etc. when the number of nodes is bigger than 20.
Inconveniences
  • For the same level of detail a bigger space is required compared to traditional graphs.
     
  • For small networks (<20 nodes, 20-30 links) the graph is more effective.
     
  • Greater difficulty to follow paths (for example form node A to B through C)
     
  • Lack of familiarity, it's a much less known and intuitive paradigm.

J. Bertin, who sadly passed away just a few weeks ago, had already introduced in his book Semiologie Graphique the reordenable matrix. Effectively, being able to properly permute the order of rows and columns allows you to understand better the structure of the links and the identification of features making interesting patterns emerge.

In some ways what Bertin came to say is that the matrix provides us with a valid representation, but what makes it usable is choosing an appropriate ordering of rows and columns.
Equivalence between matrix representation and that of a graph: in A you can see an actor that links several communities, in B 2 communities and in C a clique.
Source: PhD thesis of Natalie Henry.
Click on the image to enlarge it.

Particularly you can detect

  • Cliques All the members of a clique are completely interconnected. They are associated with very cohesive social groups.
     
  • Communities, where the members are connected in a less cohesive way; not all of them should be interconnected among each other.
     
  • Individual Actors
     
  • Influences and relations between the previous elements.

Nevertheless the permutation of rows and columns is not unique so it's possible to obtain different views of the communities that make up a social network.

It's not easy, anyway, to choose which ordering is the right one in general. It's clear that different reordering produces different views and, hence, different approaches to the problem. Which of them is the best one is a key aspect of the utility you can give to it and most probably depends on the needs of the user, i.e. on the objective we are pursuing when analysing a social network.

Matrix representations have been successfully used in the study of scientific publications to ascertain, in function of the citations and co-authorship: which researchers work together more often, which are the professional relations between them, the fields where they thrive, which are the main researchers etc.

In the field of patent analysis, as we already saw in the issue number 167 , they have been extensively used to detect alliances between companies and communities of researchers working collaboratively between different institutions or within the same.

But there are many other applications that can benefit from this type of representation allowing you to find social groups sharing purchase habits or groups of people connected by mobile telephony that have common interests.

It isn't difficult to imagine the interest that a revealing visualisation can have when taking marketing decisions affecting considerably large social networks. Nevertheless the bigger the network the more complex it becomes to acquire insight and knowledge. Visualisation has a long road ahead in this field.


On the readability of graphs using node-link and matrix based representations: a controlled experiment and statistical analysis. Information Visualization (2005) 4, 114–135, Palgrave Mac Millan

Dedicated to the memory of Jacques Bertin

Links of this issue:

http://www.infovis.net/printRec.php?rec=glosario&lang=2#grafo   Glossary entry about Graph
http://www.infovis.net/printMag.php?num=137&lang=2   Number 137 about Graphs
http://insitu.lri.fr/~nhenry/Henry-Thesis-Oct08.pdf   Phd thesis of Nathalie Henry
http://insitu.lri.fr/~nhenry/   Personal page of Nathalie Henry
http://www.infovis.net/printMag.php?num=116&lang=2   Number 116 Interview with Jacques Bertin
http://www.infovis.net/printRec.php?rec=llibre&lang=2#SemiologieGraphique   The book Semiologie Graphique by J. Bertin
http://www.infovis.net/printFicha.php?rec=revista&num=167&lang=2   Number 167 Patent Analysis
© Copyright InfoVis.net 2000-2014