
Inf@Vis!

The digital magazine of InfoVis.net

Collaborative Filtering
by Juan C. Dürsteler [message nº 155]

Collaborative filtering is increasingly present as an integral part of commercial web sites. “Memory based” algorithms are the simplest to implement, yet the most effective at recommending products and predicting preferences.
Recommendation refinement page at Amazon.com. The system proposes recommendations based on the user's purchases and on the purchases of other users with a similar profile. Several forms let you refine your preferences further. If you don't use them, Amazon infers your preferences from your purchases, the pages you visit and your "wish list".
Source: Screenshot by the author.

We ended the last article by noting that many tools built on a 3D metaphor imitating the real world in which we move have had less success in daily use than other, more abstract alternatives. Examples include Vios and PlanetOasis, among others (see the section “The 3D adventure” in issue number 93).

There are, nevertheless, other ways of applying the proposals of information foraging theory that are more abstract but perhaps more effective. One of the approaches working successfully in this field is collaborative filtering.

In a society of hunter-gatherers, the equivalent of collaborative filtering is twofold: the implicit recommendations that can be derived from studying the scent left by animals and other hunters, and the explicit “word of mouth” recommendations that hunters give each other when they gather, based on their previous experiences. Both help in finding the most appropriate places to hunt and collect the preferred species, the ones that yield the maximum energy for the minimum expenditure, whatever the season.

On the Internet, information hunters find the information scent left by web sites and by the links they visit, but there are also opinion forums, chats and weblogs where they (we) can find explicit advice about where to find the information we are looking for.

In collaborative filtering, groups of users pool their evaluations of the information they have found or that they own, helping each other to find information that interests them. This happens informally through news, chats, weblogs and even e-mail.

A more sophisticated way of promoting collaborative filtering is provided by recommendation systems, like the one you can find on the Amazon.com web site.

These systems use explicit ratings (assessments of the products for sale or of the documents in a database) or implicit ratings (the purchases that each user makes) to build user profiles. The profiles allow the system to predict which products or documents the user has not yet seen but might find interesting.

MovieLens: the recommendation system proposes a series of movies after you rate a set of predefined movies shown when you create your account.
Source: Screenshot by the author.

In an interesting article* presented at the 2004 ACM Symposium on Applied Computing, several researchers at the University of Bucaramanga, Colombia, compare four common types of algorithms for collaborative filtering:

  • Memory based. These were the first to be proposed and are the easiest to implement. They have two phases:

    • Computation of correlation coefficients. The correlation coefficient between two users' preferences measures how closely their opinions agree on the objects that both have evaluated.

    • Preference prediction. The active user's preference for a given object can be predicted from the sum of the ratings that other users gave to that object, each weighted by the correlation coefficient between that user and the active user.

  • Dependency networks. A dependency network is a graph whose nodes (the documents of the database) are linked by arcs that represent dependency relationships between pairs of documents. The states of each node represent the possible values of the ratings. On top of this structure, probabilistic decision-tree algorithms are built that “learn” to predict the preferences.

  • On-line learning. Continuous interactive processes in which every attribute is considered to be an “expert predictor”. Each expert receives a weight that measures its reliability in the prediction task. There are different types of on-line learning algorithms, whose complexity doesn’t allow us to describe them here.

  • Support Vector Machines. Collaborative filtering can be treated as a classification task. It’s possible to create a classification model for every user based on the votes that the user and other users have cast on particular objects. A set of examples of user classifications and their votes on known objects is then presented to a learning algorithm, so that it can predict the votes that will be cast for new, not yet classified objects.
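The two phases of the memory-based approach described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation evaluated in the cited article: the toy `ratings` data, the user and item names, and the function names are all hypothetical. It also assumes Pearson correlation as the correlation coefficient and the common mean-centred form of the weighted sum, neither of which is specified in the text.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ana":   {"A": 5, "B": 3, "C": 4},
    "ben":   {"A": 4, "B": 2, "C": 3, "D": 5},
    "carla": {"A": 1, "B": 5, "C": 2, "D": 1},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(u, v, ratings):
    """Phase 1: correlation between two users over the items both have rated."""
    common = ratings[u].keys() & ratings[v].keys()
    if len(common) < 2:
        return 0.0  # not enough overlap to say anything
    mu = mean(ratings[u][i] for i in common)
    mv = mean(ratings[v][i] for i in common)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)
               * sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def predict(active, item, ratings):
    """Phase 2: correlation-weighted sum of other users' ratings for `item`.

    Assumption: each rating is centred on its owner's mean and the result
    is added to the active user's mean; the text only says "weighted sum".
    """
    base = mean(ratings[active].values())
    num = den = 0.0
    for other, their in ratings.items():
        if other == active or item not in their:
            continue
        w = pearson(active, other, ratings)
        num += w * (their[item] - mean(their.values()))
        den += abs(w)
    return base + num / den if den else base

if __name__ == "__main__":
    print(pearson("ana", "ben", ratings))    # ana and ben agree on A, B, C
    print(pearson("ana", "carla", ratings))  # ana and carla tend to disagree
    print(predict("ana", "D", ratings))      # predicted rating for an unseen item
```

In this toy data "ana" agrees with "ben" and disagrees with "carla", so both neighbours push the prediction for item "D" above ana's own average. The raw prediction is not clamped to the 1-5 scale; a real system would probably do that and would also limit the sum to the most similar neighbours.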

These algorithms were run against three datasets: MovieLens, an experimental movie recommendation system of the University of Minnesota; EachMovie, another experimental system that ran at the Digital Equipment Corporation's Systems Research Center (SRC) for 18 months from 1995 to 1997 and whose data is available to researchers; and Jester, a joke recommendation system developed at the University of California, Berkeley.

The results of the comparison show that, across a wide range of conditions, memory-based algorithms outperform the others in performance and in the quality of preference prediction. They also turn out to be the conceptually simplest and the easiest to implement.

Recommendation systems are becoming an integral part of e-commerce web sites. You only have to see the efficiency of the collaborative filtering system at Amazon to realise that there are simpler and more straightforward ways to take advantage of information foraging theory than 3D metaphors, which are possibly more visually appealing, yet less effective.


* A Comparison of Several Predictive Algorithms for Collaborative Filtering on Multi-Valued Ratings. Maritza L. Calderón Benavides, Cristina N. González-Caro, José de J. Pérez-Alcázar, Juan C. García-Díaz, Joaquin Delgado. Proceedings of the 2004 ACM Symposium on Applied Computing, pp. 1033-1039.

Links of this issue:

http://www.infovis.net/printMag.php?num=154&lang=2   Num 154 Web Forager.
http://www.infovis.net/printMag.php?num=93&lang=2   Num 93 Two years later.
http://www.amazon.com   Amazon.com website
http://movielens.umn.edu   MovieLens website. Movie recommendation system.
http://shadow.ieor.berkeley.edu/humor/   Jester website. Joke recommendation system.
© Copyright InfoVis.net 2000-2014