Collaborative Filtering - Methodology

Methodology

Collaborative filtering systems have many forms, but many common systems can be reduced to two steps:

  1. Look for users who share the same rating patterns with the active user (the user whom the prediction is for).
  2. Use the ratings from those like-minded users found in step 1 to calculate a prediction for the active user

This falls under the category of user-based collaborative filtering. A specific application of this is the user-based Nearest Neighbor algorithm.

Alternatively, item-based collaborative filtering invented by Amazon.com (users who bought x also bought y), proceeds in an item-centric manner:

  1. Build an item-item matrix determining relationships between pairs of items
  2. Using the matrix, and the data on the current user, infer his taste

See, for example, the Slope One item-based collaborative filtering family.

Another form of collaborative filtering can be based on implicit observations of normal user behavior (as opposed to the artificial behavior imposed by a rating task). In these systems you observe what a user has done together with what all users have done (what music they have listened to, what items they have bought) and use that data to predict the user's behavior in the future or to predict how a user might like to behave if only they were given a chance. These predictions then have to be filtered through business logic to determine how these predictions might affect what a business system ought to do. It is, for instance, not useful to offer to sell somebody some music if they already have demonstrated that they own that music or, considering another example, it is not useful to suggest more travel guides for Paris to someone who already bought a travel guide for this city.

Relying on a scoring or rating system which is averaged across all users ignores specific demands of a user, and is particularly poor in tasks where there is large variation in interest, for example in the recommendation of music. However, there are other methods to combat information explosion, for example web search, data clustering, and more.

Read more about this topic:  Collaborative Filtering

Famous quotes containing the word methodology:

    One might get the impression that I recommend a new methodology which replaces induction by counterinduction and uses a multiplicity of theories, metaphysical views, fairy tales, instead of the customary pair theory/observation. This impression would certainly be mistaken. My intention is not to replace one set of general rules by another such set: my intention is rather to convince the reader that all methodologies, even the most obvious ones, have their limits.
    Paul Feyerabend (1924–1994)