Recommender Systems - Knowledge & Content based

This is the second post of the Recommendation Systems series. Here, we discuss the knowledge-based and the content-based approaches to recommendation, using both explicit and implicit user feedback as input for our examples wherever possible. We also discuss the “Cold Start problem” in the methods that suffer from it, and how we can deal with it.

Knowledge-based Recommenders

As the name suggests, knowledge-based recommendation systems use knowledge they have already acquired about both the users and the items in order to predict which items users may like.

The easiest way to explain how such a recommender system works is through an example. The most frequently used example in recommenders is the problem of recommending movies to users. Each movie is described by some characteristics, such as the cast, the director, the year in which the movie was made, the genre, the duration, etc. On the other hand, each user has stated preferences for each characteristic, such as favourite actors or directors, movie genres he loves or hates, how much he prefers old movies to newer ones, etc.

In the knowledge-based approach, we assume that we have all the necessary knowledge about both the movies and the users. For the movies, we may have had a specialist specify each characteristic, and we may also have asked the user about his preferences. Since in this case we have all the necessary data, the algorithmic problem is very easy to solve. We just need to estimate how well each movie’s characteristics match the user’s tastes to find out whether the user is going to like the movie or not.
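The matching step can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the movie data, the preference fields, and the weights (1.0 for genre and director, 0.5 per favourite actor) are all hypothetical choices made for the example.

```python
# Hypothetical catalogue: every title, name and value is made up for illustration.
movies = {
    "Movie A": {"genre": "sci-fi", "director": "Director X", "actors": {"Actor 1", "Actor 2"}},
    "Movie B": {"genre": "comedy", "director": "Director Y", "actors": {"Actor 3"}},
    "Movie C": {"genre": "sci-fi", "director": "Director Y", "actors": {"Actor 2", "Actor 3"}},
}

# Hypothetical preferences that the user stated explicitly.
user_prefs = {
    "liked_genres": {"sci-fi"},
    "disliked_genres": {"comedy"},
    "favourite_directors": {"Director X"},
    "favourite_actors": {"Actor 2"},
}


def relevance(movie, prefs):
    """Score how well one movie's characteristics match the stated preferences."""
    score = 0.0
    if movie["genre"] in prefs["liked_genres"]:
        score += 1.0
    if movie["genre"] in prefs["disliked_genres"]:
        score -= 1.0
    if movie["director"] in prefs["favourite_directors"]:
        score += 1.0
    # Each favourite actor in the cast adds a smaller bonus (assumed weight).
    score += 0.5 * len(movie["actors"] & prefs["favourite_actors"])
    return score


def recommend(movies, prefs, top_n=2):
    """Return the titles with the highest relevance score, best first."""
    ranked = sorted(movies, key=lambda title: relevance(movies[title], prefs), reverse=True)
    return ranked[:top_n]
```

Note that nothing here depends on what the user watched before: a brand-new user with stated preferences, or a brand-new movie with known characteristics, can be scored right away.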

The advantage of this approach is that we don’t need any past data about the user’s interaction with the platform. We don’t need to know what movies the user watched in the past and whether he liked them or not, nor what movies other users have watched. This is very important because the relevance of items to new users, and of newly introduced items to any user, can be estimated immediately.

On the other hand, the scenario described above rarely meets the requirements of a real-world application. Asking users about their preferences is almost never a sustainable choice. To solve this issue there is another approach, called “content-based recommender systems”, which we describe in the next section.

Content-based Recommenders

Content-based recommender systems were created to deal with the drawback of the knowledge-based approach. In the knowledge-based approach we needed the user to explicitly tell us which characteristics he likes or dislikes, which, as we said, is not always an option.

The content-based approach looks at the user’s past interactions with the platform. For example, in the case of an e-shop like Amazon, we would use data about what the user searched for and what he actually bought. In the case of a streaming service like Netflix or Spotify, we would use data about what media the user watched or listened to and what he liked. Notice that in this case we don’t know which specific characteristics the user likes; instead, we know which items he likes.

Remember that in this class of recommenders we still know the characteristics of each item (e.g. movie). A simple algorithm would be the following. Compute each item’s similarity to every other item. Use that score to find the most similar items for each item. Then look at the items the user liked in the past, find the items most similar to them, and recommend those to the user.
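The simple algorithm above can be sketched as follows. The post does not name a similarity measure, so cosine similarity over binary feature vectors is an assumption here, and the items and their feature flags are hypothetical.

```python
import math

# Hypothetical item features: binary flags for [action, comedy, sci-fi, drama].
items = {
    "Movie A": [1, 0, 1, 0],
    "Movie B": [1, 0, 0, 0],
    "Movie C": [0, 1, 0, 1],
    "Movie D": [1, 0, 1, 1],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_similar(liked, items, top_n=2):
    """Rank items the user has not liked yet by similarity to the closest liked item."""
    if not liked:
        return []  # no history to work from
    scores = {}
    for candidate, features in items.items():
        if candidate in liked:
            continue
        scores[candidate] = max(cosine(features, items[title]) for title in liked)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

For a user who liked only "Movie A", `recommend_similar({"Movie A"}, items)` ranks "Movie D" first, because it shares both the action and sci-fi flags, followed by "Movie B". In a real system the pairwise similarities would typically be precomputed rather than recalculated per request.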

From a technical perspective, content-based recommender systems are classifier systems derived from machine learning research. They use supervised machine learning to induce a classifier that can discriminate between items likely to be of interest to the user and those likely to be uninteresting.
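To make the classifier view concrete, here is a minimal sketch that learns a per-user model from labelled history. The post does not prescribe an algorithm, so a nearest-centroid classifier is used purely because it is short; the feature flags and the history are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one example per class")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train(labelled_items):
    """Learn one centroid for liked items (label 1) and one for disliked items (label 0)."""
    liked = [features for features, label in labelled_items if label == 1]
    disliked = [features for features, label in labelled_items if label == 0]
    return centroid(liked), centroid(disliked)


def predict(model, features):
    """Return 1 if the item is closer to the liked centroid, else 0."""
    liked_centroid, disliked_centroid = model
    closer_to_liked = (
        squared_distance(features, liked_centroid)
        < squared_distance(features, disliked_centroid)
    )
    return 1 if closer_to_liked else 0


# Hypothetical history for one user; flags are [action, comedy, sci-fi, drama].
history = [
    ([1, 0, 1, 0], 1),
    ([1, 0, 0, 0], 1),
    ([0, 1, 0, 1], 0),
    ([0, 1, 0, 0], 0),
]
model = train(history)
```

With this history, `predict(model, [1, 0, 1, 1])` returns 1 and `predict(model, [0, 1, 0, 1])` returns 0. Any other supervised classifier could take the place of the centroids; the point is only that the labels come from the user's own past feedback.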

The advantage of this approach is that you can recommend items to users without needing to know exactly what their tastes are, which is the most common case in a real-world application. Also, in this approach we don’t compare users with each other; we only care about what the user liked in the past.

There are two issues here, though. The first one is the “Cold Start problem” that we explained in the previous chapter. Since the content-based approach relies on a user’s past interactions, it has nothing to work with for a brand-new user. In truth, “Cold Start” problems do not have a generic solution, and the answer usually involves some business logic, such as recommending the most frequently ordered items.
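One way such business logic might look is a popularity fallback, sketched below. The order log and the `personalised` callable (standing in for whatever content-based recommender is in place) are hypothetical.

```python
from collections import Counter


def most_popular(orders, top_n=2):
    """Return the most frequently ordered items across all users."""
    counts = Counter(item for _, item in orders)
    return [item for item, _ in counts.most_common(top_n)]


def recommend(user, orders, personalised, top_n=2):
    """Use the personalised recommender when the user has history, else fall back."""
    has_history = any(buyer == user for buyer, _ in orders)
    if has_history:
        return personalised(user)[:top_n]
    return most_popular(orders, top_n)


# Hypothetical order log of (user, item) pairs.
orders = [
    ("alice", "book"),
    ("bob", "book"),
    ("bob", "lamp"),
    ("carol", "lamp"),
    ("carol", "book"),
    ("dave", "pen"),
]
```

A user who does not appear in `orders` gets the two best sellers, "book" and "lamp", while a returning user is routed to the personalised recommender.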

The second issue is that we still need data about the products. We need to specify which characteristics are important for them, and populate these characteristics with the proper values. In many cases this is not an easy task, and we need a different approach that can recommend items to users based only on the previous interactions between users and items, not on their characteristics. This approach is called “Collaborative Filtering”, and it is the most widely used approach in recommendation systems today.

In the following chapter, we will describe how Collaborative Filtering works and the different approaches that can be used, look at some algorithms and how they work, and finally present a working example.

Written on March 3, 2018