First of all, I consider myself a foodie. As described in Wikipedia, a foodie is “a person who has an ardent or refined interest in food […] and seeks new food experiences as a hobby rather than simply eating out of convenience or hunger”. I guess this is why I so enjoy eating in restaurants where each dish surprises you.
In addition, I have a strong interest in the research and development of data science techniques such as recommender systems, information retrieval and filtering, search engines, etc. In this respect, I believe that digital systems nowadays hold a lot of user information that can be processed in order to better understand the community and, in the end, offer an improved service. In other words, crunching lots of data can be very helpful.
This post is the result of putting these two interests together.
- In my quest to find the restaurants in Barcelona that best fit my tastes and expectations, I recently decided to start visiting those that have been awarded one or more Michelin stars. However, following this strategy is somewhat risky because it means trusting a list of restaurants that is published by a private company from a particular country and produced by a set of secret experts. Check out the documentary ‘Madness of perfection’ to discover the UK-based opinion on this issue.
- To counter this one-sided opinion, I thought it would be interesting to confirm – or contradict – the Michelin ranking by comparing it to the community’s opinion that is available on the web (this is where the data science stuff comes in!). I used techniques to (1) scrape reviews of Michelin-starred restaurants from different websites and (2) process the gathered numbers in order to come up with a ‘fair’ and ‘representative’ score for each restaurant. This value is then used to re-evaluate the Michelin ranking and, as originally intended, find the restaurants in Barcelona that are worth a visit.
Note: I’d like to point out that, despite this potential lack of confidence, the Michelin-starred list is still composed of some of the best places to eat. There are probably many restaurants that should have stars but do not, or vice versa. But those with stars are definitely among the best! Hence, using the Michelin ranking as a starting point seems reasonable.
So, let’s first sketch the methodology used in this study. It can be summarized in the following points:
- The analysis considers the Michelin-starred restaurants from Spain and Catalonia (169 restaurants: 8 with 3 stars, 18 with 2 stars, and 143 with 1 star).
- The ratings are collected for each restaurant from three websites: Google Places, TripAdvisor and Verema.
- An enhanced restaurant score is designed based on the information from the three websites. It is important to note that this score takes into account both (1) the value of the restaurant rating and (2) the number of users that submitted a rating. This way it considers whether a restaurant is good (high rating value) and whether this result is trustworthy (high number of user ratings).
Now that you know how it is done, let’s dive into the outcomes so you can decide where to book a table. According to the revisited ranking, sorted by the restaurant score implemented in the study, the best 3 restaurants are:
As expected, El Celler de Can Roca (3 stars) is the best restaurant in the list. Hence, a first outcome of the study is the confirmation that this 3-starred restaurant, currently listed 2nd in the list of Top 50 Restaurants in the world, is worth a visit (if you get a chance to book a table…). The runner-up of the global list is the restaurant Akelare (3 stars), and the podium is completed by the restaurant Coque (1 star). I definitely know where I will eat on my next trips to San Sebastian and Madrid, respectively.
Let me be a little bit selfish now and restrict the ranking to the restaurants in Barcelona. The winner in this category is the Japanese restaurant Koy Shunka, followed by Dos Cielos and Tickets (all three with 1 star). They are even well positioned in the global list, ranking 6th, 16th and 18th, respectively. Koy Shunka was already on my list of ‘MUSTs’ in Barcelona, but I have never eaten there. Now I am positive I will.
Congratulations to all these restaurants, and hope to see you soon! : )
The remainder of this post explains in more detail how the study was carried out, and is aimed at those readers who want to know all the bits and pieces.
True Bayesian Average to rank Michelin-starred restaurants
The more technical details of the study are covered in two separate posts, each focusing on a different issue. Refer to them for further details on how the study was designed and implemented.
- The post How to Rank (Restaurants) reviews different ranking methodologies commonly used to interpret ratings in order to rank items. For the current study, the selected technique is the True Bayesian Average because it is a simple but effective way to represent both (1) the average rating that the restaurant gets, and (2) the number of users that rated that restaurant.
- The post Retrieving Restaurants Ratings describes how to gather restaurant ratings from three different community-based platforms providing user ratings: Google Places, TripAdvisor and Verema. The corresponding Python scripts are provided as well.
From a more general perspective, the following picture gives an overview of the techniques used in this study. The methodology is divided into two steps: the first is based on the True Bayesian Average and generates a score per restaurant and per website (green boxes); the second simply averages these scores to generate the final restaurant score (purple box).
Recall that for each website one obtains two values per restaurant: the average rating and the number of ratings. These original values are then combined using the True Bayesian Average technique in order to obtain the score for the given restaurant on that website. Note that this first step folds both the original rating and the number of ratings into the Bayes-based score. This results in 3 scores for each restaurant, one per website. Each 3-tuple is then simply averaged to produce the final restaurant score that is used to sort the restaurants.
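The two steps can be sketched in Python as follows. This is only a minimal sketch, not the scripts used in the study: it assumes the common weighted formulation of the True Bayesian Average, and the function names, the prior parameters (`prior_mean`, the average rating over all restaurants on a website, and `prior_weight`, the number of 'virtual' ratings pulling towards that average) and the sample numbers are all hypothetical. It also assumes every rating is already on the same 1-5 scale.

```python
from statistics import mean


def bayesian_average(rating, num_ratings, prior_mean, prior_weight):
    """Shrink a raw rating towards prior_mean.

    The fewer ratings a restaurant has, the closer its score stays to
    prior_mean; with many ratings the score approaches the raw rating.
    """
    total = num_ratings + prior_weight
    return (num_ratings * rating + prior_weight * prior_mean) / total


def final_score(site_stats, site_priors):
    """Average the per-website Bayesian scores of one restaurant.

    site_stats:  {website: (rating, num_ratings)}
    site_priors: {website: (prior_mean, prior_weight)}  # assumed values
    """
    return mean(
        bayesian_average(rating, n, *site_priors[site])
        for site, (rating, n) in site_stats.items()
    )


# Hypothetical priors and ratings, for illustration only.
priors = {"google": (4.0, 50), "tripadvisor": (4.0, 50), "verema": (4.0, 50)}
restaurant = {"google": (4.9, 29), "tripadvisor": (4.5, 458), "verema": (4.2, 80)}

print(round(final_score(restaurant, priors), 3))
```

With these made-up priors the output will not reproduce the scores reported in this post; the point is only to show how a rating backed by few opinions is pulled towards the website average more strongly than one backed by many.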
Enhanced Restaurant Ranking
Now that you know all the technical details of how the ranking is constructed, let’s examine the results in a little more detail than just the top-3 lists.
The following table lists the top 20 restaurants, along with many numerical details that result from the computations. The first two columns contain the ‘Global’ results, which combine the information from all three websites: the ‘rank’ indicates the position of the restaurant in the final ranking, and the ‘score’ is computed as the average of the Bayesian scores from each website. Each subsequent group of 4 columns shows the results for one particular website: the original ‘rating’ obtained directly from the website; the ‘number’ of ratings, also obtained directly from the website; the ‘score’ that results from applying the True Bayesian Average technique; and the ‘rank’, which is the position in a ranking sorted by the score obtained on that website. Note that all ratings and scores are on a 1-5 scale except the values from Verema, where a 1-10 scale is used.
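Because Verema uses a 1-10 scale, its values have to be brought to the 1-5 scale before they can be compared or averaged with the other two websites; the figures quoted in this post (for example 7.88/2 = 3.94) suggest a simple division by two. The sketch below assumes exactly that, and also shows how a ‘rank’ column can be derived by sorting on a score. The function names and the sample scores are hypothetical.

```python
def verema_to_five_point(score_out_of_ten):
    """Convert a Verema value (1-10 scale) to the 1-5 scale by halving it."""
    return score_out_of_ten / 2


def rank_by_score(scores):
    """Return {name: position}, where position 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical Verema scores on the 1-10 scale.
verema_scores = {"Restaurant A": 9.1, "Restaurant B": 7.9, "Restaurant C": 8.4}
rescaled = {name: verema_to_five_point(s) for name, s in verema_scores.items()}

print(rank_by_score(rescaled))
```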
As already noted in the introduction, the best restaurant is El Celler de Can Roca in Girona. This first position is well deserved, as its global score (4.60) is clearly higher than the others (below 4.40), and it was even considered the best restaurant on TripAdvisor and Verema. Strangely, it was only ranked 23rd on Google Places with a score of 4.48. This fact confirms that the three chosen websites represent different interests of different communities. The second best restaurant is Akelare from San Sebastian. Although it is ranked 1st on Google Places, observe how its low number of ratings on this website (29) lowered its value from the original rating of 4.9 to a score of 4.70. On TripAdvisor, in contrast, Akelare has a much higher number of ratings (458) and hence the original rating of 4.5 only decreases to a score of 4.45. The third in the list is the restaurant Coque in Madrid. This one scores fairly well on Google Places and TripAdvisor (4.59 and 4.61), but not so well on Verema (3.94, that is, 7.88/2). Also note that on TripAdvisor the original rating was a 5, but because of the low number of user opinions (129) the score was reduced to 4.61. Finally, observe that Coque is the first surprise with respect to the Michelin stars, as it has been awarded only one and yet it reached the top 3 in the current study.
Let me make a couple of additional remarks using the best-scored restaurants in Barcelona as an example. First, regarding the top-ranked Japanese restaurant Koy Shunka, it is interesting to observe how highly it is ranked on Google Places (3rd with a score of 4.69) compared to Verema (43rd with a score of 3.94, that is, 7.89/2). The reason for such a difference might be that the type of cuisine that Koy Shunka offers, Japanese, is better appreciated by the Google Places community (more international) than by the users of Verema (more Spanish-centered). As mentioned before, this variety in rating sources is not an inconvenience at all, but something that enriches the analysis considerably. Second, the restaurants Dos Cielos and Tickets have very similar global scores: 4.317 and 4.316 respectively. I want to take this opportunity to point out that the Bayesian technique used to compute the score shifts the original ratings towards the overall average, with a strength that depends on the number of user opinions. This is why the score values end up closer to each other than the original ratings from the websites. This does not pose a problem; the only consequence is that one might need to look at further decimals to decide the ranking positions.
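This shrinking effect can be seen in a small sketch. It assumes the usual weighted form of the True Bayesian Average; the prior mean (4.0), the prior weight (50) and the three sample restaurants are hypothetical values chosen only to illustrate the behaviour, not figures from the study.

```python
def bayesian_average(rating, num_ratings, prior_mean=4.0, prior_weight=50):
    """Pull a raw rating towards prior_mean; fewer ratings means a stronger pull."""
    weighted_sum = num_ratings * rating + prior_weight * prior_mean
    return weighted_sum / (num_ratings + prior_weight)


# (raw rating, number of ratings): hypothetical data.
samples = [(5.0, 10), (4.6, 200), (3.2, 40)]

raw = [rating for rating, _ in samples]
scores = [bayesian_average(rating, n) for rating, n in samples]

# The gap between best and worst narrows once ratings are shrunk.
raw_spread = max(raw) - min(raw)
score_spread = max(scores) - min(scores)

print(f"spread of raw ratings: {raw_spread:.2f}")
print(f"spread of scores:      {score_spread:.2f}")
```

Note how the restaurant with a perfect 5.0 but only 10 opinions ends up below the one rated 4.6 by 200 users, which is exactly the kind of correction the score is designed to make.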
The concluding exercise to complete the analysis is to observe whether the Michelin star-based distinction is confirmed by the data under study. I have summarized this point of view in the following picture, where you can see the complete list of restaurants sorted by the final score and represented by different colored dots: orange for 3-star restaurants, purple for 2-star, and green for 1-star. First observe again, at the top left corner, how El Celler de Can Roca clearly outperforms the rest. Also note that, while all 3-starred restaurants are indeed highly ranked (all within the top-20 list), there is no big distinction between restaurants with 1 and 2 stars, as both groups span the entire ranking. For instance, a single-starred restaurant reached the third position, and a 2-star restaurant was ranked in position 166 of 169. This observation might be a good conclusion to take home: don’t waste time debating between a 1-star and a 2-star restaurant, as the difference might be minimal; however, do trust the 3-starred restaurants, as this distinction is a strong guarantee.
Aren’t you convinced yet to book a table at El Celler de Can Roca? I do have mine for next Sant Jordi’s day!