
Scalable Recommendation System with MapReduce

Kavin Nagasubramanian, I. Fazil Ahmed, T. Hariharan, S. Surya Prabhakaran, S. Karthiga


When the number of users in a recommendation system grows very large, the standard approach of sequentially examining each item and looking at all interacting users does not scale. Our proposed system addresses this problem with a MapReduce algorithm for item comparison and Top-N recommendation that scales linearly with the number of users. We use similarity-based neighborhood methods for recommendation, which infer their predictions by finding users with similar taste or items that have been similarly rated. In MapReduce, the data to be processed is split and stored block-wise across the machines of a cluster in a distributed file system (DFS) and is usually represented as (key, value) tuples. The proposed parallel algorithm partitions the data across the cluster and supports a wide range of similarity measures.
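The idea of user-partitioned item comparison can be illustrated with a small in-memory sketch. It is not the authors' implementation: the helper `run_mapreduce`, the function names, the binary (interacted or not) data model, and the choice of cosine similarity are all assumptions made for illustration, and a real deployment would run the same map and reduce functions on a cluster rather than in one process.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def run_mapreduce(records, mapper, reducer):
    """Simulate one MapReduce pass in memory: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key, values in groups.items():
        output.extend(reducer(key, values))
    return output


# Pass 1: group (user, item) interactions by user.
def map_by_user(interaction):
    user, item = interaction
    yield user, item


def reduce_user_items(user, items):
    yield user, sorted(set(items))


# Pass 2: each user record emits the item pairs that user interacted with.
def map_item_pairs(user_record):
    _user, items = user_record
    for item in items:
        yield (item, item), 1  # diagonal entry: interaction count of the item
    for a, b in combinations(items, 2):
        yield (a, b), 1  # off-diagonal entry: co-occurrence of a and b


def reduce_sum(pair, ones):
    yield pair, sum(ones)


def item_similarities(interactions):
    """Cosine similarity between items over binary interaction data (assumed measure)."""
    user_items = run_mapreduce(interactions, map_by_user, reduce_user_items)
    counts = dict(run_mapreduce(user_items, map_item_pairs, reduce_sum))
    return {
        (a, b): cooc / sqrt(counts[(a, a)] * counts[(b, b)])
        for (a, b), cooc in counts.items()
        if a != b
    }


def recommend(seen_items, sims, n):
    """Top-N: rank unseen items by summed similarity to the items already seen."""
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in seen_items and b not in seen_items:
            scores[b] += s
        elif b in seen_items and a not in seen_items:
            scores[a] += s
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical toy data: (user, item) interaction pairs.
interactions = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"),
]
sims = item_similarities(interactions)
top = recommend({"A"}, sims, 1)
```

Because the second pass treats every user as an independent record, adding users adds map inputs without changing the work done for any existing user. Swapping the formula in `item_similarities` for another function of the aggregated counts is one way such a scheme could accommodate other similarity measures.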


MapReduce, Parallel Algorithm, Similarity, Pairwise Comparison.







This work is licensed under a Creative Commons Attribution 3.0 License.