REFEREE: an open framework for practical testing of recommender systems using ResearchIndex
D. Cosley, S. Lawrence, and D. Pennock. VLDB '02: Proceedings of the 28th International Conference on Very Large Data Bases, pages 35--46. VLDB Endowment, (2002)
Automated recommendation (e.g., personalized product recommendation on an ecommerce web site) is an increasingly valuable service associated with many databases--typically online retail catalogs and web logs. Currently, a major obstacle for evaluating recommendation algorithms is the lack of any standard, public, real-world testbed appropriate for the task. In an attempt to fill this gap, we have created REFEREE, a framework for building recommender systems using ResearchIndex--a huge online digital library of computer science research papers--so that anyone in the research community can develop, deploy, and evaluate recommender systems relatively easily and quickly. ResearchIndex is in many ways ideal for evaluating recommender systems, especially so-called hybrid recommenders that combine information filtering and collaborative filtering techniques. The documents in the database are associated with a wealth of content information (author, title, abstract, full text) and collaborative information (user behaviors), as well as linkage information via the citation structure. Our framework supports more realistic evaluation metrics that assess user buy-in directly, rather than resorting to offline metrics like prediction accuracy that may have little to do with end user utility. The sheer scale of ResearchIndex (over 500,000 documents with thousands of user accesses per hour) will force algorithm designers to make real-world trade-offs that consider performance, not just accuracy. We present our own trade-off decisions in building an example hybrid recommender called PD-Live. The algorithm uses content-based similarity information to select a set of documents from which to recommend, and collaborative information to rank the documents. PD-Live performs reasonably well compared to other recommenders in ResearchIndex.
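
The two-stage idea described for PD-Live (content similarity to pick candidates, collaborative information to rank them) can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard similarity over token sets, the co-access count used for ranking, the function name `recommend`, its parameters, and the toy data are all assumptions chosen only to make the select-then-rank structure concrete.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(query_doc, doc_tokens, user_history, k_candidates=3, n_results=2):
    """Two-stage hybrid recommendation (illustrative assumption, not PD-Live itself).

    doc_tokens:   dict doc_id -> set of content tokens (hypothetical format)
    user_history: dict user_id -> set of accessed doc_ids (hypothetical format)
    """
    # Stage 1 (content): keep the k documents most similar to the query document.
    query_tokens = doc_tokens[query_doc]
    candidates = sorted(
        (d for d in doc_tokens if d != query_doc),
        key=lambda d: (-jaccard(query_tokens, doc_tokens[d]), d),
    )[:k_candidates]

    # Stage 2 (collaborative): count users who accessed both the query
    # document and each candidate, then rank candidates by that count.
    co_access = Counter()
    for docs in user_history.values():
        if query_doc in docs:
            for d in candidates:
                if d in docs:
                    co_access[d] += 1

    ranked = sorted(candidates, key=lambda d: (-co_access[d], d))
    return ranked[:n_results]


# Toy data, invented for this example.
doc_tokens = {
    "p1": {"recommender", "collaborative", "filtering"},
    "p2": {"recommender", "hybrid", "filtering"},
    "p3": {"collaborative", "filtering", "evaluation"},
    "p4": {"database", "index", "query"},
    "p5": {"recommender", "evaluation"},
}
user_history = {
    "u1": {"p1", "p3"},
    "u2": {"p1", "p3", "p4"},
    "u3": {"p1", "p2"},
    "u4": {"p2", "p5"},
}

print(recommend("p1", doc_tokens, user_history))  # ['p3', 'p2']
```

In this toy run, "p4" is co-accessed with "p1" by one user but is never recommended, because the content stage has already excluded it from the candidate set. Restricting the collaborative stage to a small candidate set is one way to keep ranking cheap at large scale, which is the kind of performance trade-off the abstract alludes to.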