Kohavi, R. (2012). Online Controlled Experiments: Introduction, Learnings, and Humbling Statistics.
Thelwall, M. (2012). Journal impact evaluation: a webometric perspective. Scientometrics, 92, 429--441. doi: 10.1007/s11192-012-0669-x
Alonso, O., Rose, D. E. & Stewart, B. (2008). Crowdsourcing for relevance evaluation. SIGIR Forum, 42, 9--15. doi: 10.1145/1480506.1480508
de Wit, J. (2008). Evaluating Recommender Systems. Unpublished master's thesis, University of Twente.
Völker, J., Vrandečić, D., Sure, Y. & Hotho, A. (2008). AEON - An approach to the automatic evaluation of ontologies. Applied Ontology, 3, 41--62.
Davis, J. & Goadrich, M. (2006). The relationship between Precision-Recall and ROC curves. In ICML '06: Proceedings of the 23rd International Conference on Machine Learning (pp. 233--240). New York, NY, USA: ACM. ISBN: 1-59593-383-2
Joachims, T., Granka, L., Pan, B., Hembrooke, H. & Gay, G. (2005). Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 154--161). New York, NY, USA: ACM. ISBN: 1-59593-034-5
Herlocker, J. L., Konstan, J. A., Terveen, L. G. & Riedl, J. T. (2004). Evaluating collaborative filtering recommender systems. ACM Transactions on Information Systems, 22, 5--53. doi: 10.1145/963770.963772
Järvelin, K. & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20, 422--446. doi: 10.1145/582415.582418
Järvelin, K. & Kekäläinen, J. (2000). IR evaluation methods for retrieving highly relevant documents. In SIGIR '00: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 41--48). New York, NY, USA: ACM. ISBN: 1-58113-226-3
Lewis, D. D. (1991). Evaluating text categorization. In Proceedings of the Speech and Natural Language Workshop (pp. 312--318), February. San Mateo: Morgan Kaufmann.