Denoyer, L. & Gallinari, P. (2006). The Wikipedia XML Corpus. SIGIR Forum.

Lewis, D. D., Yang, Y., Rose, T. G. & Li, F. (2004). RCV1: A New Benchmark Collection for Text Categorization Research. Journal of Machine Learning Research, 5, 361--397.

Halevy, A. Y. & Madhavan, J. (2003). Corpus-Based Knowledge Representation. In G. Gottlob & T. Walsh (eds.), IJCAI-03, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, Acapulco, Mexico, August 9-15, 2003 (pp. 1567--1572). Morgan Kaufmann.

Jiang, J. J. & Conrath, D. W. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. CoRR, cmp-lg/9709008.