TIR 2010
7th International Workshop on Text-based Information Retrieval
in conjunction with DEXA 2010
University of Deusto
Bilbao, Spain
30 August - 3 September 2010
20 Newsgroups
Abstract
This data set consists of 20,000 messages taken from 20 Usenet newsgroups.
Information files:
description of the data
Data files:
20_newsgroups.tar.gz (17.3M; 61.6M uncompressed)
mini_newsgroups.tar.gz (1.9M; 6.2M uncompressed): a subset composed of 100 articles from each newsgroup.
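As a quick sanity check on a downloaded archive, the per-newsgroup article counts can be tallied without unpacking it. The following is a minimal sketch, not part of the data set's own tooling. It assumes (the description above does not confirm this) that the archive is laid out as one top-level directory containing one subdirectory per newsgroup, with one file per article. The function name `count_articles_per_group` is hypothetical.

```python
import tarfile
from collections import Counter


def count_articles_per_group(archive_path):
    """Count regular files per parent directory in a .tar.gz archive.

    Assumption: members are laid out as
    <top-level dir>/<newsgroup>/<article file>.
    """
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            # Skip directories, links, and other non-file members.
            if not member.isfile():
                continue
            parts = member.name.strip("/").split("/")
            # Need at least top-level dir, newsgroup dir, and file name.
            if len(parts) >= 3:
                counts[parts[-2]] += 1
    return counts
```

If the layout assumption holds, calling `count_articles_per_group("mini_newsgroups.tar.gz")` should report 20 groups with 100 articles each, matching the description of the subset.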
C. Kohlschütter, P. Fankhauser, and W. Nejdl. Proc. of 3rd ACM International Conference on Web Search and Data Mining (WSDM 2010), New York City, NY, USA, (2010)
J. Poelmans, D. Ignatov, S. Viaene, G. Dedene, and S. Kuznetsov. Advances in Data Mining. Applications and Theoretical Aspects, Volume 7377 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2012)
M. Hearst. Proceedings of the 14th conference on Computational linguistics, 2, pages 539--545. Stroudsburg, PA, USA, Association for Computational Linguistics, (1992)
B. Martins, H. Manguinhas, and J. Borbinha. Proceedings of the International Conference on Semantic Computing, pages 1--9. IEEE Computer Society, (August 2008)
D. Lin. Proceedings of the 17th international conference on Computational linguistics, pages 768--774. Morristown, NJ, USA, Association for Computational Linguistics, (1998)
C. Luo, Y. Li, and S. Chung. Data & Knowledge Engineering 68 (11): 1271--1288, (2009). Including Special Section: Conference on Privacy in Statistical Databases (PSD 2008) - Six selected and extended papers on Database Privacy.
A. Hotho, S. Staab, and G. Stumme. Proceedings of the 2003 IEEE International Conference on Data Mining, pages 541--544 (Poster). Melbourne, Florida, IEEE Computer Society, (November 2003)
A. Hotho, S. Staab, and G. Stumme. Knowledge Discovery in Databases: PKDD 2003, 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Volume 2838 of LNAI, pages 217--228. Heidelberg, Springer, (2003)
E. Breck, Y. Choi, and C. Cardie. IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 2683--2688. San Francisco, CA, USA, Morgan Kaufmann Publishers Inc., (2007)
G. Ifrim, M. Theobald, and G. Weikum. Proceedings of the 22nd International Conference on Machine Learning - Learning in Web Search (LWS 2005), pages 18--26. Bonn, Germany, (2005)
L. Baker and A. McCallum. Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval, pages 96--103. Melbourne, AU, ACM Press, New York, US, (1998)
A. Hotho, A. Maedche, and S. Staab. ICDM '01: Proceedings of the 2001 IEEE International Conference on Data Mining, pages 607--608. Washington, DC, USA, IEEE Computer Society, (2001)
A. Hotho and G. Stumme. Proceedings of FGML Workshop, pages 37--45. Special Interest Group of German Informatics Society (FGML --- Fachgruppe Maschinelles Lernen der GI e.V.), (2002)
P. Cimiano, A. Hotho, and S. Staab. Proceedings of the Conference on Language Resources and Evaluation (LREC), Lisbon, Portugal, ELRA - European Language Resources Association, (May 2004)