TY  - GEN
AU  - Rubin, Timothy N.
AU  - Chambers, America
AU  - Smyth, Padhraic
AU  - Steyvers, Mark
A2  - 
T1  - Statistical Topic Models for Multi-Label Document Classification
JO  - 
PB  - 
AD  - 
PY  - 2011/
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://arxiv.org/abs/1107.2462
M3  - 
KW  - mining
KW  - model
KW  - text
KW  - tm
KW  - topic
KW  - toread
L1  - 
N1  - 
AB  -   Machine learning approaches to multi-label document classification have (to date) largely relied on discriminative modeling techniques such as support vector machines. A drawback of these approaches is that performance rapidly drops off as the total number of labels and the number of labels per document increase. This problem is amplified when the label frequencies exhibit the type of highly skewed distributions that are often observed in real-world datasets. In this paper we investigate a class of generative statistical topic models for multi-label documents that associate individual word tokens with different labels. We investigate the advantages of this approach relative to discriminative models, particularly with respect to classification problems involving large numbers of relatively rare labels. We compare the performance of generative and discriminative approaches on document labeling tasks ranging from datasets with several thousand labels to datasets with tens of labels. The experimental results indicate that generative models can achieve competitive multi-label classification performance compared to discriminative methods, and have advantages for datasets with many labels and skewed label frequencies. 
ER  -

TY  - JOUR
AU  - Carpena, P.
AU  - Bernaola-Galván, P.
AU  - Hackenberg, M.
AU  - Coronado, A. V.
AU  - Oliver, J. L.
T1  - Level statistics of words: Finding keywords in literary texts and symbolic sequences
JO  - Physical Review E (Statistical, Nonlinear, and Soft Matter Physics)
PY  - 2009/
VL  - 79
IS  - 3
SP  - 
EP  - 
UR  - http://bioinfo2.ugr.es/TextKeywords/
M3  - 10.1103/PhysRevE.79.035102
KW  - analysis
KW  - extraction
KW  - keyword
KW  - statistical
KW  - text
KW  - tm
KW  - topic
KW  - toread
L1  - 
SN  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Huang, Anna
AU  - Milne, David N.
AU  - Frank, Eibe
AU  - Witten, Ian H.
A2  - Theeramunkong, Thanaruk
A2  - Kijsirikul, Boonserm
A2  - Cercone, Nick
A2  - Ho, Tu Bao
T1  - Clustering Documents Using a Wikipedia-Based Concept Representation
T2  - PAKDD
PB  - Springer
CY  - 
PY  - 2009/
M2  - 
VL  - 5476
IS  - 
SP  - 628
EP  - 636
UR  - http://dblp.uni-trier.de/db/conf/pakdd/pakdd2009.html#HuangMFW09
M3  - 
KW  - background
KW  - clustering
KW  - knowledge
KW  - ontology
KW  - tm
KW  - wikipedia
L1  - 
SN  - 978-3-642-01306-5
N1  - dblp
N1  - 
AB  - 
ER  -

TY  - BOOK
AU  - Heyer, Gerhard
AU  - Quasthoff, Uwe
AU  - Wittig, Thomas
A2  - 
T1  - Text Mining: Wissensrohstoff Text
PB  - W3L-Verl.
AD  - Herdecke ; Bochum
PY  - 2008/
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://aleph.bib.uni-mannheim.de/F/?func=find-b&request=280507895&find_code=020&adjacent=N&local_base=MAN01PUBLIC&x=0&y=0
M3  - 
KW  - einführung
KW  - mining
KW  - text
KW  - tm
L1  - 
SN  - 978-3-937137-30-8
N1  - Konzepte, Algorithmen, Ergebnisse
N1  - 
AB  - 
ER  -

TY  - BOOK
AU  - 
A2  - Berendt, B.
A2  - Hotho, A.
A2  - Mladenic, D.
A2  - Semeraro, G.
T1  - From Web to Social Web: Discovering and Deploying User and Content Profiles
PB  - Springer
AD  - 
PY  - 2007/
VL  - 4736
IS  - 
SP  - 
EP  - 
UR  - http://www.springer.com/dal/home?SGWID=1-102-22-173759307-0&changeHeader=true&referer=www.springeronline.com&SHORTCUT=www.springer.com/978-3-540-74950-9
M3  - 
KW  - 2007
KW  - data
KW  - dm
KW  - mining
KW  - myown
KW  - social
KW  - tm
KW  - web
L1  - 
SN  - 978-3-540-74950-9
N1  - 
AB  - This book constitutes the refereed proceedings of the Workshop on Web Mining, WebMine 2006, held in Berlin, Germany, September 18th, 2006. Topics included are data mining based on analysis of bloggers and tagging, web mining, XML mining and further techniques of knowledge discovery. The book is especially valuable for those interested in the aspects of the Social Web (Web 2.0) and its inherent dynamic and diversity of user-generated content.
ER  -

TY  - BOOK
AU  - Feldman, Ronen
AU  - Sanger, James
A2  - 
T1  - The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data
PB  - Cambridge University Press
AD  - 
PY  - 2007/
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://www.amazon.com/Text-Mining-Handbook-Approaches-Unstructured/dp/0521836573/ref=sr_1_1?s=books&ie=UTF8&qid=1295265273&sr=1-1
M3  - 
KW  - mining
KW  - text
KW  - tm
L1  - 
SN  - 0521836573
N1  - 
AB  - 
ER  -

TY  - JOUR
AU  - Colas, Fabrice
AU  - Brazdil, Pavel
T1  - On the Behavior of SVM and Some Older Algorithms in Binary Text Classification Tasks
JO  - Text, Speech and Dialogue
PY  - 2006/
VL  - 
IS  - 
SP  - 45
EP  - 52
UR  - http://dx.doi.org/10.1007/11846406_6
M3  - 
KW  - classification
KW  - knn
KW  - nb
KW  - preprocessing
KW  - svm
KW  - text
KW  - tm
KW  - toread
L1  - 
SN  - 
N1  - 
AB  - Document classification has already been widely studied. In fact, some studies compared feature selection techniques or feature space transformation whereas some others compared the performance of different algorithms. Recently, following the rising interest towards the Support Vector Machine, various studies showed that the SVM outperforms other classification algorithms. So should we just not bother about other classification algorithms and opt always for SVM?
ER  -

TY  - JOUR
AU  - Crane, Gregory
T1  - What Do You Do with a Million Books?
JO  - D-Lib Magazine
PY  - 2006/03
VL  - 12
IS  - 3
SP  - 
EP  - 
UR  - http://www.dlib.org/dlib/march06/crane/03crane.html
M3  - 10.1045/march2006-crane
KW  - Book
KW  - Mining
KW  - Text
KW  - google
KW  - tm
KW  - toread
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - BOOK
AU  - Weiss, Sholom M.
AU  - Indurkhya, Nitin
AU  - Zhang, T.
A2  - 
T1  - Text Mining. Predictive Methods for Analyzing Unstructured Information
PB  - Springer, Berlin
AD  - 
PY  - 2004/
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://www.amazon.de/gp/redirect.html%3FASIN=0387954333%26tag=ws%26lcode=xm2%26cID=2025%26ccmID=165953%26location=/o/ASIN/0387954333%253FSubscriptionId=13CT5CVB80YFWJEPWS02
M3  - 
KW  - dm
KW  - mining
KW  - nlp
KW  - software
KW  - text
KW  - tm
L1  - 
SN  - 0387954333
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Hotho, Andreas
AU  - Maedche, Alexander
AU  - Staab, Steffen
A2  - 
T1  - Text Clustering Based on Good Aggregations
T2  - ICDM '01: Proceedings of the 2001 IEEE International Conference on Data Mining
PB  - IEEE Computer Society
CY  - Washington, DC, USA
PY  - 2001/
M2  - 
VL  - 
IS  - 
SP  - 607
EP  - 608
UR  - http://portal.acm.org/citation.cfm?id=658040
M3  - 
KW  - 2001
KW  - clustering
KW  - gruppenbildung
KW  - kmeans
KW  - myown
KW  - ontology
KW  - text
KW  - tm
L1  - 
SN  - 0-7695-1119-8
N1  - 
AB  - 
ER  -