TY  - RPRT
AU  - Hotho, Andreas
AU  - Staab, Steffen
AU  - Stumme, Gerd
T1  - Text Clustering Based on Background Knowledge
PB  - University of Karlsruhe, Institute AIFB
PY  - 2003///
VL  - 425
UR  - http://www.kde.cs.uni-kassel.de/stumme/papers/2003/hotho2003text.pdf
KW  - 2003
KW  - analysis
KW  - background
KW  - clustering
KW  - concept
KW  - fca
KW  - formal
KW  - knowledge
KW  - myown
KW  - ontologies
KW  - semantic
KW  - text
KW  - web
N1  - Publications of Gerd Stumme
N1  - Technical Report
AB  - Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. Standard partitional or agglomerative clustering methods efficiently compute results to this end. However, the bag of words representation used for these clustering methods is often unsatisfactory as it ignores relationships between important terms that do not co-occur literally. Also, it is mostly left to the user to find out why a particular partitioning has been achieved, because it is only specified extensionally. In order to deal with the two problems, we integrate background knowledge into the process of clustering text documents. First, we preprocess the texts, enriching their representations by background knowledge provided in a core ontology (in our application, Wordnet). Then, we cluster the documents by a partitional algorithm. Our experimental evaluation on Reuters newsfeeds compares clustering results with pre-categorizations of news. In the experiments, improvements of results by background knowledge compared to the baseline can be shown for many interesting tasks. Second, the clustering partitions the large number of documents to a relatively small number of clusters, which may then be analyzed by conceptual clustering. In our approach, we applied Formal Concept Analysis. Conceptual clustering techniques are known to be too slow for directly clustering several hundreds of documents, but they give an intensional account of cluster results. They allow for a concise description of commonalities and distinctions of different clusters. With background knowledge they even find abstractions like “food” (vs. specializations like “beef” or “corn”). Thus, in our approach, partitional clustering reduces first the size of the problem such that it becomes tractable for conceptual clustering, which then facilitates the understanding of the results.
ER  -

TY  - CONF
AU  - Stumme, Gerd
A2  - Bock, H.-H.
A2  - Polasek, W.
T1  - Attribute Exploration with Background Implications and Exceptions
T2  - Data Analysis and Information Systems. Statistical and Conceptual Approaches. Proc. GfKl'95. Studies in Classification, Data Analysis, and Knowledge Organization 7
PB  - Springer
CY  - Heidelberg
PY  - 1996///
SP  - 457
EP  - 469
UR  - http://www.kde.cs.uni-kassel.de/stumme/papers/1995/P1781-GfKl95.pdf
KW  - 1996
KW  - acquisition
KW  - analysis
KW  - attribute
KW  - background
KW  - concept
KW  - exploration
KW  - fca
KW  - formal
KW  - implications
KW  - knowledge
KW  - lattices
KW  - myown
N1  - Publications of Gerd Stumme
ER  -