Hotho, A.; Staab, S. & Stumme, G.
(2003):
Explaining Text Clustering Results using Semantic Structures.
In: Knowledge Discovery in Databases: PKDD 2003, 7th European Conference on Principles and Practice of Knowledge Discovery in Databases,
Heidelberg.
Common text clustering techniques offer rather poor capabilities
for explaining to their users why a particular result has been
achieved. They have the disadvantage that they do not relate
semantically nearby terms and that they cannot explain how
resulting clusters are related to each other.
In this paper, we discuss a way of integrating a large thesaurus
and the computation of lattices of resulting clusters into common text clustering
in order to overcome these two problems.
As its major result, our approach achieves an explanation using an
appropriate level of granularity at the concept level as well as
an appropriate size and complexity of the explaining lattice of
resulting clusters.
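The idea of a lattice over resulting clusters can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each cluster has already been summarized by a set of thesaurus concepts, uses invented cluster names and concept labels, and enumerates formal concepts by brute force, which is only feasible for a handful of clusters.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a binary context.

    `context` maps each object (here: a cluster name) to the set of
    attributes (here: concept labels) that describe it.
    """
    objects = list(context)
    all_attrs = set().union(*context.values()) if context else set()

    def common_attrs(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        result = set(all_attrs)
        for o in objs:
            result &= context[o]
        return result

    def objects_having(attrs):
        # Objects that carry every attribute in attrs.
        return {o for o in objects if attrs <= context[o]}

    concepts = set()
    # Brute force: close every subset of objects (exponential in the number of objects).
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            intent = common_attrs(subset)
            extent = objects_having(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical toy data: three clusters described by thesaurus concepts.
clusters = {
    "c1": {"commodity", "grain"},
    "c2": {"commodity", "metal"},
    "c3": {"finance"},
}
concepts = formal_concepts(clusters)
```

Ordering these (extent, intent) pairs by extent inclusion yields the lattice. In this toy context, the concept with intent {"commodity"} sits above the concepts for c1 and c2, which is the kind of relation between clusters the abstract refers to.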
@inproceedings{hotho03explaining,
author = {Hotho, Andreas and Staab, Steffen and Stumme, Gerd},
title = {Explaining Text Clustering Results using Semantic Structures},
editor = {Lavrač, Nada and Gamberger, Dragan and Blockeel, Hendrik and Todorovski, Ljupco},
booktitle = {Knowledge Discovery in Databases: PKDD 2003, 7th European Conference on Principles and Practice of Knowledge Discovery in Databases},
series = {LNAI},
publisher = {Springer},
address = {Heidelberg},
year = {2003},
volume = {2838},
pages = {217--228},
url = {http://www.kde.cs.uni-kassel.de/stumme/papers/2003/hotho2003explaining.pdf},
keywords = {2003, analysis, clustering, concept, fca, formal, myown, ontologies, semantic, semantics, text},
abstract = {Common text clustering techniques offer rather poor capabilities
for explaining to their users why a particular result has been
achieved. They have the disadvantage that they do not relate
semantically nearby terms and that they cannot explain how
resulting clusters are related to each other.
In this paper, we discuss a way of integrating a large thesaurus
and the computation of lattices of resulting clusters into common text clustering
in order to overcome these two problems.
As its major result, our approach achieves an explanation using an
appropriate level of granularity at the concept level as well as
an appropriate size and complexity of the explaining lattice of
resulting clusters.}
}
%0 = inproceedings
%A = Hotho, Andreas and Staab, Steffen and Stumme, Gerd
%B = Knowledge Discovery in Databases: PKDD 2003, 7th European Conference on Principles and Practice of Knowledge Discovery in Databases
%C = Heidelberg
%D = 2003
%I = Springer
%T = Explaining Text Clustering Results using Semantic Structures
%U = http://www.kde.cs.uni-kassel.de/stumme/papers/2003/hotho2003explaining.pdf
Hotho, A.; Staab, S. & Stumme, G.
(2003):
Text Clustering Based on Background Knowledge.
Text document clustering plays an important role in providing intuitive
navigation and browsing mechanisms by organizing large amounts of information
into a small number of meaningful clusters. Standard partitional or agglomerative
clustering methods efficiently compute results to this end.
However, the bag of words representation used for these clustering methods is often
unsatisfactory as it ignores relationships between important terms that do not
co-occur literally. Also, it is mostly left to the user to find out why a particular partitioning
has been achieved, because it is only specified extensionally. In order to
deal with the two problems, we integrate background knowledge into the process of
clustering text documents.
First, we preprocess the texts, enriching their representations by background knowledge
provided in a core ontology, in our application WordNet. Then, we cluster
the documents by a partitional algorithm. Our experimental evaluation on Reuters
newsfeeds compares clustering results with pre-categorizations of news. In the experiments,
improvements of results by background knowledge compared to the baseline
can be shown for many interesting tasks.
Second, the clustering partitions the large number of documents to a relatively small
number of clusters, which may then be analyzed by conceptual clustering. In our approach,
we applied Formal Concept Analysis. Conceptual clustering techniques are
known to be too slow for directly clustering several hundreds of documents, but they
give an intensional account of cluster results. They allow for a concise description
of commonalities and distinctions of different clusters. With background knowledge
they even find abstractions like “food” (vs. specializations like “beef” or “corn”).
Thus, in our approach, partitional clustering reduces first the size of the problem
such that it becomes tractable for conceptual clustering, which then facilitates the
understanding of the results.
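The enrichment step described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the report's code: the `HYPERNYMS` dictionary is a hypothetical stand-in for a WordNet lookup, the "add the concept count to the term vector" strategy is only one possible way to combine terms and concepts, and the two documents are invented.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet hypernym lookup: term -> more general concept.
HYPERNYMS = {"beef": "food", "corn": "food"}


def enrich(tokens, hypernyms=HYPERNYMS):
    """Build a bag-of-words vector and add a count for each term's concept."""
    bag = Counter(tokens)
    # Iterate over a snapshot, because the loop adds new keys to the bag.
    for term, count in list(bag.items()):
        concept = hypernyms.get(term)
        if concept is not None:
            bag[concept] += count
    return bag


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sum(x * x for x in a.values()) ** 0.5
    norm_b = sum(x * x for x in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented example documents that share no literal term.
doc1 = ["beef", "prices", "rise"]
doc2 = ["corn", "exports", "fall"]

plain = cosine(Counter(doc1), Counter(doc2))   # no shared terms
enriched = cosine(enrich(doc1), enrich(doc2))  # both now carry "food"
```

With plain term vectors the two documents have zero similarity; after enrichment they overlap on the shared concept, so a partitional algorithm working on these vectors has a reason to place them closer together.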
@techreport{hotho03textclustering,
author = {Hotho, Andreas and Staab, Steffen and Stumme, Gerd},
title = {Text Clustering Based on Background Knowledge},
type = {Technical Report},
year = {2003},
volume = {425},
url = {http://www.kde.cs.uni-kassel.de/stumme/papers/2003/hotho2003text.pdf},
keywords = {2003, analysis, background, clustering, concept, fca, formal, knowledge, myown, ontologies, semantic, text, web},
abstract = {Text document clustering plays an important role in providing intuitive
navigation and browsing mechanisms by organizing large amounts of information
into a small number of meaningful clusters. Standard partitional or agglomerative
clustering methods efficiently compute results to this end.
However, the bag of words representation used for these clustering methods is often
unsatisfactory as it ignores relationships between important terms that do not
co-occur literally. Also, it is mostly left to the user to find out why a particular partitioning
has been achieved, because it is only specified extensionally. In order to
deal with the two problems, we integrate background knowledge into the process of
clustering text documents.
First, we preprocess the texts, enriching their representations by background knowledge
provided in a core ontology, in our application WordNet. Then, we cluster
the documents by a partitional algorithm. Our experimental evaluation on Reuters
newsfeeds compares clustering results with pre-categorizations of news. In the experiments,
improvements of results by background knowledge compared to the baseline
can be shown for many interesting tasks.
Second, the clustering partitions the large number of documents to a relatively small
number of clusters, which may then be analyzed by conceptual clustering. In our approach,
we applied Formal Concept Analysis. Conceptual clustering techniques are
known to be too slow for directly clustering several hundreds of documents, but they
give an intensional account of cluster results. They allow for a concise description
of commonalities and distinctions of different clusters. With background knowledge
they even find abstractions like “food” (vs. specializations like “beef” or “corn”).
Thus, in our approach, partitional clustering reduces first the size of the problem
such that it becomes tractable for conceptual clustering, which then facilitates the
understanding of the results.}
}
%0 = techreport
%A = Hotho, Andreas and Staab, Steffen and Stumme, Gerd
%D = 2003
%T = Text Clustering Based on Background Knowledge
%U = http://www.kde.cs.uni-kassel.de/stumme/papers/2003/hotho2003text.pdf
Hotho, A. & Stumme, G.
(2002):
Conceptual Clustering of Text Clusters.
In: Proc. Fachgruppentreffen Maschinelles Lernen (FGML 2002),
@inproceedings{hotho02conceptualclustering,
author = {Hotho, A. and Stumme, G.},
title = {Conceptual Clustering of Text Clusters},
editor = {Kókai, G. and Zeidler, J.},
booktitle = {Proc. Fachgruppentreffen Maschinelles Lernen (FGML 2002)},
year = {2002},
pages = {37--45},
url = {http://www.kde.cs.uni-kassel.de/stumme/papers/2002/FGML02.pdf},
keywords = {2002, analysis, clustering, concept, conceptual, fca, formal, myown, text}
}
%0 = inproceedings
%A = Hotho, A. and Stumme, G.
%B = Proc. Fachgruppentreffen Maschinelles Lernen (FGML 2002)
%D = 2002
%T = Conceptual Clustering of Text Clusters
%U = http://www.kde.cs.uni-kassel.de/stumme/papers/2002/FGML02.pdf
Stumme, G.; Taouil, R.; Bastide, Y. & Lakhal, L.
(2001):
Conceptual Clustering with Iceberg Concept Lattices.
In: Proc. GI-Fachgruppentreffen Maschinelles Lernen (FGML'01),
Universität Dortmund 763.
@inproceedings{stumme01conceptualclustering,
author = {Stumme, G. and Taouil, R. and Bastide, Y. and Lakhal, L.},
title = {Conceptual Clustering with Iceberg Concept Lattices},
editor = {Klinkenberg, R. and Rüping, S. and Fick, A. and Henze, N. and Herzog, C. and Molitor, R. and Schröder, O.},
booktitle = {Proc. GI-Fachgruppentreffen Maschinelles Lernen (FGML'01)},
address = {Universität Dortmund 763},
year = {2001},
url = {http://www.kde.cs.uni-kassel.de/stumme/papers/2001/FGML01.pdf},
keywords = {2001, analysis, closed, clustering, concept, conceptual, discovery, fca, formal, iceberg, itemsets, kdd, knowledge, lattices, myown}
}
%0 = inproceedings
%A = Stumme, G. and Taouil, R. and Bastide, Y. and Lakhal, L.
%B = Proc. GI-Fachgruppentreffen Maschinelles Lernen (FGML'01)
%C = Universität Dortmund 763
%D = 2001
%T = Conceptual Clustering with Iceberg Concept Lattices
%U = http://www.kde.cs.uni-kassel.de/stumme/papers/2001/FGML01.pdf