PUMA publications for /user/stumme/Texthttps://puma.uni-kassel.de/user/stumme/TextPUMA RSS feed for /user/stumme/Text2024-03-29T12:42:00+01:00Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Researchhttps://puma.uni-kassel.de/bibtex/2f6eddba1f2c6b7cdbfa67a0c79ae5ae8/stummestumme2014-06-30T16:22:33+02:00FCA IR Mining SOTA Text <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Jonas Poelmans" itemprop="url" href="/author/Jonas%20Poelmans"><span itemprop="name">J. Poelmans</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="DmitryI. Ignatov" itemprop="url" href="/author/DmitryI.%20Ignatov"><span itemprop="name">D. Ignatov</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Stijn Viaene" itemprop="url" href="/author/Stijn%20Viaene"><span itemprop="name">S. Viaene</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Guido Dedene" itemprop="url" href="/author/Guido%20Dedene"><span itemprop="name">G. Dedene</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="SergeiO. Kuznetsov" itemprop="url" href="/author/SergeiO.%20Kuznetsov"><span itemprop="name">S. Kuznetsov</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Advances in Data Mining. Applications and Theoretical Aspects</span>, </em><em>volume 7377 of Lecture Notes in Computer Science, </em><em><span itemprop="publisher">Springer Berlin Heidelberg</span>, </em></span>(<em><span>2012<meta content="2012" itemprop="datePublished"/></span></em>)Mon Jun 30 16:22:33 CEST 2014Advances in Data Mining. 
Applications and Theoretical Aspects273-287Lecture Notes in Computer ScienceText Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research73772012FCA IR Mining SOTA Text Formal Concept Analysis (FCA) is an unsupervised clustering technique and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003 and 2009 which mention FCA and information retrieval in the abstract, title or keywords. Using a prototype of our FCA-based toolset CORDIET, we converted the PDF files containing the papers to plain text, indexed them with Lucene using a thesaurus containing terms related to FCA research and then created the concept lattice shown in this paper. We visualized, analyzed and explored the literature with concept lattices and discovered multiple interesting research streams in IR of which we give an extensive overview. The core contributions of this paper are the innovative application of FCA to the text mining of scientific papers and the survey of the FCA-based IR research.Text Mining Scientific Papers: A Survey on FCA-Based Information Retrieval Research - SpringerOntologies improve text document clusteringhttps://puma.uni-kassel.de/bibtex/257a39c81cff1982dbefed529be934bee/stummestumme2010-04-07T13:54:41+02:002003 clustering data kdd mining myown ontologies text <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Hotho" itemprop="url" href="/author/Andreas%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Steffen Staab" itemprop="url" href="/author/Steffen%20Staab"><span itemprop="name">S. Staab</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Gerd Stumme" itemprop="url" href="/author/Gerd%20Stumme"><span itemprop="name">G. 
Stumme</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Proceedings of the 2003 IEEE International Conference on Data Mining</span>, </em></span><em>pp. <span itemprop="pagination">541-544 (Poster)</span>. </em><em>Melbourne, Florida, </em><em><span itemprop="publisher">IEEE Computer Society</span>, </em>(<em><span>November 2003<meta content="November 2003" itemprop="datePublished"/></span></em>)Wed Apr 07 13:54:41 CEST 2010Melbourne, FloridaProceedings of the 2003 IEEE International Conference on Data MiningNovember 19-22,541-544 (Poster)Ontologies improve text document clustering20032003 clustering data kdd mining myown ontologies text Publications of Gerd StummeText Clustering Based on Background Knowledgehttps://puma.uni-kassel.de/bibtex/261d58db419af0dbc3681432588219c3d/stummestumme2010-04-07T13:54:41+02:002003 analysis background clustering concept fca formal knowledge myown ontologies semantic text web <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Hotho" itemprop="url" href="/author/Andreas%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Steffen Staab" itemprop="url" href="/author/Steffen%20Staab"><span itemprop="name">S. Staab</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Gerd Stumme" itemprop="url" href="/author/Gerd%20Stumme"><span itemprop="name">G. Stumme</span></a></span>. </span><em><span itemprop="educationalUse">Technical Report</span>, </em><em>volume 425. 
</em><em><span itemprop="producer">University of Karlsruhe, Institute AIFB</span>, </em>(<em><span>2003<meta content="2003" itemprop="datePublished"/></span></em>)Wed Apr 07 13:54:41 CEST 2010Text Clustering Based on Background KnowledgeTechnical Report 42520032003 analysis background clustering concept fca formal knowledge myown ontologies semantic text web Text document clustering plays an important role in providing intuitive
navigation and browsing mechanisms by organizing large amounts of information
into a small number of meaningful clusters. Standard partitional or agglomerative
clustering methods efficiently compute results to this end.
However, the bag of words representation used for these clustering methods is often
unsatisfactory as it ignores relationships between important terms that do not
co-occur literally. Also, it is mostly left to the user to find out why a particular partitioning
has been achieved, because it is only specified extensionally. In order to
deal with the two problems, we integrate background knowledge into the process of
clustering text documents.
First, we preprocess the texts, enriching their representations by background knowledge
provided in a core ontology — in our application Wordnet. Then, we cluster
the documents by a partitional algorithm. Our experimental evaluation on Reuters
newsfeeds compares clustering results with pre-categorizations of news. In the experiments,
improvements of results by background knowledge compared to the baseline
can be shown for many interesting tasks.
Second, the clustering partitions the large number of documents into a relatively small
number of clusters, which may then be analyzed by conceptual clustering. In our approach,
we applied Formal Concept Analysis. Conceptual clustering techniques are
known to be too slow for directly clustering several hundreds of documents, but they
give an intensional account of cluster results. They allow for a concise description
of commonalities and distinctions of different clusters. With background knowledge
they even find abstractions like “food” (vs. specializations like “beef” or “corn”).
Thus, in our approach, partitional clustering first reduces the size of the problem
such that it becomes tractable for conceptual clustering, which then facilitates the
understanding of the results.Publications of Gerd StummeConceptual Clustering of Text Clustershttps://puma.uni-kassel.de/bibtex/2e253c44552a046fe90236274bcfeab13/stummestumme2010-04-07T13:54:41+02:002002 analysis clustering concept conceptual fca formal myown text <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="A. Hotho" itemprop="url" href="/author/A.%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="G. Stumme" itemprop="url" href="/author/G.%20Stumme"><span itemprop="name">G. Stumme</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Proc. Fachgruppentreffen Maschinelles Lernen (FGML 2002)</span>, </em></span><em>pp. <span itemprop="pagination">37-45</span>. </em>(<em><span>2002<meta content="2002" itemprop="datePublished"/></span></em>)Wed Apr 07 13:54:41 CEST 2010Proc. Fachgruppentreffen Maschinelles Lernen (FGML 2002)37-45Conceptual Clustering of Text Clusters20022002 analysis clustering concept conceptual fca formal myown text Publications of Gerd StummeWordnet improves text document clusteringhttps://puma.uni-kassel.de/bibtex/204c7d86337d68e4ed9ae637029c43414/stummestumme2010-04-07T13:54:41+02:002003 clustering data discovery document information ir kdd kmeans knowledge mining myown retrieval text wordnet <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="A Hotho" itemprop="url" href="/author/A%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="S. Staab" itemprop="url" href="/author/S.%20Staab"><span itemprop="name">S. 
Staab</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="G. Stumme" itemprop="url" href="/author/G.%20Stumme"><span itemprop="name">G. Stumme</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Proc. SIGIR Semantic Web Workshop</span>, </em></span><em>Toronto, </em>(<em><span>2003<meta content="2003" itemprop="datePublished"/></span></em>)Wed Apr 07 13:54:41 CEST 2010TorontoProc. SIGIR Semantic Web WorkshopWordnet improves text document clustering20032003 clustering data discovery document information ir kdd kmeans knowledge mining myown retrieval text wordnet Publications of Gerd StummeExplaining Text Clustering Results using Semantic Structureshttps://puma.uni-kassel.de/bibtex/253a943b6be4b34cf4e5329d0b58e99f6/stummestumme2010-04-07T13:54:41+02:002003 analysis clustering concept fca formal myown ontologies semantic semantics text <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Hotho" itemprop="url" href="/author/Andreas%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Steffen Staab" itemprop="url" href="/author/Steffen%20Staab"><span itemprop="name">S. Staab</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Gerd Stumme" itemprop="url" href="/author/Gerd%20Stumme"><span itemprop="name">G. Stumme</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Knowledge Discovery in Databases: PKDD 2003, 7th European Conference on Principles and Practice of Knowledge Discovery in Databases</span>, </em></span><em>volume 2838 of LNAI, </em><em>pp. <span itemprop="pagination">217-228</span>. 
</em><em>Heidelberg, </em><em><span itemprop="publisher">Springer</span>, </em>(<em><span>2003<meta content="2003" itemprop="datePublished"/></span></em>)Wed Apr 07 13:54:41 CEST 2010HeidelbergKnowledge Discovery in Databases: PKDD 2003, 7th European Conference on Principles and Practice of Knowledge Discovery in Databases217-228LNAIExplaining Text Clustering Results using Semantic Structures283820032003 analysis clustering concept fca formal myown ontologies semantic semantics text Common text clustering techniques offer rather poor capabilities
for explaining to their users why a particular result has been
achieved. They have the disadvantage that they do not relate
semantically nearby terms and that they cannot explain how
resulting clusters are related to each other.
In this paper, we discuss a way of integrating a large thesaurus
and the computation of lattices of resulting clusters into common text clustering
in order to overcome these two problems.
As its major result, our approach achieves an explanation using an
appropriate level of granularity at the concept level as well as
an appropriate size and complexity of the explaining lattice of
resulting clusters.Publications of Gerd StummeMachine Learnability Analysis of Textclassifications in a Social Bookmarking Folksonomyhttps://puma.uni-kassel.de/bibtex/29a65067da65e8301182b33b4ae292141/stummestumme2009-01-27T15:10:41+01:00Illig bachelor classification learning machine recommendations text <meta content="thesis" itemprop="educationalUse"/><span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Jens Illig" itemprop="url" href="/author/Jens%20Illig"><span itemprop="name">J. Illig</span></a></span>. </span><em>University of Kassel, </em><em>Kassel, </em><em><span itemprop="educationalUse">Bachelor Thesis</span>, </em>(<em><span>2008<meta content="2008" itemprop="datePublished"/></span></em>)Tue Jan 27 15:10:41 CET 2009KasselMachine Learnability Analysis of Textclassifications in a Social Bookmarking FolksonomyBachelor Thesis2008Illig bachelor classification learning machine recommendations text Distributional measures as proxies for semantic relatednesshttps://puma.uni-kassel.de/bibtex/2fe1ed4dfc0e42165de44853564c7f6af/stummestumme2007-10-31T22:01:56+01:00distributional measure measures relatedness semantic similarity text <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Saif Mohammad" itemprop="url" href="/author/Saif%20Mohammad"><span itemprop="name">S. Mohammad</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Graeme Hirst" itemprop="url" href="/author/Graeme%20Hirst"><span itemprop="name">G. Hirst</span></a></span>. 
</span><span itemtype="http://schema.org/PublicationIssue" itemscope="itemscope" itemprop="isPartOf"> </span>(<em><span>Submitted for publication<meta content="Submitted for publication" itemprop="datePublished"/></span></em>)Wed Oct 31 22:01:56 CET 2007Distributional measures as proxies for semantic relatednessSubmitted for publicationdistributional measure measures relatedness semantic similarity text Learning Concept Hierarchies from Text Corpora using Formal Concept Analysishttps://puma.uni-kassel.de/bibtex/2eaaf0e4b3a8b29fab23b6c15ce2d308d/stummestumme2007-09-04T21:36:01+02:00analysis concept fca formal hierarchies hierarchy learning ontologies ontology text <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Philipp Cimiano" itemprop="url" href="/author/Philipp%20Cimiano"><span itemprop="name">P. Cimiano</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Hotho" itemprop="url" href="/author/Andreas%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Steffen Staab" itemprop="url" href="/author/Steffen%20Staab"><span itemprop="name">S. Staab</span></a></span>. </span><span itemtype="http://schema.org/PublicationIssue" itemscope="itemscope" itemprop="isPartOf"><span itemtype="http://schema.org/Periodical" itemscope="itemscope" itemprop="isPartOf"><span itemprop="name"><em>Journal on Artificial Intelligence Research</em></span></span> </span>(<em><span>2005<meta content="2005" itemprop="datePublished"/></span></em>)Tue Sep 04 21:36:01 CEST 2007Journal on Artificial Intelligence Research305-339Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis242005analysis concept fca formal hierarchies hierarchy learning ontologies ontology text