PUMA publications for /user/hotho/tmhttps://puma.uni-kassel.de/user/hotho/tmPUMA RSS feed for /user/hotho/tm2024-03-28T20:53:59+01:00On Knowledgeable Unsupervised Text Mininghttps://puma.uni-kassel.de/bibtex/2a0734d09c40265a173480ce4dc0af57a/hothohotho2008-01-15T11:01:30+01:002003 myown tm <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Hotho" itemprop="url" href="/author/Andreas%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Alexander Maedche" itemprop="url" href="/author/Alexander%20Maedche"><span itemprop="name">A. Maedche</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Steffen Staab" itemprop="url" href="/author/Steffen%20Staab"><span itemprop="name">S. Staab</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Valentin Zacharias" itemprop="url" href="/author/Valentin%20Zacharias"><span itemprop="name">V. Zacharias</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Text Mining</span>, </em></span>(<em><span>2003<meta content="2003" itemprop="datePublished"/></span></em>)Tue Jan 15 11:01:30 CET 2008Text Mining131-152On Knowledgeable Unsupervised Text Mining20032003 myown tm DBLP Record 'books/sp/franke2003/HothoMSZ03'A study of smoothing methods for language models applied to information retrievalhttps://puma.uni-kassel.de/bibtex/2c7aff853599cdde58a1d27eff4ede314/hothohotho2007-09-03T16:56:35+02:00ir model text tm <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Chengxiang Zhai" itemprop="url" href="/author/Chengxiang%20Zhai"><span itemprop="name">C. 
Zhai</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="John Lafferty" itemprop="url" href="/author/John%20Lafferty"><span itemprop="name">J. Lafferty</span></a></span>. </span><span itemtype="http://schema.org/PublicationIssue" itemscope="itemscope" itemprop="isPartOf"><span itemtype="http://schema.org/Periodical" itemscope="itemscope" itemprop="isPartOf"><span itemprop="name"><em>ACM Trans. Inf. Syst.</em></span></span> <em><span itemtype="http://schema.org/PublicationVolume" itemscope="itemscope" itemprop="isPartOf"><span itemprop="volumeNumber">22 </span></span>(<span itemprop="issueNumber">2</span>):
<span itemprop="pagination">179--214</span></em> </span>(<em><span>2004<meta content="2004" itemprop="datePublished"/></span></em>)Mon Sep 03 16:56:35 CEST 2007New York, NY, USAACM Trans. Inf. Syst.2179--214A study of smoothing methods for language models applied to information retrieval222004ir model text tm Language modeling approaches to information retrieval are attractive and promising because they connect the problem of retrieval with that of language model estimation, which has been studied extensively in other application areas such as speech recognition. The basic idea of these approaches is to estimate a language model for each document, and to then rank documents by the likelihood of the query according to the estimated language model. A central issue in language model estimation is smoothing, the problem of adjusting the maximum likelihood estimator to compensate for data sparseness. In this article, we study the problem of language model smoothing and its influence on retrieval performance. We examine the sensitivity of retrieval performance to the smoothing parameters and compare several popular smoothing methods on different test collections. Experimental results show that not only is the retrieval performance generally sensitive to the smoothing parameters, but also the sensitivity pattern is affected by the query type, with performance being more sensitive to smoothing for verbose queries than for keyword queries. Verbose queries also generally require more aggressive smoothing to achieve optimal performance. This suggests that smoothing plays two different roles---to make the estimated document language model more accurate and to "explain" the noninformative words in the query. In order to decouple these two distinct roles of smoothing, we propose a two-stage smoothing strategy, which yields better sensitivity patterns and facilitates the setting of smoothing parameters automatically. 
We further propose methods for estimating the smoothing parameters automatically. Evaluation on five different databases and four types of queries indicates that the two-stage smoothing method with the proposed parameter estimation methods consistently gives retrieval performance that is close to---or better than---the best results achieved using a single smoothing method and exhaustive parameter search on the test data.Statistical Relational Learning for Document Mining.https://puma.uni-kassel.de/bibtex/27cdd6b0791fcdf17ec6d404b55f12c5c/hothohotho2007-08-31T14:53:26+02:002003 classification document mining srl text tm <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Alexandrin Popescul" itemprop="url" href="/author/Alexandrin%20Popescul"><span itemprop="name">A. Popescul</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Lyle H. Ungar" itemprop="url" href="/author/Lyle%20H.%20Ungar"><span itemprop="name">L. Ungar</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Steve Lawrence" itemprop="url" href="/author/Steve%20Lawrence"><span itemprop="name">S. Lawrence</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="David M. Pennock" itemprop="url" href="/author/David%20M.%20Pennock"><span itemprop="name">D. Pennock</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">ICDM</span>, </em></span><em>pages <span itemprop="pagination">275-282</span>. 
</em><em><span itemprop="publisher">IEEE Computer Society</span>, </em>(<em><span>2003<meta content="2003" itemprop="datePublished"/></span></em>)Fri Aug 31 14:53:26 CEST 2007ICDMconf/icdm/2003275-282Statistical Relational Learning for Document Mining.20032003 classification document mining srl text tm dblpKnowledge Discovery in Textual Databases (KDT)https://puma.uni-kassel.de/bibtex/2d1bb2e8dff9bd80da158b4b770685dce/hothohotho2007-08-23T15:12:04+02:00mining text tm <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="R. Feldman" itemprop="url" href="/author/R.%20Feldman"><span itemprop="name">R. Feldman</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="I. Dagan" itemprop="url" href="/author/I.%20Dagan"><span itemprop="name">I. Dagan</span></a></span>. </span><span itemtype="http://schema.org/Book" itemscope="itemscope" itemprop="isPartOf"><em><span itemprop="name">Proc. of the First Int. Conf. on Knowledge Discovery (KDD)</span>, </em></span><em>pages <span itemprop="pagination">112-117</span>. </em>(<em><span>1995<meta content="1995" itemprop="datePublished"/></span></em>)Thu Aug 23 15:12:04 CEST 2007Proc. of the First Int. Conf. on Knowledge Discovery (KDD)feldman95KDT112-117Knowledge Discovery in Textual Databases (KDT)InProceedings1995mining text tm Machine learning in automated text categorizationhttps://puma.uni-kassel.de/bibtex/20fe0d5dd12c2cb59dfc330e684ec4b4a/hothohotho2006-10-27T11:56:43+02:00tm text survey classification categorization ml <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="F. Sebastiani" itemprop="url" href="/author/F.%20Sebastiani"><span itemprop="name">F. Sebastiani</span></a></span>. 
</span><span itemtype="http://schema.org/PublicationIssue" itemscope="itemscope" itemprop="isPartOf"><span itemtype="http://schema.org/Periodical" itemscope="itemscope" itemprop="isPartOf"><span itemprop="name"><em>ACM Computing Surveys</em></span></span> <em><span itemtype="http://schema.org/PublicationVolume" itemscope="itemscope" itemprop="isPartOf"><span itemprop="volumeNumber">34 </span></span>(<span itemprop="issueNumber">1</span>):
<span itemprop="pagination">1--47</span></em> </span>(<em><span>2002<meta content="2002" itemprop="datePublished"/></span></em>)Fri Oct 27 11:56:43 CEST 2006ACM Computing Surveys11--47Machine learning in automated text categorization342002tm text survey classification categorization ml A Brief Survey of Text Mininghttps://puma.uni-kassel.de/bibtex/26ecc8a3cee1a99bbb9f8f8dd6a9d2959/hothohotho2006-06-27T09:41:56+02:002005 SumSchool06 mining myown ontology overview survey text tm <span class="authorEditorList"><span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Hotho" itemprop="url" href="/author/Andreas%20Hotho"><span itemprop="name">A. Hotho</span></a></span>, <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Andreas Nürnberger" itemprop="url" href="/author/Andreas%20N%c3%bcrnberger"><span itemprop="name">A. Nürnberger</span></a></span>, and <span itemtype="http://schema.org/Person" itemscope="itemscope" itemprop="author"><a title="Gerhard Paaß" itemprop="url" href="/author/Gerhard%20Paa%c3%9f"><span itemprop="name">G. Paaß</span></a></span>. </span><span itemtype="http://schema.org/PublicationIssue" itemscope="itemscope" itemprop="isPartOf"><span itemtype="http://schema.org/Periodical" itemscope="itemscope" itemprop="isPartOf"><span itemprop="name"><em>LDV Forum - GLDV Journal for Computational Linguistics and Language Technology</em></span></span> <em><span itemtype="http://schema.org/PublicationVolume" itemscope="itemscope" itemprop="isPartOf"><span itemprop="volumeNumber">20 </span></span>(<span itemprop="issueNumber">1</span>):
<span itemprop="pagination">19-62</span></em> </span>(<em><span>May 2005<meta content="Mai 2005" itemprop="datePublished"/></span></em>)Tue Jun 27 09:41:56 CEST 2006LDV Forum - GLDV Journal for Computational Linguistics and Language TechnologyMAY119-62 A Brief Survey of Text Mining2020052005 SumSchool06 mining myown ontology overview survey text tm