PUMA publications for /user/stephandoerfel/testing
Feed: https://puma.uni-kassel.de/user/stephandoerfel/testing (PUMA RSS feed, retrieved 2024-03-29)

Statistical Comparisons of Classifiers over Multiple Data Sets
J. Demšar. J. Mach. Learn. Res., 7:1-30, December 2006.
https://puma.uni-kassel.de/bibtex/293751bd0bfabffe38f799b9bb7f4c227/stephandoerfel
Posted by stephandoerfel on 2015-03-19. Tags: classification, prediction, significance, testing.
Abstract: While methods for comparing two learning algorithms on a single data set have been scrutinized for quite some time already, the issue of statistical tests for comparisons of more algorithms on multiple data sets, which is even more essential to typical machine learning studies, has been all but ignored. This article reviews the current practice and then theoretically and empirically examines several suitable tests. Based on that, we recommend a set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers: the Wilcoxon signed-ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparison of more classifiers over multiple data sets. Results of the latter can also be neatly presented with the newly introduced CD (critical difference) diagrams.
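Illustration (not part of the publication): a minimal Python sketch of the two tests recommended in the abstract above, assuming per-data-set accuracy scores are already available. SciPy's wilcoxon and friedmanchisquare implement the Wilcoxon signed-rank and Friedman tests; the classifier names and accuracy values below are invented placeholders, and the post-hoc Nemenyi test and CD diagrams are not shown.

```python
# Hedged sketch of the tests recommended in the abstract, using SciPy.
# The accuracy arrays are made-up placeholder numbers, one entry per data set.
from scipy.stats import wilcoxon, friedmanchisquare

# Hypothetical accuracies of three classifiers on the same six data sets.
clf_a = [0.81, 0.77, 0.90, 0.68, 0.84, 0.73]
clf_b = [0.80, 0.75, 0.87, 0.72, 0.79, 0.67]
clf_c = [0.78, 0.72, 0.85, 0.66, 0.80, 0.70]

# Two classifiers: Wilcoxon signed-rank test on the paired per-data-set scores.
stat, p = wilcoxon(clf_a, clf_b)
print(f"Wilcoxon: statistic={stat:.3f}, p={p:.3f}")

# More than two classifiers: Friedman test over multiple data sets.
# If the result is significant, post-hoc tests and a CD diagram would follow.
stat, p = friedmanchisquare(clf_a, clf_b, clf_c)
print(f"Friedman: statistic={stat:.3f}, p={p:.3f}")
```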
Of course we share! Testing Assumptions about Social Tagging Systems
S. Doerfel, D. Zoller, P. Singer, T. Niebler, A. Hotho, and M. Strohmaier. 2014. arXiv:1401.0629.
https://puma.uni-kassel.de/bibtex/2e360f0bd207806e72305efe16491ebe3/stephandoerfel
Posted by stephandoerfel on 2014-01-06. Tags: 2014, analysis, assumptions, bibsonomy, data, folksonomy, log, myown, share, social, tagging, testing, weblog.
Abstract: Social tagging systems have established themselves as an important part of today's web and have attracted the interest of our research community in a variety of investigations. The overall vision of our community is that simply through interactions with the system, i.e., through tagging and sharing of resources, users would contribute to building useful semantic structures as well as resource indexes using uncontrolled vocabulary, not only due to the easy-to-use mechanics. Hence, a variety of assumptions about social tagging systems have emerged, yet testing them has been difficult due to the absence of suitable data. In this work we thoroughly investigate three available assumptions (e.g., is a tagging system really social?) by examining live log data gathered from the real-world public social tagging system BibSonomy. Our empirical results indicate that while some of these assumptions hold to a certain extent, other assumptions need to be reflected and viewed in a very critical light. Our observations have implications for the design of future search and other algorithms to better reflect the actual user behavior.
<span itemprop="pagination">pp. 307-333</span></em> </span>(<em><span>1989<meta content="1989" itemprop="datePublished"/></span></em>)Wed May 13 19:00:50 CEST 2015Econometrica2pp. 307-333Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses571989comparision hypothesis likelihood powerLaw testing In this paper, we develop a classical approach to model selection. Using the Kullback-Leibler Information Criterion to measure the closeness of a model to the truth, we propose simple likelihood-ratio based statistics for testing the null hypothesis that the competing models are equally close to the true data generating process against the alternative hypothesis that one model is closer. The tests are directional and are derived successively for the cases where the competing models are non-nested, overlapping, or nested and whether both, one, or neither is misspecified. As a prerequisite, we fully characterize the asymptotic distribution of the likelihood ratio statistic under the most general conditions. We show that it is a weighted sum of chi-square distribution or a normal distribution depending on whether the distributions in the competing models closest to the truth are observationally identical. We also propose a test of this latter condition.Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses on JSTOR