Publications
Media Bias in German Online Newspapers
Dallmann, A.; Lemmerich, F.; Zoller, D. & Hotho, A.
'26th ACM Conference on Hypertext and Social Media', ACM, Cyprus, Turkey, September 1-4 (2015)
Human-level control through deep reinforcement learning
Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A. A.; Veness, J.; Bellemare, M. G.; Graves, A.; Riedmiller, M.; Fidjeland, A. K.; Ostrovski, G.; Petersen, S.; Beattie, C.; Sadik, A.; Antonoglou, I.; King, H.; Kumaran, D.; Wierstra, D.; Legg, S. & Hassabis, D.
Nature, 518(7540) 529-533 (2015) [pdf]
ConDist: A Context-Driven Categorical Distance Measure
Ring, M.; Otto, F.; Becker, M.; Niebler, T.; Landes, D. & Hotho, A.
'ECML PKDD 2015' (2015)
HypTrails: A Bayesian approach for comparing hypotheses about human trails
Singer, P.; Helic, D.; Hotho, A. & Strohmaier, M.
'24th International World Wide Web Conference (WWW2015)', ACM, Firenze, Italy (2015) [pdf]
An Overview of Microsoft Academic Service (MAS) and Applications.
Sinha, A.; Shen, Z.; Song, Y.; Ma, H.; Eide, D.; Hsu, B.-J. P. & Wang, K.
Gangemi, A.; Leonardi, S. & Panconesi, A., ed., 'WWW (Companion Volume)', ACM, 243-246 (2015) [pdf]
Participatory Patterns in an International Air Quality Monitoring Initiative
Sîrbu, A.; Becker, M.; Caminiti, S.; De Baets, B.; Elen, B.; Francis, L.; Gravino, P.; Hotho, A.; Ingarra, S.; Loreto, V.; Molino, A.; Mueller, J.; Peters, J.; Ricchiuti, F.; Saracino, F.; Servedio, V. D. P.; Stumme, G.; Theunis, J.; Tria, F. & Van den Bossche, J.
PLoS ONE, 10(8) e0136763 (2015) [pdf]
The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.
Semantic Annotation for Microblog Topics Using Wikipedia Temporal Information
Tran, T.; Tran, N.-K.; Teka Hadgu, A. & Jäschke, R.
'Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)', Association for Computational Linguistics (2015)
In this paper we study the problem of semantic annotation for a trending hashtag which is the crucial step towards analyzing user behavior in social media, yet has been largely unexplored. We tackle the problem via linking to entities from Wikipedia. We incorporate the social aspects of trending hashtags by identifying prominent entities for the annotation so as to maximize the information spreading in entity networks. We exploit temporal dynamics of entities in Wikipedia, namely Wikipedia edits and page views to improve the annotation quality. Our experiments show that we significantly outperform the established methods in tweet annotation.
Modeling and Extracting Load Intensity Profiles
v. Kistowski, J.; Herbst, N.; Zoller, D.; Kounev, S. & Hotho, A.
'Proceedings of the 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS)' (2015)
Today’s system developers and operators face the challenge of creating software systems that make efficient use of dynamically allocated resources under highly variable and dynamic load profiles, while at the same time delivering reliable performance. Benchmarking of systems under these constraints is difficult, as state-of-the-art benchmarking frameworks provide only limited support for emulating such dynamic and highly variable load profiles for the creation of realistic workload scenarios. Industrial benchmarks typically confine themselves to workloads with constant or stepwise increasing loads. Alternatively, they support replaying of recorded load traces. Statistical load intensity descriptions also do not sufficiently capture concrete pattern load profile variations over time.
To address these issues, we present the Descartes Load Intensity Model (DLIM). DLIM provides a modeling formalism for describing load intensity variations over time. A DLIM instance can be used as a compact representation of a recorded load intensity trace, providing a powerful tool for benchmarking and performance analysis. As manually obtaining DLIM instances can be time consuming, we present three different automated extraction methods, which also help to enable autonomous system analysis for self-adaptive systems. Model expressiveness is validated using the presented extraction methods. Extracted DLIM instances exhibit a median modeling error of 12.4% on average over nine different real-world traces covering between two weeks and seven months. Additionally, extraction methods perform orders of magnitude faster than existing time series decomposition approaches.
On Publication Usage in a Social Bookmarking System
Zoller, D.; Doerfel, S.; Jäschke, R.; Stumme, G. & Hotho, A.
'Proceedings of the 2015 ACM Conference on Web Science' (2015)
Scholarly success is traditionally measured in terms of citations to publications. With the advent of publication management and digital libraries on the web, scholarly usage data has become a target of investigation and new impact metrics computed on such usage data have been proposed – so called altmetrics. In scholarly social bookmarking systems, scientists collect and manage publication meta data and thus reveal their interest in these publications. In this work, we investigate connections between usage metrics and citations, and find posts, exports, and page views of publications to be correlated to citations.
Subjective vs. Objective Data: Bridging the Gap
Becker, M.; Hotho, A.; Mueller, J.; Kibanov, M.; Atzmueller, M. & Stumme, G.
CSSWS 2014, Poster (2014) [pdf]
Sensor data is objective. But when measuring our environment, measured values are contrasted with our perception, which is always subjective. This makes interpreting sensor measurements difficult for a single person in her personal environment. In this context, the EveryAware project directly connects the concepts of objective sensor data with subjective impressions and perceptions by providing a collective sensing platform with several client applications that allow users to explicitly associate those two data types. The goal is to provide the user with personalized feedback, a characterization of the global as well as her personal environment, and enable her to position her perceptions in this global context.
In this poster, we summarize the collected data of two EveryAware applications, namely WideNoise for noise measurements and AirProbe for participatory air quality sensing. Basic insights are presented, including user activity, learning processes, and sensor-data-to-perception correlations. These results provide an outlook on how this data can further be used to understand the connection between sensor data and perceptions.
Large-scale factorization of type-constrained multi-relational data
Krompass, D.; Nickel, M. & Tresp, V.
'International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, October 30 - November 1, 2014', IEEE, [10.1109/DSAA.2014.7058046], 18-24 (2014) [pdf]
Linguistic Regularities in Sparse and Explicit Word Representations.
Levy, O. & Goldberg, Y.
Morante, R. & Yih, W.-t., ed., 'CoNLL', ACL, 171-180 (2014) [pdf]
Folksonomies
Singer, P.; Niebler, T.; Hotho, A. & Strohmaier, M.
'Encyclopedia of Social Network Analysis and Mining', Springer, 542-547 (2014)
Text as data: The promise and pitfalls of automatic content analysis methods for political texts
Grimmer, J. & Stewart, B. M.
Political Analysis mps028 (2013)
Exploiting Structural Consistencies with Stacked Conditional Random Fields
Kluegl, P.; Toepfer, M.; Lemmerich, F.; Hotho, A. & Puppe, F.
Mathematical Methodologies in Pattern Recognition and Machine Learning, Springer Proceedings in Mathematics & Statistics, 30 111-125 (2013)
Conditional Random Fields (CRF) are popular methods for labeling unstructured or textual data. Like many machine learning approaches, these undirected graphical models assume the instances to be independently distributed. However, in real-world applications data is grouped in a natural way, e.g., by its creation context. The instances in each group often share additional structural consistencies. This paper proposes a domain-independent method for exploiting these consistencies by combining two CRFs in a stacked learning framework. We apply rule learning collectively on the predictions of an initial CRF for one context to acquire descriptions of its specific properties. Then, we utilize these descriptions as dynamic and high quality features in an additional (stacked) CRF. The presented approach is evaluated with a real-world dataset for the segmentation of references and achieves a significant reduction of the labeling error.
Sequential Latent Dirichlet Allocation: Discover Underlying Topic Structures within a Document.
Du, L.; Buntine, W. L. & Jin, H.
Webb, G. I.; Liu, B.; Zhang, C.; Gunopulos, D. & Wu, X., ed., 'ICDM', IEEE Computer Society, 148-157 (2010) [pdf]
Boilerplate Detection using Shallow Text Features
Kohlschütter, C.; Fankhauser, P. & Nejdl, W.
'Proc. of 3rd ACM International Conference on Web Search and Data Mining, New York City, NY, USA (WSDM 2010)' (2010)
Dynamic Auto-Encoders for Semantic Indexing
Mirowski, P.; Ranzato, M. & LeCun, Y.
'Proceedings of the NIPS 2010 Workshop on Deep Learning' (2010) [pdf]
Wisdom of crowds versus wisdom of linguists - measuring the semantic relatedness of words.
Zesch, T. & Gurevych, I.
Natural Language Engineering, 16(1) 25-59 (2010) [pdf]
Adaptive Multiagent System for Network Traffic Monitoring.
Rehák, M.; Pechoucek, M.; Grill, M.; Stiborek, J.; Bartos, K. & Celeda, P.
IEEE Intelligent Systems, 24(3) 16-25 (2009) [pdf]