Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose.
2013.
Fred Morstatter, Jürgen Pfeffer, Huan Liu and Kathleen M. Carley.
Tagging data as implicit feedback for learning-to-rank.
In:
Proceedings of the ACM WebSci'11.
2011.
Beate Navarro Bullock, Robert Jäschke and Andreas Hotho.
Ubiquitous Data.
Lecture Notes in Computer Science (6202):61-74, 2010.
Andreas Hotho, Rasmus Ulslev Pedersen and Michael Wurst.
Limits of Predictability in Human Mobility.
Science, 327(5968):1018-1021, 2010.
Chaoming Song, Zehui Qu, Nicholas Blumm and Albert-László Barabási.
A range of applications, from predicting the spread of human and electronic viruses to city planning and resource management in mobile communications, depend on our ability to foresee the whereabouts and mobility of individuals, raising a fundamental question: To what degree is human behavior predictable? Here we explore the limits of predictability in human dynamics by studying the mobility patterns of anonymized mobile phone users. By measuring the entropy of each individual’s trajectory, we find a 93% potential predictability in user mobility across the whole user base. Despite the significant differences in the travel patterns, we find a remarkable lack of variability in predictability, which is largely independent of the distance users cover on a regular basis.
Towards Understanding Spammers - Discovering Local Patterns for Concept Characterization and Description.
In: J. F. A. Knobbe
(editor):
Proc. LeGo-09: From Local Patterns to Global Models, Workshop at the 2009 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases.
2009.
accepted
Martin Atzmueller, Florian Lemmerich, Beate Krause and Andreas Hotho.
Who are the Spammers? Understandable Local Patterns for Concept Description.
In:
7th Conference on Computer Methods and Systems.
Krakow, Poland, 2009.
ISBN 83-916420-5-4
Martin Atzmueller, Florian Lemmerich, Beate Krause and Andreas Hotho.
The Anti-Social Tagger - Detecting Spam in Social Bookmarking Systems.
In:
AIRWeb '08: Proceedings of the 4th international workshop on Adversarial information retrieval on the web, pages 61-68.
ACM, New York, NY, USA, 2008.
Beate Krause, Christoph Schmitz, Andreas Hotho and Gerd Stumme.
From Web to Social Web: Discovering and Deploying User and Content Profiles.
2007.
This book constitutes the refereed proceedings of the Workshop on Web Mining, WebMine 2006, held in Berlin, Germany, September 18th, 2006. Topics included are data mining based on analysis of bloggers and tagging, web mining, XML mining and further techniques of knowledge discovery. The book is especially valuable for those interested in the aspects of the Social Web (Web 2.0) and its inherent dynamic and diversity of user-generated content.
Distributed feature extraction in a p2p setting: a case study.
Future Gener. Comput. Syst., 23(1):69-75, 2007.
Michael Wurst and Katharina Morik.
The Intention Behind Web Queries.
String Processing and Information Retrieval:98-109, 2006.
Ricardo Baeza-Yates, Liliana Calderón-Benavides and Cristina González-Caro.
The identification of the user’s intention or interest through queries that they submit to a search engine can be very useful to offer them more adequate results. In this work we present a framework for the identification of user’s interest in an automatic way, based on the analysis of query logs. This identification is made from two perspectives, the objectives or goals of a user and the categories in which these aims are situated. A manual classification of the queries was made in order to have a reference point and then we applied supervised and unsupervised learning techniques. The results obtained show that for a considerable amount of cases supervised learning is a good option, however through unsupervised learning we found relationships between users and behaviors that are not easy to detect just taking the query words. Also, through unsupervised learning we established that there are categories that we are not able to determine in contrast with other classes that were not considered but naturally appear after the clustering process. This allowed us to establish that the combination of supervised and unsupervised learning is a good alternative to find user’s goals. From supervised learning we can identify the user interest given certain established goals and categories; on the other hand, with unsupervised learning we can validate the goals and categories used, refine them and select the most appropriate to the user’s needs.
Discovering communities in complex networks.
In: R. Menezes
(editor):
ACM Southeast Regional Conference, pages 280-285.
ACM, 2006.
Hemant Balakrishnan and Narsingh Deo.
Ontology Learning from Text: Methods, Evaluation and Applications.
2005.
Text Mining. Predictive Methods for Analyzing Unstructured Information.
2004.
Sholom M. Weiss, Nitin Indurkhya and T. Zhang.
Modeling the Internet and the Web: Probabilistic Methods and Algorithms.
2003.
Pierre Baldi, Paolo Frasconi and Padhraic Smyth.
Modeling the Internet and the Web covers the most important aspects of modeling the Web using a modern mathematical and probabilistic treatment. It focuses on the information and application layers, as well as some of the emerging properties of the Internet. Provides a comprehensive introduction to the modeling of the Internet and the Web at the information level. Takes a modern approach based on mathematical, probabilistic, and graphical modeling. Provides an integrated presentation of theory, examples, exercises and applications. Covers key topics such as text analysis, link analysis, crawling techniques, human behaviour, and commerce on the Web. Interdisciplinary in nature, Modeling the Internet and the Web will be of interest to students and researchers from a variety of disciplines including computer science, machine learning, engineering, statistics, economics, business, and the social sciences.
Class visualization of high-dimensional data with applications.
Computational Statistics & Data Analysis, 41(1):59-90, 2002.
Inderjit S. Dhillon, Dharmendra S. Modha and W. Scott Spangler.
Probabilistic Robotics (Intelligent Robotics and Autonomous Agents).
2001.
Sebastian Thrun, Wolfram Burgard and Dieter Fox.
Web Usage Analysis and User Profiling, International WEBKDD'99 Workshop, San Diego, California, USA, August 15, 1999, Revised Papers.
Lecture Notes in Computer Science. Volume 1836.
Springer, 2000.
Brij M. Masand and Myra Spiliopoulou.
Data Preparation for Data Mining.
1999.
Dorian Pyle.
From Data Mining to Knowledge Discovery: An Overview.
In:
Advances in Knowledge Discovery and Data Mining, pages 1-34.
1996.
Usama M. Fayyad, Gregory Piatetsky-Shapiro and Padhraic Smyth.
Probabilistic Counting Algorithms for Data Base Applications.
Journal of Computer and System Sciences, 31(2):182-209, 1985.
Philippe Flajolet and G. Nigel Martin.