LA-LDA: A Limited Attention Topic Model for Social Recommendation.
2013. cite arxiv:1301.6277. Comment: The 2013 International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction (SBP 2013).
Jeon-Hyung Kang, Kristina Lerman and Lise Getoor.
Social media users have finite attention, which limits the number of incoming messages from friends they can process. Moreover, they pay more attention to the opinions and recommendations of some friends than of others. In this paper, we propose LA-LDA, a latent topic model which incorporates limited, non-uniformly divided attention in the diffusion process by which opinions and information spread on the social network. We show that our proposed model is able to learn more accurate user models from users' social network and item adoption behavior than models which do not take limited attention into account. We analyze voting on news items on the social news aggregator Digg and show that our proposed model is better able to predict held out votes than alternative models. Our study demonstrates that psycho-socially motivated models have better ability to describe and predict observed behavior than models which only consider topics.
A Probabilistic Approach for Learning Folksonomies from Structured Data.
In:
Proceedings of the 4th ACM Web Search and Data Mining Conference.
2010.
cite arxiv:1011.3557. Comment: In Proceedings of the 4th ACM Web Search and Data Mining Conference (WSDM).
Anon Plangprasopchok, Kristina Lerman and Lise Getoor.
Learning structured representations has emerged as an important problem in many domains, including document and Web data mining, bioinformatics, and image analysis. One approach to learning complex structures is to integrate many smaller, incomplete and noisy structure fragments. In this work, we present an unsupervised probabilistic approach that extends affinity propagation to combine the small ontological fragments into a collection of integrated, consistent, and larger folksonomies. This is a challenging task because the method must aggregate similar structures while avoiding structural inconsistencies and handling noise. We validate the approach on a real-world social media dataset, comprised of shallow personal hierarchies specified by many individual users, collected from the photosharing website Flickr. Our empirical results show that our proposed approach is able to construct deeper and denser structures, compared to an approach using only the standard affinity propagation algorithm. Additionally, the approach yields better overall integration quality than a state-of-the-art approach based on incremental relational clustering.
Growing a tree in the forest: constructing folksonomies by integrating structured metadata.
In: B. Rao, B. Krishnapuram, A. Tomkins and Q. Yang, editors,
KDD, pages 949-958.
ACM, 2010.
Anon Plangprasopchok, Kristina Lerman and Lise Getoor.
Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured metadata provide valuable evidence for learning how a community organizes knowledge. For instance, we can aggregate many personal hierarchies into a common taxonomy, also known as a folksonomy, that will aid users in visualizing and browsing social content, and also help them in organizing their own content. However, learning from social metadata presents several challenges, since it is sparse, shallow, ambiguous, noisy, and inconsistent. We describe an approach to folksonomy learning based on relational clustering, which exploits structured metadata contained in personal hierarchies. Our approach clusters similar hierarchies using their structure and tag statistics, then incrementally weaves them into a deeper, bushier tree. We study folksonomy learning using social metadata extracted from the photo-sharing site Flickr, and demonstrate that the proposed approach addresses the challenges. Moreover, compared to previous work, the approach produces larger, more accurate folksonomies, and in addition, scales better.
Structure of Heterogeneous Networks.
2009. cite arxiv:0906.2212.
Rumi Ghosh and Kristina Lerman.
Heterogeneous networks play a key role in the evolution of communities and the decisions individuals make. These networks link different types of entities, for example, people and the events they attend. Network analysis algorithms usually project such networks onto simple graphs composed of entities of a single type. In the process, they conflate relations between entities of different types and lose important structural information. We develop a mathematical framework that can be used to compactly represent and analyze heterogeneous networks that combine multiple entity and link types. We generalize Bonacich centrality, which measures connectivity between nodes by the number of paths between them, to heterogeneous networks and use this measure to study network structure. Specifically, we extend the popular modularity-maximization method for community detection to use this centrality metric. We also rank nodes based on their connectivity to other nodes. One advantage of this centrality metric is that it has a tunable parameter we can use to set the length scale of interactions. Studying how rankings change with this parameter allows us to identify important nodes in the network. We apply the proposed method to analyze the structure of several heterogeneous networks. We show that exploiting additional sources of evidence corresponding to links between, as well as among, different entity types yields new insights into network structure.
Constructing folksonomies from user-specified relations on Flickr.
In: J. Quemada, G. León, Y. S. Maarek and W. Nejdl, editors,
WWW, pages 781-790.
ACM, 2009.
Anon Plangprasopchok and Kristina Lerman.
Automatic folksonomy construction from tags has attracted much attention recently. However, inferring hierarchical relations between concepts from tags has a drawback in that it is difficult to distinguish between more popular and more general concepts. Instead of tags, we propose to use user-specified relations for learning folksonomies. We explore two statistical frameworks for aggregating many shallow individual hierarchies, expressed through the collection/set relations on the social photosharing site Flickr, into a common deeper folksonomy that reflects how a community organizes knowledge. Our approach addresses a number of challenges that arise while aggregating information from diverse users, namely noisy vocabulary and variations in the granularity level of the concepts expressed. Our second contribution is a method for automatically evaluating a learned folksonomy by comparing it to a reference taxonomy, e.g., the Web directory created by the Open Directory Project. Our empirical results suggest that user-specified relations are a good source of evidence for learning folksonomies.
Social Information Processing in Social News Aggregation.
arXiv, 2007.
Kristina Lerman.
The rise of social media sites, such as blogs, wikis, Digg and Flickr among others, underscores the transformation of the Web to a participatory medium in which users are collaboratively creating, evaluating and distributing information. The innovations introduced by social media have led to a new paradigm for interacting with information, which we call 'social information processing'. In this paper, we study how the social news aggregator Digg exploits social information processing to solve the problems of document recommendation and rating. First, we show, by tracking stories over time, that social networks play an important role in document recommendation. The second contribution of this paper consists of two mathematical models. The first model describes how collaborative rating and promotion of stories emerge from the independent decisions made by many users. The second model describes how a user's influence, the number of promoted stories and the user's social network, changes in time. We find qualitative agreement between predictions of the model and user data gathered from Digg.
Document Clustering in Reduced Dimension Vector Space.
1999. http://www.isi.edu/~lerman/papers/Lerman99.pdf.
Kristina Lerman.