Exploring Wikipedia and DMoz as Knowledge Bases for Engineering a User Interests Hierarchy for Social Network Applications.
In:
R. Meersman, T. Dillon and P. Herrero, editors,
On the Move to Meaningful Internet Systems: OTM 2009, pages 1238-1245.
Springer, Berlin / Heidelberg, 2009.
Mandar Haridas and Doina Caragea.
The outgrowth of social networks in recent years has resulted in opportunities for interesting data mining problems, such as interest or friendship recommendations. A global ontology over the interests specified by the users of a social network is essential for accurate recommendations. We propose, evaluate and compare three approaches to engineering a hierarchical ontology over user interests. The proposed approaches make use of two popular knowledge bases, Wikipedia and Directory Mozilla, to extract interest definitions and/or relationships between interests. More precisely, the first approach uses Wikipedia to find interest definitions, the latent semantic analysis technique to measure the similarity between interests based on their definitions, and an agglomerative clustering algorithm to group similar interests into higher-level concepts. The second approach uses the Wikipedia Category Graph to extract relationships between interests, while the third approach uses Directory Mozilla to extract relationships between interests. Our results show that the third approach, although the simplest, is the most effective for building a hierarchy over user interests.
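The first approach in the abstract above (definitions, then similarity, then agglomerative clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: plain term-frequency cosine similarity stands in for latent semantic analysis, the average-link merge criterion is an assumption, and the interest definitions and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical interest definitions (illustrative only, not taken from Wikipedia).
definitions = {
    "soccer": "team sport played with a ball on a field",
    "basketball": "team sport played with a ball on a court",
    "guitar": "string musical instrument played with fingers",
    "piano": "keyboard musical instrument played with fingers",
}


def vectorize(text):
    """Turn a definition into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def average_link(c1, c2, vectors):
    """Mean pairwise similarity between the members of two clusters."""
    sims = [cosine(vectors[x], vectors[y]) for x in c1 for y in c2]
    return sum(sims) / len(sims)


def agglomerate(vectors):
    """Repeatedly merge the two most similar clusters; return the merge history."""
    clusters = [(name,) for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = max(
            pairs,
            key=lambda p: average_link(clusters[p[0]], clusters[p[1]], vectors),
        )
        sim = average_link(clusters[i], clusters[j], vectors)
        merges.append((clusters[i], clusters[j], sim))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


vectors = {name: vectorize(text) for name, text in definitions.items()}
merges = agglomerate(vectors)
for left, right, sim in merges:
    print(left, "+", right, "similarity", round(sim, 3))
```

Each entry in the merge history corresponds to one internal node of the resulting hierarchy; early merges group closely related interests, and later merges form the higher-level concepts.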
Personalized recommendation in social tagging systems using hierarchical clustering.
In:
RecSys '08: Proceedings of the 2008 ACM conference on Recommender systems, pages 259-266.
ACM, New York, NY, USA, 2008.
Andriy Shepitsen, Jonathan Gemmell, Bamshad Mobasher and Robin Burke.
Collaborative tagging applications allow Internet users to annotate resources with personalized tags. The complex network created by many annotations, often called a folksonomy, permits users the freedom to explore tags, resources or even other users' profiles unbound from a rigid predefined conceptual hierarchy. However, the freedom afforded users comes at a cost: an uncontrolled vocabulary can result in tag redundancy and ambiguity, hindering navigation. Data mining techniques, such as clustering, provide a means to remedy these problems by identifying trends and reducing noise. Tag clusters can also be used as the basis for effective personalized recommendation, assisting users in navigation. We present a personalization algorithm for recommendation in folksonomies which relies on hierarchical tag clusters. Our basic recommendation framework is independent of the clustering method, but we use a context-dependent variant of hierarchical agglomerative clustering which takes into account the user's current navigation context in cluster selection. We present extensive experimental results on two real-world datasets. While the personalization algorithm is successful in both cases, our results suggest that folksonomies encompassing only one topic domain, rather than many topics, present an easier target for recommendation, perhaps because they are more focused and often less sparse. Furthermore, context-dependent cluster selection, an integral step in our personalization algorithm, demonstrates more utility for recommendation in multi-topic folksonomies than in single-topic folksonomies. This observation suggests that topic selection is an important strategy for recommendation in multi-topic folksonomies.
Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis.
Journal of Artificial Intelligence Research, 24:305-339, 2005.
Philipp Cimiano, Andreas Hotho and Steffen Staab.
Formal Concept Analysis used for Software Analysis and Modelling.
2005.
Wolfgang Hesse and Thomas Alan Tilley.