Analyzing Tag Semantics Across Collaborative Tagging Systems
Benz, D.; Grobelnik, M.; Hotho, A.; Jäschke, R.; Mladenic, D.; Servedio, V. D. P.; Sizov, S. & Szomszor, M.
Alani, H.; Staab, S. & Stumme, G., ed., 'Proceedings of the Dagstuhl Seminar on Social Web Communities' (2008)
The objective of our group was to exploit state-of-the-art Information Retrieval methods for finding associations and dependencies between tags, capturing and representing differences in tagging behavior and vocabulary of various folksonomies, with the overall aim to better understand the semantics of tags and the tagging process. Therefore we analyze the semantic content of tags in the Flickr and Delicious folksonomies. We find that: tag context similarity leads to meaningful results in Flickr, despite its narrow folksonomy character; the comparison of tags across Flickr and Delicious shows little semantic overlap, with tags in Flickr associated more with visual aspects than with technological ones, as seems to be the case in Delicious; there are regions in the tag-tag space, equipped with the cosine similarity metric, that are characterized by high density; the order of tags inside a post has semantic relevance.
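As a rough illustration of tag context similarity under the cosine metric, the following minimal sketch represents each tag by the tags it co-occurs with and compares two such context vectors. The toy posts and the helper names `tag_context_vectors` and `cosine` are assumptions made for this example, not the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def tag_context_vectors(posts):
    """Map each tag to a Counter of the tags it co-occurs with in a post."""
    vectors = defaultdict(Counter)
    for tags in posts:
        # Count every ordered pair of distinct tags within the same post.
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical posts: each post is the list of tags assigned to one resource.
posts = [
    ["sunset", "beach", "sea"],
    ["sunrise", "beach", "sea"],
    ["python", "programming", "code"],
]
vectors = tag_context_vectors(posts)
sim_related = cosine(vectors["sunset"], vectors["sunrise"])
sim_unrelated = cosine(vectors["sunset"], vectors["python"])
```

Here "sunset" and "sunrise" never share a post but have identical contexts, so their similarity is close to 1, while "sunset" and "python" share no context and score 0.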
Logsonomy - Social Information Retrieval with Logdata
Krause, B.; Jäschke, R.; Hotho, A. & Stumme, G.
'HT '08: Proceedings of the Nineteenth ACM Conference on Hypertext and Hypermedia', ACM, New York, NY, USA, 157-166 (2008)
Social bookmarking systems constitute an established part of the Web 2.0. In such systems users describe bookmarks by keywords called tags. The structure behind these social systems, called folksonomies, can be viewed as a tripartite hypergraph of user, tag and resource nodes. This underlying network shows specific structural properties that explain its growth and the possibility of serendipitous exploration. Today’s search engines represent the gateway to retrieve information from the World Wide Web. Short queries typically consisting of two to three words describe a user’s information need. In response to the displayed results of the search engine, users click on the links of the result page as they expect the answer to be of relevance. This clickdata can be represented as a folksonomy in which queries are descriptions of clicked URLs. The resulting network structure, which we will term logsonomy, is very similar to the one of folksonomies. In order to find out about its properties, we analyze the topological characteristics of the tripartite hypergraph of queries, users and bookmarks on a large snapshot of and on query logs of two large search engines. All of the three datasets show small world properties. The tagging behavior of users, which is explained by preferential attachment of the tags in social bookmark systems, is reflected in the distribution of single query words in search engines. We can conclude that the clicking behaviour of search engine users based on the displayed search results and the tagging behaviour of social bookmarking users are driven by similar dynamics.
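The idea of reading click data as a folksonomy can be sketched as follows. This is only an illustration under stated assumptions: the click log format `(user, query, url)`, the choice to split queries into lowercase words, and the function name `build_logsonomy` are all hypothetical and not taken from the paper.

```python
from collections import Counter


def build_logsonomy(click_log):
    """Turn (user, query, clicked_url) records into (user, word, url) triples.

    Assumption: each query word acts like a tag that the user assigns to
    the clicked URL, mirroring the (user, tag, resource) triples of a
    folksonomy.
    """
    triples = set()
    for user, query, url in click_log:
        for word in query.lower().split():
            triples.add((user, word, url))
    return triples


# Hypothetical click log entries.
click_log = [
    ("u1", "python tutorial", "http://example.org/a"),
    ("u2", "python docs", "http://example.org/b"),
    ("u1", "Python tutorial", "http://example.org/a"),
]
triples = build_logsonomy(click_log)

# Frequency of single query words over the distinct triples.
word_counts = Counter(word for _, word, _ in triples)
```

The repeated click by `u1` collapses into the same triples, so the result is a set of hyperedges over users, query words and URLs whose word frequencies can then be inspected.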
Extracting semantic relations from query logs
Baeza-Yates, R. & Tiberi, A.
'KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining', ACM, New York, NY, USA, 76-85 (2007)
In this paper we study a large query log of more than twenty million queries with the goal of extracting the semantic relations that are implicitly captured in the actions of users submitting queries and clicking answers. Previous query log analyses were mostly done with just the queries and not the actions that followed after them. We first propose a novel way to represent queries in a vector space based on a graph derived from the query-click bipartite graph. We then analyze the graph produced by our query log, showing that it is less sparse than previous results suggested, and that almost all the measures of these graphs follow power laws, shedding some light on the searching user behavior as well as on the distribution of topics that people want in the Web. The representation we introduce allows us to infer interesting semantic relationships between queries. Second, we provide an experimental analysis of the quality of these relations, showing that most of them are relevant. Finally we sketch an application that detects multitopical URLs.
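One plausible reading of a query representation derived from the query-click bipartite graph is sketched below: each query becomes a vector of click counts over URLs, and two queries are linked when they share a clicked URL. The cosine edge weight, the sample data and the names `query_vectors` and `query_graph` are assumptions for this illustration; the paper's exact construction may differ.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def query_vectors(clicks):
    """Represent each query as a Counter of click counts per URL."""
    vectors = defaultdict(Counter)
    for query, url in clicks:
        vectors[query][url] += 1
    return vectors


def query_graph(vectors):
    """Link queries that share a clicked URL, weighted by cosine similarity."""
    edges = {}
    for q1, q2 in combinations(sorted(vectors), 2):
        shared = vectors[q1].keys() & vectors[q2].keys()
        if not shared:
            continue  # no common clicked URL, so no edge
        dot = sum(vectors[q1][u] * vectors[q2][u] for u in shared)
        norm1 = sqrt(sum(c * c for c in vectors[q1].values()))
        norm2 = sqrt(sum(c * c for c in vectors[q2].values()))
        edges[(q1, q2)] = dot / (norm1 * norm2)
    return edges


# Hypothetical (query, clicked_url) pairs.
clicks = [
    ("jaguar car", "cars.example/jaguar"),
    ("jaguar price", "cars.example/jaguar"),
    ("jaguar price", "dealers.example/list"),
    ("big cats", "zoo.example/jaguar"),
]
edges = query_graph(query_vectors(clicks))
```

In this toy log only "jaguar car" and "jaguar price" share a clicked URL, so the derived query graph has a single edge between them.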
Extracting Relations in Social Networks from the Web Using Similarity Between Collective Contexts
Mori, J.; Tsujishita, T.; Matsuo, Y. & Ishizuka, M.
'International Semantic Web Conference', 487-500 (2006)
Social networks have recently garnered considerable interest. With the intention of utilizing social networks for the Semantic Web, several studies have examined automatic extraction of social networks. However, most methods have addressed extraction of the strength of relations. Our goal is extracting the underlying relations between entities that are embedded in social networks. To this end, we propose a method that automatically extracts labels that describe relations among entities. Fundamentally, the method clusters similar entity pairs according to their collective contexts in Web documents. The descriptive labels for relations are obtained from results of clustering. The proposed method is entirely unsupervised and is easily incorporated into existing social network extraction methods. Our method also contributes to ontology population by elucidating relations between instances in social networks. Our experiments conducted on entities in political social networks achieved clustering with high precision and recall. We extracted appropriate relation labels to represent the entities.