TagPlus: A Retrieval System using Synonym Tag in Folksonomy
Lee, S.-S. & Yong, H.-S.
'Proceedings of the 2007 International Conference on Multimedia and Ubiquitous Engineering', MUE '07, IEEE Computer Society, Washington, DC, USA, 294-298 (2007)
Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs, videos and other content. In a ubiquitous computing environment, users access data through various kinds of mobile terminals. Users therefore want more accurate search results, because communication is expensive and the abuse of tags produces useless results. In this paper, we first describe the current limitations of tagging services. We then describe the system (TagPlus) that we implemented to minimize the ambiguity caused by the lack of synonym control. Finally, we give experimental results.
Calculating Relevancy when Searching in Semantically and Hierarchically Structured Data as Exemplified with the eZ Publish Enterprise Content Management System's Internal Search Engine
Nunninger, T.
2007, Master's thesis, Albert-Ludwigs-Universität Freiburg
mantical meaning. Classical concepts of information retrieval usually regard data as plain text; thus they are not able to take into account the structure and semantic meaning of XML data. XML retrieval mainly faces two challenges. First, you need a powerful query language that allows you to formulate queries that take into account both structural and content-related search conditions. I will describe XIRQL as introduced by [FG04]. It is the most powerful and generic query language for XML retrieval, as it is based on XPath and extends it with several concepts for information retrieval. The second challenge is that you need a concept to find the most specific elements in the dataset. A relevant node (including its subtree) can contain other sub-nodes that could be relevant as well. Thus, in practice, the ranking-related statistics of a node in the search index are distributed across its subtree. The question arises how to combine those ranking statistics in a subtree to compute the ranking value of the whole subtree. I found only concepts that calculate the ranking values of the relevant sub-nodes and combine those ranking values with the ranking value of the root node of the subtree. As this concept is not convincing, I go one step back: instead of calculating the ranking values of the sub-nodes and "summarizing" them, I will summarize the ranking statistics of the sub-nodes and, based on this, calculate the ranking value of the node. In this way you can adapt many proven information retrieval approaches directly, and it should be possible to easily apply advanced algorithms for fine-tuning the result as well. Finally, I implemented an evaluation environment using basic concepts of XIRQL and the vector space model. A case study with real user feedback provided promising results.
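The idea of summarizing ranking statistics before scoring, rather than combining per-node scores, can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the `Node` class, the length-normalized tf-idf weighting, and the `idf` values are all assumptions chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical XML element: its own term counts plus child elements."""
    name: str
    terms: Counter = field(default_factory=Counter)
    children: list = field(default_factory=list)


def subtree_stats(node):
    """Sum the raw term counts of a node and all of its descendants."""
    total = Counter(node.terms)
    for child in node.children:
        total.update(subtree_stats(child))
    return total


def score(node, query, idf):
    """Score a node from its summarized subtree statistics.

    Assumed weighting: term frequency normalized by subtree length,
    multiplied by a precomputed inverse document frequency.
    """
    stats = subtree_stats(node)
    length = sum(stats.values())
    if length == 0:
        return 0.0
    return sum((stats[term] / length) * idf.get(term, 0.0) for term in query)


# Illustrative data only: a tiny article with a title and a body.
title = Node("title", Counter({"search": 1}))
body = Node("body", Counter({"search": 2, "engine": 1}))
article = Node("article", Counter(), [title, body])
idf = {"search": 1.0, "engine": 2.0}

ranked = sorted(
    [article, title, body],
    key=lambda n: score(n, ["search"], idf),
    reverse=True,
)
```

Because the score is computed from the aggregated counts, any standard weighting scheme can be swapped into `score` without changing how the subtree is traversed, which is the adaptability the abstract points to.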