S. Lalwani and M. Huhns. ACM-SE 47: Proceedings of the 47th Annual Southeast Regional Conference, pages 1--2. New York, NY, USA, ACM, (2009)
In this paper, we describe our investigation of tagging systems and the derivation of ontological structure, in the form of a folksonomy, from a set of tags. Tagging systems are becoming popular because the amount of information available on some websites is too large for humans to browse manually, and because the types of information involved (multimedia data) are unsuitable for the indexers that conventional search engines use to organize content. However, tag-based search is inaccurate and incomplete (low precision and recall), because the semantics of tags are both weak and ambiguous. The basic problem is that search engines treat tags like keywords and consider each tag in isolation, even though a collection of tagged data carries additional implicit semantics. We develop and investigate techniques to make these implicit semantics explicit, so that search can be improved in both precision and recall and additional utility can be derived from the tags that people associate with multimedia items (pictures, blogs, videos, etc.). Our approach is to propose hypotheses about the ontological structure inherent in a collection of tags and then attempt to verify those hypotheses statistically. We conducted more than one hundred experimental searches on Flickr with different tags and, through statistical analysis, discovered how users assign tags and what ontological knowledge implicit in these tags can be made explicit and, ultimately, exploited.
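The abstract does not say which statistical tests the authors applied. As an illustration only, the following minimal Python sketch shows one common way to test a hypothesis of the form "tag A is broader than tag B" from co-occurrence counts: A is taken to subsume B when most items tagged B also carry A, but not the reverse. The function name `subsumption_pairs`, the `threshold` and `min_count` parameters, and the toy `photos` data are all assumptions made for this example; they are not taken from the paper or from Flickr.

```python
from collections import Counter
from itertools import combinations


def subsumption_pairs(tagged_items, threshold=0.8, min_count=2):
    """Return (broader, narrower) tag pairs suggested by co-occurrence.

    Hypothetical heuristic, not the authors' method: tag `a` is treated as
    broader than tag `b` when P(a | b) >= threshold and P(b | a) < threshold.
    Tags seen on fewer than `min_count` items are ignored as too sparse.
    """
    tag_counts = Counter()   # number of items carrying each tag
    pair_counts = Counter()  # number of items carrying both tags of a pair
    for tags in tagged_items:
        unique = sorted(set(tags))  # sort so each pair has one canonical order
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    results = []
    for (a, b), both in pair_counts.items():
        if tag_counts[a] < min_count or tag_counts[b] < min_count:
            continue
        p_a_given_b = both / tag_counts[b]
        p_b_given_a = both / tag_counts[a]
        if p_a_given_b >= threshold and p_b_given_a < threshold:
            results.append((a, b))
        elif p_b_given_a >= threshold and p_a_given_b < threshold:
            results.append((b, a))
    return sorted(results)


# Toy data standing in for the tag sets of five photos (invented for this sketch).
photos = [
    {"animal", "dog"},
    {"animal", "dog", "puppy"},
    {"animal", "cat"},
    {"animal", "cat", "kitten"},
    {"animal", "bird"},
]

print(subsumption_pairs(photos))
# [('animal', 'cat'), ('animal', 'dog')]
```

On real tag data, both the threshold and the minimum count would need tuning, and a pair that passes the test is only a candidate relation that still requires statistical verification of the kind the abstract describes.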