From Frequency to Meaning: Vector Space Models of Semantics
Turney, P. D. & Pantel, P.
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
The Wisdom in Tweetonomies: Acquiring Latent Conceptual Structures from Social Awareness Streams
Wagner, C. & Strohmaier, M.
Although one might argue that little wisdom can be conveyed in messages of 140 characters or less, this paper sets out to explore whether the aggregation of messages in social awareness streams, such as Twitter, conveys meaningful information about a given domain. As a research community, we know little about the structural and semantic properties of such streams, and how they can be analyzed, characterized and used. This paper introduces a network-theoretic model of social awareness stream, a so-called together with a set of stream-based measures that allow researchers to systematically define and compare different stream aggregations. We apply the model and measures to a dataset acquired from Twitter to study emerging semantics in selected streams. The network-theoretic model and the corresponding measures introduced in this paper are relevant for researchers interested in information retrieval and ontology learning from social awareness streams. Our empirical findings demonstrate that different social awareness stream aggregations exhibit interesting differences, making them amenable for different applications.
Wissensverarbeitung und die Semantik der natürlichen Sprache: Wissensrepräsentation mit MultiNet
Helbig, H.
2008, Springer, Berlin, [10.1007/978-3-540-76278-2]
Das Buch gibt eine umfassende Darstellung einer Methodik zur Interpretation und Bedeutungsrepräsentation natürlichsprachlicher Ausdrücke. Diese Methodik der Mehrschichtigen Erweiterten Semantischen Netze (MultiNet) ist sowohl für theoretische Untersuchungen als auch für die automatische Verarbeitung natürlicher Sprache auf dem Rechner geeignet. Die vorgestellten Ergebnisse sind eingebettet in ein System von Software-Werkzeugen, die eine praktische Nutzung der MultiNet-Darstellungsmittel als Formalismus zur Bedeutungsrepräsentation sichern. Hierzu gehören: eine Werkbank für den Wissensingenieur, ein Übersetzungssystem zur automatischen Gewinnung von Bedeutungsdarstellungen natürlichsprachlicher Sätze und eine Werkbank für den Computerlexikographen.
Social tags: meaning and suggestions
Suchanek, F. M.; Vojnovic, M. & Gunawardena, D.
This paper aims to quantify two common assumptions about social tagging: (1) that tags are "meaningful" and (2) that the tagging process is influenced by tag suggestions. For (1), we analyze the semantic properties of tags and the relationship between the tags and the content of the tagged page. Our analysis is based on a corpus of search keywords, contents, titles, and tags applied to several thousand popular Web pages. Among other results, we find that the more popular tags of a page tend to be the more meaningful ones. For (2), we develop a model of how the influence of tag suggestions can be measured. From a user study with over 4,000 participants, we conclude that roughly one third of the tag applications may be induced by the suggestions. Our results would be of interest for designers of social tagging systems and are a step towards understanding how to best leverage social tags for applications such as search and information extraction.