Articles in Conference Proceedings
NLP-based Ontology Learning from Legal Texts. A Case Study.
In: P. Casanovas, M. A. Biasiotti, E. Francesconi and M.-T. Sagri
(editors):
LOAIT, volume 321, series CEUR Workshop Proceedings, pages 113-129.
CEUR-WS.org, 2007.
Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli and Giulia Venturi.
[doi]
[BibTeX]
User-Centred Ontology Learning for Knowledge Management.
In:
NLDB '02: Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems-Revised Papers, pages 203-207.
Springer-Verlag, London, UK, 2002.
Christopher Brewster, Fabio Ciravegna and Yorick Wilks.
[doi]
[BibTeX]
Journal Articles
Ontology learning and its application to automated terminology translation.
Intelligent Systems, IEEE, 18(1): 22-31, 2003.
R. Navigli, P. Velardi and A. Gangemi.
[doi]
[Abstract]
[BibTeX]
Our OntoLearn system is an infrastructure for automated ontology learning from domain text. It is the only system, as far as we know, that uses natural language processing and machine learning techniques, and is part of a more general ontology engineering architecture. We describe the system and an experiment in which we used a machine-learned tourism ontology to automatically translate multiword terms from English to Italian. The method can apply to other domains without manual adaptation.
CRCTOL: A semantic-based domain ontology learning system.
J. Am. Soc. Inf. Sci. Technol., 61(1):150-168, 2010.
Xing Jiang and Ah-Hwee Tan.
[doi]
[Abstract]
[BibTeX]
Domain ontologies play an important role in supporting knowledge-based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. In this paper we present a system, known as Concept-Relation-Concept Tuple-based Ontology Learning (CRCTOL), for mining ontologies automatically from domain-specific documents. Specifically, CRCTOL adopts a full text parsing technique and employs a combination of statistical and lexico-syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, a word sense disambiguation algorithm that disambiguates words in the key concepts, a rule-based algorithm that extracts relations between the key concepts, and a modified generalized association rule mining algorithm that prunes unimportant relations for ontology learning. As a result, the ontologies learned by CRCTOL are more concise and contain a richer semantics in terms of the range and number of semantic relations compared with alternative systems. We present two case studies where CRCTOL is used to build a terrorism domain ontology and a sport event domain ontology. At the component level, quantitative evaluation by comparing with Text-To-Onto and its successor Text2Onto has shown that CRCTOL is able to extract concepts and semantic relations with a significantly higher level of accuracy. At the ontology level, the quality of the learned ontologies is evaluated by either employing a set of quantitative and qualitative methods including analyzing the graph structural property, comparison to WordNet, and expert rating, or directly comparing with a human-edited benchmark ontology, demonstrating the high quality of the ontologies learned. © 2010 Wiley Periodicals, Inc.
Dissertations
Domain ontology learning from the web: an unsupervised, automatic and domain independent approach.
PhD thesis, Saarbrücken, 2007.
David Sánchez.
[doi]
[BibTeX]
Articles in Conference Proceedings
Towards Linguistically Grounded Ontologies.
In:
6th Annual European Semantic Web Conference (ESWC2009), pages 111-125.
2009.
Paul Buitelaar, Philipp Cimiano, Peter Haase and Michael Sintek.
[doi]
[Abstract]
[BibTeX]
In this paper we argue why it is necessary to associate linguistic information with ontologies and why more expressive models, beyond RDFS, OWL and SKOS, are needed to capture the relation between natural language constructs on the one hand and ontological entities on the other. We argue that in the light of tasks such as ontology-based information extraction, ontology learning and population from text and natural language generation from ontologies, currently available datamodels are not sufficient as they only allow to associate atomic terms without linguistic grounding or structure to ontology elements. Towards realizing a more expressive model for associating linguistic information to ontology elements, we base our work presented here on previously developed models (LingInfo, LexOnto, LMF) and present a new joint model for linguistic grounding of ontologies called LexInfo. LexInfo combines essential design aspects of LingInfo and LexOnto and builds on a sound model for representing computational lexica called LMF which has been recently approved as a standard under ISO.
Journal Articles
Issues in learning an ontology from text.
BMC Bioinformatics, 10 Suppl 5, 2009.
C Brewster, S Jupp, J Luciano, D Shotton, R D Stevens and Z Zhang.
[doi]
[Abstract]
[BibTeX]
BACKGROUND: Ontology construction for any domain is a labour intensive and complex process. Any methodology that can reduce the cost and increase efficiency has the potential to make a major impact in the life sciences. This paper describes an experiment in ontology construction from text for the animal behaviour domain. Our objective was to see how much could be done in a simple and relatively rapid manner using a corpus of journal papers. We used a sequence of pre-existing text processing steps, and here describe the different choices made to clean the input, to derive a set of terms and to structure those terms in a number of hierarchies. We describe some of the challenges, especially that of focusing the ontology appropriately given a starting point of a heterogeneous corpus. RESULTS: Using mainly automated techniques, we were able to construct an 18055 term ontology-like structure with 73% recall of animal behaviour terms, but a precision of only 26%. We were able to clean unwanted terms from the nascent ontology using lexico-syntactic patterns that tested the validity of term inclusion within the ontology. We used the same technique to test for subsumption relationships between the remaining terms to add structure to the initially broad and shallow structure we generated. All outputs are available at http://thirlmere.aston.ac.uk/iffer/animalbehaviour/. CONCLUSION: We present a systematic method for the initial steps of ontology or structured vocabulary construction for scientific domains that requires limited human effort and can make a contribution both to ontology learning and maintenance. The method is useful both for the exploration of a scientific domain and as a stepping stone towards formally rigorous ontologies. The filtering of recognised terms from a heterogeneous corpus to focus upon those that are the topic of the ontology is identified to be one of the main challenges for research in ontology learning.
Dissertations
Mind the Gap: Bridging from Text to Ontological Knowledge.
PhD thesis, Department of Computer Science, University of Sheffield, 2008.
Christopher Brewster.
[BibTeX]
Journal Articles
Ontologies on Demand? - A Description of the State-of-the-Art, Applications, Challenges and Trends for Ontology Learning from Text Information.
Information, Wissenschaft und Praxis, 57(6-7):315-320, 2006.
Philipp Cimiano, Johanna Völker and Rudi Studer.
[doi]
[BibTeX]
Articles in Conference Proceedings
Semi-automatic extraction and modeling of ontologies using Wikipedia XML Corpus.
In:
Second International Conference on the Applications of Digital Information and Web Technologies (ICADIWT '09), pages 446-451.
2009.
L. De Silva and L. Jayaratne.
[doi]
[Abstract]
[BibTeX]
This paper introduces WikiOnto: a system that assists in the extraction and modeling of topic ontologies in a semi-automatic manner using a preprocessed document corpus derived from Wikipedia. Based on the Wikipedia XML Corpus, we present a three-tiered framework for extracting topic ontologies in quick time and a modeling environment to refine these ontologies. Using natural language processing (NLP) and other machine learning (ML) techniques along with a very rich document corpus, this system proposes a solution to a task that is generally considered extremely cumbersome. The initial results of the prototype suggest strong potential of the system to become highly successful in ontology extraction and modeling and also inspire further research on extracting ontologies from other semi-structured document corpora as well.
Book Chapters
Learning Expressive Ontologies.
In:
Ontology Learning and Population: Bridging the Gap between Text and Knowledge.
IOS Press, 2008.
J. Völker, P. Haase and P. Hitzler.
[BibTeX]
Articles in Conference Proceedings
A Survey of Ontology Evaluation Techniques.
In:
Proc. of 8th Int. Multi-Conf. Information Society, pages 166-169.
2005.
Janez Brank, Marko Grobelnik and Dunja Mladenić.
[BibTeX]
Constructing folksonomies from user-specified relations on flickr.
In:
WWW '09: Proceedings of the 18th International Conference on World Wide Web, pages 781-790.
ACM, New York, NY, USA, 2009.
A. Plangprasopchok and K. Lerman.
[doi]
[Abstract]
[BibTeX]
Automatic folksonomy construction from tags has attracted much attention recently. However, inferring hierarchical relations between concepts from tags has a drawback in that it is difficult to distinguish between more popular and more general concepts. Instead of tags we propose to use user-specified relations for learning folksonomy. We explore two statistical frameworks for aggregating many shallow individual hierarchies, expressed through the collection/set relations on the social photosharing site Flickr, into a common deeper folksonomy that reflects how a community organizes knowledge. Our approach addresses a number of challenges that arise while aggregating information from diverse users, namely noisy vocabulary, and variations in the granularity level of the concepts expressed. Our second contribution is a method for automatically evaluating learned folksonomy by comparing it to a reference taxonomy, e.g., the Web directory created by the Open Directory Project. Our empirical results suggest that user-specified relations are a good source of evidence for learning folksonomies.
An Unsupervised Model for Exploring Hierarchical Semantics from Social Annotations.
In: K. Aberer, K.-S. Choi, N. Noy, D. Allemang, K.-I. Lee, L. J. B. Nixon, J. Golbeck, P. Mika, D. Maynard, G. Schreiber and P. Cudré-Mauroux
(editors):
Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea, volume 4825, series LNCS, pages 673-686.
Springer Verlag, Berlin, Heidelberg, 2007.
Mianwei Zhou, Shenghua Bao, Xian Wu and Yong Yu.
[doi]
[Abstract]
[BibTeX]
This paper deals with the problem of exploring hierarchical semantics from social annotations. Recently, social annotation services have become more and more popular in Semantic Web. It allows users to arbitrarily annotate web resources, thus, largely lowers the barrier to cooperation. Furthermore, through providing abundant meta-data resources, social annotation might become a key to the development of Semantic Web. However, on the other hand, social annotation has its own apparent limitations, for instance, 1) ambiguity and synonym phenomena and 2) lack of hierarchical information. In this paper, we propose an unsupervised model to automatically derive hierarchical semantics from social annotations. Using a social bookmark service Del.icio.us as example, we demonstrate that the derived hierarchical semantics has the ability to compensate those shortcomings. We further apply our model on another data set from Flickr to testify our model's applicability on different environments. The experimental results demonstrate our model's efficiency.
FolksOntology: An Integrated Approach for Turning Folksonomies into Ontologies.
In:
Bridging the Gap between Semantic Web and Web 2.0 (SemNet 2007), pages 57-70.
Innsbruck, 2007.
Céline Van Damme, Martin Hepp and Katharina Siorpaes.
[doi]
[BibTeX]
Semantics made by you and me: Self-emerging ontologies can capture the diversity of shared knowledge.
In:
Proceedings of the 2nd Web Science Conference (WebSci10).
Raleigh, NC, USA, 2010.
Dominik Benz, Andreas Hotho and Gerd Stumme.
[BibTeX]
Journal Articles
The Effectiveness of Latent Semantic Analysis for Building Up a Bottom-up Taxonomy from Folksonomy Tags.
World Wide Web, 12(4):421-440, 2009.
Takeharu Eda, Masatoshi Yoshikawa, Toshio Uchiyama and Tadasu Uchiyama.
[doi]
[BibTeX]
Articles in Conference Proceedings
Multilingual Evidence Improves Clustering-based Taxonomy Extraction.
In: M. Ghallab, C. D. Spyropoulos, N. Fakotakis and N. M. Avouris
(editors):
ECAI, volume 178, series Frontiers in Artificial Intelligence and Applications, pages 288-292.
IOS Press, 2008.
Hans Hjelm and Paul Buitelaar.
[doi]
[BibTeX]
Cross-lingual Information Retrieval with Explicit Semantic Analysis.
In:
Working Notes for the CLEF 2008 Workshop.
2008.
Philipp Sorg and Philipp Cimiano.
[doi]
[BibTeX]
Journal Articles
Ontology learning from domain specific web documents.
International Journal of Metadata, Semantics and Ontologies, 4:24-33(10), 2009.
Maryam Hazman, Samhaa R. El-Beltagy and Ahmed Rafea.
[doi]
[Abstract]
[BibTeX]
Ontologies play a vital role in many web- and internet-related applications. This work presents a system for accelerating the ontology building process via semi-automatically learning a hierarchical ontology given a set of domain-specific web documents and a set of seed concepts. The methods are tested with web documents in the domain of agriculture. The ontology is constructed through the use of two complementary approaches. The presented system has been used to build an ontology in the agricultural domain using a set of Arabic extension documents and evaluated against a modified version of the AGROVOC ontology.