
Measuring the Similarity of Concept Hierarchies and its Influence on the Evaluation of Learning Procedures

K. Dellschaft. Institute for Computer Science, University of Koblenz-Landau, Germany, (December 2005)

Abstract

The information available in corporate intranets and on the Internet grows from day to day. When looking for a specific piece of information, the question is often how to find it. Researchers therefore aim to enable more efficient access to large collections of information. Many of the algorithms developed for this purpose depend on additional domain knowledge to improve their results (see (Gonzalo et al., 1998) and (De Buenaga Rodríguez et al., 2000)). This domain knowledge is often available in the form of ontologies. An ontology reflects an understanding of a domain on which a community has agreed. It consists of different parts, such as a set of concepts and their mutual relations. These concepts are organized in a hierarchy of sub- and superconcepts. For an ontology to actually improve the results of an application, it is crucial to model the domain in question accurately and exhaustively. Because this is a very complex and time-consuming task, one goal is to extract an ontology at least semi-automatically. Such learning procedures use documents from the domain to extract the necessary information. Often these documents are natural language texts, such as websites or dictionaries, that contain domain knowledge (see (Kietz, Maedche and Volz, 2000) and (Cimiano, Hotho and Staab, 2004)). The quality of an automatically learned ontology is influenced mainly by two parameters: the learning procedure itself and the document corpus. Several alternative learning procedures exist. They are further differentiated by the types of documents they can process, i.e. unstructured, semi-structured or structured documents. Websites are an example of unstructured documents, while dictionary entries and encyclopedia articles are examples of semi-structured documents. Finally, documents written in artificial languages, such as database schemas, are classified as structured documents.
It is often assumed that the availability of structural information leads to a higher-quality extracted ontology. To compare the different learning procedures, so that one can choose the best procedure for a given purpose, they are often evaluated on an example corpus of documents. The quality of the extracted ontology is then measured as objectively as possible. Such an evaluation may also be used to fine-tune the parameters of a learning procedure so that better results are achieved. One way of objectively evaluating a learning procedure is to measure the similarity between the learned ontology and a previously defined reference ontology. This similarity then serves as a proxy for quality. It is assumed that the learning procedure will always produce results of comparable quality. This quality will be influenced only by the document corpus, which must contain the correct information.
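
The idea of scoring a learned concept hierarchy against a reference hierarchy can be illustrated with a minimal sketch. It is not the measure developed in the thesis, which the abstract does not specify. It assumes a hypothetical representation in which each concept maps to its direct superconcepts, and it compares the sets of transitive is-a pairs using precision, recall and F1. All function names and the example hierarchies are made up for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: concept -> set of its direct superconcepts.
Hierarchy = Dict[str, Set[str]]


def ancestors(hierarchy: Hierarchy, concept: str) -> Set[str]:
    """Return all transitive superconcepts of `concept`."""
    seen: Set[str] = set()
    stack = list(hierarchy.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:  # the seen set also guards against cycles
            seen.add(parent)
            stack.extend(hierarchy.get(parent, ()))
    return seen


def is_a_pairs(hierarchy: Hierarchy) -> Set[Tuple[str, str]]:
    """Return every (concept, superconcept) pair implied by the hierarchy."""
    return {(c, a) for c in hierarchy for a in ancestors(hierarchy, c)}


def precision_recall_f1(
    learned: Hierarchy, reference: Hierarchy
) -> Tuple[float, float, float]:
    """Compare the is-a pairs of a learned hierarchy with a reference."""
    learned_pairs = is_a_pairs(learned)
    reference_pairs = is_a_pairs(reference)
    common = learned_pairs & reference_pairs
    precision = len(common) / len(learned_pairs) if learned_pairs else 0.0
    recall = len(common) / len(reference_pairs) if reference_pairs else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only: a reference hierarchy and a learned hierarchy
# in which "poodle" and "cat" were attached to the wrong parents.
reference: Hierarchy = {
    "animal": set(),
    "dog": {"animal"},
    "cat": {"animal"},
    "poodle": {"dog"},
}
learned: Hierarchy = {
    "animal": set(),
    "dog": {"animal"},
    "poodle": {"animal"},
    "cat": {"dog"},
}

if __name__ == "__main__":
    print(precision_recall_f1(learned, reference))
```

A pair-based overlap like this rewards a learned hierarchy for every correct ancestor relation, so a concept placed too high or too low in the tree still earns partial credit. Other similarity measures weight such errors differently, which is why the choice of measure can change how learning procedures rank against each other.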

Links and resources

URL:

http://.uni-koblenz.de/FB4/Publications/Theses/ShowThesis?id=1908

BibTeX key:

dellschaft2005measuring


Cite this publication

@mastersthesis{dellschaft2005measuring,
  added-at = {2011-02-04T16:10:01.000+0100},
  address = {Germany},
  author = {Dellschaft, Klaas},
  biburl = {https://puma.uni-kassel.de/bibtex/246305dd6539f13b88dd7d288bc5dbab6/benz},
  dateadded = {2006-09-01},
  file = {dellschaft2005measuring.pdf:dellschaft2005measuring.pdf:PDF},
  interhash = {197543e8a02474709ffa0db4b9428d4f},
  intrahash = {46305dd6539f13b88dd7d288bc5dbab6},
  keywords = {evaluation_methods diploma_thesis},
  lastdatemodified = {2006-09-01},
  lastname = {Dellschaft},
  month = {December},
  own = {notown},
  pdf = {dellschaft05-measuring.pdf},
  read = {notread},
  school = {Institute for Computer Science, University of Koblenz-Landau},
  timestamp = {2011-02-04T16:10:01.000+0100},
  title = {Measuring the Similarity of Concept Hierarchies and its Influence on the Evaluation of Learning Procedures},
  url = {http://.uni-koblenz.de/FB4/Publications/Theses/ShowThesis?id=1908},
  year = 2005
}

%0 Thesis
%1 dellschaft2005measuring
%A Dellschaft, Klaas
%C Germany
%D 2005
%K evaluation_methods diploma_thesis
%T Measuring the Similarity of Concept Hierarchies and its Influence on the Evaluation of Learning Procedures
%U http://.uni-koblenz.de/FB4/Publications/Theses/ShowThesis?id=1908
