TY  - CHAP
AU  - Haridas, Mandar
AU  - Caragea, Doina
A2  - Meersman, Robert
A2  - Dillon, Tharam
A2  - Herrero, Pilar
T1  - Exploring Wikipedia and DMoz as Knowledge Bases for Engineering a User Interests Hierarchy for Social Network Applications
T2  - On the Move to Meaningful Internet Systems: OTM 2009
PB  - Springer
CY  - Berlin / Heidelberg
PY  - 2009/
VL  - 5871
IS  - 
SP  - 1238
EP  - 1245
UR  - http://dx.doi.org/10.1007/978-3-642-05151-7_35
M3  - 10.1007/978-3-642-05151-7_35
KW  - dmoz
KW  - genta11
KW  - hierarchy
KW  - taxonomy
KW  - wordnet
KW  - ol_web2.0
KW  - data_wikis
KW  - methods_concepthierarchy
L1  - 
SN  - 
N1  - 
AB  - The outgrowth of social networks in the recent years has resulted in opportunities for interesting data mining problems, such as interest or friendship recommendations. A global ontology over the interests specified by the users of a social network is essential for accurate recommendations. We propose, evaluate and compare three approaches to engineering a hierarchical ontology over user interests. The proposed approaches make use of two popular knowledge bases, Wikipedia and Directory Mozilla, to extract interest definitions and/or relationships between interests. More precisely, the first approach uses Wikipedia to find interest definitions, the latent semantic analysis technique to measure the similarity between interests based on their definitions, and an agglomerative clustering algorithm to group similar interests into higher level concepts. The second approach uses the Wikipedia Category Graph to extract relationships between interests, while the third approach uses Directory Mozilla to extract relationships between interests. Our results show that the third approach, although the simplest, is the most effective for building a hierarchy over user interests.
ER  -

TY  - CONF
AU  - De Silva, L.
AU  - Jayaratne, L.
A2  - 
T1  - Semi-automatic extraction and modeling of ontologies using Wikipedia XML Corpus
T2  - Applications of Digital Information and Web Technologies, 2009. ICADIWT '09. Second International Conference on the
PB  - 
CY  - 
PY  - 2009/08
M2  - 
VL  - 
IS  - 
SP  - 446
EP  - 451
UR  - http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=5273826&arnumber=5273871&count=156&index=116
M3  - 10.1109/ICADIWT.2009.5273871
KW  - learning
KW  - ol_web2.0
KW  - ontology
KW  - ontology_learning
KW  - semi_automatic
KW  - wikipedia
KW  - data_wikis
L1  - 
SN  - 
N1  - 
AB  - This paper introduces WikiOnto: a system that assists in the extraction and modeling of topic ontologies in a semi-automatic manner using a preprocessed document corpus derived from Wikipedia. Based on the Wikipedia XML Corpus, we present a three-tiered framework for extracting topic ontologies in quick time and a modeling environment to refine these ontologies. Using natural language processing (NLP) and other machine learning (ML) techniques along with a very rich document corpus, this system proposes a solution to a task that is generally considered extremely cumbersome. The initial results of the prototype suggest strong potential of the system to become highly successful in ontology extraction and modeling and also inspire further research on extracting ontologies from other semi-structured document corpora as well.
ER  -

TY  - CONF
AU  - Grineva, Maria
AU  - Grinev, Maxim
AU  - Turdakov, Denis
AU  - Velikhov, Pavel
A2  - 
T1  - Harnessing Wikipedia for Smart Tags Clustering
T2  - Proceedings of the International Workshop on Knowledge Acquisition from the Social Web (KASW2008)
PB  - 
CY  - 
PY  - 2008/
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - 
M3  - 
KW  - clustering
KW  - ol_web2.0
KW  - tags
KW  - wikipedia
KW  - methods_concepts
KW  - data_wikis
L1  - 
SN  - 
N1  - 
N1  - 
AB  - The quality of the current tagging services can be greatly improved if the service is able to cluster tags by their meaning. Tag clouds clustered by higher level topics enable the users to explore their tag space, which is especially needed when tag clouds become large. We demonstrate TagCluster - a tool for automated tag clustering that harnesses knowledge from Wikipedia about semantic relatedness between tags and names of categories to achieve smart clustering. Our approach shows much better quality of clusters compared to the existing techniques that rely on tag co-occurrence analysis in the tagging service.
ER  -

TY  - CONF
AU  - Medelyan, O.
AU  - Legg, C.
A2  - 
T1  - Integrating Cyc and Wikipedia: Folksonomy meets rigorously defined common-sense
T2  - Proceedings of the WIKI-AI: Wikipedia and AI Workshop at the AAAI
PB  - 
CY  - 
PY  - 2008/
M2  - 
VL  - 8
IS  - 
SP  - 
EP  - 
UR  - http://scholar.google.de/scholar.bib?q=info:hgFpsjJR__4J:scholar.google.com/&output=citation&hl=de&as_sdt=2000&ct=citation&cd=58
M3  - 
KW  - cyc
KW  - ol_web2.0
KW  - tag_concept_mapping
KW  - data_wikis
L1  - 
SN  - 
N1  - 
N1  - 
AB  - Integration of ontologies begins with establishing mappings between their concept entries. We map categories from the largest manually-built ontology, Cyc, onto Wikipedia articles describing corresponding concepts. Our method draws both on Wikipedia’s rich but chaotic hyperlink structure and Cyc’s carefully defined taxonomic and common-sense knowledge. On 9,333 manual alignments by one person, we achieve an F-measure of 90%; on 100 alignments by six human subjects the average agreement of the method with the subject is close to their agreement with each other. We cover 62.8% of Cyc categories relating to common-sense knowledge and discuss what further information might be added to Cyc given this substantial new alignment.
ER  -

TY  - CONF
AU  - Nazir, F.
AU  - Takeda, H.
A2  - 
T1  - Extraction and analysis of tripartite relationships from Wikipedia
T2  - IEEE International Symposium on Technology and Society
PB  - 
CY  - 
PY  - 2008/06
M2  - 
VL  - 
IS  - 
SP  - 1
EP  - 13
UR  - http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4559785
M3  - 10.1109/ISTAS.2008.4559785
KW  - ol_web2.0
KW  - ontology_learning
KW  - wikipedia
KW  - data_wikis
L1  - 
SN  - 978-1-4244-1669-1
N1  - 
N1  - 
AB  - Social aspects are critical in the decision making process for social actors (human beings). Social aspects can be categorized into social interaction, social communities, social groups or any kind of behavior that emerges from interlinking, overlapping or similarities between interests of a society. These social aspects are dynamic and emergent. Therefore, interlinking them in a social structure, based on bipartite affiliation network, may result in isolated graphs. The major reason is that as these correspondences are dynamic and emergent, they should be coupled with more than a single affiliation in order to sustain the interconnections during interest evolutions. In this paper we propose to interlink actors using multiple tripartite graphs rather than a bipartite graph which was the focus of most of the previous social network building techniques. The utmost benefit of using tripartite graphs is that we can have multiple and hierarchical links between social actors. Therefore in this paper we discuss the extraction, plotting and analysis methods of tripartite relations between authors, articles and categories from Wikipedia. Furthermore, we also discuss the advantages of tripartite relationships over bipartite relationships. As a conclusion of this study we argue based on our results that to build useful, robust and dynamic social networks, actors should be interlinked in one or more tripartite networks.
ER  -

TY  - CONF
AU  - Auer, Sören
AU  - Lehmann, Jens
A2  - 
T1  - What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content
T2  - ESWC
PB  - 
CY  - 
PY  - 2007/
M2  - 
VL  - 
IS  - 
SP  - 503
EP  - 517
UR  - http://www.springerlink.com/content/3131t21p634191n2/
M3  - 
KW  - ol_web2.0
KW  - ontology_learning
KW  - semantics
KW  - wiki
KW  - data_wikis
L1  - 
SN  - 
N1  - 
N1  - 
AB  - Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.
ER  -

TY  - CONF
AU  - Ponzetto, Simone Paolo
AU  - Strube, Michael
A2  - 
T1  - Deriving a Large-Scale Taxonomy from Wikipedia.
T2  - AAAI
PB  - AAAI Press
CY  - 
PY  - 2007/
M2  - 
VL  - 
IS  - 
SP  - 1440
EP  - 1445
UR  - http://dblp.uni-trier.de/db/conf/aaai/aaai2007.html#PonzettoS07
M3  - 
KW  - download
KW  - ol_web2.0
KW  - online
KW  - ontology
KW  - taxonomy
KW  - wikipedia
KW  - methods_concepthierarchy
KW  - data_wikis
L1  - 
SN  - 978-1-57735-323-2
N1  - dblp
N1  - 
AB  - We take the category system in Wikipedia as a conceptual network. We label the semantic relations between categories using methods based on connectivity in the network and lexicosyntactic matching. As a result we are able to derive a large scale taxonomy containing a large amount of subsumption, i.e. isa, relations. We evaluate the quality of the created resource by comparing it with ResearchCyc, one of the largest manually annotated ontologies, as well as computing semantic similarity between words in benchmarking datasets.
ER  -

TY  - CONF
AU  - Strube, Michael
AU  - Ponzetto, Simone Paolo
A2  - 
T1  - WikiRelate! Computing Semantic Relatedness Using Wikipedia.
T2  - AAAI
PB  - AAAI Press
CY  - 
PY  - 2006/
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://www.dit.unitn.it/~p2p/RelatedWork/Matching/aaai06.pdf
M3  - 
KW  - ol_web2.0
KW  - semantic_relatedness
KW  - wikipedia
KW  - wikirelate
KW  - data_wikis
L1  - 
SN  - 
N1  - dblp
N1  - 
AB  - Wikipedia provides a knowledge base for computing word relatedness in a more structured fashion than a search engine and with more coverage than WordNet. In this work we present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet when applied to the largest available dataset designed for that purpose. The best results on this dataset are obtained by integrating Google, WordNet and Wikipedia based measures. We also show that including Wikipedia improves the performance of an NLP application processing naturally occurring texts.
ER  -

TY  - CHAP
AU  - Ruiz-Casado, Maria
AU  - Alfonseca, Enrique
AU  - Castells, Pablo
A2  - Montoyo, Andrés
A2  - Muñoz, Rafael
A2  - Métais, Elisabeth
T1  - Automatic Extraction of Semantic Relationships for WordNet by Means of Pattern Learning from Wikipedia
T2  - Natural Language Processing and Information Systems
PB  - Springer
CY  - Berlin / Heidelberg
PY  - 2005/
VL  - 3513
IS  - 
SP  - 233
EP  - 242
UR  - http://dx.doi.org/10.1007/11428817_7
M3  - 10.1007/11428817_7
KW  - ol_web2.0
KW  - patterns
KW  - wikipedia
KW  - wordnet
KW  - data_wikis
KW  - methods_relations
L1  - 
SN  - 
N1  - 
AB  - This paper describes an automatic approach to identify lexical patterns which represent semantic relationships between concepts, from an on-line encyclopedia. Next, these patterns can be applied to extend existing ontologies or semantic networks with new relations. The experiments have been performed with the Simple English Wikipedia and WordNet 1.7. A new algorithm has been devised for automatically generalising the lexical patterns found in the encyclopedia entries. We have found general patterns for the hyperonymy, hyponymy, holonymy and meronymy relations and, using them, we have extracted more than 1200 new relationships that did not appear in WordNet originally. The precision of these relationships ranges between 0.61 and 0.69, depending on the relation.
ER  -