TY  - JOUR
AU  - Doan, AnHai
AU  - Ramakrishnan, Raghu
AU  - Halevy, Alon Y.
T1  - Crowdsourcing systems on the World-Wide Web
JO  - Communications of the ACM
PY  - 2011/04
VL  - 54
IS  - 4
SP  - 86
EP  - 96
UR  - http://doi.acm.org/10.1145/1924421.1924442
DO  - 10.1145/1924421.1924442
KW  - crowdsourcing
KW  - human
KW  - intelligence
KW  - social
KW  - computing
KW  - cirg
KW  - collective
L1  - 
SN  - 
N1  - 
N1  - 
AB  - The practice of crowdsourcing is transforming the Web and giving rise to a new field.
ER  -

TY  - CONF
AU  - Jeffery, Shawn R.
AU  - Franklin, Michael J.
AU  - Halevy, Alon Y.
A2  - 
T1  - Pay-as-you-go user feedback for dataspace systems
T2  - Proceedings of the 2008 ACM SIGMOD international conference on Management of data
PB  - ACM
C1  - New York, NY, USA
PY  - 2008/
CY  -  
VL  - 
IS  - 
SP  - 847
EP  - 860
UR  - http://doi.acm.org/10.1145/1376616.1376701
DO  - 10.1145/1376616.1376701
KW  - matching
KW  - feedback
KW  - entity
KW  - schema
KW  - human
KW  - intelligence
KW  - semantic
KW  - social
KW  - computing
KW  - collective
KW  - linking
KW  - web
L1  - 
SN  - 978-1-60558-102-6
N1  - 
N1  - 
AB  - A primary challenge to large-scale data integration is creating semantic equivalences between elements from different data sources that correspond to the same real-world entity or concept. Dataspaces propose a pay-as-you-go approach: automated mechanisms such as schema matching and reference reconciliation provide initial correspondences, termed candidate matches, and then user feedback is used to incrementally confirm these matches. The key to this approach is to determine in what order to solicit user feedback for confirming candidate matches. In this paper, we develop a decision-theoretic framework for ordering candidate matches for user confirmation using the concept of the value of perfect information (VPI). At the core of this concept is a utility function that quantifies the desirability of a given state; thus, we devise a utility function for dataspaces based on query result quality. We show in practice how to efficiently apply VPI in concert with this utility function to order user confirmations. A detailed experimental evaluation on both real and synthetic datasets shows that the ordering of user feedback produced by this VPI-based approach yields a dataspace with a significantly higher utility than a wide range of other ordering strategies. Finally, we outline the design of Roomba, a system that utilizes this decision-theoretic framework to guide a dataspace in soliciting user feedback in a pay-as-you-go manner.
ER  -

TY  - CONF
AU  - Dhamankar, Robin
AU  - Lee, Yoonkyong
AU  - Doan, AnHai
AU  - Halevy, Alon Y.
AU  - Domingos, Pedro
A2  - Weikum, Gerhard
A2  - König, Arnd Christian
A2  - Deßloch, Stefan
T1  - iMAP: Discovering Complex Mappings between Database Schemas.
T2  - SIGMOD Conference
PB  - ACM
C1  - 
PY  - 2004/
CY  -  
VL  - 
IS  - 
SP  - 383
EP  - 394
UR  - http://www.cs.washington.edu/homes/pedrod/papers/sigmod04.pdf
DO  - 
KW  - semantic
KW  - srl
KW  - mapping
KW  - web
L1  - 
SN  - 1-58113-859-8
N1  - dblp
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Halevy, Alon Y.
AU  - Madhavan, Jayant
A2  - Gottlob, Georg
A2  - Walsh, Toby
T1  - Corpus-Based Knowledge Representation
T2  - IJCAI-03, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, Acapulco, Mexico, August 9-15, 2003
PB  - Morgan Kaufmann
C1  - 
PY  - 2003/
CY  -  
VL  - 
IS  - 
SP  - 1567
EP  - 1572
UR  - 
DO  - 
KW  - representation
KW  - based
KW  - corpus
KW  - knowledge
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Doan, AnHai
AU  - Madhavan, Jayant
AU  - Domingos, Pedro
AU  - Halevy, Alon
A2  - 
T1  - Learning to Map between Ontologies on the Semantic Web
T2  - Proceedings of the Eleventh International World Wide
PB  - 
C1  - Honolulu, Hawaii, USA
PY  - 2002/05
CY  -  
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://www.cs.washington.edu/homes/alon/site/files/glue.pdf
DO  - 
KW  - srl
KW  - mapping
KW  - ontology
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - JOUR
AU  - Doan, AnHai
AU  - Domingos, Pedro
AU  - Halevy, Alon Y.
T1  - Reconciling schemas of disparate data sources: a machine-learning approach
JO  - SIGMOD Rec.
PY  - 2001/
VL  - 30
IS  - 2
SP  - 509
EP  - 520
UR  - http://portal.acm.org/citation.cfm?id=375731&dl=GUIDE&coll=GUIDE&CFID=75153142&CFTOKEN=89522229
DO  - 10.1145/376284.375731
KW  - schema
KW  - ol
KW  - learning
KW  - mapping
KW  - data
KW  - mining
L1  - 
SN  - 
N1  - 
N1  - 
AB  - A data-integration system provides access to a multitude of data sources through a single mediated schema. A key bottleneck in building such systems has been the laborious manual construction of semantic mappings between the source schemas and the mediated schema. We describe LSD, a system that employs and extends current machine-learning techniques to semi-automatically find such mappings. LSD first asks the user to provide the semantic mappings for a small set of data sources, then uses these mappings together with the sources to train a set of learners. Each learner exploits a different type of information either in the source schemas or in their data. Once the learners have been trained, LSD finds semantic mappings for a new data source by applying the learners, then combining their predictions using a meta-learner. To further improve matching accuracy, we extend machine learning techniques so that LSD can incorporate domain constraints as an additional source of knowledge, and develop a novel learner that utilizes the structural information in XML documents. Our approach thus is distinguished in that it incorporates multiple types of knowledge. Importantly, its architecture is extensible to additional learners that may exploit new kinds of information. We describe a set of experiments on several real-world domains, and show that LSD proposes semantic mappings with a high degree of accuracy.
ER  -