TY  - CONF
AU  - Pereira Nunes, Bernardo
AU  - Kawase, Ricardo
AU  - Dietze, Stefan
AU  - Taibi, Davide
AU  - Casanova, Marco Antonio
AU  - Nejdl, Wolfgang
A2  - Rizzo, Giuseppe
A2  - Mendes, Pablo
A2  - Charton, Eric
A2  - Hellmann, Sebastian
A2  - Kalyanpur, Aditya
T1  - Can Entities be Friends?
T2  - Proceedings of the Web of Linked Entities Workshop in conjunction with the 11th International Semantic Web Conference
PB  - 
CY  - 
PY  - 2012/11
M2  - 
VL  - 906
IS  - 
SP  - 45
EP  - 57
UR  - http://ceur-ws.org/Vol-906/paper6.pdf
M3  - 
KW  - data
KW  - detection
KW  - entity
KW  - graph
KW  - linked
KW  - relation
KW  - web
L1  - 
SN  - 
N1  - 
N1  - 
AB  - The richness of the (Semantic) Web lies in its ability to link related resources as well as data across the Web. However, while relations within particular datasets are often well defined, links between disparate datasets and corpora of Web resources are rare. The increasingly widespread use of cross-domain reference datasets, such as Freebase and DBpedia for annotating and enriching datasets as well as document corpora, opens up opportunities to exploit their inherent semantics to uncover semantic relationships between disparate resources. In this paper, we present an approach to uncover relationships between disparate entities by analyzing the graphs of used reference datasets. We adapt a relationship assessment methodology from social network theory to measure the connectivity between entities in reference datasets and exploit these measures to identify correlated Web resources. Finally, we present an evaluation of our approach using the publicly available datasets Bibsonomy and USAToday. 
ER  -

TY  - JOUR
AU  - Borges, Eduardo N.
AU  - Becker, Karin
AU  - Heuser, Carlos A.
AU  - Galante, Renata
T1  - A Classification-based Approach for Bibliographic Metadata Deduplication
JO  - Proceedings of the IADIS International Conference WWW/Internet 2011 
PY  - 2011/
VL  - 
IS  - 
SP  - 221
EP  - 228
UR  - http://www.eduardo.c3.furg.br/arquivos/download/www-internet2011.pdf
M3  - 
KW  - bibliographic
KW  - classification
KW  - detection
KW  - duplicate
KW  - metadata
L1  - 
SN  - 
N1  - 
N1  - 
AB  - Digital libraries of scientific articles describe them using a set of metadata, including bibliographic references. These references can be represented by several formats and styles. Considerable content variations can occur in some metadata fields such as title, author names and publication venue. Besides, it is quite common to find references that omit some metadata fields such as page numbers. Duplicate entries influence the quality of digital library services, since they need to be appropriately identified and treated. This paper presents a comparative analysis among different data classification algorithms used to identify duplicated bibliographic metadata records. We have investigated the discovered patterns by comparing the rules and the decision tree with the heuristics adopted in a previous work. Our experiments show that the combination of specific-purpose similarity functions previously proposed and classification algorithms represents an improvement of up to 12% when compared to the experiments using our original approach.
ER  -

TY  - CONF
AU  - Mitzlaff, Folke
AU  - Benz, Dominik
AU  - Stumme, Gerd
AU  - Hotho, Andreas
A2  - 
T1  - Visit me, click me, be my friend: an analysis of evidence networks of user relationships in BibSonomy
T2  - HT '10: Proceedings of the 21st ACM Conference on Hypertext and Hypermedia
PB  - ACM
CY  - New York, NY, USA
PY  - 2010/
M2  - 
VL  - 
IS  - 
SP  - 265
EP  - 270
UR  - http://portal.acm.org/citation.cfm?id=1810617.1810664
M3  - 10.1145/1810617.1810664
KW  - bibsonomy
KW  - collaborative
KW  - community
KW  - detection
KW  - evidence
KW  - folksonomy
KW  - network
KW  - tagging
L1  - 
SN  - 978-1-4503-0041-4
N1  - 
N1  - 
AB  - The ongoing spread of online social networking and sharing sites has reshaped the way people interact with each other. Analyzing the relatedness of different users within the resulting large populations of these systems plays an important role for tasks like user recommendation or community detection. Algorithms in these fields typically face the problem that explicit user relationships (like friend lists) are often very sparse. Surprisingly, implicit evidences (like click logs) of user relations have hardly been considered to this end. Based on our long-time experience with running BibSonomy [4], we identify in this paper different evidence networks of user relationships in our system. We broadly classify each network based on whether the links are explicitly established by the users (e.g., friendship or group membership) or accrue implicitly in the running system (e.g., when user u copies an entry of user v). We systematically analyze structural properties of these networks and whether topological closeness (in terms of the length of shortest paths) coincides with semantic similarity between users.
ER  -

TY  - CONF
AU  - Voss, Jakob
AU  - Hotho, Andreas
AU  - Jäschke, Robert
A2  - Kuhlen, Rainer
T1  - Mapping Bibliographic Records with Bibliographic Hash Keys
T2  - Information: Droge, Ware oder Commons?
PB  - Verlag Werner Hülsbusch
CY  - 
PY  - 2009/
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://eprints.rclis.org/15953/
M3  - 
KW  - 2009
KW  - bibkey
KW  - bibliographic
KW  - bibtex
KW  - detection
KW  - duplicate
KW  - hash
KW  - key
KW  - myown
L1  - 
SN  - 
N1  - 
N1  - 
AB  - This poster presents a set of hash keys for bibliographic records called bibkeys. Unlike other methods of duplicate detection, bibkeys can directly be calculated from a set of basic metadata fields (title, authors/editors, year). It is shown how bibkeys are used to map similar bibliographic records in BibSonomy and among distributed library catalogs and other distributed databases.
ER  -

TY  - CONF
AU  - Java, Akshay
AU  - Joshi, Anupam
AU  - Finin, Tim
A2  - 
T1  - Detecting Communities via Simultaneous Clustering of Graphs and Folksonomies
T2  - WebKDD 2008 Workshop on Web Mining and Web Usage Analysis
PB  - 
CY  - 
PY  - 2008/08
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - 
M3  - 
KW  - clustering
KW  - community
KW  - detection
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - JOUR
AU  - Barber, M. J.
T1  - Modularity and community detection in bipartite networks
JO  - Physical Review E
PY  - 2007/
VL  - 76
IS  - 6
SP  - 
EP  - 
UR  - http://arxiv.org/abs/arXiv:0707.1616
M3  - 10.1103/PhysRevE.76.066102
KW  - bipartite
KW  - clustering
KW  - community
KW  - detection
KW  - graph
KW  - modularity
KW  - network
L1  - 
SN  - 
N1  - 
N1  - 
AB  - The modularity of a network quantifies the extent, relative to a null model network, to which vertices cluster into community groups. We define a null model appropriate for bipartite networks, and use it to define a bipartite modularity. The bipartite modularity is presented in terms of a modularity matrix B; some key properties of the eigenspectrum of B are identified and used to describe an algorithm for identifying modules in bipartite networks. The algorithm is based on the idea that the modules in the two parts of the network are dependent, with each part mutually being used to induce the vertices for the other part into the modules. We apply the algorithm to real-world network data, showing that the algorithm successfully identifies the modular structure of bipartite networks.
ER  -

TY  - JOUR
AU  - Guimerà, R.
AU  - Sales-Pardo, M.
AU  - Amaral, L.A.N.
T1  - Module identification in bipartite and directed networks
JO  - Physical review. E, Statistical, nonlinear, and soft matter physics
PY  - 2007/
VL  - 76
IS  - 3 Pt 2
SP  - 
EP  - 
UR  - http://arxiv.org/abs/physics/0701151
M3  - 10.1103/PhysRevE.76.036102
KW  - bipartite
KW  - clustering
KW  - community
KW  - detection
KW  - graph
KW  - modularity
KW  - module
KW  - network
L1  - 
SN  - 
N1  - 
N1  - 
AB  - Modularity is one of the most prominent properties of real-world complex networks. Here, we address the issue of module identification in two important classes of networks: bipartite networks and directed unipartite networks. Nodes in bipartite networks are divided into two non-overlapping sets, and the links must have one end node from each set. Directed unipartite networks only have one type of nodes, but links have an origin and an end. We show that directed unipartite networks can be conveniently represented as bipartite networks for module identification purposes. We report a novel approach especially suited for module detection in bipartite networks, and define a set of random networks that enable us to validate the new approach.
ER  -

TY  - CONF
AU  - Hotho, Andreas
AU  - Jäschke, Robert
AU  - Schmitz, Christoph
AU  - Stumme, Gerd
A2  - Avrithis, Yannis S.
A2  - Kompatsiaris, Yiannis
A2  - Staab, Steffen
A2  - O'Connor, Noel E.
T1  - Trend Detection in Folksonomies
T2  - Proc. First International Conference on Semantics And Digital Media Technology (SAMT) 
PB  - Springer
CY  - Heidelberg
PY  - 2006/12
M2  - 
VL  - 4306
IS  - 
SP  - 56
EP  - 70
UR  - http://www.kde.cs.uni-kassel.de/stumme/papers/2006/hotho2006trend.pdf
M3  - 
KW  - 2006
KW  - detection
KW  - folksonomy
KW  - l3s
KW  - myown
KW  - trend
L1  - 
SN  - 3-540-49335-2
N1  - 
N1  - 
AB  - As the number of resources on the web exceeds by far the number of documents one can track, it becomes increasingly difficult to remain up to date on one's own areas of interest. The problem becomes more severe with the increasing fraction of multimedia data, from which it is difficult to extract some conceptual description of their contents. One way to overcome this problem are social bookmark tools, which are rapidly emerging on the web. In such systems, users are setting up lightweight conceptual structures called folksonomies, and thus overcome the knowledge acquisition bottleneck. As more and more people participate in the effort, the use of a common vocabulary becomes more and more stable. We present an approach for discovering topic-specific trends within folksonomies. It is based on a differential adaptation of the PageRank algorithm to the triadic hypergraph structure of a folksonomy. The approach allows for any kind of data, as it does not rely on the internal structure of the documents. In particular, this allows different data types to be considered in the same analysis step. We run experiments on a large-scale real-world snapshot of a social bookmarking system.
ER  -

TY  - CONF
AU  - Jäschke, Robert
AU  - Hotho, Andreas
AU  - Schmitz, Christoph
AU  - Stumme, Gerd
A2  - Braß, Stefan
A2  - Hinneburg, Alexander
T1  - Wege zur Entdeckung von Communities in Folksonomies
T2  - Proc. 18. Workshop Grundlagen von Datenbanken
PB  - Martin-Luther-Universität 
CY  - Halle-Wittenberg
PY  - 2006/06
M2  - 
VL  - 
IS  - 
SP  - 80
EP  - 84
UR  - http://www.kde.cs.uni-kassel.de/jaeschke/pub/jaeschke2006wege_gvd.pdf
M3  - 
KW  - 2006
KW  - community
KW  - detection
KW  - folksonomy
KW  - iccs_example
KW  - l3s
KW  - myown
KW  - trias_example
L1  - 
SN  - 
N1  - 
N1  - 
AB  - Folksonomies are an important building block of the newly discovered World Wide Web, the "Web 2.0". In these systems, users can jointly manage resources and annotate them with tags. The resulting conceptual structures are an interesting field of research. This article examines approaches and ways to discover and structure user groups ("communities") in folksonomies.
ER  -

TY  - CONF
AU  - Aggarwal, Charu C.
AU  - Yu, Philip S.
A2  - 
T1  - Online Analysis of Community Evolution in Data Streams
T2  - SDM
PB  - 
CY  - 
PY  - 2005/
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://web.mit.edu/charu/www/aggar142.pdf 
M3  - 
KW  - data
KW  - detection
KW  - stream
KW  - analysis
KW  - community
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - JOUR
AU  - Duch, J.
AU  - Arenas, A.
T1  - Community detection in complex networks using Extremal Optimization
JO  - Physical Review E
PY  - 2005/
VL  - 72
IS  - 
SP  - 
EP  - 
UR  - http://www.citebase.org/abstract?id=oai:arXiv.org:cond-mat/0501368
M3  - 
KW  - community
KW  - complex
KW  - detection
KW  - network
L1  - 
SN  - 
N1  - 
N1  - 
AB  - We propose a novel method to find the community structure in complex networks based on an extremal optimization of the value of modularity. The method outperforms the optimal modularity found by the existing algorithms in the literature. We present the results of the algorithm for computer simulated and real networks and compare them with other approaches. The efficiency and accuracy of the method make it feasible to be used for the accurate identification of community structure in large complex networks.
ER  -

TY  - THES
AU  - Trier, Matthias
T1  - IT-supported Visualization and Evaluation of Virtual Knowledge Communities. Applying Social Network Intelligence Software in Knowledge Management to enable knowledge oriented People Network Management
PY  - 2005/
PB  - 
SP  - 
EP  - 
UR  - http://nbn-resolving.de/urn/resolver.pl?urn=urn:nbn:de:kobv:83-opus-10720
M3  - 
KW  - social
KW  - detection
KW  - knowledge
KW  - management
KW  - community
KW  - network
L1  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Almeida, Rodrigo B.
AU  - Almeida, Virgilio A. F.
A2  - 
T1  - A community-aware search engine
T2  - Proceedings of the 13th international conference on World Wide Web
PB  - ACM Press
CY  - New York, NY, USA
PY  - 2004/
M2  - 
VL  - 
IS  - 
SP  - 413
EP  - 421
UR  - http://doi.acm.org/10.1145/988672.988728
M3  - 
KW  - search
KW  - engine
KW  - detection
KW  - hits
KW  - community
KW  - network
L1  - 
SN  - 1-58113-844-X
N1  - 
N1  - 
AB  - Current search technologies work in a "one size fits all" fashion. Therefore, the answer to a query is independent of specific user information need. In this paper we describe a novel ranking technique for personalized search services that combines content-based and community-based evidences. The community-based information is used in order to provide context for queries and is influenced by the current interaction of the user with the service. Our algorithm is evaluated using data derived from an actual service available on the Web, an online bookstore. We show that the quality of content-based ranking strategies can be improved by the use of community information as another evidential source of relevance. In our experiments the improvements reach up to 48% in terms of average precision.
ER  -

TY  - GEN
AU  - Radicchi, Filippo
AU  - Castellano, Claudio
AU  - Cecconi, Federico
AU  - Loreto, Vittorio
AU  - Parisi, Domenico
A2  - 
T1  - Defining and identifying communities in networks
JO  - 
PB  - 
AD  - 
PY  - 2004/02
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://arxiv.org/abs/cond-mat/0309488
M3  - 
KW  - graph
KW  - gn
KW  - detection
KW  - network
KW  - community
L1  - 
N1  - 
N1  - 
AB  - The investigation of community structures in networks is an important issue in many domains and disciplines. This problem is relevant for social tasks (objective analysis of relationships on the web), biological inquiries (functional studies in metabolic, cellular or protein networks) or technological problems (optimization of large infrastructures). Several types of algorithm exist for revealing the community structure in networks, but a general and quantitative definition of community is still lacking, leading to an intrinsic difficulty in the interpretation of the results of the algorithms without any additional non-topological information. In this paper we face this problem by introducing two quantitative definitions of community and by showing how they are implemented in practice in the existing algorithms. In this way the algorithms for the identification of the community structure become fully self-contained. Furthermore, we propose a new local algorithm to detect communities which outperforms the existing algorithms with respect to the computational cost, keeping the same level of reliability. The new algorithm is tested on artificial and real-world graphs. In particular we show the application of the new algorithm to a network of scientific collaborations, which, for its size, can not be attacked with the usual methods. This new class of local algorithms could open the way to applications to large-scale technological and biological applications.
ER  -

TY  - GEN
AU  - Almeida, R.B.
AU  - Almeida, V.A.F.
A2  - 
T1  - Design and evaluation of a user-based community discovery technique
JO  - 
PB  - 
AD  - 
PY  - 2003/
VL  - 
IS  - 
SP  - 17
EP  - 23
UR  - citeseer.ist.psu.edu/almeida03design.html
M3  - 
KW  - detection
KW  - hits
KW  - community
KW  - network
L1  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Kubica, Jeremy
AU  - Moore, Andrew
AU  - Schneider, Jeff
A2  - Wu, Xindong
A2  - Tuzhilin, Alex
A2  - Shavlik, Jude
T1  - Tractable Group Detection on Large Link Data Sets
T2  - The Third IEEE International Conference on Data Mining
PB  - IEEE Computer Society
CY  - 
PY  - 2003/11
M2  - 
VL  - 
IS  - 
SP  - 573
EP  - 576
UR  - 
M3  - 
KW  - large
KW  - detection
KW  - community
KW  - network
KW  - gda
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - RPRT
AU  - Kubica, Jeremy Martin
AU  - Moore, Andrew
AU  - Schneider, Jeff
A2  - 
T1  - K-groups: Tractable Group Detection on Large Link Data Sets
PB  - Robotics Institute, Carnegie Mellon University
AD  - Pittsburgh, PA
PY  - 2003/10
VL  - 
IS  - CMU-RI-TR-03-32
SP  - 
EP  - 
UR  - http://www.ri.cmu.edu/pubs/pub_4489.html
M3  - 
KW  - large
KW  - detection
KW  - gda
KW  - network
KW  - community
L1  - 
N1  - 
N1  - 
N1  - 
AB  - Discovering underlying structure from co-occurrence data is an important task in many fields, including: insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. For example, a store may wish to identify underlying sets of items purchased together or a human resources department may wish to identify groups of employees that collaborate with each other. Previously Kubica et al. presented the group detection algorithm (GDA) - an algorithm for finding underlying groupings of entities from co-occurrence data. This algorithm is based on a probabilistic generative model and produces coherent groups that are consistent with prior knowledge. Unfortunately, the optimization used in GDA is slow, making it potentially infeasible for many real world data sets. To this end, we present k-groups - an algorithm that uses an approach similar to that of k-means (hard clustering and localized updates) to significantly accelerate the discovery of the underlying groups while retaining GDA's probabilistic model. In addition, we show that k-groups is guaranteed to converge to a local minimum. We also compare the performance of GDA and k-groups on several real world and artificial data sets, showing that k-groups' sacrifice in solution quality is significantly offset by its increase in speed. This trade-off makes group detection tractable on significantly larger data sets.
ER  -

TY  - CONF
AU  - Kubica, Jeremy
AU  - Moore, Andrew
AU  - Schneider, Jeff
AU  - Yang, Yiming
A2  - 
T1  - Stochastic Link and Group Detection
T2  - Proceedings of the Eighteenth National Conference on Artificial Intelligence
PB  - AAAI Press/MIT Press
CY  - 
PY  - 2002/07
M2  - 
VL  - 
IS  - 
SP  - 798
EP  - 804
UR  - 
M3  - 
KW  - detection
KW  - community
KW  - network
KW  - gda
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Borodin, Allan
AU  - Roberts, Gareth O.
AU  - Rosenthal, Jeffrey S.
AU  - Tsaparas, Panayiotis
A2  - 
T1  - Finding authorities and hubs from link structures on the World Wide Web
T2  - Proceedings of the 10th international conference on World Wide Web
PB  - ACM Press
CY  - New York, NY, USA
PY  - 2001/
M2  - 
VL  - 
IS  - 
SP  - 415
EP  - 429
UR  - http://doi.acm.org/10.1145/371920.372096
M3  - 
KW  - detection
KW  - hits
KW  - community
KW  - network
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - JOUR
AU  - Tejada, Sheila
AU  - Knoblock, Craig A
AU  - Minton, Steven
T1  - Learning object identification rules for information integration
JO  - Information Systems
PY  - 2001/12
VL  - 26
IS  - 8
SP  - 607
EP  - 633
UR  - http://www.sciencedirect.com/science/article/pii/S0306437901000424
M3  - 10.1016/S0306-4379(01)00042-4
KW  - detection
KW  - duplicate
KW  - entity
KW  - extraction
KW  - identification
KW  - information
KW  - integration
L1  - 
SN  - 
N1  - 
N1  - 
AB  - When integrating information from multiple websites, the same data objects can exist in inconsistent text formats across sites, making it difficult to identify matching objects using exact text match. We have developed an object identification system called Active Atlas, which compares the objects’ shared attributes in order to identify matching objects. Certain attributes are more important for deciding if a mapping should exist between two objects. Previous methods of object identification have required manual construction of object identification rules or mapping rules for determining the mappings between objects. This manual process is time consuming and error-prone. In our approach, Active Atlas learns to tailor mapping rules, through limited user input, to a specific application domain. The experimental results demonstrate that we achieve higher accuracy and require less user involvement than previous methods across various application domains.
ER  -