Journal articles
Graph Neural Networks Designed for Different Graph Types: A Survey.
Transactions on Machine Learning Research, 2023.
Josephine Thomas, Alice Moallemy-Oureh, Silvia Beddar-Wiesing and Clara Holzhüter.
[doi]
[abstract]
[BibTeX]
Graphs are ubiquitous in nature and can therefore serve as models for many practical but also theoretical problems. For this purpose, they can be defined as many different types which suitably reflect the individual contexts of the represented problem. To address cutting-edge problems based on graph data, the research field of Graph Neural Networks (GNNs) has emerged. Despite the field’s youth and the speed at which new models are developed, many recent surveys have been published to keep track of them. Nevertheless, no survey has yet compiled which GNNs can process which graph types. In this survey, we give a detailed overview of existing GNNs and, unlike previous surveys, categorize them according to their ability to handle different graph types and properties. We consider GNNs operating on static and dynamic graphs of different structural constitutions, with or without node or edge attributes. Moreover, we distinguish between GNN models for discrete-time or continuous-time dynamic graphs and group the models according to their architecture. We find that there are still graph types that are not or only rarely covered by existing GNN models. We point out where models are missing and give potential reasons for their absence.
Conference articles
Large-scale factorization of type-constrained multi-relational data.
In:
International Conference on Data Science and Advanced Analytics, DSAA 2014, Shanghai, China, October 30 - November 1, 2014, pages 18-24.
IEEE, 2014.
Denis Krompass, Maximilian Nickel and Volker Tresp.
[doi]
[BibTeX]
An analysis of tag-recommender evaluation procedures.
In:
Proceedings of the 7th ACM conference on Recommender systems, series RecSys '13, pages 343-346.
ACM, New York, NY, USA, 2013.
Stephan Doerfel and Robert Jäschke.
[doi]
[abstract]
[BibTeX]
Since the rise of collaborative tagging systems on the web, the tag recommendation task -- suggesting suitable tags to users of such systems while they add resources to their collection -- has been tackled. However, the (offline) evaluation of tag recommendation algorithms usually suffers from difficulties like the sparseness of the data or the cold start problem for new resources or users. Previous studies therefore often used so-called post-cores (specific subsets of the original datasets) for their experiments. In this paper, we conduct a large-scale experiment in which we analyze different tag recommendation algorithms on different cores of three real-world datasets. We show that a recommender's performance depends on the particular core and explore correlations between performances on different cores.
Journal articles
Internet-Graphen.
Informatik-Spektrum, 36(5):440-448, 2013.
Klaus Heidtmann.
[doi]
[abstract]
[BibTeX]
While the nuclei of the Internet were still small and simply structured networks, both its physical and its logical topologies later grew rapidly. On the one hand, the network of computers as nodes and connecting lines as edges kept growing; on the other hand, more and more applications used this infrastructure at the same time to weave ever larger and more complex virtual networks on top of it, e.g. the WWW or online social networks. At every level of this hierarchy, the respective network topologies can be described by graphs and thus studied mathematically. This yields interesting insights into the structural properties of different graph types, which have a major influence on the performance of the Internet. To this end, characteristic properties and corresponding metrics of various graph types are considered, such as the node degree, the average distance, the variation of edge density in different parts of the network, and topological robustness as resilience against failures and attacks. Reference is made to analytical, simulation-based and numerous empirical studies of the Internet, and pointers are given to simulation programs as well as depictions of Internet graphs available on the Internet.
Deeper Into the Folksonomy Graph: FolkRank Adaptations and Extensions for Improved Tag Recommendations.
cs.IR, 1310.1498, 2013.
Nikolas Landia, Stephan Doerfel, Robert Jäschke, Sarabjot Singh Anand, Andreas Hotho and Nathan Griffiths.
[doi]
[abstract]
[BibTeX]
The information contained in social tagging systems is often modelled as a graph of connections between users, items and tags. Recommendation algorithms such as FolkRank have the potential to leverage complex relationships in the data, corresponding to multiple hops in the graph. We present an in-depth analysis and evaluation of graph models for social tagging data and propose novel adaptations and extensions of FolkRank to improve tag recommendations. We highlight implicit assumptions made by the widely used folksonomy model, and propose an alternative and more accurate graph-representation of the data. Our extensions of FolkRank address the new item problem by incorporating content data into the algorithm, and significantly improve prediction results on unpruned datasets. Our adaptations address issues in the iterative weight spreading calculation that potentially hinder FolkRank's ability to leverage the deep graph as an information source. Moreover, we evaluate the benefit of considering each deeper level of the graph, and present important insights regarding the characteristics of social tagging data in general. Our results suggest that the base assumption made by conventional weight propagation methods, that closeness in the graph always implies a positive relationship, does not hold for the social tagging domain.
Full-Text Citation Analysis: A New Method to Enhance Scholarly Network.
Journal of the American Society for Information Science and Technology, 2012.
Xiaozhong Liu, Jinsong Zhang and Chun Guo.
[doi]
[BibTeX]
Conference articles
Can Entities be Friends?
In: G. Rizzo, P. Mendes, E. Charton, S. Hellmann and A. Kalyanpur, editors,
Proceedings of the Web of Linked Entities Workshop in conjunction with the 11th International Semantic Web Conference, volume 906, series CEUR-WS.org, pages 45-57.
2012.
Bernardo Pereira Nunes, Ricardo Kawase, Stefan Dietze, Davide Taibi, Marco Antonio Casanova and Wolfgang Nejdl.
[doi]
[abstract]
[BibTeX]
The richness of the (Semantic) Web lies in its ability to link related resources as well as data across the Web. However, while relations within particular datasets are often well defined, links between disparate datasets and corpora of Web resources are rare. The increasingly widespread use of cross-domain reference datasets, such as Freebase and DBpedia for annotating and enriching datasets as well as document corpora, opens up opportunities to exploit their inherent semantics to uncover semantic relationships between disparate resources. In this paper, we present an approach to uncover relationships between disparate entities by analyzing the graphs of used reference datasets. We adapt a relationship assessment methodology from social network theory to measure the connectivity between entities in reference datasets and exploit these measures to identify correlated Web resources. Finally, we present an evaluation of our approach using the publicly available datasets Bibsonomy and USAToday.
Journal articles
Fast algorithms for determining (generalized) core groups in social networks.
Advances in Data Analysis and Classification, 5(2):129-145, 2011.
Vladimir Batagelj and Matjaž Zaveršnik.
[doi]
[abstract]
[BibTeX]
The structure of a large network (graph) can often be revealed by partitioning it into smaller and possibly more dense sub-networks that are easier to handle. One such decomposition is based on “k-cores”, proposed in 1983 by Seidman. Together with connectivity components, cores are among the few concepts that provide efficient decompositions of large graphs and networks. In this paper we propose an efficient algorithm for determining the cores decomposition of a given network with complexity $$O(m)$$, where m is the number of lines (edges or arcs). In the second part of the paper the classical concept of k-core is generalized in a way that uses a vertex property function instead of the degree of a vertex. For local monotone vertex property functions the corresponding generalized cores can be determined in $$O(m \cdot \max(\Delta, \log n))$$ time, where n is the number of vertices and Δ is the maximum degree. Finally the proposed algorithms are illustrated by the analysis of a collaboration network in the field of computational geometry.
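As a rough illustration of the cores decomposition mentioned in this abstract, the following is a minimal sketch of degree-bucket peeling for an undirected simple graph. It is written in the spirit of a linear-time approach, not taken from the paper; the function name `core_numbers` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph.

    Hypothetical helper for illustration: `edges` is an iterable of (u, v)
    pairs; self-loops are ignored and isolated vertices are not represented.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_deg = max(degree.values(), default=0)

    # buckets[d] holds the unprocessed vertices whose current degree is d.
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)

    core = {}
    d = 0
    while d <= max_deg:
        if not buckets[d]:
            d += 1
            continue
        # Peel one vertex of minimum remaining degree; that degree is its core number.
        v = buckets[d].pop()
        core[v] = d
        for u in adj[v]:
            # Only neighbours above the current level lose a degree unit,
            # so no vertex ever drops below bucket d.
            if u not in core and degree[u] > d:
                buckets[degree[u]].discard(u)
                degree[u] -= 1
                buckets[degree[u]].add(u)
    return core


# A triangle (1, 2, 3) with a pendant vertex 4 attached to vertex 3.
print(core_numbers([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Each edge is inspected a constant number of times, so with hash-based sets the expected running time is proportional to the number of edges plus vertices.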
Miscellaneous
The Anatomy of the Facebook Social Graph.
2011. cite arxiv:1111.4503. Comment: 17 pages, 9 figures, 1 table.
Johan Ugander, Brian Karrer, Lars Backstrom and Cameron Marlow.
[doi]
[abstract]
[BibTeX]
We study the structure of the social graph of active Facebook users, the largest social network ever analyzed. We compute numerous features of the graph including the number of users and friendships, the degree distribution, path lengths, clustering, and mixing patterns. Our results center around three main observations. First, we characterize the global structure of the graph, determining that the social network is nearly fully connected, with 99.91% of individuals belonging to a single large connected component, and we confirm the "six degrees of separation" phenomenon on a global scale. Second, by studying the average local clustering coefficient and degeneracy of graph neighborhoods, we show that while the Facebook graph as a whole is clearly sparse, the graph neighborhoods of users contain surprisingly dense structure. Third, we characterize the assortativity patterns present in the graph by studying the basic demographic and network properties of users. We observe clear degree assortativity and characterize the extent to which "your friends have more friends than you". Furthermore, we observe a strong effect of age on friendship preferences as well as a globally modular community structure driven by nationality, but we do not find any strong gender homophily. We compare our results with those from smaller social networks and find mostly, but not entirely, agreement on common structural network characteristics.
Journal articles
Index design and query processing for graph conductance search.
The VLDB Journal:1-26, 2010.
Soumen Chakrabarti, Amit Pathak and Manish Gupta.
[doi]
[abstract]
[BibTeX]
Graph conductance queries, also known as personalized PageRank and related to random walks with restarts, were originally proposed to assign a hyperlink-based prestige score to Web pages. More general forms of such queries are also very useful for ranking in entity-relation (ER) graphs used to represent relational, XML and hypertext data. Evaluation of PageRank usually involves a global eigen computation. If the graph is even moderately large, interactive response times may not be possible. Recently, the need for interactive PageRank evaluation has increased. The graph may be fully known only when the query is submitted. Browsing actions of the user may change some inputs to the PageRank computation dynamically. In this paper, we describe a system that analyzes query workloads and the ER graph, invests in limited offline indexing, and exploits those indices to achieve essentially constant-time query processing, even as the graph size scales. Our techniques (data and query statistics collection, index selection and materialization, and query-time index exploitation) have parallels in the extensive relational query optimization literature, but are applied to supporting novel graph data repositories. We report on experiments with five temporal snapshots of the CiteSeer ER graph having 74–702 thousand entity nodes, 0.17–1.16 million word nodes, 0.29–3.26 million edges between entities, and 3.29–32.8 million edges between words and entities. We also used two million actual queries from CiteSeer’s logs. Queries run 3–4 orders of magnitude faster than whole-graph PageRank, the gap growing with graph size. Index size is smaller than a text index. Ranking accuracy is 94–98% with reference to whole-graph PageRank.
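For orientation, the whole-graph baseline that this abstract compares against can be sketched as a plain power iteration for a random walk with restart. This is a generic textbook formulation, not the indexing system described in the paper; the function name `personalized_pagerank`, the adjacency-dict input, the restart probability of 0.15 and the fixed iteration count are all assumptions chosen for illustration.

```python
def personalized_pagerank(out_links, seed, restart=0.15, iterations=100):
    """Approximate random-walk-with-restart scores for a single seed node.

    Hypothetical helper: `out_links` maps each node to a list of successors,
    and `seed` must be one of the nodes of the graph.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    # Start with all probability mass on the seed.
    score = {v: 0.0 for v in nodes}
    score[seed] = 1.0

    for _ in range(iterations):
        nxt = {v: 0.0 for v in nodes}
        # With probability `restart` the walker jumps back to the seed.
        nxt[seed] += restart
        for u in nodes:
            targets = out_links.get(u, [])
            if targets:
                share = (1.0 - restart) * score[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # Dangling node: send its remaining mass back to the seed.
                nxt[seed] += (1.0 - restart) * score[u]
        score = nxt
    return score


graph = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
scores = personalized_pagerank(graph, seed="a")
print(sorted(scores, key=scores.get, reverse=True))
```

Every iteration touches all edges of the graph, which is the cost that makes interactive response times hard on large graphs.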
What's in a crowd? Analysis of face-to-face behavioral networks.
CoRR, abs/1006.1260, 2010.
informal publication
Lorenzo Isella, Juliette Stehlé, Alain Barrat, Ciro Cattuto, Jean-François Pinton and Wouter Van den Broeck.
[doi]
[BibTeX]
Conference articles
Visit me, click me, be my friend: An analysis of evidence networks of user relationships in Bibsonomy.
In:
Proceedings of the 21st ACM conference on Hypertext and hypermedia.
Toronto, Canada, 2010.
(to appear)
Folke Mitzlaff, Dominik Benz, Gerd Stumme and Andreas Hotho.
[doi]
[BibTeX]
RTG: A Recursive Realistic Graph Generator Using Random Typing.
In: W. L. Buntine, M. Grobelnik, D. Mladenic and J. Shawe-Taylor, editors,
ECML/PKDD (1), volume 5781, series Lecture Notes in Computer Science, pages 13-28.
Springer, 2009.
Leman Akoglu and Christos Faloutsos.
[doi]
[BibTeX]
Mining Graph Evolution Rules.
In: W. L. Buntine, M. Grobelnik, D. Mladenic and J. Shawe-Taylor, editors,
ECML/PKDD (1), volume 5781, series Lecture Notes in Computer Science, pages 115-130.
Springer, 2009.
Michele Berlingerio, Francesco Bonchi, Björn Bringmann and Aristides Gionis.
[doi]
[BibTeX]
Journal articles
Binary Decomposition Methods for Multipartite Ranking.
Machine Learning and Knowledge Discovery in Databases:359-374, 2009.
Johannes Fürnkranz, Eyke Hüllermeier and Stijn Vanderlooy.
[doi]
[abstract]
[BibTeX]
Bipartite ranking refers to the problem of learning a ranking function from a training set of positively and negatively labeled examples. Applied to a set of unlabeled instances, a ranking function is expected to establish a total order in which positive instances precede negative ones. The performance of a ranking function is typically measured in terms of the AUC. In this paper, we study the problem of multipartite ranking, an extension of bipartite ranking to the multi-class case. In this regard, we discuss extensions of the AUC metric which are suitable as evaluation criteria for multipartite rankings. Moreover, to learn multipartite ranking functions, we propose methods on the basis of binary decomposition techniques that have previously been used for multi-class and ordinal classification. We compare these methods both analytically and experimentally, not only against each other but also to existing methods applicable to the same problem.
GMap: Drawing Graphs as Maps.
cs.CG, arXiv:0907.2585v1, 2009.
Emden R. Gansner, Yifan Hu and Stephen G. Kobourov.
[doi]
[abstract]
[BibTeX]
Information visualization is essential in making sense out of large data sets. Often, high-dimensional data are visualized as a collection of points in 2-dimensional space through dimensionality reduction techniques. However, these traditional methods often do not capture well the underlying structural information, clustering, and neighborhoods. In this paper, we describe GMap: a practical tool for visualizing relational data with geographic-like maps. We illustrate the effectiveness of this approach with examples from several domains. All the maps referenced in this paper can be found at http://www.research.att.com/~yifanhu/GMap
Miscellaneous
Structure of Heterogeneous Networks.
2009. cite arxiv:0906.2212.
Rumi Ghosh and Kristina Lerman.
[doi]
[abstract]
[BibTeX]
Heterogeneous networks play a key role in the evolution of communities and the decisions individuals make. These networks link different types of entities, for example, people and the events they attend. Network analysis algorithms usually project such networks onto simple graphs composed of entities of a single type. In the process, they conflate relations between entities of different types and lose important structural information. We develop a mathematical framework that can be used to compactly represent and analyze heterogeneous networks that combine multiple entity and link types. We generalize Bonacich centrality, which measures connectivity between nodes by the number of paths between them, to heterogeneous networks and use this measure to study network structure. Specifically, we extend the popular modularity-maximization method for community detection to use this centrality metric. We also rank nodes based on their connectivity to other nodes. One advantage of this centrality metric is that it has a tunable parameter we can use to set the length scale of interactions. Studying how rankings change with this parameter allows us to identify important nodes in the network. We apply the proposed method to analyze the structure of several heterogeneous networks. We show that exploiting additional sources of evidence corresponding to links between, as well as among, different entity types yields new insights into network structure.
Conference articles
Simulated Iterative Classification: A New Learning Procedure for Graph Labeling.
In: W. L. Buntine, M. Grobelnik, D. Mladenic and J. Shawe-Taylor, editors,
ECML/PKDD (2), volume 5782, series Lecture Notes in Computer Science, pages 47-62.
Springer, 2009.
Francis Maes, Stéphane Peters, Ludovic Denoyer and Patrick Gallinari.
[doi]
[abstract]
[BibTeX]
Collective classification refers to the classification of interlinked and relational objects described as nodes in a graph. The Iterative Classification Algorithm (ICA) is a simple, efficient and widely used method to solve this problem. It is representative of a family of methods for which inference proceeds as an iterative process: at each step, nodes of the graph are classified according to the current predicted labels of their neighbors. We show that learning in this class of models suffers from a training bias. We propose a new family of methods, called Simulated ICA, which helps reduce this training bias by simulating inference during learning. Several variants of the method are introduced. They are simple, efficient, and scale well. Experiments performed on a series of 7 datasets show that the proposed methods outperform representative state-of-the-art algorithms while keeping a low complexity.
Modularities for Bipartite Networks.
In:
HT '09: Proceedings of the Twentieth ACM Conference on Hypertext and Hypermedia.
ACM, New York, NY, USA, 2009.
Tsuyoshi Murata.
[abstract]
[BibTeX]
Real-world relations are often represented as bipartite networks, such as paper-author networks and event-attendee networks. Extracting dense subnetworks (communities) from bipartite networks and evaluating their quality are practically important research topics. To evaluate divisions of bipartite networks, Guimera and Barber proposed bipartite modularities. This paper discusses the properties of these bipartite modularities and proposes another bipartite modularity that allows one-to-many correspondence of communities of different vertex types. Preliminary experimental results for the bipartite modularities are also described.
PhD thesis
Eigenvalues and Structures of Graphs.
PhD thesis, University of California, San Diego, 2008.
S.K. Butler.
[BibTeX]
Journal articles
A survey of kernel and spectral methods for clustering.
Pattern recognition, 41(1):176-190, 2008.
M. Filippone, F. Camastra, F. Masulli and S. Rovetta.
[BibTeX]
Average Distance, Diameter, and Clustering in Social Networks with Homophily.
Internet and Network Economics:4-11, 2008.
Matthew Jackson.
[doi]
[abstract]
[BibTeX]
I examine a random network model where nodes are categorized by type and linking probabilities can differ across types. I show that as homophily increases (so that the probability to link to other nodes of the same type increases and the probability of linking to nodes of some other types decreases) the average distance and diameter of the network are unchanged, while the average clustering in the network increases.
Miscellaneous
Extending the definition of modularity to directed graphs with overlapping communities.
2008. cite arxiv:0801.1647. Comment: 22 pages, 11 figures.
V. Nicosia, G. Mangioni, V. Carchiolo and M. Malgeri.
[doi]
[abstract]
[BibTeX]
Complex network topologies present interesting and surprising properties, such as community structures, which can be exploited to optimize communication, to find new efficient and context-aware routing algorithms or simply to understand the dynamics and meaning of relationships among nodes. Complex networks are gaining more and more importance as a reference model and are a powerful interpretation tool for many different kinds of natural, biological and social networks, where directed relationships and contextual belonging of nodes to many different communities are a matter of fact. This paper starts from the definition of the modularity function, given by M. Newman to evaluate the goodness of network community decompositions, and extends it to the more general case of directed graphs with overlapping community structures. Interesting properties of the proposed extension are discussed, a method for finding overlapping communities is proposed and results of its application to benchmark case-studies are reported. We also propose a new dataset which could be used as a reference benchmark for overlapping community structures identification.
Conference articles
Tag recommendations based on tensor dimensionality reduction.
In:
RecSys '08: Proceedings of the 2008 ACM conference on Recommender systems, pages 43-50.
ACM, New York, NY, USA, 2008.
Panagiotis Symeonidis, Alexandros Nanopoulos and Yannis Manolopoulos.
[doi]
[BibTeX]
Generating Graphs with Predefined k-Core Structure.
In:
Proceedings of the European Conference of Complex Systems.
2007.
Michael Baur, Marco Gaertler, Robert Görke, Marcus Krug and Dorothea Wagner.
[doi]
[abstract]
[BibTeX]
The modeling of realistic networks is of great importance for modern complex systems research. Previous procedures typically model the natural growth of networks by means of iteratively adding nodes, geometric positioning information, a definition of link connectivity based on the preference for nearest neighbors or already highly connected nodes, or combine several of these approaches. Our novel model is based on the well-known concept of k-cores, originally introduced in social network analysis. Recent studies exposed the significant k-core structure of several real world systems, e.g. the AS network of the Internet. We present a simple and efficient method for generating networks which strictly adhere to the characteristics of a given k-core structure, called core fingerprint. We showcase our algorithm in a comparative evaluation with two well-known AS network generators.
Book chapters
On Finding Graph Clusterings with Maximum Modularity.
In:
A. Brandstädt, D. Kratsch and H. Müller, editors,
Graph-Theoretic Concepts in Computer Science, pages 121-132.
Springer, Berlin / Heidelberg, 2007.
Ulrik Brandes, Daniel Delling, Marco Gaertler, Robert Görke, Martin Hoefer, Zoran Nikoloski and Dorothea Wagner.
[doi]
[abstract]
[BibTeX]
Modularity is a recently introduced quality measure for graph clusterings. It has immediately received considerable attention in several disciplines, and in particular in the complex systems literature, although its properties are not well understood. We study the problem of finding clusterings with maximum modularity, thus providing theoretical foundations for past and present work based on this measure. More precisely, we prove the conjectured hardness of maximizing modularity both in the general case and with the restriction to cuts, and give an Integer Linear Programming formulation. This is complemented by first insights into the behavior and performance of the commonly applied greedy agglomeration approach.
Journal articles
A tutorial on spectral clustering.
Statistics and Computing, 17(4):395-416, 2007.
Ulrike Luxburg.
[doi]
[BibTeX]
Graph clustering.
Computer Science Review, 1(1):27-64, 2007.
S.E. Schaeffer.
[doi]
[BibTeX]
Spectral Graph Theory and its Applications.
Foundations of Computer Science, 2007. FOCS '07. 48th Annual IEEE Symposium on:29-38, 2007.
D.A. Spielman.
[abstract]
[BibTeX]
Spectral graph theory is the study of the eigenvalues and eigenvectors of matrices associated with graphs. In this tutorial, we will try to provide some intuition as to why these eigenvectors and eigenvalues have combinatorial significance, and will survey some of their applications.
Conference articles
Analysis of the Wikipedia Category Graph for NLP Applications.
In:
Proceedings of the TextGraphs-2 Workshop (NAACL-HLT), pages 1-8.
Association for Computational Linguistics, Rochester, 2007.
Torsten Zesch and Iryna Gurevych.
[doi]
[abstract]
[BibTeX]
In this paper, we discuss two graphs in Wikipedia: (i) the article graph, and (ii) the category graph. We perform a graph-theoretic analysis of the category graph, and show that it is a scale-free, small world graph like other well-known lexical semantic networks. We substantiate our findings by transferring semantic relatedness algorithms defined on WordNet to the Wikipedia category graph. To assess the usefulness of the category graph as an NLP resource, we analyze its coverage and the performance of the transferred semantic relatedness algorithms.
Miscellaneous
Spectral Graph Theory: Applications of Courant Fischer.
2006.
Steve Butler.
[BibTeX]
Spectral Graph Theory: Cheeger constants and discrepancy.
2006.
Steve Butler.
[BibTeX]
Spectral Graph Theory: Three common spectra.
2006.
Steve Butler.
[BibTeX]
Journal articles
Building Emergent Social Networks and Group Profiles by Semantic User Preference Clustering.
2006.
I. Cantador and P. Castells.
[BibTeX]
Comparison of Graph Clustering Approaches.
2006.
G. Frivolt and O. Pok.
[BibTeX]
Conference articles
Information Retrieval in Folksonomies: Search and Ranking.
In: Y. Sure and J. Domingue, editors,
The Semantic Web: Research and Applications, volume 4011, series Lecture Notes in Computer Science, pages 411-426.
Springer, Heidelberg, 2006.
Andreas Hotho, Robert Jäschke, Christoph Schmitz and Gerd Stumme.
[pdf]
[abstract]
[BibTeX]
Social bookmark tools are rapidly emerging on the Web. In such systems users are setting up lightweight conceptual structures called folksonomies. The reason for their immediate success is the fact that no specific skills are needed for participating. At the moment, however, the information retrieval support is limited. We present a formal model and a new search algorithm for folksonomies, called FolkRank, that exploits the structure of the folksonomy. The proposed algorithm is also applied to find communities within the folksonomy and is used to structure search results. All findings are demonstrated on a large scale dataset.
Journal articles
Modularity and community structure in networks.
Proceedings of the National Academy of Sciences, 103(23):8577-8582, 2006.
M. E. J. Newman.
[abstract]
[BibTeX]
Many networks of interest in the sciences, including social networks, computer networks, and metabolic and regulatory networks, are found to divide naturally into communities or modules. The problem of detecting and characterizing this community structure is one of the outstanding issues in the study of networked systems. One highly effective approach is the optimization of the quality function known as “modularity” over the possible divisions of a network. Here I show that the modularity can be expressed in terms of the eigenvectors of a characteristic matrix for the network, which I call the modularity matrix, and that this expression leads to a spectral algorithm for community detection that returns results of demonstrably higher quality than competing methods in shorter running times. I illustrate the method with applications to several published network data sets.
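The modularity matrix described in this abstract can be illustrated with a small sketch. It assumes an undirected, unweighted graph given as a dense adjacency matrix, and it uses the standard definition B_ij = A_ij - k_i k_j / (2m). The function names and the shifted power iteration are illustrative choices made here, not taken from the paper, which may compute the eigenvector differently.

```python
import math


def modularity_matrix(adj):
    """Return B with B[i][j] = A[i][j] - k_i * k_j / (2m)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # sum of degrees equals 2m
    return [[adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
            for i in range(n)]


def modularity(adj, s):
    """Modularity Q = s^T B s / (4m) for a two-way split s with entries +1/-1."""
    n = len(adj)
    B = modularity_matrix(adj)
    two_m = sum(sum(row) for row in adj)
    quad = sum(B[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    return quad / (2 * two_m)


def leading_eigenvector(B, iterations=500):
    """Approximate the eigenvector of the largest eigenvalue of B.

    Assumption of this sketch: power iteration on B + shift * I, where the
    shift (a Gershgorin bound) makes every eigenvalue non-negative.
    """
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

vec = leading_eigenvector(modularity_matrix(adj))
split = [1 if x * vec[0] > 0 else -1 for x in vec]  # group by sign
q = modularity(adj, split)
```

In this toy graph the signs of the leading eigenvector separate the two triangles, which is the division one would expect by inspection.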
Finding community structure in networks using the eigenvectors of matrices.
Physical Review E, 74(3):036104, 2006.
MEJ Newman.
[BibTeX]
Conference articles
Content Aggregation on Knowledge Bases using Graph Clustering.
In: Y. Sure and J. Domingue, editors,
The Semantic Web: Research and Applications, volume 4011, series LNAI, pages 530-544.
Springer, Heidelberg, 2006.
Christoph Schmitz, Andreas Hotho, Robert Jäschke and Gerd Stumme.
[doi]
[abstract]
[BibTeX]
Recently, research projects such as PADLR and SWAP have developed tools like Edutella or Bibster, which are targeted at establishing peer-to-peer knowledge management (P2PKM) systems. In such a system, it is necessary to obtain brief semantic descriptions of peers, so that routing algorithms or matchmaking processes can make decisions about which communities peers should belong to, or to which peers a given query should be forwarded. This paper provides a graph clustering technique on knowledge bases for that purpose. Using this clustering, we can show that our strategy requires up to 58% fewer queries than the baselines to yield full recall in a bibliographic P2PKM scenario.
Technical reports
A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts.
University of Texas Dept. of Computer Science, 2005. Number TR-04-25.
Inderjit S. Dhillon, Yuqiang Guan and Brian Kulis.
[doi]
[abstract]
[BibTeX]
Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning objective include ratio cut, normalized cut and ratio association. Our equivalence has important consequences: the weighted kernel k-means algorithm may be used to directly optimize the graph partitioning objectives, and conversely, spectral methods may be used to optimize the weighted kernel k-means objective. Hence, in cases where eigenvector computation is prohibitive, we eliminate the need for any eigenvector computation for graph partitioning. Moreover, we show that the Kernighan-Lin objective can also be incorporated into our framework, leading to an incremental weighted kernel k-means algorithm for local optimization of the objective. We further discuss the issue of convergence of weighted kernel k-means for an arbitrary graph affinity matrix and provide a number of experimental results. These results show that non-spectral methods for graph partitioning are as effective as spectral methods and can be used for problems such as image segmentation in addition to data clustering.
Journal articles
Generating bicliques of a graph in lexicographic order.
Theoretical Computer Science, 337(1-3):240-248, 2005.
Vânia M.F. Dias, Celina M.H. de Figueiredo and Jayme L. Szwarcfiter.
[doi]
[abstract]
[BibTeX]
An independent set of a graph is a subset of pairwise non-adjacent vertices. A complete bipartite set B is a subset of vertices admitting a bipartition B = X ∪ Y, such that both X and Y are independent sets, and all vertices of X are adjacent to those of Y. If both X, Y ≠ ∅, then B is called proper. A biclique is a maximal proper complete bipartite set of a graph. We present an algorithm that generates all bicliques of a graph in lexicographic order, with polynomial-time delay between the output of two successive bicliques. We also show that there is no polynomial-time delay algorithm for generating all bicliques in reverse lexicographic order, unless P=NP. The methods are based on those by Johnson, Papadimitriou and Yannakakis, in the solution of these two problems for independent sets, instead of bicliques.
Book chapters
Role Assignments.
In:
U. Brandes and T. Erlebach, editors,
Network Analysis, pages 216-252.
Springer, Berlin / Heidelberg, 2005.
Jürgen Lerner.
[doi]
[abstract]
[BibTeX]
9.0. 9.0.1. Preliminaries 9.0.2. Role Graph 9.1. Structural Equivalence 9.1.1. Lattice of Equivalence Relations 9.1.2. Lattice of Structural Equivalences 9.1.3. Computation of Structural Equivalences 9.2. Regular Equivalence 9.2.1. Elementary Properties 9.2.2. Lattice Structure and Regular Interior 9.2.3. Computation of Regular Interior 9.2.4. The Role Assignment Problem 9.2.5. Existence of k-Role Assignments 9.3. Other Equivalences 9.3.1. Exact Role Assignments 9.3.2. Automorphic and Orbit Equivalence 9.3.3. Perfect Equivalence 9.3.4. Relative Regular Equivalence 9.4. Graphs with Multiple Relations 9.5. The Semigroup of a Graph 9.5.1. Winship-Pattison Role Equivalence 9.6. Chapter Notes
Journal articles
A spectral clustering approach to finding communities in graphs.
2005.
S. White and P. Smyth.
[BibTeX]
Characterizing and Mining the Citation Graph of the Computer Science Literature.
Knowl. Inf. Syst., 6:664-678, 2004.
Yuan An, Jeannette Janssen and Evangelos E. Milios.
[doi]
[BibTeX]
The Diameter of a Scale-Free Random Graph.
Combinatorica, 24(1):5-34, 2004.
Béla Bollobás and Oliver Riordan.
[doi]
[abstract]
[BibTeX]
We consider a random graph process in which vertices are added to the graph one at a time and joined to a fixed number m of earlier vertices, where each earlier vertex is chosen with probability proportional to its degree. This process was introduced by Barabási and Albert [3], as a simple model of the growth of real-world graphs such as the world-wide web. Computer experiments presented by Barabási, Albert and Jeong [1,5] and heuristic arguments given by Newman, Strogatz and Watts [23] suggest that after n steps the resulting graph should have diameter approximately log n. We show that while this holds for m=1, for m≥2 the diameter is asymptotically log n/log log n.
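The growth process described in this abstract can be sketched in a few lines. This is a simplified illustration, not the precise model analysed in the paper: it assumes the graph starts from a single edge between vertices 0 and 1, draws the m targets with replacement (so parallel edges can occur), and never creates loops. The function name and parameters are hypothetical.

```python
import random


def preferential_attachment(n, m=1, seed=0):
    """Grow a multigraph on vertices 0..n-1 (n >= 2) by preferential attachment.

    Each new vertex is joined to m earlier vertices, each chosen with
    probability proportional to its current degree.
    """
    rng = random.Random(seed)
    edges = [(0, 1)]      # assumed starting graph: a single edge
    endpoints = [0, 1]    # every vertex appears once per unit of degree
    for new in range(2, n):
        # Pick all targets before adding the new edges, so that only
        # earlier vertices can be chosen.
        targets = [rng.choice(endpoints) for _ in range(m)]
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


edge_list = preferential_attachment(50, m=2, seed=1)
```

Sampling uniformly from the list of edge endpoints is a common trick: a vertex of degree d appears d times in that list, so a uniform draw is a degree-proportional draw.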
Clustering large graphs via the singular value decomposition.
Machine Learning, 56(1):9-33, 2004.
P. Drineas, A. Frieze, R. Kannan, S. Vempala and V. Vinay.
[doi]
[BibTeX]
Graph clustering and minimum cut trees.
Internet Mathematics, 1(4):385-408, 2004.
G.W. Flake, R.E. Tarjan and K. Tsioutsiouliklis.
[doi]
[BibTeX]
Deeper inside pagerank.
Internet Mathematics, 1(3):335-380, 2004.
A.N. Langville and C.D. Meyer.
[doi]
[BibTeX]
Miscellaneous
An O(m) Algorithm for Cores Decomposition of Networks.
2003. arXiv:cs/0310049.
V. Batagelj and M. Zaversnik.
[doi]
[abstract]
[BibTeX]
The structure of large networks can be revealed by partitioning them into smaller parts, which are easier to handle. One such decomposition is based on $k$-cores, proposed in 1983 by Seidman. This paper presents an efficient $O(m)$ algorithm, where $m$ is the number of lines, for determining the cores decomposition of a given network.
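As a rough illustration of what a cores decomposition computes, the sketch below assigns each vertex its core number by repeatedly removing a vertex of minimum remaining degree. It is not the $O(m)$ bin-based algorithm of the paper: for brevity it uses a heap with lazy deletion, which costs O(m log n). The input format (a dict mapping each integer vertex to a set of neighbours) and the function name are assumptions of this sketch.

```python
import heapq


def core_numbers(adj):
    """Return {vertex: core number} for an undirected simple graph."""
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    core = {}
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in core or d != degree[v]:
            continue  # stale heap entry, a fresher one exists or v is done
        k = max(k, d)  # core numbers never decrease during peeling
        core[v] = k
        for u in adj[v]:
            if u not in core:
                degree[u] -= 1
                heapq.heappush(heap, (degree[u], u))
    return core


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cores = core_numbers(graph)
```

Here the pendant vertex lies only in the 1-core, while the triangle forms the 2-core.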
Journal articles
Experiments on graph clustering algorithms.
Lecture Notes in Computer Science:568-579, 2003.
U. Brandes, M. Gaertler and D. Wagner.
[doi]
[BibTeX]
Spectral measures of bipartivity in complex networks.
Phys. Rev. E, 72:046105, 2003.
E. Estrada and J.A. Rodriguez-Velázquez.
[BibTeX]
The second eigenvalue of the Google matrix.
Stanford University Technical Report, http://dbpubs.stanford.edu, 2003.
T.H. Haveliwala and S.D. Kamvar.
[BibTeX]
The structure and function of complex networks.
SIAM Review, 45(2):167-256, 2003.
M. E. J. Newman.
[BibTeX]
A comparison of spectral clustering algorithms.
University of Washington, Tech. Rep. UW-CSE-03-05-01, 2003.
D. Verma and M. Meila.
[BibTeX]
Conference articles
Multiclass Spectral Clustering.
In:
Proc. International Conference on Computer Vision (ICCV 03).
Nice, France, 2003.
Stella X. Yu and Jianbo Shi.
[BibTeX]
Miscellaneous
Graph Separators.
2002.
Guy Blelloch.
[BibTeX]
Conference articles
Visualization of bibliographic networks with a reshaped landscape metaphor.
In:
Proceedings of the symposium on Data Visualisation 2002, series VISSYM '02, pages 159-ff.
Eurographics Association, Aire-la-Ville, Switzerland, 2002.
U. Brandes and T. Willhalm.
[doi]
[abstract]
[BibTeX]
We describe a novel approach to visualize bibliographic networks that facilitates the simultaneous identification of clusters (e.g., topic areas) and prominent entities (e.g., surveys or landmark papers). While employing the landscape metaphor proposed in several earlier works, we introduce new means to determine relevant parameters of the landscape. Moreover, we are able to compute prominent entities, clustering of entities, and the landscape's surface in a surprisingly simple and uniform way. The effectiveness of our network visualizations is illustrated on data from the graph drawing literature.
Journal articles
Markov chain Monte Carlo estimation of exponential random graph models.
Journal of Social Structure, 3(2):1-40, 2002.
T.A.B. Snijders.
[BibTeX]
General formalism for inhomogeneous random graphs.
Phys. Rev. E, 66(6):066121, 2002.
B. Soderberg.
[BibTeX]
Conference articles
Co-clustering documents and words using bipartite spectral graph partitioning.
In:
KDD '01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, pages 269-274.
ACM Press, New York, NY, USA, 2001.
Inderjit S. Dhillon.
[doi]
[BibTeX]
Miscellaneous
On Spectral Bounds for the k-Partitioning of Graphs.
2001.
B. Monien.
[BibTeX]
Journal articles
Random graphs with arbitrary degree distributions and their applications.
Arxiv preprint cond-mat/0007235, 2001.
MEJ Newman, SH Strogatz and DJ Watts.
[BibTeX]
Conference articles
On spectral clustering: Analysis and an algorithm.
In:
Advances in Neural Information Processing Systems 14, pages 849-856.
MIT Press, 2001.
Andrew Y. Ng, Michael I. Jordan and Yair Weiss.
[abstract]
[BibTeX]
Despite many empirical successes of spectral clustering methods (algorithms that cluster points using eigenvectors of matrices derived from the data), there are several unresolved issues. First, there are a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this paper, we present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab. Using tools from matrix perturbation theory, we analyze the algorithm, and give conditions under which it can be expected to do well. We also show surprisingly good experimental results on a number of challenging clustering problems.
Journal articles
A random graph model for massive graphs.
Pages 171-180, 2000.
W. Aiello, F. Chung and L. Lu.
[doi]
[BibTeX]
Cospectral graphs for both the adjacency and normalized Laplacian matrices.
2000.
S. Butler.
[BibTeX]
An open graph visualization system and its applications to software engineering.
Software Practice & Experience, 30(11):1203-1233, 2000.
Emden R. Gansner and Stephen C. North.
[doi]
[abstract]
[BibTeX]
We describe a package of practical tools and libraries for manipulating graphs and their drawings. Our design, which aimed at facilitating the combination of the package components with other tools, includes stream and event interfaces for graph operations, high-quality static and dynamic layout algorithms, and the ability to handle sizable graphs. We conclude with a description of the applications of this package to a variety of software engineering tools.
Miscellaneous
Theory of random graphs.
2000.
Svante Janson, Tomasz Luczak and Andrzej Rucinski.
[doi]
[BibTeX]
Some uses of spectral methods.
2000.
A.G. Ranade.
[BibTeX]
Journal articles
A p* primer: Logit models for social networks.
Social Networks, 21(1):37-66, 1999.
C.J. Anderson, S. Wasserman and B. Crouch.
[BibTeX]
Emergence of scaling in random networks.
Science, 286(5439):509-512, 1999.
A.-L. Barabási and R. Albert.
[doi]
[abstract]
[BibTeX]
Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
Book chapters
Partitioning Approach to Visualization of Large Graphs.
In:
J. Kratochvíl, editor,
Graph Drawing, pages 90-97.
Springer, Berlin / Heidelberg, 1999.
Vladimir Batagelj, Andrej Mrvar and Matjaž Zaveršnik.
[doi]
[abstract]
[BibTeX]
The structure of large graphs can be revealed by partitioning graphs into smaller parts, which are easier to handle. In the paper we propose the use of core decomposition as an efficient approach for partitioning large graphs. On the selected subgraphs, computationally more intensive methods such as clustering and blockmodeling can be used to analyze their internal structure. The approach is illustrated by an analysis of Snyder & Kick’s world trade graph.
Conference articles
The Anatomy of a Large-Scale Hypertextual Web Search Engine.
In:
Computer Networks and ISDN Systems, pages 107-117.
1998.
Sergey Brin and Lawrence Page.
[doi]
[abstract]
[BibTeX]
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://infolab.stanford.edu/~backrub/google.html. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description w...
Journal articles
On the quality of spectral separators.
SIAM Journal on Matrix Analysis and Applications, 19(3):701-719, 1998.
S. Guattery and G.L. Miller.
[BibTeX]
Conference articles
Multilevel k-way Hypergraph Partitioning.
In:
Proceedings of the Design and Automation Conference, pages 343-348.
1998.
George Karypis and Vipin Kumar.
[abstract]
[BibTeX]
In this paper, we present a new multilevel k-way hypergraph partitioning algorithm that substantially outperforms the existing state-of-the-art K-PM/LR algorithm for multi-way partitioning, both for optimizing local as well as global objectives. Experiments on the ISPD98 benchmark suite show that the partitionings produced by our scheme are on the average 15% to 23% better than those produced by the K-PM/LR algorithm, both in terms of the hyperedge cut as well as the (K - 1) metric. Furthermore, our algorithm is significantly faster, requiring 4 to 5 times less time than that required by K-PM/LR. Hypergraph partitioning is an important problem with extensive application to many areas, including VLSI design [10], efficient storage of large databases on disks [14], and data mining [13]. The problem is to partition the vertices of a hypergraph into k roughly equal parts, such that a certain objective function defined over the hyperedges is optimized. A commonly used obje...
Miscellaneous
Spectral Graph Theory.
1997.
F. R. K. Chung.
[BibTeX]
Journal articles
Multilevel hypergraph partitioning: Application in VLSI domain.
Pages 526-529, 1997.
G. Karypis, R. Aggarwal, V. Kumar and S. Shekhar.
[BibTeX]
Some applications of Laplace eigenvalues of graphs.
Graph Symmetry: Algebraic Methods and Applications, 497:227-275, 1997.
B. Mohar.
[BibTeX]
Technical reports
Spectral Partitioning Works: Planar Graphs and Finite Element Meshes.
1996.
Daniel A. Spielman and Shang-Hua Teng.
[BibTeX]
Conference articles
Spectral partitioning: The more eigenvectors, the better.
In:
Proc. ACM/IEEE Design Automation Conf, pages 195-200.
1995.
Charles J. Alpert, Andrew B. Kahng and So-Zen Yao.
[BibTeX]
Miscellaneous
A critical point for random graphs with a given degree sequence.
1995.
M. Molloy and B. Reed.
[doi]
[BibTeX]
Journal articles
Spectra and optimal partitions of weighted graphs.
Discrete Math., 128(1-3):1-20, 1994.
Marianna Bolla and Gábor Tusnády.
[doi]
[BibTeX]
Spectral K-way ratio-cut partitioning and clustering.
IEEE Trans. on CAD of Integrated Circuits and Systems, 13(9):1088-1096, 1994.
Pak K. Chan, Martine D. F. Schlag and Jason Y. Zien.
[doi]
[BibTeX]
New spectral methods for ratio cut partitioning and clustering.
IEEE Trans. on CAD of Integrated Circuits and Systems, 11(9):1074-1085, 1992.
Lars W. Hagen and Andrew B. Kahng.
[doi]
[BibTeX]
The Laplacian spectrum of graphs.
Graph Theory, Combinatorics, and Applications, 2:871-898, 1991.
B. Mohar.
[BibTeX]
Partitioning Sparse Matrices with Eigenvectors of Graphs.
SIAM J. Matrix Anal. Appl., 11(3):430-452, 1990.
A. Pothen, H.D. Simon and K.P. Liou.
[doi]
[BibTeX]
Random sampling and social networks: a survey of various approaches.
Math. Sci. Humaines, 104:19-33, 1988.
O. Frank.
[BibTeX]
On generating all maximal independent sets.
Inf. Process. Lett., 27(3):119-123, 1988.
David S. Johnson and Christos H. Papadimitriou.
[doi]
[BibTeX]
Network structure and minimum degree.
Social Networks, 5(3):269-287, 1983.
Stephen B. Seidman.
[doi]
[abstract]
[BibTeX]
Social network researchers have long sought measures of network cohesion. Density has often been used for this purpose, despite its generally admitted deficiencies. An approach to network cohesion is proposed that is based on minimum degree and which produces a sequence of subgraphs of gradually increasing cohesion. The approach also associates with any network measures of local density which promise to be useful both in characterizing network structures and in comparing networks.
A review of random graphs.
Journal of Graph Theory, 6(4), 1982.
M. Karonski.
[BibTeX]
The diameter of random graphs.
Transactions of the American Mathematical Society:41-52, 1981.
B. Bollobas.
[BibTeX]
A Set of Measures of Centrality Based on Betweenness.
Sociometry, 40(1):35-41, 1977.
Linton C. Freeman.
[doi]
[abstract]
[BibTeX]
A family of new measures of point and graph centrality based on early intuitions of Bavelas (1948) is introduced. These measures define centrality in terms of the degree to which a point falls on the shortest path between others and therefore has a potential for control of communication. They may be used to index centrality in any large or small network of symmetrical relations, whether connected or unconnected.
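The point centrality described in this abstract can be computed as in the following sketch: for every pair of other points, count the fraction of shortest paths between them that pass through a given point, and sum these fractions. The sketch uses Brandes' accumulation scheme, a later and faster technique that is not part of this paper, and it assumes an undirected, unweighted graph given as a dict of neighbour sets. The function name is hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Return {vertex: betweenness} for an undirected, unweighted graph."""
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        paths = {v: 0 for v in adj}
        paths[source] = 1
        dist = {source: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest vertices inwards.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted once from each of its two endpoints.
    return {v: c / 2 for v, c in centrality.items()}


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
star_scores = betweenness(star)
cycle_scores = betweenness(cycle)
```

In the star, the centre lies on the only shortest path between each of the three pairs of leaves. In the 4-cycle, each vertex carries one of the two shortest paths between its two neighbours, so it receives half a unit. Because unreachable pairs simply contribute nothing, the sketch also runs on unconnected graphs.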
A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory.
Czechoslovak Mathematical Journal, 25(100):619-633, 1975.
M. Fiedler.
[BibTeX]
Lower bounds for the partitioning of graphs.
IBM Journal of Research and Development, 17(5):420-425, 1973.
W.E. Donath and A.J. Hoffman.
[BibTeX]