Publications
Detecting Communities via Simultaneous Clustering of Graphs and Folksonomies
Java, A.; Joshi, A. & Finin, T.
'WebKDD 2008 Workshop on Web Mining and Web Usage Analysis' (2008)
Tag recommendations based on tensor dimensionality reduction
Symeonidis, P.; Nanopoulos, A. & Manolopoulos, Y.
'RecSys '08: Proceedings of the 2008 ACM conference on Recommender systems', ACM, New York, NY, USA, [http://doi.acm.org/10.1145/1454008.1454017], 43-50 (2008) [pdf]
Evolutionary spectral clustering by incorporating temporal smoothness.
Chi, Y.; Song, X.; Zhou, D.; Hino, K. & Tseng, B. L.
Berkhin, P.; Caruana, R. & Wu, X., ed., 'KDD', ACM, 153-162 (2007) [pdf]
Community detection in large-scale social networks
Du, N.; Wu, B.; Pei, X.; Wang, B. & Xu, L.
'WebKDD/SNA-KDD '07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis', ACM, New York, NY, USA, [http://doi.acm.org/10.1145/1348549.1348552], 16-25 (2007) [pdf]
In recent years the WWW has become a flourishing social medium that enables individuals to easily share opinions, experiences and expertise at the push of a single button. With the pervasive use of instant messaging systems and the fundamental shift in the ease of publishing content, social network researchers and graph theory researchers are now concerned with inferring community structures by analyzing the linkage patterns among individuals and web pages. Although the investigation of community structures has motivated many diverse algorithms, most of them are unsuitable for large-scale social networks because of their computational cost. Moreover, in addition to identifying the possible community structures, how to define and explain the discovered communities is also significant in many practical scenarios.
In this paper, we present the algorithm ComTector (Community DeTector), which is more efficient for community detection in large-scale social networks and is based on the nature of overlapping communities in the real world. This algorithm does not require any a priori knowledge about the number or the original division of the communities. Because real networks are often large sparse graphs, its running time is O(C × Tri^2) in the worst case, where C is the number of detected communities and Tri is the number of triangles in the given network. We then propose a general naming method that combines topological information with entity attributes to define the discovered communities. With respect to practical applications, ComTector is tested on several real-life networks, including the Zachary Karate Club, American College Football, Scientific Collaboration, and Telecommunications Call networks. Experimental results show that the algorithm can extract meaningful communities that agree with both objective facts and our intuitions.
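To make the quantity Tri in the running-time bound concrete, the following minimal Python sketch counts the triangles of an undirected graph given as an edge list. It is not ComTector itself, whose steps the abstract does not spell out; the function name `count_triangles` and the toy graph are illustrative assumptions.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph (illustrative helper).

    Assumes node labels are mutually comparable (e.g. all ints).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Count each triangle once by enforcing u < v < w.
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


# Hypothetical toy graph: two triangles joined by one bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(count_triangles(toy_edges))  # prints 2
```

In a sparse graph the neighbour sets are small, so Tri tends to stay far below the cubic number of node triples, which is why a bound expressed in triangles can be practical.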
Conceptual Clustering of Social Bookmark Sites
Grahl, M.; Hotho, A. & Stumme, G.
Hinneburg, A., ed., 'Workshop Proceedings of Lernen - Wissensentdeckung - Adaptivität (LWA 2007)', Martin-Luther-Universität Halle-Wittenberg, 50-54 (2007) [pdf]
Automated Tag Clustering: Improving search and exploration in the tag space
Begelman, G.; Keller, P. & Smadja, F.
'Proceedings of the WWW 2006 Workshop on Collaborative Web Tagging Workshop', Edinburgh (2006) [pdf]
Finding community structure in networks using the eigenvectors of matrices
Newman, M.
Physical Review E, 74(3) 036104 (2006)
Inducing Ontology from Flickr Tags
Schmitz, P.
'Proceedings of the Workshop on Collaborative Tagging at WWW2006', Edinburgh, Scotland (2006) [pdf]
In this paper, we describe some promising initial results in inducing ontology from the Flickr tag vocabulary, using a subsumption-based model. We describe the utility of faceted ontology as a supplement to a tagging system and present our model and results. We propose a revised, probabilistic model using seed ontologies to induce faceted ontology, and describe how the model can integrate into the logistics of tagging communities.
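The abstract does not state the subsumption rule itself. As a rough illustration only, the Python sketch below assumes the common co-occurrence form: tag x subsumes tag y when P(x | y) is at least some threshold while P(y | x) stays below it. The function name `induce_subsumptions`, the threshold value, and the sample tag sets are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def induce_subsumptions(tagged_items, threshold=0.8):
    """Return (broader, narrower) tag pairs under an assumed subsumption rule."""
    tag_count = Counter()
    pair_count = Counter()
    for tags in tagged_items:
        unique = set(tags)
        tag_count.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair_count[(a, b)] += 1

    pairs = []
    for (a, b), both in pair_count.items():
        p_a_given_b = both / tag_count[b]  # how often a appears when b does
        p_b_given_a = both / tag_count[a]  # how often b appears when a does
        if p_a_given_b >= threshold and p_b_given_a < threshold:
            pairs.append((a, b))  # a is the broader tag
        elif p_b_given_a >= threshold and p_a_given_b < threshold:
            pairs.append((b, a))  # b is the broader tag
    return sorted(pairs)


# Hypothetical tag sets, one per photo.
photos = [
    {"sanfrancisco", "goldengate"},
    {"sanfrancisco", "goldengate"},
    {"sanfrancisco", "alcatraz"},
    {"sanfrancisco", "alcatraz"},
    {"sanfrancisco"},
]
print(induce_subsumptions(photos))
```

With this sample data the broad tag co-occurs with every use of the two narrow tags, so both narrow tags are placed beneath it. A real system would likely also require minimum usage counts before trusting a pair; that filter is omitted here.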
Comparing community structure identification
Danon, L.; Diaz-Guilera, A.; Duch, J. & Arenas, A.
Journal of Statistical Mechanics: Theory and Experiment, 9 P09008 (2005)
Finding and evaluating community structure in networks
Newman, M. E. & Girvan, M.
Phys Rev E Stat Nonlin Soft Matter Phys, 69(2) 026113.1-15 (2004) [pdf]
We propose and study a set of algorithms for discovering community structure in networks, that is, natural divisions of network nodes into densely connected subgroups. Our algorithms all share two definitive features: first, they involve iterative removal of edges from the network to split it into communities, the edges removed being identified using any one of a number of possible "betweenness" measures, and second, these measures are, crucially, recalculated after each removal. We also propose a measure for the strength of the community structure found by our algorithms, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our algorithms are highly effective at discovering community structure in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex structure of networked systems.
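The procedure described in this abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: it assumes shortest-path edge betweenness (one of the several measures the paper allows), an unweighted undirected graph with no self-loops or duplicate edges and at least one edge, and it uses modularity as the strength measure. All function names and the toy graph are illustrative.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness (each node pair is counted twice)."""
    bet = defaultdict(float)
    for s in adj:
        dist, sigma, preds, order = {s: 0}, defaultdict(float), defaultdict(list), []
        sigma[s] = 1.0
        queue = deque([s])
        while queue:  # BFS counting shortest paths from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):  # accumulate path credit back toward s
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += credit
                delta[v] += credit
    return bet


def components(adj):
    """Connected components of the current graph."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v] - seen:
                seen.add(w)
                stack.append(w)
        comps.append(comp)
    return comps


def modularity(orig, comps):
    """Modularity of a partition, always measured on the original graph."""
    m = sum(len(nb) for nb in orig.values()) / 2.0
    q = 0.0
    for comp in comps:
        inside = sum(1 for v in comp for w in orig[v] if w in comp) / 2.0
        degree = sum(len(orig[v]) for v in comp)
        q += inside / m - (degree / (2.0 * m)) ** 2
    return q


def girvan_newman(edges):
    """Remove the highest-betweenness edge repeatedly; keep the best split."""
    orig = defaultdict(set)
    for u, v in edges:
        orig[u].add(v)
        orig[v].add(u)
    adj = {v: set(nb) for v, nb in orig.items()}
    best = components(adj)
    best_q = modularity(orig, best)
    while any(adj.values()):
        bet = edge_betweenness(adj)  # recalculated after every removal
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        q = modularity(orig, comps)
        if q > best_q:
            best, best_q = comps, q
    return best, best_q


# Hypothetical toy graph: two triangles joined by one bridge edge.
toy = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities, q = girvan_newman(toy)
print([sorted(c) for c in communities], round(q, 3))
# prints [[0, 1, 2], [3, 4, 5]] 0.357
```

Recomputing betweenness on every iteration is the costly step, which is why this approach suits small and medium graphs rather than very large ones.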
Detecting community structure in networks
Newman, M.
The European Physical Journal B-Condensed Matter, 38(2) 321-330 (2004)
Identification of clusters in the Web graph based on link topology
Huang, X. & Lai, W.
Database Engineering and Applications Symposium, 2003. Proceedings. Seventh International 123-128 (2003)
Self-organization and identification of Web communities
Flake, G.; Lawrence, S.; Giles, C. & Coetzee, F.
Computer, 35(3) 66-70 (2002)
Co-clustering documents and words using bipartite spectral graph partitioning
Dhillon, I. S.
'KDD '01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining', ACM Press, New York, NY, USA, [10.1145/502512.502550], 269-274 (2001) [pdf]
Constrained K-means Clustering with Background Knowledge.
Wagstaff, K.; Cardie, C.; Rogers, S. & Schrödl, S.
Brodley, C. E. & Danyluk, A. P., ed., 'ICML', Morgan Kaufmann, 577-584 (2001) [pdf]
Efficient identification of Web communities
Flake, G. W.; Lawrence, S. & Giles, C. L.
'KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining', ACM Press, New York, NY, USA, [http://doi.acm.org/10.1145/347090.347121], 150-160 (2000)
Normalized Cuts and Image Segmentation.
Shi, J. & Malik, J.
'CVPR', 731-737 (1997) [pdf]
Using Linear Algebra for Intelligent Information Retrieval
Berry, M.; Dumais, S. & O'Brien, G.
SIAM Review, 37 573-595 (1995)
Spectral K-way ratio-cut partitioning and clustering.
Chan, P. K.; Schlag, M. D. F. & Zien, J. Y.
IEEE Trans. on CAD of Integrated Circuits and Systems, 13(9) 1088-1096 (1994) [pdf]
Algebraic connectivity of graphs
Fiedler, M.
Czechoslovak Mathematical Journal, 23(98) 298-305 (1973)