TY  - CONF
AU  - Java, Akshay
AU  - Joshi, Anupam
AU  - Finin, Tim
A2  - 
T1  - Approximating the Community Structure of the Long Tail
T2  - Proceedings of the Second International Conference on Weblogs and Social Media (ICWSM 2008)
PB  - AAAI Press
CY  - 
PY  - 2008/
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://ebiquity.umbc.edu/paper/html/id/381/Approximating-the-Community-Structure-of-the-Long-Tail
M3  - 
KW  - clustering
KW  - community
KW  - detection
KW  - spectral
KW  - svd
L1  - 
SN  - 
N1  - 
N1  - 
AB  - In many social media applications, a small fraction of the members are highly linked while most are sparsely connected to the network. Such a skewed distribution is sometimes referred to as the "long tail". Popular applications like meme trackers and content aggregators mine for information from only the popular blogs located at the head of this curve. On the other hand, the long tail contains large volumes of interesting information and niches. The question we address in this work is how best to approximate the community membership of entities in the long tail using only a small percentage of the entire graph structure. Our technique utilizes basic linear algebra manipulations and spectral methods. It has the advantage of quickly and efficiently finding a reasonable approximation of the community structure of the overall network. Such a method has significant applications in blog analysis engines as well as social media monitoring tools in general.
ER  -

TY  - CONF
AU  - Java, Akshay
AU  - Joshi, Anupam
AU  - Finin, Tim
A2  - 
T1  - Approximating the Community Structure of the Long Tail
T2  - Proceedings of the Second International Conference on Weblogs and Social Media (ICWSM 2008)
PB  - AAAI Press
CY  - 
PY  - 2008/
M2  - 
VL  - 
IS  - 
SP  - 
EP  - 
UR  - http://ebiquity.umbc.edu/paper/html/id/381/Approximating-the-Community-Structure-of-the-Long-Tail
M3  - 
KW  - clustering
KW  - community
KW  - detection
KW  - svd
KW  - toread
L1  - 
SN  - 
N1  - Approximating the Community Structure of the Long Tail
N1  - 
AB  - In many social media applications, a small fraction of the members are highly linked while most are sparsely connected to the network. Such a skewed distribution is sometimes referred to as the "long tail". Popular applications like meme trackers and content aggregators mine for information from only the popular blogs located at the head of this curve. On the other hand, the long tail contains large volumes of interesting information and niches. The question we address in this work is how best to approximate the community membership of entities in the long tail using only a small percentage of the entire graph structure. Our technique utilizes basic linear algebra manipulations and spectral methods. It has the advantage of quickly and efficiently finding a reasonable approximation of the community structure of the overall network. Such a method has significant applications in blog analysis engines as well as social media monitoring tools in general.
ER  -

TY  - JOUR
AU  - Drineas, P.
AU  - Frieze, A.
AU  - Kannan, R.
AU  - Vempala, S.
AU  - Vinay, V.
T1  - Clustering large graphs via the singular value decomposition
JO  - Machine Learning
PY  - 2004/
VL  - 56
IS  - 1
SP  - 9
EP  - 33
UR  - http://scholar.google.de/scholar.bib?q=info:gQY9HvWhsJcJ:scholar.google.com/&output=citation&hl=de&ct=citation&cd=0
M3  - 
KW  - clustering
KW  - graph
KW  - svd
KW  - vldb
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Ke, Qifa
AU  - Kanade, Takeo
A2  - 
T1  - Robust Subspace Clustering by Combined Use of kNND Metric and SVD Algorithm
T2  - CVPR (2)
PB  - 
CY  - 
PY  - 2004/
M2  - 
VL  - 
IS  - 
SP  - 592
EP  - 599
UR  - http://dblp.uni-trier.de/db/conf/cvpr/cvpr2004-2.html#KeK04
M3  - 
KW  - clustering
KW  - decomposition
KW  - eigenvalue
KW  - gaussian
KW  - subspace
KW  - svd
L1  - 
SN  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Osinski, Stanislaw
AU  - Stefanowski, Jerzy
AU  - Weiss, Dawid
A2  - 
T1  - Lingo: Search Results Clustering Algorithm Based on Singular Value Decomposition
T2  - Intelligent Information Systems
PB  - 
CY  - 
PY  - 2004/
M2  - 
VL  - 
IS  - 
SP  - 359
EP  - 368
UR  - 
M3  - 
KW  - clustering
KW  - lsi
KW  - svd
KW  - toread
L1  - 
SN  - 
N1  - DBLP Record 'conf/iis/OsinskiSW04'
N1  - 
AB  - 
ER  -

TY  - CONF
AU  - Aggarwal, Charu C.
AU  - Yu, Philip S.
A2  - Chen, Weidong
A2  - Naughton, Jeffrey F.
A2  - Bernstein, Philip A.
T1  - Finding Generalized Projected Clusters In High Dimensional Spaces
T2  - Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16-18, 2000, Dallas, Texas, USA
PB  - ACM
CY  - 
PY  - 2000/
M2  - 
VL  - 
IS  - 
SP  - 70
EP  - 81
UR  - 
M3  - 
KW  - clustering
KW  - svd
KW  - projected
L1  - 
SN  - 1-58113-218-2
N1  - 
N1  - 
AB  - 
ER  -

TY  - UNPB
AU  - Ranade, A.G.
A2  - 
T1  - Some uses of spectral methods
PY  - 2000/
SP  - 
EP  - 
UR  - 
M3  - 
KW  - clustering
KW  - graph
KW  - spectral
KW  - svd
KW  - theory
L1  - 
N1  - 
N1  - 
AB  - 
ER  -

TY  - JOUR
AU  - Kleinberg, Jon M.
T1  - Authoritative sources in a hyperlinked environment
JO  - Journal of the ACM
PY  - 1999/10
VL  - 46
IS  - 5
SP  - 604
EP  - 632
UR  - http://dx.doi.org/10.1145/324133.324140
M3  - 10.1145/324133.324140
KW  - clustering
KW  - svd
L1  - 
SN  - 
N1  - 
N1  - 
AB  - The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics,...
ER  -

TY  - CONF
AU  - Gibson, David
AU  - Kleinberg, Jon
AU  - Raghavan, Prabhakar
A2  - 
T1  - Clustering Categorical Data: An Approach Based on Dynamical Systems
T2  - 
PB  - 
CY  - 
PY  - 1998/
M2  - 
VL  - 
IS  - 
SP  - 311
EP  - 322
UR  - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.8003
M3  - 
KW  - clustering
KW  - svd
L1  - 
SN  - 
N1  - 
N1  - 
AB  - We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By "categorical data," we mean tables with fields that cannot be naturally ordered by a metric --- e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the cooccurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems. We discuss experiments on a variety of tables of synthetic and real data; we find that our iterative methods converge quickly to prominently correlated values of various categorical fields. 1 Introduction  Much of the data in databases is categorical: fields in tables whose attributes cannot naturally be ordered as numerical values can. The pro...
ER  -

TY  - JOUR
AU  - Boley, Daniel
T1  - Principal Direction Divisive Partitioning
JO  - Data Mining and Knowledge Discovery
PY  - 1998/
VL  - 2
IS  - 4
SP  - 325
EP  - 344
UR  - 
M3  - 
KW  - clustering
KW  - community
KW  - detection
KW  - divisive
KW  - svd
L1  - 
SN  - 
N1  - 
N1  - 
AB  - We propose a new algorithm capable of partitioning a set of documents or other samples based on an embedding in a high dimensional Euclidean space (i.e. in which every document is a vector of real numbers). The method is unusual in that it is divisive, as opposed to agglomerative, and operates by repeatedly splitting clusters into smaller clusters. The splits are not based on any distance or similarity measure. The documents are assembled into a matrix which is very sparse. It is this sparsity that permits the algorithm to be very efficient. The performance of the method is illustrated with a set of text documents obtained from the World Wide Web. Some possible extensions are proposed for further investigation.
ER  -

TY  - RPRT
AU  - Berry, Michael W.
AU  - Dumais, Susan T.
AU  - O'Brien, Gavin W.
A2  - 
T1  - Using Linear Algebra for Intelligent Information Retrieval
PB  - Computer Science Department, University of Tennessee, Knoxville
AD  - 
PY  - 1994/
VL  - 
IS  - UT-CS-94-270
SP  - 
EP  - 
UR  - http://citeseer.ist.psu.edu/berry95using.html
M3  - 
KW  - clustering
KW  - lsi
KW  - svd
L1  - 
N1  - 
N1  - 
N1  - 
AB  - 
ER  -