Clustering Categorical Data: An Approach Based on Dynamical Systems
D. Gibson, J. Kleinberg, and P. Raghavan.
Pages 311--322 (1998)

We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By "categorical data," we mean tables with fields that cannot be naturally ordered by a metric --- e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the cooccurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems. We discuss experiments on a variety of tables of synthetic and real data; we find that our iterative methods converge quickly to prominently correlated values of various categorical fields.

1 Introduction

Much of the data in databases is categorical: fields in tables whose attributes cannot naturally be ordered as numerical values can. The pro...
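The iterative weight propagation mentioned in the abstract can be illustrated with a rough sketch. This is not the authors' exact method: the function name `propagate_weights`, the uniform starting weights, the additive combining rule, and the per-column Euclidean normalization are all assumptions made here for illustration. Each distinct (column, value) pair is treated as a node, and on every pass a node collects the weights of the values it co-occurs with in each row.

```python
from collections import defaultdict
from math import sqrt


def propagate_weights(rows, iterations=20):
    """Propagate weights over the (column, value) nodes of a categorical table.

    Hypothetical helper for illustration; the combining rule and the
    normalization below are assumptions, not taken from the paper.
    """
    # One node per distinct (column index, value) pair, all starting at 1.0.
    nodes = {(col, value) for row in rows for col, value in enumerate(row)}
    weights = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        updated = defaultdict(float)
        for row in rows:
            row_nodes = list(enumerate(row))
            total = sum(weights[node] for node in row_nodes)
            for node in row_nodes:
                # Assumed combiner: add the weights of the *other* values in this row.
                updated[node] += total - weights[node]

        # Assumed normalization: scale each column to unit Euclidean norm.
        norms = defaultdict(float)
        for (col, _value), weight in updated.items():
            norms[col] += weight * weight
        weights = {
            (col, value): (weight / sqrt(norms[col]) if norms[col] > 0 else 0.0)
            for (col, value), weight in updated.items()
        }
    return weights


if __name__ == "__main__":
    # Toy table in the spirit of the abstract's automobile example.
    table = [
        ("Honda", "Civic"),
        ("Honda", "Accord"),
        ("Toyota", "Camry"),
        ("Honda", "Civic"),
    ]
    for node, weight in sorted(propagate_weights(table).items()):
        print(node, round(weight, 3))
```

With an additive combiner, the update behaves like power iteration on the co-occurrence graph, so values that frequently appear together end up with the largest weights in their columns. Other combining rules would give different, possibly non-linear, dynamics.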

URL

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.8003

@inproceedings{Gibson98clusteringcategorical,
  abstract = {We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By ``categorical data,'' we mean tables with fields that cannot be naturally ordered by a metric --- e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the cooccurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems. We discuss experiments on a variety of tables of synthetic and real data; we find that our iterative methods converge quickly to prominently correlated values of various categorical fields. 1 Introduction  Much of the data in databases is categorical: fields in tables whose attributes cannot naturally be ordered as numerical values can. The pro...},
  added-at = {2010-05-04T08:55:46.000+0200},
  author = {Gibson, David and Kleinberg, Jon and Raghavan, Prabhakar},
  biburl = {https://puma.uni-kassel.de/bibtex/231bcdc070e056e9ba33ba155ebc9285d/folke},
  booktitle = {Proceedings of the 24th International Conference on Very Large Data Bases (VLDB)},
  interhash = {1439dc731dbc3225e455c4cd4ec297b1},
  intrahash = {31bcdc070e056e9ba33ba155ebc9285d},
  keywords = {clustering svd},
  pages = {311--322},
  timestamp = {2010-05-04T08:55:48.000+0200},
  title = {Clustering Categorical Data: An Approach Based on Dynamical Systems},
  url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.8003},
  year = 1998
}

%0 Conference Paper
%1 Gibson98clusteringcategorical
%A Gibson, David
%A Kleinberg, Jon
%A Raghavan, Prabhakar
%D 1998
%K clustering svd
%P 311--322
%T Clustering Categorical Data: An Approach Based on Dynamical Systems
%U http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.8003
%X We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By "categorical data," we mean tables with fields that cannot be naturally ordered by a metric --- e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the cooccurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems. We discuss experiments on a variety of tables of synthetic and real data; we find that our iterative methods converge quickly to prominently correlated values of various categorical fields. 1 Introduction  Much of the data in databases is categorical: fields in tables whose attributes cannot naturally be ordered as numerical values can. The pro...
