K-groups: Tractable Group Detection on Large Link Data Sets
J. Kubica, A. Moore, and J. Schneider.
CMU-RI-TR-03-32. Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, (September 2003)

Discovering underlying structure from co-occurrence data is an important task in many fields, including insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. For example, a store may wish to identify underlying sets of items purchased together, or a human resources department may wish to identify groups of employees that collaborate with each other.

Previously, Kubica et al. presented the group detection algorithm (GDA), an algorithm for finding underlying groupings of entities from co-occurrence data. This algorithm is based on a probabilistic generative model and produces coherent groups that are consistent with prior knowledge. Unfortunately, the optimization used in GDA is slow, making it potentially infeasible for many real-world data sets.

To this end, we present k-groups, an algorithm that uses an approach similar to that of k-means (hard clustering and localized updates) to significantly accelerate the discovery of the underlying groups while retaining GDA's probabilistic model. In addition, we show that k-groups is guaranteed to converge to a local minimum. We also compare the performance of GDA and k-groups on several real-world and artificial data sets, showing that k-groups' sacrifice in solution quality is significantly offset by its increase in speed. This trade-off makes group detection tractable on significantly larger data sets.
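The abstract does not give the k-groups update rule or GDA's probabilistic model, so the following is only a rough sketch of the general idea it names: hard assignment of each entity to exactly one group, refined by localized single-entity moves. Everything here is an assumption made for illustration, including the function names (`cooccurring_pairs`, `cost`, `hard_local_search`) and the stand-in cost, which simply counts pairwise disagreements (linked pairs split across groups plus unlinked pairs placed together) rather than a likelihood. Because a move is accepted only when it strictly lowers a non-negative integer cost, the loop must stop at a local minimum of this stand-in cost.

```python
from itertools import combinations


def cooccurring_pairs(links):
    """Return the set of entity pairs (a, b), a < b, that share at least one link."""
    pairs = set()
    for link in links:
        for a, b in combinations(sorted(link), 2):
            pairs.add((a, b))
    return pairs


def cost(assign, pairs):
    """Stand-in cost (NOT the GDA likelihood): number of pairwise disagreements.

    A pair disagrees when it co-occurs but is split across groups,
    or never co-occurs but sits in the same group.
    """
    total = 0
    for a, b in combinations(range(len(assign)), 2):
        same_group = assign[a] == assign[b]
        linked = (a, b) in pairs
        if same_group != linked:
            total += 1
    return total


def hard_local_search(links, n_entities, k, init=None):
    """Hard clustering refined by localized, single-entity moves."""
    pairs = cooccurring_pairs(links)
    # Hypothetical default initialization: round-robin over the k groups.
    assign = list(init) if init is not None else [i % k for i in range(n_entities)]
    best = cost(assign, pairs)
    improved = True
    while improved:
        improved = False
        for e in range(n_entities):
            original = assign[e]
            best_group = original
            for g in range(k):
                if g == original:
                    continue
                assign[e] = g  # tentatively move entity e to group g
                c = cost(assign, pairs)
                if c < best:  # accept only strict improvements
                    best, best_group = c, g
            assign[e] = best_group
            if best_group != original:
                improved = True
    return assign, best


if __name__ == "__main__":
    # Toy link data: each link is a set of entity ids that co-occur.
    links = [{0, 1, 2}, {0, 1}, {3, 4}, {3, 4, 5}, {4, 5}]
    groups, final_cost = hard_local_search(links, n_entities=6, k=2)
    print(groups, final_cost)
```

This sketch recomputes the full cost for every candidate move, which is quadratic in the number of entities and chosen only for clarity; it makes no attempt to reproduce the speed-ups the report describes.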

Document

http://www.ri.cmu.edu/pubs/pub_4489.html


@techreport{Kubica_2003_4489,
  abstract = {Discovering underlying structure from co-occurrence data is an important task in many fields, including insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. For example, a store may wish to identify underlying sets of items purchased together, or a human resources department may wish to identify groups of employees that collaborate with each other.

Previously, Kubica et al. presented the group detection algorithm (GDA), an algorithm for finding underlying groupings of entities from co-occurrence data. This algorithm is based on a probabilistic generative model and produces coherent groups that are consistent with prior knowledge. Unfortunately, the optimization used in GDA is slow, making it potentially infeasible for many real-world data sets.

To this end, we present k-groups, an algorithm that uses an approach similar to that of k-means (hard clustering and localized updates) to significantly accelerate the discovery of the underlying groups while retaining GDA's probabilistic model. In addition, we show that k-groups is guaranteed to converge to a local minimum. We also compare the performance of GDA and k-groups on several real-world and artificial data sets, showing that k-groups' sacrifice in solution quality is significantly offset by its increase in speed. This trade-off makes group detection tractable on significantly larger data sets.},
  added-at = {2006-05-16T08:37:04.000+0200},
  address = {Pittsburgh, PA},
  author = {Kubica, Jeremy Martin and Moore, Andrew and Schneider, Jeff},
  biburl = {https://puma.uni-kassel.de/bibtex/23a4df0e814c3a1b125e3d403abe48733/jaeschke},
  institution = {Robotics Institute, Carnegie Mellon University},
  interhash = {cecbc69533ab6d63fd478c7a9c7651a1},
  intrahash = {3a4df0e814c3a1b125e3d403abe48733},
  keywords = {large detection gda network community},
  month = {September},
  number = {CMU-RI-TR-03-32},
  timestamp = {2011-07-28T16:18:55.000+0200},
  title = {K-groups: Tractable Group Detection on Large Link Data Sets},
  url = {http://www.ri.cmu.edu/pubs/pub_4489.html},
  year = 2003
}

%0 Report
%1 Kubica_2003_4489
%A Kubica, Jeremy Martin
%A Moore, Andrew
%A Schneider, Jeff
%C Pittsburgh, PA
%D 2003
%K large detection gda network community
%N CMU-RI-TR-03-32
%T K-groups: Tractable Group Detection on Large Link Data Sets
%U http://www.ri.cmu.edu/pubs/pub_4489.html
%X Discovering underlying structure from co-occurrence data is an important task in many fields, including insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. For example, a store may wish to identify underlying sets of items purchased together, or a human resources department may wish to identify groups of employees that collaborate with each other.

Previously, Kubica et al. presented the group detection algorithm (GDA), an algorithm for finding underlying groupings of entities from co-occurrence data. This algorithm is based on a probabilistic generative model and produces coherent groups that are consistent with prior knowledge. Unfortunately, the optimization used in GDA is slow, making it potentially infeasible for many real-world data sets.

To this end, we present k-groups, an algorithm that uses an approach similar to that of k-means (hard clustering and localized updates) to significantly accelerate the discovery of the underlying groups while retaining GDA's probabilistic model. In addition, we show that k-groups is guaranteed to converge to a local minimum. We also compare the performance of GDA and k-groups on several real-world and artificial data sets, showing that k-groups' sacrifice in solution quality is significantly offset by its increase in speed. This trade-off makes group detection tractable on significantly larger data sets.
