TY  - JOUR
AU  - Wu, Xindong
AU  - Kumar, Vipin
AU  - Quinlan, J. Ross
AU  - Ghosh, Joydeep
AU  - Yang, Qiang
AU  - Motoda, Hiroshi
AU  - McLachlan, Geoffrey
AU  - Ng, Angus
AU  - Liu, Bing
AU  - Yu, Philip
AU  - Zhou, Zhi-Hua
AU  - Steinbach, Michael
AU  - Hand, David
AU  - Steinberg, Dan
T1  - Top 10 algorithms in data mining
JO  - Knowledge and Information Systems
PY  - 2008/01
VL  - 14
IS  - 1
SP  - 1
EP  - 37
UR  - http://dx.doi.org/10.1007/s10115-007-0114-2
KW  - algorithm
KW  - ieee
KW  - icdm
KW  - top
KW  - data
KW  - mining
AB  - This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.
ER  - 

TY  - CONF
AU  - Steinbach, Michael
AU  - Ertoz, Levent
AU  - Kumar, Vipin
A2  - Wille, L. T.
T1  - Challenges of Clustering High Dimensional Data
T2  - New Vistas in Statistical Physics -- Applications in Econophysics, Bioinformatics, and Pattern Recognition
PB  - Springer-Verlag
PY  - 2003/
KW  - challenges
KW  - dimensional
KW  - clustering
KW  - high
ER  - 

TY  - CONF
AU  - Ertoz, Levent
AU  - Steinbach, Michael
AU  - Kumar, Vipin
T1  - A New Shared Nearest Neighbor Clustering Algorithm and its Applications
T2  - Workshop on Clustering High Dimensional Data and its Applications at 2nd SIAM International Conference on Data Mining
PY  - 2002/
KW  - clustering
ER  - 