TY  - CONF
AU  - LIU, Tie-Yan
AU  - YANG, Yiming
AU  - WAN, Hao
AU  - ZHOU, Qian
AU  - GAO, Bin
AU  - ZENG, Hua-Jun
AU  - CHEN, Zheng
AU  - MA, Wei-Ying
T1  - An experimental study on large-scale web categorization
T2  - Special interest tracks and posters of the 14th international conference on World Wide Web
PB  - ACM
CY  - New York, NY, USA
PY  - 2005/
SP  - 1106
EP  - 1107
UR  - http://doi.acm.org/10.1145/1062745.1062891
M3  - 10.1145/1062745.1062891
KW  - categorization
KW  - bachelor:2011:bachmann
KW  - web
SN  - 1-59593-051-5
N1  - An experimental study on large-scale web categorization
AB  - Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification technologies can perform well on and scale up to such large-scale applications. To understand this, we conducted the evaluation of several representative methods (Support Vector Machines, k-Nearest Neighbor and Naive Bayes) with Yahoo! taxonomies. In particular, we evaluated the effectiveness/efficiency tradeoff in classifiers with hierarchical setting compared to conventional (flat) setting, and tested popular threshold tuning strategies for their scalability and accuracy in large-scale classification problems.
ER  - 

TY  - CONF
AU  - Kubica, Jeremy
AU  - Moore, Andrew
AU  - Schneider, Jeff
AU  - Yang, Yiming
T1  - Stochastic Link and Group Detection
T2  - Proceedings of the Eighteenth National Conference on Artificial Intelligence
PB  - AAAI Press/MIT Press
PY  - 2002/07
SP  - 798
EP  - 804
KW  - detection
KW  - gda
KW  - community
KW  - network
ER  - 