TY - GEN AU - Kohavi, Ron A2 - T1 - Online Controlled Experiments: Introduction, Learnings, and Humbling Statistics JO - PB - C1 - PY - 2012/10 VL - IS - SP - EP - UR - http://www.exp-platform.com/Pages/2012RecSys.aspx DO - KW - 2012 KW - amazon KW - bing KW - evaluation KW - experiment KW - industry KW - keynote KW - online KW - recommender KW - recsys KW - statistics L1 - N1 - N1 - AB - The web provides an unprecedented opportunity to accelerate innovation by evaluating ideas quickly and accurately using controlled experiments (e.g., A/B tests and their generalizations). Whether for front-end user-interface changes, or backend recommendation systems and relevance algorithms, online controlled experiments are now utilized to make data-driven decisions at Amazon, Microsoft, eBay, Facebook, Google, Yahoo, Zynga, and at many other companies. While the theory of a controlled experiment is simple, and dates back to Sir Ronald A. Fisher’s experiments at the Rothamsted Agricultural Experimental Station in England in the 1920s, the deployment and mining of online controlled experiments at scale—thousands of experiments now—has taught us many lessons. We provide an introduction, share real examples, key learnings, cultural challenges, and humbling statistics. 
ER - TY - CONF AU - Brew, Anthony AU - Greene, Derek AU - Cunningham, Pádraig A2 - Coelho, Helder A2 - Studer, Rudi A2 - Wooldridge, Michael T1 - Using Crowdsourcing and Active Learning to Track Sentiment in Online Media T2 - Proceedings of the 19th European Conference on Artificial Intelligence PB - IOS Press C1 - Amsterdam, The Netherlands PY - 2010/ CY - VL - 215 IS - SP - 145 EP - 150 UR - http://dl.acm.org/citation.cfm?id=1860967.1860997 DO - KW - active KW - analysis KW - crowdsourcing KW - datamining KW - learning KW - media KW - online KW - sentiment KW - web L1 - SN - 978-1-60750-605-8 N1 - N1 - AB - Tracking sentiment in the popular media has long been of interest to media analysts and pundits. With the availability of news content via online syndicated feeds, it is now possible to automate some aspects of this process. There is also great potential to crowdsource much of the annotation work that is required to train a machine learning system to perform sentiment scoring. (Crowdsourcing is a term, sometimes associated with Web 2.0 technologies, that describes outsourcing of tasks to a large, often anonymous community.) We describe such a system for tracking economic sentiment in online media that has been deployed since August 2009. It uses annotations provided by a cohort of non-expert annotators to train a learning system to classify a large body of news items. We report on the design challenges addressed in managing the effort of the annotators and in making annotation an interesting experience. 
ER - TY - CONF AU - Ahn, Yong-Yeol AU - Han, Seungyeop AU - Kwak, Haewoon AU - Moon, Sue AU - Jeong, Hawoong A2 - T1 - Analysis of topological characteristics of huge online social networking services T2 - Proceedings of the 16th International Conference on World Wide Web PB - ACM C1 - New York, NY, USA PY - 2007/ CY - VL - IS - SP - 835 EP - 844 UR - http://portal.acm.org/citation.cfm?id=1242685 DO - 10.1145/1242572.1242685 KW - folksonomy KW - online KW - analysis KW - network KW - sna KW - social L1 - SN - 978-1-59593-654-7 N1 - N1 - AB - Social networking services are a fast-growing business in the Internet. However, it is unknown if online relationships and their growth patterns are the same as in real-life social networks. In this paper, we compare the structures of three online social networking services: Cyworld, MySpace, and orkut, each with more than 10 million users, respectively. We have access to complete data of Cyworld's ilchon (friend) relationships and analyze its degree distribution, clustering property, degree correlation, and evolution over time. We also use Cyworld data to evaluate the validity of snowball sampling method, which we use to crawl and obtain partial network topologies of MySpace and orkut. Cyworld, the oldest of the three, demonstrates a changing scaling behavior over time in degree distribution. The latest Cyworld data's degree distribution exhibits a multi-scaling behavior, while those of MySpace and orkut have simple scaling behaviors with different exponents. Very interestingly, each of the two exponents corresponds to the different segments in Cyworld's degree distribution. Certain online social networking services encourage online activities that cannot be easily copied in real life; we show that they deviate from close-knit online social networks which show a similar degree correlation pattern to real-life social networks. ER -