have significant consequences for our understanding of natural and man-made
phenomena. Unfortunately, the detection and characterization of power laws is
complicated by the large fluctuations that occur in the tail of the
distribution -- the part of the distribution representing large but rare events
-- and by the difficulty of identifying the range over which power-law behavior
holds. Commonly used methods for analyzing power-law data, such as
least-squares fitting, can produce substantially inaccurate estimates of
parameters for power-law distributions, and even in cases where such methods
return accurate answers they are still unsatisfactory because they give no
indication of whether the data obey a power law at all. Here we present a
principled statistical framework for discerning and quantifying power-law
behavior in empirical data. Our approach combines maximum-likelihood fitting
methods with goodness-of-fit tests based on the Kolmogorov-Smirnov statistic
and likelihood ratios. We evaluate the effectiveness of the approach with tests
on synthetic data and give critical comparisons to previous approaches. We also
apply the proposed methods to twenty-four real-world data sets from a range of
different disciplines, each of which has been conjectured to follow a power-law
distribution. In some cases we find these conjectures to be consistent with the
data while in others the power law is ruled out.
ER -

TY - JOUR
AU - Clauset, Aaron
AU - Shalizi, Cosma Rohilla
AU - Newman, M. E. J.
T1 - Power-Law Distributions in Empirical Data
JO - SIAM Review
PY - 2009/
VL - 51
IS - 4
SP - 661
EP - 703
UR - http://link.aip.org/link/?SIR/51/661/1
DO - 10.1137/070710111
KW - law
KW - power
KW - powerlaw
AB - Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the detection and characterization of power laws is complicated by the large fluctuations that occur in the tail of the distribution -- the part of the distribution representing large but rare events -- and by the difficulty of identifying the range over which power-law behavior holds. Commonly used methods for analyzing power-law data, such as least-squares fitting, can produce substantially inaccurate estimates of parameters for power-law distributions, and even in cases where such methods return accurate answers they are still unsatisfactory because they give no indication of whether the data obey a power law at all. Here we present a principled statistical framework for discerning and quantifying power-law behavior in empirical data. Our approach combines maximum-likelihood fitting methods with goodness-of-fit tests based on the Kolmogorov–Smirnov (KS) statistic and likelihood ratios. We evaluate the effectiveness of the approach with tests on synthetic data and give critical comparisons to previous approaches. We also apply the proposed methods to twenty-four real-world data sets from a range of different disciplines, each of which has been conjectured to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data, while in others the power law is ruled out.
ER -

TY - GEN
AU - Newman, M. E. J.
T1 - Power laws, Pareto distributions and Zipf's law
PY - 2004/
UR - http://arxiv.org/abs/cond-mat/0412004
KW - law
KW - power
KW - powerlaw
AB - When the probability of measuring a particular value of some quantity varies
inversely as a power of that value, the quantity is said to follow a power law,
also known variously as Zipf's law or the Pareto distribution. Power laws
appear widely in physics, biology, earth and planetary sciences, economics and
finance, computer science, demography and the social sciences. For instance,
the distributions of the sizes of cities, earthquakes, solar flares, moon
craters, wars and people's personal fortunes all appear to follow power laws.
The origin of power-law behaviour has been a topic of debate in the scientific
community for more than a century. Here we review some of the empirical
evidence for the existence of power-law forms and the theories proposed to
explain them.
ER -

TY - JOUR
AU - Newman, M. E. J.
T1 - Power laws, Pareto distributions and Zipf's law
JO - Contemporary Physics
PY - 2005/
VL - 46
UR - doi:10.1080/00107510500052444
KW - distribution
KW - free
KW - law
KW - long
KW - power
KW - scale
KW - tail
KW - zipf
ER -

TY - CONF
AU - Stumme, Gerd
A2 - Wolff, Karl Erich
A2 - Pfeiffer, Heather D.
A2 - Delugach, Harry S.
T1 - Iceberg Query Lattices for Datalog
T2 - Conceptual Structures at Work: 12th International Conference on Conceptual Structures (ICCS 2004)
PB - Springer
C1 - Heidelberg
PY - 2004/
VL - 3127
SP - 109
EP - 125
UR - http://www.kde.cs.uni-kassel.de/stumme/papers/2004/stumme2004iceberg.pdf
KW - 2004
KW - analysis
KW - concept
KW - context
KW - datalog
KW - families
KW - family
KW - fca
KW - formal
KW - iceberg
KW - itegpub
KW - l3s
KW - lattices
KW - myown
KW - pcf
KW - power
KW - queries
KW - query
N1 - Publications of Gerd Stumme
ER -

TY - JOUR
AU - Goldstein, M. L.
AU - Morris, S. A.
AU - Yen, G. G.
T1 - Fitting to the power-law distribution
JO - The European Physical Journal B - Condensed Matter and Complex Systems
PY - 2004/
VL - 41
IS - 2
SP - 255
EP - 258
UR - http://arxiv.org/abs/cond-mat/0402322v1
KW - distribution
KW - fitting
KW - law
KW - power
KW - powerlaw
AB - Version 1 of Goldstein 04 power law fit containing also the chi 2 test
ER -

TY - JOUR
AU - Barabási, Albert-László
AU - Albert, Réka
T1 - Emergence of scaling in random networks
JO - Science
PY - 1999/
VL - 286
SP - 509
EP - 512
KW - attachment
KW - law
KW - power
KW - preferential
N1 - BA model - Wikipedia, the free encyclopedia
ER -

TY - JOUR
AU - Mitzenmacher, M.
T1 - A Brief History of Generative Models for Power Law and Lognormal Distributions
JO - Internet Mathematics
PY - 2004/
VL - 1
IS - 2
SP - 226
EP - 251
UR - http://www.eecs.harvard.edu/~michaelm/CS223/powerlaw.pdf
KW - generative
KW - law
KW - model
KW - power
KW - powerlaw
AB - Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying to learn enough about these distributions to settle the question, I found a rich and long history, spanning many fields. Indeed, several recently proposed models from the computer science community have antecedents in work from decades ago. Here, I briefly survey some of this history, focusing on underlying generative models that lead to these distributions. One finding is that lognormal and power law distributions connect quite naturally, and hence, it is not surprising that lognormal distributions have arisen as a possible alternative to power law distributions across many fields.
ER -