Rank-size distribution

The rank-size distribution, or rank-size rule (or law), describes a regularity observed in many phenomena, including the sizes of cities around the world, the sizes of businesses, particle sizes (such as sand), the lengths of rivers, the frequencies of word usage, and wealth among individuals. All are real-world observations that follow power laws, such as Zipf's law, the Yule distribution, or the Pareto distribution. If one ranks the cities of a given country, or of the entire world, by population and plots the natural logarithm of each city's rank against the natural logarithm of its population, the resulting graph shows a nearly straight line (a log-linear pattern). This is the rank-size distribution.
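The log-linear pattern described above can be checked with an ordinary least-squares fit of ln(size) against ln(rank). The following is a minimal sketch, not a reference implementation: the function name fit_rank_size and the synthetic data are assumptions made for illustration, and real city data would give only an approximate fit.

```python
import math

def fit_rank_size(sizes):
    """Least-squares fit of ln(size) = intercept + slope * ln(rank).

    Hypothetical helper for illustration; rank 1 is the largest item.
    """
    ranked = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data (an assumption, not real populations) that follows the
# rule exactly: size = 1,000,000 / rank for ranks 1 to 20.
synthetic = [1_000_000 / rank for rank in range(1, 21)]
slope, intercept = fit_rank_size(synthetic)
print(f"slope = {slope:.3f}, implied largest size = {math.exp(intercept):.0f}")
```

A slope close to -1 corresponds to the simple rank-size rule; other slopes still indicate a power law, with a different exponent.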

In the case of city populations, the resulting distribution in a country, a region, or the world is characterized by a largest city, the primate city, with the other cities decreasing in size relative to it, initially at a rapid rate and then more slowly. This results in a few large cities and a much larger number of cities that are orders of magnitude smaller. For example, a rank-three city would have ⅓ the population of a country's primate city, a rank-four city would have ¼ the population of the primate city, and so on.
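The arithmetic in this example can be written out directly. The sketch below assumes the simplest form of the rule, in which the city of rank r has 1/r the population of the largest city; the function name predicted_sizes and the population figure of 9,000,000 are invented for illustration.

```python
def predicted_sizes(largest, count):
    """Sizes predicted by the simple rank-size rule: largest / rank."""
    return [largest / rank for rank in range(1, count + 1)]

# Hypothetical largest city of 9,000,000 inhabitants.
for rank, size in enumerate(predicted_sizes(9_000_000, 4), start=1):
    print(f"rank {rank}: {size:,.0f}")
```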

Why should simple rank predict such complex distributions so easily? In short, why does the rank-size rule “work”? One study has proposed an explanation.

The distributions mentioned above (Zipf, Pareto, Yule, and so on), also called power laws, are all related to the Fibonacci sequence and to the equiangular spiral. In the Fibonacci sequence, the ratio of each term to the preceding one approaches approximately 1.618 (the golden ratio). A closely related sequence, built with the same additive rule from different starting values, is the Lucas sequence, in which each term is the sum of the two preceding ones: 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, …

When any log-linear factor is ranked, the ranks follow the Lucas sequence as above, and each term in the sequence can be approximated by the corresponding power of 1.618. For example, the third term in the sequence above, 4, is approximately 1.618³, or 4.236; the fourth term, 7, is approximately 1.618⁴, or 6.854; and the eighth term, 47, is approximately 1.618⁸, or 46.979. The approximation improves as the terms grow larger.
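The comparison between the terms of the sequence and the powers of the golden ratio can be checked numerically. This is a small sketch under stated assumptions: the names PHI and lucas_terms are chosen for illustration, and the sequence is started at 1, 3 to match the listing in the text.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618

def lucas_terms(count):
    """Return the first `count` terms of 1, 3, 4, 7, 11, ..."""
    terms = [1, 3]
    while len(terms) < count:
        # Each new term is the sum of the two preceding ones.
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

# Print each term next to the matching power of the golden ratio.
for n, term in enumerate(lucas_terms(11), start=1):
    power = PHI ** n
    print(f"term {n}: {term:>4}   PHI^{n} = {power:9.3f}   gap = {abs(term - power):.3f}")
```

The gap shrinks with every step, so from the second term onward rounding the power of the golden ratio to the nearest integer reproduces the term itself.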

Thus the rank-size rule “works” because it is a “shadow” or coincidental measure of the true phenomenon. Its true value is therefore not as an accurate mathematical measure (other power-law formulas are more accurate, especially at ranks lower than 10) but as a handy “rule of thumb” for spotting power laws. When presented with a ranking of data, is the third-ranked value approximately ⅓ of the highest-ranked one? Or, conversely, is the highest-ranked value approximately ten times the tenth-ranked one? If so, the rank-size rule has possibly helped spot another power-law relationship. A 2002 study found that Zipf’s law held for 44 of the 73 countries tested. The study also found that variations in the Pareto exponent are better explained by political variables than by economic-geography variables such as proxies for economies of scale or transportation costs.
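The rule-of-thumb test described above (comparing the third-ranked and tenth-ranked values with the highest-ranked one) can be sketched as follows. The function names, the choice of ranks 3 and 10 as defaults, and the 25% tolerance are all assumptions made for illustration; the text does not specify how close "approximately" should be.

```python
def rank_size_ratio(values, rank):
    """Observed value at `rank` divided by the rule's prediction, top / rank."""
    ranked = sorted(values, reverse=True)
    predicted = ranked[0] / rank
    return ranked[rank - 1] / predicted

def passes_rule_of_thumb(values, ranks=(3, 10), tolerance=0.25):
    """True if every checked rank is within `tolerance` of the prediction.

    The tolerance is an arbitrary illustrative threshold, not a standard one.
    """
    usable = [rank for rank in ranks if rank <= len(values)]
    if not usable:
        return False  # too few values to check any of the requested ranks
    return all(abs(rank_size_ratio(values, rank) - 1) <= tolerance for rank in usable)

# Made-up example data: one series shaped like 1/rank, one that falls linearly.
zipf_like = [1000 / rank for rank in range(1, 11)]
linear = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
print(passes_rule_of_thumb(zipf_like), passes_rule_of_thumb(linear))
```

A passing result only suggests that a power law may be present; confirming it would still require fitting the data properly.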