21 April 2010

Zipf

NY Times: Economix Blog | Ed Glaeser | A Tale of Many Cities 

"Zipf’s Law" is one of the great curiosities of urban research. The law claims that the number of people in a city is inversely proportional to the city’s rank among all cities. In other words, the biggest city is about twice the size of the second biggest city, three times the size of the third biggest city, and so forth.

Zipf’s Law is named after the linguist George Kingsley Zipf, who discovered the law when studying the distribution of words: the second most common word in a text typically shows up one-half as often as the most commonly used word. The law has been observed in many other contexts, including firm sizes and income distribution, which follows the closely-connected Pareto Distribution. [...]

Part of the appeal of Zipf’s Law is that it appears to be entirely natural. As Steven Strogatz wrote in The New York Times about one year ago: “No city planner imposed it, and no citizens conspired to make it happen.”
It's like people engage in some sort of self-organization! The universe isn't guided by a central planner, and yet it makes some sense!
But Zipf’s Law seems to be mainly a product of city or metropolitan area boundaries, not the natural distribution of population.

Professors Holmes and Lee ignored political boundaries and split America up using a six-by-six-mile grid. Their cities are squares crafted without any attention to actual boundaries. Using Census Block level data, they calculate the population of each square in the grid. It turns out that Zipf’s Law doesn’t work for these fixed geographic areas."
Of course it doesn't. It's not a function of boundaries per se; it's a function of names and human perceptions.

Their experiment is like squishing all the letters in a book together, dividing them up into contiguous 10-letter blocks, and then expecting the frequency distribution of those blocks to be the same as that of the words.
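The analogy is easy to try for yourself. Here is a rough Python sketch, assuming plain English text as input. The function names (`word_counts`, `block_counts`, `ranked`) and the sample sentence are my own choices for illustration, not anything taken from Holmes and Lee or from Glaeser.

```python
import re
from collections import Counter

def word_counts(text):
    """Count each word, ignoring case and punctuation.

    Simplification: apostrophes split words, so "don't" becomes "don" and "t".
    """
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)

def block_counts(text, block_size=10):
    """Squish the letters together, then count contiguous fixed-size blocks.

    A trailing block shorter than block_size is dropped.
    """
    letters = "".join(re.findall(r"[a-z]", text.lower()))
    last_start = len(letters) - block_size
    blocks = [letters[i:i + block_size]
              for i in range(0, last_start + 1, block_size)]
    return Counter(blocks)

def ranked(counter):
    """Return (rank, item, count) tuples, most frequent first."""
    return [(rank, item, count)
            for rank, (item, count) in enumerate(counter.most_common(), start=1)]

sample = "It was the best of times, it was the worst of times"
print(ranked(word_counts(sample))[:3])
print(ranked(block_counts(sample))[:3])
```

One sentence is far too short to show anything Zipf-like, so feed it a whole book instead. I'd expect the word counts to fall off roughly in proportion to rank, while nearly every 10-letter block shows up exactly once, leaving no rank-size pattern to speak of.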
Zipf’s Law is a bust at describing the population levels of areas within fixed boundaries.

But that doesn’t make the law irrelevant. It rather pushes us to understand why Zipf’s Law holds across metropolitan areas, but not in large squares of fixed size.
That's an easy question: 6x6 mile grids don't have any semantic relevance. Nobody says "I want to live in cell #481746," just like nobody ever strings the letters ASTHEBESTOFTI together unless it's incidental to saying something meaningful. People may exhibit a preference for cell #481746 because it happens to be San Francisco's Marina district, and they may type ASTHEBESTOFTI because they're trying to say "It was the best of times, it was the worst of times," but the arbitrary chunks are meaningless.

Glaeser begins making more sense in the paragraphs that follow, but he uses the loaded and inaccurate word "sprawl" when he really means "preferential attachment." Sprawl, which is difficult to define because it's usually used to mean "any development less dense than Manhattan which I don't like," has an element of density and intra-urban space to it that the rest of the piece and Zipf's law lack. I find it slightly irresponsible to invoke the name of a bogeyman when there's a commonly used, value-judgement-free term in wide scientific circulation.
