The surprise, I guess, is that when you plot each word's frequency against its rank on a log-log chart, the points fall on a straight line.

Of course the frequency is going to fall in some way as the rank goes up; that's just what ranking by frequency means. But there are many ways that could happen. #1 could occur 10% more than #2. Or twice as much.
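To make the "many ways" concrete, here is a rough sketch comparing two made-up fall-off rules: an idealized Zipf curve (frequency proportional to 1/rank, where the exponent of 1 is an assumption) and a fixed-ratio rule where every word occurs 10% more often than the next one. The function names are just for illustration. Only the first rule gives the same slope everywhere on a log-log chart.

```python
import math

def zipf_freq(rank, s=1.0):
    # Idealized Zipf: frequency proportional to 1 / rank**s (s = 1 is an assumption)
    return 1.0 / rank ** s

def geometric_freq(rank, ratio=1.1):
    # Alternative rule: each word occurs `ratio` times as often as the next one
    return 1.0 / ratio ** (rank - 1)

def loglog_slope(freq, r1, r2):
    # Slope of the line between two ranks on a log-log chart
    return (math.log(freq(r2)) - math.log(freq(r1))) / (math.log(r2) - math.log(r1))

# Measure the slope near the top, in the middle, and further down the list
zipf_slopes = [loglog_slope(zipf_freq, r, 2 * r) for r in (1, 10, 100)]
geo_slopes = [loglog_slope(geometric_freq, r, 2 * r) for r in (1, 10, 100)]

print(zipf_slopes)  # roughly the same value at every depth
print(geo_slopes)   # gets steeper and steeper with depth
```

Note that under the idealized Zipf rule #1 occurs twice as often as #2, but #2 only 1.5 times as often as #3, so the ratio between neighbours is not constant. It is the slope on the log-log chart that stays fixed.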

And for the law to hold true no matter how deep you go is also surprising. Language seems like it should be a little more chaotic than that, with the top, say, 50 words following one distribution, then the longer tail kinda bumping around at different slopes.
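One way to check the "top 50 versus the tail" idea on a real text would be to fit the log-log slope separately for the head and for the rest and see whether they match. This is only a sketch: `head_and_tail_slopes` and `fit_slope` are hypothetical helpers, the cutoff of 50 is taken from the example above, and a plain least-squares fit is a crude estimator for this kind of data.

```python
import math
from collections import Counter

def fit_slope(ranks, freqs):
    # Ordinary least-squares slope of log(freq) against log(rank)
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def head_and_tail_slopes(words, head=50):
    # Rank words by count, then fit the top `head` ranks and the remainder separately.
    # Assumes the text has at least head + 2 distinct words.
    counts = sorted(Counter(words).values(), reverse=True)
    ranks = list(range(1, len(counts) + 1))
    return (fit_slope(ranks[:head], counts[:head]),
            fit_slope(ranks[head:], counts[head:]))

# Usage with any tokenized text, e.g.:
# head_slope, tail_slope = head_and_tail_slopes(text.lower().split())
```

If the two slopes come out close, the text behaves the way the law says; if the tail is noticeably steeper or flatter, you are seeing the kind of "bumping around" described above.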

This is my lay understanding. Corrections welcome :)