How does the distribution of baby names change over time?

I was recently in the fortunate position of having to spend a lot of time thinking about baby names. A name was eventually chosen, but being in the hospital and hearing other babies' names made me wonder: how much has the diversity of names changed in recent years? The unanimous answer would be that it has increased a lot. But then the statistician in me asks: is it measurable?

To measure it, one first has to make a judgement about how names are distributed. Intuitively, in any given year some names are very popular, whereas others are extremely rare. It is also very likely that a few very popular names account for the majority of all babies named, while many names are each given to only a small number of newborns. This sounds like a right-skewed distribution, so I picked an obvious and easily manageable one: the power law (Pareto distribution).

I did some quick research on the internet and came across some baby name statistics from Canada. I picked a province at random (Alberta) and calculated the Pareto index for boys and girls for the last 25 years.
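
For readers who want to try something similar, here is a minimal sketch of one common way to estimate a Pareto index from a list of per-name baby counts for a single year. It uses the standard maximum-likelihood estimator for a continuous Pareto distribution, which is only an approximation for integer counts and is not necessarily the calculation behind the numbers in this post. The function name, the x_min parameter, and the toy counts are all made up for illustration.

```python
import math


def pareto_index(counts, x_min=None):
    """Estimate the Pareto shape parameter (alpha) by maximum likelihood.

    counts: number of babies given each name in one year.
    x_min:  smallest count included in the fit (assumption: defaults to
            the smallest positive count in the data).
    """
    values = [c for c in counts if c > 0]
    if not values:
        raise ValueError("need at least one positive count")
    if x_min is None:
        x_min = min(values)

    # Only names with at least x_min babies enter the fit.
    tail = [c for c in values if c >= x_min]
    log_sum = sum(math.log(c / x_min) for c in tail)
    if log_sum == 0:
        raise ValueError("all counts equal x_min; alpha is undefined")

    # MLE for a continuous Pareto: alpha = n / sum(ln(x_i / x_min))
    return len(tail) / log_sum


# Toy, made-up counts for two hypothetical years (not real Alberta data).
year_a = [500, 300, 120, 40, 10, 5, 2, 1, 1, 1]
year_b = [200, 150, 90, 40, 10, 5, 2, 1, 1, 1, 1, 1, 1, 1]

print("year A alpha:", round(pareto_index(year_a), 3))
print("year B alpha:", round(pareto_index(year_b), 3))
```

Running this once per year and per sex on the real counts would give a series that can be plotted over time. The choice of x_min matters a lot in practice, so it is worth checking how sensitive the estimate is to it.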

[Figure: Pareto index]

The Pareto shape parameter tells us about the skewness of the distribution, or how evenly names are distributed. The larger the number, the more obscure and infrequently used names there are in a given year. The graphs show that in general there is a higher diversity of girls' names. Also, the variety of names increases over time, i.e. as time passes there is a larger number of infrequent names. It seems the obvious intuition is right: conventions restrict parents' imagination less and less when it comes to naming their newborn.


