Getting it wrong has implications for healthcare, infrastructure, and climate.


Samir KC is on a mission to get people thinking differently about population growth. The basic idea of predicting future population size is so simple a child could do it. The reality of getting an accurate estimate is fiendishly complex, however, requiring intimate knowledge of how factors like education and migration will affect a given region.

“It’s very easy to do statistical extrapolation,” says KC, a professor at Shanghai University. But accuracy demands local expertise: “You need to understand a lot of things, and not everything is in the data. Only local demographers that are experts in that country can give you the right inputs.”

In a paper published in PNAS last week, KC and his colleagues show how predictions of India’s population over the next century can vary widely, depending on what data gets baked into the calculations. Population data plays a crucial role in planning for healthcare, education, and infrastructure (and in the longer term, climate change), so that variability has clear real-world implications.

Forecasting the future

If you wanted to build a very simple population growth estimate, it might look something like this: we have 10 women and 10 men now, and we expect each of the women to have four children (it can be pretty tricky to tell how many children men have). So, in 20 years’ time, there’ll be an extra forty people, making a population of 60 overall.

Obviously, that’s way too simple to be of any use at all. For one thing, those people will be different ages—so some of them will be too young to have kids, and some will be too old. For another, some of those people will die, including some of the newborns. So you need to build the age structure of the population, as well as the mortality rates at different ages, into your model.
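To make that concrete, here is a minimal sketch of what adding age structure and mortality does to the simple estimate. It is not the model from the paper. The three age groups, the survival shares, and the function names are all hypothetical, chosen only for illustration, and the example is scaled up to 1,000 women and 1,000 men.

```python
def naive_projection(women, men, children_per_woman):
    """The simple estimate: everyone survives and every woman has children."""
    return women + men + women * children_per_woman


def cohort_step(pop, survival, children_per_woman):
    """Advance an age-structured population by one 20-year step.

    pop: people in the age groups [0-19, 20-39, 40+].
    survival: share surviving the step, for [newborns, 0-19, 20-39, 40+].
    Assumes half of the 20-39 group are women and only they give birth.
    """
    young, adult, older = pop
    births = (adult / 2) * children_per_woman
    return [
        births * survival[0],                       # surviving newborns
        young * survival[1],                        # 0-19 ageing into 20-39
        adult * survival[2] + older * survival[3],  # survivors now aged 40+
    ]


# Hypothetical inputs, chosen for illustration only (not real data).
START = [600, 800, 600]            # 2,000 people, half of them women
SURVIVAL = [0.9, 0.95, 0.9, 0.5]
CHILDREN_PER_WOMAN = 4

naive_total = naive_projection(1000, 1000, CHILDREN_PER_WOMAN)
structured_total = sum(cohort_step(START, SURVIVAL, CHILDREN_PER_WOMAN))
print(naive_total, round(structured_total))
```

With these made-up inputs, the age-structured total comes out at roughly half the naive figure, because only some of the women are of childbearing age and not everyone survives the 20 years.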

A particularly thorny challenge is figuring out the fertility rate: how many children the average, hypothetical woman can be expected to have over the course of her life. If you just took the average fertility rate across the country, you’d have a simple but functional population model. But if you zoom in on the country and look at what’s going on in different regions, the fertility rate can look very different in different places.

The tricky thing is that fertility rates work like compound interest, where a small difference in interest rate can add up to a huge difference over time. Say there’s one region with a really high fertility rate, and one with a low rate. In the high-fertility region, the population will grow exponentially; in the low-fertility region, it might stay the same or shrink. If you work out the projection for each region separately and add them up, your total will be higher than if you just averaged the fertility rates and worked out the whole country in one go.
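A rough sketch of the compounding effect, using two equally sized regions, might look like this. The fertility rates, starting populations, and the shortcut of treating growth per generation as children per woman divided by two (ignoring mortality and age structure) are all simplifying assumptions for illustration, not figures from the paper.

```python
def project(population, growth_per_generation, generations):
    """Compound a population forward by a fixed per-generation growth factor."""
    return population * growth_per_generation ** generations


def growth_factor(children_per_woman):
    """Crude assumption: two parents are replaced by this many children."""
    return children_per_woman / 2.0


# Hypothetical numbers, chosen only for illustration.
HIGH_FERTILITY, LOW_FERTILITY = 4.0, 2.0
START = 1_000_000    # people in each of the two regions
GENERATIONS = 3

# Project each region on its own, then add the results.
regional_total = (
    project(START, growth_factor(HIGH_FERTILITY), GENERATIONS)
    + project(START, growth_factor(LOW_FERTILITY), GENERATIONS)
)

# Project the whole country in one go using the averaged fertility rate.
average_fertility = (HIGH_FERTILITY + LOW_FERTILITY) / 2
pooled_total = project(2 * START, growth_factor(average_fertility), GENERATIONS)

print(f"{regional_total:,.0f} vs {pooled_total:,.0f}")
# prints "9,000,000 vs 6,750,000"
```

In this toy setup the region-by-region total is a third larger than the pooled one after just three generations, and the gap keeps widening with every further generation.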

On the other hand, fertility rates will probably go down over time as a result of education: women with more education have fewer children. India has seen a rapid explosion in education, and those waves are still making their way through the whole population. Places where the fertility rate was higher a generation ago have seen it drop: “In association with the improving education of younger women, national-level fertility rates have also declined to 2.2, which is just around a third of their levels in the 1960s,” write KC and his colleagues.

Uncertainty all the way down

To explore how different factors would affect their predictions, KC and his colleagues looked at demographic data from India. They drew up estimates of how things will change over the next century—where people are likely to live, how much education they're likely to get, and so on. They also looked at mortality rates and migration, including how these things are likely to change as a result of expanding access to education.

Then they baked these factors into their population models, one at a time, to see how different combinations of them affected the estimate. Because of the population forces currently in motion, all the estimates look pretty similar for the next 20 years, but then they start to break apart dramatically: the highest estimate puts the population in 2060 at nearly 1.8 billion and the lowest at just 1.65 billion. That’s a difference of 150 million people—about half the US population.

If you’re one of the people trying to work out how many schools to build, how many vaccines are needed, or how to prepare for food insecurity in the face of climate change, those numbers are incredibly meaningful—and understanding the regional differences is important, too.

“This is a very good piece of work by a very good group,” says demographer Dennis Ahlburg, who wasn’t involved with the research. But even KC points out that there’s still loads of uncertainty in their projections: there are weaknesses in the data they used to estimate trends in things like mortality, he says. More importantly, the paper is an exploration of the differences that will appear depending on what you build into the model.

“These are the most important sources of heterogeneity in India,” he says. But in other countries, a different set of factors might be more important. In Sweden, educational differences won't be that large, but immigration status might play a role in fertility rate. In the US, religion might be a factor worth including.

“I really want to make sure that every country has its own model,” KC enthuses. To help with this, his team has written freely available demographic software that’s designed to be as easy as possible to use. Because the point of KC’s work is that he can only take his involvement so far: “What’s going to happen in your house is something you can tell better than me.”