Euler Examples

US Population Forecast

The data in this example are from an introductory example of Matlab.

The following are numerical data of the US population in millions.
>p := [75.995, 91.972, 105.711, 123.203, 131.669, ...
      150.697, 179.323, 203.212, 226.505, 249.633, 281.422];
>t := 1900:10:2000;
Let us try to fit a polynomial to the data directly. To avoid
oscillation, we set the degree to 3.
>pf := polyfit(t,p,3); ...
 plot2d(t,p,points=true,a=1900,b=2010,c=0,d=320); ...
 plot2d("polyval(pf,x)",add=1):

This does not look too bad. Let us evaluate the forecast for 2010.
>polyval(pf,2010)
312.691378788
For polynomials of higher degree, we get problems with overflows and
inaccurate values. This can be avoided by rescaling the years to a
small interval, here [0,2].
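
To see why rescaling helps, it is enough to look at the size of the
powers involved. The following sketch uses plain Python only as a
cross-check outside Euler. The variable names are illustrative, and it
says nothing about how Euler's polyfit works internally. It compares
the largest power that appears in a degree 10 fit with and without the
rescaling (t-1900)/50.

```python
# Years of the census data, 1900 to 2000 in steps of 10.
t = list(range(1900, 2001, 10))
degree = 10

# Largest power of t that a degree 10 polynomial has to handle.
raw_max_power = max(t) ** degree

# The same after mapping the years to [0, 2] via (t - 1900) / 50.
scaled = [(year - 1900) / 50 for year in t]
scaled_max_power = max(scaled) ** degree

print(raw_max_power)     # about 1e33
print(scaled_max_power)  # 1024.0
```

Powers of about 1e33 next to the constant term 1 leave little room for
accurate floating point arithmetic, which is consistent with the
problems described above.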

Even then, the behavior of the polynomial gets erratic.
>pf1 := polyfit((t-1900)/50,p,10); ...
 plot2d(t,p,points=true,a=1900,b=2010,c=0,d=320); ...
 plot2d("polyval(pf1,(x-1900)/50)",add=1):

It is best to stay with polynomials of low degree or, even better, to
use a model that is appropriate for population growth.
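
One simple candidate for such a model is exponential growth,
p(t) = a*exp(k*(t-1900)). The sketch below, in plain Python as a
cross-check outside Euler, fits this model with a least squares line
through log(p). The model choice and all names (fit_line, a, k) are
assumptions of this sketch and not part of the example that follows.

```python
import math

# US population in millions, 1900 to 2000 in steps of 10 (values from above).
p = [75.995, 91.972, 105.711, 123.203, 131.669,
     150.697, 179.323, 203.212, 226.505, 249.633, 281.422]
t = list(range(1900, 2001, 10))


def fit_line(xs, ys):
    """Least squares line ys ~ intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Fit log(p) = log(a) + k * (t - 1900).
log_a, k = fit_line([year - 1900 for year in t], [math.log(v) for v in p])
a = math.exp(log_a)

# Extrapolate the fitted curve to 2010.
forecast_2010 = a * math.exp(k * (2010 - 1900))
print(k, forecast_2010)
```

With these data the fitted rate k is roughly 1.3% per year and the
2010 forecast comes out at roughly 330 million, which shows how
strongly the result depends on the chosen model.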

Let us have a look at the growth rate for each decade.
>RR := differences(p)/(p[1:cols(p)-1])
[ 0.210237515626  0.149382420737  0.165470007852  0.0687158591917
0.14451389469  0.189957331599  0.133217713288  0.11462413637
0.102108121234  0.127342939435 ]
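
As a cross-check outside Euler, the same decade growth rates can be
recomputed in plain Python. This is only a sketch. The names p and
rates are chosen for illustration, and the data are the values listed
above.

```python
# US population in millions, 1900 to 2000 in steps of 10 (values from above).
p = [75.995, 91.972, 105.711, 123.203, 131.669,
     150.697, 179.323, 203.212, 226.505, 249.633, 281.422]

# Relative growth over each decade: (next - current) / current.
rates = [(b - a) / a for a, b in zip(p, p[1:])]

# Print each rate next to the first year of its decade.
for year, r in zip(range(1900, 2000, 10), rates):
    print(year, round(r, 6))
```

There are 10 rates for 11 data points, one for each decade starting in
1900, 1910, and so on up to 1990.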
>plot2d(t,RR,bar=true):

The rate appears to decrease, with a problematic break at 1930-1940.
We can try a linear fit.
>pr := polyfit(t[1:cols(t)-1],RR,1)
[ 1.42808998835  -0.000661970691182 ]
>plot2d("polyval(pr,x)",color=blue,thickness=2,style="--",>add); insimg();

So we can predict the rate for the years 2000-2010.
>pc := polyval(pr,2000)
0.104148605987
If you look at the data, there is a trend towards an increased growth
rate from 1990 to 2000. So we are sceptical whether such a long-term
prediction really applies.

However, we use it to predict the population in 2010.
>(1+pc)*p[-1]
310.731708994
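
The whole chain from the data to this forecast can be retraced in
plain Python, used here only as a cross-check outside Euler. The
sketch fits a least squares line to the decade growth rates with the
closed formula for simple linear regression, evaluates it at 2000 and
applies the rate to the last data point. All variable names are
illustrative.

```python
# US population in millions, 1900 to 2000 in steps of 10 (values from above).
p = [75.995, 91.972, 105.711, 123.203, 131.669,
     150.697, 179.323, 203.212, 226.505, 249.633, 281.422]
start_years = list(range(1900, 2000, 10))

# Relative growth over each decade.
rates = [(b - a) / a for a, b in zip(p, p[1:])]

# Least squares line: rates ~ intercept + slope * start_year.
n = len(rates)
mean_x = sum(start_years) / n
mean_y = sum(rates) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(start_years, rates))
         / sum((x - mean_x) ** 2 for x in start_years))
intercept = mean_y - slope * mean_x

# Predicted rate for the decade starting in 2000 and the 2010 forecast.
rate_2000 = intercept + slope * 2000
forecast_2010 = (1 + rate_2000) * p[-1]
print(intercept, slope)
print(rate_2000, forecast_2010)
```

Up to rounding, this reproduces the coefficients, the rate of about
0.104 and the forecast of about 310.7 million shown above.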
The true number is 308 million. It turns out that the growth was less
than predicted.
>(308-p[-1])/p[-1]
0.0944417991486

More Recent Data

I found the following population data from 1960 to 2011 on

http://data.worldbank.org/country/united-states

To generate the file, I downloaded the Excel version and copied the
line with the population data to a file. It contains a single
tab-separated line.
>t := 1960:2011; p := readmatrix("uspopulation.dat");
>plot2d(t,p/1000000):

Let us try to determine the annual growth rate from this.
>RR := differences(p)/head(p,-2);
>plot2d(head(t,-2),RR):

This looks quite erratic, but with a descending tendency. Note the
sharp descent in 1960-1968 due to the invention of contraception.

Even if we take only the data from 1990 on, it is obvious how dubious
a long-term prediction would be. Fitting some exponential curve to the
data seems appropriate, but it is probably little more than guessing.

But we try to estimate the growth in 2011 using extrapolation.
>pr := polyfit(1990:2010,RR[-22:-2],1)
[ 0.501914468534  -0.00024555094606 ]
>pc := polyval(pr,2011)
0.0081115160
Now we predict the number for 2012.
>p[-1], p[-1]*(1+pc), %-p[-1]
311591917
314119399.82
2527482.82
The 2.5 million increase is in accordance with the data you will find
on the net.

By the way, the 10-year rate based on the same rate per year would be
the following.
>(1+polyval(pr,2011))^10-1
0.08414097
This is less than the 9.4% growth from 2000 to 2010 that we computed
in the last section.
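
The arithmetic behind this conversion can be checked by hand. The
sketch below, in plain Python as a cross-check outside Euler,
evaluates the linear fit at 2011 using the coefficients printed for pr
above and compounds the resulting annual rate over 10 years. The
variable names are illustrative.

```python
# Coefficients of the linear fit to the annual growth rates (printed above),
# constant term first.
intercept, slope = 0.501914468534, -0.00024555094606

# Extrapolated annual growth rate for 2011.
annual_rate = intercept + slope * 2011

# Compounding the same annual rate over 10 years.
ten_year_rate = (1 + annual_rate) ** 10 - 1

# Expected one-year increase for the 2011 population of 311591917.
increase = 311591917 * annual_rate

print(annual_rate, ten_year_rate, round(increase))
```

This gives an annual rate of about 0.81%, a 10-year rate of about 8.4%
and an increase of about 2.5 million people.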
