The data in this example are from an introductory example of Matlab. The following are numerical data of the US population in millions.
>p := [75.995, 91.972, 105.711, 123.203, 131.669, ... 150.697, 179.323, 203.212, 226.505, 249.633, 281.422]; >t := 1900:10:2000;
Let us try to fit a polynomial to it the direct way. Since we do not want to get oscillation, we set the degree to 3.
>pf := polyfit(t,p,3); ... plot2d(t,p,points=true,a=1900,b=2010,c=0,d=320); ... plot2d("polyval(pf,x)",add=1):
This does not look to bad. Let us evaluate the forecast for 2010.
>polyval(pf,2010)
312.691378788
For polynimals of higher degree, we get problems with overflows and inaccurate values. This can be avoided by rescaling the date to [0,1]. Even then, the behavior of the polynomial gets erratic.
>pf1 := polyfit((t-1900)/50,p,10); ... plot2d(t,p,points=true,a=1900,b=2010,c=0,d=320); ... plot2d("polyval(pf1,(x-1900)/50)",add=1):
It is best to stay with polynomials of low degree. Or even better, to use a model, which is appropriate for population growth. Let us have a look at the growth rate for each year.
>RR := differences(p)/(p[1:cols(p)-1])
[ 0.210237515626 0.149382420737 0.165470007852 0.0687158591917 0.14451389469 0.189957331599 0.133217713288 0.11462413637 0.102108121234 0.127342939435 ]
>plot2d(t,RR,bar=true):
It looks decreasing with a problematic break at 1930-1940. We can try a linar fit.
>pr := polyfit(t[1:cols(t)-1],RR,1)
[ 1.42808998835 -0.000661970691182 ]
>plot2d("polyval(pr,x)",color=blue,thickness=2,style="--",>add); insimg();
So we can predict the rate for the years 2000-2010.
>pc := polyval(pr,2000)
0.104148605987
If you look at the data, there is a trend to an increased production rate from 1990 to 2000. So we are sceptical, if such a long term prediction really applies. However, we use it to predict the polpulation in 2010.
>(1+pc)*p[-1]
310.731708994
The true number is 308 millions. It turns out the growth was a much less than predicted.
>(308-p[-1])/p[-1]
0.0944417991486
I found the following population data from 1960 to 2011 on http://data.worldbank.org/country/united-states To generate the file, I downloaded the Excel version and copied the line with the population data to a file. It contains a single tab separated line.
>t := 1960:2011; p := readmatrix("uspopulation.dat"); >plot2d(t,p/1000000):
Let us try to determine the anual growth rate from this.
>RR := differences(p)/head(p,-2); >plot2d(head(t,-2),RR):
This looks quite erratic, but with a descending tendency. Note that shar cescent in the 1960-1968 due to the invention of the contraception. Even, if we take only the data from 1990 on, it is obvious how dubious a long term prediction would be. Fitting some exponential curve to the data seems appropriate, but it is probably only like guessing. But we try to estimate the growth in 2011 using extrapolation.
>pr := polyfit(1990:2010,RR[-22:-2],1)
[ 0.501914468534 -0.00024555094606 ]
>pc := polyval(pc,2011)
0.104148605987
Now we predict the number for 2012.
>p[-1], p[-1]*(1+pc), %-p[-1]
311591917 344043780.792 32451863.7924
The 2.5 million increase is in accordance with the data you will find in the net. By the way, the 10 year rate based on the same rate per year would be the following.
>(1+polyval(pc,2011))^10-1
1.69324141904
This is much less than the 9.4% we predicted with the data in the last section.