This notebook corresponds to what was shown in class on September 3 on the topic of least-squares fitting of a linear or polynomial model to data.
The following line sets up a Matlab-like environment for scientific computing and data analysis. It should be issued each time a new notebook is started. It could be added to the local startup file so that it runs automatically, but here we keep it explicit to make sure the code runs on a default install of the Anaconda Python distribution.
%pylab inline
We take the data from MLS ("Mathematics for the Life Sciences") Example 3.1:
x=[2,5,2,4,6]
y=[4,7,5,8,11]
A plot of the data. The style string 'o' tells plot to draw the data as dots with no connecting line.
plot(x,y,'o')
We now fit a straight line (= polynomial of degree 1) to the data, minimizing the sum of squared deviations. polyfit returns an array containing the coefficients of the fitting polynomial, ordered from the highest degree down to the constant term.
p=polyfit(x,y,1)
p
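Under the hood, the degree-1 fit solves a linear least-squares problem for the design matrix whose columns are x and a column of ones. A minimal sketch using plain NumPy (the names A and c below are our own, not part of polyfit):

```python
import numpy as np

x = np.array([2, 5, 2, 4, 6], dtype=float)
y = np.array([4, 7, 5, 8, 11], dtype=float)

# Design matrix: one column per coefficient, highest degree first.
A = np.vstack([x, np.ones_like(x)]).T

# Solve the least-squares problem A @ c ~ y; c = [slope, intercept].
c, *_ = np.linalg.lstsq(A, y, rcond=None)

print(c)                   # same coefficients ...
print(np.polyfit(x, y, 1)) # ... as polyfit reports
```

For this data set both give a slope of about 1.406 and an intercept of about 1.656.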
We can evaluate the polynomial using polyval, here at the point x=5.
polyval(p,5)
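Since the coefficients are stored highest degree first, evaluating a degree-1 fit at a point x is just p[0]*x + p[1]. A quick sketch confirming this against polyval:

```python
import numpy as np

p = np.polyfit([2, 5, 2, 4, 6], [4, 7, 5, 8, 11], 1)

# Coefficients are ordered highest degree first: p[0]*x + p[1].
by_hand = p[0] * 5 + p[1]

# Both values agree (about 8.6875 for this data set).
print(np.polyval(p, 5), by_hand)
```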
To plot the result of the fit, we define a vector of points at which to evaluate the model. Here, we use 50 equidistant points in the interval [0,10].
xx=linspace(0,10,50)
And the plot command, which first draws the fitted line and then the raw data points, as above.
plot(xx,polyval(p,xx),x,y,'o')
We can also fit higher-degree polynomials, here a cubic. But note that a cubic very likely "overfits" these data: the many free parameters produce a seemingly good fit but introduce spurious features into the model that the underlying experiment would not robustly reproduce.
p=polyfit(x,y,3)
plot(xx,polyval(p,xx),x,y,'o')
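One way to see why a closer fit is not automatically a better model: the in-sample residual sum of squares can only go down as the degree grows, regardless of whether the extra parameters are meaningful. A small sketch comparing fits of increasing degree (the variable names are ours):

```python
import numpy as np

x = np.array([2, 5, 2, 4, 6], dtype=float)
y = np.array([4, 7, 5, 8, 11], dtype=float)

# Residual sum of squares for fits of increasing degree.
rss = {}
for deg in (1, 2, 3):
    p = np.polyfit(x, y, deg)
    rss[deg] = np.sum((y - np.polyval(p, x)) ** 2)
    print(deg, rss[deg])
```

The cubic's residuals are much smaller than the line's, yet with only five data points (and just four distinct x values) the extra wiggles are fitting noise rather than signal.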