Untitled

506 days ago by pub

Example 1C: Least Squares Polynomial Approximation.

Recommend you look at Example 1 for Least Squares Linear Approximation and Example 1 for Least Squares Quadratic Approximation

The following measured data is recorded:

2.5 20.6947
3.1 28.5623
8.1 157.002
12.2 334.634
13.5 406.5697
17.9 696.0331
21 945.1385

Use the method of least squares to find approximating polynomial functions of first (linear, line), second (quadratic, parabola) and third (cubic) degrees.

Find the standard deviation of the resulting function and the given data. (The least squares algorithm should yield a function should that minimizes this standard deviation.)

Use ANOVA to determine which approximation is best (not done).

Solution:

We plot the data points and see that there might be a linear, quadratic or cubic correlation.

# In the GeoGebra applet below we use the command FitPoly[list of points, degree of polynomial]. The slider is set to change the degree of the least squares polynomial. The quadratic and cubic "look" like the best fit. We will get to ANOVA soon to check this observation out with statistics. 
       

 

# So our goal is the coefficients of each polynomial. # This means we need the least squares algorithm. In SAGE, this means we import numpy and numpy.linalg in order to get the linear algebra function called lstsq. # The lstsq function solves a non-square matrix equation while minimizing the "standard deviation" of the "least square algorithm" (see bottom of this page). 
       

You must be very careful when applying formulas to the definition of n! In statistics, n = sample size = (number of points).  In mathematics, n=number of subintervals= (number of points - 1)

This is particularly important in computing the standard deviation. Here we will use the statistical definition! There are 7 pieces of data so n=7 where n is the number of points.

The number of unknown constants in our approximating function is one more than the degree of the polynomial approximation so m=2, 3 or 4.

We need to solve the (non-square) matrix equation AX=B using the least squares algorithm where we add the left columns of the matrix A as we increase the degree.

A is the nxm dimensional array $A=\left[ x^2, x, 1 \right]$

X is the 2x1 dimensional array $\left[ {\begin{array}{*{20}{c} }a \\ b \end{array} } \right] $ and

B is the nx1 dimension array $B=\left[ y \right]$

AX=B  

$ \left[ {\begin{array}{*{20}{c}}{x_0^3}&{x_0^2}&{x_0}&1\\{x_1^3}&{x_1^2}&{x_1}&1\\{x_2^3}&{x_2^2}&{x_2}&1\\{x_3^3}&{x_3^2}&{x_3}&1\\{x_4^3}&{x_4^2}&{x_4}&1\\{x_5^3}&{x_5^2}&{x_5}&1\\{x_6^3}&{x_6^2}&{x_6}&1 \end{array}} \right] \cdot \left[ {\begin{array}{*{20}{c} }a \\ b \end{array} } \right] = \left[ {\begin{array}{*{20}{c}}y_0\\y_1 \\y_2\\y_3\\y_4\\y_5\\y_6 \end{array}} \right] $

  
import numpy as np #This means we write "np" instead of "numpy" everywhere. import numpy.linalg 
       

The syntax for x1 and y1 are easy since they just need to be 1xn arrays. (We save x and y for variable names for plots.)

We point out that when calling elements of a 1xn array, we do NOT have to write x1[0,m], but simply x1[m]. 

x1=np.array([2.5, 3.1, 8.1, 12.2, 13.5, 17.9, 21]); x2=x1^2 x3=x1^3 y1=np.array([20.6947, 28.5623, 157.002, 334.634, 406.5697, 696.0331, 945.1385]); print x1 print x2 print x3 print y1 
       
[  2.5   3.1   8.1  12.2  13.5  17.9  21. ]
[   6.25    9.61   65.61  148.84  182.25  320.41  441.  ]
[   15.625    29.791   531.441  1815.848  2460.375  5735.339  9261.   ]
[  20.6947   28.5623  157.002   334.634   406.5697  696.0331  945.1385]
[  2.5   3.1   8.1  12.2  13.5  17.9  21. ]
[   6.25    9.61   65.61  148.84  182.25  320.41  441.  ]
[   15.625    29.791   531.441  1815.848  2460.375  5735.339  9261.   ]
[  20.6947   28.5623  157.002   334.634   406.5697  696.0331  945.1385]

The syntax for the array A is VERY important because both of its dimensions are bigger than 1. So it must be written properly as an nxm array

We point out that when calling elements of such an array, we must name both the row and the column, i.e. A[3,2] gives the element in the 4th row, 3rd column. 

I am certain there is a SAGE way to append the matrices, but this was easy enough. 

n=len(x1) A1=np.array([[x1[j], 1] for j in range(n)]) A2=np.array([[x2[j], x1[j], 1] for j in range(n)]) A3=np.array([[x3[j], x2[j], x1[j], 1] for j in range(n)]) print A1,A2,A3 
       
[[  2.5   1. ]
 [  3.1   1. ]
 [  8.1   1. ]
 [ 12.2   1. ]
 [ 13.5   1. ]
 [ 17.9   1. ]
 [ 21.    1. ]] [[   6.25    2.5     1.  ]
 [   9.61    3.1     1.  ]
 [  65.61    8.1     1.  ]
 [ 148.84   12.2     1.  ]
 [ 182.25   13.5     1.  ]
 [ 320.41   17.9     1.  ]
 [ 441.     21.      1.  ]] [[  1.56250000e+01   6.25000000e+00  
2.50000000e+00   1.00000000e+00]
 [  2.97910000e+01   9.61000000e+00   3.10000000e+00   1.00000000e+00]
 [  5.31441000e+02   6.56100000e+01   8.10000000e+00   1.00000000e+00]
 [  1.81584800e+03   1.48840000e+02   1.22000000e+01   1.00000000e+00]
 [  2.46037500e+03   1.82250000e+02   1.35000000e+01   1.00000000e+00]
 [  5.73533900e+03   3.20410000e+02   1.79000000e+01   1.00000000e+00]
 [  9.26100000e+03   4.41000000e+02   2.10000000e+01   1.00000000e+00]]
[[  2.5   1. ]
 [  3.1   1. ]
 [  8.1   1. ]
 [ 12.2   1. ]
 [ 13.5   1. ]
 [ 17.9   1. ]
 [ 21.    1. ]] [[   6.25    2.5     1.  ]
 [   9.61    3.1     1.  ]
 [  65.61    8.1     1.  ]
 [ 148.84   12.2     1.  ]
 [ 182.25   13.5     1.  ]
 [ 320.41   17.9     1.  ]
 [ 441.     21.      1.  ]] [[  1.56250000e+01   6.25000000e+00   2.50000000e+00   1.00000000e+00]
 [  2.97910000e+01   9.61000000e+00   3.10000000e+00   1.00000000e+00]
 [  5.31441000e+02   6.56100000e+01   8.10000000e+00   1.00000000e+00]
 [  1.81584800e+03   1.48840000e+02   1.22000000e+01   1.00000000e+00]
 [  2.46037500e+03   1.82250000e+02   1.35000000e+01   1.00000000e+00]
 [  5.73533900e+03   3.20410000e+02   1.79000000e+01   1.00000000e+00]
 [  9.26100000e+03   4.41000000e+02   2.10000000e+01   1.00000000e+00]]

B is an nx1 array, but here we can be lazy and use a 1xn array since SAGE knows what to do with 1-dimensional arrays. The nice thing about doing this is that lstsq then yields a 1x2 array.  

B=y1 print B 
       
[  20.6947   28.5623  157.002   334.634   406.5697  696.0331  945.1385]
[  20.6947   28.5623  157.002   334.634   406.5697  696.0331  945.1385]

Now we want to solve the (non-square) matrix equation AX=B using least squares. In MatLab we could just write X=A\B.

Here we use the numpy linalg command lstsq. Reference (Remember: X is just the constants $a$ and $b$.)

# In Sage, an array with more than one row require that you "name" the row when you call it so if we had written B as an nx1 array, then X would also be an nx1 array and then to get a and b, we would need to write a=X[0,0] and b=X[1,0]. Compare this below where by being lazy and writing B as a 1xn and getting X as a 1x2, we don't have to bother to write the "row" value of X. X=np.linalg.lstsq(A1,B)[0] a1=X[0] b1=X[1] print "linear coefficients are:", a1,b1 X=np.linalg.lstsq(A2,B)[0] a2=X[0] b2=X[1] c2=X[2] print "quadratic coefficients are:", a2,b2,c2 X=np.linalg.lstsq(A3,B)[0] a3=X[0] b3=X[1] c3=X[2] d3=X[3] print "cubic coefficients are:", a3,b3,c3,d3 
       
linear coefficients are: 48.0812660166 -168.018404157
quadratic coefficients are: 1.98988576191 3.24320418396 -0.19639650125
cubic coefficients are: -0.000960253203791 2.02376450367 2.91437781809
0.522881522816
linear coefficients are: 48.0812660166 -168.018404157
quadratic coefficients are: 1.98988576191 3.24320418396 -0.19639650125
cubic coefficients are: -0.000960253203791 2.02376450367 2.91437781809 0.522881522816
var('x') fl(x)=a1*x+b1 fq(x)=a2*x^2+b2*x+c2 fc(x)=a3*x^3+b3*x^2+c3*x+d3 
       
P=list_plot(zip(x1,y1), color='orange', size=40) L=plot(fl(x),(x,0,25), thickness=1,color='red') Q=plot(fq(x),(x,0,25), thickness=1,color='green') C=plot(fc(x),(x,0,25), thickness=1,color='blue') show(P+L+Q+C, figsize=4) 
       

                                
                            

                                
#Where is green (parabola)? show(P+Q, figsize=4) 
       

                                
                            

                                

Now we want to calculate the "standard deviation" σ.

To do this, we calculate the sum S of the squares of the residuals, that is the difference between the value of the approximating function and the measured value for each value of x.

This means we need an array y2 with the values of the approximating function and the measured value for each value of x.

We then calculate $S=\sum (y2-y1)^2 $.

We then calculate the standard deviation using the statistical definition of n=number of points, m=number of constants: $\sigma=\large{\sqrt{\frac{S}{n-m}}}$.

y2=a2*x1*x1+a1*x1+a0 print y2 
       
[ 3.13136966  3.53107511  3.8269514   4.01899853  4.0916053 ]
[ 3.13136966  3.53107511  3.8269514   4.01899853  4.0916053 ]
S=sum((y2-y1)^2) print S 
       
0.00689248895434
0.00689248895434
m=3 sigma=sqrt(S/(n-m)) print sigma 
       
0.0587047227842
0.0587047227842