
The least squares method. Where is the least squares method used?

Example.

Experimental data on the values of the variables x and y are given in the table.

As a result of aligning these data, a second function was obtained.

Using the least squares method, approximate these data by the linear dependence y = ax + b (find the parameters a and b). Find out which of the two lines better (in the sense of the least squares method) fits the experimental data. Make a drawing.

The essence of the least squares method (LSM).

The task is to find the coefficients of the linear dependence at which the function of two variables a and b takes the smallest value. That is, for these a and b, the sum of squared deviations of the experimental data from the found straight line will be the smallest. This is the whole point of the least squares method.

Thus, solving the example comes down to finding the extremum of a function of two variables.
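In standard notation, the function in question is

F(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2 ,

the sum of squared deviations of the experimental points from the straight line.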

Deriving formulas for finding coefficients.

A system of two equations in two unknowns is compiled and solved. We find the partial derivatives of the function with respect to the variables a and b and equate these derivatives to zero.

We solve the resulting system of equations by any method (for example, by substitution or by Cramer's method) and obtain the formulas for finding the coefficients by the least squares method (LSM).

For these a and b, the function takes its smallest value. The proof of this fact is given below, at the end of the page.

That is the whole least squares method. The formula for the parameter a contains the sums Σx_i, Σy_i, Σx_i·y_i, Σx_i² and the quantity n, the number of experimental data points. We recommend calculating the values of these sums separately. The coefficient b is found after calculating a.
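In standard notation, the resulting formulas read:

a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}, \qquad b = \frac{\sum y_i - a \sum x_i}{n} .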

It's time to remember the original example.

Solution.

In our example, n = 5. We fill in the table for the convenience of calculating the sums that enter the formulas for the required coefficients.

The values in the fourth row of the table are obtained by multiplying the values of the 2nd row by the values of the 3rd row, for each number i.

The values in the fifth row of the table are obtained by squaring the values of the 2nd row, for each number i.

The values in the last column of the table are the sums of the values across the rows.

We use the formulas of the least squares method to find the coefficients a and b, substituting into them the corresponding values from the last column of the table:

Hence, y = 0.165x + 2.184 is the desired approximating straight line.

It remains to find out which of the two lines, y = 0.165x + 2.184 or the function obtained earlier by alignment, better approximates the original data, that is, to make the estimate in the sense of the least squares method.

Error estimation of the least squares method.

To do this, we calculate the sum of squared deviations of the original data from each of these lines; the smaller value corresponds to the line that better approximates the original data in the sense of the least squares method.

Since the sum for y = 0.165x + 2.184 turned out to be smaller, this straight line better approximates the original data.
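For readers who want to automate this comparison, here is a minimal Python sketch. The five data points and the rival function below are placeholders; only the fitted line y = 0.165x + 2.184 comes from the text above.

def sum_sq_dev(points, f):
    return sum((y - f(x)) ** 2 for x, y in points)

points = [(1, 2.3), (2, 2.6), (3, 2.7), (4, 2.8), (5, 3.0)]  # placeholder data, n = 5
line  = lambda x: 0.165 * x + 2.184   # the straight line found above
rival = lambda x: 0.17 * x + 2.16     # hypothetical stand-in for the alignment function

s1, s2 = sum_sq_dev(points, line), sum_sq_dev(points, rival)
print("line wins" if s1 < s2 else "rival wins", s1, s2)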

Graphic illustration of the least squares (LS) method.

Everything is clearly visible on the graph: the red line is the found straight line y = 0.165x + 2.184, the blue line is the alignment function, and the pink dots are the original data.

In practice, when modeling various processes - in particular, economic, physical, technical, and social ones - one or another method of calculating approximate values of functions from their known values at certain fixed points is widely used.

This kind of function approximation problem often arises:

    when constructing approximate formulas for calculating the values of characteristic quantities of the process under study from tabular data obtained experimentally;

    in numerical integration, differentiation, solving differential equations, etc.;

    when it is necessary to calculate the values of functions at intermediate points of the interval under consideration;

    when determining the values of characteristic quantities of the process outside the interval under consideration, in particular when forecasting.

If, to model a certain process specified by a table, we construct a function that approximately describes this process on the basis of the least squares method, it is called an approximating function (a regression), and the problem of constructing approximating functions is called an approximation problem.

This article discusses the capabilities of the MS Excel package for solving this type of problem; in addition, it provides methods and techniques for constructing (creating) regressions for tabulated functions (which is the basis of regression analysis).

Excel has two options for building regressions.

    Adding selected regressions (trend lines) to a chart built from the data table for the process characteristic under study (available only if a chart has been constructed);

    Using the built-in statistical functions of the Excel worksheet, which allow you to obtain regressions (trend lines) directly from the source data table.

Adding trend lines to a chart

For a data table that describes a process and is represented by a chart, Excel has an effective regression analysis tool that allows you to:

    build, on the basis of the least squares method, any of five types of regressions and add them to the chart, modeling the process under study with varying degrees of accuracy;

    add the equation of the constructed regression to the chart;

    determine how well the selected regression matches the data displayed on the chart.

Based on chart data, Excel allows you to obtain linear, polynomial, logarithmic, power, and exponential regressions, which are specified by an equation of the form:

y = y(x)

where x is the independent variable, which often takes the values of a sequence of natural numbers (1; 2; 3; ...) and counts off, for example, the time periods of the process under study (its characteristic).

1. Linear regression is good for modeling characteristics whose values increase or decrease at a constant rate. It is the simplest model of the process under study to construct. It is built in accordance with the equation:

y = mx + b

where m is the tangent of the angle of inclination of the linear regression to the abscissa axis, and b is the ordinate of the point where the linear regression crosses the ordinate axis.

2. A polynomial trend line is useful for describing characteristics that have several distinct extrema (maxima and minima). The choice of the polynomial degree is determined by the number of extrema of the characteristic under study. Thus, a second-degree polynomial describes well a process with only one maximum or minimum; a third-degree polynomial - no more than two extrema; a fourth-degree polynomial - no more than three extrema, and so on.

In this case, the trend line is constructed in accordance with the equation:

y = c0 + c1x + c2x^2 + c3x^3 + c4x^4 + c5x^5 + c6x^6

where the coefficients c0, c1, c2, ..., c6 are constants whose values are determined during construction.
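Outside Excel, the same least-squares polynomial fit can be sketched in a few lines of Python; the data below is illustrative, not taken from the article:

import numpy as np

x = np.arange(1.0, 9.0)                                  # illustrative x values 1..8
y = np.array([2.0, 3.4, 3.1, 2.6, 3.0, 4.2, 5.9, 8.1])   # illustrative, one max and one min

for degree in (2, 3, 4):
    coeffs = np.polyfit(x, y, degree)    # least-squares coefficients, highest power first
    rss = float(((y - np.polyval(coeffs, x)) ** 2).sum())
    print(degree, round(rss, 4))         # higher degree -> smaller residual sum of squares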

3. The logarithmic trend line is successfully used to model characteristics whose values at first change rapidly and then gradually stabilize. It is built in accordance with the equation:

y = c ln(x) + b

4. A power trend line gives good results if the values of the dependence under study are characterized by a constant change in the growth rate. An example of such a dependence is the graph of the uniformly accelerated motion of a car. If the data contain zero or negative values, a power trend line cannot be used.

Constructed in accordance with the equation:

y = c·x^b

where the coefficients b and c are constants.

5. An exponential trend line should be used when the rate of change of the data increases continuously. This type of approximation is likewise inapplicable to data containing zero or negative values.

Constructed in accordance with the equation:

y = c·e^(bx)

where the coefficients b and c are constants.

When a trend line is selected, Excel automatically calculates the value of R2, which characterizes the reliability of the approximation: the closer the value of R2 is to unity, the more reliably the trend line approximates the process under study. If necessary, the R2 value can always be displayed on the chart.

It is determined by the formula:
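For the linear case this is the standard coefficient of determination:

R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} ,

where \hat{y}_i are the values given by the trend line and \bar{y} is the mean of the original data.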

To add a trend line to a data series:

    activate the chart built from the data series, i.e., click within the chart area - the Chart item will appear in the main menu;

    after clicking on this item, a menu will appear in which you should select the Add Trendline command.

The same can be done by moving the mouse pointer over the graph of one of the data series and right-clicking; in the context menu that appears, select the Add Trendline command. The Trendline dialog box will appear on the screen with the Type tab open (Fig. 1).

After this you need:

On the Type tab, select the required trend line type (Linear is selected by default). For the Polynomial type, specify the degree of the selected polynomial in the Degree field.

The Built on series field lists all the data series of the chart in question. To add a trend line to a specific data series, select its name in this field.

If necessary, on the Parameters tab (Fig. 2) you can set the following parameters for the trend line:

    change the name of the trend line in the Name of the approximating (smoothed) curve field;

    set the number of forecast periods (forward or backward) in the Forecast field;

    display the equation of the trend line in the chart area, for which you should enable the show equation on the chart checkbox;

    display the approximation reliability value R2 in the chart area, for which you should enable the Place the approximation reliability value (R^2) on the chart checkbox;

    set the point where the trend line crosses the Y axis, for which you should enable the crossing of the curve with the Y axis at a point checkbox;

    click the OK button to close the dialog box.

There are three ways to start editing an already constructed trend line:

    use the Selected Trend Line command from the Format menu, having first selected the trend line;

    select the Format Trend Line command from the context menu, which is called up by right-clicking on the trend line;

    double-click on the trend line.

The Format Trend Line dialog box will appear on the screen (Fig. 3), containing three tabs: View, Type, and Parameters; the contents of the last two completely coincide with the similar tabs of the Trend Line dialog box (Fig. 1-2). On the View tab, you can set the line type, its color, and its thickness.

To delete a trend line that has already been constructed, select it and press the Delete key.

The advantages of the considered regression analysis tool are:

    the relative ease of constructing a trend line on a chart without creating a data table for it;

    a fairly wide list of proposed trend line types, which includes the most commonly used types of regression;

    the ability to predict the behavior of the process under study an arbitrary (within the limits of common sense) number of steps forward and also backward;

    the ability to obtain the trend line equation in analytical form;

    the possibility, if necessary, of obtaining an estimate of the reliability of the approximation.

The disadvantages include the following:

    a trend line can be constructed only if there is a chart built on a data series;

    the process of generating data series for the characteristic under study from the trend line equations obtained for it is somewhat cluttered: the regression equations are updated with each change in the values of the original data series, but only within the chart area, while a data series generated from the old trend line equation remains unchanged;

    in PivotChart reports, changing the view of the chart or the associated PivotTable report does not preserve existing trend lines, so before drawing trend lines or otherwise formatting a PivotChart report you should make sure that the report layout meets the requirements.

Trend lines can be used to supplement data series presented on charts such as graphs, histograms, flat non-stacked area charts, bar charts, scatter charts, bubble charts, and stock charts.

You cannot add trend lines to data series in 3-D, stacked, radar, pie, or doughnut charts.

Using Excel's built-in functions

Excel also has a regression analysis tool for constructing trend lines outside the chart area. A number of statistical worksheet functions can be used for this purpose, but all of them allow you to build only linear or exponential regressions.

Excel has several functions for constructing linear regression, in particular:

    TREND;

    LINEST;

    SLOPE and INTERCEPT;

as well as several functions for constructing an exponential trend line, in particular:

    GROWTH;

    LGRFPRIBL (the Russian-localized name of the LOGEST function).

It should be noted that the techniques for constructing regressions with the TREND and GROWTH functions are almost the same, and the same can be said of the LINEST and LGRFPRIBL pair. For these four functions, creating a table of values involves Excel features such as array formulas, which somewhat clutters the process of building regressions. Note also that linear regression is, in our opinion, most easily constructed using the SLOPE and INTERCEPT functions, where the first determines the slope of the linear regression and the second the segment it intercepts on the y-axis.
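What SLOPE and INTERCEPT return can be reproduced directly from their textbook definitions; a small Python sketch with illustrative data:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # illustrative data
y = np.array([3.1, 4.2, 4.8, 6.1, 6.9, 8.2])

m = np.cov(x, y, bias=True)[0, 1] / np.var(x)  # what SLOPE returns: cov(x, y) / var(x)
b = y.mean() - m * x.mean()                    # what INTERCEPT returns
print(m, b)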

The advantages of the built-in functions tool for regression analysis are:

    a fairly simple, uniform process of generating data series for the characteristic under study with any of the built-in statistical functions that define trend lines;

    a standard procedure for constructing trend lines from the generated data series;

    the ability to predict the behavior of the process under study the required number of steps forward or backward.

The disadvantages include the fact that Excel has no built-in functions for creating trend lines of types other than linear and exponential. This circumstance often prevents choosing a sufficiently accurate model of the process under study, as well as obtaining forecasts close to reality. In addition, when the TREND and GROWTH functions are used, the equations of the trend lines are not known.

It should be noted that the authors did not set out to present a course of regression analysis with any degree of completeness. Their main task is to show, using specific examples, the capabilities of the Excel package for solving approximation problems; to demonstrate what effective tools Excel has for building regressions and forecasting; and to illustrate how such problems can be solved relatively easily even by a user without extensive knowledge of regression analysis.

Examples of solving specific problems

Let's look at solving specific problems using the listed Excel tools.

Problem 1

Given a table of data on the profit of a motor transport enterprise for 1995-2002, you need to do the following:

    Build a chart.

    Add linear and polynomial (quadratic and cubic) trend lines to the chart.

    Using the trend line equations, obtain tabulated values of the enterprise's profit for each trend line for 1995-2004.

    Make a forecast for the enterprise's profit for 2003 and 2004.

The solution of the problem

    In the range of cells A4:C11 of the Excel worksheet, enter the worksheet shown in Fig. 4.

    Having selected the range of cells B4:C11, we build a chart.

    We activate the constructed chart and, following the method described above, after selecting the trend line type in the Trend Line dialog box (see Fig. 1), alternately add linear, quadratic, and cubic trend lines to the chart. In the same dialog box, open the Parameters tab (see Fig. 2), enter the name of the trend being added in the Name of the approximating (smoothed) curve field, and set the value 2 in the Forecast forward for: periods field, since a profit forecast two years ahead is planned. To display the regression equation and the approximation reliability value R2 in the chart area, enable the show equation on the chart checkbox and the place the approximation reliability value (R^2) on the chart checkbox. For better visual perception, we change the type, color, and thickness of the constructed trend lines, using the View tab of the Format Trend Line dialog box (see Fig. 3). The resulting chart with the added trend lines is shown in Fig. 5.

    To obtain tabulated profit values for each trend line for 1995-2004, we use the trend line equations presented in Fig. 5. To do this, in the cells of the range D3:F3 we enter text describing the type of the selected trend line: Linear trend, Quadratic trend, Cubic trend. Next, enter the linear regression formula in cell D4 and, using the fill handle, copy it with relative references into the range D5:D13. Note that each cell with the linear regression formula in the range D4:D13 takes as its argument the corresponding cell of the range A4:A13. Similarly, for the quadratic regression we fill the range E4:E13, and for the cubic regression the range F4:F13. Thus, a forecast of the enterprise's profit for 2003 and 2004 has been made using three trends; the resulting table of values is shown in Fig. 6 (a programmatic version of these steps is sketched below).
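A sketch of the same fit-and-forecast procedure in Python, with placeholder profit figures (the article's table is shown only in Fig. 4):

import numpy as np

t = np.arange(1, 9)            # periods 1..8 stand for the years 1995..2002
profit = np.array([10.5, 11.2, 13.0, 12.4, 14.1, 15.3, 15.0, 16.8])  # placeholder figures

t_all = np.arange(1, 11)       # periods 9 and 10 give the 2003 and 2004 forecast
for deg, name in [(1, "linear"), (2, "quadratic"), (3, "cubic")]:
    c = np.polyfit(t, profit, deg)       # least-squares trend coefficients
    table = np.polyval(c, t_all)         # fitted values plus the two-step forecast
    print(name, np.round(table, 2))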

Problem 2

With the table of data on the profit of the motor transport enterprise given in Problem 1, perform the following steps.

    Build a chart.

    Add logarithmic, power and exponential trend lines to the chart.

    Output the equations of the obtained trend lines, as well as the approximation reliability values R2 for each of them.

    Using the trend line equations, obtain tabular data on the enterprise's profit for each trend line for 1995-2002.

    Make a forecast of the company's profit for 2003 and 2004 using these trend lines.

The solution of the problem

Following the methodology given in the solution of Problem 1, we obtain a chart with logarithmic, power, and exponential trend lines added to it (Fig. 7). Then, using the obtained trend line equations, we fill in a table of the enterprise's profit values, including the predicted values for 2003 and 2004 (Fig. 8).

Comparing the charts (Fig. 5 and Fig. 7), we can see that the lowest approximation reliability corresponds to the model with the logarithmic trend:

R2 = 0.8659

The highest values of R2 correspond to the models with a polynomial trend: quadratic (R2 = 0.9263) and cubic (R2 = 0.933).

Problem 3

With the table of data on the profit of the motor transport enterprise for 1995-2002 given in Problem 1, you must perform the following steps.

    Obtain data series for the linear and exponential trend lines using the TREND and GROWTH functions.

    Using the TREND and GROWTH functions, make a forecast of the enterprise’s profit for 2003 and 2004.

    Construct a chart for the original data and the resulting data series.

The solution of the problem

Let's use the worksheet from Problem 1 (see Fig. 4). We start with the TREND function:

    select the range of cells D4:D11, which is to be filled with the values of the TREND function corresponding to the known profit data of the enterprise;

    call the Function command from the Insert menu. In the Function Wizard dialog box that appears, select the TREND function from the Statistical category and click OK. The same can be done by clicking the (Insert Function) button on the standard toolbar;

    in the Function Arguments dialog box that appears, enter the range of cells C4:C11 in the Known_values_y field and the range of cells B4:B11 in the Known_values_x field;

    to make the entered formula an array formula, press the key combination Ctrl + Shift + Enter.

The entered formula will appear in the formula bar as: {=TREND(C4:C11,B4:B11)}.

As a result, the range of cells D4:D11 is filled with the corresponding values of the TREND function (Fig. 9).

To make a forecast of the enterprise's profit for 2003 and 2004, you must:

    select the range of cells D12:D13, where the values predicted by the TREND function will be placed;

    call the TREND function and, in the Function Arguments dialog box that appears, enter the range of cells C4:C11 in the Known_values_y field, the range B4:B11 in the Known_values_x field, and the range B12:B13 in the New_values_x field;

    turn this formula into an array formula using the key combination Ctrl + Shift + Enter;

    the entered formula will look like {=TREND(C4:C11,B4:B11,B12:B13)}, and the range of cells D12:D13 will be filled with the predicted values of the TREND function (see Fig. 9).

The data series is filled in similarly using the GROWTH function, which is used in the analysis of nonlinear dependences and works in exactly the same way as its linear counterpart TREND.

Figure 10 shows the table in formula display mode.

For the initial data and the obtained data series, the chart shown in Fig. 11 is constructed.

Problem 4

Given a table of data on the receipt of service requests by the dispatch service of a motor transport enterprise for the period from the 1st to the 11th of the current month, the following actions must be performed.

    Obtain data series for linear regression: using the SLOPE and INTERCEPT functions, and using the LINEST function.

    Obtain a series of data for exponential regression using the LGRFPRIBL function.

    Using the above functions, make a forecast of the receipt of requests by the dispatch service for the period from the 12th to the 14th of the current month.

    Construct a chart for the original and obtained data series.

The solution of the problem

Note that, unlike the TREND and GROWTH functions, none of the functions listed above (SLOPE, INTERCEPT, LINEST, LGRFPRIBL) returns a regression data series. These functions play only a supporting role, determining the necessary regression parameters.

For the linear and exponential regressions built with the SLOPE, INTERCEPT, LINEST, and LGRFPRIBL functions, the form of their equations is always known, in contrast to the linear and exponential regressions corresponding to the TREND and GROWTH functions.

1. Let us build a linear regression with the equation:

y = mx+b

using the SLOPE and INTERCEPT functions, with the regression slope m determined by the SLOPE function, and the free term b by the INTERCEPT function.

To do this, we carry out the following actions:

    enter the original table into the cell range A4:B14;

    the value of the parameter m will be determined in cell C19. Select the SLOPE function from the Statistical category; enter the range of cells B4:B14 in the known_values_y field and the range A4:A14 in the known_values_x field. Cell C19 will contain the formula =SLOPE(B4:B14,A4:A14);

    using a similar technique, we determine the value of the parameter b in cell D19, whose contents will look like =INTERCEPT(B4:B14,A4:A14). Thus, the values of the parameters m and b needed to construct the linear regression are stored in cells C19 and D19, respectively;

    next, enter the linear regression formula in cell C4 in the form =$C$19*A4+$D$19. In this formula, cells C19 and D19 are written with absolute references (the cell address must not change when the formula is copied). The absolute reference sign $ can be typed from the keyboard or with the F4 key after placing the cursor on the cell address. Using the fill handle, copy this formula into the range of cells C4:C17 and obtain the required data series (Fig. 12). Because the number of requests is an integer, set a number format with 0 decimal places on the Number tab of the Format Cells window.

2. Now let's build a linear regression given by the equation:

y = mx+b

using the LINEST function.

For this:

    enter the LINEST function into the range of cells C20:D20 as an array formula: {=LINEST(B4:B14,A4:A14)}. As a result, we obtain the value of the parameter m in cell C20 and the value of the parameter b in cell D20;

    enter the formula =$C$20*A4+$D$20 in cell D4;

    copy this formula, using the fill handle, into the range of cells D4:D17 and obtain the desired data series.

3. We build an exponential regression with the equation

y = b·m^x

(the form fitted by the LGRFPRIBL function) in a similar way:

    in the range of cells C21:D21 we enter the LGRFPRIBL function as an array formula: {=LGRFPRIBL(B4:B14,A4:A14)}. The value of the parameter m will then be determined in cell C21, and the value of the parameter b in cell D21;

    the formula =$D$21*$C$21^A4 is entered into cell E4;

    using the fill handle, this formula is copied into the range of cells E4:E17, where the data series for the exponential regression will be located (see Fig. 12); a programmatic equivalent is sketched below.
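What LGRFPRIBL (LOGEST) computes is, in essence, an ordinary least-squares fit on the logarithm of y; a Python sketch with illustrative data:

import numpy as np

x = np.arange(1, 12)                     # days 1..11 (illustrative)
y = np.array([3, 4, 4, 5, 7, 8, 9, 12, 14, 17, 21], dtype=float)  # illustrative request counts

slope, intercept = np.polyfit(x, np.log(y), 1)   # ordinary least squares in log space
m, b = np.exp(slope), np.exp(intercept)          # back-transform to y = b * m**x
print(m, b)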

Fig. 13 shows the table in formula display mode, where the functions we use can be seen together with the required cell ranges and formulas.

The quantity R2 is called the coefficient of determination.

The task of constructing the regression dependence is to find the vector of coefficients m of model (1) at which the coefficient R takes its maximum value.

To assess the significance of R, Fisher's F test is used, calculated by the formula:
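In its usual form, for a model with an intercept:

F = \frac{R^2 / (k - 1)}{(1 - R^2) / (n - k)} ,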

where n is the sample size (the number of experiments);

k is the number of model coefficients.

If F exceeds some critical value for the given n and k and the accepted confidence probability, then the value of R is considered significant. Tables of critical values of F are given in reference books on mathematical statistics.

Thus, the significance of R is determined not only by its value but also by the ratio between the number of experiments and the number of coefficients (parameters) of the model. Indeed, the correlation ratio for n = 2 for a simple linear model equals 1 (a single straight line can always be drawn through 2 points on a plane). However, if the experimental data are random variables, such a value of R should be trusted with great caution. Usually, to obtain a significant R and a reliable regression, one tries to ensure that the number of experiments significantly exceeds the number of model coefficients (n > k).

To build a linear regression model you need:

1) prepare a list of n rows and m columns containing the experimental data (the column containing the output value Y must be either first or last in the list); for example, let's take the data of the previous problem, adding a column named "Period No." and numbering the periods from 1 to 12 (these will be the X values);

2) go to the menu Data/Data Analysis/Regression

If the "Data Analysis" item in the "Tools" menu is missing, then you should go to the "Add-Ins" item in the same menu and check the "Analysis package" checkbox.

3) in the "Regression" dialog box, set:

· input interval Y;

· input interval X;

· output interval - the upper-left cell of the range where the calculation results will be placed (it is recommended to place them on a new worksheet);

4) click "Ok" and analyze the results.

The least squares method finds the widest application in various fields of science and practice: physics, chemistry, biology, economics, sociology, psychology, and so on and so forth. By the will of fate, I often have to deal with economics, and therefore today I will arrange for you a trip to an amazing country called Econometrics =) ...How can you not want that?! It's very good there - you just need to make up your mind! ...But what you probably definitely want is to learn how to solve problems by the least squares method. And especially diligent readers will learn to solve them not only accurately but also VERY QUICKLY ;-) But first, the general statement of the problem plus an accompanying example:

Suppose that in a certain subject area we study indicators that have a quantitative expression. At the same time, there is every reason to believe that the indicator y depends on the indicator x. This assumption can be either a scientific hypothesis or based on basic common sense. Let's leave science aside, however, and explore more appetizing areas - namely, grocery stores. Let us denote by:

x – the retail area of a grocery store, sq. m.,
y – the annual turnover of a grocery store, million rubles.

It is absolutely clear that the larger the store area, the greater in most cases its turnover will be.

Suppose that after carrying out observations/experiments/calculations/dances with a tambourine we have numerical data at our disposal:

With grocery stores, I think everything is clear: x1 is the area of the 1st store, y1 its annual turnover, x2 the area of the 2nd store, y2 its annual turnover, and so on. By the way, it is not at all necessary to have access to classified materials - a fairly accurate estimate of trade turnover can be obtained by means of mathematical statistics. However, let's not get distracted - the commercial espionage course is paid separately =)

The tabulated data can also be written as points and depicted in the familiar Cartesian coordinate system.

Let us answer an important question: how many points are needed for a qualitative study?

The bigger, the better. The minimum acceptable set consists of 5-6 points. In addition, when the amount of data is small, "anomalous" results must not be included in the sample. So, for example, a small elite store may earn orders of magnitude more than "its colleagues", thereby distorting the general pattern, which is exactly what you need to find!

To put it very simply, we need to select a function whose graph passes as close as possible to the points. Such a function is called approximating ("approximation" means "bringing closer") or theoretical. Generally speaking, an obvious "contender" immediately appears here - a polynomial of high degree whose graph passes through ALL the points. But this option is complicated and often simply incorrect (the graph will "loop" all the time and poorly reflect the main trend).

Thus, the sought function must be quite simple and at the same time reflect the dependence adequately. As you might guess, one of the methods for finding such functions is called the least squares method. First, let us look at its essence in general terms. Let some function approximate the experimental data:


How do we evaluate the accuracy of this approximation? Let us calculate the differences (deviations) between the experimental and the functional values (study the drawing). The first thought that comes to mind is to estimate how large their sum is, but the problem is that the differences can be negative, and in such a summation the deviations would cancel each other out. Therefore, as an estimate of the accuracy of the approximation, it suggests itself to take the sum of the moduli of the deviations:

or, in collapsed form (in case anyone doesn't know: Σ is the summation sign, and i is an auxiliary "counter" variable that takes values from 1 to n).

Approximating the experimental points with different functions, we obtain different values of this sum, and obviously, where the sum is smaller, that function is more accurate.

Such a method exists, and it is called the least moduli method. However, in practice the least squares method has become much more widespread; in it, possible negative values are eliminated not by the modulus but by squaring the deviations:

after which efforts are aimed at selecting a function such that the sum of squared deviations is as small as possible. This, in fact, is where the name of the method comes from.
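In symbols, the two criteria just described are

\sum_{i=1}^{n} \left| y_i - f(x_i) \right| \quad \text{(least moduli)} \qquad \text{and} \qquad \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 \quad \text{(least squares).}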

And now we return to another important point: as noted above, the selected function should be quite simple - but there are also many such functions: linear, hyperbolic, exponential, logarithmic, quadratic, etc. And, of course, here one would immediately like to "narrow the field of activity". Which class of functions should be chosen for the study? A primitive but effective technique:

– The easiest way is to plot the points on the drawing and analyze their arrangement. If they tend to lie along a straight line, then you should look for the equation of a line y = ax + b with optimal values of a and b. In other words, the task is to find SUCH coefficients that the sum of squared deviations is the smallest.

If the points are located, for example, along a hyperbola, then it is obviously clear that a linear function will give a poor approximation. In that case, we look for the most "favorable" coefficients of the hyperbola equation - those that give the minimum sum of squares.

Now note that in both cases we are talking about a function of two variables, whose arguments are the sought dependence parameters:

And essentially we need to solve a standard problem - to find the minimum of a function of two variables.

Let us recall our example: suppose that the "store" points tend to lie along a straight line and there is every reason to believe a linear dependence of turnover on retail area. Let us find SUCH coefficients "a" and "b" that the sum of squared deviations is the smallest. Everything is as usual - first the 1st-order partial derivatives. By the linearity rule, we can differentiate directly under the summation sign:

If you want to use this information for an essay or a term paper, I will be very grateful for a link in the list of sources - you will find such detailed calculations in few places:

Let's create a standard system:

We divide each equation by two and, in addition, "break up" the sums:

Note: analyze for yourself why "a" and "b" can be taken outside the summation sign; by the way, formally the same can be done with the sum of b over all i, which equals nb.

Let's rewrite the system in “applied” form:

after which the algorithm for solving our problem begins to emerge:

Do we know the coordinates of the points? We do. Can we find the sums? Easily. We compose the simplest system of two linear equations in two unknowns ("a" and "b"). We solve the system, for example, by Cramer's method, obtaining a stationary point. Checking the sufficient condition for an extremum, we can verify that at this point the function reaches precisely a minimum. The check involves additional calculations, so we will leave it behind the scenes (if necessary, the missing frame can be viewed). We draw the final conclusion:

The function found in this way approximates the experimental points best of all (at least compared to any other linear function). Roughly speaking, its graph passes as close as possible to these points. In the tradition of econometrics, the resulting approximating function is also called the paired linear regression equation.

The problem under consideration is of great practical importance. In our example situation, the equation allows you to predict what turnover (y) the store will have at one or another value of the selling area (x). Yes, the resulting forecast will be only a forecast, but in many cases it will turn out to be quite accurate.
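Here is a compact Python sketch of the whole algorithm described above: assemble the sums, solve the normal equations by Cramer's rule, and use the fitted line for a prediction. The store data is made up for illustration.

def fit_line(points):
    n = len(points)
    sx  = sum(x for x, _ in points)
    sy  = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    # normal equations:  a*sxx + b*sx = sxy,   a*sx + b*n = sy
    det = sxx * n - sx * sx           # main determinant; nonzero if the x's are not all equal
    a = (sxy * n - sy * sx) / det     # Cramer's rule
    b = (sxx * sy - sxy * sx) / det
    return a, b

stores = [(50, 12.0), (80, 19.5), (110, 23.0), (140, 31.5), (170, 36.0)]  # made-up (area, turnover)
a, b = fit_line(stores)
print(f"turnover ≈ {a:.3f} * area + {b:.3f}; forecast for 100 sq. m: {a * 100 + b:.1f}")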

I will analyze just one problem with "real" numbers, since there are no difficulties in it - all the calculations are at the level of the school curriculum for grades 7-8. In 95 percent of cases, you will be asked to find just a linear function, but at the very end of the article I will show that it is no more difficult to find the equations of the optimal hyperbola, exponential, and some other functions.

In fact, all that remains is to hand out the promised goodies - so that you learn to solve such examples not only accurately but also quickly. We carefully study the standard:

Task

As a result of studying the relationship between two indicators, the following pairs of numbers were obtained:

Using the least squares method, find the linear function that best approximates the empirical (experimental) data. Make a drawing on which to plot the experimental points and the graph of the approximating function in a Cartesian rectangular coordinate system. Find the sum of squared deviations between the empirical and the theoretical values. Find out whether another function would fit the experimental points better (from the point of view of the least squares method).

Note that the x values here are natural numbers, and this has a characteristic meaningful interpretation, which I will talk about a little later; of course, they can also be fractional. In addition, depending on the content of a particular problem, both the x and the y values can be completely or partially negative. Well, we have been given a "faceless" problem, and we begin its solution:

We find the coefficients of the optimal function as the solution of the system:

For more compact notation, the "counter" variable can be omitted, since it is already clear that the summation runs from 1 to n.

It is more convenient to carry out the required calculations in tabular form:


The calculations can be carried out on a calculator, but it is much better to use Excel - both faster and without errors.

Thus, we get the following system:

Here you could multiply the second equation by 3 and subtract the 2nd from the 1st equation term by term. But this is luck - in practice, systems are often not a gift, and in such cases Cramer's method saves the day: the main determinant is nonzero, which means the system has a unique solution.

Let's check. I understand that you don't want to, but why skip errors where they can absolutely be avoided? Let us substitute the found solution into the left-hand side of each equation of the system:

The right-hand sides of the corresponding equations are obtained, which means that the system is solved correctly.

Thus, the desired approximating function is found: of all linear functions, it is the one that best approximates the experimental data.

Unlike the direct dependence of the store's turnover on its area, the found dependence is inverse (the principle "the more, the less"), and this fact is immediately revealed by the negative slope. The function tells us that when a certain indicator increases by 1 unit, the value of the dependent indicator decreases on average by 0.65 units. As they say, the higher the price of buckwheat, the less of it is sold.

To plot the graph of the approximating function, we find two of its values:

and make the drawing:


The constructed straight line is called a trend line (namely, a linear trend line; in the general case, a trend is not necessarily a straight line). Everyone is familiar with the expression "to be in trend", and I think this term needs no additional comment.

Let us calculate the sum of squared deviations between the empirical and the theoretical values. Geometrically, this is the sum of the squares of the lengths of the "raspberry-colored" segments (two of which are so small that they are not even visible).

Let's summarize the calculations in a table:


Again, the calculations can be done manually; just in case, I'll give an example for the 1st point:

but it is much more effective to do it in the already known way:

Let us repeat once more: what is the meaning of the result obtained? Of all linear functions, the found function has the smallest value of this indicator, that is, in its family it is the best approximation. And here, by the way, the final question of the problem is not accidental: what if the proposed exponential function fits the experimental points better?

Let us find the corresponding sum of squared deviations - to distinguish them, I will denote it by the letter "epsilon". The technique is exactly the same:


And again, just in case, calculations for the 1st point:

In Excel we use the standard function EXP (syntax can be found in Excel Help).

Conclusion: the "epsilon" sum turned out to be larger, which means that the exponential function approximates the experimental points worse than the straight line.

But here it should be noted that "worse" does not yet mean "wrong". Now I have built a graph of this exponential function - and it also passes close to the points, so much so that without the analytical study it would be hard to say which function is more accurate.

This concludes the solution, and I return to the question of natural values of the argument. In various studies, as a rule economic or sociological ones, natural x's are used to number months, years, or other equal time intervals. Consider, for example, the following problem.

The least squares method

The least squares method (LSM, or OLS, Ordinary Least Squares) is one of the basic methods of regression analysis for estimating the unknown parameters of regression models from sample data. The method is based on minimizing the sum of squares of the regression residuals.

It should be noted that the least squares method itself can be called a method for solving a problem in any area where the solution consists in, or satisfies, some criterion of minimizing the sum of squares of some functions of the unknown variables. Therefore, the least squares method can also be used for an approximate representation (approximation) of a given function by other (simpler) functions, when finding a set of quantities satisfying equations or constraints whose number exceeds the number of these quantities, and so on.

The essence of the least squares method

Let a (parametric) model of a probabilistic (regression) dependence between the (explained) variable y and a set of factors (explanatory variables) x be given,

where b is the vector of unknown model parameters,

and ε is the random model error.

Let there also be sample observations of the values of these variables. Let t be the observation number (t = 1, ..., n); then (y_t, x_t) are the values of the variables in the t-th observation. For given values of the parameters b, one can then calculate the theoretical (model) values of the explained variable y:

The size of the residuals depends on the values of the parameters b.

The essence of the (ordinary, classical) least squares method is to find the parameters b for which the sum of the squares of the residuals (RSS, Residual Sum of Squares) is minimal:
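In standard notation:

RSS(b) = \sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} \bigl( y_t - f(x_t, b) \bigr)^2 \;\to\; \min_b .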

In the general case, this problem can be solved by numerical optimization (minimization) methods. In this case, one speaks of nonlinear least squares (NLS or NLLS, Non-Linear Least Squares). In many cases, an analytical solution can be obtained. To solve the minimization problem, it is necessary to find the stationary points of the function by differentiating it with respect to the unknown parameters b, equating the derivatives to zero, and solving the resulting system of equations:

If the model's random errors are normally distributed, have the same variance, and are uncorrelated, the OLS parameter estimates coincide with the maximum likelihood (ML) estimates.

OLS in the case of a linear model

Let the regression dependence be linear:

Let y be the column vector of observations of the explained variable and X the matrix of factor observations (the rows of the matrix are the vectors of factor values in a given observation, the columns are the vector of values of a given factor in all observations). The matrix representation of the linear model is:

y = X b + \varepsilon

Then the vector of estimates of the explained variable and the vector of regression residuals will be:

\hat{y} = X b, \qquad e = y - \hat{y} = y - X b

Accordingly, the sum of squares of the regression residuals will be:

RSS(b) = e^T e = (y - X b)^T (y - X b)

Differentiating this function with respect to the parameter vector and equating the derivatives to zero, we obtain a system of equations (in matrix form):

(X^T X)\, b = X^T y .

Solving this system of equations gives the general formula for the OLS estimates of the linear model:
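In standard notation (the second form is the "latter representation" discussed below):

\hat{b}_{OLS} = (X^T X)^{-1} X^T y = \left( \tfrac{1}{n} X^T X \right)^{-1} \tfrac{1}{n} X^T y .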

For analytical purposes, the latter representation of this formula turns out to be useful. If the data in the regression model are centered, then in this representation the first matrix has the meaning of the sample covariance matrix of the factors, and the second is the vector of covariances of the factors with the dependent variable. If, in addition, the data are also normalized by the standard deviation (that is, ultimately standardized), then the first matrix has the meaning of the sample correlation matrix of the factors, and the second vector is the vector of sample correlations of the factors with the dependent variable.

An important property of the OLS estimates for models with a constant is that the line of the constructed regression passes through the center of gravity of the sample data, that is, the equality holds:

\bar{y} = \bar{x}^T \hat{b}

In particular, in the extreme case when the only regressor is a constant, we find that the OLS estimate of the single parameter (the constant itself) equals the mean value of the explained variable. That is, the arithmetic mean, known for its good properties from the laws of large numbers, is also a least squares estimate - it satisfies the criterion of the minimum sum of squared deviations from it.

Example: simplest (pairwise) regression

In the case of paired linear regression y = a + bx, the calculation formulas are simplified (one can do without matrix algebra):
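The standard formulas for y = a + bx are:

\hat{b} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2}, \qquad \hat{a} = \bar{y} - \hat{b}\,\bar{x} .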

Properties of OLS estimators

First of all, we note that for linear models the OLS estimates are linear estimates, as follows from the formula above. For the OLS estimates to be unbiased, it is necessary and sufficient that the most important condition of regression analysis be fulfilled: the mathematical expectation of the random error, conditional on the factors, must be equal to zero. This condition is satisfied, in particular, if

  1. the mathematical expectation of the random errors is zero, and
  2. the factors and the random errors are independent random variables.

The second condition - the condition of exogeneity of the factors - is fundamental. If this property is not satisfied, then we can assume that almost any estimates will be extremely unsatisfactory: they will not even be consistent (that is, even a very large amount of data does not allow obtaining high-quality estimates in this case). In the classical case, a stronger assumption of determinism of the factors is made, as opposed to a random error, which automatically means that the exogeneity condition is satisfied. In the general case, for consistency of the estimates it is sufficient that the exogeneity condition holds together with the convergence of the matrix (1/n)·X^T X to some non-singular matrix as the sample size increases to infinity.

In order for the (ordinary) least squares estimates to be not only consistent and unbiased but also efficient (the best in the class of linear unbiased estimates), additional properties of the random error must hold:

    the variance of the random error is the same in all observations (homoscedasticity);

    the random errors in different observations are uncorrelated.

These assumptions can be formulated for the covariance matrix of the random error vector:

V(\varepsilon) = \sigma^2 I_n

A linear model satisfying these conditions is called classical. The OLS estimates for classical linear regression are unbiased, consistent, and the most efficient estimates in the class of all linear unbiased estimates (in the English literature the abbreviation BLUE, Best Linear Unbiased Estimator, is sometimes used; in the Russian literature the Gauss-Markov theorem is more often cited). As is easy to show, the covariance matrix of the vector of coefficient estimates is equal to:
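In standard notation:

V(\hat{b}_{OLS}) = \sigma^2 (X^T X)^{-1} .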

Generalized OLS

The least squares method admits a broad generalization. Instead of minimizing the sum of squares of the residuals, one can minimize some positive definite quadratic form of the residual vector, e^T W e, where W is some symmetric positive definite weight matrix. Ordinary least squares is the special case of this approach where the weight matrix is proportional to the identity matrix. As is known from the theory of symmetric matrices (operators), such matrices admit a decomposition W = P^T P. Consequently, the functional can be represented as (P e)^T (P e), that is, as the sum of squares of certain transformed "residuals". Thus, one can distinguish a whole class of least squares methods - LS methods (Least Squares).

It has been proven (Aitken's theorem) that for the generalized linear regression model (in which no restrictions are imposed on the covariance matrix of the random errors), the most efficient estimates (in the class of linear unbiased estimates) are those of so-called generalized least squares (GLS, Generalized Least Squares) - the LS method with the weight matrix equal to the inverse of the covariance matrix of the random errors: W = V(ε)^{-1}.

It can be shown that the formula for the GLS estimates of the parameters of the linear model has the form:
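In standard notation, with V the covariance matrix of the random errors:

\hat{b}_{GLS} = (X^T V^{-1} X)^{-1} X^T V^{-1} y .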

The covariance matrix of these estimates will accordingly be equal to:

V(\hat{b}_{GLS}) = (X^T V^{-1} X)^{-1}

In fact, the essence of GLS lies in a certain (linear) transformation (P) of the original data and the application of ordinary least squares to the transformed data. The purpose of this transformation is that for the transformed data, the random errors already satisfy the classical assumptions.

Weighted OLS

In the case of a diagonal weight matrix (and hence a diagonal covariance matrix of the random errors), we have so-called weighted least squares (WLS, Weighted Least Squares). In this case, the weighted sum of squares of the model residuals is minimized, that is, each observation receives a "weight" inversely proportional to the variance of the random error in it, so that the criterion becomes \sum_t e_t^2 / \sigma_t^2. In fact, the data are transformed by weighting the observations (dividing by a quantity proportional to the assumed standard deviation of the random errors), and ordinary least squares is applied to the weighted data.

Some special cases of using the least squares method in practice

Approximation of linear dependence

Consider the case when, as a result of studying the dependence of a certain scalar quantity on another scalar quantity (this could be, for example, the dependence of the voltage on the current: U = I·R, where R is a constant value, the resistance of the conductor), measurements of these quantities were carried out, yielding the values x_i and the corresponding values y_i. The measurement data are to be recorded in a table.

Table. Measurement results (measurements no. 1 through 6 with the corresponding values x_i and y_i).

The question is: what value of the coefficient k best describes the dependence y = kx? According to the least squares method, this value should be such that the sum of the squared deviations of the values y_i from the values kx_i is minimal.

The sum of squared deviations has a single extremum - a minimum, which allows us to use this criterion. Let us find from it the value of the coefficient k. To do this, we transform the left-hand side as follows:

The last formula allows us to find the value of the coefficient k, which is what was required in the problem.
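Restoring the omitted algebra in standard notation: minimizing S(k) = \sum_i (k x_i - y_i)^2 gives

S'(k) = 2 \sum_i x_i (k x_i - y_i) = 0 \quad\Rightarrow\quad k = \frac{\sum_i x_i y_i}{\sum_i x_i^2} .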

History

Before the beginning of the 19th century, scientists did not have definite rules for solving a system of equations in which the number of unknowns is less than the number of equations; until that time, ad hoc techniques were used that depended on the type of the equations and on the ingenuity of the calculators, and therefore different calculators, starting from the same observational data, arrived at different conclusions. Gauss (1795) was the first to apply the method, and Legendre (1805) independently discovered and published it under its modern name (French: Méthode des moindres quarrés). Laplace connected the method with probability theory, and the American mathematician Adrain (1808) considered its probability-theoretic applications. The method became widespread and was improved by the further research of Encke, Bessel, Hansen, and others.

Alternative uses of OLS

The idea of the least squares method can also be used in other cases not directly related to regression analysis. The point is that the sum of squares is one of the most common proximity measures for vectors (the Euclidean metric in finite-dimensional spaces).

One application is the "solution" of systems of linear equations in which the number of equations exceeds the number of variables,

where the matrix A is not square but rectangular, with more rows than columns.

Such a system of equations generally has no solution (if the rank of A is actually greater than the number of variables). Therefore, this system can be "solved" only in the sense of choosing a vector x so as to minimize the "distance" between the vectors Ax and b. To do this, one can apply the criterion of minimizing the sum of squared differences of the left- and right-hand sides of the system's equations. It is easy to show that solving this minimization problem leads to solving the following system of equations:
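These are the familiar normal equations:

A^T A\, x = A^T b \quad\Rightarrow\quad x = (A^T A)^{-1} A^T b .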

Least square method

Least square method ( OLS, OLS, Ordinary Least Squares) - one of the basic methods of regression analysis for estimating unknown parameters of regression models using sample data. The method is based on minimizing the sum of squares of regression residuals.

It should be noted that the least squares method itself can be called a method for solving a problem in any area if the solution lies in or satisfies some criterion for minimizing the sum of squares of some functions of the required variables. Therefore, the least squares method can also be used for an approximate representation (approximation) of a given function by other (simpler) functions, when finding a set of quantities that satisfy equations or constraints, the number of which exceeds the number of these quantities, etc.

The essence of MNC

Let some (parametric) model of a probabilistic (regression) relationship between the (explained) variable be given y and many factors (explanatory variables) x

where is the vector of unknown model parameters

- random model error.

Let there also be sample observations of the values ​​of these variables. Let be the observation number (). Then are the values ​​of the variables in the th observation. Then, for given values ​​of parameters b, it is possible to calculate the theoretical (model) values ​​of the explained variable y:

The size of the residuals depends on the values ​​of the parameters b.

The essence of the least squares method (ordinary, classical) is to find parameters b for which the sum of the squares of the residuals (eng. Residual Sum of Squares) will be minimal:

In the general case, this problem can be solved by numerical optimization (minimization) methods. In this case they talk about nonlinear least squares(NLS or NLLS - English) Non-Linear Least Squares). In many cases it is possible to obtain an analytical solution. To solve the minimization problem, it is necessary to find stationary points of the function by differentiating it with respect to the unknown parameters b, equating the derivatives to zero and solving the resulting system of equations:

If the model's random errors are normally distributed, have the same variance, and are uncorrelated, OLS parameter estimates are the same as maximum likelihood estimates (MLM).

OLS in the case of a linear model

Let the regression dependence be linear:

Let y is a column vector of observations of the explained variable, and is a matrix of factor observations (the rows of the matrix are the vectors of factor values ​​in a given observation, the columns are the vector of values ​​of a given factor in all observations). The matrix representation of the linear model is:

Then the vector of estimates of the explained variable and the vector of regression residuals will be equal

Accordingly, the sum of squares of the regression residuals will be equal to

Differentiating this function with respect to the vector of parameters and equating the derivatives to zero, we obtain a system of equations (in matrix form):

.

The solution of this system of equations gives the general formula for least squares estimates for a linear model:

For analytical purposes, the latter representation of this formula is useful. If in a regression model the data centered, then in this representation the first matrix has the meaning of a sample covariance matrix of factors, and the second is a vector of covariances of factors with the dependent variable. If in addition the data is also normalized to MSE (that is, ultimately standardized), then the first matrix has the meaning of a sample correlation matrix of factors, the second vector - a vector of sample correlations of factors with the dependent variable.

An important property of OLS estimates for models with constant- the line of the constructed regression passes through the center of gravity of the sample data, that is, the equality is satisfied:

In particular, in the extreme case, when the only regressor is a constant, we find that the OLS estimate of the only parameter (the constant itself) is equal to the average value of the explained variable. That is, the arithmetic mean, known for its good properties from the laws of large numbers, is also an least squares estimate - it satisfies the criterion of the minimum sum of squared deviations from it.

Example: simplest (pairwise) regression

In the case of paired linear regression, the calculation formulas are simplified (you can do without matrix algebra):

Properties of OLS estimators

First of all, we note that for linear models, OLS estimates are linear estimates, as follows from the above formula. For unbiased OLS estimates, it is necessary and sufficient to fulfill the most important condition of regression analysis: the mathematical expectation of a random error, conditional on the factors, must be equal to zero. This condition, in particular, is satisfied if

  1. the mathematical expectation of random errors is zero, and
  2. factors and random errors are independent random variables.

The second condition, the condition of exogeneity of the factors, is fundamental. If this property is not met, then virtually any estimates will be extremely unsatisfactory: they will not even be consistent (that is, even a very large amount of data does not allow obtaining high-quality estimates in this case). In the classical case a stronger assumption is made: the factors are deterministic, as opposed to the random error, which automatically means that the exogeneity condition is met. In the general case, for consistency of the estimates it is sufficient that the exogeneity condition hold together with the convergence of the matrix n⁻¹XᵀX to some non-singular matrix as the sample size increases to infinity.

In order for (ordinary) least squares estimates to be, in addition to consistent and unbiased, also efficient (the best in the class of linear unbiased estimates), additional properties of the random error must hold:

  1. constant (identical) variance of the random errors in all observations (homoscedasticity): V(ε_t) = σ²;
  2. absence of correlation between the random errors of different observations: cov(ε_t, ε_s) = 0 for t ≠ s.

These assumptions can be formulated in terms of the covariance matrix of the random error vector:

V(ε) = σ²·I_n.
A linear model that satisfies these conditions is called classical. OLS estimates for classical linear regression are unbiased, consistent and the most efficient estimates in the class of all linear unbiased estimates (in the English literature the abbreviation BLUE, Best Linear Unbiased Estimator, is sometimes used; in the Russian literature the Gauss-Markov theorem is more often cited). As is easy to show, the covariance matrix of the vector of coefficient estimates will be equal to:

V(b̂) = σ²(XᵀX)⁻¹.
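A minimal sketch of estimating this matrix from data (Python; replacing the unknown σ² by the usual unbiased residual-variance estimate s² = RSS/(n - k) is an assumption of this sketch, not a statement from the text):

    import numpy as np

    def ols_with_cov(X, y):
        n, k = X.shape
        b = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimates
        e = y - X @ b                           # residuals
        s2 = e @ e / (n - k)                    # estimate of the error variance
        cov_b = s2 * np.linalg.inv(X.T @ X)     # V(b) = s^2 (X'X)^(-1)
        return b, cov_b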
Generalized OLS

The least squares method allows a broad generalization. Instead of minimizing the sum of squares of the residuals, one can minimize some positive definite quadratic form eᵀWe of the residual vector, where W is some symmetric positive definite weight matrix. Ordinary least squares is the special case of this approach in which the weight matrix is proportional to the identity matrix. As is known from the theory of symmetric matrices (operators), such matrices have a decomposition W = PᵀP. Consequently, the specified functional can be represented as eᵀWe = (Pe)ᵀ(Pe), that is, as the sum of squares of some transformed "residuals" Pe. Thus, one can distinguish a whole class of least squares methods: LS methods (Least Squares).

It has been proven (Aitken's theorem) that for a generalized linear regression model (in which no restrictions are imposed on the covariance matrix of the random errors) the most efficient estimates (in the class of linear unbiased estimates) are the so-called generalized least squares (GLS, Generalized Least Squares) estimates: the LS method with a weight matrix equal to the inverse covariance matrix of the random errors, W = V(ε)⁻¹.

It can be shown that the formula for the GLS estimates of the parameters of a linear model has the form

b̂_GLS = (XᵀV⁻¹X)⁻¹XᵀV⁻¹y.

The covariance matrix of these estimates will accordingly be equal to

V(b̂_GLS) = (XᵀV⁻¹X)⁻¹.

In fact, the essence of GLS lies in a certain (linear) transformation P of the original data and the application of ordinary OLS to the transformed data. The purpose of this transformation is that for the transformed data the random errors already satisfy the classical assumptions.
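A sketch of both routes to the same estimate (Python; the error covariance matrix V is assumed known, as in Aitken's theorem):

    import numpy as np

    def gls(X, y, V):
        # Direct formula: b = (X' V^-1 X)^-1 X' V^-1 y
        Vi = np.linalg.inv(V)
        return np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)

    def gls_via_transform(X, y, V):
        # Equivalent route: transform the data with P such that P'P = V^-1
        # (here P is obtained from a Cholesky factor), then run ordinary OLS
        P = np.linalg.cholesky(np.linalg.inv(V)).T
        Xw, yw = P @ X, P @ y
        return np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)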

Weighted OLS

In the case of a diagonal weight matrix (and hence a diagonal covariance matrix of the random errors) we have the so-called weighted least squares (WLS). In this case the weighted sum of squares of the model residuals is minimized, that is, each observation receives a "weight" inversely proportional to the variance of the random error in that observation: Σ e_t²/σ_t² → min. In fact, the data are transformed by weighting the observations (dividing by an amount proportional to the assumed standard deviation of the random errors), and ordinary OLS is applied to the weighted data.
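A sketch (Python; sigma is the assumed vector of error standard deviations, one per observation, as a NumPy array):

    import numpy as np

    def wls(X, y, sigma):
        # Divide each observation by its error std (weight 1/sigma_i^2 on
        # the squared residual), then apply ordinary OLS to the weighted data
        Xw = X / sigma[:, None]
        yw = y / sigma
        return np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)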

Some special cases of using LSM in practice

Approximation of linear dependence

Let us consider the case when the dependence of some scalar quantity y on some scalar quantity x is studied (this could be, for example, the dependence of voltage on current: U = I·R, where R is a constant value, the resistance of the conductor). n measurements of these quantities were carried out, yielding the values x_i and their corresponding values y_i. The measurement data are recorded in a table.

Table. Measurement results.

Measurement no.: 1, 2, 3, 4, 5, 6; for each measurement number i the measured pair x_i, y_i is recorded.

The question is: what value of the coefficient k best describes the dependence y = kx? According to the least squares method, this value should be such that the sum of the squared deviations of the values y_i from the values kx_i,

φ = Σ (y_i - kx_i)²,

is minimal.

The sum of squared deviations has a single extremum, a minimum, which allows this criterion to be used. Let us find from it the value of the coefficient k. To do this, we equate the derivative with respect to k to zero and transform the left-hand side as follows:

dφ/dk = -2 Σ x_i(y_i - kx_i) = 0,  whence  k = Σ x_i y_i / Σ x_i².

The last formula allows us to find the value of the coefficient k, which is what was required in the problem.

History

Until the beginning of the 19th century, scientists had no definite rules for solving a system of equations in which the number of unknowns is less than the number of equations; until then, ad hoc techniques were used that depended on the type of equations and on the ingenuity of the calculators, so that different calculators, starting from the same observational data, arrived at different conclusions. Gauss (1795) was the first to use the method, and Legendre (1805) independently discovered and published it under its modern name (French: Méthode des moindres quarrés). Laplace related the method to probability theory, and the American mathematician Adrain (1808) considered its probability-theoretic applications. The method was spread and improved by the further research of Encke, Bessel, Hansen and others.

Alternative uses of OLS

The idea of ​​the least squares method can also be used in other cases not directly related to regression analysis. The fact is that the sum of squares is one of the most common proximity measures for vectors (Euclidean metric in finite-dimensional spaces).

One application is the "solution" of systems of linear equations in which the number of equations is greater than the number of variables,

Ax = b,

where the matrix A is not square but rectangular, of size m×n with m > n.

Such a system of equations in the general case has no solution (if the rank is actually greater than the number of variables). Therefore, this system can be "solved" only in the sense of choosing a vector x that minimizes the "distance" between the vectors Ax and b. To do this, one can apply the criterion of minimizing the sum of squared differences between the left and right sides of the system's equations, that is, ‖Ax - b‖² → min. It is easy to show that solving this minimization problem leads to the following system of equations:

AᵀAx = Aᵀb,  whence  x = (AᵀA)⁻¹Aᵀb.
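A sketch of such a "solution" (Python; the 3×2 system is made up for the example):

    import numpy as np

    # Three equations, two unknowns: no exact solution in general
    A = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])
    b = np.array([1.0, 2.0, 2.0])

    # lstsq minimizes ||Ax - b||^2, i.e. solves the normal equations A'Ax = A'b
    x, residual, rank, sv = np.linalg.lstsq(A, b, rcond=None)
    print(x)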
If some physical quantity depends on another quantity, then this dependence can be studied by measuring y at different values ​​of x. As a result of measurements, a number of values ​​are obtained:

x_1, x_2, …, x_i, …, x_n;

y_1, y_2, …, y_i, …, y_n.

Based on the data of such an experiment, it is possible to construct a graph of the dependence y = ƒ(x). The resulting curve makes it possible to judge the form of the function ƒ(x). However, the constant coefficients that enter into this function remain unknown. They can be determined using the least squares method. Experimental points, as a rule, do not lie exactly on the curve. The least squares method requires that the sum of the squares of the deviations of the experimental points from the curve, that is, Σ [y_i - ƒ(x_i)]², be the smallest.

In practice, this method is most often (and most simply) used in the case of a linear relationship, that is, when

y = kx or y = a + bx.

Linear dependence is very widespread in physics. And even when the relationship is nonlinear, one usually tries to construct the graph so as to obtain a straight line. For example, if it is assumed that the refractive index of glass n is related to the light wavelength λ by the relation n = a + b/λ², then the dependence of n on λ⁻² is plotted on the graph.

Consider the dependence y = kx (a straight line passing through the origin). Let us compose the quantity φ, the sum of the squares of the deviations of our points from the straight line:

φ = Σ (y_i - kx_i)².

The value of φ is always positive and turns out to be smaller the closer our points are to the straight line. The least squares method states that the value of k should be chosen so that φ has a minimum:

dφ/dk = -2 Σ x_i(y_i - kx_i) = 0,

or

k = Σ x_i y_i / Σ x_i².  (19)

The calculation shows that the root-mean-square error in determining the value of k is equal to

S_k = √( Σ (y_i - kx_i)² / ((n - 1) Σ x_i²) ),  (20)

where n is the number of measurements.
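A sketch of formulas (19) and (20) as reconstructed above (Python; x and y are the measurement arrays):

    import numpy as np

    def fit_through_origin(x, y):
        n = len(x)
        k = (x * y).sum() / (x ** 2).sum()            # formula (19)
        s_k = np.sqrt(((y - k * x) ** 2).sum()
                      / ((n - 1) * (x ** 2).sum()))   # formula (20)
        return k, s_k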

Let us now consider a slightly more difficult case, when the points must satisfy the formula y = a + bx (a straight line that does not pass through the origin).

The task is to find, given the set of values x_i, y_i, the best values of a and b.

Let us again compose the quadratic form φ, equal to the sum of the squared deviations of the points x_i, y_i from the straight line,

φ = Σ (y_i - a - bx_i)²,

and find the values of a and b for which φ has a minimum:

∂φ/∂a = -2 Σ (y_i - a - bx_i) = 0;

∂φ/∂b = -2 Σ x_i(y_i - a - bx_i) = 0.

The joint solution of these equations gives

b = Σ (x_i - x̄)y_i / Σ (x_i - x̄)²,  (21)

a = ȳ - b x̄,  (22)

where x̄ = Σ x_i / n and ȳ = Σ y_i / n.

The root-mean-square errors of the determination of a and b are equal to

S_b = √( Σ (y_i - bx_i - a)² / ((n - 2) Σ (x_i - x̄)²) ),  (23)

S_a = S_b √( Σ x_i² / n ).  (24)
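The same calculation as a sketch (Python), following formulas (21)-(24) as written above:

    import numpy as np

    def fit_line(x, y):
        n = len(x)
        dx = x - x.mean()
        b = (dx * y).sum() / (dx ** 2).sum()              # (21)
        a = y.mean() - b * x.mean()                       # (22)
        ss = ((y - b * x - a) ** 2).sum()
        s_b = np.sqrt(ss / ((n - 2) * (dx ** 2).sum()))   # (23)
        s_a = s_b * np.sqrt((x ** 2).sum() / n)           # (24)
        return a, b, s_a, s_b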

When processing measurement results by this method, it is convenient to summarize all the data in a table in which all the sums entering formulas (19)-(24) are calculated beforehand. The forms of these tables are given in the examples below.

Example 1. The basic equation of the dynamics of rotational motion, ε = M/J (a straight line passing through the origin), was studied. At different values of the moment of force M, the angular acceleration ε of a certain body was measured. It is required to determine the moment of inertia of this body. The results of the measurements of the moment of force and the angular acceleration are listed in the second and third columns of Table 5.

Table 5

n | M, N·m | ε, s⁻² | M² | M·ε | ε - kM | (ε - kM)²
1 | 1.44 | 0.52 | 2.0736 | 0.7488 | 0.039432 | 0.001555
2 | 3.12 | 1.06 | 9.7344 | 3.3072 | 0.018768 | 0.000352
3 | 4.59 | 1.45 | 21.0681 | 6.6555 | -0.08181 | 0.006693
4 | 5.90 | 1.92 | 34.81 | 11.328 | -0.049 | 0.002401
5 | 7.45 | 2.56 | 55.5025 | 19.072 | 0.073725 | 0.005435
Σ | – | – | 123.1886 | 41.1115 | – | 0.016436

Using formula (19) we determine:

k = Σ M_i ε_i / Σ M_i² = 41.1115 / 123.1886 = 0.3337 kg⁻¹·m⁻²,

whence the moment of inertia is J = 1/k = 2.996 kg·m².

To determine the root-mean-square error we use formula (20):

S_k = √( Σ (ε_i - kM_i)² / ((n - 1) Σ M_i²) ) = √( 0.016436 / (4 · 123.1886) ) = 0.005775 kg⁻¹·m⁻².

Since J = 1/k, according to formula (18) we have:

S_J = J·S_k/k = (2.996 · 0.005775)/0.3337 = 0.05185 kg·m².

Having set the reliability P = 0.95, using the table of Student coefficients for n = 5 we find t = 2.78 and determine the absolute error ΔJ = 2.78 · 0.05185 = 0.1441 ≈ 0.2 kg·m².

Let's write the results in the form:

J = (3.0 ± 0.2) kg·m² at P = 0.95.
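As a cross-check, a sketch reproducing Example 1 from the data of Table 5 (Python):

    import numpy as np

    M = np.array([1.44, 3.12, 4.59, 5.90, 7.45])    # moment of force, N*m
    eps = np.array([0.52, 1.06, 1.45, 1.92, 2.56])  # angular acceleration, 1/s^2

    k = (M * eps).sum() / (M ** 2).sum()            # ~ 0.3337, formula (19)
    J = 1 / k                                       # ~ 3.0 kg*m^2
    s_k = np.sqrt(((eps - k * M) ** 2).sum()
                  / ((len(M) - 1) * (M ** 2).sum()))  # ~ 0.0058, formula (20)
    print(J, J * s_k / k)                           # J and its rms error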


Example 2. Let us calculate the temperature coefficient of resistance of a metal using the least squares method. Resistance depends linearly on temperature:

R_t = R₀(1 + αt°) = R₀ + R₀αt°.

The free term determines the resistance R₀ at a temperature of 0 °C, and the slope coefficient is the product of the temperature coefficient α and the resistance R₀.

The results of the measurements and calculations are given in Table 6.

Table 6

n | t, °C | r, Ohm | t - t̄ | (t - t̄)² | (t - t̄)·r | r - bt - a | (r - bt - a)², 10⁻⁶
1 | 23 | 1.242 | -62.8333 | 3948.028 | -78.039 | 0.007673 | 58.8722
2 | 59 | 1.326 | -26.8333 | 720.0278 | -35.581 | -0.00353 | 12.4959
3 | 84 | 1.386 | -1.83333 | 3.361111 | -2.541 | -0.00965 | 93.1506
4 | 96 | 1.417 | 10.16667 | 103.3611 | 14.40617 | -0.01039 | 107.898
5 | 120 | 1.512 | 34.16667 | 1167.361 | 51.66 | 0.021141 | 446.932
6 | 133 | 1.520 | 47.16667 | 2224.694 | 71.69333 | -0.00524 | 27.4556
Σ | 515 | 8.403 | – | 8166.833 | 21.5985 | – | 746.804
Σ/n | 85.83333 | 1.4005 | – | – | – | – | –

Using formulas (21), (22) we determine:

b = R₀α = Σ (t_i - t̄)r_i / Σ (t_i - t̄)² = 21.5985 / 8166.833 = 0.002645 Ohm/deg;

R₀ = r̄ - b·t̄ = 1.4005 - 0.002645 · 85.83333 = 1.1735 Ohm,

whence α = b/R₀ = 0.002645/1.1735 = 0.002254 deg⁻¹.

Let us find the error in determining α. Since α = b/R₀, according to formula (18) we have:

S_α = α √( (S_b/b)² + (S_R₀/R₀)² ).

Using formulas (23), (24) we obtain:

S_b = √( 746.804·10⁻⁶ / (4 · 8166.833) ) = 0.000151 Ohm/deg;

S_a = S_b √( Σ t_i² / n ) = 0.014126 Ohm.

Hence S_α = 0.002254 · √( (0.000151/0.002645)² + (0.014126/1.1735)² ) = 0.000132 deg⁻¹.

Having set the reliability P = 0.95, using the table of Student coefficients for n = 6 we find t = 2.57 and determine the absolute error Δα = 2.57 · 0.000132 = 0.000338 ≈ 4·10⁻⁴ deg⁻¹.

α = (23 ± 4)·10⁻⁴ deg⁻¹ at P = 0.95.
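A sketch reproducing Example 2 from the data of Table 6 (Python):

    import numpy as np

    t = np.array([23., 59., 84., 96., 120., 133.])            # temperature, deg C
    r = np.array([1.242, 1.326, 1.386, 1.417, 1.512, 1.520])  # resistance, Ohm

    dt = t - t.mean()
    b = (dt * r).sum() / (dt ** 2).sum()  # R0*alpha ~ 0.002645 Ohm/deg, (21)
    R0 = r.mean() - b * t.mean()          # ~ 1.1735 Ohm, (22)
    alpha = b / R0                        # ~ 0.00225 1/deg
    print(R0, alpha)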


Example 3. It is required to determine the radius of curvature of a lens using Newton's rings. The radii r_m of Newton's rings were measured and the numbers m of these rings were determined. The radii of Newton's rings are related to the radius of curvature R of the lens and the ring number by the equation

r_m² = mλR - 2d₀R,

where d₀ is the thickness of the gap between the lens and the plane-parallel plate (or the deformation of the lens), and λ is the wavelength of the incident light.

λ = (600 ± 6) nm.

If we set r_m² = y, m = x, λR = b, -2d₀R = a, then the equation will take the form y = a + bx.

The coefficients a and b are found by formulas (21), (22), after which the radius of curvature of the lens is determined from the slope: R = b/λ.

The results of measurements and calculations are entered into table 7.

Table 7

n | x = m | y = r_m², 10⁻² mm² | m - m̄ | (m - m̄)² | (m - m̄)·y, mm² | y - bx - a, 10⁻⁴ mm² | (y - bx - a)², 10⁻⁶ mm⁴
1 | 1 | 6.101 | -2.5 | 6.25 | -0.152525 | 12.01 | 1.44229
2 | 2 | 11.834 | -1.5 | 2.25 | -0.17751 | -9.6 | 0.930766
3 | 3 | 17.808 | -0.5 | 0.25 | -0.08904 | -7.2 | 0.519086
4 | 4 | 23.814 | 0.5 | 0.25 | 0.11907 | -1.6 | 0.0243955
5 | 5 | 29.812 | 1.5 | 2.25 | 0.44718 | 3.28 | 0.107646
6 | 6 | 35.760 | 2.5 | 6.25 | 0.894 | 3.12 | 0.0975819
Σ | 21 | 125.129 | – | 17.5 | 1.041175 | – | 3.12176
Σ/n | 3.5 | 20.8548333 | – | – | – | – | –

Using formulas (21), (22) we determine:

b = λR = 1.041175 / 17.5 = 0.059496 mm²;

a = ȳ - b·m̄ = 0.2085483 - 0.059496 · 3.5 = 0.000313 mm²,

whence the radius of curvature of the lens is R = b/λ = 0.059496 mm² / (600·10⁻⁶ mm) ≈ 99 mm.
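A sketch reproducing Example 3 from the data of Table 7 (Python; λ = 600 nm = 6·10⁻⁴ mm):

    import numpy as np

    m = np.arange(1, 7)                            # ring numbers
    y = np.array([6.101, 11.834, 17.808,
                  23.814, 29.812, 35.760]) * 1e-2  # r_m^2, mm^2

    dm = m - m.mean()
    b = (dm * y).sum() / (dm ** 2).sum()           # = lambda*R ~ 0.0595 mm^2, (21)
    a = y.mean() - b * m.mean()                    # = -2*d0*R, (22)
    R = b / 600e-6                                 # lambda in mm
    print(R)                                       # ~ 99 mm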


