In statistics, the **Gauss–Markov theorem**, named after Carl Friedrich Gauss and Andrey Markov, states that in a linear model in which the errors have expectation zero and are uncorrelated and have equal variances, the best linear unbiased estimators of the coefficients are the least-squares estimators. More generally, the best linear unbiased estimator of any linear combination of the coefficients is its least-squares estimator. The errors are *not* assumed to be normally distributed, nor are they assumed to be independent (but only uncorrelated — a weaker condition), nor are they assumed to be identically distributed (but only homoscedastic — a weaker condition, defined below).

More explicitly, and more concretely, suppose we have

- $ Y_i=\beta_0+\beta_1 x_i+\varepsilon_i $

for *i* = 1, . . ., *n*, where β_{0} and β_{1} are non-random but **un**observable parameters, *x*_{i} are non-random and observable, ε_{i} are random, and so *Y*_{i} are random. (We set *x* in lower-case because it is not random, and *Y* in capital because it is random.) The random variables ε_{i} are called the "errors" (not to be confused with "residuals"; see errors and residuals in statistics). The **Gauss–Markov** assumptions state that

- $ {\rm E}\left(\varepsilon_i\right)=0, $
- $ {\rm var}\left(\varepsilon_i\right)=\sigma^2<\infty, $

(i.e., all errors have the same variance; that is "homoscedasticity"), and

- $ {\rm cov}\left(\varepsilon_i,\varepsilon_j\right)=0 $

for $ i\not=j $; that is "uncorrelatedness."

A **linear unbiased estimator** of β_{1} is a linear combination

- $ c_1Y_1+\cdots+c_nY_n $

in which the coefficients *c*_{i} are not allowed to depend on the underlying coefficients β_{i}, since those are not observable, but are allowed to depend on the *x*_{i}, since those are observable, and whose expected value remains β_{1} even if the values of the β_{i} change. (The dependence of the coefficients on the *x*_{i} is typically nonlinear; the estimator is linear in that which is random; that is why this is "linear" regression.) The **mean squared error** of such an estimator is

- $ {\rm E}\left((c_1Y_1+\cdots+c_nY_n-\beta_1)^2\right), $

i.e., it is the expectation of the square of the difference between the estimator and the parameter to be estimated. (The mean squared error of an estimator coincides with the estimator's variance if the estimator is unbiased; for biased estimators the mean squared error is the sum of the variance and the square of the bias.) The **best linear unbiased estimator** is the one with the smallest mean squared error. The "least-squares estimators" of β_{0} and β_{1} are the functions $ \widehat{\beta}_0 $ and $ \widehat{\beta}_1 $ of the *Y*s and the *x*s that make the **sum of squares of residuals**

- $ \sum_{i=1}^n\left(Y_i-\widehat{Y}_i\right)^2=\sum_{i=1}^n\left(Y_i-\left(\widehat{\beta}_0+\widehat{\beta}_1 x_i\right)\right)^2 $

as small as possible. (It is easy to confuse the concept of *error* introduced earlier in this article with this concept of *residual*. For an account of the differences and the relationship between them, see errors and residuals in statistics.)
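For the simple model above, minimizing the sum of squares of residuals has a well-known closed-form solution: the slope estimate is the ratio of the sum of cross-deviations to the sum of squared deviations of the *x*s, and the intercept estimate follows from the sample means. The following is a minimal sketch of that calculation in plain Python. The function names and the data values are hypothetical and chosen only for illustration; they do not come from the article.

```python
def least_squares_fit(xs, ys):
    """Return (beta0_hat, beta1_hat) minimizing sum((y - (b0 + b1*x))**2)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x; zero means all x values coincide.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("the x values must not all be equal")
    # Sum of cross-deviations of x and y.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


def residual_sum_of_squares(xs, ys, b0, b1):
    """Sum of squared residuals for the candidate line b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical illustrative data (an assumption, not from the article).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_fit(xs, ys)
rss_fit = residual_sum_of_squares(xs, ys, b0, b1)
# Moving away from the fitted coefficients should not reduce the sum of squares.
rss_perturbed = residual_sum_of_squares(xs, ys, b0 + 0.1, b1 - 0.05)
print(b0, b1, rss_fit, rss_perturbed)
```

The comparison with a perturbed line is only a spot check of the minimizing property, not a proof of it.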

The main idea of the proof is that the least-squares estimators are uncorrelated with every **linear unbiased estimator of zero**, i.e., with every linear combination

- $ a_1Y_1+\cdots+a_nY_n $

whose coefficients do not depend upon the unobservable β_{i} but whose expected value remains zero regardless of how the values of β_{0} and β_{1} change.
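The proof idea can be checked numerically on a small design. A linear estimator with weights *c*_{i} has expectation β_{0}Σ*c*_{i} + β_{1}Σ*c*_{i}*x*_{i}, so it is unbiased for β_{1} exactly when Σ*c*_{i} = 0 and Σ*c*_{i}*x*_{i} = 1, and under the Gauss–Markov assumptions its variance is σ²Σ*c*_{i}². The sketch below compares the least-squares slope weights with those of a hypothetical alternative, the "endpoint" estimator (*Y*_{n} − *Y*_{1})/(*x*_{n} − *x*_{1}). Both the alternative estimator and the *x* values are assumptions made for illustration.

```python
def ols_slope_weights(xs):
    """Weights c_i with beta1_hat = sum(c_i * Y_i) for the least-squares slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return [(x - x_bar) / s_xx for x in xs]


def endpoint_slope_weights(xs):
    """Weights of the hypothetical estimator (Y_n - Y_1) / (x_n - x_1)."""
    span = xs[-1] - xs[0]
    weights = [0.0] * len(xs)
    weights[0] = -1.0 / span
    weights[-1] = 1.0 / span
    return weights


def is_unbiased_for_slope(weights, xs, tol=1e-12):
    """Check sum(c_i) = 0 and sum(c_i * x_i) = 1 up to rounding error."""
    sum_c = sum(weights)
    sum_cx = sum(c * x for c, x in zip(weights, xs))
    return abs(sum_c) < tol and abs(sum_cx - 1.0) < tol


def variance_factor(weights):
    """Variance of sum(c_i * Y_i) divided by sigma^2, i.e. sum(c_i**2)."""
    return sum(c * c for c in weights)


# Hypothetical design points (an assumption, not from the article).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
c_ols = ols_slope_weights(xs)
c_alt = endpoint_slope_weights(xs)

# The difference of two unbiased estimators is an unbiased estimator of zero.
a = [c_a - c_o for c_a, c_o in zip(c_alt, c_ols)]

# Covariance between the least-squares slope and that estimator of zero,
# divided by sigma^2; the proof idea says this vanishes.
cross_term = sum(c_o * a_i for c_o, a_i in zip(c_ols, a))

print(variance_factor(c_ols), variance_factor(c_alt), cross_term)
```

Because the cross term vanishes, the variance factor of the alternative splits into the least-squares factor plus the non-negative factor of the estimator of zero, which is why no linear unbiased estimator can have smaller variance.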

- See also linear regression.

In terms of the matrix algebra formulation, the Gauss–Markov theorem shows that the difference between the parameter covariance matrix of an arbitrary linear unbiased estimator and that of the ordinary least squares (OLS) estimator is positive semidefinite (see also the proof in the external links).
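A sketch of the matrix statement, using notation that is assumed here rather than taken from the article: $ X $ is the design matrix (assumed to have full column rank), $ \widehat{\beta} $ is the OLS estimator, $ \widetilde{\beta}=CY $ is an arbitrary linear unbiased estimator, and $ D $ is the difference between $ C $ and the OLS weight matrix.

```latex
\begin{align*}
Y &= X\beta + \varepsilon, \qquad {\rm E}(\varepsilon) = 0, \qquad {\rm var}(\varepsilon) = \sigma^2 I, \\
\widehat{\beta} &= (X^{\mathsf T}X)^{-1}X^{\mathsf T}Y, \qquad {\rm var}(\widehat{\beta}) = \sigma^2 (X^{\mathsf T}X)^{-1}, \\
\widetilde{\beta} &= CY, \qquad C = (X^{\mathsf T}X)^{-1}X^{\mathsf T} + D, \qquad CX = I \iff DX = 0, \\
{\rm var}(\widetilde{\beta}) &= \sigma^2 CC^{\mathsf T} = \sigma^2 (X^{\mathsf T}X)^{-1} + \sigma^2 DD^{\mathsf T}, \\
{\rm var}(\widetilde{\beta}) - {\rm var}(\widehat{\beta}) &= \sigma^2 DD^{\mathsf T} \succeq 0 .
\end{align*}
```

Unbiasedness for every value of $ \beta $ forces $ CX = I $, hence $ DX = 0 $, and this makes the cross terms in $ CC^{\mathsf T} $ vanish. The remaining term $ DD^{\mathsf T} $ is positive semidefinite for any matrix $ D $.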

## External links

- Earliest Known Uses of Some of the Words of Mathematics: G (brief history and explanation of its name)
- Proof of the Gauss Markov theorem for multiple linear regression (makes use of matrix algebra)
- A Proof of the Gauss Markov theorem using geometry

This page uses Creative Commons Licensed content from Wikipedia.