In statistics, a **Studentized residual**, named in honor of William Sealy Gosset, who wrote under the pseudonym *Student*, is a residual divided by an estimate of its standard deviation. Studentization of residuals is an important technique in the detection of outliers.

## Errors versus residuals

It is very important to understand the difference between errors and residuals in statistics. Consider the simple linear regression model

- $ Y_i=\alpha_0+\alpha_1 x_i+\varepsilon_i, $

where the **errors** ε_{i}, *i* = 1, ..., *n*, are independent and all have the same variance σ^{2}. The **residuals** are not the true, and unobservable, errors, but rather are *estimates*, based on the observable data, of the errors. When the method of least squares is used to estimate α_{0} and α_{1}, then the residuals, unlike the errors, cannot be independent since they satisfy the two constraints

- $ \sum_{i=1}^n \widehat{\varepsilon}_i=0 $

and

- $ \sum_{i=1}^n \widehat{\varepsilon}_i x_i=0. $

(Here $ \varepsilon_i $ is the *i*th error, and $ \widehat{\varepsilon}_i $ is the *i*th residual.) Moreover, the residuals, unlike the errors, do not all have the same variance: the variance decreases as the corresponding *x*-value gets farther from the average *x*-value. **The fact that the variances of the residuals differ, even though the variances of the true errors are all equal to each other, is the principal reason for the need for Studentization.**
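The two constraints on the residuals can be checked numerically. The following is a minimal sketch using NumPy; the sample size, the coefficient values, the noise scale, and the random seed are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 20
x = np.linspace(0.0, 10.0, n)

# Hypothetical true model: alpha_0 = 1, alpha_1 = 2, error standard deviation 1.5
y = 1.0 + 2.0 * x + rng.normal(scale=1.5, size=n)

# Least-squares fit of the simple linear regression model
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

# Both sums vanish up to floating-point rounding
sum_resid = residuals.sum()
sum_resid_x = (residuals * x).sum()
print(sum_resid, sum_resid_x)
```

Because the *n* residuals satisfy two linear constraints, any *n* − 2 of them determine the remaining two, which is why they cannot be independent.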

## How to Studentize

For this simple model, the design matrix is

- $ X=\left[\begin{matrix}1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{matrix}\right] $

and the "hat matrix" *H* is the matrix of the orthogonal projection onto the column space of the design matrix:

- $ H=X(X^T X)^{-1}X^T. $

The "leverage" *h*_{ii} is the *i*th diagonal entry in the hat matrix. The variance of the *i*th residual is

- $ \mbox{var}(\widehat{\varepsilon}_i)=\sigma^2(1-h_{ii}). $

The corresponding **Studentized residual** is then

- $ {\widehat{\varepsilon}_i\over \widehat{\sigma} \sqrt{1-h_{ii}\ }} $

where $ \widehat{\sigma} $ is an appropriate estimate of σ.
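As an illustration of the quantities above, the sketch below computes the leverages from the hat matrix and the factor 1 − *h*_{ii} that scales the variance of each residual. The function names `leverages` and `studentize` and the *x*-values are hypothetical, chosen for this example; the closed-form expression used as a cross-check is the standard leverage formula for simple linear regression.

```python
import numpy as np

def leverages(X):
    """Return the diagonal of the hat matrix H = X (X^T X)^{-1} X^T."""
    H = X @ np.linalg.solve(X.T @ X, X.T)
    return np.diag(H)

def studentize(residuals, h, sigma_hat):
    """Divide each residual by sigma_hat * sqrt(1 - h_ii).

    sigma_hat must be supplied by the caller (an estimate of sigma).
    """
    return residuals / (sigma_hat * np.sqrt(1.0 - h))

# Hypothetical x-values; the last point lies far from the others
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
X = np.column_stack([np.ones_like(x), x])

h = leverages(X)

# Cross-check against the closed form for simple linear regression:
# h_ii = 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2
dev = x - x.mean()
h_closed = 1.0 / len(x) + dev**2 / (dev**2).sum()

# var(residual_i) / sigma^2 = 1 - h_ii, smallest for the far-away point
resid_var_factor = 1.0 - h
print(h)
print(resid_var_factor)
```

The point farthest from the mean *x*-value has the largest leverage and therefore the smallest residual variance, so its raw residual understates how unusual it is until it is Studentized.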

## Internal and external Studentization

The estimate of σ^{2} is

- $ \widehat{\sigma}^2={1 \over n-m}\sum_{j=1}^n \widehat{\varepsilon}_j^2, $

where *m* is the number of parameters in the model (2 in our example). But it is desirable to exclude the *i*th observation from the process of estimating the variance when one is considering whether the *i*th case may be an outlier. Consequently one may use the estimate obtained from the fit that omits the *i*th case,

- $ \widehat{\sigma}_{(i)}^2={1 \over n-m-1}\sum_{\begin{smallmatrix}j = 1\\ j \ne i\end{smallmatrix}}^n \widehat{\varepsilon}_j^2, $

based on all but the *i*th case. If the latter estimate is used, *excluding* the *i*th case, then the residual is said to be **externally Studentized**; if the former is used, *including* the *i*th case, then it is **internally Studentized**.

If the errors are independent and normally distributed with expected value 0 and variance σ^{2}, then the probability distribution of the *i*th externally Studentized residual is a Student's t-distribution with *n* − *m* − 1 degrees of freedom, and can range from $ -\infty $ to $ +\infty $.
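The two kinds of Studentization can be sketched as follows. This is an illustrative implementation, not a reference one: the function names are hypothetical, the data are simulated with an arbitrary seed, and the external version simply refits the model once per deleted case. The final lines compare the result with a well-known shortcut that obtains the externally Studentized residual from the internally Studentized one without refitting.

```python
import numpy as np

def fit_residuals(X, y):
    """Least-squares residuals of y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def internal_studentized(X, y):
    """Studentize with sigma^2 estimated from all n cases."""
    n, m = X.shape
    e = fit_residuals(X, y)
    h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))
    sigma2 = (e @ e) / (n - m)
    return e / np.sqrt(sigma2 * (1.0 - h))

def external_studentized(X, y):
    """Studentize case i with sigma^2 estimated from the fit without case i."""
    n, m = X.shape
    e = fit_residuals(X, y)
    h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))
    t = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        e_loo = fit_residuals(X[keep], y[keep])  # fit omitting case i
        sigma2_i = (e_loo @ e_loo) / (n - m - 1)
        t[i] = e[i] / np.sqrt(sigma2_i * (1.0 - h[i]))
    return t

# Simulated data (hypothetical coefficients) with one shifted observation
rng = np.random.default_rng(1)  # arbitrary seed
x = rng.uniform(0, 10, size=12)
y = 0.5 + 1.5 * x + rng.normal(size=12)
y[3] += 6.0  # artificially contaminated case
X = np.column_stack([np.ones_like(x), x])

r = internal_studentized(X, y)
t = external_studentized(X, y)

# Shortcut: t_i = r_i * sqrt((n - m - 1) / (n - m - r_i^2))
n, m = X.shape
t_shortcut = r * np.sqrt((n - m - 1) / (n - m - r**2))
print(np.allclose(t, t_shortcut))
```

The explicit refit makes the definition transparent but costs one regression per case; in practice the shortcut is preferable when *n* is large.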

On the other hand, the internally Studentized residuals are in the range $ 0 \pm \sqrt{\mathrm{r.d.f.}} $, where r.d.f. is the number of residual degrees of freedom, namely *n* − *m*. If "i.s.r." represents the internally Studentized residual, and again assuming that the errors are independent identically distributed Gaussian variables, then

- $ \mathrm{i.s.r.}^2 = \mathrm{r.d.f.}{t^2 \over t^2+\mathrm{r.d.f.}-1} $

where *t* is distributed as Student's t-distribution with r.d.f. − 1 degrees of freedom. In fact, this implies that i.s.r.^{2}/r.d.f. follows the beta distribution *B*(1/2,(r.d.f. − 1)/2). When r.d.f. = 3, the internally Studentized residuals are uniformly distributed between $ -\sqrt{3} $ and $ +\sqrt{3} $.
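The case r.d.f. = 3 can be examined by simulation. The sketch below assumes a fixed design with *n* = 5 points and *m* = 2 parameters, standard normal errors, and an arbitrary seed and number of replications; it compares the empirical distribution of one internally Studentized residual with the uniform distribution on the stated interval.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # hypothetical design points
n, m = len(x), 2
rdf = n - m  # residual degrees of freedom, here 3

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H  # maps observations (or errors) to residuals

# The residuals depend only on the errors, because (I - H) X = 0,
# so the true regression line does not need to be simulated.
n_sims = 20000
errors = rng.normal(size=(n_sims, n))
E = errors @ M  # row k holds the residuals of replication k (M is symmetric)

sigma2 = (E**2).sum(axis=1) / rdf
isr = E[:, 0] / np.sqrt(sigma2 * (1.0 - H[0, 0]))  # first case only

# Largest gap between the empirical CDF and the uniform CDF
# on [-sqrt(rdf), +sqrt(rdf)]
isr_sorted = np.sort(isr)
ecdf = np.arange(1, n_sims + 1) / n_sims
uniform_cdf = (isr_sorted + np.sqrt(rdf)) / (2.0 * np.sqrt(rdf))
max_gap = np.abs(ecdf - uniform_cdf).max()

print(np.abs(isr).max(), np.sqrt(rdf))
print(max_gap, isr.var())
```

Every simulated value stays within $ \pm\sqrt{3} $, the gap to the uniform distribution should be small, and the sample variance across replications should be close to 1, the variance of a uniform distribution on that interval.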

If there is only one residual degree of freedom, the above formula for the distribution of internally Studentized residuals does not apply. In this case, the i.s.r.'s are all either +1 or −1, with 50% chance for each.
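The single-degree-of-freedom case is easy to reproduce with three observations and a two-parameter line. The following sketch uses hypothetical *x*-values and coefficients and an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
x = np.array([0.0, 1.0, 3.0])   # n = 3 hypothetical design points
y = 2.0 - 0.5 * x + rng.normal(size=3)

X = np.column_stack([np.ones(3), x])  # m = 2, so r.d.f. = n - m = 1
H = X @ np.linalg.solve(X.T @ X, X.T)

e = y - H @ y                    # residuals
sigma2 = (e @ e) / 1             # divide by r.d.f. = 1
isr = e / np.sqrt(sigma2 * (1.0 - np.diag(H)))

print(isr)  # every entry has absolute value 1
```

With one residual degree of freedom the residual vector is confined to a single direction, so each residual is a fixed multiple of the same random scalar and Studentization leaves only its sign.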

The standard deviation of the distribution of internally Studentized residuals is always 1, but this does not imply that the standard deviation of all the i.s.r.'s of a particular experiment is 1.

This page uses Creative Commons Licensed content from Wikipedia.