Mean squared error

In statistics, the mean squared error or MSE of an estimator is the expected value of the square of the "error." The error is the amount by which the estimator differs from the quantity to be estimated. The difference occurs because of randomness or because the estimator doesn't account for information that could produce a more accurate estimate.

Definition and basic properties
The MSE of an estimator $$\hat{\theta}$$ with respect to the estimated parameter $$\theta$$ is defined as


 * $$\operatorname{MSE}(\hat{\theta})=\operatorname{E}((\hat{\theta}-\theta)^2).$$

It can be shown that the MSE is the sum of the variance and the squared bias of the estimator:
 * $$\operatorname{MSE}(\hat{\theta})=\operatorname{Var}\left(\hat{\theta}\right)+ \left(\operatorname{Bias}(\hat{\theta},\theta)\right)^2.$$
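
One way to see this decomposition is to add and subtract $$\operatorname{E}(\hat{\theta})$$ inside the square. The sketch below assumes the usual definition $$\operatorname{Bias}(\hat{\theta},\theta)=\operatorname{E}(\hat{\theta})-\theta$$, which the text above does not state explicitly, and treats $$\theta$$ as a fixed (non-random) quantity.

```latex
\begin{align}
\operatorname{MSE}(\hat{\theta})
  &= \operatorname{E}\left((\hat{\theta}-\theta)^2\right) \\
  &= \operatorname{E}\left(\left(\hat{\theta}-\operatorname{E}(\hat{\theta})
       +\operatorname{E}(\hat{\theta})-\theta\right)^2\right) \\
  &= \operatorname{E}\left(\left(\hat{\theta}-\operatorname{E}(\hat{\theta})\right)^2\right)
     + 2\left(\operatorname{E}(\hat{\theta})-\theta\right)
       \operatorname{E}\left(\hat{\theta}-\operatorname{E}(\hat{\theta})\right)
     + \left(\operatorname{E}(\hat{\theta})-\theta\right)^2 \\
  &= \operatorname{Var}\left(\hat{\theta}\right)
     + \left(\operatorname{Bias}(\hat{\theta},\theta)\right)^2
\end{align}
```

The cross term vanishes because $$\operatorname{E}\left(\hat{\theta}-\operatorname{E}(\hat{\theta})\right)=0$$.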

In that sense, the MSE assesses the quality of the estimator in terms of both its variance and its bias. Note that the MSE is not equivalent to the expected value of the absolute error.

The root mean squared error (RMSE), also called the root mean squared deviation (RMSD), is the square root of the MSE:
 * $$\operatorname{RMSE}(\hat{\theta}) = \sqrt{\operatorname{MSE}(\hat{\theta})}.$$

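The MSE as defined above (and likewise the RMSE) is an expected value, a property of the estimator's distribution rather than something observed directly, so it usually has to be estimated itself. This is usually done by the sample mean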
 * $$\operatorname{\widehat{MSE}}(\hat{\theta}) = \frac{1}{n} \sum_{j=1}^n \left(\theta_j-\theta\right)^2$$

with $$\theta_j$$, $$j=1,\dots,n$$, being $$n$$ realizations of the estimator $$\hat{\theta}$$.
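
As an illustration of this sample-mean estimate, the following minimal sketch simulates many realizations of an estimator and averages their squared errors. The helper names (`estimate_mse`, `draw_sample`), the normal population and the constants are hypothetical choices made for the example. The true parameter is known here only because the data are simulated.

```python
import math
import random


def estimate_mse(estimator, sampler, theta, n_reps, rng):
    """Average squared error of `estimator` over `n_reps` simulated samples."""
    total = 0.0
    for _ in range(n_reps):
        theta_j = estimator(sampler(rng))  # one realization of the estimator
        total += (theta_j - theta) ** 2
    return total / n_reps


def sample_mean(xs):
    return sum(xs) / len(xs)


# Assumed setup for illustration: samples of size 10 from N(mu=2, sigma^2=9).
MU, SIGMA, SAMPLE_SIZE = 2.0, 3.0, 10


def draw_sample(rng):
    return [rng.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)]


rng = random.Random(0)  # fixed seed so the run is repeatable
mse_hat = estimate_mse(sample_mean, draw_sample, MU, 20_000, rng)
rmse_hat = math.sqrt(mse_hat)

# For the sample mean the exact MSE is sigma^2 / sample size.
theoretical_mse = SIGMA ** 2 / SAMPLE_SIZE
print(mse_hat, rmse_hat, theoretical_mse)
```

With enough repetitions, `mse_hat` should land close to `theoretical_mse`; the number of repetitions here plays the role of $$n$$ in the formula above and is separate from the sample size.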

Examples
Suppose we have a random sample of size n from a normally distributed population, $$X_1,\dots,X_n\sim\operatorname{N}(\mu,\sigma^2)$$.

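Some commonly used estimators of the true parameters of the population, $$\mu$$ and $$\sigma^2$$, are:
 * the sample mean, $$\overline{X}=\frac{1}{n}\sum_{i=1}^n X_i$$, which is unbiased for $$\mu$$ and has $$\operatorname{MSE}(\overline{X})=\frac{\sigma^2}{n}$$;
 * the sample variance, $$S^2_{n-1}=\frac{1}{n-1}\sum_{i=1}^n\left(X_i-\overline{X}\,\right)^2$$, which is unbiased for $$\sigma^2$$ and has $$\operatorname{MSE}(S^2_{n-1})=\frac{2\sigma^4}{n-1}$$.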

These examples also illustrate one facet of the bias-variance decomposition. The MSE of an unbiased estimator is just its variance, whereas the MSE of a biased estimator has a non-zero squared-bias term as well as a variance term. Note that the estimator that minimizes the MSE is not necessarily unbiased; it can compensate for its bias with a smaller variance. For the normal sample above, the biased variance estimator $$S^2 = \frac{1}{n}\sum_{i=1}^n\left(X_i-\overline{X}\,\right)^2$$ actually has a smaller mean squared error than the unbiased estimator that divides by $$n-1$$ instead of $$n$$, despite being biased by $$- \frac{1}{n} \sigma^2$$.
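
The comparison between the two variance estimators can be checked numerically. The sketch below is a small simulation under assumed values ($$\sigma^2=4$$, sample size 5, a fixed random seed), not a proof; the function name `variance_estimate` is hypothetical.

```python
import random


def variance_estimate(xs, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / divisor


# Assumed values for illustration only.
SIGMA2 = 4.0   # true population variance
N = 5          # sample size
REPS = 40_000  # number of simulated samples

rng = random.Random(1)
sq_err_unbiased = 0.0
sq_err_biased = 0.0
for _ in range(REPS):
    xs = [rng.gauss(0.0, SIGMA2 ** 0.5) for _ in range(N)]
    sq_err_unbiased += (variance_estimate(xs, N - 1) - SIGMA2) ** 2
    sq_err_biased += (variance_estimate(xs, N) - SIGMA2) ** 2

mse_unbiased = sq_err_unbiased / REPS  # estimator dividing by n - 1
mse_biased = sq_err_biased / REPS      # estimator dividing by n

# Exact values for a normal population, for comparison.
theory_unbiased = 2 * SIGMA2 ** 2 / (N - 1)
theory_biased = (2 * N - 1) * SIGMA2 ** 2 / N ** 2
print(mse_unbiased, theory_unbiased)
print(mse_biased, theory_biased)
```

Both estimators are evaluated on the same simulated samples, so the difference between the two estimated MSEs reflects the change of divisor rather than sampling noise.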

Applications

 * In statistical modelling, the MSE is an average of the squared differences between the actual observations and the responses predicted by the model. It is used to assess how well the model fits the data and whether the model can be simplified by removing terms.
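
As a rough illustration of the modelling use, the sketch below computes the MSE of two candidate models on a small made-up data set: a least-squares straight line and a constant (mean-only) model. The data, the helper `mean_squared_error` and the choice to divide by the number of observations are assumptions for this example; some texts divide by the residual degrees of freedom instead.

```python
def mean_squared_error(observed, predicted):
    """Average of squared differences between observations and predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / len(observed)


# Made-up observations, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

# Ordinary least-squares fit of a straight line y = intercept + slope * x.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

mse_line = mean_squared_error(ys, [intercept + slope * x for x in xs])
mse_constant = mean_squared_error(ys, [y_bar] * len(ys))  # model without the slope term
print(mse_line, mse_constant)
```

Here dropping the slope term raises the MSE sharply, which suggests that the term should be kept; a removal that barely changed the MSE would point towards the simpler model.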