Linear modeling

In statistics, the linear model is the model given by


 * $$Y = X \beta + \varepsilon$$

where $$Y$$ is an $$n \times 1$$ column vector of random variables, $$X$$ is an $$n \times p$$ matrix of "known" (i.e., observable and non-random) quantities whose rows correspond to statistical units, $$\beta$$ is a $$p \times 1$$ vector of (unobservable) parameters, and $$\varepsilon$$ is an $$n \times 1$$ vector of "errors", which are uncorrelated random variables each with expected value 0 and variance $$\sigma^2$$. Often one takes the components of the vector of errors to be independent and normally distributed. Having observed the values of $$X$$ and $$Y$$, the statistician must estimate $$\beta$$ and $$\sigma^2$$. Typically the parameters $$\beta$$ are estimated by the method of maximum likelihood, which in the case of normal errors is equivalent to the method of least squares. Separately, the Gauss-Markov theorem shows that the least-squares estimator is the best linear unbiased estimator of $$\beta$$ whenever the errors are uncorrelated with expected value 0 and common variance, whether or not they are normal.
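As a rough illustration of the estimation step, the following Python sketch computes the least-squares estimate of $$\beta$$ and an estimate of $$\sigma^2$$ with NumPy. The function name `fit_ols`, the simulated data, and the choice of dividing the residual sum of squares by n - p (the usual unbiased estimate, whereas the maximum-likelihood estimate divides by n) are assumptions made for this example.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares estimate of beta and an unbiased estimate of sigma^2.

    Hypothetical helper for illustration; assumes X has full column rank.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # lstsq minimizes the sum of squared residuals ||y - X beta||^2
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta_hat
    # Dividing by n - p gives the usual unbiased estimate;
    # the maximum-likelihood estimate would divide by n instead.
    sigma2_hat = residuals @ residuals / (n - p)
    return beta_hat, sigma2_hat

# Simulated data; all numeric values here are illustrative assumptions
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

beta_hat, sigma2_hat = fit_ols(X, y)
print(beta_hat, sigma2_hat)
```

At the least-squares solution the residuals are orthogonal to every column of $$X$$, which is a convenient check on any implementation.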

If, rather than taking the covariance matrix of $$\varepsilon$$ to be $$\sigma^2 I$$, where $$I$$ is the $$n \times n$$ identity matrix, one assumes it is $$\sigma^2 M$$, where $$M$$ is a known positive-definite matrix other than the identity matrix, then one estimates $$\beta$$ by the method of "generalized least squares". Instead of minimizing the sum of squares of the residuals, one minimizes a different quadratic form in the residuals, namely the one given by the matrix $$M^{-1}$$, that is, $$(Y - X\beta)' M^{-1} (Y - X\beta)$$. This leads to the estimator


 * $$\widehat{\beta}=\left(X'M^{-1}X\right)^{-1}X'M^{-1}Y$$

which is the best linear unbiased estimator for $$\beta$$. If all of the off-diagonal entries of the matrix $$M$$ are 0, the method reduces to "weighted least squares", with weights proportional to the reciprocals of the diagonal entries of $$M$$.
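The generalized and weighted least-squares estimators described above can be sketched in NumPy as follows. The function names `fit_gls` and `fit_wls` and the simulated data are hypothetical, chosen only for this example; the sketch assumes $$M$$ is positive definite and that the design matrix has full column rank.

```python
import numpy as np

def fit_gls(X, y, M):
    """Generalized least squares: (X' M^-1 X)^-1 X' M^-1 y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    M = np.asarray(M, dtype=float)
    # Solve linear systems with M instead of forming M^-1 explicitly
    Minv_X = np.linalg.solve(M, X)
    Minv_y = np.linalg.solve(M, y)
    return np.linalg.solve(X.T @ Minv_X, X.T @ Minv_y)

def fit_wls(X, y, variances):
    """Weighted least squares for diagonal M, with weights 1 / M_ii."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    Xw = X * w[:, None]  # each row of X scaled by its weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Simulated heteroscedastic data; all values are illustrative assumptions
rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
variances = rng.uniform(0.5, 2.0, size=n)
y = X @ np.array([1.0, -3.0]) + rng.normal(size=n) * np.sqrt(variances)

beta_gls = fit_gls(X, y, np.diag(variances))
beta_wls = fit_wls(X, y, variances)
print(beta_gls, beta_wls)
```

With a diagonal $$M$$ the two functions return the same estimate, and with $$M$$ equal to the identity matrix `fit_gls` reduces to ordinary least squares.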

Ordinary linear regression is a very closely related topic.

Generalized linear models
In generalized linear models, rather than


 * $$E(Y) = X\beta,$$

one has


 * $$g(E(Y)) = X\beta,$$

where g is the "link function". An example is the Poisson regression model, which states that


 * $$Y_i$$ has a Poisson distribution with expected value $$e^{\gamma + \delta x_i}$$.

The link function here is the natural logarithm, since $$\log E(Y_i) = \gamma + \delta x_i$$. Having observed $$x_i$$ and $$Y_i$$ for $$i = 1, \ldots, n$$, one can estimate $$\gamma$$ and $$\delta$$ by the method of maximum likelihood.
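One way to carry out this maximum-likelihood fit is Newton's method on the Poisson log-likelihood, sketched below in NumPy. The function name `fit_poisson`, the starting values, the iteration limits, and the simulated data are all assumptions made for this example; other optimizers would serve equally well.

```python
import numpy as np

def fit_poisson(x, y, n_iter=25, tol=1e-10):
    """Maximum-likelihood estimates of (gamma, delta) by Newton's method.

    Hypothetical helper; assumes E(Y_i) = exp(gamma + delta * x_i)
    and that the sample mean of y is positive.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    # Start from the intercept-only fit: gamma = log(mean(y)), delta = 0
    theta = np.array([np.log(y.mean()), 0.0])
    for _ in range(n_iter):
        mu = np.exp(X @ theta)           # current fitted means
        score = X.T @ (y - mu)           # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])   # negative Hessian, X' diag(mu) X
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta[0], theta[1]

# Simulated data; the true values 0.5 and 1.0 are illustrative assumptions
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, size=500)
y = rng.poisson(np.exp(0.5 + 1.0 * x))

gamma_hat, delta_hat = fit_poisson(x, y)
print(gamma_hat, delta_hat)
```

At the maximum the score equations hold: the residuals $$Y_i - e^{\gamma + \delta x_i}$$ sum to zero, both unweighted and weighted by $$x_i$$.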

General linear model
The general linear model (or multivariate regression model) is a linear model with multiple measurements per object. The measurements for each object may be collected into a vector.
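A minimal sketch of the multivariate case follows, assuming the common formulation in which the measurement vectors for the n objects are stacked as the rows of an n×m matrix Y and the model is Y = XB + E with a p×m coefficient matrix B. That formulation, the dimensions, and the simulated data are assumptions of this example rather than statements from the text above.

```python
import numpy as np

# Simulated data; all dimensions and values are illustrative assumptions
rng = np.random.default_rng(3)
n, p, m = 100, 3, 2   # objects, predictors, measurements per object
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
B_true = np.array([[1.0, -1.0],
                   [2.0,  0.5],
                   [0.0,  3.0]])             # p x m coefficient matrix
Y = X @ B_true + rng.normal(scale=0.3, size=(n, m))

# lstsq accepts a matrix right-hand side and fits all m columns at once
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The same estimate obtained by fitting each measurement separately
B_by_column = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(m)]
)
print(B_hat)
```

The least-squares estimate of B coincides column by column with separate least-squares fits for each measurement; the multivariate structure matters mainly when modeling the covariance between measurements.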