Error analysis (statistics)

Error analysis is the study of the kind and quantity of error that occurs, particularly in the fields of applied mathematics (especially numerical analysis), applied linguistics and statistics.

Error analysis in numerical modelling
In numerical simulation or modelling of real systems, error analysis is concerned with the changes in the output of the model as the parameters to the model vary about a mean.

For instance, consider a system modelled as a function of two variables, $$z = f(x,y)$$. Error analysis deals with the propagation of the numerical errors in $$x$$ and $$y$$ (around mean values $$\bar{x}$$ and $$\bar{y}$$) to the error in $$z$$ (around a mean $$\bar{z}$$).
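As an illustration of such propagation, the following Python sketch uses the common first-order (linearised) estimate $$\sigma_z^2 \approx (\partial f/\partial x)^2 \sigma_x^2 + (\partial f/\partial y)^2 \sigma_y^2$$. This formula is not stated in the text above; it assumes that the errors in $$x$$ and $$y$$ are small and uncorrelated. The function name `propagate_error`, the step size `h` and the example model $$f(x,y) = xy$$ are hypothetical choices made for the example.

```python
import math

def propagate_error(f, x_mean, y_mean, sigma_x, sigma_y, h=1e-6):
    """First-order estimate of the mean and error of z = f(x, y)."""
    # Central finite differences approximate the partial derivatives
    # of f at the mean values of x and y.
    dfdx = (f(x_mean + h, y_mean) - f(x_mean - h, y_mean)) / (2 * h)
    dfdy = (f(x_mean, y_mean + h) - f(x_mean, y_mean - h)) / (2 * h)
    z_mean = f(x_mean, y_mean)
    # Assumption: errors in x and y are small and uncorrelated,
    # so their contributions add in quadrature.
    sigma_z = math.sqrt((dfdx * sigma_x) ** 2 + (dfdy * sigma_y) ** 2)
    return z_mean, sigma_z

# Hypothetical model z = x * y with x = 2.0 +/- 0.1 and y = 3.0 +/- 0.2.
z_mean, sigma_z = propagate_error(lambda x, y: x * y, 2.0, 3.0, 0.1, 0.2)
print(f"z = {z_mean:.3f} +/- {sigma_z:.3f}")
```

For this model the partial derivatives are 3 and 2, so the estimated error is $$\sqrt{0.3^2 + 0.4^2} = 0.5$$ around $$\bar{z} = 6$$.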

In numerical analysis, error analysis comprises both forward error analysis and backward error analysis. Forward error analysis involves the analysis of a function $$z' = f'(a_0,a_1,\dots,a_n)$$ which is an approximation (usually a finite polynomial) to a function $$z = f(a_0,a_1,\dots,a_n)$$ to determine the bounds on the error in the approximation, i.e. to find $$\epsilon$$ such that $$0 \le |z - z'| \le \epsilon$$. Backward error analysis involves the analysis of the approximation function $$z' = f'(a_0,a_1,\dots,a_n)$$, to determine the bounds on the parameters $$a_i = \bar{a_i} \pm \epsilon_i$$ such that the result $$z' = z$$.
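The two viewpoints can be compared on a small example. The sketch below takes $$f = \exp$$ and, as the approximation $$f'$$, a truncated Taylor polynomial; both choices, and the names `exp_taylor`, `forward_error` and `backward_error`, are assumptions made for illustration. The forward error is $$|z - z'|$$, while the backward error is the size of the input perturbation for which the exact function reproduces the approximate result.

```python
import math

def exp_taylor(x, n_terms=6):
    """Truncated Taylor polynomial of exp(x) about 0 (plays the role of f')."""
    term, total = 1.0, 1.0
    for k in range(1, n_terms):
        term *= x / k          # term is now x**k / k!
        total += term
    return total

x = 0.5
z_exact = math.exp(x)          # z  = f(x)
z_approx = exp_taylor(x)       # z' = f'(x)

# Forward error: distance between the exact and approximate results.
forward_error = abs(z_exact - z_approx)

# Backward error: perturbation of the input for which the exact function
# gives the approximate result, i.e. exp(x_perturbed) == z_approx.
x_perturbed = math.log(z_approx)
backward_error = abs(x_perturbed - x)

print(f"forward error  = {forward_error:.2e}")
print(f"backward error = {backward_error:.2e}")
```

Because the exponential can be inverted with a logarithm, the backward error is available in closed form here; for most functions it has to be bounded rather than computed.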

Error analysis in language teaching
In language teaching, error analysis studies the types and causes of language errors. Errors are classified according to:
 * modality (i.e. level of proficiency in speaking, writing, reading, listening)
 * linguistic levels (i.e. pronunciation, grammar, vocabulary, style)
 * form (e.g. omission, insertion, substitution)
 * type (systematic errors/errors in competence vs. occasional errors/errors in performance)
 * cause (e.g. interference, interlanguage)
 * norm vs. system

Error analysis in molecular dynamics simulation
In molecular dynamics (MD) simulations, errors arise from inadequate sampling of the phase space or from infrequently occurring events; these lead to statistical error due to random fluctuations in the measurements.

For a series of M measurements of a fluctuating property A, the mean value is:


 * $$ \langle A \rangle = \frac{1}{M} \sum_{\mu=1}^M A_{\mu}. $$

When these M measurements are independent, the variance of the mean is:


 * $$ \sigma^{2}( \langle A \rangle ) = \frac{1}{M} \sigma^{2}(A), $$

but in most MD simulations, the values of A at different times are correlated, so this expression underestimates the variance of the mean, because the effective number of independent measurements is actually less than M. In such situations we rewrite the variance as:


 * $$ \sigma^{2}( \langle A \rangle ) = \frac{1}{M} \sigma^{2}(A) \left[ 1 + 2 \sum_\mu \left( 1 - \frac{\mu}{M} \right) \phi_{\mu} \right],$$

where $$\phi_{\mu}$$ is the autocorrelation function defined by


 * $$ \phi_{\mu} = \frac{ \langle A_{\mu}A_{0} \rangle - \langle A \rangle^{2} }{ \langle A^{2} \rangle - \langle A \rangle^{2}}.$$
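The correlated-variance formula above can be evaluated directly from a time series. The following Python sketch is one possible way to do so, under stated assumptions: the sum over $$\mu$$ is truncated at a cutoff `max_lag` (a practical choice not specified in the text), the autocorrelation is estimated from mean-subtracted products, and the test data come from a synthetic AR(1) process standing in for an MD observable. All function and variable names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def autocorrelation(series, lag):
    """Estimate the normalised autocorrelation phi at the given lag."""
    avg = mean(series)
    var = mean([(a - avg) ** 2 for a in series])   # <A^2> - <A>^2
    # Average of mean-subtracted products over all pairs separated by lag.
    cov = mean([(series[i] - avg) * (series[i + lag] - avg)
                for i in range(len(series) - lag)])
    return cov / var

def variance_of_mean(series, max_lag):
    """Variance of the mean including the autocorrelation correction."""
    m = len(series)
    avg = mean(series)
    var = mean([(a - avg) ** 2 for a in series])
    # Assumption: the sum over lags is cut off at max_lag.
    correction = 1.0 + 2.0 * sum(
        (1.0 - lag / m) * autocorrelation(series, lag)
        for lag in range(1, max_lag + 1)
    )
    return var / m * correction

# Synthetic correlated data (AR(1) process), standing in for MD output.
random.seed(0)
rho = 0.9
series = [0.0]
for _ in range(3999):
    series.append(rho * series[-1] + random.gauss(0.0, 1.0))

naive = variance_of_mean(series, max_lag=0)       # independent-sample formula
corrected = variance_of_mean(series, max_lag=50)  # with correlation correction
print(f"naive variance of mean:     {naive:.5f}")
print(f"corrected variance of mean: {corrected:.5f}")
```

With `max_lag=0` the correction factor is 1 and the function reduces to the expression for independent measurements, which makes the size of the underestimate easy to see.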

The autocorrelation function can then be used to estimate the error bar. A much simpler alternative is a method based on block averaging.
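Block averaging can be sketched as follows: the series is divided into consecutive non-overlapping blocks, each block is replaced by its mean, and the block means are treated as approximately independent measurements once the blocks are longer than the correlation time. The Python code below is a minimal sketch under these assumptions; the function name `block_average_error`, the block sizes and the synthetic AR(1) test series are hypothetical choices, not taken from the text.

```python
import math
import random

def block_average_error(series, block_size):
    """Return (mean, standard error) estimated from non-overlapping block means."""
    n_blocks = len(series) // block_size
    # Trailing samples that do not fill a whole block are discarded.
    block_means = [
        sum(series[b * block_size:(b + 1) * block_size]) / block_size
        for b in range(n_blocks)
    ]
    grand_mean = sum(block_means) / n_blocks
    # Unbiased sample variance of the block means.
    var_blocks = sum((bm - grand_mean) ** 2 for bm in block_means) / (n_blocks - 1)
    # Block means are treated as independent, so the error is sqrt(var / n_blocks).
    return grand_mean, math.sqrt(var_blocks / n_blocks)

# Synthetic correlated data (AR(1) process), standing in for MD output.
random.seed(0)
series = [0.0]
for _ in range(3999):
    series.append(0.9 * series[-1] + random.gauss(0.0, 1.0))

for block_size in (1, 10, 100, 400):
    avg, err = block_average_error(series, block_size)
    print(f"block size {block_size:4d}: mean = {avg:.3f}, error = {err:.3f}")
```

In practice the error estimate is examined as a function of block size: for correlated data it typically grows with the block size and then levels off, and the plateau value is taken as the error bar. Very large blocks leave few block means, so the estimate itself becomes noisy.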

Error Analysis in Undergraduate Science Laboratory
Error Analysis in an Undergraduate Science Laboratory