In probability theory, there exist several different notions of convergence of random variables. The convergence (in one of the senses presented below) of sequences of random variables to some limiting random variable is an important concept in probability theory and in its applications to statistics and stochastic processes. For example, if the average of *n* independent, identically distributed random variables *Y*_{i}, *i* = 1, ..., *n*, with finite mean is given by

- $ X_n = \frac{1}{n}\sum_{i=1}^n Y_i\,, $

then as *n* goes to infinity, *X*_{n} converges *in probability* (see below) to the common mean, μ, of the random variables *Y*_{i}. This result is known as the weak law of large numbers. Other forms of convergence are important in other useful theorems, including the central limit theorem.
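The weak law can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the *Y*_{i} are taken to be Uniform(0, 1), so μ = 0.5, and the tolerance 0.05, the number of trials and the helper names `sample_mean` and `deviation_frequency` are all hypothetical choices, not part of the article. It estimates the probability that *X*_{n} misses μ by at least the tolerance, for growing *n*.

```python
import random


def sample_mean(n, rng):
    """Average of n i.i.d. Uniform(0, 1) draws (common mean mu = 0.5)."""
    return sum(rng.random() for _ in range(n)) / n


def deviation_frequency(n, eps, trials, rng):
    """Monte Carlo estimate of P(|X_n - mu| >= eps) with mu = 0.5."""
    mu = 0.5
    hits = sum(abs(sample_mean(n, rng) - mu) >= eps for _ in range(trials))
    return hits / trials


rng = random.Random(0)  # fixed seed so that the run is repeatable
estimates = {
    n: deviation_frequency(n, eps=0.05, trials=2000, rng=rng)
    for n in (10, 100, 1000)
}
for n, p in estimates.items():
    print(f"n = {n:5d}   estimated P(|X_n - 0.5| >= 0.05) = {p:.3f}")
```

The estimated frequencies should shrink towards 0 as *n* grows, which is what convergence in probability to μ asserts for this fixed tolerance.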

Throughout the following, we assume that (*X*_{n}) is a sequence of random variables, and *X* is a random variable, and all of them are defined on the same probability space (Ω, *F*, P).

## Convergence in distribution

Suppose that *F*_{1}, *F*_{2}, ... is a sequence of cumulative distribution functions corresponding to random variables *X*_{1}, *X*_{2}, ..., and that *F* is a distribution function corresponding to a random variable *X*. We say that the sequence *X*_{n} converges towards *X* **in distribution**, if

- $ \lim_{n\rightarrow\infty} F_n(a) = F(a), $

for every real number *a* at which *F* is continuous. Since *F*(*a*) = Pr(*X* ≤ *a*), this means that the probability that the value of *X*_{n} falls in a given range is approximately equal to the probability that the value of *X* falls in that range, provided *n* is large enough. Convergence in distribution is often denoted by adding the letter $ \mathcal D $ over an arrow indicating convergence:

- $ X_n \xrightarrow{\mathcal{D}} X. $
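The definition can be checked directly in a case where both distribution functions are known in closed form. As an assumed example (not taken from the article), let *X*_{n} = *n* · min(*U*_{1}, ..., *U*_{n}) with the *U*_{i} independent and Uniform(0, 1); then *F*_{n}(*a*) = 1 − (1 − *a*/*n*)^{n} for 0 ≤ *a* ≤ *n*, and the limit is the Exponential(1) distribution function. The function names `cdf_n` and `cdf_limit` are hypothetical.

```python
import math


def cdf_n(a, n):
    """Exact CDF of X_n = n * min(U_1, ..., U_n), U_i i.i.d. Uniform(0, 1)."""
    if a <= 0:
        return 0.0
    if a >= n:
        return 1.0
    return 1.0 - (1.0 - a / n) ** n


def cdf_limit(a):
    """CDF of the Exponential(1) limit distribution."""
    return 1.0 - math.exp(-a) if a > 0 else 0.0


a = 1.5  # any real number works here, since the limit CDF is continuous
sizes = (2, 10, 100, 1000, 10000)
gaps = [abs(cdf_n(a, n) - cdf_limit(a)) for n in sizes]
for n, gap in zip(sizes, gaps):
    print(f"n = {n:6d}   |F_n(a) - F(a)| = {gap:.6f}")
```

The gap between *F*_{n}(*a*) and *F*(*a*) shrinks as *n* grows, which is the pointwise convergence the definition asks for at this *a*.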

Convergence in distribution is the weakest form of convergence, and is sometimes called **weak convergence**. It does not, in general, imply any other mode of convergence. However, convergence in distribution *is* implied by all other modes of convergence mentioned in this article, and hence it is the most common and often the most useful form of convergence of random variables. It is the notion of convergence used in the central limit theorem; the weak law of large numbers can also be stated in terms of it, because the limit there is a constant.
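A standard counterexample, sketched here under the assumption that *X* is standard normal (any continuous distribution symmetric about 0 would serve equally well), shows why convergence in distribution does not imply convergence in probability. Put *X*_{n} = −*X* for every *n*. By symmetry each *X*_{n} has the same distribution function as *X*, so

- $ F_n(a) = \Pr(-X \leq a) = \Pr(X \leq a) = F(a) \quad \text{for all } a \text{ and all } n, $

and *X*_{n} converges to *X* in distribution. On the other hand, for every ε > 0,

- $ \Pr\left(\left|X_n - X\right| \geq \varepsilon\right) = \Pr\left(2\left|X\right| \geq \varepsilon\right) = \Pr\left(\left|X\right| \geq \varepsilon/2\right) > 0, $

which does not depend on *n* and so cannot tend to 0. Hence *X*_{n} does not converge to *X* in probability (as defined in the next section).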

A useful result, often employed together with the law of large numbers and the central limit theorem, is that if a function *g*: **R** → **R** is continuous and *X*_{n} converges in distribution to *X*, then *g*(*X*_{n}) converges in distribution to *g*(*X*). (This may be proved using Skorokhod's representation theorem.)

Convergence in distribution is also called **convergence in law**, since the word "law" is sometimes used as a synonym of "probability distribution."

## Convergence in probability

We say that the sequence *X*_{n} converges towards *X* **in probability** if

- $ \lim_{n\rightarrow\infty}P\left(\left|X_n-X\right|\geq\varepsilon\right)=0 $

for every ε > 0. Convergence in probability is, indeed, the (pointwise) convergence *of* probabilities. Pick any ε > 0 and any δ > 0, and let *P*_{n} be the probability that *X*_{n} is outside a tolerance ε of *X*. If *X*_{n} converges in probability to *X*, then there exists a value *N* such that *P*_{n} is less than δ for all *n* ≥ *N*.
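The ε, δ, *N* reading can be made concrete in a case where *P*_{n} has a closed form. The sketch below assumes, purely for illustration, that *X*_{n} − *X* = *Z*/*n* with *Z* standard normal, so that *P*_{n} = P(|*Z*| ≥ *n*ε) = erfc(*n*ε/√2); the names `tail_probability` and `smallest_index` are hypothetical.

```python
import math


def tail_probability(n, eps):
    """P_n = P(|X_n - X| >= eps) when X_n - X = Z / n with Z ~ N(0, 1)."""
    return math.erfc(n * eps / math.sqrt(2.0))


def smallest_index(eps, delta):
    """Smallest N with P_N < delta.

    P_n is decreasing in n for this example, so P_n < delta for all n >= N.
    """
    n = 1
    while tail_probability(n, eps) >= delta:
        n += 1
    return n


eps, delta = 0.1, 0.01
N = smallest_index(eps, delta)
print(f"N = {N}, P_N = {tail_probability(N, eps):.5f}")
```

Shrinking either ε or δ pushes *N* further out, but for this sequence a finite *N* always exists, which is what the definition requires.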

Convergence in probability is often denoted by adding the letter 'P' over an arrow indicating convergence:

- $ X_n \xrightarrow{P} X. $

Convergence in probability is the notion of convergence used in the weak law of large numbers.
Convergence in probability implies convergence in distribution. To prove this, it is convenient to first establish the following simple lemma:

### Lemma

Let *X*, *Y* be random variables, *c* a real number and ε > 0; then

- $ \Pr(Y\leq c)\leq \Pr(X\leq c+\varepsilon)+\Pr(\left|Y - X\right|>\varepsilon). $

In fact,

- $ \Pr(Y\leq c)=\Pr(Y\leq c,X\leq c+\varepsilon)+\Pr(Y\leq c,X>c+\varepsilon) $

- $ =\Pr(Y\leq c \vert X\leq c+\varepsilon)\Pr(X\leq c+\varepsilon)+\Pr(Y\leq c,c<X - \varepsilon) $

- $ \leq \Pr(X\leq c+\varepsilon)+\Pr(Y - X<- \varepsilon)\leq \Pr(X\leq c+\varepsilon)+\Pr(\left|Y - X\right|>\varepsilon) $

since

- $ \Pr(\left|Y - X\right|>\varepsilon)=\Pr(Y - X>\varepsilon)+\Pr(Y - X<-\varepsilon)\geq \Pr(Y - X<-\varepsilon). $
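The lemma rests on a set inclusion: whenever *Y* ≤ *c*, either *X* ≤ *c* + ε or |*Y* − *X*| > ε. Because of this, the inequality also holds for empirical frequencies computed from any sample of pairs. The sketch below checks it on a hypothetical joint sample (standard normal *X*, with *Y* equal to *X* plus independent noise); the distributions, the constants and the helper `freq` are assumptions made only for illustration.

```python
import random

rng = random.Random(1)  # fixed seed so that the run is repeatable
c, eps = 0.2, 0.3

# Hypothetical joint sample: X standard normal, Y = X + independent noise.
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
pairs = [(x, x + rng.gauss(0.0, 0.5)) for x in xs]


def freq(event):
    """Empirical probability of an event over the sampled (x, y) pairs."""
    return sum(1 for x, y in pairs if event(x, y)) / len(pairs)


lhs = freq(lambda x, y: y <= c)
rhs = freq(lambda x, y: x <= c + eps) + freq(lambda x, y: abs(y - x) > eps)
print(f"Pr(Y <= c) ~ {lhs:.3f}   bound ~ {rhs:.3f}")
```

The bound is usually far from tight; its value lies in the fact that the second term on the right is exactly the quantity controlled by convergence in probability.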

### Proof

For every ε > 0, due to the preceding lemma, we have:

- $ P(X_n\leq a)\leq P(X\leq a+\varepsilon)+P(\left|X_n - X\right|>\varepsilon) $

- $ P(X\leq a-\varepsilon)\leq P(X_n \leq a)+P(\left|X_n - X\right|>\varepsilon) $

So, we have:

- $ P(X\leq a-\varepsilon)-P(\left|X_n - X\right|>\varepsilon)\leq P(X_n \leq a)\leq P(X\leq a+\varepsilon)+P(\left|X_n - X\right|>\varepsilon) $

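Letting $ n\rightarrow\infty $ and using that $ P(\left|X_n - X\right|>\varepsilon)\rightarrow 0 $ by convergence in probability, we obtain:

- $ P(X\leq a-\varepsilon)\leq \liminf_{n\rightarrow\infty} P(X_n \leq a)\leq \limsup_{n\rightarrow\infty} P(X_n \leq a)\leq P(X\leq a+\varepsilon) $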

But $ P(X\leq a) $ is the cumulative distribution function $ F_X(a) $, which is continuous at *a* by hypothesis (convergence in distribution only needs to be checked at continuity points of $ F_X $), that is:

- $ \lim_{\varepsilon \rightarrow 0^+} F_X(a-\varepsilon)=\lim_{\varepsilon \rightarrow 0^+} F_X(a+\varepsilon)=F_X(a) $

and so, taking the limit for $ \varepsilon \rightarrow 0^+ $, we obtain:

- $ \lim_{n\rightarrow\infty} P(X_n \leq a)=P(X \leq a) $

## Almost sure convergence

We say that the sequence *X*_{n} converges **almost surely** or **almost everywhere** or **with probability 1** or **strongly** towards *X* if

- $ P\left(\lim_{n\rightarrow\infty}X_n=X\right)=1. $

This means that the values of *X*_{n} approach the value of *X* with probability 1, in the sense (see almost surely) that the set of outcomes for which *X*_{n} does not converge to *X* has probability 0. Using the probability space (Ω, *F*, P) and the concept of the random variable as a function from Ω to **R**, this is equivalent to the statement

- $ P\left(\big\{\omega \in \Omega \, | \, \lim_{n \to \infty}X_n(\omega) = X(\omega) \big\}\right) = 1. $

Almost sure convergence implies convergence in probability, and hence implies convergence in distribution. It is the notion of convergence used in the strong law of large numbers.
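The reverse implication fails in general. A common illustration, sketched here as an assumed example rather than one taken from the article, is the "typewriter" sequence on Ω = [0, 1) with the uniform measure: writing *n* = 2^{k} + *j* with 0 ≤ *j* < 2^{k}, let *X*_{n} be the indicator of the interval [*j*/2^{k}, (*j* + 1)/2^{k}). The function names `typewriter` and `prob_nonzero` are hypothetical.

```python
def typewriter(n, omega):
    """X_n(omega) for n >= 1 and omega in [0, 1).

    With n = 2**k + j and 0 <= j < 2**k, X_n is the indicator of the
    dyadic interval [j / 2**k, (j + 1) / 2**k).
    """
    k = n.bit_length() - 1
    j = n - 2 ** k
    return 1 if int(omega * 2 ** k) == j else 0


def prob_nonzero(n):
    """P(X_n != 0), i.e. the length 2**(-k) of the interval."""
    return 2.0 ** -(n.bit_length() - 1)


omega = 0.3  # an arbitrary fixed outcome
hits_per_block = [
    sum(typewriter(n, omega) for n in range(2 ** k, 2 ** (k + 1)))
    for k in range(12)
]
print("hits in each dyadic block:", hits_per_block)
print("P(X_n != 0) at n = 2048:", prob_nonzero(2 ** 11))
```

Here P(*X*_{n} ≠ 0) tends to 0, so *X*_{n} converges to 0 in probability, yet every fixed outcome ω sees the value 1 once in each dyadic block and therefore infinitely often, so *X*_{n}(ω) does not converge for any ω.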

## Convergence in *r*th mean

We say that the sequence *X*_{n} converges **in *r*th mean** or **in the L^{r} norm** towards *X*, if *r* ≥ 1, E|*X*_{n}|^{r} < ∞ for all *n*, and

- $ \lim_{n\rightarrow\infty}\mathrm{E}\left(\left|X_n-X\right|^r\right)=0 $

where the operator E denotes the expected value. Convergence in *r*th mean tells us that the expectation of the *r*th power of the difference between *X*_{n} and *X* converges to zero.

The most important cases of convergence in *r*th mean are:

- When *X*_{n} converges in *r*th mean to *X* for *r* = 1, we say that *X*_{n} converges **in mean** to *X*.
- When *X*_{n} converges in *r*th mean to *X* for *r* = 2, we say that *X*_{n} converges **in mean square** to *X*.
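These two cases are genuinely different. As an assumed example (not from the article), let *X*_{n} take the value *n* with probability 1/*n*² and the value 0 otherwise, and take the candidate limit *X* = 0. Then E|*X*_{n} − 0|^{r} = *n*^{r − 2}, which the sketch below evaluates exactly; the function name `rth_moment` is hypothetical.

```python
from fractions import Fraction


def rth_moment(n, r):
    """E|X_n - 0|**r when X_n = n with probability 1/n**2 and 0 otherwise."""
    return Fraction(n) ** r * Fraction(1, n * n)


for n in (1, 10, 100, 1000):
    print(f"n = {n:4d}   r = 1: {rth_moment(n, 1)}   r = 2: {rth_moment(n, 2)}")
```

The first moment equals 1/*n* and tends to 0, so this sequence converges to 0 in mean, while the second moment stays equal to 1, so it does not converge to 0 in mean square.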

Convergence in *r*th mean, for *r* > 0, implies convergence in probability (by Markov's inequality applied to |*X*_{n} − *X*|^{r}), while if *r* > *s* ≥ 1, convergence in *r*th mean implies convergence in *s*th mean. Hence, convergence in mean square implies convergence in mean.
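Both implications can be made explicit; the following is a sketch using standard inequalities rather than a derivation taken from the references. For ε > 0 and *r* > 0, Markov's inequality applied to |*X*_{n} − *X*|^{r} gives

- $ P\left(\left|X_n - X\right| \geq \varepsilon\right) = P\left(\left|X_n - X\right|^r \geq \varepsilon^r\right) \leq \frac{\mathrm{E}\left(\left|X_n - X\right|^r\right)}{\varepsilon^r}, $

so the left-hand side tends to 0 whenever the *r*th mean of the difference does. For *r* > *s* ≥ 1, Jensen's inequality (in the form often called Lyapunov's inequality) gives

- $ \left(\mathrm{E}\left|X_n - X\right|^s\right)^{1/s} \leq \left(\mathrm{E}\left|X_n - X\right|^r\right)^{1/r}, $

so if the right-hand side tends to 0, the left-hand side does too.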

## Converse implications

The chain of implications between the various notions of convergence is noted in the respective sections above, but it is sometimes important to establish converses to these implications. No implications other than those noted above hold in general, but a number of special cases do permit converses:

- If *X*_{n} converges in distribution to a constant *c*, then *X*_{n} converges in probability to *c*.

- If *X*_{n} converges in probability to *X*, and if Pr(|*X*_{n}| ≤ *b*) = 1 for all *n* and some *b*, then *X*_{n} converges in *r*th mean to *X* for all *r* ≥ 1. In other words, if *X*_{n} converges in probability to *X* and all the random variables *X*_{n} are almost surely bounded by a common constant, then *X*_{n} also converges to *X* in every *r*th mean.

- If for all ε > 0,

- $ \sum_n P\left(|X_n - X| > \varepsilon\right) < \infty, $

then *X*_{n} converges almost surely to *X*. In other words, if *X*_{n} converges in probability to *X* sufficiently quickly (*i.e.* the above sum converges for all ε > 0), then *X*_{n} also converges almost surely to *X*.

- If *S*_{n} is a sum of *n* real independent random variables:

- $ S_n = X_1+\ldots+X_n $

then *S*_{n} converges almost surely if and only if *S*_{n} converges in probability.
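The summability condition in the third item separates "fast" from "slow" convergence in probability. The sketch below compares partial sums for two hypothetical choices of the tail probabilities P(|*X*_{n} − *X*| > ε), namely 1/*n*² and 1/*n*; both sequences and the function names are assumptions made only for illustration.

```python
import math


def partial_sum(prob, n_terms):
    """Sum of the tail probabilities prob(n) over n = 1, ..., n_terms."""
    return sum(prob(n) for n in range(1, n_terms + 1))


def fast_tail(n):
    """Hypothetical P(|X_n - X| > eps) = 1 / n**2 (summable)."""
    return 1.0 / n ** 2


def slow_tail(n):
    """Hypothetical P(|X_n - X| > eps) = 1 / n (not summable)."""
    return 1.0 / n


for n_terms in (10, 1000, 100000):
    print(n_terms, partial_sum(fast_tail, n_terms), partial_sum(slow_tail, n_terms))
```

Both tail sequences tend to 0, so both are compatible with convergence in probability, but only the first has bounded partial sums (below π²/6) and so meets the criterion. The second grows like log *n*, and the criterion then gives no conclusion. If the events {|*X*_{n} − *X*| > ε} were also independent, the second Borel–Cantelli lemma would rule out almost sure convergence.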

## References

- G.R. Grimmett and D.R. Stirzaker (1992). *Probability and Random Processes, 2nd Edition*. Clarendon Press, Oxford, pp. 271–285. ISBN 0198536658.

- M. Jacobsen (1992). *Videregående Sandsynlighedsregning (Advanced Probability Theory) 3rd Edition*. HCØ-tryk, Copenhagen, pp. 18–20. ISBN 87-91180-71-6.

This page uses Creative Commons licensed content from Wikipedia.