Expected value

In probability theory (and especially gambling), the expected value (or mathematical expectation) of a random variable is the sum of the probability of each possible outcome of the experiment multiplied by its payoff ("value"). Thus, it represents the average amount one "expects" to win per bet if bets with identical odds are repeated many times. Note that the value itself may not be expected in the everyday sense; it may be unlikely or even impossible. A game or situation in which the expected value for the player is zero (neither a net gain nor a net loss) is called a "fair game."

For example, an American roulette wheel has 38 equally likely outcomes. A bet placed on a single number pays 35-to-1 (this means that you are paid 35 times your bet and your bet is returned, so you get back 36 times your bet). So the expected value of the profit resulting from a $1 bet on a single number is, considering all 38 possible outcomes:


 * $$\left( -\$1 \times \frac{37}{38} \right) + \left( \$35 \times \frac{1}{38} \right),$$

which is about -$0.0526. Therefore one expects, on average, to lose over five cents for every dollar bet.
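The arithmetic above can be checked with exact fractions. The following Python sketch is only an illustration; the names `outcomes` and `expected_profit` are made up for this example.

```python
from fractions import Fraction

# Each entry is (profit in dollars, probability) for a $1 single-number bet
# on an American roulette wheel with 38 equally likely pockets.
outcomes = [
    (Fraction(-1), Fraction(37, 38)),  # lose the $1 stake
    (Fraction(35), Fraction(1, 38)),   # win 35 times the stake
]

# Expected value: sum of profit times probability over all outcomes.
expected_profit = sum(profit * prob for profit, prob in outcomes)

print(expected_profit)         # -1/19
print(float(expected_profit))  # approximately -0.0526
```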

Mathematical definition
In general, if $$X\,$$ is a random variable defined on a probability space $$(\Omega, P)\,$$, then the expected value of $$X\,$$ (denoted $$\mathrm{E}(X)\,$$ or sometimes $$\langle X \rangle$$ or $$\mathbb{E}(X)$$) is defined as


 * $$\mathrm{E}(X) = \int_\Omega X\, dP$$

where the Lebesgue integral is employed. Note that not all random variables have an expected value, since the integral may not exist (e.g., Cauchy distribution). Two variables with the same probability distribution will have the same expected value, if it is defined.

If $$X$$ is a discrete random variable with values $$x_1$$, $$x_2$$, ... and corresponding probabilities $$p_1$$, $$p_2$$, ... which add up to 1, then $$\mathrm{E}(X)$$ can be computed as the sum or series


 * $$\mathrm{E}(X) = \sum_i p_i x_i\,$$

as in the gambling example mentioned above.
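A minimal Python sketch of the discrete formula follows, assuming a finite list of values. The function name `expected_value` and the fair-die example are illustrative choices, not part of the definition.

```python
from fractions import Fraction

def expected_value(values, probs):
    """Return sum_i p_i * x_i for a finite discrete distribution."""
    if sum(probs) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(p * x for x, p in zip(values, probs))

# A fair six-sided die: values 1..6, each with probability 1/6.
die_values = range(1, 7)
die_probs = [Fraction(1, 6)] * 6

die_mean = expected_value(die_values, die_probs)
print(die_mean)  # 7/2
```

The result 7/2 is not a face the die can show, which illustrates that the expected value need not be a possible outcome.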

If the probability distribution of $$X$$ admits a probability density function $$f(x)$$, then the expected value can be computed as


 * $$\mathrm{E}(X) = \int_{-\infty}^\infty x f(x)\, \mathrm d x.$$
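The integral can be approximated numerically when no closed form is at hand. The sketch below assumes an exponential density with rate 2 (an arbitrary choice for illustration) and uses a simple midpoint rule on a truncated range; the helper name `expected_value_from_density` is hypothetical.

```python
import math

def expected_value_from_density(f, lower, upper, steps=100_000):
    """Approximate the integral of x * f(x) over [lower, upper] (midpoint rule)."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += x * f(x)
    return total * h

RATE = 2.0  # assumed rate parameter of the example distribution

def exponential_density(x):
    """Density of an exponential distribution with rate RATE, zero for x < 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

# The density is negligible beyond x = 20, so the infinite range is truncated there.
approx_mean = expected_value_from_density(exponential_density, 0.0, 20.0)
print(approx_mean)  # close to the exact value 1 / RATE = 0.5
```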

It follows directly from the discrete case definition that if $$X$$ is a constant random variable, i.e. $$X = b$$ for some fixed real number $$b$$, then the expected value of $$X$$ is also $$b$$.

The expected value of an arbitrary function $$g(X)$$ of $$X$$, where $$X$$ has probability density function $$f(x)$$, is given by


 * $$\mathrm{E}(g(X)) = \int_{-\infty}^\infty g(x) f(x)\, \mathrm d x.$$
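As an illustration, the formula can be evaluated numerically for a uniform density on the interval from 0 to 1 and the function that squares its argument; both are assumptions chosen for this sketch, and `expectation_of_function` is a hypothetical helper using a midpoint rule.

```python
def expectation_of_function(g, f, lower, upper, steps=10_000):
    """Approximate the integral of g(x) * f(x) over [lower, upper] (midpoint rule)."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += g(x) * f(x)
    return total * h

def uniform_density(x):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def square(x):
    return x * x

second_moment = expectation_of_function(square, uniform_density, 0.0, 1.0)
print(second_moment)  # close to the exact value 1/3
```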

Linearity
The expected value operator (or expectation operator) $$\mathrm{E}$$ is linear in the sense that


 * $$\mathrm{E}(a X + b Y) = a \mathrm{E}(X) + b \mathrm{E}(Y)\,$$

for any two random variables $$X$$ and $$Y$$ (which need to be defined on the same probability space) and any real numbers $$a$$ and $$b$$.
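Linearity does not require independence. The following sketch checks the identity on a small joint distribution in which the two variables are dependent; the distribution and the constants are made up for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y): maps (x, y) to P(X = x, Y = y).
# X and Y are deliberately dependent here.
joint = {
    (0, 0): Fraction(1, 2),
    (1, 0): Fraction(1, 6),
    (1, 2): Fraction(1, 3),
}

def expect(h):
    """Expected value of h(X, Y) under the joint distribution."""
    return sum(p * h(x, y) for (x, y), p in joint.items())

a, b = 3, -2
lhs = expect(lambda x, y: a * x + b * y)                        # E(aX + bY)
rhs = a * expect(lambda x, y: x) + b * expect(lambda x, y: y)   # aE(X) + bE(Y)

print(lhs, rhs)  # 1/6 1/6
```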

Iterated expectation
For any two random variables $$X,Y$$ one may define the conditional expectation:


 * $$ \mathrm{E}[X|Y](y) = \mathrm{E}[X|Y=y] = \sum_x x \cdot \mathrm{P}(X=x|Y=y).$$

Then the expectation of $$X$$ satisfies



 * $$\begin{matrix} \mathrm{E} \left( \mathrm{E}[X|Y] \right) & = & \sum_y \mathrm{E}[X|Y=y] \cdot \mathrm{P}(Y=y) \\ & = & \sum_y \left( \sum_x x \cdot \mathrm{P}(X=x|Y=y) \right) \cdot \mathrm{P}(Y=y) \\ & = & \sum_y \sum_x x \cdot \mathrm{P}(X=x|Y=y) \cdot \mathrm{P}(Y=y) \\ & = & \sum_y \sum_x x \cdot \mathrm{P}(Y=y|X=x) \cdot \mathrm{P}(X=x) \\ & = & \sum_x x \cdot \mathrm{P}(X=x) \cdot \left( \sum_y \mathrm{P}(Y=y|X=x) \right) \\ & = & \sum_x x \cdot \mathrm{P}(X=x) \\ & = & \mathrm{E}[X]. \end{matrix}$$

Hence, the following equation holds:


 * $$\mathrm{E}[X] = \mathrm{E} \left( \mathrm{E}[X|Y] \right).$$

The right-hand side of this equation is referred to as the iterated expectation. This proposition is treated under the law of total expectation.
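The identity can be verified on a small discrete example. The sketch below uses a made-up joint distribution and computes the conditional expectation for each value of the conditioning variable before averaging; all names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution: maps (x, y) to P(X = x, Y = y).
joint = {
    (1, 0): Fraction(1, 4),
    (3, 0): Fraction(1, 4),
    (2, 1): Fraction(1, 3),
    (6, 1): Fraction(1, 6),
}

# Marginal distribution P(Y = y).
p_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    p_y[y] += p

def cond_expectation_x_given(y):
    """E[X | Y = y] = sum_x x * P(X = x, Y = y) / P(Y = y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

# Outer expectation over Y of the inner conditional expectation.
iterated = sum(cond_expectation_x_given(y) * p for y, p in p_y.items())

# Direct expectation of X from the joint distribution.
direct = sum(x * p for (x, _), p in joint.items())

print(iterated, direct)  # 8/3 8/3
```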

Inequality
If a random variable X is always less than or equal to another random variable Y, the expectation of X is less than or equal to that of Y:

If $$ X \leq Y$$, then $$ \mathrm{E}[X] \leq \mathrm{E}[Y]$$.

In particular, since $$ X \leq |X| $$ and $$ -X \leq |X| $$, the absolute value of the expectation of a random variable is less than or equal to the expectation of its absolute value:


 * $$|\mathrm{E}[X]| \leq \mathrm{E}[|X|]$$
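A quick numerical check of this inequality, using a made-up distribution that takes both positive and negative values, might look as follows.

```python
from fractions import Fraction

# Hypothetical distribution: maps each value to its probability.
pmf = {-3: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())           # E[X]
mean_abs = sum(abs(x) * p for x, p in pmf.items())  # E[|X|]

print(abs(mean), mean_abs)  # 1/4 7/4
```

The gap between the two numbers arises because positive and negative values partly cancel in the expectation of the variable itself but not in the expectation of its absolute value.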

Representation
The following formula holds for any nonnegative real-valued random variable $$ X $$ (such that $$ \mathrm{E}[X] < \infty $$), and positive real number $$ \alpha $$:


 * $$ \mathrm{E}[X^\alpha] = \alpha \int_{0}^{\infty} t^{\alpha -1}\mathrm{P}(X>t) \mathrm d t.$$
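
As a worked instance (the distribution is an assumption chosen for illustration), let $$X$$ be exponentially distributed with rate $$\lambda > 0$$, so that $$\mathrm{P}(X>t) = e^{-\lambda t}$$ for $$t \geq 0$$. Taking $$\alpha = 2$$ and integrating by parts gives


 * $$ \mathrm{E}[X^2] = 2 \int_{0}^{\infty} t\, e^{-\lambda t}\, \mathrm d t = \frac{2}{\lambda^2},$$

which agrees with the second moment computed directly from the density $$f(x) = \lambda e^{-\lambda x}$$.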

Non-multiplicativity
In general, the expected value operator is not multiplicative, i.e. $$\mathrm{E}(X Y)$$ is not necessarily equal to $$\mathrm{E}(X) \mathrm{E}(Y)$$. Equality holds if $$X$$ and $$Y$$ are independent or, more generally, uncorrelated. This lack of multiplicativity motivates the study of covariance and correlation.
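The following sketch contrasts a dependent pair with an independent pair. Both joint distributions are made up for illustration, and `expect` is a hypothetical helper.

```python
from fractions import Fraction

def expect(joint, h):
    """Expected value of h(X, Y) under a joint distribution {(x, y): probability}."""
    return sum(p * h(x, y) for (x, y), p in joint.items())

# Dependent case: Y always equals X, where X is a fair coin with values +1 and -1.
dependent = {(1, 1): Fraction(1, 2), (-1, -1): Fraction(1, 2)}

# Independent case: two separate fair coins with values +1 and -1.
independent = {(x, y): Fraction(1, 4) for x in (1, -1) for y in (1, -1)}

dep_product = expect(dependent, lambda x, y: x * y)
dep_means = expect(dependent, lambda x, y: x) * expect(dependent, lambda x, y: y)

ind_product = expect(independent, lambda x, y: x * y)
ind_means = expect(independent, lambda x, y: x) * expect(independent, lambda x, y: y)

print(dep_product, dep_means)  # 1 0
print(ind_product, ind_means)  # 0 0
```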

Functional non-invariance
In general, the expectation operator and functions of random variables do not commute; that is


 * $$\mathrm{E}(g(X)) = \int_{\Omega} g(X)\, \mathrm d P \neq g(\operatorname{E}X),$$

except in special cases, such as the linear functions covered by the linearity property above.
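A fair die with the squaring function gives a concrete case where the two quantities differ. The example below is a sketch with illustrative names.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def g(x):
    return x * x

e_of_g = sum(g(x) * p for x, p in die.items())   # E(g(X))
g_of_e = g(sum(x * p for x, p in die.items()))   # g(E(X))

print(e_of_g, g_of_e)  # 91/6 49/4
```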

Uses and applications of the expected value
The expected values of the powers of $$X$$ are called the moments of $$X$$; the moments about the mean of $$X$$ are expected values of powers of $$X - \mathrm{E}(X)$$. The moments of some random variables can be used to specify their distributions, via their moment generating functions.

To empirically estimate the expected value of a random variable, one repeatedly measures independent observations of the variable and computes the arithmetic mean of the results. This estimates the true expected value in an unbiased manner and has the property of minimizing the sum of the squares of the residuals (the sum of the squared differences between the observations and the estimate). The law of large numbers states that (under fairly mild conditions) this estimate converges to the expected value as the size of the sample gets larger; when the variable has finite variance, the variance of the estimate also shrinks in inverse proportion to the sample size.
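The estimation procedure can be sketched as a small simulation. The example assumes a fair six-sided die, whose expected value is 3.5, and uses a fixed seed so that a run is repeatable; the helper names are hypothetical.

```python
import random

def sample_mean(draw, n):
    """Arithmetic mean of n independent observations produced by draw()."""
    return sum(draw() for _ in range(n)) / n

rng = random.Random(12345)  # fixed seed so the run is repeatable

def roll_die():
    """One observation of a fair six-sided die."""
    return rng.randint(1, 6)

for n in (10, 1_000, 100_000):
    print(n, sample_mean(roll_die, n))
# The printed means tend to settle near the expected value 3.5 as n grows.
```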

In classical mechanics, the center of mass is an analogous concept to expectation. For example, suppose $$X$$ is a discrete random variable with values $$x_i$$ and corresponding probabilities $$p_i$$. Now consider a weightless rod on which are placed weights, at locations $$x_i$$ along the rod and having masses $$p_i$$ (whose sum is one). The point at which the rod balances (its center of gravity) is $$\mathrm{E}(X)$$. (Note, however, that the center of mass and the center of gravity coincide only in a uniform gravitational field.)
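The balance condition can be checked directly: the moments of the weights about the point $$\mathrm{E}(X)$$ sum to zero. The positions and masses below are made up for illustration.

```python
from fractions import Fraction

# Hypothetical weights on a rod: positions x_i and masses p_i (summing to one).
positions = [0, 2, 10]
masses = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# The candidate balance point is the expected value sum_i p_i * x_i.
balance_point = sum(m * x for x, m in zip(positions, masses))

# Net moment of the weights about that point; zero means the rod balances there.
net_moment = sum(m * (x - balance_point) for x, m in zip(positions, masses))

print(balance_point, net_moment)  # 3 0
```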

Expectation of matrices
If $$X$$ is an $$m \times n$$ matrix whose entries are random variables, then the expected value of the matrix is the matrix of the expected values of its entries:



 * $$\mathrm{E}[X] = \mathrm{E} \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,n} \end{bmatrix} = \begin{bmatrix} \mathrm{E}(x_{1,1}) & \mathrm{E}(x_{1,2}) & \cdots & \mathrm{E}(x_{1,n}) \\ \mathrm{E}(x_{2,1}) & \mathrm{E}(x_{2,2}) & \cdots & \mathrm{E}(x_{2,n}) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{E}(x_{m,1}) & \mathrm{E}(x_{m,2}) & \cdots & \mathrm{E}(x_{m,n}) \end{bmatrix}$$

This property is utilized in covariance matrices.
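
A minimal sketch of the entrywise definition follows, assuming a random matrix that takes finitely many values. The distribution and the helper `matrix_expectation` are hypothetical.

```python
from fractions import Fraction

# Hypothetical random 2 x 2 matrix: list of (matrix value, probability).
distribution = [
    ([[1, 0], [0, 1]], Fraction(1, 2)),
    ([[3, 2], [4, -1]], Fraction(1, 4)),
    ([[-1, 2], [0, 5]], Fraction(1, 4)),
]

def matrix_expectation(dist):
    """Entrywise expected value of a matrix with finitely many possible values."""
    rows = len(dist[0][0])
    cols = len(dist[0][0][0])
    return [
        [sum(p * m[i][j] for m, p in dist) for j in range(cols)]
        for i in range(rows)
    ]

mean_matrix = matrix_expectation(distribution)
for row in mean_matrix:
    print([str(entry) for entry in row])
# ['1', '1']
# ['1', '3/2']
```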