Visualizing the Laws of Total Expectation and Variance
In probability theory and statistics, two of the most important concepts we rely on are expectation and variance - tools that help us understand the average outcome of a random variable and its uncertainty. Sometimes, it is hard to compute these quantities directly. In some cases, we can use the law of total expectation and the law of total variance to break down the problem into simpler parts.
This post aims to make these laws clearer by visualizing them. It assumes familiarity with random variables, expectation, and variance.
Law of Total Expectation
The Law of Total Expectation states that the expected value of a random variable \(X\) can be decomposed into the expectation of its conditional expectation given another random variable \(Y\):
\[ \mathbb{E}[X] = \mathbb{E}[\mathbb{E}[X \mid Y]] \]
But what does this mean intuitively?
Let’s consider a simple example. Suppose we have two random variables \(X\) and \(Y\). \(X\) represents the height of a person, and \(Y\) represents the group that the person belongs to. There are three groups: “short (1)”, “medium (2)”, and “tall (3)”.
The expected value of \(X\) can be computed by averaging the heights of all people. But we can also compute the average height within each group and then take a weighted average of those group means, where each group's weight is the share of people who belong to it. This is what the law of total expectation tells us.
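To make the two routes concrete, here is a minimal sketch in Python. The heights and group sizes are invented purely for illustration (they are not data from this post), and the names `heights_by_group`, `direct_mean`, and `tower_mean` are hypothetical.

```python
from statistics import fmean

# Hypothetical heights in cm, keyed by group label Y
# (1 = short, 2 = medium, 3 = tall). Made-up numbers.
heights_by_group = {
    1: [150.0, 155.0, 158.0],
    2: [165.0, 168.0, 170.0, 172.0, 175.0],
    3: [185.0, 190.0],
}

all_heights = [h for group in heights_by_group.values() for h in group]
n_total = len(all_heights)

# Route 1: E[X] as the plain average over everyone.
direct_mean = fmean(all_heights)

# Route 2: E[E[X | Y]], i.e. each group mean E[X | Y = y]
# weighted by the group's share P(Y = y).
tower_mean = sum(
    (len(group) / n_total) * fmean(group)
    for group in heights_by_group.values()
)

print(direct_mean, tower_mean)
```

Both routes give the same number up to floating-point rounding. Note that the weights matter: an unweighted average of the three group means would generally differ, because the groups have different sizes.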
Law of Total Variance
The Law of Total Variance states that the variance of a random variable \(X\) can be decomposed into the variance of its conditional expectation given another random variable \(Y\), plus the expectation of the conditional variance given \(Y\):
\[ \text{Var}[X] = \text{Var}[\mathbb{E}[X \mid Y]] + \mathbb{E}[\text{Var}[X \mid Y]] \]
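Let’s consider the same example as before. The variance of the heights can be computed by averaging the squared differences from the overall mean height. But we can also compute it in two pieces: the weighted average of the variances within each group (the "within-group" part), plus the variance of the group means around the overall mean (the "between-group" part). In both pieces, each group is weighted by the share of people who belong to it.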
The variance measures the spread of the data. The spread of the data is visualized by the horizontal lines, and the conditional averages are shown by the vertical lines. However, note that horizontal lines are used only for visualization purposes, and they should not be interpreted literally.
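The decomposition can also be checked numerically. The sketch below reuses the same invented heights as an assumption (they are not data from this post) and uses the population variance (`statistics.pvariance`, which divides by n), since the identity holds exactly for that definition; with the sample variance (dividing by n - 1) the two sides would not match exactly. The names `between`, `within`, and `total_var` are hypothetical.

```python
from statistics import fmean, pvariance

# Hypothetical heights in cm, keyed by group label Y. Made-up numbers.
heights_by_group = {
    1: [150.0, 155.0, 158.0],
    2: [165.0, 168.0, 170.0, 172.0, 175.0],
    3: [185.0, 190.0],
}

all_heights = [h for group in heights_by_group.values() for h in group]
n_total = len(all_heights)

# Left-hand side: Var[X] computed directly over everyone.
total_var = pvariance(all_heights)

# Ingredients for the right-hand side.
weights = {y: len(g) / n_total for y, g in heights_by_group.items()}   # P(Y = y)
group_means = {y: fmean(g) for y, g in heights_by_group.items()}       # E[X | Y = y]
group_vars = {y: pvariance(g) for y, g in heights_by_group.items()}    # Var[X | Y = y]

overall_mean = sum(weights[y] * group_means[y] for y in weights)

# Var[E[X | Y]]: spread of the group means around the overall mean.
between = sum(weights[y] * (group_means[y] - overall_mean) ** 2 for y in weights)

# E[Var[X | Y]]: weighted average of the within-group variances.
within = sum(weights[y] * group_vars[y] for y in weights)

print(total_var, between + within)
```

The two printed values agree up to floating-point rounding. Comparing `between` and `within` also shows how much of the total spread comes from differences among groups versus spread inside each group.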
Conclusion
The laws of total expectation and total variance are powerful tools that help us break down complex problems into simpler parts. I hope that these visualizations have made these laws more intuitive and easier to understand.