Discrete Random Variables: Expectation & Variance

🧮 Discrete Random Variables — let’s get our hands dirty

Right, this one sneaks up on people. You think you’re fine adding a few probabilities together and then—hang on—suddenly the question wants expectation, variance, distribution tables, and some weird “interpretation in context” thing that absolutely wasn’t in your notes.
So we’re going to talk it through like we’re in the room, not in some polished textbook universe.

Early on we’ll lean on core A Level Maths understanding, because getting a feel for what these variables mean is half the battle.

🔙 Previous topic:

Before calculating expectation and variance for a discrete random variable, it’s important that the data has already been cleaned properly — with outliers identified and handled sensibly so the results are meaningful rather than misleading.

📘 Where examiners use this

Expectation and variance pop up in multiple-choice, long questions, hypothesis tests, binomial setups… they’re everywhere.
And examiners adore tiny slips: wrong table values, forgetting to multiply by probabilities, that sort of thing.
This topic isn’t hard, but it is fiddly, and they know it.

📏 What we’ve got

We’ll imagine a discrete random variable (X) taking values (0, 1, 2, 3).
For example, a probability distribution could be:

  x         0     1     2     3
  P(X=x)    0.1   0.4   0.3   0.2

Nothing wild yet — just a table with weights on it.

🧠 Core thinking steps

🔣 What expectation actually is (not the mystical version)

Students sometimes think expectation is “what actually happens most of the time.”
Nope. It’s a weighted average — the long-run average if you repeated the scenario forever.

In symbols: E(X)=\sum xP(x).
That’s all. Multiply each value by its probability and add them.

The idea is simple… it’s the fingers doing the arithmetic that go rogue.
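
If you'd rather see the “long-run average” idea than take it on trust, here is a minimal simulation sketch. It assumes the distribution from earlier and uses Python's standard `random` module; the function name `long_run_average`, the seed, and the number of trials are illustrative choices of mine, not part of any syllabus.

```python
import random
from collections import Counter

# Distribution from the earlier example
values = [0, 1, 2, 3]
probs = [0.1, 0.4, 0.3, 0.2]

def long_run_average(n_trials, seed=0):
    """Simulate n_trials draws of X; return (sample mean, most common value)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = rng.choices(values, weights=probs, k=n_trials)
    mean = sum(draws) / n_trials
    most_common = Counter(draws).most_common(1)[0][0]
    return mean, most_common

mean, mode = long_run_average(100_000)
print(f"sample mean = {mean:.3f}, most common value = {mode}")
```

The sample mean should settle close to the weighted average (1.6 for this table), while the most common value is 1. Those are two different things, which is exactly the misconception above.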

🧭 Why variance matters (and why the formula looks odd)

Variance isn’t just “spread”: it’s the average squared distance of the values from the mean.
But instead of computing every squared deviation directly, we use a neat identity:

\mathrm{Var}(X)=E(X^2)-[E(X)]^2.

That formula saves you time and calculator rage.
And yes, you can compute it from first principles, but exam questions rarely want the long method unless they explicitly say so.
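
For anyone wondering where the identity comes from, here is a short derivation sketch. It writes \mu for E(X) (a symbol I'm introducing for brevity) and relies only on the linearity rule E(aX+b)=aE(X)+b:

```latex
\begin{aligned}
\mathrm{Var}(X) &= E\big[(X-\mu)^2\big] \\
                &= E(X^2) - 2\mu E(X) + \mu^2 \\
                &= E(X^2) - 2\mu^2 + \mu^2 \\
                &= E(X^2) - [E(X)]^2
\end{aligned}
```

The first line is the first-principles definition; the last line is the shortcut. They are the same quantity, so either method must give the same answer.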

Good A Level Maths revision techniques help here: these formulas become automatic with enough structured practice, not memorising panic.

📒 Building the expectation cleanly

Let me do the “teacher pause”—because students rush this bit.

Take our earlier distribution.
Compute the weighted sum:

E(X)=0(0.1)+1(0.4)+2(0.3)+3(0.2)=1.6.

If you ever get a value outside the range of (X), something has gone terribly wrong.
Expectation must lie between the smallest and largest possible (x).
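
The same calculation, with the range check built in, might look like this in Python. It's a sketch only: storing the table as a dictionary and calling the helper `expectation` are my own assumptions about how you'd set it up.

```python
# Distribution from the worked example: value -> probability
dist = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}

def expectation(dist):
    """Return E(X), the sum of x * P(X = x) over the table."""
    return sum(x * p for x, p in dist.items())

e_x = expectation(dist)

# Sanity check: the mean must sit between the smallest and largest value
assert min(dist) <= e_x <= max(dist)

print(f"E(X) = {e_x:.2f}")
```

If that `assert` ever fails, the table or the arithmetic is wrong, not the theory.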

🧷 Computing variance — the efficient way

First calculate (E(X^2)):

E(X^2)=0^2(0.1)+1^2(0.4)+2^2(0.3)+3^2(0.2)=3.4.

Then tie it together:
\mathrm{Var}(X)=3.4-(1.6)^2=3.4-2.56=0.84.

A tiny warning: variance isn’t “interpreted” very naturally. Students often look at a value like (0.84) and think it’s meaningless.
It’s spread, measured in squared units. That’s it. The square root (standard deviation) is more intuitive because it is back in the same units as (X).
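
To check the shortcut against the long method, here is a small sketch that computes the variance both ways and then takes the square root. The function names are hypothetical; the table is the one from the worked example.

```python
from math import sqrt

dist = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}

def variance_shortcut(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    e_x = sum(x * p for x, p in dist.items())
    e_x2 = sum(x**2 * p for x, p in dist.items())
    return e_x2 - e_x**2

def variance_first_principles(dist):
    """Var(X) = sum of (x - mean)^2 * P(X = x)."""
    mean = sum(x * p for x, p in dist.items())
    return sum((x - mean) ** 2 * p for x, p in dist.items())

var_x = variance_shortcut(dist)
sd_x = sqrt(var_x)

print(f"shortcut:         {var_x:.2f}")
print(f"first principles: {variance_first_principles(dist):.2f}")
print(f"standard deviation: {sd_x:.3f}")
```

Both methods give 0.84, and the standard deviation comes out at about 0.917, which is the figure you would actually compare against the values 0 to 3.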

⚙️ Expectation of functions (the sneaky exam bit)

If they give you a new variable like (Y=2X+1), don’t panic.

The key facts:

  • For expectation: E(aX+b)=aE(X)+b.

  • For variance: \mathrm{Var}(aX+b)=a^2\mathrm{Var}(X).

Do not add the (+b) to the variance. That’s how you lose method marks in a single keystroke.

Teacher aside: I’ve seen entire classes apply (+b) to variance and then swear the exam board is wrong. You will not win this argument.
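
If you don't trust the rules, test them. The sketch below builds the distribution of (Y=2X+1) directly from the earlier table and compares the results with the two formulas. The helper name `mean_and_variance` is my own; nothing here is exam-board notation.

```python
dist_x = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
a, b = 2, 1  # Y = 2X + 1

def mean_and_variance(dist):
    """Return (E, Var) for a value -> probability table."""
    mean = sum(v * p for v, p in dist.items())
    var = sum(v**2 * p for v, p in dist.items()) - mean**2
    return mean, var

# Each x becomes 2x + 1 and keeps the same probability
dist_y = {a * x + b: p for x, p in dist_x.items()}

e_x, var_x = mean_and_variance(dist_x)
e_y, var_y = mean_and_variance(dist_y)

print(f"E(Y):   direct {e_y:.2f}, rule {a * e_x + b:.2f}")
print(f"Var(Y): direct {var_y:.2f}, rule {a**2 * var_x:.2f}")
```

Both routes give E(Y)=4.2 and Var(Y)=3.36. The (+1) moves every value by the same amount, so the spread does not change, and that is why it never appears in the variance.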

🪢 When distributions go missing

Sometimes they don’t give you the probabilities.
They give:

  • a diagram,

  • or some equation like P(X=x)=kx,

  • or a constraint like P(X\ge 2)=0.7.

If you get something like P(X=x)=kx for (x = 1,2,3,4), remember the probabilities must sum to 1:

k(1+2+3+4)=1, so 10k=1 and k=0.1.

Find (k), then build the table, then compute expectation.
Follow the flow — don’t jump.
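
That flow (find (k), build the table, compute expectation) can be sketched in a few lines. I'm using Python's `fractions.Fraction` so the probabilities stay exact; the variable names are illustrative.

```python
from fractions import Fraction

xs = [1, 2, 3, 4]

# P(X = x) = k * x and the probabilities sum to 1, so k = 1 / (1 + 2 + 3 + 4)
k = Fraction(1, sum(xs))

# Build the table, then confirm it really is a distribution
table = {x: k * x for x in xs}
assert sum(table.values()) == 1

e_x = sum(x * p for x, p in table.items())

print(f"k = {k}")
for x, p in table.items():
    print(f"P(X={x}) = {p}")
print(f"E(X) = {e_x}")
```

Here k = 1/10 and E(X) = 3. Note that the expectation sits towards the top of the range because the larger values carry the larger probabilities.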

🧩 Common structure in long questions

A typical big-mark question goes like this:

  1. Give the distribution table.

  2. Compute (E(X)).

  3. Compute (\mathrm{Var}(X)).

  4. Define a new variable (Y=aX+b).

  5. Find (E(Y)) and (\mathrm{Var}(Y)).

  6. Interpret one of those values in context.

If you spot that pattern, you can breathe: the marks are mechanical.

❗ Danger zone (mistakes people make)

  • Mixing up (E(X^2)) and ([E(X)]^2).

  • Forgetting to normalise probabilities when given an unknown constant (k).

  • Treating expectation as the “most common value.”

  • Reading too much into variance: it measures spread, and the question rarely wants a story about it.

  • Adding the “+b” from (Y=aX+b) into the variance.

  • Leaving probabilities that don’t sum to 1 (this is an instant red flag).

A quick reference line you might need:
E(X^2)=\sum x^2P(x).
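
The “sum to 1” red flag is easy to automate if you are checking your own tables. Here is a minimal sketch, assuming the table is stored as a dictionary; the function `check_distribution` and its messages are hypothetical, and the tolerance is there only to absorb floating-point rounding.

```python
from math import isclose

def check_distribution(dist):
    """Return a list of problems found in a value -> probability table."""
    problems = []
    for x, p in dist.items():
        if not 0 <= p <= 1:
            problems.append(f"P(X={x}) = {p} is not between 0 and 1")
    total = sum(dist.values())
    if not isclose(total, 1.0, abs_tol=1e-9):
        problems.append(f"probabilities sum to {total:.4f}, not 1")
    return problems

valid = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
broken = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.3}  # last entry is a deliberate slip

print(check_distribution(valid))
print(check_distribution(broken))
```

An empty list means the table passes both checks. Anything else is worth fixing before you touch expectation or variance.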

🌍 Outside-the-classroom version

Expectation and variance basically run the world — insurance pricing, machine-learning models, risk calculations, game theory, quality control.
When someone says “on average this happens…” or “we’re measuring variability,” they’re using these tools whether they know it or not.
Stats is sneaky like that.

🚀 Ready to level up?

If you want to get genuinely fluent with expectation, variance, and all the related modelling steps, the complete A Level Maths Revision Course walks through real exam datasets, full worked solutions, and the shortcuts teachers actually use.

📏 Recap Table

  • Expectation = long-run weighted average.

  • Variance = spread around the mean.

  • Use E(X)=\sum xP(x) and \mathrm{Var}(X)=E(X^2)-[E(X)]^2.

  • For (Y=aX+b): expectation scales and shifts, E(Y)=aE(X)+b; variance only scales, \mathrm{Var}(Y)=a^2\mathrm{Var}(X).

  • Probabilities must always sum to 1.

Author Bio – S. Mahandru

I’m the kind of A Level Maths teacher who gets far too excited about probability tables and then immediately regrets trusting students not to confuse E(X^2) with [E(X)]^2. If expectation and variance have ever made your calculator cry, you’re in good company here.

🧭 Next topic:

Having learned how expectation and variance work for discrete random variables, the next step is extending those same ideas to continuous random variables, where probability is described using PDFs, CDFs and integration rather than lists of values.

❓FAQ

Why does variance use squared deviations?

Because if we didn’t square them, positive and negative deviations would cancel each other out. Squaring keeps everything positive and weights bigger deviations more heavily. Don’t fall into the trap of trying to “interpret the number” too literally; variance measures spread in squared units, so the value itself is hard to read directly.

How should I think about expectation?

Think long-run average, not “what happens most often.” If you repeated the scenario a million times, the mean of all those outcomes would settle near (E(X)). Students sometimes expect it to be a possible outcome, but it doesn’t have to be. Expectation can be 1.6 even if the variable only takes integer values.

Why do examiners ask about (Y=aX+b)?

Because it checks whether you understand how transformations affect averages and spreads. The effect is predictable: scaling changes both expectation and variance, shifting changes only expectation. It’s a neat way for examiners to test conceptual understanding with very little extra work.