Hypothesis Testing (Normal): Using z-scores Correctly
🎧 Hypothesis Tests with the Normal Distribution (let’s make it painless)
Alright—today we’re tackling the normal-based version of hypothesis testing, and honestly this is one of those topics where half the class nods confidently and then suddenly, three questions later, someone whispers “…wait—is this the standard error one?”
Yes. Yes it is. And hang on—before we get into the machinery, the whole point is simply testing whether a sample mean is surprisingly far from the hypothesised population mean. That’s it. Don’t let the symbolism intimidate you.
This is also one of those core A Level Maths techniques that comes up in every single exam series. So let’s make this feel like something you can do while half-awake.
🔙 Previous topic:
Once you’re comfortable with hypothesis testing using binomial distributions, the next step is applying the same decision-making process with z-scores in normal distributions.
📘 Why this matters in exams
Examiners adore this topic because it blends ideas: sampling, variance, the normal distribution, and contextual conclusions. They can test understanding and calculator competence and reasoning marks all at once.
And if you’ve ever seen those 9-mark questions that eat half a page of the paper—they’re usually this.
Also, unlike binomial tests, the normal tests lean heavily on z-scores, and that’s where most of the marks are won or lost.
📏 Scenario first (Problem Setup)
Imagine a manufacturer claims their screws have a mean length of 12.5 mm. You take a random sample of 40 screws, get a sample mean, and test whether the data suggests the screws are actually longer.
So you’d form:
\bar{X} \sim N\!\left(\mu,\,\frac{\sigma^2}{n}\right)
That’s the standard foundation: a normal distribution for the sample mean with reduced variance.
No more symbols than we need.
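If the "reduced variance" part feels like something you just have to take on trust, a quick simulation can make it concrete. This is only a sketch: the value sigma = 0.18 mm and the number of repeats are assumptions made for illustration, not figures from the manufacturer's claim.

```python
import random
import statistics

# Hypothetical process: mean equals the claimed 12.5 mm; sigma = 0.18 mm is assumed.
MU = 12.5
SIGMA = 0.18
N = 40          # screws per sample, as in the scenario
REPEATS = 2000  # number of simulated samples (arbitrary choice)

rng = random.Random(1)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n simulated screw lengths drawn from N(MU, SIGMA^2)."""
    return statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))

means = [sample_mean(N) for _ in range(REPEATS)]

observed_sd = statistics.stdev(means)  # spread of the simulated sample means
theoretical_sd = SIGMA / N ** 0.5      # sigma / sqrt(n)

print(f"SD of sample means (simulated): {observed_sd:.4f}")
print(f"sigma / sqrt(n) (theory):       {theoretical_sd:.4f}")
```

The two printed numbers should land close together, which is the whole point: individual screws vary by sigma, but the mean of 40 screws varies by much less.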
🧲 Required Diagram
Here’s the visual most students need but don’t always get shown: a normal bell curve, your observed sample mean marked on it, and the distance from the centre measured in standard errors.
That horizontal distance is the z-score. The shaded tail shows how extreme your result is under the null hypothesis. Once you can see where your value sits relative to the centre, decisions about significance stop feeling mysterious and start feeling obvious.
🧠 Under-the-hood explanation (Key Ideas Explained)
📌 1. The role of the null and alternative hypotheses
Just like binomial tests, the null hypothesis is the “assumed to be true” version. But with normal tests, we deal with means.
For example, H_0: \mu = 12.5.
The alternative depends on context:
- “longer” → H_1: \mu > 12.5
- “shorter” → H_1: \mu < 12.5
- “different” → H_1: \mu \ne 12.5
Pretty straightforward, but phrasing matters—and examiners check it carefully.
🎯 2. What the z-score actually measures
Let me pause here because this is the step everyone rushes. A z-score is basically:
“How many standard errors away is my sample mean from the null hypothesis mean?”
This gives:
z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
The denominator is what students often forget—it’s not the population SD, it’s the standard error.
The bigger the sample, the smaller the standard error, the more sensitive the test becomes.
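That’s why, with a big sample, even a small gap between the sample mean and the hypothesised mean can come out significant; a “not significant” result then usually means the true difference is tiny or zero.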
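Here is the z-score as a few lines of Python, mainly to show how the sample size feeds in through the standard error. The function name and every number except the hypothesised mean of 12.5 are assumptions made up for illustration.

```python
import math

def z_score(sample_mean, mu_0, sigma, n):
    """Number of standard errors between the sample mean and the null mean."""
    standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n), not sigma itself
    return (sample_mean - mu_0) / standard_error

# Hypothetical numbers: sigma = 0.18 and a sample mean of 12.56 are assumed.
z_small = z_score(sample_mean=12.56, mu_0=12.5, sigma=0.18, n=40)
z_large = z_score(sample_mean=12.56, mu_0=12.5, sigma=0.18, n=160)

print(f"n = 40:  z = {z_small:.2f}")
print(f"n = 160: z = {z_large:.2f}")  # four times the sample size doubles z
```

Same sample mean, same sigma, but the bigger sample gives the bigger z-score, because the standard error has halved.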
📐 3. The significance level and tail choice
Pick the direction based on the wording (same rules as binomial).
- One-tailed tests check one extreme.
- Two-tailed tests split the significance level into two halves.
For example, at 5% two-tailed: 2.5% in each tail → critical z = ±1.96.
Note: examiners love asking you to quote why that’s the correct value, and the answer is simply “because it is the z-value marking the outer 2.5% in each tail”.
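Written out in symbols, that justification might look like the lines below. Z here stands for a standard normal variable, which is notation assumed for this sketch rather than anything a particular mark scheme insists on.

```latex
P(Z > 1.96) \approx 0.025, \qquad P(Z < -1.96) \approx 0.025
\quad\Longrightarrow\quad P(|Z| > 1.96) \approx 0.05
```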
⏳ 4. The standard error (and why so many students miscalculate it)
This is important enough to deserve its own moan. Students see “standard deviation of 0.18” and instantly plug that into the z-score.
No.
You must divide by \sqrt{n}. Every time.
The examiners phrase this sneakily:
“the population standard deviation is known to be 4.2”
or
“the process standard deviation is 0.9”.
If you see those words, your z-score lives or dies depending on whether you remember the standard error step.
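A quick worked example shows how much damage the missing step does. The value σ = 4.2 is the one quoted above; the sample size n = 36 and the gap of 1.4 between the sample mean and the null mean are made-up numbers for illustration.

```latex
\text{SE} = \frac{\sigma}{\sqrt{n}} = \frac{4.2}{\sqrt{36}} = 0.7,
\qquad
z = \frac{\bar{x} - \mu_0}{\text{SE}} = \frac{1.4}{0.7} = 2
\qquad\text{(not } \tfrac{1.4}{4.2} \approx 0.33\text{)}
```

With the standard error, the result sits in the 5% critical region for both a one-tailed and a two-tailed test; without it, the same data looks like nothing at all.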
💡 5. Using z-values correctly (the real heart of the topic)
Students tend to improve fastest here by practising repeatedly with mixed-tail questions inside A Level Maths revision guidance. It’s the only way to make the tail directions feel intuitive.
Right—back to the mechanics.
The structure is always:
- Compute standard error: \text{SE} = \sigma/\sqrt{n}.
- Compute z-score: z = \frac{\bar{x} - \mu_0}{\text{SE}}.
- Compare to critical value(s).
- Reject or fail to reject H₀.
- State a contextual conclusion.
If it feels too simple, that’s because… it is that simple when you don’t overcomplicate it.
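The five steps can be sketched as one small function. Treat this as a rough illustration at the 5% level only: the function name, the tail labels, sigma = 0.18 and the sample mean of 12.56 are all assumptions, and the contextual conclusion is still something you have to write yourself.

```python
import math

# Standard 5% critical values for one-tailed and two-tailed tests.
CRITICAL_5_PERCENT = {"upper": 1.645, "lower": 1.645, "two-sided": 1.96}

def z_test(sample_mean, mu_0, sigma, n, tail):
    """Run the steps at the 5% level; tail is 'upper', 'lower' or 'two-sided'."""
    standard_error = sigma / math.sqrt(n)      # step 1: standard error
    z = (sample_mean - mu_0) / standard_error  # step 2: z-score
    critical = CRITICAL_5_PERCENT[tail]        # step 3: critical value
    if tail == "upper":
        reject = z > critical
    elif tail == "lower":
        reject = z < -critical
    else:
        reject = abs(z) > critical
    return z, reject                           # step 4: the decision

# Hypothetical data for the screw scenario: sigma and the sample mean are assumed.
z, reject = z_test(sample_mean=12.56, mu_0=12.5, sigma=0.18, n=40, tail="upper")
decision = "reject H0" if reject else "do not reject H0"
print(f"z = {z:.2f}, {decision}")
```

Step 5 would then be a sentence such as "there is evidence at the 5% level that the mean screw length is greater than 12.5 mm", which no function can write for you.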
🔍 6. How to choose the right critical value
Use the standard z-table ideas:
- One-tailed at 5% → z = 1.645
- Two-tailed at 5% → z = ±1.96
- One-tailed at 1% → z = 2.33
- Two-tailed at 1% → z = ±2.58
(You don’t need to memorise more, really.)
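If you would rather see where those four numbers come from than memorise them, Python's standard library can reproduce them from the inverse normal function. This is a small sketch; the helper name critical_z is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def critical_z(significance, two_tailed):
    """Positive critical value; a two-tailed test puts half the level in each tail."""
    tail_area = significance / 2 if two_tailed else significance
    return standard_normal.inv_cdf(1 - tail_area)

for level in (0.05, 0.01):
    one = critical_z(level, two_tailed=False)
    two = critical_z(level, two_tailed=True)
    print(f"{level:.0%}: one-tailed {one:.3f}, two-tailed +/-{two:.3f}")
```

The printed values carry one more decimal place than the list above, so you can see that 2.33 and 2.58 are rounded versions of slightly smaller numbers.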
A line you might write:
“This gives z = 2.14, which lies in the upper 5% critical region, so we reject H₀.”
Clean, clear, mark-friendly.
🧭 7. When calculators do the job — and when they don’t
Your calculator can give you the p-value instantly using normal cdf functions. But the exam still wants reasoning:
- Did you choose the right tail?
- Did you compare correctly?
- Did you interpret it in context?
Never just quote a number—surround it with justification.
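For the curious, here is roughly what the calculator is doing when it hands you a p-value. It is a sketch using Python's built-in normal distribution; the function name and tail labels are assumptions, and z = 2.14 is simply the example value quoted earlier.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value for a z-score; tail is 'upper', 'lower' or 'two-sided'."""
    phi = NormalDist().cdf        # standard normal cumulative probability
    if tail == "upper":
        return 1 - phi(z)         # P(Z > z)
    if tail == "lower":
        return phi(z)             # P(Z < z)
    return 2 * (1 - phi(abs(z)))  # both tails together

p = p_value(2.14, tail="upper")
print(f"p = {p:.4f}")  # compare this with the significance level, e.g. 0.05
```

Notice that the tail argument changes the answer, which is exactly why the exam wants you to say which tail you used and why.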
❗ Traps + slips (Common Errors)
- Forgetting to divide by \sqrt{n} → wrong standard error.
- Treating a sample SD as though it were the known population SD when the population SD is unknown (how this is handled depends on the board).
- Two-tailed tests but using only one critical value.
- Incorrect z-critical values, such as writing 1.96 for a 5% one-tailed test (it should be 1.645).
- Writing conclusions with no context.
- For comparison, a correct two-tailed decision at 5% looks like: z = -2.21 < -1.96 → reject H₀.
🌍 The real-world picture (Real-World Link)
Normal-based hypothesis tests show up everywhere: factory quality checks, medical trials, battery life studies, packaging weights, flight delays—you name it.
If someone reports “the difference was statistically significant”, they ran a hypothesis test of some kind (often a z-test or t-test, and the logic is the same).
You’re learning the same logic that sits behind many of the scientific studies you hear quoted on the news.
🚀 If you want more skill
If you want a calm, structured place to practise these—especially with mixed one- and two-tailed tests—the A Level Maths Revision Course for every exam board walks through dozens of normal-based hypothesis tests with step-by-step commentary and mistakes-to-avoid notes.
📏 Quick summary
- Normal tests use z-scores with standard error.
- H₀ always includes equality.
- Tail direction comes from wording.
- Compare z to critical values, or p-value to significance.
- The formula z = \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} remains the centre of the method.
Author Bio – S. Mahandru
If you ever see me in a lesson, I’m usually the one speed-scribbling z-scores on the board while also trying to stop someone from pressing the stats-mode button by accident. If you like maths explained like an actual human is talking to you — not a spreadsheet — then we’ll get along just fine.
🧭 Next topic:
Now that you understand hypothesis testing with z-scores, the next step is to dive into how sampling methods affect your results and the data you analyse.
❓ FAQ
When do I use a z-test instead of a t-test?
Great one—and this is where students often memorise the wrong rule. You use a z-test when the population standard deviation is known. That’s the entire distinction. Even if the sample is small, exam boards normally still permit a z-test if σ is given. If the population SD is not known, and the question gives you only the sample SD, then you’re in t-test territory. It doesn’t have to be complicated; it’s just about which standard deviation you have legitimate access to.
Do I need a p-value or a critical value?
Honestly, you can go either way. Some exam boards like you to talk about the critical region; others simply want the probability. And half the time the question doesn’t really mind — they’ll take both as long as your reasoning isn’t a mess.
The only thing examiners really watch for is whether you compare the right things: p-value vs significance, or z-score vs the critical z. Same idea either way. Both are just asking, “How weird is our result if H₀ were true?” Nothing deeper hiding there.
What if my z-value is annoyingly close to the cutoff?
Ah yes — the 1.95 vs 1.96 debate. This comes up every year. Students want there to be some wiggle room, but… nope. If it doesn’t actually cross ±1.96 (for a 5% two-tailed test), it’s not significant. No “close enough.”
That said, if you really want to sound like you know what you’re doing, you can mention the evidence is “borderline” or “weak” — examiners like that little bit of nuance — but the formal decision stays strictly on the “reject / not reject” line.
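To see how sharp that line is, here is a tiny check of two z-values either side of the cutoff. It is only a sketch: the helper name is made up, and 1.95 and 1.97 are chosen purely to straddle 1.96.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z-score under the standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (1.95, 1.97):
    p = two_tailed_p(z)
    verdict = "reject H0" if p < 0.05 else "do not reject H0"
    print(f"z = {z}: p = {p:.4f} -> {verdict}")
```

The two p-values differ by only a fraction of a percentage point, yet they fall on opposite sides of 5%, so the formal decisions differ too.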