Hypothesis testing exam technique when comparing to critical values

🎯 In a hypothesis testing exam question, most students calculate the test statistic correctly.

The difficulty often appears one step later.

Comparing a test statistic with a critical value sounds straightforward. Yet this is where conclusions go wrong. Signs are misread. Tail directions are confused. Two-tailed regions are treated as one-tailed. A perfectly correct calculation suddenly produces the wrong decision.

Examiners are not simply checking arithmetic. They are checking interpretation. They want to see that you understand what the rejection region represents and how it connects to the alternative hypothesis.

The comparison stage is not mechanical. It is logical.

Building careful reasoning at this stage is part of developing A Level Maths revision that sticks, because understanding why a value lies inside or outside a critical region is more durable than memorising thresholds.

Accurate comparison of a test statistic with a critical value depends on understanding the complete testing structure. These foundations are explained clearly in Hypothesis Testing — Method & Exam Insight.

🔙 Previous topic: Hypothesis Testing — Method & Exam Insight

⚠ Common Problems Students Face

The pattern of errors is consistent.

Students often:

  • Compare the test statistic to the wrong critical value.

  • Forget whether the test is one-tailed or two-tailed.

  • Use \pm 1.96 automatically, without checking the significance level.

  • State “accept H_0” instead of “fail to reject H_0”.

  • Ignore the direction of inequality in the alternative hypothesis.

  • Draw conclusions that contradict their earlier hypotheses.

These mistakes are rarely about computation. They happen in the reasoning step that follows.

Examiners award method marks for structure, but conclusions must align logically. A mismatch between comparison and wording restricts credit immediately.

📘 Core Exam-Style Question

A company claims that the mean delivery time is 3 days.

A sample of 36 deliveries has a mean of 3.4 days.
The population standard deviation is known to be 0.9 days.

Test at the 5% level whether the mean delivery time is greater than 3 days.

Step 1: State Hypotheses

Let \mu be the population mean delivery time.

H_0 : \mu = 3
H_1 : \mu > 3

The alternative tells us this is a right-tailed test.

Step 2: Calculate Test Statistic

Use:

Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}

Substitute:

Z = \frac{3.4 - 3}{0.9/\sqrt{36}} = \frac{0.4}{0.15} \approx 2.67

Step 3: Compare to Critical Value

At 5% significance, right-tailed critical value:

1.645

If the calculated Z > 1.645, reject H_0. Here Z \approx 2.67 > 1.645, so H_0 is rejected: there is evidence at the 5% level that the mean delivery time exceeds 3 days.

Notice the direction matters. Because H_1 : \mu > 3, the rejection region lies on the right.

A frequent error is using -1.645 by habit. That reverses the logic.
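As a quick check, the whole decision can be reproduced with Python's standard library (a minimal sketch; the variable names are my own, and `NormalDist().inv_cdf` supplies the critical value directly):

```python
from statistics import NormalDist

# Sample data from the question
x_bar, mu0, sigma, n = 3.4, 3.0, 0.9, 36

# Test statistic: Z = (x_bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / n ** 0.5)

# Right-tailed critical value at the 5% level: z such that P(Z < z) = 0.95
z_crit = NormalDist().inv_cdf(0.95)

print(round(z, 2))       # 2.67
print(round(z_crit, 3))  # 1.645
print(z > z_crit)        # True -> reject H0
```

Because 2.67 lies beyond 1.645, the statistic falls inside the right-tail rejection region, which is exactly the comparison the examiner wants to see written out.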

📊 How This Question Is Marked

Method marks:

  • Correct hypotheses

  • Correct formula

  • Correct substitution

Accuracy marks:

  • Correct numerical value

  • Correct comparison with critical value

  • Logical conclusion in context

If the test statistic is correct but compared to the wrong side of the distribution, marks are limited.

The comparison step carries real weight.

🔥 Harder / Twisted Exam Question

A manufacturer claims that 10% of items are defective.

In a sample of 250 items, 35 are defective.

Test at the 1% level whether the defect rate has changed.

Here the wording shifts.

Let p be the population proportion defective.

H_0 : p = 0.10
H_1 : p \ne 0.10

Two-tailed test.

Step 1: Check Conditions

np = 250 \times 0.10 = 25
n(1-p) = 250 \times 0.90 = 225

Both are large, so the normal approximation is justified.

Step 2: Test Statistic

Z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{0.14 - 0.10}{\sqrt{\frac{0.10 \times 0.90}{250}}} \approx 2.11

(Here \hat{p} = 35/250 = 0.14.)

Step 3: Compare to Critical Values

At 1% significance (two-tailed):

Critical values are approximately \pm 2.576.

Unlike the earlier question, a single boundary is no longer enough: both tails must be recognised.

Students often use \pm 1.96 automatically. That would correspond to 5%, not 1%.

Significance level controls the boundary. Ignoring it changes the decision: here |Z| \approx 2.11 lies inside \pm 2.576, so H_0 is not rejected at 1%, whereas comparing against \pm 1.96 would wrongly suggest rejection.
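The two-tailed comparison can be verified the same way (a minimal sketch using Python's standard library; variable names are my own):

```python
from statistics import NormalDist

# Sample data from the question
p0, n, defects = 0.10, 250, 35
p_hat = defects / n  # 0.14

# Z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5

# Two-tailed at 1%: alpha/2 = 0.005 in each tail
z_crit = NormalDist().inv_cdf(0.995)

print(round(z, 2))       # 2.11
print(round(z_crit, 3))  # 2.576
print(abs(z) > z_crit)   # False -> fail to reject H0
```

Note that |Z| \approx 2.11 would exceed the 5% two-tailed boundary of 1.96 but not the 1% boundary of 2.576, a concrete illustration of why the significance level must be read, not assumed.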

📊 How This Is Marked (Twisted Version)

This version rewards:

  • Correct identification of two-tailed test

  • Correct critical values

  • Clear comparison statement

Marks are reduced if:

  • Only one critical value is used

  • The wrong significance level is applied

  • Conclusion contradicts rejection region

Even a correct test statistic cannot compensate for an incorrect critical comparison.

📝 Practice Question (Attempt Before Scrolling)

A school claims the mean exam score is 65.

A sample of 49 students has mean 62.
Population standard deviation is 14.

Test at the 5% level whether the mean score is lower than 65.

Write hypotheses.
Calculate the test statistic.
Compare carefully.

Do not rush the final step.

✅ Model Solution (Exam-Ready Layout)

Let \mu be the population mean score.

H_0 : \mu = 65
H_1 : \mu < 65

Left-tailed test.

Test statistic:

Z = \frac{62 - 65}{14/\sqrt{49}} = \frac{-3}{2} = -1.5

At 5% significance (left-tailed):

Critical value:

-1.645

If Z < -1.645, reject H_0.

Here Z = -1.5 > -1.645, so the statistic does not fall in the rejection region: fail to reject H_0. There is insufficient evidence at the 5% level that the mean score is lower than 65.

The direction of inequality determines everything in this comparison.
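This question is a classic trap, so it is worth verifying the left-tailed comparison numerically (a short sketch, Python standard library only; names are my own):

```python
from statistics import NormalDist

# Sample data from the practice question
x_bar, mu0, sigma, n = 62, 65, 14, 49

z = (x_bar - mu0) / (sigma / n ** 0.5)  # (62 - 65) / (14 / 7) = -1.5

# Left-tailed at 5%: z such that P(Z < z) = 0.05
z_crit = NormalDist().inv_cdf(0.05)     # approx -1.645

print(z)             # -1.5
print(z < z_crit)    # False -> fail to reject H0
```

A negative statistic is not automatically in a left-tail rejection region: -1.5 sits to the right of -1.645, so the evidence is not strong enough to reject H_0.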

📚 Setup Reinforcement

Before comparing numbers, ask:

Which tail are we testing?
What significance level is specified?
Are there one or two critical values?
Does the sign of the statistic align with the alternative?

These checks prevent most conclusion errors.

Understanding the reasoning behind rejection regions leads to more secure outcomes. That is the kind of understanding that supports A Level Maths revision that sticks, because it connects logic to symbols rather than relying on memory alone.

🚀 Strengthening Decision-Making Under Pressure

Students often feel comfortable calculating test statistics but hesitate when interpreting them.

During the Intensive 3 Day A Level Maths Revision Course, emphasis is placed on decision structure. Not just calculating Z, but understanding what that number represents in relation to a rejection region. Time is spent deliberately comparing results to boundaries and articulating conclusions clearly.

Confidence increases when the comparison step becomes logical rather than mechanical.

A well-structured decision earns marks quickly and cleanly.

✍️ Preparing for High-Stakes Exam Questions

As exams approach, small interpretative mistakes can become costly.

In hypothesis testing exam questions, the final comparison often determines whether several marks are secured or lost. Recognising tail direction, applying the correct critical value, and writing a conclusion consistent with the hypotheses are habits built through rehearsal.

The Easter A Level Maths Exam Preparation Course focuses on strengthening these final decision steps so that comparisons feel controlled rather than rushed. Under timed conditions, that control makes a visible difference.

Accurate conclusions depend on accurate comparisons.

✍️ Author Bio

S. Mahandru is an experienced A Level Maths specialist focused on examiner standards, modelling clarity, and exam-ready communication across Pure, Statistics, and Mechanics. His teaching emphasises structured reasoning, precise interpretation, and disciplined presentation — the elements that consistently protect method and accuracy marks in high-stakes exams.

🧭 Next topic:

Once you can confidently compare your test statistic with the critical value, the final step is communicating that decision clearly, so read Hypothesis Testing: Why Students Lose Marks in the Final Conclusion to make sure you secure those last few marks.

🧠 Conclusion

In hypothesis testing exam questions, the calculation is only half the task.

Comparing the test statistic to the correct critical value — and interpreting that comparison accurately — determines the final outcome.

Read the alternative carefully. Identify the tail. Use the correct significance level. Align your conclusion with the rejection region.

When these steps become structured habits, hypothesis testing feels far less uncertain.

Control the comparison. The marks follow.

❓ FAQs

🎓 Why do I lose marks even when my test statistic is correct?

Because calculating the test statistic is only one step in the process.

You can get the arithmetic completely right and still lose marks if the comparison that follows doesn’t make sense. The number itself isn’t the decision — how you interpret it is.

A common slip happens with direction. For instance, a student might calculate a negative value for Z in a left-tailed test, then compare it against a positive critical value out of habit. The calculation is fine. The comparison isn’t. And that changes the conclusion.

Another issue is using the correct boundary but writing a conclusion that contradicts it. If your statistic lies outside the rejection region, your wording must reflect that. The logic has to match.

The final decision is where the reasoning becomes visible. If that reasoning doesn’t line up with the setup, examiners can’t award full credit — even if the earlier steps were accurate.

🎓 Why should I write “fail to reject H_0” instead of “accept H_0”?

Because in hypothesis testing, you are not proving anything true.

You start by assuming the null hypothesis holds. Then you ask: how unusual would this sample be if that assumption were correct? That’s all the test statistic is measuring.

If the result lands in the critical region, it is unusual enough to reject that assumption. If it doesn’t, you simply don’t have strong enough evidence to move away from it.

That’s different from saying it’s true.

“Accept” sounds final. It suggests certainty. Statistical tests don’t give certainty; they give levels of evidence. At the 5% level, you reject only when the result is so extreme that it would occur at most 5% of the time if the null model were true. If it isn’t that rare, you stay with the null position. Cautiously.

Examiners care about this wording because it shows whether you understand what the test is doing. A conclusion that says “fail to reject” reflects evidence-based reasoning. A conclusion that says “accept” suggests the logic behind significance levels hasn’t fully landed yet.

It’s a small phrase. But it signals whether the method is understood or just followed.

🎓 How do I choose the correct critical value?

Start with two questions. Everything follows from them:

  1. Is the test one-tailed or two-tailed?

  • If the alternative is \mu>k or p>k, it’s right-tailed (one boundary).

  • If the alternative is \mu<k or p<k, it’s left-tailed (one boundary).

  • If the alternative is \mu\ne k or p\ne k, it’s two-tailed (two boundaries).

  2. What is the significance level?
    This is the total probability placed in the rejection region. The key idea is:

  • One-tailed: all of \alpha sits in one tail.

  • Two-tailed: \alpha is split as \alpha/2 in each tail.

That split is the reason two-tailed critical values are “further out”.

So the logic becomes:

  • 5% one-tailed → z_{0.95}=1.645

  • 5% two-tailed → z_{0.975}=1.96 (because \alpha/2=0.025)

  • 1% one-tailed → z_{0.99}=2.33

  • 1% two-tailed → z_{0.995}=2.576 (because \alpha/2=0.005)
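All four values above come from the inverse normal CDF, so they never need to be memorised blindly. A short sketch (Python standard library; the function name is my own) reproduces them:

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed):
    """Upper critical value of the standard normal for significance alpha."""
    # Two-tailed tests split alpha equally between the tails
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)

print(round(z_critical(0.05, False), 3))  # 1.645
print(round(z_critical(0.05, True), 2))   # 1.96
print(round(z_critical(0.01, False), 2))  # 2.33
print(round(z_critical(0.01, True), 3))   # 2.576
```

The `alpha / 2` line is the whole story: halving the tail probability is exactly what pushes the two-tailed boundary further out.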

A quick exam-safe checkpoint:

  • If it’s two-tailed, your critical value should be larger in magnitude than the one-tailed value at the same level, because only \alpha/2 sits in each tail, pushing each boundary further out.

Finally, don’t treat critical values as “default constants”. Examiners often change the level (10%, 2.5%, 1%) specifically to catch memorisers. Read the wording, identify the tail from H_1, then choose the value that matches that exact setup.