Least Squares Regression Line

Introduction

A correlation coefficient measures the strength of the relationship between two variables. If the correlation is linear, the relationship can be written as a linear equation and drawn as a straight line on the scatter diagram.

Before any calculations are done, it is important to plot the data, with the dependent variable on the vertical axis and the independent variable on the horizontal axis.

The table below shows the ages of 12 people and how much pocket money they receive:

\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline \text { Age } & 6 & 7 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 15 & 16 & 16 \\ \hline \text { Money (£) } & 3.5 & 2 & 3 & 4 & 4 & 4.5 & 4.5 & 4 & 4.5 & 3.5 & 4.5 & 6 \\ \hline \end{array}

The mean age is $\bar{x}=12$ and the mean amount of pocket money is $\bar{y}=4$, so the mean point is $(12,4)$.
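As a quick check of the mean point, the two averages can be computed directly from the table. This is a minimal Python sketch; the variable names are illustrative and not part of the original article.

```python
# Pocket money data from the table above.
ages = [6, 7, 9, 10, 11, 12, 13, 14, 15, 15, 16, 16]
money = [3.5, 2, 3, 4, 4, 4.5, 4.5, 4, 4.5, 3.5, 4.5, 6]

n = len(ages)
mean_age = sum(ages) / n      # x-bar
mean_money = sum(money) / n   # y-bar

print((mean_age, mean_money))  # (12.0, 4.0)
```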

Least Squares Regression Line - How Points Are Plotted

The points are plotted on a scatter diagram, with the mean point $(12,4)$ highlighted.

The line of best fit passes through the mean point, so its equation can be written as $(y-\bar{y})=b(x-\bar{x})$, which rearranges to:

y=b x+(\bar{y}-b \bar{x})

Here $b$ is the gradient, which still needs to be found, and $(\bar{y}-b \bar{x})$ is the intercept on the $y$-axis.

Let the vertical distances from each point to the line be $d_1, d_2$, etc. These take positive or negative values depending on whether the points are above or below the line of best fit.

The distances are squared so that the negative values do not cancel the positive ones, and the line required is the one for which $\sum d_i^2$ is as small as possible.
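To see what "as small as possible" means in practice, the sketch below tries a range of gradients for a line through the mean point of the pocket money data and records $\sum d_i^2$ for each. It assumes the $d_i$ are vertical distances. The trial-and-error grid and the names `sum_sq` and `candidates` are for illustration only; this is not the method used to find the regression line.

```python
ages = [6, 7, 9, 10, 11, 12, 13, 14, 15, 15, 16, 16]
money = [3.5, 2, 3, 4, 4, 4.5, 4.5, 4, 4.5, 3.5, 4.5, 6]

x_bar = sum(ages) / len(ages)
y_bar = sum(money) / len(money)

def sum_sq(b):
    """Sum of squared vertical distances to the line y - y_bar = b(x - x_bar)."""
    return sum((y - (y_bar + b * (x - x_bar))) ** 2 for x, y in zip(ages, money))

# Trial gradients 0.00, 0.01, ..., 0.50 (an arbitrary illustrative grid).
candidates = [i / 100 for i in range(51)]
best_b = min(candidates, key=sum_sq)

# Compare a few gradients either side of the best trial value.
for b in (0.0, 0.1, best_b, 0.3, 0.4):
    print(f"b = {b:.2f}  sum of squares = {sum_sq(b):.3f}")
print("best trial gradient:", best_b)
```

On this grid the smallest total occurs at a gradient of 0.2, and the total grows as the gradient moves away from it in either direction.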

The gradient that makes $\sum d_i^2$ as small as possible is $b=\frac{S_{x y}}{S_{x x}}$, so the line is given by $y-\bar{y}=\frac{S_{x y}}{S_{x x}}(x-\bar{x})$

where $S_{x y}=\sum x_i y_i-n \bar{x} \bar{y}$ and $S_{x x}=\sum x_i^2-n \bar{x}^2$
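One way to see where this gradient comes from is sketched below. It assumes each $d_i$ is the vertical distance from the point $(x_i, y_i)$ to a line through the mean point, and it uses the equivalent forms $S_{x y}=\sum(x_i-\bar{x})(y_i-\bar{y})$ and $S_{x x}=\sum(x_i-\bar{x})^2$.

```latex
\begin{aligned}
& d_i=(y_i-\bar{y})-b(x_i-\bar{x}) \\
& \sum d_i^2=\sum(y_i-\bar{y})^2-2 b S_{x y}+b^2 S_{x x} \\
& \frac{d}{d b} \sum d_i^2=-2 S_{x y}+2 b S_{x x}=0 \\
& \therefore b=\frac{S_{x y}}{S_{x x}}
\end{aligned}
```

Because $S_{x x}$ is positive whenever the $x$ values are not all equal, the quadratic in $b$ has a minimum, rather than a maximum, at this value.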

Least Squares Regression Line - Summary

This can be summarised as:

The least squares regression line is $y=a+b x$

Where:

a=\bar{y}-b \bar{x}

And:

b=\frac{S_{x y}}{S_{x x}}=\frac{\sum x_i y_i-n \bar{x} \bar{y}}{\sum x_i^2-n \bar{x}^2}
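The summary formulas translate directly into a short function. The sketch below is one possible Python version, applied to the pocket money data from the introduction; the function name `least_squares_line` is hypothetical and chosen for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x, using b = S_xy / S_xx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # intercept on the y-axis
    return a, b

ages = [6, 7, 9, 10, 11, 12, 13, 14, 15, 15, 16, 16]
money = [3.5, 2, 3, 4, 4, 4.5, 4.5, 4, 4.5, 3.5, 4.5, 6]

a, b = least_squares_line(ages, money)
print(f"y = {a:.2f} + {b:.2f}x")  # y = 1.55 + 0.20x
```

The function divides by $S_{x x}$, so it fails if every $x$ value is the same; a line cannot be fitted in that case anyway.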

Example

The following data shows information relating to time and the concentration of a particular chemical:

\begin{array}{|l|c|c|c|c|c|c|} \hline \text { Time, } \mathrm{x} \text {, Hours } & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline \text { Concentration, } \mathrm{y} & 2.4 & 4.3 & 5.2 & 6.8 & 9.1 & 11.8 \\ \hline \end{array}

a) Find the equation of the regression line of y on x

b) Illustrate the data and your regression line on a scatter diagram

c) Estimate the concentration of the chemical after i) 3.5 hours and ii) 10 hours

Solution

\begin{array}{|c|c|c|c|} \hline \mathbf{x} & \mathbf{y} & \mathbf{x}^{\mathbf{2}} & \mathbf{x y} \\ \hline 0 & 2.4 & 0 & 0 \\ \hline 1 & 4.3 & 1 & 4.3 \\ \hline 2 & 5.2 & 4 & 10.4 \\ \hline 3 & 6.8 & 9 & 20.4 \\ \hline 4 & 9.1 & 16 & 36.4 \\ \hline 5 & 11.8 & 25 & 59 \\ \hline \mathbf{1 5} & \mathbf{3 9 . 6} & \mathbf{5 5} & \mathbf{1 3 0 . 5} \\ \hline \end{array}

\begin{aligned} & \bar{x}=\frac{\sum x}{n}=\frac{15}{6}=2.5 \text { and } \bar{y}=\frac{\sum y}{n}=\frac{39.6}{6}=6.6 \\ & \therefore S_{x x}=\sum x^2-n \bar{x}^2=55-6 \times 2.5^2=17.5 \\ & \therefore S_{x y}=\sum x y-n \bar{x} \bar{y}=130.5-6 \times 2.5 \times 6.6=31.5 \\ & \therefore b=\frac{S_{x y}}{S_{x x}}=\frac{\sum x y-n \bar{x} \bar{y}}{\sum x^2-n \bar{x}^2}=\frac{31.5}{17.5}=1.8 \end{aligned}

So the least squares regression line is given by:

\begin{gathered} y-\bar{y}=b(x-\bar{x}) \\ \therefore y-6.6=1.8(x-2.5) \\ \therefore y=2.1+1.8 x \end{gathered}
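The working for part a) can be reproduced step by step. This Python sketch rebuilds the $x^2$ and $xy$ column totals from the table and then applies the same formulas; the variable names are illustrative, and values are rounded on printing only to hide floating-point noise.

```python
hours = [0, 1, 2, 3, 4, 5]                # x, time in hours
conc = [2.4, 4.3, 5.2, 6.8, 9.1, 11.8]    # y, concentration

n = len(hours)
sum_x = sum(hours)
sum_y = sum(conc)
sum_x2 = sum(x * x for x in hours)                  # total of the x^2 column
sum_xy = sum(x * y for x, y in zip(hours, conc))    # total of the xy column

x_bar = sum_x / n
y_bar = sum_y / n
s_xx = sum_x2 - n * x_bar ** 2
s_xy = sum_xy - n * x_bar * y_bar
b = s_xy / s_xx          # gradient
a = y_bar - b * x_bar    # intercept

print(sum_x, round(sum_y, 1), sum_x2, round(sum_xy, 1))  # 15 39.6 55 130.5
print(round(s_xx, 2), round(s_xy, 2))                    # 17.5 31.5
print(round(b, 2), round(a, 2))                          # 1.8 2.1
```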

b) The data and the regression line are shown on a scatter diagram.

c) Substituting each time into the regression equation:

\begin{aligned} & x=3.5 ; y=2.1+1.8 \times 3.5=8.4 \\ & x=10 ; y=2.1+1.8 \times 10=20.1 \end{aligned}
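The two estimates can also be sketched in code. The example below assumes the fitted line $y=2.1+1.8x$ and the observed time range of 0 to 5 hours; the function name `estimate` and the interpolation/extrapolation labels are illustrative additions rather than part of the original solution.

```python
A, B = 2.1, 1.8          # intercept and gradient of the fitted line
X_MIN, X_MAX = 0, 5      # range of times in the data

def estimate(x):
    """Return the estimated concentration and whether x lies inside the data range."""
    y = A + B * x
    kind = "interpolation" if X_MIN <= x <= X_MAX else "extrapolation"
    return round(y, 1), kind

for hours in (3.5, 10):
    print(hours, estimate(hours))
# 3.5 (8.4, 'interpolation')
# 10 (20.1, 'extrapolation')
```

The estimate at 10 hours lies well outside the observed times, so it relies on the linear trend continuing and should be treated with more caution than the estimate at 3.5 hours.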

Although many calculations have been shown in this article, you are expected to be able to find these values with the aid of a calculator.

The calculations are shown here to explain the process. Even though you will be using a calculator, it is still important to understand how a least squares regression line is calculated.

