Statistics

Statistical Investigations & Sampling

1

Sampling Methods

Draw a line to match each method to its description.

Random sample
Convenience sample
Stratified sample
Census
Everyone in the population is surveyed
Groups in the sample reflect population proportions
Every member has equal chance of selection
Easiest people to reach are chosen
2

Biased or Unbiased?

Sort each sampling scenario.

Survey only Year 8 students about school lunch to represent the whole school
Randomly select 50 names from school roll
Ask your friends about favourite music genre to represent all Year 8s
Use random number generator to select participants
Likely biased
Likely unbiased
3

Interpret a Histogram

Test scores: 50–59: 3 students, 60–69: 8, 70–79: 12, 80–89: 7, 90–99: 2.

Total students:

32
30
34

Most common score range:

70–79
60–69
80–89

Distribution shape:

Roughly symmetrical
Positively skewed
Negatively skewed
4

Shape of Distribution

Sort each description by likely distribution shape.

Most students score near top; few score low
Most people earn moderate incomes; a few earn very high
Heights of adult women
Time to complete a difficult exam
Positively skewed
Negatively skewed
Roughly symmetrical
5

Interpret Box Plots

Box plot: Min=20, Q1=35, Median=50, Q3=65, Max=90.

Interquartile range (IQR) =

30
15
70

50% of data lies between:

35 and 65
20 and 90
50 and 90

Spread of middle 50% of data:

IQR = 30
Range = 70
Median = 50
6

Design a Statistical Investigation

Answer all parts.

Write a statistical question that Year 8 students could investigate about sleep and school performance.

What sampling method would you use? How many people? Why?

What data display would best show your results? What conclusions might you draw?

7

Calculating Mean, Median, Mode and Range

Use the data set: 12, 15, 15, 18, 20, 22, 25.

Mean (average, to 1 decimal place):

18.1
15
22

Median (middle value):

18
15
20

Mode (most common):

15
12
18

Range (highest − lowest):

13
7
18
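
The four measures in this activity can be checked with Python's standard `statistics` module. The sketch below uses a made-up data set (not the one in the activity), so the variable names and values are illustrative assumptions only.

```python
from statistics import mean, median, mode

# Hypothetical data set, chosen only for illustration.
data = [4, 7, 7, 9, 13]

data_mean = mean(data)              # sum of the values divided by how many there are
data_median = median(data)          # middle value once the data is sorted
data_mode = mode(data)              # value that appears most often
data_range = max(data) - min(data)  # highest value minus lowest value

print(data_mean, data_median, data_mode, data_range)
```

Swapping in a different list for `data` lets you check hand calculations for any small data set.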
8

Back-to-Back Stem-and-Leaf

Draw a back-to-back stem-and-leaf plot from the data, then use it to answer the questions.

Class A scores: 52, 56, 61, 63, 68, 72, 75, 78, 85, 90. Class B scores: 48, 55, 57, 60, 62, 70, 73, 80, 83, 88. Draw a back-to-back stem-and-leaf plot with stems 4, 5, 6, 7, 8, 9.

Draw here

Find the median and range for each class. Which class performed more consistently? Explain using IQR or range.

9

Effect of Outliers

Data set: 10, 12, 11, 14, 13, 12, 60. The value 60 is an outlier.

Mean without the outlier:

12
60
13

Mean with the outlier (to the nearest whole number):

19
12
60

Which measure is least affected by the outlier?

Median
Mean
Range
10

Misleading Statistics

Sort each example: is it a misleading or fair use of statistics?

Graph y-axis starts at 90 to make a small difference look huge
Mean salary reported when a few executives earn much more than others
Random sample of 200 students used for a school survey
A survey of only satisfied customers used to rate a product
Box plot showing both groups' distributions side by side
Percentage increase reported without the original value
Misleading
Fair and honest
11

Measure of Centre — Best Choice

Draw a line to match each situation to the most appropriate measure of centre.

House prices where a few mansions are very expensive
Most popular shoe size in a shop
Average temperature over a month
Middle mark in a class test
Median
Mode
Mean
Median
16

Calculate Statistics — Set A

Use the data set: 7, 9, 4, 11, 6, 9, 14, 3, 9, 8.

Sort the data. Find the mean, median, mode and range.

Find Q1, Q3 and the IQR.

Are there any outliers? (Outlier if value > Q3 + 1.5 × IQR or < Q1 − 1.5 × IQR)

Tip: Always sort data first before finding median or quartiles.
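
The outlier rule above can be sketched in Python. This is one possible approach, not the only one: it assumes the median-of-halves method for quartiles (with an odd number of values, the middle value is left out of both halves). The function names and the sample data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves method.

    Assumption: with an odd number of values, the middle value is left
    out of both halves. Other quartile conventions exist.
    """
    ordered = sorted(values)   # always sort first
    half = len(ordered) // 2
    lower = ordered[:half]     # values below the median
    upper = ordered[-half:]    # values above the median
    return median(lower), median(upper)

def find_outliers(values):
    """Return values more than 1.5 x IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data, chosen only for illustration.
sample = [9, 2, 30, 5, 7, 4, 8, 6]
print(quartiles(sample))      # Q1 and Q3
print(find_outliers(sample))  # values outside the fences
```

For this sample, Q1 = 4.5 and Q3 = 8.5, so the IQR is 4 and the fences sit at −1.5 and 14.5. Only 30 falls outside them.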
17

Calculate Statistics — Set B

Use the data: 22, 35, 28, 19, 31, 44, 27, 33, 25, 28, 31, 18.

Find the mean, median, mode and range.

Find Q1 and Q3. Calculate the IQR.

Tip: With 12 values, the median is the average of the 6th and 7th values once the data is sorted.
19

Reading a Histogram

A histogram shows test scores: 0–19: 2, 20–39: 5, 40–59: 11, 60–79: 14, 80–99: 8.

How many students sat the test in total?

What is the modal class (most common score range)?

Describe the shape of the distribution. Is it symmetrical, positively skewed, or negatively skewed?

What percentage of students scored 60 or more?

Tip: Remind your teenager that histograms show frequency (count) on the y-axis and grouped data on the x-axis.
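
The same kinds of questions (total, modal class, percentage above a cut-off) can be answered from any frequency table. Here is a minimal Python sketch using a hypothetical table rather than the one in this activity. The class intervals and counts are invented for illustration.

```python
# Hypothetical frequency table: (lowest score, highest score) -> number of students.
frequencies = {
    (0, 9): 1,
    (10, 19): 4,
    (20, 29): 3,
    (30, 39): 2,
}

# Total number of students is the sum of all the bar heights.
total = sum(frequencies.values())

# The modal class is the interval with the largest frequency.
modal_class = max(frequencies, key=frequencies.get)

# Add up the classes whose lowest score is at least 20.
at_least_20 = sum(count for (low, _), count in frequencies.items() if low >= 20)
percent_at_least_20 = 100 * at_least_20 / total

print(total, modal_class, percent_at_least_20)
```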
21

Box Plot Construction

Draw a box plot for the following data. Show all steps.

Data: 12, 15, 18, 20, 22, 25, 28, 30, 35, 40. Sort the data and find: Min, Q1, Median, Q3, Max.

Draw a box plot above a number line from 10 to 45.

Draw here
Tip: Steps: sort, find median, find Q1 and Q3, draw the box from Q1 to Q3 with median marked, add whiskers to min and max.
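
Once the five values have been found by hand, a rough text version of a box plot can be printed to check the drawing. The sketch below is only an illustration: `ascii_box_plot` is a hypothetical helper, it assumes whole-number values, and the five-number summary shown is made up.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, lo, hi):
    """Draw a one-line text box plot on a number line from lo to hi.

    Assumption: all seven arguments are whole numbers with
    lo <= minimum <= q1 <= median <= q3 <= maximum <= hi.
    """
    row = [" "] * (hi - lo + 1)
    for x in range(minimum, q1):          # left whisker
        row[x - lo] = "-"
    for x in range(q3 + 1, maximum + 1):  # right whisker
        row[x - lo] = "-"
    for x in range(q1, q3 + 1):           # box from Q1 to Q3
        row[x - lo] = "="
    row[minimum - lo] = "|"               # whisker ends
    row[maximum - lo] = "|"
    row[q1 - lo] = "["                    # box edges
    row[q3 - lo] = "]"
    row[median - lo] = "M"                # median marker
    return "".join(row)

# Hypothetical five-number summary: Min=3, Q1=5, Median=10, Q3=15, Max=19.
plot = ascii_box_plot(3, 5, 10, 15, 19, lo=0, hi=20)
print(plot)
print("0" + " " * 18 + "20")  # ends of the number line
```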
23

Comparing Two Data Sets

Compare the following data sets using statistics.

Class A test scores: 55, 62, 68, 70, 72, 75, 78, 80, 85, 95. Class B test scores: 40, 52, 60, 65, 70, 72, 74, 80, 88, 99. Find the median and IQR for each class.

Which class performed better overall? Which was more consistent? Use statistics to justify.

Tip: When comparing two groups, compare both centre (median or mean) and spread (IQR or range).
24

Sampling Questions

Choose the best sampling method for each situation.

Surveying 100 students from a school of 1000 by drawing names from a hat:

Random sample
Convenience sample
Census
Stratified sample

Asking everyone at a shopping centre about their political views to represent all voters:

Convenience sample (biased)
Random sample
Census
Stratified sample

Selecting 50 students with 25 from Year 7 and 25 from Year 8 proportionally:

Stratified sample
Random sample
Convenience sample
Census
25

Back-to-Back Stem-and-Leaf Interpretation

Use the following back-to-back stem-and-leaf plot.

Class A (left): 9|5| | Class B (right): 5|5|2,3,7 | 6|6|0,4,8 | 7|7|1,5,9 | 8|8|2,6 | 9|9|0. Find the median for each class.

Find the range for each class. Which class had more consistent results?

Describe a real-world situation this data might represent.

Tip: In a back-to-back plot, leaves on the left are read right-to-left.
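
To see how the left-hand leaves are ordered, here is a small Python sketch that builds a back-to-back plot from two lists. It assumes non-negative whole numbers with the tens digit as the stem. The function name and both data sets are hypothetical.

```python
def back_to_back(left, right):
    """Return rows of (left leaves, stem, right leaves).

    Left leaves are written largest-first so that, read right-to-left
    (outwards from the stem), they increase.
    """
    everything = list(left) + list(right)
    rows = []
    for stem in range(min(everything) // 10, max(everything) // 10 + 1):
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        rows.append((
            " ".join(map(str, left_leaves)),
            stem,
            " ".join(map(str, right_leaves)),
        ))
    return rows

# Hypothetical scores, chosen only for illustration.
group_a = [31, 35, 42, 48, 48, 50]
group_b = [33, 39, 41, 47, 52, 56]

plot_rows = back_to_back(group_a, group_b)
for left_text, stem, right_text in plot_rows:
    print(f"{left_text:>8} | {stem} | {right_text}")
```

In the first printed row, the left leaves "5 1" on stem 3 stand for 31 and 35 when read outwards from the stem.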
26

Which Graph Type?

Match each purpose to the most appropriate graph.

Comparing spread of two groups side by side
Showing frequency of grouped numerical data
Showing individual values for a small data set
Displaying each value precisely while preserving order
Showing distribution shape for large data sets
Reading min, max, median, Q1, Q3 at a glance
Histogram
Box plot
Dot plot
Stem-and-leaf
27

Misleading Statistics Analysis

Identify and explain why each statistical presentation is misleading.

A graph shows sales increasing from 95 to 100 units, but the y-axis starts at 93. Why might this be misleading? Sketch a more honest version.

Draw here

A health supplement company claims '80% of users saw results' — from a survey of 10 customers who volunteered feedback. What is wrong with this claim?

A school reports 'our mean exam score is 72' when three students scored 20 and the rest scored 80+. Which measure of centre is more appropriate here?

Tip: Teach critical thinking: always ask 'what is the source?', 'what is the sample?', and 'are the axes honest?'
28

Designing a Survey

Design a valid statistical investigation.

Choose a topic to investigate (e.g. sleep, exercise, screen time). Write your statistical question.

Describe your sampling method. How would you select participants? How many would you select? Why?

Write two survey questions: one that is neutral and one that is biased. Explain why the second is biased.

Tip: Good surveys have neutral questions, a random sample, and clear measurement criteria.
31

Effect of Adding Data Points

Investigate how the mean and median change when data is added.

Original data: 10, 12, 14, 16, 18. Mean = 14, Median = 14. Add the value 50. What are the new mean and median? Which changed more?

Why is the median considered more 'robust' to outliers than the mean?

Tip: Understanding how statistics change with new data is essential for data literacy.
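
The effect of one extreme value can be checked quickly in Python. This sketch uses a made-up data set (not the one in the activity). The variable names are illustrative assumptions.

```python
from statistics import mean, median

# Hypothetical data, chosen only for illustration.
original = [2, 4, 6, 8, 10]
extended = original + [60]  # add one extreme value

mean_change = mean(extended) - mean(original)        # how far the mean moved
median_change = median(extended) - median(original)  # how far the median moved

print("Mean:", mean(original), "->", mean(extended))
print("Median:", median(original), "->", median(extended))
```

Here the mean moves from 6 to 15 while the median only moves from 6 to 7. The mean uses the size of every value, but the median depends only on the order of the values.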
32

Interpreting Box Plots Side by Side

Two box plots compare two groups.

Group A: Min=10, Q1=20, Med=30, Q3=40, Max=60. Group B: Min=15, Q1=25, Med=35, Q3=50, Max=70. Which group has a larger IQR? Which has a higher median?

Write two sentences comparing the groups. Include at least one reference to centre and one to spread.

Tip: Side-by-side box plots are a powerful tool for comparing distributions.
34

Statistical Report Writing

Write a short statistical report (4–6 sentences) based on the following data.

Hours of sleep per night for 10 Year 8 students: 7, 8, 6, 9, 7, 8, 5, 8, 7, 10. Calculate mean, median and range. Write your report comparing this to the recommended 9 hours for teenagers.

Tip: Mathematical writing should include: context, key statistics, interpretation, and a conclusion.
35

Sampling Simulation

Investigate how sample size affects accuracy.

A bag has 5 red and 5 blue balls. Simulate 10 draws (with replacement) by flipping a coin (H=red, T=blue). Count reds. How close was your proportion to 50%?

Now simulate 50 draws. How close was your proportion to 50%? What does this suggest about sample size?

Tip: This investigation connects probability and statistics, a key cross-strand link.
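
If coin flipping becomes tedious for larger samples, the draws can be simulated in Python. This is a minimal sketch, assuming each draw is red with probability 0.5 (5 red balls out of 10, with replacement). The function name and the seed value are arbitrary choices.

```python
import random

def proportion_red(draws, rng):
    """Simulate draws with replacement and return the proportion that are red."""
    reds = sum(1 for _ in range(draws) if rng.random() < 0.5)
    return reds / draws

# A fixed seed makes the run repeatable; change it to see different samples.
rng = random.Random(8)
for draws in (10, 50, 1000):
    print(draws, "draws:", proportion_red(draws, rng))
```

Running this several times with different seeds usually shows the proportion for 10 draws jumping around much more than the proportion for 1000 draws.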
37

Outlier Investigation

Identify outliers using the IQR method and discuss their effect.

Data: 8, 10, 11, 12, 13, 14, 15, 16, 17, 50. Q1 = 11, Q3 = 16, IQR = 5. Is 50 an outlier by the IQR method? Show working.

What is the mean with and without the outlier? What is the median with and without the outlier?

If this data represents exam scores, what might the outlier value of 50 tell you?

Tip: The IQR outlier rule: a value is an outlier if it is more than 1.5 × IQR above Q3 or below Q1.
38

Cross-Strand: Statistics and Algebra

Use algebra to solve statistical problems.

The mean of five numbers is 12. Four of the numbers are 8, 11, 14, 16. What is the fifth number? Show your algebraic working.

The mean of a data set is 20 and there are n values. If one more value of 30 is added, the new mean is 21. Find n.

Tip: Using algebra to find missing values in statistics problems is a Year 8 and 9 cross-strand skill.
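
Both kinds of problem rest on the fact that sum = mean × number of values. The Python sketch below turns that rearrangement into two small helper functions. The function names and the example numbers are hypothetical and differ from the questions above.

```python
def missing_value(target_mean, count, known_values):
    """Find the one missing value.

    The total must equal target_mean * count, so the missing value
    is that total minus the sum of the known values.
    """
    return target_mean * count - sum(known_values)

def original_count(old_mean, added_value, new_mean):
    """Find n when adding one value changes the mean.

    Start from: old_mean * n + added_value = new_mean * (n + 1)
    Rearranged: n = (added_value - new_mean) / (new_mean - old_mean)
    """
    return (added_value - new_mean) / (new_mean - old_mean)

# Hypothetical example 1: four numbers have mean 10; three of them are 7, 9, 12.
print(missing_value(10, 4, [7, 9, 12]))

# Hypothetical example 2: the mean is 10; adding the value 22 raises it to 12.
print(original_count(10, 22, 12))
```

Check for the second example: 5 values with mean 10 sum to 50, and (50 + 22) ÷ 6 = 12, as required.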
39

Real-World Data Investigation

Use data from a real-world context to practise statistics.

Australian capital city populations (approximate): Sydney 5.3M, Melbourne 5.0M, Brisbane 2.6M, Perth 2.1M, Adelaide 1.4M, Hobart 0.24M, Darwin 0.15M, Canberra 0.46M. Find the mean and median. Which is more representative?

Why is the median often more useful than the mean for population data?

Tip: Real data often has surprises (outliers, skew, or patterns) that pure numerical exercises don't reveal.
41

Media Statistics Critique

Critically evaluate a statistical claim from the media.

A news headline reads: 'Screen time linked to lower grades — students who use screens 3+ hours per day score 10% lower on average.' List three questions you would ask before accepting this claim.

What would a well-designed study need to include to make this claim valid? (Sample size, control variables, causation vs correlation, etc.)

Tip: Statistical literacy is essential in the modern world; this activity develops it directly.
44

Survey Bias Investigation

Evaluate the bias in each survey scenario.

A survey asks: 'Don't you agree that more homework is bad for students?' What kind of bias does this introduce?

Only 20% of people respond to an online survey. The 80% who didn't respond may have different opinions. What is this problem called?

Design an unbiased version of the question in part 1.

Tip: Bias can creep into surveys through question wording, sampling method, or who chooses to respond.
47

Finding the Missing Value Using Mean

Use algebra to find the missing data value. Show all working.

The mean of 6 values is 15. Five values are: 12, 18, 10, 20, 14. Find the missing value.

The mean of 8 values is 25. The sum of seven of them is 175. What is the eighth value?

Tip: This bridges statistics and algebra, a common assessment task type.
48

Choosing Sample Size

Discuss how sample size affects the reliability of a study.

A study tests 3 patients with a new medicine. All 3 improve. Can we conclude the medicine works? Explain.

A second study tests 500 patients. 380 improve. What is the improvement rate? Is this more convincing than the first study? Why?

Tip: Larger samples generally give more reliable results but cost more. There is always a trade-off.
49

Reflection and Summary

Answer each question to consolidate your learning.

List three different ways to measure spread. When would you choose each one?

Describe a situation where the mean would give a misleading picture of the data. What would you use instead?

Why is it important to know HOW data was collected, not just WHAT the data shows?

Tip: Open-ended reflection develops metacognition and mathematical communication.
50

Statistical Investigation at Home

Conduct a mini statistical investigation.

  • 1. Ask 10 people in your family or neighbourhood how many hours of screen time they have per day. Calculate the mean, median, mode and range. Display the data in a dot plot.
  • 2. Look at the scores or statistics from last week's AFL or NRL games. Compare two teams using their scores. Which team was more consistent? Use range or IQR to justify.
  • 3. Find two graphs in a newspaper or online that could be misleading. Identify what makes them misleading and describe how you would display the data more honestly.