Statistical Investigations & Sampling
Sampling Methods
Draw a line to match each method to its description.
Biased or Unbiased?
Sort each sampling scenario.
Interpret a Histogram
Test scores: 50–59: 3 students, 60–69: 8, 70–79: 12, 80–89: 7, 90–99: 2.
Total students:
Most common score range:
Distribution shape:
Shape of Distribution
Sort each description by likely distribution shape.
Interpret Box Plots
Box plot: Min=20, Q1=35, Median=50, Q3=65, Max=90.
Interquartile range (IQR) =
50% of data lies between:
Spread of middle 50% of data:
Design a Statistical Investigation
Answer all parts.
Write a statistical question that Year 8 students could investigate about sleep and school performance.
What sampling method would you use? How many people? Why?
What data display would best show your results? What conclusions might you draw?
Calculating Mean, Median, Mode and Range
Use the data set: 12, 15, 15, 18, 20, 22, 25.
Mean (average):
Median (middle value):
Mode (most common):
Range (highest − lowest):
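One way to check hand calculations for this data set is a short Python sketch using the standard library's `statistics` module. The variable names are illustrative only and are not part of the worksheet.

```python
from statistics import mean, median, multimode

data = [12, 15, 15, 18, 20, 22, 25]

data_mean = mean(data)              # sum of the values divided by how many there are
data_median = median(data)          # middle value once the data is sorted
data_modes = multimode(data)        # list of the most frequent value(s)
data_range = max(data) - min(data)  # highest minus lowest

print(f"mean   = {data_mean:.2f}")
print(f"median = {data_median}")
print(f"mode   = {data_modes}")
print(f"range  = {data_range}")
```

`multimode` returns a list, so it also copes with data sets that have more than one mode.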
Back-to-Back Stem-and-Leaf
Draw a back-to-back stem-and-leaf plot from the scores below, then use it to answer the questions.
Class A scores: 52, 56, 61, 63, 68, 72, 75, 78, 85, 90. Class B scores: 48, 55, 57, 60, 62, 70, 73, 80, 83, 88. Draw a back-to-back stem-and-leaf plot with stems 4, 5, 6, 7, 8, 9.
Find the median and range for each class. Which class performed more consistently? Explain using IQR or range.
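A hand-drawn plot can be compared against a text version. The sketch below assumes two-digit scores, so the stem is the tens digit and the leaf is the units digit. The function name `back_to_back` is hypothetical.

```python
def back_to_back(left, right, stems):
    """Return the rows of a back-to-back stem-and-leaf plot as strings."""
    rows = []
    for stem in stems:
        # Leaves are the units digits of the values whose tens digit equals the stem.
        # Left-hand leaves read outward from the stem, so they are listed in reverse.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        left_text = " ".join(str(d) for d in left_leaves)
        right_text = " ".join(str(d) for d in right_leaves)
        rows.append(f"{left_text:>9} | {stem} | {right_text}")
    return rows


class_a = [52, 56, 61, 63, 68, 72, 75, 78, 85, 90]
class_b = [48, 55, 57, 60, 62, 70, 73, 80, 83, 88]

for row in back_to_back(class_a, class_b, stems=range(4, 10)):
    print(row)
```

Class A's leaves appear to the left of each stem and Class B's to the right, matching the layout asked for in the question.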
Effect of Outliers
Data set: 10, 12, 11, 14, 13, 12, 60. The value 60 is an outlier.
Mean without the outlier:
Mean with the outlier:
Which measure is least affected by the outlier?
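The effect of the outlier can be seen by computing each measure twice, once with 60 and once without it. This is a minimal sketch for checking working, and the variable names are illustrative.

```python
from statistics import mean, median

with_outlier = [10, 12, 11, 14, 13, 12, 60]
without_outlier = [x for x in with_outlier if x != 60]  # drop the outlier

mean_with = mean(with_outlier)
mean_without = mean(without_outlier)
median_with = median(with_outlier)
median_without = median(without_outlier)

print(f"mean:   {mean_without:.2f} without, {mean_with:.2f} with")
print(f"median: {median_without} without, {median_with} with")
```

Comparing how far each measure moves gives evidence for the final question in this activity.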
Misleading Statistics
Sort each example: is it a misleading or fair use of statistics?
Measure of Centre — Best Choice
Draw a line to match each situation to the most appropriate measure of centre.
Calculate Statistics — Set A
Use the data set: 7, 9, 4, 11, 6, 9, 14, 3, 9, 8.
Sort the data. Find the mean, median, mode and range.
Find Q1, Q3 and the IQR.
Are there any outliers? (Outlier if value > Q3 + 1.5 × IQR or < Q1 − 1.5 × IQR)
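The quartile and outlier steps above can be sketched in Python. This version assumes the common school convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Calculators and software sometimes use other conventions and may give slightly different quartiles. The function name `quartiles` is hypothetical.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values. For an odd count the middle value
    belongs to neither half.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)


data = [7, 9, 4, 11, 6, 9, 14, 3, 9, 8]

q1, q3 = quartiles(data)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr   # values below this count as outliers
high_fence = q3 + 1.5 * iqr  # values above this count as outliers
outliers = [x for x in data if x < low_fence or x > high_fence]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: {low_fence} to {high_fence}")
print(f"outliers: {outliers}")
```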
Calculate Statistics — Set B
Use the data: 22, 35, 28, 19, 31, 44, 27, 33, 25, 28, 31, 18.
Find the mean, median, mode and range.
Find Q1 and Q3. Calculate the IQR.
Reading a Histogram
A histogram shows test scores: 0–19: 2, 20–39: 5, 40–59: 11, 60–79: 14, 80–99: 8.
How many students sat the test in total?
What is the modal class (most common score range)?
Describe the shape of the distribution. Is it symmetrical, positively skewed, or negatively skewed?
What percentage of students scored 60 or more?
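The counting questions for this histogram can be checked with a short sketch. It stores each class as a (low, high, frequency) tuple, which is a representation chosen here for illustration.

```python
# Each tuple is (lowest score in class, highest score in class, number of students).
classes = [(0, 19, 2), (20, 39, 5), (40, 59, 11), (60, 79, 14), (80, 99, 8)]

total = sum(count for _, _, count in classes)

# The modal class is the one with the largest frequency.
modal_low, modal_high, _ = max(classes, key=lambda c: c[2])

# Students scoring 60 or more are in the classes whose lower bound is at least 60.
sixty_plus = sum(count for low, _, count in classes if low >= 60)
percent_sixty_plus = 100 * sixty_plus / total

print(f"total students: {total}")
print(f"modal class: {modal_low} to {modal_high}")
print(f"scored 60 or more: {percent_sixty_plus:.1f}%")
```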
Box Plot Construction
Draw a box plot for the following data. Show all steps.
Data (already in order): 12, 15, 18, 20, 22, 25, 28, 30, 35, 40. Find: Min, Q1, Median, Q3, Max.
Draw a box plot above a number line from 10 to 45.
Comparing Two Data Sets
Compare the following data sets using statistics.
Class A test scores: 55, 62, 68, 70, 72, 75, 78, 80, 85, 95. Class B test scores: 40, 52, 60, 65, 70, 72, 74, 80, 88, 99. Find the median and IQR for each class.
Which class performed better overall? Which was more consistent? Use statistics to justify.
Sampling Questions
Name the sampling method used in each situation.
Surveying 100 students from a school of 1000 by drawing names from a hat:
Asking everyone at a shopping centre about their political views to represent all voters:
Selecting 50 students in proportion to year-group size, with 25 from Year 7 and 25 from Year 8:
Back-to-Back Stem-and-Leaf Interpretation
Use the following back-to-back stem-and-leaf plot.
Class A (left): 9|5| | Class B (right): 5|5|2,3,7 | 6|6|0,4,8 | 7|7|1,5,9 | 8|8|2,6 | 9|9|0. Find the median for each class.
Find the range for each class. Which class had more consistent results?
Describe a real-world situation this data might represent.
Which Graph Type?
Match each data type to the most appropriate graph.
Misleading Statistics Analysis
Identify and explain why each statistical presentation is misleading.
A graph shows sales increasing from 95 to 100 units, but the y-axis starts at 93. Why might this be misleading? Sketch a more honest version.
A health supplement company claims '80% of users saw results' — from a survey of 10 customers who volunteered feedback. What is wrong with this claim?
A school reports 'our mean exam score is 72' when three students scored 20 and the rest scored 80+. Which measure of centre is more appropriate here?
Designing a Survey
Design a valid statistical investigation.
Choose a topic to investigate (e.g. sleep, exercise, screen time). Write your statistical question.
Describe your sampling method. How would you select participants? How many would you select? Why?
Write two survey questions: one that is neutral and one that is biased. Explain why the second is biased.
Effect of Adding Data Points
Investigate how the mean and median change when data is added.
Original data: 10, 12, 14, 16, 18. Mean = 14, Median = 14. Add the value 50. What are the new mean and median? Which changed more?
Why is the median considered more 'robust' to outliers than the mean?
Interpreting Box Plots Side by Side
Two box plots compare two groups.
Group A: Min=10, Q1=20, Med=30, Q3=40, Max=60. Group B: Min=15, Q1=25, Med=35, Q3=50, Max=70. Which group has a larger IQR? Which has a higher median?
Write two sentences comparing the groups. Include at least one reference to centre and one to spread.
Statistical Report Writing
Write a short statistical report (4–6 sentences) based on the following data.
Hours of sleep per night for 10 Year 8 students: 7, 8, 6, 9, 7, 8, 5, 8, 7, 10. Calculate mean, median and range. Write your report comparing this to the recommended 9 hours for teenagers.
Sampling Simulation
Investigate how sample size affects accuracy.
A bag has 5 red and 5 blue balls. Simulate 10 draws (with replacement) by flipping a coin (H=red, T=blue). Count reds. How close was your proportion to 50%?
Now simulate 50 draws. How close was your proportion to 50%? What does this suggest about sample size?
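A single run of 10 or 50 coin flips can land anywhere, so one way to see the overall pattern is to repeat the whole experiment many times on a computer. The sketch below does this. The 1000 repeats and the seed value are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import random


def proportion_red(draws, rng):
    """Simulate draws with replacement from a half-red bag; return the proportion of reds."""
    reds = sum(1 for _ in range(draws) if rng.random() < 0.5)
    return reds / draws


def typical_error(draws, repeats, rng):
    """Average distance between the sample proportion and 0.5 over many repeated experiments."""
    total = sum(abs(proportion_red(draws, rng) - 0.5) for _ in range(repeats))
    return total / repeats


rng = random.Random(2024)  # fixed seed so the run is repeatable
for draws in (10, 50):
    error = typical_error(draws, 1000, rng)
    print(f"{draws} draws: average distance from 50% = {error:.3f}")
```

Comparing the two printed distances with your own coin-flip results can support the answer about sample size.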
Outlier Investigation
Identify outliers using the IQR method and discuss their effect.
Data: 8, 10, 11, 12, 13, 14, 15, 16, 17, 50. Q1 = 11, Q3 = 16, IQR = 5. Is 50 an outlier by the IQR method? Show working.
What is the mean with and without the outlier? What is the median with and without the outlier?
If this data represents exam scores, what might the outlier value of 50 tell you?
Cross-Strand: Statistics and Algebra
Use algebra to solve statistical problems.
The mean of five numbers is 12. Four of the numbers are 8, 11, 14, 16. What is the fifth number? Show your algebraic working.
The mean of a data set is 20 and there are n values. If one more value of 30 is added, the new mean is 21. Find n.
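Both questions rest on the fact that the sum of the values equals the mean multiplied by the number of values. The sketch below checks answers found algebraically. The search over whole numbers for n is only a check and does not replace the algebraic working.

```python
# Part 1: five numbers with mean 12 must add up to 5 * 12.
known = [8, 11, 14, 16]
fifth = 5 * 12 - sum(known)

# Part 2: the original sum is 20n. Adding 30 gives n + 1 values with mean 21,
# so 20n + 30 = 21(n + 1). Try whole-number values of n until the equation holds.
n = next(k for k in range(1, 1000) if 20 * k + 30 == 21 * (k + 1))

print(f"fifth number = {fifth}")
print(f"n = {n}")
```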
Real-World Data Investigation
Use data from a real-world context to practise statistics.
Australian capital city populations (approximate): Sydney 5.3M, Melbourne 5.0M, Brisbane 2.6M, Perth 2.1M, Adelaide 1.4M, Hobart 0.24M, Darwin 0.15M, Canberra 0.46M. Find the mean and median. Which is more representative?
Why is the median often more useful than the mean for population data?
Media Statistics Critique
Critically evaluate a statistical claim from the media.
A news headline reads: 'Screen time linked to lower grades — students who use screens 3+ hours per day score 10% lower on average.' List three questions you would ask before accepting this claim.
What would a well-designed study need to include to make this claim valid? (Sample size, control variables, causation vs correlation, etc.)
Survey Bias Investigation
Evaluate the bias in each survey scenario.
A survey asks: 'Don't you agree that more homework is bad for students?' What kind of bias does this introduce?
Only 20% of people respond to an online survey. The 80% who didn't respond may have different opinions. What is this problem called?
Design an unbiased version of the question in part 1.
Finding the Missing Value Using Mean
Use algebra to find the missing data value. Show all working.
The mean of 6 values is 15. Five values are: 12, 18, 10, 20, 14. Find the missing value.
The mean of 8 values is 25. The sum of seven of them is 175. What is the eighth value?
Choosing Sample Size
Discuss how sample size affects the reliability of a study.
A study tests 3 patients with a new medicine. All 3 improve. Can we conclude the medicine works? Explain.
A second study tests 500 patients. 380 improve. What is the improvement rate? Is this more convincing than the first study? Why?
Reflection and Summary
Answer each question to consolidate your learning.
List three different ways to measure spread. When would you choose each one?
Describe a situation where the mean would give a misleading picture of the data. What would you use instead?
Why is it important to know HOW data was collected, not just WHAT the data shows?
Statistical Investigation at Home
Conduct a mini statistical investigation.
1. Ask 10 people in your family or neighbourhood how many hours of screen time they have per day. Calculate the mean, median, mode and range. Display the data in a dot plot.
2. Look at the scores or statistics from last week's AFL or NRL games. Compare two teams using their scores. Which team was more consistent? Use range or IQR to justify.
3. Find two graphs in a newspaper or online that could be misleading. Identify what makes them misleading and describe how you would display the data more honestly.