The FRQ is a great way to prep for the AP exam! Review FRQ practice writing samples and corresponding feedback from Fiveable teacher Jerry Kosoff.
Natural gas is used in many households to heat water, provide cooking fuel, and heat the home. January is typically a month in which many homes have their highest usage of natural gas. A utility company reviewed data from their customers in a certain city for a period of 15 years. For each year, the company recorded the average daily high temperature in the month of January (in degrees Fahrenheit), as well as the average household usage of natural gas (measured in therms, a unit of heat energy). The company displayed the data on a scatterplot, with the average daily high temperature on the x-axis and the average household usage of natural gas on the y-axis. The company noticed a negative, strong, linear association between the variables.
a. In the context of this situation, describe the meaning of the words "negative," "strong," and "linear."
b. A least-squares regression line created from the company's data is shown below.
Interpret the meaning of the slope of this regression line in the context of this problem.
c. In one of the years in the data set (2015), the average daily high temperature for the month of January was 20 degrees Fahrenheit, and the average household natural gas usage was 45 therms. Calculate and interpret the residual for the year 2015.
a.) The word negative means that as the daily temperature increases, the usage of natural gas decreases. By using the word strong, the company is implying that a lot (if not all) of the points decrease negatively at the same correlation. The word linear shows that the line of best fit would be a straight line.
b.) As the temperature decreases, it is expected that each household's usage of natural gas will decrease around 1/698 degrees Fahrenheit on average.
c.) The residual was -6.412 therms. According to the least squares regression line, the predicted usage per household of natural gas was 51.412 therms. The real average household natural gas usage was 45 therms, which is 6.412 therms less than the predicted usage.
In part (a), you do a good job of describing "negative," but your descriptions of "strong" and "linear" fall a little bit short. For "strong," it's not clear what you mean by "same correlation." If you're referring to a similar rate of change, that would fit under "linear." The word "strong" refers to the idea that the predicted number of therms used is typically close to the actual number of therms used; that is, there are small residuals based on the regression line. For "linear," I'm actually unsure if your description would get credit or not - you should be referring to how the points on the scatterplot appear to create a straight line.
In parts (b) and (c), you have strong explanations and would earn full credit for both parts. Overall, this is well done!
a) Negative means as the average daily high temperature increases, the average household usage of natural gas decreases.
Strong means the average household usage of natural gas tends to be close to the predicted average household usage of natural gas.
Linear means as the average daily high temperature increases, the average household usage of natural gas tends to decrease at a constant rate.
b) For each additional degree in temperature, we expect the average household usage of natural gas to decrease by 1.698.
c) The average natural gas usage at 45 therms was 6.42 degrees less than predicted.
In part (a), your responses are clear and concise. For the description of "negative," you could strengthen your response by saying "tends to decrease," but your response would earn credit as-is.
In part (b), I would be careful to include units in your response ("degree" should be sufficient, but adding "Fahrenheit" can strengthen it, and you should include "therms" for the natural gas). Your sentence works as written, and "expected" will work as a substitute for "predicted" or "on average," which is what's typically written on AP rubrics.
For part (c), you've correctly interpreted the residual, but reversed the units. The residual was for the number of therms being used (so we used 6.42 therms less than predicted); assuming you show your math on the real exam (hard to do in this forum, I know), you would likely earn partial credit for that mix-up.
(a) The word strong means that the points on a scatterplot are close to the line of best fit and that there's small residuals. The word linear means that the points form a straight line and that as temperature increases, the usage of natural gas decreases at a constant rate. The word negative means that as the temperature increases, the average usage of natural gas decreases and has a inverse relationship.
(b) As the temperature increases by 1 degree Fahrenheit, we expect the household usage of natural gas to decrease by 1.698 therms on average.
Residual=6.412. My observed y of 51.412 degrees Fahrenheit when it is 20 degrees Fahrenheit outside is 6.412 degrees Fahrenheit above what the model shows.
Your answer to part (a) communicates everything it needs to, and includes the context of the problem. In part (b), you've stated everything clearly, and including the expression "on average" at the end of the sentence makes it clear that this is a prediction, not a guarantee. You would earn "essentially correct" (E) for both parts (a) and (b). For part (c), be careful: the 51.412 you've calculated is the number of therms, and is a predicted value (not an observed value). Therefore, the residual should be negative instead of positive (as we'd do 45 - 51.412). Given both the reversal of signs and the incorrect units in your interpretation, it is likely that you would be scored "incorrect" (I) for that part of the problem.
a.) In the context of this situation, negative means that average daily high temp. increase (x), while the average usage of natural gas (y-hat) decreases. Strong means that the predicted average usage of natural gas is close to the actual average usage of natural gas. Linear means that the average daily temperature increases (x), while the usage of natural gas decreases at a constant rate.
b.) As the average daily high-temperature increases by 1-degree Fahrenheit, the average household usage of natural gas will decrease by 1.698 terms.
c.) y (hat)=85.372-1.698(20)=51.412 therms
Actual-Predicted= 45-51.412=-6.412 therms
The least-regression line shows that the average household natural gas usage was 6.412 there's less than the predicted household natural gas usage.
In part (a), you make good use of context throughout your answers. Be careful that when you're describing the scatterplot, you are making references to x and y (not y-hat), since the scatterplot contains actual values. You should also say things like "as the average daily high temperature increases..." in your description of "negative" and "linear" (your answers left off the "as").
In part (b), your sentence framing is good and uses correct units, but it misses a key thing that "gets" a lot of students on these types of problems: you're missing "non-deterministic" language for the "decrease by 1.698 therms" part of your sentence. That's the AP rubric's fancy way of saying "you make it sound like it will decrease by this much, when it is only predicted to decrease by this much on average." You'll need to use "predicted," "on average," "estimated," or similar words when describing slope or intercept. Your answer would earn partial credit.
In part (c), you correctly calculate and clearly interpret the meaning of your answer in the context of this problem. Nicely done!
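For anyone who wants to check the part (c) arithmetic, here is a minimal Python sketch. It assumes the regression equation y-hat = 85.372 - 1.698x that appears in the student work above (the original figure is not reproduced on this page), and the function names are made up for illustration.

```python
# Assumed from the student work on this page: y-hat = 85.372 - 1.698x
INTERCEPT = 85.372  # predicted therms when the average daily high is 0 degrees F
SLOPE = -1.698      # predicted change in therms per 1 degree F increase


def predicted_usage(temp_f: float) -> float:
    """Predicted average household usage (therms) at a given average daily high."""
    return INTERCEPT + SLOPE * temp_f


def residual(actual_therms: float, temp_f: float) -> float:
    """Residual = actual - predicted, in therms (the units of y, not x)."""
    return actual_therms - predicted_usage(temp_f)


if __name__ == "__main__":
    y_hat = predicted_usage(20)   # January 2015: 20 degrees F
    res = residual(45, 20)        # actual usage was 45 therms
    print(f"predicted: {y_hat:.3f} therms")  # predicted: 51.412 therms
    print(f"residual: {res:.3f} therms")     # residual: -6.412 therms
```

The negative sign says the actual usage was below the prediction, so the line overestimated usage for 2015 by 6.412 therms.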
a. "negative": As the average daily high-temperature increases, the average household usage of natural gas decreases
"strong": The points relating the daily high-temperature and the average household usage of natural gas fit the line of best fit well. (small residuals)
"linear": The points relating the daily high-temperature and the average household usage of natural gas follow a straight line pattern.
b. Based on the LSRL, as the average daily high-temperature increases by 1, the average household usage of natural gas decreases by 1.698
c. x= 20 Predicted average household usage of natural gas= 51.412
Residual= Actual - Predicted = 45-51.412 = -6.412
The LSRL overestimated the average household usage of natural gas when it was 20 degrees F during 2015 for being 51.412. The prediction is 6.412 larger than the actual average household usage of natural gas.
In part (a), your answers are clearly-written and presented in context. All three descriptions are correct; you would earn full credit.
In part (b), you are missing one key word that would make your answer complete, and it would be enough to mark you down to partial credit: something like "predicted" or "expected" when talking about the change in the response variable. We should say something like "...the average household usage of natural gas is predicted to decrease by 1.698 therms." This is a common error, so watch out for it.
In part (c), you perform the residual calculation correctly and describe the meaning of it in context. Full points!
a) The meaning of the word negative in this statement means that as the temperature is increasing the amount of natural gas used is decreasing. The meaning of the word strong in this statement means that the residuals of the given points are very small and are close together to the line of best fit implying that there is a very close relationship with temperature and the natural gas usage, and the word linear means that the relationship is best described by a linear straight line as the data when plotted is seen to have an approximately straight line.
b) For every increase of 1 Fahrenheit in the temperature, the predicted natural gas usage decreased by 1.698
c) The residual is represented by the formula Actual - Predicted, and the residual value which is calculated would be -6.412 therms, which represents that the actual value for the number of therms is less than the one predicted. It can be said that the actual value for the number of therms would be 45 which is lower than the predicted value of 51.412 by the LSRL.
In part (a), you clearly detail the meaning of the words "negative," "linear," and "strong," including units and context along the way.
In part (b), you give a correct description of slope - including the key word "predicted." You can strengthen your answer by including units on the response variable (therms, in this case).
In part (c), you correctly calculate the residual. On the "real exam," you'd want to show where the -6.412 came from. In your explanation of what it stands for, you should also use the number 6.412 in some way (by saying, for example, "which is 6.412 therms lower than the predicted value...").
A) Negative: for every increase in average daily high temperature in degrees Fahrenheit, the average household usage of natural gas tends to decrease.
Strong: the predicted number of therms used and the actual number of therms used are close in value creating small residual sizes and strong association
Linear: the points in the scatterplot appear to create a straight line. For every increase in average daily temperature, the average household usage of natural gas decreases at a constant rate.
B) As average daily high temperature increases by 1 degree Fahrenheit, the average household usage of natural gas is predicted to decrease by 1.698 therms.
C) y hat = 85.372 - 1.698 (20)
y hat = 51.412
y = 45
residual = 45 - 51.412
residual = -6.412
context : The actual average household usage of natural gas is 6.412 therms less than the predicted average household usage of natural gas by the LSRL.
Your answer is very thorough! All three parts address what is asked in the question (without anything unnecessary added), and you use appropriate language such as "predicted" in part (b) and provide answers in context and with appropriate units. You would likely earn full credit on all parts. Well done!