12 min read • January 5, 2021
Jerry Kosoff
Teacher Feedback
In #1, you give a plausible reason for bias (people travel a lot and would like to see the taxes increase), but do not explicitly describe the impact this has on the estimate they got (37.8%). Saying "it seems like most of the city residents would like to increase city taxes" is not the same as clearly stating that you believe the survey resulted in an overestimate of the true proportion. I know it seems harsh, but that's how the rubrics go with describing bias: (1) explain HOW the bias comes to be, (2) explain WHY the bias happens, (3) give a specific DIRECTION of the bias (over/under estimate of the true ____). Your answer only does #1 and #2.
In part 2, your sentence looks great, except you said "95% of city residents" instead of "95% of the constructed intervals" or something similar. Read your sentence back, and I think you'll see what I mean. Unfortunately, that would ding your answer from full to partial credit. You have something similar in part 3: you say "true MEAN of city residents" instead of "true PROPORTION of city residents". Mixing up mean and proportion there would also knock you down a scoring level, even though the rest of your answer is on point.
Finally, in part 4, you are correct that we need the observations (not "samples") within the sample to be independent, but that comes from the "10% condition" (that the sample size is less than 10% of the overall population size). The reason for the "at least 10 successes/failures" condition is to ensure an approximately normal sampling distribution of p-hat, from which we can calculate the confidence interval.
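To keep the two conditions straight, here is a minimal sketch that checks each one separately. The function name and all of the numbers (113 "yes" responses out of 300, a city of 50,000 residents) are hypothetical and chosen only for illustration; they are not taken from the exam prompt.

```python
def check_conditions(successes, n, population_size):
    """Return (ten_percent_ok, large_counts_ok) for a one-proportion interval.

    Hypothetical helper for illustration only.
    """
    failures = n - successes
    # 10% condition: sample is less than 10% of the population, so the
    # observations can be treated as approximately independent.
    ten_percent_ok = 10 * n < population_size
    # Large counts condition: at least 10 successes and 10 failures, so the
    # sampling distribution of p-hat is approximately normal.
    large_counts_ok = successes >= 10 and failures >= 10
    return ten_percent_ok, large_counts_ok


# Illustrative (made-up) numbers: 113 "yes" out of 300, city of 50,000.
print(check_conditions(successes=113, n=300, population_size=50_000))
```

Each condition justifies something different: the first is about independence, the second is about the shape of the sampling distribution.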
Teacher Feedback
In part #1, you would likely earn partial credit. When discussing bias on the AP exam, you typically have to do 3 things: (1) explain the source of the bias ("how" it happens), (2) explain the reason for that source existing ("why" it happens), and (3) explain the impact on the result ("what" happens). When reading your response, I see evidence for #1 and #3: you mention "not everyone responded to the survey" (#1 - how) and that this will probably "underestimate the true proportion" (#3 - the impact). To my eyes, though, your response does not address why this happens and why the non-response will lead to an underestimate, which would imply that the 37.8% is lower than the true proportion if we actually asked everyone (and it would be maybe more like 50% or something like that). You would need to make an argument for why the people who responded to the survey are more likely to say no and thus produce an underestimate: perhaps they are strongly opposed to taxes of any kind, or the wording of the question made them feel like their money could be better spent elsewhere. Whatever you decide is the case, you should present and defend why it impacts the responses. For nonresponse to turn into nonresponse bias, the people who do participate must be more likely to answer a certain way than the people who don't participate.
Additionally, while I'm not assuming this is the case here, I often see students assume that getting responses from fewer people than expected automatically produces an underestimate. It does not. "Underestimate" specifically refers to the proportion/mean/whatever-statistic-is-being-measured being lower than would be reflected in the population. A small and biased sample can produce an overestimate just as easily as an underestimate: perhaps in this scenario we ask a small group of people who live near roads with lots of potholes what they think. They would be likely to support the city's proposal more than others, and therefore produce an overestimate. [OK, thanks for coming to my TED Talk about bias. On to the next part…]
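The arithmetic behind that last sentence can be sketched in a few lines. This is only an illustration under assumed numbers: the true proportion of 0.50 and the response rates for "yes" and "no" people are hypothetical, not values from the prompt.

```python
def expected_observed_proportion(true_p, response_rate_yes, response_rate_no):
    """Expected share of 'yes' among respondents (hypothetical helper).

    true_p: true proportion of 'yes' in the population.
    response_rate_yes / response_rate_no: chance that a 'yes' or a 'no'
    person actually returns the survey.
    """
    yes_respondents = true_p * response_rate_yes
    no_respondents = (1 - true_p) * response_rate_no
    return yes_respondents / (yes_respondents + no_respondents)


# Same response rate in both groups: nonresponse, but no bias.
print(expected_observed_proportion(0.50, 0.40, 0.40))

# 'No' people respond more often: the survey underestimates the true 0.50.
print(expected_observed_proportion(0.50, 0.30, 0.50))

# 'Yes' people respond more often: the survey overestimates the true 0.50.
print(expected_observed_proportion(0.50, 0.50, 0.30))
```

The direction of the bias depends entirely on which group is more likely to respond, which is why the rubric asks you to argue for one.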
In part #2, we have a little bit of reviewing to do. In part (a), you correctly interpret what a 95% confidence interval is, but that is not the same as a confidence level. A confidence level represents a "long-run capture rate" that is then reflected in each specific confidence interval. You can check out an overview from a previous stream at this link - it's time-stamped to the part you'd need. The correct answer in this case would sound something like "if we were to take many, many random samples of 300 city residents and ask them the question, about 95% of the confidence intervals we constructed would capture the correct value for p, the proportion of all city residents who would respond yes to the question."
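If the "long-run capture rate" idea feels abstract, a small simulation can make it concrete. The sketch below assumes a true proportion of 0.40 (a made-up value, since in practice p is unknown), repeatedly draws random samples of 300, builds a one-proportion z-interval from each, and counts how often the interval captures p. The function name and the seed are likewise just illustrative choices.

```python
import math
import random


def capture_rate(true_p, n, num_samples, z_star=1.96, seed=0):
    """Fraction of simulated 95% z-intervals that contain true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    captured = 0
    for _ in range(num_samples):
        # Draw one random sample of n residents and count the "yes" answers.
        successes = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hat = successes / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            captured += 1
    return captured / num_samples


# Assumed true proportion of 0.40; samples of 300, as in the interpretation.
rate = capture_rate(true_p=0.40, n=300, num_samples=2000)
print(f"Share of intervals capturing p: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either captures p or it does not; the 95% describes the method across many samples.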
For #2 part (b), you've also committed a relatively common error: while it is true that 50% is in the interval, the presence of other, smaller values in the interval means we cannot conclude that at least 50% of residents support the proposal. It's just as plausible that 48.5%, or 49%, or 49.9% would say "yes". And since all values within a confidence interval are considered "reasonable" values for p, we cannot say with confidence that the true population proportion is at least 50%. We could only say that if the entire interval is 50% or higher - for example, (0.512, 0.592).
In part (c), you give the correct rationale for the "large counts" condition - short and to the point! This would earn full credit.
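As a quick sketch of that decision rule: the claim "at least 50%" is supported only when the lower endpoint of the interval is at or above 0.50. The helper name is hypothetical, and the interval (0.45, 0.55) is a made-up example of one that merely contains 50%; (0.512, 0.592) is the example from the paragraph above.

```python
def supports_at_least(interval, claimed_value):
    """True only if every plausible value in the interval meets the claim."""
    lower, upper = interval
    return lower >= claimed_value


# Made-up interval that contains 0.50: smaller values are still plausible.
print(supports_at_least((0.45, 0.55), 0.50))    # False

# Entire interval is above 0.50, so the claim is supported.
print(supports_at_least((0.512, 0.592), 0.50))  # True
```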
Teacher Feedback
For part 1, you would likely earn partial credit. When discussing bias on the AP exam, you typically have to do 3 things: (1) explain the source of the bias ("how" it happens), (2) explain the reason for that source existing ("why" it happens), and (3) explain the impact on the result ("what" happens). When reading your response, I see evidence for (1) and (2) when you describe "only 450 of those 100 people replied" and cite the possibility that those who replied "might have different opinions about the plan." From there, you should "take a stand", so to speak, and make a conjecture as to how those people would differ in their opinions from the general population, and whether that will produce an overestimate (more likely to say yes) or underestimate (less likely to say yes) of the true proportion. You can actually justify either direction here, as long as your explanation is clear.
Your response in part 2a is strong, and shows a clear understanding of what a confidence level represents. In part 2b, you give the correct answer ("no") with a correct reason ("50% is contained in the interval"), but you lose me a bit with your last sentence. In theory, we could get any proportion in a sample of 300 residents just by random chance alone. A more direct statement may be something like "this means that 50% is a plausible value for the proportion of all adults who would say yes, and we therefore do not have statistical evidence that the proportion is greater."
Your response in part 3 is on the money: the approximately normal sampling distribution is why we check that condition!
Teacher Feedback
In part (a), you clearly identify the possibility of non-response bias. However, when discussing bias on the exam, it's important to pick a direction of the bias. You're correct that we may end up with an over-representation of strongly opinionated people, but you should "pick a side" as to how those people will land (either more or less in favor of the proposal than the general public), resulting in either an under- or over-estimate of the true parameter. In most cases, you can defend either side, as long as you give a reasonable possibility. In parts (b) and (c), you've provided correct answers with appropriate context.
Teacher Feedback
Great work! All three parts are complete. In part (a), you named the source of bias, explained how it might impact people's responses, and connected that to the proportion we were trying to estimate. In part (b), you correctly interpret both parts, and in part (c) you give the correct reason for checking that condition.
Teacher Feedback
When discussing bias like in part (a), you need to take it a step further and explicitly comment on whether you think the sample results in this case are too high (an overestimate of p) or too low (an underestimate of p). In a case like this where it's not obvious, it's OK to "pick a side" and just go with it: for example, "citizens who are concerned about tax increases may be more likely to respond and say no, producing an underestimate of the proportion of all citizens who would support the proposal."
In part 2, your confidence level interpretation is well done (though I think you should say "95% of samples of 161 produce confidence intervals that…"), and you reach the correct conclusion in part b. Part 3 has the correct rationale for the 10 successes/failures condition.
© 2023 Fiveable Inc. All rights reserved.