The full quiz is here. The answers appear below. Comments, which are not part of the answers, are italicized.
Time: 20 minutes. This quiz is open book, open notes.
Suppose you have developed a 95% confidence interval (CI) of the mean of some pollutant concentration based on 26 biweekly measurements in the year 2000 obtained at an air monitoring station. The interval is (8, 17) (in parts per million by volume).
1. Indicate which of the following statements are correct and which are incorrect. Provide reasons for each.
a. Incorrect. This is a statement of a prediction interval (for the mean of the next 26 measurements), not a confidence interval.
b. Incorrect. This is a statement of a lower prediction limit. It ignores the lower value of 8, too.
c. Incorrect. This is a statement about a set of measurements, not about their mean.
d. Incorrect. This is a statement about a set of measurements, not about their mean.
e. Incorrect. This is a statement about a tolerance interval.
f. Incorrect. This looks correct, but is not: someone else would have a 95% probability that their interval would cover the true mean (by definition of a CI), but there is no assurance that their estimate of the mean would be covered by your interval. Indeed, this statement is a prediction about the mean of 26 independent measurements.
g. Correct. This is the definition of a confidence interval.
h. Incorrect. This statement makes little sense: how much is 95% of "all possible situations" in the (usual) case of infinitely many possible underlying distributions? And in most cases, it is not worthwhile to demand 100% probability of anything.
i. Incorrect. This is a statement of a prediction interval for the next measurement.
j. Incorrect. This is a statement of a prediction interval for the minimum and maximum of 26 measurements.
k. Incorrect. This is a statement of a k-of-m simultaneous prediction interval, where m = 26 and k = 95% * 26 = 24.7, rounded up to 25.
l. Incorrect. This is a statement of a k-of-m simultaneous prediction interval, too, albeit of a different kind than (k).
m. Incorrect. How did the computer simulate the measurements? This statement is too vague to be meaningful.
n. Incorrect. The statement holds only when the underlying distribution is Normal with arbitrary (but unknown) mean and arbitrary (but unknown) standard deviation.
o. Incorrect. This is incorrect even for the Normal distribution assumptions. For a Normal distribution, this computes a tolerance interval.
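The distinctions drawn in the answers to question 1 can be checked by simulation. The following is a minimal sketch, not part of the quiz: it assumes Normal data, uses the usual Student t interval for the mean, and picks a hypothetical true mean of 12.5 and standard deviation of 11 (values chosen only so that the interval resembles (8, 17)). The function names and the hard-coded t critical value are illustrative assumptions. It counts how often the 95% CI covers (1) the true mean, as in (g); (2) the mean of a fresh set of 26 measurements, as in (f); and (3) a single new measurement, as in (i).

```python
import math
import random
import statistics

# 97.5th percentile of Student's t with 25 degrees of freedom (n = 26).
T_CRIT = 2.0595


def t_interval(sample, t_crit=T_CRIT):
    """Two-sided 95% t confidence interval for the mean of `sample`."""
    n = len(sample)
    mean = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width


def simulate_coverage(trials=4000, n=26, mu=12.5, sigma=11.0, seed=0):
    """Fraction of simulated CIs that cover three different targets.

    mu and sigma are hypothetical values, not taken from the quiz.
    """
    rng = random.Random(seed)
    hits = {"true_mean": 0, "replicate_mean": 0, "next_value": 0}
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = t_interval(sample)
        # Someone else's independent set of n measurements.
        replicate = [rng.gauss(mu, sigma) for _ in range(n)]
        hits["true_mean"] += lo <= mu <= hi
        hits["replicate_mean"] += lo <= statistics.fmean(replicate) <= hi
        hits["next_value"] += lo <= rng.gauss(mu, sigma) <= hi
    return {name: count / trials for name, count in hits.items()}


if __name__ == "__main__":
    for name, rate in simulate_coverage().items():
        print(f"{name}: {rate:.3f}")
```

Under these assumptions, only the first rate should come out near 95%. The other two fall short of it, the single new measurement by a wide margin, which is why those statements describe prediction intervals rather than confidence intervals.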
2. List at least two statistical assumptions about the data needed for the preceding CI to be correct.
The data must be adequately modeled using tickets in a box. Thus, they must be (1) statistically independent and (2) have the same underlying distribution (come from the same box). In addition, the CI computation most likely made assumptions about the possible underlying distribution (contents of the box) and (3) those assumptions need to be a good representation of reality.
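To see why assumption (1) matters, here is a small sketch, again under assumptions not stated in the quiz: the measurements are generated either independently or as a first-order autoregressive (AR(1)) series with a hypothetical lag-one correlation `phi`, and the same nominal 95% t interval is computed each time. The function names, `phi = 0.6`, and the mean and standard deviation are illustrative choices only.

```python
import math
import random
import statistics

# 97.5th percentile of Student's t with 25 degrees of freedom (n = 26).
T_CRIT = 2.0595


def ar1_series(rng, n, mu, sigma, phi):
    """Stationary AR(1) series with mean mu, marginal SD sigma, lag-1 correlation phi.

    phi = 0 gives independent, identically distributed Normal values.
    """
    innovation_sd = sigma * math.sqrt(1.0 - phi ** 2)
    deviation = rng.gauss(0.0, sigma)  # start in the stationary distribution
    values = []
    for _ in range(n):
        values.append(mu + deviation)
        deviation = phi * deviation + rng.gauss(0.0, innovation_sd)
    return values


def coverage(phi, trials=3000, n=26, mu=12.5, sigma=11.0, seed=0):
    """Fraction of nominal 95% t intervals that cover the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = ar1_series(rng, n, mu, sigma, phi)
        mean = statistics.fmean(sample)
        half_width = T_CRIT * statistics.stdev(sample) / math.sqrt(n)
        hits += (mean - half_width) <= mu <= (mean + half_width)
    return hits / trials


if __name__ == "__main__":
    print("independent data:", coverage(phi=0.0))
    print("correlated data: ", coverage(phi=0.6))
```

With independent data the coverage should be close to the nominal 95%. With positively correlated data the interval is too narrow and covers the true mean noticeably less often, even though the formula and the arithmetic are unchanged.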
Extra credit: list as many more assumptions (statistical, scientific, or practical) as you can in question 2.
See the answer to #2. Assumption (2) has particularly strong implications, because it requires that the data be obtained in the same way, using the same sampling and analytical technique ("comparability"), and that they be "representative" of all the air to which the CI is meant to apply. This means in particular that the analytical method should not have introduced any systematic bias in the results (and thereby given a false indication of true concentrations): this is the "accuracy" concern. Also, we must assume (4) that a formula for a confidence interval (as opposed to some other interval) was actually applied and (5) that the computation was carried out correctly. Question 1 should have highlighted how important assumption (4) is; to see that assumption (5) is not trivial, consider that many people apply confidence interval formulas (and other formulas) to nondetect values, which is problematic (see chapter 10).
Scoring: The passing score is 85.
Note: There will be fewer statements for question #1 on the actual quiz.
This page is copyright (c) 2001 Quantitative Decisions. Please cite it as
This page was created 19 March 2001.