Solution to Quiz 11

Time 20 minutes.  Open book, open notes.

The full quiz is here.  The answers appear below.  Comments, which are not part of the answers, are italicized.

A statistician has computed several statistical intervals from the same set of data, assuming a Normal distribution, but he forgot to write down how each interval was computed.  Help him out by matching each interval with its description.  Explain your choices.

Interval or limit Description
1. [27.5, 34] F. 80% confidence interval for the mean.  The center of this interval coincides with the centers of #5 and #6, indicating that these three are the confidence interval for the mean and the two prediction intervals.  The estimated mean is therefore about 30.7.  This interval, being the narrowest of #1, #5, and #6, must be the confidence interval for the mean.
2. [6.5, 11.5] C. 80% confidence interval for the standard deviation.  Both values are low in this interval, but none of the available choices corresponds to an interval around a lower percentile.  Therefore this interval is likely to contain the standard deviation.  Inspection of the remaining intervals shows they are all consistent with a standard deviation in this range.  (See the "overall check" below.)
3. [40.4, 50.8] A. 80% confidence interval for the 95th percentile.  Both endpoints are high in this interval and the next, but both intervals are only moderately wide.  Therefore #3 and #4 must be confidence intervals for upper percentiles.  The interval that has higher endpoints and a wider range must correspond to the higher percentile, given that they have the same confidence.
4. [34.5, 42.2] B. 80% confidence interval for the 80th percentile.  See the remark to #3.
5. [24.3, 37.2] E. 80% prediction interval for the mean of four future values.  See the remarks to #1 and #6.  We know this interval must be wider than the confidence interval of the mean, because it must account for the additional uncertainty in the future average.
6. [11.9, 49.5] D. 80% prediction interval to contain four future values.  This interval is the widest of #1, #5, and #6, and so must be the one containing all four future values.
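
The matching argument above rests on comparing the centers and widths of the six intervals.  As a minimal sketch (the variable names and the 0.1 tolerance are illustrative choices, not part of the quiz), that comparison could be tabulated like this:

```python
# Intervals from the quiz, keyed by their number in the table.
intervals = {
    1: (27.5, 34.0),
    2: (6.5, 11.5),
    3: (40.4, 50.8),
    4: (34.5, 42.2),
    5: (24.3, 37.2),
    6: (11.9, 49.5),
}


def midpoint(interval):
    low, high = interval
    return (low + high) / 2


def width(interval):
    low, high = interval
    return high - low


# Intervals whose centers agree with the center of #1.
# The tolerance is an arbitrary allowance for rounding in the quiz values.
tolerance = 0.1
centered_on_mean = [
    k for k, iv in intervals.items()
    if abs(midpoint(iv) - midpoint(intervals[1])) < tolerance
]

# Narrowest first: confidence interval for the mean, then the prediction
# interval for a future mean, then the interval for all future values.
by_width = sorted(centered_on_mean, key=lambda k: width(intervals[k]))

for k in sorted(intervals):
    print(k, round(midpoint(intervals[k]), 2), round(width(intervals[k]), 2))
print("share a center:", centered_on_mean, "narrowest to widest:", by_width)
```

Only #1, #5, and #6 share a center, and sorting them by width reproduces the order used in the answers.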

Extra credit.  Determine how many numbers are in the data set.

Let n be the data set size.  The length of the confidence interval of the mean, [27.5, 34], is a multiple of sqrt(1/n).  The length of the prediction interval for the mean of four future values, [24.3, 37.2], is the same multiple of sqrt(1/n + 1/4).  Divide the latter value by the former to get the equation

2 ≈ (37.2 - 24.3) / (34 - 27.5) = sqrt(1 + n/4).

Solve for n: squaring both sides gives 4 = 1 + n/4, so n = 12.
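
The same calculation can be done without rounding the ratio to 2 first.  This is a small sketch of the arithmetic only; the names ci_mean, pi_mean4, and m are introduced here for illustration.

```python
# Endpoints taken from intervals #1 and #5 of the quiz.
ci_mean = (27.5, 34.0)    # 80% confidence interval for the mean
pi_mean4 = (24.3, 37.2)   # 80% prediction interval for the mean of 4 future values
m = 4                     # number of future values being averaged

# The widths are the same multiple of sqrt(1/n + 1/m) and sqrt(1/n),
# so their ratio is sqrt(1 + n/m).
ratio = (pi_mean4[1] - pi_mean4[0]) / (ci_mean[1] - ci_mean[0])

# Invert ratio = sqrt(1 + n/m) for n, then round to a whole sample size.
n_estimate = m * (ratio ** 2 - 1)
n = round(n_estimate)

print(round(ratio, 3), round(n_estimate, 2), n)
```

Because the interval endpoints are rounded to one decimal place, the unrounded estimate comes out slightly below 12; rounding to the nearest whole number recovers n = 12.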

As a check of this result, suppose the standard deviation estimate is near (but less than) the middle of its interval, [6.5, 11.5], and therefore is about 8 or 9.  The standard error is therefore around 8/sqrt(12) = 2.3.  We already know the mean is about 30.7, because (27.5 + 34)/2, (24.3 + 37.2)/2, and (11.9 + 49.5)/2 all round to that value (from #1, #5, and #6, respectively).  The confidence limits for the mean should be roughly 1.5 standard errors from the mean, or about 3.5 units away.  Indeed, 30.7 + 3.5 = 34.2 and 30.7 - 3.5 = 27.2, in close agreement with #1.

As an overall check, estimate the 80th percentile as mean + Z0.80 * sd = 30.7 + Z0.80 * 8, where Z0.80 is the 80th percentile of N(0, 1) and is therefore slightly less than 1, giving an answer slightly less than 40: just in the middle of interval #4.  Similarly, estimate the 95th percentile as 30.7 + Z0.95 * 8 = 30.7 + 1.65 * 8 (more or less) = about 44, near the middle of interval #3.  Everything is consistent.
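
This check can be reproduced with exact Normal quantiles in place of the rounded ones.  The sketch below assumes the rough values used in the text (mean 30.7, standard deviation 8) and uses NormalDist from Python's standard library; it illustrates the arithmetic and is not the method used to construct the quiz.

```python
from statistics import NormalDist

# Rough estimates taken from the discussion above.
mean = 30.7
sd = 8.0

standard_normal = NormalDist()  # N(0, 1)

# Point estimates of the percentiles: mean + z * sd.
z80 = standard_normal.inv_cdf(0.80)
z95 = standard_normal.inv_cdf(0.95)
p80 = mean + z80 * sd
p95 = mean + z95 * sd

# Intervals matched to the 80th and 95th percentiles in the answers.
interval_p80 = (34.5, 42.2)
interval_p95 = (40.4, 50.8)

print(round(z80, 3), round(p80, 1), interval_p80[0] < p80 < interval_p80[1])
print(round(z95, 3), round(p95, 1), interval_p95[0] < p95 < interval_p95[1])
```

Both point estimates fall inside the intervals assigned to them, [34.5, 42.2] for the 80th percentile and [40.4, 50.8] for the 95th.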

The data set used to create this quiz consists of the values 25.34, 34.28, 39.51, 33.36, 35.06, 37.84, 10.99, 25.49, 40.62, 25.34, 28.68, and 32.31.
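
With the data in hand, intervals #1 and #5 can be recomputed directly.  The following is a sketch under stated assumptions: it uses the standard t-based formulas for a Normal sample, and the multiplier 1.363 (the 90th percentile of Student's t with 11 degrees of freedom) is hard-coded from a table because Python's standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

data = [25.34, 34.28, 39.51, 33.36, 35.06, 37.84,
        10.99, 25.49, 40.62, 25.34, 28.68, 32.31]

n = len(data)
m = 4                   # number of future values averaged in interval #5
xbar = mean(data)
s = stdev(data)         # sample standard deviation (n - 1 denominator)

# Assumption: table value of the 0.90 quantile of t with n - 1 = 11 degrees
# of freedom, which gives a two-sided 80% interval.
t_multiplier = 1.363

# 80% confidence interval for the mean.
half_ci = t_multiplier * s * sqrt(1 / n)
ci_mean = (xbar - half_ci, xbar + half_ci)

# 80% prediction interval for the mean of m future values.
half_pi = t_multiplier * s * sqrt(1 / n + 1 / m)
pi_mean = (xbar - half_pi, xbar + half_pi)

print("n =", n, "mean =", round(xbar, 2), "sd =", round(s, 2))
print("CI for mean:", [round(x, 1) for x in ci_mean])
print("PI for mean of 4:", [round(x, 1) for x in pi_mean])
```

The sample size is 12, the mean is about 30.7, the standard deviation lies inside [6.5, 11.5], and the two recomputed intervals agree with #1 and #5 to the precision given in the quiz.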

Hint:  Consider the midpoint (center) of each interval.

Scoring: The passing score is 90 (one incorrect answer).


This page is copyright (c) 2001 Quantitative Decisions.  Please cite it as

This page was created 2 April 2001.