Solution to Practice Quiz 3

The full quiz is here.  The answers appear below.  Comments, which are not part of the answers, are italicized.

1.    A batch A has the summary statistics listed below.  The values in batch B are obtained by dividing each value in A by 2 and then adding 10.  Compute the corresponding summary statistics for B.  In cases where it is not possible to deduce the value, or if the value is not defined, then please so indicate.

Statistic | A | B | Reason
Count | 143 | 143 | Transforming the values does not change how many there are
Median | 16.0 | 18.0 | Order statistics transform the same way as any data
H-spread | 24.2 | 12.1 | The hinges transform like the data, but the +10 terms cancel out in the subtraction
90th percentile | 97.6 | 58.8 | Percentiles transform the same way as any data
Variance | 1,032 | 258 | The variance is multiplied by the square of the factor (1/2), which is 1/4; adding 10 has no effect
Third order statistic (X[3]) | 0.8 | 10.4 | Order statistics transform the same way as any data
Geometric mean | 11.2 | Defined but not computable | Since the GM is defined for A, all values of A are positive, so all values of B exceed 10 and the GM of B is defined.  It cannot be computed from the statistics given, because the GM does not transform in a simple way when a constant is added.
10% Trimmed mean | 18.4 | 19.2 | The trimmed values in A correspond to the trimmed values in B, so the trimmed mean transforms like the data
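The rules in the Reason column can be checked mechanically.  The following is a minimal Python sketch, not part of the quiz answer; the helper names (as_location, as_spread, as_variance) are made up for illustration, and the input values are copied from column A of the table.

```python
# Summary statistics of batch A, copied from the table above.
stats_a = {
    "median": 16.0,
    "h_spread": 24.2,
    "p90": 97.6,
    "variance": 1032.0,
    "x3": 0.8,
    "trimmed_mean_10": 18.4,
}

SCALE, SHIFT = 0.5, 10.0  # B = A / 2 + 10


def as_location(value):
    """Order statistics, percentiles, trimmed means: scale, then shift."""
    return SCALE * value + SHIFT


def as_spread(value):
    """Differences of locations (such as the H-spread): the shift cancels."""
    return abs(SCALE) * value


def as_variance(value):
    """Variances are multiplied by the square of the scale factor."""
    return SCALE ** 2 * value


stats_b = {
    "median": as_location(stats_a["median"]),
    "h_spread": as_spread(stats_a["h_spread"]),
    "p90": as_location(stats_a["p90"]),
    "variance": as_variance(stats_a["variance"]),
    "x3": as_location(stats_a["x3"]),
    "trimmed_mean_10": as_location(stats_a["trimmed_mean_10"]),
}

for name, value in stats_b.items():
    print(f"{name}: {value:.1f}")
```

The geometric mean is deliberately left out: no function of A's geometric mean alone gives the geometric mean of B.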

Would you expect A to be positively skewed, negatively skewed, or have approximately zero skewness?  What about B?

A looks positively skewed because the trimmed mean (18.4) exceeds the median (16.0), which in turn exceeds the geometric mean (11.2).  The CV is approximately sqrt(1032)/18.4, which is about 1.75, indicating strong positive skewness (the 10% trimmed mean was substituted for the mean in the CV formula, which is why it's only approximate).  B must be correspondingly skewed, because linear transformations with a positive scale factor do not change skewness.
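As a rough numerical check, the sketch below computes the approximate CV and illustrates that the moment coefficient of skewness is unchanged by the transformation x/2 + 10.  The batch named toy is hypothetical, invented only to demonstrate the invariance; it is not the quiz data.

```python
import math

variance_a = 1032.0
trimmed_mean_a = 18.4  # stand-in for the mean of A, which the quiz does not give

cv_a = math.sqrt(variance_a) / trimmed_mean_a
print(f"approximate CV of A: {cv_a:.2f}")


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


toy = [1.0, 2.0, 3.0, 4.0, 20.0]          # hypothetical right-skewed batch
toy_transformed = [v / 2 + 10 for v in toy]

print(f"skewness before: {skewness(toy):.4f}")
print(f"skewness after:  {skewness(toy_transformed):.4f}")
```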

2.    Draw a histogram of the following batch of arsenic measurements in soil (mg/kg dry weight).  Use whole powers of two for the cutpoints: that is, put the bin endpoints at 1, 2, 4, 8, 16, and 32 mg/kg.  The values have been sorted for you and grouped into sets of five for more reliable reading.

(1.5, 2.1, 2.5, 2.7, 2.8,
3.6, 4.6, 4.6, 4.9, 5.0,
5.2, 5.4, 5.5, 5.5, 5.6,
5.8, 5.8, 6.4, 6.7, 6.7,
6.8, 7.1, 7.3, 7.3, 7.8,
8.1, 9.1, 9.2, 9.3, 10.2,
11.0, 11.6, 12.1, 12.4, 13.9,
15.3, 16.0, 17.0, 24.7, 27.2)

Label the histogram appropriately so that it can be read on its own.  Use relative frequency on the y-axis.

The following table summarizes the bin counts and the computations:

Start | End | Count | Proportion | Interval Width | Proportion per unit width
1 | 2 | 1 | 2.5% | 1 | 2.5%
2 | 4 | 5 | 12.5% | 2 | 6.3%
4 | 8 | 19 | 47.5% | 4 | 11.9%
8 | 16 | 11 | 27.5% | 8 | 3.4%
16 | 32 | 4 | 10.0% | 16 | 0.6%
Total (check) | | 40 | 100.0% | |

(Each bin includes its left endpoint but not its right.)  Here is the histogram:

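This histogram should have unit area.  As a quick check, the bulk of it between 1 and 16 mg/kg is roughly triangular, rising to a height of about 0.12.  Its area will therefore approximate 1/2 * (16-1) * 0.12 = 0.90.  The remaining bin, from 16 to 32 mg/kg, holds 4 of the 40 values and so adds 0.10, for a total close to 1.  (In fact, the triangle estimate is exact, by pure luck: the four bins from 1 to 16 mg/kg hold 36 of the 40 values, or 90%.)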
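The bin counts, bar heights, and area check can be reproduced with a short Python sketch.  It uses only the standard library and draws nothing; the variable names are illustrative.

```python
from bisect import bisect_right

arsenic = [
    1.5, 2.1, 2.5, 2.7, 2.8,
    3.6, 4.6, 4.6, 4.9, 5.0,
    5.2, 5.4, 5.5, 5.5, 5.6,
    5.8, 5.8, 6.4, 6.7, 6.7,
    6.8, 7.1, 7.3, 7.3, 7.8,
    8.1, 9.1, 9.2, 9.3, 10.2,
    11.0, 11.6, 12.1, 12.4, 13.9,
    15.3, 16.0, 17.0, 24.7, 27.2,
]
edges = [1, 2, 4, 8, 16, 32]

# Each bin includes its left endpoint but not its right: bisect_right sends
# a value equal to an edge (such as 16.0) into the bin that starts there.
counts = [0] * (len(edges) - 1)
for x in arsenic:
    counts[bisect_right(edges, x) - 1] += 1

widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
proportions = [c / len(arsenic) for c in counts]
densities = [p / w for p, w in zip(proportions, widths)]  # bar heights

for lo, hi, c, p, d in zip(edges, edges[1:], counts, proportions, densities):
    print(f"[{lo:>2}, {hi:>2})  count={c:>2}  proportion={p:6.1%}  per unit width={d:6.1%}")

total_area = sum(d * w for d, w in zip(densities, widths))
area_1_to_16 = sum(proportions[:4])
triangle_estimate = 0.5 * (16 - 1) * 0.12
print(f"total area = {total_area:.2f}")
print(f"area from 1 to 16 = {area_1_to_16:.2f}, triangle estimate = {triangle_estimate:.2f}")
```

The bar heights are the values in densities, not the raw proportions; with unequal bin widths, only the former gives a histogram of unit area.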

3.    A batch has 24 values.  The five largest values are 33.0, 24.6, 17.0, 16.0, and 15.3.  Compute the 90th percentile of this batch.  Use Weibull plotting positions (text, page 96).  Round the answer to one decimal place (the same precision as the data).

The orders of the largest values are 24, 23, 22, 21, and 20, with corresponding percentiles equal to 24/25 = 96%, 23/25 = 92%, 22/25 = 88%, and so on.  The 90th percentile is exactly halfway between the second and third highest values because 90 is exactly halfway between 92 and 88.  Thus the 90th percentile is (24.6 + 17.0) / 2 = 20.8.
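The interpolation can be written out as a small Python sketch.  It assumes the Weibull plotting position i/(n+1) for the value of rank i and linear interpolation between adjacent ranks, as in the answer above; the function name weibull_percentile is made up for illustration.

```python
def weibull_percentile(top_values, n, p):
    """Interpolate the p-th quantile from the largest values of a batch.

    top_values: the largest values, in descending order
    n:          total number of values in the batch
    p:          desired quantile as a fraction, e.g. 0.90
    """
    # Pair each known value with its rank (n, n-1, ...), then sort ascending.
    ranked = sorted((n - k, v) for k, v in enumerate(top_values))
    # Rank (possibly fractional) whose plotting position i / (n + 1) equals p.
    target = p * (n + 1)
    for (i_lo, v_lo), (i_hi, v_hi) in zip(ranked, ranked[1:]):
        if i_lo <= target <= i_hi:
            frac = (target - i_lo) / (i_hi - i_lo)
            return v_lo + frac * (v_hi - v_lo)
    raise ValueError("p falls outside the range covered by the known values")


p90 = weibull_percentile([33.0, 24.6, 17.0, 16.0, 15.3], n=24, p=0.90)
print(f"90th percentile: {p90:.1f}")
```

Here the target rank is 0.90 * 25 = 22.5, halfway between ranks 22 and 23, which is why the answer is the average of 17.0 and 24.6.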

Scoring: The passing score is 96.


This page is copyright (c) 2001 Quantitative Decisions.  Please cite it as

This page was created 25 January 2001.