Solution to Quiz 8

The full quiz is here.  The answers appear below.  Comments, which are not part of the answers, are italicized.

Time 20 minutes.  This quiz is open book, open notes.

1.    Samples of sediment in a river basin yielded values of 8, 8, 20, and 64 mg/kg (ppm) arsenic.  To analyze the results, you will model all variation as random.  Specifically, these values will be considered as independent realizations of a random variable governed by one probability distribution.  Its mean characterizes the average arsenic concentration river-wide.

  a. Estimate the mean and variance assuming the underlying distribution is Normal.  Use unbiased estimators.
  b. Estimate the mean using the maximum likelihood estimator, assuming the underlying distribution is Lognormal.

a.    The sample mean is the minimum variance unbiased estimator (MVUE) of the mean and the sample variance is the MVUE of the variance.  The sample mean is (8 + 8 + 20 + 64)/4 = 25 ppm.  The residuals are therefore -17, -17, -5, and 39 ppm, with squares 289, 289, 25, and 1521 ppm², respectively.  The sample variance is (289 + 289 + 25 + 1521)/(4-1) = 2124/3 = 708 ppm².
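The arithmetic in (1a) can be checked with a short Python sketch.  It uses only the standard library; the variable names are illustrative and not part of the quiz.

```python
from statistics import mean, variance

# Arsenic concentrations from the quiz, in ppm.
arsenic_ppm = [8, 8, 20, 64]

# Sample mean: sum of the values divided by n.
sample_mean = mean(arsenic_ppm)

# statistics.variance divides the sum of squared residuals by n - 1,
# which is the unbiased estimator used in (1a).
sample_variance = variance(arsenic_ppm)

print("mean (ppm):", sample_mean)
print("variance (ppm^2):", sample_variance)
```

The printed values should agree with the 25 ppm and 708 ppm² obtained by hand.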

b.    The lognormal MLE requires the mean and variance of the logarithms.  Rounding the natural logarithms of 8, 8, 20, and 64 to 2, 2, 3, and 4, the mean log is (2 + 2 + 3 + 4)/4 = 11/4.  The residuals are -3/4, -3/4, 1/4, and 5/4, with squares 9/16, 9/16, 1/16, and 25/16.  There is no bias correction in the MLE variance formula, so it is computed by dividing by 4 rather than 3: (9 + 9 + 1 + 25)/16 / 4 = 11/16.  The lognormal MLE of the mean is exp(mean log + variance/2) = exp(11/4 + (11/16)/2) = exp(99/32) = exp(3 + 3/32) = exp(3) * exp(3/32) ≈ 20 * (1 + 3/32) ≈ 22.

These computations with logarithms were imprecise.  For example, ln(64) = 4.16 is substantially larger than 4.  Using more precise calculations gives the answer 24.4 to problem (1b).
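As a check on the more precise figure, here is a minimal Python sketch of the same calculation with unrounded natural logarithms.  It assumes the estimator exp(mean log + variance/2) used in (1b), with the variance of the logs divided by n; the variable names are illustrative.

```python
import math

# Arsenic concentrations from the quiz, in ppm.
arsenic_ppm = [8, 8, 20, 64]

# Work on the natural-log scale, without rounding.
logs = [math.log(x) for x in arsenic_ppm]
n = len(logs)

# MLE of the log-scale mean.
mu_hat = sum(logs) / n

# MLE of the log-scale variance: divide by n, not n - 1.
sigma2_hat = sum((y - mu_hat) ** 2 for y in logs) / n

# MLE of the lognormal mean.
lognormal_mean_mle = math.exp(mu_hat + sigma2_hat / 2)

print(round(lognormal_mean_mle, 1))
```

The result should round to 24.4 ppm, compared with about 22 ppm from the rounded logarithms.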

2.    Environmental scientists are concerned about sediments with more than 50 ppm arsenic in this river.

  a. Estimate the proportion of sediments with more than 50 ppm arsenic.  Use the binomial distribution.
  b. Use your answer to (1a) above to estimate the proportion of sediments with more than 50 ppm arsenic.

a.    Of the four values, only one exceeded 50 ppm.  The estimated proportion is therefore 1/4 = 25%.

b.    The answer to (1a) suggests the underlying distribution is Normal with mean 25 and s.d. sqrt(708) ≈ 26.6, that is, N(25, 26.6) approximately.  The value of 50 is almost one s.d. greater than the mean.  The proportion of a normal distribution lying more than one s.d. above its mean is about 16%, so the estimate is roughly 16%.  A more precise answer is 17.4%.
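The more precise figure can be reproduced with the standard library's NormalDist class.  This is only a sketch of the calculation in (2b), taking the estimates from (1a) as the true parameters; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Normal model fitted in (1a): mean 25 ppm, variance 708 ppm^2.
model = NormalDist(mu=25, sigma=math.sqrt(708))

# Proportion of the fitted distribution above the 50 ppm threshold.
p_exceed_50 = 1 - model.cdf(50)

# Number of standard deviations between the threshold and the mean.
z = (50 - model.mean) / model.stdev

print(f"z = {z:.2f}, P(X > 50) = {p_exceed_50:.1%}")
```

The threshold sits a little under one s.d. above the mean, which is why the exact proportion comes out slightly above the one-s.d. figure of 16%.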

3.    (Extra credit)  Explain why each method in problem 2 gives a highly uncertain answer.

The binomial model ignores the numerical values of the data and uses only whether they exceed the 50 ppm threshold or not.  This loses information and increases uncertainty.  The binomial model would produce the same answer with any threshold between 20 and 64 ppm, indicating how crude this calculation is.
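The insensitivity to the threshold is easy to see in a small Python sketch.  The function name and the particular thresholds tried are illustrative choices, not part of the quiz; exceedance is taken to mean strictly greater than the threshold.

```python
# Arsenic concentrations from the quiz, in ppm.
arsenic_ppm = [8, 8, 20, 64]

def exceedance_proportion(values, threshold):
    """Fraction of values strictly greater than the threshold."""
    return sum(v > threshold for v in values) / len(values)

# Any threshold from 20 ppm up to (but not including) 64 ppm
# leaves exactly one of the four values above it.
proportions = {t: exceedance_proportion(arsenic_ppm, t) for t in (20, 30, 50, 63.9)}

for threshold, proportion in proportions.items():
    print(threshold, proportion)
```

Every threshold in that range yields the same estimate of 0.25, because only the count of exceedances enters the calculation.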

The normal model cannot be an accurate description of all the sediments because, using a calculation similar to (2b), it predicts that over 16% of the sediments have negative arsenic concentrations.  With only four values, we cannot determine whether the model accurately describes the sediments with the largest concentrations.
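The proportion of negative concentrations implied by the normal model can be computed in the same way as the exceedance proportion.  The following Python sketch assumes the estimates from (1a) are the true parameters; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Normal model fitted in (1a): mean 25 ppm, variance 708 ppm^2.
model = NormalDist(mu=25, sigma=math.sqrt(708))

# Proportion of the fitted distribution below zero, which is
# physically impossible for a concentration.
p_negative = model.cdf(0)

print(f"P(X < 0) = {p_negative:.1%}")
```

Because 0 and 50 ppm are equally far from the fitted mean of 25 ppm, this lower-tail proportion equals the upper-tail proportion above 50 ppm.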

The answer is necessarily uncertain, regardless of the estimation method, because it is based on just four samples, a very small number.  Four samples might work if the sediment concentrations were very consistent, but the observed variation from 8 to 64 ppm already shows there is no such consistency.

Scoring: The passing score is 92.  You must show details of your work to get credit for a correct answer or partial credit for an incorrect answer.


This page is copyright (c) 2001 Quantitative Decisions.  Please cite it as

This page was created 13 March 2001.