Homework

Link to previous assignments.

If you cannot obtain an answer or are unsure of your answer, discuss the problem with each other or with anyone else who might be helpful.

Due Assignment
13 March  Midterm project

Purpose:  To apply what you have learned about EDA and characterizing distributions to actual environmental data.

Data: Use your own dataset or ask me for one.  It should consist of two to eight related batches from some kind of environmental investigation or study.  The total amount of data should be approximately 48 to 500 records.  Examples: measurements of six metals in a set of twelve soil samples (72 numbers); a time series of daily air quality monitoring results at two monitoring stations for three months (182 numbers); species richness counts for four kinds of species within 63 regions (252 numbers).

Scope: Provide appropriate summary statistics, graphics, and exploratory data analysis.  Identify outliers.  Characterize underlying distributions: in particular, compare the batches to Normal and Lognormal distributions.  If possible, estimate parameters (mean and standard deviation for each batch) as described in Chapter 5.
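One way to compare a batch to Normal and Lognormal distributions is a probability plot of the raw values and of their logarithms.  The following is a minimal Python sketch, not a prescribed method: the function names, the Hazen plotting positions (i - 0.5)/n, and the sample batch are all assumptions made for illustration.  A correlation near 1 between the sorted data and the Normal quantiles suggests a nearly straight plot.

```python
import math
from statistics import NormalDist, mean

def probability_plot_points(values):
    """Return (normal_quantile, sorted_value) pairs for a Normal probability plot."""
    ordered = sorted(values)
    n = len(ordered)
    # Hazen plotting positions (i - 0.5) / n: an assumed convention, others exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def plot_correlation(points):
    """Pearson correlation of the plotted points; near 1 means nearly straight."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up concentrations, for illustration only (not real data).
batch = [1.2, 0.8, 3.5, 2.1, 15.0, 0.9, 4.4, 1.7, 6.3, 2.8, 0.5, 9.1]

# Raw values against Normal quantiles: tests the Normal hypothesis.
r_normal = plot_correlation(probability_plot_points(batch))
# Logarithms against Normal quantiles: tests the Lognormal hypothesis.
r_lognormal = plot_correlation(probability_plot_points([math.log(v) for v in batch]))

print("Normal plot correlation:   ", round(r_normal, 3))
print("Lognormal plot correlation:", round(r_lognormal, 3))
```

The correlation is only a rough numerical companion to the plot; the report should still show and interpret the plots themselves.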

Format:  Write your results as a memorandum or report.  In this report you must document the data: how they were obtained, what their quality is, what they are (provide tables).  Present the results: for each statistic, graphic, or test you perform, explain what it is and what its purpose is, and interpret its results.  End with a summary and conclusions section that draws your interpretations together.  Keep the length to eight double-spaced pages or fewer, plus figures and tables.

Medium:  Deliver your paper as a web page, a Word document, or in hardcopy.

27 February
  1. Send an e-mail describing the dataset you will be evaluating for your midterm project.  If you do not have a dataset, send an e-mail requesting one.  (15 min.)
  2. Take the ticket tutorial.  (60 min.)
  3. Using normal probability paper, create probability plots of at least three of the compounds measured in soil gas.  (30 min.)
  4. The state takes an industrial facility to court for violating its safety permits too many times.  The prosecution notes that other regulated facilities in the same industry tend to have safety violations in only ten percent of all inspections, but that this facility had violations in 19 (sixteen percent) of its last 120 regular monthly inspections.  The prosecution argues that the increase of six percentage points, from ten to sixteen percent, is sufficient to prove that this facility maintained a workplace significantly less safe than the industry average.  Find a defense.  (20 min.)
  5. Using the definitions, compute the mean and second through fifth moments about the mean for the B(1, 1/2), B(1, 1/10), and B(2, 1/2) distributions. (25 min.)
  6. * (This extends the text's problem 4.18.)  Select a sample size N between 4 and 1,000.  Select a mean mu and standard deviation sigma > 0.  Generate N random numbers from a Normal distribution with mean mu and standard deviation sigma.  Compute the sample standard deviation.  Repeat until you have at least 20 sd values.  Now explore these results: produce a stem and leaf plot and a boxplot, compute summary statistics, and draw the EDF (empirical distribution function).  Repeat for different values of N.  Repeat for other summary statistics: interesting ones include either extreme (the minimum or the maximum), the range, the skewness, and the median.  Compare the results.  What conclusions do your results suggest?  Note: all the formulas you need exist in spreadsheets previously introduced on these pages.  See the "Simulation" and "Analysis" sections in the "How to..." links.  (1 - 4 hours)
  7. Read the text through page 257.  Skip any sections about the Poisson distribution.  Focus on three things: (a) understanding confidence intervals, (b) computing confidence intervals for the mean of a Normal distribution, and (c) computing confidence intervals for the mean of a Lognormal distribution.  (1 hour)
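The simulation in #6 is meant to be done with the course spreadsheets, but its structure may be easier to see in code.  The following Python sketch is an illustration under assumptions, not the required approach: the function names, the values of mu and sigma, the choices of N, and the fixed seed are all hypothetical.

```python
import random
import statistics

def simulate_sd(n, mu, sigma, repetitions=20, seed=None):
    """Draw `repetitions` Normal samples of size n; return each sample's sd."""
    rng = random.Random(seed)
    return [
        statistics.stdev([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repetitions)
    ]

def edf(values):
    """Empirical distribution function as (value, cumulative proportion) pairs."""
    ordered = sorted(values)
    k = len(ordered)
    return [(v, (i + 1) / k) for i, v in enumerate(ordered)]

# Assumed settings, for illustration only.
MU, SIGMA = 10.0, 2.0

for n in (4, 30, 250):
    sds = simulate_sd(n, MU, SIGMA, repetitions=20, seed=1)
    print(
        f"N={n:4d}",
        "min", round(min(sds), 2),
        "median", round(statistics.median(sds), 2),
        "max", round(max(sds), 2),
    )
```

To study another summary statistic, replace `statistics.stdev` with, for example, `statistics.median` or `max`.  The pairs returned by `edf(sds)` can be plotted as a step function.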

Total estimated time: 4:30 - 7:30.

Answers to #4 and #5 are available.
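Hand calculations for #4 and #5 can also be checked numerically.  The Python sketch below is independent of the posted answers and is only one possible check: it computes moments about the mean of a Binomial B(n, p) distribution directly from the definition, and the probability of seeing at least k events in n trials.  The function names are hypothetical.

```python
from math import comb

def binomial_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes for a Binomial B(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def binomial_mean(n, p):
    """Mean computed from the definition: sum of k * P(X = k)."""
    return sum(k * w for k, w in enumerate(binomial_pmf(n, p)))

def central_moment(n, p, order):
    """Moment of the given order about the mean: sum of (k - mean)^order * P(X = k)."""
    m = binomial_mean(n, p)
    return sum((k - m) ** order * w for k, w in enumerate(binomial_pmf(n, p)))

def upper_tail(n, p, k_min):
    """P(X >= k_min) for X distributed as B(n, p)."""
    return sum(binomial_pmf(n, p)[k_min:])

# Problem 5: mean and second through fifth moments about the mean.
for n, p in ((1, 0.5), (1, 0.1), (2, 0.5)):
    moments = [round(central_moment(n, p, r), 6) for r in range(2, 6)]
    print(f"B({n}, {p}): mean = {binomial_mean(n, p)}, moments 2-5 = {moments}")

# Problem 4: chance of 19 or more violations in 120 inspections
# if the true violation rate were the industry's ten percent.
print("P(X >= 19) =", upper_tail(120, 0.1, 19))
```

The tail probability is only one ingredient of an argument for #4; what it does or does not establish is part of the exercise.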

This page was created 24 February and last updated 24 February 2001.