Course Outline

Presented by Dr. William Huber

Great Valley Graduate Center
Penn State University

Prerequisites

Facility with a spreadsheet program.
Quantitative competence.
Mathematical training at least through differential and integral calculus.
Some experience or training related to environmental issues.

Requirements for course completion

Successful completion of all homework problems.
Successful completion of all in-class quizzes.
Completion of a project involving detailed analysis of environmental data.

Expect to spend about six to nine hours per week on homework and studying.  Some optional homework problems will be challenging and open-ended, allowing you to progress as far and as quickly as you like during the course.

What you will learn

How to "look at" data effectively and efficiently.  This is part of a subject called "Exploratory Data Analysis," or EDA.  The course begins with an introduction to EDA and introduces specialized techniques when applications warrant: smoothing series, assessing two-way tables of data, fitting lines, and so on.

How to understand probability distributions, use them to model and simulate environmental data, and read technical literature that uses the language of probability.  You will acquire facility applying Gaussian ("Normal"), Lognormal, and Binomial distributions to a wide range of environmental applications.

How to make comparisons using hypothesis tests, confidence intervals, prediction limits, and tolerance limits.  We will focus on understanding the assumptions behind these procedures so you can determine when they are appropriate and when they may be misleading.

How to quantify variability and uncertainty and use that for assessing risk and liability.

How to deal with technical problems in environmental data, such as results below or above reporting limits ("censored" data), correlation in time and space, and correlation among related analytical parameters.

Where in U.S. Federal regulations statistical procedures are required or suggested for use in environmental settings.

How to combine statistical distributions in probabilistic assessments of environmental risk and financial liability.  This includes the popular "Monte Carlo" simulation as well as other techniques.

Techniques for designing better sampling plans and critically evaluating past investigations and data-gathering efforts.

How to recognize and avoid pitfalls and common mistakes in the collection, analysis, processing, reporting, and interpretation of environmental data.

How to read and use statistical "cookbooks" effectively.

Outline

This outline will adapt to our needs and rate of progress. 

Underlined topics are either not discussed in the text or will be presented in greater depth or detail than in the text.

1.    Introduction

Chapter 1

Overview of course
Course requirements
Overview of learning resources, materials, data

2.    Exploratory Data Analysis: Looking at Data

Chapter 3 (partial)

Batches
"Resistance, residuals, re-expression, revelation"
Descriptive (summary) statistics
Strip plots, etc.
Stem and leaf diagrams; sorting
N-Letter summaries
Box plots
Histograms
Density plots
Bar charts, dot plots, and pie charts
Probability plots
Q-Q plots
Looking at subtraction
Introduction to computer data manipulation

3.    The Language of Statistics: Probability

Chapter 2 (partial); Chapter 4

What is probability?
(Ticket in) box models
Distributions: CDF and PDF; quantiles and percentiles
Continuous and discrete measures
The anatomy of a distribution; standardizing a distribution
Sums, products, and other combinations of distributions
Generating "random" numbers
Simulations with Excel
Central limit theorems: the Gaussian ("Normal") and Extreme Value distributions
Discrete distributions
Continuous distributions
Mixture distributions
Practical applications of distributions: estimating frequencies, setting permit limits, assessing liabilities
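
Several of the topics above (generating "random" numbers, simulation, the Lognormal distribution, and estimating frequencies) can be illustrated together.  The sketch below is written in Python rather than Excel, which the outline names for simulations.  The function names, the parameter values, and the limit of 5.0 are hypothetical choices made only for illustration.  It estimates how often a Lognormal concentration would exceed a limit, first by simulation and then from the Gaussian CDF of the logarithm.

```python
import math
import random
from statistics import NormalDist


def simulate_exceedance(mu, sigma, limit, n, seed=0):
    """Estimate P(X > limit) for a Lognormal X by drawing n random values."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(1 for _ in range(n) if rng.lognormvariate(mu, sigma) > limit)
    return hits / n


def exact_exceedance(mu, sigma, limit):
    """P(X > limit), using the fact that log(X) is Gaussian(mu, sigma)."""
    return 1.0 - NormalDist(mu, sigma).cdf(math.log(limit))


# Hypothetical values: log-scale mean 0, log-scale SD 1, limit 5.0.
mu, sigma, limit = 0.0, 1.0, 5.0
print("simulated:", simulate_exceedance(mu, sigma, limit, 100_000))
print("exact:    ", exact_exceedance(mu, sigma, limit))
```

The two results should agree to within the sampling error of the simulation, which shrinks roughly as one over the square root of the number of draws.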

4.    Estimating Parameters

Chapter 5.

Method of moments
Chebychev's inequality
Monte-Carlo simulation
Maximum likelihood
Loss functions
Rational decision theory
Comparing estimators
Unbiased estimators
Bayes estimators
Minimax estimators
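
As a small illustration of comparing estimators, the sketch below fits a Lognormal distribution to simulated data in two ways: by the method of moments and by maximum likelihood.  It is a minimal example under stated assumptions, not the course's own material.  The function names, the sample size, and the "true" parameters (log-scale mean 1.0, log-scale SD 0.5) are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def lognormal_mom(xs):
    """Method of moments: match the sample mean and variance of the raw data."""
    m = fmean(xs)
    v = pvariance(xs, m)
    sigma2 = math.log(1.0 + v / m**2)      # from Var[X] / E[X]^2 = exp(sigma^2) - 1
    mu = math.log(m) - sigma2 / 2.0        # from E[X] = exp(mu + sigma^2 / 2)
    return mu, math.sqrt(sigma2)


def lognormal_mle(xs):
    """Maximum likelihood: mean and (population) SD of the logged data."""
    logs = [math.log(x) for x in xs]
    mu = fmean(logs)
    return mu, math.sqrt(pvariance(logs, mu))


# Hypothetical simulated sample with known parameters.
rng = random.Random(42)
data = [rng.lognormvariate(1.0, 0.5) for _ in range(20_000)]
print("method of moments: ", lognormal_mom(data))
print("maximum likelihood:", lognormal_mle(data))
```

Repeating the experiment with many small samples, instead of one large one, is one way to compare the bias and variability of the two estimators.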

5.    Making Comparisons: Determining When Limits Are Exceeded

Chapter 6 (and parts of Chapter 5).

Confidence limits
Tolerance limits
Prediction limits
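
The difference between a confidence limit and a prediction limit can be checked by simulation.  The sketch below assumes Gaussian data and samples of exactly 10 values, so that the tabulated Student's t percentile for 9 degrees of freedom applies.  The function names and the mean and standard deviation used to generate data are hypothetical.  Tolerance limits are omitted because their multipliers need tables or functions beyond Python's standard library.

```python
import math
import random
from statistics import fmean, stdev

# 95th percentile of Student's t with 9 degrees of freedom (valid for n = 10 only).
T_95_DF9 = 1.833


def upper_limits(xs, t=T_95_DF9):
    """Return (upper confidence limit for the mean,
    upper prediction limit for one future value), each one-sided at 95%."""
    n = len(xs)
    xbar, s = fmean(xs), stdev(xs)
    ucl = xbar + t * s / math.sqrt(n)
    upl = xbar + t * s * math.sqrt(1.0 + 1.0 / n)
    return ucl, upl


def coverage(trials=20_000, n=10, mu=10.0, sigma=2.0, seed=1):
    """Fraction of trials in which the UCL covers the true mean and
    the UPL covers one new observation."""
    rng = random.Random(seed)
    mean_hits = new_hits = 0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        ucl, upl = upper_limits(xs)
        mean_hits += int(mu <= ucl)
        new_hits += int(rng.gauss(mu, sigma) <= upl)
    return mean_hits / trials, new_hits / trials


print(coverage())
```

Both fractions should come out near 0.95, even though the prediction limit is always the larger of the two: each limit answers a different question.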

6.    Testing Hypotheses

Chapter 7.

Hypothesis tests in a decision theory framework
T-tests
Size, power, and risk

The course ended here.

7.    Linear Models and Regression

Chapter 3 (second half), Chapter 9 (first half).

Scatterplots
Scatterplot matrices
What can go wrong
What you can and cannot do with R²
Regression diagnostics
Transformations for homoscedasticity (there's a mouthful...)
Nonparametric techniques
Calibration
Paired observations as a form of calibration
Inverse regression
Using regression models to improve sampling plans
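
For reference, fitting a line by least squares and computing R² takes only a few lines.  The sketch below is a minimal example, not a substitute for the diagnostics listed above.  The function name and the data values are hypothetical.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.
    Returns (slope, intercept, R squared)."""
    xbar, ybar = fmean(xs), fmean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # R squared = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.2, 12.8]
print(fit_line(xs, ys))
```

A high R² by itself does not show that a straight line is the right model, so a plot of the residuals is still worth making.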

8.    Simulation and Risk Assessment

Chapter 13.

(Some of this material was covered throughout the course, especially in section 3 on probability and section 4 on estimating parameters.)

9.    Designing Sampling and Monitoring Programs

Chapters 2 and 8.

(The material in Chapter 2 was covered in sections 5 and 6.)


This page is copyright (c) 2001 Quantitative Decisions and William A. Huber.  Please cite it as

This page was last updated 4 April 2001.