Final Projects

Collectively, the final project reports exhibited a mastery of all the techniques presented in class.  However, each report exhibited some deficiencies or problems.  The commonest of these are discussed here.

Present the Data

Whenever possible, you should reproduce the data in a statistical report, at least to the extent of displaying them graphically in detail.  This will allow the interested reader to reproduce your results.

This reproducibility criterion is one of the foundations of modern science.  It is a recurring theme in statistical reports.  For instance, when you report the results of a statistical test, you need to provide details about the test and its calculation so it can be reproduced.

Perform EDA

Always perform exploratory data analysis.  It does not matter whether EDA is "required" or not.  Without exception, the reports that did not perform EDA (or did not do it effectively) made significant errors that would have been made obvious with the simplest form of EDA.

Make Clear Tables and Figures

Tables and figures should stand on their own.  Do not be afraid to include lengthy captions if necessary.  Describe the columns of tables and the axes of graphs.  Short headers or computer variable names (such as "Result_1XX") are not usually meaningful.  Provide units of measurement.  Make sure the titles correctly describe the table or figure and that they distinguish the tables and figures from each other.  Omit irrelevant or unnecessary material.  (If the computer produced a number or figure, but you do not understand exactly what it means, then do not present it.)

Graph Data With a Time Component

Monitoring data are associated with specific times.  A basic question about all such data is whether they are changing over time.  In order to conduct the kinds of tests you know, you have to verify that these data behave as if they were tickets obtained from the same box each time and that no draw influences the outcome of any other draw.  This is statistical independence.  However, monitoring data behave this way only when the sampling times are sufficiently well separated.

The most basic technique for exploring such time series data is to plot values (vertical axis) against time (horizontal axis).  Clustering and regular, smooth fluctuation of value over time indicate lack of independence.  Apparently random "wiggling" about a horizontal value suggests independence.
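As a numerical companion to the plot (not a replacement for it), one could compute the lag-1 autocorrelation of the series: values near zero are consistent with independence, while large positive values reflect the smooth fluctuation described above.  This is only a sketch; the function name and the data are hypothetical, and it assumes the observations are equally spaced in time.

```python
from statistics import mean

def lag1_autocorrelation(values):
    """Return the lag-1 sample autocorrelation of a series ordered by time."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    m = mean(values)
    # Sum of squared deviations from the overall mean (the denominator).
    denom = sum((v - m) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series is constant")
    # Products of each deviation with the deviation of the next observation.
    numer = sum((values[i] - m) * (values[i + 1] - m) for i in range(n - 1))
    return numer / denom

# Hypothetical series that rises and falls smoothly over time.
smooth = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
r_smooth = lag1_autocorrelation(smooth)
print(f"lag-1 autocorrelation: {r_smooth:.2f}")
```

A strongly positive result like this one would be a reason to doubt that the observations behave like independent draws from the same box.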

Explicitly Indicate Which Data You Used

Most situations are complex.  They require small batches of data to be aggregated or large batches of data to be divided into groups.  Whenever you are presenting descriptive statistics, making comparisons, or giving the results of a test, state clearly and explicitly exactly which sets of data are being described, compared, or tested.

Report Test Results in Detail

When using a statistical test of hypothesis, the principle of reproducibility implies you must describe

  1. The test being used.  A name (such as "Welch's t test") is not sufficient; not every reader will know exactly what that is.  Provide at least a reference; better, provide the formula for computing the test statistic.
  2. The assumptions of the test.  For example, Student's t test (involving two batches of values) assumes that each batch of values can be modeled as independent realizations of an underlying Normal distribution, and that the two Normal distributions have equal variances but possibly different means.
  3. The computed value(s) of the test statistic(s), such as the t-value and degrees of freedom for Student's t.
  4. The associated P-value of the test statistic, as well as how the P-value is computed and any assumptions underlying its computation.  For example, many tests compute approximate P-values based on a Normal approximation to the distribution of the statistic.  This needs to be documented.

It is conventional to tabulate test results, especially when performing more than one test.
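To illustrate what "providing the formula" can look like, here is a minimal sketch of the usual Welch's t statistic with the Welch-Satterthwaite degrees of freedom.  The function name and the two batches are hypothetical.  The P-value is deliberately left out because the Python standard library has no Student t distribution; a report would still need to state how it was obtained.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(batch1, batch2):
    """Return (t, df) for Welch's unequal-variance t test."""
    n1, n2 = len(batch1), len(batch2)
    # Squared standard error of each batch mean: sample variance / n.
    se1 = variance(batch1) / n1
    se2 = variance(batch2) / n2
    t = (mean(batch1) - mean(batch2)) / sqrt(se1 + se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical batches, for illustration only.
background = [1, 2, 3, 4, 5]
onsite = [2, 4, 6, 8, 10]
t, df = welch_t(background, onsite)
print(f"Welch's t = {t:.3f}, df = {df:.2f}")
```

Reporting the statistic and the degrees of freedom together lets a reader recompute the P-value independently.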

Use What You Have Learned

You have learned many powerful techniques for assessing data.  So, for example, if you want to detect outliers, then compute a five-letter summary, construct the fence and outer fence, and classify values as "outside" or "far outside".  If you want to compare batches to batches, then construct a Q-Q plot.  If you want to assess whether a batch is approximately Normal, then draw its Normal probability plot.  Use robust statistics such as H-spreads and the MAD to describe the spreads of data.
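The outlier screening and the MAD can be sketched as follows.  This assumes one common convention: hinges are the medians of the lower and upper halves of the sorted batch (each half including the median when the count is odd), the fences lie 1.5 H-spreads beyond the hinges, and the outer fences lie 3 H-spreads beyond them.  If the class used different multipliers or hinge rules, substitute those.  All names and data are hypothetical.

```python
from statistics import median

def hinges(values):
    """Return (lower hinge, upper hinge) as medians of the two halves."""
    s = sorted(values)
    half = (len(s) + 1) // 2  # each half includes the median when n is odd
    return median(s[:half]), median(s[-half:])

def classify(values, inner=1.5, outer=3.0):
    """Label each value "inside", "outside", or "far outside" the fences."""
    lo, hi = hinges(values)
    h_spread = hi - lo
    labels = []
    for v in values:
        # Distance beyond the nearer hinge; negative when between the hinges.
        beyond = max(lo - v, v - hi)
        if beyond > outer * h_spread:
            labels.append("far outside")
        elif beyond > inner * h_spread:
            labels.append("outside")
        else:
            labels.append("inside")
    return labels

def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median(abs(v - m) for v in values)

# Hypothetical batch, for illustration only.
data = [2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
labels = classify(data)
print(hinges(data), mad(data))
print([(v, lab) for v, lab in zip(data, labels) if lab != "inside"])
```

With these data the hinges are 4 and 9, so the H-spread is 5; the value 20 lies beyond the fence at 16.5 and the value 40 lies beyond the outer fence at 24.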

Do Not Use a Computer for the Computer's Sake

Avoid using software just because it is there.  Reports that include redundant, contradictory, and uninterpreted computer output signal that the author is relying on the computer to think.  Reports that include the results of sophisticated but obscure tests (such as Grubbs' test for a single outlier, chosen by several people) err by not describing those tests or justifying their use.  Choose the test according to your decision-making needs, not according to your computing capabilities.

Question automatic computer output.  For example, Excel will calculate and display a "trend line" for any scatterplot.  Unless this line closely approximates the data, it probably does not belong in the graphic.  Delete it.  Statistical packages give you a host of descriptive statistics.  For example, a package some of you chose to use automatically reports "Fisher's g1" and "Fisher's g2."  Don't know what they mean?  Then edit them out (or learn what they mean and decide whether they are useful to you, then explain them to the reader).

State Performance Criteria Fully

The purpose of many tests in environmental statistics is to compare conditions to standards.  For example, a regulation may require that a process produce an effluent "not exceeding 10 ppb lead."  As it stands, that is an ambiguous criterion.  Should the average concentration be less than 10 ppb?  If so, averaged over what period of time?  Or should all concentrations be less than 10 ppb?  If so, all concentrations out of how many observations?  Or should the 90th percentile of all concentrations be less than 10 ppb?

The mark of a true criterion is that you can determine unambiguously whether it has been met or not.  That means the criterion must explicitly provide a formula, applicable in all cases, that states definitely whether a set of observations meets or does not meet the criterion.

Some examples of adequate (but informally stated) criteria are "the mean onsite soil concentration must not exceed the mean background soil concentration;" "any running seven-day mean lead concentration must be less than 10 ppb;" and "all groundwater concentrations must be less than or equal to the MCL."
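One way to see that a criterion is unambiguous is to check that it can be written as a function returning a definite yes or no.  The sketch below does this for the running seven-day mean example.  It assumes exactly one observation per day with no gaps, and it treats the observations as exact, ignoring their random component.  The function name and the data are hypothetical.

```python
def meets_running_mean_criterion(daily_ppb, limit=10.0, window=7):
    """Return True if every running `window`-day mean is below `limit`."""
    if len(daily_ppb) < window:
        raise ValueError("need at least one full window of observations")
    for start in range(len(daily_ppb) - window + 1):
        window_mean = sum(daily_ppb[start:start + window]) / window
        if window_mean >= limit:
            return False
    return True

# Hypothetical daily lead concentrations in ppb, for illustration only.
compliant = [8, 9, 11, 9, 8, 9, 10, 9, 8]
violating = [8, 9, 12, 12, 12, 9, 10, 9, 8]
print(meets_running_mean_criterion(compliant))
print(meets_running_mean_criterion(violating))
```

Note that the first series meets this criterion even though one daily value exceeds 10 ppb, which is exactly the kind of distinction an informally worded standard leaves open.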

Because one attempts to meet criteria with observations, and those observations have random components, no criterion is met with certainty.  This implies that a confidence (or significance) level is usually needed to make any criterion truly unambiguous.

Choose Appropriate Significance Levels

The conventional 0.01 (1%) and 0.05 (5%) levels derive from habit and the limitation of tables published in the early 20th century.  Unless you are preparing a report for publication in a journal that requires such levels, or are preparing a report for a regulatory agency (such as the US EPA) that requires such levels, then you have no reason to use these values.

Remember that test levels and confidence are related to risk.  One of the most valuable effects of writing a statistical report is that it forces you to consider the elements of risk, which are losses and their probabilities.  Often the mere awareness of these within an organization is a great leap forward.  Whenever possible, try to understand the potential losses in a statistical problem and choose test levels appropriate to manage those losses.

Be Aware of Subtleties of Estimators and T Tests

Any value that is computed from observations is not "known" and is not some "bright line" or constant number: it is just an estimate, subject to uncertainty.  It is usually a mistake to use an estimate in later calculations as if it were a constant value.

The independence assumptions of t tests (and of almost all tests that compare two or more batches of numbers) imply

It is not valid to use a t test to compare a subset of a batch to the entire batch.
It is not valid to use a t test to compare two batches that have some data in common.

Use Statistical Language Correctly

It is tempting to insert the words "statistical" and "significant" at every opportunity.  They make a report sound statistical and significant, don't they?  The problem is that in most cases "statistical" is meaningless and "significant" has a very special meaning.

A result is significant only when you have conducted a test of significance and obtained a sufficiently small P-value.  Whenever you insert the word "significant" in a report, make sure you have included the details of the test you performed (see "Report Test Results in Detail" above) to support the finding.

Other words with special statistical meanings are "independent," "random," "sample," "distribution," and "correlated."  Use these with care and precision.

Check Your Work

The memorizing you did in this course has a point: namely, to put at your disposal simple criteria for determining whether your calculations are correct.  For example, if you compute that the probability of a standard Normal variable falling between -0.2 and +0.2 is 68%, you should know immediately that an error has occurred (because you memorized the fact that 68% is the probability of the variable falling between -1.0 and +1.0).
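The check in this example can also be done in a few lines, here using the standard library's NormalDist class.  The helper name central_probability is hypothetical.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

def central_probability(half_width):
    """Probability that a standard Normal variable lies within +/- half_width."""
    return z.cdf(half_width) - z.cdf(-half_width)

p_one = central_probability(1.0)    # the memorized benchmark, about 68%
p_fifth = central_probability(0.2)  # the interval in question
print(f"P(-1.0 < Z < 1.0) = {p_one:.3f}")
print(f"P(-0.2 < Z < 0.2) = {p_fifth:.3f}")
```

The second probability comes out near 16%, far from 68%, confirming that the memorized benchmark would have caught the error.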

It is so easy to make mistakes, especially in complex calculations, that you should routinely check your answers as many ways as possible.  Constantly ask,

Are the answers internally consistent?
Do the descriptive statistics and test results agree with the EDA results?  Are the EDA results consistent with one another?
Do the answers agree with rough approximations that can be done quickly with pencil and paper?
Do the recommendations (that flow from the statistical calculations) make sense?  Are they consistent with what is known or expected?

This page was created 27 April 2001 and last updated 27 April 2001.