Geometric Means, Exp(), Ln(), and All That
"The formula in our text for geometric mean goes something like exp[(1/n)Sum(log(x))]. This differs from the formula
you used that contains a square root and what appears to be the product of
both values in the set (or would those be the "min" and
"max"?). Can you please explain further your calculation for
question 8?" -- Question, January 24, 2001
The formula quoted above is number 3.4 on page 62 of the text.
The expression I used for the geometric mean in question 8 of the sample quiz, for N=2, was sqrt(X1 *
X2). This introduces a long line of interesting questions about what
powers, roots, exp(), and log() really mean. I will quickly review a
standard method of defining these because it's a good opportunity to prepare
you for material later in this course. If you just want the short answer
to the question, skip to paragraphs 4 and 5 below. However, you do need
to know all the facts listed here.
1. The natural logarithm ("ln"
here; "log" in the quotation above) of any number z > 0 is the
area under the curve y = 1/x from x=1 to x=z, using the convention that if z
is to the left of 1 (that is, 0 < z < 1), then the area is considered to
be negative. Using basic properties of areas you can prove these
fundamental facts, which you should commit to memory:
* ln(1) = 0 (draw a picture: there's no area).
* As x becomes arbitrarily large, so does ln(x) (the area gets bigger, obviously, but the point is that it becomes arbitrarily large, without approaching a limit).
* As x approaches zero, ln(x) becomes arbitrarily negative (approaches -infinity).
* ln(u*v) = ln(u) + ln(v) for any positive numbers u and v. (This is the only property listed here where calculus is helpful in the demonstration. A sophisticated understanding of Euclidean geometry is sufficient, however.)
* If ln(u) = ln(v), then u = v.
* ln is monotonically increasing.
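The area definition of the logarithm can be tried out numerically. The following is only a minimal sketch: the function name `ln_area` and the number of slices are illustrative choices, and the midpoint rule used to estimate the area is an assumption of this example, not something the text prescribes.

```python
import math

def ln_area(z, steps=100_000):
    """Approximate ln(z) as the signed area under y = 1/x between x = 1 and x = z.

    The area is estimated with the midpoint rule; `steps` is an arbitrary choice.
    """
    if z <= 0:
        raise ValueError("ln is defined only for z > 0")
    width = (z - 1.0) / steps      # negative when 0 < z < 1, so the area comes out negative
    total = 0.0
    for i in range(steps):
        midpoint = 1.0 + (i + 0.5) * width
        total += width / midpoint  # thin rectangle of height 1/x
    return total

# Spot checks of the listed properties (agreement is approximate, not exact).
print(ln_area(1.0))                               # no area at all
print(ln_area(6.0), ln_area(2.0) + ln_area(3.0))  # ln(u*v) compared with ln(u) + ln(v)
print(ln_area(0.5), math.log(0.5))                # negative to the left of 1
```

The comparison with `math.log` is there only as a sanity check on the area estimate.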
2. The exponential exp(x) of any
number x is defined by inverting the logarithm. That is, to say exp(x) = y is
exactly the same as saying ln(y) = x. Therefore exp() has these basic
properties, which you also should know:
* exp(ln(x)) = x for all x > 0.
* ln(exp(x)) = x for all x.
* exp(x) > 0 for all x.
* exp(0) = 1.
* As x becomes arbitrarily large, so does exp(x).
* As x becomes arbitrarily negative (approaches -infinity), exp(x) approaches zero.
* exp(u + v) = exp(u)*exp(v) for any numbers u and v.
* If exp(u) = exp(v), then u = v.
* exp() is monotonically increasing.
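To see that "inverse of the logarithm" is enough to pin down exp(), here is a rough sketch that computes exp(x) using nothing but a logarithm routine. The function name `exp_by_inversion`, the tolerance, and the use of bisection are assumptions made for this illustration. It relies on two of the facts above: ln is increasing, and ln takes arbitrarily large and arbitrarily negative values.

```python
import math

def exp_by_inversion(x, tol=1e-12):
    """Find the y > 0 with ln(y) = x by bisection, using only math.log."""
    lo, hi = 1.0, 1.0
    # ln is unbounded in both directions, so a bracket lo <= y <= hi always exists.
    while math.log(lo) > x:
        lo /= 2.0
    while math.log(hi) < x:
        hi *= 2.0
    # ln is increasing, so halving the bracket always keeps the solution inside it.
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2.0
        if math.log(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(exp_by_inversion(0.0))                                # exp(0)
print(exp_by_inversion(1.0), math.e)                        # exp(1) compared with e
print(exp_by_inversion(5.0),
      exp_by_inversion(2.0) * exp_by_inversion(3.0))        # exp(u + v) compared with exp(u)*exp(v)
```

In practice one would simply call `math.exp`; the point of the sketch is only that the definition by inversion is computable.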
3. For any numbers x and y, the
value x^y (x to the power y) is *defined* to be exp(y*ln(x)). Clearly x
must be greater than zero for ln(x) to be defined. This is the only
restriction. Using the properties of ln() and exp() above it is
straightforward to demonstrate that powers have their "usual"
properties, including:
* a^0 = 1
* a^1 = a
* a^(b+c) = a^b * a^c
* (a^b)^c = a^(b*c)
for any numbers a, b, and c where these expressions are
defined. From these you can derive the "definitions" used in
elementary texts, such as
* a^n = a * a * ... * a (n factors) for integers n >= 2
* a^(-n) = 1/(a^n)
* (a^(1/b))^b = a, so a^(1/b) is a "bth root" of a.
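The definition of powers in paragraph 3 translates directly into code. This is a small sketch, with `power` as a made-up helper name and the test values chosen arbitrarily; it checks a few of the "usual" properties numerically rather than proving them.

```python
import math

def power(x, y):
    """x to the power y, defined as exp(y * ln(x)); requires x > 0 as in the text."""
    if x <= 0:
        raise ValueError("x must be greater than zero")
    return math.exp(y * math.log(x))

a, b, c = 2.5, 1.7, -0.4   # arbitrary example values

print(power(a, 0.0), power(a, 1.0))                 # a^0 and a^1
print(power(a, b + c), power(a, b) * power(a, c))   # a^(b+c) compared with a^b * a^c
print(power(power(a, b), c), power(a, b * c))       # (a^b)^c compared with a^(b*c)
print(power(power(9.0, 1 / 2), 2.0))                # (a^(1/b))^b recovers a
```

Each pair of printed values should agree up to floating-point rounding.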
4. Let N=2, so the batch in
question in formula (3.4) can be written {X1, X2}. Writing out the
summation gives
geometric mean = exp(1/2 * (ln(X1) + ln(X2)))
By *definition* (#3), this is the 1/2 power ("square
root") of exp(ln(X1) + ln(X2)), which by a basic property of exp() is
exp(ln(X1)) * exp(ln(X2)), which by *definition* of exp() is simply X1*X2.
Whence the formula reduces to the more familiar looking
geometric mean = square root of (X1 * X2).
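The chain of equalities in paragraph 4 can be followed step by step on a concrete pair of numbers. The values 4 and 9 below are arbitrary, and the variable names are just labels for the successive expressions in the argument.

```python
import math

x1, x2 = 4.0, 9.0   # arbitrary positive example values

step1 = math.exp(0.5 * (math.log(x1) + math.log(x2)))               # formula (3.4) with N = 2
step2 = math.sqrt(math.exp(math.log(x1) + math.log(x2)))            # 1/2 power of exp(ln(X1) + ln(X2))
step3 = math.sqrt(math.exp(math.log(x1)) * math.exp(math.log(x2)))  # exp of a sum is a product
step4 = math.sqrt(x1 * x2)                                          # exp(ln(x)) = x

print(step1, step2, step3, step4)   # the four values agree up to rounding
```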
5. For N >= 1, you can
similarly demonstrate that formula (3.4) implies
geometric mean = Nth root of (X1 * X2 * ... * XN)
This formula looks a little nicer, but (3.4) is a little easier
to implement on a calculator or computer, and for large batches it avoids the
overflow or underflow that a long product can produce.
Formula (3.4) also makes it clear how the "geometric mean" really is
the *mean* of something; namely, first you take logarithms, then compute their
mean, then "undo" the logarithm (that's the exponential). It
also makes it easy to see that geometric means are defined only for batches of
strictly positive data. (The Nth root formula above does produce a value
for some additional batches, such as {-1, -4}: the square root of (-1 * -4)
is 2. But 2 is evidently not any reasonable kind of "mean" of {-1, -4},
so the result does not make much sense.)
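Both versions of the geometric mean are easy to write down, and comparing them shows one practical difference. This is a sketch under stated assumptions: the function names are invented for this example, the batches are made up, and the values 1e200 are chosen only because their product exceeds what a double-precision float can hold.

```python
import math

def geometric_mean_logs(batch):
    """Formula (3.4): take logarithms, average them, then exponentiate."""
    if not batch:
        raise ValueError("batch must not be empty")
    if any(x <= 0 for x in batch):
        raise ValueError("geometric means need strictly positive data")
    return math.exp(sum(math.log(x) for x in batch) / len(batch))

def geometric_mean_root(batch):
    """Nth root of the product X1 * X2 * ... * XN (no positivity check)."""
    product = 1.0
    for x in batch:
        product *= x
    return product ** (1.0 / len(batch))

batch = [2.0, 8.0, 4.0]
print(geometric_mean_logs(batch), geometric_mean_root(batch))  # both near 4

big = [1e200, 1e200]
print(geometric_mean_logs(big))   # stays finite
print(geometric_mean_root(big))   # the intermediate product overflows to infinity

print(geometric_mean_root([-1.0, -4.0]))  # a value comes out, but it is not a sensible "mean"
```

The positivity check lives only in the logarithm version because that is where the restriction arises naturally: `math.log` has no value for zero or negative input.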
6. Another basic, but slightly
subtler, property of ln() (namely, its "convexity" or, in the usual terminology, its concavity) implies that
geometric means of batches can never be greater than arithmetic means.
This is called the "geometric-arithmetic inequality." It is
quite a powerful tool used to solve a large class of optimization problems,
for example. More to the point, we will use this inequality later to
demonstrate that geometric means are "biased": on the average, they
underestimate the true average concentrations or masses of chemicals in an
environmental medium, for instance. This is key to understanding the
regulatory concern about so-called "lognormal" distributions.
By the way, the "convexity" of the logarithm (strictly speaking, its concavity:
the graph of ln bends downward) amounts to the fact that the graph of y = 1/x
is always decreasing. That brings
us full circle back to the definition of logarithm. The whole theory is
simple and elegant.
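The inequality between the two means is easy to observe on examples, although examples are of course not a proof. In the sketch below the batches are invented for illustration, and the lognormal draws (via Python's `random.lognormvariate`) stand in for hypothetical concentration data; none of these numbers come from the course text.

```python
import math
import random

def arithmetic_mean(batch):
    return sum(batch) / len(batch)

def geometric_mean(batch):
    # Formula (3.4): exponentiate the mean of the logarithms.
    return math.exp(sum(math.log(x) for x in batch) / len(batch))

random.seed(0)  # fixed seed so the illustration is repeatable
batches = [
    [1.0, 4.0],
    [2.0, 2.0, 2.0],            # all values equal: the two means coincide
    [0.1, 1.0, 10.0, 100.0],
    # Hypothetical "concentration" data drawn from a lognormal distribution.
    [random.lognormvariate(0.0, 1.0) for _ in range(1000)],
]

for batch in batches:
    gm, am = geometric_mean(batch), arithmetic_mean(batch)
    print(f"geometric mean {gm:.4f}   arithmetic mean {am:.4f}")
```

In every line of output the geometric mean should be no larger than the arithmetic mean, apart from rounding in the case where the two coincide.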
In class we will do some exercises to become more
familiar with ln() and exp(): we will learn how to compute their values (well,
to one or two significant figures, anyway) quickly enough to do most problems
involving ln(), exp(), powers, and roots in our heads. (See Answers to HW 5.)
For these exercises I will assume you know all the bulleted properties listed above.
In addition, we will need the Taylor Series approximations for exp() and ln():
exp(x) = 1 + x/1! + x^2/2! + x^3/3! + x^4/4! + ...
ln(1+x) = x - x^2/2 + x^3/3 - x^4/4 + ... provided -1 < x < 1.
(Recall 0! = 1, 1! = 1*0! = 1, 2! = 2*1! = 2, 3! = 3*2! = 6,
and in general n! = n*(n-1)!. These series can be derived directly from
the definition of ln() above, but the derivations require subtler properties
of integration, more involved algebraic equations, and some consideration of
limits: after all, they are infinite series.)
The patterns in the series are simple and should be memorized.
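To get a feel for how quickly the two series settle down, their partial sums can be computed directly. This is a minimal sketch; the function names and the default numbers of terms are arbitrary choices for illustration.

```python
import math

def exp_series(x, terms=20):
    """Partial sum 1 + x/1! + x^2/2! + ... using `terms` terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)    # turns x^n/n! into x^(n+1)/(n+1)!
    return total

def ln1p_series(x, terms=200):
    """Partial sum x - x^2/2 + x^3/3 - ... for ln(1+x), valid for -1 < x < 1."""
    if not -1 < x < 1:
        raise ValueError("the series is used here only for -1 < x < 1")
    total, x_power = 0.0, x
    for n in range(1, terms + 1):
        sign = 1.0 if n % 2 == 1 else -1.0
        total += sign * x_power / n
        x_power *= x
    return total

# With a small x, two or three terms already give a couple of significant figures.
print(exp_series(0.1, terms=3), math.exp(0.1))
print(ln1p_series(0.1, terms=2), math.log(1.1))

# With more terms the partial sums match the library values closely.
print(exp_series(1.0), math.e)
print(ln1p_series(0.5), math.log(1.5))
```

The first two lines are the kind of short mental calculation the in-class exercises aim at: only the leading terms are kept.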
For more on logarithms and exponentials see the Logarithms
page.

This page is copyright (c) 2001 Quantitative Decisions.
Please cite it as
This page was created 16 February and last updated 3 May 2001.