Chapter 11: Measuring Geometric Properties

The data for Exercise 11a are corrupted.  Click here to find out why and how to fix the problem.

Exercises 11a and b--measuring distance and area

Computers measure geometric properties using formulas of analytic geometry.  Consider the simplest situation of two points.  The underlying representation of these points, if they are contained in a feature theme source, is either

latitude and longitude on a sphere (or ellipsoid of revolution--more about that later) or
Cartesian coordinates (x, y).

For the sake of illustration we will stick to the simpler and more familiar Euclidean geometry setting where all data are represented in Cartesian coordinates.  (The formulas for spherical coordinates are similar in spirit but more complex in the application.)  The most fundamental calculations of Cartesian geometry concern distance, area, and angle:

  1. Distance between two points (Pythagorean theorem)
        The distance from (0, 0) to (x, y) is sqrt(x^2 + y^2).
  2. Signed area of a triangle
        The signed area of the triangle defined by the vertex sequence (0, 0), (x0, y0), and (x1, y1) is (x0 * y1 - x1 * y0)/2.
  3. Signed angle formed by two rays.
        The angle formed at (0, 0) by the incoming ray from (x0, y0) and the outgoing ray toward (x1, y1) is the unique value between 0 and 360 degrees whose sine is (x0 * y1 - x1 * y0)/d and whose cosine is (x0 * x1 + y0 * y1)/d, where d = sqrt(x0^2 + y0^2) * sqrt(x1^2 + y1^2).
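
The three formulas can be sketched in a few lines of Python.  This is only an illustration: the function names are invented here, and the use of atan2 to recover the angle from its sine and cosine is one convenient choice, not a description of what any particular GIS does internally.

```python
import math

def distance(x, y):
    """Formula 1: distance from (0, 0) to (x, y)."""
    return math.hypot(x, y)

def signed_area(x0, y0, x1, y1):
    """Formula 2: signed area of the triangle (0, 0), (x0, y0), (x1, y1)."""
    return (x0 * y1 - x1 * y0) / 2.0

def signed_angle(x0, y0, x1, y1):
    """Formula 3: angle at (0, 0) from the ray through (x0, y0) to the ray
    through (x1, y1), in degrees, counterclockwise, between 0 and 360."""
    d = math.hypot(x0, y0) * math.hypot(x1, y1)
    if d == 0:
        raise ValueError("neither point may coincide with the origin")
    sine = (x0 * y1 - x1 * y0) / d
    cosine = (x0 * x1 + y0 * y1) / d
    # atan2 returns a value in (-180, 180]; the modulus shifts it to 0..360.
    return math.degrees(math.atan2(sine, cosine)) % 360.0
```

Swapping the two points changes the sign of the area and replaces the angle by 360 degrees minus itself, which is what "signed" means in both cases.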

The first two formulas are simple, easily memorized, and rapidly computed.  A modern (early 2004) PC workstation can perform up to 500 million (5 * 10^8) of these elementary distance or area calculations in one second.  The third formula is more complex because it involves trigonometric functions--that's unavoidable--but still, a computer can perform tens of millions of angle calculations per second.

Where do these formulas come from?

Perhaps the most powerful way to perform geometric computations in the plane is by exploiting the most elementary properties of complex numbers.  A complex number is not a complicated one, but rather is one made from (a complex of) two ordinary numbers.  When we interpret the first number as an x coordinate and the second as a y coordinate, any complex of two ordinary numbers is just a point in the plane, (x, y).  It can also be considered as the vector (x, y) representing a uniform motion from the point (0, 0) ending at the location (x, y).  The numbers x and y are the components of the complex number (x, y).

The truly useful thing, the observation that gets everything going, is that there are geometrically meaningful ways to add and multiply complex numbers.  They correspond to translation, rescaling, and rotation.  These operations generalize the usual addition and multiplication of ordinary numbers.  Moreover, these operations are simple to define, easy to memorize, and have simple formulas in terms of ordinary arithmetic operations.

Addition of two complex numbers (x, y) and (u, v) proceeds component by component: (x, y) + (u, v) = (x+u, y+v).  Geometrically, addition is vector addition by the parallelogram law: the sum (s, t) of two complex numbers is the unique point that makes the sequence of points (0, 0), (x, y), (s, t), (u, v) into a parallelogram.  This translates (moves) the vector (x, y) by the amount (u, v); or equivalently, it translates (u, v) by the amount (x, y).

The angle, or argument, of a complex number (x, y), is a mathematical version of its geographical azimuth: it is the angle formed between the ray from (0,0) to (x, y) and the ray from (0,0) to (1,0) ("due east").  It is sometimes written arg(x, y).  The angle is positive when measured counterclockwise, negative when measured clockwise.  If we measure angles in degrees (360 degrees is one full circle), then degrees east of north--the more conventional geographic azimuth--is just 90 - arg(x, y).
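
As a small illustration of the relationship between the argument and the geographic azimuth, here is a Python sketch.  The function names are made up for this example, and reducing the result to the range 0 to 360 is an added convenience, not part of the formula itself.

```python
import math

def arg_degrees(x, y):
    """Argument of (x, y): degrees counterclockwise from due east."""
    return math.degrees(math.atan2(y, x))

def azimuth_degrees(x, y):
    """Geographic azimuth of (x, y): degrees east of north, as 90 - arg."""
    # The modulus only normalizes the answer to the range 0..360.
    return (90.0 - arg_degrees(x, y)) % 360.0
```

For instance, a point due west of the origin has an argument of 180 degrees and therefore an azimuth of 90 - 180 = -90 degrees, which is the same direction as 270 degrees.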

The length, or absolute value, of a complex number is its distance from (0, 0).  The Pythagorean theorem implies the absolute value of (x, y), written |(x, y)|, is the square root of the sum of the squares of the components:  |(x, y)| = sqrt(x^2 + y^2).

Multiplication of two complex numbers (x, y) and (u, v) adds their arguments and multiplies their lengths.  That is, the product (x, y) * (u, v) is the point in the plane whose absolute value equals |(x, y)| * |(u, v)| and whose argument equals arg(x, y) + arg(u, v).  Multiplying the absolute values rescales the lengths, while adding the arguments rotates the points.  To picture complex multiplication, we need to show the positive x axis (our reference for an angle of zero) and the unit circle (our reference for all points whose length is one).

In the figure, the argument of (x, y) is about -30 degrees while the argument of (u, v) is about 75 degrees.  The argument of their product must be about -30 + 75 = 45 degrees, as shown.  The lengths of these complex numbers can be estimated from the unit circle: the length of (x, y) is about 4/3, the length of (u, v) is a little less than 2, so the length of their product must be a little less than 4/3 * 2 = 8/3.

With a little geometry it is possible to show that (x, y) * (u, v) = (x*u - y*v, x*v + y*u).

Corresponding to addition and multiplication are, of course, subtraction and division (their inverses).  The formulas are

(x, y) - (u, v) = (x-u, y-v) (just as you might guess) and

(x, y) / (u, v) = (x, y) * (u/r, -v/r) where r = |(u, v)|^2 = u^2 + v^2, provided r is not zero.
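
The arithmetic of complex numbers as coordinate pairs can be sketched as follows.  The helper names (c_add, c_mul, and so on) are invented for this illustration; Python's built-in complex type does the same arithmetic and can be used to check the results.

```python
import math

def c_add(a, b):
    """Component-by-component sum: translation."""
    return (a[0] + b[0], a[1] + b[1])

def c_sub(a, b):
    """Component-by-component difference."""
    return (a[0] - b[0], a[1] - b[1])

def c_mul(a, b):
    """Product: multiplies the lengths and adds the arguments."""
    x, y = a
    u, v = b
    return (x * u - y * v, x * v + y * u)

def c_div(a, b):
    """Quotient, computed as a * (u/r, -v/r) with r = u^2 + v^2."""
    u, v = b
    r = u * u + v * v
    if r == 0:
        raise ZeroDivisionError("cannot divide by (0, 0)")
    return c_mul(a, (u / r, -v / r))

def c_abs(a):
    """Length (absolute value)."""
    return math.hypot(a[0], a[1])

def c_arg(a):
    """Argument in degrees, counterclockwise from the positive x axis."""
    return math.degrees(math.atan2(a[1], a[0]))

# A point of length 4/3 at -30 degrees times a point of length 1.9 at 75 degrees.
a = (4 / 3 * math.cos(math.radians(-30)), 4 / 3 * math.sin(math.radians(-30)))
b = (1.9 * math.cos(math.radians(75)), 1.9 * math.sin(math.radians(75)))
product = c_mul(a, b)
print(c_arg(product), c_abs(product))  # argument near 45, length near 4/3 * 1.9
```

The specific lengths and angles in the usage lines are assumed values chosen to resemble the multiplication example in the text; they are not read off the original figure.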

These formulas show how to translate, rotate, and rescale figures in the plane in terms of their coordinates.  Combinations of complex addition, subtraction, multiplication, and division therefore can be used to compute just about any property of interest, whether it be length, angle, or area.  

Looking back at formulas (1) through (3) above and ahead to the formulas below, you should now be able to recognize the influence of complex numbers in practical GIS computations.  Complex numbers are useful for more than this: they also provide coordinates on the sphere and formulas for rotating the sphere, computing spherical areas and distances, and for projecting the sphere.

Usually, a GIS represents all features by piecewise linear approximations.  Thus, curves are represented by "polylines," which are sequences of points connected by straight line segments, and regions are represented by their boundaries, which are curves.  (This means all regions are actually polygons, albeit potentially very complicated ones.)  With this representation many calculations can be reduced to a series of applications of formulas 1 or 2.

Here are some examples of how these fundamental calculations are used:

Distances and lengths

The distance between points (x0, y0) and (x1, y1) is sqrt((x0 - x1)^2 + (y0 - y1)^2).  This is because this pair of points can be moved so that the first one is at the origin (0, 0) simply by subtracting (x1, y1) from both.  This motion does not change the distance, which is the distance from (0, 0) to (x, y) = (x0 - x1, y0 - y1).  So apply formula 1.
The distance between a point P = (x0, y0) and the line L joining (x1, y1) and (x2, y2) is determined by finding the closest point on L to P.  The points on L are all of the form t*(x1, y1) + (1-t)*(x2, y2) for real numbers t.  By the preceding computation, the square of the distance between such a point and P is 
    (t*(x1-x2) + x2 - x0)^2 + (t*(y1-y2) + y2 - y0)^2.
By minimizing the square, one will minimize the distance.  But this square is simply a quadratic expression in t; if one writes it out in the form A*t^2 + B*t + C, the unique minimum will occur at t = -B/(2*A).  Thus, this approach gives both the distance from P to L and the closest point L0 on L to P.
The distance between a point P and a line segment joining its (distinct) endpoints depends on the closest point to P on the line L defined by the endpoints.  If that closest point L0 lies between the endpoints, its distance to P gives the solution.  Otherwise, exactly one of the two endpoints is closest to P; one computes both their distances and picks the smaller one.
Now you can compute the distance between any point and any polyline by finding the smallest distance between the point and any of the polyline segments.
Distances between more complex objects (two polylines, two polygons, a polyline and a polygon) are better computed using more sophisticated algorithms that limit the number of distance calculations.
The length of the line segment from (x1, y1) to (x2, y2) is just the distance between those points.
The length of a polyline is the sum of the lengths of its segments.
The perimeter of a polygon (or region) is the length of its boundary (which is a polyline).
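
The distance and length calculations in this list can be sketched in Python as follows.  The function names are invented for this illustration, and the brute-force loop over segments is the simple approach described above, not one of the more sophisticated algorithms mentioned for complex objects.

```python
import math

def dist(p, q):
    """Distance between two points (formula 1 after translation)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def closest_on_line(p, a, b):
    """Return (t, point), where point = t*a + (1-t)*b is the point on the
    line through a and b closest to p."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    # The squared distance is A*t^2 + B*t + C; only A and B locate the minimum.
    A = dx * dx + dy * dy
    B = 2 * (dx * (b[0] - p[0]) + dy * (b[1] - p[1]))
    if A == 0:
        raise ValueError("the two points defining the line must be distinct")
    t = -B / (2 * A)
    return t, (t * a[0] + (1 - t) * b[0], t * a[1] + (1 - t) * b[1])

def point_segment_distance(p, a, b):
    """Distance from p to the segment with endpoints a and b."""
    t, foot = closest_on_line(p, a, b)
    if 0.0 <= t <= 1.0:          # closest point lies between the endpoints
        return dist(p, foot)
    return min(dist(p, a), dist(p, b))

def point_polyline_distance(p, vertices):
    """Smallest distance from p to any segment of the polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(vertices, vertices[1:]))

def polyline_length(vertices):
    """Sum of the segment lengths; for a closed ring this is the perimeter."""
    return sum(dist(a, b) for a, b in zip(vertices, vertices[1:]))
```

A polygon's perimeter comes from polyline_length applied to its boundary with the first vertex repeated at the end.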

Areas

The signed area of any triangle is given by formula 2 by translating the triangle until one of its vertices is at (0,0).
The signed area of a simple polygon is given by dividing the polygon into signed triangles.  Suppose the polygon is a simple one whose boundary is defined by the sequence of points P0, P1, P2, ..., PN-1, PN = P0.  Then the polygon (even if it is not convex) is the sum of triangles {P0, P1, P2}, {P0, P2, P3}, ..., {P0, PN-2, PN-1}.  Compute its area by adding the signed areas of these component triangles.
This figure (a seven-sided polygon) is defined by the sequence P0, P1, P2, ..., P6, P7 = P0.
Here is the same figure represented as the sum of five signed triangles.  Triangles {P0, P1, P2} (pink diagonal stripes) and {P0, P2, P3} (green diagonal stripes) have negative orientations.  Their signed areas are negative.  Triangles {P0, P3, P4} (cyan), {P0, P4, P5} (light cyan) and {P0, P5, P6} (white) have positive orientations.  Their signed areas are positive.  The original figure is thereby represented as the pentagon {P0, P3, P4, P5, P6} formed by the three solid triangles with the quadrilateral {P0, P3, P2, P1} removed.
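
The triangle decomposition can be sketched in Python as follows.  The function names and the sample polygon are made up for this illustration (the sample is not the seven-sided figure above); the point is that negative triangles cancel the excess of the positive ones, so the method works for non-convex simple polygons.

```python
def signed_triangle_area(p, q, r):
    """Signed area of triangle p, q, r: translate p to (0, 0), apply formula 2."""
    x0, y0 = q[0] - p[0], q[1] - p[1]
    x1, y1 = r[0] - p[0], r[1] - p[1]
    return (x0 * y1 - x1 * y0) / 2.0

def signed_polygon_area(points):
    """Signed area of a simple polygon P0, P1, ..., P(N-1).
    A repeated closing vertex (PN = P0) is accepted and ignored."""
    pts = list(points)
    if len(pts) > 1 and pts[0] == pts[-1]:
        pts = pts[:-1]
    p0 = pts[0]
    # Sum the triangles {P0, P1, P2}, {P0, P2, P3}, ..., {P0, P(N-2), P(N-1)}.
    return sum(signed_triangle_area(p0, pts[i], pts[i + 1])
               for i in range(1, len(pts) - 1))

# A 4 x 4 square with a triangular notch cut into its top edge.
notched = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
print(signed_polygon_area(notched))
```

With the vertices listed counterclockwise the result is positive; listing them clockwise gives the same magnitude with a negative sign.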

Things to watch out for

You have to know what the coordinates of your data sources are.  At the very least, you should know whether they are inches or miles!
When you choose to show multiple themes in a View, ArcView assumes the themes have common coordinate systems.  Therefore you specify the units of measurement and other properties of the coordinates using items in the View menu, not in the Theme menu.
The accuracy of interactive measurements of length is limited.  A high-resolution video monitor rarely is more than 2000 pixels across; 1000 pixels is more typical.  If your view extends across, say, 800 pixels and represents a half mile (2640 feet), then each pixel's extent is larger than three feet.  A measurement between two points therefore cannot be more precise than three feet.
The accuracy of area measurements is even more limited.  Consider two rectangular regions on a view, one of 100 X 100 pixels, the other of 101 X 101 pixels.  Both could be equally good approximations to a rectangular feature; they differ by only one pixel (one percent) in either dimension.  However, the first occupies 10,000 pixels while the second occupies 10,201 pixels: their areas differ by two percent.
For areas, ArcView uses Cartesian calculations only, even when you have told it the coordinates are "decimal degrees"--that is, longitude and latitude.  This means that areas will be in units of "square degrees."  Unfortunately, the actual area depends strongly on the latitude.  Therefore the values returned by ArcView in this case are practically meaningless.
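
To see how strongly the true area of a "square degree" depends on latitude, here is a rough Python sketch.  It assumes a perfectly spherical earth with a radius of 6371 km; both the spherical model and the radius value are assumptions made for this illustration, and the function name is invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth

def cell_area_km2(lat_south_deg, lon_width_deg=1.0, lat_height_deg=1.0):
    """Area on the sphere of a cell bounded by two meridians and two parallels."""
    phi1 = math.radians(lat_south_deg)
    phi2 = math.radians(lat_south_deg + lat_height_deg)
    dlam = math.radians(lon_width_deg)
    # Area of a latitude-longitude cell: R^2 * (width in radians) * (sin phi2 - sin phi1).
    return EARTH_RADIUS_KM ** 2 * dlam * (math.sin(phi2) - math.sin(phi1))

for lat in (0, 30, 60, 85):
    print(lat, round(cell_area_km2(lat)))
```

Under these assumptions a one-degree cell starting at 60 degrees north covers roughly half the ground area of a one-degree cell at the equator, although both measure exactly one "square degree."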

Exercise 11c--setting a map projection

ArcView offers "on-the-fly" projection of features in a View.  Specifically, when the coordinates of all feature themes in a view represent longitude and latitude relative to the same datum, ArcView will first project every feature into a common Cartesian coordinate system--"map space"--if asked to do so.

Evidently the projection is a property of how all themes will be displayed by the View, so you will look in the View menu (View|Properties) for the interface to set projections.

Remember that a View document is a method for displaying data, not for changing them.  Therefore setting or changing a View's projection will not alter the underlying data.

Things to watch out for

ArcView will not do on-the-fly projection of images or grids in a View.  There is good reason for this.  First, projection will distort the regular row-by-row, column-by-column nature of such data.  The resulting curvilinear grid of values has to be "straightened" somehow by resampling the results, a process that can require some careful fine-tuning and which always results in some loss of precision in the data.  Second, even a small image or grid contains a large amount of data.  For example, an image that looks good on a megapixel screen will have a million or more cells.  Each one of these would require reprojection, a process that can take seconds or minutes (depending on the complexity and accuracy of the projection formulas).  This would make every redisplay of a View unacceptably long.
Therefore, for images to appear correctly in a View, their underlying coordinates have to be consistent with the projected coordinates created by the View's projection.  If you have two or more images "in different projections," then you will first have to modify some or all of them to agree with the intended projection.  This implies that you must know what projection (if any) was used to create every image you use.
Projection is a non-linear transformation, so that projections of the straight line segments implied by the internal representations of curves and polygons should produce curves.  However, they only produce straight segments.  Therefore, projection (at least in the manner performed by ArcView) changes shapes.

How projection distorts shapes

The original line segment is on the left.  The green vertices (endpoints) define it.

The projected segment is on the right.  The correctly projected segment is the curved solid line.  However, if the GIS uses only the projected vertices (red) to define the segment, it will draw the dashed blue line instead.

ArcView will actually break the original segment into smaller parts (typically of about one degree in length) to achieve better accuracy during projection.  However, whenever you measure a distance with the measurement tool, ArcView will report the distance measured along a straight line on the map (rather than a straight line on the earth's surface).
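
The effect of projecting only the endpoints, and the remedy of breaking a segment into smaller parts, can be sketched in Python.  The spherical Mercator projection is used here only because its formulas are short; the function names, the unit sphere, and the way the segment is subdivided are assumptions for illustration, not a description of ArcView's internal method.

```python
import math

def mercator(lon_deg, lat_deg, radius=1.0):
    """Spherical Mercator projection of a longitude-latitude pair."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    return (radius * lam, radius * math.log(math.tan(math.pi / 4 + phi / 2)))

def densify(p, q, max_step_deg=1.0):
    """Split the segment p-q (in degrees) into equal parts no longer than
    max_step_deg, measured in longitude-latitude coordinates."""
    n = max(1, math.ceil(math.hypot(q[0] - p[0], q[1] - p[1]) / max_step_deg))
    return [(p[0] + (q[0] - p[0]) * i / n, p[1] + (q[1] - p[1]) * i / n)
            for i in range(n + 1)]

def project_segment(p, q, max_step_deg=1.0):
    """Project a segment by projecting every vertex of its densified version."""
    return [mercator(lon, lat) for lon, lat in densify(p, q, max_step_deg)]

p, q = (0.0, 0.0), (40.0, 60.0)
true_mid = mercator(20.0, 30.0)                  # projection of the midpoint
a, b = mercator(*p), mercator(*q)
chord_mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)  # midpoint of projected endpoints
print(true_mid, chord_mid)
```

The two midpoints disagree noticeably in the y direction, which is the gap between the solid curve and the dashed line in the figure.  The polyline returned by project_segment follows the curve much more closely because each of its short pieces spans only about a degree.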

Laboratory Exercises

  1. Redo Exercise 11a after setting the View to a gnomonic (North polar) projection.  Identify and explain the differences in results.

Suggest appropriate projections for the following kinds of calculations.  Experiment, if necessary, using the data supplied with ArcView or the GTKAV text:

  1. Rank world countries according to size.
  2. Find all hospitals within a 25 mile radius of your present location.
  3. Determine the shortest air routes between major world ports.

Thought questions:

  1. In Exercise 11c, ArcView successfully changed a circular shape into an egg shape when a projection was applied.  How could it do that when it cannot precisely project single line segments?  (Hint: how does it succeed in projecting complex shapes such as outlines of the U.S. states?)
  2. What kind of feature is the red disk in Exercise 11c?  Because it changed when the View's projection changed, what does this imply about the relationship between a View's projection and the graphics in the View?
  3. How does text in a View respond to changes in projections?  Does its angle change?  Its size?  Its position?
  4. In Exercises 11a and b, the [Property] theme's feature table has [Area] and [Perimeter] fields.  What unit of measurement are they expressed in?  What does that suggest about the accuracy of these data?
  5. In Exercise 11a, the View did not have a projection set.  How did this affect the analysis of the shortest water line location?  Does the answer change if you use a projection?  Why or why not?
  6. In Exercise 11b, the View did not have a projection set.  How did this affect the analysis of where to locate a soccer field?  Does the answer change if you use a projection?  If so, what is the correct answer?

End notes

(18 October 2002)

An alert reader writes,

You made a statement at the end of "Things to watch out for" that values returned by ArcView are practically meaningless.  I've often wondered how accurately ArcView measures area.  Does that mean you don't trust any areal/distance measures it makes, whether made with the measuring tool or by updating the area/distance in the table?  What do you use to get these measures?

(Marie Mills)

That statement was made in the context of decimal degree data.  (It has since been modified to clarify that restriction.)  For projected data, ArcView's measurements are accurate, *given the data*.  They are performed using the techniques described earlier on this page and computed with double precision floating point arithmetic (53 binary digits, or about 16 decimal digits).

The potential problems lie with the data: if the data are inaccurate, then clearly the measurements will be. Even when the data were accurately obtained, their projection can introduce distortions. Use equal-area projections for area measurements that need high accuracy. Use projections which locally have small length distortions for distance measurements.

Usually, the fastest way to compute areas and lengths in ArcView is with the Field Calculator. Areas are computed as

[shape].ReturnArea

and lengths (of polylines or perimeters of polygons) are computed as

[shape].ReturnLength

When data are in decimal degrees, they can be projected on-the-fly using the projection in some view. If, for instance, you want to use the projection specified in a view named "View1", the expressions are

[shape].ReturnProjected(av.FindDoc("View1").GetProjection).ReturnArea

and

[shape].ReturnProjected(av.FindDoc("View1").GetProjection).ReturnLength

Note that the view does not necessarily have to be the same view in which the themes are displayed.

For basic information on using the Field Calculator, please refer to Chapter 15b.  A collection of good links to pages on projections appears in an earlier version of this course at http://www.courses.psu.edu/in_sc/in_sc597b_wah5/Notes/01july.htm

This page was last updated 22 March 2004. The sidebar on complex numbers was added.