Chapter 13: Selecting Things

Exercises 13a and b, 14a--selecting features in themes and tables

    The identify tool shows attributes for features within three pixels of the cursor.

It will identify features in all active themes, even when more than one theme is active.
There is a version of this tool in the Table GUI (graphical user interface), too, but it is not terribly useful--you are already looking at the attributes when it is applied.
The features identified are listed on the left-hand side of the "Identify Results" dialog.  Because you cannot export or compare the attributes in this list, its main use is for exploring unfamiliar data.
Identified features are not selected.

Identify is not the same as select.  

About selection

One might divide all questions about data into two types:

  1. Those that ask for a statistical summary of all data in a table
  2. Those that ask about properties of data which satisfy some criterion.

The distinction is a practical one more than it is a conceptual one.  You begin with tables (or themes) of data.  If you need to use only a portion of those data to answer the question, it is an example of the second type, and you answer the question in two steps: first identify and (somehow) separate the relevant data and then (usually) summarize them to obtain the answer.

Many spatial analyses work the same way.  First you select, then you do something with the selection.  Often the process must be repeated through several steps.

For example, suppose you have a map of rainfall in areas of the United States and you want to estimate flood potential in low-lying areas where people live.  You might proceed by first finding "low-lying" areas (you will need to find a specific definition of "low-lying").  From those you would select only areas where people live (basing your criterion on data about population density, for example).  Then you would choose the precipitation records for those areas only and perform your analysis of this subset.
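The select-then-summarize pattern in this example can be sketched in a few lines of Python.  This is only an illustration: the records, the field names, and both thresholds are invented, and a real analysis would need defensible definitions of "low-lying" and "where people live".

```python
# Hypothetical records: every name and number here is invented for illustration.
areas = [
    {"name": "A", "elevation_m": 5,   "pop_density": 900, "rain_mm": 1200},
    {"name": "B", "elevation_m": 450, "pop_density": 300, "rain_mm": 800},
    {"name": "C", "elevation_m": 8,   "pop_density": 0,   "rain_mm": 1500},
    {"name": "D", "elevation_m": 12,  "pop_density": 150, "rain_mm": 1000},
]

LOW_LYING_MAX_M = 20   # assumed definition of "low-lying"
MIN_POP_DENSITY = 1    # assumed definition of "where people live"

# Step 1: select the low-lying areas.
low_lying = [a for a in areas if a["elevation_m"] <= LOW_LYING_MAX_M]

# Step 2: from those, select only the populated areas.
populated = [a for a in low_lying if a["pop_density"] >= MIN_POP_DENSITY]

# Step 3: summarize precipitation for the final subset only.
selected_names = [a["name"] for a in populated]
mean_rain = sum(a["rain_mm"] for a in populated) / len(populated)

print(selected_names, mean_rain)
```

Each step narrows the previous result, which is the same thing ArcView does when you select from an existing selection.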

What a selection is

A selection in ArcView is a state of a table or theme.  It "flags" each record (or feature) as "selected" or "unselected".  Selection is meaningful in ArcView because most operations, such as statistical summaries, data exports, and data conversions, use only the selected records (or features).  The exception is that if no records (or features) are selected, then all of them will be used in an operation.
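As a rough model of this rule (not ArcView's own code), think of a selection as a list of flags kept alongside the records.  The function name and the sample values below are made up for the sketch.

```python
def records_for_operation(records, selected):
    """Return the records an operation would use, following the rule above.

    `selected` holds one True/False flag per record.
    """
    if not any(selected):
        return list(records)      # nothing selected: every record is used
    return [r for r, flag in zip(records, selected) if flag]


def mean(xs):
    return sum(xs) / len(xs)


values = [10, 20, 30, 60]         # hypothetical attribute values

mean_selected = mean(records_for_operation(values, [False, True, True, False]))
mean_all = mean(records_for_operation(values, [False, False, False, False]))

print(mean_selected, mean_all)
```

The flags live apart from the records, so changing the selection changes which records an operation sees without touching the records themselves.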

Normally, ArcView shows selected records and features by coloring them solid yellow.  You can change this color in the Project|Properties dialog.  (A little experimentation will likely show that yellow is an excellent choice.)

A selection does not change the data at all.

Things to watch out for

Two of the most frequently used tools are the Select feature tool and its Table counterpart, Select.  If you inadvertently click the mouse on a View or Table with one of these tools active, you will lose your selection.  Selections are as transient as smoke in the wind and must be treated carefully, especially when you are in the middle of a complex analysis.
Frequently you want to perform an operation on all your data, such as to convert them all to a different format, export them, or compute statistical summaries.  Get in the habit of clearing the selection first so that ArcView does not limit its operation to the selection!  The button for this (common to the View and Table GUIs) is Select none.
Frequently you will know beforehand exactly how many features or records should be selected in response to a query.  Always check that what you expected is what you got.  If a count is off by even the smallest amount (one), that is a sign something is wrong.  Stop your work until you understand exactly why the count is unexpected: probably there is an error in the data or you made an error in selection.
ArcView maintains selections as bitmaps.  Internally these are just sequences of zeros (not selected) or ones (selected) corresponding one-to-one with the internal sequence of shapes in a theme or records in a table.  This is efficient: the storage requirements are one-eighth of one byte per zero/one bit; so, for example, the bitmap for a theme with a million features (pretty big) requires less than 125 kilobytes of RAM.  Bitmaps can be accessed and processed very rapidly.
However, ArcView also saves bitmaps for every selection in a project.  The information becomes part of the ArcView project file (apr file).  Because this is a text file, the bitmap information has to be coded into a less efficient form.  It requires 352 bytes for every 1024 bits, so the selection bitmap for that million-record theme will need about 350 kilobytes.  Still not bad, but sometimes it gets in the way.
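The storage figures in the last two paragraphs are easy to check.  The sketch below assumes the project-file cost scales as a simple proportion of the quoted 352 bytes per 1024 bits; the function names are invented for illustration.

```python
def bitmap_ram_bytes(n_records):
    # One bit per record, eight bits per byte, rounded up to a whole byte.
    return (n_records + 7) // 8


def bitmap_apr_bytes(n_records):
    # The text quotes 352 bytes of project-file text per 1024 bits.
    # Treating that as a simple proportion is an assumption.
    return n_records * 352 / 1024


n = 1_000_000
ram = bitmap_ram_bytes(n)
apr = bitmap_apr_bytes(n)

print(ram, apr, apr / ram)
```

Under that assumption the text encoding costs 2.75 times as much as the in-memory bitmap.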
When you use the query builder button, you are actually creating a statement in ArcView's scripting language, Avenue.  For detailed (and mostly correct) information about Avenue's structure and syntax, search in the ArcView help for "Script (Class)".
You would guess from the preceding observation, then, that you should be able to type any valid Avenue expression right into the Query Builder dialog.  You can, provided it has a logical (true or false) result.  The Query Builder is a much more powerful tool than it appears to be.
Avenue has several "gotchas" that will cause you real heartburn until you figure them out.  Here are some:
Avenue expressions work like a hand-held calculator: they are evaluated in the order you type them.  This means they do not follow the usual precedence rules (multiplication and division performed before addition and subtraction, for example).  Thus, in Avenue, 3 + 5 * 4 results in 32, not 23.
The Avenue parser provides little information about syntax errors.  Stick to simple expressions.  Test more complex ones as scripts (we will learn about those later).
Certain ArcView extensions, especially Spatial Analyst, internally translate Avenue into yet another language.  This internal language is unforgiving: it does not like spaces and special characters in file names, for example.  You can solve many Spatial Analyst problems simply by recognizing this and searching for anything--a field name, a file name, a path name--that could be offending the translator.
Dates and times are especially tricky to work with.  Read about them in the help system by searching under "Date (class)."
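To see what strict left-to-right evaluation does to arithmetic, here is a small Python model of it.  This is not Avenue and does not use Avenue's parser; it only imitates the evaluation order for simple expressions whose tokens are separated by spaces.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def eval_left_to_right(expression):
    """Evaluate tokens such as '3 + 5 * 4' strictly in the order typed."""
    tokens = expression.split()
    result = float(tokens[0])
    # Consume (operator, operand) pairs from left to right.
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, float(tokens[i + 1]))
    return result


left_to_right = eval_left_to_right("3 + 5 * 4")   # computed as (3 + 5) * 4
usual = 3 + 5 * 4                                  # Python applies precedence

print(left_to_right, usual)
```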
The table sort buttons (Sort Ascending and Sort Descending) do not modify the underlying data file.  Remember, a Table document is a way of looking at the underlying data.  When you sort the table, you are simply changing the order in which it presents the same, unchanged records to you.  Similarly, the promote button does not change the underlying data.
A consequence of the preceding observation is that you should not be surprised to see a Table suddenly "unsort" itself or "unpromote" its selection.  This will happen whenever something may cause the underlying data to change, such as a request to edit or "refresh" the table, and ArcView needs to re-read the data: re-opening a table, editing it, and exporting it are among such operations.

Aside: Operations with selections; Boolean algebra

You can see from the exercises in Chapter 13 that ArcView has two distinct ways to establish a selection: by pointing to the selected records and by means of a query, or logical criterion.  This is a well-known duality: any logical condition (a statement that is either definitely true or false when applied to any member of a given set) defines the elements (features or records, if you like) for which it is true.  In the ArcView implementation, a query is an Avenue statement (entered by means of the Query Builder dialog, for example) and its result is a selection bitmap--that is, an explicit representation of the records for which the statement is true.

Conversely, every selection defines the logical condition "is selected" which is true exactly of the selected records.  Yes, this observation is utterly trivial, but it is powerful.  It shows there must be a complete parallel between logical operations on queries and similar operations on bitmaps.

For example, consider the conjunction of two queries: query P and query Q.  Only records satisfying both queries should be selected.  (ArcView implements this with the "Select from set" button in the Query Builder dialog.)  Pick any record and consider how it is flagged in three selection bitmaps: the bitmap for query P, the bitmap for query Q, and the bitmap for query P and Q.  There are only four possibilities, as shown in this table:

    Query P    Query Q    Query P and Q
       0          0             0
       1          0             0
       0          1             0
       1          1             1

This table defines what "and" means.  Because the operation must apply uniformly to every record, you can "and" two selections by operating on their bitmaps bit-by-bit.  For example, suppose query P selects the first, third, and seventh through twelfth elements of a table (or theme) and query Q selects the second through tenth elements.  It is natural to write the corresponding bitmaps as P = 1010 0011 1111 and Q = 0111 1111 1100 (I put a space after every fourth bit to make it more readable).  Then, applying the "and" table (above) bit-by-bit gives 1 and 0 = 0, 0 and 1 = 0, ..., 1 and 0 = 0, for the result P and Q = 0010 0011 1100.
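The bit-by-bit calculation can be reproduced with a short Python sketch.  The helper names are invented, and the bitmaps are held as plain lists of 0s and 1s rather than in whatever packed form ArcView uses internally.

```python
def parse_bitmap(text):
    """Turn '1010 0011 1111' into a list of 0/1 integers (spaces are ignored)."""
    return [int(ch) for ch in text if ch in "01"]


def format_bitmap(bits):
    """Render bits in groups of four, matching the notation in the text."""
    s = "".join(str(b) for b in bits)
    return " ".join(s[i:i + 4] for i in range(0, len(s), 4))


def bitmap_and(p, q):
    # Apply the "and" table to each pair of corresponding bits.
    return [a & b for a, b in zip(p, q)]


P = parse_bitmap("1010 0011 1111")
Q = parse_bitmap("0111 1111 1100")

P_and_Q = format_bitmap(bitmap_and(P, Q))
# Positions (counting from 1) of the records selected by both queries.
selected_positions = [i + 1 for i, bit in enumerate(bitmap_and(P, Q)) if bit]

print(P_and_Q, selected_positions)
```

The selected positions are the third and the seventh through tenth records, which is exactly the overlap of the two queries.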

Notice that the two left columns of the table simply list every possible combination of values for P and Q.  In the right-hand column we could have written any sequence of four zeros and ones for our definition of the "and" operation.  (Fortunately, we wrote the correct sequence!)  There are 2^4 = 16 possible such sequences, so there are 16 such tables and therefore 16 "Boolean" operations on two queries.
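If you want to see all 16 right-hand columns, a few lines of Python will list them.  This is a counting sketch only; the variable names are arbitrary.

```python
from itertools import product

# The four (P, Q) combinations, in the row order used in the table above.
inputs = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Each Boolean operation is one choice of output column: one bit per row.
operations = list(product([0, 1], repeat=len(inputs)))

# The column that defines "and" is one of them.
and_column = tuple(p & q for p, q in inputs)

print(len(operations))
for column in operations:
    print(column, "<- and" if column == and_column else "")
```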

Two of these operations do not depend on P or Q; we can call them "select all" and "select none".  (In ArcView's Table GUI you will find buttons for these operations; they are Select all and Select none, respectively.)

Four of these operations depend on P alone or on Q alone.  They are P, not P, Q, and not Q.  The not operation simply reverses each bit: 0 becomes 1, 1 becomes 0.  ArcView provides a button for not; it is called Switch selection.

That leaves ten operations that depend simultaneously on the values of P and Q.  The next table defines the commonest ones, with and repeated for completeness and the equivalent ArcView Query Builder buttons shown:

    P        Q        P and Q          P or Q      P implies Q  P iff Q  P xor Q
                      (P * Q)                                            (P + Q)
    New Set  New Set  Select From Set  Add To Set
    0        0        0                0           1            1        0
    1        0        0                1           0            0        1
    0        1        0                1           1            0        1
    1        1        1                1           1            1        0
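The columns of this table can be recomputed from Python's own bit operators, which is a handy check against typing errors.  The dictionary keys simply repeat the column headings; nothing here is ArcView-specific.

```python
# The four (P, Q) combinations, in the row order of the table.
rows = [(0, 0), (1, 0), (0, 1), (1, 1)]

columns = {
    "P and Q":     [p & q for p, q in rows],
    "P or Q":      [p | q for p, q in rows],
    "P implies Q": [(1 - p) | q for p, q in rows],
    "P iff Q":     [int(p == q) for p, q in rows],
    "P xor Q":     [p ^ q for p, q in rows],
}

# The arithmetic readings given under the headings: P * Q and P + Q (modulo two).
product_column = [p * q for p, q in rows]
sum_mod_2_column = [(p + q) % 2 for p, q in rows]

for name, bits in columns.items():
    print(f"{name:12s} {bits}")
```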

There are several interesting things to note in this table:

Two of the operations are arithmetic: multiplication and addition modulo two.
Many of the operations correspond to well-known logical operations on queries: conjunction (and), disjunction (or), conditional (implies), equivalence (iff).
Not all common logical operations have Query Builder buttons.
The last two columns are "opposites" of each other: where one has a zero, the other has a one, and conversely.

This last observation is the most useful.  It shows that iff can be expressed in terms of another operation: specifically, P iff Q = not (P xor Q).  Indeed, if you select any one of and, or, or implies then (in conjunction with not) you can define any of the others.  For example, P implies Q = (not P) or Q.  Because of this, ArcView provides sufficient power to implement any logical combination of queries you might want--you just might have to work a little to get the right combination using its capabilities.  We will encounter examples of this in later exercises.
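Identities like these can be checked by brute force, since there are only four combinations of P and Q to try.  The sketch below tests three of them; the helper names are invented.

```python
from itertools import product


def not_(p):
    return 1 - p


def implies(p, q):
    return int(p <= q)      # false only when p = 1 and q = 0


def iff(p, q):
    return int(p == q)


pairs = list(product([0, 1], repeat=2))

# P iff Q = not (P xor Q)
iff_matches = all(iff(p, q) == not_(p ^ q) for p, q in pairs)

# P implies Q = (not P) or Q
implies_matches = all(implies(p, q) == (not_(p) | q) for p, q in pairs)

# De Morgan: "and" rebuilt from "or" and "not" alone.
and_matches = all((p & q) == not_(not_(p) | not_(q)) for p, q in pairs)

# Order matters: P or (not Q) is "Q implies P", a different operation.
swapped_differs = any(implies(p, q) != (p | not_(q)) for p, q in pairs)

print(iff_matches, implies_matches, and_matches, swapped_differs)
```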

Exercise 13c--establishing permanent "selections"

A fundamental idea of database theory is to create different ways to view the same underlying data.  We saw above how selection is a powerful tool for answering questions.  The two methods can be combined: in ArcView, certain kinds of selections can be made a permanent part of a theme.  The effect is to limit the theme's features to exactly the ones selected.

Here are some equivalent ways of viewing what a theme's definition does:

It hides records (not meeting the definition criterion)
It restricts the underlying data to just the records meeting the definition criterion
It automatically pre-selects data, but instead of showing selected features in yellow, it shows the selected features in their normal colors and makes the unselected features invisible.  It will not allow unselected data ever to become selected.
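One way to picture all three views at once is a toy model in which a definition is a filter that sits between the underlying data and everything else.  The class below is an illustration in Python, not ArcView's object model, and the county records are invented.

```python
class Theme:
    """Toy model of a theme with an optional definition (not ArcView's API)."""

    def __init__(self, features):
        self._features = list(features)   # underlying data, never modified
        self.definition = None            # a predicate, or None for no definition

    def visible(self):
        # Features failing the definition are hidden, not deleted.
        if self.definition is None:
            return list(self._features)
        return [f for f in self._features if self.definition(f)]

    def select(self, query):
        # Only features that pass the definition can ever be selected.
        return [f for f in self.visible() if query(f)]


# Hypothetical county records.
counties = [
    {"name": "Alpha", "state": "CA", "pop": 50000},
    {"name": "Beta",  "state": "NV", "pop": 80000},
    {"name": "Gamma", "state": "CA", "pop": 120000},
]

theme = Theme(counties)
theme.definition = lambda f: f["state"] == "CA"

visible_names = [f["name"] for f in theme.visible()]
big_names = [f["name"] for f in theme.select(lambda f: f["pop"] > 60000)]

print(visible_names, big_names)
```

Beta satisfies the population query but is excluded by the definition, so it cannot be selected; clearing the definition would bring it back, because the underlying data were never changed.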

There are many, many ways to use theme definitions.  For example,

View the counties in your state by loading the U.S. counties theme (in C:/ESRI/Esridata).  Set the definition to include only the counties in your state.
View all major highways by loading a theme of all highways.  Set the definition to include only the major ones.
View the atmospheric ozone readings obtained only at altitudes between 10,000 and 15,000 meters by loading a theme of all ozone readings.  Set the definition to include only those at the desired altitudes.

Things to watch out for

When opening a project, ArcView must re-apply all theme definitions to the underlying data.  This can take some time.
The Query Builder dialog in the Theme Properties dialog box will normally show all distinct instances of any field when the "Update Values" box is checked.  However, with a definition set, the only instances shown will be those meeting the definition.  (This can be confusing until you realize what is going on.)
If some records are selected when you set a definition, ArcView may (erroneously) report that there are more records selected than exist in the feature table!  This is because setting the definition does not immediately change the selection bitmap.
The power of a GIS comes in part from its built-in ability to select records according to spatial criteria (such as adjacency, proximity, intersection, or containment).  Unfortunately, the ArcView interface does not allow you to use such criteria for setting theme definitions.  (Using some of the capabilities of Avenue, certain kinds of spatial criteria can be used--but the Query Builder does not help you exploit those capabilities.)

Laboratory Exercises

  1. Determine whether using the Identify tool changes the selection.  (What does it do when there is a selection already?)
  2. What is the difference between these two queries: [Zipcode] = "92374" and [Zipcode] = 92374?  In Exercise 13a (using the [Tract] theme) which one will work?
  3. In Exercise 13a, how many records will be selected for the query "[Zipcode]" = "92374"?  How about the query "Zipcode" = "92374"?  Think about these before you try them out.
  4. In Exercise 13a, select all the tracts on Ohio Street by typing this expression directly into the Query Builder text box: [Address] = "*Ohio*".AsPattern  What are the asterisks for in this expression?  Is ArcView's pattern matching case sensitive or not?  (For more information look up "String (class)" in the ArcView help.)  Check your answer by looking at the selected tracts in the View.
  5. In Exercise 13a, the [Tract] theme appears to be a subset of the [Parcels] theme.  But it is not.  How can you tell?  Why didn't the GTKAV authors simply use a definition to restrict the [Parcels] features to the [Tract] theme?
  6. After you set a theme's definition restricting the features that appear, how does the Zoom to active theme(s) button work?  Does it zoom to the theme's original extent or does it zoom just to the extent of the features that meet the definition?
  7. Does the Query Builder button select all records meeting the query criterion regardless of the theme's definition, or does it only select records meeting the definition?
  8. What happens to the legend after you set a theme's definition?  Does it change to reflect the possibly restricted ranges of attribute values that result from the definition?