Introduction to GIS
Presented by William Huber, Ph.D.
Quantitative Decisions, Merion, PA and
Penn State-Great Valley, Malvern, PA
Outline
What GIS is and is not.
Key GIS components and techniques.
How GIS works.
Examples of GIS analyses.
Examples of GIS applications.
Geographic Information Systems
Helping With Global Problems
The major challenges we face in the world today--overpopulation, pollution, deforestation, natural disasters--have a critical geographic dimension.
Helping With Local Problems
Whether investigating an industrial facility or figuring out the best route for an emergency vehicle, local problems also have a geographic component.
What is a GIS?
A geographic information system (GIS) is a computer-based tool for mapping and analyzing things that exist, and events that happen, on Earth:
It is a general purpose tool; that is, it can be applied in many ways to many problems, including many not anticipated by the GIS designers.
It must be able to select desired data from among potentially very large stores of information.
It must be able to pre-process those data into a format suitable for mapping or analysis.
It must be able to post-process results into graphics, tables, reports, and maps.
What Makes GIS Distinctive
GIS technology integrates common database operations such as query and statistical analysis with the unique visualization and geographic analysis benefits offered by maps.
GIS readily converts data between different data models (unlike most database and statistical software).
These abilities distinguish GIS from other information systems and make it valuable to a wide range of public and private enterprises for explaining events, predicting outcomes, and planning strategies.
What is a GIS—Core Ingredients
mapping
geographic data storage, retrieval, and conversion
database management
statistical analysis
visualization
geographic analysis
The Case for GIS
GIS delivers useful map-making and analytical capabilities to groups of users, including remote users connected over the Internet.
Map making and geographic analysis are not new, but a GIS performs these tasks better and faster than do the old manual methods.
GIS is optimized to perform certain kinds of data analyses involving distance, area, direction, and so on, better than other computer software.
GIS on the Internet
Aerial photography
Soils
Water
Digital elevation models (DEM, DTED)
Government agencies
U.S. states
General GIS information
Geodesy
Instructional materials and journals
Interactive mapping
Metadata and standards
Transportation GIS
Organizations
Software
Remote sensing
Cartography
Who Uses GIS?
(Before GIS technology, only a few people had the skills necessary to use geographic information to help with decision making and problem solving.)
Today, GIS is a multi-billion-dollar industry employing hundreds of thousands of people worldwide.
GIS is taught in schools, colleges, and universities throughout the world.
Professionals in many fields are increasingly aware of the advantages of thinking and working geographically.
The number of Internet users of GIS is growing rapidly.
Categories of GIS Users
Business Support
Spatial information augments the management of traditional business practices: customer management, marketing, logistical planning, retail site selection, and so on.
Personal Productivity
A majority of software users indicate they would like to “communicate with maps.”
GIS Professionals
These people acquire, create, edit, and integrate spatial data, develop the systems used in business and modeling applications, and, increasingly, develop Internet applications to enhance the accessibility of geographic data to occasional or casual users.
Components of a GIS
A working GIS integrates these key components:
hardware
software
data
people
methods
Hardware
Hardware is the computer on which a GIS operates, including the resources available to the computer:
printers
plotters
digitizers
scanners
monitors
network
wide area communications
Today, GIS software runs on a wide range of hardware types, from centralized computer servers to desktop computers used in stand-alone or networked configurations.
Software
GIS software provides the functions and tools needed to
store
query
display
analyze
create
modify
data.
Software (2)
Key software components are
tools for the input, manipulation, reformatting, and output of geographic data
a database management system (DBMS)
tools for geographic query, analysis, and visualization
a graphical user interface (GUI) for easy access to tools
tools to document data sources and quality (metadata)
Data
Possibly the most important component of a GIS is the data.  Geographic data and related tabular data can be collected in-house, found on the Internet for free, or purchased from a commercial data provider.  A GIS will integrate spatial data with other data resources and can even use a DBMS, used by most organizations to organize and maintain their data, to manage spatial data.  (Many GISes are moving toward the use of standard DBMSes, such as Oracle, for core database management functions.)
Data (2)
New data can also be entered into a GIS in many different ways, including:
Digitizing from a digitizer
Heads-up (on-screen) digitizing
GPS
Spatial “events”
Surveys, via COGO (coordinate geometry)
Scanned images
Acquisition from remote sensing instrumentation
People
GIS technology is of limited value without the people who manage the system and develop plans for applying it to real world problems.
GIS is a general purpose tool.  What makes it work at all is the “application domain” knowledge of the system designers and operators who actually apply this tool.  To use GIS effectively in a navigation application requires specialized knowledge of navigation principles, for example.
Methods
A successful GIS operates according to a well-designed plan and business rules (or scientific rules), which are the models and operating practices unique to each organization.
These models will determine database designs, the formats in which geographical data are stored, and the specialized software used for analysis.
How GIS Works
A GIS stores information about the world as a collection of thematic layers that can be linked together by geography. This simple but powerful and versatile concept has proven invaluable for solving many real-world problems from tracking delivery vehicles, to recording details of planning applications, to modeling global atmospheric circulation.
Conceptual Model of GIS
GIS Conceptual Data Formats
Object models
“Vector” data
Zero dimensions:
Points
Multipoints
One dimension:
Line segments
Polylines
Splines
Two dimensions:
Polygons
“Raster” data
Indicator grids
Categorical grids
GIS Data Formats (2)
Continuous field models
“Vector” data:
Irregularly spaced sample points
Contours
Polygons
Triangular [interpolation] net, or TIN
“Raster” data:
Regularly spaced sample points
Cell grid (numeric values)
Network models
Planar embedding of a one-dimensional graph plus:
Point events
Segment events
Address models
Rule base for addressing a street network
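As a rough illustration of the object-model formats listed above, the following sketch stores vector features as coordinate lists and a categorical raster as a grid of cell codes. All names, coordinates, and category codes are hypothetical; a real GIS adds coordinate systems, attributes, and indexing on top of these basic shapes.

```python
import math

# Vector data: features are built from (x, y) coordinate pairs.
point = (3.0, 4.0)                                           # zero dimensions
polyline = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]              # one dimension
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # two dimensions (ring, implicitly closed)

# Raster data: a categorical grid stores one code per cell.
land_use = [
    ["forest", "forest", "water"],
    ["forest", "urban", "water"],
]


def polyline_length(vertices):
    """Sum of the straight-line lengths of consecutive segments."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a simple polygon by the shoelace formula."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def count_cells(grid, code):
    """Number of raster cells carrying the given category code."""
    return sum(row.count(code) for row in grid)


length = polyline_length(polyline)
area = polygon_area(polygon)
forest_cells = count_cells(land_use, "forest")
```

The same question ("how much forest is there?") is answered by measuring polygon area in the vector form and by counting cells in the raster form, which is one reason conversion between formats matters.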
GIS Tasks
General purpose GISes essentially perform five processes or tasks.
Input
Manipulation
Management
Query and Analysis
Visualization
Input
Before geographic data can be used in a GIS, the data must be converted into a suitable digital format. The process of converting data from paper maps into computer files is called digitizing.
Modern GIS technology has the capability to automate this process fully for large projects using scanning technology; smaller jobs may require some manual digitizing (using a digitizing table).
Today many types of geographic data already exist in GIS-compatible formats. These data can be obtained from data suppliers and loaded directly into a GIS.
Data can be obtained directly from commercial or government sensors on satellites or aircraft.
GPS and sensor data can be combined to create point or polyline data sets.
Challenges for Data Input
A myriad of external formats exist.  Data loss in transformation from one format to another must be assessed, documented, and minimized.
Conversion between certain formats, such as between vector and raster representations, inherently creates a loss of information.
Different formats support different levels of precision and resolution.
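The loss of information in a vector-to-raster conversion can be seen in a small sketch. The function names, cell size, and point coordinates below are assumptions chosen for illustration, not part of any particular GIS.

```python
def rasterize_points(points, cell_size, n_cols, n_rows):
    """Count how many points fall in each cell of a grid anchored at (0, 0)."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid


def cell_centers(grid, cell_size):
    """Convert back to vector form: one point at the center of each occupied cell."""
    return [
        ((col + 0.5) * cell_size, (row + 0.5) * cell_size)
        for row, cells in enumerate(grid)
        for col, count in enumerate(cells)
        if count > 0
    ]


# Three hypothetical sample locations; the first two are close together.
points = [(1.0, 1.0), (2.5, 3.0), (12.0, 7.0)]
grid = rasterize_points(points, cell_size=10.0, n_cols=2, n_rows=1)
recovered = cell_centers(grid, cell_size=10.0)
```

After the round trip the two nearby points have merged into a single cell center and every coordinate has been snapped to the grid, so the original positions cannot be recovered. A smaller cell size reduces the loss but does not remove it.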
Manipulation
It is likely that data types required for a particular GIS project will need to be transformed or manipulated in some way to make them compatible with your system.
For example, geographic information is available at different scales (street centerline files might be available at a scale of 1:100,000, census boundaries at 1:50,000, postal codes at 1:10,000, and surveyed points at 1:500).  Before this information can be integrated, it must be referenced to the same datum, projected in a consistent manner, and transformed to the same scale.  This could be a temporary transformation for display purposes or a permanent one required for analysis.
GIS technology offers many tools for manipulating spatial data and for weeding out unnecessary data.  How and how well the software implements these tools is a primary determinant of its ease of use and its cost.
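To see why the scales in the example above matter, it helps to convert a distance on the map into a distance on the ground. The sketch below assumes a drawn line width of 0.5 mm, which is an illustrative figure and not one taken from these notes.

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance in meters represented by a map distance at scale 1:scale_denominator."""
    return map_distance_mm * scale_denominator / 1000.0


# Scales from the example above, keyed by the kind of data.
scales = {
    "street centerlines": 100_000,
    "census boundaries": 50_000,
    "postal codes": 10_000,
    "surveyed points": 500,
}

# Ground width covered by a hypothetical 0.5 mm line at each scale.
line_width_on_ground = {
    name: ground_distance_m(0.5, denominator) for name, denominator in scales.items()
}
```

Under this assumption a line on the 1:100,000 source covers 50 m on the ground, while the same line on the 1:500 survey covers 0.25 m. Transforming both to a common scale makes them display together but does not make the coarser source any more accurate.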
Challenges for Manipulation
Changing datums (for example, from NAD 27 to NAD 83) can be difficult or cumbersome.
Projections are usually implemented using power series approximations that might be inadequate for high precision work.
Manipulating raster data sets usually requires resampling and interpolation, which can introduce errors and cause a loss of resolution.
Manipulation of linear features, including polygon boundaries, often creates nonlinear objects that have to be approximated with linear representations.
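The resampling problem for raster data can be sketched with nearest-neighbor resampling, one of several possible interpolation schemes. The grid values and function name are hypothetical.

```python
def resample_nearest(grid, new_rows, new_cols):
    """Resample a grid by taking, for each new cell, the old cell under its center."""
    old_rows, old_cols = len(grid), len(grid[0])
    result = []
    for r in range(new_rows):
        src_r = min(old_rows - 1, int((r + 0.5) * old_rows / new_rows))
        row = []
        for c in range(new_cols):
            src_c = min(old_cols - 1, int((c + 0.5) * old_cols / new_cols))
            row.append(grid[src_r][src_c])
        result.append(row)
    return result


# Hypothetical 4 x 4 elevation grid.
elevation = [
    [10, 12, 14, 16],
    [11, 13, 15, 17],
    [20, 22, 24, 26],
    [21, 23, 25, 27],
]

coarse = resample_nearest(elevation, 2, 2)    # halve the resolution
restored = resample_nearest(coarse, 4, 4)     # try to go back to 4 x 4
```

Going back to the original cell size does not bring back the original values: each 2 x 2 block of the restored grid repeats a single number. Smoother interpolation methods change which errors appear, but they cannot recover detail that was discarded.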
Management
For small GIS projects it may be sufficient to store geographic information as simple files.
Nowadays many GIS projects, even small ones, may access data scattered throughout a network.  Such data need special management.
It is best to use a database management system (DBMS) to help store, organize, and manage data.  (A DBMS is nothing more than computer software for managing a database--an integrated collection of data.)
Managing changes is particularly difficult.  Changes occur from resolving mismatches among data sources and from updates caused by change over time.  Present-day (early 2000) commercial GISes have few or no tools to support this.
Aside—Relational DBMSs
Data are stored conceptually as a collection of tables.
Common fields in different tables are used to link them together.
This simple design has been widely used because of its flexibility and wide deployment in many applications.
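A minimal sketch of the relational idea, using Python's built-in sqlite3 module: two tables share a common field and are linked through it in a query. The table names, field names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables that share the common field parcel_id.
conn.execute("CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, zoning TEXT)")
conn.execute("CREATE TABLE assessments (parcel_id INTEGER, assessed_value INTEGER)")

conn.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [(1, "industrial"), (2, "residential"), (3, "industrial")],
)
conn.executemany(
    "INSERT INTO assessments VALUES (?, ?)",
    [(1, 500000), (2, 200000), (3, 750000)],
)

# Link the tables on the common field and filter on an attribute.
rows = conn.execute(
    """
    SELECT p.parcel_id, a.assessed_value
    FROM parcels AS p
    JOIN assessments AS a ON p.parcel_id = a.parcel_id
    WHERE p.zoning = 'industrial'
    ORDER BY p.parcel_id
    """
).fetchall()
conn.close()
```

In a GIS the same linking mechanism typically connects a feature's geometry to its attribute records through a shared identifier.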
Challenges for Data Management
Many systems still physically separate the geographic from the other (“attribute”) data, which can create difficulties.
Most systems do not support managing information about the data (the metadata).  Such support is crucial for overcoming the problems with data input and manipulation.
There is no established methodology for managing data change.
Mechanisms for assessing, storing, and visualizing data accuracy are only in the research stage.
Query and Analysis
Once you have a functioning GIS containing your geographic information, you can begin to ask simple questions such as
What proportion of prime agricultural land is presently in use?
How far is it between a contaminant source and a potentially exposed individual?
Where is land zoned for industrial use?
And analytical questions such as
Can the projected growth in infrastructure support the predicted population increase within this area?
What is the dominant soil type for oak forest?
If I build a new highway here, how will traffic be affected?
How will potential changes in weather and climate affect the rate of snow melt?
Challenges for Query and Analysis
No GIS can hope to support all the analytical tools and models that are needed.  Every GIS needs mechanisms for incorporating user-written code and interfacing with existing models.
Languages for analysis of specific kinds of geographic data exist (such as for raster data) but are not standardized the way SQL, for example, has been for database manipulation and query.
Many purely geographic analyses can be solved in several ways, often depending on how the data are represented.  Lack of awareness of the trade-offs in computing resources (execution time, RAM) causes many analyses never to be done or to tie up computing workstations for weeks or months.
A Core Benefit of GIS
A modern GIS provides both simple point-and-click query capabilities and sophisticated analysis tools to provide timely information to managers and analysts alike.  GIS technology really comes into its own when used to analyze geographic data to look for patterns and trends, and to undertake "what if" scenarios. Modern GISes have many powerful analytical tools, but these are especially important:
Analysis of
Proximity
Adjacency
Containment
Overlay analysis
Evaluating connectedness (finding paths)
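The last item, evaluating connectedness, amounts to searching a network for a path. The sketch below uses breadth-first search over a small hypothetical road network; it finds a route with the fewest segments, not necessarily the shortest distance or travel time, which would need weighted edges and a different algorithm.

```python
from collections import deque


def find_path(network, start, goal):
    """Breadth-first search: return a path with the fewest segments, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in network.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical junctions and the junctions each one connects to.
roads = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
    "F": [],  # isolated junction, not connected to the rest
}

route = find_path(roads, "A", "E")
no_route = find_path(roads, "A", "F")
```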
Proximity Analysis
Typical questions:
How many low income households lie within two miles of this proposed site?
What is the total number of soil samples within 50 feet of this pipeline?
What proportion of the alfalfa crop is within 500 m of the well?
How many people live within a twenty minute ride from downtown?
To answer such questions, GIS technology often uses a process called buffering to determine the proximity relationship between features.
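A buffer query of this kind can be sketched without constructing the buffer polygon at all, by testing each feature's distance to the line. The example assumes planar coordinates in feet; the pipeline vertices and sample locations are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(p, polyline, radius):
    """True if p lies within radius of any segment of the polyline."""
    return any(
        distance_to_segment(p, a, b) <= radius
        for a, b in zip(polyline, polyline[1:])
    )


pipeline = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
soil_samples = [(50.0, 30.0), (40.0, 80.0), (120.0, 50.0), (160.0, 50.0)]

nearby = [s for s in soil_samples if within_buffer(s, pipeline, 50.0)]
```

This answers "how many soil samples lie within 50 feet of the pipeline" by counting the entries in nearby. A GIS would normally use a spatial index to avoid testing every sample against every segment.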
Buffering Lines
Other Forms of Proximity Analysis
Adjacency Analysis
Typical questions:
Which developed regions lie on a fault line?
Which properties lie on or next to a flood plain?
Which tracts have direct access to a highway?  To a lake?
Which species have habitats in contact with a protected ecological region?
Performing every possible comparison is time-consuming.  A good GIS creates internal data structures (“topology”) for finding answers rapidly.
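One way to picture such a structure is an index from each boundary edge to the polygons that use it, built in a single pass instead of comparing every pair of polygons. The sketch assumes that shared boundaries are stored with identical vertex pairs, which is a simplification; the tract names and coordinates are hypothetical.

```python
from collections import defaultdict


def build_adjacency(polygons):
    """Map each polygon name to the set of polygons sharing an edge with it."""
    edge_owners = defaultdict(set)
    for name, ring in polygons.items():
        n = len(ring)
        for i in range(n):
            # An undirected edge: the same key regardless of drawing direction.
            edge = frozenset((ring[i], ring[(i + 1) % n]))
            edge_owners[edge].add(name)

    neighbors = defaultdict(set)
    for owners in edge_owners.values():
        for a in owners:
            for b in owners:
                if a != b:
                    neighbors[a].add(b)
    return dict(neighbors)


tracts = {
    "tract_1": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "tract_2": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "tract_3": [(3, 0), (4, 0), (4, 1), (3, 1)],
}

adjacency = build_adjacency(tracts)
```

Once the index exists, a question such as "which tracts touch tract_1?" becomes a lookup and needs no geometric comparison.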
Adjacency Analysis (2)
Containment Analysis
Typical questions:
Which earthquake zones are located on land masses?
Which crimes occurred within the Fifth District?
Which roads lie entirely within the local jurisdiction?
Which habitats do not lie completely within protected areas?
Clearly there are close relationships among questions of proximity, adjacency, and containment.  Often two or more of these techniques are suitable for answering a question.
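A basic containment test for a point and a polygon can be sketched with the ray-casting (even-odd) rule. This is one common technique, not necessarily what any given GIS uses, and it leaves points exactly on the boundary undefined. The district outline and incident locations are hypothetical.

```python
def point_in_polygon(point, ring):
    """Even-odd rule: count boundary crossings of a ray running in the +x direction."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


district = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
incidents = {
    "incident_a": (2.0, 3.0),
    "incident_b": (12.0, 5.0),
    "incident_c": (9.5, 9.5),
}

in_district = sorted(
    name for name, location in incidents.items()
    if point_in_polygon(location, district)
)
```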
Overlay Analysis
At its simplest, overlay is a visual operation, but many analytical operations require one or more data layers to be joined physically to show all distinct combinations of attributes. This overlay, or spatial join, can integrate data on soils, slope, and vegetation, or land ownership with tax assessment.
Typical questions:
Identify all portions of all properties with greater than 15% slope.  (Layers are properties and slopes.)
Show regions where land use changed between 1990 and 2000.  (Layers are land use 1990 and land use 2000.)
Identify the portions of a market service area with population density greater than 50,000 people per square mile.  (Layers are service areas and population density.)
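For raster layers on the same grid, overlay reduces to pairing up cells. The sketch below works through the land use change question with two small hypothetical grids; vector overlay requires intersecting polygon boundaries and is considerably more involved.

```python
from collections import Counter

# Two hypothetical categorical layers covering the same cells.
land_use_1990 = [
    ["farm", "farm", "forest"],
    ["farm", "forest", "forest"],
]
land_use_2000 = [
    ["urban", "farm", "forest"],
    ["urban", "urban", "forest"],
]


def overlay(layer_a, layer_b):
    """Pair up the values of two grids, cell by cell."""
    return [
        [(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


combined = overlay(land_use_1990, land_use_2000)

# All distinct combinations of attributes, with the number of cells in each.
combination_counts = Counter(cell for row in combined for cell in row)

# Cells where land use changed between the two dates.
changed = [[before != after for before, after in row] for row in combined]
```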
Overlay Analysis Example
Environmental GIS
Modeling deposition of pollutants from air emissions sources in the Netherlands
Managing water supply in a Morocco river valley
Studying corn production in central Africa
Biodiversity assessment in Pennsylvania
Ecosystems analysis in Madagascar
Assessing the value of environmentally impaired real estate in New Jersey
Tracking multi-phase contaminants in Silicon Valley, California
Restoring natural habitat at Savannah River, Georgia
Transportation GIS
The integration and analysis of highway crash data in a GIS project
Intelligent crash location
Multimodal investment analysis
Traffic planning tools
Route selection and evaluation
Business GIS
Intelligent routing and logistics
Consumer information
Retail store site selection
Streamlining business mergers
Customer market analysis
Demographic analysis
Tracking Time and Motion
Vehicle tracking
Emergency management
E-911 monitoring and dispatch
Delivery tracking
Wildlife tracking
Precision agriculture
Military asset management
Conclusions
GIS is rapidly becoming a key technology to support decision making at all scales.
The near future will continue to see accelerating growth in data availability and computing power to support GIS.
The strategic decision to make now is not whether, but when and how, to use GIS to support decisions.