Fixing A Corrupted Shapefile

Many of you have reported occasional errors in ArcView and GTKAV associated with Exercise 11a.  I have hunted down and fixed the problem.  This page communicates the solution and describes--because it demonstrates a useful general technique--how the solution was found.  If you simply want the solution and do not care to read how it was done, just skip to the last paragraph.

Identifying the problem

I opened Exercise 11a using ArcView.  Then, as reported (many thanks to those of you who provided details), I pushed the "Projection" button in the View|Properties dialog.  This produced an error message stating "Error in reading shape record length for record 156."

Evidently the problem is with a shapefile that has at least 156 records.  I opened the feature tables for all the themes in the view.  [Attributes of Buildings] has exactly 156 records (that's suspicious!) and [Attributes of Waterlines] has 283.  So we're down to two possibilities.

To narrow it down, I simply cut the [Waterlines] theme from the view and pasted it into a new view for safekeeping.  Going back to the original view ("View1"), I again tried the "Projection" button.  Same problem.  Now I was sure that the [Buildings] theme was the source of the problem.  (This is not to say [Waterlines] is trouble-free--it could have problems too--but I knew I had something to go on.)

When you first open a table, ArcView displays its records in physical order.  So my guess was that the very last [Buildings] record had a problem.  Opening its feature table again (I had closed it earlier) showed that the last record, number 156, held nothing but zero and blank values.  I selected it and went back to the view, hoping to see which feature it represented, and pushed the "Zoom to selected" button.  Bingo!  Same error message.
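The same suspicion can be checked outside ArcView.  The following is a minimal Python sketch, not part of the procedure I actually followed; the function names are my own.  It assumes the standard layouts: a .dbf header that stores its record count as a little-endian 32-bit integer at byte offset 4, and a .shx index made of a 100-byte header followed by one 8-byte entry per shape.

```python
import struct
from pathlib import Path


def dbf_record_count(dbf_path):
    """Record count declared in the .dbf header (bytes 4-7, little-endian)."""
    with open(dbf_path, "rb") as f:
        header = f.read(32)
    if len(header) < 32:
        raise ValueError("truncated .dbf header")
    return struct.unpack_from("<I", header, 4)[0]


def shx_record_count(shx_path):
    """Shape count implied by the .shx size: 100-byte header + 8 bytes per shape."""
    size = Path(shx_path).stat().st_size
    if size < 100 or (size - 100) % 8 != 0:
        raise ValueError("unexpected .shx file size")
    return (size - 100) // 8


def count_mismatch(base):
    """Return (shape_count, attribute_count) for base + '.shx' and base + '.dbf'."""
    base = str(base)
    return shx_record_count(base + ".shx"), dbf_record_count(base + ".dbf")
```

Calling `count_mismatch("GTKAV/Data/Ch11/bldgs2")` returns the two counts.  If the table really carries one record too many, the numbers will differ by one.  I have not run this against the exercise data, so treat it as a way to test the hypothesis rather than as a confirmed result.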

First attempt at a fix

Before fixing this problem, which would require editing the shapefile, I wanted to save a copy of the original.  The Theme|Properties dialog showed me that the file was GTKAV/Data/Ch11/bldgs2.shp.  Remembering that "shapefiles" are really multiple files, I used Windows Explorer to find and make copies of GTKAV/Data/Ch11/bldgs2.*.  There are three files; interestingly, the .dbf file has a different modification time than the other two--further evidence that somebody at ESRI corrupted the shapefile by separately accessing the .dbf file or by mixing up parts of the shapefile.  (This is the risk a software designer takes on when electing to store data in multiple related files.)
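If you would rather script the backup than use Windows Explorer, here is a minimal Python sketch.  The function name and the backup folder are my own choices, not anything ArcView provides.  It copies every file that shares the shapefile's base name, and it uses `shutil.copy2` so the copies keep the original modification times, which were part of the evidence here.

```python
import shutil
from pathlib import Path


def backup_shapefile(base, backup_dir):
    """Copy every file named base.* into backup_dir; return the copied file names."""
    base = Path(base)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # A "shapefile" is several files with one base name (.shp, .shx, .dbf, ...).
    for part in sorted(base.parent.glob(base.name + ".*")):
        if part.is_file():
            shutil.copy2(part, backup_dir / part.name)  # copy2 preserves timestamps
            copied.append(part.name)
    return copied
```

For example, `backup_shapefile("GTKAV/Data/Ch11/bldgs2", "GTKAV/Data/Ch11/backup")` would copy the three bldgs2 files into a hypothetical `backup` folder.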

Back in ArcView, I started editing the [Attributes of Buildings] table.  The bad record was already selected, so I just went to Edit|Delete records.  This produced a serious error message--an "assertion."

Sidebar for people interested in software engineering issues

Good programmers use "assertions" as logical checkpoints in programs during program development.  Conditions that *must* be true for correct program execution, and are *assumed* to be true by the programmer, are explicitly tested while the code is running.  If the tests fail, the program is stopped so the programmer can fix the problem.

It is standard software engineering practice to disable asserts in release code, for obvious reasons.  It is also considered very poor practice to let asserts do routine error checking.  Their purpose is different--coercing asserts into this role is a little like leaving the scaffolding around a building because you have discovered during construction that it is needed to hold the building up!  Of course you don't do such a thing: you re-engineer the building to stand on its own.  And so it is with software.  Usually.

Unfortunately, ArcView depends heavily on its assertions for error checking and its programmers have left them in the release version.  If this software did not have some unusual and powerful features, and if it were not generally stable and resistant to most abuse (at least on Win NT systems), then the presence of these assertions alone would make me reject this software outright for use on any project.  As it is, the years have taught me that the building will stand, so I put up with the ugly scaffolding around it.

For more on assertions and their use in writing computer programs, see Steve Maguire, "Writing Solid Code" (Microsoft Press, Redmond, WA, 1993), chapter 2.  Here's a quotation: "Use [assertions] to catch illegal conditions that should never arise.  Don't confuse such conditions with error conditions, which you must handle in the final product."
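To make the distinction concrete, here is a small Python sketch of my own; it has nothing to do with ArcView's actual source code, and both function names are hypothetical.  A damaged file is an error condition that can arise in normal use, so it gets an explicit check that survives into the release.  A negative record index can only come from a bug in the calling code, so it gets an assertion.

```python
import struct


def read_record_length(header: bytes) -> int:
    """Record length from a .dbf header (bytes 10-11, little-endian)."""
    # Error condition: corrupted files happen in real use, so handle them
    # explicitly rather than with an assertion.
    if len(header) < 12:
        raise ValueError("truncated .dbf header")
    record_length = struct.unpack_from("<H", header, 10)[0]
    if record_length == 0:
        raise ValueError("header declares a zero record length")
    return record_length


def record_offset(header_length: int, record_length: int, index: int) -> int:
    """Byte offset of record number `index` (counting from 0)."""
    # Programmer assumption: no caller ever passes a negative index.
    # If one does, that is a bug to fix during development.
    assert index >= 0, "caller bug: negative record index"
    return header_length + index * record_length
```

Python follows the practice described above: running a program with `python -O` strips the `assert` statements, while the `ValueError` checks remain.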

Second attempt

The assertion message is not very useful: it is essentially a cryptic version of the earlier error.  So what to do now?

The answer is to use the same technique that likely got us into this situation in the first place: edit the attribute table all by itself--as a table, not as the attributes of a shapefile.  Normally, this is dangerous, but here we're fixing a problem and we've already backed up the data.

Here's where ArcView really shines: it's powerful enough to fix this problem on the spot, no tricks necessary.  I went to the Project window and added the GTKAV/Data/Ch11/bldgs2.dbf file as a Table to the project.  Opening it, I selected the last record (number 156, full of zeros and blanks), started editing, and deleted the record.  No problem.  I stopped editing and asked ArcView to save the edits.  It did.  I closed the table, returned to the view, and immediately opened the [Buildings] feature table.  Problem gone!  Only 155 records.
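I made this edit entirely inside ArcView.  For anyone curious about what "delete the last record" amounts to at the file level, here is a minimal Python sketch of an alternative; it is an illustration, not the method I used.  It assumes the usual dBASE III header layout (record count at bytes 4-7, header length at bytes 8-9, record length at bytes 10-11, all little-endian) and a single 0x1A end-of-file byte after the last record.  The function name is hypothetical.  Run something like this only on a file you have already backed up.

```python
import struct


def drop_last_dbf_record(dbf_path):
    """Remove the final record of a .dbf file in place; return the new record count."""
    with open(dbf_path, "r+b") as f:
        header = f.read(12)
        if len(header) < 12:
            raise ValueError("truncated .dbf header")
        count, header_len, record_len = struct.unpack_from("<IHH", header, 4)
        if count == 0:
            raise ValueError("table has no records to delete")
        f.seek(0, 2)  # jump to end of file to learn its size
        if f.tell() < header_len + count * record_len:
            raise ValueError("file is shorter than its header claims")
        new_count = count - 1
        # Update the record count stored in the header.
        f.seek(4)
        f.write(struct.pack("<I", new_count))
        # Cut the file just after the last surviving record and
        # restore the end-of-file marker.
        end = header_len + new_count * record_len
        f.seek(end)
        f.write(b"\x1a")
        f.truncate(end + 1)
    return new_count
```

The sketch only handles the case met here, where the bad record is physically last.  Removing a record from the middle of the table would mean rewriting every record that follows it.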

Let's check: back in the view, I set the projection to Georgia State Plane West (1983).  It works just fine.  (And no, the horizontal soccer field simply does not fit into the space.)

The thick orange rectangle surrounds a rectangle of 110 by 73 meters.  It cannot possibly fit into any of the empty (green) space.

To clean up (an unnecessary step here, but worth doing when the project is one of your own), I moved the [Waterlines] theme back into the original view.  No problem there (remember, I wasn't sure that one was problem-free, either).  Then, to avoid future difficulties, I deleted the bldgs2.dbf table from the project (just select it in the Project window and press the Delete key).  We don't want to create the same error at a later date by editing the bldgs2.dbf table all by itself: we always want to edit it as *part of* a theme.  Finally, I deleted the temporary view created to hold [Waterlines] earlier.  The project was back to normal and working fine.

I recommend following this procedure yourself--it's a good test of what you have learned to date.  But, knowing the value of your time (and knowing I have already taken several more minutes than you would like with this message), I have linked the modified bldgs2.dbf file to this page.  If you copy it over GTKAV/Data/Ch11/bldgs2.dbf, the problem should be fixed on your computer, too.

(28 March 2000)