Shapefile - Overview

A shapefile is a digital vector format for storing geometric locations and their associated attribute information. The format cannot store topological information. It was introduced with ArcView GIS version 2 in the early 1990s, and shapefiles can now be read and written by a variety of free and commercial programs.

Shapefiles are simple because they store only primitive geometric data types: points, lines, and polygons. These shapes are of limited use without attributes that specify what they represent, so a table of records stores the properties (attributes) of each shape in the shapefile. Combining shapes with their attributes allows a wide range of geographic data to be represented and supports computation on that data.

Although the term "shapefile" is common, a shapefile is actually a set of several files. Three files are mandatory and store the core data: .shp, .shx, and .dbf. Strictly speaking, "shapefile" refers to the .shp file, but that file alone is incomplete for distribution because the supporting files are also required.

Further optional files mainly store index data to improve performance. Each file should follow the MS-DOS 8.3 filename convention (a filename prefix of up to 8 characters, a period, and a suffix of up to 3 characters, such as "shp") to remain compatible with older applications that handle shapefiles, though many recent applications accept longer names. For the same reason, all files should be located in the same folder.
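As a rough illustration of the 8.3 convention, the following sketch tests whether a filename fits the pattern. The helper name is_8_3_name and the permitted character set (letters, digits, underscore, hyphen) are assumptions made for this example; the real DOS rules allow some further characters.

```python
import re

# Simplified assumption: 1-8 prefix characters, a period, 1-3 suffix characters.
# The actual MS-DOS rules permit a few more characters than this pattern does.
_DOS_8_3 = re.compile(r"[A-Za-z0-9_\-]{1,8}\.[A-Za-z0-9]{1,3}")


def is_8_3_name(filename: str) -> bool:
    """Return True if filename fits the simplified 8.3 pattern above."""
    return _DOS_8_3.fullmatch(filename) is not None
```

Under these assumptions, a name such as "roads.shp" passes, while "national_roads.shp" does not because its prefix is longer than 8 characters.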

Mandatory files:

  • .shp — shape format; the feature geometry itself
  • .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
  • .dbf — attribute format; columnar attributes for each shape, in dBase IV format
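Because all three mandatory files are needed, a tool might check for them before opening a shapefile. The sketch below is one way to do that; the function name missing_mandatory_files is hypothetical, and it assumes the files share one base name, sit in the same folder, and use lowercase suffixes.

```python
from pathlib import Path

# The three suffixes listed above as mandatory.
MANDATORY_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_mandatory_files(shp_path):
    """Return the mandatory suffixes with no matching file beside shp_path.

    Assumes lowercase suffixes; on a case-sensitive filesystem a file
    named ROADS.SHX would be reported as missing.
    """
    base = Path(shp_path)
    return [s for s in MANDATORY_SUFFIXES if not base.with_suffix(s).exists()]
```

An empty list means the set is complete; otherwise the list names the suffixes that still need to be supplied.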

Optional files:

  • .prj — projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text format
  • .sbn and .sbx — a spatial index of the features
  • .fbn and .fbx — a spatial index of the features for shapefiles that are read-only
  • .ain and .aih — an attribute index of the active fields in a table
  • .ixs — a geocoding index for read-write shapefiles
  • .mxs — a geocoding index for read-write shapefiles (ODB format)
  • .atx — an attribute index for the .dbf file in the form of shapefile.columnname.atx (ArcGIS 8 and later)
  • .shp.xml — geospatial metadata in XML format, such as ISO 19115 or other XML schema
  • .cpg — used to specify the code page (only for .dbf) for identifying the character encoding to be used

The records in the .shp, .shx, and .dbf files correspond to each other in sequence: the first record in the .shp file corresponds to the first record in the .shx and .dbf files, and so on. The .shp and .shx files contain fields of differing endianness, so an implementer of the file formats must take care to read and write each field with the correct byte order.
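To make the endianness point concrete, here is a minimal sketch that decodes the fixed 100-byte header of a .shp file with Python's struct module. The byte offsets and field meanings follow the published ESRI technical description rather than anything stated in this overview, so treat them as assumptions to verify against that document; the function name parse_shp_header is hypothetical.

```python
import struct


def parse_shp_header(header: bytes) -> dict:
    """Decode selected fields from the first 100 bytes of a .shp file.

    Offsets are assumed from the ESRI technical description.
    """
    if len(header) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    # Big-endian fields: file code and file length (in 16-bit words).
    (file_code,) = struct.unpack(">i", header[0:4])
    (length_words,) = struct.unpack(">i", header[24:28])
    # Little-endian fields: version, shape type, and the X/Y bounding box.
    version, shape_type = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "file_code": file_code,
        "file_length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
    }
```

A caller would typically pass the result of reading the first 100 bytes of the file in binary mode. Mixing up the ">" and "<" prefixes yields values that are wrong without raising any error, which is why each field's byte order has to be handled explicitly.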

Shapefiles express coordinates as X and Y, although these values often hold longitude and latitude, respectively.
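Since geographic positions are commonly quoted as latitude first, the X/Y ordering is an easy place to swap values by mistake. The short sketch below shows the mapping; the XY type and from_lat_lon helper are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class XY(NamedTuple):
    x: float  # longitude when the coordinates are geographic
    y: float  # latitude when the coordinates are geographic


def from_lat_lon(lat: float, lon: float) -> XY:
    """Reorder a (latitude, longitude) pair into X/Y order."""
    return XY(x=lon, y=lat)
```

For example, from_lat_lon(48.8566, 2.3522) gives an XY whose x is 2.3522 and whose y is 48.8566.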
