The ESRI Shapefile is one of the most widely used geospatial vector data formats for Geographic Information Systems (GIS). Developed by Esri in the early 1990s, shapefiles store the geometric location and attribute information of geographic features such as points, lines, and polygons.
Despite its name, a "shapefile" is actually a collection of multiple files that work together to describe vector features. This format has become a de facto industry standard thanks to its simplicity, widespread software support, and reasonably efficient handling of datasets up to its 2 GB per-file limit.
File Structure and Components
A shapefile consists of three mandatory files plus several optional ones. The mandatory files are:
1. Main File (.shp)
The .shp file contains the geometry data in binary format. It stores the actual shape coordinates (points, lines, or polygons) using a specific binary structure defined by Esri.
.shp File Binary Structure:
File Header (100 bytes):
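- File code (9994, big-endian)
- File length (in 16-bit words, big-endian)
- Version (1000, little-endian)
- Shape type (Point=1, Polyline=3, Polygon=5, etc.; little-endian)
- Bounding box (minX, minY, maxX, maxY, followed by the Z and M ranges; little-endian doubles)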
Records:
- Record header (8 bytes: record number starting at 1, content length in 16-bit words; both big-endian)
- Shape type
- Geometry data (coordinates)
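The header layout above can be read with Python's standard `struct` module. The following is a minimal sketch, not a full reader: the function name `parse_shp_header` and the synthetic `demo_header` bytes are illustrative assumptions, and only the XY part of the bounding box is returned.

```python
import struct

def parse_shp_header(data: bytes) -> dict:
    """Parse the 100-byte main file header of a .shp file."""
    if len(data) < 100:
        raise ValueError("a .shp header needs 100 bytes")
    # Bytes 0-27: seven big-endian int32 values
    # (file code, five unused fields, file length in 16-bit words).
    file_code, *_unused, length_words = struct.unpack(">7i", data[:28])
    if file_code != 9994:
        raise ValueError(f"not a shapefile: file code is {file_code}")
    # Bytes 28-35: version and shape type, little-endian int32.
    version, shape_type = struct.unpack("<2i", data[28:36])
    # Bytes 36-67: XY bounding box as four little-endian doubles.
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    return {
        "file_length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
    }

# Synthetic header for demonstration only: a Polygon file (type 5)
# whose total length is 50 words, i.e. just the 100-byte header.
demo_header = (
    struct.pack(">7i", 9994, 0, 0, 0, 0, 0, 50)
    + struct.pack("<2i", 1000, 5)
    + struct.pack("<8d", -10.0, -5.0, 10.0, 5.0, 0.0, 0.0, 0.0, 0.0)
)
print(parse_shp_header(demo_header))
```

Note the mixed byte order: the file code and file length are big-endian, while the version, shape type, and bounding box are little-endian.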
2. Index File (.shx)
The .shx file is a positional index that stores the offset and content length of each record in the .shp file, both measured in 16-bit words. This allows software to jump directly to a specific feature without reading the entire file sequentially.
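As a rough illustration of how the index is used, the sketch below decodes .shx entries into byte offsets. It assumes the .shx file starts with the same 100-byte header as the .shp file, followed by one 8-byte entry per record (offset and content length, both big-endian and counted in 16-bit words). The function name `read_shx_index` and the sample bytes are hypothetical.

```python
import struct

def read_shx_index(shx_bytes: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, content_length_bytes) for every record."""
    entries = []
    # Skip the 100-byte header, then read fixed-size 8-byte entries.
    for pos in range(100, len(shx_bytes) - 7, 8):
        offset_words, length_words = struct.unpack(">2i", shx_bytes[pos:pos + 8])
        # Both values are stored in 16-bit words, so double them for bytes.
        entries.append((offset_words * 2, length_words * 2))
    return entries

# Synthetic index for two Point records (20 bytes of content each).
demo_shx = bytes(100) + struct.pack(">2i", 50, 10) + struct.pack(">2i", 64, 10)
print(read_shx_index(demo_shx))
```

With the offsets in hand, a reader can `seek()` straight to the n-th record header in the .shp file instead of scanning every preceding record.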
3. dBASE Table (.dbf)
The .dbf file stores attribute data in dBASE IV format. Each row corresponds to one feature in the .shp file, in the same order, and columns contain attribute fields such as names, IDs, categories, and measurements.
.dbf File Structure:
Header (32 bytes):
- Version number
- Last update date
- Number of records
- Header length
- Record length
Field Descriptors (32 bytes each):
- Field name (11 bytes, null-terminated, so at most 10 characters)
- Field type (C=Character, N=Numeric, D=Date, L=Logical)
- Field length
- Decimal count
Records:
- Deletion flag
- Field values
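The .dbf layout above can be decoded in a similar way. This is a minimal sketch under a few assumptions: values are little-endian, the year byte counts from 1900 (some writers differ), the field descriptor list ends with a 0x0D terminator byte, and field names are ASCII. `parse_dbf_header`, `_descriptor`, and `demo_dbf_header` are illustrative names, not part of any library.

```python
import struct

def parse_dbf_header(data: bytes) -> dict:
    """Parse the 32-byte .dbf header and the field descriptors after it."""
    version, yy, mm, dd, n_records, header_len, record_len = struct.unpack(
        "<4BIHH", data[:12]
    )
    fields = []
    pos = 32  # field descriptors start right after the 32-byte header
    while pos + 32 <= header_len and data[pos] != 0x0D:
        desc = data[pos:pos + 32]
        fields.append({
            "name": desc[:11].split(b"\x00", 1)[0].decode("ascii"),
            "type": chr(desc[11]),   # C, N, D, L, ...
            "length": desc[16],
            "decimals": desc[17],
        })
        pos += 32
    return {
        "version": version,
        "last_update": (1900 + yy, mm, dd),  # assumes year is stored as year - 1900
        "n_records": n_records,
        "header_length": header_len,
        "record_length": record_len,
        "fields": fields,
    }

def _descriptor(name: str, ftype: str, length: int, decimals: int = 0) -> bytes:
    """Build one synthetic 32-byte field descriptor (demo helper only)."""
    return (name.encode("ascii").ljust(11, b"\x00") + ftype.encode("ascii")
            + bytes(4) + bytes([length, decimals]) + bytes(14))

# Synthetic header: 2 records, fields NAME (C, 20) and AREA (N, 8.2).
demo_dbf_header = (
    struct.pack("<4BIHH", 3, 124, 1, 15, 2, 97, 29) + bytes(20)
    + _descriptor("NAME", "C", 20) + _descriptor("AREA", "N", 8, 2) + b"\x0d"
)
print(parse_dbf_header(demo_dbf_header)["fields"])
```

The record length (29 here) is one byte for the deletion flag plus the sum of the field lengths, which is how a reader knows where each fixed-width row begins.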
Optional files may accompany the three mandatory ones:
.sbn/.sbx: Spatial index files for fast spatial queries (written by Esri software)
.xml: Metadata in XML format (ISO 19115, FGDC)
.cpg: Code page file specifying the character encoding of the .dbf file
.qix: Quadtree spatial index (written by GDAL/OGR and MapServer, used by QGIS)
.ain/.aih: Attribute index files
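Because the components live in separate files, it can help to check which ones are present before opening a dataset. The sketch below is one possible approach; `check_components` is a hypothetical helper, it assumes all components share the base name of the .shp file with lowercase extensions, and it leaves out the .xml metadata file because its naming varies between tools.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")
# Metadata (.xml) is omitted here: its file naming differs between tools.
OPTIONAL = (".sbn", ".sbx", ".cpg", ".qix", ".ain", ".aih")

def check_components(shp_path) -> dict:
    """Report missing mandatory files and present optional ones."""
    base = Path(shp_path)
    return {
        "missing_required": [
            ext for ext in REQUIRED if not base.with_suffix(ext).exists()
        ],
        "optional_present": [
            ext for ext in OPTIONAL if base.with_suffix(ext).exists()
        ],
    }
```

For example, `check_components("data/roads.shp")` (a made-up path) would return an empty `missing_required` list only if `roads.shp`, `roads.shx`, and `roads.dbf` all exist in the same folder.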
Geometry Types
Shapefiles support various geometry types, each representing different spatial features:
Supported Geometry Types:
Type 1: Point - Single coordinate (X, Y)
Type 3: Polyline - Connected line segments
Type 5: Polygon - Closed area with boundary
Type 8: MultiPoint - Multiple disconnected points
Type 11: PointZ - Point with Z (elevation)
Type 13: PolylineZ - Polyline with Z values
Type 15: PolygonZ - Polygon with Z values
Type 21: PointM - Point with M (measurement)
Type 23: PolylineM - Polyline with M values
Type 25: PolygonM - Polygon with M values
Type 31: MultiPatch - 3D surface (TIN, mesh)
Important: Each shapefile can only contain ONE geometry type. You cannot mix points and polygons in the same .shp file.
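The type codes listed above map naturally onto a lookup table, and the one-type-per-file rule can be checked by comparing each record's shape type with the header's. The sketch below is illustrative only: `SHAPE_TYPES` and `find_mixed_records` are hypothetical names, and the code assumes that Null shapes (type 0), which the Esri specification permits in a file of any type, should not count as a violation.

```python
# Shape type codes from the list above.
SHAPE_TYPES = {
    1: "Point", 3: "Polyline", 5: "Polygon", 8: "MultiPoint",
    11: "PointZ", 13: "PolylineZ", 15: "PolygonZ",
    21: "PointM", 23: "PolylineM", 25: "PolygonM",
    31: "MultiPatch",
}

NULL_SHAPE = 0  # assumption: null records are tolerated in any file

def find_mixed_records(header_type: int, record_types: list[int]) -> list[int]:
    """Return indices of records whose type differs from the header's type."""
    return [
        i for i, t in enumerate(record_types)
        if t != header_type and t != NULL_SHAPE
    ]

# A Polygon file (type 5) with one stray Point record at index 2.
print(find_mixed_records(5, [5, 5, 1, 0, 5]))
```

A writer that needs both points and polygons would therefore produce two separate shapefiles, one per geometry type.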
Applications and Use Cases
Shapefiles are used across numerous industries and applications:
Urban Planning: Building footprints, zoning districts, land use maps
Transportation: Road networks, railway lines, traffic analysis zones
Environmental Management: Protected areas, habitat mapping, pollution zones
Natural Resources: Forest boundaries, mineral deposits, water bodies
Emergency Services: Fire districts, ambulance coverage areas, evacuation routes
Demographics: Census boundaries, voting districts, school catchment areas
Utilities: Power lines, water pipes, telecommunication networks
Agriculture: Farm parcels, crop types, irrigation systems
Archaeology: Excavation sites, artifact locations, ancient boundaries
Disaster Management: Flood zones, earthquake epicenters, landslide areas
Advantages and Limitations
Advantages
Universal Support: Supported by virtually all GIS software (QGIS, ArcGIS, MapInfo, Google Earth Pro, etc.)
Simple Format: Well-documented and easy to implement
Wide Adoption: Industry standard for over 30 years
Fast Performance: Binary format enables quick reading and rendering
Spatial Indexing: Optional index files provide fast spatial queries
Attribute Storage: dBASE format handles various data types
Limitations
Multiple Files: Must manage several files together, easy to lose components
File Size Limit: 2 GB maximum file size per component
Single Geometry Type: Cannot mix different geometry types in one file
Field Name Length: DBF limits field names to 10 characters
Limited Data Types: Dates carry no time component, there is no BLOB type, and the dBASE Logical (Boolean) type is not handled by all GIS software
No Topology: Does not store topological relationships between features
Encoding Issues: Character encoding can cause problems without a .cpg file
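The 10-character field name limit is a common source of silent data loss when exporting from richer formats, because long names that share a prefix collide after truncation. The sketch below shows one way to shorten names while keeping them unique. The `_1`, `_2` suffix scheme and the helper name `shorten_field_names` are assumptions for illustration; real GIS tools each apply their own renaming rules.

```python
def shorten_field_names(names: list[str], limit: int = 10) -> dict[str, str]:
    """Map each original name to a unique name of at most `limit` characters."""
    used = set()
    result = {}
    for name in names:
        candidate = name[:limit]
        counter = 1
        # Compare case-insensitively, since many readers ignore case.
        while candidate.upper() in used:
            suffix = f"_{counter}"
            candidate = name[:limit - len(suffix)] + suffix
            counter += 1
        used.add(candidate.upper())
        result[name] = candidate
    return result

print(shorten_field_names(["population_2020", "population_2021", "id"]))
```

Keeping the returned mapping alongside the exported data makes it possible to restore the original column names later.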
Modern Alternatives
While shapefiles remain popular, modern alternatives address many limitations: