September 2009
Objective
This tutorial walks you through the basics of converting GIS vector data to KML using the open source OGR library. While the library can be used with most GIS vector formats, this tutorial focuses on ESRI shapefiles.
Introduction
Geographic data is available in many forms on the web. KML is one of the most prevalent file formats, but many other file types are used as well. Large companies, governments, and NGOs use geographic information systems to create their maps, along with specialized file formats. These applications often require specialized training, or at least a significant amount of time to learn. The proprietary ones can be quite expensive. On the other hand, they are very powerful and provide a rich set of mapping tools for the professional mapper.
Many government entities release some portion of their GIS data for public use. Portals like Data.gov, Massachusetts Geographic Information System, and DataSF make it easy to find data for their communities. This article describes how to convert vector data (that is, data made up of simple geometries like points, lines, and polygons) into KML. This will give you access to a variety of different types of data, including:
- Parcel data representing the boundaries of building parcels
- Incident data, such as crime reports
- Boundary data, for boundaries of municipalities, states, counties, provinces, etc.
- Road data, including planned roads and existing ones
- Construction permits, indicating where permits have been issued
- Health data, such as where flu outbreaks have occurred
There are many excellent applications for doing the data conversion, such as Google Earth Pro, shp2kml, KML2KML, Arc2Earth, and many others. This article focuses on converting vector data to KML for use in Google Earth or Google Maps, using the open source Geospatial Data Abstraction Library (GDAL) utilities from the command line, and may inspire you to incorporate those libraries into your own applications.
A note on file types
While this article takes you through the steps of converting ESRI shapefiles to KML, the GDAL utilities, in particular OGR, can be used to convert from a wide variety of file types, including CSV, PostgreSQL/PostGIS databases, and many other formats. Most major data formats you'll encounter are supported. If you find a format that isn't supported, you can write a driver for OGR, since it's an open source library.
Shapefiles
ESRI's popular shapefile format is one of the most common GIS data formats. While technically the shapefile is a single file with a .shp extension, a .shp cannot be opened by itself. It requires at least a .dbf and a .shx file alongside it, and various other files may accompany it for specific purposes. So when you see a reference to a shapefile, it almost always means a collection of files, usually in a zipped archive of some sort to keep it together, and that's how we'll be using the term shapefile for the rest of this article.
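As a quick sanity check after unzipping a download, the following minimal Python sketch reports which companion files are missing next to a .shp file. The function name and the choice to treat .prj as optional are assumptions made for illustration. The check is also case-sensitive, so it would not recognize an upper-case extension such as .DBF.

```python
from pathlib import Path

# Files that must sit next to the .shp, plus the optional .prj file
# that carries the spatial reference system.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")
OPTIONAL_EXTENSIONS = (".prj",)


def missing_shapefile_parts(shp_path):
    """Return (missing_required, missing_optional) extensions for a shapefile.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    base = Path(shp_path)
    missing_required = [ext for ext in REQUIRED_EXTENSIONS
                        if not base.with_suffix(ext).exists()]
    missing_optional = [ext for ext in OPTIONAL_EXTENSIONS
                        if not base.with_suffix(ext).exists()]
    return missing_required, missing_optional
```

Calling missing_shapefile_parts("realtor_neighborhoods.shp") in the directory where you unzipped the data returns two empty lists when all four files are present.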
Shapefiles contain a large amount of information about the geographies they describe. They describe the actual geometries, metadata about the geometries, and information about the spatial reference system used, as well as many other aspects of the data. For purposes of this article, we'll care most about the geometries, metadata, and spatial reference system.
Geometries and metadata are easy concepts. Geometries are points, lines, and polygons, and can easily be expressed in KML. Metadata is data about the data, often used for filtering or querying purposes. For instance, a line describing a road may have metadata about the type of road (municipal street, national highway, turnpike, etc.), the speed limits, who funds it, its size, etc.
Spatial reference systems (SRS) identify the coordinate systems and projections used to create the vector data. KML uses latitude and longitude in the WGS84 coordinate system, and supports only WGS84. But there are other ways of identifying coordinates on a map. Popular ones include Universal Transverse Mercator, the British National Grid, and the State Plane systems. To convert data into KML, OGR needs to know which SRS the source data uses. Typically, shapefiles carry that information along with them, often in a .prj file, and OGR can detect it from there. At times, however, you will need to identify the SRS yourself. The data source often provides it in some form, either on the page you download the data from or in a readme document included with the download. The site Spatial Reference contains more information about SRS and has a reference that allows you to look up individual reference systems.
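When a shapefile arrives without a .prj file, ogr2ogr accepts the source SRS through its -s_srs option and the target SRS through -t_srs. The sketch below only assembles the argument list and does not run anything. The function name is hypothetical, and the EPSG:2227 code in the example is a placeholder rather than the documented SRS of any dataset mentioned here. Look up the real code for your data.

```python
import shlex


def build_reproject_command(src_path, dst_path, source_srs=None):
    """Build an ogr2ogr argument list that writes WGS84 KML.

    Hypothetical helper: pass source_srs only when the input has no
    usable .prj file and you know the SRS from the data provider.
    """
    command = ["ogr2ogr", "-f", "KML"]
    if source_srs is not None:
        # Tell OGR which SRS the input coordinates are actually in.
        command += ["-s_srs", source_srs]
    # EPSG:4326 is WGS84 latitude/longitude, which KML expects.
    command += ["-t_srs", "EPSG:4326", dst_path, src_path]
    return command


# "EPSG:2227" is an illustrative placeholder, not a verified value.
command = build_reproject_command(
    "realtor_neighborhoods.shp", "realtor_neighborhoods.kml", "EPSG:2227")
print(shlex.join(command))
```

The printed line can be pasted into a terminal once GDAL is installed.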
ogr2ogr
GDAL provides a powerful set of libraries for working with vector data. In particular, ogr2ogr is a powerful utility for data conversion. Many applications, including some of those mentioned above, incorporate GDAL/OGR.
To get started, download and install GDAL. Then you will need a shapefile. For purposes of this tutorial, try using one from DataSF. The example below uses the realtor_neighborhoods shapefile, which can be obtained from DataSF after agreeing to their license. Once you've downloaded the file, unzip it into a directory that you will remember. Open up a command line and navigate to the directory that you put the data in. Now for the fun part.
ogr2ogr can be used from the command line very easily. Here is how you could convert realtor_neighborhoods from a shapefile to KML:
ogr2ogr -f "KML" -where "NBRHOOD='Telegraph Hill'" realtor_neighborhoods.kml realtor_neighborhoods.shp
Here's a breakdown of what that command does:
- ogr2ogr: This is the core command.
- -f "KML": This sets the output format to KML.
- -where "NBRHOOD='Telegraph Hill'": This is an optional where clause, like in SQL. Basically, it allows you to query the data based on metadata. It works with shapefiles and other file types that support querying. In this case, it is querying for the NBRHOOD field, and only selecting features that have a NBRHOOD of Telegraph Hill. If you leave that parameter off, ogr2ogr gives you every neighborhood polygon.
- realtor_neighborhoods.kml: This is the output file name. Output file name comes first.
- realtor_neighborhoods.shp: This is the input file name. The .shp file represents the whole shapefile.
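If you want to run the same conversion from a script, the arguments described above can be assembled in Python and handed to subprocess. This is a minimal sketch, not an official wrapper. The function names are hypothetical, and the sketch rejects values containing a single quote instead of trying to escape them.

```python
import shlex
import subprocess


def build_filter_command(src_path, dst_path, field, value):
    """Build the ogr2ogr argument list for one attribute filter."""
    if "'" in value:
        # Escaping embedded quotes is out of scope for this sketch.
        raise ValueError("value must not contain a single quote")
    return [
        "ogr2ogr",
        "-f", "KML",
        "-where", f"{field}='{value}'",
        dst_path,  # output file name comes first
        src_path,  # the .shp stands in for the whole shapefile
    ]


def convert(command):
    """Run ogr2ogr; raises CalledProcessError if the conversion fails."""
    subprocess.run(command, check=True)


command = build_filter_command(
    "realtor_neighborhoods.shp", "realtor_neighborhoods.kml",
    "NBRHOOD", "Telegraph Hill")
print(shlex.join(command))
```

Passing the arguments as a list means the where clause reaches ogr2ogr as a single argument, without a second layer of shell quoting. Call convert(command) to perform the conversion once GDAL is installed.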
That's it. This command writes a KML file that looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document><Folder><name>realtor_neighborhoods</name>
<Schema name="realtor_neighborhoods" id="realtor_neighborhoods">
  <SimpleField name="Name" type="string"></SimpleField>
  <SimpleField name="Description" type="string"></SimpleField>
  <SimpleField name="OBJECTID" type="float"></SimpleField>
  <SimpleField name="NBRHOOD" type="string"></SimpleField>
  <SimpleField name="SFAR_DISTR" type="string"></SimpleField>
</Schema>
  <Placemark>
    <ExtendedData><SchemaData schemaUrl="#realtor_neighborhoods">
      <SimpleData name="OBJECTID">81</SimpleData>
      <SimpleData name="NBRHOOD">Telegraph Hill</SimpleData>
      <SimpleData name="SFAR_DISTR">District 8 - Northeast</SimpleData>
    </SchemaData></ExtendedData>
    <Polygon><outerBoundaryIs><LinearRing><coordinates>-122.41041847319012,37.805924016582715,0 -122.407203813674,37.806324902060979,0 -122.40667792852096,37.803710121958744,0 -122.40348255423899,37.804117462290641,0 -122.40237202127015,37.798540648764529,0 -122.40876046662795,37.797723222540775,0 -122.41041847319012,37.805924016582715,0</coordinates></LinearRing></outerBoundaryIs></Polygon>
    <Style><LineStyle><color>ff0000ff</color></LineStyle> <PolyStyle><fill>0</fill></PolyStyle></Style>
  </Placemark>
</Folder></Document></kml>
You can see that the metadata from the shapefile was preserved, in the Schema and SimpleData elements. For more information on using ExtendedData and the preservation of custom data, check out the KML Developer's Guide section on Adding Custom Data.
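To confirm that the metadata survived, you can read the SimpleData values back with Python's standard library. The sketch below uses a trimmed copy of the output above, with the XML declaration, schema, geometry, and style left out for brevity. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Trimmed copy of the ogr2ogr output (geometry and style omitted).
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
<Document><Folder><name>realtor_neighborhoods</name>
  <Placemark>
    <ExtendedData><SchemaData schemaUrl="#realtor_neighborhoods">
      <SimpleData name="OBJECTID">81</SimpleData>
      <SimpleData name="NBRHOOD">Telegraph Hill</SimpleData>
      <SimpleData name="SFAR_DISTR">District 8 - Northeast</SimpleData>
    </SchemaData></ExtendedData>
  </Placemark>
</Folder></Document></kml>"""


def read_placemark_attributes(kml_text):
    """Return one {field name: value} dict per Placemark in the KML text."""
    root = ET.fromstring(kml_text)
    records = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        # Every SimpleData element holds one attribute from the shapefile.
        fields = {
            item.get("name"): item.text
            for item in placemark.iterfind(".//kml:SimpleData", KML_NS)
        }
        records.append(fields)
    return records


for record in read_placemark_attributes(SAMPLE_KML):
    print(record["NBRHOOD"], "-", record["SFAR_DISTR"])
```

All values come back as strings, so a numeric field such as OBJECTID needs an explicit conversion if you want to compare it as a number.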
What's next?
GDAL/OGR gives you a tremendous amount of power. In the simplest implementation, you can now convert all your data to KML for use in Google Earth, Maps, or any other KML-supporting geo browser. Even better, you can incorporate the GDAL/OGR libraries into your applications, which lets you automate the conversion of GIS data into KML and control the output of that conversion. Try combining it with libkml for even more programmatic control over your KML generation.