Friday, October 21, 2011

The Tilestache Vector Provider, OGR, and MySQL

UPDATE: TileStache has since added native MySQL support. Update to the latest version. The table structure and config in this blog is still needed, but you will not have to modify the source.

This is a topic that is, once again, related to my undergraduate research. I have a need to plot some data on a map and have decided that the Google Maps API is not powerful enough for what I need. I really need to be able to split up data in to tiles and serve that to the client, so they're not downloading the entire data set. Actually, I would like something even a little more sophisticated than that, but in this blog I'll just be talking about serving tiles of simple lat/long points to Polymaps.

A server application called Tilestache handles splitting data up in to map tiles and serving them in GeoJSON format. The catch is, I need to get the points from a preexisting MySQL database, which isn't supported by Tilestache.

Before I say anything else: if you can, PLEASE use Postgres/PostGIS. The MySQL spatial data extensions are lackluster and this solution is still somewhat of a hack.

This is how I got Tilestache talking to MySQL. Fortunately, Tilestache uses OGR for database interaction which happens to have a MySQL driver.

Modifying Tilestache to use the MySQL driver is simple enough. The changes are as follows.
In /Vector/__init__.py/_open_layer():
    okay_drivers = {'postgis': 'PostgreSQL', 'esri shapefile': 'ESRI Shapefile',
                    'postgresql': 'PostgreSQL', 'shapefile': 'ESRI Shapefile',
                    'geojson': 'GeoJSON', 'mysql': 'MySQL'}
   ...
    elif driver_name == 'MySQL':
        if 'dbname' not in parameters:
            raise KnownUnknown('Need at least a "dbname" parameter for mysql')
    
        conn_parts = []
        conn_parts.append("%s" % parameters['dbname'])
        for part in ('user', 'host', 'password'):
            if part in parameters:
                conn_parts.append("%s=%s" % (part, parameters[part]))
        
        source_name = 'MySQL:' + ','.join(conn_parts)
    ...
    if driver_name in ('PostgreSQL', 'MySQL'):

That should be all you need to edit in __init__.py. Once this has been done, edit your tilestache.cfg file to add a layer using the MySQL driver, which will look something like this:
    "data":
    {
        "provider": {"name": "vector", "driver": "MySQL",
                     "parameters": {"dbname": "data", "user": "usr", "password": "passwd",
                                   "table" : "LocationInfo"}}
    }

You can also use queries to select specific data sets from a table. Make sure the user has all of the necessary rights.

Before you fire it up, there are a few more things that need to be done. You should know is the current MySQL spatial extensions don't have the metadata tables (geometry_columns, and spatial_ref_sys) that PostGIS has and are defined in the OpenGIS specifications. OGR uses these tables, so that means you have to make fake ones somehow. Luckily, the OGR MySQL driver will (kind of) do this for you!

Using the DataSource.CreateLayer() method will create the metadata tables and the table for your geometry data. It will create the spatial_ref_sys table, however it will be empty. This table is supposed to hold a list of spatial reference identifiers and their corresponding projections. You will need to add the appropriate ones (I still don't understand projections all that well, but I'm guessing WGS84 does the trick for most cases?) and then update the geometry_columns table to associate the projection with the column that was just created.

You know what would be even better than trying to get OGR to do it for you? Me just giving you a SQL script to do it. So here it is!
--
-- Table structure for table `geometry_columns`
--

CREATE TABLE IF NOT EXISTS `geometry_columns` (
  `F_TABLE_CATALOG` varchar(256) DEFAULT NULL,
  `F_TABLE_SCHEMA` varchar(256) DEFAULT NULL,
  `F_TABLE_NAME` varchar(256) NOT NULL,
  `F_GEOMETRY_COLUMN` varchar(256) NOT NULL,
  `COORD_DIMENSION` int(11) DEFAULT NULL,
  `SRID` int(11) DEFAULT NULL,
  `TYPE` varchar(256) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Table structure for table `spatial_ref_sys`
--

CREATE TABLE IF NOT EXISTS `spatial_ref_sys` (
  `SRID` int(11) NOT NULL,
  `AUTH_NAME` varchar(256) DEFAULT NULL,
  `AUTH_SRID` int(11) DEFAULT NULL,
  `SRTEXT` varchar(2048) DEFAULT NULL,
  `PROJ4TEXT` varchar(2048) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `spatial_ref_sys`
--

INSERT INTO `spatial_ref_sys` (`SRID`, `AUTH_NAME`, `AUTH_SRID`, `SRTEXT`, `PROJ4TEXT`) VALUES
(4326, 'EPSG', 4326, 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]]', '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs ');


Now you need to add rows to geometry_columns. This table basically just keeps track of all your geometries, and what SRID they are using. I am using GPS data, which I'm pretty sure is WGS84, so that's the only SRID I need. F_TABLE_CATALOG and F_TABLE_SCHEMA are allowed to be NULL. F_TABLE_NAME is the name of the table that contains the geometry column. F_GEOMETRY_COLUMN is the name of the column that is a geometry. I'm not 100% sure what COORD_DIMENSION means, but from what I can tell it's how many dimensions your geometry uses. Mine is a point with an x/y value, so my value for COORD_DIMENSION is 2. SRID is the reference ID of a row within spatial_ref_sys. TYPE is the name of the geometry. For me, it's "POINT".

Once all that has been done, OGR should be able to connect to MySQL and run its fancy bounding box queries against your geometries so that Tilestache can serve data as tiles! Yay!

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.