Friday, October 21, 2011

The Tilestache Vector Provider, OGR, and MySQL

UPDATE: TileStache has since added native MySQL support. Update to the latest version. The table structure and config in this blog is still needed, but you will not have to modify the source.

This is a topic that is, once again, related to my undergraduate research. I have a need to plot some data on a map and have decided that the Google Maps API is not powerful enough for what I need. I really need to be able to split up data in to tiles and serve that to the client, so they're not downloading the entire data set. Actually, I would like something even a little more sophisticated than that, but in this blog I'll just be talking about serving tiles of simple lat/long points to Polymaps.

A server application called Tilestache handles splitting data up in to map tiles and serving them in GeoJSON format. The catch is, I need to get the points from a preexisting MySQL database, which isn't supported by Tilestache.

Before I say anything else: if you can, PLEASE use Postgres/PostGIS. The MySQL spatial data extensions are lackluster and this solution is still somewhat of a hack.

This is how I got Tilestache talking to MySQL. Fortunately, Tilestache uses OGR for database interaction which happens to have a MySQL driver.

Modifying Tilestache to use the MySQL driver is simple enough. The changes are as follows.
In /Vector/__init__.py/_open_layer():
    okay_drivers = {'postgis': 'PostgreSQL', 'esri shapefile': 'ESRI Shapefile',
                    'postgresql': 'PostgreSQL', 'shapefile': 'ESRI Shapefile',
                    'geojson': 'GeoJSON', 'mysql': 'MySQL'}
   ...
    elif driver_name == 'MySQL':
        if 'dbname' not in parameters:
            raise KnownUnknown('Need at least a "dbname" parameter for mysql')
    
        conn_parts = []
        conn_parts.append("%s" % parameters['dbname'])
        for part in ('user', 'host', 'password'):
            if part in parameters:
                conn_parts.append("%s=%s" % (part, parameters[part]))
        
        source_name = 'MySQL:' + ','.join(conn_parts)
    ...
    if driver_name in ('PostgreSQL', 'MySQL'):

That should be all you need to edit in __init__.py. Once this has been done, edit your tilestache.cfg file to add a layer using the MySQL driver, which will look something like this:
    "data":
    {
        "provider": {"name": "vector", "driver": "MySQL",
                     "parameters": {"dbname": "data", "user": "usr", "password": "passwd",
                                   "table" : "LocationInfo"}}
    }

You can also use queries to select specific data sets from a table. Make sure the user has all of the necessary rights.

Before you fire it up, there are a few more things that need to be done. You should know is the current MySQL spatial extensions don't have the metadata tables (geometry_columns, and spatial_ref_sys) that PostGIS has and are defined in the OpenGIS specifications. OGR uses these tables, so that means you have to make fake ones somehow. Luckily, the OGR MySQL driver will (kind of) do this for you!

Using the DataSource.CreateLayer() method will create the metadata tables and the table for your geometry data. It will create the spatial_ref_sys table, however it will be empty. This table is supposed to hold a list of spatial reference identifiers and their corresponding projections. You will need to add the appropriate ones (I still don't understand projections all that well, but I'm guessing WGS84 does the trick for most cases?) and then update the geometry_columns table to associate the projection with the column that was just created.

You know what would be even better than trying to get OGR to do it for you? Me just giving you a SQL script to do it. So here it is!
--
-- Table structure for table `geometry_columns`
--

CREATE TABLE IF NOT EXISTS `geometry_columns` (
  `F_TABLE_CATALOG` varchar(256) DEFAULT NULL,
  `F_TABLE_SCHEMA` varchar(256) DEFAULT NULL,
  `F_TABLE_NAME` varchar(256) NOT NULL,
  `F_GEOMETRY_COLUMN` varchar(256) NOT NULL,
  `COORD_DIMENSION` int(11) DEFAULT NULL,
  `SRID` int(11) DEFAULT NULL,
  `TYPE` varchar(256) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Table structure for table `spatial_ref_sys`
--

CREATE TABLE IF NOT EXISTS `spatial_ref_sys` (
  `SRID` int(11) NOT NULL,
  `AUTH_NAME` varchar(256) DEFAULT NULL,
  `AUTH_SRID` int(11) DEFAULT NULL,
  `SRTEXT` varchar(2048) DEFAULT NULL,
  `PROJ4TEXT` varchar(2048) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `spatial_ref_sys`
--

INSERT INTO `spatial_ref_sys` (`SRID`, `AUTH_NAME`, `AUTH_SRID`, `SRTEXT`, `PROJ4TEXT`) VALUES
(4326, 'EPSG', 4326, 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]]', '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs ');


Now you need to add rows to geometry_columns. This table basically just keeps track of all your geometries, and what SRID they are using. I am using GPS data, which I'm pretty sure is WGS84, so that's the only SRID I need. F_TABLE_CATALOG and F_TABLE_SCHEMA are allowed to be NULL. F_TABLE_NAME is the name of the table that contains the geometry column. F_GEOMETRY_COLUMN is the name of the column that is a geometry. I'm not 100% sure what COORD_DIMENSION means, but from what I can tell it's how many dimensions your geometry uses. Mine is a point with an x/y value, so my value for COORD_DIMENSION is 2. SRID is the reference ID of a row within spatial_ref_sys. TYPE is the name of the geometry. For me, it's "POINT".

Once all that has been done, OGR should be able to connect to MySQL and run its fancy bounding box queries against your geometries so that Tilestache can serve data as tiles! Yay!

Friday, October 7, 2011

GPSd/libgps/gps.h

GPSd is a daemon that interfaces with and monitors GPS devices. In my undergraduate research, I have been working on a project that requires a C++ program that can grab a fix from a USB GPS mouse. I do not need to constantly monitor the GPS device, I just need simple location information as part of my data collection.

Unfortunately, there is very little example code out there for GPSd's C library, libgps (gps.h). Here is my experience with the library, as well as the code that I came up with.

From what I can understand about how GPSd and GPS mouses work together, the mouse is actually constantly looking for a fix. This was a little counter-intuitive to me, since I am used to working with Android where the device doesn't actually look for a fix unless you register with it. The GPS mouse sends status updates to GPSd on a regular basis, so what your code actually needs to do is ask GPSd to give you the next update, and determine if it's a fix or not. If the mouse currently does not have a fix, the function exits (and you should try again later when the mouse says you do have a fix)

Here is my shoddily documented code:

int getGPSData(gps_data_t *gpsdata){
    //connect to GPSd
    if(gps_open("localhost", "2947", gpsdata)<0){
        fprintf(stderr,"Could not connect to GPSd\n");
        return(-1);
    }

    //register for updates
    gps_stream(gpsdata, WATCH_ENABLE | WATCH_JSON, NULL);
    
    fprintf(stderr,"Waiting for gps lock.");
    //when status is >0, you have data.
    while(gpsdata->status==0){
        //block for up to .5 seconds
        if (gps_waiting(gpsdata, 500)){
            //dunno when this would happen but its an error
            if(gps_read(gpsdata)==-1){
                fprintf(stderr,"GPSd Error\n");
                gps_stream(gpsdata, WATCH_DISABLE, NULL);
                gps_close(gpsdata);
                return(-1);
                break;
            }
            else{
                //status>0 means you have data
                if(gpsdata->status>0){
                    //sometimes if your GPS doesnt have a fix, it sends you data anyways
                    //the values for the fix are NaN. this is a clever way to check for NaN.
                    if(gpsdata->fix.longitude!=gpsdata->fix.longitude || gpsdata->fix.altitude!=gpsdata->fix.altitude){
                        fprintf(stderr,"Could not get a GPS fix.\n");
                        gps_stream(gpsdata, WATCH_DISABLE, NULL);
                        gps_close(gpsdata);
                        return(-1);
                    }
                    //otherwise you have a legitimate fix!
                    else
                        fprintf(stderr,"\n");
                }
                //if you don't have any data yet, keep waiting for it.
                else
                    fprintf(stderr,".");
            }
        }
        //apparently gps_stream disables itself after a few seconds.. in this case, gps_waiting returns false.
        //we want to re-register for updates and keep looping! we dont have a fix yet.
        else
            gps_stream(gpsdata, WATCH_ENABLE | WATCH_JSON, NULL);

        //just a sleep for good measure.
        sleep(1);
    }
    //cleanup
    gps_stream(gpsdata, WATCH_DISABLE, NULL);
    gps_close(gpsdata);
}

This works fine with my GlobalSat BU-353 GPS mouse. I guess there is always more testing to be done, but this works consistently.
Here is how you might print the results:

void debugDump(gps_data_t *gpsdata){
    fprintf(stderr,"Longitude: %lf\nLatitude: %lf\nAltitude: %lf\nAccuracy: %lf\n\n",
                gpsdata->fix.latitude, gpsdata->fix.longitude, gpsdata->fix.altitude,
                (gpsdata->fix.epx>gpsdata->fix.epy)?gpsdata->fix.epx:gpsdata->fix.epy);
}

I'm not entirely sure that's the best way to determine the accuracy. I'm not sure exactly what the ep attributes mean, they weren't explained well in the gps.h header file. My goal is to record something analogous to Android's getAccuracy() method. If anyone knows if my solution (just grabbing the larger of the two numbers) works or not, please comment and let me know!

Something to keep in mind when working with libgps/gps.h (and I'm sure this is common sense, but I'm an idiot sometimes), you need to use the -lgps compiler flag!