Spatial SQL

From GRASS-Wiki
Revision as of 05:32, 19 May 2014 by ⚠️HamishBowman (talk | contribs) (→‎Conclusions: wikilinks)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

In recent years, many DBMS have learned to store and manage spatial data directly. Clearly, having feature geometries stored alongside their attribute data in one table makes data management a lot easier in many ways:

  • make use of advanced capabilities provided by powerful DBMS engines: user authentication, data integrity and backup, replication mechanisms, fast indexing, full SQL support, triggers, etc.
  • easily manage 1:m and m:n relationships directly in the database
  • use external front-ends to ease entry of attribute data while managing geometries in the GIS
  • never worry again about breaking the relationship between features and attribute records
  • drop restrictions imposed by file-based storage (maximum size, data table limits, slowness)
  • transfer data from one system to another without loss of information using SQL dumps
  • dramatically improve data access times
  • limit data to be held in memory by importing only spatial subregions from the DBMS
  • create central data repositories where layers will always be found in the same place
  • etc., etc.

While GRASS GIS does not currently support any spatial DBMS directly, it can still interface them using the OGR-provided drivers and v.in.ogr/v.out.ogr for data import/export. Several open source databases are supported this way:

  • PostgreSQL (with the PostGIS extension)
  • MySQL (Spatial)
  • SQLite

In addition, the OGR-supplied command line tool "ogr2ogr" can be used to manage all OGR-supported data sources directly, including

SQLite

SQLite is a light-weight, embeddable database. It works using single, fully portable files. Without the need to set up a complex client/server DBMS, it still provides a powerful data storage back-end:

http://www.sqlite.org

SQLiteStudio is a highly recommended application to manage SQLite databases of any kind:

http://www.sqlitestudio.org

There is a document that outlines how to store spatial data in an SQLite database:

http://trac.osgeo.org/fdo/wiki/FDORfc16

The OGR SQLite driver can store spatial data in an SQLite table in a straight-forward manner:

http://www.gdal.org/ogr/drv_sqlite.html

ogr2ogr

Creation of new database and dumping of 3D Shapefile points into it works. 3D info is preserved, it is OK to not have any SRS info. In that case, an empty references table will be created and the SRID for the geometries table will be set to NULL.

Creation of new table in an existing database using the "-update" option works. It is necessary to also use -nln <name> to specify the new table name.

However, it is possible to manually drop the metadata table and the database keeps working as expected.

Creation of a new database with DSCO "METADATA=no" does not work. This is a known bug to be fixed in GDAL 1.7.0.

French special chars from the DBF file were not preserved. Any attribute fields with a special character will come out as empty strings (not NULL!). Encoding used to save the DBF file was ISO-8859-1. UTF-8 encoding worked fine. Original NULL fields are preserved correctly.

v.out.ogr

This module is currently (GRASS 6.4.0 RC4) not able to correctly export data to an SQLite DB.

A quick, dirty hack allowed opening of an OGR data source in update mode, as required for existing database (not only SQLite DBs!):

    papszDSCO = dsco->answers;
    Ogr_ds = OGR_Dr_CreateDataSource(Ogr_driver, dsn_opt->answer, papszDSCO);
    
    /* Some OGR drivers do not support overwriting existing sources 
       or creating new ones.
       
       Note: this is just a blind attempt, as we don't really know, why
       the call to OGR_Dr_CreateDataSource() failed in the first place!
       
       Maybe it would be better to have an "-u" flag here to open an existing
       DS in update mode (like ogr2ogr does)?
    */
    if (Ogr_ds == NULL) {
        if (!strcmp (frmt_opt->answer, "SQLite") ||
            !strcmp (frmt_opt->answer, "PostgreSQL") ||
            !strcmp (frmt_opt->answer, "MySQL")) {
            G_message(_("Attempting to open existing OGR data source.\n"));
            Ogr_ds =
                OGR_Dr_Open(Ogr_driver, dsn_opt->answer, TRUE);    
        }
    }
    
    CSLDestroy(papszDSCO);
    if (Ogr_ds == NULL)
        G_fatal_error(_("Unable to open OGR data source '%s'"),
        	      dsn_opt->answer);

For new databases, only the first geometry gets stored correctly, all subsequent ones generate a useless error message:

 ERROR 1: sqlite3_step() failed:
   SQL logic error or missing database


Adding a new table to an existing DB fails for unknown reasons. The OGR driver creates the table itself with the correct structure, but not a single geometry is written into it. The module bails out here (main.c):

    CSLDestroy(papszLCO);
    if (Ogr_layer == NULL)
        G_fatal_error(_("Unable to create OGR layer"));

v.in.ogr

Import of 3D point data w/o projection information into a region with UTM system worked using:

 v.in.ogr dsn=tmp3.sqlite layer=finds_xyz output=tmp2 -oz --o

Import of UTF-8 encoded attribute strings worked fine. There was a warning about text fields being truncated to a maximimum length of 255 chars. However, none of the fields exceeded 100 chars, anyway.

Import of layers from a database without metadata tables works fine.

Conclusions

The result is a mixed bag. While the OGR driver itself works fine with some small glitches that can be worked around, export of data directly from GRASS using v.out.ogr is currently impossible. Import using v.in.ogr, however works well.

GRASS' v.out.ogr module needs a "-u" (update flag) to open existing data sources in update mode. It remains useless in its current state.

The SQLite support will be significantly improved in GDAL 1.7.0. At least support for SpatiaLite geometry BLOBs will be added, perhaps also AutoDesk's format in the future.

It is unclear if/how OGR handles character encodings other than UTF-8 correctly! When exporting data as UTF from e.g. OpenOffice.org Calc, double the field length for text fields, since we are dealing with 16-Bit units!

It might be possible to compile GDAL with GRASS support and use ogr2ogr to access GRASS vector data instead of v.in.ogr/v.out.ogr. However, there might be limits in the OGR driver's capabilities plus we'd get that circular linking nightmare again!

The message about truncation issued by v.in.ogr is worrying, since that seemingly needs to be handled by OGR itself, however we could easily add an option to specify maximum field length to v.in.ogr (maybe also a function that catches strings which are too long and were thus truncated?).

Note: the source code for v.out.ogr contains a note that support for kernel type geometries has not yet been implemented. I am not currently sure about v.in.ogr and kernels.