GRASS GIS APIs

From GRASS-Wiki

Introduction to this document

GRASS GIS is in continuous development, and this development is what keeps GRASS GIS alive and at the forefront of GIS evolution. From users stringing together a series of modules to build a novel processing chain to developers implementing new cutting-edge algorithms in the core of the GRASS GIS C source code, there are many different ways to develop with and for GRASS GIS. This document aims to clarify the role of the different application programming interfaces (APIs) available. For each API, it presents typical use cases and introduces the basic logic of the API. For more detailed information, see the introductions GRASS GIS and Development and GRASS GIS and Python, as well as the programming manuals for the C and Python APIs.

GRASS GIS Modules as "functions"

GRASS in itself is not a monolithic application, but rather a collection of over 300 applications, called modules, following the Unix philosophy with the idea that each module does one thing and does it well, even if that thing is trivial, and that real power comes from combining these tools.

A consequence of this principle is that each module is autonomous in its functioning, including its memory management and error handling. In other words, as GRASS GIS is not a monolithic application, GRASS as such cannot crash, only individual modules can.

Module output in the form of new maps is automatically stored in the GRASS GIS Database and can thus be reused by follow-up modules. Other types of outputs can either be stored in temporary files to be read later by other modules or data can be communicated between modules via the standard streams, i.e. the standard output and input.

With this structure in mind, one can argue that GRASS GIS in itself is an application programming interface with each module playing the role of a "function". Chaining these functions thus allows any user to create an application of any level of desired complexity. The description of this API can be found in the module man pages or on the command line via the --help parameter.

Here is a simple example of such an application that extracts estimated stream locations from a digital elevation model, transforms them to vector format and then creates a 500 m buffer around them:

g.region rast=elevation
r.watershed elevation=elevation threshold=10000 stream=raster_streams --o
r.to.vect input=raster_streams output=vector_streams type=line --o
v.buffer input=vector_streams output=stream_buffers distance=500 --o

Just putting the above lines into one file that the operating system can read at the command line can already be considered "programming an application", and by just changing the name of the raster used as input, the same program can be launched on different data. There are many advantages to creating such an application instead of clicking through the different commands in a graphical user interface:

  • archiving your work flow
  • making your work flow easily reproducible
  • being able to send your work flow to others

However, in this form one is limited to just chaining modules without any control flow such as conditional execution or repeated execution of certain parts of the code. It can thus be interesting to embed the calls to GRASS GIS modules in a program written in a programming language. Many programming languages allow system calls, which enable the programmer to call the GRASS GIS modules while having all the functions of the language at hand.

Here are some examples of such system calls (using the first line of the above example) in different programming languages:

  • Python
import os
os.system('g.region rast=elevation')
  • C
#include <stdlib.h>

int main(void)
 {
   system("g.region rast=elevation");
   return 0;
 }
  • Perl
system "g.region rast=elevation";

These system calls are easy to handle when no output is expected from the GRASS module. When output needs to be collected, the programming task already becomes a little harder unless you know what you are doing. It is for this reason that the GRASS Python libraries explained in the next section were developed.
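To illustrate what collecting output from a plain system call involves, here is a minimal sketch using Python's standard subprocess module. The function name read_output is hypothetical, and the demonstration runs a stand-in command (the Python interpreter printing one line) so that it works outside a GRASS session; inside a session one would pass a module call such as ["g.region", "-g"] instead.

```python
import subprocess
import sys


def read_output(command):
    """Run a command (list of arguments) and return its standard output as text."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,  # capture standard output instead of printing it
        text=True,               # decode the captured bytes to str
        check=True,              # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Stand-in command (an assumption for illustration) so the sketch runs
# without GRASS; in a GRASS session use e.g. ["g.region", "-g"].
demo_command = [sys.executable, "-c", "print('nsres=10')"]
print(read_output(demo_command))
```

Even in this small form, the programmer has to decide how to capture the stream, how to decode it and what to do when the module fails, which is the bookkeeping the Python libraries take over.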

The Python scripting library

As mentioned at the end of the previous section, calling GRASS GIS modules via system calls from a programming language can come in very handy, but dealing with the output of these modules is not always straightforward. As a first solution to that, the GRASS Python Scripting Library was introduced in GRASS 6.

The aim of this library is to provide a lightweight, easy-to-learn toolkit that eases interaction with GRASS GIS modules from Python. It mainly offers specialized wrapper functions to facilitate calling GRASS modules and capturing their output, as well as a few specific functions for frequent tasks (which are themselves generally just wrapper functions around module calls). It makes it possible to embed GRASS GIS modules in the larger Python environment without providing complete Pythonic access to the entire GRASS GIS API. It was used to convert all the shell scripts in the GRASS GIS source code to Python scripts in GRASS 7.

The library is a perfect starting point for beginners who just want to write simple scripts without wanting to dive too deeply into Python philosophy and style.

Here is the same example as above using the run_command() function. This shows that the syntax stays very close to the actual GRASS GIS module calls.

import grass.script as grass 
grass.run_command('g.region', rast='elevation')
grass.run_command('r.watershed', elevation='elevation', threshold=10000, stream='raster_streams', overwrite=True)
grass.run_command('r.to.vect', input='raster_streams', output='vector_streams', type='line', overwrite=True)
grass.run_command('v.buffer', input='vector_streams', output='stream_buffers', distance=500, overwrite=True)
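Once the module calls live in Python, control flow such as loops and conditions becomes available. The following is a minimal sketch, not part of the library: the function buffer_streams, the stand-in runner print_command and the map names dem_north and dem_south are all hypothetical. The runner is passed in as an argument so that the sketch can be tried outside a GRASS session; inside a session one would pass grass.run_command.

```python
def buffer_streams(run_command, elevation_maps, distance=500, skip=()):
    """Run the stream-buffer chain once per elevation raster.

    run_command is any callable with the calling convention
    run_command(module, **parameters), e.g. grass.script.run_command.
    """
    created = []
    for elevation in elevation_maps:
        # Conditional execution: leave out maps the caller wants skipped.
        if elevation in skip:
            continue
        streams_raster = elevation + "_raster_streams"
        streams_vector = elevation + "_vector_streams"
        buffers = elevation + "_stream_buffers"
        run_command('g.region', rast=elevation)
        run_command('r.watershed', elevation=elevation, threshold=10000,
                    stream=streams_raster, overwrite=True)
        run_command('r.to.vect', input=streams_raster, output=streams_vector,
                    type='line', overwrite=True)
        run_command('v.buffer', input=streams_vector, output=buffers,
                    distance=distance, overwrite=True)
        created.append(buffers)
    return created


def print_command(module, **parameters):
    """Stand-in runner that only prints what would be executed."""
    print(module, parameters)


# Outside a GRASS session, the stand-in shows the calls that would be made.
# Inside a session, pass grass.run_command instead of print_command.
buffer_streams(print_command, ['dem_north', 'dem_south'], skip=['dem_south'])
```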

There are a series of different functions to call modules depending on how output should be handled:

#get the current region settings as a string with one parameter per line (can then be treated with the Python splitlines() function)
>>> grass.read_command('g.region', flags='g')
'n=228500\ns=215000\nw=630000\ne=645000\nnsres=10\newres=10\nrows=1350\ncols=1500\ncells=2025000\n'
#get current region settings as a python dictionary:
>>> grass.parse_command('g.region', flags='g')
{'rows': '1350', 'e': '645000', 'cells': '2025000', 'cols': '1500', 'n': '228500', 's': '215000', 'w': '630000', 'ewres': '10', 'nsres': '10'}
#or using a dedicated function as a shortcut (which also directly transforms all numbers from strings to numeric format)
>>> grass.region()
{'rows': 1350, 'e': 645000.0, 'cells': 2025000, 'cols': 1500, 'n': 228500.0, 's': 215000.0, 'w': 630000.0, 'ewres': 10.0, 'nsres': 10.0}

For a description of the different functions to call modules, see the relevant section on the dedicated wiki page.
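To make the difference between the raw string and the dictionary concrete, here is a minimal sketch of how key=value output like the one returned by read_command() above could be turned into a dictionary by hand. The function parse_key_value is hypothetical and is not the library's implementation; it only illustrates the kind of work that parse_command() saves.

```python
def parse_key_value(text, sep="="):
    """Turn 'key=value' lines into a dictionary of strings."""
    result = {}
    for line in text.splitlines():
        if sep not in line:
            continue  # ignore empty or malformed lines
        key, value = line.split(sep, 1)
        result[key.strip()] = value.strip()
    return result


# Output string copied from the read_command() example above.
raw = ('n=228500\ns=215000\nw=630000\ne=645000\nnsres=10\newres=10\n'
       'rows=1350\ncols=1500\ncells=2025000\n')
region = parse_key_value(raw)

# The values are still strings, so numeric use needs an explicit conversion.
print(int(region['rows']) * int(region['cols']))
```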

As mentioned, this library is meant as a small toolkit that makes it easier to call GRASS GIS modules from Python. It does not, however, offer truly Pythonic access to GRASS GIS, nor any access to the underlying functions. This is the role of the pygrass library discussed below.

Graphical programming with the GRASS GIS modeler

Another easy entry into programming with the GRASS GIS Python scripting library is the graphical modeler. Any model created graphically can also be saved as a Python script. For more details see the dedicated wiki page.

The PyGRASS API using ctypes

The PyGRASS API provides two layers of access to GRASS:

  1. Access to GRASS modules, along the same lines as the Python scripting library
  2. Access to the underlying functions of the C library, thus giving access to GRASS algorithms and geometries directly from Python

Accessing GRASS modules

The first layer, the interface to GRASS GIS modules, provides unified, object-oriented access to GRASS GIS modules. By design, modules can be called much as they would be on the command line:

>>> from grass.pygrass.modules import Module
>>> Module('g.region', flags='p')
projection: 99 (Lambert Conformal Conic)
zone:       0
datum:      nad83
ellipsoid:  a=6378137 es=0.006694380022900787
north:      318500
south:      -16000
west:       124000
east:       963000
nsres:      500
ewres:      500
rows:       669
cols:       1678
cells:      1122582
Module('g.region')

This is quite similar to the run_command() function of the GRASS GIS Python scripting library. However, PyGRASS does not provide wrappers such as read_command() and parse_command(), so the developer has to handle output in the same way as at the command line, i.e. by manipulating the standard input and standard output streams.

At the same time, the PyGRASS library makes it possible to treat modules as objects and provides a series of attributes and methods for these objects:

>>> from grass.pygrass.modules import Module
>>> gregion=Module('g.region', flags='g')
>>> gregion
Module('g.region')
>>> gregion.name
'g.region'
>>> gregion.description
'Manages the boundary definitions for the geographic region.'
>>> gregion.run()
n=318500
s=-16000
w=124000
e=963000
nsres=500
ewres=500
rows=669
cols=1678
cells=1122582
Module('g.region')
>>> gregion.inputs['raster'].value='elevation'

In all, this API for module calls makes it possible to integrate modules into object-oriented Python logic and to manage them in more flexible ways, but it also demands a bit more understanding from the programmer concerning output handling.

Accessing low-level C-functions

  • Direct access to low-level C-functions (via ctypes).
  • Direct access to data
  • Allows writing of more complex Python programs than the scripting library
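As a general illustration of the ctypes mechanism mentioned above, and not of any GRASS function, the following sketch calls strlen() from the C standard library directly from Python. It assumes a POSIX-like system where the C library can be located; the wrapper name c_strlen is hypothetical. The GRASS C functions are reached through the same mechanism, with the bindings supplied by GRASS.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library (assumes a POSIX-like system).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the length in bytes of a byte string, computed by C's strlen()."""
    return libc.strlen(text)


print(c_strlen(b"elevation"))
```

Declaring argtypes and restype tells ctypes how to convert values between Python and C, which is the kind of low-level detail a wrapper layer hides from the script author.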

The C-API

  • The core library has been under development for more than 30 years
  • Various additional libraries have been added over time

Easy GUI creation in scripts

Whichever API you choose, when writing GRASS GIS scripts it is very easy to create a user interface, including a graphical user interface, for your script in a semi-automatic way. This interface will then handle all the input (choice of maps, values of parameters, names of outputs, etc.) expected from the user. For details and examples of the use of this function, see the manual page.

Here is an example showing the option descriptions of the v.db.addcolumn module:

#%module
#% description: Adds one or more columns to the attribute table connected to a given vector map.
#% keyword: vector
#% keyword: attribute table
#% keyword: database
#%end

#%option G_OPT_V_MAP
#%end

#%option G_OPT_V_FIELD
#% label: Layer number where to add column(s)
#%end

#%option
#% key: columns
#% type: string
#% label: Name and type of the new column(s) ('name type [,name type, ...]')
#% description: Data types depend on database backend, but all support VARCHAR(), INT, DOUBLE PRECISION and DATE
#% required: yes
#%end

This definition results in the following command line parameters:

> v.db.addcolumn --help

Description:
 Adds one or more columns to the attribute table connected to a given vector map.

Keywords:
 vector, attribute table, database

Usage:
 v.db.addcolumn map=name [layer=string] columns=string [--help]
   [--verbose] [--quiet] [--ui]

Flags:
 --h   Print usage summary
 --v   Verbose module output
 --q   Quiet module output
 --ui  Force launching GUI dialog

Parameters:
      map   Name of vector map
             Or data source for direct OGR access
    layer   Layer number where to add column(s)
             Vector features can have category values in different layers. This number determines which layer to use. When used with direct OGR access this is the layer name.
            default: 1
  columns   Name and type of the new column(s) ('name type [,name type, ...]')
             Data types depend on database backend, but all support VARCHAR(), INT, DOUBLE PRECISION and DATE

The graphical result is this:

GUI interface resulting from above option definitions