GRASS and Python

From GRASS-Wiki
Revision as of 20:59, 3 December 2011 by Neteler (talk | contribs) (+alternative from Glynn)

(for discussions on the new GRASS GUI, see here)

Python SIGs

Python Special Interest Groups are focused collaborative efforts to develop, improve, or maintain specific Python resources. Each SIG has a charter, a coordinator, a mailing list, and a directory on the Python website. SIG membership is informal, defined by subscription to the SIG's mailing list. Anyone can join a SIG and participate in the development discussions via the SIG's mailing list.

See more at http://www.python.org/community/sigs/

Writing Python scripts in GRASS

Python is a programming language which is more powerful than shell scripting but easier and more forgiving than C. A Python script can contain simple module description definitions which will be processed with g.parser, as shown in the example below. In this way, with no extra coding, a GUI can be built, inputs checked, and a skeleton help page generated automatically. In addition, it adds links to the GRASS message translation system. For code which needs access to the power of C, you can access the GRASS C library functions via the Python "ctypes" interface.

GRASS Python Scripting Library

Code style: Have a look at SUBMITTING_PYTHON.

Creating Python scripts that call GRASS functionality from outside

In order to use GRASS from outside, some environment variables have to be set.

MS-Windows

GISBASE= C:\GRASS-64
GISRC= C:\Documents and Settings\user\.grassrc6
LD_LIBRARY_PATH= C:\GRASS-64\lib
PATH= C:\GRASS-64\etc;C:\GRASS-64\etc\python;C:\GRASS-64\lib;C:\GRASS-64\bin;C:\GRASS-64\extralib;C:\GRASS-64\msys\bin;C:\Python26;
PYTHONLIB= C:\Python26
PYTHONPATH= C:\GRASS-64\etc\python
GRASS_SH= C:\GRASS-64\msys\bin\sh.exe

Some hints:

  1. The Python interpreter (python.exe) needs to be in the PATH
  2. Python needs to be associated with the .py extension
  3. PATHEXT needs to include .py if you want to be able to omit the extension
  4. PYTHONPATH needs to be set to %WINGISBASE%\etc\python

1-3 should be taken care of by the Python installer. 4 needs to be done by the startup (currently, this doesn't appear to be the case on MS-Windows).

Note:

Currently (as of 22 Feb 2011) if you want to use Python for scripting GRASS on Windows, the best solution is to delete the bundled version of Python 2.5 from the GRASS installation, install Python and the required add-ons (wxPython, NumPy, PyWin32) from their official installers, then edit the GRASS start-up script to remove any references to the bundled version.

Linux

The variables are set like this:

export GISBASE="/usr/local/grass-6.4.svn/"
export PATH="$PATH:$GISBASE/bin:$GISBASE/scripts"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$GISBASE/lib"
# for parallel session management, we use process ID (PID) as lock file number:
export GIS_LOCK=$$
# path to GRASS settings file
export GISRC="$HOME/.grassrc6"
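
The same variables can also be prepared from Python before launching GRASS commands. The following is only a minimal sketch: the helper name grass_environment() is hypothetical, the paths are the ones from the shell example above, and the lock number is the PID of the Python process rather than of a shell.

```python
import os

def grass_environment(gisbase, gisrc, base_env=None):
    """Return a copy of base_env with the GRASS variables listed above added."""
    env = dict(os.environ if base_env is None else base_env)
    env['GISBASE'] = gisbase
    # append the GRASS program and script directories to the search path
    path_parts = [env.get('PATH', ''),
                  os.path.join(gisbase, 'bin'),
                  os.path.join(gisbase, 'scripts')]
    env['PATH'] = os.pathsep.join(p for p in path_parts if p)
    # append the GRASS library directory to the library search path
    lib_parts = [env.get('LD_LIBRARY_PATH', ''), os.path.join(gisbase, 'lib')]
    env['LD_LIBRARY_PATH'] = os.pathsep.join(p for p in lib_parts if p)
    # the process ID serves as the lock number, as with GIS_LOCK=$$ in the shell
    env['GIS_LOCK'] = str(os.getpid())
    env['GISRC'] = gisrc
    return env

# base_env is given explicitly here only to keep the example reproducible
env = grass_environment('/usr/local/grass-6.4.svn',
                        os.path.expanduser('~/.grassrc6'),
                        base_env={'PATH': '/usr/bin'})
```

The resulting dictionary can be passed as the env argument of subprocess.Popen(), which leaves the environment of the calling script untouched.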

Running external commands from Python

For information on running external commands from Python, see: http://docs.python.org/lib/module-subprocess.html

Avoid using the older os.* functions. Section 17.1.3 lists equivalents using the Popen() interface, which is more robust (particularly on Windows).
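
As a rough illustration of the Popen() style, the sketch below runs a child process and collects its output. The child is the Python interpreter itself (an assumption made only so that the example runs without GRASS); any program given as an argument list works the same way.

```python
import subprocess
import sys

# Old style (discouraged): output = os.popen(command_string).read()
# Popen() style: pass the command as a list, so no shell quoting is needed.
command = [sys.executable, '-c', 'print("hello from a child process")']
process = subprocess.Popen(command,
                           stdout=subprocess.PIPE,
                           universal_newlines=True)  # text instead of bytes
output = process.communicate()[0]   # read all of stdout and wait for exit
returncode = process.returncode
```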

Using the GRASS Python Scripting Library

The code in lib/python/ provides the 'grass.script' package to support GRASS scripts written in Python. The scripts/ directory of GRASS 7 contains a series of examples that are actually shipped to end users (the scripts in GRASS 6 are shell scripts).

Python Scripting Library code details:

  • for GRASS 6: core.py, db.py, raster.py, vector.py, setup.py, array.py
  • for GRASS 7: core.py, db.py, raster.py, vector.py, setup.py, array.py

Examples

Display example

Example of Python script, which is processed by g.parser:

#!/usr/bin/env python
#
############################################################################
#
# MODULE:      d.shadedmap
# AUTHOR(S):   Unknown; updated to GRASS 5.7 by Michael Barton
#              Converted to Python by Glynn Clements
# PURPOSE:     Uses d.his to drape a color raster over a shaded relief map
# COPYRIGHT:   (C) 2004,2008,2009 by the GRASS Development Team
#
#              This program is free software under the GNU General Public
#              License (>=v2). Read the file COPYING that comes with GRASS
#              for details.
#
#############################################################################

#%Module
#% description: Drapes a color raster over a shaded relief map using d.his
#%End
#%option
#% key: reliefmap
#% type: string
#% gisprompt: old,cell,raster
#% description: Name of shaded relief or aspect map
#% required : yes
#%end
#%option
#% key: drapemap
#% type: string
#% gisprompt: old,cell,raster
#% description: Name of raster to drape over relief map
#% required : yes
#%end
#%option
#% key: brighten
#% type: integer
#% description: Percent to brighten
#% options: -99-99
#% answer: 0
#%end

import sys
from grass.script import core as grass

def main():
    drape_map = options['drapemap']
    relief_map = options['reliefmap']
    brighten = options['brighten']
    ret = grass.run_command("d.his", h_map = drape_map,  i_map = relief_map, brighten = brighten)
    sys.exit(ret)

if __name__ == "__main__":
    options, flags = grass.parser()
    main()

Parsing the options and flags

grass.parser() is an interface to g.parser that parses the options and flags passed to your script on the command line. Call it at the top level of the script:

if __name__ == "__main__":
    options, flags = grass.parser()
    main()

The global variables "options" and "flags" are Python dictionaries containing the option and flag values, keyed by lower-case option/flag names. The values in "options" are strings; those in "flags" are Python booleans. Every option and flag has to be declared beforehand in the header of your script.

>>> options, flags = grass.parser()
>>> options
{'input': 'my_map', 'output': 'map_out', 'option1': '21.472', 'option2': ''}
>>> flags
{'c': True, 'm': False}
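
Because every value in "options" is a string (an empty string when the option was not given), numeric options have to be converted explicitly. A minimal sketch, assuming a dictionary shaped like the one above; the helper option_as_float() is hypothetical and not part of grass.script:

```python
def option_as_float(options, key, default=None):
    """Return options[key] as a float, or default if the option was left empty."""
    value = options.get(key, '')
    if value == '':
        return default
    return float(value)

# Sample dictionary in the shape returned by grass.parser() (values are strings)
options = {'input': 'my_map', 'output': 'map_out',
           'option1': '21.472', 'option2': ''}

threshold = option_as_float(options, 'option1')       # converted from '21.472'
fallback = option_as_float(options, 'option2', 1.0)   # option2 is empty, default used
```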

Example for embedding r.mapcalc (map algebra)

grass.mapcalc() accepts a template string followed by keyword arguments for the substitutions, e.g. (code snippets):

grass.mapcalc("${out} = ${rast1} + ${rast2}",
              out = options['output'],
              rast1 = options['raster1'],
              rast2 = options['raster2'])

Best practice: first copy all of the options[] into separate variables at the beginning of main(), i.e.:

def main():
    output = options['output']
    raster1 = options['raster1']
    raster2 = options['raster2']
 
    ...
 
    grass.mapcalc("${out} = ${rast1} + ${rast2}",
                  out = output,
                  rast1 = raster1,
                  rast2 = raster2)

Example for parsing raster category labels

How to obtain the text labels

    # read cats through a pipe to avoid "too many arguments" problem:
    p = grass.pipe_command('r.category', map = rastertmp, fs = ';', quiet = True)
    cats = []
    for line in p.stdout:
        cats.append(line.rstrip('\r\n').split(';')[0])
    p.wait()

    number = len(cats)
    if number < 1:
        grass.fatal(_("No categories found in raster map"))

Example for parsing category numbers

Q: How to obtain the number of cells of a certain category?

A: It is recommended to use pipe_command() and parse the output, e.g.:

       p = grass.pipe_command('r.stats',flags='c',input='map')
       result = {}
       for line in p.stdout:
           val,count = line.strip().split()
           result[int(val)] = int(count)
       p.wait()
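
The parsing step can be tried without a GRASS session. The sketch below uses made-up sample lines in place of p.stdout and assumes that NULL cells are reported with "*" in the first column, which int() cannot convert; the function name parse_cell_counts() is hypothetical.

```python
def parse_cell_counts(lines):
    """Parse "category count" lines into a dict, skipping NULL ("*") rows."""
    result = {}
    for line in lines:
        fields = line.strip().split()
        if len(fields) != 2 or fields[0] == '*':
            continue  # blank or malformed line, or the NULL cell count
        result[int(fields[0])] = int(fields[1])
    return result

# Made-up sample output, standing in for p.stdout
sample = ["1 1205\n", "2 873\n", "* 42\n"]
counts = parse_cell_counts(sample)
```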

Example for getting the region's number of rows and columns

Q: How to obtain the number of rows and columns of the current region?

A: It is recommended to use the "grass.region()" function which will create a dictionary with values for extents and resolution, e.g.:

#!/usr/bin/env python
#-*- coding:utf-8 -*-
#
############################################################################
#
# MODULE:       g.region.resolution
# AUTHOR(S):    based on a post at GRASS-USER mailing list [1]
# PURPOSE:      Parses "g.region -g", prints out number of rows, cols
# COPYLEFT:     ;-)
# COMMENT:      ...a lot of comments to be easy-to-read for/by beginners
#
#############################################################################
#
#%Module
#% description: Print number of rows, cols of current geographic region
#% keywords: region
#%end

# importing required modules
import sys # the sys module [2]
from grass.script import core as grass # the core module [3]

# information about imported modules can be obtained using the dir() function
# e.g.: dir(sys)

# define the "main" function: get number of rows, cols of region
def main():
    
    # #######################################################################
    # the following commented code works but is kept only for learning purposes
     
    ## assigning the output of the command "g.region -g" in a string called "return_rows_x_cols"
    # return_rows_x_cols = grass.read_command('g.region', flags = 'g')
    
    ## parsing arguments of interest (rows, cols) in a dictionary named "rows_x_cols"
    # rows_x_cols = grass.parse_key_val(return_rows_x_cols)
    
    ## selectively print rows, cols from the dictionary "rows_x_cols"
    # print 'rows=%d \ncols=%d' % (int(rows_x_cols['rows']), int(rows_x_cols['cols']))
    
    # #######################################################################
    
    # faster/ easier way: use of the "grass.region()" function
    gregion = grass.region()
    rows = gregion['rows']
    cols = gregion['cols']
    
    # print rows, cols properly formatted
    print 'rows=%d \ncols=%d' % (rows, cols)

# this "if" condition instructs execution of code contained in this script, *only* if the script is being executed directly 
if __name__ == "__main__": # this allows the script to be used as a module in other scripts or as a standalone script
    options, flags = grass.parser() #
    sys.exit(main()) #

# Links
# [1] http://n2.nabble.com/Getting-rows-cols-of-a-region-in-a-script-tp2787474p2787509.html
# [2] http://www.python.org/doc/2.5.2/lib/module-sys.html
# [3] http://download.osgeo.org/grass/grass6_progman/pythonlib.html#pythonCore
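
To see what the commented-out "parse key=value" step does, here is a simplified stand-in that can be run outside GRASS. It is not the library's implementation of grass.parse_key_val(); the function name and the sample text (in the style of "g.region -g" output) are made up for illustration.

```python
def parse_key_val_sketch(text, sep='='):
    """Split "key=value" lines into a dictionary of strings."""
    result = {}
    for line in text.splitlines():
        if sep not in line:
            continue  # ignore lines without a separator
        key, value = line.split(sep, 1)
        result[key.strip()] = value.strip()
    return result

# Made-up sample in the style of "g.region -g" output
sample = ("n=4928010\ns=4913700\nw=589980\ne=609000\n"
          "nsres=30\newres=30\nrows=477\ncols=634\n")
region = parse_key_val_sketch(sample)
# the values are strings, so convert before doing arithmetic
rows, cols = int(region['rows']), int(region['cols'])
```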

Managing mapsets

To check if a certain mapset exists in the active location, use:

       grass.script.mapsets(False)

... returns a list of mapsets in the current location.

r.mapcalc example

The shell script line:

  r.mapcalc "MASK = if(($cloudResampName < 0.01000),1,null())"

would be written like this:

       import grass.script as grass

       ...

       grass.mapcalc("MASK=if(($cloudResampName < 0.01000),1,null())",
                     cloudResampName = cloudResampName)

The first argument to the mapcalc function is a template (see the Python library documentation for string.Template). Any keyword arguments (other than quiet, verbose or overwrite) specify substitutions.
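
The substitution itself can be tried with string.Template directly. This sketch only shows how the placeholders are filled in; the map names are hypothetical and nothing is passed to r.mapcalc.

```python
from string import Template

# $name and ${name} are equivalent placeholder forms
mask_template = Template("MASK=if(($cloudResampName < 0.01000),1,null())")
# 'clouds_resampled' is a hypothetical map name used only for illustration
mask_expression = mask_template.substitute(cloudResampName='clouds_resampled')

sum_template = Template("${out} = ${rast1} + ${rast2}")
sum_expression = sum_template.substitute(out='total', rast1='a', rast2='b')
```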

Using output from GRASS modules in the script

Sometimes you need to use the output of a module for the next step. There are dedicated functions to obtain the result of, for example, a statistical analysis.

Example: get the range of a raster map and use it in r.mapcalc. Here you can use grass.script.raster_info(), e.g.:

       import grass.script as grass

       max = grass.raster_info(inmap)['max']
       grass.mapcalc("$outmap = $inmap / $max",
                     inmap = inmap, outmap = outmap, max = max)

Calling a GRASS module in Python

Suppose you want to execute this command in Python:

  r.profile -g input=mymap output=newfile profile=12244.256,-295112.597,12128.012,-295293.77

All arguments except the first (which is a flag) are keyword arguments, i.e. arg = val. For the flag, use flags = 'g' (note that "-g" would be the negative of a Python variable named "g"!). So:

       grass.run_command(
               'r.profile',
               flags = 'g',
               input = input_map,
               output = output_file,
               profile = [12244.256,-295112.597,12128.012,-295293.77])

or:

               profile = [(12244.256,-295112.597),(12128.012,-295293.77)]

i.e. you need to provide the keyword, and the argument must be a valid Python expression. Functions such as run_command() accept lists and tuples as argument values.
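
The two spellings of the profile argument are equivalent if lists and tuples are flattened into one comma-separated value. The sketch below illustrates that idea only; format_value() and format_argument() are hypothetical helpers, not the library's code.

```python
def format_value(value):
    """Flatten a (possibly nested) list or tuple into a comma-separated string."""
    if isinstance(value, (list, tuple)):
        return ",".join(format_value(item) for item in value)
    return str(value)

def format_argument(key, value):
    """Build one key=value command-line argument."""
    return "%s=%s" % (key, format_value(value))

flat = format_argument('profile',
                       [12244.256, -295112.597, 12128.012, -295293.77])
nested = format_argument('profile',
                         [(12244.256, -295112.597), (12128.012, -295293.77)])
```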

Differences between run_command() and read_command():

  • run_command() executes the command and waits for it to terminate; it doesn't redirect any of the standard streams.
  • read_command() executes the command with stdout redirected to a pipe, and reads everything written to it. Once the command terminates, it returns the data written to stdout as a string.

How to retrieve error messages from read_command():

None of the existing *_command functions redirect stderr. You can do so with e.g.:

def read2_command(*args, **kwargs):
   kwargs['stdout'] = grass.PIPE
   kwargs['stderr'] = grass.PIPE
   ps = grass.start_command(*args, **kwargs)
   return ps.communicate()

This behaves like read_command() except that it returns a tuple of (stdout,stderr) rather than just stdout.
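
The same pattern, written against subprocess directly so that it can be run outside GRASS. This is a sketch under the assumption that a small Python child process stands in for a GRASS module; read_both() is a hypothetical name.

```python
import subprocess
import sys

def read_both(command):
    """Run command and return (stdout, stderr, exit code)."""
    process = subprocess.Popen(command,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE,
                               universal_newlines=True)
    # communicate() drains both pipes, so neither can fill up and block
    out, err = process.communicate()
    return out, err, process.returncode

# The child writes one line to each stream
child = 'import sys; sys.stdout.write("result\\n"); sys.stderr.write("warning\\n")'
out, err, code = read_both([sys.executable, '-c', child])
```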

Percentage output for progress of computation

A) Within a Python script, the grass.script.core.percent() function wraps the g.message -p command.

B) If you call a GRASS command from Python code, set GRASS_MESSAGE_FORMAT=gui in the environment when running the command and parse the progress messages read from the command's stderr, e.g.:

       import os
       import grass.script as grass

       env = os.environ.copy()
       env['GRASS_MESSAGE_FORMAT'] = 'gui'
       p = grass.start_command(..., stderr = grass.PIPE, env = env)
       # read from p.stderr
       p.wait()

If you need to capture both stdout and stderr, you need to use threads, select, or non-blocking I/O to consume data from both streams as it is generated in order to avoid deadlock.

ALTERNATIVE:

Redirect both stdout and stderr to the same pipe (and hope that the normal output doesn't include anything which will be mistaken for progress/error/etc messages):

       p = grass.start_command(..., stdout = grass.PIPE, stderr = grass.STDOUT, env = env)
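
Reading the progress values could then look like the sketch below. It assumes that, in the gui message format, progress is reported on lines of the form "GRASS_INFO_PERCENT: <number>"; the sample lines are made up and stand in for p.stderr (or p.stdout when both streams share one pipe).

```python
import re

# Assumed line format of progress messages in the gui message format
PERCENT_LINE = re.compile(r'^GRASS_INFO_PERCENT:\s*(\d+)')

def progress_values(lines):
    """Yield the integer percentages found in an iterable of output lines."""
    for line in lines:
        match = PERCENT_LINE.match(line)
        if match:
            yield int(match.group(1))

# Made-up sample lines standing in for the command's output stream
sample = ["GRASS_INFO_PERCENT: 0\n",
          "some other message\n",
          "GRASS_INFO_PERCENT: 50\n",
          "GRASS_INFO_PERCENT: 100\n"]
percentages = list(progress_values(sample))
```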

Path to GISDBASE

To avoid hardcoded paths to GRASS mapset files such as the SQLite DB file, you can get the GISDBASE variable from the environment:

       import grass.script as grass
       import os.path

       env = grass.gisenv()

       gisdbase = env['GISDBASE']
       location = env['LOCATION_NAME']
       mapset = env['MAPSET']

       path = os.path.join(gisdbase, location, mapset, 'sqlite.db')

Python extensions for GRASS GIS

wxPython GUI development for GRASS

GRASS Python Scripting Library

See GRASS Python Scripting Library (Programmer's manual). See also Converting Bash scripts to Python, and sample Python scripts in GRASS 7

Uses for read, feed and pipe, start and exec commands

All of the *_command functions use make_command to construct a command line for a program which uses the GRASS parser. Most of them then pass that command line to subprocess.Popen() via start_command(), except for exec_command() which uses os.execvpe().

[To be precise, they use grass.Popen(), which just calls subprocess.Popen() with shell=True on Windows and shell=False otherwise. On Windows, you need to use shell=True to be able to execute scripts (including batch files); shell=False only works with binary executables.]

start_command() separates the arguments into those which subprocess.Popen() understands and the rest. The rest are passed to make_command() to construct a command line which is passed as the "args" parameter to subprocess.Popen().

In other words, start_command() is a GRASS-oriented interface to subprocess.Popen(). It should be suitable for any situation where you would use subprocess.Popen() to execute a normal GRASS command (one which uses the GRASS parser, which is almost all of them; the main exception is r.mapcalc in 6.x).

Most of the others are convenience wrappers around start_command(), for common use cases.

  • run_command() calls the wait() method on the process, so it doesn't return until the command has finished, and returns the command's exit code. Similar to system().
  • pipe_command() calls start_command() with stdout=PIPE and returns the process object. You can use the process' .stdout member to read the command's stdout. Similar to popen(..., "r").
  • feed_command() calls start_command() with stdin=PIPE and returns the process object. You can use the process' .stdin member to write to the command's stdin. Similar to popen(..., "w").
  • read_command() calls pipe_command(), reads the data from the command's stdout, and returns it as a string. Similar to `backticks` in the shell.
  • write_command() calls feed_command(), sends the string specified by the "stdin" argument to the command's stdin, waits for the command to finish and returns its exit code. Similar to "echo ... | command".
  • parse_command() calls read_command() and parses its output as key-value pairs. Useful for obtaining information from g.region, g.proj, r.info, etc.
  • exec_command() doesn't use start_command() but os.execvpe(). This causes the specified command to replace the current program (i.e. the Python script), so exec_command() never returns. Similar to bash's "exec" command. This can be useful if the script is a "wrapper" around a single command, where you construct the command line and execute the command as the final step.

If you have any other questions, you might want to look at the code ($GISBASE/etc/python/grass/script/core.py). Most of these functions are only a few lines long.
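
The relationship between these wrappers can be sketched with plain subprocess calls. The functions below are simplified analogues written for illustration (they take a ready-made argument list and do not build a GRASS command line), and the child processes are Python one-liners so that the sketch runs without GRASS.

```python
import subprocess
import sys

PIPE = subprocess.PIPE

def start(args, **popen_kwargs):
    """Start the process and return immediately (the start_command() role)."""
    return subprocess.Popen(args, universal_newlines=True, **popen_kwargs)

def run(args):
    """Wait for completion and return the exit code (the run_command() role)."""
    return start(args).wait()

def read(args):
    """Return everything written to stdout (the read_command() role)."""
    process = start(args, stdout=PIPE)
    return process.communicate()[0]

def write(args, text):
    """Send text to stdin and return the exit code (the write_command() role)."""
    process = start(args, stdin=PIPE)
    process.communicate(text)
    return process.returncode

# Child processes written in Python so the sketch runs without GRASS
echo = [sys.executable, '-c', 'print("42")']
count_chars = [sys.executable, '-c',
               'import sys; sys.exit(len(sys.stdin.read()))']

exit_code = run(echo)                    # output goes to the terminal
captured = read(echo)                    # output is returned as a string
stdin_length = write(count_chars, 'abc') # child exits with the input length
```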

Interfacing with NumPy

The grass.script.array module defines a class array, a subclass of numpy.memmap with .read() and .write() methods that read/write the underlying file via r.out.bin/r.in.bin. Metadata can be read with raster_info() from the grass.script.raster module.

Example:

import grass.script as grass
import grass.script.array as garray

def main():
    map = "elevation.dem"

    # read map
    a = garray.array()
    a.read(map)

    # get raster map info
    i = grass.raster_info(map)
    print i['datatype']
    
    # get computational region info
    c = grass.region()
    print "rows: %d" % c['rows']
    print "cols: %d" % c['cols']

    # new array for result
    b = garray.array()
    # calculate new map from input map and store as GRASS raster map
    b[...] = (a / 50).astype(int) * 50
    b.write("elev.50m")

The size of the array is taken from the current region (computational region).

The main drawback of using numpy is that you're limited by available memory. Using a subclass of numpy.memmap lets you use files which may be much larger, but processing the entire array in one go is likely to produce in-memory results of a similar size.

Interfacing with NumPy and SciPy

SciPy offers simple access to complex calculations. Example:

from scipy import stats
import grass.script as grass
import grass.script.array as garray

def main():
    map = "elevation.dem"

    x = garray.array()
    x.read(map)

    # Descriptive Statistics:
    print "max, min, mean, var:"
    print x.max(), x.min(), x.mean(), x.var()
    print "Skewness test: z-score and 2-sided p-value:"
    print stats.skewtest(x, axis=None)

Interfacing with NumPy, SciPy and Matlab

One may also use the SciPy - Matlab interface:

    ### SH: in GRASS ###
    r.out.mat input=elevation output=elev.mat

    ### PY ###
    import scipy.io as sio
    import pylab

    # load data
    elev = sio.loadmat('elev.mat')
    # retrieve the actual array; the data set also contains the spatial reference
    data = elev.get('map_data')
    # a first simple plot
    pylab.plot(data)
    pylab.show()
    # the contour plot
    pylab.contour(data)
    # the row order of the data needs to be reversed
    data_rev = data[::-1]
    pylab.contour(data_rev)
    # => this is a quick plot. basemap mapping may provide a nicer map!
    #######

Testing and installing Python extensions

Debugging

Make sure the script is executable:

   chmod +x /path/to/my.extension.py

During development, a Python script can be debugged using the Python Debugger (pdb):

   python -m pdb /path/to/my.extension.py input=my_input_layer output=my_output_layer option=value -f

Installation

Once you're happy with your script, you can put it in the scripts/ folder of your GRASS install. To do so, first create a directory named after your extension, then create a Makefile and an HTML man page for it:

   cd /path/to/grass_src/
   cd scripts
   ls # It is useful to check out the existing scripts and their structure
   mkdir my.extension
   cd my.extension
   cp /path/to/my.extension.py .
   touch my.extension.html
   touch Makefile

The next step is to edit the Makefile. It is a very simple text file; the only thing to check is that the right extension name (WITHOUT the .py file extension) follows PGM:

   MODULE_TOPDIR = ../..
   
   PGM = my.extension
   
   include $(MODULE_TOPDIR)/include/Make/Script.make
   
   default: script

The HTML file is generated automatically. If you want to add more detail to it, you can (just make sure you start at DESCRIPTION; see the existing scripts).

You can then run "make" within the my.extension folder. This places the resulting files in the staging directory (/path/to/grass_src/dist.<YOUR_ARCH>/). If you're running GRASS from the staging directory (/path/to/grass_src/bin.<YOUR_ARCH>/grass7), subsequent commands will use the updated files.

   # in your extension directory (/path/to/grass_src/scripts/my.extension/)
   make
   # Starting GRASS from the staging directory
   /path/to/grass_src/bin.<YOUR_ARCH>/grass7
   my.extension help

You can also run "make install" from the top level directory of your GRASS install (say /usr/local/src/grass_trunk/). Running "make install" from the top level just copies the whole of the dist.<YOUR_ARCH>/ directory to the installation directory (e.g. /usr/local/grass70) and the bin.<YOUR_ARCH>/grass70 bin file to the bin directory (e.g. /usr/local/bin), and fixes any embedded paths in scripts and configuration files.

   cd /path/to/grass_src
   make install
   # Starting GRASS as usual would work and show your extension available
   grass7
   my.extension help

Python Ctypes Interface

This interface allows calling GRASS library functions from Python scripts. See Python Ctypes Examples for details.

Examples:


Sample script for GRASS 6 raster access (use within GRASS, Spearfish session):

#!/usr/bin/env python

## TODO: update example to Ctypes

import os, sys
from grass.lib import grass

if "GISBASE" not in os.environ:
    print "You must be in GRASS GIS to run this program."
    sys.exit(1)

if len(sys.argv)==2:
  input = sys.argv[1]
else:
  input = raw_input("Raster Map Name? ")

# initialize
grass.G_gisinit('')

# find map in search path
mapset = grass.G_find_cell2(input, '')

# determine the input map type (CELL/FCELL/DCELL)
data_type = grass.G_raster_map_type(input, mapset)

infd = grass.G_open_cell_old(input, mapset)
inrast = grass.G_allocate_raster_buf(data_type)

rown = 0
while True:
    myrow = grass.G_get_raster_row(infd, inrast, rown, data_type)
    print rown, myrow[0:10]
    rown += 1
    if rown == 476:
        break

grass.G_close_cell(infd)
grass.G_free(inrast)

Sample script for vector access (use within GRASS, Spearfish session):

#!/usr/bin/python

# run within GRASS Spearfish session
# run this before starting python to append module search path:
#   export PYTHONPATH=/usr/src/grass70/swig/python
#   check with "import sys; sys.path"
# or:
#   sys.path.append("/usr/src/grass70/swig/python")
# FIXME: install the grass bindings in $GISBASE/lib/ ?

import os, sys
from grass.lib import grass
from grass.lib import vector as grassvect

if "GISBASE" not in os.environ:
    print "You must be in GRASS GIS to run this program."
    sys.exit(1)

if len(sys.argv)==2:
  input = sys.argv[1]
else:
  input = raw_input("Vector Map Name? ")

# initialize
grass.G_gisinit('')

# find map in search path
mapset = grass.G_find_vector2(input,'')

# define map structure
map = grassvect.Map_info()

# define open level (level 2: topology)
grassvect.Vect_set_open_level (2)

# open existing map
grassvect.Vect_open_old(map, input, mapset)

# query
print 'Vect map: ', input
print 'Vect is 3D: ', grassvect.Vect_is_3d (map)
print 'Vect DB links: ', grassvect.Vect_get_num_dblinks(map)
print 'Map Scale:  1:', grassvect.Vect_get_scale(map)
print 'Number of areas:', grassvect.Vect_get_num_areas(map)

# close map
grassvect.Vect_close(map)

Python-GRASS add-ons

Stand-alone addons:

  1. Jáchym Čepický's G-ps.map, a GUI to typeset printable maps with ps.map (http://193.84.38.2/~jachym/index.py?cat=gpsmap)
  2. Jáchym Čepický's v.pydigit, a GUI to v.edit (http://les-ejk.cz/?cat=vpydigit)
  3. Jáchym Čepický's PyWPS, GRASS-Web Processing Service (http://pywps.wald.intevation.org)

Using GRASS gui.tcl in Python

Here is some example code that uses the automatically generated GRASS GUIs from Python. This could (and should) all be bundled up and abstracted away so that the implementation can be replaced later.

import Tkinter
import os

# Startup (once):

tk = Tkinter.Tk()
tk.eval ("wm withdraw .")
tk.eval ("source $env(GISBASE)/etc/gui.tcl")
# Here you could do various things to change what the gui does
# See gui.tcl and README.GUI

# Make a gui (per dialog)
# This sets up a window for the command.
# This can be different to integrate with tkinter:
tk.eval ('set path ".dialog$dlg"')
tk.eval ('toplevel .dialog$dlg')
# Load the code for this command:
fd = os.popen ("d.vect --tcltk")
gui = fd.read()
# Run it
tk.eval(gui)
dlg = tk.eval('set dlg') # This is used later to get and set the command

# Get the current command in the gui we just made:
currentcommand = tk.eval ("dialog_get_command " + dlg)

# Set the command in the dialog we just made:
tk.eval ("dialog_set_command " + dlg + " {d.vect map=roads}")

FAQ

  • Q: Error message "execl() failed: Permission denied" - what to do?
A: Be sure that the execute bit of the script is set.
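
The execute bit can also be checked and set from Python. A minimal sketch for POSIX systems; ensure_executable() is a hypothetical helper, and the temporary file only stands in for a real script.

```python
import os
import stat
import tempfile

def ensure_executable(path):
    """Add the owner's execute bit if it is missing (like chmod u+x)."""
    mode = os.stat(path).st_mode
    if not mode & stat.S_IXUSR:
        os.chmod(path, mode | stat.S_IXUSR)
    # report whether the owner's execute bit is set afterwards
    return bool(os.stat(path).st_mode & stat.S_IXUSR)

# A temporary file stands in for the script; it is created without execute bits
handle, script_path = tempfile.mkstemp(suffix='.py')
os.close(handle)
is_executable = ensure_executable(script_path)
os.remove(script_path)
```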

Links

General guides

Programming

  • Python and Statistics:
    • RPy - Python interface to the R-statistics programming language

Presentations

From FOSS4G2006: