GRASS GIS for ArcGIS users

From GRASS-Wiki
Jump to navigation Jump to search

This is a guide for people who want to migrate from ArcGIS to GRASS GIS or just want to learn GRASS GIS and are already familiar with ArcGIS.

Projections

ArcGIS supports on-the-fly projection of spatial data while GRASS GIS considers this as a bad practice and requires users to have consistent projection information for all the data entering the analysis. This is one of the ways how GRASS GIS helps user to manage the integrity of their data. The other ways are keeping the geospatial data organized and requiring the data to be in a one format, e.g. to store vector data in the GRASS topological format. However, for raster data there is for example the r.external module which allows to link (rather than import) and use data in different formats as long as they are in the same projection.

When data are in different projection they can be imported into GRASS GIS using r.import and v.import modules. When the data are already in another GRASS Location which has different projection than we want, we can create a new Location with the desired projection and bring the data there using r.proj and v.proj.

Once we import our data into GRASS GIS database with the right projection, we don't have to deal with the projection conversions and or inconsistencies and we can focus on the analysis. GRASS GIS analytical modules won't ask us to provide projection for the output dataset which is a common thing for some ArcGIS tools.

Data, databases and file formats

In ArcGIS, users often have data in different directories on disk. Basically all geospatial data for GRASS GIS are in the GRASS GIS database (a folder on a disk or network drive). This database is divided into Locations and Locations are further divided into Mapsets. When GRASS GIS is started, it connects to database, Location and Mapset as specified by the user.

By this GRASS GIS is able to apply a specific system to organize the data. The system is designed to accommodate large projects and even multiple users in a Location but it is also useful for individuals who just want to keep their data well organized. First, data must be in one directory called GRASS GIS database directory. You can have one or more of these directories on your disk. This directory contains GRASS Locations. All data in one Location have the same projection (coordinate system, datum). A Location is a directory which contains GRASS Mapsets. A Mapset contains raster and vector maps (layers) and other geospatial data.

To avoid unwanted changes in the existing data, GRASS GIS modules handle the creation, modification, change, copying or deletion of those data stored the current Mapset (i.e., in the actually used session). When needed, the data can be copied from other Mapsets into the current Mapset using g.copy, for example when one needs to change attributes of a vector map not belonging to him-/herself.

However, the data from the other Mapsets are available for reading, so you can view and use them in the analysis. By default, you see data only in the current Mapset and in a special Mapset called PERMANENT which must be present in every Location. If desired, you can view the data from other Mapsets (in the current Location) using Settings > GRASS working environment > Mapset access or using module g.mapsets. Data in other Locations are accessible only if you start GRASS GIS in the given Location. Accessing data from multiple Location wouldn't be safe because the data in different Locations are (or can be) in different projections (while all Mapsets in one Location contains data in one particular projection).

You can also link data from different sources such as GeoTIFF or PostGIS database.

The system used in GRASS GIS results in a state when it is always defined where the new data will be created (the current Mapset). All data are in one projection and in consistent formats. The result is that once the data are imported, one can focus on analysis, not data formats.

Users can select different strategies how to use the database directory, Locations and Mapsets. Often users have one GRASS GIS database directory called grassdata. Then every Location is a bigger project or a area of interest. Mapsets are then smaller projects or individual tasks. PERMANENT Mapset is used to keep common data in one place.

GRASS GIS Location or Mapset are in certain way similar to File Geodatabase or Personal Geodatabase as they store all data inside them in a given format.

Cartography

ArcGIS has two modes of the display, one is a standard one for viewing and and interacting with the data. The other is a layout mode which is a design tool for creation of hardcopy maps. The Map Display in GRASS GIS main graphical user interface (GUI) is also meant for interaction but it contains basic cartography features as well. Advanced cartography in GRASS GIS is done in separate application (available from GUI) called Cartographic Composer or g.gui.psmap for starting from command line. This application is a GUI frontend for the ps.map module. Some GRASS GIS users prefer the standard display because it is convenient and easily scriptable, some users prefer ps.map because some other combine the standard display with professional open source graphics editor Inkscape, and some others use QGIS for high-quality cartography tasks.

The standard display in GRASS GIS is well suited for standard viewing and exploration of geospatial data. It is also usable for creating basic cartography outputs, e.g. for scientific publications, since basic map elements such as scale bar or raster legend can be displayed directly. This functionality is available in the main GUI in Map Display and is controlled also in Layer Manager. The functionality is also available through command line interface, so for example:

d.rast elevation

will show elevation raster map in the display. Commands like this one can be used either to control the display in GUI, to control independent graphical display, or to automate creation of hardcopy maps. For the last two options d.mon module is used. For example,

d.mon wx0

opens a new window which looks like Map Display in the main GUI but it is mostly controlled by display modules (d.*) such as aforementioned d.rast. Additionally, d.mon can be used to create images directly, for example the following series of commands will create a PNG image of elevation raster map with streams vector on top of them:

 g.region raster=elevation
 d.mon start=cairo output=map.png
 d.rast map=elevation
 d.vect map=streams
 d.mon stop=cairo

This can be rewritten to Python as follows:

import grass.script as script
gscript.run_command('g.region', raster='elevation')
gscript.run_command('d.mon', start='cairo', output='map.png')
gscript.run_command('d.rast', map='elevation')
gscript.run_command('d.vect', map='streams')
gscript.run_command('d.mon', stop='cairo')

When working in command line or a script, one can also take an advantage of d.frame module which makes layout of different maps or map elements easier.

Even when not using Python or command line one can still take an advantage of the command line interface because the commands are shown in the GUI, can be copied and then saved or shared for later use. For example, one can set a style in GUI, copy the command and get something like the following line:

d.vect map=streets color=255:165:0 width_column=SPEED width_scale=0.07

This can be easily stored in a text file, shared through email or included in a study material for a class. Later this command can be used directly in GUI to get the layer in exactly same style.

The standard display has its limitation but they can often overcome. For example, lines can only have one color, so when we want to display borders of line with different color we need to show the line twice, once as a thick line with desired border color and once as a little bit thinner line with desired inner color:

d.vect map=streets color=255:204:109 width=4
d.vect map=streets color=255:165:0 width=8
Line with border created using two d.vect commands

Some users also often combine standard display with vector graphics applications such as Inkscape because this gives them all power of an actual graphics program. When automating map creation in Python, multiple created images can be combined and further edited using Python imaging package PIL (or Pillow) or ImageMagic, a command line image manipulation suite.

3D visualization

In GRASS GIS, the 3D view is integrated into the main graphical user interface (GUI) while in ArcGIS suite, there is a separate tool ArcScene. The GRASS GIS library which is behind the 3D view is called NVIZ. The integration of NVIZ into GUI which is called wxGUI is called wxNVIZ. The 3D visualization is also available as a module called m.nviz.image (usable from command line or Python).

In GRASS GIS, when you are in 2D and you switch to 3D, all raster map layers are automatically added to the 3D view as surfaces. This means that if you have in 2D a digital elevation model, you will see it as a 3D visualization of terrain in 3D. The colors set will be the same colors as in 2D. This is different from ArcScene where the raster layer is initially flat and you have to specify in properties which raster should be used to create the surface.

Raster algebra

Current ArcGIS raster calculator is implemented in Python. Some parts of the syntax resembles Python while some others differ from it. In GRASS GIS the map algebra is available through r.mapcalc module (implemented in C) or its convenient GUI wrapper Raster Map Calculator. The r.mapcalc syntax is specifically designed for raster map algebra and is based on C syntax (Python is C-like language as well). Both syntaxes are case sensitive as it is common for C-like languages.

Quotes

In ArcGIS map algebra layer names must be quoted. In GRASS GIS, the quotes are optional and usually not used. There is few cases where it is necessary to use quotes and that is when raster map name contains dashes (which would be interpreted as minus operator). However, best practice is not to use dashes in map names at all (since map name should be ideally also usable without quoting in SQL). When writing Python script and you want it to be really robust. Then you may want to use quoting.

Whitespace

In ArcGIS map algebra operators must be surrounded by spaces. In GRASS GIS, the spaces around operators are optional, however it is a best practice to use them (same as in Python or C).

Operators

ArcGIS GRASS GIS Notes
& && Boolean (logical) AND operator is one ampersand (&) in ArcGIS. In GRASS GIS, boolean AND operator is two ampersands (&&) which is the same as in C. Note that & means bitwise AND operator in GRASS GIS (as well as in Python or C). There is also &&& in GRASS GIS which doesn't propagate null values (which is the standard behavior) but instead, treats them as false.
| || Difference between ArcGIS and GRASS GIS in boolean (logical) OR operator is the same as in the case of boolean AND operator. In GRASS GIS there is bitwise | and boolean (logical) || and |||. The || operator preserves nulls (as expected by default) and the ||| operator evaluates null values as false which is often advantageous.
~ ! In GRASS GIS the boolean NOT operator is (one) exclamation mark (!) which is the same as in C. The ArcGIS equivalent is tilde (~) while in GRASS GIS tilde is a bitwise NOT operator (one's complement) which is the same meaning as in C or Python.
^ xor() A caret (^) in ArcGIS means boolean exclusive OR (XOR) operator while in GRASS GIS it means exponentiation (an arithmetic operation). Boolean exclusive OR is done using a function xor() in GRASS GIS.

ArcGIS map algebra expression for the Raster Calculator in GUI for computing NDVI (output name is entered separately):

Float("lsat7_40" - "lsat7_30") / Float("lsat7_40" + "lsat7_30")

The same expression in GRASS GIS also for the Raster Calculator in GUI (output name entered separately as well):

float(lsat7_2002_40 - lsat7_2002_30) / float(lsat7_2002_40 + lsat7_2002_30)

Again the same expression but written as a module call for the command line:

r.mapcalc "ndvi = float(lsat7_2002_40 - lsat7_2002_30) / float(lsat7_2002_40 + lsat7_2002_30)"

See list of all available operators in GRASS GIS in r.mapcalc manual.

Functions

ArcGIS GRASS GIS Notes
Con(a, b, c), Con(a, b, c, d) if(a, b, c) The most used conditional statement (if-statement) in both ArcGIS and GRASS GIS has raster algebra is a function with three parameters first is the conditional expression (condition), second is value to be used when the condition is fulfilled ("if-part") and third one is the value to be used when the condition is not fulfilled ("else-part"). ArcGIS also supports a syntax with four parameters where the first one is just the raster and the fourth one is SQL conditional expression. GRASS GIS has also a version with four parameters but it is a convenience for the cases where we need to decide among three values based on the the first parameter (raster or expression) being lesser than zero, zero or greater than zero.
Con(a, b) if(a, b, null()) In ArcGIS, the expression Con(a, b) will return NoData (NULL) if a is true (or non-zero). In GRASS GIS, the expression if(a, b) would return 0 (a number zero) in this case. So, we explicitly specify the value null() for the case when a is false (or zero) using third parameter. The general syntax is if(a, b, c) and it is the preferred one because it is explicit and thus one doesn't need to remember the special behavior of the other versions when reading the expression.
SetNull(a, b), SetNull(a, b, c) if(a, null(), b) ArcGIS function SetNull returns NoData (NULL) when first parameter is true. The optional third parameter is SQL query which can be used instead of the expression (the first parameter). The GRASS GIS function for returning NULL when condition is not fulfilled is standard if function used with three parameters where second one is null().
IsNull() isnull()
Int() int()
Float() float(), double() GRASS GIS has two floating point numeric types. First is single precision floating point number (float) with corresponding raster map type is FCELL. Second is double precision floating point number (double) with corresponding raster map type is DCELL.

An expression in ArcGIS to use landclass value when landclass is 1 or 2 and null (NoData) value otherwise:

Con( ("landclass" == 1) | ("landclass" == 2), "landclass", null())

The same expression rewritten to GRASS GIS map algebra is:

if(landclass == 1 || landclass == 2, landclass, null())

The expression we used assumes usage in GUI. In command line, used in r.mapcalc module call, we would start it with the name of the output and the whole expression would be in quotes (as command line requires).

An expression in ArcGIS to use values from landclass raster when lakes contains null (NoData) values and use values from lakes raster otherwise:

Con(IsNull("lakes"), "landclass", "lakes")

The expression now rewritten to GRASS GIS syntax as it would be used in command line:

r.mapcalc "land = if(isnull(lakes), landclass, lakes)"

In this case without the expression are very similar. The difference is lowercase isnull, name Con versus if. For the command line, we add also r.mapcalc "land = at the beginning and a quote (") at the end.

See list of all available functions in GRASS GIS in r.mapcalc manual.

Further notes

Note that there are different algebras in GRASS GIS, namely it is 3D raster algebra which the same as the standard 2D one with few differences to accommodate 3D rasters and then it is temporal algebra which contains a lot of additional syntax to work with spatio-temporal data. There is also a module supporting vector map algebra in GRASS GIS Addons repository.

Python interface

Importing toolboxes and checking a license

Python script in ArcGIS need to check whether the given extension is available, i.e. whether the license to use it is OK. In GRASS GIS, all available modules are included in the standard installation, so there is no need for checking if the extension is present or whether we have a license to use it. Consequently, the following doesn't need any equivalent in GRASS GIS:

# check if we have the Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

In ArcGIS, third party tools can be used in a Python script after calling ImportToolbox() function. In GRASS GIS, the 3rd party tools (modules or sets of modules) are typically installed ahead using g.extension module. So, the following is usually not done in Python scripts for GRASS GIS:

arcpy.ImportToolbox("d:/toolsdir/sometools.tbx")

However, any module in GRASS GIS can be called from Python, so if we would like to install some tools or just ensure that they are installed and updated, we can call g.extension from Python:

gscript.run_command('g.extension', extension='v.sometool')

The g.extension module can install modules (extension, addons) from different sources. However, the most common source is the official GRASS GIS Addons repository which contains user wide range of contributed modules. See the g.extension manual for your version of GRASS GIS for more details.

Cost analysis

Cost surface is defined differently in GRASS GIS and ArcGIS. Cost in GRASS GIS is cost to cross a cell while in ArcGIS it is cost to cross one map unit. So for example we define cost as time and we have 30 m cells and speed is 5 m/s. Then the cost value of the cell is 6 s in GRASS GIS and 0.2 s in ArcGIS.

Flow Directions

Like the ArcGIS FLOWDIRECTION function, GRASS can also create a flow direction raster, giving each cell an integer value representing one of the eight directions of flow out of the cell. Flow direction in both programs is determined by finding which of the surrounding cells has the lowest elevation value. However, the values that each software uses are different. In ArcGIS, the flow directions are numbered from east clockwise, and each direction is a power of 2 higher than the previous. Thus, when the flow direction is east out of a cell, that cell is given the value 1. When south-east the value is 2. Then, a cell with flow direction going south gets the value 4, and so on until north-east, which becomes 128. (Details of this procedure are presented in the ArcGIS Resources website)

GRASS numbers the flow directions counter-clock wise with north-east getting the value 1, north is 2, north-west is assigned 3, west becomes 4 and so on until 8 (east). Furthermore, two additional differences between the GRASS implementation and the Arc implementation should be pointed out.

  • What happens when two or more surrounding cells have the same elevation, and flow direction is ambiguous? GRASS uses a Least Cost Path searching algorithm to "look ahead" beyond the adjacent surrounding cells, to find the overall trend. This process gives more realistic results in flat areas. Similarly the GRASS procedure deals well with sinks, without requiring a "Fill Sinks" pre-processing step. ArcGIS on the other hand requires a DEM with no sinks, and cells with ambiguous flow direction are given a value as the sum of the ambiguous directions.
  • What happens at the region edge when flow direction goes "off the map"? In this case GRASS chooses a flow direction but sets its value as negative. ArcGIS does not indicate in any special way that the flow goes out of the region. (Although there is a flag in the ArcGIS function to force edge cells to flow off the map)

Suppose you need to convert a GRASS flow direction raster to the ArcGIS numbering scheme. This is achieved simply by defining a table of reclass values, and running the GRASS r.reclass module. Here's the reclass table:

1 -1 = 128
2 -2 =64
3 -3 =32
4 -4 =16
5 -5 =8
6 -6 =4
7 -7 =2
8 -8 =1
0 =255

(Described more fully in Flow Directions from GRASS to ArcGIS)

See also

External links