From GRASS GIS novice to power user (workshop at FOSS4G Boston 2017)

Revision as of 14:26, 20 July 2017


Description: Do you want to use GRASS GIS, but never understood what location and mapset are? Do you struggle with the computational region? Or perhaps you have used GRASS GIS already but wonder what g.region -a does? Maybe you were never comfortable with the GRASS command line? In this workshop, we will explain and practice all these functions and answer questions that more advanced users may have. We will help you decide when to use the graphical user interface and when to use the power of the command line. We will go through simple examples of vector, raster, and image processing functionality, and we will try a couple of new and old tools, such as vector network analysis or image segmentation, which might be the reason you want to use GRASS GIS. We aim this workshop at absolute beginners without prior knowledge of GRASS GIS, but we hope it can also be useful to current users looking for a deeper understanding of basic concepts and to the curious ones who want to try the latest additions to GRASS GIS.

Format and requirements: Participants should bring their laptops with GRASS GIS 7 installed. Beginners are encouraged to use the latest OSGeo-Live virtual machine instead. There are no other special requirements.

Authors: Anna Petrasova from North Carolina State University (NCSU), Giuseppe Amatulli from Yale University, and Vaclav Petras from NCSU

Preparation

Software

OSGeo-Live

All needed software is included.

Ubuntu

Install GRASS GIS from packages:

sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo apt-get update
sudo apt-get install grass

Linux

For Linux distributions other than Ubuntu, look for GRASS GIS in the distribution's package manager.

MS Windows

Download the standalone GRASS GIS binaries from grass.osgeo.org.

Mac OS

Install GRASS GIS using Homebrew osgeo4mac:

brew tap osgeo/osgeo4mac
brew install grass7

Note that the 3D view may not be accessible.

Data

GRASS GIS introduction

Here we provide an overview of the GRASS GIS project (grass.osgeo.org) for first-time users. A full understanding of GRASS GIS is not necessary; we introduce only the main concepts needed for running the tutorial:


Structure of the GRASS GIS Spatial Database

GRASS uses unique database terminology and structure (GRASS database) that are important to understand for the set up of this tutorial, as you will need to place the required data (e.g. Location) in a specific GRASS database. In the following we review important terminology and give step by step directions on how to download and place your data in the correct location.


  • A GRASS GIS Spatial Database (GRASS database) consists of a directory with specific Locations (projects) where data (data layers/maps) are stored.
  • A Location is a directory with data related to one geographic location or a project. All data within one Location have the same coordinate reference system.
  • A Mapset is a collection of maps within a Location, containing data related to a specific task, user or a smaller project.

Creating a GRASS database for the tutorial

If you are running GRASS GIS from OSGeo-Live, you already have the nc_spm_08_grass7 dataset (sample GRASS Location for North Carolina). Otherwise, please download the sample GRASS Location for North Carolina, noting where the files are located. Now create (unless you already have it) a directory named grassdata (GRASS database) in your home folder (or Documents) and unzip the downloaded data into this directory. You should now have a Location nc_spm_08_grass7 in grassdata.

Displaying and exploring data

Now that we have the data in the correct GRASS database, we can launch the Graphical User Interface (GUI) in Mapset user1.

The GUI allows you to display raster and vector data and to navigate by zooming in and out. More advanced exploration and visualization is also possible using, e.g., queries and legends. The screenshots below depict how you can add different map layers (left) and display the metadata of your data layers.

To get a better overview of our raster and vector data, we can use the Data tab in the Layer Manager. From there we can search for data by name, and when we right click on an item, we can select from different actions. In this way we can easily copy or remove data, add them to the display, or switch between mapsets. Note that by default, for safety reasons, you can modify data only in the current location and mapset. By unlocking, you allow editing of other mapsets as well.

GRASS GIS modules

One of the advantages of GRASS is the diversity and number of modules that let you analyze all manner of spatial and temporal data. GRASS GIS has over 500 different modules in the core distribution and over 230 addon modules that can be used to prepare and analyze data layers.

GRASS functionality is available through modules (tools, functions). Modules respect the following naming conventions:

Prefix   Function                   Example
r.*      raster processing          r.mapcalc: map algebra
v.*      vector processing          v.clean: topological cleaning
i.*      imagery processing         i.segment: object recognition
db.*     database management        db.select: select values from table
r3.*     3D raster processing       r3.stats: 3D raster statistics
t.*      temporal data processing   t.rast.aggregate: temporal aggregation
g.*      general data management    g.rename: renames map
d.*      display                    d.rast: display raster map

These are the main groups of modules. There are a few more for specific purposes. Note also that some modules have multiple dots in their names. This often suggests further grouping. For example, modules starting with v.net. deal with vector network analysis.

The name of a module helps to understand its function. For example, v.in.lidar starts with v, so it deals with vector maps; the following in indicates that the module imports data into the GRASS GIS Spatial Database; and lidar indicates that it deals with lidar point clouds.
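
The naming convention can be sketched in a few lines of Python. This is only an illustration: the PREFIXES dictionary is copied from the table above, and describe_module is a hypothetical helper, not part of GRASS GIS.

```python
# Prefix table copied from the naming conventions listed above.
PREFIXES = {
    "r": "raster processing",
    "v": "vector processing",
    "i": "imagery processing",
    "db": "database management",
    "r3": "3D raster processing",
    "t": "temporal data processing",
    "g": "general data management",
    "d": "display",
}


def describe_module(name):
    """Split a module name at its dots (hypothetical helper).

    Returns the group description for the prefix and the list of the
    remaining name parts.
    """
    prefix, _, rest = name.partition(".")
    group = PREFIXES.get(prefix, "other (not in the table above)")
    parts = rest.split(".") if rest else []
    return group, parts


print(describe_module("v.in.lidar"))
print(describe_module("r3.stats"))
```

For v.in.lidar the helper reports the vector group and the parts in and lidar, matching the reading of the name given above.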

Finding and running a module

To find a module for your analysis, type the term into the search box in the Modules tab of the Layer Manager, then keep pressing Enter until you find your module.

Alternatively, you can just browse through the module tree in the Modules tab. You can also browse through the main menu. For example, to find information about a raster map, use: Raster → Reports and statistics → Basic raster metadata.

Running a module as a command

If you already know the name of the module, you can just use it in the command line. The GUI offers a Command console tab with a command line built specifically for running GRASS GIS modules. If you type a module name there, you will get suggestions for automatic completion of the name. After pressing Enter, you will get the GUI dialog for the module.

You can also use the command line to run whole commands, for example when the instructions give you a command, i.e. a module and a list of parameters.

Command line vs. GUI interface

GRASS modules can be executed either through the GUI or through the command line interface. The GUI offers a user-friendly approach to executing modules, where the user can navigate to the data layers they would like to analyze and modify processing options with simple check boxes. The GUI also offers an easily accessible manual on how to execute a module. The command line interface allows users to execute a module by typing its name and parameters. This is handy when you are running similar analyses with minor modifications or when you are familiar with the module commands and want quick, efficient processing. In this workshop we provide commands that can be copied and pasted into the command line for our workflow, but you can use both GUI and command line depending on personal preference. Look at how the GUI and the command line interface represent the same tool.

Module parameters

The same analysis can be done using the following command:

r.neighbors -c input=elevation output=elev_smooth size=5

Conversely, you can fill the GUI dialog parameter by parameter when you have the command.

Computational region

Before we use a module to compute a new raster map, we must properly set the computational region. All raster computations will be performed in the specified extent and with the given resolution.

The computational region is an important raster concept in GRASS GIS. Setting it to a subset of a larger dataset allows quicker testing of an analysis or an analysis of specific regions, for example based on administrative units. Here are a few points to keep in mind about the computational region:

  • it is defined by region extent and raster resolution
  • it applies to all raster operations
  • it persists between GRASS sessions and can be different for different mapsets
  • advantages: it keeps your results consistent and avoids clipping; for computationally demanding tasks, set the region to a smaller extent, check that your result is good, then set the computational region to the entire study area and rerun the analysis
  • run g.region -p or use the menu Settings - Region - Display region to see the current region settings

The numeric values of computational region can be checked using:

g.region -p

After executing the command you will get something like this:

north:      220750
south:      220000
west:       638300
east:       639000
nsres:      1
ewres:      1
rows:       750
cols:       700
cells:      525000
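
The printed values are related by simple arithmetic: the number of rows and columns follows from the extent and the resolution. A minimal sketch using the numbers from the output above (grid_size is a hypothetical helper, not a GRASS function):

```python
def grid_size(north, south, east, west, nsres, ewres):
    """Return (rows, cols, cells) for a region extent and resolution."""
    rows = int(round((north - south) / float(nsres)))
    cols = int(round((east - west) / float(ewres)))
    return rows, cols, rows * cols


# values taken from the g.region -p output shown above
rows, cols, cells = grid_size(north=220750, south=220000,
                              east=639000, west=638300,
                              nsres=1, ewres=1)
print("rows=%d cols=%d cells=%d" % (rows, cols, cells))
```

This shows why halving the resolution value quadruples the number of cells, and with it the processing time of most raster modules.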

The computational region can also be set using a vector map. In that case, only the extent is set (as a vector map does not have any resolution, at least not in the way a raster map does). In the GUI, this can be done in the same way as for a raster map. In the command line, it looks like this:

g.region vector=lakes

Resolution can be set separately using the res parameter of the g.region module. The units are the units of the current location, in our case meters. This can be done in the Resolution tab of the g.region dialog or in the command line in the following way (using also the -p flag to print the new values):

g.region res=3 -p

The new resolution may be slightly modified in this case to fit into the extent which we are not changing. However, often we want the resolution to be the exact value we provide and we are fine with a slight modification of the extent. That's what -a flag is for.

The following example command will use the extent from the vector map named lakes, use resolution 10, modify the extent to align it to this 10 meter resolution, and print the values of the new computational region settings:

g.region vector=lakes res=10 -a -p
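
The difference between the two behaviours can be sketched in Python. This is an approximation under stated assumptions: the rounding rules below (round the row count when the extent is kept, grow the extent outward to multiples of the resolution with -a) reflect our understanding of g.region and may differ in detail, and the extent values are hypothetical.

```python
import math


def adjust_resolution(north, south, res):
    """Without -a (assumed behaviour): keep the extent and change the
    resolution slightly so that a whole number of rows fits."""
    rows = max(1, int((north - south + res / 2.0) / res))
    return (north - south) / float(rows), rows


def align_extent(north, south, res):
    """With -a (assumed behaviour): keep the resolution and grow the
    extent outward to the nearest multiples of the resolution."""
    new_north = math.ceil(north / float(res)) * res
    new_south = math.floor(south / float(res)) * res
    rows = int(round((new_north - new_south) / float(res)))
    return new_north, new_south, rows


# hypothetical extent that is 755 m tall, requested resolution 10 m
print(adjust_resolution(220755, 220000, 10))
print(align_extent(220755, 220000, 10))
```

In the first case the extent stays 755 m tall and the resolution becomes slightly smaller than 10; in the second case the resolution stays exactly 10 and the northern edge moves outward.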

Running modules

Find the module for computing slope and aspect in menu or the module tree under Raster → Terrain analysis → Slope and aspect or simply run r.slope.aspect.


3D view

We can explore our study area in 3D view.

  1. Add elev_lid792_1m and uncheck or remove any other layers.
  2. Set computational region to this raster. Switch to 3D view (in the right corner on Map Display).
  3. Adjust the view (perspective, height, vertical exaggeration)
  4. In Data tab, set Fine mode resolution to 1 and set slope (computed in the previous step) as the color of the surface.
  5. When finished, switch back to 2D view.
3D visualization of elev_lid792_1m DEM with slope draped over

GRASS working environment and working directory

Create location and mapset (from GUI and from command line)

Move data to and from GRASS database

Move data between locations and mapsets

Case study part I: basic raster and vector operations

In this exercise we will use viewshed modeling to explore which cell towers (from OpenStreetMap) need to be inspected and then use network analysis to send a cell tower climber to those towers. Then we will analyze which areas need better cell coverage using Python scripting.

You will learn to:

  • import data
  • work with raster and vector modules
  • use raster algebra
  • run network analysis
  • use the basics of the Python API
  • run a Python script

Data import

We will use overpass turbo (web-based data filtering tool) to create and run Overpass API queries to obtain OpenStreetMap data.

You can use Wizard to create simple queries on the currently zoomed extent. For example, zoom to a small area and paste this into the Wizard and run the query: man_made=tower (assuming all these towers are cell towers)

The query was built in the editor and run, so you can now see the results in a map.

Now, paste this query in the editor window:

[out:json][timeout:25];
// gather results
(
  // query part for: "man_made"="tower"
  node["man_made"="tower"](35.68750730,-78.77462049,35.80960938,-78.60830318);
);
// print results
out body;
>;
out skel qt;

It finds towers in our study area (you can get the coordinates based on the computational region by running g.region -lg). When you run the query, the towers appear on the map and we can export them as GeoJSON (Export - Data - as GeoJSON).
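
If you need the same query for a different study area, it can be assembled from the bounding box. The sketch below assumes the Overpass bounding box order (south, west, north, east); overpass_tower_query is a hypothetical helper and the coordinates are the ones used in the query above.

```python
def overpass_tower_query(south, west, north, east, timeout=25):
    """Build an Overpass QL query for man_made=tower nodes in a bounding
    box given in latitude/longitude degrees (hypothetical helper)."""
    bbox = "{:.8f},{:.8f},{:.8f},{:.8f}".format(south, west, north, east)
    return "\n".join([
        "[out:json][timeout:{}];".format(timeout),
        "(",
        '  node["man_made"="tower"]({});'.format(bbox),
        ");",
        "out body;",
        ">;",
        "out skel qt;",
    ])


query = overpass_tower_query(35.68750730, -78.77462049,
                             35.80960938, -78.60830318)
print(query)
```

The printed text can be pasted into the overpass turbo editor in the same way as the query above.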

In the next step we import the exported data into GRASS GIS. In the menu File - Import vector data select Common formats import, browse in the dialog to find the file export.geojson and click the Import button. Note that in this case we need to change the output name from the default OGRGeoJSON to towers. This vector data has a different CRS, so a dialog informing you about the need to reproject the data appears and we confirm it.

The imported layer should be added to Map Display, if not, add it manually. Then select the layer and right click and select Show attribute data to see what kind of attributes there are.

Viewshed modeling

In our (a little naive) scenario we investigate which towers stopped working. From our position (632745,224297) we don't receive any signal, so we need to find out which towers are not broadcasting and then go and inspect each of them.

We first compute from which cell towers we should be getting signal by approximating signal propagation by line-of-sight analysis (r.viewshed). Use either the GUI or run the following command from GUI command line:

 r.viewshed input=elevation output=visible coordinates=632745,224297 target_elevation=50

Next, we use raster algebra to convert the resulting raster, which represents the vertical angle, into a single-value raster map suitable for conversion to vector. The expression says: create a new raster visible_1 where each cell is either 1 or NULL depending on the values in raster visible.

 r.mapcalc expression="visible_1 = if(visible, 1, null())"
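
The cell-by-cell logic of this expression can be mimicked in plain Python. This is a sketch only: None stands in for NULL, the angle values are made up, and mapcalc_if is a hypothetical helper that follows the documented behaviour of if(x, a, b) (NULL if x is NULL, a if x is non-zero, b otherwise).

```python
def mapcalc_if(x, a, b):
    """Mimic r.mapcalc's if(x, a, b) for a single cell value."""
    if x is None:
        # a NULL condition gives a NULL result
        return None
    return a if x != 0 else b


# hypothetical vertical angles from r.viewshed; None marks invisible cells
visible = [None, 92.5, 180.0, None, 87.1]
visible_1 = [mapcalc_if(value, 1, None) for value in visible]
print(visible_1)
```

Every cell with an angle becomes 1 and every other cell stays NULL, which is what r.to.vect needs to produce a single visible area category.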

Convert the new raster to vector polygon:

 r.to.vect input=visible_1 output=visible type=area

Now we select towers which are inside the area:

 v.select ainput=towers atype=point binput=visible btype=area output=visible_towers operator=overlap

The resulting vector has 3 points as you can check for example in the metadata.

We will add the viewer position into a new vector map and add it to the towers to use it for the network analysis. Module v.in.ascii creates new vector from coordinates provided in a file. If we run it from GUI dialog of v.in.ascii we can paste the coordinates 632745,224297 into the 'enter values directly' field. Otherwise we need to create a plain text file with one line having the coordinates. Then we provide the path to that file to v.in.ascii:

 v.in.ascii input=path/to/coordinates.txt output=viewpoint separator=comma
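
The plain text file can also be created with a few lines of Python instead of a text editor. In this sketch the file goes to the system temporary directory, which is an arbitrary choice for illustration; write_point_file is a hypothetical helper.

```python
import os
import tempfile


def write_point_file(path, x, y):
    """Write one comma-separated coordinate pair on a single line."""
    with open(path, "w") as f:
        f.write("{},{}\n".format(x, y))
    return path


# the temporary directory is only an example location for the file
coords_path = os.path.join(tempfile.gettempdir(), "coordinates.txt")
write_point_file(coords_path, 632745, 224297)

with open(coords_path) as f:
    print(f.read())
```

The resulting path is what you would pass to the input parameter of v.in.ascii together with separator=comma.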

Network analysis

Using Vector Network Analysis Tool

Here we show how to do simple network analysis in GUI and what are the actual modules and workflow behind it. First add the following layers (you probably have them already there). You can also paste these individual commands in the GUI command line and press Enter. Don't paste them all at once.

 d.rast map=elevation
 d.vect map=visible color=none fill_color=white width=1
 d.vect map=streets_wake width=1
 d.vect map=viewpoint fill_color=green width=2 icon=basic/marker size=40
 d.vect map=visible_towers fill_color=red width=2 icon=basic/marker size=40

Now select streets_wake in Layer Manager and then go to Map Display - Analyze map - Vector network analysis tool.

  1. Set Traveling salesman in the top field.
  2. Go to Parameters tab and make sure you have streets_wake in the first entry.
  3. Switch tab to Points and click on Add new point so that you have 4 points there (3 points for towers and 1 for viewpoint).
  4. Click on Insert points from Map Display, select first point and click on Map Display to the place where the viewer is located and then click on each of the visible towers.
  5. Finally press the top green button to Execute the analysis. Wait for a little bit and then you should be able to see the computed path.


The same can be done using the following module calls.

First create a new vector by merging the visible towers and the viewer position:

 v.patch input=visible_towers,viewpoint output=nodes

Then prepare the vector network by inserting new line segments to connect unconnected points to the street network:

 v.net input=streets_wake points=nodes output=network operation=connect node_layer=1 threshold=1000

Print the categories of the points in the network map to pass them to the network analysis:

 v.category -g input=network type=point option=print

Run the traveling salesman module:

 v.net.salesman input=network output=route center_cats=11,12,30,1 node_layer=1

Case study part II: Python scripting

Python intro

The simplest way to execute the Python code which uses GRASS GIS packages is to use Simple Python editor integrated in GRASS GIS (accessible from the toolbar or the Python tab in the Layer Manager). Another option is to write the Python code in your favorite plain text editor like Notepad++ (note that Python editors are plain text editors). Then run the script in GRASS GIS using the main menu File -> Launch script.


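The GRASS GIS 7 Python Scripting Library provides functions to call GRASS modules within scripts as subprocesses. The functions used in this workshop are:

  • run_command(): runs a module when no text output is needed, for example when the module creates raster or vector data
  • read_command(): runs a module and returns its text output as a string
  • parse_command(): runs a module and parses its key=value text output into a Python dictionary
  • write_command(): runs a module and passes text to it through standard input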


We will use GRASS GUI Simple Python Editor to run the commands. You can open it from Python tab. For longer scripts, you can create a text file, save it into your current working directory and run it with python myscript.py from the GUI command console or terminal.

When you open Simple Python Editor, you find a short code snippet. It starts with importing GRASS GIS Python Scripting Library:

import grass.script as gscript

In the main function we call g.region to see the current computational region settings:

gscript.run_command('g.region', flags='p')

Note that the syntax is similar to bash syntax (g.region -p), only the flag is specified in a parameter. Now we can run the script by pressing the Run button in the toolbar. In Layer Manager we get the output of g.region.

Before running any GRASS raster modules, you need to set the computational region. In this example, we set the computational extent and resolution to the raster layer elevation. Replace the previous g.region command with the following line:

gscript.run_command('g.region', raster='elevation')

The run_command() function is the most commonly used one. We will use it to compute viewshed using r.viewshed. Add the following line after g.region command:

gscript.run_command('r.viewshed', input='elevation', output='python_viewshed', coordinates=(636054,220707), overwrite=True)

Parameter overwrite is needed if we rerun the script and rewrite the raster.

Now let's look at how big the viewshed is by using r.univar. Here we use parse_command to obtain the statistics as a Python dictionary:

univar = gscript.parse_command('r.univar', map='python_viewshed', flags='g')
print(univar['n'])

The printed result is the number of cells of the viewshed.
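
To see what happens to the text output, here is a rough stand-alone sketch of the key=value parsing step. The sample text and its numbers are made up, and parse_key_value is a hypothetical helper written for this illustration, not the library's own implementation.

```python
def parse_key_value(text, sep="="):
    """Turn lines of key=value text into a dictionary of strings."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # skip empty lines
            continue
        key, _, value = line.partition(sep)
        result[key] = value
    return result


# hypothetical shell-style output of r.univar -g (numbers are made up)
sample = "n=152340\nnull_cells=1872660\nmin=0\nmax=180\n"
univar = parse_key_value(sample)

# values are strings, so convert before doing arithmetic
print(int(univar["n"]))
```

Note that the dictionary values are strings, so they need to be converted with int() or float() before they are used in computations.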

GRASS GIS Python Scripting Library also provides several wrapper functions for often called modules. List of convenient wrapper functions with examples includes:

Here are two commands (to be executed in the Console) often used when working with scripts. First is setting the computational region. We can do that in a script, but it is better and more general to do it before executing the script (so that the script can be used with different computational region settings):

g.region raster=elevation

The second command is handy when we want to run the script again and again. In that case, we first need to remove the created raster maps, for example:

g.remove type=raster pattern="viewshed*"

Without the -f flag, the above command does not actually remove the maps; it only lists the maps that would be removed. Add the -f flag to remove them.

Computing cumulative viewshed

3D view of cumulative viewshed draped over elevation raster

In this exercise we will compute a viewshed from each tower using a Python script. From these viewsheds we can then easily compute a cumulative viewshed (in our case meaning from how many towers we get signal). For that we will use module r.series, which overlays the viewshed rasters and computes how many times each cell is visible in all the rasters. This identifies places with bad coverage.
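
The counting done by r.series with method=count can be illustrated on tiny grids in plain Python. This is a sketch only: None stands in for NULL (cells not visible from a tower), the grids are made up, and count_non_null is a hypothetical helper.

```python
def count_non_null(rasters):
    """For each cell, count in how many input grids it is not None."""
    rows = len(rasters[0])
    cols = len(rasters[0][0])
    return [[sum(1 for raster in rasters if raster[i][j] is not None)
             for j in range(cols)]
            for i in range(rows)]


# three hypothetical 2x2 viewsheds (vertical angle where visible)
viewshed1 = [[90.0, None], [None, None]]
viewshed2 = [[88.0, 91.0], [None, None]]
viewshed3 = [[None, 93.0], [95.0, None]]

cumulative = count_non_null([viewshed1, viewshed2, viewshed3])
print(cumulative)
```

Cells with a count of 0 are not visible from any tower, which is how places with no coverage show up in the cumulative viewshed.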

Copy the following code into the Simple Python Editor (be sure to overwrite any already existing code) and press the Run button.

#!/usr/bin/env python

import grass.script as gscript

def main():
    # obtain the coordinates of viewpoints
    viewpoints = gscript.read_command('v.out.ascii', input='towers', separator='comma').strip()

    # loop through the viewpoints and compute viewshed from each of them
    for point in viewpoints.splitlines():
        if not point:
            # skip empty lines
            continue
        x, y, cat = point.split(',')
        gscript.run_command('r.viewshed', input='elevation', output='viewshed' + cat,
                            coordinates=(x, y), observer_elevation=50, overwrite=True)
    # obtain all viewshed results and set their color to yellow
    # export the list of viewshed map names to a file
    maps_file = 'viewsheds.txt'
    gscript.run_command('g.list', type='raster', pattern='viewshed*',
                        output=maps_file, overwrite=True)
    gscript.write_command('r.colors', file=maps_file, rules='-',
                          stdin='0% yellow \n 100% yellow')
    # cumulative viewshed
    gscript.run_command('r.series', file=maps_file, output='cumulative_viewshed',
                        method='count', overwrite=True)
    # set color of cumulative viewshed from grey to yellow
    gscript.write_command('r.colors', map='cumulative_viewshed', rules='-',
                          stdin='0% 70:70:70 \n 100% yellow')

if __name__ == '__main__':
    main()


You can now add all the layers into the Map Display. Use the tool Add multiple raster or vector map layers to add more layers at once; you can filter them by name.

Right click on cumulative_viewshed and select Report raster statistics (r.report) and we can see that almost half of the area has no coverage (we didn't include towers outside of our study area).

Try out some of the more advanced and latest features

Segmentation

Segmentation with i.segment

We show two recent segmentation algorithms: i.segment and addon i.superpixels.slic, where the addon needs to be installed first:

g.extension i.superpixels.slic

Here we show the segmentation of a Landsat scene.

Imagery modules typically work with imagery groups. We first list the Landsat raster data and then create an imagery group:

g.list type=raster pattern="lsat*" sep=comma mapset=PERMANENT
i.group group=lsat subgroup=lsat input=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_61,lsat7_2002_62,lsat7_2002_70,lsat7_2002_80

Now we run i.superpixels.slic and convert the resulting raster to vector for better viewing:

i.superpixels.slic group=lsat output=superpixels num_pixels=2000
r.to.vect input=superpixels output=superpixels type=area

We do the same for i.segment and convert the resulting raster to vector for better viewing:

i.segment group=lsat output=segments threshold=0.5 minsize=50
r.to.vect input=segments output=segments type=area

From the Landsat data we also compute NDVI to later display it together with the segmentation:

i.vi red=lsat7_2002_30 output=ndvi viname=ndvi nir=lsat7_2002_40
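
For reference, NDVI is computed per cell as (nir - red) / (nir + red). A minimal sketch for single cell values follows; the input numbers are made up and returning None for a zero denominator is our choice for this illustration.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one cell."""
    total = nir + red
    if total == 0:
        # undefined when both bands are zero; None stands in for NULL
        return None
    return (nir - red) / float(total)


# hypothetical red and near-infrared values
print(ndvi(red=50, nir=150))
```

Values close to 1 indicate dense green vegetation, while values around or below 0 indicate bare surfaces or water.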

Remove all layers from Layer Manager and add these layers one by one by pasting into the GUI command line and pressing Enter:

d.rast map=ndvi
d.vect map=superpixels fill_color=none
d.vect map=segments fill_color=none

It's important to note that each segmentation algorithm is designed for a different purpose, so we can't directly compare them.


Next steps and alternative workflows can involve: i.segment.hierarchical, i.segment.uspo, i.segment.stats, r.object.activelearning, r.object.geometry, r.object.spatialautocor.

Hydrology: Estimating inundation extent using HAND methodology

In this example we will use some of the GRASS GIS hydrology tools, namely:

  • r.watershed: for computing flow accumulation, drainage direction, the location of streams and watershed basins

r.watershed elevation=elevation accumulation=flowacc drainage=drainage stream=streams threshold=100000
r.to.vect input=streams@user1 output=streams type=line
r.lake elevation=elevation@PERMANENT water_level=90 lake=flood coordinates=637877.833869,218475.441863


r.stream.distance stream_rast=streams direction=drainage elevation=elevation method=downstream difference=above_stream
r.lake elevation=above_stream water_level=5 lake=flood seed=streams
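
The idea behind the last two commands is that HAND (height above nearest drainage) replaces absolute elevation, so a single water level of 5 floods every cell that is less than 5 m above its stream and connected to it. The following is a simplified illustration on a made-up grid, not the algorithm r.lake actually uses: the 4-neighbour connectivity and the strict "lower than water level" test are assumptions, and flood_from_seeds is a hypothetical helper.

```python
from collections import deque


def flood_from_seeds(height, seeds, water_level):
    """Spread water from seed cells to connected cells whose height is
    below water_level; return water depth per cell (None means dry)."""
    rows, cols = len(height), len(height[0])
    depth = [[None] * cols for _ in range(rows)]
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        if not (0 <= r < rows and 0 <= c < cols):
            continue
        if depth[r][c] is not None or height[r][c] is None:
            continue
        if height[r][c] >= water_level:
            continue
        depth[r][c] = water_level - height[r][c]
        # assumption: water spreads to the 4 direct neighbours only
        queue.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return depth


# hypothetical height above stream in meters; the stream is column 0
above_stream = [
    [0, 2, 6, 1],
    [0, 3, 7, 2],
    [0, 4, 8, 9],
]
stream_cells = [(0, 0), (1, 0), (2, 0)]

for row in flood_from_seeds(above_stream, stream_cells, water_level=5):
    print(row)
```

Note that the two low cells in the last column stay dry in this sketch because a ridge separates them from the stream; a plain threshold on the HAND raster would mark them as flooded.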

Batch jobs with --exec

Since GRASS GIS 7.2, modules and scripts can be executed in a non-interactive GRASS GIS session with --exec.

grass72 path/to/database/location/mapset --exec module params
grass72 path/to/database/location/mapset --exec script.py params

For example, we can compute a viewshed in this way, provided we give the correct relative or absolute path to the mapset:

grass72 grassdata/nc_spm_08_grass7/user1 --exec r.viewshed input=elevation output=viewshed_exec coordinates=642964,222890

This is useful for computing in HPC environments where the processes run in parallel. The individual processes should in general run in different mapsets. If we have an embarrassingly parallel problem we can create a list of these commands, each with different mapset and parameters:

grass72 -c grassdata/location/mapset_temp_1 --exec python script.py params1
grass72 -c grassdata/location/mapset_temp_2 --exec python script.py params2
grass72 -c grassdata/location/mapset_temp_3 --exec python script.py params3

We then save them into a text file (here jobs.txt) and run them with, for example, GNU Parallel:

parallel --jobs 10 < jobs.txt
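
For many jobs, the list of commands can be generated rather than typed. A minimal sketch that reproduces the three lines shown above; make_jobs is a hypothetical helper, and the executable name, paths and parameters are the placeholders used in this section.

```python
def make_jobs(location, script, params_list, executable="grass72"):
    """Return one --exec command line per parameter set, each using
    its own temporary mapset created with -c."""
    jobs = []
    for index, params in enumerate(params_list, start=1):
        mapset = "{}/mapset_temp_{}".format(location, index)
        jobs.append("{} -c {} --exec python {} {}".format(
            executable, mapset, script, params))
    return jobs


jobs = make_jobs("grassdata/location", "script.py",
                 ["params1", "params2", "params3"])
print("\n".join(jobs))
```

Saving the printed lines into jobs.txt gives the input file for the parallel command above.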

See also