Difference between revisions of "R statistics/rgrass7"

From GRASS-Wiki
Jump to: navigation, search
m (GRASS within R)
m (minor version bump)
 
(34 intermediate revisions by 4 users not shown)
Line 1: Line 1:
'''This page refers to the usage of R within a GRASS GIS 7 session and the use of GRASS GIS 7 within an R session.''' (see also [[R_statistics/spgrass6]])
+
'''This page refers to the usage of R within a GRASS GIS 7 session and the use of GRASS GIS 7 within an R session''' (see also [[R_statistics/spgrass6]])
 +
 
 +
== Terminology ==
 +
 
 +
Using R in conjunction with GRASS GIS can have two meanings:
 +
 
 +
* Using '''R within GRASS GIS session''', i.e. you start R (or RStudio) from the GRASS GIS command line. You may like this variant if you are primarily a GRASS GIS user.
 +
* Using '''GRASS GIS within a R session''', i.e. you connect to a GRASS GIS location/mapset from within R (or RStudio). You may like this variant if you are primarily a R user.
 +
 
 +
<!-- not quite clear to the casual user..:
 +
Keep in mind:
 +
* if you already have a GRASS location/mapset, use R inside GRASS and do not use initGRASS();
 +
* if the GRASS location/mapset is only a throwaway one, use initGRASS() and run GRASS inside R.
 +
-->
 +
 
 +
'''References''': see "Overview" in [[R_statistics]]
 +
 
 +
== Installation ==
 +
 
 +
See [[R_statistics/Installation]]
  
 
== R within GRASS ==
 
== R within GRASS ==
 +
 +
Using '''R within GRASS GIS session''', i.e. you start R (or RStudio) from the GRASS GIS command line.
  
 
=== Startup ===
 
=== Startup ===
Line 9: Line 30:
  
 
Reset the region settings to the defaults
 
Reset the region settings to the defaults
GRASS 7.0.1svn (nc_spm_08_grass7):~ > g.region -d
+
<source lang="bash">
 +
GRASS 7.8 (nc_spm_08_grass7):~ > g.region -d
 +
</source>
  
 
Launch R from the GRASS prompt
 
Launch R from the GRASS prompt
GRASS 7.0.1svn (nc_spm_08_grass7):~ > R
+
<source lang="bash">
 +
GRASS 7.8 (nc_spm_08_grass7):~ > R
 +
</source>
  
 
Load the ''rgrass7'' library:
 
Load the ''rgrass7'' library:
> library(rgrass7)
+
<source lang="rsplus">
 +
library(rgrass7)
 +
</source>
 +
 
 +
If you plan to follow the example using the North Carolina sample dataset, load the ''rgdal'' library:
 +
<source lang="rsplus">
 +
library(rgdal)
 +
</source>
  
 
Get the GRASS environment (mapset, region, map projection, etc.); you can display the metadata for your location by printing G:
 
Get the GRASS environment (mapset, region, map projection, etc.); you can display the metadata for your location by printing G:
> G <- gmeta()
+
<source lang="rsplus">
<pre>
+
G <- gmeta()
 
gisdbase    /home/neteler/grassdata  
 
gisdbase    /home/neteler/grassdata  
 
location    nc_spm_08_grass7  
 
location    nc_spm_08_grass7  
Line 34: Line 66:
 
+lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137
 
+lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137
 
+rf=298.257222101 +towgs84=0.000,0.000,0.000 +to_meter=1  
 
+rf=298.257222101 +towgs84=0.000,0.000,0.000 +to_meter=1  
</pre>
+
</source>
  
 
=== Listing of existing maps ===
 
=== Listing of existing maps ===
  
 
List available vector maps:
 
List available vector maps:
> execGRASS("g.list", parameters = list(type = "vector"))
+
<source lang="rsplus">
 +
execGRASS("g.list", parameters = list(type = "vector"))
 +
</source>
  
 
List selected vector maps (wildcard):
 
List selected vector maps (wildcard):
> execGRASS("g.list", parameters = list(type = "vector", pattern = "precip*"))
+
<source lang="rsplus">
 +
execGRASS("g.list", parameters = list(type = "vector", pattern = "precip*"))
 +
</source>
  
 
Save selected vector maps into R vector:
 
Save selected vector maps into R vector:
> my_vmaps <- execGRASS("g.list", parameters = list(type = "vector", pattern = "precip*"))
+
<source lang="rsplus">
> attributes(my_vmaps)
+
my_vmaps <- execGRASS("g.list", parameters = list(type = "vector", pattern = "precip*"))
> attributes(my_vmaps)$resOut
+
attributes(my_vmaps)
 +
attributes(my_vmaps)$resOut
 +
</source>
  
 
List available raster maps:
 
List available raster maps:
> execGRASS("g.list", parameters = list(type = "raster"))
+
<source lang="rsplus">
 +
execGRASS("g.list", parameters = list(type = "raster"))
 +
</source>
  
 
List selected raster maps (wildcard):
 
List selected raster maps (wildcard):
> execGRASS("g.list", parameters = list(type = "raster", pattern = "lsat7_2002*"))
+
<source lang="rsplus">
 +
execGRASS("g.list", parameters = list(type = "raster", pattern = "lsat7_2002*"))
 +
</source>
  
=== Reading in data ===
+
=== Reading in data from GRASS ===
Read in two raster maps (North Carolina sample dataset):
 
> ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE), ignore.stderr=TRUE, plugin=NULL)
 
  
The metadata are accessed and available, but are not (yet) used to structure the ''sp'' class objects, in this case a SpatialGridDataFrame object filled with data from two North Carolina layers. Here is a plot of the elevation data:
+
==== Example 1: raster maps ====
> image(ncdata, attr = 2, col = terrain.colors(20))
+
 
 +
Read in two GRASS raster maps (the maps "geology_30m" and "elevation" from the North Carolina sample data location) into the R current session as a new object "ncdata" (containing then two SpatialGridDataFrames as well as metadata):
 +
<source lang="rsplus">
 +
# the cat parameter indicates which raster values to be returned as factors
 +
# - geology_30m is a categorical map (i.e., it comes with classes)
 +
# - elevation is a continuous surface
 +
ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))
 +
</source>
 +
(A warning may appear since in the "geology_30m" map two labels are not unique - as found in the original data.)
 +
 
 +
We can verify the new R object:
 +
<source lang="rsplus">
 +
str(ncdata)
 +
Formal class 'SpatialGridDataFrame' [package "sp"] with 4 slots
 +
..@ data      :'data.frame': 16616 obs. of  2 variables:
 +
</source>
 +
 
 +
and also check the data structure in more detail:
 +
<source lang="rsplus">
 +
str(ncdata@data)
 +
'data.frame': 16616 obs. of  2 variables:
 +
$ geology_30m: Factor w/ 12 levels "CZfg_217","CZlg_262",..: NA NA NA NA NA NA NA NA NA NA ...
 +
$ elevation  : num  NA NA NA NA NA NA NA NA NA NA ...
 +
</source>
 +
 
 +
The metadata are now accessed and available, but are not (yet) used to structure the ''sp'' class objects, in this case a SpatialGridDataFrame object filled with data from two North Carolina layers. Here is a plot of the elevation data:
 +
<source lang="rsplus">
 +
image(ncdata, "elevation", col = terrain.colors(20))
 +
</source>
  
 
Add a title to the plot:
 
Add a title to the plot:
> title("North Carolina elevation")
+
<source lang="rsplus">
 +
title("North Carolina elevation")
 +
</source>
  
 
[[Image:ncdata.png|center]]
 
[[Image:ncdata.png|center]]
  
 
In addition, we can show what is going on inside the objects read into R:
 
In addition, we can show what is going on inside the objects read into R:
> str(G)
+
<source lang="rsplus">
<pre>
+
str(G)
 
List of 26
 
List of 26
 
  $ DEBUG        : chr "0"
 
  $ DEBUG        : chr "0"
Line 98: Line 168:
 
  $ proj4        : chr "+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137 +"| __truncated__
 
  $ proj4        : chr "+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137 +"| __truncated__
 
  - attr(*, "class")= chr "gmeta"
 
  - attr(*, "class")= chr "gmeta"
</pre>
 
  
> summary(ncdata)
+
summary(ncdata)
<pre>
 
 
Object of class SpatialGridDataFrame
 
Object of class SpatialGridDataFrame
 
Coordinates:
 
Coordinates:
Line 125: Line 193:
 
  CZbg_910: 4351  Max.  :156.25   
 
  CZbg_910: 4351  Max.  :156.25   
 
  (Other) : 4942                   
 
  (Other) : 4942                   
</pre>
+
</source>
 +
 
 +
==== Example 2: raster maps ====
 +
 
 +
Here an example for a single raster map transfer from GRASS GIS to R:
 +
 
 +
<source lang="rsplus">
 +
library(rgrass7)
 +
execGRASS("g.region", raster = "elevation", flags = "p")
 +
ncdata <- readRAST("elevation", cat=FALSE)
 +
summary(ncdata)
 +
spplot(ncdata, col = terrain.colors(20))
 +
</source>
 +
 
 +
==== Example 3: vector maps ====
 +
 
 +
Here an example for a single vector map transfer from GRASS GIS to R:
 +
 
 +
<source lang="rsplus">
 +
# needed prerequisite
 +
library(rgrass7)
 +
use_sf()
 +
library(sf)
 +
 
 +
execGRASS("v.info", map="schools", layer="1")
 +
vInfo("schools")
 +
myschools <- readVECT("schools")
 +
print(summary(myschools))
 +
</source>
  
 
=== Summarizing data ===
 
=== Summarizing data ===
  
 
We can create a table of cell counts:
 
We can create a table of cell counts:
> table(ncdata$geology_30m)
+
<source lang="rsplus">
 +
table(ncdata$geology_30m)
 +
</source>
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
!CZfg_217  
 
!CZfg_217  
Line 162: Line 260:
  
 
And compare with the equivalent GRASS module:
 
And compare with the equivalent GRASS module:
> execGRASS("r.stats", flags=c("c", "l"), parameters=list(input="geology_30m"), ignore.stderr=TRUE)
+
<source lang="rsplus">
 +
execGRASS("r.stats", flags=c("c", "l"), parameters=list(input="geology_30m"), ignore.stderr=TRUE)
 +
</source>
 
<pre>
 
<pre>
 
217 CZfg 70381
 
217 CZfg 70381
Line 180: Line 280:
  
 
Create a box plot of geologic types at different elevations:
 
Create a box plot of geologic types at different elevations:
> boxplot(ncdata$elevation ~ ncdata$geology_30m, medlwd = 1)
+
<source lang="rsplus">
 +
boxplot(ncdata$elevation ~ ncdata$geology_30m, medlwd = 1)
 +
</source>
  
 
[[Image:boxplot_geo.png|center]]
 
[[Image:boxplot_geo.png|center]]
 +
 +
=== Querying maps ===
 +
 +
Sometimes you may want to query GRASS GIS maps from your R session.
 +
 +
As an example, here the transfer of raster pixel values at the position of vector points:
 +
 +
<source lang="rsplus">
 +
# set the computational region first to the raster map:
 +
execGRASS("g.region", raster = "elev_state_500m", flags = "p")
 +
 +
# query raster map at vector points, transfer result into R
 +
goutput <- execGRASS("r.what", map="elev_state_500m", points="geodetic_pts", separator=",", intern=TRUE)
 +
str(goutput)
 +
chr [1:29939] "571530.81289275,265739.968425953,,187.8082200648" ...
 +
 +
# parse it
 +
con <- textConnection(goutput)
 +
go1 <- read.csv(con, header=FALSE)
 +
str(go1)
 +
'data.frame': 29939 obs. of  4 variables:
 +
$ V1: num  571531 571359 571976 572391 573011 ...
 +
$ V2: num  265740 265987 267049 267513 269615 ...
 +
$ V3: logi  NA NA NA NA NA NA ...
 +
$ V4: Factor w/ 22738 levels "-0.0048115728",..: 6859 6642 6749 6411 6356 6904 7506 7224 6908 7167 ...
 +
summary(go1)
 +
      V1              V2            V3                V4     
 +
Min.  :121862  Min.  :  7991  Mode:logical  0      :  723 
 +
1st Qu.:462706  1st Qu.:162508  NA's:29939    *      :  293 
 +
Median :610328  Median :204502                  0.3048 :  47 
 +
Mean  :588514  Mean  :200038                  0.6096 :  44 
 +
3rd Qu.:717610  3rd Qu.:247633                  1.524  :  42 
 +
Max.  :946743  Max.  :327186                  0.9144 :  23 
 +
                                                  (Other):28767 
 +
</source>
  
 
=== Exporting data back to GRASS ===
 
=== Exporting data back to GRASS ===
Line 189: Line 326:
  
 
First prepare some data:  (square root of elevation)
 
First prepare some data:  (square root of elevation)
> ncdata$sqdem <- sqrt(ncdata$elevation)
+
<source lang="rsplus">
 +
ncdata$sqdem <- sqrt(ncdata$elevation)
 +
</source>
  
 
Export data from ''R'' back into a GRASS raster map:
 
Export data from ''R'' back into a GRASS raster map:
> writeRAST(ncdata, "sqdemNC", zcol="sqdem", ignore.stderr=TRUE)
+
<source lang="rsplus">
 +
writeRAST(ncdata, "sqdemNC", zcol="sqdem", ignore.stderr=TRUE)
 +
</source>
  
 
Check that it imported into GRASS ok:
 
Check that it imported into GRASS ok:
> execGRASS("r.info", parameters=list(map="sqdemNC"))
+
<source lang="rsplus">
 +
execGRASS("r.info", parameters=list(map="sqdemNC"))
 +
</source>
  
 
<pre>
 
<pre>
Line 225: Line 368:
 
</pre>
 
</pre>
  
=== Calling RStudio from a GRASS GIS session ===
+
=== Using RStudio in a GRASS GIS session ===
  
 
If you are most used to run R through RStudio, but still want to have all GRASS data available for performing
 
If you are most used to run R through RStudio, but still want to have all GRASS data available for performing
 
any analyses without loosing the possibility of still using GRASS command line, you can run:
 
any analyses without loosing the possibility of still using GRASS command line, you can run:
  
<pre>
+
<source lang="bash">
 
GRASS> rstudio &
 
GRASS> rstudio &
</pre>
+
</source>
  
 
or, if you already are working on a certain RStudio project:
 
or, if you already are working on a certain RStudio project:
  
<pre>
+
<source lang="bash">
 
GRASS> rstudio /path/to/project/folder/ &
 
GRASS> rstudio /path/to/project/folder/ &
</pre>
+
</source>
  
 
Then, you load rgrass7 library in your RStudio project
 
Then, you load rgrass7 library in your RStudio project
 
+
<source lang="rsplus">
> library(rgrass7)  
+
library(rgrass7)
 +
</source>
  
 
and you are ready to go.
 
and you are ready to go.
  
 +
[[File:Grass7_rstudio.png|thumb|500px|center|RStudio used in GRASS GIS 7 session]]
  
 
== GRASS within R ==
 
== GRASS within R ==
 +
 +
Using '''GRASS GIS within a R session''', i.e. you connect to a GRASS GIS location/mapset from within R (or RStudio).
 +
 +
=== Startup ===
  
 
In the first place, find out the path to the GRASS GIS library. This can be easily done with the following command (still outside of R; or through a system() call inside of R):
 
In the first place, find out the path to the GRASS GIS library. This can be easily done with the following command (still outside of R; or through a system() call inside of R):
  
  grass70 --config path
+
<source lang="bash">
 +
# OSGeo4W, Linux, Mac OSX users:
 +
grass78 --config path
 +
</source>
  
 
It may report something like:
 
It may report something like:
  /home/veroandreo/software/grass-7.0.svn/dist.x86_64-unknown-linux-gnu
+
<source lang="bash">
 +
# OSGeo4W users:
 +
C:\OSGeo4W64\apps\grass\grass-7.8.3
  
To call GRASS GIS 7 functionality from R, start R and use the initGRASS() function to define the GRASS settings:
+
# Linux, Mac OSX users:
 +
/usr/local/grass-7.8.3
 +
</source>
  
<pre>
+
To call GRASS GIS 7 functionality from R, start R and use the initGRASS() function to define the GRASS settings. If GRASS GIS was
 +
installed through OSGeo4W, first start the OSGeo4W Shell. Note that it may start in C: or a directory where the user does not have
 +
writing permission. Before calling R or RStudio, first change to a directory where you are sure you have writing permission.
 +
 
 +
<source lang="rsplus">
 +
## MS-Windows users (example for an OSGeo4W 64bit installation)
 +
# change to a directory with writing permission
 +
C:\>d:
 +
D:\>cd temp
 +
D:\temp>cd testR
 +
 
 +
# start R within this directory:
 +
D:\temp\testR>R
 +
 
 +
# to start RStudio from OSGeo4W Shell, run: "C:/Program Files/RStudio/bin/rstudio.exe"
 +
 +
# load rgrass7 package
 +
library(rgrass7)
 +
 
 +
# initialisation of GRASS in the North Carolina sample dataset
 +
initGRASS(gisBase = "C:/OSGeo4W64/apps/grass/grass-7.8.3",
 +
        gisDbase = "C:/Users/marissa/grassdata/",
 +
        location = "nc_spm_08_grass7",
 +
        mapset = "user1",
 +
        SG="elevation")
 +
</source>
 +
 
 +
Linux and Mac OSX users just need to start R or RStudio the usual way, load the rgrass7 package and initialize GRASS GIS with initGRASS() as shown below:
 +
 
 +
<source lang="rsplus">
 +
# load rgrass7 package
 
library(rgrass7)
 
library(rgrass7)
   
+
# initialisation of GRASS in the North Carolina sample dataset
# initialisation and the use of North Carolina sample dataset
+
initGRASS(gisBase = "/usr/local/grass-7.8.3",  
initGRASS(gisBase = "/home/veroandreo/software/grass-7.0.svn/dist.x86_64-unknown-linux-gnu", home = tempdir(),  
+
          home = tempdir(),  
 
           gisDbase = "/home/veroandreo/grassdata/",
 
           gisDbase = "/home/veroandreo/grassdata/",
           location = "nc_spm_08_grass7", mapset = "user1", SG="elevation",
+
           location = "nc_spm_08_grass7",  
          override = TRUE)
+
          mapset = "user1",  
 +
          SG="elevation")
 +
</source>
 +
 
 +
Note: the optional SG parameter is a 'SpatialGrid' object to define the ‘DEFAULT_WIND’ of the temporary location.
  
# set computational region to default (facultative)
+
<source lang="rsplus">
 +
# set computational region to default (optional)
 
system("g.region -dp")
 
system("g.region -dp")
 
# verify metadata
 
# verify metadata
Line 281: Line 472:
 
  CZbg_910  Km_921 CZam_946    NA's  
 
  CZbg_910  Km_921 CZam_946    NA's  
 
       18        5        2  1009790  
 
       18        5        2  1009790  
</pre>
+
</source>
  
 
== R in GRASS in batch mode ==
 
== R in GRASS in batch mode ==
Line 287: Line 478:
 
Run the following script with
 
Run the following script with
  
<pre>
+
<source lang="bash">
 
R CMD BATCH batch.R
 
R CMD BATCH batch.R
</pre>
+
</source>
  
'''To be updated'''
+
<source lang="rsplus">
 
+
library(rgrass7)
  # get path to GRASS binaries:
+
# initialisation of GRASS in the North Carolina sample dataset (example with a compiled version)
  grass70 --config path
+
initGRASS(gisBase = "/home/veroandreo/software/grass78/dist.x86_64-pc-linux-gnu",
 +
          home = tempdir(),
 +
          gisDbase = "/home/veroandreo/grassdata/",
 +
          location = "nc_spm_08_grass7", mapset = "user1", SG="elevation",
 +
          override = TRUE)
 +
# set region to default
 +
system("g.region -dp")
 +
# verify
 +
gmeta()
 +
# read data into R
 +
ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))
 +
# summary of geology map
 +
summary(ncdata$geology_30m)
 +
proc.time()
 +
</source>
  
 
The result is (shorted here):
 
The result is (shorted here):
 +
<source lang="bash">
 +
cat batch.Rout
 +
</source>   
  
    cat batch.Rout
+
<source lang="rsplus">
   
+
     R version 4.0.2 (2020-06-22) -- "Taking Off Again"
     R version 2.10.0 (2009-10-26)
+
     Copyright (C) 2020 The R Foundation for Statistical Computing
     Copyright (C) 2009 The R Foundation for Statistical Computing
+
     Platform: x86_64-redhat-linux-gnu (64-bit)
     ISBN 3-900051-07-0
 
 
     ...
 
     ...
 
     > library(rgrass7)
 
     > library(rgrass7)
    Loading required package: sp
 
    Loading required package: rgdal
 
    Geospatial Data Abstraction Library extensions to R successfully loaded
 
    Loaded GDAL runtime: GDAL 1.7.2, released 2010/04/23
 
    Path to GDAL shared files: /usr/local/share/gdal
 
    Loaded PROJ.4 runtime: Rel. 4.7.1, 23 September 2009
 
    Path to PROJ.4 shared files: (autodetected)
 
 
     Loading required package: XML
 
     Loading required package: XML
 
     GRASS GIS interface loaded with GRASS version: (GRASS not running)
 
     GRASS GIS interface loaded with GRASS version: (GRASS not running)
    >
+
     > # initialisation and the use of north carolina dataset
     > # initialisation and the use of spearfish60 data
+
     > initGRASS(gisBase = "/home/veroandreo/software/grass78/dist.x86_64-pc-linux-gnu", home = tempdir(),  
     > initGRASS(gisBase = "/home/neteler/software/grass70/dist.x86_64-unknown-linux-gnu", home = tempdir(), gisDbase = "/home/neteler/grassdata/",
+
    +          gisDbase = "/home/veroandreo/grassdata/",
     +          location = "spearfish60", mapset = "user1", SG="elevation.dem", override = TRUE)
+
     +          location = "nc_spm_08_grass7", mapset = "user1", SG="elevation",
     gisdbase    /home/neteler/grassdata/  
+
    +          override = TRUE)
     location    spearfish60
+
     gisdbase    /home/veroandreo/grassdata/  
 +
     location    nc_spm_08_grass7
 
     mapset      user1  
 
     mapset      user1  
     rows        477
+
     rows        13500
     columns    634
+
     columns    15000
     north      4928010
+
     north      228500
     south      4913700
+
     south      215000
     west        589980
+
     west        630000
     east        609000
+
     east        645000
     nsres      30
+
     nsres      1
     ewres      30
+
     ewres      1
     projection  +proj=utm +zone=13 +a=6378206.4 +rf=294.9786982 +no_defs
+
     projection  +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334
     +nadgrids=/usr/local/grass-6.4.1/etc/nad/conus +to_meter=1.0  
+
    +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137
     Warning messages:
+
     +rf=298.257222101 +towgs84=0.000,0.000,0.000 +type=crs +to_meter=1
     1: In dir.create(gisDbase) : '/home/neteler/grassdata' already exists
+
   
     2: In dir.create(loc_path) :
+
    > system("g.region -dp")
       '/home/neteler/grassdata//spearfish60' already exists
+
    projection: 99 (Lambert Conformal Conic)
     >
+
    zone:      0
     > system("g.region -d")
+
     datum:     nad83
     > # verify
+
     ellipsoid: a=6378137 es=0.006694380022900787
     > gmeta6()
+
    north:     320000
     gisdbase    /home/neteler/grassdata/  
+
     south:     10000
     location    spearfish60
+
    west:       120000
 +
    east:       935000
 +
    nsres:      500
 +
    ewres:      500
 +
     rows:      620
 +
     cols:      1630
 +
     cells:      1010600
 +
     > gmeta()
 +
     gisdbase    /home/veroandreo/grassdata/  
 +
     location    nc_spm_08_grass7
 
     mapset      user1  
 
     mapset      user1  
     rows        477
+
     rows        620
     columns    634
+
     columns    1630
     north      4928010
+
     north      320000
 +
    south      10000
 
     ...
 
     ...
     >  
+
     > ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))
    > spear <- readRAST6(c("geology", "elevation.dem"),
+
     ...
    +          cat=c(TRUE, FALSE), ignore.stderr=TRUE,
+
     > summary(ncdata$geology_30m)
    +          plugin=NULL)
+
     CZfg_217 CZlg_262 CZig_270 CZbg_405 CZve_583 CZam_720 CZg_766 CZam_862
     >
+
        292       78      277      102        8        1        2       25
     > summary(spear$geology)
+
     CZbg_910  Km_921 CZam_946    NA's  
     metamorphic transition    igneous  sandstone  limestone      shale
+
           18        5       2  1009790
          11693        142       36534       74959      61355      46423
 
     sandy shale    claysand        sand        NA's  
 
           11266      14535      36561       8950
 
    >
 
    >
 
 
     > proc.time()
 
     > proc.time()
 
       user  system elapsed  
 
       user  system elapsed  
       2.891  0.492  3.412
+
       8.061  2.016  10.048
 +
</source>
 +
 
 +
== Troubleshooting ==
 +
 
 +
=== Running out of disk space ===
 +
 
 +
Linux: A common issue is that /tmp/ is used for temporary files albeit being often a small partition. To change that to a larger directory, you may '''add''' to your <tt>$HOME/.bashrc</tt> the entry:
 +
 +
<source lang="bash">
 +
# change TMP directory of R (note: of course also another directory than suggested here is fine)
 +
mkdir -p $HOME/tmp
 +
export TMP=$HOME/tmp
 +
</source>
 +
 
 +
The drawback of this approach is that on modern Linux systems the /tmp/ is a fast RAM disk (based on tempfs) while HOME directories are often on slower spinning disks (unless you have a SSD drive). At least you no longer run out of disk space easily.
  
 
[[Category:FAQ]]
 
[[Category:FAQ]]

Latest revision as of 12:19, 30 August 2020

This page refers to the usage of R within a GRASS GIS 7 session and the use of GRASS GIS 7 within an R session (see also R_statistics/spgrass6)

Terminology

Using R in conjunction with GRASS GIS can have two meanings:

  • Using R within GRASS GIS session, i.e. you start R (or RStudio) from the GRASS GIS command line. You may like this variant if you are primarily a GRASS GIS user.
  • Using GRASS GIS within a R session, i.e. you connect to a GRASS GIS location/mapset from within R (or RStudio). You may like this variant if you are primarily a R user.


References: see "Overview" in R_statistics

Installation

See R_statistics/Installation

R within GRASS

Using R within GRASS GIS session, i.e. you start R (or RStudio) from the GRASS GIS command line.

Startup

  • First start a GRASS GIS session. Then, at the GRASS command line start R (for a 'rstudio' session, see below)
In this example we will use the North Carolina sample dataset.

Reset the region settings to the defaults

GRASS 7.8 (nc_spm_08_grass7):~ > g.region -d

Launch R from the GRASS prompt

GRASS 7.8 (nc_spm_08_grass7):~ > R

Load the rgrass7 library:

library(rgrass7)

If you plan to follow the example using the North Carolina sample dataset, load the rgdal library:

library(rgdal)

Get the GRASS environment (mapset, region, map projection, etc.); you can display the metadata for your location by printing G:

G <- gmeta()
gisdbase    /home/neteler/grassdata 
location    nc_spm_08_grass7 
mapset      user1 
rows        620 
columns     1630 
north       320000 
south       10000 
west        120000 
east        935000 
nsres       500 
ewres       500 
projection  +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334
+lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137
+rf=298.257222101 +towgs84=0.000,0.000,0.000 +to_meter=1

Listing of existing maps

List available vector maps:

execGRASS("g.list", parameters = list(type = "vector"))

List selected vector maps (wildcard):

execGRASS("g.list", parameters = list(type = "vector", pattern = "precip*"))

Save selected vector maps into R vector:

my_vmaps <- execGRASS("g.list", parameters = list(type = "vector", pattern = "precip*"))
attributes(my_vmaps)
attributes(my_vmaps)$resOut

List available raster maps:

execGRASS("g.list", parameters = list(type = "raster"))

List selected raster maps (wildcard):

execGRASS("g.list", parameters = list(type = "raster", pattern = "lsat7_2002*"))

Reading in data from GRASS

Example 1: raster maps

Read in two GRASS raster maps (the maps "geology_30m" and "elevation" from the North Carolina sample data location) into the R current session as a new object "ncdata" (containing then two SpatialGridDataFrames as well as metadata):

# the cat parameter indicates which raster values to be returned as factors
# - geology_30m is a categorical map (i.e., it comes with classes)
# - elevation is a continuous surface
ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))

(A warning may appear since in the "geology_30m" map two labels are not unique - as found in the original data.)

We can verify the new R object:

str(ncdata)
Formal class 'SpatialGridDataFrame' [package "sp"] with 4 slots
 ..@ data       :'data.frame':	16616 obs. of  2 variables:

and also check the data structure in more detail:

str(ncdata@data)
'data.frame':	16616 obs. of  2 variables:
 $ geology_30m: Factor w/ 12 levels "CZfg_217","CZlg_262",..: NA NA NA NA NA NA NA NA NA NA ...
 $ elevation  : num  NA NA NA NA NA NA NA NA NA NA ...

The metadata are now accessed and available, but are not (yet) used to structure the sp class objects, in this case a SpatialGridDataFrame object filled with data from two North Carolina layers. Here is a plot of the elevation data:

image(ncdata, "elevation", col = terrain.colors(20))

Add a title to the plot:

title("North Carolina elevation")
Ncdata.png

In addition, we can show what is going on inside the objects read into R:

str(G)
List of 26
 $ DEBUG        : chr "0"
 $ LOCATION_NAME: chr "nc_spm_08_grass7"
 $ GISDBASE     : chr "/home/veroandreo/grassdata"
 $ MAPSET       : chr "PERMANENT"
 $ GUI          : chr "wxpython"
 $ projection   : chr "99"
 $ zone         : chr "0"
 $ n            : num 228500
 $ s            : num 215000
 $ w            : num 630000
 $ e            : num 645000
 $ t            : num 1
 $ b            : num 0
 $ nsres        : num 27.5
 $ nsres3       : num 10
 $ ewres        : num 37.5
 $ ewres3       : num 10
 $ tbres        : num 1
 $ rows         : int 491
 $ rows3        : int 1350
 $ cols         : int 400
 $ cols3        : int 1500
 $ depths       : int 1
 $ cells        : chr "196400"
 $ cells3       : chr "2025000"
 $ proj4        : chr "+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137 +"| __truncated__
 - attr(*, "class")= chr "gmeta"

summary(ncdata)
Object of class SpatialGridDataFrame
Coordinates:
        min    max
[1,] 630000 645000
[2,] 215000 228500
Is projected: TRUE 
proj4string :
[+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334
+lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137
+rf=298.257222101 +towgs84=0.000,0.000,0.000 +to_meter=1]
Grid attributes:
  cellcentre.offset cellsize cells.dim
1          630018.8 37.50000       400
2          215013.7 27.49491       491
Data attributes:
   geology_30m      elevation     
 CZfg_217:70381   Min.   : 55.92  
 CZig_270:66861   1st Qu.: 94.78  
 CZbg_405:24561   Median :108.88  
 CZlg_262:19287   Mean   :110.38  
 CZam_862: 6017   3rd Qu.:126.78  
 CZbg_910: 4351   Max.   :156.25  
 (Other) : 4942

Example 2: raster maps

Here an example for a single raster map transfer from GRASS GIS to R:

library(rgrass7)
execGRASS("g.region", raster = "elevation", flags = "p")
ncdata <- readRAST("elevation", cat=FALSE)
summary(ncdata)
spplot(ncdata, col = terrain.colors(20))

Example 3: vector maps

Here an example for a single vector map transfer from GRASS GIS to R:

# needed prerequisite
library(rgrass7)
use_sf()
library(sf)

execGRASS("v.info", map="schools", layer="1")
vInfo("schools")
myschools <- readVECT("schools")
print(summary(myschools))

Summarizing data

We can create a table of cell counts:

table(ncdata$geology_30m)
CZfg_217 CZlg_262 CZig_270 CZbg_405 CZve_583 CZam_720 CZg_766 CZam_862 CZbg_910 Km_921 CZbg_945 CZam_946 CZam_948
70381 19287 66861 24561 2089 467 691 6017 4351 1211 1 398 85

And compare with the equivalent GRASS module:

execGRASS("r.stats", flags=c("c", "l"), parameters=list(input="geology_30m"), ignore.stderr=TRUE)
217 CZfg 70381
262 CZlg 19287
270 CZig 66861
405 CZbg 24561
583 CZve 2089
720 CZam 467
766 CZg 691
862 CZam 6017
910 CZbg 4351
921 Km 1211
945 CZbg 1
946 CZam 398
948 CZam 85

Create a box plot of geologic types at different elevations:

boxplot(ncdata$elevation ~ ncdata$geology_30m, medlwd = 1)
Boxplot geo.png

Querying maps

Sometimes you may want to query GRASS GIS maps from your R session.

As an example, here the transfer of raster pixel values at the position of vector points:

# set the computational region first to the raster map:
execGRASS("g.region", raster = "elev_state_500m", flags = "p")

# query raster map at vector points, transfer result into R
goutput <- execGRASS("r.what", map="elev_state_500m", points="geodetic_pts", separator=",", intern=TRUE)
str(goutput)
chr [1:29939] "571530.81289275,265739.968425953,,187.8082200648" ...

# parse it
con <- textConnection(goutput)
go1 <- read.csv(con, header=FALSE)
str(go1)
'data.frame':	29939 obs. of  4 variables:
 $ V1: num  571531 571359 571976 572391 573011 ...
 $ V2: num  265740 265987 267049 267513 269615 ...
 $ V3: logi  NA NA NA NA NA NA ...
 $ V4: Factor w/ 22738 levels "-0.0048115728",..: 6859 6642 6749 6411 6356 6904 7506 7224 6908 7167 ...
summary(go1)
       V1               V2            V3                V4       
 Min.   :121862   Min.   :  7991   Mode:logical   0      :  723  
 1st Qu.:462706   1st Qu.:162508   NA's:29939     *      :  293  
 Median :610328   Median :204502                  0.3048 :   47  
 Mean   :588514   Mean   :200038                  0.6096 :   44  
 3rd Qu.:717610   3rd Qu.:247633                  1.524  :   42  
 Max.   :946743   Max.   :327186                  0.9144 :   23  
                                                  (Other):28767

Exporting data back to GRASS

Finally, a SpatialGridDataFrame object is written back to a GRASS raster map:

First prepare some data: (square root of elevation)

ncdata$sqdem <- sqrt(ncdata$elevation)

Export data from R back into a GRASS raster map:

writeRAST(ncdata, "sqdemNC", zcol="sqdem", ignore.stderr=TRUE)

Check that it imported into GRASS ok:

execGRASS("r.info", parameters=list(map="sqdemNC"))
 +----------------------------------------------------------------------------+
 | Map:      sqdemNC                        Date: Sun Jul 19 13:06:34 2015    |
 | Mapset:   PERMANENT                      Login of Creator: veroandreo      |
 | Location: nc_spm_08_grass7                                                 |
 | DataBase: /home/veroandreo/grassdata                                       |
 | Title:     ( sqdemNC )                                                     |
 | Timestamp: none                                                            |
 |----------------------------------------------------------------------------|
 |                                                                            |
 |   Type of Map:  raster               Number of Categories: 0               |
 |   Data Type:    DCELL                                                      |
 |   Rows:         491                                                        |
 |   Columns:      400                                                        |
 |   Total Cells:  196400                                                     |
 |        Projection: Lambert Conformal Conic                                 |
 |            N:     228500    S: 215000.0002   Res: 27.49490794              |
 |            E:     645000    W:     630000   Res:  37.5                     |
 |   Range of data:    min = 7.47818253045085  max = 12.5000787351036         |
 |                                                                            |
 |   Data Description:                                                        |
 |    generated by r.in.bin                                                   |
 |                                                                            |
 |   Comments:                                                                |
 |                                                                            |
 +----------------------------------------------------------------------------+

Using RStudio in a GRASS GIS session

If you are most used to run R through RStudio, but still want to have all GRASS data available for performing any analyses without loosing the possibility of still using GRASS command line, you can run:

GRASS> rstudio &

or, if you already are working on a certain RStudio project:

GRASS> rstudio /path/to/project/folder/ &

Then, you load rgrass7 library in your RStudio project

library(rgrass7)

and you are ready to go.

RStudio used in GRASS GIS 7 session

GRASS within R

Using GRASS GIS within a R session, i.e. you connect to a GRASS GIS location/mapset from within R (or RStudio).

Startup

In the first place, find out the path to the GRASS GIS library. This can be easily done with the following command (still outside of R; or through a system() call inside of R):

# OSGeo4W, Linux, Mac OSX users:
grass78 --config path

It may report something like:

# OSGeo4W users:
C:\OSGeo4W64\apps\grass\grass-7.8.3

# Linux, Mac OSX users:
/usr/local/grass-7.8.3

To call GRASS GIS 7 functionality from R, start R and use the initGRASS() function to define the GRASS settings. If GRASS GIS was installed through OSGeo4W, first start the OSGeo4W Shell. Note that it may start in C: or a directory where the user does not have writing permission. Before calling R or RStudio, first change to a directory where you are sure you have writing permission.

## MS-Windows users (example for an OSGeo4W 64bit installation)
# change to a directory with writing permission
C:\>d:
D:\>cd temp
D:\temp>cd testR

# start R within this directory:
D:\temp\testR>R

# to start RStudio from OSGeo4W Shell, run: "C:/Program Files/RStudio/bin/rstudio.exe"
 
# load rgrass7 package
library(rgrass7)

# initialisation of GRASS in the North Carolina sample dataset
initGRASS(gisBase = "C:/OSGeo4W64/apps/grass/grass-7.8.3",
         gisDbase = "C:/Users/marissa/grassdata/",
         location = "nc_spm_08_grass7", 
         mapset = "user1", 
         SG="elevation")

Linux and Mac OSX users just need to start R or RStudio the usual way, load the rgrass7 package and initialize GRASS GIS with initGRASS() as shown below:

# load rgrass7 package
library(rgrass7)
# initialisation of GRASS in the North Carolina sample dataset
initGRASS(gisBase = "/usr/local/grass-7.8.3", 
          home = tempdir(), 
          gisDbase = "/home/veroandreo/grassdata/",
          location = "nc_spm_08_grass7", 
          mapset = "user1", 
          SG="elevation")

Note: the optional SG parameter is a 'SpatialGrid' object to define the ‘DEFAULT_WIND’ of the temporary location.

# set computational region to default (optional)
system("g.region -dp")
# verify metadata
gmeta()

# get two raster maps into R space
ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))

# calculate object summaries
summary(ncdata$geology_30m)
 CZfg_217 CZlg_262 CZig_270 CZbg_405 CZve_583 CZam_720  CZg_766 CZam_862 
     292       78      277      102        8        1        2       25 
 CZbg_910   Km_921 CZam_946     NA's 
      18        5        2  1009790

R in GRASS in batch mode

Run the following script with

R CMD BATCH batch.R
library(rgrass7)
# initialisation of GRASS in the North Carolina sample dataset (example with a compiled version)
initGRASS(gisBase = "/home/veroandreo/software/grass78/dist.x86_64-pc-linux-gnu", 
          home = tempdir(), 
          gisDbase = "/home/veroandreo/grassdata/",
          location = "nc_spm_08_grass7", mapset = "user1", SG="elevation",
          override = TRUE)
# set region to default
system("g.region -dp")
# verify
gmeta()
# read data into R
ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))
# summary of geology map
summary(ncdata$geology_30m)
proc.time()

The result is (shorted here):

cat batch.Rout
    R version 4.0.2 (2020-06-22) -- "Taking Off Again"
    Copyright (C) 2020 The R Foundation for Statistical Computing
    Platform: x86_64-redhat-linux-gnu (64-bit)
    ...
    > library(rgrass7)
    Loading required package: XML
    GRASS GIS interface loaded with GRASS version: (GRASS not running)
    > # initialisation and the use of north carolina dataset
    > initGRASS(gisBase = "/home/veroandreo/software/grass78/dist.x86_64-pc-linux-gnu", home = tempdir(), 
    +           gisDbase = "/home/veroandreo/grassdata/",
    +           location = "nc_spm_08_grass7", mapset = "user1", SG="elevation",
    +           override = TRUE)
    gisdbase    /home/veroandreo/grassdata/ 
    location    nc_spm_08_grass7 
    mapset      user1 
    rows        13500 
    columns     15000 
    north       228500 
    south       215000 
    west        630000 
    east        645000 
    nsres       1 
    ewres       1 
    projection  +proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334
    +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137
    +rf=298.257222101 +towgs84=0.000,0.000,0.000 +type=crs +to_meter=1  
    
    > system("g.region -dp")
    projection: 99 (Lambert Conformal Conic)
    zone:       0
    datum:      nad83
    ellipsoid:  a=6378137 es=0.006694380022900787
    north:      320000
    south:      10000
    west:       120000
    east:       935000
    nsres:      500
    ewres:      500
    rows:       620
    cols:       1630
    cells:      1010600
    > gmeta()
    gisdbase    /home/veroandreo/grassdata/ 
    location    nc_spm_08_grass7 
    mapset      user1 
    rows        620 
    columns     1630 
    north       320000 
    south       10000 
    ...
    > ncdata <- readRAST(c("geology_30m", "elevation"), cat=c(TRUE, FALSE))
    ...
    > summary(ncdata$geology_30m)
    CZfg_217 CZlg_262 CZig_270 CZbg_405 CZve_583 CZam_720  CZg_766 CZam_862 
         292       78      277      102        8        1        2       25 
    CZbg_910   Km_921 CZam_946     NA's 
          18        5        2  1009790 
    > proc.time()
       user  system elapsed 
      8.061   2.016  10.048

Troubleshooting

Running out of disk space

Linux: A common issue is that /tmp/ is used for temporary files albeit being often a small partition. To change that to a larger directory, you may add to your $HOME/.bashrc the entry:

# change TMP directory of R (note: of course also another directory than suggested here is fine)
mkdir -p $HOME/tmp
export TMP=$HOME/tmp

The drawback of this approach is that on modern Linux systems the /tmp/ is a fast RAM disk (based on tempfs) while HOME directories are often on slower spinning disks (unless you have a SSD drive). At least you no longer run out of disk space easily.