Temporal data processing: Difference between revisions

From GRASS-Wiki
Jump to navigation Jump to search
m (Restore page (revert Wiki glitch))
 
(62 intermediate revisions by 4 users not shown)
Line 1: Line 1:
== Introduction ==
== Introduction ==


TGRASS is the temporal enabled GRASS GIS. It is available from [http://grass.osgeo.org/grass7/ GRASS GIS 7] onwards.
TGRASS is the temporal enabled GRASS GIS.
TGRASS is completely metadata based, i.e. it does not change any data but simply handles the organization of raster, vector, raster3D maps actually stored in a GRASS GIS mapset by registering in an additional internal database. This is done specifically for managing temporal and spatial extent including temporal topology.
TGRASS is completely metadata based, i.e. it does not change any data but simply handles the organization of raster, vector, raster3D maps actually stored in a GRASS GIS mapset by registering in an additional internal database. This is done specifically for managing temporal and spatial extent including temporal topology.
Manual overview: https://grass.osgeo.org/grass-stable/manuals/temporalintro.html


== Terminology Overview ==
== Terminology Overview ==
Line 19: Line 21:
== Example workflow for a Chlorophyll-a MODIS time series ==
== Example workflow for a Chlorophyll-a MODIS time series ==


The following examples are based on a series of MODIS L3 Chlorophyll-a product that is freely available at the [http://oceancolor.gsfc.nasa.gov/cgi/l3?sen=A&per=MO&prd=CHL_chlor_a ocean color] site. So, say we download the SMI 8-day composite product at 4.6 km resolution for the period 2003-2013. That is a 506 set of images, 46 per year. Data comes as compressed HDF4 files. Chlorophyll products filenames look like this:  
The following examples are based on a series of MODIS L3 Chlorophyll-a product that is freely available at the [http://oceancolor.gsfc.nasa.gov/cgi/l3?sen=A&per=MO&prd=CHL_chlor_a ocean color] site. So, say we download the SMI 8-day composite product at 4.6 km resolution for the period 2003-2013. That is a 506 set of images, 46 per year. Data comes as compressed HDF4 files. Chlorophyll products file names look like this:  


A20030012003008.L3m_8D_CHL_chlor_a_4km
<code>A20030012003008.L3m_8D_CHL_chlor_a_4km</code>
 
where,


   A: MODIS/Aqua
   A: MODIS/Aqua
Line 48: Line 52:
* import global images (506 images, 150 Mb each) with r.in.gdal, resize to study area and remove global files
* import global images (506 images, 150 Mb each) with r.in.gdal, resize to study area and remove global files
* set projection and extension, and resize to study area with gdal_translate, and import already resized images with r.in.gdal
* set projection and extension, and resize to study area with gdal_translate, and import already resized images with r.in.gdal
* import global images (506 images, 150 Mb each) with r.external, resize to study area and remove global files, as showed next:
* import global images (506 images, 150 Mb each) with r.external, resize to study area and remove global files, as showed below:


<source lang="bash">
<source lang="bash">
Line 55: Line 59:


suffix=_tmp
suffix=_tmp
for map in *chlor*; do
for map in *chlor* ; do
  r.external input=$map output=${map}${suffix} -o
    r.external input=$map output=${map}${suffix} -o
  r.mapcalc expression="$map = ${map}${suffix}"
    r.mapcalc expression="$map = ${map}${suffix}"
  g.remove type=raster name=${map}${suffix}
    g.remove type=raster name=${map}${suffix}
done
done
</source>
</source>


Once we have maps inside GRASS we can start the temporal processing. If this is the first time you'll use temporal modules,  
Once we have maps inside GRASS GIS database we can start the temporal processing. If this is the first time you'll use temporal modules,  
you need to run  
you need to run  


Line 69: Line 73:
</source>
</source>


to set the default temporal GIS database connection for the current mapset. The default TGIS database of type ''sqlite3'' is located in the PERMANENT mapset directory. Temporal GIS content from all created mapsets will be stored there.
to set the default temporal GIS database connection for the current mapset. If the connection is not initially set with {{cmd|t.connect}}, any temporal module that you try to run for the first time will first set the connection to the temporal database for you. The default TGIS database of type ''sqlite3'' is located in the PERMANENT mapset directory. Temporal GIS content from all created mapsets will be stored there.


=== Creating a STRDS and registering maps ===
=== Creating a STRDS and registering maps ===
Line 78: Line 82:


<source lang="bash">
<source lang="bash">
t.create type=strds temporaltype=absolute output=cla_test title="Chlorophyll-a concentration" \
t.create type=strds temporaltype=absolute output=cla \
  description="MODIS L3 Chlorophyll-a concentration for Argentinian sea" semantictype=mean
title="Chlorophyll-a concentration" \
description="MODIS L3 Chlorophyll-a concentration for Argentinian sea" \
semantictype=mean
</source>
</source>


'''Registering maps'''
'''Registering maps'''


Then, we '''register the maps''' in the strds using {{cmd|t.register}}. This module assigns time stamps to raster, 3D raster and vector maps and register them into space time datasets. Existing timestamps can be read and used by t.register. Note that this is a metadata based registration, nothing is imported into GRASS GIS since the maps are already present in the location (hence, no duplication occurs).
Then, we '''register the maps''' in the strds using {{cmd|t.register}}. This module assigns timestamps to raster, 3D raster and vector maps and register them in the temporal database and into space time datasets. Existing timestamps can be read and used by t.register. Note that this is a metadata based registration, nothing is imported into GRASS GIS since the maps are already present in the location (hence, no duplication occurs).


This module (and TGRASS in general) supports '''absolute time''' and '''relative time'''. The ''absolute'' temporal type refers to a fixed date or interval in the Gregorian calendar, while the ''relative'' temporal type refers to data without fixed time stamps (e.g., sequential maps used to calculate multi-decadal averages). Refer to the [http://www.geostat-course.org/system/files/Presentation_0.pdf TGRASS overview slides] for background.
This module (and TGRASS in general) supports '''absolute time''' and '''relative time'''. The ''absolute'' temporal type refers to a fixed date or interval in the Gregorian calendar, while the ''relative'' temporal type refers to data without fixed time stamps (e.g., sequential maps used to calculate multi-decadal averages). Refer to the [http://www.geostat-course.org/system/files/Presentation_0.pdf TGRASS overview slides] for background.


Maps can be registered by command line argument (a list of comma separated map names) or using an input file. The start time, end time and a temporal increment can be provided either by command line or in the input file. End time and increment are mutual exclusive. Maps can be registered in several space time datasets using the same timestamp.
Maps can be registered by command line argument (i.e., a list of comma separated map names) or using an input file. The start time, end time and a temporal increment can be provided either by command line or in the input file. End time and increment are mutually exclusive. Maps can be registered in several space time datasets using the same timestamp. For a more detailed explanation and examples on how to register maps in stds, see also [https://grasswiki.osgeo.org/wiki/Temporal_data_processing/maps_registration maps registration] wiki page.  


Start time and end time with absolute time must be provided using the format yyyy-mm-dd HH:MM:SS +HHMM. It is also supported to specify the date yyyy-mm-dd only. In case of relative time the temporal unit (years, months, days, hours, minutes or seconds) must be provided. The relative start time, end time and the increment are integers (see also [https://grass.osgeo.org/grass70/manuals/t.register.html#input-file-format input file format]).
Start time and end time with absolute time must be provided using the format yyyy-mm-dd HH:MM:SS +HHMM. It is also supported to specify the date yyyy-mm-dd only. In case of relative time, the temporal unit (years, months, days, hours, minutes or seconds) must be provided. The relative start time, end time and the increment are integers (see also {{cmd|t.register}}).


<source lang="bash">
<source lang="bash">
# note: with -i we create intervals (start and end time), for the given increment, and starting from start.
# note: with -i we create intervals (start and end time) spanning the given increment, and starting from start.
t.register -i type=raster input=cla_test \
t.register -i type=raster input=cla \
  maps=`g.list raster pattern="*_chlor_*" separator=comma`\
maps=`g.list raster pattern="*_chlor_*" separator=comma` \
  start="2003-01-01" increment="8 days"
start="2003-01-01" \
increment="8 days"
</source>
</source>


Line 105: Line 112:


<source lang="python">
<source lang="python">
# Script to extract the input table for t.register from map names
# Script to extract the input file for t.register from map names
# run in python
# run in python
# modify to your needs
# modify to your needs
Line 139: Line 146:
</source>
</source>


Using the number of characters in the filenames and datetime library in Python, you can convert DOY in the filenames into start_time and end_time as in the list you need to pass to {{cmd|t.register}}. The resulting list
Using the number of characters in the filenames and the datetime library in Python, you can convert DOY in the filenames into start_time and end_time as in the file you need to pass to {{cmd|t.register}}:
looks like this:


   A20030012003008.L3m_8D_CHL_chlor_a_4km_arg|2003-01-01 00:00:00|2003-01-09 00:00:00
   A20030012003008.L3m_8D_CHL_chlor_a_4km_arg|2003-01-01 00:00:00|2003-01-09 00:00:00
Line 156: Line 162:
</source>
</source>


As we are providing start and end time along with map names, we don't need to use the -i flag, neither set start nor increment options because that information is all there in the file.  
As we are providing start and end time along with map names, we don't need to use the -i flag, neither set start nor increment options because that information is already contained in the file.  


'''Assign a color table'''
'''Assign a color table'''


We can also set a color palette for all maps in the strds with:  
We can also set a color table for all maps in the strds with {{cmd|t.rast.colors}}:  


<source lang="bash">
<source lang="bash">
Line 168: Line 174:
t.rast.colors input=cla rules=path/to/your/color_table
t.rast.colors input=cla rules=path/to/your/color_table
</source>
</source>
Further examples of '''extracting timestamps from map names''' can be found here: [https://grasswiki.osgeo.org/wiki/Temporal_data_processing/maps_registration maps registration] wiki page.


=== Getting some basic info and statistics ===
=== Getting some basic info and statistics ===
Line 230: Line 238:
   | # 2014-04-29 14:23:23  
   | # 2014-04-29 14:23:23  
   | t.register --o type="rast" input="cla"
   | t.register --o type="rast" input="cla"
   |    file="map_list"
   |    file="input_list_cla.txt"
   |  
   |  
   +----------------------------------------------------------------------------+
   +----------------------------------------------------------------------------+
Line 240: Line 248:
</source>
</source>


In Linux-based systems you can send the output to a text file using  
Or, you can send the output to a text file using


<source lang="bash">
<source lang="bash">
t.rast.univar input=cla separator=comma > stats_cla.csv
t.rast.univar input=cla separator=comma output=stats_cla.csv
</source>
</source>


Line 250: Line 258:
=== Listing maps and selections ===
=== Listing maps and selections ===


The module {{cmd|t.rast.list}} allows you to list registered maps of a strds and provides several options to achieve what you want. For example, you can select different granules and also perform different queries by means of the where parameter. Some examples are:
The module {{cmd|t.rast.list}} allows you to list all the maps registered in a strds or a selection of them and, provides options for different listing methods, sorting of the list, information to print and so on. The ''where'' option in t.rast.list allows to perform different selections of maps to list. The ''columns'' that can be used to perform these selections are: ''id, name, creator, mapset, temporal_type, creation_time, start_time, end_time, north, south, west, east, nsres, ewres, cols, rows, number_of_cells, min and max''. (Note that for vector time series, i.e. stvds, the columns in {{cmd|t.vect.list}} differ from those for strds. You can check that with <code>t.vect.list --help</code>).


<source lang="bash">
<source lang="bash">
Line 278: Line 286:


<source lang="bash">
<source lang="bash">
t.rast.list input=cla_orig where="start_time=datetime(start_time, 'start of year', ' 0 month')"
t.rast.list input=cla where="start_time=datetime(start_time, 'start of year', ' 0 month')"
</source>
</source>
Note: '''Do not forget to use single quotes around date''', neither in {{cmd|t.rast.list}} nor in any other of the t.* modules offering the ''where'' option, because the clause will be ignored (i.e.: no selection performed) and no warning nor error will be printed.


=== Visualization ===
=== Visualization ===
Line 301: Line 311:


# Select strds
# Select strds
# Select pair of coordinates (east,north) or point in the map
# Select pair of coordinates (east,north) or mark a point in the map
# Hit Run
# Hit Draw
# Customize as desired
# Customize plot as desired
# Save
# Save


Line 311: Line 321:


<source lang="bash">
<source lang="bash">
g.gui.animation strds=cla
g.gui.animation strds=cla_monthly_average
</source>
</source>


Line 320: Line 330:
For aggregations of data with different methods and different granularities, there are two very useful commands:  
For aggregations of data with different methods and different granularities, there are two very useful commands:  


* {{cmd|t.rast.series}} that performs different aggregation algorithms from r.series on all or a subset of raster maps in a strds, and  
* {{cmd|t.rast.series}} that applies different aggregation methods from {{cmd|r.series}} on all or a subset of raster maps in a strds, and  


* {{cmd|t.rast.aggregate}} that temporally aggregates the maps in a strds by a user defined granularity.
* {{cmd|t.rast.aggregate}} that temporally aggregates the maps in a strds by a user defined granularity.
Line 330: Line 340:
<source lang="bash">
<source lang="bash">
# yearly aggregation
# yearly aggregation
t.rast.aggregate input=cla output=cla_yearly_average \
t.rast.aggregate input=cla output=cla_yearly_average basename=cla_yearly_average \
   basename=cla_yearly_average granularity="1 years" \
   granularity="1 years" method=average sampling=starts suffix=gran
  method=average sampling=start


# yearly aggregation with corresponding methods (output: 7 strds with 11 maps each)
# yearly aggregation with corresponding methods (output: 7 strds with 11 maps each)
for method in average median mode minimum maximum stddev range ; do
for method in average median mode minimum maximum stddev range ; do
t.rast.aggregate input=cla output=cla_yearly_${method} \
    t.rast.aggregate input=cla output=cla_yearly_${method} \
                basename=cla_yearly_${method} granularity="1 years" \
    basename=cla_yearly_${method} granularity="1 years" \
                method=${method} sampling=start
    method=${method} sampling=starts suffix=gran
done
done
</source>
</source>


With the where parameter, as exemplified before, we can select for example all 8-day products which start_time is 01 (January) over the years. Like this we can get the so-known climatologies.  
With the where parameter, as exemplified before, we can select for example all 8-day products which start_time is 01 (January) over the years. Like this we can get the so-known '''climatologies'''.  


<source lang="bash">
<source lang="bash">
Line 348: Line 357:
</source>
</source>


Generalizing a bit, we can:  
We could also get long term averages for specific periods in a time series, e.g. January 5-year mean:
 
<source lang="bash">
t.rast.series input=cla \
  where="start_time >= '2005-01-01' AND start_time <= '2009-12-01' AND strftime('%m', start_time)='01'" \
  method=average output=jan_cla_mean_2000_2004
 
# say we take a 15 years period (2000-2014) of monthly data and we want monthly 5-year averages
YEAR_START=(2000 2005 2010)
YEAR_END=(2004 2009 2014)
count=${#YEAR_START[@]}
 
for i in `seq 1 $count` ; do
echo ${YEAR_START[$i-1]} ${YEAR_END[$i-1]}
for MONTH in `seq -w 1 12` ; do
  t.rast.series input=cla \
  where="start_time >= '"${YEAR_START[$i-1]}"-01-01' AND start_time <= '"${YEAR_END[$i-1]}"-12-01' AND strftime('%m', start_time)='"${MONTH}"'" \
  method=average output=cla_${YEAR_START[$i-1]}_${YEAR_END[$i-1]}_month_${MONTH} --o
done
done
</source>
 
Generalizing a bit, to obtain monthly climatologies from a time series, we can:  


<source lang="bash">
<source lang="bash">
# monthly climatologies
# monthly climatologies
for i in 01 02 03 04 05 06 07 08 09 10 11 12 ; do  
for MONTH in `seq -w 1 12` ; do  
   for m in average median stddev minimum maximum ; do  
   for METHOD in average median stddev minimum maximum ; do  
     t.rast.series input=cla method=${m} where="strftime('%m', start_time)='${i}'" output=${m}_${i}
     t.rast.series input=cla method=${METHOD} where="strftime('%m', start_time)='${MONTH}'" output=${METHOD}_${MONTH}
   done
   done
done
done
Line 360: Line 391:
# seasonal climatologies
# seasonal climatologies
for i in "01 02 03" "04 05 06" "07 08 09" "10 11 12" ; do
for i in "01 02 03" "04 05 06" "07 08 09" "10 11 12" ; do
set -- $i ; echo $1 $2 $3
    set -- $i ; echo $1 $2 $3
  for m in average median stddev minimum maximum ; do
    for m in average median stddev minimum maximum ; do
    t.rast.series input=cla method=${m} output=${m}_${1}\
        t.rast.series input=cla method=${m} output=${m}_${1} \
    where="strftime('%m',start_time)='"${1}"' or strftime('%m',start_time)='"${2}"' or strftime('%m', start_time)='"${3}"'"                       
        where="strftime('%m',start_time)='"${1}"' or strftime('%m',start_time)='"${2}"' or strftime('%m', start_time)='"${3}"'"                       
  done
    done
done
done
</source>
</source>
Line 372: Line 403:
<source lang="bash">
<source lang="bash">
# monthly aggregates (132 maps)
# monthly aggregates (132 maps)
for method in average minimum maximum ; do
for method in average minimum maximum ; do
t.rast.aggregate input=cla output=cla_monthly_${method} basename=cla_monthly_${method} \
    t.rast.aggregate input=cla output=cla_monthly_${method} \
  granularity="1 months" method=${method} sampling=contains
    basename=cla_monthly_${method} \
    granularity="1 months" method=${method} \
    sampling=contains suffix=gran
done
done


# January anomalies in mean, min and max Cl-a
# January anomalies in mean, min and max Cl-a
 
t.rast.list -u input=cla_monthly_average where="start_time=datetime(start_time, 'start of year', '0 month')" columns=name
t.rast.list -s input=cla_monthly_average where="start_time=datetime(start_time, 'start of year', '0 month')" columns=name


for m in average minimum maximum ; do  
for m in average minimum maximum ; do  
  for i in 1 13 25 37 49 61 73 85 97 109 121 ; do # these numbers correspond to all january monthly aggregates
    for i in 1 13 25 37 49 61 73 85 97 109 121 ; do # these numbers correspond to all January monthly aggregates
    r.mapcalc expression="Jan_${m}_anomaly_${i}=cla_monthly_${m}_${i}-01_${m}"  
        r.mapcalc expression="Jan_${m}_anomaly_${i}=cla_monthly_${m}_${i}-01_${m}"  
  done
    done
done
done
</source>
</source>
Line 392: Line 423:


<source lang="bash">
<source lang="bash">
for map in `t.rast.list -s input=cla_monthly_average where="start_time=datetime(start_time, 'start of year', '0 month')" columns=name`;  
for map in `t.rast.list -u input=cla_monthly_average where="strftime('%m', start_time)='01'" columns=name` ; do
do
    r.mapcalc expression="anomaly_${map}=${map}-01_average"  
r.mapcalc expression="anomaly_${map}=${map}-01_average"  
done
done
</source>
</source>


Generalizing, we can estimate all monthly anomalies from mean like this:
Generalizing, we can estimate all monthly anomalies from mean like:


<source lang="bash">
<source lang="bash">
for i in 01 02 03 04 05 06 07 08 09 10 11 12 ; do
for i in `seq -w 1 12` ; do
  for map in `t.rast.list -s input=cla_monthly_average columns=name where="strftime('%m', start_time)='"${i}"'"` ; do
    for map in `t.rast.list -u input=cla_monthly_average columns=name where="strftime('%m', start_time)='"${i}"'"` ; do
    r.mapcalc expression="anomaly_${map}=${map}-average_${i}"
        r.mapcalc expression="anomaly_${map}=${map}-average_${i}"
  done
    done
done
done
</source>
</source>
Line 415: Line 445:


# map index for the yearly maximum Cl-a value  
# map index for the yearly maximum Cl-a value  
t.rast.aggregate input=cla granularity="1 year" method=max_raster output=yearly_max_index basename=yearly_max_index
t.rast.aggregate input=cla granularity="1 year" method=max_raster output=yearly_max_index basename=yearly_max_index suffix=gran
</source>
</source>


Line 429: Line 459:
=== Spatio-temporal algebra with STRDS ===
=== Spatio-temporal algebra with STRDS ===


The module {{cmd|t.rast.mapcalc}} allows us to perform spatio-temporal mapcalc expressions on temporally sampled maps of strds. There are spatial and temporal operators available for the "expression" string. Spatial operators, functions and internal variables are those used in {{cmd|r.mapcalc}}. Temporal internal variables supported for both relative and absolute time include: td(), start_time() and end_time(). There are also several very useful internal variables supported especially for absolute time of the current sample interval or instance, e.g.: start_doy(), start_year(), start_month() and so on (see {{cmd|t.rast.mapcalc}} manual site for further details and examples).
The module {{cmd|t.rast.mapcalc}} allows us to perform spatio-temporal mapcalc expressions on temporally sampled maps of strds. There are spatial and temporal operators available for the "expression" string. Spatial operators, functions and internal variables are those used in {{cmd|r.mapcalc}}. Temporal internal variables supported for both relative and absolute time include: td(), start_time() and end_time(). There are also several very useful internal variables supported especially for absolute time of the current sample interval or instance, e.g.: start_doy(), start_year(), start_month() and so on (see {{cmd|t.rast.mapcalc}} manual for further details and examples).


Some examples now. Say we did some analysis and decided that we will only consider values higher than 0.05. Then, we need to set all values below that threshold to null.
Some examples now. Say we did some analysis and decided that we will only consider values higher than 0.05. Then, we need to set all values below that threshold to null.
Line 466: Line 496:
</source>
</source>


In the development version of GRASS GIS, there's also a {{cmd|t.rast.algebra}} module that allows for temporal and spatial operations on strds by means of temporal raster algebra. The module expects an expression as input parameter in the following form:  
There's also a {{cmd|t.rast.algebra}} module that allows for temporal and spatial operations on strds by means of temporal raster algebra. The module expects an expression in the following form: "result = expression".
 
'''"result = expression"'''


The statement structure is similar to r.mapcalc, the result is the name of a new strds that will contain the result of the calculation given as expression. Expressions can be any valid or nested combination of temporal operations and spatial overlay or buffer functions that are provided by the temporal algebra. See the manual for further details and explanations.
The statement structure is similar to r.mapcalc, the result is the name of a new strds that will contain the result of the calculation given as expression. Expressions can be any valid or nested combination of temporal operations and spatial overlay or buffer functions that are provided by the temporal algebra. See the manual for further details and explanations.
Line 475: Line 503:


<source lang="bash">
<source lang="bash">
# we set 8 as fixed denominator, because products are 8-day compositions
t.rast.algebra expression="slope_cla = (cla[1]-cla[0])/8.0" basename=slope_cla
t.rast.algebra expression="slope_cla = (cla[1]-cla[0])/8.0" basename=slope_cla
# we set 8 as fixed denominator, because products are 8-day compositions
</source>
</source>


We can then use any of the aggregation modules that we saw before to get the maximum or minimumm rate of change for different granularities.
We can then use any of the aggregation modules that we saw before to get the maximum or minimumm rate of change for different granularities.
===== Some known issues with t.rast.mapcalc =====
It has been observed that if the names of inputs and output space time datasets (partially) match each other, t.rast.mapcalc may not work properly (especially when several input strds are used). This is because the module uses a simple search and replace approach to substitute the input strds with the corresponding map names. Eventualy, choosing a specific order for the input strds in the input parameter may reduce the risk of wrong substitution. Therefore, especially in operations involving several strds (with partially matching names) as inputs, '''it is recommended to use {{cmd|t.rast.algebra}} that correctly recognizes spatio-temporal datasets''' rather than t.rast.mapcalc.


=== Subseting and something else ===
=== Subseting and something else ===


{{cmd|t.rast.extract}} is another really useful module in temporal GRASS. It allows to extract a subset of a strds and store it in a different strds. You use the "where" condition to do the subset, but you can also specify a r.mapcalc sub-expression that performs operations on all maps of the selected subset. If no r.mapcalc expression is defined, the selected maps are simply registered in the new output strds.
{{cmd|t.rast.extract}} is another really useful module within temporal GRASS. It allows to extract a subset of a strds and store it in a different strds. The "where" condition is used to do the subset, but you can also specify a r.mapcalc sub-expression that performs operations on all maps of the selected subset. If no r.mapcalc expression is defined, the selected maps are simply registered in the new output strds.


Say we need to need to know in how many maps of the strds min values are below a threshold, and not only that, but you also want to know how many pixels per map meet that condition. Then, we can do
Say we need to know in how many maps of the strds minimum values are below a certain threshold, and not only that, but we also want to know how many pixels per map meet that condition. Then, we can do


<source lang="bash">
<source lang="bash">
t.rast.extract input=cla where="min <= '0.05'" output=cla_less_05 basename=cla_less_05 expression="if(cla<0.05,1,null())"
t.rast.extract input=cla where="min <= '0.05'" output=cla_less_05 basename=cla_less_05 expression="if(cla < 0.05, 1, null())"
</source>
</source>


Line 500: Line 532:
</source>
</source>


=== Importing / Exporting strds ===
=== Importing / Exporting STRDS ===


Say we now need to do some processing in R (e.g.: run [http://menugget.blogspot.com.ar/2012/10/dineof-data-interpolating-empirical.html DINEOF] to fill gaps in data), so we need to export our strds, hence we use {{cmd|t.rast.export}}.  
Say we now need to do some processing in R (e.g.: run [http://menugget.blogspot.com.ar/2012/10/dineof-data-interpolating-empirical.html DINEOF] to fill gaps in data), so we need to export our strds, hence we use {{cmd|t.rast.export}}.  
Line 513: Line 545:
t.rast.import input=cla_from_R.tar.gz output=cla_from_R basename=new_map directory=/tmp
t.rast.import input=cla_from_R.tar.gz output=cla_from_R basename=new_map directory=/tmp
</source>
</source>
For a full example of how to handle raster time series between GRASS and R, please visit the  [https://grasswiki.osgeo.org/wiki/Temporal_data_processing/raster_time_series_grass_R_statistics time series GRASS-R Statistics] wiki. The example is based on North Carolina climate location [http://courses.ncsu.edu/mea592/common/media/02/nc_climate_spm_2000_2012.zip] and includes the following steps:
# Exporting the strds out of GRASS
# Importing into R
# Re-formating the data
# Running [http://modb.oce.ulg.ac.be/mediawiki/index.php/DINEOF DINEOF]
# Re-constructing the raster time series
# Exporting out of R, and
# Importing the gap-filled new strds into GRASS.


=== Neighborhood analysis ===
=== Neighborhood analysis ===
Line 522: Line 564:
</source>
</source>


=== Filling and reconstructing time series data with gaps - HANTS ===
=== Filling and reconstructing time series data with gaps ===


[http://grass.osgeo.org/grass70/manuals/addons/r.hants.html r.hants] is an add-on not strictly within the temporal modules, but really useful when working with this kind of data, particularly with gappy data. Here's a simple example for generating a list of temporally ordered maps to use as input in '''r.hants''', running HANTS and getting a map of dominant frequencies, afterwards. See the manual for further information on parameter setting and explanations.
==== Harmonic ANalysis of Time Series - HANTS ====
 
{{AddonCmd|r.hants}} is an add-on not strictly within the temporal modules, but really useful when working with time series data, particularly with gappy data. Here's a simple example for generating a list of temporally ordered maps to use as input in '''r.hants''', running HANTS and getting a map of dominant frequencies, afterwards. See the manual for further information on parameter setting and explanations.


<source lang="bash">
<source lang="bash">
# use t.rast.list to create a list of temporally ordered maps   
# use t.rast.list to create a list of temporally ordered maps   
t.rast.list input=cla order=start_time columns=name -s > map_list.csv
t.rast.list input=cla order=start_time columns=name -u > map_list.csv


# a HANTS run, using the list of maps
# a HANTS run using the list of maps
r.hants file=map_list.csv nf=5 fet=0.1 dod=11 base_period=46 suffix=_hants amplitude=amp_hants phase=pha_hants
r.hants -l file=map_list.csv nf=5 fet=0.1 dod=11 base_period=46 suffix=_hants amplitude=amp_hants phase=pha_hants


# dominant frequency map (0 means the dominant freq is 1, 1 that dominant freq is 2, and so on...)
# dominant frequency map (0 means the dominant freq is 1, 1 that dominant freq is 2, and so on...)
r.series input=`g.list raster pattern=amp_hants* separator=comma` output=dominant_freq_hants method=max_raster
r.series input=`g.list type=raster pattern="amp_hants*" separator=comma` output=dominant_freq_hants method=max_raster
</source>
 
==== Local Weighted Regression - LWR ====
 
Another very useful add-on for the reconstruction of gappy time series, such as those from remote sensing imagery, is {{AddonCmd|r.series.lwr}}. This module performs a local weighted regression (LWR) of the input maps in order to estimate missing values and identify (and remove) outliers. See the manual for further information and explanations.
 
<source lang="bash">
# use same list as before
t.rast.list input=cla order=start_time columns=name -u > map_list.csv
 
# run r.series.lwr
r.series.lwr file=map_list.csv suffix=_lwr order=2 weight=tricube range=0.0,65.0 dod=16 fet=0.1 -l
</source>
 
==== Estimate Mean Absolute Error (MAE) ====
 
One of the measures used to assess how close forecasts or predictions might be to the observed values is the
[https://en.wikipedia.org/wiki/Mean_absolute_error mean absolute error]. We will use it to evaluate the performance of HANTS and LWR.
 
First, we will rebuild our time series with the corresponding output maps of r.hants and r.series.lwr. Here, only showed for LWR.
 
<source lang="bash">
# re-build time series
t.create type=strds temporaltype=absolute output=cla_lwr \
title="LWR output for Chl-a" \
description="MODIS Aqua L3 Chl-a 8-day 4km 2010-2013. Reconstruction with r.series.lwr"
 
# create list with filenames to parse
g.list type=raster pattern="*lwr" output=names_list
 
# parse filenames, convert YYYY-DOY to YYYY-MM-DD and write file to use in t.register
for mapname in `cat names_list` ; do
  year_start=`echo ${mapname:1:4}`
  doy_start=`echo ${mapname:5:3}`
  year_end=`echo ${mapname:8:4}`
  doy_end=`echo ${mapname:12:3}`
  # convert YYYY-DOY to YYYY-MM-DD
  #BEWARE: leading zeros make bash assume the number is in base 8 system, not base 10!
  doy_start=`echo "$doy_start" | sed 's/^0*//'`
  doy_end=`echo "$doy_end" | sed 's/^0*//'`
  START_DATE=`date -d "${year_start}-01-01 +$(( ${doy_start} - 1 ))days" +%Y-%m-%d`
  END_DATE=`date -d "${year_end}-01-01 +$(( ${doy_end} ))days" +%Y-%m-%d`
  # print mapname, start and end date
  echo "$mapname|$START_DATE|$END_DATE" >> map_list_start_and_end_time.txt
done
 
# register maps in strds
t.register input=cla_lwr type=raster file=map_list_start_and_end_time.txt
</source>
 
Now, we will estimate the MAE and use it to compare both methods. Again, only the example for LWR is showed.
 
<source lang="bash">
# obtain a strds of the absolute differences between predicted and observed Chl-a values
t.rast.algebra -n \
basename=lwr_minus_orig suffix=gran \
expression="abs_lwr_minus_orig = abs(cla_lwr - cla)"
 
# sum of the absolute differences (numerator) and count of non-null maps per pixel (denominator)
t.rast.series input=abs_lwr_minus_orig output=sum_abs_lwr_minus_orig method=sum
t.rast.series input=abs_lwr_minus_orig output=count_abs_lwr_minus_orig method=count
 
# MAE for LWR
r.mapcalc expression="mae_lwr = sum_abs_lwr_minus_orig / count_abs_lwr_minus_orig"
 
# remove intermediate strds and maps
t.remove -rf abs_lwr_minus_orig
g.remove -f type=raster name=sum_abs_lwr_minus_orig,count_abs_lwr_minus_orig
</source>
</source>
Let's now compare the predictions generated by both methods, HANTS and LWR, and the corresponding mean absolute error maps.
{|
|-
| [[Image:cla_vs_cla_hants.png|center|thumb|560px|Original Chl-a time series vs HANTS output.]] || [[Image:cla_vs_cla_lwr.png|center|thumb|560px|Original Chl-a time series vs LWR output.)]]
|-
| [[Image:mae_hants_map.png|center|thumb|560px|MAE of HANTS predictions.]] || [[Image:mae_lwr_map.png|center|thumb|560px|MAE of LWR predictions.]]
|}


=== Extract strds values for points in a vector ===
=== Extract strds values for points in a vector ===


Let's say we need to extract values of the strds to check the behaviour of HANTS at given locations and compare it with the original values, you can graphically do that with g.gui.tplot. But, if you need to take data outside GRASS, say, read it into R and do some other analysis (See [http://grasswiki.osgeo.org/wiki/R_statistics/rgrass7#Reading_in_data R_Statistics wiki page]) you may use {{cmd|v.what.strds}}. This module retrieves raster values from a given strds to the attribute table of a point vector map.
Let's say we need to extract values of the strds to check the behaviour of HANTS at given locations and compare it with the original values. We can graphically do that with {{cmd|g.gui.tplot}}, but if you need to take data outside GRASS, say, read it into R and do some other analysis (See [http://grasswiki.osgeo.org/wiki/R_statistics/rgrass7#Reading_in_data R_Statistics wiki page]) you may use {{cmd|v.what.strds}}. This module retrieves raster values from a given strds to the attribute table of a point vector map.


<source lang="bash">
<source lang="bash">
# original data
# original data
v.what.strds --overwrite input=points_cla strds=cla output=points_cla_out
v.what.strds input=points_cla strds=cla output=points_cla_out
v.db.select map=points_cla_out file=ts_points_cla.csv
v.db.select map=points_cla_out file=ts_points_cla.csv


# HANTS' reconstructed data (several runs)
# HANTS' reconstructed data (several runs)
for i in `seq 1 13` ; do
for i in `seq 1 13` ; do
  v.what.strds --overwrite input=points_cla strds=cla_hants_${i} output=points_hants_${i}_out
    v.what.strds --overwrite input=points_cla strds=cla_hants_${i} output=points_hants_${i}_out
  v.db.select map=points_cla file=ts_points_hants_${i}.csv
    v.db.select map=points_cla file=ts_points_hants_${i}.csv
done
done
</source>
</source>


Another option would be to use [https://grass.osgeo.org/grass71/manuals/t.rast.what.html t.rast.what] that samples a space time raster dataset at specific vector point coordinates and write the output to stdout or text file using different layouts. You don't write the vector's attribute table in this case.
Another option would be to use {{cmd|t.rast.what}} that samples a space time raster dataset at specific vector point coordinates and writes the output to stdout or text file using different layouts. This module doesn't write the vector's attribute table.


<source lang="bash">
<source lang="bash">
Line 559: Line 680:
</source>
</source>


=== TODO ===
=== Some other modules and addons useful in the time series context ===


(add some more)
* {{cmd|t.rast.accumulate}} and {{cmd|t.rast.accdetect}} to compute and detect cyclic accumulations in raster time series.
* {{AddonCmd|t.rast.what.aggr}} to sample a strds with a vector point map and get aggregated values of desired granularity either in stdout or in the vector's attribute table.
* {{AddonCmd|t.rast.out.xyz}} to export a strds to a CSV file.
* {{AddonCmd|t.rast.null}} to manage NULL-values of a strds.
* {{AddonCmd|v.strds.stats}} to calculate univariate statistics from a strds based on a vector map.
* {{AddonCmd|r.seasons}} to extract seasons from a time series.
* {{AddonCmd|r.bioclim}} to calculate bio-climatic indices.
* {{AddonCmd|r.change.info}} for landscape change assessment.
* {{AddonCmd|r.quantile.ref}} to determine the corresponding quantile for an input value from a reference raster time series.
* {{AddonCmd|r.regression.series}} and {{AddonCmd|r.mregression.series}} to perform univariate and multivariate linear regression among raster time series, respectively.
* {{AddonCmd|r.series.diversity}} to estimate diversity indices in raster time series.
* {{AddonCmd|r.series.decompose}} to decompose raster time series.
* {{AddonCmd|r.series.filter}} to apply Savitzky-Golay or median filters to raster time series.


== FAQ ==
== FAQ ==
Line 567: Line 700:
=== Best practice data organization ===
=== Best practice data organization ===


When it come to time series, often thousands of maps are involved. In order to keep control, here some suggestion
When it comes to time series, thousands of maps are often involved. In order to keep control, here some suggestions:


* separate original data from derived aggregates: store them in separate mapsets
* separate original data from derived aggregates: store them in separate mapsets
* for specific projects, maintain them in individual mapsets. You can access the TGIS data stored in other mapsets (same location) through
* for specific projects, maintain them in individual mapsets. You can access the STDS stored in other mapsets (same location) through
  t.some.module input=my_strds@the_other_mapset output=result ...
  t.some.module input=my_strds@the_other_mapset output=result ...
as long as mapsets are indeed accessible, i.e.: in the mapset's search path (See {{cmd|g.mapsets}}).
=== Best practice multi user management ===
GRASS GIS supports multi-user data management and allows several users to work in the same location within their own mapsets. Data can be read from other mapsets but not edited there. The same principle applies to space time datasets (STDS), i.e. in case of having a collection of maps corresponding to a time series in a dedicated mapset, you need to also create the STDS therein (See {{cmd|t.create}} and {{cmd|t.register}}), so it is visible to other users in other mapsets when they use {{cmd|t.list}}, for example.
=== Use of ''where'' parameter (ticket [https://trac.osgeo.org/grass/ticket/2270 #2270]) ===
Pay attention when using the "where" option in t.* modules. In some occasions it may not yield the expected results. Given the backend database chosen for TGRASS implementation, you may not be selecting all maps you think if you forget to set time along with date in the where clause. For example, let's say we need to select maps from 2005-03-01 until 2005-05-31 included. You would think the following command would do the job:
<source lang="bash">
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31'"
</source>
but, no... The last map in the list is:
temp_0516|pruebas|2005-05-30 00:00:00|2005-05-31 00:00:00
which would be equivalent to:
<source lang="bash">
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time < '2005-05-31'"
</source>
This is a product of sqlite. Therefore, if you need the last date included in your selection (map from 2005-05-31 in our example), you need to set time too.
<source lang="bash">
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31 00:00:00'" | tail -n1
temp_0517|pruebas|2005-05-31 00:00:00|2005-06-01 00:00:00
</source>


=== Aggregation with defined granularity ===
=== Aggregation with defined granularity ===
Line 581: Line 744:
=== Listing maps with specific start month ===
=== Listing maps with specific start month ===


''Q: I have a strds with 506 maps that correspond to 8-day composite products. I need to sequentially list all maps which "start_month" is January, February and so on... to use them as input in {{cmd|r.series|version=70}} (or {{cmd|t.rast.series|version=70}}). How can I achieve that?''
''Q: I have a strds with 506 maps that correspond to 8-day composite products. I need to sequentially list all maps which "start_month" is January, February and so on... to use them as input in {{cmd|r.series}} (or {{cmd|t.rast.series}}). How can I achieve that?''


A: You can use the [https://www.sqlite.org/lang_datefunc.html datetime functionality of SQLite] to perform this task, this should work for January:
A: You can use the [https://www.sqlite.org/lang_datefunc.html datetime functionality of SQLite] to perform this task, this should work for January:
Line 587: Line 750:
<source lang="bash">
<source lang="bash">
t.rast.list input=cla_null_mayor65 \
t.rast.list input=cla_null_mayor65 \
  where='start_time >= datetime(start_time, "start of year") and
where='start_time >= datetime(start_time, "start of year") and start_time <= datetime(start_time, "start of year", "1 month")'
  start_time <= datetime(start_time, "start of year", "1 month")'
</source>
</source>


Line 599: Line 761:
=== Aggregation of seasonal data using time ranges ===
=== Aggregation of seasonal data using time ranges ===


A way to aggregate seasons from daily data without granularity but by using time ranges is shown here. The issue is to include for the season calculation also a month from the previous year. This can be addressed by some datetime calculations in SQLite:
A way to aggregate seasons from daily data without granularity but by using time ranges is shown here. The issue is to include for the season calculation also a month from the previous year. This can be addressed with some datetime calculations in SQLite:


<source lang="bash">
<source lang="bash">
Line 623: Line 785:


         # Debugging only, to see what it does:
         # Debugging only, to see what it does:
echo "---- Querying ${prevyear}-${1}-01 ... ${year}-${3}-end:"
        echo "---- Querying ${prevyear}-${1}-01 ... ${year}-${3}-end:"
# we use start_time and end_time to get the proper time range
        # we use start_time and end_time to get the proper time range
         t.rast.list input=temp_daily_average@modis_lst_reconstructed_europe_daily \
         t.rast.list input=temp_daily_average@modis_lst_reconstructed_europe_daily \
              where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" > list_${prevyear}_${1}_${3}.csv
        where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" > list_${prevyear}_${1}_${3}.csv
head -n 3 list_${prevyear}_${1}_${3}.csv
echo "..."
        head -n 3 list_${prevyear}_${1}_${3}.csv
tail -n 3 list_${prevyear}_${1}_${3}.csv
        echo "..."
echo "======="
        tail -n 3 list_${prevyear}_${1}_${3}.csv
rm -f list_${prevyear}_${1}_${3}.csv
        echo "======="
 
        rm -f list_${prevyear}_${1}_${3}.csv
       
         # calculate aggregates:
         # calculate aggregates:
method="average"  # median mode minimum maximum stddev
        method="average"  # median mode minimum maximum stddev
 
       
         t.rast.series input=temp_daily_average@modis_lst_reconstructed_europe_daily \
         t.rast.series input=temp_daily_average@modis_lst_reconstructed_europe_daily \
              where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" \
        where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" \
      output=temp_${method}_${prevyear}_${1}
        output=temp_${method}_${prevyear}_${1}
     done
     done
done
done
Line 645: Line 808:
Much simpler '''alternative''' (which runs in parallel on multi-cores):
Much simpler '''alternative''' (which runs in parallel on multi-cores):
<source lang="bash">
<source lang="bash">
t.rast.aggregate input=A output=B basename=b where="start_time >= 2004-03-01" granularity="3 months"
t.rast.aggregate input=A output=B basename=b where="start_time >= '2004-03-01 00:00:00'" granularity="3 months"
</source>
</source>


== Expert tricks ==
=== How to count consecutive days that meet a certain condition? ===


BE CAREFUL. THIS IS NOT RECOMMENDED TO NEW USERS.
This example was kindly contributed by Thomas Leppelt and shows how to obtain the number of consecutive days with temperature below 0 per week.


=== Creating a TGRASS DB using data from a different mapset ===
As of GRASS GIS 7.0.2 the following is no longer needed as TGRASS now works fine also with data present in the other mapset(s).
This entire subsection should be removed.
<strike>
TGRASS is designed to only work with data present in the current mapset.
An '''expert user''' may override this in order to register data from another mapset in his/her TGRASS database.
To achieve this, the following two internal variables must be set:
<source lang="bash">
<source lang="bash">
g.gisenv set="TGIS_DISABLE_MAPSET_CHECK=True"
# Set the computational region and resolution
g.gisenv set="TGIS_DISABLE_TIMESTAMP_WRITE=True"
g.region -p n=-30 s=-50 e=-50 w=-70 res=1
</source>


in order to disable the mapset check and the writing of the timestamps of each map to the map metadata in the spatial database as text files. These variables can be set mapset specific.
# We create maps for 100 days
for map in `seq 1 100` ; do
# generate synthetic maps as a seeding random values for temperatures
r.mapcalc -s expression="daily_temp_${map} = rand(-20,20)" --o
echo daily_temp_${map} >> map_list.txt
done


Settings these variables "True" should (hopefully, because yet partially untested) allow the registration of maps outside the current mapset, even if you do not have the permission to modify the maps.
t.create type=strds temporaltype=absolute \
        output=temperature_daily \
        title="Daily Temperature" \
        description="Test dataset with daily temperature"


A warning will be printed if these variables are set True.
t.register -i type=raster input=temperature_daily \
          file=map_list.txt start="2014-03-07" increment="1 days" --o


BUT, be aware that this feature '''can lead to the corruption of the temporal database''' and unwanted side effects. You can mess up the temporal database if you are not 100% sure what you are doing. It is no longer possible to access the timestamp information of these maps using the C-libraries, because the timestamp information is not available in the map metadata text files.
# check general information of the daily strds
</strike>
t.info type=strds input=temperature_daily


=== More expert tricks ===
# Create weekly mask to which we will add 1 or 0 according to temperature in the days within it (each map in this week mask will have a value of 7)
t.rast.aggregate input=temperature_daily output=weekly_mask basename=mask_week granularity="1 weeks" method=count


Perhaps to be added...
# Calculate consecutive days with negative temperatures
t.rast.algebra base=neg_temp_days expression="consecutive_days = weekly_mask {+,contains,l} if(temperature_daily < 0 && temperature_daily[-1] < 0 || temperature_daily[1] < 0 && temperature_daily < 0, 1, 0)"
</source>


== Online tutorials and courses ==
== Online tutorials and courses ==


* [http://www.geostat-course.org/Topic_Gebbert The temporal GRASS GIS framework: Introduction and application] by Soeren Gebbert (2012)
* [http://www.geostat-course.org/Topic_Gebbert The temporal GRASS GIS framework: Introduction and application] by Sören Gebbert (2012)
* [http://fatra.cnr.ncsu.edu/temporal-grass-workshop/ Spatio-temporal data handling and visualization in GRASS GIS] - FOSS4G 2014 workshop by Vaclav Petras, Anna Petrasova, Helena Mitasova, Markus Neteler (2014)
* [http://fatra.cnr.ncsu.edu/temporal-grass-workshop/ Spatio-temporal data handling and visualization in GRASS GIS] - FOSS4G 2014 workshop by Vaclav Petras, Anna Petrasova, Helena Mitasova, Markus Neteler (2014)
* [http://www.openniche.org/processing-era-interim-dataset-in-ncdf-using-gdal-and-grass-gis/ Processing ERA-Interim dataset in netcdf using GDAL and GRASS GIS] by Sajid Pareeth (2015)
* [http://www.openniche.org/processing-era-interim-dataset-in-ncdf-using-gdal-and-grass-gis/ Processing ERA-Interim dataset in netcdf using GDAL and GRASS GIS] by Sajid Pareeth (2015)
* CZ: [http://training.gismentors.eu/grass-gis-pokrocily/tgrass/index.html Úvod do časoprostorových analýz] (Introduction to space-time analysis) by Martin Landa (2015)
* CZ: [http://training.gismentors.eu/grass-gis-pokrocily/tgrass/index.html Úvod do časoprostorových analýz] (Introduction to space-time analysis) by Martin Landa (2015)
* [https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html Sentinel spatio-temporal processing] by Martin Landa (2020)
* [https://github.com/veroandreo/tgrass_workshop_foss4g_eu TGRASS: temporal data processing with GRASS GIS] FOSS4G-EU workshop by Veronica Andreo, Luca Delucchi and Markus Neteler (2017)
* [https://gitpitch.com/veroandreo/grass-gis-geostat-2018/master?p=tgrass&grs=gitlab Spatio-temporal data processing and visualization with GRASS GIS], GEOSTAT Summer School lecture by Veronica Andreo (2018)


== References ==
== References ==
* Gebbert, S., Pebesma, E., 2014. ''TGRASS: A temporal GIS for field based environmental modeling''. Environmental Modelling & Software 53, 1-12 (http://dx.doi.org/10.1016/j.envsoft.2013.11.001 DOI]) - [http://ifgi.uni-muenster.de/~epebe_01/tgrass.pdf preprint PDF]
* Gebbert, S., Pebesma, E. 2014. ''TGRASS: A temporal GIS for field based environmental modeling''. Environmental Modelling & Software 53, 1-12 ([http://dx.doi.org/10.1016/j.envsoft.2013.11.001 DOI]) - [http://ifgi.uni-muenster.de/~epebe_01/tgrass.pdf preprint PDF]
* Gebbert, S., Pebesma, E. 2017. ''The GRASS GIS temporal framework''. International Journal of Geographical Information Science 31, 1273-1292 ([http://dx.doi.org/10.1080/13658816.2017.1306862 DOI]) - [https://figshare.com/articles/journal_contribution/The_GRASS_GIS_temporal_framework/4834013/1/files/8024429.pdf PDF]
* Gebbert, S., Leppelt, T., Pebesma, E., 2019. ''A topology based spatio-temporal map algebra for big data analysis''. Data 4, 86. ([https://doi.org/10.3390/data4020086 DOI]) - [https://www.mdpi.com/2306-5729/4/2/86/pdf?version=1560849201 PDF]


== See also ==
== See also ==
* Introduction: [http://www.slideshare.net/Luis_de_Sousa/presentation-soeren GRASS as a Temporal GIS] by Sören Gebbert (slides)
* Introduction: [http://www.slideshare.net/Luis_de_Sousa/presentation-soeren GRASS as a Temporal GIS] by Sören Gebbert (slides)
* [http://grass.osgeo.org/grass70/manuals/temporalintro.html Introduction to temporal modules] (GRASS GIS manual)
* [https://archive.org/details/TheTemporalGrassGisFrameworkIntroductionAndApplication# The temporal GRASS GIS framework: Introduction and application] Video of Sören Gebbert's presentation in [http://www.geostat-course.org/Muenster_2012 GEOSTAT course] 2012
* [http://grass.osgeo.org/grass70/manuals/temporal.html Available temporal modules] (GRASS GIS manual)
* [https://grass.osgeo.org/grass-stable/manuals/temporalintro.html Introduction to temporal modules] (GRASS GIS manual)
* [http://trac.osgeo.org/grass/wiki/Grass7/TemporalExtension Temporal Extension] (background info in trac)
* [https://grass.osgeo.org/grass-stable/manuals/temporal.html Available temporal modules] (GRASS GIS manual)
* [https://trac.osgeo.org/grass/wiki/Grass7/TemporalExtension Temporal Extension] (background info in trac)
* [[GRASS GSoC 2013 Temporal GIS Algebra for raster and vector data in GRASS]]
* [[GRASS GSoC 2013 Temporal GIS Algebra for raster and vector data in GRASS]]


[[Category: Documentation]]
[[Category: Documentation]]
[[Category:Temporal]]
[[Category: Tutorial]]
[[Category: Temporal]]

Latest revision as of 17:55, 20 December 2022

Introduction

TGRASS is the temporal enabled GRASS GIS. TGRASS is completely metadata based, i.e. it does not change any data but simply handles the organization of raster, vector, raster3D maps actually stored in a GRASS GIS mapset by registering in an additional internal database. This is done specifically for managing temporal and spatial extent including temporal topology.

Manual overview: https://grass.osgeo.org/grass-stable/manuals/temporalintro.html

Terminology Overview

  • Space time raster datasets (strds) are designed to manage raster map time series. Modules that process strds have the naming prefix t.rast.
  • Space time 3D raster datasets (str3ds) are designed to manage 3D raster map time series. Modules that process str3ds have the naming prefix t.rast3d.
  • Space time vector datasets (stvds) are designed to manage vector map time series. Modules that process stvds have the naming prefix t.vect.

Workflow overview

  1. create an empty space time datasets: strds, str3ds, or stvds (t.create)
  2. register the GRASS GIS maps (t.register)
  3. check the generated space time datasets (t.list, t.info)
  4. do your analysis: t.rast.aggregate, t.info, t.rast.univar, t.vect.univar, ... so many more!

Example workflow for a Chlorophyll-a MODIS time series

The following examples are based on a series of MODIS L3 Chlorophyll-a product that is freely available at the ocean color site. So, say we download the SMI 8-day composite product at 4.6 km resolution for the period 2003-2013. That is a 506 set of images, 46 per year. Data comes as compressed HDF4 files. Chlorophyll products file names look like this:

A20030012003008.L3m_8D_CHL_chlor_a_4km

where,

 A: MODIS/Aqua
 2003: Year at start
 001: Julian day at start
 2003: Year at end
 008: Julian day at end
 L3m: Level 3 data, mapped (Projection: Plate carrée)
 8D: 8 day composition
 CHL: Chlorophyll a concentration product
 chlor_a: algorithm used 10^(a0 + a1*X + a2*X^2 + a3*X^3 + a4*X^4) 
 4km: 4.6km pixel size (8640x4320 image, 2.5 minute resolution)

We now decompress files and check metadata

# Go to where the data is and decompress
  find -iname '*.bz2' -exec bzip2 -d {} \; 
# Check file meta-data (GDAL utilities)
  gdalinfo A20030012003008.L3m_8D_CHL_chlor_a_4km

Next step is to import all 506 images into GRASS. You can use r.in.gdal or r.external for that. Note that global Cl-a images as downloaded from ocean color site are ~150 mb each (disk space issues!). Here, 3 different options:

  • import global images (506 images, 150 Mb each) with r.in.gdal, resize to study area and remove global files
  • set projection and extension, and resize to study area with gdal_translate, and import already resized images with r.in.gdal
  • import global images (506 images, 150 Mb each) with r.external, resize to study area and remove global files, as showed below:
# define region extension
g.region -p n=-38 s=-55 w=-70 e=-56

suffix=_tmp
for map in *chlor* ; do
    r.external input=$map output=${map}${suffix} -o
    r.mapcalc expression="$map = ${map}${suffix}"
    g.remove type=raster name=${map}${suffix}
done

Once we have maps inside GRASS GIS database we can start the temporal processing. If this is the first time you'll use temporal modules, you need to run

t.connect -d

to set the default temporal GIS database connection for the current mapset. If the connection is not initially set with t.connect, any temporal module that you try to run for the first time will first set the connection to the temporal database for you. The default TGIS database of type sqlite3 is located in the PERMANENT mapset directory. Temporal GIS content from all created mapsets will be stored there.

Creating a STRDS and registering maps

Creating a STRDS

First step is to create a space time dataset by means of t.create. As an example, let us create a strds for the Chlorophyll-a (Cl-a) time series. We need to define the name and semantic type of the new space time dataset, its title and a description. By default a raster dataset is created with absolute time (Gregorian calendar). We can change that according to our needs with type and temporaltype options.

t.create type=strds temporaltype=absolute output=cla \
 title="Chlorophyll-a concentration" \
 description="MODIS L3 Chlorophyll-a concentration for Argentinian sea" \
 semantictype=mean

Registering maps

Then, we register the maps in the strds using t.register. This module assigns timestamps to raster, 3D raster and vector maps and register them in the temporal database and into space time datasets. Existing timestamps can be read and used by t.register. Note that this is a metadata based registration, nothing is imported into GRASS GIS since the maps are already present in the location (hence, no duplication occurs).

This module (and TGRASS in general) supports absolute time and relative time. The absolute temporal type refers to a fixed date or interval in the Gregorian calendar, while the relative temporal type refers to data without fixed time stamps (e.g., sequential maps used to calculate multi-decadal averages). Refer to the TGRASS overview slides for background.

Maps can be registered by command line argument (i.e., a list of comma separated map names) or using an input file. The start time, end time and a temporal increment can be provided either by command line or in the input file. End time and increment are mutually exclusive. Maps can be registered in several space time datasets using the same timestamp. For a more detailed explanation and examples on how to register maps in stds, see also maps registration wiki page.

Start time and end time with absolute time must be provided using the format yyyy-mm-dd HH:MM:SS +HHMM. It is also supported to specify the date yyyy-mm-dd only. In case of relative time, the temporal unit (years, months, days, hours, minutes or seconds) must be provided. The relative start time, end time and the increment are integers (see also t.register).

# note: with -i we create intervals (start and end time) spanning the given increment, and starting from start.
t.register -i type=raster input=cla \
 maps=`g.list raster pattern="*_chlor_*" separator=comma` \
 start="2003-01-01" \
 increment="8 days"

Dealing with weekly data

While the former would have been the simplest solution, our 8-day products have a problem: in this example the last image of each year is not a product of an 8-day composition, but 4 or 5-day. Then, when using the increment parameter, dates (and consequently, intervals) are not set properly. The solution is to create a list of maps, with their respective start and end date. As the filenames contain information regarding year and DOY (day of year), we can use the following Python script to read filenames and transform DOY to calendar dates (Thanks Soeren!).

# Script to extract the input file for t.register from map names
# run in python
# modify to your needs

from os import walk
f = []
for (dirpath, dirnames, filenames) in walk("/path/to/the/maps"):
    f.extend(filenames)
    break
# to order the list    
f.sort() 

input_list=[]

import datetime
for map_name in f:
  start_year = int(map_name[1:5])
  start_day  = int(map_name[5:8])
  end_year   = int(map_name[8:12])
  end_day    = int(map_name[12:15])
  start = datetime.datetime(start_year, 1, 1) + datetime.timedelta(start_day - 1)
  end = datetime.datetime(int(end_year), 1, 1) + datetime.timedelta(end_day)
  map_list = map_name + '|' + str(start) + '|' + str(end)
  print map_list
  input_list.append(map_list)

print input_list

txt = '\n' .join(input_list)
f = open("input_list_cla.txt","w")
f.write(txt)
f.close()

Using the number of characters in the filenames and the datetime library in Python, you can convert DOY in the filenames into start_time and end_time as in the file you need to pass to t.register:

 A20030012003008.L3m_8D_CHL_chlor_a_4km_arg|2003-01-01 00:00:00|2003-01-09 00:00:00
 A20030092003016.L3m_8D_CHL_chlor_a_4km_arg|2003-01-09 00:00:00|2003-01-17 00:00:00
 A20030172003024.L3m_8D_CHL_chlor_a_4km_arg|2003-01-17 00:00:00|2003-01-25 00:00:00
 ...
 A20133452013352.L3m_8D_CHL_chlor_a_4km_arg|2013-12-11 00:00:00|2013-12-19 00:00:00
 A20133532013360.L3m_8D_CHL_chlor_a_4km_arg|2013-12-19 00:00:00|2013-12-27 00:00:00
 A20133612013365.L3m_8D_CHL_chlor_a_4km_arg|2013-12-27 00:00:00|2014-01-01 00:00:00

and then, the command would be:

t.register --o type=raster input=cla file=input_list_cla.txt

As we are providing start and end time along with map names, we don't need to use the -i flag, neither set start nor increment options because that information is already contained in the file.

Assign a color table

We can also set a color table for all maps in the strds with t.rast.colors:

# using a predetermined color table
t.rast.colors input=cla color=bcyr
#using your dedicated color table
t.rast.colors input=cla rules=path/to/your/color_table

Further examples of extracting timestamps from map names can be found here: maps registration wiki page.

Getting some basic info and statistics

We now check the space time data sets we have in our mapset with t.list:

t.list type=strds

and get information about our recently created strds. See t.info for additional uses.

t.info type=strds input=cla
 +-------------------- Space Time Raster Dataset -----------------------------+
 |                                                                            |
 +-------------------- Basic information -------------------------------------+
 | Id: ........................ cla@clorofila
 | Name: ...................... cla
 | Mapset: .................... clorofila
 | Creator: ................... veroandreo
 | Temporal type: ............. absolute
 | Creation time: ............. 2014-04-29 14:23:00.579342
 | Modification time:.......... 2014-05-12 09:15:08.917309
 | Semantic type:.............. mean
 +-------------------- Absolute time -----------------------------------------+
 | Start time:................. 2003-01-01 00:00:00
 | End time:................... 2014-01-01 00:00:00
 | Granularity:................ 1 day
 | Temporal type of maps:...... interval
 +-------------------- Spatial extent ----------------------------------------+
 | North:...................... -38.0
 | South:...................... -55.0
 | East:.. .................... -55.0
 | West:....................... -70.0
 | Top:........................ 0.0
 | Bottom:..................... 0.0
 +-------------------- Metadata information ----------------------------------+
 | Raster register table:...... raster_map_register_91ba57d5f0924f4fa0bd7176a1b39b2f
 | North-South resolution min:. 0.041667
 | North-South resolution max:. 0.041667
 | East-west resolution min:... 0.041667
 | East-west resolution max:... 0.041667
 | Minimum value min:.......... 0.02925
 | Minimum value max:.......... 0.26472
 | Maximum value min:.......... 5.2104
 | Maximum value max:.......... 99.953934
 | Aggregation type:........... None
 | Number of registered maps:.. 506
 |
 | Title:
 | Chlorophyll-a
 | Description:
 | Concentracion de Clorofila a
 | Command history:
 | # 2014-04-29 14:23:00 
 | t.create type="strds" temporaltype="absolute"
 |     output="cla" title="Chlorophyll-a"
 |     description="Concentracion de Clorofila a" --o
 | # 2014-04-29 14:23:23 
 | t.register --o type="rast" input="cla"
 |     file="input_list_cla.txt"
 | 
 +----------------------------------------------------------------------------+

Now, we get univariate statistics from the non-null cells for each registered raster map of the strds. For that matter we use t.rast.univar which, by default, returns the name of the map, the start and end date of dataset and the following values: mean, minimum and maximum vale, mean_of_abs, standard deviation, variance, coeff_var, number of null cells, total number of cell.

t.rast.univar input=cla

Or, you can send the output to a text file using

t.rast.univar input=cla separator=comma output=stats_cla.csv

This file "stats_cla.csv" you can now open in a spreadsheet or statistical software for inspection and plotting.

Listing maps and selections

The module t.rast.list allows you to list all the maps registered in a strds or a selection of them and, provides options for different listing methods, sorting of the list, information to print and so on. The where option in t.rast.list allows to perform different selections of maps to list. The columns that can be used to perform these selections are: id, name, creator, mapset, temporal_type, creation_time, start_time, end_time, north, south, west, east, nsres, ewres, cols, rows, number_of_cells, min and max. (Note that for vector time series, i.e. stvds, the columns in t.vect.list differ from those for strds. You can check that with t.vect.list --help).

t.rast.list input=cla method=gran granule="1 month"
# this will give one image every one month, 3 months, 1 year, or the granule you choose

t.rast.list input=cla order=min columns=id,name,start_time,min where="min <= '0.05'" 
# this will order by minimum value all the maps in the strds that have a minimum value lower than or equal to 0.05

t.rast.list input=cla order=max columns=name,start_time,max where="max > '10.0'"
# maps ordered by maximum value in which maximum value is higher than 10.

t.rast.list input=cla where="start_time >= '2003-01' and start_time <= '2003-06'" 
# all the maps in the first 6 month of the time series

t.rast.list input=cla where="strftime('%m', start_time)='01'"
# all the maps from January

t.rast.list input=cla where="strftime('%m-%d', start_time)='01-01'"
# all the maps from January, 1st

t.rast.list input=cla where="strftime('%w', start_time)='1'"
# all Mondays in the time series (Sunday is 0)

If you have monthly (instead of 8-day products) data and you want to list all January maps, then you can do:

t.rast.list input=cla where="start_time=datetime(start_time, 'start of year', ' 0 month')"

Note: Do not forget to use single quotes around date, neither in t.rast.list nor in any other of the t.* modules offering the where option, because the clause will be ignored (i.e.: no selection performed) and no warning nor error will be printed.

Visualization

There are different visualization options for strds.

  • g.gui.timeline allows to compare temporal datasets by displaying their temporal extents in a plot.
# only temporal extent
g.gui.timeline inputs=cla
# temporal and spatio-temporal extent
g.gui.timeline -3 inputs=cla
Temporal and spatio-temporal extent of cla strds.
  • g.gui.tplot allows to see the values of one or more temporal datasets for a queried point defined by a coordinate pair.

Steps to use this module are:

  1. Select strds
  2. Select pair of coordinates (east,north) or mark a point in the map
  3. Hit Draw
  4. Customize plot as desired
  5. Save
Time series plot (Chlorophyll vs Time) for a certain coordinate pair in the study area
  • g.gui.animation is the tool for animating a series of raster and vector maps or a space time raster or vector dataset.
g.gui.animation strds=cla_monthly_average
Monthly mean chlorophyll-a concentration

Aggregation

For aggregations of data with different methods and different granularities, there are two very useful commands:

  • t.rast.series that applies different aggregation methods from r.series on all or a subset of raster maps in a strds, and
  • t.rast.aggregate that temporally aggregates the maps in a strds by a user defined granularity.

With these modules it is very simple to get maps of basic statistical parameters for different temporal granules, and this permits the analysis of the spatio-temporal variability of the variable of interest.

Some examples:

# yearly aggregation
t.rast.aggregate input=cla output=cla_yearly_average basename=cla_yearly_average \
  granularity="1 years" method=average sampling=starts suffix=gran

# yearly aggregation with corresponding methods (output: 7 strds with 11 maps each)
for method in average median mode minimum maximum stddev range ; do
    t.rast.aggregate input=cla output=cla_yearly_${method} \
    basename=cla_yearly_${method} granularity="1 years" \
    method=${method} sampling=starts suffix=gran
done

With the where parameter, as exemplified before, we can select for example all 8-day products which start_time is 01 (January) over the years. Like this we can get the so-known climatologies.

t.rast.series input=cla method=average where="strftime('%m', start_time)='01'" output=january_average

We could also get long term averages for specific periods in a time series, e.g. January 5-year mean:

t.rast.series input=cla \
  where="start_time >= '2005-01-01' AND start_time <= '2009-12-01' AND strftime('%m', start_time)='01'" \
  method=average output=jan_cla_mean_2000_2004

# say we take a 15 years period (2000-2014) of monthly data and we want monthly 5-year averages
YEAR_START=(2000 2005 2010)
YEAR_END=(2004 2009 2014)
count=${#YEAR_START[@]}

for i in `seq 1 $count` ; do
 echo ${YEAR_START[$i-1]} ${YEAR_END[$i-1]}
 for MONTH in `seq -w 1 12` ; do
  t.rast.series input=cla \
   where="start_time >= '"${YEAR_START[$i-1]}"-01-01' AND start_time <= '"${YEAR_END[$i-1]}"-12-01' AND strftime('%m', start_time)='"${MONTH}"'" \
   method=average output=cla_${YEAR_START[$i-1]}_${YEAR_END[$i-1]}_month_${MONTH} --o
 done
done

Generalizing a bit, to obtain monthly climatologies from a time series, we can:

# monthly climatologies
for MONTH in `seq -w 1 12` ; do 
  for METHOD in average median stddev minimum maximum ; do 
    t.rast.series input=cla method=${METHOD} where="strftime('%m', start_time)='${MONTH}'" output=${METHOD}_${MONTH}
  done
done

# seasonal climatologies
for i in "01 02 03" "04 05 06" "07 08 09" "10 11 12" ; do
    set -- $i ; echo $1 $2 $3
    for m in average median stddev minimum maximum ; do
        t.rast.series input=cla method=${m} output=${m}_${1} \
        where="strftime('%m',start_time)='"${1}"' or strftime('%m',start_time)='"${2}"' or strftime('%m', start_time)='"${3}"'"                      
    done
done

Using the climatologies previously obtained, we'll now estimate monthly anomalies in the mean, max and min Cl-a concentration. First, we need to monthly aggregate data, and then do the difference between the monthly climatology and each respective monthly aggregate. We'll do the aggregation for the average, minimum and maximum of Cl-a concentration (from 506 input maps in cla strds, we'll get 132 maps in each monthly aggregated new strds).

# monthly aggregates (132 maps)
for method in average minimum maximum ; do
    t.rast.aggregate input=cla output=cla_monthly_${method} \
    basename=cla_monthly_${method} \
    granularity="1 months" method=${method} \
    sampling=contains suffix=gran
done

# January anomalies in mean, min and max Cl-a
t.rast.list -u input=cla_monthly_average where="start_time=datetime(start_time, 'start of year', '0 month')" columns=name

for m in average minimum maximum ; do 
    for i in 1 13 25 37 49 61 73 85 97 109 121 ; do # these numbers correspond to all January monthly aggregates
        r.mapcalc expression="Jan_${m}_anomaly_${i}=cla_monthly_${m}_${i}-01_${m}" 
    done
done

You can also use t.rast.list for looping, the same way you use g.list:

for map in `t.rast.list -u input=cla_monthly_average where="strftime('%m', start_time)='01'" columns=name` ; do
    r.mapcalc expression="anomaly_${map}=${map}-01_average" 
done

Generalizing, we can estimate all monthly anomalies from mean like:

for i in `seq -w 1 12` ; do
    for map in `t.rast.list -u input=cla_monthly_average columns=name where="strftime('%m', start_time)='"${i}"'"` ; do
        r.mapcalc expression="anomaly_${map}=${map}-average_${i}"
    done
done

Say we now need to know the date of the maximum value of Cl-a concentration over all the study period and/or on a yearly basis:

# map index for the overall maximum Cl-a value 
t.rast.series input=cla method=max_raster output=cla_max_index

# map index for the yearly maximum Cl-a value 
t.rast.aggregate input=cla granularity="1 year" method=max_raster output=yearly_max_index basename=yearly_max_index suffix=gran

The outputs show (pixelwise) the map index in which the maximum value of Cl-a occurs (from 1 to 506 and from 1 to 46, for the whole time series and on a yearly basis, respectively). For relative time this is maybe enough, but you may then want to reclassify data to get DOY, for example. In that case, you may use r.reclass.

If you already have monthly data, you can get climatologies quite simply as follows:

# January averages
t.rast.series input=cla_monthly method=average where="start_time=datetime(start_time, 'start of year', '0 month')" output=jan_average

Spatio-temporal algebra with STRDS

The module t.rast.mapcalc allows us to perform spatio-temporal mapcalc expressions on temporally sampled maps of strds. There are spatial and temporal operators available for the "expression" string. Spatial operators, functions and internal variables are those used in r.mapcalc. Temporal internal variables supported for both relative and absolute time include: td(), start_time() and end_time(). There are also several very useful internal variables supported especially for absolute time of the current sample interval or instance, e.g.: start_doy(), start_year(), start_month() and so on (see t.rast.mapcalc manual for further details and examples).

Some examples now. Say we did some analysis and decided that we will only consider values higher than 0.05. Then, we need to set all values below that threshold to null.

t.rast.mapcalc input=cla expression="if(cla < 0.05, null(), cla)" output=cla_corrected  basename=cla_corrected

or we may also want to take negative erroneous values to the knowm minimum of the strds, so:

t.rast.series input=cla method=minimum output=min_cla
t.rast.mapcalc input=cla expression="if(cla < 0.0, min_cla, cla)" output=cla_corrected  basename=cla_corrected

We may also need to reclassify all maps in the strds according to a certain threshold, e.g.: a certain level of Cl-a that indicates bloom conditions, in order to get bloom frequency afterwards:

# reclassify
t.rast.mapcalc -n input=cla output=cla_bloom basename=cla_bloom expression="if(cla > 0.75, 1, null())"
# bloom frequency
t.rast.series input=cla_bloom output=bloom_freq method=count

Do you remember we wanted to get the DOY of maximum Cl-a value before? Well, here's another way of doing it...

# overall maximum value
t.rast.series input=cla method=maximum output=max_cla
# new strds with DOY of overall maximum 
t.rast.mapcalc -n inputs=cla output=date_max_cla expression="if(cla == max_cla,start_doy(),null())" basename=date_max_cla 
# map with DOY of overall maximum
t.rast.series input=date_max_cla method=maximum output=max_cla_date
# remove date_max_cla strds (we were only interested in the resulting aggregated map)
t.remove -rf inputs=date_max_cla

There's also a t.rast.algebra module that allows for temporal and spatial operations on strds by means of temporal raster algebra. The module expects an expression in the following form: "result = expression".

The statement structure is similar to r.mapcalc, the result is the name of a new strds that will contain the result of the calculation given as expression. Expressions can be any valid or nested combination of temporal operations and spatial overlay or buffer functions that are provided by the temporal algebra. See the manual for further details and explanations.

We'll use this module to estimate the rate of change (slope) between every pair of maps in the "cla" strds. The result will be a new strds consisting of maps with the slope value between every consecutive pair of maps in the original strds:

# we set 8 as fixed denominator, because products are 8-day compositions
t.rast.algebra expression="slope_cla = (cla[1]-cla[0])/8.0" basename=slope_cla

We can then use any of the aggregation modules that we saw before to get the maximum or minimumm rate of change for different granularities.

Some known issues with t.rast.mapcalc

It has been observed that if the names of inputs and output space time datasets (partially) match each other, t.rast.mapcalc may not work properly (especially when several input strds are used). This is because the module uses a simple search and replace approach to substitute the input strds with the corresponding map names. Eventualy, choosing a specific order for the input strds in the input parameter may reduce the risk of wrong substitution. Therefore, especially in operations involving several strds (with partially matching names) as inputs, it is recommended to use t.rast.algebra that correctly recognizes spatio-temporal datasets rather than t.rast.mapcalc.

Subseting and something else

t.rast.extract is another really useful module within temporal GRASS. It allows to extract a subset of a strds and store it in a different strds. The "where" condition is used to do the subset, but you can also specify a r.mapcalc sub-expression that performs operations on all maps of the selected subset. If no r.mapcalc expression is defined, the selected maps are simply registered in the new output strds.

Say we need to know in how many maps of the strds minimum values are below a certain threshold, and not only that, but we also want to know how many pixels per map meet that condition. Then, we can do

t.rast.extract input=cla where="min <= '0.05'" output=cla_less_05 basename=cla_less_05 expression="if(cla < 0.05, 1, null())"

to extract those maps with a minimum value lower than 0.05, and in the same step put 1 in every cell meeting the criterium and null everywhere else. To get a count of maps and pixels meeting the condition we may use:

# univariate stistics to get the count per map  
t.rast.univar input=cla_less_05 
# count map of cells with min value < 0.05
t.rast.series input=cla_less_05 output=count_cla_less_05 method=count

Importing / Exporting STRDS

Say we now need to do some processing in R (e.g.: run DINEOF to fill gaps in data), so we need to export our strds, hence we use t.rast.export.

t.rast.export input=cla output=cla4R compression=gzip

After we have done our analysis we can import the whole strds back to GRASS again, exporting it from R with read/write.tgrass from spacetime R package and using t.rast.import in GRASS.

t.rast.import input=cla_from_R.tar.gz output=cla_from_R basename=new_map directory=/tmp

For a full example of how to handle raster time series between GRASS and R, please visit the time series GRASS-R Statistics wiki. The example is based on North Carolina climate location [1] and includes the following steps:

  1. Exporting the strds out of GRASS
  2. Importing into R
  3. Re-formating the data
  4. Running DINEOF
  5. Re-constructing the raster time series
  6. Exporting out of R, and
  7. Importing the gap-filled new strds into GRASS.

Neighborhood analysis

t.rast.neighbors performs a neighborhood analysis for each map in a space time raster dataset. This module supports a subset of the options already available in r.neighbors. Both, size of the neighborhood and aggregation method can be chosen. As an example, we'll estimate mean Cl-a concentration in a 3x3 neighborhood for every map in the strds.

t.rast.neighbors input=cla output=cla_smoth base=cla_smooth method=average size=3

Filling and reconstructing time series data with gaps

Harmonic ANalysis of Time Series - HANTS

r.hants is an add-on not strictly within the temporal modules, but really useful when working with time series data, particularly with gappy data. Here's a simple example for generating a list of temporally ordered maps to use as input in r.hants, running HANTS and getting a map of dominant frequencies, afterwards. See the manual for further information on parameter setting and explanations.

# use t.rast.list to create a list of temporally ordered maps  
t.rast.list input=cla order=start_time columns=name -u > map_list.csv

# a HANTS run using the list of maps
r.hants -l file=map_list.csv nf=5 fet=0.1 dod=11 base_period=46 suffix=_hants amplitude=amp_hants phase=pha_hants

# dominant frequency map (0 means the dominant freq is 1, 1 that dominant freq is 2, and so on...)
r.series input=`g.list type=raster pattern="amp_hants*" separator=comma` output=dominant_freq_hants method=max_raster

Local Weighted Regression - LWR

Another very useful add-on for the reconstruction of gappy time series, such as those from remote sensing imagery, is r.series.lwr. This module performs a local weighted regression (LWR) of the input maps in order to estimate missing values and identify (and remove) outliers. See the manual for further information and explanations.

# use same list as before
t.rast.list input=cla order=start_time columns=name -u > map_list.csv

# run r.series.lwr
r.series.lwr file=map_list.csv suffix=_lwr order=2 weight=tricube range=0.0,65.0 dod=16 fet=0.1 -l

Estimate Mean Absolute Error (MAE)

One of the measures used to assess how close forecasts or predictions might be to the observed values is the mean absolute error. We will use it to evaluate the performance of HANTS and LWR.

First, we will rebuild our time series with the corresponding output maps of r.hants and r.series.lwr. Here, only showed for LWR.

# re-build time series 
t.create type=strds temporaltype=absolute output=cla_lwr \
 title="LWR output for Chl-a" \
 description="MODIS Aqua L3 Chl-a 8-day 4km 2010-2013. Reconstruction with r.series.lwr"

# create list with filenames to parse
g.list type=raster pattern="*lwr" output=names_list

# parse filenames, convert YYYY-DOY to YYYY-MM-DD and write file to use in t.register
for mapname in `cat names_list` ; do
  year_start=`echo ${mapname:1:4}`
  doy_start=`echo ${mapname:5:3}`
  year_end=`echo ${mapname:8:4}`
  doy_end=`echo ${mapname:12:3}`
  # convert YYYY-DOY to YYYY-MM-DD
  #BEWARE: leading zeros make bash assume the number is in base 8 system, not base 10!
  doy_start=`echo "$doy_start" | sed 's/^0*//'`
  doy_end=`echo "$doy_end" | sed 's/^0*//'`
  START_DATE=`date -d "${year_start}-01-01 +$(( ${doy_start} - 1 ))days" +%Y-%m-%d`
  END_DATE=`date -d "${year_end}-01-01 +$(( ${doy_end} ))days" +%Y-%m-%d`
  # print mapname, start and end date
  echo "$mapname|$START_DATE|$END_DATE" >> map_list_start_and_end_time.txt
done

# register maps in strds
t.register input=cla_lwr type=raster file=map_list_start_and_end_time.txt

Now, we will estimate the MAE and use it to compare both methods. Again, only the example for LWR is showed.

# obtain a strds of the absolute differences between predicted and observed Chl-a values 
t.rast.algebra -n \
 basename=lwr_minus_orig suffix=gran \
 expression="abs_lwr_minus_orig = abs(cla_lwr - cla)"

# sum of the absolute differences (numerator) and count of non-null maps per pixel (denominator) 
t.rast.series input=abs_lwr_minus_orig output=sum_abs_lwr_minus_orig method=sum
t.rast.series input=abs_lwr_minus_orig output=count_abs_lwr_minus_orig method=count

# MAE for LWR
r.mapcalc expression="mae_lwr = sum_abs_lwr_minus_orig / count_abs_lwr_minus_orig"

# remove intermediate strds and maps
t.remove -rf abs_lwr_minus_orig
g.remove -f type=raster name=sum_abs_lwr_minus_orig,count_abs_lwr_minus_orig

Let's now compare the predictions generated by both methods, HANTS and LWR, and the corresponding mean absolute error maps.

Original Chl-a time series vs HANTS output.
Original Chl-a time series vs LWR output.)
MAE of HANTS predictions.
MAE of LWR predictions.

Extract strds values for points in a vector

Let's say we need to extract values of the strds to check the behaviour of HANTS at given locations and compare it with the original values. We can graphically do that with g.gui.tplot, but if you need to take data outside GRASS, say, read it into R and do some other analysis (See R_Statistics wiki page) you may use v.what.strds. This module retrieves raster values from a given strds to the attribute table of a point vector map.

# original data
v.what.strds input=points_cla strds=cla output=points_cla_out
v.db.select map=points_cla_out file=ts_points_cla.csv

# HANTS' reconstructed data (several runs)
for i in `seq 1 13` ; do
    v.what.strds --overwrite input=points_cla strds=cla_hants_${i} output=points_hants_${i}_out
    v.db.select map=points_cla file=ts_points_hants_${i}.csv
done

Another option would be to use t.rast.what that samples a space time raster dataset at specific vector point coordinates and writes the output to stdout or text file using different layouts. This module doesn't write the vector's attribute table.

t.rast.what points=points strds=cla output=cla_points.csv null_value=NA separator=comma

Some other modules and addons useful in the time series context

FAQ

Best practice data organization

When it comes to time series, thousands of maps are often involved. In order to keep control, here some suggestions:

  • separate original data from derived aggregates: store them in separate mapsets
  • for specific projects, maintain them in individual mapsets. You can access the STDS stored in other mapsets (same location) through
t.some.module input=my_strds@the_other_mapset output=result ...

as long as mapsets are indeed accessible, i.e.: in the mapset's search path (See g.mapsets).

Best practice multi user management

GRASS GIS supports multi-user data management and allows several users to work in the same location within their own mapsets. Data can be read from other mapsets but not edited there. The same principle applies to space time datasets (STDS), i.e. in case of having a collection of maps corresponding to a time series in a dedicated mapset, you need to also create the STDS therein (See t.create and t.register), so it is visible to other users in other mapsets when they use t.list, for example.

Use of where parameter (ticket #2270)

Pay attention when using the "where" option in t.* modules. In some occasions it may not yield the expected results. Given the backend database chosen for TGRASS implementation, you may not be selecting all maps you think if you forget to set time along with date in the where clause. For example, let's say we need to select maps from 2005-03-01 until 2005-05-31 included. You would think the following command would do the job:

t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31'"

but, no... The last map in the list is:

temp_0516|pruebas|2005-05-30 00:00:00|2005-05-31 00:00:00

which would be equivalent to:

t.rast.list daily_temp where="start_time > '2005-03-01' and start_time < '2005-05-31'"

This is a product of sqlite. Therefore, if you need the last date included in your selection (map from 2005-05-31 in our example), you need to set time too.

t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31 00:00:00'" | tail -n1
temp_0517|pruebas|2005-05-31 00:00:00|2005-06-01 00:00:00

Aggregation with defined granularity

Q: I need to aggregate a strds with a granularity of 1 year, but shifting the start day one month in each run, i.e.: changing the start_time to 2003-02-01, 2003-03-01, 2003-04-01 and so on... My question is: if i recursively change start_time with the 'where' parameter, will the module t.rast.aggregate "aggregate" to the next february, march, april (what i'd wish) or just till the end of 2003?

A: If you specify a granularity of a year, then the start time to perform the aggregation will always be shifted to the 1st January of the current year and the end time the 1st January of the next year (eg. 2002-01-01 - 2003-01-01).  If you wish to aggregate a full year but shifting one month forward then simply use a granularity of 12 months.

Listing maps with specific start month

Q: I have a strds with 506 maps that correspond to 8-day composite products. I need to sequentially list all maps which "start_month" is January, February and so on... to use them as input in r.series (or t.rast.series). How can I achieve that?

A: You can use the datetime functionality of SQLite to perform this task, this should work for January:

t.rast.list input=cla_null_mayor65 \
where='start_time >= datetime(start_time, "start of year") and start_time <= datetime(start_time, "start of year", "1 month")'

Aggregation of seasonal data

Q: How can I calculate average seasonal temperature starting from a daily temperatures temporal dataset?

A: Use t.rast.aggregate.ds, the input is the daily strds, the sampling stds should have seasonal intervals. Then use average as method. The output will have seasonal aggregated temperatures. For a detailed workflow see: seasonal aggregation example.

Aggregation of seasonal data using time ranges

A way to aggregate seasons from daily data without granularity but by using time ranges is shown here. The issue is to include for the season calculation also a month from the previous year. This can be addressed with some datetime calculations in SQLite:

# We assume to have daily temperature data in DB "temp_daily_average"

# loop over seasons, generate aggregates:
### 'start_time' and 'end_time' are columns in TGRASS

for year in `seq 2004 2014` ; do

    # we consider also a month of the previous year
    for month in "12 01 02" "03 04 05" "06 07 08" "09 10 11" ; do
        set -- $month ; echo $1 $2 $3

        prevyear=$year
        if [ $1 -eq 12 ] ; then
           prevyear=`expr $year - 1`
        fi

        # prepare query strings in SQLite
        MYSTART=`echo "SELECT strftime(datetime('${prevyear}-${1}-01'));" | sqlite3`
        MYEND=`echo "SELECT strftime(datetime('${prevyear}-${1}-01','+3 month'));" | sqlite3`

        # Debugging only, to see what it does:
        echo "---- Querying ${prevyear}-${1}-01 ... ${year}-${3}-end:"
        # we use start_time and end_time to get the proper time range
        t.rast.list input=temp_daily_average@modis_lst_reconstructed_europe_daily \
        where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" > list_${prevyear}_${1}_${3}.csv
	
        head -n 3 list_${prevyear}_${1}_${3}.csv
        echo "..."
        tail -n 3 list_${prevyear}_${1}_${3}.csv
        echo "======="
        rm -f list_${prevyear}_${1}_${3}.csv
        
        # calculate aggregates:
        method="average"   # median mode minimum maximum stddev
        
        t.rast.series input=temp_daily_average@modis_lst_reconstructed_europe_daily \
        where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" \
        output=temp_${method}_${prevyear}_${1}
    done
done

Much simpler alternative (which runs in parallel on multi-cores):

t.rast.aggregate input=A output=B basename=b where="start_time >= '2004-03-01 00:00:00'" granularity="3 months"

How to count consecutive days that meet a certain condition?

This example was kindly contributed by Thomas Leppelt and shows how to obtain the number of consecutive days with temperature below 0 per week.

# Set the computational region and resolution
g.region -p n=-30 s=-50 e=-50 w=-70 res=1

# We create maps for 100 days
for map in `seq 1 100` ; do
 # generate synthetic maps as a seeding random values for temperatures
 r.mapcalc -s expression="daily_temp_${map} = rand(-20,20)" --o
 echo daily_temp_${map} >> map_list.txt
done

t.create type=strds temporaltype=absolute \
         output=temperature_daily \
         title="Daily Temperature" \
         description="Test dataset with daily temperature"

t.register -i type=raster input=temperature_daily \
           file=map_list.txt start="2014-03-07" increment="1 days" --o

# check general information of the daily strds
t.info type=strds input=temperature_daily

# Create weekly mask to which we will add 1 or 0 according to temperature in the days within it (each map in this week mask will have a value of 7)
t.rast.aggregate input=temperature_daily output=weekly_mask basename=mask_week granularity="1 weeks" method=count

# Calculate consecutive days with negative temperatures
t.rast.algebra base=neg_temp_days expression="consecutive_days = weekly_mask {+,contains,l} if(temperature_daily < 0 && temperature_daily[-1] < 0 || temperature_daily[1] < 0 && temperature_daily < 0, 1, 0)"

Online tutorials and courses

References

  • Gebbert, S., Pebesma, E. 2014. TGRASS: A temporal GIS for field based environmental modeling. Environmental Modelling & Software 53, 1-12 (DOI) - preprint PDF
  • Gebbert, S., Pebesma, E. 2017. The GRASS GIS temporal framework. International Journal of Geographical Information Science 31, 1273-1292 (DOI) - PDF
  • Gebbert, S., Leppelt, T., Pebesma, E., 2019. A topology based spatio-temporal map algebra for big data analysis. Data 4, 86. (DOI) - PDF

See also