Temporal data processing: Difference between revisions
Veroandreo (talk | contribs) mNo edit summary |
m (Restore page (revert Wiki glitch)) |
||
(72 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
== Introduction == | == Introduction == | ||
TGRASS is the temporal enabled GRASS GIS | TGRASS is the temporal enabled GRASS GIS. | ||
TGRASS is completely metadata based, i.e. it does not change any data but simply handles the organization of raster, vector, raster3D maps actually stored in a GRASS GIS mapset by registering in an additional internal database. This is done specifically for managing temporal and spatial extent including temporal topology. | TGRASS is completely metadata based, i.e. it does not change any data but simply handles the organization of raster, vector, raster3D maps actually stored in a GRASS GIS mapset by registering in an additional internal database. This is done specifically for managing temporal and spatial extent including temporal topology. | ||
Manual overview: https://grass.osgeo.org/grass-stable/manuals/temporalintro.html | |||
== Terminology Overview == | == Terminology Overview == | ||
Line 19: | Line 21: | ||
== Example workflow for a Chlorophyll-a MODIS time series == | == Example workflow for a Chlorophyll-a MODIS time series == | ||
The following examples are based on a series of MODIS L3 Chlorophyll-a product that is freely available at the [http://oceancolor.gsfc.nasa.gov/cgi/l3?sen=A&per=MO&prd=CHL_chlor_a ocean color] site. So, say we download the SMI 8-day composite product at 4.6 km resolution for the period 2003-2013. That is a 506 set of images, 46 per year. Data comes as compressed HDF4 files. Chlorophyll products | The following examples are based on a series of MODIS L3 Chlorophyll-a product that is freely available at the [http://oceancolor.gsfc.nasa.gov/cgi/l3?sen=A&per=MO&prd=CHL_chlor_a ocean color] site. So, say we download the SMI 8-day composite product at 4.6 km resolution for the period 2003-2013. That is a 506 set of images, 46 per year. Data comes as compressed HDF4 files. Chlorophyll products file names look like this: | ||
A20030012003008.L3m_8D_CHL_chlor_a_4km | <code>A20030012003008.L3m_8D_CHL_chlor_a_4km</code> | ||
where, | |||
A: MODIS/Aqua | A: MODIS/Aqua | ||
Line 48: | Line 52: | ||
* import global images (506 images, 150 Mb each) with r.in.gdal, resize to study area and remove global files | * import global images (506 images, 150 Mb each) with r.in.gdal, resize to study area and remove global files | ||
* set projection and extension, and resize to study area with gdal_translate, and import already resized images with r.in.gdal | * set projection and extension, and resize to study area with gdal_translate, and import already resized images with r.in.gdal | ||
* import global images (506 images, 150 Mb each) with r.external, resize to study area and remove global files, as showed | * import global images (506 images, 150 Mb each) with r.external, resize to study area and remove global files, as showed below: | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 55: | Line 59: | ||
suffix=_tmp | suffix=_tmp | ||
for map in *chlor*; do | for map in *chlor* ; do | ||
r.external input=$map output=${map}${suffix} -o | |||
r.mapcalc expression="$map = ${map}${suffix}" | |||
g.remove type=raster name=${map}${suffix} | |||
done | done | ||
</source> | </source> | ||
Once we have maps inside GRASS we can start the temporal processing. If this is the first time you'll use temporal modules, | Once we have maps inside GRASS GIS database we can start the temporal processing. If this is the first time you'll use temporal modules, | ||
you need to run | you need to run | ||
Line 69: | Line 73: | ||
</source> | </source> | ||
to set the default temporal GIS database connection for the current mapset. The default TGIS database of type ''sqlite3'' is located in the PERMANENT mapset directory. Temporal GIS content from all created mapsets will be stored there. | to set the default temporal GIS database connection for the current mapset. If the connection is not initially set with {{cmd|t.connect}}, any temporal module that you try to run for the first time will first set the connection to the temporal database for you. The default TGIS database of type ''sqlite3'' is located in the PERMANENT mapset directory. Temporal GIS content from all created mapsets will be stored there. | ||
=== Creating a STRDS and registering maps === | === Creating a STRDS and registering maps === | ||
Line 75: | Line 79: | ||
'''Creating a STRDS''' | '''Creating a STRDS''' | ||
First step is to '''create a space time dataset''' by means of {{cmd|t.create}}. As an example, let us create a strds for the Chlorophyll-a (Cl-a) time series. We need to define the type | First step is to '''create a space time dataset''' by means of {{cmd|t.create}}. As an example, let us create a strds for the Chlorophyll-a (Cl-a) time series. We need to define the name and semantic type of the new space time dataset, its title and a description. By default a raster dataset is created with absolute time (Gregorian calendar). We can change that according to our needs with type and temporaltype options. | ||
<source lang="bash"> | <source lang="bash"> | ||
t.create type=strds temporaltype=absolute output= | t.create type=strds temporaltype=absolute output=cla \ | ||
title="Chlorophyll-a concentration" \ | |||
description="MODIS L3 Chlorophyll-a concentration for Argentinian sea" \ | |||
semantictype=mean | |||
</source> | </source> | ||
'''Registering maps''' | '''Registering maps''' | ||
Then, we '''register the maps''' in the strds using {{cmd|t.register}} | Then, we '''register the maps''' in the strds using {{cmd|t.register}}. This module assigns timestamps to raster, 3D raster and vector maps and register them in the temporal database and into space time datasets. Existing timestamps can be read and used by t.register. Note that this is a metadata based registration, nothing is imported into GRASS GIS since the maps are already present in the location (hence, no duplication occurs). | ||
This module supports '''absolute time''' and '''relative time'''. The ''absolute'' temporal type refers to a fixed date while the ''relative'' temporal type refers to data without fixed time stamps (e.g., sequential maps used to calculate multi-decadal averages). Refer to the [http://www.geostat-course.org/system/files/Presentation_0.pdf TGRASS overview slides] for background. | This module (and TGRASS in general) supports '''absolute time''' and '''relative time'''. The ''absolute'' temporal type refers to a fixed date or interval in the Gregorian calendar, while the ''relative'' temporal type refers to data without fixed time stamps (e.g., sequential maps used to calculate multi-decadal averages). Refer to the [http://www.geostat-course.org/system/files/Presentation_0.pdf TGRASS overview slides] for background. | ||
Maps can be registered by command line argument (a list of comma separated map names) or using an input file. The start time, end time and a temporal increment can be provided by command line or in the input file. End time and increment are | Maps can be registered by command line argument (i.e., a list of comma separated map names) or using an input file. The start time, end time and a temporal increment can be provided either by command line or in the input file. End time and increment are mutually exclusive. Maps can be registered in several space time datasets using the same timestamp. For a more detailed explanation and examples on how to register maps in stds, see also [https://grasswiki.osgeo.org/wiki/Temporal_data_processing/maps_registration maps registration] wiki page. | ||
Start time and end time with absolute time must be provided using the format yyyy-mm-dd HH:MM:SS +HHMM. It is also supported to specify the date yyyy-mm-dd only. In case of relative time the temporal unit (years, months, days, hours, minutes or seconds) must be provided. The relative start time, end time and the increment are integers (see also | Start time and end time with absolute time must be provided using the format yyyy-mm-dd HH:MM:SS +HHMM. It is also supported to specify the date yyyy-mm-dd only. In case of relative time, the temporal unit (years, months, days, hours, minutes or seconds) must be provided. The relative start time, end time and the increment are integers (see also {{cmd|t.register}}). | ||
<source lang="bash"> | <source lang="bash"> | ||
# note: with -i we | # note: with -i we create intervals (start and end time) spanning the given increment, and starting from start. | ||
t.register -i type=raster input= | t.register -i type=raster input=cla \ | ||
maps=`g.list raster pattern="*_chlor_*" separator=comma` \ | |||
start="2003-01-01" \ | |||
increment="8 days" | |||
</source> | </source> | ||
'''Dealing with weekly data''' | '''Dealing with weekly data''' | ||
While | While the former would have been the simplest solution, our 8-day products have a problem: in this example the last image of each year is not a product of an 8-day composition, but 4 or 5-day. Then, when using the increment parameter, dates (and consequently, intervals) are not set properly. | ||
The solution is to create a list of maps, with their respective start and end date. As the filenames contain information regarding year and DOY (day of year), we can use the following Python script to '''read filenames and transform DOY to calendar dates''' (Thanks Soeren!). | The solution is to create a list of maps, with their respective start and end date. As the filenames contain information regarding year and DOY (day of year), we can use the following Python script to '''read filenames and transform DOY to calendar dates''' (Thanks Soeren!). | ||
<source lang="python"> | <source lang="python"> | ||
# Script to extract the input | # Script to extract the input file for t.register from map names | ||
# run in python | # run in python | ||
# modify to your needs | # modify to your needs | ||
Line 139: | Line 146: | ||
</source> | </source> | ||
Using the number of characters in the filenames and datetime library in Python, you can convert DOY in the filenames into start_time and end_time as in the | Using the number of characters in the filenames and the datetime library in Python, you can convert DOY in the filenames into start_time and end_time as in the file you need to pass to {{cmd|t.register}}: | ||
A20030012003008.L3m_8D_CHL_chlor_a_4km_arg|2003-01-01 00:00:00|2003-01-09 00:00:00 | A20030012003008.L3m_8D_CHL_chlor_a_4km_arg|2003-01-01 00:00:00|2003-01-09 00:00:00 | ||
Line 155: | Line 161: | ||
t.register --o type=raster input=cla file=input_list_cla.txt | t.register --o type=raster input=cla file=input_list_cla.txt | ||
</source> | </source> | ||
As we are providing start and end time along with map names, we don't need to use the -i flag, neither set start nor increment options because that information is already contained in the file. | |||
'''Assign a color table''' | '''Assign a color table''' | ||
We can also set a color | We can also set a color table for all maps in the strds with {{cmd|t.rast.colors}}: | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 166: | Line 174: | ||
t.rast.colors input=cla rules=path/to/your/color_table | t.rast.colors input=cla rules=path/to/your/color_table | ||
</source> | </source> | ||
Further examples of '''extracting timestamps from map names''' can be found here: [https://grasswiki.osgeo.org/wiki/Temporal_data_processing/maps_registration maps registration] wiki page. | |||
=== Getting some basic info and statistics === | === Getting some basic info and statistics === | ||
We now check the space time data sets we have in our mapset with: | We now check the space time data sets we have in our mapset with {{cmd|t.list}}: | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 175: | Line 185: | ||
</source> | </source> | ||
and | and get information about our recently created strds. See {{cmd|t.info}} for additional uses. | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 228: | Line 238: | ||
| # 2014-04-29 14:23:23 | | # 2014-04-29 14:23:23 | ||
| t.register --o type="rast" input="cla" | | t.register --o type="rast" input="cla" | ||
| file=" | | file="input_list_cla.txt" | ||
| | | | ||
+----------------------------------------------------------------------------+ | +----------------------------------------------------------------------------+ | ||
Line 238: | Line 248: | ||
</source> | </source> | ||
Or, you can send the output to a text file using | |||
<source lang="bash"> | <source lang="bash"> | ||
t.rast.univar input=cla separator=comma | t.rast.univar input=cla separator=comma output=stats_cla.csv | ||
</source> | </source> | ||
This file "stats_cla.csv" you can now open in a spreadsheet software for inspection. | This file "stats_cla.csv" you can now open in a spreadsheet or statistical software for inspection and plotting. | ||
=== Listing maps and selections === | === Listing maps and selections === | ||
The module {{cmd|t.rast.list}} allows you to list registered | The module {{cmd|t.rast.list}} allows you to list all the maps registered in a strds or a selection of them and, provides options for different listing methods, sorting of the list, information to print and so on. The ''where'' option in t.rast.list allows to perform different selections of maps to list. The ''columns'' that can be used to perform these selections are: ''id, name, creator, mapset, temporal_type, creation_time, start_time, end_time, north, south, west, east, nsres, ewres, cols, rows, number_of_cells, min and max''. (Note that for vector time series, i.e. stvds, the columns in {{cmd|t.vect.list}} differ from those for strds. You can check that with <code>t.vect.list --help</code>). | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 276: | Line 286: | ||
<source lang="bash"> | <source lang="bash"> | ||
t.rast.list input= | t.rast.list input=cla where="start_time=datetime(start_time, 'start of year', ' 0 month')" | ||
</source> | </source> | ||
Note: '''Do not forget to use single quotes around date''', neither in {{cmd|t.rast.list}} nor in any other of the t.* modules offering the ''where'' option, because the clause will be ignored (i.e.: no selection performed) and no warning nor error will be printed. | |||
=== Visualization === | === Visualization === | ||
Line 299: | Line 311: | ||
# Select strds | # Select strds | ||
# Select pair of coordinates (east,north) or point in the map | # Select pair of coordinates (east,north) or mark a point in the map | ||
# Hit | # Hit Draw | ||
# Customize as desired | # Customize plot as desired | ||
# Save | # Save | ||
Line 309: | Line 321: | ||
<source lang="bash"> | <source lang="bash"> | ||
g.gui.animation strds= | g.gui.animation strds=cla_monthly_average | ||
</source> | </source> | ||
[[Image:Grass70 temporal chlorophyll anim small.gif|framed|center|Monthly mean chlorophyll-a concentration]] | |||
=== Aggregation === | === Aggregation === | ||
Line 316: | Line 330: | ||
For aggregations of data with different methods and different granularities, there are two very useful commands: | For aggregations of data with different methods and different granularities, there are two very useful commands: | ||
* {{cmd|t.rast.series}} that | * {{cmd|t.rast.series}} that applies different aggregation methods from {{cmd|r.series}} on all or a subset of raster maps in a strds, and | ||
* {{cmd|t.rast.aggregate}} that temporally aggregates the maps in a strds by a user defined granularity. | * {{cmd|t.rast.aggregate}} that temporally aggregates the maps in a strds by a user defined granularity. | ||
Line 326: | Line 340: | ||
<source lang="bash"> | <source lang="bash"> | ||
# yearly aggregation | # yearly aggregation | ||
t.rast.aggregate input=cla output=cla_yearly_average \ | t.rast.aggregate input=cla output=cla_yearly_average basename=cla_yearly_average \ | ||
granularity="1 years" method=average sampling=starts suffix=gran | |||
# yearly aggregation with corresponding methods (output: 7 strds with 11 maps each) | # yearly aggregation with corresponding methods (output: 7 strds with 11 maps each) | ||
for method in average median mode minimum maximum stddev range ; do | for method in average median mode minimum maximum stddev range ; do | ||
t.rast.aggregate input=cla output=cla_yearly_${method} \ | t.rast.aggregate input=cla output=cla_yearly_${method} \ | ||
basename=cla_yearly_${method} granularity="1 years" \ | |||
method=${method} sampling=starts suffix=gran | |||
done | done | ||
</source> | </source> | ||
With the where parameter, as exemplified before, we can select for example all 8-day products which start_time is 01 (January) over the years. Like this we can get the so-known climatologies. | With the where parameter, as exemplified before, we can select for example all 8-day products which start_time is 01 (January) over the years. Like this we can get the so-known '''climatologies'''. | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 344: | Line 357: | ||
</source> | </source> | ||
Generalizing a bit, we can: | We could also get long term averages for specific periods in a time series, e.g. January 5-year mean: | ||
<source lang="bash"> | |||
t.rast.series input=cla \ | |||
where="start_time >= '2005-01-01' AND start_time <= '2009-12-01' AND strftime('%m', start_time)='01'" \ | |||
method=average output=jan_cla_mean_2000_2004 | |||
# say we take a 15 years period (2000-2014) of monthly data and we want monthly 5-year averages | |||
YEAR_START=(2000 2005 2010) | |||
YEAR_END=(2004 2009 2014) | |||
count=${#YEAR_START[@]} | |||
for i in `seq 1 $count` ; do | |||
echo ${YEAR_START[$i-1]} ${YEAR_END[$i-1]} | |||
for MONTH in `seq -w 1 12` ; do | |||
t.rast.series input=cla \ | |||
where="start_time >= '"${YEAR_START[$i-1]}"-01-01' AND start_time <= '"${YEAR_END[$i-1]}"-12-01' AND strftime('%m', start_time)='"${MONTH}"'" \ | |||
method=average output=cla_${YEAR_START[$i-1]}_${YEAR_END[$i-1]}_month_${MONTH} --o | |||
done | |||
done | |||
</source> | |||
Generalizing a bit, to obtain monthly climatologies from a time series, we can: | |||
<source lang="bash"> | <source lang="bash"> | ||
# monthly climatologies | # monthly climatologies | ||
for | for MONTH in `seq -w 1 12` ; do | ||
for | for METHOD in average median stddev minimum maximum ; do | ||
t.rast.series input=cla method=${ | t.rast.series input=cla method=${METHOD} where="strftime('%m', start_time)='${MONTH}'" output=${METHOD}_${MONTH} | ||
done | done | ||
done | done | ||
Line 356: | Line 391: | ||
# seasonal climatologies | # seasonal climatologies | ||
for i in "01 02 03" "04 05 06" "07 08 09" "10 11 12" ; do | for i in "01 02 03" "04 05 06" "07 08 09" "10 11 12" ; do | ||
set -- $i ; echo $1 $2 $3 | set -- $i ; echo $1 $2 $3 | ||
for m in average median stddev minimum maximum ; do | |||
t.rast.series input=cla method=${m} output=${m}_${1} \ | |||
where="strftime('%m',start_time)='"${1}"' or strftime('%m',start_time)='"${2}"' or strftime('%m', start_time)='"${3}"'" | |||
done | |||
done | done | ||
</source> | </source> | ||
Line 368: | Line 403: | ||
<source lang="bash"> | <source lang="bash"> | ||
# monthly aggregates (132 maps) | # monthly aggregates (132 maps) | ||
for method in average minimum maximum ; do | for method in average minimum maximum ; do | ||
t.rast.aggregate input=cla output=cla_monthly_${method} basename=cla_monthly_${method} \ | t.rast.aggregate input=cla output=cla_monthly_${method} \ | ||
basename=cla_monthly_${method} \ | |||
granularity="1 months" method=${method} \ | |||
sampling=contains suffix=gran | |||
done | done | ||
# January anomalies in mean, min and max Cl-a | # January anomalies in mean, min and max Cl-a | ||
t.rast.list -u input=cla_monthly_average where="start_time=datetime(start_time, 'start of year', '0 month')" columns=name | |||
t.rast.list - | |||
for m in average minimum maximum ; do | for m in average minimum maximum ; do | ||
for i in 1 13 25 37 49 61 73 85 97 109 121 ; do # these numbers correspond to all January monthly aggregates | |||
r.mapcalc expression="Jan_${m}_anomaly_${i}=cla_monthly_${m}_${i}-01_${m}" | |||
done | |||
done | done | ||
</source> | </source> | ||
Line 388: | Line 423: | ||
<source lang="bash"> | <source lang="bash"> | ||
for map in `t.rast.list - | for map in `t.rast.list -u input=cla_monthly_average where="strftime('%m', start_time)='01'" columns=name` ; do | ||
do | r.mapcalc expression="anomaly_${map}=${map}-01_average" | ||
r.mapcalc expression="anomaly_${map}=${map}- | done | ||
</source> | |||
Generalizing, we can estimate all monthly anomalies from mean like: | |||
<source lang="bash"> | |||
for i in `seq -w 1 12` ; do | |||
for map in `t.rast.list -u input=cla_monthly_average columns=name where="strftime('%m', start_time)='"${i}"'"` ; do | |||
r.mapcalc expression="anomaly_${map}=${map}-average_${i}" | |||
done | |||
done | done | ||
</source> | </source> | ||
Line 401: | Line 445: | ||
# map index for the yearly maximum Cl-a value | # map index for the yearly maximum Cl-a value | ||
t.rast.aggregate input=cla granularity="1 year" method=max_raster output=yearly_max_index basename=yearly_max_index | t.rast.aggregate input=cla granularity="1 year" method=max_raster output=yearly_max_index basename=yearly_max_index suffix=gran | ||
</source> | </source> | ||
Line 415: | Line 459: | ||
=== Spatio-temporal algebra with STRDS === | === Spatio-temporal algebra with STRDS === | ||
The module {{cmd|t.rast.mapcalc}} allows us to perform spatio-temporal mapcalc expressions on temporally sampled maps of strds. There are spatial and temporal operators available for the "expression" string. Spatial operators, functions and internal variables are those used in {{cmd|r.mapcalc}}. Temporal internal variables supported for both relative and absolute time include: td(), start_time() and end_time(). There are also several very useful internal variables supported especially for absolute time of the current sample interval or instance, e.g.: start_doy(), start_year(), start_month() and so on (see {{cmd|t.rast.mapcalc}} manual | The module {{cmd|t.rast.mapcalc}} allows us to perform spatio-temporal mapcalc expressions on temporally sampled maps of strds. There are spatial and temporal operators available for the "expression" string. Spatial operators, functions and internal variables are those used in {{cmd|r.mapcalc}}. Temporal internal variables supported for both relative and absolute time include: td(), start_time() and end_time(). There are also several very useful internal variables supported especially for absolute time of the current sample interval or instance, e.g.: start_doy(), start_year(), start_month() and so on (see {{cmd|t.rast.mapcalc}} manual for further details and examples). | ||
Some examples now. Say we did some analysis and decided that we will only consider values higher than 0.05. Then, we need to set all values below that threshold to null. | Some examples now. Say we did some analysis and decided that we will only consider values higher than 0.05. Then, we need to set all values below that threshold to null. | ||
Line 452: | Line 496: | ||
</source> | </source> | ||
There's also a {{cmd|t.rast.algebra}} module that allows for temporal and spatial operations on strds by means of temporal raster algebra. The module expects an expression in the following form: "result = expression". | |||
The statement structure is similar to r.mapcalc, the result is the name of a new strds that will contain the result of the calculation given as expression. Expressions can be any valid or nested combination of temporal operations and spatial overlay or buffer functions that are provided by the temporal algebra. See the manual for further details and explanations. | The statement structure is similar to r.mapcalc, the result is the name of a new strds that will contain the result of the calculation given as expression. Expressions can be any valid or nested combination of temporal operations and spatial overlay or buffer functions that are provided by the temporal algebra. See the manual for further details and explanations. | ||
Line 461: | Line 503: | ||
<source lang="bash"> | <source lang="bash"> | ||
# we set 8 as fixed denominator, because products are 8-day compositions | |||
t.rast.algebra expression="slope_cla = (cla[1]-cla[0])/8.0" basename=slope_cla | t.rast.algebra expression="slope_cla = (cla[1]-cla[0])/8.0" basename=slope_cla | ||
</source> | </source> | ||
We can then use any of the aggregation modules that we saw before to get the maximum or minimumm rate of change for different granularities. | We can then use any of the aggregation modules that we saw before to get the maximum or minimumm rate of change for different granularities. | ||
===== Some known issues with t.rast.mapcalc ===== | |||
It has been observed that if the names of inputs and output space time datasets (partially) match each other, t.rast.mapcalc may not work properly (especially when several input strds are used). This is because the module uses a simple search and replace approach to substitute the input strds with the corresponding map names. Eventualy, choosing a specific order for the input strds in the input parameter may reduce the risk of wrong substitution. Therefore, especially in operations involving several strds (with partially matching names) as inputs, '''it is recommended to use {{cmd|t.rast.algebra}} that correctly recognizes spatio-temporal datasets''' rather than t.rast.mapcalc. | |||
=== Subseting and something else === | === Subseting and something else === | ||
{{cmd|t.rast.extract}} is another really useful module | {{cmd|t.rast.extract}} is another really useful module within temporal GRASS. It allows to extract a subset of a strds and store it in a different strds. The "where" condition is used to do the subset, but you can also specify a r.mapcalc sub-expression that performs operations on all maps of the selected subset. If no r.mapcalc expression is defined, the selected maps are simply registered in the new output strds. | ||
Say we | Say we need to know in how many maps of the strds minimum values are below a certain threshold, and not only that, but we also want to know how many pixels per map meet that condition. Then, we can do | ||
<source lang="bash"> | <source lang="bash"> | ||
t.rast.extract input=cla where="min <= '0.05'" output=cla_less_05 basename=cla_less_05 expression="if(cla<0.05,1,null())" | t.rast.extract input=cla where="min <= '0.05'" output=cla_less_05 basename=cla_less_05 expression="if(cla < 0.05, 1, null())" | ||
</source> | </source> | ||
Line 486: | Line 532: | ||
</source> | </source> | ||
=== Importing / Exporting | === Importing / Exporting STRDS === | ||
Say we now need to do some processing in R (e.g.: run [http://menugget.blogspot.com.ar/2012/10/dineof-data-interpolating-empirical.html DINEOF] to fill gaps in data), so we need to export our strds, hence we use {{cmd|t.rast.export}}. | Say we now need to do some processing in R (e.g.: run [http://menugget.blogspot.com.ar/2012/10/dineof-data-interpolating-empirical.html DINEOF] to fill gaps in data), so we need to export our strds, hence we use {{cmd|t.rast.export}}. | ||
Line 499: | Line 545: | ||
t.rast.import input=cla_from_R.tar.gz output=cla_from_R basename=new_map directory=/tmp | t.rast.import input=cla_from_R.tar.gz output=cla_from_R basename=new_map directory=/tmp | ||
</source> | </source> | ||
For a full example of how to handle raster time series between GRASS and R, please visit the [https://grasswiki.osgeo.org/wiki/Temporal_data_processing/raster_time_series_grass_R_statistics time series GRASS-R Statistics] wiki. The example is based on North Carolina climate location [http://courses.ncsu.edu/mea592/common/media/02/nc_climate_spm_2000_2012.zip] and includes the following steps: | |||
# Exporting the strds out of GRASS | |||
# Importing into R | |||
# Re-formating the data | |||
# Running [http://modb.oce.ulg.ac.be/mediawiki/index.php/DINEOF DINEOF] | |||
# Re-constructing the raster time series | |||
# Exporting out of R, and | |||
# Importing the gap-filled new strds into GRASS. | |||
=== Neighborhood analysis === | === Neighborhood analysis === | ||
Line 508: | Line 564: | ||
</source> | </source> | ||
=== Filling and reconstructing time series data with gaps | === Filling and reconstructing time series data with gaps === | ||
==== Harmonic ANalysis of Time Series - HANTS ==== | |||
{{AddonCmd|r.hants}} is an add-on not strictly within the temporal modules, but really useful when working with time series data, particularly with gappy data. Here's a simple example for generating a list of temporally ordered maps to use as input in '''r.hants''', running HANTS and getting a map of dominant frequencies, afterwards. See the manual for further information on parameter setting and explanations. | |||
<source lang="bash"> | <source lang="bash"> | ||
# use t.rast.list to create a list of temporally ordered maps | # use t.rast.list to create a list of temporally ordered maps | ||
t.rast.list input=cla order=start_time columns=name - | t.rast.list input=cla order=start_time columns=name -u > map_list.csv | ||
# a HANTS run | # a HANTS run using the list of maps | ||
r.hants file=map_list.csv nf=5 fet=0.1 dod=11 base_period=46 suffix=_hants amplitude=amp_hants phase=pha_hants | r.hants -l file=map_list.csv nf=5 fet=0.1 dod=11 base_period=46 suffix=_hants amplitude=amp_hants phase=pha_hants | ||
# dominant frequency map (0 means the dominant freq is 1, 1 that dominant freq is 2, and so on...) | # dominant frequency map (0 means the dominant freq is 1, 1 that dominant freq is 2, and so on...) | ||
r.series input=`g.list raster pattern=amp_hants* separator=comma` output=dominant_freq_hants method=max_raster | r.series input=`g.list type=raster pattern="amp_hants*" separator=comma` output=dominant_freq_hants method=max_raster | ||
</source> | |||
==== Local Weighted Regression - LWR ==== | |||
Another very useful add-on for the reconstruction of gappy time series, such as those from remote sensing imagery, is {{AddonCmd|r.series.lwr}}. This module performs a local weighted regression (LWR) of the input maps in order to estimate missing values and identify (and remove) outliers. See the manual for further information and explanations. | |||
<source lang="bash"> | |||
# use same list as before | |||
t.rast.list input=cla order=start_time columns=name -u > map_list.csv | |||
# run r.series.lwr | |||
r.series.lwr file=map_list.csv suffix=_lwr order=2 weight=tricube range=0.0,65.0 dod=16 fet=0.1 -l | |||
</source> | |||
==== Estimate Mean Absolute Error (MAE) ==== | |||
One of the measures used to assess how close forecasts or predictions might be to the observed values is the | |||
[https://en.wikipedia.org/wiki/Mean_absolute_error mean absolute error]. We will use it to evaluate the performance of HANTS and LWR. | |||
First, we will rebuild our time series with the corresponding output maps of r.hants and r.series.lwr. Here, only showed for LWR. | |||
<source lang="bash"> | |||
# re-build time series | |||
t.create type=strds temporaltype=absolute output=cla_lwr \ | |||
title="LWR output for Chl-a" \ | |||
description="MODIS Aqua L3 Chl-a 8-day 4km 2010-2013. Reconstruction with r.series.lwr" | |||
# create list with filenames to parse | |||
g.list type=raster pattern="*lwr" output=names_list | |||
# parse filenames, convert YYYY-DOY to YYYY-MM-DD and write file to use in t.register | |||
for mapname in `cat names_list` ; do | |||
year_start=`echo ${mapname:1:4}` | |||
doy_start=`echo ${mapname:5:3}` | |||
year_end=`echo ${mapname:8:4}` | |||
doy_end=`echo ${mapname:12:3}` | |||
# convert YYYY-DOY to YYYY-MM-DD | |||
#BEWARE: leading zeros make bash assume the number is in base 8 system, not base 10! | |||
doy_start=`echo "$doy_start" | sed 's/^0*//'` | |||
doy_end=`echo "$doy_end" | sed 's/^0*//'` | |||
START_DATE=`date -d "${year_start}-01-01 +$(( ${doy_start} - 1 ))days" +%Y-%m-%d` | |||
END_DATE=`date -d "${year_end}-01-01 +$(( ${doy_end} ))days" +%Y-%m-%d` | |||
# print mapname, start and end date | |||
echo "$mapname|$START_DATE|$END_DATE" >> map_list_start_and_end_time.txt | |||
done | |||
# register maps in strds | |||
t.register input=cla_lwr type=raster file=map_list_start_and_end_time.txt | |||
</source> | </source> | ||
Now, we will estimate the MAE and use it to compare both methods. Again, only the example for LWR is showed. | |||
<source lang="bash"> | |||
# obtain a strds of the absolute differences between predicted and observed Chl-a values | |||
t.rast.algebra -n \ | |||
basename=lwr_minus_orig suffix=gran \ | |||
expression="abs_lwr_minus_orig = abs(cla_lwr - cla)" | |||
# sum of the absolute differences (numerator) and count of non-null maps per pixel (denominator) | |||
t.rast.series input=abs_lwr_minus_orig output=sum_abs_lwr_minus_orig method=sum | |||
t.rast.series input=abs_lwr_minus_orig output=count_abs_lwr_minus_orig method=count | |||
# MAE for LWR | |||
r.mapcalc expression="mae_lwr = sum_abs_lwr_minus_orig / count_abs_lwr_minus_orig" | |||
# remove intermediate strds and maps | |||
t.remove -rf abs_lwr_minus_orig | |||
g.remove -f type=raster name=sum_abs_lwr_minus_orig,count_abs_lwr_minus_orig | |||
</source> | |||
Let's now compare the predictions generated by both methods, HANTS and LWR, and the corresponding mean absolute error maps. | |||
{| | |||
|- | |||
| [[Image:cla_vs_cla_hants.png|center|thumb|560px|Original Chl-a time series vs HANTS output.]] || [[Image:cla_vs_cla_lwr.png|center|thumb|560px|Original Chl-a time series vs LWR output.)]] | |||
|- | |||
| [[Image:mae_hants_map.png|center|thumb|560px|MAE of HANTS predictions.]] || [[Image:mae_lwr_map.png|center|thumb|560px|MAE of LWR predictions.]] | |||
|} | |||
=== Extract strds values for points in a vector === | === Extract strds values for points in a vector === | ||
Let's say we need to extract values of the strds to check the behaviour of HANTS at given locations and compare it with the original values | Let's say we need to extract values of the strds to check the behaviour of HANTS at given locations and compare it with the original values. We can graphically do that with {{cmd|g.gui.tplot}}, but if you need to take data outside GRASS, say, read it into R and do some other analysis (See [http://grasswiki.osgeo.org/wiki/R_statistics/rgrass7#Reading_in_data R_Statistics wiki page]) you may use {{cmd|v.what.strds}}. This module retrieves raster values from a given strds to the attribute table of a point vector map. | ||
<source lang="bash"> | <source lang="bash"> | ||
# original data | # original data | ||
v.what.strds | v.what.strds input=points_cla strds=cla output=points_cla_out | ||
v.db.select map=points_cla_out file=ts_points_cla.csv | v.db.select map=points_cla_out file=ts_points_cla.csv | ||
# HANTS' reconstructed data (several runs) | # HANTS' reconstructed data (several runs) | ||
for i in `seq 1 13` ; do | for i in `seq 1 13` ; do | ||
v.what.strds --overwrite input=points_cla strds=cla_hants_${i} output=points_hants_${i}_out | |||
v.db.select map=points_cla file=ts_points_hants_${i}.csv | |||
done | done | ||
</source> | </source> | ||
Another option would be to use {{cmd|t.rast.what}} that samples a space time raster dataset at specific vector point coordinates and | Another option would be to use {{cmd|t.rast.what}} that samples a space time raster dataset at specific vector point coordinates and writes the output to stdout or text file using different layouts. This module doesn't write the vector's attribute table. | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 545: | Line 680: | ||
</source> | </source> | ||
=== | === Some other modules and addons useful in the time series context === | ||
* {{cmd|t.rast.accumulate}} and {{cmd|t.rast.accdetect}} to compute and detect cyclic accumulations in raster time series. | |||
* {{AddonCmd|t.rast.what.aggr}} to sample a strds with a vector point map and get aggregated values of desired granularity either in stdout or in the vector's attribute table. | |||
* {{AddonCmd|t.rast.out.xyz}} to export a strds to a CSV file. | |||
* {{AddonCmd|t.rast.null}} to manage NULL-values of a strds. | |||
* {{AddonCmd|v.strds.stats}} to calculate univariate statistics from a strds based on a vector map. | |||
* {{AddonCmd|r.seasons}} to extract seasons from a time series. | |||
* {{AddonCmd|r.bioclim}} to calculate bio-climatic indices. | |||
* {{AddonCmd|r.change.info}} for landscape change assessment. | |||
* {{AddonCmd|r.quantile.ref}} to determine the corresponding quantile for an input value from a reference raster time series. | |||
* {{AddonCmd|r.regression.series}} and {{AddonCmd|r.mregression.series}} to perform univariate and multivariate linear regression among raster time series, respectively. | |||
* {{AddonCmd|r.series.diversity}} to estimate diversity indices in raster time series. | |||
* {{AddonCmd|r.series.decompose}} to decompose raster time series. | |||
* {{AddonCmd|r.series.filter}} to apply Savitzky-Golay or median filters to raster time series. | |||
== FAQ == | == FAQ == | ||
Line 553: | Line 700: | ||
=== Best practice data organization === | === Best practice data organization === | ||
When it | When it comes to time series, thousands of maps are often involved. In order to keep control, here some suggestions: | ||
* separate original data from derived aggregates: store them in separate mapsets | * separate original data from derived aggregates: store them in separate mapsets | ||
* for specific projects, maintain them in individual mapsets. You can access the | * for specific projects, maintain them in individual mapsets. You can access the STDS stored in other mapsets (same location) through | ||
t.some.module input=my_strds@the_other_mapset output=result ... | t.some.module input=my_strds@the_other_mapset output=result ... | ||
as long as mapsets are indeed accessible, i.e.: in the mapset's search path (See {{cmd|g.mapsets}}). | |||
=== Best practice multi user management === | |||
GRASS GIS supports multi-user data management and allows several users to work in the same location within their own mapsets. Data can be read from other mapsets but not edited there. The same principle applies to space time datasets (STDS), i.e. in case of having a collection of maps corresponding to a time series in a dedicated mapset, you need to also create the STDS therein (See {{cmd|t.create}} and {{cmd|t.register}}), so it is visible to other users in other mapsets when they use {{cmd|t.list}}, for example. | |||
=== Use of ''where'' parameter (ticket [https://trac.osgeo.org/grass/ticket/2270 #2270]) === | |||
Pay attention when using the "where" option in t.* modules. In some occasions it may not yield the expected results. Given the backend database chosen for TGRASS implementation, you may not be selecting all maps you think if you forget to set time along with date in the where clause. For example, let's say we need to select maps from 2005-03-01 until 2005-05-31 included. You would think the following command would do the job: | |||
<source lang="bash"> | |||
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31'" | |||
</source> | |||
but, no... The last map in the list is: | |||
temp_0516|pruebas|2005-05-30 00:00:00|2005-05-31 00:00:00 | |||
which would be equivalent to: | |||
<source lang="bash"> | |||
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time < '2005-05-31'" | |||
</source> | |||
This is a product of sqlite. Therefore, if you need the last date included in your selection (map from 2005-05-31 in our example), you need to set time too. | |||
<source lang="bash"> | |||
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31 00:00:00'" | tail -n1 | |||
temp_0517|pruebas|2005-05-31 00:00:00|2005-06-01 00:00:00 | |||
</source> | |||
=== Aggregation with defined granularity === | === Aggregation with defined granularity === | ||
Line 567: | Line 744: | ||
=== Listing maps with specific start month === | === Listing maps with specific start month === | ||
''Q: I have a strds with 506 maps that correspond to 8-day composite products. I need to sequentially list all maps which "start_month" is January, February and so on... to use them as input in {{cmd|r.series | ''Q: I have a strds with 506 maps that correspond to 8-day composite products. I need to sequentially list all maps which "start_month" is January, February and so on... to use them as input in {{cmd|r.series}} (or {{cmd|t.rast.series}}). How can I achieve that?'' | ||
A: You can use the [https://www.sqlite.org/lang_datefunc.html datetime functionality of SQLite] to perform this task, this should work for January: | A: You can use the [https://www.sqlite.org/lang_datefunc.html datetime functionality of SQLite] to perform this task, this should work for January: | ||
Line 573: | Line 750: | ||
<source lang="bash"> | <source lang="bash"> | ||
t.rast.list input=cla_null_mayor65 \ | t.rast.list input=cla_null_mayor65 \ | ||
where='start_time >= datetime(start_time, "start of year") and start_time <= datetime(start_time, "start of year", "1 month")' | |||
</source> | </source> | ||
Line 585: | Line 761: | ||
=== Aggregation of seasonal data using time ranges === | === Aggregation of seasonal data using time ranges === | ||
A way to aggregate seasons from daily data without granularity but by using time ranges is shown here. The issue is to include for the season calculation also a month from the previous year. This can be addressed | A way to aggregate seasons from daily data without granularity but by using time ranges is shown here. The issue is to include for the season calculation also a month from the previous year. This can be addressed with some datetime calculations in SQLite: | ||
<source lang="bash"> | <source lang="bash"> | ||
Line 609: | Line 785: | ||
# Debugging only, to see what it does: | # Debugging only, to see what it does: | ||
echo "---- Querying ${prevyear}-${1}-01 ... ${year}-${3}-end:" | |||
# we use start_time and end_time to get the proper time range | |||
t.rast.list input=temp_daily_average@modis_lst_reconstructed_europe_daily \ | t.rast.list input=temp_daily_average@modis_lst_reconstructed_europe_daily \ | ||
where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" > list_${prevyear}_${1}_${3}.csv | |||
head -n 3 list_${prevyear}_${1}_${3}.csv | |||
head -n 3 list_${prevyear}_${1}_${3}.csv | |||
echo "..." | |||
tail -n 3 list_${prevyear}_${1}_${3}.csv | |||
echo "=======" | |||
rm -f list_${prevyear}_${1}_${3}.csv | |||
# calculate aggregates: | # calculate aggregates: | ||
method="average" # median mode minimum maximum stddev | |||
t.rast.series input=temp_daily_average@modis_lst_reconstructed_europe_daily \ | t.rast.series input=temp_daily_average@modis_lst_reconstructed_europe_daily \ | ||
where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" \ | |||
output=temp_${method}_${prevyear}_${1} | |||
done | done | ||
done | done | ||
Line 631: | Line 808: | ||
Much simpler '''alternative''' (which runs in parallel on multi-cores): | Much simpler '''alternative''' (which runs in parallel on multi-cores): | ||
<source lang="bash"> | <source lang="bash"> | ||
t.rast.aggregate input=A output=B basename=b where="start_time >= 2004-03-01" granularity="3 months" | t.rast.aggregate input=A output=B basename=b where="start_time >= '2004-03-01 00:00:00'" granularity="3 months" | ||
</source> | </source> | ||
== | === How to count consecutive days that meet a certain condition? === | ||
This | This example was kindly contributed by Thomas Leppelt and shows how to obtain the number of consecutive days with temperature below 0 per week. | ||
<source lang="bash"> | <source lang="bash"> | ||
g. | # Set the computational region and resolution | ||
g.region -p n=-30 s=-50 e=-50 w=-70 res=1 | |||
# We create maps for 100 days | |||
for map in `seq 1 100` ; do | |||
# generate synthetic maps as a seeding random values for temperatures | |||
r.mapcalc -s expression="daily_temp_${map} = rand(-20,20)" --o | |||
echo daily_temp_${map} >> map_list.txt | |||
done | |||
t.create type=strds temporaltype=absolute \ | |||
output=temperature_daily \ | |||
title="Daily Temperature" \ | |||
description="Test dataset with daily temperature" | |||
t.register -i type=raster input=temperature_daily \ | |||
file=map_list.txt start="2014-03-07" increment="1 days" --o | |||
# check general information of the daily strds | |||
t.info type=strds input=temperature_daily | |||
=== | # Create weekly mask to which we will add 1 or 0 according to temperature in the days within it (each map in this week mask will have a value of 7) | ||
t.rast.aggregate input=temperature_daily output=weekly_mask basename=mask_week granularity="1 weeks" method=count | |||
# Calculate consecutive days with negative temperatures | |||
t.rast.algebra base=neg_temp_days expression="consecutive_days = weekly_mask {+,contains,l} if(temperature_daily < 0 && temperature_daily[-1] < 0 || temperature_daily[1] < 0 && temperature_daily < 0, 1, 0)" | |||
</source> | |||
== Online tutorials and courses == | == Online tutorials and courses == | ||
* [http://www.geostat-course.org/Topic_Gebbert The temporal GRASS GIS framework: Introduction and application] by | * [http://www.geostat-course.org/Topic_Gebbert The temporal GRASS GIS framework: Introduction and application] by Sören Gebbert (2012) | ||
* [http://fatra.cnr.ncsu.edu/temporal-grass-workshop/ Spatio-temporal data handling and visualization in GRASS GIS] - FOSS4G 2014 workshop by Vaclav Petras, Anna Petrasova, Helena Mitasova, Markus Neteler (2014) | * [http://fatra.cnr.ncsu.edu/temporal-grass-workshop/ Spatio-temporal data handling and visualization in GRASS GIS] - FOSS4G 2014 workshop by Vaclav Petras, Anna Petrasova, Helena Mitasova, Markus Neteler (2014) | ||
* [http://www.openniche.org/processing-era-interim-dataset-in-ncdf-using-gdal-and-grass-gis/ Processing ERA-Interim dataset in netcdf using GDAL and GRASS GIS] by Sajid Pareeth (2015) | * [http://www.openniche.org/processing-era-interim-dataset-in-ncdf-using-gdal-and-grass-gis/ Processing ERA-Interim dataset in netcdf using GDAL and GRASS GIS] by Sajid Pareeth (2015) | ||
* CZ: [http://training.gismentors.eu/grass-gis-pokrocily/tgrass/index.html Úvod do časoprostorových analýz] (Introduction to space-time analysis) by Martin Landa (2015) | * CZ: [http://training.gismentors.eu/grass-gis-pokrocily/tgrass/index.html Úvod do časoprostorových analýz] (Introduction to space-time analysis) by Martin Landa (2015) | ||
* [https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html Sentinel spatio-temporal processing] by Martin Landa (2020) | |||
* [https://github.com/veroandreo/tgrass_workshop_foss4g_eu TGRASS: temporal data processing with GRASS GIS] FOSS4G-EU workshop by Veronica Andreo, Luca Delucchi and Markus Neteler (2017) | |||
* [https://gitpitch.com/veroandreo/grass-gis-geostat-2018/master?p=tgrass&grs=gitlab Spatio-temporal data processing and visualization with GRASS GIS], GEOSTAT Summer School lecture by Veronica Andreo (2018) | |||
== References == | == References == | ||
* Gebbert, S., Pebesma, E. | * Gebbert, S., Pebesma, E. 2014. ''TGRASS: A temporal GIS for field based environmental modeling''. Environmental Modelling & Software 53, 1-12 ([http://dx.doi.org/10.1016/j.envsoft.2013.11.001 DOI]) - [http://ifgi.uni-muenster.de/~epebe_01/tgrass.pdf preprint PDF] | ||
* Gebbert, S., Pebesma, E. 2017. ''The GRASS GIS temporal framework''. International Journal of Geographical Information Science 31, 1273-1292 ([http://dx.doi.org/10.1080/13658816.2017.1306862 DOI]) - [https://figshare.com/articles/journal_contribution/The_GRASS_GIS_temporal_framework/4834013/1/files/8024429.pdf PDF] | |||
* Gebbert, S., Leppelt, T., Pebesma, E., 2019. ''A topology based spatio-temporal map algebra for big data analysis''. Data 4, 86. ([https://doi.org/10.3390/data4020086 DOI]) - [https://www.mdpi.com/2306-5729/4/2/86/pdf?version=1560849201 PDF] | |||
== See also == | == See also == | ||
* Introduction: [http://www.slideshare.net/Luis_de_Sousa/presentation-soeren GRASS as a Temporal GIS] by Sören Gebbert (slides) | * Introduction: [http://www.slideshare.net/Luis_de_Sousa/presentation-soeren GRASS as a Temporal GIS] by Sören Gebbert (slides) | ||
* [http://grass.osgeo.org/ | * [https://archive.org/details/TheTemporalGrassGisFrameworkIntroductionAndApplication# The temporal GRASS GIS framework: Introduction and application] Video of Sören Gebbert's presentation in [http://www.geostat-course.org/Muenster_2012 GEOSTAT course] 2012 | ||
* [ | * [https://grass.osgeo.org/grass-stable/manuals/temporalintro.html Introduction to temporal modules] (GRASS GIS manual) | ||
* [ | * [https://grass.osgeo.org/grass-stable/manuals/temporal.html Available temporal modules] (GRASS GIS manual) | ||
* [https://trac.osgeo.org/grass/wiki/Grass7/TemporalExtension Temporal Extension] (background info in trac) | |||
* [[GRASS GSoC 2013 Temporal GIS Algebra for raster and vector data in GRASS]] | * [[GRASS GSoC 2013 Temporal GIS Algebra for raster and vector data in GRASS]] | ||
[[Category: Documentation]] | [[Category: Documentation]] | ||
[[Category:Temporal]] | [[Category: Tutorial]] | ||
[[Category: Temporal]] |
Latest revision as of 17:55, 20 December 2022
Introduction
TGRASS is the temporal enabled GRASS GIS. TGRASS is completely metadata based, i.e. it does not change any data but simply handles the organization of raster, vector, raster3D maps actually stored in a GRASS GIS mapset by registering in an additional internal database. This is done specifically for managing temporal and spatial extent including temporal topology.
Manual overview: https://grass.osgeo.org/grass-stable/manuals/temporalintro.html
Terminology Overview
- Space time raster datasets (strds) are designed to manage raster map time series. Modules that process strds have the naming prefix t.rast.
- Space time 3D raster datasets (str3ds) are designed to manage 3D raster map time series. Modules that process str3ds have the naming prefix t.rast3d.
- Space time vector datasets (stvds) are designed to manage vector map time series. Modules that process stvds have the naming prefix t.vect.
Workflow overview
- create an empty space time datasets: strds, str3ds, or stvds (t.create)
- register the GRASS GIS maps (t.register)
- check the generated space time datasets (t.list, t.info)
- do your analysis: t.rast.aggregate, t.info, t.rast.univar, t.vect.univar, ... so many more!
Example workflow for a Chlorophyll-a MODIS time series
The following examples are based on a series of MODIS L3 Chlorophyll-a product that is freely available at the ocean color site. So, say we download the SMI 8-day composite product at 4.6 km resolution for the period 2003-2013. That is a 506 set of images, 46 per year. Data comes as compressed HDF4 files. Chlorophyll products file names look like this:
A20030012003008.L3m_8D_CHL_chlor_a_4km
where,
A: MODIS/Aqua 2003: Year at start 001: Julian day at start 2003: Year at end 008: Julian day at end L3m: Level 3 data, mapped (Projection: Plate carrée) 8D: 8 day composition CHL: Chlorophyll a concentration product chlor_a: algorithm used 10^(a0 + a1*X + a2*X^2 + a3*X^3 + a4*X^4) 4km: 4.6km pixel size (8640x4320 image, 2.5 minute resolution)
We now decompress files and check metadata
# Go to where the data is and decompress
find -iname '*.bz2' -exec bzip2 -d {} \;
# Check file meta-data (GDAL utilities)
gdalinfo A20030012003008.L3m_8D_CHL_chlor_a_4km
Next step is to import all 506 images into GRASS. You can use r.in.gdal or r.external for that. Note that global Cl-a images as downloaded from ocean color site are ~150 mb each (disk space issues!). Here, 3 different options:
- import global images (506 images, 150 Mb each) with r.in.gdal, resize to study area and remove global files
- set projection and extension, and resize to study area with gdal_translate, and import already resized images with r.in.gdal
- import global images (506 images, 150 Mb each) with r.external, resize to study area and remove global files, as showed below:
# define region extension
g.region -p n=-38 s=-55 w=-70 e=-56
suffix=_tmp
for map in *chlor* ; do
r.external input=$map output=${map}${suffix} -o
r.mapcalc expression="$map = ${map}${suffix}"
g.remove type=raster name=${map}${suffix}
done
Once we have maps inside GRASS GIS database we can start the temporal processing. If this is the first time you'll use temporal modules, you need to run
t.connect -d
to set the default temporal GIS database connection for the current mapset. If the connection is not initially set with t.connect, any temporal module that you try to run for the first time will first set the connection to the temporal database for you. The default TGIS database of type sqlite3 is located in the PERMANENT mapset directory. Temporal GIS content from all created mapsets will be stored there.
Creating a STRDS and registering maps
Creating a STRDS
First step is to create a space time dataset by means of t.create. As an example, let us create a strds for the Chlorophyll-a (Cl-a) time series. We need to define the name and semantic type of the new space time dataset, its title and a description. By default a raster dataset is created with absolute time (Gregorian calendar). We can change that according to our needs with type and temporaltype options.
t.create type=strds temporaltype=absolute output=cla \
title="Chlorophyll-a concentration" \
description="MODIS L3 Chlorophyll-a concentration for Argentinian sea" \
semantictype=mean
Registering maps
Then, we register the maps in the strds using t.register. This module assigns timestamps to raster, 3D raster and vector maps and register them in the temporal database and into space time datasets. Existing timestamps can be read and used by t.register. Note that this is a metadata based registration, nothing is imported into GRASS GIS since the maps are already present in the location (hence, no duplication occurs).
This module (and TGRASS in general) supports absolute time and relative time. The absolute temporal type refers to a fixed date or interval in the Gregorian calendar, while the relative temporal type refers to data without fixed time stamps (e.g., sequential maps used to calculate multi-decadal averages). Refer to the TGRASS overview slides for background.
Maps can be registered by command line argument (i.e., a list of comma separated map names) or using an input file. The start time, end time and a temporal increment can be provided either by command line or in the input file. End time and increment are mutually exclusive. Maps can be registered in several space time datasets using the same timestamp. For a more detailed explanation and examples on how to register maps in stds, see also maps registration wiki page.
Start time and end time with absolute time must be provided using the format yyyy-mm-dd HH:MM:SS +HHMM. It is also supported to specify the date yyyy-mm-dd only. In case of relative time, the temporal unit (years, months, days, hours, minutes or seconds) must be provided. The relative start time, end time and the increment are integers (see also t.register).
# note: with -i we create intervals (start and end time) spanning the given increment, and starting from start.
t.register -i type=raster input=cla \
maps=`g.list raster pattern="*_chlor_*" separator=comma` \
start="2003-01-01" \
increment="8 days"
Dealing with weekly data
While the former would have been the simplest solution, our 8-day products have a problem: in this example the last image of each year is not a product of an 8-day composition, but 4 or 5-day. Then, when using the increment parameter, dates (and consequently, intervals) are not set properly. The solution is to create a list of maps, with their respective start and end date. As the filenames contain information regarding year and DOY (day of year), we can use the following Python script to read filenames and transform DOY to calendar dates (Thanks Soeren!).
# Script to extract the input file for t.register from map names
# run in python
# modify to your needs
from os import walk
f = []
for (dirpath, dirnames, filenames) in walk("/path/to/the/maps"):
f.extend(filenames)
break
# to order the list
f.sort()
input_list=[]
import datetime
for map_name in f:
start_year = int(map_name[1:5])
start_day = int(map_name[5:8])
end_year = int(map_name[8:12])
end_day = int(map_name[12:15])
start = datetime.datetime(start_year, 1, 1) + datetime.timedelta(start_day - 1)
end = datetime.datetime(int(end_year), 1, 1) + datetime.timedelta(end_day)
map_list = map_name + '|' + str(start) + '|' + str(end)
print map_list
input_list.append(map_list)
print input_list
txt = '\n' .join(input_list)
f = open("input_list_cla.txt","w")
f.write(txt)
f.close()
Using the number of characters in the filenames and the datetime library in Python, you can convert DOY in the filenames into start_time and end_time as in the file you need to pass to t.register:
A20030012003008.L3m_8D_CHL_chlor_a_4km_arg|2003-01-01 00:00:00|2003-01-09 00:00:00 A20030092003016.L3m_8D_CHL_chlor_a_4km_arg|2003-01-09 00:00:00|2003-01-17 00:00:00 A20030172003024.L3m_8D_CHL_chlor_a_4km_arg|2003-01-17 00:00:00|2003-01-25 00:00:00 ... A20133452013352.L3m_8D_CHL_chlor_a_4km_arg|2013-12-11 00:00:00|2013-12-19 00:00:00 A20133532013360.L3m_8D_CHL_chlor_a_4km_arg|2013-12-19 00:00:00|2013-12-27 00:00:00 A20133612013365.L3m_8D_CHL_chlor_a_4km_arg|2013-12-27 00:00:00|2014-01-01 00:00:00
and then, the command would be:
t.register --o type=raster input=cla file=input_list_cla.txt
As we are providing start and end time along with map names, we don't need to use the -i flag, neither set start nor increment options because that information is already contained in the file.
Assign a color table
We can also set a color table for all maps in the strds with t.rast.colors:
# using a predetermined color table
t.rast.colors input=cla color=bcyr
#using your dedicated color table
t.rast.colors input=cla rules=path/to/your/color_table
Further examples of extracting timestamps from map names can be found here: maps registration wiki page.
Getting some basic info and statistics
We now check the space time data sets we have in our mapset with t.list:
t.list type=strds
and get information about our recently created strds. See t.info for additional uses.
t.info type=strds input=cla
+-------------------- Space Time Raster Dataset -----------------------------+ | | +-------------------- Basic information -------------------------------------+ | Id: ........................ cla@clorofila | Name: ...................... cla | Mapset: .................... clorofila | Creator: ................... veroandreo | Temporal type: ............. absolute | Creation time: ............. 2014-04-29 14:23:00.579342 | Modification time:.......... 2014-05-12 09:15:08.917309 | Semantic type:.............. mean +-------------------- Absolute time -----------------------------------------+ | Start time:................. 2003-01-01 00:00:00 | End time:................... 2014-01-01 00:00:00 | Granularity:................ 1 day | Temporal type of maps:...... interval +-------------------- Spatial extent ----------------------------------------+ | North:...................... -38.0 | South:...................... -55.0 | East:.. .................... -55.0 | West:....................... -70.0 | Top:........................ 0.0 | Bottom:..................... 0.0 +-------------------- Metadata information ----------------------------------+ | Raster register table:...... raster_map_register_91ba57d5f0924f4fa0bd7176a1b39b2f | North-South resolution min:. 0.041667 | North-South resolution max:. 0.041667 | East-west resolution min:... 0.041667 | East-west resolution max:... 0.041667 | Minimum value min:.......... 0.02925 | Minimum value max:.......... 0.26472 | Maximum value min:.......... 5.2104 | Maximum value max:.......... 99.953934 | Aggregation type:........... None | Number of registered maps:.. 506 | | Title: | Chlorophyll-a | Description: | Concentracion de Clorofila a | Command history: | # 2014-04-29 14:23:00 | t.create type="strds" temporaltype="absolute" | output="cla" title="Chlorophyll-a" | description="Concentracion de Clorofila a" --o | # 2014-04-29 14:23:23 | t.register --o type="rast" input="cla" | file="input_list_cla.txt" | +----------------------------------------------------------------------------+
Now, we get univariate statistics from the non-null cells for each registered raster map of the strds. For that matter we use t.rast.univar which, by default, returns the name of the map, the start and end date of dataset and the following values: mean, minimum and maximum vale, mean_of_abs, standard deviation, variance, coeff_var, number of null cells, total number of cell.
t.rast.univar input=cla
Or, you can send the output to a text file using
t.rast.univar input=cla separator=comma output=stats_cla.csv
This file "stats_cla.csv" you can now open in a spreadsheet or statistical software for inspection and plotting.
Listing maps and selections
The module t.rast.list allows you to list all the maps registered in a strds or a selection of them and, provides options for different listing methods, sorting of the list, information to print and so on. The where option in t.rast.list allows to perform different selections of maps to list. The columns that can be used to perform these selections are: id, name, creator, mapset, temporal_type, creation_time, start_time, end_time, north, south, west, east, nsres, ewres, cols, rows, number_of_cells, min and max. (Note that for vector time series, i.e. stvds, the columns in t.vect.list differ from those for strds. You can check that with t.vect.list --help
).
t.rast.list input=cla method=gran granule="1 month"
# this will give one image every one month, 3 months, 1 year, or the granule you choose
t.rast.list input=cla order=min columns=id,name,start_time,min where="min <= '0.05'"
# this will order by minimum value all the maps in the strds that have a minimum value lower than or equal to 0.05
t.rast.list input=cla order=max columns=name,start_time,max where="max > '10.0'"
# maps ordered by maximum value in which maximum value is higher than 10.
t.rast.list input=cla where="start_time >= '2003-01' and start_time <= '2003-06'"
# all the maps in the first 6 month of the time series
t.rast.list input=cla where="strftime('%m', start_time)='01'"
# all the maps from January
t.rast.list input=cla where="strftime('%m-%d', start_time)='01-01'"
# all the maps from January, 1st
t.rast.list input=cla where="strftime('%w', start_time)='1'"
# all Mondays in the time series (Sunday is 0)
If you have monthly (instead of 8-day products) data and you want to list all January maps, then you can do:
t.rast.list input=cla where="start_time=datetime(start_time, 'start of year', ' 0 month')"
Note: Do not forget to use single quotes around date, neither in t.rast.list nor in any other of the t.* modules offering the where option, because the clause will be ignored (i.e.: no selection performed) and no warning nor error will be printed.
Visualization
There are different visualization options for strds.
- g.gui.timeline allows to compare temporal datasets by displaying their temporal extents in a plot.
# only temporal extent
g.gui.timeline inputs=cla
# temporal and spatio-temporal extent
g.gui.timeline -3 inputs=cla
- g.gui.tplot allows to see the values of one or more temporal datasets for a queried point defined by a coordinate pair.
Steps to use this module are:
- Select strds
- Select pair of coordinates (east,north) or mark a point in the map
- Hit Draw
- Customize plot as desired
- Save
- g.gui.animation is the tool for animating a series of raster and vector maps or a space time raster or vector dataset.
g.gui.animation strds=cla_monthly_average
Aggregation
For aggregations of data with different methods and different granularities, there are two very useful commands:
- t.rast.series that applies different aggregation methods from r.series on all or a subset of raster maps in a strds, and
- t.rast.aggregate that temporally aggregates the maps in a strds by a user defined granularity.
With these modules it is very simple to get maps of basic statistical parameters for different temporal granules, and this permits the analysis of the spatio-temporal variability of the variable of interest.
Some examples:
# yearly aggregation
t.rast.aggregate input=cla output=cla_yearly_average basename=cla_yearly_average \
granularity="1 years" method=average sampling=starts suffix=gran
# yearly aggregation with corresponding methods (output: 7 strds with 11 maps each)
for method in average median mode minimum maximum stddev range ; do
t.rast.aggregate input=cla output=cla_yearly_${method} \
basename=cla_yearly_${method} granularity="1 years" \
method=${method} sampling=starts suffix=gran
done
With the where parameter, as exemplified before, we can select for example all 8-day products which start_time is 01 (January) over the years. Like this we can get the so-known climatologies.
t.rast.series input=cla method=average where="strftime('%m', start_time)='01'" output=january_average
We could also get long term averages for specific periods in a time series, e.g. January 5-year mean:
t.rast.series input=cla \
where="start_time >= '2005-01-01' AND start_time <= '2009-12-01' AND strftime('%m', start_time)='01'" \
method=average output=jan_cla_mean_2000_2004
# say we take a 15 years period (2000-2014) of monthly data and we want monthly 5-year averages
YEAR_START=(2000 2005 2010)
YEAR_END=(2004 2009 2014)
count=${#YEAR_START[@]}
for i in `seq 1 $count` ; do
echo ${YEAR_START[$i-1]} ${YEAR_END[$i-1]}
for MONTH in `seq -w 1 12` ; do
t.rast.series input=cla \
where="start_time >= '"${YEAR_START[$i-1]}"-01-01' AND start_time <= '"${YEAR_END[$i-1]}"-12-01' AND strftime('%m', start_time)='"${MONTH}"'" \
method=average output=cla_${YEAR_START[$i-1]}_${YEAR_END[$i-1]}_month_${MONTH} --o
done
done
Generalizing a bit, to obtain monthly climatologies from a time series, we can:
# monthly climatologies
for MONTH in `seq -w 1 12` ; do
for METHOD in average median stddev minimum maximum ; do
t.rast.series input=cla method=${METHOD} where="strftime('%m', start_time)='${MONTH}'" output=${METHOD}_${MONTH}
done
done
# seasonal climatologies
for i in "01 02 03" "04 05 06" "07 08 09" "10 11 12" ; do
set -- $i ; echo $1 $2 $3
for m in average median stddev minimum maximum ; do
t.rast.series input=cla method=${m} output=${m}_${1} \
where="strftime('%m',start_time)='"${1}"' or strftime('%m',start_time)='"${2}"' or strftime('%m', start_time)='"${3}"'"
done
done
Using the climatologies previously obtained, we'll now estimate monthly anomalies in the mean, max and min Cl-a concentration. First, we need to monthly aggregate data, and then do the difference between the monthly climatology and each respective monthly aggregate. We'll do the aggregation for the average, minimum and maximum of Cl-a concentration (from 506 input maps in cla strds, we'll get 132 maps in each monthly aggregated new strds).
# monthly aggregates (132 maps)
for method in average minimum maximum ; do
t.rast.aggregate input=cla output=cla_monthly_${method} \
basename=cla_monthly_${method} \
granularity="1 months" method=${method} \
sampling=contains suffix=gran
done
# January anomalies in mean, min and max Cl-a
t.rast.list -u input=cla_monthly_average where="start_time=datetime(start_time, 'start of year', '0 month')" columns=name
for m in average minimum maximum ; do
for i in 1 13 25 37 49 61 73 85 97 109 121 ; do # these numbers correspond to all January monthly aggregates
r.mapcalc expression="Jan_${m}_anomaly_${i}=cla_monthly_${m}_${i}-01_${m}"
done
done
You can also use t.rast.list for looping, the same way you use g.list:
for map in `t.rast.list -u input=cla_monthly_average where="strftime('%m', start_time)='01'" columns=name` ; do
r.mapcalc expression="anomaly_${map}=${map}-01_average"
done
Generalizing, we can estimate all monthly anomalies from mean like:
for i in `seq -w 1 12` ; do
for map in `t.rast.list -u input=cla_monthly_average columns=name where="strftime('%m', start_time)='"${i}"'"` ; do
r.mapcalc expression="anomaly_${map}=${map}-average_${i}"
done
done
Say we now need to know the date of the maximum value of Cl-a concentration over all the study period and/or on a yearly basis:
# map index for the overall maximum Cl-a value
t.rast.series input=cla method=max_raster output=cla_max_index
# map index for the yearly maximum Cl-a value
t.rast.aggregate input=cla granularity="1 year" method=max_raster output=yearly_max_index basename=yearly_max_index suffix=gran
The outputs show (pixelwise) the map index in which the maximum value of Cl-a occurs (from 1 to 506 and from 1 to 46, for the whole time series and on a yearly basis, respectively). For relative time this is maybe enough, but you may then want to reclassify data to get DOY, for example. In that case, you may use r.reclass.
If you already have monthly data, you can get climatologies quite simply as follows:
# January averages
t.rast.series input=cla_monthly method=average where="start_time=datetime(start_time, 'start of year', '0 month')" output=jan_average
Spatio-temporal algebra with STRDS
The module t.rast.mapcalc allows us to perform spatio-temporal mapcalc expressions on temporally sampled maps of strds. There are spatial and temporal operators available for the "expression" string. Spatial operators, functions and internal variables are those used in r.mapcalc. Temporal internal variables supported for both relative and absolute time include: td(), start_time() and end_time(). There are also several very useful internal variables supported especially for absolute time of the current sample interval or instance, e.g.: start_doy(), start_year(), start_month() and so on (see t.rast.mapcalc manual for further details and examples).
Some examples now. Say we did some analysis and decided that we will only consider values higher than 0.05. Then, we need to set all values below that threshold to null.
t.rast.mapcalc input=cla expression="if(cla < 0.05, null(), cla)" output=cla_corrected basename=cla_corrected
or we may also want to take negative erroneous values to the knowm minimum of the strds, so:
t.rast.series input=cla method=minimum output=min_cla
t.rast.mapcalc input=cla expression="if(cla < 0.0, min_cla, cla)" output=cla_corrected basename=cla_corrected
We may also need to reclassify all maps in the strds according to a certain threshold, e.g.: a certain level of Cl-a that indicates bloom conditions, in order to get bloom frequency afterwards:
# reclassify
t.rast.mapcalc -n input=cla output=cla_bloom basename=cla_bloom expression="if(cla > 0.75, 1, null())"
# bloom frequency
t.rast.series input=cla_bloom output=bloom_freq method=count
Do you remember we wanted to get the DOY of maximum Cl-a value before? Well, here's another way of doing it...
# overall maximum value
t.rast.series input=cla method=maximum output=max_cla
# new strds with DOY of overall maximum
t.rast.mapcalc -n inputs=cla output=date_max_cla expression="if(cla == max_cla,start_doy(),null())" basename=date_max_cla
# map with DOY of overall maximum
t.rast.series input=date_max_cla method=maximum output=max_cla_date
# remove date_max_cla strds (we were only interested in the resulting aggregated map)
t.remove -rf inputs=date_max_cla
There's also a t.rast.algebra module that allows for temporal and spatial operations on strds by means of temporal raster algebra. The module expects an expression in the following form: "result = expression".
The statement structure is similar to r.mapcalc, the result is the name of a new strds that will contain the result of the calculation given as expression. Expressions can be any valid or nested combination of temporal operations and spatial overlay or buffer functions that are provided by the temporal algebra. See the manual for further details and explanations.
We'll use this module to estimate the rate of change (slope) between every pair of maps in the "cla" strds. The result will be a new strds consisting of maps with the slope value between every consecutive pair of maps in the original strds:
# we set 8 as fixed denominator, because products are 8-day compositions
t.rast.algebra expression="slope_cla = (cla[1]-cla[0])/8.0" basename=slope_cla
We can then use any of the aggregation modules that we saw before to get the maximum or minimumm rate of change for different granularities.
Some known issues with t.rast.mapcalc
It has been observed that if the names of inputs and output space time datasets (partially) match each other, t.rast.mapcalc may not work properly (especially when several input strds are used). This is because the module uses a simple search and replace approach to substitute the input strds with the corresponding map names. Eventualy, choosing a specific order for the input strds in the input parameter may reduce the risk of wrong substitution. Therefore, especially in operations involving several strds (with partially matching names) as inputs, it is recommended to use t.rast.algebra that correctly recognizes spatio-temporal datasets rather than t.rast.mapcalc.
Subseting and something else
t.rast.extract is another really useful module within temporal GRASS. It allows to extract a subset of a strds and store it in a different strds. The "where" condition is used to do the subset, but you can also specify a r.mapcalc sub-expression that performs operations on all maps of the selected subset. If no r.mapcalc expression is defined, the selected maps are simply registered in the new output strds.
Say we need to know in how many maps of the strds minimum values are below a certain threshold, and not only that, but we also want to know how many pixels per map meet that condition. Then, we can do
t.rast.extract input=cla where="min <= '0.05'" output=cla_less_05 basename=cla_less_05 expression="if(cla < 0.05, 1, null())"
to extract those maps with a minimum value lower than 0.05, and in the same step put 1 in every cell meeting the criterium and null everywhere else. To get a count of maps and pixels meeting the condition we may use:
# univariate stistics to get the count per map
t.rast.univar input=cla_less_05
# count map of cells with min value < 0.05
t.rast.series input=cla_less_05 output=count_cla_less_05 method=count
Importing / Exporting STRDS
Say we now need to do some processing in R (e.g.: run DINEOF to fill gaps in data), so we need to export our strds, hence we use t.rast.export.
t.rast.export input=cla output=cla4R compression=gzip
After we have done our analysis we can import the whole strds back to GRASS again, exporting it from R with read/write.tgrass from spacetime R package and using t.rast.import in GRASS.
t.rast.import input=cla_from_R.tar.gz output=cla_from_R basename=new_map directory=/tmp
For a full example of how to handle raster time series between GRASS and R, please visit the time series GRASS-R Statistics wiki. The example is based on North Carolina climate location [1] and includes the following steps:
- Exporting the strds out of GRASS
- Importing into R
- Re-formating the data
- Running DINEOF
- Re-constructing the raster time series
- Exporting out of R, and
- Importing the gap-filled new strds into GRASS.
Neighborhood analysis
t.rast.neighbors performs a neighborhood analysis for each map in a space time raster dataset. This module supports a subset of the options already available in r.neighbors. Both, size of the neighborhood and aggregation method can be chosen. As an example, we'll estimate mean Cl-a concentration in a 3x3 neighborhood for every map in the strds.
t.rast.neighbors input=cla output=cla_smoth base=cla_smooth method=average size=3
Filling and reconstructing time series data with gaps
Harmonic ANalysis of Time Series - HANTS
r.hants is an add-on not strictly within the temporal modules, but really useful when working with time series data, particularly with gappy data. Here's a simple example for generating a list of temporally ordered maps to use as input in r.hants, running HANTS and getting a map of dominant frequencies, afterwards. See the manual for further information on parameter setting and explanations.
# use t.rast.list to create a list of temporally ordered maps
t.rast.list input=cla order=start_time columns=name -u > map_list.csv
# a HANTS run using the list of maps
r.hants -l file=map_list.csv nf=5 fet=0.1 dod=11 base_period=46 suffix=_hants amplitude=amp_hants phase=pha_hants
# dominant frequency map (0 means the dominant freq is 1, 1 that dominant freq is 2, and so on...)
r.series input=`g.list type=raster pattern="amp_hants*" separator=comma` output=dominant_freq_hants method=max_raster
Local Weighted Regression - LWR
Another very useful add-on for the reconstruction of gappy time series, such as those from remote sensing imagery, is r.series.lwr. This module performs a local weighted regression (LWR) of the input maps in order to estimate missing values and identify (and remove) outliers. See the manual for further information and explanations.
# use same list as before
t.rast.list input=cla order=start_time columns=name -u > map_list.csv
# run r.series.lwr
r.series.lwr file=map_list.csv suffix=_lwr order=2 weight=tricube range=0.0,65.0 dod=16 fet=0.1 -l
Estimate Mean Absolute Error (MAE)
One of the measures used to assess how close forecasts or predictions might be to the observed values is the mean absolute error. We will use it to evaluate the performance of HANTS and LWR.
First, we will rebuild our time series with the corresponding output maps of r.hants and r.series.lwr. Here, only showed for LWR.
# re-build time series
t.create type=strds temporaltype=absolute output=cla_lwr \
title="LWR output for Chl-a" \
description="MODIS Aqua L3 Chl-a 8-day 4km 2010-2013. Reconstruction with r.series.lwr"
# create list with filenames to parse
g.list type=raster pattern="*lwr" output=names_list
# parse filenames, convert YYYY-DOY to YYYY-MM-DD and write file to use in t.register
for mapname in `cat names_list` ; do
year_start=`echo ${mapname:1:4}`
doy_start=`echo ${mapname:5:3}`
year_end=`echo ${mapname:8:4}`
doy_end=`echo ${mapname:12:3}`
# convert YYYY-DOY to YYYY-MM-DD
#BEWARE: leading zeros make bash assume the number is in base 8 system, not base 10!
doy_start=`echo "$doy_start" | sed 's/^0*//'`
doy_end=`echo "$doy_end" | sed 's/^0*//'`
START_DATE=`date -d "${year_start}-01-01 +$(( ${doy_start} - 1 ))days" +%Y-%m-%d`
END_DATE=`date -d "${year_end}-01-01 +$(( ${doy_end} ))days" +%Y-%m-%d`
# print mapname, start and end date
echo "$mapname|$START_DATE|$END_DATE" >> map_list_start_and_end_time.txt
done
# register maps in strds
t.register input=cla_lwr type=raster file=map_list_start_and_end_time.txt
Now, we will estimate the MAE and use it to compare both methods. Again, only the example for LWR is showed.
# obtain a strds of the absolute differences between predicted and observed Chl-a values
t.rast.algebra -n \
basename=lwr_minus_orig suffix=gran \
expression="abs_lwr_minus_orig = abs(cla_lwr - cla)"
# sum of the absolute differences (numerator) and count of non-null maps per pixel (denominator)
t.rast.series input=abs_lwr_minus_orig output=sum_abs_lwr_minus_orig method=sum
t.rast.series input=abs_lwr_minus_orig output=count_abs_lwr_minus_orig method=count
# MAE for LWR
r.mapcalc expression="mae_lwr = sum_abs_lwr_minus_orig / count_abs_lwr_minus_orig"
# remove intermediate strds and maps
t.remove -rf abs_lwr_minus_orig
g.remove -f type=raster name=sum_abs_lwr_minus_orig,count_abs_lwr_minus_orig
Let's now compare the predictions generated by both methods, HANTS and LWR, and the corresponding mean absolute error maps.
Extract strds values for points in a vector
Let's say we need to extract values of the strds to check the behaviour of HANTS at given locations and compare it with the original values. We can graphically do that with g.gui.tplot, but if you need to take data outside GRASS, say, read it into R and do some other analysis (See R_Statistics wiki page) you may use v.what.strds. This module retrieves raster values from a given strds to the attribute table of a point vector map.
# original data
v.what.strds input=points_cla strds=cla output=points_cla_out
v.db.select map=points_cla_out file=ts_points_cla.csv
# HANTS' reconstructed data (several runs)
for i in `seq 1 13` ; do
v.what.strds --overwrite input=points_cla strds=cla_hants_${i} output=points_hants_${i}_out
v.db.select map=points_cla file=ts_points_hants_${i}.csv
done
Another option would be to use t.rast.what that samples a space time raster dataset at specific vector point coordinates and writes the output to stdout or text file using different layouts. This module doesn't write the vector's attribute table.
t.rast.what points=points strds=cla output=cla_points.csv null_value=NA separator=comma
Some other modules and addons useful in the time series context
- t.rast.accumulate and t.rast.accdetect to compute and detect cyclic accumulations in raster time series.
- t.rast.what.aggr to sample a strds with a vector point map and get aggregated values of desired granularity either in stdout or in the vector's attribute table.
- t.rast.out.xyz to export a strds to a CSV file.
- t.rast.null to manage NULL-values of a strds.
- v.strds.stats to calculate univariate statistics from a strds based on a vector map.
- r.seasons to extract seasons from a time series.
- r.bioclim to calculate bio-climatic indices.
- r.change.info for landscape change assessment.
- r.quantile.ref to determine the corresponding quantile for an input value from a reference raster time series.
- r.regression.series and r.mregression.series to perform univariate and multivariate linear regression among raster time series, respectively.
- r.series.diversity to estimate diversity indices in raster time series.
- r.series.decompose to decompose raster time series.
- r.series.filter to apply Savitzky-Golay or median filters to raster time series.
FAQ
Best practice data organization
When it comes to time series, thousands of maps are often involved. In order to keep control, here some suggestions:
- separate original data from derived aggregates: store them in separate mapsets
- for specific projects, maintain them in individual mapsets. You can access the STDS stored in other mapsets (same location) through
t.some.module input=my_strds@the_other_mapset output=result ...
as long as mapsets are indeed accessible, i.e.: in the mapset's search path (See g.mapsets).
Best practice multi user management
GRASS GIS supports multi-user data management and allows several users to work in the same location within their own mapsets. Data can be read from other mapsets but not edited there. The same principle applies to space time datasets (STDS), i.e. in case of having a collection of maps corresponding to a time series in a dedicated mapset, you need to also create the STDS therein (See t.create and t.register), so it is visible to other users in other mapsets when they use t.list, for example.
Use of where parameter (ticket #2270)
Pay attention when using the "where" option in t.* modules. In some occasions it may not yield the expected results. Given the backend database chosen for TGRASS implementation, you may not be selecting all maps you think if you forget to set time along with date in the where clause. For example, let's say we need to select maps from 2005-03-01 until 2005-05-31 included. You would think the following command would do the job:
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31'"
but, no... The last map in the list is:
temp_0516|pruebas|2005-05-30 00:00:00|2005-05-31 00:00:00
which would be equivalent to:
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time < '2005-05-31'"
This is a product of sqlite. Therefore, if you need the last date included in your selection (map from 2005-05-31 in our example), you need to set time too.
t.rast.list daily_temp where="start_time > '2005-03-01' and start_time <= '2005-05-31 00:00:00'" | tail -n1
temp_0517|pruebas|2005-05-31 00:00:00|2005-06-01 00:00:00
Aggregation with defined granularity
Q: I need to aggregate a strds with a granularity of 1 year, but shifting the start day one month in each run, i.e.: changing the start_time to 2003-02-01, 2003-03-01, 2003-04-01 and so on... My question is: if i recursively change start_time with the 'where' parameter, will the module t.rast.aggregate "aggregate" to the next february, march, april (what i'd wish) or just till the end of 2003?
A: If you specify a granularity of a year, then the start time to perform the aggregation will always be shifted to the 1st January of the current year and the end time the 1st January of the next year (eg. 2002-01-01 - 2003-01-01). If you wish to aggregate a full year but shifting one month forward then simply use a granularity of 12 months.
Listing maps with specific start month
Q: I have a strds with 506 maps that correspond to 8-day composite products. I need to sequentially list all maps which "start_month" is January, February and so on... to use them as input in r.series (or t.rast.series). How can I achieve that?
A: You can use the datetime functionality of SQLite to perform this task, this should work for January:
t.rast.list input=cla_null_mayor65 \
where='start_time >= datetime(start_time, "start of year") and start_time <= datetime(start_time, "start of year", "1 month")'
Aggregation of seasonal data
Q: How can I calculate average seasonal temperature starting from a daily temperatures temporal dataset?
A: Use t.rast.aggregate.ds, the input is the daily strds, the sampling stds should have seasonal intervals. Then use average as method. The output will have seasonal aggregated temperatures. For a detailed workflow see: seasonal aggregation example.
Aggregation of seasonal data using time ranges
A way to aggregate seasons from daily data without granularity but by using time ranges is shown here. The issue is to include for the season calculation also a month from the previous year. This can be addressed with some datetime calculations in SQLite:
# We assume to have daily temperature data in DB "temp_daily_average"
# loop over seasons, generate aggregates:
### 'start_time' and 'end_time' are columns in TGRASS
for year in `seq 2004 2014` ; do
# we consider also a month of the previous year
for month in "12 01 02" "03 04 05" "06 07 08" "09 10 11" ; do
set -- $month ; echo $1 $2 $3
prevyear=$year
if [ $1 -eq 12 ] ; then
prevyear=`expr $year - 1`
fi
# prepare query strings in SQLite
MYSTART=`echo "SELECT strftime(datetime('${prevyear}-${1}-01'));" | sqlite3`
MYEND=`echo "SELECT strftime(datetime('${prevyear}-${1}-01','+3 month'));" | sqlite3`
# Debugging only, to see what it does:
echo "---- Querying ${prevyear}-${1}-01 ... ${year}-${3}-end:"
# we use start_time and end_time to get the proper time range
t.rast.list input=temp_daily_average@modis_lst_reconstructed_europe_daily \
where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" > list_${prevyear}_${1}_${3}.csv
head -n 3 list_${prevyear}_${1}_${3}.csv
echo "..."
tail -n 3 list_${prevyear}_${1}_${3}.csv
echo "======="
rm -f list_${prevyear}_${1}_${3}.csv
# calculate aggregates:
method="average" # median mode minimum maximum stddev
t.rast.series input=temp_daily_average@modis_lst_reconstructed_europe_daily \
where="start_time >= '$MYSTART' AND end_time <= '$MYEND'" \
output=temp_${method}_${prevyear}_${1}
done
done
Much simpler alternative (which runs in parallel on multi-cores):
t.rast.aggregate input=A output=B basename=b where="start_time >= '2004-03-01 00:00:00'" granularity="3 months"
How to count consecutive days that meet a certain condition?
This example was kindly contributed by Thomas Leppelt and shows how to obtain the number of consecutive days with temperature below 0 per week.
# Set the computational region and resolution
g.region -p n=-30 s=-50 e=-50 w=-70 res=1
# We create maps for 100 days
for map in `seq 1 100` ; do
# generate synthetic maps as a seeding random values for temperatures
r.mapcalc -s expression="daily_temp_${map} = rand(-20,20)" --o
echo daily_temp_${map} >> map_list.txt
done
t.create type=strds temporaltype=absolute \
output=temperature_daily \
title="Daily Temperature" \
description="Test dataset with daily temperature"
t.register -i type=raster input=temperature_daily \
file=map_list.txt start="2014-03-07" increment="1 days" --o
# check general information of the daily strds
t.info type=strds input=temperature_daily
# Create weekly mask to which we will add 1 or 0 according to temperature in the days within it (each map in this week mask will have a value of 7)
t.rast.aggregate input=temperature_daily output=weekly_mask basename=mask_week granularity="1 weeks" method=count
# Calculate consecutive days with negative temperatures
t.rast.algebra base=neg_temp_days expression="consecutive_days = weekly_mask {+,contains,l} if(temperature_daily < 0 && temperature_daily[-1] < 0 || temperature_daily[1] < 0 && temperature_daily < 0, 1, 0)"
Online tutorials and courses
- The temporal GRASS GIS framework: Introduction and application by Sören Gebbert (2012)
- Spatio-temporal data handling and visualization in GRASS GIS - FOSS4G 2014 workshop by Vaclav Petras, Anna Petrasova, Helena Mitasova, Markus Neteler (2014)
- Processing ERA-Interim dataset in netcdf using GDAL and GRASS GIS by Sajid Pareeth (2015)
- CZ: Úvod do časoprostorových analýz (Introduction to space-time analysis) by Martin Landa (2015)
- Sentinel spatio-temporal processing by Martin Landa (2020)
- TGRASS: temporal data processing with GRASS GIS FOSS4G-EU workshop by Veronica Andreo, Luca Delucchi and Markus Neteler (2017)
- Spatio-temporal data processing and visualization with GRASS GIS, GEOSTAT Summer School lecture by Veronica Andreo (2018)
References
- Gebbert, S., Pebesma, E. 2014. TGRASS: A temporal GIS for field based environmental modeling. Environmental Modelling & Software 53, 1-12 (DOI) - preprint PDF
- Gebbert, S., Pebesma, E. 2017. The GRASS GIS temporal framework. International Journal of Geographical Information Science 31, 1273-1292 (DOI) - PDF
- Gebbert, S., Leppelt, T., Pebesma, E., 2019. A topology based spatio-temporal map algebra for big data analysis. Data 4, 86. (DOI) - PDF
See also
- Introduction: GRASS as a Temporal GIS by Sören Gebbert (slides)
- The temporal GRASS GIS framework: Introduction and application Video of Sören Gebbert's presentation in GEOSTAT course 2012
- Introduction to temporal modules (GRASS GIS manual)
- Available temporal modules (GRASS GIS manual)
- Temporal Extension (background info in trac)
- GRASS GSoC 2013 Temporal GIS Algebra for raster and vector data in GRASS