Temporal data processing/maps registration
Latest revision as of 12:36, 29 December 2020
Different ways of registering maps in STDS
There are different ways to register maps in a STDS (space-time dataset) with t.register. The best one in each case depends on your data and on whether you need/have time instances or time intervals (see the FAQ section for more details). Time instances are defined by a single point in time, i.e. only a start time. Time intervals, on the other hand, are defined by both a start and an end time.
If you use either a file with map names or a list of maps (manually typed or as g.list output), and you need time instances (temporal type = point), then set the start and increment options (but not the -i flag).
# manually typed map list
t.register input=precipitation \
maps=prec_01,prec_02,prec_03,prec_04,prec_05,prec_06 \
start="2000-01-01" increment="1 months"
# with g.list output
t.register input=precipitation \
maps=`g.list type=raster pattern="prec*" separator=comma` \
start="2000-01-01" increment="1 months"
# file with only map names
t.register input=precipitation file=map_list.txt \
start="2000-01-01" increment="1 months"
You can also get time instances by passing a file that contains map names and start times.
# file with map names and start time only
t.register input=precipitation file="map_list_&_start_time.txt"
If you need/have data at regular intervals instead, you also need to set the -i flag along with the start time. The -i flag and the increment option only work if a start time is defined (but no end time).
# map list
t.register -i input=precipitation \
maps=`g.list type=raster pattern="prec*" separator=comma` \
start="2000-01-01" increment="1 months"
# file with only map names
t.register -i input=precipitation file=map_list.txt \
start="2000-01-01" increment="1 months"
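To make the difference between the two cases concrete, here is a minimal sketch in plain Python (it does not call GRASS) of the time stamps one would expect for a start of 2000-01-01 and an increment of one month, without and with -i. The helper names add_months and monthly_stamps are hypothetical and only illustrate the idea; the actual behaviour of t.register should be checked against its manual page.

```python
from datetime import datetime


def add_months(d, months):
    """Shift a datetime by whole months.

    Assumes the day of month exists in the target month (e.g. day 1).
    """
    total = d.year * 12 + (d.month - 1) + months
    return d.replace(year=total // 12, month=total % 12 + 1)


def monthly_stamps(maps, start, interval=False):
    """Return (map name, start, end) tuples with a one-month increment.

    interval=False mimics time instances (no end time),
    interval=True mimics the -i flag (end = start of the next step).
    """
    rows = []
    for i, name in enumerate(maps):
        begin = add_months(start, i)
        end = add_months(start, i + 1) if interval else None
        rows.append((name, begin, end))
    return rows


if __name__ == "__main__":
    maps = ["prec_01", "prec_02", "prec_03"]
    for row in monthly_stamps(maps, datetime(2000, 1, 1), interval=True):
        print(row)
```

With interval=False every map only gets a start time; with interval=True the end time of each map coincides with the start time of the next one.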
If you pass a file with map names and start time only, but you want time intervals, then you can use t.snap on the STDS to create a correct temporal topology: each map will use the start time of a potential successor as its end time.
t.register input=precipitation file="map_list_&_start_time.txt"
t.snap type=strds input=precipitation
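The snapping idea can be sketched in plain Python: the maps are ordered by start time and each one takes its end time from the start time of the map that follows it. This is only an illustration of the principle, not the t.snap implementation; the function name snap is hypothetical, and leaving the last map without an end time is an assumption of this sketch.

```python
from datetime import datetime


def snap(starts):
    """Pair each start time with the start time of the following map.

    The last map has no successor, so its end time is left as None
    (an assumption of this sketch).
    """
    ordered = sorted(starts)
    ends = ordered[1:] + [None]
    return list(zip(ordered, ends))


if __name__ == "__main__":
    starts = [datetime(2000, 1, 1), datetime(2000, 3, 1), datetime(2000, 2, 1)]
    for begin, end in snap(starts):
        print(begin, end)
```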
Alternatively, you can also provide end_time in the input file (for example, if your time intervals are irregular). If the end_time of one map equals the start_time of the following one, t.register will automatically create intervals.
t.register input=precipitation file="map_list_start_&_end_time.txt"
To sum up: if you provide a file with time stamps, you should use neither the -i flag nor the increment option. In the input file you can provide both time instances and time intervals, and also overlapping times. See the t.register manual page for the input file formats (https://grass.osgeo.org/grass-stable/manuals/t.register.html#input-file-format). Whatever the case, when you provide an input file, make sure you delete any blank lines at the end of the map list.
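Since blank lines in the input file are easy to overlook, a small helper can strip them before the file is passed to t.register. The following is a minimal sketch in plain Python; the function name clean_map_list is hypothetical, and it only removes empty lines and surrounding whitespace without validating the time stamps themselves.

```python
def clean_map_list(text):
    """Return the map list without blank lines or surrounding whitespace.

    Every kept line ends with a single newline, so the result has no
    blank line at the end.
    """
    kept = [line.strip() for line in text.splitlines() if line.strip()]
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    raw = "prec_01|2000-01-01\n\nprec_02|2000-02-01\n\n\n"
    print(clean_map_list(raw), end="")
```

To clean a file in place, read its content, pass it through the function and write the result back under the same name.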
NOTE
If the input option is not used (no STDS is provided), maps are only registered in the temporal database, i.e. they are assigned timestamps there. If, however, a STDS is provided through input, maps are first registered in the temporal database (if not already registered) and then in the specified STDS. If you want to register maps that have already been registered in the temporal database of that mapset, you do not need to pass start and end times again; the list of map names is enough, and t.register will read the timestamps from the temporal database.
Registering maps in the temporal database and registering maps in a STDS are two different processes, and t.register does both. If you use the module without the input option, you are just assigning timestamps to maps in the temporal database (in the temporal framework). If you use t.register with the input option (i.e. providing a STDS), you are registering timestamped maps in that STDS; they must be registered in the temporal database, either by the same run of the command or by a previous run without the input option.
Parsing timestamps from file names
Example for Sentinel-2 data
Sentinel-2 timestamps can be easily determined from the raster map names, for example the imported S-2 image scene L2A_T32UPB_A021941_20190904T102045 will be timestamped as 2019-09-04 10:20:45.
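As a minimal sketch of this parsing step in plain Python, the date and time can be picked out of the map name with a regular expression. The function name sentinel_timestamp is hypothetical, and the sketch assumes that the name contains exactly one token of the form YYYYMMDDTHHMMSS, as in the example above.

```python
import re
from datetime import datetime

# Eight digits, a literal "T", six digits, e.g. 20190904T102045
STAMP = re.compile(r"\d{8}T\d{6}")


def sentinel_timestamp(map_name):
    """Extract the acquisition time stamp from a Sentinel-2 map name."""
    match = STAMP.search(map_name)
    if match is None:
        raise ValueError("no time stamp found in <%s>" % map_name)
    return datetime.strptime(match.group(), "%Y%m%dT%H%M%S")


if __name__ == "__main__":
    name = "L2A_T32UPB_A021941_20190904T102045"
    print(sentinel_timestamp(name).strftime("%Y-%m-%d %H:%M:%S"))
```

The formatted string can be written next to the map name, separated by a pipe character, to build an input file for t.register.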
A script to automate the creation of a text file to be used as file input for t.register can be found here:
https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html
The generated file looks like this:
T32UPB_20190407T102021_B04_10m|2019-04-07 10:26:45.035007|S2_4
T32UPB_20190407T102021_B08_10m|2019-04-07 10:26:45.035007|S2_8
T32UPB_20190422T102029_B04_10m|2019-04-22 10:26:50.312683|S2_4
Example for monthly climatic data
The following script extracts the timestamps from the file names of the DWD gridded monthly 1 km data (provided as open data at https://opendata.dwd.de/climate_environment/CDC/grids_germany/monthly/) after import into GRASS GIS. The file name scheme in this example is grids_germany_monthly_air_temp_min.01.1935:
#!/usr/bin/env python3
# Timestamp generator for DWD gridded monthly 1km data, Gauß-Krüger Zone 3
# open data: https://opendata.dwd.de/climate_environment/CDC/grids_germany/monthly/
#
# (c) mundialis and Markus Neteler, 2020
# license: GPL-3
#
# Script based on "sentinel-timestamp.py"
# https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html
#
# Usage example - air_temp_mean:
# python3 r.dwd_monthly_timestamp_extractor.py product=air_temp_mean output=dwd_monthly_gridded_data_air_temp_mean_timestamped.csv

#%module
#% description: Timestamps DWD raster maps from current mapset.
#%end
#%option
#% key: product
#% type: string
#% description: DWD gridded product
#% options: air_temp_max,air_temp_mean,air_temp_min,drought_index,precipitation,soil_moist
#% required: yes
#% multiple: no
#%end
#%option G_OPT_F_OUTPUT
#%end

import sys
from datetime import datetime

from dateutil.relativedelta import relativedelta

import grass.script as gs
from grass.pygrass.gis import Mapset


def main():
    mapset = Mapset()
    mapset.current()

    # all raster maps of the selected product in the current mapset
    dwd_set = "grids_germany_monthly_" + options['product'] + ".*"
    gs.message("Scanning for <%s> ..." % dwd_set)

    # file name scheme: grids_germany_monthly_air_temp_min.01.1935
    with open(options['output'], 'w') as fd:
        for rast in mapset.glist('raster', pattern=dwd_set):
            # items[1] is the month, items[2] is the year
            items = rast.split('.')
            d = datetime.strptime(items[1] + items[2], '%m%Y')
            # end time: first day of the following month
            dd = d + relativedelta(months=+1)
            # one line per map: name|start time|end time
            fd.write("{0}|{1}|{2}\n".format(
                rast,
                d.strftime('%Y-%m-%d %H:%M:%S'),
                dd.strftime('%Y-%m-%d %H:%M:%S')))

    return 0


if __name__ == "__main__":
    options, flags = gs.parser()
    sys.exit(main())
The generated text file is to be used as file input for t.register. It looks like this:
grids_germany_monthly_air_temp_mean.01.1881|1881-01-01 00:00:00|1881-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1882|1882-01-01 00:00:00|1882-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1883|1883-01-01 00:00:00|1883-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1884|1884-01-01 00:00:00|1884-02-01 00:00:00
...
FAQ
Q: How do I know if my data represents time intervals or time instances?
A: If your data is accumulated over a time period (a day, a week, ...), then you have interval time. If your data represents a value that is valid for a specific time period (e.g. a daily mean), you have interval time as well. However, if your data represents a specific state at a point in time, or over a very short period such as a second, then you have time instances.
The next thing to consider is what you want to do with the data. If you want to compute interactions with other time series, then it makes sense to use time intervals to compute temporal overlapping and intersection.