Temporal data processing/maps registration: Difference between revisions

Latest revision as of 12:36, 29 December 2020

Different ways of registering maps in STDS

There are different ways to register maps in a stds (space-time data set) with t.register. The best one in each case depends on your data and on whether you need/have time instances or time intervals (see the FAQ section for more details). Time instances are defined by an occurrence date, i.e. by a start time only. Time intervals, on the other hand, are defined by both a start and an end time.

If you use either a file with map names or a list of maps (manually typed or as g.list output), and you need time instances (temporal type = point), then set the start and increment options and leave out the -i flag.

# manually typed map list 
t.register input=precipitation \
maps=prec_01,prec_02,prec_03,prec_04,prec_05,prec_06 \
start="2000-01-01" increment="1 months"

# with g.list output
t.register input=precipitation \
maps=`g.list type=raster pattern="prec*" separator=comma` \
start="2000-01-01" increment="1 months"

# file with only map names
t.register input=precipitation file=map_list.txt \
start="2000-01-01" increment="1 months"

You can also get time instances by passing a file with map names and start time.

# file with map names and start time only
# (the file name is quoted so that the shell does not interpret the "&")
t.register input=precipitation file="map_list_&_start_time.txt"

If your data represents regular time intervals instead, set the -i flag in addition to the start and increment options. The -i flag and the increment option only work if a start time is defined (but no end time).

# map list 
t.register -i input=precipitation \
maps=`g.list type=raster pattern="prec*" separator=comma` \
start="2000-01-01" increment="1 months"

# file with only map names
t.register -i input=precipitation file=map_list.txt \
start="2000-01-01" increment="1 months"
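
To make the difference between the two variants concrete, the following pure-Python sketch computes the timestamps that six monthly maps would carry. It does not call GRASS. The helper names add_months and monthly_timestamps are made up for this illustration, and it assumes that with -i each interval ends where the next one starts.

```python
from datetime import datetime


def add_months(t, n):
    """Shift t by n calendar months (the day of month must exist in the target month)."""
    total = t.year * 12 + (t.month - 1) + n
    return t.replace(year=total // 12, month=total % 12 + 1)


def monthly_timestamps(maps, start, interval=False):
    """Return (name, start, end) tuples; end is None for time instances."""
    t0 = datetime.strptime(start, "%Y-%m-%d")
    rows = []
    for i, name in enumerate(maps):
        begin = add_months(t0, i)
        # assumption: with -i, the end time is the start time plus one increment
        end = add_months(t0, i + 1) if interval else None
        rows.append((name, begin, end))
    return rows


maps = ["prec_%02d" % i for i in range(1, 7)]
instances = monthly_timestamps(maps, "2000-01-01")                 # like start + increment
intervals = monthly_timestamps(maps, "2000-01-01", interval=True)  # like -i + start + increment

for name, begin, end in intervals:
    print(name, begin.date(), end.date())
```

In this sketch the instances carry only a start time, while the intervals tile the period from 2000-01-01 to 2000-07-01 without gaps.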

If you pass a file with map names and start time only, but you want time intervals, then you can use t.snap on the STDS to create a correct temporal topology: each map will use the start time of its temporal successor, if there is one, as its end time.

t.register input=precipitation file="map_list_&_start_time.txt"
t.snap type=strds input=precipitation
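
The idea behind snapping can be sketched in a few lines of plain Python. This is only an illustration of the principle, not the code used by t.snap: the function name snap_instances is hypothetical, start times are assumed to be distinct, and the last map, which has no successor, is simply left without an end time here.

```python
from datetime import datetime


def snap_instances(records):
    """records: list of (name, start). Returns (name, start, end) sorted by start."""
    ordered = sorted(records, key=lambda r: r[1])
    snapped = []
    for i, (name, start) in enumerate(ordered):
        # the end time is the start time of the next map in time, if any
        end = ordered[i + 1][1] if i + 1 < len(ordered) else None
        snapped.append((name, start, end))
    return snapped


records = [
    ("prec_02", datetime(2000, 3, 1)),
    ("prec_01", datetime(2000, 1, 1)),
    ("prec_03", datetime(2000, 3, 15)),
]
for row in snap_instances(records):
    print(row)
```

Because the end times are taken from the neighbouring maps, this also works for irregularly spaced start times.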

Alternatively, you can also provide end_time in the input file (for example, if your time intervals are irregular). If the end_time of one map equals the start_time of the following one, t.register will automatically create intervals.

t.register input=precipitation file="map_list_start_&_end_time.txt"

So, to sum up, if you provide a file with time stamps, you should use neither the -i flag nor the increment option. In the input file you can provide time instances, time intervals, or both, and the times may overlap. See the t.register manual page for the input file formats (https://grass.osgeo.org/grass-stable/manuals/t.register.html#input-file-format). In any case, when you provide an input file, make sure to delete any blank lines at the end of the map list.
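
Blank lines are easy to overlook in a text editor, so it can help to clean the file programmatically before registering. A minimal sketch, assuming the pipe-separated layout shown in the examples on this page; the function name clean_map_list is made up for this illustration, and it drops blank lines anywhere in the list, not only at the end.

```python
def clean_map_list(text):
    """Drop empty or whitespace-only lines and end the list with a single newline."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return "\n".join(lines) + "\n" if lines else ""


raw = (
    "prec_01|2000-01-01|2000-02-01\n"
    "prec_02|2000-02-01|2000-03-01\n"
    "\n"
    "\n"
)
cleaned = clean_map_list(raw)
print(repr(cleaned))
```

The cleaned text can then be written back to the file that is passed to the file option.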

NOTE

If the input option is not used (no stds is provided), maps are only registered in the temporal database, i.e. they are assigned timestamps there. If, however, a stds is provided through input, maps are first registered in the temporal database (if not already registered) and then in the specified stds. If you want to register maps that have already been registered in that mapset, you do not need to pass start and end time again; the list of map names is enough. t.register will read the timestamps from the temporal database.

Registering maps in the temporal database and registering maps in a stds are two different processes, and t.register does both. If you use the module without the input option, you are just assigning timestamps to maps in the temporal database (in the temporal framework). If you use t.register with the input option (i.e. providing a stds), you are registering timestamped maps in that stds; their registration in the temporal database happens either in the same run of the command or in a previous one that did not use the input option.

Parsing timestamps from file names

Example for Sentinel-2 data

Sentinel-2 timestamps can be easily determined from the raster map names, for example the imported S-2 image scene L2A_T32UPB_A021941_20190904T102045 will be timestamped as 2019-09-04 10:20:45.
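
The mapping from map name to timestamp described above can be sketched with the Python standard library alone. The function name s2_timestamp is made up for this illustration, and it assumes that the sensing time appears in the name as an 8-digit date, a "T" and a 6-digit time. It only covers timestamps derived from the name itself, not ones taken from the scene metadata.

```python
import re
from datetime import datetime


def s2_timestamp(map_name):
    """Extract the YYYYMMDDTHHMMSS token from a map name and parse it."""
    match = re.search(r"\d{8}T\d{6}", map_name)
    if match is None:
        raise ValueError("no timestamp found in <%s>" % map_name)
    return datetime.strptime(match.group(0), "%Y%m%dT%H%M%S")


name = "L2A_T32UPB_A021941_20190904T102045"
stamp = s2_timestamp(name)
print("%s|%s" % (name, stamp.strftime("%Y-%m-%d %H:%M:%S")))
# prints: L2A_T32UPB_A021941_20190904T102045|2019-09-04 10:20:45
```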

A script to automate the creation of a text file to be used as file input for t.register can be found here:

https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html

The generated file looks like this:

T32UPB_20190407T102021_B04_10m|2019-04-07 10:26:45.035007|S2_4
T32UPB_20190407T102021_B08_10m|2019-04-07 10:26:45.035007|S2_8
T32UPB_20190422T102029_B04_10m|2019-04-22 10:26:50.312683|S2_4

Example for monthly climatic data

The following script extracts the timestamps from the file names of the DWD gridded monthly 1km data (provided as open data at https://opendata.dwd.de/climate_environment/CDC/grids_germany/monthly/), after import into GRASS GIS. The file name scheme is grids_germany_monthly_air_temp_min.01.1935 in this example:

#!/usr/bin/env python3

# Timestamp generator for DWD gridded monthly 1km data, Gauß-Krüger Zone 3
# open data: https://opendata.dwd.de/climate_environment/CDC/grids_germany/monthly/
#
# (c) mundialis and Markus Neteler, 2020
# license: GPL-3
#
# Script based on "sentinel-timestamp.py"
#  https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html
#
# Usage example - air_temp_mean:
#  python3 r.dwd_monthly_timestamp_extractor.py product=air_temp_mean output=dwd_monthly_gridded_data_air_temp_mean_timestamped.csv

#%module
#% description: Timestamps DWD raster maps from current mapset.
#%end
#%option
#% key: product
#% type: string
#% description: DWD gridded product
#% options: air_temp_max,air_temp_mean,air_temp_min,drought_index,precipitation,soil_moist
#% required: yes
#% multiple: no
#%end
#%option G_OPT_F_OUTPUT
#%end

import sys
from datetime import datetime

from dateutil.relativedelta import relativedelta

import grass.script as gs

from grass.pygrass.gis import Mapset

def main():
    mapset = Mapset()
    mapset.current()
    dwd_set = "grids_germany_monthly_" + options['product'] + ".*"
    gs.message("Scanning for <%s> ..." % dwd_set)

    # file name scheme: grids_germany_monthly_air_temp_min.01.1935

    with open(options['output'], 'w') as fd:
        for rast in mapset.glist('raster', pattern=dwd_set):
            # split the map name into base name, month and year
            items = rast.split('.')
            # start time: first day of the month
            d = datetime.strptime(items[1] + items[2], '%m%Y')
            # end time: first day of the following month
            dd = d + relativedelta(months=+1)
            # one line per map: name|start|end
            # (text mode translates "\n" to the platform line ending)
            fd.write("{0}|{1}|{2}\n".format(
                rast,
                d.strftime('%Y-%m-%d %H:%M:%S'),
                dd.strftime('%Y-%m-%d %H:%M:%S')))
        
    return 0

if __name__ == "__main__":
    options, flags = gs.parser()
    
    sys.exit(main())

The generated text file is to be used as file input for t.register. It looks like this:

grids_germany_monthly_air_temp_mean.01.1881|1881-01-01 00:00:00|1881-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1882|1882-01-01 00:00:00|1882-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1883|1883-01-01 00:00:00|1883-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1884|1884-01-01 00:00:00|1884-02-01 00:00:00
...
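
The name parsing at the core of the script can be tried without a GRASS session. The following sketch uses only the standard library instead of dateutil; the function names dwd_interval and register_line are made up for this illustration, and the map names are assumed to follow the file name scheme given above.

```python
from datetime import datetime


def dwd_interval(map_name):
    """Return (start, end) for a name ending in .MM.YYYY; end is the next month's first day."""
    _base, month, year = map_name.rsplit(".", 2)
    start = datetime(int(year), int(month), 1)
    if start.month == 12:
        end = datetime(start.year + 1, 1, 1)  # December rolls over into the next year
    else:
        end = datetime(start.year, start.month + 1, 1)
    return start, end


def register_line(map_name):
    """Format one name|start|end line as in the generated text file."""
    start, end = dwd_interval(map_name)
    return "|".join([map_name, start.isoformat(sep=" "), end.isoformat(sep=" ")])


print(register_line("grids_germany_monthly_air_temp_mean.01.1881"))
# prints: grids_germany_monthly_air_temp_mean.01.1881|1881-01-01 00:00:00|1881-02-01 00:00:00
```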

FAQ

Q: How do I know if my data represents time intervals or time instances?

A: If your data is accumulated over a time period (a day, a week, ...), then you have interval time. If your data represents a value that is valid for a specific time period (e.g. a daily mean), you have interval time as well. However, if your data represents a specific state at a point in time, or is valid only for a very short period such as a second, then you have time instances.

The next thing to consider is what you want to do with the data. If you want to compute interactions with other time series, then it makes sense to use time intervals to compute temporal overlapping and intersection.