Temporal data processing/maps registration

From GRASS-Wiki
Revision as of 12:36, 29 December 2020 by Neteler (talk | contribs) (+Parsing timestamps from file names with Python)

Different ways of registering maps in STDS

There are different ways to register maps in an STDS (space-time dataset) with t.register. The best one in each case depends on your data and on whether you need or have time instances or time intervals (see the FAQ section for more details). Time instances are defined by a single point in time, that is, a start time only. Time intervals are defined by both a start and an end time.

If you use either a file with map names or a list of maps (typed manually or taken from g.list output) and you need time instances (temporal type = point), set the start and increment options and do not use the -i flag.

# manually typed map list 
t.register input=precipitation \
maps=prec_01,prec_02,prec_03,prec_04,prec_05,prec_06 \
start="2000-01-01" increment="1 months"

# with g.list output
t.register input=precipitation \
maps=`g.list type=raster pattern="prec*" separator=comma` \
start="2000-01-01" increment="1 months"

# file with only map names
t.register input=precipitation file=map_list.txt \
start="2000-01-01" increment="1 months"

You can also get time instances by passing a file that contains map names and start times.

# file with map names and start time only
t.register input=precipitation file="map_list_&_start_time.txt"
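
Such a file can be written by hand or generated. The following is a minimal Python sketch, assuming the default pipe separator and one `map_name|start_time` entry per line; the helper names `format_start_time_lines` and `write_start_time_file` and the map names are hypothetical and only serve as an illustration.

```python
from datetime import datetime


def format_start_time_lines(maps_with_start, separator="|"):
    """Return one 'map_name|start_time' string per (name, datetime) pair."""
    return [
        "{0}{1}{2}".format(name, separator, start.strftime("%Y-%m-%d"))
        for name, start in maps_with_start
    ]


def write_start_time_file(path, maps_with_start):
    """Write the lines to a text file without blank lines in between or at the end."""
    lines = format_start_time_lines(maps_with_start)
    with open(path, "w") as fd:
        fd.write("\n".join(lines) + "\n")


# Hypothetical map names and start dates
example = [
    ("prec_01", datetime(2000, 1, 1)),
    ("prec_02", datetime(2000, 2, 1)),
    ("prec_03", datetime(2000, 3, 1)),
]
```

Calling `write_start_time_file("map_list_&_start_time.txt", example)` would then produce a file that can be passed to the file option.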

If you have or need data at regular intervals instead, set the -i flag along with the start time and the increment. The -i flag and the increment option only work if a start time is defined (and no end time).

# map list 
t.register -i input=precipitation \
maps=`g.list type=raster pattern="prec*" separator=comma` \
start="2000-01-01" increment="1 months"

# file with only map names
t.register -i input=precipitation file=map_list.txt \
start="2000-01-01" increment="1 months"
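
The effect of start and increment, with and without -i, can be sketched in plain Python. This only illustrates the resulting timestamps, assuming that maps are processed in list order and that the increment is a whole number of months. It is not how t.register is implemented, and `add_months` and `assign_times` are hypothetical helper names.

```python
from datetime import datetime


def add_months(moment, months):
    """Shift a datetime by whole months (the day of month must exist in the target month)."""
    index = moment.year * 12 + (moment.month - 1) + months
    return moment.replace(year=index // 12, month=index % 12 + 1)


def assign_times(maps, start, months, interval=False):
    """Map number i starts at start + i * increment.

    With interval=True (mimicking the -i flag), the end time is the
    start time plus one more increment; otherwise the end time is None.
    """
    result = []
    for i, name in enumerate(maps):
        begin = add_months(start, i * months)
        end = add_months(start, (i + 1) * months) if interval else None
        result.append((name, begin, end))
    return result


maps = ["prec_01", "prec_02", "prec_03"]
instances = assign_times(maps, datetime(2000, 1, 1), 1)
intervals = assign_times(maps, datetime(2000, 1, 1), 1, interval=True)
```

Here `instances` carries only start times, while `intervals` pairs each start time with the start of the following month.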

If you pass a file with map names and start times only but want time intervals, you can run t.snap on the STDS afterwards to create a correct temporal topology: each map gets the start time of its temporal successor as its end time.

t.register input=precipitation file="map_list_&_start_time.txt"
t.snap type=strds input=precipitation
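
The snapping idea can be illustrated with a short Python sketch. It assumes a list of (name, start time) pairs and leaves the last map without an end time, since it has no successor; how t.snap itself treats that last map is not covered here. The function name `snap` is hypothetical.

```python
from datetime import datetime


def snap(maps):
    """Turn time instances into intervals ending at the successor's start time.

    maps: list of (name, start) tuples in any order.
    Returns a list of (name, start, end) tuples sorted by start time.
    """
    ordered = sorted(maps, key=lambda item: item[1])
    snapped = []
    for i, (name, start) in enumerate(ordered):
        # The last map has no successor, so its end time stays None here
        end = ordered[i + 1][1] if i + 1 < len(ordered) else None
        snapped.append((name, start, end))
    return snapped


# Hypothetical, irregularly spaced time instances
registered = [
    ("prec_02", datetime(2000, 1, 20)),
    ("prec_01", datetime(2000, 1, 1)),
    ("prec_03", datetime(2000, 3, 5)),
]
snapped = snap(registered)
```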

Alternatively, you can also provide the end time in the input file (for example, if your time intervals are irregular). If the end time of one map equals the start time of the following one, t.register will automatically create intervals.

t.register input=precipitation file="map_list_start_&_end_time.txt"

To sum up: if you provide a file with timestamps, use neither the -i flag nor the increment option. The input file can contain both time instances and time intervals, as well as overlapping times. See the manual page for the input file formats. In any case, when you provide an input file, make sure to delete any blank lines at the end of the map list.
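
Blank lines can also be removed programmatically before registration. The following is a minimal sketch in Python; the function name `clean_register_lines` is hypothetical, and the sample content is made up for illustration.

```python
def clean_register_lines(text):
    """Return the non-blank lines of an input file body, stripped of surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Made-up file content with a blank line in the middle and two at the end
raw = "prec_01|2000-01-01\n\nprec_02|2000-02-01\n\n\n"
cleaned = "\n".join(clean_register_lines(raw)) + "\n"
```

Writing `cleaned` back to the file leaves exactly one entry per line.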

NOTE

If the input option is not used (no STDS is provided), maps are only registered in the temporal database, i.e. they are assigned timestamps there. If an STDS is provided through input, maps are first registered in the temporal database (if not already registered) and then in the specified STDS. To register maps that are already registered in the temporal database of that mapset, you do not need to pass the start and end times again; the list of map names is enough, because t.register reads the timestamps from the temporal database.

Registering maps in the temporal database and registering maps in an STDS are two different processes, and t.register does both. If you use the module without the input option, you only assign timestamps to maps in the temporal database (in the temporal framework). If you use t.register with the input option (i.e. providing an STDS), you register timestamped maps in that STDS; the maps are registered in the temporal database either in the same run of the command or in a previous run without the input option.

Parsing timestamps from file names

Example for Sentinel-2 data

Sentinel-2 timestamps can easily be determined from the raster map names. For example, the imported Sentinel-2 scene L2A_T32UPB_A021941_20190904T102045 will be timestamped as 2019-09-04 10:20:45.
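
As a minimal sketch of this parsing step, the following Python function looks for a `YYYYMMDDTHHMMSS` token anywhere in the map name. It assumes that each name contains exactly one such token; the function name `sentinel2_timestamp` is hypothetical and not part of GRASS GIS.

```python
import re
from datetime import datetime

# Eight digits (date), a literal "T", six digits (time)
STAMP = re.compile(r"\d{8}T\d{6}")


def sentinel2_timestamp(map_name):
    """Return the datetime encoded in a Sentinel-2 style map name."""
    match = STAMP.search(map_name)
    if match is None:
        raise ValueError("No timestamp found in <%s>" % map_name)
    return datetime.strptime(match.group(0), "%Y%m%dT%H%M%S")


stamp = sentinel2_timestamp("L2A_T32UPB_A021941_20190904T102045")
print(stamp.strftime("%Y-%m-%d %H:%M:%S"))
```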

A script to automate the creation of a text file to be used as file input for t.register can be found here:

https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html

The generated file looks like this:

T32UPB_20190407T102021_B04_10m|2019-04-07 10:26:45.035007|S2_4
T32UPB_20190407T102021_B08_10m|2019-04-07 10:26:45.035007|S2_8
T32UPB_20190422T102029_B04_10m|2019-04-22 10:26:50.312683|S2_4

Example for monthly climatic data

The following script extracts the timestamps from the file names of the DWD gridded monthly 1km data (provided as open data at https://opendata.dwd.de/climate_environment/CDC/grids_germany/monthly/) after import into GRASS GIS. The file name scheme in this example is grids_germany_monthly_air_temp_min.01.1935:

#!/usr/bin/env python3

# Timestamp generator for DWD gridded monthly 1km data, Gauß-Krüger Zone 3
# open data: https://opendata.dwd.de/climate_environment/CDC/grids_germany/monthly/
#
# (c) mundialis and Markus Neteler, 2020
# license: GPL-3
#
# Script based on "sentinel-timestamp.py"
#  https://training.gismentors.eu/grass-gis-workshop-jena/units/21.html
#
# Usage example - air_temp_mean:
#  python3 r.dwd_monthly_timestamp_extractor.py product=air_temp_mean output=dwd_monthly_gridded_data_air_temp_mean_timestamped.csv

#%module
#% description: Timestamps DWD raster maps from current mapset.
#%end
#%option
#% key: product
#% type: string
#% description: DWD gridded product
#% options: air_temp_max,air_temp_mean,air_temp_min,drought_index,precipitation,soil_moist
#% required: yes
#% multiple: no
#%end
#%option G_OPT_F_OUTPUT
#%end

import sys
import os
from datetime import datetime
from dateutil.relativedelta import relativedelta

import grass.script as gs

from grass.pygrass.gis import Mapset

def main():
    mapset = Mapset()
    mapset.current()
    dwd_set = "grids_germany_monthly_" + options['product'] + ".*"
    gs.message("Scanning for <%s> ..." % dwd_set)

    # file name scheme: grids_germany_monthly_air_temp_min.01.1935

    with open(options['output'], 'w') as fd:
        for rast in mapset.glist('raster', pattern=dwd_set):
            items = rast.split('.')
            d = datetime.strptime(items[1] + items[2], '%m%Y')
            dd = d + relativedelta(months=+1)
            # one line per map: name|start|end; "\n" is translated by text mode
            fd.write("{0}|{1}|{2}\n".format(
                rast,
                d.strftime('%Y-%m-%d %H:%M:%S'),
                dd.strftime('%Y-%m-%d %H:%M:%S')))
        
    return 0

if __name__ == "__main__":
    options, flags = gs.parser()
    
    sys.exit(main())

The generated text file is to be used as file input for t.register. It looks like this:

grids_germany_monthly_air_temp_mean.01.1881|1881-01-01 00:00:00|1881-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1882|1882-01-01 00:00:00|1882-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1883|1883-01-01 00:00:00|1883-02-01 00:00:00
grids_germany_monthly_air_temp_mean.01.1884|1884-01-01 00:00:00|1884-02-01 00:00:00
...

FAQ

Q: How do I know if my data represents time intervals or time instances?

A: If your data is accumulated over a time period (a day, a week, ...), you have time intervals. If your data represents a value that is valid for a specific time period (for example, a daily mean), you have time intervals as well. However, if your data represents a specific state at a point in time, or covers only a very short period such as a second, you have time instances.

The next thing to consider is what you want to do with the data. If you want to compute interactions with other time series, it makes sense to use time intervals, so that temporal overlap and intersection can be computed.
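
To illustrate why intervals are needed for such computations, here is a minimal Python sketch of overlap and intersection between two intervals. It assumes intervals that include their start time and exclude their end time; the function names are hypothetical and unrelated to the GRASS temporal framework API.

```python
from datetime import datetime


def overlaps(a_start, a_end, b_start, b_end):
    """True if the intervals [a_start, a_end) and [b_start, b_end) share any time."""
    return a_start < b_end and b_start < a_end


def intersection(a_start, a_end, b_start, b_end):
    """Return the shared (start, end) of two intervals, or None if they do not overlap."""
    if not overlaps(a_start, a_end, b_start, b_end):
        return None
    return max(a_start, b_start), min(a_end, b_end)


january = (datetime(2000, 1, 1), datetime(2000, 2, 1))
february = (datetime(2000, 2, 1), datetime(2000, 3, 1))
mid_winter = (datetime(2000, 1, 15), datetime(2000, 2, 15))
```

With these definitions, `january` and `february` only meet and do not overlap, while `mid_winter` overlaps both. Time instances have no extent, so this kind of comparison is not meaningful for them.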