Time series development

From GRASS-Wiki

Revision as of 22:18, 25 September 2006

This page is born out of discussions in Lausanne. A lot of people seem to be interested in having some standardized, documented format for dealing with time series in GRASS, so that we can deal with data (e.g. climate station, water gauges), link with models, etc., without having to come up with a custom solution each time. This could also lead to modules to help with interpolation to deal with missing values, etc.

There was also discussion of possibly setting up a mailing list. Do we want this?

I would vote against just-another-mailing list Neteler 23:51, 18 September 2006 (CEST)

Ideas from MN

  • each imported raster map automatically gets "registered" in an SQL table
    • Table structure:
      • map name
      • map creator (optional)
      • time stamp of import
      • time stamp of map production (optional)
      • time range of validity (optional)
  • g.list, g.rename etc. tools have a new "where" parameter to search maps in this table
  • g.remove will also remove row from SQL table
  • SQL commands can be used for search by time stamps
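
The registry idea above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table name map_registry, the column names, the helper functions and the sample maps are hypothetical and not part of the proposal. Timestamps are stored as ISO 8601 text so that plain string comparison matches chronological order.

```python
import sqlite3

def open_registry():
    """Create an in-memory registry (table and column names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE map_registry (
            name       TEXT PRIMARY KEY,  -- map name
            creator    TEXT,              -- map creator (optional)
            imported   TEXT NOT NULL,     -- time stamp of import
            produced   TEXT,              -- time stamp of map production (optional)
            valid_from TEXT,              -- time range of validity (optional)
            valid_to   TEXT
        )
        """
    )
    return conn

def register_map(conn, name, imported, creator=None, produced=None,
                 valid_from=None, valid_to=None):
    """What an import module would do after writing the map."""
    with conn:
        conn.execute(
            "INSERT INTO map_registry VALUES (?, ?, ?, ?, ?, ?)",
            (name, creator, imported, produced, valid_from, valid_to),
        )

def search_maps(conn, where, params=()):
    """Rough stand-in for a 'where' parameter on g.list."""
    sql = "SELECT name FROM map_registry WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]

def remove_map(conn, name):
    """What g.remove would additionally do: drop the registry row."""
    with conn:
        conn.execute("DELETE FROM map_registry WHERE name = ?", (name,))

conn = open_registry()
register_map(conn, "ndvi_2004", imported="2006-09-18T10:00:00",
             produced="2004-06-01T00:00:00")
register_map(conn, "ndvi_2005", imported="2006-09-18T10:05:00",
             produced="2005-06-01T00:00:00")
print(search_maps(conn, "produced < ?", ("2005-01-01T00:00:00",)))
remove_map(conn, "ndvi_2004")
print(search_maps(conn, "produced IS NOT NULL"))
```

The first query lists only ndvi_2004; after the removal, only ndvi_2005 is left in the registry.
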
maxi's proposal
  • what about a table with time series (sort of header of each time series) and another table for each time series with a list of associated filenames and related timestamps?
    • time series head table structure:
      • series name
      • creator name
      • timestamp of creation
      • range of validity
      • time step
      • ....
    • time series associated maps (table named like the related time series):
      • timestamp
      • map name
      • timestamp of import
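
A possible shape of this two-level layout, again sketched with Python's sqlite3 module. All names here are assumptions made for illustration: the head table is called series_head, each per-series table gets a "series_" prefix to avoid clashing with it, and the columns simply mirror the bullet points above.

```python
import sqlite3

def create_schema(conn):
    """Head table: one row per time series (names are hypothetical)."""
    conn.execute(
        """
        CREATE TABLE series_head (
            series_name TEXT PRIMARY KEY,
            creator     TEXT,
            created     TEXT,   -- timestamp of creation
            valid_from  TEXT,   -- range of validity
            valid_to    TEXT,
            time_step   TEXT
        )
        """
    )

def series_table(series_name):
    """Derive the per-series table name; table names cannot be bound as
    SQL parameters, so only plain identifiers are accepted."""
    if not series_name.isidentifier():
        raise ValueError("invalid series name: %r" % series_name)
    return "series_" + series_name

def create_series(conn, series_name, creator, created, valid_from, valid_to,
                  time_step):
    table = series_table(series_name)
    with conn:
        conn.execute(
            "INSERT INTO series_head VALUES (?, ?, ?, ?, ?, ?)",
            (series_name, creator, created, valid_from, valid_to, time_step),
        )
        conn.execute(
            f"CREATE TABLE {table} ("
            "map_timestamp TEXT PRIMARY KEY, map_name TEXT NOT NULL, imported TEXT)"
        )

def add_map(conn, series_name, map_timestamp, map_name, imported):
    table = series_table(series_name)
    with conn:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)",
                     (map_timestamp, map_name, imported))

def maps_between(conn, series_name, start, end):
    """Map names of one series whose timestamp lies in [start, end]."""
    table = series_table(series_name)
    rows = conn.execute(
        f"SELECT map_name FROM {table} "
        "WHERE map_timestamp >= ? AND map_timestamp <= ? ORDER BY map_timestamp",
        (start, end),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
create_schema(conn)
create_series(conn, "rain_2001", "maxi", "2006-09-25", "2001-01-01",
              "2001-12-31", "1 mon")
for month in (1, 2, 3):
    add_map(conn, "rain_2001", "2001-%02d-01" % month,
            "rain_2001_%02d" % month, "2006-09-25")
print(maps_between(conn, "rain_2001", "2001-01-15", "2001-03-01"))
```
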

Soeren's ideas

a temporal gis extension for grass

The temporal extension should not only help to manage temporal data; the data should also be easy to analyse and visualize in time.

temporal gis implementation in grass
  • a separate library in grass (dir temporal_gis or tgis)
  • should manage raster, raster3d, vector maps and groups of raster maps (imagery groups)
  • the existing timestamps should be used for registration in a temporal database
  • maps can be registered and unregistered in the temporal database
  • it should be possible to create several temporal tables in the database
    • those tables are a new datatype in grass
    • the tables should be used for thematic time series
    • maps and groups may be registered in different tables
    • it should be very easy to register and unregister thousands of maps in the database
    • tables will have a description, history, transaction log, creation time and modification time
    • statistical information about the maps (ranges) should be stored in the database
    • if the raster map changes, the table should be easily updatable (automatic update implemented in the libs -> if a map is closed, update the temporal table)
  • the temporal database should allow temporal queries for one, several or all tables in the database
  • if an sql database is used for map storing the full functionality of the grass-db-lib will be used
  • grass temporal gis library functions should begin with a T_, example:
T_register_rast_in_table(mapset, name, table)
T_unregister_rast_in_table(mapset, name, table)
T_query_table(table, sort, sqlstring)
...
changes to existing libs/modules
  • timestamps support for groups must be added
  • r.timestamp, r3.timestamp and v.timestamp should be implemented in g.timestamp with group support
  • support for temporal region (starttime, endtime, timesteps) in g.region
  • g.remove, g.rename and g.copy will need to have access to the temporal database
    • if a map is removed and registered in temporal tables, the table entry should also be removed
    • every map keeps the info in which temporal tables it is registered (additional file or info in header)
  • the timestamp should be automatically created if a map is created
  • g.list should list all temporal databases, not the registered maps (t.info should do this)
  • if automatic temporal table updates are requested, all the file io libs of raster, raster3d, vector maps and groups have to be extended
temporal database concepts
SQL databases with a time datatype
  • the temporal database in which the maps/groups are registered should provide temporal queries
    • select all maps between 20 jan 2002 and 13 feb 2004
    • select all maps with a valid range of 3 months
    • select all maps with date < 13 sep 1999
    • select all maps with a valid range of 10 years and a value range between [20:45]
  • PostgreSQL is the only Open Source database which supports such queries and suitable temporal datatypes [[1]] and [[2]]
SQL databases without a time datatype
  • sql databases which are able to handle double data types can be used
    • if the timestamps of the maps are converted into a double data type (seconds) before being sent to the database
      • 21 jan 2004 14:35:21 = 7897456389 seconds
    • the conversion has to be done automatically by the grass lib
    • temporal queries are partly available with statements like " select * from table where dateLarger(date, '20 jan 2006')"
    • complex WHERE statements may not be available because the dates have to be converted into seconds
      • e.g: where="startdate < 20 jan 2006 AND startdate > 14 sep 1990 AND daterange < 4 years 3 mon"
      • this will be hard to implement, the dates have to be parsed and replaced by one number (seconds)
  • sqlite would be a nice choice, because it is a local database and has a nice C/C++ api to implement temporal query functionality
    • the latest sqlite cvs version can handle temporal queries, but it doesn't have a temporal datatype
    • the temporal functionality is emulated, the date is stored as text
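
The "parse the dates and replace them by seconds" step can be sketched as follows, in Python and with sqlite3 standing in for any database with a double type. Several things are assumptions here: the page does not say which epoch its example number refers to, so this sketch counts seconds from the Unix epoch in UTC; the helper names to_seconds and rewrite_where, the table maps and its rows are hypothetical; and duration terms such as "4 years 3 mon" are deliberately not handled.

```python
import re
import sqlite3
from datetime import datetime, timezone

MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}

# Matches date literals such as "20 jan 2006", optionally followed by HH:MM:SS.
DATE_RE = re.compile(
    r"\b\d{1,2} (?:" + "|".join(MONTHS) + r") \d{4}(?: \d{2}:\d{2}:\d{2})?",
    re.IGNORECASE,
)

def to_seconds(text):
    """Convert 'DD mon YYYY [HH:MM:SS]' to seconds since the Unix epoch.
    Assumption: the timestamp is interpreted as UTC."""
    parts = text.split()
    day, month, year = int(parts[0]), MONTHS[parts[1].lower()], int(parts[2])
    hour = minute = second = 0
    if len(parts) > 3:
        hour, minute, second = (int(p) for p in parts[3].split(":"))
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return moment.timestamp()

def rewrite_where(where):
    """Replace every date literal in a WHERE string by its value in seconds."""
    return DATE_RE.sub(lambda m: repr(to_seconds(m.group(0))), where)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE maps (name TEXT, startdate REAL)")
conn.executemany(
    "INSERT INTO maps VALUES (?, ?)",
    [(name, to_seconds(date)) for name, date in [
        ("old_map", "1 jan 1985"),
        ("mid_map", "21 jan 2004 14:35:21"),
        ("new_map", "1 mar 2006"),
    ]],
)

where = rewrite_where("startdate < 20 jan 2006 AND startdate > 14 sep 1990")
names = [row[0] for row in conn.execute("SELECT name FROM maps WHERE " + where)]
print(where)
print(names)
```

Pasting the rewritten string into the SQL text is acceptable for a sketch, but a real module would have to validate the user-supplied WHERE string before sending it to the database.
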
text based databases
  • another way is to implement this functionality with text tables
  • the time stamps are managed in a double linked list
  • the query functionality is not as powerful as that of a postgresql database
  • only simple temporal queries are possible

text database example:

temporal_gis
|-- group
|   `-- Landsat_2006_2010
|       |-- head
|       |-- hist
|       |-- list
|       |-- range
|       |-- time_wind
|       `-- transaction_log
|-- raster
|   |-- landuse_2006_2010
|   |   |-- head
|   |   |-- hist
|   |   |-- list
|   |   |-- range
|   |   |-- time_wind
|   |   `-- transaction_log
|   `-- soils_2006_2010
|       |-- head
|       |-- hist
|       |-- list
|       |-- range
|       |-- time_wind
|       `-- transaction_log
|-- raster3d
|   `-- phdist_2006_2010
|       |-- head
|       |-- hist
|       |-- list
|       |-- range
|       |-- time_wind
|       `-- transaction_log
`-- vector
    `-- observ_wells_2006_2010
        |-- head
        |-- hist
        |-- list
        |-- range
        |-- time_wind
        `-- transaction_log
timestamps in temporal database
  • the following timestamps should be created in the database
    • a valid timestamp with starttime or start- and endtime
    • creation time
    • modification time
    • if the maps have a valid start and endtime the event duration should be calculated by the database automatically
temporal management tools
  • the time is a new datatype in grass
  • therefore the temporal management should be done by specific commands starting with t.*
  • tools for queries, register, unregister and extraction of maps from the temporal database have to be implemented, like t.info, t.register, t.unregister and t.extract ...

Example t.register:

GRASS 6.3.cvs > t.register help

Description:
 Register groups, raster, raster3d and vector maps into the temporal database

Keywords:
 temporal, time

Usage:
 t.register [-s] [tgroup=string[,string,...]]
   [trast=string[,string,...]] [trast3d=string[,string,...]]
   [tvect=string[,string,...]] [group=string[,string,...]]
   [rast=string[,string,...]] [rast3d=string[,string,...]]
   [vect=string[,string,...]] [date=timestamp[,timestamp,...]]
   [timestep=timestep[,timestep,...]]

Flags:
  -s   Use the timestep between new maps. 

Parameters:
    tgroup   Temporal group database(s) in which the group(s) should be registered
     trast   Temporal raster database(s) in which the raster(s) should be registered
   trast3d   Temporal raster3d database(s) in which the raster3d(s) should be registered
     tvect   Temporal vector database(s) in which the vector(s) should be registered
     group   Group(s) to be registered in the temporal group database
      rast   Raster map(s) to be registered in the temporal raster database
    rast3d   Raster3d map(s) to be registered in the temporal raster3d database
      vect   Vector map(s) to be registered in the temporal vector database
      date   datetime, datetime1/datetime2 for map(s)
  timestep   timestep between the maps 

example

register 3 raster maps beginning from 20 jan 2001 with a timestep of 3 months in temporal database table Landsat

t.register -s trast=Landsat rast=LandsatJan,LandsatApr,LandsatJul date="20 jan 2001" timestep="3 mon"

Table:
 1 LandsatJan 20 jan 2001 ... 
 2 LandsatApr 20 apr 2001 ...
 3 LandsatJul 20 jul 2001 ...
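
How the -s flag could derive the dates in the table above is sketched below in Python. The function names are hypothetical, and one design choice is an assumption not covered by the proposal: when the start day does not exist in a target month (for example the 31st), the date is clamped to the last day of that month.

```python
import calendar
from datetime import date

def add_months(start, months):
    """Return 'start' shifted by a whole number of months.
    Assumption: a day that does not exist in the target month
    is clamped to that month's last day."""
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def register_with_timestep(map_names, start, step_months):
    """Assign start, start + step, start + 2 * step, ... to the maps in order.
    Each date is computed from 'start' so clamping cannot accumulate."""
    return [
        (number, name, add_months(start, (number - 1) * step_months))
        for number, name in enumerate(map_names, start=1)
    ]

rows = register_with_timestep(
    ["LandsatJan", "LandsatApr", "LandsatJul"], date(2001, 1, 20), 3
)
for number, name, stamp in rows:
    print(number, name, stamp.isoformat())
```
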
OO Layer
  • it should be possible to have data access with spatial-temporal functions:
    • value = g4dDataObject->Get4DValue(x, y, z, timestamp) -- for volume maps
    • value = g4dDataObject->Get4DValue(x, y, timestamp) -- for raster maps
  • based on this functionality a 4d animation tool based on VTK should be implemented
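
To make the intended access pattern concrete, here is a small Python sketch of a raster variant of Get4DValue. Everything in it is an assumption for illustration: the class name RasterTimeSeries, the use of integer column/row indices for x and y, in-memory nested lists as map data, and the rule that a query returns the value of the most recent map registered at or before the requested time.

```python
from bisect import bisect_right
from datetime import date

class RasterTimeSeries:
    """Hypothetical spatio-temporal accessor over a set of registered maps."""

    def __init__(self):
        self._times = []   # sorted start times
        self._grids = []   # grids in the same order as self._times

    def register(self, timestamp, grid):
        """Insert a map so that the internal lists stay sorted by time."""
        pos = bisect_right(self._times, timestamp)
        self._times.insert(pos, timestamp)
        self._grids.insert(pos, grid)

    def get_value(self, x, y, timestamp):
        """Value at column x, row y of the latest map starting at or
        before 'timestamp'."""
        pos = bisect_right(self._times, timestamp) - 1
        if pos < 0:
            raise LookupError("no map registered at or before this time")
        return self._grids[pos][y][x]

series = RasterTimeSeries()
series.register(date(2001, 4, 20), [[5, 6], [7, 8]])
series.register(date(2001, 1, 20), [[1, 2], [3, 4]])
print(series.get_value(1, 0, date(2001, 3, 1)))
print(series.get_value(0, 1, date(2001, 4, 20)))
```

A volume variant would take an additional z index; interpolation between neighbouring time steps is left out of this sketch.
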

Jachym's notes

On IRC we discussed that time series would be stored in a database (PostgreSQL). If the date format were like YYYY-MM-DD-HH-MM-SS, the "database" could be a text file, which would be sortable via standard 'sort', and g.* modules would not need to speak SQL - KISS, works even on a toaster.
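
The reason this format works is that zero-padded YYYY-MM-DD-HH-MM-SS strings sort lexicographically in chronological order, so a plain sort plus a binary search is enough for simple range queries. A minimal Python sketch, assuming a hypothetical file layout of one "timestamp map_name" pair per line and made-up map names:

```python
from bisect import bisect_left, bisect_right

def parse_line(line):
    """Split a 'timestamp map_name' line into its two fields."""
    stamp, map_name = line.split(maxsplit=1)
    return stamp, map_name.strip()

def maps_between(sorted_lines, start, end):
    """Map names whose timestamp lies in [start, end].
    'sorted_lines' must already be sorted, e.g. by the standard 'sort' tool."""
    stamps = [parse_line(line)[0] for line in sorted_lines]
    low = bisect_left(stamps, start)
    high = bisect_right(stamps, end)
    return [parse_line(line)[1] for line in sorted_lines[low:high]]

lines = [
    "2006-07-20-00-00-00 landsat_jul",
    "2001-01-20-00-00-00 landsat_jan",
    "2001-04-20-12-30-00 landsat_apr",
]
lines.sort()  # plain string sort, no date parsing needed
print(maps_between(lines, "2001-02-01-00-00-00", "2006-12-31-23-59-59"))
```
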

g.list rast should print all raster maps. A raster map belonging to a time series does not stop being a raster map. A new data type "times" has to be created, so that modules like g.list, g.remove ... can handle it.

If "times" is going to be a new GRASS data type, a new t.* group of commands has to be introduced, to have equivalent commands for r.what, d.rast, r.schmeiß.mich.tot, ...

Open Issues

  • will 'g.list rast' show also rasters belonging to time series?
  • how to deal with a huge number of files in a folder? (a very long series often involves a huge number of maps)
    • limitation: how many rasters can a folder contain ('nofile' in /etc/security/limits.conf).
    • efficiency: a huge number of files will tremendously slow down every map listing procedure.
  • associated color table: only one color table should serve one series (avoid multiple color table for each map)