Time series development: Difference between revisions
⚠️Wenzeslaus (talk | contribs) (add Category:Historic) |
Latest revision as of 15:08, 28 January 2016
Historical document, please keep it.
See now Temporal data processing
Many ideas may be obsolete or implemented by now. This is mostly a historical document, although that does not apply to the Open Issues section.
- see also the Time series user-help wiki page
About
This page was born out of discussions in Lausanne. A lot of people seem to be interested in having a standardized, documented format for dealing with time series in GRASS, so that we can handle data (e.g. climate stations, water gauges), link with models, etc., without having to come up with a custom solution each time. This could also lead to modules that help with interpolation, dealing with missing values, etc.
There was also discussion of possibly setting up a mailing list. Do we want this?
- I would vote against just-another-mailing list Neteler 23:51, 18 September 2006 (CEST)
Ideas from MN
- each imported raster map automatically gets "registered" in an SQL table
  - Table structure:
    - map name
    - map creator (optional)
    - time stamp of import
    - time stamp of map production (optional)
    - time range of validity (optional)
- g.list, g.rename etc. tools have a new "where" parameter to search maps in this table
- g.remove will also remove the row from the SQL table
- SQL commands can be used for searching by time stamps
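The registration table described above could be sketched roughly as follows, using Python's built-in sqlite3 module. The table name `raster_register`, the column names and the helper functions are assumptions made for illustration only; none of them is an existing GRASS interface.

```python
import sqlite3


def open_register(path=":memory:"):
    """Create the (hypothetical) registration table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS raster_register (
               map_name      TEXT PRIMARY KEY,
               creator       TEXT,           -- optional
               import_time   TEXT NOT NULL,  -- ISO 8601 string
               produced_time TEXT,           -- optional
               valid_from    TEXT,           -- optional
               valid_to      TEXT            -- optional
           )"""
    )
    return conn


def register_map(conn, name, import_time, creator=None,
                 produced_time=None, valid_from=None, valid_to=None):
    """What an import module would call after writing a raster map."""
    conn.execute(
        "INSERT INTO raster_register VALUES (?, ?, ?, ?, ?, ?)",
        (name, creator, import_time, produced_time, valid_from, valid_to),
    )


def list_maps(conn, where=None):
    """Rough equivalent of a g.list with a 'where' parameter.

    'where' is passed through as raw SQL, so it must come from a trusted user.
    """
    sql = "SELECT map_name FROM raster_register"
    if where:
        sql += " WHERE " + where
    sql += " ORDER BY map_name"
    return [row[0] for row in conn.execute(sql)]


def remove_map(conn, name):
    """What g.remove would do in addition to deleting the map itself."""
    conn.execute("DELETE FROM raster_register WHERE map_name = ?", (name,))


conn = open_register()
register_map(conn, "temp_2001", "2006-09-18T10:00:00",
             produced_time="2001-01-20T00:00:00")
register_map(conn, "temp_2005", "2006-09-18T10:05:00",
             produced_time="2005-06-01T00:00:00")
found = list_maps(conn, "produced_time < '2003-01-01'")
```

Storing time stamps as ISO 8601 strings keeps them comparable with plain string operators, which is why the `where` condition in the last line works without any date functions.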
maxi's proposal
- what about a table with time series (a sort of header for each time series) and another table for each time series with a list of associated file names and related timestamps?
  - time series head table structure:
    - series name
    - creator name
    - timestamp of creation
    - range of validity
    - time step
    - ....
  - time series associated maps (table named like the related time series):
    - timestamp
    - map name
    - timestamp of import
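A minimal sketch of this two-level layout, again with Python's sqlite3 module. The table name `series_head`, the column names and all helper functions are hypothetical; only the split into one head table plus one table per series comes from the proposal.

```python
import sqlite3


def quote_ident(name):
    """Quote a series name so it can be used as an SQL table name."""
    return '"' + name.replace('"', '""') + '"'


def create_series(conn, name, creator, created, valid_from, valid_to, time_step):
    """Add a row to the head table and create the per-series map table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS series_head (
               series_name TEXT PRIMARY KEY,
               creator     TEXT,
               created     TEXT,
               valid_from  TEXT,
               valid_to    TEXT,
               time_step   TEXT
           )"""
    )
    conn.execute(
        "INSERT INTO series_head VALUES (?, ?, ?, ?, ?, ?)",
        (name, creator, created, valid_from, valid_to, time_step),
    )
    # One table per series, named like the series itself.
    conn.execute(
        "CREATE TABLE {} (timestamp TEXT NOT NULL, map_name TEXT NOT NULL, "
        "import_time TEXT)".format(quote_ident(name))
    )


def add_map(conn, series, timestamp, map_name, import_time):
    conn.execute(
        "INSERT INTO {} VALUES (?, ?, ?)".format(quote_ident(series)),
        (timestamp, map_name, import_time),
    )


def maps_in_series(conn, series):
    """Return the map names of one series in temporal order."""
    cur = conn.execute(
        "SELECT map_name FROM {} ORDER BY timestamp".format(quote_ident(series))
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
create_series(conn, "precip_monthly", "maxi", "2006-09-19",
              "2001-01-01", "2001-12-31", "1 month")
add_map(conn, "precip_monthly", "2001-02-01", "precip_feb", "2006-09-19")
add_map(conn, "precip_monthly", "2001-01-01", "precip_jan", "2006-09-19")
ordered = maps_in_series(conn, "precip_monthly")
```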
Soeren's ideas
A concrete implementation idea of a temporal extension for GRASS GIS 7 is located here:
Temporal extension for GRASS GIS 7
a temporal gis extension for grass
The temporal extension should not only help to manage temporal data; the data should also be easy to analyse and visualize in time.
temporal gis implementation in grass
- a separate library in grass (dir temporal_gis or tgis)
- should manage raster, raster3d, vector maps and groups of raster maps (imagery groups)
- the existing timestamps should be used for registration in a temporal database
- maps can be registered and unregistered in the temporal database
- it should be possible to create several temporal tables in the database
- those tables are a new datatype in grass
- the tables should be used for thematic time series
- maps and groups may be registered in different tables
- it should be very easy to register and unregister thousands of maps in the database
- tables will have a description, history, transaction log, creation time and modification time
- statistical information about the maps (ranges) should be stored in the database
- if the raster map changes the table should be easily updatable (automatic update implemented in the libs -> if map is closed update the temporal table)
- the temporal database should allow temporal queries for one, several or all tables in the database
- if an SQL database is used for storing the map registrations, the full functionality of the grass-db-lib will be used
- grass temporal gis library functions should begin with a T_, example:
T_register_rast_in_table(mapset, name, table)
T_unregister_rast_in_table(mapset, name, table)
T_query_table(table, sort, sqlstring)
...
changes to existing libs/modules
- timestamps support for groups must be added
- r.timestamp, r3.timestamp and v.timestamp should be implemented in g.timestamp with group support
- support for temporal region (starttime, endtime, timesteps) in g.region
- g.remove, g.rename and g.copy will need to have access to the temporal database
- if a map that is registered in temporal tables is removed, its table entries should also be removed
- every map keeps the info in which temporal tables it is registered (additional file or info in header)
- the timestamp should be automatically created if a map is created
- g.list should list all temporal databases, not the registered maps (t.info should do this)
- if automatic temporal table updates are requested, all the file I/O libs of raster, raster3d, vector maps and groups have to be extended
temporal database concepts
SQL databases with a time datatype
- the temporal database in which the maps/groups are registered should provide temporal queries
  - select all maps between 20 jan 2002 and 13 feb 2004
  - select all maps with a valid range of 3 months
  - select all maps with date < 13 sep 1999
  - select all maps with a valid range of 10 years and a value range between [20:45]
- PostgreSQL is one Open Source database which supports such queries and suitable temporal datatypes (http://www.postgresql.org/files/documentation/books/pghandbuch/html/datatype-datetime.html and http://www.postgresql.org/files/documentation/books/pghandbuch/html/functions-datetime.html)
- since version 3.0.x SQLite also supports temporal queries through its date and time functions (http://www.sqlite.org/cvstrac/wiki?p=DateAndTimeFunctions)
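The four example queries listed above can be tried out with SQLite's date functions from Python. This is only a sketch: the table `maps`, its columns and the sample rows are invented for illustration, and "a valid range of N months" is interpreted here as "end time equals start time plus N months".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical registration table; dates are stored as ISO 8601 strings.
conn.execute(
    "CREATE TABLE maps (name TEXT, start_time TEXT, end_time TEXT, "
    "min_value REAL, max_value REAL)"
)
conn.executemany(
    "INSERT INTO maps VALUES (?, ?, ?, ?, ?)",
    [
        ("landuse_1998", "1998-05-01", "1998-08-01", 0, 10),
        ("landuse_2002", "2002-06-01", "2002-09-01", 25, 40),
        ("climate_2003", "2003-01-01", "2013-01-01", 20, 45),
        ("snow_2005", "2005-01-01", "2005-02-01", 5, 50),
    ],
)


def select_names(where, params=()):
    """Run one temporal query and return the matching map names."""
    sql = "SELECT name FROM maps WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]


# select all maps between 20 jan 2002 and 13 feb 2004
between = select_names("start_time BETWEEN ? AND ?", ("2002-01-20", "2004-02-13"))

# select all maps with a valid range of 3 months
three_months = select_names("date(start_time, '+3 months') = end_time")

# select all maps with date < 13 sep 1999
before = select_names("start_time < ?", ("1999-09-13",))

# select all maps with a valid range of 10 years and values within [20:45]
ten_years = select_names(
    "date(start_time, '+10 years') = end_time "
    "AND min_value >= 20 AND max_value <= 45"
)
```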
SQL databases without a time datatype
- SQL databases which are able to handle double data types can be used
  - if the timestamps of the maps are converted into a double data type (seconds) before being sent to the database
    - 21 jan 2004 14:35:21 = 7897456389 seconds
  - the conversion has to be done automatically by the grass lib
  - temporal queries are partly available with statements like " select * from table where dateLarger(date, '20 jan 2006')"
  - complex WHERE statements may not be available because the dates have to be converted into seconds
e.g.: where="startdate < 20 jan 2006 AND startdate > 14 sep 1990 AND daterange < 4 years 3 mon" will be hard to implement, because the dates have to be parsed and each replaced by one number (seconds)
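A sketch of the conversion and of the WHERE rewriting, under the assumption that "seconds" means seconds since 1970-01-01 00:00:00 UTC (the page does not name a reference epoch) and that all dates are given in UTC. The function names are hypothetical. Only absolute dates are replaced; a duration such as "4 years 3 mon" has no fixed length in seconds, which is exactly the part that is hard to implement.

```python
import calendar
import re

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Matches dates such as "20 jan 2006" or "21 jan 2004 14:35:21".
DATE_RE = re.compile(
    r"\b(\d{1,2}) ([A-Za-z]{3}) (\d{4})(?: (\d{2}):(\d{2}):(\d{2}))?")


def _seconds_from_match(match):
    day, month, year, hour, minute, second = match.groups()
    fields = (int(year), MONTHS[month.lower()], int(day),
              int(hour or 0), int(minute or 0), int(second or 0), 0, 0, 0)
    # calendar.timegm interprets the tuple as UTC (assumption, see above).
    return calendar.timegm(fields)


def to_seconds(text):
    """Convert one GRASS-style date string into seconds since the Unix epoch."""
    match = DATE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("unsupported date format: %r" % text)
    return _seconds_from_match(match)


def rewrite_where(where):
    """Replace every absolute date in a WHERE clause by its number of seconds."""
    return DATE_RE.sub(lambda m: str(_seconds_from_match(m)), where)


seconds = to_seconds("21 jan 2004 14:35:21")
clause = rewrite_where("startdate < 20 jan 2006 AND startdate > 14 sep 1990")
```

The integers returned here fit exactly into a double column, so the database only ever sees plain numeric comparisons.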
text based databases
- another way is to implement this functionality with text tables
- the time stamps are managed in a doubly linked list
- the query functionality is not as powerful as that of a PostgreSQL database
- only simple temporal queries are possible
text database example:
temporal_gis
|-- group
|   `-- Landsat_2006_2010
|       |-- head
|       |-- hist
|       |-- list
|       |-- range
|       |-- time_wind
|       `-- transaction_log
|-- raster
|   |-- landuse_2006_2010
|   |   |-- head
|   |   |-- hist
|   |   |-- list
|   |   |-- range
|   |   |-- time_wind
|   |   `-- transaction_log
|   `-- soils_2006_2010
|       |-- head
|       |-- hist
|       |-- list
|       |-- range
|       |-- time_wind
|       `-- transaction_log
|-- raster3d
|   `-- phdist_2006_2010
|       |-- head
|       |-- hist
|       |-- list
|       |-- range
|       |-- time_wind
|       `-- transaction_log
`-- vector
    `-- observ_wells_2006_2010
        |-- head
        |-- hist
        |-- list
        |-- range
        |-- time_wind
        `-- transaction_log
timestamps in temporal database
- the following timestamps should be created in the database
- a valid timestamp with a start time, or a start and end time
- creation time
- modification time
- if the maps have a valid start and endtime the event duration should be calculated by the database automatically
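The automatic duration calculation amounts to a simple subtraction once both times are available. A small sketch, assuming the start and end times have already been parsed into Python `datetime` objects; the function name is hypothetical.

```python
from datetime import datetime


def event_duration(start, end=None):
    """Return the duration of a map's valid time as a timedelta.

    Maps that only have a start time are treated as time instants and
    yield None instead of a duration.
    """
    if end is None:
        return None
    if end < start:
        raise ValueError("end time lies before start time")
    return end - start


duration = event_duration(datetime(2001, 1, 20), datetime(2001, 4, 20))
instant = event_duration(datetime(2001, 1, 20))
```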
temporal management tools
- the time is a new datatype in grass
- therefore the temporal management should be done by specific commands starting with t.*
- tools for querying, registering, unregistering and extracting maps from the temporal database have to be implemented, like t.info, t.register, t.unregister and t.extract ...
Example t.register:
GRASS 6.3.cvs > t.register help

Description:
 Register groups, raster, raster3d and vector maps into the temporal database

Keywords:
 temporal, time

Usage:
 t.register [-s] [tgroup=string[,string,...]]
   [trast=string[,string,...]] [trast3d=string[,string,...]]
   [tvect=string[,string,...]] [group=string[,string,...]]
   [rast=string[,string,...]] [rast3d=string[,string,...]]
   [vect=string[,string,...]] [date=timestamp[,timestamp,...]]
   [timestep=timestep[,timestep,...]]

Flags:
  -s   Use the timestep between new maps.

Parameters:
    tgroup   Temporal group database(s) in which the group(s) should be registered
     trast   Temporal raster database(s) in which the raster(s) should be registered
   trast3d   Temporal raster3d database(s) in which the raster3d(s) should be registered
     tvect   Temporal vector database(s) in which the vector(s) should be registered
     group   Group(s) to be registered in the temporal group database
      rast   Raster map(s) to be registered in the temporal raster database
    rast3d   Raster3d map(s) to be registered in the temporal raster3d database
      vect   Vector map(s) to be registered in the temporal vector database
      date   datetime, datetime1/datetime2 for map(s)
  timestep   timestep between the maps
example
register 3 raster maps, beginning from 20 jan 2001 with a timestep of 3 months, in the temporal database table Landsat

t.register -s trast=Landsat rast=LandsatJan,LandsatApr,LandsatJul date="20 jan 2001" timestep="3 mon"

Table:
1 LandsatJan 20 jan 2001 ...
2 LandsatApr 20 apr 2001 ...
3 LandsatJul 20 jul 2001 ...
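How the -s flag could derive one date per map from a start date and a monthly timestep is sketched below. This is not the t.register implementation; `add_months` and `assign_dates` are hypothetical helpers, and the clamping of the day to the end of shorter months is an assumption about the desired behaviour.

```python
import calendar
from datetime import date


def add_months(day, months):
    """Shift a date by whole calendar months, clamping to the month's end."""
    index = day.year * 12 + (day.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def assign_dates(map_names, start, step_months):
    """Return (row id, map name, date) tuples, one timestep apart."""
    return [
        (number, name, add_months(start, (number - 1) * step_months))
        for number, name in enumerate(map_names, start=1)
    ]


rows = assign_dates(["LandsatJan", "LandsatApr", "LandsatJul"],
                    date(2001, 1, 20), 3)
```

With the inputs from the example above, `rows` holds the same three map/date pairs as the table.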
OO Layer
- it should be possible to have data access with spatial-temporal functions:
- value = g4dDataObject->Get4DValue(x, y, z, timestamp) -- for volume maps
- value = g4dDataObject->Get4DValue(x, y, timestamp) -- for raster maps
- based on this functionality a 4d animation tool based on VTK should be implemented
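The raster variant of this access function could look roughly like the sketch below. It is written in Python rather than as the C++-style `Get4DValue` call above, uses cell indices instead of map coordinates, and assumes that each map is valid from its own timestamp until the next map starts. The class and method names are hypothetical.

```python
import bisect


class TimeSeries4D:
    """Space-time lookup over a list of (timestamp, 2D grid) layers."""

    def __init__(self, layers):
        # Keep the layers sorted by time so that bisect can be used.
        self._layers = sorted(layers, key=lambda layer: layer[0])
        self._times = [timestamp for timestamp, _ in self._layers]

    def get_value(self, col, row, timestamp):
        """Return the cell value of the map that is valid at 'timestamp'."""
        index = bisect.bisect_right(self._times, timestamp) - 1
        if index < 0:
            raise LookupError("timestamp lies before the first map")
        grid = self._layers[index][1]
        return grid[row][col]
```

A short usage example with two tiny 2x2 rasters:

```python
from datetime import datetime

series = TimeSeries4D([
    (datetime(2001, 4, 20), [[3, 4], [5, 6]]),
    (datetime(2001, 1, 20), [[1, 2], [3, 4]]),
])
value = series.get_value(1, 0, datetime(2001, 3, 1))  # taken from the January map
```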
huhabla 00:19, 26 September 2006 (CEST)
Implemented Prototype: r.rast4d
http://trac.osgeo.org/grass/browser/grass-addons/raster/r.rast4d
Jachym's notes
On IRC, we discussed that time series would be stored in a database (PostgreSQL). If the date format were like YYYY-MM-DD-HH-MM-SS, the "database" could be a plain text file, which would be sortable via the standard 'sort', and g.* modules would not need to speak SQL: KISS, works even on a toaster.
g.list rast should print all raster maps. A raster map belonging to a time series does not stop being a raster map. A new data type "times" has to be created, so that modules like g.list, g.remove ... can handle it.
If "times" is going to be a new GRASS data type, a new t.* group of commands has to be introduced, to have equivalent commands for r.what, d.rast, r.schmeiß.mich.tot, ...
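The text file idea works because YYYY-MM-DD-HH-MM-SS stamps sort chronologically when sorted as plain strings. A small sketch, assuming one "timestamp mapname" record per line; this line layout and the function names are assumptions, not part of the discussion.

```python
import bisect


def stamp(moment):
    """Format a datetime as a sortable YYYY-MM-DD-HH-MM-SS string."""
    return moment.strftime("%Y-%m-%d-%H-%M-%S")


def maps_between(sorted_lines, start, end):
    """Return the map names whose stamp lies in [start, end].

    'sorted_lines' must already be sorted, e.g. by the standard 'sort' tool.
    """
    stamps = [line.split(None, 1)[0] for line in sorted_lines]
    low = bisect.bisect_left(stamps, start)
    high = bisect.bisect_right(stamps, end)
    return [sorted_lines[i].split(None, 1)[1].strip() for i in range(low, high)]


records = sorted([
    "2001-04-20-00-00-00 LandsatApr",
    "2001-01-20-00-00-00 LandsatJan",
    "2001-07-20-00-00-00 LandsatJul",
])
selected = maps_between(records, "2001-02-01-00-00-00", "2001-12-31-23-59-59")
```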
Alessandro Frigeri's notes
Scale/resolution issues:
- Implementation of absolute and relative time scales (e.g. numerical modeling is likely to require times relative to the start of the simulation, while remotely sensed data is commonly referenced to UTC).
- Implementation of Units Of Measure conversions for time (see the specific issue on Units Of Measure implementation), so that we can analyze time series in their most natural unit and resolution (geologic events are measured in mega- or gigayears, while surface temperature variations are on the order of magnitude of hours, or kiloseconds)
Open Issues
- will 'g.list rast' also show rasters belonging to time series?
- how to deal with a huge number of files in a folder? (a very long series often consists of a huge number of maps)
- limitation: how many rasters can a folder contain ('nofile' in /etc/security/limits.conf).
- efficiency: a huge number of files will tremendously slow down every map listing procedure.
- associated color table: only one color table should serve one series (avoid a separate color table for each map)
- linear regression in r.series should be modified to support irregular time intervals in time series (will that be a t.series ?)
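For a single cell, a regression that respects irregular intervals only needs the real time of each map instead of its position in the series. The sketch below is an ordinary least-squares fit in plain Python; it is not the r.series code, and the function name is hypothetical.

```python
def linear_trend(times, values):
    """Fit value = slope * time + intercept by ordinary least squares.

    'times' holds the real time of each map (e.g. days since the first map),
    so unevenly spaced maps are weighted by their true position in time.
    """
    count = len(times)
    if count < 2 or count != len(values):
        raise ValueError("need at least two (time, value) pairs")
    time_mean = sum(times) / count
    value_mean = sum(values) / count
    spread = sum((t - time_mean) ** 2 for t in times)
    if spread == 0:
        raise ValueError("all time stamps are identical")
    covariance = sum((t - time_mean) * (v - value_mean)
                     for t, v in zip(times, values))
    slope = covariance / spread
    return slope, value_mean - slope * time_mean


# Maps taken at days 0, 1, 4 and 10; the cell value grows by 2 per day.
slope, intercept = linear_trend([0.0, 1.0, 4.0, 10.0], [3.0, 5.0, 11.0, 23.0])
```

Fitting the same values against the map index (0, 1, 2, 3) would give a different slope, which is the distortion this open issue is about.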