GRASS GIS Performance

From GRASS-Wiki
Revision as of 03:48, 24 August 2012 by Peter.loewe (talk | contribs)


GRASS GIS is noted for being ready for massive data analysis. This page contains an as yet incomplete collection of performance indicators.

Architecture

GRASS GIS is fully 32-bit and 64-bit compliant. See also the Software requirements specification.

Number of opened input files

The number of input files that can be open simultaneously is limited only by the operating system. A common default limit is 1024 files per process. On operating systems such as Linux, this limit can be raised with the "ulimit" settings.
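
The limit can also be inspected and raised from within a process. The following is a minimal sketch in Python, assuming a Unix-like system where the standard "resource" module is available; the function name raise_open_file_limit is chosen here for illustration and is not part of GRASS GIS. It corresponds roughly to what "ulimit -n" does in a shell.

```python
import resource


def raise_open_file_limit(wanted):
    """Try to raise the soft limit on open files to `wanted`.

    Returns the soft limit in effect afterwards. The limit is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An unprivileged process cannot push the soft limit above the hard limit.
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)

    # Nothing to do if the current soft limit is already high enough.
    if soft == resource.RLIM_INFINITY or wanted <= soft:
        return soft

    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Calling raise_open_file_limit(4096) before starting an analysis that opens many maps at once would request 4096 descriptors, capped at the hard limit configured by the system administrator.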

See also

Memory management

Due to the modular architecture of GRASS GIS, the memory overhead is minimal.

See also

Vector management

Maximum Number of Columns

The maximum number of attribute columns for a vector is defined by the settings of the selected database backend (db.connect).

DBF-Backend: GRASS 4.x - 6.x use the DBF backend by default. While there is no explicitly stated maximum number of attributes, trials with GRASS 6.4.2 in 2012 resulted in program failure when more than 2000 attributes were used. Export to a DBF-based ESRI Shapefile gives a warning if more than 255 attributes are used, because other software tools may ignore all further attributes.


SQLite-Backend: GRASS 7.x uses the SQLite backend by default. The default maximum number of attribute columns is 2000. This number can be increased by compiling SQLite with a larger SQLITE_MAX_COLUMN value (up to 32767).


MySQL-Backend: MySQL has a hard limit of 4096 columns per table; the effective maximum may be lower, depending on the storage engine and the row size.


PostgreSQL-Backend: tbd


Oracle: tbd
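
For the SQLite backend, the column limit of a given installation can be checked empirically. The sketch below uses Python's standard sqlite3 module and therefore reports the limit of the SQLite library that Python is linked against, which is only an assumption about the library GRASS itself uses. The helper names can_create_table and max_columns are hypothetical.

```python
import sqlite3


def can_create_table(n_columns):
    """Return True if a table with n_columns columns can be created."""
    names = ", ".join(f"c{i}" for i in range(n_columns))
    con = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        con.execute(f"CREATE TABLE probe ({names})")
        return True
    except sqlite3.Error:
        # Any SQLite error (e.g. "too many columns") counts as a failure.
        return False
    finally:
        con.close()


def max_columns(upper=32767):
    """Binary search for the largest column count that SQLite accepts.

    32767 is the largest value SQLITE_MAX_COLUMN may be compiled with.
    """
    lo, hi = 1, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if can_create_table(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

On a library built with default settings, max_columns() is expected to match the default of 2000 stated above; a different result indicates a build with a changed SQLITE_MAX_COLUMN.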

Large file support

Large raster data processing

GRASS GIS 7 supports large files through the off_t type for file offsets, hence it can address an enormous amount of raster data.
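
To illustrate why the width of off_t matters: a signed 32-bit file offset can address at most 2^31 - 1 bytes (just under 2 GiB), while a signed 64-bit offset can address 2^63 - 1 bytes. The following sketch estimates the uncompressed size of a raster and checks it against the 32-bit bound. It is a back-of-the-envelope calculation only; it assumes uncompressed storage, whereas GRASS raster maps are normally compressed, and the function names are illustrative.

```python
INT32_MAX = 2**31 - 1  # largest offset with a signed 32-bit off_t
INT64_MAX = 2**63 - 1  # largest offset with a signed 64-bit off_t


def raster_bytes(rows, cols, bytes_per_cell):
    """Uncompressed size in bytes of a raster with the given dimensions."""
    return rows * cols * bytes_per_cell


def needs_large_file_support(rows, cols, bytes_per_cell):
    """True if the uncompressed raster cannot be addressed with 32-bit offsets."""
    return raster_bytes(rows, cols, bytes_per_cell) > INT32_MAX


# Example: a 50000 x 50000 raster with 4-byte cells is 10 GB uncompressed.
size = raster_bytes(50000, 50000, 4)
print(size, needs_large_file_support(50000, 50000, 4))
```

A 20000 x 20000 raster with 4-byte cells (1.6 GB) still fits below the 32-bit bound, whereas the 50000 x 50000 example does not.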

See also

Some benchmarks:

  • Import of the ECAD 6.0 Tmean dataset (22650 layers in a single netCDF file): the import takes 300 seconds while reading the file via NFS (i.e. about 75 maps per second)

Large vector data processing

GRASS GIS 7 supports large files through the off_t type for file offsets, hence it can address an enormous amount of vector data. To date, several billion vector points have been managed (citation) without topology (since it was not needed). In all GRASS versions, the limit with topology is currently 2^31 - 1 (about 2.1 billion) features.

Parallelization

In GRASS 7, a few modules have been parallelized with OpenMP. In addition, if the data can be processed in independent chunks, GRASS GIS can be run on clusters.
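
The chunk-based approach can be sketched generically as follows. This is an illustration under stated assumptions, not a GRASS API: split_into_chunks, process_chunk and run_chunks are hypothetical names, and process_chunk is a placeholder where a real setup would start a separate GRASS session or submit a cluster job for each chunk. Threads are used here only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous chunks of near-equal size."""
    items = list(items)
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks receive one additional item.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Hypothetical placeholder for the real work on one chunk.

    Here it only counts the items; in practice it would launch a GRASS job.
    """
    return len(chunk)


def run_chunks(chunks, worker=process_chunk, max_workers=4):
    """Run the worker on every chunk concurrently; results keep chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, chunks))
```

For example, the 22650 layers of the benchmark above could be split with split_into_chunks(range(22650), 8) and handed to eight workers, provided that the layers can be processed independently of each other.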

See also: