GRASS raster semantics

From GRASS-Wiki
Revision as of 12:31, 7 February 2022 by Neteler (talk | contribs) (link to source code updated)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

A quick summary c/o Glynn Clements:

Raster map precision types

  • CELL DATA TYPE: a raster map from INTEGER type (4 bytes, whole numbers only).
    • In GRASS GIS, CELL is a 32 bit integer with a range from -2,147,483,647 to +2,147,483,647. The value -2,147,483,648 is reserved for NODATA.
  • FCELL DATA TYPE: a raster map from FLOAT type (4 bytes, 7-9 digits precision).
    • In GRASS GIS, FCELL is a 32 bit float (Float32) with a range from -3.4E38 to 3.4E38. However, the integer precision can be only ensured between -16777216 and 16777216. If your raster overpass this range we strongly suggest to use DCELL, as Float64 data type.
  • DCELL DATA TYPE: a raster map from DOUBLE type (8 bytes, 15-17 digits precision).
    • In GRASS GIS, DCELL is a 64 bit float (Float64) with a range from -1.79E308 to 1.79E308.
  • NULL: represents "no data" in raster maps, to be distinguished from 0 (zero) data value.

Aliases:

  • INTEGER MAP: see CELL DATA TYPE
  • FLOAT MAP: see FCELL DATA TYPE
  • DOUBLE MAP: see DCELL DATA TYPE

(reference in the GRASS GIS source code, around line 613)

Region Calculations

The "current region" or "computational region" is the actual setting of the region boundaries and the actual raster resolution. Note that the computational region isn't limited to raster data; it may also affect some vector operations.

The computational region's bounds describe a rectangle in two-dimensional space. For raster operations, this rectangle is subdivided into a grid of rectangular cells, so the region's bounds are aligned with the edges of the outermost cells.

For details, see Computational region.

Cell Locations

Cells are areas, not points, so they don't have locations. Their corners have locations, as do their centres.

A cell with array indices (i,j) (easting, northing) corresponds to the rectangle:

      { (x,y) : west + i * ewres <= x < west + (i+1) * ewres,
                north - (j+1) * nsres <= y < north - j * nsres }

whose centre is at:

      (west + (i+1/2) * ewres, north - (j+1/2) * nsres)

[Subject to wrapping of longitude values in lat/lon locations.]

Raster to Vector Conversions

IIRC, r.to.vect uses the midpoints of the cell's edges (i.e. one coordinate will be on a grid line, the other will be mid-way between grid lines).

Resampling

(see also Interpolation#Resampling_methods_and_interpolation_in_GRASS)

The built-in nearest-neighbour resampling of raster data calculates the centre of each region cell, and takes the value of the raster cell in which that point falls.

If the point falls exactly upon a grid line, the exact result will be determined by the direction of any rounding error.

[One consequence of this is that downsampling by a factor which is an even integer will always sample exactly on the boundary between cells, meaning that the result is ill-defined.]

r.resample uses the built-in resampling, so it should produce identical results.

r.resamp.interp method=nearest uses the same algorithm, but not the same code, so it may not produce identical results in cases which are decided by the rounding of floating-point numbers.

For method=bilinear and method=bicubic, the raster values are treated as samples at each raster cell's centre, defining a piecewise- continuous surface. The resulting raster values are obtained by sampling the surface at each region cell's centre.

As the algorithm only interpolates, and doesn't extrapolate, a margin of 0.5 (for bilinear) or 1.5 (for bicubic) cells is lost from the extent of the original raster. Any samples taken within this margin will be null.

AFAIK, r.resamp.rst behaves similarly, i.e. it computes a surface assuming that the values are samples at each raster cell's centre, and samples the surface at each region cell's centre.

For r.resamp.stats without -w, the value of each region cell is the chosen aggregate of the values from all of the raster cells whose centres fall within the bounds of the region cell.

With -w, the samples are weighted according to the proportion of the raster cell which falls within the bounds of the region cell, so the result is normally[1] unaffected by rounding error (a miniscule difference in the position of the boundary results in the addition or subtraction of a sample weighted by a miniscule factor).

[1] The min and max aggregates can't use weights, so -w has no effect for those.

General Rules

For the most part, the interpretation is the "obvious" one, given:

  1. Cells are areas rather than points.
  2. Operations which need a point (e.g. interpolation) use the cell's centre.

Handling of NULL values

General remarks

The NULL (no data) handling depends on the map precision type. See for details

r.mapcalc

Nearly all operators and functions return null if any of their arguments are null.

The main exceptions are:

  • "isnull(x)" returns 0 if x is non-null and 1 if x is null.
  • "if(1,a,b)" returns a even if b is null.
  • "if(0,a,b)" returns b even if a is null.
  • "eval(a,b,c,...,x)" returns x even if any or all of a,b,c,... are null.
  • "0 &&& x" and "x &&& 0" return 0 even if x is null.
  • "1 ||| x" and "x ||| 1" return 1 even if x is null.
  • "graph(x,x1,y1,...,xn,yn)" will return null if x is null or any of the x[i] are null, or if a "relevant" y[i] is null, but not where a y[i] which isn't used in the calculation is null.

[The &&& and ||| operators are a relatively recent addition; the older && and || operators follow the usual behaviour for nulls, i.e. they return null if either operand is null.]

The behaviour is quite intuitive if you view null as representing an unknown quantity. This explains why both "x == null()" and "x != null()" are always null, regardless of whether x is null or non-null.

null() is an unknown value, so it's unknown whether or not it's equal to any particular (known) value. If x is also null, then you have two unknown values, and it's unknown whether or not the two are equal.

In many respects, the behaviour is similar to that of NaN in floating-point arithmetic.

With the exception of the cases listed above, r.mapcalc doesn't try to detect "special" cases where the result can be deduced even if one of the operands is null. E.g. if x is null, "x * 0" is null rather than zero, "x == x" is null rather than 1, "x != x" is null rather than 0, etc.

If you have a "boolean" map where the values are either null or 1, you normally want to convert null to 0 before performing any other processing. You can either replace the nulls in an existing map using e.g. "r.null null=0", or adjust the tests within r.mapcalc

Arithmetic operations

r.mapcalc: Arithmetic operations on mixed types promote the lesser type to the greater type, where CELL < FCELL < DCELL:

         |  CELL FCELL DCELL
   ------+------------------
    CELL |  CELL FCELL DCELL
   FCELL | FCELL FCELL DCELL
   DCELL | DCELL DCELL DCELL

Developer Notes Rules

From a programming perspective, the functions:

      G_row_to_northing()
      G_col_to_easting()
      G_northing_to_row()
      G_easting_to_col()

all transform floating-point values.

Passing integer row or column indices to the first two functions will return the coordinates of the cell's top-left corner; for the centre coordinates, pass row+0.5 and/or col+0.5.

Similarly, the last two functions will typically return non-integral values; use floor() to discard the fractional part and obtain the row or column index of the cell within which the point lies.