Large raster data processing
Create a mosaic
Suppose that we have the complete ASTER GDEM world coverage (>22000 files) and want to build a mosaic of Europe and import it into GRASS. A nice reference on how to deal with ASTER GDEM is here. The first step is to select, among the >22000 files, those covering our area of interest (Europe). To do this, we can use gdaltindex to create an index of all the files:
gdaltindex an_index.shp *.tif
This command creates a polygon shapefile in which each polygon is the footprint of one raster, and the attribute table holds the name of the corresponding image file. After that, we just need to do a spatial selection on the shapefile and extract the filenames from the attributes of the selected polygons:
ogr2ogr -f CSV list.csv an_index.shp -spat xmin ymin xmax ymax
This will produce a CSV file, from which we can easily copy the list of files into a list.txt file.
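Copying the filenames by hand is tedious for a large selection, so the step can be scripted. The following is only a minimal sketch: csv_to_list is a hypothetical helper name, and it assumes that the CSV has a one-line header, that the filename is in the first column (gdaltindex names this field "location" by default), and that the filenames contain no commas.

```shell
# Hypothetical helper: print the tile filenames stored in the CSV written by ogr2ogr.
# Assumptions: one header line, filename in the first column, no commas in filenames.
csv_to_list() {
    # $1 = CSV file produced by ogr2ogr
    # Skip the header, keep the first column, drop quotes and carriage returns.
    tail -n +2 "$1" | cut -d, -f1 | tr -d '"\r'
}
```

It could then be used as: csv_to_list list.csv > list.txt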
Our area of interest has the following boundaries:
north: N71: ymax = 72.0001389
south: N34: ymin = 33.9998611
west: W011: xmin = -11.0001389
east: E042: xmax = 43.0001389
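With these boundaries, the spatial selection could be written out as follows. This is a sketch only: select_tiles is a hypothetical wrapper name, and the shell variables simply hold the four values given for the area of interest.

```shell
# Bounding box of the area of interest (values taken from the boundaries given in the text)
xmin=-11.0001389
ymin=33.9998611
xmax=43.0001389
ymax=72.0001389

# Hypothetical wrapper: write the attributes of all index polygons that
# intersect the bounding box to a CSV file.
select_tiles() {
    # $1 = tile index shapefile, $2 = output CSV file
    ogr2ogr -f CSV "$2" "$1" -spat "$xmin" "$ymin" "$xmax" "$ymax"
}
```

It would be called as: select_tiles an_index.shp list.csv. Since -spat keeps every footprint that intersects the box, tiles that only touch or slightly overlap its edge may also end up in the selection, so it is worth checking the resulting list.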
Now that we have a list of the files covering our area of interest, and since all our files use the same datum (WGS84), we can use gdalbuildvrt to create a virtual mosaic. This takes only a few seconds to run.
gdalbuildvrt -input_file_list list.txt mosaic.vrt
To import the mosaic into GRASS, we first need to create a WGS84 location, then:
r.external -r input=/path/to/mosaic.vrt output=mosaic
Number of files limitation: troubleshooting
Linux (like other operating systems) limits the number of files a process can have open at the same time; the default is typically 1024. As a result, you may encounter the following error message:
Too many open files
However, this limit can be raised; see the r.series manual page: "Number of raster maps to be processed is given by the limit of the operating system."
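As an illustration, the limits can be inspected and the soft limit raised for the current shell session with the ulimit builtin. This is a sketch assuming a POSIX-like shell such as bash on Linux; raising the hard limit itself usually requires administrator rights and is not shown here.

```shell
# Query the current limits on open files for this shell session
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "open files: soft limit $soft, hard limit $hard"

# Raise the soft limit up to the hard limit; this affects only the current
# shell and the programs started from it.
ulimit -Sn "$hard"
```

Programs such as GRASS must be started from the same shell session to inherit the raised limit.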