GPU

From GRASS-Wiki

Implementation

In GRASS 7, OpenCL support is enabled by passing the following option to ./configure:

 --with-opencl

You need a GPU with a proprietary driver that provides the libOpenCL.so library and the matching C header files. On Linux, only nVidia, AMD (ATI) and Intel currently meet these criteria.

On Linux, the Intel Ivy Bridge HD4000 driver only supplies OpenCL support for multi-core CPUs, not for the GPU. The same chip has a GPU driver for MS Windows and (probably) for Mac OSX. With Intel graphics on Linux, GPU support in the driver currently requires a Xeon chip.

Point --with-opencl-includes= to the directory above cl.h, and as long as libOpenCL.so is in the ldconfig search path you're ok (a symlink to /usr/local/lib might be needed).
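Whether an OpenCL library is already visible to the system can be checked before running ./configure. The following is a minimal sketch using only the Python standard library; `opencl_library_name` is a hypothetical helper name, and the lookup approximates the runtime loader search rather than the exact link test that ./configure performs.

```python
import ctypes.util


def opencl_library_name():
    """Return the OpenCL library name found by the system lookup, or None."""
    # On Linux, find_library consults tools such as ldconfig, so a hit here
    # suggests that the library is in the loader's search path.
    return ctypes.util.find_library("OpenCL")


name = opencl_library_name()
if name is None:
    print("No OpenCL library found; a symlink or an ldconfig update may be needed.")
else:
    print("Found OpenCL library:", name)
```

A result of None does not prove that the driver is missing, only that the library is not where the lookup searches.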

On Mac OSX, OpenCL support is now built in, and the framework should be detected automatically when --with-opencl is passed to ./configure.

OpenCL can use any number of GPUs and CPUs at the same time, or only a chosen subset of the available devices, but in either case the code has to be designed specifically for such selection.
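As an illustration of explicit device selection, the sketch below counts the GPU and CPU devices reported by the installed OpenCL library and prefers a GPU when one exists. It calls the standard OpenCL C functions clGetPlatformIDs and clGetDeviceIDs through ctypes; the helper names (`load_opencl`, `count_devices`, `pick_device_type`) and the "GPU first, then CPU" policy are assumptions made for this example, not what r.sun or any GRASS module does.

```python
import ctypes
import ctypes.util

# Constants as defined in the OpenCL header cl.h.
CL_SUCCESS = 0
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2


def load_opencl():
    """Load the OpenCL library, or return None if it is unavailable."""
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    lib.clGetPlatformIDs.restype = ctypes.c_int
    lib.clGetPlatformIDs.argtypes = [
        ctypes.c_uint, ctypes.POINTER(ctypes.c_void_p),
        ctypes.POINTER(ctypes.c_uint)]
    lib.clGetDeviceIDs.restype = ctypes.c_int
    lib.clGetDeviceIDs.argtypes = [
        ctypes.c_void_p, ctypes.c_ulonglong, ctypes.c_uint,
        ctypes.POINTER(ctypes.c_void_p), ctypes.POINTER(ctypes.c_uint)]
    return lib


def count_devices(lib, device_type):
    """Count devices of one type across all OpenCL platforms."""
    n_platforms = ctypes.c_uint(0)
    status = lib.clGetPlatformIDs(0, None, ctypes.byref(n_platforms))
    if status != CL_SUCCESS or n_platforms.value == 0:
        return 0
    platforms = (ctypes.c_void_p * n_platforms.value)()
    if lib.clGetPlatformIDs(n_platforms.value, platforms, None) != CL_SUCCESS:
        return 0
    total = 0
    for platform in platforms:
        n_devices = ctypes.c_uint(0)
        # A non-success status (for example, no device of this type)
        # simply contributes nothing to the total.
        status = lib.clGetDeviceIDs(platform, device_type, 0, None,
                                    ctypes.byref(n_devices))
        if status == CL_SUCCESS:
            total += n_devices.value
    return total


def pick_device_type(lib):
    """Prefer a GPU, fall back to a CPU, return None if neither exists."""
    if lib is None:
        return None
    for label, device_type in (("GPU", CL_DEVICE_TYPE_GPU),
                               ("CPU", CL_DEVICE_TYPE_CPU)):
        if count_devices(lib, device_type) > 0:
            return label
    return None


print("Selected device type:", pick_device_type(load_opencl()))
```

A real module would go on to create a context and command queues for the chosen devices; the point here is only that the choice is made explicitly in code.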

Currently r.sun is the first module with any OpenCL support, although integration of Seth's work into svn-trunk is not yet complete.

Discussion

Comments from the mailing list concerning GRASS and GPU parallelization:

  • As I understand it, CUDA is 100% dependent on the closed-source binary driver from nVidia and works on their video cards alone. Which is fine for today for people with nVidia hardware using their binary video card driver. If nVidia decides in a couple of years to stop supporting CUDA, your old card, your specific OS or distro, your OS or distro version+cpu type, or if they go out of business or are bought/sold to another company who is not interested, any code based on it becomes useless. For this reason code written for an open platform such as OpenCL, even if less advanced, seems to have a brighter long-term future. -- HB
  • Support for double-precision floating point values must be retained for calculations which deal with positional data (sub-meter precision for lat/long exceeds what single-precision floating point can represent). For elevation and radiometric data, single-precision floating point may be enough.
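The precision point above can be checked numerically. The sketch below compares the gap between adjacent representable values near a longitude of about 179 degrees for single and double precision, converted to meters. The figure of 111,320 m per degree is an approximation for the equator, and the helper names are assumptions made for this example.

```python
import math
import struct

# Approximate length of one degree at the equator (assumption).
METERS_PER_DEGREE = 111_320.0


def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def spacing(value, significand_bits):
    """Gap between adjacent representable values near `value`.

    Use significand_bits=24 for single precision, 53 for double precision.
    Valid for values in the normal (non-denormal) range.
    """
    exponent = math.frexp(value)[1]  # value = m * 2**exponent, 0.5 <= |m| < 1
    return 2.0 ** (exponent - significand_bits)


lon = 179.1234567
print("single-precision step (m):", spacing(lon, 24) * METERS_PER_DEGREE)
print("double-precision step (m):", spacing(lon, 53) * METERS_PER_DEGREE)
print("rounding error for this value (m):",
      abs(to_float32(lon) - lon) * METERS_PER_DEGREE)
```

Near 179 degrees the single-precision step works out to about 1.7 m on the ground, so coordinates cannot be resolved to sub-meter precision, whereas the double-precision step is on the order of nanometers.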

Further reading

  • Steinbach, M., Hemmerling, R., 2011. Accelerating batch processing of spatial raster analysis using GPU. Computers & Geosciences. DOI
  • See the "Parallelization" category listing at the bottom of this page.

Modules of interest to be parallelized

The target version will be GRASS 7 (alias SVN trunk).

Notes:

    • r.resamp.stats, r.resamp.filter and r.series should be readily parallelisable, but I/O is likely to be the bottleneck.
    • r.series has the advantage that the I/O is also parallelisable.
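The notes above can be illustrated with a small sketch of an r.series-style aggregation: each input map is read by its own task, and each output row is computed independently. This is not GRASS code; `load_map` is a hypothetical stand-in for raster input, the maps are plain nested lists, and the mean is just one possible aggregate.

```python
from concurrent.futures import ThreadPoolExecutor


def load_map(index, rows, cols):
    """Stand-in for reading one input raster (hypothetical data source)."""
    return [[float(index + r * cols + c) for c in range(cols)]
            for r in range(rows)]


def average_row(row_index, maps):
    """Compute one output row: the per-cell mean across all input maps."""
    rows = [m[row_index] for m in maps]
    return [sum(cells) / len(cells) for cells in zip(*rows)]


def series_mean(n_maps=3, rows=4, cols=5, workers=4):
    """Aggregate a series of maps, parallelising both input and computation."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Input: every map is loaded by a separate task, so reads can overlap.
        maps = list(pool.map(lambda i: load_map(i, rows, cols), range(n_maps)))
        # Computation: output rows do not depend on each other.
        return list(pool.map(lambda r: average_row(r, maps), range(rows)))


result = series_mean()
for row in result:
    print(row)
```

In standard CPython, threads overlap I/O waits but not pure-Python arithmetic, so this sketch shows the decomposition rather than a real speed-up; with I/O as the bottleneck, the parallel reads are the part that matters most.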