GPU: Difference between revisions

From GRASS-Wiki
(→‎Further reading: added podcasts)
 
(16 intermediate revisions by 4 users not shown)
Line 1: Line 1:
__TOC__
== Implementation ==
In GRASS 7 you can ./configure GRASS with:
  --with-opencl
You need a GPU with a proprietary driver that provides the libOpenCL.so library and the matching C header files. On Linux, currently only nVidia, AMD (ATI) and Intel meet these criteria.
The Intel Ivy Bridge HD4000 driver only supplies OpenCL multi-CPU core support for Linux. The same chip has a GPU driver for MS Windows and (probably) for Mac OSX. On Linux with Intel graphics, GPU support in the driver currently requires a Xeon chip.
Point --with-opencl-includes= to the directory ''above'' cl.h, and as long as libOpenCL.so is in the ldconfig search path you're ok (a symlink to /usr/local/lib might be needed).
On Mac OSX OpenCL support is now built in, and the framework should be automatically detected when you use --with-opencl in the ./configure options.
OpenCL can use any number of GPUs and CPUs at the same time, or only selected devices from those available, but in either case the code has to be designed specifically for such selection.
Currently {{cmd|r.sun}} is the first module with any {{wikipedia|OpenCL}} support (see [[R.sun#Development|here]]), although integration of Seth's work into svn-trunk is not yet complete.
== Discussion ==
Comments from the mailing list concerning GRASS and GPU parallelization:


Line 8: Line 29:
* As I understand it, CUDA is 100% dependent on the closed-source binary driver from nVidia and works on their video cards alone. Which is fine for today for people with nVidia hardware using their binary video card driver. If nVidia decides in a couple of years to stop supporting CUDA, your old card, your specific OS or distro, your OS or distro version+cpu type, or if they go out of business or are bought/sold to another company who is not interested, any code based on it becomes useless. For this reason code written for an open platform such as OpenCL, even if less advanced, seems to have a brighter long-term future. -- ''HB''


* Support for double precision floating point values must be retained for calculations which deal with positional data. For elevation data floating point precision may be enough.
* Support for double precision floating point values must be retained for calculations which deal with positional data (as sub-meter precision for lat/long exceeds single-precision floating point). For elevation and radiometric data single-precision floating point may be enough.


== Further reading ==


* LINUX Magazine March 10th, 2010
* Steinbach, M., Hemmerling, R., 2011. Accelerating batch processing of spatial raster analysis using GPU. Computers & Geosciences. [http://dx.doi.org/10.1016/j.cageo.2011.11.012 DOI]
: "''GP-GPUs: OpenCL Is Ready For The Heavy Lifting''"
 
: http://www.linux-mag.com/id/7725
* LINUX Magazine March 10th, 2010: "''GP-GPUs: OpenCL Is Ready For The Heavy Lifting''", http://www.linux-mag.com/id/7725


* http://www.hpcwire.com/offthewire/Khronos-Demonstrates-OpenCL-Momentum-at-SC09-70243307.html
Line 22: Line 43:
* OpenCL podcasts: http://www.macresearch.org/opencl


== Interesting Hardware ==
* Parallel GRASS GIS modules for viewshed and Fresnel analysis running on CUDA GPU, test project, http://s51mo.net/fresnel/
 
* [http://www.nvidia.com/object/product_geforce_gtx_480_us.html NVIDIA GeForce 480], packed with 3 billion transistors, 480 visual processing cores, 16 geometry units and 4 raster units. Multi-card SLI provides additional 90% performance boost.


== Modules of interest to be parallelized ==
Line 30: Line 49:
The target version will be GRASS 7 (alias SVN trunk).


* {{cmd|v.in.ogr}} or <u>underlying vector library functions to build topology and spatial index</u>
* {{cmd|v.in.ogr}} or &rarr; <u>underlying vector library functions to build topology and spatial index</u> &larr;
* {{cmd|v.surf.rst}}
* {{cmd|v.vol.rst}}
** (''probably best to focus on the RST library first'')
** (''probably the best is to focus on the RST library first'')
* {{AddonCmd|r.viewshed}}
* {{cmd|r.viewshed}}
* {{cmd|r.sun}}
* {{cmd|v.surf.bspline}}
* {{cmd|r.sun}} (Seth added OpenCL support as part of his Google Summer of Code project)
* {{cmd|r.proj}}
* {{cmd|v.proj}}
* {{cmd|v.net}}.* ???
* raster library
* raster library (typically I/O-bound)
* {{cmd|i.rectify}}
* ...
* ...
* <strike>{{cmd|r.mapcalc}}</strike> (already has pthreads support; probably I/O-bound)
* <strike>{{cmd|r.mapcalc|version=70}}</strike> (already has pthreads support (but only for parsing!!); probably I/O-bound)
* <strike>{{cmd|r.los}}</strike> (to be replaced by r.viewshed after last few bugs are fixed)
* <strike>{{cmd|i.rectify}}</strike> (internal code to be replaced by GDALwarp API)


Notes:
** r.resamp.stats, r.resamp.filter and r.series should be readily parallelisable, but I/O is likely to be the bottleneck.
** r.series has the advantage that the I/O is also parallelisable.


[[Category: Hardware]]
[[Category: Parallelization]]

Latest revision as of 11:39, 4 April 2016

Implementation

In GRASS 7 you can ./configure GRASS with:

 --with-opencl

You need a GPU with a proprietary driver that provides the libOpenCL.so library and the matching C header files. On Linux, currently only nVidia, AMD (ATI) and Intel meet these criteria.

The Intel Ivy Bridge HD4000 driver only supplies OpenCL multi-CPU core support for Linux. The same chip has a GPU driver for MS Windows and (probably) for Mac OSX. On Linux with Intel graphics, GPU support in the driver currently requires a Xeon chip.

Point --with-opencl-includes= to the directory above cl.h, and as long as libOpenCL.so is in the ldconfig search path you're ok (a symlink to /usr/local/lib might be needed).

On Mac OSX OpenCL support is now built in, and the framework should be automatically detected when you use --with-opencl in the ./configure options.

OpenCL can use any number of GPUs and CPUs at the same time, or only selected devices from those available, but in either case the code has to be designed specifically for such selection.

Currently r.sun is the first module with any OpenCL support (see the Development section of the R.sun page), although integration of Seth's work into svn-trunk is not yet complete.

Discussion

Comments from the mailing list concerning GRASS and GPU parallelization:

  • As I understand it, CUDA is 100% dependent on the closed-source binary driver from nVidia and works on their video cards alone. Which is fine for today for people with nVidia hardware using their binary video card driver. If nVidia decides in a couple of years to stop supporting CUDA, your old card, your specific OS or distro, your OS or distro version+cpu type, or if they go out of business or are bought/sold to another company who is not interested, any code based on it becomes useless. For this reason code written for an open platform such as OpenCL, even if less advanced, seems to have a brighter long-term future. -- HB
  • Support for double precision floating point values must be retained for calculations which deal with positional data (as sub-meter precision for lat/long exceeds single-precision floating point). For elevation and radiometric data single-precision floating point may be enough.
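
The precision point for lat/long data can be checked with a little arithmetic. The following is a minimal sketch, not GRASS code: the helper names and the sample longitude are hypothetical, and the metres-per-degree figure is an approximation for the equator. It rounds a longitude to single precision and converts the resulting error, and the gap between neighbouring single-precision values, into metres.

```python
import math
import struct

# Approximate length of one degree at the equator, in metres (assumption).
METERS_PER_DEGREE = 111_320.0


def to_float32(value: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def float32_spacing_m(degrees: float) -> float:
    """Gap between adjacent single-precision values near a nonzero `degrees`, in metres."""
    exponent = math.floor(math.log2(abs(degrees)))
    # Single precision stores 23 explicit fraction bits.
    return 2.0 ** (exponent - 23) * METERS_PER_DEGREE


lon = 179.123456789  # hypothetical longitude in decimal degrees
error_m = abs(to_float32(lon) - lon) * METERS_PER_DEGREE

print(f"rounding error near {lon} deg: {error_m:.3f} m")
print(f"single-precision grid spacing: {float32_spacing_m(lon):.3f} m")
```

For longitudes between 128 and 180 degrees, neighbouring single-precision values are 2^-16 degrees apart, which is more than a metre on the ground at the equator, so sub-meter positions cannot be represented reliably.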

Further reading

  • Steinbach, M., Hemmerling, R., 2011. Accelerating batch processing of spatial raster analysis using GPU. Computers & Geosciences. doi:10.1016/j.cageo.2011.11.012
  • See the "Parallelization" category listing at the bottom of this page.

Modules of interest to be parallelized

The target version will be GRASS 7 (alias SVN trunk).

Notes:

    • r.resamp.stats, r.resamp.filter and r.series should be readily parallelisable, but I/O is likely to be the bottleneck.
    • r.series has the advantage that the I/O is also parallelisable.
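
The note on r.series can be illustrated with a small sketch. This is not how r.series is implemented: the in-memory `MAPS` dictionary, `read_row` and the other names are hypothetical stand-ins for raster maps and row reads. The idea shown is only that, because each input map is a separate source, the same row can be fetched from all inputs concurrently before the per-cell aggregation runs.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical stand-in for raster maps: map name -> rows of cell values.
MAPS = {
    "map_a": [[1.0, 2.0], [3.0, 4.0]],
    "map_b": [[3.0, 4.0], [5.0, 6.0]],
    "map_c": [[5.0, 6.0], [7.0, 8.0]],
}


def read_row(name, row):
    """Stand-in for reading one row from one input map (the I/O step)."""
    return MAPS[name][row]


def series_mean_row(names, row, pool):
    """Fetch the same row from every input concurrently, then average per cell."""
    rows = list(pool.map(lambda name: read_row(name, row), names))
    return [mean(cells) for cells in zip(*rows)]


def series_mean(names, nrows):
    """Compute the per-cell mean across all input maps, row by row."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return [series_mean_row(names, row, pool) for row in range(nrows)]


result = series_mean(list(MAPS), 2)
print(result)
```

Threads are used here because the work being overlapped is I/O; whether this helps in practice depends on the storage layer, which is the bottleneck the notes above point to.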