OpenMP/Benchmarks

From GRASS-Wiki

Revision as of 23:43, 23 July 2013

Neighborhood analysis

Performance using OpenMP and different compilers:

Source code: svn/sandbox/soeren/benchmarks/neighborhood_openmp/

Run each case four times, discard the first run, and average the remaining three.
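
The averaging rule above can be sketched as a small shell helper. This is only an illustration: the function name `mean_after_warmup` is hypothetical and not part of the benchmark sources. It takes the measured "real" times in seconds as arguments, drops the first one, and lets awk average the rest.

```shell
#!/bin/sh
# Hypothetical helper (not part of the benchmark sources):
# drop the first timing and average the remaining ones.
# Arguments: wall-clock times in seconds, in the order they were measured.
mean_after_warmup() {
    if [ "$#" -lt 2 ]; then
        echo "need at least two timings" >&2
        return 1
    fi
    shift  # discard the first run
    # Print one value per line and let awk compute the mean.
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}
```

Called with four timings, for example `mean_after_warmup 9 1 2 3` (placeholder values, not measurements), it prints the mean of the last three.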

Example usage:

 unset OMP_NUM_THREADS
 time ./neighbor 5000 5000 23
 
 export OMP_NUM_THREADS=1
 time ./neighbor 5000 5000 23
 
 ...
 
 export OMP_NUM_THREADS=6
 time ./neighbor 5000 5000 23
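
The manual sequence above could also be scripted. The following is a minimal sketch, not the procedure used for the results on this page: the function name `bench_threads` and its arguments are hypothetical, and it assumes a `date` that supports `+%s` (GNU and BSD do), which gives whole-second resolution only.

```shell
#!/bin/sh
# Sketch: run a command several times for each thread count and print the
# elapsed wall-clock seconds of every run. All names here are hypothetical.
# Usage: bench_threads REPEATS "THREAD COUNTS" COMMAND [ARGS...]
bench_threads() {
    repeats=$1; shift
    threads_list=$1; shift
    for n in $threads_list; do
        i=1
        while [ "$i" -le "$repeats" ]; do
            start=$(date +%s)
            # Set OMP_NUM_THREADS for this one invocation only.
            OMP_NUM_THREADS=$n "$@" >/dev/null
            end=$(date +%s)
            echo "threads=$n run=$i seconds=$((end - start))"
            i=$((i + 1))
        done
    done
}
```

For the test setup on this page the call would be `bench_threads 4 "1 2 4 6" ./neighbor 5000 5000 23`. For sub-second accuracy, or to also record the "user" and "sys" times, wrap the command in `time` as in the example above instead.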

Test setup: 5000x5000 array with a window size of 23x23 cells

Results table:

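{| border="1" class="wikitable sortable" style="margin: 1em 1em 1em 0; background: #f9f9f9; border: 1px #aaaaaa solid; border-collapse: collapse;"
! CPU !! Available cores !! OMP_NUM_THREADS !! Time "real" !! Time "user" !! Time "sys" !! Compiler !! Compiler version !! Compiler flags !! OS !! System RAM !! Data sum !! Data mean
|-
| AMD Phenom II X6 1090T || 6 || 1 || 134.70s || 137.78s || 0.89s || gcc || 4.4.5 || -O0 || Debian GNU/Linux 6.0.7 (squeeze) || 8.0 GB || ||
|-
| AMD Phenom II X6 1090T || 6 || 2 || 65.73s || 129.46s || 0.89s || gcc || 4.4.5 || -O0 || Debian GNU/Linux 6.0.7 (squeeze) || 8.0 GB || ||
|-
| AMD Phenom II X6 1090T || 6 || 4 || 37.26s || 145.96s || 0.52s || gcc || 4.4.5 || -O0 || Debian GNU/Linux 6.0.7 (squeeze) || 8.0 GB || ||
|-
| AMD Phenom II X6 1090T || 6 || 6 || 25.17s || 147.70s || 0.49s || gcc || 4.4.5 || -O0 || Debian GNU/Linux 6.0.7 (squeeze) || 8.0 GB || ||
|-
| ''Intel i7 3770'' || ''4 real (8 w/ hyperthread)'' || ''unset (all)'' || || ''example entry'' || || ''gcc'' || ''4.4.8'' || ''-O3'' || ''Ubuntu 12.04'' || ''8 GB'' || ||
|}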
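
To read the scaling off the table, the "real" times can be turned into speedup (single-thread time divided by n-thread time) and parallel efficiency (speedup divided by n). The sketch below does this with awk; the function name `speedup_table` is hypothetical, and the input is simply the thread counts and "real" times of the four AMD Phenom II X6 1090T rows.

```shell
#!/bin/sh
# Sketch: speedup and parallel efficiency relative to the first input line.
# Input lines: "<threads> <real seconds>"; the first line is the baseline.
# Output lines: "<threads> <speedup> <efficiency>".
speedup_table() {
    awk 'NR == 1 { base = $2 }
         { s = base / $2; printf "%d %.2f %.2f\n", $1, s, s / $1 }'
}

# "real" times from the results table (gcc 4.4.5, -O0).
speedup_table <<'EOF'
1 134.70
2 65.73
4 37.26
6 25.17
EOF
```

For these measurements the speedup comes out at about 2.05, 3.62 and 5.35 on 2, 4 and 6 threads, so efficiency stays near 0.9 even with all six cores in use. These are single measurements at -O0, so the exact figures should be treated with some caution.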