OpenMP/Benchmarks
Revision as of 08:25, 24 July 2013
This page explores which combinations of compiler and compiler flags speed up computations the most without degrading data quality.
Neighborhood analysis
Performance using OpenMP and different compilers:
Source: svn/sandbox/soeren/benchmarks/neighborhood_openmp/
For each case, run the benchmark four times, discard the first run, and average the remaining three.
Example usage:
unset OMP_NUM_THREADS
time ./neighbor 5000 5000 23
export OMP_NUM_THREADS=1
time ./neighbor 5000 5000 23
...
export OMP_NUM_THREADS=6
time ./neighbor 5000 5000 23
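The run-four-times, discard-the-first, average-the-rest procedure can be scripted. The following is only a sketch, not part of the benchmark sources: the function names `mean_after_warmup`, `wall_seconds` and `bench` are made up for illustration, and `wall_seconds` assumes GNU `date` (for the `%N` nanosecond format) instead of the `time` builtin used above.

```shell
#!/bin/sh
# Hypothetical helpers for repeating a benchmark run; not from the svn sandbox.

# mean_after_warmup T1 T2 ... : print the mean of all arguments except the
# first one (the warm-up run), rounded to two decimals.
mean_after_warmup() {
    [ "$#" -gt 1 ] || return 1   # need at least one run after the warm-up
    shift                        # drop the warm-up measurement
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# wall_seconds CMD [ARGS...] : run the command, discard its output, and
# print the elapsed wall-clock time in seconds. Assumes GNU date.
wall_seconds() {
    start=$(date +%s.%N)
    "$@" > /dev/null 2>&1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.2f\n", e - s }'
}

# bench THREADS CMD [ARGS...] : four timed runs with OMP_NUM_THREADS set to
# THREADS, then the mean of runs two to four.
bench() {
    threads=$1
    shift
    times=""
    for run in 1 2 3 4; do
        # The command substitution is a subshell, so the export stays local.
        t=$(export OMP_NUM_THREADS="$threads"; wall_seconds "$@")
        times="$times $t"
    done
    # Intentionally unquoted: split the collected times into arguments.
    mean_after_warmup $times
}
```

With these definitions loaded, `bench 6 ./neighbor 5000 5000 23` would print the averaged wall-clock time for six threads. The case with `OMP_NUM_THREADS` unset is not covered by this sketch and would still be timed by hand.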
Results table
Test setup: 5000x5000 array with a window size of 23x23 cells
CPU | Available cores | OMP_NUM_THREADS | Time "real" | Time "user" | Time "sys" | Compiler | Compiler version | Compiler flags | OS | System RAM | Data sum | Data mean |
---|---|---|---|---|---|---|---|---|---|---|---|---|
AMD Phenom II X6 1090T | 6 | 1 | 129.17s | 124.94s | 0.75s | gcc | 4.4.5 | -O0 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 2 | 64.64s | 127.89s | 0.31s | gcc | 4.4.5 | -O0 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 4 | 37.26s | 145.96s | 0.52s | gcc | 4.4.5 | -O0 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 6 | 25.17s | 147.70s | 0.49s | gcc | 4.4.5 | -O0 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 1 | 34.86s | 34.67s | 0.19s | gcc | 4.4.5 | -O3 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 2 | 18.38s | 35.71s | 0.13s | gcc | 4.4.5 | -O3 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 4 | 10.04s | 38.40s | 0.32s | gcc | 4.4.5 | -O3 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
AMD Phenom II X6 1090T | 6 | 6 | 7.03s | 37.69s | 0.29s | gcc | 4.4.5 | -O3 | Debian GNU/Linux 6.0.7 (squeeze) | 8.0 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 1 | 64.08s | 63.90s | 0.07s | gcc | 4.6.3 | -O0 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 2 | 32.64s | 64.69s | 0.21s | gcc | 4.6.3 | -O0 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 4 | 17.54s | 68.18s | 0.20s | gcc | 4.6.3 | -O0 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 6 | 16.07s | 85.88s | 0.19s | gcc | 4.6.3 | -O0 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 8 | 13.80s | 106.93s | 0.24s | gcc | 4.6.3 | -O0 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 1 | 19.56s | 19.38s | 0.07s | gcc | 4.6.3 | -O3 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 2 | 10.49s | 20.53s | 0.07s | gcc | 4.6.3 | -O3 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 4 | 5.67s | 21.57s | 0.07s | gcc | 4.6.3 | -O3 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 6 | 4.75s | 25.64s | 0.08s | gcc | 4.6.3 | -O3 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 8 | 3.80s | 28.97s | 0.10s | gcc | 4.6.3 | -O3 | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 1 | 7.96s | 7.94s | 0.07s | icc | 12.1.3 20120212 | -Ofast | Ubuntu 12.04.2 LTS | 15.5 GB | ? | ? |
Intel Core i7-3770 @ 3.40GHz | 8 | 2 | 4.69s | 8.89s | 0.08s | icc | 12.1.3 20120212 | -Ofast | Ubuntu 12.04.2 LTS | 15.5 GB | ? | ? |
Intel Core i7-3770 @ 3.40GHz | 8 | 4 | 2.83s | 10.01s | 0.08s | icc | 12.1.3 20120212 | -Ofast | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 6 | 2.69s | 13.26s | 0.10s | icc | 12.1.3 20120212 | -Ofast | Ubuntu 12.04.2 LTS | 15.5 GB | | |
Intel Core i7-3770 @ 3.40GHz | 8 | 8 | 2.49s | 17.55s | 0.12s | icc | 12.1.3 20120212 | -Ofast | Ubuntu 12.04.2 LTS | 15.5 GB | ? | ? |
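
To compare rows of the table, it can help to convert the "real" times into a speedup (single-thread time divided by N-thread time) and a parallel efficiency (speedup divided by N). The helper below is an illustrative sketch only; the function name `speedup` is an assumption and does not come from the benchmark sources. It expects bare numbers in seconds.

```shell
#!/bin/sh
# speedup T1 TN N : print the speedup T1/TN and the parallel efficiency
# (speedup divided by the thread count N), both rounded to two decimals.
# T1 = "real" time with one thread, TN = "real" time with N threads.
speedup() {
    awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
        s = t1 / tn
        printf "speedup=%.2f efficiency=%.2f\n", s, s / n
    }'
}
```

For the gcc -O3 rows on the AMD Phenom II X6, `speedup 34.86 7.03 6` prints `speedup=4.96 efficiency=0.83`.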