Parallelizing Scripts
Bourne shell script
- Poor man's multithreading using a Bourne shell script and backgrounding. WARNING: not all GRASS modules and scripts are safe to run while other processes are working in the same mapset. Try this at your own risk, and only after a suitable safety audit. For example, make sure that g.region is not run while the jobs are active, since that would change the region settings from outside.
Example:
### r.sun mode 1 loop ###
SUNRISE=7.67
SUNSET=16.33
STEP=0.01
# seq $SUNRISE $STEP $SUNSET | wc -l   gives 867 time steps
CORES=4
DAY=355
for TIME in `seq $SUNRISE $STEP $SUNSET` ; do
    echo "time=$TIME"
    CMD="r.sun -s elevin=gauss day=$DAY time=$TIME \
         beam_rad=rad1_test.${DAY}_${TIME}_beam --quiet"

    # poor man's multi-threading for a multi-core CPU
    MODULUS=`echo "$TIME $STEP $CORES" | awk '{print $1 % ($2 * $3)}'`
    if [ "$MODULUS" = "$STEP" ] || [ "$TIME" = "$SUNSET" ] ; then
        # run in the foreground, then stall to let the background jobs finish
        $CMD
        sleep 2
        wait
        #while [ `pgrep -c r.sun` -ne 0 ] ; do
        #   sleep 5
        #done
    else
        $CMD &
    fi
done
wait # wait for background jobs to finish to avoid race conditions
- This approach has been used in the r3.in.xyz addon script.
- Another example using r.sun Mode 2 can be found on the r.sun wiki page.
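The batching in the loop above hinges on a floating-point modulus computed by awk and compared as a string, which may be sensitive to how remainders are rounded and printed. As a minimal sketch of an alternative (not part of the original script), the batches can be derived from list positions instead. The helper names `time_steps` and `batches` are hypothetical, and the times are generated from integer hundredths of an hour so that repeated float addition cannot drift.

```python
def time_steps(start, stop, step, decimals=2):
    """Return times from start to stop (inclusive) as fixed-width strings.

    Values are scaled to integers first, so no rounding error accumulates.
    """
    scale = 10 ** decimals
    first = int(round(start * scale))
    last = int(round(stop * scale))
    inc = int(round(step * scale))
    return ["%.*f" % (decimals, i / float(scale))
            for i in range(first, last + 1, inc)]


def batches(items, size):
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# same values as SUNRISE, SUNSET, STEP and CORES in the shell example
steps = time_steps(7.67, 16.33, 0.01)
groups = batches(steps, 4)  # one group per round of concurrent jobs
print(len(steps), len(groups), groups[0], groups[-1])
```

Each group would correspond to one round of concurrent r.sun jobs followed by a wait; the final group is simply shorter when the number of steps is not a multiple of the core count.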
Python
- Due to the Global Interpreter Lock (GIL) in the standard CPython interpreter, pure Python code executes on only one core at a time, even when multi-threaded. Schemes and modules that spread pure Python work across cores are therefore built on multiple system processes, which are much more expensive than threads to create and destroy. It is thus more efficient to parallelize at a coarse, high level, with a few large Python "threads" (really processes), than to bury the parallelism deep inside a loop.
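As a rough illustration of coarse-grained parallelism, the sketch below hands whole work items to a small pool of worker processes using the standard multiprocessing module. The function `heavy_task` and its inputs are hypothetical stand-ins for real work, and nothing here is GRASS-specific.

```python
import multiprocessing


def heavy_task(n):
    """Hypothetical stand-in for one large, independent unit of work."""
    return sum(i * i for i in range(n))


def run_parallel(inputs, workers=2):
    """Run heavy_task on each input using a pool of worker processes.

    Each worker process is created once and reused, so the start-up cost
    is paid per worker rather than per work item.
    """
    pool = multiprocessing.Pool(processes=workers)
    try:
        # map() returns results in the same order as the inputs
        return pool.map(heavy_task, inputs)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(run_parallel([10, 100, 1000]))
```

The `if __name__ == "__main__"` guard matters on platforms where worker processes re-import the main module, since it keeps them from launching pools of their own.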
Example of multiprocessing at the GRASS module level:
This is similar to the Bourne shell example above, but uses Python's subprocess module (here through grass.pipe_command). The i.oif script in GRASS 7 uses this method.
import os
import grass.script as grass

bands = [1, 2, 3, 4, 5, 7]
# 'image' (band number -> raster map name) and 'stddev' (dict for the
# results) are assumed to be defined earlier in the script

# run all bands in parallel
if "WORKERS" in os.environ:
    workers = int(os.environ["WORKERS"])
else:
    workers = 6

proc = {}
pout = {}

# spawn jobs in the background
for band in bands:
    grass.debug("band %d, <%s> %% %d" % (band, image[band], band % workers))
    proc[band] = grass.pipe_command('r.univar', flags='g', map=image[band])
    # compare by value with ==; 'is 0' tests object identity
    if band % workers == 0:
        # wait for the ones launched so far to finish
        for bandp in bands[:band]:
            if not proc[bandp].stdout.closed:
                pout[bandp] = proc[bandp].communicate()[0]
            proc[bandp].wait()

# wait for jobs to finish, collect the output
for band in bands:
    if not proc[band].stdout.closed:
        pout[band] = proc[band].communicate()[0]
    proc[band].wait()

# parse the results
for band in bands:
    kv = grass.parse_key_val(pout[band])
    stddev[band] = float(kv['stddev'])
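The same launch-then-collect pattern can be tried outside GRASS with only the standard library. The sketch below is an illustration under stated assumptions, not the i.oif code: `run_batched` is a hypothetical helper, and the child commands are trivial Python one-liners standing in for module calls such as r.univar. It starts at most `workers` processes at a time, collects their output, then starts the next batch.

```python
import subprocess
import sys


def run_batched(commands, workers=2):
    """Run each command (an argv list) with at most `workers` in parallel.

    Returns the decoded, stripped stdout of each command, in input order.
    """
    outputs = []
    for start in range(0, len(commands), workers):
        batch = commands[start:start + workers]
        # spawn one batch of jobs in the background
        procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE) for cmd in batch]
        # communicate() reads all output and waits for the process to exit
        for proc in procs:
            out = proc.communicate()[0]
            if proc.returncode != 0:
                raise RuntimeError("command failed with code %d" % proc.returncode)
            outputs.append(out.decode().strip())
    return outputs


# hypothetical stand-in jobs: each child prints "stddev=<band>"
jobs = [[sys.executable, "-c", "print('stddev=%d')" % band]
        for band in [1, 2, 3, 4, 5, 7]]
results = run_batched(jobs, workers=3)
stddev = [float(line.split("=")[1]) for line in results]
print(stddev)
```

Here the batches are sliced by list position rather than by band number, so a gap in the band list (such as the missing band 6) does not change how many jobs run at once.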