Parallel GRASS jobs: Difference between revisions

From GRASS-Wiki
Revision as of 13:20, 30 April 2008

Parallel GRASS jobs

NOTE: GRASS 6 libraries are NOT thread safe (except for GPDE, see below).

Essentially, there are at least two approaches to "poor man's" parallelization that do not require modifying the GRASS source code:

  • split map into spatial chunks (possibly with overlap to gain smooth results)
  • time series: run each map elaboration on a different node.
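The first approach (spatial chunks with overlap) can be sketched as follows. This is only an illustration: the function name strip_bounds and the example extent are hypothetical, integer coordinates are assumed because POSIX shell arithmetic is integer-only, and the map is split into horizontal strips only.

```shell
#!/bin/sh
# Print "index north south" for each of NTILES horizontal strips between
# NORTH and SOUTH. Each strip is extended by OVERLAP map units on both
# sides and then clamped to the full extent.
strip_bounds() {
    north=$1; south=$2; ntiles=$3; overlap=$4
    height=$(( (north - south) / ntiles ))
    i=0
    while [ "$i" -lt "$ntiles" ]; do
        n=$(( north - i * height + overlap ))
        s=$(( north - (i + 1) * height - overlap ))
        # the last strip takes any remainder of the integer division
        [ "$i" -eq $(( ntiles - 1 )) ] && s=$south
        # never reach beyond the original extent
        [ "$n" -gt "$north" ] && n=$north
        [ "$s" -lt "$south" ] && s=$south
        echo "$i $n $s"
        i=$(( i + 1 ))
    done
}

# Example: extent from north=4000 to south=0, 4 strips, 100 units overlap
strip_bounds 4000 0 4 100
```

Each printed line could then be used in the job for that strip to set the computational region (for example with g.region n=... s=...) before running the actual GRASS commands.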

GPDE using OpenMP

The only parallelized library in GRASS 6.3 is GRASS Partial Differential Equations Library (GPDE). The library design is thread safe and supports threaded parallelism with OpenMP. The code is not yet widely used in GRASS. See here for details.

OpenMosix

If you want to launch several GRASS jobs in parallel, you have to launch each job in its own mapset. Be sure to indicate the mapset correctly in the GISRC file (see above). You can use the process ID (PID, obtained with $$) or the PBS job name to generate an almost unique identifier which you can add to the mapset name.
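A minimal sketch of such a mapset name generator is shown here. The function name unique_mapset_name and the base name tmp_modis are hypothetical. PBS_JOBID is the variable PBS sets inside a running job; outside PBS the sketch falls back to the shell PID.

```shell
#!/bin/sh
# Build a per-job mapset name from a base name and a job-specific suffix.
# If no suffix is given, use PBS_JOBID (set by PBS inside a job) and,
# failing that, the PID of the current shell ($$).
unique_mapset_name() {
    base=$1
    suffix=${2:-${PBS_JOBID:-$$}}
    # replace every character that is not a letter, digit or underscore
    suffix=$(printf '%s' "$suffix" | tr -c 'A-Za-z0-9_' '_')
    printf '%s_%s\n' "$base" "$suffix"
}

MYMAPSET=$(unique_mapset_name tmp_modis)
echo "Using mapset: $MYMAPSET"
```

A PID is only unique on one machine at one time, so on a cluster the job ID is the safer choice when it is available.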


Now you could launch the jobs on an openMosix cluster (just install openMosix on your colleague's computers...).

PBS

Note: For PBS details, read on here.

You need essentially two scripts:

  • GRASS job script (which takes the name(s) of the map(s) to elaborate from environment variables)
  • script to launch this GRASS-script as job for each map to elaborate

General steps (for multiple serial jobs on many CPUs):

  • Job definition
    • PBS setup (in the header): define calculation time, number of nodes, number of processors, amount of RAM for individual job;
    • data are stored in centralized directory which is seen by all nodes;
  • Job execution (launch of jobs)
    • the user launches all jobs ("qsub"), which submits them to the queue. Use the GRASS_BATCH_JOB variable to define the name of the elaboration script.
    • the scheduler optimizes the execution of the jobs among all users according to the available and the requested resources;
    • for the user this means that between 0 and max jobs are executed in parallel, depending on any priorities or limits defined by the administrators. The user can then watch the job queue ("showq") to see the jobs ahead and the scheduling of their own jobs. Once a job starts running, the cluster may send a notification email to the user, and again when the job terminates.
    • At the end of the elaboration, call a second batch job which only contains g.copy to copy the result into a common mapset. There is a low risk of a race condition here if two nodes finish at the same time, but this can be caught with a loop which checks whether the target mapset is locked and, if needed, launches g.copy again until it succeeds.
  • Job planning
    • The challenging part for the user is to estimate the execution time, since PBS kills jobs which exceed the requested time. The same applies to the requested number of nodes and CPUs per node as well as the amount of RAM needed. Usually test runs are needed to measure the performance in order to set the values correctly in the PBS script (see below).
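The retry loop mentioned for the g.copy step could look roughly like this. It is a sketch under assumptions: the helper name retry is hypothetical, and it relies on the wrapped command (for example the GRASS start-up on the target mapset) returning a non-zero exit status when the mapset is locked.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
    return 0
}

# Harmless demonstration; in a real job the command would be something like:
#   retry 10 30 grass63 -text /shareddisk/grassdata/myloc/results
retry 3 0 true && echo "copy step succeeded"
```

Limiting the number of attempts keeps a permanently locked mapset from consuming the whole requested walltime.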

How to write the scripts: To avoid race conditions, you can automatically generate multiple mapsets in a given location. When you start GRASS (in your script) with the path to grassdata/location/mapset/ and the requested mapset does not yet exist, it is created automatically. So, as the first step in your job script, be sure to run

     g.mapsets add=mapset1_with_data[,mapset2_with_data]

to make the data which you want to elaborate accessible. You would then loop over many map names (e.g. "aqua_lst1km20020706.LST_Night_1km.filt") and launch the script once per map, passing the map name to it:

     ------- snip (you need to update this for PBS stuff and save as 'launch_grassjob_55min.sh') -----------
     #!/bin/sh
     ### Project number, enter if applicable (needed to manage your CPU hours)
     #PBS -A HPC2N-2008-001
     #
     ### Job name - defaults to name of submit script
     #PBS -N modis_interpolation_GRASS_RST.sh
     #
     ### Output files - defaults to jobname.[eo]jobnumber
     #PBS -o modis_rst.$MYMODIS.out
     #PBS -e modis_rst.$MYMODIS.err
     #
     ### Mail on - a=abort, b=beginning, e=end - defaults to a
     #PBS -m abe
     ### Number of nodes - defaults to 1:1
     ### Requesting 1 node with 1 processor per node:
     #PBS -l nodes=1:ppn=1
     ### Requesting time - defaults to 30 minutes
     #PBS -l walltime=00:55:00
     ### Amount of memory each process may use (pvmem limits virtual memory per process), here 3000 MB:
     #PBS -l pvmem=3000m

     # we'll call this script below in a loop, giving the name of the map to elaborate as parameter
     MYMAPSET=$1
     TARGETMAPSET=results
     
     # define as environment variable the batch job which does the real GRASS elaboration (i.e. the script containing the GRASS commands);
     # it must be exported, otherwise GRASS will not see it
     export GRASS_BATCH_JOB=/shareddisk/modis_job.sh
     grass63 -text /shareddisk/grassdata/myloc/$MYMAPSET
     
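     # since we write results to a temporary mapset, copy the result over to the target mapset;
     # we launch this as a second, small GRASS job
     # NOTE: CURRMAP is not defined in this script; it must already hold the name of the processed map
     export INMAP=${CURRMAP}_rst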
     export INMAPSET=$MYMAPSET
     export OUTMAP=$INMAP
     export GRASS_BATCH_JOB=/shareddisk/gcopyjob.sh
     grass63 -text  /shareddisk/grassdata/myloc/$TARGETMAPSET
     exit 0
     ------- snap ----------

As you can see, GRASS is run twice. Note that you need GRASS version 6.3 or later to make use of GRASS_BATCH_JOB (if the variable is set, GRASS automatically executes that job instead of launching the normal interactive user interface).
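Before submitting many jobs, it can help to verify that the script named in GRASS_BATCH_JOB is usable. The following sketch assumes that GRASS expects the batch job file to exist and to be executable. The helper name check_batch_job is hypothetical, and the demo uses a throwaway file rather than the real job script.

```shell
#!/bin/sh
# Return 0 if the given batch job file exists and is executable,
# otherwise print a hint to stderr and return 1.
check_batch_job() {
    job=$1
    if [ ! -f "$job" ]; then
        echo "batch job not found: $job" >&2
        return 1
    fi
    if [ ! -x "$job" ]; then
        echo "batch job not executable (try: chmod u+x $job)" >&2
        return 1
    fi
    return 0
}

# Demonstration with a temporary script created only for this example
demo=$(mktemp)
printf '#!/bin/sh\necho "GRASS commands would go here"\n' > "$demo"
chmod u+x "$demo"
check_batch_job "$demo" && echo "ok: $demo"
rm -f "$demo"
```

In the launch script, the same check could be run on /shareddisk/modis_job.sh and /shareddisk/gcopyjob.sh before the corresponding grass63 call.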

The script 'gcopyjob.sh' simply contains:

     ------- snip -----------
     #!/bin/sh
     g.copy rast=$INMAP@$INMAPSET,$OUTMAP --o
     ------- snap ----------

Launching many jobs:

We do this by simply looping over all map names to elaborate:

     cd /shareddisk/
     # loop and launch (we just pick the names from the GRASS DB itself; here: do all maps)
     # instead of launching immediately, we create a launch script:
     # -type f: list only the map files, not the 'cell' directory itself
     for i in `find grassdata/myloc/modis_originals/cell/ -type f` ; do
         NAME=`basename $i`
         echo qsub -v MYMODIS=$NAME ./launch_grassjob_55min.sh
     done | sort > launch1.sh 
     # now really launch the jobs:
     sh launch1.sh

That's it! Emails will arrive to notify you of the start, abort (hopefully not!) and end of each job.

SGE - SUN Grid Engine