Serial job example

Latest revision as of 21:57, 13 October 2011

In this exercise the user obtains a prepared simple job, extracts the archive, lists the contents of its files, submits the job on the PARADOX cluster, monitors its progress using queue information and, when the job is done, lists the results file.

1. Log in to ui.ipb.ac.rs:

$ ssh ngrkic@ui.ipb.ac.rs

2. Navigate to your folder on the NFS filesystem:

$ cd /nfs/ngrkic

3. Download the tgz archive with the example files:

$ wget http://wiki.ipb.ac.rs/images/d/db/Serial.tgz

4. Extract the archive:

$ tar xvzf Serial.tgz

5. Enter the Serial folder:

$ cd Serial

6. List the contents of the folder:

$ ll

7. List the contents of the job.pbs and job.sh files:

$ cat job.pbs
#!/bin/bash
#PBS -q hpsee
#PBS -l nodes=1:ppn=1
#PBS -l walltime=10:00:00
#PBS -e ${PBS_JOBID}.err
#PBS -o ${PBS_JOBID}.out

cd $PBS_O_WORKDIR
chmod +x job.sh
./job.sh
$ cat job.sh
#!/bin/bash
date
hostname
pwd
sleep 120

8. Submit the job:

$ qsub job.pbs

qsub prints the job identifier:

<jobID>.ce64.ipb.ac.rs
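
When the submission is scripted, it can help to keep the identifier that qsub prints so that later qstat calls can reuse it. The following is a minimal sketch, assuming the identifier has the <number>.<server> form shown above. The helper name job_number is hypothetical and is not part of the exercise files.

```shell
# job_number is a hypothetical helper: it prints the numeric part of a
# full job identifier such as 1723627.ce64.ipb.ac.rs.
job_number() {
    # Remove everything from the first dot onward.
    printf '%s\n' "${1%%.*}"
}

# Typical use on the cluster (assumed workflow, not executed here):
#   jobid=$(qsub job.pbs)              # full identifier printed by qsub
#   qstat -f "$(job_number "$jobid")"  # query the job by its number
```

The function itself only manipulates a string, so it can be tried on any machine with bash, even without access to the cluster.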

9. Monitor your job:

$ qstat <jobID>.ce64.ipb.ac.rs
$ qstat -u ngrkic
ce64.ipb.ac.rs: 
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1723627.ce64.ipb     ngrkic   hpsee    job.pbs           32715     1  --    --  48000 R   -- 
$ qstat -f 1723627
Job Id: 1723627.ce64.ipb.ac.rs
   Job_Name = job.pbs
   Job_Owner = ngrkic@ui.ipb.ac.rs
   job_state = R
   queue = hpsee
   server = ce64.ipb.ac.rs
   Checkpoint = u
   ctime = Thu Oct 13 22:57:53 2011
   Error_Path = ui.ipb.ac.rs:/nfs/ngrkic/serial/${PBS_JOBID}.err
   exec_host = n08.ipb.ac.rs/0
   Hold_Types = n
   Join_Path = n
   Keep_Files = n
   Mail_Points = a
   mtime = Thu Oct 13 22:57:54 2011
   Output_Path = ui.ipb.ac.rs:/nfs/ngrkic/serial/${PBS_JOBID}.out
   Priority = 0
   qtime = Thu Oct 13 22:57:53 2011
   Rerunable = True
   Resource_List.cput = 48000:00:00
   Resource_List.nodect = 1
   Resource_List.nodes = 1:ppn=1
   Resource_List.walltime = 500:00:00
   session_id = 3759
   Variable_List = PBS_O_HOME=/home/ngrkic,PBS_O_LANG=en_US.UTF-8,
   PBS_O_LOGNAME=ngrkic,
   PBS_O_PATH=/opt/glite/yaim/bin:/usr/kerberos/bin:/opt/d-cache/srm/bin
   :/opt/d-cache/dcap/bin:/opt/edg/bin:/opt/glite/bin:/opt/globus/bin:/op
   t/lcg/bin:/usr/local/bin:/bin:/usr/bin:/opt/intel/Compiler/11.1/064/bi
   n/intel64/:/home/ngrkic/bin:/opt/intel/Compiler/11.1/059/bin/intel64/,
   PBS_O_MAIL=/var/spool/mail/ngrkic,PBS_O_SHELL=/bin/bash,
   PBS_SERVER=ui.ipb.ac.rs,PBS_O_HOST=ui.ipb.ac.rs,
   PBS_O_WORKDIR=/nfs/ngrkic/serial,PBS_O_QUEUE=hpsee
   etime = Thu Oct 13 22:57:53 2011
   submit_args = job.pbs
   start_time = Thu Oct 13 22:57:54 2011
   start_count = 1
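
If only the state of the job is of interest, the full listing can be filtered. The following is a minimal sketch, assuming the "job_state = R" line format shown in the listing above. The helper name job_state is hypothetical.

```shell
# job_state (hypothetical helper) reads `qstat -f` output on standard input
# and prints the value of the job_state field, for example R.
job_state() {
    # Match the line whose first word is job_state and print the value
    # that follows the equals sign.
    awk '$1 == "job_state" && $2 == "=" { print $3; exit }'
}

# Typical use on the cluster (not executed here):
#   qstat -f 1723627 | job_state
```
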
$ qstat -r
ce64.ipb.ac.rs: 
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1719574.ce64.ipb     antun    hpsee    job.pbs-1         11654     1  --    --  48000 R 137:0
1722071.ce64.ipb     seevo133 see      STDIN             15011    21  --    --  48000 R 512:1
1722834.ce64.ipb     seevo133 see      STDIN             17868    21  --    --  48000 R 487:2
1722985.ce64.ipb     zeki     hpsee    job_5_25.pbs      11761     1  --    --  48000 R 01:55
1723147.ce64.ipb     zeki     hpsee    job_4.75_5.pbs    20279     1  --    --  48000 R 00:43
1723168.ce64.ipb     zeki     hpsee    job_19.6_20.pbs   25990     1  --    --  48000 R 02:02
$ qstat -q
server: ce64.ipb.ac.rs

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
dteam              --   4800:00: 50:00:00   --    0   0  5   E R
hpsee              --   48000:00 500:00:0   --    7  19 68   E R
atlas              --   48000:00 500:00:0   --   30   0 68   E R
sgdemo             --   48:00:00 72:00:00   --    0   0 --   E R
see                --   48000:00 500:00:0   --    8  22 68   E R
desktopg           --   48000:00 500:00:0   --    5   3  5   E R
seegrid            --   48000:00 500:00:0   --   10   0 68   E R
cms                --   48000:00 500:00:0   --    0   0 68   E R
ops                --   4800:00: 50:00:00   --    0   0  5   E R
aegis              --   48000:00 500:00:0   --    0  70 68   E R
                                                ----- -----
                                                  60   114
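
The per-queue numbers can also be extracted from this listing, for example to see how many jobs are waiting in the queue that job.pbs targets. The following is a minimal sketch, assuming the column layout shown above, where the sixth and seventh whitespace-separated fields are Run and Que. The helper name queue_load is hypothetical.

```shell
# queue_load (hypothetical helper) reads `qstat -q` output on standard input
# and prints the Run and Que counts for the queue named in $1.
queue_load() {
    # Fields: name, Memory, CPU Time, Walltime, Node, Run, Que, Lm, State.
    awk -v q="$1" '$1 == q { print $6, $7; exit }'
}

# Typical use on the cluster (not executed here):
#   qstat -q | queue_load hpsee
```

Applied to the hpsee row of the listing above, the function prints the two counts from the Run and Que columns, 7 and 19.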

10. The job finishes after about two minutes. List the contents of the <jobID>.ce64.ipb.ac.rs.out file:

$ cat <jobID>.ce64.ipb.ac.rs.out
Tue Oct 11 21:42:58 CEST 2011
n08.ipb.ac.rs
/nfs/ngrkic/serial