Serial job example
Latest revision as of 21:57, 13 October 2011
In this exercise the user obtains a prepared simple job, extracts the archive, lists the contents of its files, submits the job to the PARADOX cluster, monitors its progress using information from the queue and, when the job is done, lists the results file.
1. Log in to ui.ipb.ac.rs:
$ ssh ngrkic@ui.ipb.ac.rs
2. Navigate to your folder on the NFS filesystem:
$ cd /nfs/ngrkic
3. Download the tgz archive with the example files:
$ wget http://wiki.ipb.ac.rs/images/d/db/Serial.tgz
4. Extract the archive:
$ tar xvzf Serial.tgz
5. Enter the Serial folder:
$ cd Serial
6. List the contents of the folder:
$ ll
7. List the contents of the job.pbs and job.sh files:
$ cat job.pbs
#!/bin/bash
#PBS -q hpsee
#PBS -l nodes=1:ppn=1
#PBS -l walltime=10:00:00
#PBS -e ${PBS_JOBID}.err
#PBS -o ${PBS_JOBID}.out
cd $PBS_O_WORKDIR
chmod +x job.sh
./job.sh

$ cat job.sh
#!/bin/bash
date
hostname
pwd
sleep 120
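The lines in job.pbs that start with #PBS are directives read by qsub: -q selects the queue, -l requests resources (nodes, processors per node, walltime), and -e and -o name the files for standard error and standard output. As a minimal sketch, the directives of any job script can be listed on their own like this; the function name list_pbs_directives is a hypothetical helper, not part of PBS:

```shell
# list_pbs_directives FILE   (hypothetical helper, not a PBS command)
# Print every "#PBS" directive of a job script without the "#PBS" prefix,
# so the queue and resource requests can be reviewed before submitting.
list_pbs_directives() {
    sed -n 's/^#PBS[[:space:]]*//p' "$1"
}
```

Running list_pbs_directives job.pbs in the Serial folder should print one line per directive, starting with the queue request.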
8. Submit the job:
$ qsub job.pbs
qsub prints the job identifier:
<jobID>.ce64.ipb.ac.rs
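Because qsub writes the job identifier to standard output, it can be captured in a shell variable and reused in later commands. The sketch below is only an illustration: job_number is a hypothetical helper that strips the server suffix, and the qsub call itself is left as a comment because it needs access to the cluster.

```shell
# job_number FULL_ID   (hypothetical helper)
# Strip the server suffix from an identifier such as 1723627.ce64.ipb.ac.rs
# and print only the leading job number.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Typical use on the user interface node (requires a working qsub):
# JOBID=$(qsub job.pbs)
# echo "Submitted job $(job_number "$JOBID") as $JOBID"
```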
9. Monitor your job:
$ qstat <jobID>.ce64.ipb.ac.rs
$ qstat -u ngrkic
ce64.ipb.ac.rs:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1723627.ce64.ipb     ngrkic   hpsee    job.pbs           32715     1  --    --  48000 R   --
$ qstat -f 1723627
Job Id: 1723627.ce64.ipb.ac.rs
    Job_Name = job.pbs
    Job_Owner = ngrkic@ui.ipb.ac.rs
    job_state = R
    queue = hpsee
    server = ce64.ipb.ac.rs
    Checkpoint = u
    ctime = Thu Oct 13 22:57:53 2011
    Error_Path = ui.ipb.ac.rs:/nfs/ngrkic/serial/${PBS_JOBID}.err
    exec_host = n08.ipb.ac.rs/0
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = a
    mtime = Thu Oct 13 22:57:54 2011
    Output_Path = ui.ipb.ac.rs:/nfs/ngrkic/serial/${PBS_JOBID}.out
    Priority = 0
    qtime = Thu Oct 13 22:57:53 2011
    Rerunable = True
    Resource_List.cput = 48000:00:00
    Resource_List.nodect = 1
    Resource_List.nodes = 1:ppn=1
    Resource_List.walltime = 500:00:00
    session_id = 3759
    Variable_List = PBS_O_HOME=/home/ngrkic,PBS_O_LANG=en_US.UTF-8,
        PBS_O_LOGNAME=ngrkic,
        PBS_O_PATH=/opt/glite/yaim/bin:/usr/kerberos/bin:/opt/d-cache/srm/bin
        :/opt/d-cache/dcap/bin:/opt/edg/bin:/opt/glite/bin:/opt/globus/bin:/op
        t/lcg/bin:/usr/local/bin:/bin:/usr/bin:/opt/intel/Compiler/11.1/064/bi
        n/intel64/:/home/ngrkic/bin:/opt/intel/Compiler/11.1/059/bin/intel64/,
        PBS_O_MAIL=/var/spool/mail/ngrkic,PBS_O_SHELL=/bin/bash,
        PBS_SERVER=ui.ipb.ac.rs,PBS_O_HOST=ui.ipb.ac.rs,
        PBS_O_WORKDIR=/nfs/ngrkic/serial,PBS_O_QUEUE=hpsee
    etime = Thu Oct 13 22:57:53 2011
    submit_args = job.pbs
    start_time = Thu Oct 13 22:57:54 2011
    start_count = 1
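In the full listing the job_state attribute carries the same single-letter status as the S column of the short listing, for example Q for queued and R for running. A minimal sketch for pulling just that value out of the qstat -f output follows; job_state here is a hypothetical shell function, and it assumes the attribute is printed on its own line as "job_state = X":

```shell
# job_state   (hypothetical helper)
# Read "qstat -f" output on standard input and print the value of the
# job_state attribute (for example Q or R).
job_state() {
    sed -n 's/^[[:space:]]*job_state = //p'
}

# Usage on the cluster (requires a working qstat):
# qstat -f 1723627 | job_state
```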
$ qstat -r
ce64.ipb.ac.rs:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1719574.ce64.ipb     antun    hpsee    job.pbs-1         11654     1  --    --  48000 R 137:0
1722071.ce64.ipb     seevo133 see      STDIN             15011    21  --    --  48000 R 512:1
1722834.ce64.ipb     seevo133 see      STDIN             17868    21  --    --  48000 R 487:2
1722985.ce64.ipb     zeki     hpsee    job_5_25.pbs      11761     1  --    --  48000 R 01:55
1723147.ce64.ipb     zeki     hpsee    job_4.75_5.pbs    20279     1  --    --  48000 R 00:43
1723168.ce64.ipb     zeki     hpsee    job_19.6_20.pbs   25990     1  --    --  48000 R 02:02
$ qstat -q
server: ce64.ipb.ac.rs

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
dteam              --   4800:00: 50:00:00   --    0   0  5   E R
hpsee              --   48000:00 500:00:0   --    7  19 68   E R
atlas              --   48000:00 500:00:0   --   30   0 68   E R
sgdemo             --   48:00:00 72:00:00   --    0   0 --   E R
see                --   48000:00 500:00:0   --    8  22 68   E R
desktopg           --   48000:00 500:00:0   --    5   3  5   E R
seegrid            --   48000:00 500:00:0   --   10   0 68   E R
cms                --   48000:00 500:00:0   --    0   0 68   E R
ops                --   4800:00: 50:00:00   --    0   0  5   E R
aegis              --   48000:00 500:00:0   --    0  70 68   E R
                                               ----- -----
                                                  60   114
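The last line of the qstat -q listing gives the total number of running and queued jobs across all queues (60 and 114 in the listing above). As a rough sketch, the same totals can be recomputed from the per-queue rows with awk. The function name queue_totals is hypothetical, and the field positions assume the column layout shown above, where each queue row splits into ten whitespace-separated fields because the State column contains two letters:

```shell
# queue_totals   (hypothetical helper)
# Read "qstat -q" output on standard input and print the summed
# Run and Que columns as "RUN QUE".
queue_totals() {
    awk '
        # A queue row has 10 fields with numeric Run ($6) and Que ($7) values;
        # the header, the dashed rules and the totals line do not match.
        NF == 10 && $6 ~ /^[0-9]+$/ && $7 ~ /^[0-9]+$/ { run += $6; que += $7 }
        END { print run + 0, que + 0 }
    '
}

# Usage on the cluster (requires a working qstat):
# qstat -q | queue_totals
```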
10. Once it starts running, the job takes about two minutes to finish (job.sh sleeps for 120 seconds). When it is done, list the contents of the <jobID>.ce64.ipb.ac.rs.out file:
$ cat <jobID>.ce64.ipb.ac.rs.out
Tue Oct 11 21:42:58 CEST 2011
n08.ipb.ac.rs
/nfs/ngrkic/serial
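Since a job may wait in the queue before it starts, the output file is not always there after exactly two minutes. The sketch below polls for the file instead of waiting a fixed time. It assumes that the .out file only appears in the working directory once the job has finished, which should be checked against the cluster's configuration; wait_for_file and its arguments are hypothetical names chosen for this example.

```shell
# wait_for_file FILE [MAX_POLLS] [INTERVAL_SECONDS]   (hypothetical helper)
# Poll until FILE exists. Returns 0 as soon as the file is found and
# non-zero if it is still missing after MAX_POLLS checks.
wait_for_file() {
    file=$1
    polls=${2:-30}       # number of checks before giving up (default 30)
    interval=${3:-10}    # seconds to sleep between checks (default 10)
    while [ "$polls" -gt 0 ]; do
        [ -e "$file" ] && return 0
        sleep "$interval"
        polls=$((polls - 1))
    done
    [ -e "$file" ]
}

# Usage, with JOBID holding the identifier printed by qsub:
# wait_for_file "$JOBID.out" && cat "$JOBID.out"
```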