Submitting a job to Slurm with a bash script
[USER@login1]$ cat submission.sh
#!/bin/bash
#SBATCH --nodes=1
srun jupyter nbconvert --to notebook --execute mynotebook.ipynb
[USER@login1]$ sbatch submission.sh
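When the submission is part of a larger script, it helps to capture the job ID. The sketch below relies on `sbatch --parsable`, which prints only the job ID (followed by `;<cluster>` on multi-cluster setups). The function name `submit_job` is hypothetical and not part of Slurm.

```shell
#!/bin/bash
# Submit a batch script and print only the numeric job ID.
# submit_job is a hypothetical helper name, not a Slurm command.
submit_job() {
    local script="$1" out
    out=$(sbatch --parsable "$script") || return 1
    # --parsable prints "jobid" or "jobid;cluster"; keep the part before ';'.
    printf '%s\n' "${out%%;*}"
}
```

Typical use would be `job_id=$(submit_job submission.sh)`, after which `squeue -j "$job_id"` shows only that job.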
Running a command directly from the command line
[USER@login1]$ conda activate my_env
(my_env)[USER@login1]$ srun jupyter nbconvert --to notebook --execute mynotebook.ipynb
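If the same notebook run is submitted with sbatch instead, one option is to activate the environment inside the batch script so the job does not depend on the state of the login shell. This is a minimal sketch that assumes `conda` is on the PATH of the compute node and that the environment is called `my_env` as above; the file name `submission_conda.sh` is hypothetical.

```shell
#!/bin/bash
# Write a batch script that activates the conda environment itself.
# The quoted 'EOF' keeps $(...) from being expanded while writing the file.
cat > submission_conda.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes=1

# Make `conda activate` usable in a non-interactive shell
# (assumes a standard conda install layout).
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate my_env

srun jupyter nbconvert --to notebook --execute mynotebook.ipynb
EOF
```

The resulting file is submitted the same way as before: `sbatch submission_conda.sh`.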
sinfo
$ sinfo
PARTITION  AVAIL  TIMELIMIT   NODES  STATE  NODELIST
normal*    up     2-00:00:00  3      idle   node[1-3]
optiplex   up     infinite    0      n/a
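The sinfo output is easy to filter with awk. As a rough sketch, the hypothetical helper `idle_partitions` below reads default-format sinfo output on stdin and prints the partition name and node count for every row whose STATE is `idle`. The trailing `*` that marks the default partition is stripped.

```shell
#!/bin/bash
# Print "<partition> <node count>" for rows in state "idle".
# idle_partitions is a hypothetical helper; it assumes the default
# sinfo column order shown in the sample output.
idle_partitions() {
    awk 'NR > 1 && $5 == "idle" { sub(/\*$/, "", $1); print $1, $4 }'
}

# Sample rows copied from the sinfo output; on a cluster: sinfo | idle_partitions
idle_partitions <<'EOF'
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up 2-00:00:00 3 idle node[1-3]
optiplex up infinite 0 n/a
EOF
# prints: normal 3
```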
sbatch
#!/bin/bash
#SBATCH -J ensemble # job name
#SBATCH -o ensemble.%j.out # standard output and error log
#SBATCH -p normal # queue name or partition name
#SBATCH -t 70:00:00 # maximum run time (hh:mm:ss)
samtools view alignments.bam > alignments.sam
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --job-name="A long job"
#SBATCH --mem=5GB
#SBATCH --output=long-job.out
cd /path/where/to/start/the/job
# This may vary per HPC system. On USC's HPC system
# we use: source /usr/usc/R/default/setup.sh
module load R
Rscript --vanilla long-job-rscript.R
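One thing worth checking before submitting is whether the requested time fits the partition's TIMELIMIT. The first script asks for 70:00:00, while the sinfo sample lists 2-00:00:00 for the normal partition; if both came from the same cluster, the request would exceed the limit. The hypothetical helper below converts the `[D-]HH:MM:SS` form to seconds so the two can be compared. It is a sketch that handles only that form, not the shorter formats Slurm also accepts.

```shell
#!/bin/bash
# Convert a Slurm time string of the form [D-]HH:MM:SS to seconds.
# slurm_time_to_seconds is a hypothetical helper name.
slurm_time_to_seconds() {
    local t="$1" days=0 h m s
    if [[ "$t" == *-* ]]; then
        days="${t%%-*}"   # part before the dash
        t="${t#*-}"       # part after the dash
    fi
    IFS=: read -r h m s <<< "$t"
    # 10# forces base 10 so that values like "08" are not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

requested=$(slurm_time_to_seconds 70:00:00)   # 252000
limit=$(slurm_time_to_seconds 2-00:00:00)     # 172800
if (( requested > limit )); then
    echo "requested time exceeds the partition limit"
fi
```

Depending on the site configuration, such a job is either rejected at submission or left pending, so lowering `-t` or choosing another partition is the usual fix.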
srun (interactive use: once the allocation starts, a new bash session starts on one of the allocated nodes)
salloc (a new bash session starts on the login node.)
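For the srun case, an interactive shell is usually requested with `--pty`. The wrapper below is only a sketch: `start_interactive` is a hypothetical name, and the default partition `normal` and the one-hour limit are assumptions taken from the samples in this post.

```shell
#!/bin/bash
# Start an interactive bash session on one allocated node.
# start_interactive is a hypothetical wrapper; defaults are assumptions.
start_interactive() {
    local partition="${1:-normal}" limit="${2:-01:00:00}"
    srun --partition="$partition" --time="$limit" --nodes=1 --pty bash
}
```

Calling `start_interactive normal 00:30:00` blocks until the allocation is granted and then drops into a shell on the compute node. With `salloc --partition=normal --time=00:30:00` the shell normally stays on the login node and commands are sent to the allocation with `srun`, although some sites configure salloc differently.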
squeue
$ squeue
JOBID  NAME      STATE    USER     GROUP   PARTITION  NODE  NODELIST  CPUS  TRES_PER_NODE  TIME_LIMIT  TIME_LEFT
6539   ensemble  RUNNING  dhj1     usercl  TITANRTX   1     n1        4     gpu:4          3-00:00:00  1-22:57:11
6532   bash      PENDING  gildong  usercl  2080ti     1     n2        1     gpu:8          3-00:00:00  2-03:25:06
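To get a quick summary of the queue, the squeue output can be counted per state. The sketch below assumes the customized column layout of the sample, where STATE is the third column (the stock squeue layout differs), and that job names contain no spaces. `count_by_state` is a hypothetical helper name.

```shell
#!/bin/bash
# Count jobs per state from squeue output read on stdin.
# Assumes STATE is column 3, as in the sample output.
count_by_state() {
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Sample rows copied from the squeue output; on a cluster: squeue | count_by_state
count_by_state <<'EOF'
JOBID NAME STATE USER GROUP PARTITION NODE NODELIST CPUS TRES_PER_NODE TIME_LIMIT TIME_LEFT
6539 ensemble RUNNING dhj1 usercl TITANRTX 1 n1 4 gpu:4 3-00:00:00 1-22:57:11
6532 bash PENDING gildong usercl 2080ti 1 n2 1 gpu:8 3-00:00:00 2-03:25:06
EOF
# prints:
# PENDING 1
# RUNNING 1
```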
scancel
$ scancel 6539
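Besides a single job ID, scancel can also filter by user and state. As a small sketch, the hypothetical wrapper `cancel_pending` cancels every job of one user that is still waiting in the queue and leaves running jobs alone.

```shell
#!/bin/bash
# Cancel all PENDING jobs of the given user (defaults to the current user).
# cancel_pending is a hypothetical wrapper around scancel.
cancel_pending() {
    local user="${1:-$USER}"
    scancel --user="$user" --state=PENDING
}
```

For example, `cancel_pending gildong` would remove job 6532 from the queue shown above but not touch any running job.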
Inspecting job details
$ scontrol show job 3217
JobId=3217 JobName=ssw_test
UserId=moasys1(100001107) GroupId=in0011(1000011) MCS_label=N/A
Priority=4294901630 Nice=0 Account=kat_user QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:05 TimeLimit=01:00:00 TimeMin=N/A
SubmitTime=2018-04-30T17:54:07 EligibleTime=2018-04-30T17:54:07
StartTime=2018-04-30T17:54:07 EndTime=2018-04-30T18:54:08 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
Partition=ivy_v100_2 AllocNode:Sid=login-tesla02:9203
ReqNodeList=(null) ExcNodeList=(null)
NodeList=tesla[03-04]
BatchHost=tesla03
NumNodes=2 NumCPUs=40 NumTasks=4 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=40,node=2,gres/gpu=2
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=gpu Reservation=(null)
OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
Command=./kat-2.sh
WorkDir=/scratch2/moasys1/ssw/moasys1/kat_test
StdErr=/scratch2/moasys1/ssw/moasys1/kat_test/ssw.e3217
StdIn=/dev/null
StdOut=/scratch2/moasys1/ssw/moasys1/kat_test/ssw.o3217
Power=
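Because the `scontrol show job` output is a list of `Key=Value` pairs, single fields are easy to pull out in a script. The hypothetical helper `job_field` below is a sketch that splits the output on whitespace, so it assumes the value itself contains no spaces.

```shell
#!/bin/bash
# Print the value of one Key=Value field from `scontrol show job` output
# read on stdin. job_field is a hypothetical helper name.
job_field() {
    local key="$1"
    # One pair per line, then print everything after "<key>=".
    tr -s ' ' '\n' | awk -F= -v k="$key" '$1 == k { print substr($0, length(k) + 2) }'
}

# Sample lines copied from the output; on a cluster:
#   scontrol show job 3217 | job_field JobState
job_field JobState <<'EOF'
JobId=3217 JobName=ssw_test
JobState=RUNNING Reason=None Dependency=(null)
RunTime=00:00:05 TimeLimit=01:00:00 TimeMin=N/A
EOF
# prints: RUNNING
```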
Adjusting job priority
$ sudo scontrol update job=1465 nice=-100
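A negative nice value raises the priority and needs administrator rights, which is why the command above runs under sudo. A regular user can still lower the priority of their own job with a positive value. The wrapper below is a sketch; `lower_priority` is a hypothetical name and the default of 100 is an assumption.

```shell
#!/bin/bash
# Lower the scheduling priority of a job by setting a positive nice value.
# lower_priority is a hypothetical wrapper around scontrol.
lower_priority() {
    local job_id="$1" nice="${2:-100}"
    if (( nice < 0 )); then
        echo "negative nice values require administrator privileges" >&2
        return 1
    fi
    scontrol update JobId="$job_id" Nice="$nice"
}
```

`lower_priority 1465 50` would let other pending jobs go first, while `lower_priority 1465 -100` is refused by the wrapper before scontrol is called.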
References
https://doheejin.github.io/linux/2021/02/18/linux-slurm.html
https://www.biostars.org/p/453787/
https://support.nesi.org.nz/hc/en-gb/articles/360001316356-Slurm-Interactive-Sessions
https://dandyrilla.github.io/2017-04-11/jobsched-slurm/