Welcome to the PNGC Slurm Cluster.
The head node is a small instance. Please don’t run your jobs or scripts on the head node; they will be killed.
The PNGC Cluster consists of a head node (pngc-head01) and the following child nodes:
PARTITION | NODELIST |
---|---|
defq* | pngc-node01 |
cadreq | cadre-node01 |
gcadcbq | gcadcb-node01 |
gcadccq | gcadcc-node01 |
najq | naj-node01 |
niagadsq | niagads-node01 |
readdq | readd-node01 |
wleeq | wlee-node01 |
Your home folder, /home/<your-user-name>, has only 1 TB of space, so please don’t save lots of files there.
You can install your software in /home/<your-user-name> or in your /s3buckets/<your-project-s3-buckets> folder.
You can work and save your files in /home/<your-user-name> (limited space) or in your /s3buckets/<your-project-s3-buckets> folder.
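To keep an eye on how much of the home folder is in use, something like the following may help. This is only a minimal sketch: `du` and `df` are standard tools, but `home_usage_kb` is a hypothetical helper name introduced here for illustration, not something the cluster provides.

```shell
#!/bin/sh
# Print the size of a directory in kilobytes (defaults to $HOME).
# home_usage_kb is a hypothetical helper, not a cluster-provided command.
home_usage_kb() {
    du -sk "${1:-$HOME}" 2>/dev/null | awk '{print $1}'
}

# Typical usage on the cluster:
#   home_usage_kb            # size of your home folder in KB
#   du -sh "$HOME"           # the same, in human-readable form
#   df -h "$HOME"            # free space on the filesystem holding it
```

Large data sets are better kept under the project bucket folder, since the home folder is small.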
The S3 buckets for your projects:
- acad-psom-s3-bucket-01
- adamnaj-lab-psom-s3-bucket-01
- cadre-psom-s3-bucket-01
- gcad-coreb-psom-s3-bucket-01
- gcad-corec-psom-s3-bucket-01
- niagads-psom-s3-bucket-01
- pngc-psom-s3-bucket-01
- readd-adsp-psom-s3-bucket-01
- wanpinglee-lab-psom-s3-bucket-01
- wanpingleelab-deep-archive
The S3 buckets and datasets shared with your projects:
- ADGCdatasets
- BROAD
- GCADCOREBdatasets
- NIH
All S3 buckets, including the shared datasets, are mounted on the head node.
Each child node mounts only its own project’s S3 bucket and the datasets shared with that project.
For example, cadre-node01 has mounted:
- BROAD
- NIH
- cadre-psom-s3-bucket-01
Each project S3 bucket has users’ folders and a “public-rawdata” folder.
The public-rawdata folder in each bucket is used for transferring data from external S3 buckets into the PNGC cluster with the AWS CLI. Data in public-rawdata is forced to have permissions for your project group. After a transfer with the AWS CLI, allow up to 5 minutes for the files to show up.
For example:
# ls -l /s3buckets/cadre-psom-s3-bucket-01-nfs/
total 0
drwxr-x--- 1 user1 cadre 0 Dec 4 12:35 user1
-rw-r--r-- 1 root root 0 Jan 5 09:05 cadre-psom-s3-bucket-01.txt
drwxr-x--- 1 user2 cadre 0 Dec 4 12:35 user2
drwxrwxr-x 1 root cadre 0 Dec 12 21:39 public-rawdata
drwxr-x--- 1 user3 cadre 0 Dec 4 12:35 user3
#
You can access your project’s S3 storage at:
/s3buckets/<project-name-psom-s3-bucket-01>/<your-user-name>
/s3buckets/<project-name-psom-s3-bucket-01>/public-rawdata
The public-rawdata folder can receive data from external S3 buckets or from your local files, in both cases by using the AWS CLI.
Anything placed in public-rawdata is forced to have permissions for your project group.
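Because transferred files can take a few minutes to appear under the mounted path, a small polling helper may be convenient. The sketch below is an illustration under stated assumptions: `wait_for_file` is a hypothetical helper, the source bucket in the commented example is a placeholder, and it assumes that the `public-rawdata/` key prefix in the bucket corresponds to the mounted public-rawdata folder.

```shell
#!/bin/sh
# Wait until a file appears, polling at a fixed interval.
# Usage: wait_for_file PATH [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# wait_for_file is a hypothetical helper; the 300-second default mirrors
# the "up to 5 minutes" delay described above.
wait_for_file() {
    path=$1
    timeout=${2:-300}
    interval=${3:-10}
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out after ${timeout}s waiting for $path" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "Found $path"
}

# Example (replace every <placeholder>; the source bucket is hypothetical):
#   aws s3 cp s3://<external-bucket>/<file> \
#       s3://<project-name-psom-s3-bucket-01>/public-rawdata/<file>
#   wait_for_file /s3buckets/<project-name-psom-s3-bucket-01>/public-rawdata/<file>
```

Large transfers are best started from a compute node rather than the head node.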
Commonly used software can be found in /applications
(located on the head node and shared to all child nodes).
Cluster Head Node Use:
The High Performance Cluster head node is a small server configured for simple processes that do not require much memory or CPU. It is specifically designed for script editing and job submission (via Slurm). The OS of the cluster is Amazon Linux 2 (similar to CentOS 7, not Ubuntu).
The head node should not be used for data analysis or for transferring large amounts of data. Users should not run batch, Conda, R, or Python scripts on the head node. These kinds of scripted jobs should run in interactive jobs or be included in Slurm jobs on compute nodes. Any such script that is run on the head node will be terminated.
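As an illustration of moving work off the head node, the sketch below writes a minimal Slurm batch script and submits it only when `sbatch` is available. The file name, job name, and resource numbers are assumptions chosen for the example; adjust the partition to the queue for your project.

```shell
#!/bin/sh
# Write a minimal Slurm batch script and submit it if sbatch is available.
# JOB_SCRIPT, the job name, and the resource values are illustrative assumptions.
JOB_SCRIPT="${JOB_SCRIPT:-./example_job.sh}"

cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=defq
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
#SBATCH --time=01:00:00
#SBATCH --output=example_%j.log

# The analysis runs here, on a compute node, not on the head node.
echo "Running on $(hostname)"
EOF

# Submit only where Slurm is installed (that is, on the cluster).
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$JOB_SCRIPT"
else
    echo "sbatch not found; wrote $JOB_SCRIPT without submitting" >&2
fi
```

For interactive work, the usual Slurm pattern is along the lines of `srun --partition=defq --pty bash`, which opens a shell on a compute node.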
Cluster Maintenance:
Users should not log in to the cluster during scheduled cluster maintenance. During maintenance, all compute nodes are shut down, and all login sessions and jobs are terminated. Users who attempt to log in during maintenance may cause it to fail and require a rollback, further extending the outage.
Slurm Queues and Job Submission:
The queues available to each user and their purposes are:
Queue | Purpose |
---|---|
defq | default queue |
cadreq | queue for the cadre project |
gcadcbq | queue for the gcad core b project |
gcadccq | queue for the gcad core c project |
najq | queue for the Adam Naj lab project |
niagadsq | queue for the niagads project |
readdq | queue for the readd adsp project |
wleeq | queue for the Wan-Ping Lee lab project |
By default, each user is capped at 48 cores (normal QoS) at any given time to ensure equitable use of the system.
The QoS levels for your projects:
QoS Name | Priority | Max Resources |
---|---|---|
normal | 10 | cpu=48, mem=384G |
high | 20 | cpu=16, mem=128G |
normal-naj | 10 | cpu=16, mem=64G |
high-naj | 20 | cpu=8, mem=32G |
normal-gcadcb | 10 | cpu=32, mem=256G |
high-gcadcb | 20 | cpu=16, mem=128G |
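Before submitting, it can be useful to check a CPU request against the table above. The following is a rough sketch only: `qos_cpu_cap` and `fits_qos` are hypothetical helpers that hard-code the CPU values from the table, and they check a single request rather than your total running usage, which is what the cap actually limits.

```shell
#!/bin/sh
# CPU caps copied from the QoS table above.
# qos_cpu_cap and fits_qos are hypothetical helpers, not part of Slurm.
qos_cpu_cap() {
    case "$1" in
        normal)        echo 48 ;;
        high)          echo 16 ;;
        normal-naj)    echo 16 ;;
        high-naj)      echo 8 ;;
        normal-gcadcb) echo 32 ;;
        high-gcadcb)   echo 16 ;;
        *)             return 1 ;;
    esac
}

# Succeeds (exit 0) when the requested CPU count is within the QoS cap.
# Usage: fits_qos QOS_NAME CPU_COUNT
fits_qos() {
    cap=$(qos_cpu_cap "$1") || { echo "Unknown QoS: $1" >&2; return 2; }
    [ "$2" -le "$cap" ]
}

if fits_qos high 16; then
    echo "16 CPUs fit under the high QoS"
fi
# A matching submission might look like (script name is a placeholder):
#   sbatch --partition=defq --qos=high --cpus-per-task=16 --mem=64G example_job.sh
```

The limits that Slurm actually enforces can be inspected on the cluster with `sacctmgr show qos`.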
Your cluster account has been created, and your login information has been sent to you by email.
I’ve attached some information below to get you started. Please let me know if you have any questions.