Code Block

language	bash

#SBATCH --account=ctbp-common
#SBATCH --partition=ctbp-common
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=64G
#SBATCH --gres=gpu:1
 
ml gomkl/2021a OpenMM/7.7.0-CUDA-11.4.2

NOTS (ctbp-onuchic)

This partition includes one GPU node, equipped with an AMD EPYC chip featuring 16 CPUs and 512GB of RAM, and 8 NVIDIA A40 GPUs with 48GB of memory each.

Code Block

language	bash

#SBATCH --account=ctbp-onuchic
#SBATCH --partition=ctbp-onuchic
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=64G
#SBATCH --gres=gpu:1
 
ml gomkl/2021a OpenMM/7.7.0-CUDA-11.4.2
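The two NOTS headers differ only in the account and partition names. As a minimal sketch, a small helper can print either one; the function name `nots_header` and its single argument are illustrative assumptions, not part of any cluster tooling.

```shell
#!/bin/bash
# Sketch: print the NOTS job header for a given queue.
# nots_header is a hypothetical helper; the directives are copied from the headers above.
nots_header() {
    local queue=$1
    case "$queue" in
        ctbp-common|ctbp-onuchic) ;;
        *) echo "unknown queue: $queue" >&2; return 1 ;;
    esac
    cat <<EOF
#!/bin/bash
#SBATCH --account=$queue
#SBATCH --partition=$queue
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=64G
#SBATCH --gres=gpu:1

ml gomkl/2021a OpenMM/7.7.0-CUDA-11.4.2
EOF
}

# Hypothetical usage: write a job file, then append your own run command.
#   nots_header ctbp-onuchic > job.slurm
```

The helper rejects any queue name other than the two documented here, so a typo fails before the job is submitted.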

OpenMM on NOTS

You can deploy and run your own version of OpenMM in a conda environment. To do so, first install OpenMM inside a conda environment, loading the modules already installed on NOTS. Note that in order to run on NVIDIA GPUs, OpenMM has to be compiled with CUDA/<version>.

Code Block

language	bash
title	Conda environment with OpenMM

# Load conda and gpu modules
module load Anaconda3/2022.05 CUDA/11.4.2

# Create the openmm environment
conda create --prefix $HOME/openmm

# Activate the new env.
source /opt/apps/software/Anaconda3/2022.05/bin/activate
conda activate $HOME/openmm

# Then install OpenMM. You can also install your preferred MD wrapper in the same step
conda install -c conda-forge openmm cudatoolkit=11.4.2 h5py openmichrom opensmog
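After installing, it can help to confirm that the environment works before submitting jobs. The sketch below assumes the environment lives at `$HOME/openmm` as above and uses OpenMM's bundled `openmm.testInstallation` module; the function name `check_openmm_env` is illustrative. A CUDA platform will only be reported when the check runs on a node with a GPU.

```shell
#!/bin/bash
# Sketch: check that the conda environment has a Python interpreter,
# then run OpenMM's installation test with it.
# check_openmm_env is a hypothetical helper name.
check_openmm_env() {
    local env_dir=${1:-$HOME/openmm}
    if [ ! -x "$env_dir/bin/python" ]; then
        echo "no python found in $env_dir" >&2
        return 1
    fi
    # Lists the platforms OpenMM can use (e.g. Reference, CPU, CUDA)
    "$env_dir/bin/python" -m openmm.testInstallation
}

# Hypothetical usage:
#   check_openmm_env "$HOME/openmm"
```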

The following is an example Slurm script that runs OpenMM from this environment.

Code Block

language	bash
title	Slurm running OpenMM via environment

#!/bin/bash -l

#SBATCH --account=ctbp-common
#SBATCH --partition=ctbp-common
#SBATCH --job-name=Template-OPENMM
#SBATCH --ntasks=1
#SBATCH --threads-per-core=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G
#SBATCH --gres=gpu:1
#SBATCH --time=00:05:00
#SBATCH --export=ALL

module purge
module load Anaconda3/2022.05 CUDA/11.4.2
source /opt/apps/software/Anaconda3/2022.05/bin/activate
conda activate $HOME/openmm

python your_script.py

ARIES

This partition includes 22 GPU nodes and 2 High Memory CPU nodes:

19x MI50 Nodes (gn01-gn19): 1x AMD EPYC 7642 processor (96 CPUs), 512GB RAM, 2TB storage, HDR Infiniband, 8x AMD Radeon Instinct MI50 32GB GPUs.
3x MI100 Nodes (gn20-gn22): 2x AMD EPYC 7V13 processors (128 CPUs), 512GB RAM, 2TB storage, HDR Infiniband, 8x AMD Radeon Instinct MI100 32GB GPUs.
2x Large Memory Nodes (hm01-02): 2x AMD EPYC 7302 processors (64 CPUs), 4TB RAM, 4TB storage, HDR Infiniband.

Of these, the 19 MI50 GPU nodes are each equipped with an AMD EPYC chip featuring 48 CPU cores and 512GB of RAM, plus 8 AMD MI50 GPUs with 32 GB of memory each. A job submitted to these nodes must launch 8 processes in parallel, each with a similar runtime, to minimize waiting time. This ensures that all of the GPUs are used efficiently.

...

Code Block

language	bash

#SBATCH --account=commons
#SBATCH --partition=commons
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=6
#SBATCH --threads-per-core=1
#SBATCH --mem-per-cpu=3G
#SBATCH --gres=gpu:8
#SBATCH --time=24:00:00
#SBATCH --export=ALL
 
module load foss/2020b OpenMM
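The header above requests all 8 GPUs of a node, so the job body has to start 8 runs at once. One possible way to do this is sketched below. It assumes that setting `ROCR_VISIBLE_DEVICES` restricts each process to a single AMD GPU and that your run command needs no other per-GPU configuration; the function name `run_per_gpu` and the log file names are illustrative.

```shell
#!/bin/bash
# Sketch: start one copy of a command per GPU in the background and wait for all.
# Assumption: ROCR_VISIBLE_DEVICES limits each process to the given AMD GPU index.
# run_per_gpu is a hypothetical helper name.
run_per_gpu() {
    local n_gpus=$1
    shift
    local pids=() gpu pid status=0
    for ((gpu = 0; gpu < n_gpus; gpu++)); do
        # Each copy sees only its own GPU and writes to its own log file
        ROCR_VISIBLE_DEVICES=$gpu "$@" > "run_gpu${gpu}.log" 2>&1 &
        pids+=("$!")
    done
    # Wait for every run; report failure if any of them failed
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1
    done
    return "$status"
}

# Hypothetical usage at the end of the job script:
#   run_per_gpu 8 python your_script.py
```

Because the job only finishes when the slowest process exits, runs of similar length keep all 8 GPUs busy for most of the allocation.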

PODS

This partition includes 80 GPU nodes, each equipped with an AMD EPYC chip featuring 48 CPUs and 512GB of RAM. In addition, each node includes 8 AMD MI50 GPUs with 32 GB of memory each.

...


Checking usage

To check that your job is running correctly on any of the clusters, connect with ssh to the compute node where it is running. Then use top to check CPU and memory usage, rocm-smi to check GPU usage on AMD Radeon GPUs, and nvidia-smi to check GPU usage on NVIDIA GPUs.
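Since the clusters mix NVIDIA and AMD nodes, a small helper can pick whichever monitoring tool is present on the node you are logged into. This is a minimal sketch; `pick_gpu_monitor` is an illustrative name, and the usage lines assume Slurm's `squeue` is available on the login node.

```shell
#!/bin/bash
# Sketch: choose the GPU monitoring tool available on the current node.
# pick_gpu_monitor is a hypothetical helper, not a cluster-provided command.
pick_gpu_monitor() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi"      # NVIDIA nodes
    elif command -v rocm-smi >/dev/null 2>&1; then
        echo "rocm-smi"        # AMD nodes
    else
        echo "no GPU monitoring tool found on this node" >&2
        return 1
    fi
}

# Hypothetical usage:
#   squeue -u "$USER" -o "%i %N"     # on the login node: job IDs and their nodes
#   ssh <node>                       # connect to the node running your job
#   monitor=$(pick_gpu_monitor) && "$monitor"
#   top -u "$USER"                   # CPU and memory usage of your processes
```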

Remote access to the clusters

...

Attachments

name	ARIES_Quick_Start_wl52_20220406.pdf
