...

Jobs are allocated by node, not by GPU. Since there are 8 GPUs per node, each submitted job must be able to use all 8 GPUs. Remember to set the --gres=gpu option properly in your script, as it is not set automatically. Below, we provide examples of how to use this resource effectively.
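
For reference, here is a minimal sketch of a submission script requesting a full node. The job name, walltime, and executable are placeholders, and depending on your account setup you may also need a partition directive:

Code Block (bash)
#!/bin/bash
#SBATCH --job-name=my_gpu_job    # hypothetical job name
#SBATCH --nodes=1                # Aries allocates whole nodes
#SBATCH --gres=gpu:8             # request all 8 GPUs; not set automatically
#SBATCH --time=24:00:00          # placeholder walltime; adjust to your workload

srun ./my_application            # hypothetical executable that uses all 8 GPUs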

Since each job submission on Aries runs exclusively on an entire GPU node, the user must launch 8 parallel GPU processes (or launch a single process that can use all 8 GPUs simultaneously). Ideally, all 8 parallel runs should have similar run times to maximize efficiency. Make sure each of the 8 parallel runs writes to a differently named output file, as in the sketch below.
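
The following is a minimal sketch of launching 8 independent runs inside a job script, one per GPU. It assumes a ROCm software stack (the GPUs are AMD), where HIP_VISIBLE_DEVICES pins a process to a single GPU; the executable and input/output names are hypothetical:

Code Block (bash)
# launch 8 independent runs, one per GPU, each with its own output file
for i in $(seq 0 7); do
    HIP_VISIBLE_DEVICES=$i ./my_run input_$i.inp > output_$i.log 2>&1 &
done
wait   # the job finishes only after the slowest of the 8 runs completes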

...

Code Block (bash): create dir
mkdir /work/cms16/$USER

Write all generated data to your directory in /work/cms16/username, not to your /home directory. Your /home directory can store around 20 GB, while the /work partition has a total of 100 TB. Regularly remove your data from /work to ensure the stability of the resource. /work is not backed up!
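
As a quick way to keep an eye on your footprint, something like the following can be used; the subdirectory to delete is only an illustration:

Code Block (bash)
# check how much space your data currently occupies in /work
du -sh /work/cms16/$USER
# remove data you no longer need (hypothetical subdirectory)
rm -rf /work/cms16/$USER/finished_project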

Important note: set up SSH communication keys between nodes, as explained in the PDF at the end of this page (ARIES_Quick_Start_wl52_20220406.pdf).
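
The PDF is the authoritative reference; as a rough sketch, assuming the nodes share your /home directory, passwordless SSH between them can be set up along these lines:

Code Block (bash)
# generate a key pair with no passphrase
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
# authorize the key for logins between nodes (works because /home is shared)
cat ~/.ssh/id_ed25519.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys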

...

This partition contains 22 GPU nodes (gn01-gn22). Each node contains 8 AMD Vega20 GPUs, 96 CPUs, and 512 GB of memory, except for the last three nodes (gn20-gn22), which each have 128 CPUs and 8 AMD MI100 GPUs.

...

This container does not include OpenSMOG or other CTBP-specific tools; they will need to be installed with pip3. For example, to install OpenSMOG:

Code Block (bash)
pip3 install OpenSMOG
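
To verify the installation succeeded, a quick import check can be run (assuming the package imports under the name OpenSMOG):

Code Block (bash)
python3 -c "import OpenSMOG"   # exits silently if the install worked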

...