Links
NERSC Container Tutorial 1 (2 and 3 also)
When you do the first set of tasks at the end, the exercises are:
1. Use podman-hpc
2. Pull awlavely/adamslolcow from Docker Hub
3. Run this on a login node
4. Run this with an interactive job
5. Run this as a batch job
6. Repeat 2-5 with Shifter
Notes
1. Nothing to add.
2. When you pull, prefix the image name with "docker.io/" and append the version tag; otherwise you will get a login/denied error:
   podman-hpc pull docker.io/awlavely/adamslolcow:1.0
3. Nothing to add.
4. I had to modify the switches and set the account:
   salloc --nodes 1 --time 01:00:00 -C cpu --qos interactive --account mXXXX
   It is probably a good idea to set the time to 00:05:00 (five minutes).
5. This is fine. Create an XXX.job file based on the example on the slide and submit it with sbatch.
6. Shifter: the same caveat about time applies. Five minutes is plenty and will limit your account charges.
   - pull
   - run on login node
   - run with an interactive job
   - run as a batch job
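The podman-hpc notes above can be collected into a short shell sketch. The helper names `qualify_image` and `job_script`, the `debug` QOS, and the account placeholder `mXXXX` are assumptions for illustration and are not taken from the tutorial slides. The functions only build text, so the sketch can be tried without touching the registry or the scheduler.

```shell
#!/bin/bash
# Build a fully qualified image reference: registry prefix plus version tag.
# qualify_image is a hypothetical helper name, not part of podman-hpc.
qualify_image() {
    printf 'docker.io/%s:%s\n' "$1" "$2"
}

# Print a minimal batch job that runs the image once.
# Assumptions: QOS "debug" and the account placeholder mXXXX.
job_script() {
    cat <<EOF
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --time 00:05:00
#SBATCH -C cpu
#SBATCH --qos debug
#SBATCH --account mXXXX
srun podman-hpc run --rm $1
EOF
}

image="$(qualify_image awlavely/adamslolcow 1.0)"
echo "$image"

# On a login node or inside an salloc session (not executed here):
#   podman-hpc pull "$image"
#   podman-hpc run --rm "$image"

# Preview the job file; redirect it to a file such as lolcow.job and
# submit that file with sbatch.
job_script "$image"
```

Keeping the image reference in one variable means the pull, the run, and the batch job all use the same fully qualified name.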
Gromacs on Perlmutter
This requires Shifter, so make sure you understand how it works.
Find the gromacs images:
cmccall@login22: shifterimg images | grep 'gromacs'
…
Pull the latest NERSC image:
cmccall@login22: shifterimg pull nersc/gromacs:23.2
run that image as a container in a shell ("exit" to exit):
cmccall@login22: shifter --image=nersc/gromacs:23.2 --entrypoint
do some things..
cmccall@login22: exit
Run an interactive job:
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> salloc --nodes 1 --time 01:00:00 -C cpu --qos interactive --account m
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ gmx_mpi mdrun -f 2.0mM_System.top …
run an alternate container in a shell:
cmccall@login22: shifter --image=nvcr.io/hpc/gromacs:2023.2 --entrypoint
cmccall@login22: exit
Note that version 2023.2 is early enough that it will produce TPR files that are compatible with our analysis.
Python/Martiniglass
(see also this document on Python from NERSC)
Create a conda environment on Perlmutter outside the container:
cmccall@login22: module load conda
cmccall@login22: conda create -n martini python=3.11
(some messages from conda starting with… you will need to install some things)
Channels:
- conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
Activate the environment and install martiniglass:
cmccall@login22: conda activate martini
(martini) cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> pip install martiniglass
…messages ending with
Installing collected packages: pbr, numpy, networkx, scipy, vermouth, martiniglass
Successfully installed martiniglass-1.1.3 networkx-3.6.1 numpy-2.4.4 pbr-7.0.3 scipy-1.17.1 vermouth-0.15.0
Remember to deactivate this environment; the "(martini)" prefix should vanish from the prompt.
(martini) cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> conda deactivate
Next, run an interactive shifter job (described on this page, scroll down to "Interactive Shifter Jobs"):
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> shifter --image=nersc/gromacs:23.2 /bin/bash
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$
Then, inside your Shifter job, find out where conda is installed and source its startup script to activate the environment:
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ which conda
/global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/condabin/conda
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ source /global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/etc/profile.d/conda.sh
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ conda activate martini
(martini) cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$
Your Python tools should then coexist with the `gmx` tools already present in the container.
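The step from the `which conda` path to the startup script can be sketched as a small helper. The function name `conda_profile_script` is hypothetical, and the sketch assumes the usual conda layout in which `conda.sh` sits under `etc/profile.d` two directory levels above the conda executable, as it does for the path in the transcript above.

```shell
#!/bin/bash
# Derive the conda startup script from the path that `which conda` prints.
# conda_profile_script is a hypothetical helper name; it assumes the layout
# <prefix>/condabin/conda and <prefix>/etc/profile.d/conda.sh.
conda_profile_script() {
    local prefix
    prefix="$(dirname "$(dirname "$1")")"
    printf '%s/etc/profile.d/conda.sh\n' "$prefix"
}

# Example with the path shown in the transcript above.
conda_profile_script /global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/condabin/conda

# Inside the container one could then run (not executed here):
#   source "$(conda_profile_script "$(which conda)")"
#   conda activate martini
```

Deriving the path this way avoids hard-coding the conda version, which may change when the module is updated.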
You can also wrap the entire workflow in a batch script so that the trajectory reduction runs automatically after the simulation, which is a good approach given the resulting reduction in file size.
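A minimal sketch of the reduction step as a batch job follows. It is not a tested production script. The function name `reduce_job`, the `regular` QOS, the 30-minute limit, and the account placeholder `mXXXX` are assumptions; the image, the conda path, the working directory, and `~/dry_martini.sh` are taken from these notes. The function only prints the job text, so it can be previewed before anything is submitted.

```shell
#!/bin/bash
# Print a batch job that activates the martini environment inside the
# Shifter image and runs the reduction script from these notes.
# Assumptions: QOS "regular", a 30-minute limit, account placeholder mXXXX.
reduce_job() {
    local label="$1"
    local conda_sh=/global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/etc/profile.d/conda.sh
    cat <<EOF
#!/bin/bash
#SBATCH --image docker:nersc/gromacs:23.2
#SBATCH -C cpu
#SBATCH -q regular
#SBATCH -t 00:30:00
#SBATCH -N 1
#SBATCH -A mXXXX
cd /global/homes/c/cmccall/TTA/ABZO/${label} || exit 1
srun -n 1 shifter /bin/bash -c "source ${conda_sh} && conda activate martini && ~/dry_martini.sh ${label} run"
EOF
}

# Preview the job for the 2.0mM system; redirect the output to a file
# and submit that file with sbatch.
reduce_job 2.0mM
```

To run the reduction automatically after a simulation, the same `srun ... shifter /bin/bash -c` line could be appended to the end of the simulation job instead of being submitted separately.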
Running from an interactive node reservation
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> salloc --nodes 1 --time 01:00:00 -C cpu --qos interactive --account m
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> shifter --image=nersc/gromacs:23.2 --entrypoint
cmccall@nid004219:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ source /global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/etc/profile.d/conda.sh
cmccall@nid004219:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ conda activate martini
(martini) cmccall@nid004219:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ ~/dry_martini.sh 2.0mM run
Reading input 2.0mM_vis.top
Writing visualisable topology files
Writing a waterless index file. Here're some helpful commands for reference:
gmx trjconv -f traj_comp.xtc -s topol.tpr -pbc mol -n index.ndx -e 0 -o vis.gro
gmx trjconv -f traj_comp.xtc -s topol.tpr -pbc mol -n index.ndx -o vis.xtc
(etc)
For a 2000 ns (2 μs) simulation at 1.25 mM with L = 43.5, the script took 10 minutes of CPU time on a compute node.
Example Slurm script
#!/bin/bash
#SBATCH --image docker:nersc/gromacs:23.2
#SBATCH -C gpu
#SBATCH -t 20:00:00
#SBATCH -J 1.75mM_ABZO
#SBATCH -o Gromacs_GPU.o%j
#SBATCH -A mxxxx_g
#SBATCH -q preempt
#SBATCH -N 1
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-task=1
#SBATCH -c 16
#SBATCH --mail-user=mmccallum@pacific.edu
#SBATCH --mail-type=ALL
##
label="1.75mM"
cd /global/homes/c/cmccall/TTA/ABZO/${label} || exit 1   # stop if the run directory is missing
export GMX_ENABLE_DIRECT_GPU_COMM=true
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Equilibration: grompp (builds the .tpr input) as a single task with 1 GPU
exe="gmx_mpi grompp"
input="-p ${label}_System.top -c ${label}-eq.gro -f martiniGPU_eq.mdp \
-o ${label}-eq.tpr -po ${label}-eq.mdp -maxwarn 1"
command="srun -n 1 --gpu-bind=none --cpu-bind=cores shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1
# Equilibration: mdrun with 4 MPI ranks and 4 GPUs
input="-v -deffnm ${label}-eq"
exe="gmx_mpi mdrun -bonded gpu -nb gpu -pin on -ntomp $SLURM_CPUS_PER_TASK"
command="srun --cpu-bind=cores --gpu-bind=none --module cuda-mpich shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1
# Production: grompp (builds the .tpr input) as a single task with 1 GPU
exe="gmx_mpi grompp"
input="-p ${label}_System.top -c ${label}-eq.gro -f martiniGPU_md.mdp \
-o ${label}-run.tpr -po ${label}-run.mdp -maxwarn 1"
command="srun -n 1 --gpu-bind=none --cpu-bind=cores shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1
# Production: mdrun with 4 MPI ranks and 4 GPUs
input="-v -deffnm ${label}-run"
exe="gmx_mpi mdrun -bonded gpu -nb gpu -pin on -ntomp $SLURM_CPUS_PER_TASK"
command="srun --cpu-bind=cores --gpu-bind=none --module cuda-mpich shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1
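The script above repeats the same echo-then-run pattern four times. One possible way to factor it out is sketched here; `run_logged` is a hypothetical helper name, and the log file defaults to `mdrun.log` as in the script. Passing the command as separate arguments rather than as one string keeps arguments that contain spaces intact.

```shell
#!/bin/bash
# Echo a command, then run it with stdout and stderr appended to a log file.
# run_logged is a hypothetical helper name; LOG defaults to mdrun.log.
run_logged() {
    echo "$*"
    "$@" >> "${LOG:-mdrun.log}" 2>&1
}

# Small local demonstration with a temporary log file.
LOG="$(mktemp)"
run_logged echo "step finished"
cat "$LOG"

# In the batch script the equilibration mdrun step could then read
# (not executed here):
#   run_logged srun --cpu-bind=cores --gpu-bind=none --module cuda-mpich \
#       shifter gmx_mpi mdrun -bonded gpu -nb gpu -pin on \
#       -ntomp "$SLURM_CPUS_PER_TASK" -v -deffnm "${label}-eq"
```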
Other things
module load texlive