Links
NERSC Container Tutorial 1 (2 and 3 also)

The first set of tasks at the end of the tutorial consists of these exercises:
  1. Use podman-hpc.
  2. Pull awlavely/adamslolcow from Docker Hub.
  3. Run it on a login node.
  4. Run it in an interactive job.
  5. Run it as a batch job.
  6. Repeat 2-5 with Shifter.

Notes
  1. Nothing to add.
  2. When you pull, you need to prefix the image name with "docker.io/" and append the version:
       podman-hpc pull docker.io/awlavely/adamslolcow:1.0
     Otherwise, you will get a login/denied error.
  3. Nothing to add.
  4. I had to modify the switches and set the account:
       salloc --nodes 1 --time 01:00:00 -C cpu --qos interactive --account mXXXX
     It is probably a good idea to reduce the time to 00:05:00 (five minutes).
  5. This is fine. You have to create an XXX.job file based on the example on the slide and submit it using sbatch.
  6. Shifter: the same caveat about time applies as above. Five minutes is plenty and will limit your account charges. The steps are:
  • pull
  • run on login node
  • run with an interactive job
  • run as a batch job
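
The slide's example job file for exercise 5 is not reproduced in these notes, so the following is only a minimal sketch of what such a file might look like for podman-hpc. The file name lolcow.job, the debug QOS, and running the image with its default command are assumptions; mXXXX stands in for your project account, as above.

```shell
# Write a hypothetical job file for exercise 5 (the name lolcow.job is made up).
cat > lolcow.job <<'EOF'
#!/bin/bash
#SBATCH --nodes 1
#SBATCH --time 00:05:00
#SBATCH -C cpu
#SBATCH --qos debug
#SBATCH --account mXXXX

# Run the image pulled in exercise 2 with its default command (assumption).
srun podman-hpc run --rm docker.io/awlavely/adamslolcow:1.0
EOF

# Submit it from a login node once the account is filled in:
#   sbatch lolcow.job
```

Keeping the time at five minutes follows the note above about limiting account charges.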

GROMACS on Perlmutter

Running GROMACS on Perlmutter requires Shifter, so make sure you understand how it works.
Find the GROMACS images:
cmccall@login22: shifterimg images | grep 'gromacs'

Pull the latest NERSC image:
cmccall@login22: shifterimg pull nersc/gromacs:23.2
Run that image as a container in a shell ("exit" to exit):
cmccall@login22: shifter --image=nersc/gromacs:23.2 --entrypoint
Do some things...
cmccall@login22: exit
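
The find and pull steps above can be combined so that an image is pulled only when it is not already listed. This is a rough sketch, not a Shifter feature: ensure_image is a hypothetical helper name, and it assumes that `shifterimg images` prints one image per line with the name:tag somewhere on that line.

```shell
# ensure_image is a hypothetical helper, not part of Shifter.
# It pulls an image only when `shifterimg images` does not already list it.
# Note: this is a plain substring match, so a longer tag that starts with
# the requested one would also count as "already available".
ensure_image() {
    local image="$1"
    if shifterimg images | grep -qF -- "$image"; then
        echo "already available: $image"
    else
        shifterimg pull "$image"
    fi
}

# Example (run on a Perlmutter login node):
#   ensure_image nersc/gromacs:23.2
```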

Run an interactive job:
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> salloc --nodes 1 --time 01:00:00 -C cpu --qos interactive --account m
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ gmx_mpi mdrun -f 2.0mM_System.top …

Run an alternate container in a shell:
cmccall@login22: shifter --image=nvcr.io/hpc/gromacs:2023.2 --entrypoint
cmccall@login22: exit
Note that version 2023.2 is early enough that it will produce TPR files that are compatible with our analysis.

Python/Martiniglass

(See also this document on Python from NERSC.)
Create a conda environment on Perlmutter outside the container:
cmccall@login22: module load conda
cmccall@login22: conda create -n martini python=3.11
(some messages from conda starting with… you will need to install some things)
Channels:
- conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
Activate the environment and install martiniglass:
cmccall@login22: conda activate martini
(martini) cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> pip install martiniglass
…messages ending with
Installing collected packages: pbr, numpy, networkx, scipy, vermouth, martiniglass
Successfully installed martiniglass-1.1.3 networkx-3.6.1 numpy-2.4.4 pbr-7.0.3 scipy-1.17.1 vermouth-0.15.0
Remember to deactivate this environment; the "(martini)" prefix on the prompt should vanish.
(martini) cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> conda deactivate

Next, run an interactive shifter job (described on this page, scroll down to "Interactive Shifter Jobs"):
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> shifter --image=nersc/gromacs:23.2 /bin/bash
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$

Then, inside the Shifter container, find out where conda is installed and source its startup script so that you can activate the environment:
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ which conda
/global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/condabin/conda
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ source /global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/etc/profile.d/conda.sh
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ conda activate martini
(martini) cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM$
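
The startup script sits next to the conda executable in the transcript above: `.../condabin/conda` and `.../etc/profile.d/conda.sh` share the same installation prefix. As a small sketch, the path can be derived rather than typed by hand. The helper name conda_profile_script is hypothetical, and it assumes `which conda` prints a file path in that layout (not a shell function).

```shell
# conda_profile_script is a hypothetical helper: given the path printed by
# `which conda` (.../condabin/conda), print the matching startup script.
conda_profile_script() {
    local conda_bin="$1"
    local prefix
    # Strip the trailing /condabin/conda to get the installation prefix.
    prefix="$(dirname "$(dirname "$conda_bin")")"
    printf '%s\n' "$prefix/etc/profile.d/conda.sh"
}

# Example inside the container, assuming `which conda` prints a file path:
#   source "$(conda_profile_script "$(which conda)")"
#   conda activate martini
```

This avoids hard-coding the conda version in the path, which may change when NERSC updates the module.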

Your Python tools should then coexist with the `gmx` tools already present in the container.

You can also wrap the entire workflow in a batch script so that the trajectory reduction runs automatically after the simulation, which is a good approach given the file size reduction it provides.
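
A batch script of that kind might look roughly like the sketch below, which writes a job file that runs the production mdrun and then the reduction script in the same image. It has not been tested on Perlmutter. The file names md_then_reduce.job and reduce.log are made up, the #SBATCH settings and srun options are copied from the Slurm example later in these notes, and it assumes that ${label}-run.tpr already exists and that ~/dry_martini.sh takes the same two arguments as in the interactive session.

```shell
# Write a hypothetical job file (the name md_then_reduce.job is made up).
cat > md_then_reduce.job <<'EOF'
#!/bin/bash
#SBATCH --image docker:nersc/gromacs:23.2
#SBATCH -C gpu
#SBATCH -t 20:00:00
#SBATCH -A mxxxx_g
#SBATCH -q preempt
#SBATCH -N 1
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-task=1
#SBATCH -c 16

label="2.0mM"
conda_sh=/global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/etc/profile.d/conda.sh
cd "/global/homes/c/cmccall/TTA/ABZO/${label}" || exit 1
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Production run; assumes ${label}-run.tpr already exists.
srun --cpu-bind=cores --gpu-bind=none --module cuda-mpich shifter \
    gmx_mpi mdrun -bonded gpu -nb gpu -pin on -ntomp "$SLURM_CPUS_PER_TASK" \
    -v -deffnm "${label}-run" >> mdrun.log 2>&1 || exit 1

# Trajectory reduction in the same image, as a single task.
srun -n 1 --gpu-bind=none --cpu-bind=cores shifter /bin/bash -c \
    "source $conda_sh && conda activate martini && $HOME/dry_martini.sh $label run" \
    >> reduce.log 2>&1
EOF

# Submit from a login node after setting the account:
#   sbatch md_then_reduce.job
```

The `|| exit 1` after mdrun is a design choice: it keeps the reduction from running on an incomplete trajectory if the simulation step fails.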

Running from an interactive node reservation

cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> salloc --nodes 1 --time 01:00:00 -C cpu --qos interactive --account m
cmccall@login22:/global/u2/c/cmccall/TTA/ABZO/2.0mM> shifter --image=nersc/gromacs:23.2 --entrypoint
cmccall@nid004219:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ source /global/common/software/nersc/pe/conda/26.1.0/Miniforge3-25.11.0-1/etc/profile.d/conda.sh
cmccall@nid004219:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ conda activate martini
(martini) cmccall@nid004219:/global/u2/c/cmccall/TTA/ABZO/2.0mM$ ~/dry_martini.sh 2.0mM run
Reading input 2.0mM_vis.top
Writing visualisable topology files
Writing a waterless index file. Here're some helpful commands for reference:
gmx trjconv -f traj_comp.xtc -s topol.tpr -pbc mol -n index.ndx -e 0 -o vis.gro
gmx trjconv -f traj_comp.xtc -s topol.tpr -pbc mol -n index.ndx -o vis.xtc
(etc)

For a 2000 ns (2 μs) simulation at 1.25 mM with L = 43.5, the script took 10 minutes of CPU time on a compute node.


Example Slurm script

#!/bin/bash
#SBATCH --image docker:nersc/gromacs:23.2
#SBATCH -C gpu
#SBATCH -t 20:00:00
#SBATCH -J 1.75mM_ABZO
#SBATCH -o Gromacs_GPU.o%j
#SBATCH -A mxxxx_g
#SBATCH -q preempt
#SBATCH -N 1
#SBATCH --ntasks-per-node=4
#SBATCH --gpus-per-task=1
#SBATCH -c 16
#SBATCH --mail-user=mmccallum@pacific.edu
#SBATCH --mail-type=ALL
##

label="1.75mM"
cd /global/homes/c/cmccall/TTA/ABZO/${label}
export GMX_ENABLE_DIRECT_GPU_COMM=true
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# EQ Grompp w/ 1 gpu
exe="gmx_mpi grompp"
input="-p ${label}_System.top -c ${label}-eq.gro -f martiniGPU_eq.mdp \
-o ${label}-eq.tpr -po ${label}-eq.mdp -maxwarn 1"
command="srun -n 1 --gpu-bind=none --cpu-bind=cores shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1

# EQ mdrun w/ 4 gpu
input="-v -deffnm ${label}-eq"
exe="gmx_mpi mdrun -bonded gpu -nb gpu -pin on -ntomp $SLURM_CPUS_PER_TASK"
command="srun --cpu-bind=cores --gpu-bind=none --module cuda-mpich shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1

# Run Grompp w/ 1 gpu
exe="gmx_mpi grompp"
input="-p ${label}_System.top -c ${label}-eq.gro -f martiniGPU_md.mdp \
-o ${label}-run.tpr -po ${label}-run.mdp -maxwarn 1"
command="srun -n 1 --gpu-bind=none --cpu-bind=cores shifter $exe $input"
echo $command
$command >> mdrun.log 2>&1

# run mdrun w/ 4 gpu
input="-v -deffnm ${label}-run"
exe="gmx_mpi mdrun -bonded gpu -nb gpu -pin on -ntomp $SLURM_CPUS_PER_TASK"
command="srun --cpu-bind=cores --gpu-bind=none --module cuda-mpich shifter $exe $input"

echo $command
$command >> mdrun.log 2>&1
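
The script above builds each command as a string and expands it unquoted, which works here because no argument contains spaces, but it moves on to the next step even when the previous one failed. One possible alternative, sketched below, is a small helper that prints and runs its arguments directly and returns the command's exit status. The name run_logged and the LOG variable are hypothetical and not part of the original script.

```shell
# run_logged is a hypothetical helper: print a command, then run it with
# stdout and stderr appended to a log file (mdrun.log unless LOG is set).
# Its exit status is that of the command it ran.
run_logged() {
    printf '%s\n' "$*"
    "$@" >> "${LOG:-mdrun.log}" 2>&1
}

# Example use in the batch script, stopping the job if grompp fails:
#   run_logged srun -n 1 --gpu-bind=none --cpu-bind=cores shifter \
#       gmx_mpi grompp -p "${label}_System.top" -c "${label}-eq.gro" \
#       -f martiniGPU_eq.mdp -o "${label}-eq.tpr" -po "${label}-eq.mdp" \
#       -maxwarn 1 || exit 1
```

Passing the command as separate arguments also keeps file names with spaces intact, should they ever appear.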

Other things


module load texlive