Quantum Espresso on our cluster

This page contains information related to the installation of Quantum Espresso on our group cluster. Some of it is also relevant if you want to try compiling the code yourself.

Quantum Espresso on our group cluster

Note that the Quantum Espresso installation on our group cluster mainly follows the standard syntax introduced by the Quantum Espresso team with their new installation scheme. On top of their system, we have added two binaries, as described below.

If you do

module avail q-e

on our group cluster, you will notice that there are a number of available Quantum Espresso modules. This might appear confusing.

First, each module is optimised for a particular workstation. For example, JRTI/Q-E/nvhpc-21.3/7.0 is optimised for Just Read The Instructions. Second, nvhpc indicates that the module is compiled with the Nvidia HPC compilers. Third, 7.0 is the version of Quantum Espresso, and openACC in a module name indicates that it is compiled with OpenACC acceleration (for GPUs). All versions of Quantum Espresso are compiled with support for maximally-localised Wannier functions and the Wannier90 program, and with the MPI flag in FPP (-DMPI).
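
The naming scheme described above can be taken apart with plain shell parameter expansion. The following is only a sketch: describe_module is a hypothetical helper, not something installed on the cluster, and it assumes a module name with exactly the four fields of the example (workstation, program, compiler-version, Quantum Espresso version). Module names with a different layout will not be split correctly.

```shell
# Hypothetical helper (not part of the cluster installation): split a module
# name of the assumed form <workstation>/Q-E/<compiler>-<version>/<QE version>
describe_module() {
    name=$1
    workstation=${name%%/*}        # text before the first "/"
    qe_version=${name##*/}         # text after the last "/"
    toolchain=${name%/*}           # drop the last field ...
    toolchain=${toolchain##*/}     # ... and keep the one before it, e.g. nvhpc-21.3
    compiler=${toolchain%%-*}      # text before the first "-"
    compiler_version=${toolchain#*-}
    echo "workstation=${workstation} compiler=${compiler} compiler_version=${compiler_version} qe_version=${qe_version}"
}

describe_module "JRTI/Q-E/nvhpc-21.3/7.0"
# prints: workstation=JRTI compiler=nvhpc compiler_version=21.3 qe_version=7.0
```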

Quantum Espresso Best Practices using GPUs

Please see the MaX webinar on how to use Quantum Espresso on new GPU-based HPC systems.

The presentation slides are also available here.

Some highlights:

  • 1 MPI process per GPU

  • CPU cores can (must!) be exploited with OpenMP parallelism

  • Pool parallelism is very effective, but requires memory

  • The dense eigenvalue problem is solved on 1 GPU, use the serial eigensolver.

  • Check the Wiki, it’s updated with a collaborative effort!

  • More details: P. Giannozzi et al. J. Chem. Phys. 152, 154105 (2020)
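
As a rough illustration of the first highlights, the sketch below derives the number of MPI processes and OpenMP threads from the node layout. The core and GPU counts are hypothetical example values, pw.in is a placeholder input name, and using one pool per MPI process assumes that the calculation has at least that many k-points. The command line is only printed, not executed.

```shell
# Hypothetical node layout; look up the real values for your workstation.
cores_on_node=96
gpus_on_node=2

# One MPI process per GPU
mpi_ranks=${gpus_on_node}

# Share the CPU cores between the MPI processes as OpenMP threads
OMP_NUM_THREADS=$(( cores_on_node / mpi_ranks ))
export OMP_NUM_THREADS

# One pool per MPI process (assumes enough k-points) and the serial
# eigensolver (-ndiag 1); pw.in is a placeholder input file name.
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
echo "mpirun -np ${mpi_ranks} pw.x -npool ${mpi_ranks} -ndiag 1 -inp pw.in"
```

Remember that each pool holds its own copy of the wave-function data, so more pools also means more memory.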

On carbon, GPU-enabled Q-E installs use the NVHPC compilers.

#!/bin/bash -l

################### Quantum Espresso Job Batch Script Example ###################
# Section for defining queue-system variables:
# This script asks for Just Read The Instructions and 96 tasks.
# Runtime for this job is one hour; syntax is hh:mm:ss.
# SLURM-section
## Select Just Read The Instructions
#SBATCH --partition=JRTI
## Reserve the node for the GPU
#SBATCH --ntasks=96
##Name of the job
#SBATCH --job-name=q-e_ex-gpu
##one hour
#SBATCH --time=01:00:00
## asks SLURM to send the USR1 signal 10 minutes before the end of the time limit
#SBATCH --signal=B:USR1@600
##The name of the log file (you can see the progress of your job here)
#SBATCH --output=q-e_ex-gpu.log

# This section is for defining job variables and settings
# that need to be defined before running the job

# name of the input file (placeholder, replace with your own)
input_file="pw.in"
# name of the output file (placeholder, replace with your own)
output_file="pw.out"

# For efficient hybrid GPU/CPU use, combine k-point (pool) parallelization with OpenMP threads
qe_executable="pw.x -npool 1 -ndiag 1 -ntg 1 -inp "

# We load all the default program system settings with module load:

module --quiet purge
module load nvhpc/22.7 JRTI/Q-E/nvhpc/git280822-nvhpc22.7-gpu
# You may check other available versions with "module avail q-e"

# A unique file tag for the created files
file_tag=$( date +"%d%m%y-%H%M" )

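# Define and create a unique local scratch directory for this job. Remember, this will
# make your job run much faster!
# NOTE: the scratch location below is an assumption; adjust it to the local
# scratch path of your workstation.
scratch_dir="/scratch/${USER}/${SLURM_JOBID}"
mkdir -p "${scratch_dir}"
cd "${scratch_dir}" || exit 1

# You can copy everything you need to the scratch directory
# ${SLURM_SUBMIT_DIR} points to the path where this script was submitted from
rsync -arp "${SLURM_SUBMIT_DIR}/" .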

# This section is about collecting the results from the local scratch back to
# where the job was run. Make sure that you have enough quota, if you want to
# collect the wave functions as well!

# define the handler function
# note that this is not executed here, but rather
# when the associated signal is sent
cleanup_function() {
        echo "function cleanup_function called at $(date)"
        tar cvf results-${SLURM_JOBID}-$file_tag.tar *
        gzip  results-${SLURM_JOBID}-$file_tag.tar
        du -sh results-${SLURM_JOBID}-$file_tag.tar.gz
        mv results-${SLURM_JOBID}-$file_tag.tar.gz ${SLURM_SUBMIT_DIR}
}

# call cleanup_function once we receive USR1 signal
trap 'cleanup_function' USR1

# This section actually runs the job. It needs to be after the previous two
# sections

echo "starting calculation at $(date)"

# The calculation runs in the current working directory (the local scratch
# directory, if one was set up above)

# Running the program:
# the "&" after the compute step and "wait" are important for the cleanup process
# the "tee" is used to mirror the output to the slurm output, so that you can follow
# the job progress more easily
run_line="mpirun -np 1 ${qe_executable} ${input_file} |tee ${output_file} &"
echo $run_line
eval $run_line
wait

echo "Job finished at $(date)"
################### Job Ended ###################
exit 0
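
Why the "&" and "wait" matter for the cleanup can be seen in a small stand-alone demonstration that needs neither SLURM nor Quantum Espresso. It is a sketch only: sleep stands in for the mpirun line, and a background subshell stands in for SLURM sending USR1 to the batch shell. If the compute step ran in the foreground, the shell would not run the handler until that step had finished.

```shell
#!/bin/bash
# Stand-alone demonstration, not part of the job script above.
handler_ran=no
cleanup_function() {
    handler_ran=yes
    echo "cleanup_function called"
}
trap 'cleanup_function' USR1

sleep 5 &                  # stands in for the backgrounded mpirun line
compute_pid=$!

# Stands in for SLURM: send USR1 to this shell after one second
( sleep 1; kill -USR1 $$ ) &

# "wait" is interrupted by the trapped signal and returns a status above 128;
# the handler then runs before the next command.
wait "$compute_pid" || true
echo "handler_ran=${handler_ran}"

kill "$compute_pid" 2>/dev/null   # stop the stand-in compute step
```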

About memory allocation for Quantum Espresso

Quantum Espresso can be memory demanding. Quite often you will need to use fewer than the full number of cores on a node while still using all of its memory.

For core-count, node-count and amounts of memory on our group cluster, see Workstations.
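
A back-of-the-envelope way to pick the number of MPI processes for a memory-bound job is sketched below. All three numbers are hypothetical: take the node values from the Workstations page and the memory need per process from your own estimate, for example from a short test run.

```shell
# Hypothetical example values; replace them with the real ones.
node_memory_gb=512
cores_on_node=96
mem_needed_per_rank_gb=16

# Memory limits the number of MPI processes that fit on the node ...
max_ranks=$(( node_memory_gb / mem_needed_per_rank_gb ))

# ... but never use more processes than there are cores
if [ "$max_ranks" -gt "$cores_on_node" ]; then
    max_ranks=$cores_on_node
fi

echo "run at most ${max_ranks} MPI processes on this node"
```

With these example numbers the job would use 32 of the 96 cores but all of the memory.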

There are two important considerations to make:

Make sure that you are using the #SBATCH --exclusive flag in your run script.