OpenACC directives are available on SQUID, which makes it relatively easy to optimize code for the GPU. OpenACC can also be used in combination with OpenMPI.
How to use OpenACC
Advance preparation
-
Before using OpenACC, load the BaseGPU environment with the module command:

```bash
module load BaseGPU/2021
```
Compile command
-
The compile command differs by programming language. To enable OpenACC directives, specify the -acc option.
Execution within a node
| | C language | C++ language | Fortran language |
|---|---|---|---|
| Command | nvc -acc | nvc++ -acc | nvfortran -acc |
Execution across nodes (MPI)
| | C language | C++ language | Fortran language |
|---|---|---|---|
| Command | mpicc -acc | mpic++ -acc | mpif90 -acc |
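For example, an MPI Fortran source could be built for the GPU nodes as follows. (The source file name mpi_prog.f90 is an assumption for illustration; the output name matches the executable used in the multi-node script later in this page.)

```bash
mpif90 -O3 -acc -Minfo=accel mpi_prog.f90 -o mpi_prog
```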
 
Compilation example
-
To compile the Fortran code test.f90 with OpenACC directives inserted, for the SQUID GPU nodes, use the following command. (The -Minfo=accel option outputs messages about the optimizations applied for the GPU.)

```bash
nvfortran -O3 -acc -Minfo=accel test.f90
```
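For reference, here is a minimal sketch of what such a test.f90 might contain. (This example is not from the original document; the loop and variable names are illustrative.)

```fortran
! test.f90: minimal illustrative example of an OpenACC-offloaded loop
program test
  implicit none
  integer, parameter :: n = 100000
  integer :: i
  real(8) :: a(n), b(n)

  b = 1.0d0

  ! Offload the loop to the GPU; copy b in and a out
  !$acc parallel loop copyin(b) copyout(a)
  do i = 1, n
     a(i) = 2.0d0 * b(i)
  end do

  print *, 'a(1) =', a(1)
end program test
```

With -Minfo=accel, the compiler reports how this loop was parallelized for the GPU.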
Execution Script
Execution on 1 node
-
Here is an example script for a program that uses GPUs through OpenACC directives. It submits a batch request for one node with 8 GPUs and an elapsed time of 1 hour.
```bash
#!/bin/bash
#PBS -q SQUID
#PBS --group=[your group name]
#PBS -l elapstim_req=01:00:00
#PBS -l gpunum_job=8
module load BaseGPU/2021
cd $PBS_O_WORKDIR
./a.out
```
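The script is then submitted to the batch system with qsub. (The script file name run_gpu.sh is an assumption for illustration.)

```bash
qsub run_gpu.sh
```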
Execution on multiple nodes with MPI
-
The following is an example script for an MPI program that uses GPUs through OpenACC directives. It submits a batch request for 2 nodes with 8 GPUs per node (8 processes per node, 16 processes in total) and an elapsed time of 1 hour.
```bash
#!/bin/bash
#PBS -q SQUID
#PBS --group=[your group name]
#PBS -l elapstim_req=01:00:00
#PBS -b 2
#PBS -l gpunum_job=8
#PBS -T openmpi
#PBS -v NQSV_MPI_MODULE=BaseGPU/2021
cd ${PBS_O_WORKDIR}
module load BaseGPU/2021
mpirun ${NQSV_MPIOPTS} -np 16 -npernode 8 mpi_prog
```
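With 8 processes per node and 8 GPUs per node, each MPI rank usually needs to select its own GPU before entering any OpenACC compute region. Below is a minimal sketch of one common way to do this with the standard OpenACC runtime API. (The program body and the round-robin rank-to-GPU mapping are illustrative assumptions, not part of the original document.)

```fortran
! mpi_prog.f90: illustrative sketch that binds each MPI rank to one GPU
program mpi_prog
  use mpi
  use openacc
  implicit none
  integer :: ierr, rank, ngpus, dev

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! GPUs visible on this node (8 on a SQUID GPU node)
  ngpus = acc_get_num_devices(acc_device_nvidia)

  ! Round-robin mapping: with -npernode 8, each rank on a node gets
  ! its own device (assumes ranks are placed node by node)
  dev = mod(rank, ngpus)
  call acc_set_device_num(dev, acc_device_nvidia)

  ! ... OpenACC compute regions in this rank now run on device 'dev' ...

  call MPI_Finalize(ierr)
end program mpi_prog
```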