"OpenMP" is used for intra-node parallel (shared memory parallel) processing.
OpenMP
-
OpenMP parallelizes a program within a node (shared memory parallelization) by inserting directive lines, which are instructions to the compiler, into the source code. The syntax differs slightly between C and Fortran. Basic usage is described below.
Basic use
-
The following OpenMP program, written in Fortran, is used as the reference for the explanation.
1  program sample
2  implicit none
3  include "omp_lib.h"
4  !$omp parallel
5  print *, "HELLO WORLD mythread = ",omp_get_thread_num(),"(",omp_get_num_threads(),"threads)"
6  !$omp end parallel
7  end
Lines 3 to 6 are the parts related to OpenMP. include "omp_lib.h" [line 3], omp_get_thread_num() [line 5], and omp_get_num_threads() [line 5] are there only to print information about the OpenMP execution. Omitting them does not change how the program is parallelized.
Line 3: include "omp_lib.h"
specifies the include file for the OpenMP library routines. This is required to use the library routines described below.
Line 4: !$omp parallel
and
Line 6: !$omp end parallel
are OpenMP directives. The parallel directive marks the start of a block that is executed in parallel, and the block extends to the end parallel directive. In this case, line 5 is executed in parallel. The number of threads is specified by the environment variable OMP_NUM_THREADS.
Line 5: omp_get_thread_num and omp_get_num_threads
are OpenMP library routines. To use these routines, you need to specify the include file shown on line 3. omp_get_thread_num returns the number assigned to the calling thread (parallel thread); the numbering starts from 0. omp_get_num_threads returns the total number of parallel threads.
The result of running the above program with 4 threads is shown below. The order of the lines can differ from run to run because the threads execute independently.
HELLO WORLD mythread= 3 ( 4 threads)
HELLO WORLD mythread= 1 ( 4 threads)
HELLO WORLD mythread= 0 ( 4 threads)
HELLO WORLD mythread= 2 ( 4 threads)
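For comparison, the same program can be sketched in C, the other language this page mentions. This is a minimal sketch rather than site-provided code: the function name hello_from_threads is made up for illustration, and the #ifdef _OPENMP guard is an added assumption so that the file still builds as a serial program when OpenMP is not enabled. In C the parallel region is the block in braces that follows the directive, so there is no counterpart of !$omp end parallel.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>   /* declares omp_get_thread_num() and omp_get_num_threads() */
#endif

/* Hypothetical helper: prints one line per thread and returns the number of
   threads that took part in the parallel region (1 in a serial build). */
int hello_from_threads(void)
{
    int team_size = 1;

    #pragma omp parallel
    {
        int mythread = 0;   /* serial fallback values */
        int nthreads = 1;
#ifdef _OPENMP
        mythread = omp_get_thread_num();    /* this thread's number, from 0 */
        nthreads = omp_get_num_threads();   /* total threads in the region */
#endif
        printf("HELLO WORLD mythread = %d ( %d threads)\n", mythread, nthreads);

        /* Only thread 0 records the team size, so there is no data race. */
        if (mythread == 0) {
            team_size = nthreads;
        }
    }
    return team_size;
}
```

Calling hello_from_threads() from main in a program built with icx -qopenmp and run with OMP_NUM_THREADS=4 should print four lines like the ones above, again in an order that can change between runs.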
About directive lines
Unlike automatic parallelization, OpenMP requires users to insert "directive lines" themselves to tell the compiler what to parallelize. The form of the directive line depends on the language.
!$omp directive-name (in the case of Fortran)
#pragma omp directive-name (in the case of C/C++)
For details of the individual directives, please see the reference material.
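As a small illustration of the C/C++ form with a directive other than parallel, the sketch below uses the standard parallel for directive with a reduction clause to sum the integers from 1 to n. The function name sum_to is hypothetical, and the example is an assumption about a typical use, not something taken from the reference material.

```c
/* Hypothetical example: sums 1 + 2 + ... + n.
   With OpenMP enabled, the loop iterations are divided among the threads and
   each thread's partial sum is combined by the reduction clause.
   Without OpenMP the pragma is ignored and the loop runs serially. */
long sum_to(int n)
{
    long sum = 0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}
```

In Fortran, the corresponding form would wrap a do loop between !$omp parallel do reduction(+:sum) and !$omp end parallel do.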
How to compile
$ module load BaseCPU
$ ifx -qopenmp [options] source_file    # Fortran
$ icx -qopenmp [options] source_file    # C
$ icpx -qopenmp [options] source_file   # C++
To output a parallelization report, add the options "-qopt-report=[n] -qopt-report-phase=openmp". [n] is the message level, from 0 to 3.
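A quick way to confirm that the OpenMP option took effect is to test the _OPENMP macro, which the OpenMP specification says a compiler defines (as a yyyymm date) only when OpenMP compilation is enabled. The following C sketch is an illustration, not part of the site's instructions; openmp_version and report_openmp are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical helper: returns the yyyymm value of _OPENMP when the file was
   compiled with OpenMP enabled, or 0 when it was not. */
long openmp_version(void)
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0L;
#endif
}

/* Prints a one-line report that can be placed at the start of a program. */
void report_openmp(void)
{
    long version = openmp_version();
    if (version > 0) {
        printf("OpenMP enabled (_OPENMP = %ld)\n", version);
    } else {
        printf("OpenMP not enabled: directives are ignored\n");
    }
}
```

If the report says OpenMP is not enabled, the most likely cause is that -qopenmp was left off the compile command.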
Writing a job script
At run time, the number of threads is specified by the environment variable OMP_NUM_THREADS. The following is an example of a script that runs with 256 threads within a node and an elapsed time limit of 1 hour.
#!/bin/bash
#PBS -q SQUID
#PBS --group=[group name]
#PBS -l elapstim_req=1:00:00
#PBS -l cpunum_job=256
#PBS -v OMP_NUM_THREADS=256
module load BaseCPU
cd $PBS_O_WORKDIR
./a.out
Note
Please do not forget to specify "OMP_NUM_THREADS" in the job script. If it is not specified, or is set to a wrong value, the program runs with an unintended number of threads.
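One way to catch a missing or mistyped setting early is to have the program inspect OMP_NUM_THREADS itself at startup. The C sketch below is only an illustration under stated assumptions: parse_thread_count and check_thread_setting are hypothetical names, and the code handles only the simple case where the variable holds a single positive integer (the OpenMP specification also permits a comma-separated list for nested parallelism, which this sketch treats as invalid).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text of OMP_NUM_THREADS to a positive
   integer. Returns -1 if the text is missing, not a number, or not positive. */
int parse_thread_count(const char *text)
{
    if (text == NULL || *text == '\0') {
        return -1;
    }
    char *end = NULL;
    long value = strtol(text, &end, 10);
    /* Reject trailing characters, non-positive values, and absurd sizes. */
    if (*end != '\0' || value <= 0 || value > 1000000) {
        return -1;
    }
    return (int)value;
}

/* Reads OMP_NUM_THREADS from the environment and reports what was found.
   Returns the parsed value, or -1 if the variable is unset or invalid. */
int check_thread_setting(void)
{
    int requested = parse_thread_count(getenv("OMP_NUM_THREADS"));
    if (requested < 0) {
        fprintf(stderr, "warning: OMP_NUM_THREADS is unset or invalid\n");
    } else {
        printf("OMP_NUM_THREADS = %d\n", requested);
    }
    return requested;
}
```

Calling check_thread_setting() at the top of main leaves a record of the requested thread count in the job output, which makes an unintended setting easier to spot afterwards.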
Reference
-
A Large-scale Computer System account is required to view the following materials.
Introduction to Intel(R) OpenMP (Japanese)
Intel(R) C/C++ Compiler OpenMP Usage Guide (Japanese)
Intel(R) Fortran Compiler OpenMP Usage Guide (Japanese)
Intel(R) Compiler Automatic Parallelization Guide (Japanese)

