D3 Center, The University of Osaka » How to tune with XMP(SX-ACE)

On SX-ACE, inter-node parallelism is available through XcalableMP (hereafter "XMP") in addition to HPF and MPI. This page explains how to use XMP.

* Because MPI can be difficult for novice users, we highly recommend that they use HPF or XMP.

About XMP

XcalableMP, XMP for short, is a directive-based language extension that lets users develop parallel programs for distributed-memory systems easily and tune their performance with minimal, simple notation. XMP also supports coarrays, which make one-sided communication easy to use.
Please note that XMP is available only on SX-ACE; it cannot be used on our other systems.

How to create the XMP program

Training session materials for XMP are available on the following XMP Website. They explain how to write an XMP program and include sample code.
XMP training session (XMP Website)
XMP manual (XMP Website)

How to use the XMP compiler

Invoke the XMP compiler with one of the following commands. Note that the command name depends on the source language.

$ xmpcc [options] source_file (C)

$ xmpf90 [options] source_file (FORTRAN)

Please see the following page for available options:
XMP training session (XMP Website)
XMP manual (XMP Website)

Execution script

When an XMP program is compiled, the compiler produces an MPI executable. It is therefore run in the same way as an MPI program.

An example script is shown below. It submits an MPI batch request that uses 4 nodes with 4 processes per node (16 processes in total), an elapsed time limit of 5 hours, and 60 GB of memory.

#!/bin/csh

#PBS -q ACE # Queue you want to use

#PBS -l memsz_job=60GB,elapstim_req=5:00:00 # Restrictions you want to enforce (request)

#PBS -b 4 # number of nodes you want to use

#PBS -T mpisx

setenv MPIPROGINF DETAIL

cd $PBS_O_WORKDIR

mpirun -nn 4 -np 16 ./a.out

# The total number of processes used to run the program must be specified after -np.

# Total number of processes = number of computing nodes you want to use (#PBS -b, mpirun -nn) × number of processes per node (cpunum_job when all processing cores are used). In this example, 4 nodes × 4 processes per node = 16.

References

Please see the following for more information on using XMP:

XMP Website

XMP manual(XMP Website)