Can I get the maximum memory usage per compute node?

The maximum memory usage per general-purpose CPU node can be collected by periodically executing "qstat -Jf".
In the memory cgroup information (Memory Cgroup Resources) that "qstat -Jf" reports for the job, "Memory Usage" is the current memory usage and "Max Memory Usage" is the maximum memory usage.
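
The periodic collection described above could be sketched in Python as follows. This is only a minimal sketch: it assumes that each field in the "qstat -Jf" output appears on its own line beginning with the field name, and the function names find_field_lines and poll_qstat are hypothetical. Please adjust the matching to the output you actually see.

```python
import subprocess
import time


def find_field_lines(qstat_text, label="Max Memory Usage"):
    """Return the stripped lines of qstat output that start with `label`."""
    matches = []
    for line in qstat_text.splitlines():
        stripped = line.strip()
        # startswith() keeps the label "Memory Usage" from also matching
        # "Max Memory Usage", which contains it as a substring.
        if stripped.startswith(label):
            matches.append(stripped)
    return matches


def poll_qstat(interval_sec=60, samples=10):
    """Run "qstat -Jf" `samples` times and print the matching lines.

    The interval and the number of samples are arbitrary example values.
    """
    for i in range(samples):
        result = subprocess.run(
            ["qstat", "-Jf"], capture_output=True, text=True, check=False
        )
        for line in find_field_lines(result.stdout):
            print(time.strftime("%Y-%m-%d %H:%M:%S"), line)
        if i < samples - 1:
            time.sleep(interval_sec)
```

Calling poll_qstat() on the front-end server while the job is running prints one timestamped line per sample, so the last value printed before the job ends is the one to keep.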


Can I use wandb on SQUID?

You can install wandb with the pip command.

pip install wandb
wandb login "XXXX"

The compute nodes of SQUID cannot access the internet in general; however, access is permitted for "wandb" only.
Please write your job script as follows:

#PBS --group=[group name]
#PBS -l elapstim_req=1:00:00
export http_proxy="http://ibgw1f-ib0:3128"
export https_proxy="https://ibgw1f-ib0:3128"
python test.py

Please use requests-2.24. Versions such as 2.26 will not work properly with SQUID.


Binary data appears in the output file of an MPI job

On rare occasions, binary data is written to an output file when multiple processes write their output to a file of the same name.

Please use MPI-IO, or write the output data to a separate file for each process.
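
The second option, one file per process, could look like the following Python sketch. It is an illustration only: the file name pattern and the helper names rank_output_path and write_rank_output are hypothetical, and the loop merely stands in for 4 MPI processes. In a real job, each process would obtain its own rank from the MPI library and write one file.

```python
import os
import tempfile


def rank_output_path(directory, rank, base="output"):
    """Build a file name that is unique to one MPI rank."""
    return os.path.join(directory, f"{base}.{rank:04d}.dat")


def write_rank_output(directory, rank, payload):
    """Write `payload` (bytes) to the file that belongs to `rank` only."""
    path = rank_output_path(directory, rank)
    with open(path, "wb") as f:
        f.write(payload)
    return path


# Stand-in for 4 MPI processes; in a real job each process would call
# write_rank_output() once with its own rank.
with tempfile.TemporaryDirectory() as tmp:
    paths = [write_rank_output(tmp, r, f"rank {r}\n".encode()) for r in range(4)]
    print([os.path.basename(p) for p in paths])
```

Because no two ranks share a file name, no two processes ever write to the same file.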

Please see the following page for MPI-IO


How do I assign MPI processes to nodes one at a time in round-robin order?

[Supplementary information for the question]
When I run a parallel job (Intel MPI) on 4 nodes of VCC (20 cores each), I want to assign the MPI ranks as follows:

node 1: rank 0, 4, 8, ..., 76
node 2: rank 1, 5, 9, ..., 77
node 3: rank 2, 6, 10, ..., 78
node 4: rank 3, 7, 11, ..., 79

For this case, please specify the following in your job script:

#PBS -b 4
mpiexec -ppn 1 -n 80 ./a.out


Manual entry for the -ppn option of mpiexec (Intel MPI):

-perhost <# of processes>, -ppn <# of processes>, -grr <# of processes>
Use this option to place the specified number of consecutive MPI processes on every host in the group using round robin scheduling.
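
To check which ranks land on which node for a given -ppn value, the placement rule quoted above can be modelled with a few lines of Python. This is a sketch of the rule as described in the manual text, not of Intel MPI itself; the function name round_robin_placement is hypothetical.

```python
def round_robin_placement(n_ranks, n_nodes, ppn=1):
    """Map MPI ranks to nodes when `ppn` consecutive ranks go to each node in turn."""
    placement = {node: [] for node in range(1, n_nodes + 1)}
    for rank in range(n_ranks):
        # Groups of `ppn` consecutive ranks are dealt out to the nodes cyclically.
        node = (rank // ppn) % n_nodes + 1
        placement[node].append(rank)
    return placement


# The case from the question: 80 ranks, 4 nodes, -ppn 1.
placement = round_robin_placement(n_ranks=80, n_nodes=4, ppn=1)
for node, ranks in placement.items():
    print(f"node {node}: rank {ranks[0]}, {ranks[1]}, {ranks[2]}, ..., {ranks[-1]}")
```

With ppn=1 this reproduces the assignment asked for in the question. A larger value, for example ppn=20, would instead place ranks 0 to 19 on node 1, ranks 20 to 39 on node 2, and so on.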


Why does the processing time of the non-parallelized part increase with OpenMP?

When you specify the OpenMP or auto-parallelization option at compile time, the compiler links the parallel version of the library whether or not the code contains parallel directives.
The functions in the parallel library differ from the normal functions in that they contain lock routines that restrict access to resources by other threads.

When a function from the parallel library is called in a non-parallelized part, it of course runs on a single thread, so it never waits on a lock held by another thread. However, it still takes a little extra processing time, because the function has to decide whether or not to perform the lock routine. Please keep this in mind.


I want to redirect the standard output of an MPI job on the vector nodes of SQUID to a separate file.

To redirect the standard output of an MPI job on the vector nodes of SQUID, please use the script "/opt/nec/ve/bin/mpisep.sh".
The script is used as follows:

In this case, standard output is written to stdout.0:(MPI process ID) and standard error output is written to stderr.0:(MPI process ID) in real time.

Please see section "3.3" of the following manual for details:
NEC MPI User's guide

If you modify mpisep.sh, you can change the stdout/stderr file names to whatever you like.


Can I specify the permissions of the standard output file and the standard error output file?

On SX-ACE and VCC, the permissions of the standard output file and the standard error output file depend on "umask". Please set the desired permissions with the "umask" command on the front-end server.
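
To see which permissions a given umask value produces, the usual rule (requested permission bits with the umask bits cleared) can be checked with a short Python sketch. It assumes that the output files are created with the common default request of 0666, which is not confirmed for this system; the function name file_mode_under_umask is hypothetical.

```python
import stat


def file_mode_under_umask(umask, requested=0o666):
    """Permission bits of a new file: the requested bits minus the umask bits."""
    return requested & ~umask


for umask in (0o022, 0o027, 0o077):
    mode = file_mode_under_umask(umask)
    # stat.filemode() needs the file-type bits, so add S_IFREG (regular file).
    print(f"umask {umask:03o} -> {mode:03o} {stat.filemode(stat.S_IFREG | mode)}")
```

For example, a umask of 077 leaves the files readable and writable by the owner only.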


How can I generate independent random numbers?

Many PRNGs (pseudo-random number generators) generate random numbers from a specified random seed. If you specify the same seed, the same sequence of random numbers is generated. To obtain independent random numbers, you have to change the seed each time.
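
The behaviour described above can be tried with Python's standard random module. The following is a minimal sketch, not a recommendation of a particular generator: the helper names make_generator and draw are hypothetical, and combining a base seed with the process rank is just one possible way to give each process its own seed. Different seeds give different sequences, but this alone does not guarantee statistical independence for every generator.

```python
import random
import secrets


def make_generator(base_seed, rank=0):
    """Return a PRNG seeded from `base_seed` and `rank` together."""
    # A string seed is accepted by random.Random and keeps the two parts distinct.
    return random.Random(f"{base_seed}:{rank}")


def draw(gen, n=5):
    """Take the first `n` numbers from a generator."""
    return [gen.random() for _ in range(n)]


# Same seed: the sequence repeats exactly.
same = draw(make_generator(1234)) == draw(make_generator(1234))
print("same seed gives same numbers:", same)

# Same base seed but a different rank: a different sequence.
differ = draw(make_generator(1234, rank=0)) != draw(make_generator(1234, rank=1))
print("different rank gives different numbers:", differ)

# A fresh base seed for every run, taken from the operating system.
run_seed = secrets.randbits(64)
print("seed for this run:", run_seed)
```

If you write the seed of each run to your output, you can reproduce that run later by specifying the same seed again.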


How do I check MPIPROGINF after running an MPI job on SX-ACE?

Please see section "2.13.6" of the following manual for the information output by MPIPROGINF.
How to use MPI/SX

* Note
If you run a hybrid MPI and OpenMP parallel program, the "UserTime" and "SysTime" values in this output are CPU times per process. They can be longer than "RealTime", because each is the sum of the CPU times of all CPUs used by the process.


I want to submit the next job automatically when a running job finishes. I also want to change which job is submitted depending on whether the running job succeeded.



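Please refer to Chapter 1, "User Commands", in the Reference part of the NQS User's Guide ("NQS利用の手引").

* The help is also available with "man qwait".

qwait can be used as follows.

Run a monitoring script in the background, and execute qwait inside the script.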

$ qsub job1-1
Request 12345.cmc submitted to queue: Pxx.
$ (./chkjob >& log &)

----- chkjob
while :
do
qwait 12345.cmc # Change the request ID to your own
case $? in
0) qsub job1-2;exit;;
1) qsub job2-1;exit;;
2) qsub job3-1;exit;;
3) echo NQS error | mail xxxx@yyyy.ac.jp;exit;; # Change the mail address to your own
7) continue;;
*) ;;