In shared usage, if other users are running jobs, you may experience a “waiting time” before your job starts.
The execution order is determined and managed by software called a scheduler, which makes decisions based on requested resources and various limits. For this reason, it is difficult for users to directly control the order. However, you may be able to reduce the waiting time by adjusting your program’s workload and the amount of resources requested in your job script.
The methods are explained below.

1. Check the current usage status

OCTOPUS Usage Status

SQUID Usage Status

The image below is the SQUID usage status page.
Colored bars indicate compute nodes that are in use / scheduled, and black bars indicate compute nodes that are available or stopped.

Depending on the system status, compute nodes may become available within the next few hours.
The example below shows a “gap time” of a few hours, lasting until 15:00, on one node in the SQUID general-purpose CPU node group.
If you set the elapsed time (elapstim_req) so that the job fits into such a “gap time” before submitting it, your job may start immediately without waiting.
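
As a rough illustration of fitting a job into a gap, the sketch below converts the time remaining until a gap closes into the HH:MM:SS form used by elapstim_req. The helper names (gap_seconds, seconds_to_hms), the 11:30 start time, and the 10-minute safety margin are assumptions made for this example and are not part of the system.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not provided by the system.

# Convert a number of seconds into HH:MM:SS.
seconds_to_hms() {
    local total=$1
    printf '%02d:%02d:%02d\n' $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# Seconds between two times on the same day, both given as HH:MM.
# The 10# prefix forces base 10 so that values such as "08" are not read as octal.
gap_seconds() {
    local start=$1 end=$2
    local s=$(( 10#${start%%:*} * 3600 + 10#${start##*:} * 60 ))
    local e=$(( 10#${end%%:*} * 3600 + 10#${end##*:} * 60 ))
    echo $(( e - s ))
}

# Assumed situation: it is 11:30 and the node is free until 15:00.
# Subtract an assumed 10-minute margin (600 s) for submission and scheduling.
gap=$(gap_seconds 11:30 15:00)
seconds_to_hms $(( gap - 600 ))   # prints 03:20:00
```

The printed value could then be used as the elapstim_req setting in the job script, provided the program actually finishes within that time.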

2. Adjust the job size and requested resources

#!/bin/bash
#PBS -q SQUID
#PBS -l elapstim_req=24:00:00
#PBS --group=GROUPNAME
cd $PBS_O_WORKDIR
./a.out

If a job submitted with a script like the one above waits a long time before starting, first check whether the actual runtime of your program differs from the “elapstim_req” specified in your job script. The scheduler reserves compute nodes based on the “elapstim_req” value. If you specify a long time, the job may not fit into the available “gap times” and may be scheduled later, which can significantly increase the waiting time.

In this example, if your program typically finishes in around 3 hours, change elapstim_req to 4 hours. Even if your program needs about 24 hours, you may be able to reduce the problem size, split the work into multiple jobs, or stop and resume the computation. In that case, change the parameters so that each run finishes within 5 hours and run it several times. In many cases, the work as a whole completes earlier.
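
One way to split work into multiple jobs can be sketched as follows, assuming the computation is a set of independent work items numbered from 1 and that your program can be told which range of items to process. Both are assumptions for illustration, and chunk_ranges is a hypothetical helper, not a system command.

```shell
#!/bin/bash
# Hypothetical helper for illustration only.
# Print "START END" pairs that split items 1..TOTAL into CHUNKS nearly equal parts.
chunk_ranges() {
    local total=$1 chunks=$2
    local base=$(( total / chunks )) extra=$(( total % chunks ))
    local start=1 size i
    for (( i = 0; i < chunks; i++ )); do
        size=$base
        # The first "extra" chunks take one additional item each.
        if (( i < extra )); then size=$(( base + 1 )); fi
        echo "$start $(( start + size - 1 ))"
        start=$(( start + size ))
    done
}

# Assumed workload: 1000 items split across 6 shorter jobs.
chunk_ranges 1000 6
```

Each printed range could be passed to a separate job script with a correspondingly shorter elapstim_req, so that every piece has a better chance of fitting into a “gap time”.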

Note: your job will be forcibly terminated when the runtime reaches the time specified by elapstim_req, so make sure to set elapstim_req with some margin above the actual runtime. If you cannot estimate the runtime, set a longer value first, run the job several times to understand the approximate runtime, and then gradually shorten the elapstim_req setting.
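
The margin described above can be worked out with a small calculation. The sketch below takes the longest of several measured runtimes (in seconds), adds a percentage margin, and rounds up to whole minutes. The function name suggest_elapstim, the 20% margin, and the sample runtimes are assumptions for this example; choose a margin that suits how much your program's runtime varies.

```shell
#!/bin/bash
# Hypothetical helper for illustration only.
# Usage: suggest_elapstim MARGIN_PERCENT RUNTIME_SECONDS...
suggest_elapstim() {
    local margin_pct=$1
    shift
    local max=0 t
    # Use the longest measured runtime as the baseline.
    for t in "$@"; do
        if (( t > max )); then max=$t; fi
    done
    # Add the margin (rounding up), then round up to whole minutes.
    local padded=$(( (max * (100 + margin_pct) + 99) / 100 ))
    local minutes=$(( (padded + 59) / 60 ))
    printf '%02d:%02d:00\n' $(( minutes / 60 )) $(( minutes % 60 ))
}

# Assumed measurements: three runs of 2h50m, 3h05m, and 2h58m.
suggest_elapstim 20 10200 11100 10680   # prints 03:42:00
```

With these sample numbers, rounding the result up to 4:00:00 would also be a reasonable choice.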

This time, we modified the elapstim_req value so that the job finishes within 4 hours.

#!/bin/bash
#PBS -q SQUID
#PBS -l elapstim_req=4:00:00
#PBS --group=GROUPNAME
cd $PBS_O_WORKDIR
./a.out

The job was successfully executed on an “available” node.

Also, when running short jobs, such as tests, specify the “DBG” job class with the #PBS -q option. DBG is a short-job class with a maximum elapstim_req of 10 minutes. Because its jobs complete quickly, the waiting time is often shorter than in the normal queues. Below is an example job script for the DBG queue.

#!/bin/bash
#PBS -q DBG
#PBS -l elapstim_req=10:00
cd $PBS_O_WORKDIR
./a.out

Notes

If the runtime exceeds elapstim_req, the job will be forcibly terminated at that point.

If you really need to run sooner: Use the high-priority queue

High-priority job classes (specified with the #PBS -q option):

SQUID-H

OCT-H

About the Point System