Added on April 23, 2025
Thank you for using the cloud bursting service.
Currently, the cloud bursting function is disabled.
We will post an announcement on this website when it becomes available again.
--------------------------------------------------
As the end of the fiscal year approaches, the usage of the General Purpose CPU nodes has increased, leading to longer job waiting times. In response to this situation, we have been providing an environment where certain jobs can be executed on Oracle Cloud Infrastructure (OCI) using the "Cloud Bursting" feature implemented in SQUID since January 15.
We would like to inform you that the Cloud Bursting feature using OCI will be discontinued on March 21 at 10:00, and starting from March 21 at 13:00, the Cloud Bursting feature will be provided using Microsoft Azure. By utilizing this feature, you may be able to execute your jobs with shorter waiting times than usual. If you wish to run jobs on cloud nodes, please review the following important notes and make use of the Cloud Bursting feature accordingly.
Service Start Date
March 21, 2025, at 13:00 (JST)
Target
Jobs submitted to the SQUID general-purpose CPU nodes that use 1 node and request a runtime of up to 1 hour are eligible to be offloaded to Microsoft Azure.
※ Please note that the target scope of this feature differs from that of the Cloud Bursting function using OCI.
Point Consumption
When a job runs on Microsoft Azure computational resources, the points consumed are the same as for the "general-purpose CPU node normal-priority queue (SQUID)." No additional charges apply, so you can use the feature with peace of mind. More details are provided below.
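As a rough illustration of the point calculation, the formula given in the notes (job execution time × 0.2998 × fuel coefficient × seasonal coefficient) can be sketched in the shell as follows. The function name `estimate_points` is hypothetical, the execution time is assumed to be expressed in hours for a single node, and the fuel and seasonal coefficients of 1.0 are placeholders rather than published values.

```bash
#!/bin/bash
# Consumption coefficient of the general-purpose CPU node normal-priority
# queue, as stated in this notice.
QUEUE_COEFF=0.2998

# estimate_points HOURS FUEL_COEFF SEASONAL_COEFF  (hypothetical helper)
# Assumption: HOURS is the job execution time in hours on one node.
estimate_points() {
    awk -v t="$1" -v q="$QUEUE_COEFF" -v f="$2" -v s="$3" \
        'BEGIN { printf "%.4f\n", t * q * f * s }'
}

# Placeholder coefficients (1.0); substitute the values currently in effect.
estimate_points 1 1.0 1.0   # prints 0.2998
```

Because the result does not depend on where the job runs, the same estimate applies whether the job executes on SQUID or on a cloud node.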
How to Use Cloud Bursting
To use Microsoft Azure computational resources, add options to the job submission shell script (job script file) and submit it to the scheduler. Below is an example of a job script:
#!/bin/bash
#------- qsub options -----------
#PBS -q SQUID                       # Cloud bursting is available only for the SQUID queue.
#PBS --group=G01234                 # Group name to which you belong
#PBS -l cpunum_job=76               # Requested number of CPU cores
#PBS -l elapstim_req=01:00:00       # Requested maximum job execution time (example: 1 hour)
#PBS --enable-cloud-bursting=yes    # Allow cloud bursting
#PBS -U cloud_wait_limit=04:00:00   # If the waiting time exceeds the specified limit, the job may be executed on the cloud. (Example: 4 hours waiting time)

#------- Program execution -----------
module load BaseCPU                 # Load the base environment
cd $PBS_O_WORKDIR                   # Move to the directory where qsub was executed
./a.out > result.txt                # Execute the program
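Before submitting, it can help to confirm that a job script meets the conditions in this notice. The following is a minimal sketch of such a check. The helper names `to_seconds` and `check_cloud_bursting` are hypothetical and not part of SQUID, and the sketch only inspects the queue, the cloud bursting option, and the 1-hour limit.

```bash
#!/bin/bash
# Hypothetical pre-submission check (not provided by SQUID).

# Convert HH:MM:SS to seconds; awk reads "08" as the number 8.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Return 0 if the script targets the SQUID queue, enables cloud bursting,
# and requests an execution time of at most 1 hour.
check_cloud_bursting() {
    cb_script=$1
    grep -Eq '^#PBS[[:space:]]+-q[[:space:]]+SQUID([[:space:]]|$)' "$cb_script" || {
        echo "queue is not SQUID" >&2; return 1; }
    grep -Eq '^#PBS[[:space:]]+--enable-cloud-bursting=yes' "$cb_script" || {
        echo "cloud bursting is not enabled" >&2; return 1; }
    cb_elapsed=$(sed -nE 's/^#PBS[[:space:]]+-l[[:space:]]+elapstim_req=([0-9]+:[0-9]{2}:[0-9]{2}).*/\1/p' "$cb_script" | head -n 1)
    [ -n "$cb_elapsed" ] || { echo "elapstim_req not found" >&2; return 1; }
    [ "$(to_seconds "$cb_elapsed")" -le 3600 ] || {
        echo "elapstim_req exceeds 1 hour" >&2; return 1; }
}
```

Assuming the job script is saved as job.sh (a placeholder name), the check could be chained with submission as `check_cloud_bursting job.sh && qsub job.sh`.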
Notes
1. The hardware configuration of the Microsoft Azure compute nodes differs from that of the SQUID general-purpose CPU nodes, which may result in different computation times and results.
・SQUID general-purpose CPU node:
Processor: Intel Xeon Platinum 8368 (Ice Lake / 2.40 GHz, 38 cores) × 2
Memory: 256 GB
・Microsoft Azure general-purpose CPU node:
Processor: Intel Xeon Platinum 8280 (Cascade Lake / 2.70 GHz, 24 cores) × 4
Memory: 2048 GB
※ Available resources: CPU 76 cores, Memory 248 GB.
The 24 cores/socket × 4 sockets configuration differs from the NUMA structure of the SQUID general-purpose CPU nodes.
As a result, if a job script specifies core allocation on the assumption that it runs in the on-premises environment, execution errors or performance degradation may occur when the job is burst to the cloud.
2. Even if cloud bursting is enabled and Microsoft Azure computational resources are used, the points consumed will be the same as the general-purpose CPU node normal-priority queue (SQUID).
That is, the calculation follows the formula:
Points consumed = Job Execution Time × General-purpose CPU Node normal-priority Queue Consumption Coefficient (0.2998) × General-purpose CPU Node Fuel Coefficient × General-purpose CPU Node Seasonal Coefficient
3. By enabling the cloud bursting feature, you agree to the following points:
- Computation may be performed using Microsoft Azure computational resources.
- Depending on resource availability, Microsoft Azure resources may not be used, and computation may only occur on SQUID CPU nodes.
4. The cloud nodes mount Lustre via NFS, which results in lower I/O performance than on the SQUID general-purpose CPU nodes, where Lustre is mounted directly.
Therefore, programs with heavy I/O processing may take longer to run.
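Regarding note 1, one way to reduce the risk of layout-related errors is to size the run from the node the job actually lands on rather than hard-coding the on-premises socket layout. The sketch below shows the idea. The function name `available_cores` is hypothetical, and it assumes that the program is an OpenMP binary that honours OMP_NUM_THREADS and that `nproc` reflects the cores made available to the job, which has not been verified on the cloud nodes.

```bash
#!/bin/bash
# Sketch: detect the usable core count at run time instead of assuming the
# SQUID layout (2 sockets x 38 cores).

# Number of CPUs usable by this process; falls back to the online CPU count.
available_cores() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# Assumption: the program is an OpenMP binary that reads OMP_NUM_THREADS.
OMP_NUM_THREADS=$(available_cores)
export OMP_NUM_THREADS
echo "Running with ${OMP_NUM_THREADS} threads"
```

Scripts that pin processes to specific cores or NUMA domains would need a similar run-time check in place of fixed core numbers.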
We hope you can make use of the cloud bursting feature after confirming the above information.
Posted: March 19, 2025