Getting the best scheduling placement

The scheduler ensures fair sharing between users on the PBS clusters. It is not simply first come, first served. The scheduling placement depends on the requirements that you specify in your job file. You can see the status of the queue by running the showq command, but your job may be able to skip ahead in the queue (or other jobs may skip in front of you!).

Choose a queue based on number of processors required

Always select the most appropriate queue based on the number of processors and maximum time required by your job. If your job will not last more than 12 hours, choose the short queues. The scheduler may prefer to complete this job before a longer job, or it may identify a time in the future at which it can slot this job between other jobs. Usually, this means that the scheduler will start your job sooner than if jobs were simply run in order of submission.

Specify a wall time in your job file

If you select the ultra long queues, the maximum wall time is very long. However, these queues have the lowest priority in the system. You should always specify a lower wall time limit in the job file, one that matches your task more closely. The scheduler can place the job more efficiently if the information about how much time it requires is accurate.

#PBS -q paraul
#PBS -l nodes=8:ppn=4
#PBS -l walltime=72:00:00

The above example requests 32 processors (8 nodes with 4 processors each), which is allowable in the big and ultra long queues. The ultra long queue is chosen because the user knows beforehand that the job will take approximately 3 days. Without the walltime line, however, the scheduler would assume that this job requires the maximum allowable time for that queue, and its execution might be delayed while other, higher-priority jobs are scheduled.
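The processor count in the example is simply nodes multiplied by processors per node. As a minimal sketch, the hypothetical helper below (total_procs is not a PBS command) does that arithmetic for a request written in the same nodes=N:ppn=P form. It assumes both values are present and does not handle other forms of node specification.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS): total processors requested by a
# resource string of the form "nodes=N:ppn=P".
total_procs() {
    spec=$1
    nodes=${spec#nodes=}    # strip the leading "nodes="  -> "8:ppn=4"
    nodes=${nodes%%:*}      # keep text before first ":"  -> "8"
    ppn=${spec##*ppn=}      # strip everything up to "ppn=" -> "4"
    ppn=${ppn%%:*}          # drop any trailing ":property"
    echo $((nodes * ppn))
}

# The request from the example job file
total_procs "nodes=8:ppn=4"
```

Checking the total before submission helps confirm that the request is within the processor limit of the queue you have chosen.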

Avoid unnecessary idle node reservation

To ensure that this job is scheduled efficiently, the job file should state that it is expected to require only 3 days.

This also prevents unnecessary reservation of idle nodes. The scheduler may decide to reserve nodes for a job if not enough processors are currently available but another job will complete soon and free up enough processors to run it. During this time, the nodes will appear to be free but will not accept other jobs, even if there are enough processors to run them, because the nodes are reserved for the already queued job. If the wall time information is not accurate, the nodes can potentially be blocked in this state for a long time.

Additionally, if nodes are reserved in this state for an existing job, but a new job specifies a wall time that is less than the time it would take for the existing job’s requirements to be fulfilled, the scheduler will slot the new job in between.
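The rule described above can be pictured as a simple comparison between the new job's wall time and the time remaining until the reserved nodes are needed. The sketch below is a deliberately simplified model, not the scheduler's actual logic, which also weighs priorities and processor counts. The function names and the example times are assumptions made for illustration.

```shell
#!/bin/sh
# Convert a wall time written as HH:MM:SS into seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical check: succeeds if a job with the given wall time would
# finish before the reserved nodes are needed by the queued job.
fits_before_reservation() {
    walltime=$1    # wall time requested by the new job, HH:MM:SS
    window=$2      # time until the reservation starts, HH:MM:SS
    [ "$(hms_to_seconds "$walltime")" -lt "$(hms_to_seconds "$window")" ]
}

# A 2-hour job against nodes that are reserved 5.5 hours from now
if fits_before_reservation "02:00:00" "05:30:00"; then
    echo "may be slotted in"
else
    echo "must wait"
fi
```

A job that leaves the wall time at the queue maximum would almost never pass a check like this, which is why an accurate, shorter wall time gives the scheduler more opportunities to start your job early.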

What if I don’t know?

If you are not sure how much time your job will need, it is acceptable to just use the maximum allowed time for your chosen queue, but the scheduling placement may not be optimal. If you have several jobs of similar size, it may be possible to run one first to determine the wall time required before submitting the rest.
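Once a trial job has finished, its elapsed time can be turned into a wall time request for the remaining jobs. The sketch below assumes you already know the elapsed time in seconds, obtained however your site reports it. The helper name and the default 20% safety margin are illustrative assumptions, not site policy.

```shell
#!/bin/sh
# Hypothetical helper: pad a measured run time (in seconds) by a safety
# margin (in percent, default 20) and print it as HH:MM:SS.
walltime_with_margin() {
    seconds=$1
    margin_pct=${2:-20}
    # Add the margin, rounding it up to the next whole second
    padded=$(( seconds + (seconds * margin_pct + 99) / 100 ))
    printf '%02d:%02d:%02d\n' \
        $((padded / 3600)) $((padded % 3600 / 60)) $((padded % 60))
}

# A trial job that ran for 2 h 30 min (9000 s), padded by 20%
walltime_with_margin 9000 20
```

The printed value can be used after walltime= in a #PBS -l line. A modest margin guards against the job being stopped at the limit if a later run turns out slightly slower than the trial.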

Summary

In conclusion, it is best practice to always specify approximate wall times in your job files, especially when using low-priority queues.