Run-time Options for Parallel Processing

Run-time options affecting parallel processing can be specified with the XLSMPOPTS environment variable. This environment variable must be set before you run an application, and uses basic syntax of the form:


Syntax Diagram

Parallelization run-time options can also be specified using OpenMP (OMP) environment variables. When an option set through an OMP environment variable conflicts with an option set through XLSMPOPTS, the OMP setting takes precedence.
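As a minimal sketch of setting XLSMPOPTS before an application starts, the following Python helper builds the variable's value and places it in the environment handed to a child process. The helper names are hypothetical. The colon used to separate multiple options is an assumption, because the syntax diagram is not reproduced in this text.

```python
import os

def build_xlsmpopts(options):
    """Join option settings into a single XLSMPOPTS value.

    `options` maps option names (for example "parthds") to settings.
    ASSUMPTION: multiple options are separated by ':'; confirm this
    against the syntax diagram for your compiler release.
    """
    return ":".join(f"{name}={value}" for name, value in options.items())

def child_environment(options, base_env=None):
    """Return a copy of the environment with XLSMPOPTS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["XLSMPOPTS"] = build_xlsmpopts(options)
    return env

# Request 4 threads and a complete busy-wait state (spins=0, yields=0).
env = child_environment({"parthds": 4, "spins": 0, "yields": 0}, base_env={})
print(env["XLSMPOPTS"])
```

The resulting dictionary could be passed as the env argument of subprocess.run when launching the parallelized program, so that the variable is in place before the application starts.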

Note:
You must use thread-safe compiler mode invocations when compiling parallelized program code.

Run-time option settings for the XLSMPOPTS environment variable are shown below, grouped by category:

Scheduling Algorithm Options


XLSMPOPTS Environment Variable Option / Description

schedule=algorithm=[n]
This option specifies the scheduling algorithm used for loops not explicitly assigned a scheduling algorithm.

Valid options for algorithm are:

  • guided
  • affinity
  • dynamic
  • static

If specified, the chunk size n must be an integer value of 1 or greater.

The default scheduling algorithm is static.
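The rules above (four valid algorithms, an optional chunk size of 1 or greater, and static as the default) can be checked before the option is placed in XLSMPOPTS. This is a sketch only. The function name is hypothetical, and the spelling used when no chunk size is given (schedule=algorithm with no trailing "=") is an assumption.

```python
VALID_ALGORITHMS = ("guided", "affinity", "dynamic", "static")

def schedule_option(algorithm="static", chunk=None):
    """Format a schedule setting after validating it.

    ASSUMPTION: when no chunk size is given, the setting is written as
    "schedule=<algorithm>" without a trailing '='.
    """
    if algorithm not in VALID_ALGORITHMS:
        raise ValueError(f"unknown scheduling algorithm: {algorithm!r}")
    if chunk is None:
        return f"schedule={algorithm}"
    # The chunk size must be an integer value of 1 or greater.
    if not isinstance(chunk, int) or isinstance(chunk, bool) or chunk < 1:
        raise ValueError("chunk size must be an integer of 1 or greater")
    return f"schedule={algorithm}={chunk}"

print(schedule_option())              # default algorithm, no chunk size
print(schedule_option("dynamic", 4))  # explicit algorithm and chunk size
```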

Parallel Environment Options


XLSMPOPTS Environment Variable Option / Description

parthds=num
num represents the number of parallel threads requested, which is usually equivalent to the number of processors available on the system.

Some applications cannot use more threads than the maximum number of processors available. Other applications can experience significant performance improvements if they use more threads than there are processors. This option gives you full control over the number of user threads used to run your program.

The default value for num is the number of processors available on the system.

usrthds=num
num represents the number of user threads expected.

This option should be used if the program code explicitly creates threads, in which case num should be set to the number of threads created.

The default value for num is 0.

stack=num
num specifies the largest amount of space required for a thread's stack.

The default value for num is 2097152.

The glibc library is compiled by default to allow a stack size of 2 MB. Setting num to a value greater than this will cause the default stack size to be used instead. If larger stack sizes are required, you should link the program to a glibc library compiled with the FLOATING_STACKS parameter turned on.
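The fallback behavior described above can be summarized in a few lines. This sketch assumes num is expressed in bytes, which is consistent with the default of 2097152 matching the 2 MB limit. The function name and the floating_stacks flag are hypothetical stand-ins for "linked against a glibc built with FLOATING_STACKS".

```python
DEFAULT_STACK = 2097152  # 2 * 1024 * 1024, the documented default

def effective_stack(num=DEFAULT_STACK, floating_stacks=False):
    """Return the stack size a thread would actually get.

    ASSUMPTION: `num` is in bytes. `floating_stacks` is a hypothetical
    flag modeling a glibc compiled with FLOATING_STACKS turned on.
    """
    if num > DEFAULT_STACK and not floating_stacks:
        # Requests above the limit fall back to the default size.
        return DEFAULT_STACK
    return num

print(effective_stack(4 * 1024 * 1024))        # falls back to the default
print(effective_stack(4 * 1024 * 1024, True))  # larger stack is honored
```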

Performance Tuning Options


XLSMPOPTS Environment Variable Option / Description

spins=num
num represents the number of loop spins, or iterations, before a yield occurs.

When a thread completes its work, the thread continues executing in a tight loop looking for new work. One complete scan of the work queue is done during each busy-wait state. An extended busy-wait state can make a particular application highly responsive, but can also harm the overall responsiveness of the system unless the thread is given instructions to periodically scan for and yield to requests from other applications.

A complete busy-wait state for benchmarking purposes can be forced by setting both spins and yields to 0.

The default value for num is 100.

yields=num
num represents the number of yields before a sleep occurs.

When a thread sleeps, it completely suspends execution until another thread signals that there is work to do. This provides better system utilization, but also adds extra system overhead for the application.

The default value for num is 100.

delays=num
num represents a period of do-nothing delay time between each scan of the work queue. Each unit of delay is achieved by running a single no-memory-access delay loop.

The default value for num is 500.
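The way spins and yields interact can be illustrated with a simplified, single-threaded model. This is a sketch, not the run-time library's implementation. The function name idle_wait and its work_arrives_at_scan parameter are hypothetical, and the delays option is not modeled. Combinations where only one of spins or yields is 0 are rejected because the text does not describe them.

```python
def idle_wait(work_arrives_at_scan, spins=100, yields=100):
    """Model an idle thread looking for new work.

    Returns (scans, yields_done, slept): the number of work-queue scans
    performed, the number of yields issued, and whether the thread went
    to sleep before work arrived at scan number `work_arrives_at_scan`.
    """
    busy_wait = (spins == 0 and yields == 0)  # complete busy-wait state
    if not busy_wait and (spins < 1 or yields < 1):
        raise ValueError("only spins=yields=0 or positive values are modeled")

    scans = 0
    yields_done = 0
    while True:
        spin_count = 0
        # One scan of the work queue per spin.
        while busy_wait or spin_count < spins:
            scans += 1
            if scans >= work_arrives_at_scan:
                return scans, yields_done, False  # found work, no sleep
            spin_count += 1
        # Spin budget used up: yield to other applications.
        yields_done += 1
        if yields_done >= yields:
            return scans, yields_done, True  # yield budget used up: sleep

print(idle_wait(250))                   # work found during the third spin phase
print(idle_wait(10**6, spins=2, yields=3))  # gives up and sleeps
print(idle_wait(1000, spins=0, yields=0))   # never yields or sleeps
```

With the defaults of 100 spins and 100 yields, a thread in this model scans the queue up to 10000 times before sleeping; setting both values to 0 removes the yield and sleep steps entirely.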

Dynamic Profiling Options


XLSMPOPTS Environment Variable Option / Description

profilefreq=num
num represents the sampling rate at which each loop is revisited to determine appropriateness for parallel processing.

The run-time library uses dynamic profiling to tune the performance of automatically parallelized loops. Dynamic profiling gathers information about loop running times to determine whether the loop should be run sequentially or in parallel the next time through. Threshold running times are set by the parthreshold and seqthreshold dynamic profiling options, described below.

If num is 0, all profiling is turned off and no profiling overhead is incurred. If num is greater than 0, the running time of the loop is monitored once every num times through the loop.

The default for num is 16. The maximum sampling rate is 32. Values of num exceeding 32 are changed to 32.

parthreshold=mSec
mSec specifies the expected running time in milliseconds below which a loop must be run sequentially. mSec can be specified using decimal places.

If parthreshold is set to 0, a parallelized loop will never be serialized by the dynamic profiler.

The default value for mSec is 0.2 milliseconds.

seqthreshold=mSec
mSec specifies the expected running time in milliseconds beyond which a loop that has been serialized by the dynamic profiler must revert to being run in parallel mode again. mSec can be specified using decimal places.

The default value for mSec is 5 milliseconds.
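The three dynamic profiling options can be read together as a small decision rule. The sketch below restates that rule in Python under the documented defaults. It is an illustration of the text, not the profiler's actual code, and the function names and the "parallel" and "sequential" mode labels are hypothetical.

```python
MAX_PROFILEFREQ = 32  # values of num exceeding 32 are changed to 32

def effective_profilefreq(num=16):
    """Return the sampling rate actually used; 0 means profiling is off."""
    if num < 0:
        raise ValueError("num must be 0 or greater")
    return min(num, MAX_PROFILEFREQ)

def next_mode(mode, expected_ms, parthreshold=0.2, seqthreshold=5.0):
    """Decide how a loop runs the next time through.

    `expected_ms` is the loop's expected running time in milliseconds.
    """
    if mode == "parallel":
        # parthreshold=0 means a parallelized loop is never serialized.
        if parthreshold > 0 and expected_ms < parthreshold:
            return "sequential"
        return "parallel"
    if mode == "sequential":
        # A serialized loop reverts to parallel beyond seqthreshold.
        if expected_ms > seqthreshold:
            return "parallel"
        return "sequential"
    raise ValueError(f"unknown mode: {mode!r}")

print(effective_profilefreq(40))       # clamped to the maximum
print(next_mode("parallel", 0.1))      # too short to be worth parallel
print(next_mode("sequential", 6.0))    # long enough to go parallel again
```

Because the two thresholds differ (0.2 and 5 milliseconds by default), a loop whose running time falls between them keeps whichever mode it is already in.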

Related Concepts

Program Parallelization
OpenMP Directives

Related References

OpenMP Run-time Options for Parallel Processing
Built-in Functions Used for Parallel Processing

For complete information about the OpenMP Specification, see:
OpenMP Web site at www.openmp.org
OpenMP Specification at www.openmp.org/specs

IBM Copyright 2003