IBM Extension

OpenMP execution environment, lock and timing routines

The OpenMP specification provides a number of routines which allow you to control and query the parallel execution environment.

Parallel threads created by the run-time environment through the OpenMP interface are considered independent of the threads you create and control using calls to the Fortran Pthreads library module. References within the following descriptions to "serial portions of the program" refer to portions of the program that are executed by only one of the threads that have been created by the run-time environment. For example, you can create multiple threads by using f_pthread_create. However, if you then call omp_get_num_threads from outside of an OpenMP parallel block, or from within a serialized nested parallel region, the function will return 1, regardless of the number of threads that are currently executing.

OpenMP run-time library calls must not appear in PURE and ELEMENTAL procedures.

Table 12. OpenMP execution environment routines

omp_get_dynamic
omp_get_max_threads
omp_get_nested
omp_get_num_procs
omp_get_num_threads
omp_get_thread_num
omp_in_parallel
omp_set_dynamic
omp_set_nested
omp_set_num_threads

Included in the OpenMP run-time library are two routines that support a portable wall-clock timer.

Table 13. OpenMP timing routines

omp_get_wtick
omp_get_wtime

The OpenMP run-time library also supports a set of simple and nestable lock routines. Lock variables must be accessed only through these routines. A simple lock may not be locked again if it is already in a locked state. Simple lock variables are associated with simple locks and may only be passed to simple lock routines. A nestable lock may be locked multiple times by the same thread. Nestable lock variables are associated with nestable locks and may only be passed to nestable lock routines.

For all the routines listed below, the lock variable is an integer whose KIND type parameter is denoted either by the symbolic constant omp_lock_kind, or by omp_nest_lock_kind.

The kind of the lock variable depends on the compilation mode: it is 4 for 32-bit applications and 8 for 64-bit applications.
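
For example, the following sketch declares one variable of each lock kind, relying only on the kind constants provided by the omp_lib module, and initializes and destroys the corresponding locks:

      USE omp_lib
      INTEGER(kind=omp_lock_kind) SLCK        ! variable for a simple lock
      INTEGER(kind=omp_nest_lock_kind) NLCK   ! variable for a nestable lock
      CALL omp_init_lock(SLCK)
      CALL omp_init_nest_lock(NLCK)
      CALL omp_destroy_lock(SLCK)
      CALL omp_destroy_nest_lock(NLCK)
      END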

Table 14. OpenMP simple lock routines

omp_destroy_lock
omp_init_lock
omp_set_lock
omp_test_lock
omp_unset_lock

Table 15. OpenMP nestable lock routines

omp_destroy_nest_lock
omp_init_nest_lock
omp_set_nest_lock
omp_test_nest_lock
omp_unset_nest_lock
Note:
You can define and implement your own versions of the OpenMP routines. However, by default, the compiler will substitute the XL Fortran versions of the OpenMP routines regardless of the existence of other implementations, unless you specify the -qnoswapomp compiler option. For more information, see XL Fortran Compiler Reference.

omp_destroy_lock(svar)

Purpose

This subroutine disassociates a given lock variable from all locks. You must use omp_init_lock to reinitialize a lock variable that was destroyed with a call to omp_destroy_lock before using it again as a lock variable.

If you call omp_destroy_lock with an uninitialized lock variable, the result of the call is undefined.

Class

Subroutine.

Argument Type and Attributes

svar
Type integer with kind omp_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

In the following example, one at a time, the threads gain ownership of the lock associated with the lock variable LCK, print the thread ID, and then release ownership of the lock.

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LCK
      INTEGER ID
      CALL omp_init_lock(LCK)
!$OMP PARALLEL SHARED(LCK), PRIVATE(ID)
      ID = omp_get_thread_num()
      CALL omp_set_lock(LCK)
      PRINT *,'MY THREAD ID IS', ID
      CALL omp_unset_lock(LCK)
!$OMP END PARALLEL
      CALL omp_destroy_lock(LCK)
      END

omp_destroy_nest_lock(nvar)

Purpose

This subroutine destroys a nestable lock, causing the associated lock variable to become undefined. The variable nvar must be an initialized nestable lock variable that is in the unlocked state. You must use omp_init_nest_lock to reinitialize a lock variable that was destroyed with a call to omp_destroy_nest_lock before using it again as a lock variable.

If you call omp_destroy_nest_lock using an uninitialized variable, the result is undefined.

Class

Subroutine.

Argument Type and Attributes

nvar
Type integer with kind omp_nest_lock_kind.

Result Type and Attributes

None.

Result Value

None.
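
Examples

The following sketch shows the life cycle of a nestable lock variable: the lock is initialized, set and unset, destroyed, and then reinitialized before the variable is used as a lock again.

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) NLCK
      CALL omp_init_nest_lock(NLCK)     ! NLCK is unlocked, nesting count is 0
      CALL omp_set_nest_lock(NLCK)
      CALL omp_unset_nest_lock(NLCK)
      CALL omp_destroy_nest_lock(NLCK)  ! NLCK is now undefined
      CALL omp_init_nest_lock(NLCK)     ! reinitialize before reusing
      CALL omp_destroy_nest_lock(NLCK)
      END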

omp_get_dynamic()

Purpose

The omp_get_dynamic function returns .TRUE. if dynamic thread adjustment by the run-time environment is enabled, and .FALSE. if it is disabled.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default logical.

Result Value

.TRUE. if dynamic thread adjustment by the run-time environment is enabled; .FALSE. otherwise.
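
Examples

The following sketch prints whether dynamic thread adjustment is currently enabled. The output depends on the run-time environment and on any earlier call to omp_set_dynamic or setting of the OMP_DYNAMIC environment variable.

      USE omp_lib
      IF (omp_get_dynamic()) THEN
        PRINT *, 'Dynamic thread adjustment is enabled.'
      ELSE
        PRINT *, 'Dynamic thread adjustment is disabled.'
      END IF
      END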

omp_get_max_threads()

Purpose

This function returns the maximum number of threads that can execute concurrently in a single parallel region. The return value is equal to the maximum value that can be returned by the omp_get_num_threads function. If you use omp_set_num_threads to change the number of threads, subsequent calls to omp_get_max_threads will return the new value.

The function has global scope, which means that the maximum value it returns applies to all functions, subroutines, and compilation units in the program. It returns the same value whether executing from a serial or parallel region.

Because the function returns an upper bound, you can use omp_get_max_threads to allocate data structures that are large enough for every thread, even when you have enabled dynamic thread adjustment by passing omp_set_dynamic an argument that evaluates to .TRUE.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default integer.

Result Value

The maximum number of threads that can execute concurrently in a single parallel region.
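
Examples

The following sketch uses omp_get_max_threads to allocate one array element for every thread that might execute the next parallel region; each thread then marks its own entry.

      USE omp_lib
      INTEGER, ALLOCATABLE :: COUNTS(:)
      INTEGER MAXTHDS
      MAXTHDS = omp_get_max_threads()
      ALLOCATE(COUNTS(0:MAXTHDS-1))     ! one entry for every possible thread
      COUNTS = 0
!$OMP PARALLEL
      COUNTS(omp_get_thread_num()) = 1  ! each thread marks only its own entry
!$OMP END PARALLEL
      PRINT *, 'Threads used:', SUM(COUNTS), 'of a maximum of', MAXTHDS
      DEALLOCATE(COUNTS)
      END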

omp_get_nested()

Purpose

The omp_get_nested function returns .TRUE. if nested parallelism is enabled and .FALSE. if nested parallelism is disabled.

Currently, XL Fortran does not support OpenMP nested parallelism.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default logical.

Result Value

.TRUE. if nested parallelism is enabled; .FALSE. otherwise.

omp_get_num_procs()

Purpose

The omp_get_num_procs function returns the number of online processors on the machine.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default integer.

Result Value

The number of online processors on the machine.
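
Examples

The following sketch simply reports the number of online processors that the run-time environment detects.

      USE omp_lib
      INTEGER NPROCS
      NPROCS = omp_get_num_procs()
      PRINT *, 'This machine has', NPROCS, 'online processors.'
      END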

omp_get_num_threads()

Purpose

The omp_get_num_threads function returns the number of threads in the team currently executing the parallel region from which it is called. The function binds to the closest enclosing PARALLEL directive.

The omp_set_num_threads subroutine and the OMP_NUM_THREADS environment variable control the number of threads in a team. If you do not explicitly set the number of threads, the run-time environment will use the number of online processors on the machine by default.

If you call omp_get_num_threads from a serial portion of your program or from a nested parallel region that is serialized, the function returns 1.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default integer.

Result Value

The number of threads in the team currently executing the parallel region from which the function is called.

Examples

      USE omp_lib
      INTEGER N1, N2

      N1 = omp_get_num_threads()
      PRINT *, N1
!$OMP PARALLEL PRIVATE(N2)
      N2 = omp_get_num_threads()
      PRINT *, N2
!$OMP END PARALLEL
      END

The omp_get_num_threads call returns 1 in the serial section of the code, so N1 is assigned the value 1. N2 is assigned the number of threads in the team executing the parallel region, so the output of the second print statement will be an arbitrary number less than or equal to the value returned by omp_get_max_threads.

omp_get_thread_num()

Purpose

This function returns the number of the currently executing thread within the team. The number returned will always be between 0 and NUM_PARTHDS - 1, where NUM_PARTHDS is the number of threads currently executing in the team. The master thread of the team returns a value of 0.

If you call omp_get_thread_num from within a serial region, from within a serialized nested parallel region, or from outside the dynamic extent of any parallel region, this function will return a value of 0.

This function binds to the closest parallel region.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default integer.

Result Value

The number of the currently executing thread within the team, between 0 and NUM_PARTHDS - 1, where NUM_PARTHDS is the number of threads currently executing in the team. A call to omp_get_thread_num from a serial region, from a serialized nested parallel region, or from outside the dynamic extent of any parallel region returns 0.

Examples

The following example illustrates the return value of the omp_get_thread_num routine in a PARALLEL region and a MASTER construct.

      USE omp_lib
      INTEGER NP
      call omp_set_num_threads(4)  ! 4 threads are used in the
                                   ! parallel region

!$OMP PARALLEL PRIVATE(NP)
      NP = omp_get_thread_num()
      CALL WORK('in parallel', NP)

!$OMP MASTER
      NP = omp_get_thread_num()
      CALL WORK('in master', NP)
!$OMP END MASTER
!$OMP END PARALLEL
      END
      SUBROUTINE WORK(msg, THD_NUM)
      INTEGER THD_NUM
      character(*) msg
      PRINT *, msg, THD_NUM
      END

Output:

in parallel 1
in parallel 3
in parallel 2
in parallel 0
in master 0

(The order may be different.)

omp_get_wtick()

Purpose

The omp_get_wtick function returns a double precision value equal to the number of seconds between consecutive clock ticks.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Double precision real.

Result Value

The number of seconds between consecutive ticks of the operating system real-time clock.

Examples

      USE omp_lib
      DOUBLE PRECISION WTICKS
      WTICKS = omp_get_wtick()
      PRINT *, 'The clock ticks ', 10 / WTICKS, &
      ' times in 10 seconds.'
      END

omp_get_wtime()

Purpose

The omp_get_wtime function returns a double precision value equal to the number of seconds since the initial value of the operating system real-time clock. The initial value is guaranteed not to change during execution of the program.

The value returned by the omp_get_wtime function is not consistent across all threads in the team.

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Double precision real.

Result Value

The number of seconds since the initial value of the operating system real-time clock.

Examples

      USE omp_lib
      DOUBLE PRECISION START, END
      START = omp_get_wtime()
!     Work to be timed
      END = omp_get_wtime()
      PRINT *, 'Stuff took ', END - START, ' seconds.'
      END

omp_in_parallel()

Purpose

The omp_in_parallel function returns .TRUE. if you call it from the dynamic extent of a region executing in parallel and returns .FALSE. otherwise. If you call omp_in_parallel from a region that is serialized but nested within the dynamic extent of a region executing in parallel, the function will still return .TRUE.. (Nested parallel regions are serialized by default. See omp_set_nested(enable_expr) and the OMP_NESTED environment variable for more information.)

Class

Function.

Argument Type and Attributes

None.

Result Type and Attributes

Default logical.

Result Value

.TRUE. if called from the dynamic extent of a region executing in parallel. .FALSE. otherwise.

Examples

In the following example, the first call to omp_in_parallel returns .FALSE. because the call is outside the dynamic extent of any parallel region. The second call returns .TRUE., even if the nested PARALLEL DO loop is serialized, because the call is still inside the dynamic extent of the outer PARALLEL DO loop.

      USE omp_lib
      INTEGER N, M
      N = 4
      M = 3
      PRINT*, omp_in_parallel()
!$OMP PARALLEL DO
      DO I = 1,N
!$OMP   PARALLEL DO
        DO J=1, M
          PRINT *, omp_in_parallel()
        END DO
!$OMP   END PARALLEL DO
      END DO
!$OMP END PARALLEL DO
      END

omp_init_lock(svar)

Purpose

The omp_init_lock subroutine initializes a lock and associates it with the lock variable passed in as a parameter. After the call to omp_init_lock, the initial state of the lock variable is unlocked.

If you call this routine with a lock variable that you have already initialized, the result of the call is undefined.

Class

Subroutine.

Argument Type and Attributes

svar
Integer of kind omp_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

In the following example, one at a time, the threads gain ownership of the lock associated with the lock variable LCK, print the thread ID, and release ownership of the lock.

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LCK
      INTEGER ID
      CALL omp_init_lock(LCK)
!$OMP PARALLEL SHARED(LCK), PRIVATE(ID)
      ID = omp_get_thread_num()
      CALL omp_set_lock(LCK)
      PRINT *,'MY THREAD ID IS', ID
      CALL omp_unset_lock(LCK)
!$OMP END PARALLEL
      CALL omp_destroy_lock(LCK)
      END

omp_init_nest_lock(nvar)

Purpose

The omp_init_nest_lock subroutine allows you to initialize a nestable lock and associate it with the lock variable you specify. The initial state of the lock variable is unlocked, and the initial nesting count is zero. The variable nvar must be an uninitialized nestable lock variable.

If you call omp_init_nest_lock using a variable that is already initialized, the result is undefined.

Class

Subroutine.

Argument Type and Attributes

nvar
Integer of kind omp_nest_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

The following example illustrates how a nestable lock can be used to protect updates to the variable P in a PARALLEL SECTIONS construct.

      USE omp_lib
      INTEGER P
      INTEGER A
      INTEGER B
      INTEGER ( kind=omp_nest_lock_kind ) LCK
      CALL omp_init_nest_lock ( LCK )   ! initialize the nestable lock
!$OMP PARALLEL SECTIONS
!$OMP SECTION
      CALL omp_set_nest_lock ( LCK )
      P = P + A
      CALL omp_set_nest_lock ( LCK )
      P = P + B
      CALL omp_unset_nest_lock ( LCK )
      CALL omp_unset_nest_lock ( LCK )
!$OMP SECTION
      CALL omp_set_nest_lock ( LCK )
      P = P + B
      CALL omp_unset_nest_lock ( LCK )
!$OMP END PARALLEL SECTIONS

      CALL omp_destroy_nest_lock ( LCK )
      END

omp_set_dynamic(enable_expr)

Purpose

The omp_set_dynamic subroutine enables or disables dynamic adjustment, by the run-time environment, of the number of threads available to execute parallel regions.

If you call omp_set_dynamic with a scalar_logical_expression that evaluates to .TRUE., the run-time environment can automatically adjust the number of threads that are used to execute subsequent parallel regions to obtain the best use of system resources. The number of threads you specify using omp_set_num_threads becomes the maximum, not exact, thread count.

If you call the subroutine with a scalar_logical_expression which evaluates to .FALSE., dynamic adjustment of the number of threads is disabled. The run-time environment cannot automatically adjust the number of threads used to execute subsequent parallel regions. The value you pass to omp_set_num_threads becomes the exact thread count.

By default, dynamic thread adjustment is enabled. If your code depends on a specific number of threads for correct execution, you should explicitly disable dynamic threads.

If the routine is called from a portion of the program where the omp_in_parallel routine returns true, the routine has no effect.

This subroutine has precedence over the OMP_DYNAMIC environment variable.

Class

Subroutine.

Argument Type and Attributes

enable_expr
Logical.

Result Type and Attributes

None.

Result Value

None.
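
Examples

In the following sketch, dynamic thread adjustment is disabled before the number of threads is set, so the parallel region is expected to run with exactly four threads, provided the execution environment can support them.

      USE omp_lib
      CALL omp_set_dynamic(.FALSE.)   ! request an exact thread count
      CALL omp_set_num_threads(4)
!$OMP PARALLEL
!$OMP MASTER
      PRINT *, 'Team size is', omp_get_num_threads()
!$OMP END MASTER
!$OMP END PARALLEL
      END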

omp_set_lock(svar)

Purpose

The omp_set_lock subroutine forces the calling thread to wait until the specified lock is available before executing subsequent instructions. The calling thread is given ownership of the lock when it becomes available.

If you call this routine with an uninitialized lock variable, the result of the call is undefined. If a thread that owns a lock tries to lock it again by issuing a call to omp_set_lock, the thread produces a deadlock.

Class

Subroutine.

Argument Type and Attributes

svar
Integer of kind omp_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

In the following example, the lock variable LCK_X is used to avoid race conditions when updating the shared variable X. By setting the lock before each update to X and unsetting it after the update, you ensure that only one thread is updating X at a given time.

      USE omp_lib
      INTEGER A(100), X
      INTEGER(kind=omp_lock_kind) LCK_X
      X=1
      CALL omp_init_lock (LCK_X)
!$OMP PARALLEL PRIVATE (I), SHARED (A, X)
!$OMP DO
      DO I = 3, 100
        A(I) = I * 10
        CALL omp_set_lock (LCK_X)
        X = X + A(I)
        CALL omp_unset_lock (LCK_X)
      END DO
!$OMP END DO
!$OMP END PARALLEL
      CALL omp_destroy_lock (LCK_X)
      END

omp_set_nested(enable_expr)

Purpose

The omp_set_nested subroutine enables or disables nested parallelism.

If you call the subroutine with a scalar_logical_expression that evaluates to .FALSE., nested parallelism is disabled. Nested parallel regions are serialized, and they are executed by the current thread. This is the default setting.

If you call the subroutine with a scalar_logical_expression that evaluates to .TRUE., nested parallelism is enabled. Parallel regions that are nested can deploy additional threads to the team. It is up to the run-time environment to determine whether additional threads should be deployed. Therefore, the number of threads used to execute parallel regions may vary from one nested region to the next.

If the routine is called from a portion of the program where the omp_in_parallel routine returns true, the routine has no effect.

This subroutine takes precedence over the OMP_NESTED environment variable.

Currently, XL Fortran does not support OpenMP nested parallelism.

Class

Subroutine.

Argument Type and Attributes

enable_expr
Logical.

Result Type and Attributes

None.

Result Value

None.
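
Examples

The following sketch enables and then disables nested parallelism, querying the setting after each call with omp_get_nested. Because XL Fortran does not currently support OpenMP nested parallelism, nested parallel regions are serialized in either case.

      USE omp_lib
      CALL omp_set_nested(.TRUE.)     ! request nested parallelism
      PRINT *, 'Nested parallelism enabled:', omp_get_nested()
      CALL omp_set_nested(.FALSE.)    ! restore the default setting
      PRINT *, 'Nested parallelism enabled:', omp_get_nested()
      END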

omp_set_nest_lock(nvar)

Purpose

The omp_set_nest_lock subroutine allows you to set a nestable lock. The thread executing the subroutine will wait until the lock becomes available and then set that lock, incrementing the nesting count. A nestable lock is available if it is owned by the thread executing the subroutine, or is unlocked.

Class

Subroutine.

Argument Type and Attributes

nvar
Integer of kind omp_nest_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

      USE omp_lib
      INTEGER P
      INTEGER A
      INTEGER B
      INTEGER ( kind=omp_nest_lock_kind ) LCK

      CALL omp_init_nest_lock ( LCK )

!$OMP PARALLEL SECTIONS
!$OMP SECTION
      CALL omp_set_nest_lock ( LCK )
      P = P + A
      CALL omp_set_nest_lock ( LCK )
      P = P + B
      CALL omp_unset_nest_lock ( LCK )
      CALL omp_unset_nest_lock ( LCK )
!$OMP SECTION
      CALL omp_set_nest_lock ( LCK )
      P = P + B
      CALL omp_unset_nest_lock ( LCK )
!$OMP END PARALLEL SECTIONS

      CALL omp_destroy_nest_lock ( LCK )
      END

omp_set_num_threads(number_of_threads_expr)

Purpose

The omp_set_num_threads subroutine tells the run-time environment how many threads to use in the next parallel region. The scalar_integer_expression that you pass to the subroutine is evaluated, and its value is used as the number of threads. If you have enabled dynamic adjustment of the number of threads (see omp_set_dynamic(enable_expr)), omp_set_num_threads sets the maximum number of threads to use for the next parallel region. The run-time environment then determines the exact number of threads to use. However, when dynamic adjustment of the number of threads is disabled, omp_set_num_threads sets the exact number of threads to use in the next parallel region. If the number of threads you request exceeds the number your execution environment can support, your application will terminate.

This subroutine takes precedence over the OMP_NUM_THREADS environment variable.

If you call this subroutine from the dynamic extent of a region executing in parallel, the behavior of the subroutine is undefined.

Class

Subroutine.

Argument Type and Attributes

number_of_threads_expr
Scalar integer expression.

Result Type and Attributes

None.

Result Value

None.
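
Examples

In the following sketch, dynamic thread adjustment is disabled so that the value passed to omp_set_num_threads becomes the exact team size for the next parallel region.

      USE omp_lib
      INTEGER NTHDS
      CALL omp_set_dynamic(.FALSE.)  ! make the requested count exact
      CALL omp_set_num_threads(3)
!$OMP PARALLEL PRIVATE(NTHDS)
      NTHDS = omp_get_num_threads()
!$OMP MASTER
      PRINT *, 'Executing the parallel region with', NTHDS, 'threads'
!$OMP END MASTER
!$OMP END PARALLEL
      END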

omp_test_lock(svar)

Purpose

The omp_test_lock function attempts to set the lock associated with the specified lock variable. It returns .TRUE. if it was able to set the lock and .FALSE. otherwise. In either case, the calling thread will continue to execute subsequent instructions in the program.

If you call omp_test_lock with an uninitialized lock variable, the result of the call is undefined.

Class

Function.

Argument Type and Attributes

svar
Integer of kind omp_lock_kind.

Result Type and Attributes

Default logical.

Result Value

.TRUE. if the function was able to set the lock. .FALSE. otherwise.

Examples

In the following example, a thread repeatedly executes WORK_A until it can set the lock variable, LCK. When the lock variable is set, the thread executes WORK_B.

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LCK
      INTEGER ID
      CALL omp_init_lock (LCK)
!$OMP PARALLEL SHARED(LCK), PRIVATE(ID)
      ID = omp_get_thread_num()
      DO WHILE (.NOT. omp_test_lock(LCK))
        CALL WORK_A (ID)
      END DO
      CALL WORK_B (ID)
      CALL omp_unset_lock (LCK)
!$OMP END PARALLEL
      CALL omp_destroy_lock (LCK)
      END

omp_test_nest_lock(nvar)

Purpose

The omp_test_nest_lock function attempts to set a nestable lock using the same method as omp_set_nest_lock, but the executing thread does not wait for the lock to become available. If the lock is set successfully, the function increments the nesting count and returns the new nesting count. If the lock is unavailable, the function returns zero. The result value is always a default integer.

Class

Function.

Argument Type and Attributes

nvar
Integer of kind omp_nest_lock_kind.

Result Type and Attributes

Default integer.

Result Value

The new nesting count if the lock was set successfully; zero if the lock is unavailable.
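
Examples

The following sketch attempts to set a nestable lock twice from the same thread without blocking. Each successful call to omp_test_nest_lock returns the new nesting count, so the program is expected to print 1 and then 2.

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) NLCK
      INTEGER NCOUNT
      CALL omp_init_nest_lock(NLCK)
      NCOUNT = omp_test_nest_lock(NLCK)   ! sets the lock; nesting count is 1
      PRINT *, 'Nesting count:', NCOUNT
      NCOUNT = omp_test_nest_lock(NLCK)   ! same thread nests the lock; count is 2
      PRINT *, 'Nesting count:', NCOUNT
      CALL omp_unset_nest_lock(NLCK)
      CALL omp_unset_nest_lock(NLCK)
      CALL omp_destroy_nest_lock(NLCK)
      END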

omp_unset_lock(svar)

Purpose

This subroutine causes the executing thread to release ownership of the specified lock. The lock can then be set by another thread as required. The behavior of the omp_unset_lock subroutine is undefined if the calling thread does not own the lock specified by svar, or if the routine is called with an uninitialized lock variable.

Class

Subroutine.

Argument Type and Attributes

svar
Integer of kind omp_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

      USE omp_lib
      INTEGER A(100), X
      INTEGER(kind=omp_lock_kind) LCK_X
      X = 1
      CALL omp_init_lock (LCK_X)
!$OMP PARALLEL PRIVATE (I), SHARED (A, X)
!$OMP DO
      DO I = 3, 100
        A(I) = I * 10
        CALL omp_set_lock (LCK_X)
        X = X + A(I)
        CALL omp_unset_lock (LCK_X)
      END DO
!$OMP END DO
!$OMP END PARALLEL
      CALL omp_destroy_lock (LCK_X)
      END

In this example, the lock variable LCK_X is used to avoid race conditions when updating the shared variable X. By setting the lock before each update to X and unsetting it after the update, you ensure that only one thread is updating X at a given time.

omp_unset_nest_lock(nvar)

Purpose

The omp_unset_nest_lock subroutine allows you to release ownership of a nestable lock. The subroutine decrements the nesting count and, if the resulting count is zero, releases the executing thread from ownership of the nestable lock.

Class

Subroutine.

Argument Type and Attributes

nvar
Integer of kind omp_nest_lock_kind.

Result Type and Attributes

None.

Result Value

None.

Examples

      USE omp_lib
      INTEGER P
      INTEGER A
      INTEGER B
      INTEGER ( kind=omp_nest_lock_kind ) LCK

      CALL omp_init_nest_lock ( LCK )

!$OMP PARALLEL SECTIONS
!$OMP SECTION
      CALL omp_set_nest_lock ( LCK )
      P = P + A
      CALL omp_set_nest_lock ( LCK )
      P = P + B
      CALL omp_unset_nest_lock ( LCK )
      CALL omp_unset_nest_lock ( LCK )
!$OMP SECTION
      CALL omp_set_nest_lock ( LCK )
      P = P + B
      CALL omp_unset_nest_lock ( LCK )
!$OMP END PARALLEL SECTIONS

      CALL omp_destroy_nest_lock ( LCK )
      END
End of IBM Extension