Contents

About this document
Who should read this document
How to use this document
How this document is organized
Conventions and terminology used in this document
Typographical conventions
How to read syntax diagrams
How to read syntax statements
Examples
Notes on the path names
Notes on the terminology used
Related information
IBM XL Fortran documentation
Additional documentation
Related documentation
Standards documents
Technical support
How to send your comments
Performance concepts
Optimization explained
Tuning explained
Beyond optimization and tuning: effective programming techniques
Optimizing XL compiler applications
Why optimization is essential
Basic command-line optimization
Optimizing at level 0
Optimizing at level 2
Advanced command-line optimization
Optimizing at level 3
An intermediate step: adding -qhot suboptions at level 3
Optimization at level 4
Optimization at level 5
Benefits of high-order transformation (HOT)
HOT short vectorization
HOT long vectorization
HOT array size adjustment
Benefits of interprocedural analysis (IPA)
Using IPA on the compile step only
IPA Levels and other IPA suboptions
Using IPA across the XL compiler family
Benefits of profile-directed feedback (PDF)
PDF walkthrough
Getting more performance
Tuning XL compiler applications
Tuning for your target architecture
Using -qarch
Using -qtune
Using -qcache
Before you finish tuning
Further option-driven tuning
Options for providing application characteristics
Options to control optimization transformations
Options to assist with performance analysis
Options that can inhibit performance
Advanced optimization concepts
Aliasing
Inlining
Finding the right level of inlining
Managing code size
Steps for reducing code size
Compiler option influences on code size
The -qipa compiler option
The -Q inlining option
The -qhot compiler option
The -qcompact compiler option
Other influences on code size
High activity areas
Computed GOTOs and CASE constructs
Linking and code size
Debugging optimized code
Different results in optimized programs
Compiler-friendly programming techniques
General practices
Variables and pointers
Arrays
Choosing appropriate variable sizes
High performance libraries
Using the Mathematical Acceleration Subsystem (MASS)
Using the scalar library
Using the vector libraries
Compiling and linking a program with MASS
Using the Basic Linear Algebra Subprograms (BLAS)
BLAS function syntax
Linking the libxlopt library
Parallel programming with XL Fortran
Compiling your SMP code
Setting OMP and SMP run time options
The XLSMPOPTS environment variable
OpenMP environment variables
Optimizing your SMP code
Developing and running SMP applications
An introduction to SMP directives
Parallel region construct
Work-sharing constructs
Combined parallel work-sharing constructs
Synchronization constructs
Other OpenMP Directives
Non-OpenMP SMP directives
Detailed descriptions of SMP directives
ATOMIC
BARRIER
CRITICAL / END CRITICAL
DO / END DO
DO SERIAL
FLUSH
MASTER / END MASTER
ORDERED / END ORDERED
PARALLEL / END PARALLEL
PARALLEL DO / END PARALLEL DO
PARALLEL SECTIONS / END PARALLEL SECTIONS
PARALLEL WORKSHARE / END PARALLEL WORKSHARE
SCHEDULE
SECTIONS / END SECTIONS
SINGLE / END SINGLE
THREADLOCAL
THREADPRIVATE
WORKSHARE
OpenMP directive clauses
Global rules for directive clauses
COPYIN
COPYPRIVATE
DEFAULT
IF
FIRSTPRIVATE
LASTPRIVATE
NUM_THREADS
ORDERED
PRIVATE
REDUCTION
SCHEDULE
SHARED
OpenMP execution environment, lock and timing routines
omp_destroy_lock(svar)
omp_destroy_nest_lock(nvar)
omp_get_dynamic()
omp_get_max_threads()
omp_get_nested()
omp_get_num_procs()
omp_get_num_threads()
omp_get_thread_num()
omp_get_wtick()
omp_get_wtime()
omp_in_parallel()
omp_init_lock(svar)
omp_init_nest_lock(nvar)
omp_set_dynamic(enable_expr)
omp_set_lock(svar)
omp_set_nested(enable_expr)
omp_set_nest_lock(nvar)
omp_set_num_threads(number_of_threads_expr)
omp_test_lock(svar)
omp_test_nest_lock(nvar)
omp_unset_lock(svar)
omp_unset_nest_lock(nvar)
Pthreads library module
Pthreads data structures, functions, and subroutines
f_maketime(delay)
f_pthread_attr_destroy(attr)
f_pthread_attr_getdetachstate(attr, detach)
f_pthread_attr_getguardsize(attr, guardsize)
f_pthread_attr_getinheritsched(attr, inherit)
f_pthread_attr_getschedparam(attr, param)
f_pthread_attr_getschedpolicy(attr, policy)
f_pthread_attr_getscope(attr, scope)
f_pthread_attr_getstack(attr, stackaddr, ssize)
f_pthread_attr_init(attr)
f_pthread_attr_setdetachstate(attr, detach)
f_pthread_attr_setguardsize(attr, guardsize)
f_pthread_attr_setinheritsched(attr, inherit)
f_pthread_attr_setschedparam(attr, param)
f_pthread_attr_setschedpolicy(attr, policy)
f_pthread_attr_setscope(attr, scope)
f_pthread_attr_setstack(attr, stackaddr, ssize)
f_pthread_attr_t
f_pthread_cancel(thread)
f_pthread_cleanup_pop(exec)
f_pthread_cleanup_push(cleanup, flag, arg)
f_pthread_cond_broadcast(cond)
f_pthread_cond_destroy(cond)
f_pthread_cond_init(cond, cattr)
f_pthread_cond_signal(cond)
f_pthread_cond_t
f_pthread_cond_timedwait(cond, mutex, timeout)
f_pthread_cond_wait(cond, mutex)
f_pthread_condattr_destroy(cattr)
f_pthread_condattr_getpshared(cattr, pshared)
f_pthread_condattr_init(cattr)
f_pthread_condattr_setpshared(cattr, pshared)
f_pthread_condattr_t
f_pthread_create(thread, attr, flag, ent, arg)
f_pthread_detach(thread)
f_pthread_equal(thread1, thread2)
f_pthread_exit(ret)
f_pthread_getconcurrency()
f_pthread_getschedparam(thread, policy, param)
f_pthread_getspecific(key, arg)
f_pthread_join(thread, ret)
f_pthread_key_create(key, dtr)
f_pthread_key_delete(key)
f_pthread_key_t
f_pthread_kill(thread, sig)
f_pthread_mutex_destroy(mutex)
f_pthread_mutex_init(mutex, mattr)
f_pthread_mutex_lock(mutex)
f_pthread_mutex_t
f_pthread_mutex_trylock(mutex)
f_pthread_mutex_unlock(mutex)
f_pthread_mutexattr_destroy(mattr)
f_pthread_mutexattr_getpshared(mattr, pshared)
f_pthread_mutexattr_gettype(mattr, type)
f_pthread_mutexattr_init(mattr)
f_pthread_mutexattr_setpshared(mattr, pshared)
f_pthread_mutexattr_settype(mattr, type)
f_pthread_mutexattr_t
f_pthread_once(once, initr)
f_pthread_once_t
f_pthread_rwlock_destroy(rwlock)
f_pthread_rwlock_init(rwlock, rwattr)
f_pthread_rwlock_rdlock(rwlock)
f_pthread_rwlock_t
f_pthread_rwlock_tryrdlock(rwlock)
f_pthread_rwlock_trywrlock(rwlock)
f_pthread_rwlock_unlock(rwlock)
f_pthread_rwlock_wrlock(rwlock)
f_pthread_rwlockattr_destroy(rwattr)
f_pthread_rwlockattr_getpshared(rwattr, pshared)
f_pthread_rwlockattr_init(rwattr)
f_pthread_rwlockattr_setpshared(rwattr, pshared)
f_pthread_rwlockattr_t
f_pthread_self()
f_pthread_setcancelstate(state, oldstate)
f_pthread_setcanceltype(type, oldtype)
f_pthread_setconcurrency(new_level)
f_pthread_setschedparam(thread, policy, param)
f_pthread_setspecific(key, arg)
f_pthread_t
f_pthread_testcancel()
f_sched_param
f_sched_yield()
f_timespec
Interlanguage calls
Conventions for XL Fortran external names
Mixed-language input and output
Mixing Fortran and C++
Making calls to C functions work
Passing data from one language to another
Passing arguments between languages
Passing global variables between languages
Passing character types between languages
Passing arrays between languages
Passing pointers between languages
Passing arguments by reference or by value
Passing complex values to/from gcc
Returning values from Fortran functions
Arguments with the OPTIONAL attribute
Assembler-level subroutine linkage conventions
The stack
The Link Area and Minimum Stack Frame
The input parameter area
The register save area
The local stack area
The output parameter area
Linkage convention for argument passing
Argument passing rules (by value)
Order of arguments in argument list
Linkage convention for function calls
Pointers to functions
Function values
The stack floor
Stack overflow
Prolog and epilog
Traceback
THREADLOCAL common blocks and interlanguage calls with C
Example
Implementation details of XL Fortran Input/Output (I/O)
Implementation details of file formats
File names
Preconnected and Implicitly Connected Files
File positioning
I/O Redirection
How XLF I/O interacts with pipes, special files, and links
Default record lengths
File permissions
Selecting error messages and recovery actions
Flushing I/O buffers
Choosing locations and names for Input/Output files
Naming files that are connected with no explicit name
Naming scratch files
Asynchronous I/O
Execution of an asynchronous data transfer operation
Usage
Performance
Compiler-generated temporary I/O items
Error handling
XL Fortran thread-safe I/O library
Use of I/O statements in signal handlers
Asynchronous thread cancellation
Implementation details of XL Fortran floating-point processing
IEEE Floating-point overview
Compiling for strict IEEE conformance
IEEE Single- and double-precision values
IEEE Extended-precision values
Infinities and NaNs
Exception-handling model
Hardware-specific floating-point overview
Single- and double-precision values
Extended-precision values
How XL Fortran rounds floating-point calculations
Selecting the rounding mode
Minimizing rounding errors
Minimizing overall rounding
Delaying rounding until run time
Ensuring that the rounding mode is consistent
Duplicating the floating-point results of other systems
Maximizing floating-point performance
Detecting and trapping floating-point exceptions
Compiler features for trapping floating-point exceptions
Installing an exception handler
Producing a core file
Controlling the floating-point status and control register
xlf_fp_util Procedures
fpgets and fpsets subroutines
Sample programs for exception handling
Causing exceptions for particular variables
Minimizing the performance impact of floating-point exception trapping
Porting programs to XL Fortran
Outline of the porting process
Portability of directives
Common industry extensions that XL Fortran supports
Mixing data types in statements
Date and time routines
Other libc routines
Changing the default sizes of data types
Name conflicts between your procedures and XL Fortran intrinsic procedures
Reproducing results from other systems
Finding nonstandard extensions
Appendix. Sample Fortran programs
Example 1 - XL Fortran source file
Execution results
Example 2 - valid C routine source file
Example 3 - valid Fortran SMP source file
Example 4 - invalid Fortran SMP source file
Programming examples using the Pthreads library module
Index