Further option-driven tuning
You can use the options in this section to convey the characteristics of
your application to the compiler, tuning the optimizations that the compiler
will apply. Option-driven tuning is a process that can require experimentation
to find the right combination of options to increase the performance of your
application.
The XL compilers support many options that allow you to assert that your
application will not follow certain standard language rules in some instances.
The compiler assumes that your application complies with the language standard, and
optimizations based on that assumption can be unsafe if your application is not compliant.
Standards-conforming applications are more easily optimized and more portable, but when
full compliance is not possible, use the appropriate options to ensure your code is optimized safely.
For complete compiler option syntax, see the IBM XL Fortran Advanced Edition V10.1 for Linux Compiler Reference.
Options for providing application characteristics
This section provides a list of options that can dictate a wide variety
of characteristics about your application to the compiler including floating-point
and loop behaviors.
- Option
- Description
- -qalias
- Supports several suboptions that can help the compiler analyze the characteristics
of your application. For more information on aliasing, see the Advanced optimization concepts section.
- noaryovrlp
- Asserts that your compilation contains no array assignments between
storage associated (overlapping) arrays.
- nointptr
- Asserts that your compilation does not make use of integer (Cray) pointers.
- nopteovrlp
- Asserts that your compilation does not contain pointee variables
that refer to any data objects that are not pointee variables, and that
the compilation does not contain two pointee variables that can refer to the
same storage location.
- std
- Asserts that your compilation follows all language rules for variable
aliasing. This is the default compiler setting. Specify -qalias=nostd if this compilation does not follow all variable aliasing rules.
- -qassert
- Includes the following suboptions that can be useful for providing some
loop characteristics of your application.
- nodeps
- Asserts that the loops in this compilation do not contain loop-carried
dependencies.
- itercnt={number}
- Gives the optimizer a value to use when estimating the number of iterations
for loops where it cannot determine that value.
- -qddim
- Forces the compiler to re-evaluate the bounds of a pointee array each
time the application references the array. Specify this option only if your
application performs dynamic dimensioning of pointee arrays.
- -qdirectstorage
- Asserts that your application accesses write-through-enabled or cache-inhibited
storage.
- -qfloat
- Provides the compiler with floating-point characteristics for your application.
The following suboptions are particularly useful.
- nans
- Asserts that your application makes use of signaling NaN (not-a-number)
floating-point values. Normal floating-point operations do not create these
values; your application must create signaling NaNs itself.
- rrm
- Prohibits optimization transformations that assume the floating-point
rounding mode is the default setting, round-to-nearest. If your application
changes the rounding mode in any way, specify this option.
- -qflttrap
- Offers you the ability to control various aspects of floating-point
exception handling that your application can require if it attempts to detect
or handle such exceptions.
- -qieee
- Specifies the preferred floating-point rounding mode when evaluating
expressions at compile time. This option is important if your application
requires a non-default rounding mode in order to have consistency between
compile-time evaluation and run-time evaluation.
You can also specify -y to set the preferred floating-point rounding mode.
- -qlibansi
- Asserts that any external function calls in your compilation that have
the same name as standard C library function calls, such as malloc or memcpy,
are in fact those functions and are not a user-written function with that
name.
- -qlibessl
- Asserts that your application will be linked with IBM's ESSL high-performance
mathematical library and that mathematical operations can be transformed into
calls to that library. The High performance
libraries section contains more information on ESSL.
- -qlibposix
- Asserts that any external function calls in your compilation that have
the same name as standard POSIX library function calls are in fact those functions
and are not a user-written function with that name.
- -qonetrip
- Asserts that all DO loops in your compilation will execute at least
one iteration. You can also specify this behavior with -1.
- -qnostrictieeemod
- Allows the compiler to relax certain rules required by the Fortran 2003
standard related to the use of the IEEE intrinsic modules. Specify this option
if your application does not use these modules.
- -qstrict_induction
- Prevents optimization transformations that would be unsafe if DO loop
integer iteration count variables overflow and become negative. Few applications
contain algorithms that require this option.
- -qthreaded
- Informs the compiler that your application will execute in a multithreaded/SMP
environment. Using one of the _r invocations, such as xlf_r, adds this
option automatically.
- -qnounwind
- Informs the compiler that the stack will not be unwound while any routine
in your compilation is active. The -qnounwind option enables prologue
tailoring optimization, which reduces the number of saves and restores of
nonvolatile registers.
- -qnozerosize
- Asserts that this compilation does not require checking for zero-sized
arrays when performing array operations.
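To illustrate the kind of property that -qassert=nodeps asserts, the following Python sketch contrasts a loop whose iterations are independent with one that has a loop-carried dependency. It is not XL Fortran code, and the function names are hypothetical, chosen only for this illustration. Running the iterations in a different order, as a reordering optimization might in effect do, leaves the first result unchanged but changes the second.

```python
def scale(values, order):
    """Independent iterations: each element depends only on its own input."""
    out = list(values)
    for i in order:
        out[i] = 2 * values[i]
    return out


def running_sum(values, order):
    """Loop-carried dependency: iteration i reads the result of iteration i - 1."""
    out = list(values)
    for i in order:
        if i > 0:
            out[i] = out[i - 1] + values[i]
    return out


data = [1, 2, 3, 4]
forward = range(len(data))
backward = range(len(data) - 1, -1, -1)

# Independent loop: the iteration order does not affect the result.
print(scale(data, forward) == scale(data, backward))

# Dependent loop: the iteration order changes the result.
print(running_sum(data, forward), running_sum(data, backward))
```

Asserting the absence of dependencies is appropriate only for loops of the first kind. For a loop like the hypothetical running_sum, the assertion would be false and the optimized result could differ from the one the source code describes.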
Options to control optimization transformations
There are many options available to you in addition to the
base set found in the Optimizing XL compiler applications section. Some of these options prevent an optimization that can be unsafe
for certain applications or enable one that is safe for your application,
but is not normally available as part of the optimization process.
- Option
- Description
- -qcompact
- Chooses a reduction of final code size over a reduction in execution
time. You can use this option to constrain the optimizations of -O3 and higher. For more information on restricting code size, see
the Managing code size section.
- -qenablevmx
- Allows you to take advantage of the VMX capabilities of chips such as
the PPC970. This is the default setting.
- -qfloat
- This option provides a number of suboptions for controlling the optimizations
to your floating-point calculations.
- norelax
- Asserts that the compiler should not perform trivial floating-point
transformations, such as removing an addition operation whose right side
is a zero value.
- norsqrt
- Prevents the replacement of a division by the result of a square-root
calculation with a multiplication by the reciprocal of the square root.
- nostrictmaf
- Prevents certain floating-point multiply-and-add instructions where
the sign of a signed zero value would not be preserved.
- -qipa
- Includes many suboptions that can assist the IPA optimizations while analyzing your application. If you are using
the -qipa option or higher optimization levels that imply IPA, it
is to your benefit to examine the suboptions available.
- -qmaxmem
- Limits the memory available to certain memory-intensive optimizations
at low optimization levels. Specify -qmaxmem=-1 to remove these memory limits.
- -qnoprefetch
- Prevents the insertion of prefetching machine instructions into
your application during optimization.
- -Q
- Allows you to exert control over inlining optimization transformations.
For more information on inlining, see the Advanced optimization
concepts section.
- -qsmallstack
- Instructs the compiler to limit the use of stack storage in your application.
This can increase heap usage.
- -qsmp
- Produces code for an SMP system. This option also searches for opportunities
to increase performance by automatically parallelizing your code. The Parallel programming with XL Fortran section contains more information
on writing parallel code.
- -qstacktemp
- Allows you to limit certain compiler temporaries allocated on the stack.
Those not allocated on the stack will be allocated on the heap. This option
is useful for applications that use enough stack space to exceed user
or system stack limits.
- -qstrict
- Limits optimizations to strict adherence to implied program semantics.
This often prevents the compiler from ignoring certain little-used rules in
the IEEE floating-point specification that few applications require for correct
behavior. For example, reordering or reassociating a sequence of floating-point
calculations can cause floating-point exceptions at an unexpected location
or mask them completely. Do not use this option unless your application requires
strict adherence as -qstrict can severely inhibit optimization.
- -qunroll
- Allows you to independently control loop unrolling. At -O3 and higher, -qunroll is a default setting.
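The floating-point effects that -qstrict and -qfloat=norelax guard against can be observed in any IEEE 754 double-precision arithmetic. The following Python sketch is an illustration only, not XL Fortran code, and its function names are hypothetical. It shows that reassociating a sum can change its value, and that removing an addition of zero can change the sign of a zero result.

```python
import math


def sum_as_written(a, b, c):
    """Evaluate (a + b) + c, the order written in the source."""
    return (a + b) + c


def sum_reassociated(a, b, c):
    """Evaluate a + (b + c), an order a relaxed optimizer might choose."""
    return a + (b + c)


def add_zero(x):
    """Evaluate x + 0.0 as written."""
    return x + 0.0


def add_zero_simplified(x):
    """The 'trivial' simplification of x + 0.0 to x."""
    return x


# Reassociation: 1.0 is lost when it is added to -1e20 first.
print(sum_as_written(1e20, -1e20, 1.0), sum_reassociated(1e20, -1e20, 1.0))

# Signed zero: -0.0 + 0.0 is +0.0, but the simplified form keeps -0.0.
print(math.copysign(1.0, add_zero(-0.0)),
      math.copysign(1.0, add_zero_simplified(-0.0)))
```

Few applications depend on differences of this kind, which is why the stricter settings are worth specifying only when your results require them.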
Options to assist with performance analysis
The compiler provides a set of options that can help you analyze the performance
aspects of your application. These options are most useful when you are selecting
your level of optimization and tuning the optimization process to the particular
characteristics of your application.
- -d
- Informs the compiler that you want to preserve the preprocessed versions
of your compilation files. Typically these files would have a .F extension.
- -g
- Inserts full debugging information into your object code. While the
optimization process can obscure original program meaning, at least some of
the information that this option produces is useful to performance analysis
tools. You can also specify this behavior with -qdbg.
- -p
- Inserts appropriate profiling information into your object code to
make using tools for performance analysis possible. You can also specify this
behavior with -pg.
- -qdpcl
- Prepares your object for processing by tools based on the Dynamic Probe
Class Library (DPCL).
- -qlinedebug
- Similar to -g, but inserts only minimal debugging information into your
object code, such as function names and line number information.
- -qlist
- Produces a listing file containing a pseudo-assembly listing of your
object code.
- -qreport
- Inserts information in the listing file showing the transformations
done by certain optimizations.
- -S
- Produces a .s file containing the assembly version of the .o file produced
by the compilation.
- -qshowpdf
- If you specify this option with -qpdf1 and a minimum of -O optimization, the optimization process inserts
additional information into your application that the showpdf utility
can make use of when analyzing the result of a PDF run. For more information
on profile directed feedback, see the Benefits of
profile-directed feedback section.
- -qtbtable
- Limits the amount of debugging traceback information
in object files, which reduces the size of the program. Use -qtbtable=full if you intend to analyze your application with a profiling utility.
Options that can inhibit performance
Some compiler options are necessary for some applications
to produce correct or repeatable results. Usually, these options instruct
the compiler to enforce very strict language semantics that few applications
require. Others are supported by the compiler to allow compilation of code
that does not conform to language standards. Avoid these options if you are
trying to increase the runtime performance of your application. In cases where
these options are enabled by default, you must disable them to increase performance.
You can specify -qlistopt to show, in the listing file, the settings
of each of these options.
Consult the IBM XL Fortran Advanced Edition V10.1 for Linux Compiler Reference or the relevant options in
this section for complete descriptions of the following options.
Table 8. Options that can reduce performance
-qalias=nostd | -qfloat=nosqrt | -qstacktemp=[value other than 0 or -1] | -qunwind
-qcompact | -qfloat=nostrictmaf | -qstrict | -qzerosize
-qnoenablevmx | -qnoprefetch | -qstrictieeemod |
-qfloat=norelax | -Q! | -qstrict_induction |
-qfloat=rrm | -qsmallstack | -qnounroll |