Further option-driven tuning

Use the options in this section to convey the characteristics of your application to the compiler and so tune the optimizations that it applies. Option-driven tuning can require experimentation to find the combination of options that improves the performance of your application.

The XL compilers support many options that let you assert that your application does not follow certain standard language rules in some instances. The compiler assumes that your code complies with the language standard, so optimizations that are safe for conforming code can become unsafe if your application is not compliant. Standards-conforming applications are easier to optimize and more portable, but when full compliance is not possible, use the appropriate options to ensure that your code is optimized safely.

For complete compiler option syntax, see the IBM XL Fortran Advanced Edition V10.1 for Linux Compiler Reference.

Options for providing application characteristics

This section lists options that convey a wide variety of application characteristics to the compiler, including floating-point and loop behaviors.

Option
Description
-qalias
Supports several suboptions that can help the compiler analyze the characteristics of your application. For more information on aliasing, see the Advanced optimization concepts section.
noaryovrlp
Asserts that your compilation contains no array assignments between storage associated (overlapping) arrays.
nointptr
Asserts that your compilation does not make use of integer (Cray) pointers.
nopteovrlp
Asserts that your compilation does not contain pointee variables that refer to any data objects that are not pointee variables, and that it does not contain two pointee variables that can refer to the same storage location.
std
Asserts that your compilation follows all language rules for variable aliasing. This is the default compiler setting. Specify -qalias=nostd if this compilation does not follow all variable aliasing rules.
-qassert
Includes the following suboptions that can be useful for providing some loop characteristics of your application.
nodeps
Asserts that the loops in this compilation do not contain loop-carried dependencies.
itercnt={number}
Gives the optimizer a value to use when estimating the number of iterations for loops where it cannot determine that value.
-qddim
Forces the compiler to re-evaluate the bounds of a pointee array each time the application references the array. Specify this option only if your application performs dynamic dimensioning of pointee arrays.
-qdirectstorage
Asserts that your application accesses write-through-enabled or cache-inhibited storage.
-qfloat
Provides the compiler with floating-point characteristics for your application. The following suboptions are particularly useful.
nans
Asserts that your application makes use of signaling NaN (not-a-number) floating-point values. Normal floating-point operations do not create these values; your application must create signaling NaNs explicitly.
rrm
Prohibits optimization transformations that assume the floating-point rounding mode is the default setting, round-to-nearest. If your application changes the rounding mode in any way, specify this option.
-qflttrap
Lets you control various aspects of floating-point exception handling that your application might require if it attempts to detect or handle such exceptions.
-qieee
Specifies the preferred floating-point rounding mode when evaluating expressions at compile time. This option is important if your application requires a non-default rounding mode in order to have consistency between compile-time evaluation and run-time evaluation.

You can also specify -y to set the preferred floating-point rounding mode.

-qlibansi
Asserts that any external functions called in your compilation that have the same names as standard C library functions, such as malloc or memcpy, are in fact those functions and not user-written functions with those names.
-qlibessl
Asserts that your application will be linked with IBM's ESSL high-performance mathematical library and that mathematical operations can be transformed into calls to that library. The High performance libraries section contains more information on ESSL.
-qlibposix
Asserts that any external functions called in your compilation that have the same names as standard POSIX library functions are in fact those functions and not user-written functions with those names.
-qonetrip
Asserts that all DO loops in your compilation will execute at least one iteration. You can also specify this behavior with -1.
-qnostrictieeemod
Allows the compiler to relax certain rules required by the Fortran 2003 standard related to the use of the IEEE intrinsic modules. Specify this option if your application does not use these modules.
-qstrict_induction
Prevents optimization transformations that would be unsafe if DO loop integer iteration count variables overflow and become negative. Few applications contain algorithms that require this option.
-qthreaded
Informs the compiler that your application will execute in a multithreaded/SMP environment. Using one of the _r invocations, such as xlf_r, adds this option automatically.
-qnounwind
Informs the compiler that the stack will not be unwound while any routine in your compilation is active. The -qnounwind option enables prologue tailoring optimization, which reduces the number of saves and restores of nonvolatile registers.
-qnozerosize
Asserts that this compilation does not require checking for zero-sized arrays when performing array operations.
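
To see why an assertion such as -qassert=nodeps must actually be true of your code, consider what a loop-carried dependency means. The following is a minimal sketch in Python, used here only to illustrate the concept; it is not XL Fortran code, and the helper names (run_loop, independent, carried) are hypothetical. It runs two loop bodies in forward and in reverse order, as a stand-in for the reordering an optimizer might perform once told that no dependencies exist.

```python
def run_loop(body, n, order):
    """Apply body(i, a) for each index in order, starting from n+1 ones."""
    a = [1.0] * (n + 1)
    for i in order:
        body(i, a)
    return a

def independent(i, a):
    # Each iteration touches only its own element: no loop-carried dependency.
    a[i] = a[i] * 2.0

def carried(i, a):
    # Iteration i reads the value written by iteration i-1:
    # this is a loop-carried dependency.
    a[i] = a[i - 1] + 1.0

n = 5
forward = list(range(1, n + 1))
backward = forward[::-1]

# Reordering iterations is harmless only when no dependency is carried.
independent_is_order_safe = (
    run_loop(independent, n, forward) == run_loop(independent, n, backward)
)
carried_is_order_safe = (
    run_loop(carried, n, forward) == run_loop(carried, n, backward)
)

print(independent_is_order_safe)  # True
print(carried_is_order_safe)      # False
```

A loop like the second one gives different results when its iterations are reordered, so asserting nodeps for a compilation that contains such a loop could lead to incorrect results.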

Options to control optimization transformations

There are many options available to you in addition to the base set found in the Optimizing XL compiler applications section. Some of these options prevent an optimization that can be unsafe for certain applications or enable one that is safe for your application, but is not normally available as part of the optimization process.

Option
Description
-qcompact
Favors a reduction of final code size over a reduction in execution time. You can use this option to constrain the optimizations of -O3 and higher. For more information on restricting code size, see the Managing code size section.
-qenablevmx
Allows you to take advantage of the VMX capabilities of chips such as the PPC970. This is the default setting.
-qfloat
This option provides a number of suboptions for controlling the optimizations to your floating-point calculations.
norelax
Asserts that the compiler should not perform trivial floating-point transformations such as removing the addition operation where the right side is a zero value.
norsqrt
Prevents the replacement of the division of the result of a square-root calculation with a multiplication by the reciprocal of the square root.
nostrictmaf
Prevents certain floating-point multiply-and-add instructions where the sign of a signed zero value would not be preserved.
-qipa
Includes many suboptions that can assist the IPA optimizations while analyzing your application. If you are using the -qipa option or higher optimization levels that imply IPA, it is to your benefit to examine the suboptions available.
-qmaxmem
Limits the memory available to certain memory-intensive optimizations at low optimization levels. Specify -qmaxmem=-1 to remove these memory limits.
-qnoprefetch
Prevents the insertion of prefetching machine instructions into your application during optimization.
-Q
Allows you to exert control over inlining optimization transformations. For more information on inlining, see the Advanced optimization concepts section.
-qsmallstack
Instructs the compiler to limit the use of stack storage in your application. This can increase heap usage.
-qsmp
Produces code for an SMP system. This option also searches for opportunities to increase performance by automatically parallelizing your code. The Parallel programming with XL Fortran section contains more information on writing parallel code.
-qstacktemp
Allows you to limit certain compiler temporaries allocated on the stack. Those not allocated on the stack are allocated on the heap. This option is useful for applications that use enough stack space to exceed user or system stack limits.
-qstrict
Limits optimizations to strict adherence to implied program semantics. This often prevents the compiler from ignoring certain little-used rules in the IEEE floating-point specification that few applications require for correct behavior. For example, reordering or reassociating a sequence of floating-point calculations can cause floating-point exceptions at an unexpected location or mask them completely. Do not use this option unless your application requires strict adherence, because -qstrict can severely inhibit optimization.
-qunroll
Allows you to independently control loop unrolling. At -O3 and higher, -qunroll is a default setting.
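
The floating-point entries in this table (-qstrict and the -qfloat suboptions) exist because some algebraically valid rewrites change IEEE results. The following is a minimal sketch in Python, used only because its float type is an IEEE double; it does not show compiler output, and the variable names are illustrative. It demonstrates two such effects: reassociating a sum, and dropping an addition of zero when the other operand is a negative zero.

```python
import math

# Reassociation: floating-point addition is not associative.
a, b, c = 1.0e16, -1.0e16, 1.0
left_to_right = (a + b) + c   # a and b cancel exactly, then c is added
reassociated = a + (b + c)    # c is lost when added to the large value b

print(left_to_right)  # 1.0
print(reassociated)   # 0.0

# Signed zero: under round-to-nearest, (-0.0) + 0.0 is +0.0,
# so removing the "+ 0.0" changes the sign of the result.
x = -0.0
sign_with_add = math.copysign(1.0, x + 0.0)
sign_without_add = math.copysign(1.0, x)

print(sign_with_add)     # 1.0
print(sign_without_add)  # -1.0
```

Most applications do not depend on differences of this kind, which is why the options that forbid such transformations are needed only when your application relies on exact IEEE behavior.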

Options to assist with performance analysis

The compiler provides a set of options that can help you analyze the performance aspects of your application. These options are most useful when you are selecting your level of optimization and tuning the optimization process to the particular characteristics of your application.

-d
Informs the compiler that you want to preserve the preprocessed versions of your compilation files. Typically these files would have a .F extension.
-g
Inserts full debugging information into your object code. While the optimization process can obscure original program meaning, at least some of the information that this option produces is useful to performance analysis tools. You can also specify this behavior with -qdbg.
-p
Inserts appropriate profiling information into your object code to make using tools for performance analysis possible. You can also specify this behavior with -pg.
-qdpcl
Prepares your object for processing by tools based on the Dynamic Probe Class Library (DPCL).
-qlinedebug
Similar to -g, but inserts only minimal debug information into your object code, such as function names and line number information.
-qlist
Produces a listing file containing a pseudo-assembly listing of your object code.
-qreport
Inserts information in the listing file showing the transformations done by certain optimizations.
-S
Produces a .s file containing the assembly version of the .o file produced by the compilation.
-qshowpdf
If you specify this option with -qpdf1 and a minimum of -O optimization, the optimization process inserts additional information into your application that the showpdf utility can make use of when analyzing the result of a PDF run. For more information on profile directed feedback, see the Benefits of profile-directed feedback section.
-qtbtable
Limits the amount of debugging traceback information in object files, which reduces the size of the program. Use -qtbtable=full if you intend to analyze your application with a profiling utility.

Options that can inhibit performance

Some compiler options are necessary for some applications to produce correct or repeatable results. Usually, these options instruct the compiler to enforce very strict language semantics that few applications require. Others are supported by the compiler to allow compilation of code that does not conform to language standards. Avoid these options if you are trying to increase the runtime performance of your application. In cases where these options are enabled by default, you must disable them to increase performance. You can specify -qlistopt to show, in the listing file, the settings of each of these options.

Consult the IBM XL Fortran Advanced Edition V10.1 for Linux Compiler Reference or the relevant options in this section for complete descriptions of the following options.

Table 8. Options that can reduce performance
-qalias=nostd
-qcompact
-qnoenablevmx
-qfloat=norelax
-qfloat=rrm
-qfloat=norsqrt
-qfloat=nostrictmaf
-qnoprefetch
-Q!
-qsmallstack
-qstacktemp=[value other than 0 or -1]
-qstrict
-qstrictieeemod
-qstrict_induction
-qnounroll
-qunwind
-qzerosize