The XL compiler supports several levels of optimization. Each level builds on the levels below it by applying more aggressive transformations and by making more machine resources available to the optimizer. Ensure that your application compiles and executes properly at low optimization levels before trying more aggressive optimizations. This section discusses two optimization levels, listed with complementary options in the Basic optimizations table. The table also includes a column for compiler options that can have a performance benefit at that optimization level for some applications.
Optimization level | Additional options implied | Complementary options | Options with possible benefits |
-O0 | | | |
-O2 | | | |
Begin your optimization process at -O0, which is the compiler default. For SMP programs, the closest equivalent to -O0 is -qsmp=noopt. These levels perform only basic analytical optimization, removing obviously redundant code, and can shorten compile time. They also let you confirm that your code is algorithmically correct before you move forward to more complex optimizations. Optimizing at this level accurately preserves all debug information and can expose problems in existing code, such as uninitialized variables and bad casting.
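One way to use the -O0 build as a correctness baseline is to save its output and compare the output of later builds against it. The following is a minimal sketch, assuming the program writes deterministic output; the helper name same_output and the file names in the comments are hypothetical and do not come from the compiler documentation.

```shell
#!/bin/sh
# Compare a saved reference output with the output of a later build.
# Returns 0 when the two files are byte-for-byte identical, 1 otherwise
# (including when either file cannot be read).
same_output() {
    ref="$1"
    new="$2"
    if cmp -s "$ref" "$new"; then
        printf '%s\n' "match: $new agrees with $ref"
        return 0
    else
        printf '%s\n' "MISMATCH: $new differs from $ref"
        return 1
    fi
}

# Intended use (hypothetical file names):
#   ./myprog_O0 > ref.out      # run the -O0 build once and keep its output
#   ./myprog_O2 > new.out      # run a later, more optimized build
#   same_output ref.out new.out
```

A byte-for-byte comparison is the strictest choice. If your program prints timestamps or other run-dependent values, filter them out before comparing.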
Additionally, specifying -qarch at this level targets your application for a particular machine and can significantly improve performance by letting your application take advantage of the architectural features of that machine. The default setting for -qarch is ppc64grsq, which allows your application to run on all supported machines. For more information on tuning, consult Tuning for Your Target Architecture.
See the -O option in the XL Fortran Compiler Reference for information on the -O level syntax.
After you have successfully compiled, executed, and debugged your application using -O0, recompiling at -O2 opens your application to a set of comprehensive low-level transformations that apply to subprogram or compilation unit scopes and can include some inlining. Optimizations at -O2 try to balance performance gains against their cost in compilation time and system resources. You can increase the memory available to some of the optimizations in the -O2 portfolio by providing a larger value for the -qmaxmem option. Specifying -qmaxmem=-1 allows the optimizer to use memory as needed without checking for limits, but does not change the transformations the optimizer applies to your application at -O2.
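As a sketch of how these options combine on the command line, the fragment below assembles an -O2 compile command with an optional -qmaxmem value. The invocation name xlf90, the source file myprog.f90, and the function name o2_cmd are assumptions for illustration; the function only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# Hypothetical names: adjust to your installation and project.
FC="xlf90"          # assumed compiler invocation
SRC="myprog.f90"    # assumed source file

# Print an -O2 command line. With an argument, add -qmaxmem=<argument>;
# with no argument, leave the compiler's default memory limit in place.
o2_cmd() {
    maxmem="${1:-}"
    if [ -n "$maxmem" ]; then
        printf '%s\n' "$FC -O2 -qmaxmem=$maxmem -o myprog_O2 $SRC"
    else
        printf '%s\n' "$FC -O2 -o myprog_O2 $SRC"
    fi
}

o2_cmd       # default memory limit
o2_cmd -1    # let the optimizer use memory as needed
```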
Choosing the right hardware architecture target or family of targets becomes even more important at -O2 and higher. Targeting the proper hardware allows the optimizer to make the best use of the hardware facilities available. If you choose a family of hardware targets, the -qtune option directs the compiler to emit code that runs on every target in that family but executes optimally on the chosen tuning target. This allows you to compile for a general set of targets but have the code run best on a particular target. See the Tuning for Your Target Architecture section for details on the -qarch and -qtune options.
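The split between the two options can be sketched as a command that names a broad family with -qarch and a single preferred machine with -qtune. In the fragment below, the invocation name xlf90, the file names, the function name tuned_cmd, and the -qtune value pwr7 are assumptions for illustration. Check the compiler reference for the -qarch and -qtune combinations that your compiler version accepts.

```shell
#!/bin/sh
# Hypothetical names: adjust to your installation and project.
FC="xlf90"          # assumed compiler invocation
SRC="myprog.f90"    # assumed source file

# Print an -O2 command line that compiles for the family given as the
# first argument and tunes for the machine given as the second argument.
tuned_cmd() {
    arch="$1"
    tune="$2"
    printf '%s\n' "$FC -O2 -qarch=$arch -qtune=$tune -o myprog_tuned $SRC"
}

# ppc64grsq is the default family named earlier; pwr7 is an assumed tuning target.
tuned_cmd ppc64grsq pwr7
```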
At -O2, the compiler can perform a number of beneficial optimizations. Specifically, it:
Eliminates redundant instructions.
Evaluates constant expressions at compile-time.
Eliminates instructions that a particular control flow does not reach, or that generate an unused result.
Eliminates unnecessary variable assignments.
Globally assigns user variables to registers.
Simplifies algebraic expressions by eliminating redundant computations.
Even with -O2 optimizations, some useful information about your source code is available to the debugger if you specify -g. Higher optimization levels can transform code so extensively that debug information is no longer accurate, so use that information with discretion. The section on Debugging Optimized Code discusses other debugging strategies in detail.
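A minimal sketch of building the same source with -g at either level follows, so that a debugging session can fall back to the -O0 build when the -O2 debug information is not detailed enough. The invocation name xlf90, the file names, and the function name debug_cmd are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical names: adjust to your installation and project.
FC="xlf90"          # assumed compiler invocation
SRC="myprog.f90"    # assumed source file

# Print a command line that requests debug information (-g) at the
# optimization level given as the argument (0 or 2).
debug_cmd() {
    level="$1"
    printf '%s\n' "$FC -O$level -g -o myprog_O${level}_g $SRC"
}

debug_cmd 0    # debug information is fully preserved
debug_cmd 2    # optimized build with some debug information
```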
See the -O option in the XL Fortran Compiler Reference for information on the -O level syntax.