Interprocedural analysis (IPA) enables the compiler to optimize across different files (whole-program analysis), and can result in significant performance improvements. You can specify IPA on the compile step only, or on both the compile and link steps in "whole program" mode. The clonearch and cloneproc suboptions are the exception: they must be specified on the link step. Whole program mode expands the scope of optimization to an entire program unit, which can be an executable or a shared object. Because IPA can significantly increase compilation time, limit its use to the final performance tuning stage of development.
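
The two modes can be sketched as command sequences. This is a dry-run illustration only: the `xlc` driver name, the source files `main.c` and `util.c`, and the output name `app` are assumptions chosen for the example, and the commands are printed rather than executed unless `IPA_RUN=1` is set.

```shell
#!/bin/sh
# Dry-run sketch: commands are printed, and executed only when IPA_RUN=1.
# COMPILER, main.c, util.c, and app are illustrative placeholders.
COMPILER=xlc
IPA=-qipa

run() {
    echo "$*"
    if [ "${IPA_RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Compile step only: -qipa is passed when compiling each file,
# and the link step is an ordinary link without -qipa.
compile_only_ipa() {
    run $COMPILER $IPA -c main.c
    run $COMPILER $IPA -c util.c
    run $COMPILER main.o util.o -o app
}

# Whole program mode: -qipa is passed on the compile steps
# and again on the link step.
whole_program_ipa() {
    run $COMPILER $IPA -c main.c
    run $COMPILER $IPA -c util.c
    run $COMPILER $IPA main.o util.o -o app
}

compile_only_ipa
whole_program_ipa
```

The only difference between the two sequences is whether the option appears on the final link command.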
You enable IPA by specifying the -qipa option. The most commonly used suboptions and their effects are described in the following table. The full set of suboptions and syntax is described in -qipa.
Suboption | Behavior |
---|---|
level=0 | Program partitioning and simple interprocedural optimization. |
level=1 | Inlining and global data mapping. |
level=2 | Global alias analysis, specialization, and interprocedural data flow. |
inline=variable | Allows precise control over function inlining. |
clonearch=arch_list | Allows you to specify multiple architectures for which optimized instructions can be generated. Supported architecture values are PWR4, PWR5, and PPC970. For every function in your program, the compiler generates a generic version of the instruction set, according to the -qarch value in effect, and, if appropriate, clones specialized versions of the instruction set for the architectures you specify in this suboption. The compiler inserts code into your application to check for the processor architecture at run time, and selects the version of the generated instructions that is optimized for the runtime environment. |
cloneproc=func_list | Allows you to specify exactly which functions are cloned for the architectures specified in the clonearch suboption. |
fine_tuning | Other values for -qipa let you specify the behavior of library code, tune program partitioning, read commands from a file, and so on. |
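
As a hedged illustration of the clonearch and cloneproc rows, the following dry-run sketch assembles a link-step command line. The function names `hot_loop` and `dot_product`, the file names, and the `xlc` driver name are hypothetical. The comma-separated list form and the use of one -qipa option per suboption are assumptions that should be checked against the -qipa reference. Nothing is compiled; the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of invoking the compiler.
COMPILER=xlc

# Architecture values as listed in the table; function names are hypothetical.
CLONE_ARCHS="PWR4,PWR5"
CLONE_FUNCS="hot_loop,dot_product"

# Build the link-step suboptions. clonearch and cloneproc are placed
# on the link step only, as required.
ipa_link_flags() {
    echo "-qipa=clonearch=$1 -qipa=cloneproc=$2"
}

LINK_FLAGS=$(ipa_link_flags "$CLONE_ARCHS" "$CLONE_FUNCS")

# Compile step: plain -qipa, no cloning suboptions.
echo "$COMPILER -qipa -c main.c kernels.c"
# Link step: cloning suboptions are added here.
echo "$COMPILER $LINK_FLAGS main.o kernels.o -o app"
```

Restricting cloneproc to a few hot functions keeps the specialized copies limited to the named functions rather than every function in the program.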