Turns on or customizes a class of optimizations known as interprocedural analysis (IPA).
>>- -qipa--+-----------------+---------------------------------><
           |    .-object---. |
           '-=--+-noobject-+-'
where:
-qipa compile-time options | Description |
---|---|
-qipa | Activates interprocedural analysis with the following -qipa suboption default: -qipa=object |
-qipa=object
-qipa=noobject |
Specifies whether to include standard object code in the
object files.
Specifying the noobject suboption can substantially reduce overall compile time by not generating object code during the first IPA phase. If the -S compiler option is specified with noobject, noobject is ignored. If compilation and linking are performed in the same step, and neither the -S nor any listing option is specified, -qipa=noobject is implied by default. If any object file used in linking with -qipa was created with the -qipa=noobject option, any file containing an entry point (the main program for an executable program, or an exported function for a library) must be compiled with -qipa. |
        .-noipa------------------------------------------------.
>>- -q--+-ipa--+---------------------------------------------+-+-><
               |    .-:------------------------------------. |
               |    |   .-noclonearch------------.         | |
               |    |   |               .-,----. |         | |
               |    V   |               V      | |         | |
               '-=----+-+-clonearch--=----arch-+-+-------+-+-'
                      | .-nocloneproc------------.       |
                      | |               .-,----. |       |
                      | |               V      | |       |
                      +-+-cloneproc--=----name-+-+-------+
                      |           .-,----.               |
                      |           V      |               |
                      +-exits--=----name-+---------------+
                      +-inline--+----------------------+-+
                      |         |    .-auto----------. | |
                      |         '-=--+-noauto--------+-' |
                      |              | .-,---------. |   |
                      |              | V           | |   |
                      |              +---suboption-+-+   |
                      |              +-threshold=num-+   |
                      |              | .-,----.      |   |
                      |              | V      |      |   |
                      |              '---name-+------'   |
                      |              .-,----.            |
                      |              V      |            |
                      +-noinline--=----name-+------------+
                      |                     .-,----.     |
                      |                     V      |     |
                      +-infrequentlabel--=----name-+-----+
                      |              .-,----.            |
                      |              V      |            |
                      +-isolated--=----name-+------------+
                      |           .-1-.                  |
                      +-level--=--+-0-+------------------+
                      |           '-2-'                  |
                      |          .-a.lst-.  .- short-.   |
                      +-list--=--+-------+--+--------+---+
                      |          '-name--'  '- long--'   |
                      |             .-,----.             |
                      |             V      |             |
                      +-lowfreq--=----name-+-------------+
                      |             .-unknown--.         |
                      +-missing--=--+-safe-----+---------+
                      |             +-isolated-+         |
                      |             '-pure-----'         |
                      |               .-medium-.         |
                      +-partition--=--+-small--+---------+
                      |               '-large--'         |
                      | .-nopdfname----------------.     |
                      +-+-pdfname--+-------------+-+-----+
                      |            '-=--filename-'       |
                      | .-nothreads------.               |
                      +-+-threads-+----+-+---------------+
                      |           '-=N-'                 |
                      |                 .-,----.         |
                      |                 V      |         |
                      +-+-pure----+--=----name-+---------+
                      | +-safe----+                      |
                      | '-unknown-'                      |
                      '-filename-------------------------'
where:
Link-time suboptions | Description |
---|---|
-qnoipa | Deactivates interprocedural analysis. |
-qipa | Activates interprocedural analysis with the following -qipa suboption defaults: inline=auto, level=1, missing=unknown, partition=medium |
Suboptions can also include one or more of the forms shown below.
Link-time suboptions | Description |
---|---|
clonearch=arch{,arch}
noclonearch |
Specifies the architectures for which multiple versions of the same procedure are produced.
During the IPA link phase, the compiler generates a generic version of a procedure targeted for the default architecture setting and then, if appropriate, produces another version that is optimized for each of the specified architectures. At run time, the program determines which architecture it is running on and executes the corresponding version of the function. With this option, your program can achieve compatibility across different PowerPC architectures. arch is a comma-separated list of architectures. The supported clonearch values are pwr4, pwr5, and ppc970. If you specify no value, an invalid value, or a value equal to the -qarch setting, no function versioning is performed for this option.
|
cloneproc=name{,name}
nocloneproc=name{,name} |
Specifies the names of the functions to clone for the architectures specified by the clonearch suboption, where name is a comma-separated list of function names.
Note:
If you do not specify -qipa=clonearch, or if you specify -qipa=noclonearch, then -qipa=cloneproc=name{,name} and -qipa=nocloneproc=name{,name} have no effect. |
exits=name{,name} | Specifies names of functions which represent program exits. Program exits are calls which can never return and can never call any procedure which has been compiled with IPA pass 1. |
infrequentlabel=name{,name} | Specifies a list of user-defined labels that are likely to be reached infrequently during a program run. |
inline=auto
inline=noauto |
Enables or disables automatic inlining only. The compiler still accepts user-specified functions as candidates for inlining. |
inline[=suboption] | Same as specifying the -qinline compiler option, with suboption being any valid -qinline suboption. |
inline=threshold=num | Specifies an upper limit for the number of functions to be inlined, where num is a non-negative integer. This argument is implemented only when inline=auto is on. |
inline=name{,name} | Specifies a comma-separated list of functions to try to inline, where functions are identified by name. |
noinline=name{,name} | Specifies a comma-separated list of functions that must not be inlined, where functions are identified by name. |
isolated=name{,name} | Specifies a list of isolated functions that are not compiled with IPA. Neither isolated functions nor functions within their call chain can refer to global variables. |
level=0
level=1
level=2 |
Specifies the optimization level for interprocedural analysis.
The default level is 1. Valid levels are 0, 1, and 2. |
list
list=[name] [short|long] |
Specifies that a listing file be generated during the
link phase. The listing file contains information about transformations and
analyses performed by IPA, as well as an optional object listing generated
by the back end for each partition. This option can also be used to specify
the name of the listing file.
If listings have been requested (using either the -qlist or -qipa=list options), and name is not specified, the listing file name defaults to a.lst. The long and short suboptions can be used to request more or less information in the listing file. The short suboption, which is the default, generates the Object File Map, Source File Map and Global Symbols Map sections of the listing. The long suboption causes the generation of all of the sections generated through the short suboption, as well as the Object Resolution Warnings, Object Reference Map, Inliner Report and Partition Map sections. |
lowfreq=name{,name} | Specifies names of functions which are likely to be called infrequently. These will typically be error handling, trace, or initialization functions. The compiler may be able to make other parts of the program run faster by doing less optimization for calls to these functions. |
missing=attribute | Specifies the interprocedural behavior of procedures that are not compiled with -qipa and are not explicitly named in an unknown, safe, isolated, or pure suboption.
attribute is one of unknown, safe, isolated, or pure. The default is unknown. |
partition=small
partition=medium
partition=large |
Specifies the size of each program partition created by IPA during pass 2. The default is medium. |
nopdfname
pdfname
pdfname=filename |
Specifies the name of the profile data file containing the PDF profiling information. If you do not specify filename, the default file name is ._pdf. The profile is placed in the current working directory or in the directory named by the PDFDIR environment variable. This lets you do simultaneous runs of multiple executables using the same PDFDIR, which can be useful when tuning with PDF on dynamic libraries. |
nothreads
threads
threads=N |
Specifies the number of threads the compiler assigns to code generation.
Specifying nothreads is equivalent to running one serial process. This is the default. Specifying threads allows the compiler to determine how many threads to use, depending on the number of processors available. Specifying threads=N instructs the compiler to use N threads. Although N can be any integer value in the range of 1 to MAXINT, N is effectively limited to the number of processors available on your system. |
pure=name{,name} | Specifies a list of pure functions that are not compiled with -qipa. Any function specified as pure must be isolated and safe, and must neither alter internal state nor have side effects, where a side effect is anything that potentially alters data visible to the caller. |
safe=name{,name} | Specifies a list of safe functions that are not compiled with -qipa and do not call any other part of the program. Safe functions can modify global variables, but may not call functions compiled with -qipa. |
unknown=name{,name} | Specifies a list of unknown functions that are not compiled with -qipa. Any function specified as unknown can make calls to other parts of the program compiled with -qipa, and modify global variables and dummy arguments. |
filename | Gives the name of a file that contains suboption information in a special format.
The file format is the following:
# ... comment
attribute{, attribute} = name{, name}
clonearch = arch{, arch}
cloneproc = name{, name}
missing = attribute{, attribute}
exits = name{, name}
lowfreq = name{, name}
inline [ = auto | = noauto ]
inline = name{, name} [ from name{, name}]
inline-threshold = unsigned_int
inline-limit = unsigned_int
list [ = file-name | short | long ]
noinline
noinline = name{, name} [ from name{, name}]
level = 0 | 1 | 2
prof [ = file-name ]
noprof
partition = small | medium | large | unsigned_int
where attribute is one of:
|
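To make the pure, safe, and unknown suboptions above more concrete, the following C++ sketch shows functions that would plausibly fit each classification if they lived in code that is not compiled with -qipa (for example, a separately built library). All function and variable names here are hypothetical illustrations, not part of the compiler or its documentation, and the classification comments reflect one reading of the descriptions in the table.

```cpp
#include <cstddef>

// Hypothetical names throughout; nothing here comes from the compiler itself.
static int g_call_count = 0;  // global state visible to the rest of the program

// Candidate for pure=: reads only its arguments, refers to no global
// variables, calls nothing else, and has no side effects.
int checksum(const unsigned char* data, std::size_t n) {
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

// Candidate for safe=: modifies a global variable but never calls back
// into other parts of the program.
void count_call() {
    ++g_call_count;
}

// Candidate for unknown=: may call back into the program through the
// function pointer and also modifies global state.
int apply(int (*callback)(int), int value) {
    ++g_call_count;
    return callback(value);
}

// A function in the IPA-compiled part of the program that apply() may call.
int twice(int x) {
    return 2 * x;
}
```

Under these assumptions, the link step might name the functions with something like -qipa=pure=checksum:safe=count_call:unknown=apply, using the colon separator shown in the syntax diagram.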
The following table shows the allowed clonearch values for different -qarch settings.
-qarch setting | Allowed clonearch value |
---|---|
ppc, pwr3, ppc64, ppcgr, ppc64gr, ppc64grsq | pwr4, pwr5, ppc970 |
pwr4 | pwr5, ppc970 |
ppc64v | ppc970 |
pwr5, ppc970 | N/A |
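The table above can be read as a simple lookup from the -qarch setting to the clonearch values that are allowed with it. The following C++ sketch encodes the table that way. The function name allowed_clonearch is a hypothetical helper for illustration only, and an empty result stands for the N/A row or for a -qarch setting that the table does not list.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns the clonearch values that the table allows
// for a given -qarch setting. An empty vector means none are allowed
// (the N/A row) or the setting does not appear in the table.
std::vector<std::string> allowed_clonearch(const std::string& qarch) {
    static const std::map<std::string, std::vector<std::string>> table = {
        {"ppc",        {"pwr4", "pwr5", "ppc970"}},
        {"pwr3",       {"pwr4", "pwr5", "ppc970"}},
        {"ppc64",      {"pwr4", "pwr5", "ppc970"}},
        {"ppcgr",      {"pwr4", "pwr5", "ppc970"}},
        {"ppc64gr",    {"pwr4", "pwr5", "ppc970"}},
        {"ppc64grsq",  {"pwr4", "pwr5", "ppc970"}},
        {"pwr4",       {"pwr5", "ppc970"}},
        {"ppc64v",     {"ppc970"}},
        {"pwr5",       {}},
        {"ppc970",     {}},
    };
    auto it = table.find(qarch);
    if (it == table.end()) {
        return {};
    }
    return it->second;
}
```

A build script could use a check like this to reject a clonearch list before invoking the compiler, since an invalid value silently disables function versioning.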
The necessary steps to use IPA are:
Note: If a severe error occurs during compilation, -qipa returns RC=1 and terminates. Performance analysis also terminates.
Regular expression syntax can be used when specifying a name for the following suboptions.
Syntax rules for specifying regular expressions are described below:
Expression | Description |
---|---|
string | Matches the sequence of characters specified in string wherever it occurs. For example, test will match testimony, latest, and intestine. |
^string | Matches the pattern specified by string only if it occurs at the beginning of a line. |
string$ | Matches the pattern specified by string only if it occurs at the end of a line. |
str.ing | The period ( . ) matches any single character. For example, t.st will match test, tast, tZst, and t1st. |
string\special_char | The backslash ( \ ) can be used to escape special characters. For example, assume that you want to find lines ending with a period. Simply specifying the expression .$ would show all lines that had at least one character of any kind in it. Specifying \.$ escapes the period ( . ), and treats it as an ordinary character for matching purposes. |
[string] | Matches any of the characters specified in string. For example, t[a-g123]st matches tast and test, but not t-st or tAst. |
[^string] | Does not match any of the characters specified in string. For example, t[^a-zA-Z]st matches t1st, t-st, and t,st but not test or tYst. |
string* | Matches zero or more occurrences of the pattern specified by string. For example, te*st will match tst, test, and teeeeeest. |
string+ | Matches one or more occurrences of the pattern specified by string. For example, t(es)+t matches test, tesest, but not tt. |
string? | Matches zero or one occurrences of the pattern specified by string. For example, te?st matches either tst or test. |
string{m,n} | Matches between m and n occurrence(s) of the pattern specified by string. For example, a{2} matches aa, and b{1,4} matches b, bb, bbb, and bbbb. |
string1 | string2 | Matches the pattern specified by either string1 or string2. For example, s | o matches both characters s and o. |
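The matching rules in the table above can be tried out with a small C++ sketch. It uses std::regex with its default ECMAScript grammar as a stand-in, which is an assumption: the compiler's own regular expression engine is not specified here and may differ in detail. The helper name name_matches is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches anywhere inside `name`.
// This is an unanchored search, so a plain string such as "test" matches
// any name that contains it; use ^ and $ to anchor the pattern.
bool name_matches(const std::string& pattern, const std::string& name) {
    const std::regex re(pattern);  // default grammar: ECMAScript
    return std::regex_search(name, re);
}
```

For example, name_matches("test", "latest") is true, while name_matches("^t[a-g123]st$", "tAst") is false because A is not in the bracketed set.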
To compile a set of files with interprocedural analysis, enter:
xlc++ -c -O3 *.C -qipa
xlc++ -o product *.o -qipa
Here is how you might compile the same set of files to speed up the first compile step and improve the optimization performed in the second step. Assume that there exist two functions, trace_error and debug_dump, which are rarely executed.
xlc++ -c -O3 *.C -qipa=noobject
xlc++ -o product *.o -qipa=lowfreq=trace_error,debug_dump
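As an illustration of the kind of code the lowfreq example has in mind, the following C++ sketch shows a hot loop that calls trace_error and debug_dump only on an error path. The function names trace_error and debug_dump come from the example above; their bodies, the sum_non_negative function, and the g_errors counter are hypothetical and only show why such functions are rarely executed.

```cpp
#include <cstdio>
#include <vector>

static int g_errors = 0;  // hypothetical counter of reported errors

// Rarely executed: runs only when invalid input is seen.
void trace_error(const char* message) {
    ++g_errors;
    std::fprintf(stderr, "error: %s\n", message);
}

// Rarely executed: diagnostic dump of the whole input.
void debug_dump(const std::vector<int>& values) {
    for (int v : values) {
        std::fprintf(stderr, "%d ", v);
    }
    std::fprintf(stderr, "\n");
}

// Hot path: sums the non-negative entries and reports negative ones.
long sum_non_negative(const std::vector<int>& values) {
    long total = 0;
    for (int v : values) {
        if (v < 0) {
            trace_error("negative value skipped");
            debug_dump(values);
            continue;
        }
        total += v;
    }
    return total;
}
```

Naming the two error-path functions in lowfreq tells the compiler that calls to them are unlikely, so it can concentrate optimization effort on the loop itself.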