Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections.
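To use PDF, follow these steps:
1. Compile some or all of the source files with the -qpdf1 option and an optimization option. In a large application, concentrate on those areas of the code that can benefit most from optimization. You do not need to compile all of the application's code with the -qpdf1 option.
2. Run the program with typical input data so that it records a profile. You can run it several times with different data sets.
3. Recompile the same files with the same options, replacing -qpdf1 with -qpdf2, so that the compiler can use the recorded profile.
For best performance, use the -O3, -O4, or -O5 option with all compilations when you use PDF.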
The profile is placed in the current working directory or in the directory that the PDFDIR environment variable names, if that variable is set.
To avoid wasting compilation and execution time, make sure that the PDFDIR environment variable is set to an absolute path. Otherwise, you might run the application from the wrong directory, and it will not be able to locate the profile data files. When that happens, the program may not be optimized correctly or may be stopped by a segmentation fault. A segmentation fault might also happen if you change the value of the PDFDIR variable and execute the application before finishing the PDF process.
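One way to guard against the wrong-directory problem is to check PDFDIR before each run. The following is a minimal sketch that assumes a POSIX shell; the function name check_pdfdir is hypothetical and is not part of the compiler toolset.

```shell
# check_pdfdir: hypothetical helper, not an XL C/C++ utility.
# Succeeds only if PDFDIR is set to an absolute path that names
# an existing directory; otherwise prints a reason and fails.
check_pdfdir() {
    case "${PDFDIR:-}" in
        "")
            echo "PDFDIR is not set; the profile would go to $(pwd)" >&2
            return 1 ;;
        /*)
            ;;  # absolute path, keep checking
        *)
            echo "PDFDIR is not an absolute path: $PDFDIR" >&2
            return 1 ;;
    esac
    if [ ! -d "$PDFDIR" ]; then
        echo "PDFDIR does not exist: $PDFDIR" >&2
        return 1
    fi
}
```

A run of the instrumented program could then be written as `check_pdfdir && ./a.out < sample.data`, so the program starts only when the profile location is unambiguous.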
Because this option requires compiling the entire application twice, it is intended to be used after other debugging and tuning is finished, as one of the last steps before putting the application into production.
The following utility programs, found in /usr/xlopt/bin, are available for managing the PDFDIR directory:
| Utility | Syntax | Description |
|---|---|---|
| cleanpdf | cleanpdf [pathname] | Removes all profiling information from the pathname directory; or if pathname is not specified, from the PDFDIR directory; or if PDFDIR is not set, from the current directory. Removing profiling information reduces run-time overhead if you change the program and then go through the PDF process again. Run cleanpdf only when you are finished with the PDF process for a particular application. Otherwise, if you want to resume using PDF with that application, you will need to recompile all of the files again with -qpdf1. |
| mergepdf | mergepdf [-r scaling] input {[-r scaling] input} ... -o output [-n] [-v] | Merges two or more PDF records into a single PDF output record. |
| resetpdf | resetpdf [pathname] | Same as cleanpdf [pathname], described above. |
| showpdf | showpdf | Displays the call and block counts for all procedures executed in a program run. To use this command, you must first compile your application with both the -qpdf1 and -qshowpdf compiler options on the command line. |
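Because cleanpdf and resetpdf choose their target directory by precedence, it can help to confirm which directory would be affected before running either one. The sketch below only mirrors the precedence described for cleanpdf (explicit pathname, then PDFDIR, then the current directory). The function name pdf_clean_target is hypothetical, and treating an empty PDFDIR the same as an unset one is an assumption.

```shell
# pdf_clean_target: hypothetical helper, not an XL C/C++ utility.
# Prints the directory that cleanpdf or resetpdf would act on:
# the explicit pathname if given, else PDFDIR, else the current directory.
# Assumption: an empty PDFDIR is treated like an unset one.
pdf_clean_target() {
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${PDFDIR:-}" ]; then
        printf '%s\n' "$PDFDIR"
    else
        pwd
    fi
}
```

For example, `echo "Profile data would be removed from: $(pdf_clean_target)"` can be run before invoking /usr/xlopt/bin/cleanpdf, since removing the profile means recompiling with -qpdf1 to resume the PDF process.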
Here is a simple example:
    # Set the PDFDIR variable.
    export PDFDIR=$HOME/project_dir

    # Compile all files with -qpdf1.
    xlc++ -qpdf1 -O3 file1.C file2.C file3.C

    # Run with one set of input data.
    ./a.out < sample.data

    # Recompile all files with -qpdf2.
    xlc++ -qpdf2 -O3 file1.C file2.C file3.C

    # The program should now run faster than without PDF
    # if the sample data is typical.
Here is a more elaborate example:
    # Set the PDFDIR variable.
    export PDFDIR=$HOME/project_dir

    # Compile most of the files with -qpdf1.
    xlc++ -qpdf1 -O3 -c file1.C file2.C file3.C

    # This file is not so important to optimize.
    xlc++ -c file4.C

    # Non-PDF object files such as file4.o can be linked in.
    xlc++ -qpdf1 file1.o file2.o file3.o file4.o

    # Run several times with different input data.
    ./a.out < polar_orbit.data
    ./a.out < elliptical_orbit.data
    ./a.out < geosynchronous_orbit.data

    # No need to recompile the source of non-PDF object files (file4.C).
    # Use -c so that this step only produces object files for the link below.
    xlc++ -qpdf2 -O3 -c file1.C file2.C file3.C

    # Link all the object files into the final application.
    xlc++ file1.o file2.o file3.o file4.o