Using XL built-in floating-point functions for Blue Gene/L

The XL C/C++ and XL Fortran compilers include a set of built-in functions that are optimized for the PowerPC architecture. For a full description of them, refer to the following documents (available from the Web pages listed at the beginning of this chapter):

In addition, on Blue Gene/L, the XL compilers provide a set of built-in functions that are specifically optimized for the PowerPC 440d's Double Hummer dual FPU. These built-in functions provide an almost one-to-one correspondence with the Double Hummer instruction set.

All of the C/C++ and Fortran built-in functions operate on complex data types, which have an underlying representation of a two-element array, in which the real part represents the primary element and the imaginary part represents the secondary element. The input data you provide does not need to represent complex numbers: both elements are treated internally as two independent real values, and none of these built-in functions performs complex arithmetic. A separate set of built-in functions designed to manipulate complex-type variables efficiently is also available.
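The pairing described above can be illustrated in portable C99, without any Blue Gene/L built-ins. The following is a minimal sketch: the helpers pack_pair, primary_of, and secondary_of are hypothetical names introduced here for illustration and are not part of the XL compilers. It relies only on the C99 guarantee that a complex type has the same layout as a two-element array of the corresponding real type, and it contrasts an element-wise product with ordinary complex multiplication.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not an XL built-in): store two unrelated real values
   in one double _Complex by copying a two-element array into it. */
static double _Complex pack_pair(double primary, double secondary)
{
    double parts[2] = { primary, secondary };
    double _Complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}

/* Hypothetical accessors: element 0 is the real part (primary),
   element 1 is the imaginary part (secondary). */
static double primary_of(double _Complex z)
{
    double parts[2];
    memcpy(parts, &z, sizeof parts);
    return parts[0];
}

static double secondary_of(double _Complex z)
{
    double parts[2];
    memcpy(parts, &z, sizeof parts);
    return parts[1];
}

/* Usage sketch: an element-wise product is not a complex product. */
static void demo_pair_vs_complex(void)
{
    double _Complex a = pack_pair(1.0, 2.0);
    double _Complex b = pack_pair(3.0, 4.0);

    /* Element-wise view: two independent real products. */
    double p = primary_of(a) * primary_of(b);     /* 1 * 3 = 3 */
    double s = secondary_of(a) * secondary_of(b); /* 2 * 4 = 8 */
    assert(p == 3.0 && s == 8.0);

    /* Ordinary complex multiplication: (1 + 2i)(3 + 4i) = -5 + 10i. */
    double _Complex c = a * b;
    assert(primary_of(c) == -5.0 && secondary_of(c) == 10.0);
}
```

Calling demo_pair_vs_complex() from a test program exercises both views of the same two operands.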

The Blue Gene/L built-in functions perform several types of operations, as explained in the following paragraphs.

Parallel operations perform SIMD computations on the primary and secondary elements of one or more input operands. They store the results in the corresponding elements of the output. As an example, Figure 8 illustrates how a parallel multiply operation is performed.

Figure 8. Parallel operations
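As a rough illustration of the parallel multiply just described, the following portable C sketch models the data flow with a plain struct. The type pair_t and the function model_parallel_mul are hypothetical stand-ins introduced here; they are not the XL built-ins and do not use the Double Hummer unit.

```c
#include <assert.h>

/* Hypothetical model of a two-element operand. */
typedef struct {
    double primary;
    double secondary;
} pair_t;

/* Model of a parallel multiply:
     primary(result)   = primary(a)   * primary(b)
     secondary(result) = secondary(a) * secondary(b) */
static pair_t model_parallel_mul(pair_t a, pair_t b)
{
    pair_t result = { a.primary * b.primary, a.secondary * b.secondary };
    return result;
}

/* Usage sketch. */
static void demo_parallel_mul(void)
{
    pair_t a = { 2.0, 3.0 };
    pair_t b = { 5.0, 7.0 };
    pair_t r = model_parallel_mul(a, b);
    assert(r.primary == 10.0);   /* 2 * 5 */
    assert(r.secondary == 21.0); /* 3 * 7 */
}
```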

Cross operations perform SIMD computations on the opposite primary and secondary elements of one or more input operands. They store the results in the corresponding elements in the output. As an example, Figure 9 illustrates how a cross-multiply operation is performed.

Figure 9. Cross operations
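A cross multiply can be modeled in the same spirit. The sketch below is portable C with hypothetical names (pair_t, model_cross_mul), not the XL built-in itself. It assumes that each result element takes the matching element of the first operand and the opposite element of the second operand; the text above does not spell out this ordering, so treat it as an assumption and check the description of the individual function.

```c
#include <assert.h>

/* Hypothetical model of a two-element operand. */
typedef struct {
    double primary;
    double secondary;
} pair_t;

/* Model of a cross multiply (operand ordering is an assumption):
     primary(result)   = primary(a)   * secondary(b)
     secondary(result) = secondary(a) * primary(b) */
static pair_t model_cross_mul(pair_t a, pair_t b)
{
    pair_t result = { a.primary * b.secondary, a.secondary * b.primary };
    return result;
}

/* Usage sketch. */
static void demo_cross_mul(void)
{
    pair_t a = { 2.0, 3.0 };
    pair_t b = { 5.0, 7.0 };
    pair_t r = model_cross_mul(a, b);
    assert(r.primary == 14.0);   /* 2 * 7 */
    assert(r.secondary == 15.0); /* 3 * 5 */
}
```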

Copy-primary operations perform SIMD computation between the corresponding primary and secondary elements of two input operands, where the primary element of the first operand is replicated to the secondary element. As an example, Figure 10 illustrates how a copy-primary multiply operation is performed.

Figure 10. Copy-primary operations

Copy-secondary operations perform SIMD computation between the corresponding primary and secondary elements of two input operands, where the secondary element of the first operand is replicated to the primary element. As an example, Figure 11 illustrates how a copy-secondary multiply operation is performed.

Figure 11. Copy-secondary operations
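The two copy variants described above differ only in which element of the first operand is replicated. The following portable C sketch models both; pair_t, model_copy_primary_mul, and model_copy_secondary_mul are hypothetical names chosen for illustration and are not XL built-ins.

```c
#include <assert.h>

/* Hypothetical model of a two-element operand. */
typedef struct {
    double primary;
    double secondary;
} pair_t;

/* Model of a copy-primary multiply: primary(a) is used for both elements.
     primary(result)   = primary(a) * primary(b)
     secondary(result) = primary(a) * secondary(b) */
static pair_t model_copy_primary_mul(pair_t a, pair_t b)
{
    pair_t result = { a.primary * b.primary, a.primary * b.secondary };
    return result;
}

/* Model of a copy-secondary multiply: secondary(a) is used for both elements.
     primary(result)   = secondary(a) * primary(b)
     secondary(result) = secondary(a) * secondary(b) */
static pair_t model_copy_secondary_mul(pair_t a, pair_t b)
{
    pair_t result = { a.secondary * b.primary, a.secondary * b.secondary };
    return result;
}

/* Usage sketch. */
static void demo_copy_mul(void)
{
    pair_t a = { 2.0, 3.0 };
    pair_t b = { 5.0, 7.0 };

    pair_t p = model_copy_primary_mul(a, b);
    assert(p.primary == 10.0 && p.secondary == 14.0); /* 2*5, 2*7 */

    pair_t s = model_copy_secondary_mul(a, b);
    assert(s.primary == 15.0 && s.secondary == 21.0); /* 3*5, 3*7 */
}
```

In effect, each variant scales both elements of the second operand by a single real value taken from the first operand.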

In cross-copy operations, the compiler crosses either the primary or secondary element of the first operand, so that copy-primary and copy-secondary operations can be used interchangeably to achieve the same result. The operation is performed on the total value of the first operand. As an example, Figure 12 illustrates the result of a cross-copy multiply operation.

Figure 12. Cross-copy operations

The following sections describe the available built-in functions by category:

For each function, the C/C++ prototype is provided. In C, you do not need to include a header file to obtain the prototypes. The compiler includes them automatically. In C++, you need to include the header file builtins.h.

Fortran does not use prototypes for built-in functions. Therefore, the interfaces for the Fortran functions are provided in textual form. The function names omit the double underscore (__) in Fortran.

All of the built-in functions, with the exception of the complex type manipulation functions, require compilation under -qarch=440d. This is the default setting on Blue Gene/L.

To help clarify the English description of each function, the following notation is used:

element(variable)

where element represents one of primary or secondary, and variable represents input variable a, b, or c, or the output variable result. For example, consider the following formula:

primary(result) = primary(a) + primary(b)

The formula indicates that the primary element of input variable a is added to the primary element of input variable b and stored in the primary element of the result.

To optimize your calls to the Blue Gene/L built-in functions, follow the guidelines provided in Tuning your code for Blue Gene/L. Using the alignx built-in function (described in Checking for data alignment), and specifying the disjoint pragma (described in Removing possibilities for aliasing (C/C++)), are recommended for code that calls any of the built-in functions.
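As a hedged sketch of how these two recommendations are commonly combined, the following C function asserts 16-byte alignment with __alignx and declares its pointer arguments non-aliasing with the disjoint pragma. The function name scale_array and its parameters are hypothetical. The XL-specific lines are guarded by compiler macros (the macro names __IBMC__ and __xlc__ are assumed here to identify the XL C compiler) so that the example also compiles elsewhere; the exact placement and syntax should be checked against the sections referenced above.

```c
#include <assert.h>

/* Hypothetical example: out[i] = factor * in[i].
   The caller promises that out and in do not overlap and, on Blue Gene/L,
   that both point to 16-byte aligned storage. */
static void scale_array(double *out, double *in, double factor, int n)
{
#if defined(__IBMC__) || defined(__xlc__)
    /* XL only: tell the compiler the two arrays never alias. */
#pragma disjoint(*out, *in)
    /* XL only: tell the compiler both pointers are 16-byte aligned. */
    __alignx(16, out);
    __alignx(16, in);
#endif
    int i;

    /* Cheap sanity check of the no-alias promise. */
    assert(out != in);

    for (i = 0; i < n; i++) {
        out[i] = factor * in[i];
    }
}
```

With aliasing ruled out and alignment known, the compiler has more freedom to pair adjacent elements into two-element loads, stores, and arithmetic.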