The MASS libraries consist of a library of scalar routines, described in Using the scalar library, and a set of vector libraries tuned for specific architectures, described in Using the vector libraries. The routines contained in both scalar and vector libraries are automatically called at certain levels of optimization, but you can also call them explicitly in your programs. Note that the accuracy and exception handling might not be identical in MASS routines and system library routines.
Compiling and linking a program with MASS describes how to compile and link a program that uses the MASS libraries, and how to selectively use the MASS scalar library routines in concert with the regular system library scalar routines.
The MASS scalar library, libmass.a, contains an accelerated set of frequently used math intrinsic functions that provide improved performance over the corresponding standard system library functions. When you compile programs with any of the following options:
the compiler automatically uses the faster MASS routines for all scalar routines (with the exception of atan2, dnint, sqrt, rsqrt). (The compiler first tries to "vectorize" calls to the scalar routines by replacing them with the MASS vector routines; if the compiler cannot do so, it will use the MASS scalar routines.) When you use these options, the compiler uses versions of the MASS routines contained in the system library libxlopt.a, and you do not need to add any special calls to the MASS routines in your code, or to link to the libxlopt library.
If you are not using any of these optimization levels, and/or want to explicitly call the MASS scalar routines, you can do so by linking the MASS scalar library libmass.a (or the 64-bit version, libmass_64.a) with your application (for instructions, see Compiling and linking a program with MASS). The MASS scalar routines all accept double-precision parameters and return a double-precision result, and are summarized in Table 9. All the MASS scalar routines except rsqrt are recognized by XL Fortran as intrinsic functions, so no explicit interface block is needed. To provide an interface block for rsqrt, include mass.include in your source file.
Function | Description |
---|---|
sqrt | Returns the square root of x |
rsqrt | Returns the reciprocal of the square root of x |
exp | Returns the exponential function of x |
expm1 | Returns (the exponential function of x) - 1 |
log | Returns the natural logarithm of x |
log1p | Returns the natural logarithm of (x + 1) |
sin | Returns the sine of x |
cos | Returns the cosine of x |
tan | Returns the tangent of x |
atan | Returns the arctangent of x |
atan2 | Returns the arctangent of x/y |
sinh | Returns the hyperbolic sine of x |
cosh | Returns the hyperbolic cosine of x |
tanh | Returns the hyperbolic tangent of x |
dnint | Returns the nearest integer to x (as a double) |
x**y | Returns x raised to the power y |
The following example shows the interface declaration for the rsqrt scalar function:
    interface
      real*8 function rsqrt (%val(x))
        real*8 x   ! Returns the reciprocal of the square root of x.
      end function rsqrt
    end interface
The trigonometric functions (sin, cos, tan) return NaN (Not-a-Number) values for large arguments (abs(x)>2**50*pi).
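As a minimal sketch of an explicit scalar call, the following program includes mass.include to obtain the rsqrt interface and prints the MASS result next to the equivalent expression built from the sqrt intrinsic. The program name and the test value are illustrative only, and the sketch assumes the program is linked with libmass.a as described in Compiling and linking a program with MASS. Because MASS accuracy may differ from that of the system library, the two printed values are not guaranteed to agree in the last bits.

```fortran
      program rsqrt_demo
      implicit none
!     mass.include supplies the interface block for rsqrt.
      include 'mass.include'
      real*8 :: x, r
      x = 4.0d0
!     Explicit call to the MASS scalar routine.
      r = rsqrt (x)
      print *, 'rsqrt(x)  =', r
!     Same quantity computed with the sqrt intrinsic.
      print *, '1/sqrt(x) =', 1.0d0 / sqrt (x)
      end program rsqrt_demo
```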
When you compile programs with any of the following options:
the compiler automatically attempts to vectorize calls to system math routines by calling the equivalent MASS vector routines (with the exceptions of functions vatan2, vsatan2, vdnint, vdint, vsincos, vssincos, vcosisin, vscosisin, vqdrt, vsqdrt, vrqdrt, vsrqdrt, vpopcnt4, and vpopcnt8).
If you are not using any of these optimization levels, and/or want to explicitly call any of the MASS vector routines, you can do so by including massv.include in your source files to provide the interface declarations for the routines, and by linking to any of the following vector library archives (information on linking is provided in Compiling and linking a program with MASS):
On Linux(R), 32-bit and 64-bit objects must not be mixed in a single library, so a separate 64-bit version of each vector library is provided: libmassvp4_64.a and libmassvp5_64.a.
With the exception of a few routines (described below), all of the floating-point routines in the vector libraries accept three parameters:
These routines are all of the form:
function_name (y,x,n)
where y is the output vector, x is the source vector, and n is the vector length. The parameters y and x are assumed to be double-precision for functions whose prefix is v, and single-precision for functions with the prefix vs. As an example, the following code:
    include 'massv.include'
    real*8 x(500), y(500)
    integer n
    n = 500
    ...
    call vexp (y, x, n)
outputs a vector y of length 500 whose elements are exp(x(i)), with i=1,...,500.
The routines vatan2, vdiv, and vpow take four parameters and are of the form routine_name(z,x,y,n). The routine vsincos takes four parameters of the form routine_name(y,z,x,n). The routine vatan2 outputs a vector z whose elements are atan(x(i)/y(i)). The routine vdiv outputs a vector z whose elements are x(i)/y(i). The routine vpow outputs a vector z whose elements are x(i)**y(i). The routine vsincos outputs two vectors, y and z, whose elements are sin(x(i)) and cos(x(i)) respectively.
In vcosisin(y,x,n), x is a vector of n double elements and the routine outputs a vector y of n complex*16 elements of the form (cos(x(i)),sin(x(i))).
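The argument orders described above can be illustrated with a short sketch. The program name, array names, and input values are hypothetical; the calls follow the forms given in the text, and the sketch assumes that massv.include supplies interfaces for all three routines and that the program is linked with one of the MASS vector libraries.

```fortran
      program vec_forms_demo
      implicit none
      include 'massv.include'
      integer, parameter :: n = 4
      real*8 :: x(n), y(n), z(n), s(n), c(n)
      complex*16 :: w(n)
      integer :: i
      do i = 1, n
         x(i) = 0.5d0 * i
         y(i) = 2.0d0
      end do
!     Form (z,x,y,n): z(i) = x(i)**y(i)
      call vpow (z, x, y, n)
!     Form (y,z,x,n): s(i) = sin(x(i)), c(i) = cos(x(i))
      call vsincos (s, c, x, n)
!     Form (y,x,n) with complex output:
!     w(i) = (cos(x(i)), sin(x(i)))
      call vcosisin (w, x, n)
      do i = 1, n
         print *, x(i), z(i), s(i), c(i), w(i)
      end do
      end program vec_forms_demo
```

Note that the output vector or vectors always come first and the vector length comes last, so only the position of the input vectors changes between the forms.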
The single-precision and double-precision floating-point routines contained in the vector libraries are summarized in Table 10.
Double-precision function | Single-precision function | Arguments | Description |
---|---|---|---|
vacos | vsacos | (y,x,n) | Sets y(i) to the arccosine of x(i), for i=1,..,n |
vasin | vsasin | (y,x,n) | Sets y(i) to the arcsine of x(i), for i=1,..,n |
vatan2 | vsatan2 | (z,x,y,n) | Sets z(i) to the arctangent of x(i)/y(i), for i=1,..,n |
vcbrt | vscbrt | (y,x,n) | Sets y(i) to the cube root of x(i), for i=1,..,n |
vcos | vscos | (y,x,n) | Sets y(i) to the cosine of x(i), for i=1,..,n |
vcosh | vscosh | (y,x,n) | Sets y(i) to the hyperbolic cosine of x(i), for i=1,..,n |
vcosisin | vscosisin | (y,x,n) | Sets the real part of y(i) to the cosine of x(i) and the imaginary part of y(i) to the sine of x(i), for i=1,..,n |
vdint | | (y,x,n) | Sets y(i) to the integer truncation of x(i), for i=1,..,n |
vdiv | vsdiv | (z,x,y,n) | Sets z(i) to x(i)/y(i), for i=1,..,n |
vdnint | | (y,x,n) | Sets y(i) to the nearest integer to x(i), for i=1,..,n |
vexp | vsexp | (y,x,n) | Sets y(i) to the exponential function of x(i), for i=1,..,n |
vexpm1 | vsexpm1 | (y,x,n) | Sets y(i) to (the exponential function of x(i))-1, for i=1,..,n |
vlog | vslog | (y,x,n) | Sets y(i) to the natural logarithm of x(i), for i=1,..,n |
vlog10 | vslog10 | (y,x,n) | Sets y(i) to the base-10 logarithm of x(i), for i=1,..,n |
vlog1p | vslog1p | (y,x,n) | Sets y(i) to the natural logarithm of (x(i)+1), for i=1,..,n |
vpow | vspow | (z,x,y,n) | Sets z(i) to x(i) raised to the power y(i), for i=1,..,n |
vqdrt | vsqdrt | (y,x,n) | Sets y(i) to the 4th root of x(i), for i=1,..,n |
vrcbrt | vsrcbrt | (y,x,n) | Sets y(i) to the reciprocal of the cube root of x(i), for i=1,..,n |
vrec | vsrec | (y,x,n) | Sets y(i) to the reciprocal of x(i), for i=1,..,n |
vrqdrt | vsrqdrt | (y,x,n) | Sets y(i) to the reciprocal of the 4th root of x(i), for i=1,..,n |
vrsqrt | vsrsqrt | (y,x,n) | Sets y(i) to the reciprocal of the square root of x(i), for i=1,..,n |
vsin | vssin | (y,x,n) | Sets y(i) to the sine of x(i), for i=1,..,n |
vsincos | vssincos | (y,z,x,n) | Sets y(i) to the sine of x(i) and z(i) to the cosine of x(i), for i=1,..,n |
vsinh | vssinh | (y,x,n) | Sets y(i) to the hyperbolic sine of x(i), for i=1,..,n |
vsqrt | vssqrt | (y,x,n) | Sets y(i) to the square root of x(i), for i=1,..,n |
vtan | vstan | (y,x,n) | Sets y(i) to the tangent of x(i), for i=1,..,n |
vtanh | vstanh | (y,x,n) | Sets y(i) to the hyperbolic tangent of x(i), for i=1,..,n |
The integer routines are of the form function_name (x, n), where x is a vector of 4-byte (for vpopcnt4) or 8-byte (for vpopcnt8) numeric objects (integer or floating-point), and n is the vector length. The vector integer routines are summarized in Table 11.
Function | Description | Interface |
---|---|---|
vpopcnt4 | Returns the total number of 1 bits in the concatenation of the binary representation of x(i), for i=1,...,n, where x is vector of 32-bit objects | integer*4 function vpopcnt4 (x, n) integer*4 x(*), n |
vpopcnt8 | Returns the total number of 1 bits in the concatenation of the binary representation of x(i), for i=1,...,n, where x is vector of 64-bit objects | integer*4 function vpopcnt8 (x, n) integer*8 x(*) |
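Unlike the floating-point routines, the integer routines are functions that return a count. The following sketch calls vpopcnt4 on three 32-bit integers; the program and variable names are illustrative, and it assumes that massv.include provides the vpopcnt4 interface shown in Table 11. Going by the description in the table, the values 1, 3, and 255 contain one, two, and eight 1 bits, so the total should be their sum.

```fortran
      program popcnt_demo
      implicit none
      include 'massv.include'
      integer*4 :: bits(3), total
      integer :: n
!     1 has one 1 bit, 3 has two, 255 has eight.
      bits(1) = 1
      bits(2) = 3
      bits(3) = 255
      n = 3
!     Counts the 1 bits across all n elements of bits.
      total = vpopcnt4 (bits, n)
      print *, 'total number of 1 bits =', total
      end program popcnt_demo
```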
The following example shows interface declarations for some of the MASS double-precision vector routines:
    interface
      subroutine vsqrt (y, x, n)
        real*8 y(*), x(*)
        integer n
        ! Sets y(i) to the square root of x(i), for i=1,..,n
      end subroutine vsqrt
      subroutine vrsqrt (y, x, n)
        real*8 y(*), x(*)
        integer n
        ! Sets y(i) to the reciprocal of the square root of x(i),
        ! for i=1,..,n
      end subroutine vrsqrt
    end interface
The following example shows interface declarations for some of the MASS single-precision vector routines:
    interface
      subroutine vssqrt (y, x, n)
        real*4 y(*), x(*)
        integer n
        ! Sets y(i) to the square root of x(i), for i=1,..,n
      end subroutine vssqrt
      subroutine vsrsqrt (y, x, n)
        real*4 y(*), x(*)
        integer n
        ! Sets y(i) to the reciprocal of the square root of x(i),
        ! for i=1,..,n
      end subroutine vsrsqrt
    end interface
Normally, Fortran subroutine calls should pass only parameters that are disjoint, meaning that they do not overlap in memory. However, in calls to the MASS vector routines, this restriction is relaxed, and applications can use the same vector for both input and output parameters (for example, vsin (y, y, n)). Other kinds of overlap (where input and output vectors are neither disjoint nor identical) should be avoided, since they may produce unexpected results:
For routines that take one input vector and one output vector (for example, vsin (y, x, n)): the vectors x(1:n) and y(1:n) must be either disjoint or identical, or unexpected results may be obtained.
For routines that take two input vectors (for example, vatan2 (y, x1, x2, n)): the same restriction applies to both pairs of vectors y, x1 and y, x2. That is, y(1:n) and x1(1:n) must be either disjoint or identical; and y(1:n) and x2(1:n) must be either disjoint or identical.
For routines that take two output vectors (for example, vsincos (y1, y2, x, n)): the same restriction applies to both pairs of vectors y1, x and y2, x. That is, y1(1:n) and x(1:n) must be either disjoint or identical; and y2(1:n) and x(1:n) must be either disjoint or identical. Also, the vectors y1(1:n) and y2(1:n) must be disjoint.
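The overlap rules can be pictured with a small sketch that shows the two permitted cases and names the case to avoid. The program and array names are hypothetical, and the sketch assumes massv.include is available and the program is linked with a MASS vector library.

```fortran
      program overlap_demo
      implicit none
      include 'massv.include'
      integer, parameter :: n = 4
      real*8 :: a(n), b(2*n)
      integer :: i
      do i = 1, n
         a(i) = 0.25d0 * i
         b(i) = 0.25d0 * i
      end do
!     Identical input and output vectors: permitted.
      call vsin (a, a, n)
!     Disjoint halves of one array: permitted.
!     b(1:n) is the input, b(n+1:2*n) receives the output.
      call vexp (b(n+1:2*n), b(1:n), n)
!     A partially overlapping pair such as b(2:n+1) and b(1:n)
!     is the case to avoid.
      print *, a
      print *, b(n+1:2*n)
      end program overlap_demo
```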
All of the routines in the MASS vector libraries are consistent, in the sense that a given input value will always produce the same result, regardless of its position in the vector, and regardless of the vector length.
To compile an application that calls the functions in the MASS libraries, specify mass and massvp4 (or massvp5) (32-bit), or mass_64 and massvp4_64 (or massvp5_64) (64-bit) on the -l linker option. For example, if the MASS libraries are installed in the default directory, you could specify one of the following:
    xlf progf.f -o progf -lmass -lmassvp4
    xlf progf.f -o progf -lmass_64 -lmassvp4_64 -q64
The MASS routines must run in the round-to-nearest rounding mode and with floating-point exception trapping disabled. (These are the default compilation settings.)
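If a program changes the floating-point environment at run time, one way to confirm these conditions before calling MASS routines is to query them through the standard Fortran 2003 IEEE_ARITHMETIC intrinsic module. The following is a minimal sketch of such a check, not part of MASS itself; the program name is illustrative, and only overflow trapping is queried as a representative exception.

```fortran
      program fp_env_check
      use, intrinsic :: ieee_arithmetic
      implicit none
      type(ieee_round_type) :: mode
      logical :: halting
!     Query the current rounding mode.
      call ieee_get_rounding_mode (mode)
      if (mode == ieee_nearest) then
         print *, 'rounding mode: round-to-nearest'
      else
         print *, 'rounding mode is not round-to-nearest'
      end if
!     Query whether overflow exceptions halt the program.
      call ieee_get_halting_mode (ieee_overflow, halting)
      print *, 'overflow trapping enabled:', halting
      end program fp_env_check
```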
If you wish to use the libmass.a (or libmass_64.a) scalar library for some functions and the system library for other functions, follow this procedure to compile and link your program:
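1. Extract the object file for the routine you want from the MASS scalar library archive. For example, to extract the 32-bit tan routine:

    ar -x libmass.a tan.s32.o

2. Archive the extracted object file into a new library:

    ar -qv libfasttan.a tan.s32.o
    ranlib libfasttan.a

3. Link your program with the new library, where dir_containing_libfasttan.a is the directory that contains libfasttan.a:

    xlf sample.f -o sample -Ldir_containing_libfasttan.a -lfasttan

This links only the tan function from MASS (now in libfasttan.a) and the remainder of the math functions from the standard system library.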