This pragma instructs the compiler to attempt an unroll and fuse operation on nested for loops.
>>-#--pragma--+-nounrollandfuse------------+-------------------><
              '-unrollandfuse--(--+---+--)-'
                                  '-n-'
where n is a loop unrolling factor. In C programs, the value of n is a positive integral constant expression. In C++ programs, the value of n is a positive scalar integer or compile-time constant initialization expression. If n is not specified and if -qhot, -qsmp, or -O4 or higher is specified, the optimizer determines an appropriate unrolling factor for each nested loop.
The #pragma unrollandfuse directive applies only to the outer loops of nested for loop structures that meet the following conditions:
For loop unrolling to occur, the #pragma unrollandfuse directive must precede a for loop. You must not specify #pragma unrollandfuse for the innermost for loop.
You must not specify #pragma unrollandfuse more than once, or combine the directive with nounrollandfuse, nounroll, unroll, or stream_unroll directives for the same for loop.
Specifying #pragma nounrollandfuse instructs the compiler to not unroll that loop.
int i, j;
int a[1000][1000];
int b[1000][1000];
int c[1000][1000];
....
#pragma unrollandfuse(2)
for (i=1; i<1000; i++) {
    for (j=1; j<1000; j++) {
        a[j][i] = b[i][j] * c[j][i];
    }
}

The for loop below shows a possible result of applying the #pragma unrollandfuse(2) directive to the loop structure shown above.
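for (i=1; i<999; i=i+2) {
    for (j=1; j<1000; j++) {
        a[j][i] = b[i][j] * c[j][i];
        a[j][i+1] = b[i+1][j] * c[j][i+1];
    }
}
/* Residual iteration: the original loop runs 999 times, so i=999 is left over. */
for (j=1; j<1000; j++) {
    a[j][999] = b[999][j] * c[j][999];
}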
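One way to convince yourself that an unroll-and-fuse transformation preserves the results is to write it by hand and compare it with the original loop nest on the same data. The following is a minimal sketch of that check, not the compiler's actual output. The array size N, the function names, and the input values are assumptions made for illustration. Because the outer loop runs an odd number of times (i = 1 to N-1), the hand-written version needs a remainder loop for the last value of i.

```c
#include <assert.h>
#include <string.h>

#define N 100  /* assumed size for a quick check; the example above uses 1000 */

static int b[N][N], c[N][N];
static int a_ref[N][N], a_uf[N][N];

/* Fill the inputs with small values so the products cannot overflow. */
static void init_inputs(void)
{
    int i, j;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            b[i][j] = i + j;
            c[i][j] = i - j;
        }
    }
}

/* Original loop nest, as written in the example. */
static void original_nest(int a[N][N])
{
    int i, j;
    for (i = 1; i < N; i++) {
        for (j = 1; j < N; j++) {
            a[j][i] = b[i][j] * c[j][i];
        }
    }
}

/* Outer loop unrolled by 2 with the two bodies fused into one inner loop. */
static void unroll_and_fuse_by_2(int a[N][N])
{
    int i, j;
    for (i = 1; i + 1 < N; i += 2) {
        for (j = 1; j < N; j++) {
            a[j][i]     = b[i][j]     * c[j][i];
            a[j][i + 1] = b[i + 1][j] * c[j][i + 1];
        }
    }
    /* Remainder loop: handles the last i when the trip count is odd. */
    for (; i < N; i++) {
        for (j = 1; j < N; j++) {
            a[j][i] = b[i][j] * c[j][i];
        }
    }
}

/* Returns 1 when both versions produce identical arrays, 0 otherwise. */
static int results_match(void)
{
    init_inputs();
    memset(a_ref, 0, sizeof a_ref);
    memset(a_uf, 0, sizeof a_uf);
    original_nest(a_ref);
    unroll_and_fuse_by_2(a_uf);
    return memcmp(a_ref, a_uf, sizeof a_ref) == 0;
}

/* Aborts if the hand-written transformation changes the results. */
static void verify_unroll_and_fuse(void)
{
    assert(results_match());
}
```

Calling verify_unroll_and_fuse() from a test program aborts if the two versions disagree. The transformation is safe here because each iteration writes a distinct element of a and only reads b and c, so the order in which the (i, j) pairs are visited does not matter.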
int i, j, k;
int a[1000][1000];
int b[1000][1000];
int c[1000][1000];
int d[1000][1000];
int e[1000][1000];
....
#pragma unrollandfuse(4)
for (i=1; i<1000; i++) {
    #pragma unrollandfuse(2)
    for (j=1; j<1000; j++) {
        for (k=1; k<1000; k++) {
            a[j][i] = b[i][j] * c[j][i] + d[j][k] * e[i][k];
        }
    }
}
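To illustrate what the two nested directives above ask for, the sketch below hand-writes one possible equivalent: the i loop is unrolled by 4, the j loop by 2, and the resulting eight copies of the statement are fused into a single k loop. This is an illustration under stated assumptions, not the code the compiler generates. The size N, the helper body(), and the other function names are hypothetical, and N is kept small so that the comparison against the original nest runs quickly.

```c
#include <assert.h>
#include <string.h>

#define N 12  /* assumed small size; the example above uses 1000 */

static int b[N][N], c[N][N], d[N][N], e[N][N];
static int a_ref[N][N], a_uf[N][N];

/* One execution of the innermost statement from the example. */
static void body(int a[N][N], int i, int j, int k)
{
    a[j][i] = b[i][j] * c[j][i] + d[j][k] * e[i][k];
}

/* Small input values keep every product well inside the range of int. */
static void init_inputs(void)
{
    int i, j;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            b[i][j] = i + j;
            c[i][j] = i - j;
            d[i][j] = 2 * i + 1;
            e[i][j] = j - 3;
        }
    }
}

/* Original three-level nest, as written in the example. */
static void original_nest(int a[N][N])
{
    int i, j, k;
    for (i = 1; i < N; i++)
        for (j = 1; j < N; j++)
            for (k = 1; k < N; k++)
                body(a, i, j, k);
}

/* i unrolled by 4, j unrolled by 2, all copies fused into the k loop. */
static void unrolled_nest(int a[N][N])
{
    int i, j, k;
    for (i = 1; i + 3 < N; i += 4) {
        for (j = 1; j + 1 < N; j += 2) {
            for (k = 1; k < N; k++) {
                body(a, i,     j, k); body(a, i,     j + 1, k);
                body(a, i + 1, j, k); body(a, i + 1, j + 1, k);
                body(a, i + 2, j, k); body(a, i + 2, j + 1, k);
                body(a, i + 3, j, k); body(a, i + 3, j + 1, k);
            }
        }
        /* Remainder in j: still four fused copies for the i direction. */
        for (; j < N; j++) {
            for (k = 1; k < N; k++) {
                body(a, i,     j, k);
                body(a, i + 1, j, k);
                body(a, i + 2, j, k);
                body(a, i + 3, j, k);
            }
        }
    }
    /* Remainder in i: fall back to the original form. */
    for (; i < N; i++)
        for (j = 1; j < N; j++)
            for (k = 1; k < N; k++)
                body(a, i, j, k);
}

/* Returns 1 when both versions produce identical arrays, 0 otherwise. */
static int nested_results_match(void)
{
    init_inputs();
    memset(a_ref, 0, sizeof a_ref);
    memset(a_uf, 0, sizeof a_uf);
    original_nest(a_ref);
    unrolled_nest(a_uf);
    return memcmp(a_ref, a_uf, sizeof a_ref) == 0;
}
```

With N set to 12, each loop runs 11 times, so both remainder paths are exercised: three leftover iterations in i and one in j. The fused form does the same work per element of a, but it reuses values such as d[j][k] and e[i][k] across neighboring iterations, which is the kind of reuse this transformation is meant to expose.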