OpenACC Directives for Data Movement

Array Shaping

In the examples so far, the entire array has been used in the OpenACC data directive and clauses. The OpenACC standard allows you to use just portions of arrays in the data clauses, referred to as “array shaping,” in which you tell the compiler the size of the arrays so it can generate the best possible code (Table 9).

Table 9: Array Shaping

Fortran:
  !$acc data copyin(a(1:size)), &
  !$acc      copyout(b(s/4:3*s/4))

C (array sections are written [start:length]):
  #pragma acc data copyin(a[0:size]), \
      copyout(b[s/4:s/2])

You can use reasonably complex expressions to determine the portion and size of the array to use. In this way, you give the compiler more information to create better accelerator code.
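As a minimal sketch of array shaping in C, the hypothetical helper below moves only the middle half of two arrays. The function name, the variables start and count, and the choice of the middle half are assumptions made for illustration. Note that a C array section is written [start:length], whereas a Fortran section is written (lower:upper).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes b[i] = 2*a[i] for the middle half of the
   arrays only. Elements outside that range are never transferred or
   touched. Without an OpenACC compiler the pragmas are ignored and the
   loop simply runs on the host. */
void double_middle(const float *a, float *b, int s)
{
    assert(a != NULL && b != NULL && s >= 4);

    int start = s / 4;   /* first index of the section     */
    int count = s / 2;   /* number of elements (a length!) */

    /* Shape both arrays: only count elements beginning at start move. */
    #pragma acc data copyin(a[start:count]) copyout(b[start:count])
    {
        #pragma acc parallel loop
        for (int i = start; i < start + count; ++i) {
            b[i] = 2.0f * a[i];
        }
    }
}
```

Calling double_middle(a, b, 8) would transfer and update indices 2 through 5 only; the remaining elements of b keep whatever values they held on the host.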

Array shaping can also be used for arrays starting with different indices (e.g., 0 for C or 1 for Fortran). For entire arrays, you can decide whether to define the bounds or not. Purely for code documentation, for those who might touch the code in the future, it is probably a good idea to include the array bounds.

Derived Data Types

Derived data types that the user creates can also be used with OpenACC data directives. These derived types, whether in C, C++, or Fortran, can contain multiple “levels” of pointers to other data (e.g., pointers to pointers) – often referred to as nested data structures. OpenACC, by default, copies only the highest level of the data structure to the device. The rest of the data structure remains on the host.

Compilers can have a difficult time following pointers down a derived type (a struct in C/C++), especially when copying data from the host to the device and back again, because a bare pointer carries no information about how much data it points to. As a result, any allocated data below the first level of the derived type is not copied to the accelerator device and stays on the host.

This concept can be illustrated with a simple example from an OpenACC blog post. To begin, assume you have the simple data structure or derived type shown in Table 10.

Table 10: A Simple Derived Type

Fortran:
  type mytype
    integer :: x(2)
  end type mytype
  type (mytype) A(2)
  !$acc data copy(A(:2))

C:
  struct {
    int x[2];   // size 2
  } *A;         // size 2
  #pragma acc data copy(A[0:2])

When array A is copied back and forth, the entire array is copied because the compiler knows exactly how it is laid out from the fixed array sizes.

If you modify the code as shown in Table 11, the data type now contains a member, x, that is a pointer to another array (down one level). When array A is copied to the device (the accelerator), only the “upper” part of the array is copied. The parts pointed to by x are not copied, but stay on the host.

Table 11: A Nested Dynamic Data Structure

Fortran:
  type mytype
    integer, allocatable :: x(:)
  end type mytype
  type (mytype) A(2)
  !$acc data copy(A(:2))

C:
  struct {
    int *x;
  } *A;         // size 2
  #pragma acc data copy(A[0:2])
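To make the shallow-copy behavior concrete, the following host-only sketch mimics what a top-level copy of the struct array transfers. It is an analogy built with memcpy, not OpenACC code, and the names mytype, top_level_bytes, and shallow_copy_aliases are hypothetical. The duplicated structs carry only the pointer values, so they still refer to the original data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *x;   /* nested data lives elsewhere on the host */
} mytype;

/* Bytes covered by a top-level copy of count structs: just the structs
   themselves (here, one pointer each), not the data behind x. */
size_t top_level_bytes(size_t count)
{
    return count * sizeof(mytype);
}

/* Host-only analogy of a shallow copy. Returns 1 if every x pointer in
   the duplicate still refers to the original nested data. */
int shallow_copy_aliases(const mytype *A, size_t count)
{
    mytype *dup = malloc(top_level_bytes(count));
    assert(dup != NULL);
    memcpy(dup, A, top_level_bytes(count));

    int aliases = 1;
    for (size_t i = 0; i < count; ++i) {
        aliases = aliases && (dup[i].x == A[i].x);
    }
    free(dup);
    return aliases;
}
```

On a real accelerator with separate memory, those copied pointer values are host addresses, which is why the nested data is unreachable from device code until it is transferred separately.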

Before OpenACC 2.6, you had to copy all of the nested data to the device and change all of the references to use the pointers on the device (something of a pain). With OpenACC 2.6, you can now copy the lower levels of the array and they will be connected (the specification calls it “attached”) in the data structure on the device (Table 12).

Table 12: Attached Nested Data

Fortran:
  type mytype
    integer, allocatable :: x(:)
  end type mytype
  type (mytype) A(2)
  !$acc data copy(A(:2))
  !$acc data copy(A(1)%x(1:2), A(2)%x(1:2))

C:
  struct {
    int *x;
  } *A;         // size 2
  #pragma acc data copy(A[0:2])
  #pragma acc data copy(A[0].x[0:2], A[1].x[0:2])
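A fuller sketch of the attach pattern, assuming an OpenACC 2.6 (or later) compiler, might look like the following C function. The names mytype and fill_nested and the parameters count and n are hypothetical. The top level is copied first, then each member array is copied so that the device-side pointer can be attached to it.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int *x;   /* dynamically allocated, n elements (assumed layout) */
} mytype;

/* Sets A[i].x[j] = 100*i + j for all i, j. Runs on the accelerator when
   built with OpenACC; otherwise the pragmas are ignored and the loops
   run on the host. */
void fill_nested(mytype *A, int count, int n)
{
    assert(A != NULL && count > 0 && n > 0);

    /* Top level first: the structs, which hold only pointer values. */
    #pragma acc enter data copyin(A[0:count])
    for (int i = 0; i < count; ++i) {
        /* Then each member array; the device copy of A[i].x is attached. */
        #pragma acc enter data copyin(A[i].x[0:n])
    }

    #pragma acc parallel loop collapse(2) present(A[0:count])
    for (int i = 0; i < count; ++i) {
        for (int j = 0; j < n; ++j) {
            A[i].x[j] = 100 * i + j;
        }
    }

    /* Tear down in reverse order: member arrays first, then the top level. */
    for (int i = 0; i < count; ++i) {
        #pragma acc exit data copyout(A[i].x[0:n])
    }
    #pragma acc exit data delete(A[0:count])
}
```

Unstructured enter data and exit data directives are used here instead of nested data regions so that the member arrays can be handled in a loop. The top level is deleted rather than copied out so that host pointers are not overwritten.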

OpenACC generally refers to this operation as a “manual deep copy”: you have to copy every level of the derived type to the device yourself. Be sure to check which version of OpenACC your compiler supports, because a manual deep copy with attached pointers requires OpenACC 2.6 or later. With earlier versions, you still have to set the pointers on the device manually.
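For comparison, setting the device pointers by hand before OpenACC 2.6 could be sketched roughly as follows. This assumes the OpenACC runtime routines acc_copyin and acc_memcpy_to_device from openacc.h; the helper name stage_nested and its byte-count return value are hypothetical and exist only to make the sketch easy to check. The runtime calls are guarded so that the code still compiles without OpenACC.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

typedef struct {
    int *x;   /* points to n ints on the host (assumed layout) */
} mytype;

/* Stages A and every A[i].x on the device, then overwrites each
   device-side x pointer with the device address of its data.
   Returns the number of payload bytes staged. */
size_t stage_nested(mytype *A, size_t count, size_t n)
{
    assert(A != NULL && count > 0 && n > 0);
    size_t bytes = count * sizeof(mytype);

#ifdef _OPENACC
    /* Copy the structs; their x members still hold host addresses. */
    mytype *dA = (mytype *)acc_copyin(A, count * sizeof(mytype));
#endif
    for (size_t i = 0; i < count; ++i) {
#ifdef _OPENACC
        /* Copy the nested array and remember where it lives on the device. */
        int *dx = (int *)acc_copyin(A[i].x, n * sizeof(int));
        /* Patch the device copy of A[i].x with that device address. */
        acc_memcpy_to_device(&dA[i].x, &dx, sizeof(int *));
#endif
        bytes += n * sizeof(int);
    }
    return bytes;
}
```

A matching teardown that copies the nested arrays back and releases the device memory is omitted here. That extra bookkeeping is what the attach behavior in OpenACC 2.6 is meant to remove.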