pragma SDS data copy
Description
The pragma SDS data copy | zero_copy must be specified immediately preceding a function declaration, or immediately preceding other #pragma SDS pragmas bound to the function declaration.
The COPY pragma and the ZERO_COPY pragma are mutually exclusive and should not be specified together on the same object. The COPY pragma implies that data is explicitly copied between the host processor memory and the hardware function. A suitable data mover performs the data transfer. See Improving System Performance for more information.
The ZERO_COPY pragma means that the hardware function accesses the data directly from shared memory through an AXI master bus interface.
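As a minimal sketch of how the two transfer modes look in source, the listing below marks one array argument for an explicit copy and another for direct shared-memory access. The function `add_one`, the element count `N`, and the check function are hypothetical and not part of the original text. Applying different modes to different arguments of one function is an assumption drawn from the "same object" wording above. A plain host C compiler ignores the unknown pragma, so the body runs as ordinary software.

```c
#include <assert.h>
#include <stddef.h>

#define N 8  /* hypothetical element count, chosen for illustration */

/* COPY on `in`: data is moved to the hardware function by a data mover.
 * ZERO_COPY on `out`: the hardware function accesses shared memory directly.
 * The two modes are never applied to the same argument. */
#pragma SDS data copy(in[0:N])
#pragma SDS data zero_copy(out[0:N])
void add_one(int *in, int *out)
{
    for (size_t i = 0; i < N; ++i)
        out[i] = in[i] + 1;
}

/* Host-side check of the software behaviour only; returns 0 on success.
 * Stack arrays are used here purely for the host check. On a real target,
 * buffers would typically come from an allocator such as sds_alloc. */
int add_one_demo(void)
{
    int in[N], out[N];
    for (int i = 0; i < N; ++i)
        in[i] = i;
    add_one(in, out);
    for (int i = 0; i < N; ++i)
        assert(out[i] == i + 1);
    return 0;
}
```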
By default, the compiler applies the COPY pragma to an array argument, meaning the data is explicitly copied from the processor to the accelerator via a data mover.
Syntax
#pragma SDS data copy|zero_copy(ArrayName[<offset>:<length>])
Where:
ArrayName[<offset>:<length>]
: Specifies the function parameter or argument to assign the pragma to, and the array dimension and data transfer size.
ArrayName
: Must be one of the formal parameters of the function definition, not from the prototype (where parameter names are optional) but from the function definition.
<offset>
: Optionally specifies the number of elements from the first element in the array. It must be specified as a compile-time constant.
Important: The <offset> value is currently ignored, and should be specified as 0.
<length>
: Specifies the number of elements transferred from the array for the specified dimension. It can be an arbitrary expression as long as the expression can be resolved at runtime inside the function.
Tip: As shown in the examples below, <length> can be a C arithmetic expression involving other scalar arguments of the same function.
- For a multi-dimensional array, each dimension should be separately specified.
For example, for a 2-dimensional array, use:
#pragma SDS data copy(ArrayName[offset_dim1:length1][offset_dim2:length2])
- Multiple arrays can be specified in the same pragma, using a comma separated
list. For example, use:
#pragma SDS data copy(ArrayName1[offset1:length1], ArrayName2[offset2:length2])
- The [<offset>:<length>] argument is optional, and is only needed if the data transfer size for an array cannot be determined at compile time. When this is not specified, the COPY or ZERO_COPY pragma is only used to select between copying the memory to/from the accelerator through a data mover versus directly accessing the processor memory by the accelerator. To determine the array size, the SDSoC compiler analyzes the callers to the accelerator function to determine the transfer size based on the memory allocation APIs for the array, for example malloc or sds_alloc. If the analysis fails, it checks the argument type to see if the argument type has a compile-time array size and uses that size as the data transfer size. If the data transfer size cannot be determined, the compiler generates an error message so that you can specify the data size with [<offset_dim>:<length>]. If the data size is different between the caller and callee, or different between multiple callers, the compiler also generates an error message so that you can correct the source code or use this pragma to override the compiler analysis.
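The forms in the list above can be sketched in one small host-compilable listing. The function names (`transpose`, `negate`, `pragma_forms_demo`), the dimensions, and the element count are hypothetical. A plain C compiler ignores the SDS pragmas, so the sketch only illustrates where each pragma form goes, not the hardware behaviour. The first function gives one range per dimension for two arrays in a single comma-separated pragma. The second omits the range and leaves the transfer size to the caller analysis described above.

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS  2   /* hypothetical dimensions */
#define COLS  3
#define COUNT 16  /* hypothetical element count for the pointer case */

/* Two 2-dimensional arrays in one pragma, one [offset:length] per dimension. */
#pragma SDS data copy(in[0:ROWS][0:COLS], out[0:COLS][0:ROWS])
void transpose(int in[ROWS][COLS], int out[COLS][ROWS])
{
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            out[c][r] = in[r][c];
}

/* No [offset:length]: the transfer size is left to analysis of the caller. */
#pragma SDS data copy(src, dst)
void negate(int *src, int *dst, int n)
{
    for (int i = 0; i < n; ++i)
        dst[i] = -src[i];
}

/* Caller: the malloc sizes below are what such an analysis would inspect. */
int pragma_forms_demo(void)
{
    int in[ROWS][COLS] = { {1, 2, 3}, {4, 5, 6} };
    int out[COLS][ROWS];
    transpose(in, out);
    assert(out[2][0] == 3 && out[0][1] == 4);

    int *src = malloc(COUNT * sizeof *src);
    int *dst = malloc(COUNT * sizeof *dst);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1;
    }
    for (int i = 0; i < COUNT; ++i)
        src[i] = i;
    negate(src, dst, COUNT);
    int last = dst[COUNT - 1];
    free(src);
    free(dst);
    return last;
}
```

If the caller allocated a different number of elements for `src` and `dst` than the callee expects, the text above indicates that an explicit [<offset>:<length>] would be needed to override the analysis.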
Example 1
The following example applies the COPY pragma to both the "A" and "B" arguments of the accelerator function foo right before the function declaration. Notice the <length> option is specified as an expression, size*size:
#pragma SDS data copy(A[0:size*size], B[0:size*size])
void foo(int *A, int *B, int size) {
...
}
The SDSoC compiler replaces the body of the function foo with accelerator control, data transfer, and data synchronization code. The following code snippet shows the data transfer part:
void _p0_foo_0(int *A, int *B, int size)
{
...
cf_send_i(&(_p0_swinst_foo_0.A), A, (size*size) * 4, &_p0_request_0);
cf_receive_i(&(_p0_swinst_foo_0.B), B, (size*size) * 4, &_p0_request_1);
...
}
The expression size*size is used to tell the SDSoC runtime the number of elements of arrays "A" and "B". The cf_send_i and cf_receive_i functions require the number of bytes to transfer, so the compiler multiplies the number of elements specified by <length> by the number of bytes for each element (4 in this case).
Example 2
The following example uses the ZERO_COPY pragma instead of the COPY pragma above:
#pragma SDS data zero_copy(A[0:size*size], B[0:size*size])
void foo(int *A, int *B, int size)
The body of function foo becomes:
void _p0_foo_0(int *A, int *B, int size)
{
...
cf_send_ref_i(&(_p0_swinst_foo_0.A), A, (size*size) * 4, &_p0_request_0);
cf_receive_ref_i(&(_p0_swinst_foo_0.B), B, (size*size) * 4, &_p0_request_1);
...
}
The cf_send_ref_i and cf_receive_ref_i functions only transfer the reference or pointer of the array to the accelerator, and the accelerator accesses the processor memory directly.
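The size argument in both generated stubs is a byte count: the element count from the pragma multiplied by the element size. A minimal host-side sketch of that arithmetic follows. `transfer_bytes` and `foo_transfer_bytes` are hypothetical helpers, not part of the SDSoC runtime, and the literal 4 matches `sizeof(int)` only on targets where `int` is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes moved for `length` elements of `elem_size` bytes. */
static size_t transfer_bytes(size_t length, size_t elem_size)
{
    return length * elem_size;
}

/* Byte count for A[0:size*size] with 4-byte elements, as in the stubs. */
size_t foo_transfer_bytes(int size)
{
    size_t length = (size_t)size * (size_t)size;  /* <length> from the pragma */
    return transfer_bytes(length, 4);
}

/* Self-check: a 16 x 16 matrix has 256 elements, hence 1024 bytes. */
void transfer_bytes_check(void)
{
    assert(foo_transfer_bytes(16) == 1024);
}
```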
Example 3
#pragma SDS data zero_copy(in1[0:mat_dim*mat_dim], in2[0:mat_dim*mat_dim], out[0:mat_dim*mat_dim])
void matmul_partition_accel(int *in1, // Read-Only Matrix 1
int *in2, // Read-Only Matrix 2
int *out, // Output Result
int mat_dim) // Matrix Dim (assumed only square matrix)
{
...
}
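Example 4
In the following example, the pragma in the header file names the argument in_A, while the function definition names its parameters A and B.
"foo.h"
#pragma SDS data copy(in_A[0:1024])
void foo(int *in_A, int *out_B);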
"foo.cpp"
#include "foo.h"
void foo(int *A, int *B)
{
...
}
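Because ArrayName must be a formal parameter of the function definition, the compiler issues the following warning:
WARNING: [SDSoC 0-0] Cannot find argument in_A in accelerator function foo(int *A, int *B)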
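One way to avoid this warning, sketched below, is to use the parameter name from the function definition in the pragma. The file split is shown as comments so that the sketch stays in a single listing. The loop body, the assumption that B also holds 1024 elements, and the `foo_demo` check are hypothetical, since the original function body is elided.

```c
#include <assert.h>

/* foo.h: the pragma names A, matching the definition's formal parameter. */
#pragma SDS data copy(A[0:1024])
void foo(int *A, int *B);

/* foo.cpp: definition with the same parameter names as the pragma. */
void foo(int *A, int *B)
{
    /* Placeholder body (assumption): copy the 1024 input elements to B. */
    for (int i = 0; i < 1024; ++i)
        B[i] = A[i];
}

/* Host-side check of the placeholder body only. */
int foo_demo(void)
{
    static int a[1024], b[1024];
    for (int i = 0; i < 1024; ++i)
        a[i] = i;
    foo(a, b);
    assert(b[0] == 0 && b[1023] == 1023);
    return b[1023];
}
```

Keeping the prototype's parameter names identical to the definition's avoids the mismatch altogether.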