SDS Pragmas
Optimizations in SDSoC
This section describes pragmas for the SDSoC™ system compilers, sdscc and sds++, to assist system optimization.
The SDSoC environment system compilers target a base platform and invoke the Vivado® High-Level Synthesis (HLS) tool to compile synthesizable C/C++ functions into programmable logic. Using the SDSoC IDE, or sdscc/sds++ command line options, you select functions from your source program to run in hardware, specify accelerator and system clocks, and set properties on data transfers.
In the SDSoC environment, you control the system generation process by structuring hardware functions and calls to hardware functions to balance communication and computation, and by inserting pragmas into your source code to guide the system compiler. The SDSoC compiler automatically chooses the best possible system port to use for any data transfer, but allows you to override this selection by using pragmas. You can also specify pragmas to select different data movers for your hardware function arguments, and use pragmas to control the number of data elements that are transferred to/from the hardware function.
All pragmas specific to the SDSoC environment are prefixed with #pragma SDS and should be inserted into C/C++ source code, either immediately prior to a function declaration or at a function call site for optimization of a specific function call.
#pragma SDS data access_pattern(in_a:SEQUENTIAL, out_b:SEQUENTIAL)
void f1(int in_a[20], int out_b[20]);
The SDS pragmas include the types specified below:
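Type | Pragmas |
---|---|
Data Access Patterns | pragma SDS data access_pattern |
Data Transfer Size | pragma SDS data copy, pragma SDS data zero_copy |
Memory Attributes | pragma SDS data mem_attribute |
Data Mover Type | pragma SDS data data_mover |
SDSoC Platform Interfaces to External Memory | pragma SDS data sys_port |
Hardware Buffer Depth | pragma SDS data buffer_depth |
Asynchronous Function Execution | pragma SDS async, pragma SDS wait |
Specifying Resource Binding | pragma SDS resource |
Hardware/Software Tracing | pragma SDS trace |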
pragma SDS async
Description
The ASYNC pragma must be paired with the WAIT pragma to support manual control of the hardware function synchronization.
The ASYNC pragma is specified immediately preceding a call to a
hardware function, directing the compiler not to automatically generate the wait
based on data flow analysis. The WAIT pragma must be inserted at an appropriate
point in the program to direct the CPU to wait until the associated ASYNC
function call with the same ID has completed.
In the presence of an ASYNC pragma, the SDSoC system compiler does not generate an sds_wait() in the stub function for the associated call. The program must contain the matching sds_wait(ID) or #pragma SDS wait(ID) at an appropriate point to synchronize the controlling thread running on the CPU with the hardware function thread. An advantage of using #pragma SDS wait(ID) over the sds_wait(ID) function call is that the source code can then be compiled by compilers other than the SDSoC compiler, such as gcc, which do not interpret the ASYNC or WAIT pragmas.
Syntax
Place the pragma in the C source immediately before the function call:
#pragma SDS async(<ID>)
...
#pragma SDS wait(<ID>)
- <ID>: A user-defined ID for the ASYNC/WAIT pair, specified as a compile-time unsigned integer constant.
Example 1
The following code snippet shows an example of using these pragmas with different IDs:
{
#pragma SDS async(1)
mmult(A, B, C);
#pragma SDS async(2)
mmult(D, E, F);
...
#pragma SDS wait(1)
#pragma SDS wait(2)
}
The program running on the CPU first transfers A and B to the mmult hardware and returns immediately. Then the program transfers D and E to the mmult hardware and returns immediately. When the program later reaches #pragma SDS wait(1), it waits for the output C to be ready. When it reaches #pragma SDS wait(2), it waits for the output F to be ready.
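To illustrate the portability point made in the description, the sketch below wraps the Example 1 call sequence in a self-contained C file. The 2x2 mmult body, the N macro, and the run_two_mults wrapper are illustrative assumptions, not part of the SDSoC examples. Built with gcc, the SDS pragmas are ignored (gcc may warn about unknown pragmas) and the two calls run one after the other; built with the SDSoC compiler, the same source would request asynchronous execution.

```c
#define N 2  /* hypothetical matrix dimension, kept tiny for illustration */

/* Plain software matrix multiply standing in for the hardware function. */
void mmult(const int A[N * N], const int B[N * N], int C[N * N])
{
    for (int r = 0; r < N; r++) {
        for (int c = 0; c < N; c++) {
            int sum = 0;
            for (int k = 0; k < N; k++)
                sum += A[r * N + k] * B[k * N + c];
            C[r * N + c] = sum;
        }
    }
}

/* Issues two calls under different IDs, then waits for each result. */
void run_two_mults(const int A[N * N], const int B[N * N], int C[N * N],
                   const int D[N * N], const int E[N * N], int F[N * N])
{
#pragma SDS async(1)
    mmult(A, B, C);
#pragma SDS async(2)
    mmult(D, E, F);
    /* Independent CPU work could go here; C and F must not be read yet. */
#pragma SDS wait(1)
#pragma SDS wait(2)
}
```

Only after both WAIT pragmas is it safe for the caller to read C and F.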
Example 2
The following code snippet shows an example of using these pragmas with the same ID to pipeline the data transfer and accelerator execution:
for (int i = 0; i < pipeline_depth; i++) {
#pragma SDS async(1)
mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
}
for (int i = pipeline_depth; i < NUM_TESTS; i++) {
#pragma SDS wait(1)
#pragma SDS async(1)
mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
}
for (int i = 0; i < pipeline_depth; i++) {
#pragma SDS wait(1)
}
In the above example, the first loop ramps up the pipeline with a depth of pipeline_depth, the second loop executes the pipeline, and the third loop ramps down the pipeline. The hardware buffer depth (pragma SDS data buffer_depth) should be set to the same value as pipeline_depth. The goal of this pipeline is to transfer data to the accelerator for the next execution while the current execution is still running. Refer to "Increasing System Parallelism and Concurrency" in the SDSoC Profiling and Optimization Guide for more information.
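The three-loop schedule can be checked on paper by counting outstanding calls. The following sketch is a plain C simulation, not SDSoC code: the function name max_calls_in_flight is hypothetical, and it assumes, as the loops in Example 2 implicitly do, that the number of tests is at least the pipeline depth. It shows that every ASYNC is matched by one WAIT and that the number of calls in flight never exceeds pipeline_depth.

```c
/* Hypothetical helper: replays the async/wait schedule of Example 2 and
 * returns the largest number of calls in flight, or -1 if the inputs are
 * invalid or the schedule does not balance. */
int max_calls_in_flight(int num_tests, int pipeline_depth)
{
    if (pipeline_depth < 1 || num_tests < pipeline_depth)
        return -1;

    int in_flight = 0;
    int peak = 0;

    /* Ramp up: issue pipeline_depth calls without waiting. */
    for (int i = 0; i < pipeline_depth; i++) {
        in_flight++;                 /* #pragma SDS async(1) */
        if (in_flight > peak)
            peak = in_flight;
    }

    /* Steady state: retire one outstanding call, then issue the next. */
    for (int i = pipeline_depth; i < num_tests; i++) {
        in_flight--;                 /* #pragma SDS wait(1)  */
        in_flight++;                 /* #pragma SDS async(1) */
    }

    /* Ramp down: wait for the calls that are still outstanding. */
    for (int i = 0; i < pipeline_depth; i++)
        in_flight--;                 /* #pragma SDS wait(1)  */

    return in_flight == 0 ? peak : -1;
}
```

The peak equals pipeline_depth, which is consistent with the advice to set the hardware buffer depth to the same value.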
pragma SDS data access_pattern
Description
This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration.
This pragma specifies the data access pattern in the hardware function. The SDSoC compiler checks the value of this pragma to determine the hardware interface to synthesize. If the access pattern is SEQUENTIAL, a streaming interface (such as ap_fifo) is generated. Otherwise, with the RANDOM access pattern, a RAM interface is generated. Refer to "Data Motion Network Generation in SDSoC" in the SDSoC Environment Profiling and Optimization Guide (UG1235) for more information on the use of this pragma in data motion network generation.
Syntax
The syntax for this pragma is:
#pragma SDS data access_pattern(ArrayName:<pattern>)
Where:
- ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
- <pattern>: Can be either SEQUENTIAL or RANDOM. The default is RANDOM.
Example 1
The following code snippet shows an example of using this pragma for the array argument (A):
#pragma SDS data access_pattern(A:SEQUENTIAL)
void foo(int A[1024], int B[1024]);
In the example shown above, a streaming interface is generated for argument A, while a RAM interface is generated for argument B. The access pattern for argument A must be A[0], A[1], A[2], ... , A[1023], and each element must be accessed exactly once. On the other hand, argument B can be accessed in a random fashion, and each element can be accessed zero or more times.
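As a minimal sketch of a function body that respects these rules, the code below reads A strictly in index order, once per element, and writes B in a different order. The body and the LEN macro are hypothetical and chosen only to make the access order visible; they do not come from the SDSoC examples.

```c
#define LEN 1024  /* assumed array length, matching the declaration above */

#pragma SDS data access_pattern(A:SEQUENTIAL)
void foo(int A[LEN], int B[LEN]);

/* Hypothetical body: A is consumed like a stream, B like a RAM. */
void foo(int A[LEN], int B[LEN])
{
    for (int i = 0; i < LEN; i++) {
        int value = A[i];          /* A[0], A[1], ..., A[1023]: one read each */
        B[LEN - 1 - i] = value;    /* reversed write order is allowed for B */
    }
}
```

Reading A[i] twice in the loop, or skipping an index, would break the sequential contract even though the code would still compile.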
Example 2
The following code snippet shows an example of using this pragma for a pointer argument:
#pragma SDS data access_pattern(A:SEQUENTIAL)
#pragma SDS data copy(A[0:1024])
void foo(int *A, int B[1024]);
In the above example, if argument A is intended to be a streaming port, the two pragmas shown must be applied. Without these, the SDSoC tool synthesizes argument A as a register (IN, OUT, or INOUT based on the usage of A in function foo).
Example 3
The following code snippet shows the combination of the ZERO_COPY pragma and the ACCESS_PATTERN pragma:
#pragma SDS data zero_copy(A)
#pragma SDS data access_pattern(A:SEQUENTIAL)
void foo(int A[1024], int B[1024]);
In the above example, the ACCESS_PATTERN pragma is ignored. After the ZERO_COPY pragma is applied to an argument, an AXI Master interface will be synthesized for that argument. Refer to "Zero Copy Data Mover" in the SDSoC Environment Profiling and Optimization Guide (UG1235) for more information.
pragma SDS data buffer_depth
Description
This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration, and applies to all the callers of the function.
This pragma only applies to arrays that map to block RAM or FIFO interfaces. For a block RAM-mapped array, the <BufferDepth> value specifies hardware multi-buffer depth. For a FIFO-mapped array, the <BufferDepth> value specifies the depth of the hardware FIFO allocated for the array. For this pragma, the following must be true:
- BRAM: 1 ≤ <BufferDepth> ≤ 4, and 2 ≤ ArraySize ≤ 16384.
- FIFO: <BufferDepth> = 2^n, where 4 ≤ n ≤ 20.
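The two constraints can be expressed as simple range checks. The sketch below reads the FIFO rule as "a power of two, 2^n, with n between 4 and 20"; the function names are hypothetical and this is not an SDSoC API, only a way to sanity-check a candidate value before putting it in the pragma.

```c
#include <stdbool.h>

/* Hypothetical checks mirroring the stated limits; not part of SDSoC. */

/* BRAM-mapped array: 1 <= depth <= 4 and 2 <= array_size <= 16384. */
bool bram_buffer_depth_ok(unsigned depth, unsigned array_size)
{
    return depth >= 1 && depth <= 4 && array_size >= 2 && array_size <= 16384;
}

/* FIFO-mapped array: depth must equal 2^n with 4 <= n <= 20. */
bool fifo_buffer_depth_ok(unsigned long depth)
{
    bool power_of_two = depth != 0 && (depth & (depth - 1)) == 0;
    return power_of_two && depth >= (1UL << 4) && depth <= (1UL << 20);
}
```

Under this reading, the valid FIFO depths run from 16 to 1048576.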
Syntax
The syntax of this pragma is:
#pragma SDS data buffer_depth(ArrayName:<BufferDepth>)
Where:
- ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
- <BufferDepth>: Must be a compile-time constant value.
- Multiple arrays can be specified as a comma-separated list in one pragma. For example:
#pragma SDS data buffer_depth(ArrayName1:BufferDepth1, ArrayName2:BufferDepth2)
Example 1
This example specifies a multi-buffer of size 4 used for the RAM interface of argument a:
#pragma SDS data buffer_depth(a:4)
void foo(int a[1024], int b[1024]);
pragma SDS data copy
Description
The pragma SDS data copy | zero_copy must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration.
The COPY and ZERO_COPY pragmas are mutually exclusive and should not be specified together on the same object.
The COPY pragma implies that data is explicitly copied between the host processor memory and the hardware function. A suitable data mover performs the data transfer. See "Improving System Performance" in the SDSoC Profiling and Optimization Guide for more information.
The ZERO_COPY pragma means that the hardware function accesses the data directly from shared memory through an AXI master bus interface.
Syntax
The syntax for this pragma is:
#pragma SDS data copy|zero_copy(ArrayName[<offset>:<length>])
Where:
- ArrayName[<offset>:<length>]: Specifies the function parameter or argument to assign the pragma to, and the array dimension and data transfer size.
- ArrayName: Must be one of the formal parameters of the function definition. Use the name from the definition rather than from the prototype, where parameter names are optional.
- <offset>: Optionally specifies the number of elements from the first element in the array. It must be specified as a compile-time constant.
IMPORTANT: The <offset> value is currently ignored, and should be specified as 0.
- <length>: Specifies the number of elements transferred from the array for the specified dimension. It can be an arbitrary expression as long as the expression can be resolved at runtime inside the function.
TIP: As shown in the examples below, <length> can be a C arithmetic expression involving other scalar arguments of the same function.
- For a multi-dimensional array, each dimension should be separately specified. For example, for a two-dimensional array, use:
#pragma SDS data copy(ArrayName[offset_dim1:length1][offset_dim2:length2])
- Multiple arrays can be specified in the same pragma, using a comma-separated list. For example, use:
#pragma SDS data copy(ArrayName1[offset1:length1], ArrayName2[offset2:length2])
- The [<offset>:<length>] argument is optional, and is only needed if the data transfer size for an array cannot be determined at compile time. When it is not specified, the COPY or ZERO_COPY pragma is only used to select between copying the memory to/from the accelerator through a data mover and directly accessing the processor memory from the accelerator. To determine the array size, the SDSoC compiler analyzes the callers of the accelerator function to determine the transfer size based on the memory allocation APIs for the array, for example, malloc or sds_alloc. If the analysis fails, it checks the argument type to see if the argument type has a compile-time array size and uses that size as the data transfer size. If the data transfer size cannot be determined, the compiler generates an error message so that you can specify the data size with [<offset>:<length>]. If the data size is different between the caller and the callee, or different between multiple callers, the compiler also generates an error message so that you can correct the source code or use this pragma to override the compiler analysis.
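The case where the transfer size is only known at run time can be sketched as follows. The function scale, its body, and the caller run_scale are hypothetical names introduced for illustration; the only SDSoC-specific element is the COPY pragma, whose length expression refers to the scalar argument n of the same function. Built with gcc, the pragma is ignored and the code runs in software.

```c
#include <stdlib.h>

/* Hypothetical accelerator candidate: the length expression refers to the
 * scalar argument n of the same function. */
#pragma SDS data copy(in[0:n], out[0:n])
void scale(int *in, int *out, int n);

/* Hypothetical function body: doubles every element. */
void scale(int *in, int *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = 2 * in[i];
}

/* Hypothetical caller: n is only known at run time.
 * Returns out[n - 1] for checking, or -1 on bad input or allocation failure. */
int run_scale(int n)
{
    if (n < 1)
        return -1;

    int *in = malloc((size_t)n * sizeof *in);
    int *out = malloc((size_t)n * sizeof *out);
    int last = -1;

    if (in != NULL && out != NULL) {
        for (int i = 0; i < n; i++)
            in[i] = i;
        scale(in, out, n);
        last = out[n - 1];
    }
    free(in);
    free(out);
    return last;
}
```

Because in and out are bare pointers, the argument types carry no array size, so the pragma is what states the element count.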
Example 1
The following example applies the COPY pragma to both the "A" and "B" arguments of the accelerator function foo right before the function declaration. Notice the <length> option is specified as an expression, size*size:
#pragma SDS data copy(A[0:size*size], B[0:size*size])
void foo(int *A, int *B, int size);
The SDSoC system compiler will replace the body of the function foo with accelerator control, data transfer, and data synchronization code. The following code snippet shows the data transfer part:
void _p0_foo_0(int *A, int *B, int size)
{
...
cf_send_i(&(_p0_swinst_foo_0.A), A, (size*size) * 4, &_p0_request_0);
cf_receive_i(&(_p0_swinst_foo_0.B), B, (size*size) * 4, &_p0_request_1);
...
}
As shown above, the <length> value size*size tells the SDSoC runtime the number of elements of arrays "A" and "B." The cf_send_i and cf_receive_i functions require the number of bytes to transfer, so the compiler multiplies the number of elements specified by <length> by the number of bytes for each element (4 in this case).
Example 2
The following code snippet shows an example of applying the ZERO_COPY pragma, instead of the COPY pragma above:
#pragma SDS data zero_copy(A[0:size*size], B[0:size*size])
void foo(int *A, int *B, int size);
The body of function foo becomes:
void _p0_foo_0(int *A, int *B, int size)
{
...
cf_send_ref_i(&(_p0_swinst_foo_0.A), A, (size*size) * 4, &_p0_request_0);
cf_receive_ref_i(&(_p0_swinst_foo_0.B), B, (size*size) * 4, &_p0_request_1);
...
}
The cf_send_ref_i and cf_receive_ref_i functions only transfer the reference or pointer of the array to the accelerator, and the accelerator accesses the processor memory directly.
Example 3
The following example shows a ZERO_COPY pragma with multiple arrays specified to generate a direct memory interface between DDR and the hardware function:
#pragma SDS data zero_copy(in1[0:mat_dim*mat_dim], in2[0:mat_dim*mat_dim], out[0:mat_dim*mat_dim])
void matmul_partition_accel(int *in1, // Read-Only Matrix 1
int *in2, // Read-Only Matrix 2
int *out, // Output Result
int mat_dim); // Matrix Dim (assumed only square matrix)
Example 4
A DATA COPY pragma instructs the compiler to insert the transfer size expression into the corresponding send/receive call within the stub function body. As a result, it is essential that the argument names used in the function declaration match the argument names in the function definition. The following code snippet illustrates a common mistake: using an argument name in the function declaration that is different from the argument name used in the function definition:
"foo.h"
#pragma SDS data copy(in_A[0:1024])
void foo(int *in_A, int *out_B);
"foo.cpp"
#include "foo.h"
void foo(int *A, int *B)
{
...
}
Any C/C++ compiler will ignore the argument name in the function declaration, because the C/C++ standard makes the argument name in the function declaration optional. Only the argument name in the function definition is used by the compiler. However, the SDSoC compiler will issue a warning when trying to apply the pragma:
WARNING: [SDSoC 0-0] Cannot find argument in_A in accelerator function foo(int *A, int *B)
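One way to avoid the warning is to use the same parameter names in the declaration and the definition. The sketch below shows both in a single listing for brevity. The function body is hypothetical, and the out_B entry in the pragma is an added assumption so that both pointers carry an explicit transfer size.

```c
/* foo.h (sketch): the pragma refers to in_A and out_B by name. */
#pragma SDS data copy(in_A[0:1024], out_B[0:1024])
void foo(int *in_A, int *out_B);

/* foo.cpp (sketch): the definition uses the same parameter names, so the
 * pragma can be matched to the arguments. */
void foo(int *in_A, int *out_B)
{
    for (int i = 0; i < 1024; i++)
        out_B[i] = in_A[i] + 1;   /* hypothetical body, for illustration only */
}
```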
pragma SDS data data_mover
Description
This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration. This pragma applies to all the callers of the bound function.
By default, the SDSoC compiler chooses the type of the data mover automatically by analyzing the code. The DATA_MOVER pragma can be used to override the compiler default. This pragma specifies the HW IP type, or DataMover, used to transfer an array argument.
The FASTDMA data mover supports a wider data width for higher-bandwidth data transfer. For Zynq® UltraScale+™ MPSoC, the data width is from 64 bits to 256 bits. For Zynq-7000, the data width is 64 bits.
The SDSoC™ compiler automatically assigns an instance of the data mover HW IP to use for transferring the corresponding array. The :id can be specified to assign a specific data mover instance for the associated formal parameter. If two or more formal parameters have the same DataMover and the same id, they share the same data mover HW IP instance.
Because FASTDMA supports contiguous memory only, allocate the corresponding array with sds_alloc().
Syntax
The syntax for this pragma is:
#pragma SDS data data_mover(ArrayName:DataMover[:id])
Where:
- ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
- DataMover: Must be one of the following:
  - AXIFIFO: used for non-contiguous memory, <300 bytes.
  - AXIDMA_SIMPLE: used for contiguous memory, <32MB.
  - AXIDMA_SG: can be used for either contiguous or non-contiguous memory, >300 bytes.
  - FASTDMA: contiguous memory only. The pragma is required when FASTDMA is desired.
- :id: Optional, but must be specified as a positive integer when it is used.
- Multiple arrays can be specified in one pragma, separated by a comma (,). For example:
#pragma SDS data data_mover(ArrayName1:DataMover[:id], ArrayName2:DataMover[:id])
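The size and contiguity limits listed for each data mover can be turned into a small lookup, which may help when deciding what to write in the pragma. The helper below is hypothetical and is not the compiler's selection algorithm. It also makes two assumptions that the list leaves open: a transfer of exactly 300 bytes is treated as too large for AXIFIFO, and contiguous transfers of 32 MB or more fall back to AXIDMA_SG.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the compiler's selection algorithm: maps a
 * transfer to one of the documented data movers using the listed limits.
 * FASTDMA is never returned because it must be requested explicitly. */
const char *suggest_data_mover(size_t bytes, bool physically_contiguous)
{
    const size_t fifo_limit = 300;                    /* bytes */
    const size_t simple_limit = 32u * 1024u * 1024u;  /* 32 MB */

    if (physically_contiguous && bytes < simple_limit)
        return "AXIDMA_SIMPLE";
    if (!physically_contiguous && bytes < fifo_limit)
        return "AXIFIFO";
    return "AXIDMA_SG";  /* remaining cases: large or non-contiguous */
}
```

For a 4096-byte array allocated with malloc, for instance, the helper returns "AXIDMA_SG", which could then be written as ArrayName:AXIDMA_SG in the pragma.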
Example 1
The following code snippet shows an example of specifying the data mover ID in the pragma:
#pragma SDS data data_mover(A:AXIDMA_SG:1, B:AXIDMA_SG:1)
void foo(int A[1024], int B[1024]);
In the example above, the same instance of the AXIDMA_SG IP is shared to transfer data for arguments A and B, because the same data mover ID has been specified.
Example 2
#pragma SDS data data_mover(A:FASTDMA,B:FASTDMA,C:FASTDMA,D:AXIDMA_SIMPLE,E:AXIDMA_SIMPLE)
void foo(float A[1024], float B[1024], float C[1024], int D[1024], int E[1024]);
The compiler will transfer arrays A, B, and C with individual FASTDMA data movers, and arrays D and E with individual AXIDMA_SIMPLE data movers.
pragma SDS data mem_attribute
Description
This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration. This pragma applies to all the callers of the function.
For an operating system like Linux that supports virtual memory, user-space allocated memory is paged, which can affect system performance. The SDSoC runtime also provides an API to allocate physically contiguous memory. The pragmas in this section can be used to tell the compiler whether the arguments have been allocated in physically contiguous memory.
Syntax
The syntax for this pragma is:
#pragma SDS data mem_attribute(ArrayName:contiguity)
Where:
- ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
- contiguity: Must be specified as either PHYSICAL_CONTIGUOUS or NON_PHYSICAL_CONTIGUOUS. The default value is NON_PHYSICAL_CONTIGUOUS:
  - PHYSICAL_CONTIGUOUS means that all memory corresponding to the associated ArrayName is allocated using sds_alloc.
  - NON_PHYSICAL_CONTIGUOUS means that all memory corresponding to the associated ArrayName is allocated using malloc or as a free variable on the stack. This helps the SDSoC compiler select the optimal data mover.
- Multiple arrays can be specified in one pragma, separated by a comma (,). For example:
#pragma SDS data mem_attribute(ArrayName:contiguity, ArrayName:contiguity)
Example 1
The following code snippet shows an example of specifying the contiguity attribute:
#pragma SDS data mem_attribute(A:PHYSICAL_CONTIGUOUS)
void foo(int A[1024], int B[1024]);
In the example above, the user tells the SDSoC compiler that array A is allocated in a physically contiguous memory block. The SDSoC compiler then chooses AXIDMA_SIMPLE instead of AXIDMA_SG, because the former is smaller and faster for transferring physically contiguous memory.
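On the caller side, the pragma is only truthful if A really comes from the contiguous allocator. The sketch below shows one possible arrangement. The BUF_ALLOC and BUF_FREE macros, the function body, and call_foo are hypothetical; the sketch also assumes that the SDSoC compilers define the __SDSCC__ macro and that sds_lib.h declares sds_alloc and sds_free. Outside SDSoC it falls back to malloc so that the file still builds with gcc.

```c
#include <stdlib.h>

#ifdef __SDSCC__                 /* assumed to be defined by sdscc/sds++ */
#include "sds_lib.h"             /* sds_alloc / sds_free from the SDSoC runtime */
#define BUF_ALLOC(n) sds_alloc(n)
#define BUF_FREE(p)  sds_free(p)
#else
/* Assumption: outside SDSoC, fall back to malloc so the file still builds. */
#define BUF_ALLOC(n) malloc(n)
#define BUF_FREE(p)  free(p)
#endif

#pragma SDS data mem_attribute(A:PHYSICAL_CONTIGUOUS)
void foo(int A[1024], int B[1024]);

/* Hypothetical body: copies A to B. */
void foo(int A[1024], int B[1024])
{
    for (int i = 0; i < 1024; i++)
        B[i] = A[i];
}

/* Hypothetical caller: A comes from the contiguous allocator, matching the
 * pragma; B is an ordinary array. Returns 0 on success, -1 on failure. */
int call_foo(void)
{
    static int B[1024];
    int *A = (int *)BUF_ALLOC(1024 * sizeof(int));
    if (A == NULL)
        return -1;
    for (int i = 0; i < 1024; i++)
        A[i] = i;
    foo(A, B);
    int ok = (B[1023] == 1023) ? 0 : -1;
    BUF_FREE(A);
    return ok;
}
```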
pragma SDS data sys_port
Description
This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration, and applies to all the callers of the function.
This pragma overrides the SDSoC compiler default choice of memory port. If the SYS_PORT pragma is not specified for an array argument, the interface to the external memory is automatically determined by the SDSoC system compilers (sdscc/sds++) based on array memory attributes (cacheable or non-cacheable), array size, data mover used, etc.
The Zynq®-7000 device provides a cache coherent interface (S_AXI_ACP) between programmable logic and external memory, and high-performance ports (S_AXI_HP) for non-cache coherent access. The Zynq® UltraScale+™ MPSoC provides a cache coherent interface (S_AXI_HPCn_FPD) and a non-cache coherent interface (S_AXI_HPn_FPD).
Syntax
#pragma SDS data sys_port(<param_name>:<port>)
Where:
- <param_name>: Specifies one of the formal parameters of the function to assign the pragma to.
- <port>: The SDSoC compiler recognizes predefined memory port types: ACP (Zynq-7000 devices only), HPC, HP, or MIG, which represent cache coherent access (ACP, HPC), high speed non-coherent access (HP), or memory accessible through a soft memory controller implemented in PL logic (MIG). You can also use a specific platform port name for the <port>, but this is not recommended unless the compiler does not select the correct port, which could occur for a stream port in the platform. To get a list of platform ports, in a terminal shell, run sds++ -sds-pf-info <platform>.
For example, the sds++ -sds-pf-info zcu102 command returns the following under System Ports:
System Ports
Use the system port name in a sysport pragma, for example
#pragma SDS data sys_port(parameter_name:system_port_name)
System Port Name (Vivado BD instance name, Vivado BD port name)
ps_e_S_AXI_HPC0_FPD (ps_e, S_AXI_HPC0_FPD)
ps_e_S_AXI_HPC1_FPD (ps_e, S_AXI_HPC1_FPD)
ps_e_S_AXI_HP0_FPD (ps_e, S_AXI_HP0_FPD)
ps_e_S_AXI_HP1_FPD (ps_e, S_AXI_HP1_FPD)
ps_e_S_AXI_HP2_FPD (ps_e, S_AXI_HP2_FPD)
ps_e_S_AXI_HP3_FPD (ps_e, S_AXI_HP3_FPD)
In this case, the SYS_PORT pragma could be defined as:
#pragma SDS data sys_port(Array1:ps_e_S_AXI_HPC0_FPD)
If <port> is defined using the HPC shortcut, then the argument, Array1, could be assigned to either HPC0 or HPC1 by the SDSoC compiler.
When the platform is created, the designer could specify a shortcut for a specific platform port, using the PFM.AXI_PORT property. Refer to the SDSoC Environment Platform Development Guide for more information on PFM properties. For example:
set_property PFM.AXI_PORT {M_AXIS {type "M_AXIS" sptag "Counter"}} \
[get_bd_cells /stream_fifo]
This defines a SYS_PORT tag, "Counter", which can be specified in the pragma as:
#pragma SDS data sys_port(Array1:Counter)
This would be the same as declaring the following:
#pragma SDS data sys_port(Array1:stream_fifo_M_AXIS)
- Multiple arguments can be specified in one pragma, separated by commas:
#pragma SDS data sys_port(param1:port, param2:port)
Example 1
The following code snippet shows an example of using this pragma:
#pragma SDS data sys_port(A:HP)
void foo(int A[1024], int B[1024]);
In the above example, if the caller passes an array (A) allocated with cache coherent calls, such as malloc or sds_alloc, the SDSoC compiler uses an HP platform interface even though this might not be the best choice.
pragma SDS resource
Description
This pragma can be used at function call sites to manually specify resource binding.
The RESOURCE pragma is specified immediately preceding a call to a hardware function, directing the compiler to bind the caller to a specified accelerator instance. The SDSoC compiler identifies when multiple resource IDs have been specified for a function, and automatically generates a hardware accelerator and data motion network realizing the hardware functions in programmable logic.
Syntax
#pragma SDS resource(<ID>)
Where:
- <ID>: Must be a compile time unsigned integer constant. For the same function, each unique ID represents a unique instance of the hardware accelerator.
Example 1
The following code snippet shows an example of using this pragma with a different ID for each call:
{
#pragma SDS resource(1)
mmult(A, B, C);
#pragma SDS resource(2)
mmult(D, E, F);
...
}
In the previous example, the first call to function mmult will be bound to an accelerator with an ID of 1, and the second call to mmult will be bound to another accelerator with an ID of 2.
pragma SDS trace
Description
The SDSoC environment tracing feature provides a detailed view of what is happening in the system during execution of an application, through the use of hardware/software event tracing. See the SDSoC Environment User Guide for more information.
This pragma specifies the trace insertion for the accelerator with granularity at the function level or the argument level, to let you monitor the activity on the accelerator for debug purposes. When tracing is enabled, tracing instrumentation is automatically inserted into the software code, and hardware monitors are inserted into the hardware system during implementation of the hardware logic. You can monitor either the complete accelerator function, or an individual parameter of the function.
The type of trace can be SW, HW, or both. The HW trace records the "start" and "stop" of the corresponding hardware component, such as the start and stop of the hardware accelerator, or the start and stop of data transfer of the specified argument. This lets you monitor activity moving onto, and off of, the hardware. The SW trace lets you observe the software stub for the hardware accelerated function, to monitor the function and arguments on the software side of the transaction. You can also monitor both the hardware and software transactions.
Syntax
This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration.
#pragma SDS trace(<var1>[:SW|:HW][,<var2>[:SW|:HW]])
Where:
- <var>: Specifies either the function name, or one of the parameters of the function.
- [:SW|:HW]: Specifies either HW tracing or SW tracing. The absence of this option indicates that both HW and SW traces are inserted.
Example 1
The following example traces the specified function, foo:
#pragma SDS trace(foo)
void foo(int a, int b);
The absence of :HW or :SW indicates that both traces are inserted for the accelerator.
Example 2
The following example demonstrates using this pragma to trace multiple arguments of the function.
#pragma SDS trace(a, b:SW, c:HW)
void foo(int a, int b, int *c);
In the previous example, both HW and SW traces are inserted for argument a. Only the SW trace is inserted for argument b. For argument c, only the HW trace is inserted.
pragma SDS wait
Description
The WAIT pragma must be paired with the ASYNC pragma to support manual control of the hardware function synchronization.
The ASYNC pragma is specified immediately preceding a call to a hardware function, directing the compiler not to automatically generate the wait based on data flow analysis. The WAIT pragma must be inserted at an appropriate point in the program to direct the CPU to wait until the associated ASYNC function call with the same ID has completed.
pragma SDS data zero_copy
Description
The COPY pragma implies that data is explicitly copied between the host processor memory and the hardware function, using a suitable data mover for the transfer. The ZERO_COPY pragma means that the hardware function accesses the data directly from shared memory through an AXI master bus interface.
Example 1
The following example shows a ZERO_COPY pragma with multiple arrays specified to generate a direct memory interface between DDR and the hardware function:
#pragma SDS data zero_copy(in1[0:mat_dim*mat_dim], in2[0:mat_dim*mat_dim], out[0:mat_dim*mat_dim])
void matmul_partition_accel(int *in1, // Read-Only Matrix 1
int *in2, // Read-Only Matrix 2
int *out, // Output Result
int mat_dim); // Matrix Dim (assumed only square matrix)
See Also
- SDSoC Environment Profiling and Optimization Guide (UG1235)