SDS Pragmas

Optimizations in SDSoC

This section describes pragmas for the SDSoC™ system compilers, sdscc and sds++, that assist system optimization.

The SDSoC environment system compilers target a base platform and invoke the Vivado® High-Level Synthesis (HLS) tool to compile synthesizable C/C++ functions into programmable logic. Using the SDSoC IDE, or sdscc/sds++ command line options, you select functions from your source program to run in hardware, specify accelerator and system clocks, and set properties on data transfers.

In the SDSoC environment, you control the system generation process by structuring hardware functions and calls to hardware functions to balance communication and computation, and by inserting pragmas into your source code to guide the system compiler. The SDSoC compiler automatically chooses the best possible system port to use for any data transfer, but allows you to override this selection by using pragmas. You can also specify pragmas to select different data movers for your hardware function arguments, and use pragmas to control the number of data elements that are transferred to/from the hardware function.

All pragmas specific to the SDSoC environment are prefixed with #pragma SDS and should be inserted into C/C++ source code, either immediately prior to a function declaration or at a function call site for optimization of a specific function call.

#pragma SDS data access_pattern(in_a:SEQUENTIAL, out_b:SEQUENTIAL)
void f1(int in_a[20], int out_b[20]);

The SDS pragmas include the types specified below:

Table 1. SDS Pragmas by Type
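Type                                           Pragmas
Data Access Patterns                           pragma SDS data access_pattern
Data Transfer Size                             pragma SDS data copy
                                               pragma SDS data zero_copy
Memory Attributes                              pragma SDS data mem_attribute
Data Mover Type                                pragma SDS data data_mover
SDSoC Platform Interfaces to External Memory   pragma SDS data sys_port
Hardware Buffer Depth                          pragma SDS data buffer_depth
Asynchronous Function Execution                pragma SDS async
                                               pragma SDS wait
Specifying Resource Binding                    pragma SDS resource
Hardware/Software Tracing                      pragma SDS trace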

pragma SDS async

Description

The ASYNC pragma must be paired with the WAIT pragma to support manual control of the hardware function synchronization.

The ASYNC pragma is specified immediately preceding a call to a hardware function, directing the compiler not to automatically generate the wait based on data flow analysis. The WAIT pragma must be inserted at an appropriate point in the program to direct the CPU to wait until the associated ASYNC function call with the same ID has completed.

In the presence of an ASYNC pragma, the SDSoC system compiler does not generate an sds_wait() in the stub function for the associated call. The program must contain the matching sds_wait(ID) or #pragma SDS wait(ID) at an appropriate point to synchronize the controlling thread running on the CPU with the hardware function thread. An advantage of using #pragma SDS wait(ID) over the sds_wait(ID) function call is that the source code can then be compiled by compilers other than the SDSoC compilers, such as gcc, that do not interpret the ASYNC or WAIT pragmas.
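The portability point can be illustrated with a small software-only sketch. It assumes a hypothetical 2x2 stand-in for a hardware function (mmult2x2 and run_mmult are not part of the original examples). A compiler that does not interpret SDS pragmas ignores both pragma lines, so the call runs synchronously; gcc may report the pragmas under -Wunknown-pragmas, which -Wall enables.

```c
#define N 2  /* hypothetical matrix dimension, chosen for illustration */

/* Hypothetical software stand-in for a hardware function: C = A * B. */
void mmult2x2(const int A[N * N], const int B[N * N], int C[N * N])
{
    for (int r = 0; r < N; r++) {
        for (int c = 0; c < N; c++) {
            int acc = 0;
            for (int k = 0; k < N; k++)
                acc += A[r * N + k] * B[k * N + c];
            C[r * N + c] = acc;
        }
    }
}

/* Call site annotated for manual synchronization. A compiler that does
 * not interpret SDS pragmas skips both lines and simply runs the call. */
void run_mmult(const int A[N * N], const int B[N * N], int C[N * N])
{
    #pragma SDS async(1)
    mmult2x2(A, B, C);
    #pragma SDS wait(1)
}
```

Had the wait been written as a call to sds_wait(1), the same file would need the SDSoC runtime header and library to build.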

Syntax

Place the pragma in the C source immediately before the function call:

#pragma SDS async(<ID>)
...
#pragma SDS wait(<ID>)
Where:
  • <ID>: A user-defined ID for the ASYNC/WAIT pair, specified as a compile-time unsigned integer constant.

Example 1

The following code snippet shows an example of using these pragmas with different IDs:

{
    #pragma SDS async(1)
    mmult(A, B, C);
    #pragma SDS async(2)
    mmult(D, E, F);
    ...
    #pragma SDS wait(1)
    #pragma SDS wait(2)
}

The program running on the CPU first transfers A and B to the mmult hardware and returns immediately. Then the program transfers D and E to the mmult hardware and returns immediately. When the program later reaches #pragma SDS wait(1), it waits for the output C to be ready. When it reaches #pragma SDS wait(2), it waits for the output F to be ready.

Example 2

The following code snippet shows an example of using these pragmas with the same ID to pipeline the data transfer and accelerator execution:

for (int i = 0; i < pipeline_depth; i++) {    
    #pragma SDS async(1)
    mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
}
for (int i = pipeline_depth; i < NUM_TESTS; i++) {
    #pragma SDS wait(1)
    #pragma SDS async(1)
    mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
}
for (int i = 0; i < pipeline_depth; i++) {
    #pragma SDS wait(1)
} 

In the above example, the first loop ramps up the pipeline with a depth of pipeline_depth, the second loop executes the pipeline, and the third loop ramps down the pipeline. The hardware buffer depth (pragma SDS data buffer_depth) should be set to the same value as pipeline_depth. The goal of this pipeline is to transfer data to the accelerator for the next execution while the current execution is still running. Refer to "Increasing System Parallelism and Concurrency" in SDSoC Profiling and Optimization Guide for more information.
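The bookkeeping behind the three loops can be checked in plain software. The sketch below is a hypothetical model, not SDSoC code: issue() and wait_one() stand in for an ASYNC call and its matching WAIT, and the counters only track how many calls are outstanding. Under these assumptions it shows that every ASYNC call is matched by exactly one WAIT and that no more than pipeline_depth calls are pending at any time.

```c
#include <assert.h>

/* Hypothetical bookkeeping for the ramp-up / steady-state / ramp-down
 * pattern; no hardware is involved. */
typedef struct {
    int issued;        /* async calls started so far   */
    int completed;     /* waits that have returned     */
    int max_in_flight; /* peak number of pending calls */
} pipeline_stats;

static void issue(pipeline_stats *s)
{
    s->issued++;
    int in_flight = s->issued - s->completed;
    if (in_flight > s->max_in_flight)
        s->max_in_flight = in_flight;
}

static void wait_one(pipeline_stats *s)
{
    assert(s->completed < s->issued); /* a wait needs a pending call */
    s->completed++;
}

pipeline_stats simulate_pipeline(int num_tests, int pipeline_depth)
{
    assert(pipeline_depth >= 1 && pipeline_depth <= num_tests);
    pipeline_stats s = {0, 0, 0};

    for (int i = 0; i < pipeline_depth; i++)           /* ramp up      */
        issue(&s);
    for (int i = pipeline_depth; i < num_tests; i++) { /* steady state */
        wait_one(&s);
        issue(&s);
    }
    for (int i = 0; i < pipeline_depth; i++)           /* ramp down    */
        wait_one(&s);
    return s;
}
```

For example, simulate_pipeline(10, 3) issues ten calls, completes ten waits, and never has more than three calls pending, which is why the buffer depth is matched to pipeline_depth.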

pragma SDS data access_pattern

Description

This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration.

This pragma specifies the data access pattern in the hardware function. The SDSoC compiler checks the value of this pragma to determine the hardware interface to synthesize. If the access pattern is SEQUENTIAL, a streaming interface (such as ap_fifo) will be generated. Otherwise, with RANDOM access pattern, a RAM interface will be generated. Refer to "Data Motion Network Generation in SDSoC" in the SDSoC Environment Profiling and Optimization Guide (UG1235) for more information on the use of this pragma in data motion network generation.

Syntax

The syntax for this pragma is:

#pragma SDS data access_pattern(ArrayName:<pattern>)

Where:

  • ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
  • <pattern>: can be either SEQUENTIAL or RANDOM. The default is RANDOM.

Example 1

The following code snippet shows an example of using this pragma for the array argument (A):

#pragma SDS data access_pattern(A:SEQUENTIAL)
void foo(int A[1024], int B[1024]);

In the example shown above, a streaming interface will be generated for argument A, while a RAM interface will be generated for argument B. The access pattern for argument A must be A[0], A[1], A[2], ... , A[1023], and each element must be accessed exactly once. Argument B, on the other hand, can be accessed in any order, and each of its elements can be accessed zero or more times.
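As an illustration of a function body that satisfies the SEQUENTIAL contract, the following sketch reads each element of A once, in index order, while writing B out of order. The function name scale_stream and its scale parameter are hypothetical; the body is plain C that also runs in software.

```c
#define LEN 1024  /* matches the array bound used in the pragma example */

/* A is declared SEQUENTIAL; B keeps the default RANDOM pattern. */
#pragma SDS data access_pattern(A:SEQUENTIAL)
void scale_stream(int A[LEN], int B[LEN], int scale);

void scale_stream(int A[LEN], int B[LEN], int scale)
{
    for (int i = 0; i < LEN; i++) {
        int a = A[i];               /* single read of A[i], in index order */
        B[LEN - 1 - i] = a * scale; /* reversed write order is fine for B  */
    }
}
```

Reading A[i] a second time inside the loop, or indexing A in reverse, would break the sequential contract even though the code would still run correctly in software.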

Example 2

The following code snippet shows an example of using this pragma for a pointer argument:

#pragma SDS data access_pattern(A:SEQUENTIAL)
#pragma SDS data copy(A[0:1024])
void foo(int *A, int B[1024]);

In the above example, if argument A is intended to be a streaming port, the two pragmas shown must be applied. Without these, the SDSoC tool synthesizes argument A as a register (IN, OUT, or INOUT based on the usage of A in function foo).

Example 3

The following code snippet shows the combination of the ZERO_COPY pragma and the ACCESS_PATTERN pragma:

#pragma SDS data zero_copy(A)
#pragma SDS data access_pattern(A:SEQUENTIAL)
void foo(int A[1024], int B[1024]);

In the above example, the ACCESS_PATTERN pragma is ignored. After the ZERO_COPY pragma is applied to an argument, an AXI Master interface will be synthesized for that argument. Refer to "Zero Copy Data Mover" in the SDSoC Environment Profiling and Optimization Guide (UG1235) for more information.

pragma SDS data buffer_depth

Description

This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration, and applies to all the callers of the function.

This pragma only applies to arrays that map to block RAM or FIFO interfaces. For a block RAM-mapped array, the <BufferDepth> value specifies hardware multi-buffer depth. For a FIFO-mapped array, the <BufferDepth> value specifies the depth of the hardware FIFO allocated for the array. For this pragma, the following must be true:

  • BRAM: 1 ≤ <BufferDepth> ≤ 4, and 2 ≤ ArraySize ≤ 16384.
  • FIFO: <BufferDepth> = 2^n, where 4 ≤ n ≤ 20.
TIP: When the pragma is not specified, the default <BufferDepth> is 1.
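The limits above can be restated as two small checks. This is a minimal sketch with hypothetical helper names (they are not part of any SDSoC API), reading the FIFO constraint as a power of two, 2^n with n between 4 and 20.

```c
#include <stdbool.h>

/* Hypothetical helpers that restate the documented limits. */

/* Block RAM: depth in [1, 4] and array size in [2, 16384]. */
bool bram_buffer_depth_ok(unsigned depth, unsigned array_size)
{
    return depth >= 1 && depth <= 4 &&
           array_size >= 2 && array_size <= 16384;
}

/* FIFO: depth must be 2^n with 4 <= n <= 20, that is 16 up to 1048576. */
bool fifo_buffer_depth_ok(unsigned long depth)
{
    bool power_of_two = depth != 0 && (depth & (depth - 1)) == 0;
    return power_of_two && depth >= (1UL << 4) && depth <= (1UL << 20);
}
```

Under this reading, a FIFO depth of 16 or 1024 is accepted, while 8 (too small) and 24 (not a power of two) are rejected.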

Syntax

The syntax of this pragma is:

#pragma SDS data buffer_depth(ArrayName:<BufferDepth>)

Where:

  • ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
  • <BufferDepth>: must be a compile-time constant value.
  • Multiple arrays can be specified as a comma separated list in one pragma. For example:
    #pragma SDS data buffer_depth(ArrayName1:BufferDepth1, ArrayName2:BufferDepth2)

Example 1

This example specifies a multi-buffer of size 4 used for the RAM interface of argument a:

#pragma SDS data buffer_depth(a:4)
void foo(int a[1024], int b[1024]);

pragma SDS data copy

Description

The pragma SDS data copy | zero_copy must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration.

IMPORTANT: The COPY pragma and the ZERO_COPY pragma are mutually exclusive and should not be specified together on the same object.

The COPY pragma implies that data is explicitly copied between the host processor memory and the hardware function. A suitable data mover performs the data transfer. See "Improving System Performance" in SDSoC Profiling and Optimization Guide for more information.

The ZERO_COPY pragma means that the hardware function accesses the data directly from shared memory through an AXI master bus interface.

IMPORTANT: By default, the SDSoC compiler assumes the COPY pragma for an array argument, meaning the data is explicitly copied from the processor to the accelerator via a data mover.

Syntax

The syntax for this pragma is:

#pragma SDS data copy|zero_copy(ArrayName[<offset>:<length>])

Where:

  • ArrayName[<offset>:<length>]: specifies the function parameter or argument to assign the pragma to, and the array dimension and data transfer size.
  • ArrayName: must be one of the formal parameters of the function definition. The name is taken from the definition rather than from the prototype, where parameter names are optional.
  • <offset>: Optionally specifies the number of elements from the first element in the array. It must be specified as a compile-time constant.
    IMPORTANT: The <offset> value is currently ignored, and should be specified as 0.
  • <length>: Specifies the number of elements transferred from the array for the specified dimension. It can be an arbitrary expression as long as the expression can be resolved at runtime inside the function.
    TIP: As shown in the examples below, <length> can be a C arithmetic expression involving other scalar arguments of the same function.
  • For a multi-dimensional array, each dimension should be separately specified. For example, for a two-dimensional array, use:
    #pragma SDS data copy(ArrayName[offset_dim1:length1][offset_dim2:length2])
  • Multiple arrays can be specified in the same pragma, using a comma separated list. For example, use:
    #pragma SDS data copy(ArrayName1[offset1:length1], ArrayName2[offset2:length2])
  • The [<offset>:<length>] argument is optional, and is only needed if the data transfer size for an array cannot be determined at compile time. When it is not specified, the COPY or ZERO_COPY pragma only selects between copying the memory to/from the accelerator through a data mover and having the accelerator access the processor memory directly. To determine the array size, the SDSoC compiler analyzes the callers of the accelerator function and derives the transfer size from the memory allocation APIs used for the array, for example, malloc or sds_alloc. If the analysis fails, it checks whether the argument type has a compile-time array size and uses that size as the data transfer size. If the data transfer size cannot be determined, the compiler generates an error message so that you can specify the data size with [<offset>:<length>]. If the data size is different between the caller and the callee, or different between multiple callers, the compiler also generates an error message so that you can correct the source code or use this pragma to override the compiler analysis.

Example 1

The following example applies the COPY pragma to both the "A" and "B" arguments of the accelerator function foo right before the function declaration. Notice the <length> option is specified as an expression, size*size:

#pragma SDS data copy(A[0:size*size], B[0:size*size])
void foo(int *A, int *B, int size);

The SDSoC system compiler will replace the body of the function foo with accelerator control, data transfer, and data synchronization code. The following code snippet shows the data transfer part:

void _p0_foo_0(int *A, int *B, int size)
{
    ...
    cf_send_i(&(_p0_swinst_foo_0.A), A, (size*size) * 4, &_p0_request_0);
    cf_receive_i(&(_p0_swinst_foo_0.B), B, (size*size) * 4, &_p0_request_1);
    ...
}

As shown above, the <length> value size*size is used to tell the SDSoC runtime the number of elements of arrays "A" and "B."

TIP: The cf_send_i and cf_receive_i functions require the number of bytes to transfer, so the compiler will multiply the number of elements specified by <length> with the number of bytes for each element (4 in this case).
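The element-to-byte conversion can be worked through with a small helper. The function name copy_bytes is hypothetical; it only mirrors the (size*size) * 4 expression in the stub above, with sizeof(int) standing in for the constant 4.

```c
#include <stddef.h>

/* Hypothetical helper: byte count for a transfer of size*size int
 * elements, mirroring the (size*size) * 4 expression in the stub. */
size_t copy_bytes(int size)
{
    return (size_t)size * (size_t)size * sizeof(int);
}
```

With size equal to 32 and a 4-byte int, the transfer covers 1024 elements, or 4096 bytes.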

Example 2

The following code snippet shows an example of applying the ZERO_COPY pragma, instead of the COPY pragma above:

#pragma SDS data zero_copy(A[0:size*size], B[0:size*size])
void foo(int *A, int *B, int size);

The body of function foo becomes:

void _p0_foo_0(int *A, int *B, int size)
{
    ...
    cf_send_ref_i(&(_p0_swinst_foo_0.A), A, (size*size) * 4, &_p0_request_0);
    cf_receive_ref_i(&(_p0_swinst_foo_0.B), B, (size*size) * 4, &_p0_request_1);
    ...
}

The cf_send_ref_i and cf_receive_ref_i functions only transfer the reference or pointer of the array to the accelerator, and the accelerator accesses the processor memory directly.

Example 3

The following example shows a ZERO_COPY pragma with multiple arrays specified to generate a direct memory interface with DDR and the hardware function:

#pragma SDS data zero_copy(in1[0:mat_dim*mat_dim], in2[0:mat_dim*mat_dim], out[0:mat_dim*mat_dim])
void matmul_partition_accel(int *in1,  // Read-Only Matrix 1
                            int *in2,  // Read-Only Matrix 2
                            int *out,  // Output Result
                            int mat_dim); //  Matrix Dim (assumed only square matrix)

Example 4

A DATA COPY pragma instructs the compiler to insert the transfer size expression into the corresponding send/receive call within the stub function body. As a result, it is essential that the argument names used in the function declaration match the argument names in the function definition. The following code snippet illustrates a common mistake: using an argument name in the function declaration that is different from the argument name used in the function definition:

"foo.h"
#pragma SDS data copy(in_A[0:1024])
void foo(int *in_A, int *out_B);

"foo.cpp"
#include "foo.h"
void foo(int *A, int *B)
{
...
}

Any C/C++ compiler will ignore the argument name in the function declaration, because the C/C++ standard makes the argument name in the function declaration optional. Only the argument name in the function definition is used by the compiler. However, the SDSoC compiler will issue a warning when trying to apply the pragma:

WARNING: [SDSoC 0-0] Cannot find argument in_A in accelerator function foo(int *A, int *B)
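One way to avoid this warning is to use the same parameter names in the declaration and in the definition. The sketch below folds the header and the source file into a single listing for brevity; the loop body is a placeholder added for illustration and is not from the original.

```c
/* foo.h (sketch): the pragma refers to in_A ... */
#pragma SDS data copy(in_A[0:1024])
void foo(int *in_A, int *out_B);

/* foo.cpp (sketch): ... and the definition uses the same names. */
void foo(int *in_A, int *out_B)
{
    /* Placeholder body for illustration only. */
    for (int i = 0; i < 1024; i++)
        out_B[i] = in_A[i];
}
```

Because the definition now declares in_A, the transfer size expression attached to in_A can be resolved inside the generated stub.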

pragma SDS data data_mover

Description

IMPORTANT: This pragma is not recommended for normal use. Only use this pragma if the compiler-generated data mover type does not meet the design requirement.

This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration. This pragma applies to all the callers of the bound function.

By default, the SDSoC compiler chooses the type of the data mover automatically by analyzing the code. The DATA_MOVER pragma can be used to override the compiler default. This pragma specifies the HW IP type, or DataMover, used to transfer an array argument.

The FASTDMA data mover supports a wider data width to provide higher bandwidth for data transfer. For Zynq® UltraScale+™ MPSoC the data width is from 64 bits to 256 bits. For Zynq-7000 the data width is 64 bits.

The SDSoC compiler automatically assigns an instance of the data mover HW IP to use for transferring the corresponding array. The :id can be specified to assign a specific data mover instance for the associated formal parameter. If two or more formal parameters have the same DataMover and the same id, they will share the same data mover HW IP instance.

IMPORTANT: An additional requirement for using the AXIDMA_SIMPLE data mover is that the corresponding array must be allocated using sds_alloc().

Syntax

The syntax for this pragma is:

#pragma SDS data data_mover(ArrayName:DataMover[:id])

Where:

  • ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
  • DataMover: Must be one of the following:
    • AXIFIFO: used for non-contiguous memory, <300 bytes.
    • AXIDMA_SIMPLE: used for contiguous memory, <32MB.
    • AXIDMA_SG: can be used for either contiguous or non-contiguous memory, >300 bytes.
    • FASTDMA: contiguous memory only. The pragma is required when FASTDMA is desired.
  • :id: is optional, but must be specified as a positive integer when it is used.
  • Multiple arrays can be specified in one pragma, separated by a comma (,). For example:
    #pragma SDS data data_mover(ArrayName1:DataMover[:id], ArrayName2:DataMover[:id])

Example 1

The following code snippet shows an example of specifying the data mover ID in the pragma:

#pragma SDS data data_mover(A:AXIDMA_SG:1, B:AXIDMA_SG:1)
void foo(int A[1024], int B[1024]);

In the example above, the same instance of the AXIDMA_SG IP is shared to transfer data for arguments A and B, because the same data mover ID has been specified.

Example 2

The following example uses the FASTDMA data mover:

#pragma SDS data data_mover(A:FASTDMA,B:FASTDMA,C:FASTDMA,D:AXIDMA_SIMPLE,E:AXIDMA_SIMPLE)
void foo(float A[1024], float B[1024], float C[1024], int D[1024], int E[1024]);

The compiler will transfer arrays A, B, and C with individual FASTDMA data movers, and arrays D and E with individual AXIDMA_SIMPLE data movers.

pragma SDS data mem_attribute

Description

This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration. This pragma applies to all the callers of the function.

For an operating system like Linux that supports virtual memory, user-space allocated memory is paged, which can affect system performance. The SDSoC runtime also provides an API to allocate physically contiguous memory. The pragmas in this section can be used to tell the compiler whether the arguments have been allocated in physically contiguous memory.

Syntax

The syntax for this pragma is:

#pragma SDS data mem_attribute(ArrayName:contiguity)

Where:

  • ArrayName: Specifies one of the formal parameters of the function to assign the pragma to.
  • contiguity: Must be specified as either PHYSICAL_CONTIGUOUS or NON_PHYSICAL_CONTIGUOUS. The default value is NON_PHYSICAL_CONTIGUOUS:
    • PHYSICAL_CONTIGUOUS means that all memory corresponding to the associated ArrayName is allocated using sds_alloc.
    • NON_PHYSICAL_CONTIGUOUS means that all memory corresponding to the associated ArrayName is allocated using malloc or as a free variable on the stack. This helps the SDSoC compiler select the optimal data mover.
  • Multiple arrays can be specified in one pragma, separated by a comma (,). For example:
    #pragma SDS data mem_attribute(ArrayName:contiguity, ArrayName:contiguity)

Example 1

The following code snippet shows an example of specifying the contiguity attribute:

#pragma SDS data mem_attribute(A:PHYSICAL_CONTIGUOUS)
void foo(int A[1024], int B[1024]);

In the example above, the user tells the SDSoC compiler that array A is allocated in the memory block that is physically contiguous. The SDSoC compiler then chooses AXIDMA_SIMPLE instead of AXIDMA_SG, because the former is smaller and faster for transferring physically contiguous memory.
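A caller has to keep the promise made by the pragma. The sketch below shows one way to do that, under stated assumptions: the body of foo and the wrapper run_foo are placeholders added for illustration, the __SDSCC__ macro is assumed to be defined when building with sdscc/sds++, and the malloc fallback exists only so that the listing also builds with an ordinary compiler for software testing.

```c
#include <stdlib.h>

#ifdef __SDSCC__
#include "sds_lib.h"   /* sds_alloc / sds_free from the SDSoC runtime */
#else
/* Fallback so the sketch builds with an ordinary compiler. malloc does
 * NOT guarantee physical contiguity; this is for software testing only. */
#define sds_alloc(n) malloc(n)
#define sds_free(p)  free(p)
#endif

#pragma SDS data mem_attribute(A:PHYSICAL_CONTIGUOUS)
void foo(int A[1024], int B[1024]);

/* Placeholder body for illustration: B[i] = A[i] + 1. */
void foo(int A[1024], int B[1024])
{
    for (int i = 0; i < 1024; i++)
        B[i] = A[i] + 1;
}

/* Caller that allocates A with sds_alloc, as the pragma promises.
 * Returns 0 on success, -1 if the allocation fails. */
int run_foo(int B[1024])
{
    int *A = (int *)sds_alloc(1024 * sizeof(int));
    if (A == NULL)
        return -1;
    for (int i = 0; i < 1024; i++)
        A[i] = i;
    foo(A, B);
    sds_free(A);
    return 0;
}
```

If any caller passed a malloc buffer for A in an SDSoC build, the PHYSICAL_CONTIGUOUS attribute would no longer describe the data correctly.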

pragma SDS data sys_port

Description

This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration, and applies to all the callers of the function.

This pragma overrides the SDSoC compiler default choice of memory port. If the SYS_PORT pragma is not specified for an array argument, the interface to the external memory is automatically determined by the SDSoC system compilers (sdscc/sds++) based on array memory attributes (cacheable or non-cacheable), array size, data mover used, etc.

The Zynq®-7000 device provides a cache-coherent interface (S_AXI_ACP) between programmable logic and external memory, and high-performance ports (S_AXI_HP) for non-cache-coherent access. The Zynq® UltraScale+™ MPSoC provides a cache-coherent interface (S_AXI_HPCn_FPD) and a non-cache-coherent interface (S_AXI_HPn_FPD).

Syntax

The syntax for this pragma is:

#pragma SDS data sys_port(<param_name>:<port>)

Where:

  • <param_name>: Specifies one of the formal parameters of the function to assign the pragma to.
  • <port>: The SDSoC compiler recognizes predefined memory port types: ACP for Zynq-7000 devices only, HPC, HP, or MIG, which represent cache coherent access (ACP, HPC), high speed non-coherent access (HP), or memory accessible through a soft memory controller implemented in PL logic (MIG). You can also use a specific platform port name for the <port>, but this is not recommended unless the compiler does not select the correct port, which could occur for a stream port in the platform. To get a list of platform ports, in a terminal shell, run sds++ -sds-pf-info <platform>.

    For example, the sds++ -sds-pf-info zcu102 command returns the following under System Ports:

    System Ports
    
    Use the system port name in a sysport pragma, for example 
    #pragma SDS data sys_port(parameter_name:system_port_name)
     
    System Port Name (Vivado BD instance name, Vivado BD port name)
    ps_e_S_AXI_HPC0_FPD (ps_e, S_AXI_HPC0_FPD)
    ps_e_S_AXI_HPC1_FPD (ps_e, S_AXI_HPC1_FPD)
    ps_e_S_AXI_HP0_FPD (ps_e, S_AXI_HP0_FPD)
    ps_e_S_AXI_HP1_FPD (ps_e, S_AXI_HP1_FPD)
    ps_e_S_AXI_HP2_FPD (ps_e, S_AXI_HP2_FPD)
    ps_e_S_AXI_HP3_FPD (ps_e, S_AXI_HP3_FPD)
    
    
    In this case, the SYS_PORT pragma could be defined as:
    #pragma SDS data sys_port(Array1:ps_e_S_AXI_HPC0_FPD)

    If <port> is defined using the HPC shortcut, then the argument, Array1, could be assigned to either HPC0 or HPC1 by the SDSoC compiler.

    When the platform is created, the designer could specify a shortcut for a specific platform port, using the PFM.AXI_PORT property. Refer to SDSoC Environment Platform Development Guide for more information on PFM properties. For example:
    set_property PFM.AXI_PORT {M_AXIS {type "M_AXIS" sptag "Counter"}} \
    [get_bd_cells /stream_fifo]
    This defines a SYS_PORT tag, "Counter", which can be specified in the pragma as:
    #pragma SDS data sys_port(Array1:Counter)
    
    This would be the same as declaring the following:
    #pragma SDS data sys_port(Array1:stream_fifo_M_AXIS)
    
  • Multiple arguments can be specified in one pragma, separated by commas:
    #pragma SDS data sys_port(param1:port, param2:port)

Example 1

The following code snippet shows an example of using this pragma:

#pragma SDS data sys_port(A:HP)
void foo(int A[1024], int B[1024]);

In the above example, if the caller passes an array (A) allocated with cache-coherent calls, such as malloc or sds_alloc, the SDSoC compiler uses an HP platform interface even though this might not be the best choice.

pragma SDS resource

Description

This pragma can be used at function call sites to manually specify resource binding.

The RESOURCE pragma is specified immediately preceding a call to a hardware function, directing the compiler to bind the caller to a specified accelerator instance. The SDSoC compiler identifies when multiple resource IDs have been specified for a function, and automatically generates a hardware accelerator and data motion network realizing the hardware functions in programmable logic.

Syntax

The syntax of the pragma is:

#pragma SDS resource(<ID>)

Where:

  • <ID>: Must be a compile-time unsigned integer constant. For the same function, each unique ID represents a unique instance of the hardware accelerator.

Example 1

The following code snippet shows an example of using this pragma with different IDs:

{
    #pragma SDS resource(1)
    mmult(A, B, C);
    #pragma SDS resource(2)
    mmult(D, E, F);
    ...
}

In the previous example, the first call to function mmult will be bound to an accelerator with an ID of 1, and the second call to mmult will be bound to another accelerator with an ID of 2.

pragma SDS trace

Description

The SDSoC environment tracing feature provides a detailed view of what is happening in the system during execution of an application, through the use of hardware/software event tracing. See the SDSoC Environment User Guide for more information.

This pragma specifies the trace insertion for the accelerator with granularity at the function level or the argument level, to let you monitor the activity on the accelerator for debug purposes. When tracing is enabled, tracing instrumentation is automatically inserted into the software code, and hardware monitors are inserted into the hardware system during implementation of the hardware logic. You can monitor either the complete accelerator function, or an individual parameter of the function.

The type of trace can be SW, HW, or both. A HW trace records the "start" and "stop" of the corresponding hardware component, such as the start and stop of the hardware accelerator, or the start and stop of the data transfer of the specified argument. This lets you monitor activity moving onto, and off of, the hardware. A SW trace lets you observe the software stub for the hardware-accelerated function, to monitor the function and its arguments on the software side of the transaction. You can also monitor both the hardware and software transactions.

Syntax

This pragma must be specified immediately preceding a function declaration, or immediately preceding another #pragma SDS bound to the function declaration.

#pragma SDS trace(<var1>[:SW|:HW][,<var2>[:SW|:HW]])

Where:

  • <var>: Specifies either the function name, or one of the parameters of the function.
  • [:SW|:HW]: Specifies either HW tracing or SW tracing. The absence of this option indicates that both HW and SW traces are inserted.

Example 1

The following example traces the specified function, foo:

#pragma SDS trace(foo)
void foo(int a, int b);

TIP: The absence of either :HW or :SW indicates that both traces are inserted for the accelerator.

Example 2

The following example demonstrates using this pragma to trace multiple arguments of the function.

#pragma SDS trace(a, b:SW, c:HW)
void foo(int a, int b, int *c);

In the previous example, both HW and SW traces are inserted for argument a. Only the SW trace is inserted for argument b. For argument c, only the HW trace is inserted.

pragma SDS wait

Description

TIP: Refer to the ASYNC pragma for more information.

The WAIT pragma must be paired with the ASYNC pragma to support manual control of the hardware function synchronization.

The ASYNC pragma is specified immediately preceding a call to a hardware function, directing the compiler not to automatically generate the wait based on data flow analysis. The WAIT pragma must be inserted at an appropriate point in the program to direct the CPU to wait until the associated ASYNC function call with the same ID has completed.

pragma SDS data zero_copy

Description

TIP: Refer to pragma SDS data copy for a complete description of the ZERO_COPY pragma.

The COPY pragma implies that data is explicitly copied between the host processor memory and the hardware function, using a suitable data mover for the transfer. The ZERO_COPY pragma means that the hardware function accesses the data directly from shared memory through an AXI master bus interface.

IMPORTANT: The COPY pragma and the ZERO_COPY pragma are mutually exclusive and should not be specified together on the same object.

Example 1

The following example shows a ZERO_COPY pragma with multiple arrays specified to generate a direct memory interface with DDR and the hardware function:

#pragma SDS data zero_copy(in1[0:mat_dim*mat_dim], in2[0:mat_dim*mat_dim], out[0:mat_dim*mat_dim])
void matmul_partition_accel(int *in1,  // Read-Only Matrix 1
                            int *in2,  // Read-Only Matrix 2
                            int *out,  // Output Result
                            int mat_dim); //  Matrix Dim (assumed only square matrix)

IMPORTANT: The array argument passed to a ZERO_COPY data mover must be physically contiguous. Passing a malloc'd buffer to a ZERO_COPY data mover will result in undefined behavior.

See Also

  • SDSoC Environment Profiling and Optimization Guide (UG1235)