Programming the PS Host Application
In Creating a Data Flow Graph (Including Kernels), the discussion centered on a very simple AI Engine graph application. The top-level application initialized the graph, ran the graph, and ended the graph. However, for actual AI Engine graph applications the host code must do much more than those simple tasks. The top-level PS application running on the Cortex®-A72 controls the graph and PL kernels: it manages data inputs to the graph, handles data outputs from the graph, and controls any PL kernels working with the graph.
In addition, AI Engine graph applications can run on Linux operating systems or on bare-metal systems. The programming requirements for these two systems differ significantly, as outlined in the following topics. Xilinx provides drivers used by the API calls in the host program to control the graph and PL kernels based on the operating system. In Linux, this is provided by the libadf_api_xrt library. In bare-metal, the AI Engine kernels are controlled using the graph APIs, and PL kernels are controlled using libUIO driver calls.
Preventing Multiple Graph Executions
In cases where your graph is implemented in a PS-based host application, you must add a conditional compilation directive (#ifdef) to your graph.cpp code to ensure the graph is only initialized once, or run only once. The following example code is the simple application defined in Creating a Data Flow Graph (Including Kernels) with the additional guard macros __AIESIM__ and __X86SIM__.
#include "project.h"
simpleGraph mygraph;
simulation::platform<1,1> platform("input.txt","output.txt");
connect<> net0(platform.src[0], mygraph.in);
connect<> net1(mygraph.out, platform.sink[0]);
#if defined(__AIESIM__) || defined(__X86SIM__)
int main(void) {
  mygraph.init();
  mygraph.run(<number_of_iterations>);
  mygraph.end();
  return 0;
}
#endif
This conditional directive compiles the main() function only for use with the AI Engine simulator or the x86 simulator. It prevents the graph from being initialized or run multiple times, from both graph.cpp and the PS host application. The directive lets graph.cpp run in simulation or in hardware emulation of the system design, which also runs in the AI Engine simulator and the x86 simulator. However, when running in hardware, the graph is initialized and run from the PS application, rather than from graph.cpp.
Host Programming on Linux
In Linux operating systems, the ADF API controls the AI Engine graph. The Xilinx Runtime (XRT) API is used to control PL kernels and can also be used to control the AI Engine graph. The following figure shows the APIs and drivers required in this system.
Controlling the AI Engine Graph with the ADF API
ADF APIs are used to control graph execution in the top-level application, or host code, as described in AI Engine Programming. For example, the following code is for a synchronous update of run-time parameters for AI Engine kernels in the graph:
// ADF API:run and update graph parameters (RTP)
gr.run(4);
gr.update(gr.trigger,10);
gr.update(gr.trigger,10);
gr.update(gr.trigger,100);
gr.update(gr.trigger,100);
gr.wait();
graph.end() terminates the graph. The graph will not recover after end() has been called. Instead, you can use graph.wait() to wait for runs to be completed.
In the host application (host.cpp), the graph.update() function is called to update the RTPs, and graph.run() is called to launch the AI Engine kernels in the graph. In hardware emulation and hardware flows, the ADF API calls the XRT API, and adf::registerXRT() is used to manage the relationship between them. adf::registerXRT() must be called before any ADF API control or interaction with the graph.
The following is example code showing the RTP update and execution by the ADF API.
// update graph parameters (RTP) & run
adf::registerXRT(dhdl, uuid);
gr.update(gr.size, 1024);//update RTP
gr.run(16);//start AIE kernel
gr.wait();
In the preceding example, gr.run(16) specifies a run of 16 iterations. In graph.wait(), the application waits for the AI Engine kernels to complete. The code example shows that adf::registerXRT() requires the device handle (dhdl) and UUID of the XCLBIN image. They can be obtained using the XRT APIs:
auto dhdl = xrtDeviceOpen(0);//device index=0
xrtDeviceLoadXclbinFile(dhdl,xclbinFilename);
xuid_t uuid;
xrtDeviceGetXclbinUUID(dhdl, uuid);
Controlling the PL Kernel with the XRT API
Xilinx provides an open source XRT API for controlling the execution of PL kernels when programming the host code for Linux.
The execution model for the XRT API controlling PL kernels is as follows:
- Get device handle and load the XCLBIN. Get the uuid as needed.
- Allocate buffer objects and map to host memory. Process and transfer data from host memory to device memory.
- Get kernel and run handles, set arguments for kernels, and launch kernels.
- Wait for kernel completion.
- Transfer data from global memory in the device back to host memory.
- Host code continues processing using the new data in the host memory.
When using the native XRT API, the host application looks like the following.
// 1. Open device, load xclbin, and get uuid
auto dhdl = xrtDeviceOpen(0); // device index=0
xrtDeviceLoadXclbinFile(dhdl, xclbinFilename);
xuid_t uuid;
xrtDeviceGetXclbinUUID(dhdl, uuid);
// 2. Allocate output buffer objects and map to host memory
xrtBufferHandle out_bohdl = xrtBOAlloc(dhdl, output_size_in_bytes, 0, /*BANK=*/0);
std::complex<short> *host_out = (std::complex<short>*)xrtBOMap(out_bohdl);
// 3. Get kernel and run handles, set arguments for kernel, and launch kernel
xrtKernelHandle s2mm_khdl = xrtPLKernelOpen(dhdl, uuid, "s2mm"); // open kernel handle with the uuid from step 1
xrtRunHandle s2mm_rhdl = xrtRunOpen(s2mm_khdl);
xrtRunSetArg(s2mm_rhdl, 0, out_bohdl); // set kernel arg
xrtRunSetArg(s2mm_rhdl, 2, OUTPUT_SIZE); // set kernel arg
xrtRunStart(s2mm_rhdl); // launch s2mm kernel
// ADF API: run, update graph parameters (RTP) and so on
// ...
// 4. Wait for kernel completion
auto state = xrtRunWait(s2mm_rhdl);
// 5. Sync output device buffer objects to host memory
xrtBOSync(out_bohdl, XCL_BO_SYNC_BO_FROM_DEVICE, output_size_in_bytes, /*OFFSET=*/0);
// 6. Post-processing on host memory - "host_out"
After post-processing the data, release the allocated objects:
graph.end();
xrtRunClose(s2mm_rhdl);
xrtKernelClose(s2mm_khdl);
xrtBOFree(out_bohdl);
xrtDeviceClose(dhdl);
After calling graph.end(), the AI Engine kernels will not recover again. To run the host multiple times, you can comment out graph.end() if the host does not depend on graph.end() for synchronization purposes, or replace graph.end() with graph.wait() to do synchronization.
Controlling the AI Engine Graph with the XRT C API
An open source XRT API provided by Xilinx can also be used to control execution of the AI Engine graph when programming the host code for Linux. To control the AI Engine graph, XRT provides APIs through the header file experimental/xrt_graph.h.
XRT graph APIs contain both a C and C++ version. Example code to control the AI Engine graph using the XRT C API is as follows:
int narrow_filter[12] = {180, 89, -80, -391, -720, -834, -478, 505, 2063, 3896, 5535, 6504};
int wide_filter[12] = {-21, -249, 319, -78, -511, 977, -610, -844, 2574, -2754, -1066, 18539};
xrtGraphHandle ghdl = xrtGraphOpen(dhdl, top->m_header.uuid, "gr");
if (!ghdl) {
  // The graph could not be opened: stop before using the handle.
  throw std::runtime_error("Unable to open graph handle");
}
int size=1024;
xrtGraphUpdateRTP(ghdl,"gr.fir24.in[1]",(char*)narrow_filter,12*sizeof(int));
xrtGraphRun(ghdl,16);
xrtGraphWait(ghdl,0);
xrtGraphUpdateRTP(ghdl,"gr.fir24.in[1]",(char*)wide_filter,12*sizeof(int));
xrtGraphRun(ghdl,16);
......
xrtGraphEnd(ghdl,0);
xrtGraphClose(ghdl);
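The size argument passed to xrtGraphUpdateRTP() in the preceding example is written by hand as 12*sizeof(int). As a small sketch, the byte count can instead be derived from the array itself, so that it stays correct if the number of coefficients or the element type changes. The helper name rtpBytes is hypothetical and is not part of XRT.

```cpp
#include <cstddef>

// Hypothetical helper, not part of XRT: size in bytes of a fixed-size array.
// It only accepts true arrays, so passing a decayed pointer fails to compile
// instead of silently yielding the size of a pointer.
template <typename T, std::size_t N>
constexpr std::size_t rtpBytes(const T (&)[N]) {
    return N * sizeof(T);
}

// Same coefficients as in the example above.
const int narrow_filter[12] = {180, 89, -80, -391, -720, -834,
                               -478, 505, 2063, 3896, 5535, 6504};

// Evaluated at compile time; equals 12 * sizeof(int).
constexpr std::size_t narrowFilterBytes = rtpBytes(narrow_filter);
static_assert(narrowFilterBytes == 12 * sizeof(int), "unexpected RTP size");
```

With such a helper, the last argument of the update call could be written as rtpBytes(narrow_filter) rather than a hand-computed product.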
Controlling the AI Engine Graph with the XRT C++ API
As stated in the previous section, XRT provides C and C++ APIs through the header file experimental/xrt_graph.h to control the AI Engine graphs.
XRT provides class graph in the namespace xrt and its member functions to control the graph. Example code to control the AI Engine graph using the XRT C++ API is as follows:
using namespace adf;
// Open xclbin
auto device = xrt::device(0); //device index=0
auto uuid = device.load_xclbin(xclbinFilename);
auto dhdl = xrtDeviceOpenFromXcl(device);
...
int coeffs_readback[12];
int narrow_filter[12] = {180, 89, -80, -391, -720, -834, -478, 505, 2063, 3896, 5535, 6504};
int wide_filter[12] = {-21, -249, 319, -78, -511, 977, -610, -844, 2574, -2754, -1066, 18539};
auto ghdl=xrt::graph(device,uuid,"gr");
ghdl.update("gr.fir24.in[1]",narrow_filter);
ghdl.run(16);
ghdl.wait();
ghdl.read("gr.fir24.inout[0]",coeffs_readback);//Read after graph::wait. RTP update effective
ghdl.update("gr.fir24.in[1]",wide_filter);
ghdl.run(16);
ghdl.read("gr.fir24.inout[0]", coeffs_readback);//Async read
ghdl.end();
The RTP port name used in the preceding code can be found in the RTP Configurations section in aie_control_xrt.cpp, where it is called gr.fir24.in[1].
Error Reporting Through the XRT API
XRT provides error reporting APIs, which fall into two categories: synchronous and asynchronous. Synchronous errors are errors that can be detected during the XRT run-time function call. Synchronous error reporting is POSIX-compliant.
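As a minimal sketch of handling a synchronous error, the status returned by an XRT C call can be checked immediately after the call. The helper checkStatus below is hypothetical (it is not part of XRT), and it assumes the convention documented for the calls in this section, where a return value of 0 indicates success.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of XRT: converts a nonzero status returned by
// an XRT C call into an exception that names the failing call.
// Assumption: 0 means success, any other value means failure.
inline void checkStatus(int status, const std::string& call) {
    if (status != 0) {
        throw std::runtime_error(call + " failed with status " +
                                 std::to_string(status));
    }
}

// Intended use in host code (requires the XRT headers and a valid handle):
//   checkStatus(xrtGraphRun(ghdl, 16), "xrtGraphRun");
```

Throwing is only one possible policy. A host application that must release handles on failure might prefer to log the status and branch to its cleanup code instead.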
An asynchronous error might not be related to the current XRT function call or the application that is running. Asynchronous errors are cached in driver subsystems and can be accessed by the user application through the asynchronous error reporting APIs. Cached errors are persistent until explicitly cleared. Persistent errors are not necessarily indicative of the current system state, for example, a board might have been reset and be functioning correctly while previously cached errors are still available. To avoid current state confusion, asynchronous errors have a timestamp attached indicating when the error occurred. The timestamp can be compared to, for example, the timestamp for last xbutil2 reset.
The errors cached by the driver contain a system error code and additional meta data as defined in xrt_error_code.h, which is shared between the user space and the kernel space.
The error code format for asynchronous errors is as shown here:
/**
* xrtErrorCode layout
*
* This layout is internal to XRT (akin to a POSIX error code).
*
* The error code is populated by driver and consumed by XRT
* implementation where it is translated into an actual error / info /
* warning that is propagated to the end user.
*
* Bits      Field
* -------   ----------------
* 63 - 48   reserved
* 47 - 40   xrtErrorClass
* 39 - 32   xrtErrorModule
* 31 - 24   xrtErrorSeverity
* 23 - 16   xrtErrorDriver
* 15 - 0    xrtErrorNum
*
*/
typedef uint64_t xrtErrorCode;
typedef uint64_t xrtErrorTime;
#define XRT_ERROR_NUM_MASK 0xFFFFUL
#define XRT_ERROR_NUM_SHIFT 0
#define XRT_ERROR_DRIVER_MASK 0xFUL
#define XRT_ERROR_DRIVER_SHIFT 16
#define XRT_ERROR_SEVERITY_MASK 0xFUL
#define XRT_ERROR_SEVERITY_SHIFT 24
#define XRT_ERROR_MODULE_MASK 0xFUL
#define XRT_ERROR_MODULE_SHIFT 32
#define XRT_ERROR_CLASS_MASK 0xFUL
#define XRT_ERROR_CLASS_SHIFT 40
#define XRT_ERROR_CODE_BUILD(num, driver, severity, module, eclass) \
((((num) & XRT_ERROR_NUM_MASK) << XRT_ERROR_NUM_SHIFT) | \
(((driver) & XRT_ERROR_DRIVER_MASK) << XRT_ERROR_DRIVER_SHIFT) | \
(((severity) & XRT_ERROR_SEVERITY_MASK) << XRT_ERROR_SEVERITY_SHIFT) | \
(((module) & XRT_ERROR_MODULE_MASK) << XRT_ERROR_MODULE_SHIFT) | \
(((eclass) & XRT_ERROR_CLASS_MASK) << XRT_ERROR_CLASS_SHIFT))
#define XRT_ERROR_NUM(code) (((code) >> XRT_ERROR_NUM_SHIFT) & XRT_ERROR_NUM_MASK)
#define XRT_ERROR_DRIVER(code) (((code) >> XRT_ERROR_DRIVER_SHIFT) & XRT_ERROR_DRIVER_MASK)
#define XRT_ERROR_SEVERITY(code) (((code) >> XRT_ERROR_SEVERITY_SHIFT) & XRT_ERROR_SEVERITY_MASK)
#define XRT_ERROR_MODULE(code) (((code) >> XRT_ERROR_MODULE_SHIFT) & XRT_ERROR_MODULE_MASK)
#define XRT_ERROR_CLASS(code) (((code) >> XRT_ERROR_CLASS_SHIFT) & XRT_ERROR_CLASS_MASK)
/**
* xrt_error_num - XRT specific error numbers
*/
enum xrtErrorNum {
XRT_ERROR_NUM_FIRWWALL_TRIP = 1,
XRT_ERROR_NUM_TEMP_HIGH,
XRT_ERROR_NUM_AIE_SATURATION,
XRT_ERROR_NUM_AIE_FP,
XRT_ERROR_NUM_AIE_STREAM,
XRT_ERROR_NUM_AIE_ACCESS,
XRT_ERROR_NUM_AIE_BUS,
XRT_ERROR_NUM_AIE_INSTRUCTION,
XRT_ERROR_NUM_AIE_ECC,
XRT_ERROR_NUM_AIE_LOCK,
XRT_ERROR_NUM_AIE_DMA,
XRT_ERROR_NUM_AIE_MEM_PARITY,
XRT_ERROR_NUM_UNKNOWN
};
enum xrtErrorDriver {
XRT_ERROR_DRIVER_XOCL,
XRT_ERROR_DRIVER_XCLMGMT,
XRT_ERROR_DRIVER_ZOCL,
XRT_ERROR_DRIVER_AIE,
XRT_ERROR_DRIVER_UNKNOWN
};
enum xrtErrorSeverity {
XRT_ERROR_SEVERITY_EMERGENCY = 0,
XRT_ERROR_SEVERITY_ALERT,
XRT_ERROR_SEVERITY_CRITICAL,
XRT_ERROR_SEVERITY_ERROR,
XRT_ERROR_SEVERITY_WARNING,
XRT_ERROR_SEVERITY_NOTICE,
XRT_ERROR_SEVERITY_INFO,
XRT_ERROR_SEVERITY_DEBUG,
XRT_ERROR_SEVERITY_UNKNOWN
};
enum xrtErrorModule {
XRT_ERROR_MODULE_FIREWALL = 0,
XRT_ERROR_MODULE_CMC,
XRT_ERROR_MODULE_AIE_CORE,
XRT_ERROR_MODULE_AIE_MEMORY,
XRT_ERROR_MODULE_AIE_SHIM,
XRT_ERROR_MODULE_AIE_NOC,
XRT_ERROR_MODULE_AIE_PL,
XRT_ERROR_MODULE_AIE_UNKNOWN
};
enum xrtErrorClass {
XRT_ERROR_CLASS_FIRST_ENTRY = 1,
XRT_ERROR_CLASS_SYSTEM = XRT_ERROR_CLASS_FIRST_ENTRY,
XRT_ERROR_CLASS_AIE,
XRT_ERROR_CLASS_HARDWARE,
XRT_ERROR_CLASS_UNKNOWN,
XRT_ERROR_CLASS_LAST_ENTRY = XRT_ERROR_CLASS_UNKNOWN
};
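To make the bit layout concrete, the following sketch packs and unpacks an error code using the shifts and masks from the listing above. It re-expresses the macros as constexpr functions so that it compiles without the XRT headers. All names in it (ErrorCode, buildCode, field, and the k-prefixed constants) are local to this sketch and are not part of XRT; real host code would include xrt_error_code.h and use the XRT_ERROR_* macros directly.

```cpp
#include <cstdint>

// Local stand-ins for the definitions in the listing above (not XRT names).
using ErrorCode = std::uint64_t;

constexpr ErrorCode kNumMask   = 0xFFFF;  // XRT_ERROR_NUM_MASK
constexpr ErrorCode kFieldMask = 0xF;     // mask used for the other four fields
constexpr unsigned kNumShift      = 0;
constexpr unsigned kDriverShift   = 16;
constexpr unsigned kSeverityShift = 24;
constexpr unsigned kModuleShift   = 32;
constexpr unsigned kClassShift    = 40;

// Mirrors XRT_ERROR_CODE_BUILD: mask each field, then shift it into place.
constexpr ErrorCode buildCode(ErrorCode num, ErrorCode driver, ErrorCode severity,
                              ErrorCode module, ErrorCode eclass) {
    return ((num      & kNumMask)   << kNumShift)      |
           ((driver   & kFieldMask) << kDriverShift)   |
           ((severity & kFieldMask) << kSeverityShift) |
           ((module   & kFieldMask) << kModuleShift)   |
           ((eclass   & kFieldMask) << kClassShift);
}

// Mirrors the XRT_ERROR_NUM/DRIVER/SEVERITY/MODULE/CLASS accessors.
constexpr ErrorCode field(ErrorCode code, unsigned shift, ErrorCode mask) {
    return (code >> shift) & mask;
}

// FIRWWALL_TRIP (1), driver XOCL (0), severity EMERGENCY (0),
// module FIREWALL (0), class SYSTEM (1), using the enum values listed above.
constexpr ErrorCode kFirewallTrip = buildCode(1, 0, 0, 0, 1);
static_assert(kFirewallTrip == 0x10000000001ULL, "class in bit 40, num in bit 0");
```

Note that, as listed, the driver, severity, module, and class masks are 4 bits wide (0xF) even though the layout comment reserves 8 bits for each of those fields.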
The API header file experimental/xrt_error.h defines the APIs for accessing currently cached errors. It provides the xrtErrorGetLast() and xrtErrorGetString() APIs to retrieve the system-level asynchronous errors.
/**
* xrtErrorGetLast - Get the last error code and its timestamp of a given error class.
*
* @handle: Device handle.
* @class: Error Class for the last error to get.
* @error: Returned XRT error code.
* @timestamp: The timestamp when the error generated
*
* Return: 0 on success or appropriate XRT error code.
*/
int
xrtErrorGetLast(xrtDeviceHandle handle, xrtErrorClass ecl, xrtErrorCode* error, uint64_t* timestamp);
/**
* xrtErrorGetString - Get the description string of a given error code.
*
* @handle: Device handle.
* @error: XRT error code.
* @out: Preallocated output buffer for the error string.
* @len: Length of output buffer.
* @out_len: Output of length of message, ignored if null.
*
* Return: 0 on success or appropriate XRT error code.
*
* Specifying out_len while passing nullptr for output buffer will
* return the message length, which can then be used to allocate the
* output buffer itself.
*/
int
xrtErrorGetString(xrtDeviceHandle, xrtErrorCode error, char* out, size_t len, size_t* out_len);
The application can call xrtErrorGetLast() with a given error class to get the latest error code. The application can call xrtErrorGetString() with a given error code to get the error string corresponding to this error code. XRT maintains the latest error for each class and an associated timestamp for when the error was generated.
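The header comment for xrtErrorGetString() describes a two-call pattern: call once with a null output buffer to learn the message length, then call again with a buffer of that size. The sketch below illustrates that pattern against a stand-in function so that it runs without a device. fakeErrorGetString and errorString are hypothetical names, the stand-in omits the device handle, and whether the reported length includes the terminating null character is an assumption made here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Stand-in that mimics the documented contract of xrtErrorGetString(), minus
// the device handle. Not part of XRT. Assumption: *out_len counts the
// terminating '\0'.
int fakeErrorGetString(std::uint64_t error, char* out, std::size_t len,
                       std::size_t* out_len) {
    const std::string msg = (error == 1) ? "FIREWALL_TRIP" : "UNKNOWN";
    if (out_len) *out_len = msg.size() + 1;
    if (out && len > 0) {
        std::strncpy(out, msg.c_str(), len - 1);
        out[len - 1] = '\0';  // always terminate, even when truncated
    }
    return 0;
}

// Two-call pattern: query the length first, then fetch into an exact-size buffer.
// GetString is any callable with the stand-in's signature.
template <typename GetString>
std::string errorString(GetString getString, std::uint64_t code) {
    std::size_t needed = 0;
    if (getString(code, nullptr, 0, &needed) != 0 || needed == 0) return {};
    std::vector<char> buffer(needed);
    if (getString(code, buffer.data(), buffer.size(), nullptr) != 0) return {};
    return std::string(buffer.data());
}
```

In real host code, the callable passed to errorString could be a lambda that captures the device handle and forwards to xrtErrorGetString().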
xbutil2 can be used to report errors. The error report accumulates all the errors from the various classes and sorts them by timestamp. The report queries the drivers as to when the last reset was requested. This reset will be merged (using the timestamp) into the report listing.
$ xbutil2 examine -r error
Asynchronous Errors
Time Class Module Driver Severity Error Code
2020-Oct-08 16:40:02 CLASS_SYSTEM MODULE_FIREWALL DRIVER_XOCL SEVERITY_EMERGENCY FIREWALL_TRIP
$ xbutil2 examine -r error -f JSON-2020.2
{
"schema_version": {
"schema": "JSON",
"creation_date": "Fri Oct 9 11:04:24 2020 GMT"
},
"devices": [
{
"asynchronous_errors": [
{
"timestamp": "1602175202572070700",
"class": "CLASS_SYSTEM",
"module": "MODULE_FIREWALL",
"severity": "SEVERITY_EMERGENCY",
"driver": "DRIVER_XOCL",
"error_code": {
"error_id": "1",
"error_msg": "FIREWALL_TRIP"
}
}
]
}
]
}
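The timestamp in the JSON report appears to be expressed in nanoseconds since the Unix epoch: converting 1602175202572070700 that way gives 2020-10-08 16:40:02 UTC, which matches the time shown in the text report above. Under that assumption, the following sketch formats a timestamp for display and checks whether an error predates a reset. The function names are hypothetical and are not part of XRT.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Assumption: asynchronous error timestamps are nanoseconds since the Unix epoch.
// Formats the timestamp as "YYYY-MM-DD HH:MM:SS" in UTC; returns an empty
// string if the conversion fails.
std::string formatErrorTimestampUtc(std::uint64_t timestampNs) {
    const std::time_t seconds =
        static_cast<std::time_t>(timestampNs / 1000000000ULL);
    const std::tm* utc = std::gmtime(&seconds);
    if (!utc) return {};
    char text[32];
    if (std::strftime(text, sizeof text, "%Y-%m-%d %H:%M:%S", utc) == 0) return {};
    return text;
}

// A cached error that predates the last reset does not describe the current
// state of the device.
bool occurredBeforeReset(std::uint64_t errorNs, std::uint64_t lastResetNs) {
    return errorNs < lastResetNs;
}
```

Comparing the raw integer values, as occurredBeforeReset does, avoids any time zone handling; formatting is only needed for display.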
xbutil2 can also be used to report AI Engine running status and read registers for debug purposes. For example, the following command reads the status of kernels after the graph has executed.
$ xbutil2 examine -r aie
--------------------------
1/1 [0000:00:00.0] : edge
--------------------------
Aie
Aie_Metadata
GRAPH[ 0] Name : gr
Status : running
SNo. Core [C:R] Iteration_Memory [C:R] Iteration_Memory_Addresses
[ 0] 23:1 23:1 16388
[ 1] 23:2 23:0 6980
[ 2] 23:3 23:1 4
[ 3] 24:1 24:0 4
[ 4] 24:2 24:2 4
[ 5] 24:3 24:1 4
[ 6] 25:1 25:1 4
Core [ 0]
Column : 23
Row : 1
Core:
Status : core_done
Program Counter : 0x00000308
Link Register : 0x00000290
Stack Pointer : 0x000340a0
DMA:
MM2S:
Channel:
Id : 0
Channel Status : idle
Queue Size : 0
Queue Status : okay
Current BD : 0
Id : 1
Channel Status : idle
Queue Size : 0
Queue Status : okay
Current BD : 0
S2MM:
Channel:
Id : 0
Channel Status : idle
Queue Size : 0
Queue Status : okay
Current BD : 0
Id : 1
Channel Status : idle
Queue Size : 0
Queue Status : okay
Current BD : 0
Locks:
0 : released_for_write
1 : released_for_write
2 : released_for_write
3 : released_for_write
4 : released_for_write
5 : released_for_write
6 : released_for_write
7 : released_for_write
8 : released_for_write
9 : released_for_write
10 : released_for_write
11 : released_for_write
12 : released_for_write
13 : released_for_write
14 : released_for_write
15 : released_for_write
Events:
core : 1, 2, 5, 22, 23, 24, 28, 29, 31, 32, 35, 36, 38, 39, 40, 44, 45, 47, 68
memory : 1, 43, 44, 45, 106, 113
......
Core [ 6]
Column : 25
Row : 1
Core:
Status : enabled, east_lock_stall
Program Counter : 0x000001e6
Link Register : 0x000000b0
Stack Pointer : 0x00030020
DMA:
MM2S:
Channel:
Id : 0
Channel Status : stalled_on_requesting_lock
Queue Size : 0
Queue Status : okay
Current BD : 2
Id : 1
Channel Status : idle
Queue Size : 0
Queue Status : okay
Current BD : 0
S2MM:
Channel:
Id : 0
Channel Status : running
Queue Size : 0
Queue Status : okay
Current BD : 0
Id : 1
Channel Status : idle
Queue Size : 0
Queue Status : okay
Current BD : 0
Locks:
0 : acquired_for_write
1 : released_for_write
2 : released_for_write
3 : released_for_write
4 : released_for_write
5 : released_for_write
6 : released_for_write
7 : released_for_write
8 : released_for_write
9 : released_for_write
10 : released_for_write
11 : released_for_write
12 : released_for_write
13 : released_for_write
14 : released_for_write
15 : released_for_write
Events:
core : 1, 2, 5, 22, 26, 28, 29, 31, 32, 35, 38, 39, 44
memory : 1, 20, 21, 23, 35, 43, 44, 106, 113
The following command can be used to read specific registers for debug purposes.
$ xbutil2 advanced --read-aie-reg -d 0000:00:0 0 25 Core_Status
Register Core_Status Value of Row:0 Column:25 is 0x00000201
For AI Engine register definitions, see the Versal ACAP AI Engine Register Reference (AM015). For details on xbutil2 command use, see https://xilinx.github.io/XRT/master/html/index.html#.
AI Engine Error Events
This section provides error and related debug information for the errors obtained using the XRT error reporting APIs described previously. These are errors propagated from the AI Engine array and can be used to debug application specific errors in hardware. For errors with class XRT_ERROR_CLASS_AIE, you can obtain additional information by enabling the dmesg logs, which provide the causes of the error (and are described in the following tables). An example log is shown here:
[ 6616.963964] aie aie0: Asserted tile error event 56 at col 6 row 7
[DLBF] Completed reading 4 iterat[ 6616.970234] aie aie0: Asserted tile error event 56 at col 7 row 8
[ 6616.979187] aie aie0: Asserted tile error event 56 at col 8 row 5
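When many such entries appear, it can help to extract the event number and tile coordinates programmatically before looking them up in the error tables. The following sketch parses lines in the format of the example log above. The type and function names are hypothetical, and the sketch only recognizes the "Asserted tile error event" wording shown here; other message formats are not handled.

```cpp
#include <cstdio>
#include <string>

// Fields carried by one "Asserted tile error event" log entry.
struct TileErrorEvent {
    int event;  // error event number, to be looked up in the error tables
    int col;    // tile column
    int row;    // tile row
};

// Parses one dmesg line. Returns true and fills 'out' when the line contains
// an entry in the format shown above; otherwise leaves 'out' untouched.
// Searching for the marker first tolerates console output that is interleaved
// in front of the kernel message.
bool parseTileError(const std::string& line, TileErrorEvent& out) {
    const std::string marker = "Asserted tile error event ";
    const std::string::size_type pos = line.find(marker);
    if (pos == std::string::npos) return false;
    TileErrorEvent parsed{};
    const int matched = std::sscanf(line.c_str() + pos,
                                    "Asserted tile error event %d at col %d row %d",
                                    &parsed.event, &parsed.col, &parsed.row);
    if (matched != 3) return false;
    out = parsed;
    return true;
}

// Row 0 is the SHIM (interface) tile; AI Engine tiles start at row 1.
bool isInterfaceTile(const TileErrorEvent& e) {
    return e.row == 0;
}
```

For the first line of the example log, this yields event 56 at column 6, row 7, which is an AI Engine tile rather than an interface tile.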
Each log entry reports the col and row number of the tile where the error was asserted. Row 0 is the SHIM (interface) tile, AI Engines start from row 1.
The following tables list the various categories of error, in addition to the exact error number, description, and tips on the next steps to debug and resolve the errors.
Error Group | No. | Name | Description | Debug Tips |
---|---|---|---|---|
Instruction Errors | 59 | Instruction Decompression Error | Event generated when AI Engine cannot decompress instruction fetched. This can happen if the program instructions are corrupt. Validate ELF generation. | Regenerate the ELF file with the Vitis compiler (V++) --package command. If the issue persists, contact Xilinx support. |
Access Errors | 55 | PM Reg Access Failure | This error can happen on bank access conflict to PM by the memory mapped AXI interface and AI Engine. | Contact Xilinx support. |
Access Errors | 60 | DM address out of range | Event generated if AI Engine tries to access a memory location outside of 0x20000 to 0x3FFFF. | Run AI Engine simulator (aiesimulator) with --enable-memory-check that will flag any access violations. |
Access Errors | 65 | PM address out of range | Event generated if PC is out of range. | Run AI Engine simulator (aiesimulator) with --enable-memory-check that will flag any access violations. |
Access Errors | 66 | DM access to unavailable | Event generated if AI Engine issues an access to the isolated tile in neighborhood. | Check if the kernel running on the AI Engine accesses data memory of an isolated tile (a different partition). If the issue persists, contact Xilinx support. |
Bus Errors | 58 | AXI MM Slave Error | Event generated if the memory mapped AXI interface slave read/write request is for an address which does not exist in the AI Engine tile. | If the PL IP is accessing the AI Engine registers using the memory mapped AXI interface, check the PL IP to see if it accesses invalid registers. If the issue persists, contact Xilinx support. |
Stream Errors | 54 | TLAST in WSS words 0-2 | Event generated if TLAST is not on the 4th word of a wide stream. | If PL IP is used to generate the stream, check if it generates TLAST correctly. If the issue persists, contact Xilinx support. |
Stream Errors | 56 | Stream Pkt Parity Error | Event generated if there is any parity error in the packet header. | Check the data source such as PL IP which generates the packets to see if the packet is valid and if the parity bit is correctly calculated. If the data is from PL IP, check the packet header generated from the PL IP. |
Stream Errors | 57 | Control Pkt Error | Control Packet Error | Check the data source, such as PL IP which generates the packets to see if it generates the packets correctly. If the issue persists, contact Xilinx support. |
ECC Errors | 64 | PM ECC Error 2bit | Event generated when 2-bit ECC error is detected. | Re-run the application. If the issue persists, contact Xilinx support. |
ECC Errors | 62 | PM ECC Error Scrub 2bit | Event generated if ECC scrubber detects 2-bit ECC error. | Re-run the application. If the issue persists, contact Xilinx support. |
Lock Errors | 67 | Lock Access to unavailable | Event generated if AI Engine issues an access to the isolated tile in neighborhood. | Contact Xilinx support. |
Error Group | No. | Name | Description | Debug Tips |
---|---|---|---|---|
ECC Errors | 88 | DM ECC Error Scrub 2bit | Event generated when ECC scrubber detects 2-bit ECC error in bank 0 or bank 1 of DM. | Re-run the application. If the issue persists, contact Xilinx support. |
ECC Errors | 90 | DM ECC Error 2bit | Event generated when 2-bit ECC error is detected during access to bank 0 or 1 of DM. This data memory ECC error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
Memory Parity Errors | 91 | DM Parity Error Bank 2 | Event generated when a parity error is detected during access to DM bank 2. This data memory parity error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
Memory Parity Errors | 92 | DM Parity Error Bank 3 | Event generated when a parity error is detected during access to DM bank 3. This data memory parity error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
Memory Parity Errors | 93 | DM Parity Error Bank 4 | Event generated when a parity error is detected during access to DM bank 4. This data memory parity error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
Memory Parity Errors | 94 | DM Parity Error Bank 5 | Event generated when a parity error is detected during access to DM bank 5. This data memory parity error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
Memory Parity Errors | 95 | DM Parity Error Bank 6 | Event generated when a parity error is detected during access to DM bank 6. This data memory parity error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
Memory Parity Errors | 96 | DM Parity Error Bank 7 | Event generated when a parity error is detected during access to DM bank 7. This data memory parity error can be caused by DM access from the AI Engine, tile DMA, or memory mapped AXI interface. | Re-run the application. If the issue persists, contact Xilinx support. |
DMA Errors | 97 | DMA S2MM 0 Error | This error can be caused by writing to the BD task queue of S2MM channel 0 when it is full. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If the issue persists, contact Xilinx support. |
DMA Errors | 98 | DMA S2MM 1 Error | This error can be caused by writing to the BD task queue of S2MM channel 1 when it is full. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If the issue persists, contact Xilinx support. |
DMA Errors | 99 | DMA MM2S 0 Error | This error can be caused by writing to the BD task queue of MM2S channel 0 when it is full. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If the issue persists, contact Xilinx support. |
DMA Errors | 100 | DMA MM2S 1 Error | This error can be caused by writing to the BD task queue of MM2S channel 1 when it is full. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If the issue persists, contact Xilinx support. |
Error Group | No. | Name | Description | Debug Tips |
---|---|---|---|---|
Bus Errors | 62 | AXI MM Slave Tile Error | Event generated if a memory mapped AXI interface slave request comes to an interface tile but the address is invalid. | If using the PL IP to access the AI Engine register with the memory mapped AXI interface, check if the IP tries to access the wrong address. If the issue persists, contact Xilinx support. |
Bus Errors | 64 | AXI MM Decode NSU Error | The memory mapped AXI interface traffic internally has responded with a DECERR. For example, if a column or set of tiles is clock gated, a decode error is generated internally and travels on the memory mapped AXI interface to the interface tile to generate this event. | If using the PL IP to access the AI Engine register using the memory mapped AXI interface, check if the IP tries to access a tile which is gated. If the issue persists, contact Xilinx support. |
Bus Errors | 65 | AXI MM Slave NSU Error | The memory mapped AXI interface traffic internally has responded with a SLVERR. For example, an AI Engine tile in that interface tile column has responded with a slave error. That slave error will travel over the memory mapped AXI interface to the interface tile as a slave error. | If using the PL IP to access the AI Engine register with the memory mapped AXI interface, check if the IP tries to access the wrong address. If the issue persists, contact Xilinx support. |
Bus Errors | 66 | AXI MM Unsupported Traffic | The memory mapped AXI interface from the NoC has made a request that the AI Engine does not support. | If using the PL IP to access the AI Engine register with the memory mapped AXI interface, check if the IP generates unsupported memory mapped AXI interface requests. |
Bus Errors | 67 | AXI MM Unsecure Access in Secure Mode | The memory mapped AXI interface from the NoC is violating the secure mode (trying to route unsecured traffic when AI Engine only supports secure traffic). | Check if the AI Engine array is configured in secure mode. |
Bus Errors | 68 | AXI MM Byte Strobe Error | The memory mapped AXI interface from the NoC is writing with non-complete 32-bit words (within a 32-bit word all byte strobes must be set). | If the PL IP is accessing the AI Engine using the memory mapped AXI interface, check if all byte strobes are set for a 32-bit word. |
Stream Error | 63 | Control Pkt Error | Control Packet Error | If the PL IP is generating the control packets, check if the IP generates packets properly. If the issue persists, contact Xilinx support. |
DMA Error | 69 | DMA S2MM 0 Error | This DMA error is for DMA S2MM channel 0. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If you manage buffer descriptors in your application, check if the memory address sent to the interface tile DMA buffer descriptor is invalid. If the issue persists, contact Xilinx support. |
DMA Error | 70 | DMA S2MM 1 Error | This DMA error is for DMA S2MM channel 1. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If you manage buffer descriptors in your application, check if the memory address sent to the interface tile DMA buffer descriptor is invalid. If the issue persists, contact Xilinx support. |
DMA Error | 71 | DMA MM2S 0 Error | This DMA error is for DMA MM2S channel 0. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If you manage buffer descriptors in your application, check if the memory address sent to the interface tile DMA buffer descriptor is invalid. If the issue persists, contact Xilinx support. |
DMA Error | 72 | DMA MM2S 1 Error | This DMA error is for DMA MM2S channel 1. | If you manage buffer descriptors in your application, verify that you are not pushing new buffer descriptors when the queue is full. If you manage buffer descriptors in your application, check if the memory address sent to the interface tile DMA buffer descriptor is invalid. If the issue persists, contact Xilinx support. |
Host Code Reference with ADF API and XRT API
This section provides a summary of the XRT APIs that control the PL kernels and graph as well as a mapping relationship between the ADF API and XRT API. Complete host code using the ADF API or the XRT API to control the graph is also provided for reference.
XRT API | Description |
---|---|
Category: Device handle (experimental/xrt_device.h) | |
xrtDeviceHandle xrtDeviceOpen(unsigned int index); | Open a device and obtain its handle. |
xrtDeviceHandle xrtDeviceOpenFromXcl(xclDeviceHandle xhdl); | Get a device handle from xclDeviceHandle. |
int xrtDeviceClose(xrtDeviceHandle dhdl); | Close an opened device. |
int xrtDeviceLoadXclbinFile(xrtDeviceHandle dhdl, const char* xclbin_fnm); | Read and load an XCLBIN file. |
void xrtDeviceGetXclbinUUID(xrtDeviceHandle dhdl, xuid_t out); | Get UUID of XCLBIN image loaded on device. |
Category: PL kernel handle (experimental/xrt_kernel.h) | |
xrtKernelHandle xrtPLKernelOpen(xrtDeviceHandle deviceHandle, const xuid_t xclbinId, const char *name); | Open a PL kernel and obtain its handle. |
int xrtKernelClose(xrtKernelHandle kernelHandle); | Close an opened kernel. |
xrtRunHandle xrtKernelRun(xrtKernelHandle kernelHandle, ...); | Start a kernel execution. |
xrtRunHandle xrtRunOpen(xrtKernelHandle kernelHandle); | Open a new run handle for a kernel without starting kernel. |
int xrtRunSetArg(xrtRunHandle rhdl, int index, ...); | Set a specific kernel argument for this run. |
int xrtRunUpdateArg(xrtRunHandle rhdl, int index, ...); | Asynchronous update of kernel argument. |
int xrtRunStart(xrtRunHandle rhdl); | Start existing run handle. |
enum ert_cmd_state xrtRunWait(xrtRunHandle rhdl); | Wait for a run to complete. |
int xrtRunClose(xrtRunHandle rhdl); | Close a run handle. |
Category: Graph handle (experimental/xrt_graph.h) | |
xrtGraphHandle xrtGraphOpen(xrtDeviceHandle handle, const uuid_t xclbinUUID, const char *graphName); | Open a graph and obtain its handle. |
void xrtGraphClose(xrtGraphHandle gh); | Close an open graph. |
int xrtGraphRun(xrtGraphHandle gh, int iterations); | Start a graph execution. |
int xrtGraphWait(xrtGraphHandle gh, uint64_t cycle); | Wait a set number of AI Engine cycles since the last xrtGraphRun and then stop the graph. If cycle is 0, wait until the graph is finished. If the graph has already run more than the set number of cycles, stop the graph immediately. |
int xrtGraphResume(xrtGraphHandle gh); | Resume a suspended graph. |
int xrtGraphEnd(xrtGraphHandle gh, uint64_t cycle); | Wait a set number of AI Engine cycles since the last xrtGraphRun and then end the graph. If cycle is 0, wait until the graph is finished before ending the graph. If the graph has already run more than the set number of cycles, stop the graph immediately and end it. |
int xrtGraphUpdateRTP(xrtGraphHandle gh, const char *hierPathPort, const char *buffer, size_t size); | Update RTP value of port with hierarchical name. |
int xrtGraphReadRTP(xrtGraphHandle gh, const char *hierPathPort, char *buffer, size_t size); | Read RTP value of port with hierarchical name. |
Category: AIE handle (experimental/xrt_aie.h) | |
int xrtAIESyncBO(xrtDeviceHandle handle, xrtBufferHandle bohdl, const char *gmioName, enum xclBOSyncDirection dir, size_t size, size_t offset); | Transfer data between DDR memory and interface tile DMA channel. |
Category: Buffer object handle (experimental/xrt_bo.h) | |
xrtBufferHandle xrtBOAlloc(xrtDeviceHandle dhdl, size_t size, xrtBufferFlags flags, xrtMemoryGroup grp); | Allocate a BO of requested size with appropriate flags. |
int
xrtBOFree(xrtBufferHandle bhdl); |
Free a previously allocated BO. |
int
xrtBOSync(xrtBufferHandle bhdl, enum xclBOSyncDirection dir, size_t
size, size_t offset); |
Synchronize buffer contents in requested direction. |
void*
xrtBOMap(xrtBufferHandle bhdl); |
Memory map BO into user address space. |
Category: Error reporting (experimental/xrt_error.h) | |
int
xrtErrorGetLast(xrtDeviceHandle handle, xrtErrorClass ecl,
xrtErrorCode* error, uint64_t* timestamp); |
Get the last error code and its timestamp of a given error class. |
int
xrtErrorGetString(xrtDeviceHandle, xrtErrorCode error, char* out,
size_t len, size_t* out_len); |
Get the description string of a given error code. |
The following table lists the mapping between the ADF API and XRT API. The xrtGraphOpen(), xrtPLKernelOpen(), xrtRunOpen(), and xrtKernelClose() XRT APIs are called inside the ADF APIs when required, so no corresponding mapping is listed for them.
Graph API | XRT API |
---|---|
graph::run() |
xrtGraphRun(xrtGraphHandle, 0) for
AI Engine. |
graph::run(iterations) |
xrtGraphRun(xrtGraphHandle,
iterations) for AI Engine. |
graph::wait() |
xrtGraphWait(xrtGraphHandle, 0) for
AI Engine. |
graph::wait(aie_cycles) |
xrtGraphWait(xrtGraphHandle,
aie_cycles) for AI Engine. |
graph::resume() |
xrtGraphResume(xrtGraphHandle) |
graph::end() |
xrtGraphEnd(xrtGraphHandle, 0) and
then xrtGraphClose(xrtGraphHandle) for
AI Engine. |
graph::end(aie_cycles) |
xrtGraphEnd(xrtGraphHandle,
aie_cycles) and then xrtGraphClose(xrtGraphHandle) for AI Engine. |
graph::update() |
xrtGraphUpdateRTP()
for AI Engine. |
graph::read() |
xrtGraphReadRTP()
for AI Engine. |
GMIO::malloc() |
xrtBOAlloc() , xrtBOMap() |
GMIO::free() |
xrtBOFree() |
GMIO::gm2aie_nb() |
N/A |
GMIO::aie2gm_nb() |
N/A |
GMIO::wait() |
N/A |
GMIO::gm2aie() |
xrtAIESyncBO(...,XCL_BO_SYNC_BO_GMIO_TO_AIE,...) |
GMIO::aie2gm() |
xrtAIESyncBO(...,XCL_BO_SYNC_BO_AIE_TO_GMIO,...) |
adf::event APIs
for profiling and event trace |
N/A |
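The graph::end() rows of the mapping table can be read as a two-step sequence: end the graph, then close its handle. The following is a minimal sketch of that ordering only. The two functions named after XRT calls are local stand-ins that record the call order so the block compiles without XRT installed; they are not the real implementations, and end_graph() is a hypothetical helper that does not exist in either API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins for the XRT C API (assumption: signatures follow the table above).
// They only log the call order and always report success.
using xrtGraphHandle = void*;
static std::vector<std::string> call_log;

int xrtGraphEnd(xrtGraphHandle, std::uint64_t cycle) {
    call_log.push_back("xrtGraphEnd(" + std::to_string(cycle) + ")");
    return 0;
}

void xrtGraphClose(xrtGraphHandle) {
    call_log.push_back("xrtGraphClose");
}

// Hypothetical helper mirroring the graph::end(aie_cycles) row:
// xrtGraphEnd() first, then xrtGraphClose() on the same handle.
int end_graph(xrtGraphHandle gh, std::uint64_t aie_cycles = 0) {
    int ret = xrtGraphEnd(gh, aie_cycles);
    xrtGraphClose(gh);  // the handle is released even if ending reported an error
    return ret;
}
```

Calling end_graph(gh) corresponds to the graph::end() row, and end_graph(gh, n) to the graph::end(aie_cycles) row.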
The following is host code using the ADF API and XRT API for reference. The __USE_ADF_API__ is a user-defined macro in the code that can be used to switch between the ADF API and XRT API to control the AI Engine graph.
#include <stdlib.h>
#include <fstream>
#include <iostream>
#include "host.h"
#include <unistd.h>
#include <complex>
#include "adf/adf_api/XRTConfig.h"
#include "experimental/xrt_kernel.h"
#include "graph.cpp"
#define OUTPUT_SIZE 2048
using namespace adf;
int main(int argc, char* argv[]) {
size_t output_size_in_bytes = OUTPUT_SIZE * sizeof(int);
//TARGET_DEVICE macro needs to be passed from gcc command line
if(argc != 2) {
printf("Usage: %s <xclbin>\r\n",argv[0]);
return EXIT_FAILURE;
}
char* xclbinFilename = argv[1];
int ret;
// Open xclbin
auto dhdl = xrtDeviceOpen(0);//device index=0
if(!dhdl){
printf("Device open error\n");
return EXIT_FAILURE;
}
ret=xrtDeviceLoadXclbinFile(dhdl,xclbinFilename);
if(ret){
printf("Xclbin Load fail\n");
return EXIT_FAILURE;
}
xuid_t uuid;
xrtDeviceGetXclbinUUID(dhdl, uuid);
// output memory
xrtBufferHandle out_bohdl = xrtBOAlloc(dhdl, output_size_in_bytes, 0, /*BANK=*/0);
std::complex<short> *host_out = (std::complex<short>*)xrtBOMap(out_bohdl);
// s2mm ip
xrtKernelHandle s2mm_khdl = xrtPLKernelOpen(dhdl, uuid, "s2mm");
xrtRunHandle s2mm_rhdl = xrtRunOpen(s2mm_khdl);
xrtRunSetArg(s2mm_rhdl, 0, out_bohdl);
xrtRunSetArg(s2mm_rhdl, 2, OUTPUT_SIZE);
xrtRunStart(s2mm_rhdl);
printf("run s2mm\n");
#if __USE_ADF_API__
// update graph parameters (RTP) & run
adf::registerXRT(dhdl, uuid);
printf("Register XRT\r\n");
int narrow_filter[12] = {180, 89, -80, -391, -720, -834, -478, 505, 2063, 3896, 5535, 6504};
int wide_filter[12] = {-21, -249, 319, -78, -511, 977, -610, -844, 2574, -2754, -1066, 18539};
gr.run(16);//start AIE kernel
gr.update(gr.fir24.in[1], narrow_filter, 12);//update AIE kernel RTP
printf("Update fir24 done\r\n");
printf("Graph run done\r\n");
gr.wait(); // wait for AIE kernel to complete
printf("Graph wait done\r\n");
gr.update(gr.fir24.in[1], wide_filter, 12);//Update AIE kernel RTP
printf("Update fir24 done\r\n");
gr.run(16);//start AIE kernel
printf("Graph run done\r\n");
#else
int narrow_filter[12] = {180, 89, -80, -391, -720, -834, -478, 505, 2063, 3896, 5535, 6504};
int wide_filter[12] = {-21, -249, 319, -78, -511, 977, -610, -844, 2574, -2754, -1066, 18539};
auto ghdl=xrtGraphOpen(dhdl,uuid,"gr");
if(!ghdl){
printf("Graph Open error\r\n");
}else{
printf("Graph Open ok\r\n");
}
int size=1024;
xrtKernelHandle noisegen_khdl = xrtPLKernelOpen(dhdl, uuid, "random_noise");
xrtRunHandle noisegen_rhdl = xrtRunOpen(noisegen_khdl);
xrtRunSetArg(noisegen_rhdl, 1, size);
xrtRunStart(noisegen_rhdl);
printf("run noisegen\n");
ret=xrtGraphUpdateRTP(ghdl,"gr.fir24.in[1]",(char*)narrow_filter,12*sizeof(int));
if(ret!=0){
printf("Graph RTP update fail\n");
return 1;
}
ret=xrtGraphRun(ghdl,16);
if(ret){
printf("Graph run error\r\n");
}else{
printf("Graph run ok\r\n");
}
ret=xrtGraphWait(ghdl,0);
if(ret){
printf("Graph wait error\r\n");
}else{
printf("Graph wait ok\r\n");
}
xrtRunWait(noisegen_rhdl);
xrtRunSetArg(noisegen_rhdl, 1, size);
xrtRunStart(noisegen_rhdl);
printf("run noisegen\n");
ret=xrtGraphUpdateRTP(ghdl,"gr.fir24.in[1]",(char*)wide_filter,12*sizeof(int));
if(ret!=0){
printf("Graph RTP update fail\n");
return 1;
}
ret=xrtGraphRun(ghdl,16);
if(ret){
printf("Graph run error\r\n");
}else{
printf("Graph run ok\r\n");
}
#endif
// wait for s2mm done
auto state = xrtRunWait(s2mm_rhdl);
printf("s2mm completed with status %d\r\n",state);
xrtBOSync(out_bohdl, XCL_BO_SYNC_BO_FROM_DEVICE , output_size_in_bytes,/*OFFSET=*/ 0);
std::ofstream out("out.txt",std::ofstream::out);
std::ifstream golden("data/filtered.txt",std::ifstream::in);
short g_real=0,g_imag=0;
int match = 0;
for (int i = 0; i < OUTPUT_SIZE; i++) {
golden >> std::dec >> g_real;
golden >> std::dec >> g_imag;
if(g_real!=host_out[i].real() || g_imag!=host_out[i].imag()){
printf("ERROR: i=%d gold.real=%d gold.imag=%d out.real=%d out.imag=%d\r\n",i,g_real,g_imag,host_out[i].real(),host_out[i].imag());
match=1;
}
out<<host_out[i].real()<<" "<<host_out[i].imag()<<" "<<std::endl;
}
out.close();
golden.close();
#if __USE_ADF_API__
gr.end();
#else
ret=xrtGraphEnd(ghdl,0);
if(ret){
printf("Graph end error\r\n");
}
xrtRunClose(noisegen_rhdl);
xrtKernelClose(noisegen_khdl);
xrtGraphClose(ghdl);
#endif
xrtRunClose(s2mm_rhdl);
xrtKernelClose(s2mm_khdl);
xrtBOFree(out_bohdl);
xrtDeviceClose(dhdl);
char pPass[]= "PASSED";
char pFail[]= "FAILED";
char* presult;
presult = (match ? pFail : pPass);
printf("TEST %s\r\n",presult);
return (match ? EXIT_FAILURE : EXIT_SUCCESS);
}
The XRT API has C and C++ versions for controlling the PL kernels. For more information about the C++ version of the XRT API, see https://xilinx.github.io/XRT/master/html/xrt_native_apis.html.
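One detail in the host code above is easy to miss: gr.update() is given an element count (12), whereas xrtGraphUpdateRTP() is given a size in bytes (12*sizeof(int)). The following sketch packages an array as the buffer and size pair used by the XRT call. RtpPayload and make_rtp_payload() are hypothetical names introduced for illustration and are not part of XRT or the ADF API.

```cpp
#include <cstddef>

// Hypothetical holder for the (buffer, size) arguments of xrtGraphUpdateRTP().
struct RtpPayload {
    const char* buffer;  // start of the RTP data, viewed as raw bytes
    std::size_t size;    // size in bytes, not number of elements
};

// Hypothetical helper: derive the byte size from the array type so the
// element count and sizeof() cannot drift apart.
template <typename T, std::size_t N>
RtpPayload make_rtp_payload(const T (&values)[N]) {
    return { reinterpret_cast<const char*>(values), N * sizeof(T) };
}
```

With this helper, the payload for narrow_filter would carry the same pointer and the same 12*sizeof(int) byte count that the listing passes by hand.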
Host Programming for Bare-metal Systems
In a bare-metal/standalone environment, Xilinx provides a standalone board support package (BSP), drivers, and libraries that applications can use to reduce development effort. As with Host Programming on Linux, the top-level application for bare-metal systems must also integrate and manage the AI Engine graph and PL kernels.
The following is an example top-level application (main.cpp) for a bare-metal system:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include "platform.h"
#include "xparameters.h"
#include "xil_io.h"
#include "xil_cache.h"
#include "input.h"
#include "golden.h"
...
void InitData(int32_t** out, int size)
{
int i;
*out = (int32_t*)malloc(sizeof(int32_t) * size);
if(!*out) {
printf("Allocation of memory failed \n");
exit(-1);
}
for(i = 0; i < size; i++) {
(*out)[i] = 0xABCDEF00;
}
}
int RunTest(uint64_t mm2s_base, uint64_t s2mm_base, int32_t* in, int32_t* golden,
int32_t* out, int input_size, int output_size)
{
int i;
int errCount = 0;
uint64_t memAddr = (uint64_t)in;
uint64_t mem_outAddr = (uint64_t)out;
printf("Starting test w/ cu\n");
Xil_Out32(mm2s_base + MEM_OFFSET, (uint32_t) memAddr);
Xil_Out32(mm2s_base + MEM_OFFSET + 4, 0);
Xil_Out32(s2mm_base + MEM_OFFSET, (uint32_t) mem_outAddr);
Xil_Out32(s2mm_base + MEM_OFFSET + 4, 0);
Xil_Out32(mm2s_base + SIZE_OFFSET, input_size);
Xil_Out32(s2mm_base + SIZE_OFFSET, output_size);
Xil_Out32(mm2s_base + CTRL_OFFSET, 1);
Xil_Out32(s2mm_base + CTRL_OFFSET, 1);
printf("GRAPH INIT\n");
clipgraph.init();
printf("GRAPH RUN\n");
clipgraph.run();
while(1) {
uint32_t v = Xil_In32(s2mm_base + CTRL_OFFSET);
if(v & 6) {
break;
}
}
printf("PLIO IP DONE!\n");
for(i = 0; i < output_size; i++) {
if((((int32_t*)out)[i] != ((int32_t*)golden)[i]) ) {
printf("Error found in sample %d: %d != the golden %d\n", i+1, ((int32_t*)out)[i], ((int32_t*)golden)[i]);
errCount++;
}
else
printf("%d\n ",((int32_t*)out)[i]);
}
printf("Ending test w/ cu\n");
return errCount;
}
int main()
{
int i;
int32_t* out;
int errCount;
Xil_DCacheDisable();
init_platform();
sleep(1);
printf("Beginning test\n");
InitData(&out, OUTPUT_SIZE);
errCount = RunTest(MM2S_BASE, S2MM_BASE, (int32_t*)cint16input, int32golden, out, INPUT_SIZE, OUTPUT_SIZE);
if(errCount == 0)
printf("Test passed. \n");
else
printf("Test failed! Error count: %d \n",errCount);
cleanup_platform();
return errCount;
}
The following are the steps in the code example:
- The main() function initializes the platform and data, runs the test, verifies the return code, and returns the error code.
- InitData() allocates memory for size elements and initializes the successfully allocated memory to known data.
- RunTest() passes the necessary data to the kernels for processing and returns the result.
- clipgraph.init() initializes the tiles that kernels will be run on.
- clipgraph.run() starts kernels running on associated tiles.
The preceding code example references xparameters.h which is automatically generated from the bare-metal BSP. The application needs to ensure the bare-metal BSP is properly generated so that the memory mapped addresses for all drivers are correctly assigned.
xil_io.h contains general driver I/O APIs. This is the preferred method for accessing drivers.
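The while(1) loop in RunTest() polls the S2MM control register until the value masked with 6 is non-zero. The following sketch names those bits and adds a poll limit so a stalled kernel cannot hang the application. It assumes the common Vitis HLS control register layout (bit 1 is ap_done, bit 2 is ap_idle), which the example does not state. The read_ctrl callback is a hypothetical stand-in for Xil_In32(s2mm_base + CTRL_OFFSET), used so the sketch runs off-target.

```cpp
#include <cstdint>
#include <functional>

// Assumed layout of the kernel control register (not stated in the example).
constexpr std::uint32_t AP_DONE = 1u << 1;
constexpr std::uint32_t AP_IDLE = 1u << 2;
constexpr std::uint32_t DONE_OR_IDLE = AP_DONE | AP_IDLE;  // the literal 6 in RunTest()

// Hypothetical helper: poll until the kernel reports done or idle.
// Returns true on completion, false if max_polls reads all showed it busy.
bool wait_for_kernel(const std::function<std::uint32_t()>& read_ctrl,
                     unsigned max_polls) {
    for (unsigned n = 0; n < max_polls; ++n) {
        if (read_ctrl() & DONE_OR_IDLE) {
            return true;
        }
    }
    return false;
}
```

On the target, the callback could wrap the Xil_In32() read from the example. A suitable poll limit depends on the design and is left to the application.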
Addressing Kernels in Bare-metal Applications
For bare-metal applications, when addressing the PL kernels from the embedded application, you must use the control registers, or read and write to the kernel at the appropriate base address and offset as it is implemented in hardware. Looking at the application discussed earlier, the embedded application can deliver data to the MM2S kernel, to introduce it to the AI Engine graph for the Interpolator and Classifier kernels, and read data from the S2MM kernel to continue processing in the embedded application. In this case, address the MM2S and S2MM kernels as they are implemented in the PL region of the fixed platform.
The main.cpp of the example shows the #define statements for the kernel base address and the address offset for specific registers. For example:
#define MM2S_BASE XPAR_MM2S_S_AXI_CONTROL_BASEADDR
#define S2MM_BASE XPAR_S2MM_S_AXI_CONTROL_BASEADDR
#define MEM_OFFSET 0x10
#define SIZE_OFFSET 0x1C
#define CTRL_OFFSET 0x0
To determine the address and offsets for the kernels, examine some of the files in the fixed platform. The base addresses for the implemented kernels are defined in the fixed platform xparameters.h file, which is located in the <platform_name>/standalone_domain/bspinclude/include folder. For the example design, use the following entries in xparameters.h to determine the base addresses of these kernels.
/* Definitions for peripheral MM2S */
#define XPAR_MM2S_S_AXI_CONTROL_BASEADDR 0xA4020000
#define XPAR_MM2S_S_AXI_CONTROL_HIGHADDR 0xA402FFFF
/* Definitions for peripheral S2MM */
#define XPAR_S2MM_S_AXI_CONTROL_BASEADDR 0xA4030000
#define XPAR_S2MM_S_AXI_CONTROL_HIGHADDR 0xA403FFFF
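As a sanity check on such entries, the register offsets used by the application can be verified at compile time against each kernel's address window. The following is a minimal sketch. The numeric values are copied from the excerpts above only so the block stands alone, and reg_addr() and in_window() are hypothetical helpers, not BSP functions.

```cpp
#include <cstdint>

// Values copied from the xparameters.h excerpt above. On the target, include
// xparameters.h and use the generated macros instead of repeating the numbers.
constexpr std::uint64_t MM2S_BASEADDR = 0xA4020000;
constexpr std::uint64_t MM2S_HIGHADDR = 0xA402FFFF;
constexpr std::uint64_t S2MM_BASEADDR = 0xA4030000;
constexpr std::uint64_t S2MM_HIGHADDR = 0xA403FFFF;

// Register offsets from the example main.cpp.
constexpr std::uint64_t MEM_OFFSET  = 0x10;
constexpr std::uint64_t SIZE_OFFSET = 0x1C;
constexpr std::uint64_t CTRL_OFFSET = 0x0;

// Hypothetical helper: absolute address of a register.
constexpr std::uint64_t reg_addr(std::uint64_t base, std::uint64_t offset) {
    return base + offset;
}

// Hypothetical helper: true when a 32-bit register at base + offset
// lies entirely inside the window [base, high].
constexpr bool in_window(std::uint64_t base, std::uint64_t high,
                         std::uint64_t offset) {
    return reg_addr(base, offset) + 3 <= high;
}

static_assert(in_window(MM2S_BASEADDR, MM2S_HIGHADDR, SIZE_OFFSET),
              "MM2S size register must lie inside the MM2S window");
static_assert(in_window(S2MM_BASEADDR, S2MM_HIGHADDR, MEM_OFFSET + 4),
              "S2MM upper address word must lie inside the S2MM window");
```

A check like this fails the build, instead of failing at run time, if an offset is mistyped or the window shrinks after the platform is regenerated.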
The xparameters.h file is generated and addressing is dynamic. It is better to reference the address macros for kernels than to hard code them.
The address offsets are defined in the <kernel_driver>_hw.h file in the _x/<kernel>/<kernel>/<kernel>/solution/impl/ip/drivers/<kernel>_v1_0/src folder of the compiled kernel produced by the Vitis™ compiler. For example, the MM2S kernel driver file, xmm2s_mm2s_hw.h, displays the following data.
#define XMM2S_MM2S_CONTROL_ADDR_AP_CTRL 0x00
#define XMM2S_MM2S_CONTROL_ADDR_GIE 0x04
#define XMM2S_MM2S_CONTROL_ADDR_IER 0x08
#define XMM2S_MM2S_CONTROL_ADDR_ISR 0x0c
#define XMM2S_MM2S_CONTROL_ADDR_MEM_V_DATA 0x10
#define XMM2S_MM2S_CONTROL_BITS_MEM_V_DATA 64
#define XMM2S_MM2S_CONTROL_ADDR_SIZE_DATA 0x1c
#define XMM2S_MM2S_CONTROL_BITS_SIZE_DATA 32
Use these offsets when reading or writing to the kernels. For instance, from the example application main.cpp file use the following to write to the memory location.
Xil_Out32(MM2S_BASE + MEM_OFFSET, (uint32_t) memAddr);
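The driver header lists MEM_V_DATA as a 64-bit field, and the example writes the lower 32 bits at MEM_OFFSET and a zero at MEM_OFFSET + 4. If a buffer could be placed above the 4 GB boundary, the upper word would need to carry the high address bits instead. The following sketch splits the address that way. It assumes the upper word lives at offset + 4, as the zero write in the example suggests. write_reg32() and the fake_regs map are hypothetical stand-ins for Xil_Out32() and the memory-mapped registers, used so the sketch runs off-target.

```cpp
#include <cstdint>
#include <map>

// Simulated register file standing in for memory-mapped I/O.
static std::map<std::uint64_t, std::uint32_t> fake_regs;

// Stand-in for Xil_Out32(addr, value); records the write instead of touching hardware.
void write_reg32(std::uint64_t addr, std::uint32_t value) {
    fake_regs[addr] = value;
}

// Hypothetical helper: write a 64-bit buffer address as two 32-bit words,
// lower word at base + offset and upper word at base + offset + 4.
void write_addr64(std::uint64_t base, std::uint32_t offset, std::uint64_t addr) {
    write_reg32(base + offset,     static_cast<std::uint32_t>(addr & 0xFFFFFFFFu));
    write_reg32(base + offset + 4, static_cast<std::uint32_t>(addr >> 32));
}
```

For addresses below 4 GB this produces the same two writes as the example, with the upper word equal to zero.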