Debugging Applications and Kernels
The Vitis™ unified software platform provides application-level debug features and techniques that allow the host code, kernel code, and the interactions between them to be debugged. These features and techniques are split between software debugging and hardware debugging flows.
For software debugging, the host and kernel code can be debugged using the Vitis IDE, or using GDB from the command line as a standard debug tool.
For hardware debugging, kernels running on hardware can be debugged using the Xilinx® virtual cable (XVC) running over the PCIe® bus for Alveo™ Data Center accelerator cards, or using USB-JTAG cables for both Alveo cards and embedded processor platforms.
Debugging Flows
The Vitis unified software platform provides application-level debug features which allow the host code, the kernel code, and the interactions between them to be efficiently debugged in either the Vitis IDE, or from the command line. The recommended debugging flow consists of three levels of debugging:
- Debugging in Software Emulation to confirm the algorithm functionality of the application as represented in both your host program and kernel code.
- Debugging in Hardware Emulation to compile the kernel into RTL, confirm the behavior of the generated logic, and evaluate the simulated performance of the hardware.
- Debugging During Hardware Execution to implement the FPGA binary and debug the application running in hardware.
This three-tiered approach allows debugging the host and kernel code, and the interactions between them at different levels of abstraction. Each provides specific insights into the design and makes debugging easier. All flows are supported through an integrated GUI flow as well as through a batch flow using basic compile time and runtime setup options.
In the case of applications running on embedded processor platforms, some additional setup is required as described in Debugging on Embedded Processor Platforms.
Debugging in Software Emulation
The Vitis unified software platform supports typical software debugging for the host code at all times, the kernel code when running in software emulation mode, and at points during hardware emulation mode. This is a standard software debug flow using breakpoints, stepping through code, analyzing variables, and forcing the code into specific states.
The following figure shows the debug flow during software emulation for the host and kernel code (written in C/C++ or OpenCL™) using the GNU debugger (GDB). Notice the two instances of GDB to separately debug the host and kernel processes, and the use of the debug server (xrt_server).
Xilinx recommends iterating the design as much as possible in Software Emulation, which takes little compile time and executes quickly. For more detailed information on software emulation, see Software Emulation.
GDB-Based Debugging
Both the host and kernel code must be compiled for debugging using the -g option. With GDB, you can debug the kernel or host code, adding breakpoints and inspecting variables. This familiar software debug flow allows quick design, compile, and debug iterations to validate the functionality of your application. The Vitis debugger also provides extensions to GDB to let you examine the content of the Xilinx Runtime (XRT) library from the host program. These extensions can be used to debug protocol synchronization issues between the host and the kernel.
The Vitis core development kit supports GDB host program debugging in all flows, but kernel debugging is limited to the software and hardware emulation modes. Debugging features must be enabled in your host and kernel code by using the -g option during compilation and linking.
This section shows how host and kernel debugging can be performed with the help of GDB. Because this flow should be familiar to most software developers, this section focuses on the extensions of host code debugging capabilities for the XRT library and the requirements of kernel debug.
Xilinx Runtime Library GDB Extensions
Launching GDB through the Xilinx wrapper (xgdb) enables new GDB commands that give you visibility from the host application into the XRT library. The same commands can be enabled by sourcing the appdebug.py script as described in Launching Host and Kernel Debug. There are two kinds of commands which can be called from the gdb command line:
- xprint commands that give visibility into XRT library data structures (cl_command_queue, cl_event, and cl_mem). These commands are explained below.
- xstatus commands that give visibility into IP running on the Vitis target platform when debugging during hardware execution.
You can get more information about the xprint and xstatus commands by using help <command> from the gdb command prompt.
A typical use of these commands is when you see the host application hang. In this case, the host application could be waiting for the command queue to finish, or waiting on an event list. Printing the command queue using the xprint queue command can tell you what events are unfinished, allowing you to analyze the dependencies between events.
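For example, from the gdb prompt (a sketch only; the printed output is omitted here, and its exact format depends on your XRT version):

(gdb) help xprint
(gdb) xprint queue
(gdb) xprint mem
(gdb) xprint event <cl_event>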
The output of both of these commands is automatically tracked when debugging with the Vitis IDE. In this case, three tabs are provided next to the common tabs for Variables, Breakpoints, and Registers in the upper left corner of the debug perspective. These are labeled Command Queue, Memory Buffers, and Platform Debug, showing the output of xprint queue, xprint mem, and xstatus, respectively.
xprint Commands
The arguments to xprint queue and xprint mem are optional. The application debug environment keeps track of all the XRT library objects and automatically prints all valid queues and cl_mem objects if the argument is not specified. In addition, the commands do a proper validation of supplied command queue, event, and cl_mem arguments.
xprint queue [<cl_command_queue>]
xprint event <cl_event>
xprint mem [<cl_mem>]
xprint kernel
xprint all
xstatus Commands
This functionality is only available in the system flow (hardware execution) and not in any of the emulation flows.
xstatus all
xstatus --<ipname>
GDB Kernel-Based Debugging
GDB kernel debugging is supported for the software emulation and hardware emulation flows. When the GDB executable is connected to the kernel in the IDE or command line flows, you can set breakpoints and query the content of variables in the kernel, similar to normal host code debugging. This is fully supported in the software emulation flow because the kernel GDB processes attach to the spawned software processes.
However, during hardware emulation, the kernel source code is transformed into RTL by Vitis HLS and then executed. As the RTL model is simulated, all transformations for performance optimization and concurrent hardware execution are applied. For that reason, not all C/C++/OpenCL lines can be uniquely mapped to the RTL code, so only limited breakpoints are supported and only specific variables can be queried. The GDB tool therefore breaks on the next possible line based on the requested breakpoint statements, and clearly states if variables cannot be queried based on the RTL transformations.
Command Line Debug Flow
The following describes the steps required to run the debug flow in software emulation from the command line. Refer to Using the Vitis IDE for information on debugging in the IDE. Debugging in the Vitis core development kit uses the following steps:
- Compiling and linking the host code for debugging by adding the -g option to the g++ command line, as described in Building the Host Program.
- Compiling and linking the kernel code for debugging by adding the -g option to the v++ command line, as described in Building the Device Binary. Note: When debugging OpenCL kernels, there are additional steps that you can take during compiling and linking, as described in Debugging OpenCL Kernels.
- Launching GDB to debug the application. This process involves three command terminals, as described in Launching Host and Kernel Debug.
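For example, the following is a minimal sketch of debug-enabled build commands for software emulation. The vadd kernel name, file names, and the xilinx_u200_xdma_201830_2 platform are placeholders; substitute the values for your own project:

g++ -g -I${XILINX_XRT}/include -L${XILINX_XRT}/lib -o host.exe src/host.cpp -lOpenCL -lpthread
v++ -c -g -t sw_emu --platform xilinx_u200_xdma_201830_2 -k vadd -o vadd.xo src/vadd.cpp
v++ -l -g -t sw_emu --platform xilinx_u200_xdma_201830_2 -o vadd.xclbin vadd.xo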
Debugging OpenCL Kernels
For OpenCL kernels, additional runtime checks can be performed during software emulation. These additional checks include:
- Checking whether an OpenCL kernel makes out-of-bounds accesses to the interface buffers (fsanitize=address).
- Checking whether the kernel makes accesses to uninitialized local memory (fsanitize=memory).
These are Vitis compiler options that are enabled through the --advanced compiler option, as described in --advanced Options, using the following command syntax:
--advanced.param compiler.fsanitize=address,memory
When applied, the emulation run produces a debug log with emulation diagnostic messages that are written to <project_dir>/Emulation-SW/<proj_name>-Default/emulation_debug.log.
The fsanitize directive can also be specified in a config file, as follows:
[advanced]
#param=<param_type>:<param_name>.<value>
param=compiler.fsanitize=address,memory
Then the config file is specified on the v++ command line:
v++ -l -t sw_emu --config ./advanced.cfg -o bin_kernel.xclbin
Refer to the Vitis Compiler Configuration File for more information on the --config option.
Launching Host and Kernel Debug
In software emulation, the host program and the kernel code run as separate processes that are connected through the Vitis debug server (xrt_server). To connect the host and kernel code to the debug server, you must open three terminal windows using the following process.
- Open three terminal windows, and set up each window as described in Setting Up the Vitis Environment. The three windows are for:
  - Running xrt_server
  - Running GDB (xgdb) on the host code
  - Running GDB (xgdb) on the kernel code
- In the first terminal, after setting up the terminal environment, start the Vitis debug server using the following command:
xrt_server --sdx-url
The debug server listens for debug commands from the host and kernel, connecting the two processes to create a single debug environment. The xrt_server returns a listener port <num> on standard out. Keep track of the listener port number returned, as this port is used by GDB to debug the kernel process. To control this process, you must start new GDB instances and connect to the xrt_server. You will do this in the next steps.
IMPORTANT: With the xrt_server running, all spawned GDB processes wait for control from you. If no GDB ever attaches to the xrt_server, or provides commands, the kernel code appears to hang.
, or provides commands, the kernel code appears to hang. - In a second terminal, after setting up the terminal environment, launch GDB
for the host code as described in the following steps:
- Set the
ENABLE_KERNEL_DEBUG
environment variable. For example, in a C-shell use the following:setenv ENABLE_KERNEL_DEBUG true
- Set the
XCL_EMULATION_MODE
environment variable tosw_emu
mode as described in Running the Application Hardware Build. For example, in a C-shell use the following:setenv XCL_EMULATION_MODE sw_emu
- The runtime debug feature must be enabled using an
entry in the xrt.ini file, as
described in xrt.ini File. Create
an xrt.ini file in the same
directory as your host executable, and include the following
lines:
[Debug] app_debug=true
This informs the runtime library that the kernel has been compiled for debug, and that XRT library should enable debug features.
- Start
gdb
through the Xilinx wrapper:xgdb --args <host> <xclbin>
Where<host>
is the name of your host executable, and<xclbin>
is the name of the FPGA binary. For example:xgdb --args host.exe vadd.xclbin
Launching GDB from the
xgdb
wrapper performs the following setup steps for the Vitis debugger:- Loads GDB with the specified host program.
- Sources the Python script from the GDB command prompt to enable the
Vitis debugger
extensions:
gdb> source ${XILINX_XRT}/share/appdebug/appdebug.py
- Set the
- In a third terminal, after setting up the terminal environment, launch the xgdb command, and run the following commands from the (gdb) prompt:
  - For software emulation:
file <Vitis_path>/data/emulation/unified/cpu_em/generic_pcie/model/genericpciemodel
Where <Vitis_path> is the installation path of the Vitis core development kit. Using the $XILINX_VITIS environment variable will not work inside GDB.
  - For hardware emulation:
    - Locate the xrt_server temporary directory: /tmp/sdx/$uid.
    - Find the xrt_server process id (PID) containing the DWARF file of this debug session.
    - At the gdb command line, run: file /tmp/sdx/$uid/$pid/NUM.DWARF
  - In either case, connect to the kernel process:
target remote :<num>
Where <num> is the listener port number returned by the xrt_server.
TIP: When debugging software/hardware emulation kernels in the Vitis IDE, these steps are handled automatically and the kernel process is automatically attached, providing multiple contexts to debug both the host code and kernel code simultaneously.
With the three terminal windows running the xrt_server, GDB for the host, and GDB for the kernels, you can set breakpoints on your host or kernels as needed, run the continue command, and debug your application. When all kernel invocations have finished, the host code continues and the xrt_server connection drops.
Using printf() or cout to Debug Kernels
The basic approach to debugging algorithms is to verify key code steps and key data values throughout the execution of the program. For application developers, printing checkpoint statements and outputting current values in the code is a simple and effective method of identifying issues within the execution of a program. This can be done using the printf() function, or cout for standard output.
C/C++ Kernel
For C/C++ kernel models, printf() is only supported during software emulation, and should be excluded from the Vitis HLS synthesis step. In this case, any printf() statement should be surrounded by the following compiler macros:
#ifndef __SYNTHESIS__
printf("Checkpoint 1 reached");
#endif
For C++ kernels, you can also use cout in your code to add checkpoints or messages used for debugging the code. For example, you might add the following:
std::cout << "TEST " << (match ? "PASSED" : "FAILED") << std::endl;
OpenCL Kernel
The Xilinx Runtime (XRT) library supports the OpenCL™ printf() built-in function within kernels in all build configurations: software emulation, hardware emulation, and during hardware execution.
IMPORTANT: The printf() function is only supported in all build configurations for OpenCL kernels. For C/C++ kernels, printf() is only supported in software emulation.
The following is an example of using printf() in the kernel, and the output when the kernel is executed with global size of 8:
__kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void hello_world(__global int *a)
{
int idx = get_global_id(0);
printf("Hello world from work item %d\n", idx);
a[idx] = idx;
}
The output is as follows:
Hello world from work item 0
Hello world from work item 1
Hello world from work item 2
Hello world from work item 3
Hello world from work item 4
Hello world from work item 5
Hello world from work item 6
Hello world from work item 7
The printf() messages are buffered in the global memory and unloaded when kernel execution is completed. If printf() is used in multiple kernels, the order in which the messages from each kernel display on the host terminal is not certain. Note, especially when running in hardware emulation and hardware, that the hardware buffer size might limit printf output capturing.
Debugging in Hardware Emulation
During hardware emulation, kernel code is compiled into RTL code so that you can evaluate the RTL logic of kernels prior to implementation into the Xilinx device. The host code can be executed concurrently with a behavioral simulation of the RTL model of the kernel, directly imported, or created through Vitis HLS from the C/C++/OpenCL kernel code. For more information, see Hardware Emulation.
The following figure shows the hardware emulation flow diagram which can be used in the Vitis debugger to validate the host code, profile host and kernel performance, give estimated FPGA resource usage, and verify the kernel using an accurate model of the hardware (RTL). GDB can also be used for more traditional software-style debugging of the host and kernel code.
Verify that the host code and the kernel hardware implementation are correct by running hardware emulation on a data set. The hardware emulation flow invokes the Vivado logic simulator in the Vitis core development kit to test the kernel logic that is to be executed on the FPGA fabric. The interface between the models is represented by a transaction-level model (TLM) to limit the impact of the interface model on the overall execution time. The execution time for hardware emulation is longer than for software emulation.
During hardware emulation, you can optionally modify the kernel code to improve performance. Iterate your host and kernel code design in hardware emulation until the functionality is correct, and the estimated kernel performance is satisfactory.
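For example, the same build commands used for software emulation apply here with the target switched to hw_emu (a sketch; the kernel, file, and platform names are placeholders):

v++ -c -g -t hw_emu --platform <platform> -k vadd -o vadd.xo src/vadd.cpp
v++ -l -g -t hw_emu --platform <platform> -o vadd.xclbin vadd.xo
XCL_EMULATION_MODE=hw_emu ./host.exe vadd.xclbin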
GDB-Based Debugging in Hardware Emulation
Debugging using a software-based GDB flow is fully supported during hardware emulation. Because the Vitis debugger maps the RTL code back to the original C/C++ source code, there is no difference in using GDB for debugging in hardware emulation. However, mapping the code limits the breakpoints and observability of the variables in some cases, because during the RTL generation by Vitis HLS, some variables and loops from the kernel source might have been dissolved, or optimized away.
Using GDB for debugging the host and kernel code in hardware emulation uses the same three-terminal process described for software emulation. Refer to the instructions in Command Line Debug Flow for details of running this flow.
Waveform-Based Kernel Debugging
Because the C/C++ and OpenCL kernel code is synthesized into RTL code using Vitis HLS in the hardware emulation build configuration, you can also use RTL behavioral simulation to analyze the kernel logic. Hardware designers are likely to be familiar with this approach. This waveform-based HDL debugging is supported by the Vitis core development kit using both the command line flow, or through the IDE flow during hardware emulation.
Enable Waveform Debugging with the Vitis Compiler Command
Waveform debugging can be enabled through the v++ command using the following steps:
- Enable debug features in the kernel code during compilation and linking, as described in Building the Device Binary:
v++ -g ...
- Create an xrt.ini file in the same directory as the host executable, as described in xrt.ini File, with the following contents:
[Emulation]
debug_mode=batch
[Debug]
profile=true
timeline_trace=true
data_transfer_trace=fine
- Run the application, host and kernel, in hardware emulation mode. The waveform database, reflecting the hardware transaction data, is collected in a file named <hardware_platform>-<device_id>-<xclbin_name>.wdb. This file can be opened directly in the Vitis analyzer as described in Using the Vitis Analyzer.
TIP: If debug_mode=gui in the xrt.ini, a live waveform viewer is launched when the application is run, as described in Waveform View and Live Waveform Viewer. This is especially useful when debugging a hw_emu hang issue, because you can interrupt the simulation process in the simulator and observe the waveform up to that time.
Run the Waveform-Based Kernel Debugging Flow
The Vitis IDE provides waveform-based HDL debugging in the hardware emulation mode. The waveform is opened in the Vivado waveform viewer which should be familiar to Vivado logic simulation users. The Vitis IDE lets you display kernel interfaces, internal signals, and includes debug controls such as restart, HDL breakpoints, as well as HDL code lookup and waveform markers. In addition, it provides top-level DDR data transfers (per bank) along with kernel-specific details including compute unit stalls, loop pipeline activity, and data transfers.
For details, see Waveform View and Live Waveform Viewer.
If the live waveform viewer is activated, the waveform viewer automatically opens when running the executable. By default, the waveform viewer shows all interface signals and the following debug hierarchy:
- Memory Data Transfers
  - Shows the data transfers from all compute units that funnel through these interfaces. TIP: These interfaces could be a different bit width from the compute units. If so, the burst lengths would be different. For example, a burst of sixteen 32-bit words at a compute unit would be a burst of one 512-bit word at the OCL master.
- Kernel <kernel name><workgroup size> Compute Unit <CU name>
  - Kernel name, workgroup size, and compute unit name.
- CU Stalls (%)
  - Shows a summary of stalls for the entire CU. A bus of all lowest-level stall signals is created, and the bus is represented in the waveform as a percentage (%) of those signals that are active at any point in time.
- Data Transfers
  - Shows the data transfers for all AXI masters on the CU.
- User Functions
  - Lists all of the functions within the hierarchy of the CU.
- Function: <function name>
  - The function name.
- Dataflow/Pipeline Activity
  - Shows the function-level loop dataflow/pipeline signals for a CU.
- Function Stalls
  - Lists the three stall signals within this function.
- Function I/O
  - Lists the I/O for the function. These I/O are of protocol m_axi, ap_fifo, ap_memory, or ap_none.
Debug Techniques for Hardware Emulation
Due to the approximate models used in hardware emulation, the behavior of an emulated system might not match the hardware. The following list provides some common issues to examine if your application does not give expected results during hardware emulation:
- Review the host application to ensure that the event dependency between different kernel runs is correctly captured. Such issues can lead to unpredictable behavior. It is also possible that the application can pass in hardware, but there could be a logical bug in your application which can be triggered on hardware under slightly different conditions.
- If you have an RTL kernel, run the application in debug mode and ensure that there are no "X" (undriven) values in simulation in the kernel. These indicate incorrect code which can work in hardware but will fail in simulation with unpredictable behavior. If it is an HLS-generated kernel, confirm that all the variables are initialized to appropriate values.
- Ensure that the amount of data being processed by kernels in hardware emulation is small, so that emulation can finish in a reasonable time. Otherwise, it can appear that the application is running forever or has "hung". In this case, when running the application in hardware emulation, look for INFO: [Vitis-EM 22] messages in the host application console. Check that the amount of data being read/written to or from global memory is increasing:
  - If the RD/WR data is increasing, this indicates that application and hardware execution is progressing. The application is not hung, but is taking a long time to complete. This could be due to a large data size, or due to kernels performing memory reads and writes in an inefficient manner. The application and kernel need to be optimized.
  - If the RD/WR data is not increasing in successive messages, this indicates that simulation is running but there is a deadlock in the hardware somewhere, either in the kernel or the rest of the platform. Review the AXI transactions at the boundary of the kernel, the interconnect (for example, sdx_memss), and other places to check if there is an incomplete transaction or whether any transaction is being generated by the kernel.
- Run hardware emulation in waveform mode, and review the timeline trace. Check whether the kernel is getting "started" and "done" by observing the traffic on its AXI4-Lite interface, or by observing the output interrupt from the kernel.
- Review the [Emulation] section of the xrt.ini File to enable applicable settings that can help narrow down the issue in your application or kernel.
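For example, as described in Enable Waveform Debugging with the Vitis Compiler Command, switching the debug_mode in the xrt.ini launches the live waveform viewer, so a hung simulation can be interrupted and inspected (a minimal sketch):

[Emulation]
debug_mode=gui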
Debugging During Hardware Execution
During hardware execution, the actual hardware platform is used to execute the kernels, and you can evaluate the performance of the host program and accelerated kernels just by running the application. However, debugging the hardware build requires additional logic to be incorporated into the application. This will impact both the FPGA resources consumed by the kernel and the performance of the kernel running in hardware. The debug configuration of the hardware build includes special ChipScope debug cores, such as Integrated Logic Analyzer (ILA) and Virtual Input/Output (VIO) cores, and AXI performance monitors for debug purposes.
The following figure shows the debug process for the hardware build, including debugging the host code using GDB, and using the Vivado hardware manager, with waveform analysis, kernel activity reports, and memory access analysis to identify and localize hardware issues.
With the system hardware build configured for debugging, the host program running on the CPU and the Vitis accelerated kernels running on the Xilinx device can be confirmed to be executing correctly on the actual hardware of the target platform. Some of the conditions that can be identified and analyzed include the following:
- System hangs caused by protocol violations:
- These violations can take down the entire system.
- These violations can cause the kernel to get invalid data or to hang.
- It is hard to determine where or when these violations originated.
- To debug this condition, you should use an ILA triggered off of the AXI protocol checker, which needs to be configured on the Vitis target platform.
- Problems with the hardware kernel:
- Problems sometimes caused by the implementation: timing issues, race conditions, and bad design constraints.
- Functional bugs that hardware emulation does not reveal.
- Performance issues:
- For example, the frames per second processing is not what you expect.
- You can examine data beats and pipelining.
- Using an ILA with trigger sequencer, you can examine the burst size, pipelining, and data width to locate the bottleneck.
Enabling Kernels for Debugging with Chipscope
System ILA
The key to hardware debugging lies in instrumenting the kernels with the required debug logic. The following topics discuss the v++ linker options that can be used to list the available kernel ports, enable the System Integrated Logic Analyzer (ILA) core on selected ports, and enable the AXI Protocol Checker debug core for checking for protocol violations.
The ILA core provides transaction-level visibility into an instance of a compute unit (CU) running on hardware. AXI traffic of interest can also be captured and viewed using the ILA core. The ILA provides custom event triggering on one or more signals to allow waveform capture at system speeds. The waveforms can be analyzed in a viewer and used to debug hardware, finding protocol violations or performance issues. It can also be crucial for debugging difficult situations like application hangs.
Captured data can be accessed through the Xilinx virtual cable (XVC) using the Vivado tools. See the Vivado Design Suite User Guide: Programming and Debugging (UG908) for complete details.
The ILA core can be added to an existing RTL kernel to enable debugging features within that design, or it can be inserted automatically by the v++ compiler during the linking stage. The v++ command provides the --debug option, as described in --debug Options, to attach System ILA cores at the interfaces to the kernels for debugging and performance monitoring purposes.
The --debug option to enable ILA IP core insertion has the following syntax:
--debug.chipscope <cu_name>[:<interface_name>]
In general, the <interface_name> is optional. If not specified, all ports are expected to be analyzed.
AXI Protocol Checker
The AXI Protocol Checker core monitors AXI interfaces. When attached to an interface, it actively checks for protocol violations and provides an indication of which violation occurred. You can assign it for all CUs in the design, or for specific CUs and ports.
The --debug option to enable AXI Protocol Checker insertion has the following syntax:
--debug.protocol all
The protocol checker can be specified with the keyword all, or with <cu_name>:<interface_name>.
TIP: The --debug.list_ports option can be specified to return the actual names of ports on the kernel to use with protocol or chipscope.
An example flow you could use for adding ILA or protocol checkers to your design is outlined below:
- Compile the kernel source files into an XO file, using the -g option to instrument the kernel for debug features:
v++ -c -g -k <kernel_name> --platform <platform> -o <kernel_xo_file>.xo <kernel_source_files>
- After the kernel has been compiled into an XO file, use --debug.list_ports to cause the v++ compiler to print the list of valid compute units and port combinations for the kernel:
v++ -l -g --platform <platform> --connectivity.nk <kernel_name>:<compute_units>:<kernel_nameN> --debug.list_ports <kernel_xo_file>.xo
- Add the ILA or AXI debug cores on the desired ports by replacing list_ports with the appropriate --debug.chipscope or --debug.protocol command syntax:
v++ -l -g --platform <platform> --connectivity.nk <kernel_name>:<compute_units>:<kernel_nameN> --debug.chipscope <compute_unit_name>:<interface_name> <kernel_xo_file>.xo
Note: The --debug option can be specified multiple times in a single v++ command line, or configuration file, to specify multiple CUs and interfaces.
When the design is built, you can debug the design using the Vivado hardware manager as described in Debugging with ChipScope.
Adding Debug IP to RTL Kernels
You can also enable debugging in RTL kernels by manually adding ChipScope debug cores like the ILA and VIO in your RTL kernel code before packaging it for use in the Vitis development flow. From within the Vivado Design Suite, edit the RTL kernel code to manually instantiate an ILA debug core, or VIO IP from the Xilinx IP catalog, similar to using any other IP in Vivado IDE. Refer to the HDL Instantiation flow in the Vivado Design Suite User Guide: Programming and Debugging (UG908) to learn more about adding debug cores to your design.
The best time to add debug cores to your RTL kernel is when you create it. However, debug cores consume device resources and can affect performance, so it is good practice to make one kernel for debug and a second kernel for production use. The rtl_vadd_hw_debug example in the RTL Kernels examples on GitHub shows an ILA debug core instantiated into the RTL kernel source file. The ILA monitors the output of the combinatorial adder, as specified in the src/hdl/krnl_vadd_rtl_int.sv file.
// ILA monitoring combinatorial adder
ila_0 i_ila_0 (
.clk(ap_clk), // input wire clk
.probe0(areset), // input wire [0:0] probe0
.probe1(rd_fifo_tvalid_n), // input wire [0:0] probe1
.probe2(rd_fifo_tready), // input wire [0:0] probe2
.probe3(rd_fifo_tdata), // input wire [63:0] probe3
.probe4(adder_tvalid), // input wire [0:0] probe4
.probe5(adder_tready_n), // input wire [0:0] probe5
.probe6(adder_tdata) // input wire [31:0] probe6
);
You can also add the ILA debug core using a Tcl script from within an open Vivado project, using the Netlist Insertion flow described in Vivado Design Suite User Guide: Programming and Debugging (UG908), as shown in the following Tcl script example:
create_ip -name ila -vendor xilinx.com -library ip -version 6.2 -module_name ila_0
set_property -dict [list CONFIG.C_PROBE6_WIDTH {32} CONFIG.C_PROBE3_WIDTH {64} \
CONFIG.C_NUM_OF_PROBES {7} CONFIG.C_EN_STRG_QUAL {1} CONFIG.C_INPUT_PIPE_STAGES {2} \
CONFIG.C_ADV_TRIGGER {true} CONFIG.ALL_PROBE_SAME_MU_CNT {4} CONFIG.C_PROBE6_MU_CNT {4} \
CONFIG.C_PROBE5_MU_CNT {4} CONFIG.C_PROBE4_MU_CNT {4} CONFIG.C_PROBE3_MU_CNT {4} \
CONFIG.C_PROBE2_MU_CNT {4} CONFIG.C_PROBE1_MU_CNT {4} CONFIG.C_PROBE0_MU_CNT {4}] [get_ips ila_0]
After the RTL kernel has been instrumented for debug with the appropriate debug cores, you can analyze the hardware in the Vivado hardware manager as described in Debugging with ChipScope.
Enabling ILA Triggers for Hardware Debug
To perform hardware debug of both the host program and the kernel code running on the target platform, the application host code must be modified to let you set up the ILA trigger conditions after the kernel has been programmed into the device, but before starting the kernel.
Adding ILA Triggers Before Starting Kernels
Pausing the host program can be accomplished through the use of a pause, or wait step, in the code, such as the wait_for_enter function used in the RTL Kernel example on GitHub. The function is defined in the src/host.cpp code as follows:
void wait_for_enter(const std::string &msg) {
std::cout << msg << std::endl;
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
The wait_for_enter function is used in the main function as follows:
....
std::string binaryFile = xcl::find_binary_file(device_name,"vadd");
cl::Program::Binaries bins = xcl::import_binary_file(binaryFile);
devices.resize(1);
cl::Program program(context, devices, bins);
cl::Kernel krnl_vadd(program,"krnl_vadd_rtl");
wait_for_enter("\nPress ENTER to continue after setting up ILA trigger...");
//Allocate Buffer in Global Memory
std::vector<cl::Memory> inBufVec, outBufVec;
cl::Buffer buffer_r1(context,CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY,
vector_size_bytes, source_input1.data());
...
//Copy input data to device global memory
q.enqueueMigrateMemObjects(inBufVec,0/* 0 means from host*/);
//Set the Kernel Arguments
...
//Launch the Kernel
q.enqueueTask(krnl_vadd);
The use of the wait_for_enter function pauses the host program to give you time to set up the required ILA triggers and prepare to capture data from the kernel. After the Vivado hardware manager is set up and configured, press Enter to continue running the application.
- For C++ host code, add a pause after the creation of the cl::Kernel object, as shown in the example above.
- For C-language host code, add a pause after the clCreateKernel() function call, as shown in the sketch below:
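The following is a minimal sketch of that placement using the OpenCL C API, assuming the same wait_for_enter helper shown above (error handling omitted):

cl_int err;
// Create the kernel, then pause so the ILA triggers can be armed before it runs.
cl_kernel krnl_vadd = clCreateKernel(program, "krnl_vadd_rtl", &err);
wait_for_enter("\nPress ENTER to continue after setting up ILA trigger...");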
Pausing the Host Application Using GDB
If you are running GDB to debug the host program at the same time as performing hardware debug on the kernels, you can also pause the host program as needed by inserting a breakpoint at the appropriate line of code. Instead of making changes to the host program to pause the application as needed, you can set a breakpoint prior to the kernel execution in the host code. When the breakpoint is reached, you can set up the debug ILA triggers in Vivado hardware manager, arm the trigger, and then resume the host program in GDB.
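For example, a hypothetical session might look like the following, where <line> is the line in host.cpp just before the kernel is enqueued:

xgdb --args host.exe vadd.xclbin
(gdb) break host.cpp:<line>
(gdb) run
(gdb) continue

When the breakpoint is reached, set up and arm the ILA trigger in the Vivado hardware manager, and then issue the continue command to resume the host program.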
Debugging with ChipScope
You can use the ChipScope debugging environment and the Vivado hardware manager to help you debug your host application and kernels quickly and more effectively. These tools enable a wide range of capabilities from logic to system-level debug while your kernel is running in hardware. To achieve this, at least one of the following must be true:
- Your Vitis application project has been designed with debug cores, using the --debug.xxx compiler switch, as described in Enabling Kernels for Debugging with Chipscope.
- The RTL kernels used in your project have been instantiated with debug cores, as described in Adding Debug IP to RTL Kernels.
Checking the FPGA Board for Hardware Debug Support
Supporting hardware debugging requires the platform to support several IP components, most notably the Debug Bridge. Talk to your platform designer to determine if these components are included in the target platform. If a Xilinx platform is used, debug availability can be verified using the platforminfo utility to query the platform. Debug capabilities are listed under the chipscope_debug objects.
For example, to query a platform for hardware debug support, the following platforminfo command can be used:
$ platforminfo --json="hardwarePlatform.extensions.chipscope_debug" xilinx_u200_xdma_201830_2
{
"debug_networks": {
"user": {
"name": "User Debug Network",
"pcie_pf": "1",
"bar_number": "0",
"axi_baseaddr": "0x000C0000",
"supports_jtag_fallback": "false",
"supports_microblaze_debug": "true",
"is_user_visible": "true"
},
"mgmt": {
"name": "Management Debug Network",
"pcie_pf": "0",
"bar_number": "0",
"axi_baseaddr": "0x001C0000",
"supports_jtag_fallback": "true",
"supports_microblaze_debug": "true",
"is_user_visible": "false"
}
}
}
The response shows that the target platform contains user and mgmt debug networks, supports debugging a MicroBlaze™ processor, and also supports JTAG fallback for the Management Debug Network.
Running XVC and HW Servers
- Add debug IP to the kernel as discussed in Enabling Kernels for Debugging with Chipscope.
- Modify the host program to pause at the appropriate point as described in Enabling ILA Triggers for Hardware Debug.
- Set up the environment for hardware debug, using an automated script described in Automated Setup for Hardware Debug, or manually as described in Manual Setup for Hardware Debug.
- Run the hardware debug flow using the following process:
  - Launch the required XVC and the hw_server of the Vivado hardware manager.
  - Run the host program and pause at the appropriate point to enable setup of the ILA triggers.
  - Open the Vivado hardware manager and connect to the XVC server.
  - Set up ILA trigger conditions for the design.
  - Continue execution of the host program.
  - Inspect kernel activity in the Vivado hardware manager.
  - Rerun iteratively from step b (above) as required.
Automated Setup for Hardware Debug
- Set up your Vitis core development kit as described in Setting Up the Vitis Environment.
- Use the debug_hw script to launch the xvc_pcie and hw_server apps as follows:
debug_hw --xvc_pcie /dev/xvc_pub.<driver_id> --hw_server
The debug_hw script returns the following:
launching xvc_pcie...
xvc_pcie -d /dev/xvc_pub.<driver_id> -s TCP::10200
launching hw_server...
hw_server -sTCP::3121
TIP: The /dev/xvc_pub.<driver_id> driver character path is defined on your machine, and can be found by examining the /dev folder.
- Modify the host code to include a pause statement after the kernel has been created/downloaded and before the kernel execution is started, as described in Enabling ILA Triggers for Hardware Debug.
- Run your modified host program.
- Launch the Vivado Design Suite using the debug_hw script:
debug_hw --vivado --host <host_name> --ltx_file ./_x/link/vivado/vpl/prj/prj.runs/impl_1/debug_nets.ltx
TIP: The <host_name> is the name of your system.
As an example, the command window displays the following results:
launching vivado... ['vivado', '-source', 'vitis_hw_debug.tcl', '-tclargs', '/tmp/project_1/project_1.xpr', 'workspace/vadd_test/System/pfm_top_wrapper.ltx', 'host_name', '10200', '3121']
****** Vivado v2019.2 (64-bit)
  **** SW Build 2245749 on Date Time
  **** IP Build 2245576 on Date Time
    ** Copyright 1986-2019 Xilinx, Inc. All Rights Reserved.
start_gui
- In the Vivado Design Suite, run the ILA trigger.
- Press Enter to continue running the host program.
- In the Vivado hardware manager, see the interface transactions on the kernel compute unit slave control interface in the Waveform view.
Manual Setup for Hardware Debug
There are a few steps required to start the debug servers prior to debugging the design in the Vivado hardware manager.
- Set up your Vitis core development kit as described in Setting Up the Vitis Environment.
- Launch the xvc_pcie server. The file name passed to xvc_pcie must match the character driver file installed with the kernel device driver, where <driver_id> can be found by examining the /dev folder.
>xvc_pcie -d /dev/xvc_pub.<driver_id>
TIP: The xvc_pcie server has many useful command line options. You can issue xvc_pcie -help to obtain the full list of available options.
- Start the hw_server on port 3121, and connect to the XVC server on port 10201 using the following command:
>hw_server -e "set auto-open-servers xilinx-xvc:localhost:10201" -e "set always-open-jtag 1"
- Launch the Vivado Design Suite and open the hardware manager:
vivado
Starting Debug Servers on an Amazon F1 Instance
Instructions to start the debug servers on an Amazon F1 instance can be found here: https://github.com/aws/aws-fpga/blob/master/hdk/docs/Virtual_JTAG_XVC.md.
Debugging Designs Using Vivado Hardware Manager
Traditionally, a physical JTAG connection is used to perform hardware debug for Xilinx devices with the Vivado hardware manager. The Vitis unified software platform also makes use of the Xilinx virtual cable (XVC) for hardware debugging on remote accelerator cards. To take advantage of this capability, the Vitis debugger uses the XVC server, an implementation of the XVC protocol that allows the Vivado hardware manager to connect to a local or remote target device for debug, using standard Xilinx debug cores like the ILA or the VIO IP.
The Vivado hardware manager, from the Vivado Design Suite or Vivado debug feature, can be running on the target instance or it can be running remotely on a different host. The TCP port on which the XVC server is listening must be accessible to the host running Vivado hardware manager. To connect the Vivado hardware manager to XVC server on the target, the following steps should be followed on the machine hosting the Vivado tools:
- Launch the Vivado debug feature, or the full Vivado Design Suite.
- Select Open Hardware Manager from the Tasks menu, as shown in the following figure.
- Connect to the Vivado tools hw_server, specifying a local or remote connection, and the Host name and Port, as shown below.
- Connect to the target instance Virtual JTAG XVC server.
- Select the debug_bridge instance from the Hardware window in the Vivado hardware manager. Specify the probes file (.ltx) for your design by adding it to the entry in the Hardware Device Properties window. Adding the probes file refreshes the hardware device, and the Hardware window should now show the debug cores in your design. TIP: If the kernel has debug cores as specified in Enabling Kernels for Debugging with Chipscope, the probes file (.ltx) is written out during the implementation of the kernel by the Vivado tool.
- The Vivado hardware manager can now be used to debug the kernels running on the Vitis software platform. Arm the ILA cores in your kernels and run your host application.
JTAG Fallback for Private Debug Network
Hardware debug for the Alveo Data Center accelerator cards typically uses the XVC-over-PCIe connection due to the inaccessibility of the physical card, and the JTAG connector on the card. While XVC-over-PCIe allows you to remotely debug your application running on the target platform, certain conditions such as AXI interconnect system hangs can prevent you from accessing the hardware debug functionality that depends on these PCIe/AXI features. Being able to debug these kinds of conditions is especially important for platform designers.
The JTAG Fallback feature is designed to provide access to debug networks that were previously only accessible through XVC-over-PCIe. The JTAG Fallback feature can be enabled without having to change the XVC-over-PCIe-based debug network in the platform design.
On the host side, when the Vivado hardware manager user connects through the hw_server to a JTAG cable that is connected to the physical JTAG pins of the accelerator card, or device under test (DUT), the hw_server disables the XVC-over-PCIe pathway to the hardware. This lets you use the XVC-over-PCIe cable as your primary debug path, but enable debug over the JTAG cable directly when it is required in certain situations. When you disconnect from the JTAG cable, the hw_server re-enables the XVC-over-PCIe pathway to the hardware.
JTAG Fallback Steps
- Enable the JTAG Fallback feature of the Debug Bridge (AXI-to-BSCAN mode) master of the debug network to which you want to provide JTAG access. This step enables a BSCAN slave interface on this Debug Bridge instance.
- Instantiate another Debug Bridge (BSCAN Primitive mode) in the static logic partition of the platform design.
- Connect the BSCAN master port of the Debug Bridge (BSCAN Primitive mode) from step 2 to the BSCAN slave interface of the Debug Bridge (AXI-to-BSCAN mode) from step 1.
Utilities for Hardware Debugging
In some cases, the normal Vitis IDE and command line debug features are limited in their ability to isolate an issue. This is especially true when the software or hardware appears not to make any progress (hangs). These kinds of system issues are best analyzed with the help of the utilities mentioned in this section.
Using the Linux dmesg Utility
Well-designed kernels and modules report issues through the kernel ring buffer. This is also true for Vitis technology modules that allow you to debug the interaction with the accelerator board on the lowest Linux level.
The dmesg utility is a Linux tool that lets you read the kernel ring buffer. The kernel ring buffer holds kernel information messages in a circular buffer. A circular buffer of fixed size is used to limit the resource requirements by overwriting the oldest entry with the next incoming message.
TIP: The dmesg output can be combined with the xbutil feature to localize an issue. Refer to Using the Xilinx xbutil Utility for more information on using this tool for debug.
In the Vitis technology, the xocl and xclmgmt driver modules write informational messages to the ring buffer. Thus, for an application hang, crash, or any unexpected behavior (like being unable to program the bitstream, etc.), the dmesg tool should be used to check the ring buffer.
The following image shows the layers of the software platform associated with the target platform.
To review messages from the Linux tool, you should first clear the ring buffer:
sudo dmesg -c
This flushes all messages from the ring buffer and makes it easier to spot messages from xocl and xclmgmt. After that, start your application and run dmesg in another terminal:
sudo dmesg
The dmesg utility prints a record as shown in the following example:
In the example shown above, the AXI Firewall 2 has tripped, which is better examined using the xbutil utility.
Using the Xilinx xbutil Utility
The Xilinx board utility (xbutil) is a powerful standalone command line utility that can be used to debug lower level hardware/software interaction issues. A full description of this utility can be found in xbutil Utility.
The following xbutil options are of special interest:
- query
  - Provides an overall status of a card including information on the kernels in card memory.
- program
  - Downloads a binary (xclbin) to the programmable region of the Xilinx device.
- status
  - Extracts the status of the Performance Monitors (spm) and the Lightweight AXI Protocol Checkers (lapc).
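For example, the options described above can be combined into a short checking sequence (the xclbin name is a placeholder):

xbutil program -p vadd.xclbin
xbutil query
xbutil status --lapc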
Techniques for Debugging Application Hangs
This section discusses debugging issues related to the interaction of the host code and the accelerated kernels. Problems with these interactions manifest as issues such as machine hangs or application hangs. Although the GDB debug environment might help with isolating the errors in some cases (xprint), such as hangs associated with specific kernels, these issues are best debugged using the dmesg and xbutil commands as shown here.
If this process does not resolve the problem, it is necessary to perform hardware debugging using the ChipScope feature.
AXI Firewall Trips
- Use xbutil to program the FPGA:
xbutil program -p <xclbin>
- Run the xbutil query option to check memory topology:
xbutil query
In the following example, there are no kernels associated with memory banks:
- If the host code expects any DDR banks/PLRAMs to be used, this report should indicate an issue. In this case, it is necessary to check kernel and host code expectations. If the host code is using the Xilinx OpenCL extensions, it is necessary to check which DDR banks should be used by the kernel. These should match the connectivity.sp options specified as discussed in Mapping Kernel Ports to Memory.
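For example, a hypothetical config-file entry that maps a compute unit port to a specific DDR bank looks like the following; the compute unit and port names are placeholders, and must match what the host code expects:

[connectivity]
sp=vadd_1.m_axi_gmem:DDR[0]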
Kernel Hangs Due to AXI Violations
- The Vitis core development kit provides two options for instrumentation to be applied during v++ linking (--link). Both of these options add hardware to your implementation, and based on resource utilization it might be necessary to limit instrumentation.
  - Add Lightweight AXI Protocol Checkers (lapc). These protocol checkers are added using the --debug.protocol option, as explained in --debug Options. The following syntax is used:
--debug.protocol <compute_unit_name>:<interface_name>
In general, the <interface_name> is optional. If not specified, all ports on the CU are expected to be analyzed. The --debug.protocol option is used to define the protocol checkers to be inserted. This option can accept a special keyword, all, for <compute_unit_name> and/or <interface_name>.
Note: Multiple --debug.xxx options can be specified in a single command line, or configuration file.
  - Add Performance Monitors (am, aim, asm) to enable the listing of detailed communication statistics (counters). Although this is most useful for performance analysis, it provides insight during debugging on pending port activities. The Performance Monitors are added using the --profile option as described in --profile Options. The basic syntax for the --profile option is:
--profile.data <krnl_name>|all:<cu_name>|all:<intrfc_name>|all:<counters>|all
Three fields are required to determine the specific interface that the performance monitor is attached to. However, if resource consumption is not an issue, the keyword all lets you apply the monitoring to all existing kernels, compute units, and interfaces with a single option. Otherwise, you can specify the kernel_name, cu_name, and interface_name explicitly to limit instrumentation.
The last option, <counters>|all, allows you to restrict the information gathering to just counters for large designs, while all (default) includes the collection of actual trace information.
Note: Multiple --profile options can be specified in a single command line, or configuration file. For example:
[profile]
data=kernel1:cu1:m_axi_gmem0
data=kernel1:cu1:m_axi_gmem1
data=kernel2:cu2:m_axi_gmem
- When the application is rebuilt, rerun the host application using the xclbin with the added AIM IP and LAPC IP.
with the added AIM IP and LAPC IP. - When the application hangs, you can use
xbutil status
to check for any errors or anomalies. - Check the AIM output:
- Run
xbutil status --aim
a couple of times to check if any counters are moving. If they are moving then the kernels are active.TIP: Testing AIM output is also supported through GDB debugging using the command extensionxstatus spm
. - If the counters are stagnant, the outstanding counts greater than zero might mean some AXI transactions are hung.
- Run
- Check the LAPC output:
- Run
xbutil status --lapc
to check if there are any AXI violations.TIP: Testing LAPC output is also supported through GDB debugging using the command extensionxstatus lapc
. - If there are any AXI violations, it implies that there are issues in the kernel implementation.
- Run
Host Application Hangs When Accessing Memory
- If the platform has an AXI firewall, such as in the Vitis target platforms, it is likely to trip. The driver issues a SIGBUS error, kills the application, and resets the device. You can check this by running xbutil query. The following figure shows such an error in the firewall status:
Firewall Last Error Status:
 0: 0x0 (GOOD)
 1: 0x0 (GOOD)
 2: 0x80000 (RECS_WRITE_TO_BVALID_MAX_WAIT). Error occurred on Tue 2017-12-19 11:39:13 PST
Xclbin ID: 0x5a39da87
TIP: If the firewall has not tripped, the Linux tool, dmesg, can provide additional insight.
, can provide additional insight. - When you know that the firewall has tripped, it is important to determine
the cause of the DMA timeout. The issue could be an illegal DMA transfer, or
kernel misbehavior. However, a side effect of the AXI firewall tripping is that
the health check functionality in the driver resets the board after killing the
application; any information on the device that might help with debugging the
root cause is lost. To debug this issue, disable the health check thread in the
xclmgmt
kernel module to capture the error. This uses common Unix kernel tools in the following sequence:sudo modinfo xclmgmt
: This command lists the current configuration of the module and indicates if thehealth_check
parameter is ON or OFF. It also returns the path to thexclmgmt
module.sudo rmmod xclmgmt
: This removes and disables thexclmgmt
kernel module.sudo insmod <path to module>/xclmgmt.ko health_check=0
: This re-installs thexclmgmt
kernel module with the health check disabled.TIP: The path to this module is reported in the output of the call tomodinfo
.
- With the health check disabled, rerun the application. You can use the kernel instrumentation to isolate this issue as previously described.
Typical Errors Leading to Application Hangs
The user errors that typically create application hangs are listed below:
- Read-before-write in 5.0+ target platforms causes a Memory Interface Generator error correction code (MIG ECC) error. This is typically a user error. For example, this error might occur when a kernel is expected to write 4 KB of data in DDR, but it produces only 1 KB of data, and then tries to transfer the full 4 KB of data to the host. It can also happen if you supply a 1 KB buffer to a kernel, but the kernel tries to read 4 KB of data.
- An ECC read-before-write error also occurs if no data has been written to a memory location since the last bitstream download (which results in MIG initialization), but a read request is made for that same memory location. ECC errors stall the affected MIG, because kernels are usually not able to handle this error. This can manifest in two different ways:
  - The CU might hang or stall because it cannot handle this error while reading or writing to or from the affected MIG. The xbutil query shows that the CU is stuck in a BUSY state and is not making progress.
  - The AXI Firewall might trip if a PCIe® DMA request is made to the affected MIG, because the DMA engine is unable to complete the request. AXI Firewall trips result in the Linux kernel driver killing all processes which have opened the device node with the SIGBUS signal. The xbutil query shows if an AXI Firewall has indeed tripped, and includes a timestamp.
Defensive Programming
The Vitis compiler is capable of creating very efficient implementations. In some cases, however, implementation issues can occur. One such case is if a write request is emitted before there is enough data available in the process to complete the write transaction. This can cause deadlock conditions when multiple concurrent kernels are affected by this issue and the write request of a kernel depends on the input read being completed.
To avoid these situations, a conservative mode is available on the adapter. In principle, it delays the write request until it has all of the data necessary to complete the write. This mode is enabled during compilation by applying the following --advanced.param option to the v++ compiler:
--advanced.param compiler.axiDeadLockFree=yes
Because enabling this mode can impact performance, you might prefer to use this as a defensive programming technique where this option is inserted during development and testing and then removed during optimization. You might also want to add this option when the accelerator hangs repeatedly.
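Following the same config-file pattern shown earlier for the fsanitize parameter, this option can also be placed in a configuration file passed to v++ with --config (a sketch):

[advanced]
param=compiler.axiDeadLockFree=yes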
Debugging on Embedded Processor Platforms
Debugging on embedded processor platforms, such as the xilinx_zcu104_base_202010_1 platform, requires the use of the QEMU emulation environment to model the Arm processor and operating system for the device. As described in the next sections, running or debugging the application requires the additional step of launching the emulator, or connecting to the hardware platform through a TCF agent.
Emulation Debug for Embedded Processors
From within the Vitis IDE, launching debug for the software and hardware emulation builds includes the following steps:
- In the Assistant view, right-click the Emulation-SW or Emulation-HW build and select Set Active to make the build active.
- From the Assistant view menu, select the Debug command, and then select the Launch on Emulator command to launch the debug environment. This opens the Launch on Emulator dialog box as shown in the following figure, which prompts you to confirm launching the emulation environment and connecting to it using a Linux TCF agent. Select Start Emulator and Debug to continue.
This launches the emulation environment (QEMU), and loads the application in preparation for debugging. The application is paused as it enters the main() function. The Debug perspective is opened in the Vitis IDE, and you are ready to begin debugging your application.
Hardware Debug for Embedded Processors
- Copy the contents of the <project>/Hardware/sd_card folder to a physical SD card. This creates a bootable medium for your target platform.
- Insert the SD card into the card reader of your embedded processor platform.
- Change the boot-mode settings of the platform to SD boot mode, and power up the board.
- After the device is booted, enter the mount command at the command prompt to get a list of mount points. The mount command displays mounting information for the system. TIP: Be sure to capture the proper path for the cd command in the next step, and subsequent commands, based on the results of the mount command.
- Execute the following commands, for example:
cd /run/media/mmcblk0p1
source init.sh
cat /etc/xocl.txt
The cat command displays the platform name, xilinx_vck190_base_202010_1, to let you confirm it is the same as your specified platform and that your setup is correct.
- Run ifconfig to get the IP address of the target card. You will use this IP address to set up a TCF agent connection from the Vitis IDE to the embedded processor platform.
- Create a target connection to the remote accelerator card using the Target Connections view.
- In the Target Connections view, right-click on the Linux TCF Agent and select the New Target command to open the New Target Connection dialog box.
- Specify the Target Name, enable
the Set as default target check box,
and specify the Host IP address of the
accelerator card that you obtained in an earlier step.
- Click OK to close the dialog box and continue.
- In the Assistant view, right-click on the Hardware build and select Set Active to make it the active build.
- From the Assistant view menu, select the Debug command, and select the Debug Configurations command. This opens the Debug Configurations dialog box to let you configure debug for the Hardware build on your specific platform.
Set the following fields on the Main tab of the dialog box:
- Name: Specifies a name for your Hardware debug configuration.
- Linux TCF Agent: Selects the new agent you created with the specified IP address for the accelerator card.
- Configuration: Ensure you have selected the Hardware configuration.
- Enable Profiling: Select this option if you want to capture trace data from events.
Select the Application tab in the Debug Configurations dialog box, and set the following fields:
- Local File Path: Specifies where the files created on the target platform will be written back to your local disk.
- Remote File Path: Specifies the remote mount location on the accelerator card, as determined in an earlier step.
- Working directory: Specifies the location to write files created on the target platform.
- Select Apply to save your
changes, and Debug to start the
process.
This opens the Debug perspective in the Vitis IDE, and connects to the PS application on your hardware platform. The application automatically breaks at the
main()
function to let you set up and configure the debug environment.
Example of Command Line Debugging
- In a terminal, set up your environment as described in Setting Up the Vitis Environment.
- If you have not already done so, clone the Vitis Examples GitHub repository to acquire all of
the Vitis
examples:
git clone https://github.com/Xilinx/Vitis_Accel_Examples.git
This creates a Vitis_Examples directory which includes the IDCT example.
- Change to the IDCT example directory:
cd Vitis_Examples/vision/idct/
The host code is fully contained in src/idct.cpp and the kernel code is part of src/krnl_idct.cpp.
- Build the kernel software for software emulation as discussed in Building the Device Binary.
- Compile the kernel object file for debugging using the v++ compiler, where -g indicates that the code is compiled for debugging:
v++ -t sw_emu --platform <DEVICE> -g -c -k krnl_idct \
-o krnl_idct.xo src/krnl_idct.cpp
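If you intend to debug in hardware emulation instead, the same command applies with -t hw_emu; the -t value must then also match on the link command in the next step:
v++ -t hw_emu --platform <DEVICE> -g -c -k krnl_idct \
-o krnl_idct.xo src/krnl_idct.cpp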
- Link the kernel object file, also specifying -g:
v++ -g -l -t sw_emu --platform <DEVICE> --config config.cfg \
-o krnl_idct.xclbin krnl_idct.xo
The --config option specifies the configuration file, config.cfg, that contains the directives for the build process as described in the Vitis Compiler Configuration File. In this file, the nk option instantiates a single compute unit (CU) of krnl_idct named krnl_idct_1, and the sp options map its AXI master interfaces to specific DDR banks. The contents of the configuration file are as follows:
kernel_frequency=250

[connectivity]
nk=krnl_idct:1:krnl_idct_1
sp=krnl_idct_1.m_axi_gmem0:DDR[0]
sp=krnl_idct_1.m_axi_gmem1:DDR[0]
sp=krnl_idct_1.m_axi_gmem2:DDR[1]

[advanced]
prop=solution.hls_pre_tcl='src/hls_config.tcl'
- Compile and link the host code for debugging using the GNU compiler chain, g++, as described in Building the Host Program. Note: For embedded processor target platforms, use the GNU Arm cross-compiler as described in Compiling and Linking for Arm.
  - Compile the host code C++ files for debugging using the -g option:
g++ -c -I${XILINX_XRT}/include -g -o idct.o src/idct.cpp
  - Link the object files for debugging using -g:
g++ -g -lOpenCL -lpthread -lrt -lstdc++ -L${XILINX_XRT}/lib/ -o idct idct.o
- As described in emconfigutil Utility, prepare the emulation environment using the following command:
emconfigutil --platform <device>
The actual emulation mode (sw_emu or hw_emu) then needs to be set through the XCL_EMULATION_MODE environment variable. In C-shell this would be as follows:
setenv XCL_EMULATION_MODE sw_emu
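If your shell is bash rather than C-shell, the equivalent setting uses the standard export syntax:
export XCL_EMULATION_MODE=sw_emu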
- As described in xrt.ini File, you must set up the runtime for debug. In the same directory as the compiled host application, create an xrt.ini file with the following content:
[Debug]
app_debug=true
- Run GDB on the host and kernel code. The following steps guide you through the command line debug process, which requires three separate command terminals, set up as described in Setting Up the Vitis Environment.
- In the first terminal, start the XRT debug server,
which handles the transactions between the host and kernel
code:
${XILINX_VITIS}/bin/xrt_server --sdx-url
- In a second terminal, set the emulation mode:
setenv XCL_EMULATION_MODE sw_emu
Run GDB by executing the following:
xgdb --args idct krnl_idct.xclbin
Enter the following at the gdb prompt:
run
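As with any GDB session, you can also set host-side breakpoints before entering run; for example, to stop at the start of the host program (standard GDB commands):
break main
run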
- In the third terminal, attach the software emulation or hardware emulation model to GDB to allow stepping through the design. Here, there is a difference between running software emulation and hardware emulation. In either flow, start up another xgdb:
xgdb
- For debugging in software emulation:
  - Type the following at the gdb prompt:
file <XILINX_VITIS>/data/emulation/unified/cpu_em/generic_pcie/model/genericpciemodel
Note: Because GDB does not expand the environment variable, you must specify the actual path to the Vitis software platform installation, represented here by <XILINX_VITIS>.
- For debugging in hardware emulation:
  - Locate the xrt_server temporary directory: /tmp/sdx/$uid.
  - Find the xrt_server process ID (PID) containing the DWARF file of this debug session.
  - At the gdb prompt, run:
file /tmp/sdx/$uid/$pid/NUM.DWARF
- In either case, connect to the kernel process:
target remote :NUM
where NUM is the number returned by the xrt_server as the GDB listener port.
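From here you can set breakpoints on the kernel function itself and resume execution; for example, using the kernel name from the compile step earlier (standard GDB commands; because the process is already running, use continue rather than run):
break krnl_idct
continue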
At this point, debugging the host and kernel code can be done as usual with GDB, with the host code and the kernel code running in two different GDB sessions. This is common when dealing with different processes.
IMPORTANT: Be aware that the application might hit a breakpoint in one process before the next breakpoint in the other process is hit. In these cases, the debugging session in one terminal appears to hang, while the second terminal is waiting for input.