Compiling an AI Engine Graph Application
This chapter describes all the command line options passed to the AI Engine compiler (aiecompiler). The compiler takes the code for the data flow graph and the code for the individual kernels, and produces an image that can be run on various AI Engine targets such as simulators, emulators, and AI Engine devices. Unless otherwise specified, all input file paths are relative to the current directory, and all output file paths are relative to the Work directory.
The AI Engine graph and kernels can be compiled individually, or as a standalone application to run in the AI Engine processor array through emulation or on hardware. The graph and kernels can also be used as part of a larger system design that incorporates the AI Engine graph with an ELF application running on the embedded processor of the Versal™ device and PL kernels running in the programmable logic of the device. The AI Engine compiler is used to compile the graph and kernels, whether in a standalone configuration or as part of a larger system.
As shown in Using the Vitis IDE, the Vitis™ IDE can be used to create and manage project build settings, and run the AI Engine compiler. Alternatively, you can build the project from the command line as discussed in Integrating the Application Using the Vitis Tools Flow, or in a script or Makefile. Either approach lets you perform simulation or emulation to verify the graph application or the integrated system design, debug the design in an interactive debug environment, and build your design to run and deploy on hardware. Whatever method you choose to work with the tools, start by setting up the environment.
Setting Up the Vitis Tool Environment
The AI Engine tools are delivered and installed as part of the Vitis unified software platform. Therefore, when preparing to run the AI Engine tools, for example, the AI Engine compiler and AI Engine simulator, you must set up the Vitis tools. The Vitis unified software platform includes two elements that must be installed and configured, along with a valid Vitis tools license, to work together properly.
- Vitis tools and AI Engine tools
- A target Vitis platform, such as the xilinx_vck190_base_202020_1 platform used for AI Engine applications
For more information, see Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).
When the elements of the Vitis software platform are installed, set up the environment to run in a specific command shell by running the following scripts.
#setup XILINX_VITIS and XILINX_HLS variables
source <Vitis_install_path>/Vitis/<version>/settings64.csh
The settings64.sh and setup.sh scripts are also provided in the same directory. Finally, define the location of the available target platforms to use with the Vitis IDE and AI Engine tools using the following environment variable:
setenv PLATFORM_REPO_PATHS <path to platforms>
The PLATFORM_REPO_PATHS environment variable points to directories containing platform files (XPFM). This lets you specify platforms using just the folder name for the platform.
You can validate the installation and setup of the tools by using one of the following commands.
which vitis
which aiecompiler
You can validate the platform installation by using the following command.
platforminfo --list
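If you work in a bash shell rather than csh, the equivalent setup is a short script. This is a sketch only; the install path, version, and platform path below are placeholders for your installation.

```shell
# Bash equivalent of the csh setup above; <Vitis_install_path>, <version>,
# and <path to platforms> are placeholders for your installation.
source <Vitis_install_path>/Vitis/<version>/settings64.sh
export PLATFORM_REPO_PATHS=<path to platforms>

# Confirm the tools and platforms are visible.
which vitis
which aiecompiler
platforminfo --list
```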
Inputs
The AI Engine compiler takes inputs in several forms and produces executable applications for running on an AI Engine device. The command line for running the AI Engine compiler is as follows:
aiecompiler [options] <Input File>
where:
<Input File> specifies the data flow graph code that defines the main() application for the AI Engine graph. The input flow graph is specified using a data flow graph language. Refer to Creating a Data Flow Graph (Including Kernels) for a description of the data flow graph.
An example AI Engine compiler command:
aiecompiler --verbose --pl-freq=100 --workdir=./myWork --platform=xilinx_vck190_202020_1.xpfm \
--include="./" --include="./src" --include="./src/kernels" --include="./data" --include="${XILINX_HLS}/include" \
./src/graph.cpp
Some additional input options for the command line can include the following:
--constraints=<jsonfile> specifies constraints such as location or placement bounding box.
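For example, one or more constraint files can be supplied on the command line. This is an illustrative sketch; the file names below are placeholders.

```shell
# Sketch: supply multiple JSON constraint files (file names are hypothetical).
aiecompiler --constraints=./constraints/placement.json \
            --constraints=./constraints/locations.json \
            ./src/graph.cpp
```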
Outputs
By default, the AI Engine compiler writes all outputs to a directory called Work and a file called libadf.a. Work is a sub-directory of the directory where the tool was launched, and libadf.a, created in that same directory, is used as an input to the Vitis compiler. The type and contents of the output directory depend on the --target specified, as described in AI Engine Compiler Options. For more information about the Vitis compiler, see Vitis Compiler Command in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).
The output directory can be changed using the --workdir option. The structure and contents of the ./Work directory are described in the following table.
Directory/Files | Description
---|---
./Work/ |
<name>.aiecompile_summary | A generated file that can be opened in Vitis analyzer to see a compilation summary.
config/scsim_config.json | A JSON script that specifies options to the SystemC simulator. It includes AI Engine array tile geometry, input/output file specifications, and their connections to the stream switches.
arch/ |
logical_arch_aie.larch | A JSON file describing the hardware requirements of the AI Engine application.
aieshim_constraints.json | If present, this JSON file represents the user-defined physical interface constraints between the AI Engine array and programmable logic provided through the AI Engine application.
aieshim_solution.aiesol | A JSON file describing the mapping from logical to physical channels crossing the interface between the AI Engine array and the programmable logic.
cfgraph.xml | An XML file describing the hardware requirements of the AI Engine application. This is used by the Vitis tools flow.
aie/ |
Makefile | A Makefile to compile code for all AI Engines.
<n>_<m>/ | Individual AI Engine compilation directories.
Release/ | Synopsys release directory for the AI Engine, including the ELF file.
<n>_<m>.lst | Microcode of the kernel at <n>_<m>.
<n>_<m>.map | Shows the memory mapping of the kernel at <n>_<m>, including the memory size, width, and offset.
scripts/ | Synopsys compiler project and linker scripts.
src/ | Source files for the processor, including kernels and main.
pl/systemC/ | Directory containing SystemC models for all PL kernels.
Makefile | A Makefile to compile all PL SystemC models.
generated-source/ | SystemC wrappers for each PL kernel.
generated-objects/ | Compiled shared libraries for each PL kernel.
ps/c_rts/ | Directory containing C-based run-time control for modeling PS interaction.
aie_control.cpp | Generated AI Engine control code implementing the init, run, and end graph APIs for the specific graph objects present in the program. This file is linked with the application main to create a PS thread for the simulator and bare metal.
aie_control_xrt.cpp | Generated AI Engine control code implementing the init, run, and end graph APIs for the specific graph objects present in the program. This file is linked with the application main to create a PS thread for the Linux application.
systemC/ | Directory containing SystemC models for the PS main.
Makefile | A Makefile to compile all PS SystemC models.
generated-source/ | SystemC wrappers for the PS main.
generated-objects/ | Compiled shared libraries for the PS main.
ps/cdo/ | Directory containing generator code for graph configuration and initialization in configuration data object (CDO) format. This is used during SystemC-RTL simulation and during actual hardware execution.
Makefile | A Makefile to compile the graph CDO.
generateAIEConfig | A bash script for building the graph CDO.
generated-sources/ | C++ program to generate the CDO.
generated-objects/ | Compiled program to generate the CDO.
pthread/ |
PthreadSim.c | A source-to-source translation of the input data flow graph into a C program implemented using pthreads.
sim.out | The GCC-compiled binary for PthreadSim.c.
reports/ |
<graph>_mapping_analysis_report.txt | Mapping report describing the allocation of kernels to AI Engines and window buffers to AI Engine memory groups.
<graph>.png | A bitmap file showing the kernel graph connectivity and partitioning over AI Engines.
<graph>.xpe | An XML file describing the estimated power profile of the graph based on the hardware resources used. This file is used with the Xilinx® Power Estimator (XPE) tool.
sync_buffer_address.json | Shows kernel sync buffer addresses with local and global addresses.
lock_allocation_report.json | Describes the ports and the locks and buffers associated with the kernels.
dma_lock_report.json | Shows DMA locks for inputs/outputs to the AI Engine, as well as the kernel(s) they connect to, with buffer information.
temp/ | Contains temporary files generated by the AI Engine compiler that can be useful in debugging. The CF graph .o file is also created here by default.
AI Engine Compiler Options
Option Name | Description
---|---
--constraints=<string> | Constraints (location, bounding box, etc.) can be specified using a JSON file. This option lets you specify one or more constraint files.
--heapsize=<int> | Heap size (in bytes) used by each AI Engine (default: 1024). Used for allocating any remaining file-scoped data that is not explicitly connected in the user graph.
--stacksize=<int> | Stack size (in bytes) used by each AI Engine (default: 1024). Used as a standard compiler calling convention, including stack-allocated local variables and register spilling.
--pl-freq=<value> | Specifies a frequency (in MHz) for all PL kernels and PLIOs. The default frequency is a quarter of the AI Engine frequency. The PL frequency specific to each interface is provided in the graph.
--pl-register-threshold=<value> | Specifies the frequency (in MHz) threshold for registered AI Engine-PL crossings. The default frequency is one-eighth of the AI Engine frequency, dependent on the specific device speed grade. Note: Values above a quarter of the AI Engine array frequency are ignored, and a quarter is used instead.
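As an illustration, the sizing and frequency options above can be combined in a single invocation. The values below are arbitrary examples for a hypothetical design, not recommendations, and the graph file path is a placeholder.

```shell
# Sketch: 2 KB stack and heap per AI Engine, PL interfaces at 150 MHz.
aiecompiler --stacksize=2048 --heapsize=2048 --pl-freq=150 ./src/graph.cpp
```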
Option Name | Description
---|---
--enable-ecc-scrubbing | Enables ECC scrubbing on all the AI Engines used. This option enables ECC scrubbing while generating the AI Engine ELF CDO. One performance counter per core is used. ECC scrubbing is turned off by default.
Option Name | Description
---|---
--kernel-linting | Perform consistency checking between graphs and kernels. The default is false.
--log-level=<int> | Log level for verbose logging (0: no logging, 5: all debug messages). The default level is 1. Note: The default level with --verbose is 5.
-v \| --verbose | Verbose output of the AI Engine compiler emits compiler messages at various stages of compilation. These debug and tracing logs provide useful messages regarding the compilation process.
Option Name | Description
---|---
--pl-axi-lite=[true\|false] | Specifies whether the PL AXI4-Lite interface is enabled (default: false).
--target=<hw\|x86sim> | The AI Engine compiler supports several build targets (default: hw).
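A common workflow, sketched below with a placeholder graph file, is to compile first for fast functional verification with the x86sim target, then rebuild with the hw target for system integration.

```shell
# Functional build for x86 simulation (fast iteration).
aiecompiler --target=x86sim ./src/graph.cpp

# Hardware build (the default target), producing libadf.a for the Vitis flow.
aiecompiler --target=hw ./src/graph.cpp
```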
Option Name | Description
---|---
--include=<string> | Includes additional directories in the include path for the compiler front-end processing. Specify one or more include directories.
--output=<string> | Specifies an output .json file that is produced by the front end for an input data flow graph file. The output file is passed to the back end for mapping and code generation for the AI Engine device. This is ignored for other types of input.
--platform=<string> | A path to a Vitis platform file that defines the hardware and software components available when doing a hardware design and its RTL co-simulation.
--workdir=<string> | By default, the compiler writes all outputs to a sub-directory of the current directory, called Work. Use this option to specify a different output directory.
Option Name | Description
---|---
--help | List the available AI Engine compiler options, sorted in the groups listed here.
--help-list | Display an alphabetic list of AI Engine compiler options.
--version | Display the version of the AI Engine compiler.
Option Name | Description
---|---
--no-init | Disables initialization of window buffers in AI Engine data memory. This enables faster loading of the binary images into the SystemC-RTL co-simulation framework. TIP: This does not affect statically initialized lookup tables.
--nodot-graph | By default, the AI Engine compiler produces .dot and .png files to visualize the user-specified graph and its partitioning onto the AI Engines. This option eliminates the dot graph output.
--xlopt=<int> | Enables a combination of kernel optimizations based on the level (default: 0). Note: Compiler optimization (xlopt > 0) reduces debug visibility.
Option Name | Description
---|---
--Xchess=<string> | Can be used to pass kernel-specific options to the CHESS compiler that is used to compile code for each AI Engine.
--Xelfgen=<string> | Can be used to pass additional command-line options to the ELF generation phase of the compiler, which is currently run as a make command to build all AI Engine ELF files. For example, to limit the number of parallel compilations to four, pass --Xelfgen="-j4".
--Xmapper=<string> | Can be used to pass additional command-line options to the mapper phase of the compiler. These are options to try when the design is either failing to converge in the mapping or routing phase, or when you are trying to achieve better performance through a reduction in memory bank conflicts. See Mapper and Router Options for a list and description of options.
--Xpreproc=<string> | Passes a general option to the preprocessor phase for all source code compilations (AIE/PS/PL/x86sim).
--Xpslinker=<string> | Passes a general option to the PS linker phase.
--Xrouter=<string> | Passes a general option to the router phase.
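Several -X pass-through options can be combined in one invocation. In this sketch, -DMY_DEBUG is a hypothetical user macro, the graph file path is a placeholder, and the exact option strings each phase accepts depend on the tool version.

```shell
# Limit parallel ELF builds with make's -j flag, define a macro for all
# source compilations, and pass a buffer optimization level to the mapper.
aiecompiler \
    --Xelfgen="-j4" \
    --Xpreproc=-DMY_DEBUG \
    --Xmapper=BufferOptLevel5 \
    ./src/graph.cpp
```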
Option Name | Description
---|---
--aie-heat-map | Enables AI Engine heat map configurations (default: false).
--event-trace=<value> | Event trace configuration value, where the value functions indicates capture of function transitions on the AI Engine.
--num-trace-streams=<int> | Number of trace streams.
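For example, function-transition tracing might be enabled as sketched below; the stream count and graph file path are arbitrary illustrations.

```shell
# Sketch: capture function transitions using two trace streams.
aiecompiler --event-trace=functions --num-trace-streams=2 ./src/graph.cpp
```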
Mapper and Router Options
Options | Description
---|---
DisableFloorplanning | Disables the auto-floorplanning phase in the mapper. This option is useful for heavily constrained designs where you want to guide the mapping phase by using location constraints.
BufferOptLevel[1-9] | These options can be used to improve throughput by reducing memory bank conflicts. At a higher BufferOptLevel, the mapper tries to reduce the number of buffers mapped into the same memory bank, thereby reducing the probability of bank conflicts affecting overall performance. Higher BufferOptLevels can increase the size of the overall mapped region and, in a few cases, can fail to find a solution. The default is BufferOptLevel0.
disableSeparateTraceSolve | The default trace behavior forces the AI Engine mapper to keep all PLIOs/GMIOs in the original design location when using the trace debug feature. However, if the original solution did not leave any room for trace GMIOs, no solution is possible unless the design PLIOs are moved. Use this option in that case.
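Mapper options like these are passed through --Xmapper. For example, a heavily constrained design might disable auto-floorplanning and rely on an explicit constraint file; the file and graph paths below are placeholders.

```shell
# Sketch: guide mapping with location constraints instead of auto-floorplanning.
aiecompiler --Xmapper=DisableFloorplanning \
            --constraints=./constraints/locations.json \
            ./src/graph.cpp
```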
Options | Description
---|---
enableSplitAsBroadcast | Treats all split nets from a split node as a single net, with 100% usage, broadcasting to multiple points. This broadcast net does not share resources with any other packet-switched net in the design. This option can be used when throughput degradation is observed due to interference on packet-stream nets after a split node.
dmaFIFOsInFreeBankOnly | Ensures DMA FIFOs are only inserted into memory banks that have no other buffers mapped. This option can be used when memory stalls are observed due to DMA FIFO buffers being accessed at the same time as another design buffer placed in the same bank.
disableSSFifoSharing | Disables the ability of the router to share stream switch FIFOs among two or more terminals of a net. This option should only be used when there are not enough stream switch FIFOs in the device to give each terminal its own individual FIFO(s).
Viewing Compilation Results in the Vitis Analyzer
After the compilation of the AI Engine graph, the AI Engine compiler writes a summary of compilation results called <graph-file-name>.aiecompile_summary that can be viewed in the Vitis analyzer. The summary contains a collection of reports and diagrams reflecting the state of the AI Engine application implemented in the compiled build. The summary is written to the working directory of the AI Engine compiler, as specified by the --workdir option, which defaults to ./Work.
To open the AI Engine compiler summary, use the following command:
vitis_analyzer ./Work/graph.aiecompile_summary
The Vitis analyzer opens displaying the Summary page of the report. The Report Navigator view lists the different reports that are available in the Summary. For a complete understanding of the Vitis analyzer, see Using the Vitis Analyzer in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).
The listed reports include:
- Summary: The top level of the report, giving details of the build such as the date, tool version, a link to the graph, and the command line used to create the build.
- Graph: Provides a flow diagram of the AI Engine graph that shows the data flow through the various kernels. You can zoom into and pan the graph display as needed. At the bottom of the Reports view, a table summarizes the graph with information related to kernels, buffers, ports, and nets. Clicking objects in the graph diagram highlights the selected object in the tables. TIP: The Kernels table lists the kernels in the graph and provides links to the kernel source code and the graph header definition. Selecting one of these links opens the source code read-only in the Source Code view.
- Array: Provides a graphical representation of the AI Engine processor array on the Versal device. The graph kernels and connections are placed within the context of the array. You can zoom into and select elements in the array diagram. Choosing objects in the array also highlights the chosen object in the tables at the bottom of the Reports view.
- Mapping Analysis: Displays the text report graph_mapping_analysis_report.txt, reporting the block mapping, port mapping, and memory bank mapping of the graph to the device resources.
- DMA Analysis: Displays the text report DMA_report.txt, providing a summary of DMA accesses from the graph.
- DMA Lock: Displays the text report Lock_report.txt, listing DMA locks on port instances.
The following figure shows the graph.aiecompile_summary report open in the Vitis analyzer, with the Array diagram displayed, an AI Engine kernel selected in the diagram and the table views, and the source code for the kernel displayed in the Source Code view.