Compiling an AI Engine Graph Application
This chapter describes all the command line options passed to the AI Engine compiler (aiecompiler). The compiler takes the code for the data flow graph and the code for the individual kernels, and produces an image that can be run on various AI Engine targets such as simulators, emulators, and AI Engine devices. Unless otherwise specified, all input file paths are relative to the current directory, and all output file paths are relative to the Work directory.
The AI Engine graph and kernels can be compiled individually, or as a standalone application to run in the AI Engine processor array through emulation or on hardware. The graph and kernels can also be used as part of a larger system design that incorporates the AI Engine graph with an ELF application running on the embedded processor of the Versal™ device and PL kernels running in the programmable logic of the device. The AI Engine compiler is used to compile the graph and kernels, whether in a standalone configuration or as part of a larger system.
As shown in Using the Vitis IDE, the Vitis™ IDE can be used to create and manage project build settings, and run the AI Engine compiler. Alternatively, you can build the project from the command line as discussed in Integrating the Application Using the Vitis Tools Flow, or in a script or Makefile. Either approach lets you perform simulation or emulation to verify the graph application or the integrated system design, debug the design in an interactive debug environment, and build your design to run and deploy on hardware. Whatever method you choose to work with the tools, start by setting up the environment.
Setting Up the Vitis Tool Environment
The AI Engine tools are delivered and installed as part of the Vitis unified software platform. Therefore, when preparing to run the AI Engine tools, for example, the AI Engine compiler and AI Engine simulator, you must set up the Vitis tools. The Vitis unified software platform includes two elements that must be installed and configured, along with a valid Vitis tools license, to work together properly.
- Vitis tools and AI Engine tools
- A target Vitis platform, such as the xilinx_vck190_base_202020_1 platform used for AI Engine applications
For more information, see Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).
When the elements of the Vitis software platform are installed, set up the environment to run in a specific command shell by running the following scripts.
#setup XILINX_VITIS and XILINX_HLS variables
source <Vitis_install_path>/Vitis/<version>/settings64.csh
The settings64.sh and setup.sh scripts are also provided in the same directory. Finally, define the location of the available target platforms to use with the Vitis IDE and AI Engine tools using the following environment variable:
setenv PLATFORM_REPO_PATHS <path to platforms>
The PLATFORM_REPO_PATHS environment variable points to directories containing platform files (XPFM). This lets you specify platforms using just the folder name for the platform.
You can validate the installation and setup of the tools by using one of the following commands.
which vitis
which aiecompiler
You can validate the platform installation by using the following command.
platforminfo --list
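If you work in a bash shell rather than csh, the equivalent setup is a short script. This is a sketch only; the install path, version, and platform path below are placeholders for your installation.

```shell
# Bash equivalent of the csh setup above; <Vitis_install_path>, <version>,
# and <path to platforms> are placeholders for your installation.
source <Vitis_install_path>/Vitis/<version>/settings64.sh
export PLATFORM_REPO_PATHS=<path to platforms>

# Confirm the tools and platforms are visible.
which vitis
which aiecompiler
platforminfo --list
```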
Inputs
The AI Engine compiler takes inputs in several forms and produces executable applications for running on an AI Engine device. The command line for running the AI Engine compiler is as follows:
aiecompiler [options] <Input File>
where:
<Input File> specifies the data flow graph code that defines the main() application for the AI Engine graph. The input flow graph is specified using a data flow graph language. Refer to Creating a Data Flow Graph (Including Kernels) for a description of the data flow graph.
An example AI Engine compiler command:
aiecompiler --verbose --pl-freq=100 --workdir=./myWork --platform=xilinx_vck190_202020_1.xpfm \
--include="./" --include="./src" --include="./src/kernels" --include="./data" --include="${XILINX_HLS}/include" \
./src/graph.cpp
Some additional input options for the command line can include the following:
--constraints=<jsonfile> specifies constraints such as location or placement bounding box.
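For example, one or more constraint files can be supplied on the command line. This is an illustrative sketch; the file names below are placeholders.

```shell
# Sketch: supply multiple JSON constraint files (file names are hypothetical).
aiecompiler --constraints=./constraints/placement.json \
            --constraints=./constraints/locations.json \
            ./src/graph.cpp
```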
Outputs
By default, the AI Engine compiler writes all outputs to a directory called Work and a file called libadf.a. Work is a sub-directory of the directory where the tool was launched, and libadf.a, created in that same directory, is used as an input to the Vitis compiler. The type and contents of the output directory depend on the --target specified, as described in AI Engine Compiler Options. For more information about the Vitis compiler, see Vitis Compiler Command in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).
The output directory can be changed using the --workdir option. The structure and contents of the ./Work directory are described in the following table.
Directory/Files | Description
---|---
./Work/ |
<name>.aiecompile_summary | A generated file that can be opened in Vitis analyzer to see a compilation summary.
config/scsim_config.json | A JSON script that specifies options to the SystemC simulator. It includes AI Engine array tile geometry, input/output file specifications, and their connections to the stream switches.
arch/ |
logical_arch_aie.larch | A JSON file describing the hardware requirements of the AI Engine application.
aieshim_constraints.json | If present, this JSON file represents the user-defined physical interface constraints between the AI Engine array and programmable logic provided through the AI Engine application.
aieshim_solution.aiesol | A JSON file describing the mapping from logical to physical channels crossing the interface between the AI Engine array and the programmable logic.
cfgraph.xml | An XML file describing the hardware requirements of the AI Engine application. This is used by the Vitis tools flow.
aie/ |
Makefile | A Makefile to compile code for all AI Engines.
<n>_<m>/ | Individual AI Engine compilation directories.
Release/ | Synopsys release directory for the AI Engine, including the ELF file.
<n>_<m>.lst | Microcode of the kernel at <n>_<m>.
<n>_<m>.map | Shows the memory mapping of the kernel at <n>_<m>, including the memory size, width, and offset.
scripts/ | Synopsys compiler project and linker scripts.
src/ | Source files for the processor, including kernels and main.
pl/systemC/ | Directory containing SystemC models for all PL kernels.
Makefile | A Makefile to compile all PL SystemC models.
generated-source/ | SystemC wrappers for each PL kernel.
generated-objects/ | Compiled shared libraries for each PL kernel.
ps/c_rts/ | Directory containing C-based run-time control for modeling PS interaction.
aie_control.cpp | Generated AI Engine control code implementing the init, run, and end graph APIs for the specific graph objects present in the program. This file is linked with the application main to create a PS thread for the simulator and bare metal.
aie_control_xrt.cpp | Generated AI Engine control code implementing the init, run, and end graph APIs for the specific graph objects present in the program. This file is linked with the application main to create a PS thread for the Linux application.
systemC/ | Directory containing SystemC models for the PS main.
Makefile | A Makefile to compile all PS SystemC models.
generated-source/ | SystemC wrappers for the PS main.
generated-objects/ | Compiled shared libraries for the PS main.
ps/cdo/ | Directory containing generator code for graph configuration and initialization in configuration data object (CDO) format. This is used during SystemC-RTL simulation and during actual hardware execution.
Makefile | A Makefile to compile the graph CDO.
generateAIEConfig | A bash script for building the graph CDO.
generated-sources/ | C++ program to generate the CDO.
generated-objects/ | Compiled program to generate the CDO.
pthread/ |
PthreadSim.c | A source-to-source translation of the input data flow graph into a C program implemented using pthreads.
sim.out | The GCC-compiled binary for PthreadSim.c.
reports/ |
<graph>_mapping_analysis_report.txt | Mapping report describing the allocation of kernels to AI Engines and window buffers to AI Engine memory groups.
<graph>.png | A bitmap file showing the kernel graph connectivity and partitioning over AI Engines.
<graph>.xpe | An XML file describing the estimated power profile of the graph based on the hardware resources used. This file is used with the Xilinx® Power Estimator (XPE) tool.
sync_buffer_address.json | Shows kernel sync buffer addresses with local and global addresses.
lock_allocation_report.json | Describes the ports and the locks and buffers associated with the kernels.
dma_lock_report.json | Shows DMA locks for inputs/outputs to the AI Engine, as well as the kernel(s) they connect to, with buffer information.
temp/ | Contains temporary files generated by the AI Engine compiler that can be useful in debugging. The CF graph .o file is also created here by default.
AI Engine Compiler Options
Option Name | Description
---|---
--constraints=<string> | Constraints (location, bounding box, etc.) can be specified using a JSON file. This option lets you specify one or more constraint files.
--heapsize=<int> | Heap size (in bytes) used by each AI Engine (default: 1024). Used for allocating any remaining file-scoped data that is not explicitly connected in the user graph.
--stacksize=<int> | Stack size (in bytes) used by each AI Engine (default: 1024). Used as a standard compiler calling convention, including stack-allocated local variables and register spilling.
--pl-freq=<value> | Specifies a frequency (in MHz) for all PL kernels and PLIOs. The default frequency is a quarter of the AI Engine frequency. The PL frequency specific to each interface is provided in the graph.
--pl-register-threshold=<value> | Specifies the frequency (in MHz) threshold for registered AI Engine-PL crossings. The default frequency is one-eighth of the AI Engine frequency, dependent on the specific device speed grade. Note: Values above a quarter of the AI Engine array frequency are ignored, and a quarter is used instead.
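As an illustration, the sizing and frequency options above can be combined in a single invocation. The values below are arbitrary examples for a hypothetical design, not recommendations, and the graph file path is a placeholder.

```shell
# Sketch: 2 KB stack and heap per AI Engine, PL interfaces at 150 MHz.
aiecompiler --stacksize=2048 --heapsize=2048 --pl-freq=150 ./src/graph.cpp
```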
Option Name | Description
---|---
--enable-ecc-scrubbing | Enables ECC scrubbing on all the AI Engines used. This option enables ECC scrubbing while generating the AI Engine ELF CDO. One performance counter per core is used. ECC scrubbing is turned off by default.
Option Name | Description
---|---
--kernel-linting | Perform consistency checking between graphs and kernels. The default is false.
--log-level=<int> | Log level for verbose logging (0: no logging, 5: all debug messages). The default level is 1. Note: The default level with --verbose is 5.
-v \| --verbose | Verbose output of the AI Engine compiler emits compiler messages at various stages of compilation. These debug and tracing logs provide useful messages regarding the compilation process.
Option Name | Description
---|---
--pl-axi-lite=[true\|false] | Specifies whether the PL AXI4-Lite interface is enabled (default: false).
--target=<hw\|x86sim> | The AI Engine compiler supports several build targets (default: hw).
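A common workflow, sketched below with a placeholder graph file, is to compile first for fast functional verification with the x86sim target, then rebuild with the hw target for system integration.

```shell
# Functional build for x86 simulation (fast iteration).
aiecompiler --target=x86sim ./src/graph.cpp

# Hardware build (the default target), producing libadf.a for the Vitis flow.
aiecompiler --target=hw ./src/graph.cpp
```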
Option Name | Description
---|---
--include=<string> | Includes additional directories in the include path for the compiler front-end processing. Specify one or more include directories.
--output=<string> | Specifies an output .json file that is produced by the front end for an input data flow graph file. The output file is passed to the back end for mapping and code generation for the AI Engine device. This is ignored for other types of input.
--platform=<string> | A path to a Vitis platform file that defines the hardware and software components available when doing a hardware design and its RTL co-simulation.
--workdir=<string> | By default, the compiler writes all outputs to a sub-directory of the current directory, called Work. Use this option to specify a different output directory.
Option Name | Description
---|---
--help | List the available AI Engine compiler options, sorted in the groups listed here.
--help-list | Display an alphabetic list of AI Engine compiler options.
--version | Display the version of the AI Engine compiler.
Option Name | Description
---|---
--no-init | Disables initialization of window buffers in AI Engine data memory. This enables faster loading of the binary images into the SystemC-RTL co-simulation framework. TIP: This does not affect statically initialized lookup tables.
--nodot-graph | By default, the AI Engine compiler produces .dot and .png files to visualize the user-specified graph and its partitioning onto the AI Engines. This option eliminates the dot graph output.
--xlopt=<int> | Enables a combination of kernel optimizations based on the level (default: 0). Note: Compiler optimization (xlopt > 0) reduces debug visibility.
Option Name | Description
---|---
--Xchess=<string> | Can be used to pass kernel-specific options to the CHESS compiler that is used to compile code for each AI Engine.
--Xelfgen=<string> | Can be used to pass additional command-line options to the ELF generation phase of the compiler, which is currently run as a make command to build all AI Engine ELF files. For example, to limit the number of parallel compilations to four, pass --Xelfgen="-j4".
--Xmapper=<string> | Can be used to pass additional command-line options to the mapper phase of the compiler. These are options to try when the design is either failing to converge in the mapping or routing phase, or when you are trying to achieve better performance through a reduction in memory bank conflicts. See Mapper and Router Options for a list and description of options.
--Xpreproc=<string> | Passes a general option to the preprocessor phase for all source code compilations (AIE/PS/PL/x86sim).
--Xpslinker=<string> | Passes a general option to the PS linker phase.
--Xrouter=<string> | Passes a general option to the router phase.
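Several -X pass-through options can be combined in one invocation. In this sketch, -DMY_DEBUG is a hypothetical user macro, the graph file path is a placeholder, and the exact option strings each phase accepts depend on the tool version.

```shell
# Limit parallel ELF builds with make's -j flag, define a macro for all
# source compilations, and pass a buffer optimization level to the mapper.
aiecompiler \
    --Xelfgen="-j4" \
    --Xpreproc=-DMY_DEBUG \
    --Xmapper=BufferOptLevel5 \
    ./src/graph.cpp
```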
Option Name | Description
---|---
--aie-heat-map | Enables AI Engine heat map configurations (default: false).
--event-trace=<value> | Event trace configuration value, where the value functions indicates capture of function transitions on the AI Engine.
--num-trace-streams=<int> | Number of trace streams.
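For example, function-transition tracing might be enabled as sketched below; the stream count and graph file path are arbitrary illustrations.

```shell
# Sketch: capture function transitions using two trace streams.
aiecompiler --event-trace=functions --num-trace-streams=2 ./src/graph.cpp
```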
Mapper and Router Options
Options | Description
---|---
DisableFloorplanning | Disables the auto-floorplanning phase in the mapper. This option is useful for heavily constrained designs where you want to guide the mapping phase by using location constraints.
BufferOptLevel[1-9] | These options can be used to improve throughput by reducing memory bank conflicts. At a higher BufferOptLevel, the mapper tries to reduce the number of buffers mapped into the same memory bank, thereby reducing the probability of bank conflicts affecting overall performance. Higher BufferOptLevels can increase the size of the overall mapped region and, in a few cases, can fail to find a solution. The default is BufferOptLevel0.
disableSeparateTraceSolve | The default trace behavior forces the AI Engine mapper to keep all PLIOs/GMIOs in the original design location when using the trace debug feature. However, if the original solution did not leave any room for trace GMIOs, no solution is possible unless the design PLIOs are moved. Use this option in that case.
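Mapper options like these are passed through --Xmapper. For example, a heavily constrained design might disable auto-floorplanning and rely on an explicit constraint file; the file and graph paths below are placeholders.

```shell
# Sketch: guide mapping with location constraints instead of auto-floorplanning.
aiecompiler --Xmapper=DisableFloorplanning \
            --constraints=./constraints/locations.json \
            ./src/graph.cpp
```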
Options | Description
---|---
enableSplitAsBroadcast | Treats all split nets from a split node as a single net, with 100% usage, broadcasting to multiple points. This broadcast net does not share resources with any other packet-switched net in the design. This option can be used when throughput degradation is observed due to interference on packet-stream nets after a split node.
dmaFIFOsInFreeBankOnly | Ensures DMA FIFOs are only inserted into memory banks that have no other buffers mapped. This option can be used when memory stalls are observed due to DMA FIFO buffers being accessed at the same time as another design buffer placed in the same bank.
disableSSFifoSharing | Disables the ability of the router to share stream switch FIFOs among two or more terminals of a net. This option should only be used when there are not enough stream switch FIFOs in the device to give each terminal its own individual FIFO(s).
Viewing Compilation Results in the Vitis Analyzer
After the compilation of the AI Engine graph, the AI Engine compiler writes a summary of compilation results called <graph-file-name>.aiecompile_summary that can be viewed in the Vitis analyzer. The summary contains a collection of reports and diagrams reflecting the state of the AI Engine application implemented in the compiled build. The summary is written to the working directory of the AI Engine compiler, as specified by the --workdir option, which defaults to ./Work.
To open the AI Engine compiler summary, use the following command:
vitis_analyzer ./Work/graph.aiecompile_summary
The Vitis analyzer opens displaying the Summary page of the report. The Report Navigator view lists the different reports that are available in the Summary. For a complete understanding of the Vitis analyzer, see Using the Vitis Analyzer in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).
The listed reports include:
- Summary: The top level of the report, giving details of the build such as the date, tool version, a link to the graph, and the command line used to create the build.
- Graph: Provides a flow diagram of the AI Engine graph that shows the data flow through the various kernels. You can zoom into and pan the graph display as needed. At the bottom of the Reports view, a table summarizes the graph with information related to kernels, buffers, ports, and nets. Clicking objects in the graph diagram highlights the selected object in the tables. TIP: The Kernels table lists the kernels in the graph and provides links to the kernel source code and the graph header definition. Selecting one of these links opens the source code read-only in the Source Code view.
- Array: Provides a graphical representation of the AI Engine processor array on the Versal device. The graph kernels and connections are placed within the context of the array. You can zoom into and select elements in the array diagram. Choosing objects in the array also highlights the chosen object in the tables at the bottom of the Reports view.
- Mapping Analysis: Displays the text report graph_mapping_analysis_report.txt, reporting the block mapping, port mapping, and memory bank mapping of the graph to the device resources.
- DMA Analysis: Displays the text report DMA_report.txt, providing a summary of DMA accesses from the graph.
- DMA Lock: Displays the text report Lock_report.txt, listing DMA locks on port instances.
The following figure shows the graph.aiecompile_summary report open in the Vitis analyzer, with the Array diagram displayed, an AI Engine kernel selected in the diagram and the table views, and the source code for the kernel displayed in the Source Code view.