AR# 70835

SDAccel Environments 2017.4 - Release Notes and Known Issues

Description

This answer record serves as the Vivado SDAccel 2017.4 Release Notes and contains links to information about what is included in the update.

Xilinx Forums:

Please seek technical support via the SDAccel Board. The Xilinx Forums are a great resource for technical support.

The entire Xilinx Community is available to help here, and you can ask questions and collaborate with Xilinx experts to get the solutions you need.

Solution

SDAccel Development Environment Changes for 2017.4

  • DSAs
    • New 5.x version DSAs ("Dynamic" DSAs), which are now configurable at link time to free up resources for kernels, reduce hardware compile times, and improve timing closure in Vivado.
    • In dynamic platforms, logic in the DSA that is not required, such as the logic used to implement an unused DDR memory, is optimized away during compilation.
    • When using 4.x or earlier DSAs, Software and Hardware Emulation require the prepending of the 4.x DSA libraries to the LD_LIBRARY_PATH.
    • When using 4.x or earlier DSAs, only design examples from the 2017.2 or earlier Xilinx GitHub may be used.
  • xocc compiler
    • xocc compiler for OpenCL kernel compilation has been upgraded to LLVM 3.9
    • Starting in 2017.4, two advanced transformations were added that help achieve consistently high global memory bandwidth utilization in OpenCL kernels. This is achieved in two complementary ways:
      • Widening/Vectorization of the datapath of the memory interface
      • Enhanced burst inference on the array access to the memory interface.
    • When these two transformations are triggered, speed-ups of as much as 5X can be achieved (usually, up to saturation of the memory bandwidth).
  • Xilinx SDAccel Runtime
    • Use of multiple xclbin files
      • Host code that uses multiple xclbin files must now release the current xclbin using clReleaseProgram() and clReleaseKernel() before loading subsequent xclbin files.
    • Updated memory selection
      • When using multiple DDR memories, the selection is now made at compile time using the xocc command line switch -sp, and the map_connect option is now deprecated.
      • Existing host code that specifies DDR banks using clCreateBuffer should be updated to use the new command line switches.
    • Shared Library
      • When you use the HLS math library in SDx host code, you must add the additional linkage information to your Makefile so that the dynamic library can be found. For example:
        • $(XILINX_VIVADO)/lnx64/lib/csim -lhlsmc++-GCC46
    • Enhanced xbsak features
      • The command line option dmatest now performs a DMA test on all the DDR banks that are used in the application and specified on the xocc command line.
      • The command line option boot now forces a re-enumeration of the PCIe bus and bus re-scanning.
      • The command line option scan is enhanced to provide more details about the OS
      • The command line option query is enhanced to provide more details on each CU, such as the name, the kernel type, index, address and status.
    • Profile Features
      • Kernel instrumentation is now controlled with the xocc compile option -profile_kernel
        • The compile time option allows for the insertion of additional hardware logic to enable the generation of profiling data. This is enabled by default when using the IDE and may be disabled.
      • The Profile Summary Report and Timeline Trace Report are now generated using the sdx_analyze utility. This replaces the existing sda2protobuf and sda2wbd utilities.
    • Debug Features
      • Ensure that all AXI outputs of RTL kernels are driven. Driverless outputs on RTL kernels might lead to xocc failures after synthesis.
        • Designs upgraded from V4.x DSAs to V5.x DSAs might be more sensitive to driverless output errors
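
As a rough illustration of why the datapath widening described under the xocc compiler changes can matter, the sketch below estimates the peak bandwidth of a memory interface from its width and clock, and caps the resulting speed-up where the memory bandwidth saturates. It is arithmetic only and does not model xocc. The function names, the 32-bit and 512-bit widths, the 300 MHz clock, and the 9.6 GB/s ceiling are all illustrative assumptions, not figures from this release.

```python
def interface_bandwidth_gbps(width_bits, clock_mhz):
    """Peak bandwidth in GB/s of an interface moving one word per clock."""
    bytes_per_cycle = width_bits / 8
    return bytes_per_cycle * clock_mhz * 1e6 / 1e9


def widening_speedup(narrow_bits, wide_bits, clock_mhz, memory_limit_gbps):
    """Speed-up from widening, capped where the memory bandwidth saturates."""
    narrow = min(interface_bandwidth_gbps(narrow_bits, clock_mhz), memory_limit_gbps)
    wide = min(interface_bandwidth_gbps(wide_bits, clock_mhz), memory_limit_gbps)
    return wide / narrow


if __name__ == "__main__":
    # Hypothetical values: 32-bit vs 512-bit interface, 300 MHz kernel clock,
    # and a 9.6 GB/s memory ceiling chosen only for illustration.
    print(widening_speedup(32, 512, 300, 9.6))
```

With these assumed numbers the wider interface could move 16 times more data per cycle, but the reported speed-up is limited by the memory ceiling, which mirrors the "up to saturation of the memory bandwidth" caveat above.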

What's New in SDAccel for 2017.4

  • SDx GUI
    • The Vivado IDE can be launched directly from the SDx GUI
      • This allows experienced hardware designers, or those familiar with the Vivado Design Environment, to make implementation changes to the hardware (detailed timing closure, etc.) and save the results.
      • In addition, a pre-synthesized or re-implemented Vivado Design Checkpoint (.dcp) file can be brought in from the launched Vivado session and used directly in the SDx session to complete the remaining flow without having to start from the beginning.
        • Changes made during the Vivado session are also automatically captured in SDx for subsequent runs.
    • RTL Kernel Wizard has been enhanced to support additional types of packaging options, including pre-compiled kernels and netlist (.dcp) based kernels.
    • Dataflow is an important feature in xocc (for both C/C++ and OpenCL kernels). Several DRCs (with extended documentation) related to C/C++/OpenCL designs with dataflow have been added.
    • A DRC window highlighting key DRCs in the user C/C++/OpenCL code is available at the bottom of the SDx GUI, next to the Console tab.
  • Kernel Performance Enhancements
    • Improved data transfer rates are now provided through the following means.
      • Automatic memory coalescing and widening. The automatic widening of the data transfer can be disabled by adding the nounroll pragma to for-loops.
      • Manually specified memory coalescing using the new Xilinx OpenCL attribute xcl_zero_global_work_offset, which can be used when clEnqueueNDRangeKernel is used without global_work_offset.
    • The workgroup size is now automatically inferred based on the OpenCL semantics.
  • Whole-function vectorization is provided through the vec_type_hint attribute on NDRange
    • It is highly recommended to use vec_type_hint for improved performance
    • Whole-function vectorization may increase the size of the hardware implementation.
  • Sub-functions are now automatically inlined to improve performance
    • This may be disabled using the noinline pragma and OpenCL attribute.
  • Xilinx SDAccel Runtime
    • Support is now provided for the OpenCL API clCreateSubDevices.
      • Sub-devices can be created for each Compute Unit (CU), allowing for multiple independent command queues for each CU.
      • Each sub-device can include only one CU
    • Improved Linux Driver support
      • Drivers now use the Linux DMA_BUF framework, allowing data sharing across all Linux devices.
      • Enables the exporting of device data (temperature, current, etc.) via the Linux sysfs framework.
  • RTL Kernel Enhancements
    • Support for compile time parameterization of RTL kernels via the xocc command line.
    • A new xocc command line option allows a single RTL Kernel (.xo file) to be instantiated as multiple kernel instances. In addition, these separate instances can be queued independently from each other.
    • RTL kernels can now be pre-compiled, to reduce SDx compile flow time by not having to do synthesis under SDx
    • RTL kernels can be created from a Xilinx checkpoint (.dcp) file
    • Encryption support is now provided for RTL kernels
  • XOCC Enhancements
    • --ini_files switch can be used to pass a set of advanced --xp style switches to xocc using a single file (similar to use of xocc.ini file).
    • --report_dirs switch allows report files generated under SDx runs to be copied to a separate directory for easy access.
    • --log_dir switch allows log files generated under SDx runs to be copied to a separate directory for easy access.
    • --temp_dir switch allows a user specified directory to be used for generation of temporary files.
    • --interactive switch allows Vivado to be launched from within the xocc environment, with the right project loaded.
    • --reuse_synth switch allows a pre-synthesized Vivado Design Checkpoint (.dcp) file to be brought in and used directly in the SDx flow to complete implementation and xclbin generation.
    • --reuse_impl switch allows a pre-implemented and timing closed Vivado Design Checkpoint (.dcp) file to be brought in and used directly in the SDx flow to do xclbin generation.
    • --remote_ip_cache switch can be used to turn off all images of IP caches. This is generally not recommended other than for debugging purposes.
    • --no_ip_cache switch can be used to turn off all usages of IP caches. This is generally not recommended other than for debugging purposes.
  • Profile Features
    • Profile instrumentation of kernels is now enabled through xocc compile option -profile_kernel
    • Profile reports generated using sdx_analyze utility
    • Profile Summary Report Enhancements:
      • Data Transfer table now displays information on a compute unit/port basis including kernel arguments and DDR bank.
      • Compute Unit Table now reports clock frequency per Compute Unit.
  • Debug Features
    • Kernel Debug: Kernel debug is now supported through GDB and TCF in Hardware Emulation. This provides the ability to:
      • Start and stop at intermediate points in the execution of the kernels
      • Inspect both kernel arguments and global memory
    • Application Debug: The following new gdb extensions provide enhanced debug information:
      • xprint kernel: Displays all NDRange events that are pending and their arguments
      • xprint all: Displays all valid OpenCL objects
      • xstatus all: Provides visibility into the IPs instantiated on the platform
      • A new xocc command option -dk to insert a Light Weight Protocol Checker IP into the system to debug AXI protocol violations
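
The directory-related switches listed under XOCC Enhancements could be collected in a small build helper such as the sketch below. It only assembles an argument list and does not run xocc. The switch names are taken from the list above; the helper name, the sub-directory names, and the assumption that each switch is followed by a single path argument are illustrative, so confirm the exact syntax against the xocc help for your installation.

```python
import os


def xocc_directory_args(build_root):
    """Return extra xocc arguments that route reports, logs and temp files.

    Assumption: each switch takes one directory path as its value.
    The sub-directory names are hypothetical.
    """
    return [
        "--report_dirs", os.path.join(build_root, "reports"),
        "--log_dir", os.path.join(build_root, "logs"),
        "--temp_dir", os.path.join(build_root, "tmp"),
    ]


if __name__ == "__main__":
    # Print the fragment that would be appended to an existing xocc command.
    print(" ".join(xocc_directory_args("build")))
```

Keeping the three locations under one build root makes it easy to archive or clean the outputs of a run in a single step.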

SDAccel Known Issues:

(Xilinx Answer 70450) 2017.4 - SDAccel - Frequency Scaling of Kernel Clock not applied

2017.4 - SDAccel - OpenCL support of LLVM 3.9

2017.4 - SDAccel - xocc options for dynamic platform (5.0 DSA) have changed

2017.4 - SDAccel - Design fails with "ERROR: [KernelCheck 83-114] Kernel 'black_scholes' port 'M_AXI_GMEM' is not mapped from any kernel argument. There must be at least one kernel argument that maps onto every AXI4 master port on the kernel IP"

(Xilinx Answer 70933) 2017.4 - SDAccel - Motherboard and System Recommendations

2017.4 - SDAccel - Design fails with "interconnect_ddr4_mem00_0.dcp is not a valid design checkpoint"

2017.4 - SDAccel - Moving vcu1525_dynamic_5_0 from one host to another corrupted golden image

(Xilinx Answer 70936) 2017.4 - SDAccel - Cold Boot required after upgrading/downgrading DSA using XBSAK flash

2017.4 - SDAccel - AWS failures "terminate called after throwing an instance of 'xrt::error'" with 7v3 and ku3-2ddr-xpr

2017.4 - SDAccel - Bandwidth mismatch between VCU1525 and KCU1500 dynamic DSAs for the same testcase

2017.4 - SDAccel - RTL Kernels


 

AR# 70835
Date 11/30/2018
Status Active
Type General Article