Profile/Analyze

TCF Profiling

The TCF profiler supports profiling of both standalone and Linux applications. TCF profiling does not require any additional compiler flags to be set when building the application. Profiling standalone applications over JTAG is based on sampling the program counter through the debug interface. It does not alter the program execution flow and is non-intrusive when stack tracing is not enabled. When stack tracing is enabled, program execution slows down because the debugger has to collect stack trace information.

  1. Select the application you want to profile.
  2. Right-click the application and select Run As > Launch on Hardware (Single Application Debug).
  3. When the application stops at main, open the TCF profiler view by selecting Window > Show View > Debug > TCF Profiler.
  4. Click the Start button to start profiling. The Profiler Configuration dialog box appears.
  5. Select the Aggregate Per Function option to group together all the samples collected for different addresses within a single function. When the option is disabled, the samples are shown per address.
  6. Select the Enable stack tracing option to show the stack trace for each address in the sample data. To view the stack trace for an address, click that address entry in the profiler view.
  7. Specify the Max stack frames count, which is the maximum number of frames shown in the stack trace view.
  8. Specify the View update interval, which is the interval (in milliseconds) at which the TCF profiler view is updated with new results. Note that this is different from the interval at which the profile samples are collected.
  9. Resume your application. The profiler view is updated with the data as shown in the figure below.
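
The effect of the Aggregate Per Function option can be pictured with a small sketch. This is not the TCF profiler's implementation; it only illustrates the idea of attributing each sampled program counter value to the function whose address range contains it. The func_entry type, the aggregate_sample function, and the addresses used in the note below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol-table entry describing one function's address range. */
typedef struct {
    const char *name;   /* function name */
    uint32_t    start;  /* first address of the function */
    uint32_t    size;   /* size of the function in bytes */
    unsigned    hits;   /* number of samples attributed to this function */
} func_entry;

/* Attribute one sampled PC to the function that contains it.
 * Returns the index of the matching entry, or -1 if no entry matches. */
int aggregate_sample(func_entry *funcs, size_t count, uint32_t pc)
{
    for (size_t i = 0; i < count; i++) {
        if (pc >= funcs[i].start && pc - funcs[i].start < funcs[i].size) {
            funcs[i].hits++;
            return (int)i;
        }
    }
    return -1;
}
```

With a table that contains a function at 0x1040 of size 0x100, samples at 0x1044 and 0x1048 both increment the same entry. With aggregation disabled, they would appear as two separate address rows.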


gprof Profiling (Zynq-7000 SoC)

IMPORTANT: This feature applies only to Zynq-7000 SoC devices.
GNU gprof provides two kinds of information that you can use to optimize the program:
  • A histogram that identifies the functions in the program that take up the most execution time.
  • A call graph that shows which functions called which other functions, and how many times.

The execution flow of the program is altered so that gprof can obtain data. Consequently, this method of profiling is considered software-intrusive. The program flow is altered in two ways:

  • To obtain histogram data, the program is periodically interrupted to obtain a sample of its program counter location. This user-defined interval is usually measured in milliseconds. The program counter location helps identify which function was being executed at that particular sample. Taking multiple samples over a long interval of a few seconds helps identify which functions execute for the longest time in the program.
  • To obtain the call graph information, the compiler annotates every function call to store the caller and callee information in a data structure.
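
The two kinds of data collection described above can be sketched as follows. This is a simplified illustration, not the Xilinx profiling library or the gmon.out layout. The profile_data structure, the HIST_BINS and MAX_ARCS sizes, and both function names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define HIST_BINS 64u  /* hypothetical number of histogram bins */
#define MAX_ARCS  32u  /* hypothetical capacity of the call-arc table */

typedef struct {
    uint32_t low_pc, high_pc;   /* profiled range [low_pc, high_pc) */
    uint16_t hist[HIST_BINS];   /* PC-sample histogram */
    struct { uint32_t from, to, count; } arcs[MAX_ARCS];
    size_t   arc_count;         /* number of arcs in use */
} profile_data;

/* Histogram side: called from the periodic interrupt with the interrupted PC. */
void record_sample(profile_data *p, uint32_t pc)
{
    if (pc < p->low_pc || pc >= p->high_pc)
        return;                                /* outside the profiled range */
    uint64_t offset = pc - p->low_pc;
    uint64_t range  = p->high_pc - p->low_pc;
    size_t bin = (size_t)(offset * HIST_BINS / range);
    if (p->hist[bin] < UINT16_MAX)
        p->hist[bin]++;                        /* saturate instead of wrapping */
}

/* Call-graph side: the role played by compiler-inserted instrumentation. */
void record_call(profile_data *p, uint32_t caller, uint32_t callee)
{
    for (size_t i = 0; i < p->arc_count; i++) {
        if (p->arcs[i].from == caller && p->arcs[i].to == callee) {
            p->arcs[i].count++;
            return;
        }
    }
    if (p->arc_count < MAX_ARCS) {             /* silently drop arcs when full */
        p->arcs[p->arc_count].from  = caller;
        p->arcs[p->arc_count].to    = callee;
        p->arcs[p->arc_count].count = 1;
        p->arc_count++;
    }
}
```

Both routines run inside the profiled program, which is why this style of profiling is software-intrusive: the sampling interrupt and the per-call bookkeeping consume processor time and memory that the application would otherwise have.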

The profiling workflow is described in the following diagram:

Figure 1: Profiling Workflow


Note: Xilinx recommends that you do not use garbage collector flags when profiling, because these flags can cause errors.
For additional information about GNU gprof, refer to http://sourceware.org/binutils/docs-2.18/gprof/index.html.

Specifying Profiler Configuration

To configure options for the Profiler, do the following:

  1. In the Project Explorer or C/C++ Projects view, select a project.
  2. Select Run > Run Configuration.
  3. In the Run Configurations dialog box, expand Launch on Hardware (Single Application Debug).
  4. Create a run configuration.
  5. Click the Application tab.
  6. Click the Edit button to view and configure the Advanced Options.
  7. In the Profile Options area, select the Enable Profiling check box.
  8. Specify the sampling frequency and the scratch memory to use for profiling, where:
    1. The sampling frequency determines the interval at which the profiling routine interrupts the program to check which function is currently being executed. The routine performs the sampling by examining the program counter at each interrupt.
    2. The scratch memory address is the location in DDR3 memory that the domain profiling services use for data collection. The application program should never touch this space.
  9. Click Run to profile the application.

Setting Up the Hardware for Profiling

To profile a software application, you must ensure that interrupts are raised periodically to sample the program counter (PC) value. To do this, you must program a timer and use the timer interrupt handler to collect and store the PC. The profile interrupt handler requires full access to the timer, so a separate timer that is not used by the application itself must be available in the system.
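
As a rough illustration of how a sampling frequency maps onto a timer setting, the sketch below computes the number of timer clock ticks between two program counter samples. The function name and both parameters are hypothetical. How a tick count is written to a particular timer (prescaler, load register, count direction) depends on the timer core and is not shown.

```c
#include <stdint.h>

/* Number of timer clock ticks between two PC samples.
 * Returns 0 when the inputs cannot produce a valid interval. */
uint32_t ticks_per_sample(uint32_t timer_clk_hz, uint32_t sample_hz)
{
    if (sample_hz == 0 || sample_hz > timer_clk_hz)
        return 0;
    return timer_clk_hz / sample_hz;
}
```

For example, assuming a 100 MHz timer clock and a 10 kHz sampling frequency, the timer would need to raise an interrupt every 10000 ticks.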

Xilinx profiling libraries that provide the profile interrupt handler support the AXI Timer core.

When profiling on Zynq-7000 SoC processors, the internal SCU timer should be used.

Setting Up the Software for Profiling

Setting up the software application for profiling involves the following steps:
  1. Enable profiling in the Software Platform to include profiling libraries.
    Note: Profiling is supported only for standalone software platforms.
    1. Add -pg to the extra compiler flags to build the domain with profiling.

    2. Set enable_sw_intrusive_profiling to true in the Board Support Package Settings window.

  2. Enable profiling in the application C/C++ build settings from C/C++ Build > Settings > Profiling.

Setting up the Domain

  1. Double-click Application.prj. This opens the Application Overview page. Click Navigate to BSP Settings. Click Modify BSP Settings.
  2. Click on the OS name, such as standalone, to configure its parameters.
  3. Set the enable_sw_intrusive_profiling field to true and select the timer for use by the profile libraries.
  4. The domain should be compiled with the -pg compiler option. To perform this step, click on the drivers item and select the CPU driver. Add the -pg flag to the extra_compiler_flags option.
  5. Click OK.

Setting Up the Software Application

  1. Modify the software application code to enable interrupts. If there is an interrupt controller present in the system with multiple interrupt sources, you must enable interrupts in the processor and the interrupt controller to allow interrupts from the profile timer to reach the processor. Example code is shown below:
    /* enable interrupt controller */
    XIntc_mMasterEnable(SYSINTC_BASEADDR);
    /* service all interrupts */
    XIntc_SetIntrSvcOption(SYSINTC_BASEADDR, XIN_SVC_ALL_ISRS_OPTION);
    /* enable the profile timer interrupt */
    XIntc_mEnableIntr(SYSINTC_BASEADDR, PROFILE_TIMER_INTR_MASK);
    /* enable interrupts in the processor */
    microblaze_enable_interrupts();
  2. If the profiling timer is the only entity that connects to the input of interrupt controller or directly to the processor, the tool sets up the interrupt for you automatically, and no change is required in the application code.
  3. Right-click the software application and select C/C++ Settings (or Properties > C/C++ Build > Settings).
  4. Select gcc compiler > Profiling and enable profiling by selecting Enable Profiling (-pg).
  5. Click OK.

Viewing the Profiling Results

When the program completes execution (reaches exit), or when you click the Stop button to stop the program, the Vitis software platform downloads the profile data and stores it in a file named gmon.out.

The Vitis software platform automatically opens the gmon.out file for viewing. The gmon.out file is generated in the debug folder of the application project.

Profiling Linux Applications with System Debugger

To profile Linux applications using Xilinx System Debugger, perform the following:

  1. Create a new Linux application for the target, using the Vitis IDE.
    Note: The instructions have been developed based on Cortex-A9 on ZC702 but should be valid for other targets as well.
  2. Import your application sources into the new project.
  3. Build the application.
  4. Boot Linux on ZC702 (for example, from the SD card) and start the TCF agent on the target.
  5. Create a new target connection for the TCF agent, from the Target Connections icon.
  6. Create a new Xilinx System Debugger debug configuration of type Launch on Hardware (Single Application Debug) for the application that you want to profile.
  7. On the Main tab, select Linux Application Debug from the Debug Type list.
  8. On the Application tab page, specify the local .elf file path and the remote .elf file path.
  9. Click Debug.
  10. When the process context stops at main(), launch the TCF Profiler view by selecting Window > Show View > Debug > TCF Profiler.
  11. In the TCF Profiler view, click the Start toolbar icon to start profiling.
    Note: Set a breakpoint at the end of your application code, so that the process is not terminated. If not set, the data collected by the TCF Profiler is lost when the process terminates.
  12. Resume the process context. The TCF Profiler view is updated with the profile data.

Non-Intrusive Profiling for MicroBlaze Processors

When extended debug is enabled in the hardware design, MicroBlaze supports non-intrusive profiling of the program instructions. You can configure whether the instruction count or the cycle count should be profiled. The profiling results are stored in a profiling buffer in the debug memory, which the debugger can access through the MDM debug registers. The size of the buffer can be configured from 4K to 128K using the C_DEBUG_PROFILE_SIZE parameter (a size of 0 indicates that profiling is disabled).

The profile buffer is divided into a number of portions known as bins. Each bin is 36 bits wide and can count the instructions or cycles of a program address range. The address range that is profiled by each bin depends on the total size of the program that is profiled. Bin size is calculated using the formula:

B = log2((H - L + S * 4) / (S * 4))

Where B is the bin size, H and L are the high and low addresses of the program address range being profiled, and S is the size of the profile buffer.
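
A worked example may help with the formula. The sketch below reads the formula as B = log2((H - L + S * 4) / (S * 4)) and rounds the result up to a whole number. The rounding is an assumption made here so that the result is an integer; the text above does not state how fractional results are handled. The function name is hypothetical, and S is passed through in whatever unit the formula intends.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest whole B such that 2^B >= (H - L + S*4) / (S*4).
 * Rounding up is an assumption of this sketch. */
unsigned bin_size_log2(uint32_t high, uint32_t low, uint32_t s)
{
    assert(s > 0 && high >= low);
    uint64_t span  = (uint64_t)s * 4u;               /* S * 4 */
    uint64_t total = (uint64_t)(high - low) + span;  /* H - L + S * 4 */
    unsigned b = 0;
    while ((span << b) < total)                      /* stop once 2^B covers the ratio */
        b++;
    return b;
}
```

With L = 0, H = 0xC000, and S = 4096, the numerator is 49152 + 16384 = 65536 and the denominator is 16384, so the ratio is 4 and B = 2. When H equals L, the ratio is 1 and B = 0.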

When profiling is enabled and the program starts running, profile statistics for an address range are stored in its corresponding bin. Xilinx System Debugger can read these results when needed.

Specifying Non-Intrusive Profiler Configuration

To configure options for the Profiler, do the following:

  1. Launch the Vitis software platform.
  2. Create a new standalone application project or select an existing one.
  3. Select Run > Run Configuration.
  4. In the Run Configurations dialog box, expand Launch on Hardware (Single Application Debug).
  5. Create a run configuration.
  6. Click the Application tab.
  7. Click the Edit button to view and configure the Advanced Options.
  8. In the Profile Options area, select the Enable Profiling check box.
  9. Select Non-Intrusive.
  10. Specify the low address and the high address of the program range to be profiled. Alternatively, select the Program Start or the Program End check box to auto-calculate the low or high address from the program.
  11. Select Count Instructions to count the number of instructions executed. Alternatively, select Count Cycles to count the number of cycles elapsed.
  12. Select Cumulative Profiling to profile without clearing the profiling buffers from the last execution.
  13. Click OK to save the configurations.
  14. Click Run to profile the selected project.

Viewing the Non-Intrusive Profiling Results

When the application completes execution, or when you click the Stop button to stop the program, the Vitis software platform downloads the non-intrusive profile data and stores it in a file named gmon.out.

Note: The Vitis software platform automatically opens the gmon.out file for viewing. The gmon.out file is generated in the debug folder of the application project.

FreeRTOS Analysis using STM

The Vitis software platform supports collection and analysis of trace events generated by FreeRTOS-based applications. Zynq UltraScale+ MPSoC processors support the Software Trace Microcell (STM) block, which is a software-application-driven trace source that generates a software instrumentation trace (SWIT). To collect FreeRTOS events and analyze them, do the following:

  1. Click File > New > Application Project. The New Application Project dialog box appears.
  2. Type a project name into the Project Name field.
  3. Select the location for the project. You can use the default location as displayed in the Location field by leaving the Use default location check box selected. Otherwise, click the check box and type or browse to the directory location.
  4. In the platform selection window, select Create a new platform from hardware (XSA) and choose the zcu102 design. Click Next to proceed.
  5. Choose the CPU, and select freertos10_xilinx as the OS.
    Note: The FreeRTOS version may vary in upcoming releases.
  6. Click Next to advance to the Templates screen.

    The Vitis software platform provides useful sample applications listed in Templates dialog box that you can use to create your project. The Description box displays a brief description of the selected sample application. When you use a sample application for your project, the Vitis software platform creates the required source and header files and linker script.

  7. Select the desired template. If you want to create a blank project, select Empty Application. You can add C files to the project after the project is created.
  8. Click the Navigate to BSP settings option in the application project settings page. Select Open BSP Settings > Overview > FreeRTOS and change the value of enable_stm_event_trace to TRUE.
  9. Right-click on the application and select Debug As > Debug Configuration.
  10. In the Debug Configurations dialog box, double-click Single Application Debug to create a launch configuration for the selected project.
  11. Click Debug. Debugging begins with the processors in the running state.
  12. Debug the project using the system debugger on the required target.
  13. Wait for the project to be downloaded to the board and to stop at main().
  14. Click Window > Show View > Xilinx. The Show View dialog box appears.
  15. Select Trace Session Manager from the Show View dialog box. The launch configuration related to the application being debugged can be seen in the Trace Session Manager view.
  16. Click the start button in the Trace Session Manager view toolbar to start the FreeRTOS trace collection.
  17. Switch to the Debug view and resume the project.
  18. Allow the project to run.
  19. Switch back to the Trace Session Manager view and stop the trace collection. All the collected trace data is exported to a suitable trace file and opened in the Events editor and the FreeRTOS Analysis view.