Clocking and Resets
Clocking
There are three clock domains in the DPU IP: the register configuration, the data controller, and the computation unit. The three input clocks can be configured depending on the requirements. Each input clock has a corresponding reset, and each reset must be configured correctly.
Clock Domain
The following figure shows the three clock domains.
Register Clock
s_axi_aclk is used for the register configuration module. This module receives the DPU configuration through the S_AXI interface. The S_AXI clock can be configured as common with the M-AXI clock or as an independent clock. The DPU configuration registers are updated at a very low frequency, and most of them are set at the start of a task. Because the M-AXI clock is a high-frequency clock, Xilinx recommends setting the S_AXI clock as an independent clock with a frequency of 100 MHz.
In the Vitis flow, the shell may provide only two clocks for the DPU IP. In this case, the S_AXI clock must be configured as common with the M-AXI clock.
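The selection rule above can be sketched as a small Python helper. This is only an illustration: the function name `select_s_axi_clock` and its parameters are hypothetical and are not part of any Xilinx tool. The 100 MHz value is the recommendation from the text, and rejecting fewer than two clocks is an assumption made for the sketch.

```python
def select_s_axi_clock(available_clocks: int, m_axi_freq_mhz: float) -> dict:
    """Pick the s_axi_aclk source following the guidance in this section.

    available_clocks: number of clocks the shell can give the DPU
                      (hypothetical parameter, for illustration only).
    m_axi_freq_mhz:   frequency of m_axi_dpu_aclk in MHz.
    """
    if available_clocks < 2:
        # Assumption: m_axi_dpu_aclk and dpu_2x_clk are always required.
        raise ValueError("need at least m_axi_dpu_aclk and dpu_2x_clk")
    if available_clocks == 2:
        # Only two clocks: s_axi_aclk must be common with the M-AXI clock.
        return {"mode": "common", "freq_mhz": m_axi_freq_mhz}
    # Three or more clocks: use the recommended independent 100 MHz clock.
    return {"mode": "independent", "freq_mhz": 100.0}
```

For example, `select_s_axi_clock(2, 300.0)` reports the common mode at the M-AXI frequency, while `select_s_axi_clock(3, 300.0)` reports an independent 100 MHz clock.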
Data Controller Clock
The primary function of the data controller module is to schedule the data flow in the DPU IP. The data controller module works with m_axi_dpu_aclk. The data transfer between the DPU and external memory happens in the data controller clock domain, so m_axi_dpu_aclk is also the AXI clock for the AXI_MM master interface in the DPU IP. m_axi_dpu_aclk should be connected to the AXI_MM master clock.
Computation Clock
The DSP slices in the computation unit module are in the dpu_2x_clk domain, which runs at twice the clock frequency of the data controller module. The two related clocks must be edge-aligned.
Reference Clock Generation
There are three input clocks for the DPU. The frequency of dpu_2x_clk should be twice that of m_axi_dpu_aclk, and the two clocks must be synchronous. The recommended circuit design is shown here.
An MMCM and two BUFGCE_DIV blocks can be instantiated to build this circuit. In this circuit, dpu_clk and dpu_clk_2x are the clocks that drive m_axi_dpu_aclk and dpu_2x_clk, respectively. The frequency of clk_in1 is arbitrary, and the frequency of the MMCM output clock CLKOUT should equal the frequency of dpu_clk_2x. BUFGCE_DIV_CLK1_INST divides the frequency of CLKOUT by two. Because dpu_clk and dpu_clk_2x are derived from the same clock, they are synchronous. Using two BUFGCE_DIV blocks reduces the skew between the two clocks, which helps with timing closure.
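The frequency relationship in this circuit can be checked with a few lines of arithmetic. The sketch below uses the standard MMCM relation (VCO = input frequency × multiplier ÷ input divider, output = VCO ÷ output divider). It assumes that one BUFGCE_DIV passes CLKOUT through undivided while the other divides by two. The function name, parameter names, and example values are illustrative assumptions, and the sketch does not check device-specific VCO limits.

```python
from fractions import Fraction

def dpu_clock_tree(clk_in1_mhz, mult, divclk, clkout_div, bufgce_div=2):
    """Return (dpu_clk_2x, dpu_clk) in MHz as exact fractions.

    mult, divclk, and clkout_div stand for the MMCM multiplier, input
    divider, and output divider (hypothetical parameter names).
    """
    if not 1 <= bufgce_div <= 8:
        # BUFGCE_DIV supports integer divide values from 1 to 8.
        raise ValueError("BUFGCE_DIV divide value must be 1..8")
    vco = Fraction(clk_in1_mhz) * Fraction(mult) / Fraction(divclk)
    clkout = vco / Fraction(clkout_div)   # MMCM CLKOUT
    dpu_clk_2x = clkout                   # assumed divide-by-1 buffer
    dpu_clk = clkout / bufgce_div         # divide-by-2 buffer
    return dpu_clk_2x, dpu_clk

# Illustrative values only: 100 MHz input, multiplier 12, dividers 1 and 2.
clk_2x, clk_1x = dpu_clock_tree(100, 12, 1, 2)
assert clk_2x == 2 * clk_1x
```

Because both outputs are computed from the same CLKOUT value, the 2:1 ratio holds for any MMCM setting as long as the second buffer divides by two.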
Configuring Clock Wizard
In addition, Matched Routing must be selected for m_axi_dpu_aclk and dpu_2x_clk in the Output Clocks tab of the Clock Wizard IP. Matched Routing significantly reduces the skew between clocks generated through BUFGCE_DIV blocks. The related configuration is shown in the following figure.
Adding CE for dpu_2x_clk
The dpu_2x clock gating option can reduce the power consumption of the DPU. When the option is enabled, the number of generated clk_dsp clocks should equal the number of DPU cores. Each clk_dsp should be set as a buffer with CE in the Clock Wizard IP. As shown in the following figure, three clk_dsp_ce ports appear when the output clocks are configured with CE. To enable the dpu_2x clock gating function, each clk_dsp_ce port should be connected to the corresponding dpu_2x_clk_ce port in the DPU.
After configuring the Clock Wizard, connect each clk_dsp_ce port to the corresponding port in the DPU. The connections are shown in the following figure.
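The one-to-one pairing between clock-enable ports and DPU cores can be listed programmatically, which may help when the connections are checked by hand. This is a minimal sketch: the per-core port names it generates (`clk_dsp0_ce`, `dpu0_2x_clk_ce`, and so on) are hypothetical placeholders, and the actual names depend on the Clock Wizard and DPU configuration.

```python
def clk_gating_connections(num_cores: int) -> list:
    """List (clock wizard CE port, DPU CE port) pairs, one per DPU core.

    The port names are placeholders for illustration, not real pin names.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return [
        (f"clk_dsp{i}_ce", f"dpu{i}_2x_clk_ce")
        for i in range(num_cores)
    ]

# Three DPU cores give three CE connections.
for src, dst in clk_gating_connections(3):
    print(f"{src} -> {dst}")
```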
Reset
There are three input clocks for the DPU IP and each clock has a corresponding reset. Each reset must be synchronous to its corresponding clock. If the related clocks and resets are not synchronized, the DPU might not work properly. A Processor System Reset IP block is recommended to generate a synchronized reset signal. The reference design is shown here.
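The reset requirement amounts to a simple consistency rule: every DPU input clock needs a reset that is synchronized to that same clock. The sketch below expresses the rule as a check over a plain dictionary. The function name and the dictionary layout are assumptions made for illustration; only the three clock names come from this section.

```python
DPU_CLOCKS = ("s_axi_aclk", "m_axi_dpu_aclk", "dpu_2x_clk")

def check_resets(reset_sync_clock: dict) -> list:
    """Return the DPU clocks whose reset is missing or mismatched.

    reset_sync_clock maps each DPU clock name to the clock that its
    reset is synchronized to (hypothetical layout, for illustration).
    """
    problems = []
    for clk in DPU_CLOCKS:
        # A missing entry or a reset synchronized to another clock fails.
        if reset_sync_clock.get(clk) != clk:
            problems.append(clk)
    return problems

# A reset for dpu_2x_clk that was synchronized to m_axi_dpu_aclk is flagged.
bad = check_resets({
    "s_axi_aclk": "s_axi_aclk",
    "m_axi_dpu_aclk": "m_axi_dpu_aclk",
    "dpu_2x_clk": "m_axi_dpu_aclk",
})
assert bad == ["dpu_2x_clk"]
```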