# **LogiCORE IP YCrCb to RGB Color-Space Converter v6.00.a**

*Product Guide*

**PG014 July 25, 2012**





# Table of Contents

#### **[SECTION I: SUMMARY](#page-4-0)**

#### **[IP Facts](#page-5-0)**





#### **[Chapter 2: Product Specification](#page-9-0)**



#### **[Chapter 3: Designing with the Core](#page-26-0)**



#### **[Chapter 4: C Model Reference](#page-37-0)**



#### **EXILINX**

#### **[SECTION II: VIVADO DESIGN SUITE](#page-47-0)**



#### **[SECTION III: ISE DESIGN SUITE](#page-55-0)**





#### **[Chapter 8: Constraining the Core](#page-62-0)**



#### **[Chapter 9: Detailed Example Design](#page-64-0)**



#### **EXILINX**

#### **[SECTION IV: APPENDICES](#page-67-0)**





# <span id="page-4-0"></span>SECTION I: SUMMARY

[IP Facts](#page-5-1)

**[Overview](#page-7-2)** 

[Product Specification](#page-9-3)

[Designing with the Core](#page-26-2)

[C Model Reference](#page-37-2)

# <span id="page-5-1"></span>**IP Facts**

# **EXALINX**

# **Introduction**

The Xilinx LogiCORE™ IP YCrCb to RGB Color-Space Converter core is a simplified 3x3 matrix multiplier converting three input color samples to three output samples in a single clock cycle. The optimized structure uses only four XtremeDSP™ slices by taking advantage of the dependencies between coefficients in the conversion matrix of most YCrCb 4:4:4 or YUV 4:4:4 to RGB standards.

# **Features**

- Built-in support for:
	- ° SD (ITU 601)
	- ° HD (ITU 709) PAL
	- ° HD (ITU 709) NTSC
	- ° YUV
- Support for user-defined conversion matrices
- AXI4-Stream data interfaces
- Optional AXI4-Lite control interface
- Supports 8, 10, 12 and 16-bit per color component input and output
- Built-in, optional bypass and test-pattern generator mode
- Built-in, optional throughput monitors
- Supports spatial resolutions from 32x32 up to 7680x7680
	- Supports 1080P60 in all supported device families ([1](#page-5-8))
	- Supports 4kx2k @ 24 Hz in supported high performance devices
- <span id="page-5-8"></span>1. Performance on low power devices may be lower.

<span id="page-5-0"></span>

- <span id="page-5-2"></span>1. For a complete listing of supported devices, see the [release](www.xilinx.com/support/documentation/ip_documentation/xtp025.pdf)  [notes](www.xilinx.com/support/documentation/ip_documentation/xtp025.pdf) for this core.
- <span id="page-5-7"></span>2. Supported in ISE Design Suite implementations only.
- <span id="page-5-3"></span>3. Video protocol as defined in the *Video IP: AXI Feature Adoption* section of [UG761 AXI Reference Guide](http://www.xilinx.com/support/documentation/ip_documentation/axi_ref_guide/v14_2/ug761_axi_reference_guide.pdf).
- <span id="page-5-4"></span>4. HDL test bench and C-Model available on the product page on Xilinx.com at http://www.xilinx.com/products/ intellectual-property/YCrCb\_to\_RGB.htm.
- <span id="page-5-5"></span>5. For the supported versions of the tools, see the [ISE Design](http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_2/irn.pdf)  [Suite 14: Release Notes Guide](http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_2/irn.pdf).
- <span id="page-5-9"></span>6. For the supported versions of the tools, see the [Xilinx Design](http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_2/irn.pdf)  [Tools: Release Notes Guide](http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_2/irn.pdf).
- <span id="page-5-6"></span>7. Supports only 7 series devices.

**EXILINX** 



# <span id="page-7-2"></span><span id="page-7-0"></span>Overview

A color space is a mathematical representation of a set of colors. The two most popular color models are:

- RGB or R'G'B', gamma corrected RGB, used in computer graphics
- YIQ, YUV and YCrCb used in video systems

These color spaces are directly related to the intuitive notions of hue, saturation and brightness.

All color spaces can be derived from the RGB information supplied by devices such as cameras and scanners. Different color spaces have historically evolved for different applications. In each case, a color space was chosen for application-specific reasons.

The convergence of computers, the Internet and a wide variety of video devices, all using different color representations, is forcing the digital designer today to convert between them. The objective is to have all inputs converted to a common color space before algorithms and processes are executed. Converters are useful for a number of markets, including image and video processing.

# <span id="page-7-1"></span>**Feature Summary**

The YCrCb to RGB Color-Space Converter core transforms YCrCb 4:4:4 or YUV 4:4:4 video data into RGB video data. The core supports a 4 common format conversions as well as custom mode that allows for a user-defined transform. The core is capable of a maximum resolution of 7680 columns by 7680 rows with 8, 10, 12, or 16 bits per pixel and supports the bandwidth necessary for High-definition (1080p60) resolutions in all Xilinx FPGA device families. Higher resolutions can be supported in Xilinx high-performance device families.

You can configure and instantiate the core from CORE Generator or EDK tools. Core functionality may be controlled dynamically with an optional AXI4-Lite interface.

# <span id="page-8-0"></span>**Applications**

- Post-processing core for image data
- Video surveillance
- Video conferencing
- Machine vision
- Other imaging applications

# <span id="page-8-1"></span>**Licensing and Ordering Information**

This Xilinx LogiCORE IP module is provided at no additional cost with the Xilinx Vivado Design Suite and ISE Design Suite tools under the terms of the [Xilinx End User License](http://www.xilinx.com/ise/license/license_agreement.htm). Information about this and other Xilinx LogiCORE IP modules is available at the [Xilinx](http://www.xilinx.com/products/intellectual-property/index.htm)  [Intellectual Property](http://www.xilinx.com/products/intellectual-property/index.htm) page. For information about pricing and availability of other Xilinx LogiCORE IP modules and tools, contact your [local Xilinx sales representative](http://www.xilinx.com/company/contact/index.htm).



# <span id="page-9-3"></span><span id="page-9-0"></span>Product Specification

# <span id="page-9-1"></span>**Standards Compliance**

The YCrCb to RGB Color-Space Converter core is compliant with the AXI4-Stream Video Protocol and AXI4-Lite interconnect standards. Refer to the *Video IP: AXI Feature Adoption* section of the (UG761) *AXI Reference Guide* [\[Ref 8\]](#page-82-2) for additional information.

# <span id="page-9-2"></span>**Performance**

The following sections detail the performance characteristics of the YCrCb to RGB Color-Space Converter core.

### **Maximum Frequencies**

The maximum achievable clock frequency can vary. The maximum achievable clock frequency and all resource counts can be affected by other tool options, additional logic in the FPGA device, using a different version of Xilinx tools and other factors. Refer to in [Table 2-1](#page-10-2) through [Table 2-5](#page-11-1) for device-specific information.

#### **Latency**

The processing latency of the core is shown in the following equation:

Latency =  $9 + 1$ (if has clipping) + 1(if has clamping)

This code evaluates to 11 clock cycles for typical cases (unless in "custom" mode the clipping and/or clamping circuits are not used).

### **Throughput**

The YCrCb to RGB Color Space Converter core outputs one YCbCr 4:4:4 sample per clock cycle.

# <span id="page-10-0"></span>**Resource Utilization**

For an accurate measure of the usage of primitives, slices, and CLBs for a particular instance, check the **Display Core Viewer after Generation** check box in the CORE Generator interface.

The information presented in [Table 2-1](#page-10-2) through [Table 2-5](#page-11-1) is a guide to the resource utilization and maximum clock frequency of the YCrCb to RGB Color Space Converter core for all input/output width combinations for Virtex-7, Kintex-7, Artix-7, Zynq-7000, Virtex-6, and Spartan-6 FPGA families. The Xtreme DSP Slice count is always 4, regardless of parameterization, and this core does not use any dedicated I/O or CLK resources. The design was tested using  $ISE^{\circledR}$  v14.2 tools with default tool options for characterization data. When the AXI4-Lite Register Interface is enabled, add the following values to the values in the tables; LUT\_FF Pairs: 864, LUTs: 630 and FFs: 605.



<span id="page-10-2"></span><span id="page-10-1"></span>*Table 2-1:* **Spartan-6**

1. Speedfile: XC6SLX25 FGG484

*Table 2-2:* **Virtex-7**



1. Speedfile: XC7V585T FFG1157

*Table 2-3:* **Virtex-6**



1. Speedfile: XC6VLX75T FF484

| Data Width | <b>LUT-FF Pairs</b> | <b>LUTs</b> | <b>FFs</b> | Fmax (MHz) |
|------------|---------------------|-------------|------------|------------|
|            | 313                 | 298         | 226        | 272.5      |
| 10         | 339                 | 321         | 253        | 280.7      |
| 12         | 358                 | 347         | 279        | 272.5      |
| 16         | 413                 | 386         | 331        | 272.5      |

*Table 2-4:* **Kintex-7 (Zynq-7000 XC7Z010 and XC7Z020)**

1. Speedfile: XC7K70T FBG484

<span id="page-11-1"></span>*Table 2-5:* **Artix-7 (Zynq-7000 XC7Z030 and XC7Z045)**

| Data Width | <b>LUT-FF Pairs</b> | <b>LUTs</b> | <b>FFs</b> | Fmax (MHz) |
|------------|---------------------|-------------|------------|------------|
|            | 297                 | 284         | 225        | 182.27     |
| 10         | 327                 | 305         | 252        | 190.47     |
| 12         | 349                 | 328         | 278        | 190.47     |
| 16         | 401                 | 372         | 330        | 198.67     |

1. Speedfile: XC7A100T FGG484

[Table 2-6](#page-11-2) through [Table 2-8](#page-11-3) were generated using Vivado Design Suite 2012.2 with default tool options for characterization data.

<span id="page-11-2"></span>



1. Speedfile: XC7V585T FFG1157

#### *Table 2-7:* **Kintex-7 (Zynq-7000 XC7Z010 and XC7Z020)**



1. Speedfile: XC7K70T FBG484

#### <span id="page-11-3"></span><span id="page-11-0"></span>*Table 2-8:* **Artix-7 (Zynq-7000 XC7Z030 and XC7Z045)**



| Data Width | <b>LUT-FF Pairs</b> | <b>LUTs</b> | <b>FFs</b> | Fmax (MHz) |
|------------|---------------------|-------------|------------|------------|
|            | 373                 | 124         | 326        | 182.27     |
| 16         | 490                 | 167         | 406        | 174.06     |

*Table 2-8:* **Artix-7 (Zynq-7000 XC7Z030 and XC7Z045)** *(Cont'd)*

1. Speedfile: XC7A100T FGG484

# <span id="page-12-0"></span>**Core Interfaces and Register Space**

### **Port Descriptions**

The YCrCb to RGB Color-Space Converter core uses industry standard control and data interfaces to connect to other system components. The following sections describe the various interfaces available with the core. [Figure 2-1](#page-13-0) illustrates an I/O diagram of the YCrCb2RGB core. Some signals are optional and not present for all configurations of the core. The AXI4-Lite interface and the IRQ pin are present only when the core is configured via the GUI with an AXI4-Lite control interface. The INTC\_IF interface is present only when the core is configured via the GUI with the INTC interface enabled.



*Figure 2-1:* **YCrCb2RGB Core Top-Level Signaling Interface**

### <span id="page-13-0"></span>**Common Interface Signals**

[Table 2-9](#page-13-1) summarizes the signals which are either shared by, or not part of the dedicated AXI4-Stream data or AXI4-Lite control interfaces.

| <b>Signal Name</b>        | Direction Width |   | <b>Description</b>                                                                                   |  |
|---------------------------|-----------------|---|------------------------------------------------------------------------------------------------------|--|
| <b>ACLK</b>               | In              |   | Video Core Clock                                                                                     |  |
| <b>ACLKEN</b>             | In              |   | Video Core Active High Clock Enable                                                                  |  |
| <b>ARESET<sub>n</sub></b> | In              |   | Video Core Active Low Synchronous Reset                                                              |  |
| INTC IF                   | Out             | 6 | Optional External Interrupt Controller Interface. Available only<br>when INTC IF is selected on GUI. |  |
| IRQ                       | Out             |   | Optional Interrupt Request Pin. Available only when AXI4-Liter<br>interface is selected on GUI.      |  |

<span id="page-13-1"></span>*Table 2-9:* **Common Interface Signals**

The ACLK, ACLKEN and ARESETn signals are shared between the core and the AXI4-Stream data interfaces. The AXI4-Lite control interface has its own set of clock, clock enable and reset pins: S\_AXI\_ACLK, S\_AXI\_ACLKEN and S\_AXI\_ARESETn. Refer to [The Interrupt](#page-24-0)  [Subsystem](#page-24-0) for a description of the INTC\_IF and IRQ pins.

#### **ACLK**

The AXI4-Stream interface must be synchronous to the core clock signal ACLK. All AXI4-Stream interface input signals are sampled on the rising edge of ACLK. All AXI4-Stream output signal changes occur after the rising edge of ACLK. The AXI4-Lite interface is unaffected by the ACLK signal.

#### **ACLKEN**

The ACLKEN pin is an active-high, synchronous clock-enable input pertaining to AXI4-Stream interfaces. Setting ACLKEN low (de-asserted) halts the operation of the core despite rising edges on the ACLK pin. Internal states are maintained, and output signal levels are held until ACLKEN is asserted again. When ACLKEN is de-asserted, core inputs are not sampled, except ARESETn, which supersedes ACLKEN. The AXI4-Lite interface is unaffected by the ACLKEN signal.

#### **ARESETn**

The ARESETn pin is an active-low, synchronous reset input pertaining to only AXI4-Stream interfaces. ARESETn supersedes ACLKEN, and when set to 0, the core resets at the next rising edge of ACLK even if ACLKEN is de-asserted. The ARESETn signal must be synchronous to the ACLK and must be held low for a minimum of 32 clock cycles of the slowest clock. The AXI4-Lite interface is unaffected by the ARESETn signal.

### **Data Interface**

The YCrCb2RGB core receives and transmits data using AXI4-Stream interfaces that implement a video protocol as defined in the *Video IP: AXI Feature Adoption* section of the (UG761) *AXI Reference Guide* [\[Ref 8\].](#page-82-2)

#### **AXI4-Stream Signal Names and Descriptions**

[Table 2-10](#page-14-0) describes the AXI4-Stream signal names and descriptions.

| <b>Signal Name</b>  | <b>Direction</b> | Width       | <b>Description</b>       |
|---------------------|------------------|-------------|--------------------------|
| s_axis_video_tdata  | In               | 24,32,40,48 | Input Video Data         |
| s_axis_video_tvalid | In               |             | Input Video Valid Signal |
| s_axis_video_tready | Out              |             | Input Ready              |

<span id="page-14-0"></span>*Table 2-10:* **AXI4-Stream Data Interface Signal Descriptions**

| <b>Signal Name</b>  | <b>Direction</b> | Width       | <b>Description</b>          |
|---------------------|------------------|-------------|-----------------------------|
| s_axis_video_tuser  | In               |             | Input Video Start Of Frame  |
| s_axis_video_tlast  | In               | 1           | Input Video End Of Line     |
| m_axis_video_tdata  | Out              | 24,32,40,48 | Output Video Data           |
| m axis video tvalid | Out              |             | Output Valid                |
| m_axis_video_tready | In               |             | Output Ready                |
| m_axis_video_tuser  | Out              | 1           | Output Video Start Of Frame |
| m_axis_video_tlast  | Out              |             | Output Video End Of Line    |

*Table 2-10:* **AXI4-Stream Data Interface Signal Descriptions**

#### **Video Data**

The AXI4-Stream interface specification restricts TDATA widths to integer multiples of 8 bits. Therefore, 10, 12, and 16 bit sensor data must be padded with zeros on the MSB to form Nx8 bit wide vector before connecting to s\_axis\_video\_tdata. Padding does not affect the size of the core.

For example, RGB data on the YCrCb2RGB output m\_axis\_video\_tdata is packed and padded to multiples of 8 bits as necessary, as seen in [Figure 2-2](#page-15-0). Zero padding the most significant bits is only necessary for 10, 12, and 16 bit wide data.

<span id="page-15-1"></span>

*Figure 2-2:* **RGB Data Encoding on m\_axis\_video\_tdata**

#### <span id="page-15-0"></span>**READY/VALID Handshake**

A valid transfer occurs whenever READY, VALID, ACLKEN, and ARESETn are high at the rising edge of ACLK, as seen in [Figure 2-3.](#page-16-0) During valid transfers, DATA only carries active video data. Blank periods and ancillary data packets are not transferred via the AXI4-Stream video protocol.

#### **Guidelines on Driving s\_axis\_video\_tvalid**

Once s\_axis\_video\_tvalid is asserted, no interface signals (except the YCrCb2RGB core driving s\_axis\_video\_tready) may change value until the transaction completes (s\_axis\_video\_tready, s\_axis\_video\_tvalid ACLKEN high on the rising edge of ACLK). Once asserted, s\_axis\_video\_tvalid may only be de-asserted after a transaction has completed. Transactions may not be retracted or aborted. In any cycle following a transaction,  $s_axis_yideo_tvalid can either be de-asserted or remain$ asserted to initiate a new transfer.

<span id="page-16-1"></span>

*Figure 2-3:* **Example of READY/VALID Handshake, Start of a New Frame**

#### <span id="page-16-0"></span>**Guidelines on Driving m\_axis\_video\_tready**

The m\_axis\_video\_tready signal may be asserted before, during or after the cycle in which the YCrCb2RGB core asserted m\_axis\_video\_tvalid. The assertion of m\_axis\_video\_tready may be dependent on the value of m\_axis\_video\_tvalid. A slave that can immediately accept data qualified by  $m_{axis}$  video\_tvalid, should pre-assert its m\_axis\_video\_tready signal until data is received. Alternatively, m\_axis\_video\_tready can be registered and driven the cycle following VALID assertion. It is recommended that the AXI4-Stream slave should drive READY independently, or pre-assert READY to minimize latency.

#### **Start of Frame Signals - m\_axis\_video\_tuser0, s\_axis\_video\_tuser0**

The Start-Of-Frame (SOF) signal, physically transmitted over the AXI4-Stream TUSER0 signal, marks the first pixel of a video frame. The SOF pulse is 1 valid transaction wide, and must coincide with the first pixel of the frame, as seen in [Figure 2-3.](#page-16-0) SOF serves as a frame synchronization signal, which allows downstream cores to re-initialize, and detect the first pixel of a frame. The SOF signal may be asserted an arbitrary number of ACLK cycles before the first pixel value is presented on DATA, as long as a VALID is not asserted.

#### **End of Line Signals - m\_axis\_video\_tlast, s\_axis\_video\_tlast**

The End-Of-Line signal, physically transmitted over the AXI4-Stream TLAST signal, marks the last pixel of a line. The EOL pulse is 1 valid transaction wide, and must coincide with the last pixel of a scan-line, as seen in [Figure 2-4.](#page-17-0)



*Figure 2-4:* **Use of EOL and SOF Signals**

### <span id="page-17-0"></span>**Control Interface**

When configuring the core, the user has the option to add an AXI4-Lite register interface to dynamically control the behavior of the core. The AXI4-Lite slave interface facilitates integrating the core into a processor system, or along with other video or AXI4-Lite compliant IP, connected via AXI4-Lite interface to an AXI4-Lite master. In a static configuration with a fixed set of parameters (constant configuration), the core can be instantiated without the AXI4-Lite control interface, which reduces the core Slice footprint.

#### **Constant Configuration**

The constant configuration caters to users who will use the core in one setup that will not need to change over time. In constant configuration the image resolution (number of active pixels per scan line and the number of active scan lines per frame), and the other core parameters are hard coded into the core via the YCrCb2RGB core GUI. Since there is no AXI4-Lite interface, the core is not programmable, but can be reset, enabled, or disabled using the ARESETn and ACLKEN ports.

#### **AXI4-Lite Interface**

The AXI4-Lite interface allows a user to dynamically control parameters within the core. Core configuration can be accomplished using an AXI4-Stream master state machine, or an embedded ARM or soft system processor such as MicroBlaze.

The YCrCb2RGB core can be controlled via the AXI4-Lite interface using read and write transactions to the YCrCb2RGB register space.

| <b>Signal Name</b> | Direction Width | <b>Description</b>                                   |
|--------------------|-----------------|------------------------------------------------------|
| s axi aclk         | In              | AXI4-Lite clock                                      |
| s_axi_aclken       | In              | AXI4-Lite clock enable                               |
| s axi aresetn      | In              | AXI4-Lite synchronous Active Low reset               |
| s axi awvalid      | In              | AXI4-Lite Write Address Channel Write Address Valid. |

*Table 2-11:* **AXI4-Lite Interface Signals**

| <b>Signal Name</b> | <b>Direction</b> | Width        | <b>Description</b>                                                                                       |
|--------------------|------------------|--------------|----------------------------------------------------------------------------------------------------------|
| s_axi_awread       | Out              | $\mathbf{1}$ | AXI4-Lite Write Address Channel Write Address Ready. Indicates DMA<br>ready to accept the write address. |
| s_axi_awaddr       | In.              | 32           | AXI4-Lite Write Address Bus                                                                              |
| s_axi_wvalid       | In.              | 1            | AXI4-Lite Write Data Channel Write Data Valid.                                                           |
| s_axi_wready       | Out              | 1            | AXI4-Lite Write Data Channel Write Data Ready. Indicates DMA is ready to<br>accept the write data.       |
| s_axi_wdata        | In.              | 32           | AXI4-Lite Write Data Bus                                                                                 |
| s_axi_bresp        | Out              | 2            | AXI4-Lite Write Response Channel. Indicates results of the write transfer.                               |
| s_axi_bvalid       | Out              | 1            | AXI4-Lite Write Response Channel Response Valid. Indicates response is<br>valid.                         |
| s_axi_bready       | In               | 1            | AXI4-Lite Write Response Channel Ready. Indicates target is ready to<br>receive response.                |
| s_axi_arvalid      | In.              | 1            | AXI4-Lite Read Address Channel Read Address Valid                                                        |
| s_axi_arready      | Out              | 1            | Ready. Indicates DMA is ready to accept the read address.                                                |
| s_axi_araddr       | In.              | 32           | AXI4-Lite Read Address Bus                                                                               |
| s_axi_rvalid       | Out              | 1.           | AXI4-Lite Read Data Channel Read Data Valid                                                              |
| s_axi_rready       | In               | 1            | AXI4-Lite Read Data Channel Read Data Ready. Indicates target is ready to<br>accept the read data.       |
| s_axi_rdata        | Out              | 32           | AXI4-Lite Read Data Bus                                                                                  |
| s_axi_rresp        | Out              | 2            | AXI4-Lite Read Response Channel Response. Indicates results of the read<br>transfer.                     |

*Table 2-11:* **AXI4-Lite Interface Signals** *(Cont'd)*

#### **S\_AXI\_ACLK**

The AXI4-Lite interface must be synchronous to the S\_AXI\_ACLK clock signal. The AXI4-Lite interface input signals are sampled on the rising edge of ACLK. The AXI4-Lite output signal changes occur after the rising edge of ACLK. The AXI4-Stream interfaces signals are not affected by the S\_AXI\_ACLK.

#### **S\_AXI\_ACLKEN**

The S\_AXI\_ACLKEN pin is an active-high, synchronous clock-enable input for the AXI4-Lite interface. Setting S\_AXI\_ACLKEN low (de-asserted) halts the operation of the AXI4-Lite interface despite rising edges on the S\_AXI\_ACLK pin. AXI4-Lite interface states are maintained, and AXI4-Lite interface output signal levels are held until S\_AXI\_ACLKEN is asserted again. When S\_AXI\_ACLKEN is de-asserted, AXI4-Lite interface inputs are not sampled, except S\_AXI\_ARESETn, which supersedes S\_AXI\_ACLKEN. The AXI4-Stream interfaces signals are not affected by the S\_AXI\_ACLKEN.

### **S\_AXI\_ARESETn**

The S\_AXI\_ARESETn pin is an active-low, synchronous reset input for the AXI4-Lite interface. S\_AXI\_ARESETn supersedes S\_AXI\_ACLKEN, and when set to 0, the core resets at the next rising edge of S\_AXI\_ACLK even if S\_AXI\_ACLKEN is de-asserted. The S\_AXI\_ARESETn signal must be synchronous to the S\_AXI\_ACLK and must be held low for a minimum of 32 clock cycles of the slowest clock. The S\_AXI\_ARESETn input is resynchronized to the ACLK clock domain. The AXI4-Stream interfaces and core signals are also reset by S\_AXI\_ARESETn.

### **Register Space**

The standardized Xilinx Video IP register space is partitioned to control-, timing-, and core specific registers. The YCrCb2RGB core uses only one timing related register, ACTIVE\_SIZE (0x0020), which allows specifying the input frame dimensions. The core has nine core specific registers that control Matrix coefficients, the data offsets and the clip and clamp values.



<span id="page-19-0"></span>

| <b>Address</b><br>(hex)<br><b>BASEADDR</b> | <b>Register Name</b> | <b>Access</b><br><b>Type</b> | <b>Double</b><br><b>Buffered</b> | <b>Default Value</b> | <b>Register Description</b>                                                                 |
|--------------------------------------------|----------------------|------------------------------|----------------------------------|----------------------|---------------------------------------------------------------------------------------------|
| 0x001C                                     | SYSDEBUG2            | R                            | N/A                              | $\Omega$             | 0-31: Pixel Throughput monitor <sup>(1)</sup>                                               |
| 0x0020                                     | ACTIVE_SIZE          | R/W                          | Yes                              | Specified via<br>GUI | 12-0: Number of Active Pixels per<br>Scanline<br>28-16: Number of Active Lines per<br>Frame |
| 0x0100                                     | RGBMAX               | R/W                          | Yes                              | Specified via GUI    | 15:0: RGB clippling value                                                                   |
| 0x0104                                     | <b>RGBMIN</b>        | R/W                          | Yes                              | Specified via GUI    | 15:0: RGB clampling value                                                                   |
| 0x0108                                     | <b>ROFFSET</b>       | R/W                          | Yes                              | Specified via GUI    | 31:0: Red offset compensation                                                               |
| 0x010C                                     | <b>GOFFSET</b>       | R/W                          | Yes                              | Specified via GUI    | 31:0: Green offset compensation                                                             |
| 0x0110                                     | <b>BOFFSET</b>       | R/W                          | Yes                              | Specified via GUI    | 31:0: Blue offset compensation                                                              |
| 0x0114                                     | <b>ACOEF</b>         | R/W                          | Yes                              | Specified via GUI    | 17:0: ACOEF, BCOEF, CCOEF, DCOEF are                                                        |
| 0x0118                                     | <b>BCOEF</b>         | R/W                          | Yes                              | Specified via GUI    | derived from CA, CB, CC and CD, by<br>calculating the inverse multiplication                |
| 0x011C                                     | <b>CCOEF</b>         | R/W                          | Yes                              | Specified via GUI    | matrix and representing as a 17-bit fixed                                                   |
| 0x0120                                     | <b>DCOEF</b>         | R/W                          | Yes                              | Specified via GUI    | point number.                                                                               |

*Table 2-12:* **Register Names and Descriptions**

<span id="page-20-0"></span>1. Only available when the debugging features option is enabled in the GUI at the time the core is instantiated.

#### **CONTROL (0x0000) Register**

Bit 0 of the CONTROL register, SW\_ENABLE, facilitates enabling and disabling the core from software. Writing '0' to this bit effectively disables the core halting further operations, which blocks the propagation of all video signals. The default value of  $\text{SW}$  enable is 1 (enabled) for the Constant configuration. After Power up, or Global Reset, the SW\_ENABLE defaults to 0 for the AXI4-Lite interface. Similar to the ACLKEN pin, the SW\_ENABLE flag is not synchronized with the AXI4-Stream interfaces: Enabling or Disabling the core takes effect immediately, irrespective of the core processing status. Disabling the core for extended periods may lead to image tearing.

Bit 1 of the CONTROL register, REG\_UPDATE is a write done semaphore for the host processor, which facilitates committing all user and timing register updates simultaneously. The YCrCb2RGB core ACTIVE\_SIZE and core specific registers are double buffered. One set of registers (the processor registers) is directly accessed by the processor interface, while the other set (the active set) is actively used by the core. New values written to the processor registers will get copied over to the active set at the end of the AXI4-Stream frame, if and only if REG\_UPDATE is set. Setting REG\_UPDATE to 0 before updating multiple register values, then setting REG\_UPDATE to 1 when updates are completed ensures all registers are updated simultaneously at the frame boundary without causing image tearing.

Bit 4 of the CONTROL register, BYPASS, switches the core to bypass mode if debug features are enabled. In bypass mode the YCrCb2RGB core processing function is bypassed, and the core repeats AXI4-Stream input samples on its output. Refer to [Debugging Features in](#page-74-1)  [Appendix C](#page-74-1) for more information. If debug features were not included at instantiation, this flag has no effect on the operation of the core. Switching bypass mode on or off is not synchronized to frame processing, therefore can lead to image tearing.

Bit 5 of the CONTROL register, TEST\_PATTERN, switches the core to test-pattern generator mode if debug features are enabled. Refer to [Debugging Features in Appendix C](#page-74-1) for more information. If debug features were not included at instantiation, this flag has no effect on the operation of the core. Switching test-pattern generator mode on or off is not synchronized to frame processing, therefore can lead to image tearing.

Bits 30 and 31 of the CONTROL register, FRAME\_SYNC\_RESET and SW\_RESET facilitate software reset. Setting SW\_RESET reinitializes the core to GUI default values, all internal registers and outputs are cleared and held at initial values until SW\_RESET is set to 0. The SW\_RESET flag is not synchronized with the AXI4-Stream interfaces. Resetting the core while frame processing is progress will cause image tearing. For applications where the soft-ware reset functionality is desirable, but image tearing has to be avoided a frame synchronized software reset (FRAME\_SYNC\_RESET) is available. Setting FRAME SYNC RESET to 1 will reset the core at the end of the frame being processed, or immediately if the core is between frames when the FRAME\_SYNC\_RESET was asserted. After reset, the FRAME SYNC RESET bit is automatically cleared, so the core can get ready to process the next frame of video as soon as possible. The default value of both RESET bits is 0. Core instances with no AXI4-Lite control interface can only be reset via the ARESETn pin.

#### **STATUS (0x0004) Register**

All bits of the STATUS register can be used to request an interrupt from the host processor. To facilitate identification of the interrupt source, bits of the STATUS register remain set after an event associated with the particular STATUS register bit, even if the event condition is not present at the time the interrupt is serviced.

Bits of the STATUS register can be inverted individually by writing '1' to the bit position to be cleared.

Bit 0 of the STATUS register, PROC\_STARTED, indicates that processing of a frame has commenced via the AXI4-Stream interface.

Bit 1 of the STATUS register, End-of-frame (EOF), indicates that the processing of a frame has completed.

Bit 16 of the STATUS register, SLAVE\_ERROR, indicates that one of the conditions monitored by the ERROR register has occurred.

#### **ERROR (0x0008) Register**

Bit 4 of the STATUS register, SLAVE\_ERROR, indicates that one of the conditions monitored by the ERROR register has occurred. This bit can be used to request an interrupt from the host processor. To facilitate identification of the interrupt source, bits of the STATUS and ERROR registers remain set after an event associated with the particular ERROR register bit, even if the event condition is not present at the time the interrupt is serviced.

Bits of the ERROR register can be inverted individually by writing '1' to the bit position to be cleared.

Bit 0 of the ERROR register, EOL\_EARLY, indicates an error during processing a video frame via the AXI4-Stream slave port. The number of pixels received between the latest and the preceding End-Of-Line (EOL) signal was less than the value programmed into the ACTIVE\_SIZE register.

Bit 1 of the ERROR register, EOL\_LATE, indicates an error during processing a video frame via the AXI4-Stream slave port. The number of pixels received between the last EOL signal surpassed the value programmed into the ACTIVE\_SIZE register.

Bit 2 of the ERROR register, SOF\_EARLY, indicates an error during processing a video frame via the AXI4-Stream slave port. The number of pixels received between the latest and the preceding Start-Of-Frame (SOF) signal was less than the value programmed into the ACTIVE\_SIZE register.

Bit 3 of the ERROR register, SOF\_LATE, indicates an error during processing a video frame via the AXI4-Stream slave port. The number of pixels received between the last SOF signal surpassed the value programmed into the ACTIVE\_SIZE register.

#### **IRQ\_ENABLE (0x000C) Register**

Any bits of the STATUS register can generate a host-processor interrupt request via the IRQ pin. The Interrupt Enable register facilitates selecting which bits of STATUS register will assert IRQ. Bits of the STATUS registers are masked by (AND) corresponding bits of the IRQ\_ENABLE register and the resulting terms are combined (OR) together to generate IRQ.

#### **Version (0x0010) Register**

Bit fields of the Version Register facilitate software identification of the exact version of the hardware peripheral incorporated into a system. The core driver can take advantage of this Read-Only value to verify that the software is matched to the correct version of the hardware. See [Table 2-12](#page-19-0) for more information.

#### **SYSDEBUG0 (0x0014) Register**

The SYSDEBUG0, or Frame Throughput Monitor, register indicates the number of frames processed since power-up or the last time the core was reset. The SYSDEBUG registers can be useful to identify external memory / Frame buffer / or throughput bottlenecks in a video system. Refer to [Debugging Features in Appendix C](#page-74-1) for more information.

#### **SYSDEBUG1 (0x0018) Register**

The SYSDEBUG1, or Line Throughput Monitor, register indicates the number of lines processed since power-up or the last time the core was reset. The SYSDEBUG registers can be useful to identify external memory / Frame buffer / or throughput bottlenecks in a video system. Refer to [Debugging Features in Appendix C](#page-74-1) for more information.

#### **SYSDEBUG2 (0x001C) Register**

The SYSDEBUG2, or Pixel Throughput Monitor, register indicates the number of pixels processed since power-up or the last time the core was reset. The SYSDEBUG registers can be useful to identify external memory / Frame buffer / or throughput bottlenecks in a video system. Refer to [Debugging Features in Appendix C](#page-74-1) for more information.

#### **ACTIVE\_SIZE (0x0020) Register**

The ACTIVE\_SIZE register encodes the number of active pixels per scan line and the number of active scan lines per frame. The lower half-word (bits 12:0) encodes the number of active pixels per scan line. Supported values are between 32 and the value provided in the **Maximum number of pixels per scan line** field in the GUI. The upper half-word (bits 28:16) encodes the number of active pixels per scan line. Supported values are 32 to 7680. To avoid processing errors, the user should restrict values written to ACTIVE\_SIZE the range supported by the core instance.

#### **RGBMAX (0x0100) Register**

The RGBMAX register holds the maximum value allowed on the Red, Green and Blue channels of the output. If the output data is greater than this value, then this value replaces it on the output. This register is only valid if "Outputs Clipped" is selected in the core parameterization GUI.

#### **RGBMIN (0x0104) Register**

The YMin register holds the minimum value allowed on the Red, Green and Blue channels of the output. If the output data is less than this value, then this value replaces it on the output. This register is only valid if "Outputs Clamped" is selected in the core parameterization GUI.

#### **ROFFSET (0x0108) Register**

The ROFFSET register holds the offset compensation value for the Red channel.

#### **GOFFSET(0x010C) Register**

The GOFFSET register holds the offset compensation value for the Green channel.

#### **BOFFSET (0x0110) Register**

The BOFFSET register holds the offset compensation value for the Blue channel.

#### **ACOEF (0x0114) Register**

The ACOEF register holds the transformed CA coefficient expressed as an 18.16 floating point number.

#### **BCOEF (0x0118) Register**

The BCOEF register holds the transformed CB coefficient expressed as an 18.16 floating point number.

#### **CCOEF (0x011C) Register**

The CCOEF register holds the transformed CC coefficient expressed as an 18.16 floating point number.

#### **DCOEF (0x0120) Register**

The DCOEF register holds the transformed CD coefficient expressed as an 18.16 floating point number.

#### <span id="page-24-0"></span>**The Interrupt Subsystem**

STATUS register bits can trigger interrupts so embedded application developers can quickly identify faulty inter-faces or incorrectly parameterized cores in a video system. Irrespective of whether the AXI4-Lite control interface is present or not, the YCrCb2RGB core detects AXI4-Stream framing errors, as well as the beginning and the end of frame processing.

When the core is instantiated with an AXI4-Lite Control interface, the optional interrupt request pin  $(IRQ)$  is present. Events associated with bits of the  $STATUS$  register can generate a (level triggered) interrupt, if the corresponding bits of the interrupt enable register (IRQ ENABLE) are set. Once set by the corresponding event, bits of the STATUS register stay set until the user application clears them by writing '1' to the desired bit positions. Using this mechanism the system processor can identify and clear the interrupt source.

Without the AXI4-Lite interface the user can still benefit from the core signaling error and status events. By selecting the **Enable INTC Port** check-box on the GUI, the core generates the optional INTC\_IF port. This vector of signals gives parallel access to the individual interrupt sources, as seen in [Table 2-13](#page-25-0).

Unlike STATUS and ERROR flags, INTC\_IF signals are not held, rather stay asserted only while the corresponding event persists.

| <b>INTC_IF signal</b> | <b>Function</b>              |
|-----------------------|------------------------------|
| 0                     | Frame processing start       |
| 1                     | Frame processing complete    |
| $\mathcal{P}$         | Reserved                     |
| 3                     | Reserved                     |
| 4                     | Video over AXI4-Stream Error |
| 5                     | <b>EOL Early</b>             |
| 6                     | <b>EOL Late</b>              |
| 7                     | <b>SOF Early</b>             |
| 8                     | <b>SOF Late</b>              |

<span id="page-25-0"></span>*Table 2-13:* **INTC\_IF Signal Functions**

In a system integration tool, such as EDK, the interrupt controller INTC IP can be used to register the selected INTC\_IF signals as edge triggered interrupt sources. The INTC IP provides functionality to mask (enable or disable), as well as identify individual interrupt sources from software. Alternatively, for an external processor or MCU the user can custom build a priority interrupt controller to aggregate interrupt requests and identify interrupt sources.

# <span id="page-26-2"></span><span id="page-26-0"></span>Designing with the Core

# <span id="page-26-1"></span>**General Design Guidelines**

The YCrCb2RGB core converts YCrCb 4:4:4 (or YUV 4:4:4) video data into RGB video data.The core processes samples provided via an AXI4-Stream slave interface, outputs pixels via an AXI4-Stream master interface, and can be controlled via an optional AXI4-Lite interface. The YCrCb2RGB block cannot change the input/output image sizes, the input and output pixel clock rates, or the frame rate. It is recommended that the YCrCb2RGB core is used in conjunction with the Video In to AXI4-Stream and Video Timing Controller cores. The Video Timing Controller core measures the timing parameters, such as number of active scan lines, number of active pixels per scan line of the image sensor. The Video In to AXI4-Stream core converts a standard parallel clocked video interface with syncs and or blanks to AXI4-Stream Video protocol as defined in the *Video IP: AXI Feature Adoption* section of the (UG761) *AXI Reference Guide* [\[Ref 8\].](#page-82-2)

Typically, the YCrCb2RGB core is part of a larger video system such as an Image Sensor Pipeline (ISP) System shown in [Figure 3-1.](#page-26-3)

<span id="page-26-3"></span>

*Figure 3-1:* **Image Sensor Pipeline System with YCrCb2RGB Core**

# <span id="page-27-0"></span>**Color-Space Conversion Background**

### **RGB Color Space**

The red, green and blue (RGB) color space is widely used throughout computer graphics. Red, green and blue are three primary additive colors: individual components are added together to form a desired color, and are represented by a three dimensional, Cartesian coordinate system, as shown in [Figure 3-2.](#page-28-0)

[Table 3-1](#page-27-1) presents the RGB values for 100% saturated color bars, a common video test signal.

|   | <b>Normal</b><br>Range | White | Yellow | Cyan | Green | Magenta | Red | <b>Blue</b> | <b>Black</b> |
|---|------------------------|-------|--------|------|-------|---------|-----|-------------|--------------|
| R | 0 to 255               | 255   | 255    | 0    |       | 255     | 255 |             |              |
| G | 0 to 255               | 255   | 255    | 255  | 255   |         |     |             |              |
| B | 0 to 255               | 255   |        | 255  | 0     | 255     |     | 255         |              |

<span id="page-27-1"></span>*Table 3-1:* **100% RGB Color Bars**

The RGB color space is the most prevalent choice for computer graphics because color displays use red, green and blue to create the desired color. Also, a system that is designed using the RGB color space can take advantage of a large number of existing software algorithms.

However, RGB is not very efficient when dealing with real-world images. All three components need equal bandwidth to generate arbitrary colors within the RGB color cube. Also, processing an image in the RGB color space is usually not the most efficient method. For example, to modify the intensity or color of a given pixel, all three RGB values must be read, modified and written back to the frame buffer. If the system had access to the image stored in the intensity and color format, the process would be faster.

# **R'G'B' Color Space**

While the RGB color space is ideal to represent computer graphics, 8-bit linear-light coding performs poorly for images to be viewed [\[Ref 2\]](#page-81-4). It is necessary to have 12 or 14 bits per component to achieve excellent quality. The best perceptual use is made of a limited number of bits by using nonlinear coding that mimics the nonlinear response of human vision. In video, JPEG, MPEG, computing, digital photography, and many other domains, a nonlinear transfer function is applied to the RGB signals to give nonlinearly coded gamma-corrected components, denoted with symbols R'G'B'. Excellent image quality can be obtained with 10-bit nonlinear coding with a transfer function similar to that of *Rec. 709* [\[Ref 4\]](#page-81-5) or RGB.

# **YUV Color Space**

The YUV color space is used by the analog PAL, NTSC and SECAM color video/TV standards. In the past, black and white systems used only the luminance (Y) information. Chrominance information (U and V) was added in such a way that a black and white receiver can still display a normal black and white picture.

# **YCrCb (or YCbCr) Color Space**

The YCrCb or YCbCr color space was developed as part of the *ITU-R BT.601* [\[Ref 3\]](#page-81-6) during the development of a world-wide digital component video standard. YCbCr is a scaled, offset version of the YUV color space. Y has a nominal range of 16-235; Cb and Cr have a nominal range of 16-240. There are several YCbCr sampling formats, such as 4:4:4, 4:2:2 and  $4.2 \cdot 0$ 

<span id="page-28-0"></span>

*Figure 3-2:* **RGB and YCrCb Color Representations**

## **Conversion Equations**

#### **Derivation of Conversion Equations**

To generate the luminance (Y, or gray value) component, biometric experiments were employed to measure how the human eye perceives the intensities of the red, green and blue colors. Based on these experiments, optimal values for coefficients CA and CB were determined, such that:

$$
Y = CA^*R + (1 - CA - CB)^*G + CB^*B
$$
 *Equation 3-1*

Actual values for CA and CB differ slightly in different standards.

Conversion from the RGB color space to luminance and chrominance (differential color components) could be described with the following equation:



Coefficients CA and CB are chosen between 0 and 1, which guarantees that the range of Y is constrained between the maximum and the minimum RGB values permitted,  $RGB<sub>max</sub>$  and RGBmin respectively.

In most practical implementations, the range of the luminance and chrominance components should be equal. There are two ways to accomplish this: the chrominance components (B-Y and R-Y) can be normalized (compressed and offset compensated), or values above and below the luminance range can be clipped/clamped.

Both clipping and dynamic range compression results in loss of information; however, the introduced artifacts are different. To leverage differences in the input (RGB) range, different standards choose different tradeoffs between clipping and normalization.

The YCrCb to RGB Color-Space Converter core supports only the conversions that fit the following general form:

$$
\begin{bmatrix} Y \\ C_R \\ C_B \end{bmatrix} = \begin{bmatrix} CA & 1 - CA - CB & CB \\ CC(1 - CA) & CC(CA + CB - 1) & CC(-CB) \\ CD(-CA) & CD(CA + CB - 1) & CD(1 - CB) \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} O_Y \\ O_{Cr} \\ O_{Cb} \end{bmatrix}
$$
 *Equation 3-3*

<span id="page-29-0"></span>CC and CD allow dynamic range compression for B-Y and R-Y, and constants  $O<sub>Y</sub>$  and  $O<sub>C</sub>$ facilitate offset compensation for the resulting CB and CR. To avoid arithmetic under- and overflows while converting from the RGB to the YCrCb domain, with RGB values in the [0.1] range, a choice for CC and CD is:

$$
CC = \frac{1}{2(1 - CA)} \qquad CD = \frac{1}{2(1 - CB)} \qquad \qquad \text{Equation 3-4}
$$

<span id="page-29-2"></span>The YCrCb to RGB core facilitates both range de-compression and optional clipping and clamping. Range, offset, clipping and clamping levels are parameterizable.

By inverting the transformation matrix in [Equation 3-3,](#page-29-0) the transformation from the YCrCb color space to the RGB color space can be defined as:

$$
\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & 1/CC & 0 \\ 1 & -CA & -CB \\ 1 & \overline{CC(1 - CA - CB)} & \overline{CD(1 - CA - CB)} \\ 1 & 0 & 1/CD \end{bmatrix} \begin{bmatrix} Y - O_Y \\ C_R - O_{Cr} \\ C_B - O_{Cb} \end{bmatrix}
$$
 *Equation 3-5*

#### <span id="page-29-1"></span>**Hardware Implementation**

The YCrCb to RGB color-space transformation [\(Equation 3-5\)](#page-29-1) can be expressed as:

$$
R = Y - O_Y + ACOEF(C_R - O_{Cr})
$$
 *Equation 3-6*

$$
G = Y - O_Y + BCOEF(C_R - O_{Cr}) + CCOEF(C_B - O_{Cb})
$$
 *Equation 3-7*

$$
B = Y - O_Y + DCOEF(C_B - O_{Cb})
$$
 *Equation 3-8*

This cannot efficiently utilize the MADD capabilities of XtremeDSP slices. As offsets and coefficients are constants, the preceding equations can be rewritten as:

$$
R = ACOEF \bullet C_R + ROFFSET + Y
$$
 *Equation 3-9*

$$
G = \text{BCOEF} \bullet C_R + \text{CCOEF} \bullet C_B + \text{GOFFSET} + Y
$$
 *Equation 3-10*

$$
B = DCOEF \bullet C_B + BOFFSET + Y
$$
 *Equation 3-11*

<span id="page-30-0"></span>This can be directly mapped to the architecture shown in [Figure 3-3](#page-30-0). The blue and gray boxes represent logic blocks, which are always implemented using XtremeDSP slices.



*Figure 3-3:* **YcrCb to RGB Schematic**

#### **Assigning Values to Design Parameters**

The following section specifies parameter values for some widely used standards. Most parameter values, except for COEF and OFFSET parameters, can be assigned from [Table 3-2,](#page-31-0) [Table 3-3](#page-31-1) and [Table 3-4](#page-32-0) directly. These parameters have to be calculated, scaled and rounded before assigning integer values to corresponding VHDL parameters, using the following equations:

$$
ACOEF = \frac{1}{CC}
$$
  
\n
$$
BCOEF = \frac{-CA}{CC(1 - CA - CB)}
$$
  
\n
$$
CCOEF = \frac{-CB}{CD(1 - CA - CB)}
$$
  
\n
$$
DCOEF = \frac{-CB}{CD(1 - CA - CB)}
$$
  
\n
$$
EXECOEF = \frac{-CB}{CD(1 - CA - CB)}
$$
  
\n
$$
EXECOEF = \frac{-CB}{CD(1 - CA - CB)}
$$
  
\n
$$
Equation 3.14
$$

$$
DCOEF = \frac{1}{CD}
$$
 *Equation 3-15*

Coefficients are passed to the core in 17 bits wide two's complement format. Y, Cr and Cb are passed as DATA\_WIDTH bits wide unsigned integers. MWIDTH is preset to Min(32, DATA\_WIDTH+17).

ROUNDING\_CONST = 
$$
2^{MWIDTH - \overline{DATA\_WIDTH}}
$$
 Equation 3-16

\n*Equation 3-16*

\n*Equation 3-17*

\n*ROFFSET* = ROUNDING\_CONST – (*ACOFFSET*) + *CCOFFSET*) + *SOFESET* = ROUNDING\_CONST – ((*BOEF + CROFFSET*) + (*CCDEF X CBOFFSET*) + *YOFFSET*) + *SCALE\_N*

\n*Equation 3-18*

\n*Equation 3-19*

\n*BOFFSET* = ROUNDING\_CONST – (*DCOFF SET*) + *COFFSET* + *YOFFSET*) + *SCALE\_M*

\n*Equation 3-19*

\n*SCALE\_M* =  $2^{MWDITH - lWIDTH - 17}$ 

\n*Equation 3-20*

#### **ITU 601 (SD) and 709 - 1125/60 (NTSC) Standard Conversion Coefficients**

<span id="page-31-0"></span>



#### **Standard ITU 709 (HD) 1250/50 (PAL)**

<span id="page-31-1"></span>





#### *Table 3-3:* **Parameterization Values for the PAL HD (ITU 709) Standard** *(Cont'd)*

#### **YUV Standard**

<span id="page-32-0"></span>



### **Clipping and Clamping**

#### **Output Quantization Noise**

Coefficients CC and CD in [Equation 3-3](#page-29-0) allow standard designers to trade off output quantization and clipping noise. Actual noise inserted depends on the probability statistics of the Cb and Cr variables, but in general if CC and CD are larger than the maximum values calculated in [Equation 3-4](#page-29-2), output values may clip, introducing clipping noise. However, the lower CC and CD values are chosen, the worse Cb and Cr values will use the available dynamic range, thus introducing more quantization noise. Therefore, the designer's task is to equalize output quantization and clipping noise insertion by carefully choosing CC and CD values based on knowing the statistics of Cb and Cr values. For instance, when

probabilities of extreme chrominance values are small, it is beneficial to increase CC and CD values, as the extra noise inserted by occasional clipping is less than the gain in average signal power (and thus SQNR).

#### **Output Clipping Noise**

If coefficients CC and CD in [Equation 3-3](#page-29-0) are larger than the maximum values calculated in [Equation 3-4](#page-29-2), Cr and Cb output values may get larger (overflow) than the maximum or smaller (underflow) than minimum value the output representation can carry. If overflow occurs and the design does not have clipping logic, binary values wrap around and insert substantial noise to the output. If clamping/clipping logic is used, output values saturate and less noise is introduced, as shown in [Figure 3-4](#page-33-1). Use of clipping and clamping increases slice count of the design by approximately 6\*DATA\_WIDTH slices.

If a targeted standard limits output of values to a predefined range other than those of binary representation, such as *ITU-R BT.601-5* [\[Ref 3\],](#page-81-6) use of clipping and clamping logic facilitates constraining output values to the predefined range by setting  $RGB_{\text{max}}$  and RGB<sub>min</sub> values according to the standard specifications.

<span id="page-33-1"></span>

# <span id="page-33-0"></span>**Clock, Enable, and Reset Considerations**

### **ACLK**

The master and slave AXI4-Stream video interfaces use the ACLK clock signal as their shared clock reference, as shown in [Figure 3-5](#page-34-0).



*Figure 3-5:* **Example of ACLK Routing in an ISP Processing Pipeline**

### <span id="page-34-0"></span>**S\_AXI\_ACLK**

The AXI4-Lite interface uses the A\_AXI\_ACLK pin as its clock source. The ACLK pin is not shared between the AXI4-Lite and AXI4-Stream interfaces. The CFA core contains clock-domain crossing logic between the ACLK (AXI4-Stream and Video Processing) and S AXI ACLK (AXI4-Lite) clock domains. The core automatically ensures that the AXI4-Lite transactions will complete even if the video processing is stalled with ARESETn, ACLKEN or with the video clock not running.

### **ACLKEN**

The YCrCb2RGB core has two enable options: the ACLKEN pin (hardware clock enable), and the software reset option provided via the AXI4-Lite control interface (when present).

ACLKEN is by no means synchronized internally to AXI4-Stream frame processing therefore de-asserting ACLKEN for extended periods of time may lead to image tearing.

The ACLKEN pin facilitates:

- Multi-cycle path designs (high speed clock division without clock gating),
- Standby operation of subsystems to save on power
- Hardware controlled bring-up of system components

**Note:** When ACLKEN (clock enable) pins are used (toggled) in conjunction with a common clock source driving the master and slave sides of an AXI4-Stream interface, to prevent transaction errors the ACLKEN pins associated with the master and slave component interfaces must also be driven by the same signal [\(Figure 2-2\)](#page-15-1).

*Note:* When two cores connected via AXI4-Stream interfaces, where only the master or the slave interface has an ACLKEN port, which is not permanently tied high, the two interfaces should be connected via the AXI4-Stream Interconnect or AXI-FIFO cores to avoid data corruption ([Figure 2-3\)](#page-16-1).

### **S\_AXI\_ACLKEN**

The S\_AXI\_ACLKEN is the clock enable signal for the AXI4-Lite interface only. Driving this signal low will only affect the AXI4-Lite interface and will not halt the video processing in the ACLK clock domain.

### **ARESETn**

The CFA core has two reset source: the ARESETn pin (hardware reset), and the software reset option provided via the AXI4-Lite control interface (when present).

*Note:* ARESETn is by no means synchronized internally to AXI4-Stream frame processing, therefore de-asserting ARESETn while a frame is being process will lead to image tearing.

The external reset pulse needs to be held for 32 ACLK cycles to reset the core. The ARESETn signal will only reset the AXI4-Stream interfaces. The AXI4-Lite interface is unaffected by the ARESETn signal to allow the video processing core to be reset without halting the AXI4-Lite interface.

**Note:** When a system with multiple-clocks and corresponding reset signals are being reset, the reset generator has to ensure all reset signals are asserted/de-asserted long enough that all interfaces and clock-domains in all IP cores are correctly reinitialized.

### **S\_AXI\_ARESETn**

The S\_AXI\_ARESETn signal is synchronous to the S\_AXI\_ACLK clock domain, but is internally synchronized to the ACLK clock domain. The S\_AXI\_ARESETn signal will reset the entire core including the AXI4-Lite and AXI4-Stream interfaces.

# <span id="page-35-0"></span>**System Considerations**

When using the YCrCb2RGB, it needs to be configured for the actual image sensor frame size the proper color-space conversion, to operate properly. To gather the frame size information from the video, it can be connected to the Video In to AXI-Stream input and the Video Timing Controller. The timing detector logic in the Video Timing Controller will gather the video timing signals. The AXI4-Lite control interface on the Video Timing Controller allows the system processor to read out the measured frame dimensions, and program all downstream cores, such as the YCrCb2RGB, with the appropriate image dimensions.

If the target system uses only one unchanging setup, the user may choose to consolidate the register values, and create a constant configuration by removing the AXI4-Lite interface. This option allows reducing the core Slice footprint.
#### **Clock Domain Interaction**

The ARESETn and ACLKEN input signals will not reset or halt the AXI4-Lite interface. This allows the video processing to be reset or halted separately from the AXI4-Lite interface without disrupting AXI4-Lite transactions.

The AXI4-Lite interface will respond with an error if the core registers cannot be read or written within 128 S\_AXI\_ACLK clock cycles. The core registers cannot be read or written if the ARESETn signal is held low, if the ACLKEN signal is held low or if the ACLK signal is not connected or not running. If core register read does not complete, the AXI4-Lite read transaction will respond with **10** on the S\_AXI\_RRESP bus. Similarly, if a core register write does not complete, the AXI4-Lite write transaction will respond with **10** on the S\_AXI\_BRESP bus. The S\_AXI\_ARESETn input signal resets the entire core.

#### **Programming Sequence**

If processing parameters such as the image size needs to be changed on the fly, or the system needs to be reinitialized, it is recommended that pipelined Xilinx IP video cores are disabled/reset from system output towards the system input, and programmed/enabled from system input to system output. STATUS register bits allow system processors to identify the processing states of individual constituent cores, and successively disable a pipeline as one core after another is finished processing the last frame of data.

#### **Error Propagation and Recovery**

Parameterization and/or configuration registers define the dimensions of video frames video IP should process. Starting from a known state, based on these configuration settings the IP can predict when the beginning of the next frame is expected. Similarly, the IP can predict when the last pixel of each scan line is expected. SOF detected before it was expected (early), or SOF not present when it is expected (late), EOL detected before expected (early), or EOL not present when expected (late), signals error conditions indicative of either upstream communication errors or incorrect core configuration.

When SOF is detected early, the output SOF signal is generated early, terminating the previous frame immediately. When SOF is detected late, the output SOF signal is generated according to the programmed values. Extra lines / pixels from the previous frame are dropped until the input SOF is captured.

Similarly, when EOL is detected early, the output EOL signal is generated early, terminating the previous line immediately. When EOL is detected late, the output EOL signal is generated according to the programmed values. Extra pixels from the previous line are dropped until the input EOL is captured.



# <span id="page-37-0"></span>C Model Reference

#### **Installation and Directory Structure**

This chapter contains information for installing the YCrCb to RGB Color-Space Converter C-Model, and describes the file contents and directory structure.

#### **Software Requirements**

The YCrCb to RGB Color-Space Converter v6.00.a C-models were compiled and tested with the following software versions.

*Table 4-1:* **Supported Systems and Software Requirements**

| <b>Platform</b>           | C-Compiler                                                          |
|---------------------------|---------------------------------------------------------------------|
| Linux 32-bit and 64-bit   | GCC 4.1.1                                                           |
| Windows 32-bit and 64-bit | Microsoft Visual Studio 2005, Visual Studio 2008 (Visual $C++8.0$ ) |

#### **Installation**

The installation of the C-Model requires updates to the PATH variable, as described below.

#### **Linux**

Ensure that the directory in which the

libIp\_v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.so and libstlport.so.5.1 files are located is in your \$LD\_LIBRARY\_PATH environment variable.

#### **C-Model File Contents**

Unzipping the v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_model.zip file creates the following directory structures and files which are described in[Table 4-2.](#page-38-0)

<span id="page-38-0"></span>*Table 4-2:* **C-Model Files**

| File                                            | <b>Description</b>                                                                                  |
|-------------------------------------------------|-----------------------------------------------------------------------------------------------------|
| /lin                                            | Pre-compiled bit accurate ANSI C reference model for simulation on<br>32-bit Linux Platforms        |
| libIp_v_ycrcb2rgb_v6_00_a_bitacc_cmo<br>del.lib | YCrCb to RGB Color-Space Converter v6.00.a model shared object<br>library (Linux platforms only)    |
| libstlport.so.5.1                               | STL library, referenced by the YCrCb to RGB Color-Space Converter<br>library (Linux platforms only) |
| run_bitacc_cmodel                               | Pre-compiled bit accurate executable for simulation on 32-bit Linux<br>Platforms                    |
| $/$ lin64                                       | Pre-compiled bit accurate ANSI C reference model for simulation on<br>64-bit Linux Platforms        |
| libIp_v_ycrcb2rgb_v6_00_a_bitacc_cmo<br>del.lib | YCrCb to RGB Color-Space Converter v6.00.a model shared object<br>library (Linux platforms only)    |
| libstlport.so.5.1                               | STL library, referenced by the YCrCb to RGB Color-Space Converter<br>library (Linux platforms only) |
| run_bitacc_cmodel                               | Pre-compiled bit accurate executable for simulation on 32-bit Linux<br>Platforms                    |
| /nt                                             | Pre-compiled bit accurate ANSI C reference model for simulation on<br>32-bit Windows Platforms      |
| libIp_v_ycrcb2rgb_v6_00_a_bitacc_cmo<br>del.lib | Pre-compiled library file for win32 compilation                                                     |
| run_bitacc_cmodel.exe                           | Pre-compiled bit accurate executable for simulation on 32-bit Windows<br>Platforms                  |
| /nt64                                           | Pre-compiled bit accurate ANSI C reference model for simulation on<br>64-bit Windows Platforms      |
| libIp_v_ycrcb2rgb_v6_00_a_bitacc_cmo<br>del.lib | Pre-compiled library file for win32 compilation                                                     |
| run_bitacc_cmodel.exe                           | Pre-compiled bit accurate executable for simulation on 64-bit Windows<br>Platforms                  |
| README.txt                                      | Release notes                                                                                       |
| pg014_v_ycrcb2rgb.pdf                           | YCrCb to RGB Color-Space Converter Core Product Guide                                               |
| v_ycrcb2rgb_v6_00_a_bitacc_cmodel.h             | Model header file                                                                                   |
| rgb_utils.h                                     | Header file declaring the RGB image / video container type and support<br>functions                 |
| bmp_utils.h                                     | Header file declaring the bitmap (.bmp) image file I/O functions                                    |

*Table 4-2:* **C-Model Files** *(Cont'd)*

| <b>File</b>          | <b>Description</b>                                                                                |
|----------------------|---------------------------------------------------------------------------------------------------|
| video utils.h        | Header file declaring the generalized image / video container type, I/O<br>and support functions. |
| Test_stimuli.bmp     | 32x32 sample test image                                                                           |
| run_bittacc_cmodel.c | Example code calling the C-Model                                                                  |

# **Using the C-Model**

The bit accurate C model is accessed through a set of functions and data structures that are declared in the v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.h file. Before using the model, the structures holding the inputs, generics and output of the YCrCb to RGB Color-Space Converter instance must be defined:

**struct** xilinx\_ip\_v\_ycrcb2rgb\_v6\_00\_a\_generics generics; **struct** xilinx\_ip\_v\_ycrcb2rgb\_v6\_00\_a\_inputs inputs; **struct** xilinx\_ip\_v\_ycrcb2rgb\_v6\_00\_a\_outputs outputs;

The declaration of these structures is in the v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.h file. [Table 4-3](#page-39-0) lists the generic parameters taken by the YCrCb to RGB Color-Space Converter v4.0 IP core bit accurate model, as well as the default values.

| <b>Generic Variable</b> | <b>Type</b>            | <b>Default</b><br>Value | Range            | <b>Description</b>                                        |
|-------------------------|------------------------|-------------------------|------------------|-----------------------------------------------------------|
| ACTIVE_COLS             | int                    | 1920                    | 0-7680           | Active # of Columns                                       |
| ACTIVE_ROWS             | int                    | 1080                    | $0 - 7680$       | Active # of Rows                                          |
| COEF IN                 | ycrcb_coef_in<br>puts  |                         |                  | YCC coefficient input structure                           |
| COEF OUT                | ycrcb_coef_o<br>utputs |                         |                  | YCC coefficient output structure                          |
| RGBMAX                  | int                    | 240                     | $0 - 2$ OWIDTH-1 | Clipping value for the R, G and B<br>Channels             |
| <b>RGBMIN</b>           | int                    | 16                      | $0 - 2$ OWIDTH-1 | Clipping value for the R, G and B<br>Channels             |
| HAS CLIP                | int                    | 1                       | 0,1              | Determines if Clipping is performed on<br>the output data |
| HAS_CLAMP               | int                    | 1                       | 0,1              | Determines if Clamping is performed on<br>the output data |

<span id="page-39-0"></span>*Table 4-3:* **Core Generic Parameters and Default Values** 

| STANDARD SEL | int | 0,1,2,3 | <b>Standard Selection</b><br>$0 = SD$ ITU 601 1 =<br>HD_ITU_709_1125_NTSC 2 =<br>HD_ITU_709__1250_PAL 3 = YUV        |
|--------------|-----|---------|----------------------------------------------------------------------------------------------------------------------|
| INPUT RANGE  | int | 0.1.2   | Input Range $0 = 16$ _to_240_for_TV, $1 =$<br>$16_to_235_for_Studio_Equipment 2 =$<br>0_to_255_for_Computer_Graphics |

*Table 4-3:* **Core Generic Parameters and Default Values** *(Cont'd)*



| Variable        | <b>Type</b> | <b>Default Value</b> | Range                       | <b>Description</b>  |
|-----------------|-------------|----------------------|-----------------------------|---------------------|
| <b>IWIDTH</b>   | int         | 8                    | 8,10,12,16                  | Data Width          |
| <b>ACOEF</b>    | double      | 0.299                | $0.0 - 1.0$                 | CA GUI coefficient  |
| <b>BCOEF</b>    | double      | 0.114                | $0.0 - 1.0$                 | CB GUI coefficient  |
| <b>CCOEF</b>    | double      | 0.713                | $0.0 - 0.9$                 | CC GUI coefficient  |
| <b>DCOEF</b>    | double      | 0.564                | $0.0 - 0.9$                 | CD GUI coefficient  |
| <b>YOFFSET</b>  | int         | 16                   | $0 - 2^{\text{IWIDTH}} - 1$ | <b>GUI YOFFSET</b>  |
| <b>CBOFFSET</b> | int         | 128                  | $0 - 2^{\text{IWIDTH}} - 1$ | <b>GUI CBOFFSET</b> |
| <b>CROFFSET</b> | int         | 128                  | $0 - 2^{\text{IWIDTH}} - 1$ | <b>GUI CROFFSET</b> |

*Table 4-5:* **ycrcb\_coef\_output Structure**



Calling xilinx\_ip\_v\_ycrcb2rgb\_v6\_00\_a\_get\_default\_generics(&generics) initializes the generics structure with the default value.

The inputs structure defines the actual input image. For the description of the input video structure, see [Input and Output Video Structures.](#page-41-0)

Calling xilinx\_ip\_v\_ycrcb2rgb\_v6\_00\_a\_get\_default\_inputs(&generics, &inputs) initializes the input video structure before it can be assigned an image or video sequence using the memory allocation or file I/O functions provided in the BMP, RGB or video utility functions.

*Note:* The video\_in variable is not initialized to point to a valid image / video container, as the container size depends on the actual test image to be simulated. The initialization of the video\_in structure is described in [Initializing the Input Video Structure](#page-42-0).

After the inputs are defined, the model can be simulated by calling this function:

```
int xilinx_ip_v_ycrcb2rgb_v6_00_a_bitacc_simulate(
struct xilinx_ip_v_ycrcb2rgb_v6_00_a_generics* generics,
struct xilinx_ip_v_ycrcb2rgb_v6_00_a_inputs* inputs,
struct xilinx_ip_v_ycrcb2rgb_v6_00_a_outputs* outputs).
```
Results are included in the outputs structure, which contains only one member, type video struct. After the outputs are evaluated and saved, dynamically allocated memory for input and output video structures must be released by calling this function:

```
void xilinx_ip_v_ycrcb2rgb_v6_00_a_destroy(
struct xilinx_ip_v_ycrcb2rgb_v6_00_a_inputs *input, 
struct xilinx ip v ycrcb2rgb v6_00_a_outputs *output).
```
Successful execution of all provided functions, except for the destroy function, return value 0. A non-zero error code indicates that problems occurred during function calls.

#### <span id="page-41-0"></span>**Input and Output Video Structures**

Input images or video streams can be provided to the YCrCb to RGB Color-Space Converter v4.0 reference model using the video\_struct structure, defined in video\_utils.h:

```
struct video_struct{ 
 int frames, rows, cols, bits_per_component, mode;
  uint16*** data[5]; };
```

| <b>Member Variable</b> | <b>Designation</b>                                                                                                                                                                                                                                                                                        |
|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| frames                 | Number of video/image frames in the data structure.                                                                                                                                                                                                                                                       |
| rows                   | Number of rows per frame. Pertaining to the image plane with the most rows<br>and columns, such as the luminance channel for YUV data. Frame dimensions<br>are assumed constant through all frames of the video stream. However<br>different planes, such as y, u and v can have different dimensions.    |
| cols                   | Number of columns per frame. Pertaining to the image plane with the most<br>rows and columns, such as the luminance channel for YUV data. Frame<br>dimensions are assumed constant through all frames of the video stream.<br>However different planes, such as y, u and v can have different dimensions. |
| bits_per_component     | Number of bits per color channel/component.All image planes are assumed<br>to have the same color/component representation. Maximum number of bits<br>per component is 16.                                                                                                                                |

*Table 4-6:* **Member Variables of the Video Structure**

| mode | Contains information about the designation of data planes. Named<br>constants to be assigned to mode are listed in Table 4-7.                                                |
|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| data | Set of five pointers to three dimensional arrays containing data for image<br>planes. Data is in 16-bit unsigned integer format accessed as<br>data[plane][frame][row][col]. |

*Table 4-6:* **Member Variables of the Video Structure** *(Cont'd)*

<span id="page-42-1"></span>



<sup>1</sup> The Color Space Conversion core C model supports FORMAT\_C444 for input data and FORMAT\_RGB for output data.

## <span id="page-42-0"></span>**Initializing the Input Video Structure**

The easiest way to assign stimuli values to the input video structure is to initialize it with an image or video. The yuv\_utils.h, bmp\_util.h and video\_util.h header files packaged with the bit accurate C models contain functions to facilitate file I/O.

#### **Bitmap Image Files**

The header bmp\_utils.h declares functions that help access files in Windows Bitmap format [\(http://en.wikipedia.org/wiki/BMP\\_file\\_format](http://en.wikipedia.org/wiki/BMP_file_format)). However, this format limits color depth to a maximum of 8-bits per pixel, and operates on images with three planes (R,G,B). Consequently, the following functions operate on arguments type rgb8\_video\_struct, which is defined in rgb\_utils.h. Also, both functions support only true-color, non-indexed formats with 24-bits per pixel.

**int** write\_bmp(FILE \*outfile, **struct** rgb8\_video\_struct \*rgb8\_video); **int** read\_bmp(FILE \*infile, **struct** rgb8\_video\_struct \*rgb8\_video);

Exchanging data between rgb8\_video\_struct and general video\_struct type frames/videos is facilitated by these functions:

```
int copy_rgb8_to_video(struct rgb8_video_struct* rgb8_in, 
                       struct video_struct* video_out ); 
int copy_video_to_rgb8(struct video_struct* video_in, 
                       struct rgb8_video_struct* rgb8_out );
```
*Note:* All image/video manipulation utility functions expect both input and output structures initialized; for example, pointing to a structure that has been allocated in memory, either as static or dynamic variables. Moreover, the input structure must have the dynamically allocated container (data or r, g, b) structures already allocated and initialized with the input frame(s). If the output container structure is pre-allocated at the time of the function call, the utility functions verify and issue an error if the output container size does not match the size of the expected output. If the output container structure is not pre-allocated, the utility functions create the appropriate container to hold results.

#### **Binary Image/Video Files**

The video\_utils.h header file declares functions that help load and save generalized video files in raw, uncompressed format.

```
int read_video( FILE* infile, struct video_struct* in_video);
int write_video(FILE* outfile, struct video_struct* out_video);
```
These functions serialize the video\_struct structure. The corresponding file contains a small, plain text header defining, "Mode", "Frames", "Rows", "Columns", and "Bits per Pixel". The plain text header is followed by binary data, 16-bits per component in scan line continuous format. Subsequent frames contain as many component planes as defined by the video mode value selected. Also, the size (rows, columns) of component planes can differ within each frame as defined by the actual video mode selected.

#### **YUV Image Files**

The yuv\_utils.h file declares functions that help access files in standard YUV format. It operates on images with three planes (Y, U and V). The following functions operate on arguments of type yuv8\_video\_struct, which is defined in yuv\_utils.h:

```
int write_yuv8(FILE *outfile, struct yuv8_video_struct *yuv8_video);
int read_yuv8(FILE *infile, struct yuv8_video_struct *yuv8_video);
```
Exchanging data between yuv8\_video\_struct and general video\_struct type frames/videos is facilitated by these functions:

```
int copy_yuv8_to_video(struct yuv8_video_struct* yuv8_in, 
                       struct video_struct* video_out ); 
int copy_video_to_yuv8(struct video_struct* video_in, 
                       struct yuv8_video_struct* yuv8_out );
```
#### **Working with Video\_struct Containers**

The video utils.h header file defines functions to simplify access to video data in video\_struct.

```
int video_planes_per_mode(int mode);
int video_rows_per_plane(struct video_struct* video, int plane);
int video_cols_per_plane(struct video_struct* video, int plane);
```
The video\_planes\_per\_mode function returns the number of component planes defined by the mode variable, as described in [Table 4-7.](#page-42-1) The video\_rows\_per\_plane and video cols per plane functions return the number of rows and columns in a given plane of the selected video structure. The following example demonstrates using these functions in conjunction to process all pixels within a video stream stored in the  $\text{in\_video}$ variable:

```
for (int frame = 0; frame < in_video->frames; frame++) {
   for (int plane = 0; plane < video_planes_per_mode(in_video->mode); plane++) {
    for (int row = 0; row < rows_per_plane(in_video,plane); row++) {
     for (\text{int} col = 0; col < cols_per_plane(in_video,plane); col++) {
   // User defined pixel operations on 
// in_video->data[plane][frame][row][col]
       }
     }
  }
}
```
#### **C-Model Example Code**

The example C demonstrator provided with the core, run\_bitacc\_cmodel.c demonstrates the steps required to run the C-model, by:

- Opening an example yuv file
- Increasing the color-component width to 10, 12, or 16 bits as necessary by shifting 8 bit data derived from the yuv file and padding the LSBs with X-Y ramp bits.
- Running the YCrCb2RGB C-model

After following the compilation instructions, run the example executable. The executable takes the path/name of the input file and the path of the output as parameters. If invoked with insufficient parameters, the following help message is printed:

Usage: run\_bitacc\_cmodel in\_file out\_path in\_file : path/name of the input YUV file out\_path : path to the output files

During successful execution, two directories will be created at the location specified by the out\_path command line parameter. The first directory is the "expected" directory. This directory will contain a BMP file that corresponds to the output of the first frame that was processed. This directory will also contain a txt file called golden\_1.txt. This txt file contains the output of the model in a format that can be directly used with the demonstration test

bench. The second directory that is created is the "stimuli" directory. This directory will contain a txt file called stimuli\_1.txt. This txt file contains the input of the model in a format that can be directly used with the demonstration test bench.

### **Compiling with the YCrCb to RGB C-Model**

#### **Linux (32- and 64-bit)**

To compile the example code, first ensure that the directory in which the files libIp\_v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.so and libstlport.so.5.1 are located is present in your \$LD\_LIBRARY\_PATH environment variable. These shared libraries are referenced during the compilation and linking process. Then cd into the directory where the header files, library files and run\_bitacc\_cmodel.c were unpacked. The libraries and header files are referenced during the compilation and linking process.

Place the header file and C source file in a single directory. Then in that directory, compile using the GNU C Compiler:

```
gcc -m32 -x c++ ../run_bitacc_cmodel.c ../gen_stim.c -o run_bitacc_cmodel -L. 
-lIp_v_ycrcb2rgb_v6_00_a_bitacc_cmodel -Wl,-rpath,.
gcc -m64 -x c++ ../run_bitacc_cmodel.c ../gen_stim.c -o run_bitacc_cmodel -L. 
-lIp_v_ycrcb2rgb_v6_00_a_bitacc_cmodel -Wl,-rpath,.
```
#### **Windows (32- and 64-bit)**

Precompiled library v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.dll, and top level demonstration code run\_bitacc\_cmodel.c should be compiled with an ANSI C compliant compiler under Windows. Here an example is presented using Microsoft Visual Studio.

In Visual Studio create a new, empty Windows Console Application project. As existing items, add:

- The libIP\_v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.dll file to the "Resource Files" folder of the project
- The run\_bitacc\_cmodel.c and gen\_stim.c files to the "Source Files" folder of the project
- The v\_ycrcb2rgb\_v6\_00\_a\_bitacc\_cmodel.h header files to "Header Files" folder of the project (optional)

After the project has been created and populated, it needs to be compiled and linked (built) to create a win32 executable. To perform the build step, choose **Build Solution** from the Build menu. An executable matching the project name has been created either in the Debug or Release subdirectories under the project location based on whether **Debug** or **Release** has been selected in the **Configuration Manage**r under the Build menu.



# SECTION II: VIVADO DESIGN SUITE

[Customizing and Generating the Core](#page-48-0) [Constraining the Core](#page-53-0)



# <span id="page-48-0"></span>Customizing and Generating the Core

This chapter includes information on using Xilinx tools to customize and generate the core using Vivado tools.

### **Graphical User Interface**

The main screen of the Graphical User Interface (GUI) of Vivado IP Catalog shown in [Figure 5-1](#page-49-0) allows quick implementation of standard YCrCb to RGB or YUV to RGB converters without having to manually enter values from [Table 3-2](#page-31-0), [Table 3-3](#page-31-1) and [Table 3-4.](#page-32-0) The Color-Space Converter core also supports proprietary (non-standard) converter implementations, by selecting "custom" from the Standard Selection drop-down menu, as long as the custom conversion matrix can be transformed to the form of [Equation 3-3.](#page-29-0) Descriptions of the options provided in the GUI screens are included in this section.

<span id="page-49-0"></span>

*Figure 5-1:* **Color Space Converter Main Screen**

The first page of the GUI displays the following options:

- **Component Name**: The component name is used as the base name of output files generated for the module. Names must begin with a letter and must be composed from characters: a to z, 0 to 9 and "\_". The name v\_ycrcb2rgb\_v6\_00\_a cannot be used as a component name.
- **Data Width**: Specifies the bit width of input samples. Permitted values are 8, 10, 12 and 16 bits.
- **Number of Active Pixels per Scan line**: When the AXI4-Lite control interface is enabled, the generated core will use the value specified in the CORE Generator GUI as the default value for the lower half-word of the ACTIVE\_SIZE register. When an AXI4-Lite interface is not present, the GUI selection permanently defines the horizontal size of the frames the generated core instance is to process.
- **Number of Active Lines per Frame**: When the AXI4-Lite control interface is enabled, the generated core will use the value specified in the CORE Generator GUI as the default value for the upper half-word of the ACTIVE\_SIZE register. When an AXI4-Lite

interface is not present, the GUI selection permanently defines the vertical size (number of lines) of the frames the generated core instance is to process.

- **Optional Features**:
	- ° **AXI4-Lite Register Interface**: When selected, the core will be generated with an AXI4-Lite interface, which gives access to dynamically program and change processing parameters. For more information, refer to Control Interface in Chapter 2.
	- ° **Include Debugging Features**: When selected, the core will be generated with debugging features, which simplify system design, testing and debugging. For more information, refer to Debugging Features in Appendix C.

*Note:* Debugging features are only available when the AXI4-Lite Register Interface is selected.

- **Enable INTC Port**: When selected, the core will generate the optional INTC\_IF port, which gives parallel access to signals indicating frame processing status and error conditions. For more information, refer to The Interrupt Subsystem in Chapter 2.
- Converter Type
	- ° **Standard Selection**: Select the standard to be implemented. The offered standards are:
		- YCrCb *ITU 601* (SD)
		- YCrCb *ITU 709* (HD) 1125/60 (PAL)
		- YCrCb *ITU 709* (HD) 1250/50 (NTSC)
		- YUV
		- custom

Selecting "custom" enables the controls on page 2 of the GUI, so conversion settings can be customized. Otherwise, page 2 only displays the parameters to implement the selected standard.

- ° **Output Range Selection**: This selection governs the range of outputs R, G and B by affecting the conversion coefficients as well as the clipping and clamping values. The core supports typical output ranges:
	- 16 to 235, typical for studio equipment
	- 16 to 240, typical for broadcast or television
	- 0 to 255, typical for computer graphics

The previously-mentioned ranges are characteristic for 8-bit outputs. If 10-, 12- or 16-bit outputs are used, the ranges are extended proportionally. For example, 16 to 240 mode for 10-bit outputs will result in output values ranging from 64 to 960.

The Conversion Matrix, Offset Compensation, Clipping and Clamping screen displays and enables editing of conversion coefficients, similar to [Equation 3-3.](#page-29-0) Contents are editable only when "custom" is selected as the standard on page 1 [\(Figure 5-2](#page-51-0)).

<span id="page-51-0"></span>

*Figure 5-2:* **Conversion Matrix, Offset Compensation, Clipping and Clamping Screen**

- **Conversion Matrix**: Enter floating-point conversion constants, ranging from 0 to 1, into the four fields representing CA, CB, CC and CD.
- **Offset Compensation**: Enter the offset compensation constants ( $O<sub>Y</sub>$ ,  $O<sub>Ch</sub>$ , and  $O<sub>Cr</sub>$  in [Equation 3-17](#page-31-2), [Equation 3-18](#page-31-3) and [Equation 3-19](#page-31-4)). These constants are scaled to the output representation. If  $O_{Y}$ ,  $O_{Cb}$ , and  $O_{Cr}$  are in the 0.0 - 1.0 range, and the output is represented as 10-bit unsigned integers, then luminance and chrominance offsets should be entered as integers in the 0 - 1023 range.
- **Outputs Clipped/Outputs Clamped**: These check boxes control whether clipping/ clamping logic will be instantiated in the generated netlist. The clipping/clamping logic ensures no arithmetic wrap-arounds happen at overflows, at the expense of extra slice-based logic resources.
- **Minimum and Maximum Values**: Similar to offset values, the edit boxes take unsigned integer values in the range permitted by the current output representation.

# **Output Generation**

Vivado generates the files necessary to build the core and places those files in the <project>/<project>.srcs/sources\_1/ip/<core> directory.

#### **File Details**

The Vivado output consists of some or all the following files:

*Table 5-1:*





# <span id="page-53-0"></span>Constraining the Core

#### **Required Constraints**

The ACLK pin should be constrained at the pixel clock rate desired for your video stream.

#### **Device, Package, and Speed Grade Selections**

There are no device, package, or speed grade requirements for this core. For a complete listing of supported devices, see the release notes for this core.

### **Clock Frequencies**

The pixel clock (ACLK) frequency is the required frequency for the Color Filter Array Interpolation core. See [Maximum Frequencies in Chapter 2.](#page-9-0) The S\_AXI\_ACLK maximum frequency is the same as the ACLK maximum.

## **Clock Management**

The core automatically handles clock domain crossing between the ACLK (video pixel clock and AXI4-Stream) and the S\_AXI\_ACLK (AXI4-Lite) clock domains. The S\_AXI\_ACLK clock can be slower or faster than the ACLK clock signal, but must not be more than 128x faster than ACLK.

# **Clock Placement**

There are no specific Clock placement requirements for this core.

# **Banking**

There are no specific Banking rules for this core.

# **Transceiver Placement**

There are no Transceiver Placement requirements for this core.

# **I/O Standard and Placement**

There are no specific I/O standards and placement requirements for this core.



# SECTION III: ISE DESIGN SUITE

[Customizing and Generating the Core](#page-56-0) [Constraining the Core](#page-62-0) [Detailed Example Design](#page-64-0)



# <span id="page-56-0"></span>Customizing and Generating the Core

This chapter includes information on using Xilinx tools to customize and generate the core.

#### **Graphical User Interface**

The main screen of the Graphical User Interface (GUI) of CORE Generator, shown in [Figure 7-1](#page-56-1) (and EDK shown in [Figure 7-3\)](#page-59-0), allows quick implementation of standard YCrCb to RGB or YUV to RGB converters without having to manually enter values from [Table 3-2](#page-31-0), [Table 3-3](#page-31-1) and [Table 3-4](#page-32-0). The Color-Space Converter core also supports proprietary (non-standard) converter implementations, by selecting "custom" from the Standard Selection drop-down menu, as long as the custom conversion matrix can be transformed to the form of [Equation 3-3](#page-29-0). The front pages of the RGB to YCrCb and the YCrCb to RGB GUIs are very similar, allowing easy generation of complementing converter pairs.

<span id="page-56-1"></span>

*Figure 7-1:* **Color Space Converter Main Screen**

The first page of the GUI displays the following options:

- **Component Name**: The component name is used as the base name of output files generated for the module. Names must begin with a letter and must be composed from characters: a to z, 0 to 9 and "\_". The name v\_ycrcb2rgb\_v6\_00\_a cannot be used as a component name.
- **Data Width**: Specifies the bit width of input samples. Permitted values are 8, 10, 12 and 16 bits.
- **Number of Active Pixels per Scan line**: When the AXI4-Lite control interface is enabled, the generated core will use the value specified in the CORE Generator GUI as the default value for the lower half-word of the ACTIVE\_SIZE register. When an AXI4-Lite interface is not present, the GUI selection permanently defines the horizontal size of the frames the generated core instance is to process.
- **Number of Active Lines per Frame**: When the AXI4-Lite control interface is enabled, the generated core will use the value specified in the CORE Generator GUI as the default value for the upper half-word of the ACTIVE SIZE register. When an AXI4-Lite interface is not present, the GUI selection permanently defines the vertical size (number of lines) of the frames the generated core instance is to process.
- **Optional Features**:
	- ° **AXI4-Lite Register Interface**: When selected, the core will be generated with an AXI4-Lite interface, which gives access to dynamically program and change processing parameters. For more information, refer to Control Interface in Chapter 2.
	- ° **Include Debugging Features**: When selected, the core will be generated with debugging features, which simplify system design, testing and debugging. For more information, refer to Debugging Features in Appendix C.

*Note:* Debugging features are only available when the AXI4-Lite Register Interface is selected.

- **Enable INTC Port**: When selected, the core will generate the optional INTC\_IF port, which gives parallel access to signals indicating frame processing status and error conditions. For more information, refer to The Interrupt Subsystem in Chapter 2.
- **Converter Type**:
	- ° **Standard Selection**: Select the standard to be implemented. The offered standards are:
		- YCrCb *ITU 601* (SD)
		- YCrCb *ITU 709* (HD) 1125/60 (PAL)
		- YCrCb *ITU 709* (HD) 1250/50 (NTSC)
		- YUV
		- custom

Selecting "custom" enables the controls on page 2 of the GUI, so conversion settings can be customized. Otherwise, page 2 only displays the parameters to implement the selected standard.

- ° **Output Range Selection**: This selection governs the range of outputs R, G and B by affecting the conversion coefficients as well as the clipping and clamping values. The core supports typical output ranges:
	- 16 to 235, typical for studio equipment
	- 16 to 240, typical for broadcast or television
	- 0 to 255, typical for computer graphics

The previously-mentioned ranges are characteristic for 8-bit outputs. If 10-, 12- or 16-bit outputs are used, the ranges are extended proportionally. For example, 16 to 240 mode for 10-bit outputs will result in output values ranging from 64 to 960.

The Conversion Matrix, Offset Compensation, Clipping and Clamping screen ([Figure 7-2](#page-58-0)), displays and enables editing of conversion coefficients, similar to [Equation 3-3](#page-29-0). Contents are editable only when "custom" is selected as the standard on page 1 ([Figure 7-1\)](#page-56-1).

<span id="page-58-0"></span>

*Figure 7-2:* **Conversion Matrix, Offset Compensation, Clipping and Clamping Screen**

• **Conversion Matrix**: Enter floating-point conversion constants, ranging from 0 to 1, into the four fields representing CA, CB, CC and CD.

- **Offset Compensation**: Enter the offset compensation constants ( $O_Y$ ,  $O_{Cb}$ , and  $O_{Cr}$  in [Equation 3-17](#page-31-2), [Equation 3-18](#page-31-3) and [Equation 3-19](#page-31-4)). These constants are scaled to the output representation. If  $O_{Y}$ ,  $O_{Cb}$ , and  $O_{Cr}$  are in the 0.0 - 1.0 range, and the output is represented as 10-bit unsigned integers, then luminance and chrominance offsets should be entered as integers in the 0 - 1023 range.
- **Outputs Clipped/Outputs Clamped**: These check boxes control whether clipping/ clamping logic will be instantiated in the generated netlist. The clipping/clamping logic ensures no arithmetic wrap-arounds happen at overflows, at the expense of extra slice-based logic resources.
- **Minimum and Maximum Values**: Similar to offset values, the edit boxes take unsigned integer values in the range permitted by the current output representation.

<span id="page-59-0"></span>

*Figure 7-3:* **YCrCb to RGB Color-Space Converter, EDK GUI Screen**

Definitions of the EDK GUI controls are identical to the corresponding CORE Generator GUI functions.

### **Parameter Values in the XCO File**

[Table 7-1](#page-60-0) defines valid entries for the XCO parameters. Xilinx strongly suggests that XCO parameters are not manually edited in the XCO file; instead, use the CORE Generator software GUI to configure the core and perform range and parameter value checking. The XCO parameters are helpful in defining the interface to other Xilinx tools.



#### <span id="page-60-0"></span>*Table 7-1:* **XCO Parameters**

## **Output Generation**

CORE Generator will output the core as a netlist that can be inserted into a processor interface wrapper or instantiated directly in an HDL design. The output is placed in the <project director>.

#### **File Details**

The CORE Generator output consists of some or all the following files.





# <span id="page-62-0"></span>Constraining the Core

#### **Required Constraints**

The ACLK pin should be constrained at the pixel clock rate desired for your video stream.

#### **Device, Package, and Speed Grade Selections**

There are no device, package, or speed grade requirements for this core. For a complete listing of supported devices, see the release notes for this core.

### **Clock Frequencies**

The pixel clock (ACLK) frequency is the required frequency for the Color Filter Array Interpolation core. See [Maximum Frequencies in Chapter 2.](#page-9-0) The S\_AXI\_ACLK maximum frequency is the same as the ACLK maximum.

### **Clock Management**

The core automatically handles clock domain crossing between the ACLK (video pixel clock and AXI4-Stream) and the S\_AXI\_ACLK (AXI4-Lite) clock domains. The S\_AXI\_ACLK clock can be slower or faster than the ACLK clock signal, but must not be more than 128x faster than ACLK.

# **Clock Placement**

There are no specific Clock placement requirements for this core.

# **Banking**

There are no specific Banking rules for this core.

## **Transceiver Placement**

There are no Transceiver Placement requirements for this core.

# **I/O Standard and Placement**

There are no specific I/O standards and placement requirements for this core.



#### *Chapter 9*

# <span id="page-64-0"></span>Detailed Example Design

No example design is available at the time for the LogiCORE IP YCrCb to RGB Color-Space Converter v6.00.a core.

#### **Demonstration Test Bench**

A demonstration test bench is provided which enables core users to observe core behavior in a typical use scenario. The user is encouraged to make simple modifications to the test conditions and observe the changes in the waveform.

#### **Test Bench Structure**

The top-level entity,  $tb\_main.v$ , instantiates the following modules:

• DUT

The YCrCb2RGB v6.00.a core instance under test.

• axi4lite\_mst

The AXI4-Lite master module, which initiates AXI4-Lite transactions to program core registers.

• axi4s video mst

The AXI4-Stream master module, which opens the stimuli txt file and initiates AXI4-Stream transactions to provide stimuli data for the core

• axi4s\_video\_slv

The AXI4-Stream slave module, which opens the result txt file and verifies AXI4-Stream transactions from the core

• ce\_gen

Programmable Clock Enable (ACLKEN) generator

# **Running the Simulation**

- Simulation using ModelSim for Linux: From the console, Type "source run\_mti.sh".
- Simulation using ModelSim for Windows: Double-click on "run\_mti.bat" file.
- Simulation using iSim: Double-click on "run\_isim.bat" file.

### **Directory and File Contents**

The directory structure underneath the top-level folder is:

- expected: Contains the pre-generated expected/golden data used by the testbench to compare actual output data.
- stimuli:

Contains the pre-generated input data used by the testbench to stimulate the core (including register programming values).

- Results: Actual output data will be written to a file in this folder.
- Src:

Contains the .vhd simulation files and the .xco CORE Generator parameterization file of the core instance. The .vhd file is a netlist generated using CORE Generator. The .xco file can be used to regenerate a new netlist using CORE Generator.

The available core C-model can be used to generate stimuli and expected results for any user bmp image. For more information, refer to [Chapter 4, C Model Reference.](#page-37-0)

The top-level directory contains packages and Verilog modules used by the test bench, as well as:

- isim\_wave.wcfg: Waveform configuration for ISIM
- mti\_wave.do: Waveform configuration for ModelSim
- run isim.bat : Runscript for iSim in Windows

#### **EXILINX**

- run\_isim.sh: Runscript for iSim in Linux
- run\_mti.bat: Runscript for ModelSim in Windows
- run\_mti.sh: Runscript for ModelSim in Linux



# SECTION IV: APPENDICES

[Verification, Compliance, and Interoperability](#page-68-0) [Migrating](#page-70-0) [Debugging](#page-71-0) [Application Software Development](#page-77-0) [Additional Resources](#page-81-0)

### *Appendix A*



# <span id="page-68-0"></span>Verification, Compliance, and Interoperability

### **Simulation**

A highly parameterizable test bench was used to test the YCrCb to RGB Color-Space Converter core. Testing included the following:

- Register accesses
- Processing multiple frames of data
- AXI4-Stream bidirectional data-throttling tests
- Testing detection, and recovery from various AXI4-Stream framing error scenarios
- Testing different ACLKEN and ARESETn assertion scenarios
- Testing of various frame sizes
- Varying parameter settings

#### **Hardware Testing**

The YCrCb to RGB Color-Space Converter core has been validated in hardware at Xilinx to represent a variety of parameterizations, including the following:

- A test design was developed for the core that incorporated a MicroBlaze™ processor, AXI4-Lite interconnect and various other peripherals. The software for the test system included pre-generated input and output data along with live video stream. The MicroBlaze processor was responsible for:
	- ° Initializing the appropriate input and output buffers
	- ° Initializing the YCrCb to RGB Color-Space Converter core
	- ° Launching the test
	- Comparing the output of the core against the expected results

° Reporting the Pass/Fail status of the test and any errors that were found

### **Interoperability**

The core slave (input) AXI4-Stream interface can work directly with any Video core which produces YCrCb (or YUV) 4:4:4 video data on an AXI4-Stream interface with a Video Protocol. The core master (output) RGB interface can work directly with any Video core which consumes RGB data on an AXI4-Stream interface with a Video Protocol.



### *Appendix B*

# <span id="page-70-0"></span>Migrating

The core previously had aclk, aclken, and aresetn to control the Video over AXI4-Stream and AXI4-Lite interfaces. Currently, separate clock, clock enable, and reset pins control the Video over AXI4-Stream and the AXI4-Lite interfaces with clock domain crossing logic added to the core to handle the heterogenous clock domains between the AXI4-Lite and Video over AXI4-Stream domains.

For information about migration from ISE Design Suite to Vivado Design Suite, see the Vivado tools documentation at:

<http://www.xilinx.com/cgi-bin/docs/rdoc?v=2012.2;t=vivado+userguides>



### *Appendix C*

# <span id="page-71-0"></span>Debugging

It is recommended to prototype the system with the AXI4-Stream interface enabled, so status and error detection, reset, and dynamic size programming can be used during debugging.

The following steps are recommended to bring-up/debug the core in a video/imaging system:

- 1. Bring up the AXI4-Lite interface
- 2. Bring up the AXI4-Stream interfaces

Once the core is working as expected, the user may consider 'hardening' the configuration by replacing the YCrCb2RGB core with an instance where GUI default values used for the register values, and the AXI4-Lite interface is disabled. This configuration reduces the core slice footprint.

#### **Bringing up the AXI4-Lite Interface**

[Table C-1](#page-71-1) describes how to troubleshoot the AXI4-Lite interface.

<span id="page-71-1"></span>






Assuming the AXI4-Lite interface works, the second step is to bring up the AXI4-Stream interfaces.

# **Bringing up the AXI4-Stream Interfaces**

[Table C-2](#page-72-0) describes how to troubleshoot the AXI4-Lite interface.

| Symptom                                                    | <b>Solution</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
|------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Bit 0 of the ERROR<br>register reads back set.             | Bit 0 of the ERROR register, EOL_EARLY, indicates the number of pixels<br>received between the latest and the preceding End-Of-Line (EOL) signal was<br>less than the value programmed into the ACTIVE_SIZE register. If the<br>value was provided by the Video Timing Controller core, read out<br>ACTIVE_SIZE register value from the VTC core again, and make sure that<br>the TIMING_LOCKED flag is set in the VTC core. Otherwise, using Chipscope,<br>measure the number of active AXI4-Stream transactions between EOL<br>pulses.                      |
| Bit 1 of the ERROR<br>register reads back set.             | Bit 1 of the ERROR register, EOL_LATE, indicates the number of pixels<br>received between the last End-Of-Line (EOL) signal surpassed the value<br>programmed into the ACTIVE_SIZE register. If the value was provided by<br>the Video Timing Controller core, read out ACTIVE_SIZE register value<br>from the VTC core again, and make sure that the TIMING_LOCKED flag is set<br>in the VTC core. Otherwise, using Chipscope, measure the number of active<br>AXI4-Stream transactions between EOL pulses.                                                  |
| Bit 2 or Bit 3 of the<br>ERROR register reads<br>back set. | Bit 2 of the ERROR register, SOF_EARLY, and bit 3 of the ERROR register<br>SOF_LATE indicate the number of pixels received between the latest and the<br>preceding Start-Of-Frame (SOF) differ from the value programmed into the<br>ACTIVE_SIZE register. If the value was provided by the Video Timing<br>Controller core, read out ACTIVE_SIZE register value from the VTC core<br>again, and make sure that the TIMING_LOCKED flag is set in the VTC core.<br>Otherwise, using Chipscope, measure the number EOL pulses between<br>subsequent SOF pulses. |

<span id="page-72-0"></span>*Table C-2:* **Troubleshooting AXI4-Stream Interface**

| Symptom                                                                                                                        | <b>Solution</b>                                                                                                                                                                                                                                                                                                     |  |  |
|--------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|
| s_axis_video_tready<br>stuck low, the upstream<br>core cannot send data.                                                       | During initialization, line-, and frame-flushing, the YCrCb2RGB core keeps its<br>s_axis_video_tready input low. Afterwards, the core should assert<br>s_axis_video_tready automatically.<br>Is m_axis_video_tready low? If so, the YCrCb2RGB core cannot send<br>data downstream, and the internal FIFOs are full. |  |  |
| m_axis_video_tvalid<br>stuck low, the<br>downstream core is not<br>receiving data                                              | No data is generated during the first two lines of processing.<br>1.<br>If the programmed active number of pixels per line is radically smaller than<br>2.<br>the actual line length, the core drops most of the pixels waiting for the<br>(s_axis_video_tlast) End-of-line signal. Check the ERROR register.       |  |  |
| Generated SOF signal<br>(m_axis_video_tuser0)<br>signal misplaced.                                                             | Check the ERROR register.                                                                                                                                                                                                                                                                                           |  |  |
| Generated EOL signal<br>(m_axis_video_tlast<br>) signal misplaced.                                                             | Check the ERROR register.                                                                                                                                                                                                                                                                                           |  |  |
| Data samples lost<br>between Upstream core<br>and the YCrCb2RGB core.<br>Inconsistent EOL and/or<br>SOF periods received.      | Are the Master and Slave AXI4-Stream interfaces in the same clock domain?<br>1.                                                                                                                                                                                                                                     |  |  |
|                                                                                                                                | Is proper clock-domain crossing logic instantiated between the upstream<br>2.<br>core and the YCrCb2RGB core (Asynchronous FIFO)?                                                                                                                                                                                   |  |  |
|                                                                                                                                | 3.<br>Did the design meet timing?                                                                                                                                                                                                                                                                                   |  |  |
|                                                                                                                                | Is the frequency of the clock source driving the YCrCb2RGB ACLK pin lower<br>4.<br>than the reported Fmax reached?                                                                                                                                                                                                  |  |  |
| Data samples lost<br>between Downstream<br>core and the YCrCb2RGB<br>core. Inconsistent EOL<br>and/or SOF periods<br>received. | Are the Master and Slave AXI4-Stream interfaces in the same clock domain?<br>1.                                                                                                                                                                                                                                     |  |  |
|                                                                                                                                | Is proper clock-domain crossing logic instantiated between the upstream<br>2.<br>core and the YCrCb2RGB core (Asynchronous FIFO)?                                                                                                                                                                                   |  |  |
|                                                                                                                                | 3.<br>Did the design meet timing?                                                                                                                                                                                                                                                                                   |  |  |
|                                                                                                                                | Is the frequency of the clock source driving the YCrCb2RGB ACLK pin lower<br>4.<br>than the reported Fmax reached?                                                                                                                                                                                                  |  |  |

*Table C-2:* **Troubleshooting AXI4-Stream Interface**

If the AXI4-Stream communication is healthy, but the data seems corrupted, the next step is to find the correct configuration for the YCrCb2RGB core.

# **Debugging Features**

The YCrCb2RGB core is equipped with optional debugging features which aim to accelerate system bring-up, optimize memory and data-path architecture and reduce time to market. The optional debug features can be turned on/off via the **Include Debug Features** checkbox on the GUI when an AXI4-Lite interface is present. Turning off debug features reduces the core Slice footprint.

#### **Core Bypass Option**

The bypass option facilitates establishing a straight through connection between input (AXI4-Stream slave) and output (AXI4-Stream master) interfaces bypassing any processing functionality.

Flag BYPASS (bit 4 of the CONTROL register) can turn bypass on (1) or off, when the core instance Debugging Features were enabled at generation. Within the IP this switch controls multiplexers in the AXI4-Stream path.

In bypass mode the YCrCb2RGB core processing function is bypassed, and the core repeats AXI4-Stream input samples on its output. In bypass mode, the Y channel is presented on the Green channel, Cr is presented on the Red channel and Cb is presented on the Blue channel.

Starting a system with all processing cores set to bypass, then by turning bypass off from the system input towards the system output allows verification of subsequent cores with known good stimuli.

#### **Built in Test-Pattern Generator**

The optional built-in test-pattern generator facilitates to temporarily feed the output AXI4-Stream master interface with a predefined pattern.

Flag TEST\_PATTERN (bit 5 of the CONTROL register) can turn test-pattern generation on (1) or off, when the core instance Debugging Features were enabled at generation. Within the IP this switch controls multiplexers in the AXI4-Stream path, switching between the regular core processing output and the test-pattern generator. When enabled, a set of counters generate 256 scan-lines of color-bars, each color bar 64 pixels wide, repetitively cycling through Black, Red, Green, Yellow, Blue, Magenta, Cyan, and White colors till the end of each scan-line. After the Color-Bars segment, the rest of the frame is filled with a monochrome horizontal and vertical ramp.

Starting a system with all processing cores set to test-pattern mode, then by turning test-pattern generation off from the system output towards the system input allows successive bring-up and parameterization of subsequent cores.

#### **Throughput Monitors**

Throughput monitors enable the user to monitor processing performance within the core. This information can be used to help debug frame-buffer bandwidth limitation issues, and if possible, allow video application software to balance memory pathways.

Often times video systems, with multiport access to a shared external memory, have different processing islands. For example a pre-processing sub-system working in the input video clock domain may clean up, transform, and write a video stream, or multiple video streams, to memory. The processing sub-system may read the frames out, process, scale, encode, then write frames back to the frame buffer, in a separate processing clock domain. Finally, the output sub-system may format the data and read out frames locked to an external clock.

Typically, access to external memory using a multiport memory controller involves arbitration between competing streams. However, to maximize the throughput of the system, different memory ports may need different specific priorities. To fine tune the arbitration and dynamically balance frame rates, it is beneficial to have access to throughput information measured in different video data paths.

The SYSDEBUG0 (0x0014), or Frame Throughput Monitor, register indicates the number of frames processed since power-up or the last time the core was reset. The SYSDEBUG1 (0x0018), or Line Throughput Monitor, register indicates the number of lines processed since power-up or the last time the core was reset. The SYSDEBUG2 (0x001C), or Pixel Throughput Monitor, register indicates the number of pixels processed since power-up or the last time the core was reset.

Priorities of memory access points can be modified by the application software dynamically to equalize frame, or partial frame rates.

# **Interfacing to Third-Party IP**

[Table C-3](#page-76-0) describes how to troubleshoot third-party interfaces.

| Symptom                                                                                                                  | <b>Solution</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |  |
|--------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Severe color distortion or<br>color-swap when interfacing to<br>third-party video IP.                                    | Verify that the color component logical addressing on the<br>AXI4-Stream TDATA signal is in according to Data Interface in<br>Chapter 2. If misaligned:<br>In HDL, break up the TDATA vector to constituent components and<br>manually connect the slave and master interface sides.<br>In EDK, create a new vector for the slave side TDATA connection. In<br>the MPD file, manually assign components of the master-side<br>TDATA vector to sections of the new vector.                                                                                                                                                                                                                                                                                                                            |  |
| Severe color distortion or<br>color-swap when processing video<br>written to external memory using<br>the AXI-VDMA core. | Unless the particular software driver was developed with the<br>AXI4-Stream TDATA signal color component assignments<br>described in Data Interface in Chapter 2 in mind, there are no<br>quarantees that the software will correctly identify bits<br>corresponding to color components.<br>Verify that the color component logical addressing TDATA is in<br>alignment with the data format expected by the software drivers<br>reading/writing external memory. If misaligned:<br>In HDL, break up the TDATA vector to constituent components, and<br>manually connect the slave and master interface sides.<br>In EDK, create a new vector for the slave side TDATA connection. In<br>the MPD file, manually assign components of the master-side<br>TDATA vector to sections of the new vector. |  |

<span id="page-76-0"></span>*Table C-3:* **Troubleshooting Third-Party Interfaces**



# *Appendix D*

# Application Software Development

# **Programmer's Guide**

The software API is provided to allow easy access to the YCrCb2RGB AXI4-Lite registers defined in [Table 2-12](#page-19-0). To utilize the API functions, the following two header files must be included in the user C code:

```
#include "ycrcb2rgb.h"
#include "xparameters.h"
```
The hardware settings of your system, including the base address of your YCrCb2RGB core, are defined in the xparameters.h file. The ycrcb2rgb.h file contains the macro function definitions for controlling the YCrCb2RGB pCore.

For examples on API function calls and integration into a user application, the drivers subdirectory of the pCore contains a file, example.c, in the ycrcb2rgb\_v6\_00\_a0\_a/ example subfolder. This file is a sample C program that demonstrates how to use the YCrCb2RGB pCore API.

| <b>Function Name and</b><br>Parameterization             | <b>Description</b>                                                                                                                              |  |
|----------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|--|
| YCC_Enable<br>(uint32 BaseAddress)                       | Enables a YCrCb2RGB instance.                                                                                                                   |  |
| YCC_Disable<br>(uint32 BaseAddress)                      | Disables a YCrCb2RGB instance.                                                                                                                  |  |
| YCC_Reset<br>(uint32 BaseAddress)                        | Immediately resets a YCrCb2RGB instance. The core stays in reset until the<br>RESET flag is cleared.                                            |  |
| YCC_ClearReset<br>(uint32 BaseAddress)                   | Clears the reset flag of the core, which allows it to re-sync with the input video<br>stream and return to normal operation.                    |  |
| YCC_AutoSyncReset<br>(uint32 BaseAddress)                | Resets a YCrCb2RGB instance at the end of the current frame being processed,<br>or immediately if the core is not currently processing a frame. |  |
| YCC_ReadReg<br>(uint32 BaseAddress, uint32<br>RegOffset) | Returns the 32-bit unsigned integer value of the register. Read the register<br>selected by RegOffset (defined in Table 2-12).                  |  |

*Table D-1:* **YCrCb2RGB Driver Function Definitions**



#### *Table D-1:* **YCrCb2RGB Driver Function Definitions** *(Cont'd)*

#### **Software Reset**

Software reset reinitializes registers of the AXI4-Lite control interface to their initial value, resets FIFOs, forces m\_axis\_video\_tvalid and s\_axis\_video\_tready to 0. YCC\_Reset() and YCC\_AutoSyncReset () reset the core immediately if the core is not currently processing a frame. If the core is currently processing a frame calling YCC\_Reset(), or setting bit 30 of the CONTROL register to 1 will cause image tearing. After calling YCC\_Reset(), the core remains in reset until YCC\_ClearReset() is called.

Calling YCC\_AutoSyncReset() automates this reset process by waiting until the core finishes processing the current frame, then asserting the reset signal internally, keeping the core in reset only for 32 ACLK cycles, then deasserting the signal automatically. After calling YCC\_AutoSyncReset(), it is not necessary to call YCC\_ClearReset() for the core to return to normal operating mode.

**Note:** Calling YCC\_AutoSyncReset() does not quarantee prompt, or real-time resetting of the core. If the AXI4-Stream communication is halted mid frame, the core will not reset until the upstream core finishes sending the current frame or starts a new frame.

### <span id="page-79-0"></span>**Double Buffering**

The ACTIVE\_SIZE and the core specific registers are double-buffered to ensure no image tearing happens if values are modified during frame processing. Values from the AXI4-Lite interface are latched into processor registers immediately after writing, and processor register values are copied into the active register set at the Start Of Frame (SOF) signal. Double-buffering decouples AXI4-Lite register updates from the AXI4-Stream processing, allowing software a large window of opportunity to update processing parameter values without image tearing.

If multiple register values are changed during frame processing, simple double buffering would not guarantee that all register updates would take effect at the beginning of the same frame. Using a semaphore mechanism, the RegUpdateEnable() and RegUpdateDisable() functions allows synchronous commitment of register changes. The YCrCb2RGB core will start using the updated ACTIVE\_SIZE and register values only if the REGUPDATE flag of the CONTROL register is set (1), after the next Start-Of-Frame signal (s\_axis\_video\_tuser0) is received. Therefore, it is recommended to disable the register update before writing multiple double-buffered registers, then enable register update when register writes are completed.

#### **Reading and Writing Registers**

Each software register that is defined in [Table 2-12](#page-19-0) has a constant that is defined in ycrcb2rgb.h which is set to the offset for that register listed in [Table D-2](#page-79-1). It is recommended that the application software uses the predefined register names instead of register values when accessing core registers, so future updates to the YCrCb2RGB drivers which may change register locations will not affect the application dependent on the YCrCb2RGB driver.



<span id="page-79-1"></span>

| <b>Constant Name Definition</b> | Value  | <b>Target Register</b> |
|---------------------------------|--------|------------------------|
| YCC_ACTIVE_SIZE                 | 0x0020 | <b>ACTIVE SIZE</b>     |
| YCC RGBMAX                      | 0x100  | YMAX                   |
| YCC RGBMIN                      | 0x104  | YMIN                   |
| <b>YCC ROFFSET</b>              | 0x108  | <b>YOFFSET</b>         |
| YCC GOFFSET                     | 0x10C  | <b>CBOFFSET</b>        |
| YCC BOFFSET                     | 0x110  | <b>CROFFSET</b>        |
| YCC ACOEF                       | 0x114  | <b>ACOEF</b>           |
| YCC BCOEF                       | 0x118  | <b>BCOEF</b>           |
| YCC CCOEF                       | 0x11C  | <b>CCOEF</b>           |
| YCC DCOEF                       | 0x120  | <b>DCOEF</b>           |

*Table D-2:* **Predefined Constants Defined in ycrcb2rgb.h** *(Cont'd)*



# *Appendix E*

# Additional Resources

# **Xilinx Resources**

For support resources such as Answers, Documentation, Downloads, and Forums, see the Xilinx Support website at:

[http://www.xilinx.com/support.](http://www.xilinx.com/support)

For a glossary of technical terms used in Xilinx documentation, see:

[http://www.xilinx.com/support/documentation/sw\\_manuals/glossary.pdf.](http://www.xilinx.com/support/documentation/sw_manuals/glossary.pdf)

For a comprehensive listing of Video and Imaging application notes, white papers, reference designs and related IP cores, see the Video and Imaging Resources page at:

[http://www.xilinx.com/esp/video/refdes\\_listing.htm#ref\\_des](http://www.xilinx.com/esp/video/refdes_listing.htm#ref_des).

# **Solution Centers**

See the [Xilinx Solution Centers](http://www.xilinx.com/support/solcenters.htm) for support on devices, software tools, and intellectual property at all stages of the design cycle. Topics include design assistance, advisories, and troubleshooting tips.

# **References**

These documents provide supplemental material useful with this user guide:

- 1. Jack, Keith. 2004. Video Demystified, 4th Edition. Burlington, MA: Newnes: pp 15-19.
- 2. Poynton, Charles. 2003. Digital Video and HDTV. San Francisco: Morgan Kaufmann: pp 302- 321.
- 3. ITU Recommendation BT.601-5, International Telecommunication Union, 1995.
- 4. ITU Recommendation BT.709-5, International Telecommunication Union, 2002.
- 5. Proakis, John G., and Dimitris G. Manolakis. Digital Signal Processing, 3rd edition. Upper Saddle River, NJ: Prentice Hall: pp 755-756.
- 6. Sullivan, Gary. 2003. Approximate theoretical analysis of RGB to YCbCr to RGB conversion error. Presented for Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC
- 7. [JTC1/SC29/WG11 and ITU-T SG16 Q.6\), July 22-24, in Trondhe](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0051a/index.html)
- 8. [UG761 AXI Reference Guide](http://www.xilinx.com/support/documentation/ip_documentation/axi_ref_guide/v14_2/ug761_axi_reference_guide.pdf)

# **Technical Support**

Xilinx provides technical support at [www.xilinx.com/support](http://www.xilinx.com/support) for this LogiCORE™ IP product when used as described in the product documentation. Xilinx cannot guarantee timing, functionality, or support of product if implemented in devices that are not defined in the documentation, if customized beyond that allowed in the product documentation, or if changes are made to any section of the design labeled DO NOT MODIFY.

See the IP Release Notes Guide ([XTP025\)](http://www.xilinx.com/support/documentation/ip_documentation/xtp025.pdf) for more information on this core. For each core, there is a master Answer Record that contains the Release Notes and Known Issues list for the core being used. The following information is listed for each version of the core:

- New Features
- Resolved Issues
- Known Issues

# **Revision History**

The following table shows the revision history for this document.



# **Notice of Disclaimer**

The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of Xilinx products. To the maximum extent permitted by applicable law: (1) Materials are made available "AS IS" and with all faults, Xilinx hereby DISCLAIMS ALL WARRANTIES AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and (2) Xilinx shall not be liable (whether in contract or tort, including negligence, or under any other theory of liability) for any loss or damage of any kind or nature related to, arising under, or in connection with, the Materials (including your use of the Materials), including for any direct, indirect, special, incidental, or consequential loss or damage (including loss of data, profits, goodwill, or any type of loss or damage suffered as a result of any action brought by a third party) even if such damage or loss was reasonably foreseeable or Xilinx had been advised of the possibility of the same. Xilinx assumes no obligation to correct any errors contained in the Materials or to notify you of updates to the Materials or to product specifications. You may not reproduce, modify, distribute, or publicly display the Materials without prior written consent. Certain products are subject to the terms and conditions of the Limited Warranties which can be viewed at <http://www.xilinx.com/warranty.htm>; IP cores may be subject to warranty and support terms contained in a license issued to you by Xilinx. Xilinx products are not designed or intended to be fail-safe or for use in any application requiring fail-safe performance; you assume sole risk and liability for use of Xilinx products in Critical Applications: [http://www.xilinx.com/](http://www.xilinx.com/warranty.htm#critapps) [warranty.htm#critapps](http://www.xilinx.com/warranty.htm#critapps).

© Copyright 2011–2012 Xilinx, Inc. Xilinx, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, Zynq, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. All other trademarks are the property of their respective owners.