Super-Resolution: Upgrading Image Quality with AI

This is a guest post by Robert Lara, Senior Marketing Director at Mipsology

Super-resolution refers to the process of reconstructing a higher-resolution image or sequence from observed lower-resolution images. An image may have a “lower resolution” because of a smaller spatial resolution (i.e., size) or as a result of degradation (such as blurring). Super-resolution has a wide range of applications, including satellite imaging, medical imaging, video surveillance, and video streaming, which is the primary focus of this article.


Video streaming and content providers are being challenged to keep up with consumers' demand for high-quality videos and images across all devices. When the standards for high-definition resolution (720p and 1080p) were introduced, HDTV exploded, and the technology has continued to advance, raising image quality even higher. The current go-to standard, 4K ultra-high definition (4K UHD), gives viewers a crisper image and more accurate detail thanks to its roughly 8 million pixels (compared with about 1 million for 720p HD and 2 million for 1080p), but it’s not the only thing changing the game for content viewing. Many service providers such as Netflix, Amazon Video, and even Facebook and TikTok are increasingly offering 4K content as more people buy 4K and 8K televisions. But how can these service providers ensure that both new and old videos meet rising customer expectations on any device?
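The pixel counts quoted above follow directly from the frame dimensions. The short Python sketch below works through the arithmetic, assuming the usual 16:9 frame sizes for each standard; the names used are illustrative only.

```python
# Frame sizes (width, height) assumed for common 16:9 video standards.
RESOLUTIONS = {
    "720p HD": (1280, 720),
    "1080p Full HD": (1920, 1080),
    "4K UHD": (3840, 2160),
}


def pixel_count(width, height):
    """Number of pixels in a single frame."""
    return width * height


def pixel_ratio(target, source):
    """How many times more pixels `target` has than `source`."""
    return pixel_count(*RESOLUTIONS[target]) / pixel_count(*RESOLUTIONS[source])


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name}: {pixel_count(w, h) / 1e6:.2f} million pixels")
    print(f"4K UHD vs 1080p: {pixel_ratio('4K UHD', '1080p Full HD'):.0f}x the pixels")
```

Upscaling 1080p content to 4K UHD therefore means producing four output pixels for every input pixel, which is why the quality of the "filled in" detail matters so much.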

Super-resolution can turn this... into this:

image (10)
image (11)


Leveraging Super-Resolution to Improve Video and Image Quality

New content offerings generally meet the HD standard, but this doesn’t always apply to older TV shows and movies, or to user-generated videos posted on social media. Thankfully, there is a solution. Advanced deep learning models can now perform “super-resolution,” a method of improving video images that identifies the attributes of a low-quality video or image and “fills in” the missing parts to create a higher-quality output. The result is not the real original image, but it looks more natural to the human eye. With super-resolution, a streaming service can take old content like “The Twilight Zone” and make the video quality look as if it were shot in the 21st century. And people who don’t like black-and-white footage are in luck too: machine learning and neural networks will likely be used to add color to old footage someday soon. More on that in a future blog post.

A Strain on the Computing Resources

The largest streaming services and social media applications would ideally offer millions of videos at the highest resolution to optimize the viewing experience, but this is neither quick nor easy. Applying super-resolution to one hour of video can take 10-15 hours and requires significant computing resources. Add to this the growing demand for quality live-streaming content through services like Twitch and Zoom, which must create millions of high-resolution streams without delay, at optimum performance, 24/7, and for any screen size: phone, tablet, or TV.
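To put the 10-15 hour figure in perspective, the sketch below converts it into a processing rate and into the speedup needed to keep pace with a live stream. The source frame rate of 30 frames per second is an assumption made for illustration; the article does not state one.

```python
def processing_fps(video_hours, processing_hours, source_fps=30.0):
    """Frames processed per second of wall-clock time.

    source_fps is an assumed frame rate, not a figure from the article.
    """
    total_frames = video_hours * 3600 * source_fps
    return total_frames / (processing_hours * 3600)


def realtime_speedup_needed(video_hours, processing_hours):
    """Factor by which processing must speed up to keep pace with live video."""
    return processing_hours / video_hours


if __name__ == "__main__":
    for hours in (10, 15):
        fps = processing_fps(1, hours)
        speedup = realtime_speedup_needed(1, hours)
        print(f"{hours} h per video hour: {fps:.1f} frames/s processed, "
              f"{speedup:.0f}x speedup needed for live streaming")
```

Under that assumption, offline processing at this pace handles only two to three frames per second, which shows how far it is from what live streaming requires.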

This is where Mipsology’s Zebra software solution can play a significant role for service providers looking to differentiate with high-quality video content. See the Zebra software stack image below.

image (9)

Super-Resolution: The Zebra difference

Deep learning techniques are now applied to many image- and video-related tasks. They have also proven effective for super-resolution, where they deliver state-of-the-art image quality. However, neural networks for super-resolution differ from standard classification or segmentation networks in that they have massive inputs and outputs and require a huge number of calculations. Zebra leverages the high density of memory coupled with the large computing resources of FPGAs to deliver an ideal computing platform for all neural networks, including those as demanding as super-resolution.


A very good neural network for creating such high-resolution images is EDSR (https://arxiv.org/abs/1707.02921), whose structure looks like this:

image (12)
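EDSR, as described in the linked paper, stacks residual blocks and then upsamples with a sub-pixel ("pixel shuffle") layer that rearranges channels into a larger spatial grid. The following is a minimal pure-Python sketch of that rearrangement step only. It is not Zebra's code or the paper's implementation, and the function name and the nested-list tensor layout are assumptions made for illustration.

```python
def pixel_shuffle(tensor, scale):
    """Rearrange a (C*scale*scale, H, W) nested list into (C, H*scale, W*scale).

    Uses the common convention
    out[c][h*scale + i][w*scale + j] = tensor[c*scale*scale + i*scale + j][h][w].
    """
    in_channels = len(tensor)
    height, width = len(tensor[0]), len(tensor[0][0])
    if in_channels % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale squared")
    out_channels = in_channels // (scale * scale)

    # Allocate the enlarged output grid for each output channel.
    out = [[[0] * (width * scale) for _ in range(height * scale)]
           for _ in range(out_channels)]

    for c in range(out_channels):
        for i in range(scale):
            for j in range(scale):
                # Each input channel supplies one sub-pixel position (i, j).
                src = tensor[c * scale * scale + i * scale + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * scale + i][w * scale + j] = src[h][w]
    return out


if __name__ == "__main__":
    # Four 1x2 feature maps become one 2x4 map when scale is 2.
    features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
    for row in pixel_shuffle(features, 2)[0]:
        print(row)
```

Because the layer only moves values around, the convolutions before it do the real work. They must produce scale-squared times as many channels as the output needs, which is one reason the inputs, outputs, and calculation counts of super-resolution networks are so large.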

Zebra streamlines the process of super-resolution and eases the computing load, enabling content and streaming providers to achieve their high-quality video and image goals. Using multiple Xilinx Alveo™ accelerator cards in one computer, Zebra achieves a high density of computing and reduces the cost of infrastructure: one Xilinx Alveo-enabled server does the job of three GPU-enabled servers. Based on 8-bit integer computing and an efficient proprietary quantization, Zebra accelerates the inference of neural networks like EDSR to create high-quality 2K or 4K content from 1K video and enable live streaming, all on a single computer. That reduces not only the initial cost but also the installation and data center costs, and it greatly simplifies the software because the video streams don’t need to be spread over multiple hosts.
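Zebra's quantization scheme is proprietary and is not described in this post. As a generic illustration of what 8-bit integer inference involves, the sketch below applies plain symmetric per-tensor quantization to a list of weights. The function names and the scaling rule are assumptions chosen for illustration, not Mipsology's method.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one symmetric scale.

    Generic textbook scheme, used here only as an illustration.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.50, -1.27, 0.031, 0.0]  # hypothetical example weights
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    for original, integer, approx in zip(weights, q, restored):
        print(f"{original:+.3f} -> {integer:+4d} -> {approx:+.3f}")
```

Each restored value differs from the original by at most half a quantization step. How well image quality survives that rounding depends on how the scales are chosen, which is the part a production quantizer has to get right.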

image (13)

A Zebra result of running EDSR is displayed below using the “0825 The band” image (from the DIV2K dataset, NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study).
The image is shown before processing (top left) and after processing (bottom left). Because the detail of a 4K image is hard to show in an article, we have zoomed in on the same area of both images. The top right shows a zoom of the original image enlarged with a classical bicubic algorithm, while the bottom right shows an actual piece of the resulting 4K image after it was enlarged by Zebra using EDSR.

mip-image
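For reference, the classical bicubic baseline mentioned above can be sketched in one dimension with the Keys cubic convolution kernel (a = -0.5, a common choice). Applying it along rows and then along columns gives bicubic image resizing. The function names and the edge-clamping rule are assumptions for illustration, and this is not the code used to produce the figure.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common setting."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def cubic_interpolate(samples, position):
    """Interpolate `samples` at a fractional index using four neighbours."""
    base = math.floor(position)
    total = 0.0
    for offset in range(-1, 3):
        index = base + offset
        # Clamp out-of-range neighbours to the nearest edge sample.
        clamped = min(max(index, 0), len(samples) - 1)
        total += samples[clamped] * cubic_kernel(position - index)
    return total


def upscale_1d(samples, factor):
    """Enlarge a 1-D signal by an integer factor, aligning sample centres."""
    n_out = len(samples) * factor
    return [cubic_interpolate(samples, (i + 0.5) / factor - 0.5)
            for i in range(n_out)]


if __name__ == "__main__":
    row = [10, 10, 200, 200, 10, 10]  # hypothetical row of pixel intensities
    print([round(v, 1) for v in upscale_1d(row, 2)])
```

An interpolator like this can only blend neighbouring samples with fixed weights. It cannot add detail that the input lacks, which is the gap a learned model such as EDSR is trying to close.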

Conclusion

Running on Xilinx FPGA-based platforms, Zebra enables a simple processing infrastructure that reduces the cost of creating high-definition content compared with competing hardware. FPGA-based hardware has a long lifespan and does not randomly fail, enabling 24/7 services to run with low maintenance costs and no interruption. This is essential for companies that are looking to upgrade thousands of movies, shows, and short videos.


Zebra’s high-performance AI acceleration engine is plug & play, does not require any changes to the neural network, and can be immediately deployed for inference while keeping the existing training. This is important for two reasons. First, it saves an incredible amount of time and cost; second, Zebra’s unique IP delivers the high-quality content required in today’s commercial-level super-resolution applications.

To learn more about achieving super-resolution through deep learning with Xilinx and the Mipsology plug & play solution, watch this webinar.