
The latest video compression standard will transform the way we consume video

For over a year, the global pandemic has forced us to stay at home, working and socializing remotely. When the world first went into lockdown, internet traffic exploded without warning. Many of us experienced the frustration of our favourite video streaming providers dropping image resolutions to prevent network congestion and keep things humming. Luckily, Versatile Video Coding (VVC), a new video compression standard developed by Nokia and a range of other partners, is now available. It delivers 50% greater compression than the previous standard at the same picture quality.

Extensive collaboration behind the Versatile Video Coding standard

The Versatile Video Coding (VVC) standard was completed in July 2020, following a collaborative standardization process in the Joint Video Experts Team involving hundreds of video coding engineers from around the world. The process began with a multi-year exploration phase. This was followed by a call for proposals in 2018, which resulted in the selection of an initial test model for the video encoder and decoder, i.e., a codec. The collaborative standardization phase then ran for about two years, during which thousands of technical contribution documents were rigorously evaluated for adoption into the VVC standard and the common test model source code was continuously updated.

As I wrote in my previous blog, Nokia has been a major contributor to international video coding standardization since the early 1990s. We used the experiences and knowledge gained from years of research to make a significant input into the design of VVC. In this blog, I will walk you through some of the key features of the VVC standard that were heavily influenced by Nokia’s contributions.

Why VVC is outperforming previous video codecs

A key driver in the development of the new standard was the need to fit the ever-increasing amount of video traffic within the available network capacity, while also keeping pace with the growing display resolutions of television sets, smartphones, and other video playback devices.

VVC is a significant upgrade for conventional video applications, such as digital television and video conferencing, thanks to its capability to reduce bitrate requirements by 50% compared to the previous generation of video codecs, without reducing quality. Let’s all say goodbye to using bandwidth issues as an excuse for not turning on the camera in video meetings!

Versatility is one of the strengths of VVC. It supports emerging video content formats and low-latency transmission over 5G networks, which is crucial for remote-operated machines and vehicles, and it is also a perfect match for game streaming, cloud gaming and any content that mixes camera-captured and computer-generated images, such as news broadcasts. And in the future, VVC will also enable immersive video beyond 360°, including six-degrees-of-freedom video, which lets the viewer take any viewing position and look freely around within lifelike 3D video.

The first video codec for 5G low latency

Ultra-low latency - which is vital for real-time reactions in remote-operated machines - is one of the key defining capabilities of 5G networks. However, it’s not only the network connection but also the video codec that needs to operate with low delay. VVC is the first video standard in which all decoders support Gradual Decoding Refresh (GDR). By matching the encoded video bitrate with the network throughput, this feature reduces end-to-end delay from the quarter of a second that may be experienced with other codecs down to just a few tens of milliseconds. The result is a dramatic improvement in the quality of experience. For example, in a multiparty video conference a lower delay makes it far less likely that people will talk over each other.
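To make the buffering argument concrete, here is a minimal Python sketch. All numbers (frame rate, channel throughput, frame-size ratios) are illustrative assumptions, not values from the VVC standard:

```python
# Minimal sketch of why evenly sized GDR frames cut end-to-end delay.
# A large intra (IDR) picture must be drained through a fixed-rate channel,
# so the receiver needs a buffer of roughly intra_size / throughput seconds.
# GDR spreads the intra refresh over many frames, so every frame fits the
# per-frame channel budget. All numbers below are hypothetical.

FPS = 30                     # frames per second
THROUGHPUT = 4_000_000       # channel rate, bits per second
BUDGET = THROUGHPUT / FPS    # per-frame bit budget that matches the channel

def worst_case_delay(frame_sizes_bits):
    """Smallest startup delay so that every frame is fully received by its
    display time, for frames entering a FIFO channel at capture time."""
    finish, delay = 0.0, 0.0
    for i, size in enumerate(frame_sizes_bits):
        capture = i / FPS
        finish = max(capture, finish) + size / THROUGHPUT
        delay = max(delay, finish - capture)
    return delay

# Conventional coding: one large intra (IDR) picture per second.
idr_stream = [8 * BUDGET if i % FPS == 0 else 0.75 * BUDGET for i in range(300)]
# GDR coding: the intra refresh is spread out, so every frame fits the budget.
gdr_stream = [BUDGET] * 300

print(f"IDR worst-case delay: {worst_case_delay(idr_stream) * 1000:.0f} ms")
print(f"GDR worst-case delay: {worst_case_delay(gdr_stream) * 1000:.0f} ms")
```

With these assumed numbers, the IDR stream needs roughly a 267 ms startup buffer to absorb the intra-picture burst, while the evenly sized GDR stream needs only about one frame interval (33 ms) - the quarter-second versus tens-of-milliseconds contrast described above.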

I am proud to say that Nokia has been one of the leading companies in the development and standardization of GDR technology since 2002 [1][2] and one of the most active contributors of GDR to the VVC standard [3]–[10].

Adaptive delivery of immersive video

5G networks enable the transmission of content requiring very high resolution and bandwidth, including 360° virtual reality video. While a viewer can freely look around when watching 360° video, only a portion, known as a viewport, is visible at any time. Consequently, the bandwidth requirements of 360° video can be reduced if the viewport is transmitted at a higher picture quality compared to other portions – which is commonly referred to as viewport-dependent streaming.
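As a rough back-of-the-envelope sketch of the saving (the bitrates and viewport fraction below are hypothetical, chosen only for illustration):

```python
# Hypothetical figures for a 360° stream; only the relative sizes matter.
HIGH_Q_BPS = 25_000_000   # assumed bitrate of the full 360° video at high quality
LOW_Q_BPS = 6_000_000     # assumed bitrate of the full 360° video at low quality
VIEWPORT_FRACTION = 0.25  # assumed share of the sphere covered by the viewport

def viewport_dependent_bitrate(high_bps, low_bps, viewport_fraction):
    """Total bitrate when the viewport is sent at high quality and the
    remaining portions at low quality, so head turns still show content."""
    return viewport_fraction * high_bps + (1 - viewport_fraction) * low_bps

vd = viewport_dependent_bitrate(HIGH_Q_BPS, LOW_Q_BPS, VIEWPORT_FRACTION)
print(f"Everything at high quality: {HIGH_Q_BPS / 1e6:.1f} Mbit/s")
print(f"Viewport-dependent:         {vd / 1e6:.2f} Mbit/s "
      f"({100 * (1 - vd / HIGH_Q_BPS):.0f}% saving)")
```

Under these assumptions the stream drops from 25 Mbit/s to about 10.75 Mbit/s, a saving of roughly 57%, while the picture the viewer actually looks at stays at full quality.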

VVC is the first standard to provide subpictures, a unique feature that supports viewport-dependent streaming and brings the mainstream adoption of 360° video another step closer. Nokia has been a driving force in developing subpicture technology since 2001 [11]. For VVC, Nokia introduced the requirements for 360° video [12], based on the experience gained with the OZO camera and related software products. During the collaborative phase of VVC standardization, Nokia initiated the work towards subpictures [19][20] and was a key contributor to their technical design [13]–[29].
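To illustrate how a streaming client might use subpictures, here is a minimal sketch; the grid layout, track names and URLs are hypothetical, since VVC standardizes the bitstream format rather than the client logic:

```python
# Hedged sketch: pick one quality per subpicture position depending on the
# viewport. Grid dimensions and URLs below are invented for illustration.
from dataclasses import dataclass

@dataclass
class SubpictureTrack:
    position: tuple[int, int]  # (row, col) in the subpicture grid
    quality: str               # "high" or "low"
    url: str                   # hypothetical location of the encoded track

GRID_ROWS, GRID_COLS = 4, 8    # assumed 4x8 subpicture layout of the 360° video

def tracks_for_viewport(viewport_cols: set[int]) -> list[SubpictureTrack]:
    """Pick high-quality tracks for grid columns inside the viewport and
    low-quality tracks everywhere else."""
    selection = []
    for row in range(GRID_ROWS):
        for col in range(GRID_COLS):
            quality = "high" if col in viewport_cols else "low"
            selection.append(SubpictureTrack(
                (row, col), quality,
                f"https://example.com/sub_{row}_{col}_{quality}.vvc"))
    return selection

# The viewer is looking at grid columns 2-4: fetch those at high quality.
selection = tracks_for_viewport({2, 3, 4})
print(sum(t.quality == "high" for t in selection), "high-quality subpictures")
# Because subpictures are independently decodable, the fetched tracks can be
# merged into one conforming bitstream without re-encoding any pixels.
```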

VVC and 5G are a perfect match

Thanks to its superior 50% compression gain over the previous generation of video coding standards, VVC is the best available compression technology for video services. On top of the compression performance, VVC provides unprecedented versatility, which makes it the preferred choice for 5G networks and immersive video.

You can read more about the future trends in video usage here.

References
[1]    Y.-K. Wang and M. M. Hannuksela, "Gradual Decoder Refresh Using Isolated Regions," Joint Video Team document JVT-C074, May 2002.
[2]    M. M. Hannuksela, "Sync Pictures," JVT-C081, May 2002.
[3]    M. M. Hannuksela, K. Kammachi-Sreedhar, and A. Aminlou, "AHG14: On gradual decoding refresh," Joint Video Experts Team document JVET-N0310, Mar. 2019.
[4]    K. Kammachi-Sreedhar, M. M. Hannuksela, and A. Aminlou, "CE11-3.2: Gradual Random Access (GRA) with Sub-Pictures," JVET-O0663, June 2019.
[5]    L. Wang and M. M. Hannuksela, "CE11-related: Wavefront-based GRA and Related Syntax," JVET-O0977, July 2019.
[6]    L. Wang, "CE11-related: Wavefront-based GRA Method," JVET-O0979, July 2019.
[7]    L. Wang, S. Hong, and K. Panusopone, "CE2: Test 2-3 Wavefront-Based Gradual Decoding Refresh (GDR)," JVET-P0112, Oct. 2019.
[8]    L. Wang, S. Hong, and K. Panusopone, "AHG9: Gradual Decoding Refresh for VVC," JVET-Q0527, Jan. 2020.
[9]    L. Wang, S. Hong, K. Panusopone, and J. Lainema, "AHG9: Gradual Decoding Refresh without Forcing Intra Area," JVET-Q0560, Jan. 2020.
[10]   L. Wang, S. Hong, K. Panusopone, and M. M. Hannuksela, "AHG9: On LMCS for GDR," JVET-R0393, Apr. 2020.
[11]   M. M. Hannuksela and Y.-K. Wang, "New image segmentation method," ITU-T Video Coding Experts Group document VCEG-O46, Dec. 2001.
[12]   J. Ridge and M. M. Hannuksela, "Future video coding requirements on virtual reality," ISO/IEC JTC1 SC29 WG11 document M37152, Oct. 2015.
[13]   M. M. Hannuksela, A. Zare, M. Homayouni, R. Ghaznavi-Youvalari, and A. Aminlou, "Design goals for tiles," JVET-K0300, July 2018.
[14]   M. M. Hannuksela and A. Aminlou, "AHG12: On grouping of tiles," JVET-M0261, Jan. 2019.
[15]   M. M. Hannuksela, "AHG12/AHG17: On merging of MCTSs for viewport-dependent streaming," JVET-M0388, Jan. 2019.
[16]   M. M. Hannuksela, A. Aminlou, and K. Kammachi-Sreedhar, "AHG12: Comparison of approaches for independently coded picture regions," JVET-N0044, Mar. 2019.
[17]   M. M. Hannuksela, K. Kammachi-Sreedhar, and A. Aminlou, "AHG12: Sub-picture layers for realizing independently coded picture regions," JVET-N0045, Mar. 2019.
[18]   M. M. Hannuksela and K. Kammachi-Sreedhar, "AHG12: Sub-picture-based picture partitioning and decoding," JVET-N0046, Mar. 2019.
[19]   M. M. Hannuksela, "AHG12/AHG17: Merging IRAP and non-IRAP VCL NAL units into the same coded picture," JVET-N0047, Mar. 2019.
[20]   A. Aminlou, A. Zare, and M. M. Hannuksela, "CE12-related: MCTS improvement by modifying motion compensation filter coefficients," JVET-N0402, Mar. 2019.
[21]   A. Aminlou, A. Zare, and M. M. Hannuksela, "CE12-related: MCTS improvement by modifying prediction block," JVET-N0403, Mar. 2019.
[22]   M. M. Hannuksela, "AHG12: On independently coded picture regions," JVET-O0394, June 2019.
[23]   M. M. Hannuksela, Y.-K. Wang, and Hendry, "AHG12: Signalling of subpicture IDs and layout," JVET-P0126, Sep. 2019.
[24]   M. M. Hannuksela, A. Aminlou, and K. Kammachi-Sreedhar, "AHG8/AHG12: Subpicture-specific reference picture resampling," JVET-P0403, Oct. 2019.
[25]   M. M. Hannuksela, Y.-K. Wang, and Hendry, "AHG12: single_slice_per_subpic_flag," JVET-P1024, Oct. 2019.
[26]   M. M. Hannuksela, A. Aminlou, R. Ghaznavi-Youvalari, and K. Kammachi-Sreedhar, "AHG8/AHG12: Subpicture-specific RPR," JVET-Q0236, Dec. 2019.
[27]   M. M. Hannuksela, "AHG9: On mixed NAL unit types in a coded picture," JVET-Q0239, Dec. 2019.
[28]   A. Hallapuro, M. Homayouni, A. Aminlou, and M. M. Hannuksela, "AHG12: Subpicture merging experiments," JVET-R0148, Apr. 2020.
[29]   A. Hallapuro and M. M. Hannuksela, "AHG3/AHG12: Subpicture merging software," JVET-S0162, June 2020.


About Miska Hannuksela

Miska Hannuksela (M.Sc., Dr.Tech.) is the Head of Video Research at Nokia Technologies and a Nokia Bell Labs Fellow. He is an internationally acclaimed expert in video and image compression and end-to-end multimedia systems.

