Multiview HEVC — Stereoscopic video on steroids

Celebrating a decade of multiview and scalable HEVC extensions

You’ve probably heard about the Apple Vision Pro headset. But did you know that it relies upon a multiview extension of the High Efficiency Video Coding (HEVC, H.265) standard that is celebrating its 10th anniversary?

Video coding is the process of compressing and decompressing video data, which makes it practical to share large video files across the internet. Without video coding, we would not be able to stream our favorite shows and movies or share our own video content on social media platforms.

The first version of HEVC targeted regular two-dimensional video formats, but HEVC version 2 includes the multiview extension, known as MV-HEVC, that expands the scope of HEVC to three-dimensional (3D) video.

Since the first release of the HEVC standard in 2013, HEVC has become broadly supported in computers, smartphones, television sets and other consumer devices. In fact, over 96 percent of consumer video products sold today have HEVC playback capability [1].

Market adoption of MV-HEVC gained a significant boost after Apple's announcements last year of MV-HEVC playback in the Apple Vision Pro headset and of MV-HEVC capture and encoding on iPhone 15 Pro and iPhone 15 Pro Max with iOS 17.2. Since then, MV-HEVC support has been announced for many software and hardware products, including the Meta Quest virtual reality headset.

In this blog post, I’ll cover the benefits of MV-HEVC while also taking a peek at the history of its development.

Why MV-HEVC?

One of the key techniques for reproducing the three-dimensional visual sensation is to capture video with two parallel camera sensors, as illustrated in Figure 1, and present each captured view to a different eye, for example using a virtual reality headset. The captured views have a significant amount of overlap and similarity, since the same captured objects appear in both views, albeit at different horizontal positions within the respective view. Consequently, compression can be achieved by exploiting this inter-view correlation.

Figure 1
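To make the inter-view correlation concrete, here is a minimal sketch of how a block in the left view can be located in the right view by testing horizontal shifts only. It is a toy illustration, not the MV-HEVC algorithm: the function name `best_disparity`, the block size, the search range, and the sample rows are all assumptions made up for this example, and a real encoder uses two-dimensional, rate-distortion-optimized search.

```python
def best_disparity(left_row, right_row, x, block=4, max_disp=8):
    """Find the horizontal shift d (in samples) that best aligns the block
    starting at position x in the left view with the right view.

    Returns (d, sad), where sad is the sum of absolute differences of the
    best match. A sad of 0 means the block is predicted perfectly.
    """
    ref = left_row[x:x + block]
    best_d, best_sad = 0, float("inf")
    for d in range(max_disp + 1):
        start = x - d  # with parallel cameras the object sits further left in the right view
        if start < 0:
            break
        cand = right_row[start:start + block]
        sad = sum(abs(a - b) for a, b in zip(ref, cand))
        if sad < best_sad:
            best_d, best_sad = d, sad
    return best_d, best_sad


# Hypothetical sample rows: the same bright object appears in both views,
# three samples further left in the right view.
left_row = [10] * 6 + [200, 180, 160, 140] + [10] * 6
right_row = [10] * 3 + [200, 180, 160, 140] + [10] * 9

d, sad = best_disparity(left_row, right_row, x=6)
print(d, sad)  # 3 0
```

Once the shift is known, only the shift and a small residual need to be coded for that block instead of the samples themselves, which is where the compression gain comes from.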

MV-HEVC applies the powerful motion compensation coding tools of HEVC to the prediction between the left and right views, as exemplified in Figure 2. Arrows indicate motion-compensated prediction and point to the predicted pictures. This example employs temporally hierarchical prediction, so the pictures are coded and decoded in their dependency order, which differs from the display order shown on the horizontal axis.

Figure 2
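The difference between decoding order and display order can be sketched in a few lines. The example below assumes a hypothetical dyadic group of eight pictures and assumes that the left view is the independently coded base view; neither detail is taken from Figure 2, and the function names are illustrative only. Pictures of the same time instant are kept together, base view first, so that the dependent view can reference its counterpart.

```python
def hierarchical_order(lo, hi):
    """Return the display indices strictly between lo and hi, midpoint first,
    so that every picture follows the two pictures it is predicted from."""
    if hi - lo < 2:
        return []
    mid = (lo + hi) // 2
    return [mid] + hierarchical_order(lo, mid) + hierarchical_order(mid, hi)


def decoding_order(gop_size=8, views=("left", "right")):
    """Return (display_index, view) pairs in decoding order for one group of pictures."""
    times = [0, gop_size] + hierarchical_order(0, gop_size)
    return [(t, view) for t in times for view in views]


order = decoding_order()
print([t for t, view in order if view == "left"])
# [0, 8, 4, 2, 1, 3, 6, 5, 7]
print(order[:4])
# [(0, 'left'), (0, 'right'), (8, 'left'), (8, 'right')]
```

The picture displayed last in the group (index 8) is decoded second, which is why a decoder has to buffer and reorder pictures before output.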

In fact, MV-HEVC reuses the HEVC decoding process and only needs additions in high-level syntax processing on top of an HEVC implementation. Thus, MV-HEVC requires only a modest implementation effort.

MV-HEVC provides an astonishing compression gain compared to earlier approaches. The official verification test [2] concluded that, for stereoscopic 3D video, MV-HEVC achieves about 50 percent bitrate reduction compared to the multiview extension of the Advanced Video Coding standard (AVC, H.264) and more than 30 percent bitrate reduction relative to coding each view independently with HEVC.
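As a rough worked example of what these percentages mean, the sketch below converts a bitrate for two independently coded HEVC views into the corresponding MV-HEVC and AVC multiview figures. The 10 Mbit/s input is a made-up number, the function name is hypothetical, and the reported reductions are treated as exact values here even though the test report states them as "about" and "more than".

```python
def mv_hevc_estimates(simulcast_hevc_mbps, simulcast_saving_pct=30, mvc_saving_pct=50):
    """Estimate bitrates at comparable quality from the reported reductions.

    simulcast_hevc_mbps: total bitrate when both views are coded independently with HEVC.
    Returns (mv_hevc_mbps, avc_multiview_mbps). Because the reported saving over
    independent coding is "more than 30 percent", mv_hevc_mbps is an upper bound.
    """
    mv_hevc = simulcast_hevc_mbps * (100 - simulcast_saving_pct) / 100
    avc_multiview = mv_hevc * 100 / (100 - mvc_saving_pct)
    return mv_hevc, avc_multiview


mv_hevc, avc_multiview = mv_hevc_estimates(10)
print(mv_hevc, avc_multiview)  # 7.0 14.0
```

In other words, under these assumptions a stereo stream that needs 10 Mbit/s as two independent HEVC views would need at most about 7 Mbit/s with MV-HEVC, and roughly twice that with the AVC multiview extension.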

Brief history of MV-HEVC standardization

The Moving Picture Experts Group (MPEG) issued a Call for Proposals (CfP) for 3D video coding technology in March 2011. The following HEVC extension projects were established after the completion of the CfP:

  1. MV-HEVC, which targeted extending HEVC in a manner that requires only high-level changes. Support for depth maps was added during the MV-HEVC standardization project.

  2. 3D-HEVC, which targeted higher compression efficiency than MV-HEVC with new coding tools dedicated to 3D video.

The standardization projects were carried out in the Joint Collaborative Team on 3D Video Coding (JCT-3V). Nokia was among the most active contributors in the collaborative standardization phase of MV-HEVC.

While the 3D video coding standardization was ongoing, the Joint Collaborative Team on Video Coding (JCT-VC) was working on scalable extensions to HEVC, also known as SHVC. The scalable extensions can be used to decode the same bitstream in devices with different capabilities or to extract a subset of a bitstream, resulting in lower fidelity. The scalability types supported by SHVC include quality, spatial, bit-depth, color gamut, and dynamic range scalability.
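Extracting a subset of a multi-layer bitstream can be sketched as follows. Every HEVC NAL unit starts with a two-byte header that carries a 6-bit `nuh_layer_id` field, so a simplified extractor can keep or drop whole NAL units without touching the coded picture data. The helper names below are hypothetical, the input is assumed to be a list of NAL units with start codes already removed, and the extraction process defined in the standard involves more than this simple filter.

```python
from typing import Iterable, List, NamedTuple


class NalHeader(NamedTuple):
    nal_unit_type: int
    nuh_layer_id: int
    temporal_id: int


def parse_nal_header(nal: bytes) -> NalHeader:
    """Parse the two-byte HEVC NAL unit header:
    1 bit forbidden_zero_bit, 6 bits nal_unit_type,
    6 bits nuh_layer_id, 3 bits nuh_temporal_id_plus1."""
    if len(nal) < 2:
        raise ValueError("NAL unit shorter than its two-byte header")
    b0, b1 = nal[0], nal[1]
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    temporal_id = (b1 & 0x07) - 1
    return NalHeader(nal_unit_type, nuh_layer_id, temporal_id)


def extract_layers(nal_units: Iterable[bytes], max_layer_id: int = 0) -> List[bytes]:
    """Keep only NAL units whose layer identifier does not exceed max_layer_id
    (a simplification of sub-bitstream extraction)."""
    return [n for n in nal_units if parse_nal_header(n).nuh_layer_id <= max_layer_id]


# Hypothetical three-unit stream: a parameter set and a picture in layer 0,
# followed by a picture in layer 1 (payload bytes are placeholders).
stream = [bytes([0x40, 0x01]), bytes([0x02, 0x01, 0xAA]), bytes([0x02, 0x09, 0xBB])]
base_only = extract_layers(stream, max_layer_id=0)
print(len(base_only))  # 2
```

A single-layer HEVC decoder could then decode `base_only` as ordinary two-dimensional video, while a multi-layer decoder would consume the full stream.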

Nokia's groundbreaking proposal [3] to unify the core of the multiview and scalable extensions of HEVC was a major pivot point in the standardization. Nokia's contribution resulted in a coordinated standardization of the scalable and multiview extensions of HEVC. Both were then released in HEVC version 2, which was technically completed in July 2014 and published later that year. This was the first time in the history of video coding standards that such a common design for multi-layer video coding extensions was established. Since the core part is common to all multi-layer extensions, the implementation effort for supporting both multiview and scalable extensions, as well as any scalability types, is reportedly reduced, for example when compared to the respective functionality in AVC.

Like HEVC version 2, its successor, the Versatile Video Coding (VVC, H.266) standard, adopted the unified multi-layer design approach. VVC provides approximately the same picture quality at half the bitrate of HEVC [4][5]. For stereoscopic 3D video, the multi-layer profile of VVC is capable of achieving similar relative compression gains over single-layer VVC as MV-HEVC achieved over HEVC. What's more, multi-layer VVC only requires additions in high-level syntax processing over the main VVC profile. Thus, the odds are high that as market adoption of VVC increases, its multi-layer profile will gain traction. All this is thanks to the pioneering work on the multiview HEVC standard 10 years ago. Happy anniversary!

References

[1] S. Forrest and J. Duvall, "Spotlight on HEVC. The codec choice for the video streaming industry", Dec. 2023.

[2] V. Baroncini, K. Müller, and S. Shinya (editors), "MV-HEVC verification test report", JCT3V-N1001, May 2016.

[3] K. Ugur, M. M. Hannuksela, J. Lainema, and D. Rusanovskyy, "Unification of scalable and multi-view extensions with HLS only changes", JCTVC-L0188, Jan. 2013.

[4] M. Wien and V. Baroncini, "Report on VVC compression performance verification testing in the SDR UHD Random Access Category", JVET-T0097, Oct. 2020.

[5] V. Baroncini and M. Wien, "Report on VVC compression performance verification testing in the SDR HD and 360 Video categories", JVET-V0174, Apr. 2021.

Miska Hannuksela

About Miska Hannuksela

Miska Hannuksela (M.Sc., Dr. Tech) is the Head of Video Research at Nokia Technologies and a Nokia Bell Labs Fellow. He is an internationally acclaimed expert in video and image compression and end-to-end multimedia systems.
