International Organisation for Standardisation / Organisation Internationale de Normalisation



7.3 Applications and scope discussion for MVC




7.3.1 Depth map coding scenarios discussion


Coding scenario #1: Consider coding M ordinary 2D+t video sequences, plus P depth maps. This information can be used to synthesize N views (possibly with N being very large or conceptually infinite). This could enable “3D TV” at, e.g., twice the bit rate of ordinary “2D TV”.
Coding scenario #2: Consider coding M ordinary 2D+t video sequences (joint or “simulcast” coding). These sequences (without additional depth map information) can provide the N=M views to be presented on some display.
Coding scenario #3: Consider coding M=1 ordinary 2D+t video sequence, plus P=1 depth map, plus one additional 2D+t “background” (BG) video sequence.
Each coding scenario may be capable of generating N views for presentation on an autostereoscopic video display (such that moving one's head changes the stereo viewing perspective).
Remark: Coding scenario #1 is likely to provide better performance in terms of the number of bits needed to provide the necessary quality for the N views.
For M=1 and P=1, we can do this with MPEG-C Part 3.
For M=2 and P=0, we can do this with stereo video SEI messages.
For M=1 and P=1 plus BG, we have something that has been recently discussed in an MPEG AHG as a potential extension of MPEG-C Part 3 in response to a proposal that arrived at the Lausanne meeting of MPEG.
Is that our target application?
“Free viewpoint TV” is considered a more elaborate application than the driving of an autostereoscopic display – more of a virtual-reality-style ability to navigate arbitrarily within some visual environment. Is that correct?
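The depth-based view synthesis underlying scenario #1 can be sketched as a simple forward warp: each texture pixel is shifted by a disparity derived from its depth value. The sketch below assumes a hypothetical rectified, horizontally translated camera setup; the function and parameter names are illustrative and are not taken from any MVC or MPEG-C reference software. Real renderers additionally inpaint the disoccluded "holes" that this sketch merely marks as unfilled.

```python
import numpy as np

def synthesize_view(texture, depth, f, baseline, alpha):
    """Forward-warp a 2D+depth frame to a virtual camera located at
    fraction `alpha` of the baseline (rectified setup assumed).
    `depth` holds positive scene depths Z; disparity d = f * B / Z.
    Returns the warped texture plus a mask of filled pixels
    (unfilled pixels are disocclusion holes)."""
    h, w = depth.shape
    out = np.zeros_like(texture)
    filled = np.zeros((h, w), dtype=bool)
    # per-pixel horizontal shift, rounded to integer pixel positions
    disp = np.round(alpha * f * baseline / depth).astype(int)
    for y in range(h):
        # warp far pixels first so nearer pixels overwrite occluded ones
        for x in np.argsort(-depth[y]):
            xv = x - disp[y, x]
            if 0 <= xv < w:
                out[y, xv] = texture[y, x]
                filled[y, xv] = True
    return out, filled
```

With alpha = 0 the virtual view coincides with the source view and the warp is the identity; as alpha grows, near pixels shift farther than far ones and holes open up at depth discontinuities, which is one source of the interpolation artifacts noted later in this section.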

7.3.2 Joint discussion with MPEG video on MVC / FTV applications and scope




7.3.2.1.1 M14876 USNB input to WG 11 parent body

The USNB of WG 11 provided the following three remarks on multi-view and free-viewpoint video coding work (MVC/FTV):

  1. The USNB suggests consideration of two distinct forms of multiview video coding. These two forms are: “inward-looking” / “parallel viewing” (where the multiple views are created from distinct viewing positions pointing at essentially the same scene area to be viewed), and “outward-looking” (also known as panoramic mode, where the multiple views are created by taking different viewing angles from the same viewing position). The USNB notes that the current multi-view test set comprises only inward-looking sequences. The USNB certainly appreciates the difficulty of obtaining multiview test data (as USNB members have contributed sequences to the current test set), but it is suggested that WG 11 make an effort to gather panoramic video data. Neighboring views in the panoramic case may have different inter-view predictive coding issues than adjacent views in the inward-looking case. Inclusion of these cases may thus affect the design and analysis of the view coding layer.




  2. The USNB also recognizes that depth map information relating to multi-view video is important to successful widespread adoption of both multiview and free-viewpoint video applications. The USNB suggests that WG 11 undertake activities to demonstrate the capabilities of the current state of the art for such depth map estimation. The USNB further suggests that a high-quality reference depth map estimation technique be made available for the design and testing of related video coding standards and to aid implementers of WG 11 specifications. The USNB expects that such techniques should be adopted in MPEG standards as reference software.




  3. The USNB has reviewed the requirements for free-viewpoint television (FTV) and notes that coding of depth/disparity map information is an integral part of enabling free viewpoint video applications. Since the work on MVC has also led to exploring the coded representation of depth map information for the purpose of optimizing MVC coding efficiency, the USNB suggests that inclusion of depth/disparity map information for the purpose of enabling free viewpoint video be considered within the scope of the MVC project. The USNB believes that an integrated treatment of the subject would lead to a more coherent overall design of the video coding layer.

The USNB indicated that its members will contribute to the work in these areas.


Remark: What are some application scenarios for the “outward-looking” case?
Question: Is the outward looking case part of “FTV”?
For the case of panoramic view: Is it possible to gain compression when cameras have little overlap? Is joint coding substantially beneficial in the outward-looking case?
Seek input on applications and requirements for the outward-looking case.

A study on this was already performed in the 3DAV exploration.

FTV could also use semi-automatically or manually produced depth maps.

Depth/disparity map encoding is relevant and should be performed as part of MVC.


Disposition: We request further input on the application scenario of the outward-looking case, the role that coding technology can play in that environment, the degree to which current coding technology does not fulfill those needs, and test material.

7.3.2.1.2 Initial discussion of FTV application scenarios

Application #1 – watch 3-D video on a screen, either “stereoscopic” (e.g., with glasses) or “autostereoscopic” within a limited range of perspective (e.g., 10 degrees).
Application #2 – free navigation with “3-D” or “2-D” viewing (interactively – with a broad range of perspective).
Remark: The coding efficiency benefit of inter-view prediction is rather limited – the bit rate is roughly proportional to the number of coded views.
Remark: Changes of perspective by use of view interpolation produce artifacts – the result may look OK when the viewer sits still, but not look correct while the perspective is moving.
Remark: Display technology is evolving and different displays are capable of showing different (and increasing) numbers of views – we should not presuppose some particular number of views.
Remark: We should create a format that allows any number of views to be created from the encoded data without a proportional increase of bit rate. Perhaps a 2x bit rate increase relative to ordinary 2-D viewing is acceptable, but much higher bit rate ratios should be ruled out.
Remark: Backward compatibility is an important requirement.
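The scaling argument in these remarks can be made concrete with a toy calculation (the rates below are assumed, illustrative numbers, not measurements): simulcast-style coding grows roughly in proportion to the number of coded views, whereas a 2D-plus-depth representation stays near a fixed multiple of the single-view rate no matter how many views are synthesized at the display.

```python
# Illustrative bit-rate scaling; all numbers are assumptions for the sketch.
BASE_RATE_MBPS = 4.0    # hypothetical rate of one ordinary 2-D stream
DEPTH_OVERHEAD = 1.0    # assume depth costs about as much as the texture

def simulcast_rate(n_views, base=BASE_RATE_MBPS):
    """Coding N views independently: rate grows roughly linearly in N."""
    return n_views * base

def video_plus_depth_rate(n_views, base=BASE_RATE_MBPS,
                          overhead=DEPTH_OVERHEAD):
    """One texture stream plus depth: roughly constant, independent of N."""
    return base * (1 + overhead)   # ~2x the 2-D rate, any number of views

for n in (2, 9, 32):
    print(n, simulcast_rate(n), video_plus_depth_rate(n))
```

This is the arithmetic behind the remark above: the 2D-plus-depth format caps the overhead near 2x, while simulcast crosses that threshold as soon as more than two views are coded.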
Activities:

  • Application scenario requirements

  • Depth map technology development

Some relevant recent WG 11 output documents:



  • N7777 Stereoscopic video requirements

  • N9163 MVC requirements (Lausanne)

  • N9168 FTV requirements

Some relevant WG 11 input contributions:



  • 14779 AHG report on new video

  • 14952 FTV

  • 14949 Depth-viewing HDTV camera

Additional WG 11 technical-oriented input contributions: 14879, 14888, 14889, 14920, 14994, 14996.


Suggestion:

  • Identify typical application cases

  • Test material for such cases

  • What can be done with existing standards? Which parts would need to be normative?

Remark: Consider two application scenarios: navigation in video, and generation of views for multiview displays; their combination may lie further out in the future.


Problem: The increase in data rate as compared to single-view coding should not be high. This will need to be traded off against the quality of the generated views.
Remark: The range of views will typically be larger in FTV than in the current MVC scope (which is, in principle, accommodating N-view displays).
Remark: The technology should be scalable such that stripping off data varies the supported range of view variation (and the quality).
A WG 11 AHG is to be formed to improve understanding of the applications and requirements, and to study the issue of depth maps.

7.3.3 Additional break-out report discussions on FTV applications and scope

7.3.3.1.1 JVT-Y087-B [A. Vetro, F. Bruls] BoG report: Summary of BoG Discussions on FTV

This document describes the FTV concept, target applications and first thoughts on experimental evaluation resulting from break-out group discussion during the meeting. This is expected to form the basis for experiments that would be defined in an upcoming Call for Proposals on the topic. Open issues to be discussed further were also highlighted.
Presentation on identifying applications and parameters – table shown. Outlines number of views, number of depth maps, properties of applications.
Applications were identified in an accompanying spreadsheet.
A “Call for Test Material & Software for FTV Experiments” was provided as an annex.
Needs for displays, test material, etc. were discussed.
FTV can be defined as a compressed representation and associated technologies that enable generating a large number of different views from a sparse view set. This most probably (given currently known technologies) requires implementation of depth/disparity map estimation (non-normative), definition of a depth/disparity map representation and compression scheme, and an interpolation/rendering method (it is not yet clear whether the latter should be non-normative or normative). All of these elements rely on each other, such that proper technology selection will most probably not be simple. Depending on the application, the number of views to be generated may range from two for simple stereoscopic viewing up to "many" for an almost-free walk-through within a scene.
WG 11 will issue the following output documents on FTV at this meeting:

  • N9466 on applications and requirements

  • N9467 describing FTV test cases and evaluations, and

  • N9468 which will be a call for contributions on FTV test material.



