7.2 Core experiment #3: MVC view synthesis prediction & related docs
7.2.1.1.1 JVT-Y068 (Prop 2.2/3.1) [S. Yea, A. Vetro (MERL)] CE3: MVC report on view synthesis pred
This contribution reports the progress of CE3 on view synthesis prediction for multiview coding. Preliminary results on the use of view synthesis prediction for coding efficiency improvement in free viewpoint scenarios were reported. The depth map for each view was reportedly encoded separately from the multiview video and used to generate view synthesis prediction for coding efficiency improvement. The effects of down-sampling as well as the use of different QPs for the depth map were also reported.
When compared to the total bit rate for sending both the pictures and the associated depth map, improvements of 5% or about 0.2 dB were reported for various configurations.
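Depth-based view synthesis prediction of this kind typically forward-warps reference-view pixels into the predicted view using per-pixel depth. A minimal sketch for rectified cameras follows (the `warp_view` helper and its parameters are hypothetical; occlusion handling and hole filling, which a real VSP scheme needs, are omitted):

```python
import numpy as np

def warp_view(ref, depth, focal, baseline):
    """Forward-warp a reference view toward a neighboring rectified view.

    For rectified cameras, a pixel at column x with positive depth Z maps
    to column x - d in the target view, where disparity d = focal * baseline / Z.
    Returns the synthesized picture and a mask of filled samples.
    """
    h, w = ref.shape
    synth = np.zeros_like(ref)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            d = int(round(focal * baseline / depth[y, x]))
            xt = x - d
            if 0 <= xt < w:
                synth[y, xt] = ref[y, x]
                filled[y, xt] = True
    return synth, filled
```

Unfilled samples in the mask correspond to disocclusions, which the encoder can either skip for prediction or fill by inpainting.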
Presentation? Available.
Software? Not provided.
7.2.1.1.2 JVT-Y064 (Prop 2.2/3.1) [S. Shimizu, H. Kimata (NTT)] CE3: MVC view synth resid pred on hierarchical inter-view reference
This contribution proposed a view generation method for view synthesis residual prediction.
The center view is coded with its depth map; depth maps are not encoded for the other views.
Only tested on one sequence; only for anchor pictures, with base view all intra coded.
For the sequence “Rena”, when compared to the total bit rate for sending both the pictures and the associated depth map, a preliminary result of a 13% bit rate saving, or a 0.6 dB PSNR gain, was reported on anchor pictures using Bjøntegaard measures.
The gain reportedly increases to 15% if there is no need to consider the bit rate of the depth map.
Other views are all reportedly predicted from their adjacent views by following the common experiment conditions.
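The 13% / 0.6 dB figures use Bjøntegaard measures. As a reminder, the BD-rate metric fits a cubic polynomial of log-rate versus PSNR to each R-D curve and integrates the gap over the overlapping PSNR range; a sketch of the standard calculation (not the exact JVT spreadsheet implementation):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjoentegaard delta rate: average bit rate difference in percent
    between two R-D curves (negative means the test curve saves rate).
    Fits log-rate as a cubic in PSNR and integrates over the overlap."""
    lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
    p_a = np.polyfit(psnr_anchor, lr_a, 3)
    p_t = np.polyfit(psnr_test, lr_t, 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```

Each curve is normally sampled at four rate points (four QPs), so the cubic fit interpolates the measurements exactly.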
Presentation? Uploaded later.
Software? Not provided.
7.2.1.1.3 JVT-Y065 (Prop NN 2.2/3.1) [Y. S. Ho, K. J. Oh, C. Lee, S. B. Lee, S. T. Na, B. H. Choi, J. H. Park (GIST)] CE3: Depth map gen and virtual view synth
Multi-view depth can be used in virtual view synthesis for free viewpoint video (FVV)/free viewpoint television (FTV) and view synthesis prediction (VSP) for multi-view video coding. However, among the current test sequences, only “Breakdancers” has accompanying depth map information. This contribution describes a depth generation scheme and virtual view synthesis using depth data. An analysis of virtual view synthesis and its relationship to depth coding and preprocessing is also reported.
In this contribution, multi-view depth generation and rendering results under various conditions were reported. Image segmentation and 3D warping techniques were used for depth generation. However, there were reported to be remaining problems such as inaccurate object segmentation and low temporal correlation. This work was reported to be still ongoing. The experiments on depth maps reportedly demonstrated that down-sampling is detrimental to depth coding, while median filtering can be beneficial.
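The beneficial median filtering can be illustrated with a simple k×k median filter, which suppresses isolated depth-estimation outliers while preserving depth discontinuities better than linear smoothing would (a generic sketch, not the contribution's exact preprocessing):

```python
import numpy as np

def median_filter_depth(depth, k=3):
    """Apply a k x k median filter to a depth map prior to coding.
    Edge samples are handled by replicating the border."""
    r = k // 2
    padded = np.pad(depth, r, mode='edge')
    out = np.empty_like(depth)
    h, w = depth.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out
```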
Presentation? Uploaded later.
Software? Not provided.
7.2.1.1.4 JVT-Y024 (Prop 2.2/3.1) [G. Jiang, T. Qiu, M. Yu, Z. Peng, Q. Xu (Ningbo U.)] Disparity est and compression in MVC with disparity maps
A framework of multi-view video plus disparity maps was presented in this contribution. In the framework, the disparity maps are used for fast virtual view generation at the user side. Since the computational ability of the user side is limited by the resources of the client, it was suggested to be better to generate the disparity maps at the server and transmit them to the client. The advantage of sample-wise disparity maps is asserted to be that such maps can be used directly for view generation; however, this comes at the cost of a relatively larger transmission bandwidth. 8×8 block disparity maps are therefore recommended in this contribution, and a disparity estimation scheme is presented in which the relationship between adjacent disparity maps is asserted to be easy to exploit. The 8×8 block disparity maps are losslessly coded by CABAC so as not to introduce coding artifacts, which would strongly influence the reconstruction quality of rendered arbitrary views. Based on the 8×8 block disparity map transmitted from the server, a sample-wise disparity refinement algorithm was described to generate the pixel-wise disparity map for view synthesis.
Suggests encoding N views and N-1 losslessly (CABAC) encoded disparity maps having 1/8 resolution horizontally and vertically. The decoder then performs post-processing to construct full-resolution disparity maps from the decoded lower-resolution maps.
The contribution reports that a 0.5 dB gain can be achieved by using the refined disparity map.
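The sample-wise refinement algorithm itself is not reproduced here. As a hypothetical stand-in, the decoder-side expansion of the 8×8 block map to pixel resolution could be sketched as bilinear interpolation between block centers (the contribution's actual refinement additionally uses the decoded texture, which is omitted):

```python
import numpy as np

def upsample_block_disparity(block_disp, block=8):
    """Expand a block disparity map (one value per block x block area)
    to pixel resolution by bilinear interpolation between block centers."""
    bh, bw = block_disp.shape
    h, w = bh * block, bw * block
    # output sample positions expressed in block-center coordinates
    yi = np.clip((np.arange(h) + 0.5) / block - 0.5, 0, bh - 1)
    xi = np.clip((np.arange(w) + 0.5) / block - 0.5, 0, bw - 1)
    y0 = np.floor(yi).astype(int); y1 = np.minimum(y0 + 1, bh - 1)
    x0 = np.floor(xi).astype(int); x1 = np.minimum(x0 + 1, bw - 1)
    wy = (yi - y0)[:, None]; wx = (xi - x0)[None, :]
    top = block_disp[np.ix_(y0, x0)] * (1 - wx) + block_disp[np.ix_(y0, x1)] * wx
    bot = block_disp[np.ix_(y1, x0)] * (1 - wx) + block_disp[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```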
Remark: May need to encode more data in cases where the views are not rectified.
Remark: Encoding depth rather than disparity map information may reduce the number of maps needed to be coded.
Remark: May need to consider vector map coding.
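The depth-versus-disparity remark follows from the rectified-camera relation d = f·B/Z: one depth map per view determines the disparity toward a view at any baseline, so the N-1 per-pair disparity maps can collapse to a single map. A sketch, with hypothetical focal length and baselines:

```python
def disparity_from_depth(depth_m, focal_px, baseline_m):
    """Disparity in pixels toward a rectified view at the given baseline:
    d = focal * baseline / depth (all camera values are assumptions)."""
    return focal_px * baseline_m / depth_m

# the same 4 m depth sample serves two different inter-view baselines
d1 = disparity_from_depth(4.0, 1000.0, 0.05)   # 12.5 px, adjacent view
d2 = disparity_from_depth(4.0, 1000.0, 0.10)   # 25.0 px, next view over
```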
Presentation? Available.
Software? Not provided.
7.2.1.1.5 JVT-Y031 (Req 2.2/3.1) [T. Senoh, et al. (NICT)] MVC requirements for depth info
(Corresponds to WG 11 document m14879.)
This input document suggested requirements for depth information, especially when rendering virtual-camera images. The proponent suggested considering the following to be a requirement: “Depth information shall be able to provide the absolute-distance of objects in order to enable virtual-camera image rendering, as well as the 3D image rendering with correct dimension.”
The contribution discusses enabling the ability to synthesize new views from the provided views by use of depth map information.
The need for this seems generally agreed in principle.
See section below on applications and scope discussion for MVC / FTV.
Question: “should” or “shall”? Exact details should be worked out as appropriate.
FTV requirements will be an MPEG output.
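Regarding the absolute-distance requirement, 8-bit depth samples are commonly mapped to metric distance by inverse-depth quantization between near and far planes; a sketch of that convention (the exact mapping and its signalling would themselves need to be specified, as the contribution suggests):

```python
def depth_from_sample(v, z_near, z_far):
    """Recover absolute distance from an 8-bit depth sample under the
    common inverse-depth convention: v = 255 maps to z_near (closest),
    v = 0 maps to z_far (farthest).  z_near/z_far are assumed signalled."""
    return 1.0 / (v / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)
```

Quantizing inverse depth allocates more precision to nearby objects, where disparity (and thus rendering error) is largest.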
7.2.1.1.6 General discussion and conclusion for CE3 issues
Remark: Each contributor for this CE seems to be doing quite different and preliminary work. Is this a core experiment or a general research area?
Remark: Applying a video codec as-is to depth map coding does not seem to be the right thing to do. Question: Why not?