Summary of Adoptions from CEs
ATM
- m24796 (NTT): no upsampling for integer disparities, interpolation for sub-pixel synthesis
- m24941 (Qualcomm): no upsampling to align the resolution of depth to texture
- m24915 (Nokia): adopt change in filter for chroma
- m24819 (Samsung): synthetic skip
- m25017 (Poznan): non-linear depth representation (SEI message in HP, SPS in EHP)
- m24824 (Samsung/Sharp): inter-view motion vector prediction using max disparity
- m24825 (Samsung/ETRI/KHU): depth intra prediction
- Test Model and Ref SW only: modification of view synthesis distortion for RDO based on m24826 (Samsung)
- m24731 (Sony): restriction on interlaced texture with progressive depth; text only for PDAM
HTM
- m24829 (Samsung): inter-view motion vector prediction using max disparity
- m24766 (LG): inter-view predicted residual
- m24938 (Qualcomm): inter-view predicted residual
- Reference SW only: add inter-view motion vector based approach based on m24937 (Qualcomm), m25024 (LG), m24989 (Sony)
CTC change: indicate output viewing positions and shift parameters for stereo.
CE1.a: View synthesis prediction (Coordinator: J. Lee/Samsung) [17 docs]
m24832 | Jaejoon Lee | 3D-CE1.a summary report on view synthesis prediction
m24819 | Jin Young Lee, Jaejoon Lee, Du Sik Park | 3D-CE1.a results on a context-based adaptive skip flag positioning method by Samsung
m24848 | Shinya Shimizu | Cross-check of 3D-CE1.a on context-based skip flag syntax (Samsung)
m24941 | Li Zhang, Ying Chen, Yang Yu, Karthic Veera, Marta Karczewicz | 3D-CE1.a results on view synthesis based on asymmetric texture and depth resolutions
m24838 | Jin Young Lee (Samsung) | Cross-check of 3D-CE1.a results of Qualcomm
m24849 | Shinya Shimizu | Cross-check of 3D-CE1.a on signalling of view synthesis prediction (Qualcomm)
m24898 | Yin Zhao, Lu Yu | 3D-CE1.a results on VSP with integer and half-pixel rendering precision
m24840 | Ilsoon Lim (Samsung) | Cross-check of 3D-CE1.a results of Zhejiang Univ.
m25102 | Jin Young Lee, Shinya Shimizu, Jaejoon Lee, Hideaki Kimata, Du-Sik Park | 3D-CE1.a results on joint proposal of Samsung and NTT
m24932 | Yang Yu, Li Zhang, Ying Chen, Marta Karczewicz | Cross-check of 3D-CE1.a results of Samsung and NTT joint proposal
m24796 | Shinya Shimizu, Hideaki Kimata | 3D-CE1.a related: Simplification of in-loop view synthesis with fractional-pel disparity
m24839 | Jin Young Lee (Samsung) | Cross-check of 3D-CE1.a results of NTT
m24915 | S. Wenyi, D. Rusanovskyy, M. Hannuksela | 3DV-ATM CE1-related contribution
m24935 | Dong Tian, Danillo Graziosi, Anthony Vetro | 3D-CE1.a related proposal for synthetic reference refinement
m24940 | Yu Yang, Ying Chen, Li Zhang, Marta Karczewicz | 3D-CE1.a related: on Generalized view synthesis prediction (GVSP) mode
m25180 | Ying Chen, Jin Young Lee, Jaejoon Lee | 3D-CE1.a related: results on joint proposal of Qualcomm and Samsung
m25184 | Shinya Shimizu | Cross-check of 3D-CE1.a related on improvement of VSP (Qualcomm and Samsung)
m25195 | Shinya Shimizu, Hideaki Kimata, Li Zhang, Ying Chen, Jin Young Lee, Jaejoon Lee | CE1.a related: results on joint proposal of NTT, Qualcomm and Samsung
Summary report:
- Samsung: evaluate context-based adaptive skip flag positioning
- Qualcomm: proposes a generalized view synthesis prediction method that avoids upsampling
- Zhejiang: aims to reduce complexity with variable pixel precision for warping
- Mitsubishi: proposes refined synthetic picture and block-level synthesis constraints
- NTT: simplification of in-loop synthesis with fractional disparity
- Samsung+Qualcomm shows a 1.72% gain on coded data and 1.86% on synthesized results
- The summary report recommended to adopt the Samsung+Qualcomm proposal into the 3DV-ATM for high efficiency, to continue CE1 for low complexity, and to combine the NTT proposal for low complexity.
m24819 (Samsung): In the current syntax, skip_type_flag is used to distinguish conventional skip from synthetic skip. In the proposed syntax, mb_skip_type2 is introduced for this purpose and the syntax is redesigned to achieve an approx. 1% rate reduction. There was a question on whether it is possible to keep the semantics of the original skip flag intact. A cross-check of this proposal was reported by NTT in m24848.
m24941 (Qualcomm): This proposal targets complexity reduction of the view synthesis prediction operations, whereby each pixel in the low-resolution depth image is simply used, i.e. avoiding up-sampling. It was not originally part of the CE, but is a related contribution. Minimal change was reported in BD performance; encoding time was almost unchanged, but decoding time was reduced by approx. 20%. A cross check of this proposal was reported by Samsung in m24838. It was discussed whether subjective quality should be assessed. It was noted that this method is applied in the loop, so it may not have any impact on final subjective quality, rather perhaps only affecting the prediction quality. It was noted that this proposal is related to CE3. Further study with both CE1 and CE3 proposals was suggested.
m24898 (Zhejiang): This contribution was a study of view synthesis prediction using integer and half-pel precision relative to existing quarter-pel precision in warping operations. It reported that integer precision loses 0.83% coding gain with 61% decoding time (i.e. 39% reduction), while half-pel precision loses 0.35% coding gain with 66% decoding time (i.e. 34% reduction). The contribution recommended that the rendering module be more flexible and that the syntax should support sequence-level signalling of the precision. It was suggested that the precision could be optimized for different sequences. The criterion to select the warping precision was not specified and there did not seem to be a clear method to predict the optimal precision (other than exhaustive search). It was claimed that the encoder could select the precision to reduce the complexity of the decoder. There was a question on whether there is any impact on interoperability. All precisions need to be supported in a decoder, so there is no benefit in terms of hardware, but there may be benefit in terms of power consumption for specific bitstreams that are encoded with reduced precision. Further study was encouraged.
m24796 (NTT): This contribution described a simplification of in-loop view synthesis, as initially proposed at the Hannover meeting in July 2008. The number of warped pixels equals the resolution of the coded picture, and it was proposed to perform no up-sampling of the reference picture and no down-sampling of the warped picture. Disparity vectors with fractional-pel accuracy were used. The coding impact was reported as a 0.18% loss in coding performance, with negligible performance loss on synthesis results and 67% decoding time (i.e. 33% reduction). An additional claimed benefit is that the working memory is also reduced. The contribution also reported a combination with Qualcomm’s proposal in which 60% decoding time (i.e. 40% reduction) is achieved with similar results. A slight modification to not halve the disparity vector resulted in a 0.26% coding gain with 60% decoding time (i.e. 40% reduction). Some further combinations were reportedly also being simulated. It was remarked that some further modifications to the warping process could also improve quality.
m25102 (Samsung+NTT): Combined results reportedly show 0.84% gain on coded texture and 1.05% gain on synthesis results with 73% decoding time (i.e. 27% reduction).
m24915 (Nokia): This contribution proposed block-based view synthesis prediction, where synthesis is only generated when it is needed. Based on min and max disparity, a reference area is generated. Local hole filling is performed. In the current implementation, the 6-tap up-sampling filter is replaced with a bilinear interpolation, and the VSP motion vector is restricted to 0. There was a reported 0.3% loss in coding. Encoder complexity is increased, while decoding time is reduced to 70% (i.e. 30% reduction).
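The block-based idea above, generating the synthesized reference only where it is needed, can be sketched as follows. This is a hypothetical helper, assuming disparities are applied horizontally as x_ref = x - d; given the sequence's min/max disparity, only a bounded strip of the reference view has to be warped to synthesize one block.

```python
import math

def vsp_reference_range(x0: int, block_w: int,
                        d_min: float, d_max: float, pic_w: int):
    """Horizontal extent of reference-view samples needed to synthesize
    one block of width block_w starting at column x0.

    With x_ref = x - d and d in [d_min, d_max], the block can only draw
    from columns [x0 - d_max, x0 + block_w - d_min), clipped to the picture.
    """
    lo = max(0, math.floor(x0 - d_max))
    hi = min(pic_w, math.ceil(x0 + block_w - d_min))
    return lo, hi

# A 16-wide block at x0=32 with disparities in [2, 10] needs columns [22, 46).
print(vsp_reference_range(32, 16, 2.0, 10.0, 640))
```

Anything outside this range never contributes to the block, so per-block synthesis plus local hole filling avoids warping the full reference picture.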
m24935 (Mitsubishi): One aspect of this proposal is to refine the synthetic reference picture with decoded data including a hole-filling process. A second aspect of this proposal is to reduce decoder complexity with block-level synthesis. Current simulations do not show any gain, but the contributor suggested to study further as part of CE.
m24940 (Qualcomm): This contribution observed the signalling overhead for reference picture indexing and that MVs corresponding to the VSP reference picture are typically zero, which is not considered good for building context for MV coding. The contribution's aim was to improve the coding efficiency of view synthesis prediction by removing the VSP picture from the reference picture list. Instead, it was proposed to use a flag to indicate whether VSP prediction is used for prediction (similar to SVC). The proposal introduced several flags to indicate skip from the VSP picture, or coding with different modes. The contribution reported 0.62% and 0.58% bit rate gains on coded data and 0.5% on synthesis results. The proposal appears to be a significant change to the coding design without significant coding benefit. Further discussion of this was requested.
m25180 (Qualcomm+Samsung): This combined proposal reportedly achieves 1.72% bit rate gain on coded data, and 1.86% on synthesized results.
m25195 (NTT+Samsung+Qualcomm): This contribution reported on a combination of tools from m24796, m24940, m24941 and m24819. The combined proposal reportedly achieves 1.65% gain on coded data and 1.93% on synthesized results (results are additive). The VSP picture is removed from the reference picture list. Concerns were expressed on complexity, the combination of tools, and individual gains. It was suggested to discuss this offline and clarify the technology/syntax being considered for adoption.
Low complexity adoptions:
- NTT: no upsampling for integer disparities; interpolation for sub-pixel synthesis needed
- Qualcomm: no upsampling to align the resolution of depth to texture
- Nokia: adopt change in filter for chroma
High efficiency adoption:
- m24819 (Samsung)
- The proposal was part of the CE, reported moderate gains, and had been cross-checked. It was commented that the method evaluated in the CE differed from the proposal of the last meeting. The spirit of the initial proposal was to provide the capability to skip based on the VSP reference, but it was a simple modification based on adoptions at the previous meeting.
It was concluded to continue the CE, to study the potential benefits of:
- Block-based view synthesis (m24915, m24935)
- Depth up-sampling in combination with VSP (based on CE3 inputs)
- Inter-view and view synthesis prediction with adaptive luminance compensation (m24816, m22616)
- Not transmitting MV info and building context (m24940, m24915)
- Sub-MB skip/direct (m24940)
- Reference picture refinement (m24935)
- Variable precision synthesis (m24898)
CE1.h: View synthesis prediction (Coordinator: F. Jäger/Aachen) [6 docs]
m24867 | Fabian Jäger | 3D-CE1.h Summary Report: View Synthesis Prediction
m24868 | Fabian Jäger | 3D-CE1.h Results on Warping Based Prediction by RWTH Aachen University
m25187 | Krzysztof Wegner, Olgierd Stankiewicz, Jakub Siast | 3D-CE1.h cross-check of RWTH University proposal on Warping Based Prediction by Poznan University of Technology
m25014 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE1.h results on Disocclusion Coding
m25068 | Ilsoon Lim (Samsung) | Cross-check of 3D-CE1.h results of Poznan Univ.
m24933 | Danillo Graziosi, Dong Tian, Anthony Vetro | 3D-CE1.h related proposal for view synthesis prediction
Summary report
- 2 proponents (Aachen and Poznan)
- Aachen (m24868): the proposal showed 2% gains on only one sequence, but an average loss of 0.5% over all sequences.
- Poznan (m25014): no change in texture coding results, since the tool is only applied to depth coding; 0.26% gain in synthesized views and 88% encoder time (i.e. 12% reduction).
- The participants recommended to continue the CE.
m24933 (Mitsubishi): The contribution proposed to apply an AVC-based design for view synthesis prediction in HEVC. Results were mixed, with some notable gains on sequences with ground truth depth but overall loss. Further study in CE work was recommended.
It was concluded to continue the CE (with participation from Aachen, Poznan, Mitsubishi).
CE2: Depth representation and coding (Coordinator: O. Stankiewicz/Poznan) [14 docs]
m25016 | O. Stankiewicz | 3D-CE2 summary report: Nonlinear Depth Representation & Coding
m24820 | Byung Tae Oh, Jaejoon Lee, Du Sik Park | 3D-CE2.a results on adaptive depth quantization by Samsung
m25018 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE2.a results on Adaptive Depth Quantization (cross-check of Samsung)
m24827 | Byung Tae Oh, Jaejoon Lee, Du Sik Park | 3D-CE2.h results on adaptive depth quantization by Samsung
m24767 | X. Li, L. Zhang, Y. Chen (Qualcomm) | 3D-CE2.ah cross-checking report of Samsung's ADQ
m25021 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE2.h results on Adaptive Depth Quantization (cross-check of Samsung)
m24907 | Junghak Nam, Sunmi Yoo, Hyomin Choi, Woong Lim, Donggyu Sim, Gun Bang, Won-Sik Cheong, Namho Hur | 3DV-CE2.h results on adaptive quantization for depth map
m25006 | Jin Heo, Eunyong Son, Sehoon Yea | Cross-check report for m24907 on 3DV-CE2.h: adaptive quantization for depth map
m25017 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE2.a results on Nonlinear Depth Representation
m25019 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE2.a results on Adaptive Depth Quantization combined with Nonlinear Depth Representation
m25056 | Byung Tae Oh (Samsung) | Cross-check of 3D-CE2.a results of Poznan Univ.
m25020 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE2.h results on Nonlinear Depth Representation
m25022 | O. Stankiewicz, K. Wegner, J. Siast | 3D-CE2.h results on Adaptive Depth Quantization combined with Nonlinear Depth Representation
m25057 | Byung Tae Oh (Samsung) | Cross-check of 3D-CE2.h results of Poznan Univ.
Summary report
- 3 proponents (Poznan, Samsung, ETRI/KWU)
- Some tools were tested in the context of both AVC and HEVC, and various combinations were also evaluated in both frameworks. CE participants determined the QP values that would provide comparable rates for the subjective assessment of the AVC-based results.
- AVC results were as follows: bit rate gains for synthesis results were 0.98% (NDR-HP), 0.5% (NDR-EHP), 0.94% (ADQ) and 1.33% (NDR+ADQ); for coded data, 1.4% (NDR-HP), 0.6% (NDR-EHP) and 2.56% (NDR+ADQ). Most of the cross-checks had been completed, except for the combined Poznan-Samsung proposal. The NDR tool aims to give subjective gain, and results were available for viewing.
- HEVC results were as follows: gains for synthesis results were unavailable (NDR), 0.47% (ADQ), 0.15% (NDR+ADQ), and 0.1% (NDR+AQP-AQD). Results were not fully cross-checked for HEVC.
- The participants recommended viewing the NDR results.
Viewing was done on Tues/Wed during the meeting, as follows:
- Subjective comparison of the anchor with NDR (AVC framework) at constant rate points
- 3 sequences were presented, i.e., only those sequences for which NDR is turned on; the decision to turn NDR on depends on the center of the histogram of the depth distribution.
- 10-point evaluation scale, 30 subjects
- Subjective gains were shown, although the confidence intervals overlap.
Some participants suggested that depth transformation be supported as part of high-level syntax. The current implementation signals the transformation as part of the SPS. The proposed approach has some impact on other coding tools that utilize depth for prediction – inverse mapping needs to be performed for each depth value. For profiles that do not use depth for any in-loop processing (e.g., MVC compatible extension with depth), the signalling could be achieved with an SEI message.
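The forward/inverse mapping issue can be made concrete with a sketch. The power-law curve below is an illustrative stand-in (the actual NDR curve and its parameters are signalled, SEI in HP and SPS in EHP); the point is that any in-loop tool that interprets depth values must apply the inverse mapping, which is typically done through a lookup table.

```python
import numpy as np

NDR_EXP = 1.5  # illustrative curve parameter, not the normative value

def forward_ndr(d: np.ndarray) -> np.ndarray:
    """Map 8-bit linear depth samples to the nonlinear internal
    representation (more codewords spent near the viewer)."""
    return np.round(255.0 * (d / 255.0) ** (1.0 / NDR_EXP)).astype(np.uint8)

def inverse_ndr(v: np.ndarray) -> np.ndarray:
    """Inverse mapping back to linear depth, needed by any tool that
    uses depth for prediction in the loop."""
    return np.round(255.0 * (v / 255.0) ** NDR_EXP).astype(np.uint8)

# A decoder-side tool would precompute the inverse as a 256-entry LUT:
inv_lut = inverse_ndr(np.arange(256))
```

For profiles without in-loop depth processing, the mapping only affects display/synthesis, which is why an SEI message suffices there.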
Comment on ADQ: There was a question on whether the gains reported by ADQ could be achieved with MB-level rate control that adjusts the QP, e.g. based on block activity. Additional CE results were expected relative to such a reference.
Decision: Adopt NDR to ATM (SEI message in HP, SPS in EHP).
Continue CE: Further study ADQ in ATM, as well as NDR and ADQ in HTM.
CE3: In-loop depth resampling (Coordinator: T. Rusert/Ericsson) [12 docs]
m24823 | Thomas Rusert | 3D-CE3 summary report: in-loop depth resampling
m24821 | Seok Lee, Seungsin Lee, Kwan-Jung Oh, Ho-Cheon Wey, Jaejoon Lee | 3D-CE3 results on in-loop depth resampling by Samsung
m25047 | G. Van der Auwera | 3D-CE3.a: Cross-check of Samsung's proposal on in-loop depth resampling
m24942 | G. Van der Auwera, Y. Yu, L. Zhang, Y. Chen, M. Karczewicz | 3D-CE3.a results on Nonlinear depth map resampling
m24842 | Seok Lee (Samsung) | Cross-check of 3D-CE3 results of Qualcomm
m25052 | Jiwook Jung, Sehoon Yea | Cross-check of 3D-CE3.a results of Qualcomm by LG
m24927 | Danillo Graziosi, Dong Tian, Anthony Vetro | 3D-CE3.a results on in-loop depth upsampling filter
m25034 | P. Aflaki, D. Rusanovskyy | 3D-CE3.a: Cross-check of MERL's proposal on depth map processing by Nokia
m24936 | Danillo Graziosi, Dong Tian, Anthony Vetro | 3D-CE3.a results on cross-check for Nokia proposal
m24928 | Danillo Graziosi, Dong Tian, Anthony Vetro | 3D-CE3.a related new in-loop depth up-sampling filter
m24948 | Do-Young Kim, Cheon Lee, Yo-Sung Ho | 3D-CE3 related depth upsampling using depth edge detection
m24914 | P. Aflaki, D. Rusanovskyy, M. Hannuksela | Non-linear Depth Map Resampling for 3DV-ATM Coding
Summary report:
- Three processing steps were investigated in this CE: non-normative depth downsampling (pre-processing), normative depth upsampling (in-loop), and non-normative depth upsampling (post-processing). The CE evaluated different combinations of these filters, e.g., the benefit of post-processing alone or in combination with the in-loop filter.
- 4 proponents (Qualcomm, Samsung, Mitsubishi, Nokia)
- Results: average bit rate reductions of up to 0.55% were observed on coded data, and 7.55% on synthesized results. A majority of these gains (7.11%) were achieved without normative changes, i.e. using post-processing only. Alternative depth downsampling filters generally have a significant impact on depth coding, but further analysis of this effect seems necessary.
m24821 (Samsung): This contribution reported results using a 5x5 dilation filter after linear interpolation. Gains for synthesized results on the order of 7% were shown. Notable subjective quality improvement was shown on sample pictures.
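As a rough illustration of the dilation-after-interpolation idea (a sketch, not the Samsung implementation): linear upsampling smears sharp depth discontinuities, and a grayscale dilation (max filter) pushes foreground depth back over the smeared boundary.

```python
import numpy as np

def upsample_with_dilation(depth: np.ndarray, scale: int = 2, k: int = 5):
    """Bilinear-style upsampling followed by a k x k grayscale dilation."""
    h, w = depth.shape
    # Bilinear upsampling via coordinate mapping.
    ys = np.linspace(0, h - 1, h * scale)
    xs = np.linspace(0, w - 1, w * scale)
    y0 = np.floor(ys).astype(int); x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    up = ((1 - wy) * (1 - wx) * depth[np.ix_(y0, x0)]
          + (1 - wy) * wx * depth[np.ix_(y0, x1)]
          + wy * (1 - wx) * depth[np.ix_(y1, x0)]
          + wy * wx * depth[np.ix_(y1, x1)])
    # k x k grayscale dilation: each output sample takes the neighborhood max.
    pad = k // 2
    padded = np.pad(up, pad, mode="edge")
    out = np.zeros_like(up)
    H, W = up.shape
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + H, dx:dx + W])
    return out
```

The m24928 variant discussed below in this section differs mainly in using nearest-neighbor/median interpolation and a 3x3 window instead of 5x5.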
m24927 (Mitsubishi): This contribution reported 4.9% gain on synthesis result, but high decoder complexity.
m24942 (Qualcomm): This contribution reported 5.52% coding gain on synthesized results.
m24928 (Mitsubishi): This was a CE-related contribution similar to a Samsung proposal with a few differences, i.e. it uses nearest-neighbor and median filtering rather than linear interpolation, and a 3x3 window rather than 5x5. Similar gains were observed. Objective evaluation was suggested for the in-loop up-sampling filter and subjective evaluation for post-processing.
m24948 (GIST): This was a CE-related contribution using a joint bilateral filter and Canny edge detection followed by morphological operations (dilation/erosion). An average gain of 0.2% was shown on synthesized views.
m24914 (Nokia): This contribution described a modified version of the non-linear resampling technique that enables some sequence-level adaptation. The new method achieves an average 5.77% bit rate reduction on synthesis results, which is lower than the 7.35% gain reported earlier, but it was claimed that the subjective quality was improved.
Viewing results were as follows: comparing the anchor to the proposals, no noticeable differences were found in either stereo or monoscopic viewing (of the intermediate view).
There was a suggestion to not consider upsampling in the loop (0.07% coding loss on total bit rate, 25% reduction in decoder time) and refocus the CE on non-normative pre/post-processing.
Decision: Adopt the best performing method (m24928) based on objective measures (0.5% gain on total bit rate, at the cost of a 3x3 dilation filter) and continue CE3.
Continue CE (Dmytro), with the following guidance:
- The decision of CE1 is to study depth up-sampling in combination with VSP
- Evaluate non-normative depth resampling (pre/post-processing)
- Evaluate combinations of in-loop and post-processing methods from CE4 (JVF, m24822, m24947, m24942); disable JVF and AVC deblocking for depth in CTC
CE4.a: In-loop filtering for depth (Coordinator: K.-J. Oh/Samsung) [4 docs]
m24833 | Kwan-Jung Oh | 3D-CE4.a summary report on in-loop filtering for depth
m24822 | Ilsoon Lim, HoCheon Wey, Jaejoon Lee | 3D-CE4.a results on region based adaptive loop filter by Samsung
m25053 | Jiwook Jung, Sehoon Yea | Cross-check of 3D-CE4.a results of Samsung by LG
m24947 | Yunseok Song, Cheon Lee, Yo-Sung Ho | 3D-CE4.a related depth boundary filtering
Summary report:
- 1 proposal from Samsung, and 1 CE-related document
- m24822 (Samsung): This proposal reported an average gain of 0.05% bit rate savings on coded data and 1.45% on synthesized results. Decoding time was reduced to 86% of that of the anchor (i.e. 14% reduction) due to the removal of existing tools. A cross-check was reported by LG in m25053. The participants in the CE had elected to use the proposed R-ALF filter, remove the current JVF tool, and disable the AVC deblocking filter for depth. The complexity of JVF was discussed; it was remarked that this module is currently implemented with floating-point precision and needs to be optimized, and also that its performance needs to be checked.
- m24947 (GIST): This contribution proposed a depth boundary filter applied to coded/up-sampled views. It is not currently implemented in-loop, but the contributor planned to do so by the next meeting. A 2.93% bit rate savings was reported on synthesized results, with sharper depth boundaries.
It was decided to discontinue CE4 and incorporate the relevant aspects into CE3.
CE4.h: In-loop filtering for depth (Coordinator: K.-J. Oh/Samsung) [7 docs]
m24836 | Kwan-Jung Oh | 3D-CE4.h summary report on in-loop filtering for depth
m24828 | Ilsoon Lim, HoCheon Wey, Jaejoon Lee | 3D-CE4.h results on region based adaptive loop filter by Samsung
m24871 | Fabian Jäger | 3D-CE4.h Cross-Check of Region Based Adaptive Loop Filter by Samsung
m24870 | Fabian Jäger | 3D-CE4.h Results on Trilateral Loop Filtering by RWTH Aachen University
m24843 | Ilsoon Lim (Samsung) | Cross-check of 3D-CE4.h results from Aachen Univ.
m24951 | W. Lim, S. Yoo, J. Nam, D. Sim, G. Bang, W. Cheong, N. Hur | Cross-check of 3DV-CE4.h: Results of Trilateral filter by Aachen Univ.
m24939 | G. Van der Auwera, Y. Chen, M. Karczewicz | 3D-CE4.h related: Adaptive depth map edge filtering
Summary report
- 2 CE proposals (Samsung, Aachen) and 1 related contribution (Qualcomm)
- m24828 (Samsung): This contribution showed almost no gain on coded data and 0.95% gain on synthesis results, with minimal complexity increase. The CE participants recommended to adopt this method.
- m24870 (Aachen): This contribution showed a 9% loss with high complexity.
- m24939 (Qualcomm): This contribution proposed adaptive edge filtering for in-loop depth processing. Each 16x16 block is analyzed for the presence of an edge; if an edge is present, the adaptive processing is performed. Gains of 0.5% were reported on synthesized results, both with and without VSO enabled. There was a request to conduct further study in the CE, to understand the interaction with VSO and to consider complexity issues.
It was concluded to discontinue this CE.
CE5.a: Motion/mode parameter prediction (Coordinator: O. Stankiewicz/Poznan) [5 docs]
m25015 | O. Stankiewicz | 3D-CE5.a summary report: motion/mode parameter prediction
m24824 | Jin Young Lee, Tadashi Uchiumi, Jaejoon Lee, Yoshiya Yamamoto, Du-Sik Park | 3D-CE5.a results on joint proposal for an improved depth-based motion vector prediction method by Samsung and Sharp
m24850 | Shinya Shimizu | Cross-check of 3D-CE5.a on depth-based motion vector prediction (Samsung and Sharp)
m24847 | Jian-Liang Lin, Yi-Wen Chen, Xun Guo, Yu-Lin Chang, Yu-Pao Tsai, Yu-Wen Huang, Shawmin Lei | 3D-CE5.a related motion vector competition-based Skip/Direct mode with explicit signalling
m25086 | Ching-Chieh Lin, Fang-Chu Chen | Cross-check of 3D-CE5.a related explicit signalling-based MVC for skip and direct by ITRI
Summary report:
- The proposed methods from Poznan and NTT had been abandoned, but there was a joint proposal from Samsung and Sharp.
- m24824 (Samsung and Sharp): This contribution proposed modified inter-view motion vector prediction using the max disparity instead of the average disparity. It also proposed to change disparity-based skip/direct MV competition by using disparity SAD instead of depth SAD. A 1% improvement in compression was reported without a complexity increase. This was suggested to be a minor change, and there was support from other experts. The CE participants recommended to adopt this method.
- m24847 (MediaTek): This contribution proposed an MV competition-based skip/direct mode that explicitly signals the MVP index instead of performing depth matching. It was suggested that the method avoids the complexity and memory issues associated with the current method. The technique reportedly provides 7.15% gain for coded data and 5.1% gain for synthesized results. A cross-check was performed by ITRI, as reported in m25086. Further study in CE work was recommended by the CE participants.
Decision: Adopt m24824.
It was concluded to continue the CE based on m24847, in which notable gains were reported.
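The adopted max-disparity derivation can be sketched as follows, assuming the standard MPEG 3DV inverse-depth convention for 8-bit depth samples (function and parameter names are illustrative; camera parameters z_near, z_far and focal-length-times-baseline are assumed known):

```python
def disparity_from_depth_block(depth_block, z_near, z_far, f_times_b):
    """Derive the disparity used for inter-view MV prediction from the
    maximum depth-map sample of the block (max disparity, per m24824),
    instead of averaging the samples.

    depth_block: 2-D list of 8-bit depth samples (inverse-depth coded).
    f_times_b: focal length times camera baseline, in pixel * depth units.
    """
    v = max(max(row) for row in depth_block)  # max sample = nearest object
    # 8-bit value -> metric depth z via the linear inverse-depth mapping.
    z = 1.0 / (v / 255.0 * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far)
    return f_times_b / z  # disparity in pixels

# Nearest possible sample (255) maps to z_near, giving the largest disparity.
print(disparity_from_depth_block([[255]], 2.0, 10.0, 100.0))
```

Using the maximum favors the foreground object in the block, which is usually the one that determines the correct inter-view correspondence.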
CE5.h: Motion/mode parameter prediction (Coordinator: S. Yea/LG) [13 docs]
m25060 | Sehoon Yea | 3D-CE5.h summary report: motion/mode parameter prediction
m24908 | Junghak Nam, Sunmi Yoo, Hyomin Choi, Woong Lim, Donggyu Sim | 3DV-CE5.h result on KWU’s advanced motion and disparity prediction method
m24879 | Byeongdoo Choi | Cross-check of 3D-CE5.h KWU motion vector prediction
m24990 | Y. Takahashi, T. Suzuki | Cross-check of 3D-CE5.h KWU inter-view motion vector prediction by Sony
m25024 | Jaewon Sung, Moonmo Koo, Sehoon Yea | 3D-CE5.h results on motion parameter prediction of LG
m25051 | Y. Takahashi, T. Suzuki | Cross-check of 3D-CE5.h LG's inter-view motion vector prediction by Sony
m24829 | Jin Young Lee, Jaejoon Lee, Du Sik Park | 3D-CE5.h related results on an improved inter-view motion parameter prediction method by Samsung
m24929 | Li Zhang, Ying Chen, Marta Karczewicz | 3D-CE5.h related: cross-check of improved inter-view motion parameter prediction method by Samsung
m24937 | Li Zhang, Ying Chen, Marta Karczewicz | 3D-CE5.h related: Disparity vector derivation for multiview video and 3DV
m25002 | Jin Young Lee (Samsung) | Cross-check of 3D-CE5.h results of Qualcomm
m25025 | Jaewon Sung, Sehoon Yea | 3D-CE5.h cross-check results of Qualcomm by LG
m24989 | Y. Takahashi, T. Suzuki | 3D-CE5.h related results on reduction of depth map estimation by Sony
m25027 | Heiko Schwarz, Gerhard Tech | Cross-check of 3D-CE5.h related: Sony's simplification of depth map estimation by HHI
Summary report:
- This CE investigated the potential benefits of a global disparity vector for motion/mode parameter prediction, as well as simplifications of the motion/mode parameter prediction process.
- 2 CE results were reported (by KWU and LG) and 3 related contributions had been submitted (by Samsung, Qualcomm, Sony).
- m24908 (KWU): This contribution proposed to calculate global disparity using textures instead of depth maps. Multiple global disparities are used, i.e., the frame is divided into four regions, with 1/8-pel precision; the difference of global disparities between consecutive regions is transmitted. Coding losses of 0.5% and 0.3% were reported, with complexity comparable to the anchor. The CE participants suggested further study in CE work.
- m25024 (LG): This contribution proposed modifications to PDM initialization and update, i.e., warping and motion compensation are not used for the PDM. An average loss of 0.22% was reported, with decoder time at 97.5% of the anchor.
- m24829 (Samsung): This contribution proposed to use the max disparity for inter-view motion parameter prediction instead of the disparity of the middle sample. A 0.27% gain in texture and depth coding was reported.
- m24937 (Qualcomm): This contribution addresses complexity and memory issues with the current scheme by utilizing motion information of neighboring blocks instead of relying on an estimated depth map. Its coding loss is approx. 0.1% with approx. 97% decoding time relative to the anchor; further reduction in decoding time is expected.
- m24989 (Sony): This contribution proposes a modification of depth map estimation, similar to LG's. It modifies the depth map update procedure in non-base views. In Sony's proposal, the motion compensation is the same as in the anchor; in LG's, motion compensation is removed. Regarding depth map selection in the same view, Sony's proposal is the same as co-located picture selection in HEVC, while LG's is based on minimum POC distance. The reported results for the Sony and LG proposals were 0.06% and 0.14% loss with 2.88% and 3.67% reduction in decoder time, respectively.
There was significant overlap among the proposals; a BoG activity (reported by Sehoon Yea) was conducted to review the various aspects.
Decision: Adopt m24829 into the current PDM approach.
It was agreed to add an inter-view motion vector based approach based on m24937, m25024, m24989 into the HTM reference software.
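The inter-view motion vector based approach added to the reference software can be sketched in simplified form (data layout and scan order are illustrative, not the normative design): instead of maintaining an estimated depth map, the disparity vector for a block is taken from a spatially neighboring block that was coded with a disparity-compensated (inter-view) motion vector.

```python
def derive_disparity_vector(neighbors, fallback=(0, 0)):
    """Return the disparity vector of the first available neighboring
    block that used inter-view prediction; otherwise a fallback
    (e.g. a zero or global disparity vector).

    neighbors: blocks in a fixed scan order (e.g. left, above,
    above-right), each None or a dict {'inter_view': bool, 'mv': (x, y)}.
    """
    for nb in neighbors:
        if nb is not None and nb.get("inter_view"):
            return nb["mv"]
    return fallback

nbs = [{"inter_view": False, "mv": (1, 0)},
       {"inter_view": True, "mv": (-7, 0)}]
print(derive_disparity_vector(nbs))  # takes the inter-view neighbor's MV
```

This is where the reported memory saving comes from: no per-picture estimated depth map (PDM) needs to be stored or updated.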
It was planned to continue the CE, including work on the following:
- Comparison of PDM and inter-view MV based approaches
- Investigating potential benefits of combining GDV, as well as any further improvements and simplifications relevant to the two approaches
It was agreed to modify the HTM and CTC (m25024, m24989) to remove motion dependencies between dependent views, which is desirable from a bitstream extraction point of view. The motivation is that when sample prediction between two views is disabled, inter-view motion prediction should also be disabled.
CE6: Depth intra prediction (Coordinator: J. Lee/Samsung) [7 docs]
m24834 | Jaejoon Lee | 3D-CE6 summary report on depth intra prediction
m24825 | Kwan-Jung Oh, Byung Tae Oh, Jin Young Lee, HoCheon Wey, Jaejoon Lee, Du Sik Park (Samsung), Gun Bang, Won-Sik Cheong, Namho Hur (ETRI), Kyung Yong Kim, Gwang Hoon Park (KHU) | 3D-CE6 results on depth intra prediction by Samsung and ETRI-KHU
m24851 | Shinya Shimizu | Cross-check of 3D-CE6.a on model-based intra prediction for depth map (Samsung and ETRI/KHU)
m25011 | Doowon Lee, Jaewon Sung, Jin Heo, Sehoon Yea | 3D-CE6.h related: Region Boundary Chain Coding for Depth-map
m25163 | Shinya Shimizu | Cross-check of 3D-CE6.h related on Region Boundary Chain Coding for Depth-map (LG)
m25012 | Doowon Lee, Jaewon Sung, Jin Heo, Sehoon Yea | 3D-CE6.h related: Simple Region-based Intra Prediction for Depth-map
m25098 | Lulu Chen (USTC), Miska M. Hannuksela (Nokia) | 3DV-ATM CE6-related information
Summary report
-
There was 1 CE proposal from Samsung/ETRI/KHU, and 2 related contributions.
-
m24825 (Samsung/ETRI/KHU): This contribution reported 1.21% gain on coded data and 1.48% on synthesized results with minimal complexity increase. The CE participants supported the adoption of this scheme.
-
m25011 (LG): This was a CE related contribution that aims to improve on existing depth modeling modes (DMM) to describe a curved pattern that has weak or little inter-component correlation using chain coding mode. It reportedly achieved 0.2% gain with comparable complexity. It was suggested to complement (or replace) modes 3 and 4 in the DMM. It does not require inter-component decoding dependency.
-
m25012 (LG): This contribution observed that the complexity of DMM is relatively high due to the search for the best pattern match. A method is proposed that replaces the conventional intra prediction with one that separates the block into two regions. Preliminary results with DMM disabled reportedly achieved a 0.6% bit rate reduction.
-
m25098 (USTC/Nokia): This contribution presents an intra coding method for depth maps based on a two-step adaptive boundary location process and modified intra prediction that also uses depth edge samples. 13% gains are reported for all-intra cases, but only 0.6% gains under common test conditions.
Decision: Adopt m24825 (AVC).
It was agreed to continue the CE.
[HEVC based: m25011, m25012] – Sehoon/Philipp
[AVC based: m25098] – Miska
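As background for the region-based depth intra tools discussed above (m24825, m25011, m25012), the common idea is to split a depth block into two roughly constant regions and predict each region with a single value. The following is a minimal, deliberately simplified sketch of that idea, not the adopted algorithm; the function name and the thresholding rule are illustrative assumptions.

```python
import numpy as np

def two_region_depth_intra(block, left_col, top_row):
    """Predict a depth block as two constant regions.

    Hypothetical, simplified variant of region-based depth intra
    prediction: threshold against the mean of the reconstructed
    neighbours, then fill each region with that region's neighbour mean.
    """
    neighbours = np.concatenate([left_col, top_row]).astype(float)
    threshold = neighbours.mean()
    # Encoder-side illustration only: the mask is derived from the
    # original block; a real codec would signal or derive the partition.
    mask = block >= threshold
    hi = neighbours[neighbours >= threshold]
    lo = neighbours[neighbours < threshold]
    pred = np.empty(block.shape, dtype=float)
    pred[mask] = hi.mean() if hi.size else threshold
    pred[~mask] = lo.mean() if lo.size else threshold
    return pred, mask
```

For a block with two flat regions and consistent neighbours, such a predictor reproduces the block exactly, which is why region-based modes suit the piecewise-smooth statistics of depth maps.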
CE7: Global depth and view prediction (Coordinator: T. Senoh/NICT) [3 docs]
m24984
|
Takanori Senoh
|
3D-CE7.a summary report on Global Depth and View Prediction
|
m24985
|
Takanori Senoh, Kenji Yamamoto, Ryutaro Oi, Yasuyuki Ichihashi, Taiichiro Kurita
|
3D-CE7.a results on Global Depth and View Prediction by NICT
|
m24845
|
Kazuyoshi Suzuki, Masayuki Tanimoto
|
Cross Check of 3D-CE7.a Results on Global Depth and View Prediction of NICT by NISRI
|
Summary report:
-
In this category, there was one proposal from NICT on global depth and view prediction. This CE evaluates the pre-processing of depth maps and non-base views into a global depth and predicted residual, as well as post-processing to reconstruct synthesized views.
-
m24985 (NICT): This contribution reported a 33% coding gain on texture (base view and predicted residual) and a 48% gain on depth (global depth). The bit rate is comparable but the distortion is not: synthesized view results are degraded by 176% in terms of the BD bit rate measure, which would seem to indicate a substantial degradation, but the subjective quality is reportedly good. This method does not require any change to the AVC or HEVC coding design. The CE participants suggested considering a simple profile that supports this coding architecture. 3-view results were available for viewing.
Viewing was performed on Wednesday afternoon
m25242: subjective evaluation results of CE7
-
Subjective evaluation was done on a scale of [-3, +3]
-
Results showed that CE7 results are subjectively similar, but vary by sequence
-
The report provides comparison according to bit rate
-
Summary: CE7 shows almost the same subjective quality at almost the same bit rate as the anchor; there are also complexity savings and no change to the AVC/HEVC coding design.
-
The BoG recommendation was to create a profile based on the CE7 tool.
We need to understand how this data format fits into the standardization framework.
It was suggested that this framework may be further studied under a separate AHG on Alternative 3DV Representations (possibly together with warping-based representations). The explicit goal is to study schemes that utilize the existing codec. Expected or potential benefits include using the existing coding design, lower complexity, lower bit rate, etc. Several organizations expressed interest. Consideration should be given to the number of 3D solutions to be specified.
From the breakout group, it was originally recommended to establish a separate AHG on this topic. This was further discussed in the video plenary, which decided that for the benefit of coordinating and keeping results comparable, it would rather be beneficial to continue this study within the CE framework.
CE8.a: RD Optimization (Coordinator: K.-J. Oh/Samsung) [3 docs]
m24835
|
Kwan-Jung Oh
|
3D-CE8.a summary report on RD Optimization
|
m24826
|
Byung Tae Oh, Jaejoon Lee, Du Sik Park
|
3D-CE8.a results on view synthesis optimization using distortion in synthesized views by Samsung
|
m25035
|
P. Aflaki, D. Rusanovskyy
|
3D-CE8.a: Crosscheck of Samsung's proposal on VSO by Nokia
|
Summary report
-
There was one proposal from Samsung
-
m24826 (Samsung): This contribution evaluated a modification of view synthesis distortion. It reportedly achieves 1.44% and 0.76% gains on coded and synthesized data, respectively, without any complexity impact. A cross-check confirmed not only objective gain, but also subjective gain. There was support to adopt this method.
Decision: Adopt m24826
It was concluded to discontinue this part of CE.
CE8.h: RD Optimization (Coordinator: K.-J. Oh/Samsung) [9 docs]
m24837
|
Kwan-Jung Oh
|
3D-CE8.h summary report on RD Optimization
|
m24830
|
Byung Tae Oh, Jaejoon Lee, Du Sik Park
|
3D-CE8.h results on view synthesis optimization using distortion in synthesized views by Samsung
|
m24869
|
Gerhard Tech
|
Cross Check of 3D-CE8.h results of Samsung
|
m24865
|
Gerhard Tech, Karsten Müller, Thomas Wiegand
|
3D-CE8.h results on view synthesis optimization by HHI
|
m24844
|
Byung Tae Oh (Samsung)
|
Cross Check of 3D-CE8.h results of HHI
|
m24899
|
Li Wang, Deliang Fu, Yin Zhao, Lu Yu
|
3D-CE8.h: results on JRDO and VSO with different view synthesis algorithms
|
m24873
|
Gerhard Tech
|
Cross check of 3D-CE8.h results of Zhejiang Univ
|
m25058
|
Byung Tae Oh (Samsung)
|
Cross Check of 3D-CE8.h results of Zhejiang Univ.
|
m25010
|
Eunyong Son, Jiwook Jung, Jin Heo, Sehoon Yea
|
3D-CE8.h results: On the adequacy of the RD-measure in VSO
|
Summary report
-
There were 4 CE contributions reporting results on this topic (Samsung, HHI, Zhejiang, LG)
-
m24830 (Samsung): This contribution proposed to model the distortion without directly performing view synthesis, requiring only the corresponding texture. The proposal reduces the encoding time by 40% with a minor increase in coding gain.
-
m24865 (HHI): A modified VSO algorithm was presented with the following key features: (1) run-time optimization of the VSO algorithm implementation, (2) distortion computation depending only on luma samples, (3) no renderer model update for blocks that have already been coded, (4) the number of views used for error computation increased by a factor of three. The proposal reports a 1.2% coding gain at 88% of the anchor encoding time. The proponents noted that this method could be combined with Samsung's proposal.
-
m24899 (Zhejiang): This contribution evaluated the current RDO algorithm using different renderers and suggests that this evaluation methodology be considered to make the performance evaluation more reliable. It was also recommended to adopt the option to perform a Joint RDO as an additional RDO option in the encoder for depth coding; such a change would be non-normative.
-
m25010 (LG): This contribution reports that the synthesis quality varies significantly with respect to the position of the synthesized view, i.e., the view synthesis performs worse in terms of both objective and subjective quality near the coded views. It was asserted that the current VSO algorithm seems to introduce erroneous depth values in the coded depth map and that current performance measures do not capture those issues. The proponents recommend that CEs report not only the average qualities but also the worst case, including synthesized views near coded views. It was countered by a member of the group that the variation in synthesis gains is expected and that no real problem exists.
It was concluded to continue the CE with focus on evaluating the integrated proposals and exploring synthesis quality variations. Sehoon Yea would coordinate the CE.
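For context, the view synthesis optimization (VSO) evaluated in this CE measures the distortion of a coded depth block in a rendered view rather than in the depth map itself. The toy 1-D sketch below (a hypothetical `render_row` forward warp with a linear depth-to-disparity mapping, not the HTM renderer model) illustrates the key property: depth errors over flat texture cause little or no synthesis distortion, while the same errors over textured regions do.

```python
import numpy as np

def render_row(texture, depth, scale=0.1):
    """Toy 1-D forward warp: shift each sample by a disparity proportional
    to its depth (nearest integer, z-buffered); holes keep the co-located
    texture sample as naive hole filling. Illustrative only."""
    w = len(texture)
    out = np.asarray(texture, dtype=float).copy()
    zbuf = np.full(w, -np.inf)
    for x in range(w):
        tx = x + int(round(scale * depth[x]))
        if 0 <= tx < w and depth[x] > zbuf[tx]:  # closer sample wins
            out[tx] = texture[x]
            zbuf[tx] = depth[x]
    return out

def synth_view_distortion(texture, depth_orig, depth_coded):
    """SSD between views rendered with original and coded depth: the kind
    of distortion a VSO-style encoder would weigh against depth rate."""
    ref = render_row(texture, depth_orig)
    test = render_row(texture, depth_coded)
    return float(np.sum((ref - test) ** 2))
```

With a flat texture row, a large depth error yields zero synthesis distortion under this model, which is exactly the rate-saving opportunity that VSO exploits.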
Appendix A: 3D Video coding assessment experiment [3 docs]
m24968
|
Sebastiaan Van Leuven, Glenn Van Wallendael, Jan De Cock, Fons Bruls, Ajay Luthra, Rik Van de Walle
|
Overview of the coding performance of 3D video architectures
|
m24900
|
Franz Hunn Rhee, Karsten Müller, Heiko Schwarz
|
3DV-CE-Annex-A: Crosscheck of 3D-HTM results by HHI
|
m24913
|
D. Rusanovskyy, D. Tian, M. Hannuksela, A. Vetro
|
Compression efficiency of 3DV-ATM in MVC-compatible and AVC-compatible coding configurations
|
m24968 (Ghent/Philips/Motorola): This report provided a comparison of coding efficiency for 5 different coding architectures with several variations.
a) Simple MVC extension with depth
b) AVC extension with improved coding efficiency
c1) Hybrid without depth
c2) Hybrid with depth
c3) MVHEVC without depth (no sub-LCU changes)
c4) MVHEVC with depth (no sub-LCU changes)
c5) Full 3D-HEVC for video and depth
The results report BD performance for coded texture, as well as coded texture and depth. The AVC-based results were cross-checked by Nokia and MERL. The cross-checkers confirmed a perfect match for the center view, but there were mismatches in the results for the side views; it was remarked that this difference is likely due to inter-view delta QP differences. The HTM-based results were cross-checked by HHI, who reported a perfect match for all views and sequences.
The reported coding gains for the 3-view/2-view cases relative to the MVC+depth approach were:
-
AVC compatible: 10.7/8.3% BR reduction
-
Hybrid AVC/HEVC: 34.1/24.8% BR reduction
-
MV-HEVC: 48/46.98% BR reduction
-
HTM-HEVC: 53.6/51.4% BR reduction
The results considering the quality of synthesized views were not available when this contribution was presented.
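The BD bit-rate figures quoted throughout these evaluations follow the standard Bjøntegaard metric: fit log-rate as a cubic polynomial of PSNR for both the anchor and test RD curves, then compare the average log-rate over the overlapping PSNR range. A minimal reference implementation might look like the following (a sketch of the usual metric, not the exact tool used to produce the numbers above):

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjoentegaard delta bit rate in percent: average bit-rate difference
    of the test curve relative to the anchor over the overlapping PSNR
    range. Negative values mean the test configuration saves bits."""
    lr_a = np.log(rate_anchor)
    lr_t = np.log(rate_test)
    # Cubic fit of log-rate as a function of PSNR for each curve.
    pa = np.polyfit(psnr_anchor, lr_a, 3)
    pt = np.polyfit(psnr_test, lr_t, 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    ia, it = np.polyint(pa), np.polyint(pt)
    avg_a = (np.polyval(ia, hi) - np.polyval(ia, lo)) / (hi - lo)
    avg_t = (np.polyval(it, hi) - np.polyval(it, lo)) / (hi - lo)
    return (np.exp(avg_t - avg_a) - 1) * 100
```

For instance, a test curve with exactly half the anchor rate at every PSNR point evaluates to a 50% BD bit-rate reduction.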
m24913 (Nokia/MERL): This contribution reported on the evaluation of the AVC-compatible configuration of the ATM software using the common test conditions, i.e., the anchors used in CEs for the development of AVC extensions. In contrast to the results presented in m24968, this contribution used asymmetric QPs (+3 for the dependent view). The authors noted that these are more practical QP settings, which explains the higher gains that were achieved. The reported coding gains for the 2-view and 3-view cases relative to the MVC+depth approach were:
-
2-view: 11.35% for coded texture and depth, 13.75% for synthesized result
-
3-view: 14.24% for coded texture and depth, 16.57% for synthesized result
The contribution also noted the importance of gains on the enhancement views, since this relates to the additional bandwidth that would be required beyond existing 2D services. On average, the reported gains of the AVC-compatible solution were about 30% dBR for the coded enhancement views; specifically, 31.45% for the 2-view case and 29.3% for the 3-view case. Similar results were observed for view synthesis quality, where the AVC-compatible configuration outperformed the MVC-compatible one by 29% dBR for the 3-view case and by 31% dBR for the 2-view case.
The contribution also noted that additional gains are expected from ongoing CEs, which could potentially add up and raise the average performance gains.
m24900 (HHI): This document presents the cross-check results of m24968. An average gain of 9.6% on each of the dependent views was reported, and more significant gains are reported on the synthesized views: 18.2% for the 2-view case and 15.6% for the 3-view case.
There were subsequent discussions on the above evaluation results on Monday afternoon with the 3DV group and on Tuesday morning with the Requirements group.
It was noted that the AVC-compatible solution requires block-based changes. It was asked whether more significant changes should be considered as part of this track. The group felt that this was possible, but noted that AVC implementations have been optimized over the years; while these implementations can tolerate some level of block-level change, it was remarked that significant architectural changes may be prohibitive.
There was a question about the expected services for the AVC compatible approach. It was remarked that there is no legacy of stereo coding in the mobile service environment.
The group expects a stereo solution for HEVC to emerge very soon based on high-level syntax changes.
The following notes summarize the discussion of the group on the different approaches under consideration.
MVC compatible:
-
This approach is considered to be an obvious and desirable extension.
AVC compatible:
-
This approach is mainly intended for mobile and conversational services.
-
It was noted that broadcast standards are currently adopting MVC. Is it justified to introduce another depth-based stereo format that is not backward compatible with MVC? This may confuse the market.
-
Depth may be needed for other applications than advanced displays, e.g., gaming depth data from Kinect.
-
MB-level changes may be justified by a sufficient amount of benefit. It was argued that in the mobile area, chips are often changed in a customized way, with relatively small modifications such that most of the AVC building blocks can be reused. The group agreed that a 25% subjective gain would justify a new profile.
-
Several companies expressed interest in standardization of this approach (Nokia, Qualcomm, Samsung).
HEVC:
-
This approach provides better compression than any AVC-based solution.
-
A basic stereo extension of HEVC (MV-HEVC) is needed and being asked for by NBs. It was believed by some that this may be a competitor to frame-compatible methods, but with better performance. Such an extension is a subset of the current HM and could potentially be finalized by January 2014.
-
It was remarked that the multiview and scalability extensions could be developed in an aligned manner, which could lead to a reasonable set of tools for coding of dependent pictures/layers, where the dependent decoder is not necessarily identical to the base decoder. This would match the current HM design and may potentially be finalized by July 2014, or soon after. This development activity could also include hybrid solutions, which are considered a bridge between AVC- and HEVC-based solutions.
-
It was considered desirable by some members of the group to avoid introducing extended stereo/multi-view formats (with depth) that are not compatible with the simple MV-HEVC extension. There was a question as to whether this can be achieved with sufficiently high coding efficiency. It was remarked that combining multiview and scalability with MV-HEVC as the base layer could solve this problem. Otherwise, similar to the AVC-compatible case, a 25% subjective gain would justify a new profile that is not compatible with MV-HEVC.