6.10 High-level syntax and slice structure
6.10.1 High-level syntax and systems usage of bitstreams
6.10.1.1.1.1.1.1.1 JCTVC-F541 Syntax to express a constraint on reordering latency [G. J. Sullivan (Microsoft)]
(Presentation chaired by J. Boyce.)
This contribution proposes to add an SPS-level parameter in HEVC that expresses a constraint on the maximum amount of reordering that can be applied to any frame in a coded video sequence. By comparing the latency status of each frame in the DPB to the value of the maximum latency constraint, a decoder can determine when the maximum latency limit has been reached, and can immediately output any frame that has reached this limit. It is asserted that this can enable the decoder to more rapidly identify frames that are ready for output than with the current syntax for a variety of video encoding structures that includes typical cases. It is also asserted that directly expressing such a limit on the amount of reordering latency allowed through the encoding-decoding process would be a useful characteristic to be established for system-level negotiation and characterization purposes.
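The output rule described above can be sketched as follows. This is only an illustration under stated assumptions: the names `max_latency_frames` and `latency_count` are hypothetical (the notes do not give the proposed syntax element name), the counter is assumed to increase by one for each subsequently decoded frame, and ordinary DPB-fullness bumping is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    poc: int                # position in output order
    latency_count: int = 0  # frames decoded since this one was decoded


@dataclass
class Dpb:
    max_latency_frames: int                      # hypothetical name for the signalled limit
    waiting: list = field(default_factory=list)  # decoded frames not yet output

    def decode(self, poc):
        """Add a newly decoded frame; return the POCs that can be output now."""
        for f in self.waiting:
            f.latency_count += 1
        self.waiting.append(Frame(poc))
        out = []
        # While any waiting frame has reached the limit, output frames in
        # output order (smallest POC first) so that ordering is preserved.
        while any(f.latency_count >= self.max_latency_frames for f in self.waiting):
            first = min(self.waiting, key=lambda f: f.poc)
            self.waiting.remove(first)
            out.append(first.poc)
        return out


def run(decode_order, max_latency_frames):
    """Return the list of output POCs produced after each decoded frame."""
    dpb = Dpb(max_latency_frames)
    return [dpb.decode(poc) for poc in decode_order]
```

For example, `run([0, 2, 1, 4, 3], 2)` outputs frame 0 as soon as the third frame is decoded, without waiting for the DPB to fill.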
The existing frame reordering syntax places no limit on latency. The contribution proposes new syntax in the VUI in the SPS, which would allow low-delay decoders to output frames more quickly. It is not intended to affect normative decoder behavior; it only impacts output order. Questions were raised about the interaction with the bumping HRD. The HRD description may need to change.
Application specifications may want to place limits on the value of the new syntax element.
Software was not available. Text is available. We don't have VUI yet in the spec.
Promising concept. Experts need more time to consider.
6.10.1.1.1.1.1.1.2 JCTVC-F158 Resolution switching for coding efficiency and error resilience [T. Davies (Cisco)]
A method for changing the resolution of frames within a sequence without causing an IDR or new SPS to be sent is proposed. Frames may be predicted across resolutions by re-scaling reference pictures in a similar manner to H.263 Annex P. The purpose is to allow video communications to use re-scaling to adapt seamlessly to adverse network conditions. It is remarked that in HEVC Intra frames are relatively more expensive than in AVC, and forcing the use of Intra frames can worsen losses and increase delays. It is reported that predicting instead of inserting an IDR frame when down-scaling gives average gains of 5.4–5.6% in Low-Delay common conditions, with 15.6–16.4% gain in Class E. When the size of intra frames is reduced to emulate a video conferencing model, average gains of 6.9–7.4% are reported, with 20.5–21.6% gain in Class E.
Basically conceptually analogous to H.263 Annex P (without the general warping).
Fast start by sending a lower-resolution I frame.
Restricted (as tested) to 1:2 and 2:1 with particular 4-tap filtering.
Temporal MV prediction is disabled across resolutions.
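As an illustration of the 2:1 reference re-scaling step, a minimal sketch follows. The filter taps `(1, 3, 3, 1)/8`, the edge clamping, and the function names are assumptions made for this example only; the notes do not state the actual 4-tap filter used in the contribution, and motion vector handling across resolutions is not modelled.

```python
TAPS = (1, 3, 3, 1)  # assumed taps (sum 8); not taken from the contribution


def downscale_row(row):
    """Halve the length of one row of samples using a 4-tap filter."""
    n = len(row)
    out = []
    for i in range(n // 2):
        acc = 0
        for k, tap in enumerate(TAPS):
            # The taps straddle samples 2i and 2i+1; clamp at the picture edges.
            pos = min(max(2 * i - 1 + k, 0), n - 1)
            acc += tap * row[pos]
        out.append((acc + 4) >> 3)  # round, then divide by 8
    return out


def downscale_picture(pic):
    """Filter horizontally, then vertically, giving 2:1 in each dimension."""
    rows = [downscale_row(r) for r in pic]
    cols = [downscale_row(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

A flat picture stays flat after filtering, which is a quick sanity check that the taps are normalised correctly.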
A particular method was used for how to measure the BD effects, and with assumptions about QP relationships.
The PSNR was measured on the downscaled picture relative to a downscaled original picture.
Different methods of trying to measure the impact would produce quite different measurements.
This is more than just high-level syntax – it is a substantial additional coding tool.
Somewhat similar to a scalable coding layer switch.
A participant asked whether it is clearly better to switch down the resolution rather than to just increase the QP. The proponent said that it is better in perceptual terms, although perhaps not so much in PSNR, depending on the resolution relationship, bit rate assumptions, etc.
It was remarked that an effort in the IETF called RTCweb (http://tools.ietf.org/wg/rtcweb/) has included related discussions, with general scaling ratios.
It was remarked that this has some relationship with scalability (and perhaps JCTVC-F618 in particular).
It was generally supported that it is desirable to try to do something to avoid I frames for resolution switching.
Further thought seems to be needed to determine what is really needed. A participant suggested trying to think further about that before trying to adopt a particular approach. It was suggested that this may be more potentially useful in some applications than others. Some participants indicated that this is not really a change of scope but is simply a candidate way of addressing existing requirements.
For further study (e.g. AHG).
6.10.1.1.1.1.1.1.3 JCTVC-F551 Cross-check report for JCTVC-F158 on resolution switching [A. Gabriellini, M. Mrak (BBC)] [late upload 07-05]
The software was checked as well as run. No extra analysis or commentary was provided.
6.10.1.1.1.1.1.1.4 JCTVC-F201 High-level Syntax: Temporal Information Decoding Refresh [B. Li (USTC), J. Xu (Microsoft), H. Li (USTC)]
(Presentation chaired by J. Boyce.)
Temporal MVP brings some bit rate savings to HEVC, but can break parsing robustness. The parsing problem has been analyzed in many proposals. This contribution proposes a mechanism to provide correct MV at some recovery point as a tradeoff between coding efficiency and MV accessibility.
The contribution proposed a concept referred to as a "temporal MVP IDR" (TIDR), to reset the state of temporal motion vector prediction for all pictures in the DPB.
A 0.4% bit rate loss was reported using the proposed method, using TIDR every 8 frames.
A TIDR access unit type is defined. A new NAL unit type value can be added.
A participant asked what the normative decoder behaviour is. The response was that the decoder resets its state to ensure that no temporal vectors are available anymore.
It was remarked that this should probably be restricted to frames with temporal_id equal to 0.
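The reset behaviour described in the response above might be sketched as follows. No decoding process text was provided, so this is only one possible reading: the class, field, and function names are hypothetical, and the motion field layout is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RefPicture:
    poc: int
    motion_field: dict                   # block index -> (mvx, mvy); hypothetical layout
    temporal_mvs_available: bool = True  # may this picture supply temporal MV predictors?


def on_tidr(dpb):
    """On a TIDR picture, reset temporal MVP state for every picture in the DPB."""
    for pic in dpb:
        pic.temporal_mvs_available = False


def temporal_candidate(col_pic, block):
    """Return the collocated MV, or None when it must not be used."""
    if not col_pic.temporal_mvs_available:
        return None
    return col_pic.motion_field.get(block)
```

After `on_tidr(dpb)`, `temporal_candidate` returns `None` for every picture that was in the DPB, so parsing of later pictures no longer depends on motion data decoded before the TIDR point.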
No text was provided for the decoding process change.
A participant suggested using a flag in the slice layer instead.
This contribution is potentially related to contribution JCTVC-F427.
The contribution has identified a problem. Further study was encouraged.
6.10.1.1.1.1.1.1.5 JCTVC-F135 Analysis of Multi-core Processing approaches [V. Sze, M. Budagavi, M. Zhou (TI)]
(Information contribution.)
Low power and high frame rate/resolution requirements for future video coding applications reportedly make the need for parallelism in the video codec implementation ever more important. Several methods have been proposed to enable high-level parallelism on multi-core architectures. This contribution describes the differences between regular slices, entropy slices, interleaved entropy slices/wavefront parallel processing, and Tiles. It provides an analysis of these tools based on throughput, coding efficiency, implementation complexity, and latency.
A comment made during the presentation was that "tiles" (according to some proposals, at least) do not have as much header-level overhead as may have been assumed in this analysis.
6.10.1.1.1.1.1.1.6 JCTVC-F491 High level syntax for scalability support in HEVC [T. Rusert, R. Sjöberg, P. Fröjdh, Z. Wu (Ericsson)]
This contribution proposes high-level syntax changes to HEVC aiming at making inclusion of future scalability extensions straightforward without changing the NAL unit headers. It is proposed to introduce a fixed-length NAL unit header that includes seq_parameter_set_id, such that all VCL and non-VCL NAL units that are associated with a certain scalable layer are assigned the same value of seq_parameter_set_id. Furthermore, it is proposed to introduce syntax for signalling of both dependencies between scalable layers and properties of the respective layers into the SPS. Thus according to the proposed scheme, the seq_parameter_set_id serves as general scalable layer identifier, whereas the associated SPSs indicate both dependencies between layers and respective layer properties. Consequently, it is proposed to move temporal_id (which is carried in the NAL unit header in the current HM design) into the sequence parameter set. The proposal claims these changes make it possible to signal dependencies within scalable video representations in an extensible way. It also claims that new scalability extensions can be defined and used together with old ones.
Proposes a 3-byte NAL unit header.
Proposes adding an SPS index to the NAL unit header and removing the temporal layer ID from it (instead placing that in the SPS). Also proposes to remove output_flag (instead placing that in the slice header).
It was remarked that this would require parsing of SPS content to understand what to do with the bitstream. Possibly the SPS index could double as a priority ID.
The proposed NAL unit header just contains forbidden_zero_bit, nal_ref_idc, nal_unit_type, and sps_id (16 bits).
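A 3-byte layout consistent with the fields listed above can be sketched as follows. Only the 16-bit width of sps_id is stated in these notes; the widths assumed for the other fields (1 bit for forbidden_zero_bit, 2 bits for nal_ref_idc, 5 bits for nal_unit_type) follow the AVC NAL unit header and are an assumption here, as are the field order and the function names.

```python
import struct


def pack_header(nal_ref_idc, nal_unit_type, sps_id):
    """Pack the assumed 3-byte header: 1 + 2 + 5 bits, then a 16-bit sps_id."""
    if not (0 <= nal_ref_idc < 4 and 0 <= nal_unit_type < 32 and 0 <= sps_id < 65536):
        raise ValueError("field out of range")
    # forbidden_zero_bit is the top bit of the first byte and is always 0.
    first = (nal_ref_idc << 5) | nal_unit_type
    return struct.pack(">BH", first, sps_id)


def parse_header(data):
    """Parse the first three bytes of a NAL unit under the same assumptions."""
    first, sps_id = struct.unpack(">BH", data[:3])
    return {
        "forbidden_zero_bit": first >> 7,
        "nal_ref_idc": (first >> 5) & 0x3,
        "nal_unit_type": first & 0x1F,
        "sps_id": sps_id,
    }
```

With this layout a network element can read the layer identifier at a fixed offset, but, as remarked in the discussion, it would still need the SPS content to interpret what that identifier means.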
Further study was encouraged.
6.10.1.1.1.1.1.1.7 JCTVC-F714 High-level syntax mismatches between WD and HM [Q. Shen, Y.-K. Wang (Huawei), K. Sühring (Fraunhofer HHI)] [late reg. 07-11, upload 07-11]
This document provides an analysis of high-level syntax related mismatches, including both syntax mismatches and behavior mismatches, between the HEVC WD (JCTVC-E603_d8) and the reference software HM-3.2.
The v3 submission was reviewed.
It was agreed that generally, the draft should reflect our design intent; the software can deviate as necessary for aspects not yet implemented such as ref pic marking and ref list construction. Particular aspects:
- use_mrg_flag – reflect decisions recorded elsewhere.
- POC issues – not important to align at this time.
- deblock control – keep the AVC-like syntax, add remarks in WD and ref software flagging for further work.
- cu_qp_delta_enabled_flag – ue(v), 0 = no delta QP, 1 = LCU, 2 = (LCU/2)x(LCU/2), etc.
Decision: Agreed.
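The cu_qp_delta_enabled_flag semantics agreed above can be illustrated with a short sketch. The ue(v) decoding follows the usual unsigned Exp-Golomb rule; reading "etc." as a further halving of the block size for each increment is an assumption, and the function names and the bit-string input format are chosen only for this example.

```python
def read_ue(bits):
    """Decode one ue(v) value from a string of '0'/'1'; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    # The codeword is 'zeros' leading zeros, a 1, then 'zeros' information bits.
    value = int(bits[zeros:2 * zeros + 1], 2) - 1
    return value, bits[2 * zeros + 1:]


def qp_group_size(value, lcu_size):
    """Map the decoded value to the delta QP block size, or None when disabled."""
    if value == 0:
        return None                  # 0 = no delta QP
    return lcu_size >> (value - 1)   # 1 = LCU, 2 = LCU/2, and so on (assumed)
```

For a 64x64 LCU, the codeword `011` decodes to 2, which gives delta QP signalling at 32x32 granularity.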