HL syntax common issues for range extensions, 3D, SHVC, and single-layer HEVC coding (11)
Auxiliary pictures (6)
See also section 6.5 (SEI and VUI).
It was noted that we have syntax and semantics for interpreting auxiliary pictures for alpha, but do not have a specification of how to interpret the alpha data. At the previous meeting, it was said that this should be defined in SEI. See additional notes elsewhere on this topic.
A general question raised by P0137, P0207, and P0071 is how to determine what decoder capabilities are needed for decoding auxiliary pictures. See also P0137 in that regard.
We need a way to convey the syntax constraints and decoding process needed for auxiliary pictures. This was agreed, but needs to be solved by further work.
The need to establish rules for future allocation of codepoints was noted. This was the subject of joint discussion with the parent bodies, and the outcome is reported in section 7.1.
We note that the types identified as "unspecified" may be accompanied by an additional indicator (e.g., SEI / VUI) that explains the use.
The possibility of wanting to have multiple instances of a particular type (e.g., multiple alpha planes for different purposes or even multiple depth maps for different purposes) was discussed. It was remarked that this relates to contribution P0135.
JCTVC-P0065 Guided transcoding using auxiliary pictures [K. Andersson, Y. Rai, T. Rusert, R. Sjöberg, J. Samuelsson (Ericsson)]
Discussed 01-09 p.m. (GJS).
At the Geneva meeting it was proposed in JCTVC-O0127 to enable "guided transcoding" using SHVC. This time it is proposed to enable guided transcoding for both spatial and SNR scalability using auxiliary pictures without normative low-level changes. With this approach, the higher fidelity is represented in the base layer and the side information for generation of another resolution/fidelity is represented as an auxiliary picture. The results reportedly show gains compared to simulcast coding of 21.5%/26.2% for all intra, 15.1%/18.9%/13.2% for random access, and 13.9%/17.6%/11.7% for low delay 2x/1.5x/SNR scalability for the common conditions. In comparison to traditional transcoding (re-encoding to another resolution/fidelity), the proposed guided transcoding reportedly reduces the transcoding time significantly, to a level comparable with the decoding time, and incurs less loss in BD BR performance relative to single-layer coding. The proposed modification to the specification is to add a new auxiliary picture type AUX_TRANSCODING and an informative note on how to use a primary and auxiliary picture to enable guided transcoding.
The auxiliary picture would be coded at the target resolution of the transcoding process. The syntax would be ordinary syntax but the transform coefficients are meant to be discarded.
It was asked how much gain could be obtained if the syntax were modified so that no coefficients are sent instead of sending "junk" coefficients. This was suggested to be perhaps 5–10% of the size of the auxiliary picture data (and perhaps 2–3% of the total data), but this was said to be only a rough guess.
The downsampling process would probably need to be known, and the use of relatively short filters was suggested for this.
The transcoder could instead just do a full re-encode with R-D search instead of using the guidance provided in the auxiliary picture. But if the encoder generates a reference picture using some method different from what was anticipated in the semantics, there would be drift which could make subsequent data substantially less relevant.
It was remarked that potentially just having a proprietary user-defined type of auxiliary picture (for which signalling is already anticipated) might be a way to enable this.
Some generalization of filter description, such as a filter description SEI message, was a suggested way to handle the downsampling filter. However, it was asked whether transcoders would really want to support a general family of filtering or would only want to implement a specific method – and perhaps just discard all data accompanied by an indication different than their favourite filter.
No specific filter description method was proposed.
Some experiment results relative to simulcast and relative to SHVC were provided. It was asked whether there was really a benefit relative to SHVC. The proponent said it was a matter of prioritization of whether the desire is having good fidelity for the enhanced resolution or good fidelity for the base layer. The prioritization of the layers with SHVC was also noted to be a relevant issue.
The relevance of the data in terms of potential drift relative to the original encoder would also depend on the requantization process (including QP and other aspects). If that process is fully specified to avoid this issue, the transcoder would be able to produce output at only one specific bit rate.
Feedback to the original encoder was suggested as a way to perform bit rate control; however, it was remarked that this again ends up having the usefulness of the feature depend on some (potentially undocumented and proprietary) external technology – so it was suggested that simply having the auxiliary picture indicated as a proprietary type of auxiliary picture might suffice.
It was agreed that, pending further input, no action seems (currently) justified to enable this other than defining an auxiliary picture type indicated as "unspecified" (already currently drafted as type codes between 128 and 143, inclusive).
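As a minimal sketch of the range mentioned above, the check below tests whether an auxiliary type code falls in the drafted "unspecified" range of 128 to 143, inclusive. Only the range bounds come from these notes; the function and constant names are hypothetical and are not taken from the draft text.

```python
# Bounds of the "unspecified" auxiliary picture type range as drafted
# at this meeting (128 to 143, inclusive). Names are illustrative only.
AUX_UNSPECIFIED_FIRST = 128
AUX_UNSPECIFIED_LAST = 143


def is_unspecified_aux_type(aux_id: int) -> bool:
    """Return True if aux_id lies in the drafted "unspecified" range."""
    return AUX_UNSPECIFIED_FIRST <= aux_id <= AUX_UNSPECIFIED_LAST


def unspecified_range_size() -> int:
    """Number of code points in the range (both bounds included)."""
    return AUX_UNSPECIFIED_LAST - AUX_UNSPECIFIED_FIRST + 1
```

With inclusive bounds the range holds 16 code points, which agrees with the count noted in the discussion of P0135.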
Further study was encouraged. If further study results in new information at some point in the future to resolve the "gaps" in the concept, this could be reconsidered for action.
JCTVC-P0299 Cross-check report of JCTVC-P0065: Guided transcoding using auxiliary pictures [D. Bugdayci, K. Ugur (Nokia)] [late]
JCTVC-P0071 MV-HEVC/SHVC HLS: On auxiliary pictures [B. Choi, Y. Cho, M.W. Park, J.Y. Lee, H. Wey, C. Kim (Samsung)] [late]
Discussed 01-09 p.m. (GJS).
Signalling the auxiliary picture types used for a CVS in VPS extension is proposed for session negotiation and sub-bitstream extraction.
It was commented that in the current VPS syntax, we have an "aux ID" that corresponds to an auxiliary picture type, and it was suggested that this may be sufficient to provide the intended functionality.
No action was taken on that aspect.
Additionally, for efficient layer-dependency signalling, layer dependency information, direct_dependency_flag[ i ][ j ] and direct_dependency_type[ i ][ j ], are proposed to be signalled by grouping a layer with primary pictures and layers with their associated auxiliary pictures, when they have the same layer dependencies.
It was remarked that the potential bit rate savings for this are minimal (no estimate was provided, but this seemed basically true). It was also commented that it may not be clear that the auxiliary pictures would have the same type of inter-layer dependency characteristics as the non-auxiliary pictures. The proponent suggested considering the multiview case with depth map auxiliary pictures, wherein each view may be accompanied by a depth map and inter-layer referencing may have the same characteristics – and asserted that this is a common test case used in experiments.
It was noted that the proposal would retain the current type of dependency indication – this would add an additional type of dependency indication as an alternative rather than simplifying the syntax.
In the absence of an understanding that the number of bits saved would be significant, there seemed to be no interest from non-proponents, so no action was taken on this aspect either.
Furthermore, some constraints for slice header parameters of auxiliary pictures were proposed.
Proposed constraint 1:
- When present, the values of the slice segment header syntax elements pic_output_flag, no_output_of_prior_pics_flag, slice_pic_order_cnt_lsb, discardable_flag, cross_layer_bla_flag, and poc_reset_flag shall be the same in all slice segment headers of a primary coded picture and the associated auxiliary coded pictures.
It was remarked that one-way constraints may be more reasonable for some of these – establishing a constraint on the auxiliary as a function of what is happening in the primary.
It was also asked to consider each specific constraint – for each specific syntax element, and think about its reason for being constrained. Each seems to have its own characteristics, and detailed reasoning was not provided in the contribution.
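To make proposed constraint 1 concrete, the following sketch checks the symmetric form of the constraint as written in the contribution, not the one-way variant raised in discussion. The data model is an assumption for illustration: each slice segment header is represented as a hypothetical dictionary mapping syntax element names to parsed values, and an element that is absent from a header is simply skipped, mirroring the "when present" wording.

```python
from typing import Dict, List

# Syntax elements named in proposed constraint 1.
CONSTRAINED_ELEMENTS = (
    "pic_output_flag",
    "no_output_of_prior_pics_flag",
    "slice_pic_order_cnt_lsb",
    "discardable_flag",
    "cross_layer_bla_flag",
    "poc_reset_flag",
)

# Hypothetical representation: element name -> parsed value.
SliceHeader = Dict[str, int]


def find_mismatches(primary: List[SliceHeader],
                    auxiliaries: List[SliceHeader]) -> List[str]:
    """Return the constrained elements whose values are not all equal.

    primary holds the slice segment headers of the primary coded picture;
    auxiliaries holds those of its associated auxiliary coded pictures.
    """
    mismatched = []
    for name in CONSTRAINED_ELEMENTS:
        # Collect the distinct values seen wherever the element is present.
        values = {hdr[name] for hdr in primary + auxiliaries if name in hdr}
        if len(values) > 1:
            mismatched.append(name)
    return mismatched
```

An empty result means the headers satisfy the proposed constraint. A per-element, one-way rule would need a separate check for each syntax element, which is in line with the request to consider each element on its own.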
Proposed constraint 2:
- For all slices of auxiliary pictures, slice_sao_luma_flag and slice_sao_chroma_flag shall both be equal to 0; or
- When AuxId[ lId ] is equal to AUX_DEPTH, slice_sao_luma_flag and slice_sao_chroma_flag shall both be equal to 0
It was remarked that if such a constraint is needed, it should be part of a profile constraint and specific to particular auxiliary picture types.
As applied to depth maps, this should be considered a JCT-3V issue.
Proposed constraint 3:
- For all slices of auxiliary pictures, slice_deblocking_filter_disabled_flag shall be equal to 0; or
- When AuxId[ lId ] is equal to AUX_DEPTH, slice_deblocking_filter_disabled_flag shall be equal to 0
It was remarked that if such a constraint is needed, it should be part of a profile constraint and specific to particular auxiliary picture types.
As applied to depth maps, this should be considered a JCT-3V issue.
JCTVC-P0092 MV-HEVC/SHVC HLS: Proposal for supporting optional overlays with help of auxiliary pictures [N. Stefanoski, O. Wang, A. Smolic (DRZ), T. Szypulski (ESPN)]
Presented in joint VC+3V session 01-14 (GJS & JRO).
This proposal is based on JCTVC-O0358/JCT3V-F0057, which proposed to realize a functionality of "optional overlays" with the use of an SEI message and different views of MV-HEVC.
It was suggested that "selectable" might be a better name than "optional" to clarify the intent.
Auxiliary picture types have since become the envisioned method for which the interpretation can be specified by an SEI message. In this document, a revised version of the SEI message presented in JCTVC-O0358/JCT3V-F0057 is proposed to provide the functionality of "optional overlays" with use of auxiliary pictures instead of views.
The overlay scheme uses three auxiliary pictures per overlay: texture, "label", and alpha.
Can overlays have different size than the accompanying video? Yes.
Additional requirement for buffers? Not significant, as only the overlay that is currently displayed needs to be fully decoded.
Strings are proposed to be sent (using UTF-8) as user-identifiable names for the selectable elements. It was remarked that these should have variable length.
As described, "label id" is a luma value associated with a label and "label offset" defines a tolerance range around the value.
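The label matching described above can be sketched as follows. This is an illustration under assumptions rather than the proposed semantics: the tolerance range is taken to be symmetric and inclusive around the label value, and the function name and the list-of-rows picture representation are hypothetical.

```python
from typing import List


def label_mask(label_plane: List[List[int]],
               label_id: int,
               label_offset: int) -> List[List[bool]]:
    """Mark the samples of a label picture that belong to one label.

    Assumption: a sample matches when its luma value lies within
    label_offset of label_id, i.e. |sample - label_id| <= label_offset.
    """
    return [[abs(sample - label_id) <= label_offset for sample in row]
            for row in label_plane]
```

A tolerance of this kind would let the label survive lossy coding of the label picture, since decoded luma values need not equal the nominal label value exactly.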
It was remarked that there had been prior parent-level review of the concept without objection.
See also notes on P0135.
At the concept level, it was agreed to plan to support such a capability. Further study was needed to work out details.
JCTVC-P0135 MV-HEVC/SHVC HLS: Auxiliary pictures for multiple overlays [J. Boyce, S. Wenger (Vidyo)]
Presented in joint VC+3V session 01-14 (GJS & JRO).
Changes to the VPS extension and a new SEI message are proposed to support overlay pictures with individually controllable overlay elements using auxiliary pictures, to enable the use case described in contribution JCTVC-O0358 and expand the use case to support multiple overlay pictures. In the VPS, an aux_type syntax element is proposed to be explicitly signalled, rather than inferring the type from the AuxId. Three new aux type values are proposed to represent overlay content, overlay layout, and overlay alpha. An overlay info SEI message is proposed to describe the overlays, by indicating the layer_id values of the various aux type layers, and providing overlay layout mapping parameters.
The segmentation map (here called a "layout"; called a "label" in P0092) is proposed to be sharable among defined sets with different texture overlay content.
It was noted that, with the current syntax, it is possible to use layer ID to enable sending multiple auxiliary layers with the same Aux type associated with the same primary picture.
Aside from the modified VPS syntax, it was suggested to define specific types rather than using the "unspecified" range of auxiliary type identifiers.
The primary purpose of having some kind of enumerated auxiliary type is to be usable in session negotiation.
It was suggested that it might not be necessary to use three different type codes rather than using one and having other information identify the sub-type of pictures within the overlay scheme.
Currently, we have assigned 16 values to the "unspecified" range.
Interest was expressed in the functionality. There was some question of whether this should be specified using auxiliary pictures in the video bitstream versus at the systems level.
A proponent indicated that defining this information in the video bitstream can simplify usage, e.g., for video editing, and that such schemes are used in product applications.
It was asked what our rules should be for assigning enumeration type codes to auxiliary picture types.
It was planned to raise the topic for parent-level discussion. See section 7.1.
JCTVC-P0207 RExt/MV-HEVC/SHVC: On Auxiliary Alpha Plane Pictures [K. Misra, S. Deshpande, A. Segall (Sharp)]
Discussed 01-09 p.m. (GJS).
This contribution proposes a bitstream constraint that requires alpha plane auxiliary pictures associated with IDR primary pictures to also be IDR pictures. The proposed constraint guarantees that if random access is performed at the primary picture IDR then the corresponding alpha plane auxiliary picture would also be a random access point.
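The proposed constraint can be expressed as a simple per-access-unit check. The sketch below is illustrative only: the function name and its boolean inputs are hypothetical, and it assumes the constraint is vacuous when no alpha auxiliary picture accompanies the primary picture.

```python
from typing import Optional


def alpha_idr_constraint_ok(primary_is_idr: bool,
                            alpha_is_idr: Optional[bool]) -> bool:
    """Check the proposed constraint for one access unit.

    alpha_is_idr is None when no alpha plane auxiliary picture is
    associated with the primary picture (assumed to satisfy the rule).
    """
    if alpha_is_idr is None or not primary_is_idr:
        # The constraint only applies to an alpha picture that
        # accompanies an IDR primary picture.
        return True
    return alpha_is_idr
```

The check is one-directional: an IDR alpha picture paired with a non-IDR primary picture is not ruled out by the constraint as described.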
In revision 1 the proposed bitstream constraint language is modified with change marks.
It was remarked that if we add such a constraint, additional details may be desirable – e.g., also saying that if the primary picture is IRAP, the auxiliary must be IRAP.
It was remarked that some constraints that may be applied in all cases would resolve particular cases involving alpha pictures.
It was remarked that we have other cases where we envision having an IDR picture in a base layer and no IRAP in an enhancement layer and therefore do not apply such a constraint (e.g., for layer-wise start-up).
The question is whether the spirit of the intent is to require the alpha to always be decodable along with the primary picture whenever the alpha is present, including for purposes of random access.
It was suggested that this seems like the sort of constraint we might specify if alpha decoding is part of a profile capability, but may not be necessary if alpha decoding capability is not profiled. No action was taken for this reason.
JCTVC-P0122 On chroma auxiliary pictures [K. Ugur, D. Bugdayci, M. M. Hannuksela (Nokia)]
Discussed 01-09 p.m. (GJS).
Presentation was initially deferred for coordination with section 6.5.2.
Discussed 01-16 p.m. (GJS).
In version 3 of the contribution, an alternative version of the specification text was proposed, which would specify, through an SEI message, that a pair of auxiliary layers associated with an unspecified AuxId value would carry chroma enhancement pictures.
P0209 proposed a combination of RExt and SHVC for chroma enhancement. It was deferred.
This proposal uses monochrome coding of the auxiliary pictures, which is a capability not enabled in v1 profiles. Alpha and depth raise that same issue.
No objections were raised to the proposal at this time, and the latest "option 2" variant seemed mature.
Discussed 01-17 a.m. (JRO).
An advantage claimed was backward compatibility with 4:2:0 decoding, "simulcasting" the chroma, with an estimated bit rate increase of 15%.
4:2:0 decoding would be normative, while 4:4:4 would not have a normative decoding process.
One expert said that the value is unclear.
Another expert said that external bodies could use this for application standards.
It seemed unclear which application domains would benefit and what the associated requirements would be.
The topic was deferred for consideration at the next meeting, with a plan to bring it again to the attention of parent bodies.