3D-HEVC HLS (17):
NAL unit header and parameter sets
A comment was made that we may need the VPS concept due to the commonality of the majority of the relevant proposals.
The motivation for having a VPS was questioned, as well as which syntax elements could potentially be put in the VPS.
- The view dependencies are signalled in the VPS, so they need not be present in multiple SPSs. This also includes the view component order.
- The level signalled in the SPS might not be accurate enough, especially for the multiview case: multiple operation points may require multiple profile/level indications, whereas the current SPS carries only one.
- The short-term RPSs may be shared by multiple views, since the decoder buffer management of multiple views is typically aligned.
- Temporal scalability related info. can also be moved to the VPS. Temporal sub-bitstreams can have different info. related to buffer management.
- Representation format info., e.g., including bit depth, chroma sampling format, and resolution of the sequences.
It was asked whether it is beneficial to have the VPS design given the current SPS design in the HEVC base specification.
- One comment was made that even if the base spec. contains only SPS and not VPS, the VPS can be useful for extensions.
- One comment was made that similar functionality can be supported with the proposals in m24817 and m24818.
- Associating an SPS with each layer is convenient for extraction: the extracted result may be cleaner, meaning it doesn’t contain non-VCL NAL units that belong to a layer which is not in the sub-bitstream.
- This can be addressed by the scalable (multiview) layer not present SEI.
- One comment was that the dependency between SPSs may raise an error robustness issue, but it seems that this might not be a problem.
- A "middle box" needs to collect all the SPSs and perform de-referencing of all the SPSs to perform the bitstream extraction. It was commented that the de-referencing process is an overhead compared to other VPS designs.
- Activation of multiple SPSs might not be lightweight, but this potentially can be solved by sending the dependency info. in an SEI message.
- Question: Are the SPS ids in the slice headers of the base and enhancement views equal to each other? This is not fully described. If multiple views use the same SPS id, the SPS has to be transmitted for each of the views; any slice that needs to refer to different SPS content must refer to an SPS with a different id, unless the content of the SPS with the original id is changed. If different SPSs are used across layers, the problem doesn’t exist. If an SPS with a different id is used by a slice, an additional PPS is consequently also needed.
- A suggestion which sounds agreeable is either to take no action or to adopt the VPS as a concept and discuss further what syntax elements/functionalities to put into it, e.g., during the next meeting.
- The majority of the participants agreed to adopt the VPS concept, even though there was no common understanding of the exact VPS design.
Decision: Adopt VPS concept into the HTM software, and further study this concept. Further discussion was suggested regarding whether we want to include this in the HTM Working Draft, which is currently unavailable.
Vidyo volunteered to provide some implementation of a VPS, wherein at least the view dependency can be implemented. Note that the view dependency is needed for reference picture list construction.
It is suggested that a document should be produced to describe what is included in the software.
It is also agreed that the picture parameter set, reference picture sets, reference picture list construction, and slice header are all more or less related to parameter sets, including the video parameter set, sequence parameter set, as well as potential SPS subsets. We may make decisions on proposals related to those aspects after the VPS design matures.
m24714 | Jill Boyce, Danny Hong, Wonkap Jang | 3D-HEVC HLS: Parameter sets modifications for extension hooks
This document proposes changes to both the HEVC base specification and the extensions (including 3DV).
The video parameter set (VPS) concept was proposed.
It was proposed that temporal scalability related info. is moved from SPS to VPS.
Each layer is uniquely identified by a layer_id.
It was proposed to rename reserved_one_5bits as layer_id_plus1. These five bits can be used to identify other dimensions of scalability.
The mapping of layer_id to other dimension information is in the video parameter set.
A comment was made that depth_flag might be always needed in the VPS design for a certain extension type.
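The renaming and the VPS-based mapping described above can be illustrated with a minimal sketch. It assumes a two-byte NAL unit header in which the second byte carries a 3-bit temporal_id followed by the 5-bit field renamed to layer_id_plus1; this bit layout, the `NalHeader` type, and the contents of `VPS_LAYER_MAP` are illustrative assumptions, not syntax taken from the contribution.

```python
from typing import Dict, NamedTuple


class NalHeader(NamedTuple):
    nal_ref_flag: int
    nal_unit_type: int
    temporal_id: int
    layer_id_plus1: int  # formerly reserved_one_5bits


def parse_nal_header(byte0: int, byte1: int) -> NalHeader:
    """Split a two-byte NAL unit header (assumed layout, see text above)."""
    if byte0 & 0x80:
        raise ValueError("forbidden_zero_bit must be 0")
    return NalHeader(
        nal_ref_flag=(byte0 >> 6) & 0x1,
        nal_unit_type=byte0 & 0x3F,
        temporal_id=(byte1 >> 5) & 0x7,
        layer_id_plus1=byte1 & 0x1F,
    )


# Hypothetical VPS content: layer_id -> other scalability dimensions.
VPS_LAYER_MAP: Dict[int, Dict[str, int]] = {
    0: {"view_id": 0, "depth_flag": 0},
    1: {"view_id": 0, "depth_flag": 1},
    2: {"view_id": 1, "depth_flag": 0},
}


def layer_properties(header: NalHeader,
                     vps_layer_map: Dict[int, Dict[str, int]]) -> Dict[str, int]:
    """Look up the dimensions of a NAL unit's layer in the VPS mapping."""
    layer_id = header.layer_id_plus1 - 1
    return vps_layer_map[layer_id]
```

With such a mapping the NAL unit header itself stays generic; only the VPS needs to know which dimensions a given layer_id stands for.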
Syntax elements proposed to be removed from the SPS included max_temporal_layers_minus1 and temporal_id_nesting_flag.
Other syntax elements might also be removed.
The view dependency structure is included in the VPS.
Information for the operation points is not included in the VPS.
One aspect of the proposal is to rename the reserved_one_5bits as layer_id_plus1.
No change of the SPS of the base specification was proposed.
Inter-SPS prediction: some redundancy of SPSs is identified. The current HTM software has one SPS for texture and one SPS for the depth for each view.
For each random access based tune-in, the SPSs need to be transmitted, thus in some cases it is good to save bits for SPSs.
view_ref_layer_present_flag and delta_view_ref_layer_minus1 are signalled for the SPS prediction.
view_ref_layer_present_flag indicates whether the prediction of SPS content is allowed.
delta_view_ref_layer_minus1 specifies the layer_id of a reference layer.
The advantage of this proposal over the hierarchical VPS/SPS design is that it can be complementary, covering the common syntax elements which won’t be present in the VPS. This proposal doesn’t provide partial prediction of syntax elements, so except for the profile and level, most of the syntax elements are simply copied from the SPS of the reference layer.
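The inter-SPS prediction described above can be sketched as follows. The sketch assumes that the reference layer is found as layer_id minus (delta_view_ref_layer_minus1 + 1) and that a predicted SPS carries only its own profile and level explicitly; both points, as well as the dictionary representation and the field names, are assumptions made for illustration.

```python
from typing import Any, Dict


def resolve_sps(layer_id: int,
                sps_by_layer: Dict[int, Dict[str, Any]]) -> Dict[str, Any]:
    """Return the effective SPS fields of a layer after inter-SPS prediction."""
    sps = sps_by_layer[layer_id]
    if not sps.get("view_ref_layer_present_flag", 0):
        # No prediction: all fields are signalled explicitly.
        return dict(sps["fields"])
    # Assumed derivation of the reference layer from the signalled delta.
    ref_layer_id = layer_id - (sps["delta_view_ref_layer_minus1"] + 1)
    resolved = resolve_sps(ref_layer_id, sps_by_layer)
    # Everything is copied from the reference SPS, then the explicitly
    # signalled fields (here: profile and level) override the copies.
    resolved.update(sps["fields"])
    return resolved


sps_by_layer = {
    0: {"fields": {"profile_idc": 1, "level_idc": 90, "pic_width": 1024}},
    1: {"view_ref_layer_present_flag": 1,
        "delta_view_ref_layer_minus1": 0,
        "fields": {"profile_idc": 2, "level_idc": 93}},
}
```

Because whole SPS content is copied, there is no way in this sketch to inherit only some of the non-profile/level elements, which mirrors the limitation noted above.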
m24818 | Thomas Rusert | 3D-HEVC HLS: Signalling of layer identifiers and inter-layer decoding dependencies
The mapping of layer_id to view_id is added in SPS.
It was proposed that layer_id_plus1 be mapped to view_id and view_order_idx for the multiview case. view_id and view_idx may both be present in the SPS.
Signalling of decoding dependency is also proposed; for each layer, only up to one dependent layer is enabled.
Each texture and depth view has a different layer_id_plus1.
The prediction chain of the SPSs may be used to derive the view dependency.
delta_texture_ref_layer_minus1: indicates the dependency between the components inside a view.
For an SPS applicable to both texture and depth, the dependent view for texture and the dependent view for depth are signalled.
m24878 | Byeongdoo Choi, Jaehyun Kim, Jeonghoon Park | 3D-HEVC HLS: On NAL unit header
No change is proposed in this contribution for the HEVC base layer.
For all enhancement layers: the TID and the reserved_one_5bits are combined and can be further allocated.
A fixed 2-bit for SET and 6-bit for LID are proposed, with:
0: priority_id (6-bit)
1: view_id (3-bit), t_id (3)
2: dependency_id (3) quality_id (2): for this case, temporal scalability is not supported.
3: dependency_id (3), temporal_id (3)
A LUT of the mapping can be pre-defined or signalled in SEI or VPS.
For 3DV, the first bit can be used for depth_flag, with the remaining 5 bits used for view_id.
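The SET/LID split listed above can be sketched as a small parser. Field order within the byte (most significant bits first), the position of the unused bit when SET equals 2, and the function names are assumptions; the contribution may place the fields differently.

```python
def parse_set_lid(byte: int) -> dict:
    """Interpret the 8 bits of an enhancement-layer header as SET + LID."""
    set_id = (byte >> 6) & 0x3   # 2-bit SET
    lid = byte & 0x3F            # 6-bit LID
    if set_id == 0:
        fields = {"priority_id": lid}
    elif set_id == 1:
        fields = {"view_id": lid >> 3, "t_id": lid & 0x7}
    elif set_id == 2:
        # Only 5 of the 6 bits are used; the lowest bit is assumed spare.
        fields = {"dependency_id": lid >> 3, "quality_id": (lid >> 1) & 0x3}
    else:
        fields = {"dependency_id": lid >> 3, "temporal_id": lid & 0x7}
    return {"set": set_id, **fields}


def parse_lid_3dv(lid: int) -> dict:
    """3DV reading of the 6-bit LID: 1-bit depth_flag, then 5-bit view_id."""
    return {"depth_flag": (lid >> 5) & 0x1, "view_id": lid & 0x1F}
```

For example, `parse_set_lid(0x6A)` reads SET as 1 and splits the remaining six bits into a view_id of 5 and a t_id of 2.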
It was commented that care has to be taken to guarantee that any value of these 8 bits for an enhancement layer NAL unit header is different from possible values of the second byte of the NAL unit header in the HEVC base layer.
Start code emulation was not considered in this proposal.
m24943 | Ying Chen, Ye-Kui Wang | 3D-HEVC HLS: on NAL unit header
In the NAL unit header, reserved_one_5bits is used as layer_id_plus1. Texture and depth of the same view have the same value of layer_id_plus1. Different NAL unit types are used to distinguish texture and depth.
Four new NAL unit types are introduced for depth views (15-18: normal coded slice, IDR, CRA, TLA).
Parameter sets for texture and depth use a common parameter set ID space. It was commented that if the same parameter space is used, it is not possible to distinguish whether a parameter set is associated with texture or depth, thus if depth is discarded in the extraction process, it may not be possible to discard the associated parameter sets, which may cause overhead in the extracted sub-bitstream.
m24945 | Ying Chen, Ye-Kui Wang, Marta Karczewicz | 3D-HEVC HLS: Parameter sets for 3DV
The use of a VPS is proposed, and some parameters from SPS are moved to VPS. Operation point information is signalled in the VPS. layer_id_plus1 is used in the NAL unit header. It was proposed to be equivalent to view_idx_plus1 in the 3D-HEVC context.
The assumption is that new NAL unit types are used for depth, as proposed in m24943.
A view dependency syntax structure is proposed that is said to be similar to the dependency signalling in the MVC SPS extension.
It includes num_views_minus1, which may be redundant since it is equivalent to num_layers_minus1 (which is signalled in the VPS).
For each dependent view, view dependencies are signalled.
A flag is used to indicate whether the dependent view is put into the inter-layer RPS and thus used in the reference picture list construction. This way it is said that for a view with a given view_id, views with a lower value of view_id can be excluded from the reference picture list construction. It was said the proposed view dependency structure could be used either in the VPS or in the SPS.
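As a rough illustration of the flag described above, the following sketch selects which dependent views enter the inter-layer RPS of a given view. The table layout (a per-view list of reference view and flag pairs) and the name `inter_view_candidates` are hypothetical.

```python
from typing import Dict, List, Tuple


def inter_view_candidates(view_idx: int,
                          view_deps: Dict[int, List[Tuple[int, int]]]) -> List[int]:
    """Return the reference views that are placed in the inter-layer RPS.

    view_deps maps a view to (ref_view, in_rps_flag) pairs; only entries
    whose flag is set take part in reference picture list construction.
    """
    return [ref for ref, in_rps in view_deps.get(view_idx, []) if in_rps]


# Hypothetical dependency table: view 2 depends on views 0 and 1,
# but only view 1 is used for reference picture list construction.
view_deps = {1: [(0, 1)], 2: [(0, 0), (1, 1)]}
```

A view that is listed as a dependency but has its flag cleared still has to be decoded; it is merely not offered as a candidate when the reference picture lists are built.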
The VPS contains profile/level information, video format, max_temporal_layers and temporal_id_nesting_flag. The VPS is required both in the base specification and the enhancement specifications.
The vps_extension_flag indicates the VPS enhancement layer extension. In the VPS enhancement layer extension, num_layers_minus2 is signalled and a number of different video representation formats are signalled to indicate the possible different video formats in the coded video sequence. For each layer, an index to the representation format is signalled, and additionally a mapping from layer_id to dependency_id, etc. Mapping to view_id in the 3D-HEVC context is not necessary since layer_id and view_id are proposed to be equivalent.
It was commented that in a hypothetical combined multiview and scalability extension, a new layer_id mapping could be introduced in the VPS that would include a mapping to view_id.
Operation point information is signalled in the VPS, which is said to be similar to the operation point information in the view scalability information SEI message of MVC. For each layer, it includes temporal_id, depth_included_flag for each view, and a loop over the values of layer_id that are associated with the operating point.
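The operation point fields listed above could drive sub-bitstream extraction roughly as follows. This is a sketch under assumptions: the `Nal` and `OperationPoint` types are hypothetical, depth NAL units are assumed to be recognisable (for instance by NAL unit type), and layer_id is used as the per-view key because the proposal treats layer_id and view index as equivalent.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Nal(NamedTuple):
    layer_id: int
    temporal_id: int
    is_depth: bool


class OperationPoint(NamedTuple):
    temporal_id: int                 # highest temporal_id to keep
    layer_ids: FrozenSet[int]        # layers associated with the operation point
    depth_included: Dict[int, bool]  # per-view depth_included_flag


def extract(nals: List[Nal], op: OperationPoint) -> List[Nal]:
    """Keep only the NAL units that belong to the operation point."""
    kept = []
    for nal in nals:
        if nal.layer_id not in op.layer_ids:
            continue  # layer not part of the operation point
        if nal.temporal_id > op.temporal_id:
            continue  # temporal layer above the target
        if nal.is_depth and not op.depth_included.get(nal.layer_id, False):
            continue  # depth of this view is not included
        kept.append(nal)
    return kept
```

Note that the filter needs nothing beyond the operation point description, which is in line with the claim that no separate texture/depth dependency signalling is required for extraction.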
Syntax elements that are now present in the VPS are no longer present in the SPS. This includes the SPS ID associated with the base layer, thus it implies changing the SPS definition in the base specification. Still present in the SPS are tool enabling flags and CU hierarchy-related elements. An identifier of the representation format is added. Furthermore, vps_id is added into the SPS.
A participant asked how dependencies between texture and depth are signalled for extraction purposes. It was claimed that the operation points are sufficient for the extraction process, and no additional information about texture/depth dependencies is needed.
m24896 | Miska M. Hannuksela, Dmytro Rusanovskyy (Nokia) | 3DV-HTM high-level syntax: video and sequence parameter set design
Definition: A component sequence is defined as a set of all the components with the same component id (similar to a layer). reserved_one_5bits is renamed as component_pic_id.
Profile/Level are defined in VPS.
A loop over all the possible combinations was proposed, each having profile and level info, even if it doesn’t correspond to an operation point.
The proposed ref_vps_id is only present in the slices of the IDR/CRA, such that referring to a VPS doesn’t require going through PPS and SPS ID references.
A dependency loop is sent for a specific operation point, similar to MVC.
num_component_seq_types indicates how many scalable dimensions a coded video sequence has.
The len parameter indicates how many bits are allocated for a specific characteristic dimension.
The exact value of each instance of a specific dimension is further signalled in a different loop.
For the MVC case, the component_id is view_idx.
For the 3DV case, the component_id may be either view_idx or view_idx plus depth_flag.
The proposed solution was claimed to be more generic than other schemes, and a comment was made that it naturally supports flexible decoding order.
m24919 | R. Skupin, V. George, T. Schierl | 3D-HEVC HLS: On NAL unit header and video parameter set
Operation point id is signalled in the NAL unit header.
Base specification of NAL unit header is changed to the following:
2 bits: layer_scenario. The allocation of the other 6 bits can be done in different ways based on the layer_scenario:
0: 32 values for OP id;
1: 2 dimensions, 3-3 allocation;
2: 2 dimensions, 2-4 allocation;
3: 3 dimensions, 2-2-2 allocation;
Temporal scalability may be disabled in some cases.
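The layer_scenario allocations listed above can be sketched as a generic bit splitter. It assumes that layer_scenario occupies the two most significant bits of a single byte and that the dimensions follow in MSB-first order; for scenario 0 the remaining field is simply returned as the OP id without enforcing a value range. These layout choices and the names are assumptions.

```python
from typing import Tuple

# Bit widths of the dimensions for layer_scenario 1, 2 and 3.
SCENARIO_WIDTHS = {1: (3, 3), 2: (2, 4), 3: (2, 2, 2)}


def parse_layer_byte(byte: int) -> Tuple[int, Tuple[int, ...]]:
    """Return (layer_scenario, dimension values) for one header byte."""
    scenario = (byte >> 6) & 0x3
    payload = byte & 0x3F
    if scenario == 0:
        return scenario, (payload,)  # single field: operation point id
    dims = []
    shift = 6
    for width in SCENARIO_WIDTHS[scenario]:
        shift -= width
        dims.append((payload >> shift) & ((1 << width) - 1))
    return scenario, tuple(dims)
```

For instance, the byte 0xB5 selects scenario 2 and yields the two dimension values 3 and 5.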
In the VPS:
scalability_type: 7-bit.
Profile/level: only for the lowest temporal layer in the SPS base design.
Multiple vps_extension_data() iterations can be signalled, each contains a profile/level.
Reference picture sets and reference picture lists
m24944 | Ying Chen, Adarsh K. Ramasubramonian, Ye-Kui Wang, Marta Karczewicz | 3D-HEVC HLS: reference picture list modification
This included SW implementation of the RPS and reference picture list construction schemes following HM6 design. Reference picture list modification is based on a proponent’s HEVC base specification proposal as proposed to the JCT-VC, extended for multiview coding.
In the current 3D-HEVC TMuC, the entries for the reference picture list are signalled in the slice header, and each reference picture in the list can be either an inter-view or inter-picture reference, which is signalled by a flag.
The MVC reference picture list modification syntax is reviewed and said to be advantageous over the current 3D-HEVC TMuC design.
The proposal uses the same syntax elements as in the corresponding HEVC base specification proposal. A starting point and the number of modification commands are sent first, followed by a loop of the modification commands. Each modification command includes a source list index, if more than one source RPS is present, and an RPS index, if more than one element is present in the RPS. In the decoding process an initial list is created based on the source lists in a pre-defined order. Afterwards the reference picture list modification process is applied, and it is said to be similar to the reference picture list modification in MVC.
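The two-stage process described above (initial list from the source lists, then modification commands applied from a starting point) can be sketched as follows. The insert-and-shift behaviour is modelled on the H.264/MVC style of list modification and is an assumption about the proposal, as are the function name and the string labels used for pictures; at least one source list is assumed to be non-empty.

```python
from typing import List, Sequence, Tuple


def build_ref_list(source_lists: Sequence[Sequence[str]],
                   num_active: int,
                   start_idx: int,
                   commands: Sequence[Tuple[int, int]]) -> List[str]:
    """Build a reference picture list and apply modification commands.

    source_lists: RPS subsets in their pre-defined order.
    commands: (source list index, index within that source list) pairs.
    """
    # Initial list: concatenate the source lists, repeating cyclically
    # until num_active entries are available.
    initial = [pic for src in source_lists for pic in src]
    ref_list = [initial[i % len(initial)] for i in range(num_active)]
    for offset, (src_idx, rps_idx) in enumerate(commands):
        pic = source_lists[src_idx][rps_idx]
        pos = start_idx + offset
        # Insert at pos, shift the rest, and drop a later duplicate.
        ref_list = ref_list[:pos] + [pic] + [p for p in ref_list[pos:] if p != pic]
        ref_list = ref_list[:num_active]
    return ref_list


# Hypothetical source lists: two temporal subsets and one inter-view subset.
sources = [["t-1", "t-2"], ["t+1"], ["v0"]]
```

In this sketch a single command is enough to move the inter-view picture to the front of the list, which is the kind of case where a compact command syntax saves slice header bits.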
Analysis of the number of bits required for reference picture list modification was provided. 20% bit reduction is reported for the base view, and 34% for enhancement views. The reference for simulation results is HEVC WD6 reference picture list modification.
No decision was made on this proposal.
m24876 | Byeongdoo Choi, Jaehyun Kim, Jeonghoon Park | 3D-HEVC HLS: On Reference list
This proposal is related to reference picture list construction and modification.
An inter-view RPS subset scheme is also proposed.
Reference picture initialization is the same as in m24944.
This proposal follows the HEVC CD design for reference picture list combination and reference picture list modification.
The contribution proposes using multiple inter-view RPS subsets, where each slice may use an index to one of the inter-view RPS subsets. The inter-view RPS subset, in terms of the view_idx values of its entries, may therefore change, whereas in m24944 the inter-view RPS subsets, as identified by view_idx values, do not change for the whole coded video sequence.
The possibility of using different inter-view RPS subsets for different pictures in a view was discussed. If following the MVC design, the possible inter-view reference pictures (identified by view ids) may be the same for each view component within a view, and the flexibility of including or excluding a possible picture can be achieved by reference picture list modification.
It was claimed by the proponent that having a more precise signalling of the inter-view RPS subset for each view component may save bits for reference picture list modification. Further study of this was recommended.
Another aspect of the proposal is to reuse the original syntax elements in the HEVC base specification for reference picture list construction and list combination. This aspect was agreed.
Miscellaneous
m24874 | Byeongdoo Choi, Jaehyun Kim, Jeonghoon Park | 3D-HEVC HLS: On Picture order counts
A POC signalling method is proposed for the enhancement views. A problem with POC calculation during view switching was mentioned.
Proposal #1 is to have a larger POC for CRA pictures, which can be applicable to the base specification. A comment was made that a similar proposal, covering only the HEVC base specification, had been discussed in the JCT-VC and concluded not to be useful.
No decision was made on this: if the idea of the larger POC for base view CRA was not adopted in JCT-VC, the decision for the similar functionality for enhancement views should presumably be the same.
Proposal #2 is to use the previous POC MSB of the base view for the enhancement view slice POC calculation.
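Proposal #2 can be illustrated with the usual LSB wrap-around rule for POC derivation. The sketch below assumes that the enhancement view slice simply feeds the base view's previous POC LSB and MSB into that rule; how the contribution obtains these values in detail is not described here, so the function `enhancement_view_poc` is an illustrative assumption.

```python
def derive_poc(poc_lsb: int, prev_lsb: int, prev_msb: int, max_lsb: int) -> int:
    """Derive a full POC from its LSB and the previous picture's LSB/MSB."""
    half = max_lsb // 2
    if poc_lsb < prev_lsb and prev_lsb - poc_lsb >= half:
        msb = prev_msb + max_lsb   # LSB wrapped around forwards
    elif poc_lsb > prev_lsb and poc_lsb - prev_lsb > half:
        msb = prev_msb - max_lsb   # LSB wrapped around backwards
    else:
        msb = prev_msb
    return msb + poc_lsb


def enhancement_view_poc(poc_lsb: int, base_prev_lsb: int,
                         base_prev_msb: int, max_lsb: int) -> int:
    """Assumed reading of Proposal #2: reuse the base view's previous state."""
    return derive_poc(poc_lsb, base_prev_lsb, base_prev_msb, max_lsb)
```

Because the enhancement view no longer relies on its own POC history, a decoder that has just switched to that view would still arrive at the same POC as the base view picture of the access unit.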
A constraint is also proposed to align the IDR property of all the view components inside an access unit.
Comments were made saying that such a technique doesn’t have a normative impact on a multiview/3DV codec, so it was not clear that action is needed on this.
A more complicated dependency structure is proposed, so that whether a texture view component depends on a specific depth view component or whether a depth view component depends on a specific texture view component is signalled in the access unit delimiter.
In the access unit delimiter, the mapping of each layer to a specific view_id and view_idx and whether the current view component is depth or texture, are also signalled.
A comment was made that such functionality can be supported by some proposals related to VPS.
m24872 | Byeongdoo Choi, Jaehyun Kim, Jeonghoon Park | 3D-HEVC HLS: On Slice header
It was proposed to support slice header prediction in the HTM.
The depth slice header may share syntax elements of the texture slice header, and the scheme for slice header prediction is similar to the slice header of an entropy slice.
A short_slice_header_flag indicates whether the prediction is enabled.
Both depth-to-texture prediction and texture-to-depth prediction are possible as proposed.
Some simple method for slice header prediction may be independent of the parameter set design.
In general, the slice header prediction is doable; however, there are other proposals on the table related to parameter sets, and it is desirable that the decision be made with full consideration of the various possibilities.
Regarding prediction from the texture to depth inside a view component, no decision was made.
A comment was made that the amount of savings of slice header bits should be made clearer.
It was mentioned that the dependency of the texture and depth within a view is further signalled in the slice header. A comment was given that this functionality is supported in some of the VPS proposals, and it is not clear that the flexibility in the access unit level has a benefit.
Some editorial changes were proposed to support 4:0:0 (used for depth coding); these are to be addressed when the working draft is available.
m24894 | Miska M. Hannuksela (Nokia) | 3DV-HTM high-level syntax: slice header prediction
It was proposed to predict slice headers from view components within the same access unit, but not necessarily within the same view.
Slice header syntax elements were grouped into six groups, and the reference view component (containing the reference slice header) for each group may be signalled. A comment was made that for some specific groups, if those syntax elements are present, the prediction might not be needed. A comment was also made that the number of groups may not be optimal. The number of groups might need to be further studied.
There were no simulation results to show the benefits of such a grouping method.
The patterns are proposed as a way to save bits on signalling the reference view components, which in the worst case requires 6 indices to reference view components.
Several patterns are proposed to be signalled in the SPS; each identifies a prediction direction, e.g., ref_view_idx, depth or texture. It is noted that the number of patterns is variable and that it might also be acceptable to put the patterns in picture parameter sets.
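The grouping and pattern mechanism described above can be sketched as follows. The group names, the representation of a pattern as a mapping from group to a (ref_view_idx, is_depth) pair, and the function name are all hypothetical; the sketch only shows how one pattern index could replace up to six per-group reference indices.

```python
from typing import Any, Dict, List, Tuple

ViewComponent = Tuple[int, bool]          # (ref_view_idx, is_depth)
Header = Dict[str, Dict[str, Any]]        # group name -> syntax elements


def predict_slice_header(pattern_idx: int,
                         sps_patterns: List[Dict[str, ViewComponent]],
                         decoded_headers: Dict[ViewComponent, Header],
                         explicit_groups: Header) -> Header:
    """Assemble a slice header from predicted and explicitly sent groups."""
    pattern = sps_patterns[pattern_idx]
    # Copy each predicted group from the reference view component's header.
    header = {group: dict(decoded_headers[ref][group])
              for group, ref in pattern.items()}
    # Groups present in the current slice header override the prediction.
    header.update({group: dict(elems)
                   for group, elems in explicit_groups.items()})
    return header


# Hypothetical data: texture of view 0 has been decoded; one pattern
# predicts two groups from it.
decoded = {(0, False): {"g0": {"slice_type": 1}, "g1": {"qp_delta": 2}}}
patterns = [{"g0": (0, False), "g1": (0, False)}]
```

Whether the patterns live in the SPS or the PPS does not change this logic; it only changes how often the pattern table can be updated.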