6.2.1.1.1 JVT-Y075 (Prop 2.2/3.1) [M. Karczewicz, R. Ranchal, Y. Ye (Qualcomm)] SVC FGS Simplifications
This contribution discussed elements of the VLC coding used in FGS. Based on this analysis, changes were proposed that reportedly result in simplification, with an average performance drop of 0.8%.
The provided plots included results for MGS run in low delay mode as described in JVT-X071.
The contributor proposed to start a CE to evaluate the proposed changes and compare to “vector CAVLC”.
The simplification was compared to AR-FGS: the bit rate increase was 0.8% on average (mostly at higher rates).
The proposal was claimed to be much better than MGS in cases where a constant bit rate per frame needs to be sent.
Results of delta-rate and delta-PSNR compared to MGS (hierarchical P pictures) were requested to be provided (both for CBR and VBR).
It was also requested to provide a better understanding of the intended application domain (mobile multipoint videoconferencing?).
Subjective results? Not provided – It was suggested that this may be worth considering in further study.
Remark: AR aspect adds complexity to encoding and decoding – e.g. relative to MGS. The proponent indicates that this may not be the case. Remark: At least the decode complexity is certainly higher. There are more motion compensations to perform (although one is bilinear).
Question: Does low delay imply real-time encoding? Yes.
Roughly the same number of bits for each picture in these tests? Approximately yes.
Bitstream and software availability? Yes. Software is available.
Trade-off between complexity and delay? Response: Proponent suggests that MGS complexity is about the same.
Remark: We have standardized something already – do we have clear evidence that it is not adequate?
Question: What is the asserted benefit of this relative to the hierarchical forward-P picture case? Proponent response: The ability to maintain a roughly equal number of bits per picture in a CBR channel scenario which helps reduce delay. Remark: Should also consider complexity – increased complexity has a delay impact.
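The delay argument in the proponent's response can be illustrated with a small buffer model. The following is only a sketch under stated assumptions: the function name, the per-picture bit counts, and the channel rate are hypothetical and are not taken from the contribution or from any test results. It models a sender buffer drained at a constant rate and reports the worst-case time, in frame intervals, from a picture entering the buffer until its last bit has been sent.

```python
def max_transmission_delay(picture_bits, channel_bits_per_interval):
    """Worst-case delay (in frame intervals) between a picture entering the
    sender buffer and the end of its transmission over a constant-rate channel.

    Hypothetical model: one picture arrives per frame interval, and the
    channel drains exactly `channel_bits_per_interval` bits per interval.
    """
    backlog = 0.0  # bits still waiting in the sender buffer
    worst = 0.0
    for bits in picture_bits:
        backlog += bits
        # Time needed to drain everything queued up to and including this picture.
        worst = max(worst, backlog / channel_bits_per_interval)
        # One frame interval passes before the next picture arrives.
        backlog = max(0.0, backlog - channel_bits_per_interval)
    return worst


# Illustrative numbers only: both patterns average 1000 bits per picture.
equal_sizes = [1000] * 8                    # roughly constant bits per picture
uneven_sizes = [2200, 400, 1000, 400] * 2   # large key picture, smaller others

delay_equal = max_transmission_delay(equal_sizes, 1000)
delay_uneven = max_transmission_delay(uneven_sizes, 1000)
```

With the same average rate, the uneven allocation gives a larger worst-case delay in this model because the large picture takes more than one frame interval to send. The model ignores encoding and decoding time, which the remark on complexity notes also affects delay.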
Proponent: Benefit of proposed FGS is asserted to be about 2 dB relative to similarly-constrained bit allocation use of current MGS design.
Remark: Looking at JVT-X040, do not see that. What are average numbers over all sequences for new proposal and for previous AR-FGS, both for CBR and VBR? Are there any average numbers reported in JVT-X040? Proponent indicates that new proposal has 0.7% degradation relative to prior AR-FGS; comparing to MGS “low-delay”, about 2 dB better; comparing to VBR (no delay constraint), about 0.5 dB better. In both cases, a “GOP size” of 4 frames was used for MGS (i.e. 3-level hierarchical P with one I picture at beginning) while IPPPP for the proposal.
Question: Is constant rate really needed in such low-delay scenarios? For variable bit rate, improvement is not so much – and should avoid introducing an incompatible “dialect”.
Question: One packet per frame? Proponent: Not necessarily.
Question: What is the benefit of FGS (or even SVC) when there is constant bit rate? Proponent: May be constant bit rate on uplink but perhaps not further downstream. Some material presented previously. Suggestion to just send the bit rate that the receiver can receive? Proponent: For cases with multiple recipients of the video? Remark: Multipoint video conferencing on mobile networks?
Question: How much of a niche application are we considering here? Low delay with CBR for multipoint SVC?
Suggestion to have some CE in which FGS technology may apply – what would be the topic?
The issues were requested to be discussed offline to clarify application need/scenario.
Interaction with other application scenarios where other AVC/SVC scenarios would be relevant? Is this for a relatively closed application domain of mobile devices, or a set of devices that has wider interaction with other devices?
Some AR-FGS-related changes previously adopted were reported to not have been integrated into the software yet. CABAC may or may not be supported in the software (but should be).
Suggestion: Create an AHG to help integrate CABAC and the AR-FGS simplifications that were previously intended to be adopted, and to investigate the JVT-Y075 VLC changes.
Question: How can we conclude on whether a profile that includes FGS technology is needed or not?
Question: Is mobile multipoint videoconferencing the only target application?
Remark: JVT-W093, JVT-X040 and JVT-X071 may contain relevant information to clarify application needs. An application suggested is “Videoshare”, which is a one-way multicast. Question: Is there a serious delay constraint need for that application?
This document reported verification of Qualcomm's proposal on FGS simplification (JVT-Y075). Sharp reportedly inspected the source code (the changes were marked with #define statements; Sharp read through those sections, compiled the code both ways, and provided both sets of results). Sharp also reportedly compiled and generated results for the conditions investigated by Qualcomm. All results reportedly matched with the exception of the Mother and Daughter sequence. (It is believed that Sharp and Qualcomm used different original sequences for Mother and Daughter.) Results were included with the submission.
A new version of the contribution was uploaded. There was a difference on one sequence – likely to be the result of slightly different source sequences (Mother & Daughter case).
6.3.1.1.1 JVT-Y076 (Prop 2.2) [J. H. Park, Y. H. Kim, B. H. Choi (KETI)] Requirement of SVC color space scalability
This document asserted a requirement for color space scalability and described possible scenarios that might benefit from color space scalable coding. This document also claimed that the conventional color conversion equation for color space scalability can cause a mismatch between encoder and decoder because of floating-point operations; an integer conversion method was therefore described for color space scalable video coding. The proposed method requires syntax modifications for color space scalability. Modified syntax tables were described.
Different applications: Requirements for scalability in terms of color sampling and color spaces (matrices). It was claimed that, in the case of layer prediction, color transfer matrices must be mathematically precise (not floating point). Solutions proposed included integer transfer matrices and lookup tables.
Remark: Matrix definition may not be sufficient because different color spaces also include e.g. gamma transfer characteristics.
Typically, color is treated outside the coding/decoding process. E.g. YCbCr of BT.709 is different from BT.601. But is it the case that nobody typically cares?
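To give a sense of how much the BT.601 and BT.709 matrices differ, the sketch below converts a normalized R'G'B' value to Y'CbCr with the BT.601 luma weights and converts back with the BT.709 weights. It is an illustration only: the function names are hypothetical, values are normalized to [0, 1] rather than quantized to 8 bits, and transfer characteristics and primaries are ignored.

```python
# Luma weights (Kr, Kb) of the two recommendations.
BT601 = (0.299, 0.114)
BT709 = (0.2126, 0.0722)


def rgb_to_ycbcr(r, g, b, kr, kb):
    """Normalized R'G'B' in [0, 1] to (Y', Cb, Cr); Cb and Cr lie in [-0.5, 0.5]."""
    kg = 1.0 - kr - kb
    y = kr * r + kg * g + kb * b
    cb = 0.5 * (b - y) / (1.0 - kb)
    cr = 0.5 * (r - y) / (1.0 - kr)
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr, kr, kb):
    """Inverse of rgb_to_ycbcr for the same (kr, kb) pair."""
    kg = 1.0 - kr - kb
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return r, g, b


def matrix_mismatch_error(rgb, forward=BT601, inverse=BT709):
    """Largest per-channel error when the inverse uses different weights."""
    ycc = rgb_to_ycbcr(*rgb, *forward)
    out = ycbcr_to_rgb(*ycc, *inverse)
    return max(abs(a - b) for a, b in zip(rgb, out))


gray_error = matrix_mismatch_error((0.5, 0.5, 0.5))  # neutral gray
red_error = matrix_mismatch_error((1.0, 0.0, 0.0))   # saturated red
```

In this sketch neutral colors are unaffected by the choice of matrix, while saturated colors show a visible error, so whether anyone "cares" depends on the content.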
Would it rather be appropriate to encode all scalable layers within only one color space? E.g. conversion of YCbCr to RGB outside the decoding process would rather produce a small error which is in the near lossless range. Further study would seem necessary.
Remark: Chroma format scalability should in principle work, but is not included in any profile yet.
The suggestion in the proposal was to consider inserting an integer-based color conversion into the decoding process – sending integer coefficients in the bitstream.
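An integer-based conversion of this kind could look roughly like the following sketch. It is not the syntax or process proposed in JVT-Y076: the precision `SHIFT`, the function names, and the choice of a full-range BT.601-style matrix are assumptions made for illustration. The point is that once the coefficients are integers, every encoder and decoder applying them obtains bit-identical results.

```python
SHIFT = 8  # hypothetical precision: coefficients are scaled by 2**SHIFT


def quantize_matrix(matrix, shift=SHIFT):
    """Round real-valued coefficients to integers scaled by 2**shift."""
    return [[int(round(c * (1 << shift))) for c in row] for row in matrix]


def convert_pixel(pixel, int_matrix, offsets=(0, 0, 0), shift=SHIFT, bit_depth=8):
    """Apply an integer 3x3 matrix using only integer multiplies, adds and shifts."""
    max_val = (1 << bit_depth) - 1
    half = 1 << (shift - 1)  # rounding offset added before the right shift
    out = []
    for row, off in zip(int_matrix, offsets):
        acc = sum(c * p for c, p in zip(row, pixel))
        val = ((acc + half) >> shift) + off
        out.append(min(max(val, 0), max_val))  # clip to the sample range
    return tuple(out)


# Full-range R'G'B' to Y'CbCr with BT.601 luma weights (illustrative choice).
REAL_MATRIX = [
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
]
INT_MATRIX = quantize_matrix(REAL_MATRIX)  # integers that could be signalled
OFFSETS = (0, 128, 128)

white = convert_pixel((255, 255, 255), INT_MATRIX, OFFSETS)
black = convert_pixel((0, 0, 0), INT_MATRIX, OFFSETS)
```

The rounding of the coefficients introduces a small deviation from the real-valued matrix, but that deviation is the same everywhere, which is what matters for inter-layer prediction.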
Remark: True color conversion could be (a lot) more than a matrix transformation – also consider transfer characteristics, colour primaries, reference white.
Remark: Consider the possibility that the base layer is SD (using the BT.601 color space) and the enhancement layer is HD (using the BT.709 color space).
Question: How much real benefit would there be by pulling a color conversion process into the decoding process?
Remark: Including sampling grid upsampling (e.g. 4:2:0 to 4:2:2) in the decoding process may be better justified than including color space conversion in the decoding process.
The provided material was primarily verbal argumentation – real data demonstrating the benefit of such a scheme would be more desirable.
The proponent indicated a desire to show some preliminary results at the next meeting.
Presentation deck available? Yes, newly-provided.