14.10 TE12 TMuC transforms and coefficient coding
14.10.1 Transform skip flag
14.10.1.1.1.1.1.1.1 JCTVC-C047 TE12.4: Transform skip flag test (off vs. on) by Fraunhofer HHI [B. Bross (Fraunhofer HHI)]
This document reported the results of testing the transform skip flag for the transform unit quadtree (RQT_ROOT) within the scope of tool experiment 12 evaluating TMuC Tools. For the test, RQT_ROOT was enabled and tested against the RQT_ROOT disabled default. In a high efficiency scenario, similar performance was reported when enabling RQT_ROOT, with bit rate savings of around 0.1%. In a low complexity scenario (using the LCEC entropy coder), enabling RQT_ROOT reportedly performed better than the default in terms of coding efficiency, with average bit rate savings of around 3.6%. Encoder runtimes ranging from 93% to 106% and decoder runtimes from 91% to 103% of the default runtimes were measured.
It was remarked that this experiment used version 0.7.2, which was modified for "LCEC Phase 2" in 0.7.4. The question was asked about the potential interaction of this proposal with that one.
The transform skip flag handling was asserted to be essentially the same in this proposal as in the Dresden HHI proposal.
The benefit was found primarily when using LCEC entropy coding – will we be keeping that?
It seemed that if we keep LCEC entropy coding, then we should also use this scheme (pending any conflicting later decisions at the meeting).
14.10.1.1.1.1.1.1.2 JCTVC-C258 TE12: Evaluation of transform skip flag (RQT_ROOT) [W.-J. Chien, P. Chen, M. Karczewicz (Qualcomm)]
This contribution was a cross-verification report with results similar to those reported in JCTVC-C047.
The algorithm and code were studied in detail, which resulted in a related contribution discussed below.
All test cases were performed.
14.10.1.1.1.1.1.1.3 JCTVC-C267 Test results of transform skip flag and phase 2 VLC integration [W.-J. Chien, P. Chen, X. Wang, M. Karczewicz (Qualcomm)]
This contribution was not directly part of the planned TE12, but is sufficiently closely related to merit being considered together with it.
This contribution presented test results for Transform Skip flag (HHI_RQT_ROOT) under "Phase 2 LCEC VLC integration".
According to the simulation results, further investigation and testing should be conducted to identify the interaction between the transform skip flag and the phase 2 VLC.
If we don't use the residual quadtree scheme, the proposed Phase 2 scheme was reported to provide a similar capability.
If we do use the residual quadtree scheme and the LCEC entropy coding, the proposed skip flag provides a benefit.
And the two schemes have roughly similar actual signaling.
LCEC phase 2 was something planned for integration in Geneva, but was integrated a bit late, and was asserted to have been intended as original Dresden output (based on JCTVC-A119). Similarly, the RQT_ROOT scheme with the skip flag was asserted to have been intended as original Dresden output as well (based on JCTVC-A116).
If we use the residual quadtree approach, it was suggested to use the transform skip flag with the RQT_ROOT scheme. This was agreed.
If we have LCEC entropy coding and don't use the residual quadtree approach, it was suggested to use the "Phase 2" scheme (not yet reviewed). Further discussion of this avenue did not appear necessary.
14.10.1.1.1.1.1.1.4 JCTVC-C185 Recent improvements of the low complexity entropy coder (LCEC) in TMuC [A. Fuldseth (Cisco), A. Hallapuro, K. Ugur, J. Lainema (Nokia)]
This contribution was not directly part of the planned TE12, but is sufficiently closely related to merit being considered together with it.
This contribution reports on recent improvements of the low complexity entropy coder (LCEC) in the Test Model under Consideration (TMuC) context.
The low complexity entropy coder (LCEC), based on the JCTVC-A119 HEVC CfP proposal, has been adopted into TMuC as described in JCTVC-B205. This has reportedly proved to be a challenging task, especially since the coding structures of the two algorithms are quite different. In particular, the side information parameters of JCTVC-A119 could not be directly mapped to the side information parameters of TMuC, and the strategy of JCTVC-A119 had to be modified as appropriate.
LCEC has been integrated into the TMuC software in four phases:
- Phase 0: TMuC v0.1, preliminary version provided by Samsung/BBC.
- Phase 1: TMuC v0.7, used as anchor in TE12 low complexity configurations (including the PIPE vs. LCEC experiment).
- Phase 2: TMuC v0.7.4, disabled by default, but evaluated as part of TE12.
- Phase 3: Recent improvements not integrated in the official TMuC software.
In this document, BD-Rate results for LCEC phase 2 and phase 3 relative to phase 1 were provided. Results with and without the residual quadtree (RQT) and the transform skip flag (HHI_RQT_ROOT) were also reported, because significant side information parameters such as the coded block flag (cbf) are treated differently depending on the values of these RQT parameters. Finally, a comparison between PIPE and LCEC was given.
The Class E sequences reportedly benefit substantially from the proposed scheme.
The gap between LCEC (as proposed) and PIPE was reported to be approximately 10%.
No cross verification of "phase 3" had been done at this point – see TE12 entropy coding testing.
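As background for the BD-Rate figures quoted throughout this section, the following is a minimal sketch of the usual Bjøntegaard delta-rate calculation, assuming four rate points per curve (the common test setup). The function names and the example numbers are hypothetical and are not taken from any contribution. With four points the cubic fit of log-rate against PSNR is simply the interpolating cubic, and a 2-point Gauss-Legendre rule integrates a cubic exactly, so no numerical library is needed.

```python
import math

def lagrange_eval(xs, ys, x):
    """Evaluate at x the polynomial that interpolates the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def mean_on_interval(xs, ys, lo, hi):
    """Mean of the interpolating cubic on [lo, hi].

    Uses 2-point Gauss-Legendre quadrature, which is exact for
    polynomials up to degree 3.
    """
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    g = 1.0 / math.sqrt(3.0)
    return 0.5 * (lagrange_eval(xs, ys, mid - half * g)
                  + lagrange_eval(xs, ys, mid + half * g))

def bd_rate(anchor, test):
    """BD-rate in percent; negative values mean the test saves bit rate.

    anchor, test: lists of four (bitrate, psnr) pairs each.
    """
    psnr_a = [p for _, p in anchor]
    psnr_t = [p for _, p in test]
    log_a = [math.log(r) for r, _ in anchor]
    log_t = [math.log(r) for r, _ in test]
    # Compare the curves only over the PSNR range they share.
    lo = max(min(psnr_a), min(psnr_t))
    hi = min(max(psnr_a), max(psnr_t))
    diff = (mean_on_interval(psnr_t, log_t, lo, hi)
            - mean_on_interval(psnr_a, log_a, lo, hi))
    return (math.exp(diff) - 1.0) * 100.0

# Hypothetical rate points (kbit/s, dB), for illustration only.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
# A test codec that needs 10% fewer bits at every PSNR point.
test_points = [(0.9 * r, p) for r, p in anchor_points]
print(f"BD-rate: {bd_rate(anchor_points, test_points):.2f}%")
```

A uniform 10% reduction in bit rate at equal PSNR yields a BD-rate of about -10%, which is a convenient sanity check before applying the function to measured curves.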
14.10.2 MDDT and ROT
14.10.2.1.1.1.1.1.1 JCTVC-C136 TE12.4: Mode dependent directional transform (MDDT) test (off vs. on) by HKUST [X. Zhang (HKUST)]
This contribution reported coding performance tests of MDDT according to Tool Experiment 12 JCTVC-B312. Some difficulties were reported in completing the testing process. Incomplete test results were presented for this reason.
0.7.0 software was used.
When MDDT is disabled in the software, ROT is enabled for all block sizes.
When MDDT is enabled in the software, ROT is enabled on some block sizes but not others.
(ROT was integrated first – when MDDT was later integrated, since it applied only to smaller block sizes, the MDDT enabling switch was designed to revert to the prior ROT behavior when MDDT is off.)
The results seem somewhat difficult to interpret because of this behavior. This is not really a simple test of "off" versus "on" for some tool, since the ROT behavior changes when the MDDT behavior changes.
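The coupling described above can be illustrated with a small sketch. It is not TMuC code: the function name is hypothetical, the block-size threshold is an assumed value (the report does not state which sizes count as "small"), and it assumes that small blocks use MDDT in place of ROT when the MDDT switch is on.

```python
# Assumed threshold, for illustration only; not stated in the report.
MDDT_MAX_BLOCK_SIZE = 8

def active_tools(block_size, mddt_switch_on):
    """Return the alternative-transform tools active for one block size."""
    if not mddt_switch_on:
        # MDDT switch off: the software reverts to ROT on all block sizes.
        return {"ROT"}
    if block_size <= MDDT_MAX_BLOCK_SIZE:
        # Assumption: small blocks use MDDT instead of ROT.
        return {"MDDT"}
    # Large blocks are outside MDDT's range and keep ROT.
    return {"ROT"}

for size in (4, 8, 16, 32):
    off = sorted(active_tools(size, mddt_switch_on=False))
    on = sorted(active_tools(size, mddt_switch_on=True))
    print(f"{size}x{size}: switch off -> {off}, switch on -> {on}")
```

Under these assumptions, flipping the single switch changes two things at once for small blocks (MDDT appears and ROT disappears) and changes nothing for large blocks, which is why the comparison is not a clean "off" versus "on" test of MDDT alone.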
In the HE test cases, "MDDT on with ROT on for large block sizes" versus "MDDT off with ROT on for all block sizes" showed no significant performance difference.
It was remarked that for LC configuration, some bug fix to the entropy coding behavior was made after version 0.7.0 of the software (bug fix 79) – this may call into question the LC results in this contribution.
14.10.2.1.1.1.1.1.2 JCTVC-C181 TE12.4: Cross-check of rotational transform (ROT) [K. Misra, A. Segall (Sharp)] (missing prior, available first day)
This contribution cross-checked the performance of the rotational transform in the context of TE12. The experimental results reportedly showed that, with MDDT enabled, using ROT for the large block sizes introduces an average loss (Y BD-rate for each condition) ranging from 0.0% to 0.6%. This closely matched the rate-distortion results provided by Qualcomm.
The software was studied, and was consistent with the description. The 4x4 ROT variant was not tested.
Encoding and decoding time measures were too unstable to provide confident measures.
14.10.2.1.1.1.1.1.3 JCTVC-C202 TE12.4: Experimental results of MDDT and ROT by Samsung and Qualcomm [E. Alshina, A. Alshin, W.-J. Han (Samsung), R. Joshi, M. Coban (Qualcomm)]
The presentation for this contribution was requested to be uploaded.
This document described the results of three tool evaluation tests in the field of alternative transforms. These experiments were carried out and cross-verified by Samsung and Qualcomm within the framework of the TE12 activity.
A modified combination of MDDT and ROT was proposed in the contribution.
The current combination of MDDT and ROT in the TMuC, when both are enabled together, reportedly provides about 2% benefit for HE intra only (and 1% benefit for RA), with about a 45% increase in encoding time. The contributor indicated that this degree of encoding time increase is not really necessary; contribution JCTVC-C250 discusses that topic.
The new combination proposed in the contribution reportedly provides about 3% benefit for HE intra only (and 1% benefit for RA).
It was remarked that TE7 also has some other transform modification.
It was noted that there is also a lower-complexity ROT proposal in JCTVC-C096.
A contributor asked whether the gain could be estimated separately for the transform aspects and for the adaptive scanning that is part of the proposal. There are really two features being tested together here, in terms of scan and transform changes. The opinion was offered that most of the gain is from the scan rather than the transform aspects. Contributions JCTVC-C106, JCTVC-C114, and JCTVC-C263 also discuss adaptive scanning.
Considering the complexity of these features, their inclusion in the TM at this time did not appear justified for the benefit shown.
14.10.2.1.1.1.1.1.4 JCTVC-C268 TE12: Report on evaluation of MDDT and ROT [R. Joshi, M. Coban, M. Karczewicz (Qualcomm)]
The results presented in this contribution were roughly consistent with those in JCTVC-C136, JCTVC-C181, and JCTVC-C202.
14.10.2.1.1.1.1.1.5 JCTVC-C298 TE12.4 Cross-check of unified MDDT/ROT [T. Davies, D. Flynn (BBC)] (late registration, missing prior)
This contribution contained cross-check results for JCTVC-C202, specifically on providing a unified framework for MDDT and ROT mode selection. The cross-check had only been partially completed, although the results collected so far matched those reported in JCTVC-C202.
14.10.2.1.1.1.1.1.6 JCTVC-C036 Alternative performance measurement of MDDT and ROT in TMuC [R. Cohen, A. Vetro, H. Sun (Mitsubishi)]
This contribution was not part of the TE, but is included here since it contains closely related information.
This document evaluated the performance of ROT and MDDT both individually and combined in TMuC, using a reference in which both ROT and MDDT were disabled. The results reportedly show that for all-Intra high efficiency conditions in TMuC 0.7, these tools produce average BD-Rate improvements from 1.6% to 2.1%, where ROT slightly outperforms both MDDT and their combination. For random access high efficiency, the gains were reportedly between 0.7% and 0.9%, with a similar relationship among individual and combined tests. Encoder run times approximately doubled for ROT, whereas decoder run times doubled or tripled with MDDT, depending upon the configuration. Possible future improvements were addressed as well.
The need for subjective evaluation was noted.
14.10.3 Transform coefficient coding
14.10.3.1.1.1.1.1.1 JCTVC-C049 TE12.4: Transform coding HHI tested against Samsung proposal by Fraunhofer HHI [B. Bross (Fraunhofer HHI)]
This document reported results of testing the transform coefficient coding proposed by Fraunhofer HHI against the transform coefficient coding proposed by Samsung within the scope of tool experiment 12 evaluating TMuC Tools. For the test, the Samsung coding was tested against the HHI coding default. It was reported that the Samsung coding results in a 1.3% to 2.0% luma bit rate increase compared to the HHI coding. The encoder runtime with the Samsung coding is between 92% and 110% of the runtime with the HHI coding, and the decoder runtime with the Samsung coding is between 98% and 120% of the runtime with the HHI coding.
The HHI scheme has adaptive scan order, context modeling for the significance map, and context modeling for the absolute coefficient levels.
The Samsung scheme uses fixed scan order and different context modeling.
Contributions JCTVC-C106, JCTVC-C114, and JCTVC-C263 also discussed adaptive scanning.
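To illustrate why an adaptive scan order can help, the following toy sketch compares a fixed zigzag scan with a scan reordered by how often each position was significant in previously coded blocks. This is an assumption-laden illustration only: the report does not describe the adaptation rule of the HHI scheme or of the other contributions, and all function names and coefficient values here are hypothetical.

```python
def zigzag_positions(n):
    """Fixed zigzag scan order for an n x n block, as (row, col) pairs."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Walk anti-diagonals; alternate direction on odd and even diagonals.
    return sorted(
        positions,
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )

def last_significant_index(block, scan):
    """Scan index of the last nonzero coefficient, or -1 if all are zero."""
    last = -1
    for i, (r, c) in enumerate(scan):
        if block[r][c] != 0:
            last = i
    return last

def adapt_scan(scan, previous_blocks):
    """Toy adaptation: move frequently significant positions earlier.

    Python's sort is stable, so positions with equal counts keep
    their relative order from the fixed scan.
    """
    counts = {pos: 0 for pos in scan}
    for block in previous_blocks:
        for (r, c) in scan:
            if block[r][c] != 0:
                counts[(r, c)] += 1
    return sorted(scan, key=lambda pos: -counts[pos])

# Hypothetical 4x4 blocks whose energy sits in the first column.
previous = [[[7, 0, 0, 0], [4, 0, 0, 0], [1, 0, 0, 0], [2, 0, 0, 0]]]
current = [[5, 0, 0, 0], [3, 0, 0, 0], [2, 0, 0, 0], [1, 0, 0, 0]]

fixed_scan = zigzag_positions(4)
adapted_scan = adapt_scan(fixed_scan, previous)
print("last significant index, fixed scan:  ",
      last_significant_index(current, fixed_scan))
print("last significant index, adapted scan:",
      last_significant_index(current, adapted_scan))
```

For blocks with this kind of directional structure, the adapted order reaches the last significant coefficient earlier, so fewer significance-map symbols need to be coded. A decoder can mirror the adaptation because it depends only on already decoded blocks.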
14.10.3.1.1.1.1.1.2 JCTVC-C059 TE12: Evaluation of transform coefficient coding (HHI_TRANSFORM_CODING) with tool breakdown [V. Sze, M. Budagavi, M. Zhou (TI)]
The coding efficiency gain of HHI_TRANSFORM_CODING was evaluated with TMuC-0.7.3 to be between 1.3% and 2.0%. A breakdown of this gain across the three main tools was also studied and reported as follows: adaptive scanning order of the significance map (0.2% to 0.4%), proposed context modeling of the significance map (0.9% to 1.4%), and proposed context modeling of the coefficient level (0.1% to 0.2%).
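As a rough consistency check on the breakdown, the per-tool ranges can be added up and compared with the reported overall range. This sketch assumes the gains are approximately additive and that the low and high ends of each range can be paired, neither of which the report states; the variable names are illustrative.

```python
# Per-tool gain ranges in percent, as reported above (low, high).
tool_gains = {
    "adaptive scan order of significance map": (0.2, 0.4),
    "context modeling of significance map": (0.9, 1.4),
    "context modeling of coefficient level": (0.1, 0.2),
}
reported_total = (1.3, 2.0)

# Round to one decimal to avoid floating-point noise in the sums.
sum_low = round(sum(low for low, _ in tool_gains.values()), 1)
sum_high = round(sum(high for _, high in tool_gains.values()), 1)
print(f"sum of per-tool gains: {sum_low}% to {sum_high}%")
print(f"reported overall gain: {reported_total[0]}% to {reported_total[1]}%")

# Share of the summed gain attributed to each tool at both ends of the range.
for name, (low, high) in tool_gains.items():
    print(f"{name}: {100 * low / sum_low:.0f}% to {100 * high / sum_high:.0f}% of the sum")
```

Under the additivity assumption, the per-tool figures add up to 1.2% to 2.0%, close to the reported overall range, and the context modeling of the significance map accounts for most of the sum.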
The coding efficiency gain of HHI_TRANSFORM_CODING was also evaluated with TMuC-0.7 by the contributor and was reported to have been verified to be a match with results provided by Samsung.
However, the three tools increase the complexity of context modeling – making it more difficult to parallelize for throughput.
Decision: The conclusion was to put the HHI scheme in TM, and further study the complexity issues.
14.10.3.1.1.1.1.1.3 JCTVC-C203 TE12.4: Experimental results of transform coefficient coding [J. Chen, V. Seregin, W.-J. Han (Samsung)]
This contribution reported similar results as in JCTVC-C049.