7.3Discussion and Conclusions
Decision: Adopted JCTVC-D347 Variant 3: 4x4 separable with 6 bits tap values (both for HE and LC).
8CE5: Low complexity entropy coding improvements
8.1Summary
8.1.1.1.1.1.1.1.1JCTVC-D364 CE5: Summary report of CE5 on LCEC [X. Wang (Qualcomm), A. Fuldseth (Cisco)]
This document summarized the activities in Core Experiment 5 on Low Complexity Entropy Coding (LCEC). A group of seven companies had registered for participation in CE5.
8.2Contributions
8.2.1.1.1.1.1.1.1JCTVC-D366 CE5: Improved intra prediction mode coding with LCEC [M. Karczewicz, X. Wang, W.-J. Chen (Qualcomm), A. Fuldseth (Cisco)]
In this report, coding results of an improved intra prediction mode coding scheme are reported. Results show that based on the CE test conditions, an average coding gain of 0.5% can be achieved.
Coding of intra prediction modes is modified for 4x4 blocks (17 modes) and for 16x16 and 32x32 blocks (34 modes). If the chosen intra prediction mode iDir is equal to the most probable mode mostProbMode, code number 0 is assigned to this mode. Otherwise, the code number n for iDir is found using a mapping table. Two different tables are used, one with a 1-bit code and one with a 2-bit code for the most probable mode. Which table is used depends on whether the intra prediction modes of the left and above neighboring blocks are the same.
The modes of neighboring blocks are used anyway to derive the most probable mode, therefore implications on complexity etc. seem to be marginal.
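The mapping from mode to code number described above can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: the rule that mostProbMode is the smaller of the two neighboring modes, the function name, and the two mode orderings that stand in for the mapping tables are hypothetical and are not taken from the contribution.

```python
def intra_mode_code_number(i_dir, left_mode, above_mode, order_same, order_diff):
    """Return (neighbours_same, code_number) for the chosen intra mode i_dir.

    order_same / order_diff are hypothetical stand-ins for the two mapping
    tables: each lists all modes from most to least likely.
    """
    # Assumption: the most probable mode is the smaller neighbouring mode index.
    most_prob_mode = min(left_mode, above_mode)
    # Table selection depends only on whether the two neighbour modes agree.
    neighbours_same = left_mode == above_mode

    if i_dir == most_prob_mode:
        # The most probable mode always gets code number 0.
        return neighbours_same, 0

    order = order_same if neighbours_same else order_diff
    # Remaining modes get code numbers 1, 2, ... in table order.
    remaining = [mode for mode in order if mode != most_prob_mode]
    return neighbours_same, 1 + remaining.index(i_dir)


# Hypothetical orderings for a 17-mode block size.
ORDER_SAME = list(range(17))
ORDER_DIFF = list(reversed(range(17)))
```

Because both neighbor modes are already needed to derive the most probable mode, the table selection adds only one comparison in this sketch.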
Question: Were the tables trained with the test sequences? Answer: Training of the tables was done using CABAC and a previous software version.
Decision: Minor change that gives some gain – no objection raised – recommendation to adopt.
8.2.1.1.1.1.1.1.2JCTVC-D187 CE5: Cross-check results of Qualcomm and Cisco's proposal (JCTVC-C263) by ETRI [S.-C. Lim, H. Lee, H. Y. Kim (ETRI)]
Cross-check of JCTVC-C263: the source code was compiled and identical results were obtained. It was confirmed that the gains are consistent over sequences.
8.2.1.1.1.1.1.1.3JCTVC-D403 CE5: Cross-check of Qualcomm/Cisco's proposal on improved intra prediction mode coding in LCEC [Y. H. Tan, C. Yeo, Z. Li (I2R)]
Cross-check of JCTVC-C263: the source code was compiled and identical results were obtained.
8.2.1.1.1.1.1.1.4JCTVC-D381 CE5: Cross-check of Qualcomm’s intra mode coding for LCEC by Sony [J. Xu, A. Tabatabai]
Cross-check of JCTVC-C263: the results match. It was not possible to check the computation time (a different platform was used).
8.2.1.1.1.1.1.1.5JCTVC-D369 CE5: Efficient coefficient coding method for 16x16 and 32x32 transforms in LCEC mode [S. Lee, I.-K. Kim, W.-J. Han, J.-H. Park (Samsung)]
This contribution was Samsung’s response to the Core Experiment 5 on the low complexity entropy coding improvements. In this contribution, a coefficient coding method for 16x16 and 32x32 transforms in LCEC mode, which has been proposed in JCTVC-C210, was presented. Tandberg, Ericsson, and Nokia’s coefficient coding method, which is currently implemented in the HEVC Test Model, had reportedly been extended for the efficient coding of the coefficients from 16x16 and 32x32 transforms. A simplified version of the proposed method which does not require any additional variable-length code tables was also presented. Experimental results reportedly showed that the proposed method provides good coding gain in intra and random access configurations without noticeable complexity increments.
LCEC reportedly lacks an efficient method to code the coefficients from large transforms (the current method was designed for 8x8). The method using new tables for larger block sizes (described in JCTVC-C210) reportedly gives BD BR savings of 1.4%, 0.8%, and -0.3% in intra, random access, and low delay configurations, respectively. A new low-complexity method was also tested, in which the existing method is used for run<=63. For larger runs, an escape code is used, and the remaining run is coded directly. Average BD BR savings were reported as 1.1%, 0.5%, and 0.2% in intra, random access, and low delay configurations, respectively. Also, for the 32x32 transform, at most 16x16 coefficients are encoded.
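The escape mechanism of the low-complexity method can be illustrated with a small Python sketch. The function name and the convention that the remainder counts from 64 are assumptions made for illustration. The contribution only states that runs up to 63 use the existing method and that the remaining run is coded directly after an escape code.

```python
MAX_DIRECT_RUN = 63  # from the text: runs up to 63 use the existing method


def split_run(run):
    """Classify a zero-run as ('existing', run) or ('escape', remainder)."""
    if run < 0:
        raise ValueError("run must be non-negative")
    if run <= MAX_DIRECT_RUN:
        # Short runs go through the existing run/level VLC path unchanged.
        return ("existing", run)
    # Assumption: the value coded after the escape is the run minus 64,
    # so the smallest escaped run (64) has remainder 0.
    return ("escape", run - (MAX_DIRECT_RUN + 1))
```

For example, under these assumptions `split_run(100)` yields an escape followed by a remainder of 36, and no new variable-length code table is needed for the long runs.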
There are other contributions (JCTVC-D374, JCTVC-D261) that target the penalty of LCEC due to the limited coding of large-block transform coefficients.
Conclusion: The simple method (not using different tables for larger blocks) seems to be sufficient. It is well understood that there is a problem (restricting larger blocks to 8x8 coefficients is suboptimal). However, there may be other solutions to this (see JCTVC-D374, JCTVC-D261). This should be further investigated in CE5.
8.2.1.1.1.1.1.1.6JCTVC-D194 CE5: Cross-check of Samsung's proposal for LCEC improvements [Thomas Davies] (missing prior, uploaded Monday 17th, before meeting)
Results matched, but there was some discrepancy in encoding and decoding time (132% vs. 116% encoder, 105% vs. 101% decoder for the full method; simplified method 126% vs. 112% encoder, 104% vs. 99% decoder). Potentially the relation with RDOQ needs more investigation.
8.2.1.1.1.1.1.1.7JCTVC-D370 CE5: Improved coding of inter prediction mode with LCEC [W.-J. Chen, X. Wang, M. Karczewicz (Qualcomm)]
This document reported CE 5 results on modified coding of inter prediction mode with low complexity entropy coding (LCEC). The average gain was 1.9% for random access configuration and 2% for low delay configuration.
According to this proposal, the split flag, skip flag, merge flag and direct flag are grouped and coded together with a symbol Inter_Partition. Unary codes are used. When Inter_Partition is coded, it indicates that additional information about prediction modes will follow. The mapping between symbol and codeword index is adaptive and CU depth dependent. The adaptation scheme is the same as that used for other syntax elements. When Inter_Partition is coded, an additional code is sent to signal one of the other prediction modes for the current CU, e.g. Inter_2Nx2N or Inter_NxN, etc. A fixed VLC table at each CU depth is used for this purpose. (This gives 1.8% for RA and 1.6% for LD.)
In addition, a "counter-based adaptation" is proposed. After encoding of each symbol the counter for this symbol is increased. If the current symbol’s counter is greater than the counter of the symbol with the codeword index smaller by one, the codeword indices of these symbols are swapped. The additional advantage of this method was reported as 0.4% BR saving for LD, 0.1% for RA.
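The swap rule of the counter-based adaptation can be sketched in Python as follows. The class and method names are hypothetical, the counters are unbounded Python integers (a fixed word length is not modeled), and the sketch assumes all counters start at zero, which the contribution does not state.

```python
class CounterAdaptiveMapping:
    """Symbol-to-codeword-index mapping adapted by per-symbol counters."""

    def __init__(self, symbols):
        # order[i] is the symbol currently assigned codeword index i.
        self.order = list(symbols)
        # Assumption: every counter starts at zero.
        self.count = {symbol: 0 for symbol in self.order}

    def encode(self, symbol):
        """Return the codeword index used for symbol, then adapt the mapping."""
        index = self.order.index(symbol)
        self.count[symbol] += 1
        if index > 0:
            neighbour = self.order[index - 1]
            # Swap with the symbol whose codeword index is smaller by one
            # only if the current symbol's counter is now strictly greater.
            if self.count[symbol] > self.count[neighbour]:
                self.order[index - 1], self.order[index] = symbol, neighbour
        return index
```

A decoder would apply the same update after decoding each symbol, so both sides keep identical mappings without side information.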
The word length of the counter implementation is 8-bit and could overflow (although this does not happen with the current test set).
The process of counter-based adaptation would require precise normative description (the cross-checker points out that it is relatively simple in the software).
It was remarked that this relates to CE9 where the usage of modes is investigated.
One expert pointed out that this targets similar methods of adaptation of mode encoding as in the original TENTM proposal.
The encoding and decoding time were more or less unchanged.
Conclusion: The Inter_Partition symbol and adaptation is a useful improvement giving some gain in LCEC. Several experts expressed support.
Decision: Adopt (no objection raised). Regarding the relation to CE9, it had been reported before that the method works also without the merge flag, and the adaptation seems to be beneficial regardless of the specific modes. Regarding the counter-based adaptation: Further study was recommended.
It was remarked that JCTVC-D140 also suggests a method of counter-based adaptation.
It was noted that there may be some interaction between this proposal and other aspects that are evolving in the design – e.g., in relation to CE9.
8.2.1.1.1.1.1.1.8JCTVC-D449 Cross-verification of Microsoft’s method in JCTVC-D140 based CE9.3.1.t software [T. Lee, J. Chen (Samsung)] (late registration Tuesday 25th after start of meeting, uploaded Wednesday 26th, near the end of the meeting)
8.2.1.1.1.1.1.1.9JCTVC-D131 CE5: Crosscheck of Qualcomm's Modified LCEC by MediaTek [Ching-Yeh Chen, Yu-Wen Huang]
(The person who was familiar with this cross-check in detail was not in the room when discussed.) Reportedly the software was studied and compiled and run independently. The company supported the proposal.
8.2.1.1.1.1.1.1.10JCTVC-D319 CE5: Cross verification of Qualcomm’s counter adaptation for LCEC tool [K. Ugur, J. Lainema (Nokia)]
This contribution only cross-checked the counter adaptation part in detail.
8.2.1.1.1.1.1.1.11JCTVC-D374 CE5: Improved coefficient coding with LCEC [M. Karczewicz, X. Wang, W.-J. Chien (Qualcomm)]
Coding results of modifications of coefficient coding were reported. Results were available for both the cases of applying the proposed coefficient coding alone as well as applying it together with adaptive scan. In addition, the same idea was extended to the case of 16x16 coefficient coding, with results reported.
The current tables in the HM that map the {levelID, run} pair to code number are only used for inter blocks. New mappings to code numbers were introduced for blocks coded in intra mode. In addition, for an intra block the selection of the table is dependent both on the position k of the current nonzero coefficient and a parameter n, which is defined as zero if the absolute value of any of the coefficient levels coded so far (in the inverse scan order) is larger than 1. Otherwise, it is defined as the number of non-zero coded coefficients, clipped to 4.
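The derivation of the parameter n described above can be written as a short Python sketch. The function name and the representation of the already coded levels as a list are assumptions for illustration. The position-dependent part of the table selection (the position k) is not modeled here.

```python
def context_parameter_n(coded_levels, clip=4):
    """Derive n from the coefficient levels coded so far (inverse scan order)."""
    # n is zero as soon as any coded level has an absolute value above 1.
    if any(abs(level) > 1 for level in coded_levels):
        return 0
    # Otherwise n is the number of non-zero coded coefficients, clipped to 4.
    nonzero = sum(1 for level in coded_levels if level != 0)
    return min(nonzero, clip)
```

Under this reading, n counts up only while all coded levels have magnitude at most 1 and falls back to zero once a larger level has been seen.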
The results reportedly show that when applying the proposed coefficient coding scheme alone, an average coding gain of 1.6% can be obtained with all intra configuration, an average of 0.6% gain with random access configuration and 0.1% with low delay configuration.
For intra, the scheme was used both with and without adaptive scan. In the case of adaptive scans, three scans are used: horizontal, vertical and zig-zag. The scan order for the first few coefficients (up to 64) is adaptively adjusted based on previously coded coefficients (using a counter as described in the document JCTVC-C250). This reportedly gives 3.3% BR reduction in total for intra.
The proposed scheme was also extended to 16x16 coefficient coding, using a "zero level" (instead of escape as in JCTVC-D369) to encode longer run-lengths. The average coding gain can reportedly reach 5.2% with all intra configuration. A 2.2% coding gain was observed with random access configuration, and 2.3% gain with low delay configuration.
In a new version, the whole scheme was combined with an improved RDOQ method (giving roughly another 1%, but it was unclear how this relates to the modifications, i.e. how it would perform without being combined).
The elements of the proposal were examined and discussed individually:
- New mappings to code numbers for intra
- Adaptive scanning
- Method for encoding transform blocks >= 16x16
- RDOQ modification (non-normative)
8.2.1.1.1.1.1.1.12JCTVC-D444 Cross-check on Qualcom's new RDOQ for LCEC [David Flynn, Thomas Davies] (late registration Monday 24th after start of meeting, uploaded Monday 24th, fifth day of meeting)
8.2.1.1.1.1.1.1.13JCTVC-D064 CE5: Verification of Qualcomm's contribution on LCEC [Jie Zhao, Andrew Segall]
This contribution confirmed the results on combinations 1, 1+2 and 1-3, with encoding time not precisely measured due to usage of a computer cluster.
8.2.1.1.1.1.1.1.14JCTVC-D228 Cross-check of Qualcomm's proposal for improved coefficient coding for LCEC [A. Fuldseth (Cisco)]
This contribution confirmed the results on combinations 1+2 and 1-3, reporting that the encoding time was increased to 103-109% (should be negligible for item 1 alone).