6.14 Transform coefficient coding
6.14.1 JCTVC-F077 Transform skip mode [M. Mrak, A. Gabriellini, N. Sprljan, D. Flynn (BBC)]
This contribution addresses the transformation of residuals, where the row and/or column transforms can be skipped. The encoder determines the selection of applied/skipped transforms, the so-called Transform Skip Mode (TSM), for each block based on an RDO search. The transform skip mode choice is signalled to the decoder, where the inverse transforms of rows/columns are correspondingly performed or skipped.
This method was implemented in HM3.2 for motion-compensated blocks. Skipped transforms are replaced with a scaling of the input coefficients. The presented design relies on the block transform with 16-bit intermediate data representation [JCTVC-E243]. The underlying transform properties are asserted to allow a simple design of the transform skip: HM3.2 quantization reportedly does not require any changes, and the choice of scaling factor for a row/column whose transform is skipped depends only on the block size.
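For illustration only, the following is a minimal sketch of the row/column skip idea, assuming the core 1D inverse transform (e.g. the HM partial-butterfly routine) is supplied as a parameter; the scaling shift used for a skipped pass is a hypothetical placeholder, whereas JCTVC-F077 derives the actual scaling factor from the block size.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Inverse 2D transform of an n x n block with optional skipping of the
// column and/or row pass (Transform Skip Mode). A skipped pass is replaced
// by a plain left-shift scaling; tsmShift is a placeholder value here.
void inverseTransformTSM(std::vector<int32_t>& block, int n,
                         bool skipCols, bool skipRows, int tsmShift,
                         const std::function<void(const int32_t*, int32_t*, int)>& inv1d)
{
    std::vector<int32_t> in(n), out(n);

    // Vertical (column) pass, or scaling if the column transform is skipped.
    for (int x = 0; x < n; ++x) {
        for (int y = 0; y < n; ++y) in[y] = block[y * n + x];
        if (skipCols) for (int y = 0; y < n; ++y) out[y] = in[y] << tsmShift;
        else          inv1d(in.data(), out.data(), n);
        for (int y = 0; y < n; ++y) block[y * n + x] = out[y];
    }
    // Horizontal (row) pass, or scaling if the row transform is skipped.
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) in[x] = block[y * n + x];
        if (skipRows) for (int x = 0; x < n; ++x) out[x] = in[x] << tsmShift;
        else          inv1d(in.data(), out.data(), n);
        for (int x = 0; x < n; ++x) block[y * n + x] = out[x];
    }
}
```

An encoder would evaluate the combinations (2D transform, row-only, column-only, no transform) in its RDO search and signal the selected mode, as described above.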
Tests were performed without RQT for the targeted (inter coded) blocks. Compared to the HM3.2 anchors, the presented approach reportedly shows gains of 0.2% BD-rate averaged over all inter coded configurations. Since the proposed approach was implemented without RQT, its performance was also compared to HM3.2 with RQT disabled for inter coded blocks; in that case, the gain is reported as 1.0% BD-rate averaged over all inter coded configurations.
The new method reportedly provides gains comparable to the RQT gains on inter coded blocks, while preserving a non-recursive design.
How often is it used? For 4x4, almost 50% of blocks are coded with a 1D or no transform, and the ratio goes down for larger transform block sizes.
The method mainly has gain in LD cases. For RA, there is no gain (for HE) or even loss (for LC) when RQT is on. Gains with RQT off look more interesting.
Several experts expressed the view that this is interesting.
Further study was recommended (AHG?); the encoding method is said to not yet be fully optimized.
6.14.2 JCTVC-F152 Cross-check of JCTVC-F077 Transform Skip Mode [T. Davies (Cisco)]
6.14.3 JCTVC-F581 Cross-check report for BBC’s proposal on Transform skip mode (JCTVC-F077) by Motorola Mobility [K. Panusopone, X. Fang, V. Kung, L. Wang (Motorola Mobility)] [upload 07-14 after opening]
6.14.4 JCTVC-F287 Improvements on last nonzero position coding of 4x4 TU in CAVLC [J. Xu, A. Tabatabai (Sony)]
In this proposal, the coding of the last nonzero coefficient position for 4x4 TUs in CAVLC is changed. In HM3.0, chroma and luma from intra TUs are mixed together as one context for the last nonzero coefficient position of 4x4 TUs in CAVLC. This proposal redefines the contexts to separate luma and chroma, so that different VLC tables are used for luma and chroma. Experimental results reportedly show BD-rate savings for LC conditions of 0.1% for Y, 1.7% for U and 1.6% for V under the intra-only configuration, 0.0% for Y, 1.4% for U and 1.3% for V under the random access configuration, and 0.0% for Y, 0.8% for U and 0.9% for V under the low delay configuration.
No interest expressed - no action.
6.14.5 JCTVC-F677 Cross-check for Sony’s Proposal (JCTVC-F287) on Improvements on last nonzero position coding of 4x4 TU in CAVLC [Z. Zhou, S. Liu (MediaTek)] [late reg. 07-07, upload 07-07]
6.14.6 JCTVC-F124 Extended Mode-Dependent Coefficient Scanning [X. Zhao, X. Guo, M. Guo, S. Lei (MediaTek), S. Ma, W. Gao (PKU)]
In this contribution, an extended mode-dependent coefficient scanning (MDCS) method is proposed. Based on the MDCS in HM 3.0, this proposal makes two extensions. First, a modified MDCS is applied to larger block sizes, including 16×16 and 32×32, in which the horizontal and vertical scanning orders differ from the current ones. Second, MDCS is also applied to the scanning of the absolute coefficient values in CABAC, which currently uses only zigzag scanning. With both modifications, average BD-rate reductions of 0.3% and 0.7% are reported for the AI-HE and AI-LC configurations, respectively. It is also reported that the running time of the proposed method is almost the same as that of the anchor.
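For reference, a minimal sketch of how plain horizontal and vertical coefficient scan orders can be generated for an N×N TU; the modified orders proposed in this contribution for the 16×16 and 32×32 sizes differ from these generic ones, and the mapping from intra prediction mode to scan index (not shown) is mode dependent as in the HM MDCS.

```cpp
#include <cstddef>
#include <vector>

// Build the coefficient scan order (list of raster positions) for an n x n TU.
// scanIdx: 0 = horizontal scan (row by row), 1 = vertical scan (column by column).
// The zigzag/diagonal scan used for the remaining modes is omitted for brevity.
std::vector<int> buildScanOrder(int n, int scanIdx)
{
    std::vector<int> order;
    order.reserve(static_cast<std::size_t>(n) * n);
    for (int outer = 0; outer < n; ++outer)
        for (int inner = 0; inner < n; ++inner)
            order.push_back(scanIdx == 0 ? outer * n + inner   // horizontal: outer = row
                                         : inner * n + outer); // vertical:   outer = column
    return order;
}
```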
Is the more random scan sequence a problem for software and hardware?
Seems to give some gain -> CE?
6.14.7 JCTVC-F646 Cross-verification of MediaTek’s JCTVC-F124 on Extended Mode-Dependent Coefficient Scanning [J. Chen, V. Seregin (Samsung)] [late reg. 07-05, upload 07-07]
6.14.8 JCTVC-F186 Predicted neighbour for context selection of significant_coeff_flag for parallel processing [C. Rosewarne, M. Maeda (Canon)]
This contribution proposes a method for context selection, modified from the approach presented in contribution JCTVC-E330 by TI and studied in CE11. Similarly to JCTVC-E330, the context selection is modified along the top edge and the left edge of the Transform Unit, where the zigzag scan changes direction. In these locations, a value for a neighbouring significant coefficient flag, immediately preceding a current significant coefficient flag along the scan pattern, is predicted. This modification was implemented in HM-3.0 and reportedly resulted in degradations of 0.1% for IA_HE, 0.0% for RA_HE and 0.0% for LD_HE configurations in the Luma channel. Use of the predicted value instead of the neighbouring value reportedly enables context index determination to occur earlier in the bitstream parsing process.
The contributor confirmed that there was no need to present this contribution, as it became obsolete due to the adoption of the diagonal scan.
6.14.9 JCTVC-F668 Cross-check results of Canon’s Predicted neighbour for context selection of significant_coeff_flag for parallel processing (JCTVC-F186) [V. Sze (TI)] [late reg. 07-06, upload 07-06]
6.14.10 JCTVC-F134 CE11.A: Cross checking of JCTVC-C227 and proposal on semantic, syntax, and implementation [C. Auyeung (Sony)]
This proposal clarifies the text and software implementation of JCTVC-F129 so that the zigzag scan is not removed entirely, as it is still needed in the CAVLC case.
Suggested action: JCTVC-F129 needs to be clarified/corrected to avoid removing the zigzag, which would be a breaking bug for CAVLC. Agreed.
6.14.11 JCTVC-F671 Cross-check of Sony's proposal on semantic, syntax, and implementation (JCTVC-F134) [V. Sze (TI)] [late reg. 07-06, upload 07-06]
6.14.12 JCTVC-F236 IDCT pruning and scan dependent transform order [M. Budagavi, V. Sze (TI)]
The high-frequency region of large transforms is typically zero due to quantization and the energy compaction properties of the transform. This contribution presents statistics of the non-zero low-frequency sub-blocks of large transform blocks in HM-3.0 anchor bitstreams. It is reported that 88%-91% of the large transform blocks in HM-3.0 bitstreams do not need to undergo a full 2D inverse transform. The non-zero sub-block information can be used to carry out IDCT pruning, wherein the IDCT computations that have zero input and zero output are eliminated. The contribution asserts that IDCT pruning is a useful technique to reduce SIMD computational complexity in the decoder based on source statistics, and that it results in corresponding power savings in hardware transform engines. Note that IDCT pruning as defined in this contribution is a lossless (i.e. non-normative) process. The contribution also presents a normative tool, a scan-dependent transform order, which defines the row/column transform order depending on the scan type. This tool is asserted to reduce transform complexity and to increase the amount of pruning that can be applied. The contribution recommends that the pruning behavior of large transforms and the scan-dependent transform order tool be considered in the design of the HEVC transforms.
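As a non-normative illustration, here is a minimal sketch of column-wise IDCT pruning, assuming the decoder already knows (e.g. from the parsed significance map) that only the top-left k×k sub-block of the n×n coefficient array can be non-zero; the 1D inverse transform is again a supplied parameter, and a real implementation could additionally prune operations inside the 1D butterflies.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Pruned inverse 2D transform: columns outside the first k contain only zero
// coefficients, so their vertical 1D transform is zero-in/zero-out and can be
// skipped entirely. The horizontal pass still visits every row, but each of
// those 1D transforms has at most k non-zero inputs, which a pruned butterfly
// implementation could exploit further. The result is identical to the
// unpruned transform, i.e. this is a non-normative decoder optimization.
void prunedInverseTransform(std::vector<int32_t>& block, int n, int k,
                            const std::function<void(const int32_t*, int32_t*, int)>& inv1d)
{
    std::vector<int32_t> in(n), out(n);

    // Vertical (column) pass over the first k columns only.
    for (int x = 0; x < k; ++x) {
        for (int y = 0; y < n; ++y) in[y] = block[y * n + x];
        inv1d(in.data(), out.data(), n);
        for (int y = 0; y < n; ++y) block[y * n + x] = out[y];
    }
    // Horizontal (row) pass over all rows.
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) in[x] = block[y * n + x];
        inv1d(in.data(), out.data(), n);
        for (int x = 0; x < n; ++x) block[y * n + x] = out[x];
    }
}
```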
The transform order would need to be normative.
Would only be relevant for intra blocks with hor/vert scan, which seems to be a very specific case. What would be the actual saving in terms of operations, e.g. for case of butterfly implementation?
A. Fuldseth reports that in software the row/column order was found to perform better, but this has not been further interpreted so far (HM uses the reverse sequence).
Further study.
6.14.13 JCTVC-F501 Mode dependent coefficient scan for inter blocks [J. Song, M. Yang, H. Yang, H. Yu, X. Zheng (Huawei)]
In this contribution, a mode dependent coefficient scan method is proposed for inter-coded blocks. A scan pattern out of the three scans in HM3.0, i.e., zigzag, horizontal, and vertical scans, is selected based on the PU partition mode. The averaged RD performances of the proposed method under RA_HE, RA_LC, LB_HE, LB_LC, LP_HE, and LP_LC conditions are reportedly 0.0%, 0.0%, −0.1%, −0.5%, −0.2% and −0.4%, respectively.
Gives gain mainly for LD, and particularly for CAVLC.
In HE it may conflict with the context modelling?
There may be a relation to the Nx2N transforms proposed elsewhere (CE2).
Further study for LC only (CE5 on CAVLC)
6.14.14 JCTVC-F227 Cross-check report for Huawei's proposal JCTVC-F501 [Y. Shibahara, T. Nishi (Panasonic)]
6.14.15 JCTVC-F375 Binarization modification for last position coding [V. Seregin, I.-K Kim (Samsung)]
In this document, a modified binarization for last position coding, which couples unary and fixed-length binary codes, is investigated and tested. For the fixed-length binary part, CABAC coding with equal probabilities (bypass mode) is suggested for complexity reduction. In addition, the number of context models for the last position is reduced from 52 to 38. Experimental results reportedly show 7% of bins being bypass coded, with 0.1%, 0.0% and 0.0% BD-rate gain in the high efficiency intra-only, random access and low-delay test conditions, respectively.
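A minimal sketch of the general prefix/suffix idea follows, assuming a hypothetical split in which the first unaryMax values of a last-position coordinate are binarized with truncated unary (context-coded) bins and larger values add a fixed-length suffix coded in bypass mode; the exact code structure and the 52-to-38 context reduction in the proposal differ in detail.

```cpp
#include <vector>

struct LastPosBins {
    std::vector<int> prefix;  // context-coded bins (truncated unary part)
    std::vector<int> suffix;  // bypass-coded bins (fixed-length part)
};

// Hypothetical binarization of one last-position coordinate pos in [0, maxPos].
// Values below unaryMax: truncated unary (pos '1' bins, then a terminating '0'
// unless pos == maxPos). Values of unaryMax and above: unaryMax '1' bins
// followed by a fixed-length bypass suffix for (pos - unaryMax); suffixLen must
// be large enough to represent maxPos - unaryMax.
LastPosBins binarizeLastPos(int pos, int maxPos, int unaryMax, int suffixLen)
{
    LastPosBins bins;
    if (pos < unaryMax) {
        for (int i = 0; i < pos; ++i) bins.prefix.push_back(1);
        if (pos < maxPos) bins.prefix.push_back(0);
    } else {
        for (int i = 0; i < unaryMax; ++i) bins.prefix.push_back(1);
        const int rem = pos - unaryMax;
        for (int b = suffixLen - 1; b >= 0; --b)
            bins.suffix.push_back((rem >> b) & 1);  // MSB first
    }
    return bins;
}
```

Coding the fixed-length suffix bins in bypass mode is what increases the share of bypass bins reported above.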
In total, the throughput is estimated to increase by approx. 2.5% and the number of bypass bins by approx. 0.5-1% (assuming a 10% last coefficient rate). This mainly applies at high QP.
Further discussion in BoG on CABAC - must be seen in combination with other simplification proposals.
6.14.16 JCTVC-F538 Cross-check of Samsung’s binarization modification for last position coding (JCTVC-F375) [J. Sole (Qualcomm)]
Cross-checkers in principle support this, confirming that the implementation is correct and straightforward; however it was suggested to study the relationship with other CABAC simplifications, and BoG activity was requested for this.
6.14.17 JCTVC-F552 Parallel processing of residual data in HE [J. Sole, R. Joshi, M. Karczewicz (Qualcomm)]
This document presents two techniques to facilitate the parallelization of residual coding in HE. The first extends the concept of parallel context processing of coefficient levels to include the significance map coding pass. This provides localized data access for all the coding passes of the residual. The BD-rate impact of this technique for the AI-HE, RA-HE and LB-HE configurations is reported as 0.25%, 0.21%, and 0.05%, respectively. A delayed state update in CABAC is also proposed to help parallelism. The BD-rate impact of both techniques combined for the AI-HE, RA-HE and LB-HE configurations is reported as 0.15%, 0.13%, and 0.00%, respectively.
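The following is a minimal sketch of the delayed state update idea, assuming the context state update triggered by one bin only takes effect after the next bin has already selected its state; the arithmetic coding engine and the real CABAC state transition tables are omitted and replaced by a toy up/down counter.

```cpp
#include <vector>

// Toy context model: only a state index; the real CABAC probability states
// and the range/offset arithmetic are omitted in this sketch.
struct Context { int state = 0; };

// One-bin delayed context update: the bin coded at time t reads its context
// state before the update from the bin at time t-1 is applied, so the state
// lookup no longer depends on the immediately preceding bin and the two
// operations can be overlapped in a pipeline.
class DelayedUpdateCoder {
public:
    void codeBin(std::vector<Context>& ctx, int ctxIdx, int bin)
    {
        int stateUsed = ctx[ctxIdx].state;  // state used to code 'bin'
        (void)stateUsed;                    // arithmetic coding step omitted

        // Apply the update left pending from the previous bin.
        if (havePending_)
            ctx[pendingCtx_].state = nextState(ctx[pendingCtx_].state, pendingBin_);

        // Queue the update for the current bin; it takes effect one bin later.
        pendingCtx_  = ctxIdx;
        pendingBin_  = bin;
        havePending_ = true;
    }
private:
    // Toy state transition (placeholder for the CABAC transition tables).
    static int nextState(int state, int bin)
    {
        return bin ? state + 1 : (state > 0 ? state - 1 : 0);
    }
    int  pendingCtx_  = 0;
    int  pendingBin_  = 0;
    bool havePending_ = false;
};
```

If two consecutive bins use the same context, the second one sees the state without the first bin's update; this deviation from standard CABAC behaviour is what makes the scheme normative rather than a pure implementation optimization.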
Context model is updated with a delay of one bin.
The gain in compression is due to usage of backward scanning.
The real amount of benefit for hardware implementation needs more thorough investigation.
The current approach would support better pipelining or at most 2x parallelism; for more parallelization, it would be necessary to introduce larger delay.
For interleaving: Tradeoff with additional memory.
In general, interest is expressed; further study – second part (delayed update) in CE.
6.14.18 JCTVC-F654 Cross verification for Qualcomm's proposal JCTVC-F552 on Parallel Processing of Residual Data in HE [Hisao Sasai, Takahiro Nishi (Panasonic)] [late reg. 07-05, upload 07-12]
6.14.19 JCTVC-F569 Adaptive Scan for Large Blocks for HEVC [Y. Yu, K. Panusopone, J. Lou, L. Wang (Motorola Mobility)]
This contribution proposes an adaptive scan for large block sizes for intra coding in HEVC (High Efficiency Video Coding). Reportedly without increasing complexity at either the encoder or the decoder, average coding gains of 0.3% and 0.2% are asserted for the low complexity and high efficiency intra coding cases, respectively.
Horizontal and vertical scans are applied to quarters of large transform blocks.
Two additional scan patterns are used for each of the 16×16 and 32×32 cases.
Most gain comes in class E for both HE and LC, plus class A for LC.
The scan direction is derived from the prediction direction.
Further study.
6.14.20 JCTVC-F614 Cross-check report for Motorola Mobility's JCTVC-F569 by HKUST [F. Zou, O. C. Au (HKUST)]
6.14.21 JCTVC-F598 Adaptive significance map coding for large transform [J. Min, Y. Piao, J. Chen (Samsung)] [first version rejected, second version also unacceptable, third version late upload 07-16]
This contribution proposes an adaptive context type selection method for significance map coding. The proposed method selects the best context model type according to coefficient distributions. The selected context model type is signalled to the decoder for each transform unit when the coefficient distribution satisfies a given condition. The method reportedly provides 0.3% gain overall, and 0.6% for class A, in the HE all intra configuration.
Would need much more improvement (less encoder complexity, more gain particularly for inter) to become interesting.
6.14.22 JCTVC-F709 Crosscheck of Samsung's proposal JCTVC-F598 by Huawei [Q. Shen, H. Yang, H. Yu (Huawei)] [late reg. 07-11, upload 07-13]