6.16.2 Alternative transforms
6.16.2.1.1.1.1.1.1 JCTVC-F153 On fast implementation of 4-point ODST-3 in HM3 [C. Yeo, Y. H. Tan, Z. Li (I2R)]
This contribution presented possible fast implementations of the 4-point ODST-3 that are asserted to be mathematically equivalent to the one in HM3 but to have a lower operation count than the current HM3 software. The (29, 55) approximation of ODST-3 requires 11 additions and 8 multiplications as implemented in HM3, while two other implementations require either 10 additions and 6 multiplications or 13 additions, 4 multiplications and 2 bit-shifts. A previously mentioned (28, 56) approximation of ODST-3 requires 11 additions, 5 multiplications and 3 bit-shifts, while two other implementations require either 10 additions, 5 multiplications and 1 bit-shift or 11 additions, 4 multiplications and 2 bit-shifts.
The contribution shows that, in principle, either of the two versions can be implemented with similarly fast algorithms. Some argument was made that the (28, 56) version saves some gates in adders, but the saving seems to be minimal. The group decided to leave the DST as it currently is.
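The HM3-style operation count quoted above (11 additions, 8 multiplications) can be illustrated with a short sketch. The matrix layout and the function names below are assumptions for illustration, not taken from the contribution: the 4-point transform is assumed to use the coefficients (a, b, c, a+b), which gives (29, 55, 74, 84) or (28, 56, 74, 84). The factorization relies only on the identity a + b = d; it is not one of the lower-count variants proposed in JCTVC-F153.

```python
def odst3_matrix(a, b, c):
    """Assumed 4-point integer ODST-3 matrix with coefficients (a, b, c, a+b)."""
    d = a + b
    return [
        [a, b, c, d],
        [c, c, 0, -c],
        [d, -a, -c, b],
        [b, -d, c, -a],
    ]


def odst3_direct(x, a=29, b=55, c=74):
    """Reference: plain matrix-vector product."""
    m = odst3_matrix(a, b, c)
    return [sum(m[k][n] * x[n] for n in range(4)) for k in range(4)]


def odst3_factored(x, a=29, b=55, c=74):
    """Factored form: 3 shared additions, then 8 multiplications in total."""
    x0, x1, x2, x3 = x
    s03 = x0 + x3                    # addition 1
    s13 = x1 + x3                    # addition 2
    d01 = x0 - x1                    # addition 3
    t = c * x2                       # multiplication 1
    y0 = a * s03 + b * s13 + t       # 2 multiplications, 2 additions
    y1 = c * (x0 + x1 - x3)          # 1 multiplication, 2 additions
    y2 = a * d01 + b * s03 - t       # 2 multiplications, 2 additions
    y3 = b * d01 - a * s13 + t       # 2 multiplications, 2 additions
    return [y0, y1, y2, y3]
```

Because both functions take (a, b, c) as parameters, the same check can be run for the (28, 56) variant by calling them with `a=28, b=56`.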
6.16.2.1.1.1.1.1.2 JCTVC-F225 Consideration of reference pixel availability for adaptive DCT/DST decision [Y. Shibahara, T. Nishi (Panasonic)]
In Working Draft 3 (WD3) of HEVC, the 4x4 DST is used for intra prediction units depending on the intra prediction mode. The DST shows better coding gain than the DCT when it is used together with the coupled intra prediction modes. In the latest design of HEVC, however, the availability of reference pixels is not considered in the choice of DST, although it may affect the coding gain of the transform. This contribution proposes checking the availability of the reference pixels before deciding to use the DST. An experimental result reportedly shows that the proposed method provides 0.01%, 0.01%, 0.02% and 0.00% gain for AI HE, AI LC, RA HE and RA LC, respectively, under the common conditions. Under the condition using slices (1500 bytes per slice), an experimental result reportedly shows that the proposed method provides 0.01%, 0.10%, 0.10%, 0.03% and 0.05% gain for AI HE, AI LC, RA HE and RA LC, respectively. With CIP (constrained intra prediction) enabled, an experimental result shows that the proposed method provides 0.07% and 0.05% gain for RA HE and RA LC, respectively. The increase in encoding and decoding time is reported to be less than 1% for all results.
No action.
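The idea of gating the DST choice on reference availability can be sketched as follows. This is a hypothetical illustration only: the function name, the string labels, and in particular the pairing of the horizontal transform with the left reference and the vertical transform with the top reference are assumptions, not details taken from JCTVC-F225.

```python
DCT, DST = "DCT", "DST"


def select_transforms(table_choice, left_available, top_available):
    """Fall back from DST to DCT when the assumed reference side is missing.

    table_choice is the (horizontal, vertical) pair that a mode-dependent
    look-up table would give; the availability pairing is an assumption.
    """
    horiz, vert = table_choice
    if horiz == DST and not left_available:
        horiz = DCT  # no left reference pixels: keep the DCT horizontally
    if vert == DST and not top_available:
        vert = DCT   # no top reference pixels: keep the DCT vertically
    return horiz, vert
```

For example, `select_transforms((DST, DST), True, False)` keeps the horizontal DST and falls back to the DCT vertically.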
6.16.2.1.1.1.1.1.3 JCTVC-F485 Cross check report for Panasonic's improving method for adaptive DCT/DST (JCTVC-F225) [A. Ichigaya, S. Yasuko (NHK)] [initial version rejected as placeholder; corrected version late upload 07-06]
6.16.2.1.1.1.1.1.4 JCTVC-F656 Cross Check Report for Consideration of reference pixel availability for adaptive DCT/DST decision (JCTVC-F225) [Ankur Saxena, Felix Fernandes (Samsung)] [late reg. 07-05, upload 07-12]
6.16.2.1.1.1.1.1.5 JCTVC-F295 Harmonizing ROT and SDIP [Z. Ma, F. Fernandes, E. Alshina, A. Alshin (Samsung)]
The Rotational Transform (ROT) is a secondary transform applied to 8x8 DCT-coefficient blocks, while Short Distance Intra Prediction (SDIP) improves intra prediction by partitioning spatial-domain blocks into squares or rectangles. These tools were formerly tested separately in HM 2.0. In this contribution it is reportedly demonstrated that the tools can be harmonized so that their gains are almost additive when both tools are combined in HM 3.0. Specifically, it is reportedly shown that for the All-Intra (AI) case, the ROT provides 0.7% (HE) / 0.8% (LC) additional gain over SDIP, and for the Random-Access case, the ROT provides 0.4% (HE) / 0.4% (LC) additional gain over SDIP.
The reported gain is slightly lower (by 0.2%) than without SDIP.
6.16.2.1.1.1.1.1.6 JCTVC-F535 Crosscheck report for Samsung's harmonizing ROT and SDIP [C. Lai, L. Liu, J. Zheng (HiSilicon)] [late upload 07-07]
6.16.2.1.1.1.1.1.7 JCTVC-F553 Mode-Dependent DCT/DST for 4x4 Chroma Blocks [Ankur Saxena, Felix Fernandes, Elena Alshina, Jianle Chen (Samsung)]
The 4x4 mode-dependent DCT/DST proposal from JCTVC-E125 was adopted at the Geneva meeting in March 2011 for 4x4 Luma blocks. In this proposal, the mode-dependent DCT/DST scheme is extended to 4x4 Chroma blocks as well. Experimental results are provided with HM 3.0 as anchor for the test conditions as stipulated for Core Experiment 7. Average BD Rate gains of 0.3%, 0.7%, 0.6% and 0.6% are reported for Chroma components in Intra High Efficiency, Intra Low Complexity, Random Access High Efficiency, and Random Access Low Complexity settings.
Only for horizontal and vertical modes.
What is the additional hardware (number of gates) complexity?
It relies on how the mode is coded.
Further study (continuation of CE7).
6.16.2.1.1.1.1.1.8 JCTVC-F679 Cross Check Report for Samsung's Proposal "Mode-Dependent DCT/DST for 4x4 Chroma Blocks" (JCTVC-F553) by Panasonic [Y. Shibahara, T. Nishi (Panasonic)] [late reg. 07-07, upload 07-07]
6.16.2.1.1.1.1.1.9 JCTVC-F554 On secondary transforms for intra prediction residual [A. Saxena, F. Fernandes (Samsung)]
It was reportedly shown previously by Han, Saxena & Rose at ICASSP 2010 that, following intra prediction, the optimal transform is not a DCT but a DST Type-7, with performance close to the KLT along the direction of prediction, for the horizontal and vertical modes. The 4x4 DST from JCTVC-E125 was adopted at the HEVC Geneva meeting in March 2011. Shibahara & Nishi had proposed a mode-dependent 2-step transform in JCTVC-D151. In this contribution, various mode-dependent secondary transforms (MD-ST) are presented, which are applied after the DCT based on the intra prediction mode. The proposed DCT/MD-ST transform scheme is applied at block sizes of 8x8 and higher. No additional signalling information or R-D search is required during encoding, and the algorithm works in a single pass. The conventional quantization tables for HM 3.0 are retained and no changes have been made to the scanning order. Experimental results are provided with HM 3.0 as anchor for the test conditions as stipulated for Core Experiment 7. Average BD Rate gains of 0.6%, 0.5%, 0.3% and 0.3% (respectively 0.4%, 0.3%, 0.3% and 0.2%) are reported for Intra High Efficiency, Intra Low Complexity, Random Access High Efficiency, and Random Access Low Complexity settings, respectively, for the 8x8 MD-ST (respectively 4x4 MD-ST).
Question: Is a forward DCT needed at the decoder side? Maybe, depending on how it is implemented.
Similarity with JCTVC-F224, using the same correlation model, gain due to the larger block sizes.
Further Study (continuation of CE7).
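As a hedged illustration of the DST Type-7 mentioned for this contribution, the sketch below builds the orthonormal N-point DST-VII basis from its standard closed form, sin(π(2k+1)(n+1)/(2N+1)) scaled by 2/√(2N+1), and rounds a scaled copy to integers. The function names and the gain of 128 are assumptions chosen for illustration; with that gain the 4-point first row comes out as (29, 55, 74, 84), which matches the (29, 55) approximation quoted for JCTVC-F153.

```python
import math


def dst7_matrix(n_points):
    """Orthonormal DST-VII basis: row k, column n."""
    scale = 2.0 / math.sqrt(2 * n_points + 1)
    return [
        [
            scale * math.sin(math.pi * (2 * k + 1) * (n + 1) / (2 * n_points + 1))
            for n in range(n_points)
        ]
        for k in range(n_points)
    ]


def integer_dst7(n_points=4, gain=128):
    """Integer approximation; the gain value is an assumption, not from the report."""
    return [[round(gain * v) for v in row] for row in dst7_matrix(n_points)]


if __name__ == "__main__":
    for row in integer_dst7():
        print(row)
```

The rounding step is where integer variants such as (28, 56) depart from the exact basis; this sketch only reproduces the nearest-integer version.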
6.16.2.1.1.1.1.1.10 JCTVC-F716 Cross-check of Samsung's proposal JCTVC-F554 [J. Xu] [late reg. 07-12, upload 07-14]
6.16.2.1.1.1.1.1.11 JCTVC-F680 Cross Check Report for Samsung's Proposal "On secondary transforms for intra prediction residual" (JCTVC-F554) by Panasonic [Y. Shibahara, T. Nishi (Panasonic)] [late reg. 07-07, upload 07-07]
6.16.2.1.1.1.1.1.12 JCTVC-F591 Modified Selection of 4x4 Mode-Dependent Transforms [R. Cohen, A. Vetro, H. Sun (MERL)]
In the current Working Draft, for 4x4 Luma Intra prediction residuals, the horizontal and vertical transform types horizTrType and vertTrType depend upon the Intra prediction mode, as specified by a look-up table. In this contribution, for certain prediction modes in which horizTrType and vertTrType differ, the encoder performs a rate-distortion optimized decision on whether to swap the values of horizTrType and vertTrType. This swap is signaled using a TrToggle flag the first time an applicable TU occurs in a PU, for each Intra-coded PU. The proposed method was implemented on top of HM 3.0. BD-Rate changes were reported for class C and D sequences, using an unmodified HM 3.0 as a reference, with an average impact of about −0.1%, and with the change for larger sizes averaging 0.0%. The encoding time percentages with the current software implementation are 109–114% for all-Intra cases and 98–100% for the random access and low-delay cases. Decoding time percentages range from 99–102%.
Due to the need for signalling, the gain is low. Encoder time is increased.
No action taken – further study by proponent is required to show an improved gain and complexity tradeoff.
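The encoder-side swap decision described for JCTVC-F591 can be sketched roughly as below. The function name and the `rd_cost` callback are hypothetical, and the sketch simplifies the proposal by treating every mode with differing types as eligible, whereas the contribution restricts the swap to certain prediction modes. The cost callback is assumed to already include the bits for the flag.

```python
def choose_transform_types(horiz_tr_type, vert_tr_type, rd_cost):
    """Return ((horiz, vert), tr_toggle) for one block.

    rd_cost(horiz, vert) is a hypothetical callback returning the
    rate-distortion cost of coding the block with that transform pair.
    tr_toggle is None when no flag would be signalled.
    """
    default = (horiz_tr_type, vert_tr_type)
    if horiz_tr_type == vert_tr_type:
        return default, None  # swapping changes nothing, so no flag
    swapped = (vert_tr_type, horiz_tr_type)
    tr_toggle = rd_cost(*swapped) < rd_cost(*default)
    return (swapped if tr_toggle else default), tr_toggle
```

Evaluating both transform pairs for each eligible block is consistent with the higher all-Intra encoding times reported above, since the extra work falls on the encoder only.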