JCTVC-B038 [D. Wang, J. Song, H. Yang, M. Yang, H. Yu (Huawei), Xin Zhao, Siwei Ma (Peking Univ.)] Performance improved directional transform techniques
This document reports two directional transform techniques that have been updated since the last meeting. The first is the directional multiple transform (DMT) for inter residual blocks. Based on the direction information of the residual signal, multiple offline-trained transforms are used in order to match the texture features of the residual. With improved temporal predictive coding and syntax embedding implemented, the performance gain is further increased. The other technique is the rate distortion optimized transform (RDOT) for intra residual blocks, as described in detail in [2]. The transforms and the coefficient scanning order are further optimized, and the performance gain is increased under the discussed Transform AhG test condition. Moreover, since the transform used for a block is selected at the encoder side, it does not introduce extra computational complexity at the decoder side.
Directional Multiple Transform (DMT) introduces the possibility to select between the DCT and one of eight different (trained) directional transforms (KLT) for inter prediction blocks. The syntax is split into a flag that selects between DCT and directional transform (DirT) and one element that selects which directional transform is used.
The transforms are trained using QCIF and CIF data outside of the training set. Classification is based on direction characteristics.
Implemented in KTA2.6r1; the new transforms are only used for the 8x8 block case. Performance gain is approx. 1.2% for CS1 and 2.4% for CS2. About 30% of blocks are 8x8, and about 50% of these use directional transforms.
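The two-part DMT syntax noted above (a flag selecting DCT or a directional transform, plus one element selecting which directional transform) can be illustrated with a minimal sketch. The fixed-length 3-bit index, the bit ordering, and the function names `write_transform_choice` and `read_transform_choice` are assumptions made for illustration only; the notes do not state how the contribution binarizes or entropy-codes these elements.

```python
NUM_DIRECTIONAL = 8  # eight trained directional transforms, as stated in the notes


def write_transform_choice(choice):
    """Return the bits for a transform choice.

    choice is either the string "DCT" or an int 0..7 naming a directional
    transform. Assumed layout (hypothetical): flag bit 0 = DCT, flag bit 1 =
    directional transform followed by a 3-bit index, MSB first.
    """
    if choice == "DCT":
        return [0]
    if not isinstance(choice, int) or not 0 <= choice < NUM_DIRECTIONAL:
        raise ValueError("choice must be 'DCT' or an int in 0..7")
    return [1] + [(choice >> shift) & 1 for shift in (2, 1, 0)]


def read_transform_choice(bits):
    """Parse a transform choice; return (choice, number_of_bits_consumed)."""
    if bits[0] == 0:
        return "DCT", 1
    index = (bits[1] << 2) | (bits[2] << 1) | bits[3]
    return index, 4
```

Under this assumed layout, a block coded with the DCT costs one bit of transform signaling and a block coded with a directional transform costs four.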
Rate Distortion Optimized Transform (RDOT) is a similar concept (a choice among different transforms), used instead of MDDT. Whether RDOT is used is signaled at the MB level. For 4x4 blocks there are 2 different transforms per horizontal/vertical direction, which makes 4 possible combinations to be checked and signaled in each mode. For 8x8 blocks there are 4 different transforms per direction, i.e. 16 combinations.
Gain is approx. 2% compared to MDDT.
Encoding time of RDOT is increased 4.5x (with early break in mode decision), decoding time 1.25x. No clear numbers are available for DMT.
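The RDOT combination counts above (2 transforms per direction giving 4 pairs for 4x4 blocks, 4 per direction giving 16 pairs for 8x8 blocks) follow from pairing every horizontal candidate with every vertical candidate. The sketch below enumerates the pairs and picks the one with the lowest rate-distortion cost. The names `candidate_pairs` and `select_pair`, and the `rd_cost` callback, are hypothetical; the exhaustive search shown here is an assumption and does not model the early break in mode decision mentioned in the notes.

```python
from itertools import product


def candidate_pairs(num_per_direction):
    """All (horizontal_index, vertical_index) transform pairs for one block size."""
    return list(product(range(num_per_direction), repeat=2))


def select_pair(num_per_direction, rd_cost):
    """Return the pair minimizing rd_cost(pair).

    rd_cost is a caller-supplied function (hypothetical) that returns the
    rate-distortion cost J = D + lambda * R of coding the block with the pair.
    """
    return min(candidate_pairs(num_per_direction), key=rd_cost)


# Toy usage with a made-up cost table for a 4x4 block (2 transforms per direction).
toy_costs = {(0, 0): 10.0, (0, 1): 8.5, (1, 0): 9.0, (1, 1): 12.0}
best = select_pair(2, toy_costs.__getitem__)
```

Because every pair must be evaluated by the encoder in such a search, the number of candidates grows with the square of the number of transforms per direction, which is consistent with the encoder-side complexity increase reported above.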
Further study – requires complete implementation of DMT, currently RDOT appears to add complexity without giving high bitrate reduction.