wg 11



7.5 Transforms (5)

Contributions in this category were discussed Sunday 15 April 0910–0940 (chaired by GJS and JRO) and 1130–1305 (chaired by JRO).

JVET-J0040 Set of Transforms [M. Siekmann, B. Stallenberger, C. Bartnik, J. Pfaff, D. Marpe, H. Schwarz, T. Wiegand (HHI)]

This contribution was discussed Sunday 15 April 0910–0935.

In this document, an adaptive selection of transforms for residual coding is proposed. For each residual block, a set of 5 transform candidates is chosen from AMT (adaptive multiple transforms, i.e. DCT/DST-like transforms), NSST (non-separable secondary transforms) and additional offline-trained (non-separable) secondary transforms. Relative to the JEM, there are additional secondary transforms. It is reported that, relative to testing the product space of AMT and NSST transforms as implemented in JEM, the coding efficiency is improved while the encoder run time is reduced.

Primary and secondary transforms are coupled to produce a set of transform candidates. Syntax is modified to select among these.

Test results for 49-frame segments of the CTC test sequences and CfP test sequences at higher QP values were provided. The reference was a segmentation with QTBT and triple-tree split in the HHI "NextSoftware" codebase. Overall gains of roughly 5.5%, 3.8%, and 3.0% were reported for AI, RA, and LD, respectively.

Comments:

There would be more effect for intra.
It was suggested that the syntax scheme and the restriction of the transform set may be providing the most gain rather than the particular transform set.

This is certainly of interest for further study.

JVET-J0054 Coupled primary and secondary transform [X. Zhao, X. Li, S. Liu (Tencent)]

This contribution was discussed Sunday 15 April 0935–1010.

This contribution reports a coupled primary and secondary transform (CPST) scheme. Instead of signalling the indices of the primary and secondary transforms independently, the primary and secondary transforms are coupled and signalled by a single transform index. With the proposed method, on top of the Tencent CfP response JVET-J0029, it is reported that a 22% overall encoder run-time saving is achieved with 0.4% loss for the all-intra configuration.

Only 5 options (DCT-2, EMT-0 and EMT-n+NSST-n, n=1..3)

e.g. EMT-0 is DST-7

Same combinations of transforms applied to luma and chroma.
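
The single-index signalling noted above can be pictured as a small lookup table. The following is a minimal sketch, not the proponents' syntax: the index order, the assumption that EMT-0 is used without a secondary transform, and the names `TransformChoice`, `coupled_candidates` and `decode_coupled_index` are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Optional


class TransformChoice(NamedTuple):
    primary: str              # label of the primary transform
    secondary: Optional[str]  # label of the secondary transform, None if not applied


def coupled_candidates() -> List[TransformChoice]:
    """Build the 5 options from the notes: DCT-2, EMT-0, EMT-n + NSST-n (n = 1..3).

    Assumption: DCT-2 and EMT-0 carry no secondary transform, and the
    list order is the index order. Neither is stated in the notes.
    """
    candidates = [TransformChoice("DCT-2", None), TransformChoice("EMT-0", None)]
    candidates += [TransformChoice(f"EMT-{n}", f"NSST-{n}") for n in (1, 2, 3)]
    return candidates


def decode_coupled_index(index: int) -> TransformChoice:
    """Map one signalled index to a (primary, secondary) pair."""
    table = coupled_candidates()
    if not 0 <= index < len(table):
        raise ValueError(f"coupled transform index out of range: {index}")
    return table[index]


if __name__ == "__main__":
    for idx in range(len(coupled_candidates())):
        print(idx, decode_coupled_index(idx))
```

With one index covering both stages, the encoder only has to test these 5 candidates rather than every primary/secondary pairing, which is consistent with the reported encoder run-time saving.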

The comparison used JVET-J0029 with EMT/NSST off as the anchor in AI CTC. JVET-J0029 also signals the EMT/NSST combinations differently from JEM, but allows more combinations than J0054. Whereas JVET-J0029 had a 5.3% bit-rate reduction when turning on EMT/NSST, JVET-J0054 has 4.9%.

It was observed that the encoder runtime decreases, but the decoder runtime increases. Why? Likely because NSST is used more often, and NSST is implemented as a matrix multiply.

Note that transforms of JVET-J0029 and J0054 are somewhat different from JEM (in particular, different set of secondary transforms).

For further study.

JVET-J0062 Non-Separable Secondary Transform Implementations with Reduced Memory via Hierarchically Structured Matrix-based Transforms [A. Said, H. Egilmez, V. Seregin, M. Karczewicz (Qualcomm)]

This contribution was discussed Sunday 15 April 1130–1200 (chaired by JRO).

This contribution presents hierarchically structured matrix-based transforms (HSMTs) for non-separable secondary transformation (NSST) as alternatives to HyGT-based NSST implementations. The proposed set of HSMTs reduces the NSST memory use in JEM7 by 131 Kbits (19%) and provides very similar coding gains under CTC test conditions, in AI and RA configurations.

Presentation deck to be uploaded.

HSMT implements NSST using multiple passes of smaller transforms for a given block. However, instead of using pairwise Givens rotations (i.e., butterfly structures) or a full matrix, a hierarchical structure with multiple passes consisting of smaller matrices and permutations is used to define a non-separable transform.
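
The multi-pass structure described above can be illustrated with a toy example. This is only a sketch under stated assumptions: the 64-point size, the 8x8 block size, the four passes, the permutations and the rotation-based blocks are all hypothetical, since the notes do not give the actual HSMT matrices. It shows that passes of small orthogonal blocks followed by permutations still form an orthogonal transform, and that fewer coefficients need to be stored than for a full matrix.

```python
import math


def rotation(theta):
    """2x2 rotation matrix (orthogonal)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def kron(a, b):
    """Kronecker product of two square matrices; orthogonal inputs give an orthogonal output."""
    nb = len(b)
    n = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(n)] for i in range(n)]


def toy_block(seed):
    """Hypothetical 8x8 orthogonal block built from three arbitrary angles."""
    angles = [0.3 + 0.1 * seed, 0.7 + 0.05 * seed, 1.1 - 0.02 * seed]
    return kron(kron(rotation(angles[0]), rotation(angles[1])), rotation(angles[2]))


def apply_pass(x, blocks, perm):
    """Multiply each segment of x by its own small block, then permute the outputs."""
    size = len(blocks[0])
    y = []
    for g, block in enumerate(blocks):
        seg = x[g * size:(g + 1) * size]
        y.extend(sum(block[r][c] * seg[c] for c in range(size)) for r in range(size))
    return [y[p] for p in perm]


def hsmt_like(x, passes):
    """Run all passes in sequence."""
    for blocks, perm in passes:
        x = apply_pass(x, blocks, perm)
    return x


N, B = 64, 8
transpose = [B * (i % B) + i // B for i in range(N)]  # row/column swap of the 8x8 layout
stride = [(5 * i) % N for i in range(N)]              # arbitrary permutation (5 is coprime with 64)
identity = list(range(N))
perms = [transpose, transpose, stride, identity]      # passes 0 and 1 act on rows, then columns
passes = [([toy_block(8 * p + g) for g in range(N // B)], perms[p]) for p in range(4)]

x = [float((7 * i) % 11 - 5) for i in range(N)]
y = hsmt_like(x, passes)
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in y)

# Coefficients stored if every small block is kept as a dense 8x8 matrix.
stored_hier = sum(len(blocks) * B * B for blocks, _ in passes)
stored_full = N * N

if __name__ == "__main__":
    print("energy preserved:", abs(energy_in - energy_out) < 1e-9)
    print("stored coefficients:", stored_hier, "vs full matrix:", stored_full)
```

The counts in this toy layout are not related to the 131 Kbit figure reported for JEM7; they only show why a hierarchy of small matrices can need less memory than one full non-separable matrix.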

Number of transforms is reduced from 35x3 to 13x3.

Compared to the JEM anchor, the bit rate changes by −0.02% for AI and +0.01% for RA CTC. No change in encoder/decoder run time.

It was commented that a similar approach had previously been proposed in JVET-D0085. That approach was similar to passes 0 and 1 suggested in J0062, i.e. basically separable (row/column). The proponent asserted that using only these passes would result in a loss, which is why passes 2 and 3 are added here.

For further study in terms of implementation aspects of NSST.

JVET-J0064 Prediction dependent transform for intra and inter frame coding [Y. Lin, M. Mao, S. Song, J. Zheng, J. An (HiSilicon), C. Zhu (UESTC)]

This contribution was discussed Sunday 15 April 1200–1240 (chaired by JRO).

This contribution presents a prediction-dependent transform for intra and inter frame coding to enable a better trade-off between coding efficiency and complexity. In total, two kinds of transform cores, DCT-2 and DST-7, are utilized in this contribution. The transform selection depends on the prediction characteristics of the current block. For residuals of intra-coded blocks, an intra-prediction-mode-dependent transform is applied to both luma and chroma components. For residuals of inter-coded blocks, the transform selection depends on the position of the selected spatial MV candidate in HEVC merge mode. In addition, DST-7 is always applied to the residual of FRUC template matching mode. It is reported that the proposed prediction-dependent transform achieves a better balance between coding performance and encoding time.
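
For reference, the two transform cores named above can be written down in floating point. This is a sketch of the textbook orthonormal DCT-2 and DST-7 definitions, not the integer cores of the proposal or of JEM; the function names are chosen for illustration.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-2: row i is the i-th basis function."""
    rows = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n)) for j in range(n)])
    return rows


def dst7_matrix(n):
    """Orthonormal DST-7: row i is the i-th basis function."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1)) for j in range(n)]
            for i in range(n)]


def max_orthonormality_error(t):
    """Largest deviation of T * T^T from the identity matrix."""
    n = len(t)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            dot = sum(t[a][k] * t[b][k] for k in range(n))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst


if __name__ == "__main__":
    for name, t in (("DCT-2", dct2_matrix(8)), ("DST-7", dst7_matrix(8))):
        print(name, "orthonormality error:", max_orthonormality_error(t))
        print(name, "first basis function:", [round(v, 3) for v in t[0]])
```

The first DCT-2 basis function is flat, while the first DST-7 basis function grows monotonically from one block edge to the other, which is the usual motivation for choosing DST-7 when the residual energy is expected to increase with distance from the prediction source.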

The proposal replaces the switchable primary transform of JEM with a mode-dependent switching between combinations of DCT-II and DST-VII. The results indicate a loss of 1.76% in AI and 1.23% in RA CTC (but only with test sequences from CfP classes UHD and HD). For the tool-off configuration, the gain is less than when enabling EMT. The main advantage claimed is an encoder runtime reduction (50% in AI, 90% in RA). Decoder runtime is not changed. However, the number of different transforms is reduced.

More evidence would be necessary that the implicit transform switching for inter cases is beneficial.

JVET-J0066 Complexity Reduction for Adaptive Multiple Transforms (AMT) using Adjustment Stages [A. Said, H. Egilmez, V. Seregin, M. Karczewicz (Qualcomm)]

This contribution was discussed Sunday 15 April 1240–1305 (chaired by JRO and GJS).

This contribution presents a proposed reduction of the complexity of AMT by approximating the AMT transforms using only a transform similar to a DCT-2 and adjustment stages of low complexity. The proposed adjustment stages are defined using sparse block-band orthogonal matrices, which reportedly provide good approximation for the set of AMTs used in JEM7. It is reported that employing matrices with not more than 4 nonzero elements per row results in very small changes in coding gains (on average less than 0.05% in BD-rate under CTC conditions).

This was proposed for block lengths of 16 and larger.

The proposal did not provide full detail of what was proposed (e.g. the tap values).

The proposal is to use DCT-2/-3 and DST-2/-3 type families. These can use the same fast transform algorithm, but require an additional “adjustment stage”, which can be implemented as a matrix multiply and interpreted as a 4-tap spatially varying FIR filter. This is useful for larger transforms (16 and larger). Less loss is observed when 6-tap filters are used.

Some concern was raised that the number of multiplications is increased by the adjustment stage. The proponent however points out that this may still be less than for a full matrix multiply which would be necessary for some of the AMT transforms which don’t have fast algorithms.
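
The multiplication-count argument can be made concrete with a toy adjustment stage. Since the contribution did not disclose the tap values, everything numeric here is hypothetical: the stage is assumed to be block-diagonal with 4x4 orthogonal blocks (a simple special case of a sparse block-band orthogonal matrix with 4 nonzero elements per row), and the names `toy_adjustment_stage` and `apply_stage` are illustrative only.

```python
import math


def rotation(theta):
    """2x2 rotation matrix (orthogonal)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def kron(a, b):
    """Kronecker product of two square matrices."""
    nb = len(b)
    n = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(n)] for i in range(n)]


def toy_adjustment_stage(n):
    """Hypothetical stage stored row by row: 4 taps per output plus the first input index."""
    taps, offsets = [], []
    for g in range(n // 4):
        # Arbitrary small angles; real tap values were not disclosed.
        block = kron(rotation(0.05 * (g + 1)), rotation(-0.03 * (g + 1)))
        for r in range(4):
            taps.append(block[r])
            offsets.append(4 * g)
    return taps, offsets


def apply_stage(coeffs, taps, offsets):
    """Apply the stage as a position-dependent 4-tap filter and count multiplications."""
    out, mults = [], 0
    for row_taps, start in zip(taps, offsets):
        out.append(sum(t * coeffs[start + k] for k, t in enumerate(row_taps)))
        mults += len(row_taps)
    return out, mults


n = 16
coeffs = [float((3 * i) % 7 - 3) for i in range(n)]  # stand-in for DCT-2 output coefficients
taps, offsets = toy_adjustment_stage(n)
adjusted, stage_mults = apply_stage(coeffs, taps, offsets)
full_matrix_mults = n * n  # cost of a length-n transform done as a full matrix multiply

if __name__ == "__main__":
    print("adjustment stage multiplications:", stage_mults)
    print("full matrix multiplications:", full_matrix_mults)
```

The stage costs 4 multiplications per coefficient on top of the fast DCT-2-like transform, whereas a full matrix multiply costs n per coefficient, so the gap widens with block length. This matches the proponent's argument for lengths of 16 and larger, but says nothing about how well such a stage approximates the actual AMT transforms.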

No speed impact was evident in the JEM context.

It was asked if the cascading of forward and inverse transforms introduces reconstruction errors. How would it perform with low QPs?

No information is given about the precise matrices of the adjustment stages.

For further study in terms of implementation aspects of AMT.

Comments:

It was asked whether there is a measurable speed impact. The proponent said the implementation was not sufficiently optimized to test this.
How much rounding error is introduced by a cascade of forward and inverse transforms?
It was asked whether this had been tested with very small QP values. This had not been tested.
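
One way to start quantifying the rounding question is to run a forward and inverse pass through an integer-rounded transform and measure the reconstruction error at different coefficient precisions. The sketch below uses a plain DCT-2 with no adjustment stage and no quantization, so it is only a stand-in for the proposed cascade; the bit depths and sample values are arbitrary assumptions.

```python
import math


def dct2_matrix(n):
    """Orthonormal floating-point DCT-2."""
    rows = []
    for i in range(n):
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n)) for j in range(n)])
    return rows


def integer_transform(n, bits):
    """Scale the DCT-2 by 2**bits and round every entry to an integer."""
    scale = 1 << bits
    return [[int(round(scale * v)) for v in row] for row in dct2_matrix(n)], scale


def max_round_trip_error(x, bits):
    """Forward transform, then inverse via the transpose; return the largest sample error."""
    n = len(x)
    t, scale = integer_transform(n, bits)
    coeffs = [sum(t[i][j] * x[j] for j in range(n)) for i in range(n)]
    recon = [sum(t[i][j] * coeffs[i] for i in range(n)) / (scale * scale) for j in range(n)]
    return max(abs(a - b) for a, b in zip(x, recon))


samples = [12, -7, 100, 255, -128, 3, 64, -31]  # arbitrary residual-like values

if __name__ == "__main__":
    for bits in (6, 8, 10, 12):
        print(bits, "bits:", max_round_trip_error(samples, bits))
```

At low QP the quantization step is small, so a reconstruction error of this kind is no longer masked by quantization noise, which is why the question about very small QP values matters.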
