7.4 Loop filters (4)
Contributions in this category were discussed Saturday 14 April, 1730–XXXX (chaired by JRO and GJS).
JVET-J0038 Signal Adaptive Diffusion Filters for Video Coding [J. Pfaff, J. Rasch, M. Schäfer, H. Schwarz, M. Winken, A. Henkel, M. Siekmann, D. Marpe, T. Wiegand (HHI)]
In this document, diffusion filters are introduced that may be applied to the prediction signal generated by a hybrid video codec. Two types of diffusion filters are proposed: linear and nonlinear. The linear diffusion filters correlate the extended prediction signal n times using a symmetric filter mask. The nonlinear diffusion filters use the input prediction signal to identify structures of the underlying signal and diffuse along edges rather than perpendicular to them. It is reported that the proposed diffusion filters lead to a coding gain of up to −1.45% in All Intra and −2.00% in Random Access configuration.
Average gains are reported for HD and UHD sequences (mixture of CTC class A/B and CfP sequences). In RA, bit rate reduction is 1.09% for high rates (QP22..37), and 0.65% for lower rates. Gain seems to be higher for higher resolutions.
Not a loop filter, but rather in the prediction signal generation
Up to five different configurations, depending on block size and temporal layer
Max iterations is 35 in linear case, 20 in nonlinear (where each of the iterations is more complex)
Does not have impact on additional memory access, but the number of operations is clearly higher than for interpolation filtering.
For further study.
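The linear case of J0038 described above (repeated application of a symmetric mask to the extended prediction signal) can be illustrated with a minimal sketch. The 3x3 binomial mask, the border-replication rule, and the function name `diffuse_linear` are assumptions made for illustration only; the contribution's actual masks and extension rule are not given in these notes.

```python
def diffuse_linear(pred, n_iter, mask=((1, 2, 1), (2, 4, 2), (1, 2, 1))):
    """Apply a symmetric 3x3 mask n_iter times to a 2-D prediction block.

    The mask and the border handling are illustrative assumptions,
    not the filters proposed in JVET-J0038.
    """
    norm = sum(sum(row) for row in mask)
    h, w = len(pred), len(pred[0])
    cur = [[float(v) for v in row] for row in pred]
    for _ in range(n_iter):
        nxt = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        # Extend the block by replicating border samples (assumed rule).
                        yy = min(max(y + dy, 0), h - 1)
                        xx = min(max(x + dx, 0), w - 1)
                        acc += mask[dy + 1][dx + 1] * cur[yy][xx]
                nxt[y][x] = acc / norm
        cur = nxt
    return cur


# A single impulse spreads to its neighbours after one iteration.
block = [[0] * 5 for _ in range(5)]
block[2][2] = 16
smoothed = diffuse_linear(block, n_iter=1)
```

Each iteration touches every sample with a full mask, which is consistent with the note that the operation count is clearly higher than for interpolation filtering when up to 35 iterations are allowed.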
JVET-J0056 Multi-Dimensional Filter Selection for Deblocking [J. Dong, Y.-H. Chao, W.-J. Chien, L. Zhang, M. Karczewicz (Qualcomm)]
This contribution presents a multi-dimensional filter selection scheme for deblocking. The filter selection for a sample is four-dimensional, i.e., determined by four factors: the average local activity of blocks P and Q, the difference between the local activities of blocks P and Q, the type of the block to which the sample belongs (Type 0 or Type 1), and the distance from the segment. For a given combination of the four factors, the filter index is not fixed, but adaptively determined by the encoder and signalled in the bitstream. According to the proponents, the scheme spends somewhat more computation in exchange for a significant coding efficiency improvement, while remaining easy to process in parallel.
Bit rate reduction is 1.25% for AI and 1.41% for RA in CTC. This is however with all tools except QTBT disabled.
It was asked how the performance would be with ALF enabled. Not known how it would perform with other tools on.
Encoder runtime is not changed, decoder runtime increases by 18% in AI, 10% in RA.
Activity criterion is based on 2nd derivative
15 different filter types (predefined)
The lookup table that maps the 4 criteria to a filter is determined at the encoder side and transmitted. The encoder designs the lookup table after encoding/decoding the frame: it determines which of the filters optimizes the reconstruction locally and designs the LUT based on that.
More information is needed on how it interacts with other tools.
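The selection mechanism of J0056 noted above (a second-derivative activity measure, four criteria, and an encoder-signalled lookup table into 15 predefined filters) might be sketched as follows. All thresholds, class counts, and function names here are hypothetical; the notes do not specify how the four criteria are quantized.

```python
def second_derivative_activity(samples):
    """Sum of absolute second differences along a line of samples."""
    return sum(abs(samples[i - 1] - 2 * samples[i] + samples[i + 1])
               for i in range(1, len(samples) - 1))


def classify(avg_activity, activity_diff, block_type, distance,
             act_thresholds=(8, 32), diff_thresholds=(4, 16), max_distance=3):
    """Quantize the four criteria into a LUT key (thresholds are assumed)."""
    a = sum(avg_activity >= t for t in act_thresholds)          # class 0..2
    d = sum(abs(activity_diff) >= t for t in diff_thresholds)   # class 0..2
    t = 1 if block_type else 0                                  # Type 0 or Type 1
    s = min(distance, max_distance)                             # clipped distance
    return (a, d, t, s)


def select_filter(lut, avg_activity, activity_diff, block_type, distance):
    """Look up the signalled filter index (0..14); fall back to filter 0."""
    return lut.get(classify(avg_activity, activity_diff, block_type, distance), 0)


# Hypothetical LUT as it might be decoded from the bitstream.
lut = {(2, 0, 1, 0): 14, (0, 0, 0, 0): 3}
idx = select_filter(lut, avg_activity=40, activity_diff=1, block_type=1, distance=0)
```

The decoder only performs the classification and the table lookup; under this reading, the encoder-side cost lies in designing the table after reconstructing the frame.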
JVET-J0071 Non-local Structure-based Filter with integer operation [X. Meng, C. Jia, Z. Wang, S. S. Wang, S. Ma (Peking University), X. Zheng (DJI)]
This contribution is a continuation of the NLSF (Nonlocal Structure in-loop Filter) technique that was proposed in JVET-J0011. It proposes an integer implementation of the NLSF algorithm. The NLSF design in J0011 contains two modules: group construction by block matching and SVD-based filtering. The collaborative filtering is achieved by iterative singular value decomposition (SVD), which calculates the singular values and their singular vectors with an iterative power method whose internal data type is double-precision floating point. To suit a video coding standard and to be hardware friendly, this proposal eliminates the double-precision values by clipping the decimal digits after scaling the intermediate results to large numbers during the iterations. The simulation results show that the proposed fixed-point algorithm for the SVD module can achieve performance comparable to the original NLSF algorithm.
Bit rate reduction compared to JEM7 (all tools on) is 0.86% for RA CfP, whereas the FP implementation of J0011 gave 1.25%. Similar for LDB, 1.64% integer, 1.92% FP version.
21 groups; patches are of size 6x6. A total of 21 SVDs have to be determined, each of which could consist of 36 basis functions at maximum, but it is reported that due to thresholding only 4.6 basis functions on average need to be computed. A maximum of 10 was found necessary.
Decoder runtime is increased by 316% (compared to 397% in FP)
Grouping/clustering, determination of the SVD basis, and computation of the SVD-based reconstruction are necessary at both the encoder and the decoder side.
Questions:
- How many operations for the grouping? (This could be the main reason for the complexity.)
- How many operations to determine one SVD basis at maximum (e.g. when restricting to 10 basis functions)?
Filtering itself probably is of less concern.
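To give a feel for the cost of determining one SVD basis, the following is a floating-point sketch of a power method with deflation that stops at a rank limit or a singular-value threshold. It is not the integer algorithm proposed in J0071; the function name, the iteration count, the starting vector, and the hard-threshold stopping rule are assumptions for illustration.

```python
import math


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def top_singular_triplets(a, max_rank=10, threshold=1e-6, n_iter=200):
    """Return up to max_rank (sigma, u, v) triplets via power iteration.

    Floating-point illustration only; J0071 replaces this with integer
    arithmetic, the details of which are not given in these notes.
    """
    a = [[float(x) for x in row] for row in a]
    rows, cols = len(a), len(a[0])
    triplets = []
    for _ in range(min(max_rank, rows, cols)):
        at = [list(col) for col in zip(*a)]
        v = [1.0] * cols  # assumed starting vector
        for _ in range(n_iter):
            w = matvec(at, matvec(a, v))  # one step of power iteration on A^T A
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        u = matvec(a, v)
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma <= threshold:
            break  # remaining components fall below the threshold
        u = [x / sigma for x in u]
        triplets.append((sigma, u, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(rows):
            for j in range(cols):
                a[i][j] -= sigma * u[i] * v[j]
    return triplets


basis = top_singular_triplets([[3, 0], [0, 2]])
```

Each retained basis function costs `n_iter` pairs of matrix-vector products in this sketch, so the reported average of 4.6 basis functions (and the limit of 10) directly scales the per-group work.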
JVET-J0077 Deblocking Improvements for Large CUs [W. Zhu, K. Misra, A. Segall (Sharp)]
The contribution proposes a deblocking process designed to reduce the blocking artifacts resulting from the use of large transforms and block sizes. Compared to the JEM deblocking approach, the process incorporates stronger filters for both luma and chroma. Additionally, the process includes a control process that considers the block sizes on both sides of the boundary being deblocked. The stronger filters are used for luma samples that correspond to larger block sizes, while the JEM deblocking filters are still used for luma samples corresponding to the smaller block sizes. For chroma, the filter is selected using a different approach, and the stronger filter is applied when chroma samples on either side of the deblocking boundary belong to a large block. It is reported that the proposed deblocking change improves subjective quality at low bit rates when compared with the deblocking used in JEM7, and it is proposed to include the technique in formal study.
R1 of the contribution includes a modification to the deblocking control process, resulting in application of a wider, stronger filter for large block boundaries when a block on either side of the boundary makes use of Local Illumination Compensation and the CBF is 0 for that block.
It is suggested to deblock 7 samples for large blocks and 3 samples for small blocks.
The results shown with still picture snapshots are obvious, and proponents believe that it is also visible in video. More extensive viewing tests would be necessary.
For further study.
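The per-side luma control suggested in J0077 (7 samples for large blocks, 3 samples for small blocks) could be sketched as below. The size threshold that defines a "large" block and the function name are hypothetical, since the notes do not state them.

```python
LARGE_BLOCK_THRESHOLD = 32  # hypothetical; the notes do not define "large"


def samples_to_deblock(size_p, size_q, large_threshold=LARGE_BLOCK_THRESHOLD):
    """Return (n_p, n_q), the number of luma samples modified on each side.

    size_p and size_q are the block dimensions perpendicular to the boundary.
    """
    def pick(size):
        return 7 if size >= large_threshold else 3

    return pick(size_p), pick(size_q)


# A large block next to a small one is filtered asymmetrically.
n_p, n_q = samples_to_deblock(64, 8)
```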