JCTVC-M0354 Non-SCE3: Cross-verification of JCTVC-M0220 on uni-directional combined prediction [V. Seregin (Qualcomm)] [late] [miss]
JCTVC-M0222 Non-SCE3.4: Simplified Generalized Combined Prediction [P. Lai, S. Liu, T.-D. Chuang, Y.-W. Huang, S. Lei (MediaTek)]
This contribution presents technical descriptions and test results of simplification methods applied on top of the SCE3.4 Generalized Combined Prediction (GCP) in JCTVC-M0221. In the first test set, GCP size limitations are imposed such that the minimum PU width and height is 8 for GCP uni-prediction and 16 for GCP bi-prediction. Furthermore, GCP is not applied to the chroma components (luma-only GCP). In the second test set, in addition to the GCP size limitations and luma-only GCP, the motion compensation interpolation filters in luma-only GCP are changed to bilinear filters. The test results are as follows.
Test set 1: GCP size limitations, luma-only GCP
EL+BL BD-rates of 2X / 1.5X / SNR:
3 GCP modes: RA −1.6% / −2.2% / −1.7%, LDP −3.6% / −4.8% / −4.2%, LDB −3.2% / −4.4% / −3.1%,
2 GCP modes: RA −1.3% / −1.9% / −1.6%, LDP −3.5% / −4.6% / −4.3%, LDB −2.8% / −3.8% / −2.9%. The average encoding runtimes are 126.4% and 118.7% for 3 and 2 GCP modes respectively; the decoding runtime is about 104%. The worst-case complexity (GCP on the minimum block size), in terms of memory bandwidth, is about 120% to 160% of that of SHM1.0.
Test set 2: GCP size limitations, luma-only GCP, GCP with bilinear MC
EL+BL BD-rates of 2X / 1.5X / SNR:
3 GCP modes: RA −1.4% / −1.7% / −1.7%, LDP −3.8% / −4.1% / −4.9%, LDB −2.5% / −3.0% / −2.7%. The average encoding runtime is 126.6%; the decoding runtime is about 102%. The worst-case complexity relative to SHM1.0 is about 112% in terms of computation and 80% in terms of memory bandwidth.
The method upsamples the BL reference picture, adds it to the EL reference picture before motion compensation, and then performs motion compensation jointly.
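The order of operations noted above (upsample, combine, then a single motion compensation pass) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from JCTVC-M0222 or SHM: the function names, the nearest-neighbour 2x upsampler (standing in for the actual upsampling filter), the equal combination weights with rounding, and the integer-pel motion compensation with edge clamping.

```python
def upsample_2x_nearest(pic):
    """Nearest-neighbour 2x upsampling (assumed stand-in for the real upsampling filter)."""
    out = []
    for row in pic:
        wide = [sample for sample in row for _ in range(2)]  # repeat each sample horizontally
        out.append(wide)
        out.append(list(wide))  # repeat each row vertically
    return out


def combine_refs(el_ref, bl_up, w_el=1, w_bl=1):
    """Weighted sum of two same-size pictures with rounding (weights are hypothetical)."""
    total = w_el + w_bl
    return [
        [(w_el * e + w_bl * b + total // 2) // total for e, b in zip(el_row, bl_row)]
        for el_row, bl_row in zip(el_ref, bl_up)
    ]


def motion_compensate(ref, x0, y0, w, h, mv_x, mv_y):
    """Integer-pel motion compensation; out-of-picture positions are clamped to the edge."""
    pic_h, pic_w = len(ref), len(ref[0])

    def clamp(v, hi):
        return max(0, min(v, hi))

    return [
        [ref[clamp(y0 + mv_y + j, pic_h - 1)][clamp(x0 + mv_x + i, pic_w - 1)]
         for i in range(w)]
        for j in range(h)
    ]


def gcp_predict(el_ref, bl_ref, x0, y0, w, h, mv):
    """Combine the references first, then run one motion compensation pass on the result."""
    combined = combine_refs(el_ref, upsample_2x_nearest(bl_ref))
    return motion_compensate(combined, x0, y0, w, h, mv[0], mv[1])
```

For example, with `bl_ref = [[10, 20], [30, 40]]` and an all-zero 4x4 `el_ref`, `gcp_predict(el_ref, bl_ref, 0, 0, 2, 2, (1, 1))` fetches a 2x2 block from the combined picture with a single reference access, which is the point of combining before motion compensation.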
The worst-case memory bandwidth in test set 2 is claimed to be lower than SHM, due to various additional restrictions:
- disable bi-pred for any PU sizes 8xN or Nx8
- luma only
- bilinear interpolation in motion comp.
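The block-level restrictions can be expressed as a simple eligibility check. The following Python sketch is only an illustration under stated assumptions: the function name `gcp_allowed` and its parameters are hypothetical, and the thresholds are the ones given in the abstract (minimum PU width and height of 8 for GCP uni-prediction and 16 for GCP bi-prediction, luma only).

```python
def gcp_allowed(width, height, is_bi_pred, is_luma):
    """Return True if GCP may be used for this PU under the stated restrictions.

    Hypothetical helper: thresholds follow the abstract (8 for uni-prediction,
    16 for bi-prediction, luma only); it is not the SHM implementation.
    """
    if not is_luma:
        return False  # chroma components do not perform GCP
    min_size = 16 if is_bi_pred else 8
    return width >= min_size and height >= min_size
```

Under this check, an 8x16 PU may use GCP uni-prediction but not GCP bi-prediction, which matches the note that bi-prediction is disabled for 8xN and Nx8 PUs.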
Worst-case computation is higher than in SHM (around 12% more multiplications/additions).
Three extra modes
Some speed-up optimizations are used at the encoder.