Contributions in this category were discussed Friday 13 July 1200–1300 (chaired by GJS).
JVET-K0028 CE8 summary report on current picture referencing [X. Xu, K. Müller, L. Wang]
This contribution provides a summary report of Core Experiment 8 on current picture referencing. Four tests were agreed to be carried out in CE8 between the JVET-J and JVET-K meetings, to study and evaluate technologies related to current picture referencing. In this report, the coding performance and complexity of these tests are reported and analysed. In particular, test results against the VTM anchor are provided to show the trade-off between coding efficiency and complexity of each proposed approach. Test results against the BMS anchor are also provided to show the interaction with BMS coding tools. Crosschecking results for the performed tests are integrated in this contribution.
There is a substantial complexity impact. It was suggested that a baseline profile would need some constraints on the design.
Decision: Adopt 8.2.2 approach (JVET-K0076) into BMS. It will also be included in the CTC, per section 12.2. Further study is needed to determine appropriate constraints and profiling implications. The current version seems too complex for a “baseline profile”, but some variation of this seems needed in the standard, and with some constraints it could become appropriate for a “baseline profile”.
JVET-K0048 CE8: Intra Region-based Template Matching (Test 8.1) [G. Venugopal, K. Müller, H. Schwarz, D. Marpe, T. Wiegand (HHI)]
JVET-K0075 CE8-2.1: Current picture referencing using block level flag signalling [X. Xu, X. Li, S. Liu (Tencent)]
JVET-K0076 CE8-2.2: Current picture referencing using reference index signalling [X. Xu, X. Li, G. Li, S. Liu (Tencent)]
JVET-K0436 Crosscheck for CE8-2.2 [W. Zhang (Hulu)] [late]
JVET-K0450 CE8-3.1: Current picture referencing for intra pictures [L. Wang, F. Chen (Hikvision)] [late]
6.9 CE9: Decoder side motion vector derivation (25)
Contributions in this category were discussed Friday 13 July 1100–1230 (chaired by JRO).
JVET-K0029 CE9: Summary Report on Decoder Side MV Derivation [S. Esenlik, Y.-W. Chen]
The tools in the scope of this CE include bi-directional optical flow, template matching and bilateral matching based techniques for motion vector derivation and refinement at the decoder side.
The core experiment is organized into 5 sub-tests as follows:
CE9.1 - Decoder Side Motion Vector Refinement (DMVR): 5 tests are performed in this subcategory.
CE9.2 - Bilateral Matching: 8 tests.
CE9.3 - Template Matching: 7 tests.
CE9.4 - MV Candidate List Reordering by Template Matching: 3 tests.
CE9.5 - BIO: 3 tests.
This report summarises the status of each experiment. Crosscheck results are integrated in the document.
CE9.1: Decoder Side Motion Vector Refinement (DMVR)
Search Range is 1
Adaptive search pattern (6 points instead of 9)
Mean removed SAD as cost function
Early termination: if motion vector is not changed after an iteration
X. Chen (Hisilicon, Huawei)
Early termination after L0 search if motion vector is not changed after one iteration
Early termination based on initial SAD cost between prediction L0 and prediction L1
High precision SAD (no clip and round)
DMVR is not applied if the MV difference between the selected candidate and any of the previous candidates in the merge list is less than a pre-defined threshold in both horizontal and vertical directions, where the thresholds are ¼-pel, ½-pel and 1-pel for blocks with fewer than 64, fewer than 256 and more than 256 pixels, respectively.
MV difference mirroring.
Results are to be provided for number of iterations 4, 2, 1, and half-pel on/off.
6 point corner selective integer search and 4 point half pel search.
Results are to be provided by switching off spatial MV prediction from refined motion vectors in 32x32 grid.
Semih Esenlik (Huawei, USTC)
DMVR in BMS according to AHG13
(Test is DMVR off)
Xu Chen (Hisilicon, Huawei)
Yu-Chi Su (MediaTek), only RA
Xiaoyu Xiu (InterDigital)
Chun-Chi Chen (Qualcomm)
Semih Esenlik (Huawei, USTC)
The following table shows properties of the different methods
Important complexity aspects are the number of SADs, memory access (search range) in general, and latency (due to the dependency between spatial neighbours, pipelining is complicated). The latter aspect is addressed in 9.1.1.a; however, it loses 1.2% in VTM and 0.6% in BMS. For the other aspects, it can be seen that increasing the number of SADs or the SR improves quality.
It is agreed that DMVR is not mature enough to be moved into VTM.
It is agreed that the next version of BMS should include a DMVR that resolves the latency problem.
It is agreed that upcoming CEs should not include any approach that has a latency problem.
The only proposal from CE9.1 that resolves the latency problem is 9.1.1.a.
It was initially agreed to adopt JVET-K0199 (as per CE9.1.1.a), i.e. do not use refined motion vectors for anything but the MC of the current block. This is asserted to be the simplest solution to the latency problem, with no additional storage requirements, no additional rules, etc. This decision was later revised in the context of the adoption of 9.2.9.l. However, in the context of 9.2.9.l, the aspect of using the non-refined MV in deblocking (as initially suggested in K0199) was still retained.
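The rule described above (refined motion vectors used only for the MC of the current block) can be illustrated with a minimal sketch. The class and method names are hypothetical and are not taken from JVET-K0199 or the BMS software; the sketch only shows why a neighbouring block never has to wait for a refinement search to finish.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector components


@dataclass
class BlockMotion:
    """Hypothetical per-block motion storage (illustration only)."""
    signalled_mv: MV                  # MV reconstructed from the bitstream
    refined_mv: Optional[MV] = None   # result of decoder-side refinement, if any

    def mv_for_motion_compensation(self) -> MV:
        # Only the MC of the current block sees the refined MV.
        return self.refined_mv if self.refined_mv is not None else self.signalled_mv

    def mv_for_neighbour_prediction(self) -> MV:
        # Spatial neighbours read the non-refined MV, so they do not
        # depend on the refinement search of this block.
        return self.signalled_mv

    def mv_for_deblocking(self) -> MV:
        # Deblocking also uses the non-refined MV, as noted above.
        return self.signalled_mv
```

Because `mv_for_neighbour_prediction` never returns the refined value, MV reconstruction of the next block can proceed as soon as the bitstream is parsed.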
Question: What is the impact on worst-case memory bandwidth for SR1/2/4, compared to SR0 (DMVR off)? SR1: 140%; SR2: 186%; SR4: 298%; SR8: 600%.
Note: SR up to 2 with bilinear interpolation is claimed to be still 100%.
Note: These numbers are preliminary and need further checking, to be done in an upcoming CE (there are some further notes under the CE9 related section).
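As a rough check of the bandwidth figures above, the following sketch computes the reference window that has to be fetched for one block as a function of the search range. The notes do not state how the percentages were derived; the assumption here (a 4x4 block interpolated with an 8-tap filter, window extended by the search range on every side) is only one model that happens to reproduce them, and the function names are hypothetical.

```python
def reference_fetch_samples(block_w, block_h, taps, search_range):
    """Samples fetched from one reference picture for one block.

    An N-tap separable filter needs N - 1 extra samples per dimension,
    and a search range of SR extends the window by SR on each side.
    """
    w = block_w + (taps - 1) + 2 * search_range
    h = block_h + (taps - 1) + 2 * search_range
    return w * h


def relative_bandwidth_percent(block_w, block_h, taps, search_range):
    """Fetch size relative to search range 0 (refinement off), in percent."""
    baseline = reference_fetch_samples(block_w, block_h, taps, 0)
    extended = reference_fetch_samples(block_w, block_h, taps, search_range)
    return round(100 * extended / baseline)


# Assumed worst case: 4x4 block, 8-tap interpolation filter.
for sr in (1, 2, 4, 8):
    print(f"SR{sr}: {relative_bandwidth_percent(4, 4, 8, sr)}%")
```

Under this assumption the model gives 140%, 186% and 298% for SR1, SR2 and SR4, and about 602% for SR8, close to the 600% recorded in the notes. It does not model the bilinear-interpolation case mentioned in the first note.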
CE9.2: Bilateral Matching
The MV difference between the selected candidate and any of the previous candidates is not less than a pre-defined threshold (i.e. ¼-pel, ½-pel and 1-pel for blocks with less than 64, less than 256 and other larger blocks, respectively)
Sub-block refinement is removed
Mean removed SAD is applied adaptively based on CU size (i.e. MRSAD for blocks with more than 64 pixels)
Based on CE9.2.5 two modifications are tested:
The DCTIF for search is replaced by bi-linear filter;
The Search range is reduced from 8 to 2
Implementation based on Bilateral Matching code in BMS1.0 software.
Motion vector difference is mirrored for forward and backward MVPs (bilateral matching disabled otherwise).
Sub-CU refinement off.
Merge candidates are used as origin MVs.
Byeongdoo Choi (Sharp)
Implemented on top of CE9.2.9.
4 points half-pel search is replaced by 2 point adaptive half-pel search pattern.
Yue Li (USTC)
Bilateral matching cost function instead of generating template.
MVD mirroring with initial MV candidate signalled as merge index.
Results are to be provided for search ranges 4, 2, 1, and half-pel off.
Results are to be provided by disabling spatial prediction from refined motion vectors within 32x32 grid.
Results are to be provided for disabling spatial prediction from refined MV completely.
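Several of the tests listed above combine a bilateral matching cost with MVD mirroring, and some use mean removed SAD. The sketch below shows these ideas in their simplest form: an integer-pel full search in which the offset added to the list 0 MV is subtracted from the list 1 MV, with the cost computed directly between the two predictions. It is an illustration under stated assumptions, not the BMS implementation: all function names are hypothetical, sub-pel search, adaptive search patterns and early termination are omitted, and the caller is assumed to keep the search window inside the picture.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def mrsad(a, b):
    """Mean removed SAD: the average difference is subtracted first, so a
    constant brightness offset between the blocks costs nothing.
    A floating-point mean is used here for clarity (assumption)."""
    n = sum(len(row) for row in a)
    mean_diff = sum(p - q for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n
    return sum(abs(p - q - mean_diff)
               for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def fetch(ref, x, y, w, h):
    """Integer-pel block fetch; assumes the block lies inside the picture."""
    return [row[x:x + w] for row in ref[y:y + h]]


def bilateral_search(ref0, ref1, pos, size, mv0, mv1, search_range, cost=sad):
    """Full search over mirrored offsets: +d on the L0 MV, -d on the L1 MV."""
    x, y = pos
    w, h = size
    best_cost, best_delta = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            p0 = fetch(ref0, x + mv0[0] + dx, y + mv0[1] + dy, w, h)
            p1 = fetch(ref1, x + mv1[0] - dx, y + mv1[1] - dy, w, h)
            c = cost(p0, p1)  # matching cost between the two predictions
            if best_cost is None or c < best_cost:
                best_cost, best_delta = c, (dx, dy)
    dx, dy = best_delta
    return (mv0[0] + dx, mv0[1] + dy), (mv1[0] - dx, mv1[1] - dy), best_cost
```

With search range SR this evaluates (2·SR + 1)² candidate pairs, which is why the search range and the number of SADs appear together as complexity aspects in the discussion of CE9.1. Passing `cost=mrsad` switches the cost function without changing the search.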