Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11


CE4: Inter prediction and motion vector coding (35)




6.4 CE4: Inter prediction and motion vector coding (35)


Contributions in this category were discussed Wednesday 11 July in Track B 1900–2100 (chaired by JRO), continued Thursday 12th 1300-2100 and Friday 13th morning.

JVET-K0024 CE4: Summary report on inter prediction and motion vector coding [H. Yang, S. Liu, K. Zhang]

This contribution provides a summary report of Core Experiment 4 on inter prediction and motion vector coding. CE4 comprises seven categories: 1) affine motion compensation, 2) merge mode enhancement, 3) motion vector coding, 4) generalized bi-prediction, 5) reference picture boundary padding, 6) local illumination compensation, and 7) interpolation filter improvement. Test results against the VTM anchor are provided to show the coding efficiency and complexity trade-off of each tool. Test results against the BMS anchor are also provided to show the interaction with BMS coding tools. Cross-checking results are integrated in this contribution.



CE4.1: Affine MC

Aspects of affine motion compensation stage (discussed Wed 11th 1900-2100, chaired by JRO)

| Test# | Description | Document# |
|---|---|---|
| AFFINE | BMS AFFINE as benchmark | |
| 4.1.2.a | Affine flexing using sub-block size of 4x4 | JVET-K0047 |
| 4.1.2.b | Affine flexing using sub-block size of 8x8 | JVET-K0047 |
| 4.1.5.d | EIF on top of BMS Affine | JVET-K0185 |

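Random access results (VTM = VTM_tool_test, BMS = BMS_tool_test)

| Test# | VTM Y | VTM U | VTM V | VTM EncT | VTM DecT | BMS Y | BMS U | BMS V | BMS EncT | BMS DecT |
|---|---|---|---|---|---|---|---|---|---|---|
| AFFINE | -2.99% | -2.19% | -2.21% | 137% | 112% | -1.92% | -1.34% | -1.36% | 108% | 102% |
| 4.1.2.a | -3.31% | -2.38% | -2.37% | 131% | 113% | -2.13% | -1.46% | -1.45% | 105% | 100% |
| 4.1.2.b | -2.90% | -1.97% | -1.93% | 126% | 107% | -1.81% | -1.10% | -1.09% | 105% | 98% |
| 4.1.5.d | -3.23% | -2.41% | -2.52% | 132% | 102% | -1.85% | -1.42% | -1.44% | 109% | 99% |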

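Low delay B results (VTM = VTM_tool_test, BMS = BMS_tool_test)

| Test# | VTM Y | VTM U | VTM V | VTM EncT | VTM DecT | BMS Y | BMS U | BMS V | BMS EncT | BMS DecT |
|---|---|---|---|---|---|---|---|---|---|---|
| AFFINE | -2.06% | -1.33% | -1.52% | 168% | 108% | -1.98% | -1.52% | -1.70% | 125% | 104% |
| 4.1.2.a | -2.27% | -1.48% | -1.48% | 161% | 111% | -2.18% | -1.79% | -1.61% | 122% | 102% |
| 4.1.2.b | -1.94% | -1.08% | -1.17% | 152% | 111% | -1.94% | -1.40% | -1.65% | 119% | 102% |
| 4.1.5.d | -2.47% | -1.61% | -1.94% | 158% | 102% | -1.89% | -1.58% | -1.64% | 122% | 100% |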

Current affine inter prediction in BMS first divides a coding unit into sub-blocks and then calculates a motion vector for each sub-block based on the affine motion model. Translational motion is assumed when performing MC for each sub-block. This was a reasonable trade-off when affine MC was proposed, because “true” affine MC requires per-pixel operation, which is too expensive with the existing MC operation.
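The sub-block derivation described above can be sketched as follows. This is a minimal illustration, assuming the 4-parameter affine model driven by the top-left and top-right control-point motion vectors, floating-point arithmetic, and evaluation of the model at the centre of each sub-block. The function name and these choices are assumptions made for illustration and are not taken from the BMS software.

```python
def affine_subblock_mvs(v0, v1, width, height, sb=4):
    """Return {(x, y): (mvx, mvy)} keyed by each sub-block's top-left corner.

    v0, v1: control-point MVs (mvx, mvy) at the top-left and top-right
    corners of the coding unit. Assumes the 4-parameter model
    (rotation + zoom + translation); sb is the sub-block size.
    """
    a = (v1[0] - v0[0]) / width   # change of mv_x per sample along x
    b = (v1[1] - v0[1]) / width   # change of mv_y per sample along x
    mvs = {}
    for y in range(0, height, sb):
        for x in range(0, width, sb):
            # Assumption: the model is evaluated at the sub-block centre.
            cx, cy = x + sb / 2, y + sb / 2
            mvs[(x, y)] = (a * cx - b * cy + v0[0],
                           b * cx + a * cy + v0[1])
    return mvs
```

Each sub-block then gets a single translational MV, so the regular block-based interpolation can be reused unchanged.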

The two techniques here try to perform “true” affine MC using low-complexity MC operations.

Flexing in test 4.1.2 performs additional motion compensation horizontally for each line of a sub-block and vertically for each column of a sub-block, separately, to approximate per-pixel affine MC. Whether horizontal or vertical MC is performed first depends on whether the motion of the CU is predominantly rotation or predominantly zoom. The existing 8-tap DCTIF is used for interpolation. A look-up table implementation is provided.

EIF in test 4.1.5.b performs per-pixel affine MC using a two-stage interpolation: bilinear interpolation followed by a 3-tap filter. It is asserted that the new interpolation mechanism is much simpler than the 8-tap DCTIF and could therefore do per-pixel MC at low cost.
Aspects of affine motion compensation stage (discussed Wed 1900-2100)

Tests 4.1.1, 4.1.2 and 4.1.5 concern the motion compensation stage:

- 4.1.1b is a simplification of BMS affine, but it performs 0.5% worse than BMS affine;

- 4.1.2 (affine flexing) and 4.1.5 (EIF) perform more precise motion compensation.

Affine flexing adds another step after regular block-based motion compensation. It uses the regular 8-tap filters but applies padding outside the prediction block, so that no additional memory accesses are required. Entire lines are shifted horizontally, and then the result is shifted column-wise. In the worst case this requires 8+8 additional multiplications per sample. The gain is -0.3% for 4x4 subblocks, and a small loss is observed when affine is performed on 8x8 subblocks. This additional complexity is not a good trade-off against the relatively small gain.

EIF is pixel-based motion compensation. It uses a different motion compensation process for affine blocks, applying bilinear interpolation followed by a sharpening filter. The gain is approx. -0.2%. It is claimed that the method is less complex than conventional motion compensation (in terms of number of operations); however, it is probably also less regular, as it consists of three successive stages: a) determining the individual shift positions of samples, b) performing bilinear interpolation, c) applying the sharpening filter. Likewise, the specification is likely to be more complicated.
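The three stages listed above can be illustrated with a small sketch. It assumes the 4-parameter affine model, floating-point arithmetic, clamping at picture borders, and a 3-tap filter applied along rows only. The taps (-1, 10, -1)/8 are placeholder values chosen for illustration, and all function names are hypothetical; none of this is taken from the JVET-K0185 text or software.

```python
import math

def bilinear(ref, x, y):
    """Stage b: sample ref (a list of rows) at fractional position (x, y)."""
    h, w = len(ref), len(ref[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def px(xx, yy):
        # Clamp coordinates so reads outside the picture repeat the border.
        return ref[min(max(yy, 0), h - 1)][min(max(xx, 0), w - 1)]

    top = (1 - fx) * px(x0, y0) + fx * px(x0 + 1, y0)
    bot = (1 - fx) * px(x0, y0 + 1) + fx * px(x0 + 1, y0 + 1)
    return (1 - fy) * top + fy * bot

def sharpen3(row, taps=(-1, 10, -1)):
    """Stage c: 3-tap filter along one row with edge replication.

    The taps are placeholders for this sketch (assumption).
    """
    norm = sum(taps)
    n = len(row)
    return [(taps[0] * row[max(i - 1, 0)] + taps[1] * row[i]
             + taps[2] * row[min(i + 1, n - 1)]) / norm for i in range(n)]

def per_pixel_affine_pred(ref, x0, y0, w, h, v0, v1):
    """Predict a w x h block at (x0, y0) with one MV per sample."""
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    pred = []
    for y in range(h):
        row = []
        for x in range(w):
            # Stage a: individual displacement of this sample.
            mvx = a * x - b * y + v0[0]
            mvy = b * x + a * y + v0[1]
            row.append(bilinear(ref, x0 + x + mvx, y0 + y + mvy))
        pred.append(sharpen3(row))
    return pred
```

The sketch makes the regularity concern visible: every sample has its own fetch position, so the inner loop cannot reuse a single fractional phase across the block as the separable DCTIF does.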

From the current results, using the same MC operation for affine cases as in other modes appears to be the preferred solution. However, compared to BMS affine, it would be simpler to use a fixed subblock size of 4x4 (as per CE 4.1.1a), i.e. to remove the adaptive subblock size from BMS.

Decision: Adopt JVET-K0184 to BMS (CE4.1.1a, 4x4 fixed subblock size). It was later decided in the JVET Sunday plenary that the affine tools (with the modifications initially adopted for BMS) will be moved to VTM.

It was also mentioned that pixel-based MC may have a subjective advantage, but it is currently not evident whether this applies to relevant application cases.


Draft text was provided and reviewed Tue 17 1545.

JVET-K0565 Draft text for affine motion compensation [H. Yang, H. Chen, Y. Zhao, J. Chen (Huawei)] [late]
Suggestions made during discussion:

  • A high-level enabling flag at the SPS, as already present in the software, is missing (PPS or slice level needs further consideration)

  • Put 2 flags into the SPS (1 for affine enabling, 1 for 6-parameter enabling)

  • All affine operations and syntax should be disabled by the first flag

  • All affine operations and syntax related to 6-parameter should be disabled by the second flag
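The two-flag gating suggested above can be sketched as follows. The class and field names are hypothetical, chosen only for illustration; the notes do not specify syntax element names.

```python
from dataclasses import dataclass

@dataclass
class SpsAffineFlags:
    affine_enabled: bool          # hypothetical name: master affine switch
    affine_6param_enabled: bool   # hypothetical name: 6-parameter switch

def allowed_affine_models(sps):
    """Return the affine parameter counts a CU may signal under these flags."""
    if not sps.affine_enabled:
        # First flag off: no affine syntax or operations at all,
        # regardless of the value of the second flag.
        return []
    if sps.affine_6param_enabled:
        return [4, 6]
    # Second flag off: only the 4-parameter model remains available.
    return [4]
```

For example, `allowed_affine_models(SpsAffineFlags(True, False))` yields `[4]`, so a parser following this sketch would skip any 6-parameter-related syntax.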

It was generally remarked that the text was appropriate and complete, but needs more detailed investigation in the context of integration by editors.
For editors of draft text:

Some alignment of subblock syntax with ATMVP would be necessary

Adopting affine and ATMVP also requires transferring some additional elements from HEVC, such as merge, MV prediction, and MV difference coding

What is further needed for affine is the 1/16 sample precision interpolation filters. Otherwise, the MC process of HEVC can be retained for subblocks. Rounding to 1/4 pel also needs to be described

8x8 MV compression requires action. Storage to be done with high (1/16 pel) precision. Note: It should be further studied if high precision is necessary.
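As an illustration of the rounding step mentioned above, the following sketch converts one motion vector component from 1/16-sample to 1/4-sample units. Round-to-nearest with ties away from zero is an assumption of this sketch; the notes only state that the rounding needs to be described, not which rule applies. The function name is hypothetical.

```python
def round_mv_to_quarter_pel(mv_sixteenth):
    """Round one MV component from 1/16-sample to 1/4-sample units.

    Assumption: round to nearest, ties away from zero, so that positive
    and negative vectors are treated symmetrically.
    """
    shift = 2                    # 1/16 -> 1/4 is a division by 4
    offset = 1 << (shift - 1)    # half of the divisor, for rounding
    if mv_sixteenth >= 0:
        return (mv_sixteenth + offset) >> shift
    return -((-mv_sixteenth + offset) >> shift)
```

Handling the sign explicitly avoids the bias of a plain arithmetic right shift, which would round negative values towards minus infinity.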

CE4.1 continuation Track B Thursday 12th 1300-1500 (chaired by JRO)



Aspects of affine motion vector prediction

| Test# | Description | Document# |
|---|---|---|
| AFFINE | BMS AFFINE as benchmark | |
| 4.1.3.a | Affine MVP construction | JVET-K0337 |
| 4.1.4.b | Affine MVP construction with added spatial candidates | JVET-K0218 |
| 4.1.5.a | Affine MVP construction | JVET-K0185 |
| 4.1.6 | Affine MVP construction | JVET-K0244 |
| 4.1.7.a | Up to two affine candidates for affine inter | JVET-K0094 |

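Random access results (VTM = VTM_tool_test, BMS = BMS_tool_test)

| Test# | VTM Y | VTM U | VTM V | VTM EncT | VTM DecT | BMS Y | BMS U | BMS V | BMS EncT | BMS DecT |
|---|---|---|---|---|---|---|---|---|---|---|
| AFFINE | -2.99% | -2.19% | -2.21% | 137% | 112% | -1.92% | -1.34% | -1.36% | 108% | 102% |
| 4.1.3.a | -3.22% | -2.40% | -2.43% | 134% | 110% | -2.12% | -1.54% | -1.56% | 112% | 107% |
| 4.1.4.a | -3.90% | -2.95% | -2.98% | 148% | 115% | -2.72% | -2.01% | -2.02% | 113% | 107% |
| 4.1.4.b | -3.89% | -2.93% | -2.95% | 148% | 115% | -2.71% | -2.02% | -2.05% | 113% | 106% |
| 4.1.5.a | -3.24% | -2.41% | -2.47% | 138% | 110% | -2.10% | -1.54% | -1.53% | 110% | 101% |
| 4.1.6 | -3.21% | -2.38% | -2.40% | 132% | 107% | -2.10% | -1.50% | -1.56% | 111% | 88% |
| 4.1.7.a | -3.23% | -2.37% | -2.43% | 142% | 114% | -2.10% | -1.52% | -1.56% | 111% | 102% |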

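Low delay B results (VTM = VTM_tool_test, BMS = BMS_tool_test)

| Test# | VTM Y | VTM U | VTM V | VTM EncT | VTM DecT | BMS Y | BMS U | BMS V | BMS EncT | BMS DecT |
|---|---|---|---|---|---|---|---|---|---|---|
| AFFINE | -2.06% | -1.33% | -1.52% | 168% | 108% | -1.98% | -1.52% | -1.70% | 125% | 104% |
| 4.1.3.a | -2.08% | -1.46% | -1.62% | 162% | 92% | -2.08% | -1.62% | -1.92% | 127% | 108% |
| 4.1.4.a | -2.38% | -1.61% | -1.59% | 213% | 107% | -2.39% | -1.89% | -2.07% | 142% | 109% |
| 4.1.4.b | -2.37% | -1.56% | -1.65% | 213% | 107% | -2.39% | -1.98% | -2.03% | 143% | 110% |
| 4.1.5.a | -2.13% | -1.26% | -1.59% | 169% | 107% | -2.07% | -1.69% | -1.57% | 126% | 103% |
| 4.1.6 | -2.10% | -1.43% | -1.48% | 159% | 104% | -2.04% | -1.57% | -1.74% | 128% | 108% |
| 4.1.7.a | -2.12% | -1.46% | -1.47% | 181% | 115% | -2.09% | -1.87% | -1.89% | 129% | 104% |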

In BMS affine inter mode, two motion vectors, at the top-left and top-right corners of a coding unit, are encoded. The predictor for the motion vector pair is constructed using motion vectors of neighbouring blocks.

In the five tests here, a new type of candidate predictor is proposed. This candidate inherits the affine model from a neighbouring block and uses that model to derive the motion vectors at the control points of the coding block. The derived motion vector pair is then used as a predictor for the CPMVs of the coding block.
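The inheritance step described above can be sketched as follows. The sketch assumes that the neighbouring block uses the 4-parameter model with CPMVs at its top-left and top-right corners, and it uses floating-point arithmetic. The function and parameter names are hypothetical, and the individual proposals differ in details such as candidate positions and scaling.

```python
def inherit_cpmvs(nb_pos, nb_width, nb_v0, nb_v1, cur_pos, cur_width):
    """Derive the current block's CPMV predictor pair from a neighbour's model.

    nb_pos: (x, y) of the neighbour's top-left sample.
    nb_v0, nb_v1: neighbour CPMVs at its top-left and top-right corners.
    cur_pos: (x, y) of the current block's top-left sample.
    Returns (predictor for top-left CPMV, predictor for top-right CPMV).
    """
    a = (nb_v1[0] - nb_v0[0]) / nb_width
    b = (nb_v1[1] - nb_v0[1]) / nb_width

    def model(px, py):
        # Evaluate the neighbour's affine model at picture position (px, py).
        dx, dy = px - nb_pos[0], py - nb_pos[1]
        return (a * dx - b * dy + nb_v0[0], b * dx + a * dy + nb_v0[1])

    x, y = cur_pos
    return model(x, y), model(x + cur_width, y)
```

Note that the neighbour's position and width enter the computation, which is why inheritance requires buffering the affine model together with the block geometry.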

The tests appear to have converged on constructing the MVP list from two types of predictor: inherited and constructed. The number of predictors is 2.

The differences among the tests are:


  • The position from where a candidate is derived

  • The order in which the MVP list is constructed

  • The number of candidates of a particular type

  • Whether MV scaling is allowed when deriving a candidate

  • Whether a top-right MV derived from top-left and bottom-left MV could be used as the predictor for the top-right CPMV
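The last point in the list above can be made concrete with a short sketch. Under the 4-parameter model, the top-left and bottom-left motion vectors fully determine the model, so the top-right motion vector follows from them. The function name is hypothetical and floating-point arithmetic is assumed.

```python
def derive_top_right_mv(v0, v2, width, height):
    """Derive the top-right MV from the top-left (v0) and bottom-left (v2) MVs.

    With mv_x = a*x - b*y + v0x and mv_y = b*x + a*y + v0y, the bottom-left
    corner (0, height) gives a = (v2y - v0y) / height and
    b = -(v2x - v0x) / height; evaluating at (width, 0) yields the result.
    """
    ratio = width / height
    return (v0[0] + (v2[1] - v0[1]) * ratio,
            v0[1] - (v2[0] - v0[0]) * ratio)
```

For a pure zoom with `v0 = (0, 0)` and `v2 = (0, 2)` on a 16x8 block, this gives a top-right MV of `(4.0, 0.0)`.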

Proponents are requested to provide an analysis about the number of operations, MV comparisons, memory usage, additional storage, etc. for the list construction and the inheritance, also in comparison with BMS affine.

Follow-up Sat. 14th morning:

Analysis was done (will be registered as new doc – add number)

4.1.6 and 4.1.7 have higher complexity than BMS

4.1.8 has same complexity as BMS affine

4.1.3 has least complexity (its worst case is for inherited candidates, but still significantly lower than BMS – no scaling, no mult., no div.)

4.1.4 is same as 4.1.3 with more candidates (here analysed for 4-parameter model, for which case we don’t have CE results), complexity vs. 4.1.3 is higher by a factor of 1.4

4.1.5 is still less complex than BMS affine by roughly a factor of 3, but has some scaling/mult./div.

From this, the best choice in the domain of 4-parameter model prediction is the solution of 4.1.3a. It has the same compression performance as 4.1.5, but is less complex.



Decision (BMS): Adopt JVET-K0337 (4.1.3a, affine MVP list construction)

It is noted that the decoder run time increases. This is asserted to be due to the fact that affine prediction mode is now selected more frequently.

It is noted that inheritance requires additional buffering of affine model, including PU size. This is also the case for merge in BMS affine, whereas prediction in BMS affine currently does not use inheritance.

The performance of 4.1.4 is better because it also uses the 6-parameter model (see other proposals below).

