Organisation internationale de normalisation

Frequency-domain prediction and time-domain post-filtering CE

Date: 02.01.2022

Max Neuendorf, FhG-IIS, presented

m36534

3DA Phase 2 Core Experiment on frequency-domain prediction and time-domain post-filtering

Sascha Disch, Christian Helmrich, Emmanuel Ravelli, Max Neuendorf

The contribution gives an overview of the technology, which improves the quality of tonal signals at low bit rates (i.e. 32 kb/s/channel). Examples of signals that are improved using this tool are:

Harpsichord, bagpipe, singing voice

The contribution proposes two tools:

Frequency-domain prediction (FDP)

This is a 2-tap forward-adaptive time-domain predictor that operates across pitch-indicated harmonic MDCT bin values below 3.75 kHz (at 48 kHz sampling rate), i.e. below the range of the noise filling tool.
It provides coding gain by reducing the energy of the MDCT coefficients.
It requires a single estimate of the pitch as side information, sent as a 1-bit flag and an 8-bit value. The predictor operates in fixed point so that it is deterministic, which makes conformance easier.
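
The FDP description above can be illustrated with a minimal Python sketch. It is not the proposed algorithm: the frame length of 1024, the Q12 coefficient format, the function names `harmonic_bins` and `predict_residual`, and the hand-picked coefficients are all assumptions made for illustration, and the source does not say how the coefficients are derived from the pitch. The sketch only shows the general idea of predicting selected MDCT bins from the same bins in the two preceding frames using integer arithmetic.

```python
Q = 12  # fractional bits of a hypothetical fixed-point coefficient format


def harmonic_bins(pitch_hz, sample_rate=48000, frame_len=1024, limit_hz=3750.0):
    """Return the MDCT bins closest to the pitch harmonics below limit_hz."""
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")
    bin_width = sample_rate / (2 * frame_len)  # Hz per MDCT bin (assumed frame_len)
    max_bin = int(limit_hz / bin_width)        # first bin that is left unpredicted
    bins = []
    harmonic = 1
    while True:
        k = round(harmonic * pitch_hz / bin_width)
        if k >= max_bin:
            return bins
        if not bins or k != bins[-1]:          # skip duplicates for very low pitch
            bins.append(k)
        harmonic += 1


def predict_residual(cur, prev1, prev2, bins, a1_q, a2_q):
    """Subtract an integer 2-tap prediction from the selected bins of cur."""
    residual = list(cur)
    for k in bins:
        # Integer multiply-add and shift only, so the result is bit-exact.
        pred = (a1_q * prev1[k] + a2_q * prev2[k]) >> Q
        residual[k] = cur[k] - pred
    return residual


def energy(values):
    return sum(v * v for v in values)


# Contrived stationary tone: each harmonic bin follows 100*cos(60 deg * t)
# over three frames, which the taps (1.0, -1.0) predict exactly.
bins = harmonic_bins(234.375)  # pitch chosen to fall exactly on every 10th bin
prev2, prev1, cur = [0] * 160, [0] * 160, [0] * 160
for k in bins:
    prev2[k], prev1[k], cur[k] = 100, 50, -50
residual = predict_residual(cur, prev1, prev2, bins, a1_q=1 << Q, a2_q=-(1 << Q))
print(energy(cur), energy(residual))
```

With the assumed frame length, the 3.75 kHz limit corresponds to the first 160 MDCT bins. In this contrived case the residual energy drops to zero; real signals would leave a non-zero residual.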

Time-domain post-filter (TDPF)

This is based on technology used in the 3GPP EVS project.
It provides improved quality by shaping noise between pitch harmonic peaks.
It requires a 1-bit flag, a 9-bit pitch lag and a 2-bit gain index.
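
As a rough illustration of how a pitch-based post-filter can attenuate noise between harmonic peaks, the following Python sketch uses a generic feed-forward comb filter. This is not the EVS-derived filter of the contribution, whose structure is not given here. The gain table `GAINS` and the function name `comb_postfilter` are hypothetical; only the parameter widths (a 9-bit lag, i.e. 0 to 511, and a 2-bit gain index) come from the text.

```python
GAINS = (0.2, 0.4, 0.6, 0.8)  # hypothetical table addressed by the 2-bit gain index


def comb_postfilter(x, pitch_lag, gain_index):
    """Feed-forward comb: y[n] = (x[n] + g * x[n - lag]) / (1 + g)."""
    if not 0 < pitch_lag < 512:
        raise ValueError("pitch_lag must fit in 9 bits and be non-zero")
    g = GAINS[gain_index]
    y = []
    for n, sample in enumerate(x):
        # Samples before the start of the signal are treated as zero.
        past = x[n - pitch_lag] if n >= pitch_lag else 0.0
        y.append((sample + g * past) / (1.0 + g))
    return y


lag = 4
periodic = [1.0, 2.0, 3.0, 4.0] * 4                          # repeats every lag samples
between = [1.0, 2.0, 3.0, 4.0, -1.0, -2.0, -3.0, -4.0] * 2   # flips sign every lag samples

kept = comb_postfilter(periodic, lag, gain_index=2)
damped = comb_postfilter(between, lag, gain_index=2)
print(kept[lag:lag + 4], damped[lag:lag + 4])
```

Components that repeat with the pitch lag pass with unit gain, while components midway between harmonics are scaled by (1 - g) / (1 + g), which is the noise-shaping effect described above in its simplest form.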

The contribution gives the results of two listening tests:

The first compares RM0 with RM0+FDP.
The second compares RM0+FDP with RM0+FDP+TDPF.

For the first, the differential scores showed 4 items better, none worse, and a better mean (mean value of 1.8).

For the second, the differential scores showed 5 items better, none worse, and a better mean (mean value of 3.5).

Hence it is expected that, when both technologies are used together, the mean improvement is about 5 (1.8 + 3.5 = 5.3, assuming the two gains add).

The presenter noted that the test items are items used in MPEG-2, MPEG-4 and even outside MPEG (e.g. from other standards bodies or supplied by proponents). The added items tended to have a strong tonal structure and also a “complex” tonal structure, i.e. no fixed frequency stride in the harmonic structure.

Side information: at least 2 flag bits per frame and at most 21 bits per frame.
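
The 2-bit and 21-bit bounds follow from the field widths listed for the two tools. The short Python sketch below reproduces the arithmetic, assuming (this is not stated explicitly in the source) that each tool's flag is always transmitted and that the remaining fields are sent only when that flag is set. The names `FDP_BITS`, `TDPF_BITS` and `side_info_bits` are illustrative.

```python
FDP_BITS = {"flag": 1, "pitch": 8}
TDPF_BITS = {"flag": 1, "pitch_lag": 9, "gain_index": 2}


def side_info_bits(fdp_active, tdpf_active):
    """Side-information bits per frame for a given on/off state of each tool."""
    bits = FDP_BITS["flag"] + TDPF_BITS["flag"]   # flags assumed always present
    if fdp_active:
        bits += FDP_BITS["pitch"]
    if tdpf_active:
        bits += TDPF_BITS["pitch_lag"] + TDPF_BITS["gain_index"]
    return bits


for fdp in (False, True):
    for tdpf in (False, True):
        print(f"FDP={fdp!s:5} TDPF={tdpf!s:5} -> {side_info_bits(fdp, tdpf)} bits")
```

Under this assumption the minimum is the two flags alone and the maximum is 1 + 8 + 1 + 9 + 2 = 21 bits, matching the reported figures.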

The presenter reported complexity figures:

FDP worst case is 0.27 WMOPS per channel, or 3% of the decoder for, e.g., 11.1 channels.
TDPF worst case is 1.31 WMOPS. The average (50% of possible bins active) is 5-8%.

In total, the average complexity of the tools is 6% to 10% of the total decoder.

Lukasz Januszkiewicz, Zylia, presented

m36541

Zylia Listening Test Report on 3DA Phase 2 Core Experiment on frequency-domain prediction and time-domain post-filtering

Tomasz Zernicki, Lukasz Januszkiewicz

The contribution gives the results of two listening tests:

The first compares RM0 with RM0+FDP (RM0 is Sys2 and RM0+FDP is Sys1).
The second compares RM0+FDP with RM0+FDP+TDPF (RM0+FDP is Sys1 and RM0+FDP+TDPF is Sys2).

For the first, the differential scores showed 1 item better and no difference in the mean.

For the second, the differential scores showed no difference for any individual item, but the mean is better at the 95% level of significance.
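
A result where no single item differs but the overall mean does is not contradictory: pooling all items gives more samples and hence a narrower confidence interval. The Python sketch below illustrates this with purely synthetic differential scores (not data from either contribution), assuming 8 listeners, 10 items and a simple pooled analysis that treats all scores as independent. The helper name `mean_ci95` and the small table of Student's t critical values are part of the illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t for the degrees of freedom used below.
T_CRIT_95 = {7: 2.365, 79: 1.990}


def mean_ci95(scores):
    """Return the 95% confidence interval (low, high) of the mean score."""
    n = len(scores)
    half_width = T_CRIT_95[n - 1] * stdev(scores) / sqrt(n)
    centre = mean(scores)
    return centre - half_width, centre + half_width


# Synthetic differential scores (system B minus system A) of 8 listeners,
# reused for each of 10 items to keep the example small.
listener_diffs = [-6, -3, -1, 1, 3, 5, 7, 10]
items = {f"item{i}": list(listener_diffs) for i in range(1, 11)}

per_item = {name: mean_ci95(scores) for name, scores in items.items()}
pooled = mean_ci95([s for scores in items.values() for s in scores])

for name, (low, high) in per_item.items():
    print(f"{name}: [{low:.2f}, {high:.2f}]")
print(f"pooled: [{pooled[0]:.2f}, {pooled[1]:.2f}]")
```

Each per-item interval contains zero, so no item is individually significant, while the pooled interval lies entirely above zero.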
