Frequency-domain prediction and time-domain post-filtering CE
Max Neuendorf, FhG-IIS, presented
m36534 | 3DA Phase 2 Core Experiment on frequency-domain prediction and time-domain post-filtering | Sascha Disch, Christian Helmrich, Emmanuel Ravelli, Max Neuendorf
The contribution gives an overview of the technology, which improves the quality of tonal signals at low bit rates (i.e. 32 kb/s/channel). Examples of signals that are improved using this tool are:
The contribution proposes two tools:
Frequency-domain prediction (FDP)
- This is a 2-tap forward-adaptive time-domain predictor that operates across pitch-indicated harmonic MDCT bin values below 3.75 kHz (at 48 kHz sampling rate), i.e. below the range of the noise filling tool.
- It provides coding gain by reducing the energy of the MDCT coefficients.
- It requires a single estimate of the pitch as side information, sent as a 1-bit flag and an 8-bit value. The predictor operates in fixed point so that it is deterministic, which makes conformance easier.
Time-domain post-filter (TDPF)
- This is based on technology used in the 3GPP EVS project.
- It provides improved quality by shaping noise between pitch harmonic peaks.
- It requires a 1-bit flag, a 9-bit pitch lag and a 2-bit gain index.
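To make the FDP operating range concrete, the following is a minimal sketch of which MDCT bins a pitch estimate could mark as harmonic below the 3.75 kHz limit. The frame length of 1024 bins, the function name `harmonic_bins` and the simple "bin containing each harmonic" mapping are assumptions for illustration; the contribution's actual bin selection rule is not described in this report.

```python
def harmonic_bins(pitch_hz, sample_rate=48000, frame_len=1024, limit_hz=3750.0):
    """Return MDCT bin indices containing harmonics of pitch_hz below limit_hz.

    Assumption: MDCT bin i covers the band [i * bin_width, (i + 1) * bin_width).
    """
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")
    bin_width = (sample_rate / 2) / frame_len   # Hz per MDCT bin
    max_bin = int(limit_hz / bin_width)         # first bin outside the FDP range
    bins = []
    k = 1
    while True:
        idx = int(k * pitch_hz / bin_width)     # bin holding the k-th harmonic
        if idx >= max_bin:
            break
        bins.append(idx)
        k += 1
    return bins


bins = harmonic_bins(200.0)
print(len(bins), bins[:3])
```

Under these assumptions the bin width is 23.4375 Hz, so the 3.75 kHz limit corresponds to the first 160 bins.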
The contribution gives the results of two listening tests:
- The first compares RM0 with RM0+FDP.
- The second compares RM0+FDP with RM0+FDP+TDPF.
For the first test, the differential scores showed 4 items better, none worse, and the mean better (mean value of 1.8).
For the second test, the differential scores showed 5 items better, none worse, and the mean better (mean value of 3.5).
Hence it is expected that, when both technologies are used together, the mean differential score would be approximately 5 (1.8 + 3.5 = 5.3).
The presenter noted that the test items are items used in MPEG-2, MPEG-4 and even outside MPEG (e.g. other standards bodies or proponent-supplied items). The items added tended to have a strong tonal structure and also a “complex” tonal structure, i.e. one without a fixed frequency stride in the harmonic structure.
Side information: at least 2 bits per frame (the two flag bits only) and at most 21 bits per frame (both tools active).
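The side-information range follows from the field widths listed for the two tools. The sketch below reproduces the arithmetic, assuming that each tool always sends its 1-bit flag and sends its remaining fields only when the flag is set. That signalling rule, the names used here, and the 1024-sample frame length in the bit-rate estimate are assumptions, not details given in the report.

```python
# Field widths in bits, as listed for each tool.
FDP_BITS = {"flag": 1, "pitch": 8}
TDPF_BITS = {"flag": 1, "pitch_lag": 9, "gain_index": 2}


def side_info_bits(fdp_active, tdpf_active):
    """Bits per frame, assuming payload fields are sent only when a flag is set."""
    bits = FDP_BITS["flag"] + TDPF_BITS["flag"]
    if fdp_active:
        bits += FDP_BITS["pitch"]
    if tdpf_active:
        bits += TDPF_BITS["pitch_lag"] + TDPF_BITS["gain_index"]
    return bits


def side_info_rate(bits_per_frame, sample_rate=48000, frame_len=1024):
    """Side-information rate in bit/s for an assumed frame length."""
    return bits_per_frame * sample_rate / frame_len


print(side_info_bits(False, False))   # both tools off
print(side_info_bits(True, True))     # both tools on
print(side_info_rate(side_info_bits(True, True)))
```

With both tools off this gives 2 bits per frame, and with both on it gives 21 bits per frame, matching the figures reported. Under the assumed frame length, the worst case is just under 1 kb/s per channel.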
The presenter reported complexity figures:
- FDP worst case is 0.27 WMOPS per channel, or 3% of the decoder, e.g. for 11.1 channels.
- TDPF worst case is 1.31 WMOPS. The average (50% of possible bins active) is 5-8%.
The average total complexity of the tools is 6% to 10% of the total decoder.
Lukasz Januszkiewicz, Zylia, presented
m36541 | Zylia Listening Test Report on 3DA Phase 2 Core Experiment on frequency-domain prediction and time-domain post-filtering | Tomasz Zernicki, Lukasz Januszkiewicz
The contribution gives the results of two listening tests:
- The first compares RM0 with RM0+FDP (RM0 is Sys2 and RM0+FDP is Sys1).
- The second compares RM0+FDP with RM0+FDP+TDPF (RM0+FDP is Sys1 and RM0+FDP+TDPF is Sys2).
For the first test, the differential scores showed 1 item better and the mean not different.
For the second test, the differential scores showed no individual item different, but the mean better at the 95% level of significance.