CE on TBE and IGF in LPD
Christian Neukam, FhG-IIS, and Venkatraman Atti, Qualcomm, presented
m36530 | 3DA Phase 2 Core Experiment on optimizations and improvements for low bitrate coding | Sascha Disch, Max Neuendorf, Christian Neukam, Benjamin Schubert, Venkatraman Atti, Imre Varga, Venkata Chebiyyam
The contribution reports on a joint CE from FhG-IIS and Qualcomm.
The proposed technology provides enhancements for low bitrate multi-channel coding, in the bitrate range of 128 kb/s to 256 kb/s. The proposed technology is adapted from the 3GPP Enhanced Voice Services (EVS) codec.
The CE technology is a parametric time-domain bandwidth extension (TBE) tool for ACELP and the application of Enhanced Noise Filling through Intelligent Gap Filling (IGF) in TCX in the LPD mode of the MPEG-H 3D audio core. The proposed changes also reduce the overall computational complexity of the decoding process.
The proponent observed that
- Common parametric high band coding techniques often cause a lack of phase coherence between the low band and the high band, resulting in undesired temporal dispersion of the individual speech pulses of voiced speech.
- Fine-grain shaping of the temporal envelope of the parametrically coded parts of the high band is essential for speech quality in order to avoid pre- and post-echoes (causing artifacts like “double speak”).
The contribution proposes
- A first tool, Time-domain Bandwidth Extension (TBE), is proposed for the LPD mode.
- An existing tool, Intelligent Gap Filling (IGF), is newly applied to TCX mode.
Incorporating these new tools or modes requires support for a new signal flow.
Complexity information was obtained by running RM0 and RM0+CE in real time on an ARM platform.
- At 16 kb/s, the CE has a 4% reduction in complexity relative to RM0
- At 24 kb/s, the CE has a 2% reduction in complexity relative to RM0
The presenter noted that the complexity reduction arises because the QMF filterbanks do not need to be computed in the tested processing modes.
The presenter showed listening test results with FhG-IIS and Qualcomm listeners pooled together.
For absolute scores
- 16 kb/s: mean better, at the 95% level of significance
- 24 kb/s: mean better, at the 95% level of significance
For differential scores
- 16 kb/s: 10 items better, mean better, mean value is 12
- 24 kb/s: 7 items better, mean better, mean value is 6
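As an illustration of how a "better" verdict on differential scores can be derived, the following minimal Python sketch computes the mean of per-listener score differences (CE minus RM0) and a two-sided 95% confidence interval from Student's t distribution. An item counts as better when the whole interval lies above zero. The function names and the sample scores are hypothetical and are not taken from the contribution; the analysis actually used by the proponents may differ.

```python
import math
import statistics

# Two-sided 95% Student-t critical values for 1..30 degrees of freedom.
T_975 = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
]


def mean_ci95(diffs):
    """Return (mean, lower, upper) of the 95% confidence interval."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two differential scores")
    mean = statistics.mean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean
    dof = n - 1
    # Fall back to the normal approximation beyond the tabulated range.
    t_crit = T_975[dof - 1] if dof <= len(T_975) else 1.96
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


def verdict(diffs):
    """Classify an item from its differential scores (CE minus RM0)."""
    _, lower, upper = mean_ci95(diffs)
    if lower > 0:
        return "better"
    if upper < 0:
        return "worse"
    return "no difference"


# Hypothetical differential scores for one item, all listeners pooled.
example = [5, 9, -2, 7, 4, 6, 3, 8]
print(verdict(example))  # better
```

Pooling the listeners of two sites, as in this test, simply lengthens the list of differences per item, which narrows the confidence interval.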
The presenter clarified that TBE or SBR modes are selected at configuration time, and so are not dynamically selectable.
Oliver Wuebbolt, Technicolor, presented
m36509 | Technicolor Crosscheck Results for Fraunhofer IIS CE | Oliver Wuebbolt, Florian Keiler
The presenter showed the crosscheck listening results.
For absolute scores
- 16 kb/s: 3 items better, mean better, at the 95% level of significance
- 24 kb/s: no difference
For differential scores
- 16 kb/s: 8 better, mean better, mean value is 15
- 24 kb/s: mean better, mean value is 5
It was the consensus of the Audio subgroup to incorporate this technology into MPEG-H Phase 2.