Extension to Stereo
Christian Neukam, FhG-IIS, presented
m36535 | 3DA Phase 2 Core Experiment on joint channels for low bitrate coding | Guillaume Fuchs, Sascha Disch, Christian Neukam
The contribution reports on extending the TBE and IGF in LPD to stereo. Bitrates for the CE are between 128 kb/s and 256 kb/s for immersive audio signals.
The presenter noted that the LPD mode has no stereo compression tool, and that when stereo tools are used in FD mode it is not possible to switch seamlessly to LPD mode. Therefore, the contribution proposes a new joint stereo tool for LPD mode.
Complexity evaluation
- At 24 kb/s stereo – complexity for total decoder is reduced by 24%
- At 32 kb/s stereo – complexity for total decoder is reduced by 23%
A subjective quality evaluation was done via a listening test at FhG-IIS.
Absolute scores
- 24 kb/s stereo: no items different, mean not different
- 32 kb/s stereo: no items different, mean is better at 95% level of significance
Differential scores
- 24 kb/s stereo: 7 items better, mean better (value of 5)
- 32 kb/s stereo: 7 items better, mean is better (value of 7)
In conclusion, the presenter noted that the proposed technology:
- Gives improved performance at lower complexity
Venkatraman Atti, Qualcomm, presented
m36426 | Crosscheck report for LPD stereo CE of MPEG-H 3D Audio Phase 2 | Venkatraman Atti, Imre Varga, Venkata Chebiyyam
The contribution reports on a listening test conducted at Qualcomm.
Absolute scores
- 24 kb/s stereo: no items different, mean not different
- 32 kb/s stereo: no items different, mean not different
Differential scores
- 24 kb/s stereo: 2 items better, mean not different
- 32 kb/s stereo: 4 items better, mean is better (value of 5)
The presenter noted that these results are consistent with the proponent's listening test results.
It was the consensus of the Audio subgroup to incorporate this technology into MPEG-H Phase 2.
Max Neuendorf, FhG-IIS, presented
m36591 | Discrete multi-channel coding tool for MPEG-H 3D audio | Sascha Dick, Christian Helmrich, Nikolaus Rettelbach, Florian Schuh, Tobias Schwegler, Max Neuendorf
The presenter gave an overview of the 3D Audio Phase 1 technology, in which
- Stereo coding occurs in fixed channel pairs
The proposed technology is called Multi Channel Tool (MCT). It is able to perform signal-adaptive joint coding, and can code any two channels as a pair such that the pairs with highest correlation are jointly coded. Furthermore, stereo coding “boxes” can be cascaded for the case that more than two channels are highly correlated.
[Figure from Slide]
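The signal-adaptive pairing described above can be illustrated with a minimal sketch. It is not the MCT algorithm from the contribution, whose details are not given in this report. It assumes a lag-zero normalized cross-correlation as the similarity measure, a greedy selection order, and a hypothetical `threshold` parameter. It also uses each channel at most once, so the cascading of stereo "boxes" is not modeled.

```python
import itertools
import math


def normalized_correlation(x, y):
    """Absolute normalized cross-correlation at lag zero (0.0 if a signal is silent)."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return abs(num) / den if den > 0.0 else 0.0


def select_pairs(channels, threshold=0.3):
    """Greedily pair the most correlated channels first.

    channels: dict mapping a channel name to the samples of one frame.
    threshold: hypothetical minimum correlation for joint coding (assumption).
    Returns (name_a, name_b, correlation) tuples; each channel is used at most once.
    """
    candidates = sorted(
        (
            (normalized_correlation(channels[a], channels[b]), a, b)
            for a, b in itertools.combinations(sorted(channels), 2)
        ),
        reverse=True,
    )
    used, pairs = set(), []
    for corr, a, b in candidates:
        if corr < threshold:
            break  # remaining candidates are even less correlated
        if a in used or b in used:
            continue  # one of the channels already belongs to a pair
        pairs.append((a, b, corr))
        used.update((a, b))
    return pairs
```

Unlike a fixed assignment such as (FL, FR) and (BL, BR), the pairing here is recomputed from the signal content of each frame, so channels that remain uncorrelated are simply left unpaired.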
The contribution proposes two stereo coding tools:
- Real-valued prediction
- Rotation
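As a rough illustration of the two tool types, the sketch below applies a generic rotation and a generic real-valued prediction to one channel pair. These are textbook formulations chosen as assumptions, not the definitions in the contribution: the rotation angle is the principal-axis (KLT) angle of the pair, and the prediction uses a single least-squares coefficient on a mid/side downmix of time-domain samples. All function names are hypothetical.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def rotation_angle(left, right):
    """Principal-axis angle: concentrates the pair's energy in the first output."""
    return 0.5 * math.atan2(2.0 * dot(left, right), dot(left, left) - dot(right, right))


def rotate(left, right, angle):
    """Forward rotation of a channel pair by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    first = [c * l + s * r for l, r in zip(left, right)]
    second = [-s * l + c * r for l, r in zip(left, right)]
    return first, second


def inverse_rotate(first, second, angle):
    """Inverse of rotate(): recovers the original pair."""
    c, s = math.cos(angle), math.sin(angle)
    left = [c * f - s * g for f, g in zip(first, second)]
    right = [s * f + c * g for f, g in zip(first, second)]
    return left, right


def predict(left, right):
    """Mid/side downmix, then predict side from mid with one real coefficient."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    energy = dot(mid, mid)
    alpha = dot(mid, side) / energy if energy > 0.0 else 0.0
    residual = [s - alpha * m for m, s in zip(mid, side)]
    return mid, residual, alpha


def inverse_predict(mid, residual, alpha):
    """Inverse of predict(): rebuilds side, then left and right."""
    side = [e + alpha * m for m, e in zip(mid, residual)]
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

In both cases the second signal (the rotated output or the prediction residual) carries little energy when the two channels are highly correlated, which is what makes joint coding of such a pair cheaper than coding the channels independently.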
Complexity
- For a 22.2 channel program, MCT requires 2.6 WMOPS
Subjective listening test results were reported for 5.1 channel EBU test signals coded at 144 kb/s. The RM0 encoder used the channel pairs (FL, FR) and (BL, BR).
Absolute scores
Differential scores
- 4 items better, one as high as 15 MUSHRA points, mean better (value of 4)
The presenter noted that one item was improved from FAIR to GOOD with this technology.
Nils Peters, Qualcomm, asked whether the items were coded as a concatenated sequence. Christian Neukam, FhG-IIS, stated that all items were coded separately. The presenter played a “movie” that showed the instantaneous correlation between channels in each test signal.
Oliver Wuebbolt, Technicolor, presented
m36733 | Technicolor Crosscheck Results for "Discrete multi-channel coding tool for MPEG-H 3D audio" CE | Oliver Wuebbolt, Florian Keiler
The contribution presents the results of a cross-check listening test.
Subjective listening test results for 5.1 channel EBU test signals coded at 144 kb/s.
Absolute scores
- No items different, mean not different
Differential scores
- 2 items better, one as high as 12 MUSHRA points, 1 item worse (value of -4, CI just significant), mean better (value of 4).
Max Neuendorf, FhG-IIS, gave a verbal report that, when all data is pooled, the differential scores show:
- 3 items better, none worse, mean better (value of 4)
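For readers unfamiliar with how pooled differential scores lead to a verdict such as "better" or "not different", the sketch below shows one plausible computation. It is an assumption, not the procedure used by the test labs: it pairs each listener's reference and test scores, pools the pairs from several labs, and forms a 95% confidence interval on the mean difference with a normal approximation (a Student-t interval would be slightly wider for small panels). All function names and any data fed to them are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def differential_summary(scores_ref, scores_test, confidence=0.95):
    """Mean paired difference (test minus reference) with a normal-approximation CI.

    Requires at least two paired scores. Returns (mean, low, high, verdict).
    """
    diffs = [t - r for r, t in zip(scores_ref, scores_test)]
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(diffs) / math.sqrt(len(diffs))
    centre = mean(diffs)
    low, high = centre - half_width, centre + half_width
    if low > 0.0:
        verdict = "better"  # whole interval above zero
    elif high < 0.0:
        verdict = "worse"  # whole interval below zero
    else:
        verdict = "not different"  # interval overlaps zero
    return centre, low, high, verdict


def pooled_summary(labs, confidence=0.95):
    """Pool paired scores from several labs, then summarize them together.

    labs: iterable of (scores_ref, scores_test) tuples, one tuple per lab.
    """
    pooled_ref = [s for ref, _ in labs for s in ref]
    pooled_test = [s for _, test in labs for s in test]
    return differential_summary(pooled_ref, pooled_test, confidence)
```

Pooling increases the number of paired scores, which narrows the interval. This is one way an item that is only just significant in a single lab can stop being significant, or become significant, once the data from both labs are combined.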
It was the consensus of the Audio subgroup to incorporate this technology into MPEG-H Phase 2.
Clemens Par, Swissaudec, presented
m36454 | "Phase 2" Core Experiment on Invariant-driven Inverse Coding | Clemens Par
The contribution presents a new architecture for this CE:
- The inverse coding operates in the QMF domain
- The architecture still uses ECMA S5 as a pre- and post-processor to the 3D Audio core, but with extensions to S5 to incorporate QMF-based S5 upmixing.
Complexity
- Encoder 57.3 PCU
- Decoder 40.4 PCU
The presenter did not provide an estimate of the 3D Audio core coder PCU complexity.
The contribution presents evidence of the merit of Invariant-driven Inverse Coding.
Systems under test
- RM0 (9.1 channel) at 48 and 64 kb/s
- FhG IIS benchmarks at 96 and 128 kb/s (22.2 channel)
- CE (22.2 channel)
- CE-PCM (8 channel, which is downmix from S5 encoder)
- CE-3DAC (8 channel, which is CE-PCM after encoding and decoding with the MPEG-H 3D audio core)
The presenter noted a calibration and parameterization issue at 48 kb/s, 96 kb/s and 128 kb/s for CO_03_SLNiseko, which was therefore excluded from the following results:
48 kb/s
Absolute
Diff
64 kb/s
Absolute
- CE not different
- CE-PCM better at 95% level of significance (as compared to CE)
Diff
96 kb/s
Absolute
Diff
128 kb/s
Absolute
Diff
The presenter noted that the spatial upmix information requires only 2 kb/s side information. The encoder-decoder delay is less than 300 ms.
The Chair noted that at 64 kb/s the CPE 3DAC has better performance than RM0, although not at the 95% level of significance, and that this would be interesting to investigate.