Organisation internationale de normalisation

Date: 02.01.2022

Extension to Stereo

Christian Neukam, FhG-IIS, presented

m36535

3DA Phase 2 Core Experiment on joint channels for low bitrate coding

Guillaume Fuchs, Sascha Disch, Christian Neukam

The contribution reports on extending the TBE and IGF in LPD to stereo. Bitrates for the CE are between 128 kb/s and 256 kb/s for immersive audio signals.

The presenter noted that the LPD mode has no stereo compression tool, and that when stereo tools are used in FD mode it is not possible to switch seamlessly to LPD mode. Therefore, the contribution proposes a new joint stereo tool for LPD mode:

Mid signal coded using the LPD core coder
Side signal coded by the new stereo tool
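
For readers unfamiliar with mid/side coding, the split named above can be sketched as follows. This is a generic textbook mid/side transform, not the downmix defined in the contribution, which is not described in these minutes; the function names and the scaling by 1/2 are assumptions made for illustration.

```python
def to_mid_side(left, right):
    """Split a left/right pair into mid (sum) and side (difference) signals.

    Assumption: a plain (L+R)/2, (L-R)/2 split; the proposed tool may differ.
    """
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Invert to_mid_side: recover the left/right samples."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

When the two channels are similar, the side signal is small, which is why it can be handled by a separate, cheaper tool while the mid signal goes to the core coder.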

Complexity evaluation

At 24 kb/s stereo – complexity for total decoder is reduced by 24%
At 32 kb/s stereo – complexity for total decoder is reduced by 23%

A subjective quality evaluation was done via a listening test at FhG-IIS.

Absolute scores

24 kb/s stereo no items different, mean not different
32 kb/s stereo no items different, mean is better at 95% level of significance

Differential scores

24 kb/s stereo 7 items better, mean better (value of 5)
32 kb/s stereo 7 items better, mean is better (value of 7)
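
The listening test summaries in this report distinguish absolute scores from differential scores and judge differences at the 95% level of significance. The sketch below illustrates that kind of analysis under stated assumptions: differential scores are taken as per-listener differences between the CE system and the reference, and the confidence interval uses a normal approximation (a Student t interval is the more usual choice for small panels). The function names are hypothetical and any input data would be made up, not taken from the contributions.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def differential_scores(ce_scores, ref_scores):
    """Per-listener difference: CE score minus reference score."""
    return [c - r for c, r in zip(ce_scores, ref_scores)]


def mean_with_ci(scores, confidence=0.95):
    """Return (mean, lower bound, upper bound) using a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(scores) / sqrt(len(scores))
    m = mean(scores)
    return m, m - half_width, m + half_width


def is_significantly_better(diff_scores, confidence=0.95):
    """True when the whole confidence interval of the differences lies above zero."""
    _, lower, _ = mean_with_ci(diff_scores, confidence)
    return lower > 0.0
```

Because the differential analysis removes the per-listener offset, it can show a significant improvement even when the absolute scores of the two systems have overlapping confidence intervals.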

In conclusion, the presenter noted that the proposed technology:

Gives improved performance at lower complexity

Venkatraman Atti, Qualcomm, presented

m36426

Crosscheck report for LPD stereo CE of MPEG-H 3D Audio Phase 2

Venkatraman Atti, Imre Varga, Venkata Chebiyyam

The contribution reports on a listening test conducted at Qualcomm.

Absolute scores

24 kb/s stereo no items different, mean not different
32 kb/s stereo no items different, mean not different

Differential scores

24 kb/s stereo 2 items better, mean not different
32 kb/s stereo 4 items better, mean is better (value of 5)

The presenter noted that these results are consistent with the proponent's listening test results.

It was the consensus of the Audio subgroup to incorporate this technology into MPEG-H Phase 2.

Max Neuendorf, FhG-IIS, presented

m36591

Discrete multi-channel coding tool for MPEG-H 3D audio

Sascha Dick, Christian Helmrich, Nikolaus Rettelbach, Florian Schuh, Tobias Schwegler, Max Neuendorf

The presenter gave an overview of the 3D Audio Phase 1 technology, in which

Stereo coding occurs in fixed channel pairs

The proposed technology is called Multi Channel Tool (MCT). It is able to perform signal-adaptive joint coding, and can code any two channels as a pair such that the pairs with highest correlation are jointly coded. Furthermore, stereo coding “boxes” can be cascaded for the case that more than two channels are highly correlated.
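
The idea of jointly coding the most highly correlated channels can be sketched as a greedy pair selection. This is only an illustration: the minutes do not describe how MCT actually selects pairs, and the threshold value, the use of Pearson correlation over a whole block, and the function names are all assumptions. As a simplification, each channel is used in at most one pair here, so the cascading of stereo "boxes" is not modelled.

```python
from itertools import combinations
from math import sqrt


def correlation(x, y):
    """Pearson correlation of two equal-length sample lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_pairs(channels, threshold=0.5):
    """Greedily pick channel pairs in order of decreasing |correlation|.

    channels: dict mapping a channel name to its list of samples.
    threshold: hypothetical minimum |correlation| for joint coding.
    """
    candidates = sorted(
        ((abs(correlation(channels[a], channels[b])), a, b)
         for a, b in combinations(channels, 2)),
        reverse=True,
    )
    used = set()
    pairs = []
    for strength, a, b in candidates:
        if strength < threshold:
            break  # remaining candidates are even less correlated
        if a in used or b in used:
            continue  # simplification: no cascading of pairs
        pairs.append((a, b))
        used.update((a, b))
    return pairs
```

Channels left out of every pair would simply be coded on their own in this sketch.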

[Figure from Slide]

It proposes two stereo coding tools:

Real-valued prediction
Rotation
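
The two tool types named above can be illustrated with generic textbook forms. The exact definitions used by MCT are not given in these minutes, so the formulas, the least-squares choice of the prediction coefficient, and all function names below are assumptions.

```python
from math import cos, sin


def rotate(a, b, angle):
    """Rotate a channel pair by `angle` radians (an orthogonal transform)."""
    first = [cos(angle) * x + sin(angle) * y for x, y in zip(a, b)]
    second = [-sin(angle) * x + cos(angle) * y for x, y in zip(a, b)]
    return first, second


def unrotate(first, second, angle):
    """Invert rotate() by rotating with the opposite angle."""
    return rotate(first, second, -angle)


def predict_side(a, b):
    """Real-valued prediction: predict the side signal from the mid signal.

    Returns (mid, residual, alpha), where alpha is a least-squares coefficient.
    """
    mid = [(x + y) / 2.0 for x, y in zip(a, b)]
    side = [(x - y) / 2.0 for x, y in zip(a, b)]
    energy = sum(m * m for m in mid)
    alpha = sum(s * m for s, m in zip(side, mid)) / energy if energy > 0 else 0.0
    residual = [s - alpha * m for s, m in zip(side, mid)]
    return mid, residual, alpha


def reconstruct(mid, residual, alpha):
    """Invert predict_side(): rebuild the original channel pair."""
    side = [r + alpha * m for r, m in zip(residual, mid)]
    a = [m + s for m, s in zip(mid, side)]
    b = [m - s for m, s in zip(mid, side)]
    return a, b
```

In both cases the aim is the same: concentrate the energy of a correlated pair into one signal so that the other (the rotated second channel or the prediction residual) is cheap to code.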

Complexity

For 22.2 channel program, MCT requires 2.6 WMOPS

Subjective listening test results were reported for 5.1 channel EBU test signals coded at 144 kb/s. The RM0 encoder used the fixed channel pairs (FL, FR) and (BL, BR).

Absolute scores

1 item better

Differential scores

4 items better, one as high as 15 MUSHRA points, mean better (value of 4)

The presenter noted that one item was improved from FAIR to GOOD with this technology.

Nils Peters, Qualcomm, asked whether the items were coded as one concatenated sequence. Christian Neukam, FhG-IIS, stated that all items were coded separately. The presenter played a “movie” that showed the instantaneous correlation between channels in each test signal.

Oliver Wuebbolt, Technicolor, presented

m36733

Technicolor Crosscheck Results for "Discrete multi-channel coding tool for MPEG-H 3D audio" CE

Oliver Wuebbolt, Florian Keiler

The contribution presents the results of a cross-check listening test.

Subjective listening test results for 5.1 channel EBU test signals coded at 144 kb/s.

Absolute scores

No items different, mean not different

Differential scores

2 items better, one as high as 12 MUSHRA points, 1 item worse (value of -4, CI just significant), mean better (value of 4).

Max Neuendorf, FhG-IIS, gave a verbal report that, when all data are pooled, the differential scores show:

3 items better, none worse, mean better (value 4)

It was the consensus of the Audio subgroup to incorporate this technology into MPEG-H Phase 2.

Clemens Par, Swissaudec, presented

m36454

"Phase 2" Core Experiment on Invariant-driven Inverse Coding

Clemens Par

The contribution presents a new architecture for this CE:

The inverse coding operates in the QMF domain
The architecture still uses ECMA S5 as a pre- and post-processor to the 3D Audio core, but with extensions to S5 to incorporate QMF-based S5 upmixing.

Complexity

Encoder 57.3 PCU
Decoder 40.4 PCU

The presenter did not provide an estimate of the 3D Audio core coder PCU complexity.

The contribution presents evidence of the merit of Invariant-driven Inverse Coding.

Systems under test

RM0 (9.1 channel) at 48 and 64kb/s
FhG IIS benchmarks at 96 and 128kb/s (22.2 channel)
CE (22.2 channel)
CE-PCM (8 channel, which is downmix from S5 encoder)
CE-3DAC (8 channel, which is CE-PCM after encoding and decoding with the MPEG-H 3D audio core)

The presenter noted a calibration and parameterization issue at 48 kb/s, 96 kb/s and 128 kb/s for CO_03_SLNiseko, which was therefore excluded from the following results:

48 kb/s

Absolute

None different

Diff

None different

64 kb/s

Absolute

CE not different
CE-PCM better at 95% level of significance (as compared to CE)

Diff

None different

96 kb/s

Absolute

CE not different

Diff

None different

128 kb/s

Absolute

CE not different

Diff

None different

The presenter noted that the spatial upmix information requires only 2 kb/s side information. The encoder-decoder delay is less than 300 ms.

The Chair noted that at 64 kb/s the CPE 3DAC has better performance than RM0, although not at the 95% level of significance, and that this would be interesting to investigate.
