Technology
Clemens Par, Swissaudec, presented
m24962 | Referential Contribution for the Definition of Inverse Coding (Inverse MS Audio Filtering) with or without Invariant Based Calibration, and Conclusions for Use Cases, Requirements and Evaluation Procedures for 3D Audio | Clemens Par (swissaudec)
The contribution presents the technical means and the benefit of Swissaudec’s technology with regard to low bitrate coding in 3D audio. The recently published central “inverse coding” formulae were discussed together with the associated methodology for obtaining 5.1 signals. Clemens Par also defined specific algebraic invariants, and pointed to where they were recently published; these may help to calibrate an entire inverse coding system with the lowest possible latency.
Clemens Par, Swissaudec, presented
m24961 | Further Evidence of Performance of Swissaudec’s VoiCode® Technology | Clemens Par (swissaudec)
The contribution reported the results of a listening test of the Swissaudec technology. DELTA SenseLab performed the test, which had 20 assessors and used the 10 items from the MPEG Surround Verification Test n8851:
Label | Item name | Genre | Duration (s) | LFE
amb1 | Station_Atmo_6ch | Ambience | 20 | -
jaz1 | tower1 | Jazz | 17 | -
mod2 | elliot2 | Movies Drama | 21 | -
orc1 | ravel1c | Orchestra | 19 | -
orc2 | violin2 | Orchestra | 21 | -
pop2 | thalheim4 | Pop | 19 | -
pop3 | SantaCruz_09a | Pop (SantaCruz) | 19 | -
tra1 | chostakovitch | Orchestra | 25 | -
tra2 | fountain music | Piano, Ambience | 21 | -
tra3 | jackson1 | Ambience | 16 | -
The setup was as follows:
Item | Description | Label
1 | Original | Orig
2 | HE-AAC v2 coded at 120 kbps. The coding was done as: L, R pair at 48 kb/s using HE-AAC v2; C channel at 24 kb/s using HE-AAC; Ls, Rs pair at 48 kb/s using HE-AAC v2 | HEV2-120
3 | Dolby ProLogic II with stereo downmix coded with 128 kbps AAC | PLII
4 | ITU-R BS.775-1 [2] mono downmix coded with 96 kbps AAC followed by VoiCode® Surround [1, 4] | VC-96
5 | ITU-R BS.775-1 [2] mono downmix coded with 64 kbps AAC followed by VoiCode® Surround [1, 4] | VC-64
6 | ITU-R BS.775-1 [2] mono downmix coded with 32 kbps HE-AAC followed by VoiCode® Surround [1, 4] | VC-32
7 | ITU-R BS.775-1 [2] mono downmix coded with 24 kbps HE-AAC followed by VoiCode® Surround [1, 4] | VC-24
8 | 3.5 kHz low pass anchor (based on 5.0 original) | LP-35
9 | ITU-R BS.775-1 [2] stereo downmix coded with 48 kbps HE-AAC v2 followed by Soundfield UPM-1 upmix | UPM-48
Items 4 to 7 show a spatial bitrate that is as low as 40 bits in the header.
Clemens Par pointed out that, with regard to test n7138, a difference of 15 MOS was shown for LP-35 and PLII (coded with AAC at 160 kbps in n7138) and that, owing to test procedure definition t3 within n7138 (which exclusively assesses mono HE-AAC compatibility, with HE-AAC as core coder, together with the hidden reference), the total variance with regard to n7138 may be as high as 25 MOS. Clemens Par stated that the primary use cases for the overall technology are low-bandwidth devices with severe spatial bitrate constraints. When using paired inverse coding, the system requires a mere 20 bits in the header per pair (which, for instance, would be 200 bits overall in the header for 22.2 signals), whilst other coders would show spatial bitrates at least as high as 20 kbps for the same task.
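The header figures quoted above can be tied together with a small calculation. The sketch below assumes a fixed 20 bits per inversely coded channel pair, as reported; the pair counts (2 for the 5.1 test items, 10 for 22.2) are inferred from the stated totals of 40 and 200 bits and are not given explicitly in the contribution. The function names are hypothetical.

```python
BITS_PER_PAIR = 20  # header bits per channel pair, as reported by the presenter


def header_bits(num_pairs: int) -> int:
    """Total spatial header bits for a given number of channel pairs."""
    if num_pairs < 0:
        raise ValueError("num_pairs must be non-negative")
    return BITS_PER_PAIR * num_pairs


def average_side_info_bps(total_header_bits: int, duration_s: float) -> float:
    """Header bits averaged over an item's duration, in bits per second."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return total_header_bits / duration_s


# Assumption: 2 pairs reproduce the 40 bits quoted for items 4 to 7.
bits_5_1 = header_bits(2)
# Assumption: 10 pairs reproduce the 200 bits quoted for 22.2 signals.
bits_22_2 = header_bits(10)

# Averaged over a 20 s item such as amb1, 40 header bits amount to 2 bit/s.
avg_bps = average_side_info_bps(bits_5_1, 20)
print(bits_5_1, bits_22_2, avg_bps)
```

Because the side information sits in the header rather than being sent continuously, the averaged rate shrinks as the item gets longer, which is why it is compared here against a continuous spatial bitrate of the order of 20 kbps.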
Clemens Par, Swissaudec, presented
m24964 | Possible Revisions to ISO/IEC JTC 1/SC 29/WG 11/N12610 (“Draft Use Cases, Requirements and Evaluation Procedures for 3D Audio”) | Clemens Par (swissaudec)
The contribution proposes the following revisions to N12610, from the 99th meeting:
- Proposals need not fulfill all requirements, but must offer sufficient technology to fully fulfill at least one test scenario. Such partial solutions should be judged by the figures of merit for the tests that are appropriate for the proposal.
- Low complexity consumer electronics platforms are of paramount importance.
Akio Ando, NHK, presented
m24630 | Compensation of unmasked noise for scalable audio coding | Akio Ando
The contribution proposes a method for efficiently coding and rendering a 22.2-channel program for three loudspeaker configurations: 22.2, “family style” (6 front, 2 rear and LFE) and stereo. This is done by transmitting a stereo signal together with additional information for reconstructing family style and full 22.2. It is envisioned that the embedded stereo signal would be used for the “SmartPhone TV” use case.
The contribution explained the mechanism of noise unmasking when using perceptual coders and channel-based upmix matrixing.
Noise is modelled as a polynomial function of the signal to be coded, and such an equation can be solved under the constraint of minimising the MSE between the modelled noise and the actual noise. A proposed solution is to transmit 9 channels (for family style):
- AAC coded base signal (8 x 80 kb/s = 640 kb/s)
- Additional signal (14 x 40 kb/s = 560 kb/s)
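The two components add up to the overall rate at which the listening test results were reported. A minimal arithmetic sketch, using only the channel counts and per-channel rates listed above (variable names are illustrative):

```python
# Base signal: 8 AAC-coded channels at 80 kb/s each.
base_kbps = 8 * 80
# Additional signal: 14 channels at 40 kb/s each.
additional_kbps = 14 * 40

total_kbps = base_kbps + additional_kbps
total_mbps = total_kbps / 1000

print(f"base={base_kbps} kb/s, additional={additional_kbps} kb/s, "
      f"total={total_kbps} kb/s ({total_mbps} Mb/s)")
```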
The “additional signal” is compressed via MDCT and coded in subbands. Model coefficients are transmitted once per file (in this case 6 seconds).
In high-frequency bands, a first-order polynomial was enough to provide the vast majority of the unmasking noise attenuation. Hence noise can be modelled as a linear function of the coded signal.
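To illustrate what a first-order, minimum-MSE noise model looks like, the sketch below fits noise ≈ gain · signal + offset by ordinary least squares. It is a generic textbook fit, not the procedure used in the contribution; the function names and the sample values are synthetic and chosen purely for illustration.

```python
def fit_linear_noise_model(signal, noise):
    """Least-squares fit of noise ~ gain * signal + offset (minimises the MSE)."""
    n = len(signal)
    if n != len(noise) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    mean_x = sum(signal) / n
    mean_y = sum(noise) / n
    sxx = sum((x - mean_x) ** 2 for x in signal)
    if sxx == 0:
        raise ValueError("signal must not be constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(signal, noise))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return gain, offset


def mean_squared_error(signal, noise, gain, offset):
    """MSE between the modelled noise and the actual noise."""
    return sum((y - (gain * x + offset)) ** 2
               for x, y in zip(signal, noise)) / len(signal)


# Synthetic example: the "actual" noise is exactly 0.2 * signal + 0.1.
coded_signal = [0.0, 1.0, 2.0, 3.0]
actual_noise = [0.1, 0.3, 0.5, 0.7]

gain, offset = fit_linear_noise_model(coded_signal, actual_noise)
mse = mean_squared_error(coded_signal, actual_noise, gain, offset)
print(f"gain={gain:.3f}, offset={offset:.3f}, mse={mse:.2e}")
```

With a model of this form only two coefficients per band need to be conveyed, which fits the report that model coefficients are transmitted once per file.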
BS.1116 listening tests on the performance of the matrixing technology were presented. Results showed very good performance at 1.2 Mb/s for the 22.2-channel program (mean higher than -0.5 on the BS.1116 reporting scale).
The presenter noted that listener training involved hearing the coded signals and also the difference signal (i.e. noise) between original and coded.
Draft Call for Proposals
The Chair extracted text from N12610, “Draft Use Cases, Requirements and Evaluation Procedures for 3D Audio”, to serve as the basis of a “Draft Call for Proposals for 3D Audio.” The remainder of the time in the Audio Subgroup was spent discussing and revising this document, with most of that time devoted to the statement of requirements.
The result was a document that had the approval of the Audio Subgroup, but which the group also acknowledged requires additional work. The Chair urged experts to review the document during the AhG period and to bring contributions to the next meeting that express their thoughts.