International organisation for standardisation

Brainstorming for next-generation audio standards

5.1.3 Brainstorming for next-generation audio standards

The results of this discussion are captured in output document N9900, New Directions in Audio Standardization.

5.2 Task Group discussions

5.2.1 MPEG-2 and MPEG-4 audio, conformance, reference software

MPEG-2

Yasushige Nakayama, NHK, presented

m15346

Request for the extension of the MPEG formats to support a 22.2 multichannel configuration

Yasushige Nakayama

This contribution proposes an extension of the MPEG-2 AAC Program Configuration Element (PCE) to support up to a 22.2 channel loudspeaker configuration. It notes that the AAC PCE is only able to map channels to loudspeakers arranged on the perimeter of a circle lying in a single horizontal plane. It therefore cannot address, for example, three levels of loudspeakers (ceiling, mid and floor) or loudspeakers in the center of a loudspeaker plane, such as the center of the ceiling. The contribution provides specific syntax changes to the current specification. NHK feels that 22.2 loudspeaker audio presentation has a market in the home, where delivery is via over-the-air broadcast.
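The limitation described above can be illustrated with a minimal Python sketch. It is not the syntax proposed in m15346: the class, the field names, the labels and the angles are all hypothetical placeholders, chosen only to show that a position on a single horizontal circle cannot describe ceiling or floor loudspeakers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loudspeaker:
    label: str            # hypothetical name, not a standardised identifier
    azimuth_deg: float    # 0 = straight ahead of the listener
    elevation_deg: float  # 0 = ear-level plane, 90 = directly overhead


def on_horizontal_ring(spk: Loudspeaker) -> bool:
    """True if the position lies on a circle in the ear-level horizontal plane."""
    return spk.elevation_deg == 0.0


# Illustrative positions only; the angles are placeholders, not 22.2 values.
speakers = [
    Loudspeaker("front_centre", 0.0, 0.0),
    Loudspeaker("ceiling_centre", 0.0, 90.0),
    Loudspeaker("floor_front_centre", 0.0, -20.0),
]

# Positions that a single-plane description cannot express.
off_ring = [s.label for s in speakers if not on_horizontal_ring(s)]
print(off_ring)  # ['ceiling_centre', 'floor_front_centre']
```

Describing each loudspeaker with an elevation as well as an azimuth is one possible way to cover several layers; the contribution itself should be consulted for the actual proposed syntax.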

Bernhard Grill, FhG, noted that this is a good idea, and might be an opportunity to do maintenance work on MPEG-2 AAC. Pierrick Philippe, France Telecom R&D, noted that identifying the exact location of loudspeakers in a room may ultimately be an important part of the proposed amendment.

It was the consensus of the Audio Subgroup to:

Make this contribution an output document with a title like “Thoughts on the extension of the MPEG formats to support a 22.2 multichannel configuration.”
Consider a liaison to the appropriate AES TC asking about AES activity in this area.

MPEG-4

Andreas Schneider, Dolby, presented

m15437

Proposed clarification for ISO/IEC 14496-4

Andreas Schneider
Heiko Purnhagen

This contribution notes that the markets for 1024 transform length AAC and 960 transform length AAC are generally distinct. It proposes stating in Conformance that a conformant decoder can indicate whether it supports 960, 1024 or both.
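As a small worked example of how the two transform lengths differ in practice, the Python sketch below computes the frame duration for each length and checks a stream against a declared decoder capability. The 48 kHz sampling rate and the capability set are assumptions for illustration; the contribution does not specify how a decoder would signal its support.

```python
def frame_duration_ms(transform_length: int, sample_rate_hz: int) -> float:
    """Duration of one frame of `transform_length` samples, in milliseconds."""
    return 1000.0 * transform_length / sample_rate_hz


def can_decode(supported_lengths: set, stream_length: int) -> bool:
    """Hypothetical capability check: is the stream's transform length supported?"""
    return stream_length in supported_lengths


for length in (960, 1024):
    print(length, round(frame_duration_ms(length, 48000), 2))
# 960 20.0
# 1024 21.33

decoder_supports = {1024}  # assumed declaration for a 1024-only decoder
print(can_decode(decoder_supports, 960))  # False
```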

The Chair asked that the Dolby experts bring additional information, either later in the MPEG week or at the next MPEG meeting.

MPEG-7

Matthias Gruhne, FhG, presented

m15343

Technical Report on Direct Feature Extraction

Matthias Gruhne

This contribution notes that large databases of music exist, but mostly in compressed form. If one wished to extract audio-based metadata for this music, the most straightforward approach would be to decompress the music and apply the standardized FFT-based MPEG-7 feature extraction methods. This is computationally expensive. However, since the MPEG-7 features of interest are extracted from the frequency domain, the contribution proposes an efficient method to transcode from an MPEG-1 Layer 3 or an MPEG-4 AAC compressed representation directly to the MPEG-7 feature set. This technique can realize a complexity reduction of more than 90%.
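The general idea of working in the coded frequency domain can be sketched as follows. This is not the method of m15343: it simply treats squared MDCT coefficients, as they might be available inside a decoder after dequantisation, as a rough stand-in for an FFT power spectrum, and computes a plain linear-frequency spectral centroid rather than a standardised MPEG-7 descriptor. The function name and the input values are hypothetical.

```python
def centroid_from_mdct(coeffs, sample_rate_hz):
    """Power-weighted mean frequency (Hz) computed directly from MDCT coefficients.

    Assumes N coefficients cover 0 .. sample_rate/2, with bin m centred
    at (m + 0.5) * sample_rate / (2 * N).
    """
    n = len(coeffs)
    power = [c * c for c in coeffs]
    total = sum(power)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    bin_hz = sample_rate_hz / (2 * n)
    return sum((m + 0.5) * bin_hz * p for m, p in enumerate(power)) / total


# Toy frame with 4 coefficients at 8 kHz sampling (bin width 1000 Hz).
print(centroid_from_mdct([0.0, 1.0, 0.0, 0.0], 8000))  # 1500.0
print(centroid_from_mdct([1.0, 0.0, 0.0, 1.0], 8000))  # 2000.0
```

Because the coefficients are taken before the synthesis filterbank, neither the inverse transform nor a subsequent FFT is needed, which is where a saving in complexity would come from in an approach of this kind.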

Audio experts noted that the report should be extended to:

Be much more tutorial in describing the conversion from the coded representation to the FFT coefficient estimate.
Add information on complexity.

5.2.2 MPEG-4 AAC Enhanced Low Delay - Markus Schnell

Taejin Lee, ETRI, presented

m15366

Report on the AAC-ELD Verification test at ETRI and LG Electronics

Minje Kim
Taejin Lee
Jeongil Seo
Kyeongok Kang
Henny Oh

Test report from ETRI: 9 listeners took part in 2 tests, A1 Speech and A2 Music. Test data available in Excel sheet.

Conclusions:

A1 Speech: LD-48 performs indistinguishably from ELD-48.

A2 Music: LD-48 performs indistinguishably from ELD-48.

Heiko Purnhagen, Dolby, presented

m15395

Report on Enhanced Low Delay AAC Subjective Tests at Dolby Laboratories

Vesa Ruopilla
Per Ekstrand

Test site report from Dolby Labs, which participated in 5 tests: A1 Music (12 listeners), A2 Speech (8 listeners), A2 Music (8 listeners), T1 (10), T2 (10). Post-screening: 1 subject graded the original at 87 for one item; the data for this subject was consistent for all other items, so it is proposed not to reject that listener. Data has been provided for further analysis.
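The post-screening decision can be illustrated with a minimal Python sketch. The threshold of 90 for the hidden original and the rule of rejecting a listener only when more than 15% of their hidden-original grades fall below it are assumptions for illustration, not values taken from the test plan; the item names and grades are invented.

```python
def low_reference_items(ref_grades, threshold=90):
    """Items for which the listener graded the hidden original below `threshold`."""
    return [item for item, grade in ref_grades.items() if grade < threshold]


def should_reject(ref_grades, threshold=90, max_fraction=0.15):
    """Assumed rule: reject only if too large a share of originals is graded low."""
    if not ref_grades:
        raise ValueError("no grades supplied")
    low = low_reference_items(ref_grades, threshold)
    return len(low) / len(ref_grades) > max_fraction


# Invented grades for one listener: a single original graded at 87.
grades = {f"item_{i}": 100 for i in range(1, 9)}
grades["item_3"] = 87

print(low_reference_items(grades))  # ['item_3']
print(should_reject(grades))        # False
```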

Markus Schnell, FhG, presented

m15446

Results of AAC-ELD Verification test

Markus Schnell
Tobias Albert

FhG conducted all 6 experiments, as shown below. Test data available in Excel sheet.

Conclusion:

A1 Music: 1C-32 == LD-32 < ELD-32 == 1C-48 < ELD-48 == LD-48

A2 Music: 1C-48 < LD-48 < ELD-48 == ELD-64-S == AAC-LD-64

A1 Speech: LD-32 < 1C-32 < ELD-32 == 1C-48 < LD-48 == ELD-48

A2 Music: 1C-48 < LD-48 == ELD-48 < ELD-64-S == AAC-LD-64

T1: 1C-24 < 1C-32 == ELD-24 < LD-32 < HE-24 == ELD-32

T2: 1C-48 < HE-32 < LD-48 == ELD-64-S <= LD-64 == ELD-48 < ELD-64; ELD-48 > LD-48

Johannes Boehm, Thomson, presented

m15464

Report on the AAC-ELD Verification test, Thomson

Johannes Boehm

Test report from Thomson Germany: MUSHRA methodology, STEP software used. Number of listeners: A1 Speech 10 (9), A2 Speech 12, A2 Music 12. Test data available in Excel sheet.

Conclusions:

A2 Speech: 1C-48 is worse than all other codecs; the confidence intervals (CIs) of all (E)LD codecs in the test overlap.

A2 Music: 1C-48 is worse than all other codecs; the CIs of all (E)LD codecs in the test overlap.

A1 Speech: 1C-32 performs equally to LD-32; ELD-32 performs equally to 1C-48; (E)LD-48 is close to transparency.
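The overlap check behind conclusions of this kind can be sketched in Python as follows. The scores are invented, not taken from the Thomson data, and the Student-t value 2.201 (two-sided 95%, 11 degrees of freedom, i.e. 12 listeners) is supplied by hand; the actual analysis in the report may have been done differently.

```python
import math
from statistics import mean, stdev


def confidence_interval(scores, t_value):
    """Mean +/- t * s / sqrt(n) for one codec's per-listener scores."""
    m = mean(scores)
    half_width = t_value * stdev(scores) / math.sqrt(len(scores))
    return (m - half_width, m + half_width)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented scores for 12 listeners and two hypothetical codecs.
codec_x = [82, 78, 90, 85, 76, 88, 80, 84, 79, 91, 83, 86]
codec_y = [80, 75, 88, 86, 74, 85, 82, 81, 77, 89, 84, 83]

T_95_DF11 = 2.201  # two-sided 95% Student-t quantile for 11 degrees of freedom
ci_x = confidence_interval(codec_x, T_95_DF11)
ci_y = confidence_interval(codec_y, T_95_DF11)
print(intervals_overlap(ci_x, ci_y))
```

Overlapping intervals are usually read as "no difference shown" rather than as proof that two codecs are equal.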

Markus Schnell, FhG, presented

m15447

Summary of ELD Verification test

Markus Schnell

The document describes the preparation of the ELD Verification test. It gives a detailed description of the pre-processing of speech items with respect to the ITU methodology for testing speech codecs. The codec under test was provided by FhG; the reference codecs HE-AAC and AAC-LD were provided by Dolby Sweden and FhG. One item in the music set had to be exchanged due to its bandwidth limitation to 12 kHz. It was replaced by an item of similar content.
