5.1.3 Brainstorming for next-generation audio standards
The results of this discussion are captured in an output document N9900, New Directions in Audio Standardization.
5.2 Task Group discussions
5.2.1 MPEG-2 and MPEG-4 audio, conformance, reference software
MPEG-2
Yasushige Nakayama, NHK, presented
m15346 | Request for the extension of the MPEG formats to support a 22.2 multichannel configuration | Yasushige Nakayama
This contribution proposes an extension of the MPEG-2 AAC Program Configuration Element (PCE) to support up to a 22.2 channel loudspeaker configuration. It notes that the AAC PCE is only able to map channels to loudspeakers arranged on the perimeter of a circle lying in a single horizontal plane. It therefore cannot address, for example, three levels of loudspeakers (ceiling, mid and floor) or loudspeakers in the center of a loudspeaker plane, such as the center of the ceiling. The contribution provides specific syntax changes to the current specification. NHK feels that 22.2 loudspeaker audio presentation has a market in the home, where delivery is via over-the-air broadcast.
Bernhard Grill, FhG, noted that this is a good idea and might be an opportunity to do maintenance work on MPEG-2 AAC. Pierrick Philippe, France Telecom R&D, noted that identifying the exact location of loudspeakers in a room may ultimately be an important part of the proposed amendment.
It was the consensus of the Audio Subgroup to:
- Make this contribution an output document with a title like “Thoughts on the extension of the MPEG formats to support a 22.2 multichannel configuration.”
- Consider a Liaison to the appropriate AES TC asking about AES activity in this area.
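The limitation described in m15346 can be illustrated with a minimal sketch. It assumes a hypothetical `Loudspeaker` record with azimuth and elevation angles; the labels and angles are illustrative only and are taken neither from the contribution nor from the AAC PCE syntax. The sketch shows that a description restricted to a single horizontal circle (azimuth only) cannot tell apart loudspeakers that differ only in height.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Loudspeaker:
    """Hypothetical loudspeaker record; not part of any MPEG syntax."""
    label: str
    azimuth_deg: float    # angle in the horizontal plane, 0 = front
    elevation_deg: float  # 0 = ear level, positive = above, negative = below
    is_lfe: bool = False


def layer(speaker):
    """Classify a loudspeaker into a height layer from its elevation."""
    if speaker.is_lfe:
        return "lfe"
    if speaker.elevation_deg > 0:
        return "top"
    if speaker.elevation_deg < 0:
        return "bottom"
    return "middle"


def azimuth_collisions(speakers):
    """Group full-range loudspeakers that share one azimuth.

    Any group with more than one member is ambiguous when only a
    position on a horizontal circle can be signalled.
    """
    by_azimuth = defaultdict(list)
    for sp in speakers:
        if not sp.is_lfe:
            by_azimuth[sp.azimuth_deg].append(sp.label)
    return {az: labels for az, labels in by_azimuth.items() if len(labels) > 1}


# Illustrative positions only (assumed values, not a 22.2 layout definition).
LAYOUT = [
    Loudspeaker("middle front center", 0.0, 0.0),
    Loudspeaker("top front center", 0.0, 45.0),
    Loudspeaker("bottom front center", 0.0, -30.0),
    # Azimuth is arbitrary directly overhead; 0.0 is used as a placeholder.
    Loudspeaker("top center (ceiling)", 0.0, 90.0),
    Loudspeaker("middle front left", 30.0, 0.0),
    Loudspeaker("middle front right", -30.0, 0.0),
    Loudspeaker("LFE", 0.0, -30.0, is_lfe=True),
]

print(Counter(layer(sp) for sp in LAYOUT))
print(azimuth_collisions(LAYOUT))
```

In this toy layout four full-range loudspeakers share azimuth 0, so an azimuth-only mapping would need additional height information to distinguish them.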
MPEG-4
Andreas Schneider, Dolby, presented
m15437 | Proposed clarification for ISO/IEC 14496-4 | Andreas Schneider, Heiko Purnhagen
This contribution notes that the markets for 1024 transform length AAC and 960 transform length AAC are generally distinct. It proposes stating in Conformance that a conformant decoder can indicate whether it supports 960, 1024 or both.
The Chair asked that the Dolby experts bring additional information, either later in the MPEG week or at the next MPEG meeting.
MPEG-7
Matthias Gruhne, FhG, presented
m15343 | Technical Report on Direct Feature Extraction | Matthias Gruhne
This contribution notes that large databases of music exist, but mostly in compressed form. To extract audio-based metadata for this music, the most straightforward approach would be to decompress the music and apply the standardized FFT-based MPEG-7 feature extraction methods, which is computationally expensive. However, since the MPEG-7 features of interest are computed in the frequency domain, the contribution proposes an efficient method to transcode from an MPEG-1 Layer 3 or an MPEG-4 AAC compressed representation directly to the MPEG-7 feature set. According to the contribution, this technique can realize a complexity reduction of more than 90%.
Audio experts noted that the report should be extended to:
- Be much more tutorial in the conversion from coded representation to FFT coefficient estimate.
- Add information on complexity.
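The general idea of computing a frequency-domain feature directly from the spectral coefficients carried in a coded frame, without decoding to PCM and re-running an FFT, can be sketched as follows. This is not the method of m15343 and not a normative MPEG-7 descriptor: the function name is hypothetical, the squared coefficients are used as a crude power estimate, and the feature is a plain linear-frequency spectral centroid chosen for brevity. It assumes a decoder that exposes one frame of N dequantized MDCT coefficients, with bin k centered at (k + 0.5) * fs / (2N).

```python
def spectral_centroid(coeffs, sample_rate_hz):
    """Power-weighted mean frequency of one frame of MDCT coefficients.

    Assumption: coeffs holds the N dequantized coefficients of a single
    frame, so that bin k is centered at (k + 0.5) * fs / (2 * N).
    """
    n = len(coeffs)
    if n == 0:
        return 0.0
    bin_width_hz = sample_rate_hz / (2.0 * n)
    power = [c * c for c in coeffs]  # crude per-bin power estimate
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, report 0
    weighted = sum((k + 0.5) * bin_width_hz * p for k, p in enumerate(power))
    return weighted / total


# Usage: a frame of 1024 coefficients with all energy in bin 3.
frame = [0.0] * 1024
frame[3] = 2.0
print(spectral_centroid(frame, 48000.0))
```

A real transcoder would also have to account for window switching and for the difference between MDCT and FFT magnitudes, which is the conversion step the experts asked to see documented.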
Taejin Lee, ETRI, presented
m15366 | Report on the AAC-ELD Verification test at ETRI and LG Electronics | Minje Kim, Taejin Lee, Jeongil Seo, Kyeongok Kang, Henny Oh
Test report from ETRI: 9 listeners took part in 2 tests, A1 Speech and A2 Music. Test data is available in an Excel sheet.
Conclusions:
A1 Speech: LD-48 performs indistinguishably from ELD-48.
A2 Music: LD-48 performs indistinguishably from ELD-48.
Heiko Purnhagen, Dolby, presented
Test site report from Dolby Labs, which participated in 5 tests: A1 Music (12 listeners), A2 Speech (8 listeners), A2 Music (8 listeners), T1 (10 listeners) and T2 (10 listeners). Post-screening: one subject graded one original item at 87; since this subject's data for all other items was consistent, it is proposed not to reject that listener. Data has been provided for further analysis.
Markus Schnell, FhG, presented
FhG conducted all 6 experiments, as shown below. Test data available in Excel sheet.
Conclusion:
A1 Music: 1C-32==LD-32 < ELD-32==1C-48 < ELD-48 == LD-48
A2 Music: 1C-48 < LD-48 < ELD-48 == ELD-64-S == AAC-LD-64
A1 Speech: LD-32 < 1C-32 < ELD-32 == 1C-48 < LD-48 == ELD-48
A2 Music: 1C-48 < LD-48 == ELD-48 < ELD-64-S == AAC-LD-64
T1: 1C-24 < 1C-32 == ELD-24 < LD-32 < HE-24 == ELD-32
T2: 1C-48 < HE-32 < LD-48 == ELD-64-S <= LD-64 == ELD-48 < ELD-64; ELD-48 > LD-48
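The shorthand used in the conclusions above can be expanded into ranked tiers with a small helper. This is a minimal sketch under an assumed reading of the notation: `==` is taken to mean "no significant difference" and `<` to mean "significantly worse than". The function name is hypothetical, and the parser deliberately rejects lines that use `<=`, `>` or `;`, such as the T2 line, rather than guessing their meaning.

```python
def parse_tiers(ranking):
    """Split a ranking such as 'A==B < C' into tiers, worst tier first.

    Only the operators '<' and '==' are supported (assumed semantics:
    '<' = significantly worse, '==' = not significantly different).
    """
    for unsupported in ("<=", ">", ";"):
        if unsupported in ranking:
            raise ValueError(f"unsupported token {unsupported!r} in ranking")
    tiers = []
    for tier in ranking.split("<"):
        names = [name.strip() for name in tier.split("==")]
        if any(not name for name in names):
            raise ValueError("empty codec name in ranking")
        tiers.append(names)
    return tiers


# Usage with the A1 Music line from the FhG report.
a1_music = "1C-32==LD-32 < ELD-32==1C-48 < ELD-48 == LD-48"
for rank, tier in enumerate(parse_tiers(a1_music), start=1):
    print(rank, tier)
```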
Johannes Boehm, Thomson, presented
m15464 | Report on the AAC-ELD Verification test, Thomson | Johannes Boehm
Test report from Thomson Germany: MUSHRA tests were run using the STEP software. Number of listeners: A1 Speech 10(9), A2 Speech 12, A2 Music 12. Test data is available in an Excel sheet.
Conclusions:
A2 Speech: 1C-48 is worse than all other codecs; the CIs of all (E)LD codecs in the test overlap.
A2 Music: 1C-48 is worse than all other codecs; the CIs of all (E)LD codecs in the test overlap.
A1 Speech: 1C-32 performs equally to LD-32; ELD-32 performs equally to 1C-48; (E)LD-48 is close to transparency.
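The "overlapping CIs" criterion used in these conclusions can be sketched as follows. The scores are invented purely for illustration and are not taken from any test site's data. The sketch assumes a 95% confidence interval of the mean under a normal approximation (factor 1.96); for panels of around 10 listeners a Student-t quantile would give somewhat wider intervals. Overlap of intervals is a common heuristic for "no clear difference", not a formal significance test.

```python
import math
import statistics


def mean_ci95(scores):
    """Return (low, high) of an approximate 95% CI of the mean.

    Assumption: normal approximation with factor 1.96; needs >= 2 scores.
    """
    mean = statistics.mean(scores)
    half_width = 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))
    return (mean - half_width, mean + half_width)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical MUSHRA scores for one item (illustrative values only).
scores_codec_a = [80, 85, 90, 75, 88, 82]
scores_codec_b = [78, 84, 86, 80, 83, 81]
scores_codec_c = [40, 45, 38, 50, 42, 47]

ci_a = mean_ci95(scores_codec_a)
ci_b = mean_ci95(scores_codec_b)
ci_c = mean_ci95(scores_codec_c)

print("A vs B overlap:", intervals_overlap(ci_a, ci_b))
print("A vs C overlap:", intervals_overlap(ci_a, ci_c))
```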
Markus Schnell, FhG, presented
The document describes the preparation of the ELD verification test. It gives a detailed description of the pre-processing of speech items with respect to the ITU methodology for testing speech codecs. The codec under test was provided by FhG; the reference codecs HE-AAC and AAC-LD were provided by Dolby Sweden and FhG. One item in the music set had to be exchanged because its bandwidth was limited to 12 kHz; it was replaced by an item of similar content.