3D Audio CEs

Achim Kuntz, FhG-IIS, presented


Evaluation of QMF Performance in MPEG-H 3D Audio CO Format Conversion

Achim Kuntz


Qualcomm's test results for the filter bank replacement CE by Fraunhofer IIS

Nils Peters, Deep Sen

The contributions report on listening tests that assessed a change proposed at the previous MPEG meeting: making the format converter filterbank the same as the SBR filterbank, i.e. using the standard QMF filterbank for both. The listening tests used uncoded items and converted the output to a 5.1 channel presentation using the format converter. The tests used the MUSHRA methodology, and a total of 17 subjects participated across the two test sites.

It was the consensus of the Audio subgroup to incorporate the changes proposed in the contributions into the text of MPEG-H 3D Audio WD1-CO and reflect them in the RM1-CO reference software.

Werner de Bruijn, Philips, presented


Core Experiment on 3D-Audio rendering

Werner de Bruijn, Werner Oomen, Aki Härmä

The contribution proposed that the 3D Audio decoder be able to use multiple rendering algorithms with different ones selected based on loudspeaker position. Furthermore, different renderers could be used for different regions of the loudspeaker layout, e.g. regions with very different geometries.

The contribution notes two trends in the home audio market:

  • A shift from wired to wireless connection for loudspeakers. Wireless speakers better permit surround speakers to be properly placed.

  • A shift from surround loudspeaker layouts (e.g. ITU-R 5.1) to front-only speaker layouts (e.g. “soundbars”).

The contribution reports on a test of a proposed renderer that can take advantage of closely spaced loudspeakers that would be suitable for “dipole” rendering. Three setups were investigated:

  • ITU-R 5.1 setup

  • Two front dipoles

  • Two front dipoles with two surround speakers

In summary, the contribution notes that

  • A single renderer is not adequate for all layouts, and that multiple renderers would be beneficial

  • A renderer based on two front dipoles showed very good performance as compared to VBAP

Yeshwant Muthusamy, Samsung Telecom America, asked how close together the dipoles can be and still work. The presenter stated that they were tested with a spacing of approximately the width of a TV. Masayuki Nishiguchi, Sony, asked how large the sweet spot is. The presenter stated that this was not investigated. Johannes Boehm, Technicolor, asked if the dipole imposes timbre changes when the listener position changes. The presenter stated that this was true and that it was not investigated, but that consumers are nevertheless very happy with such dipoles in commercial soundbars. Thomas Sporer, FhG-IDMT, noted that the position and attributes of the side-reflectors may have a significant impact on the perceived sound.

The Chair noted that the contribution generated a very good discussion and asked that experts continue the discussion off-line. Particularly, he urged experts to consider:

  • What needs to be standardized to support the market needs raised by the contribution?

Jan Plogsties, FhG-IIS, presented


Thoughts on an Interface to Device-specific Rendering

Jan Plogsties

The contribution notes that a 3D Audio decoder needs to know the loudspeaker positions in a user's layout in order to deliver the greatest value to the listener. However, the user may wish to deliver more than loudspeaker positions to the decoder. Hence, needed meta-data may be:

  • Loudspeaker configuration, as CICP configuration index or as speaker positions in space

  • Type of loudspeaker, e.g. sound bar

  • Desired DRC and dialog/background energy ratio

The presenter imagines several “levels” of meta-data communication. Syntax and semantics for three levels are in the contribution.
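The kinds of meta-data listed above could be grouped in a single structure. The following is only a minimal sketch under stated assumptions: the class name, field names and the validation rule are hypothetical illustrations and are not the syntax or semantics defined in the contribution.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RenderingInfo:
    """Hypothetical container; field names are illustrative only."""
    # Layout is given EITHER as a CICP configuration index ...
    cicp_index: Optional[int] = None
    # ... OR as explicit (azimuth, elevation) positions in degrees.
    speaker_positions: Optional[List[Tuple[float, float]]] = None
    # Free-text loudspeaker type, e.g. "sound bar" (assumed encoding).
    loudspeaker_type: str = "discrete"
    # Desired DRC setting and dialog/background energy ratio (assumed units).
    drc_profile: Optional[str] = None
    dialog_to_background_db: Optional[float] = None

    def __post_init__(self):
        # Exactly one of the two layout descriptions must be present.
        if (self.cicp_index is None) == (self.speaker_positions is None):
            raise ValueError("give exactly one of cicp_index or speaker_positions")


# Example: a two-driver sound bar described by explicit positions.
soundbar = RenderingInfo(
    speaker_positions=[(-15.0, 0.0), (15.0, 0.0)],
    loudspeaker_type="sound bar",
    dialog_to_background_db=3.0,
)
```

Making the layout an either/or choice mirrors the first bullet above; everything else is optional so that simpler "levels" of communication can leave fields unset.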

The Chair noted that the contribution sparked very good discussion, and that a companion contribution (m32182) will address some issues raised. This topic will continue to be discussed.

Taegyu Lee, Yonsei University, presented


Preliminary CE proposal on LFE binauralization for MPEG-H 3D Audio

Taegyu Lee, Henney Oh, Young-cheol Park, Dae Hee Youn

The contribution notes that LFE channels have not been included in the binauralization tests and hence do not appear to be supported by proposed binauralization technology. Furthermore, only a portion of the CfP test items have an LFE, or have a non-zero LFE.

The contribution reports on two subjective tests, each conducted with binauralized test items and presented on headphones.

  • A first subjective test measures whether a user can detect the presence of an LFE. That is, the Reference is e.g. 22.0 and the System under test is 22.2. Results showed a clear ability to detect an LFE signal.

  • A second subjective test measures if a user prefers the case in which an LFE is present. Results showed a clear preference for signals with LFE.

Gregory Pallone, Orange, noted that Radio France decided to produce all content in 5.0 format, as LFE was not regarded as relevant for radio material.

Yuki Yamamoto, Sony, presented


Proposed corrections to 3D Audio CO RM0 working draft text and reference software

Yuki Yamamoto, Toru Chinen, Runyu Shi, Masayuki Nishiguchi

The contribution provides a technical solution, in text and reference software, for an identified problem associated with rendering an object to a position that is at or near the lower level of loudspeakers in, e.g., the 22.2 speaker layout. The issue pertains to drawing VBAP triangles on the surface of the spherical loudspeaker configuration, and errors in mapping a 2-dimensional triangle onto the 3-dimensional surface.

Achim Kuntz, FhG-IIS, presented


Core Experiment on N-Wise Panning

Christian Borß

The contribution presents an extension to the VBAP panning method. It permits panning between more than 3 speakers (hence the name N-wise, also called Virtual-VBAP). The contribution notes that the current 22.2 speaker layout and other nested standard configurations present many panning problems, such as asymmetric panning over upper speakers and clipping over lower speakers (as noted in m32226). The “standard” layouts have a table-driven VBAP engine, and other layouts may result in “tandem” rendering.

Virtual VBAP adds a virtual loudspeaker, whose virtual signal can be obtained via a symmetric downmix of adjacent speakers. This removes the “asymmetric” problem found in the original VBAP. The virtual loudspeaker locations need only be identified once, at configuration time, when they are added to the VBAP tables. The presenter noted that it is possible to place a virtual loudspeaker at the “Voice of God” location if the layout does not have one, and also at the “Voice of Hell” location so that object location clipping at the lower speaker boundary does not have to occur.
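The idea of panning over a virtual loudspeaker and then folding its gain back into the real speakers can be sketched as follows. This is a minimal illustration, not the contribution's algorithm: the four-speaker height ring, the equal-amplitude downmix weights, the final energy normalization and all function names are assumptions made for the example.

```python
import math


def unit(az_deg, el_deg):
    """Unit vector for azimuth/elevation in degrees (x front, y left, z up)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def triangle_gains(p, l1, l2, l3):
    """Solve p = g1*l1 + g2*l2 + g3*l3 by Cramer's rule (basic VBAP step)."""
    d = det3(l1, l2, l3)
    return (det3(p, l2, l3) / d, det3(l1, p, l3) / d, det3(l1, l2, p) / d)


def vvbap_gains(p, real, virtual, triangles, downmix):
    """Pan p over real + virtual speakers, then fold virtual gains into real ones.

    triangles: index triples into the combined list real + virtual.
    downmix:   virtual index -> real indices that share its signal equally
               (equal split is an assumption of this sketch).
    """
    speakers = real + virtual
    for tri in triangles:
        g = triangle_gains(p, *(speakers[i] for i in tri))
        if min(g) >= -1e-9:  # all gains non-negative: source is in this triangle
            break
    else:
        raise ValueError("no triangle contains the source direction")
    out = [0.0] * len(real)
    for idx, gain in zip(tri, g):
        gain = max(gain, 0.0)
        if idx < len(real):
            out[idx] += gain
        else:
            targets = downmix[idx]
            for t in targets:
                out[t] += gain / len(targets)
    norm = math.sqrt(sum(x * x for x in out))
    return [x / norm for x in out]  # unit-energy gains for the real speakers


# Assumed layout: four height speakers and a virtual top ("Voice of God") speaker.
real = [unit(45, 35), unit(135, 35), unit(-135, 35), unit(-45, 35)]
virtual = [unit(0, 90)]
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
downmix = {4: [0, 1, 2, 3]}
gains = vvbap_gains(unit(0, 60), real, virtual, triangles, downmix)
```

For a source straight ahead and above the ring, the left and right gains come out equal, which is the symmetry that a fixed diagonal split of the four-speaker quad would not give.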

Worst-case V-VBAP complexity is 33% higher than VBAP, but is less than XX% with respect to the 22.2 channel core decoder engine.

A listening test was performed to assess the performance of rendering test items to 5.1 and 9.1 speaker layouts. CO_11, CO_12 and an additional item, an RC model plane fly-over, were used as test items. A differential score analysis (V-VBAP – VBAP) showed that V-VBAP performed better than VBAP when averaged over all items in the test set.
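One common way to carry out a differential score analysis is to take per-listener score differences and check whether the confidence interval of their mean excludes zero. The sketch below assumes that approach; the scores are hypothetical placeholders and are not results from the contribution, and the minutes do not state which statistic was actually used.

```python
import math
import statistics

# Hypothetical placeholder MUSHRA scores (0-100), one pair per listener.
# These are NOT data from the contribution.
vvbap = [72, 65, 80, 58, 77, 69, 74, 61]
vbap = [68, 66, 71, 55, 70, 64, 73, 57]

# Differential scores: V-VBAP minus VBAP for each listener.
diffs = [a - b for a, b in zip(vvbap, vbap)]

mean_diff = statistics.mean(diffs)
sem = statistics.stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean

# Two-sided 95% Student-t quantile for 7 degrees of freedom (8 listeners).
T_CRIT_7DOF = 2.365
ci = (mean_diff - T_CRIT_7DOF * sem, mean_diff + T_CRIT_7DOF * sem)

# A positive lower bound would indicate a significant preference for V-VBAP.
better = ci[0] > 0
```

Working on differences removes listener-to-listener offsets in absolute scoring, which is why a paired analysis is usually preferred over comparing the two mean scores directly.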

Additionally, the contribution proposes to incorporate a triangulation algorithm (using the QuickHull algorithm) to identify needed virtual speakers and resultant triangles for any speaker configuration. This computation need only be done at configuration time.
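The configuration-time triangulation can be illustrated with a convex-hull computation over the speaker directions. The sketch below deliberately swaps in a brute-force hull test instead of QuickHull to stay short and dependency-free. It produces the same triangles for small layouts in general position but scales far worse. The 5.0-style layout, the two virtual speakers and all names are assumptions for illustration.

```python
import math
from itertools import combinations


def unit(az_deg, el_deg):
    """Unit vector for azimuth/elevation in degrees (x front, y left, z up)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def hull_triangles(points, eps=1e-9):
    """Brute-force convex hull: keep triples with all other points strictly on one side.

    Not QuickHull; faces containing four or more coplanar points are not handled.
    """
    faces = []
    for i, j, k in combinations(range(len(points)), 3):
        a = points[i]
        n = cross(sub(points[j], a), sub(points[k], a))  # plane normal
        sides = [dot(n, sub(q, a))
                 for m, q in enumerate(points) if m not in (i, j, k)]
        if all(s < -eps for s in sides) or all(s > eps for s in sides):
            faces.append((i, j, k))
    return faces


# Assumed horizontal 5.0-style layout (indices 0-4) ...
real = [unit(az, 0) for az in (0, 30, -30, 110, -110)]
# ... plus virtual top and bottom speakers (indices 5 and 6).
virtual = [unit(0, 90), unit(0, -90)]

triangles = hull_triangles(real + virtual)
```

With the two virtual speakers added, every resulting triangle pairs two neighbouring real speakers with one virtual speaker, so the whole sphere is covered even though the real layout is purely horizontal.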

The presenter requested a workplan to continue the work on this CE.

The Chair encouraged experts to study both contributions and decide:

  • Is the Sony contribution a complete CE, and does it show sufficient merit to adopt?

  • What should go into the FhG-IIS requested workplan?

