Later in week:
Sony experts will support the FhG-IIS V-VBAP technology, and so there is no need for further study or action on the Sony proposal.
Sang Bae Chon, Samsung, presented
m32238 | CE on Immersive Audio Rendering over Fewer Loudspeakers | Sang Bae Chon, Sunmin Kim
The contribution noted that ATSC 3.0 might be a customer of MPEG 3D Audio. Furthermore, it emphasised that MPEG 3D Audio will have to operate on the very large base of legacy systems, e.g. 5.1 and 7.1 in the horizontal plane, since there will be at best a slow rollout of true 3D speaker systems in the marketplace.
It presented a technology for rendering the elevated channel signals onto the horizontal loudspeakers using a timbre-altering equalization filter. A MUSHRA test was conducted with original 22.2 as reference and CfP RM0 rendering to 5.1 as “current system.” Analysis of absolute scores showed one item better (Fountain Music) and the grand mean better at the 95% level of significance. Analysis of difference scores showed an improvement for all items and for the grand mean of differences.
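The gap between the two analyses is common in paired listening-test data: listeners use the rating scale differently, which widens the spread of absolute scores but cancels out in per-listener differences. A minimal sketch of both interval computations follows, assuming a normal approximation for the 95% interval (a Student-t interval would be wider for small panels) and using interval overlap as a simple, conservative check. The function names and all scores are hypothetical illustrations, not data from the contribution.

```python
from statistics import NormalDist, mean, stdev

# Two-sided 95% critical value under a normal approximation (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)

def mean_ci(values):
    """Return (mean, lower, upper) of an approximate 95% confidence interval."""
    m = mean(values)
    half_width = Z95 * stdev(values) / len(values) ** 0.5
    return m, m - half_width, m + half_width

def paired_difference_ci(system_a, system_b):
    """Interval for the mean of per-listener differences (a minus b)."""
    diffs = [a - b for a, b in zip(system_a, system_b)]
    return mean_ci(diffs)

# Hypothetical MUSHRA scores for one item, one entry per listener.
proposed = [62, 48, 75, 55, 68, 41]
current = [57, 44, 69, 51, 62, 37]

# Absolute scores: the two intervals overlap because listeners differ widely.
_, p_lo, p_hi = mean_ci(proposed)
_, c_lo, c_hi = mean_ci(current)
absolute_intervals_overlap = p_lo <= c_hi and c_lo <= p_hi

# Difference scores: the interval excludes zero, so the improvement shows up.
_, d_lo, d_hi = paired_difference_ci(proposed, current)
difference_is_significant = d_lo > 0
```

With these made-up numbers the absolute-score intervals overlap while the difference-score interval lies entirely above zero, which is the pattern the minutes report.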
The presenter asks for a workplan to progress the CE.
Achim Kuntz, FhG-IIS, presented
m32185 | Core Experiment on Complexity Reduction of the MPEG-H 3D Audio CO Format Converter | Achim Kuntz
The contribution presents an optimized method for performing the same format converter functions as found in RM0.
It notes that the most complex operation in the RM0 format converter is a covariance analysis over all input signals, and that this analysis is done using a two-frame segment (to facilitate overlap-add processing).
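A rough operation count suggests why this step is costly: a covariance analysis over N input signals has N(N+1)/2 unique channel pairs, each needing one multiply-accumulate per sample. The sketch below is a plain time-domain illustration under assumed sizes (24 channels for a 22.2 input, a hypothetical frame length of 1024 samples). It does not reproduce the RM0 signal path, which the minutes do not describe, and the function names are hypothetical.

```python
def unique_pairs(num_channels):
    """Number of distinct (i, j) pairs with i <= j in a symmetric matrix."""
    return num_channels * (num_channels + 1) // 2

def covariance_matrix(segment):
    """Uncentered covariance (no mean removal) of a multichannel segment.

    segment: list of channels, each a list of samples of equal length.
    """
    n = len(segment)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # upper triangle only; the matrix is symmetric
            acc = sum(x * y for x, y in zip(segment[i], segment[j]))
            cov[i][j] = cov[j][i] = acc
    return cov

def macs_per_segment(num_channels, frame_length, frames_per_segment=2):
    """Multiply-accumulates for one analysis over a multi-frame segment."""
    return unique_pairs(num_channels) * frame_length * frames_per_segment

pairs_22_2 = unique_pairs(24)           # assumes all 24 channels take part
macs_22_2 = macs_per_segment(24, 1024)  # hypothetical frame length
```

Because the pair count grows quadratically with the number of input channels, a 22.2 input is far more expensive to analyse than a 5.1 input under these assumptions.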
The reduced-complexity proposal reduces the complexity of the downmix (renderer) by 36% and reduces the complexity of the entire 22.2 3D Audio decoder by nearly 10%.
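Taken together, the two figures imply how much of the total decoder complexity the format converter accounts for. The back-of-envelope calculation below assumes that the entire saving comes from the format converter and reads “nearly 10%” as 0.10; the function name is hypothetical.

```python
def implied_component_share(component_reduction, total_reduction):
    """Share s of total complexity such that s * component_reduction == total_reduction."""
    if not 0 < component_reduction <= 1:
        raise ValueError("component_reduction must be in (0, 1]")
    return total_reduction / component_reduction

# 36% saving in the format converter, about 10% saving in the whole decoder.
share = implied_component_share(0.36, 0.10)
```

Under these assumptions the format converter accounts for roughly 28% of the 22.2 decoder complexity, or slightly less, since the overall saving is stated as just under 10%.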
To check for a possible reduction in subjective quality, a MUSHRA test was performed with the 22.2 original as Reference, RM0 rendering to 5.1 as RM, and the proposed rendering to 5.1. Results showed no difference in individual item scores or mean scores at the 95% level of significance.
It was the consensus of the Audio subgroup to accept this technology into the CO WD1/RM1 text and reference software.
Max Neuendorf, FhG-IIS, presented
m32182 | Flexible Signaling of 3D Loudspeaker Configurations for MPEG-H 3D Audio | Florian Schuh, Christian Ertel, Max Neuendorf, Andreas Hölzer, Johannes Hilpert, Nikolaus Rettelbach
The contribution presents a method for specifying the intended loudspeaker layout of an audio program. It identifies these issues:
- Specifying loudspeaker layouts
- Specifying the assignment of core coder channels to loudspeakers in the layout
There are two loudspeaker layouts:
- the layout for which the Channel-signal program is intended
- the layout for which the Object-signal (and SAOC) program is intended
There are three means to signal loudspeaker layouts:
- CICP speaker layout index
- List of CICP individual loudspeaker indices
- List of Elevation, Azimuth positional data for each loudspeaker, with either 1 or 5 degrees of quantization. Additionally, there is a way to efficiently indicate symmetrical loudspeaker pairs
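The third signalling option can be pictured with a small sketch: each position is quantized to a 1 or 5 degree grid, and a symmetric pair is described once and mirrored across the median plane. This is only an illustration of the idea. The class, the function names, the sign convention and the mirroring rule are assumptions made here, not the bitstream syntax proposed in the contribution.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class SpeakerPosition:
    azimuth_deg: int    # assumed convention: 0 is front, sign selects the side
    elevation_deg: int  # assumed convention: positive is above ear level

def quantize_angle(angle_deg, step_deg):
    """Round an angle to the nearest multiple of step_deg (1 or 5 degrees)."""
    if step_deg not in (1, 5):
        raise ValueError("step must be 1 or 5 degrees")
    # floor(x + 0.5) avoids the round-half-to-even behaviour of round().
    return math.floor(angle_deg / step_deg + 0.5) * step_deg

def quantize_position(azimuth_deg, elevation_deg, step_deg):
    """Quantize both angles of a loudspeaker position to the same grid."""
    return SpeakerPosition(
        quantize_angle(azimuth_deg, step_deg),
        quantize_angle(elevation_deg, step_deg),
    )

def expand_symmetric_pair(position):
    """Return the position and its mirror image across the median plane.

    Loudspeakers on the median plane (azimuth 0 or 180) have no distinct mirror.
    """
    if position.azimuth_deg in (0, 180, -180):
        return [position]
    mirrored = SpeakerPosition(-position.azimuth_deg, position.elevation_deg)
    return [position, mirrored]

# Hypothetical measured position, quantized coarsely and then mirrored.
left_height = quantize_position(32.4, 33.0, 5)
pair = expand_symmetric_pair(left_height)
```

Describing the pair once means that only one set of angles needs to be transmitted for two loudspeakers, which is the kind of saving the symmetric-pair indication aims at.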
Signalling use of individual channels (or channels within pair or quad):
- Individual channels or channels within pair or quad
- Objects
- SAOC channels or channel groups
The presenter requests that the technology in the contribution be incorporated into the 3D Audio WD1 text.
Clemens Par, Suissaudec, noted that:
- Ecma has a registration authority for loudspeaker configurations
- Ecma already references MPEG CICP, and would be interested in using aspects of the technology in 3D Audio.
Gregory Pallone, Orange, noted that it may be important for 3D Audio to support a composite audio program in which one portion (e.g. the main program) has one loudspeaker layout and another portion (e.g. an inserted ad) has a different loudspeaker layout.
Werner Oomen, Philips, and Gregory Pallone, Orange, presented
m32241 | Unified decoder and renderer architecture and API | Aki Härmä, Richard Furse, Marc Emerit, Gregory Pallone, Werner de Bruijn, Werner Oomen
The contribution covers three issues:
- Proposals for unification of C, O and HOA 3D Audio architecture. This would eliminate the necessity for any “tandem rendering”
- API that supports using a proprietary renderer
- A coded representation for parameter information or loudspeaker layout
User-centric configuration data could be:
- Binauralization data
- Decoder loudspeaker layout data
The presenter noted that CO-RM0 currently decodes to a 22.2 channel “bus” which is then connected to the renderer, and that this can lead to significant inefficiencies and to spectral coloration and suboptimal sound localization. For example, decoded streams are either connected directly (e.g. a 22.2 program) or rendered and then connected (e.g. a 9.1 program) to an internal, virtual 22.2 channel bus.
Jan Plogsties, FhG-IIS, notes that the multiplexing of parameter packets into a 3D Audio bitstream might be difficult.
The presenter asks for:
- A workplan to progress the unified architecture proposal.
- A workplan to unify proposals for representing and communicating:
  - Decoder loudspeaker layouts
  - BRIR
  - Object control (e.g. dialog level)
It is the consensus of the Audio subgroup to have the two workplans, with the expectation that the outcome of the workplans be incorporated into 3D Audio at the next MPEG meeting.
The Chair noted that it might be productive to have a face-to-face AhG meeting to accelerate this effort, perhaps in early February at Technicolor’s Hannover office.
Johannes Boehm, Technicolor, presented
m32245 | Vision of a MPEG-H 3D Audio unified architecture | Johannes Boehm, Peter Jax, Florian Keiler, Sven Kordon, Alexander Krueger, Oliver Wuebbolt
The contribution proposes that:
- All renderers directly target the actual loudspeaker layout
- DRC blocks be implemented prior to rendering, as the encoder does not know the actual loudspeaker layout
- USAC 3D be used as the common core coder
The presenter stated that the two workplans agreed to in the m32241 presentation satisfy what he asks for in this presentation.
Oliver Wuebbolt, Technicolor, presented
m32244 | Progress Report on Unification of MPEG-H 3D-Audio CO and HOA | Sascha Dick, Christian Ertel, Johannes Hilpert, Johannes Boehm, Peter Jax, Florian Keiler, Sven Kordon, Alexander Krueger, Oliver Wuebbolt
The contribution notes that RM0-CO uses USAC-3D as a core coder and RM0-HOA uses HE-AAC as a core coder. It proposes to use USAC-3D as the only core coder. This implementation has been done, with the HOA spatial information carried as a USAC-3D extension element.
A subjective test was conducted to assess the performance of the USAC-3D core coder in the RM0-HOA framework. A statistical analysis showed that, relative to RM0-HOA, no item was worse. Averaged over all items, the result was better at 256 kb/s and not significantly different at 1200 kb/s (although the mean value was better).
It was the consensus of the Audio subgroup to use the USAC-3D as the core coder in MPEG-H 3D Audio and to adopt the bitstream re-organization proposed in the contribution.
Max Neuendorf, FhG-IIS, showed some slides that summarized discussion and presented a new 3D Audio decoder block diagram that combines all unification proposals. Going forward, he envisions a workplan that specifies the milestones for producing WD text and RM reference software.
The Chair suggested that there be two documents:
- “Workplan on Unification of 3D Audio Architecture” that only addresses unification issues.
- “Workplan on 3D Audio CEs” that only addresses Core Experiment issues.