INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION



Test items (with number of channels and duration):

  • Pitch Pipe 3D (Long), 15 ch, 27 sec: The 5.1 signal is kept intact. Delay and level were used in the upmix to 15 channels.

  • Pitch Pipe 3D (Short), 17 ch, 21 sec: The 5.1 signal is largely intact; the individual tones were shortened and attacks sharpened.

  • Fountain Music, 14 ch, 20 sec: Added “voice of god” signal.

  • Applause, 22 ch, 12 sec: The audience (i.e. the source of the applause) is located in the orchestra seating and higher balconies, so that the sound stage has a real “3D” aspect.

  • Metronome Walkabout, 22.2 ch, 40 sec: The metronome is carried about the stage from back center to front center.

All mixes were made using loudspeakers located on low, mid, and high rings, equivalent to placement on a sphere. For presentations with fewer than 22.2 channels, the loudspeakers used correspond directly to a subset of those in the 22.2 presentation. Hence one could consider a 15-channel program as a 22.2-channel program with 7 “zero” channels.
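The “zero channels” view can be sketched as a simple channel-mapping step. This is a minimal illustration in plain Python; the channel indexing below is purely illustrative and does not follow the normative 22.2 channel order.

```python
# Sketch of the "zero channels" idea: a 15-channel program viewed as a
# 22.2 program in which the 7 unused full-range channels carry silence.
# The LFE channels and the normative 22.2 channel ordering are omitted;
# the indices here are purely illustrative.

NUM_FULL_RANGE = 22  # full-range loudspeaker channels of 22.2

def embed_in_22(frame_15, used_indices):
    """Zero-fill one per-channel sample frame into 22 channels."""
    frame_22 = [0.0] * NUM_FULL_RANGE
    for sample, idx in zip(frame_15, used_indices):
        frame_22[idx] = sample
    return frame_22

# A 15-channel frame mapped onto (here) the first 15 loudspeaker slots.
frame = embed_in_22([0.5] * 15, list(range(15)))
assert frame.count(0.0) == 7  # the seven "zero" channels
```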



The items will be made available to participating MPEG companies.

Oliver Wuebbolt, Technicolor, presented



m24864

Thoughts on Draft Use Cases, Requirements and Evaluation Procedures for 3D Audio

Oliver Wuebbolt

The contribution makes four main points:

  • Added value of a possible 3D Audio standard

  • The need for high quality binaural reproduction

  • Is backwards compatibility really useful?

  • HOA may be an appropriate input format

The presenter motivated the added value point:

  • A/V alignment and consistency

    • Do we have a test method for this?

  • Flexible loudspeaker rendering

    • This seems like a substantial added value.

  • Binaural playback on headphones

    • Presenter notes that ATSC 3.0 has a strong focus on mobility.

    • This might be a “virtual” stand-in for physical loudspeaker assessment.

What is backwards compatibility?

  • Covered by Flexible/Universal rendering requirement

  • Alternatively, is it the ability to add side information to an MPEG-1 Layer II or Dolby AC-3 coded downmix?

There was considerable discussion on “what is backwards compatibility?” Juergen Herre, IAL/FhG, cited the example of MPEG Surround in DAB, in which there is a compatible coded stereo signal and an extension of side information. Werner Oomen, Philips, felt that it was very important to support a transition process from old technology to new technology.

What can Higher Order Ambisonics (HOA) bring?



  • Description of soundfield instead of loudspeaker signals

  • Well suited for recording of ambiance (e.g. by using Eigenmike or Soundfield microphones).

  • However, use of HOA changes the traditional channel-based production workflow.

  • Inherently permits rendering to arbitrary loudspeaker locations.

The presenter noted that a complete audio program might consist of a HOA background scene and several foreground audio objects.

The presenter concluded with these summary points:



  • Flexible/universal format is a big added value.

  • High quality binaural playback on headphones is a must.

  • There should not be a requirement for strict backwards compatibility to legacy playback systems.

  • 3D Audio should accept HOA based input sound scenes.

Juergen Herre, IAL/FhG, asked what HOA would bring that an object-based representation would not. The presenter noted that HOA provides superior “ambiance” as compared to “point-based” objects.

The Chair noted that the presentation gave rise to very good discussion. Further, he urged the group to be very careful if adopting a requirement (e.g. on backwards compatibility) that would prevent considering a framework that has been brought to the Audio subgroup (e.g. HOA).

Jeongil Seo, ETRI, presented

m24787

Test Procedure Proposal for Interactivity Requirement in 3D Audio

Jeongil Seo, Seungkwon Beack, Kyeongok Kang

The contribution presents an evaluation procedure for testing audio object placement and interactivity. An interactivity requirement is:


  • Modification of the sound scene by manipulating the coded representation, e.g. by control of rendered audio objects for use in personal interactive environments.

The contribution proposes an interactivity requirement and uses the SAOC framework to test and evaluate the interactivity of a system under test.
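The kind of interactivity targeted here (user-side control of rendered objects) can be illustrated with a minimal object-gain mixer. This sketch omits all actual SAOC processing (downmix, object parameters, frequency bands) and simply shows the user-facing control.

```python
# Minimal illustration of object-level interactivity: each decoded
# audio object receives a user-controlled gain before the final mix.
# This is a sketch only; it performs none of the actual SAOC decoding.

def render_objects(objects, gains):
    """Mix mono objects (equal-length sample lists) with per-object gains."""
    num_samples = len(objects[0])
    mix = [0.0] * num_samples
    for obj, gain in zip(objects, gains):
        for i, sample in enumerate(obj):
            mix[i] += gain * sample
    return mix

# Example: boost a dialogue object, attenuate an ambience object.
dialogue = [0.2, 0.4]
ambience = [1.0, 1.0]
out = render_objects([dialogue, ambience], gains=[2.0, 0.5])
```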

Juergen Herre, IAL/FhG noted that the presentation gives a very good platform for testing audio programs containing audio objects.



David Virette, Huawei, presented

m24897

Comments on 3D audio activity

David Virette, Lang Yue, Du Zhengzhong

The contribution ranged over many points:

  • Use cases, requirements and evaluation procedures

  • Core Experiment methodology to be used in 3D Audio

  • Description of the Huawei 22.2 listening room

  • Huawei feels that TV for Smartphone and Personal TV are the most important use cases.

  • Consistency with video is very important in all application scenarios.

TV for Smartphone / Personal TV

  • Small screen size, limited area for speakers, complexity limitations

  • Headphone listening is one use mode

  • Packet errors may be an important issue (e.g. IP channel in which packets are late and cannot be used)

  • Use of a docking station should be added to use case scenarios

Home theatre

  • Huawei feels that this is not sufficiently well defined in the current use case description. For example, sound source height/elevation and distance/proximity are important but not sufficiently explored.

Requirements

  • Channel-based input is the most common production format. Hence it should be mandatory.

  • Interactivity is important, e.g. for gaming applications, but less so than channel-based input.

  • Scene-based input should be an alternative format. This looks to the future of 3D Audio.

Possible Evaluation

  • Huawei agrees that 22.2 configuration must be evaluated

  • A single loudspeaker subset should be selected (e.g. 10.2 or 7.1). This could be in a “standard” configuration and an “alternate” configuration.

  • Huawei agrees that Personal TV needs its own testing platform (e.g. the NHK LAF platform).

Relevant technologies

  • Huawei feels that the 3D Audio work should NOT be used to standardize a new mono/stereo channel-based audio coder (AAC/USAC family should be used if necessary).

CE Process

  • Huawei feels that the CE methodology should be revised before issuing a CfP and should use a common reference system (encoder and decoder) that is available to all MPEG members.

Report on Huawei listening room

  • NR 20

  • 7.37 (l) x 6.61 (w) x 3.5 (h) meters

  • Top loudspeaker layer is square, middle and bottom can have arbitrary geometry, including circular.

Summary

  • Work should focus on handheld devices

  • Should permit all three input formats

  • Evaluation procedures should support a specific tablet test, e.g. using a loudspeaker array.

  • Revise CE process for the 3D Audio work

Concerning the possibility of evaluating errored channels, the Audio Chair checked with the Video Chair and reported that Video has never done errored channel tests for packet (e.g. IP) channels.

Clemens Par, Swissaudec, observed that the proposed “transcoding” for portable devices imposes a low-bitrate limitation. He urged not to confuse high-quality and low-bitrate evaluations for a single use case (e.g. hand-held portable TV).

The Chair asked the presenter about “consistency with video.” He stated that an evaluation test might be “tablet TV” with frame-based loudspeakers or spatialized headphones. He further recommended that the group use a real (e.g. cinema) A/V test signal. The Chair noted that a synthetic video with a “red dot” might be used with the new “metronome walkabout.”

Mohammed Raad, RaadTech, noted that the emerging devices (i.e. tablet TV) are mostly IP-based. Hence it might be appropriate for MPEG to address errors as on generic packet channels.

Werner Oomen, Philips, suggested that it might be best if Systems handles all aspects of channel error control. The Chair noted that if Systems does this, then it could provide the audio decoder with an access unit sequence number, such that the audio decoder knows when an access unit is missing and can then take a normative or “recommended but informative” action.
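The sequence-number idea could look like the following sketch. The concealment action (repeating the last good frame) and the helper names are assumptions for illustration only, not anything proposed in the discussion.

```python
# Sketch: detect missing access units (AUs) via a Systems-supplied
# sequence number and apply a simple concealment by repeating the last
# good frame. The repeat strategy is illustrative only; a real decoder
# could take a normative or "recommended but informative" action.

def decode_stream(access_units, decode_frame, frame_size=4):
    """access_units: list of (seq_no, payload) in arrival order.
    Returns decoded frames, inserting one concealment frame per gap."""
    out = []
    last_frame = [0.0] * frame_size  # silence until the first good AU
    expected = access_units[0][0]
    for seq, payload in access_units:
        for _ in range(seq - expected):  # one concealment per lost AU
            out.append(list(last_frame))
        last_frame = decode_frame(payload)
        out.append(last_frame)
        expected = seq + 1
    return out

# Example: AU 1 is lost, so its slot repeats the frame decoded from AU 0.
frames = decode_stream([(0, "a"), (2, "c")],
                       decode_frame=lambda p: [ord(p)] * 4)
```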

The presenter noted that an interesting use case might be that the user wishes to switch from e.g. tablet TV viewing to home cinema TV viewing. Gregory Pallone, Orange Labs, noted that it might be possible for a system to modify the “sweet spot” to adapt to the actual location of the listener(s).

Juergen Herre, IAL/FhG, asked whether MPEG should standardize rendering for a loudspeaker array (as might be associated with a tablet TV), in that this may be strictly post-processing, such as wavefield synthesis rendering.

There was some discussion on CE Methodology. The Chair recommended that experts communicate their ideas to him “over a cup of coffee” as a way to make progress on this topic.

Akio Ando, NHK, proposed that we test two categories of loudspeaker arrangements: speakers on a cube and speakers on a sphere.

Johannes Boehm, Technicolor, presented



m24888

SCENE BASED AUDIO TECHNOLOGY, AN OVERVIEW

J. Boehm

The contribution provides a tutorial on higher order ambisonics (HOA) and also a guide to the practical use of HOA.

HOA expresses an arbitrary, time-varying sound pressure field as a superposition of plane waves. When rendering, the loudspeakers are modelled as emitting plane waves. The equation expressing the pressure at the listening point can be expressed in terms of these loudspeaker signals. The loudspeaker positions can be abstracted into “discrete spherical virtual loudspeaker” signals. Rendering can be done with the constraint of preserving total energy across all loudspeakers.
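As a concrete first-order instance of this plane-wave view, a horizontal B-format signal (W, X, Y) can be decoded to a ring of loudspeakers by sampling the plane-wave expansion in each loudspeaker direction. The simple 2/N scaling below stands in for the energy-preserving constraint; a real HOA renderer would use higher orders and a proper decoding matrix.

```python
import math

# Toy first-order (horizontal B-format) decoder: each loudspeaker is
# modelled as a plane-wave source and fed a sample of the soundfield
# expansion in its direction. Illustrative only; real HOA rendering
# uses higher orders and an energy-preserving decoding matrix.

def decode_foa(w, x, y, speaker_azimuths):
    n = len(speaker_azimuths)
    return [(w + x * math.cos(az) + y * math.sin(az)) * (2.0 / n)
            for az in speaker_azimuths]

# Encode a plane wave arriving from azimuth 0 (front), then decode to a
# square layout: the front loudspeaker receives the most energy.
az_src = 0.0
w, x, y = 1.0, math.cos(az_src), math.sin(az_src)
square = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
gains = decode_foa(w, x, y, square)
```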

A soundfield microphone (e.g. the EigenMike from mh acoustics) can be used to record a scene. Subsequent signal processing is needed to produce the HOA signals. In practice, discrete audio objects might be added for a scene-plus-objects representation.

In rendering, the loudspeaker positions can be determined automatically (e.g. with test signals and a microphone). Knowing this, HOA rendering to the target loudspeakers is automatic.

The presenter noted that content producers often do not want to “give their sound objects away,” such that a solution in which objects are already incorporated into the HOA scene may be an advantage.

Gregory Pallone, Orange Labs, noted that HOA has a very nice “near-field” effect (i.e. localization of sound sources) that is independent of a specific loudspeaker configuration.



