The MPEG Audio Subgroup meeting was held during the 103rd meeting of WG11, January 21-25, 2013 in Geneva, Switzerland. The list of participants is given in Annex A.
Administrative matters
Communications from the Chair
The Chair summarised the issues raised at the Sunday evening Chair’s meeting, proposed task groups for the week, and proposed agenda items for discussion in Audio plenary.
Approval of agenda and allocation of contributions
The agenda and schedule for the meeting were discussed, edited and approved. The agenda shows the documents contributed to this meeting and presented to the Audio Subgroup, either in the task groups or in Audio plenary. The Chair brought relevant documents from Requirements and Systems to the attention of the group. The agenda was revised in the course of the week to reflect the progress of the meeting, and the final version is shown in Annex B.
Task groups were convened for the duration of the MPEG meeting, as shown in Annex C. Results of task group activities are reported below.
Approval of previous meeting report
The Chair asked for approval of the 102nd Audio Subgroup meeting report, which was registered as a contribution. The report was approved.
m27925 | 102nd MPEG Audio Report | Schuyler Quackenbush
Review of AHG reports and NB comments
There were no requests to review any of the AHG reports.
Joint meetings
Groups | What | Where | Day | Time
Audio, Sys | Raw Audio and Video | 3 | Wed | 1130-1230
Audio, Sys | Coding Independent Code Points; MP4 FF extensions for Audio | Audio | Thu | 0900-1000
Audio, 3DV | 3D Audio in Augmented Reality | 3 | Thu | 1000-1100
Audio, Req | 3D Audio Call for Proposals | Audio | Thu | 1400-1500
All | Conformance streams on SVN | 3 | Thu | 1300-1330
All | Liaison | 3 | Thu | 1330-1400
Received National Body Comments and Liaison matters
No. | Body | Title | Response
M28339 | Swiss NB | Swiss NB Comment on MPEG-H 3D Audio |
M27688 | Ecma International Secretary General | Liaison to JTC1 SC29 WG11 (Audio Group) on Creation of Task Group to standardise a Scalable Sparse Spatial Sound System (S5) |
M28007 | ITU-R SG 6/WP 6C | Liaison Statement from ITU-R SG 6/WP 6C |
M28008 | ITU-R SG 6/WP 6C | Liaison Statement from ITU-R SG 6/WP 6C |
Plenary Discussions
There were none.
Record of AhG meetings
AhG meeting on 3D Audio – Sunday 1000-1800
Test Material
The Chair presented
The contribution reports on AhG activity, but the presenter highlighted 3D Audio Material Selection, which occurred during a meeting held on January 17 and 18 at FhG-IIS, Erlangen, Germany. The report has two tables that show the test material chosen: 12 items of Channel/Object content and 12 items of HOA content.
Call for Proposals Text
The Chair presented
m27370 | Draft Call for Proposals for 3D Audio | Schuyler Quackenbush
The Chair reviewed the changes made relative to the output from the 102nd MPEG meeting, most of which were purely editorial or concerned the logistics and timeline of the Call. Decisions concerning specific changes were deferred until the other contributions commenting on the Call had been reviewed.
Takehiro Sugimoto, NHK, presented
m27804 | Comments on Call for Proposals for 3D Audio | Takehiro Sugimoto, Akio Ando
The contribution requested:
- Specific changes to the CfP with respect to shall/should for object-based inputs.
- Clarification on how systems that are only channel-based can participate in the Call.
It noted that an object-based system is interactive and flexible, but that it might be difficult to guarantee a high level of quality in all consumer decoders.
The Chair noted that proponents are free to convert objects or HOA to 22.2 rendered channel-based items. In fact, these are exactly the reference renderings that are available for each object and HOA item.
It was the consensus of the group to add a note to the Call that encoders that process only channel-based inputs can use the pre-rendered version of the Object or HOA items, effectively using the reference renderer as a front-end to their encoder.
Jan Plogsties, FhG-IIS, presented
m28127 | Comments on the Draft Call for Proposals for 3D Audio | Juergen Herre, Jan Plogsties, Johannes Hilpert, Achim Kuntz
The contribution proposes to scale the “target” bitrate for each test item for Phase 1, for example:
- If a 22.2 item uses a 1.5 Mb/s rate, a 9.0 test item shall use (9/22)*1.5 Mb/s.
- If a 22.2 item uses a 1.5 Mb/s rate, a 30-channel item shall use (30/22)*1.5 Mb/s.
There was discussion of the bitrate for each test item. Gregory Pallone, Orange Labs, proposed that the bitrate be associated with the number of rendered output channels. Thomas Sporer, FhG-IDMT, proposed to modify the formula shown above so as to never permit the item bit rate to be higher than the target 22.2 channel bit rate. Clemens Par, Swissaudec, noted that this proposal might not be appropriate for the very low bit rates in Phase 2.
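As an illustration of the arithmetic under discussion (not part of the contribution or of the Call text), the proportional scaling and the proposed upper limit can be sketched in Python. The function name, the parameter names and the `cap_at_ref` flag are hypothetical; the divisor of 22 and the 1.5 Mb/s reference rate follow the examples given in the contribution.

```python
def scaled_bitrate(num_channels, ref_rate_mbps=1.5, ref_channels=22, cap_at_ref=False):
    """Scale the 22.2 target bitrate in proportion to an item's channel count.

    num_channels  -- channel count of the test item (e.g. 9 for a 9.0 item)
    ref_rate_mbps -- target rate of a 22.2 item, in Mb/s
    ref_channels  -- divisor used in the contribution's examples (22)
    cap_at_ref    -- if True, never exceed the 22.2 target rate
                     (the modification proposed in the discussion)
    """
    if num_channels <= 0:
        raise ValueError("num_channels must be positive")
    rate = (num_channels / ref_channels) * ref_rate_mbps
    if cap_at_ref:
        rate = min(rate, ref_rate_mbps)
    return rate


if __name__ == "__main__":
    for n in (9, 22, 30):
        uncapped = scaled_bitrate(n)
        capped = scaled_bitrate(n, cap_at_ref=True)
        print(f"{n} channels: {uncapped:.3f} Mb/s uncapped, {capped:.3f} Mb/s capped")
```

With the cap enabled, only items with more than 22 channels are affected; the 9.0 item keeps its proportionally reduced rate.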
Test 1.1: It was the consensus of the group to use a loudspeaker setup that matches that of the test item.
Test 1.2: The presentation noted that off-sweet-spot evaluation is able to discern differences in rendering of object-based and HOA content. Hence off-sweet-spot listening should be done only with objects. Thomas Sporer, FhG-IDMT, noted that a renderer might also provide “sweet-spot” improvement to channel-based content. Johannes Boehm, Technicolor, noted that at low bit rates a channel-based system might down-mix to fewer channels and then up-mix in the renderer, such that the renderer plays an important role. It was the consensus of the group to use both object-based and channel-based items in Test 1.2 (off-sweet-spot).
Test 1.4: The presentation noted that the method for randomization is not clear and must be clarified. The Chair and Gregory Pallone, Orange Labs, noted that they each have a MATLAB script that does such a randomization. It was agreed that these scripts should be reviewed, adapted if need be, and incorporated into an Annex of the Call.
The presentation supported the requirements that end-to-end codec delay shall not be more than 1 second and that the bitstream shall have random access points that can occur at least once per second. Werner Oomen, Philips, questioned whether 1 second random access is appropriate. Many experts in the group felt that the issue of random access could be addressed in a discussion based on the proponent's technical description. It was the consensus of the group that a submission shall be able to support random access, but that random access points do not need to be present in the submitted bitstreams.
Werner Oomen, Philips, presented
m28128 | Comments to MPEG 3D Audio CfP | Werner Oomen, Aki Harma
The contribution proposes that the RM0 technology be evaluated and selected at the 105th meeting. The Call text has been edited to clarify that RM0 technology will be evaluated and selected at the 105th meeting.
Philips conducted a subjective test to evaluate the difference between:
- 22.2 reference
- 5.1 channel automated downmix
Specific conclusions of the test:
- 22.2 did not offer a clear advantage over 5.1
- 22.2 did have a significantly greater “spaciousness” as compared to 5.1
- Head movements and walking around to different locations resulted in major differences in perception
The test data is available in the contribution.
Overall conclusions of the test
The contribution proposes that the CE process start at the end of Phase 1, as long as the CEs do not conflict with Phase 2 work. It was the consensus of the group that the CfP be edited to indicate that the CE process will start at the end of Phase 1, but that the CEs will not develop Phase 2 functionality.
Headphone listening is a very important use case and should be tested.
Concerning informative versus normative in decoder specification, the contribution proposed that:
- There should be normative decoder elements that guarantee a minimum level of quality.
- There should be the option of using alternate processing (informative) while still being conformant.
It was the consensus of the group that there be normative interfaces (e.g. to read in an HRTF) and normative processing (e.g. default HRTF convolution). Conformance is determined using the default processing, but conforming products are allowed to use alternate processing.
Clemens Par, Swissaudec, presented
m28198 | Comments for m27370 ("Draft Call for Proposals for 3D Audio") | Clemens Par
The contribution makes the following proposals with regard to quality assurance and present use case scenarios:
- Phase 2 Evaluation Procedures should be firmly established.
- In Test 2.2 the “at sweet spot” assessment should be deleted.
- Test 2.4 should only assess 3D formats that are currently in the marketplace, e.g. 10.1, 8.1, and 7.1. In particular, “Flexible Loudspeaker Placement” should not be assessed.
CfP Evaluation
Oliver Wuebbolt, Technicolor, presented
m28117 | Thoughts on Evaluation of 3D Audio | Oliver Wuebbolt
The contribution estimates the total test time and shows that it is very large (e.g. 6 weeks). It proposes several modifications to decrease workload:
Test 1.1
- Omit the testing at 1.5 Mb/s since all systems might score “Excellent”
Test 1.2
- Decrease the number of off-sweet-spot positions assessed
- Test with two or more listeners in parallel to decrease test time.
Test 1.3
Test 1.4 – this is a very important functionality.
- Fewer loudspeaker configurations
- However, the contribution proposes to add another bitrate (e.g. high and low)
Test 1.1 – All agreed that we don’t want to lose the marketing message of “high quality at 1.5 Mb/s”.
Test 1.2 – Using more than one subject at a time can result in one subject’s body “shading” the soundfield experienced by another. However, listener positions could be divided across listeners or test sites.
Test 1.3 – Testing could be done in more than one sound booth, thus exploiting parallelism. There was consensus in the group to use only two bitrates for this test, e.g. lowest and highest.
Test 1.4 – There was consensus in the group to test the following loudspeaker configurations:
- 10.1
- 8.1 (could drop this if need be)
- 5.1
- Random selection of 10 out of 22.2
- Random selection of 5 out of 22.2
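The two random selections in this list could be drawn reproducibly along the lines of the following Python sketch. It is not the MATLAB script mentioned in the discussion of m28127, and it does not reflect any agreed procedure: the placeholder loudspeaker labels, the function name, the seed values and the assumption that the draw is made from the 22 full-range loudspeakers (leaving out the two LFE channels) are all illustrative.

```python
import random

# Hypothetical placeholder labels for the 22 full-range loudspeakers of a
# 22.2 layout; a real script would use the layout's actual channel labels.
FULL_RANGE_SPEAKERS = [f"spk{i:02d}" for i in range(1, 23)]


def select_speakers(k, seed, speakers=FULL_RANGE_SPEAKERS):
    """Return k distinct loudspeakers chosen at random, in layout order.

    A fixed seed makes the draw repeatable, so that every test site can
    reproduce the same subset.
    """
    if not 0 < k <= len(speakers):
        raise ValueError("k must be between 1 and the number of loudspeakers")
    rng = random.Random(seed)          # private generator, seeded explicitly
    chosen = set(rng.sample(speakers, k))
    return [s for s in speakers if s in chosen]


if __name__ == "__main__":
    print("10 of 22:", select_speakers(10, seed=2013))
    print(" 5 of 22:", select_speakers(5, seed=2013))
```

Using a private `random.Random` instance rather than the module-level generator keeps the selection independent of any other random draws made in the same program.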
Andreas Silzle, FhG-IIS, noted that the first test should not be at the 1.5 Mb/s rate but rather at 256 kb/s, so that subjects learn the spatial artefacts.
Werner Oomen, Philips, noted that if test effort budget permits, he recommends adding off-sweet-spot testing.