2.8 Plenary Discussion
There was none.
3 Record of AhG meetings
3.1 AhG Meeting on USAC -- Sunday 0900-1800
JungHoe Kim, Samsung, presented
m17227
|
Proposed change of the IPD sign in USAC
|
JungHoe Kim
Eunmi Oh
Werner Oomen
Heiko Purnhagen
Julien Robilliard
|
This contribution is a joint proposal that requests that the sign of IPD in phase coding of MPEG Surround have the same sense as in HE-AAC V2. New reference bitstreams reflecting this change can be made available shortly after the end of the 91st meeting, as will modified text in the next USAC WD and modified reference encoder and decoder source code.
It was the consensus of the AhG to accept the proposal.
JungHoe Kim, Samsung, presented
m17228
|
Proposed informative encoder description of phase coding in USAC
|
JungHoe Kim
Eunmi Oh
|
The contribution shows how one could construct an encoder using phase coding of the IPD parameter. Prior to the next meeting Samsung will provide reference source code for the MPEG Reference Encoder.
It was the consensus of the AhG to integrate the contribution into an informative annex of the next USAC WD.
Hyunkook Lee, LG, presented
m17292
|
Report on the USAC reference encoder
|
Hyunkook Lee
Sungyong Yoon
|
This contribution proposes several bugfixes for the USAC encoder:
- ICCQuant in sac_enc.c, to correct the limit of a for loop.
- OttBoxCalcIPD in sac_ipd.c, to correct the sense of the IPD sign (this is corrected by m17228).
- OttBoxCalcIPD in sac_ipd.c, to correct the order of the arctan2 function.
- OttBox in sac_enc.c, to correct the L/R phase measurement.
Samsung and FhG are willing to review and improve these encoder-related modules.
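To illustrate why the arctan2 order and the IPD sign matter, the following is a minimal sketch of an inter-channel phase difference computed from complex subband samples. It is not the code in sac_ipd.c: the function name, the use of the left-times-conjugate-right cross-correlation, and the resulting sign convention (positive when the left channel leads) are assumptions made only for this example.

```python
import cmath
import math

def inter_channel_phase_difference(left, right):
    """Return the phase (radians) of the summed cross-correlation L * conj(R).

    `left` and `right` are complex subband samples for one parameter band.
    The sign convention (positive when left leads right) is an assumption
    of this sketch, not taken from the USAC reference software.
    """
    cross = sum(l * r.conjugate() for l, r in zip(left, right))
    # atan2 takes the imaginary part first, then the real part.
    return math.atan2(cross.imag, cross.real)

# Hypothetical test signal: left leads right by 0.5 rad in every sample.
right = [cmath.rect(1.0, 0.1 * i) for i in range(4)]
left = [cmath.rect(1.0, 0.1 * i + 0.5) for i in range(4)]

ipd = inter_channel_phase_difference(left, right)
ipd_swapped_channels = inter_channel_phase_difference(right, left)
```

Swapping the two channels flips the sign of the result, and swapping the two arguments of `atan2` would return the complementary angle (pi/2 minus the intended value) instead of the phase difference, so either mistake leaves the encoder and decoder disagreeing about the transmitted phase.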
It was the consensus of the AhG to draft a workplan to
- Incorporate these updates
- Make further improvements
- Check performance via a listening test
Werner Oomen, Philips, presented
m17278
|
Corrections to Unified Stereo Coding
|
Erik Schuijers
Heiko Purnhagen
Pontus Carlsson
Werner Oomen
Julien Robilliard
|
The contribution proposes a number of corrections:
- Align the MPEG Surround decorrelation/residual sign to be as in HE-AAC V2 parametric stereo. This is exactly what has been requested by Samsung in m17227. Although the result of adding in the residual signal will change (due to the sign change), this is not expected to have any quality impact.
- Correct errors in the Unified Stereo Coding encoder description in the WD informative annex.
- Remove the gain clipping constant in the MPEG Encoder and Decoder Reference Software. Philips has conducted an informal listening test that confirms that the change does not impact audio quality.
New text, reference code, bitstreams and decoded waveforms will be available prior to the next MPEG meeting.
JungHoe Kim, Samsung, requested that there be a listening test to check the impact of the clipping change. The presenter proposed that an informal listening session could be conducted during the MPEG week. Heiko Purnhagen, Dolby, did not expect the sign change of the residual to have an impact on audio quality. Mohamad Raad, RaadTech, asked about the magnitude of the errors that the gain clipping in the current code imposes. Heiko Purnhagen, Dolby, noted that it is merely an unnecessary limitation in the USAC decoder. There is no such decoder gain limitation in e.g. HE-AAC parametric stereo. The Chair noted that the technology proponents should of course check that legal bitstreams (i.e. conformant bitstreams) never cause a normative decoder to “blow up,” e.g. to divide by zero in the course of calculating the output waveforms.
It was the consensus of the AhG to accept the first two corrections into USAC WD6. The proposed change to gain clipping will continue to be studied and will be discussed at the next meeting.
Heiko Purnhagen, Dolby, presented
m17273
|
Educational information on Unified Stereo encoding
|
Heiko Purnhagen
Pontus Carlsson
Werner Oomen
Erik Schuijers
|
The contribution presents guidance for constructing a USAC encoder that supports unified stereo coding. The Chair expressed a great interest in getting some baseline “plumbing” as reference encoder source code, and noted that the educational text assumes that certain modules are present, which may not be the case in the MPEG Reference Encoder. It was agreed to draft a workplan to identify missing software modules and to identify resources to provide such code. The Chair requested that experts from Dolby assist in identifying missing encoder modules.
Experts from Dolby, Philips, Samsung and FhG will help draft the workplan that identifies modules that possibly must be incorporated into WD4 to support the unified stereo CE. Pierrick Philippe, Orange Labs, requested more information concerning the derivation of alpha that is used in calculating the downmix and residual. An example method to calculate alpha was requested. There was considerable additional discussion on other encoder options that might be realized and which are not covered in the educational text. The CE proponents will work with interested experts to modify the educational text such that it addresses these concerns.
It was the consensus of the AhG to incorporate this document into the WD6 informative annex and additionally to draft a workplan on enhancing the MPEG Reference Encoder with the aim that it contains an appropriate framework to support the unified stereo tool.
JungHoe Kim, Samsung, presented
m17230
|
Comments on reference software of MPEG Surround and reference software using MPEG Surround tool
|
JungHoe Kim
Miyoung Kim
Eunmi Oh
|
This contribution raises the issue as to whether the current MPEG Reference Encoder is sufficient to support the RM0 and additional CE functionality. It specifically notes modules in MPEG Surround that would be needed to enable full functionality.
The Chair encouraged interested parties to consider the issues raised. This will be discussed again later in the MPEG week. Based on the consensus of the group, items may be added to a workplan with the goal of enhancing the functionality of the MPEG Reference Encoder.
Kristofer Kjörling, Dolby, presented
m17165
|
Report on cross-check listening test for the USAC CE on Unification of USAC Windowing and Frame Transitions
|
Kristofer Kjörling
Lars Villemoes
|
The contribution reports results of listening tests on the USAC CE technology operating at 12 kb/s and 24 kb/s for mono signals. For both tests, the absolute scores did not reveal any differences. Differential analysis for 12 kb/s displayed differences
- in the mean over all 15 items, where both A and C are better than B;
- in four instances where A is better than B (“Wedding speech”, “twinkle ff51”, “HarryPotter” and “phi7”);
- in two instances where C is better than B (“Wedding speech” and “HarryPotter”);
- in one instance where C is worse than B (“te1 mg54 speech”).
Differential analysis for 24 kb/s displayed differences
- in the mean over all 15 items, where A is better than B;
- in three instances where A is better than B (“louis raquin 15”, “te1 mg54 speech” and “HarryPotter”);
- in one instance where C is better than B (“louis raquin 15”);
- in one instance where C is worse than B (“phi7”).
Jeongook Song, LG, presented
m17209
|
YSU-LG Listening test for CE on Unification of USAC Windowing and Frame Transitions
|
Jeongook Song
Chang-Heon Lee
Henney Oh
|
The contribution reports results of a listening test conducted at Yonsei University on the USAC CE technology operating at 12 kb/s for mono signals. Systems under test were WD4, WD4+FAC and WD4+FAC+FDNS. Absolute score analysis did not reveal any differences and differential analysis was not presented.
Taejin Lee, ETRI, presented
m17216
|
ETRI Listening Test Results for CE on Unification of USAC Windowing and Frame Transitions
|
Taejin Lee
Seungkwon Beack
Minje Kim
Kyeongok Kang
|
The contribution reports results of listening tests on the USAC CE technology operating at 12 kb/s and 24 kb/s for mono signals. Systems under test were WD4, WD4+FAC and WD4+FAC+FDNS. For both tests, the absolute scores did not reveal any differences.
JungHoe Kim, Samsung, presented
m17229
|
Crosscheck report on Unification of USAC Windowing and Frame Transitions
|
JungHoe Kim
Eunmi Oh
|
The contribution reports results of a listening test on the USAC CE technology operating at 12 kb/s for mono signals. Systems under test were WD4, WD4+FAC and WD4+FAC+FDNS. For this test, the absolute scores did not reveal any differences.
Max Neuendorf, FhG, presented
m17167
|
Completion of Core Experiment on unification of USAC Windowing and Frame Transitions
|
Max Neuendorf
Bruno Bessette
Roch Lefebvre
Philippe Gournay
Stefan Bayer
Jeremie Lecomte
|
This contribution reviews the technical details of the USAC Windowing and Frame Transition CE. The presentation gave an overview of the technology and refers the interested reader to the contribution text. It also gives information on complexity and reports on a listening test at FhG/VoiceAge. Finally, listening test results for the pooling of all data are presented.
The following shortcomings of WD4 are noted:
- Frame grid and window grid are not aligned.
- Not every transform is critically sampled since some transitions require samples to be discarded.
- Many window shapes are sub-optimal.
- Some transforms are not a power of 2, entailing additional programming complexity.
It then presents the transitions using the CE technology:
- Frame grid and window grid are aligned.
- All transforms are critically sampled.
- Nearly all window shapes are optimal.
- All transforms are a power of 2.
The keys to realizing these simplifications are:
- Exchange of LP weighting and IMDCT in the TCX branch of the decoder.
- Application of the LP weighting in the MDCT coefficient domain using the Frequency Domain Noise Shaping (FDNS) tool.
- Use of the Forward Aliasing Cancellation (FAC) tool to cancel time domain aliasing in the IMDCT when the adjacent frame is LP-ACELP.
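The time domain aliasing that FAC has to deal with can be illustrated with a small, generic MDCT example. The sketch below is not the USAC FAC tool: it uses a textbook MDCT/IMDCT pair with a sine window, and the function names, the block size and the test signal are assumptions chosen only for illustration. It shows that samples covered by two overlapping MDCT frames are reconstructed exactly, while samples covered by only one frame (the situation at a boundary with a frame that is not MDCT coded) keep an uncancelled aliasing term.

```python
import math

def sine_window(N):
    # Length-2N sine window; satisfies w[n]^2 + w[n+N]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, N):
    # 2N input samples -> N coefficients (critically sampled).
    n0 = 0.5 + N / 2
    return [sum(block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, N):
    # N coefficients -> 2N aliased time samples (2/N scaling for windowed use).
    n0 = 0.5 + N / 2
    return [(2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]

def overlap_add_reconstruct(x, N):
    # Window, transform, inverse transform, window again, overlap-add with hop N.
    w = sine_window(N)
    out = [0.0] * len(x)
    for start in range(0, len(x) - 2 * N + 1, N):
        block = [w[n] * x[start + n] for n in range(2 * N)]
        y = imdct(mdct(block, N), N)
        for n in range(2 * N):
            out[start + n] += w[n] * y[n]
    return out

N = 8  # hypothetical block size for the demonstration
x = [math.sin(0.3 * t) + 0.1 * t for t in range(4 * N)]
y = overlap_add_reconstruct(x, N)

# Samples N..3N-1 are covered by two frames: aliasing cancels.
interior_error = max(abs(y[t] - x[t]) for t in range(N, 3 * N))
# Samples 0..N-1 are covered by one frame only: aliasing remains.
boundary_error = max(abs(y[t] - x[t]) for t in range(0, N))
```

In this sketch `interior_error` is at the level of floating-point rounding, while `boundary_error` is clearly non-zero. A tool such as FAC has to supply the missing cancellation term when the neighbouring frame cannot provide it.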
Mohamad Raad, RaadTech, questioned whether the FDNS tool uses the appropriate interpolation. He noted that the goal is to match the actual signal target (short-time spectrum) rather than have an LPC interpolation scheme. The Chair asked Mohamad Raad and interested proponent experts to convene a break-out to discuss this in greater detail and report back to the group.
The contribution presents the instructions that would be needed on an idealized DSP to implement the modules that comprise the CE proposal. The result is that the complexity of WD4 vs WD4+CE is approximately the same. Pierrick Philippe requested time for additional study of what is presented in the contribution, and will report back to the group any further observations he may have. The presenter noted that ROM and RAM requirements for WD4 vs WD4+CE are not significantly different.
The presenter showed listening test results for the pooled listening test data. For 12 kb/s mono there are 54 listeners at 6 test sites. When doing a differential analysis, the mean score for WD4+CE is better than that for WD4 at the 95% level of significance; the scores for 7 items are better at the 95% level of significance and none are worse. Note that here CE is FDNS+FAC.
For 24 kb/s mono there are 36 listeners at 4 test sites. When doing a differential analysis, the mean score for WD4+CE is better than that for WD4 at the 95% level of significance; the scores for 7 items are better at the 95% level of significance and none are worse.
When doing absolute analysis, the pooled data show that the mean score of WD4+CE is not different from that of WD4 at the 95% level of significance.
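The contrast between the two analyses can be sketched as follows. Absolute analysis compares the confidence intervals of each system's mean score, whereas differential analysis forms per-listener score differences and tests whether their mean differs from zero, which removes listener-to-listener offsets. The scores below are invented purely for illustration and are not data from any contribution; the function names are hypothetical, and the normal-approximation critical value is an assumption of this sketch (a Student t value is more usual for small samples).

```python
import math
from statistics import NormalDist, mean, stdev

# Two-sided 95% critical value, normal approximation (assumption of this sketch).
Z95 = NormalDist().inv_cdf(0.975)

def confidence_interval(values, crit=Z95):
    # 95% confidence interval of the mean of `values`.
    m = mean(values)
    half = crit * stdev(values) / math.sqrt(len(values))
    return (m - half, m + half)

def intervals_overlap(ci_a, ci_b):
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

def differential_interval(scores_test, scores_ref, crit=Z95):
    # Confidence interval of the mean paired difference (test minus reference).
    diffs = [t - r for t, r in zip(scores_test, scores_ref)]
    return confidence_interval(diffs, crit)

# Hypothetical per-listener scores, NOT taken from the listening tests above.
ref_scores = [40, 55, 70, 62, 48, 81, 66, 59]
test_scores = [43, 59, 72, 67, 51, 85, 68, 62]

absolute_differs = not intervals_overlap(confidence_interval(ref_scores),
                                         confidence_interval(test_scores))
diff_low, diff_high = differential_interval(test_scores, ref_scores)
differential_differs = diff_low > 0 or diff_high < 0
```

With these made-up numbers the absolute intervals overlap because listeners use very different parts of the scale, yet every listener scores the test system slightly higher, so the differential interval excludes zero.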
Mohamad Raad, RaadTech, requested that the distribution of listener scores for each test site be presented. The Chair requested that differential analysis for individual test sites be presented. These data, plus comments on complexity and any other views on statistics or the decision process on this CE, will be presented on Tuesday afternoon.
Taejin Lee, ETRI, presented
m17215
|
Report on TCX Improvements
|
Taejin Lee
Seungkwon Beack
Minje Kim
Kyeongok Kang
|
The contribution reviews the CE technology, which is a flexible TCX overlap that depends on the next frame’s TCX mode. A new technology proposed in this contribution is to have the transition length be responsive to transients in the overlap region, i.e. shorter overlap when transients are present. It contends that this increases the quality of the TCX mode for musical signals. Listening test results for TCX with transient protection (tr) were presented, showing:
- 12 kb/s mono: one better, none worse, global mean not different
- 20 kb/s mono: 3 better, one worse, global mean not different
The technology requires one additional encoding stage without LPC processing to determine the next frame’s TCX window type.
The Audio Subgroup will draft a workplan to coordinate progress in this CE.