International organisation for standardisation organisation internationale de normalisation


1 Opening Audio Plenary

The MPEG Audio Subgroup meeting was held during the 94th meeting of WG11, October 11-15, 2010, Guangzhou, China. The list of participants is given in 1.

2 Administrative matters

2.1 Communications from the Chair

The Chair summarised the issues raised at the Sunday evening Chair’s meeting, proposed task groups for the week, and proposed agenda items for discussion in Audio plenary.

2.2 Approval of agenda and allocation of contributions

The agenda and schedule for the meeting were discussed, edited and approved. The agenda shows the documents contributed to this meeting and presented to the Audio Subgroup, either in the task groups or in Audio plenary. The Chair brought relevant documents from Requirements and Systems to the attention of the group. The agenda was revised in the course of the week to reflect the progress of the meeting, and the final version is shown in 2.

2.3 Creation of Task Groups

Task groups were convened for the duration of the MPEG meeting, as shown in . Results of task group activities are reported below.

2.4 Approval of previous meeting report

The Chair asked for approval of the 93rd Audio Subgroup meeting report, which was registered as a contribution. After some discussion, the wording concerning the discussion in the Audio closing plenary was modified, and the revised report was approved.

2.5 Review of AHG reports

There were no requests to review any of the AHG reports.

2.6 Joint meetings

Who	What	Where	When
Req, Audio	Audio for HEVC	Audio	Wed, 1400-1500

2.7 Received National Body Comments and Liaison matters

Num.	Source	Respond
M17985	ITU-T SG 16 to SC 29/WG 11 Approval of a New Question on Telepresence systems	S. Quackenbush

2.8 Plenary Discussion

3 Record of AhG meetings

3.1 AhG Meeting on USAC -- Sunday 1000-1800

USAC Text and Reference Software corrections

Max Neuendorf, FhG, presented

m18456

Corrections to Reference Software and CD of USAC

Max Neuendorf

This contribution proposes changes to the USAC CD text and Reference Software. The proposed changes fall into the following three categories:

Changes to the CD text (which are only editorial)

Description of spectral noiseless coding
- Consistent use of variables
- Clarification of arithmetic coding context
- Correction to numerical constants
Description of overlap-add for synthesis of output from adjacent coded frames
Clarifications to MDCT-based TCX description
Decoding coefficients for complex stereo prediction in MDCT
Combination of SBR and MPS 212
Re-organization of CD text to conform to ISO directive

Changes to the Reference Software

Recalculation of pitch gain for use in the bass postfilter in TCX to ACELP transitions
Corrections to handling higher bands in 32-band QMF analysis filterbank
- Should not initialize high QMF bands to zero in unified stereo coding

Changes to both CD text and Reference Software

Simplification of AVQ bit stream syntax. This re-organizes the bit stream but does not affect the decoded waveform
- Make all integer serialization msb first
- Remove interleaving of AVQ coefficients

The AhG recommends to

Adopt the proposed changes to the CD text (which are only editorial) into a “Study on USAC CD” output document.
Make the proposed changes to the USAC reference software
Adopt the proposed changes to the CD text and make the associated changes to the USAC reference software

Schuyler Quackenbush, ARL, presented

m18077

Draft USAC CE Status and Workplan

S. Quackenbush

The presenter asked USAC CE proponents to please check the CE status tables and to bring them up to date so that producing the output document on Friday is that much easier.

Improved bass post-filter

Philippe Gournay, VoiceAge, presented

m18435

VoiceAge listening test results for the enhanced bass-postfilter CE

Philippe Gournay, Roch Lefebvre

The contribution presented 16 kb/s listening test results for this CE. Systems under test are RM7 and RM7+CE. Seven items containing singing voice with background music were used in addition to the CfP test items. There was no difference in absolute scores. When considering differential scores, 4 of the 19 items are better and the mean is better at the 95% level of significance.
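The distinction between absolute and differential scores recurs throughout these reports. The sketch below shows one common way such a result can arise, assuming that "no difference" in absolute scores means overlapping 95% confidence intervals and that differential scores are per-listener differences between the two systems. The scores are made up for illustration and the function names are hypothetical; this is not the analysis script used by any test site.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_CRIT_95 = {4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365,
             8: 2.306, 9: 2.262, 10: 2.228, 11: 2.201}


def mean_ci95(values):
    """Return (mean, lower, upper) of a 95% confidence interval."""
    n = len(values)
    mean = statistics.mean(values)
    half = T_CRIT_95[n - 1] * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half, mean + half


def intervals_overlap(a, b):
    """True if the confidence intervals of a and b overlap."""
    return a[1] <= b[2] and b[1] <= a[2]


# Made-up scores for one item; position i is the same listener in both lists.
rm7 = [52, 61, 45, 70, 58, 66, 49, 73]
rm7_ce = [55, 63, 49, 72, 62, 68, 52, 77]

abs_ref = mean_ci95(rm7)
abs_ce = mean_ci95(rm7_ce)
diff = mean_ci95([c - r for c, r in zip(rm7_ce, rm7)])

absolute_differs = not intervals_overlap(abs_ref, abs_ce)  # False here
differential_better = diff[1] > 0                          # True here
```

Listeners use the scale differently, which widens the absolute intervals. Taking the difference per listener removes that spread, so a small but consistent improvement can be significant in differential scores while the absolute intervals still overlap.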

Max Neuendorf, FhG, presented

m18451

FhG listening test report on CE on improving the USAC bass post-filter

Guillaume Fuchs, Max Neuendorf

The contribution presented 16 kb/s mono listening test results for this CE. There is no difference for absolute scores. When considering differential scores, 2 items are better, 1 item worse at the 95% level of significance.

The contribution also reports that the CE software, as integrated into the decoder, was run and decoded the listening test bit streams to produce exactly the listening test decoded waveforms.

David Virette, Huawei, presented

m18467

Report on cross-check listening test for the CE on improved Bass-post filter for USAC

David Virette

The contribution presented 16 kb/s mono listening test results for this CE. There is no difference for absolute scores. When considering differential scores, 1 item (es01) is worse at the 95% level of significance.

Kristofer Kjörling, Dolby, presented

m18379

Finalization of CE on an improved bass- post filter operation for the ACELP of USAC

Barbara Resch, Leif Sehlström, Heiko Purnhagen, Lars Villemoes, Kristofer Kjörling, Bruno Bessette, Philippe Gournay

The contribution presented 16 kb/s mono listening test results for this CE. There is no difference for absolute scores. When considering differential scores, 4 items are better, the mean is better and none are worse at the 95% level of significance.

The contribution also presents an overview of the technology and a summary of the cross-check results.

It notes that the current bass post-filter, used in ACELP coding mode, helps enhance the vocal signal. However, when there is background music with strong low-frequency harmonics and the coder switches between ACELP, TCX and FD modes, the suppression of the musical low harmonics comes and goes and is quite audible.

When considering individual sites, for 2 items and for the mean, 2 of 4 sites agree on improvement.

When all data is pooled, there are a total of 32 listeners. When considering differential scores, 4 items are better, the mean is better and none are worse at the 95% level of significance.

The AhG recommends adopting the CE technology into the “Study on USAC CD” output document.

Complexity reduction for time warping

Takeshi Norimatsu, Panasonic, presented

m18360

Panasonic cross check report on complexity reduction for time warping in USAC

Takeshi Norimatsu, Tomokazu Ishikawa, Haishan Zhong, Dan Zhao, Kok Seng Chong

The contribution reports results for a 64 kb/s stereo listening test. It showed no significant differences in either absolute or differential scores, for any item or for the mean.

Heiko Purnhagen, Dolby, presented

m18375

Dolby listening test results for CE on complexity reduction for time warping in USAC

Heiko Purnhagen, Kristofer Kjörling

The contribution reports results for a 64 kb/s stereo listening test. It showed no significant differences for absolute scores. For differential scores, 1 item was better.

Markus Multrus, FhG, presented

m18452

Completion of the Core Experiment on Reducing the Complexity of the USAC Time Warping

Stefan Bayer

The contribution gives an overview of the CE technology and a summary of the listening test results.

The technology uses time warping to reduce the pitch interval variations over a frame such that the pitch epochs have a more nearly harmonic frequency representation and thus can be more efficiently coded by the FD coder mode.

The following table presents computational complexity. All numbers are WMOPS.

	RM8	RM8+CE
TW tool only	19.6	9.5
Full Decoder	33.4	23.3
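The complexity figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the four WMOPS values reported in the contribution; the dictionary layout and the `savings` helper are illustrative.

```python
# WMOPS figures as reported for RM8 and RM8+CE.
wmops = {
    "TW tool only": {"RM8": 19.6, "RM8+CE": 9.5},
    "Full Decoder": {"RM8": 33.4, "RM8+CE": 23.3},
}


def savings(row):
    """Return (WMOPS saved, percentage saved relative to RM8)."""
    saved = row["RM8"] - row["RM8+CE"]
    return round(saved, 1), round(100.0 * saved / row["RM8"], 1)


tool_saved, tool_pct = savings(wmops["TW tool only"])
decoder_saved, decoder_pct = savings(wmops["Full Decoder"])
```

Both rows show the same saving of 10.1 WMOPS, which is consistent with the CE changing only the time warping tool: about 51.5% of the tool's complexity and about 30.2% of the full decoder's.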

The contribution reports results for a 64 kb/s stereo listening test conducted at FhG. It showed no significant differences in either absolute or differential scores, for any item or for the mean.

When listening test data are pooled over all test sites (25 listeners), it showed no significant differences in either absolute or differential scores, for any item or for the mean.

Heiko Purnhagen, Dolby, reported that he decoded the listening test bitstreams (which were the RM8 bitstreams) using the CE proponent decoder and produced the waveforms that were used in the listening test. He then decoded the bitstreams using the RM8 decoder and produced the reference RM8 decoded waveforms.

The AhG recommends adopting the CE technology into the “Study on USAC CD” output document.

Improved SBR in USAC

Toru Chinen, Sony, presented

m18398

Sony listening test report on improved SBR in USAC

Toru Chinen, Masayuki Nishiguchi

The contribution reports the results of a 12 kb/s mono listening test. It showed no significant differences in either absolute or differential scores, for any item or for the mean.

Max Neuendorf, FhG, presented

m18431

FhG Listening Test Report – improved SBR

Stephan Wilde, Max Neuendorf

The contribution reports the results of a 12 kb/s mono listening test. It showed no significant differences in absolute scores. For differential scores, 2 items were better at the 95% level of significance.

The contribution also reports on software verification: It confirms that the listening test bitstreams did decode to the listening test waveforms. Furthermore, it confirms that WD7 bitstreams decoded to the WD7 reference waveforms.

Kristofer Kjörling, Dolby, presented

m18378

Finalization of CE on improved SBR

Kristofer Kjörling, Leif Sehlström

The contribution gives an overview of the CE technology, presents new listening test results and presents all listening test results as pooled data.

In RM7, SBR uses much of the machinery present in MPEG-4 SBR. One shortcoming of SBR is that the copy-up envelope can have considerable discontinuities (as shown, for example, in es01), which might be so large that the adjustment limiters prevent the target envelope from being achieved. The CE technology is a “pre-adjustment” gain stage which ensures that the high-band envelope adjuster is able to make the adjustments needed in the high band.
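The interaction between a gain limiter and a pre-adjustment stage can be illustrated with a toy model. Everything in the sketch below is an assumption made for illustration: envelopes are per-band levels in dB, the limiter clamps each band's gain to a fixed range around the mean gain, and the pre-adjustment removes a straight-line trend from the copied-up envelope. It is not the SBR limiter or the CE algorithm as specified.

```python
import statistics


def limited_gains(source_db, target_db, limit_db):
    """Per-band gain in dB, clamped to +/- limit_db around the mean gain."""
    wanted = [t - s for s, t in zip(source_db, target_db)]
    centre = statistics.mean(wanted)
    return [max(centre - limit_db, min(centre + limit_db, g)) for g in wanted]


def max_error_db(source_db, target_db, limit_db):
    """Largest remaining deviation from the target after limited adjustment."""
    gains = limited_gains(source_db, target_db, limit_db)
    return max(abs(s + g - t) for s, g, t in zip(source_db, gains, target_db))


def pre_flatten(source_db):
    """Hypothetical pre-adjustment: remove a least-squares straight-line trend."""
    xs = range(len(source_db))
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(source_db)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, source_db))
    den = sum((x - x_mean) ** 2 for x in xs)
    slope = num / den
    return [y - slope * (x - x_mean) for x, y in zip(xs, source_db)]


source = [0.0, -6.0, -12.0, -18.0, -24.0, -30.0]  # steep copy-up envelope (toy)
target = [-10.0] * 6                              # flat target envelope (toy)
limit = 6.0                                       # toy limiter range in dB

error_without = max_error_db(source, target, limit)
error_with = max_error_db(pre_flatten(source), target, limit)
```

In this toy case the limited adjuster alone misses the target by up to 9 dB, while the same adjuster reaches the target once the steep trend has been removed beforehand.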

Computational complexity is quite low, approximately 0.1 WMOPS.

A control bit in the SBR header signals that the tool should be used from that point forward. Toru Chinen, Sony, noted that ARIB in Japan is currently using HE-AAC and sends an SBR header every 500 ms. Kristofer Kjörling, Dolby, noted that the SBR encoder would usually send the SBR header whenever it was advantageous to change SBR configuration (which is specified in the SBR header).

For the pooled data, two items are better at the 95% level of significance.

The AhG recommends adopting the CE technology into the “Study on USAC CD” output document.

Harmonic transposer

Kihyun Choo, Samsung, presented

m18372

Crosscheck report on harmonic transposer CEs

Kihyun Choo, Miyoung Kim, Eunmi Oh

The contribution reports the results of a 16 kb/s mono listening test that compares all configurations of the two CEs. There were no differences with respect to the absolute scores.

With respect to differential scores, it reports the following:

QMF – FFT: 1 better, 1 worse.
QMFxp – QMF: 1 worse.
FFTxp – FFT: 1 worse

Kimitaka Tsutsumi, NTT DOCOMO, presented

m18466

NTT DOCOMO Cross-check Report on Improved Harmonic Transposer in USAC

Kimitaka Tsutsumi, Kei Kikuiri, Nobuhiko Naka

The contribution reports the results of a 16 kb/s mono listening test that compares all configurations of the two CEs. There were no differences with respect to the absolute scores.

With respect to differential scores, it reports the following:

QMF – FFT: 3 better
QMFxp – QMF: no difference.
FFTxp – FFT: 3 better, mean better

In addition, the contribution reports that NTT DOCOMO confirms that the listening test bitstreams decode exactly to the listening test decoded waveforms.

Jeff Huang, Qualcomm, presented

m18459

Crosscheck listening test report for USAC on FFT and QMF harmonic transposers

Jeff Huang

The contribution reports the results of a 16 kb/s mono listening test that compares all configurations of the two CEs. There were no differences with respect to the absolute scores.

With respect to differential scores, it reports the following:

QMF – FFT: no difference

David Virette, Huawei, presented

m18468

Report on cross-check listening test for the CEs on QMF based harmonic transposer and improved harmonic transposer in USAC

David Virette

The contribution reports the results of a 16 kb/s mono listening test that compares all configurations of the two CEs. There were no differences with respect to the absolute scores.

With respect to differential scores, it reports the following:

QMF – FFT: 1 better
QMFxp – QMF: no difference
FFTxp – FFT: no difference

In addition, the contribution reports that Huawei confirms that the listening test bitstreams decode exactly to the listening test decoded waveforms.

Zhong Haishan, Panasonic, presented

m18501

Panasonic crosscheck report on improved harmonic transposer

Zhong Haishan, Chong Kok Seng, Zhao Dan, Takeshi Norimatsu, Tomokazu Ishikawa, Neo Sua Hong

The contribution reports the results of a 16 kb/s mono listening test that compares all configurations of the two CEs. There were no differences with respect to the absolute scores.

With respect to differential scores, it reports the following:

QMFxp – QMF: 2 better
FFTxp – FFT: 5 better
QMFxp – FFTxp: no difference

Zhong Haishan, Panasonic, presented

m18386

Finalization of CE on QMF based harmonic transposer

Haishan Zhong, Kok Seng Chong, Takeshi Norimatsu, Tomokazu Ishikawa, Lars Villemoes, Per Ekstrand, Kristofer Kjörling, Stephan Wilde, Sascha Disch, Frederik Nagel, Max Neuendorf

The contribution gives an overview of the CE technology, including complexity information, and also presents a summary of all listening test results.

The FFT transposer has high frequency resolution but also high complexity, while the QMF transposer has much lower complexity, as shown in the following table:

Configuration	Total WMOPS Transposer only	WMOPS percentage Transposer	Total WMOPS Decoder	WMOPS percentage Decoder
FFT based Harmonic Transposer with 10% oversampling frames (WD7)	5.79	100%	9.42	100%
QMF based harmonic transposer	0.86	14.8%	4.49	47.7%
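As a consistency check on the table, the sketch below subtracts the transposer figure from the total decoder figure in each row and recomputes the two ratios. The values are those reported in the table; the variable names are illustrative.

```python
# WMOPS as reported: transposer alone and total decoder.
rows = {
    "FFT (WD7)": {"transposer": 5.79, "decoder": 9.42},
    "QMF": {"transposer": 0.86, "decoder": 4.49},
}
base = rows["FFT (WD7)"]

# Complexity of everything in the decoder other than the transposer.
rest = {name: round(r["decoder"] - r["transposer"], 2) for name, r in rows.items()}

# QMF configuration relative to the FFT configuration.
ratio_transposer = rows["QMF"]["transposer"] / base["transposer"]
ratio_decoder = rows["QMF"]["decoder"] / base["decoder"]
```

The remainder is 3.63 WMOPS in both rows, so the whole saving is attributable to the transposer. The recomputed ratios agree with the reported 14.8% and 47.7% to within rounding.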

When all listening test data is combined, analysis of differential MUSHRA scores shows:

QMF – FFT: no difference

Harmonic transposer – Cross products technology

Kristofer Kjörling, Dolby, presented

m18384

Finalization of CE on improved harmonic transposer in USAC

Lars Villemoes, Per Ekstrand, Sascha Disch, Frederik Nagel

The contribution presented an overview of the RM8 transposer technology as compared to the CE transposer technologies. It makes the point that the RM8 transposer results in many missing harmonics, which are perceived as “ghost” higher fundamentals. The proposed cross-product technology permits “filling in” the missing fundamentals by constructing filterbank signals as a sum of adjacent low-band filter signals.

QMF suffers when signals have very low fundamentals, since its low frequency resolution results in many distinct fundamentals mapping to the same quantized representation.

When all listening test data is combined, analysis of differential MUSHRA scores shows:

QMFxp – QMF: 4 better, mean better
FFTxp – FFT: 4 better, mean better

When looking at individual test sites:

QMFxp – QMF: no strong consensus
FFTxp – FFT: 3 items for which at least 3 of 6 agree

The AhG recommends adopting the cross-product technology into the “Study on USAC CD” output document.

Harmonic transposer – QMF technology

Kristofer Kjörling, Dolby, presented

m18389

Overview of performance of transposer proposals, and suggested decoding modes

Kristofer Kjörling, Haishan Zhong, Kok Seng Chong, Takeshi Norimatsu, Tomokazu Ishikawa, Lars Villemoes, Per Ekstrand, Stephan Wilde, Sascha Disch, Frederik Nagel, Max Neuendorf

The contribution presents an overview of the proposed transposer technology. There are two CEs which propose the following replacements or additions to the current RM7 FFT transposer technology:

QMF (replacement or addition)
Cross products (addition)

It focuses on the 16 kb/s mono and stereo operating point because here the transposer requires the greatest fraction of decoder resources.

For mono:

QMF – 4.5 MOPS (reduces total decoder complexity to 48% of RM7)
Cross products
- FFT – 7.4 MOPS (reduces total decoder complexity to 78% of RM7)
- QMF – 4.7 MOPS (reduces total decoder complexity to 50% of RM7)

For stereo:

QMF – 8.6 MOPS (reduces total decoder complexity to 63% of RM7)
Cross products
- FFT – 11.5 MOPS (reduces total decoder complexity to 85% of RM7)
- QMF – 8.8 MOPS (reduces total decoder complexity to 65% of RM7)

It notes that the quality of the QMF transposer is comparable to that of the FFT transposer. It further notes that incorporating cross products into the transposer provides a significant increase in quality while resulting in either a decrease in complexity (FFT) or a modest increase in complexity (QMF). If a decoder with FFT and cross products is used as a baseline of 100%, then the QMF transposer with cross products results in a decoder that is 65% of the baseline complexity.
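The relative figure quoted above can be recomputed from the listed values, reading the MOPS figures as total decoder complexity (an assumption, based on the percentages of RM7 given alongside them). The sketch below does this for both mono and stereo; the names are illustrative.

```python
# Total decoder complexity in MOPS as listed for the 16 kb/s operating point.
totals = {
    "mono": {"QMF": 4.5, "FFTxp": 7.4, "QMFxp": 4.7},
    "stereo": {"QMF": 8.6, "FFTxp": 11.5, "QMFxp": 8.8},
}


def qmfxp_relative_to_fftxp(channel_mode):
    """QMFxp decoder complexity as a percentage of the FFTxp decoder."""
    t = totals[channel_mode]
    return round(100.0 * t["QMFxp"] / t["FFTxp"], 1)


def cross_product_cost_on_qmf(channel_mode):
    """Extra MOPS from adding cross products to the QMF transposer."""
    t = totals[channel_mode]
    return round(t["QMFxp"] - t["QMF"], 1)


mono_pct = qmfxp_relative_to_fftxp("mono")
stereo_pct = qmfxp_relative_to_fftxp("stereo")
```

The mono ratio comes out at about 63.5%, close to the quoted 65%, while the stereo ratio is about 76.5%, so the quoted figure appears to refer to the mono case. The added cost of cross products on the QMF transposer is 0.2 MOPS in both cases.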

In terms of quality, when all data is pooled, 1 item is better for the differential score QMFxp – FFTxp, however there is not strong agreement amongst the results from individual test sites.

The contribution proposes that there be “Low Power” and “High Quality” decoding modes, where a single bitstream syntax can be decoded in either mode. The transposer used in each decoding mode would be:

Low Power: QMFxp
High Quality: FFTxp

Discussion

The Chair noted that there is no “low power” mode defined in USAC. Hence, the Chair felt that the question is whether to have an FFTxp transposer (low complexity), a QMFxp transposer (very low complexity), or both.

Max Neuendorf, FhG, noted that when the individual test sites compare QMFxp with FFTxp, many sites found one or more items to be worse (i.e. QMFxp worse than FFTxp).

Werner Oomen, Philips, felt that there should be only one transposer in USAC and that the group should pick one.

The Chair felt that there was not consensus in the group to make a decision at this time. The topic will be brought up again later in the MPEG week.

T/F domain post-processing

Kihyun Choo, Samsung, presented

m18371

Crosscheck report on adaptive T/F domain post-processing for USAC

Kihyun Choo, Miyoung Kim, Eunmi Oh

The contribution presents results of a listening test comparing WD6+CE and WD6. The operating points and test results for the statistic (WD6+CE - WD6) were

12 kb/s mono
- Absolute scores: no difference
- Differential scores: 1 better
8 kb/s mono
- Absolute scores: no difference
- Differential scores: 2 better

Heiko Purnhagen, Dolby, presented

m18373

Dolby listening test results for CE on T/F post-processing in USAC

Heiko Purnhagen, Kristofer Kjörling

The contribution presents results of a listening test comparing WD6+CE and WD6. The operating points and test results for the statistic (WD6+CE - WD6) were

12 kb/s mono
- Absolute scores: no difference
- Differential scores: no difference
8 kb/s mono
- Absolute scores: no difference
- Differential scores: 2 better, 1 worse (Normal distribution) or 1 better (Student t distribution)
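The outcome of a significance test can depend on the assumed distribution because, with few listeners, the Student t critical value is larger than the Normal one. The sketch below shows the effect on one set of made-up differential scores; the data and the `is_significant` helper are illustrative and not taken from the contribution.

```python
import math
import statistics

# Two-sided 95% critical values: Student t by degrees of freedom, and Normal.
T_CRIT_95 = {5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}
Z_CRIT_95 = statistics.NormalDist().inv_cdf(0.975)  # approximately 1.96


def is_significant(diffs, critical_value):
    """True if the 95% interval around the mean difference excludes zero."""
    n = len(diffs)
    half_width = critical_value * statistics.stdev(diffs) / math.sqrt(n)
    return abs(statistics.mean(diffs)) > half_width


# Made-up differential scores for one item from eight listeners.
diffs = [6, -2, 5, 0, 7, -1, 4, 1]

significant_normal = is_significant(diffs, Z_CRIT_95)
significant_student = is_significant(diffs, T_CRIT_95[len(diffs) - 1])
```

For this data the item counts as better under the Normal assumption but not under Student t, which is the kind of discrepancy in item counts reported above.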

Jeff Huang, Qualcomm, presented

m18461

Crosscheck listening test report for USAC on time frequency domain post-processing

Jeff Huang

The contribution presents results of a listening test comparing WD6+CE and WD6. The operating points and test results for the statistic (WD6+CE - WD6) were

12 kb/s mono
- Absolute scores: no difference
- Differential scores: 3 better, 1 worse

David Virette, Huawei, presented

m18471

Finalization of CE on adaptive T/F domain post-processing for USAC

David Virette, Wei Xiao

The contribution presents results of a listening test comparing WD6+CE and WD6. The operating points and test results for the statistic (WD6+CE - WD6) were

12 kb/s mono
- Absolute scores: no difference
- Differential scores: 3 better, mean better
8 kb/s mono
- Absolute scores: no difference
- Differential scores: 5 better, mean better

When data from all test sites is pooled, the results are:

12 kb/s mono
- Absolute scores: no difference
- Differential scores: 4 better, mean better
8 kb/s mono
- Absolute scores: no difference
- Differential scores: 5 better, mean better

It reviewed the complexity of the CE technology, which is shown in the following table:

	Average PCU	Maximum PCU	RM6
mono@8kbps	0.24	0.56
mono@12kbps	0.31	0.73	8

The presenter noted that the post-processing control bits are transmitted in the bit stream only if the coding mode is LP.

Kristofer Kjörling, Dolby, noted that the Spectrum Flattening Post Processing seems similar to the “Improved SBR” tool, which also helps to flatten the SBR HF envelope. The Chair asked how this post processor compares to the “Improved Bass Post-Filter”. Philippe Gournay, VoiceAge, noted that the Bass Post Filter was limited to processing the signal below 500 Hz. The presenter noted that this technology does noise shaping across the spectrum.

The Chair felt that there was not consensus in the group to make a decision at this time. The topic will be brought up again later in the MPEG week.
