International organisation for standardisation



Score diff.   Best ref. codec   Type   Item
51,11         AMR-WB+           S      te1_mg54_speech
48,00         AMR-WB+           S      Wedding_speech
45,00         AMR-WB+           S      te19
42,22         AMR-WB+           S      carrot_speech
28,56         HE-AAC            M      phi7
21,56         HE-AAC            M      trilogy
21,33         AMR-WB+           SOM    HarryPotter
20,89         HE-AAC            M      salvation
20,44         AMR-WB+           SOM    te16_fe49
15,56         HE-AAC            M      dongwoo
14,67         HE-AAC            SOM    SpeechOverMusic_3
10,22         HE-AAC            SOM    twinkle_ff51

Eunmi Oh, Samsung, presented



m15462

Test item selection for Unified Speech and Audio Coding

Eunmi Oh
Miyoung Kim

This contribution used a process similar to the one proposed by Philips at the last meeting, shown here:

  • max (AMR-WB+ - HE-AAC v2): to improve content dependency of reference codecs

  • max (HE-AAC v2 - AMR-WB+): to improve content dependency of reference codecs

  • max (VC): to maintain the best performance of virtual codec

  • min (VC): to improve worst case behavior for virtual codec
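The four criteria above can be sketched in a few lines of Python. This is only an illustration: the item names and scores are invented, and the sketch assumes that the virtual codec (VC) score for an item is the better of the two reference codec scores, which these minutes do not state explicitly.

```python
# Hypothetical per-item listening-test scores (illustrative numbers only).
scores = {
    "item_a": {"amr_wb_plus": 70.0, "he_aac_v2": 30.0},
    "item_b": {"amr_wb_plus": 35.0, "he_aac_v2": 80.0},
    "item_c": {"amr_wb_plus": 85.0, "he_aac_v2": 90.0},
    "item_d": {"amr_wb_plus": 25.0, "he_aac_v2": 20.0},
}


def score_diff(item):
    """AMR-WB+ score minus HE-AAC v2 score for one item."""
    return scores[item]["amr_wb_plus"] - scores[item]["he_aac_v2"]


def vc_score(item):
    """Assumption: VC is the better of the two reference codecs per item."""
    return max(scores[item]["amr_wb_plus"], scores[item]["he_aac_v2"])


# One item per criterion, mirroring the four bullets above.
selection = {
    "max (AMR-WB+ - HE-AAC v2)": max(scores, key=score_diff),
    "max (HE-AAC v2 - AMR-WB+)": min(scores, key=score_diff),
    "max (VC)": max(scores, key=vc_score),
    "min (VC)": min(scores, key=vc_score),
}

for criterion, item in selection.items():
    print(f"{criterion}: {item}")
```

In the contributions themselves the criteria are applied separately within each content category, so a real selection would run this once per category.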

As an additional criterion, the contribution notes that test items should represent signals addressing the following issues:

  • to consider envisioned applications such as mobile streaming and broadcasting, where various types of content exist

  • to achieve content robustness: handle speech and music signals equally well

  • to maintain the best performance of existing codecs

  • to improve coding efficiency at low bitrates

Eunmi Oh noted that the bitrate and the choice of mono or stereo have a major impact on the items selected. Kristofer Kjörling, Dolby, noted that selecting on maximum score difference might not be the best criterion for the mixed category: the two codecs that comprise VC could both score well or both score badly, and in either case the difference between their scores is small.
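The concern about the difference criterion can be made concrete with two invented items. The scores below are hypothetical, and the sketch again assumes that VC takes the better of the two reference scores.

```python
# Two hypothetical mixed-content items (made-up scores).
both_good = {"amr_wb_plus": 88.0, "he_aac_v2": 90.0}
both_bad = {"amr_wb_plus": 22.0, "he_aac_v2": 20.0}


def abs_diff(item_scores):
    """Magnitude of the score difference between the two reference codecs."""
    return abs(item_scores["amr_wb_plus"] - item_scores["he_aac_v2"])


def best_score(item_scores):
    """Assumption: VC is the better of the two reference codecs."""
    return max(item_scores.values())


# The difference criterion cannot tell these items apart...
same_diff = abs_diff(both_good) == abs_diff(both_bad)

# ...although their absolute quality is far apart.
quality_gap = best_score(both_good) - best_score(both_bad)

print(same_diff, quality_gap)
```

A difference-based ranking would treat both items as equally uninteresting, even though the second one is a clear worst case for the virtual codec.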

The operating points tested were 16 kb/s mono and 24 kb/s stereo, and the items selected were:




                  Category
Criteria          Speech              Music     Speech over Music
max (AMR - AAC)   es02                brahms    lion
max (AAC - AMR)   Wedding_ms_speech   dongwoo   phi6
max (VC)          te19                Music_2   speechOverMusic_3
min (VC)          es01                Music_4   twinkle_ff51

Markus Multrus, FhG, presented



m15427

Proposal for Test Item Selection for CfP on Unified Speech and Audio Coding

Markus Multrus
Max Neuendorf
Ralf Geiger
Bernhard Grill

The contribution presents listening tests for the 12 kb/s, 16 kb/s and 24 kb/s mono operating points. The worst-scoring items for each coder comprising VC were noted, without regard to category. The number of “hits” for each item was summed, and the items with the most hits were selected.
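The hit-counting procedure described above might look roughly like the following. All names and scores are hypothetical, and the number of worst-scoring items noted per coder (`n_worst`) is an assumption, since the minutes do not give it.

```python
from collections import Counter

# Hypothetical scores: operating point -> codec -> item -> score.
results = {
    "12 kb/s mono": {
        "codec_1": {"item_a": 30, "item_b": 55, "item_c": 70},
        "codec_2": {"item_a": 40, "item_b": 35, "item_c": 80},
    },
    "16 kb/s mono": {
        "codec_1": {"item_a": 35, "item_b": 60, "item_c": 75},
        "codec_2": {"item_a": 45, "item_b": 50, "item_c": 85},
    },
    "24 kb/s mono": {
        "codec_1": {"item_a": 50, "item_b": 65, "item_c": 80},
        "codec_2": {"item_a": 60, "item_b": 55, "item_c": 90},
    },
}


def count_hits(results, n_worst=1):
    """Count how often each item is among the n_worst items of a codec."""
    hits = Counter()
    for codecs in results.values():
        for item_scores in codecs.values():
            # Items sorted by ascending score; the first n_worst are "hits".
            worst = sorted(item_scores, key=item_scores.get)[:n_worst]
            hits.update(worst)
    return hits


hits = count_hits(results)
# Keep the items with the most hits (two here, purely for illustration).
selected = [item for item, _ in hits.most_common(2)]
print(hits, selected)
```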

Based on these criteria, the following list of test items per category was selected:



Speech            Music     Mixed
Green_speech      Music_5   HarryPotter
te19              phi7      SpeechOverMusic_2
es01              Trilogy
te1_mg54_speech   sc03

Johannes Boehm, Thomson, presented



m15465

Proposal for item selection, CfP Unified Speech and Audio Coding

Johannes Boehm

The contribution made a number of simplifications to the selection process:

  • Consider only VC scores

  • Consider only the 20 kb/s stereo operating point. The presenter noted that this is below the current operating point for MPEG technology.

Given this, the process was to select

  • 2 items that were Min(VC)

  • 2 items that were Max(VC)
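A minimal sketch of this simplified rule, applied per content group as in the table of selected items, could look like this. The group labels follow the minutes, but the items and VC scores are invented for illustration.

```python
# Hypothetical VC scores at one operating point: group -> item -> score.
vc_scores = {
    "M": {"m1": 40, "m2": 75, "m3": 55, "m4": 90, "m5": 62},
    "S": {"s1": 82, "s2": 35, "s3": 60, "s4": 48, "s5": 71},
}


def pick_extremes(group_scores, n=2):
    """Return the n lowest-scoring and n highest-scoring items of a group."""
    ranked = sorted(group_scores, key=group_scores.get)  # ascending VC score
    return {"min": ranked[:n], "max": ranked[::-1][:n]}


picks = {group: pick_extremes(items) for group, items in vc_scores.items()}
print(picks)
```

With only two listeners, as noted for this test, the ranking at the extremes could easily change with more data, so the output of such a rule is best read as a starting point for discussion.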

Only two listeners participated in the test. The following items were selected:




            Group
Criterion   M         S                SM
Min(VC)     phi2      Wedding_speech   HarryPotter
Min2(VC)    Music_4   Arirang_speech   SpeechOverMusic_3
Max(VC)     brahms    te19             twinkle_ff51
Max2(VC)    Music_1   es03             te16_fe49


Discussion

The AhG recommends that there be a break-out group during the MPEG week to continue to discuss item selection, including:



  • Selection process

  • Operating modes (bitrate and mono or stereo) at which listening data is available and the number of listeners at each operating mode

  • Training procedure and training item selection

Anisse Taleb, Ericsson, proposed that there might be different items at different operating points. Heiko Purnhagen, Dolby, noted that this makes rate-distortion analysis problematic. The Chair noted that, typically, a single set of items has been used to assess performance at different rates. Pierrick Philippe, France Telecom Research, noted that perhaps 12 items, 4 for each of the three categories, could be agreed on.

Werner Oomen, Philips, provided the following summary information:



Test   kb/s   S/M (stereo/mono)   Tested by (subjects)
1      64     S                   -
2      32     S                   -
3      24     S                   Samsung (3), Philips (2)
4      24     M                   FhG (3), VoiceAge (3)
5      20     S                   Thomson (2), ETRI (3)
6      20     M                   ETRI (3)
7      16     S                   -
8      16     M                   Samsung (8), FhG (3), VoiceAge (3)
9      12     M                   FhG (3), VoiceAge (3)
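Since the number of listeners at each operating mode is one of the open topics, the table can be totalled as follows. The subject counts are transcribed from the table above; the variable names are illustrative only.

```python
# Subjects per site for each test; tests marked "-" have no listening data.
subjects = {
    1: {},
    2: {},
    3: {"Samsung": 3, "Philips": 2},
    4: {"FhG": 3, "VoiceAge": 3},
    5: {"Thomson": 2, "ETRI": 3},
    6: {"ETRI": 3},
    7: {},
    8: {"Samsung": 8, "FhG": 3, "VoiceAge": 3},
    9: {"FhG": 3, "VoiceAge": 3},
}

# Total listeners per test, and the tests with no data at all.
totals = {test: sum(sites.values()) for test, sites in subjects.items()}
untested = [test for test, count in totals.items() if count == 0]

print(totals)
print(untested)
```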

Schuyler Quackenbush, Audio Research Labs, presented



m15448

Draft Workplan on Subjective Testing of Unified Speech and Audio Coding Proposals

Schuyler Quackenbush

The contribution presented a workplan for the test of CfP responses. The draft was reviewed, some issues were clarified, and the remaining open issues were highlighted. The Chair encouraged experts to review this draft carefully, since a final document is needed on Friday.

4 General Interest Presentations

4.1 Interactive Music Demonstration ETRI


This presentation described and demonstrated a consumer product in Korea that permits the user to play not only the producer mix of a music CD, but also a user-designated mix in which the user can set the level of a number of “voices” in the music, for example the main vocal, the chorus or specific instruments. This product is already associated with about 30 music CDs in Korea, and over 1 million interactive music CDs have been sold in Asia (Korea, Japan, etc.).

The physical CD media is backward-compatible with conventional CD players, but has “hidden data” on the CD that enables the interactive functionality.

This technology is proposed to MPEG as a MAF for physical CD and for streaming formats. The related input documents are as follows.

4.2 Overview of Professional Archive MAF


This presentation described the current status of the professional archive MAF. It is based on the MPEG-21 File Format and hence supports a rich set of metadata for the content set. It is envisioned that a flexible set of media compression tools will be supported; currently MPEG-4 ALS and MPEG-4 AVC are supported. Tools are identified via identifier fields, which are defined either in this specification or by appropriate registration authorities.

