Organisation internationale de normalisation




Contributions





| Number | Session | Title | Source | Dispositions |
|---|---|---|---|---|
| m36495 | File Format | Comments on 3D audio file format DAM2 text | Mitsuhiro Hirabayashi, Toru Chinen | Accepted N15379 |
| m36471 | File Format | SAP related proposals for ISO/IEC 14496-12 and 14496-15 | M. M. Hannuksela (Nokia), X. Huang (USTC) | Accepted N15881 |
| m36172 | File Format | Update to conformance bitstream | Richard Mitic | Accepted N15484 |
| m36364 | File Format | Comments on Partial File Storage | David Singer | Noted N15478 |
| m36611 | File Format | Partial File Delivery - Status Update | Thomas Stockhammer | Noted N15478 |
| m36584 | File Format | Error Handling in File Format | Imed Bouazizi | Noted N15478 |
| m36365 | File Format | Updated defect report on ISO/IEC 14496-12 | David Singer | Noted N15881 |
| m36367 | File Format | Does a trailing movie fragment help? | David Singer | Noted |
| m36368 | File Format | Long samples and fragmented ISO BMFF files | Chris Flick, David Singer | Noted |
| m36606 | File Format | Web Interactivity Track | Thomas Stockhammer, Giridhar D. Mandyam, Kent Walker, Charles Lo | Noted |
| m36667 | File Format | Comments on semantics of track reference box in 14496-12 | Hendry, Ye-Kui Wang | Accepted N15881 |
| m36511 | File Format | Proposed updates to ISO/IEC 23008-3 DAM 2 | Ingo Hofmann, Harald Fuchs, Bernd Czelhan, Matteo Agnelli | Accepted N15379 |



    1. Summary of discussions




      1. m36172 Update to conformance bitstream


Thank you. We need to update the spreadsheet with all the latest files.

      1. m36495 Comments on 3D audio file format DAM2 text


The editors are to integrate these comments, as appropriate, into the document that re-issues the amendment.

      1. m36511 Proposed updates to ISO/IEC 23008-3 DAM 2


Thank you. File format experts need to discuss the marking of IPF frames and independent frames, the ‘roll’ behavior, and so on.

An AU contains the encoding of the audio to be played at the AU’s timestamp.


1) An IF or IPF is a sync sample.

An IPF is marked by a SAP sample group with SAP type 1.


2) An IPF is a sync sample.

An IF is marked by a ‘roll’ sample group entry with a roll distance (1 or 2?); alternatively, every frame except an IPF is marked by the pre-roll sample group.


3) An IF or IPF is a sync sample.

An IF that is not an IPF is marked by a SAP sample group with SAP type 4.


4) An IF or IPF is a sync sample.

In addition, define a new SAP type, equivalent to type 4 but acceptable as a sync point (this redefines sync).


We agree an IF is SAP type 4. We currently do not allow SAP type 4 to be a sync sample.

Problems with options 1 to 3, respectively:

  1. We seem to have a contradiction between the definition of a sync sample (SAP type 1 or 2) and the fact that the sync sample table and the SAP group would differ.

  2. We may get streams with an empty sync sample table; players that do not look for a roll group may be horribly confused, or may start from the beginning.

  3. A contradiction: SAP type 4 is not a sync sample and, in any case, fading in after a seek is not formally perfect player behavior.

There are three types of frame:

Non-independent frame: a difference frame; do not start decoding here.

Independency frame (IF): can be decoded without the preceding frames, but the audio fades in; the audio is correct N frames later (like GDR).

IPF: contains the priming data, so that starting here yields immediately correct audio with no fade-in (like IDR).


We settle on option 2. We document that an IPF is a sync sample (IPF == sync sample). IFs are marked by membership of a roll sample group with a positive roll distance (probably 1 or 2), meaning “audio is correct N frames later”. We recommend that streams start with an IPF and suggest periodic IPFs, particularly for DASH segmentation. We document that a stream with no IPF has an empty (not absent) sync sample table.
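The marking settled on above can be sketched as follows. This is an illustration only, not text from any specification: `FrameType`, `mark_samples` and the default roll distance of 1 are hypothetical (the minutes only say "probably 1 or 2"), and samples are assumed to be numbered from 1.

```python
from enum import Enum


class FrameType(Enum):
    DIFFERENCE = "difference"  # not independent; do not start here
    IF = "independency"        # decodable alone, but the audio fades in
    IPF = "ipf"                # carries priming data; correct immediately


def mark_samples(frames, roll_distance=1):
    """Return (sync_samples, roll_members) for a list of FrameType values.

    sync_samples: sample numbers of IPFs (the sync sample table).
    roll_members: sample number -> roll distance, for IFs that are not IPFs.
    The roll distance value is an assumption, not an agreed figure.
    """
    sync_samples = []
    roll_members = {}
    for number, frame in enumerate(frames, start=1):
        if frame is FrameType.IPF:
            sync_samples.append(number)
        elif frame is FrameType.IF:
            roll_members[number] = roll_distance
    return sync_samples, roll_members


frames = [FrameType.IPF, FrameType.DIFFERENCE, FrameType.IF,
          FrameType.DIFFERENCE, FrameType.IPF]
print(mark_samples(frames))  # ([1, 5], {3: 1})
```

For a stream with no IPF the function returns an empty list for the sync sample table, which mirrors the "empty (not absent)" case recorded above.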
It’s fine if nothing comes out until you have fed N frames in. Players will need to start decoding early, and feed in enough frames, to have the audio ready to play.
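One way a player might pick where to start decoding so that a target sample plays correctly is sketched below. The function name, the data shapes and the fallback to sample 1 are all assumptions made for illustration; the minutes do not define player behavior at this level of detail.

```python
def decode_start(target, sync_samples, roll_members):
    """Pick the latest sample from which decoding yields correct audio at `target`.

    sync_samples: sample numbers that are sync samples (IPFs).
    roll_members: sample number -> positive roll distance (IFs).
    A sync sample is usable if it is at or before the target; a roll group
    member is usable if target is at least `roll distance` samples after it.
    Falling back to sample 1 when nothing qualifies is an assumption.
    """
    candidates = [s for s in sync_samples if s <= target]
    candidates += [s for s, d in roll_members.items() if s + d <= target]
    return max(candidates, default=1)


# IPF at sample 1, IF at sample 3 with roll distance 1 (hypothetical stream).
print(decode_start(4, [1], {3: 1}))  # 3
print(decode_start(3, [1], {3: 1}))  # 1
```

In the second call the IF at sample 3 cannot be used, because the audio would only be correct from sample 4 onward, so decoding starts from the earlier IPF.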
If I take a raw audio stream, encode it into a set of access units, and feed those access units to a decoder, do I get a stream of the same length out, with no pre-padded silence and no trailing silence? Does the number of frames equal ceil(stream_length_in_samples / frame_length_in_samples)?
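If the answer to that question is yes, the expected frame count is a plain integer ceiling. The sketch below only works the arithmetic; the frame length of 1024 samples in the example is an assumed value, not one taken from the minutes.

```python
def expected_frame_count(stream_length_in_samples, frame_length_in_samples):
    """Return ceil(stream_length / frame_length) using integer arithmetic."""
    if frame_length_in_samples <= 0:
        raise ValueError("frame length must be positive")
    # Negated floor division gives the ceiling without floating point.
    return -(-stream_length_in_samples // frame_length_in_samples)


# One second at 48 kHz with an assumed 1024-sample frame:
# 46 full frames cover 47104 samples, leaving 896 for a final partial frame.
print(expected_frame_count(48000, 1024))  # 47
```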
We note that the base ‘filter chain’ will deliver a predictable, normative, constant number N (large, in the thousands) of zeroes after initialization, and that the ‘driver’ that encapsulates this filter pipeline will know to discard them; the first audio sample output from that driver corresponds exactly to the data in the IPF or IF. Do we always get these N zeroes, or only on a cold start from an IF and not from an IPF?
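The discard step described above might look like the sketch below. Here `strip_startup_delay` is a hypothetical helper and the delay value is a placeholder: the real N is whatever the codec defines, and whether it applies on every start remains the open question above.

```python
def strip_startup_delay(filter_output, startup_delay):
    """Yield output blocks with the first `startup_delay` samples removed.

    filter_output: iterable of sample blocks (lists) from the filter chain.
    startup_delay: number of leading samples to drop (the N of the text).
    """
    remaining = startup_delay
    for block in filter_output:
        if remaining >= len(block):
            # The whole block is still start-up output; drop it.
            remaining -= len(block)
            continue
        yield block[remaining:]
        remaining = 0


# Toy example with N = 6 instead of a value in the thousands.
blocks = [[0, 0, 0, 0], [0, 0, 5, 6], [7, 8]]
print(list(strip_startup_delay(blocks, 6)))  # [[5, 6], [7, 8]]
```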
We badly need to verify the operation here, and identify what happens at each layer.
