Contributions
Number | Session | Title | Source | Dispositions
m36495 | File Format | Comments on 3D audio file format DAM2 text | Mitsuhiro Hirabayashi, Toru Chinen | Accepted N15379
m36471 | File Format | SAP related proposals for ISO/IEC 14496-12 and 14496-15 | M. M. Hannuksela (Nokia), X. Huang (USTC) | Accepted N15881
m36172 | File Format | Update to conformance bitstream | Richard Mitic | Accepted N15484
m36364 | File Format | Comments on Partial File Storage | David Singer | Noted N15478
m36611 | File Format | Partial File Delivery - Status Update | Thomas Stockhammer | Noted N15478
m36584 | File Format | Error Handling in File Format | Imed Bouazizi | Noted N15478
m36365 | File Format | Updated defect report on ISO/IEC 14496-12 | David Singer | Noted N15881
m36367 | File Format | Does a trailing movie fragment help? | David Singer | Noted
m36368 | File Format | Long samples and fragmented ISO BMFF files | Chris Flick, David Singer | Noted
m36606 | File Format | Web Interactivity Track | Thomas Stockhammer, Giridhar D. Mandyam, Kent Walker, Charles Lo | Noted
m36667 | File Format | Comments on semantics of track reference box in 14496-12 | Hendry, Ye-Kui Wang | Accepted N15881
m36511 | File Format | Proposed updates to ISO/IEC 23008-3 DAM 2 | Ingo Hofmann, Harald Fuchs, Bernd Czelhan, Matteo Agnelli | Accepted N15379
Summary of discussions
m36172 Update to conformance bitstream
Thank you. We need to update the spreadsheet with all the latest files.
m36495 Comments on 3D audio file format DAM2 text
To go into the document that is a re-issue of the amendment; the editors are to integrate it as appropriate.
m36511 Proposed updates to ISO/IEC 23008-3 DAM 2
Thank you. File format experts need to discuss the marking of IPF frames and independent frames, the ‘roll’ behavior, and so on.
An AU contains the encoding of the audio to be played at the AU’s timestamp.
1) An IF or IPF is a sync sample.
An IPF is marked by a SAP sample group with SAP type 1.
2) An IPF is a sync sample.
An IF is marked by a ‘roll’ sample group entry with a roll distance (1 or 2?); alternatively, every frame except an IPF is marked by the pre-roll sample group.
3) An IF or IPF is a sync sample.
An IF that is not an IPF is marked by a SAP sample group with SAP type 4.
4) An IF or IPF is a sync sample.
In addition, define a new SAP type, equivalent to type 4 but acceptable as a sync point (that is, redefine sync).
We agree an IF is SAP type 4. We currently do not allow SAP type 4 to be a sync sample.
- We seem to have a contradiction between the definition of sync sample (SAP type 1 or 2) and the fact that the sync sample table and the SAP group would differ.
- We may get streams with an empty sync sample table; players that don’t look for a roll group may be horribly confused or start from the beginning.
- Contradiction: SAP type 4 is not a sync sample, and in any case it is not formally perfect player behavior to fade in after a seek.
There are three types of frame:
Non-independent frame: a difference frame; do not start decoding here.
Independency frame (IF): can be decoded without preceding frames, but the audio fades in; the audio is correct N frames later (like GDR).
IPF: contains the priming data such that starting here yields immediately correct audio with no fade-in (like IDR).
We settle on option 2. We document that an IPF is a sync sample (IPF == sync sample). IFs are marked by being members of a roll sample group with a positive roll distance (probably 1 or 2) that says “audio is correct N frames later”. We recommend that streams start with an IPF and suggest periodic IPFs, particularly for DASH segmentation. We document that a stream with no IPF has an empty (not absent) sync sample table.
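The marking scheme agreed above can be sketched as follows. This is only an illustrative model, not file-format writing code: the names `FrameKind` and `mark_samples` are hypothetical, the default roll distance of 1 is an assumption (the notes say "probably 1 or 2"), and samples are numbered from 1.

```python
from enum import Enum


class FrameKind(Enum):
    DIFFERENCE = "difference"  # not independent: decoding cannot start here
    IF = "IF"                  # decodable alone, audio correct N frames later
    IPF = "IPF"                # immediately correct audio, no fade-in


def mark_samples(kinds, roll_distance=1):
    """Return (sync_samples, roll_group) for a sequence of FrameKind values.

    sync_samples: 1-based sample numbers of IPFs (the sync sample table).
    roll_group:   dict mapping 1-based sample numbers of IFs to the
                  positive roll distance (assumed constant here).
    """
    sync_samples = []
    roll_group = {}
    for number, kind in enumerate(kinds, start=1):
        if kind is FrameKind.IPF:
            sync_samples.append(number)
        elif kind is FrameKind.IF:
            roll_group[number] = roll_distance
    return sync_samples, roll_group
```

For a stream that contains no IPF, `sync_samples` comes back as an empty list, which mirrors the "empty (not absent)" sync sample table described in the notes.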
It’s fine if nothing comes out until you have fed N frames in. Players will need to start decoding early, and feed in enough frames, to have the audio ready to play.
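One way a player might pick where to start decoding for a seek is sketched below. It is a hedged illustration under assumptions that the notes do not fix: `decode_start` is a hypothetical helper, an IF at sample s with roll distance r is assumed to give correct audio from sample s + r onward, and falling back to sample 1 when no entry point qualifies is a choice made for this example.

```python
def decode_start(target, sync_samples, roll_group):
    """Pick the latest sample to start decoding so audio is correct at `target`.

    target:       1-based sample number the player wants to present.
    sync_samples: 1-based sample numbers of IPFs (correct immediately).
    roll_group:   dict of IF sample number -> positive roll distance
                  (audio assumed correct `distance` frames later).
    """
    # An IPF at or before the target gives correct audio right away.
    candidates = [s for s in sync_samples if s <= target]
    # An IF is usable only if its roll completes by the target.
    candidates += [s for s, dist in roll_group.items() if s + dist <= target]
    # Assumption: with no usable entry point, decode from the first sample.
    return max(candidates, default=1)
```

With IPFs at samples 1 and 9 and IFs at samples 4 and 7 (roll distance 2), a seek to sample 8 would start decoding at sample 4, because the IF at sample 7 would not yet have converged.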
If I have a raw audio stream, encode it into a set of access units, and feed those access units to a decoder, do I get a stream of the same length out, with no pre-padded silence and no trailing silence? Does the number of frames equal ceil(stream_length_in_samples / frame_length_in_samples)?
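The frame count in the question above can be evaluated with integer arithmetic, as in this small sketch. It only computes the formula; whether real encoders actually produce exactly this many access units is the open question. The function name is hypothetical, and the frame length of 1024 in the usage note is an illustrative value.

```python
def expected_frame_count(stream_length_in_samples, frame_length_in_samples):
    """Evaluate ceil(stream_length / frame_length) without floating point."""
    if frame_length_in_samples <= 0:
        raise ValueError("frame length must be positive")
    if stream_length_in_samples < 0:
        raise ValueError("stream length must not be negative")
    # Negated floor division rounds up for non-negative integers.
    return -(-stream_length_in_samples // frame_length_in_samples)
```

For example, 48000 samples with an assumed frame length of 1024 gives 47 frames, the last of which is only partly filled with input audio.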
We note that the base ‘filter chain’ will deliver a predictable, normative, constant number N of zeroes (a large number, in the thousands) after initialization, and that the ‘driver’ that encapsulates this filter pipeline will know to discard them; the first audio sample output from that driver corresponds exactly to the data in the IPF or IF. Do we always get these N zeroes, or only on a cold start after an IF and not an IPF?
We badly need to verify the operation here, and identify what happens at each layer.
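As an aid to that verification, the layering described above (a filter chain that emits N leading zeroes, wrapped by a driver that discards them) might be modeled like this. It is a sketch only: `DiscardingDriver` and its `push` method are hypothetical names, the value of N is a parameter, not a normative figure, and the model assumes the zeroes are always present after initialization, which is exactly the point still to be confirmed.

```python
class DiscardingDriver:
    """Wraps filter-chain output and drops the first N output samples."""

    def __init__(self, startup_delay):
        if startup_delay < 0:
            raise ValueError("startup delay must not be negative")
        # Number of leading samples still to be discarded.
        self._remaining = startup_delay

    def push(self, chunk):
        """Accept one chunk of filter-chain output; return playable samples."""
        drop = min(self._remaining, len(chunk))
        self._remaining -= drop
        return chunk[drop:]
```

Feeding chunks one at a time shows that nothing comes out until the delay has been consumed, after which samples pass through unchanged.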