MPEG-4 Structured Audio

  • Eric D. Scheirer

  • Machine Listening Group, MIT Media Laboratory

  • Editor, ISO 14496-3 (MPEG-4 Audio)

MPEG-4 Structured Audio, A New Standard for Interactive Sound, in the Creation of Which Tom White did not Run the Whole Show, but Only Played a Small (Though Valuable) Part

What’s this all about?


  • What is MPEG?

  • What is MPEG-4 Structured Audio?

  • Why was it created?

  • How does it work?

  • How can it be used in IA applications?

  • What is its current status?

  • A brief note on MPEG-4 AudioBIFS

Intellectual property in MPEG-4

  • Structured Audio and AudioBIFS are free

    • All patentable IP has been released to the public domain
    • No licensing or other costs to build tools & players
    • (Standard itself costs $300 for printing/bureaucracy)
  • SA and AudioBIFS are open standards

    • Companies competing through cooperation
    • Interoperability makes the whole pie bigger
    • MPEG processes for improving/correcting standard
    • MIT has no veto over the future of the standard

What is MPEG?

  • MPEG is ISO/IEC JTC1 SC29 WG11

    • A working group of ISO (the International Organization for Standardization) and the IEC
    • The “Moving Picture Experts Group”
  • MPEG-1: 1993 (ISO 11172)

    • Digital audio/video coding (MP3)
  • MPEG-2: 1994-1997 (ISO 13818)

    • Digital coding for broadcast
  • MPEG-4: 1998 (ISO 14496)

    • Object based, synthetic/natural, interactive coding

MPEG Marketplace Model

MPEG-4 Audio

  • High-quality sound

    • Based on the MPEG AAC algorithm: similar quality to MP3 at roughly half the bitrate
  • Low-bitrate sound

    • For WWW and cellular: speech/music as low as 4 kbps
  • Synthetic sound

  • AudioBIFS

    • Mix and postproduce multi-track sound streams

MPEG-4 Structured Audio

  • Transmit structured description of sound

  • Use real-time synthesis to play sound

  • “PostScript for audio”

  • Based on new (to MPEG) technology

    • SAOL: New music synthesis language
    • SASL: New music control format
  • A lot of related technology in academia

    • Csound, Music-11, SynthScript, Nyquist, CLM, ...

Standardization goals

  • Provide synthetic sound in MPEG-4

  • Bring algorithmic synthesis to wider community

    • Standardize academic state-of-the-art; don’t innovate
  • Get new companies to work on synthesis

    • Implementation required for full MPEG-4 system
  • Set a higher bar for PC sound architecture

    • Drive forward the world of sound on PCs!

MPEG-4 SA decoding process

What SAOL looks like

  • A C-like language

  • Based on the Music-N model

  • Variables hold audio signals

  • Unit generators do basic functions

  • Instruments controlled by score or MIDI
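
The Music-N model named above can be illustrated outside SAOL. The following is a minimal Python sketch, not SAOL code and not part of the standard: unit generators are plain functions, an instrument wires them together, and a score lists timed note events. All names (`sine_osc`, `linear_decay`, `tone`, `render`, `SCORE`) and the 8 kHz sample rate are assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value, kept low so the sketch runs quickly


def sine_osc(freq_hz, n_samples):
    """Unit generator: a sine oscillator producing n_samples values."""
    step = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    return [math.sin(step * i) for i in range(n_samples)]


def linear_decay(n_samples):
    """Unit generator: an envelope falling linearly from 1 toward 0."""
    return [1.0 - i / n_samples for i in range(n_samples)]


def tone(freq_hz, dur_s, amp):
    """Instrument: multiplies the oscillator by the envelope and a gain."""
    n = int(dur_s * SAMPLE_RATE)
    carrier = sine_osc(freq_hz, n)
    envelope = linear_decay(n)
    return [amp * c * e for c, e in zip(carrier, envelope)]


def render(score, instruments):
    """Scheduler: add every score event into one output buffer."""
    total_s = max(start + dur for start, _, dur, _ in score)
    out = [0.0] * int(total_s * SAMPLE_RATE)
    for start, name, dur, args in score:
        note = instruments[name](dur_s=dur, **args)
        offset = int(start * SAMPLE_RATE)
        for i, sample in enumerate(note):
            idx = offset + i
            if idx < len(out):  # guard against rounding at the buffer end
                out[idx] += sample
    return out


# Score: (start time in s, instrument name, duration in s, parameters)
SCORE = [
    (0.0, "tone", 0.5, {"freq_hz": 440.0, "amp": 0.4}),
    (0.25, "tone", 0.5, {"freq_hz": 660.0, "amp": 0.4}),
]

samples = render(SCORE, {"tone": tone})
```

Only the instrument definitions and the score need to be transmitted; the receiver computes the samples. In SAOL the same roles are played by instrument definitions and by SASL or MIDI events.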

SAOL capabilities

  • Many nice features built in

    • Wavetable manipulation
    • FFT/IFFT
    • Multitap delay lines
    • Arrays of signals
    • FIR & IIR filters
    • Effects routing
    • Granular synthesis
    • 3-D audio interface
    • Dynamic layering and triggering
  • SAOL is extensible-from-within

    • (Allows encapsulation and structured programming)
  • Any kind of synthesis can be used in SAOL


  • “Xanadu” (Joseph Kung)

    • 60 seconds long, 44 kHz stereo (10.5 MB as WAVE)
    • 2.2 KB in header
    • 4.2 KB in bitstream (= 0.07 KB/s, about 0.56 kbps)
    • No samples anywhere, only algorithmic synthesis
    • More than 1200:1 “compression”, no loss of quality
    • Could be controlled/restructured interactively
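
The figures quoted for this example can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 KB = 1000 bytes, 1 MB = 1000 KB) and counts both header and bitstream toward the transmitted size. Under those assumptions, 0.07 is the streaming rate in kilobytes per second (about 0.56 kbit/s), and the size ratio comes out near 1640:1, consistent with "more than 1200:1".

```python
# Figures quoted on the slide.
DURATION_S = 60
WAVE_MB = 10.5
HEADER_KB = 2.2
BITSTREAM_KB = 4.2

# Assumption: decimal units (1 KB = 1000 bytes, 1 MB = 1000 KB).
stream_kbytes_per_s = BITSTREAM_KB / DURATION_S
stream_kbits_per_s = stream_kbytes_per_s * 8

# Everything the decoder needs: header plus streamed events.
total_kb = HEADER_KB + BITSTREAM_KB
ratio = WAVE_MB * 1000 / total_kb

print(f"streaming rate: {stream_kbytes_per_s:.2f} KB/s "
      f"= {stream_kbits_per_s:.2f} kbit/s")
print(f"size ratio: about {ratio:.0f}:1")
```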

MPEG-MMA relationship

  • MIDI can control MPEG-4 SA synth

    • SASL = more flexible, more tightly coupled
  • DLS-2 synthesis embedded in SA synth

    • Do wavetable in series or parallel with other techniques
  • “Wavetable-only” profile of MPEG-4

    • MIDI + DLS-2 + compressed audio + video (no SAOL)
    • Logical path of progression from today to tomorrow
  • Lots of help from MMA - appreciated!

    • MPEG is ready to help in the other direction (MIDI-DLA?)

Application ideas

  • MPEG-4 is not an application!

    • It’s a tool - enables functionality and interoperability
    • Implementations could be hardware, software, both
    • Authoring tools also very important
  • Use MPEG-4 SA like Staccato Synthcore

  • Use MPEG-4 SA like Beatnik

  • Use MPEG-4 SA like Koan

  • Use MPEG-4 SA for new music applications

Application example: Gaming

Current status

  • Standard and reference software finished

  • Many implementation projects starting

    • Creative Tech Center: Compression & Interactive Audio
    • Studer + EPFL: “ThreeDSpace” project
    • Hobbyist projects (Java API, ActiveX plugin)
    • Others: Be Inc., Sseyo, King’s College, UC Berkeley, Catholic U. Leuven, Q-Team DE, Nokia, ...
    • 3 complete implementations already!
  • A few authoring tools projects

  • Active mailing list for developers

A brief note on AudioBIFS

  • BIFS is scene-description part of MPEG-4

    • “Binary Format for Scenes”
    • Based on VRML, but with many new features
  • AudioBIFS is the audio mixing part

    • Stream audio in multitrack format
    • Deliver mixdown instructions in AudioBIFS
    • Mixing, spatialization, effects in SAOL, multichannel
    • Terminal-adaptive capability
    • Candidate for “PC DSP architecture”?
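
The idea of streaming separate tracks and delivering the mixdown as instructions can be sketched as a gain matrix applied at the receiver. This is an illustrative Python sketch only, not the normative AudioBIFS node set: `mix_down`, `MIX`, and the sample values are hypothetical.

```python
def mix_down(tracks, matrix):
    """Apply a gain matrix to decoded input tracks.

    tracks: list of equal-length sample lists, one per input track.
    matrix: matrix[out_ch][in_track] is the gain from a track to an
            output channel.
    Returns one sample list per output channel.
    """
    n = len(tracks[0])
    if any(len(t) != n for t in tracks):
        raise ValueError("all tracks must have the same length")
    if any(len(row) != len(tracks) for row in matrix):
        raise ValueError("each matrix row needs one gain per track")
    return [
        [sum(g * t[i] for g, t in zip(row, tracks)) for i in range(n)]
        for row in matrix
    ]


# Two decoded tracks (hypothetical sample values).
voice = [0.5, 0.5, 0.5, 0.5]
music = [0.2, -0.2, 0.2, -0.2]

# Mixdown instructions: voice centered, music panned mostly left.
MIX = [
    [0.5, 0.75],  # left  = 0.5 * voice + 0.75 * music
    [0.5, 0.25],  # right = 0.5 * voice + 0.25 * music
]

left, right = mix_down([voice, music], MIX)
```

Because the matrix travels separately from the audio, a terminal with a different speaker layout could apply a different mix to the same tracks, which is one reading of the terminal-adaptive capability listed above.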

AudioBIFS - scene graph model


  • MPEG-4 Structured Audio

    • The international standard for algorithmic sound synthesis
  • MPEG-4 AudioBIFS

    • The international standard for audio postproduction
  • New market opportunities for

    • Hardware/software MPEG-4 players (embedded or not)
    • Authoring tools (editors, sequencers)
    • Advanced interactive audio content

What was this all about?

  • MPEG-4 is not just about compression

  • MPEG-4 shows one way for the IA world to move beyond wavetable synthesis

For more information

  • MPEG home page

    • Requirements, future of MPEG
  • MPEG-4 SA home page

    • Draft standard, code, mailing lists, matchmaking
  • Contact

    • Slides, technical papers, discussion available
