The test plan of the verification tests on MPEG-4 speech coding was fully defined at the last MPEG meeting, and the preparation of the items to be used was completed just before the Dublin meeting.
However, it was realised that there was considerable variation in level among the different items; this additional source of variation would have made the interpretation of the test results much more difficult. It was therefore agreed to adjust the ‘outliers’ and encode them again.
In addition, the test conditions, in particular the codecs to be used in the wide-band test, were revised by the Audio Subgroup. Details about the final decisions are given in document N.2277.
Test results will be made available at the beginning of September.
The test plan previously approved was completely revised and now includes four sub-tests:
Audio coding at a bit rate below 10 kbit/s
Audio coding at bit rates around 16 kbit/s
Scaleable coding of mono material at 24 kbit/s
Scaleable coding of stereo material at bit rates between 40 and 56 kbit/s
Details about the test plan are given in document N.2278.
In addition, the source material collected for this test was pre-screened during the meeting, and 39 items out of 90 were selected (see document N.2279). The final selection of the test material will be made on the encoded material; this is one of the tasks of the ad hoc group for audio verification tests.
MPEG-4 video verification test
The goal of this test was to evaluate the performance of MPEG-4 error resilience tools, under conditions representative of video communications over mobile networks.
Test conditions were produced by means of a simulation of the complete transmission chain, including transmission errors. The combination of three different bit rates with two error conditions was considered. In order to obtain more reliable results, a new test method particularly suitable for evaluating time-varying video impairments was applied.
The test was carried out in three laboratories with non-expert viewers. Data analysis confirmed the validity of the method, the reliability of the subjects and revealed a bias due to the test site, although the trends of the results were very similar in the three laboratories.
Test results were discussed with the Video Subgroup and the main conclusions about the performance of the codecs are:
Generally speaking, the only condition that presented annoying transmission errors was 128 kbit/s with critical errors (i.e. 1e-3 10 ms burst errors). Video experts felt that further evaluation should be performed in an ad hoc group. If it is judged that significant improvement can be obtained by more appropriate choices of encoder parameters, new test material will be produced and the test repeated.
The quality of sequences encoded with MPEG-4 error resilience tools and affected by typical transmission errors of mobile networks (i.e. 1e-4 10 ms burst errors) is comparable to the quality of sequences without errors.
At 32 kbit/s there is a considerable masking effect, so transmission errors do not increase the annoyance due to coding artefacts.
A third pre-screening of the material to be used in the lower bit rate test was conducted during the Dublin meeting. Sequences were coded by using MPEG-1 and ‘MPEG-4 Frame-Based’, both using rate control.
A number of problems related either to the implementation of MPEG-1 or to the particular rate control used were discussed, and new coding parameter settings were agreed. In particular, in order to make fair comparisons between ‘MPEG-4 Frame-Based’ and MPEG-1, it was decided that the MPEG-4 rate control will be implemented in the MPEG-1 encoder and the MPEG-1 (TM5) adaptive quantisation weighting factor will be used in MPEG-4.
Moreover, during the pre-screening a new sequence named ‘Birthday’ was presented. This sequence was produced by BBC, in reply to a call for critical segmented material issued at the previous MPEG meeting. Although the sequence meets most of the requirements indicated in the call, it was decided not to use it in the verification test because in that sequence the whole information of each object is always available, even when two objects overlap. This would put object-based coding at a disadvantage, since for this kind of sequence it wastes bandwidth encoding hidden parts.
Finally, the test methods to be used in the two tests (i.e. high and low bitrates) were agreed.
Details about coding parameter settings and test methods are given in document N.2334.
In Dublin, a first pre-screening on the material to be used for the scalability verification test was conducted.
The suitability of the sequences to be used in this test was discussed, and it was suggested that the sequences should be representative of potential applications. It was also realised that negative effects may be introduced when the frame rate of only one of two foreground objects is improved.
Moreover, it was recognised that the rate control is an important element and it should be used in the production of test conditions.
Therefore, in the ad hoc group for video verification tests the suitability of available MPEG sequences will be investigated and a suitable rate control strategy will be developed.
Finally, taking into account the considerable amount of work to be done, it was decided to remove the spatial scalability from the first round of this verification test.
The material for a second pre-screening will be prepared according to the considerations explained above and it will be presented in Atlantic City.
Archival records of audio, video and audiovisual source material
In Dublin the Test Subgroup has started to organise the distribution of video and audiovisual source material on CD-ROMs.
Document N. 2336 addresses the logistics for such a distribution.
Both Audio and Video Subgroup representatives expressed an interest in archiving all the source material donated to MPEG. The establishment of these archives and the policy for the distribution of the material within MPEG will be discussed at the next meeting in Atlantic City. The distribution will begin with the sequences for which the Convenor has received written permission to print them on CD-ROM.
1) Video decoder complexity analysis for the definition of profile and levels based on the QoS activity
Ad hoc activities have been reviewed. Contribution M3615 (“Some results on MB coding complexity”) shows promising results that should be validated with other optimized decoders. In any case, these results are in reasonable agreement with another encoder approach (a programmable macroblock processor from the University of Munich) and seem a good basis for deriving meaningful video “levels” based on complexity.
A new definition of video complexity has been derived from these results. It is based on a linear combination of macroblock coding modes, each weighted by a corresponding complexity coefficient derived from the experiments. This new complexity definition has been approved by the ISG, has been proposed for the setting of “Levels” in video, and has been reported in output document N2318. According to the ISG discussions, this definition is much more closely related to decoding complexity than the existing one. The adoption of the new definition is under discussion in the Video group, and comments from National Bodies are requested.
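The metric described above can be sketched as a weighted sum over macroblock coding-mode counts. The mode names and weight values below are purely illustrative assumptions, not the coefficients reported in N2318.

```python
# Hypothetical sketch of a linear macroblock-mode complexity metric:
# decoding complexity = sum over modes of (mode count x complexity weight).
# Mode names and weights are invented for illustration only.
MODE_WEIGHTS = {
    "intra": 1.0,      # assumed relative cost of an intra-coded MB
    "inter": 0.7,      # assumed cost of a motion-compensated MB
    "inter_4v": 0.9,   # assumed cost of an MB with four motion vectors
    "skipped": 0.1,    # assumed cost of a skipped MB
}

def decoding_complexity(mode_counts):
    """Weighted sum of macroblock coding-mode occurrences."""
    return sum(MODE_WEIGHTS[mode] * count for mode, count in mode_counts.items())

# Example: a frame with 50 intra, 200 inter, 30 four-vector and 116 skipped MBs
print(decoding_complexity({"intra": 50, "inter": 200, "inter_4v": 30, "skipped": 116}))
```

Such a figure, computed per frame or per bit-stream, is what would then be bounded to define a "Level".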
Results from other optimized decoders, necessary to validate and improve the QoS results, have been requested. Bit-streams with critical test sequences can be made available and there is no need to give away source code, but no volunteers for the work have been found.
Further steps for accomplishing the QoS activity mandate are:
to verify the applicability of the defined complexity metrics for bounding the intrinsic complexity of video bit-streams, and to provide guidelines for their use;
to verify the applicability of the complexity metrics for classifying different decoders in terms of conformance and QoS.
2) Computational Graceful Degradation for SNHC video and Synthetic Audio.
A review of the results reported in contribution M3567 (Computational Graceful Degradation Analysis in SNHC) has made it possible to identify the main complexity dependencies involved in 3-D rendering. The algorithms considered cover the various filtering methods for mapping points, and MIP mapping (extraction of points from images at different resolutions for large viewing angles, for which the points are tri-linearly interpolated). A complete table of dependencies has been defined. In conclusion, four parameters seem to completely define SNHC rendering complexity. They are:
the number of triangles,
the number of vertices,
the number of edges,
the number of visible pixels.
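A simple way to picture how these four parameters could be combined is a weighted linear estimate. The linear form and the weights below are assumptions for illustration; the actual dependencies are those tabulated in M3567.

```python
# Illustrative rendering-complexity estimate from the four SNHC parameters.
# The linear model and weight values are assumptions, not taken from M3567.
def snhc_rendering_complexity(n_triangles, n_vertices, n_edges, n_visible_pixels,
                              w_tri=1.0, w_vert=0.5, w_edge=0.3, w_pix=0.01):
    """Combine the four SNHC parameters into a single complexity figure."""
    return (w_tri * n_triangles + w_vert * n_vertices
            + w_edge * n_edges + w_pix * n_visible_pixels)

# Example: a small mesh rendered into a 100x100-pixel region
print(snhc_rendering_complexity(500, 300, 800, 10_000))
```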
SNHC-CGD experiments to verify the complexity analysis have been defined and are described in output document N2317. Goals, conditions of the experiments and a list of relevant bitstreams and models for performing the profiling experiments are reported. Volunteers for the experiments and assistance from the SNHC group have been identified.
3) Complexity analysis for Structured Audio Orchestra Language (SAOL)
Ad hoc group activities and contributions have been reviewed. Contribution M3602 (A method for measuring complexity in Structured Audio) explains that SAOL is a language for describing algorithms for audio processing and synthesis, and that there is therefore no way to obtain a statistical description or a worst-case complexity. A complexity evaluation approach independent of the decoder platform is necessary. Following this approach, SA bitstreams are partitioned into: 1) variables and tables, 2) memory accesses, 3) summing buses, 4) statements and expressions, 5) core opcodes. Core opcodes are further divided into four groups. According to this approach, a vector of generic and opcode operation counts can be extracted from each SA algorithm and used to characterize SA complexity. The proposed dimension of the vector is 12.
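The vector-extraction idea can be sketched as counting, per algorithm, how many operations fall into each category. The event trace and the flat five-category list below are illustrative placeholders; M3602's actual vector has 12 components because core opcodes are split into four groups.

```python
from collections import Counter

# Hedged sketch of M3602's platform-independent approach: reduce an analysed
# SA algorithm to a vector of per-category operation counts. Category names
# follow the partition described above; the trace is invented for illustration.
CATEGORIES = ["variables_tables", "memory_accesses", "summing_buses",
              "statements_expressions", "core_opcodes"]

def complexity_vector(events):
    """Count occurrences of each operation category in an analysed algorithm."""
    counts = Counter(events)
    return [counts.get(c, 0) for c in CATEGORIES]

trace = ["memory_accesses", "core_opcodes", "core_opcodes", "statements_expressions"]
print(complexity_vector(trace))  # per-category counts, in CATEGORIES order
```

Two algorithms can then be compared by their vectors without ever profiling a concrete decoder.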
A second method, described in contribution M3611 (Another method for measuring complexity in structured audio), is instead based on profiling a specific reference SA decoder. Only real-time synthesis has been considered. The complexity estimation measure is based on only five types of operations. The output is the total number of operations; core mathematical operations are proposed to be counted as equivalent to five non-opcode mathematical operations. In conclusion, the method considers seven parameters and does not include opcode optimization (it can be considered a measure of worst-case SA decoding complexity when opcodes are used).
The ISG and SA subgroup agreed that the first method is better suited to the goal of platform-independent complexity analysis. The software tool designed for such measurements (provided by EPFL) will be available for experiments and tests within two weeks after the end of the Dublin meeting. Output document N2282 (Study of complexity of SAOL) describes the aims and conditions of the experiments. The main results expected for the next meeting are a measure of the variability of the vector components for some typical algorithms, and evidence of whether a separation between opcodes for audio effects (libraries) and synthesis libraries is meaningful.
4) Complexity evaluation of MPEG-4 components. The analysis of the overall MPEG-4 System complexity, an issue that was not discussed at the previous Tokyo meeting, was started at the Dublin meeting.
Contribution M3631 (Computation Complexity Profiling of the IM-1 MPEG-4 Player) reports a first preliminary analysis of the various MPEG-4 systems components. The results are based on profiling the IM1 player 0.4.5 on a Pentium MMX 233 MHz platform. Results for only two sequences are reported. This interesting contribution presents the first complexity comparisons between natural audio, natural video and synthetic video (a tool for facial animation). It also raises the need for an IM1 “encoder” in order to be able to generate content from which complexity measures can be extracted in a straightforward and reliable manner. Specific IM1 content is indeed necessary for the complexity evaluation of other SNHC tools, of synthetic audio, of other systems nodes (AudioFX, for instance), and for evaluating the performance and complexity of the system synchronization model.
5) Various complexity issues: texture coding complexity, chroma keying shape coding, matching pursuit, padding complexity. A summary of the reflector mail exchanges about texture coding complexity has been presented to the group. Three wavelet modes have been analyzed; their complexity is comparable, so a selection among them cannot be based on complexity considerations. Contribution M3645 (Complexity analysis and guidelines for profile definition of Still Texture Coding Implementation) has been reviewed. An output document (N2316, Recommendations for still texture coding (wavelets) implementations), summarizing the results presented at this meeting and contributions from previous meetings, has been edited and approved by the group.
Chroma keying shape coding complexity has been discussed, reviewing the reflector mail exchanges. Although an absolute complexity comparison of chroma keying shape coding with Simple Profile decoding has been requested of the ISG, no action has been taken, since the request has not been officially raised by the Requirements or Video group.
Contribution M3572 (VLSI implementation of repetitive padding: cost and architecture) reports the architecture of a “padding” co-processor. Considering also previous contributions, the ISG has proposed new VLSI architectures for the most processing-demanding operations of MPEG-4, “composition” and “padding”.
Matching pursuit complexity has been briefly evaluated on the basis of previous contributions. No conclusion has been reached, since no official request from the Requirements or Video group has been raised.
6) ISG FAQ. A list of ISG FAQs has been prepared; volunteers to prepare answers for each question and a coordinator for the ISG FAQ have been decided:
- Update of existing FAQ and collection of new ones (Bormans)
- CGD related questions (Mattavelli)
- Complexity measurements questions (La Fruit)
- QoS related (Mattavelli)
Liaison meeting report
Source: Barry Haskell (AT&T Research), Chair
The Liaison group considered the following Dublin input documents:
SC29/N2543 from VRML on External Authoring Interface (EAI)
SC29/N2552 from VRML on EAI
SC29/N2577 from FIAPF requesting Category A Liaison
SC29/N2611 from ITU-T SG16 Q11 on MPEG-4 Video
SC29/N2608 from CEN
M3514 from ITU-T SG12 on MPEG-4 Video verification
M3515 from ITU-T SG12 on MPEG-4 Audio verification