Research areas


General research areas.
Image/Video Coding, including the emerging technologies around reconfigurable video coding, scalable video coding, multiview video coding, high-performance video coding, next-generation video coding, as well as efficient hardware/software implementations for high-definition real-time video coding. Image/Video Analysis, including compressed-domain object detection, multi-target identification/tracking, image (extraction, restoration, and enhancement), and facial expression recognition. Other Topics, including Real-time Embedded Systems, Video Surveillance, Image/Video Watermarking, HW/SW Co-design of Multimedia Systems, as well as Reconfigurable and Multicore Architectures.

The following book has 90 chapters. Each chapter can lead to a project.

Advanced Concepts for Intelligent Vision Systems

7th International Conference, ACIVS 2005, Antwerp, Belgium, September 20-23, 2005. Proceedings

Book Series: Lecture Notes in Computer Science

Publisher: Springer Berlin / Heidelberg

ISSN: 0302-9743 (Print) 1611-3349 (Online)

Volume 3708/2005

Subject Collection: Computer Science

SpringerLink Date: Wednesday, October 05, 2005

Pl access from


  1. Radulovic et al, “Multiple description video coding with H.264/AVC redundant pictures”, IEEE Trans. CSVT, vol. 20, pp. 144-148, Jan. 2010.

See also references cited at the end of this paper.
  2. V. K. Goyal, “Multiple description coding: Compression meets the network”, IEEE SP Magazine, vol. 18, pp. 74-93, Sept. 2001.
These papers can lead to several projects.


F. Porikli, F. Bashir and H. Sun, “Compressed domain video object segmentation”, IEEE Trans. CSVT, vol. 19, 2009.

High efficiency video coding (HEVC)

TMuC: Test model under consideration, Joint Collaborative Team on Video Coding (JCT-VC), 2010. Has info on developments in HEVC.

NGVC: Next generation video coding.

Some of the tools contributing to the gain are:

(1) RD Picture Decision

(2) RDO_Q (from Qualcomm)

(3) MDDT (from Qualcomm)

(4) New Offset (from Qualcomm)

(5) Adaptive Interpolation Filter (from Qualcomm & Nokia)

(6) Block Adaptive Loop Filter (BALF) (from Toshiba)

(7) Bigger Blocks and Bigger transform (32x32 and 64x64) (Qualcomm)

(8) Motion Vector Competition (France Telecom)

(9) Template matching
JVT KTA reference software (KTA: key technical areas)
G.J. Sullivan and J.-R. Ohm, “Recent developments in standardization of high efficiency video coding”, Proc. SPIE, vol. 7798, pp. 77980V-1 thru V-7, San Diego, CA, Aug. 2010. Many other papers.

IEEE Trans. on CSVT, vol. 20, Special section on high efficiency video coding (several papers), Dec. 2010.

IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 7, Nov. 2011, Introduction to the Issue on Emerging Technologies for Video Compression (several papers on HEVC).


(see M. Karczewicz et al, “A hybrid video coder based on extended macroblock sizes, improved interpolation and flexible motion representation”, IEEE Trans. CSVT, vol. 20, pp. 1698-1708, Dec. 2010.) (several other papers)
T. Wiegand, B. Bross, W.-J. Han, J.-R. Ohm, and G. J. Sullivan, WD3: Working Draft 3 of High-Efficiency Video Coding, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T VCEG and ISO/IEC MPEG, Doc. JCTVC-E603, Geneva, CH, March 2011. (HEVC standard emerging)

Seyoon Jeong, Sung-Chang Lim, Hahyun Lee, Jongho Kim, Jin Soo Choi, and Haechul Choi, “Highly Efficient Video Codec for Entertainment-Quality”, ETRI Journal, vol. 33, pp. 145-154, April 2011.


This paper introduces many new coding tools in H.264 resulting in improved coding efficiency. Evaluate these tools and explore other tools for improving the coding efficiency of H.264 leading to H.265. Also review the references.

ITU-T documents

The latest draft meeting report for the last meeting is at:
There is a lot of information in that meeting report, and it contains pointers to where to find more (e.g., where to find documents and software).
One trick is to search the report for the string "Decision:".
The output documents (of that meeting and the preceding one) are another good place to look. And the AHG reports and CE (core experiments) reports.
We have another meeting starting in two weeks, so a lot more information will be showing up soon, and of course some aspects of the design will change.

IEEE Trans. Circuits and Systems for Video Technology has a special section (several papers) devoted to the project. Vol. 20, Dec. 2010.

It would be helpful if the students could actually contribute to the effort instead of just writing reports that only you will read.
Two ways to contribute include testing and improving the reference software and improving the draft standard text specification. It is easy to find places in the text that are incomplete, incorrect, vague, grammatically bad, inconsistent, or otherwise needing improvement. (For the text especially, finding problems is not enough – what we need are the solutions to the problems.)
Obviously though, we would only want competent help, not interference or messages on our email reflector saying things like "Hello, I am working on a report for my class project about the entropy coding in HEVC, so please explain how it works to me and give me some software and test results to put into my report – which, by the way, is due next week."


These documents will also help you understand the KTA software.

VCEG-AE08 [J. Jung, T. K. Tan] KTA 1.2 software manual

VCEG-AE09 [J. Jung, G. Laroche] Performance evaluation of KTA 1.2 software

Wiener Spatial Filtering in VCEG KTA: please check the various post-filter and loop-filter proposals. It is also useful to look at the MPEG-4 AVC/H.264 proposals about the post-filter hint SEI message: JVT-S030, JVT-T039, JVT-U035.

These documents can be accessed from the JVT ftp site:
Implement loop filter (on/off) and post filter (on/off) on 4:4:4 HD sequences and evaluate the performance. Design other post filters.

Variable block size spatially varying transforms. See papers:

  1. C. Zhang, K. Ugur, J. Lainema and M. Gabbouj, “Video coding using spatially varying transform,” T. Wada, F. Huang and S. Lin (Eds): PSIVT, LNCS 5414, pp. 796-806, 2009. {kemal.ugur, jani.lainema}, {cixun.zhang, moncef.gabbouj}@tut.fi

  2. C. Zhang, K. Ugur, J. Lainema and M. Gabbouj, “Video coding using variable block size spatially varying transforms”.

These two papers address variable block size and also location in a MB (where to apply the transform). Implementing these two algorithms can lead to new research topics.
J. Chen et al, “Efficient video coding using legacy algorithmic approaches”, IEEE Trans. on Multimedia (accepted, Sept. 2011). Will have access to this paper soon. This has led the MPEG standards group to work on a “Type-1 video coding” standard. This paper can lead to several research projects, including comparison with H.264.
Video codec projects please access

WebM project
WebM supports VP8 video and Vorbis audio. Explore and implement this. (VP8: On2 Technologies, now Google; Adobe Flash)



G. Anbarjafari, and H. Demirel, "Image Super Resolution Based on Interpolation of Wavelet Domain High Frequency Subbands and the Spatial Domain Input Image," ETRI Journal, vol.32, no.3, June 2010, pp.390-394. DOI:10.4218/etrij.10.0109.0303

Abstract :

In this paper, we propose a new super-resolution technique based on interpolation of the high-frequency subband images obtained by discrete wavelet transform (DWT) and the input image. The proposed technique uses DWT to decompose an image into different subband images. Then the high-frequency subband images and the input low-resolution image have been interpolated, followed by combining all these images to generate a new super-resolved image by using inverse DWT. The proposed technique has been tested on Lena, Elaine, Pepper, and Baboon. The quantitative peak signal-to-noise ratio (PSNR) and visual results show the superiority of the proposed technique over the conventional and state-of-art image resolution enhancement techniques. For Lena’s image, the PSNR is 7.93 dB higher than the bicubic interpolation. File; interpolation.

Key word :

Static image super resolution, discrete wavelet transform.



Implement this and compare with various other interpolation techniques using MSE, PSNR, UIQI, SSIM, PEVQ (perceptual evaluation of video quality), CZD (Czenakowski distance, which measures differences between pixels) etc.
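As a starting point for the comparison above, the two simplest metrics (MSE and PSNR) can be sketched as follows. This is a minimal sketch, assuming 8-bit grayscale images stored as lists of rows; the function names are illustrative and do not come from the paper.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized images (lists of rows)."""
    if len(ref) != len(test) or any(len(a) != len(b) for a, b in zip(ref, test)):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_ref, row_test in zip(ref, test):
        for a, b in zip(row_ref, row_test):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / err)
```

For a super-resolution experiment, `ref` would be the original high-resolution image and `test` the image reconstructed from its downsampled version.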


JSVM: Joint scalable video model

You can find more information on the current implementation of rate control in the JSVM reference software in the JVT document JVT-W043:

Note that this RC (rate control) algorithm controls the bit rate only on the base layer. The enhancement layers are still being coded with the fixed pre-determined QP values.

JM reference software manual

B.M.K. Aswathappa and K.R. Rao, “Rate-Distortion Optimization (RDO) using Structural Information in H.264 Strictly Intra-frame Encoder”, IEEE Southeastern Symposium on Systems Theory, pp.367-370, Tyler, TX, March 2010.
Extend the above technique - RDO based on SSIM (structural similarity metric) - to inter frame encoder in H.264.
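The idea of replacing the usual squared-error distortion with a structural one can be sketched as below. This is only an illustration, assuming a single-window SSIM over a flattened block, distortion defined as 1 - SSIM, and a Lagrangian cost J = D + lambda * R; the function names and the candidate format are hypothetical and are not taken from the cited paper or from the JM software.

```python
def ssim_block(x, y, peak=255):
    """Single-window SSIM between two equally sized blocks (flat lists)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1 = (0.01 * peak) ** 2  # usual stabilizing constants
    c2 = (0.03 * peak) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))

def rd_cost(orig, recon, bits, lam):
    """Lagrangian cost with distortion measured as 1 - SSIM."""
    return (1.0 - ssim_block(orig, recon)) + lam * bits

def best_mode(orig, candidates, lam):
    """candidates maps a mode name to (reconstructed block, bits used)."""
    return min(candidates,
               key=lambda m: rd_cost(orig, candidates[m][0], candidates[m][1], lam))
```

In an inter-frame extension, the candidates would be the reconstructions produced by the different inter partition modes instead of the intra prediction modes.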


LSI Logic Corporation: H.264/MPEG-4 AVC Video Compression Tutorial, available in:

Panasonic Corporation: AVC-Intra (H.264 Intra) Compression Tutorial, available in:
(1) K. Yu et al, “Practical real time video codec for mobile devices,” vol. III, pp. 509-512, ICME 2003.
Developed a practical low complexity real-time video codec for mobile devices. Reduces computational cost in ME, integer DCT, DCT/quantizer bypass. Applied to H.263. Extend this to H.261, MPEG-1,2,4 (MPEG-4 Visual, SP, ASP) and to H.264 baseline profile. SP: simple profile, ASP: advanced simple profile.

Pl access the paper P. Carrillo, H. Kalva and T. Pin, “Low complexity H.264 video encoding”, SPIE, vol. 7443, paper # 74430A, Applications of digital image processing, Aug. 2009. By applying machine learning to video coding, the authors are able to reduce the complexity of a highly optimized encoder by about 63%. This is based on reducing the complexity of mode selection (16x16 to 4x4 block sizes) for ME/MC in inter frames, which is computationally exhaustive. This may be very useful in mobile devices (SMBA project - Korea). Pl implement this algorithm (simulation) and verify the results obtained in this paper. Explore implementing this algorithm in AVS China video and SMPTE VC-1 in order to reduce the encoder complexity. Both these standards have multiple block size ME/MC functions similar to those in H.264.

(2) G. Lakhani, “ Optimal Huffman coding of DCT blocks”, IEEE Trans. CSVT, vol.14, pp.522-527, April 2004.
Huffman coding in JPEG baseline. Can similar techniques be applied to other video coding standards (H.261, MPEG series, H.263 3D-VLC etc.)? See Table I about # of bits for each image (comparison).
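For readers who want a baseline to compare against, the textbook Huffman construction is sketched below. It is the classical algorithm, not the optimized scheme of the paper, and the symbol alphabet (for example JPEG run/size pairs) is left to the caller as an assumption.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from a sequence of symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table for that subtree).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def coded_length(symbols, code):
    """Total number of bits needed to code the sequence."""
    return sum(len(code[s]) for s in symbols)
```

Counting bits with `coded_length` for a default table and for a table trained on the image itself gives the kind of per-image comparison the note above refers to.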
(3) M. Horowitz et al, “H.264 baseline profile decoder complexity analysis”, IEEE Trans. CSVT, vol. 13, pp. 704-716, July 2003 and V. Lappalainen, A. Hallapuro and T.D. Hamalainen, “Complexity of optimized H.26L video decoder implementation”, IEEE Trans. CSVT, vol. 13, pp. 717-725, July 2003.
Develop similar complexity analysis for H.264 Main Profile (both encoder/decoder) and compare with MPEG-2 Main Profile.
Doctoral thesis
Kemal Ugur, “ Improved prediction methods for low complexity, high quality video coding”  8 Nov. 2010., Tampere Univ. of Technology, Tampere, Finland

This thesis can be downloaded from
This can lead to research projects.
Reference papers:

a) A. Molino et al, “Low complexity video codec for mobile video conferencing”, EUSIPCO 2004, Vienna, Austria, Sept. 2004.

b) M. Li et al, “DCT-based phase correlation motion estimation”, IEEE ICIP 2004, Singapore, Oct. 2004.

c) M. Song, A. Cai and J. Sun, “Motion estimation in DCT domain”, Proc. 1996 IEEE Intl. Conf. on Communication Technology, vol. 12, pp. 670-674, Beijing, China, 1996.

d) S-F. Chang and D.G. Messerschmitt, “Manipulation and compositing of MC-DCT compressed video”, IEEE JSAC, vol. 13, pp. 1-11, Jan. 1995.
ME/MC (generally implemented in spatial domain) is computationally intensive. ME/MC in transform domain may simplify this. Implementation complexity is a critical factor in designing codecs for wireless (mobile) communications. Consider this and other functions in a codec based on e.g., H. 264 (baseline profile) and other standards.
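To see why spatial-domain ME is expensive, consider a full-search block matcher with the sum of absolute differences (SAD) as the cost. The sketch below is a generic illustration, assuming frames stored as lists of rows; the function names are hypothetical and it is not taken from any of the cited papers.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement within +/- search_range; return (best_mv, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Each block costs up to (2 * search_range + 1)^2 SAD evaluations of n * n samples each, which is the load that transform-domain or fast-search methods try to reduce.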
3a) S.S. Basavanhalli, “Complexity Analysis of H.264 baseline decoder on ARM9TDMI processor”, M.S. Thesis, EE Dept., UTA, Dec. 2005.

3b) T. Bhatia, “SIMD optimization of H.264 high profile HD decoder”, M.S. Thesis, EE Dept., UTA, Dec. 2005.

(4) Extend these theses to H.264 encoders.
Topic for MS thesis/research

"Transcoding from H264 Main/High profile to H264 Baseline Profile"

(without dropping B frames. i.e. to convert B frames to P frames efficiently)


Apple iPhone and many other phones support Baseline profile decoding of H.264 video, while some companies broadcast in Main profile, hence the need for such transcoding (just an example).

J. But, “A novel MPEG-1 partial encryption scheme for the purposes of streaming video”, Ph.D. Thesis, ECSE Dept., Monash University, Clayton, Victoria, Australia, 2004. (A copy of the thesis is in our lab.) Also papers by J. But (under review):

Implementing Encrypted Streaming Video in a Distributed Server Environment - Submitted to IEEE Multimedia

An Evaluation of Current MPEG-1 Ciphers and their Applicability to Streaming - Submitted to ICON 2004

KATIA - A Partial MPEG-1 Video Stream Cipher for the purposes of Streaming - Submitted to ACM Transactions on Multimedia Computing, Communications and Applications

In view of the improvements in networks, internet services, DSL, cable modems, satellite dishes, set-top boxes, hand-held mobile devices etc., video streaming has lots of potential and promise. One direct and extensive application is in the entertainment industry, where a client can browse, select and access movies, video clips, sports events, and historical/political/tourist/medical/geographical/scientific encoded video using a video on demand (VoD) service. A major problem for content providers and distributors is in providing this service to bona fide/authorized clients and collecting the regulated revenues without unauthorized persons duplicating/distributing the protected material. While encryption schemes have been developed for storage media (DVD, Video CD etc.), there is an urgent need to extend and implement this approach to video streaming over public networks such as internet, satellite links, terrestrial and cable channels. The thesis by But addresses this highly relevant and beneficial subject. Encryption techniques are applied to MPEG-1 coded bit streams such that, with the proper (authorized) decryption key, clients can access and watch the video of choice without duplication/distribution. This approach requires a thorough understanding of MPEG-1 encoder/decoder algorithms together with video/audio systems, as well as the encryption details.
The author has proposed a range of modifications to the distributed server design that will lead to lower implementation costs and also increase the customer base. The development of an MPEG-1 partial selection scheme for encryption of streaming video is significant since the current encryption algorithms are mainly designed for encryption and protection of stored video. Extension of the MPEG-1 cipher to MPEG-2 bit streams is discussed in general terms with details left for further research. Additional research areas are developing encryption schemes for other encoded bit streams based on MPEG-4 Visual, H.263 and the emerging H.264/MPEG-4 Part 10.
Further research: Apply/extend/implement these techniques to video streaming based on MPEG-2, MPEG-4 Visual, H.263 and H.264/MPEG-4 Part 10 (encryption, authentication, authorization, robustness, copyright protection etc.). Pl see chapter 7 (Conclusion) of the thesis for summary and further research.
(5) H.264/MPEG-4 Part 10, the latest video coding standard, specifies only video coding, unlike MPEG-1,2,4, H.263 etc. (see IEEE Trans. CSVT, vol. 13, July 2003, Special issue on H.264/MPEG-4 Part 10; also several papers on H.264). For all video applications, audio is essential.
Investigate multiplexing of H.264/MPEG-4 Part 10 (encoded video) with encoded audio based on MPEG-2,4 Systems compatibility at the transmitter side, followed by inverse operations (demultiplexing into video and audio bit streams and decoding these two media along with the lip-sync and other aspects) at the receiver side. There are several standards/non-standards based algorithms for encoding/decoding audio (M. Bosi and R.E. Goldberg, “Introduction to digital audio coding standards”, Norwell, MA: Kluwer, 2002).
H.264/MPEG-4 Part 10 video can be in various profiles/levels, and likewise the audio (mono, stereo, surround sound, lossless etc.), aimed at various quality levels/applications and at various bit rates. This research can lead to several M.S. theses. This research also has practical/industrial applications.
Below are the comments by industry experts actively involved in the video/audio standards.

  • Just like MPEG-2 video, the audio standards used in broadcast applications are defined by application standards such as ATSC (US Terrestrial Broadcast), SCTE (US/Canada Cable), ARIB (Japan) and DVB (Europe). ATSC and SCTE specify AC-3 (Dolby) audio while DVB specifies both MPEG-1 audio as well as AC-3. ARIB specifies MPEG-2 AAC.

  • The story for audio to be used with H.264 is more complex. DVB is considering AAC with SBR (called AAC plus) while ATSC has selected AC-3 plus from Dolby. In addition, for compatibility all the application standards will continue to use the existing audio standards (AC-3, MPEG-1 and MPEG-2 AAC).

  • The glue to all of these is the MPEG-2 transport that provides the audio/video synchronization mechanism for all the video and audio standards.

I have the document ETSI TS 101 154 V1.6.1 (2005) DVB: Implementation guidelines for the use of video and audio coding in broadcasting applications based on the MPEG-2 transport stream (file: ETSI-DVB).
Consider MPEG-2 and MPEG-4 SYSTEMS for multiplexing the video/audio coded bit streams.

WEB SITES for multiplexing/de multiplexing audio/video bit streams. (check the mp4creator tool).


Both of the above have their advantages and drawbacks, but can help you do what you want (multiplexing and splitting of any files including AVC coded bitstreams)

(6) MPEG-2 MULTIPLEXER FOR TS and PS (TS – Transport stream, PS – Program stream) page may help you.
More specifically,

implements "An ISO-13818 compliant multiplexer for generating MPEG2 transport and program streams."

Harishankar has implemented multiplexing/demultiplexing of H.264 video with AAC audio (including encoding/decoding) along with lip-synch as part of his M.S. Thesis. Extension to H.264 high profile video and AAC/SBR audio (also 5.1 audio channels, i.e. surround sound) is a worthwhile M.S. research topic.
see file ISPACSKS3
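The timing side of such a multiplexer can be sketched in a few lines. The sketch below assumes the 90 kHz presentation time stamp (PTS) clock of MPEG-2 Systems and 1024 samples per AAC frame; ordering access units purely by PTS is a simplification (a real transport stream multiplexer also packetizes and follows a buffer model), and all function names are hypothetical rather than taken from the thesis mentioned above.

```python
import heapq

PTS_CLOCK_HZ = 90_000  # PTS values in MPEG-2 Systems tick at 90 kHz

def video_units(n_frames, fps):
    """Return (stream, index, pts) tuples for n_frames video pictures."""
    return [("V", i, round(i * PTS_CLOCK_HZ / fps)) for i in range(n_frames)]

def audio_units(n_frames, sample_rate, samples_per_frame=1024):
    """Return (stream, index, pts) tuples; 1024 samples per frame assumes AAC."""
    return [("A", i, round(i * samples_per_frame * PTS_CLOCK_HZ / sample_rate))
            for i in range(n_frames)]

def multiplex(video, audio):
    """Interleave the two streams in presentation-time order."""
    return list(heapq.merge(video, audio, key=lambda unit: unit[2]))

def demultiplex(muxed):
    """Split the interleaved stream back into video and audio units."""
    video = [u for u in muxed if u[0] == "V"]
    audio = [u for u in muxed if u[0] == "A"]
    return video, audio
```

At the receiver, lip-sync amounts to presenting each decoded video picture and audio frame when the shared clock reaches its PTS.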

(6) H.264/MPEG-4 Part 10 (see item 5 above). Several new profiles/extensions have been developed (e.g., studio and/or digital cinema, 12 bpp intensity resolutions, 4:2:2 and 4:4:4 formats, file and optical disk storage/transport over IP networks etc.).

ATSC (Advanced Television Systems Committee): since July 2008, ATSC supports the ITU-T H.264 video codec. The standard is split in two parts:

  • A/72 part 1: Video System Characteristics of AVC in the ATSC Digital Television System

  • A/72 part 2: AVC Video Transport Subsystem Characteristics

Thus the new ATSC standard A/72 for digital HDTV supports the H.264 video codec.

I am attaching here A/53, A/72 part 1, A/72 part 2 PDF.

Below is the link of the press release, which said that A/72 supports H.264.

See G.J. Sullivan, P. Topiwala and A. Luthra, “The H.264/AVC advanced video coding standard: Overview and introduction to the fidelity range extensions”, SPIE Conf. on applications of digital image processing XXVII, vol. 5558, pp. 53-74, Aug. 2004. This paper discusses the extensions to H.264 including various new profiles (high, High 10, High 4:2:2 and High 4:4:4) and compares the performance with previous standards. G.J. Sullivan, “ The H.264/MPEG-4 AVC video coding standard and its deployment status”, SPIE/VCIP 2005, Vol. 5960, pp. 709-719, Beijing, China, July 2005.

See also Y. Su, Ming-Ting Sun and Kun-Wei Lin, “ Encoder optimization for H.264/AVC fidelity range extensions”, SPIE - VCIP2005, vol. 5960, pp. 2067-2075, Beijing, China, July 2005.
These extensions can lead to several M.S. Theses and possibly Ph.D dissertations.
(7) See K. Yu et al, “Practical real-time video codec for mobile devices”, IEEE ICME 2003, vol. III, pp. 509-512, 2003. They have developed a practical low-complexity real-time video codec for mobile devices based on H.263. Explore/develop similar codecs based on H.264 baseline profile. See also the paper S.K. Dai et al, “Enhanced intra-prediction algorithm in AVS-M”, Proc. ISIMP, pp. 298-301, Oct. 2004. (M is for mobile applications.) AVS is the audio video standard of China.
(8) Implement/evaluate scalability extensions of H.264 (see current JVT documents). JSVM (Joint scalable video model) and SVC (scalable video coding). At present lots of activity on SVC by the JVT. Software is called JSVM. SVC has been finalized.

Do you know of any real-time implementations of SVC decoders (HW/SW) for PCs, STBs, etc.? (July 15, 2009)

You can find an open source software here:
The wiki is here for more information:
The player is MPlayer with a dedicated library for SVC.
We can achieve 720p with 2 enhancement layers in real time.

See the following link for the sequences supported by the decoder.

(9) Design/evaluate/simulate rate control techniques for all profiles/levels in H.264.
(10) Residual color transform (RCT)
In the FRExt (fidelity range extensions) to H.264/MPEG-4 Part 10, a new addition is the residual color transform. In this technique, the input/output and stored reference pictures are in the RGB domain, while the forward and inverse color transformations are brought inside the encoder and decoder for processing of the residual data only. Color transformations are RGB to YCgCo (orange and green chroma) and the inverse. Residual data implies (I assume) intra or motion compensated prediction errors.
Pl see email from Woo-Shik Kim (31-1-2006):
In RCT, the YCgCo transform is applied to the residual signal after intra/inter prediction and before integer transform/quantization at the encoder, and the inverse YCgCo transform is applied to the reconstructed residual signal after dequantization/inverse integer transform and before intra/inter prediction compensation at the decoder.

Since this is not a SVC subject, if you need further discussion you can use the JVT reflector or 4:4:4 AhG reflector.

The e-mail address is the same for both, and [4:4:4] is added to the subject for the 4:4:4 AhG (ad hoc group) reflector.


Best Regards,

Woo-Shik Kim
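The color transform applied to the residuals can be illustrated with the reversible, lifting-based variant usually called YCgCo-R. Using this particular variant here is an assumption made for illustration; the text above only names RGB to YCgCo. The inputs are signed integer residuals, one triple per sample position.

```python
def rgb_to_ycgco_r(r, g, b):
    """Forward lifting-based YCgCo-R transform on integer (residual) samples."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

def ycgco_r_to_rgb(y, cg, co):
    """Inverse transform; undoes the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because every lifting step is undone exactly, the round trip is lossless for any integers, including the negative values that occur in prediction residuals.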

(11) Advanced 4:4:4 profile in H.264/MPEG-4 Part 10

Intra residual lossless DPCM coding is proposed in advanced 4:4:4 Profile of H.264. Implement this and compare with RESIDUAL COLOR TRANSFORM. See JVT-Q035 17-21 Oct. 2005 “Complexity of the proposed lossless intra for 4:4:4”.

Y.L. Lee Sejong Univ., Korea.


The most important thing to know about the High 4:4:4 profile is that we have removed it (or are in the process of removing it) from the standard.  We are working on a new Advanced 4:4:4 profile.  So the prior High 4:4:4 profile should be considered only a historical curiosity for purposes of academic study now.


In answer to your specific question, the primary other difference in the High 4:4:4 profile in addition to support of the 4:4:4 chroma sampling grid in a straight forward fashion similar to what was done to support 4:2:2 versus 4:2:0, was the support of a more efficient lossless coding mode, as controlled by a flag called qpprime_y_zero_transform_bypass_flag.  This flag, when equal to 1, causes invoking of a special lossless mode when the QP' value for the nominal "Y" component (which would be the G component for RGB video) is equal to 0.  In the special lossless mode, the transform is bypassed, and the differences are coded directly in the spatial domain using the entropy coding processes that are otherwise ordinarily applied to transform coefficients.


Best Regards,


-Gary Sullivan
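When the transform is bypassed, the residual samples are coded directly, and a DPCM step can remove the remaining correlation between neighbouring residuals. The sketch below shows one simple reading of that idea, assuming horizontal differencing along a row of residuals; it is an illustration only and is not taken from JVT-Q035.

```python
def dpcm_encode(residual_row):
    """Replace each residual by its difference from the previous one."""
    diffs = []
    prev = 0
    for r in residual_row:
        diffs.append(r - prev)
        prev = r
    return diffs

def dpcm_decode(diffs):
    """Rebuild the residual row by accumulating the differences."""
    row = []
    acc = 0
    for d in diffs:
        acc += d
        row.append(acc)
    return row
```

Since no quantization is involved, decoding reproduces the residuals exactly, which is what a lossless mode requires.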

(12) residual color prediction (H.264)

Pl open these documents. These files can be sources for research projects.

(1) P. Tsai, Y-C. Hu and C.C. Chang, “A progressive secret reveal system based on SPIHT image transmission”, SP: Image Communication, vol. 19, pp. 285-297, March 2004.
The secret image is directly embedded in a SPIHT encoded cover image (monochrome). Sally has extended this to color (RGB to YCbCr). She has also investigated robustness to various attacks. Sally’s thesis and software are in our lab.
Tsai, Hu and Chang suggest adding encryption schemes (DES, RSA) to encrypt the secret image before embedding. The objective is to enhance the security by steganography and encryption. INVESTIGATE THIS.
Ramaswamy has completed his thesis using SHA, DES, RSA for encrypting H.264 video (verify integrity, identify sender/content creator etc.). Both Puthussery and Ramaswamy have all the software (operational). This research topic is viable and relevant.

N. Ramaswamy, “Digital signature in H.264/AVC MPEG4 Part 10”, M.S. Thesis, UTA, Aug. 2004.

S. Puthussery, “A progressive secret reveal system for color images”, M.S. Thesis, UTA, Aug. 2004.


  1. I. Avcibas, N. Memon and B. Sankur, “ Steganalysis using image quality metrics”, IEEE Trans. IP, vol. 12, pp. 221-229, Feb. 2003.

  2. I. Avcibas, B. Sankur and K. Sayood, “ Statistical evaluation of image quality measures”, J. of Electronic Imaging, vol. 11, pp. 206-223, April 2002.

  3. A.M. Eskicioglu and P.S. Fisher, “ Image quality measures and their performance”, IEEE Trans. Commun., vol.43, pp. 2959-2965, Dec. 1995.

  4. A.M. Eskicioglu, “ Application of multidimensional quality measures to reconstructed medical images”, Opt.Eng., vol. 35, pp. 778-785, March 1996.

  5. B. Lambrecht, Ed., “ Special issue on image and video quality metrics”, Signal Process., vol. 70, Oct. 1998.


US PATENTS (US Patent and Trademark Office). While the claims made in these patents can be simulated, no products/devices based on these patents can be used for commercial purposes (proper licensing, patent release etc. must be obtained).

1. US Patent 4,999,705 dated March 12, 1991, A. Puri, “Three dimensional motion compensated video coding”, Assignee: AT&T Bell Labs, Murray Hill, NJ.
2. US Patent 4,958,226 dated Sept. 18, 1990, B.G. Haskell and A. Puri, “Conditional motion compensated interpolation of digital motion video”, Assignee: AT&T Bell Labs, Murray Hill, NJ.
Patent # 1 discusses adaptive 2D or 3D DCT of MxN or MxNxP blocks and a special “zig-zag-zog” scan for the 3D DCT case. Here MxN is the spatial block and P is in the temporal domain. It has a number of interesting features and claims improved compression. It follows the GOP concept (I, P, B pictures) as in MPEG-1,2,4, with a variable number of B pictures or even a variable GOP size. This patent can be the basis of a number of research topics, especially at the M.S. level.

3. US Patent 5,309,232, May 3, 1994, J. Hartung et al, “Dynamic bit allocation for three-dimensional subband video coding”, Assignee: AT&T Bell Labs, Murray Hill, NJ. (See also S-J Choi and J.W. Woods, “Motion-compensated 3-D subband coding of video”, IEEE Trans. IP, vol. 8, pp. 155-167, Jan. 1998.)

This patent has a number of interesting features and can lead to several research topics, especially at the M.S. level.



(1) Fractal lossless image coding. See Proc. of EC-VIP-MC 2003 in our lab. Extend this approach to color images, video etc.
(2) Explore Fractal/DWT (similar to Fractal/DCT) in image/video coding.
(3) Explore fractal/SVD in image/video coding.
See A. Sloan, “Through a glass, darkly: Image recognition with poor quality imagery”, Advanced Imaging, vol. 18, pp. 8-9 and 37, March 2003. Some research areas are suggested.


(1) Dai Yang, Hongmei Ai, C. Kyriakakis and C.-C.J. Kuo, “High-fidelity multi channel audio coding with Karhunen-Loeve transform”, IEEE Transactions on Speech and Audio Processing, vol. 11, pp. 365-380, July 2003.

Review this paper and related ones cited in the references.

KLT is applied to advanced audio coding (AAC) adopted in MPEG-2. Can this technique be extended to other multi channel audio coding algorithms? 
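The inter-channel decorrelation idea can be illustrated on just two channels, where the KLT reduces to a rotation by the principal-axis angle of the 2x2 covariance matrix. This two-channel, closed-form version is a simplification chosen for illustration; it is not the scheme of the paper, and the function names are hypothetical.

```python
import math

def klt_2ch(left, right):
    """Rotate two channels onto their principal axes.
    Returns (theta, (mean_left, mean_right), ch1, ch2)."""
    n = len(left)
    ml = sum(left) / n
    mr = sum(right) / n
    cll = sum((a - ml) ** 2 for a in left) / n
    crr = sum((b - mr) ** 2 for b in right) / n
    clr = sum((a - ml) * (b - mr) for a, b in zip(left, right)) / n
    # Angle that diagonalizes the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * clr, cll - crr)
    c, s = math.cos(theta), math.sin(theta)
    ch1 = [c * (a - ml) + s * (b - mr) for a, b in zip(left, right)]
    ch2 = [-s * (a - ml) + c * (b - mr) for a, b in zip(left, right)]
    return theta, (ml, mr), ch1, ch2

def inverse_klt_2ch(theta, means, ch1, ch2):
    """Undo the rotation and restore the channel means."""
    c, s = math.cos(theta), math.sin(theta)
    ml, mr = means
    left = [c * p - s * q + ml for p, q in zip(ch1, ch2)]
    right = [s * p + c * q + mr for p, q in zip(ch1, ch2)]
    return left, right
```

For strongly correlated channels most of the energy ends up in `ch1`, so `ch2` can be coded with fewer bits. With more than two channels the rotation is replaced by the eigenvectors of the full covariance matrix.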
(2) Pl go to Google and search for HE-AAC or AAC plus (HE is high efficiency). This is an improved multi channel audio coder adopted in MPEG-4 and also by various companies. Also in MP3, where it is called MP3 pro. It is both backward and forward compatible with AAC. (See M. Wolters, K. Kjorling and H. Purnhagen, “A closer look into MPEG-4 high efficiency AAC”, 115th AES Convention, New York, NY, 10-13 Oct. 2003, and P. Ekstrand, “Bandwidth extension of audio signals by spectral band replication”, Proc. 1st IEEE Benelux Workshop on Model Based Processing and Coding of Audio (MPCA-2002), Leuven, Belgium, Nov. 2002.)
Can the KLT approach described in item (1) above be applied to the AAC part of HE-AAC (the other part is SBR, spectral band replication) to further improve the coding efficiency?
(3) Encode H.264 High profile (FRExtensions) video and HE-AAC audio, multiplex the two coded bit streams using MPEG-2 or MPEG-4 systems (or any other), followed by inverse operations at the receiver (demultiplex into video and audio coded bit streams and decode). ISMA (internet streaming media alliance) has adopted H.264 along with HE-AAC for streaming media over internet. Access
(4) Implement VC-1 (based on WMV9) and WMA (Microsoft). Encode both video and audio, multiplex the two coded bit streams followed by demultiplexing, decoding and maintaining lip synch between video and audio.

I have the hard copy of the PP slides on new technologies in MPEG audio presented by Dr. Quackenbush (MPEG audio research group chair, Audio Research Labs) in the one day workshop on MPEG international video and audio standards, HKUST, Hong Kong, on 22 Jan. 2005. Several research projects can be explored based on these slides. This also has some slides on spatial audio coding. Pl see the paper below.

Spatial audio coding

Y.J. Lee et al, “Design and development of T-DMB multichannel audio service system based on spatial audio coding”, ETRI Journal, vol. 31, #4, pp. 365-375, Aug. 2009.
J. Breebaart et al, “MPEG spatial audio coding/MPEG surround: Overview and current status”, Proc. 119th AES Convention, NY, USA, Oct. 2005, Preprint 6447.
ISO/IEC JTC1/SC29/WG11 (MPEG)., “ Procedures for the evaluation of spatial audio coding systems”, Document N6691, Redmond, WA, July 2004.

(5) See the thesis from NTU, Development of AAC-Codec for streaming in wireless mobile applications (E. Kurniawati), 2004. I have this thesis. This research develops various techniques for reducing the implementation complexity while maintaining the same quality, as is desirable for mobile communications. One concept is using odd DFT for both psychoacoustic analysis and MDCT. Extend these techniques to HE-AAC audio (see item (2) above), MPEG-1 Audio and MPEG-2 Audio.

(6) See the paper A. Ehret et al, “Audio coding technology of ExAC”, Proc. ISIMP 2004, pp. 290-293, Hong Kong, Oct. 2004. This paper discusses a new low bit rate audio coding technique based on enhanced Audio Coding (EAC) and SBR (spectral band replication). Multiplex this audio coder with the AVS video coder, demultiplex into audio/video coded bitstreams and decode them to reconstruct the video/audio, followed by lip synch. Consider several audio/video levels/profiles (bit rates, spatial/temporal resolutions, mono/stereo/5.1 audio channels etc.). There are several files on SBR. (See Harishankar’s (UTA-EE Dept.) thesis on H.264 video-AAC audio encode/multiplex/demultiplex/decode/lip synch.)

(See M. Bosi and R.E. Goldberg, “Introduction to digital audio coding standards”, Norwell, MA: Kluwer, 2002.)

(7) In HE-AAC or AAC-plus, reduce complexity (also lossless audio) by using a lifting scheme for MDCT/MDST. See Yoshi’s dissertation (UTA-EE Dept.)
DTS (Digital Theater Systems) audio: multichannel surround sound audio used in theaters etc.

There is one paper in the Audio Engineering Society convention on DTS. This paper can be requested thru inter library loan of the UTA library.
