New wavelet transforms and their applications to data compression


NEW WAVELET TRANSFORMS AND THEIR APPLICATIONS TO DATA COMPRESSION

by

INDERPREET SINGH

M.Tech., Indian Institute of Technology, Delhi, INDIA, 1993
B.E.(Hons.), Panjab University, INDIA, 1992

A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY

in the Department of Electrical and Computer Engineering

We accept this dissertation as conforming to the required standard

Dr. P. Agathoklis, Supervisor, Dept. of Elect. & Comp. Eng.

Dr. A. Antoniou, Supervisor, Dept. of Elect. & Comp. Eng.

Dr. W-S. Lu, Member, Dept. of Elect. & Comp. Eng.

Dr. R. Illner, Outside Member, Dept. of Mathematics and Statistics

Dr. F. Kossentini, External Examiner, Dept. of Elect. & Comp. Eng., UBC

All rights reserved. This dissertation may not be reproduced in whole or in part by photocopy or other means, without the permission of the author.

Supervisors: Dr. P. Agathoklis and Dr. A. Antoniou

ABSTRACT

With the evolution of multimedia systems, image and video compression is becoming the key enabling technology for delivering various image/video services over heterogeneous networks. The basic goal of image data compression is to reduce the bit rate for transmission and storage while either maintaining the original quality of the data or providing an acceptable quality.

This thesis proposes a new wavelet transform for lossless compression of images with application to medical images. The transform uses integer arithmetic and is very computationally efficient. Then a new color image transformation, which is reversible and uses integer arithmetic, is proposed. The transformation reduces the redundancy among the red, green, and blue color bands. It approximates the luminance and chrominance components of the YIQ coordinate system. This transformation involves no floating-point/integer multiplications or divisions, and is, therefore, very suitable for real-time applications where the number of CPU cycles needs to be kept to a minimum.

A technique for lossy compression of an image data base is also proposed. The technique uses a wavelet transform and vector quantization for compression. The discrete cosine transform is applied to the coarsest scale wavelet coefficients to achieve even higher compression ratios without any significant increase in computational complexity. Wavelet denoising is used to reduce the image artifacts generated by quantizing the discrete cosine transform coefficients. This improves the subjective quality of the decompressed images for very low bit rate images (less than 0.5 bits per pixel).

The thesis also deals with the real-time implementation of the wavelet transform. The new wavelet transform has been applied to speech signals. Both lossless and lossy techniques for speech coding have been implemented. The lossless technique involves using the reversible integer-arithmetic wavelet transform and Huffman coding to obtain the compressed bitstream. The lossy technique, on the other hand, quantizes the wavelet coefficients to obtain a higher compression ratio at the expense of some degradation in sound quality. The issues related to real-time wavelet compression are also discussed. Due to the limited size of memory on a DSP, a wavelet transform had to be applied to an input signal of finite length. The effects of varying the signal length on compression performance are also studied for different reversible wavelet transforms. The limitations of the proposed techniques are discussed and recommendations for future research are provided.


Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
List of Abbreviations
Notation
Acknowledgement
Dedication

1 Introduction
  1.1 Historical overview
    1.1.1 Need for compression
  1.2 Contributions of the thesis
  1.3 Thesis organization

2 Overview of Wavelet-Based Compression Methods
  2.1 Introduction
  2.2 The transformation stage
    2.2.1 Spatial decorrelating transforms
    2.2.2 Spectral-decorrelating transforms
    2.2.3 Hybrid transforms: Wavelets
    2.2.4 Definition of multiresolution analysis
      2.2.4.1 Orthonormal basis
      2.2.4.3 Calculation of one dimensional wavelets coefficients
      2.2.4.4 Signal decomposition and reconstruction by using wavelets
      2.2.4.5 Extension of wavelet basis to two dimensions
    2.2.5 Multirate filter banks
    2.2.6 Correlation between multiresolution analysis and filter banks
    2.2.7 Biorthogonal wavelets
    2.2.8 Wavelets in image coding
      2.2.8.1 Reversible wavelets
  2.3 Quantization stage
    2.3.1 Scalar quantization
    2.3.2 Vector quantization
  2.4 Coding stage
    2.4.1 Static coding techniques
    2.4.2 Adaptive techniques
    2.4.3 Embedded coders
  2.5 Conclusions

3 Lossless Data Compression Using an Integer-Arithmetic Wavelet Transform
  3.1 Introduction
    3.1.1 Preliminaries
  3.2 Proposed wavelet transform using integer-arithmetic
    3.2.1 Reversible two-ten transform in lifting scheme
    3.2.2 Coding of wavelet coefficients
    3.2.3 Algorithm for the coder
    3.2.4 Algorithm for the decoder
  3.3 Compression of medical images
    3.3.1 Existing methods
      3.3.1.1 Lossless JPEG predictive coding
      3.3.1.2 Improved predictive coding
      3.3.1.3 Transform-based methods
    3.3.2 Results and comparison
      3.3.2.1 Reason for the better performance of the nonlinear TT filter
  3.4 An integer-arithmetic color-coordinate transformation for color image compression
  3.5 Compression of color images
  3.6 Conclusions

4 A Mixed Transform Technique for Lossy Image Compression
  4.1 Introduction
  4.2 Overview of vector quantization
    4.2.1 Introduction
    4.2.2 Definition of vector quantization
    4.2.3 Optimal vector quantization
    4.2.4 The LBG algorithm
    4.2.5 Codebook initialization
    4.2.6 Compression of images using vector quantization
  4.3 A new mixed-transform technique for low bit-rate image coding
  4.4 Proposed MXDT coder
    4.4.1 Selection of codebook size and width
    4.4.2 Statistical analysis of the wavelet coefficients
    4.4.3 Error correction
  4.5 Decoders
  4.6 Results and discussion
  4.7 Compression of color images
    4.7.1 Proposed color image compression technique
    4.7.2 Results and discussion
    4.7.3 Conclusions

5 On the DSP Implementation of Wavelet Transform for Real-time Speech Compression
  5.1 Introduction
  5.2 Block wavelet transform
    5.2.1 Entropy coding
  5.3 Implementation of reversible integer arithmetic wavelet transform
  5.4 Implementation issues for the TMS320C30 DSP
    5.4.1 Hardware description
    5.4.2 Implementing the block wavelet transform
    5.4.4 Real-time Implementation
  5.5 Results and discussion
  5.6 Conclusions

6 Conclusions and Scope for Future Work
  6.1 Overview
  6.2 Reversible wavelets for lossless image compression
  6.3 Lossy image compression
  6.4 Real-time compression of speech
  6.5 Future research

Bibliography

List of Figures

Figure 2.1 General structure of transform-based image compression system.
Figure 2.2 (a) Uniform bandwidth filter banks. (b) Wavelet filter banks.
Figure 2.3 An M-fold downsampler.
Figure 2.4 Effects of two-fold downsampling in the frequency domain (no aliasing). (a) Spectrum of the input signal. (b) The shifted versions of the original input spectrum used to form the spectrum of the downsampled signal. (c) Spectrum of the downsampled signal.
Figure 2.5 Effects of two-fold downsampling in the frequency domain (with aliasing). (a) Spectrum of the input signal. (b) The shifted versions of the original input spectrum used to form the spectrum of the downsampled signal. (c) Spectrum of the downsampled signal.
Figure 2.6 An M-fold upsampler.
Figure 2.7 The first noble identity.
Figure 2.8 The second noble identity.
Figure 2.9 An M-channel analysis filter bank.
Figure 2.10 An M-channel synthesis filter bank.
Figure 2.11 A 2-channel J-stage wavelet decomposition.
Figure 2.12 A two-channel maximally decimated biorthogonal filter bank.
Figure 2.13 Lifting stage: split, predict, update.
Figure 3.1 Block diagram of analysis and synthesis wavelet filters.
Figure 3.2 Lifting stages for the TT transform (Q denotes quantization).
Figure 3.3 Obtaining the original signal from the TT wavelet coefficients using lifting (Q denotes quantization).
Figure 3.4 The sample prediction neighborhood.
Figure 3.5 Two-dimensional block DCT for lossless compression.
Figure 3.6 Various medical test images.
Figure 3.8 Comparison of compression performance (in bits per pixel) using different algorithms for various USC and MRI images.
Figure 3.9 Dual-wavelet function for the 2/14 filter.
Figure 3.10 Dual-wavelet function for the TS filter.
Figure 3.11 Dual-wavelet function for the TT filter.
Figure 3.12 Comparison between luminance components of the existing YIQ and the proposed Y'I'Q' models.
Figure 4.1 A basic memoryless vector quantizer.
Figure 4.2 Block diagram of the proposed MXDT coder.
Figure 4.3 The Error Correction method.
Figure 4.4 Wavelet-based soft thresholding.
Figure 4.5 (a) Original FACE1 image. (b) Original FACE4 image.
Figure 4.6 (a) Decompressed image using the JPEG decoder. (b) Decompressed image using the SPIHT decoder. (c) Decompressed image using the MXDT decoder. (d) Decompressed image using the MXDT1 decoder.
Figure 4.7 (a) Decompressed image using the JPEG decoder. (b) Decompressed image using the SPIHT decoder. (c) Decompressed image using the MXDT decoder. (d) Decompressed image using the MXDT1 decoder.
Figure 4.8 (a) Zoomed FACE1 image. (b) Zoomed decompressed image using the SPIHT decoder. (c) Zoomed decompressed image using the MXDT1 decoder.
Figure 4.9 (a) Zoomed FACE4 image. (b) Zoomed decompressed image using the SPIHT decoder. (c) Zoomed decompressed image using the MXDT1 decoder.
Figure 4.10 Block diagram of the proposed wavelet-DCT mixed transform coder.
Figure 4.11 Original image.
Figure 4.12 JPEG compression (Compression Ratio = 67:1, PSNR = 27.6 dB).
Figure 4.13 Compression using the proposed technique (Compression Ratio = 67:1, PSNR = 30.7 dB).
Figure 5.1 (a) Relation between CPU time and data length for DWT based on TT transform with different software implementations. (b) Relation between CPU time and data length for IDWT based on TT transform with different software implementations.
Figure 5.2 Block diagram of the real-time implementation of a wavelet-based speech coder on the TMS320C30 DSP.

List of Tables

Table 3.1 Lossless compression ratios of USC images
Table 3.2 Compression ratios for medical test images after pre-processing
Table 3.3 Entropies of test images for the Haar, TS and TT filter-based approaches
Table 3.4 CPU time (in seconds) for test images using Haar and TS approaches
Table 3.5 CPU time (in seconds) for test images using the TT approach
Table 3.6 Filter coefficients for various linear biorthogonal filters
Table 3.7 Comparison between Y (luminance in YIQ mapping) and Y' (luminance in Y'I'Q' mapping)
Table 3.8 Comparison of compression performance (bpp) among different lossless compression techniques
Table 3.9 Total coding times of color images (Y'I'Q') for different coding techniques
Table 4.1 An example showing range correction of the DCT coefficients
Table 4.2 Codebook sizes and widths for a typical test image
Table 4.3 Comparison of compression performance of various schemes
Table 4.4 Subjective evaluation of various schemes
Table 4.5 Codebook sizes and widths for the Y (luminance component) of a typical landscape image
Table 4.6 Comparison in compression performance between the JPEG standard and the mixed transform technique
Table 5.1 CPU times (in seconds) using different software realizations
Table 5.2 Quantization levels for different subbands
Table 5.3 Compressed file size for the proposed method and the coder proposed in [107]
Table 5.4 Compressed file size for various vocoders. The computation times for the proposed technique are given in parentheses

List of Abbreviations

1D              One dimensional
2D              Two dimensional
bpp             Bits per pixel
BRD             Delayed branch instruction
BWT             Block wavelet transform
HVS             Human visual system
CPU             Central processing unit
CR              Compression ratio
CREW            Compression with reversible embedded wavelets
CT              Computerized tomographic (images)
DCT             Discrete cosine transform
DFT             Discrete Fourier transform
DM              Daughter module
DMA             Dual-memory access
DP              Data page pointer
DPRAM           Dynamic programmable random-access memory
DSP             Digital signal processing (processor)
EZW             Embedded zerotree wavelet (coding)
JBIG            Joint Bilevel Image Group
JPEG            Joint Photographic Experts Group
KLT             Karhunen-Loeve transform
LBG algorithm   Linde Buzo Gray algorithm
MRI             Magnetic resonance imaging
MSE             Mean squared error
NINT            Nearest integer (less than or equal to)
PSNR            Peak signal-to-noise ratio
QMF             Quadrature mirror-image filter
RAM             Random-access memory
ROM             Read-only memory
RPTB            Repeat block instruction
RPTS            Repeat single instruction
S+P             Said + Pearlman
SPIHT           Set partitioning in hierarchical trees
SRAM            Static random-access memory
STFT            Short-time Fourier transform
TS transform    Two-six transform
TT transform    Two-ten transform
USC             University of Southern California
VoIP            Voice over Internet protocol
VQ              Vector quantization

Notation

• The symbols $\mathbb{Z}$, $\mathbb{R}$, and $\mathbb{C}$ denote the sets of integers, real numbers, and complex numbers, respectively.

• $L^2(\mathbb{R})$ is a real inner product space of measurable functions such that

$$\int_{-\infty}^{\infty} |f(x)|^2 \, dx < +\infty$$

• The classical norm of $f(x) \in L^1(\mathbb{R})$ is given by

$$\|f\|_{L^1} = \int_{-\infty}^{\infty} |f(x)| \, dx$$

• The $L^2$ norm is given by

$$\|f\|^2 = \int_{-\infty}^{\infty} |f(x)|^2 \, dx$$

• $L^2(\mathbb{R}^n)$ is an inner product space of measurable square integrable n-dimensional functions.

• The classical norm of $f(x_1, x_2, \ldots, x_n) \in L^2(\mathbb{R}^n)$ is given by

$$\|f\|^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} |f(x_1, x_2, \ldots, x_n)|^2 \, dx_1 \, dx_2 \ldots dx_n$$

• The series $V_0, V_1, V_2, \ldots$ refers to the sequence of subspaces in $L^2(\mathbb{R})$.

• The symbols '$\oplus$' and '$\otimes$' refer to the tensor sum and tensor product of two subspaces, respectively.

• The symbol $j$ denotes the quantity $\sqrt{-1}$.

• Functions of a continuous variable are indicated with round parentheses, for example, $f(t)$ where $t \in \mathbb{R}$.

• Functions of a discrete variable are indicated with square brackets, for example, $x[n]$ where $n \in \mathbb{Z}$.

• Bold face symbols are usually used to represent vector or matrix variables.

• The quantity $\mathbf{A}^T$ denotes the transpose of $\mathbf{A}$.

• The z-transform of a discrete sequence $x[n]$ is denoted as $X(z)$ and is defined as

$$\mathcal{Z}\,x[n] = X(z) = \sum_{n=-\infty}^{\infty} x[n] \, z^{-n} \tag{0.1}$$

• The inner product of two functions in the continuous variable case is defined as

$$\langle f(t), g(t) \rangle = \int_{-\infty}^{+\infty} f(t) \, g^{*}(t) \, dt \tag{0.2}$$

and in the discrete variable case as

$$\langle f[n], g[n] \rangle = \sum_{n=-\infty}^{\infty} f[n] \, g^{*}[n] \tag{0.3}$$

• The delta function $\delta[n]$ is defined as

$$\delta[n] = \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases} \tag{0.4}$$

• The notation $f * g$ stands for convolution of $f$ and $g$.

• The notation $\lfloor x \rfloor$ denotes the greatest integer not more than $x$ (or equivalently $x$ is rounded to the nearest integer towards $-\infty$).

• Filter coefficients are assumed to be real unless stated explicitly.

• The sampling period of discrete sequences is assumed to be one unless stated otherwise.

• A filter H has a transfer function $H(z)$. That is, roman font is used to name a particular filter while italics are used to indicate the transfer function associated with it.
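The floor notation above matters whenever a transform is computed in integer arithmetic, because rounding towards minus infinity and truncating towards zero disagree for negative values. The short Python check below is only an illustration of that convention, not code from the thesis; the helper name `floor_half` is hypothetical.

```python
import math


def floor_half(n: int) -> int:
    """Return floor(n / 2) using only integer arithmetic (hypothetical helper)."""
    # In Python, the right shift of a negative int rounds towards minus infinity,
    # so it matches the floor notation rather than truncation.
    return n >> 1


for n in (-5, -4, 5):
    # Shift, math.floor and floor division all agree with the floor convention.
    assert floor_half(n) == math.floor(n / 2) == n // 2

# Truncation towards zero differs from the floor for negative values.
assert int(-2.5) == -2
assert math.floor(-2.5) == -3
```

Other languages may define integer division and shifts on negative operands differently, so the same check would need to be repeated there before relying on it.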

Acknowledgement

I would like to express my deepest gratitude to my supervisors, Dr. Panajotis Agathoklis and Dr. Andreas Antoniou, for their valuable guidance and comments throughout my graduate study. I would also like to thank Dr. Reinhard Illner for all the valuable discussions we had on the mathematical theory of wavelets. His graduate course on wavelets was really helpful.

I would like to thank my wife, Meher, and my brother, Jasjeet, for their emotional support and encouraging me to complete this work.

In addition I would like to mention some important friends who became an integral part of my student life. They include Avijit Bhunia, Sandeep Agarwal, Ranganathan Gurunathan and Subramanian Muthu, to name a few.

Finally, I would like to thank my mom and my dad for all the sacrifices they made for me and for giving me the will to continue.

Dedication

Chapter 1 Introduction

Traditional image compression techniques have been designed to exploit the statistical redundancy present within real world images. The discrete cosine transform is one example of this statistical approach. Removing redundancy can only give a limited amount of compression; to achieve high memory savings, some of the non-redundant information must be removed. By using methods that closely mimic the human visual system (HVS), compression can take into account the importance of each individual coefficient and code accordingly. Psychophysicists and visual psychologists have discovered that the eye filters the image into a number of bands, each approximately one octave wide in frequency. Further, in the spatial domain, the image should be considered to be composed of information at a number of different scales. Wavelets decompose the image into multiple bands at octave frequencies similar to the multiple channel models of the HVS. New compression techniques based on the wavelet transform have become extremely popular. Wavelet technology can help attain higher transmission speeds and clearer pictures than the conventional statistical methods by providing higher compression ratios and reduced computational complexity.

1.1 Historical overview

Wavelets, filter banks, and multi-resolution signal analysis, which have been used independently in the fields of applied mathematics, signal processing, and computer vision, respectively, have recently converged to form a single theory. The first wavelet system was constructed by Haar [1] in 1910. The Haar wavelet system, as it is now known, uses piecewise constant functions as its basis to decompose the input signal. Although wavelet theory has had many close ties to science and engineering since the early 1900s, these linkages were not discovered until the early 1980s. Until that time wavelet theory was a disjoint set of ideas that lacked a unified framework.
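As a rough illustration of how a piecewise-constant Haar basis decomposes a signal, the following Python sketch computes one level of the orthonormal Haar transform as scaled pairwise sums and differences, and then inverts it. It is a generic textbook form, not code from the thesis; the function names and the test signal are assumptions made for the example.

```python
import math


def haar_forward(x):
    """One level of the orthonormal Haar transform of an even-length sequence."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    # Scaled pairwise sums give the coarse (piecewise-constant) approximation.
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    # Scaled pairwise differences give the detail (wavelet) coefficients.
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([s * (a + d), s * (a - d)])
    return x


signal = [4.0, 4.0, 6.0, 6.0, 1.0, 3.0, 8.0, 0.0]
approx, detail = haar_forward(signal)
restored = haar_inverse(approx, detail)
```

Where the signal is locally constant (the first two pairs here), the detail coefficients vanish, which is the property a coder can exploit. This floating-point version is only reversible up to rounding error, unlike the integer-arithmetic transforms the thesis is concerned with.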

[...] mid-1980s. In 1984, the term 'wavelet' was introduced by Grossmann and Morlet [2]. In 1988, a tremendous breakthrough in wavelets was brought about by Daubechies [3]. In this classic paper, Daubechies introduced a family of compactly supported wavelet systems that satisfy the property of orthogonality and perfect reconstruction. In 1989, Mallat [4] presented the theory of multiresolution analysis that later became popular as the Mallat algorithm. This algorithm gave engineers and physicists an accessible way to apply wavelet concepts to diverse fields such as signal processing. In 1992, Cohen, Daubechies and Feauveau [5] established the theory of biorthogonal wavelet filters, commonly known as the CDF filters. Unlike orthogonal wavelet filters (except for the trivial case of the Haar and other Haar-like transforms), biorthogonal filters allow for symmetric finitely-supported basis functions. The symmetry property can offer significant benefits in many applications, image data compression being one of them.

Wavelet transforms have proven to be extremely useful for image coding, as many researchers have shown (e.g., [6]-[23]). Consequently such transforms have been used extensively in many image data compression algorithms. In 1993, Shapiro [9] introduced the concept of embedded coding. His coding scheme, called embedded zerotree wavelet (EZW) coding, was based on wavelet transforms. Due to the obvious advantages of the embedded property in many applications, embedded coding quickly grew in popularity, especially as a means of building unified lossless/lossy compression systems. In 1995, Zandi et al. [10] proposed compression with reversible embedded wavelets (CREW), a reversible embedded image compression system based on some ideas of Shapiro. Not long after, in 1996, Said and Pearlman [11] introduced a new coding scheme known as set partitioning in hierarchical trees (SPIHT), which is conceptually similar to EZW with some implementation improvements.

Apart from the embedded coding techniques, which involve using highly complex coders and decoders, there are other wavelet-based techniques that became popular in certain areas of image compression. One of them is known as vector quantization (VQ). Although the concept of vector quantization has been around for decades [12], applying VQ to a wavelet transform was introduced by Barlaud et al. in 1989 [13]. Modifications to the VQ approach were later reported in various other publications [14], [15].

1.1.1 Need for compression

The need for compression emerges from the fact that any real data gathered through a data acquisition system always has some degree of correlation. The degree of correlation depends on the type of data and the amount of noise in the data. Compression is widely used in many areas, e.g., image processing, speech coding, medicine and the Internet, to name a few. Even though data memories are getting cheaper and faster, there will be cases where compression of data is necessary. For example:

1. To store a moderately large image, say a 512x512 pixel, 24-bit color image, requires about 0.75 MBytes. A video signal typically has 30 frames per second.

2. A standard 35 mm photograph digitized at 12 µm resolution requires about 18 MBytes.

3. One second of NTSC color video entails 23 MBytes.
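The figures in the list above can be checked with a few lines of arithmetic. The Python sketch below is only a sanity check under stated assumptions: the 35 mm frame is taken to be 36 mm x 24 mm scanned in 24-bit color, and the video figure is obtained by assuming each frame has the size of the 512x512 image in item 1. Neither assumption is stated in the text, and the function name is hypothetical.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Uncompressed size of a raster image in bytes."""
    return width * height * bits_per_pixel // 8


MIB = 2 ** 20  # bytes per binary megabyte

# Item 1: 512x512 pixels, 24 bits per pixel.
image_bytes = raw_size_bytes(512, 512, 24)
image_mib = image_bytes / MIB

# Item 2 (assumed 36 mm x 24 mm frame, 12 micrometre pixel pitch, 24-bit color).
photo_width = 36_000 // 12   # frame width in micrometres / pixel pitch
photo_height = 24_000 // 12  # frame height in micrometres / pixel pitch
photo_bytes = raw_size_bytes(photo_width, photo_height, 24)

# Item 3 (assumed: 30 frames per second, each the size of the item 1 image).
video_bytes_per_second = 30 * image_bytes

print(f"image: {image_mib:.2f} MiB")
print(f"photo: {photo_bytes / 1e6:.1f} MB")
print(f"video: {video_bytes_per_second / MIB:.1f} MiB per second")
```

Under these assumptions the three results land on or close to the quoted 0.75, 18 and 23 MBytes, with the small differences coming from whether a megabyte is counted as 10^6 or 2^20 bytes.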

This shows that one can easily find situations where the current hardware is inadequate, either technically or economically. Compression techniques can reduce this gap by

• saving storage,

• saving CPU time, or by

• saving transmission time.

The requirement for compression has become even more critical with the boom in the Internet. Various multimedia protocols such as H.323 and H.324 have built-in compression of audio and video signals to minimize the bandwidth per channel. Some of the Internet applications such as voice over Internet protocol (VoIP) use state-of-the-art audio coders such as G.728 and G.729A [16], [17].

The major requirement of compression is that one should be able to quickly switch between the original and the compressed data.

1.2 Contributions of the thesis

This thesis studies reversible wavelet transforms and their application to lossless compression of speech and image signals. The ideas are drawn from all of the previously described developments, but of most direct importance are the reversible wavelet transforms proposed by Boliek et al. [10] and the S+P transform proposed by Said and Pearlman [11]. Many of the ideas on reversible wavelets are closely related to ideas presented in these two papers. Apart from lossless compression, wavelets have also been used for lossy compression of images. Many existing wavelet-based lossy compression techniques are based on embedded coding techniques [9], [11]. In this thesis, focus is directed towards using vector quantization [12] and wavelets for lossy compression. Some of the topics addressed in detail are as follows:

• Lossless Compression

  - a new reversible transform and comparison with existing transforms with focus on medical image compression

  - extension of the reversible transformations to the class of color images

  - practical issues associated with the implementation of transforms such as computational complexity

• Lossy Compression

  - a mixed transform technique based on the discrete wavelet transform and the discrete cosine transform with vector quantization technique

  - methods of minimizing quantization artifacts using post processing

• Practical Implementation

  - practical issues related to real-time implementation of reversible transform techniques

  - effects of block wavelet transform on compression performance and computational complexity

1.3 Thesis organization

The thesis is divided into six chapters. The first two chapters provide introductory and background material necessary to understand the rest of the thesis. The remaining chapters present a combination of fundamentals and research results in detail.

In Chapter 2, concepts underlying multirate filter banks and wavelet transforms are outlined. The chapter begins by presenting the fundamentals of multirate filter banks and wavelet-based decomposition. A special class of wavelets called reversible wavelets [10] is described. This is followed by methods of quantizing the wavelet coefficients. Two different quantization techniques, namely, scalar quantization and vector quantization, are discussed. Finally, different types of encoders are discussed. The advantages of wavelet coders over other entropy coders are also discussed.

In Chapter 3, a new reversible transform is proposed. The transform is used for the compression of medical images. The performance achieved is compared with that achieved with other existing lossless compression techniques. The transform is then extended to compress color images. A new reversible color image transformation to remove spectral redundancy among the color bands is also presented. Results obtained with the new transform are compared with results obtained with some of the existing reversible transforms.

[...] uses wavelets and vector quantization for the compression of the wavelet coefficients and DCT with scalar quantization for encoding the scaling coefficients. The results are compared with results obtained with state-of-the-art wavelet coders such as the EZW [9] and SPIHT [11] coders. Subjective results are also given.

In Chapter 5, the proposed reversible wavelet transform is used for the compression of speech signals. Real-time implementation was done on the TMS320C30 DSP. Issues concerning the real-time wavelet transform (called block wavelet transform) are also discussed. Both lossless and lossy speech coding are implemented. The results are compared with results obtained with some of the existing coders.

Finally, Chapter 6 summarizes some of the more important contributions and results presented in the thesis. The chapter concludes by putting forward some suggestions and directions for future work.

(24)

6

Chapter 2 Overview of Wavelet-Based

Compression Methods

2 .1

I n tr o d u c tio n

C oding Is th e process of finding a representation o f a given signal. It is usually used to eith er find a representation w ith less redundancy so th a t fewer bits are req u ired to encode the given signal, or to add red u n d an cy in ord er to facilitate erro r d etectio n /co rrectio n of the signal tra n sm itte d over noisy channels. T h e former coding approach is also called compression. W ith the boom in m ultim edia an d high-resolution video containing a high degree of redundancy, com pression has gained widespread a tten tio n in b o th research and industry.

There are essentially two basic kinds of compression schemes: lossless and lossy. In the case of lossless compression, one is interested in reconstructing the data exactly, without any loss of information. Lossless compression is often used for compressing text files and medical images. Lossy compression, on the other hand, can tolerate errors in the image as long as the quality of the signal is 'acceptable' after compression. Although the term 'acceptable' is subjective, an acceptable approximation of the signal is one that is, for practical purposes, visually indistinguishable from the original signal.

The general structure of an image compression system is shown in Figure 2.1, where the input is the original image. The transform is applied to the original image to reduce the spatial or spectral correlation, or both, and generates the transformed image $y[k, l]$. The transformed image is quantized to obtain a quantized image. Quantization is used to discard any coefficients deemed to be insignificant. This results in a loss of information content from the image, and thus the compression is lossy. In the case of lossless compression, no quantization is performed since no loss of information can be tolerated.

[Figure 2.1. General structure of transform-based image compression system: input image, forward transform, coefficient quantizer, entropy coder, output bit stream.]
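The role of the quantizer can be illustrated with a minimal sketch. It assumes a uniform scalar quantizer with a hypothetical step size of 4; the coefficient values are made up for illustration and are not taken from the thesis.

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform scalar quantizer: map each coefficient to an integer index."""
    return np.round(coeffs / step).astype(int)

def dequantize(indices, step):
    """Map the integer indices back to approximate coefficient values."""
    return indices * step

# Hypothetical transform coefficients: a few large ones and several small ones.
coeffs = np.array([52.3, -7.8, 1.2, -0.4, 0.9, 14.6])
step = 4.0  # assumed step size; a larger step discards more coefficients

indices = quantize(coeffs, step)
approx = dequantize(indices, step)
max_error = np.max(np.abs(coeffs - approx))
print(indices, approx, max_error)
```

Coefficients smaller than half the step size map to index zero and are effectively discarded, which is where the loss of information occurs. The reconstruction error of a uniform quantizer of this kind is bounded by half the step size.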

The quantized image is encoded to obtain the final compressed bit stream $b[m]$. There are various types of encoders, and the entropy encoder is one of them. Entropy is a measure of the amount of information in an object. For example, a speech signal with periods of silence will have lower entropy than a signal with no periods of silence. Entropy can be expressed in bits. In this form, it is generally referred to as information content. Every non-random signal has some degree of redundancy or correlation in itself. With an appropriate prediction algorithm, the redundancy can be removed, thus reducing the entropy of the signal as well (entropy is directly proportional to information content). As a reduction in entropy reduces the average number of bits per sample of the signal, it results in the overall compression of the signal. An encoder using this concept of reducing entropy to achieve compression is an entropy encoder. The entropy is reduced by using the fact that a symbol with a higher probability of occurrence should use fewer bits than a symbol with a lower probability of occurrence. At the receiver, the coded bit stream is decoded using an equivalent entropy decoder followed by an inverse transformation to obtain the decompressed image.
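As a small numerical illustration of the entropy argument above, the following sketch estimates the first-order (memoryless) Shannon entropy of two short symbol sequences. The sequences are hypothetical; the one with long runs of zeros plays the part of a signal with periods of silence.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    """First-order Shannon entropy estimate, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical data: one sequence dominated by zeros ("silence"),
# one that uses four values equally often.
with_silence = [0] * 12 + [1, 2, 3, 1]
no_silence = [0, 1, 2, 3] * 4

h_silence = entropy_bits_per_symbol(with_silence)
h_flat = entropy_bits_per_symbol(no_silence)
print(f"{h_silence:.3f} bits/symbol with silence, {h_flat:.3f} without")
```

The sequence with silence needs fewer bits per symbol on average, which is what an entropy coder exploits by giving the frequent symbol a short codeword.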

In the sections that follow, each of these two or three stages (for lossless or lossy compression, respectively) in a compression system is discussed in detail.

2.2 The transformation stage

The aim of compression is to remove redundancy from the original data so that the same information (or slightly less) can be coded in fewer bits. Correlated data is characterized by the fact that one can, given a part of the data, fill in the missing part. Several types of correlation exist. They can be characterized as follows:

• Spatial correlation: One can often predict the value of a pixel in an image by looking at the neighboring pixels.

• Spectral correlation: One can predict one frequency component by looking at the neighboring frequency components. This means that a spectral transform (Fourier transform, for example) of a signal is smooth.

• Temporal correlation: In a digital video, most pixels of two neighboring frames change very little in the time direction (e.g., the background).

Generally, a transformation removes spatial and/or spectral correlation in an image. Techniques that remove both spectral and spatial correlation are called hybrid coding techniques.

2.2.1 Spatial decorrelating transforms

A spatial decorrelating transform removes the correlation from an image by using the neighboring pixel values. A predictor is used to estimate the current value of the pixel based on previous pixels. Different types of predictors have been reported in the literature. A compression scheme based on this approach is called predictive coding. The current pixel value is estimated based on the neighboring pixels and the type of predictor being used. The difference between the actual and estimated value is called the prediction error (or error residual). For lossless compression, the prediction error is stored without any loss using entropy coding and is used to obtain the final bit stream. On the other hand, if the compression is lossy, the error residual is quantized and then coded to obtain the bit stream. The predictor can also be adaptive, in which case its coefficients change with the type of signal being coded. The Joint Photographic Experts Group, also known as JPEG, uses a variety of lossless predictors [18] for predicting the current pixel value based on the neighboring pixels. Generally, the predicted value of the current pixel is estimated by using the past pixel values in the rows and columns. The more neighboring pixels are used for prediction, the higher the order of the predictor. A further possibility is to delay the encoding of a pixel until the 'future trend' of the signal can be observed, and then take advantage of this trend. This is called delayed coding [19].
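The idea of predictive coding can be sketched as follows. The example assumes the simplest possible predictor, in which each pixel is predicted by its left neighbor; this is an illustrative choice and not one of the specific predictors of [18]. The pixel values are hypothetical.

```python
import numpy as np

def predict_left(image):
    """Previous-pixel predictor: each pixel is predicted by its left neighbor.
    The first pixel of each row has no left neighbor and is predicted as 0."""
    pred = np.zeros_like(image)
    pred[:, 1:] = image[:, :-1]
    return pred

def encode_residual(image):
    """Prediction error (error residual) to be entropy coded."""
    return image - predict_left(image)

def decode_residual(residual):
    """A running sum along each row undoes the left-neighbor difference."""
    return np.cumsum(residual, axis=1)

# Hypothetical 2 x 4 image with slowly varying intensities.
image = np.array([[100, 101, 103, 104],
                  [ 98,  99, 101, 102]], dtype=np.int64)

residual = encode_residual(image)
restored = decode_residual(residual)
print(residual)
```

Apart from the first column, the residuals are small numbers clustered around zero, so they have lower entropy than the raw pixel values, and the decoder recovers the image exactly.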

The Joint Bi-level Image Experts Group, also known as JBIG [20], uses a set of predictors to estimate the current pixel value. Although JBIG is generally used for the coding of binary images, it can efficiently code images with resolution up to 4 bits/pixel. However, beyond that, its performance tends to suffer compared to other transform-based methods. The main advantage of the predictive coding technique is that it involves very few computations compared to spectral-correlation and hybrid coding methods. Various applications where predictive coders are used can be found in [21], [22].


2.2.2 Spectral-decorrelating transforms

In spectral-decorrelating transforms, the input image is partitioned into blocks of pixels and each block of pixels is processed separately. The pixels in each block are transformed into the frequency domain (spatial frequency in rows and columns) using a linear orthonormal transformation. A linear transformation is an information-preserving process. The main objective of a transformation is to produce statistically independent (or at least uncorrelated) transform coefficients so that they can be coded independently of each other with good efficiency. Another objective is energy compaction, which means concentrating the energy in a few coefficients and thereby making as many coefficients as possible small enough that they need not be transmitted. Most of the spectral transforms achieve these objectives reasonably well. The most commonly known transforms are the discrete Fourier transform (DFT), the Walsh-Hadamard transform (WHT), the Karhunen-Loeve transform (KLT) and the discrete cosine transform (DCT), to name a few [19]. Although the KLT achieves the theoretical limit of performance of any decorrelation technique (in other words, its coefficients are uncorrelated), the DCT provides a good balance between performance and computational complexity. Hence it is most widely used, and in certain circumstances, its performance can come close to that of the KLT [8]. Recently, subband coding has also been used for image compression. This involves filtering the image through a bank of filters. The output of each of the filters has information within the passband of the filter. The advantage of subband coding is that the quantization of each of the subbands can be done based on human perception, thus improving the subjective quality of the decompressed image.
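Energy compaction can be illustrated with a short sketch. It builds an orthonormal DCT-II matrix directly from its definition and applies it to a smooth ramp of eight samples, which stands in for one row of an image block (an assumed test signal, not data from the thesis).

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)           # scale the DC row so that C @ C.T = I
    return c

C = dct_matrix(8)
block = np.arange(8, dtype=float)     # smooth ramp used as a stand-in for pixel data
coeffs = C @ block

energy = coeffs ** 2
fraction_in_first_two = energy[:2].sum() / energy.sum()
print(f"fraction of energy in the first two coefficients: {fraction_in_first_two:.4f}")
```

Because the transform is orthonormal, the total energy is unchanged, but for this smooth input almost all of it ends up in the first two coefficients. The remaining coefficients are small and are natural candidates for coarse quantization.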
A specific family of transforms used to obtain a type of subband decomposition of signals, called wavelets, has received considerable attention in recent years [3]. Wavelets are very closely related to subband filters, with some additional properties that make them favorable for image compression [23]. One of the main features of wavelets is their time-frequency representation. Wavelets have higher frequency resolution at lower frequencies and lower frequency resolution at higher frequencies. This is attributed to the nonuniform bandwidths of the filters used at each spatial resolution. Figure 2.2(a) shows the time-frequency representation for uniform filter banks. The frequency resolution is constant at different frequencies. The same also holds for time resolution. This is the characteristic of the short-time Fourier transform (STFT). Consider a signal $x(t)$ and assume that it is stationary over an arbitrary window function $h(t)$ of limited duration, centered at time location $\tau$. The Fourier transform of the windowed signal $x(t)\,h^{*}(t - \tau)$ yields the STFT as

$$\mathrm{STFT}(\tau, f) = \int_{-\infty}^{\infty} x(t)\, h^{*}(t - \tau)\, e^{-j 2 \pi f t}\, dt \tag{2.1}$$

which maps the signal onto a two-dimensional function in the time-frequency plane. However, the analysis depends critically on the choice of the window $h(t)$. Figure 2.2(b) shows the time-frequency representation for nonuniform filter banks (e.g., wavelets). The frequency resolution is higher at lower frequencies and lower at higher frequencies. On the other hand, time resolution is higher at higher frequencies and lower at lower frequencies.

[Figure 2.2. Time-frequency representation for (a) uniform filter banks and (b) nonuniform filter banks; axes: frequency and time.]

The transformations that use both frequency and spatial correlation in an image fall in the category of so-called hybrid transforms. In the next section, we discuss the basic theory of wavelets and provide details on the current trends.

2.2.3 Hybrid transforms: Wavelets

Time-frequency representation, also called multiresolution representation, is a general method for constructing orthonormal bases, developed by Mallat and Meyer [24]. Intuitively, multiresolution slices the space of square integrable functions $L^2$ into a nested sequence of subspaces $V_j$, where each $V_j$ corresponds to a different scale. The multiresolution is completely determined by the choice of a special function called the scaling function.

2.2.4 Definition of multiresolution analysis

Multiresolution analysis provides a natural framework for understanding the wavelet basis, and for the construction of new examples. The concept of multiresolution approximation of a function was introduced by Mallat [4] and provides a powerful framework to understand wavelet decomposition and reconstruction. Before the theory of multiresolution analysis is presented, it is worthwhile to discuss the concept of an orthonormal basis. This concept is widely used in the development of multiresolution analysis.

2.2.4.1 Orthonormal basis

A family of elements $\{u_n\}$, $n \in I$ ($I$ a countable index set), is called orthonormal if $\|u_n\| = 1$ for all $n \in I$ and

$$\langle u_i, u_j \rangle = \delta_{ij}, \quad \forall\, i, j \in I \tag{2.2}$$

where $\langle u_i, u_j \rangle$ is the inner product of $u_i$ and $u_j$ as given in Equation 0.3. Then $\{u_n\}_{n \in I}$ is called an orthonormal set. An orthonormal set is complete if it satisfies the property of completeness. An orthonormal set of functions $\{u_n\}$, $n \in I$ ($I$ a countable index set), belonging to the space $L^2(\mathbb{R})$ is said to be complete if there exists no function different from zero in $L^2(\mathbb{R})$ that is orthogonal to all the functions $u_n$. A complete orthonormal set is also called an orthonormal basis (or standard basis). Background theory on completeness can be found in [25].

As a consequence of this property of completeness, for any arbitrary function $u$ in $L^2(\mathbb{R})$, if $\langle u, u_n \rangle = 0$ for all $n \in I$, then $u$ must be zero. When the set is complete, $u$ can be represented as

$$u = \sum_{n \in I} \langle u, u_n \rangle\, u_n \tag{2.3}$$

and

$$\|u\|^2 = \sum_{n \in I} |\langle u, u_n \rangle|^2 \tag{2.4}$$

Equation 2.3 is the generalized form of the Fourier series with an arbitrary orthonormal basis $\{u_n\}$, $n \in I$.
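Equations 2.3 and 2.4 can be checked numerically in a finite-dimensional setting. The sketch below works in $\mathbb{R}^6$ rather than $L^2(\mathbb{R})$ and obtains an orthonormal basis from a QR factorization of a random matrix; both the dimension and the basis are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# The columns of q form an orthonormal basis of R^6 (an arbitrary example basis).
q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
basis = [q[:, n] for n in range(6)]

u = rng.standard_normal(6)

# Expansion coefficients <u, u_n>.
coeffs = [np.dot(u, u_n) for u_n in basis]

# Equation 2.3: u is the sum of its coefficients times the basis elements.
u_rebuilt = sum(c * u_n for c, u_n in zip(coeffs, basis))

# Equation 2.4: the squared norm equals the sum of squared coefficients.
energy = sum(c ** 2 for c in coeffs)
print(np.allclose(u_rebuilt, u), np.isclose(energy, np.dot(u, u)))
```

The same two identities are what make an orthonormal wavelet basis attractive for compression: the signal can be rebuilt from its coefficients, and the energy of any discarded coefficient equals the energy of the resulting error.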

Examples

The functions $e^{inx}$, $n \in I$, form a complete orthogonal set in $L^2[0, 2\pi]$. Another example of an orthonormal basis in $L^2(\mathbb{R})$ is

$$f_{m,n}(x) = 2^{-m/2}\, f(2^{-m} x - n) \tag{2.5}$$

where

$$f(x) = \begin{cases} 1, & 0 \le x < 1/2 \\ -1, & 1/2 \le x < 1 \\ 0, & \text{otherwise} \end{cases}$$

2.2.4.2 Scaling function and the mother wavelet

A multiresolution analysis consists of a sequence of successive approximation spaces $V_j$ (closed subspaces) that satisfy the relation

$$\cdots V_2 \subset V_1 \subset V_0 \subset V_{-1} \subset V_{-2} \cdots \tag{2.6}$$

with

$$\bigcup_{j \in I} V_j = L^2(\mathbb{R}) \tag{2.7}$$

and

$$\bigcap_{j \in I} V_j = \{0\} \tag{2.8}$$

If $P_j$ is the orthogonal projection operator onto $V_j$, then

$$\lim_{j \to -\infty} P_j f = f \quad \forall\, f \in L^2(\mathbb{R}) \tag{2.9}$$

An additional requirement for the multiresolution analysis is

$$f(t) \in V_j \iff f(2^j t) \in V_0 \tag{2.10}$$

In other words, all the spaces are scaled versions of the central space $V_0$. Moreover,

$$f(t) \in V_0 \implies f(t - n) \in V_0 \quad \text{for all } n \in I \tag{2.11}$$

In summary, the five conditions for a set of subspaces $V_j \subset L^2(\mathbb{R})$ to be suitable for multiresolution analysis are

1. $\cdots V_2 \subset V_1 \subset V_0 \subset V_{-1} \subset V_{-2} \cdots$

2. $\bigcup_{j \in I} V_j = L^2(\mathbb{R})$, $\bigcap_{j \in I} V_j = \{0\}$

3. $f(t) \in V_j \iff f(2t) \in V_{j-1}$

4. There exists $\phi \in V_0$ such that $V_0 = \{ f \mid f(t) = \sum_i a_i\, \phi(t - i) \}$; $\phi$ is also known as the scaling function.

5. $\{\phi_n(t) = \phi(t - n)\}$ is an orthonormal basis of $V_0$.

These shall be referred to as the five axioms of multiresolution analysis. If an orthonormal basis satisfies all of the above conditions, it could be a valid wavelet basis for multiresolution analysis.

The basic tenet of multiresolution analysis is that whenever a collection of closed subspaces satisfies axioms 1-5 of multiresolution analysis, then there exists an orthonormal wavelet basis $\{\psi_{j,k};\ j, k \in \mathbb{Z}\}$ of $L^2(\mathbb{R})$, $\psi_{j,k}(x) = 2^{-j/2}\, \psi(2^{-j} x - k)$, such that, for all $f$ in $L^2(\mathbb{R})$,

$$P_{j-1} f = P_j f + \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \psi_{j,k} \tag{2.12}$$

where $P_j$ is the orthogonal projection onto $V_j$. Please refer to [3] for details.

Now, if we call $W_j$ the orthogonal complement of $V_j$ in $V_{j-1}$, i.e.,

$$V_{j-1} = V_j \oplus W_j \tag{2.13}$$

then $W_j$ contains the details necessary to go from $V_j$ to $V_{j-1}$. Iterating Equation 2.13 gives, for any $J > j$,

$$V_{j-1} = V_J \oplus W_J \oplus W_{J-1} \oplus \cdots \oplus W_j \tag{2.14}$$

By virtue of axioms 2 and 3 for multiresolution analysis, this implies that

$$L^2(\mathbb{R}) = \bigoplus_{j \in \mathbb{Z}} W_j \tag{2.15}$$

a decomposition of $L^2(\mathbb{R})$ into mutually orthogonal subspaces. In other words, a given resolution can be attained by a sum of added details.

If a function $\phi$ defines an orthonormal basis of $V_0$ such that it satisfies the five conditions for multiresolution analysis, it is called the scaling function of the wavelet basis. Thus, if $V_0 \subset V_{-1}$, then

$$\phi(x) = \sum_{k \in I} c_k\, \phi(2x - k) \tag{2.16}$$

where $\phi(x)$ is the scaling function. Integrating both sides of Equation 2.16 with respect to $x$, we obtain

$$\int_{-\infty}^{\infty} \phi(x)\, dx = \int_{-\infty}^{\infty} \sum_{k \in I} c_k\, \phi(2x - k)\, dx \tag{2.17}$$

$$= \sum_{k \in I} c_k \int_{-\infty}^{\infty} \phi(2x - k)\, dx \tag{2.18}$$

Assuming $\phi \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ such that $\int_{-\infty}^{\infty} \phi(x)\, dx = 1$, and substituting $2x - k$ with $x'$, Equation 2.18 can be written as

$$1 = \sum_{k \in I} c_k \int_{-\infty}^{\infty} \phi(x')\, \frac{dx'}{2} \tag{2.19}$$

Substituting $\int_{-\infty}^{\infty} \phi(x)\, dx = 1$ in Equation 2.19, we get

$$\sum_{k} c_k = 2 \tag{2.20}$$

Also, if $\phi(x)$ satisfies the orthogonality condition, which states that

$$\langle \phi(x), \phi(x - m) \rangle = \delta_{0m} \quad \forall\, m \in I \tag{2.21}$$

and also satisfies the dilation equation given in Equation 2.16, it can be shown that

$$\sum_{k} c_k\, c_{k-2m} = 2\, \delta_{0m}, \quad \forall\, m \in I \tag{2.22}$$

where

$$\delta_{0m} = \begin{cases} 1, & m = 0 \\ 0, & \text{otherwise} \end{cases}$$

Another condition which governs the choice of the $c_k$'s is the approximation condition, which is as follows: If $p$ is a natural number, then

$$\sum_{k} (-1)^k\, k^m\, c_k = 0 \tag{2.23}$$

for $m = 0, 1, 2, \ldots, p - 1$.

It can be shown that the number $p$ characterizes the smoothing function $\phi(x)$ such that polynomial functions of degree $p - 1$ or less are linear combinations of $\phi(x)$ and its integer translates [26]. Moreover, the higher the value of $p$, the greater the number of nonzero $c_k$'s in the orthonormal basis [27].

If the scaling function $\phi$ is given by Equation 2.16 and the wavelet basis is defined by Equation 2.12, then it can be shown [3] that the mother wavelet $\psi(x)$ is given by

$$\psi(x) = \sum_{n} (-1)^n\, c_{1-n}\, \phi(2x - n) \tag{2.24}$$

and

$$\psi_{j,k}(x) = 2^{-j/2}\, \psi(2^{-j} x - k) \tag{2.25}$$

If $P_j$ is the projection on $V_j$ and $Q_j$ is the projection on $W_j$, then

$$P_{j-1} f = P_j f + Q_j f \tag{2.26}$$

where

$$Q_j f = \sum_{k \in I} \langle f, \psi_{j,k} \rangle\, \psi_{j,k}(x) \tag{2.27}$$

Therefore, we can store the information in the form of the inner products $\langle f, \psi_{j,k} \rangle$ for different values of $j$ and $k$. These inner products are called the wavelet coefficients of the function $f(x)$. So, if we have an approximation of a signal at the resolution corresponding to $V_j$, then a better approximation is obtained by adding the details corresponding to $W_j$. This amounts to a weighted sum of wavelets at that scale. Equation 2.12 describes this relationship, where $\langle f, \psi_{j,k} \rangle$ are the wavelet coefficients (or weights) of the mother wavelet $\psi(x)$ at the scale $j$. Thus, by iterating this idea, a square integrable signal can be seen as the successive approximation or weighted sum of the wavelet basis $\{\psi_{j,k};\ j, k \in \mathbb{Z}\}$ at finer and finer scales.

2.2.4.3 Calculation of one-dimensional wavelet coefficients

In this section, the weighting coefficients for the orthonormal D4 wavelet basis are derived [3]. Using the approximation and orthogonality conditions on the wavelet coefficients together with the summation condition, Equations 2.20, 2.22 and 2.23 give the set of equations

$$\sum_{k=0}^{3} c_k = 2 \tag{2.29}$$

$$\sum_{k=0}^{3} c_k\, c_{k-2m} = 2\, \delta_{0m} \tag{2.30}$$

$$\sum_{k \in I} (-1)^k\, k^m\, c_k = 0 \quad \text{for } m = 0, 1 \tag{2.31}$$

Expanding the above equations, we get

$$c_0 + c_1 + c_2 + c_3 = 2 \tag{2.32}$$

$$c_0 c_2 + c_1 c_3 = 0 \tag{2.33}$$

$$-c_1 + 2 c_2 - 3 c_3 = 0 \tag{2.34}$$

$$c_0 - c_1 + c_2 - c_3 = 0 \tag{2.35}$$

$$c_0^2 + c_1^2 + c_2^2 + c_3^2 = 2 \tag{2.36}$$

Solving the first four equations simultaneously, we get

$$c_0 = \frac{1 + \sqrt{3}}{4}, \quad c_1 = \frac{3 + \sqrt{3}}{4}, \quad c_2 = \frac{3 - \sqrt{3}}{4}, \quad c_3 = \frac{1 - \sqrt{3}}{4}$$

It can be shown that Equation 2.36 is dependent on the other four equations. Squaring both sides of Equations 2.32 and 2.35 and adding, we get

$$2\,(c_0^2 + c_1^2 + c_2^2 + c_3^2) + 4\,(c_0 c_2 + c_1 c_3) = 4$$

By using Equation 2.33, we get $c_0^2 + c_1^2 + c_2^2 + c_3^2 = 2$, which is Equation 2.36.
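The solution can be verified numerically. The sketch below assumes the standard D4 values $c_0 = (1+\sqrt{3})/4$, $c_1 = (3+\sqrt{3})/4$, $c_2 = (3-\sqrt{3})/4$, $c_3 = (1-\sqrt{3})/4$ and evaluates the residual of each of Equations 2.32 to 2.36.

```python
import math

s3 = math.sqrt(3.0)
# D4 coefficients c_0 .. c_3 under the normalization sum(c) = 2.
c = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]

# Left side minus right side of each equation; all should be (numerically) zero.
checks = {
    "2.32": sum(c) - 2,
    "2.33": c[0] * c[2] + c[1] * c[3],
    "2.34": -c[1] + 2 * c[2] - 3 * c[3],
    "2.35": c[0] - c[1] + c[2] - c[3],
    "2.36": sum(ck ** 2 for ck in c) - 2,
}
for name, residual in checks.items():
    print(f"Equation {name}: residual {residual:+.2e}")
```

All five residuals vanish to machine precision, which also confirms that Equation 2.36 adds no new constraint.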

Clearly, increasing the value of $p$ in Equation 2.23 increases the number of nonzero wavelet coefficients and thus smoothes the scaling function [27]. For example, if $p = 2$, then one can reconstruct exactly any linear function (a polynomial of degree one or less). If $p = 3$, then any quadratic curve can be decomposed and reconstructed exactly using those wavelet coefficients. However, the number of nonzero coefficients increases as the value of $p$ is increased, thus adding to the computational complexity.

2.2.4.4 Signal decomposition and reconstruction by using wavelets

The decomposition of a signal can be performed by using an algorithm introduced by Mallat [4] called the cascade algorithm. This algorithm decomposes the signal by taking the projections on the set of subspaces $\{W_j\}$ generated by the scaled mother wavelet $\{\psi_{j,n}\}$. Clearly, as we increase the value of $j$, we shift to coarser scales. If

$$f(x) = \sum_{n} a_n^0\, \phi_{0,n}(x) \tag{2.37}$$

where $a_n^0 = \langle f, \phi_{0,n} \rangle$, then $f \in V_0$. Moving to coarser scales $V_j$, $j > 0$, discards some fine information, which is contained in the wavelet components. Note that the subspaces $\{W_j\}$ are orthogonal to each other, i.e.,

$$W_i \cap W_j = \begin{cases} W_i, & i = j \\ \{0\}, & \text{otherwise} \end{cases} \tag{2.38}$$

In this way, no information of the signal is lost in the decomposition. In general, if we transform from $V_0 \to V_1$, i.e., $P_1 : V_0 \to V_1$, then

$$P_1 f = \sum_{n} a_n^0\, P_1(\phi_{0,n}(x)) \tag{2.39}$$

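However, from Equation 2.16,

$$\phi_{1,l}(x) = \frac{1}{\sqrt{2}} \sum_{n} c_{n-2l}\, \phi_{0,n}(x) \tag{2.40}$$

so that

$$\langle \phi_{0,n}, \phi_{1,l} \rangle = \frac{1}{\sqrt{2}}\, c_{n-2l} \tag{2.41}$$

and

$$P_1(\phi_{0,n}(x)) = \sum_{l} \langle \phi_{0,n}, \phi_{1,l} \rangle\, \phi_{1,l}(x) = \frac{1}{\sqrt{2}} \sum_{l} c_{n-2l}\, \phi_{1,l}(x) \tag{2.42}$$

Therefore,

$$P_1 f = \frac{1}{\sqrt{2}} \sum_{n} \sum_{l} a_n^0\, c_{n-2l}\, \phi_{1,l}(x) \tag{2.43}$$

$$= \sum_{l} \left[ \frac{1}{\sqrt{2}} \sum_{n} c_{n-2l}\, a_n^0 \right] \phi_{1,l}(x) \tag{2.44}$$

Moreover, as $P_1$ is the projection of subspace $V_0$ onto $V_1$, $P_1 f \in V_1$. Hence,

$$P_1 f = \sum_{l} a_l^1\, \phi_{1,l}(x) \tag{2.45}$$

where $a_l^1 = \langle f, \phi_{1,l} \rangle$ for all $l \in \mathbb{Z}$. Comparing Equations 2.44 and 2.45, we get

$$a_l^1 = \frac{1}{\sqrt{2}} \sum_{n} c_{n-2l}\, a_n^0 \tag{2.46}$$

Generalizing this to any transformation $P_j : V_{j-1} \to V_j$, we get

$$a_l^j = \frac{1}{\sqrt{2}} \sum_{n} c_{n-2l}\, a_n^{j-1} \tag{2.47}$$

where $a_l^j = \langle f, \phi_{j,l} \rangle$. As the subspace $V_j$ is a coarser approximation to $V_{j-1}$, the transformation matrix given in Equation 2.47 is called the lowpass filter matrix or the L-matrix.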

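In order to get the wavelet components, i.e., $Q_1 : V_0 \to W_1$, we can write

$$Q_1 f = \sum_{n} a_n^0\, Q_1(\phi_{0,n}(x)) \tag{2.48}$$

However, from Equation 2.24,

$$\psi_{1,l}(x) = \frac{1}{\sqrt{2}} \sum_{n} (-1)^n\, c_{1-n+2l}\, \phi_{0,n}(x) \tag{2.49}$$

so that

$$\langle \phi_{0,n}, \psi_{1,l} \rangle = \frac{1}{\sqrt{2}}\, (-1)^n\, c_{1-n+2l} \tag{2.50}$$

Therefore,

$$Q_1[\phi_{0,n}(x)] = \frac{1}{\sqrt{2}} \sum_{l} (-1)^n\, c_{1-n+2l}\, \psi_{1,l}(x) \tag{2.51}$$

and

$$Q_1 f = \frac{1}{\sqrt{2}} \sum_{n} \sum_{l} (-1)^n\, c_{1-n+2l}\, a_n^0\, \psi_{1,l}(x) \tag{2.52}$$

$$= \sum_{l} b_l^1\, \psi_{1,l}(x) \tag{2.53}$$

where

$$b_l^1 = \frac{1}{\sqrt{2}} \sum_{n} (-1)^n\, c_{1-n+2l}\, a_n^0 \tag{2.54}$$

Generalizing, we get

$$b_l^j = \frac{1}{\sqrt{2}} \sum_{n} (-1)^n\, c_{1-n+2l}\, a_n^{j-1} \tag{2.55}$$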

As the subspace $W_j$ contains the details that were missing in $V_j$ relative to $V_{j-1}$, the transformation matrix given in the above equation is called the highpass filter matrix or the H-matrix. Combining Equations 2.47 and 2.55, we get

$$a^j = L\, a^{j-1} \tag{2.56}$$

$$b^j = H\, a^{j-1} \tag{2.57}$$

In most practical applications, $a^0$ is assumed to be the same as the discrete input sequence $f(n)$, where $f(n) = f(x)|_{x=n}$ for $n \in I$. In other words, the input sequence is assumed to be identical to the finest scale of wavelet decomposition. Please refer to Appendix A for more details.
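One level of the decomposition in Equations 2.47 and 2.55 can be sketched as follows. The example assumes the D4 coefficients derived earlier and a periodic extension of the finite input sequence at its boundaries; the boundary treatment and the function name `analysis_step` are choices made for this illustration and are not prescribed by the text.

```python
import numpy as np

S3 = np.sqrt(3.0)
C = np.array([1 + S3, 3 + S3, 3 - S3, 1 - S3]) / 4.0   # c_0 .. c_3, sum = 2

def analysis_step(a_prev):
    """One decomposition level with periodic extension (assumed).

    a[l] = (1/sqrt 2) * sum_n c[n - 2l]              * a_prev[n]   (Equation 2.47)
    b[l] = (1/sqrt 2) * sum_n (-1)^n c[1 - n + 2l]   * a_prev[n]   (Equation 2.55)
    """
    n_samples = len(a_prev)
    half = n_samples // 2
    a = np.zeros(half)
    b = np.zeros(half)
    for l in range(half):
        for k in range(4):
            # Lowpass: n - 2l = k, so n = 2l + k.
            n = 2 * l + k
            a[l] += C[k] * a_prev[n % n_samples]
            # Highpass: 1 - n + 2l = k, so n = 2l + 1 - k.
            n = 2 * l + 1 - k
            sign = -1.0 if n % 2 else 1.0          # (-1)^n
            b[l] += sign * C[k] * a_prev[n % n_samples]
    return a / np.sqrt(2.0), b / np.sqrt(2.0)

# A constant input: all detail (wavelet) coefficients should vanish.
constant = np.ones(8)
a_const, b_const = analysis_step(constant)

# An arbitrary input: the transform preserves the signal energy.
signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a_sig, b_sig = analysis_step(signal)
print(a_const, b_const)
```

For the constant input, the wavelet coefficients are zero because of the approximation condition with $m = 0$, and each scaling coefficient equals $\sqrt{2}$. For the arbitrary input, the energy of the scaling and wavelet coefficients together equals the energy of the input.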

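For signal reconstruction from the wavelet components and the signal at the coarsest scale, we use

$$P_{j-1} f = P_j f + Q_j f = \sum_{l} a_l^j\, \phi_{j,l}(x) + \sum_{l} b_l^j\, \psi_{j,l}(x) \tag{2.58}$$

Also,

$$P_{j-1} f = \sum_{n} a_n^{j-1}\, \phi_{j-1,n}(x) \tag{2.59}$$

where

$$a_n^{j-1} = \langle P_{j-1} f, \phi_{j-1,n} \rangle \tag{2.60}$$

Substituting the value of $P_{j-1} f$ from Equation 2.58, we obtain

$$a_n^{j-1} = \sum_{l} a_l^j\, \langle \phi_{j,l}, \phi_{j-1,n} \rangle + \sum_{l} b_l^j\, \langle \psi_{j,l}, \phi_{j-1,n} \rangle \tag{2.61}$$

Using the fact that for any two functions $f$ and $g \in L^2$, $\langle f, g \rangle = \langle g, f \rangle^{*}$, Equation 2.61 can be written as

$$a^{j-1} = L^{*} a^j + H^{*} b^j \tag{2.62}$$

where $L^{*}$ and $H^{*}$ denote the conjugate transposes of the matrices $L$ and $H$.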
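The matrix form of the decomposition (Equations 2.56 and 2.57) and of the reconstruction step (Equation 2.62) can be sketched as follows. The example assumes the D4 coefficients, real-valued data (so that the conjugate transpose reduces to the ordinary transpose) and periodic wrap-around at the signal boundaries; the helper `filter_matrices` and the test signal are hypothetical.

```python
import numpy as np

S3 = np.sqrt(3.0)
C = np.array([1 + S3, 3 + S3, 3 - S3, 1 - S3]) / 4.0   # c_0 .. c_3

def filter_matrices(n_samples):
    """Build the (n_samples/2) x n_samples matrices L and H with periodic wrap-around."""
    half = n_samples // 2
    L = np.zeros((half, n_samples))
    H = np.zeros((half, n_samples))
    for l in range(half):
        for k in range(4):
            n = 2 * l + k                      # entry c_{n-2l}, with n - 2l = k
            L[l, n % n_samples] += C[k]
            n = 2 * l + 1 - k                  # entry c_{1-n+2l}, with 1 - n + 2l = k
            sign = -1.0 if n % 2 else 1.0      # (-1)^n
            H[l, n % n_samples] += sign * C[k]
    return L / np.sqrt(2.0), H / np.sqrt(2.0)

L, H = filter_matrices(8)
a_prev = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])

a = L @ a_prev                     # Equation 2.56
b = H @ a_prev                     # Equation 2.57
a_rebuilt = L.T @ a + H.T @ b      # real-valued case of Equation 2.62
print(np.allclose(a_rebuilt, a_prev))
```

The rows of $L$ and $H$ together form an orthonormal set, which is why applying the transposes recovers the finer-scale coefficients exactly. Repeating the analysis on `a` gives the next coarser level, and applying the synthesis steps in reverse order undoes the whole cascade.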

Decomposition using wavepackets

Another approach to decomposing the signal is based on the theory of wavepackets [3]. In this case, the stream of data is first decomposed into lower-order scaling coefficients and higher-order wavelet coefficients. The higher-order wavelet coefficients are in turn decomposed into lower-order and higher-order coefficients. Thus, if the subspace $V_0$ is split into orthogonal subspaces $V_1$ and $W_1$ as given by

$$V_0 = V_1 \oplus W_1 \tag{2.63}$$

then,

$$W_1 = W_1^{l} \oplus W_1^{h} \tag{2.64}$$

In Equation 2.64, $W_1^{l}$ corresponds to the coarse details in $W_1$ while $W_1^{h}$ contains the high-resolution components of $W_1$.

2.2.4.5 Extension of wavelet basis to two dimensions

The orthonormal wavelet basis in $L^2(\mathbb{R}^2)$ can be constructed by starting with a basis in $L^2(\mathbb{R})$. Let $\psi_{j,k}(x) = 2^{-j/2}\, \psi(2^{-j} x - k)$ be the orthonormal basis in $L^2(\mathbb{R})$. Its two-dimensional equivalent is generated by taking the tensor product

$$\Psi_{j_1,k_1;\,j_2,k_2}(x_1, x_2) = \psi_{j_1,k_1}(x_1)\, \psi_{j_2,k_2}(x_2) \tag{2.65}$$

There exists another, better method in which the dilation of the resulting orthonormal basis controls both variables simultaneously. In this approach, one considers the tensor product of two one-dimensional multiresolution analyses rather than two one-dimensional wavelet bases. Thus, if

$$\mathbf{V}_0 = V_0 \otimes V_0 \tag{2.66}$$

and

$$F \in \mathbf{V}_j \iff F(2^j x - n_1,\, 2^j y - n_2) \in \mathbf{V}_0 \quad \forall\, n_1, n_2 \in \mathbb{Z} \tag{2.67}$$

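then $\mathbf{V}_j$ forms a multiresolution ladder in $L^2(\mathbb{R}^2)$ satisfying

$$\cdots \mathbf{V}_2 \subset \mathbf{V}_1 \subset \mathbf{V}_0 \subset \mathbf{V}_{-1} \subset \mathbf{V}_{-2} \cdots \tag{2.68}$$

$$\bigcup_{j \in I} \mathbf{V}_j = L^2(\mathbb{R}^2) \tag{2.69}$$

$$\bigcap_{j \in I} \mathbf{V}_j = \{0\} \tag{2.70}$$

Thus

$$\Phi_{j;\,n_1,n_2}(x, y) = \phi_{j,n_1}(x)\, \phi_{j,n_2}(y) \tag{2.71}$$

$$= 2^{-j}\, \phi(2^{-j} x - n_1)\, \phi(2^{-j} y - n_2), \quad n_1, n_2 \in I \tag{2.72}$$

and

$$\mathbf{V}_{j-1} = V_{j-1} \otimes V_{j-1} \tag{2.73}$$

$$= (V_j \oplus W_j) \otimes (V_j \oplus W_j) \tag{2.74}$$

$$= (V_j \otimes V_j) \oplus \left[ (W_j \otimes V_j) \oplus (V_j \otimes W_j) \oplus (W_j \otimes W_j) \right] \tag{2.75}$$

$$= \mathbf{V}_j \oplus \mathbf{W}_j \tag{2.76}$$

where $\mathbf{W}_j$ consists of three parts given by $\psi_{j,n_1}(x)\, \phi_{j,n_2}(y)$ for $(W_j \otimes V_j)$, $\phi_{j,n_1}(x)\, \psi_{j,n_2}(y)$ for $(V_j \otimes W_j)$, and $\psi_{j,n_1}(x)\, \psi_{j,n_2}(y)$ for $(W_j \otimes W_j)$. This leads to the three wavelets

$$\Psi^{h}(x, y) = \phi(x)\, \psi(y) \tag{2.77}$$

$$\Psi^{v}(x, y) = \phi(y)\, \psi(x) \tag{2.78}$$

$$\Psi^{d}(x, y) = \psi(x)\, \psi(y) \tag{2.79}$$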

where $h$, $v$, and $d$ stand for horizontal, vertical and diagonal components.
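The separable construction can be sketched with one level of a two-dimensional transform. For brevity, the example swaps in the two-tap Haar filters ($c_0 = c_1 = 1$) instead of D4, and assumes that $x$ runs along the columns of the array and $y$ along the rows. The helper names and the test image are hypothetical.

```python
import numpy as np

def haar_step(x, axis):
    """One Haar analysis step along `axis`: returns (lowpass, highpass) halves."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_2d(image):
    """Filter along x (axis 1) first, then along y (axis 0)."""
    low_x, high_x = haar_step(image, axis=1)
    approx, horiz = haar_step(low_x, axis=0)   # phi(x) phi(y) and phi(x) psi(y)
    vert, diag = haar_step(high_x, axis=0)     # psi(x) phi(y) and psi(x) psi(y)
    return approx, horiz, vert, diag

# Hypothetical 4 x 4 image whose intensity is a plane: value = x + 4 y.
image = np.arange(16, dtype=float).reshape(4, 4)
approx, horiz, vert, diag = haar_2d(image)

total_energy = sum((band ** 2).sum() for band in (approx, horiz, vert, diag))
print(approx, diag, total_energy)
```

Each of the four outputs is a quarter of the size of the input, and together they carry the same energy as the image. For this planar test image the diagonal band is identically zero, since the diagonal wavelet responds only to variation that is not separable into a pure $x$ trend plus a pure $y$ trend.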

Filtering can be done on rows and columns in a two-dimensional array, corresponding to the horizontal and vertical directions in images, for example. If $a^0$ is an $N \times N$ array, then applying Equation 2.47 to the rows of the image results in an array $a^{1/2}$ of size $N/2 \times N$. Applying Equation 2.55 also results in an array of size $N/2 \times N$, say $b^{1/2}$. The transformation is then applied to the columns of the array $a^{1/2}$ to obtain $a^1$ and $b^{1,h}$, each of size $N/2 \times N/2$. The same transformation is applied to $b^{1/2}$ to obtain another two arrays, $b^{1,v}$ and $b^{1,d}$, each of size $N/2 \times N/2$. The elements of the array $a^1$ are called the scaling coefficients of the image while elements of arrays $b^{1,h}$, $b^{1,v}$ and $b^{1,d}$ are called the wavelet