Katholieke Universiteit Leuven
Departement Elektrotechniek ESAT-SISTA/TR 1999-52
A Comparison between Iterative block-LMS and Partial Rank Algorithm¹
Koen Eneman, Simon Doclo, Marc Moonen²
March 1999
Internal report ESAT-SISTA
¹ This report is available by anonymous ftp from ftp.esat.kuleuven.ac.be in the directory pub/SISTA/eneman/reports/99-52.ps.gz
² ESAT (SISTA) - Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, 3001 Leuven (Heverlee), Belgium, Tel. 32/16/321809, Fax 32/16/321970, WWW: http://www.esat.kuleuven.ac.be/sista. E-mail: koen.eneman@esat.kuleuven.ac.be, simon.doclo@esat.kuleuven.ac.be. Simon Doclo is a Research Assistant supported by the I.W.T. Marc Moonen is a Research Associate with the F.W.O. Vlaanderen. This research work was carried out at the ESAT laboratory of the Katholieke Universiteit Leuven, in the frame of the concerted research action MIPS ('Model-based Information Processing Systems') of the Flemish Government, the F.W.O. research project nr. G.0295.97, the IUAP program P4-02 (1997-2001) ('Modeling, Identification, Simulation and Control of Complex Systems'), the IT-project Multimicrophone Signal Enhancement Techniques for Hands-free Telephony and Voice Controlled Systems (AUT/970517/Philips ITCL) of the I.W.T. and was partially sponsored by Philips-ITCL. The scientific responsibility is assumed by its authors.
A Comparison between Iterative block-LMS and Partial Rank Algorithm
Koen Eneman, Simon Doclo, Marc Moonen
March 17, 1999
1 Introduction
In this report, we examine the relation between two adaptive filtering techniques: iterative block-LMS [1] and PRA (Partial Rank Algorithm) [2][3]. It is shown that iterative block-LMS approaches PRA with stepsize $\alpha = 1$, if the number of iterations goes to infinity. This is a simple block extension of the fact that the normalised LMS algorithm can be considered as the limiting result of reiterating the LMS algorithm [4]. In each iteration of block-LMS, the a-priori (and a-posteriori) error decreases, and the difference between the actual and the calculated filter coefficients decreases. The computational complexity of both algorithms is also compared.
2 Notation
The size of each block is $L$ and the filter length is $N$. The input vector at time $k$ is denoted by $x_k = [x_k \; x_{k-1} \; \ldots \; x_{k-N+1}]^T$. In block-LMS as well as in PRA, we consider $L$ consecutive input vectors, such that the input data matrix $X_n \in \mathbb{R}^{N \times L}$ for block $n$ is a Toeplitz matrix,
$$
X_n = [\, x_{nL} \;\; x_{nL+1} \;\; \ldots \;\; x_{nL+L-1} \,] =
\begin{bmatrix}
x_{nL}     & x_{nL+1}   & \cdots & x_{nL+L-1} \\
x_{nL-1}   & x_{nL}     & \cdots & x_{nL+L-2} \\
\vdots     & \vdots     &        & \vdots     \\
x_{nL-N+1} & x_{nL-N+2} & \cdots & x_{nL-N+L}
\end{bmatrix}. \tag{1}
$$
For block-LMS, the adaptive filter of length $N$ for block $n$ at iteration step $r$ is denoted by $w_n^{(r)}$. For PRA, the adaptive filter of length $N$ for block $n$ is denoted by $w_n$. The desired signal for block $n$ is denoted by $d_n = [d_{nL} \; d_{nL+1} \; \ldots \; d_{nL+L-1}]^T$. The error signal for block $n$ is denoted by $e_n = [e_{nL} \; e_{nL+1} \; \ldots \; e_{nL+L-1}]^T$.
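To make this notation concrete, the Toeplitz data matrix of equation (1) can be built as follows. This is a minimal sketch; the function name, the zero-padding of samples before the start of the signal, and the ramp test signal are illustrative assumptions, not part of the report.

```python
import numpy as np

def data_matrix(x, n, N, L):
    """Build the N x L Toeplitz input data matrix X_n of equation (1).

    Column j holds the input vector x_{nL+j} = [x_{nL+j}, ..., x_{nL+j-N+1}]^T.
    Samples with negative index are taken as zero (an assumption for this sketch).
    """
    X = np.zeros((N, L))
    for j in range(L):
        for i in range(N):
            k = n * L + j - i          # sample index of entry (i, j): x_{nL+j-i}
            if 0 <= k < len(x):
                X[i, j] = x[k]
    return X

# Small example: N = 4, L = 2, block n = 2 of a ramp signal x_0 = 1, x_1 = 2, ...
x = np.arange(1.0, 11.0)
X = data_matrix(x, n=2, N=4, L=2)
# First column is [x_4, x_3, x_2, x_1]^T = [5, 4, 3, 2]^T
```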
3 Block-LMS and PRA
Block-LMS and PRA (like many adaptive filtering algorithms) consist of two parts:
1. calculation of the a-priori error
2. weight update using this a-priori error
3.1 Calculation of the a-priori error
The calculation of the a-priori error is the same for both algorithms. For each element in $e_n^{pri}$, we can write
$$
e_k^{pri} = d_k - x_k^T w_n, \qquad k = nL, \ldots, nL+L-1,
$$
such that
$$
e_n^{pri} = d_n - X_n^T w_n. \tag{2}
$$
3.2 Weight update
The weight update formula for block-LMS has the same form as the (well-known) weight update formula for LMS, except that the mean gradient over a block of size $L$ is calculated and that the filters are updated every $L$ samples:
$$
w_{n+1} = w_n + \mu \sum_{k=nL}^{nL+L-1} x_k e_k^{pri}
        = w_n + \mu X_n e_n^{pri}
        = w_n + \mu X_n \left( d_n - X_n^T w_n \right). \tag{3}
$$
For the same block of $L$ samples, we can as well iterate this formula:
$$
w_n^{(0)} = w_n \tag{4}
$$
$$
w_n^{(r+1)} = w_n^{(r)} + \mu X_n \left( d_n - X_n^T w_n^{(r)} \right). \tag{5}
$$
Block-LMS for which equation (5) is iterated is called iterative block-LMS or block row action projection (BRAP) [5]. The different steps of the algorithm are summarised in table 1.
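The iteration of equations (4)-(5) on one block can be sketched as follows. This is a minimal illustration, not the report's own implementation; the function name and array conventions are assumptions.

```python
import numpy as np

def brap_block(X, d, w, mu, R):
    """One block of iterative block-LMS (BRAP), equations (4)-(5).

    X  : N x L input data matrix for block n
    d  : desired-signal vector of length L
    w  : filter weights w_n (length N)
    mu : stepsize, R : number of iterations
    Returns the updated weights w_{n+1} = w_n^{(R)}.
    """
    for _ in range(R):
        e = d - X.T @ w        # a-priori error of the current iterate, eq. (2)
        w = w + mu * X @ e     # gradient step re-using the same block, eq. (5)
    return w
```

With $R = 1$ this is plain block-LMS, equation (3).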
The partial rank algorithm (PRA) is a block version of the affine projection algorithm (APA) [5]. The weight update equation is an extension of equation (3). A normalisation step is added, resulting in the following weight update equation:
$$
w_{n+1} = w_n + \alpha X_n \left( X_n^T X_n + \delta I_L \right)^{-1} \left( d_n - X_n^T w_n \right). \tag{6}
$$
If $L = 1$, PRA reduces to NLMS. The steps of the PRA algorithm are summarised in table 2.
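A corresponding sketch of the PRA update (6) for one block, solving the regularised normal equations instead of forming the inverse explicitly (the function name is an assumption):

```python
import numpy as np

def pra_block(X, d, w, alpha, delta):
    """One block of the partial rank algorithm, equation (6).

    Solves the regularised L x L system (X^T X + delta*I) p = e
    rather than inverting the matrix.
    """
    L = X.shape[1]
    e = d - X.T @ w                                   # a-priori error
    p = np.linalg.solve(X.T @ X + delta * np.eye(L), e)
    return w + alpha * X @ p
```

For $L = 1$, $\alpha = 1$ and $\delta = 0$ this reduces to the NLMS update, and the a-posteriori error on the block is zero.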
4 The Relation between iterative block-LMS and PRA
4.1 Weight update
Theorem 1. For the $r$th iteration, the weights $w_n^{(r)}$ can be written as a function of $w_n^{(0)}$ as
$$
w_n^{(r)} = w_n^{(0)} + \mu X_n \left[ \sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right). \tag{7}
$$
Proof: For $r = 1$, the proof is trivial and follows from equation (5). Suppose equation (7) holds for $w_n^{(r)}$; then
$$
\begin{aligned}
w_n^{(r+1)} &= w_n^{(r)} + \mu X_n \left( d_n - X_n^T w_n^{(r)} \right) \\
&= w_n^{(0)} + \mu X_n \left[ \sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right) \\
&\quad + \mu X_n \left( d_n - X_n^T \left( w_n^{(0)} + \mu X_n \left[ \sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right) \right) \right) \\
&= w_n^{(0)} + \mu X_n \left[ \sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l + I_L - \mu X_n^T X_n \sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right) \\
&= w_n^{(0)} + \mu X_n \left[ I_L + \left( I_L - \mu X_n^T X_n \right) \sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right) \\
&= w_n^{(0)} + \mu X_n \left[ I_L + \sum_{l=1}^{r} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right) \\
&= w_n^{(0)} + \mu X_n \left[ \sum_{l=0}^{r} \left( I_L - \mu X_n^T X_n \right)^l \right] \left( d_n - X_n^T w_n^{(0)} \right).
\end{aligned}
$$
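The closed-form expression (7) is easy to verify numerically. In the following sketch the sizes, stepsize and random data are arbitrary illustrative choices.

```python
import numpy as np

# Check of equation (7): the closed form for w_n^(r) must match
# r explicit iterations of equation (5).
rng = np.random.default_rng(0)
N, L, r, mu = 6, 3, 4, 0.05
X = rng.standard_normal((N, L))
d = rng.standard_normal(L)
w0 = rng.standard_normal(N)

# r iterations of equation (5)
w = w0.copy()
for _ in range(r):
    w = w + mu * X @ (d - X.T @ w)

# closed form, equation (7)
M = np.eye(L) - mu * X.T @ X
S = sum(np.linalg.matrix_power(M, l) for l in range(r))
w_closed = w0 + mu * X @ S @ (d - X.T @ w0)
```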
Theorem 2. The iteration described in equations (4) and (5) converges to
$$
w_n^{(\infty)} = w_n^{(0)} + X_n \left( X_n^T X_n \right)^{-1} \left( d_n - X_n^T w_n^{(0)} \right), \tag{8}
$$
if $\mu$ is chosen such that $\mu < 2/\lambda_{max}$, with $\lambda_{max}$ the maximum eigenvalue of $X_n^T X_n$.
Proof: If we take the eigendecomposition of $X_n^T X_n$,
$$
X_n^T X_n = Q^T \Lambda Q,
$$
then we can write
$$
\sum_{l=0}^{r-1} \left( I_L - \mu X_n^T X_n \right)^l
= \sum_{l=0}^{r-1} \left( Q^T Q - \mu Q^T \Lambda Q \right)^l
= Q^T \left[ \sum_{l=0}^{r-1} \left( I_L - \mu \Lambda \right)^l \right] Q
= Q^T \operatorname{diag}\!\left( \sum_{l=0}^{r-1} (1 - \mu \lambda_j)^l \right) Q
\stackrel{\lambda_j \neq 0}{=} Q^T \operatorname{diag}\!\left( \frac{1 - (1 - \mu \lambda_j)^r}{\mu \lambda_j} \right) Q, \tag{9}
$$
such that equation (7) becomes
$$
w_n^{(r)} = w_n^{(0)} + X_n Q^T \operatorname{diag}\!\left( \frac{1 - (1 - \mu \lambda_j)^r}{\lambda_j} \right) Q \left( d_n - X_n^T w_n^{(0)} \right). \tag{10}
$$
For $r$ going to $\infty$, equation (10) becomes
$$
\lim_{r \to \infty} w_n^{(r)}
= w_n^{(0)} + X_n Q^T \operatorname{diag}\!\left( \lim_{r \to \infty} \frac{1 - (1 - \mu \lambda_j)^r}{\lambda_j} \right) Q \left( d_n - X_n^T w_n^{(0)} \right)
= w_n^{(0)} + X_n Q^T \operatorname{diag}\!\left( \frac{1}{\lambda_j} \right) Q \left( d_n - X_n^T w_n^{(0)} \right) \tag{11}
$$
under the condition that $|1 - \mu \lambda_j| < 1 \;\; \forall j$, which is satisfied if $(1 - \mu \lambda_{max}) > -1$ (assuming $\mu > 0$), or $\mu < 2/\lambda_{max}$. We can rewrite equation (11) as
$$
w_n^{(\infty)} = w_n^{(0)} + X_n Q^T \Lambda^{-1} Q \left( d_n - X_n^T w_n^{(0)} \right)
= w_n^{(0)} + X_n \left( X_n^T X_n \right)^{-1} \left( d_n - X_n^T w_n^{(0)} \right), \tag{12}
$$
which is the same weight update formula as for PRA with $\alpha = 1$ and $\delta = 0$ (see equation (6)).
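This limiting behaviour can be illustrated numerically. The sketch below (sizes, seed and iteration count are arbitrary choices) iterates equation (5) on a single block and compares the result with the right-hand side of equation (12):

```python
import numpy as np

# Theorem 2 illustration: iterating block-LMS on one block converges to the
# PRA update with alpha = 1 and delta = 0, provided mu < 2/lambda_max.
rng = np.random.default_rng(1)
N, L = 8, 4
X = rng.standard_normal((N, L))
d = rng.standard_normal(L)
w0 = rng.standard_normal(N)

lam_max = np.linalg.eigvalsh(X.T @ X).max()
mu = 1.0 / lam_max                      # safely inside mu < 2/lambda_max

w = w0.copy()
for _ in range(5000):                   # "r -> infinity"
    w = w + mu * X @ (d - X.T @ w)

# PRA update of equation (12) / equation (6) with alpha = 1, delta = 0
w_pra = w0 + X @ np.linalg.solve(X.T @ X, d - X.T @ w0)
```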
4.2 A-posteriori error
The a-posteriori error is defined as
$$
e_n^{post} = d_n - X_n^T w_{n+1} = d_n - X_n^T w_n^{(R)}. \tag{13}
$$
After $R$ iterations with block-LMS the a-posteriori error is
$$
\begin{aligned}
e_n^{post} &= d_n - X_n^T w_n^{(R)}
= d_n - X_n^T \left[ w_n^{(0)} + X_n Q^T \operatorname{diag}\!\left( \frac{1 - (1 - \mu \lambda_j)^R}{\lambda_j} \right) Q \left( d_n - X_n^T w_n^{(0)} \right) \right] \\
&= \left( d_n - X_n^T w_n^{(0)} \right) - X_n^T X_n Q^T \operatorname{diag}\!\left( \frac{1 - (1 - \mu \lambda_j)^R}{\lambda_j} \right) Q \left( d_n - X_n^T w_n^{(0)} \right) \\
&= \left[ I_L - Q^T \Lambda Q \, Q^T \operatorname{diag}\!\left( \frac{1 - (1 - \mu \lambda_j)^R}{\lambda_j} \right) Q \right] \left( d_n - X_n^T w_n^{(0)} \right) \qquad (14) \\
&= \left[ I_L - Q^T \operatorname{diag}\!\left( 1 - (1 - \mu \lambda_j)^R \right) Q \right] \left( d_n - X_n^T w_n^{(0)} \right). \qquad (15)
\end{aligned}
$$
For $R$ going to $\infty$, the a-posteriori error becomes zero:
$$
\lim_{R \to \infty} e_n^{post} = \left( I_L - Q^T Q \right) \left( d_n - X_n^T w_n^{(0)} \right) = 0, \tag{16}
$$
as could be expected. Iterative block-LMS approaches PRA with $\alpha = 1$ and $\delta = 0$, and PRA is defined such that the a-posteriori error is zero.
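Equations (15) and (16) can likewise be checked numerically; sizes, seed and $R$ below are illustrative choices. Note that equation (15) is equivalent to $e_n^{post} = Q^T \operatorname{diag}\!\left( (1-\mu\lambda_j)^R \right) Q \left( d_n - X_n^T w_n^{(0)} \right)$, which is the form tested here.

```python
import numpy as np

# Check of equation (15): the a-posteriori error after R iterations equals
# Q^T diag((1 - mu*lambda_j)^R) Q (d - X^T w^(0)), hence it shrinks with R.
rng = np.random.default_rng(2)
N, L, R = 8, 4, 50
X = rng.standard_normal((N, L))
d = rng.standard_normal(L)
w0 = rng.standard_normal(N)

lam, V = np.linalg.eigh(X.T @ X)        # X^T X = V diag(lam) V^T, i.e. Q = V^T
mu = 1.0 / lam.max()

w = w0.copy()
for _ in range(R):
    w = w + mu * X @ (d - X.T @ w)
e_post = d - X.T @ w

e0 = d - X.T @ w0
e_pred = V @ np.diag((1.0 - mu * lam) ** R) @ V.T @ e0
```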
5 Computational complexity
5.1 Iterative block-LMS
The formulas defining iterative block-LMS (BRAP) and the corresponding cost are given in table 1.
for each block of L samples do
X n =
2
6
6
6
4
x
nL
xnL
+1 ::: xnL
+L
;1x
nL
;1 xnL
::: xnL
+L
;2... ... ... ...
x
nL
;N
+1 xnL
;N
+2 ::: xnL
;N
+L
3
7
7
7
5
w
(0)n = w n
d n =
2
6
4 d
nL
...
d
nL
+L
;13
7
5
for r
= 0
to R;1
doy
(n r
)= X Tn w
(n r
)2
LN;Le
(n r
)= d n
;y
(n r
) Lw
(n r
+1)= w
(n r
)+
X n e
(n r
)2
LN+
Lg
w n
+1= w
(n R
;1)g
Table 1: Block Row Action Projection (BRAP)
The total cost per sample for iterative block-LMS (BRAP) (see table 1) is
$$
4NR + R. \tag{17}
$$
5.2 Partial Rank Algorithm
The formulas defining PRA are given in table 2. In order to obtain the vector $p_n$, the inverse of a regularised symmetric matrix $M_n + \delta I_L = X_n^T X_n + \delta I_L$ has to be computed. As data is shifted block by block, efficient QR-based updating of $X_n^T X_n$ is not possible for $L > 1$. However, $X_n^T X_n$ can be updated recursively¹:
$$
M_n = X_n^T X_n
= \left[ X_n^{0\,T} \;\; X_n^{1\,T} \;\; \ldots \;\; X_n^{(N/L-1)\,T} \right]
\begin{bmatrix}
X_n^0 \\ X_n^1 \\ \vdots \\ X_n^{N/L-1}
\end{bmatrix} \tag{18}
$$
$$
= M_{n-1} + X_n^{0\,T} X_n^0 - X_n^{(N/L)\,T} X_n^{N/L}. \tag{19}
$$
¹ assuming $N$ is a multiple of $L$ and defining
$$
X_n^k =
\begin{bmatrix}
x_{nL-kL}     & \cdots & x_{nL+L-1-kL} \\
\vdots        &        & \vdots        \\
x_{nL-L+1-kL} & \cdots & x_{nL-kL}
\end{bmatrix}.
$$
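The recursive update (19) can be verified numerically with the sub-block definition of footnote 1; the sizes and random input in the following sketch are illustrative assumptions. It exploits the fact that $X_n^{N/L} = X_{n-1}^{N/L-1}$, so the oldest sub-block drops out when a new block of data shifts in.

```python
import numpy as np

# Check of equation (19): M_n follows from M_{n-1} by adding the newest
# L x L sub-block X_n^0 and removing the oldest one, X_n^{N/L}.
rng = np.random.default_rng(4)
N, L = 8, 2
x = rng.standard_normal(200)

def sub_block(n, k):
    """X_n^k with entries x_{nL-kL-i+j}, i, j = 0 ... L-1 (footnote 1)."""
    return np.array([[x[n*L - k*L - i + j] for j in range(L)]
                     for i in range(L)])

def data_matrix(n):
    """X_n as the stack [X_n^0; X_n^1; ...; X_n^{N/L-1}], equation (18)."""
    return np.vstack([sub_block(n, k) for k in range(N // L)])

n = 10
M_prev = data_matrix(n - 1).T @ data_matrix(n - 1)          # M_{n-1}
M_direct = data_matrix(n).T @ data_matrix(n)                # M_n, direct
M_rec = (M_prev + sub_block(n, 0).T @ sub_block(n, 0)       # M_n, eq. (19)
                - sub_block(n, N // L).T @ sub_block(n, N // L))
```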
for each block of $L$ samples do
$$
X_n^k =
\begin{bmatrix}
x_{nL-kL}     & \cdots & x_{nL+L-1-kL} \\
\vdots        &        & \vdots        \\
x_{nL-L+1-kL} & \cdots & x_{nL-kL}
\end{bmatrix}, \quad k = 0, \ldots, N/L
\qquad
X_n =
\begin{bmatrix}
X_n^0 \\ X_n^1 \\ \vdots \\ X_n^{N/L-1}
\end{bmatrix}
\qquad
d_n = [d_{nL} \; \ldots \; d_{nL+L-1}]^T
$$
  $y_n = X_n^T w_n$  (cost: $2LN - L$)
  $e_n = d_n - y_n$  (cost: $L$)
  $M_n = M_{n-1} + X_n^{0\,T} X_n^0 - X_n^{(N/L)\,T} X_n^{N/L}$  (cost: $(2L+1)L(L+1)$)
  $p_n = (M_n + \delta I_L)^{-1} e_n$  (cost: $\frac{2}{3}L^3 + L^2 + L$)
  $w_{n+1} = w_n + \alpha X_n p_n$  (cost: $2LN + L$)
end

Table 2: Partial Rank Algorithm (PRA)
This reduces the cost substantially if $N/L$ is large. $p_n = (M_n + \delta I_L)^{-1} e_n$ can be computed efficiently by solving the linear set of equations $(M_n + \delta I_L) p_n = e_n$ using Gauss elimination (cost: $\frac{2L^3}{3}$) and back substitution (cost: $L^2$). The total cost per sample for PRA (see table 2) is
$$
4N + \frac{8L^2}{3} + 4L + 3. \tag{20}
$$
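In code, the symmetric positive definite system $(M_n + \delta I_L) p_n = e_n$ can also be solved with a Cholesky factorisation, which exploits the symmetry (roughly halving the factorisation cost relative to Gauss elimination). The SciPy helpers below are an illustrative choice, not part of the report; the cost figures in table 2 assume Gauss elimination.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Solve (M + delta*I) p = e via a Cholesky factorisation followed by
# two triangular solves, instead of forming the inverse explicitly.
rng = np.random.default_rng(3)
N, L, delta = 12, 4, 1e-3
X = rng.standard_normal((N, L))
e = rng.standard_normal(L)

M = X.T @ X                                   # symmetric, positive semidefinite
A = M + delta * np.eye(L)                     # regularised, positive definite
p = cho_solve(cho_factor(A), e)
```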
5.3 BRAP vs. PRA
Whereas the cost of BRAP is independent of the block length, PRA becomes more expensive for large values of $L$. For small values of $R$, BRAP will be cheaper than PRA. The more iterations $R$ are performed with BRAP, the more expensive it becomes, but the better it approaches the convergence behaviour of PRA (see section 6). We now compute $R_{crit}$, the value of $R$ for which BRAP is as expensive as PRA. By combining equations (17) and (20), we obtain
$$
R_{crit} = \frac{4N + \frac{8L^2}{3} + 4L + 3}{4N + 1}. \tag{21}
$$
For long adaptive filters, $N \gg L$. By taking the limit for $N$ to $\infty$ we find
$$
R_{crit}^{\infty} = 1. \tag{22}
$$
As could be expected, for small block lengths $L$, $R_{crit}$ is small and PRA outperforms BRAP both in terms of convergence and cost.
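Equation (21) can be evaluated directly; the sketch below reproduces the $R_{crit}$ values quoted in section 6 (the function name is an assumption).

```python
# Break-even number of BRAP iterations R_crit, equation (21),
# evaluated for the experimental settings of section 6 (N = 100).
def r_crit(N, L):
    return (4*N + 8*L**2/3 + 4*L + 3) / (4*N + 1)

print(round(r_crit(100, 20), 2))   # prints 3.86
print(round(r_crit(100, 3), 2))    # prints 1.09
```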
The effect of parameter $L$ on the convergence of PRA is shown in figures 1 and 2.

Figure 1: Error between the filter coefficients and the true path as a function of time for PRA. Coloured stationary noise was filtered with a 150-taps FIR filter. The adaptive filter length $N$ is 150.

Both coloured noise and a music signal are filtered with an artificial path and identified with PRA. The block length $L$ was changed from 1 (= NLMS) to 100. Large values of $L$ lead to better steady-state suppression and initial convergence. For large values of $L$, $R_{crit}$ will also be considerable, and BRAP with $R < R_{crit}$ might be employed to obtain a cheap adaptive algorithm with a performance comparable to that of PRA.
As ever, there exists a trade-off between good performance at a high cost (PRA) and reduced performance at a lower cost (BRAP with $R < R_{crit}$). The optimal $R$ can be determined using the following heuristic:
- increase $R$ by making a trade-off between the expected improvement in performance and the extra cost, taking into account
  - the available processor power
  - the additional cost and improvement in performance when choosing PRA
- if $R \geq R_{crit}$, choose PRA

Some performance tests are needed to estimate the expected improvement in performance for BRAP when $R$ is increased.
Figure 2: Error between the filter coefficients and the true path as a function of time for PRA. A music signal was filtered with a 1200-taps FIR filter. The adaptive filter length is 1000.
6 Simulations
Coloured stationary noise was filtered with a 100-taps FIR filter. PRA and BRAP with $R = 1, 2, 3, 10, 20, 50, 100$ are compared. The length of the adaptive filters $N$ is 100. In a first experiment the block length $L$ was chosen to be 20. For this value of $L$, $R_{crit} = 3.86$. Figure 3 shows the evolution of the filter coefficients. Figures 4 and 5 show the evolution of the a-priori and a-posteriori errors respectively. In a second experiment the block length $L$ was chosen to be 3. Figure 6 shows the evolution of the filter coefficients. For this value of $L$, $R_{crit} = 1.09$. Figures 7 and 8 show the evolution of the a-priori and a-posteriori errors respectively.
7 Conclusion
In this report we have shown that the partial rank algorithm (PRA) can be considered as the limiting result of reiterating the block-LMS algorithm (BRAP) with the same data. As a consequence, for BRAP, the convergence of the adaptive weights and the a-priori error is inferior to that of PRA. For a small number of iterations the computational cost of BRAP is lower than that of PRA. This means that for large block sizes and limited processing power iterative block-LMS can be a viable alternative to PRA.
Figure 3: Error between the filter coefficients and the true path as a function of time for BRAP and PRA, $L = 20$. Coloured stationary noise was filtered with a 100-taps FIR filter. The adaptive filter length $N$ is 100.
8 Acknowledgements
Simon Doclo is a Research Assistant supported by the I.W.T. Marc Moonen is a Research Associate with the F.W.O. Vlaanderen. This research work was carried out at the ESAT laboratory of the Katholieke Universiteit Leuven, in the frame of the concerted research action MIPS ('Model-based Information Processing Systems') of the Flemish Government, the F.W.O. research project nr. G.0295.97, the IUAP program P4-02 (1997-2001) ('Modeling, Identification, Simulation and Control of Complex Systems'), the IT-project Multimicrophone Signal Enhancement Techniques for Hands-free Telephony and Voice Controlled Systems (AUT/970517/Philips ITCL) of the I.W.T. and was partially sponsored by Philips-ITCL. The scientific responsibility is assumed by its authors.
References
[1] S. Haykin, Adaptive Filter Theory, Information and System Sciences Series. Prentice Hall, 3rd edition, 1996.
[2] K. Ozeki and T. Umeda, "An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties", Electron. Commun. Japan, vol. 67-A, pp. 19-27, 1984.
Figure 4: Evolution of the a-priori errors for BRAP and PRA, $L = 20$. Coloured stationary noise was filtered with a 100-taps FIR filter. The adaptive filter length $N$ is 100.
[3] S. G. Kratzer and D. R. Morgan, "The partial-rank algorithm for adaptive beamforming", in Proc. SPIE Int. Soc. Opt. Eng., 1985, vol. 564, pp. 9-14.
[4] D. R. Morgan and S. G. Kratzer, "On a Class of Computationally Efficient, Rapidly Converging, Generalized NLMS Algorithms", IEEE Signal Processing Letters, vol. 3, no. 8, pp. 245-247, Aug. 1996.
[5] S. Gay, Fast Projection Algorithms with Application to Voice Echo Cancellation, PhD thesis, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA, October 1994.
Figure 5: Evolution of the a-posteriori errors for BRAP and PRA, $L = 20$. Coloured stationary noise was filtered with a 100-taps FIR filter. The adaptive filter length $N$ is 100.
Figure 6: Error between the filter coefficients and the true path as a function of time for BRAP and PRA, $L = 3$. Coloured stationary noise was filtered with a 100-taps FIR filter. The adaptive filter length $N$ is 100.
Figure 7: Evolution of the a-priori errors for BRAP and PRA, $L = 3$. Coloured stationary noise was filtered with a 100-taps FIR filter. The adaptive filter length $N$ is 100.
Figure 8: Evolution of the a-posteriori errors for BRAP and PRA, $L = 3$. Coloured stationary noise was filtered with a 100-taps FIR filter. The adaptive filter length $N$ is 100.