NEURAL NETWORKS AND NEURAL FIELDS:
Discrete and Continuous Space Neural Models
by
Roderick Edwards
B.A., University of Victoria, 1980
B.Sc., University of Victoria, 1988
M.Sc., Heriot-Watt University, 1990
A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of
DOCTOR OF PHILOSOPHY
in the Department of Mathematics and Statistics
We accept this dissertation as conforming to the required standard
Dr. R. Illner, Supervisor (Department of Mathematics and Statistics)
Dr. A. Hurd, Departmental Member (Department of Mathematics and Statistics)
Dr. P. van den Driessche, Departmental Member (Department of Mathematics and Statistics)
Dr. N. Dimopoulos, Outside Member (Department of Electrical and Computer Engineering)
Dr. R. Snell, Outside Member (Royal Roads)
Dr. Jacques Belair, External Examiner (University of Montreal)
© RODERICK EDWARDS, 1994
University of Victoria
All rights reserved. Dissertation may not be reproduced in whole or in part, by photocopying or other means, without the permission of the author.
Supervisor: Dr. Reinhard Illner
Abstract
'Attractor' neural network models have useful properties, but biology suggests that more varied dynamics may be significant. Even the equations of the Hopfield network, without the constraint of symmetry, can have complex behaviours which have been little studied. Several new ideas or approaches to neural network theory are examined here, focussing on the distinction between discrete and continuous space neural models. First, simple chaotic dynamical systems are examined, as candidates for more natural neural network models, including coupled systems of Lorenz equations and a Hopfield equation model with a balance of inhibitory and excitatory neurons. Also, continuous space models with a structure like that of the Hopfield network are briefly explored, with interesting training possibilities.
The main results deal with the approximation of Hopfield network equations with a particular class of connection structures (allowing asymmetry), by a reaction-diffusion equation, using techniques borrowed from particle methods used in the numerical solution of fluid-dynamical equations. It is shown that the approximation holds rigorously only in certain spatial regions but the small regions where it fails, namely within transition layers between regions of high and low activity, are not likely to be critical. The result serves to classify connectivities in Hopfield-type models and sheds light on the limiting behaviour of networks as the number of neurons goes to infinity. Standard discretizations of the reaction-diffusion equations are analyzed to clarify the effects which can arise in the limiting process. The discrete space systems can have stable patterned equilibria which must be close to metastable patterns of the continuous systems.
Our results also suggest that the fine structure of neural connections is important, and to obtain complex behaviour in the Hopfield network equations, a predominance of inhibition or wildly oscillating connection matrix entries are indicated.
Examiners:
Dr. R. Illner, Supervisor (Department of Mathematics and Statistics)
Dr. P. van den Driessche, Departmental Member (Department of Mathematics and Statistics)
Dr. N. Dimopoulos, Outside Member (Department of Electrical and Computer Engineering)
Dr. R. Snell, Outside Member (Royal Roads)
Table of Contents
Abstract
Table of Contents
List of Figures
Acknowledgements
Dedication
1. Introduction
    1.1 Context of neural network research
    1.2 Development of neural network models
    1.3 Motivation for new approaches
    1.4 Summary of approaches taken and results obtained
2. Background on the conventional Hopfield network model
    2.1 The Hopfield network equations
    2.2 Energy functional for the Hopfield network
    2.3 Membrane potentials, firing rates and the S-Σ exchange
    2.4 The sigmoid response function and the high gain condition
    2.5 Learning rules for the Hopfield network
    2.6 The role of external inputs
3. Limitations of the conventional Hopfield network
4. Chaotic neural networks
    4.1 Chaos in neural networks: why and how?
    4.2 The Lorenz equations as a neural network
    4.3 Attempts to extend the Lorenz equations
    4.4 Kwan's model and a new chaotic Hopfield network
    4.5 Tentative conclusions
5. Integro-differential equations as neural fields
    5.1 The integro-differential equation
    5.2 The energy functional
    5.3 Equilibria and gain
    5.4 Equilibria with a hard nonlinearity
    5.5 The S-Σ exchange revisited
    5.7 Stability for discrete time dynamics
    5.8 Training the discrete time equation
    5.9 Tentative conclusions
6. Approximation of neural network dynamics by a reaction-diffusion equation
    6.1 Introduction
    6.2 Approximation of an integral operator by a differential operator
    6.3 Quadrature for the integral
    6.4 Generalization of Cottet's result
    6.5 Convergence of the approximation
    6.6 Discussion
    6.A Appendix on bounds for the second derivative of the solution
    6.B Appendix on width of transition layers
7. Behaviour of the reaction-diffusion equations
    7.1 Cottet's equation
    7.2 The general equation
    7.3 Conclusions
8. Finite difference discretizations of the reaction-diffusion equations
    8.1 Forms of the reaction-diffusion equation and their discretizations
    8.2 Lyapunov functional
    8.3 Stability of flat equilibria
    8.4 A stable equilibrium of period 6
    8.5 Large scale stable patterns
    8.6 Patterns in d-dimensions
    8.7 Discretizations of the general reaction-diffusion equation
    8.8 Stable patterns and metastability
9. Conclusions
    9.1 Implications for neural network theory
    9.2 Mathematical implications
    9.3 Further directions
List of Figures
2.1 An example of a sigmoid response function.
4.1a Trajectories of coupled Lorenz systems (4.4) with a = 1.
4.1b Trajectories of coupled Lorenz systems (4.4) with a = 2.
4.1c Trajectories of coupled Lorenz systems; a = -2.5, inputs = 0, 50.
4.1d Trajectories of coupled Lorenz systems; a = 5.5, inputs = 50, 50.
4.2 The 'Kwan' model.
4.3 A 'hump' function: the difference of two logistic functions.
4.4a Input-shifted hump function with I = -0.525.
4.4b Input-shifted hump function with I = -0.6.
4.4c Input-shifted hump function with I = 0.7.
4.5 Hump function with I = -0.8 applied only to the inhibitory neuron.
5.1 An example of a region of support for T(x, y) with 3 disjoint boxes.
5.2 Support of T(x, y) (shaded) after one input in Examples 5.3, 5.4, 5.5.
5.3 Support of T(x, y) (shaded) after two inputs in Examples 5.4 and 5.5.
6.1 An example of an even η(x) satisfying moment conditions.
6.2 An example of the function from Cottet's equation.
6.3a An example of the reaction term, b(v), in Cottet's equation.
6.3b Derivative of the reaction term in Cottet's equation.
6.4 A sketch of a solution to Cottet's equation showing transition layers.
6.5 An example of a connection matrix of the type covered by Theorem 6.5.
6.6 A sketch of f(w(y)), from equation (6.37), for use in energy estimates.
8.1 A period 6 equilibrium for a discrete space version of Cottet's equation.
8.2a Initial data for iteration to a period 12 equilibrium.
8.2b The period 12 equilibrium resulting from the iteration.
8.3a Random period 50 initial data for Cottet's equation discretized.
8.3b Transition layers formed at t = 5 from the random initial data.
8.4 Solution for the general reaction-diffusion equation discretized.
Acknowledgements
I am grateful for the generous financial support which enabled me to pursue this research and continue to help support a family. The work was partially supported by an NSERC Postgraduate Scholarship, a University of Victoria Fellowship, and research grants of Dr. Reinhard Illner (NSERC research grant A-7847) and Dr. Pauline van den Driessche (NSERC research grant A-8965).
The experience would not have been as productive or enjoyable were it not for less concrete (but not less significant) forms of assistance from several people. In particular, I thank Reinhard for inspiration and encouragement, Cathy for patience and support, and Tom for empathizing. I also thank Phil at the computer help desk for saving Chapter 1 from magnetic oblivion.
To my father,
1. Introduction
1.1 Context of neural network research.
Neural networks, as models of biological neural activity and especially as models of computation, have shown great promise. It is clear that biological brains, even very simple ones, are capable of easily performing certain tasks that are extremely difficult to implement in a standard sequential program on a standard sequential computer, despite the great power and complexity of such machines. Such tasks include, for example: recognition of an object in an image; recall of a memorized pattern from a partial or distorted one; and control of limb movements to manoeuvre smoothly around objects. Furthermore, biological brains are capable of learning. The field of neural network modelling developed as an attempt to understand what it is about a large system of relatively simple interconnected units such as biological neurons that allows them to perform such processing tasks so well.
Certain properties of biological neural systems are evident even from a relatively naive physiological perspective. They are certainly massively parallel systems, rather than single, powerful sequential processors like conventional computers. They are also remarkably fault-tolerant. That is, although their operation is degraded by substantial damage, and functions may be entirely lost with sufficient damage, a moderate loss of processing units or connections does not seriously affect the functioning of the system. Also, and this is a related fact, information is 'stored' in a distributed manner. Memories, for example, are not located in specific memory 'registers' as in a computer, where damage to these specific registers would obliterate the memory.
The physiological study of neural activity has shown that the essential mechanism is the accumulation of electrical potential in the body or 'soma' of a neuron from the incoming signals from other neurons, leading to the 'firing' of a spike along the neuron's axon when the potential exceeds a threshold. Axons split into many branches and the signals transmitted along an axon are propagated into each branch. These are connected to other neurons via synapses (either directly onto the soma or onto dendrites of the other neuron). Through these synaptic connections, a signal transmitted by one neuron influences the activity of others. The synapse transmits a signal via the emission of neurotransmitters across the synaptic cleft between the synapse and the receiving dendrite or soma. But the signal is modified in the transmission (the signal reaching the synapse from the axon of the sending or pre-synaptic neuron is called the pre-synaptic potential; the signal received by the soma of the receiving or post-synaptic neuron is called the post-synaptic potential). Depending on what type of transmitters the synapse employs, the signal may inhibit or excite the accumulation of potential in the post-synaptic neuron. Also, the synapse may respond to a given pre-synaptic potential with more or less efficiency. The degree to which it amplifies or mutes the pre-synaptic potential is called its 'synaptic efficacy'. The accumulation or integration of the post-synaptic potentials appears to be more or less a straight summation. This summed potential, called the 'membrane' potential, is a state variable for the neuron. After firing, there is a 'refractory period' in which the neuron recovers and regains the ability to fire; it will fire again as long as the membrane potential is still above the threshold. Thus, there is a maximum rate at which the neuron can fire regardless of the strength of the signals it receives. The firing rate may be considered a function of the membrane potential and this function appears to be a sigmoid, with no response from consistently below-threshold membrane potential but a saturated, maximum response for very large membrane potentials. Thus, in a sense, a neuron is a nonlinear adder. (Much of the above description is taken from [5].)
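The picture of a neuron as a nonlinear adder can be illustrated with a minimal sketch. It is not the response function adopted later in this work: the logistic shape (written through tanh for numerical safety), the parameters `threshold`, `gain` and `max_rate`, and the function names are all assumptions chosen for illustration.

```python
import math

def membrane_potential(efficacies, presynaptic):
    """Straight summation of post-synaptic potentials.

    Assumption: each post-synaptic potential is the pre-synaptic signal
    scaled by the synaptic efficacy; a negative efficacy stands for an
    inhibitory synapse.
    """
    return sum(w * s for w, s in zip(efficacies, presynaptic))

def firing_rate(u, threshold=1.0, gain=10.0, max_rate=1.0):
    """Sigmoid of the membrane potential u (hypothetical logistic form).

    Uses the identity logistic(x) = 0.5 * (1 + tanh(x / 2)), which
    saturates cleanly for very large |u| instead of overflowing.
    """
    return 0.5 * max_rate * (1.0 + math.tanh(0.5 * gain * (u - threshold)))

efficacies = [0.8, -0.5, 1.2]  # two excitatory synapses, one inhibitory

# Weak excitation plus strong inhibition: potential stays below threshold.
weak = membrane_potential(efficacies, [0.1, 0.9, 0.1])
# Strong excitation, no inhibition: potential well above threshold.
strong = membrane_potential(efficacies, [1.0, 0.0, 1.0])

rate_weak = firing_rate(weak)      # close to zero
rate_strong = firing_rate(strong)  # close to max_rate
```

However large the summed input becomes, the rate never exceeds `max_rate`, which mirrors the ceiling imposed by the refractory period.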
It was originally suggested by Donald Hebb [29], and has since been generally accepted, that the learning process involves modification of synaptic efficacies. Essentially, when pre-synaptic and post-synaptic potentials remain high for sufficient time, that is when both pre- and post-synaptic neurons are very active, a synapse's efficacy is enhanced. This process occurs on a longer time scale than the fast evolution of the neural activities and it is also more permanent in effect. This allows a modification in the system's mode of operation in response to experience.
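A minimal sketch of such a rule follows. The text above says only that efficacy is enhanced when both neurons are very active; the product form of the increment, the activity level `high`, and the small `rate` (standing for the slow time scale of learning) are assumptions made for illustration.

```python
def hebb_update(efficacy, pre, post, rate=0.01, high=0.5):
    """Return the efficacy after one slow learning step.

    Assumption: the efficacy grows by rate * pre * post, but only while
    both pre- and post-synaptic activity exceed the level `high`.
    """
    if pre > high and post > high:
        return efficacy + rate * pre * post
    return efficacy

def train(efficacy, pre, post, steps):
    """Apply the update repeatedly with sustained activity levels."""
    for _ in range(steps):
        efficacy = hebb_update(efficacy, pre, post)
    return efficacy

# Both neurons stay very active: the synapse is strengthened.
w_active = train(0.2, pre=0.9, post=0.9, steps=100)
# The post-synaptic neuron stays quiet: the synapse is left unchanged.
w_inactive = train(0.2, pre=0.9, post=0.1, steps=100)
```

Because `rate` is small, many steps of sustained joint activity are needed before the efficacy changes appreciably, and the change persists once the activity stops.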
The above properties of neural systems are reproducible in models using many interconnected simplified 'neurons'. While the details of the electro-chemical processes occurring in biological neurons are suppressed, what should be the essential features of the process are retained to make a relatively simple mathematical model. Nevertheless, the resulting models involve nonlinearity and feedback and realistically consist of (at the very least) many thousands of differential equations (see equations (2.3)).
1.2 Development of neural network models.
To design an artificial neural network that performs a useful function requires more than writing down equations for the process described above. These simply describe the lowest level of neural network activity, namely the manner in which individual neurons receive, respond to and distribute signals from and to other neurons. This is essentially a framework in which to work. Higher levels of design (i.e. how to connect the neurons to perform tasks) must involve new principles. Progress in the neural network field is exactly the discovery of such principles.
First, we may ask what functions neural networks can or should perform. Current artificial neural network models perform fairly simple tasks related to perception, memory and motor control. One of the standard applications of neural networks is to function as a 'content-addressable' memory. Rather than locating a 'memory', which may be any meaningful piece of information (we might just consider it a specific pattern of bits), by knowing the address of the memory 'register' where it was stored, a content-addressable memory locates it by content, i.e. by the specific pattern required. A nice example of this process is given by Denker [18]. An incomplete or distorted version of the pattern is used to locate the original complete and undistorted pattern. This makes it clear that the same process can be used for pattern recognition and image restoration tasks, which are inherently similar. Note that all these image-related tasks require prior knowledge, i.e., a previously stored pattern to recall.
A related task performed by current neural networks is that of association or classification. It may be required to generate one of a limited set of appropriate responses given a wide range of possible stimuli. Thus, the network must associate the 'appropriate' response with any given input stimulus. What is meant by 'appropriate' depends on the problem but in any case the network can learn (i.e. modify its dynamics) to make responses that are desired either by an external trainer or by internal considerations. That is, a designer 'operating' the network can nudge the dynamics towards making what she considers an appropriate response (supervised learning) or the system can have a built-in way of deciding when and how to modify its dynamics (unsupervised learning). An example of the former is the multi-layer perceptron (see e.g. [58]) and a simple example of the latter is the Carpenter-Grossberg classifier (as described in [45]). If the number of responses is much smaller than the number of possible inputs or stimuli, then the associator acts as a classifier; a class of inputs is mapped to each response.
Neural networks have also been used for optimization tasks (such as finding good solutions to the travelling salesman problem [36]) and for motor control (see, e.g. [11,43]).
Among the most significant simplifying principles or analytic insights that have enabled neural network models to carry out the above functions are the perceptron convergence theorem, the back-propagation algorithm, and the use of an energy functional and the Hebb 'learning' rule to fix patterns as attractors in network dynamics. The perceptron theorem elucidates the classification properties of certain layered networks of simple additive units called perceptrons, making them prototypical associators/classifiers [56, pp. 109ff]. With the aid of the back-propagation algorithm these networks allow supervised training to produce the desired response to an input when it is known (at least with a good success rate), and then to continue producing responses to inputs when the appropriate response is not known a priori. Since our research does not involve these models, we do not describe them further here, but refer the reader to [49,59].
For equations (like our equations (2.2) or (2.3)) modelling the additive neural processes described above, when the matrix of synaptic efficacies (the 'connection matrix') is symmetric, there is a Lyapunov functional, representing the 'free energy' of the system. This observation, apparently made independently by Cohen and Grossberg [12] and by Hopfield [34,35], initiated the study of 'attractor neural networks', since it ensures that the behaviour of the network will always be convergent. Moreover, Hopfield showed that a connection matrix may be constructed, in a natural way using something like Hebb's 'learning' rule, so that a set of patterns become fixed point attractors of the dynamics. Thus, given an initial state of the network where the pattern of neural activity is similar to one of the fixed patterns (i.e. is in its basin of attraction), the network will evolve towards the fixed pattern. This can be interpreted as recall of a memory. There are restrictions on the patterns for effective storage in Hopfield's formulation: they must be reasonably orthogonal (uncorrelated) and consist of roughly half 'on' and half 'off' bits; and there must not be too many of them.
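The recall mechanism can be illustrated with a small sketch. It uses the familiar discrete-time, two-state (+1/-1) version of the network with asynchronous updates rather than the differential equations (2.2) or (2.3), and the outer-product construction of the connection matrix, the network size and the two stored patterns are assumptions chosen so that the example is easy to check by hand.

```python
def hebb_matrix(patterns):
    """Symmetric connection matrix from a Hebb-like outer-product rule."""
    n = len(patterns[0])
    T = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    T[i][j] += p[i] * p[j] / n
    return T

def energy(T, s):
    """Energy of state s; it cannot increase under the updates below."""
    n = len(s)
    return -0.5 * sum(T[i][j] * s[i] * s[j]
                      for i in range(n) for j in range(n))

def recall(T, state, max_sweeps=10):
    """Asynchronous threshold updates until no unit changes."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(T[i][j] * s[j] for j in range(n))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s

# Two orthogonal patterns, each with half 'on' (+1) and half 'off' (-1) bits.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
T = hebb_matrix([p1, p2])

# A distorted version of p1 (first bit flipped) lies in its basin of attraction.
probe = [-1, 1, 1, 1, -1, -1, -1, -1]
recalled = recall(T, probe)
```

Here the distorted probe settles onto the stored pattern `p1`, and the energy of the recalled state is lower than that of the probe, as the Lyapunov property requires for a symmetric matrix.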
These methods have made possible the performance of the functions described above and a great deal of analysis has been done to assess their capabilities and limitations and to improve their performance. Alternate learning rules remove the restriction of orthogonality of patterns, enhance the storage capacity of the network and reduce the problems of 'spurious' memories, or extra fixed points, which arise in Hopfield's original formulation [13,48]. These alternate learning rules are less biological, however, requiring a synapse to 'know' the states of many remote synapses at once. Additional features have been introduced to extend the capabilities of the first models but still relying on the same essential principles. For example, the use of stochastic effects (simulated annealing) allows the system to evolve not just to the nearest local minimum of the energy functional, but to bounce around until it finds the global minimum or at least a relatively low one. This is useful in optimization problems, for instance [5, pp.89-91]. Similar stochastic methods are applied in a classifier known as the 'Boltzmann machine' [32]. Other techniques allow temporal sequences of patterns to be made attractors of the dynamics, rather than single fixed patterns [39]. Surveys of neural network models may be found for instance in [5,18,28,45,48,65].
1.3 M otivation for new approaches.
The simplifying principles or analytic tools discussed above have proven useful in allowing the creation of artificial neural networks that perform certain tasks reasonably well. The performance of these models is not always as good as might be hoped but there is no doubt that they comprise a new and effective tool for these tasks that is fundamentally different from other methods. They are sometimes very much faster than other methods and have the fault-tolerance and distributed memory characteristic of biological neural systems.
However, it is not at all clear that these simplifying principles are the ones that nature uses. Although the general features of the underlying dynamical equations (additive inputs, sigmoid response function ...) are plausible reflections of nature, the way the models operate and their particular structures are not. There is no natural reason for symmetric connections as required by the Hopfield network (and even though the analysis of Hopfield nets sometimes allows some relaxation of symmetry, in nature connections are manifestly non-symmetric). This is discussed further in Chapter 3. Also, the necessity to supervise learning, at least for primitive brain functions (e.g. perception of objects), is contrary to our observations of nature. It has been suggested that inter-cellular chemical activity in the brain could act as a kind of 'supervisor', for example turning on or off the capacity to learn in response to need. It is also possible that for some functions, one subsystem of a brain could supervise another. However, the kind of supervision needed in the artificial neural network models is too dependent on the knowledge and intervention of the designer to be natural. (Research on unsupervised learning in neural network models has recently been summarized in a paper by Becker [6].) Furthermore, most models require artificially stopping and starting the dynamics, resetting initial conditions or at least presenting inputs at fixed moments to be used for training.
In a natural neural system, the neurons are continually active and external inputs, when present, must simply alter their dynamics. There is a fundamental problem in training such an unsupervised system in deciding when an input should be used for retrieving an existing memory and when it should itself be learned. The supervised artificial models are operated in learning mode and retrieval mode separately. Nature evidently has another solution. It is also not natural for a neural system to have convergent dynamics. As Skarda and Freeman put it, “Convergence to a point attractor amounts to 'death' for the system” [60, p. 172]. In fact, measurements of neural activity in biological brains show that complex dynamics are typical. Skarda and Freeman [60] and Traub and Miles [66] claim to have demonstrated that the dynamics can be chaotic in the mathematical sense. In any case, nature clearly does not limit herself to fixed point attractors and periodic oscillators.
In fact, neurophysiological measurement of neural activity has typically been not at the level of individual neurons but averaged activity over an area occupied by many neurons (the electroencephalogram, or EEG). Thus, on the neurophysiological side, researchers are often not even working in the same framework as the artificial neural net modellers. There appears to be interesting behaviour on this level of averaged activity, so modelling at this level is of interest [66, pp.101-193; 26, pp.7-10; 60, pp.163,190].
These observations do not detract from the value of existing artificial neural network models. They do suggest that while these models mimic some of the functions of biological neural systems, they do not work in the same way. There is necessarily an interplay between the description and understanding of biological brains on the one hand and the development of abstract models and artificial networks on the other. From the point of view of understanding brain function, it is clearly necessary to try to discover the mechanisms used by nature. However, this may involve simplifying the description so that insight into underlying principles may be obtained. From the point of view of developing useful artificial networks, it is still prudent periodically to take whatever inspiration from nature we are currently capable of comprehending. In particular, if we wish to be able to do more with neural network models than is covered by the list of functions discussed above (or even to do these with nature's effectiveness), it will be necessary to look for new principles.
In attempting to build neural network models that perform useful functions based on the observed behaviour of biological neural systems, the largest obstacle we face is that there is no real understanding of the computational processes occurring in brains, except in the simplest cases of invertebrate motor control functions and perhaps to some extent in the mapping of images in the visual cortex. Traub and Miles [66, pp.xiii-xv,205] point out that although the activity of parts of the brain can be monitored, no-one knows what computation is being performed in most cases. They study the hippocampus, for example, which is known to contribute to the formation of long-term memory, but how this is done is not known, despite all the detailed experiments. In their work, they design a complex mathematical model closely describing the structure of the hippocampus and describe and compare the activity of both the model and the original, but with no real idea of the significance of what these systems are doing. Nevertheless, the information obtained in experiments like these does give us some clues. If we want to understand the processes occurring in brains, we can at least explore the mathematics of the behaviours observed and look for insights into their potential information processing capabilities. Then the modeller can attempt to use them as building blocks for information processing tasks. Even if we do not hit upon the exact process occurring in biological brains, there is the potential for new and useful artificial network models.
For these reasons, we consider it important to attempt to stretch the boundaries of conventional approaches to modelling neural networks. It is not so easy to discover new fundamental principles, but unless new approaches are taken and groundwork is laid they will never be discovered. Even if we simply leave the confines of symmetry in the Hopfield network equations, they become capable of complex activity and analytic tools in this case are scarce. (We will loosely call these equations the ‘Hopfield network equations’ from now on, although properly the term ‘Hopfield network’ refers to these equations with the particular structure of symmetric connections and fixed-point memories; even this usage of the name is perhaps not technically correct, as pointed out by Grossberg [28, p.23], but has become common nevertheless.)
1.4 Summary of approaches taken and results obtained.
In this research program it was decided to explore several new approaches to neural network models. These each involve mathematical methods different from those usually employed, since we need new mathematical tools for new analytic insights. Two basic ideas initiated the research: chaotic dynamics with input-driven bifurcation, and continuous space modelling. The research in the two areas is essentially disjoint, but some interesting ideas emerged, connecting them. The nature of the research is exploratory and the work of Chapters 4 and 5 in particular is preliminary, whereas that of Chapters 6-8 is more fully worked out. Here we summarize the approaches tried and the results achieved.
1. An initial investigation was made into the possibilities of using chaos in neural networks as a background state, in such a way that appropriate inputs are ‘recognized’ by changes in the dynamics (based on work of Evans, et al. [24], as well as that of Priesol, et al. [51] and Kwan [42]). The models of Freeman [27] and Traub and Miles [66] exhibit interesting behaviour (such as chaos) but these models are too complex to be amenable to analysis. Simpler models are needed for insight. We briefly summarize the ideas in references [24,42 and 51] and explore their possibilities a little further. Some new insights obtained make continued work feasible and potentially fruitful. In particular, it is shown that under certain conditions, many Lorenz systems (which we interpret as systems of three neurons each) coupled together retain bounded but irregular behaviour and respond to particular inputs by converging to a fixed point. We show that the other model suggested by Kwan and his collaborators [42,51] allows the same possibilities in a discrete time setting, and we also show that these can be reformulated as a Hopfield network with a particular connection matrix structure interweaving positive and negative entries.
2. An attempt was made to explore continuous space versions of the Hopfield network equations (i.e. an integro-differential equation model). This provides a different framework and different insights, even though the results could in theory be transferred back to the standard Hopfield network by space discretization. Using very simple connection functions and external inputs, mainly characteristic functions, it was demonstrated (by construction) that such models can ‘learn’ in a simple sense while operating continuously and responding only to external inputs (no trainer required to stop and start the system, to switch from learning to retrieval mode, etc.).
3. A more complete and mathematically rigorous investigation was made into the approximation of the Hopfield network equations by reaction-diffusion equations. This is a simplifying principle that provides a different classification of connectivities than the standard symmetric/non-symmetric one, and shows how models with one class of connectivities behave. This had been tried by Cottet [14] for a particular simple type of connection matrix but we have found that the method can be extended to a far wider class of connectivities. Theorems are presented demonstrating the approximation formally, and rigorously proving its convergence under appropriate conditions. A problem arose with the approach that had not been noticed by Cottet, limiting the convergence to regions of high or low activity away from transition layers. But these transition layers are typically very thin and effects are expected to propagate out from these regions extremely slowly, so the approximation is still quite good from the point of view of an essentially binary-state neural network. These theorems apply to a still restricted class of connectivities and thus serve to classify types of connectivity into those that have behaviour like these PDEs and those that do not. For those that do, the theory of these reaction-diffusion equations can be applied to gain insight into behaviour. In particular, the reaction-diffusion equations give insight into the limiting behaviour as the number of neurons goes to infinity. Also, the types of connectivity that do not satisfy these theorems (particularly those with a preponderance of inhibition or those with connectivity matrices with wildly fluctuating entries) promise more complex behaviour than that possible for the PDEs. The Hopfield network formulation of Kwan’s model has such matrices.
4. Finally, further analysis on the above reaction-diffusion equations was carried out via standard (finite-difference) space discretizations. In fact, the systems of ODEs resulting from these discretizations are very simple neural networks of the Hopfield equation type, which (as a result of the approximation theorems) are representative of a whole class of Hopfield nets (those that are approximated by the particular reaction-diffusion equation). Different parameters in the reaction-diffusion equation will have different discretizations representative of different classes of Hopfield nets. We prove propositions demonstrating the existence of large scale stable patterns for these very simple Hopfield nets. This is contrary to expectations from the reaction-diffusion equations, which typically have no stable states other than those constant in space. This points out the care that must be taken in deducing properties of discrete from continuous systems and vice versa, despite a rigorous (almost) convergence theorem. Previous analytic research suggests that metastable states of the reaction-diffusion equations are responsible.
Chapter 2 presents background material on the conventional Hopfield network (and some enhancements). A complete summary of the extensive literature on this subject is not attempted; rather the main mathematical properties of the model that have relevance to later chapters are described. Chapter 3 contains a discussion of the limitations of the conventional Hopfield network and related models and outlines an alternate set of features or properties that we consider desirable in a neural system model. In Chapters 4 through 8 we develop and explore models exhibiting some of these features.
2. Background on the conventional Hopfield network model
2.1 The Hopfield network equations.
A good deal of attention has been paid in the last few years to the type of neural network model often referred to as the ‘Hopfield network’ since Hopfield’s important contribution to the study of such systems [34,35]. Of the many discussions of this type of neural network, one of the best and most comprehensive is the book by Amit [5]. The equations describing the dynamics of these networks are of the form

$$\dot{u}_i = -a u_i + \sum_j T_{ij}\, g(\lambda u_j), \tag{2.1}$$
where $i$ and $j$ are indices over all neurons in the network, $u_i$ represents the membrane potential of the $i$th neuron, $a > 0$ is a ‘leakage’ rate (or resistance parameter in Hopfield’s formulation), $T_{ij}$ is the ‘synaptic efficacy’ modulating the effect of neuron $j$ on neuron $i$, and $g$ is a sigmoid response function with ‘gain’ $\lambda > 0$, describing how a neuron’s firing rate depends on its membrane potential. We have also followed the common practice of assuming that neurons are identical here (i.e. $a$, $g$ and $\lambda$ do not depend on $i$). These equations may also have external input terms, $c_i$, and threshold terms, $\theta_i$, describing signals to each neuron arriving from outside the net and firing thresholds other than zero. For example,

$$\dot{u}_i = -a u_i + \sum_j T_{ij}\, g(\lambda(u_j - \theta_j)) + c_i. \tag{2.2}$$
As a model for biological neural networks this is clearly a great simplification but it nevertheless extracts some features of their design. Artificial neural networks using these dynamics have proven useful in some applications (as mentioned in Section 1.2). There are, of course, many models of neural nets (see, for example, the survey papers [28,45,48,65]). The dynamics of many of the models have the form of equations (2.2).
The dynamical behaviour of these equations is in general complex and difficult to describe. Much of the literature has concentrated on the special case of a symmetric connection matrix, $T$ (i.e. $T_{ij} = T_{ji}$). This is mathematically convenient as there exists an energy functional in this case which ensures convergent behaviour of the net [35]: the solution from any initial condition approaches a fixed point. Convergence is useful in the application of these neural nets to content-addressable memory or the retrieval of patterns (‘memories’) from distorted or similar versions of them; it is possible to construct the transition matrix so that selected patterns become fixed points of the dynamics (discussed below).
The Hopfield network equations with external inputs and thresholds may be written as in (2.2) or, if instead we let $u_i$ represent the amount by which the membrane potential of neuron $i$ exceeds its threshold, we may write them alternately as

$$\dot{u}_i = \sum_j T_{ij}\, g(\lambda u_j) - a u_i + c_i - a\theta_i. \tag{2.3}$$
We require $a > 0$, $\lambda > 0$ and $g$ sigmoidal in shape, increasing on $\mathbb{R}$ and bounded. Typically we use $g: \mathbb{R} \to (0,1)$ or $g: \mathbb{R} \to (-1,1)$ (the effect of this choice is discussed below). To be precise we will assume from now on (except in Chapter 5):

$$g: \mathbb{R} \to (-1,1), \qquad g \in C^1, \qquad g'(x) > 0, \qquad g'(x) \le g'(0) = 1. \tag{2.4}$$

Figure 2.1 An example of a sigmoid response function: $v = g(\lambda u)$, with $g = \tanh$ and $\lambda = 3$. The straight line is $f(u) = u$.
We present below some basic properties of the Hopfield network equations. Although they are well known, they provide a background to the material of subsequent chapters. Indeed, many of these properties carry over directly to other forms of the model explored later.
2.2 Energy functional for the Hopfield network.
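Hopfield [35] showed that there exists an energy functional (or Lyapunov functional) for equations (2.3) in the case where $T$ is symmetric:

$$E = -\frac{1}{2}\sum_i\sum_j T_{ij} v_i v_j + \frac{a}{\lambda}\sum_i \int_0^{v_i} G(v)\,dv - \sum_i (c_i - a\theta_i)\, v_i, \tag{2.5}$$

where $G = g^{-1}$ and

$$v_i = g(\lambda u_i) \tag{2.6}$$

represents the firing rate of neuron $i$. In fact, along solutions to (2.3),

$$\begin{aligned}
\frac{dE}{dt} &= -\frac{1}{2}\sum_i\sum_j T_{ij}\,(\dot{v}_i v_j + v_i \dot{v}_j) + \frac{a}{\lambda}\sum_i G(v_i)\,\dot{v}_i - \sum_i (c_i - a\theta_i)\,\dot{v}_i \\
&= -\sum_i\sum_j T_{ij} v_j \dot{v}_i + \frac{a}{\lambda}\sum_i G(v_i)\,\dot{v}_i - \sum_i (c_i - a\theta_i)\,\dot{v}_i \\
&= -\sum_i \Big(\sum_j T_{ij} v_j - \frac{a}{\lambda} G(v_i) + c_i - a\theta_i\Big)\dot{v}_i \\
&= -\lambda \sum_i \Big(\sum_j T_{ij}\, g(\lambda u_j) - a u_i + c_i - a\theta_i\Big)\, g'(\lambda u_i)\,\dot{u}_i \\
&= -\lambda \sum_i g'(\lambda u_i)\,\dot{u}_i^2 \;\le\; 0,
\end{aligned}$$

since $g' > 0$, so that energy decreases except at equilibria. Also

$$\frac{dE}{dt} = 0$$

at equilibria. For given parameters, $E$ is clearly bounded below, since $v_i$ is bounded. Thus, solution trajectories for equation (2.3) must descend the energy surface towards a local minimum, which must be an equilibrium point.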
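As a rough numerical check of the descent property, the sketch below integrates (2.3) by forward Euler for a small random symmetric $T$ with $g = \tanh$ and evaluates the energy (2.5) at every step. The network size, parameter values, step size and the helper names (`log_cosh`, `udot`, `energy`, `energy_trace`) are illustrative assumptions, not taken from the text, and the closed form used for the integral of $G$ is specific to the choice $g = \tanh$.

```python
import math
import random


def log_cosh(x):
    """Numerically stable log(cosh(x))."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def udot(u, T, a, lam, c, theta):
    """Right-hand side of (2.3) with g = tanh."""
    n = len(u)
    v = [math.tanh(lam * x) for x in u]
    return [sum(T[i][j] * v[j] for j in range(n)) - a * u[i] + c[i] - a * theta[i]
            for i in range(n)]


def energy(u, T, a, lam, c, theta):
    """Energy (2.5) for g = tanh, G = artanh.

    With v = tanh(lam*u), the integral of artanh from 0 to v equals
    v*artanh(v) + 0.5*log(1 - v^2) = v*(lam*u) - log(cosh(lam*u)).
    """
    n = len(u)
    v = [math.tanh(lam * x) for x in u]
    quad = -0.5 * sum(T[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
    leak = (a / lam) * sum(v[i] * lam * u[i] - log_cosh(lam * u[i]) for i in range(n))
    drive = -sum((c[i] - a * theta[i]) * v[i] for i in range(n))
    return quad + leak + drive


def energy_trace(n=5, a=1.0, lam=1.5, dt=0.01, steps=2000, seed=0):
    """Integrate (2.3) by forward Euler and record E after every step."""
    rng = random.Random(seed)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            T[i][j] = T[j][i] = rng.uniform(-1.0, 1.0)  # symmetric, zero diagonal
    c = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    theta = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    u = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    trace = [energy(u, T, a, lam, c, theta)]
    for _ in range(steps):
        du = udot(u, T, a, lam, c, theta)
        u = [u[i] + dt * du[i] for i in range(n)]
        trace.append(energy(u, T, a, lam, c, theta))
    return trace
```

Calling `energy_trace()` returns the sequence of energies; for a step size this small the recorded values should be non-increasing up to rounding. Replacing the symmetric $T$ by a non-symmetric one removes that guarantee.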
If $T$ is not symmetric, then the $E$ above is not a Lyapunov functional for the system. Convergent behaviour appears to be slightly robust in regard to asymmetry (particularly for randomly diluted, but otherwise symmetric connection matrices [5, p.363ff.]) but in general non-convergent behaviour is to be expected (see, e.g., [53,54,61,64]).

We will revisit these energy functionals several times, for analogous integral equation models (Chapter 5), reaction-diffusion equation models (Chapter 7) and their finite-difference discretizations (Chapter 8).

2.3 Membrane potentials, firing rates and the S-Σ exchange.

We can re-express equation (2.3) in terms of firing rates, $v_i$, as follows:

$$\dot{v}_i = \lambda\, g'(\lambda u_i)\,\dot{u}_i$$

from (2.6), so

$$\frac{1}{\lambda}\, G'(v_i)\,\dot{v}_i = \sum_j T_{ij} v_j - a\Big(\frac{1}{\lambda} G(v_i)\Big) + c_i - a\theta_i. \tag{2.7}$$
This is entirely equivalent to (2.3) for initial conditions $v_i(0) \in \mathrm{range}(g)$. It is easy to see that solutions to equations (2.3), and therefore to equations (2.7), are bounded and so exist globally in time and are unique (see Section 2.6).
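The equivalence of the membrane potential form (2.3) and the firing rate form (2.7) can be checked numerically. The following sketch assumes $g = \tanh$, so that $G = \operatorname{artanh}$ and $g'(\lambda u_i) = 1 - v_i^2$, and integrates both forms with a classical Runge-Kutta step from matching initial data. The three-neuron matrix, the parameter values and all function names are hypothetical choices made for illustration only.

```python
import math


def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for y' = f(y)."""
    n = len(y)
    k1 = f(y)
    k2 = f([y[i] + 0.5 * dt * k1[i] for i in range(n)])
    k3 = f([y[i] + 0.5 * dt * k2[i] for i in range(n)])
    k4 = f([y[i] + dt * k3[i] for i in range(n)])
    return [y[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(n)]


# Illustrative parameters (not from the text); T need not be symmetric here.
T = [[0.0, 0.5, -0.3], [0.2, 0.0, 0.4], [-0.5, 0.1, 0.0]]
A, LAM = 1.0, 1.0
C = [0.2, -0.1, 0.3]
THETA = [0.1, 0.0, -0.2]


def drive(v, i):
    """Sum_j T_ij v_j + c_i - a*theta_i, shared by both forms."""
    return sum(T[i][j] * v[j] for j in range(3)) + C[i] - A * THETA[i]


def udot(u):
    """Membrane potential form (2.3)."""
    v = [math.tanh(LAM * x) for x in u]
    return [drive(v, i) - A * u[i] for i in range(3)]


def vdot(v):
    """Firing rate form (2.7), solved for dv/dt with G = artanh."""
    return [LAM * (1.0 - v[i] ** 2) * (drive(v, i) - (A / LAM) * math.atanh(v[i]))
            for i in range(3)]


def compare(t_end=5.0, dt=0.01):
    """Largest gap between g(lam*u(t_end)) and v(t_end)."""
    u = [0.5, -0.4, 0.2]
    v = [math.tanh(LAM * x) for x in u]
    for _ in range(int(round(t_end / dt))):
        u = rk4_step(udot, u, dt)
        v = rk4_step(vdot, v, dt)
    return max(abs(math.tanh(LAM * u[i]) - v[i]) for i in range(3))
```

The value returned by `compare()` measures only discretization error, since the two systems describe the same trajectory in different variables.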
This technique of switching between different variables proves very useful in later chapters, for integral equations and reaction-diffusion equations corresponding to the Hopfield network equations. In the case of discrete time or equilibrium equations this change of variables takes a particularly simple form. We give an abstract formulation of this idea in Section 5.5.
We note here that discrete time versions of the Hopfield network equations can be obtained via what has been called by Grossberg [28, p.26] the S-Σ exchange, from equations describing the actual firing of each neuron. If $x_i$ represents the ‘action potential’ of a neuron, i.e. the signal being sent down its axon, we can suppose that $x_i(t)$ takes on the values $+1$, meaning that a spike is emitted along the axon at this time, or $-1$, indicating that no spike is emitted. The activity can then be modelled (with great simplification from the biological reality) as

$$x_i(t+1) = \mathrm{sgn}\Big(\sum_{j \ne i} T_{ij}\, x_j(t) + c_i - \theta_i\Big), \tag{2.8}$$

where $\mathrm{sgn}(y) = 1$ if $y > 0$ and $\mathrm{sgn}(y) = -1$ if $y < 0$ (its value when $y = 0$ is not critical), and $\theta_i$ and $c_i$ are the thresholds and external inputs respectively. The evolution is taken to be asynchronous, i.e. only one (randomly selected) neuron can change state at each time step and it fires at the next time step if its accumulated incoming signals exceed its threshold. The lack of connection from a neuron back onto itself ($T_{ii} = 0$) is necessary in the discrete time case for the existence of the energy functional. The S-Σ exchange consists of the following transformation. Let

$$u_i(t) = \sum_{j \ne i} T_{ij}\, x_j(t) + c_i. \tag{2.9}$$
Multiply (2.8) by $T_{ki}$, sum over $i$ and add $c_k$, using (2.9) to get

$$u_k(t+1) = \sum_{i \ne k} T_{ki}\, \mathrm{sgn}(u_i(t) - \theta_i) + c_k. \tag{2.10}$$

This is a discrete time analogue of equation (2.2) with $a = 1$. The S-Σ exchange may be reversed via

$$x_i(t+1) = \mathrm{sgn}(u_i(t) - \theta_i). \tag{2.11}$$

The discrete and continuous time Hopfield network models (using equations (2.10) and (2.2) respectively) are formally analogous and both modelling approaches have been taken. Hopfield [35] showed that if the gain in (2.2) is large, the solutions to the two systems have similar behaviour. (Of course, the high gain limit of $g$ is the ‘hard nonlinearity’, $\mathrm{sgn}(u)$.)

2.4 The sigmoid response function and the high gain condition.
In the literature $g$ is often taken to be an odd function taking values in $(-1,1)$, such as $\tanh$. In particular, such a function has $g(0) = 0$. Horizontal shifts in the response function may be accounted for by a threshold term. A more realistic sigmoid might, however, take values in $(0,1)$. For example, a logistic function is often used:

$$g(\lambda u) = \frac{1}{1 + e^{-4\lambda u}}.$$

However, by a change of coordinates, the resulting Hopfield-type equation can be transformed into the original one with an additional threshold term. Equation (2.3) can still be transformed to equation (2.7) and then we let $w = 2v - 1$ so that $\dot{w} = 2\dot{v}$. Then

$$\frac{1}{2}\, G'\Big(\frac{w_i + 1}{2}\Big)\dot{w}_i = \frac{\lambda}{2}\sum_j T_{ij}\,(w_j + 1) - a\, G\Big(\frac{w_i + 1}{2}\Big) + \lambda\,(c_i - a\theta_i).$$

Now let $F(w) = 2G\big(\frac{w+1}{2}\big)$, so that $F'(w) = G'\big(\frac{w+1}{2}\big)$. Then

$$F'(w_i)\,\dot{w}_i = \lambda \sum_j T_{ij} w_j - a F(w_i) + \lambda\Big(2c_i - 2a\theta_i + \sum_j T_{ij}\Big),$$

which is of the same form as before with inputs $2c_i$ and thresholds $\frac{1}{a}\big(2a\theta_i - \sum_j T_{ij}\big)$. Thus, horizontal and vertical shifts in the response function do not significantly alter the model, except in changing the threshold values. If $g(0) = 0$, then the Hopfield equations without inputs or thresholds (2.1) have the steady state solution $u_i = 0$. In biological nets, where firing rates should be strictly positive, this may not make sense, but there is no reason not to create artificial nets with this property if it is desired.

In the absence of input or threshold terms in the Hopfield network equations (2.3), if the gain, $\lambda$, is small, then all solutions decay to zero. This may be proved either by a contraction mapping argument or (in the symmetric case) from the energy functional. We will not give a proof of this result here, but we will prove it for analogous equations in Section 5.3, using the contraction mapping theorem and in Section 8.3 using the energy functional. It is a straightforward matter to modify these proofs for the standard Hopfield equations.
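The change of coordinates between a $(0,1)$-valued and a $(-1,1)$-valued response function can be illustrated numerically. The sketch below assumes the logistic function $1/(1+e^{-4x})$, for which $2g(x) - 1 = \tanh(2x)$; under that assumption the rescaled potential $\tilde{u}_i = 2u_i$ (a reading of the transformation, not a statement from the text) satisfies (2.3) with $g = \tanh$, inputs $2c_i$ and thresholds $(2a\theta_i - \sum_j T_{ij})/a$. All numerical values and function names are illustrative.

```python
import math


def logistic(x):
    """(0,1)-valued sigmoid with slope 1 at the origin."""
    return 1.0 / (1.0 + math.exp(-4.0 * x))


def euler(u0, T, a, lam, c, theta, g, dt=0.01, steps=1000):
    """Forward Euler integration of (2.3) with response function g."""
    n = len(u0)
    u = list(u0)
    for _ in range(steps):
        v = [g(lam * x) for x in u]
        u = [u[i] + dt * (sum(T[i][j] * v[j] for j in range(n))
                          - a * u[i] + c[i] - a * theta[i]) for i in range(n)]
    return u


def transformed_parameters(T, a, c, theta):
    """Inputs 2*c_i and thresholds (2*a*theta_i - sum_j T_ij)/a."""
    c2 = [2.0 * ci for ci in c]
    theta2 = [(2.0 * a * theta[i] - sum(T[i])) / a for i in range(len(c))]
    return c2, theta2


def max_mismatch():
    """Compare 2*u (logistic net) with u~ (tanh net) after integration."""
    T = [[0.0, 0.7, -0.4], [0.7, 0.0, 0.3], [-0.4, 0.3, 0.0]]
    a, lam = 0.8, 1.2
    c = [0.1, -0.2, 0.05]
    theta = [0.3, 0.0, -0.1]
    u0 = [0.2, -0.5, 0.4]
    u_log = euler(u0, T, a, lam, c, theta, logistic)
    c2, theta2 = transformed_parameters(T, a, c, theta)
    u_tanh = euler([2.0 * x for x in u0], T, a, lam, c2, theta2, math.tanh)
    return max(abs(u_tanh[i] - 2.0 * u_log[i]) for i in range(3))
```

Because the two Euler recursions are algebraically identical after the substitution, `max_mismatch()` should differ from zero only by floating-point rounding.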
2.5 Learning rules for the Hopfield network.
‘Learning’ in a conventional Hopfield net is accomplished by setting

$$T_{ij} = \sum_{m=1}^{s} v_i^{(m)} v_j^{(m)}$$

for $i \ne j$, and $T_{ii} = 0$, where $v^{(m)}$ represents the $m$th pattern to be learned with $v_i^{(m)} = \pm 1$, say. If the number of patterns $s$ is not too large and the patterns are approximately orthogonal then the patterns will be close (in phase space) to fixed points of the dynamics. This is shown in [34,35], though the orthogonality condition takes a slightly different form when $v_i \in \{0,1\}$ as in Hopfield’s original formulation. It is not guaranteed that these will be the only fixed points of the dynamics; in fact, there are usually extraneous fixed points, called ‘spurious memories’.
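A minimal sketch of this storage prescription, combined with the discrete time sign dynamics (2.8) without inputs or thresholds, is given below. The two eight-component patterns are chosen to be exactly orthogonal, and the function names are hypothetical. As a simplification of the asynchronous scheme, each sweep visits every neuron once in a random order rather than drawing a single neuron at random per time step.

```python
import random


def hebbian_matrix(patterns):
    """T_ij = sum_m v_i^(m) v_j^(m) for i != j, and T_ii = 0."""
    n = len(patterns[0])
    T = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    T[i][j] += p[i] * p[j]
    return T


def recall(T, x0, sweeps=5, seed=0):
    """Asynchronous sign updates; one neuron changes state at a time."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    order = list(range(n))
    for _ in range(sweeps):
        rng.shuffle(order)
        for i in order:
            field = sum(T[i][j] * x[j] for j in range(n) if j != i)
            if field != 0:  # the value of sgn at zero is left unchanged
                x[i] = 1 if field > 0 else -1
    return x


P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
T_HEBB = hebbian_matrix([P1, P2])

# A distorted version of P1 with the first component flipped.
NOISY = [-1] + P1[1:]
RESTORED = recall(T_HEBB, NOISY)
```

For these two patterns the local field at neuron $i$ in state `P1` is $6 v_i$, so both patterns are fixed points and `RESTORED` equals `P1`; with more patterns, or less orthogonal ones, retrieval is not guaranteed.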
This ‘learning rule’, loosely referred to as Hebbian learning, is not the only one that has been applied to the Hopfield network equations. Some others (and the ‘Hebbian rule’) are described by Denker [18] and Michel and Farrell [48], for example. In particular, we mention here the Adaline rule and the ‘geometric’ or ‘pseudo-inverse’ rule. Like the ‘Hebbian’ rule used by Hopfield, both of these can be built up one pattern at a time, so that additional memories can be added without doing the calculation for all patterns from scratch. This incremental ‘learning process’ is obviously necessary from the biological viewpoint. The Hebb rule expressed incrementally is

$$T_{ij}^{(m+1)} = T_{ij}^{(m)} + v_i v_j,$$

where $v = (v_i)$ here is the new pattern to be stored [18, p.224]. The Adaline rule is

$$T_{ij}^{(m+1)} = T_{ij}^{(m)} + \frac{\Gamma}{N}\, v_i^{\perp} v_j,$$

where $v_i^{\perp} = v_i - \sum_j T_{ij}^{(m)} v_j$, $N$ is the number of neurons in the network and $\Gamma < 1$ is a positive parameter [18, pp.224-225]. This rule takes the component of $v$ orthogonal to the span of the previously stored patterns, and thus does not require initially orthogonal patterns. A variant of this idea which retains symmetry is the ‘geometric’ rule [18, pp.225-227]

$$T_{ij}^{(m+1)} = T_{ij}^{(m)} + \frac{v_i^{\perp} v_j^{\perp}}{v^{\perp} \cdot v^{\perp}},$$

where $v^{\perp} \cdot v^{\perp} = \sum_i (v_i^{\perp})^2$. This last rule guarantees exact storage of the desired patterns with no ‘spurious memories’ and equal depths of the patterns in the energy surface.
From the computational point of view, these have considerable advantages over the Hebb rule suggested by Hopfield. However, the Hebb rule has some basis in biology, while the others above do not. In particular, the calculation of $v_i^{\perp}$ requires knowledge at the $i$th neuron of the strengths of synaptic connections across the network. This information is not available to biological neurons.
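The incremental character of the ‘geometric’ rule can be sketched as follows. The code assumes that the orthogonal component is computed as $v^{\perp} = v - T^{(m)} v$, starting from $T^{(0)} = 0$, so that $T$ is at every stage the orthogonal projector onto the span of the stored patterns; this reading, the tolerance `tol` and the function names are assumptions made for illustration and are not taken from [18].

```python
def geometric_update(T, v, tol=1e-12):
    """Add one pattern: T <- T + v_perp v_perp^T / (v_perp . v_perp)."""
    n = len(v)
    v_perp = [v[i] - sum(T[i][j] * v[j] for j in range(n)) for i in range(n)]
    norm2 = sum(x * x for x in v_perp)
    if norm2 <= tol:
        # v already lies in the span of the stored patterns; nothing to add.
        return [row[:] for row in T]
    return [[T[i][j] + v_perp[i] * v_perp[j] / norm2 for j in range(n)]
            for i in range(n)]


def store(patterns):
    """Build T one pattern at a time, starting from the zero matrix."""
    n = len(patterns[0])
    T = [[0.0] * n for _ in range(n)]
    for p in patterns:
        T = geometric_update(T, p)
    return T


def apply(T, v):
    """Matrix-vector product T v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]


# Two patterns that are deliberately not orthogonal (their dot product is 2).
PATTERNS = [[1, 1, 1, -1], [1, 1, -1, -1]]
T_GEO = store(PATTERNS)
```

With this construction each stored pattern satisfies $Tv = v$ even though the patterns are not orthogonal, and storing a pattern a second time leaves $T$ unchanged. Note that the resulting matrix generally has a non-zero diagonal, unlike the Hebbian prescription.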
2.6 The role of external inputs.
Little of the literature on the conventional Hopfield network deals with the role of external inputs. Hopfield’s formulation of the network does not require them. Rather, initial conditions serve as ‘inputs,’ and the solution evolves towards a (pre-established) fixed point. Inputs are used effectively in applications of Hopfield nets to optimization problems (see e.g. [36]), but this is a purely mathematical application, not a biological simulation.
We note here that sufficiently strong external inputs can dominate the behaviour of the network equations. This may be significant in the unsupervised operation of a biological neural system, as there must be stimulus from outside to distract it from its meanderings and to provide experience from which to learn. In effect, a strong input can reset the system, the equivalent (without supervision) of artificially restarting the evolution from a new set of initial conditions.
First, we need a maximum principle to ensure boundedness of solutions. Consider equation (2.3) without the input term, $c_i$. Suppose that $u_i$ is positive and that $\dot{u}_i$ is also positive. Then

$$a u_i < \sum_j T_{ij}\, g(\lambda u_j) - a\theta_i < \sum_j |T_{ij}| - a\theta_i,$$

which is a constant. Similarly for $u_i$ and $\dot{u}_i$ negative,

$$a u_i > \sum_j T_{ij}\, g(\lambda u_j) - a\theta_i > -\sum_j |T_{ij}| - a\theta_i.$$

These two together show that $u_i \dot{u}_i > 0$ implies

$$a|u_i| < \sum_j |T_{ij}| + a|\theta_i|.$$

Therefore, if

$$a|u_i| > \sum_j |T_{ij}| + a|\theta_i| = B_i, \tag{2.12}$$

then $u_i \dot{u}_i < 0$ and $|u_i|$ cannot grow. For initial data within these bounds (2.12), the solution must always remain within them.

If input is added, we still have a maximum principle with the term $|c_i|$ added to the bound $B_i$, so solutions are still bounded. But while an input is active,

$$a u_i < B_i \;\Rightarrow\; \dot{u}_i > c_i - 2B_i, \qquad a u_i > -B_i \;\Rightarrow\; \dot{u}_i < c_i + 2B_i,$$

from (2.3). Thus,

$$|c_i| > 2B_i \;\Rightarrow\; \dot{u}_i c_i > 0$$

for as long as $a|u_i(t)| < B_i$.

So if $u_i$ is initially within the bounds (2.12) and then a strong input is applied, $|c_i| > 2B_i$, then $\dot{u}_i$ has the sign of $c_i$, and $u_i$ is pulled toward $c_i$, at least until it exceeds the bounds in (2.12). Thus, wherever a strong input is applied, the neural activity responds in kind, and the firing rate becomes high (or low) when the input is positive (or negative).
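The effect of a strong input can be illustrated with a small simulation of (2.3). The sketch below uses $g = \tanh$, a random (not necessarily symmetric) connection matrix, inputs of magnitude $3B_i$ with alternating signs, and initial potentials inside the bounds (2.12) but with the sign opposite to the input. The network size, seed, parameter values and function names are illustrative assumptions.

```python
import math
import random


def bounds(T, a, theta):
    """B_i = sum_j |T_ij| + a*|theta_i|, as in (2.12)."""
    return [sum(abs(t) for t in row) + a * abs(th) for row, th in zip(T, theta)]


def simulate(T, a, lam, c, theta, u0, dt=0.01, steps=500):
    """Forward Euler integration of (2.3) with g = tanh."""
    n = len(u0)
    u = list(u0)
    for _ in range(steps):
        v = [math.tanh(lam * x) for x in u]
        u = [u[i] + dt * (sum(T[i][j] * v[j] for j in range(n))
                          - a * u[i] + c[i] - a * theta[i]) for i in range(n)]
    return u


def strong_input_demo(n=4, a=1.0, lam=2.0, seed=1):
    """Return final potentials, inputs and bounds for inputs with |c_i| = 3*B_i."""
    rng = random.Random(seed)
    T = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    theta = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    B = bounds(T, a, theta)
    signs = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
    c = [3.0 * B[i] * signs[i] for i in range(n)]
    # Start inside the bounds, on the side opposite to the input.
    u0 = [-0.9 * signs[i] * B[i] / a for i in range(n)]
    u = simulate(T, a, lam, c, theta, u0)
    return u, c, B
```

After the integration time used here (five time units with $a = 1$) every $u_i$ has taken the sign of $c_i$, and $a|u_i|$ stays below $B_i + |c_i|$, in line with the maximum principle with inputs.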
3. Limitations of the conventional Hopfield network
The Hopfield network, properly speaking, consists of the dynamical system given by (2.2), (2.3) or (2.7), with initial conditions representing an input pattern, and a symmetric connection matrix determined beforehand by the ‘Hebbian’ learning rule to fix chosen patterns into memory. A great many variations on this scheme have been put forward in the literature on neural networks, a great deal of attention has gone into determining the memory capacity of such a network and ways have been developed to remove some of its problems, such as the necessity for ‘nearly’ orthogonal memories, and the existence of ‘spurious memories’. There are learning rules, for instance, that enable any input patterns to be stored (still with a maximum capacity of course) and that prevent the occurrence of spurious memories (as discussed in Section 2.5).
However valuable these neural network models are from the point of view of their capabilities in the abstract, they do not correspond very closely to biology. This is not an objection to their study; on the contrary, it is surely advantageous to follow two lines of research, one that keeps an eye on biology and attempts to understand and model its processes more deeply, and the other that takes insights already gained from biology as a starting point and attempts to reshape them to maximum effect in an abstract or artificial setting. Sometimes insights may arise from surprising directions, such as Hopfield’s importing of statistical physics into neural network theory. However, there is, no doubt, a limit to how much can be expected from the standard Hopfield network. It has taken its place as a useful tool for certain tasks, such as ‘content addressable memory’, image restoration, classification, and will continue to be explored in relation to these tasks. But the study of biological brains requires us to step outside the bounds set by the