Some comments on cybernetics and control


Citation for published version (APA):

Kickert, W. J. M., Bertrand, J. W. M., & Praagman, J. (1978). Some comments on cybernetics and control. IEEE Transactions on Systems, Man and Cybernetics, 8(11), 805-809. https://doi.org/10.1109/TSMC.1978.4309868

DOI:

10.1109/TSMC.1978.4309868

Document status and date: Published: 01/01/1978

Document Version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. SMC-8, NO. 11, NOVEMBER 1978

Correspondence

Some Comments on Cybernetics and Control

WALTER J. M. KICKERT, JAN-WILLEM M. BERTRAND, AND JAAP PRAAGMAN

Abstract-The theory of cybernetics as introduced by Ashby and developed by Ashby and Conant will be analyzed and commented upon. Ashby's law of requisite variety and, in particular, the underlying measure of optimality, the quantity of entropy, are examined. Next the cybernetic theorem of error control and cause control is observed, and finally the cybernetic theorem of the necessity of modeling for regulation is studied. In all three cases several practical conditions and restrictions for the applicability of the theorems to control engineering are pointed out.

I. INTRODUCTION

Since the first introduction of cybernetics by N. Wiener [1], as a "science of communication and control," numerous contributions to this field have been made. One of the most outstanding contributions, in our opinion, has been made by the late W. R. Ashby, who laid a basis for the link between information and control with his well-known law of requisite variety [2]. Both he and R. C. Conant further developed this theory. Some important and well-known elements of this development are their views on error-controlled regulators versus cause-controlled regulators [3], and their views on modeling as a necessary part of regulation [4]. These elements only represent part of their work, especially in view of their recent achievements in the field of complex and hierarchical systems [5]-[8]. The analysis and comments in this correspondence, however, will be restricted to the above-mentioned points, mainly because of their relevance for control.

There are three reasons which emphasize the necessity of analysis, comments, and criticism. First is the obvious importance of this branch of cybernetics for control theory. It will be clear that information and communication play an important role in control and that, more specifically, the above-mentioned issues are certainly very important for control theory. The second reason is the incomprehensible ignorance of this theory on the part of control engineers. It is astonishing how little attention the theory of communication, and in particular the theory of cybernetics, has received in control theory, though its importance is quite clear. The third reason is the unshakable popularity of this theory among system theorists. One might hope that some "cross talk between cyberneticians and control engineers," as Porter calls it [9], will result in a greater appreciation by control engineers of Ashby's link between information and control, and in a somewhat restricted but better founded popularity among system theorists.

II. THE LAW OF REQUISITE VARIETY [2], [3], [10]

The most famous law of cybernetics is undoubtedly Ashby's law of requisite variety. This very general law, which contrary to the usual control theory does not presuppose linearity, low-order structure, etc., gives an upper limit to the degree of controllability that a regulator can possibly achieve. In view of the generality of the law and its obvious importance for control, which can easily be shown in numerous examples, it is indeed astonishing that so little attention is paid to this law by control theorists.

Manuscript received August 31, 1977; revised July 6, 1978.

The authors are with the Department of Industrial Engineering, Technological University of Eindhoven, Eindhoven, The Netherlands.

The law of requisite variety states that the capacity of a device as a regulator cannot exceed its capacity as a channel of communication, or to put it in Ashby's words: "only variety in the regulator can force down the variety due to the disturbances; only variety can destroy variety." Imagine a system composed of a set D of disturbances, a set R of control actions, and a set Z of outcomes, defined by a mapping $\phi: D \times R \to Z$. This obviously represents a control system. By taking finite discrete sets D, R, and Z and by visualizing $\phi$ as a table, it can easily be shown that the goal of keeping the outcome $z_k \in Z$ constant, that is, of decreasing the variety in the outcomes, can only be met by a corresponding increase in the variety of R.
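The table argument can be tried out on a small case. The following is a minimal sketch, not taken from the paper: the outcome table `phi(d, r) = (d + r) mod 3` and the function name `min_outcome_variety` are assumptions chosen for illustration. It searches all strategies D -> R by brute force and reports the smallest number of distinct outcomes that can be reached as the regulator is allowed more actions.

```python
from itertools import product


def min_outcome_variety(disturbances, actions, phi):
    """Smallest number of distinct outcomes over all strategies D -> R."""
    best = None
    # Each strategy assigns one control action to every disturbance.
    for choice in product(actions, repeat=len(disturbances)):
        outcomes = {phi(d, r) for d, r in zip(disturbances, choice)}
        if best is None or len(outcomes) < best:
            best = len(outcomes)
    return best


def phi(d, r):
    # Hypothetical outcome table, chosen only for illustration.
    return (d + r) % 3


D = [0, 1, 2]
# Key: number of available control actions; value: best achievable outcome variety.
variety_by_actions = {
    len(R): min_outcome_variety(D, R, phi) for R in ([0], [0, 1], [0, 1, 2])
}
print(variety_by_actions)
```

In this assumed table, the outcome variety can only be pushed down to a single value when the regulator has as many distinct actions as there are disturbances, which is the behavior the law describes.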

The relation between information and control is essentially the following. As the criterion for the success of the regulator (the goal is constancy of the outcomes Z) Shannon's measure of selective information in a signal, the entropy H(Z), is introduced:

$$H(Z) = -\sum_{z_i \in Z} p(z_i) \log_2 p(z_i).$$

Ashby states that optimality of a regulator R is equivalent to the minimization of the entropy H(Z) of the outcomes. One of the advantages of this measure of optimality is that it does not presume numerical variables; entropy also applies to variables that can only be classified (nominal).

Strictly speaking, the use of this entropy measure implies that the variables involved (outcomes) are stochastic. (Notice that the requirement of stochastic system variables does not imply that the system relations be stochastic; on the contrary, most of this cybernetic theory leads to the necessity of deterministic system relations.) Most frequently, however, the entropy measure is applied as a measure of variety without strict probability density functions. It then serves as a measure of the number of possible alternatives. The assumption behind this use of entropy is that all alternatives have the same probability:

$$H = -\sum_{i=1}^{n} p(z_i) \log p(z_i) = -\sum_{i=1}^{n} \frac{1}{n} \log \frac{1}{n} = \log n.$$

This probability assumption is often omitted. Hence, in fact, the entropy measure is not only used in case of stochastic variables but also with (varying) deterministic variables.

It should be remarked that although the use of entropy has an advantage over classical control theory in that it incorporates stochastic variables, it does not solve the kind of problems that are solved by classical control theory, simply because entropy does not deal with them. Although stability analysis, transfer function theory, etc., do not exclude the existence of stochastic signals, those theories just do not consider it; they deal with analytical functions in time (pulse, step, ramp, sinusoid, transient response, steady state, etc.). In contrast, entropy only considers aggregates of variables, such as density functions and varieties. Hence, a great deal of control theory is not covered by this theory of cybernetics.

Secondly, it seems questionable to equate optimality with minimal entropy. A well-known fact of information theory is that the entropy H(Z) is minimal when there is one i such that $p(z_i) = 1$ and $p(z_j) = 0$, $\forall j \neq i$, and that H(Z) is maximal when all $p(z_j)$, $z_j \in Z$, are equal. Minimizing H(Z) means compressing the probability distribution of Z. Hence, whatever the position of $z_i$, H(Z) will decrease if the probability distribution is compressed around $z_i$ (see Fig. 1). This means that entropy only measures variety and does not take into account the absolute position of the optimal variable. This omission has serious implications in questions of steady-state errors, bias, etc.

Fig. 1. Comparison of entropies. [Figure not reproduced; axis label: p(Z).]

0018-9472/78/1100-0805$00.75 © 1978 IEEE

Thirdly, some attention should be paid to the concept of variety, for which entropy is a measure. A good regulator should minimize the variability of the output. Apart from the possibly additional requirement of a particular output value (which entropy does not guarantee), this requirement usually means that an output variable should fluctuate as little as possible. The usual measure for this is variance. However, it should be emphasized that between these two there is a difference of practical relevance, and not just of mathematics. Let us try to illustrate this difference through an example.

Example: Imagine a control system with a disturbance set D = {a, b, c}, a control action set R = {p, q, r}, and an output set Z = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The transition table of $D \times R \to Z$ is

        a   b   c
    p   1   9   4
    q   5   2   8
    r   7   6   3

Let the three values of D be equally likely. Then the measure of variety, the entropy, of D is

$$H(D) = \log 3.$$

Suppose we take the following two control strategies: 1) maintain a fixed control action R* = p, and 2) relate the control action to the disturbance as follows:

    a -> q
    b -> r
    c -> p

The resulting outputs will be, respectively,

    1) Z = {1, 4, 9}
    2) Z = {4, 5, 6}.

The entropy of both outputs is H(Z) = log 3. Hence, according to this measure, neither of the control strategies suppresses the variety in the disturbances at all, and no difference exists between the static 1) and dynamic 2) control strategy. But supposing the output values admit a numerical interpretation and taking the variance as a measure of variability, it will be clear that control strategy 2) is far better than strategy 1). The variance of (1, 4, 9) is much larger than that of (4, 5, 6), irrespective of any desired mean value.
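The numbers in this example can be checked directly. The sketch below is illustrative only: the names `TABLE`, `outcomes`, `entropy_bits`, and `variance` are assumptions, entropy is taken in bits (base-2 logarithm), and the third assignment of strategy 2), c to p, is inferred from the stated output {4, 5, 6}.

```python
import math
from collections import Counter

# Transition table of the example: TABLE[action][disturbance] -> outcome.
TABLE = {
    "p": {"a": 1, "b": 9, "c": 4},
    "q": {"a": 5, "b": 2, "c": 8},
    "r": {"a": 7, "b": 6, "c": 3},
}
DISTURBANCES = ["a", "b", "c"]  # assumed equally likely, as in the example


def outcomes(strategy):
    """Outcome produced for each disturbance under a strategy D -> R."""
    return [TABLE[strategy[d]][d] for d in DISTURBANCES]


def entropy_bits(values):
    """Shannon entropy (bits) of equally weighted observations."""
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())


def variance(values):
    """Population variance around the mean of the observations."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


static = {"a": "p", "b": "p", "c": "p"}   # strategy 1): fixed action p
dynamic = {"a": "q", "b": "r", "c": "p"}  # strategy 2): action depends on disturbance

z_static = outcomes(static)
z_dynamic = outcomes(dynamic)
for name, z in (("static", z_static), ("dynamic", z_dynamic)):
    print(name, sorted(z), entropy_bits(z), variance(z))
```

Both strategies give the same entropy, while the variances differ by more than an order of magnitude, which is the distinction the example is meant to expose.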

Thus entropy is neither a measure for the mean value of the fluctuating variable, nor a measure for the variance around this mean value. Entropy is a measure for variety, that is, the number of possibilities weighted according to their probabilities of occurrence.

This remark represents quite a serious objection against Ashby's measure of entropy in case of a numerical output variable, so let us examine this somewhat more formally. Entropy is defined as

$$H(Z) = -\sum_{z_i \in Z} p(z = z_i) \log p(z = z_i),$$

while variance is defined as

$$\operatorname{var}(Z) = \sum_{z_i \in Z} p(z = z_i)(z_i - \bar{z})^2.$$

These definitions show the above-mentioned difference: entropy is a nominal measure, whereas variance is an interval (metric) measure. These reflections lead us to the conclusion that because entropy is indifferent to distances between the $z_i$, it is generally not a measure of variance.

There still remains the interesting question of the conditions under which the entropy is a measure of variance. In other words, what properties should the probability density function p(Z) have, so as to permit conclusions from entropies about variances? Let us start by requiring order preservation:

$$H(z_1) > H(z_2) \quad \text{iff} \quad \operatorname{var}(z_1) > \operatorname{var}(z_2).$$

A sufficient condition for order preservation is that a strictly monotonic mapping exists between entropy and variance. Now it is clear from proposition 20.5 of Shannon [11] that when order preservation is required we have to restrict ourselves to certain classes of distribution functions, as the proposition implies that for any arbitrary non-Gaussian distribution, there exists a Gaussian distribution with the same variance but higher entropy. Classes of distributions for which order preservation does hold are the Gaussian, uniform, and exponential distributions. For these classes the entropy equals, respectively,

$$\log \sigma_z \sqrt{2 \pi e}, \qquad \log \sigma_z \sqrt{12}, \qquad \log \sigma_z e,$$

where $\sigma_z^2$ is the variance. From these results it also follows that there is no class of distribution functions for which anything stronger than order preservation holds.
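The three entropy expressions can be compared numerically. This is a minimal sketch under stated assumptions: the symbol in the formulas is read as the standard deviation (the square root of the variance), entropies are in bits, and the function names are hypothetical.

```python
import math


def h_gaussian(sigma):
    """Differential entropy (bits) of a Gaussian with standard deviation sigma."""
    return math.log2(sigma * math.sqrt(2 * math.pi * math.e))


def h_uniform(sigma):
    """Differential entropy (bits) of a uniform density with standard deviation sigma."""
    return math.log2(sigma * math.sqrt(12))


def h_exponential(sigma):
    """Differential entropy (bits) of an exponential density with standard deviation sigma."""
    return math.log2(sigma * math.e)


# Within one class, a larger standard deviation always means a larger entropy.
within_class = all(h(2.0) > h(1.0) for h in (h_gaussian, h_uniform, h_exponential))

# Across classes the order can be reversed: the uniform density below has the
# larger variance but the smaller entropy.
across_classes_reversed = h_uniform(1.1) < h_gaussian(1.0)
print(within_class, across_classes_reversed)
```

The reversal across classes is why conclusions about variance can be drawn from entropy only after the class of distributions has been fixed.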

In summary, it can be stated that minimal entropy may sometimes be a necessary, but surely will not often be a sufficient condition for optimality.

III. ERROR-CONTROLLED REGULATORS [3], [12]

A diagram such as Fig. 2 should be quite familiar to control engineers, for it is the representation of the well-known form of regulation by error control widely used by control engineers, e.g., in servomechanisms. It is surprising to note that a huge body of theories and techniques has evolved around this error-controlled regulator (in fact most of classical control theory deals with it), whereas in cybernetics it can easily be shown that it might well be better not to control by the error but by what gives rise to the error. Namely, it can be proven in cybernetics that such an error-controlled regulator can never regulate perfectly, that is, can never keep Z constant. The fundamental property of the error-controlled regulator, that it can never be perfect, is hardly considered by control engineers. Simply stated, the argument is that because R counteracts D only by learning of it from Z, which has already been affected by D, complete success is impossible. Of course the proof of this common-sense observation is not so trivial [12]. The deduction assumes that the transformation of Z to R is represented as a finite state machine so that if the entropy of Z is zero, it will be passed on to R, that is, the entropy of R also becomes zero. Because $\phi: D \times R \to Z$ is the mapping $Z = D - R$, it can be proven that the case of complete control, H(Z) = 0, can only occur when the input D is constant too, that is, when H(D) = 0.

Fig. 2. Error-controlled regulator. [Figure not reproduced; labels: D, Z, PROCESS, REGULATOR.]

Fig. 3. Cause-controlled regulator. [Figure not reproduced; labels: Z, R.]

In contrast to an error-controlled regulator which can never attain complete success (as has been proved), a cause-controlled regulator, as visualized in Fig. 3, is able to perfectly regulate Z because the regulator reacts directly to the disturbances which affect the system. This is the type of regulation in which the regulator anticipates S, so that the regulatory action is simultaneous with that of S. In cybernetic terms, the job of the regulator is to block the flow of information from D to Z.

Although an error-controlled regulator may be inferior in principle to a cause-controlled regulator, a well-known practical fact is that the success of the feedforward prediction in the latter scheme completely depends on the adequacy and validity of the system's model in the regulator. Slight differences between model and reality or changes in time of the real process will result in an accumulation of errors, which might result in instability (in the Lyapunov sense of the term). Moreover, in the next section it will be shown that the statement that a cause-controlled regulator can be optimal should be somewhat more specified.
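The contrast between the two schemes can be made concrete with a toy simulation of the mapping Z = D - R. The regulators below are assumptions made for illustration and are not the constructions used in the cited proofs: the error-controlled one is a simple integrator acting on the previous error, and the cause-controlled one applies a gain `model_gain` (a hypothetical parameter) to the measured disturbance. The sketch shows only the residual error under model mismatch; it does not model accumulation of errors or instability.

```python
def error_controlled(disturbances):
    """Feedback on the previous error: r[t] = r[t-1] + z[t-1], z[t] = d[t] - r[t]."""
    r, z_prev, outputs = 0, 0, []
    for d in disturbances:
        r = r + z_prev          # the regulator only knows the past outcome
        z = d - r               # system mapping Z = D - R
        outputs.append(z)
        z_prev = z
    return outputs


def cause_controlled(disturbances, model_gain=1.0):
    """Feedforward on the disturbance itself: r[t] = model_gain * d[t]."""
    return [d - model_gain * d for d in disturbances]


d = [1, 4, 2, 2, 5]
print(error_controlled(d))                   # nonzero whenever d changes
print(cause_controlled(d))                   # exact model: outcome held at zero
print(cause_controlled(d, model_gain=0.5))   # imperfect model: residual error
print(error_controlled([3, 3, 3, 3]))        # constant disturbance: error vanishes
```

In this toy setting the feedback regulator holds the outcome constant only once the disturbance itself stops changing, while the feedforward regulator is perfect only as long as its model of the disturbance path is exact.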

IV. THE NECESSITY OF MODELING IN REGULATION [4]

Although most control engineers will start their design of a control system for a complex dynamic process by making a model of the process, this modeling is often regarded as optional. Making a model has the intuitive appeal of being helpful, but it is quite possible that other methods of design without any model might do as well or even better, in which case the making of a model would be a waste of time. In other words, as long as it is not proven that modeling is a necessary part of regulation, the usefulness of this often difficult and time-consuming activity remains doubtful. It is the great merit of Conant and Ashby that they have proved that modeling is not only a helpful but a strictly necessary part of regulation. They show that any regulator which is both optimal and simple must be isomorphic with the system being regulated (see their paper for a precise definition of the terms used in this statement). Take Fig. 3 as the general configuration of a control system, with

$$\rho: D \to R, \qquad \sigma: D \to S, \qquad \phi: S \times R \to Z.$$

Given the disturbances D, the control actions R, the system variables S, and the outcome variables Z, their modeling theorem is: the simplest optimal regulator produces control actions R, related to S by a mapping $h: S \to R$ so that for all $d \in D$: $\rho(d) = h[\sigma(d)]$.

There is an aspect of their proof that is somewhat artificial: they first assume a probability distribution p(S) and a conditional distribution p(R/S) specifying the behavior of the regulator. Although a conditional distribution p(R/S) does not necessarily imply any causal relationship between R and S, this notation is the usual one for a stochastic system relation. Their proof might therefore be misinterpreted as suggesting that S affects R (contradicting their configuration of a cause-controlled regulator) or that the regulator itself is a stochastic system. In view of their conclusion about the deterministic relation $h: S \to R$, this seems artificial (compare with the proof given further on).

A more relevant remark to be made here is that formally they proved only that a simple optimal regulator must be a model. They did not prove that a model automatically is the simplest optimal regulator, and that is the way in which the control engineer would like to work. So let us examine the cause-controlled regulator somewhat more closely.

Assume the control system configuration of Fig. 3; then the following holds:

$$H(Z, R, S) = H(S, R) + H(Z/S, R).$$

Because there is a mapping $\phi: S \times R \to Z$, $H(Z/S, R) = 0$, so that

$$H(Z, S, R) = H(S, R) = H(S) + H(R/S).$$

By definition $H(R/S) \geq 0$, and thus $H(Z, S, R) \geq H(S)$. The equality $H(Z, S, R) = H(S)$ occurs if $H(R/S) = 0$, that is, if there is a mapping $h: S \to R$. On the other hand, the following is also an identity:

$$H(Z) = H(Z, R, S) - H(R, S/Z).$$

Because entropy is nonnegative, $H(Z) \leq H(Z, S, R)$ always holds.

Define the multiplicity k of the (many-to-one) mapping $\phi: R \times S \to Z$ as the largest number of pairs $(r_i, s_j) \in R \times S$ which map to a same $z \in Z$. Call K the binary log of k. Then $H(R, S/Z) \leq K$. Once $\phi$ is defined, K is a fixed value. Thus

$$H(Z) \geq H(Z, R, S) - K \geq H(S) - K,$$

while also, generally, $H(Z) \leq H(Z, S, R)$ with $H(Z) \leq H(S)$ in case $h: S \to R$ exists. This results in a dual inequality

$$H(S) \geq H(Z) \geq H(S) - K,$$

where the left-hand inequality assumes the existence of a mapping $h: S \to R$ and the right-hand inequality is general.
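The dual inequality can be checked on a small finite case. The following sketch is an illustration only: the system mapping `phi(r, s) = (r + s) mod 3`, the uniform distribution over S, and the two regulator mappings are assumptions, and entropies are in bits.

```python
import math
from collections import Counter


def entropy_bits(probabilities):
    """Shannon entropy (bits) of a discrete distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def phi(r, s):
    # Hypothetical many-to-one system mapping R x S -> Z.
    return (r + s) % 3


S = [0, 1, 2]   # system states, assumed equally likely
R = [0, 1, 2]   # available control actions

# Multiplicity k: the largest number of (r, s) pairs mapped to one outcome.
k = max(Counter(phi(r, s) for r in R for s in S).values())
K = math.log2(k)


def output_entropy(h):
    """H(Z) when the regulator is the mapping h: S -> R."""
    counts = Counter(phi(h(s), s) for s in S)
    return entropy_bits(c / len(S) for c in counts.values())


H_S = entropy_bits(1 / len(S) for _ in S)
H_Z_compensating = output_entropy(lambda s: (-s) % 3)  # action cancels the state
H_Z_fixed = output_entropy(lambda s: 0)                # fixed control action

print(H_S, K, H_Z_compensating, H_Z_fixed)
```

With this assumed mapping, the compensating regulator reaches the lower bound H(S) - K, while a fixed action sits at the upper bound H(S); both mappings satisfy the dual inequality.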

A few remarkable conclusions can be drawn from this inequality. If the mapping $\phi: R \times S \to Z$ is one-to-one, then K = 0, so that $H(Z) \geq H(S)$. Two cases can be distinguished in this inequality:

$H(Z) > H(S)$ if no mapping $h: S \to R$ exists, that is, if the control action R varies independently of S at least to some degree,

$H(Z) = H(S)$ if there exists a mapping $h: S \to R$ or, as a special case, if a fixed control action R* is chosen, whatever R* may be.

Hence, in the case of a one-to-one mapping there is no advantage of dynamic regulation over static (fixed) regulation (see also the example and remarks in Section II). In fact, regulation no longer makes any sense under such conditions, for R will never be able to suppress the variety in S. The foregoing implies that a regulator with a system model can only make use of the multiplicity in the system mapping. This seems a peculiar implication. But on close inspection this is just an exact formulation of the fact that the regulator's effectiveness depends on whether different disturbances compensated for by different control actions together lead to the same desired outcome, which is an extension or variation of the law of requisite variety. In other words, controllability depends on the multiplicity of the system mapping; e.g., the well-known mapping of an error-controlled regulator Z = D - R is many-to-one.

With respect to the original question, namely, whether or not a model also implies optimality, the following remarks are relevant. Because $H(Z, S, R) \geq H(S)$ always holds,

$$H(Z) \geq H(S) - H(R, S/Z)$$

also holds. When R is a model of S, i.e., $h: S \to R$ exists,

$$H(Z, R, S) = H(S)$$

and

$$H(R, S/Z) = H(S/Z)$$

hold, which implies that the entropy of the output is

$$H(Z) = H(S) - H(S/Z).$$

Note, however, that from the general inequality $H(R, S/Z) \geq H(S/Z)$ together with the conditional equality $H(R, S/Z) = H(S/Z)$ when $h: S \to R$ exists, it cannot be concluded that this H(S/Z) is the minimum of H(R, S/Z); in other words, it cannot be concluded that H(Z) is minimal if $h: S \to R$ exists. A simple counterexample can prove that this is actually false.

Assume that $S = \{s_1, s_2\}$, $R = \{r_1, r_2\}$, and $Z = \{z_1, z_2\}$, and that $\phi: S \times R \to Z$ is defined by the transition table

Sl S2

ri | ZS 2

r2 Z2 Z2

If the following model mappings $h: S \to R$

S - ri

S2- r2 or

S- r2

S2 +r

are taken, the output will be suboptimal. Only the particular model mappings

S2+ ri or

s-+r2

produce an optimal output. The conclusion is that the set of optimal regulators is a subset of the set of model regulators.

By adopting a different definition of optimality, namely, that of attaining a prescribed goal, a very simple alternative proof of the hypothesis that the simplest optimal regulator must be an isomorphism $h: S \to R$ can be given [13]. Given the system mapping $\phi: R \times S \to Z$, the goal G can be viewed as a subset of the total set of outcomes: $G \subset Z$. The measure of optimality here is whether or not $z \in G$. In fact, this is a measure of effectiveness. To ensure that $z \in G$, it is necessary that the pair $(r, s) \in \phi^{-1}(G)$. Here $\phi^{-1}(G)$ is defined to be $\{(r, s) \mid (r, s, z) \in \phi \wedge z \in G\}$. Evidently $\phi^{-1}(G) \subset R \times S$, which exists everywhere under the assumption that the system is controllable. Thus defined, this is a relation which, however, is not necessarily single valued. The regulator can choose from various possible control actions. The simplest regulator will be the mapping $h: S \to R$ with $h \subset \phi^{-1}(G)$. This deduction of the simplest optimal controller is much more simple than that of Conant and Ashby; actually it is trivial.
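The goal-based construction is easy to carry out on a finite table. The sketch below is illustrative and rests on assumptions: it reuses the transition table of the Section II example with the system variable S identified with the disturbance D, and when the inverse relation is multivalued it picks the alphabetically first admissible action. The names `preimage` and `simplest_regulator` are hypothetical.

```python
# System mapping phi: R x S -> Z, stored as TABLE[action][state] -> outcome.
TABLE = {
    "p": {"a": 1, "b": 9, "c": 4},
    "q": {"a": 5, "b": 2, "c": 8},
    "r": {"a": 7, "b": 6, "c": 3},
}
STATES = ("a", "b", "c")


def preimage(goal):
    """All (action, state) pairs whose outcome lies in the goal set G."""
    return {(r, s) for r, row in TABLE.items() for s, z in row.items() if z in goal}


def simplest_regulator(goal):
    """One mapping h: S -> R contained in the preimage of G, or None if some state has no admissible action."""
    allowed = preimage(goal)
    h = {}
    for s in STATES:
        candidates = sorted(r for (r, s2) in allowed if s2 == s)
        if not candidates:
            return None  # the goal cannot be reached from this state
        h[s] = candidates[0]  # arbitrary but deterministic choice
    return h


print(simplest_regulator({4, 5, 6}))
print(simplest_regulator({4, 5, 6, 7}))
print(simplest_regulator({9}))
```

For the goal {4, 5, 6} the selected mapping coincides with control strategy 2) of the Section II example; for a goal that some state cannot reach, no regulator mapping exists, which corresponds to the controllability assumption in the text.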

We will end this section with some general remarks about the simplest controller. The theorem which states that when a regulator is optimal there must exist a mapping from process variables to control variables seems to be of little practical relevance for classical control engineers. For instead of working with instantaneous values of time variables as in cybernetics, where mappings such as $\sigma$ are generally not one-to-one and consequently $\rho \sigma^{-1}$ is generally not a mapping, classical control theory works with variables in the Laplace domain. In this domain the mappings (the transfer functions) are all deterministic and one-to-one, so that the system's mapping $\sigma$ and the regulator's mapping $\rho$ can always be transformed into a "model mapping" $h = \rho \sigma^{-1}: S \to R$. Every controller would then be a model. Because the use of Laplace transforms presupposes that all initial conditions of the differential equations are equal to zero, one can even state that in the time domain the time function mappings of classical control theory are one-to-one. Note that in that time domain one works with mappings between time series and not with instantaneous values of variables as in cybernetics.

Secondly, it should be noted that the regulator $\rho$ is a model of the subsystem $\sigma$, and not of the whole system from disturbance up to the output. It seems obvious that a sensible control engineer wants to predict, and hence to model, the whole mechanism from disturbance to output before designing a controller. The system part consisting of the transformation $\phi: R \times S \to Z$ should be incorporated in the model. Strongly related to this point is the question of the meaning which should be attached to both subsystems $\sigma$ and $\phi$. Much depends on the interpretation of these abstract blocks.

Thirdly, the exact meaning of the statement that "every good regulator must be a model" should once more be emphasized. It does mean that optimality implies modeling, but it does not mean that modeling implies optimality. In other words, the class of optimal regulators is only a subset of the class of model regulators. The condition of modeling for optimal regulation is necessary but not sufficient. In practice this means that designing a regulator indeed must start by modeling, but that not every model leads to optimal regulation.

V. CONCLUSION

It has been shown that the basis of the cybernetic theory of information and control, the concept of entropy, has some practical limitations as a measure of optimality. Furthermore, the theorem that a cause-controlled regulator can be completely successful, as opposed to an error-controlled regulator, fails to deal with the problem of stability. The theorem that every regulator must necessarily consist of a model of the system to be regulated has been shown to be of less practical use than it seems.

These comments definitely do not imply that those theorems are useless. On the contrary, they represent the first valuable steps towards a fruitful and practical link between the two well-developed theories of information and control. It may be that this theory of cybernetics will shed a new and useful light on the development of control systems.

REFERENCES

[1] N. Wiener, Cybernetics. Cambridge, MA: Massachusetts Institute of Technology, 1948.

[2] W. R. Ashby, An Introduction to Cybernetics. London: Chapman & Hall, 1956.

[3] R. C. Conant, "The information transfer required in regulatory processes," IEEE Trans. Syst. Sci. Cybern., vol. SSC-5, no. 4, pp. 334-338, Oct. 1969.

[4] R. C. Conant and W. R. Ashby, "Every good regulator of a system must be a model of that system," Int. J. Systems Sci., 1970, vol. 1, no. 2, pp. 89-97.

[5] W. R. Ashby, "Measuring the internal informational exchange in a system," Cybernetica, vol. VIII, no. 1, pp. 5-22, 1965.

[6] R. C. Conant, "Detecting subsystems of a complex system," IEEE Trans. Syst., Man, Cybern., vol. SMC-2, no. 4, pp. 550-553, Sept. 1972.

[7] R. C. Conant, "Information flows in hierarchical systems," Int. J. General Systems, vol. 1, pp. 9-18, 1974.

[8] R. C. Conant, "Laws of information which govern systems," IEEE Trans. Syst., Man, Cybern., vol. SMC-6, no. 4, pp. 240-255, Apr. 1976.

[9] B. Porter, "Requisite variety in the systems and control sciences," Int. J. General Systems, vol. 2, pp. 225-229, 1976.

[10] W. R. Ashby, "Requisite variety and its implications for the control of complex systems," Cybernetica, vol. I, no. 2, pp. 83-99, 1958.

[11] C. E. Shannon and W. Weaver, "The mathematical theory of communication," Univ. of Illinois Press, 1949, 12th printing, 1971.

[12] R. C. Conant, "Information transfer in complex systems, with applications to regulation," dissertation, Univ. of Illinois, 1968.

[13] A. C. J. de Leeuw, "Systems theory and organisation theory," dissertation. Eindhoven: Techn. Univ. Eindhoven, 1974 (Dutch).
