Some comments on cybernetics and control
Citation for published version (APA): Kickert, W. J. M., Bertrand, J. W. M., & Praagman, J. (1978). Some comments on cybernetics and control. IEEE Transactions on Systems, Man and Cybernetics, 8(11), 805-809. https://doi.org/10.1109/TSMC.1978.4309868
DOI:
10.1109/TSMC.1978.4309868
Document status and date: Published: 01/01/1978
Document Version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. SMC-8, NO. 11, NOVEMBER 1978
Correspondence
Some Comments on Cybernetics and Control
WALTER J. M. KICKERT, JAN-WILLEM M. BERTRAND, AND JAAP PRAAGMAN
Abstract-The theory of cybernetics as introduced by Ashby and developed by Ashby and Conant will be analyzed and commented upon. Ashby's law of requisite variety and, in particular, the underlying measure of optimality, the quantity of entropy, are examined. Next the cybernetic theorem of error control and cause control is observed, and finally the cybernetic theorem of the necessity of modeling for regulation is studied. In all three cases several practical conditions and restrictions for the applicability of the theorems to control engineering are pointed out.
I. INTRODUCTION
Since the first introduction of cybernetics by N. Wiener [1], as a "science of communication and control," numerous contributions to this field have been made. One of the most outstanding contributions, in our opinion, has been made by the late W. R. Ashby, who laid a basis for the link between information and control with his well-known law of requisite variety [2]. Both he and R. C. Conant further developed this theory. Some important and well-known elements of this development are their views on error-controlled regulators versus cause-controlled regulators [3], and their views on modeling as a necessary part of regulation [4]. These elements only represent part of their work, especially in view of their recent achievements in the field of complex and hierarchical systems [5]-[8]. The analysis and comments in this correspondence, however, will be restricted to the above-mentioned points, mainly because of their relevance for control.
There are three reasons which emphasize the necessity of analysis, comments, and criticism. First is the obvious importance of this branch of cybernetics for control theory. It will be clear that information and communication play an important role in control and that, more specifically, the above-mentioned issues are certainly very important for control theory. The second reason is the incomprehensible ignorance of this theory on the part of control engineers. It is astonishing how little attention the theory of communication, and in particular, the theory of cybernetics has received in control theory, though its importance is quite clear. The third reason is the unshakable popularity of this theory among system theorists. One might hope that some "cross talk between cyberneticians and control engineers," as Porter calls it [9], will result in a greater appreciation by control engineers of Ashby's link between information and control, and in a somewhat restricted but better founded popularity among system theorists.
Manuscript received August 31, 1977; revised July 6, 1978.
The authors are with the Department of Industrial Engineering, Technological University of Eindhoven, Eindhoven, The Netherlands.
II. THE LAW OF REQUISITE VARIETY [2], [3], [10]
The most famous law of cybernetics is undoubtedly Ashby's law of requisite variety. This very general law, which contrary to the usual control theory does not presuppose linearity, low-order structure, etc., gives an upper limit to the degree of controllability that a regulator can possibly achieve. In view of the generality of the law and its obvious importance for control, which can easily be shown in numerous examples, it is indeed astonishing that so little attention is paid to this law by control theorists.
The law of requisite variety states that the capacity of a device as a regulator cannot exceed its capacity as a channel of communication, or to put it in Ashby's words: "only variety in the regulator can force down the variety due to the disturbances; only variety can destroy variety."
Imagine a system composed of a set D of disturbances, a set R of control actions, and a set Z of outcomes, defined by a mapping $\phi: D \times R \to Z$. This obviously represents a control system. By taking finite discrete sets D, R, and Z and by visualizing $\phi$ as a table, it can easily be shown that the goal of keeping the outcome $z_k \in Z$ constant, that is, of decreasing the variety in the outcomes, can only be met by a corresponding increase in the variety of R.
The relation between information and control is essentially the following. As the criterion for the success of the regulator (the goal is constancy of the outcomes Z), Shannon's measure of selective information in a signal, the entropy H(Z), is introduced:
$$H(Z) = -\sum_{z_i \in Z} p(z_i) \log_2 p(z_i).$$
Ashby states that optimality of a regulator R is equivalent to the minimization of the entropy H(Z) of the outcomes. One of the advantages of this measure of optimality is that it does not presume numerical variables; entropy also applies to variables that can only be classified (nominal).
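As a minimal sketch of how this entropy criterion is evaluated, the following Python fragment computes H(Z) in bits for a discrete outcome distribution. The function name `entropy` and the three-outcome distribution are illustrative assumptions, not taken from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution given as probabilities."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical outcome distribution over three outcomes z1, z2, z3.
p_z = [0.5, 0.25, 0.25]
h_z = entropy(p_z)  # 1.5 bits

# A regulator that always produces the same outcome has zero output entropy.
h_constant = entropy([1.0])
```

Only the probabilities enter the computation; the outcomes themselves may be purely nominal labels, which is the advantage mentioned above.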
Strictly speaking, the use of this entropy measure implies that the variables involved (outcomes) are stochastic. (Notice that the requirement of stochastic system variables does not imply that the system relations be stochastic; on the contrary, most of this cybernetic theory leads to the necessity of deterministic system relations.) Most frequently, however, the entropy measure is applied as a measure of variety without strict probability density functions. It then serves as a measure of the number of possible alternatives. The assumption behind this use of entropy is that all n alternatives have the same probability:
$$H = -\sum_{i=1}^{n} p(z_i) \log p(z_i) = -\sum_{i=1}^{n} \frac{1}{n} \log \frac{1}{n} = \log n.$$
This probability assumption is often omitted. Hence, in fact, the entropy measure is not only used in case of stochastic variables but also with (varying) deterministic variables.
It should be remarked that although the use of entropy has an advantage over classical control theory in that it incorporates stochastic variables, it does not solve the kind of problems that are solved by classical control theory, simply because entropy does not deal with them. Although stability analysis, transfer function theory, etc., do not exclude the existence of stochastic signals, those theories just do not consider it; they deal with analytical functions in time (pulse, step, ramp, sinusoid, transient response, steady state, etc.). In contrast, entropy only considers aggregates of variables, such as density functions and varieties. Hence, a great deal of control theory is not covered by this theory of cybernetics.
0018-9472/78/1100-0805$00.75 © 1978 IEEE
Fig. 1. Comparison of entropies (axis label: p(Z)).
Secondly, it seems questionable to equate optimality with minimal entropy. A well-known fact of information theory is that the entropy H(Z) is minimal when there is one i such that $p(z_i) = 1$ and $p(z_j) = 0$, $\forall j \neq i$, and that H(Z) is maximal when all $p(z_j)$, $z_j \in Z$, are equal. Minimizing H(Z) means compressing the probability distribution of Z. Hence, whatever the position of $z_i$, H(Z) will decrease if the probability distribution is compressed around $z_i$ (see Fig. 1). This means that entropy only measures variety and does not take into account the absolute position of the optimal variable. This omission has serious implications in questions of steady-state errors, bias, etc.
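The indifference of entropy to position can be illustrated with a small sketch. The two distributions below are hypothetical; they have the same shape but are concentrated around different outcomes, and the helper names are assumptions made for this illustration.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

def mean(dist):
    """Expected value of a numerically valued outcome."""
    return sum(z * p for z, p in dist.items())

# Two hypothetical output distributions with identical shape,
# concentrated around z = 2 and around z = 8 respectively.
near_2 = {1: 0.1, 2: 0.8, 3: 0.1}
near_8 = {7: 0.1, 8: 0.8, 9: 0.1}

h_near_2 = entropy(near_2)
h_near_8 = entropy(near_8)

# Extremes: a point mass gives the minimum, equal probabilities the maximum.
h_min = entropy({5: 1.0})
h_max = entropy({1: 1 / 3, 2: 1 / 3, 3: 1 / 3})
```

Both distributions receive the same entropy although their means differ by 6, so a desired output value near 2 cannot be distinguished from one near 8 by this measure alone.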
Thirdly, some attention should be paid to the concept of variety, for which entropy is a measure. A good regulator should minimize the variability of the output. Apart from the possibly additional requirement of a particular output value (which entropy does not guarantee), this requirement usually means that an output variable should fluctuate as little as possible. The usual measure for this is variance. However, it should be emphasized that between these two there is a difference of practical relevance, and not just of mathematics. Let us try to illustrate this difference through an example.
Example: Imagine a control system with a disturbance set D = {a, b, c}, a control action set R = {p, q, r}, and an output set Z = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The transition table of $D \times R \to Z$ is

        a   b   c
    p   1   9   4
    q   5   2   8
    r   7   6   3

Let the three values of D be equally likely. Then the measure of variety, the entropy, of D is

    H(D) = log 3.

Suppose we take the following two control strategies: 1) maintain a fixed control action R* = p, and 2) relate the control action to the disturbance as follows:

    a → q
    b → r
    c → p

The resulting outputs will be, respectively,

    1) Z = {1, 4, 9}
    2) Z = {4, 5, 6}.

The entropy of both outputs is H(Z) = log 3. Hence, according to this measure, neither of the control strategies suppresses the variety in the disturbances at all, and no difference exists between the static 1) and dynamic 2) control strategy. But supposing the output values admit a numerical interpretation and taking the variance as a measure of variability, it will be clear that control strategy 2) is far better than strategy 1). The variance of (1, 4, 9) is much larger than that of (4, 5, 6), irrespective of any desired mean value.
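The example can be checked numerically with the short sketch below. The table is the one given in the example; the third assignment of strategy 2), c to p, is the one consistent with the stated output {4, 5, 6}. Function and variable names are illustrative assumptions.

```python
import math
from collections import Counter

# Transition table of the example: TABLE[r][d] is the outcome z.
TABLE = {
    "p": {"a": 1, "b": 9, "c": 4},
    "q": {"a": 5, "b": 2, "c": 8},
    "r": {"a": 7, "b": 6, "c": 3},
}
DISTURBANCES = ["a", "b", "c"]  # equally likely, as in the text

def outputs(strategy):
    """Outcome for each disturbance under a strategy mapping d -> r."""
    return [TABLE[strategy[d]][d] for d in DISTURBANCES]

def entropy_bits(values):
    """Entropy in bits of equally weighted observations."""
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())

def variance(values):
    """Population variance of equally weighted observations."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

fixed = {"a": "p", "b": "p", "c": "p"}    # strategy 1: always p
matched = {"a": "q", "b": "r", "c": "p"}  # strategy 2: a->q, b->r, c->p

z_fixed = outputs(fixed)      # [1, 9, 4]
z_matched = outputs(matched)  # [5, 6, 4]
```

Both strategies yield an output entropy of log2 3 bits, while the variances are 98/9 for strategy 1) and 2/3 for strategy 2).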
Thus entropy is neither a measure for the mean value of the fluctuating variable, nor a measure for the variance around this mean value. Entropy is a measure for variety, that is, the number of possibilities weighted according to their probabilities of occurrence.
This remark represents quite a serious objection against Ashby's measure of entropy in case of a numerical output variable, so let us examine this somewhat more formally. Entropy is defined as
$$H(Z) = -\sum_{z_i \in Z} p(z = z_i) \log p(z = z_i),$$
while variance is defined as
$$\operatorname{var}(Z) = \sum_{z_i \in Z} p(z = z_i)\,(z_i - \bar{z})^2.$$
These definitions show the above-mentioned difference: entropy is a nominal measure, whereas variance is an interval (metric) measure. These reflections lead us to the conclusion that because entropy is indifferent to distances between the $z_i$, it is generally not a measure of variance.
There still remains the interesting question of the conditions under which the entropy is a measure of variance. In other words, what properties should the probability density function p(Z) have, so as to permit conclusions from entropies about variances? Let us start by requiring order preservation:
$$H(z_1) > H(z_2) \quad \text{iff} \quad \operatorname{var}(z_1) > \operatorname{var}(z_2).$$
A sufficient condition for order preservation is that a strictly monotonic mapping exists between entropy and variance. Now it is clear from proposition 20.5 of Shannon [11] that when order preservation is required we have to restrict ourselves to certain classes of distribution functions, as the proposition implies that for any arbitrary non-Gaussian distribution, there exists a Gaussian distribution with the same variance but higher entropy. Classes of distributions for which order preservation does hold are the Gaussian, uniform, and exponential distribution. For these classes the entropy equals, respectively,
$$\log \sigma_z \sqrt{2\pi e}, \qquad \log \sigma_z \sqrt{12}, \qquad \log \sigma_z e,$$
where $\sigma_z$ is the standard deviation, so that $\sigma_z^2$ is the variance. From these results it also follows that there is no class of distribution functions for which anything stronger than order preservation holds.
In summary, it can be stated that minimal entropy may sometimes be a necessary, but surely will not often be a sufficient condition for optimality.
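The order-preservation property within each of the three classes can be sketched numerically. The closed forms below are the standard differential entropies of the Gaussian, uniform, and exponential densities, written in bits and parameterized by the standard deviation sigma (an assumption of this sketch: the formulas hold with sigma read as the standard deviation, not the variance). The function names are illustrative.

```python
import math

def h_gaussian(sigma):
    """Differential entropy (bits) of a Gaussian with standard deviation sigma."""
    return math.log2(sigma * math.sqrt(2 * math.pi * math.e))

def h_uniform(sigma):
    """Differential entropy (bits) of a uniform density with standard deviation sigma."""
    return math.log2(sigma * math.sqrt(12))

def h_exponential(sigma):
    """Differential entropy (bits) of an exponential density with standard deviation sigma."""
    return math.log2(sigma * math.e)

sigmas = [0.5, 1.0, 2.0, 4.0]
rows = [(s, h_gaussian(s), h_uniform(s), h_exponential(s)) for s in sigmas]
```

Within each class the entropy grows strictly with sigma, so comparing entropies orders the variances correctly. Across classes this fails: at equal sigma the Gaussian has the largest entropy of the three, so an entropy comparison between, say, a Gaussian and a uniform output says nothing reliable about their variances.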
III. ERROR-CONTROLLED REGULATORS [3], [12]
A diagram such as Fig. 2 should be quite familiar to control engineers, for it is the representation of the well-known form of regulation by error control widely used by control engineers, e.g., in servomechanisms. It is surprising to note that a huge body of theories and techniques has evolved around this error-controlled regulator (in fact most of classical control theory deals with it), whereas in cybernetics it can easily be shown that it might well be better not to control by the error but by what gives rise to the error. Namely, it can be proven in cybernetics that such an error-controlled regulator can never regulate perfectly, that is, can never keep Z constant. The fundamental property of the error-controlled regulator, that it can never be perfect, is hardly considered by control engineers. Simply stated, the argument is that because R counteracts D only by learning of it from Z, which has already been affected by D, complete success is impossible. Of course the proof of this common-sense observation is not so trivial [12]. The deduction assumes that the transformation of Z to R is represented as a finite state machine so that if the entropy of Z is zero, it will be passed on to R, that is, the entropy of R also becomes zero. Because $\phi: D \times R \to Z$ is the mapping Z = D - R, it can be proven that the case of complete control, H(Z) = 0, can only occur when the input D is constant too, that is, when H(D) = 0.
Fig. 2. Error-controlled regulator (block labels: D, PROCESS, REGULATOR, Z).
Fig. 3. Cause-controlled regulator.
In contrast to an error-controlled regulator which can never attain complete success (as has been proved), a cause-controlled regulator, as visualized in Fig. 3, is able to perfectly regulate Z because the regulator reacts directly to the disturbances which affect the system. This is the type of regulation in which the regulator anticipates S, so that the regulatory action is simultaneous with that of S. In cybernetic terms, the job of the regulator is to block the flow of information from D to Z.
Although an error-controlled regulator may be inferior in principle to a cause-controlled regulator, a well-known practical fact is that the success of the feedforward prediction in the latter scheme completely depends on the adequacy and validity of the system's model in the regulator. Slight differences between model and reality or changes in time of the real process will result in an accumulation of errors, which might result in instability (in the Lyapunov sense of the term). Moreover, in the next section it will be shown that the statement that a cause-controlled regulator can be optimal should be somewhat more specified.
IV. THE NECESSITY OF MODELING IN REGULATION [4]
Although most control engineers will start their design of a control system for a complex dynamic process by making a model of the process, this modeling is often regarded as optional. Making a model has the intuitive appeal of being helpful, but it is quite possible that other methods of design without any model might do as well or even better, in which case the making of a model would be a waste of time. In other words, as long as it is not proven that modeling is a necessary part of regulation, the usefulness of this often difficult and time-consuming activity remains doubtful. It is the great merit of Conant and Ashby that they have proved that modeling is not only a helpful but a strictly necessary part of regulation. They show that any regulator which is both optimal and simple must be isomorphic with the system being regulated (see their paper for a precise definition of the terms used in this statement).
Take Fig. 3 as the general configuration of a control system, with
$$\rho: D \to R, \qquad \sigma: D \to S, \qquad \phi: S \times R \to Z.$$
Given the disturbances D, the control actions R, the system variables S, and the outcome variables Z, their modeling theorem is: the simplest optimal regulator produces control actions R, related to S by a mapping $h: S \to R$ so that for all $d \in D$: $\rho(d) = h[\sigma(d)]$.
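For finite sets, the condition in this theorem, that the regulator mapping from D to R factors through the system mapping from D to S, can be tested directly. The sketch below is an illustration under assumed names (`model_mapping`, `rho`, `sigma`) and hypothetical data; it only checks whether such an h exists and says nothing about optimality.

```python
def model_mapping(rho, sigma):
    """Return h with rho[d] == h[sigma[d]] for all d, or None if no such mapping exists.

    rho:   dict mapping each disturbance d to a control action r
    sigma: dict mapping each disturbance d to a system state s
    """
    h = {}
    for d, s in sigma.items():
        r = rho[d]
        if h.setdefault(s, r) != r:
            return None  # the same s would need two different actions
    return h

# Hypothetical example: four disturbances, two system states.
sigma = {"d1": "s1", "d2": "s1", "d3": "s2", "d4": "s2"}
rho_model = {"d1": "r1", "d2": "r1", "d3": "r2", "d4": "r2"}
rho_other = {"d1": "r1", "d2": "r2", "d3": "r2", "d4": "r2"}

h = model_mapping(rho_model, sigma)     # {'s1': 'r1', 's2': 'r2'}
no_h = model_mapping(rho_other, sigma)  # None
```

In the second case the regulator distinguishes between d1 and d2 although the system does not, so its action varies independently of S and no mapping h exists.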
There is an aspect of their proof that is somewhat artificial: they first assume a probability distribution p(S) and a conditional distribution p(R/S) specifying the behavior of the regulator. Although a conditional distribution p(R/S) does not necessarily imply any causal relationship between R and S, this notation is the usual one for a stochastic system relation. Their proof might therefore be misinterpreted as suggesting that S affects R (contradicting their configuration of a cause-controlled regulator) or that the regulator itself is a stochastic system. In view of their conclusion about the deterministic relation $h: S \to R$, this seems artificial (compare with the proof given further on).
A more relevant remark to be made here is that formally they proved only that a simple optimal regulator must be a model. They did not prove that a model automatically is the simplest optimal regulator, and that is the way in which the control engineer would like to work. So let us examine the cause-controlled regulator somewhat more closely.
Assume the control system configuration of Fig. 3; then the following holds:
$$H(Z, R, S) = H(S, R) + H(Z/S, R).$$
Because there is a mapping $\phi: S \times R \to Z$, $H(Z/S, R) = 0$, so that
$$H(Z, S, R) = H(S, R) = H(S) + H(R/S).$$
By definition $H(R/S) \geq 0$, and thus $H(Z, S, R) \geq H(S)$. The equality $H(Z, S, R) = H(S)$ occurs if $H(R/S) = 0$, that is, if there is a mapping $h: S \to R$. On the other hand, the following is also an identity:
$$H(Z) = H(Z, R, S) - H(R, S/Z).$$
Because entropy is nonnegative, $H(Z) \leq H(Z, S, R)$ always holds. Define the multiplicity k of the (many-to-one) mapping $\phi: R \times S \to Z$ as the largest number of pairs $(r_i, s_j) \in R \times S$ which map to a same $z \in Z$. Call K the binary log of k. Then $H(R, S/Z) \leq K$. Once $\phi$ is defined, K is a fixed value. Thus
$$H(Z) \geq H(Z, R, S) - K \geq H(S) - K,$$
while also, generally, $H(Z) \leq H(Z, S, R)$ with $H(Z) \leq H(S)$ in case $h: S \to R$ exists. This results in a dual inequality
$$H(S) \geq H(Z) \geq H(S) - K,$$
where the left-hand inequality assumes the existence of a mapping $h: S \to R$ and the right-hand inequality is general.
A few remarkable conclusions can be drawn from this inequality. If the mapping $\phi: R \times S \to Z$ is one-to-one, then K = 0, so that $H(Z) \geq H(S)$. Two cases can be distinguished in this inequality:
$H(Z) > H(S)$ if no mapping $h: S \to R$ exists, that is, if the control action R varies independently of S at least to some degree,
$H(Z) = H(S)$ if there exists a mapping $h: S \to R$ or, as a special case, if a fixed control action R* is chosen, whatever R* may be.
Hence, in the case of a one-to-one mapping there is no advantage of dynamic regulation over static (fixed) regulation (see also the example and remarks in Section II). In fact, regulation no longer makes any sense under such conditions, for R will never be able to suppress the variety in S. The foregoing implies that a regulator with a system model can only make use of the multiplicity in the system mapping. This seems a peculiar implication. But on close inspection this is just an exact formulation of the fact that the regulator's effectiveness depends on whether different disturbances compensated for by different control actions together lead to the same desired outcome, which is an extension or variation of the law of requisite variety. In other words, controllability depends on the multiplicity of the system mapping; e.g., the well-known mapping of an error-controlled regulator Z = D - R is many-to-one.
With respect to the original question, namely, whether or not a model also implies optimality, the following remarks are relevant. Because $H(Z, S, R) \geq H(S)$ always holds,
$$H(Z) \geq H(S) - H(R, S/Z)$$
also holds. When R is a model of S, i.e., $h: S \to R$ exists,
$$H(Z, R, S) = H(S) \quad \text{and} \quad H(R, S/Z) = H(S/Z)$$
hold, which implies that the entropy of the output is
$$H(Z) = H(S) - H(S/Z).$$
Note, however, that from the general inequality $H(R, S/Z) \geq H(S/Z)$ together with the conditional equality $H(R, S/Z) = H(S/Z)$ when $h: S \to R$ exists, it cannot be concluded that this $H(S/Z)$ is the minimum of $H(R, S/Z)$; in other words, it cannot be concluded that H(Z) is minimal if $h: S \to R$ exists. A simple counterexample can prove that this is actually false.
Assume that $S = \{s_1, s_2\}$, $R = \{r_1, r_2\}$, and $Z = \{z_1, z_2\}$, and that $\phi: S \times R \to Z$ is defined by the transition table

          s1   s2
    r1    z1   z1
    r2    z2   z2

If the following model mappings $h: S \to R$,

    s1 → r1, s2 → r2    or    s1 → r2, s2 → r1,

are taken, the output will be suboptimal. Only the particular model mappings

    S → r1    or    S → r2

produce an optimal output. The conclusion is that the set of optimal regulators is a subset of the set of model regulators.
By adopting a different definition of optimality, namely, that of attaining a prescribed goal, a very simple alternative proof of the hypothesis that the simplest optimal regulator must be an isomorphism $h: S \to R$ can be given [13].
Given the system mapping $\phi: R \times S \to Z$, the goal G can be viewed as a subset of the total set of outcomes: $G \subset Z$. The measure of optimality here is whether or not $z \in G$. In fact, this is a measure of effectiveness. To ensure that $z \in G$, it is necessary that the pair $(r, s) \in \phi^{-1}(G)$. Here $\phi^{-1}(G)$ is defined to be
$$\phi^{-1}(G) = \{(r, s) \mid (r, s, z) \in \phi \wedge z \in G\}.$$
Evidently $\phi^{-1}(G) \subset R \times S$, which exists everywhere under the assumption that the system is controllable. Thus defined, this is a relation which, however, is not necessarily single valued. The regulator can choose from various possible control actions. The simplest regulator will be the mapping $h: S \to R$ with $h \subset \phi^{-1}(G)$. This deduction of the simplest optimal controller is much more simple than that of Conant and Ashby; actually it is trivial.
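The goal-based construction can be sketched for finite sets as follows. The system table, the outcome labels, and the function names are hypothetical; picking the first admissible action per state is just one arbitrary way of making the relation single valued.

```python
from itertools import product

def preimage_of_goal(phi, R, S, goal):
    """All pairs (r, s) whose outcome phi[(r, s)] lies in the goal set."""
    return {(r, s) for r, s in product(R, S) if phi[(r, s)] in goal}

def simplest_regulator(phi, R, S, goal):
    """Pick one admissible action per state; None if some state cannot reach the goal."""
    allowed = preimage_of_goal(phi, R, S, goal)
    h = {}
    for s in S:
        choices = [r for r in R if (r, s) in allowed]
        if not choices:
            return None  # system not controllable with respect to this goal
        h[s] = choices[0]  # any admissible choice works; take the first
    return h

# Hypothetical system: two actions, three states, outcomes "low"/"ok"/"high".
R = ["r1", "r2"]
S = ["s1", "s2", "s3"]
phi = {
    ("r1", "s1"): "ok",  ("r1", "s2"): "high", ("r1", "s3"): "ok",
    ("r2", "s1"): "low", ("r2", "s2"): "ok",   ("r2", "s3"): "ok",
}
h = simplest_regulator(phi, R, S, goal={"ok"})
```

Here state s3 admits both actions, which shows why the inverse image of G is a relation rather than a mapping; the regulator becomes a mapping only after one action is selected for each state.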
We will end this section with some general remarks about the simplest controller. The theorem which states that when a regulator is optimal there must exist a mapping from process variables to control variables seems to be of little practical relevance for classical control engineers. For instead of working with instantaneous values of time variables as in cybernetics, where mappings such as $\sigma$ are generally not one-to-one and consequently $\rho\sigma^{-1}$ is generally not a mapping, classical control theory works with variables in the Laplace domain. In this domain the mappings (the transfer functions) are all deterministic and one-to-one, so that the system's mapping $\sigma$ and the regulator's mapping $\rho$ can always be transformed into a "model mapping" $h = \rho\sigma^{-1}: S \to R$. Every controller would then be a model. Because the use of Laplace transforms presupposes that all initial conditions of the differential equations are equal to zero, one can even state that in the time domain the time function mappings of classical control theory are one-to-one. Note that in that time domain one works with mappings between time series and not with instantaneous values of variables like in cybernetics.
Secondly, it should be noted that the regulator $\rho$ is a model of the subsystem $\sigma$, and not of the whole system from disturbance up to the output. It seems obvious that a sensible control engineer wants to predict, and hence to model, the whole mechanism from disturbance to output before designing a controller. The system part consisting of the transformation $\phi: R \times S \to Z$ should be incorporated in the model. Strongly related to this point is the question of the meaning which should be attached to both subsystems $\sigma$ and $\phi$. Much depends on the interpretation of these abstract blocks.
Thirdly, the exact meaning of the statement that "every good regulator must be a model" should once more be emphasized. It does mean that optimality implies modeling, but it does not mean that modeling implies optimality. In other words, the class of optimal regulators is only a subset of the class of model regulators. The condition of modeling for optimal regulation is necessary but not sufficient. In practice this means that designing a regulator indeed must start by modeling, but that not every model leads to optimal regulation.
V. CONCLUSION
It has been shown that the basis of the cybernetic theory of information and control, the concept of entropy, has some practical limitations as a measure of optimality. Furthermore, the theorem that a cause-controlled regulator can be completely successful as opposed to an error-controlled regulator fails to deal with the problem of stability. The theorem that every regulator must necessarily consist of a model of the system to be regulated has been shown to be of less practical use than it seems.
These comments definitely do not imply that those theorems are useless. On the contrary, they represent the first valuable steps towards a fruitful and practical link between the two well-developed theories of information and control. It may be that this theory of cybernetics will shed a new and useful light on the development of control systems.
REFERENCES
[1] N. Wiener, Cybernetics. Cambridge, MA: Massachusetts Institute of Technology, 1948.
[2] W. R. Ashby, An Introduction to Cybernetics. London: Chapman & Hall, 1956.
[3] R. C. Conant, "The information transfer required in regulatory processes," IEEE Trans. Syst. Sci. Cybern., vol. SSC-5, no. 4, pp. 334-338, Oct. 1969.
[4] R. C. Conant and W. R. Ashby, "Every good regulator of a system must be a model of that system," Int. J. Systems Sci., 1970, vol. 1, no. 2, pp. 89-97.
[5] W. R. Ashby, "Measuring the internal informational exchange in a system," Cybernetica, vol. VIII, no. 1, pp. 5-22, 1965.
[6] R. C. Conant, "Detecting subsystems of a complex system," IEEE Trans. Syst., Man, Cybern., vol. SMC-2, no. 4, pp. 550-553, Sept. 1972.
[7] R. C. Conant, "Information flows in hierarchical systems," Int. J. General Systems, vol. 1, pp. 9-18, 1974.
[8] R. C. Conant, "Laws of information which govern systems," IEEE Trans. Syst., Man, Cybern., vol. SMC-6, no. 4, pp. 240-255, Apr. 1976.
[9] B. Porter, "Requisite variety in the systems and control sciences," Int. J. General Systems, vol. 2, pp. 225-229, 1976.
[10] W. R. Ashby, "Requisite variety and its implications for the control of complex systems," Cybernetica, vol. I, no. 2, pp. 83-99, 1958.
[11] C. E. Shannon and W. Weaver, "The mathematical theory of communication," Univ. of Illinois Press, 1949, 12th printing, 1971.
[12] R. C. Conant, "Information transfer in complex systems, with applications to regulation," dissertation, Univ. of Illinois, 1968.
[13] A. C. J. de Leeuw, "Systems theory and organisation theory," dissertation. Eindhoven: Techn. Univ. Eindhoven, 1974 (Dutch).
Pattern Recognition Procedures with Nonparametric Density Estimates
WLODZIMIERZ GREBLICKI
Abstract-Modified class conditional density estimates for pattern classification obtained by replacing the sample sizes for particular classes by the overall sample size in expressions for the original estimates are presented, and their consistency is proved. Pattern recognition procedures derived from original and modified Rosenblatt-Parzen, Loftsgaarden-Quesenberry, and orthogonal series estimators are given, and Bayes risk consistency is established.
I. INTRODUCTION
Pattern recognition algorithms with nonparametric density estimates have been studied by several authors. The Rosenblatt-Parzen estimate has been used by Van Ryzin [13], [14], while Devroye and Wagner [3] have employed both the Rosenblatt-Parzen and the Loftsgaarden-Quesenberry estimates. Van Ryzin [14] has also examined procedures with orthogonal series estimates.
In this correspondence we present pattern recognition procedures with the Rosenblatt-Parzen, the Loftsgaarden-Quesenberry, and the orthogonal series class density estimates of their original forms, and we introduce modified class density estimates and derive appropriate procedures. The modified estimates are obtained by replacing the sample sizes for particular classes by the overall sample size in the expressions for the original estimates. We prove Theorems 4 and 7 on consistency of the modified class density estimates, and then, using the general Greblicki [7] and Wolverton-Wagner [16] theorems on Bayes risk consistency, we establish Theorems 5 and 8 on asymptotical optimality of the procedures so obtained. We also show that procedures introduced by Van Ryzin [13], [14] and Devroye and Wagner [3] can be derived from either original or modified class density estimates.
Manuscript received October 11, 1977; revised July 31, 1978.
The author is with the Institute of Engineering Cybernetics, Technical University of Wroclaw, Wroclaw, Poland.
II. PRELIMINARIES
Let $\Omega = \{1, \ldots, M\}$; elements of $\Omega$ will be called classes. Let $(\omega, X)$ be a pair of random variables. $\omega$ takes values in $\Omega$, and $p_i = P\{\omega = i\}$. X takes values in $R^p$, and $f_i$ is the class conditional density, i.e., the conditional density of X given the class i. L(i, j) is the loss we incur in taking action $i \in \Omega$ when the class is j. We assume the 0-1 loss function. For a decision function $\psi$, i.e., for a function mapping $R^p$ into $\Omega$, the expected loss is
$$R(\psi) = \sum_{j=1}^{M} p_j \int L(\psi(x), j) f_j(x)\, d\mu(x),$$
where $\mu$ is the Lebesgue measure on $R^p$. A decision function $\psi_0$ which classifies every x as coming from any class i for which
$$p_i f_i(x) = \max_j p_j f_j(x)$$
is a Bayes decision function. All the class densities as well as the class prior probabilities are assumed to be unknown and will be estimated from the learning sequence $(\omega_1, X_1), \ldots, (\omega_n, X_n)$, i.e., a sequence of n independent observations of the pair $(\omega, X)$. Let $\hat{p}_i = N_i/n$, where $N_i$ is the number of observations from the class i, be an estimate of $p_i$, and let $\hat{f}_i(x)$ be an estimate of $f_i(x)$.
The $\Omega$-valued function $\psi_n$, defined for all $x \in R^p$ and all realizations of the learning sequence, is called an empirical decision function. Throughout this correspondence we are concerned with pattern recognition procedures that are sequences $\{\psi_n\}$ of empirical decision functions classifying every x among any class i for which
$$\hat{p}_i \hat{f}_i(x) = \max_j \hat{p}_j \hat{f}_j(x).$$
We say that the procedure is Bayes risk consistent if
$$\lim_{n \to \infty} E R(\psi_n) = R(\psi_0),$$
where $\psi_0$ is any Bayes decision function. It is clear that the asymptotical optimality, i.e., Bayes risk consistency, depends on properties of class density estimates. The next two theorems on the asymptotical optimality are due to Greblicki [7] and Wolverton and Wagner [16], respectively.
Theorem 1: If
$$\hat{f}_i(x) \xrightarrow{P} f_i(x) \quad \text{as } n \to \infty,$$
at almost all ($\mu$) $x \in R^p$, for $i = 1, \ldots, M$, then the procedure is Bayes risk consistent.
It should be mentioned that under some additional assumption,
0018-9472/78/1100-0809$00.75 © 1978 IEEE