Tilburg University
Kriging metamodeling for simulation
van Beers, W.C.M.
Publication date:
2005
Document Version
Publisher's PDF, also known as Version of record
Link to publication in Tilburg University Research Portal
Citation for published version (APA):
van Beers, W. C. M. (2005). Kriging metamodeling for simulation. Tilburg University Press.
General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain
• You may freely distribute the URL identifying the publication in the public portal Take down policy
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.
53976
..
/3.-/-1
UNIVERSITEIT * 2]F *
.16----4
;AN TILBURG SIBLIOTHEEK1 TILBURG
STELLINGEN
behorende bij het
proefschrift
Kriging
Metamodeling
for
Simulation
door Wim C.M.
Van
Beers
1. In'discrete-event' simulatie geeft
Kriging
beterevoorspellingen dan polynomiale regressiemodellen.(dit proefschrift)
2.
Kriging
interpolatieisrobuusttenaanzien vandeklassieke modelveronderstelling datde respons eenconstantevariantieheeft.(dit proefschrift)
3.
Voor
tijdrovende simulatieszijn
modelafhankelijke
sequentiele proefopzettenteprefereren boven klassieke proefopzetten.(dit proefschrift)
4. Hoewelhetsimuleren vandegemiddelde
wachttijden voor
een gegevenbezettingsgraad van een M/M/1 model door middel vaneenniet-lineaire
stochastische
differentievergelijking uit principiele
overwegingen de voorkeurheeft, is hetvoor toetsdoeleindenefficient
eneffectief om de analytisch verwachte wachttijdenteverhogenmetNHD-trekkingen.KRIGING METAMODELING
FORSIMULATION
Proefschrift
ter verkrijging vandegraad vandoctor
aandeUniversiteitvanTilburg,
op gezag vanderector magnificus, prof. dr. F.A. van der DuynSchouten,
inhet openbaarteverdedigen ten overstaan van
een door hetcollege voor promotiesaangewezencommissie in de aula vandeUniversiteit
BIBLIOTHEEK TILBURG
Acknowledgment
This thesis istheresult of myresearchduringsix years in the Operations Research group oftheDepartment
of
InformationSystems andManagement intheFaculty of Economics and BusinessAdministrationatTilburg
University. During that time I met many peoplewhosupported me in various ways. They made mystayproductiveand-not in
theleast-pleasant. To some of them Iowe special thanks.First of all, I
amextraordinarily grateful to mysupervisor, Professor Jack Kleijnen.Heguidedmethroughthefascinating worldof
randomsimulation and taught me to search for theintuition
('wiedes') behindthe results. We had manyvaluable discussions, andheencouraged me to develop the ideas that we discussed in such a way thatthey couldbepublishedin international journals. Iamgrateful for the manycriticalcommentshe
made-in
red, green, and blue. Theexistence ofthis thesis is largely due to him.Alsospecial thanks are due to themembers of myPh.D.committee: Professors H.Daniels, D.den Hertog, T. Dhaene,J.Rooda, and H. Wynn
for
theircarefulreading,thoughtfulsuggestions,andcorrections. Ifeel honored to have them in my committee.
Among alltheotherswhosupportedmeduring myPh.D.study, I particularly wishtothank ourdaughter,Miranda. Her typicalmental support was a real
encouragement to me.
Last butcertainlynot least, I am mostgrateful to my wife,Marianne. She patiently gave me all the timeand space I needed to ventureintothosethings that I foundfascinating.There are no words to describe theemotionalandloving way she supportedme.Undoubtedly,thebiggest thanks are for her!
1 Introduction
toKriging
Metamodelingfor
Simulation 11.1 Real systems and mathematical
models 1
1.2 Simulationtypes 2
1.3 Metamodels 3
1.4
Kriging
4 1.5Kriging
applications area 5
1.6 Summary
of
thesis 6References 7
2 Kriging for
Interpolation
in
RandomSimulation 11
2.1 Introduction 11
2.2 Kriging 14
2.2.1 History
of
Kriging 142.2.2 Basics
of
Kriging 142.2.3 Formalmodel
for Kriging 15
2.3 Detrended
Kriging 19
2.4 Twoexamples and
five
metamodels 202.4.1 Example I: M/M/1
hyperbola 23
2.4.2 Example
I[:
fourth-degreepolynomial 24 2.5 Nuggeteffect 25
2.6 Conclusionsandfuture
research 26
References 27
3 Robustness
of Kriging
whenInterpolating in
RandomSimulation with
HeterogeneousVariances 29
3.1 Introduction 29
3.2
Kriging
Basics 31
3.3 Randomsimulationwithheterogeneous variances ...... 32
3.4 Novel
Kriging
method:Studentization 35
3.5 Numerical
examples 36
3.6 Conclusionsandfurther research 43
References 44
4
Application-driven
SequentialDesignsfor
Simulation Experiments:
Kriging
Metamodeling 47
4.1 Introduction 47
4.2
Kriging basics . 50
4.3 Doeand
Kriging
52 4.4 Application-drivensequentialdesign 53
4.4.1
Pilot
inputcombinations 53
4.4.2 Candidateinput combinations 55
4.4.3 Cross-validation 55
4.4.4 Jackknifing 56
4.4.5 Sequentialization 57
4.4.6 Stopping rule 58
4.5 Two examples 59
4.5.1 Example I: ahyperbolicVOfunction 59 4.5.2 Example H:afourth-order polynomial
I/0
function ... 614.5.3 Estimated variograms: Gaussian versuslinear ... 63
4.6 Conclusions andfurtherresearch 63
References 64
5 Customized Sequential Designs
for
RandomSimulation Experiments:
Kriging
Metamodeling
andBootstrapping 67
5.1 Introduction 68
5.2
Kriging
Basics 70 5.3 DOEandKriging 73
5.4 Sequential DOE 75
5.5 Two examples 79
5.5.1 M/M/1 model 79
5.5.2 (s,S)inventory
model 89
5.6 Conclusionsandfuture
research 92
5.7 Appendix
92References 94
6.2 Future
research 99
Samenvatting (Summary
in Dutch)
103Chapter 1
Introduction
to
Kriging
Metamodeling for
Simulation
1.1
Real
systems
and
mathematical
models
Many scientificdisciplinesusemathematical models(including simulation;seebelow) to
describecomplicated real systems. The goal ofsuchmodelsisgetting more insight into the real system,to answer questions such as: What is theoutput's sensitivity tothe inputs; what is the optimal combination oftheinput values?
To obtainsuchinsight, analytical methods (e.g.,differentialcalculus) often turn out to fail. In suchcases,numerical methods maybetried; i.e., experimentation with the model may answerthequestions about the realsystem.This experimentation often implies thatthemodel is converted intoacomputer code (or computer program) that is run foranumber
of
differentinputcombinations. Next, the resultinginput/output
(I/0)
behavior ofthe model shouldbeanalyzed.We emphasize that theselection oftheinputcombinations shouldbeguided byscientific principles (e.g., changingoneinput at a time canbeproven tobeineffective
if
inputs interact, and inefficientelse). Theseprinciplesareinvestigated in mathematical statistics under the nameDesign OfExperiments (DOE). In numerical experiments (as opposed to experiments with the
realsystem)theseprinciplesneedadjustment;seeKleijnen et al. (2005)andChapters 4 and 5. We focus on computer expensive simulation experiments; i.e., experiments that require much computer time. Inthesesituations, DOE iscertainlyneeded.
analysis requiresformalmethods. Most popularareLeastSquares(LS)curvesfitted to the I/0
data.We, however, focus onanalternative method knownasKriging;seeSection 1.4. Next,weshall discuss the key terms inthetitle ofthis chapter andthisthesis.
1.2
Simulation
types
A
simulationmodel might beaphysical model, e.g., ascalemodel ofaracing car in a windtunnel. We, however,
limit
ourselvestomathematical models (also seethepreceding section).Following Kleijnen (1974), we define simulationasexperimenting with a model over time.This
definitionindicates thatthevariabletimeplaysaspecial roleinsimulation. Indeed, Law and
Kelton (2000) also emphasize the role of time, by using the term dynamic simulation. Besides this simulation type, there is static simulation, in which time plays no role; an example is the Monte Carlo (MC)method.By definition, the MCmethodusesPseudo-Random Numbers (PRN);i.e., computergeneratednumbers between zero and one thatareindependent and
uniformly
distributed over this interval. We shall also useMCmodels in this dissertation (see Chapters 2,3, and 4).Therearedeterministicandrandom simulationmodels. Ifthesimulationmodel contains no random components, the simulation modeliscalled deterministic. For example, such
simulationmight solve a set
of
complicateddifferentialequations describingtheairflowaround airwings. Simpson et al. (2001) mention applicationsof
deterministic simulation invarious disciplines.Typically, in this typeof
simulation,aninputcombinationneedssimulation only once(repeatedcomputer runs with thesameinputcombination give exactly thesameoutput value, provided we do not make any changes inthecomputer software or hardware).Introduction to Kriging Metamodeling for Simulation 3
simulation type. Note that we donotstudy dynamic,deterministicsimulation models (often used inComputerAidedEngineering, CAE; seeforexample De Geest et al. (1999)).
Table1: Simulation typeswithappropriate dissertationchapters
Deterministic Random
Static Chapter 4 Chapters 2 and 3
Dynamic Chapter 5
Finally,wediscussathird waytodistinguish various simulationtypes:continuousversus
discrete-event simulation. Continuous simulation experiments with models consisting of differentialequations(so statevariableschangecontinuously). Computercodesapproximate thesedifferentialequations by difference equations.Thesemodelsareusually deterministic (see
the preceding discussion, summarized in Table 1).Obviously,random elements maybeadded to thedifferentialequations;forexample, econometric models may consist
of
difference equations plusadditivenoise.Discrete-event simulationhas statevariables thatchangeinstantaneously,atpoints in time that are not necessarily equidistant;forexample, customers arrive ataserveratrandompoints of time.Anevent isthe change inthesystem'sstate;forexample, the number
of
waitingcustomersincreases.Randomsimulationincludes discrete-event event simulation; see VanBeers and
Kleijnen (2005).
Simulation-in its
manyforms-is
applied in manyscientificdisciplines. We focus on thedisciplineknownasOperations Research/Management Science (OR/MS). In OR/MS,simulation isalsooftenapplied-,eventhoughsimulationisdescribedasmethod of last resort 'when allelse
fails...' in
thefamousChapter 21 inWagner (1975).1.3 Metamodels
A metamodel is also calledaresponsesu,face,auxiliarymodel, emulator,etc.Metamodeling
originatesfrom neuro-linguistics inthebeginning ofthetwentieth century;seeKorzybski and Meyers 1958. Kleijnen et al. (2005) defineametamodel asanapproximation of the true I/O
simpler thantheunderlyingsimulation model;anexample isafirst-order polynomialregression (meta)model of the
I/0
function ofaqueueing simulation. A metamodeltreats thesimulation model asablack box; i.e., it uses the VO datawithoutknowledge of the way thesimulation modelprocesses theseinputs to gettheoutputs.A metamodel is a tool forthesystematicexperimentationandanalysis ofasimulation model, togain insight (againseeSection 1.1).Kleijnen(1998)discusses the use
of
metamodeling for(i)
sensitivity analysis(ii) optimization
(iii) validation
andverification. We shall focusonsensitivityanalysis.There are many types
of
metamodels;seeKleijnen (2005). Most populararelow-order polynomialregression models. However,Kriging isalsoapplied frequently in deterministicsimulation; see Den HertogandStehouwer (2002),andSimpson et al. (1998).Becauseother types
of
metamodels do not play a role in this thesis and we wish to avoid errorsof
omission, we refrain from listing(anddefining)thesetypesandgivingkey references. In thisthesis,wefocuson Kriging,andcompare its performancewithclassical regression.
Atkinson (1989, ch.3) definesinterpolation as theselection of a function in such a way thatits graphpassesthroughafinite set
of
given data points.Krigingappliedtodeterministic simulationmeetsthat definition, as we shall see in the next section. Inpractice, the term interpolation is also used to meanafunctionthat approximates the VO data: forexample, a regression model may fit the I/0 data such thatit
minimizes the sumof
squaredapproximation errors (LS criterion).1.4 Kriging
In the195Os,theSouthAfrican miningengineer D.G. Krige (born in 1919, andstillalive)
devisedaninterpolation methodtodetermine true ore-bodies,based on samples.The basic idea is thatthesepredictionsareweighted averages of the observed outputs, where the weightsdepend
Introduction to Kriging Metamodeling for Simulation 5
observed.The weightsare chosen so astominimizetheprediction variance, i.e.,theweights should provide a Best Linear UnbiasedEstimator (BLUE) oftheoutput value foragiven input.
Therefore,Kriging isalsocalled Optimal Interpolation.
The dependence oftheinterpolationweights on the distances betweentheinputs was mathematically formalized bythe French mathematician Georges Matheron (1930-2000) in his monumental 'Traitddegdostatistique appliqude' (1962). He introducedafunction, which he calledavariogram, to describe the variance of the difference between two observations; also see
Chapter 2.
SoKrigingoriginated in geostatistics to answer concrete questions in the goldmining industry:
Drilling
forore-deep
undertheground-is
expensive,soefficientprediction methodsare necessary.Lateron, Krigingwassuccessfully introducedinto deterministicsimulation by Sacks et al. (1989). In thisthesiswe introduceKriginginterpolation into the area
of
random simulation.Krigingprovides 'exact' interpolation,i.e.,predicted outputvaluesat inputs already
observed equaltheobserved outputvalues. Suchinterpolationisattractiveindeterministic simulation,andKrigingis often applied inCAD(mentionedinSection 1.2).In discrete-event
Simulation, however, Kriging hasjuststarted.
Above, we mentioned thatthevariogram isthecornerstonein Kriging. Hence,accurate estimation ofthevariogram,based on theobserved data,isessential. JournelandHuijbregts (1978,pp. 161-195) present various parametric variogram models. The values of itsparameters
are obtainedbyeither Weighted Least Squares (WLS)or Maximum LikelihoodEstimation (MLE);seeCressie (1993). In thisthesis,we shall useboth methods: WLS in Chapters 2,3, and
4, and MLE inChapter 5.
1.5
Kriging
applications area
Aftertheintroduction
of
Kriging intheminingindustry,it
became very popularin other areas of geostatistics, suchasmeteorology, oceanography, agricultureandenvironmentstudies; seeAfterthepioneeringarticle
of
Sacks(1989), Kriging has alsobeenwidelyapplied in deterministicsimulationforengineering,aimed at thedesignof
better computer chips, televisionscreens,aircraft,andautomobiles.
In this thesis we introduceKriging for simulation inOR/MS, covering both deterministic and random simulations(in differentchapters). We
limit
ourselves tosensitivityanalysis; we feel that sensitivityanalysisprecedesoptimization, sowe leaveKriging for optimizationasfutureresearch(alsoseeChapter 6).
1.6 Summary
of
thesis
Thisthesiscontains the full text of fourpapersthat eitherhavealreadybeenpublished or have
been submittedfor publication. It istheverbatim text, exceptforthreesentencesinChapter 2, indicated by footnotes. In this section we summarizeeach paper.
In Chapter 2, we introduceKriging interpolation forrandom simulation. We develop a novel type
of
Kriginginterpolation, and callit
DetrendedKriging.We demonstrateKriging through two numerical examples:(i) ahyperbole inspired by the single-server queueingsimulation withPoisson (Markov)arrivalandserviceprocesses(known as theM/M/1queueing model)
(ii) an
artificialmodel, namelyafourthdegree polynomial withmultiplemodes plusadditivenoise.
We test our novel method through cross-validation,andcompare it tolow-orderregression metamodelsfittedthrough Ordinary Least Squares (OLS). The mainconclusion ofthis chapter is thatKriginggives better predictions thantheseregression models. Furthermore, tests show that the Kriging'sso-called nugget effect equals the variance of the additive noise.
In Chapter 3, we droptheclassicKrigingassumption
of
outputswithconsmnt variances. In practice, this assumption is not realistic. Therefore we investigatethe consequencesof
KrigingIntroduction to Kriging Metamodeling for Simulation 7
heterogeneity;i.e,Kriging isarobustmethod thatstilloutperformslow-order polynomial regression.
In Chapter 4, we propose a novel method toselectexperimental designs
for
Kriging. Ourmethodissequential,becausewe focusonexpensive deterministic simulation.Ourdesigndiffers from traditionaldesigns suchasLatinHypercubeSampling (LHS);seeMcKay,Beckman, and
Conover (1979), andalsoKoehler and Owen (1996),and Kleijnen et al. (2005). More
specifically,our method accounts for the specific
I/0
functionimplied bytheunderlying simulationmodel.Our customized, tailor-made designsareconstructed throughcross-validation and jackknifing. We test our method through the two academic applications that we also used in Chapter 2. Our main conclusions are thatthenovel method simulatesrelativelymore inputs in the more interesting parts oftheunderlyingI/0
function, andit
gives better predictions thantraditionalLHSdesigns.
In Chapter 5, we extendthemethod
of
Chapter 4 to random simulation, especially discrete-eventsimulation. However, customization is nowbasedonbootstrapping instead ofcross-validation. We testour method through two discrete-event simulation models that are classic inOR/MS,namely theM/M/1 queueing model and an (s,S)inventorymanagement model. We again comparetheperformance of our methodwithclassicalLHSdesigns. Itturns out thatour design indeed gives better results.
Finally, inChapter 6 we summarize the conclusions ofthepreceding chapters. We also discussthe advantages andthedisadvantages
of
Kriging.Wefinish withpossible topicsforfuture research.References
Atkinson, K.E. (1989), An introduction to numericalanalysis. John Wiley & Sons, Inc., New York
De Geest, J., T. Dhaene, N. Fache, and D. DeZutter(1999),Adaptive CAD-model building algorithm forgeneral planar microwave structures./EEETransactionsonMicrowave Theory and Techniques, 47, no. 9,pp. 1801-1809
Den Hertog, D., and H.P. Stehouwer (2002),Optimizing colorpicturetubesby high-cost non-linear programming. EuropeanJournal
of
OperationalResearch, 140(2),197-211Journel, A.G. and Huijbregts, C.J. (1978),MiningGeostatistics. Academic Press,
London
Kleijnen, J.P.C. (1974),Statisticaltechniquesinsimulation, partL MarcelDekker Inc., New York
Kleijnen, J.P.C. (1998), Experimental designfor sensitivityanalysis,optimization, and validation
of
simulation models. Chapter 6 in: Handbookofsintulation, edited byJ.Banks, Wiley, New York,pp.173-223Kleijnen, J.P.C. (2005), Invited review: Anoverview ofthedesignandanalysis of
simulation experiments for sensitivity analysis.European Journal of Operational Research tin press)
Kleijnen, J.P.C., S.M.Sanchez,T.W.Lucas andT.M.Cioppa (2005),Auser's guide tothebravenewworld o
f
designingsimulationexperiments. INFORMS Journal onComputing (accepted as State-of-the-Art Review)Koehler, J.R. and A.B. Owen (1996), Computer experiments. Handbook of statistics,
by S. Ghosh and C.R. Rao, vol. 13,pp.261-308
Korzybski, A. andR.Meyers (1958),Scienceand sanity:anintroduction to non-aristotelian systems and general semantics; fourth edition. International Non-Aristotelian LibraryPublishing Company,Lakeville
Law, A.M. and W.D. Kelton(2000), Simulation modeling and analysis;
third
edition. McGraw-Hill,BostonMatheron, G. (1962), Traitd de gdostatistique appliqude. Memoires du Bureau de Recherches Geologiques et Minieres, Editions Technip, Paris, no.14: pp.
Introduction to Kriging Metamodeling for Simulation 9
McKay, M.D.,R.J.Beckman and W.J. Conover (1979),
A
comparisonof
three methodsforselecting valuesof
input variables intheanalysisof
output from a computercode. Technometrics, 21, no. 2,pp.239-245(reprinted in 2000: Technometrics, 42, no. 1, pp. 55-61Sacks, J.,W.J. Welch, T.J. Mitchell and H.P. Wynn(1989), Designandanalysis of
computer experiments.Smtistica/Science, 4, no. 4, pp. 409-435
Simpson, T.W.,J.J.Korte,T.M.Mauery, andF.Mistree (1998),Comparison of
responsesurfaceand Krigingmodelsfor multidisciplinarydesign optimization. AIAA Journal, vol. 1,pp.381-391
Simpson, T.W.,T.M. Mauery,J.J. Korte, andF.Mistree (2001), Kriging
metamodels
for
global approximation in simulation-basedmultidisciplinarydesignoptimization. AIAAJournal, 39, no. 12, 2001,pp.2233-2241
Van Beers, W.C.M. andJ.P.C. Kleijnen(2005), Customized sequential designs for
randomsimulationexperiments:Krigingmetamodelingandbootstrapping. Submitted for publication
Chapter 2
Kriging for Interpolation
in
Random
Simulation
Abstract
Wheneversimulationrequires much computer time,interpolationis needed.Simulationists use different interpolationtechniques (forexample, linear regression), but thispaperfocuses on
Kriging. Thistechniquewasoriginallydevelopedingeostatistics by D.G.Krige, and has recentlybeenwidelyappliedindeterministic simulation. Thispaper,however, focuses on randomorstochasticsimulation. Essentially,Kriginggives moreweightto'neighbouring' observations. Thereareseveral types
of
Kriging;this paperdiscusses-besidesOrdinaryKriging -a noveltype,which 'detrends' data through the useof linearregression. Resultsarepresented fortwo examplesof
input/outputbehaviour of theunderlyingrandom simulation model: Ordinary and DetrendedKriginggive quite acceptable predictions;traditionallinear regression gives theworstresults.2.1 Introduction
A primary goal of simulationiswhat iforsensitivity analysis: What happens if inputs of the simulation model change? Therefore simulationists runagivensimulation
program-or
computercode-for (say)
ndifferentcombinations of theksimulationinputs. Weassume thatPaper byVan Beers, W.C.M. andJ.P.C.Kleijnen, Journal OftheOperationalResearchSociety, no. 54,2003, pp
theseinputsareeither parameters orquantitativeinputvariables ofthesimulationmodel. Typically, Krigingassumes thatthe number
of
values per input variableisquite'big',
certainly exceeding two(twovalues are usedinsimulationexperimentsbased on 2k-Pdesigns).Given this set ofninputcombinations,theanalysts run the simulation and observe the outputs. Note that most simulation models havemultipleoutputs, butinpracticetheseoutputs are analysedperoutput type.
The crucial question of this paper is: How to analyse thissimulation input/output (I/0)
data?Classic analysisuseslinear-regression (meta)models;seeKleijnen (1998). Ametamodel is an approximation of the
I/0
transformationimplied bytheunderlying simulationprogram. Many other termsarepopular in certain disciplines:Response surface,compact model, emulator, etc. Suchametamodeltreats thesimulation model asablack box; that is, thesimulationmodel's VO is observed, andtheparameters of the metamodelareestimated.Thisblack-boxapproach has thefollowing
advantages anddisadvantages.An advantage is thatthemetamodel canbeapplied to all types
of
simulationmodels, either deterministic or random, either in steady-state orintransientstate.Adisadvanmge is that it cannot take advantage ofthespecificstructure ofagiven simulation model, so it may take more computer time comparedwithtechniques suchasperturbation analysis andscorefunction.Metamodelling can also helpin optimizationandvalidation ofthesimulationmodel. In this paper, however, we do not discussthesetwo topics, but refer tothereferences ofthispaper.
Further, ifthesimulation modelhashundreds
of
inputs,thenspecial 'screening' designs areneeded, discussedin Campolongo,Kleijnen,and Andres (2000). In ourexamples-but not in our methodologicaldiscussion-we limitthenumberofinputs totheminimum,namelyasingle
input.
Whereaspolynomial-regression metamodels havebeenapplied extensivelyin discrete-event simulation (suchasqueueingsimulation), Kriginghashardlybeenappliedto random simulation: Asearch ofIAOR(International Abstracts
of
Operations Research) gave only two hits. However, in deterministicsimulation(applied in many engineering disciplines; see ourKriging for Interpolation in Random Simulation 13
In randomsimulation,however, thisKrigingproperty may not besodesirable, sincetheobserved (average)value is onlyanestimate of the true, expected simulation output.Unfortunately,
Krigingrequires extensive computation,so adequatesoftwareis needed.We discovered that for random simulation no software is available, so we developed our own software, in Mattab.
Note that several types
of
random simulation maybedistinguished:(i) Deterministic simulation withrandomly sampled inputs. For example, in investment analysis wecancompute the cash
flow
development over time throughaspreadsheet suchasExcel. Next, we sampletherandom valuesof
inputs-such as the cashflow growthrate-by
meansof
either MonteCarlo orLatinHypercube Sampling(LHS)throughanadd-on such as @Riskor Crystal Ball; see Van GroenendaalandKleijnen (1997)(ii)
Discrete-event simulation. For example, classic queueingsimulationisapplied in logisticsand telecommunications.
(iii)
Combined continuous/discrete-event simulation. For example,simulationof
nuclearwastedisposalrepresentsthe physicalandchemicalprocessesthroughdeterministicnon-linear difference equationsandmodelsthehuman interventionsasdiscreteevents (seeKleijnen and Helton, 1999).
Ourresearchcontributionconsists inthedevelopment ofanovel (namely, detrended)
Kriging type, andtheexploration of how wellthis Krigingtype performs compared with OrdinaryKrigingandtraditionalpolynomial-regression modeling. The main conclusion of our examples is:A perfectlyspecified detrendingfunctiongivesbestpredictions;Ordinary Kriging is
acceptable;theusuallinear regression givestheworstresults.
We organize the remainder of thispaperasfollows.First we sketch the history
of
Krigingandits application in geology, metereology,anddeterministic simulation. Then we describe the
2.2 Kriging
2.2.1 History
of Kriging
Kriging isaninterpolationtechniqueoriginallydeveloped by D. G. Krige,aSouthAfrican miningengineer. Inthe 1950s hedevised this method to determine true ore-grades,based on
samples. Next,heimproved the methodincooperation with G. Matheron,aFrench
mathematician atthe'EcoledesMines'. At the same time,inmeteorology L. Gandin (in the formerSovietUnion)workedonsimilarideas,under thename'optimuminterpolation' (see Cressie, 1993).
Nowadays,Kriging isalsoapplied to 1/0 data
of
deterministic simulationmodels; wereferagaintoSacks etal. (1989)'s pioneering article. Manymorepublications followed; for
example, Meckesheim et al. (2001) give35references. AlsoseeKoehler and Owen (1996), and
Jones,Schonlau,andWelch (1998).
2.2.2 Basics
of Kriging
Kriging isanapproximation method that can give predictions
of
unknown values ofarandom function, random field, or random process. These predictionsarebest linear unbiased estimators, under theKrigingassumptionspresented in the next subsection.Actually, these predictionsareweighted linear combinations of the observed values.
'Kdging assumes that the closer the input data are, the more positively correlated the prediction
errors are.Mathematically, this assumption is modeled throughasecond-orderstationary
Kriging for Interpolation in Random Simulation 15
thepredictionequals theobservedvalue. (In deterministicsimulationthis propertyiscertainly attractive, as wesaidabove.)
In Kriging, a crucial role is played bythevariogram:
A
diagram of the variance of thedifference between the measurements at two input locations; alsoseeFigure1.which has symbols explained in the next subsection. The assumption ofasecond-order stationary covarianceprocessimplies thatthevariogram isafunction ofthedistance (say)hbetween two locations. Moreover, the further apart two inputs are,thesmaller this dependence
is-until the
effectisnegligible.
2.2.3
Formal
model
for Kriging
A randomprocessZ(•) can be described by {Z(s) :s E D} where D i s a fixed subset of Rd and
Z(s) isarandomfunctionatlocation s D; see Cressie (1993, p. 52).
There are several types of Kriging, but we limit this subsection toOrdinary Kriging, whichmakes the
following
twoassumptions (already mentioned above, but not yet formalized):(i) The model assumptionisthat the random process consists of a constant 11 and an error term O (s):
Z(s)=B+O (s) with seD,BER (1)
(ii)
Thepredictorassumption is thatthepredictor forthepoint so -denoted byp(Z(so))-is
aweighted linearfunction of alltheobserved output data:p(Z(So))=Elli,Z(st) with TIi,1,=1 (2)
To select the weights A, in (2), thecriterion is minimal mean-squared prediction error (say) a defined as
03 =E[{Z(so)-p(Z(so))}21 (3)
To minimize(3)given (2),theconstraint X 1,1, =l isadded to theobjective functionthrough
theLagrangianmultiplier m: Then wecanwritetheprediction error as
1
E[{Z(so)-63:.,A,z(s,)}2]-2m[Ili,1 -11. (4)
To minimize (4), weutilizethevariogram; alsoseeFigure 1. Bydefinition,thevariogram is
2, (h) = var[Z(s + h) -
Z(s)],
where h =s, -si asexplained by the stationary covarianceprocessassumption with h e Rd and i,j - 1,..., n. Obviously, we have
var[Z(s + h) - Z(s)] = 23(st - s,) = 22(h).The spacing h is also called the lag. Aftersometedious manipulations, (4) gives
- X l,T "=,A;.1, »,-st)+261 1,·1, Aso -s,)-2mC X liA,-1). (5)
Differentiating (5) withrespect to Ai , · · · , A„ and m.givestheoptimal ,11, . . . , A„ :
f 1-1'r-'TY_
11= 7+1
11-1 and m=-(1-1'r-ly)/(1'r-'1), (6)
ir-1 1 3
where y denotes thevector
of
(co)variances (y(so-st)....,y(so-s„))',
r
denotes the nx n matrixwhose (i,j)thelement is )(s,-s,), 1 -(1, ..., 1)'is
thevectorofones; alsoseeCressie (1993, p. 122).We emphasize thattheseoptimal Kriging weights 1 depend onthe specificpoint so that
is tobepredicted,whereaslinear-regression metamodelsusefixedestimatedparameters (say) B
Note further that some of the weights 1 may be negativez.
The optimal weights (6) givetheminimalmean-squaredpredictionerror: (3)becomes
(alsoseeCressie (1993, p. 122)
Cr; = Il,ly(so-st)+m
(1'r-'7-1)2 (7)
= y'r-'y
1, r -1 1
However, in (6) and (7) 2 (h) isunknown. Theusualestimator \s
Kriging for Interpolation in Random Simulation 11
29(17) =
-LY
(Z(st)-Z(s,))2 (8)
N(hM - Nth) \where|N(h)| denotes thenumber
ofdistinct
pairs in N(11)={(s,.s,):st-sj=h; i, j =1,....n};
see Matheron (1962). The estimator in (8)isunbiased, iftheprocess Z(•) is indeed second-order
stationary; seeCressie (1993, p. 71).
Given (8)for differentIlhll values,thevariogramisestimatedby fittingacurve through
the estimated values 2f (11). This curve displays the following importantcharacteristics(see
Figurel):
(i) Forlarge values
of
1|h||,thevariogram 2 f(h) approaches aconstant C(0),called thesit!'. 'Forthese large Ilhll values, all variances of the differences Z(s + h) - Z(s) are invariant with
respect to h.
To prove this property, we definethecovariogram C(h)=Cov(Z(s), Z(s + h)). Obviously, COV(Z(S),Z(S)) =Var(Z(s)).Then it is easy to derive
22 (h) = 2(((0) - C(h)). (9)
Because C(h) 1 0 as 11 h Ill' -, the variogram hastheupper
limit 2C(0).
(ii) The interval of Ilhll on which the curve does increase (to the sill), is called therange
(say) r;that is, C(11) < E for 11 h 11> r + r,. We shall giveaspecificmodel in (10).
(iii) Although(9)impliesgamma 3 (0) =0,thefittedcurve does not alwayspassthrough zero: It may have a positive intercept-calledthe nuggetvariance. This variance estimates noise.
For example, in geostatistics this nugget effects means that when going back to the 'same' spot, a completely differentoutput (namely, a gold nugget)isobserved.
We add that in random simulation, the same input (say, the same traffic rate in queueing
simulation)givesdifferentoutputsbecausedifferentpseudo-random numbers areused. Below we shallreturn to thisissue.
To fit
avariogram curve through theestimatesresulting from(8), analysts usually apply the exponential model[co+ct(1-e-Alita) if h#0
Y(h) =4 (10)
[0 if h=0
where obviously co isthenugget, co + ct the sill, and athe range.However, other models are also fitted; for example,thelinear model
Y(h)=
4[co+bllhll if h#0
(11)[0 if h=0
where again co isthenugget;seeCressie (1993, p. 61).Actually,weshall apply (11) in our experiments, because it is the simplest model and yet gives acceptable results(forexample, it estimatesthenuggeteffect very well).
In deterministicsimulation.analysts use more general distance formulas than (8). For example, Sacks et al. (1989, p. 413)andJones et al. (1998, p. 5) use for the k-dimensional inputs
xi = (X,(„, ..., X'. k,)' and xj = (xjct,•..., x,(k,)' theweighted distanceformula
h(xi,xi)
=E:.1 0: IX, g,-XM,
i
(12) where 19, (with d:2 0)
measuresthe importance oftheinput xg, and pg controls theKriging for Interpolation in Random Simulation 19
2.3
Detrended
Kriging
OrdinaryKrigingwasdefined by (1), where v E R wastheconstant mean of the randomprocess Z(•).This assumption, however,limitstheapplication
of
OrdinaryKrigingtorather simple models ofthe process Z(•). A moregeneralassumption is that v is notaconstant, but an unknown linear combinationof
known functions{fo(s), -:f„(s)}, se D.
Thisis
called Universal Kriging:seeHuijbregts and Matheron (1971, p. 160) and also Cressie (1993, p. 151).Cressie (1993)discusses real(non-simulated) coal-ash data,andRegniereandSharov (1999)
discusssimulated spatialandtemporal output data ofarandomsimulationmodelforecological
processes.
Now we introduceanovel type
of
Kriging that wecall DetrendedKriging.Detrended Krigingpre-processestheoriginal data, and then appliesOrdinaryKriging totheresulting data so wecanapply softwarefor Ordinary Kriging.For UniversalKriging,however, software is available onlyfor
spatialandtemporal data, notfor simulation withanarbitrarynumber of inputs-to the best ofour knowledge3.We assume that theprocess mean
*(s)
satisfiesthedecomposition#(S)= S (S)+71(S) (13)
where S(s) is a known signal function (see, however, the text below (14)) and 11(s) isawhite noiseprocessthatmodels the measurement error; that is, 11(s) is normally identically and independentlydistributed with zero mean (NIID). So,we replace (1) by
ZCS) = SCS) + 11(S) + a(S). (14)
In practice,thesignalfunction S(s) in (14)isunknown. Therefore we estimate S(s)
through S(s), from the set
of
observed(noisy) I/O data {(st,Z(s, )). i =l, ..., n}. Because of theassumedwhite noise, weuseordinaryleastsquares (OLS) to obtaintheestimator S(s).
Next we applyOrdinary Kriging tothedetrended set <(st,Z(s, ) - S(st ) ) :i= 1, . . . . n . Our
predictor forthe output
of
location so is the sum of thisOrdinaryKrigingprediction and the3 Afterthis paperwasaccepted,theMatlabKrigingtoolboxofLophaven, Nielsen,andSpndergaardbecame
estimator S(so)
To test our new Detrended Kriging, we apply classiccross-validation:seeKleijnen and
van Groenendaal (1992, p. 156).Cross-validation eliminates one
I/0
combination,say (sk I Z (s k )), from to the
original data set {(st,Z(s,)): i =l,..., n},so
the remaining n -1 datacombinations are {(st,Z(s,)): i -1,..., k -1, k +1...., n}.This new
setgivesapredictionp( Z(sk )) ·Thisprocess
of
eliminationandpredictionisrepeated for (say) c differentcombinations (c C n) .Obviously, if we sorttheoriginal set such that thefirst c observations are deleted one at a time, then we get k =1,2, . . . ,c.
To summarizetheresultingpredictionaccuracy, we use the I, norm of the difference
I 2 \B
vector li p(Z(sk))-Z(sk) 11 (the 4 norm 11x11 is defined as Ek-ixk 11 ). In our experiments
we find that the l, and
L-
norms givesimilarconclusions.Note that inKriging,allpredictionerrors may be zero at the 1/0 points thatareactually
usedto estimatetheKrigingmodel. Therefore weusecross-validation.
2.4
Two
examples and
five
metamodels
Weareinterested intheapplication
of
Krigingtodiscrete-eventsimulationmodels, such as simulated queueing systems. As LawandKelton (2000»-thebestsellingtextbook onsimulation-states (on page 12),asingleserverqueueing system is quite representative of more
complex, dynamic, stochastic simulation models. Forfurther simplification,we suppose that the
output ofinterest is the meanwaiting time in thesteady state, E(W).Thisoutput can be
Kriging for Interpolation in Random Simulation 21
S(i) and A(i)areidenticallyandindependentlydistributed (IID),sosimulationisstraightforward.
The output E(W)isusually estimated through the simulationrun'saverage
iii _r, W(i)
6
withOEb<n
(16)Abn-b+1
where bdenotesthelength oftheinitialization(start-up, transient)phase(which may be zero), and n therun length. (InM/M/1 analysisandsimulationthrough renewal analysis, this initialization isnoissue;in practical simulations, however, it isamajor problem; see Law and
Kelton (2000,pp.496-552).) In other words,thedynamic simulation model generates the time series (15), but thisseries ischaracterized through the single statistic (16).
Actually, simulation is donefor sensitivityanalysis (possiblyfollowedbyoptimization). Such an analysis aims at estimatingtheinput/output (I/0)function(say)
E(Z(s)) =S(s) (17)
where-following
(14)-Z
denotes the(multiple)output and s the(multiple) input. In the M/M/1example we have Z=W and s= 1/#
with arrival rate A andservice rate v.In general, S(s) in (17)hasunknownshape andparameters. However, when studying the
performance ofaspecific simulation methodology,researchersoften use theM/M/1 simulation modelbecause the
I/0
function S(s) isthenknown-assuming thatthemethodology hasselected anappropriateinitializationlength b in (16)(obviously, knowledge of S(s) may not be used by the methodologyitself):
E(W) = 1 with
1<1.
(18)BCM- A) B
Unfortunately,thelatter assumption is very questionable: it iswellknown that selecting an appropriate transient-phase length b and runlength n in (16)is
difficult.
Moreover, Krigingassumesthat-in general-thesimulation observations Z have additive white noise:see(14). In the M/M/1 example, (14) gives (i) Z=W, (ii)
S(s) = A/(v
(1-A)),(iii)
normality holds if W in (16) uses asample size (n - b +1) such that anasymptotic central
limit
theorem holds,(iv)
constant variances resultif
differentsimulatedtrafficrates usedifferentandappropriate sample
sizes-see
Kleijnen and VanGroenendaal (1995)-and (v) independenceresults ifnocommon pseudorandom numbers areused.Altogether,Hence, it is much moreefficientandeffectivetogenerateKriging testdatathrough sampling from (13) with S(s)=1/(51(1
-A))
instead of (15) and (16). Indeed,ourapproach requireslesscomputer time,andguarantees that thewhitenoise assumption holds,including the desiredvalue forthevariance ofthewhitenoise.The alternative using (15) and (16)would require very longruns.especially for hightrafficrates A/# T 1 this alternativerequires n T. .In conclusion, to testtheKrigingmethodology wegenerate datathroughastatic, random Monte Carlo model like(13) instead ofadynamic stochastic simulation model such as (15)
combined with (16). So,theMonteCarlotechnique is bothefficientandeffective.
Besides Example I, we study Example II representing simulationswith multiple local maxima, which are interesting when optimizing simulation outputs. Example I represents queueing simulations that show 'explosive' meanwaitingtimes as thetrafficrateapproaches the valueone.Example II has no specific interpretation.
We samplethewhite noise-term 11(s) in (14) throughtheMatlabfunctioncalled 'randn', whichgivesstandard
NIID
variates; that is, 11(s) has zero mean andunitvariance. We also experiment withalarger variance namely 25; this results in larger error terms, but not in otherconclusions.
To estimate possible values of the L2 norm (defined above), we use 100
rnacro-replications.Fromthesemacro-replications we estimate
4's
median, 0.10 quantile Qo 1, and 0.90 quantile Q09.In both examples we take n =21 equallyspacedinput values: s, with i = l,..., 21. For
cross-validation we select (ratherarbitrarily)
c=5
inputsvalues:Weeliminate i=2,8,9,1 5,1 6
respectively. We comparedthefollowing fivemetamodels. i) OrdinaryKriging
ii)
second-degree detrending: S(s) isasecond-degreepolynomialiii)
perfectlyspecified detrendingfunction: S(s) isahyperbolainExample I, andafourth degreepolynomialinExample IIiv)
fifth-degreedetrending: S(s) isafifth-degreepolynomial (overfitting)Kriging for Interpolation in Random Simulation 23
Note that wealsostudyaperfectly specified detrending function (see
iii),
becausethisfunction providesautopian situation:This functiongives theminimumprediction error;inpractice, this functionis unknown (otherwisesimulationwould not beused).Furthermore,it
helps us toverify the correctness ofour computer program.2.4.1 Example I:
M/M/1 hyperbola
We take S(s) =
s/(1-s) on D=[0.01,0.99] c R:
this hyperbolic function may represent themeansteady-statewaiting time foratraffic rate s in anM/M/1 queueingsystem.This function gives Figure2,which also displaysanexample ofthenoisy
output Z(s).
Theinputlocations ares, c <0.01,0.05,0.10,..., 0.95,0.99}. The cross-validationiscarried out at s = 0.05,0.35,0.40,0.70 and 0.75. 25 20 S(S)
15 1-Cs)
0 eliminated data 105
0 . 70-7%1' a:--*-::0::'$ 4 '4 0 0.2 0.4 0.6 0.8 S . -5Figure 2: S(s) = s/(1 - s) andexample sample output Z(s)
Table 1:Prediction accuracy: estimated quantiles ofLedistribution forExample I (M/M/1 ) metamodel
L2 i ii iii iv V
00 i 1.2429 1.3622 1.1972 2.7411 17.593 median 1.8832 2.1522 1.8419 3.6678 18.17 00, 2.5698 2.925 2.5829 4.4677 18.6522.4.2 Example
II:
fourth-degree
polynomial
We take thefollowingspecificpolynomial: S(s)=-0.0579s4 + 1.11 s' - 6.845
sl
+14.1071 s+2on D = [0,10] c R'.
Thispolynomial has two maxima:A
local one andaglobal one; see Figure 3. We obtainoutput forthefollowing21 input locations st € 0,0.5,1,...,10}.Wecross-validate at s = 0.5,3.5,4,7 and 7.5. 15 10
-2
S(s)a
\
. . . ... ..CO . .0 5 5 -:-I.A ...Z(s) 0 i i I eliminated data-5 0 2 4 6 8 s 1 0
-10 -15Figure 3: S(s)=-0.0579 s4 + 1.1 l
s' -6.845
s2+14.107ts + 2 andexample sample output Z(s)Kriging for Interpolation in Random Simulation 25
Table2:Example II ; alsoseeTable 1
metamodel
4 i ii iii iv V
Qo.i 1.5976 1.515 1.2094 1.2173 5.5965 median 2.4713 2.41 1.8748 1.9117 6.0363 00.9 3.3226 3.246 2.6424 2.6959 6.50482.5
Nugget effect
We also wish to better understandtherelationship betweenthenugget effect in (11) and the
variance of the noise 11(s) in (13). Therefore we performasimpleMonteCarloexperiment: We take Z(s) =1 0+11(s) where 11(s) is NIID with v =0 and 0 2= 1,4,9,16, and25respectively. We sample two macro-replicates, setting the seed
of
Matlab's'randn'-rather arbitrary-to the
values 10 and 20. In the variousKrigingmetamodels, we fitthelinear variogram of (11); see Figure 4 (we display results for theseedvalue of 10 only; notethedifferentscales for the y-axis in thefourplots).
6 · 25
seed: 10
5 seed: 10sigma: 1 20· sigma: 2
4. 15- 3-• 10- • 2 . .. 1 • 5-0. · 0 100 50· seed:10 90 seed: 10 sigma: 3 80- Sigma: 4 40· 70-30- · 60 50· * 40 20- * . · • .. 30· • 10 + 20 10 0· · · 0· · · · · 02 46 8 10 0 2 46 8 10
The intercept in (11) estimatesthenugget effect; this intercept ispresentedfor different 02 values inTable3.Obviously,these resultsconfirmour conjecture: The nugget effect is the variance ofthenoise.
Table3:Estimated nugget effectsfor different whitenoise variances a 2 Q 2 seed10 seed 20 1 1.1 0.9
444
9 9.6 8.5 16 17.1 15.5 25 26.5 24.12.6 Conclusions
and
future
research
We assume that in practice the mean v oftheKrigingmetamodel (1) is notaconstant, but is a composition ofasignalfunctionandwhitenoise.Wepresentedresults fortwo examples of inpuUoutputbehavior oftheunderlying random simulation model:Ordinary Kriging and Detrended Kriginggivesquite acceptable predictions, whereastraditional linearregression gives theworstresults.
0LS predictssopoorly,becauseOLSassumes thatthe
fitting
errorsarewhitenoise,whereasKrigingallows errors thatarecorrelated; morespecifically,theclosertheinputs are, the
more positive correlation. Moreover, 0LS usesasingle estimatedfunction forallinputvalues,
whereasKrigingadaptsits predictor as the input changes. Note that OLS maybeattractive when looking foran
explanation-not
aprediction-of
thesimulation'sI/0
behavior;forexample, whichinputs are most important; doesthesimulation showabehavior thattheexperts find acceptable (alsoseeKleijnen, 1998)?0LS isalsocomparedwith(universal) Krigingby RegniereandSharov (1999).Their
Kriging for Interpolation in Random Simulation 21
Further, we found that tile nugget effectequals thenoise variance.
We restrictedour examples toasingle input. Therefore we gaveeach
weight in the
more general distance formula (12),thefixed value of one. In designoptimization,however,
theseparameters are used to controltheimportance of theinputvariable x ; seeforexample
Simpson et al. (2001, p. 8)andJones et al. (1998, p. 5). Infuture workweshall investigate multipleinputs.
Further, we shall relaxtheassumption
of
whitenoise:We shall investigatetheeffects of non-constant variances(whichoccurin queueing simulations), common random numberusage (whichcreatescorrelations amongthesimulation outputs),andnon-normality(Kriging uses maximumlikelihoodestimators of the weights4,
whichassumesnormality). Finally,we shall applyKrigingto practical queueingandinventory simulations.Acknowledgment
We thank the two anonymousreferees
for
their most useful comments onanearlier version.References
Campolongo, F., J.P.C. Kleijnen, and T. Andres (2000), Screening methods. In: Sensitivity Analysis, edited by A. Saltelli, K. Chan, and E.M. Scott, Wiley, Chichester (England), pp.
65-89
Cressie, N.A.C. (1993).StatisticsforSpatial Dam,Wiley, New York
Huijbregts, C.J.andMatheron, G. (1971). UniversalKriging (An optimalmethodforestimating and contouring in trend surface analysis). In Proceedings of Ninth International Symposium on Techniques for Decision-making in the Mineral Industry
Jones, D.R., M.Schonlau, and W.J. Welch (1998).Efficient global optimization
of
expensiveblack-box functions.Journal of Global Optimization.13. 455-492
Kleijnen, J.P.C. and W. vanGroenendaal (1995), Two-stageversussequential sample-size determination in regression analysis of simulation experiments.American Journal of Mathematical and Management Sciences, 15, nos. 1&2, 1995, PM 83-114
Kleijnen, J.P.C. (1998),Experimental designforsensitivity analysis,optimization,andvalidation of simulation models. In: Handbook of Simulation. edited by J. Banks, Wiley, New York, pp.
173-223
Kleijnen, J.P.C.andJ.Helton (1999), Statisticalanalyses
of
scatter plots toidentify important factorsin large-scale simulations. ReliabilityEngineering andSystemsSafety, 65, no. 2, pp.147-197
Koehler, J.R. and A.B. Owen (1996), Computer experiments. In: Handbook OfSmtistics, Volume13,edited byS.GhoshandC.R.Rao, Elsevier, Amsterdam, pp.261_308
Law, A.M. andW.D.Kelton(2000), Simulation modeling and analysis,
third
edition, McGraw-Hill, BostonMatheron, G. (1962) Traite de Geostatistique Appliquee, Memoires du Bureau de Recherches
GeologiquesetMinieres, No. 14,Editions Technip,Paris,(pp.57-59)
Meckesheimer, M., R.R. Barton,T.W.Simpson, and A.J. Booker (2001),Computationally inexpensive metamodelassessmentstrategies.American /nstitute OfAeronautics and AstronauticsJournal(submitted)
Rdgnitre, J. and A. Sharov (1999),Simulatingtemperature-dependent ecologicalprocesses at the
sub-continental scale: male gypsy moth flight phenology.International Journal of Biometereotogy, 42, 1999,pp. 146-152
Sacks,
J.,
Welch, W.J., Mitchell, T.J. & Wynn, H.P. (1989), Designandanalysisof
computerexperiments. Statistical Science, 4. no.4.pp.409-435
Simpson, T.W., Mauery, T.M., Korte, J.J., and Mistree, F.(2001).Kriging Metamodels for Global Approximation in Simulation-Based Multidisciplinary Design Optimization. American Institute of Aeronautics and Astronautics Journal (submitted)
Chapter 3
Robustness
of Kriging
when
Interpolating in
Random
Simulation
with
Heterogeneous
Variances:
Some
Experiments
Abstract
Thispaper investigates the use
of
Krigingin random simulation whenthesimulation output variances are not constant. Kriginggivesaresponsesurfaceor metamodel that can be used for interpolation.BecauseOrdinaryKrigingassumesconstant variances, thispaperalso applies DetrendedKrigingtoestimateanon-constant signal function, andthenstandardizestheresidual noise throughtheheterogeneous variances estimatedfromreplicated simulationruns.Numerical examples, however,suggestthat OrdinaryKriging isarobustinterpolationmethod.3.1 Introduction
Given a set
of
inpuVoutput (I/O)data Kriging fitsametamodel (also calledresponsesurface,emulator, auxiliary model, etc.). Kriging isaninterpolationmethod; that is, at the observed 1/O valuesits predictionsareexactly equal totheobservedoutputvalues.However, classicleast
squares
(LS)-used
inregression-typically fits
alow-degreepolynomial,which results innon-zero
fitting
errors (or residuals) such that the sumof
these squarederrorsisminimized.Originally,theSouthAfrican miningengineer D.G.Krigemodeled realworld I/0 data to predictgold quantities;seeCressie (1993). Later on,Krigingwasapplied todeterministic simulation 1/Odata(modelingtheperformance
of
computer chips, cars, etc.);seeSacks et al. (1989). More recently, the applicationof
Krigingtorandom simulation(forqueuing problems, etc.)wasproposedbyBarton (1994), andwasinvestigatedindetail by VanBeersandKleijnen (2003). The latter authors suggested that it maybebetter to replace classicOrdinary Kriging by theirnovel method called DetrendedKriging, whichcorrectsor'detrends' theoriginaloutput data throughanestimated signal function (see S(s) below). However, both Ordinary and DetrendedKrigingassume that the outputs have constant variances.In practice, this constant variance assumptionisoften completelyunrealistic. For example,theso-calledM/M/1queue isanimportant buildingblock in discrete-eventsimulation; see LawandKelton (2000). For theM/M/1,Cohen (1969) provesanalytically thatthe
steady-statewaiting time (say) Z foratraffic load s with 0<s<1 has mean
E(Z(s))=-1 (1)
1-sand variance
\ 3(2 - s)
Var(Z(s))= (2)
(1 -s) 2
sothis varianceisdefinitelynotconstant whenthetrafficloadchanges.Therefore we now investigate the consequences
of
varianceheterogeneitywhen applying Kriging to the I/0 of random simulation.Whereas linear-regression analysis
of
low-order polynomialmetamodels has been applied extensively in discrete-event simulation (such as queueing simulation), Kriging has hardly been applied to random simulation.The mainconclusion of our (limited)experiments will be thatOrdinaryKrigingseems a
Robustness of Kriging when Interpolating in Random Simulation with 31
Heterogeneous Variances: Some Experiments
The remainder of this paperisorganizedasfollows.Section3.2summarizesthebasics of Kriging.Section3.3 discussessolutions fortheproblems
of
non-constantandexplodingvariancesinrandomsimulation. Section3.4proposesanovelKrigingmethod, which estimates a non-constant signal function andthenon-constant variances. Section3.5 presentsnumerical examples, whichsuggestthat OrdinaryKriging isarobust prediction methodwhereasthenovel Krigingmethodisbetter only in utopian situations. Section3.6gives conclusionsandproposes
futureresearchtopics.
3.2
Kriging
basics
Krigingassumesthefollowingmodel:
Z(s) = S(s) + '1(s) with 11(s)
-NID(0, 02 (s)) (3)
where S(s) is the so-calledsignal function and4(s) the additivenoise; we use the notation x - NID(a, b) when x is normally and independently distributed (NID) with mean a and variance b. Below, we shall discuss the classic assumption
of
white noise,which means that02(s) in(3) reduces toaconstant (say) 0 2.
Note: The model in (3) maygeneratenegative output values,whereasM/M/1generates
non-negative valuesonly.Nevertheless, we use(3)because
it
provides bettercontrol over our experiment and itisfaster thanM/M/1 simulation; see VanBeers andKleijnen (2003).Ordinary Krigingassumesastationary covariance process
for
Z(s) in (3) (see Cressie1993).That assumption implies thattheexpected values E (Z(s)) areconstant, and the
covariances of Z(s, + h) and Z(si) depend only on the distance (or lag) h.The predictor for the
point so -denoted by p(Z(so))-is
aweightedlinearcombination of all the (say)nobserved output data:To selecttheweights /1, in(4), Krigingminimizes the mean-squared prediction error. The
technique uses thevariogram,defined as 23 (11) =var(Z(s + h)-Z(s)), see Figure l discussed below.The optimal weights turn out to be
" -t' +1 1 i,;, i, j r-1 (5)
where 7 denotesthe vector (7(so -s, ) . . . . , y(so
-s„))',r
denotes the n x n matrix whose (i,j)*
elementis 2 (s, -sj),and 1 denotesavectorofones; alsoseeCressie (1993, p. 122). Note thattheweights A, in (5) vary withthepredictionpoint,whereaspolynomial-regression metamodels use thesameestimated metamodel forallthesepoints.3.3
Random simulation
with
heterogeneous variances
Inthispaper,weallowheterogeneous variances;hence, theKrigingassumption ofastationary covariance process does not hold. Therefore, we do not expectasmooth variogram. Furthermore,
Van BeersandKleijnen(2003) found thattheintercept ofthefittedlinear variogram estimates theso-callednuggetefect. The term 'nugget' originated in the Kriging of gold mining data:
when going back to the 'same' spot (lag h 1 0; see the text above equation 4),acompletely different output-namely, agold nugget-maybeobserved. In general,Kriginginterprets a nugget effect asanestimate
of
noise;forexample, measurement noise in geostatistics. Random simulationmodels produce noisydata-by
definition. Inthispaperwe shall investigate whetherthe intercept ofthe variogramstillestimatesanugget effect in case
of
heterogeneous variances. Moreover, in simulation the variance may besolarge that simulation is actually impractical: as the traffic load approaches the value one, the varianceexplodes-as the M/M/1 illustrates; see (2). Therefore ChengandKleijnen(1999) recommend that(i) in suchasituation we should simulate a load much lower than one,andextrapolate;
Robustness of Kriging when Interpolating in Random Simulation with 33
Heterogeneous Variances: Some Experiments
Likewise,Asmussen (1992) recommends not tosimulate the M/M/1 foratraffic rate s > 0.9; he too suggests to extrapolate inthose cases.VanBeers andKleijnen(2003),however, simulated
loads up to
0.99-but
theyart(ticially keptthevariances constantandsmallovertheirwhole rangeof
simulated values, 0.01 5 3 5 0.99.Initially,we sampleasingle output value for each of21 input (trafficrate)values, as
follows. We use (3) where we insert (1) and (2),whichmodel the signal and the variance functions of the M/M/1. We do so for21 traffic rates s e *101,0.05,0.10,...,0.95,0.99}. We
ignorethevariance heterogeneity andthevariance explosion,andproceed as in Van Beers and
Kleijnen (2003).We thengetestimated variogramslikeFigure 1.Thisfigure showsa
wildly
oscillatingvariogram; obviously, thelinearapproximation is poor. This poor
fit
gives an inaccurateKrigingmetamodel withtheweights A, defined in (5).15000 10000 -21(h)
...
5000- , I ..-I-... 0 01 0.2 0.3 04 05 0.6 0 7 0 8 0. 1 hFigure1: Estimated variogram for Z(s) = s/(1 - s) + 11(s) with 11(s) - NID ), 3(2 - s)/(1 - s)2 )
Figure 1 illustrates that in random simulation we should ensure thatthesignaYnoise
ratio
is 'adequate'.Torealize suchasignal/noise ratio, we should-first of all-avoid variance explosion; that is, we should simulateanon-saturated trairic load: that is. a load much lower than
Comparison withthevalues onthey-axis
of
Figure 1 shows thattherange ofthevariogram ismuch smaller,sovarianceshaveindeedbeen decreased.
. .. 14- * 12- 10-.. 2,(h) ' 6. -,
...
0 0.1 02 03 04 05 06 07 0. 09 1 hFigure2: Variogram afterelimination of s = 0.99 in Figure 1
Besidesavoidingvariance explosion, we may reducethemagnitudes oftheremaining variances through increasingthenumber ofreplicates(say) m-whenever the noise of a single
replicate is too big. We denotethenumber
of
identicallyandindependentlydistributed (IID) replicatesfor input value s by m(s). Thenoneoption istoselectthese m(s) such that thevariances ofthe average responsesbecomeaconstant (say) 02:
t- 1 Var(Z(s))
Var(Z(s))= , = 02, (6)
m(s)
see Kleijnen and Van Groenendaal (1995).
Note:Insteady-state
simulation-as
opposed toterminating simulation-we may make a single long runandpartition that runintosubruns that play the roleofreplicates; see Law and Kelton (2000).In practice, however,theexperimental designimplied by (6) maybeimpractical (see
Robustness of Kriging when Interpolating in Random Simulation with 35 Heterogeneous Variances: Some Experiments
averages
Z(s).
These averages are'the' observations attheinputs s towhich we fitaKriging metamodel,asfollows.3.4
Novel
Kriging
method:
Studentization
To solvethevariance heterogeneity problem, wefirstapply VanBeersandKleijnen (2003) 's
DetrendedKriging,whichassumes
Z(s)= SCS)+11(s)+ a (s) (7)
where S(s) is the signal function, 11(s)- NID(9,02) is the whitenoise model for measurement
error, and b (s) denotesfittingerror. However,unlikeVanBeers andKleijnen (2003) we now
allow 11(s) to haveheterogeneousvariancesdetermined bytheinput s; see (3).
Unbiased estimators
of
thesevariances-not
depending onametamodel such as (7»aretheclassic estimators
I ", 4,(s,) - 2(s, ,)'
a2 (st)= "' (8)
m(st)-1
where Z,(s,) denotes the
r
replicated observation on Z(s,) andzcs,)= Ir:.)z.(st)
(9)m(si)
denotesits sampleaverage.
LikeVanBeers andKleijnen (2003), wemay estimatethesignalfunction S(s) through
unequal variances, so the best linear unbiased estimator (BLUE) is given by weighted least
squares(WLS)-not ordinary least squares (OLS).
Next,weapplyOrdinaryKriging tothedetrendedandstandardized observations
Z (S)=Z(S) - S(S) (10)
0(s)/ fm
Because (10) resembles Student'stStatistiC, we callthetransformation in(10) 'Studentizing'. If we neglect the random character of S(S), then we conjecture thatthetransformed data
Z(s) in (10) haveaconstantmean(namely zero) andaconstant variance, namely
IN(m -2).So
2(s) then satisfies the Ordinary Kriging assumptions.
Finally,the predictor for so (say) p(Z(so)) follows from (10):
d(so)
p(Z(S )) = P(Z(sonx--+
SWA, (11)qm
where p(Z(so)) isthe OrdinaryKriging predictor for so based onthe transformed data in (10);
d (so) isgiven by (8) if so
-st;
otherwise, weusepiecewise linear interpolation between the variances at the two neighboring points thathavealreadybeenobserved. (This interpolation avoids negative estimated variances.)3.5
Numerical
examples
Asthesignal jiinction in (3)wefirstselect S(s) = s/(1 -s), whichequals theM/M/1hyperbola in (1). FollowingVanBeers andKleijnen (2003),weexperiment with fivedetrending models,