• No results found

Kriging metamodeling for simulation

N/A
N/A
Protected

Academic year: 2021

Share "Kriging metamodeling for simulation"

Copied!
116
0
0

Bezig met laden.... (Bekijk nu de volledige tekst)

Hele tekst

(1)

Tilburg University

Kriging metamodeling for simulation

van Beers, W.C.M.

Publication date:

2005

Document Version

Publisher's PDF, also known as Version of record

Link to publication in Tilburg University Research Portal

Citation for published version (APA):

van Beers, W. C. M. (2005). Kriging metamodeling for simulation. Tilburg University Press.

General rights

Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain

• You may freely distribute the URL identifying the publication in the public portal Take down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

(2)

53976

..

(3)

/3.-/-1

UNIVERSITEIT * 2]F *

.16----4

;AN TILBURG SIBLIOTHEEK

1 TILBURG

STELLINGEN

behorende bij het

proefschrift

Kriging

Metamodeling

for

Simulation

door Wim C.M.

Van

Beers

1. In'discrete-event' simulatie geeft

Kriging

beterevoorspellingen dan polynomiale regressiemodellen.

(dit proefschrift)

2.

Kriging

interpolatieisrobuusttenaanzien vandeklassieke modelveronderstelling datde respons eenconstantevariantieheeft.

(dit proefschrift)

3.

Voor

tijdrovende simulaties

zijn

modelafhankelijke

sequentiele proefopzettenteprefereren boven klassieke proefopzetten.

(dit proefschrift)

4. Hoewelhetsimuleren vandegemiddelde

wachttijden voor

een gegeven

bezettingsgraad van een M/M/1 model door middel vaneenniet-lineaire

stochastische

differentievergelijking uit principiele

overwegingen de voorkeurheeft, is hetvoor toetsdoeleindene

fficient

eneffectief om de analytisch verwachte wachttijdenteverhogenmetNHD-trekkingen.

(4)
(5)

KRIGING METAMODELING

FOR

SIMULATION

Proefschrift

ter verkrijging vandegraad vandoctor

aandeUniversiteitvanTilburg,

op gezag vanderector magnificus, prof. dr. F.A. van der DuynSchouten,

inhet openbaarteverdedigen ten overstaan van

een door hetcollege voor promotiesaangewezencommissie in de aula vandeUniversiteit

(6)

BIBLIOTHEEK TILBURG

(7)

Acknowledgment

This thesis istheresult of myresearchduringsix years in the Operations Research group oftheDepartment

of

InformationSystems andManagement intheFaculty of Economics and BusinessAdministrationat

Tilburg

University. During that time I met many peoplewhosupported me in various ways. They made mystayproductive

and-not in

theleast-pleasant. To some of them Iowe special thanks.

First of all, I

amextraordinarily grateful to mysupervisor, Professor Jack Kleijnen.Heguidedmethroughthefascinating world

of

randomsimulation and taught me to search for the

intuition

('wiedes') behindthe results. We had many

valuable discussions, andheencouraged me to develop the ideas that we discussed in such a way thatthey couldbepublishedin international journals. Iamgrateful for the manycriticalcommentshe

made-in

red, green, and blue. Theexistence ofthis thesis is largely due to him.

Alsospecial thanks are due to themembers of myPh.D.committee: Professors H.Daniels, D.den Hertog, T. Dhaene,J.Rooda, and H. Wynn

for

their

carefulreading,thoughtfulsuggestions,andcorrections. Ifeel honored to have them in my committee.

Among alltheotherswhosupportedmeduring myPh.D.study, I particularly wishtothank ourdaughter,Miranda. Her typicalmental support was a real

encouragement to me.

Last butcertainlynot least, I am mostgrateful to my wife,Marianne. She patiently gave me all the timeand space I needed to ventureintothosethings that I foundfascinating.There are no words to describe theemotionalandloving way she supportedme.Undoubtedly,thebiggest thanks are for her!

(8)

1 Introduction

to

Kriging

Metamodeling

for

Simulation 1

1.1 Real systems and mathematical

models 1

1.2 Simulationtypes 2

1.3 Metamodels 3

1.4

Kriging

4 1.5

Kriging

applications area 5

1.6 Summary

of

thesis 6

References 7

2 Kriging for

Interpolation

in

Random

Simulation 11

2.1 Introduction 11

2.2 Kriging 14

2.2.1 History

of

Kriging 14

2.2.2 Basics

of

Kriging 14

2.2.3 Formalmodel

for Kriging 15

2.3 Detrended

Kriging 19

2.4 Twoexamples and

five

metamodels 20

2.4.1 Example I: M/M/1

hyperbola 23

2.4.2 Example

I[:

fourth-degreepolynomial 24 2.5 Nugget

effect 25

2.6 Conclusionsandfuture

research 26

References 27

3 Robustness

of Kriging

when

Interpolating in

Random

Simulation with

Heterogeneous

Variances 29

3.1 Introduction 29

3.2

Kriging

Basics 31

3.3 Randomsimulationwithheterogeneous variances ...... 32

3.4 Novel

Kriging

method:

Studentization 35

3.5 Numerical

examples 36

3.6 Conclusionsandfurther research 43

References 44

(9)

4

Application-driven

SequentialDesigns

for

Simulation Experiments:

Kriging

Metamodeling 47

4.1 Introduction 47

4.2

Kriging basics . 50

4.3 Doeand

Kriging

52 4.4 Application-drivensequential

design 53

4.4.1

Pilot

input

combinations 53

4.4.2 Candidateinput combinations 55

4.4.3 Cross-validation 55

4.4.4 Jackknifing 56

4.4.5 Sequentialization 57

4.4.6 Stopping rule 58

4.5 Two examples 59

4.5.1 Example I: ahyperbolicVOfunction 59 4.5.2 Example H:afourth-order polynomial

I/0

function ... 61

4.5.3 Estimated variograms: Gaussian versuslinear ... 63

4.6 Conclusions andfurtherresearch 63

References 64

5 Customized Sequential Designs

for

Random

Simulation Experiments:

Kriging

Metamodeling

and

Bootstrapping 67

5.1 Introduction 68

5.2

Kriging

Basics 70 5.3 DOEand

Kriging 73

5.4 Sequential DOE 75

5.5 Two examples 79

5.5.1 M/M/1 model 79

5.5.2 (s,S)inventory

model 89

5.6 Conclusionsandfuture

research 92

5.7 Appendix

92

References 94

(10)

6.2 Future

research 99

Samenvatting (Summary

in Dutch)

103

(11)

Chapter 1

Introduction

to

Kriging

Metamodeling for

Simulation

1.1

Real

systems

and

mathematical

models

Many scientificdisciplinesusemathematical models(including simulation;seebelow) to

describecomplicated real systems. The goal ofsuchmodelsisgetting more insight into the real system,to answer questions such as: What is theoutput's sensitivity tothe inputs; what is the optimal combination oftheinput values?

To obtainsuchinsight, analytical methods (e.g.,differentialcalculus) often turn out to fail. In suchcases,numerical methods maybetried; i.e., experimentation with the model may answerthequestions about the realsystem.This experimentation often implies thatthemodel is converted intoacomputer code (or computer program) that is run foranumber

of

differentinput

combinations. Next, the resultinginput/output

(I/0)

behavior ofthe model shouldbeanalyzed.

We emphasize that theselection oftheinputcombinations shouldbeguided byscientific principles (e.g., changingoneinput at a time canbeproven tobeineffective

if

inputs interact, and inefficientelse). Theseprinciplesareinvestigated in mathematical statistics under the name

Design OfExperiments (DOE). In numerical experiments (as opposed to experiments with the

realsystem)theseprinciplesneedadjustment;seeKleijnen et al. (2005)andChapters 4 and 5. We focus on computer expensive simulation experiments; i.e., experiments that require much computer time. Inthesesituations, DOE iscertainlyneeded.

(12)

analysis requiresformalmethods. Most popularareLeastSquares(LS)curvesfitted to the I/0

data.We, however, focus onanalternative method knownasKriging;seeSection 1.4. Next,weshall discuss the key terms inthetitle ofthis chapter andthisthesis.

1.2

Simulation

types

A

simulationmodel might beaphysical model, e.g., ascalemodel ofaracing car in a wind

tunnel. We, however,

limit

ourselvestomathematical models (also seethepreceding section).

Following Kleijnen (1974), we define simulationasexperimenting with a model over time.This

definitionindicates thatthevariabletimeplaysaspecial roleinsimulation. Indeed, Law and

Kelton (2000) also emphasize the role of time, by using the term dynamic simulation. Besides this simulation type, there is static simulation, in which time plays no role; an example is the Monte Carlo (MC)method.By definition, the MCmethodusesPseudo-Random Numbers (PRN);i.e., computergeneratednumbers between zero and one thatareindependent and

uniformly

distributed over this interval. We shall also useMCmodels in this dissertation (see Chapters 2,3, and 4).

Therearedeterministicandrandom simulationmodels. Ifthesimulationmodel contains no random components, the simulation modeliscalled deterministic. For example, such

simulationmight solve a set

of

complicateddifferentialequations describingtheairflowaround airwings. Simpson et al. (2001) mention applications

of

deterministic simulation invarious disciplines.Typically, in this type

of

simulation,aninputcombinationneedssimulation only once(repeatedcomputer runs with thesameinputcombination give exactly thesameoutput value, provided we do not make any changes inthecomputer software or hardware).

(13)

Introduction to Kriging Metamodeling for Simulation 3

simulation type. Note that we donotstudy dynamic,deterministicsimulation models (often used inComputerAidedEngineering, CAE; seeforexample De Geest et al. (1999)).

Table1: Simulation typeswithappropriate dissertationchapters

Deterministic Random

Static Chapter 4 Chapters 2 and 3

Dynamic Chapter 5

Finally,wediscussathird waytodistinguish various simulationtypes:continuousversus

discrete-event simulation. Continuous simulation experiments with models consisting of differentialequations(so statevariableschangecontinuously). Computercodesapproximate thesedifferentialequations by difference equations.Thesemodelsareusually deterministic (see

the preceding discussion, summarized in Table 1).Obviously,random elements maybeadded to thedifferentialequations;forexample, econometric models may consist

of

difference equations plusadditivenoise.

Discrete-event simulationhas statevariables thatchangeinstantaneously,atpoints in time that are not necessarily equidistant;forexample, customers arrive ataserveratrandompoints of time.Anevent isthe change inthesystem'sstate;forexample, the number

of

waitingcustomers

increases.Randomsimulationincludes discrete-event event simulation; see VanBeers and

Kleijnen (2005).

Simulation-in its

many

forms-is

applied in manyscientificdisciplines. We focus on thedisciplineknownasOperations Research/Management Science (OR/MS). In OR/MS,

simulation isalsooftenapplied-,eventhoughsimulationisdescribedasmethod of last resort 'when allelse

fails...' in

thefamousChapter 21 inWagner (1975).

1.3 Metamodels

A metamodel is also calledaresponsesu,face,auxiliarymodel, emulator,etc.Metamodeling

originatesfrom neuro-linguistics inthebeginning ofthetwentieth century;seeKorzybski and Meyers 1958. Kleijnen et al. (2005) defineametamodel asanapproximation of the true I/O

(14)

simpler thantheunderlyingsimulation model;anexample isafirst-order polynomialregression (meta)model of the

I/0

function ofaqueueing simulation. A metamodeltreats thesimulation model asablack box; i.e., it uses the VO datawithoutknowledge of the way thesimulation modelprocesses theseinputs to gettheoutputs.

A metamodel is a tool forthesystematicexperimentationandanalysis ofasimulation model, togain insight (againseeSection 1.1).Kleijnen(1998)discusses the use

of

metamodeling for

(i)

sensitivity analysis

(ii) optimization

(iii) validation

andverification. We shall focusonsensitivityanalysis.

There are many types

of

metamodels;seeKleijnen (2005). Most populararelow-order polynomialregression models. However,Kriging isalsoapplied frequently in deterministic

simulation; see Den HertogandStehouwer (2002),andSimpson et al. (1998).Becauseother types

of

metamodels do not play a role in this thesis and we wish to avoid errors

of

omission, we refrain from listing(anddefining)thesetypesandgivingkey references. In thisthesis,wefocus

on Kriging,andcompare its performancewithclassical regression.

Atkinson (1989, ch.3) definesinterpolation as theselection of a function in such a way thatits graphpassesthroughafinite set

of

given data points.Krigingappliedtodeterministic simulationmeetsthat definition, as we shall see in the next section. Inpractice, the term interpolation is also used to meanafunctionthat approximates the VO data: forexample, a regression model may fit the I/0 data such that

it

minimizes the sum

of

squaredapproximation errors (LS criterion).

1.4 Kriging

In the195Os,theSouthAfrican miningengineer D.G. Krige (born in 1919, andstillalive)

devisedaninterpolation methodtodetermine true ore-bodies,based on samples.The basic idea is thatthesepredictionsareweighted averages of the observed outputs, where the weightsdepend

(15)

Introduction to Kriging Metamodeling for Simulation 5

observed.The weightsare chosen so astominimizetheprediction variance, i.e.,theweights should provide a Best Linear UnbiasedEstimator (BLUE) oftheoutput value foragiven input.

Therefore,Kriging isalsocalled Optimal Interpolation.

The dependence oftheinterpolationweights on the distances betweentheinputs was mathematically formalized bythe French mathematician Georges Matheron (1930-2000) in his monumental 'Traitddegdostatistique appliqude' (1962). He introducedafunction, which he calledavariogram, to describe the variance of the difference between two observations; also see

Chapter 2.

SoKrigingoriginated in geostatistics to answer concrete questions in the goldmining industry:

Drilling

for

ore-deep

underthe

ground-is

expensive,soefficientprediction methods

are necessary.Lateron, Krigingwassuccessfully introducedinto deterministicsimulation by Sacks et al. (1989). In thisthesiswe introduceKriginginterpolation into the area

of

random simulation.

Krigingprovides 'exact' interpolation,i.e.,predicted outputvaluesat inputs already

observed equaltheobserved outputvalues. Suchinterpolationisattractiveindeterministic simulation,andKrigingis often applied inCAD(mentionedinSection 1.2).In discrete-event

Simulation, however, Kriging hasjuststarted.

Above, we mentioned thatthevariogram isthecornerstonein Kriging. Hence,accurate estimation ofthevariogram,based on theobserved data,isessential. JournelandHuijbregts (1978,pp. 161-195) present various parametric variogram models. The values of itsparameters

are obtainedbyeither Weighted Least Squares (WLS)or Maximum LikelihoodEstimation (MLE);seeCressie (1993). In thisthesis,we shall useboth methods: WLS in Chapters 2,3, and

4, and MLE inChapter 5.

1.5

Kriging

applications area

Aftertheintroduction

of

Kriging intheminingindustry,

it

became very popularin other areas of geostatistics, suchasmeteorology, oceanography, agricultureandenvironmentstudies; see

(16)

Afterthepioneeringarticle

of

Sacks(1989), Kriging has alsobeenwidelyapplied in deterministicsimulationforengineering,aimed at thedesign

of

better computer chips, television

screens,aircraft,andautomobiles.

In this thesis we introduceKriging for simulation inOR/MS, covering both deterministic and random simulations(in differentchapters). We

limit

ourselves tosensitivityanalysis; we feel that sensitivityanalysisprecedesoptimization, sowe leaveKriging for optimizationasfutureresearch

(alsoseeChapter 6).

1.6 Summary

of

thesis

Thisthesiscontains the full text of fourpapersthat eitherhavealreadybeenpublished or have

been submittedfor publication. It istheverbatim text, exceptforthreesentencesinChapter 2, indicated by footnotes. In this section we summarizeeach paper.

In Chapter 2, we introduceKriging interpolation forrandom simulation. We develop a novel type

of

Kriginginterpolation, and call

it

DetrendedKriging.We demonstrateKriging through two numerical examples:

(i) ahyperbole inspired by the single-server queueingsimulation withPoisson (Markov)arrivalandserviceprocesses(known as theM/M/1queueing model)

(ii) an

artificialmodel, namelyafourthdegree polynomial withmultiplemodes plus

additivenoise.

We test our novel method through cross-validation,andcompare it tolow-orderregression metamodelsfittedthrough Ordinary Least Squares (OLS). The mainconclusion ofthis chapter is thatKriginggives better predictions thantheseregression models. Furthermore, tests show that the Kriging'sso-called nugget effect equals the variance of the additive noise.

In Chapter 3, we droptheclassicKrigingassumption

of

outputswithconsmnt variances. In practice, this assumption is not realistic. Therefore we investigatethe consequences

of

Kriging

(17)

Introduction to Kriging Metamodeling for Simulation 7

heterogeneity;i.e,Kriging isarobustmethod thatstilloutperformslow-order polynomial regression.

In Chapter 4, we propose a novel method toselectexperimental designs

for

Kriging. Our

methodissequential,becausewe focusonexpensive deterministic simulation.Ourdesigndiffers from traditionaldesigns suchasLatinHypercubeSampling (LHS);seeMcKay,Beckman, and

Conover (1979), andalsoKoehler and Owen (1996),and Kleijnen et al. (2005). More

specifically,our method accounts for the specific

I/0

functionimplied bytheunderlying simulationmodel.Our customized, tailor-made designsareconstructed throughcross-validation and jackknifing. We test our method through the two academic applications that we also used in Chapter 2. Our main conclusions are thatthenovel method simulatesrelativelymore inputs in the more interesting parts oftheunderlying

I/0

function, and

it

gives better predictions than

traditionalLHSdesigns.

In Chapter 5, we extendthemethod

of

Chapter 4 to random simulation, especially discrete-eventsimulation. However, customization is nowbasedonbootstrapping instead of

cross-validation. We testour method through two discrete-event simulation models that are classic inOR/MS,namely theM/M/1 queueing model and an (s,S)inventorymanagement model. We again comparetheperformance of our methodwithclassicalLHSdesigns. Itturns out thatour design indeed gives better results.

Finally, inChapter 6 we summarize the conclusions ofthepreceding chapters. We also discussthe advantages andthedisadvantages

of

Kriging.Wefinish withpossible topicsforfuture research.

References

Atkinson, K.E. (1989), An introduction to numericalanalysis. John Wiley & Sons, Inc., New York

(18)

De Geest, J., T. Dhaene, N. Fache, and D. DeZutter(1999),Adaptive CAD-model building algorithm forgeneral planar microwave structures./EEETransactionsonMicrowave Theory and Techniques, 47, no. 9,pp. 1801-1809

Den Hertog, D., and H.P. Stehouwer (2002),Optimizing colorpicturetubesby high-cost non-linear programming. EuropeanJournal

of

OperationalResearch, 140(2),197-211

Journel, A.G. and Huijbregts, C.J. (1978),MiningGeostatistics. Academic Press,

London

Kleijnen, J.P.C. (1974),Statisticaltechniquesinsimulation, partL MarcelDekker Inc., New York

Kleijnen, J.P.C. (1998), Experimental designfor sensitivityanalysis,optimization, and validation

of

simulation models. Chapter 6 in: Handbookofsintulation, edited byJ.Banks, Wiley, New York,pp.173-223

Kleijnen, J.P.C. (2005), Invited review: Anoverview ofthedesignandanalysis of

simulation experiments for sensitivity analysis.European Journal of Operational Research tin press)

Kleijnen, J.P.C., S.M.Sanchez,T.W.Lucas andT.M.Cioppa (2005),Auser's guide tothebravenewworld o

f

designingsimulationexperiments. INFORMS Journal onComputing (accepted as State-of-the-Art Review)

Koehler, J.R. and A.B. Owen (1996), Computer experiments. Handbook of statistics,

by S. Ghosh and C.R. Rao, vol. 13,pp.261-308

Korzybski, A. andR.Meyers (1958),Scienceand sanity:anintroduction to non-aristotelian systems and general semantics; fourth edition. International Non-Aristotelian LibraryPublishing Company,Lakeville

Law, A.M. and W.D. Kelton(2000), Simulation modeling and analysis;

third

edition. McGraw-Hill,Boston

Matheron, G. (1962), Traitd de gdostatistique appliqude. Memoires du Bureau de Recherches Geologiques et Minieres, Editions Technip, Paris, no.14: pp.

(19)

Introduction to Kriging Metamodeling for Simulation 9

McKay, M.D.,R.J.Beckman and W.J. Conover (1979),

A

comparison

of

three methodsforselecting values

of

input variables intheanalysis

of

output from a computercode. Technometrics, 21, no. 2,pp.239-245(reprinted in 2000: Technometrics, 42, no. 1, pp. 55-61

Sacks, J.,W.J. Welch, T.J. Mitchell and H.P. Wynn(1989), Designandanalysis of

computer experiments.Smtistica/Science, 4, no. 4, pp. 409-435

Simpson, T.W.,J.J.Korte,T.M.Mauery, andF.Mistree (1998),Comparison of

responsesurfaceand Krigingmodelsfor multidisciplinarydesign optimization. AIAA Journal, vol. 1,pp.381-391

Simpson, T.W.,T.M. Mauery,J.J. Korte, andF.Mistree (2001), Kriging

metamodels

for

global approximation in simulation-basedmultidisciplinary

designoptimization. AIAAJournal, 39, no. 12, 2001,pp.2233-2241

Van Beers, W.C.M. andJ.P.C. Kleijnen(2005), Customized sequential designs for

randomsimulationexperiments:Krigingmetamodelingandbootstrapping. Submitted for publication

(20)
(21)

Chapter 2

Kriging for Interpolation

in

Random

Simulation

Abstract

Wheneversimulationrequires much computer time,interpolationis needed.Simulationists use different interpolationtechniques (forexample, linear regression), but thispaperfocuses on

Kriging. Thistechniquewasoriginallydevelopedingeostatistics by D.G.Krige, and has recentlybeenwidelyappliedindeterministic simulation. Thispaper,however, focuses on randomorstochasticsimulation. Essentially,Kriginggives moreweightto'neighbouring' observations. Thereareseveral types

of

Kriging;this paperdiscusses-besidesOrdinaryKriging -a noveltype,which 'detrends' data through the useof linearregression. Resultsarepresented fortwo examples

of

input/outputbehaviour of theunderlyingrandom simulation model: Ordinary and DetrendedKriginggive quite acceptable predictions;traditionallinear regression gives theworstresults.

2.1 Introduction

A primary goal of simulationiswhat iforsensitivity analysis: What happens if inputs of the simulation model change? Therefore simulationists runagivensimulation

program-or

computer

code-for (say)

ndifferentcombinations of theksimulationinputs. Weassume that

Paper byVan Beers, W.C.M. andJ.P.C.Kleijnen, Journal OftheOperationalResearchSociety, no. 54,2003, pp

(22)

theseinputsareeither parameters orquantitativeinputvariables ofthesimulationmodel. Typically, Krigingassumes thatthe number

of

values per input variableisquite

'big',

certainly exceeding two(twovalues are usedinsimulationexperimentsbased on 2k-Pdesigns).

Given this set ofninputcombinations,theanalysts run the simulation and observe the outputs. Note that most simulation models havemultipleoutputs, butinpracticetheseoutputs are analysedperoutput type.

The crucial question of this paper is: How to analyse thissimulation input/output (I/0)

data?Classic analysisuseslinear-regression (meta)models;seeKleijnen (1998). Ametamodel is an approximation of the

I/0

transformationimplied bytheunderlying simulationprogram. Many other termsarepopular in certain disciplines:Response surface,compact model, emulator, etc. Suchametamodeltreats thesimulation model asablack box; that is, thesimulationmodel's VO is observed, andtheparameters of the metamodelareestimated.Thisblack-boxapproach has the

following

advantages anddisadvantages.

An advantage is thatthemetamodel canbeapplied to all types

of

simulationmodels, either deterministic or random, either in steady-state orintransientstate.Adisadvanmge is that it cannot take advantage ofthespecificstructure ofagiven simulation model, so it may take more computer time comparedwithtechniques suchasperturbation analysis andscorefunction.

Metamodelling can also helpin optimizationandvalidation ofthesimulationmodel. In this paper, however, we do not discussthesetwo topics, but refer tothereferences ofthispaper.

Further, ifthesimulation modelhashundreds

of

inputs,thenspecial 'screening' designs are

needed, discussedin Campolongo,Kleijnen,and Andres (2000). In ourexamples-but not in our methodologicaldiscussion-we limitthenumberofinputs totheminimum,namelyasingle

input.

Whereaspolynomial-regression metamodels havebeenapplied extensivelyin discrete-event simulation (suchasqueueingsimulation), Kriginghashardlybeenappliedto random simulation: Asearch ofIAOR(International Abstracts

of

Operations Research) gave only two hits. However, in deterministicsimulation(applied in many engineering disciplines; see our

(23)

Kriging for Interpolation in Random Simulation 13

In randomsimulation,however, thisKrigingproperty may not besodesirable, sincetheobserved (average)value is onlyanestimate of the true, expected simulation output.Unfortunately,

Krigingrequires extensive computation,so adequatesoftwareis needed.We discovered that for random simulation no software is available, so we developed our own software, in Mattab.

Note that several types

of

random simulation maybedistinguished:

(i) Deterministic simulation withrandomly sampled inputs. For example, in investment analysis wecancompute the cash

flow

development over time throughaspreadsheet suchasExcel. Next, we sampletherandom values

of

inputs-such as the cashflow growth

rate-by

means

of

either MonteCarlo orLatinHypercube Sampling(LHS)throughanadd-on such as @Riskor Crystal Ball; see Van GroenendaalandKleijnen (1997)

(ii)

Discrete-event simulation. For example, classic queueingsimulationisapplied in logistics

and telecommunications.

(iii)

Combined continuous/discrete-event simulation. For example,simulation

of

nuclearwaste

disposalrepresentsthe physicalandchemicalprocessesthroughdeterministicnon-linear difference equationsandmodelsthehuman interventionsasdiscreteevents (seeKleijnen and Helton, 1999).

Ourresearchcontributionconsists inthedevelopment ofanovel (namely, detrended)

Kriging type, andtheexploration of how wellthis Krigingtype performs compared with OrdinaryKrigingandtraditionalpolynomial-regression modeling. The main conclusion of our examples is:A perfectlyspecified detrendingfunctiongivesbestpredictions;Ordinary Kriging is

acceptable;theusuallinear regression givestheworstresults.

We organize the remainder of thispaperasfollows.First we sketch the history

of

Kriging

andits application in geology, metereology,anddeterministic simulation. Then we describe the

(24)

2.2 Kriging

2.2.1 History

of Kriging

Kriging isaninterpolationtechniqueoriginallydeveloped by D. G. Krige,aSouthAfrican miningengineer. Inthe 1950s hedevised this method to determine true ore-grades,based on

samples. Next,heimproved the methodincooperation with G. Matheron,aFrench

mathematician atthe'EcoledesMines'. At the same time,inmeteorology L. Gandin (in the formerSovietUnion)workedonsimilarideas,under thename'optimuminterpolation' (see Cressie, 1993).

Nowadays,Kriging isalsoapplied to 1/0 data

of

deterministic simulationmodels; we

referagaintoSacks etal. (1989)'s pioneering article. Manymorepublications followed; for

example, Meckesheim et al. (2001) give35references. AlsoseeKoehler and Owen (1996), and

Jones,Schonlau,andWelch (1998).

2.2.2 Basics

of Kriging

Kriging isanapproximation method that can give predictions

of

unknown values ofarandom function, random field, or random process. These predictionsarebest linear unbiased estimators, under theKrigingassumptionspresented in the next subsection.

Actually, these predictionsareweighted linear combinations of the observed values.

'Kdging assumes that the closer the input data are, the more positively correlated the prediction

errors are.Mathematically, this assumption is modeled throughasecond-orderstationary

(25)

Kriging for Interpolation in Random Simulation 15

thepredictionequals theobservedvalue. (In deterministicsimulationthis propertyiscertainly attractive, as wesaidabove.)

In Kriging, a crucial role is played bythevariogram:

A

diagram of the variance of the

difference between the measurements at two input locations; alsoseeFigure1.which has symbols explained in the next subsection. The assumption ofasecond-order stationary covarianceprocessimplies thatthevariogram isafunction ofthedistance (say)hbetween two locations. Moreover, the further apart two inputs are,thesmaller this dependence

is-until the

effectisnegligible.

2.2.3

Formal

model

for Kriging

A randomprocessZ(•) can be described by {Z(s) :s E D} where D i s a fixed subset of Rd and

Z(s) isarandomfunctionatlocation s D; see Cressie (1993, p. 52).

There are several types of Kriging, but we limit this subsection toOrdinary Kriging, whichmakes the

following

twoassumptions (already mentioned above, but not yet formalized):

(i) The model assumptionisthat the random process consists of a constant 11 and an error term O (s):

Z(s)=B+O (s) with seD,BER (1)

(ii)

Thepredictorassumption is thatthepredictor forthepoint so -denoted by

p(Z(so))-is

aweighted linearfunction of alltheobserved output data:

p(Z(So))=Elli,Z(st) with TIi,1,=1 (2)

To select the weights A, in (2), thecriterion is minimal mean-squared prediction error (say) a defined as

03 =E[{Z(so)-p(Z(so))}21 (3)

To minimize(3)given (2),theconstraint X 1,1, =l isadded to theobjective functionthrough

theLagrangianmultiplier m: Then wecanwritetheprediction error as

1

(26)

E[{Z(so)-63:.,A,z(s,)}2]-2m[Ili,1 -11. (4)

To minimize (4), weutilizethevariogram; alsoseeFigure 1. Bydefinition,thevariogram is

2, (h) = var[Z(s + h) -

Z(s)],

where h =s, -si asexplained by the stationary covarianceprocess

assumption with h e Rd and i,j - 1,..., n. Obviously, we have

var[Z(s + h) - Z(s)] = 23(st - s,) = 22(h).The spacing h is also called the lag. Aftersometedious manipulations, (4) gives

- X l,T "=,A;.1, »,-st)+261 1,·1, Aso -s,)-2mC X liA,-1). (5)

Differentiating (5) withrespect to Ai , · · · , A„ and m.givestheoptimal ,11, . . . , A„ :

f 1-1'r-'TY_

11= 7+1

11-1 and m=-(1-1'r-ly)/(1'r-'1), (6)

ir-1 1 3

where y denotes thevector

of

(co)variances (y(so-st)....,y(so-s„))',

r

denotes the nx n matrixwhose (i,j)th

element is )(s,-s,), 1 -(1, ..., 1)'is

thevectorofones; alsoseeCressie (1993, p. 122).

We emphasize thattheseoptimal Kriging weights 1 depend onthe specificpoint so that

is tobepredicted,whereaslinear-regression metamodelsusefixedestimatedparameters (say) B

Note further that some of the weights 1 may be negativez.

The optimal weights (6) givetheminimalmean-squaredpredictionerror: (3)becomes

(alsoseeCressie (1993, p. 122)

Cr; = Il,ly(so-st)+m

(1'r-'7-1)2 (7)

= y'r-'y

1, r -1 1

However, in (6) and (7) 2 (h) isunknown. Theusualestimator \s

(27)

Kriging for Interpolation in Random Simulation 11

29(17) =

-LY

(Z(st)-Z(s,))2 (8)

N(hM - Nth) \

where|N(h)| denotes thenumber

ofdistinct

pairs in N(11)={(s,.s,):st-sj=h; i, j =1,....n};

see Matheron (1962). The estimator in (8)isunbiased, iftheprocess Z(•) is indeed second-order

stationary; seeCressie (1993, p. 71).

Given (8)for differentIlhll values,thevariogramisestimatedby fittingacurve through

the estimated values 2f (11). This curve displays the following importantcharacteristics(see

Figurel):

(i) Forlarge values

of

1|h||,thevariogram 2 f(h) approaches aconstant C(0),called the

sit!'. 'Forthese large Ilhll values, all variances of the differences Z(s + h) - Z(s) are invariant with

respect to h.

To prove this property, we definethecovariogram C(h)=Cov(Z(s), Z(s + h)). Obviously, COV(Z(S),Z(S)) =Var(Z(s)).Then it is easy to derive

22 (h) = 2(((0) - C(h)). (9)

Because C(h) 1 0 as 11 h Ill' -, the variogram hastheupper

limit 2C(0).

(ii) The interval of Ilhll on which the curve does increase (to the sill), is called therange

(say) r;that is, C(11) < E for 11 h 11> r + r,. We shall giveaspecificmodel in (10).

(iii) Although(9)impliesgamma 3 (0) =0,thefittedcurve does not alwayspassthrough zero: It may have a positive intercept-calledthe nuggetvariance. This variance estimates noise.

(28)

For example, in geostatistics this nugget effects means that when going back to the 'same' spot, a completely differentoutput (namely, a gold nugget)isobserved.

We add that in random simulation, the same input (say, the same traffic rate in queueing

simulation)givesdifferentoutputsbecausedifferentpseudo-random numbers areused. Below we shallreturn to thisissue.

To fit

avariogram curve through theestimatesresulting from(8), analysts usually apply the exponential model

[co+ct(1-e-Alita) if h#0

Y(h) =4 (10)

[0 if h=0

where obviously co isthenugget, co + ct the sill, and athe range.However, other models are also fitted; for example,thelinear model

Y(h)=

4

[co+bllhll if h#0

(11)

[0 if h=0

where again co isthenugget;seeCressie (1993, p. 61).Actually,weshall apply (11) in our experiments, because it is the simplest model and yet gives acceptable results(forexample, it estimatesthenuggeteffect very well).

In deterministicsimulation.analysts use more general distance formulas than (8). For example, Sacks et al. (1989, p. 413)andJones et al. (1998, p. 5) use for the k-dimensional inputs

xi = (X,(„, ..., X'. k,)' and xj = (xjct,•..., x,(k,)' theweighted distanceformula

h(xi,xi)

=

E:.1 0: IX, g,-XM,

i

(12) where 19, (with d:

2 0)

measuresthe importance oftheinput xg, and pg controls the

(29)

Kriging for Interpolation in Random Simulation 19

2.3

Detrended

Kriging

OrdinaryKrigingwasdefined by (1), where v E R wastheconstant mean of the randomprocess Z(•).This assumption, however,limitstheapplication

of

OrdinaryKrigingtorather simple models ofthe process Z(•). A moregeneralassumption is that v is notaconstant, but an unknown linear combination

of

known functions

{fo(s), -:f„(s)}, se D.

Thisis

called Universal Kriging:seeHuijbregts and Matheron (1971, p. 160) and also Cressie (1993, p. 151).

Cressie (1993)discusses real(non-simulated) coal-ash data,andRegniereandSharov (1999)

discusssimulated spatialandtemporal output data ofarandomsimulationmodelforecological

processes.

Now we introduceanovel type

of

Kriging that wecall DetrendedKriging.Detrended Krigingpre-processestheoriginal data, and then appliesOrdinaryKriging totheresulting data so wecanapply softwarefor Ordinary Kriging.For UniversalKriging,however, software is available only

for

spatialandtemporal data, notfor simulation withanarbitrarynumber of inputs-to the best ofour knowledge3.

We assume that theprocess mean

*(s)

satisfiesthedecomposition

#(S)= S (S)+71(S) (13)

where S(s) is a known signal function (see, however, the text below (14)) and 11(s) isawhite noiseprocessthatmodels the measurement error; that is, 11(s) is normally identically and independentlydistributed with zero mean (NIID). So,we replace (1) by

ZCS) = SCS) + 11(S) + a(S). (14)

In practice,thesignalfunction S(s) in (14)isunknown. Therefore we estimate S(s)

through S(s), from the set

of

observed(noisy) I/O data {(st,Z(s, )). i =l, ..., n}. Because of the

assumedwhite noise, weuseordinaryleastsquares (OLS) to obtaintheestimator S(s).

Next we applyOrdinary Kriging tothedetrended set <(st,Z(s, ) - S(st ) ) :i= 1, . . . . n . Our

predictor forthe output

of

location so is the sum of thisOrdinaryKrigingprediction and the

3 Afterthis paperwasaccepted,theMatlabKrigingtoolboxofLophaven, Nielsen,andSpndergaardbecame

(30)

estimator S(so)

To test our new Detrended Kriging, we apply classiccross-validation:seeKleijnen and

van Groenendaal (1992, p. 156).Cross-validation eliminates one

I/0

combination,

say (sk I Z (s k )), from to the

original data set {(st,Z(s,)): i =l,..., n},so

the remaining n -1 data

combinations are {(st,Z(s,)): i -1,..., k -1, k +1...., n}.This new

setgivesaprediction

p( Z(sk )) ·Thisprocess

of

eliminationandpredictionisrepeated for (say) c different

combinations (c C n) .Obviously, if we sorttheoriginal set such that thefirst c observations are deleted one at a time, then we get k =1,2, . . . ,c.

To summarizetheresultingpredictionaccuracy, we use the I, norm of the difference

I 2 \B

vector li p(Z(sk))-Z(sk) 11 (the 4 norm 11x11 is defined as Ek-ixk 11 ). In our experiments

we find that the l, and

L-

norms givesimilarconclusions.

Note that inKriging,allpredictionerrors may be zero at the 1/0 points thatareactually

usedto estimatetheKrigingmodel. Therefore weusecross-validation.

2.4

Two

examples and

five

metamodels

Weareinterested intheapplication

of

Krigingtodiscrete-eventsimulationmodels, such as simulated queueing systems. As LawandKelton (2000»-thebestsellingtextbook on

simulation-states (on page 12),asingleserverqueueing system is quite representative of more

complex, dynamic, stochastic simulation models. Forfurther simplification,we suppose that the

output ofinterest is the meanwaiting time in thesteady state, E(W).Thisoutput can be

(31)

Kriging for Interpolation in Random Simulation 21

S(i) and A(i)areidenticallyandindependentlydistributed (IID),sosimulationisstraightforward.

The output E(W)isusually estimated through the simulationrun'saverage

iii _r, W(i)

6

with

OEb<n

(16)

Abn-b+1

where bdenotesthelength oftheinitialization(start-up, transient)phase(which may be zero), and n therun length. (InM/M/1 analysisandsimulationthrough renewal analysis, this initialization isnoissue;in practical simulations, however, it isamajor problem; see Law and

Kelton (2000,pp.496-552).) In other words,thedynamic simulation model generates the time series (15), but thisseries ischaracterized through the single statistic (16).

Actually, simulation is donefor sensitivityanalysis (possiblyfollowedbyoptimization). Such an analysis aims at estimatingtheinput/output (I/0)function(say)

E(Z(s)) =S(s) (17)

where-following

(14)-Z

denotes the(multiple)output and s the(multiple) input. In the M/M/1

example we have Z=W and s= 1/#

with arrival rate A andservice rate v.

In general, S(s) in (17)hasunknownshape andparameters. However, when studying the

performance ofaspecific simulation methodology,researchersoften use theM/M/1 simulation modelbecause the

I/0

function S(s) isthenknown-assuming thatthemethodology has

selected anappropriateinitializationlength b in (16)(obviously, knowledge of S(s) may not be used by the methodologyitself):

E(W) = 1 with

1<1.

(18)

BCM- A) B

Unfortunately,thelatter assumption is very questionable: it iswellknown that selecting an appropriate transient-phase length b and runlength n in (16)is

difficult.

Moreover, Krigingassumesthat-in general-thesimulation observations Z have additive white noise:see(14). In the M/M/1 example, (14) gives (i) Z=W, (ii)

S(s) = A/(v

(1-A)),(iii)

normality holds if W in (16) uses asample size (n - b +1) such that an

asymptotic central

limit

theorem holds,

(iv)

constant variances result

if

differentsimulatedtraffic

rates usedifferentandappropriate sample

sizes-see

Kleijnen and VanGroenendaal (1995)-and (v) independenceresults ifnocommon pseudorandom numbers areused.Altogether,

(32)

Hence, it is much moreefficientandeffectivetogenerateKriging testdatathrough sampling from (13) with S(s)=1/(51(1

-A))

instead of (15) and (16). Indeed,ourapproach requireslesscomputer time,andguarantees that thewhitenoise assumption holds,including the desiredvalue forthevariance ofthewhitenoise.The alternative using (15) and (16)would require very longruns.especially for hightrafficrates A/# T 1 this alternativerequires n T. .

In conclusion, to testtheKrigingmethodology wegenerate datathroughastatic, random Monte Carlo model like(13) instead ofadynamic stochastic simulation model such as (15)

combined with (16). So,theMonteCarlotechnique is bothefficientandeffective.

Besides Example I, we study Example II representing simulationswith multiple local maxima, which are interesting when optimizing simulation outputs. Example I represents queueing simulations that show 'explosive' meanwaitingtimes as thetrafficrateapproaches the valueone.Example II has no specific interpretation.

We samplethewhite noise-term 11(s) in (14) throughtheMatlabfunctioncalled 'randn', whichgivesstandard

NIID

variates; that is, 11(s) has zero mean andunitvariance. We also experiment withalarger variance namely 25; this results in larger error terms, but not in other

conclusions.

To estimate possible values of the L2 norm (defined above), we use 100

rnacro-replications.Fromthesemacro-replications we estimate

4's

median, 0.10 quantile Qo 1, and 0.90 quantile Q09.

In both examples we take n =21 equallyspacedinput values: s, with i = l,..., 21. For

cross-validation we select (ratherarbitrarily)

c=5

inputsvalues:We

eliminate i=2,8,9,1 5,1 6

respectively. We comparedthefollowing fivemetamodels. i) OrdinaryKriging

ii)

second-degree detrending: S(s) isasecond-degreepolynomial

iii)

perfectlyspecified detrendingfunction: S(s) isahyperbolainExample I, andafourth degreepolynomialinExample II

iv)

fifth-degreedetrending: S(s) isafifth-degreepolynomial (overfitting)

(33)

Kriging for Interpolation in Random Simulation 23

Note that wealsostudyaperfectly specified detrending function (see

iii),

becausethisfunction providesautopian situation:This functiongives theminimumprediction error;inpractice, this functionis unknown (otherwisesimulationwould not beused).Furthermore,

it

helps us toverify the correctness ofour computer program.

2.4.1 Example I:

M/M/1 hyperbola

We take S(s) =

s/(1-s) on D=[0.01,0.99] c R:

this hyperbolic function may represent the

meansteady-statewaiting time foratraffic rate s in anM/M/1 queueingsystem.This function gives Figure2,which also displaysanexample ofthenoisy

output Z(s).

Theinputlocations are

s, c <0.01,0.05,0.10,..., 0.95,0.99}. The cross-validationiscarried out at s = 0.05,0.35,0.40,0.70 and 0.75. 25 20 S(S)

15 1-Cs)

0 eliminated data 10

5

0 . 70-7%1' a:--*-::0::'$ 4 '4 0 0.2 0.4 0.6 0.8 S . -5

Figure 2: S(s) = s/(1 - s) andexample sample output Z(s)

(34)

Table 1:Prediction accuracy: estimated quantiles ofLedistribution forExample I (M/M/1 ) metamodel

L2 i ii iii iv V

00 i 1.2429 1.3622 1.1972 2.7411 17.593 median 1.8832 2.1522 1.8419 3.6678 18.17 00, 2.5698 2.925 2.5829 4.4677 18.652

2.4.2 Example

II:

fourth-degree

polynomial

We take thefollowingspecificpolynomial: S(s)=-0.0579s4 + 1.11 s' - 6.845

sl

+14.1071 s+2

on D = [0,10] c R'.

Thispolynomial has two maxima:

A

local one andaglobal one; see Figure 3. We obtainoutput forthefollowing21 input locations st € 0,0.5,1,...,10}.We

cross-validate at s = 0.5,3.5,4,7 and 7.5. 15 10

-2

S(s)

a

\

. . . ... ..CO . .0 5 5 -:-I.A ...Z(s) 0 i i I eliminated data

-5 0 2 4 6 8 s 1 0

-10 -15

Figure 3: S(s)=-0.0579 s4 + 1.1 l

s' -6.845

s2+14.107ts + 2 andexample sample output Z(s)

(35)

Kriging for Interpolation in Random Simulation 25

Table2:Example II ; alsoseeTable 1

metamodel

4 i ii iii iv V

Qo.i 1.5976 1.515 1.2094 1.2173 5.5965 median 2.4713 2.41 1.8748 1.9117 6.0363 00.9 3.3226 3.246 2.6424 2.6959 6.5048

2.5

Nugget effect

We also wish to better understandtherelationship betweenthenugget effect in (11) and the

variance of the noise 11(s) in (13). Therefore we performasimpleMonteCarloexperiment: We take Z(s) =1 0+11(s) where 11(s) is NIID with v =0 and 0 2= 1,4,9,16, and25respectively. We sample two macro-replicates, setting the seed

of

Matlab's

'randn'-rather arbitrary-to the

values 10 and 20. In the variousKrigingmetamodels, we fitthelinear variogram of (11); see Figure 4 (we display results for theseedvalue of 10 only; notethedifferentscales for the y-axis in thefourplots).

6 · 25

seed: 10

5 seed: 10sigma: 1 20· sigma: 2

4. 15- 3-• 10- • 2 . .. 1 • 5-0. · 0 100 50· seed:10 90 seed: 10 sigma: 3 80- Sigma: 4 40· 70-30- · 60 50· * 40 20- * . · • .. 30· • 10 + 20 10 0· · · 0· · · · · 02 46 8 10 0 2 46 8 10

(36)

The intercept in (11) estimatesthenugget effect; this intercept ispresentedfor different 02 values inTable3.Obviously,these resultsconfirmour conjecture: The nugget effect is the variance ofthenoise.

Table3:Estimated nugget effectsfor different whitenoise variances a 2 Q 2 seed10 seed 20 1 1.1 0.9

444

9 9.6 8.5 16 17.1 15.5 25 26.5 24.1

2.6 Conclusions

and

future

research

We assume that in practice the mean v oftheKrigingmetamodel (1) is notaconstant, but is a composition ofasignalfunctionandwhitenoise.Wepresentedresults fortwo examples of inpuUoutputbehavior oftheunderlying random simulation model:Ordinary Kriging and Detrended Kriginggivesquite acceptable predictions, whereastraditional linearregression gives theworstresults.

0LS predictssopoorly,becauseOLSassumes thatthe

fitting

errorsarewhitenoise,

whereasKrigingallows errors thatarecorrelated; morespecifically,theclosertheinputs are, the

more positive correlation. Moreover, 0LS usesasingle estimatedfunction forallinputvalues,

whereasKrigingadaptsits predictor as the input changes. Note that OLS maybeattractive when looking foran

explanation-not

a

prediction-of

thesimulation's

I/0

behavior;forexample, whichinputs are most important; doesthesimulation showabehavior thattheexperts find acceptable (alsoseeKleijnen, 1998)?

0LS isalsocomparedwith(universal) Krigingby RegniereandSharov (1999).Their

(37)

Kriging for Interpolation in Random Simulation 21

Further, we found that tile nugget effectequals thenoise variance.

We restrictedour examples toasingle input. Therefore we gaveeach

weight in the

more general distance formula (12),thefixed value of one. In designoptimization,however,

theseparameters are used to controltheimportance of theinputvariable x ; seeforexample

Simpson et al. (2001, p. 8)andJones et al. (1998, p. 5). Infuture workweshall investigate multipleinputs.

Further, we shall relaxtheassumption

of

whitenoise:We shall investigatetheeffects of non-constant variances(whichoccurin queueing simulations), common random numberusage (whichcreatescorrelations amongthesimulation outputs),andnon-normality(Kriging uses maximumlikelihoodestimators of the weights

4,

whichassumesnormality). Finally,we shall applyKrigingto practical queueingandinventory simulations.

Acknowledgment

We thank the two anonymousreferees

for

their most useful comments onanearlier version.

References

Campolongo, F., J.P.C. Kleijnen, and T. Andres (2000), Screening methods. In: Sensitivity Analysis, edited by A. Saltelli, K. Chan, and E.M. Scott, Wiley, Chichester (England), pp.

65-89

Cressie, N.A.C. (1993).StatisticsforSpatial Dam,Wiley, New York

Huijbregts, C.J.andMatheron, G. (1971). UniversalKriging (An optimalmethodforestimating and contouring in trend surface analysis). In Proceedings of Ninth International Symposium on Techniques for Decision-making in the Mineral Industry

Jones, D.R., M.Schonlau, and W.J. Welch (1998).Efficient global optimization

of

expensive

black-box functions.Journal of Global Optimization.13. 455-492

(38)

Kleijnen, J.P.C. and W. vanGroenendaal (1995), Two-stageversussequential sample-size determination in regression analysis of simulation experiments.American Journal of Mathematical and Management Sciences, 15, nos. 1&2, 1995, PM 83-114

Kleijnen, J.P.C. (1998),Experimental designforsensitivity analysis,optimization,andvalidation of simulation models. In: Handbook of Simulation. edited by J. Banks, Wiley, New York, pp.

173-223

Kleijnen, J.P.C.andJ.Helton (1999), Statisticalanalyses

of

scatter plots toidentify important factorsin large-scale simulations. ReliabilityEngineering andSystemsSafety, 65, no. 2, pp.

147-197

Koehler, J.R. and A.B. Owen (1996), Computer experiments. In: Handbook OfSmtistics, Volume13,edited byS.GhoshandC.R.Rao, Elsevier, Amsterdam, pp.261_308

Law, A.M. andW.D.Kelton(2000), Simulation modeling and analysis,

third

edition, McGraw-Hill, Boston

Matheron, G. (1962) Traite de Geostatistique Appliquee, Memoires du Bureau de Recherches

GeologiquesetMinieres, No. 14,Editions Technip,Paris,(pp.57-59)

Meckesheimer, M., R.R. Barton,T.W.Simpson, and A.J. Booker (2001),Computationally inexpensive metamodelassessmentstrategies.American /nstitute OfAeronautics and AstronauticsJournal(submitted)

Rdgnitre, J. and A. Sharov (1999),Simulatingtemperature-dependent ecologicalprocesses at the

sub-continental scale: male gypsy moth flight phenology.International Journal of Biometereotogy, 42, 1999,pp. 146-152

Sacks,

J.,

Welch, W.J., Mitchell, T.J. & Wynn, H.P. (1989), Designandanalysis

of

computer

experiments. Statistical Science, 4. no.4.pp.409-435

Simpson, T.W., Mauery, T.M., Korte, J.J., and Mistree, F.(2001).Kriging Metamodels for Global Approximation in Simulation-Based Multidisciplinary Design Optimization. American Institute of Aeronautics and Astronautics Journal (submitted)

(39)

Chapter 3

Robustness

of Kriging

when

Interpolating in

Random

Simulation

with

Heterogeneous

Variances:

Some

Experiments

Abstract

Thispaper investigates the use

of

Krigingin random simulation whenthesimulation output variances are not constant. Kriginggivesaresponsesurfaceor metamodel that can be used for interpolation.BecauseOrdinaryKrigingassumesconstant variances, thispaperalso applies DetrendedKrigingtoestimateanon-constant signal function, andthenstandardizestheresidual noise throughtheheterogeneous variances estimatedfromreplicated simulationruns.Numerical examples, however,suggestthat OrdinaryKriging isarobustinterpolationmethod.

3.1 Introduction

Given a set

of

inpuVoutput (I/O)data Kriging fitsametamodel (also calledresponsesurface,

emulator, auxiliary model, etc.). Kriging isaninterpolationmethod; that is, at the observed 1/O valuesits predictionsareexactly equal totheobservedoutputvalues.However, classicleast

(40)

squares

(LS)-used

in

regression-typically fits

alow-degreepolynomial,which results in

non-zero

fitting

errors (or residuals) such that the sum

of

these squarederrorsisminimized.

Originally,theSouthAfrican miningengineer D.G.Krigemodeled realworld I/0 data to predictgold quantities;seeCressie (1993). Later on,Krigingwasapplied todeterministic simulation 1/Odata(modelingtheperformance

of

computer chips, cars, etc.);seeSacks et al. (1989). More recently, the application

of

Krigingtorandom simulation(forqueuing problems, etc.)wasproposedbyBarton (1994), andwasinvestigatedindetail by VanBeersandKleijnen (2003). The latter authors suggested that it maybebetter to replace classicOrdinary Kriging by theirnovel method called DetrendedKriging, whichcorrectsor'detrends' theoriginaloutput data throughanestimated signal function (see S(s) below). However, both Ordinary and DetrendedKrigingassume that the outputs have constant variances.

In practice, this constant variance assumptionisoften completelyunrealistic. For example,theso-calledM/M/1queue isanimportant buildingblock in discrete-eventsimulation; see LawandKelton (2000). For theM/M/1,Cohen (1969) provesanalytically thatthe

steady-statewaiting time (say) Z foratraffic load s with 0<s<1 has mean

E(Z(s))=-1 (1)

1-s

and variance

\ 3(2 - s)

Var(Z(s))= (2)

(1 -s) 2

sothis varianceisdefinitelynotconstant whenthetrafficloadchanges.Therefore we now investigate the consequences

of

varianceheterogeneitywhen applying Kriging to the I/0 of random simulation.

Whereas linear-regression analysis

of

low-order polynomialmetamodels has been applied extensively in discrete-event simulation (such as queueing simulation), Kriging has hardly been applied to random simulation.

The mainconclusion of our (limited)experiments will be thatOrdinaryKrigingseems a

(41)

Robustness of Kriging when Interpolating in Random Simulation with 31

Heterogeneous Variances: Some Experiments

The remainder of this paperisorganizedasfollows.Section3.2summarizesthebasics of Kriging.Section3.3 discussessolutions fortheproblems

of

non-constantandexploding

variancesinrandomsimulation. Section3.4proposesanovelKrigingmethod, which estimates a non-constant signal function andthenon-constant variances. Section3.5 presentsnumerical examples, whichsuggestthat OrdinaryKriging isarobust prediction methodwhereasthenovel Krigingmethodisbetter only in utopian situations. Section3.6gives conclusionsandproposes

futureresearchtopics.

3.2

Kriging

basics

Krigingassumesthefollowingmodel:

Z(s) = S(s) + '1(s) with 11(s)

-NID(0, 02 (s)) (3)

where S(s) is the so-calledsignal function and4(s) the additivenoise; we use the notation x - NID(a, b) when x is normally and independently distributed (NID) with mean a and variance b. Below, we shall discuss the classic assumption

of

white noise,which means that

02(s) in(3) reduces toaconstant (say) 0 2.

Note: The model in (3) maygeneratenegative output values,whereasM/M/1generates

non-negative valuesonly.Nevertheless, we use(3)because

it

provides bettercontrol over our experiment and itisfaster thanM/M/1 simulation; see VanBeers andKleijnen (2003).

Ordinary Krigingassumesastationary covariance process

for

Z(s) in (3) (see Cressie

1993).That assumption implies thattheexpected values E (Z(s)) areconstant, and the

covariances of Z(s, + h) and Z(si) depend only on the distance (or lag) h.The predictor for the

point so -denoted by p(Z(so))-is

aweightedlinearcombination of all the (say)nobserved output data:

(42)

To selecttheweights /1, in(4), Krigingminimizes the mean-squared prediction error. The

technique uses thevariogram,defined as 23 (11) =var(Z(s + h)-Z(s)), see Figure l discussed below.The optimal weights turn out to be

" -t' +1 1 i,;, i, j r-1 (5)

where 7 denotesthe vector (7(so -s, ) . . . . , y(so

-s„))',r

denotes the n x n matrix whose (i,

j)*

elementis 2 (s, -sj),and 1 denotesavectorofones; alsoseeCressie (1993, p. 122). Note thattheweights A, in (5) vary withthepredictionpoint,whereaspolynomial-regression metamodels use thesameestimated metamodel forallthesepoints.

3.3

Random simulation

with

heterogeneous variances

Inthispaper,weallowheterogeneous variances;hence, theKrigingassumption ofastationary covariance process does not hold. Therefore, we do not expectasmooth variogram. Furthermore,

Van BeersandKleijnen(2003) found thattheintercept ofthefittedlinear variogram estimates theso-callednuggetefect. The term 'nugget' originated in the Kriging of gold mining data:

when going back to the 'same' spot (lag h 1 0; see the text above equation 4),acompletely different output-namely, agold nugget-maybeobserved. In general,Kriginginterprets a nugget effect asanestimate

of

noise;forexample, measurement noise in geostatistics. Random simulationmodels produce noisy

data-by

definition. Inthispaperwe shall investigate whether

the intercept ofthe variogramstillestimatesanugget effect in case

of

heterogeneous variances. Moreover, in simulation the variance may besolarge that simulation is actually impractical: as the traffic load approaches the value one, the varianceexplodes-as the M/M/1 illustrates; see (2). Therefore ChengandKleijnen(1999) recommend that

(i) in suchasituation we should simulate a load much lower than one,andextrapolate;

(43)

Robustness of Kriging when Interpolating in Random Simulation with 33

Heterogeneous Variances: Some Experiments

Likewise,Asmussen (1992) recommends not tosimulate the M/M/1 foratraffic rate s > 0.9; he too suggests to extrapolate inthose cases.VanBeers andKleijnen(2003),however, simulated

loads up to

0.99-but

theyart(ticially keptthevariances constantandsmallovertheirwhole range

of

simulated values, 0.01 5 3 5 0.99.

Initially,we sampleasingle output value for each of21 input (trafficrate)values, as

follows. We use (3) where we insert (1) and (2),whichmodel the signal and the variance functions of the M/M/1. We do so for21 traffic rates s e *101,0.05,0.10,...,0.95,0.99}. We

ignorethevariance heterogeneity andthevariance explosion,andproceed as in Van Beers and

Kleijnen (2003).We thengetestimated variogramslikeFigure 1.Thisfigure showsa

wildly

oscillatingvariogram; obviously, thelinearapproximation is poor. This poor

fit

gives an inaccurateKrigingmetamodel withtheweights A, defined in (5).

15000 10000 -21(h)

...

5000- , I ..-I-... 0 01 0.2 0.3 04 05 0.6 0 7 0 8 0. 1 h

Figure1: Estimated variogram for Z(s) = s/(1 - s) + 11(s) with 11(s) - NID ), 3(2 - s)/(1 - s)2 )

Figure 1 illustrates that in random simulation we should ensure thatthesignaYnoise

ratio

is 'adequate'.Torealize suchasignal/noise ratio, we should-first of all-avoid variance explosion; that is, we should simulateanon-saturated trairic load: that is. a load much lower than

(44)

Comparison withthevalues onthey-axis

of

Figure 1 shows thattherange ofthevariogram is

much smaller,sovarianceshaveindeedbeen decreased.

. .. 14- * 12- 10-.. 2,(h) ' 6. -,

...

0 0.1 02 03 04 05 06 07 0. 09 1 h

Figure2: Variogram afterelimination of s = 0.99 in Figure 1

Besidesavoidingvariance explosion, we may reducethemagnitudes oftheremaining variances through increasingthenumber ofreplicates(say) m-whenever the noise of a single

replicate is too big. We denotethenumber

of

identicallyandindependentlydistributed (IID) replicatesfor input value s by m(s). Thenoneoption istoselectthese m(s) such that the

variances ofthe average responsesbecomeaconstant (say) 02:

t- 1 Var(Z(s))

Var(Z(s))= , = 02, (6)

m(s)

see Kleijnen and Van Groenendaal (1995).

Note:Insteady-state

simulation-as

opposed toterminating simulation-we may make a single long runandpartition that runintosubruns that play the roleofreplicates; see Law and Kelton (2000).

In practice, however,theexperimental designimplied by (6) maybeimpractical (see

(45)

Robustness of Kriging when Interpolating in Random Simulation with 35 Heterogeneous Variances: Some Experiments

averages

Z(s).

These averages are'the' observations attheinputs s towhich we fitaKriging metamodel,asfollows.

3.4

Novel

Kriging

method:

Studentization

To solvethevariance heterogeneity problem, wefirstapply VanBeersandKleijnen (2003) 's

DetrendedKriging,whichassumes

Z(s)= SCS)+11(s)+ a (s) (7)

where S(s) is the signal function, 11(s)- NID(9,02) is the whitenoise model for measurement

error, and b (s) denotesfittingerror. However,unlikeVanBeers andKleijnen (2003) we now

allow 11(s) to haveheterogeneousvariancesdetermined bytheinput s; see (3).

Unbiased estimators

of

these

variances-not

depending onametamodel such as (7»are

theclassic estimators

I ", 4,(s,) - 2(s, ,)'

a2 (st)= "' (8)

m(st)-1

where Z,(s,) denotes the

r

replicated observation on Z(s,) and

zcs,)= Ir:.)z.(st)

(9)

m(si)

denotesits sampleaverage.

LikeVanBeers andKleijnen (2003), wemay estimatethesignalfunction S(s) through

(46)

unequal variances, so the best linear unbiased estimator (BLUE) is given by weighted least

squares(WLS)-not ordinary least squares (OLS).

Next,weapplyOrdinaryKriging tothedetrendedandstandardized observations

Z (S)=Z(S) - S(S) (10)

0(s)/ fm

Because (10) resembles Student'stStatistiC, we callthetransformation in(10) 'Studentizing'. If we neglect the random character of S(S), then we conjecture thatthetransformed data

Z(s) in (10) haveaconstantmean(namely zero) andaconstant variance, namely

IN(m -2).So

2(s) then satisfies the Ordinary Kriging assumptions.

Finally,the predictor for so (say) p(Z(so)) follows from (10):

d(so)

p(Z(S )) = P(Z(sonx--+

SWA, (11)

qm

where p(Z(so)) isthe OrdinaryKriging predictor for so based onthe transformed data in (10);

d (so) isgiven by (8) if so

-st;

otherwise, weusepiecewise linear interpolation between the variances at the two neighboring points thathavealreadybeenobserved. (This interpolation avoids negative estimated variances.)

3.5

Numerical

examples

Asthesignal jiinction in (3)wefirstselect S(s) = s/(1 -s), whichequals theM/M/1hyperbola in (1). FollowingVanBeers andKleijnen (2003),weexperiment with fivedetrending models,

Referenties

GERELATEERDE DOCUMENTEN

Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers) Please check the document version of this publication:.. • A submitted manuscript is

Gezien het aangetroffen aardewerk in één enkele kuil gevonden werd, er buiten dit spoor geen andere gelijkaardige vondsten werden gedaan en er verder in het

Door het feit dat dit areaal in het verleden reeds verstoord werd door de aanleg van een ondergrondse parkeergarage werden in deze zone geen archeologische sporen of

- Produce high performance cells and thus be able to lower the working temperature of the SOFC, moving from an electrolyte supported cell to anode supported

Er wordt behandeld hoe deze metamodellen geschat worden en hoe er datapunten in de dataset worden toegevoegd). Aansluitend worden de resultaten van de schattingen met behulp van

In the present EEG study we used a fake brain stimulation device in a placebo (i.e., cognitive enhancement), nocebo (i.e., cognitive impairment) and a control condition (i.e.,

Since this research investigates the role of the market competitiveness in the relationship between temporal orientation and firm performance, it is important to

The EPC aims to provide a smooth evolution of past and present network technologies towards a common core. This results in seamless mobility between the different generations