
Tilburg University

A novel item-allocation procedure for the three-form planned missing data design

Lang, Kyle M.; Moore, E. Whitney G.; Grandfield, Elizabeth M.

Published in:

MethodsX

DOI:

10.1016/j.mex.2020.100941

Publication date:

2020

Document Version

Publisher's PDF, also known as Version of record

Link to publication in Tilburg University Research Portal

Citation for published version (APA):

Lang, K. M., Moore, E. WG., & Grandfield, E. M. (2020). A novel item-allocation procedure for the three-form planned missing data design. MethodsX, 7, [100941]. https://doi.org/10.1016/j.mex.2020.100941


Method Article

A novel item-allocation procedure for the three-form planned missing data design

Kyle M. Lang a,∗, E. Whitney G. Moore b, Elizabeth M. Grandfield c

a Tilburg University Department of Methodology and Statistics, the Netherlands

b Division of Kinesiology, Health & Sport Studies, Wayne State University, United States

c University of Kansas Medical Center, United States

abstract

We propose a new method of constructing questionnaire forms in the three-form planned missing data design (PMDD). The random item allocation (RIA) procedure that we propose promises to dramatically simplify the process of implementing three-form PMDDs without compromising statistical performance. Our method is a stochastic approximation to the currently recommended approach of deterministically spreading a scale’s items across the X-, A-, B-, and C-blocks when allocating the items in a three-form design. Direct empirical support for the performance of our method is only available for scales containing at least 12 items, so we also propose a modified approach for use with scales containing fewer than 12 items. We also discuss the limitations of our procedure and several nuances for researchers to consider when implementing three-form PMDDs using our method.

● The RIA procedure allows researchers to implement statistically sound three-form planned missing data designs without the need for expert knowledge or results from prior statistical modeling.

● The RIA procedure can be used to construct both “paper-and-pencil” questionnaires and questionnaires administered through online survey software.

● The RIA procedure is a simple framework to aid in designing three-form PMDDs; implementing the RIA method does not require any specialized software or technical expertise.

© 2020 The Authors. Published by Elsevier B.V.

This is an open access article under the CC BY license. (http://creativecommons.org/licenses/by/4.0/)

article info

Method name: Random Item Allocation for Three-Form Planned Missing Data Designs

Keywords: Planned missing data, Survey design, Matrix sampling, Questionnaires

Article history: Received 1 April 2020; Accepted 22 May 2020; Available online 28 May 2020

DOI of original article: 10.1016/j.psychsport.2020.101701

Corresponding author.

E-mail addresses: k.m.lang@tilburguniversity.edu (K.M. Lang), WhitneyMoore@wayne.edu (E. Whitney G. Moore).

https://doi.org/10.1016/j.mex.2020.100941

Specifications Table

Subject Area: Psychology

More specific subject area: Psychological Research Methods

Method name: Random Item Allocation for Three-Form Planned Missing Data Designs

Name and reference of original method: Three-Form Planned Missing Data Design. Graham, J. W., Hofer, S. M., & MacKinnon, D. P. [4]. Maximizing the usefulness of data obtained with planned missing value patterns: An application of maximum likelihood procedures. Multivariate Behavioral Research, 31, 197–218.

Resource availability: NA

Method details

This article is a companion to Moore et al. [17] and serves two purposes. In the first part of this article, we discuss a novel implementation of the three-form planned missing data design, the random item allocation (RIA) approach, that was shown to perform well in Moore et al. [17]. The RIA approach promises to substantially simplify the process of implementing planned missing data designs in practice. In the second part, we provide additional details of the methodology of the resampling study reported in Moore et al. [17].

Before proceeding, we provide a brief overview of planned missing data designs (PMDDs) to contextualize the following content. PMDDs are a type of matrix sampling approach wherein researchers intentionally administer incomplete questionnaires to participants. Each participant sees only a subset of the full set of items in the researcher’s study. The items that participants do not see become missing values in the final dataset. These missing data are missing completely at random (MCAR) since the researcher defined the missing data patterns a priori (i.e., without consideration for any of the variables in the analysis) and randomly assigned participants to the missing data patterns. Consequently, the planned missing data introduced by a PMDD are easily treated with principled missing data methods like multiple imputation or full information maximum likelihood.

The most common type of PMDD, the three-form design, entails splitting the questionnaire items into four blocks: an X-Block containing items each participant will see and A-, B-, and C-Blocks that contain items only two thirds of the participants will see. After allocating the items to blocks, the researcher creates three questionnaire forms by combining the X-Block items with the items from two of the A-, B-, or C-Blocks. Therefore, in terms of the blocks they comprise, the final set of questionnaires is XAB, XAC, and XBC. For more details on PMDDs, we refer interested readers to Graham [3]; Graham, Hofer, and MacKinnon [4]; Graham, Taylor, Olchowski, and Cumsille [5]; or Little and Rhemtulla [11].
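The way the four blocks combine into three forms can be sketched in a few lines of Python. This is only an illustration: the item labels, the function names `build_forms` and `assign_forms`, and the use of Python's `random` module are assumptions introduced here, not part of the published method.

```python
import random


def build_forms(blocks):
    """Combine the X-Block with each pair of the A-, B-, and C-Blocks."""
    x = blocks["X"]
    return {
        "XAB": x + blocks["A"] + blocks["B"],
        "XAC": x + blocks["A"] + blocks["C"],
        "XBC": x + blocks["B"] + blocks["C"],
    }


def assign_forms(n_participants, seed=None):
    """Randomly assign each participant to one of the three forms."""
    rng = random.Random(seed)
    return [rng.choice(["XAB", "XAC", "XBC"]) for _ in range(n_participants)]


# Hypothetical item labels, for illustration only.
blocks = {
    "X": ["age", "sex", "x1"],
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
    "C": ["c1", "c2"],
}
forms = build_forms(blocks)
```

Every X-Block item appears on all three forms, while each A-, B-, or C-Block item appears on exactly two of them, which is why those items are observed for roughly two thirds of participants.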

PMDD item allocation procedures

When researchers implement a PMDD, one of the more difficult decisions they must make is how to allocate items across blocks. This problem has two facets: (1) how to distribute the items between the A-, B-, and C-Blocks, and (2) which items to include in the X-Block. Previous research has suggested that the items within (sub)scales should be divided among the A-, B-, and C-Blocks to maximize covariance coverage between scales [4,7]. The results presented by Moore et al. [17] corroborate the performance of this approach (hereafter the “between-block” assignment method). The natural alternative to the between-block assignment method would be to allocate all the items of a (sub)scale to either the A-, B-, or C-Block. This approach (hereafter the “within-block” assignment method) should not be used when modeling associations among variables because it reduces covariance coverage [7].

Fig. 1. Flowchart describing the logic of the RIA procedure. Note: P = Number of scale items to distribute.

The results of Moore et al. [17] suggest a much simpler solution, however. Randomly assigning items to the X-, A-, B-, and C-Blocks does not appear to produce any deleterious effects, at least when the number of items in each scale is reasonably large (i.e., 12 or more items). Moore et al. [17] showed that:

1. Randomly allocating the scale items to the A-, B-, and C-Blocks (without accounting for scale membership) performed just as well as explicitly splitting the items between blocks.

2. Assigning a random subset of the scale items to the X-Block (without accounting for scale membership) performed as well as (or slightly better than) theoretically informed X-Block assignment.

Taken together, these two findings imply that researchers can construct an optimal three-form PMDD by simply deciding how many scale items they wish to include in the X-, A-, B-, and C-Blocks and randomly allocating the scale items to satisfy the desired counts (while assigning all demographics to the X-Block). We call this approach the “random item allocation” (RIA) procedure.
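As a rough illustration of the idea, the sketch below shuffles a pooled list of scale items, places a chosen number in the X-Block, and deals the rest into the A-, B-, and C-Blocks. It is not a transcription of Fig. 1; the function name `random_item_allocation`, the item labels, and the choice to deal items in turn are assumptions made for this example.

```python
import random


def random_item_allocation(items, n_x, seed=None):
    """Randomly allocate pooled scale items to the X-, A-, B-, and C-Blocks.

    n_x is the number of scale items the researcher wants in the X-Block.
    """
    if not 0 <= n_x <= len(items):
        raise ValueError("n_x must be between 0 and the number of items")
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    x_block, rest = shuffled[:n_x], shuffled[n_x:]
    # Deal the remaining items in turn so block sizes differ by at most one.
    return {"X": x_block, "A": rest[0::3], "B": rest[1::3], "C": rest[2::3]}


# Hypothetical labels for three scales with 13, 14, and 13 items (40 in total).
items = (
    [f"ego{i}" for i in range(1, 14)]
    + [f"task{i}" for i in range(1, 15)]
    + [f"care{i}" for i in range(1, 14)]
)
blocks = random_item_allocation(items, n_x=10, seed=2020)
```

With 40 items and 10 of them in the X-Block, the remaining 30 split into three blocks of 10. Scale membership is ignored on purpose, mirroring the two findings listed above.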

In lieu of the three steps shown in Fig. 1, current recommendations dictate first assigning scale items to the X-Block using expert knowledge and/or the results of prior statistical modeling, and then allocating the remaining scale items across the A-, B-, and C-Blocks so that items from the same scale are spread across blocks [7,11]. The RIA procedure does not require expert knowledge, previous results, or explicitly balanced assignment, so RIA substantially simplifies the process of creating and implementing PMDDs.

Implementation details

Although the RIA procedure appears to work well based on the findings of Moore et al. [17], researchers considering a PMDD should be mindful of certain nuances in the way PMDDs must be implemented with RIA. First, we recommend choosing the number of scale items assigned to the X-Block, P_X, so that the remaining number of items, P – P_X, is evenly divisible by three (for the three-form design). Doing so will ensure that the length of each final questionnaire form is equal. Second, although the RIA procedure involves randomly allocating scale items to the X-Block, the X-Block should not necessarily contain only these randomly assigned scale items. Variables in the A-, B-, and C-Blocks will be partially missing in the final dataset, so any items for which missing data is especially undesirable should go into the X-Block. A few common examples of such items include:

1. Demographic variables.

2. Important covariates.

3. Auxiliary variables (i.e., covariates that are used for missing data treatment).

4. Any items for which missing data will be especially difficult to address (e.g., outcomes with unusual distributions).

Additionally, it may be worth including any important individual items (e.g., important, univariate predictors or outcomes) in the X-Block. PMDDs work best when they can use strong within-scale associations to support missing data treatment (hence the preference for between-block assignment), and univariate items clearly cannot leverage within-scale associations.
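The divisibility recommendation for the X-Block size is easy to check numerically. The helper names below (`valid_x_block_sizes`, `form_length`) are hypothetical and only illustrate the arithmetic; they count scale items and ignore demographics or other fixed X-Block content.

```python
def valid_x_block_sizes(p):
    """X-Block sizes for which the remaining P - P_X items split evenly three ways."""
    return [p_x for p_x in range(p + 1) if (p - p_x) % 3 == 0]


def form_length(p, p_x):
    """Scale items per form: the X-Block plus two of the three equal-sized blocks."""
    if (p - p_x) % 3 != 0:
        raise ValueError("P - P_X must be evenly divisible by three")
    return p_x + 2 * (p - p_x) // 3


sizes = valid_x_block_sizes(40)   # [1, 4, 7, ..., 40]
length = form_length(40, 10)      # 10 + 2 * 10 = 30 items per form
```

For a pool of P = 40 items, any X-Block size that leaves a multiple of three works, and choosing P_X = 10 gives three forms of 30 scale items each.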

Caveats, limitations, & extensions

The RIA procedure entails randomly assigning items to blocks, but not every method of randomly allocating items to blocks constitutes an implementation of what we are calling RIA. Many web-based survey programs (e.g., Qualtrics) will generate a novel questionnaire for each participant by randomly sampling from a pool of items. This “on-the-fly” approach to item allocation has been suggested in the literature (e.g., [11]), but we are not aware of any empirical evaluation of its performance. Furthermore, the results of Moore et al. [17] do not directly apply to “on-the-fly” item randomization because the RIA procedure we implemented in this study represents a different type of randomization. For each replication in our study, we generated a new set of (three) questionnaire forms via RIA, but every hypothetical “participant” in our study saw only one of those three forms. The situation modeled in our study, therefore, is one wherein a researcher generates a fixed set of three forms via the RIA procedure and does not update the structure/contents of those forms during data collection (either manually or via the sampling software). The “on-the-fly” item randomization approach is a logical extension of the procedure tested in our study, not an equivalent alternative. Increased computational complexity of the resulting missing data problem is one potential drawback of the “on-the-fly” approach. Randomly generating a potentially unique questionnaire form for each participant will increase the number of missing data patterns relative to the three-form design we explore in this study. Although “on-the-fly” randomization will generally produce more missing data patterns, these missing data will still be MCAR and, therefore, easily treated, so we conjecture that the “on-the-fly” approach would perform well in practice. The veracity of this conjecture is currently under investigation, however, so the results of Moore et al. [17] should not be taken as direct empirical support for “on-the-fly” item randomization.
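The difference in the number of missing data patterns can be seen in a small simulation. The sketch assumes that “on-the-fly” randomization shows each participant a uniformly sampled, fixed-size subset of items; actual survey software may randomize differently, and the function names and item labels are hypothetical.

```python
import random


def patterns_fixed_forms(n_participants, seed=None):
    """Count distinct missing data patterns when everyone sees one of three fixed forms."""
    rng = random.Random(seed)
    forms = ("XAB", "XAC", "XBC")
    return len({rng.choice(forms) for _ in range(n_participants)})


def patterns_on_the_fly(items, n_shown, n_participants, seed=None):
    """Count distinct patterns when each participant gets a fresh random item subset."""
    rng = random.Random(seed)
    patterns = {frozenset(rng.sample(items, n_shown)) for _ in range(n_participants)}
    return len(patterns)


items = [f"item{i}" for i in range(1, 41)]  # 40 hypothetical scale items
fixed = patterns_fixed_forms(500, seed=1)
on_the_fly = patterns_on_the_fly(items, n_shown=30, n_participants=500, seed=1)
```

A fixed three-form design can never produce more than three planned patterns, whereas the per-participant sampling can produce up to one pattern per participant.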

B-, and C-Blocks. RIA should only be applied to scales that have a relatively large number of items (the number of items required is discussed below). When it comes to allocating items to the A-, B-, and C-Blocks, RIA is a stochastic approximation to the between-block assignment method: RIA works because it tends to split a scale’s items across blocks. When applied to scales with few items, the RIA approach will tend to generate solutions wherein some blocks have no items from a given scale while other blocks contain multiple items from the same scale; that is, solutions that (partially) resemble those produced by the within-block assignment method. In these situations, directly implementing the between-block assignment method is probably the best option. The best approach for a scale comprising only four items, for example, would be to split the four items evenly between the X-, A-, B-, and C-Blocks (i.e., assign one item to each block). Similarly, a scale with fewer than four items should have one item included in the X-Block and the remaining items deterministically distributed between as many of the A-, B-, and C-Blocks as possible. With three items, for example, one item should go into the X-Block, and then one item could go into the A-Block and one into the B-Block. The C-Block would not get any items, in this case.
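The deterministic rule for very short scales can be written down directly. In this sketch the function name `allocate_small_scale` is hypothetical, and which particular item lands in the X-Block simply follows the input order, which is an assumption; the text does not say how to pick it.

```python
def allocate_small_scale(items):
    """Spread a scale with one to four items over the X-, A-, B-, and C-Blocks.

    Items are placed one per block in the order X, A, B, C, so a three-item
    scale leaves the C-Block empty.
    """
    if not 1 <= len(items) <= 4:
        raise ValueError("intended for scales with one to four items")
    blocks = {"X": [], "A": [], "B": [], "C": []}
    for name, item in zip(("X", "A", "B", "C"), items):
        blocks[name].append(item)
    return blocks


three_item_scale = allocate_small_scale(["s1", "s2", "s3"])
four_item_scale = allocate_small_scale(["t1", "t2", "t3", "t4"])
```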

Hybrid RIA

The scales analyzed in Moore et al. [17] contained 13, 13, and 14 items, respectively, so the findings suggest that the RIA procedure works well for scales with 13 or more items. That being said, a scale with 12 items would, on average, contribute three items to each block, and a 13th item does not dramatically change the expected item allocation. Therefore, we believe it is reasonable to extrapolate the good performance of the RIA procedure to scales containing 12 or more items. Because the results of Moore et al. [17] do not directly support the use of RIA for scales with fewer than 12 items, we suggest a hybrid approach. For scales that comprise 5 to 11 items, one could use conditional randomization with the requirement that each block must contain at least one item from each scale.

Fig. 2 illustrates the workflow for implementing such a hybrid RIA for a scale with few (e.g., fewer than 12) items. We have not directly evaluated the performance of this hybrid procedure, but we have good reason to expect this approach to perform well. Namely, the hybrid approach combines two item allocation procedures (RIA and between-block assignment) that do have direct empirical support. To implement a PMDD using (hybrid) RIA, we suggest the following procedure:

1. Assign demographics, covariates, auxiliary variables, and other important (or problematic) univariate items to the X-Block (as discussed above).

2. Classify the scales into two groups:
   a. Small Scales (e.g., fewer than 12 items)
   b. Large Scales (e.g., 12 or more items)

3. Pool the items from all large scales and make X-, A-, B-, and C-Blocks by following the RIA logic outlined in Fig. 1.

4. For any small scales, make X-, A-, B-, and C-Blocks by following the hybrid RIA logic outlined in Fig. 2.

5. The final X-, A-, B-, and C-Blocks are the union of the X-, A-, B-, and C-Blocks created in Steps 3 and 4.

6. Combine the final X-, A-, B-, and C-Blocks into the three questionnaire forms (i.e., XAB, XAC, XBC).

Any univariate items that are not important enough to include in the X-Block can be randomly allocated among the A-, B-, and C-Blocks. This procedure is represented graphically in the visual abstract for this paper.
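One possible reading of the conditional randomization for a single small scale, together with the union in Step 5, is sketched below. It does not reproduce Fig. 2: guaranteeing coverage by first placing one shuffled item in each block and then assigning the leftover items to uniformly chosen blocks is an assumption of this example, as are the names `hybrid_allocation` and `merge_blocks`.

```python
import random

BLOCKS = ("X", "A", "B", "C")


def hybrid_allocation(items, rng):
    """Randomly allocate one small scale so that every block gets at least one item."""
    if len(items) < 4:
        raise ValueError("needs at least four items to cover all four blocks")
    shuffled = list(items)
    rng.shuffle(shuffled)
    # The first four shuffled items guarantee one item per block.
    blocks = {name: [shuffled[i]] for i, name in enumerate(BLOCKS)}
    # Remaining items go to randomly chosen blocks (an illustrative choice).
    for item in shuffled[4:]:
        blocks[rng.choice(BLOCKS)].append(item)
    return blocks


def merge_blocks(*allocations):
    """Step 5: take the union of block allocations from several scales."""
    merged = {name: [] for name in BLOCKS}
    for allocation in allocations:
        for name in BLOCKS:
            merged[name].extend(allocation[name])
    return merged


rng = random.Random(7)
small_a = hybrid_allocation([f"a{i}" for i in range(1, 8)], rng)   # 7-item scale
small_b = hybrid_allocation([f"b{i}" for i in range(1, 6)], rng)   # 5-item scale
final_blocks = merge_blocks(small_a, small_b)
```

Blocks produced for the pooled large scales can be passed to `merge_blocks` in the same way before the three forms are assembled. Note that this sketch does not balance block sizes across small scales; a researcher who needs equal form lengths would have to add that constraint.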

Fig. 2. Flowchart describing the logic of the hybrid RIA procedure as applied to a single scale. Note: P = Number of items in the scale.

will result in randomly presenting one of the three questionnaire forms to each participant. The “on-the-fly” approach, on the other hand, could potentially present a different combination of items to each participant.

Extended methods of the resampling study

In this section, we provide additional methodological details of the resampling study reported in Moore et al. [17]. We conducted this resampling study to evaluate the performance of different instantiations of the three-form PMDD in an ecologically valid fashion. The original data from which we sampled (hereafter, the “population data”) were collected by Moore and Fry [15] to study the effects of motivational climate perceptions on exercise participants’ class ownership and enjoyment. We excluded cases from the population data that met either of the following criteria: (1) had a missing race value or (2) endorsed a race category that represented less than 1% of the sample size. We implemented these exclusion criteria for four reasons:

1. Imputing/analyzing nominal variables was not the focus of our study.

2. Nominal variables are notoriously difficult to impute [9].

3. Sparse categorical variables often cause estimation problems [1].

4. Nominal variable imputation tends to be very slow, so retaining missing race values would substantially extend the computation time of our study without adding any scientific benefit.

The resulting population data contained N = 5244 participants of which 98.5% self-identified as female (0.65% missing) and 90.2% self-identified as white. The average observed participant age was 49.27 years (SD = 11.09, 1.47% missing). All variables except race had a small amount of missing data. The variable-wise percentages of missing data ranged from 0.04% to 1.47%. For further details of the population data collection and characteristics see Moore and Fry [15].

Variables

In the population data for this study, we included three of the original five constructs collected by Moore and Fry [15]. Specifically, we included 13 items assessing ego-involving climate and 14 items assessing task-involving climate from the Perceived Motivational Climate in Exercise Questionnaire (PMCEQ; [6]), and 13 items from the Caring Climate Scale (CCS; [19]). For more information about the PMCEQ or CCS, see Moore et al. [17] or Moore and Fry [15]. We also included indicators of participant age, biological sex, and race.

Resampling

For each replication of the resampling study, we drew a random sample (with replacement) of size N = 500 from the population data described above. Rather than draw new samples for the N ∈ {400, 300, 200, 100} conditions, we recursively “trimmed” observations from the original sample of N = 500. For the results reported in Moore et al. [17], we retained all extant missing data during the resampling processes. When we ran the study using only complete cases as the population data, the results were essentially equivalent to those derived from the incomplete population data.
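The draw-then-trim scheme can be sketched as follows. The study itself was run in R, so this Python version is only an illustration; the name `resample_and_trim` is hypothetical, and dropping the last 100 rows at each step is an assumption, since the text does not say which observations were removed.

```python
import random


def resample_and_trim(population, sizes=(500, 400, 300, 200, 100), seed=None):
    """Draw one bootstrap sample of the largest size, then trim it recursively."""
    rng = random.Random(seed)
    sample = rng.choices(population, k=sizes[0])  # sampling with replacement
    samples = {sizes[0]: sample}
    for n in sizes[1:]:
        # The rows are already in random order, so dropping the tail
        # removes a random set of observations from the current sample.
        sample = sample[:n]
        samples[n] = sample
    return samples


population = list(range(5244))  # stand-in row indices for the population data
samples = resample_and_trim(population, seed=42)
```

Because each smaller sample is nested within the next larger one, the sample-size conditions within a replication share observations rather than being drawn independently.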

Imposing planned missing data

Within each resampled (or trimmed) dataset, we imposed planned missing data according to nine different instantiations of the three-form design. These versions differed in terms of two crossed factors: the composition of the X-Block and the way in which we assigned items to the A-, B-, and C-Blocks. The X-Block factor had three levels:

1. A trivial X-Block that contained only sex, age, and race.

2. An informed X-Block that contained the demographic variables listed in (1) and items chosen with guidance from previous CFA models [6,14].

3. A random X-Block that contained the demographic variables listed in (1) and randomly selected scale items.

See Moore [13] and Moore and Fry [16] for more information regarding the development of the informed X-Block and the parceling scheme.

The A-, B-, and C-Block assignment factor also had three levels:

1. A within-block condition wherein we assigned all items of each parcel to either the A-, B-, or C-Block.

2. A between-block condition wherein we distributed the items of each parcel across the A-, B-, and C-Blocks.

3. A random-allotment condition wherein we randomized the assignment of items to the A-, B-, and C-Blocks.

In the random X-Block and the random parcel conditions, we generated a new random assignment for every replication of the resampling study. The combination of the random X-Block and random-allotment methods constitutes the RIA approach discussed in the first part of this article.
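Imposing a three-form design on an existing dataset amounts to assigning each row to a form and blanking the items that form omits. The sketch below illustrates this with plain Python dictionaries; it is not the authors' R code, and the function name `impose_planned_missing`, the variable names, and the use of `None` as the missing-value marker are assumptions.

```python
import random


def impose_planned_missing(rows, forms, seed=None):
    """Randomly assign each row to a form and set unseen items to None."""
    rng = random.Random(seed)
    form_names = sorted(forms)
    masked = []
    for row in rows:
        seen = set(forms[rng.choice(form_names)])
        # Keep values for items on the assigned form; blank everything else.
        masked.append({k: (v if k in seen else None) for k, v in row.items()})
    return masked


# Hypothetical one-item-per-block example.
forms = {
    "XAB": ["x1", "a1", "b1"],
    "XAC": ["x1", "a1", "c1"],
    "XBC": ["x1", "b1", "c1"],
}
rows = [{"x1": 1, "a1": 2, "b1": 3, "c1": 4} for _ in range(30)]
masked = impose_planned_missing(rows, forms, seed=5)
```

Values that were already missing in a row stay missing, so naturally occurring and planned missing data coexist in the result.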

Analysis model

The analysis model from which we derived the parameter estimates used to evaluate the different versions of PMDD was a confirmatory factor analysis (CFA) with standardized latent variables (i.e., the measurement scale was set with the so-called “fixed factor” method of identification). The latent correlation structure was fully saturated, and all item intercepts, factor loadings, and residual variances were freely estimated. Each latent factor loaded onto three parceled indicators. We calculated the parcel scores after imputing the data (i.e., a unique set of parcels was computed from each of the M = 100 imputed datasets). To evaluate the relative performance of the different PMDDs, we considered the effects on latent correlations, factor loadings, item intercepts, and residual variances.

Outcome measures

To evaluate the relative performance of the different implementations of PMDD, we compared latent reliabilities as well as biases and efficiencies of the parameter estimates noted above.

Latent reliability

Following Bollen [2] and Raykov [21], we define latent reliability as:

$$\rho(Y_j) = \frac{\left(\sum_{i=1}^{I} \lambda_{ij}\right)^{2} \psi_{jj}}{\left(\sum_{i=1}^{I} \lambda_{ij}\right)^{2} \psi_{jj} + \sum_{i=1}^{I} \theta_{ii}}$$

where $Y_j$ is the scale score (i.e., sum of the observed items) for the jth scale, $\lambda_{ij}$ is the factor loading linking the ith indicator to the jth latent construct, $\psi_{jj}$ is the latent variance for the jth construct, and $\theta_{ii}$ is the residual variance for the ith indicator. Latent reliability, similar to Cronbach’s alpha coefficient, can be viewed as the squared correlation between an observed scale score (i.e., the sum of the item scores) and that scale’s true score [2,21]. Unlike Cronbach’s alpha, however, the quantities that go into computing latent reliability are derived from a latent variable model, so they are not contaminated by measurement error. As with Cronbach’s alpha, $\rho(Y)$ is bounded by 0.0 and 1.0 (higher values indicate greater reliability).
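The reliability formula translates directly into code. The function below is a plain-Python sketch for a single scale; the name `latent_reliability` and the numeric values are made up for illustration and are not estimates from the study.

```python
def latent_reliability(loadings, latent_variance, residual_variances):
    """Latent reliability of a scale score from CFA parameter estimates.

    loadings: factor loadings of the scale's indicators on its construct
    latent_variance: variance of the latent construct (psi_jj)
    residual_variances: residual variances of the indicators (theta_ii)
    """
    true_score_variance = sum(loadings) ** 2 * latent_variance
    return true_score_variance / (true_score_variance + sum(residual_variances))


# Hypothetical three-indicator scale with a standardized latent variable.
rho = latent_reliability(
    loadings=[0.7, 0.8, 0.6],
    latent_variance=1.0,
    residual_variances=[0.51, 0.36, 0.64],
)
```

Here the summed loadings are 2.1, so the numerator is 4.41 and the denominator is 4.41 + 1.51 = 5.92, giving a reliability of roughly 0.74.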

Relative efficiency (RE)

We calculated the RE of each estimated parameter (i.e., latent correlations, factor loadings, item intercepts, and residual variances). RE is defined as:

$$RE = R^{-1} \sum_{r=1}^{R} \frac{SE(\theta)_r}{SE(\hat{\theta})_r}$$

where $SE(\theta)_r$ is the standard error for the parameter in the complete data control condition, $SE(\hat{\theta})_r$ is the standard error for the parameter in the planned missing condition, and r = 1, 2, …, R indexes replication of the resampling study. In our study, RE quantifies the loss of efficiency (i.e., the increase in sampling variability) introduced by the planned missing data (relative to data with only naturally occurring missing data). A value of RE = 1.0 would indicate no loss of efficiency, whereas a value of RE < 1.0 indicates some loss of efficiency (smaller values indicate greater losses).
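As a small numeric illustration of the RE definition, the function below averages the per-replication ratio of the control-condition standard error to the planned-missing standard error. The name `relative_efficiency` and the example standard errors are hypothetical.

```python
def relative_efficiency(se_complete, se_planned):
    """Average ratio of control-condition to planned-missing standard errors."""
    if not se_complete or len(se_complete) != len(se_planned):
        raise ValueError("need one pair of standard errors per replication")
    ratios = [c / p for c, p in zip(se_complete, se_planned)]
    return sum(ratios) / len(ratios)


# Two hypothetical replications in which planned missingness inflates the SE by 25%.
re_value = relative_efficiency(se_complete=[0.10, 0.12], se_planned=[0.125, 0.15])
```

Both ratios equal 0.8 here, so RE is approximately 0.8, indicating some loss of efficiency relative to the control condition.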

Percent relative bias (PRB)

We also calculated the PRB for each estimated parameter and latent reliability. PRB is defined as:

$$PRB = 100 \times \frac{\bar{\hat{\theta}} - \theta}{\theta}$$

where $\bar{\hat{\theta}} = R^{-1} \sum_{r=1}^{R} \hat{\theta}_r$ is the average of the estimated parameters and $\theta$ is the true value of the parameter. In this study, we took the averages of the complete data parameter estimates (i.e., those estimates derived from data with no planned missing) as the “true” parameter values. PRB gives a measure of bias (i.e., the expected difference between the estimated and true parameters) as a percentage of the true parameter value. Absolute values of PRB larger than 10 are often viewed as indicative of “unacceptable” levels of bias [18].
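The PRB calculation can be illustrated with a short function. The name `percent_relative_bias` and the example estimates are hypothetical.

```python
def percent_relative_bias(estimates, true_value):
    """Bias of the average estimate, expressed as a percentage of the true value."""
    if not estimates:
        raise ValueError("need at least one estimate")
    mean_estimate = sum(estimates) / len(estimates)
    return 100 * (mean_estimate - true_value) / true_value


# Hypothetical estimates of a parameter whose "true" value is 0.50.
prb = percent_relative_bias([0.52, 0.56, 0.57], true_value=0.50)
```

The average estimate is 0.55, so the PRB is approximately 10, which sits right at the conventional threshold mentioned above.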

Convergence failures

In addition to evaluating bias and efficiency, we also tracked four types of convergence failure:

1. Complete failures of an entire study replication (i.e., runs wherein the program crashed for an indeterminate reason).

2. Failures of the imputation process (i.e., fatal errors returned by the program when imputing the missing data).

3. Non-convergent CFA models (i.e., runs wherein either the program crashed when estimating the CFA models or the maximum likelihood estimator of the CFA models did not converge).

4. CFA models that converged to inadmissible solutions (i.e., Heywood cases).

Software & computing environment

We conducted all analyses using the R statistical programming language [20]. To treat the missing data (both planned and un-planned), we used the mice package [27] to generate 100 imputed datasets using 20 iterations of the chained equations algorithm. Before running the full resampling study, we conducted a small number of test runs wherein we checked the convergence of the imputation models by examining trace plots of the imputed values’ means and standard deviations. We used predictive mean matching [10,23] as the elementary imputation method because it tends to perform well with non-normally distributed, quasi-continuous items such as those in our data [26].

We estimated the CFA models using ordinary maximum likelihood estimation in the lavaan package [22]. We pooled the multiply imputed parameter estimates using the Rubin [24] pooling rules as implemented in the mitools package [12]. The online supplementary material includes the R scripts used for this study.
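For readers unfamiliar with Rubin's pooling rules, the following sketch pools a single scalar parameter across imputed datasets. It is a plain-Python illustration of the standard rules, not the mitools implementation used in the study; the function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across M imputed datasets with Rubin's rules.

    estimates: the M point estimates
    variances: the M squared standard errors
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from M >= 2 datasets")
    pooled_estimate = mean(estimates)
    within = mean(variances)                 # average within-imputation variance
    between = variance(estimates)            # between-imputation variance (M - 1 denominator)
    total = within + (1 + 1 / m) * between   # total variance
    return pooled_estimate, total ** 0.5


estimate, std_error = pool_rubin([0.5, 0.6, 0.7], [0.01, 0.01, 0.01])
```

The pooled standard error exceeds the average within-imputation standard error because the between-imputation term carries the extra uncertainty due to the missing data.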

Procedure

Our final design comprised 3 (X-Block) × 3 (Parcel) × 5 (Sample Size) = 45 fully crossed conditions. Within each condition, we ran R = 495 replications. As noted above, each replication began by randomly sampling N = 500 observations from the population data. To generate samples with N < 500, we “trimmed down” the current working dataset by removing 100 observations. We repeated this process, recursively, to create samples with N ∈ {400, 300, 200, 100}. At each level of N, before imposing the planned missing data, we fit the analysis model to the full data and saved the parameter estimates for the complete data control condition that would define the “true” population values (as described above).
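The fully crossed design can be enumerated in a few lines, which also makes it easy to pick out the conditions that correspond to RIA. The level labels below paraphrase the factor levels described earlier and are otherwise an assumption of this sketch.

```python
from itertools import product

X_BLOCKS = ("trivial", "informed", "random")
ASSIGNMENTS = ("within-block", "between-block", "random-allotment")
SAMPLE_SIZES = (500, 400, 300, 200, 100)

# Every combination of X-Block type, block assignment method, and sample size.
conditions = list(product(X_BLOCKS, ASSIGNMENTS, SAMPLE_SIZES))

# RIA is the random X-Block crossed with random allotment, at each sample size.
ria_conditions = [
    c for c in conditions if c[0] == "random" and c[1] == "random-allotment"
]
```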

Supplementary material and/or Additional information

A ZIP archive containing the R scripts used to conduct this resampling study is available as online supplementary material.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.mex.2020.100941.

References

[1] A. Agresti, An Introduction to Categorical Data Analysis, 2 ed, John Wiley & Sons, Hoboken, New Jersey, 2007.

[2] K.A. Bollen, Measurement models: the relation between latent and observed variables, Structural Equations with Latent Variables, John Wiley & Sons, New York, 1989, pp. 179–225.

[3] J.W. Graham, Missing Data: Analysis and Design, Springer Science & Business Media, 2012.

[4] J.W. Graham, S.M. Hofer, D.P. MacKinnon, Maximizing the usefulness of data obtained with planned missing value patterns: an application of maximum likelihood procedures, Multivar. Behav. Res. 31 (1996) 197–218.

[5] J.W. Graham, B.J. Taylor, A.E. Olchowski, P.E. Cumsille, Planned missing data designs in psychological research, Psychol. Methods 11 (2006) 323–343.

[6] H. Huddleston, M.D. Fry, T.C. Brown, Corporate fitness members’ perceptions of the environment and their intrinsic motivation, Rev. Psicol. Deporte 21 (2012) 15–23.

[7] T.D. Jorgensen, M. Rhemtulla, A. Schoemann, B. McPherson, W. Wei, T.D. Little, Optimal assignment methods in three-form planned missing data designs for longitudinal panel studies, Int. J. Behav. Dev. 38 (2014) 397–410.

[8] P. L’ecuyer, R. Simard, E.J. Chen, W.D. Kelton, An object-oriented random-number package with many long streams and substreams, Oper. Res. 50 (6) (2002) 1073–1075.

[9] K.M. Lang, W. Wu, A comparison of methods for creating multiple imputations of nominal variables, Multivar. Behav. Res. 52 (3) (2017) 290–304.

[10] R.J.A. Little, Missing-data adjustments in large surveys, J. Bus. Econ. Stat. 6 (3) (1988) 287–296.

[11] T.D. Little, M. Rhemtulla, Planned missing data designs for developmental researchers, Child Dev. Perspect. 7 (4) (2013) 199–204.

[12] Lumley, T. (2019). mitools: tools for multiple imputation of missing data. R package version 2.4. https://CRAN.R-project.org/package=mitools.

[13] Moore, E.W.G. (2012). Planned missing study design: two methods for developing the study survey versions (KUant Guides #23.0). Retrieved from: http: crmda.ku.edu/guide-23-planne_missing.

[14] E.W.G. Moore, M.D. Fry, Psychometric support for the ownership in exercise and empowerment in exercise scales, Meas. Phys. Educ. Exerc. Sci. 18 (2014) 135–151, doi: 10.1080/1091367X.2013.875472.

[15] E.W.G. Moore, M.D. Fry, National franchise members’ perceptions of the exercise psychosocial environment, ownership, and satisfaction, Sport, Exerc., Perform. Psychol. 6 (2017a) 188–198, doi: 10.1037/spy0000084.

[16] E.W.G. Moore, M.D. Fry, Physical Education student’s ownership, empowerment, and satisfaction with PE and physical activity, Res. Q. Exerc. Sport 88 (2017b) 468–478, doi: 10.1080/02701367.2017.1372557.

[17] E.W.G. Moore, K.M. Lang, E.M. Grandfield, Maximizing data quality and shortening survey time: three-form planned missing data survey design, Psychol. Sport Exerc. (in press).

[19] M. Newton, M.D. Fry, L. Watson, L. Gano-Overway, M.S. Kim, M. Magyar, M. Guivernau, Psychometric properties of the Caring Climate Scale in a physical activity setting, Rev. Psicol. Deporte 16 (2007) 67–84.

[20] R Core Team (2019). R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

[21] T. Raykov, Behavioral scale reliability and measurement invariance evaluation using latent variable modeling, Behav. Ther. 35 (2004) 299–331.

[22] Y. Rosseel, lavaan: an R package for structural equation modeling, J. Stat. Softw. 48 (2) (2012) 1–36. URL http://www.jstatsoft.org/v48/i02/.

[23] D.B. Rubin, Statistical matching using file concatenation with adjusted weights and multiple imputations, J. Bus. Econ. Stat. 4 (1) (1986) 87–94.

[24] D.B. Rubin, Multiple Imputation for Nonresponse in Surveys, John Wiley & Sons, New York, 1987.

[25] Sevcikova, H., & Rossini, T. (2019). rlecuyer: R Interface to RNG with Multiple Streams. R package version 0.3-5. https://CRAN.R-project.org/package=rlecuyer.

[26] S. Van Buuren, Flexible Imputation of Missing Data, Chapman and Hall/CRC, 2012.
