Evaluating the performance of various latent class model estimators in the presence of direct effects


Master’s Thesis Psychology,

Methodology and Statistics Unit, Institute of Psychology, Faculty of Social and Behavioral Sciences, Leiden University

Date: August 2017

Student number: 0951234


Abstract

In general, latent class analysis can be used to group comparable cases into different clusters or latent classes. The current simulation study was conducted to look at the performance of a new two-step method for latent class analysis with external variables, in situations where the assumption of measurement invariance does not hold. In the first part of the two-step method, the basic latent class model is estimated and in the second part, the relation between the latent variable and the covariate is estimated. It is also possible to model measurement non-invariance in the second part by adding direct effect(s) between the covariate and the latent class indicators. The performance of the new two-step method was evaluated by parameter bias and coverage over several conditions.

With regard to parameter bias, the results suggested that the new two-step method does not underperform in comparison with the most used methods for latent class analysis, the one-step and the three-step method, both with and without modeling the measurement non-invariance. As long as the separation level is large enough, good performance can be achieved for all methods. With regard to coverage, the results showed that standard error corrections are preferred for the two-step method. Even with this correction, the two-step method showed worse results than the one-step gold standard in the low sample size and low separation conditions. With increasing sample size and/or separation level, however, the two-step method also reached good performance in most situations.


Coverage
Conclusion
Parameter bias
Coverage
Discussion
Comparability
Simulation set-up
Two-step method
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
R-code
Latent Gold Syntax


Introduction

In social science and psychology there is often an interest in theoretical constructs which cannot be directly observed. In these situations, the observable effect of the theoretical construct can be used (Furr & Bacharach, 2014). These invisible theoretical constructs are often referred to as latent variables and the observable effect of the theoretical construct can be seen as an indicator of the corresponding theoretical construct (MacCallum & Austin, 2000).

For this, a variety of different latent variable models are used in social science and psychology. For example, factor analysis, traditionally, gives structure to data by relating continuous observed variables to an unmeasured continuous factor or latent variable (Bollen, 2002). Item Response Theory looks at the probability of a certain reaction on a categorical observed item, given the ability on the continuous latent variable (Beaujean, 2014; Bollen, 2002). And with repeated measures over time, latent curve models can be used (Bollen, 2002).

Less known in psychology is the use of latent class analysis. In contrast with the previous methods, latent class analysis assumes a categorical latent variable with categorical observed items (Bollen, 2002; Muthén, 2002). Latent class analysis gives an indication of the latent structure, by grouping comparable cases into different latent classes (McCutcheon, 1987; Muthén, 2002), which are mutually exclusive and exhaustive (Millsap, 2011).

For example, Muthén and Muthén (2000) used latent class analysis to identify different types of antisocial behaviour, with each latent class representing a different type of activity. Crow et al. (2012) identified types of eating disorder symptoms and Amato, King and Thorsen (2016) identified types of relationships in stepfamilies with the use of latent class analysis.


In general, the latent class model can be divided into a measurement part and a structural part (Bolck, Croon & Hagenaars, 2004). The measurement part looks at the relation between the latent variable and the observed items (Goodman, 2002). In some situations, it can also be important to look at the association between the latent class model and external variables (Moustaki, 2003), for example distal outcomes (Bakk, Oberski & Vermunt, 2016; Bakk, Tekle & Vermunt, 2013) or covariates (Asparouhov & Muthén, 2014a; Bakk & Kuha, 2017; Vermunt, 2010). This association is represented in the structural part of the latent class model (Vermunt, 2010).

For example, Muthén and Muthén (2000) extended their work by identifying different types of antisocial behavior in relation to the background variables age, gender and ethnicity. This gave an indication of how the background variables were represented within each latent class.

Assumption of conditional independence

An important assumption in latent class analysis is the assumption of local independence. Given the levels of the latent variable, the observed items should be conditionally independent of each other. Any association between the items is then completely explained by their dependence on the latent classes. In this way, the latent variable reflects the true construct of concern (McCutcheon, 1987).

This idea of conditional independence can be extended for the analysis of latent structures with external variables, for which the underlying assumption is measurement invariance (Kankaraš, Moors & Vermunt, 2010; MacCallum & Austin, 2000; Masyn, 2017).

For example, the assumption of measurement invariance is violated when observed items load on a covariate (Masyn, 2017). In this way, the covariate has an indirect influence on the observed items, through the latent variable (Moustaki, 2003), as well as a direct influence (Masyn, 2017). As a consequence, any systematic differences on the observed items can no longer be seen as true differences in relation to the latent variable (Masyn, 2017; Millsap, 2011). Furthermore, violation of measurement invariance can lead to biased parameter estimates (Asparouhov & Muthén, 2014a; Masyn, 2017; Mellenbergh, 1989).

Measurement non-invariance can be modeled by adding direct effects between the external variable and the observed items (Kankaraš, et al., 2010; Masyn, 2017). In this way, only partial measurement invariance is assumed: the association between the external variable and the affected observed items is estimated freely, which allows these items to respond differently depending on the level of the external variable (Kuha & Moustaki, 2015). But in some situations, even modeling the direct effects is not enough to get unbiased parameter estimates (Asparouhov & Muthén, 2014a).

Current methods for latent class analysis

The most used methods for looking at the association between covariates and the latent structure of a model are the one-step method and three-step methods (Asparouhov & Muthén, 2014a; Vermunt, 2010).

For the one-step method, the whole latent class model, containing the measurement part as well as the structural part, will be estimated at the same time (Bolck, et al., 2004; Vermunt, 2010). Although the one-step method shows good results and performs best in comparison with the three-step method (Asparouhov & Muthén, 2014a; Bakk & Kuha, 2017; Vermunt, 2010), there are still some disadvantages. For example, the whole latent class model should be specified beforehand and errors in this specification can lead to biased estimates in both parts of the model (Bolck, et al., 2004). Furthermore, when alterations are made to the latent class model, both the measurement part as well as the structural part should be reassessed. As a consequence, it is possible that the definition of the latent classes changes.


Added to that, the different parts of the latent class model are often perceived as two separate processes (Vermunt, 2010).

In contrast with the one-step method, the three-step method is based on a stepwise approach. For this, different implementations are possible, namely the modified BCH method and the ML method. In the first step, the measurement part of the latent class model is estimated. The next step is to assign all the cases to a latent class (Bakk, et al., 2013; Bolck, et al., 2004; Vermunt, 2010), which can be done based on multiple assignment methods (Bolck, Croon & Hagenaars, 1998; McCutcheon, 1987). The final step is to relate the previously predicted class membership to the external variables (Bakk, et al., 2013; Bolck, et al., 2004; Vermunt, 2010), while correcting for the classification errors that are made in the second step (Bolck, et al., 2004) and the consequential biased parameter estimates (Bakk, et al., 2013; Bolck, et al., 2003; Vermunt, 2010). It is this applied correction method that differs between the BCH method and the ML method (Bakk, et al., 2013; Vermunt, 2010).

A disadvantage of these correction methods is that they do not work well in all situations (Vermunt, 2010). Furthermore, with the three-step method it is not possible to add direct effects between the external variable and the observed items in the structural part of the model (Bakk, et al., 2013), as all information of the measurement model is captured in the latent class membership (Bakk, et al., 2013; Vermunt, 2010). Omitting these direct effects leads to more biased parameter estimates, especially when the number of direct effects increases, but specifying the direct effects in the measurement part will not completely solve the problem for the ML method (Asparouhov & Muthén, 2014a).

Two-step method for latent class analysis

A new two-step method for latent class analysis has been developed, to address some of the previously mentioned disadvantages of the one-step and three-step method. In the first step, the standard measurement part of the model is estimated by looking at the relation between the latent variable and the observed items. The second step looks at the structural part of the model, where the association between the latent variable and external variable is estimated (Bakk & Kuha, 2017). When there are dependencies in the data, these should also be specified in the second step.

Besides keeping the logic of a stepwise approach, the two-step method has the added value that misspecifications from the first step can be corrected in the second step, as the second step directly conditions on the step one parameter estimates. For example, when an external variable added in step two has a direct effect on the observed items that measure the latent variable, this relation can be modeled effectively in the structural part, without the need to reassess all information of the measurement part and without changes in the definition of the latent classes. Furthermore, the two-step method does not use a classification step, as this is not needed for relating external variables to the latent class model. Following from that, there is no need to correct for classification errors (Bakk & Kuha, 2017). But if desired, the individual class membership can still be calculated, based on the response pattern on the observed variables (McCutcheon, 1987).

Based on data without direct effects, the first results show that the two-step method is, in most situations, almost as efficient as the one-step method. In comparison with the three-step method, the two-step method performs better or at least equally well (Bakk & Kuha, 2017).

Use of simulation

As the most used methods still show some problems in estimating the effect of a covariate on the latent structure when direct effects are present in the data, a simulation study was performed. The goal was to assess performance in situations where measurement invariance does not hold. For this, bias levels of the two-step method were compared with those of the one-step and three-step methods, and the coverage rate of the two-step method was compared with that of the one-step method. This was done for different conditions and methods, with and without modeling the direct effects. An overview of the methods used can be found in Table 1. The one-step, two-step and ML methods were applied both with and without modeling the direct effect(s) that are present in the data. The BCH method was only applied without modeling the direct effect(s), as it is not possible to specify direct effects between the external variable and the observed items in the structural part of the model (Bakk, et al., 2013).

Table 1.

Overview of the used methods with some characteristics.

Method     Number of steps   Modeling direct effect(s)
FIML       1                 Yes
FIML no    1                 No
STEP2      2                 Yes
STEP2 no   2                 No
ML         3                 Yes
ML no      3                 No
BCH        3                 No

The rest of the paper is organized as follows: in the next section some theory behind the different latent class model estimators is presented. Then the simulation set-up and criteria for the current study are discussed. Thereafter, the performance of the new two-step method is evaluated, based on the simulation study. Finally, this paper will end with a general conclusion and some discussion points.


Latent class models with external variables

This theory part focuses on the new two-step method, but differences in the estimation process with regard to the one-step and three-step method will also be discussed.

Step one: Measurement part

The first step, for the two-step method, is to estimate the basic latent class model without covariates, also called the measurement part. In general, the measurement part consists of a latent variable X, with a total of T latent classes, with t referring to a specific latent class. Furthermore, the model consists of K (binary) observed indicator variables, with k referring to a specific item. Here, 𝒀 is the vector of responses on the observed items, 𝒚 is a complete response pattern, and 𝑌ₖ is the item-specific response (Bakk, et al., 2013; Vermunt, 2010).

For the measurement part, the relation between the latent variable and the observed items has to be estimated (Goodman, 2002). This is based on the assumption that the probability of having response pattern 𝒚, on the observed items can be described by:

$$P(\mathbf{Y} = \mathbf{y}) = \sum_{t=1}^{T} P(X = t)\, P(\mathbf{Y} = \mathbf{y} \mid X = t) \tag{1}$$

Here, the probability of having response pattern 𝒚, 𝑃(𝒀 = 𝒚), is a weighted average of the T class-specific probabilities for response pattern 𝒚, 𝑃(𝒀 = 𝒚|𝑋 = 𝑡), with 𝑃(𝑋 = 𝑡) referring to the class size proportion of class t.

Regarding the assumption of local independence, the K observed items should be mutually independent of each other, within each of the T latent classes (Bakk, et al., 2013; Vermunt, 2010; Vermunt & Magidson, 2004). In this way, there is no association between any combination of items (Agresti, 2007) that cannot be explained by their dependence on the latent class (McCutcheon, 1987). For each latent class, this assumption can be described by:

$$P(\mathbf{Y} = \mathbf{y} \mid X = t) = \prod_{k=1}^{K} P(Y_k = y_k \mid X = t) \tag{2}$$

Here, the class-specific probability for response pattern 𝒚 is the product of the item-specific response probabilities, conditional on belonging to class t. Combining the prior equations leads to the complete latent class model, described by:

$$P(\mathbf{Y} = \mathbf{y}) = \sum_{t=1}^{T} \left[ P(X = t) \prod_{k=1}^{K} P(Y_k = y_k \mid X = t) \right] \tag{3}$$

Here, the probability of having response pattern 𝒚 depends on the class size proportion for each of the T classes and the class-specific response probability for each of the K observed items (Bakk, et al., 2013; Vermunt & Magidson, 2004). Most software packages estimate the model defined in equation 3 by Maximum Likelihood estimation combined with Newton-Raphson algorithms (Vermunt & Magidson, 2004).
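The complete latent class model can be illustrated with a minimal Python sketch for binary items. The function name `pattern_probability` and all numerical values are hypothetical and only serve as an illustration; they are not estimates from this study.

```python
from itertools import product
from math import prod


def pattern_probability(y, class_sizes, item_probs):
    """P(Y = y) for binary items under local independence (equation 3).

    class_sizes[t] is P(X = t); item_probs[t][k] is P(Y_k = 1 | X = t).
    """
    total = 0.0
    for size, probs in zip(class_sizes, item_probs):
        # Equation 2: product of the item-specific response probabilities.
        conditional = prod(p if y_k == 1 else 1.0 - p for p, y_k in zip(probs, y))
        # Equation 1: weight by the class size and sum over the classes.
        total += size * conditional
    return total


# Illustrative values only: two classes, three binary items.
class_sizes = [0.5, 0.5]
item_probs = [[0.9, 0.9, 0.9], [0.1, 0.1, 0.1]]

p_all_positive = pattern_probability((1, 1, 1), class_sizes, item_probs)
# Summing over all 2^3 response patterns should give a total probability of one.
total_mass = sum(
    pattern_probability(y, class_sizes, item_probs)
    for y in product((0, 1), repeat=3)
)
```

Checking that the pattern probabilities sum to one is a quick way to verify that the class sizes and response probabilities are specified consistently.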

Step two: Structural part

The second step is to look at the association between the latent class model and the covariates, in the structural part of the model (Vermunt, 2010). For a single latent variable this can be described by:

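$$P(\mathbf{Y} = \mathbf{y}, X = t \mid \mathbf{Z} = \mathbf{z}) = \underbrace{P(X = t \mid \mathbf{Z} = \mathbf{z})}_{\text{free}} \; \underbrace{P(\mathbf{Y} = \mathbf{y} \mid X = t)}_{\text{fixed}} \tag{4}$$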

Here, 𝒁 is a vector of covariates, with 𝒛 referring to the observed values of 𝒁.

The model given by equation 4 can be broken down into two parts. The first part is 𝑃(𝒀 = 𝒚|𝑋 = 𝑡), which refers to the measurement part estimated in step 1 (see equation 2). The second part is 𝑃(𝑋 = 𝑡|𝒁 = 𝒛), which refers to the structural part estimated in step 2. This part can be estimated freely, while keeping the step 1 parameter estimates fixed. This estimation is based on the use of a multinomial logistic regression model (Bakk & Kuha, 2017), which can be described by:

$$P(X = t \mid \mathbf{Z} = \mathbf{z}) = \frac{\exp(\beta_{0t} + \beta_{t} c)}{\sum_{s=1}^{T} \exp(\beta_{0s} + \beta_{s} c)} \tag{5}$$

For this, the estimated probability of belonging to each of the latent classes can be seen as a function of the covariate values. Note that the estimated probabilities over the T latent classes always sum to one, for every value of the covariate (Agresti, 2007). The parameters of the structural part are also estimated with Maximum Likelihood estimation (Bakk & Kuha, 2017).
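The multinomial logistic regression model of the structural part can be sketched in Python as follows. This is a minimal illustration that assumes a single numeric covariate entering linearly and a reference class whose intercept and slope are fixed to zero; the function name and the coefficient values are hypothetical.

```python
from math import exp


def class_probabilities(z, intercepts, slopes):
    """P(X = t | Z = z) for a single numeric covariate z (multinomial logit)."""
    linear = [b0 + b * z for b0, b in zip(intercepts, slopes)]
    highest = max(linear)  # subtract the maximum for numerical stability
    weights = [exp(v - highest) for v in linear]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients; the first class is the reference class (both zero).
probs = class_probabilities(z=2.0, intercepts=[0.0, 0.5, -0.5], slopes=[0.0, -1.0, 1.0])
```

Because every weight is divided by the same total, the class probabilities sum to one for any covariate value.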

Modeling measurement non-invariance

For the two-step method, the structural part of the latent class model is also the place where measurement non-invariance can be modeled, by adding direct effect(s) between the covariate and observed item(s). This can be described by:

$$P(\mathbf{Y} = \mathbf{y}, X = t \mid \mathbf{Z} = \mathbf{z}) = P(X = t \mid \mathbf{Z} = \mathbf{z})\, P(\mathbf{Y} = \mathbf{y} \mid X = t, \mathbf{Z} = \mathbf{z}) \tag{6}$$

Here, the second part, 𝑃(𝒀 = 𝒚|𝑋 = 𝑡, 𝒁 = 𝒛), is no longer fixed for the items that have a direct relation with the covariate; the parameters of these items can now be estimated freely. For items without direct effects, this part equals the estimated values of 𝑃(𝒀 = 𝒚|𝑋 = 𝑡) from step one, which are still used as fixed values.
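How a direct effect makes an item response depend on the covariate can be sketched as follows. The parameterization is an assumption made for illustration only: a logistic link in which the direct effect shifts the logit of the item by the effect size times the covariate value. The text does not specify this link, and the function name `item_probability` is hypothetical.

```python
from math import exp, log


def item_probability(base_prob, z, direct_effect=0.0):
    """P(Y_k = 1 | X = t, Z = z) with an optional direct effect on the logit scale.

    Assumed parameterization: logit = logit(base_prob) + direct_effect * z.
    """
    logit = log(base_prob / (1.0 - base_prob)) + direct_effect * z
    return 1.0 / (1.0 + exp(-logit))


# Without a direct effect, the response probability does not depend on z.
invariant = item_probability(0.2, z=3)
# With a direct effect, the response probability changes with the covariate.
shifted = [item_probability(0.2, z, direct_effect=0.4) for z in (1, 2, 3, 4, 5)]
```

Under this assumed parameterization, an item without a direct effect keeps its class-specific response probability for every covariate value, which is what measurement invariance requires.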

Variance estimation

In general, sampling variances can be obtained by taking the inverse of the Fisher information (the negative Hessian matrix) (Bakk, et al., 2014). But with the use of stepwise methods, this will lead to biased estimates of the standard errors (Bakk, Oberski & Vermunt, 2014; Bolck, et al., 2004). The Fisher information for the parameters (𝜽) of the combined two-step model can be described by:

$$I(\boldsymbol{\theta}^*) = \begin{bmatrix} I_{11} & I_{12} \\ I'_{12} & I_{22} \end{bmatrix} \tag{7}$$

Here, 𝜽* refers to the true value of 𝜽, and the different elements refer to the decomposition of 𝜽 into the parameters that are estimated in the measurement part (𝜽₁) and the structural part (𝜽₂) (Bakk & Kuha, 2017).

When estimating the structural part of the model (step 2), the predicted estimates of the measurement part (step 1) are used as fixed (population) values, while they are estimates with their own sampling variance (Bakk, et al., 2014; Bolck, et al., 2004). With the use of a sample instead of a whole population, estimates could change when other samples are used (Yang, 2008). So, when estimating the structural part, there should be a correction for the sampling variance in the measurement model, to get unbiased estimates of the standard errors in the structural model (Bakk, et al., 2014; Bolck, et al., 2004; Yang, 2008). The variance matrix of the second step can then be obtained by:

$$V = \underbrace{I_{22}^{-1}}_{V_2} + \underbrace{I_{22}^{-1} I_{12} \Sigma_{11} I'_{12} I_{22}^{-1}}_{V_1} \equiv V_2 + V_1 \tag{8}$$

Here, 𝑉₂ refers to the variance in the second step if the step 1 parameters were true population values, and 𝑉₁ refers to the extra variance that arises because these parameter values are actually estimates (Bakk & Kuha, 2017), with their own sampling variance (Bakk, et al., 2014). The Fisher information (𝐼) and sampling variance (𝛴) needed for equation 8 can be extracted directly from the Latent Gold output (Vermunt & Magidson, 2015). 𝐼₂₂ refers to the Fisher information for the freely estimated parameters in the second step, whereas 𝐼₁₂ refers to the cross-information between these parameters and the parameters from step 1 (Bakk, et al., 2014). 𝛴₁₁ refers to the sampling variance of the step 1 estimates (Bakk & Kuha, 2017).
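The variance correction of equation 8 can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure used in Latent Gold: the helper functions are hypothetical, the input values are made up, and 𝐼₁₂ is assumed to be stored with one row per step 2 parameter and one column per step 1 parameter so that the matrix product in equation 8 is conformable.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def two_step_variance(i22, i12, sigma11):
    """Return (V, V2, V1) following equation 8.

    Assumed shapes: i22 is q x q, i12 is q x p and sigma11 is p x p, with q the
    number of step 2 parameters and p the number of step 1 parameters.
    """
    v2 = inverse(i22)
    v1 = matmul(matmul(matmul(matmul(v2, i12), sigma11), transpose(i12)), v2)
    v = [[a + b for a, b in zip(row2, row1)] for row2, row1 in zip(v2, v1)]
    return v, v2, v1


# Hypothetical inputs: two step 2 parameters and one step 1 parameter.
v, v2, v1 = two_step_variance(
    i22=[[4.0, 0.0], [0.0, 2.0]],
    i12=[[1.0], [0.5]],
    sigma11=[[0.04]],
)
```

In this made-up example the corrected variance of the first step 2 parameter is slightly larger than the uncorrected variance (0.2525 versus 0.25), which shows how the step 1 uncertainty is carried over into the step 2 standard errors.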

It should be noted that, when adding direct effect(s) between the covariate and the observed item(s), the step 1 parameters for the items that have a direct relation with the covariate are no longer fixed and will be re-estimated. Besides that, there will also be some additional free parameters that reflect the direct effect(s). As a consequence, the Fisher information, 𝐼₂₂ and 𝐼₁₂, will be assembled differently. Examples of Fisher information referring to different numbers of direct effects can be found in Appendix F. More information about the variance estimation in the three-step methods can be found in Bakk, et al. (2014).

One-step and three-step method

As mentioned before, for the one-step method, the whole latent class model will be estimated at the same time (Bolck, et al., 2004; Vermunt, 2010). This means that the complete model given by equation 4 is estimated simultaneously, without fixing the measurement part to the step 1 parameter estimates. In contrast, the three-step method starts with the estimation of the basic latent class model or measurement part, as defined in equation 3. The next step is to assign all cases to a latent class. This clustering is based on the response patterns of the cases on the observed variables (McCutcheon, 1987). With modal assignment, given the response pattern 𝒚, cases will be assigned to latent class t with the highest posterior probability, which is given by:

$$P(X = t \mid \mathbf{Y} = \mathbf{y}) = \frac{P(X = t)\, P(\mathbf{Y} = \mathbf{y} \mid X = t)}{P(\mathbf{Y} = \mathbf{y})} \tag{9}$$

The final step is to relate the previously predicted class membership to the external variables (Bakk, et al., 2013; Bolck, et al., 2004; Vermunt, 2010), while correcting for the classification errors that are made in the second step (Bolck, et al., 2004). This classification error is given by 𝑃(𝑊 = 𝑠|𝑋 = 𝑡), the probability of the estimated latent class, given the true latent class (Bakk, et al., 2013; Vermunt, 2010).
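The posterior class probabilities and modal assignment of the second step can be sketched as follows for binary items. The function names and numerical values are hypothetical and serve only as an illustration.

```python
from math import prod


def posterior(y, class_sizes, item_probs):
    """P(X = t | Y = y) for each class t (equation 9), binary items."""
    joint = [
        size * prod(p if y_k == 1 else 1.0 - p for p, y_k in zip(probs, y))
        for size, probs in zip(class_sizes, item_probs)
    ]
    marginal = sum(joint)  # P(Y = y), the denominator of equation 9
    return [j / marginal for j in joint]


def modal_class(y, class_sizes, item_probs):
    """Zero-based index of the class with the highest posterior probability."""
    post = posterior(y, class_sizes, item_probs)
    return max(range(len(post)), key=post.__getitem__)


# Illustrative values only: two classes, three binary items.
class_sizes = [0.5, 0.5]
item_probs = [[0.9, 0.9, 0.9], [0.1, 0.1, 0.1]]
post = posterior((1, 1, 0), class_sizes, item_probs)
assigned = modal_class((1, 1, 0), class_sizes, item_probs)
```

In this example the case is assigned to the first class, although its posterior probability of belonging to the second class is not zero. This remaining uncertainty is the source of the classification error that the third step has to correct for.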

As mentioned before, two different three-step methods can be distinguished, namely the modified BCH method and the ML method. With BCH, the correction in the third step is based on the use of a weighted version of the estimated class membership for each of the cases (Bakk, et al., 2013), which accounts for the classification error (Asparouhov & Muthén, 2014b). With ML, the correction is based on the re-estimation of the latent class model in the third step. Here, the estimated class membership of the second step is used as the only indicator of the latent variable and the corresponding classification error is used as a fixed value (Bakk, et al., 2013; Vermunt, 2010). More information on the correction methods can be found in Bakk, et al. (2013) and Vermunt (2010). It should also be noted that there is an uncorrected version of the three-step method. It will not be discussed here, because it is known to lead to severely biased parameter estimates (Bolck, et al., 2004; Vermunt, 2010).

Simulation set-up

The used population model consisted of a latent class model with 3 latent classes, 6 binary observed indicators (0-1) and 1 numerical covariate with 5 categories (1-5). Class 1 scores positive on all items, class 3 scores negative on all items and class 2 scores positive on the first 3 items and negative on the last 3 items.

As previous research indicated that sample size, separation level (Bakk, et al., 2013; Vermunt, 2010) and the number of direct effects (Asparouhov & Muthén, 2014a) influence the accuracy of the parameter estimates, these factors were varied in the simulation. The used sample sizes (N) were 500, 1000, 2000 and 4000. The used entropy values were .36, .65 and .90, corresponding with low, moderate and high separation between the classes (S). This set-up was obtained by setting the class specific probability of a positive response to .70, .80 and .90 in the three separation level conditions. Furthermore, the used number of class specific direct effects (D) ranged from 1 to 3.
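The class-specific response profiles of the population model can be written down as a small Python sketch. It assumes that a class that scores negative on an item has a positive-response probability of one minus the value used for a positive score; this is an assumption for illustration, and the function name is hypothetical.

```python
def response_profile(p):
    """P(Y_k = 1 | X = t) for 3 classes and 6 items, given positive-response probability p."""
    high, low = p, 1.0 - p  # assumption: negative items have probability 1 - p
    return [
        [high] * 6,              # class 1: positive on all items
        [high] * 3 + [low] * 3,  # class 2: positive on items 1-3, negative on items 4-6
        [low] * 6,               # class 3: negative on all items
    ]


profiles = {
    label: response_profile(p)
    for label, p in [("low", 0.70), ("moderate", 0.80), ("high", 0.90)]
}
```

The closer the response probabilities are to 0 and 1, the more clearly the classes differ from each other, which corresponds to a higher separation level.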

For the current simulation, the effect of the covariate on the latent class model was set at β₁ = -1 and β₂ = 1, with class 1 as reference class. The corresponding intercept values were set to obtain equal class sizes (α₁ = 2.33 and α₂ = -3.699). For the moderate direct effect(s), the class specific direct effects were set in class 3, with the first effect on the first indicator: γ₁₃ = 0.4. Here, the first index refers to the indicator item and the second index refers to the class in which the direct effect is set. The second direct effect was set at γ₄₃ = 0.4 and the third direct effect was set at γ₅₃ = 0.4. For the strong direct effect(s), the first direct effect was set at γ₁₃ = 0.7, the second at γ₄₃ = 0.7 and the third at γ₅₃ = 0.7.

Crossing the sample sizes (4), the separation levels (3), the number of direct effects (3) and the strength of the direct effect(s) (2), the simulation study examined 7 methods over 72 conditions, with 100 replications for each condition.
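The simulation design can be enumerated with a short Python sketch. The variable names are hypothetical; the factor levels and method labels are the ones given in the text and in Table 1.

```python
from itertools import product

sample_sizes = [500, 1000, 2000, 4000]
separations = ["low", "moderate", "high"]
n_direct_effects = [1, 2, 3]
effect_strengths = [0.4, 0.7]
methods = ["FIML", "FIML no", "STEP2", "STEP2 no", "ML", "ML no", "BCH"]
replications = 100

# Fully crossed design: every combination of the four manipulated factors.
conditions = list(product(sample_sizes, separations, n_direct_effects, effect_strengths))
n_datasets = len(conditions) * replications   # simulated data sets before exclusions
n_analyses = n_datasets * len(methods)        # each data set is analysed with every method
```

The crossing yields 4 × 3 × 3 × 2 = 72 conditions and 7,200 simulated data sets before any replications are excluded.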

The simulation study was performed with the use of Latent GOLD version 5.1.0.17083 (Vermunt & Magidson, 2016), R version 3.3.1 (R Core Team, 2016) and RStudio version 1.0.44 (RStudio Team, 2016).

Simulation criteria

The performance of the different methods was based on parameter bias and coverage.

Parameter bias

Parameter bias can be seen as the deviation of the estimated effect parameters from the true value of the effect parameters (Bakk, et al., 2013). Here, the mean parameter bias was used, with respect to the two effect parameters β₁ and β₂, which were then averaged over each condition.
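The mean parameter bias can be sketched in Python as follows. The estimates are made-up values for one hypothetical condition with four replications, and the function name is an assumption for illustration.

```python
from statistics import mean


def mean_bias(estimates, true_value):
    """Average deviation of the estimates from the true parameter value."""
    return mean(est - true_value for est in estimates)


# Hypothetical estimates from four replications of one condition.
beta1_hat = [-1.10, -0.95, -1.02, -0.89]
beta2_hat = [1.08, 0.97, 1.12, 0.99]

bias_beta1 = mean_bias(beta1_hat, true_value=-1.0)
bias_beta2 = mean_bias(beta2_hat, true_value=1.0)
# Average over the two effect parameters to get one value per condition.
condition_bias = mean([bias_beta1, bias_beta2])
```

A positive value indicates that the parameters are overestimated on average and a negative value indicates underestimation.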

The first step with regard to the parameter bias was to look at the bias patterns over the different conditions. Then a mixed model ANOVA was performed to get more information on the variables that influence the bias most. For this study, the manipulated between-subject factors were sample size, separation level, number of direct effects and the strength of the direct effect(s). The used within-subject factor was method.

Since the current study had a large sample size (N = 7061), p-values could become very small and indicate statistical significance even though this does not necessarily mean that the significant effect is of practical importance (Lantz, 2013; Lin, Lucas & Shmueli, 2013). Therefore, the focus of the mixed model ANOVA results is on the effect sizes, as these give an indication of the magnitude and meaning of the effects (Lantz, 2013; Lin, et al., 2013). The effect size that was used is generalized eta squared ($\eta_G^2$), which can be adjusted for use with a mixed model design (Bakeman, 2005). For the interpretation of the effect sizes, the following threshold values are used: .02 for a small effect, .13 for a medium effect and .26 for a large effect (Bakeman, 2005).

Furthermore, preliminary analysis showed multiple violations of normality according to the Shapiro-Wilk test (p < .05). But these were of no concern, as each cell consisted of more than 25 observations (Schmider, Ziegler, Danay, Beyer & Bühner, 2010). There was also a violation of the sphericity assumption (Mauchly’s W = .019, p < .001). Therefore, the Greenhouse-Geisser correction was used (ε < .75) for the within-subjects effects. There were also homogeneity violations, marked by a significant Levene’s test for all methods (p < .001) and unequal group sizes (min. = 59, max. = 100). In this situation the F test becomes significant too often, since the highest variance is related to the smallest group (De Heus, 2015; Lix, Keselman & Keselman, 1996), which is also referred to as negative pairing (Skidmore & Thompson, 2013). Despite the violation, the results of the mixed model ANOVA are still used, as significance level was not the main focus of the mixed model ANOVA and the violation can be traced back to only 1 of the 72 conditions. Leaving out the condition with the lowest group size (low sample size combined with low separation and 3 strong direct effects), the F test would be robust against the violation of homogeneity of variances, since the other group sizes would be approximately equal (De Heus, 2015). Lastly, there were no signs of multivariate outliers according to Mahalanobis distance.

Expectations. In most situations, the current latent class methods still show some bias. Therefore, the expectation was that there would also be some bias for the newly proposed two-step method, which could take the form of either an underestimation or an overestimation of the parameters.

Comparing the two-step method with the one-step method, the expectation was that there would be more bias for the two-step method. In comparison with the three-step method, the expectation was that there would be more bias for the three-step method, as the two-step method does not need any correction methods for classification errors.

In general, the expectation was that there would be more bias as sample size and/or separation level decreases or as the number of direct effects present in the data increases. Furthermore, the expectation was that there would be more bias as the number of omitted direct effects increases, because the effect of the omitted direct effects then has to be absorbed by the indirect path between the covariate and the observed indicator. The effect of the strength of the direct effect(s) was purely exploratory, as previous simulation studies only used one strength value.

Coverage

As mentioned earlier, due to sampling variance, the use of stepwise methods can lead to problems with the estimation of standard errors (Bakk, et al., 2014; Bolck, et al., 2004). Therefore, when estimating the structural part, there should be a correction for the sampling variance in the measurement model (Bakk, et al., 2014; Bolck, et al., 2004; Yang, 2008).

As the problems with the estimation of the standard errors also have an influence on the corresponding confidence intervals (Bakk, et al., 2014), the coverage was used to investigate the effect of the correction method in different conditions. The coverage gives “the proportion of replications for which the 95% confidence interval contains the true parameter value” (Muthén & Muthén, 2002, p. 606). Traditionally, the results are compared to a nominal coverage rate of 0.95 (Bakk, et al., 2014). But with the low number of simulations per condition, all coverage values between 0.91 and 0.99 are comparable to the nominal coverage rate of 0.95 (Bakk & Kuha, 2017). Here, the mean coverage is used, with respect to the two effect parameters β₁ and β₂. Furthermore, this is averaged over each condition.
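The coverage rate can be sketched in Python as follows. The sketch assumes Wald-type 95% confidence intervals (estimate ± 1.96 standard errors), which the text does not state explicitly; the function name and all values are hypothetical.

```python
def coverage(estimates, standard_errors, true_value, z_crit=1.96):
    """Proportion of replications whose Wald interval contains true_value.

    Assumption: intervals are computed as estimate +/- z_crit * standard error.
    """
    hits = sum(
        est - z_crit * se <= true_value <= est + z_crit * se
        for est, se in zip(estimates, standard_errors)
    )
    return hits / len(estimates)


# Hypothetical estimates and standard errors from four replications.
beta_hat = [-1.10, -0.95, -1.40, -0.89]
se = [0.10, 0.10, 0.15, 0.10]
rate = coverage(beta_hat, se, true_value=-1.0)
```

In this made-up example three of the four intervals contain the true value of -1. Standard errors that are too small narrow the intervals and push the coverage below the nominal rate, which is why the standard error correction matters for the two-step method.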

The focus will be on exploring the effect of standard error correction for the new two-step methods, both with and without modeling the direct effect(s). Information about the effects of different correction methods, for a three-step latent class model with covariates, can be found in Bakk, et al. (2014). As a baseline, the one-step methods, both with and without modeling the direct effects, will be used. The reason for this comparison is that, in general, the one-step full information maximum likelihood method will give better estimates than limited information likelihood methods (Bolck, et al., 2004), like the two-step method (Bakk & Kuha, 2017).

Simulation results

Before presenting the results, it should be pointed out that not all replications of the simulation study were used in the analysis. Replications were excluded from further analysis when they resulted in non-convergence, in absolute standard error values larger than three and/or in parameter estimates deviating more than two units from the true value, the latter two corresponding to boundary solutions or local maxima. These replications were removed because non-convergence and boundary solutions lead to uninterpretable results. The number of excluded replications per condition can be found in Table 2. The replications were removed per condition over all methods. In this way, equal sample sizes across methods were maintained within each condition.
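The exclusion rules can be sketched in Python as follows. The data structure (one dictionary per replication, grouped by method) and the function names are hypothetical; the thresholds are the ones given in the text.

```python
def keep_replication(converged, estimates, standard_errors, true_values,
                     max_se=3.0, max_deviation=2.0):
    """Return True when a replication passes all exclusion criteria."""
    if not converged:
        return False
    if any(abs(se) > max_se for se in standard_errors):
        return False
    if any(abs(est - true) > max_deviation
           for est, true in zip(estimates, true_values)):
        return False
    return True


def usable_replications(results_by_method):
    """Indices of the replications that pass the criteria for every method."""
    n = len(next(iter(results_by_method.values())))
    return [
        i for i in range(n)
        if all(keep_replication(**results[i]) for results in results_by_method.values())
    ]


# Hypothetical results for two methods and two replications.
true = [-1.0, 1.0]
results = {
    "FIML": [
        dict(converged=True, estimates=[-1.1, 0.9], standard_errors=[0.2, 0.2], true_values=true),
        dict(converged=True, estimates=[-0.9, 1.2], standard_errors=[0.2, 0.3], true_values=true),
    ],
    "STEP2": [
        dict(converged=True, estimates=[-1.0, 1.1], standard_errors=[0.2, 0.2], true_values=true),
        dict(converged=True, estimates=[-3.5, 1.0], standard_errors=[4.1, 0.3], true_values=true),
    ],
}
kept = usable_replications(results)
```

The second replication is dropped for both methods although only one of them produced an extreme result, which mirrors the removal of replications per condition over all methods.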

Table 2.

Number of excluded replications based on non-convergence, standard error values and/or parameter estimations.

γ = 0.4 γ = 0.7
N S D 1 2 3 1 2 3
500 L 3 6 7 2 5 41
M 1 2 1
H 1
1000 L 2 8 4 19
M
H
2000 L 1 1 1 8 2 10
M
H
4000 L 4 3 7
M
H

Note. N = sample size. S = separation level. D = number of direct effects. γ = strength of the direct effect(s).

As shown in Table 2, most replications were excluded in the low sample size and low separation conditions, especially with multiple strong direct effects. Although about forty percent of the replications were excluded in the worst condition, this is not exceptional. Similar results were presented by Bakk, et al. (2013).


Parameter bias: Bias patterns

With regard to parameter bias, the results focus on the expectations formulated for the current simulation set-up. For more information on the performance of the one-step and three-step methods, the articles mentioned in the introduction can be consulted. Figure 1 shows the averaged absolute mean bias for the effect parameters over all conditions. The absolute values are used here to get a clearer picture of the differences and similarities when comparing the bias patterns across methods. The specific values and direction of the bias can be found in Appendix A for data with moderate direct effect(s) and in Appendix B for data with strong direct effect(s).
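One possible reading of the "averaged absolute mean bias" is sketched below in R: the signed mean bias is computed per effect parameter, and the absolute values are then averaged over β1 and β2. The estimates and the true values are hypothetical; they are not the values used in the simulation.

```r
# Hypothetical estimates: one row per replication, one column per effect parameter
est <- matrix(c(1.10, 0.45,
                0.95, 0.55,
                1.25, 0.50),
              ncol = 2, byrow = TRUE,
              dimnames = list(NULL, c("beta1", "beta2")))
true_values <- c(beta1 = 1, beta2 = 0.5)  # assumed true values

# Signed mean bias per parameter (the direction is what the appendices report)
mean_bias <- colMeans(est) - true_values

# Absolute bias averaged over both effect parameters (the quantity that is plotted)
avg_abs_bias <- mean(abs(mean_bias))
```

Taking absolute values before averaging prevents a positive bias in one parameter from cancelling a negative bias in the other.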

The first thing that stands out in the plots in Figure 1 is the increased level of bias for the ML method in comparison with the other methods. This three-step method models the direct effect(s) in the first step, without taking the relation between the latent variable and the covariate into account. In contrast with the suggestions of Asparouhov and Muthén (2014a), modeling the direct effect(s) in this way leads to more problems than completely ignoring them with ML no or BCH. A possible explanation is that the current simulation study set a strong effect for the relation between the covariate and the latent class model, while Asparouhov and Muthén (2014a) only used a moderate effect. As a result, the modeled direct effect had to absorb a stronger effect, because the relation between the covariate and the latent class model is not modeled in the measurement part (Asparouhov & Muthén, 2014a), which could have resulted in higher bias values. Because of the large difference between the ML method and the other methods, further comparisons will not take the ML method into account.

Figure 1. Averaged absolute mean bias for the effect parameters β1 and β2 for all conditions. Separation values 1, 2 and 3 correspond to, respectively, a low, moderate and high separation level.

Comparing the two-step methods with the one-step methods, FIML shows the best performance in most conditions with up to two direct effects. Except in the low separation conditions, the two-step methods follow FIML closely. With three direct effects, STEP2 is the preferred method. In general, the methods that model the direct effect(s), FIML and STEP2, show less biased results than the methods that ignore the direct effect(s).

Comparing the two-step methods with the three-step methods, STEP2, which models the direct effect(s), shows the best performance of all stepwise methods in most conditions. In general, however, the differences between the methods are small and decrease with increasing separation level. Furthermore, there is a general decrease in bias as the separation level, sample size or number of direct effects increases. In contrast, there is a general increase in bias as the number of omitted direct effects increases. With regard to strength, with multiple direct effects the overall bias is higher with stronger effects, especially in the lower separation condition(s). It should be noted that the size of these effects depends on the method.

Overall, when the separation level is at least moderate, the two-step methods show good estimation of the effect parameters, as the bias is less than five percent (Bakk, et al., 2013). In most situations, however, some bias remains, especially when the direct effect(s) are not modeled.

Parameter bias: Mixed model ANOVA

The results of the mixed model ANOVA can be found in Appendix C. For the between-subjects effects, there was a small effect of separation level on the mean bias (F(2, 6989) = 205.868, p < .001, η²_G = .033). Post-hoc comparisons showed that all pairwise differences in mean bias between the low (M = .084), moderate (M = .034) and high (M = .013) separation conditions were significant (p < .001).


For the within-subjects effects, there was a small effect of method on the mean bias (F(2.955, 20653.457) = 1314.033, p < .001, η²_G = .073). As shown in Table 3, ML showed an increased mean bias in comparison with the other methods. When ML is left out, there is no longer a meaningful effect of method (F(1.991, 13913.600) = 74.940, p < .001, η²_G = .003).

Table 3.

Descriptive statistics for the within-subject effect of method.

Method       Min.     Max.     Mean     SD
FIML       -1.234    1.607    0.015    0.142
FIML no    -1.879    0.873    0.040    0.139
STEP2      -1.085    1.311    0.023    0.151
STEP2 no   -1.466    1.227    0.031    0.157
ML         -1.011    1.477    0.152    0.192
ML no      -1.286    1.095    0.028    0.161
BCH no     -1.474    1.295    0.015    0.222

Although the mixed model ANOVA showed more significant effects, the corresponding effect sizes were negligible. Therefore, these results are not discussed.

Coverage

Figure 2 shows the averaged mean coverage for the effect parameters over all conditions. The specific coverage values can be found in Appendix D for data with moderate direct effect(s) and in Appendix E for data with strong direct effect(s).

Figure 2. Averaged mean coverage for the effect parameters β1 and β2 for all conditions. Separation values 1, 2 and 3 correspond to, respectively, a low, moderate and high separation level.

The first pattern shown by the plots in Figure 2 is that the corrected STEP2 methods always show similar or better performance than their uncorrected STEP2 counterparts, which ignore the additional sampling variance of the first step estimates. The corrected STEP2 methods are preferred especially when separation and sample size are both low.
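To make the idea of a standard error correction concrete, the following R sketch uses the general variance formula for two-step pseudo maximum likelihood estimators, V = I22⁻¹ + I22⁻¹ I21 Σ11 I21ᵀ I22⁻¹. That this is the form of the correction is an assumption of the example; it is not taken from the R-code in Appendix F or from Latent GOLD. The function name and all matrices are hypothetical.

```r
# I22:     information matrix of the step-two parameters
# I21:     cross-information between step-two and step-one parameters
# Sigma11: covariance matrix of the step-one estimates
# (general two-step pseudo-ML formula, assumed here for illustration)
two_step_vcov <- function(I22, I21, Sigma11) {
  V_uncorrected <- solve(I22)
  V_uncorrected +
    V_uncorrected %*% I21 %*% Sigma11 %*% t(I21) %*% V_uncorrected
}

# Hypothetical matrices: two step-two parameters, three fixed step-one parameters
I22     <- matrix(c(4, 1, 1, 3), nrow = 2)
I21     <- matrix(c(0.5, 0.1, 0.2, 0.3, 0.0, 0.4), nrow = 2)
Sigma11 <- diag(c(0.02, 0.03, 0.01))

se_uncorrected <- sqrt(diag(solve(I22)))
se_corrected   <- sqrt(diag(two_step_vcov(I22, I21, Sigma11)))
```

Because the added term is positive semi-definite, the corrected standard errors can never be smaller than the uncorrected ones, which is consistent with the higher coverage of the corrected STEP2 methods.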

When comparing the results of the one-step methods and the corrected STEP2 methods to a nominal coverage rate of 0.95, both show good performance, with coverage values within the accepted interval, as long as the separation level is large enough. How large this needs to be depends on the strength of the direct effect(s). This effect of separation level can be explained by bias: with increasing separation level, bias decreases, which leads to better coverage. This is especially important for the corrected STEP2 methods in the lower sample size conditions.

Furthermore, with increasing sample size, the results of the corrected STEP2 and FIML methods, which model the direct effect(s), become more and more similar, especially in the low separation conditions. Eventually, the corrected STEP2 method shows better, or at least similar, results in comparison with this one-step method. But this effect is influenced by the number and strength of the direct effects.

Conclusion

Parameter bias

Figure 1 showed some differences in bias patterns for the two-step methods in comparison with the one-step and three-step methods. There also seemed to be some main and interaction effects with regard to bias. But before comparing the results with previous studies, it should be noted that these studies did not use formal statistical tests, but only descriptives. Although Figure 1 also shows the importance of separation level (Vermunt, 2010), sample size and the number of direct effects, the latter in the opposite direction (Asparouhov & Muthén, 2014a), the mixed model ANOVA only found small main effects for separation level and method. The effect of method was based entirely on the increased level of bias for the ML method. Leaving out the ML method made the main effect of method disappear.

Concluding from these results, as long as the separation level is high enough, it does not matter what the sample size is, how many direct effects there are, what the strength of these effect(s) is, which method is used or whether there are any interaction effects. All of these result in similar bias values, as the corresponding effect sizes are negligible. Although increasing the separation level will lead to good estimates of the effect parameters, the bias will not completely disappear.

Coverage

The general advice when using the two-step methods is to always use the standard error correction, as this leads to better coverage, especially in the low separation conditions. The second advice is to model the direct effect(s), which becomes more important as the number and strength of the direct effect(s) increase. In general, good performance for the corrected STEP2 method can be reached with increasing sample size and/or separation level, as both have a positive influence on the coverage values. And with increasing sample size, depending on the number and strength of the direct effects, the corrected STEP2 method starts to outperform its one-step counterpart.

Discussion

Comparability

An obstacle for the current simulation study was that the results were not comparable to previous simulation studies, as these studies did not use formal statistical tests. To make research more comparable, future studies should report formal test statistics and especially the corresponding effect sizes. In doing so, the choice of effect size measure deserves some consideration.

The current simulation study has used generalized eta squared as the effect size measure. The advantage of using generalized eta squared as an effect size measure for ANOVA, especially when there is a within-subjects factor, is that it can be compared across studies regardless of design, as long as the same factor(s) and outcome variable(s) are included (Bakeman, 2005; Olejnik & Algina, 2003). Besides that, it can be easily calculated based on the standard SPSS output (Bakeman, 2005).
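As an illustration of how generalized eta squared follows from standard ANOVA output, the R sketch below assumes a design in which all factors are manipulated, as is the case in a simulation. Under that assumption, following Bakeman (2005), the denominator is the effect sum of squares plus all subject and error sums of squares. The function name and the sums of squares in the example are hypothetical.

```r
# Generalized eta squared for a design with manipulated factors only (assumption):
# SS_effect / (SS_effect + sum of all between- and within-subjects error SS)
gen_eta_sq <- function(ss_effect, ss_error) {
  ss_effect / (ss_effect + sum(ss_error))
}

# Hypothetical sums of squares, not taken from Appendix C
ss_error <- c(between = 300, within = 180)
eta_g <- gen_eta_sq(ss_effect = 20, ss_error = ss_error)
```

Because every effect is divided by the same pool of error variance, the resulting values can be compared between between-subjects and within-subjects effects in the same design.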

But simulation studies based on one-way between-subjects ANOVA show that this is not always the preferred measure: in general, eta squared shows more bias than omega squared, the most used alternative measure of the proportion of variance explained (Okada, 2013; Skidmore & Thompson, 2013). And with heterogeneity of variances based on negative pairings, which was the case in the current simulation study, the bias of both effect size measures increases, and the differences between eta squared and omega squared become even larger as the number of levels of a factor increases (Skidmore & Thompson, 2013).

Although generalized omega squared is available (Olejnik & Algina, 2003), it is not necessarily the preferred choice, because it is harder to use (Bakeman, 2005), as it depends on bias-corrected sample estimators (Okada, 2013). Furthermore, there are indications that the bias of eta squared decreases as the number of observations per level increases, so that its values become more similar to those of omega squared (Okada, 2013).

As the current simulation study had both harmful and protective factors, more information is needed for an unambiguous answer to the question of which effect size measure would have been best. For example, future work could focus on the impact of using multiple factors, including a within-subjects factor, as the previous results were based on one-way between-subjects ANOVA. Or on the interaction between the number of levels (harmful factor) and the number of observations per level (protective factor). Or on the magnitude of an assumption violation, as the violation in the current simulation study could be traced back to 1 out of 72 conditions. All of this should lead to a better indication of whether the differences in bias under different conditions outweigh the complexity of the effect size measure.

Additionally, once an effect size measure is chosen, extra attention should be paid to its interpretation and the corresponding guidelines. Bakeman (2005) based the cut-off points for generalized eta squared on the eta squared guidelines defined by Cohen (.02; .13; .26). Had they been based on, for example, the eta squared guidelines of Lantz (2013), which are derived from the squared correlation guidelines (.01; .09; .25), the cut-off points for generalized eta squared would have been slightly different. Although the differences are small, in the current simulation a different choice of guidelines would have led to different results, as there would have been an additional interaction effect between method and the number of direct effects (η²_G = .011).
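The consequence of the choice of guidelines can be shown with a short R sketch. The two sets of cut-off points are the ones quoted above; the function name and the label "negligible" for values below the first cut-off are choices made for this example.

```r
# Cut-off points quoted in the text
cohen_based <- c(small = 0.02, medium = 0.13, large = 0.26)
lantz_based <- c(small = 0.01, medium = 0.09, large = 0.25)

# Label an effect size by the highest cut-off it reaches
label_effect <- function(eta_sq, cutoffs) {
  labels <- c("negligible", names(cutoffs))
  labels[findInterval(eta_sq, cutoffs) + 1]
}

# Interaction between method and number of direct effects
label_cohen <- label_effect(0.011, cohen_based)
label_lantz <- label_effect(0.011, lantz_based)
```

With the Cohen-based cut-offs the interaction falls below the threshold for a small effect, whereas the Lantz-based cut-offs would classify it as small.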

Simulation set-up

According to Vermunt (2010), latent class analysis requires a sample size of at least 500. In the current simulation study, as well as in other simulation studies, this criterion is met (Asparouhov & Muthén, 2004; Bakk, et al., 2013; Vermunt, 2010).

Besides large-scale studies, like the National Longitudinal Survey of Youth (Muthén & Muthén, 2000) or the 2013 National Youth Risk Behavior Survey (Kim, Barreira & Kang, 2015), there are also many applied studies in which latent class analysis is used with sample sizes below 500 (e.g. James, McField & Montgomery, 2013; Mannarini & Boffo, 2015; Martins, Carlson, Alexandre & Falck, 2015; McDonald, et al., 2012). Therefore, further research should focus more on the influence of small sample sizes on the estimation of (extended) latent class models, as such samples are common in practice.

Two-step method

With regard to parameter bias, the results showed that, without taking the ML method into account, similar bias patterns are found regardless of the method used. The biggest advantage of the two-step method in comparison with the other methods is that, in the second step, it is possible to correct for misspecification from the first step without the need to re-estimate the whole latent class model, as it directly conditions on the step-one parameters. With this potential of the two-step method in mind, further research should look at its performance in different conditions.

Based on previous simulation studies relating to the one-step and three-step methods, the focus could be on the use of other external variables, like distal outcomes (Bakk, et al., 2016; Bakk, et al., 2013), multiple external variables (Bakk, et al., 2014), external variables with different measurement levels (Bakk, et al., 2013; Bakk & Vermunt, 2015) or situations where the strength of the relation between the latent class model and the external variable(s) is different. On the other hand, the focus could also be on the performance of the two-step method in more complex models, like latent transition analysis or growth mixture models (Asparouhov & Muthén, 2014a), or in situations where assumptions are violated (Bakk & Vermunt, 2015).

Appendix A

Table with averaged absolute mean bias over the effect parameters β1 and β2, for data with moderate direct effect(s) (γ = 0.4).

             low separation                  moderate separation             high separation
Method    D N = 500    1000    2000    4000     500    1000    2000    4000     500    1000    2000    4000
FIML      1  -0.028  -0.020   0.005   0.012   0.007  -0.014  -0.005   0.002   0.006  -0.009  -0.003  -0.001
          2   0.013   0.009  -0.009   0.000   0.011  -0.005   0.008   0.000  -0.002   0.000  -0.004   0.003
          3   0.069   0.072   0.130   0.086   0.061   0.027   0.016   0.025   0.010   0.003  -0.001   0.002
FIML no   1  -0.008  -0.008   0.015   0.024  -0.018  -0.037  -0.029  -0.021  -0.009  -0.023  -0.017  -0.015
          2   0.079   0.102   0.093   0.101   0.011   0.000   0.013   0.009  -0.016  -0.009  -0.011  -0.007
          3   0.122   0.116   0.145   0.111   0.045   0.027   0.015   0.021  -0.010  -0.006  -0.013  -0.008
STEP2     1   0.161   0.098   0.076   0.089   0.033   0.003  -0.004   0.004   0.008  -0.009  -0.004  -0.001
          2   0.099   0.059   0.037   0.020   0.006   0.003   0.012  -0.007   0.001   0.002  -0.005   0.003
          3   0.037   0.042   0.057   0.003   0.052   0.008  -0.004   0.001   0.004   0.005  -0.003  -0.001
STEP2 no  1   0.167   0.097   0.077   0.090   0.023  -0.007  -0.017  -0.011  -0.001  -0.019  -0.014  -0.011
          2   0.090   0.077   0.055   0.040  -0.006  -0.006   0.000  -0.014  -0.010  -0.005  -0.012  -0.005
          3  -0.054   0.045   0.081   0.054   0.009  -0.007  -0.016  -0.009  -0.011  -0.006  -0.014  -0.011
ML        1   0.187   0.148   0.141   0.150   0.117   0.144   0.184   0.214   0.082   0.088   0.121   0.120
          2   0.234   0.140   0.172   0.151   0.153   0.134   0.177   0.207   0.077   0.087   0.103   0.118
          3   0.220   0.325   0.367   0.337   0.228   0.225   0.247   0.270   0.117   0.110   0.128   0.128
ML no     1   0.167   0.102   0.093   0.095   0.025  -0.012  -0.016  -0.012  -0.005  -0.022  -0.014  -0.011
          2   0.086   0.078   0.052   0.029  -0.012  -0.016  -0.010  -0.024  -0.010  -0.004  -0.012  -0.002
          3  -0.034   0.050   0.082   0.051  -0.007  -0.017  -0.013   0.003  -0.019  -0.017  -0.019  -0.018
BCH no    1   0.171   0.096   0.085   0.088   0.015  -0.013  -0.003   0.002   0.000  -0.016  -0.009  -0.003
          2   0.097   0.054   0.034  -0.039  -0.021  -0.032  -0.026  -0.048  -0.011  -0.002  -0.013   0.000
          3  -0.082   0.015   0.073   0.009  -0.037  -0.045  -0.041  -0.024  -0.019  -0.014  -0.018  -0.019

Appendix B

Table with averaged absolute mean bias over the effect parameters β1 and β2, for data with strong direct effect(s) (γ = 0.7).

             low separation                  moderate separation             high separation
Method    D N = 500    1000    2000    4000     500    1000    2000    4000     500    1000    2000    4000
FIML      1   0.023  -0.008  -0.002  -0.005  -0.013   0.009  -0.003   0.000  -0.021   0.012  -0.002   0.000
          2  -0.070   0.023  -0.009   0.005  -0.009  -0.004   0.016   0.009   0.005   0.002   0.001  -0.001
          3   0.054   0.109   0.106   0.088   0.061   0.068   0.070   0.062   0.010   0.005   0.016   0.013
FIML no   1   0.105   0.075   0.090   0.085   0.003   0.020   0.008   0.012  -0.035  -0.002  -0.017  -0.015
          2   0.027   0.151   0.121   0.148   0.077   0.075   0.095   0.098   0.011   0.011   0.010   0.008
          3   0.130   0.168   0.154   0.147   0.100   0.096   0.102   0.099   0.010   0.008   0.021   0.020
STEP2     1   0.171   0.092   0.038   0.008   0.002   0.003  -0.008   0.004  -0.018   0.018   0.003   0.002
          2   0.117   0.077   0.036   0.028   0.008   0.011   0.013   0.019   0.004   0.002   0.000  -0.002
          3  -0.006   0.045   0.031   0.022   0.012   0.014   0.021   0.012   0.003  -0.012  -0.002  -0.001
STEP2 no  1   0.168   0.116   0.066   0.041   0.001   0.003  -0.008   0.004  -0.027   0.008  -0.006  -0.008
          2   0.161   0.138   0.106   0.112   0.038   0.042   0.043   0.056   0.003   0.004   0.001  -0.001
          3  -0.079   0.119   0.127   0.127   0.047   0.059   0.069   0.063  -0.003  -0.008   0.006   0.005
ML        1   0.178   0.070   0.013  -0.045   0.037   0.111   0.171   0.201   0.096   0.143   0.151   0.152
          2   0.223   0.200   0.118   0.052   0.117   0.116   0.114   0.182   0.079   0.098   0.128   0.122
          3   0.178   0.168   0.211   0.231   0.158   0.177   0.155   0.124   0.139   0.143   0.151   0.157
ML no     1   0.162   0.113   0.070   0.037  -0.003  -0.003  -0.017  -0.003  -0.028   0.011  -0.003  -0.009
          2   0.154   0.146   0.128   0.119   0.016   0.006   0.014   0.028  -0.007  -0.003  -0.004  -0.009
          3  -0.074   0.114   0.127   0.126   0.046   0.064   0.071   0.063  -0.002  -0.007   0.008   0.007
BCH no    1   0.173   0.117   0.055  -0.013  -0.023  -0.021  -0.032  -0.015  -0.022   0.020   0.007   0.003
          2   0.158   0.135   0.096   0.062  -0.014  -0.006  -0.035   0.008  -0.025  -0.010  -0.014  -0.016
          3  -0.111   0.099   0.062   0.111   0.031   0.041   0.031   0.028  -0.012  -0.014   0.000   0.001


Appendix C

Table with results of the mixed model ANOVA

Source   SS   df   MS   F   Sig.   η²_G

Between-Subjects Effects
Intercept   94.117   1   94.117   894.322   .000   .06902
Sample size   .106   3   .035   .336   .799   .00008
Separation   43.331   2   21.665   205.868   .000   .03300
Direct effects   2.832   2   1.416   13.453   .000   .00223
Strength   1.307   1   1.307   12.417   .000   .00103
Sample size * Separation   1.406   6   .234   2.227   .038   .00111
Sample size * Direct effects   3.442   6   .574   5.450   .000   .00270
Sample size * Strength   .720   3   .240   2.281   .077   .00057
Separation * Direct effects   1.436   4   .359   3.411   .009   .00113
Separation * Strength   .247   2   .124   1.176   .309   .00019
Direct effects * Strength   1.193   2   .597   5.669   .003   .00094
Sample size * Separation * Direct effects   9.725   12   .810   7.701   .000   .00760
Sample size * Separation * Strength   .466   6   .078   .738   .619   .00037
Sample size * Direct effects * Strength   .466   6   .078   .739   .618   .00037
Separation * Direct effects * Strength   1.058   4   .265   2.514   .040   .00083
Sample size * Separation * Direct effects * Strength   2.048   12   .171   1.621   .078   .00161
Error   735.514   6989   .105

Within-Subjects Effects
Method   100.404   2.955   33.976   1314.033   .000   .07329
Method * Sample size   1.464   8.865   .165   6.389   .000   .00115
Method * Separation   8.754   5.910   1.481   57.283   .000   .00685
Method * Direct effects   13.490   5.910   2.282   88.276   .000   .01051
Method * Strength   6.752   2.955   2.285   88.372   .000   .00529
Method * Sample size * Separation   3.238   17.731   .183   7.063   .000   .00254
Method * Sample size * Direct effects   3.064   17.731   .173   6.684   .000   .00241
Method * Sample size * Strength   .649   8.865   .073   2.833   .003   .00051
Method * Separation * Direct effects   10.752   11.821   .910   35.180   .000   .00840
Method * Separation * Strength   4.695   5.910   .794   30.723   .000   .00368
Method * Direct effects * Strength   1.904   5.910   .322   12.457   .000   .00150
Method * Sample size * Separation * Direct effects   4.968   35.462   .140   5.418   .000   .00390
Method * Sample size * Separation * Strength   .786   17.731   .044   1.715   .031   .00062
Method * Sample size * Direct effects * Strength   .611   17.731   .034   1.333   .157   .00048
Method * Sample size * Separation * Direct effects * Strength   1.422   35.462   .040   1.551   .019   .00112


Appendix D

Table with averaged absolute coverage over the effect parameters β1 and β2, for data with moderate direct effect(s) (γ = 0.4).

               FIML             STEP2  FIML no          STEP2 no
N     S  D      C95      C95   C95(2)      C95      C95   C95(2)
500   L  1     0.92     0.30     0.55     0.91     0.25     0.47
         2     0.87     0.53     0.73     0.95     0.45     0.70
         3     0.82     0.64     0.83     0.94     0.61     0.82
      M  1     0.94     0.64     0.87     0.94     0.61     0.86
         2     0.94     0.87     0.94     0.94     0.85     0.93
         3     0.92     0.86     0.92     0.95     0.86     0.92
      H  1     0.97     0.97     0.98     0.97     0.96     0.97
         2     0.97     0.95     0.96     0.97     0.94     0.97
         3     0.97     0.96     0.97     0.98     0.97     0.97
1000  L  1     0.93     0.27     0.61     0.92     0.25     0.56
         2     0.91     0.61     0.88     0.93     0.54     0.89
         3     0.82     0.77     0.87     0.91     0.70     0.90
      M  1     0.95     0.67     0.94     0.94     0.65     0.92
         2     0.94     0.86     0.93     0.95     0.82     0.93
         3     0.92     0.89     0.93     0.93     0.86     0.93
      H  1     0.93     0.93     0.94     0.91     0.93     0.94
         2     0.94     0.93     0.93     0.94     0.93     0.94
         3     0.92     0.92     0.92     0.94     0.91     0.92
2000  L  1     0.96     0.27     0.67     0.94     0.25     0.65
         2     0.94     0.71     0.93     0.89     0.70     0.94
         3     0.81     0.77     0.88     0.82     0.77     0.92
      M  1     0.94     0.78     0.95     0.92     0.74     0.95
         2     0.95     0.89     0.95     0.95     0.88     0.95
         3     0.92     0.91     0.93     0.93     0.90     0.93
      H  1     0.96     0.96     0.96     0.96     0.96     0.97
         2     0.94     0.92     0.93     0.93     0.91     0.91
         3     0.92     0.92     0.92     0.91     0.90     0.91
4000  L  1     0.96     0.22     0.75     0.96     0.21     0.73
         2     0.99     0.81     0.98     0.87     0.84     0.96
         3     0.91     0.88     0.97     0.81     0.85     0.92
      M  1     0.94     0.75     0.94     0.93     0.71     0.94
         2     0.98     0.93     0.97     0.97     0.92     0.96
         3     0.95     0.93     0.96     0.94     0.90     0.93
      H  1     0.96     0.94     0.96     0.94     0.90     0.92
         2     0.95     0.95     0.96     0.95     0.93     0.94
         3     0.96     0.94     0.95     0.96     0.97     0.97

Note. N = sample size. S = separation level. D = number of direct effects. C95 = uncorrected coverage. C95(2) = corrected coverage.


Appendix E

Table with averaged absolute coverage over the effect parameters β1 and β2, for data with strong direct effect(s) (γ = 0.7).

               FIML             STEP2  FIML no          STEP2 no
N     S  D      C95      C95   C95(2)      C95      C95   C95(2)
500   L  1     0.91     0.38     0.62     0.91     0.36     0.59
         2     0.91     0.70     0.83     0.93     0.66     0.86
         3     0.76     0.75     0.86     0.82     0.70     0.81
      M  1     0.96     0.56     0.91     0.96     0.52     0.88
         2     0.91     0.85     0.90     0.94     0.90     0.95
         3     0.91     0.92     0.95     0.89     0.92     0.95
      H  1     0.95     0.92     0.98     0.95     0.89     0.98
         2     0.94     0.92     0.94     0.95     0.93     0.94
         3     0.94     0.91     0.93     0.94     0.93     0.94
1000  L  1     0.95     0.38     0.69     0.93     0.39     0.70
         2     0.90     0.76     0.87     0.90     0.75     0.91
         3     0.83     0.75     1.00     0.83     0.77     1.00
      M  1     0.95     0.44     0.89     0.95     0.43     0.87
         2     0.97     0.92     0.96     0.88     0.92     0.96
         3     0.92     0.95     0.96     0.89     0.91     0.94
      H  1     0.95     0.88     0.97     0.93     0.85     0.95
         2     0.93     0.91     0.92     0.95     0.92     0.92
         3     0.96     0.96     0.97     0.94     0.94     0.94
2000  L  1     0.97     0.57     0.88     0.89     0.55     0.90
         2     0.95     0.84     0.94     0.82     0.78     0.90
         3     0.86     0.90     0.96     0.67     0.67     0.88
      M  1     0.96     0.42     0.93     0.95     0.40     0.92
         2     0.94     0.90     0.93     0.83     0.87     0.92
         3     0.91     0.96     0.97     0.79     0.87     0.88
      H  1     0.93     0.91     0.98     0.93     0.89     0.96
         2     0.97     0.96     0.96     0.96     0.96     0.97
         3     0.95     0.95     0.96     0.94     0.96     0.96
4000  L  1     0.94     0.60     0.90     0.80     0.58     0.91
         2     0.95     0.83     0.94     0.60     0.65     0.80
         3     0.88     0.82     0.93     0.53     0.56     0.77
      M  1     0.95     0.34     0.94     0.96     0.33     0.93
         2     0.94     0.89     0.91     0.65     0.75     0.82
         3     0.82     0.92     0.95     0.62     0.75     0.78
      H  1     0.94     0.87     0.93     0.92     0.87     0.92
         2     0.94     0.92     0.94     0.95     0.93     0.94
         3     0.94     0.94     0.94     0.92     0.94     0.95

Note. N = sample size. S = separation level. D = number of direct effects. C95 = uncorrected coverage. C95(2) = corrected coverage.


Appendix F

R-code for parameter estimation, with examples for different numbers of direct effects and the corresponding Latent Gold syntaxes for data generation.

R-code

The following R-code is combined for data with one, two or three direct effects. When specific code is needed, this is given for the different situations. It should be noted that, in this R-code, (#) refers to the number of direct effects.

# ---
# set working directory
# ---
setwd(".../(#) effect")

# ---
# load packages
# ---
# generating .lgs files
library(brew)

# ---
# number of simulations
# ---
nsim <- 100
i <- 1

# ---
# Run template
# > to run .lgs files in Latent Gold
# ---
source("run.template.R")

# ---
# simulation conditions: sample size * separation level
# ---
alfabeta <- c(0.8473, 1.386294361, 2.197225)
samplesize <- c('f500', 'f1000', 'f2000', 'f4000')
# combined
paramsample <- expand.grid(alfabeta=alfabeta, samplesize=samplesize, stringsAsFactors=FALSE)

# ---
# start simulation
# ---

for (icondition in 1:nrow(paramsample)) {
    for (i in 1:nsim) {
    # ---
    # 1. data generation
    # ---
    envir <- new.env()
    # > sample size only for data generation
    assign("samplesize", paramsample$samplesize[icondition], envir=envir)
    assign("alfabeta", paramsample$alfabeta[icondition], envir=envir)
    assign("alfabeta2", (2*paramsample$alfabeta[icondition]), envir=envir)
    # create .lgs file
    run.template("generate.brew", envir=envir, temp.filename.base="generate")
    # run Latent Gold
    shell(paste("C:\\...\\LatentGOLD5.1\\lg51.exe", "generate.lgs", "/b"))

    # ---
    # 2. centering data
    # ---
    data <- read.delim('data.dat', header=TRUE)
    data$x <- data$x1-mean(data$x)
    write.table(data[,-1], 'data.dat', col.names=TRUE, quote=FALSE, row.names=FALSE)

    # ---
    # 3. one step brew file
    # + direct effects
    # ---
    envir <- new.env()
    run.template("one_step.brew", envir=envir, temp.filename.base="onestep")
    # run Latent Gold
    shell(paste("C:\\...\\LatentGOLD5.1\\lg51.exe", "onestep.lgs", "/b"))

    # ---
    # 4. one step brew file
    # - direct effects
    # ---
    envir <- new.env()
    run.template("one_step_no.brew", envir=envir, temp.filename.base="onestep_no")
    # run Latent Gold
    shell(paste("C:\\...\\LatentGOLD5.1\\lg51.exe", "onestep_no.lgs", "/b"))

    # ---
    # 5. step one brew file
    # ---
    envir <- new.env()
    run.template("step_one.brew", envir=envir, temp.filename.base="stepone")
    # run Latent Gold
    shell(paste("C:\\...\\LatentGOLD5.1\\lg51.exe", "stepone.lgs", "/b"))

    # ---
    # 6. read in all step one parameters for step 2A (uncorrected + direct effects)
    # ---
    envir <- new.env()
    parsstep1 <- as.matrix(read.table('param_step1.txt', sep=" ", dec=","))
    # 1-2 = intercept cluster
    # 3-5 = indicator effect of y1
    # 6-8 = indicator effect of y2
    # 9-11 = indicator effect of y3
    # 12-14 = indicator effect of y4
    # 15-17 = indicator effect of y5
    # 18-20 = indicator effect of y6

    # ---
    # 7. step 2A: brew file uncorrected
    # + direct effects
    # - SE correction
    # ---
    # ---
    # 7.1: fix parameter values from step 1 for (#) effects
    # 1 DIRECT EFFECT ON Y1:
    # > re-estimate intercept (1-2) + indicator effect of Y1 (3-5)
    # > fix other values of step 1
    # fixed values with 1 direct effect
    bpars <- (parsstep1[6:20])
    # 2 DIRECT EFFECTS ON Y1 and Y4:
    # > re-estimate intercept (1-2) + indicator effect of Y1 (3-5) and Y4 (12:14)
    # > fix other values of step 1
    # fixed values with 2 direct effects
    # (c() combines both index ranges; [6:11, 15:20] would index rows and columns)
    bpars <- (parsstep1[c(6:11, 15:20)])
    # 3 DIRECT EFFECTS ON Y1, Y4 and Y5:
    # > re-estimate intercept (1-2) + indicator effect of Y1 (3-5), Y4 (12:14) and Y5 (15:17)
    # > fix other values of step 1
    # fixed values with 3 direct effects
    bpars <- (parsstep1[c(6:11, 18:20)])
    # ---
    # 7.2: assign the fixed values
    assign("bpars", bpars, envir=envir)
    run.template("step_two_A.brew", envir=envir, temp.filename.base="step_two_A")
    shell(paste("C:\\...\\LatentGOLD5.1\\lg51.exe", "step_two_A.lgs", "/b"))
    # ---

    # 8. step 2B: brew file for correction
    # + direct effects
    # + SE correction
    # ---
    # ---
    # 8.1: fix freely estimated parameters from step 2A for (#) effects
    # 1 DIRECT EFFECT ON Y1:
    # > effect parameters (1:4) + indicator effect of Y1 (5:7) + direct effect on Y1 (8:10)
    acpars <- as.matrix(read.table('par_step2A.dat', sep="", dec=","))
    acpars <- (acpars[1:10])
    # 2 DIRECT EFFECTS ON Y1 and Y4:
    # > effect parameters (1:4) + indicator effect of Y1 (5:7) + direct effect on Y1 (8:10)
    # + indicator effect of Y4 (11:13) + direct effect on Y4 (14:16)
    acpars <- as.matrix(read.table('par_step2A.dat', sep="", dec=","))
    acpars <- (acpars[1:16]) # cluster + direct effect on Y1 + direct effect on Y4
    # 3 DIRECT EFFECTS ON Y1, Y4 and Y5:
    # > effect parameters (1:4) + indicator effect of Y1 (5:7) + direct effect on Y1 (8:10)
    # + indicator effect of Y4 (11:13) + direct effect on Y4 (14:16)
    # + indicator effect of Y5 (17:19) + direct effect on Y5 (20:22)
    acpars <- as.matrix(read.table('par_step2A.dat', sep="", dec=","))
    acpars <- (acpars[1:22]) # cluster + direct effect on Y1 & Y4 & Y5
    # ---
    # assign the fixed values
    assign("acpars", acpars, envir=envir)
    run.template("step_two_B.brew", envir=envir, temp.filename.base="step_two_B")
    shell(paste("C:\\...\\LatentGOLD5.1\\lg51.exe", "step_two_B.lgs", "/b"))
    # ---

    # ---
    # 8.2.1: read in 'writeHessian' matrices step one
    # for an overview of the parameters see 6
    # all matrices
    Step1matrices <- read.table('hes_step1.dat', header=FALSE, dec=",")
    # matrix 2: correct SE estimator of step 1 (inverse of negative Hessian)
    Step1Var <- as.matrix(Step1matrices[21:40,])
    # ---
    # 8.2.2: step 1 variance of the fixed values for (#) effects
    # 1 DIRECT EFFECT ON Y1:
    # - intercept & indicator effect Y1
    step1varnointercept <- as.matrix(Step1Var[6:20, 6:20])
    # 2 DIRECT EFFECTS ON Y1 and Y4:
    # - intercept & indicator effect Y1 and Y4
    step1varnointercept <- as.matrix(Step1Var[c(6:11, 15:20), c(6:11, 15:20)])
    # 3 DIRECT EFFECTS ON Y1, Y4 and Y5:
    # - intercept & indicator effect Y1, Y4 and Y5
    step1varnointercept <- as.matrix(Step1Var[c(6:11, 18:20), c(6:11, 18:20)])
    # ---
    # 8.2.3: read in step2B hessian matrices
    Step2bmatrices <- read.table('hes_step2B.dat', header=FALSE, dec=",")
    # ---
    # 8.2.4: read in the free estimated parameters for (#) effects
    # 1 DIRECT EFFECT ON Y1:
    # cluster (1-4) + indicator effect of Y1 (5-7) + direct effect on Y1 (8-10)
    J <- as.matrix(Step2bmatrices[1:10, 1:10])
    # 2 DIRECT EFFECTS ON Y1 and Y4:
    # cluster (1-4) + indicator effect of Y1 (5-7) + direct effect on Y1 (8-10)
    # + indicator effect of Y4 (17:19) + direct effect on Y4 (20:22)
    J <- as.matrix(Step2bmatrices[c(1:10, 17:22), c(1:10, 17:22)])
    # 3 DIRECT EFFECTS ON Y1, Y4 and Y5:
    # cluster (1-4) + indicator effect of Y1 (5-7) + direct effect on Y1 (8-10)
    # + indicator effect of Y4 (17:19) + direct effect on Y4 (20:22)
    # + indicator effect of Y5 (23:25) + direct effect on Y5 (26:28)
    J <- as.matrix(Step2bmatrices[c(1:10, 17:28), c(1:10, 17:28)])
    # ---
