The latent structure of Neurodevelopmental Disorders (NDD’s) : a factor mixture modeling approach to NDD symptom heterogeneity in young children

Academic year: 2021

The Latent Structure of Neurodevelopmental Disorders (NDD’s): a Factor Mixture Modeling Approach to NDD Symptom Heterogeneity in Young Children.

Sara Boxhoorn 5798957 Master Thesis

University of Amsterdam

Abstract

There is a high degree of overlap among Neurodevelopmental Disorders (NDD’s), suggesting that the current DSM classification of NDD’s is substandard. The first aim of this study was to test whether an empirically derived classification might be beneficial. Several factor mixture models were fitted to NDD marker data in a sample of children at risk for NDD’s. Results suggested that the sample was best described by two classes: one severely and one mildly impaired class. The severe NDD class showed relatively more impairments in communication, social interaction, (fine) motor and language skills. The mild NDD class showed slightly more frequent attention and sleep problems, hyperactivity/impulsivity and problem behavior. Diagnostic classifications could help children with NDD’s if they contribute to a better understanding of their problems and guide intervention (Gillberg, 2010). Diagnostic classifications therefore need to inform us about the course of development as reliably as possible. The second aim of the present study was to examine whether the empirical and DSM classifications at the age of two predict symptoms associated with NDD at the age of four. Higher probabilities of being a member of the severe NDD class predicted lower functioning across several NDD symptoms around the age of four. DSM-5 categories did not significantly predict NDD symptoms. We argue that empirically derived classifications of NDD markers will not only improve our understanding of NDD’s, but will also help to unravel clinical phenotypes of NDD’s, and could therefore guide the interventions that these children need.

The Latent Structure of Neurodevelopmental Disorders (NDD’s): a Factor Mixture Modeling Approach to NDD Symptom Heterogeneity in Young Children.

The new chapter on Neurodevelopmental Disorders (NDD’s) in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) consists of Intellectual Disabilities, Communication Disorders, Autism Spectrum Disorder, Attention-Deficit/Hyperactivity Disorder, Specific Learning Disorder, Motor Disorders and a residual category, ‘Other Neurodevelopmental Disorders’ (APA, 2013). The term “Neurodevelopmental Disorders” has traditionally been used to refer to disorders that emerge early in life (Chen, Liu, Su, Hang, & Lin, 2007; Ehninger, Li, Fox, Stryker, & Sylva, 2008; Millan, 2013) and show abnormal brain functioning (Moreno-De-Luca et al., 2013). The clinical phenotypes of NDD’s are complex and typically span various developmental areas. Children diagnosed with NDD may show a variety of cognitive, emotional and behavioral problems (Reynolds & Goldstein, 1999; Chen et al., 2007), specific language, communication and social problems (Barthélémy, 2014; Landa, Holman & Garrett-Mayer, 2007; Miniscalco, Nygren, Hagberg, Kadesjö, & Gillberg, 2006) and atypical motor development (e.g. Demers, McNevin, & Azar, 2013; Fournier, Hass, Naik, Lodha, & Cauraugh, 2010; Zelaznik & Goffman, 2010).

Co-morbidity rates among NDD’s are high (e.g. Kadesjö, 2000; Kaplan, Dewey, Crawford & Wilson, 2001; Mueller, 2012; Pennington & Bishop, 2009; Piek & Dyk, 2004). In a sample of school children with NDD’s, half of the children met diagnostic criteria for two diagnoses (Kaplan et al., 2001). For most NDD’s co-morbidity seems to be “the rule rather than the exception” (Moreno-De-Luca, Myers, Challman, Moreno-De-Luca, Evans, & Ledbetter, 2013, p. 406). For instance, in a sample of children with ADHD, roughly half of the children showed motor skills that were impaired to a similar degree as those of children diagnosed with DCD (Fliers, Franke & Buitelaar, 2010). This finding is in line with experiences in clinical practice, where health practitioners have argued for introducing standard screening for motor problems during the diagnostic assessment of ADHD (Fliers et al., 2010).

Scientists have proposed several ‘central deficits’ to explain the high overlap, such as common brain dysfunctions (Menon, 2013; Moreno-De-Luca et al., 2013), or an impaired procedural learning system (Nicolson & Fawcett, 2007). The high degree of overlap might also suggest that the current classification of NDD’s is substandard.

Empirical approaches to classifying NDD’s have indeed reported alternative classifications, but so far no consistent picture has emerged. To illustrate this, we briefly discuss previous research on classifying NDD symptoms. First, some empirical approaches suggest categories that differ from current diagnostic categories (Grados & Matthews, 2008; Mulligan, Anney, O’Regan, Chen, & Butler et al., 2009). Second, some studies suggest different categories, ordered along a continuum (Leyfer, Tager-Flusberg, Dowd, Tomblin, & Folstein, 2008; van der Meer, Oerlemans, van Steijn, Lappenschaar, & de Sonneville et al., 2012). For instance, Latent Class Analysis (LCA) revealed three latent classes that each contained a mix of ADHD and ASD symptoms (van der Meer et al., 2012). However, classes differed with respect to the ratio of ASD to ADHD symptoms and in impairment severity. That is, the higher the ratio of ASD to ADHD symptoms, the higher the level of impairment severity (van der Meer et al., 2012). These results firstly suggest a classification that differs from the diagnostic classification. Additionally, they suggest quantitative differences between classes, ordered along a continuum. Similar results have been reported for Specific Language Impairment (SLI) and ASD symptoms (Leyfer et al., 2008). Third, continuous differences within one category have been reported as well. For instance, Dyck, Piek and Patrick (2011) found two latent classes in a sample of children with various NDD’s and typically developing children. The classes mainly differed in level of functioning, with one class containing mostly typically developing children and the other mostly children diagnosed with NDD’s. Within these classes, however, differences between children varied continuously.¹

Finally, some results suggest only continuous differences between NDD’s. For instance, epidemiological and genetic data suggest that differences between NDD’s represent continuous differences in one common, latent factor (Moreno-De-Luca et al., 2013).

The implications of previous findings are two-fold. First, the ‘mixed’ categories that have been reported (Leyfer et al., 2008; van der Meer et al., 2012) suggest that the current NDD classification should be reconsidered. That is, NDD symptoms may be more optimally categorized into latent classes that comprise symptoms from ‘different’ diagnostic categories. Second, the continuous differences across and within classes suggest that differences in NDD phenotypes mainly reflect continuous variation in NDD symptoms rather than, or in addition to, qualitative differences. That is, class differences may only exist in terms of the relative level of functioning (e.g. Dyck et al., 2011; van der Meer et al., 2012). The current study therefore aimed to obtain an empirically derived classification of NDD symptoms, by deriving latent classes across NDD-symptoms, and by simultaneously allowing for quantitative differences in NDD symptoms within, as well as between these classes.

As experiences early in life are known to have long-term enduring effects (Bale, Baram, Brown, Goldstein, & Insel, 2010), it is highly informative to study measures of impaired functioning at a young age. To this end, we studied the latent structure of NDD symptoms for children between 15 and 30 months old (t1) and included several markers that signal the (likely) presence of a neurodevelopmental disorder in the first four years of life (Gillberg, 2010, p. 1545, table 2). More specifically, we included measures of motor abnormality, general developmental delay, speech- and language delay, communication- and social interaction problems, behavior problems, hyperactivity/impulsivity, attentional- and sleep problems.

The empirical classification was obtained using Factor Mixture Analysis (FMA) (Lubke & Muthén, 2005; Muthén & Muthén, 2012). Previous empirical approaches to classifying NDD’s used either latent class analysis (LCA) or factor analysis (FA) to model NDD symptoms. In LCA, latent variables are categorical, whereas in FA latent variables are continuous. LCA therefore yields information about subgroups in the population and FA about gradual differences within the population. In psychology, these methods are therefore often used to model clinical subgroups (LCA) or severity differences in symptoms (FA). The current study used FMA as this technique allows for the detection of classes in the data, as well as continuous factors within each class (Lubke & Muthén, 2005). That is, FMA allows severity differences within subgroups to be modeled by estimating both continuous and categorical latent variables within one model. The present study therefore examined the latent structure of NDD symptoms by fitting several factor mixture models to the data. More specifically, we tested whether young children at risk for NDD were most optimally classified into subgroups that showed severity differences within these subgroups (H1a), comprised symptoms from various NDD categories (H1b), and showed mainly quantitative, rather than qualitative, differences in NDD symptoms between subgroups (H1c). To this end, we compared several FM models that represented quantitative severity differences between classes with and without severity differences within classes, and that differed in the extent of qualitative differences between classes. We expected to find subgroups with symptoms from various NDD categories that showed variation in NDD symptom severity within classes, and mainly quantitative rather than qualitative differences between classes. Additionally, we estimated the distribution of DSM-IV diagnoses in the empirical classification to compare the empirical classification to the DSM-IV classification. To this end, the relative proportions of DSM-IV diagnoses were calculated for each class. In line with previous literature, we expected to find quantitative rather than qualitative differences in DSM-IV diagnoses across classes.

Latent classifications can be used to inspire future diagnostic classifications, but the two are not interchangeable concepts. That is, diagnoses may, and ideally should, additionally inform health practitioners about treatment and prognosis (Rutter, 2011). Diagnostic classifications could help young children with NDD’s if they contribute to a better understanding of their problems and guide intervention (Gillberg, 2010). We therefore need diagnostic classifications to inform us not only about the present symptoms but also about the development of these symptoms as reliably as possible. Currently, NDD diagnoses may not contribute to a better understanding and, even worse, may misguide intervention (Gillberg, 2010). That is, interventions for young children with NDD’s may focus too restrictively on only those developmental areas that are impaired according to current diagnostic criteria. This approach risks that a child develops problems in another developmental area, problems that were previously only present on a subclinical level (Gillberg, 2010). The second aim of the present study was therefore to quantify which classification would inform us best on how the clinical phenotypes of children with NDD symptoms manifest in the near future. To this end, we examined whether the empirical classification and DSM classifications predict symptoms associated with NDD at the age of four (t2) (H2).

2. Method

2.1 Sample Selection

Participants of the present study were selected as part of a Dutch screening study (Screenings Onderzoek Sociale Ontwikkeling; SOSO project) from October 1999 up to April 2002 (Dietz, Swinkels, van Daalen, van Engeland, & Buitelaar, 2006). Participants were selected if they were identified as being at high risk of developing ASD by the Early Screening of Autistic Traits Questionnaire (ESAT) (Swinkels, Dietz, van Daalen, Kerkhof, van Engeland et al., 2006).² The screening procedure comprised several phases. First, children aged 14-15 months were selected using a three-minute pre-screening test at the baby clinic: the 4-item ESAT. These children were tested with the MSEL and the 14-item ESAT. Second, all children scoring above the 14-item ESAT cut-off were selected for additional investigation at the University Medical Centre Utrecht (UMC Utrecht). Additionally, some children (N=27) who screened negative were also investigated further for various reasons, for instance, when their siblings were diagnosed with NDD’s. Participants were invited for follow-up clinical assessments around 24 and 42 months (Dietz et al., 2006).

² The ESAT was designed to detect ASD at a young age, but seems relatively more suitable to identify general developmental problems. That is, a positive ESAT result corresponds with only a 30% chance to correctly identify ASD, but a 100% chance to correctly identify general

For 421 participants one or more measurements were available. For most of the participants (88.2%) one or more assessments were missing at a given time point. Additionally, children varied greatly in age within each time cohort, to the extent that a given participant during the first follow-up could be of the same age as another participant during the second or even third follow-up. For each time point, children were therefore selected within relatively similar phases of language development. As guidelines, we used language benchmarks that have been reported for both typically developing children and children with ASD (Tager-Flusberg, Rogers, Cooper, Landa, & Lord et al., 2009). At t1, we set the minimum age to 15 months, as this corresponds to the benchmark of reaching phase 2 in early language development for children with ASD (First Words; Tager-Flusberg et al., 2009). The maximum age was set to 30 months, as this is the upper limit of the age range in phase 3 for typically developing children (Word Combinations; Tager-Flusberg et al., 2009). Therefore, in theory, at t1 all selected children should have reached phase 2 (First Words) and no child should have reached phase 4 (Sentences). At t2, we set the minimum age to 37 and the maximum age to 48 months, the upper limit of the age range in phase 4 (Sentences) for children with ASD as well as for typically developing children. The minimum age at t2 was set to 37 months, as only follow-up assessments were used that took place at least 12 months after the assessment at t1. The mean time difference between t1 and t2 was 23.3 months (SD=5.79). To summarize, to find a balance between limited variation in age and an adequate sample size, data for two ‘time points’ were selected: measurements for children aged 15-30 months (t1) and follow-up measurements for children aged 37-48 months (t2). Lastly, we excluded participants with missing data due to a language barrier from the analysis (N=8). Our selection criteria reduced our total sample size to N=344 (not-selected sample size = 77).
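The age-window selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the study: the `Assessment` record, the function names, and the rule of taking the earliest eligible assessment per child are assumptions introduced here. Only the age windows (15-30 and 37-48 months) and the 12-month minimum gap come from the text.

```python
from dataclasses import dataclass

# Age windows in months, taken from the text
T1_WINDOW = (15, 30)
T2_WINDOW = (37, 48)
MIN_GAP_MONTHS = 12  # t2 must lie at least 12 months after t1


@dataclass
class Assessment:
    """Hypothetical record: one assessment of one child."""
    child_id: int
    age_months: float


def in_window(age, window):
    low, high = window
    return low <= age <= high


def select_time_points(assessments):
    """Return {child_id: (t1_age, t2_age)}; t2_age is None without an eligible follow-up.

    Assumption: the earliest eligible assessment is used for each time point.
    """
    ages_by_child = {}
    for a in assessments:
        ages_by_child.setdefault(a.child_id, []).append(a.age_months)

    selected = {}
    for child_id, ages in ages_by_child.items():
        t1_candidates = sorted(x for x in ages if in_window(x, T1_WINDOW))
        if not t1_candidates:
            continue  # child has no assessment in the t1 window
        t1 = t1_candidates[0]
        t2_candidates = sorted(
            x for x in ages
            if in_window(x, T2_WINDOW) and x - t1 >= MIN_GAP_MONTHS
        )
        selected[child_id] = (t1, t2_candidates[0] if t2_candidates else None)
    return selected
```

With this rule, a child assessed at 29 and 38 months would contribute a t1 measurement but no t2 measurement, because the follow-up falls within 12 months of t1.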

2.2 Sample Characteristics

The sample comprised 126 females (36.6%) and 218 males (63.4%). At t1 mean age was 18.99 months (SD=4.85); at t2 mean age was 43.32 months (SD=2.28). The study was conducted by Utrecht University and participants were recruited within the Province of Utrecht. All participants were diagnosed by experienced child psychiatrists, according to the diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American Psychiatric Association, 2000). The child psychiatrists were informed by the child’s developmental history, standard psychiatric observation through the Autism Diagnostic Observation Schedule Generic (ADOS-G; Lord et al., 2000), the Mullen Scales of Early Learning (MSEL; Mullen, 1995) and pediatric examination. The resulting diagnoses at t1 comprised Autism (N=24), PDD-NOS (N=16), AD(H)D (N=9), Expressive Language Disorder (N=23), Mixed Language Disorder (N=6), Mental Retardation without PDD-NOS (N=22), Articulation Disorder (N=1), “different Axis 1 disorder” (N=12) and “Axis 2 disorder” (N=8). Eight children were investigated but did not receive a diagnosis. For another 214 children information on diagnoses was not available.

2.3 Measures

To analyze the latent structure of NDD’s we included several measures of markers that signal the (likely) presence of a neurodevelopmental disorder in the first four years of life (see Gillberg, 2010, p. 1545, table 2). More specifically, we included measures of motor abnormality, general developmental delay, speech- and language delay, communication- and social interaction problems, behavior problems, hyperactivity/impulsivity, inattention and sleep problems. Measures for two additional markers proposed by Gillberg (2010) were not available (hypoactivity and feeding difficulties around the age of 4 months).

2.3.1 Motor abnormality, general developmental delay and speech- and language delay

Mullen Scales of Early Learning (MSEL; Mullen, 1995) assessments were used to obtain measures of motor abnormality, speech- and language delay and general developmental delay. The MSEL measures early development of learning, from birth until 68 months, across five subscales: Gross Motor, Visual Reception, Fine Motor, Expressive Language and Receptive Language. The MSEL is frequently used to study (young) children with ASD and other developmental disabilities (Bishop, Guthrie, Coffing, & Lord, 2011). All tasks are individually administered, and t-scores, percentiles and age equivalents can be computed for each scale separately. The current study used t-scores to account for the impact of age on performance. The MSEL has shown good internal consistency (Cronbach’s α ≥ .90), but inadequate test-retest reliability (r < .80) (Williams et al., 2014).

the Gross Motor subscale was not available. Both language scales were used as measures of speech- and language delay.

2.3.2 Communication and social interaction problems

Autism Diagnostic Observation Schedule Generic (ADOS-G; Lord, Risi, Lambrecht, Cook & Leventhal, 2000) assessments were used to obtain measures of communication and social interaction problems. The ADOS is a semi-structured observation method that requires a skilled examiner to evoke specific (social) behavior during several standardized tasks (Lord et al., 2002). Four different ADOS modules adapt to different language and developmental levels. The various tasks provide observation scores across the following categories: Language and Communication, Reciprocal Social Interaction, Fantasy, Stereotypical Behaviors and Restricted Interests and Other Deviant Behaviors (such as hyperactive behavior and anxiety). Each observation is rated between 0 and 3, indicating “no abnormality” to “moderate to severe abnormality”. The ADOS has shown high discriminant validity as well as high internal consistency and test-retest reliability (Landa, 2005).

The present study used Module 1 assessments, for children “who do not use spontaneous phrase speech consistently” (Lord et al., 2000, p. 207). We used algorithm scores of the communication and social domains as measures of communication and social interaction problems, respectively.

2.3.3 Behavior problems, hyperactivity/impulsivity, inattention and sleep problems

Items from several Child Behavior Checklist 1.5-5 subscales (CBCL; Achenbach, & Rescorla, 2001) were used to obtain measures of behavior problems, hyperactivity, impulsivity, inattention and sleep problems. The CBCL-1.5-5 is a questionnaire for primary caregivers and measures emotional and behavioral problems of their child during the past two months. Ninety-nine items are rated on a 3-point scale from 0 (“not true”) to 2 (“very true or often true”). The subscales represent a seven-syndrome structure (Ivanova, Achenbach, Rescorla, Harder, & Ang et al., 2010) within two domains (Internalizing and Externalizing). The Internalizing problem domain comprises the Emotionally Reactive, Anxious/Depressed, Somatic Complaints and Withdrawn syndromes. The Externalizing problem domain comprises the Attention Problems and Aggressive Behavior syndromes. Sleep Problems is an additional subscale, not belonging to either the Externalizing or the Internalizing domain. The manual also provides scales based on the DSM-IV: Affective, Anxiety, Pervasive Developmental, Attention Deficit/Hyperactive, and Oppositional Defiant problems. The CBCL-1.5-5 has shown good internal consistency (Cronbach’s α=.92) (Rescorla, Achenbach, Ivanova, Harder, & Otten et al., 2011).

The present study used items that belong to the syndrome subscales Attention Problems (items 5, 6, 56, 59, 95) and Sleep Problems (items 22, 38, 48, 64, 74, 84, 94) to obtain measures of inattention and sleep problems, respectively. Next, additional items from the Attention Deficit/Hyperactive problem scale were included to provide measures of impulsivity and hyperactivity (items 8, 16 and 36). Lastly, additional items from the Oppositional Defiant Problem scale were included to obtain measures of problem behavior (items 15, 20, 44, 81, 85 and 88).
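The item selection above can be written down as a simple lookup from marker to CBCL-1.5-5 item numbers. The sketch below is illustrative only: the dictionary keys, the function name, and the choice to return `None` when an item is missing are assumptions made here, not part of the study's procedure. The item numbers and the 0-2 rating scale are taken from the text.

```python
# CBCL-1.5-5 item numbers per marker, as listed in the text
CBCL_ITEMS = {
    "inattention": [5, 6, 56, 59, 95],
    "sleep_problems": [22, 38, 48, 64, 74, 84, 94],
    "hyperactivity_impulsivity": [8, 16, 36],
    "problem_behavior": [15, 20, 44, 81, 85, 88],
}


def marker_sum_scores(responses):
    """Sum the 0-2 ratings per marker.

    responses: dict mapping item number -> rating (0, 1 or 2).
    Assumption: a marker with any missing item gets None instead of a partial sum.
    """
    scores = {}
    for marker, items in CBCL_ITEMS.items():
        ratings = [responses.get(item) for item in items]
        if any(r is None for r in ratings):
            scores[marker] = None
        else:
            scores[marker] = sum(ratings)
    return scores
```

A caregiver who rates every selected item as 1 would, under this sketch, obtain scores of 5, 7, 3 and 6 on the four markers, which equal the numbers of items per marker.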

2.4 Procedure

2.4.1 Missing data

A considerable proportion of our data at t1 was missing (62.66%). Furthermore, part of our data was selectively missing for children who scored below the 14-item ESAT cut-off (N=208). More specifically, MSEL measures were available for most of these children (97.6%), but ADOS-G (12.01%) and CBCL-1.5-5 data (10.1%) were available for considerably fewer. Children with lower ESAT scores were indeed significantly more likely to have missing data for ADOS-G (βADOS-G=-0.427, S.E.=0.05358, Wald Z=-7.963, p<.001) and CBCL-1.5-5 measures (βADOS-G=-0.496, S.E.=0.05848, Wald Z=-7.963, p<.001). To summarize, missing data for ADOS-G and CBCL-1.5-5 assessments depended on ESAT scores. Under the Missing at Random (MAR) assumption, missingness may depend on observed data, but not on unobserved data (Schafer & Graham, 2002). We therefore included ESAT sum scores as an auxiliary variable in the model. Furthermore, we explored whether age predicted missing data for CBCL-1.5-5 measures, as the youngest participants in our sample were 15 months old, but age norms for this questionnaire start at 18 months. Relatively younger children were indeed more likely to have missing data for CBCL-1.5-5 measures (βage=-0.23129, S.E.=0.02846, Wald Z=-8.127, p<.001). Age was therefore included in the model as an auxiliary variable as well. Simulation studies under MAR showed that Full Information Maximum Likelihood (FIML) estimation outperformed listwise deletion, pairwise deletion and similar pattern imputation (Enders & Bandalos, 2001). The current study therefore used the FIML missing data estimation approach.
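For readers who want to check the reported test statistics, the Wald Z of a logistic regression coefficient is the estimate divided by its standard error, and the two-sided p-value follows from the standard normal distribution. The short sketch below uses only the Python standard library; the function names are introduced here for illustration, and the input values are the age coefficient and standard error reported above.

```python
import math


def wald_z(beta, se):
    """Wald statistic: coefficient divided by its standard error."""
    return beta / se


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Age coefficient for CBCL-1.5-5 missingness, as reported in the text
z_age = wald_z(-0.23129, 0.02846)
print(round(z_age, 3))           # -8.127
print(two_sided_p(z_age) < 0.001)  # True
```

The resulting p-value is far below .001, which is why reporting it as p<.001 is more informative than a rounded value of 0.000.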

Additionally, we explored the impact of the ESAT as an auxiliary variable in our model more thoroughly. That is, the correlation between a cause of missingness Z and the dependent variables Y (i.e. r_ZY; Graham, 2009) can give an estimate of the impact of including a cause of missingness in the model (Collins, Schafer & Kan, 2001). To this end, we calculated the correlation between the cause of missingness, ESAT total scores, and the dependent variables at t1. Spearman rank order correlation coefficients between ESAT total scores and MSEL and ADOS-G scores varied between -0.42 and -0.48 and between 0.44 and 0.53, respectively. In the context of the high amount of missing data, these results suggest that the performance of the model benefits considerably from including ESAT total scores as an auxiliary variable in the model (see Collins et al., 2001).
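As an illustration of the statistic used here, a Spearman rank order correlation is the Pearson correlation of the ranks, with tied values receiving their average rank. The following standard-library sketch shows the computation; the function names and the pairwise handling of missing values (`None`) are assumptions of this example and are not taken from the thesis.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation on pairs where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because only ranks enter the computation, the coefficient is insensitive to the skewed distributions that screening sum scores often show.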

Finally, we explored differences between participants who did and participants who did not drop out. Neither age at t1, ESAT sum scores nor gender significantly predicted drop-out (βage=-0.1651, S.E.=0.1269, Wald Z=-1.301, p=0.1933; βESAT=0.1799, S.E.=0.1299, Wald Z=1.385, p=0.1659; βgender=0.3463, Wald Z=1.364, p=0.1727).

2.4.2.1 Analytic Strategy for the Factor, Latent Class and Factor Mixture Analysis

All analyses were carried out in Mplus version 6 (Muthén, & Muthén, 1998-2010). We fitted factor, latent class and factor mixture models to the empirical data. To this end, we used the analytic procedure suggested by Clark, Muthén, Kaprio, D’Onofrio and Viken et al. (2012). First, latent class and exploratory factor models were fitted to the data. We specified exploratory models as we lacked knowledge about the measurement structure of all indicators together. That is, previous studies examined the measurement structure of the current study’s indicators separately (e.g. ADOS-G; Lord, Risi, Lambrecht, Cook, & Leventhal et al., 2000; CBCL-1.5-5; Pandolfi, Magyar, & Dill, 2009), but no study so far has combined these indicators into one single model. Next, the numbers of classes and factors of the best models were used as upper bounds for specifying the factor mixture models. This setup accounts for the possibility that factor mixture models may need fewer classes or factors to describe the data accurately. That is, factor mixture models may, for instance, capture additional continuous variation within classes as compared to latent class models (Clark et al., 2012). Second, factor mixture models were fitted, with an increasing number of classes and factors (see table 4).

All models were estimated with a maximum likelihood estimator with robust standard errors (MLR) using a numerical integration algorithm. MLR uses the FIML missing data estimation approach, which performs well under MAR (Enders & Bandalos, 2001), and provides robust estimates for non-normal data (Muthén, & Asparouhov, 2002; Yuan, & Bentler, 2000; Yuan, Yang-Wallentin, & Bentler, 2012). Mixture models are susceptible to local solutions, and it is therefore recommended to run the model multiple times with varying starting values. Hipp and Bauer (2006) recommend at least 50-100 sets of sufficiently varied random starting values, and suggest considering the frequency of the optimal solution as a model fit diagnostic. The current study aimed to replicate the highest log likelihood at least twice in the final stage solutions (Muthén, & Muthén, 1998-2010) for at least two runs with varying starting values (cf. Asparouhov & Muthén, 2012). When several final stage solutions resulted in similar log likelihood values³, the parameter estimates of these solutions were studied. If these estimates were similar, we chose the solution with the highest log likelihood value (Muthén, & Muthén, 1998-2010). When the best log likelihood value was not replicated, or when models had not converged using 2000 random starts and 500 final stage iterations, we tried to fit the model using user-specified starting values. To obtain these starting values, we estimated the continuous and categorical model parts separately (Muthén, & Muthén, 1998-2010, p. 414).
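The replication check described above can be expressed as a small helper that counts how often the best log likelihood recurs among the final stage solutions. This is a sketch under stated assumptions: the function name, the numerical tolerance and the example log likelihood values are invented for illustration, and the check itself was carried out on Mplus output in the study rather than with this code.

```python
def best_loglik_replicated(logliks, min_replications=2, tol=1e-3):
    """Check whether the highest log likelihood recurs across random starts.

    logliks: final stage log likelihoods, one per set of starting values.
    tol: assumed tolerance for treating two values as the same solution.
    Returns (best value, number of solutions reaching it, replicated?).
    """
    best = max(logliks)
    hits = sum(1 for ll in logliks if abs(ll - best) <= tol)
    return best, hits, hits >= min_replications


# Made-up values for illustration only
final_stage = [-5012.334, -5012.334, -5020.101, -5031.772]
print(best_loglik_replicated(final_stage))  # (-5012.334, 2, True)
```

A best value that appears only once suggests a possible local solution, in which case more random starts or user-specified starting values are the usual next step.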

2.4.2.2 Factor Mixture Model variants

The present study aimed to test whether children with NDD symptoms could be best classified into subgroups that show severity differences in NDD symptoms within these subgroups, and mainly quantitative differences in NDD symptoms between subgroups. To this end we specified five factor mixture model variants (see table 1). More specifically, in Model 1, class differences were only based on mean factor scores, with no symptom heterogeneity within classes (i.e. by specifying the factor covariance matrix equal to zero). In Model 2, class differences were also only based on mean factor scores, but additionally each class was allowed to show symptom heterogeneity. In Model 3, residual variances were in addition freely estimated across classes. In Model 4, differences between classes were not based on mean factor scores, but on mean differences in MSEL Fine Motor, Receptive and Expressive Language scores, and the ADOS-G Communication and Social Interaction domains⁴ (i.e. thresholds varied freely across classes, see table 1). In Model 5, differences between classes were based on mean differences in the same observed variables as well. Additionally, all observed variables contributed differently to variation in factor scores across classes (i.e. factor loadings varied freely across classes, see table 1). Finally, fully exploratory factor mixture models were fitted to the data to allow for even less restrictive factor models within classes (i.e. different cross loadings across classes).
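The model variants described above can be read as restrictions on one general factor mixture model. The formulation below is a sketch for continuous indicators in the spirit of Lubke and Muthén (2005); it is not printed in the thesis. The symbols α (factor means), Ψ (factor covariance matrix), ν (intercepts, written v in the text) and Λ (factor loadings, λ) follow the parameter names used in the text, whereas Θ_k (residual covariance matrix, whose diagonal holds the residual variances), π_k (class proportions) and K (number of classes) are notation assumed here.

```latex
% Factor mixture model for continuous indicators, individual i in latent class k
\begin{align}
  y_{ik} &= \nu_k + \Lambda_k \eta_{ik} + \varepsilon_{ik}, \\
  \eta_{ik} &\sim N(\alpha_k, \Psi_k), \qquad
  \varepsilon_{ik} \sim N(0, \Theta_k), \\
  f(y_i) &= \sum_{k=1}^{K} \pi_k \,
    N\!\left(y_i;\ \nu_k + \Lambda_k \alpha_k,\
    \Lambda_k \Psi_k \Lambda_k^{\top} + \Theta_k\right).
\end{align}
```

In this notation, Model 1 sets Ψ_k = 0 and lets only α_k differ between classes, while Models 4 and 5 fix α_k = 0 and move the class differences into ν_k (and, in Model 5, also Λ_k). Categorical indicators would need thresholds and a link function instead of the linear measurement equation.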

Factors across classes can be interpreted the same in Model 1 and 2. In model variants 3 to 5 and the exploratory models factors across classes should be interpreted differently, implying different factor(s) within each class (see Clark et al., 2013). Model 1 and 2 therefore suggest quantitative rather than qualitative differences across classes, and model 3 to 5 qualitative rather than quantitative differences.

Table 1. Overview of estimated parameters and class varying parameters⁵ for all Factor Mixture Model variants. MI stands for Measurement Invariant, PI for Partially Invariant.

Model   Factor mean (α)   Factor covariance matrix (Ψ)   Residual variances (ε)   Intercepts (v)    Factor loadings (λ)
1       α                 Ψ=0                            class invariant          class invariant   class invariant
2       α                 Ψ                              class invariant          class invariant   class invariant
3       α                 Ψ                              ε                        class invariant   class invariant
4       α=0               Ψ                              ε                        v                 class invariant
5       α=0               Ψ                              ε                        v                 λ

⁴ Thresholds and factor loadings for categorical indicators were kept invariant across classes for all models, as for categorical indicators item thresholds and factor loadings need to be constrained concurrently for model identification (Muthén, & Muthén, 1998-2010).

⁵ Note that thresholds and factor loadings for categorical indicators were kept invariant across classes for all model variants, as for categorical indicators item thresholds and factor loadings need to be constrained concurrently for model identification (Muthén, & Muthén, 1998-2010). Noninvariant models with class varying thresholds and factor loadings for categorical indicators did not converge, most likely as a result of the high amount of missing data for the categorical indicators.

2.4.2.3 Model Comparison

We compared model fit for exploratory factor models, latent class models and factor mixture models using goodness of fit criteria and likelihood ratio tests. Simulation studies have shown that the Bayesian Information Criterion (BIC) outperformed the Akaike Information Criterion (AIC) and the adjusted BIC (aBIC) in estimating model fit (Nylund, Asparouhov & Muthén, 2007). We therefore first used BIC indices to compare model fit of the exploratory factor, latent class and factor mixture models. For the latent class and factor mixture models we also considered entropy values. Entropy values give an indication of correct class assignment. Models with entropy values around .80 assign at least 90% of the sample to the correct class (Lubke, & Muthén, 2007). Latent class membership was used as a predictor in the second part of our study, and we therefore aimed to select not only a well-fitting model, but also a model with high separation between classes. Therefore, as a second criterion, only models with entropy values around 0.80 were selected. Lastly, we calculated Bootstrapped Likelihood Ratio Test (BLRT) p-values for the remaining models (Lubke & Neale, 2008; Nylund, Asparouhov, & Muthén, 2007). The BLRT tests a k-1 class model against a k class model using a bootstrapping procedure. A significant BLRT p-value therefore indicates that the model fits the data better than a model with one class fewer (Asparouhov & Muthén, 2012). For the sake of completeness, AIC and aBIC fit indices were reported for all models as well.
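The fit indices named above can be computed from a model's log likelihood, its number of free parameters and the sample size. The sketch below uses the commonly cited definitions (including the sample-size adjusted BIC with n replaced by (n + 2)/24, and the relative entropy based on posterior class probabilities). The thesis does not print these formulas, so the definitions, function names and example numbers are assumptions of this illustration rather than a description of the Mplus output used.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """AIC, BIC and sample-size adjusted BIC (lower values indicate better fit)."""
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "aBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means perfectly separated classes.

    posteriors: one row per participant with posterior class probabilities.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:  # treat 0 * log(0) as 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n * math.log(k))


# Made-up example: log likelihood -100 with 5 parameters and N = 344
fit = information_criteria(-100.0, 5, 344)
```

Because the BIC penalty grows with the logarithm of the sample size, it favors more parsimonious models than the AIC at the sample size of the present study.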

2.4.2.4. Comparing relative proportions of DSM-IV diagnoses within each class

DSM-IV diagnoses were not included in our model to predict latent class assignment, as our aim was to obtain the empirical classification without using DSM-IV classifications. Instead, after selecting the best model, latent class probabilities were calculated for every participant. Calculating relative proportions with most likely class membership may not be reliable for relatively small classes (Tueller & Lubke, 2010). To enhance reliability, we therefore used a correction method (Bakk, Tekle, & Vermunt, 2013). That is, relative proportions of DSM-IV diagnoses for each class were calculated using most likely class membership probabilities corrected for bias.
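Both most likely class membership and the entropy criterion derive from the posterior class probabilities of each participant. The following sketch shows the two computations under the assumption that entropy is the relative entropy, 1 - Σ_i Σ_k (-p_ik ln p_ik) / (N ln K). The function names and the four example participants are hypothetical and serve illustration only.

```python
import math

def modal_class(posteriors):
    """Index of the most likely class for each participant."""
    return [max(range(len(row)), key=row.__getitem__) for row in posteriors]

def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; values near 1 indicate well-separated classes."""
    n = len(posteriors)
    k = len(posteriors[0])
    # Terms with p = 0 contribute nothing to the sum and are skipped.
    uncertainty = -sum(p * math.log(p) for row in posteriors for p in row if p > 0.0)
    return 1.0 - uncertainty / (n * math.log(k))

# Hypothetical posterior probabilities for four participants and two classes.
example = [[0.95, 0.05], [0.10, 0.90], [0.60, 0.40], [0.02, 0.98]]
print(modal_class(example))
print(round(relative_entropy(example), 3))
```

A participant with posterior probabilities close to .50/.50 is still assigned to one class by the modal rule, which is why proportions based on most likely class membership can be unreliable and call for a correction.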

2.4.2.5. Prediction of NDD symptoms at t2 by empirically derived and diagnostic classifications

We used diagnostic and empirical classifications (i.e. DSM-IV diagnoses and latent class membership at t1) to predict NDD symptoms at t2, using multivariate regression models. NDD symptoms at t2 comprised CBCL1.5-5 sum scores for the Attention, Sleep and Oppositional Defiant Behavior Problem scales and the MSEL Fine Motor, Expressive and Receptive Language scales.

The diagnostic predictor consisted of the categories Autism, PDD-NOS, Mixed and Expressive Language Disorders and Intellectual Disability without PDD-NOS. To improve power and to account for the current diagnostic standards, DSM-IV diagnoses were lumped into DSM-5 categories. That is, Autism and PDD-NOS diagnoses were lumped into one “Autism Spectrum” category, and Mixed and Expressive Language Disorders into one “Language Disorder” category. Two dummy variables identifying group membership were used as predictors in a multivariate regression model (i.e. one with ones for “Autism Spectrum” and zeros for “Language Disorder” and “Intellectual Disability without PDD-NOS”, and one with ones for “Language Disorder” and zeros for “Autism Spectrum” and “Intellectual Disability without PDD-NOS”).
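The lumping of diagnoses and the dummy coding described above can be written out as follows. This is a minimal sketch; the dictionary and function names are hypothetical, and the category without a dummy variable of its own (Intellectual Disability) serves as the reference category.

```python
# DSM-IV diagnoses lumped into DSM-5 categories, as described in the text.
DSM5_CATEGORY = {
    "Autism": "Autism Spectrum",
    "PDD-NOS": "Autism Spectrum",
    "Mixed Language Disorder": "Language Disorder",
    "Expressive Language Disorder": "Language Disorder",
    "Intellectual Disability without PDD-NOS": "Intellectual Disability",
}

def dummy_code(category):
    """Return (autism_spectrum, language_disorder) indicators.

    Intellectual Disability is the reference category and is coded (0, 0).
    """
    if category not in set(DSM5_CATEGORY.values()):
        raise ValueError(f"unknown DSM-5 category: {category}")
    return (int(category == "Autism Spectrum"), int(category == "Language Disorder"))

for diagnosis, category in DSM5_CATEGORY.items():
    print(diagnosis, "->", category, dummy_code(category))
```

With this coding, each regression coefficient compares one category with the reference category, and the joint test of both dummy variables has two degrees of freedom.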

Several studies reported biased estimates when class membership was used to predict external variables (e.g. Bolck, Croon, & Hagenaars, 2004; Cheng, 2012). To address this, we used the bias-adjusted three-step approach (Bakk, Tekle, & Vermunt, 2013) to predict NDD symptoms at t2. The bias-adjusted class assignments were calculated in R (R Development Core Team, 2011), using the R code provided by Bakk et al. (2013).
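Bias-adjusted three-step approaches rely on an estimate of how often participants who truly belong to class t are assigned to class s. The sketch below shows one way to estimate this classification error matrix from posterior probabilities under modal assignment. It is an illustration in Python with hypothetical names and data, not the R code of Bakk et al. (2013), and it covers only this one ingredient of the correction.

```python
def classification_error_matrix(posteriors):
    """Estimate D[t][s] = P(assigned class = s | true class = t).

    Each participant contributes its posterior probability for class t
    to the column of the class it was assigned to (modal assignment).
    """
    k = len(posteriors[0])
    assigned = [max(range(k), key=row.__getitem__) for row in posteriors]
    counts = [[0.0] * k for _ in range(k)]
    for row, s in zip(posteriors, assigned):
        for t in range(k):
            counts[t][s] += row[t]
    # Normalize each row so that the probabilities for a true class sum to 1.
    return [[value / sum(line) for value in line] for line in counts]

# Hypothetical posterior probabilities for three participants and two classes.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
for line in classification_error_matrix(posteriors):
    print([round(value, 2) for value in line])
```

Off-diagonal entries quantify the misclassification that would otherwise attenuate the estimated relation between class membership and external variables.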

3. Results

3.1 FA & LCA fit results

A two-class model and a three-factor model provided the best fit (see table 2 and table 3). We therefore fitted FM models up to 2 classes and 3 factors in the subsequent step.

Table 2. LCA fit results

Model     Free parameters   Log-likelihood   aBIC        BIC         Entropy   BLRT p-value   LMR p-value   MLR scaling correction factor
1-class   50                -6092.991        12319.402   12478.014   --        --             --            0.994
2-class   96                -5740.897        11737.960   12042.496   0.846     0.000          0.000         1.029
3-class   106               no convergence

Table 3. EFA fit results

Model           LogL        #par   BIC         aBIC        AIC         MLR scaling correction factor
EFA 1 factor    -5713.968   75     11865.984   11628.065   11577.936   1.090
EFA 2 factor    -5582.674   99     11743.572   11429.518   11363.348   1.190
EFA 3 factor    -5493.960   122    11700.477   11313.462   11231.919   1.122
EFA 4 factor⁶   -5429.680   144    11700.412   11243.607   11147.360   --

3.2 Factor Mixture Model Fit Results

Fit results for factor mixture models are displayed in table 4. For none of the two class/three factor mixture models was the highest log likelihood value replicated; table 4 therefore does not include results for two class/three factor models. For various two class/two factor and two class/one factor model variants, the highest log likelihood values were not replicated either. These results are marked in table 4 with asterisks.

Variant 4 of the two class/one factor model showed best fit and good class separation, based on respectively the BIC as first criterion and entropy values as second. Furthermore, BLRT for this model was significant, indicating that a two class/one factor solution showed superior fit as compared to a one class/one factor solution. Finally, BIC value was lower for this model than for any factor- or latent class model. The superior fit suggests this model captured additional continuous variation within classes, and additional mean differences between classes as compared to respectively factor- and latent class models. Variant 4 from the two class/one factor models was therefore selected as our best solution. FM fit results therefore suggest that the sample of children at risk for NDD is best described by two separate classes, with one factor within each class. The class-varying intercepts suggest differences between classes for MSEL Fine Motor, Expressive- and Receptive Language scales and ADOS-G Communication and Social Interaction domain scores.

Table 4. FMM fit results

Model        #par   Loglikelihood   BIC         AIC         aBIC        Entropy   BLRT p-value
2c1f
Model 1      76     -5722.430       11888.750   11596.861   11647.658   0.726
Model 2      78     -5703.437       11862.444   11562.874   11615.008   0.804
Model 3      83     -5670.312       11825.397   11506.624   11562.100   0.497
Model 4      87     -5559.832       11627.801   11293.665   11351.814   0.804     0.0000
Model 5      93     -5554.798       11641.095   11291.597   11352.420   0.763
Expl. FM     151    -5502.859       11887.654   11307.717   11352.698   0.843
2c2f
Model 1      101    -5570.250       11730.404   11342.499   11410.006   0.779
Model 2*     105    -5567.534       11748.336   11345.069   11415.249   0.888
Model 3      110    -5545.987       11699.400   11299.974   11369.486   0.757
Model 4*     115    -5461.071       11593.816   11152.142   11229.006   0.797
Model 5*     124    -5454.965       11634.170   11157.930   11240.810   0.614
Expl. FM*    199    -5377.716       11917.720   11153.432   11286.441   0.765     -- --

* highest loglikelihood not replicated
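The selection rule applied to table 4 can be made explicit in a few lines. The sketch below is an illustration under two assumptions of ours: “entropy around .80” is operationalized as entropy ≥ .80, and solutions whose highest log likelihood was not replicated are excluded beforehand. The BLRT step is not included. The function name is hypothetical; the values are those reported in table 4.

```python
# (label, BIC, entropy, highest log likelihood replicated) as reported in table 4.
models = [
    ("2c1f Model 1", 11888.750, 0.726, True),
    ("2c1f Model 2", 11862.444, 0.804, True),
    ("2c1f Model 3", 11825.397, 0.497, True),
    ("2c1f Model 4", 11627.801, 0.804, True),
    ("2c1f Model 5", 11641.095, 0.763, True),
    ("2c1f Expl. FM", 11887.654, 0.843, True),
    ("2c2f Model 1", 11730.404, 0.779, True),
    ("2c2f Model 2", 11748.336, 0.888, False),
    ("2c2f Model 3", 11699.400, 0.757, True),
    ("2c2f Model 4", 11593.816, 0.797, False),
    ("2c2f Model 5", 11634.170, 0.614, False),
    ("2c2f Expl. FM", 11917.720, 0.765, False),
]

def select_model(candidates, min_entropy=0.80):
    """Lowest-BIC model among replicated solutions with sufficient entropy."""
    eligible = [m for m in candidates if m[3] and m[2] >= min_entropy]
    return min(eligible, key=lambda m: m[1]) if eligible else None

print(select_model(models))
```

Under these assumptions the rule returns Model 4 of the two class/one factor models, the solution that was also selected in the present study.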

Figure 1 displays estimated mean performances for fine motor, receptive and expressive language skills (left) and estimated mean scores for the communication and social interaction domains (right) for each class separately. The best FM solution comprised a smaller, severely affected NDD class (i.e. 22.7-24.7% of the sample) and a larger, mildly affected NDD class (75.3-77.3% of the sample). The mild NDD class scored more typically than the severe NDD class: not only did they score higher on the Fine Motor, Expressive and Receptive Language tasks, but they also scored lower on deficits in the Communication and Social Interaction domains. More specifically, the mild NDD class showed average fine motor and receptive language skills, but below average expressive language skills. The severe NDD class showed below average fine motor skills, and weak expressive and receptive language skills. Against our expectations, these differences between classes were independent of their factor scores, suggesting qualitative rather than quantitative differences. The symptom profile of the severe NDD class showed, next to an overall lower level of performance, relatively high amounts of social interaction deficits and impaired fine motor skills (see figure 1).

Figure 1. Class profiles with mean t-values for the MSEL Fine Motor, Expressive and Receptive Language subscales (left) and mean scores of the ADOS-G algorithm for the Communication and Social Interaction domains (right), with error bars representing 99% confidence intervals.⁷ Means within each class are calculated using data from all participants whose posterior class membership probability is largest for that particular class. In clinical practice, t-scores are often expressed in qualitative descriptions of test performance: a t-score between 37-43 corresponds to a “below average” performance, a t-score between 44-56 to an “average” performance, and a t-score between 57-63 to an “above average” performance (Bouma, Mulder, Lindeboom, & Schmand, 2012). A higher score on the ADOS-G algorithm domains corresponds to quantitatively more atypical behavior within the Communication domain (left) or within the Social Interaction domain (right).

Figure 2 displays conditional item response probabilities⁸ for the Attention, Sleep and Oppositional Defiant Behavior Problems and Hyperactivity/Impulsivity items. The severe NDD class showed slightly less frequent problems related to attention, hyperactivity/impulsivity, and oppositional defiant and sleep behavior. Parallel item response profiles would suggest quantitative differences across classes. Reversed patterns in profiles across classes would suggest qualitative differences instead (Lubke, Hudziak, Derks, van Bijsterveldt, & Boomsma, 2009). Figure 2 shows that the conditional item response probabilities follow a parallel pattern within and across the various CBCL1.5-5 subscales. These results therefore suggest (minor) quantitative rather than qualitative differences across classes for attentional problems, hyperactivity/impulsivity, and behavioral and sleeping problems. However, these results should be interpreted with caution, as the fit of Model 4 was not compared to the fit of a model with thresholds varying across classes.¹⁰

⁷ The assumption of univariate normality was checked within each class after the analysis, as the present study assumed the sample to represent a mixture of populations, and the analyses to separate these populations. Within each class, assumptions of univariate skewness and kurtosis were met (|Z_skewness| < 3, |Z_kurtosis| < 3). Multivariate normality was checked by rerunning the analysis with an ML estimator (i.e. a maximum likelihood parameter estimator with non-robust standard errors) and comparing differences in fit and standard errors. There were no differences in fit and all standard error estimates were similar or approximately similar, except for the indicator Social Interaction in class 2, suggesting non-normality.

⁸ More specifically, response probabilities are conditioned on class, but not on factor scores, so they represent conditional probabilities for each class summed across all factor scores within each class.
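Response probabilities that are conditional on class but not on factor scores are obtained by averaging the item response function over the factor distribution within a class. The sketch below illustrates this for a three-category item. It assumes a logit link, a normally distributed factor within the class and a simple grid approximation of the integral; none of these choices is taken from the estimation output of this study, and the function names, loading and thresholds are hypothetical.

```python
import math

def category_probabilities(eta, loading, thresholds):
    """Ordinal response probabilities given factor score eta (assumed logit link)."""
    cumulative = [1.0 / (1.0 + math.exp(-(loading * eta - tau))) for tau in thresholds]
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[c] - bounds[c + 1] for c in range(len(thresholds) + 1)]

def marginal_probabilities(loading, thresholds, factor_mean=0.0, factor_sd=1.0, n_points=2001):
    """Average the conditional probabilities over a normal factor distribution."""
    lower = factor_mean - 8.0 * factor_sd
    step = 16.0 * factor_sd / (n_points - 1)
    totals = [0.0] * (len(thresholds) + 1)
    weight_sum = 0.0
    for i in range(n_points):
        eta = lower + i * step
        # Unnormalized normal density; dividing by weight_sum normalizes it.
        weight = math.exp(-0.5 * ((eta - factor_mean) / factor_sd) ** 2)
        weight_sum += weight
        for c, p in enumerate(category_probabilities(eta, loading, thresholds)):
            totals[c] += weight * p
    return [total / weight_sum for total in totals]

# Hypothetical item: "Not True", "Somewhat or Sometimes True", "Very True or Often True".
print([round(p, 3) for p in marginal_probabilities(loading=1.2, thresholds=[0.5, 2.0])])
```

Because thresholds and loadings were held invariant across classes, class differences in such marginal probabilities can only arise from differences in the factor distribution between classes.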

To summarize, FM fit results suggest that the sample of children at risk for NDD is best described by two separate classes: a mild NDD class and a severe NDD class. The severe NDD profile is characterized by relatively more impairments in communication and social interaction and weaker (fine) motor and language skills. The mild NDD profile is characterized by slightly more frequent attention problems, hyperactivity/impulsivity and problem behavior. The severe NDD class shows below average fine motor skills and weak expressive and receptive language skills. The mild NDD class shows average fine motor and receptive language skills, but below average expressive language skills.

Figure 2. Class profiles based on estimated probabilities conditional on class, integrated over the factor scores within class, for responses “Not True” (i.e. behavior is not observed), “Somewhat or Sometimes True” and “Very True or Often True” (i.e. behavior is often present).

Table 5 depicts the standardized loadings for the one-factor solution in both classes. CBCL item 95 (“Wanders away”) was excluded from the analysis as there was too little variation in the sample responses for this item. All indicators loaded significantly on the same factor within each class, except for Social Interaction and CBCL item 56 (“Clumsy”; Attention Problems) (table 5). In addition, indicators Communication and MSEL Fine Motor did not contribute much to variation in factor scores, as factor loadings for these indicators were small (<.3). Against our expectations, the factor did not represent NDD severity, as high factor scores related to better performance (MSEL and ADOS-G indicators), as well as more frequent symptoms (CBCL1.5-5 indicators).

Table 5. One-factor structure of the 2c1f PI-1 model indicators¹¹ for class 1 and 2 (N=344)

_________________________________________________________________________________________
Indicator                                     Standardized loading   S.E.    Est./S.E.   p-value
_________________________________________________________________________________________
Attention Problems
  5. Can’t concentrate                        0.581                  0.095   6.139       0.000
  6. Can’t sit still                          0.765                  0.066   11.578      0.000
  56. Clumsy                                  0.200                  0.123   1.622       0.105
  59. Quickly shifts activity                 0.624                  0.096   6.533       0.000
Additional Hyperactive/Impulsivity scale items:
  8. Can’t stand waiting                      0.748                  0.075   9.981       0.000
  16. Demands must be met                     0.755                  0.091   8.282       0.000
  36. Gets into everything                    0.663                  0.088   7.566       0.000
Sleep Problems
  22. Doesn’t want to sleep alone             0.481                  0.166   2.892       0.004
  38. Trouble sleeping                        0.597                  0.125   4.796       0.000
  48. Nightmares                              0.688                  0.092   7.463       0.000
  64. Resists bed                             0.603                  0.132   4.572       0.000
  74. Sleeps little                           0.610                  0.113   5.391       0.000
  84. Talks, cries in sleep                   0.454                  0.127   3.562       0.000
  94. Wakes often                             0.626                  0.097   6.464       0.000
Oppositional Defiant Problems
  15. Defiant                                 0.671                  0.086   7.770       0.000
  20. Disobedient                             0.691                  0.084   7.414       0.000
  44. Angry moods                             0.652                  0.099   6.606       0.000
  81. Stubborn                                0.775                  0.069   11.150      0.000
  85. Temper                                  0.672                  0.110   6.097       0.000
  88. Uncooperative                           0.618                  0.099   6.262       0.000
ADOS-G Communication Algorithm Domain score   -0.271                 0.111   -2.454      0.014
ADOS-G Social Interaction Algorithm Domain score   -0.101            0.076   -1.332      0.183
MSEL Fine Motor Scale                         0.232                  0.052   4.446       0.000
MSEL Receptive Language Scale                 0.626                  0.062   10.177      0.000
MSEL Expressive Language Scale                0.545                  0.061   8.881       0.000
_________________________________________________________________________________________

¹¹ CBCL item 95 (“Wanders away”; Attention Problems Domain) was excluded from all analyses as there was too little response variation in the sample for this item.

3.3 Predicted relative proportions of DSM-IV diagnoses and other classifications

The relative proportions of DSM-IV diagnoses across classes are displayed in table 6. The severe NDD class showed relatively higher proportions of DSM-IV diagnoses that are characterized by more symptoms. That is, the severe NDD class contained relatively higher proportions of Autism and Mixed Language Disorders. The severe NDD class additionally showed higher estimated proportions for Intellectual Disability and lower estimated proportions for children that were investigated but not diagnosed (“Not diagnosed”). To summarize, in line with our predictions, these predicted proportions suggest that classes differed mainly quantitatively, rather than qualitatively, in DSM-IV symptoms.

Table 6. Averaged relative proportions of DSM-IV NDD diagnoses for class 1 and class 2, obtained using proportional latent class assignments corrected for bias.

__________________________________________________________________________________________
DSM-IV classification                                                  Severe NDD class   Mild NDD class
__________________________________________________________________________________________
Expressive Language Disorder (N=23)                                    0.26               0.74
Mixed Language Disorder (N=7)                                          0.59               0.41
Articulation Disorder (N=1)                                            1                  0
Autism (N=21)                                                          1                  0
Autism with co-morbid disorder (N=3)                                   0.80               0.20
PDD-NOS (N=13)                                                         0.80               0.20
PDD-NOS with co-morbid disorder (N=3)                                  0.55               0.45
ADHD (N=9)                                                             0.15               0.85
Intellectual Disability (N=22)                                         0.88               0.12
“Development Disorder within another area than language” (N=1)         0                  1
Axis-II diagnosis (N=8)                                                0.21               0.79
Different Axis-I Disorder (N=12)                                       0.11               0.89
Not diagnosed (N=8)                                                    0.09               0.91
No information available (N=214)                                       0.05               0.95
__________________________________________________________________________________________

3.4 Predictions of empirically derived model and diagnostic classifications at t1 for symptoms associated with NDD at t2.

The overall effect of DSM-5 categories “Autism Spectrum Disorder” (ASD), “Language Disorder” (LD) and “Intellectual Disability” (ID) on NDD symptom measures at t2 was not significant, Wald χ²(2)=4.047, p=0.1322.

Latent class membership at t1 did predict NDD symptom measures at t2 significantly, Wald χ²(1)=5.677, p<0.05. More specifically, relatively higher probabilities of being a member of the severe NDD class predicted a relatively lower performance on the Fine Motor, Receptive and Expressive Language MSEL subtasks at t2 (table 7). Furthermore, relatively higher probabilities of being a member of the severe NDD class also predicted higher total scores for the Attentional Problem and Oppositional Defiant CBCL1.5-5 subscales (table 7). Latent class membership did not predict Sleep Problem sum scores significantly. These predictions at t2 differed partly from the latent class profiles found at t1; a higher probability of being a member of the severe NDD class at t1 predicted relatively lower fine motor and language skills, but relatively more attention, sleep and behavioral problems at t2.
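The p-values of the two Wald tests follow directly from the chi-square distribution. For one and two degrees of freedom the upper-tail probability has a closed form, which the sketch below uses; the function name is hypothetical and other degrees of freedom are deliberately not covered.

```python
import math

def wald_p_value(chi_square, df):
    """Upper-tail chi-square probability for df = 1 or df = 2 (closed forms)."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal variable.
        return math.erfc(math.sqrt(chi_square / 2.0))
    if df == 2:
        return math.exp(-chi_square / 2.0)
    raise ValueError("closed form implemented for df = 1 and df = 2 only")

print(round(wald_p_value(4.047, 2), 4))   # test of the DSM-5 categories
print(wald_p_value(5.677, 1) < 0.05)      # test of latent class membership
```

The first test has two degrees of freedom because the three DSM-5 categories are represented by two dummy variables; the second has one because latent class membership enters as a single predictor.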

Table 7. Multivariate regression results for (corrected) latent class membership probabilities. P-values are corrected for multiple comparisons using the BY method that accounts for dependent tests (Benjamini & Yekutieli, 2001).

_______________________________________________________________________________________________________________
Measure at t2                β        S.E.    Wald Z    p-value
_______________________________________________________________________________________________________________
MSEL Fine Motor              -0.426   0.076   -5.637    0.00000
MSEL Receptive Language      -0.421   0.071   -5.5928   0.00000
MSEL Expressive Language     -0.404   0.080   -5.030    0.00000
CBCL Attentional Problems    0.346    0.072   4.838     0.00000
CBCL Oppositional Defiant    0.206    0.079   2.601     0.02646
CBCL Sleep Problems          0.126    0.083   1.508     0.32340
_______________________________________________________________________________________________________________
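The BY correction multiplies each ordered p-value by m · c(m) / rank, where m is the number of tests and c(m) = 1 + 1/2 + ... + 1/m, and then enforces monotonicity. The sketch below is a plain implementation of this step-up adjustment; the function name and the example p-values are hypothetical and are not the uncorrected p-values of the present study.

```python
def benjamini_yekutieli(p_values):
    """Return BY-adjusted p-values in the original order."""
    m = len(p_values)
    harmonic = sum(1.0 / j for j in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone and <= 1.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m * harmonic / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical uncorrected p-values for six outcome measures.
raw = [0.0001, 0.0002, 0.0004, 0.0010, 0.0090, 0.1300]
print([round(p, 5) for p in benjamini_yekutieli(raw)])
```

With six outcome measures, c(m) equals 2.45, so the largest uncorrected p-value is multiplied by 2.45 and smaller ones by progressively larger factors.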

Discussion

The present study aimed to obtain an empirically derived classification of NDD markers in young children at risk for NDD. More specifically, the present study tested whether NDD markers in young children at risk for NDD could be most optimally classified into latent classes that, firstly, showed severity differences in NDD symptoms, and secondly comprised symptoms from various NDD categories. Finally, the present study examined whether subgroups showed mainly quantitative rather than qualitative differences.

Results suggest that the sample of children at risk for NDD is best described by two separate classes: one small (i.e. 22.7-24.7% of the sample), severely impaired NDD class and one large, mildly impaired NDD class (i.e. 75.3-77.3% of the sample). Both classes show continuous differences within each class in one factor. Against our expectations, this factor does not represent one NDD severity dimension, as higher factor scores correspond to better performance (MSEL indicators) and fewer impairments (ADOS-G indicators), but also to more frequent symptoms (CBCL1.5-5 indicators). Furthermore, differences between classes are not based on differences in this factor, but on mean differences in fine motor and language skills, and in communication and social interaction impairments. These results therefore suggest qualitative, rather than quantitative differences between classes.

Additionally, the present study compared the empirical classification with DSM-IV classifications at t1. The severe NDD class showed relatively higher proportions of DSM-IV diagnoses characterized by more symptoms, suggesting mainly quantitative rather than qualitative differences in DSM-IV symptoms across classes.

The second aim of the present study was to test whether the empirical and DSM classifications predict NDD symptoms at t2. Results suggest that higher probabilities of being a member of the severe NDD class at t1 predict lower functioning across several NDD markers at t2. That is, higher probabilities of being a member of the severe NDD class predicted lower fine motor, expressive and receptive language abilities at t2, and relatively more attention and behavioral problems. The DSM-5 categories “Autism Spectrum Disorder” (ASD), “Language Disorder” (LD) and “Intellectual Disability” (ID) did not predict NDD symptom measures at t2. However, the DSM-5 analysis was done in a subsample diagnosed with Autism, PDD-NOS, Mixed and Expressive Language Disorder and Intellectual Disability. These results may therefore not be representative for the whole sample.

The results of the present study firstly seem to support previous research that reported continuous differences within latent classes in a sample of children with various NDD’s and typically developing children (Dyck, Piek & Patrick, 2011). The present study generalized these results to a different sample of (very) young children at risk for NDD. Secondly, present results support findings of quantitative differences in DSM-IV symptoms across latent classes in various NDD samples (e.g. Leyfer et al., 2008; van der Meer et al., 2012). That is, the severe NDD class comprised higher proportions of DSM-IV diagnoses characterized by more symptoms. Third, the results provide additional support for different symptom profiles in young children with developmental problems (Möricke, Lappenschaar, Swinkels, Rommelse & Buitelaar, 2013). That is, different symptom profiles were found for children at risk for NDD with a high amount of communication and social interaction impairments, as compared to a low amount of communication and social interaction impairments. Similar results have been reported in a sample of children aged between 14-18 months (Möricke et al., 2013). In addition to this study, however, present results suggest a symptom profile that includes impaired (fine) motor and language skills, and problems related to attention, hyperactivity/impulsivity, sleep and oppositional defiant behavior. Finally, the results did not provide conclusive evidence on whether symptom severity in young children at risk for NDD is represented by common NDD severity factor(s). That is, there are various alternative explanations for a partially measurement invariant model providing superior fit. One likely explanation could be that measurement invariant FM models do not seem to fit real data well in practice (Clark, Muthén, Kaprio, D’Onofrio, Viken, & Rose, 2013). This suggests that the chance of finding a measurement invariant FM model to provide the best fit is low a priori, let alone with a high amount of missing data and a small sample size.

The present study provides insight into the behavioral profile of young children at risk for NDD, as the empirical data firstly support specific symptom profiles. Secondly, the results suggest specific symptom profiles for children that seem to have a higher risk of developing more, and more severe, impairments in the near future. The question remains, however, to what extent the results of the present study should be considered reliable. That is, the present study showed various shortcomings related to the data, the measurement instruments and the factor mixture analyses. Future studies of NDD symptom profiles should aim to overcome all, or at least some, of these limitations, so that results can be interpreted more confidently.

The main limitation of the present study concerned the data, which contained selective missing data for ADOS-G and CBCL1.5-5 measures. Missing data for ADOS-G and CBCL1.5-5 measures were related to lower ESAT sum scores (and therefore to a potentially lower risk to develop NDD symptoms). As a consequence, children with relatively low ESAT sum scores were overrepresented as compared to children with higher ESAT sum scores. We included ESAT sum scores to account for this shortcoming, but the actual impact of these selective missing data on the parameter estimates remains unknown. Additionally, as a result of the high amount of missing data and the high number of model parameters to be estimated, we were not able to test whether fully noninvariant models would fit the data better than partially invariant models. Moreover, we did not find stable solutions for most two class/two factor models, or for any of the two class/three factor models. Again, the high amount of missing data on the one hand, and the high number of model parameters on the other, most likely restricted our capability to test these relatively more complex models.

Another limitation relates to the selective missing data for CBCL1.5-5 measures. The categorical nature of these measures together with the overrepresentation of low scoring children may have oversimplified the representation of several NDD markers within classes. That is, parents of children that scored below or above the ESAT cut-off may have had the tendency to respond relatively more positively on various CBCL items. The limited response options for this part of the data may therefore have produced highly skewed items, which may have distorted the results of factor analysis within each class. Such problems can be overcome by either using only continuous measures, or specifying two-part factor mixture models, in which categorical and continuous components of the same measures are modelled concurrently (Kim & Muthén, 2009).

Additionally, the present study did not include all relevant markers for NDD, which may have influenced the class assignments. Future studies should examine whether more general measures of social interaction and motor performance, and additional measures of mood symptoms (Gillberg, 2010) and sensory sensitivity (e.g. Baranek, Boyd, Poe, David, & Watson, 2007; Piek & Dyck, 2004), may lead to different classifications.

Finally, it is not only different measures that may lead to different classification results, as there may be various alternative interpretations for good fitting factor mixture models. To name a few, skewed data may be a reason for a good fitting factor mixture model (Lubke & Neale, 2008), as well as non-normality of the factor across classes and misspecification of the measurement model (Bauer & Curran, 2004). Future studies should therefore aim to replicate results of factor mixture analyses in independent samples, using reliable, well-established measurement instruments. Future studies that aim to model NDD symptoms using FMA could additionally benefit greatly from conducting simulation studies prior to the analyses. That is, simulation studies prior to the analyses could be used to obtain power estimates of certain outcomes, given varying data properties such as mean differences between classes, missing data and sample size.

To summarize, results of the present study suggest that the sample of children at risk for NDD is best described by two separate classes, one mild and one severe NDD class. Furthermore, they suggest that higher probabilities of being a member of the severe NDD class predict lower functioning across several NDD markers in one to two years. Despite various limitations, the present study shows that future studies on the classification of NDD can use factor mixture modeling to obtain more insight into the clinical phenotypes of young children at risk for NDD. Accurate classifications of NDD symptoms could support the search for shared and non-shared genetic mechanisms involved in the onset of and susceptibility to NDD’s. Moreover, accurate classifications are equally important to research on ‘central deficits’ in NDD, and to research studying differences among specific or various NDD’s at other explanatory levels. They are important because clinical phenotypes of NDD derived from empirical data seem to resemble reality more accurately than the NDD DSM-5 categories do at present; whereas empirically derived phenotypes of NDD seem to account for the high degree of overlap, DSM-5 categories do not. Studying brain, cognitive and behavioral functioning in these clinical phenotypes will therefore not only broaden our knowledge on the clinical manifestation of NDD’s, but may also lead to insights that can be applied in clinical practice. That is, present results suggest that mildly as well as more severely impaired children at risk for NDD show impairments across various developmental areas. Current diagnostic criteria, however, seem to focus only on what is most impaired. Focusing too restrictively on only those most impaired areas carries the risk that a child develops problems in another developmental area, problems that were previously only present on a subclinical level (Gillberg, 2010). Therefore, providing more accurate NDD classifications that acknowledge this high overlap in symptoms will guide future interventions to target the most salient as well as the (relatively) less salient, subclinical NDD symptoms.

References

Achenbach, T. M., & Rescorla, L. A. (2001). Manual for the ASEBA Preschool Forms & Profiles. Burlington, VT: University of Vermont Department of Psychiatry.

American Psychiatric Association (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington, DC: Author.

American Psychiatric Association (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: Author.

Asparouhov, T., & Muthén, B. (2012). Using Mplus TECH11 and TECH14 to test the number of latent classes (Mplus Web Notes: No. 14). Los Angeles, CA: Mplus.

Bakk, Z., Tekle, F. B., & Vermunt, J. K. (2013). Estimating the association between latent class membership and external variables using bias-adjusted three-step approaches. Sociological Methodology, 43(1), 272-311. doi: 10.1177/0081175012470644

Bale, T. L., Baram, T. Z., Brown, A. S., Goldstein, J. M., Insel, T. R., McCarthy, M. M., ... & Nestler, E. J. (2010). Early life programming and neurodevelopmental disorders. Biological Psychiatry, 68(4), 314-319. doi:10.1016/j.biopsych.2010

Baranek, G. T., Boyd, B. A., Poe, M. D., David, F. J., & Watson, L. R. (2007). Hyperresponsive sensory patterns in young children with autism, developmental delay, and typical development. American Journal on Mental Retardation, 112(4), 233-245.

Barthélémy, C. (2014). Understanding the neuro-developmental pathogenesis of social disability: towards a cross-disorder approach. European Child & Adolescent Psychiatry, 23, 59-60. doi: 10.1007/s00787-013-0516-5

Bauer, D. J., & Curran, P. J. (2004). The integration of continuous and discrete latent variable models: Potential problems and promising opportunities. Psychological Methods, 9(1), 3-29. doi: 10.1037/1082-989X.9.1.3

Bishop, S. L., Guthrie, W., Coffing, M., & Lord, C. (2011). Convergent validity of the Mullen Scales of Early Learning and the differential ability scales in children with autism spectrum disorders. American Journal on Intellectual and Developmental Disabilities, 116(5), 331-343. doi:

Carter, C. S. (2007). Sex differences in oxytocin and vasopressin: Implications for autism spectrum disorders? Behavioural Brain Research, 176(1), 170–186. doi:10.1016/j.bbr.2006.08.025

Carter, A. S., Black, D. O., Tewani, S., Connolly, C. E., Kadlec, M. B., & Tager-Flusberg, H. (2007). Sex differences in toddlers with autism spectrum disorders. Journal of Autism and Developmental Disorders, 37(1), 86–97. doi: 10.1007/s10803-006-0331-7

Celeux, G., & Soromenho, G. (1996). An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification, 13, 195-212. Retrieved from http://www.link.springer.com/article/10.1007/BF01246098#page-1

Chen, C. Y., Liu, C. Y., Su, W. C., Huang, S. L., & Lin, K. M. (2007). Factors associated with the diagnosis of neurodevelopmental disorders: a population-based longitudinal study. Pediatrics, 119(2), e435-e443. doi: 10.1542/peds.2006-1477

Cheng, Z. (2012). The Relation between Uncertainty in Latent Class Membership and Outcomes in a Latent Class Signal Detection Model (Doctoral dissertation). Retrieved from: http://hdl.handle.net/10022/AC:P:13139

Clark, S. L., Muthén, B., Kaprio, J., D'Onofrio, B. M., Viken, R., & Rose, R. J. (2013). Models and strategies for factor mixture analysis: An example concerning the structure underlying psychological disorders. Structural Equation Modeling: a Multidisciplinary Journal, 20(4), 681-703. doi: 10.1080/10705511.2013.824786

Collins, L. M., Schafer, J. L., & Kam, C. M. (2001). A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods, 6(4), 330-351. doi:10.1037//1082-989X.6.4.330

Demers, M. M., McNevin, N., & Azar, N. R. (2013). ADHD and motor control: A review of the motor control deficiencies associated with attention deficit/hyperactivity disorder and current treatment options. Critical Reviews™ in Physical and Rehabilitation Medicine, 25(3-4). doi:10.1615/CritRevPhysRehabilMed.2013009763

Dietz, C., Swinkels, S., van Daalen, E., van Engeland, H., & Buitelaar, J. K. (2006). Screening for autistic spectrum disorder in children aged 14–15 months. II: population screening with the Early Screening of Autistic Traits Questionnaire (ESAT). Design and general findings. Journal of Autism and Developmental Disorders, 36(6), 713-722. doi: 10.1007/s10803-006-0114-1

Dyck, M. J., Piek, J. P., & Patrick, J. (2011). The validity of psychiatric diagnoses: The case of ‘specific’ developmental disorders. Research in Developmental Disabilities, 32(6), 2704-2713. doi: 10.1016/j.ridd.2011.06.001

Ehninger, D., Li, W., Fox, K., Stryker, M. P., & Silva, A. J. (2008). Reversing neurodevelopmental disorders in adults. Neuron, 60(6), 950-960. doi: 10.1016/j.neuron.2008.12.007

Enders, C. K., & Bandalos, D. L. (2001). The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling, 8(3), 430-457. doi: 10.1207/S15328007SEM0803_5

Fliers, E. A., Franke, B., & Buitelaar, J. K. (2010). Motor problems in children with ADHD receive too little attention in clinical practice. Nederlands Tijdschrift voor Geneeskunde, 155(50), A3559-A3559. Retrieved from: http://www.ntvg.nl/

Fournier, K. A., Hass, C. J., Naik, S. K., Lodha, N., & Cauraugh, J. H. (2010). Motor coordination in autism spectrum disorders: a synthesis and meta-analysis. Journal of Autism and Developmental Disorders, 40(10), 1227-1240. doi: 10.1007/s10803-010-0981-3

Gillberg, C. (2010). The ESSENCE in child psychiatry: early symptomatic syndromes eliciting neurodevelopmental clinical examinations. Research in Developmental Disabilities, 31(6), 1543-1551. doi: 10.1016/j.ridd.2010.06.002

Grados, M. A., & Mathews, C. A. (2008). Latent class analysis of Gilles de la Tourette syndrome using comorbidities: clinical and genetic implications. Biological Psychiatry, 64(3), 219-225. doi:10.1016/j.biopsych.2008.01.019

Groen, W. B., Swinkels, S. H., van der Gaag, R. J., & Buitelaar, J. K. (2007). Finding effective screening instruments for autism using Bayes theorem. Archives of Pediatrics & Adolescent Medicine, 161(4), 415-416. doi:10.1001/archpedi.161.4.415

Hipp, J. R., & Bauer, D. J. (2006). Local solutions in the estimation of growth mixture models. Psychological Methods, 11(1), 36-53. doi:10.1037/1082-989X.11.1.36

Ivanova, M. Y., Achenbach, T. M., Rescorla, L. A., Harder, V. S., Ang, R. P., Bilenberg, N., … & seven-syndrome model of the child behavior checklist for ages 1.5–5. Journal of the American Academy of Child & Adolescent Psychiatry, 49(12), 1215-1224. doi: 10.1016/j.jaac.2010.08.019

Joreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183-202. Retrieved from: http://link.springer.com/article/10.1007/BF02289343#page-1

Kadesjö, B. (2000). Neuropsychiatric and neurodevelopmental disorders in a young school-age population. Epidemiology and comorbidity in a school health perspective (Doctoral thesis). Retrieved from: http://swepub.kb.se/

Kaplan, B. J., Dewey, D. M., Crawford, S. G., & Wilson, B. N. (2001). The term comorbidity is of questionable value in reference to developmental disorders data and theory. Journal of Learning Disabilities, 34(6), 555-565. doi: 10.1177/002221940103400608

Kim, Y., & Muthén, B. O. (2009). Two-part factor mixture modelling: Application to an aggressive behavior measurement instrument. Structural Equation Modeling, 16(4), 602-624. doi:10.1080/10705510903203516

Landa, R. J. (2005). Assessment of social communication skills in pre-schoolers. Mental Retardation and Developmental Disabilities Research Reviews, 11(3), 247-252. doi: 10.1002/mrdd.20079

Landa, R. J., Holman, K. C., & Garrett-Mayer, E. (2007). Social and communication development in toddlers with early and later diagnosis of autism spectrum disorders. Archives of General Psychiatry, 64(7), 853-864. doi:10.1001/archpsyc.64.7.853.

Leyfer, O. T., Tager‐Flusberg, H., Dowd, M., Tomblin, J. B., & Folstein, S. E. (2008). Overlap between autism and specific language impairment: comparison of autism diagnostic interview and autism diagnostic observation schedule scores. Autism Research, 1(5), 284-296. doi:10.1002/aur.43

Lord, C., Risi, S., Lambrecht, L., Cook, E. H., Leventhal, B. L., DiLavore, P. C., ... & Rutter, M. (2000). The Autism Diagnostic Observation Schedule-Generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of Autism and Developmental Disorders, 30(3), 205-223. doi:0162-3257/00/0600-0205$18.00/0

Lubke, G. H., Dolan, C. V., Kelderman, H., & Mellenbergh, G. J. (2003). On the relationship between sources of within-and between-group differences and measurement invariance in the common
