A comparison of three response time-accuracy models

Academic year: 2021

Hannah Ros Sigurdardottir¹, Gunter Maris¹,² & Maarten Marsman¹

¹ Department of Psychology, Psychological Methods, University of Amsterdam, The Netherlands

² Psychometric Research Center, Cito, Arnhem, the Netherlands

Abstract

There are three popular item response models that jointly model response accuracy and response times: the drift diffusion model (Ratcliff, 1978; Tuerlinckx & de Boeck, 2005), the signed residual time model (Maris & van der Maas, 2012) and the hierarchical response model (van der Linden, 2007). Three primary differences distinguish the models from one another. First, they differ in what latent traits they assume. Second, they differ in whether they assume response times and response accuracy to be independent or dependent conditional on their latent trait levels. Third, and related to the independence assumption, the models differ in what speed-accuracy function they assume. Despite these differences, the models are more similar than has been thought so far. This project aims to find the manifest probabilities of each model, and to clarify how the models are related to one another.

Keywords: Item response theory, marginalization, manifest probabilities, latent traits, response times, response accuracy, hierarchical response model, drift diffusion model, signed residual time model, Ising model

Item response models are used primarily within the field of psychology, with the goal of estimating people’s ability in a certain subject. Since ability cannot be directly observed, it is a so-called latent trait. Ability is traditionally estimated by using the proportion of correct answers, or response accuracy. In item response theory (IRT), the probability of a person correctly responding to an item is modeled, conditional on her ability in that domain. A commonly used IRT model is the two-parameter logistic model (2PLM) (Birnbaum, 1968). The 2PLM models the probability of a person correctly responding to an item based on her ability and two item parameters: item difficulty (how difficult an item is) and item discrimination (how well an item discriminates between people of high and low abilities). The 2PLM states that the probability of a correct response is the following:

\[
p(x_i = 1 \mid \theta_p) = \frac{\exp\left(\alpha_i(\theta_p - \delta_i)\right)}{1 + \exp\left(\alpha_i(\theta_p - \delta_i)\right)},
\]

where x is a binary random variable with values 0 for incorrect responses and 1 for correct responses. Ability of a person p is represented by θp, and αi and δi are the discrimination and difficulty parameters of an item i.
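As a minimal sketch, the 2PLM response probability can be computed as follows. The function name `p_correct_2pl` and the parameter values used to call it are illustrative assumptions, not part of the original text.

```python
import math

def p_correct_2pl(theta, alpha, delta):
    """Probability of a correct response under the 2PLM.

    theta: person ability, alpha: item discrimination, delta: item difficulty.
    """
    z = alpha * (theta - delta)
    # Evaluate the logistic function in a numerically stable way.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical item: discrimination 1.2, difficulty 0.5.
p_equal = p_correct_2pl(0.5, 1.2, 0.5)   # ability equal to difficulty
p_high = p_correct_2pl(2.0, 1.2, 0.5)    # ability above difficulty
```

When ability equals difficulty the exponent is zero, so the probability of a correct response is one half regardless of the discrimination.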

With recent advances in technology, tests are increasingly administered by computer. This allows more sources of information to be gathered besides response accuracy, one of them being the time it takes people to respond to items, referred to as response times. This new possibility raises the question of whether response times can be used to learn something new about ability. For example, can response accuracy and response times be jointly modeled in a way that leads to a more accurate estimate of ability than when response accuracy is modeled alone? As it turns out, this is indeed the case (van der Linden, 2008; Bolsinova & Tijmstra, 2017). Van der Linden, Klein Entink and Fox (2010) show that by taking response times into account, it is possible to increase the accuracy of the ability estimate significantly, especially for persons with extreme abilities (Klein Entink, 2009; van der Linden, Klein Entink, & Fox, 2010). Another question following from this new source of information is how response accuracy and response times should be jointly modeled, so as to get the most accurate results. There are different theories as to how this should be done, which lead to different approaches in jointly modeling response times and response accuracy. The aim of this paper is an investigation of the statistical relations between three different and highly popular approaches to jointly modeling response times and response accuracy: the drift diffusion model (DDM) (Ratcliff, 1978; Tuerlinckx & de Boeck, 2005), the signed residual time model (SRTM) (Maris & van der Maas, 2012) and the hierarchical response model (HRM) (van der Linden, 2007). A recent paper by Rijn and Ali (2017) compared these same models, focusing on adaptive testing and model fit. Their results reveal that the SRTM shows a slight superiority over the other models in adaptive testing. Whereas Rijn and Ali (2017) focused on a practical application of these models, the focus of this comparison will be theoretical in nature: we will marginalize each model to find its manifest probabilities. This is inspired by the work of Holland (1990) and Cressie and Holland (1981).

An important aspect that the DDM, SRTM and HRM share is that the part of each model related to accuracy - the marginal model of accuracy - is or can be simplified to the 2PLM. However, there are also several differences between these models, based on differences in their underlying theories. One of those differences relates to the latent traits in the models. All models assume that people have differing inherent abilities, a latent trait estimated from response accuracy. However, they do not agree on the following question: should a latent trait be estimated from response times, as is the case for response accuracy? Some theorists indeed believe that in addition to having different inherent abilities, people also have an inherent speed η at which they work, a latent trait estimated from response times. Others believe that response times are solely dependent on ability, and that people don’t have a constant speed at which they work. This is an important distinction of the DDM, SRTM and HRM; the two former only explicitly assume a single latent trait, θ, while the HRM also defines a second latent trait, η.

When it comes to jointly modeling response times and response accuracy into a single model, a thought must be given to the relationship between the two variables, often called the speed-accuracy function (SAF). Van der Linden and Glas (2010) note that descriptively studying response times shows that longer response times are associated with incorrect responses, and shorter response times with correct responses, implying that people take longer when they don’t know the answer. However, experimental studies have shown that the more time people take in responding to an item, the more likely they are to correctly respond (van der Linden & Glas, 2010; Luce, 1986; Goldhammer, 2015). This contrast in the relationship between response accuracy and response times is reflected in the three models - the DDM and SRTM assume different SAFs while the HRM doesn’t assume a specific SAF but aims to capture the relationship between response accuracy and response times seen in the data.

In addition to having differing SAFs, the models also differ in whether they assume response accuracy and response times to be independent conditional on their latent traits. Assuming conditional independence (CI) between response accuracy and response times given the latent traits implies that if the latent traits are kept constant, there will be no relationship between response accuracy and response times (Bolsinova, Tijmstra, Molenaar, & De Boeck, 2017). That is, the mean response time of an item should be the same for people who respond correctly and for people who respond incorrectly, given that they have similar latent trait levels (van der Linden, 2007). However, the relationship between response times and accuracy is more complicated than often implied by models assuming CI (Bolsinova et al., 2017). The DDM and HRM assume CI between response accuracy and response times given their latent trait levels, while the SRTM does not make that assumption.

The fact that the models differ in what latent traits they assume makes it difficult to directly compare them in their latent trait states. We will therefore marginalize the models and compare them in their manifest representations, using the strategy as displayed in Marsman, Maris, Bechger, and Glas (2015) and Epskamp, Maris, Waldorp, and Borsboom (2016).

The outline of this paper is the following: We will start by introducing each of the three models before discussing the strategy needed to marginalize them. Lastly, we will compare the manifest probabilities of the models.

Three Models for Response Times and Response Accuracy

The Drift Diffusion Model

The DDM was introduced by Ratcliff (1978) to explain the cognition of a person faced with a two-choice reaction time task. The primary idea behind the DDM is that when deciding between two choices, respondents gradually accumulate evidence in favor of either choice, until reaching a threshold, or boundary, of having enough evidence to respond. This process is governed by a drift rate φ, which is the rate of information accumulation. If the drift rate is positive, a person will be inclined to respond correctly to that item, while being inclined to respond incorrectly if the drift rate is negative. The DDM is depicted in figure 1.

In the image there are two boundaries, one upper boundary and one lower boundary. The upper boundary represents the evidence threshold for the correct response, and the lower boundary represents the evidence threshold for the incorrect response. The distance between these two boundaries is referred to as the boundary separation α, and determines how long it takes people to react to an item.

Figure 1. The DDM. When a decision-making task is presented to a person, she gathers evidence until reaching either the upper boundary (correct option) or the lower boundary (incorrect option). The distance between them makes up the boundary separation.

The DDM has been used mainly for experimental cognitive psychology research (Mormann, Malmaud, Huth, Koch, & Rangel, 2010; Krajbich & Rangel, 2011; Eckhoff, Holmes, Law, Connolly, & Gold, 2008), but was introduced to psychometrics by Tuerlinckx and de Boeck (2005), who demonstrated that the 2PLM is formally linked to the marginal accuracy model of the DDM. In this case, the drift rate consists of a person ability and an item difficulty, φ = θ + δi, and the boundary separation αi is equal to the item discrimination parameter of the 2PLM. The starting point z, as seen in figure 1, represents whether respondents are biased towards either response option independently of their drift rate. If respondents are unbiased, z is half of the boundary separation, z = αi/2, so that the distance from z to each boundary is equal. With these definitions and assuming respondents to be unbiased, the marginal accuracy model of the DDM is equal to the 2PLM. Under the same assumption, the DDM models the joint distribution of observed response times t and response accuracy x as:

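\[
f(x_i, t_i \mid \phi_i) = \frac{\pi\sigma^2}{\alpha_i^2} \exp\left( \frac{\phi_i \alpha_i \left(x_i - \tfrac{1}{2}\right)}{\sigma^2} - \frac{\phi_i^2}{2}(t_i - T_{er}) \right) \times \sum_{m=1}^{\infty} m \sin\left(\frac{\pi m}{2}\right) \exp\left( -\frac{1}{2}\,\frac{\pi^2\sigma^2 m^2}{\alpha_i^2}(t_i - T_{er}) \right),
\]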

where αi is the boundary separation, σ2 is the variance of the change in information accumulation over time, ti is the observed response time and Ter is the non-decision time. Subtracting the non-decision time from the observed response time, ti − Ter, gives the decision time, and φi is the drift rate. The model as it stands is not identifiable, so one of the parameters needs to be set to an arbitrary value. To identify the model we follow Tuerlinckx and de Boeck (2005) and fix σ = 1. When boundary separation is large, it takes longer to respond to an item, but the item also discriminates better between people with high and low abilities.
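As a short check of the claim that the marginal accuracy model of the unbiased DDM has the 2PLM form, the following sketch fixes σ = 1 and compares the joint density at a correct and an incorrect response. The symbol C(t_i) is our own shorthand, introduced here only for illustration, for all factors that do not involve x_i.

```latex
\begin{align*}
f(x_i, t_i \mid \phi_i) &= C(t_i)\,\exp\left(\phi_i\alpha_i\left(x_i - \tfrac{1}{2}\right) - \tfrac{\phi_i^2}{2}(t_i - T_{er})\right),\\
\frac{f(1, t_i \mid \phi_i)}{f(0, t_i \mid \phi_i)} &= \frac{\exp\left(\tfrac{1}{2}\alpha_i\phi_i\right)}{\exp\left(-\tfrac{1}{2}\alpha_i\phi_i\right)} = \exp(\alpha_i\phi_i),\\
p(x_i = 1 \mid t_i, \phi_i) &= \frac{\exp(\alpha_i\phi_i)}{1 + \exp(\alpha_i\phi_i)}.
\end{align*}
```

The last line does not depend on t_i, so it is also the marginal probability of a correct response. It is a logistic function of the drift rate with the boundary separation α_i in the role of the discrimination parameter, and accuracy carries no further information about response time once the drift rate is given.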

The SAF characterized by the DDM implies that as the discrimination of an item, and therefore the expected response time, increases, the likelihood of responding correctly increases if the ability of a person is greater than the difficulty of the item, and decreases if the ability of a person is less than the difficulty of the item (Tuerlinckx & de Boeck, 2005). That is, as discrimination and expected response time increase, one’s ability is expected to have more effect on which response is chosen.

The DDM assumes CI between response times and accuracy, given ability. As explained above, this means that for a fixed ability, there should be no relationship between response times and response accuracy. The model has been extended for many purposes, for example by van der Maas, Molenaar, Maris, Kievit, and Borsboom (2011), who observed that the model as formulated by Ratcliff and Tuerlinckx and de Boeck is not optimal for ability tests, and thus modified it to be more applicable for tests of that type. Other examples of the extension of the DDM are a hierarchical diffusion model, which contributed to the relatively new field of cognitive psychometrics (Vandekerckhove, Tuerlinckx, & Lee, 2011), and including crossed random effects in the model (Vandekerckhove, Verheyen, & Tuerlinckx, 2010), to name a few.

The Signed Residual Time Model

The second approach we will discuss is the SRTM, designed for tests with a time limit and an explicit scoring rule (Maris & van der Maas, 2012). The scoring rule introduced by Maris and van der Maas (2012) is the following:

\[
(2x_{pi} - 1)(d_i - t_{pi}), \tag{1}
\]

where di is the time limit given for an item i, and xpi and tpi are a person’s response accuracy and response time to an item. Response accuracy xpi is scored 0 for incorrect responses, so that the first factor of equation (1) is negative, (2x − 1) = −1, and 1 for correct responses, so that (2x − 1) = 1 and the score is positive. The residual time (di − tpi) multiplied by either −1 or 1 gives rise to bonus points for correct responses, increasing as the response is faster, and penalty points for incorrect responses, also increasing with faster responses. With this scoring rule, Maris and van der Maas (2012) wanted to encourage test-takers to respond as quickly as possible while still focusing on responding correctly: to respond quickly when they know the answer, but to take their time if in doubt. Figure 2 depicts Maris and van der Maas’s scoring rule.

Above the horizontal line in figure 2 are the scores obtained from correct responses, and below the horizontal line for incorrect responses. When response time increases, the amount of bonus or penalizing points decreases.
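The scoring rule in equation (1) can be sketched in a few lines. The function name `srt_score` and the time limit of 5 seconds are illustrative assumptions; the time limit was chosen so that the scores agree with the worked numbers given in the caption of figure 2.

```python
def srt_score(x, t, d):
    """Signed residual time score for accuracy x (0 or 1), response time t and time limit d."""
    if x not in (0, 1):
        raise ValueError("x must be 0 (incorrect) or 1 (correct)")
    if not 0 <= t <= d:
        raise ValueError("t must lie between 0 and the time limit d")
    # (2x - 1) is +1 for a correct and -1 for an incorrect response;
    # (d - t) is the residual time left when the response is given.
    return (2 * x - 1) * (d - t)

d = 5  # assumed time limit in seconds (hypothetical)
scores = [srt_score(1, 2, d), srt_score(1, 1, d), srt_score(0, 2, d)]
```

A response given exactly at the time limit scores zero whether it is correct or not, which is why slow guessing costs little and fast guessing is risky.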

Figure 2. The SRT scoring rule. The y-axis determines the score a person receives, and the x-axis shows the response time. If a person correctly responds to an item in 2 seconds, she will get 3 points. However, if her response time had been 1 second, she would have gotten 4 points. Similarly, if a person responds incorrectly in 2 seconds, she gets -3 points. Figure taken from Maris and van der Maas (2012).

The SRTM models the joint distribution of response accuracy and response times as:

\[
f(x_i, t_i \mid \theta) = \frac{(\theta + \delta_i)\exp\left((2x_i - 1)(d_i - t_i)(\theta + \delta_i)\right)}{\exp\left(d_i(\theta + \delta_i)\right) - \exp\left(-d_i(\theta + \delta_i)\right)},
\]

where θ represents person ability and δi item difficulty, while (2xi − 1)(di − ti) is the scoring rule as explained above. The marginal model for accuracy of the SRTM is equal to the 2PLM, where the time limit di is equal to the item discrimination. The SRTM assumes that the total score of a person under the scoring rule, Σi (2xpi − 1)(di − tpi), is a sufficient statistic for her ability θp, and that the total score of an item, Σp (2xpi − 1)(di − tpi), is a sufficient statistic for item difficulty δi.
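The statement that the marginal accuracy model of the SRTM is a 2PLM with discrimination di can be checked numerically. The sketch below is written under stated assumptions: it takes the SRTM density to be proportional to exp((2x − 1)(d − t)(θ + δ)) on 0 ≤ t ≤ d, includes the leading factor (θ + δ) that makes this density integrate to one, and uses hypothetical parameter values and function names.

```python
import math

def srtm_density(x, t, theta, delta, d):
    """Joint SRTM density of accuracy x (0 or 1) and response time t in [0, d]."""
    phi = theta + delta                # sign convention used in this text
    if phi == 0.0:
        return 1.0 / (2.0 * d)         # limit of the expression below as phi -> 0
    score = (2 * x - 1) * (d - t)      # signed residual time
    return phi * math.exp(score * phi) / (math.exp(d * phi) - math.exp(-d * phi))

def integrate_over_time(x, theta, delta, d, n=20000):
    """Midpoint-rule integral of the density over t in [0, d] for a fixed x."""
    step = d / n
    return step * sum(srtm_density(x, (k + 0.5) * step, theta, delta, d) for k in range(n))

theta, delta, d = 0.8, -0.3, 5.0       # hypothetical values
p_wrong = integrate_over_time(0, theta, delta, d)
p_right = integrate_over_time(1, theta, delta, d)
two_pl = 1.0 / (1.0 + math.exp(-d * (theta + delta)))  # logistic with discrimination d
```

Here `p_wrong + p_right` approximates the total probability mass, and `p_right` can be compared with `two_pl`, the logistic probability in which the time limit plays the role of the discrimination parameter.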

The SAF characterized by the SRTM is similar to that of the DDM: as expected response time increases, people with higher abilities than the item difficulty will be more likely to respond correctly, while people with lower abilities than the item difficulty will be more likely to respond incorrectly. In addition, the SRTM predicts that people with abilities higher than the item difficulty will have fast correct responses and slow incorrect responses, while people with abilities lower than the item difficulty will have slow correct responses and fast incorrect responses (Maris & van der Maas, 2012). The model therefore does not assume CI between response accuracy and response times given ability, since for fixed levels of ability there is still a relationship between response accuracy and response times. This is where the model differs from the predictions of the DDM. The model also predicts that as ability becomes more extreme (both very low and very high), responses are expected to be quicker.

The SRTM is used by Math Garden (Dutch: Rekentuin), a computer game designed to help children practice mathematics, using a computerized adaptive system (Klinkenberg, Straatemeier, & van der Maas, 2011). A screenshot of the game is shown in figure 3.

Figure 3. A screenshot from Math Garden: a computer adaptive system for children to practice mathematics. Math Garden uses the SRTM.

Math Garden lets children practice mathematics in a fun setting, and chooses the items presented to each child based on their abilities. The system has been used to monitor arithmetic in primary education in the Netherlands. Klinkenberg et al. (2011) showed that, using the SRTM and Math Garden, more precise estimates of student abilities were acquired than when using a model that didn’t incorporate response times. In addition, Klinkenberg (2014) showed that the estimates of ability obtained with the scoring rule are both reliable and valid. Math Garden has also been used to show that, by letting children solve items matched to their abilities, computer adaptive testing can increase the time children spend practicing mathematics, which in turn increases their mathematical performance (Jansen et al., 2013). Even so, it has been argued that this scoring rule is not optimal for high-stakes situations, because of the increased stress the time limit can impose (Rijn & Ali, 2017; Goldhammer, 2015).

The Hierarchical Response Model

The HRM was introduced by van der Linden (2007), and in contrast to the SRTM, the HRM is particularly suited for situations where persons have ample time for responding to the items. The HRM combines two separate latent trait response models - an IRT model for response accuracy and a log-normal model for response times. Both of these models have separate item parameters - the accuracy model has an item difficulty, item discrimination and guessing parameter, while the log-normal model has item time intensity and item time discrimination parameters. Therefore, the HRM allows items’ response time distributions to differ. This separates the HRM from the SRTM and DDM, which do not have these item time parameters nor allow for response time distributions to differ between items. However, van der Maas et al. (2011) showed that the HRM is related to the DDM, with boundary separation and drift rate in the DDM being translatable to the time intensity and speed parameters of the HRM.

Both the response accuracy and response time models of the HRM assume CI: the response accuracy model assumes CI given ability, and the response time model assumes CI given speed. When combined to form the HRM, the model assumes CI of response accuracy and response times given the relationship between speed and ability. The model assumes a correlation between ability and speed, but does not specify the details of it. Therefore, the HRM does not specify any particular SAF, but rather aims to assess the relation between response accuracy and response times from the data. The response accuracy model of the HRM is the 2PLM extended with a parameter for guessing, called the three-parameter logistic model. However, since guessing will not be considered in this paper, we will simplify it and use the 2PLM. The marginal model for accuracy of the HRM is thus the 2PLM, as is the case for the SRTM and DDM. The joint distribution of response accuracy x and response times t according to the HRM is then the following:

\[
f(x_i, t_i \mid \theta, \eta) = \frac{\exp\left(x_i(\alpha_i\theta + \delta_i)\right)}{1 + \exp(\alpha_i\theta + \delta_i)} \cdot \frac{a_i}{\sqrt{\pi}\,t_i}\exp\left(-a_i^2\left(\ln(t_i) + \eta + b_i\right)^2\right),
\]

where the parameters of the response accuracy model are αi, which refers to item discrimination, δi refers to the item difficulty, and θ to ability. For the response time model, ai refers to the item time discrimination, bi refers to the item time intensity, and η refers to speed. The HRM is depicted in figure 4.

As can be seen in figure 4, response accuracy x and response times t are modeled independently, given ability and speed. Unlike the SRTM and the DDM, the HRM models response accuracy and response times so that ability doesn’t directly influence response times, only indirectly through its relationship with speed. Bolsinova and Tijmstra (2017) extended the HRM so that ability has a direct influence on response times using cross-loadings, and showed that this increased the accuracy of the ability estimate. We will introduce a similar transformation, described in the section ‘Marginalization of the HRM’. In addition, more extensions of the HRM have been made, for example to allow for conditional dependence, and to make the model suitable for adaptive testing (Bolsinova, Tijmstra, & Molenaar, 2016; van der Linden, 2008).

Figure 4. The HRM. The model consists of two separate response models, one where θ governs response accuracy x, and one where η governs response times t. There is an undefined correlation between θ and η.

In sum, the DDM, SRTM and HRM differ in several ways: in whether they assume CI, in what variables they condition on, in what SAF they assume, and in their structure. Despite this, they have in common that the marginal model for accuracy is equal to the 2PLM in all three models. The aim of this paper is to investigate the statistical relations between the three models by obtaining their manifest probabilities.

Methods: How to Marginalize the Models

The differences between the DDM, SRTM and HRM make it difficult to directly compare them in their latent trait expressions. To account for that, we will compare the models in their manifest expressions, marginalizing out the latent traits. Direct marginalization is hard, but we will make use of recent discoveries in the psychometric literature that link marginal IRT models to the Ising model, which uses only manifest probabilities. We will demonstrate these findings, and how to use them for our purposes.

The Relationship Between IRT Models and the Ising Model: Background

The strategy we will use to marginalize our models makes use of the relationship between IRT models and the Ising model, a specific graphical model. The Ising model is a statistical model originating from physics (Ising, 1925; Lenz, 1920), and was originally designed to model the orientation of particles. The model explicitly models pairwise interactions and main effects of a vector of n binary random variables Y = (Y1, . . . , Yn) ∈ {−1, +1}^n as

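\[
p(Y = y) = \frac{\exp\left(\sum_{i=1}^{n} y_i\mu_i + \sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}y_iy_j\right)}{\sum_{y}\exp\left(\sum_{i=1}^{n} y_i\mu_i + \sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}y_iy_j\right)},
\]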

where µ ∈ R^n is a vector of main effects, which describes the tendency of the binary variables to take either the value −1 or 1. The pairwise interaction parameters σij are collected in a symmetric matrix Σ ∈ R^(n×n), which describes whether any two neighboring variables tend to be in the same state or in different states, depending on the sign of σij.
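For a small number of variables, the Ising probabilities can be computed by brute-force enumeration of all 2^n spin configurations. The sketch below is only an illustration; the function name `ising_probabilities` and the values of the main effects and interactions are hypothetical.

```python
import itertools
import math

def ising_probabilities(mu, sigma):
    """Return {y: p(Y = y)} for all spin configurations y in {-1, +1}^n."""
    n = len(mu)
    weights = {}
    for y in itertools.product((-1, 1), repeat=n):
        main = sum(y[i] * mu[i] for i in range(n))
        # Double sum over i and j, as in the model equation.
        pair = sum(sigma[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
        weights[y] = math.exp(main + pair)
    z = sum(weights.values())          # normalizing constant
    return {y: w / z for y, w in weights.items()}

mu = [0.2, -0.1, 0.0]                  # hypothetical main effects
sigma = [[0.0, 0.3, 0.1],
         [0.3, 0.0, -0.2],
         [0.1, -0.2, 0.0]]             # hypothetical symmetric interactions
probs = ising_probabilities(mu, sigma)
```

Enumeration is only feasible for small n, since the number of configurations doubles with every added variable.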

Even though both the Ising model and IRT models have been around for several decades, the proof of a formal relationship between the two is relatively new. In 2002, Cox and Wermuth showed an approximate relation between the Ising model and the Rasch model - the simplest IRT model - and a year later, Molenaar (2003) suggested that there might be a formal relation between the two. However, this relationship was not detailed until more than a decade later, when demonstrated by Marsman et al. (2015) and Epskamp et al. (2016). As Marsman et al. (2015) and Epskamp et al. (2016) showed, the Ising model corresponds to the marginal of a multidimensional IRT (M-IRT) model, where the number of latent traits implied by the model equals the rank of the adjacency matrix of the network. These models can be equated by using a trick introduced by Kac (1969), which states that an exponential of a square can always be replaced by the following Gaussian integral (Kac, 1969; Kruis & Maris, 2016; Marsman et al., 2015):

\[
\exp\left(a^2\right) = \int_{\mathbb{R}} \frac{1}{\sqrt{\pi}}\exp\left(2a\theta - \theta^2\right)d\theta, \tag{2}
\]

where a consists of observable variables and θ is a latent trait. In this manner, it is possible to introduce a latent trait where there previously was none.
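The identity in equation (2) follows from completing the square, 2aθ − θ² = a² − (θ − a)², and can also be checked numerically. The sketch below uses a plain trapezoid rule centred on θ = a; the function name `kac_integral`, the integration window and the grid size are illustrative assumptions.

```python
import math

def kac_integral(a, half_width=12.0, n=20001):
    """Trapezoid approximation of (1 / sqrt(pi)) * integral of exp(2*a*theta - theta**2)."""
    lo, hi = a - half_width, a + half_width   # the integrand peaks at theta = a
    step = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        theta = lo + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.exp(2 * a * theta - theta ** 2)
    return total * step / math.sqrt(math.pi)

# Ratio of the numerical integral to exp(a^2) for a few hypothetical values of a.
ratios = [kac_integral(a) / math.exp(a * a) for a in (-1.5, 0.0, 0.7, 2.0)]
```

Each ratio should be very close to one, which is the numerical counterpart of replacing the exponential of a square by a Gaussian integral over a newly introduced variable θ.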

These results are important for this project, because we now know that when an IRT model is marginalized using the strategy we will detail below, a graphical model emerges. Since we want to marginalize latent trait models, we can use this strategy for our purposes. The fact that the marginal of an IRT model is a graphical model has important implications, since graphical models and latent trait models rest on very different theories. In a graphical model such as the Ising model, correlations between variables are not explained by a hidden theoretical construct, as is the case for latent trait models. Rather, the model describes the relations between variables without assuming that a latent trait causes them. A graphical model is most commonly depicted with variables as nodes, and edges between nodes as associations between variables, as depicted in figure 5. Figure 5 depicts an example of a network, or graph, where the nodes represent items, and the edges between them represent their correlations. This model only uses observed variables. As a comparison, the same data could be modeled with a latent trait model, as depicted in figure 6.

Figure 5. An example of a graphical model. The nodes represent items, and the edges represent their associations. Here, there are no latent traits used to explain the associations between items.

Figure 6 shows the same five items as in figure 5, but explained with a latent trait model. The fact that these two models are equivalent is interesting, and at the same time it is intriguing to observe what happens when marginalizing the DDM, SRTM and HRM.

The Relationship Between IRT Models and the Ising Model: Strategy

As a courtesy to the reader, we will derive the results from Marsman et al. (2015) and Epskamp et al. (2016) here. First, recall the joint distribution of binary random variables:

\[
p(Y = y) = \frac{\exp\left(\sum_{i=1}^{n} y_i\mu_i + \sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}y_iy_j\right)}{\sum_{y}\exp\left(\sum_{i=1}^{n} y_i\mu_i + \sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}y_iy_j\right)} = \frac{1}{Z}\exp\left(y^{\mathsf T}\mu + y^{\mathsf T}\Sigma y\right),
\]

where Z is the normalizing constant. Observe that the diagonal of Σ is not identified, since for all of the binary random variables the square yi² equals one, such that σii yi yi = σii is constant for all i and cancels in the above ratio.

Figure 6. An example of a latent trait model. The latent trait θ influences the items, which explains the correlations between items.

The latent trait representation of the Ising model originates from the work of Kac (1969), who showed that for each eigenvector of the connectivity matrix Σ, a latent trait θ is associated, such that the observed variables yi are independent given the full set of latent traits, i.e.,

\[
i \neq j : \quad Y_i \perp Y_j \mid \Theta.
\]

Due to the indeterminacy of the diagonal elements of Σ, the eigenvalue decomposition is not unique. Here we decompose it as

\[
\Sigma + cI = Q(\Lambda + cI)Q^{\mathsf T} = AA^{\mathsf T},
\]

where Λ is a diagonal matrix of eigenvalues, Q is a matrix of eigenvectors, A = Q(Λ + cI)^(1/2), and the translation by c serves to ensure that all eigenvalues are positive. That is, this translation ensures that the off-diagonal elements of Σ are preserved in the decomposition.
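The shifted eigenvalue decomposition can be illustrated with NumPy. This is a sketch under assumptions: the interaction matrix is hypothetical, and the shift c is chosen here so that every shifted eigenvalue is strictly positive.

```python
import numpy as np

sigma = np.array([[0.0, 0.3, 0.1],
                  [0.3, 0.0, -0.2],
                  [0.1, -0.2, 0.0]])     # hypothetical symmetric interaction matrix

eigenvalues, Q = np.linalg.eigh(sigma)   # sigma = Q diag(eigenvalues) Q^T
c = 1e-3 - eigenvalues.min()             # shift that makes all eigenvalues positive
A = Q * np.sqrt(eigenvalues + c)         # scales column r of Q by sqrt(lambda_r + c)

reconstructed = A @ A.T                  # equals sigma + c * I
off_diagonal = ~np.eye(3, dtype=bool)    # mask selecting the off-diagonal entries
```

The product `A @ A.T` differs from `sigma` only on the diagonal, which is exactly the part of the matrix that is not identified. Choosing c equal to minus the smallest eigenvalue instead would make one shifted eigenvalue zero, so that only n − 1 columns of A would be non-zero.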

With the eigenvalue decomposition of the connectivity matrix Σ it is convenient to rewrite the model:

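\[
p(Y = y) = \frac{1}{Z}\exp\left(y^{\mathsf T}\mu + y^{\mathsf T}AA^{\mathsf T}y\right) = \frac{1}{Z}\exp\left(y^{\mathsf T}\mu + \sum_{r=1}^{n-1}\left(\sum_{i=1}^{n} y_i a_{ir}\right)^2\right),
\]

where r indexes the eigenvectors and eigenvalues. Observe that this expression of the Ising model contains the exponential of a sum of squares, and we can apply equation (2) to each square to obtain:

\[
p(y) = \int_{\mathbb{R}^{n-1}} \frac{1}{Z\pi^{\frac{n-1}{2}}}\exp\left(y^{\mathsf T}(\mu + 2A\theta) - \theta^{\mathsf T}\theta\right)d\theta.
\]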

This latent trait expression has been (re-)discovered many times in both the statistical and psychometric literature (Anderson & Vermunt, 2000; Olkin & Tate, 1961; McCullagh, 1994; Besag, 1974; Holland, 1990; Lauritzen & Wermuth, 1989). Importantly, this expression can be factored as

\[
p(y) = \int_{\mathbb{R}^{n-1}} \prod_{i=1}^{n}\exp\left(y_i(\mu_i + 2a_i\theta)\right)\frac{1}{Z\pi^{\frac{n-1}{2}}}\exp\left(-\theta^{\mathsf T}\theta\right)d\theta,
\]

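where ai is the i-th row-vector of A. Observe that the first factor in this expression is the kernel of a multidimensional IRT model (Reckase, 2009):¹

\[
p(y_i \mid \theta) = \frac{\exp\left(y_i(\mu_i + 2a_i\theta)\right)}{\sum_{y_i}\exp\left(y_i(\mu_i + 2a_i\theta)\right)} = \frac{1}{Z_i(\theta)}\exp\left(y_i(\mu_i + 2a_i\theta)\right),
\]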

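where Zi(θ) is the IRT model’s normalizing constant. When we normalize the first factor in the integral we also have to multiply the second factor by this normalizing constant to ensure that the integral is left unaltered:

\[
p(y) = \int_{\mathbb{R}^{n-1}} \prod_{i=1}^{n}\frac{1}{Z_i(\theta)}\exp\left(y_i(\mu_i + 2a_i\theta)\right) \times \frac{\prod_{i=1}^{n}Z_i(\theta)}{Z\pi^{\frac{n-1}{2}}}\exp\left(-\theta^{\mathsf T}\theta\right)d\theta = \int_{\mathbb{R}^{n-1}} \prod_{i=1}^{n}p(y_i \mid \theta)\,g(\theta)\,d\theta,
\]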

¹ Observe that this is the expression of a multidimensional IRT model for a spin random variable (yi ∈ {−1, +1}), which is slightly different from the expression for the typically used binary random variables xi = ½(1 + yi) ∈ {0, 1}:

\[
p(x_i \mid \theta) = \frac{\exp\left(x_i(2\mu_i + 4a_i\theta)\right)}{1 + \exp\left(2\mu_i + 4a_i\theta\right)}.
\]

where p(yi | θ) is the marginal IRT model and g(θ) is the distribution of the latent traits. In typical applications of the multidimensional IRT model, the latent trait θ is assumed to follow a multivariate normal distribution. Here, the latent trait distribution g(θ) is seen to be a mixture of multivariate normal distributions:

\[
g(\theta) = \frac{\prod_{i=1}^{n}Z_i(\theta)}{Z\pi^{\frac{n-1}{2}}}\exp\left(-\theta^{\mathsf T}\theta\right) = \sum_{y}p(y)f(\theta \mid y), \tag{3}
\]

a 2^n component mixture of posterior distributions, where the posterior distributions f(θ | y) are N(yᵀA, ½I), and the mixture probabilities p(y) are given by the Ising model. Marsman, Waldorp, and Maris (2016) showed that the marginal distribution of this 2^n component mixture model has either one or two modes in each of its dimensions, and Marsman et al. (2017) demonstrated that it can be closely approximated with a mixture of two normal distributions.

The relation detailed above encompasses the marginal Rasch model and extended Rasch model as special cases, where they are directly linked to the Curie-Weiss model, also known as the fully connected Ising model since all variables are fully connected and all connections are of the same strength; see Marsman et al. (2017, 2015) for details.

Reverse Engineering of the Strategy

Because we want to use the procedure demonstrated above to get to the manifest probabilities of our three models, we will show how the procedure can be reversed.

Let us take the general case of an exponential family model. These types of models consist of three parts: a base function hi(xi, ti), which depends only on the data; a sufficient statistic si(xi, ti), which depends on the data and enters the model through its product with the latent trait; and a normalizing constant Zi(θ), which depends on the latent trait. Our model, which states the joint distribution of response accuracy xi and response times ti conditional on θ, is then:

\[
f(x_i, t_i \mid \theta) = \frac{1}{Z_i(\theta)}h_i(x_i, t_i)\exp\left(s_i(x_i, t_i)\theta\right).
\]

If we compare this to the derivation above, we can see that for the Ising model,

\[
h_i(y_i) = \exp(y_i\mu_i), \qquad s_i(y_i) = 2a_iy_i, \qquad Z_i(\theta) = \sum_{y_i}\exp\left(y_i(\mu_i + 2a_i\theta)\right).
\]

Next, we define the marginal distribution of our model, which is the integral of the model multiplied by the prior distribution g(θ). We will define g(θ) in the same way as in equation (3), such that the normalizing constant Zi(θ) cancels and a quadratic term in θ is introduced. This is to ensure that equation (2) can be applied. We therefore define g(θ) as being proportional to the product of the normalizing constant Zi(θ) and the exponential of −θ²:

\[
g(\theta) \propto \prod_{i}Z_i(\theta)\exp\left(-\theta^2\right).
\]

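For this to be a probability distribution of its own, a normalizing constant is needed. We write the unknown normalizing constant here as $\sqrt{\pi}Z$. Writing x and t for the vectors of responses and response times over all items, the joint marginal distribution $p(x, t) = \int_{\mathbb{R}} \prod_i f(x_i, t_i \mid \theta)\, g(\theta)\, d\theta$ is now the following:

$$p(x, t) = \int_{\mathbb{R}} \prod_i \frac{1}{Z_i(\theta)}\, h_i(x_i, t_i)\, \exp\bigl(s_i(x_i, t_i)\theta\bigr)\, \frac{1}{Z\sqrt{\pi}} \prod_i Z_i(\theta)\, \exp(-\theta^2)\, d\theta = \frac{1}{Z} \prod_i h_i(x_i, t_i) \int_{\mathbb{R}} \frac{1}{\sqrt{\pi}} \exp\Bigl(\theta \sum_i s_i(x_i, t_i) - \theta^2\Bigr)\, d\theta.$$

In this form, equation (2) can be applied to the marginal model such that we integrate θ out and are left with:

$$p(x, t) = \frac{1}{Z} \prod_i h_i(x_i, t_i)\, \exp\Biggl(\Bigl[\frac{1}{2}\sum_i s_i(x_i, t_i)\Bigr]^2\Biggr).$$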

We now have the manifest probabilities of the original latent trait model.
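As a quick numerical sanity check of the integration step, the sketch below compares a trapezoidal approximation of the integral of (1/√π) exp(sθ − θ²) over θ with the closed form exp((s/2)²). The function names, integration limits, grid size and the test values of s are illustrative assumptions and are not part of the original derivation.

```python
import math


def gaussian_marginal(s, lo=-12.0, hi=12.0, n=20001):
    """Trapezoidal approximation of the integral over theta of
    (1 / sqrt(pi)) * exp(s * theta - theta**2)."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        theta = lo + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoidal end weights
        total += weight * math.exp(s * theta - theta ** 2)
    return total * step / math.sqrt(math.pi)


def closed_form(s):
    """Closed form of the same integral: exp((s / 2) ** 2)."""
    return math.exp((s / 2.0) ** 2)


# Compare both versions for a few arbitrary values of the summed statistic s.
for s in (-3.0, 0.0, 0.7, 2.5):
    assert math.isclose(gaussian_marginal(s), closed_form(s), rel_tol=1e-9)
```

Here s plays the role of the summed sufficient statistic; the agreement of the two functions is what allows θ to be integrated out analytically.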

In the results section, we will apply this strategy to our three models in order to get to their manifest probabilities.

Results: Marginalization of the Three Models

Since we have laid out the procedure of how to get from a latent trait model to its marginal probabilities, we will now apply this to the DDM, SRTM and HRM.

Marginalizing the DDM

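With drift rate $\varphi_i$ decomposed into its two parts, θ and $\delta_i$, fixing σ = 1, and assuming respondents to be unbiased, the joint distribution of response accuracy $x_i$ and response times $t_i$ according to the DDM is

$$p(x_i, t_i \mid \theta) = \frac{\pi}{\alpha_i^2} \exp\Bigl(\alpha_i \bigl(x_i - \tfrac{1}{2}\bigr)(\theta + \delta_i) - \frac{(\theta + \delta_i)^2}{2}(t_i - T_{er})\Bigr) \times \sum_{m=1}^{\infty} m \sin\Bigl(\frac{\pi m}{2}\Bigr) \exp\Bigl(-\frac{1}{2}\frac{\pi^2 m^2}{\alpha_i^2}(t_i - T_{er})\Bigr).$$

From here on, we let $t_i$ denote the decision time $t_i - T_{er}$, so that

$$p(x_i, t_i \mid \theta) = \frac{\pi}{\alpha_i^2} \exp\Bigl(\alpha_i \bigl(x_i - \tfrac{1}{2}\bigr)(\theta + \delta_i) - \frac{1}{2} t_i (\theta + \delta_i)^2\Bigr) \times \sum_{m=1}^{\infty} m \sin\Bigl(\frac{\pi m}{2}\Bigr) \exp\Bigl(-\frac{1}{2}\frac{\pi^2 m^2}{\alpha_i^2} t_i\Bigr) = \frac{c(t_i) \exp\Bigl(x_i \alpha_i (\theta + \delta_i) - \frac{t_i(\theta + \delta_i)^2}{2}\Bigr)}{Z_i(\theta)},$$

where $c(t_i)$ is shorthand for the infinite sum and $Z_i(\theta)$ is the appropriate normalizing constant. Note that because the infinite sum depends only on response times in an unbiased model, the term can be absorbed into the response time distribution.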

From here, we marginalized the DDM in two ways: First, we continued with the model unaltered, and second, we introduced a second latent trait to the model.

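We will start with the unaltered model. Assuming g(θ) to be the distribution

$$g(\theta) = \frac{1}{Z\sqrt{\pi}} \prod_i Z_i(\theta)\, \exp(-\theta^2),$$

we obtain the manifest probabilities:

$$\begin{aligned}
p(x, t) &= \frac{1}{Z\sqrt{\pi}} \int_{-\infty}^{\infty} \prod_i c(t_i)\, \exp\Bigl(\sum_i x_i \alpha_i (\theta + \delta_i) - \frac{1}{2}\sum_i t_i (\theta + \delta_i)^2 - \theta^2\Bigr)\, d\theta \\
&= \frac{1}{Z}\, \frac{\prod_i c(t_i)}{\sqrt{1 + \sum_i t_i / 2}}\, \exp\Biggl(\sum_i x_i \alpha_i \delta_i - \frac{1}{2}\sum_i t_i \delta_i^2 + \frac{\bigl(\sum_i (x_i \alpha_i - t_i \delta_i)\bigr)^2}{4\bigl(1 + \sum_i t_i / 2\bigr)}\Biggr).
\end{aligned}$$

The cross term carries a minus sign because expanding $-\tfrac{1}{2} t_i (\theta + \delta_i)^2$ contributes $-t_i \delta_i \theta$ to the part of the exponent that is linear in θ.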

Note that except for the function of time before the exponent, and the presence of another function of time in the denominator of the squared exponential, the manifest probabilities of the DDM are very similar to a graphical model. We will call this model DDM1.
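The closed form of the DDM1 can be checked numerically. The sketch below integrates the θ-dependent part of the integrand on a grid and compares it with the result of completing the square in θ, which gives a linear coefficient of Σ(x_i α_i − t_i δ_i) and a quadratic coefficient of 1 + Σ t_i / 2. The factors c(t_i) and 1/Z are left out because they do not depend on θ. All function names and the numerical item values are hypothetical and chosen only for illustration.

```python
import math


def ddm1_kernel_numeric(x, t, alpha, delta, lo=-15.0, hi=15.0, n=40001):
    """Trapezoidal approximation of (1/sqrt(pi)) times the integral over theta of
    exp(sum_i x_i*alpha_i*(theta+delta_i) - 0.5*sum_i t_i*(theta+delta_i)**2 - theta**2)."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        theta = lo + k * step
        exponent = -theta ** 2
        for x_i, t_i, a_i, d_i in zip(x, t, alpha, delta):
            exponent += x_i * a_i * (theta + d_i) - 0.5 * t_i * (theta + d_i) ** 2
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoidal end weights
        total += weight * math.exp(exponent)
    return total * step / math.sqrt(math.pi)


def ddm1_kernel_closed(x, t, alpha, delta):
    """Same quantity after completing the square in theta."""
    quad = 1.0 + 0.5 * sum(t)  # coefficient of -theta**2
    lin = sum(x_i * a_i - t_i * d_i
              for x_i, t_i, a_i, d_i in zip(x, t, alpha, delta))  # coefficient of theta
    main = sum(x_i * a_i * d_i - 0.5 * t_i * d_i ** 2
               for x_i, t_i, a_i, d_i in zip(x, t, alpha, delta))
    return math.exp(main + lin ** 2 / (4.0 * quad)) / math.sqrt(quad)


# Hypothetical three-item example: accuracies, decision times, boundaries, difficulties.
x = [1, 0, 1]
t = [0.8, 1.3, 0.5]
alpha = [1.2, 0.9, 1.5]
delta = [0.3, -0.4, 0.1]
assert math.isclose(ddm1_kernel_numeric(x, t, alpha, delta),
                    ddm1_kernel_closed(x, t, alpha, delta), rel_tol=1e-8)
```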

For the second option, we will work with the fact that the DDM can be viewed as a specific instance of a hierarchical model. That is, even though the model doesn’t specify a separate latent trait for speed, there is a functional relationship between ability and speed in the model defined through the identity:

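$$(\theta + \delta)^2 = G(\eta, \mu_t).$$

This makes the DDM a curved exponential family model, meaning that the values of η are restricted to a curve, seeing as it is a function of θ. We will use this to introduce a transformation of θ with two new latent traits:

$$\eta^{\mathsf T} = \eta(\theta)^{\mathsf T} = \bigl[\eta_1(\theta),\; \eta_2(\theta)\bigr] = \Bigl[\tfrac{1}{2}\theta,\; -\tfrac{1}{2}\theta^2\Bigr].$$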

Since this is an exponential family model, it consists of sufficient statistics si(xi, ti), a base function hi(xi, ti) and a normalizing constant Zi(θ). For this model, the (minimal) sufficient statistics are

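$$s_i(x_i, t_i) = \bigl[s_{i1}(x_i, t_i),\; s_{i2}(x_i, t_i)\bigr] = \bigl[\alpha_i x_i - 2\delta_i,\; t_i\bigr],$$

the base function is

$$h_i(x_i, t_i) = \exp\Bigl(\tfrac{1}{2}\alpha_i x_i \delta_i - \tfrac{1}{2} t_i \delta_i^2\Bigr) \times \sum_{m=1}^{\infty} m \sin\Bigl(\tfrac{1}{2}\pi m\Bigr) \exp\Bigl(-\frac{\pi^2 m^2 t_i}{2\alpha_i^2}\Bigr),$$

and the normalizing constant is

$$Z_i(\theta) = Z_i = \frac{\alpha_i}{\pi}.$$

Therefore, we have the model

$$p(x_i, t_i \mid \eta) = \frac{1}{Z_i}\, h_i(x_i, t_i)\, \exp\bigl(s_i^{\mathsf T}\eta\bigr).$$

Defining g(η) as

$$g(\eta) = \frac{1}{Z} \prod_i Z_i\, \exp(-\eta_1^2 - \eta_2^2),$$

we have the marginal model:

$$\begin{aligned}
p(x, t) &= \int_{\mathbb{R}^2} \prod_i \frac{1}{Z_i}\, h_i(x_i, t_i)\, \exp(s_{i1}\eta_1 + s_{i2}\eta_2) \times \frac{1}{Z} \prod_i Z_i\, \exp(-\eta_1^2 - \eta_2^2)\, d\eta_1\, d\eta_2 \\
&= \frac{1}{Z} \prod_i h_i(x_i, t_i)\, \exp\Biggl(\frac{1}{4}\Bigl(\sum_i s_{i1}\Bigr)^2 + \frac{1}{4}\Bigl(\sum_i s_{i2}\Bigr)^2\Biggr).
\end{aligned}$$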

This is a graphical model, but with response accuracy and response times independent given the two η’s. This model will be referred to as DDM2. From here, it is possible to introduce a transformation of $\eta_2$ as a linear function of $\eta_1$, which would lead to response accuracy and response times being conditionally independent given the same latent trait. We will not derive this here.

Marginalizing the SRTM

The SRTM defines the joint distribution of response accuracy and response times as:

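$$f(x_i, t_i \mid \delta_i, \theta) = \frac{\exp\bigl((2x_i - 1)(d_i - t_i)(\theta + \delta_i)\bigr)}{\exp\bigl(d_i(\theta + \delta_i)\bigr) - \exp\bigl(-d_i(\theta + \delta_i)\bigr)}.$$

We will simplify this by writing $y_i = 2x_i - 1$, where $y_i \in \{-1, +1\}$, and by letting $t_i$ denote the residual time $d_i - t_i$, such that we have the distribution:

$$p(y_i, t_i \mid \delta_i, \theta) = \frac{\exp\bigl(y_i t_i (\theta + \delta_i)\bigr)}{\exp\bigl(d_i(\theta + \delta_i)\bigr) - \exp\bigl(-d_i(\theta + \delta_i)\bigr)}.$$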

There are two different ways to apply our reverse engineering technique to this model. The first option is to directly apply the reverse engineering trick to the unaltered SRTM, and the second option is to introduce a new response time variable so that we can assume CI between that variable and response accuracy. We will work out both options.

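We can rewrite $p(y_i, t_i \mid \delta_i, \theta)$ with a normalizing constant $Z_i(\theta)$:

$$p(y_i, t_i \mid \delta_i, \theta) = \frac{1}{Z_i(\theta)} \exp\bigl(y_i t_i (\theta + \delta_i)\bigr) = \frac{1}{Z_i(\theta)} \exp\bigl(y_i t_i \theta + y_i t_i \delta_i\bigr),$$

and define g(θ) as (see footnote 2)

$$g(\theta) = \frac{1}{Z\sqrt{\pi}} \prod_i Z_i(\theta)\, \exp(-\theta^2),$$

such that we have the marginal distribution

$$\begin{aligned}
p(y, t) &= \exp\Bigl(\sum_i y_i t_i \delta_i\Bigr) \int_{\mathbb{R}} \frac{1}{\prod_i Z_i(\theta)} \exp\Bigl(\theta \sum_i y_i t_i\Bigr) \frac{1}{Z\sqrt{\pi}} \prod_i Z_i(\theta)\, \exp(-\theta^2)\, d\theta \\
&= \exp\Bigl(\sum_i y_i t_i \delta_i\Bigr) \int_{\mathbb{R}} \frac{1}{Z\sqrt{\pi}} \exp\Bigl(\theta \sum_i y_i t_i - \theta^2\Bigr)\, d\theta.
\end{aligned}$$

Applying equation (2), we get that

$$p(y, t) = \frac{1}{Z} \exp\Biggl(\sum_i y_i t_i \delta_i + \Bigl[\frac{1}{2}\sum_i y_i t_i\Bigr]^2\Biggr).$$

Footnote 2: This poses the possible difficulty of points of discontinuity where $\theta_p = -\delta_i$. However, applying L’Hôpital’s rule to evaluate the limit for $\theta_p$ tending to $-\delta_i$ shows that there are no points of discontinuity, because the limit is finite.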

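We can write this as

$$p(y, t^*) = \frac{1}{Z} \exp\Bigl((y \circ t^*)^{\mathsf T}\mu + \frac{1}{4}\bigl[y^{\mathsf T} t^*\bigr]^2\Bigr),$$

where $\mu = [\delta_i]$, $t^*$ is the vector of residual times, and $\circ$ denotes the elementwise product. This is a rank one network model, called a random graph model (Erdős & Rényi, 1960), in which the random edge weight is the product of the residual times. A random graph implies that the graph structure changes over individuals. This model will be referred to as SRTM1.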

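For the second approach, we begin with a slightly altered version of the SRTM, where again $y_i \in \{-1, +1\}$:

$$p(y_i, t_i \mid \delta_i, \theta) = \frac{(\theta + \delta_i) \exp\bigl(y_i (d_i - t_i)(\theta + \delta_i)\bigr)}{\exp\bigl(d_i(\theta + \delta_i)\bigr) - \exp\bigl(-d_i(\theta + \delta_i)\bigr)}.$$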

In this approach, we will take into consideration the fact that the SRTM assumes that response times and response accuracy are dependent given the latent trait, while the DDM and the HRM assume CI between response times and response accuracy. The dependence between response times and response accuracy can be disentangled by a certain transformation of the residual response times. Maris and van der Maas (2012) observe that the conditional distribution for $t_i$ is

$$p(t_i \mid y_i, \delta_i, \theta) = \frac{(\theta + \delta_i) \exp\bigl(y_i (d_i - t_i)(\theta + \delta_i)\bigr)}{y_i \bigl[\exp\bigl(y_i d_i (\theta + \delta_i)\bigr) - 1\bigr]},$$

and note that the dependence between response accuracy and response times is manifested in a qualitative difference between the behavior of able and less able people. That is, able people have fast correct responses and slow incorrect responses, while less able people have slow correct responses and fast incorrect responses:

$$p(T_i \mid y_i = 1, \varphi_i) = p(d_i - T_i \mid y_i = -1, \varphi_i) = p(d_i - T_i \mid y_i = 1, -\varphi_i) = p(T_i \mid y_i = -1, -\varphi_i),$$

where $\varphi_i = \theta + \delta_i$. To ensure CI, response times can be transformed such that the original response times are kept for correct responses, while the residual response times are used for incorrect responses. Following Maris and van der Maas (2012), we will refer to them as pseudo times $t^*$:

$$T_i^* = \begin{cases} T_i & \text{if } Y_i = +1 \\ d_i - T_i & \text{if } Y_i = -1 \end{cases} \;\sim\; (T \mid y_i = 1).$$

A person with high ability has fast correct responses and slow incorrect responses, but for their incorrect responses the actual time is replaced with the residual time. This means that the pseudo times for both their correct and incorrect responses are quick. Similarly, people with low abilities are expected to give slow correct answers and fast incorrect answers, but their pseudo times will indicate slow responses regardless of whether they respond correctly or not. This is depicted in figures 7 and 8, which show the scores of the SRTM with original response times and the scores of the SRTM with pseudo times, respectively. In the figures, the green lines represent scores for correct responses, while the red lines represent scores for incorrect responses. With this transformation, the pseudo time SRTM shares the CI property with the HRM and DDM.

Figure 7. Scores of the SRTM with original response times.

Figure 8. Scores of the SRTM with pseudo times.

We then have:

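$$\begin{aligned}
p(y, t \mid \theta, \delta) &= p(y \mid \theta, \delta)\, p(t \mid \theta, \delta) \\
&= \prod_i \Biggl[\frac{e^{y_i d_i (\theta + \delta_i)}}{e^{d_i(\theta + \delta_i)} + e^{-d_i(\theta + \delta_i)}} \cdot \frac{(\theta + \delta_i)\, e^{(d_i - t_i)(\theta + \delta_i)}}{e^{d_i(\theta + \delta_i)} - 1}\Biggr] \\
&= \frac{1}{\prod_i Z_i(\theta)} \exp\Bigl(\sum_i (y_i d_i + d_i - t_i)\theta + \sum_i (y_i d_i + d_i - t_i)\delta_i\Bigr),
\end{aligned}$$

where

$$Z_i(\theta) = \frac{\bigl(e^{d_i(\theta + \delta_i)} + e^{-d_i(\theta + \delta_i)}\bigr)\bigl(e^{d_i(\theta + \delta_i)} - 1\bigr)}{\theta + \delta_i}.$$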
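The pseudo-time transformation and the symmetry of the conditional time density noted by Maris and van der Maas (2012) can be sketched as follows. The function names and the numerical values for the time limit d and for φ = θ + δ are hypothetical; the density is the conditional distribution of response time given accuracy stated earlier in this section, and it assumes φ ≠ 0.

```python
import math


def pseudo_time(t, y, d):
    """Pseudo time: keep t for a correct response (y = +1) and
    use the residual time d - t for an incorrect response (y = -1)."""
    if y not in (-1, 1):
        raise ValueError("y must be -1 or +1")
    return t if y == 1 else d - t


def conditional_time_density(t, y, d, phi):
    """Conditional density p(t | y, phi) of the SRTM, with phi = theta + delta != 0."""
    return phi * math.exp(y * (d - t) * phi) / (y * (math.exp(y * d * phi) - 1.0))


# Hypothetical time limit and value of phi.
d, phi = 10.0, 0.4
for t in (0.5, 3.0, 7.5):
    reference = conditional_time_density(t, 1, d, phi)
    # Time of a correct response ~ residual time of an incorrect response.
    assert math.isclose(reference, conditional_time_density(d - t, -1, d, phi))
    # The same holds when the sign of phi is flipped as well.
    assert math.isclose(reference, conditional_time_density(d - t, 1, d, -phi))
    assert math.isclose(reference, conditional_time_density(t, -1, d, -phi))
```

Because the density of T given a correct response equals the density of d − T given an incorrect response, the pseudo times have the same conditional distribution for both outcomes, which is what removes the dependence on accuracy.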

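Now we want to define a distribution for the latent trait g(θ), such that we can cancel out $Z_i(\theta)$:

$$g(\theta) = \frac{1}{Z\sqrt{\pi}} \prod_i Z_i(\theta)\, \exp(-\theta^2).$$

With g(θ) defined, we find that the marginal distribution is the following:

$$\begin{aligned}
\int_{\mathbb{R}} p(y, t \mid \theta, \delta)\, g(\theta)\, d\theta &= \int_{\mathbb{R}} \frac{1}{Z\sqrt{\pi}} \exp\Bigl(\sum_i (y_i d_i + d_i - t_i)\theta + \sum_i (y_i d_i + d_i - t_i)\delta_i - \theta^2\Bigr)\, d\theta \\
&= \frac{1}{Z} \exp\Biggl(\sum_i (y_i d_i + d_i - t_i)\delta_i + \frac{1}{4}\Bigl(\sum_i (y_i d_i + d_i - t_i)\Bigr)^2\Biggr).
\end{aligned}$$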

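Simplifying the terms in the exponent, we now have the joint distribution of the pseudo times $t^*$ and transformed response accuracy y:

$$\begin{aligned}
p(y, t^*) &= \frac{1}{Z} \exp\Bigl(\sum_i y_i d_i (\delta_i + d_+/2) - \sum_i t_i (\delta_i + d_+/2) + \frac{1}{4}\sum_i \sum_j d_i d_j y_i y_j + \frac{1}{4}\sum_i \sum_j t_i t_j - \frac{1}{2}\sum_i \sum_j t_i y_j d_j\Bigr) \\
&= \frac{1}{Z} \exp\bigl(y^{\mathsf T}\mu_y + t^{*\mathsf T}\mu_t + y^{\mathsf T}\Sigma_{yy}\, y + t^{*\mathsf T}\Sigma_{tt}\, t^* + 2 y^{\mathsf T}\Sigma_{yt}\, t^*\bigr) \\
&= \frac{1}{Z} \exp\Biggl(\begin{bmatrix} y \\ t^* \end{bmatrix}^{\mathsf T} \begin{bmatrix} \mu_y \\ \mu_{t^*} \end{bmatrix} + \begin{bmatrix} y \\ t^* \end{bmatrix}^{\mathsf T} \begin{bmatrix} \Sigma_{yy} & \Sigma_{yt^*} \\ \Sigma_{t^*y} & \Sigma_{t^*t^*} \end{bmatrix} \begin{bmatrix} y \\ t^* \end{bmatrix}\Biggr),
\end{aligned}$$

where $d_+ = \sum_i d_i$, $t^*_+ = \sum_i t_i$, and

$$\mu_y = [d_i(\delta_i + d_+/2)], \qquad \mu_t = [-\delta_i - d_+/2], \qquad \Sigma_{yy} = [d_i d_j / 4], \qquad \Sigma_{yt} = [-d_i / 4], \qquad \Sigma_{tt} = [1/4].$$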

Note that the main effects $\mu_t$ and $\mu_y$ both contain the item difficulty $\delta_i$ and the item time limit $d_i$, which is equal to the discrimination parameter of the 2PLM. This model will be referred to as SRTM2. The interaction effects are rank one matrices, which reflects the dimensionality of the corresponding latent trait (Holland, 1990). This means that the interaction effects of the marginal SRTM suggest that its corresponding latent trait model has a unidimensional latent trait. It is also interesting to observe that the pairwise interaction effect $\Sigma_{yt^*}$ is negative.
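The step from the completed square to the main effects and interaction effects of the SRTM2 can be verified numerically. The sketch below builds the exponent once from μ_y, μ_t, Σ_yy, Σ_yt and Σ_tt as listed above, and once directly from the marginal distribution; the two differ only by a constant that does not depend on y or t and is therefore absorbed into Z. Function names and the numerical item values are hypothetical.

```python
import itertools
import math


def srtm2_graphical_exponent(y, t, d, delta):
    """Exponent built from the main effects and interaction effects of the SRTM2."""
    n = len(y)
    d_plus = sum(d)
    mu_y = [d[i] * (delta[i] + d_plus / 2.0) for i in range(n)]
    mu_t = [-(delta[i] + d_plus / 2.0) for i in range(n)]
    total = sum(y[i] * mu_y[i] + t[i] * mu_t[i] for i in range(n))
    for i in range(n):
        for j in range(n):
            total += y[i] * y[j] * d[i] * d[j] / 4.0     # Sigma_yy entry d_i d_j / 4
            total += t[i] * t[j] / 4.0                   # Sigma_tt entry 1 / 4
            total += 2.0 * y[i] * t[j] * (-d[i] / 4.0)   # 2 * Sigma_yt entry -d_i / 4
    return total


def srtm2_direct_exponent(y, t, d, delta):
    """Exponent of the marginal before expanding the square."""
    terms = [y_i * d_i + d_i - t_i for y_i, t_i, d_i in zip(y, t, d)]
    return sum(u * b for u, b in zip(terms, delta)) + 0.25 * sum(terms) ** 2


def srtm2_constant(d, delta):
    """Part of the direct exponent that is free of y and t (absorbed into Z)."""
    return sum(d_i * b for d_i, b in zip(d, delta)) + 0.25 * sum(d) ** 2


# Hypothetical three-item example: time limits, difficulties and pseudo times.
d = [2.0, 3.0, 1.5]
delta = [0.2, -0.5, 0.1]
t = [0.4, 2.1, 1.0]
for y in itertools.product((-1, 1), repeat=3):
    lhs = srtm2_graphical_exponent(y, t, d, delta)
    rhs = srtm2_direct_exponent(y, t, d, delta) - srtm2_constant(d, delta)
    assert math.isclose(lhs, rhs, rel_tol=1e-12, abs_tol=1e-12)
```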

Marginalizing the HRM

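We will write the response accuracy of the HRM in accordance with the SRTM, with the values −1 and 1 in place of 0 and 1. We therefore introduce the response accuracy $y_i = 2x_i - 1$, such that the model for response accuracy looks as such:

$$p(y_i \mid \theta) = \frac{\exp\bigl(y_i(\alpha_i \theta + \delta_i)\bigr)}{\exp(\alpha_i \theta + \delta_i) + \exp\bigl(-(\alpha_i \theta + \delta_i)\bigr)} = \frac{1}{Z_i(\theta)} \exp\bigl(y_i(\alpha_i \theta + \delta_i)\bigr).$$

Here, $Z_i(\theta)$ is the normalizing constant. The probability of response times given speed is the following:

$$\begin{aligned}
p(t_i \mid \eta) &= \frac{a_i}{\sqrt{\pi}\, t_i} \exp\bigl(-a_i^2 (\ln(t_i) + \eta + b_i)^2\bigr) \\
&= \frac{1}{t_i} \exp\bigl(-a_i^2 \ln(t_i)^2 - 2a_i^2 \ln(t_i)(\eta + b_i)\bigr) \times \frac{a_i}{\sqrt{\pi}} \exp\bigl(-a_i^2 (\eta + b_i)^2\bigr) \\
&= \frac{1}{t_i} \exp\bigl(-a_i^2 \ln(t_i)^2 - 2a_i^2 \ln(t_i)(\eta + b_i)\bigr) \frac{1}{W_i(\eta)},
\end{aligned}$$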

where $W_i(\eta)$ is the normalizing constant. As mentioned previously, the HRM consists of the joint probability of response accuracy and response times, which is p(y, t) = p(y | θ)p(t | η). Since the distributions of (log) response times and accuracy are independent in the HRM, conditional on η and θ, their marginal distributions will also be independent if θ and η are modeled to be independent a priori. However, because we are interested in the effect response times have on ability and not particularly in the speed parameter, we will introduce a transformation so as to get a direct correlation between ability and response times, rather than an indirect correlation through the correlation between θ and η. The transformation is the following (see footnote 3):

$$\eta = \theta + \lambda,$$

where we assume that the relationship between θ and η is linear, so that we can write η as the sum of θ and some error λ. This transformation will lead to the HRM being structured as is depicted in figure 9, as opposed to the depiction in figure 4. That is, θ now has a direct connection to the (log) response times, while in the original model, it had an indirect relation through its correlation with η. The model is now in a similar form as the SRTM and the DDM, in the sense that in both of those models θ also has a direct connection to the response times.

Figure 9. The HRM after transformation of η. Now θ has a direct connection to the (log) response times. (Diagram nodes: θ, η, X1, X2, ln t1, ln t2.)

Footnote 3: Another way to do this would be to replace the latent trait distribution to explicitly include a correlation, such that

$$f(\theta, \eta) = \frac{1}{\sqrt{1 - \rho^2}} \prod_i Z_i(\theta) W_i(\eta) \exp\Bigl(\frac{-\theta^2 - 2\theta\rho\eta - \eta^2}{1 - \rho^2}\Bigr).$$

In this way, the results are almost exactly the same as the results of the method we use, with the exception that for this method,

$$\Sigma_{tt^*} = \begin{cases} \alpha_i^2 \alpha_j^2 & \text{if } i \neq j \\ \alpha_i^4 - \alpha_i^2 & \text{if } i = j \end{cases} \qquad \Sigma_{yt} = [\rho\, a_i \alpha_j^2].$$

The Jacobian for this transformation is equal to one:

$$\eta = \theta + \lambda \iff \lambda = \eta - \theta \iff |J| = 1,$$

such that the distribution of time, including the transformation, is

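$$p(t_i \mid \theta, \lambda) = \frac{1}{t_i} \exp\bigl(-a_i^2 (\ln(t_i))^2 - 2a_i^2 \ln(t_i)(\theta + \lambda + b_i)\bigr) \frac{1}{W_i(\theta + \lambda)}.$$

We now define the joint distribution of θ and λ as:

$$g(\theta, \lambda) = \prod_i Z_i(\theta) W_i(\theta + \lambda)\, \frac{1}{Z\pi} \exp(-\theta^2 - \lambda^2).$$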

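Now, the marginal distribution of the HRM is the following:

$$\begin{aligned}
\int_{\mathbb{R}^2} p(y \mid \theta)\, p(t \mid \theta, \lambda)\, g(\theta, \lambda)\, d\theta\, d\lambda &= \frac{1}{Z\pi \prod_i t_i} \exp\Bigl(\sum_i y_i \delta_i - \sum_i a_i^2 \ln(t_i)^2 - 2\sum_i a_i^2 \ln(t_i) b_i\Bigr) \\
&\quad \times \int_{\mathbb{R}} \exp\Bigl(\theta \sum_i \bigl[y_i \alpha_i - 2a_i^2 \ln(t_i)\bigr] - \theta^2\Bigr)\, d\theta \times \int_{\mathbb{R}} \exp\Bigl(-2\lambda \sum_i a_i^2 \ln(t_i) - \lambda^2\Bigr)\, d\lambda \\
&= \frac{1}{Z \prod_i t_i} \exp\Bigl(\sum_i y_i \delta_i - \sum_i a_i^2 \ln(t_i)^2 - 2\sum_i a_i^2 \ln(t_i) b_i\Bigr) \\
&\quad \times \exp\Bigl(\sum_i \sum_j y_i y_j \frac{\alpha_i \alpha_j}{4} + \sum_i \sum_j \ln(t_i) \ln(t_j)\, 2a_i^2 a_j^2 - \sum_i \sum_j y_i \ln(t_j) \frac{\alpha_i a_j^2}{2}\Bigr).
\end{aligned}$$

This gives us the graphical model

$$\begin{aligned}
p(y, t^*) &= \frac{1}{Z} \exp\bigl(y^{\mathsf T}\mu_y + t^{*\mathsf T}\mu_t + y^{\mathsf T}\Sigma_{yy}\, y + t^{*\mathsf T}\Sigma_{tt}\, t^* + 2 y^{\mathsf T}\Sigma_{yt}\, t^*\bigr) \\
&= \frac{1}{Z} \exp\Biggl(\begin{bmatrix} y \\ t^* \end{bmatrix}^{\mathsf T} \begin{bmatrix} \mu_y \\ \mu_{t^*} \end{bmatrix} + \begin{bmatrix} y \\ t^* \end{bmatrix}^{\mathsf T} \begin{bmatrix} \Sigma_{yy} & \Sigma_{yt^*} \\ \Sigma_{t^*y} & \Sigma_{t^*t^*} \end{bmatrix} \begin{bmatrix} y \\ t^* \end{bmatrix}\Biggr),
\end{aligned}$$

where $t_i^* \equiv \ln(t_i)$, and

$$\mu_y = [\delta_i], \qquad \mu_t = [2a_i^2 b_i - 1], \qquad \Sigma_{yy} = \bigl[\tfrac{1}{4}\alpha_i \alpha_j\bigr], \qquad \Sigma_{tt^*} = \begin{cases} 2a_i^2 a_j^2 & \text{if } i \neq j \\ 2a_i^4 - a_i^2 & \text{if } i = j \end{cases} \qquad \Sigma_{yt^*} = \bigl[\tfrac{1}{4}\alpha_i a_j^2\bigr].$$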

The diagonal of $\Sigma_{tt^*}$ implies that $|a_i| < 1/\sqrt{2}$, otherwise the integral does not exist. Why this is the case requires further study. We will refer to this model as HRM1. Note that the main effect for response accuracy $\mu_y$ consists only of the item difficulty, while the main effect for log response times $\mu_{t^*}$ contains both item time intensity and item time discrimination. As for the SRTM2, all interaction effects are rank one matrices.

Comparing the marginal models

From the three latent trait models, we obtained five marginal models: one for the HRM and two each for the SRTM and the DDM.

We will start out by comparing DDM1, SRTM2 and HRM1. These three all have the CI assumption, and their latent trait expressions all assume that θ directly influences response accuracy and some function of response times. This is depicted in figure 10 for the SRTM2 and DDM1. For the DDM1, the function of response times is the decision time $t - T_{er}$, for the SRTM2 it is the pseudo times, and for the HRM1 it is the natural logarithm of the response times. However, the latent trait expression of HRM1 also assumes that η influences its log response times, as seen in figure 9.

Figure 10. The latent trait expression of the SRTM2 and DDM1. θ directly influences response accuracy and a function of response times. For the SRTM2 this function is the pseudo times, while for the DDM1, it is the decision time. (Diagram nodes: θ, X1, X2, f(t1), f(t2).)

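Interestingly, the SRTM2 and HRM1 are of the same form:

$$p(y, t) = \frac{1}{Z} \exp\Biggl(\begin{bmatrix} y \\ t \end{bmatrix}^{\mathsf T} \begin{bmatrix} \mu_y \\ \mu_t \end{bmatrix} + \begin{bmatrix} y \\ t \end{bmatrix}^{\mathsf T} \begin{bmatrix} \Sigma_{yy} & \Sigma_{yt} \\ \Sigma_{ty} & \Sigma_{tt} \end{bmatrix} \begin{bmatrix} y \\ t \end{bmatrix}\Biggr),$$

where for SRTM2,

$$\mu_y = [d_i(\delta_i + d_+/2)], \qquad \mu_t = [-\delta_i - d_+/2], \qquad \Sigma_{yy} = [d_i d_j / 4], \qquad \Sigma_{yt} = [-d_i / 4], \qquad \Sigma_{tt} = [1/4],$$

whilst for HRM1,

$$\mu_y = [\delta_i], \qquad \mu_t = [2a_i^2 b_i - 1], \qquad \Sigma_{yy} = \bigl[\tfrac{1}{4}\alpha_i \alpha_j\bigr], \qquad \Sigma_{tt^*} = \begin{cases} 2a_i^2 a_j^2 & \text{if } i \neq j \\ 2a_i^4 - a_i^2 & \text{if } i = j \end{cases} \qquad \Sigma_{yt^*} = \bigl[\tfrac{1}{4}\alpha_i a_j^2\bigr].$$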

What is also noteworthy is that these are both graphical models. This result is consistent with the works of Marsman et al. (2015) and Epskamp et al. (2016), where marginalization of latent trait models also led to graphical models. For both of these models, the interaction effects are rank one matrices, which implies a unidimensional latent trait in their latent trait expressions. Noticeably, the pairwise interaction matrix $\Sigma_{yt}$ for the SRTM2 is negative, while for the HRM1 it is positive. The meaning of this difference would be interesting to study further. The main effect for response accuracy $\mu_y$ in the SRTM2 consists of both the time limit (item discrimination) and item difficulty. In the HRM1, this main effect consists only of item difficulty. For the SRTM2, the main effect of pseudo times $\mu_{t^*}$ also consists of item difficulty and item discrimination. For the HRM1, the main effect of log response times $\mu_t$ consists of the parallel parameters from the time distribution: item time intensity and item time discrimination.

In addition, notice that the HRM1 in its latent form didn’t assume any specific relationship between log response times and response accuracy, while its marginal does assume a specific relationship between the two, namely

$$\exp\Bigl(-\sum_i \sum_j y_i \ln(t_j) \frac{\alpha_i a_j^2}{2}\Bigr).$$

Although we are not quite sure what this relationship looks like, it is apparent that the relation is negative, meaning that higher log times will be more likely to result in less accuracy. This relation depends on both item discrimination α and item time discrimination a.

From these observations, we propose the following general graphical framework for the joint distribution of response accuracy and response times, of which the SRTM2 and HRM1 are special cases:

$$p(y, t) = \frac{1}{Z(\mu, \Sigma)} \exp\Biggl(\begin{bmatrix} y \\ t \end{bmatrix}^{\mathsf T} \mu + \begin{bmatrix} y \\ t \end{bmatrix}^{\mathsf T} \Sigma \begin{bmatrix} y \\ t \end{bmatrix}\Biggr)\, \omega(t),$$

where y refers to the response accuracy taking the values −1 and 1, t is the function of response times, μ contains the main effects for response accuracy and times (item parameters), and Σ contains the interaction effects for response times and accuracy (also item parameters). The measure ω(t) ensures that the model can be integrated, while Z(μ, Σ) is the normalizing constant, which ensures that the model integrates to one and depends on the item effects and interactions.
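To make the structure of the general framework concrete, the following minimal sketch evaluates the exponent of the framework for a stacked vector of accuracies and time functions. The measure ω(t) and the normalizing constant Z(μ, Σ) are deliberately left out, and the function name as well as the one-item parameter values are hypothetical illustrations rather than estimates from any model in this paper.

```python
import math


def framework_log_kernel(y, t, mu, sigma):
    """Exponent z^T mu + z^T Sigma z of the general graphical framework,
    where z stacks the accuracies y (coded -1/+1) on top of the time functions t."""
    z = list(y) + list(t)
    linear = sum(z_k * mu_k for z_k, mu_k in zip(z, mu))
    quadratic = sum(z[k] * sigma[k][l] * z[l]
                    for k in range(len(z)) for l in range(len(z)))
    return linear + quadratic


# Hypothetical one-item example: one accuracy and one function of time.
y = [1]
t = [0.5]
mu = [0.2, -0.3]                      # main effects for accuracy and time
sigma = [[0.25, -0.10],               # [Sigma_yy, Sigma_yt]
         [-0.10, 0.25]]               # [Sigma_ty, Sigma_tt]
value = framework_log_kernel(y, t, mu, sigma)
assert math.isclose(value, 0.2625)
```

A negative off-diagonal block, as in this toy example, lowers the exponent when a correct response coincides with a large value of the time function, which is the kind of relation the sign of the pairwise interaction encodes.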


The DDM1 has the following distribution:

$$p(x, t) = \frac{1}{Z}\, \frac{\prod_i c(t_i)}{\sqrt{1 + \frac{1}{2}\sum_i t_i}}\, \exp\Biggl(\sum_i x_i \alpha_i \delta_i - \frac{1}{2}\sum_i t_i \delta_i^2 + \frac{\bigl(\sum_i (x_i \alpha_i - t_i \delta_i)\bigr)^2}{4\bigl(1 + \frac{1}{2}\sum_i t_i\bigr)}\Biggr).$$

This is of similar form as the other two models, in the sense that the exponential consists of main effects, the response accuracy, response times and interaction effects. However, the function of time before the exponent, and the presence of another function of time in the denominator of the squared exponential, make it of a slightly different form. These similarities do however give hope that some function of time exists for the DDM such that its marginal will be of the same form as the HRM1 and the SRTM2, and can fit into the general graphical framework. We do not know whether such a function exists, and leave this question to be solved at a later time.

In addition to the DDM1, SRTM2 and HRM1, we arrived at two other marginal models: the DDM2 and SRTM1. The DDM2 turned out to be a graphical model, and the SRTM1 is a random graph model. Although perhaps interesting by themselves, these models don’t have specific relations to the other three models, and therefore we will not discuss them further.

Conclusions & Discussion

We have discussed three popular models that jointly model response accuracy and response times. The DDM was introduced for cognitive experimental psychology, and assumes that only ability influences response times and accuracy. It does not (explicitly) assume a latent trait for speed. The DDM assumes CI between response accuracy and response times given ability. It predicts that as response time increases, accuracy increases for people with a high ability, while it decreases for people with a low ability. The SRTM has an explicit scoring rule and was introduced for tests with a time limit. Like the DDM, the SRTM assumes only a latent trait for ability, but models response accuracy and time as dependent given ability. It expects people with a high ability to have fast correct responses and slow incorrect responses, and people with a low ability to have slow correct responses and fast incorrect responses. Lastly, the HRM assumes two latent traits, one for ability and one for speed, and models response accuracy and time as independent given speed and ability. It assembles two models into one: a separate model for response accuracy and a model for log-normal response times. It doesn’t predict a particular relation between response times and response accuracy.

We applied a strategy of marginalization from the works of Marsman et al. (2015) and Epskamp et al. (2016) that shows the relation between the Ising model and IRT models. Our results showed firstly that all three models were successfully marginalized, some in more ways than one. This adds one more piece to the growing literature investigating the relationships between latent trait models and models that only use manifest variables, such as graphical models. That the manifest probabilities could be found for all models is a reminder that even though latent traits are useful tools for modeling, the fit of a latent model to data doesn’t prove the existence of latent traits, nor does it disprove other interpretations of the data.

The marginal models we arrived at were five in total: DDM1, the marginal of the original DDM; DDM2, the marginal of the DDM where a second latent trait was introduced such that it is a hierarchical model; SRTM1, the marginal of the unchanged SRTM; SRTM2, the marginal of the SRTM with a transformation of response times; and lastly HRM1, the marginal of the HRM with a transformation of speed such that ability would have a direct influence on response times in addition to speed.

The DDM2 is a graphical model, which is interesting in itself. However, for this comparison it is uninteresting, since it is not comparable to the rest of the models. Similarly, the SRTM1 is also not comparable to the rest of the models, since it doesn’t assume CI and its form differs from the rest. It takes the form of a random graph model.

The most interesting results were obtained from the SRTM2 and HRM1: they are both graphical models of the same form. From this we suggested a more general graphical framework of which both models are special cases, and possibly even more models unknown at this time. The matrix of pairwise interaction effects for the SRTM2 is negative, while for the HRM1 it is positive. This difference is intriguing and requires further study. Another interesting result is that even though the latent trait expression of the HRM doesn’t characterize a specific SAF, its marginal does. The next step here would be to simulate data according to the model and depict the SAF characterized by the HRM1.

The last marginal model we found was the DDM1, which was very similar in form to the SRTM2 and HRM1, but did not quite fit into the form. This however gives us hope of the possible existence of a function of time such that the DDM will fit in the general graphical framework. This is the next step of research we think should follow, and we encourage the reader to think of a possible way to do this. This possibility brings up questions about the function of times for the other models: specifically, how many different functions of response times can we apply to the models, and will any of them fit this general framework? Related to this, could we use the strategy we describe above to get from the general framework to another, previously undefined type of latent trait model? Or to any type of model that jointly models response times and accuracy? If so, how many possible ways would there be to do this? Our results raise many unanswered questions and provide material for potential future studies. In addition to the questions above we have named three details that require attention: what the sign of the pairwise interactions of the general graphical framework implies, what the SAF of the HRM1 looks like, and whether there is a function of time for the DDM such that it fits the general framework. We are intrigued by all these options for future studies, and hope that the reader is equally inspired to continue this story.


References

Anderson, C. J., & Vermunt, J. K. (2000). Log-multiplicative association models as latent variable models for nominal and/or ordinal data. Sociological Methodology, 30 (1), 81–121.

Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society. Series B (Methodological), 36, 192–236.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. Lord & M. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Addison-Wesley.

Bolsinova, M., & Tijmstra, J. (2017). Improving precision of ability estimation: Getting more from response times. British Journal of Mathematical and Statistical Psychology. doi: 10.1111/bmsp.12104

Bolsinova, M., Tijmstra, J., & Molenaar, D. (2016). Response moderation models for conditional dependence between response time and response accuracy. British Journal of Mathematical and Statistical Psychology, 70, 257–279. doi: 10.1111/bmsp.12076

Bolsinova, M., Tijmstra, J., Molenaar, D., & De Boeck, P. (2017). Conditional dependence between response time and accuracy: An overview of its possible sources and directions for distinguishing between them. Frontiers in Psychology, 8(202). doi: 10.3389/fpsyg.2017.00202

Cox, D. R., & Wermuth, N. (2002). On some models for multivariate binary variables parallel in complexity with the multivariate Gaussian distribution. Biometrika, 89, 462–469. doi: 10.1093/biomet/89.2.462

Cressie, N., & Holland, P. W. (1981). Characterizing the manifest probabilities of latent trait models. ETS Research Report Series, 1981. doi: 10.1002/j.2333-8504.1981.tb01281.x

Eckhoff, P., Holmes, P., Law, C., Connolly, P., & Gold, J. (2008). On diffusion processes with variable drift rates as models for decision making during learning. New Journal of Physics, 10, 015006. doi: 10.1088/1367-2630/10/1/015006

Epskamp, S., Maris, G. K., Waldorp, L. J., & Borsboom, D. (2016). Network psychometrics. arXiv preprint arXiv:1609.02818.

Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci, 5, 17–60.

Goldhammer, F. (2015). Measuring ability, speed, or both? Challenges, psychometric solutions, and what can be gained from experimental control. Measurement: Interdisciplinary Research and Perspectives, 13, 133–164. doi: 10.1080/15366367.2015.1100020

Holland, P. W. (1990). The Dutch identity: A new tool for the study of item response models. Psychometrika, 55, 5–18. doi: 10.1007/BF02294739

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik A Hadrons and Nuclei, 31, 253–258. doi: 10.1007/BF02980577

Jansen, B. R., Louwerse, J., Straatemeier, M., Van der Ven, S. H., Klinkenberg, S., & Van der Maas, H. L. (2013). The influence of experiencing success in math on math anxiety, perceived math competence, and math performance. Learning and Individual Differences, 24 , 190–197. doi: 10.1016/j.lindif.2012.12.014
