A nonparametric Item Response Theory model for unidimensional unfolding


Mudfold :

A nonparametric Item Response Theory model for unidimensional unfolding

by

Spyros E. Balafas

A thesis submitted to University of Groningen in conformity with the requirements for the degree of Master of Science.

Groningen, the Netherlands January, 2016

© 2016 Spyros E. Balafas. All rights reserved.

Abstract

Preferences, choices and attitudes are modelled in statistics with the use of latent variables. These variables are not directly measurable, and their values are obtained from other indicator variables called items. The study that handles and analyzes the responses of a population to a set of items in a statistical context is known as Item Response Theory (IRT). MUDFOLD (Multiple UniDimensional unFOLDing) is a nonparametric model in the class of IRT models for unfolding unidimensional latent variables constructed from dichotomous responses. The distinctive feature of MUDFOLD compared to other nonparametric approaches is its item response function (IRF), which has a single-peaked shape. Nonparametric item response models (NIRM) are statistical methods used to analyze item response data collected in a sample of individuals in order to find out whether the items can be considered indicators of the same hypothetical construct.

Primary Reader: Wim P. Krijnen
Secondary Reader: Ernst Wit

Acknowledgments

Hereby, I would like to thank my first supervisor Dr. W.P. Krijnen for his help and support in the completion of the present research. Additionally, I would like to give my special thanks to my second supervisor Prof. Dr. E.C. Wit for his inspirational comments and his help during my studies.

Moreover, I would like to thank Dr. A. Mohammadi, S.M. Mahmoudi (PhD candidate) and M. Eisenmann (colleague) for their feedback on my R code, and A. Delifilipidi and Sev. Balafas for their comments on the grammar.

I will always be thankful to Prof. Dr. A. Philipou for his guidance and support during my undergraduate studies.

Dedication

To my family for their unconditional love.

Contents

Abstract
Acknowledgments
List of Tables
List of Figures

1 Introduction
  1.1 Nonparametric Item Response Theory

2 Literature Review
  2.1 Guttman scale
  2.2 Rasch model
  2.3 Mokken model (MH)
  2.4 Unfolding models
  2.5 Coombs unfolding model

3 Mudfold
  3.1 Assumptions
  3.2 Unfolding scale analysis
  3.3 Triples
  3.4 Errors and scalability coefficients for Mudfold
    3.4.1 Expected Errors in Triples
    3.4.2 Scalability Coefficient for Triples
    3.4.3 Symmetries in Errors
  3.5 Nonparametric scale construction
  3.6 Best elementary scale
    3.6.1 Unique Triples
  3.7 Extending the Best Elementary Scale
    3.7.1 Scalability H for the Scale
  3.8 Diagnostics
    3.8.1 Dominance matrix
    3.8.2 Adjacency matrix
    3.8.3 Conditional Adjacency matrix
    3.8.4 Iso statistic

4 Model Application
  4.1 Introduction
  4.2 Sports data
  4.3 ANDRICH data
  4.4 Plato7 data

5 Simulation study
  5.1 Design
  5.2 Results
    5.2.1 P1 simulation
    5.2.2 P2 simulation
    5.2.3 Mean scale length P1 vs P2

6 Summary

Bibliography

List of Tables

2.1 A data set that conforms to Guttman's structure
2.2 Collection of responses for models that formulate cumulative response process (left) and proximity response process (right) respectively
2.3 Parallelogram shape for pick any/8 data
3.1 All the ordered triples that have to be examined for K = 5 items
3.2 Data matrix X for a sample of n = 6 individuals responding on K = 5 items
3.3 Marginal popularities for each item of the data set in Table 3.2
3.4 Data matrix of the permutations (h, j, k) and (k, j, h) respectively for the set of items {A, B, C}
3.5 Ordered triples defined for the scale BCADFE
3.6 Dominance matrix for the Mudfold scale ABDEF obtained from EURPAR2 data set
3.7 Adjacency matrix for the Mudfold scale ABDEF obtained from EURPAR2 data set
3.8 Conditional adjacency matrix for the Mudfold scale ABDEF obtained from EURPAR2 data set
5.1 10 principal response patterns containing the responses of individuals on 6 items with no Mudfold errors
5.2 P1 simulation Results
5.3 P2 simulation Results
.1 P1 simulation Results for Mudfold alternative procedure
.2 P2 simulation Results for Mudfold alternative procedure

List of Figures

2.1 Item response function for a Guttman item (step function)
2.2 Item response function for an item that conforms to Rasch model
2.3 Cumulative model IRF vs Unfolding model IRF
2.4 4 items and 3 individuals scaled on the latent continuum of a unidimensional latent variable
3.1 Graphical representation of a dichotomous item and its two item steps
3.2 3 items scaled on the latent continuum of a unidimensional latent variable
3.3 3 items scaled on the latent continuum of a unidimensional latent variable
3.4 3 items scaled on the latent continuum of a unidimensional latent variable
4.1 Proportions of positive responses (popularities) for each item in sports data
4.2 Plot of the rows of the cond. adjacency matrix (Empirical Item Response functions)
4.3 Plotted results for alternative Mudfold fit (parameter H=TRUE)
4.4 Plotted results for Mudfold fit with starting scale the triple EGF
4.5 Mean popularities for ANDRICH data
4.6 Empirical tracelines for the items in Mudfold scale for ANDRICH data
4.7 Item popularities for the dichotomized Plato7 at the clausula mean
4.8 Estimated IRFs for two Mudfold scales
4.9 Estimated IRFs for two Mudfold scales
5.1 The means of scalability coefficients and of the iso statistic (above left, below left) and the respective standard deviations (above right, below right) for the mean H and the mean iso statistic for any sample size after simulating data with process P1 and fitting Mudfold's main algorithm, plotted against all error levels (1000 replications)
5.2 The total number of items misplaced from the desired order (left) and the model misfits (right) for any sample size after simulating data with process P1 and fitting Mudfold's main algorithm, plotted against all error levels (1000 replications)
5.3 The means of scalability coefficients and of the iso statistic (above left, below left) and the respective standard deviations (above right, below right) for the mean H and the mean iso statistic for any sample size after simulating data with process P1 and fitting Mudfold's main algorithm, plotted against all error levels (1000 replications)
5.4 The total number of items misplaced from the desired order (left) and the model misfits (right) for any sample size after simulating data with process P2 and fitting Mudfold's main algorithm, plotted against all error levels (1000 replications)
5.5 The mean scale length for any sample size, error level and simulation process. The results for P1 are on the left and the resulting mean lengths for simulation process P2 are on the right (1000 repetitions)

Chapter 1

Introduction

The increasing need to examine underlying abilities in a population, and their diffusion on a larger scale, guided the development of psychometrics. Psychometrics is a field of psychology that studies the theory behind the measurement of psychological constructs. Cronbach and Meehl [5] (1955) define a hypothetical construct as a concept which cannot be directly observed, and for which there exist multiple referents, but none all-inclusive. Such constructs, e.g., intelligence, happiness, morale, or conservatism, are referred to in statistics as latent variables.

It is impossible to date the first use of latent variables in science. Maybe the first definition can be found in Plato’s epistemology when he illustrated the idea that knowledge is a justified belief.

Numerous applications in many disciplines, including psychology, economics, medicine, physics, machine learning/artificial intelligence, bioinformatics, natural language processing, econometrics, management and the social sciences, presume the existence of such variables, and thus the need for their measurement arises. These latent variables are not directly measurable, and their values are obtained from other indicator variables called items. Items are questions or statements from a questionnaire which are considered to measure the same latent ability of the population and on which an individual can declare an ordinal level of agreement. Items can be categorized on the basis of their response categories. An item j for which m + 1 ordered answer categories xj = 0, 1, . . . , m exist, with m > 1, is called polytomous. The current text, however, deals with items that have only two response categories, xj = 0, 1, where 0 := disagree and 1 := agree; these are called dichotomous items.

The study that handles and analyzes the responses of a population to a set of items (also referred to as stimuli in a number of papers) in a statistical context is known as Item Response Theory (IRT). IRT has seen explosive growth and is nowadays one of the most successful and broadly used statistical modeling techniques in psychometrics, with applications in developmental, social, educational and cognitive psychology, as well as in medical research, demography and other social science settings. Item response theory is used for the design, analysis, and scoring of tests, questionnaires and similar psychological instruments. The general objective of IRT is to construct reliable and valid tests or scales, with the ultimate goal of adequately estimating latent abilities, attitudes or preferences in a population.

A key component of IRT is the item response function (IRF), also known as the item characteristic curve (ICC), which gives the probability of a positive (e.g., correct) response to an item as a function of the latent trait θ, i.e., P (Xj = 1 | θ) = Pj(θ). Moreover, the IRF of an item and the latent abilities of subjects can be estimated from the sample's item response data. According to the approach used for the estimation of the IRFs, IRT models can be divided into two broad categories: Parametric Item Response Theory (PIRT) and Nonparametric Item Response Theory (NIRT).

PIRT models typically assume IRFs to be parametric functions (e.g., logistic curves or normal ogives) [30]. NIRT models, on the other hand, aim at scaling the set of items and the individuals on the latent continuum. One of the most influential researchers in the field of IRT in the previous decade, the Dutch statistician and psychometrician Ivo Molenaar, regarded the work of Georg Rasch (1960) and Robert Mokken (1971) as the exponents of PIRT and NIRT respectively. However, many researchers prefer to work with models that are based on desirable measurement properties but do not impose the restrictions of a parametric family on the item response functions.

The focus of the present study is on the nonparametric family of IRT models.

1.1 Nonparametric Item Response Theory

Nonparametric item response models (NIRM) are statistical methods that are used to analyze item response data collected in a sample of individuals to find out whether the items can be considered to be indicators of the same hypothetical construct [31]. This requires a scaling procedure.

The most widely acknowledged scaling model in the NIRT class of models is the monotone homogeneity model introduced by Mokken [20] (1971), which formed the basis for further developments in the nonparametric theory of cumulative scale construction. Mokken adapted Loevinger's H coefficient of homogeneity in his model as a measure that indicates scalability for pairs of items. In his Mudfold model for Multiple unidimensional unfolding analysis, Van Schuur (1984) makes use of an extended version of this scalability coefficient, which can be calculated for individual items, triples of items, and the scale as a whole, denoted H(j), H(hjk), and H respectively.

Mudfold, introduced by W. H. Van Schuur [36] and further developed by W. J. Post (1992) [22], is a nonparametric unidimensional unfolding model for dichotomous data. The objective of a Multiple unidimensional unfolding analysis is the scaling of K points, j = 1, . . . , K, representing dichotomous items on a one-dimensional latent continuum, which is the graphical representation of a one-dimensional latent variable θ.

Chapter 2

Literature Review

2.1 Guttman scale

One of the first developments in modeling questionnaires from a stated-preference binary choice experiment came from Louis Guttman in the 1940s. A subset of items having binary responses forms a Guttman scale if the items can be ranked in an order such that, for each individual i = 1, 2, . . . , n, the response pattern can be expressed by a single index on that ordered scale. Guttman's findings in the field of scale analysis formed fertile ground for further developments.

On a Guttman scale, items are arranged in an order such that an individual who agrees with a particular item will also agree with items of lower rank order. A hypothetical, perfect Guttman scale consists of a unidimensional set of items that are ranked in order of difficulty from the least extreme to the most extreme position (Table 2.1). For example, a person scoring 3 on a four-item Guttman scale will agree with items j = 1, 2, 3 and disagree with item j = 4. The data set in Table 2.1 was constructed to form a perfectly unidimensional Guttman scale, i.e., an individual who gives the positive response to a more “difficult” item (i.e., an item with lower marginal frequency) will also respond positively to all items that are less “difficult”. Thus, the Guttman scale is a “cumulative” scale [38]. The latter implies that all subjects “accumulate” positive responses to the items in the same order, from the “easiest” (statement “S1” in Table 2.1) to the most “difficult”.

                    Statements
Response pattern    S4   S3   S2   S1    Score
1                   0    0    0    0     0
2                   0    0    0    1     1
3                   0    0    1    1     2
4                   0    1    1    1     3
5                   1    1    1    1     4

Table 2.1: A data set that conforms to Guttman's structure

An important property of Guttman's model is that a person's entire set of responses to all items can be predicted from their cumulative score, because the model is deterministic. The item response function of an item that belongs to a Guttman scale can be seen in Figure 2.1.
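The deterministic character of the model can be illustrated with a minimal Python sketch. It assumes that items are listed from the easiest to the most difficult (S1 first, the reverse of the column order in Table 2.1); the helper names `guttman_pattern` and `is_guttman_pattern` are hypothetical and introduced only for this illustration.

```python
def guttman_pattern(score, n_items):
    """Response pattern implied by a total score on a perfect Guttman scale.

    Items are assumed to be ordered from easiest to most difficult, so a
    person with a given score endorses exactly the `score` easiest items.
    """
    if not 0 <= score <= n_items:
        raise ValueError("score must lie between 0 and n_items")
    return [1] * score + [0] * (n_items - score)


def is_guttman_pattern(pattern):
    """True if no positive response follows a negative one (easiest first)."""
    return all(a >= b for a, b in zip(pattern, pattern[1:]))


# The five admissible patterns for four items, one per possible score.
patterns = [guttman_pattern(s, 4) for s in range(5)]
```

A pattern such as `[1, 0, 1, 0]` is flagged as a violation by `is_guttman_pattern`, since it cannot be recovered from its total score alone.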

The Guttman scale is most applicable when a researcher wishes to design short questionnaires with strong discriminating properties, and for constructs that are hierarchical and highly structured, such as social distance, organizational hierarchies, and evolutionary stages. Since few data sets fit Guttman's model well, the model was brought within a probabilistic structure in item response theory models. The Rasch model requires a probabilistic Guttman structure when items have dichotomous responses. According to D. Andrich [1] (1985), in the Rasch model the Guttman response pattern is the most probable response pattern for a person when items are ordered from least difficult to most difficult.

Figure 2.1: Item response function for a Guttman item (step function)

2.2 Rasch model

In 1960 the Danish statistician Georg Rasch developed a model for exploring the properties of an intelligence test. This model is known as the one-parameter logistic model (1PL) or Rasch model [24], and, like Guttman's scaling model, it formulates a cumulative response process.

In this model, if Xij denotes the response of person i to item j, then the probability of a positive response given the individual's ability and the item's difficulty parameters is defined as

$$P(X_{ij} = 1 \mid \theta_i, \beta_j) = \frac{e^{\alpha(\theta_i - \beta_j)}}{1 + e^{\alpha(\theta_i - \beta_j)}} \tag{2.1}$$

where θi is the ability parameter for individual i, βj is the difficulty parameter for item j and α is a constant for the scale. As α → +∞, the model reduces to the Guttman scale: the probability of a positive response goes to 1 if θi > βj and to 0 if θi < βj. In this case the IRF of an item j looks like a step function, as in Figure 2.1.
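As a small numerical illustration of equation (2.1), the following Python sketch evaluates the IRF for a given θ, β and α. The function name `rasch_irf` is a hypothetical helper, and the numerically stable form of the logistic function is an implementation choice, not something taken from the thesis.

```python
import math


def rasch_irf(theta, beta, alpha=1.0):
    """P(X = 1 | theta, beta) as in equation (2.1)."""
    z = alpha * (theta - beta)
    # Evaluate the logistic function without overflowing exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# With a moderate alpha the curve is S-shaped ...
p_moderate = rasch_irf(theta=0.5, beta=0.0, alpha=1.0)
# ... while a very large alpha approximates the Guttman step function.
p_above = rasch_irf(theta=0.5, beta=0.0, alpha=1e4)
p_below = rasch_irf(theta=-0.5, beta=0.0, alpha=1e4)
```

When θ equals β the function returns 0.5 for any α, which is why the step in the limiting Guttman case is located at the item's difficulty.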

In other cases, where α takes more moderate values (e.g., α = 1), the IRF of a single item is a logistic (S-shaped) function as in Figure 2.2.

Figure 2.2: Item response function for an item that conforms to Rasch model

An important property of this model is that subject and item parameters can be distinguished and estimated separately. This is known as the separability property, and Rasch defined a separability theorem which can also be generalized to polytomous items as follows (see J. Rost [27] (2001)):

Theorem 2.1 Separability Theorem

Let Xij = xij denote the response of subject i on the item j. Then, the probability of response xij, xij = 0, 1, . . . , m is:

$$P(X_{ij} = x_{ij}) = \frac{\exp[\phi_h \theta_i + \psi_h \beta_j + \chi_h \theta_i \beta_j + \omega_h]}{\sum_{l=0}^{m} \exp[\phi_l \theta_i + \psi_l \beta_j + \chi_l \theta_i \beta_j + \omega_l]} \tag{2.2}$$

with h = xij the observed response category, for i = 1, 2, . . . , n and j = 1, 2, . . . , K, where (φ1, . . . , φm) and (ψ1, . . . , ψm) are vectors of constants that can be interpreted as scoring functions for the response categories, (χ1, . . . , χm) and (ω1, . . . , ωm) are vectors of constants, and both the person and the item parameters θi and βj are considered as unidimensional.

Now assume that each of the Xij follows the distribution of equation 2.2. Then,

1. the distribution of the item marginals (X.1, . . . , X.K) conditional on the person marginals (X1., . . . , Xn.) only depends on the item parameters while the person parameters have been eliminated

2. the distribution of the person marginals (X1., . . . , Xn.) conditional on the item marginals (X.1, . . . , X.K) only depends on the person parameters while the item parameters have been eliminated

3. the distribution of all the individuals' responses conditional on both the person and the item marginals does not depend on either of the two sets of parameters.

The separability property can be regarded as analogous to an analysis of variance in which two separate factors in a main-effects model have no interaction. Hence, the existence of intersecting item response functions in a model implies that the effect size of the individual factor with respect to the probability of a positive response differs among the items, and thus separability is contradicted.

Rasch tried to specify a broad family of models. The IRT models which are assumed to measure one or more quantitative latent variables on a metric level of measurement, and which additionally fulfill the properties of sufficiency, separability, specific objectivity and latent additivity, are said to belong to the family of Rasch models.

The sufficiency property means that the sum of responses in a specific response category forms a sufficient statistic for estimating the individual's ability parameter. An even stronger property than separability is the specific objectivity property. The latter has been characterized by Andrich [3] as “fundamental measurement”, and its role is important when individuals from different groups have to be compared. Specific objectivity implies that the measure of one group of individuals is independent of the measure of the items which take part in the measurement process. When specific objectivity holds for a set J = {1, 2, . . . , K} consisting of K items and for a set I = {1, 2, . . . , n} consisting of n individuals, then the model can be represented by addition or subtraction of the individual and the item parameters θi and βj respectively. This last property is referred to as latent additivity and, according to Fischer (1988), is strongly related to specific objectivity, which holds only for the Rasch family of models.

According to Van Schuur [38], “Even though Rasch model can be strongly recommended, its assumptions are strict, and the model is best applied when the number of items is rather high (e.g., greater than 20). However, in certain applications-such as omnibus-type surveys in which a large number of different topics are treated with a small number of indicators for each concept to be measured, the Rasch model often does not fit very well”.

To address these problems, the Dutch mathematician and political scientist Robert J. Mokken, in his 1971 work “A theory and procedure of scale analysis: With applications in political research”, developed his fundamental nonparametric item response theory model for cumulative scaling, known as the monotone homogeneity model.

2.3 Mokken model (MH)

Consider a data matrix X of dimension n × K obtained from a preference choice experiment in which n individuals, indexed by i = 1, 2, . . . , n, give their responses to K items, indexed by j = 1, 2, . . . , K. Each item j is associated with an item response variable Xj = (Xij), a column vector that contains the responses of the n individuals to item j, with xj = (xij) (e.g., 0 or 1) its realization.

Mokken scaling belongs to the item response theory models. In essence, Mokken's model is a nonparametric, probabilistic version of Guttman's deterministic model. The nonparametric model of monotone homogeneity (MH) is based on three fundamental assumptions: unidimensional measurement, local stochastic independence of responses to items, and non-decreasing IRFs.

Unidimensionality

For the dimensionality (D) of the data, one can assume either that the individuals can be characterized by one latent ability, so that the latent variable is unidimensional (θ ∈ R), or that the latent variable is multidimensional and more latent abilities θ = (θ1, . . . , θq) ∈ R^q are needed to describe it. In the MH model the latent variable is assumed to be unidimensional, that is

D = 1

This is the so-called unidimensionality (UD) property. The one-dimensional latent trait defines a latent scale, and each subject as well as each item can be represented by a scale value, θi and βj respectively, on this scale. A sufficient estimator of the individual's scale value θi is the total score, denoted as

$$X_+ = \sum_{j=1}^{K} X_j \tag{2.3}$$

The UD property is in principle implied by the assumption of local stochastic independence.

Local stochastic independence

For the relationship between the items Mokken assumed local stochastic independence. The latter implies that the responses of an individual i to the K items are independent given the value of the latent variable θ. In other words, and according to Sijtsma [30], the K-variate distribution of the item scores conditional on θ is the product of the K marginal conditional distributions of the separate item scores. Local stochastic independence (LI) for polytomous data is defined as:

$$P(X = x \mid \theta) = \prod_{j=1}^{K} P(X_j = x_j \mid \theta) \tag{2.4}$$

which has the familiar likelihood form and reduces to

$$P(X = x \mid \theta) = \prod_{j=1}^{K} P_j(\theta)^{x_j} \, [1 - P_j(\theta)]^{1 - x_j} \tag{2.5}$$

for the case of dichotomous responses and for P (Xj = 1 | θ) = Pj(θ).
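Equation (2.5) can be made concrete with a short Python sketch that multiplies the item-wise probabilities for one response pattern at a fixed value of θ. The function name `pattern_probability` and the probabilities in `p` are hypothetical values chosen only for illustration.

```python
import itertools


def pattern_probability(x, p):
    """P(X = x | theta) under local independence, equation (2.5).

    x: sequence of 0/1 responses, one per item.
    p: sequence with P_j(theta) for each item at one fixed theta.
    """
    if len(x) != len(p):
        raise ValueError("x and p must have the same length")
    prob = 1.0
    for xj, pj in zip(x, p):
        # Each item contributes P_j(theta) if endorsed, 1 - P_j(theta) if not.
        prob *= pj if xj == 1 else 1.0 - pj
    return prob


# Hypothetical success probabilities of three items at some theta.
p = [0.8, 0.5, 0.3]

# The probabilities of all 2^K response patterns add up to one.
total = sum(
    pattern_probability(x, p)
    for x in itertools.product([0, 1], repeat=len(p))
)
```

The final check over all 2^K patterns is a quick way to confirm that the product form defines a proper probability distribution.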

In principle, LI implies that the covariance between the scores on items j and k conditional on θ is 0, and thus the responses to different items are conditionally uncorrelated; that is,

$$\mathrm{Cov}(X_j, X_k \mid \theta) = 0 \quad \text{for all pairs } j, k \text{ with } j \neq k \tag{2.6}$$

Conversely, this property, also known as weak LI, does not imply LI (Junker [11], 1993), because Eq. 2.6 only deals with bivariate relationships, which may be 0 while multivariate relationships are not. In practice, LI also determines the dimensionality of the data, since one tries to find the dimension of θ such that the LI property is optimally satisfied.

Non-decreasing IRF’s

The IRF is monotonically non decreasing. Hence, the probability of a positive response to an item j conditional on θ does not decrease with increasing value of θ. That is,

$$P(X_j = 1 \mid \theta_i) \le P(X_j = 1 \mid \theta_c) \iff P_j(\theta_i) \le P_j(\theta_c) \quad \text{for all } \theta_i \le \theta_c \tag{2.7}$$

where θi, θc are the latent trait values for two subjects i, c ∈ I.

In addition to MH, Mokken defined a model called the double monotonicity (DM) model. The latter fulfills the same assumptions as the MH model plus the assumption that the IRFs do not intersect. The IRFs for a large number of items are likely to be more parallel compared to the IRFs for a smaller number of items. The assumption of non-intersecting item response functions implies that for each measurement value θ, the ordering of the success probabilities on the K items remains invariant. This implication is also referred to as invariant item ordering.

For data that conform to the latter assumptions, the MH model can be used to construct a scale. The principal tool for this process is Loevinger's [13] H coefficients, which have been adapted to Mokken's model as well. Let Cov(Xj, Xk) be the covariance between two distinct item variables Xj, Xk, and let Covmax(Xj, Xk) be the maximum covariance these two items can attain given their marginal distributions. For every pair of items j, k ∈ J, the coefficient is the ratio of the observed covariance to this maximum covariance. That is,

$$H_{jk} = \frac{\mathrm{Cov}(X_j, X_k)}{\mathrm{Cov}_{\max}(X_j, X_k)}, \quad j \neq k \tag{2.8}$$

The scalability coefficient can be calculated not only for pairs of items (Hjk) but also for the entire scale, denoted H, and for each item separately, denoted Hj. The H coefficient for the entire scale can be calculated with

$$H = \frac{\sum_{k=1}^{K-1} \sum_{j=k+1}^{K} \mathrm{Cov}(X_j, X_k)}{\sum_{k=1}^{K-1} \sum_{j=k+1}^{K} \mathrm{Cov}_{\max}(X_j, X_k)} \tag{2.9}$$

Mokken showed that for a scale consisting of K items with non-decreasing IRFs,

$$0 \le H \le 1 \tag{2.10}$$

where H = 0 if and only if “at least K − 1 IRFs are constant functions of θ (a trivial case of unsuccessful measurement). A non-negative value of H constitutes a necessary condition for the non decreasingness of IRFs: the assumption is rejected with a negative value of H” (see Sijtsma [32], p.150). The scalability coefficient for an item j ∈ J itself is:

$$H_j = \frac{\sum_{k=1, k \neq j}^{K} \mathrm{Cov}(X_j, X_k)}{\sum_{k=1, k \neq j}^{K} \mathrm{Cov}_{\max}(X_j, X_k)} \tag{2.11}$$

In principle, the scalability coefficient in Mokken's model is based on a ratio that compares the number of violations of Guttman response patterns observed in the data, called observed errors (Eobs), to the number of violations expected to occur under the assumption of stochastic independence.
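The covariance form of the coefficients in equations (2.8), (2.9) and (2.11) can be sketched in plain Python for dichotomous data. The sketch relies on one assumption that is not spelled out above: for two binary items with popularities pj and pk, the joint proportion of positive responses cannot exceed min(pj, pk), so the maximum covariance given the marginals is taken to be min(pj, pk) − pj·pk. The function name `scalability_coefficients` is hypothetical.

```python
def scalability_coefficients(X):
    """H coefficients for an n x K list of 0/1 response rows.

    Returns (H_pairs, H_items, H_scale) following equations (2.8),
    (2.11) and (2.9). Items answered identically by everybody (popularity
    0 or 1) give a zero denominator and raise ZeroDivisionError.
    """
    n, K = len(X), len(X[0])
    # Item popularities: proportion of positive responses per item.
    p = [sum(row[j] for row in X) / n for j in range(K)]

    def cov(j, k):
        joint = sum(row[j] * row[k] for row in X) / n
        return joint - p[j] * p[k]

    def cov_max(j, k):
        # Assumed maximum covariance given the two item marginals.
        return min(p[j], p[k]) - p[j] * p[k]

    pairs = [(j, k) for j in range(K) for k in range(j + 1, K)]
    H_pairs = {(j, k): cov(j, k) / cov_max(j, k) for j, k in pairs}
    H_items = [
        sum(cov(j, k) for k in range(K) if k != j)
        / sum(cov_max(j, k) for k in range(K) if k != j)
        for j in range(K)
    ]
    H_scale = (sum(cov(j, k) for j, k in pairs)
               / sum(cov_max(j, k) for j, k in pairs))
    return H_pairs, H_items, H_scale


# The five response patterns of Table 2.1, with S1 as the first column.
guttman_data = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
H_pairs, H_items, H_scale = scalability_coefficients(guttman_data)
```

For the error-free patterns of Table 2.1 every observed covariance equals its maximum, so all coefficients reach the upper bound of equation (2.10).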

In Mudfold, Van Schuur (see Van Schuur [36] [37], Van Schuur & Post [29], Post [22]) adapts Loevinger's H coefficients of homogeneity as a measure that indicates scalability. For Mudfold's H coefficients, the smallest subset of items whose scalability properties can be examined is a set consisting of three different items h, j, k ∈ J; the scalability coefficient H can therefore be calculated for triples of items, for individual items and for the scale as a whole.

Item response theory models differ in the way the response process is formulated. Guttman's deterministic scaling model, as well as its probabilistic parametric and nonparametric alternatives from Rasch and Mokken respectively, which were discussed above, formulate a cumulative response process. These models are also known as dominance models.

2.4 Unfolding models

A class of models that contrasts with dominance models is that of the unfolding models. These models also assume unidimensionality, but consider the probability of endorsing an item as a function of the distance between the item parameter βj and the respondent parameter θi on the unidimensional trait θ: the smaller the distance, the higher the probability. This is the reason why unfolding models are also known as proximity or distance models.

Unfolding models differ from cumulative models in the way they model the probability of positively endorsing an item. This difference can be seen in Figure 2.3. In the cumulative model the item response function is monotone (non-decreasing), while in the unfolding model the IRF is unimodal (single-peaked).

Figure 2.3: Cumulative model IRF vs Unfolding model IRF
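The contrast between the two shapes can be reproduced numerically with a small Python sketch. The cumulative curve is the logistic function of equation (2.1) with α = 1. The single-peaked curve is a purely hypothetical Gaussian-shaped function chosen for illustration; Mudfold itself is nonparametric and does not assume this or any other functional form.

```python
import math


def cumulative_irf(theta, beta):
    """Monotone IRF: logistic curve of equation (2.1) with alpha = 1."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def unfolding_irf(theta, beta, width=1.0):
    """Hypothetical single-peaked IRF with its maximum at theta = beta."""
    return math.exp(-((theta - beta) / width) ** 2)


# Evaluate both curves for an item located at beta = 0 on a grid of theta.
grid = [x / 2 for x in range(-8, 9)]          # -4.0, -3.5, ..., 4.0
cum = [cumulative_irf(t, 0.0) for t in grid]
unf = [unfolding_irf(t, 0.0) for t in grid]
peak = unf.index(max(unf))                    # grid position of the peak
```

The values in `cum` never decrease along the grid, whereas the values in `unf` rise until θ reaches the item location and fall afterwards.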

Items which indicate knowledge are assumed to follow the cumulative model, i.e., a positive response to a statement is expected to occur with increasing probability for subjects with higher knowledge (monotone IRF). Consider the graphical representation of a unidimensional latent variable with four items and three individuals scaled together as in Figure 2.4. In this example the third individual, with scale value θ3, is expected to dominate (i.e., give a positive response to) all four items, indexed by the letters A, B, C, D, with respective item parameters βA, βB, βC, βD. The most “difficult” of all the items to respond positively to is the fourth item, since βD > βC > βB > βA.

LOW    βA    θ1    βB    θ2    βC    βD    θ3    HIGH
(latent continuum →)

Figure 2.4: 4 items and 3 individuals scaled on the latent continuum of a unidimensional latent variable

On the other hand, items which indicate ideological positions follow the unfolding model. The probability of a positive response to such items does not increase monotonically for subjects with increasing scale values but, rather, follows a single-peaked function (unimodal IRF). The latter implies that an individual i with scale value θi will choose all the items j with scale values βj that are numerically close to θi. Translating this to the scale of Figure 2.4, the unfolding model implies that the first scaled individual, with scale value θ1, will choose the “closest” (in a distance sense) of all the items (i.e., items A & B) and will reject the items that are positioned far from his scale value (i.e., items C & D). The difference in the response process between the unfolding and the cumulative model is also illustrated in a hypothetical table (see Table 2.2) of the response patterns for the scale in Figure 2.4.

Cumulative responses             Proximity responses
Individual   A   B   C   D       Individual   A   B   C   D
1            1   0   0   0       1            1   1   0   0
2            1   1   0   0       2            0   1   1   0
3            1   1   1   1       3            0   0   1   1

Table 2.2: Collection of responses for models that formulate cumulative response process (left) and proximity response process (right) respectively.
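Both halves of Table 2.2 can be generated from one set of scale values, as the following Python sketch shows. The numerical positions and the threshold `DELTA` are hypothetical: they are chosen only so that they respect the ordering of Figure 2.4, and the deterministic threshold rule for the proximity process is an assumption made for illustration.

```python
# Hypothetical scale values respecting the order of Figure 2.4:
# beta_A < theta_1 < beta_B < theta_2 < beta_C < beta_D < theta_3.
items = {"A": 1.0, "B": 3.0, "C": 5.0, "D": 6.0}
persons = {1: 2.0, 2: 4.0, 3: 6.5}
DELTA = 1.5  # assumed maximum distance at which an item is still chosen


def cumulative_response(theta, beta):
    """Dominance process: respond positively when theta reaches beta."""
    return int(theta >= beta)


def proximity_response(theta, beta, delta=DELTA):
    """Proximity process: respond positively when the item is close."""
    return int(abs(theta - beta) <= delta)


cumulative = {
    i: [cumulative_response(theta, beta) for beta in items.values()]
    for i, theta in persons.items()
}
proximity = {
    i: [proximity_response(theta, beta) for beta in items.values()]
    for i, theta in persons.items()
}
```

With these values `cumulative` reproduces the left-hand patterns and `proximity` the right-hand patterns of Table 2.2, which shows that the two tables differ only in the response rule and not in the underlying positions.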

Unfolding models can be either parametric or nonparametric. Mudfold belongs to the class of nonparametric unfolding models. The choice between a parametric and a nonparametric model has to do with the level of measurement implied by the data. If an ordinal level of measurement is sufficient (ordinal scale), then a nonparametric IRT model is a better choice than a parametric one. In contrast, if the desired level of measurement is interval (interval scale), a parametric model analyses the data more adequately. Interval-scaled latent variables produce numerical scale scores for subjects and items according to the logistic function which is attached to the model. Ordinal-scaled latent variables produce only rank orders of scale values for individuals and items, following the nonparametric model specification (i.e., the nonparametric relationship between the manifest variables and the latent variable).

One of the basic models in unfolding theory is Coombs' deterministic model for unidimensional unfolding. Clyde Hamilton Coombs developed this deterministic model in his 1964 work “A theory of data”; it is for Mudfold what Guttman's deterministic scale is for Mokken's (1971) monotone homogeneity model.

2.5 Coombs unfolding model

Coombs' unidimensional unfolding theory is a theory of preference which deals with the question of whether there exists a common latent dimension underlying the preferences of individuals [22]. Consider a set J consisting of K items, i.e., J = {1, 2, . . . , K}, where each item is indexed by the letter j, and a set I = {1, 2, . . . , n} that contains the n individuals taking part in a preference choice experiment. Unidimensional unfolding theory states that for any individual i ∈ I and any pair of items j, k ∈ J, the individual will prefer item j to item k iff

\[ |\theta_i - \beta_j| < |\theta_i - \beta_k| \tag{2.12} \]

Equation (2.12) expresses the defining property of unfolding models: an individual i will choose an item j that is close to his scale value on the latent continuum and will reject an item that is located far from it. The value θi is referred to as the subject's ideal point or point of maximum preference. The quantity |θi − βj| is defined as the attitudinal distance of individual i to an item j.

Any model that fulfills the following three fundamental conditions will be called Coombs model (see Michell [17], [18]). These conditions are presented below:

1. The first assumption that a Coombs model has to fulfill is unidimensionality. It follows from unidimensionality that the underlying latent variable is one dimensional and thus produces a unidimensional latent continuum on which subjects and items can be scaled.

2. The second assumption that a Coombs model has to satisfy is that the individuals scaled on the latent continuum can have only one ideal point. This requirement means that an individual cannot prefer two items positioned at the two extremes of the scale without at the same time endorsing all the items positioned in between.

3. Items must be ordered. This means that the subjects must agree about the order of the items on the scale, i.e., all the subjects agree that the item parameters increase as one moves from the left to the right-hand side of the scale.

To illustrate this model we consider the following example. Assume a researcher wants to explore the study workload of high school students. The students respond to the question "how many hours do you study on average per day?". There are eight items, namely from zero to seven hours per day, and the participating students can choose as many items as they like. All subjects will agree about the order of the items on the latent dimension, which is 0, 1, . . . , 7 and which can be interpreted as a scale from "no studying" to "studying a lot" during the day, and each subject will have his own ideal number of study hours. A student who is not interested in studying at all, having zero hours as his ideal point, will choose the item zero and maybe one hour of studying per day. On the other hand, a person whose ideal is two and a half hours per day will choose two, three and maybe four hours. A student who is highly motivated to study will choose six or seven hours. If the data fulfill the Coombs assumptions for the pick any/n case, then we end up with a data matrix in which the entries of '1' form an irregular parallelogram shape, possibly bulging on both edges (see Table 2.3).

Number of hours:   0  1  2  3  4  5  6  7
1                  1  1  0  0  0  0  0  0
2                  1  1  1  1  0  0  0  0
3                  0  1  1  1  0  0  0  0
4                  0  0  1  1  1  0  0  0
5                  0  0  0  1  0  0  0  0
6                  0  0  0  1  1  1  0  0
7                  0  0  0  0  0  1  1  1
8                  0  0  0  0  0  0  1  1

Table 2.3: Parallelogram shape for pick any/8 data

The second assumption of Coombs' deterministic model is violated by the response pattern {1,0,1}. Assumption 2 implies that for three adjacent items h, j, k with h < j < k on the latent dimension or, more conveniently, βh < βj < βk, if the first item h and the third item k are chosen by an individual i, then item j has to be chosen too. In the example (see Table 2.3), if a student chooses items 2 and 4 as daily study hours, then he also has to choose item 3. If not, the second assumption, which states that a subject chooses those items which are close to his ideal point on the latent continuum, is violated.
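The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `satisfies_single_ideal_point` is hypothetical, and the check treats any gap between chosen items (a 1, then one or more 0s, then a 1) as a violation, which generalizes the {1,0,1} pattern to longer scales.

```python
# Rows of Table 2.3: each list holds one student's 0/1 choices for 0..7 hours.
TABLE_2_3 = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
]


def satisfies_single_ideal_point(pattern):
    """Return True if the chosen items form one unbroken run on the scale."""
    chosen = [pos for pos, x in enumerate(pattern) if x == 1]
    if not chosen:
        return True  # choosing nothing cannot produce a gap
    # An unbroken run spans exactly as many positions as there are choices.
    return chosen[-1] - chosen[0] + 1 == len(chosen)


all_rows_ok = all(satisfies_single_ideal_point(row) for row in TABLE_2_3)
# A student choosing 2 and 4 hours but not 3 hours violates assumption 2.
violating_row_ok = satisfies_single_ideal_point([0, 0, 1, 0, 1, 0, 0, 0])
```

Every row of Table 2.3 passes the check, while the pattern with items 2 and 4 chosen and item 3 rejected does not.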


Chapter 3

Mudfold

Mudfold was first developed by W. H. Van Schuur [36] in 1984, and further developments were added by W. J. Post in 1990 [29] and 1991 [22]. Specifically, Mudfold rests upon ideas behind two extreme models. The first is the Coombs deterministic unfolding model (see Section 2.5) and the second is Mokken's nonparametric model of monotone homogeneity (see Section 2.3). One can consider Mudfold as a probabilistic model which lies between these two extremes.

Assumptions from both of these models are tested to fit proximity data (e.g., preferences, attitudes). Mudfold is a nonparametric alternative to parametric unfolding models such as the PARELLA model introduced by Hoijtink [10] or Andrich's model [2] for unfolding one dimensional latent traits, which according to Andrich are "constructed from direct responses (e.g., agreement or disagreement) which characterized by single peaked functions involving the locations of each person and each stimulus".


CHAPTER 3. FORMULATION OF MUDFOLD

3.1 Assumptions

Consider a psychological test in which n subjects are free to endorse as many items as they want (pick any/K case) from a set J = {1, 2, . . . , K} consisting of K items. Let Xij be the random variable associated with the dichotomous (0,1) response of subject i to item j. We will denote the collection of responses to a specific item j as Xj.

The assumptions that Mudfold makes, and that a data set has to satisfy, are the following:

Unidimensionality (UD):

For the dimensionality (D) of the data, Mudfold assumes that the responses of the individuals to the various items depend only on a one dimensional latent variable denoted as θ ∈ R. In other words, unidimensionality assumes that all the items measure the same one dimensional latent variable θ.

Unique Ideal point (UIP):

Individuals indexed by the letter i scaled on the latent continuum can have only one ideal point on this continuum denoted as θi and they will endorse those items and only those which are close to their ideal point in a distance sense.

Ordered items (OI):

Subjects, or examinees, must agree about the order of the items on the latent scale. This means that for all individuals, as one moves from the left to the right-hand side of the scale, the item location parameter βj ∈ θ increases.

Local independence (LI):

The responses of an individual i to the K items are independent given the latent variable θ. The local independence condition can be written in likelihood form as:

\[ P(X = x \mid \theta) = \prod_{j=1}^{K} P(X_j = x_j \mid \theta) \]

which reduces to

\[ P(X = x \mid \theta) = \prod_{j=1}^{K} P_j(\theta)^{x_j} \, [1 - P_j(\theta)]^{1 - x_j} \]

for the case of dichotomous data that we study in this thesis.

The last assumption implies that the one dimensional latent trait θ alone explains the responses of the individuals to the various items. LI means that the covariance between the scores on items j and k conditional on θ is 0, and thus the responses to the different items are conditionally uncorrelated; that is, given θ, the response to one item is not influenced by the response to any other item in the test.
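As a small numerical illustration of the local independence product for dichotomous items, the following Python sketch evaluates P(X = x | θ) at one fixed value of θ. The function name `pattern_probability` and the IRF values used are hypothetical; Mudfold itself does not prescribe any particular values of Pj(θ).

```python
from math import prod


def pattern_probability(irf_values, pattern):
    """P(X = x | theta) under local independence for dichotomous items.

    irf_values[j] is P_j(theta) at one fixed theta (assumed known here);
    pattern[j] is the response x_j in {0, 1}.
    """
    return prod(p if x == 1 else 1.0 - p for p, x in zip(irf_values, pattern))


# Hypothetical IRF values of three items at some fixed theta.
irf_at_theta = [0.2, 0.9, 0.5]
prob_011 = pattern_probability(irf_at_theta, [0, 1, 1])  # 0.8 * 0.9 * 0.5
```

Summing `pattern_probability` over all 2^K response patterns gives 1, as it must for a probability distribution.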

In this thesis it is assumed that the individuals are drawn at random from a population. By integrating with respect to the cumulative distribution function (cdf) of the latent trait, denoted G(θ), with probability density function (pdf) g(θ), one obtains the unconditional multivariate distribution

\[ P(X = x) = \int_{\theta} \prod_{j=1}^{K} P_j(\theta)^{x_j} \, [1 - P_j(\theta)]^{1 - x_j} \, dG(\theta). \]

The probability that a randomly selected individual i will respond positively to item j, also known as the marginal popularity (proportion or mean) of item j, is denoted as

\[ p(j) = \int_{\theta} P_j(\theta) \, g(\theta) \, d\theta . \]

Suppes & Zanotti [34] showed that unless further restrictions are imposed on the conditional probabilities Pj(θ) = P (Xj = 1|θ), on the latent trait distribution G(θ), or on both, the multivariate distribution of the K items is not restricted.

Unimodality (UM):


The probability of a positive response to an item j, considered as a function of the latent trait, is known as the item response function (IRF) and denoted as Pj(θ) = P (Xj = 1 | θ). It is assumed to be a unimodal function of the latent variable θ.

Definition 3.1 Unimodal function

A function f : X → R, where X ⊆ R, is unimodal if there exists a unique x ∈ X at which the function f assumes its maximum, and for which f (y) is non-decreasing for y < x and non-increasing for y > x.

The UM assumption on the IRF is a distinctive feature of Mudfold compared to other NIRT models. Mudfold assumes that the item response functions, which are functions of the latent trait, are unimodal and follow a single-peaked shape, while other NIRT models assume monotone (non-decreasing) IRFs with a sigmoid shape (see Figure 2.3).

3.2 Unfolding scale analysis

A dichotomous item j in unidimensional unfolding analysis can be considered as a closed interval on the latent continuum. The positive response to item j occurs within this interval, and the boundaries of the interval are called "item steps". As Figure 3.1 shows, one can distinguish the left-hand item step δj01 and the right-hand item step δj10.

response:     0   |   1   |   0
item step:      δj01    δj10         latent continuum →

Figure 3.1: graphical representation of a dichotomous item and its two item steps

For a set of items J, if the left-hand item steps are ordered in the same way as the right-hand item steps, an explicit order of the items is obtained. For example, if for a scale consisting of three items h, j and k both

\[ \delta_{h01} < \delta_{j01} < \delta_{k01} \tag{3.1} \]

and

\[ \delta_{h10} < \delta_{j10} < \delta_{k10} \tag{3.2} \]

hold, then the three items are ordered in the order (h, j, k) with item parameters βh < βj < βk. This condition implies that the {1,0,1} response pattern in a given ordered triple of items is a violation of Mudfold, since according to the UIP assumption a subject i has to choose all the items that are positioned near his ideal position θi.

Let us illustrate this with an example. Assume that three items h, j, k ∈ J form a unidimensional unfolding scale in the order (h, j, k). Then a graphical representation of the three items and their item steps as in Figure 3.2 is obtained.

response pattern:   000 | 100 | 110 | 111 | 011 | 001 | 000
item step:            δh01  δj01  δk01  δh10  δj10  δk10

Figure 3.2: 3 items scaled on the latent continuum of a unidimensional latent variable

A subject i moves from the left to the right-hand side of the scale until he meets his ideal position θi. Passing or not passing an item step δ.01 or δ.10 determines whether the item (.) is chosen. Now assume that the subject's response pattern on the triple (h, j, k) is {1,0,1}. A positive response to item h means that the subject is positioned between the item steps δh01 and δh10, in the gray shaded area of Figure 3.3.

In addition, the response j = 0 implies that the individual is positioned either to the left of item step δj01 or to the right of item step δj10.

[Figure 3.3: 3 items scaled on the latent continuum of a unidimensional latent variable. Same axis as Figure 3.2: response patterns 000, 100, 110, 111, 011, 001, 000 separated by the item steps δh01, δj01, δk01, δh10, δj10, δk10.]

[Figure 3.4: 3 items scaled on the latent continuum of a unidimensional latent variable. Same axis as Figure 3.2: response patterns 000, 100, 110, 111, 011, 001, 000 separated by the item steps δh01, δj01, δk01, δh10, δj10, δk10.]

The latter possibility is ruled out, since the individual also has to lie within the grey area (see Figure 3.4). Thus, the item step δj01 has not been passed. A response k = 1 on the last item of the scale implies that the individual is placed between the item steps δk01 and δk10 (see the yellow shaded area in Figure 3.4). But this cannot be true, since item step δk01 would have to be passed while δj01, which precedes it, has not been passed yet; therefore, a violation occurs. We will call these violations observed errors.
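The argument above can be replayed numerically with a short Python sketch. The numeric item steps below are hypothetical and chosen only to respect the ordering shown in Figure 3.2; the names `ITEM_STEPS` and `response_pattern` are likewise illustrative. Moving an ideal point across the continuum produces exactly the seven patterns of Figure 3.2, and {1,0,1} is never among them.

```python
# Hypothetical (left step, right step) per item, respecting the order
# d_h01 < d_j01 < d_k01 < d_h10 < d_j10 < d_k10 of Figure 3.2.
ITEM_STEPS = {"h": (1.0, 4.0), "j": (2.0, 5.0), "k": (3.0, 6.0)}


def response_pattern(theta, steps=ITEM_STEPS):
    """Deterministic response: 1 iff theta lies between an item's two steps."""
    return tuple(int(left <= theta <= right) for left, right in steps.values())


# One ideal point inside each of the seven regions of the continuum.
thetas = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
patterns = [response_pattern(theta) for theta in thetas]
error_pattern_possible = (1, 0, 1) in patterns
```

Under these assumed item steps no position of θ yields the pattern (1, 0, 1), which is why its occurrence in data is counted as an error.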

3.3 Triples

Mudfold statistically tests how well the items fit the unfolding model. In the process of building an unfolding scale, Mudfold first carries out a search procedure (first step) to find the elementary scale which best fits the unfolding scale conditions. The best elementary scale is a triple of items in a unique order on the latent continuum.

Van Schuur originally distinguishes between "unordered" triples, which are sets of three distinct items (combinations) from the starting "pool" of items where the order does not matter, and "ordered" triples. Ordered triples are the possible permutations (orderings) of three distinct items (no repetition) from a starting set of K items. All the permutations of three of the K items are candidate orderings for the first step of the procedure. From this large collection of permutations, the one that best fits the Mudfold properties will be chosen.

Consider a set of K item indices J = {1, 2, . . . , K}. According to the combinations formula (equation 3.3),

\[ \binom{K}{l} = \frac{K(K-1)\cdots(K-l+1)}{l(l-1)\cdots 1} = \frac{K!}{l!\,(K-l)!} \tag{3.3} \]

the number of all possible 3-combinations of the set J is w = K!/(3!(K−3)!). Each of these combinations is a subset of J consisting of three distinct elements of J.

Each of these K!/(3!(K−3)!) sets can be ordered in 3! = 6 different permutations. For example, if three distinct items h, j, k ∈ J form the unordered triple {h, j, k}, the corresponding permutations are (h, j, k), (k, j, h), (h, k, j), (j, k, h), (j, h, k), (k, h, j). Each ordering implies a different position of the items on the latent scale. For example, the ordering (j, k, h) implies that item j is the first item an individual meets as (s)he moves from the left to the right of the scale, and is therefore positioned closest to the beginning of the scale, followed by the items k and h respectively.

Hence, the total number of ordered triples that could be examined to determine the best elementary scale is 3! · K!/(3!(K−3)!) = K!/(K−3)!. In Mudfold, only K!/(2(K−3)!) of these permutations are taken into account, namely the permutations (h, j, k), (j, h, k) and (h, k, j) of each subset. This is due to symmetries that occur in the number of violations ({1,0,1} response patterns) in the data, which are discussed later in the text. For a hypothetical item index set of K = 5 items, namely J = {A, B, C, D, E}, the triples that have to be examined can be seen in Table 3.1.

Ordered triples defined for K = 5 items

subset       order (h,j,k)   order (h,k,j)   order (j,h,k)
{A, B, C}    (A, B, C)       (A, C, B)       (B, A, C)
{A, B, D}    (A, B, D)       (A, D, B)       (B, A, D)
{A, B, E}    (A, B, E)       (A, E, B)       (B, A, E)
{A, C, D}    (A, C, D)       (A, D, C)       (C, A, D)
{A, C, E}    (A, C, E)       (A, E, C)       (C, A, E)
{A, D, E}    (A, D, E)       (A, E, D)       (D, A, E)
{B, C, D}    (B, C, D)       (B, D, C)       (C, B, D)
{B, C, E}    (B, C, E)       (B, E, C)       (C, B, E)
{B, D, E}    (B, D, E)       (B, E, D)       (D, B, E)
{C, D, E}    (C, D, E)       (C, E, D)       (D, C, E)

Table 3.1: All the ordered triples that have to be examined for K = 5 items
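The enumeration in Table 3.1 can be reproduced with the Python standard library. The sketch below is illustrative only; the function name `candidate_triples` is an assumption, and the three orderings kept per subset are the ones named in the text.

```python
from itertools import combinations
from math import factorial


def candidate_triples(items):
    """For every 3-subset {h, j, k}, taken in the order of `items`,
    return the three orderings (h, j, k), (h, k, j) and (j, h, k)."""
    triples = []
    for h, j, k in combinations(items, 3):
        triples.extend([(h, j, k), (h, k, j), (j, h, k)])
    return triples


items = ["A", "B", "C", "D", "E"]
triples = candidate_triples(items)
K = len(items)
# Count predicted by the text: K! / (2 (K - 3)!)
predicted_count = factorial(K) // (2 * factorial(K - 3))
```

For K = 5 this yields 30 ordered triples, three for each of the 10 subsets listed in Table 3.1.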

3.4 Errors and scalability coefficients for Mudfold

The observed errors for an ordered triple of items (h, j, k) are the frequency of {1, 0, 1} response patterns over all the individuals. That is, we count how many of the n individuals taking the K-item test have given a positive response to items h and k while responding negatively to item j. Let us illustrate this with a simple example.

Assume that 6 individuals responded to 5 items (see Table 3.2). If one wants to test the responses on the ordered triple of items (A, B, C), it can easily be seen that the {1, 0, 1} response pattern occurs twice in total, for individuals 4 and 5.

sample   A  B  C  D  E
1        0  0  1  0  1
2        1  1  0  1  0
3        0  1  0  1  1
4        1  0  1  0  0
5        1  0  1  1  0
6        0  0  1  0  1

Table 3.2: Data matrix X for a sample of n = 6 individuals responding to K = 5 items

Thus, in the ordered triple (A, B, C) the number of observed errors, denoted O(.), is

O(ABC) = 2

while in the permutation (A, C, B) the number of observed errors is O(ACB) = 1, and it is the second individual who violates Mudfold's assumption. Generalizing this toy example to any given triple of items gives the following definition.

Definition 3.2 Observed Errors in Triples

The observed errors in an ordered triple of distinct items (h, j, k), where h, j, k ∈ {1, 2, . . . , K}, is the frequency of the {1, 0, 1} responses among all the n individuals:

\[ O(hjk) = \sum_{i=1}^{n} x_{ih} \, (1 - x_{ij}) \, x_{ik} \tag{3.4} \]

where the i-th term equals 1 if and only if xih = 1, xij = 0 and xik = 1. Here xij is the realized response, with xij = 1 if the ith individual responds positively to the jth item and xij = 0 otherwise.

By using the formula 3.4 in definition 3.2 we can calculate the observed errors for any ordered triple that can be defined from the K items in the scale.
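As an illustration of formula 3.4, the following Python sketch counts the {1, 0, 1} patterns for items A, B and C of Table 3.2. The dictionary layout and the function name `observed_errors` are assumptions made for this example.

```python
# Columns A, B, C of Table 3.2 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
}


def observed_errors(data, h, j, k):
    """O(hjk): number of individuals with x_ih = 1, x_ij = 0 and x_ik = 1."""
    return sum(
        xh * (1 - xj) * xk
        for xh, xj, xk in zip(data[h], data[j], data[k])
    )


o_abc = observed_errors(DATA, "A", "B", "C")
o_acb = observed_errors(DATA, "A", "C", "B")
```

The two values agree with the worked example: two errors for the order (A, B, C) and one for (A, C, B).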

3.4.1 Expected Errors in Triples

The expected errors in an ordered triple of distinct items (h, j, k) is the frequency of the {1, 0, 1} response patterns that would be expected to occur if the three items h, j and k were statistically independent. Statistical independence means that the responses of the individuals to the various items do not depend on the latent variable that is assumed to dictate their responses, but only on the popularity of each item.

For calculating the expected errors in a triple we have first to define the marginal popularity of an item j.

Definition 3.3 Marginal Popularity

The marginal popularity of an item j ∈ {1, 2, . . . , K}, denoted p(j), is the relative frequency of positive responses to item j over all the individuals:

\[ p(j) = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \qquad 0 \le p(j) \le 1, \tag{3.5} \]

where the realized response xij = 1 if the ith individual responds positively to the jth item and xij = 0 otherwise.

Recalling Table 3.2 and following Definition 3.3, we can calculate the marginal popularity of item A, for example, as follows:

\[ p(A) = \frac{0 + 1 + 0 + 1 + 1 + 0}{6} = \frac{1}{2} = 0.5 \]

In simpler words marginal popularity is the proportion of subjects who give a positive response on the item j. The marginal popularity of each item for the small data set presented in Table 3.2 is given in Table 3.3

sample   item A   item B   item C   item D   item E
1        0        0        1        0        1
2        1        1        0        1        0
3        0        1        0        1        1
4        1        0        1        0        0
5        1        0        1        1        0
6        0        0        1        0        1
p(j)     0.5      0.33     0.67     0.5      0.5

Table 3.3: marginal popularities for each item of the data set in Table 3.2
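The last row of Table 3.3 can be recomputed with a few lines of Python. The data layout and the function name `marginal_popularity` are assumptions for this sketch; values are rounded to two decimals only to match the table.

```python
# Columns of Table 3.3 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
    "D": [0, 1, 1, 0, 1, 0],
    "E": [1, 0, 1, 0, 0, 1],
}


def marginal_popularity(data, j):
    """p(j): proportion of individuals responding positively to item j."""
    responses = data[j]
    return sum(responses) / len(responses)


popularities = {item: round(marginal_popularity(DATA, item), 2) for item in DATA}
```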

Given the marginal popularity of each item, one can calculate the number of expected errors in an ordered triple of items. The formula in Definition 3.4 can be used to obtain the expected errors for any triple of items defined from the K-item scale.

Definition 3.4 Expected Errors

The expected errors in an ordered triple of distinct items (h, j, k), where h, j, k ∈ {1, 2, . . . , K}, is the expected frequency of the {1, 0, 1} responses among all the n individuals:

\[ EO(hjk) = p(h) \, (1 - p(j)) \, p(k) \, n \tag{3.6} \]

where p(h), p(j) and p(k) are the marginal popularities of the items h, j and k respectively, and n is the sample size.

If for example we want to calculate the errors that are expected to occur in the ordered triple (A, B, C) following the formula in definition 3.4, the expected number of errors ({1, 0, 1} response patterns) if the three items were statistically independent is

\[ EO(ABC) = p(A) \, (1 - p(B)) \, p(C) \cdot 6 = 0.5 \cdot (1 - 0.33) \cdot 0.67 \cdot 6 = 1.3467. \]
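Formula 3.6 can be sketched in Python as follows. The function names are hypothetical, and exact fractions are used instead of rounded decimals. With the exact popularities p(B) = 1/3 and p(C) = 2/3 the result is 4/3 (about 1.333); the value 1.3467 above arises from using the popularities rounded to two decimals.

```python
from fractions import Fraction

# Columns A, B, C of Table 3.2 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
}


def popularity(data, j):
    """Exact marginal popularity p(j) as a fraction."""
    return Fraction(sum(data[j]), len(data[j]))


def expected_errors(data, h, j, k):
    """EO(hjk) = p(h) (1 - p(j)) p(k) n, the error count under independence."""
    n = len(data[h])
    return popularity(data, h) * (1 - popularity(data, j)) * popularity(data, k) * n


eo_abc = expected_errors(DATA, "A", "B", "C")
```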

The next step in the process is to define the scalability coefficients which are used as fitting criterion for finding the most appropriate items each time in the process of building an unfolding scale.

3.4.2 Scalability Coefficient for Triples

To find the best elementary scale in the first step of the algorithm, Mudfold uses as its scalability measure the comparison between the errors observed in the data and the errors expected under statistical independence. If the observed errors are fewer than the expected errors, then some confidence in the unfolding nature of the data is obtained [22].

Analogous to Loevinger's coefficient of homogeneity for a cumulative scale used by Mokken (1971), an H coefficient for triples of items based on the comparison between the observed and the expected number of errors can be defined. An extension of this definition for calculating the scalability coefficients for a candidate item j or for the scale as a whole will be discussed later in this text.

Definition 3.5 H coefficient for Triples

The H(hjk) coefficient of scalability for an ordered triple of distinct items (h, j, k), where h, j, k ∈ {1, 2, . . . , K}, is the real number obtained by subtracting from unity the ratio of observed errors to expected errors:

\[ H(hjk) = 1 - \frac{O(hjk)}{EO(hjk)} \tag{3.7} \]

where O(hjk) and EO(hjk) are the observed and expected errors for the ordered triple of items (h, j, k).

Perfect scalability for an ordered triple is defined as H(hjk) = 1. This means that no error is observed in this triple. When H(hjk) = 0, the number of observed errors is equal to the number of expected errors. Mokken (1971) proposes the following classification for the H(hjk) coefficients, which we also adopt in this text:

• if H(hjk) < 0.3, the scale has poor scalability properties,

• if 0.3 ≤ H(hjk) < 0.4, the scale is ”weak”,

• if 0.4 ≤ H(hjk) < 0.5, the scale is ”medium”,

• if 0.5 ≤ H(hjk), the scale is ”strong”.

Using the ordered triple (A, B, C) as in the examples of the observed and expected errors, we can calculate its H coefficient of scalability as follows:

\[ H(ABC) = 1 - \frac{O(ABC)}{EO(ABC)} = 1 - \frac{2}{1.3467} = 1 - 1.485 = -0.485. \]

A highly negative H(hjk) implies negative scalability between the three items, which means that the three items do not form an unfolding scale in this order. In the same way we can calculate the H(hjk) coefficient for any ordered triple of distinct items (h, j, k), where h, j, k ∈ {1, 2, . . . , K}.
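The H coefficient of Definition 3.5 can be computed with the following Python sketch. The function names are hypothetical, and returning `None` when EO(hjk) = 0 is a choice made for this example, since the text does not discuss that case. Exact fractions give H(ABC) = −1/2, whereas the −0.485 above is based on the rounded popularities.

```python
from fractions import Fraction

# Columns A, B, C of Table 3.2 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
}


def observed_errors(data, h, j, k):
    """O(hjk): frequency of the {1, 0, 1} pattern on (h, j, k)."""
    return sum(xh * (1 - xj) * xk for xh, xj, xk in zip(data[h], data[j], data[k]))


def expected_errors(data, h, j, k):
    """EO(hjk) = p(h) (1 - p(j)) p(k) n with exact popularities."""
    n = len(data[h])
    p = {item: Fraction(sum(data[item]), n) for item in (h, j, k)}
    return p[h] * (1 - p[j]) * p[k] * n


def h_coefficient(data, h, j, k):
    """H(hjk) = 1 - O(hjk) / EO(hjk); None when EO(hjk) is zero."""
    eo = expected_errors(data, h, j, k)
    if eo == 0:
        return None  # assumption of this sketch: H is left undefined
    return 1 - Fraction(observed_errors(data, h, j, k)) / eo


h_abc = h_coefficient(DATA, "A", "B", "C")
h_bac = h_coefficient(DATA, "B", "A", "C")
```

For this toy data set the order (B, A, C) has no observed errors and therefore H(BAC) = 1, while the orders (A, B, C) and (A, C, B) have negative coefficients.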


3.4.3 Symmetries in Errors

At this point it is useful to note that some symmetries occur in the number of observed errors, in the number of expected errors and, as a consequence, in the H(hjk) coefficients.

Consider the ordered triple (h, j, k). The number of errors in this triple, where βh < βj < βk with respect to the scale values, is the same as in the ordered triple (k, j, h), where the scale values follow the order βk < βj < βh. Since multiplication of real numbers is commutative, if we swap the items positioned at the two ends of the triple while leaving the middle item (the item assumed to get a negative response) in place, the number of observed errors remains invariant. For example, the ordered triple (A, B, C) has the same number of observed errors as the ordered triple (C, B, A), since

\[ \sum_{i=1}^{n} x_{iA} \, (1 - x_{iB}) \, x_{iC} = \sum_{i=1}^{n} x_{iC} \, (1 - x_{iB}) \, x_{iA} \]

(see Table 3.4).

sample   item A   item B   item C
1        0        0        1
2        1        1        0
3        0        1        0
4        1        0        1
5        1        0        1
6        0        0        1
p(j)     0.5      0.33     0.67

sample   item C   item B   item A
1        1        0        0
2        0        1        1
3        0        1        0
4        1        0        1
5        1        0        1
6        1        0        0
p(j)     0.67     0.33     0.5

Table 3.4: Data matrix of the permutations (h, j, k) and (k, j, h) respectively for the set of items {A, B, C}


In general, for the 3! = 6 available permutations of a set consisting of three items:

O(hjk) = O(kjh) (3.8)

O(hkj) = O(jkh) (3.9)

O(jhk) = O(khj) (3.10)

The same symmetries occur in the number of expected errors as well, i.e., for the set of items {A, B, C}, EO(ABC) = EO(CBA) since p(A) (1 − p(B)) p(C) = p(C) (1 − p(B)) p(A). For the same reason EO(ACB) = EO(BCA) and EO(BAC) = EO(CAB). Generally, for the 6 available permutations of a set Tw ⊂ J consisting of three items:

EO(hjk) = EO(kjh) (3.11)

EO(hkj) = EO(jkh) (3.12)

EO(jhk) = EO(khj) (3.13)

From equations (3.8) to (3.13) we trivially conclude that

\[ H(hjk) = H(kjh), \quad \text{since} \quad 1 - \frac{O(hjk)}{EO(hjk)} = 1 - \frac{O(kjh)}{EO(kjh)} \tag{3.14} \]

H(hkj) = H(jkh) (3.15)

H(jhk) = H(khj) (3.16)

Taking the symmetries into consideration, from now on for a set consisting of three items {h, j, k}, three H coefficients can be distinguished namely H(hjk), H(hkj) and H(jhk).
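The symmetries can be verified numerically for the toy data with the sketch below. The function names are hypothetical; the check runs over all six permutations of {A, B, C} and compares each ordering with its mirror image.

```python
from fractions import Fraction
from itertools import permutations

# Columns A, B, C of Table 3.4 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
}


def observed_errors(data, h, j, k):
    """O(hjk): frequency of the {1, 0, 1} pattern on (h, j, k)."""
    return sum(xh * (1 - xj) * xk for xh, xj, xk in zip(data[h], data[j], data[k]))


def expected_errors(data, h, j, k):
    """EO(hjk) = p(h) (1 - p(j)) p(k) n with exact popularities."""
    n = len(data[h])
    p = {item: Fraction(sum(data[item]), n) for item in (h, j, k)}
    return p[h] * (1 - p[j]) * p[k] * n


# Reversing the two outer items must leave both error counts unchanged.
symmetric = all(
    observed_errors(DATA, h, j, k) == observed_errors(DATA, k, j, h)
    and expected_errors(DATA, h, j, k) == expected_errors(DATA, k, j, h)
    for h, j, k in permutations(DATA, 3)
)
```

Because both O and EO are unchanged by the reversal, the H coefficients of mirrored orderings coincide as well, which is why only three orderings per subset need to be examined.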


3.5 Nonparametric scale construction

According to Sijtsma and van der Ark [33], nonparametric scale construction techniques have existed at least since Mokken's model of monotone homogeneity (1971). In all nonparametric approaches to scale construction, the basic concept is Loevinger's coefficient of scalability [13].

Mudfold model fit is based on the extended Loevinger's H coefficients for triples as well as on scalability coefficients for items and for the whole scale. In Mudfold, H coefficients of scalability compare the number of observed errors in the data to the number of errors that are expected to occur if the items were statistically independent.

The scalability coefficient for an item j which is part of a scale consisting of m items, where 4 ≤ m ≤ K, compares the number of observed and expected errors in all the ordered triples that are defined in the specific scale and contain the item j.

One would fit Mudfold to a data set in which proximity relations between the individuals and the items exist. A maximal subset of m of the K items, with m ≤ K, is then determined such that these items form an unfolding scale in a specific order.

The nonparametric procedure of building an unfolding scale with Mudfold can be divided into two steps. The first step is to find the best elementary scale, which consists of three items, and the next step is to extend this best elementary scale until no more items fit the Mudfold scalability criteria.

3.6 Best elementary scale

Finding the best elementary scale means deciding whether a triple of items in a specific order forms a Mudfold scale, that is, a scale that best fits the Mudfold criteria.

3.6.1 Unique Triples

Our focus in the process of finding the best elementary scale, or best triple of items, will be on the set of unique triples. The unique triples form a restricted pool of ordered triples from which the best elementary scale will be chosen. The set of unique triples contains those triples (from the total of K!/(K−3)!) for which a unique representation on the latent scale is possible.

Definition 3.6 Unique Triples

Unique triples is the finite set which contains all the ordered triples (h, j, k) of distinct items h, j, k ∈ {1, 2, . . . , K} = J for which the following requirements are fulfilled:

• H(hjk) > 0 while

• H(hkj) < 0 and H(jhk) < 0

This guarantees that the triples of items in the set of unique triples are "uniquely" represented on the latent continuum, i.e., are scalable together in only one of the three permutations.

Each ordered triple in the set of unique triples is associated with a positive H(hjk) coefficient. We can therefore order the set of unique triples by their H(hjk) coefficients from maximum to minimum. Since the set of unique triples is a finite set of ordered triples, an ordered triple with maximum H(hjk) exists. The ordering with the highest H(hjk) coefficient is called the best unique triple, and it will form the best elementary scale for the second step of the algorithm if its H(hjk) coefficient is greater than 0.3 or than a different, user-specified lower bound.


Definition 3.7 Best Unique Triple

An ordered triple (h, j, k)∈ Unique Triples is called best unique triple iff H(hjk)=max(Hunq).

In Mudfold, though, the lower bound H(best unique triple) > 0.3 is imposed. This lower bound was first used by Mokken (1971) for his stochastic cumulative scaling model, which is analogous in many respects to Van Schuur's (1984) Mudfold model [36]. It ensures that the scalability properties of the best unique triple are at least "weak".
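The search for the best unique triple can be sketched in Python as follows, using the five-item data of Table 3.3. All function names are hypothetical, the treatment of undefined H values (EO = 0) is an assumption of this sketch, and ties for the maximum are resolved simply by taking the first triple encountered, which the text does not specify.

```python
from fractions import Fraction
from itertools import combinations

# Columns of Table 3.3 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
    "D": [0, 1, 1, 0, 1, 0],
    "E": [1, 0, 1, 0, 0, 1],
}


def h_coefficient(data, h, j, k):
    """H(hjk) = 1 - O(hjk) / EO(hjk); None when EO(hjk) is zero."""
    n = len(data[h])
    observed = sum(xh * (1 - xj) * xk
                   for xh, xj, xk in zip(data[h], data[j], data[k]))
    p = {item: Fraction(sum(data[item]), n) for item in (h, j, k)}
    expected = p[h] * (1 - p[j]) * p[k] * n
    return None if expected == 0 else 1 - observed / expected


def unique_triples(data):
    """Map each unique triple to its H: positive in that order only."""
    result = {}
    for h, j, k in combinations(data, 3):
        orderings = [(h, j, k), (h, k, j), (j, h, k)]
        h_values = [h_coefficient(data, *order) for order in orderings]
        if any(value is None for value in h_values):
            continue  # H undefined for this subset; skipped in this sketch
        for idx, order in enumerate(orderings):
            others = [v for pos, v in enumerate(h_values) if pos != idx]
            if h_values[idx] > 0 and all(v < 0 for v in others):
                result[order] = h_values[idx]
    return result


def best_unique_triple(data, lower_bound=Fraction(3, 10)):
    """Unique triple with maximal H, or None if it does not exceed the bound."""
    candidates = unique_triples(data)
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] > lower_bound else None


unique = unique_triples(DATA)
best = best_unique_triple(DATA)
```

For this toy data the ordering (B, A, C) is a unique triple with H = 1, since the other two orderings of {A, B, C} have negative coefficients.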

3.7 Extending the Best Elementary Scale

The best three items in their unique permutation have been found in the first step. The next step in the process is to iteratively investigate the remaining K − 3 items to find the best fourth, fifth, sixth and, in general, for a scale of m items (3 ≤ m < K), the best (m + 1)-st item to add next. The maximum number of iterations is K − 3, the number of items remaining after the first step.

For the (m + 1)−st item there exist m + 1 possible scale positions. If for example a scale consists of m = 4 items (1234), for the best fifth item five places are available (51234, 15234, 12534, 12354, 12345).

In each iteration of the second step of the Mudfold scaling algorithm, the number of scales investigated to find the one which contains the best (m + 1)-st item is (K − m)(m + 1).
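The candidate scales for a new item, and their number per iteration, can be illustrated with a short Python sketch. The function names `candidate_scales` and `scales_per_iteration` are hypothetical.

```python
def candidate_scales(scale, new_item):
    """All m + 1 scales obtained by inserting new_item into an m-item scale."""
    return [
        scale[:pos] + (new_item,) + scale[pos:]
        for pos in range(len(scale) + 1)
    ]


def scales_per_iteration(K, m):
    """Number of candidate scales when m of the K items are already scaled."""
    return (K - m) * (m + 1)


# Inserting item 5 into the scale (1, 2, 3, 4) gives the five scales above.
positions_for_fifth = candidate_scales((1, 2, 3, 4), 5)
# With K = 8 items and the elementary scale found (m = 3): 5 * 4 = 20 scales.
first_iteration_count = scales_per_iteration(8, 3)
```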

A hypothetical procedure of extending the best elementary scale can be described as follows:

1st iteration: the best elementary scale has been found. Hence, m = 3 and we are looking for the best fourth item to add. This means that K − 3 items will be investigated. Each item defines four scales, as many as the available positions. Thus, the total number of scales investigated is (K − 3)(3 + 1) = 4(K − 3).

2nd iteration: The best fourth item has been added to the scale. Thus m = 4 and the remaining items which have to be investigated are K − 4 in total. The total number of scales from which the best one will be determined is 5(K − 4).

...

If all the items fit the Mudfold properties, then in the

(K − 3)-th iteration: The best (K − 1)-st item has been added in the previous step. Thus, m = K − 1 and we are looking for the best K-th item to add. The number of scales under investigation will then be (K − (K − 1))((K − 1) + 1) = K.

To determine the best scale in each iteration, Mudfold uses certain criteria which have to be fulfilled by a candidate scale. These are:

1st criterion: The first criterion that a candidate scale from the total of (K − m)(m + 1) has to satisfy to be admissible for the next steps of the algorithm is:

All the m!/(2!(m−2)!) triples defined from this specific scale (with respect to the order of the items) which contain the new candidate item must have a positive H(hjk) coefficient.

The fulfillment of this first criterion, gives us a level of confidence that the candidate item fits sufficiently with the existing m items in the scale.
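The first criterion can be sketched in Python as follows, again with the data of Table 3.3. The function names are hypothetical, and treating an undefined H (EO = 0) as a failure is an assumption of this sketch. Exact fractions are used so that a coefficient of exactly 0 is not mistaken for a small positive number.

```python
from fractions import Fraction
from itertools import combinations

# Columns of Table 3.3 (positions correspond to individuals 1..6).
DATA = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 1, 0, 0, 0],
    "C": [1, 0, 0, 1, 1, 1],
    "D": [0, 1, 1, 0, 1, 0],
    "E": [1, 0, 1, 0, 0, 1],
}


def h_coefficient(data, h, j, k):
    """H(hjk) = 1 - O(hjk) / EO(hjk); None when EO(hjk) is zero."""
    n = len(data[h])
    observed = sum(xh * (1 - xj) * xk
                   for xh, xj, xk in zip(data[h], data[j], data[k]))
    p = {item: Fraction(sum(data[item]), n) for item in (h, j, k)}
    expected = p[h] * (1 - p[j]) * p[k] * n
    return None if expected == 0 else 1 - observed / expected


def triples_with_item(scale, item):
    """Ordered triples of `scale` (scale order preserved) containing `item`."""
    return [triple for triple in combinations(scale, 3) if item in triple]


def passes_first_criterion(data, scale, new_item):
    """True if every triple containing new_item has H(hjk) > 0."""
    for triple in triples_with_item(scale, new_item):
        value = h_coefficient(data, *triple)
        if value is None or value <= 0:
            return False
    return True


# Candidate item D inserted at two different positions of the scale (B, A, C).
ok_second_position = passes_first_criterion(DATA, ("B", "D", "A", "C"), "D")
ok_first_position = passes_first_criterion(DATA, ("D", "B", "A", "C"), "D")
```

In this toy example the scale (B, D, A, C) passes the criterion, whereas (D, B, A, C) fails because H(DBA) = 0 is not positive.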

2nd criterion: The candidate scales which fulfill the 1st criterion are admissible for the next level of the process. The second criterion states that the scale which will be chosen has to contain the candidate item which is least represented among the candidate items.

The latter means that a certain amount of scales ”passed” the first criterion. Each of these scales contains a candidate (new) item. We calculate how frequently each candidate item is
