
University of Amsterdam & University of Canterbury

Reality or fiction? The interpretation of common factors from different epistemological perspectives

Author: L.D. Wijsen, B.Sc., B.A.

Supervisors: Prof. Dr. B.D. Haig & Prof. Dr. D. Borsboom

August 17, 2015

Number of words: 17,272; abstract: 156 words. Student number: 6075398


Contents

1 Introduction
2 A brief history of the common factor model
3 The common factor model
3.1 Characteristics of the common factor model
3.2 Principal components analysis
4 Epistemological doctrines
4.1 Scientific realism
4.2 Anti-realist epistemologies
5 Epistemological doctrines and the common factor model
5.1 Scientific realism and the common factor model
5.2 Empiricism and the common factor model
5.3 Fictionalism and the common factor model
5.4 Factors versus components
6 The common factor model and theory construction
6.1 Abduction and factor analysis
6.2 Alternative uses of the common factor model
7 Conclusion and Discussion


Abstract

This thesis examines the interpretation and use of the common factor model in light of different epistemological frameworks. Within a realist framework, factors can be interpreted as real-world entities, whereas an empiricist or fictionalist framework prescribes the interpretation of factors as statistical artifacts. Several arguments for and against a realist interpretation of factors will be discussed, and I will conclude that there is no reason to interpret factors only as statistical artifacts. When research has been conducted well, it is justified to interpret factors as real-world entities. Nevertheless, a fictionalist account of science is not useless and is applicable in situations that call for more cautious interpretations. Furthermore, the role that factor analysis plays within theory construction will be discussed. Within a realist framework, factor analysis can contribute to theory construction through the employment of abductive reasoning. Not only can factors be interpreted as real-world entities; factor analysis is also able to generate and evaluate plausible theories.

1 Introduction

Ever since its foundations were laid by Spearman (1904), the common factor model has become a widely used method for classifying underlying factors and generating theories about them. Factor analysis has not only become a popular method of correlational analysis and interpretation in psychology but also in other fields such as biology and economics (Fabrigar, Wegener, MacCallum, & Strahan, 1999; Mulaik, 2009). Its development started with the formulation of g


theory, in which g referred to the general factor of intelligence, but factor analysis has also played a central role in, for instance, the five-factor model of personality (McCrae & Costa, 1987) and the p-factor underlying

psychopathology (Caspi et al., 2014). In all these examples, we assume that the variables (e.g., general intelligence or neuroticism) are real: psychologists usually believe that the factors extracted represent true variables, and the factor model was originally used to represent the theoretical relations between unobserved and observed variables. Nevertheless, other kinds of factor interpretations are also possible. For example, factors might only just be statistical constructs, a result of a mathematical analysis, and nothing more. In this thesis, I aim to

conceptually analyze and evaluate possible interpretations and uses of the common factor model, from different epistemological perspectives.

Spearman (1904) developed the common factor model to construct his theory of human intelligence. He found high positive correlations between several ability tests taken by school children; when they scored highly on an English test, they were more likely to score highly on a French or mathematics test as well. This phenomenon is also known as the positive manifold, and is one of the most replicated findings in psychology. Spearman believed that there must be one underlying variable that explained all these correlations, which he named g or the ‘general factor’. The model he subsequently developed describes the relation between observed variables (scores on intellectual performance tests) and the underlying latent variable; the latent variable causes the variance in the observed variables and is the primary cause of the observed correlations. Once we conditionalize on the latent variable, the correlations between the observed variables disappear and the observed variables become statistically independent,


which is also known as the principle of local independence (Borsboom,

Mellenbergh, & Van Heerden, 2003). In other words: given the latent variable, the observed variables no longer correlate. The latent variable, or general intelligence in Spearman’s case, is the common cause of the scores on the observed variables.
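Stated a little more formally (in standard notation, added here for clarity rather than taken from the thesis): writing X1, ..., Xp for the observed variables and ξ for the latent variable, local independence says that

\[ P(X_1, \dots, X_p \mid \xi) \;=\; \prod_{j=1}^{p} P(X_j \mid \xi), \]

so that, conditional on ξ, any two observed variables are uncorrelated. Unconditionally the observed variables do correlate, and the model attributes exactly those correlations to the common cause.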

Spearman believed that the common factor model reflected reality, although he admitted that g’s exact whereabouts were unknown and perhaps hard to trace (Baird, 1987). He characterized intelligence as ‘mental energy’, and although this definition is certainly vague, he believed that intelligence had a physiological basis, and that scientists would eventually uncover the exact biological mechanism that underlies intelligence (Spearman, 1914). Moreover, in his 1904 article he explicitly states that his aim is that ‘-by the aid of information thus coming to light- it is hoped to determine this

Intelligence in a definite objective manner and to discover means of precisely measuring it’ (pp. 205-206). Later on, he concludes that ‘there really exists a

something that we may provisionally term “General Sensory Discrimination” and similarly a “General Intelligence” [...]’ (p. 272). Altogether, it is clear that Spearman believed that general intelligence was real, that it could be measured, and that factor analysis was the way to detect this phenomenon.

According to Spearman, our intellectual ability can be understood as a single factor. But science is an ongoing endeavour and methodologically, the factor model developed in many ways over the years, as did the theory on

intelligence. When others such as Thurstone (1931) entered the discussion on the topic, Spearman had to admit that perhaps there was not just one general factor that explained all the variance but also other, more specific factors. Thurstone


identified a total of seven primary abilities, such as word comprehension, word fluency and number facility. In The Vectors of Mind, he speaks of how the ‘multidimensionality of the mind must be recognized, before we can make progress toward the isolation and description of separate abilities’ (Thurstone, 1934). Thurstone speaks of the mind and of abilities as if they are real though hidden entities that he has to uncover, and the way to acknowledge this

multidimensionality of the mind was factor analysis. This also becomes clear in Multiple Factor Analysis, in which he gives several examples of how a correlation table can be accounted for by factors that refer to psychological phenomena (Thurstone, 1931). For example, Thurstone wonders whether one general factor for psychopathology can be extracted that explains a table of correlations of psychotic symptoms. Spearman’s and Thurstone’s thoughts about intelligence were not the same; nevertheless, the idea that the factors reflected real variables was clear to both psychometricians. Neither Spearman nor Thurstone simply extracted factors and interpreted them as statistical artifacts; both believed that the factors referred to psychological entities. Analyzing and theorizing about factors was therefore more a matter of science, of theory construction, than of statistics (Bartholomew, 1995). One could therefore argue that both Spearman and Thurstone shaped their arguments in the tradition of scientific realism, with the aim of constructing explanatory theories of intelligence in which the factors represented real-world variables.

Following this line of argument, Spearman and Thurstone could be seen as scientific realists (Block, 1974). Scientific realism is characterized by a belief in the reality of theories and the entities they refer to (Boyd, 1983; Psillos, 2005; Putnam, 1984). Scientific realists believe that, as science progresses, they come


closer to the truth, as theories that are better at describing a mechanism are preferred over theories that are less capable of doing so. A reason for being a realist is that scientific theories seem to perform well and have led to several applications that have proven to influence our lives. An important argument for scientific realism states that if theories were not true, we would never have made such progress over the years. The fact that we have made such progress is a direct consequence of the fact that our theories are approximately true and the entities they refer to are real.

Although scientific realism is probably the most popular philosophy among researchers because they usually believe that theoretical entities exist, there are other epistemological doctrines that provide us with different, but relevant views on what science is capable of and what knowledge entails. One of these,

empiricism, is a well-known ‘rival’ of scientific realism, and states that our knowledge should only come from what we directly observe from our senses. As opposed to realism, empiricism does not favor the construction of theories about unobservables as such entities cannot be observed directly by our senses.

Empiricism can be seen as a broad category that comprises several variants. One of these, fictionalism, is of particular importance to the common factor model and its interpretation. Whereas realists hold theories and the entities they refer to to be true, fictionalists only believe in their practical adequacy. According to fictionalism, we can never be sure that a theory is true or not, so we cannot and should not ascribe truth-claims to theories. The whole truth claim is not

important to fictionalists. In fact, they do not care about whether theories are true or false: truth is not something fictionalists strive for.

Depending on which of these epistemological doctrines one adopts, the common factor model can be

interpreted and used differently. Within a realist framework, factors can be interpreted as real-world entities and the model can be used for the detection of phenomena and the generation of theories. However, within an empiricist or fictionalist framework, such an interpretation would not be possible. According to fictionalism, the common factor model might be a useful method for

extracting variables that serve as tools for prediction, but not for representing real-world entities. Fictionalists would argue that the common factor model is nothing but a statistical method that produces statistical constructs which may describe the data adequately, but never truthfully. Considering these epistemological frameworks when applying factor analysis brings the question of the interpretation of factors to the forefront. Which doctrine we choose has a major influence on what we expect from science and what we can expect from the common factor model. Factor analysts should be aware of possible

interpretations to make a well informed choice on how they think factors should be interpreted in order to draw the correct conclusions.

Even though the factor model is omnipresent in the psychological literature, and much has been written about its history and technical foundations (Mulaik, 2009; Bartholomew, 1985; Buckhalt, 2001), surprisingly little has been written about its conceptual development and how the model can be interpreted from different epistemological perspectives. The judicious use of philosophy of science can help us understand the common factor model in a more fundamental way than statistical textbooks do. Philosophy of science can give us insight into how we understand scientific methods, and how scientific methods can contribute to our knowledge. Statistical methods are often employed blindly, without their conceptual implications being taken into account, and I


believe that that should be remedied. One should be aware of what claims a model can and cannot make, and to understand this, philosophy of science can be of considerable help.

Intelligence has been a controversial topic ever since research on it began, and I want to make clear that in this thesis I am not interested in the question ‘who was right?’. I am not an expert in intelligence research, and that question should be left to those who are. The primary interest of this thesis is what conceptual claims can be made when using factor analysis, studied in the light of different epistemological doctrines. After providing more detailed

background on what the common factor model entails and putting it in a historical context, three epistemological doctrines (realism, empiricism and fictionalism) will be discussed and I will analyze how the common factor model is interpreted according to these doctrines. I will discuss several viewpoints

expressed by factor analysts on what factors represent and the purpose for which the common factor model should be used. It will be concluded that a realist interpretation of the model is justified. Factors may rightfully be taken to refer to real-world entities, provided that the study has been conducted well.

Nevertheless, as will become clear, fictionalist or empiricist arguments should not be totally discarded, as they make us aware that factors do not necessarily represent real-world entities at all times and that the common factor model’s capabilities are limited. Finally, it will be discussed how the common factor model can contribute to theory construction through abductive reasoning (Haig, 2014). All in all, I hope that this thesis will provide a deeper understanding of possible interpretations of the common factor model, and give researchers an informed overview of different perspectives on its use.


2 A brief history of the common factor model

In 1904, the common factor model made its first appearance in Spearman’s most famous work ‘“General intelligence,” objectively determined and measured’. As its title implies, Spearman’s aim was to construct a model that depicted general intelligence and performance on mental ability tests, and could be used for objective measurement. Although Spearman was responsible for most of the work done on the model at the time, he was clearly influenced by several scientists who preceded him. Among these was the famous Francis Galton, a scientist who, among many other things, worked out the fundamental basis of the common factor model: the correlation coefficient and linear regression. Galton discovered that when plotting the size of ‘daughter-peas’ against the size of ‘mother-peas’, a straight line could be drawn to describe the direction of the scatter plot. From this information, he drew the conclusion that seed-size was heritable, as the size of ‘mother-peas’ was closely related to that of the ‘daughter-peas’ (Galton, 1894). Galton believed that the data patterns he found were indeed reflections of reality (Stanton, 2001). The regression line described above reflected the true

heritability of seed-size, and because no information contradicted his theory, he concluded that seed size as such must be purely heritable. It was Galton’s protégé, Karl Pearson, who further developed the mathematical foundations of the

correlation coefficient (also known as the Pearson product-moment correlation coefficient). Whereas Galton argued that data patterns were true reflections of phenomena in the real world, Pearson, having a strong mathematical

background, viewed data patterns as ‘descriptive summaries of data’ (Mulaik, 1991, p.90). He believed that regression lines or correlation coefficients were just


properties of the data, and believed that these did not necessarily reflect any real phenomena. Spearman combined Galton’s belief that data patterns reflect reality with Pearson’s mathematical work on multivariate statistics, and published the first version of the common factor model in his 1904 paper.

The theory of one general factor for intelligence and the common factor model that accompanied it did not go unnoticed, and in fact received a

considerable amount of criticism. The idea that there is only one underlying factor, instead of perhaps multiple factors, was hard for some to accept. As mentioned in the introduction, Spearman also admitted that he could not specify g’s whereabouts or its specific identity, but the common factor model simply proved it was ‘out there’ somewhere. Intelligence researchers such as Thurstone (1931) and, years later, Cattell (1963) developed theories of intelligence that did not involve a single general factor. Thurstone found seven ‘primary mental abilities’ and Cattell believed there were two, ‘fluid’ and ‘crystallized’ intelligence. They might have disagreed on the exact number of underlying intelligence factors, but both researchers used the common factor model to establish their conclusions. Thurstone continued the technical development of the common factor model over the years that followed and became well known for developing a model suitable for multiple factor analysis. Now it was possible to apply factor analysis to situations in which the correlation matrix could be explained by more than one factor. He also developed ‘simple structure’ as a solution for the factor indeterminacy problem. Factor indeterminacy refers to the fact that when performing factor analysis, an infinite number of solutions is possible, all of them with the same number of factors. Simple structure is one of these solutions and provides the user with factors that are easily interpretable.


The problem of factor indeterminacy with respect to general intelligence is that g is not uniquely defined (Steiger, 1979). This means that for different sets of factor scores, the same single factor can be distilled. Suppose we have two sets of test scores, both by the same person. Applying factor analysis to both sets of test scores will result in the same output; one single factor. It does not matter for the final output of the factor analysis whether a person scores higher in the first set of tests, than in the second set of tests. Critics have argued that this is a major issue that has clear implications for how useful the common factor model is (Maraun, 1996). How can factor analysis be a proper method if its input, or the scores on the observed variables, can vary over a wide range? Although there is a general agreement among psychometricians that g is indeed not uniquely determined and that factor indeterminacy exists, there has been a long lasting debate on whether this is in fact an issue. Later in this thesis, I will come back to the issue of factor indeterminacy, and discuss how it supposedly influences factor interpretation and whether this is a problematic issue.
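For the single-factor case, the indeterminacy can be stated compactly; the following is a standard formulation along the lines of the literature cited here (e.g., Steiger, 1979), with notation added purely for illustration. Writing x for the vector of observed scores, λ for the vector of factor loadings and Σ for the covariance matrix of x, every variable of the form

\[ \xi^{*} \;=\; \lambda^{\top}\Sigma^{-1}x \;+\; \bigl(1 - \lambda^{\top}\Sigma^{-1}\lambda\bigr)^{1/2}\, s \]

qualifies as ‘the’ factor, where s is any variable with unit variance that is uncorrelated with x. Since s can be chosen in infinitely many ways, the factor scores are not uniquely determined by the observed data.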

The discussion on whether factor indeterminacy is indeed as problematic as some argue started in the 1930s, died out in the 1940s, and then emerged again in the 1970s (Steiger, 1979). This forms an interesting parallel with what Mulaik (2009) characterized as the ‘period of blind factor analysis’, during the 1950s and 1960s. Mulaik argued that during this period, the factor model was usually fitted blindly on each data set that had a correlational structure, without paying attention to whether the model was in fact suitable for the research question. Whether or not this is a correct characterization of the period, the 1950s and 1960s was a time during which the technical development of the factor model came to a halt, and although further efforts were made to make factor


analysis easier to use, the method was mostly just applied rather than further investigated and developed.

However, since the 1970s, the factor model has gone through a range of developments. Not only was the model now easier to apply; new techniques were also developed. Jöreskog (1970) formed the technical foundations for confirmatory factor analysis (CFA) and structural equation modeling (SEM), which have since become widespread applications of factor analysis. SEM is a technique that combines factor analysis, path analysis and multiple regression; one can model the linear relations between the observed variables and the latent variables (the factor analysis or measurement part), but also the linear relations between the latent variables (the path analysis or structural part). Consequently, one can use SEM to test such models and evaluate them using goodness-of-fit measures. Jöreskog’s software package, LISREL, enabled researchers around the world to apply CFA and SEM with relative ease. Not only was it now possible to explore data, detect phenomena and hypothesize about causal mechanisms; theories could now also be tested through confirmatory techniques. As mentioned before, it was also during the 1970s that the factor indeterminacy discussion became more popular again, and awareness of the limits of the factor model was raised. Ever since, factor analysis and SEM have remained popular statistical methods, serving several purposes within several scientific disciplines.
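For illustration, the two parts of such a model can be written down in the LISREL-style notation in which this work is usually presented (the symbols are the standard ones, not notation introduced in this thesis):

\[ \text{measurement part:}\quad x = \Lambda_x\,\xi + \delta, \qquad y = \Lambda_y\,\eta + \varepsilon; \]
\[ \text{structural part:}\quad \eta = B\,\eta + \Gamma\,\xi + \zeta, \]

where ξ and η are the exogenous and endogenous latent variables, Λx and Λy are matrices of factor loadings, and B and Γ contain the regression coefficients among the latent variables. The measurement part is a common factor model; the structural part is the path-analytic component.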

3 The common factor model

In this section, I intend to give a short overview of the common factor model’s most important characteristics. Subsequently, I will also briefly discuss a method


often associated with the common factor model, principal components analysis, as I will later make a conceptual comparison between these models. It is

important to note straight away that the explanation and graph in Figure 1 below will center on an example with only one common factor, but the characteristics explained below can easily be generalized to multiple common factors.

3.1 Characteristics of the common factor model

As mentioned earlier, the common factor model was developed by Spearman to describe his theory of general intelligence. More abstractly, the common factor model was a mathematical construction devised to summarize correlations between observed variables (sometimes also referred to as manifest variables). In Spearman’s case, those observed variables were a test battery, consisting of several tests that measure different mental abilities. The main idea behind the model was that there was one common factor, or latent variable, that explained the correlations between the ability tests. In other words, when

controlling for the latent variable, the correlations between the different observed variables would disappear, which is known as the principle of local independence. The latent variable (linearly) influences the scores on the observed variables. The scores a person gets on mental ability tests can be explained by the level of the latent variable. In conclusion, the common factor model derives one or more common causes from a table of the correlated observed variables.
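In symbols (a standard formulation, in notation matching Figure 1 below), the single-factor model for p observed variables can be written as

\[ X_j = \lambda_j\,\xi + \delta_j, \qquad j = 1, \dots, p, \]

where ξ is the common factor (latent variable), λj the factor loading of variable j, and δj its unique factor. Because ξ is the only source of variance shared by the observed variables, it accounts for all of their correlations, which is the principle of local independence stated above.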

In addition to the model’s assumption that there are one or more common causes that explain correlations between observed variables, the model also assumes the existence of the so-called unique factors. Although the performance on several ability tests may be causally influenced by the latent variable of


general intelligence, not all of the variance will be explained by the common factor alone. There will always be a proportion of the variance in the scores on the ability tests that is not explained by the common variable. This proportion is called the unique variance. The unique factors each influence only one of the observed variables, and all unique factors are assumed to be independent of each other. The unique variance can then be divided into two specific components: the specific factor and measurement error. The specific factor refers to variance that is specific to a single observed variable and is not captured by the common cause. For example, when we find a general intelligence factor for several ability tests, we do not expect that general intelligence explains everything, but that each test also measures something specific that is not captured by general intelligence. Measurement error refers to the fact that there is always a proportion of variance caused by neither the common cause nor the specific factor, but simply by other, random influences. In the case of mental ability tests, these random influences can be things such as sudden distractions when filling out a test, or the fact that the test is taken on a computer instead of on paper.
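The decomposition described in this paragraph can be summarised in a single line (standard notation, consistent with the model equation given above): splitting the unique factor of variable j into a specific part s_j and a measurement error e_j gives

\[ \operatorname{Var}(X_j) \;=\; \lambda_j^{2}\,\operatorname{Var}(\xi) \;+\; \operatorname{Var}(s_j) \;+\; \operatorname{Var}(e_j), \]

that is, common variance plus specific variance plus error variance, the last two together forming the unique variance of variable j.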

The common factor model can be graphically represented as the graph in Figure 1. Each of the observed variables, or each of the mental ability tests in the case of general intelligence, is a linear function of the latent variable, or the common cause. These linear relations are represented by the arrows. The

strength of this linear relation is represented by the factor loading, often signified by λ. The higher the factor loading, the stronger the linear relationship between the observed variable and the common cause. There is no restriction on the number of linear relations between an observed variable and the common causes;


in the case of a multiple factor model, the observed variables are caused by more than one latent variable.
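In the general case with m common factors, the model implies the following structure for the covariance matrix of the observed variables (a standard result, added here for illustration):

\[ \Sigma \;=\; \Lambda\,\Phi\,\Lambda^{\top} \;+\; \Theta, \]

where Λ is the p × m matrix of factor loadings, Φ the covariance matrix of the common factors, and Θ the diagonal matrix of unique variances. Fitting the model amounts to choosing these matrices so that the implied Σ reproduces the observed correlation table as closely as possible.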

These days, several methods fall under the category of factor analysis. Often, a useful distinction between exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) is made. EFA is, as its name implies, useful for exploring the possible latent structure of an available data set. EFA is often used to detect underlying common factors and to generate plausible hypotheses about the underlying causal mechanisms. It can be used in an early stage of research when there are no clear hypotheses or theories yet available. CFA, on the other hand, can then be used to test whether a certain theory of latent variables indeed holds. In EFA, no expected structure is defined beforehand because one has little or no knowledge of the possible relations. With CFA, the expected covariance structure is defined beforehand and then evaluated to see if the proposed model fits. CFA is therefore suitable for the evaluation and testing of hypotheses brought forward by EFA or other sources. For this reason, it is often recommended to use both, as CFA can confirm what has been found in the earlier EFA analysis.
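One way to picture the difference (an illustrative two-factor, four-indicator example, not an example taken from the thesis): in EFA every loading is freely estimated, whereas in CFA the hypothesized structure is imposed beforehand by fixing some loadings to zero,

\[ \Lambda_{\text{EFA}} = \begin{pmatrix} \lambda_{11} & \lambda_{12}\\ \lambda_{21} & \lambda_{22}\\ \lambda_{31} & \lambda_{32}\\ \lambda_{41} & \lambda_{42} \end{pmatrix}, \qquad \Lambda_{\text{CFA}} = \begin{pmatrix} \lambda_{11} & 0\\ \lambda_{21} & 0\\ 0 & \lambda_{32}\\ 0 & \lambda_{42} \end{pmatrix}, \]

after which the fit of the constrained model can be evaluated with goodness-of-fit measures.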

3.2 Principal components analysis

Another popular statistical method, principal components analysis (PCA), is often associated with factor analysis and is often used for the same purpose of identifying an underlying latent structure. What is often overlooked is that there are conceptual differences between the methods and that they are not interchangeable (Bandalos & Boehm-Kaufman, 2009); PCA and factor analysis are not the same, even though it is a popular belief that they are.

Figure 1: Example of the common factor model, with one latent variable (ξ1) and three observed variables (X1, X2, X3). δ1, δ2 and δ3 stand for error terms. λ1, λ2 and λ3 signify the factor loadings.

To understand the difference between factor analysis and PCA, it is important to understand one technical detail that differentiates the models. Whereas the factor model incorporates a uniqueness term (variance that is specific to each observed variable, plus a random error term), PCA does not have such a term. In PCA no distinction is made between the variance that is caused by the common factors and the variance that is unique to each observed variable. Therefore, in factor analysis, only the shared variance is analyzed (only the relations between the observed variables are of interest), whereas in PCA all of the variance is analyzed. This technical difference between the two models leads to a clear conceptual difference in how both models should be used and interpreted. PCA is a data analytic technique that can be used for reducing data by constructing a set of components that are linearly related to a multitude of observed variables. The goal is to explain as much variance as possible by a small number of

variables that are linear combinations of the observed variables. Factor analysis, on the other hand, is more appropriate when one is interested in the covariance structure between the variables and for identifying the latent constructs and the relations between them that underlie the correlations. Whereas factor analysis is suitable for analyzing the relations among latent variables, PCA should be used for reducing data to fewer variables that explain as much variance as possible.
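The technical contrast described above can be made concrete with a small simulation. The sketch below is illustrative only and is not part of the thesis; it assumes the scikit-learn library is available, and the loading and uniqueness values are made up. FactorAnalysis estimates a separate noise (uniqueness) term for each observed variable, whereas PCA simply finds the linear combination of the observed variables that captures the most total variance.

    # Minimal sketch (assumes scikit-learn); all numerical values are illustrative.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis, PCA

    rng = np.random.default_rng(0)
    n = 1000
    xi = rng.normal(size=n)                  # one common factor
    loadings = np.array([0.8, 0.7, 0.6])     # hypothetical factor loadings
    unique_sd = np.array([0.6, 0.7, 0.8])    # hypothetical unique standard deviations
    X = xi[:, None] * loadings + rng.normal(size=(n, 3)) * unique_sd

    fa = FactorAnalysis(n_components=1).fit(X)
    pca = PCA(n_components=1).fit(X)

    print(fa.components_)       # estimated loadings: only shared variance is modelled
    print(fa.noise_variance_)   # per-variable unique variance; PCA has no such term
    print(pca.components_)      # weights of the first component (a data summary)
    print(pca.explained_variance_ratio_)  # share of total variance it accounts for

Comparing fa.noise_variance_ with the squared unique standard deviations used to generate the data shows what the factor model estimates that PCA, by design, does not.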

The conceptual difference between factor analysis and PCA can be brought out by contrasting what is called a reflective and a formative model (Borsboom et al., 2003). A model that describes the influence of a latent variable on the observed variables is also known as a reflective model (see Figure 1), as the observed variables reflect changes in the latent variable. A model that

describes the opposite relation (in which changes in the observed variables

lead to changes in the latent variable) is known as a formative model (see Figure 2). Psychological entities, such as depression and intelligence, are often described by reflective models, because those entities are characterized as causes of our behaviour. We have high or low scores on mental abilities because of the level of our general intelligence. An example of a concept that can be described by a formative model is social economic status, which is ‘formed’ by different observed variables, such as education, salary and health. These entities together form the composite construct of social economic status; it is not social economic status by itself that influences each of these component entities. Whereas the common factor model is seen as a reflective model, PCA is a statistical method that is often associated with the formative model, as the observed data are reduced to a smaller set of variables. PCA should therefore not be confused with factor

analysis, or seen as a method that has the same purpose. For this reason, I believe that factors and components are different types of statistical constructs, and should therefore be interpreted differently. I will come back to this later in this thesis.
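Using the notation of Figures 1 and 2, the contrast between the two kinds of model can be written down directly (this simply restates the two figures):

\[ \text{reflective (Figure 1):}\quad X_j = \lambda_j\,\xi_1 + \delta_j, \qquad \text{formative (Figure 2):}\quad \eta_1 = \gamma_1 X_1 + \gamma_2 X_2 + \gamma_3 X_3 + \zeta_1. \]

In the reflective model the arrows run from the latent variable to the observed variables; in the formative model they run from the observed variables to the composite latent variable.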

As has become clear, PCA and EFA are mathematically and conceptually speaking different statistical techniques that serve different purposes. Whereas EFA should be used to explore the underlying latent structure, PCA should be used for data reduction purposes. Yet, both techniques are often used

interchangeably; researchers tend to use PCA for identifying an underlying structure or EFA for data reduction, without realising that both techniques are designed in different ways (Fabrigar et al., 1999; Bandalos & Boehm-Kaufman, 2009). Sometimes they even fail to mention which technique they have actually used. The fact that SPSS groups both of these techniques under the category of


Data Reduction does not help matters. It is also a popular belief among

researchers that PCA and EFA are quite similar and that it is not relevant which method is used for factor extraction. As shown here, it does matter, and

researchers should be aware of the distinction when choosing the appropriate technique for their study.

Figure 2: Example of a formative measurement model, with one latent variable (η1) and three observed variables (X1, X2, X3). γ1, γ2 and γ3 signify the weights of the indicators. ζ1 stands for a residual term.

4 Epistemological doctrines

4.1 Scientific realism

Within philosophy of science, several doctrines (also known as epistemologies) exist that describe how scientific research should be understood and how much science can in fact explain. These doctrines describe how strong scientific statements are allowed to be and whether or not science can make claims about the truth. Scientific realism, one of these doctrines and allegedly the most popular one, comprises four core theses (Boyd, 1983): i. scientific


theories should be interpreted realistically, or in other words, the variables that theories refer to should be seen as real-world entities, ii. ordinary scientific methods can confirm these theories to be approximately true (or untrue), iii. the history of science is progressive, in the sense that new theories build upon older ones and, as science progresses, scientific theories come closer to the truth, and iv. the reality described by scientific theories is independent of those theories. In other words: whatever we believe the world to be like, the world as such does not change because of these beliefs. We might have once believed that the earth was flat, but this did not influence the shape of the earth at all. We were just wrong. The world around us stays the same, whatever we believe it to be like and

however we develop our theories.

Scientific realism is characterized by seeking explanations for empirical phenomena. It focuses on explaining complex mechanisms, rather than testing single hypotheses (House, 1991). Furthermore, scientific realism aims to find the best explanation possible. Even though scientific realists believe that theories are true (at least to a certain extent), they accept that there is not one ultimate way of doing science and that facts are theory-laden (there is no such thing as a fact without it being embedded in a theoretical context). The ultimate aim of scientific realism is to construct theories about real-world entities that explain the world around us in the best possible way, and to do this, realists often use the notion of causality. Realists believe that we can identify causes and their consequences, and draw the conclusion that there is indeed a causal connection between the two. We will see later that antirealist doctrines attach much less importance to theories, causality, and the truth-value of entities.

The best-known argument in favor of scientific realism is the ‘No Miracles Argument’ (Putnam, 1984). Throughout the years, science has proven its merit: scientific theories can make valid predictions about the world and these theories have led to several applications that have a prominent existence in our daily lives. Without science we would have led a very different life. It is therefore hard to believe that our theories bear no truth at all; if our theories had been wrong, we would not have been able to explain empirical phenomena as successfully as we do now. If our theories were not true, it would make their success miraculous, as opposed to their success being a logical consequence of their accuracy or truthfulness. And every scientist would rather opt for a plausible explanation than for a miraculous one.

Although this argument has generally been received as convincing and seems intuitively compelling, opponents of scientific realism have pointed out that it has its limitations. They argue that science has certainly not always been right; there are numerous examples of successful scientific theories that turned out to be false. In those cases, even though scientific theories were useful and may have predicted well, they were not ‘true’ at all. Laudan (1981) was the first to formulate this argument, which is now known as the ‘Pessimistic Induction Argument’. He made a list of several

scientific theories that were false, yet very successful in their time. He argues that the notion of successfulness could therefore never be a test of whether a theory is true or not. Laudan and other anti-realists believe we should therefore never take scientific theories to be true: they might turn out to be wrong in the end. A possible way around this argument is to say that scientific realists do not necessarily believe that the theory X they support describes the world around us truthfully, but that ‘some version’ of X may well be true (Neumann, 1978, p. 536).


Realists can acknowledge that science can be wrong (science does not progress linearly at all times, with one success after the other), but they will argue that their theories refer to real-world entities, and even though their own theory might not be completely accurate yet, there exists a true theory X that can accurately describe the relations between real-world entities, and that theory is possibly a modified version of the one scientists currently support. It is

possible to be a realist, without falling into the trap of believing that science is always right.

Another argument against realism is that of reification, which I will explain here briefly and attend to in more detail later because it is of direct importance to the interpretation of the common factor model. The “error” of reification holds that we should not believe that words or statistical constructs necessarily designate objective ‘things’ (Haig, 1975). Just because we have a word for intelligence, or just because a number rolls out of a statistical analysis, does not mean that it also exists in reality. Assuming that words in our

vocabulary automatically refer to real-world entities is rather dangerous, as that would imply that nonsense words would also refer to objects (Miles, 1957). The same kind of reasoning can be applied to mathematics: simply assuming that mathematics refers to reality would imply that nonsense mathematics could also refer to real objects. On this argument, scientific realism is guilty of the error of reification, as it assumes that unobservable entities such as intelligence exist, whereas there is no real justification for this belief apart from the successful application of the common factor model. As we will see later, the error of reification has been an argument against interpreting common factors as being real entities.

Being a scientific realist, however, does not imply that all the variables that we study are as real as the chair I am currently sitting on. To take a realist position in the realism-antirealism debate does not automatically imply that we should allocate the same realist status to every single object. Rather than one global realism that attributes the same realist status to all variables in all sciences, we might benefit from adopting a more localized scientific realism that can vary in strength depending on the field of interest (Mäki, 2005). Antirealists have often argued that sciences that have reached a higher level of theoretical development can probably make stronger truth statements than those at a lower level of development that have not yet reached a consensus about a prevailing research paradigm. For example, physics might be able to make stronger truth-claims than psychology. Physics has reached a high level of agreement with respect to how one should do research, whereas a science like psychology is still engaged in looking for the most suitable way of doing behavioural research. Mäki (2005) therefore calls for a doubly local scientific realism and antirealism: several sciences may ask for different realisms or antirealisms. The term ‘doubly’ refers to the fact that we need an adjustable realism that varies over different scientific fields (both the type of realism and the scientific disciplines vary). In other words: scientific realism does not have to be a global endeavour, but can be adjusted locally when necessary. For example, as behavioural research is complex partly due to the latent nature of its variables (it is simply hard to study entities that one cannot directly observe), psychology might be in need of a different realist perspective than the one that is suitable for physics (physicists might be able to make stronger truth claims than behavioral researchers). As Mäki (2005) suggests, claims in the social sciences do not yet have to be seen as true or false, but only as candidates for truth. This does not


mean that behavioral researchers or psychologists should take on an antirealist attitude; it just means that behavioral researchers may have to be more careful in assigning a strong realist status to the outcome of their analyses, and ‘choose’ a type of realism that is appropriately cautious. As Neumann (1978) argued, a realist can perfectly well accept that he cannot yet make any truth claims, but believe at the same time that the theory he uses is similar to, or a modified version of, a true description of the world he aims to describe.

4.2 Anti-realist epistemologies

Opposed to the belief that theories refer to real hidden entities, there are

philosophical doctrines that state that we can never make such truth claims. One of these is empiricism. Modern empiricism stems from the British empiricists of the 17th and 18th centuries, such as Francis Bacon and David Hume, and holds that knowledge should only be derived from what we observe.

According to empiricism, science should consist of propositions that directly follow from the data we observe and, importantly, should not make claims about unobservable entities. Speaking in terms of causal relations would not be appropriate, especially when the causes are unobservable, as there is no reliable way of observing them. Naturally, the distinction between observables and unobservables is not easily made and has led to many debates as to where the line should be drawn (Van Fraassen, 1980). This distinction is more relevant for the social sciences than for the physical sciences, because the social sciences often study entities that are not directly observable. Entities such as mental disorders, intelligence, personality or social attitudes cannot be observed with our eyes or other senses. According to strict empiricists, pursuing


psychology as an explanatory science might not even be worthwhile because psychological entities like depression or intelligence cannot be translated into propositions that result from our direct observations and are therefore not entities that we should aim to study.

An empiricist argument against scientific realism is the problem of theory underdetermination, which states that when two theories are equally capable of explaining empirical phenomena, there is no way of deciding which theory is best. The problem of factor indeterminacy, as described earlier, is an example of theory underdetermination. There is always a number of available theories that explain whatever is observed, and, using factor analytic terminology, there is always a number of possible solutions available that all explain the correlation table. To see a theory as true because a consequence it could possibly have led to is true is also known as the fallacy of affirming the consequent (Mulaik, 1985). When we assume a hypothesis is true because its possible consequences are true, we overlook other possible theories that might have led to the same consequences. As we cannot know whether our theories are indeed

(approximately) true, empiricists argue that we should just stick to what we can observe without reaching out to constructing theories about variables and

relations that we cannot observe. Only our direct observations can ground truth claims; postulated theories cannot. A strict empiricist perspective refrains from speculating and theorizing about entities we cannot observe.

Over the years, empiricism has developed in many ways and has qualified its originally strong statements. Van Fraassen (1980) formulated his constructive empiricism, which states that, although we cannot believe in the truth of

theories as a whole, we can believe in their empirical adequacy. It is only for observable variables that empirical adequacy and truth coincide; for the unobservables, however, we can only conclude that a certain theory is empirically adequate, not that it is true. Regardless of this more nuanced and tenable

version of empiricism, the distinction between observables and unobservables is relatively arbitrary and scientific realism remains a more popular position to hold (Psillos, 2005). The No Miracles Argument noted earlier is an important and strong argument for scientific realism, and although it is not a knock-down argument, it shows that it is not necessary to restrict scientific attention to mere observations: theories that purportedly refer to unobservable variables can be quite informative and successful. An example of a successful theory about unobservables is Spearman’s theory of general intelligence, due to the often replicated finding of the positive manifold.

Empiricism can be seen as a broad epistemology that covers several doctrines, and I will elaborate on two of them: operationism and fictionalism. Operationism, or operationalism, is commonly associated with the physicist Percy Bridgman (1927) and states that the definition of concepts we refer to when engaging in scientific research should be formulated in terms of the

operations that go hand in hand with these concepts. According to operationists, the definition of concepts is not defined a priori, but depends on the set of

operations that is used to measure these concepts. For example, the meaning of intelligence as a concept is not fixed and should not be formulated in general terms, but depends on how it is measured (i.e., on which tests are being used). Following this line of reasoning, there would be as many ‘intelligences’ as there are IQ tests, as each of these IQ tests forms a definition of intelligence. Within an operationist epistemology, the meaning of concepts can be plentiful yet very


restricted. The concept of intelligence can be measured by many different tests, so the definition of intelligence is never constant. However, operationism makes it impossible to get to the core of what intelligence actually is in more general terms. Although operationism has been heavily criticised throughout the 20th century, it has been a popular epistemology among psychologists (although psychologists apply operationism in a different way than described above; see Feest (2005)). As Green (1992) aptly puts it: ‘Everyone who has taken an undergraduate methodology course in psychology ‘knows’ what an operational definition is’ (p.291). Before any research can be done, a psychological researcher has to define how the constructs he intends to measure should be operationalised, or in other words, should be measured. Several articles have been written on the role of operationism and psychological measurement (e.g., Feest (2005); Green (1992)) and exploring this relation goes beyond the scope of this thesis, but operationism is closely related to another doctrine that is of importance to this thesis. This doctrine forms an interesting contrast with scientific realism, and is known as fictionalism.

Fictionalism falls under the category of empiricism, and is sometimes also referred to as instrumentalism. Fictionalists, as opposed to scientific realists, believe that there are insufficient reasons to believe that a theory, or whatever the theory implies, is in fact true. Fictionalists regard scientific theories as useful tools that can make valid predictions, when well confirmed, but do not claim that they are true or false. According to fictionalism, the proper aim of science is empirical adequacy rather than truth. We may believe in the predictions that follow from theories but not believe in the theories themselves. Fictionalists argue that there is no way of finding out whether either entities themselves exist


or whole theories purportedly referring to those entities are in any way true, or close to the truth, and for this reason, we should never make truth statements. Strong fictionalists do not even care about the actual truth status of a theory: as long as a theory is useful and predicts well, it satisfies the requirements for what a theory is supposed to do. Whereas scientific realists believe in the truth-bearing capacities of their theories, at least to a certain extent, fictionalists only believe in their utility. In that sense, fictionalism is similar to Van Fraassen’s constructive empiricism. Theories should be judged on their adequacy for predicting phenomena, not on their truth-value. But the main difference is that fictionalists make no distinction between observables and unobservables, whereas constructive empiricism states that theories about observables can be true while theories about unobservables cannot.

Operationism and fictionalism are both empiricist doctrines as both emphasize the importance of observations, and both operationists and fictionalists are unwilling to construct theories about theoretical and

unobservable constructs. Nevertheless, they focus on different aspects of the empiricist epistemology: whereas fictionalists focus on the ‘fictionalist’ character of theories (theories cannot be true, only useful tools), operationists focus on how constructs should be defined. For operationists, it is perfectly acceptable to study intelligence and develop theories about it as long as one defines the relevant term as being a set of operations, whereas fictionalists would never believe in the truthfulness of a theory of intelligence (or a theory of any other entity for that matter).

According to Neumann (1978) it is ultimately only the degree of belief in a theory that distinguishes realists from fictionalists. As mentioned earlier, realists


do not necessarily have to believe that the theory they support describes a reality, but they believe that a modified version of the theory would describe the reality more or less accurately and that by doing more successful research they will come closer to the truth. Neumann argues that it is only in respect of this degree of belief in the truth of theories, that it is nonsensical to be a fictionalist. Realists are not required to believe directly in the truth of their theories, but they at least have an indirect belief in their truth. According to fictionalist epistemology, even indirect belief in theories is off limits. It is this disapproval of even indirect belief in the truth-bearing capacities of theories (given that these theories are well confirmed) that makes fictionalist epistemology implausible. According to fictionalists, scientific realism does not make sense as scientific theories have so often proven to be false, and therefore believing in theories serves no useful purpose. But if indirect belief in theories is still part of the realist epistemology, scientific theories can still be incorrect without ‘harming’ the realist

epistemology. For this reason, Neumann (1978) argues that there is no reason to be a fictionalist. Scientific realists can easily display indirect belief in their theories and still be proven wrong, without this harming their realist philosophy.

5 Epistemological doctrines and the common factor model

5.1 Scientific realism and the common factor model

The debate between realists and anti-realists is not yet at an end, and will probably not come to an end anytime soon. I have discussed this debate at some length because it is of specific relevance to the common factor model and its conceptual development. As will become clear, there are factor analysts who


believe that factors should be interpreted along realist lines and can be seen as real-world entities, whereas others follow the fictionalist or instrumentalist doctrines and argue that factors are only summaries of the data, so that no such existential claim about latent entities can ever be made. Which doctrine is chosen therefore has a rather large impact on what kind of claims can be made on the basis of a factor analytic study.

As I have argued in the introduction, Spearman can be seen as a scientific realist: even though g could not be directly observed, he believed that it existed in the real world. Spearman did not just believe in the adequacy of his theory, in its usefulness and in its successful predictions (although these three aspects are also of importance in all theory construction); he intended to capture what he believed to be the (approximate) truth about intelligence. Intelligence scholars who came after Spearman, such as Thurstone, also took a realist view on

intelligence, although the details of their theory were conceptually different from Spearman’s. Even though Thurstone (and many others) did not believe in a single general intelligence factor, he did believe that the intelligence domain could be subjected to factor analysis, and that a multiple factor model could accurately reflect the causal relations between multiple latent variables and mental ability test scores. According to both, factors referred to psychological phenomena, not just to statistical artifacts.

Given that factor analysts interpret the common factor model as a measurement model (and not a data reduction model), we assume that this model measures an underlying latent variable based on scores for observed

variables. When we attribute a measurement purpose to the model, only a realist attitude towards the interpretation of factors makes sense (Borsboom et al.,


2003). Only when we believe that mental ability tests are indeed able to measure intelligence (and as psychologists, we do believe that we measure something when we test participants), and only when we believe that intelligence is a real-world entity, does the common factor model have the characteristics of a measurement model. Borsboom et al. (2003) argue that latent variable models in general, and therefore the common factor model in particular, are generally used as measurement models and that psychologists are in that sense realists rather than empiricists of one type or other. For psychologists, the fact that latent variables are only indirectly observable is not an issue; it does not make them less real, and factor analysis is a method employed to measure them.

Gould and the error of reification

One of the major critics of interpreting factors as real-world variables was Gould (1981). In The Mismeasure of Man, Gould wrote a clear protest against the realist approach towards general intelligence, or g theory, favored since Spearman first developed it. According to Gould, factor analysts commit the error of reification, or in other words, the error of assigning a physical status to the factors that are abstracted from a correlation matrix. In the case of general intelligence, Spearman assigned the physical interpretation of general intelligence to the first factor. The problem that Gould points out is that assigning physical, or real, entities to mathematically derived factors is not justified to begin with. He emphasizes the difference between the mathematical side of a model and the interpretation side of a model and argues that these are not the same. To assume that mathematics in itself can mean anything real is premature. Factors are only the outcome of the analysis itself, and do not directly refer to an object in reality.


For Gould, therefore, drawing the conclusion that general intelligence exists, based on the outcome of a factor analysis, is flawed logic. Before one can actually draw this conclusion, one needs extensive knowledge of the variables in question. Models or mathematics alone are not enough.

Gould’s book was both positively and negatively received. His disapproval of how intelligence was measured and of our tendency to rank people found support among both academics and non-academics. Nevertheless, his writings were not uncontroversial. One major point of critique is the fact that he does not seem to understand that he is actually writing about PCA rather than factor analysis. In his example on bone length, he describes the extracted factor as a principal component, and as mentioned earlier, factors and components should not be confused. The purpose of PCA is rather to reduce data whereas in factor analysis, we often have a theory generation or measurement purpose. Gould seems to be unaware of this difference and explains factor analysis using PCA terminology.

His claims on the error of reification also received a fair amount of criticism; in his critical book review, Carroll (1995) stated that reification was not as big an issue as Gould believed it was. Carroll’s counterargument holds that it is not true that psychologists or other factor analysts interpret factors as being as real as Gould claims: factor analysts do not interpret factors to be as real as physical objects. Carroll argues that factors ‘should be regarded as sources of variance, dimensions, intervening variables or “latent” traits that are useful in explaining manifest phenomena, much as abstractions such as gravity, mass, distance, and force are useful in describing physical events.’ (p. 126). Gould’s accusation that factor analysts interpret

(35)

factors to be as real as any physical objects would therefore be incorrect; factor analysts regard them as latent traits rather than actual physical objects. For example, intelligence is not regarded to be as real as a chair or a table; it is supposed to be seen as an abstract, latent trait that is useful to explain our performance on mental ability tests. The problem of reification would therefore not be as severe as Gould portrays it.

I agree with Carroll that the issue Gould raises with regard to reification is not as problematic as Gould argues, but I disagree with Carroll’s line of argument. Carroll claims that factor analysts do not interpret factors to be as real as Gould says we do, and that if factor analysts do assign such a status to factors, they wrongfully do so. Yet factor analysts do sometimes interpret factors as being just as real as concrete objects; the difference is merely that factors are latent (unobservable). Factor analysts often do not question the existence of the variables they study; these variables are taken to be just as real as a table or a chair. That we cannot directly observe these variables, and that we may not know their physical structure (how intelligence is actually constituted by our brains is largely unknown), does not imply that we should interpret them as being less real. In the quote cited above, Carroll states that variables from physics, such as gravity, mass and distance, are also not interpreted to be as physical as concrete objects. Again, I think Carroll makes a mistake here: both academics and non-academics interpret these to be real-world entities, and it is not a mistake that they do. Carroll confuses the notion of reality with the notion of unobservability. Latent entities are not the same as unreal variables; although latent variables are not directly observable, they can still be real.

Like Carroll, then, I do not think that the reification of factors needs to be regarded as an error, but for different reasons. The core of the problem is what role we want to ascribe to the common factor model. As has been argued earlier, only a realist epistemology enables the common factor model to serve both a measurement and a theory construction purpose (probably the two most popular functions of factor analysis in psychology). With regard to these functions, the reification of factors is part of the factor analytic tradition, even though it is not part of the model’s mathematical structure. Strictly speaking, Gould is right in pointing out that the mathematical properties of the factor model and the interpretation of its output are two distinct matters, just as the word ‘table’ and the object ‘table’ are not strictly the same. Nevertheless, this does not imply that mathematics cannot refer to real-world entities. Mathematical properties themselves do not dictate any interpretation, and certainly not the correct interpretation, but it is up to the good sense of factor analysts to interpret them in a manner that advances the construction of their theories. Moreover, if we use the common factor model for theory construction, we can assume that we have some knowledge of the mechanism we intend to describe, and that we are able to make sound interpretations of it. Of course, we could ascribe another purpose to the common factor model, such as data reduction, in which case a realist interpretation of the factors would not be sensible, but in most cases in psychology this is not the main aim of factor analytic research. Moreover, PCA is better suited for data reduction purposes, so there is no reason to apply factor analysis in those situations. In conclusion: the error of reification does not have to be regarded as an error when there are good reasons to believe that factors indeed represent real-world entities.

Maraun and the problem of factor indeterminacy

Another critic of interpreting factors as real entities (and a critic of the factor analytic method in general) is Maraun (1996), who argued that when considering the mathematical model of factor analysis, there is no reason to conclude that extracted factors represent the only possible underlying latent variables. According to Maraun (1996), the undeniable existence of factor indeterminacy makes realist interpretations of extracted factors impossible. Recall that, mathematically speaking, the common factor model implies that there is no such thing as ‘the one latent variable’ that causes the correlations between the manifest variables. Given that Y is the set of observed variables, ‘[...] the factor analysis model provides a criterion that admits an infinity of constructed variates that are each latent common factors to Y’ (p. 521). There is no reason to believe that the factor generated by the factor analysis is the latent variable that forms the only explanation for the observed correlations. Moreover, a method for the detection of underlying variables cannot at the same time be a criterion for the identity of this underlying variable. Maraun therefore argues that ‘the figurative language, and various senses of factor that surround the practice of factor analysis must be distinguished from the technical sense of latent common factor inherent in the model’ (p. 527). The metaphorical side of factor analysis, finding analogies and labelling extracted factors, has nothing to do with the mathematical restrictions of the model, and should therefore be seen as an external practice rather than something that is internal to factor analysis.
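
The formal result behind this claim can be sketched as follows (the notation here is mine, not Maraun’s, but the construction is the standard one in the factor indeterminacy literature). For a single-factor model with observed vector y, loading vector λ, unique covariance matrix Θ and implied covariance matrix Σ = λλ' + Θ, consider any variable of the form

```latex
\eta^{*} \;=\; \lambda'\Sigma^{-1}y \;+\; \bigl(1 - \lambda'\Sigma^{-1}\lambda\bigr)^{1/2}\, s ,
```

where s is an arbitrary standardized variable uncorrelated with y. Every such η* has unit variance, satisfies Cov(y, η*) = λ, and leaves a residual y − λη* with exactly the unique covariance matrix Θ; in other words, each choice of s yields a different variable that meets all the constraints the model places on the common factor.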

Although Gould and Maraun would probably agree with each other to a certain extent (both argue that the mathematical side of a model is distinct from how its output should be interpreted), their reasons for disapproving of a realist interpretation of factors are slightly different. Whereas Gould argues in a more general sense that it is a mistake to assign a realist status to extracted factors, because they are merely the outcome of a statistical analysis, Maraun focuses on the mathematical limitations of the model, which imply that an extracted factor is never the only possible latent variable that can explain the correlation matrix. Gould might possibly agree with a fictionalist interpretation of factors (to which I will come back later), as long as we do not make the mistake of treating them as real entities. Maraun, on the other hand, possibly disapproves of the whole factor analytic tradition, as factor analysts do not seem able to distinguish between the internal, mathematical restrictions of the model and the external interpretation procedures, which should not be confused. For him, latent variable models differ distinctly from mathematical models in the natural sciences, as they lack the rules that would relate the latent variable to any real entity. In the natural sciences, we are able to point out the natural entities that we would like to represent in the model, and we have an understanding of how these entities behave. For example, we have an idea of the rules that govern entities such as speed or mass, which enables us to link them to mathematical entities. Latent variable models, by contrast, are unable to point out such entities, as we have no tools to relate latent variables to natural entities.

Importantly, Maraun is not an anti-realist: he believes in the existence of natural entities, and his point is not to disprove the existence of latent variables as such. Intelligence or depression might be real-world entities, but the category of latent variable models is simply not capable of uncovering them. If we want to keep studying latent variables, we need a wholly different kind of model, one that incorporates the correspondence tools that latent variable models lack.

Factor analysts everywhere agree that factor indeterminacy is a given, and in that sense Maraun is right: there is never a single explanation for the correlations between observed variables, as there is always a wide range of possible solutions available. As mentioned earlier, the debate about factor indeterminacy is therefore not about whether it exists, but about how problematic it in fact is. Obviously, in Maraun’s eyes this problem is severe: we can never draw conclusions about underlying latent variables as strong as those that are often drawn.

In a reply to Maraun’s arguments, Mulaik (1996) wrote:

The remarkable thing about factor indeterminacy is that for all the mathematics that suggest the existence of alternative solutions for a common factor, it is difficult to show clear-cut examples of researchers having distinct alternative interpretations for their factors that are equally consistent with the factor structure coefficients and the covariances among the manifest variables. (p. 585)

The existence of factor indeterminacy cannot be denied, but it is highly unlikely that the interpretations of different researchers will vary substantially, especially when researchers have already thought carefully about possible interpretations in advance (as is the case for confirmatory factor analysis). Moreover, following a realist epistemology, researchers do not have unlimited freedom in choosing adequate interpretations, because scientific realism requires factors to be objectively determined real-world entities. For the interpretation of the output of exploratory factor analysis one might expect more disagreement among researchers, yet researchers seem to come to similar conclusions. As one has no clear hypotheses before applying exploratory factor analysis, its output might be more ambiguous and invite several possible interpretations compared with the output from confirmatory factor analysis. However, the variety of possible interpretations need not be a problem; it is simply part of the natural course of science: in the early stages of research, theories might not be fully developed and researchers might still disagree about which concrete phenomena play a role in the effect they intend to study. As science progresses, this uncertainty diminishes and researchers tend to agree more on the entities factors refer to.

In conclusion: although both Gould and Maraun raise interesting points about the interpretation of factors, factors can still be interpreted as real variables, as long as there are good enough reasons to do so. Unlike Gould, I do not think that reification is a problem, as long as we are sufficiently informed to draw plausible conclusions. The problem of factor indeterminacy does not have to be as severe as Maraun believes it is, and although there is always a variety of possible factor interpretations, through sensible reasoning we can solve the puzzle of which interpretation is most appropriate.

The fact that psychologists (and other scientists) interpret factors as representations of real-world entities does not mean that factors as such necessarily represent real-world entities. Factor analysis is a complicated method that involves many steps and many subjective decisions, and there exists a substantial literature on what happens when factor analysis is used incorrectly. For example, Horn (1967) argues that in factor analysis there is always a ‘lack of operational independence between an investigator’s substantive hypotheses concerning factorial structure and the procedures used to determine factor structure’ (p. 819); in other words, a substantial amount of subjectivity is always involved in applying factor analysis. When wrongly applied, factor analysis can indeed lead to nonsensical output, and a realist interpretation of the factors would in those cases be incorrect. Factor analysis does not magically lead to the discovery of new latent variables; to be useful, it needs to be applied correctly. For those who follow a realist doctrine, factor analysis can only lead to truth-bearing claims when the relevant theory is generated properly and tested successfully.

5.2 Empiricism and the common factor model

Not much has been written on the link between modern empiricism and latent variable modeling, possibly for good reason. As mentioned earlier, empiricism is a doctrine which holds that knowledge of the world around us should be stated only in terms of direct observations from the senses. On this view, theories about unobservable entities are not legitimate: there is no way to directly observe intelligence or depression, and we should therefore refrain from constructing theories about unobservable causal mechanisms. However, I will argue that there are some applications of latent variable modeling that would also be of use within an empiricist framework, although these applications are restricted.

Before I continue with this argument, it will be useful to once again make the distinction between exploratory and confirmatory factor analysis, or exploratory and confirmatory statistical techniques in general, as each of these approaches makes different epistemological claims. According to popular opinion, exploratory statistical techniques (such as EFA) are methods that can detect data patterns and generate hypotheses. Scientists apply EFA or other exploratory techniques when they have no clear idea yet of what they may find in the data. Baird argues that EFA can be seen as a purely empirical method, as it ‘proceeds in a virtual vacuum of substantive theory’ (Baird, 1987, p. 323). According to him, the technique of EFA as such does not depend on any theory and only makes use of directly observed test scores. CFA, on the other hand, is embedded within substantive theory, because CFA requires that one models the expected relations between latent variables beforehand. Because CFA is first and foremost concerned with the testing of explanatory theories, it does not function well within an empiricist framework. As mentioned earlier, strict empiricists do not believe in constructing theories about latent variables, as we can never know whether these are true. CFA thus fits better in a realist framework than in an empiricist one. In conclusion, following Baird’s line of argument, EFA can be seen as an empirical method because it deals with direct observations, but CFA does not function well within such an epistemology because it deals with theoretical relations between unobservable entities.
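
The contrast can be made concrete in terms of the loading matrix of the common factor model (the notation below is mine and merely illustrative; a four-indicator, two-factor example). In EFA every loading is freely estimated from the data, whereas in CFA part of the loading matrix is fixed to zero in advance on theoretical grounds, so that the analysis tests a previously specified structure:

```latex
\Lambda_{\text{EFA}} =
\begin{pmatrix}
\lambda_{11} & \lambda_{12}\\
\lambda_{21} & \lambda_{22}\\
\lambda_{31} & \lambda_{32}\\
\lambda_{41} & \lambda_{42}
\end{pmatrix},
\qquad
\Lambda_{\text{CFA}} =
\begin{pmatrix}
\lambda_{11} & 0\\
\lambda_{21} & 0\\
0            & \lambda_{32}\\
0            & \lambda_{42}
\end{pmatrix}.
```

The zero entries in the CFA matrix encode the researcher’s prior theory about which observed variables measure which latent variable; it is exactly this prior theoretical commitment that an empiricist framework has difficulty accommodating.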

Even though testing theories about latent variables through CFA does not fit well in an empiricist framework, the common factor model as such could still be useful for empiricists when the common causes are not latent. Although the common causes discussed so far, and common causes in psychology more generally, often have a latent structure, common causes are not necessarily latent: the common factor model can also be applied to common causes that can be directly observed. For example, the reason that the number of sunglasses sold and the number of people going to the beach strongly correlate is that the sun has been shining. Only strict empiricists refuse the use of language that involves causality; less strict empiricists acknowledge causality and appeal to it in explanations of observed phenomena (as long as the common cause is not an unobservable). The common factor model could therefore still be used, although due to the restrictions of an empiricist framework, it might only be applied to entities that can be directly observed.
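
The sunshine example can be illustrated with a small simulation (a sketch of my own; all names and numbers are invented for illustration). Two variables that share an observed common cause are correlated, and once that common cause is partialled out, the correlation disappears:

```python
# Illustrative sketch: an observed common cause (sunshine) induces a correlation
# between sunglasses sales and beach visits; conditioning on it removes it.
import numpy as np

rng = np.random.default_rng(1)
n = 10000
sunshine = rng.normal(size=n)                        # directly observable common cause
sunglasses = 0.7 * sunshine + 0.5 * rng.normal(size=n)
beach = 0.6 * sunshine + 0.5 * rng.normal(size=n)

print(np.corrcoef(sunglasses, beach)[0, 1])          # substantial marginal correlation

# Residualize both variables on sunshine; the remaining correlation is near zero.
res_sunglasses = sunglasses - np.polyval(np.polyfit(sunshine, sunglasses, 1), sunshine)
res_beach = beach - np.polyval(np.polyfit(sunshine, beach, 1), sunshine)
print(np.corrcoef(res_sunglasses, res_beach)[0, 1])  # approximately zero
```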

Even though there are some applications of the common factor model within an empiricist framework, these are quite limited. EFA might fit an empiricist framework better than confirmatory factor analysis, but the process of EFA is not entirely empirical, as we have to use our rational minds to interpret the output and certain assumptions are made beforehand. Moreover, empiricists might argue that latent variable modeling as a whole is a pointless endeavour, as it purports to measure a hidden reality that is by no means directly observable. It makes no sense to measure an unobserved entity by linking it through linear regressions to scores on observed variables; the latent variable of interest remains unobservable and should therefore not be included in scientific theories at all. In conclusion, the link between empiricism and the common factor model is not as natural as the link between the common factor model and realism.

5.3 Fictionalism and the common factor model

As noted earlier, Spearman the realist was strongly influenced by Francis Galton, who believed that the regression lines he found represented a relation between the size of mother-peas and daughter-peas. Someone else who greatly influenced Spearman, but who was clearly not a realist, was the statistician Karl Pearson. Whereas Galton believed in the actual existence of data patterns, Pearson believed that data patterns were always just a statistical artefact, and a
