An Analysis of the Asymmetry of Causal Inference in an Interventionist Framework

by

Omri M. de Boers

University of Amsterdam
Amsterdam, The Netherlands

Name: Omri Maurits de Boers
Student number: 10022333
Supervisor: dr. Katrin Schulz
Second reader: prof. dr. Franz Berto
Date of defence: September 3rd, 2014
Word count: 21331

Department: Philosophy


Contents

1 Introduction
1.1 Asymmetry and Causal Inference

2 Causal Asymmetries
2.1 Temporal order
2.2 Probabilistic (in)dependence
2.3 Manipulability
2.4 Explanation
2.5 Screening off
2.6 The Causal Field
2.7 Concluding remarks

3 Pearl's Interventionist Framework
3.1 The framework and its motivation
3.2 Formal concepts
3.2.1 Probabilities
3.2.2 Causal graphs
3.2.3 Structural equations and causal models
3.2.4 Remarks
3.3 Formal definitions
3.3.1 Preparing for the CMC
3.3.2 The Causal Markov Condition
3.3.3 Causal structure and model definitions
3.3.4 Inferred Causation
3.3.5 Temporal information and statistical time
3.3.6 Intermediary remarks

4 Analysis
4.1 General considerations
4.1.1 Inference to a model and inference from a model
4.1.2 Minimality
4.1.3 Directedness and acyclicity
4.1.4 Ontological consequences
4.2 Probabilistic dependence patterns
4.3 The Causal Field and Asymmetry
4.3.1 The causal field, granularity and the CMC
4.4 Temporal order
4.4.1 Causation with and without temporal information
4.4.2 Statistical time
4.4.3 Reflections on temporal asymmetry
4.5 Discussion
4.5.1 Contextualization
4.5.2 Considerations and desiderata

Appendix

1 Introduction

Causality has been a matter of philosophical debate since the time of the ancient Greeks. Philosophers have been particularly interested in what causality is and how it works. What is often striven for is a reduction of the concept of causality to a more fundamental non-causal concept, one that explains what causality is and how it works in non-causal terms. No such reduction of causality has succeeded to this day, nor has consensus on a likely candidate concept been reached (Beebee et al., 2009: 1-3). Theories focusing on such a reduction often fail in causal situations that people can resolve fairly intuitively (Paul, 2009: 162-163, 173-182; Halpern and Pearl, 2005). One thing most philosophers do agree on is that causality is a stable relation and that it has an asymmetric character. The first point is simple: causal relations seem to persist. The second point seems straightforward as well. The simple binary relation "C causes E" is asymmetric, but there are other ways in which causality has been observed to be asymmetric. A prominent example is the time-asymmetry of causation: intuitively, causes generally precede their effects. These two points of stability and asymmetry have been taken as points of departure for philosophical theories, and most if not all attempts to explain what causality is and how it works have taken asymmetrical concepts as their basis (Beebee et al., 2009).

In the last decades a practical approach to causality has taken hold in empirically oriented scientific disciplines. This approach is exemplified by interventionist theories such as that of Judea Pearl. Its goal is quite different from the prominent philosophical goal described above, a reduction to a more fundamental concept. The approach seeks to operationalize the notion of causality in such a way as to make it practically applicable to scientific endeavors, or, in other words, to facilitate causal inference (Pearl, 2009: xv-xviii). For that reason it is a very popular approach in a number of scientific disciplines such as artificial intelligence ¹. Due to the practical focus on causal inference, with human intuition and experience as guides, the theory is also more successful at causal inference in situations that people find intuitive. A thorough analysis of such an intuitively appealing approach may inform and perhaps re-orient the philosophical debate about a theory of causality. Conducting such an analysis of an interventionist system is the purpose of this paper.

¹ Other examples are medical research, biology, economics, the social sciences and psychology.

The layout of the paper is as follows. An in-depth problem description, including a more detailed motivation for the project, a research question and methodological considerations, is given in the section below, setting the stage for the paper. Proceeding from there, the second chapter lays part of the foundation necessary for the analysis by describing a number of relevant causal asymmetries observed in the philosophical literature and elucidating a number of concepts that will be built upon in subsequent chapters. The third chapter uses the material and concepts from the second chapter to give a comprehensive picture of Pearl's interventionist framework, forming the other needed part of the foundation. The fourth chapter carries out the analysis of Pearl's framework, building upon the foundation of the previous two chapters and answering the research question. The results are discussed in the final section of the fourth chapter.

1.1 Asymmetry and Causal Inference

After the introduction above, a specific description of the actual issue is needed. Before giving it, it is worth briefly discussing the difference between causality and causal inference, since Pearl's theory focuses on the latter due to its practical approach. Causality in philosophical terms generally refers to the causal relation itself and is the main object of philosophical debate. Causal inference instead usually refers to the learning or discovery of causal relations (Garrett, 2009: 75-77; Danks, 2009: 451-455). We would expect causal inference to adhere to the causal relation in the sense that causal inference is aimed at uncovering causal relations for use in reasoning. That does not mean that insight into causal inference will tell us everything there is to know about the causal relation itself, but it is reasonable to expect that causal inference will unveil those characteristics of the causal relation required to utilize it. In other words, one would expect there to be a significant correspondence between causal inference and the character of causation. This correspondence is intuitively and experientially indicated by the stability and asymmetry of causation. Causation has been observed and generally presumed to be stable, and we see this in causal inferences as well. If a wineglass is forcibly thrown against a concrete wall, we expect it to shatter every time, barring exceptions. Similarly, the asymmetry of causation is seen in causal inferences, for example in the time-asymmetry of cause-effect relations mentioned above: the throwing of the wineglass precedes its shattering against the wall. It is worth pointing out that our intuition, experience and experiments are ultimately the only source for a starting point to probe the nature of causation, regardless of the many directions that may be taken from there.

For Judea Pearl the two fundamental questions of causality are: "(1) What empirical evidence is required for legitimate inference of cause-effect relationships?" and "(2) Given that we are willing to accept causal information about a phenomenon, what inferences can we draw from such information, and how?" (Pearl, 2009: xv). In other words, which experiential cues do we need to infer the presence of causal relationships? And what can we infer from that information? The stability of causal relations, i.e., that A and B are always causally related, plays an important role, but it is not enough to infer whether A causes B or vice versa, because from the mere fact that A and B are related, we cannot yet establish which is the cause of the other. An illustration of this problem is offered by the difference between correlation and causation. The slogan "correlation does not imply causation" is a popular phrase from statistical textbooks. It turns on the observation that a correlation, or covariance, between two variables A and B tells us that they are probabilistically dependent, that the variance of their values is related, but not whether A is dependent on B or B on A, or even whether the correlation is spurious, a matter of chance without any dependence involved. Evidently, such information is not enough to infer that A causes B, because such a statement would require that we know that B depends on A in a causal manner. Such a dependence is commonly thought to be impossible to infer from statistical data, which consist only of correlations between variables (Pearl, 2009: 401-428).

What this brief illustration demonstrates is that in order to infer cause and effect, to establish not just that two events are related but which event is the influencer and which the influenced, asymmetrical information is needed. As noted, such asymmetrical concepts are often the instrument of attempted philosophical reductions of causality. For example, one might attempt to reduce all causal relationships to specific temporal relations between events, potentially reducing causality to temporal order. Such reductions of causality generally focus on a single asymmetrical concept. Unfortunately, although much has been contributed to our understanding of causality, attempted reductions have been unsuccessful so far. A number of conclusions may be attached to that state of affairs: for example, that a reduction is unfeasible or even impossible, that causality cannot be reduced to a single concept, or that we have not yet found the right concept to reduce causality to. What might be considered is that our approach has to be re-oriented around the starting point of our investigations into causality: intuition, experience and experiments. Investigating how the asymmetry that we observe causality to have is relevant for our inference of cause-effect relationships in a framework like that of Pearl, which focuses precisely on that starting point, may inform the debate on causality and its asymmetry, perhaps providing support for one of the conclusions just mentioned. This motivation leads to the following research question for this paper:

"How is the observed asymmetry of causation relevant for inferring causal relationships from experience in an interventionist framework?"

In order to arrive at an informative answer to the research question, a number of methodological considerations have to be discussed. First, as should already be clear, Judea Pearl's theory will be taken as the point of reference for an interventionist framework. Judea Pearl is one of the leading figures in the field and his theory is among the most extensive, best documented, best received and most extensively researched. Furthermore, as numerous scientists working on interventionist theories have pointed out, the basic principles and methodologies of different instantiations of such theories are largely the same (Halpern and Hitchcock, 2010; Woodward, 2009). For these reasons the framework of Judea Pearl will be taken as the interventionist framework referred to in the research question.

A conceptual analysis is the primary method of approach, focusing on describing, elucidating and analyzing the role of causal asymmetry in causal inference. The aspects to account for in that analysis are as follows. Intuition and experience are important guides which we would like a theory about asymmetry for causal inference to adhere to: it should be able to explain them. Having said that, there are also philosophical considerations and scientific developments to account for. Scientifically, it is prudent to respect the current state of affairs in science, and a theory of causality should generally adhere to prevailing scientific views, such as those of physics. Philosophically, perhaps the most important points are not to exclude possibilities on a priori grounds and that the foundation of the theory be adequately argued for and explained. Furthermore, since our folk notions may not always be obvious, may be wrong to a certain degree, may be found to be contradictory or may run into other issues, the application of a metaphysical analysis remains important philosophically and will be taken into account where possible. All these aspects will be used to arrive at a comprehensive answer to the research question. This allows a contextualization of the results within the broader philosophical discussion on causality and its asymmetry as well.

Two specific points deserve attention before commencing the paper. First, the difference between tokens and types as causal relata, i.e., as objects that are causally related, will not be discussed in this paper, because Pearl's framework does not make a distinction between them (Pearl, 2009: 253-256). Tokens are singular instances of a causal relation. Types are generic instances of a causal relation, applicable to a multitude of token-level situations. In daily life we often make generic causal statements such as "you can turn a computer on by pressing the button with the power symbol", but singular ones as well, such as "you have to press the white button to turn my computer on". Second, Pearl's terminology, notation and numbering of definitions and figures will be adopted throughout.


2 Causal Asymmetries

Causal asymmetry comes in a number of varieties which all speak to the asymmetrical character of causality. This chapter describes asymmetries that have been observed to correspond with the asymmetry of the causal relationship and that occur in Pearl's framework. Some asymmetries will not be covered, a prominent one being counterfactual dependence, because they do not play a direct role in Pearl's framework itself ¹. The chapter focuses on explaining the relevant asymmetries and a number of concepts that will be needed for describing and explaining Pearl's framework in chapter 3, as well as highlighting connections between asymmetries that will be relevant for the analysis in chapter 4.


2.1 Temporal order

Temporal order is one of the most intuitive asymmetries that have been observed to be associated with causality. Simply defined, the time-asymmetry associated with causation is that causes precede their effects in time. This will be termed "successive" causation in this paper. A simple example was given in the first chapter with the wineglass being thrown against a wall. The cause, the throwing of the wineglass, precedes the effect, the shattering of the wineglass against the wall, in time. Although the time-asymmetry of causation is intuitively very plausible, it has not been accepted without nuance (Beebee et al., 2009). It has not often been denied that causality corresponds generally to the direction of time, but its worth for supporting a theory of causality has been questioned. This has to do with philosophical theorizing and reflections on intuition regarding causality.

It has been found plausible that causation is not always aligned with the direction of time (Carroll, 2009: 286). Earlier the term "successive" causation was coined to contrast it with what have become known in the literature as "simultaneous" and "retro" causation. Simultaneous causation is defined as cause and effect occurring at the same moment in time. In other words, simultaneous cause-effect relationships are atemporal in the sense that no time elapses between the cause occurring and the effect occurring. Consequently, the direction of such cause-effect relationships cannot be derived from the temporal direction in such cases. Remember from the previous chapter that in order to establish the presence of a causal relationship, we need to know not just that two events or variables A and B are related, but whether B depends on A (which means that A causes B) or vice versa. Not being able to establish this directionality is thus a strike against temporal order as a candidate for a metaphysical reduction of causality. An example of simultaneous causation is a situation where two cards are stacked against each other in a reverse V. It can be said that the cause of one card standing up is that the other keeps it from falling at exactly the same time.


Another possibility is retrocausation. These are cases of cause-effect relationships where the effect precedes the cause in time. In other words, these are causal relationships that run from future to past, not past to future. From the perspective of our intuition and experience such cases seem highly implausible, even impossible. Potential examples given by philosophers are highly theoretical in nature and invariably involve only events at a microscopic, i.e., quantum mechanical, level of analysis (Healey, 2009: 684-685). Retrocausation will not be considered further in this paper because of its theoretical nature, the fact that its theoretical plausibility does not extend beyond the microscopic scale, and the fact that it therefore seems intuitively and experientially highly implausible if not impossible. Having said that, it should be noted that Pearl's framework does not exclude retrocausation on an a priori basis, but rather leaves the issue to be decided empirically, as we will later see.

2.2 Probabilistic (in)dependence

To describe the asymmetry of probabilistic dependence it is important to understand the concept of probabilistic dependence itself. If two variables are probabilistically dependent, then their values are correlated. As an example, nicotine stained fingers and tar-filled lungs are correlated variables in long-term smokers. That is to say that if the value of the variable of nicotine stained fingers varies, then so does the value of the variable of tar-filled lungs, and vice versa. Probabilistic independence is simply the reverse. If two variables are probabilistically independent of each other, then their values are not dependent, not correlated.

The observed asymmetry of probabilistic dependence associated with causal relations is that the effects of a common cause are probabilistically dependent (as in the example above, taking long-term smoking as the cause), while the causes of a common effect are probabilistically independent of each other (Hausman, 1998). This asymmetry is used extensively in Pearl's framework through the concept of conditioning on variables. If we take two variables, A and B, then looking at the value of B conditioning on A means looking at the value of B given A (formally the probability of B given A: P(B|A), which is called the conditional probabilistic dependence of B on A). What this means is that we look at the value of B provided that we know the value of A, i.e., the value of A is given. Another way of saying this is that we look at the value of B while we control the value of A, i.e., we hold the value of A constant. Now, if we find that the value of B when we already know the value of A differs from its value in a situation in which we do not yet know the value of A, then B is dependent on A or, in other words, A has a probabilistic influence on B. For example, take as B the probability that the door is unlocked and as A the fact that we have just locked the door ². If we now ask a random person for the probability that the door is unlocked (B), then that person should take that value to be 0.5, because it could be either true or false. But if we first tell the person A, then that person will evaluate the probability of B to be much closer to 0 if not truly 0, because it is much more likely that the door is not unlocked if we have just locked it ³. If the value of B is the same given the value of A as it is without the value of A given, then B is probabilistically independent of A (formally: P(B|A) = P(B)). This concludes the explanation of probabilistic dependence for the moment; it will be expanded upon in 2.5, the following chapter and chapter 4, specifically 4.2 and 4.3.

² This is the probability of us locking the door being set to the value 1.

³ The values have been chosen as examples only. In reality, we might evaluate the probability of the door being unlocked without knowledge of whether we locked it (so the value of B without conditioning on the value of A, without the value of A being given) to be lower than 0.5, because people usually lock their doors, for example.
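To make the notation above concrete, the following is a minimal sketch in Python (my own illustration, not from Pearl; the joint probabilities for the door example are invented) of checking probabilistic dependence on a toy joint distribution:

```python
# Toy joint distribution P(A, B) for the door example.
# A = "we have just locked the door", B = "the door is unlocked".
# The numbers are hypothetical and chosen only for illustration.
joint = {
    (True, True): 0.01,   # locked it, yet somehow unlocked
    (True, False): 0.49,  # locked it, and it is indeed locked
    (False, True): 0.40,  # did not lock it, unlocked
    (False, False): 0.10, # did not lock it, yet locked
}

def p(event):
    """Probability of the set of (a, b) outcomes satisfying `event`."""
    return sum(pr for outcome, pr in joint.items() if event(outcome))

p_b = p(lambda o: o[1])                         # P(B)
p_a = p(lambda o: o[0])                         # P(A)
p_b_given_a = p(lambda o: o[0] and o[1]) / p_a  # P(B|A) = P(A, B) / P(A)

print(f"P(B)   = {p_b:.2f}")         # 0.41: plausible without information on A
print(f"P(B|A) = {p_b_given_a:.2f}") # 0.02: near 0 once A is given
# Since P(B|A) differs from P(B), B is probabilistically dependent on A here.
```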

2.3 Manipulability

Woodward gives the following rough definition: "if C causes E then if C were to be manipulated in the right way, there would be an associated change in E" (Woodward, 2009: 234). Intuitively speaking, we can manipulate a cause to produce a corresponding change in its effect, which is what Woodward's rough definition says. Furthermore, it appears that we cannot manipulate the effect to produce a corresponding change in its cause, which seems intuitively and experientially correct (Hausman, 1998: 1). Taken together we have an observed asymmetry of manipulation where causal relationships are concerned. This is not to say that manipulation is inherently asymmetric, only that it is observed to behave in an asymmetric manner with regard to causal relationships.

The observed asymmetry of manipulation with regard to causal relations is particularly intuitive. One of the primary reasons is arguably that this asymmetry is intricately tied to human deliberation and action. Cartwright has argued that human deliberation and action are in pursuit of the targets of strategies. In that sense, knowing what to manipulate in order to reach a particular target is what separates an effective strategy from a useless one. For example, if the goal is to drive a nail into a wall, it makes no sense to hit the nail against the hammer; the very idea sounds absurd. In other words, effective strategies always prescribe the manipulation of causes to achieve a desired end (effect) (Cartwright, 1979).

Another reason why the observed asymmetry of manipulation with regard to causal relations may appear so intuitive is that it corresponds with the temporal asymmetry that causal relations appear to exhibit. Looking at our experience, we can only manipulate matters in the present or plan to manipulate some future event. We cannot, apparently, manipulate events that have already happened, or manipulate present or future events to affect something that has already happened. In this sense, since causes generally precede their effects in time, effects cannot be manipulated in order to affect causes, because the causes are already past. An interesting observation is that in cases of simultaneous causation, such as the stacked cards example in 2.1, it appears that effects can be manipulated to affect causes. Both cards holding the other up are at the same time both cause and effect. That is to say, the right card causes, affects, the left card not to fall, but at the same time the left card causes, affects, the right card not to fall. As a final note for this section, the causal asymmetry of manipulation appears to correspond to the asymmetry of conditional probabilistic dependence. In the example with A and B in section 2.2 above, there is a dependence of B on A. In other words, one can affect B by manipulating A. However, if A and B were probabilistically independent, then, generally, one could not manipulate A to affect B.

2.4 Explanation

According to Hausman, causes can be called upon to explain why their effects occurred, but effects cannot be called upon to explain why their causes occurred (Hausman, 1998: 159). This is an observed explanatory asymmetry with regard to causal relations. Intuitively, when people ask why a particular event happened, they are asking for a cause, not an effect.

Causal explanations are subject to the "level of explanation" concept. As an example, if we want to explain why a car crashed into a safety barrier, we might offer as an explanation that the driver was drunk. We might alternatively say that the neurochemical properties of alcohol altered the driver's brain chemistry in such a way as to impair the driver's motor abilities, causing the car to crash. Both are valid explanations of why the car crashed into the safety barrier, if the driver was drunk and his impaired motor abilities caused the crash, but the level of these explanations differs. The first explanation is more "general", "abstract" or "less detailed" than the second. Almost all causal questions can be answered at various levels of detail. As a general rule it seems intuitively accurate to say that the level of detail has to match the level of detail asked for by the questioner or be more detailed, but cannot be more abstract lest it be insufficiently explanatory to the questioner. The level of explanation will also be referred to as the "granularity": a more abstract or "coarse" granularity corresponds to a more abstract level of explanation, and a more detailed or "fine-grained" granularity to a more detailed level of explanation.

An interesting observation is that while the level of detail of the explanation of why an effect occurred can be varied, the level of detail at which the effect itself is described is held constant. Perhaps a link to manipulation can be seen here, regarding effective strategies. It makes sense to understand how a cause works, because such information can be used for planning and executing effective strategies. It is not a stretch to imagine that more detailed information about a cause may offer strategic benefits by providing knowledge about intermediary causes. In the example with the drunk driver, neurotransmitters may be seen as intermediary causes that can be acted upon. For example, imagine a pill that acts on synapses in such a way as to block certain neurotransmitters, preventing a deterioration in motor skills after drinking alcohol. Gaining such additional information about the effect, by contrast, would seem useless from the standpoint of effective strategies. This observation and example will be revisited in the analysis chapter.

2.5 Screening off

This observed asymmetry makes use of the concept of conditional probabilistic (in)dependence. Take an example of a cause with two effects, such as a long-term smoker having tar-filled lungs and nicotine stained fingers, as in 2.2. The effects will be probabilistically dependent in that case. That is to say, tar-filled lungs and nicotine stained fingers are probabilistically dependent. However, if we condition on the cause, long-term smoking, then the effects become probabilistically independent of each other. That is to say, tar-filled lungs and nicotine stained fingers are conditionally probabilistically independent given long-term smoking. Given the fact of long-term smoking, the dependency between the value of nicotine stained fingers and the value of tar-filled lungs disappears. That is to say, given that someone is a long-term smoker, if we were to clean his or her lungs of tar, he or she would still have nicotine stained fingers. So, given the cause (that someone is a long-term smoker), the two effects are probabilistically independent of each other. The term in the literature for this phenomenon is that causes "screen off" their effects from each other: if we condition on a cause, then the probabilistic dependence among its effects disappears, or, in other words, conditioning on the cause renders the effects probabilistically independent.

The reverse is not true. Conditioning on an effect does not render common causes of that effect independent. Remember that we saw in 2.2 that causes of a common effect are probabilistically independent without conditioning on the effect. Notice that if we condition on a common effect, the causes are rendered probabilistically dependent. In other words, conditioning "reverses" the asymmetry of probabilistic dependence. Causes of a common effect are probabilistically independent, while effects of a common cause are probabilistically dependent, as was explained in 2.2; but when we condition on a common effect, the causes are turned conditionally probabilistically dependent, and when we condition on a common cause, the effects are turned conditionally probabilistically independent. The idea that causes can be probabilistically dependent given a common effect can seem unintuitive, but a simple example will show why this is the case. Consider a situation with A and C as causes and B as their common effect: A → B ← C. In this situation A and C are probabilistically independent, but if we condition on B, A and C become dependent. To see this, take B = A + C and fill in a fixed value for B, in other words, treat the value of B as given or controlled for. Then if the value of A changes, the value of C will have to adjust, considering that we keep B fixed. In other words, A and C are rendered dependent if we condition on B.

This asymmetry is essentially the "reverse" of the probabilistic dependence asymmetry. The reversal is brought about by conditioning, as described above. It is important to keep this concept in mind, because it plays a large role in Pearl's system, as we will see in the next chapter and chapter 4.
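Because this reversal plays a large role later on, a small simulation may help fix ideas. The following is a minimal sketch (my own illustration, not Pearl's; probabilities and sample sizes are arbitrary) exhibiting both halves of the pattern: the effects of a common cause are correlated but become uncorrelated given the cause, while the causes of a common effect are uncorrelated but become correlated given the effect:

```python
import random

random.seed(0)

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

n = 100_000

# Common cause: S -> F and S -> L (smoking, stained fingers, tar-filled lungs).
S = [random.random() < 0.5 for _ in range(n)]
F = [s and random.random() < 0.9 for s in S]
L = [s and random.random() < 0.9 for s in S]
print(corr(F, L))  # clearly positive: effects of a common cause covary

smokers = [i for i in range(n) if S[i]]
print(corr([F[i] for i in smokers], [L[i] for i in smokers]))  # ~0: screened off

# Common effect (collider): A -> B <- C with B = A + C.
A = [random.gauss(0, 1) for _ in range(n)]
C = [random.gauss(0, 1) for _ in range(n)]
B = [a + c for a, c in zip(A, C)]
print(corr(A, C))  # ~0: causes of a common effect are independent

fixed_b = [i for i in range(n) if abs(B[i]) < 0.1]  # condition on B ~ 0
print(corr([A[i] for i in fixed_b], [C[i] for i in fixed_b]))  # ~ -1: dependent
```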

2.6 The Causal Field

The causal field is not an asymmetry per se. Rather it is a phenomenon that is related to the causal asymmetries of probabilistic dependence, explanation and manipulation. The relationship of the causal field to these asymmetries will be explored in depth in chapter 4, but the basics are explained here because of their relationship to the asymmetries just described and to expedite the explanation of Pearl's system in the next chapter.

Every causal inference and all causal talk assumes a particular implicit set of conditions as being met. For example, if someone says to you that striking a match will light a fire, then that person does so under the presumption that there is oxygen available, although that is not explicitly conveyed in the remark itself. The presence of oxygen is a condition that is assumed to be met when the remark is made. All causal inferences and remarks assume such a "causal field" containing the conditions that have to be met for the inference or remark to hold true. In other words, causal inferences and remarks are made in a certain causal field of implicit conditions. This makes intuitive sense, because it would not be workable if, every time we attempted to convey a causal inference or explanation, we had to state each and every causal field condition required for the inference or explanation to hold. In many if not all circumstances the person arriving at a causal inference, or uttering a causal explanation based on a causal inference, will not know all conditions inherent in the causal field. This has some interesting formal and philosophical implications that will be discussed in section 3.3 and chapter 4.

This phenomenon of the causal field corresponds strongly to the phenomenon of "background circumstances" that is important for the observed causal asymmetry of probabilistic dependence. For example, Hausman mentions that, in most circumstances, the ingestion of highly acidic foods is correlated with stomach pains. However, in circumstances in which someone has already ingested an alkali, that correlation can be reversed (Hausman, 1998: 75). In other words, the probabilistic dependence between the ingestion of highly acidic foods and stomach pains holds only in relation to, or is conditional on, specific background circumstances. If something in those background circumstances changes, for example through the ingestion of an alkali, then the probabilistic dependence can change to probabilistic independence or vice versa. This is the same kind of phenomenon as the causal field. The causal field and background circumstances are not technically the same, however, since equating them would presume a complete correspondence between causal relations and probabilistic dependence.


The causal field has an intimate connection to the observed causal asymmetries of explanation and manipulation as well. In the above example with the match, striking it is only an effective manipulation in a causal field where the condition of oxygen presence is met. Similarly, the explanation that striking the match will cause a fire to light makes sense only in a causal field where the condition of oxygen presence is met. Furthermore, regarding explanation it should also be said that the level of explanation utilized has a connection with the causal field. Specifically, if a more abstract level of explanation is utilized, then more conditions are assumed to be met in the causal field. That is to say, the less detailed the explanation, the more conditions are relegated to, and thus assumed to be met in, the causal field. A more detailed explanation has the opposite effect. For example, I may include the condition of oxygen being present in my explanation of why striking a match lights a fire, and as a result that condition is no longer an implicit condition assumed to be met in the causal field, but an explicit one taken up in the causal explanation and causal inference.

The causal field phenomenon raises a number of questions, such as how we know which causal field to assume for a particular causal inference, and how we apparently agree on causal fields for particular causal inferences and explanations intersubjectively, considering that we generally find each other's causal inferences and explanations plausible. Such questions regarding the causal field will be considered in chapter 4, because answering them requires additional material that will be covered in chapters 3 and 4.

2.7 Concluding remarks

This chapter focused on explaining observed asymmetries corresponding to the causal relationship and some key concepts: successive and simultaneous causation as described in 2.1, conditioning and screening off, levels of explanation (granularity) and the causal field. The interrelatedness of the different asymmetries received attention as well. All of these matters will be built upon when Pearl's system is explained in the next chapter and the analysis is conducted in chapter 4.

Note that the above asymmetries have been observed to correspond, each in their own way, to the causal relationship. That does not mean that such a correspondence constitutes a complete mapping, as became clear when simultaneous causation was discussed in relation to the time-asymmetry that we generally associate with causation in 2.1, in which case cause and effect cannot be distinguished based on temporal order. That the asymmetry is a crucial element of the causal relationship and of the inference of those relationships is clear, but that is hardly surprising given the intuition that causality is asymmetric and the consensus in the philosophical community that the causal relationship is asymmetric. The question to be answered is how the asymmetries above are relevant to the inference of causal relationships, and what we can learn from the answer to that question with regard to the philosophical debate about causality.


3 Pearl's Interventionist Framework

This chapter describes Pearl's motivations, his framework and its formal aspects, as well as a number of formal definitions that bear on how asymmetry is relevant in causal inference. The chapter builds on the asymmetries and concepts explained in the previous chapter. For the formal parts below, note that Pearl's method of notation has been adopted. An elementary understanding of probability theory and notation is useful. Unfamiliar or unorthodox notation will be explained through footnotes where applicable. Note that uppercase letters such as X denote variables, and lowercase letters such as x denote particular instantiations of those variables. Definitions, such as those in section 3.3, will be taken directly from Pearl's book Causality (Pearl, 2009), including their numbering. The first section treats the motivation of Pearl's framework and its main ideas. The second section explains the corresponding formal concepts, and the third section covers some of the definitions that pertain specifically to the analysis of Pearl's framework in chapter 4. This allows the reader to first gain an overview, so that the formal matters can be seen in the context of the framework.

3.1 The framework and its motivation

To make sure Pearl's framework is seen in the right light, it helps to see where Pearl is coming from with regard to causality. Pearl started his academic career as a researcher in statistics and applied that knowledge to artificial intelligence research (see for example Pearl, 1988). These academic disciplines have a strong leaning toward an empirical application of causality, focusing on causal inference, because causal inference directly relates to problems encountered in both fields. For example, in statistics an important question is how we can draw causal inferences from correlational data, and in artificial intelligence that question becomes how an artificial autonomous system can learn causal relations from observing and interacting with its environment (Pearl, 2009: 408-409, 412-413). This influence can be seen in the preface of Causality, where Pearl poses what he calls "the two fundamental questions of causality" (see the introduction). From these questions it should be clear that Pearl concerns himself mainly with the more practical point of how causality works. Pearl's goal is to operationalize the notion of causality so that it can be used to answer these two questions in a manner useful for empirical research.

Empirical experimentation aimed at discovering cause-effect relationships is prominently focused on manipulation (Shadish and Cook, 2002: 12-13). Given Pearl's practical and empirical focus, it should not surprise us that he selects manipulation, one of the most widely used and intuitive asymmetries observed in uncovering causal relationships in experimental settings, as the center of his framework. Another reason for using manipulation is connected to the first of the two fundamental questions about causality that Pearl asks (see the introduction). How do we infer causal relationships from experience? Children seem to succeed admirably in this task through interaction (consisting of manipulations) with, and observation of, their environment. Our manipulations allow us to experiment with our environment, albeit, contrary to scientific experimentation, in an uncontrolled manner, and to observe the results of our interventions and of interventions in the environment. For these reasons, Pearl considers the concept of manipulation critical to learning causal relationships from experience. From the above argument it might be concluded that observation plays an important role in Pearl's framework as well, and that is the case, but the main element is manipulation.

Pearl's concept of manipulation is somewhat broader than that ordinarily understood. Usually, theories that give a manipulation-based analysis of causality take as the scope of manipulations all human actions (Woodward, 2009: 238-239). Pearl and other interventionist theorists take a broader view. All actions that manipulate something, not just human actions, count as proper manipulations in Pearl's framework. This means that an earthquake counts as a manipulation on a mountain, for example. To distinguish this broader idea of manipulation from an anthropocentric view of manipulation, the word "intervention" is used. Henceforth, when the word manipulation is used in this paper, it is to be interpreted as an intervention. Both words will be used synonymously.

Pearl's broader use of manipulations may be thought to have adverse consequences for the argument that the observed causal asymmetry of manipulation is intuitive due to its association with effective strategies and hence action (see 2.3). However, this is not the case, because one may still devise effective strategies to achieve a particular effect even though one is incapable of the required manipulation oneself. The required manipulation can be imagined in a hypothetical manner. Many causal inferences occur in a hypothetical setting, where we imagine, for example, what would happen to the earth if it came too close to the sun, even though we could not ourselves put the earth in such a position.


Pearl's motivation and his broader notion of manipulations have now been discussed, almost clearing the path for a basic exposition of how the idea of manipulation is implemented in Pearl's framework. Before giving that exposition, a brief return to the causal field explained in 2.6 is desirable. Remember that it was stated in 2.6 that causal inferences always assume a particular causal field. In Pearl's framework, there is a distinction between the causal elements under consideration, such as the match, the striking of the match and the fire igniting in the example from 2.6, and the elements or conditions of the causal field (the oxygen in the example of 2.6). The causal elements under consideration are together referred to as the "causal system" or "system" in Pearl's framework, or as the "endogenous" variables, whereas the elements or conditions of the causal field are "out of the system", or rather "part of the causal field" or part of the "background circumstances", and are called "exogenous" variables. Now that we have this terminology available, a basic exposition can be given.

Suppose that we take a circuit diagram (see figure 3.1.1 below).

These diagrams are common in engineering, which, according to Pearl, is one of the few disciplines that has never had any trouble with causality (Pearl, 2009: 414). If we manipulate a particular input in the causal system shown by the diagram from 0 to 1 or vice versa, then we can review the consequences on the system and easily calculate the output. Pearl says that the reason for this simplicity is that the diagram consists of a collection of independent or autonomous mechanisms (Pearl, 2009: 414). These mechanisms are simply input-output mechanisms, i.e., mechanisms that relate input and output. This can be seen by taking the most basic circuit diagram of a single switch with one input and one output, which consequently consists of a single independent mechanism. Larger diagrams such as figure 3.1.1 are essentially collections of such independent mechanisms connected to each other in a particular manner. These diagrams readily show the causes, which are manipulations on the system's inputs represented by the inputs of the diagram, and the effects that those manipulations have, i.e., the output that they produce through their influence on the interconnected collection of independent mechanisms in the system.

This view on causality leads Pearl to propose that we "treat causation as a summary of behavior under interventions" (Pearl, 2009: 414). In other words, Pearl proposes that we treat causation as the sum of manipulations on a particular causal system and the effects those manipulations have on the interconnected collection of independent mechanisms that the causal system contains (here represented by the diagram). From this example and Pearl's remark, we can conclude that the essence of Pearl's system revolves around observing the effects of manipulations on independent mechanisms in a particular causal system under consideration. The mechanisms are called independent, even when interconnected in a larger causal system, because each of them can be intervened on directly, without intervening on other independent mechanisms in the system. This should not be confused with the effect such an intervention on a particular independent mechanism has on the system. The intervention will change the output value (by changing the input value) of that particular mechanism, meaning that any intervention will have consequences for other independent mechanisms down the line. The point is that those independent mechanisms down the line could again be independently and directly intervened on. Note that the causal field plays its role by providing the surrounding context for the system. In the case of a causal system represented by a circuit diagram, a lamp for example, a condition that is part of the causal field is the availability of an electric current.

This concludes the basic exposition which captures the essence of Pearl’s view on causality. This particular view informs all formal aspects of Pearl’s framework, which will be introduced in the section below, building on what has just been discussed.
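As a rough illustration of treating causation as a summary of behavior under interventions, the following sketch (my own toy example, not Pearl's; the gates and wiring are invented) models a tiny circuit as a collection of independent input-output mechanisms and intervenes on one input:

```python
# Two independent mechanisms: an AND gate feeding an OR gate.
def and_gate(a, b):
    return a and b

def or_gate(a, b):
    return a or b

def circuit(x1, x2, x3):
    """Interconnected collection of mechanisms: output = (x1 AND x2) OR x3."""
    return or_gate(and_gate(x1, x2), x3)

baseline = circuit(1, 1, 0)    # observe the system: output is 1
intervened = circuit(0, 1, 0)  # intervene: force input x1 to 0
print(baseline, intervened)    # 1 0 -> manipulating x1 changes the output

# Each mechanism can also be intervened on directly and independently: we can
# force the OR gate's first input to 1 regardless of what the AND gate would
# have produced, without touching the AND mechanism itself.
print(or_gate(1, 0))           # 1, whatever x1 and x2 are
```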

3.2 Formal concepts

This section deals with the formal aspects of Pearl's framework, which serve to cast the idea of treating causation as a summary of behavior under interventions into a formal language. A formal language for causal inference allows scientists to explicitly state cause-effect relationships, to model causal systems and causal inferences, and to communicate results on these matters explicitly and objectively. For these reasons Pearl considers a formal language of paramount importance for a better understanding of causality (Pearl, 2009: 404, 412, 427). The subsections below treat three general formal aspects of Pearl's system. The first subsection is dedicated to Pearl's choice of probabilities instead of formal logic for representing causal influences. The second subsection explains Pearl's use of causal diagrams as mathematical tools, based on the circuit diagram example above. The third subsection covers variables, structural equations and causal models as formal representations of input/output points, independent mechanisms and interconnected collections of those mechanisms respectively.

3.2.1 Probabilities

Before we can work with the formal framework that will be discussed in the two subsections below (3.2.2 and 3.2.3), a manner of representation has to be chosen for the values of the inputs and outputs of independent mechanisms. In a circuit diagram such values are 1's and 0's, indicating simply that a current is flowing or that it is not. For computational reasons such a logical approach is ideal. This may be further supported by the intuition that causal relationships are stable and that they seem to have a necessary character. However, we are aware too that all our causal inferences and explanations are subject to an almost infinite if not infinite array of exceptions. In other words, our causal inferences and explanations are subject to a degree of uncertainty that is hard to capture using formal logic. In a formal logic approach every possible exception to a causal inference would have to be explicitly stated in order to arrive at a coherent conclusion. Pearl needs a convenient (read: computationally less intensive ¹) way of accounting for uncertainty and exceptions. In order to do so, Pearl turns to probabilities for representing the input and output variables of independent mechanisms. Pearl mentions that any theory of causality that wishes to incorporate uncertainties and exceptions, represented by "various shades of likelihood", will have to entertain probabilities (Pearl, 2009: 1). He argues that the use of probabilities, which intrinsically permit those exceptions and uncertainties, allows us to focus on the main issues of causality without having to deal with the problems of exception and uncertainty that generally characterize a classic formal logic approach (Pearl, 2009: 2). Furthermore, a probabilistic approach can model a formal logic approach by taking absolute values (0 or 1) if desired (Pearl, 2009: 26).

A word on using probabilities to account for unknown exceptions is in order. Unknown exceptions, by their very nature, cannot be stated explicitly in any computational model. They have to be estimated. This is a task for which probabilities are specifically suited because of their inherent ability to incorporate uncertainties. An important manner in which probabilities are utilized in Pearl's system, apart from as a representation of the values of input/output variables, is as a way to express uncertainty about unobserved facts (Pearl, 2009: 2). Those unobserved facts are conditions of the causal field, values of exogenous variables, as we will see in 3.3 and chapter 4.

¹ Keep Pearl's artificial intelligence background in mind, where computational economy is not only valued but necessary.

Another particularly important reason for probabilities can be illustrated by referring back to 3.1, where it was concluded that Pearl's framework is essentially about manipulating independent mechanisms and observing the effects. Consider that daily life, our standard environment for discovering or learning causal relationships, is an uncontrolled environment (Pearl, 2009: 42). We cannot control which variables are allowed to vary and which we hold steady (control for) in our normal learning environment. This means that we cannot distinguish spurious covariances from causal relationships as would be done (ideally) in scientific experiments (Pearl, 2009: 43). To give an example, how are we to know that when our mother turns a light on by pressing a switch, while our father drops a glass at the same time, it is the pressed switch and not the dropped glass that turns the light on? If we rely purely on observation and witness this event, then it seems we cannot tell, since the only information we have consists of covariances. This problem is exacerbated for Pearl, because he omits temporal information from his framework. His reason for doing so is the possibility of simultaneous causation as discussed in 2.1. Unfortunately, without temporal information a large number of spurious covariances that constitute temporally backwards (future to past) relations cannot be filtered out, although we might wish to do so based on our intuitions, experience and experimental results at a macroscopic level (see 2.1). Due to this decision, most of our observational data is turned into an enormous statistical dataset of covariances between variables ².

² There is an abundance of qualitative information as well, such as color, smell, touch, etc. However, all these qualitative signs covary with other qualitative signs as well, which means that qualitative information cannot provide a direct insight into causal relationships where covariances fail either.

Pearl notes that he will "adhere to the Bayesian interpretation of probability, according to which probabilities encode degrees of belief about events in the world and data are used to strengthen, update, or weaken those degrees of belief" (Pearl, 2009: 2). The debate about different interpretations of probability will not be treated in this paper, partly because a large amount of existing work on the topic, both philosophical and technical, can be consulted, but more importantly because different interpretations do not affect the asymmetry regarding patterns of probabilistic dependence that will be investigated, since such asymmetry is characteristic of probabilistic relationships in general (see 2.2).

3.2.2 Causal graphs

The example at the end of 3.1 with the circuit diagram reveals, according to Pearl, the substantial explanatory power of graphical representations (Pearl, 2009: 415). Graphical representations such as diagrams allow us to encode large bodies of information. Although figure 3.1.1 of the circuit diagram in 3.1 has a particular structure, it allows us to easily imagine alterations that represent specific interventions on the causal system under consideration, and to compute their effects using the hypothetical new diagram. In fact, the diagram in figure 3.1.1 allows for a host of interventions that its authors could not have foreseen or imagined. Diagrams provide a vivid representation of the sets of variables in a system that are relevant to each other in a given state of the system as brought about by a particular set of interventions. Furthermore, diagrams allow us to model the direction of influence between pairs of variables that are connected by an independent mechanism, which is something that algebraic equations, being of a symmetrical nature, cannot do. Another reason for using diagrams to model causal systems is that they can easily capture a particular level of explanation and the relevant causal field. The boundaries of a diagram represent the boundaries of the causal system under consideration. Everything beyond the boundaries is part of the causal field. Diagrams allow us to easily adjust the causal field by taking a smaller part of the diagram or by expanding it, as desired.

The particular graphs that Pearl uses are called directed acyclic graphs (DAGs), causal graphs for short (Pearl, 2009: 13). See figure 3.2.1 for an example of a DAG. The nodes in the graph represent the input/output variables described earlier. These variables are valued probabilistically, as discussed in 3.2.1. The lines between nodes are called edges and represent the independent mechanisms described in the circuit diagram example of section 3.1. In DAGs these edges are always directed, showing the direction of influence. Let us take node X2 as a reference point in figure 3.2.1 to explain a few notions. X1 is a parent and ancestor or predecessor of X2. If X1 were to have parents itself, then those would be ancestors or predecessors of X2. X4 is a child and descendant of X2, while X5 is a descendant of X2. X3 is called a nondescendant of X2, meaning that there is no directed path from X2 to X3.

The directed and acyclic nature of DAGs is expressed by the directed edges (the arrows) and the fact that paths cannot be cyclic, i.e., paths cannot turn back on themselves. Paths are sequences of connected variables. For example, in figure 3.2.1 there is a path from X1 to X5 through X2 and another through X3. A cyclic path could be created by linking X5 back to X1, or by linking X2 back to X1, for example.


Figure 3.2.1
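As an illustration, a DAG like that of figure 3.2.1 can be encoded as a mapping from each node to its parents. The edge set below (X1 → X2, X1 → X3, X2 → X4, X4 → X5, X3 → X5) is reconstructed from the description in the text and may differ in detail from Pearl's actual figure:

```python
# Parents mapping for the DAG sketched above (reconstructed, see lead-in).
parents = {
    "X1": [],
    "X2": ["X1"],
    "X3": ["X1"],
    "X4": ["X2"],
    "X5": ["X3", "X4"],
}

def descendants(node):
    """All nodes reachable from `node` along directed edges."""
    children = {v: [c for c, ps in parents.items() if v in ps] for v in parents}
    seen, stack = set(), list(children[node])
    while stack:
        d = stack.pop()
        if d not in seen:
            seen.add(d)
            stack.extend(children[d])
    return seen

print(descendants("X2"))          # {'X4', 'X5'} (set order may vary)
print("X3" in descendants("X2"))  # False: X3 is a nondescendant of X2
```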

3.2.3 Structural equations and causal models

Historically, DAGs have been used with stochastic relationships between the variables (nodes) in the graphs. However, that does not seem appropriate for modeling causal relationships, because causal relationships have a connotation of stability and necessity. They are generally perceived to be deterministic in the sense that, barring exceptions, a specific cause-effect relationship will turn out the same way every time. This stability was referred to in the introduction. Looking back at the circuit diagram example at the end of section 3.1, we can point out that the independent mechanisms mentioned therein are of a deterministic functional nature. For any specific input they always deliver a specific output, invariably, meaning that the independent mechanisms are not stochastic processes. At the same time, it was discussed in 3.2.1 that probabilities are needed if we are to capture the uncertainties and exceptions that are inherent to causal inferences. Pearl's solution to this dichotomy is to use probabilities to represent the values of variables in causal graphs, and the values of unobserved variables, but to use deterministic functional relationships to represent the independent mechanisms that connect those variables together. In order to actually calculate the effects of sets of interventions on a particular causal system, a mathematical representation for those independent mechanisms is required. In Pearl's framework this representation takes the form of structural equations.

Structural equations are equations of the form x_i = f_i(pa_i, u_i), with i = 1, ..., n, where f_i represents a function of the set of parents pa_i of the variable x_i and of a set of random disturbances u_i modeling the causal field (Pearl,2009:27). A set of equations of this form, in which each equation represents an independent mechanism, is called a structural model. Pearl calls such a model a causal model if each mechanism determines the value of exactly one distinct variable (the dependent variable) (Pearl,2009:27).

Pearl interprets structural equations in a particular way. The algebraic equations used in physics, for example, are symmetrical, but a symmetrical interpretation is not suitable for interventions because, as we observed in 2.3, manipulation relates asymmetrically to causation. Pearl wants to be able to calculate the effects of interventions on specific variables of particular structural equations, independently of the other independent mechanisms represented by the remaining equations. In other words, his approach has to be capable of attaining solutions to asymmetrical input-output mechanisms in order to represent the intuitively asymmetrical aspects of causation. Pearl's solution is to interpret structural equations asymmetrically. The variable x_i on the left hand side of the equality sign of a structural equation is called the dependent or output variable. The function f_i on the right hand side represents the independent mechanism. Since the function takes the set of parents pa_i (and a set of random disturbances u_i) as arguments, the value of the dependent variable is determined by the values of its parents (and the random disturbances). Consequently, interventions are handled on the right hand side of the equation (Pearl,2009:29). This implies that in order to intervene on a particular mechanism we intervene on the parent-variable side of that mechanism. That makes sense: since the input is connected to the output, a specific input will generate a specific output.

To see how an intervention would work in a causal model, let us look at a small example adapted from Pearl (Pearl,2009:66-69). The causal model consists of the variables E1, F and E2, where E1 represents the population of eelworms before fumigant is applied, F represents the amount of fumigant used, and E2 represents the population of eelworms after the fumigant has been applied. The following equations describe the causal model:

E1 = e
F = f_F(E1)
E2 = f_E(E1, F)

The set of random disturbances u_i has been left out for convenience. The model is set up in such a way that the eelworm population before fumigant is applied influences the amount of fumigant that is applied, which in turn influences the eelworm population after the fumigant is applied. If we want to know the effect of the application of a particular level of fumigant on the population of eelworms, then we have to intervene on the level of fumigant used. In other words, we have to intervene on the equation where the variable F is the dependent variable, on the left hand side. The intervention replaces the function on the right hand side of that equation and instead sets a particular value f (so, F = f). Once that value has been set, the effect will cascade through the causal model, in this case affecting the outcome of the last equation, which represents the size of the eelworm population after the fumigant has been applied.
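To illustrate how such an intervention cascades through the equations, here is a minimal sketch in Python. The functional forms of f_F and f_E are made up for the purpose of illustration; the text leaves the actual mechanisms unspecified, and only the shape of the computation matters here:

    def f_F(e1):
        # mechanism: fumigant level chosen in response to the pre-treatment population
        return 0.2 * e1

    def f_E(e1, f):
        # mechanism: post-treatment population, given pre-treatment population and fumigant
        return e1 - 2.0 * f

    def solve(e, do_F=None):
        E1 = e
        # An intervention do(F = f) replaces F's mechanism with a constant,
        # leaving all other equations untouched.
        F = do_F if do_F is not None else f_F(E1)
        E2 = f_E(E1, F)
        return E1, F, E2

    print(solve(e=100))           # observational regime: (100, 20.0, 60.0)
    print(solve(e=100, do_F=45))  # under do(F = 45): (100, 45, 10.0)

Note that the equation for E1 is untouched by the intervention: setting F removes only F's own mechanism, which is exactly why the remaining equations can still be used to compute the downstream effect.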

3.2.4 Remarks

Looking back at the circuit diagram example at the end of 3.1, we now see the formal representations that Pearl uses to substantiate the main idea of his framework. Causation, for Pearl, is a summary of how independent mechanisms behave when they are manipulated and the results are observed. Causal structures, consisting of collections of partially interconnected independent mechanisms, are modeled by directed acyclic graphs together with a corresponding set of structural equations, one equation mathematically describing the relevant independent mechanism for each variable (node) of the causal structure. It is important to note that Pearl uses probabilities to express uncertainty about unobserved facts and to express the values of variables, but that he uses deterministic functional mechanisms to relate those variables. In this way Pearl combines the best of both worlds: on the one hand his framework can account for uncertainty and exceptions through probabilities, on the other it satisfies our intuitions and experiences regarding the stable and apparently deterministic nature of causal relationships. Note that saying that causal relationships are deterministic is an assumption on Pearl's part, based on intuition and experience. A further clue as to why Pearl holds this position can be found in the circuit diagram example at the end of 3.1: the mechanisms in such a circuit diagram are deterministic as well. This position, that causal relationships are deterministic, is not universally shared; proponents of probabilistic causality would say that causality is not deterministic but indicates a high likelihood that E happens if C happens (Williamson,2009). Finally, a brief reminder about the distinction between a causal system and a causal model: the causal system is the actual system under consideration, as it exists in reality; a causal model is a mathematical representation of that system, in the form of a directed acyclic graph and a corresponding set of structural equations.

3.3 Formal definitions

This section will discuss the formal definitions of Pearl's framework, which relate to the material discussed above; for example, formal definitions of causal structure and causal model will be given. The discussion in 3.2 led to the conclusion that Pearl uses both probabilistic and deterministic causal information to model causal systems in his framework. The reasons for wanting both have been explained, but not yet how probabilistic and deterministic information go together. This will be explained using the formal definitions regarding Markov properties in Pearl's system below, leading up to the Causal Markov Condition (CMC). What formally constitutes inferred causation in Pearl's framework will be discussed as well. After that, some temporal definitions will be stated for the upcoming analysis. The section concludes with a brief summary and a lead-up to chapter 4, where the analysis will be conducted.

3.3.1 Preparing for the CMC

Before embarking on a tour of definitions that lead up to the CMC, it is useful to revisit the use of probabilities in Pearl's framework. As was discussed in subsection 3.2.1, it is not just because of uncertainty and exceptions that Pearl uses probabilities; observation plays an important role as well, due to the covariant nature of observational data. This leaves Pearl with the problem of deriving causal relationships from probabilistic data: he has to establish a correspondence between probability distributions over variables and a causal structure consisting of a partially interconnected collection of independent mechanisms relating those variables, as described in 3.2.2-3.2.4. In 3.2.2, DAG's were described as representing causal models, consisting of variables (displayed as nodes) related by structural equations (see 3.2.3), but the story is more complicated. We have to work with statistical datasets consisting of covariances between variables, in the form of a joint probability distribution over a set of variables. This means that we have to build a DAG G from a particular joint probability distribution P that ranges over a set V of variables. From a computational perspective, this quickly becomes intensive: storing an arbitrary joint probability distribution P(x_1, ..., x_n) over n dichotomous variables explicitly requires a database with 2^n entries, which rapidly becomes an unthinkably large number (Pearl,2009:13)³. Imagine instead that each variable of a joint probability distribution depends on just a small subset of the other variables in the distribution. Not only could considerable economy be achieved; more importantly in relation to the research question, such dependence information allows us to decompose a large distribution into smaller distributions (each involving a small subset of variables) and then to piece those smaller distributions together, in a structural manner, as a DAG (Pearl,2009:13-14).

³This problem is exacerbated when discrete variables are the object of discussion, which is more likely considering the fact that Pearl wants to be able to capture ”various shades of likelihood” (Pearl,2009:1).
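The economy involved can be made concrete with a small back-of-the-envelope sketch in Python. The parent assignment below is hypothetical (each variable depending on at most its two immediate predecessors); the point is only the contrast between one explicit table for the joint distribution and a collection of small per-variable conditional tables:

    # Storage for an explicit joint table over n dichotomous variables
    # versus a factorized, per-variable representation.
    n = 20
    # hypothetical parent sets: at most the two immediate predecessors
    parents = {i: list(range(max(0, i - 2), i)) for i in range(n)}

    explicit_entries = 2 ** n
    factorized_entries = sum(2 ** (len(pa) + 1) for pa in parents.values())

    print(explicit_entries)    # 1048576 entries for the full joint table
    print(factorized_entries)  # 150 entries across all conditional tables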

We do not want to alter the structure inherent in the raw data when decomposing and piecing together: the aim is to preserve the structure captured, through observation, in the form of covariances, as expressed by the joint probability distribution, and to bring that structural order to the fore. DAG's offer a decomposition scheme to that effect. Suppose that we have a distribution P defined on n discrete variables, which may be arbitrarily ordered as X_1, X_2, ..., X_n. The chain rule of probability calculus allows a decomposition of P as a product of n conditional distributions (formula (1.30)):

P(x_1, ..., x_n) = ∏_j P(x_j | x_1, ..., x_{j−1})
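As a quick sanity check, the following sketch verifies this decomposition numerically for three dichotomous variables, using an arbitrary, made-up joint distribution:

    import itertools
    import random

    random.seed(0)
    # a made-up joint distribution over three dichotomous variables
    weights = {v: random.random() for v in itertools.product([0, 1], repeat=3)}
    total = sum(weights.values())
    P = {v: w / total for v, w in weights.items()}

    def marginal(assign):
        # probability that the first len(assign) variables take the values in assign
        return sum(p for v, p in P.items() if v[:len(assign)] == assign)

    for v in P:
        # the product of P(x_j | x_1, ..., x_{j-1}) over j = 1, 2, 3
        product = 1.0
        for j in range(3):
            product *= marginal(v[:j + 1]) / marginal(v[:j])
        assert abs(P[v] - product) < 1e-12  # the chain rule recovers P exactly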

Suppose that the conditional probability of a variable X_j of that distribution P is not sensitive to all of its predecessors, but only to a small subset of them, called PA_j. This set PA_j is called the Markovian parents of X_j, or parents for short (Pearl,2009:14). This allows us to write (formula (1.31)):

P(x_j | x_1, ..., x_{j−1}) = P(x_j | pa_j)

meaning that, for specifying the probability of X_j, we only have to concern ourselves with the possible realizations of the set PA_j instead of all possible realizations of its predecessors X_1, ..., X_{j−1}. The terms predecessor and parent apply here as defined in 3.2.2. Recall the discussions about conditioning and screening-off in 2.2 and 2.5 respectively: if we condition on a variable's parents, that variable is rendered independent of its remaining predecessors. Referring back to figure 3.2.1, we can see this in operation in a plausible causal system as we might encounter it in our everyday environment: given the observed asymmetry of screening-off, X3 is independent of X2 given X1. This is formally captured in definition 1.2.1 below.


Definition 1.2.1 (Markovian Parents)

Let V = {X_1, ..., X_n} be an ordered set of variables, and let P(v) be the joint probability distribution on these variables. A set of variables PA_j is said to be Markovian parents of X_j if PA_j is a minimal set of predecessors of X_j that renders X_j independent of all its other predecessors. In other words, PA_j is any subset of {X_1, ..., X_{j−1}} satisfying formula (1.32)

P(x_j | pa_j) = P(x_j | x_1, ..., x_{j−1})

and such that no proper subset of PA_j satisfies formula (1.32) (Pearl,2009:14). Definition 1.2.1 assigns to a variable X_j a select set of predecessors PA_j that is sufficient for determining the probability of X_j (Pearl,2009:15). This assignment can then be represented by a DAG, in which the variables are nodes and the parent-child⁴ relationships are denoted by directed edges, as explained in 3.2.2. In fact, definition 1.2.1 suggests a simple method for constructing such a DAG. Take figure 3.2.1 from 3.2.2 as an example. Starting with the pair X1, X2, we draw an arrow from X1 to X2 if the latter is dependent on the former. Continuing to X3, no arrow is drawn if X3 is independent of X1 and X2; otherwise, if X2 screens off X3 from X1 (not the case in figure 3.2.1), then an arrow is drawn from X2 to X3, or, if X1 screens off X3 from X2 (as is the case in figure 3.2.1), then an arrow is drawn from X1 to X3. This process of selecting, from all the predecessors of a particular variable X_j, a minimal set of parents PA_j that screens X_j off from its predecessors, and then drawing an arrow from each member of PA_j to X_j, continues until no more minimal sets can be found (Pearl,2009:15). Thinking back to 2.2 and 2.5, this method of construction implies that DAG's are defined as carriers of conditional probabilistic independence relationships along the order of construction, since it uses the observed asymmetry of probabilistic dependence and screening-off (Pearl,2009:15). Note that, since any pair of variables can be used to start the construction process, a number of differently structured DAG's may result from a single distribution P. Consequently, Pearl still has to find a way


to decide which DAG correctly corresponds to the underlying causal structure. This issue will be covered in 3.3.4.
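To fix ideas, here is a sketch of this construction procedure in Python. The conditional-independence oracle indep is hypothetical: in practice its answers would have to come from statistical analysis of the data, and the resulting DAG depends on the variable ordering chosen, which is exactly the ambiguity just noted:

    from itertools import combinations

    def build_dag(variables, indep):
        # variables: an ordered list X1, ..., Xn
        # indep(x, y, given): hypothetical oracle telling whether x is
        # independent of y conditional on the set 'given' under P
        parents = {}
        for j, xj in enumerate(variables):
            preds = variables[:j]
            pa = list(preds)
            # find a smallest subset of predecessors that screens xj off
            # from all of its other predecessors (definition 1.2.1)
            for size in range(len(preds) + 1):
                screening = [c for c in combinations(preds, size)
                             if all(indep(xj, x, set(c))
                                    for x in preds if x not in c)]
                if screening:
                    pa = list(screening[0])
                    break
            parents[xj] = pa  # draw an arrow from each member of pa to xj
        return parents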

In order for the construction process to be possible, a correspondence between the joint probability distribution P and the DAG G has to hold. This is important because, if Pearl is to use DAG's as tools for causal modeling and inference, they have to explain the body of empirical data represented by P. The product decomposition of a joint probability distribution over a set of variables is not order-specific. This means that we can test whether P decomposes into the product (formula (1.33))

P(x_1, ..., x_n) = ∏_i P(x_i | pa_i)

without referencing variable ordering. Consequently, for DAG G to be a causal network of P all that is necessary is that P admit the product decomposition dictated by G (Pearl,2009:16). Formally:

Definition 1.2.2 (Markov Compatibility)

If a probability function P admits the factorization of (1.33) relative to DAG G, we say that G represents P, that G and P are compatible, or that P is Markov relative to G (Pearl,2009:16).
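Definition 1.2.2 lends itself to a direct, brute-force test. In the hypothetical representation below, P is a dict mapping full value assignments (tuples) to probabilities, and parents maps each variable's index to the indices of its parents in G:

    def is_markov_relative(P, parents, tol=1e-9):
        # P: dict from full assignments (tuples) to probabilities
        # parents: dict from variable index i to the indices of PA_i in G
        def prob(fixed):
            # marginal probability of the partial assignment {index: value}
            return sum(p for v, p in P.items()
                       if all(v[i] == val for i, val in fixed.items()))

        for v, p in P.items():
            # the product of P(x_i | pa_i) dictated by G, as in (1.33)
            product = 1.0
            for i, pa in parents.items():
                denom = prob({k: v[k] for k in pa})
                numer = prob({**{k: v[k] for k in pa}, i: v[i]})
                product *= numer / denom if denom > 0 else 0.0
            if abs(p - product) > tol:
                return False  # P does not admit the factorization dictated by G
        return True  # P is Markov relative to G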

Pearl states the following theorem about what is required for Markov Compatibility to hold:

Theorem 1.2.7 (Parental Markov Condition)

A necessary and sufficient condition for a probability distribution P to be Markov relative to a DAG G is that every variable be independent of all its nondescendants in G, conditional on its parents. (We exclude X_i when speaking of its ”nondescendants”) (Pearl,2009:19).

What Theorem 1.2.7 says is that definition 1.2.2 is met if X_i is probabilistically independent of every variable in the set of variables V in P and thus, by virtue of
