
Complex Reward Learning: Relations Can Function As Reinforcers


Academic year: 2021


COMPLEX REWARD LEARNING: RELATIONS CAN FUNCTION AS REINFORCERS

Word count: 16,461

Matthias Raemaekers

Student number: 01411318

Supervisor: Prof. Dr. Jan De Houwer

A dissertation submitted to Ghent University in partial fulfilment of the requirements for the degree of Master of Sciences: Theoretical and Experimental Psychology


Preamble concerning COVID-19

Data collection for this paper had already been completed before Corona measures were imposed. Aside from being forced to work in a slightly different environment from the one I am used to, the practical side of my work did not change. Hence, there is no reason why this work would be judged any differently than it would have been if the Corona crisis had never occurred. This preamble was agreed upon by the student and the supervisor.


Abstract

At the functional level, learning has been defined as changes in the behavior of an organism, due to regularities in its environment. Whereas most learning research has focused on effects of individual regularities, it has recently been argued that combinations of regularities can also elicit behavioral changes. These joint effects of multiple regularities were referred to as complex learning. This novel notion inspires the systematic investigation of complex learning effects, including research on the functions of relations. In the current paper, we investigated whether a relation can serve as a reinforcer in a conditioned reinforcement procedure. In four experiments (N = 160) pairs of symbols were first paired with either monetary gain or no gain, based on whether symbols of a pair were identical or not. In a subsequent test phase, we examined whether this identity-relation between the symbols served as a (conditioned) reinforcer. The results of Experiment 1 provided strong support for the hypothesis that relations can acquire the function of (conditioned) reinforcer, which can be seen as an instance of a complex learning effect. In Experiments 2 to 4, an operant learning procedure was used instead of the pairing procedure during the first phase of the experiment. Although the results in these experiments were less straightforward, overall, they also support our conclusion that relations can function as (conditioned) reinforcers. More research is required to understand the moderators of establishing this function. Further extensions and applications of this research in the context of psychotherapy, functional and cognitive psychology are discussed.

Keywords


Research on the human (and nonhuman) ability to learn has held a central place in psychological science ever since its establishment as an independent scientific field. Today, the insights and knowledge produced in the field of learning psychology remain relevant for other fields such as, for instance, the study of memory processes, motivation and reward learning, as well as psychopathology (e.g., Eichenbaum & Cohen, 2004; Wassum, Ostlund, Balleine & Maidment, 2011; Maia & Frank, 2011; Redish, Jensen, Johnson & Kurth-Nelson, 2019). Despite this long tradition, new insights still emerge. Recent theoretical developments in the field of functional learning psychology have produced novel hypotheses and questions that require investigating, and that themselves might influence current research in other fields. In the current paper, we aimed to answer some of the questions put forward by these recent developments.

First, it is important to note that the research described in the current paper is situated at the functional level of analysis. That is, we aim to understand behavior as a function of the environment (De Houwer, 2011; see also Hayes & Brownstein, 1986; Chiesa, 1992). The current findings might also influence research at the cognitive level of explanation, at which the relation between the environment and behavior is explained in terms of mental processes (for a discussion of the interplay between these levels of analysis, see De Houwer, 2011, 2018; De Houwer & Barnes-Holmes, 2018; Hughes, De Houwer & Perugini, 2016a). De Houwer, Barnes-Holmes and Moors (2013) recently proposed a functional definition of learning as ontogenetic adaptation, that is, behavioral changes during the lifetime of an organism that result from regularities (i.e., anything that is more than one event at one point in time and space) in its environment. This definition allows researchers to directly differentiate types of learning based on the nature of these regularities (see De Houwer et al., 2013, for a discussion of the merits of this definition).

In general, three comprehensive classes of regularities can be identified in the learning literature: (1) regularities in the presence of one stimulus across different moments in time, (2) regularities in the presence of two or more stimuli at the same time or at different moments in time, and (3) regularities in the presence of behavior and a stimulus, again, at one point in time or across multiple moments in time (De Houwer et al., 2013). When learning is defined as changes in behavior that result from these regularities in the environment, it follows that we can also classify distinct forms of learning, based on the nature of the regularity. Effects of the first type of regularity, typical examples of which are habituation and sensitization (i.e., a decrease or increase, respectively, in the intensity of a response to a stimulus due to repeated presentation of that stimulus; see Thompson & Spencer, 1966; Rankin et al., 2009 for reviews), are referred to as non-associative learning. Secondly, effects of a regularity in the presence of two or more stimuli are typically referred to as classical conditioning (or Pavlovian conditioning, named after Pavlov, who first described this type of learning; Pavlov, 1927). Finally, effects of regularities in the occurrence of behavior and a stimulus are referred to as operant conditioning (Skinner, 1938).

Note that these three classes of learning effects involve the effect of a single regularity on behavior. However, it has recently been suggested that more complex learning effects also exist, in which behavioral changes arise due to specific interactions between multiple regularities in the environment (i.e., the joint effect of multiple regularities; see De Houwer and Hughes, in press, for a discussion). Considering these complex learning effects would increase the flexibility with which one can describe the environment, and also prompt researchers to re-evaluate known learning effects. Let us consider sensory preconditioning as an example. Initially, two conditioned stimuli (CSs; e.g., a light and a tone) are paired (i.e., the first regularity), followed by the pairing of the second of those events (e.g., the tone) with an unconditioned stimulus (US, e.g., an electric shock). As a result of this procedure, the first CS (e.g., the light) will evoke a conditioned response (CR, e.g., fear) even though it never co-occurred with the US. This changed response regarding the first CS is the joint result of two regularities. Neither regularity alone would produce the effect.

De Houwer et al. (2013) proposed to group such learning effects, that arise due to the joint impact of multiple regularities, as instances of moderated learning (other examples are higher-order conditioning, conditioned reinforcement and Pavlovian-to-instrumental transfer, see De Houwer & Hughes, in press). More recently, Hughes, De Houwer and Perugini (2016b) studied similar effects in the context of evaluative learning. The authors noted that in many instances of moderated learning, the two regularities share an element. This was referred to as intersecting regularities, which can be considered a fourth type of regularities. Hence, the behavioral changes that result from intersecting regularities can be considered a fourth type of learning effects. Furthermore, it appears as though the function an event fulfills in a regularity (e.g., its valence in the experiments by Hughes et al., 2016b) can be transferred to another event through such intersections. However, more research is needed to determine what functions can be transferred in such procedures, and how these might be moderated.

De Houwer and Hughes (in press) identified another subclass of complex learning effects that they referred to as effects of meta-regularities. Meta-regularities are regularities of which at least one element is a regularity itself. Considering such effects as a separate class of learning effects again significantly increases our options and flexibility to describe the relationship between behavior and the environment. Investigating effects of meta-regularities also allows us to examine the functions that regularities and relations between events in the environment can have. In addition to introducing the concept of complex learning, De Houwer and Hughes (in press) also defined relational learning as a type of complex learning, more specifically, a subclass of effects of meta-regularities.

From the preceding discussion of the different types of regularities, it follows that anything that is more than one event at one point in time and space constitutes a regularity. When multiple events occur, we can identify different relations between those events. As discussed by De Houwer and Hughes (in press), these relations can be conceived of as elements of the environment, just like individual stimuli or regularities. For instance, say that two shapes are presented on a screen as part of a task. One could identify multiple relations between those two shapes, for example, with regard to their identity (i.e., are they the same or different shapes), their size, their color, and so on. However, identifying relations between events not only allows us to describe the environment more elaborately. It also allows us to investigate the functions that these relations can have in relational learning. Relational learning refers to changes in behavior that result from regularities in which a relation has the function of a stimulus (De Houwer & Hughes, in press). Because a relation requires the presence of a regularity (i.e., more than one stimulus), instances of relational learning are by definition effects of meta-regularities.

In order to illustrate these ideas, let us consider a standard matching-to-sample (MTS) procedure, as shown in Figure 1. Participants are instructed to select one of the two comparison stimuli (presented at the bottom) based on its relation to the sample stimulus (presented in the center) and receive feedback for doing so in the training phase. In the example depicted in Figure 1, choosing the stimulus that is physically identical to the sample stimulus is considered a correct response. In other words, an identity-relation has the function of a discriminative stimulus (Sd), indicating which response (i.e., choosing the symbol that is identical to the sample) is reinforced (see Skinner, 1938). Following multiple training trials, participants also pick the comparison stimulus that matches the sample stimulus in the test phase, even though those stimuli were never presented before. This change in behavior results from an operant meta-regularity in which a relation functions as an Sd. More specifically, there is a standard regularity in the presence of two stimuli (i.e., the sample stimulus and the identical comparison stimulus) which is part of a meta-regularity in which one aspect of the standard regularity (i.e., the identity relation between the sample and comparison stimulus) signals which response should be emitted (i.e., pick the comparison stimulus that is identical to the sample stimulus).

The function of Sd is just one of the many possible functions that relations could have as part of a meta-regularity. For instance, a relation could, in principle, also function as a conditioned stimulus or CS, a reinforcing stimulus or Sr, an occasion setter, and so on. However, only very few of those functions have been systematically investigated in the context of relational learning. Future research should aim towards systematically mapping moderators of the acquisition of these functions (e.g., extinction, generalization, etc.). Our main focus for the current manuscript was the function of Sr. In operant conditioning, as it was formulated by Skinner (1938), a Sr is a stimulus that increases the probability of a response (R) that immediately preceded it. The act of presenting the Sr is referred to as reinforcement (Kelleher & Gollub, 1962).

Importantly, the function of Sr is not necessarily inherent to specific stimuli. An event can also acquire the function through pairing (or chaining) with another Sr as is, for instance, illustrated by the prototypical magazine training procedure. In this procedure, a lever press (R) produces the sound of the pellet feeder, which is paired with the availability and consumption of food (e.g., Skinner, 1938; Pierce & Cheney, 2008). As a result of the contingency between the lever pressing and the sound, the frequency of lever pressing increases, even if it is never followed by food. This effect suggests that the sound is functioning as a reinforcer because it had previously co-occurred with food.

Figure 1. Illustration of a MTS-procedure, adapted from De Houwer & Hughes (in press)

In the training phase, participants receive feedback on their responses (correct responses are indicated by the arrows below the symbols). In this example, participants are required to select the comparison stimulus (bottom) that is physically identical to the presented sample (top). In the testing phase, feedback is no longer presented.


In general, we speak of conditioned reinforcement when an event increases the probability of a response because of a conditioning history between that event and a currently effective Sr (e.g., food; Pierce & Cheney, 2008, p. 270). To our knowledge, no previous research has assessed whether relations can have the function of a (conditioned) Sr. Therefore, in the current paper, we investigated whether the relation between a pair of symbols could acquire this function.

Experiment 1

In our first experiment, we aimed to assess whether a relation could be established as a conditioned Sr through a simple pairing procedure. We did so by means of four unique blocks, which were referred to as stimulus pairing, reinforcement, generalization and equivalence. Participants were instructed to pay close attention to what would be presented on the screen, because the amount of money they earned would be a function of what they would see and the responses they would make. In the stimulus pairing block, pairs of either identical or non-identical symbols were paired with monetary gain or no gain (from here on referred to as “gain”- and “no-gain”-cues, respectively). This block served as a learning phase, in which the contingency of what relation was paired with monetary gain (referred to as the “‘gain’-relation”) was manipulated.

In the second block (i.e., reinforcement), participants were presented with two response-options, one of which produced identical pairs, the other non-identical pairs. We expected that, if participants successfully learned which relation was paired with monetary gain and which was not, they would consistently choose the response that produced the ‘gain’-relation. To verify that it was in fact the identity-relation between the symbols that served as the Sr, as opposed to any other aspect of the symbols, we tested whether this expected pattern of responding generalized to novel responses and novel symbol-pairs in the third block (i.e., the generalization block). The procedure for this block was similar to that of the previous block, but the symbol-pairs in this block were made up of symbols that had not been presented previously. If the relation between the symbols functioned as the Sr, we would expect to observe the same pattern of responding as in the reinforcement block. Furthermore, to prevent participants from relating the relations to the specific keyboard responses, they were required to respond verbally in this block.

Finally, participants completed a MTS-task in the last block (i.e., the equivalence block). This block was not intended to test the Sr-function, but rather to assess whether participants were able to respond in line with the contingencies from the first block. They were instructed to match the presented symbol-pair (pairs from the set used in the generalization block) with the cues for ‘gain’ or ‘no-gain’. No feedback was presented after their choices. We expected participants would select the cue for ‘gain’ or ‘no-gain’ as a function of the presented sample, consistent with the co-occurrence of the relations and the cues in the stimulus pairing block. For example, if identical symbols were paired with ‘gain’, we expected them to pick the ‘gain’-cue instead of the cue for ‘no-gain’ when a pair of identical symbols was presented, and vice versa when a pair of different symbols was presented.

Method

Participants and design.

Forty participants (33 women) were recruited through the Sona Research Participation System of Ghent University, with ages ranging from 18 to 26 years (M = 20.67, SD = 1.83). Participants were paid €5 for completing this experiment and another¹ (from a different master’s project), which together took no more than half an hour. All participants spoke Dutch fluently (assessed by the researcher), provided informed consent and were explicitly told they were allowed to quit whenever they felt uncomfortable, in line with the ethics protocol of Ghent University. The experiment entailed a within-subjects design (see Procedure for counterbalancing). We investigated the effect of a contingency between a relation and monetary gain on the frequency of participants’ responses producing that relation. Responses and reaction times were registered.

Materials.

Stimuli. Each participant encountered stimuli from three separate stimulus sets. One set was used in the first two blocks, whereas the second and third sets were used for the latter two blocks. Each set was based on six distinct ASCII-symbols that were combined into triplets (e.g., “^^^”). These triplets were then again combined into pairs of either identical (e.g., “^^^ ^^^”, thus, six distinct pairs in each set) or non-identical (e.g., “^^^ &&&”, eighteen pairs in total in each set) triplets. Currency symbols were not used to exclude the possibility that there was any other relation between the pairs than their physical identity. To indicate participants’ earnings, we used “+€0.10” as the cue for ‘gain’ and “+€0.00” as a cue for ‘no gain’. Response options were the ‘E’ and ‘I’ buttons on the keyboard. Verbal responses were two² one-syllable non-words (‘boo’ and ‘kef’), chosen to be easily discriminable and to have no inherent value, meaning or affect. All stimulus material can be found in Supplementary Materials.

1 In one session, participants completed this experiment along with another experiment from a different, unrelated Master’s project. The order of the experiments was counterbalanced between participants. In between experiments, the participants were instructed to contact the researcher who set up the other experiment, asked them to take a short break, and instructed them that the two studies were unrelated.
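The construction of a stimulus set described in the Stimuli paragraph can be sketched in a few lines of Python. This is only an illustration, not the Inquisit code used in the experiment. The symbols “^”, “&” and “%” appear in the examples in the text; the remaining three symbols, the function name `build_stimulus_set` and the random selection of the eighteen non-identical pairs are assumptions, since the text does not state how those pairs were chosen.

```python
import itertools
import random

# "^", "&" and "%" appear in the text's examples; the other three symbols
# are hypothetical placeholders (any non-currency ASCII symbols would do).
SYMBOLS = ["^", "&", "%", "#", "@", "*"]


def build_stimulus_set(symbols, n_non_identical=18, seed=0):
    """Return (identical_pairs, non_identical_pairs) of symbol triplets."""
    triplets = [s * 3 for s in symbols]  # e.g. "^" -> "^^^"
    identical = [f"{t} {t}" for t in triplets]
    # All ordered pairs of two different triplets (30 for six symbols).
    candidates = [f"{a} {b}" for a, b in itertools.permutations(triplets, 2)]
    # Assumption: the non-identical pairs are a random subset of the
    # candidates; the source does not say how they were selected.
    rng = random.Random(seed)
    non_identical = rng.sample(candidates, n_non_identical)
    return identical, non_identical


identical, non_identical = build_stimulus_set(SYMBOLS)
print(len(identical), len(non_identical))  # 6 18
print(identical[0])  # ^^^ ^^^
```

With six symbols there are six identical pairs, which matches the count given in the text; the eighteen non-identical pairs are a subset of the thirty possible ordered combinations.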

Software. The experiment was coded in and controlled by the Inquisit 4 (2015, retrieved from https://www.millisecond.com) software package. Reproducible code used for the measures was preregistered on the Open Science Framework (OSF) prior to data collection³. The acquired data set was analyzed using R (version 3.6.1; R Core Team, 2019), with the effsize (version 0.7.6; Torchiano, 2020), ggpubr (version 0.4; Kassambara, 2020), rcompanion (version 2.3.25; Mangiafico, 2020), reshape2 (version 1.4.4; Wickham, 2007) and boot (version 1.3-22; Canty & Ripley, 2019) packages. Preregistered and post-hoc analysis scripts were submitted to OSF (see footnote 3).

Hardware. The experiment was executed on a Dell™ laptop (Precision M4700, Intel Core i5). The participants were seated approximately 70 cm from the 15.6-inch screen (1920x1080 resolution). Left and right responses were registered with the ‘E’ and ‘I’ keys on the keyboard, respectively.

Procedure.

Participants were invited to take a seat in front of the computer, asked to provide informed consent and to enter their demographic information (i.e., age and gender). Subsequently, they were instructed that they would perform a task in which the amount of money they would receive was a function of the stimuli they would see and the buttons they pressed. The experiment itself consisted of two phases⁴, each of which contained four blocks: stimulus pairing, reinforcement, generalization and equivalence. Afterwards, a short questionnaire, intended to assess participants’ explicit awareness of the contingencies and demand compliance, was administered (see Supplementary Materials). An example trial structure is illustrated in Figure 2.

2 As per the preregistered design, we intended to use four unique, verbal responses, to counterbalance them between phases. However, because of an error in translating the scripts, the same two responses were used in both phases.

3 Both the master (English) and the translated (Dutch) measures and analysis scripts are available at: https://osf.io/bwk74/

4 In the original, pre-registered design, the experiment consisted of two phases of four blocks, for the purpose of implementing a reversal learning design. However, after data-collection for Experiments 1 and 2 was completed, some errors in the counterbalancing of the response-consequence mappings, in both experiments, were detected. Because these errors complicated the interpretability of our results, we opted to drop the reversal design and only make use of the first phase that each participant completed for our analysis (data for the second phase participants completed were analyzed to ensure we did not observe any inconsistencies, and were reported in the Supplementary Material). The same logic was applied to Experiments 2 and 3 for consistency, whereas in Experiment 4, only one phase was included in the design.

At the start of the stimulus pairing blocks (i.e., the learning phase), participants were instructed to carefully attend and observe the stimuli that appeared on the screen, and to learn the relations between them. On every trial, a pair of triplets of ASCII-symbols from the first stimulus set (six symbols, six identical pairs and 18 non-identical pairs) was presented in white font in the center of the screen for 1500ms (1° visual angle). The pairs consisted of either two identical (e.g., “&&& &&&”) or non-identical (e.g., “&&& %%%”) triplets. Subsequently, for another 3000ms, either the ‘gain’- (i.e., “+€0.10”, in green; 1°) or ‘no-gain’ cue (i.e., “+€0.00”, in red; 1°) was presented two degrees above the pair of triplets, as a function of the contingency in the current phase. This block consisted of 20 trials with a 1500ms inter-trial-interval (ITI), after which participants proceeded to the testing blocks. We manipulated whether identical or non-identical pairs co-occurred with ‘gain’ (i.e., the ‘gain-relation’ versus the ‘no-gain-relation’).
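As a rough illustration of the trial structure of a stimulus pairing block, the following Python sketch builds a trial list from the timings given above. It is not the original Inquisit script. The function name `pairing_block`, the equal split between identical and non-identical trials, and the reading that the pair stays on screen while the cue is shown are assumptions; the text only specifies the number of trials and the durations.

```python
import random

PAIR_MS, CUE_MS, ITI_MS = 1500, 3000, 1500  # timings from the text
N_TRIALS = 20


def pairing_block(gain_relation="identical", n_trials=N_TRIALS, seed=0):
    """Build a shuffled trial list for one stimulus pairing block."""
    # Assumption: equal numbers of identical and non-identical trials.
    relations = ["identical", "non-identical"] * (n_trials // 2)
    random.Random(seed).shuffle(relations)
    return [
        {
            "relation": rel,
            # The cue depends only on the relation, not on the symbols.
            "cue": "+€0.10" if rel == gain_relation else "+€0.00",
            "duration_ms": PAIR_MS + CUE_MS + ITI_MS,
        }
        for rel in relations
    ]


trials = pairing_block(gain_relation="identical")
total_s = sum(t["duration_ms"] for t in trials) / 1000
print(len(trials), total_s)  # 20 120.0
```

Passing `gain_relation="non-identical"` gives the reversed contingency. Under these assumptions one block lasts about two minutes.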

For the reinforcement blocks (16 trials, 1000ms ITI), participants were instructed to make a left (‘E’) or right (‘I’) response, which would be followed by the presentation of a pair of symbols. Their goal was to maximize their earnings. The instruction to “press ‘E’ or ‘I’” was presented in the center of the screen (1° visual angle, yellow font) on every trial, until either button was pressed. Pressing the ‘E’ vs. ‘I’ keys resulted in the 1500ms presentation of either a pair of identical symbols or a pair of non-identical symbols (white, 1° visual angle, in the center of the screen). These pairs comprised symbols from the same stimulus-set as the stimulus pairing block.
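The dependent measure for a reinforcement block, the number of responses that produced the ‘gain’-relation, can be sketched as follows. The function name `count_gain_responses` and the response sequence are hypothetical; the key-to-relation mapping shown is one of the counterbalanced mappings, and the chance level of eight out of sixteen trials is the one used in the Results section.

```python
CHANCE_LEVEL = 8  # eight of sixteen trials, as used in the analysis


def count_gain_responses(responses, key_to_relation, gain_relation):
    """Count key presses that produced the relation paired with 'gain'."""
    return sum(key_to_relation[key] == gain_relation for key in responses)


# One of the counterbalanced mappings: 'E' -> identical, 'I' -> non-identical.
mapping = {"E": "identical", "I": "non-identical"}
# Hypothetical responses from one participant over 16 trials.
responses = list("EEIEEEEIEEEEEEIE")
score = count_gain_responses(responses, mapping, gain_relation="identical")
print(score, score > CHANCE_LEVEL)  # 13 True
```

The same scoring applies to the generalization block once the verbal responses (‘boo’ and ‘kef’) are substituted for the key presses.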

In the generalization blocks (16 trials, 1000ms ITI), we aimed to investigate whether the expected pattern of responding in the reinforcement blocks would generalize to novel responses and stimuli. The procedure, timings and psychophysical properties of the stimuli in this block were identical to the reinforcement block, with two exceptions. First, the symbol-pairs presented following participants’ responses were now symbol pairs selected from the second or third stimulus-sets. This was to verify that it was in fact the identity-relation between the elements in a pair, and not the physical stimuli themselves, that influenced responding. Second, to ensure that participants did not simply respond based on a perceived contingency between ‘gain’ and the specific response-keys in the reinforcement blocks, participants were instructed to make one of two verbal responses (i.e., “Say ‘boo’ or ‘kef’”).

4 (continued) Illustrations of the design of Experiments 1 to 3 (Figures 2, 4 and 6, respectively) show both contingencies participants could experience in their first phase, but also show how participants would complete them in subsequent phases (i.e., in the preregistered design). Furthermore, sample sizes were, aside from practical considerations, based on a priori power analyses, which were thus also based on the old design (i.e., a paired sample t-test). Post-hoc power analyses for the new design, taking exclusions into account, were reported in the Results sections.

Finally, in the equivalence blocks (16 trials, 500ms ITI), we tested whether participants could respond in line with the contingency between the relations (i.e., identical versus non-identical symbols) and the ‘gain’ or ‘no-gain’ cues in a MTS-like task. Their instruction was to choose one of two targets (i.e., the ‘gain’ or ‘no-gain’ cue, presented in the bottom right and left corner) by making a left (“E”) or right (“I”) response. A trial started with the presentation of the sample, an identical or non-identical pair of symbols (white, 1°, in the center of the screen), followed by the presentation of the ‘gain’ (1°, in green) and ‘no-gain’ (1°, in red) cues on the bottom left- and right-hand sides, respectively (randomized between trials). The sample and target cues remained on the screen until the participant had responded.

Figure 2. Trial structure for both contingencies in Experiment 1.

Each row shows one of the two possible contingencies experienced by participants (i.e., in the stimulus pairing blocks). In the top panel, identical symbols were paired with ‘gain’ (and non-identical pairs with ‘no-gain’), whereas in the bottom panel, the contingency is reversed. This contingency (i.e., technically, the order of contingencies, in the preregistered design) was counterbalanced between participants. In the reinforcement blocks, the left key (“E”) produced identical pairs, whereas the right key (“I”) produced non-identical pairs. Again, this mapping (i.e., reinforcement a versus b) was counterbalanced between participants (but see footnote 4), adding up to four unique block orders. Note: arrows underneath the response options in the equivalence blocks indicate correct responses.

Results

Participants completed a task in which pairs of either identical or non-identical symbols were paired with monetary gain. To examine whether this identity-relation could have the function of a Sr, we investigated their responses in the blocks subsequent to the learning phase. In particular, we compared the frequency with which participants selected the response that produced the relation paired with the ‘gain’-cue to the chance-level (i.e., eight correct responses out of sixteen) by means of a two-tailed, one-sample t-test (or non-parametric equivalent). While it was unlikely that the frequency of participants’ responses for the ‘gain’-relation would decrease, and thus a one-tailed test seemed appropriate, we opted for a two-tailed test to allow for this possibility, and to decrease the possibility of Type I errors on the positive end of the distribution. We also computed effect sizes (Cohen’s d or non-parametric equivalent) and one-sided confidence intervals (CI) to aid interpretation of results. Data for one participant were excluded because they satisfied our exclusion criterion⁵, leaving a sample size of 39 (33 women, M = 20.67, SD = 1.83 years). Post-hoc⁶ power analyses in G*Power (version 3.1; Faul, Erdfelder, Buchner & Lang, 2009) demonstrated that a sample size of 39 provided .80 power to detect effects of Cohen’s d > 0.46 and .95 power to detect effect sizes Cohen’s d > 0.59, for a two-tailed, one-sample t-test.
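The exclusion rule referred to above (see footnote 5) can be expressed as a short check. This is a minimal sketch and not the original analysis script: the function name `should_exclude` and the reaction times are hypothetical, and it assumes that the revised criterion applies the same 300ms and ten-percent thresholds to the equivalence-block trials only.

```python
def should_exclude(rts_ms, threshold_ms=300, max_fast_share=0.10):
    """True if more than 10% of trials were answered faster than 300 ms."""
    fast = sum(rt < threshold_ms for rt in rts_ms)
    return fast / len(rts_ms) > max_fast_share


# Hypothetical equivalence-block reaction times (ms), 16 trials each.
careful = [850, 640, 720, 910, 560, 700, 680, 790,
           630, 740, 810, 590, 660, 720, 880, 610]
hasty = [210, 250, 640, 190, 720, 230, 280, 600,
         240, 700, 220, 260, 650, 270, 680, 200]
print(should_exclude(careful), should_exclude(hasty))  # False True
```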

5 The pre-registered exclusion strategy was to remove data from participants that responded faster than 300ms on more than ten percent of all trials. However, this was later revised, because participants could legitimately give the same response on every trial in both the reinforcement and generalization blocks, allowing for very fast, but accurate responding. Therefore, exclusions were only based on response times in the equivalence blocks.

6 A priori power analyses, based on the original, preregistered design (see footnote 4), demonstrated that a sample size of 40 provided .95 power to detect effects of Cohen’s d > 0.53 and .80 power to detect effects of d > 0.40, using a paired-samples t-test.

In the reinforcement block, participants, on average, selected the response that resulted in the relation paired with gain above chance-level (M = 11.97, SD = 4.21; see Figure 3A). Results of the one-sample t-test suggested that this difference was significant, t(38) = 5.90, p < .001, d = 0.94, 95% CI = [0.26; 1.63]. Consistent with this result, the frequency of responses that produced the ‘gain’-relation in the generalization block (M = 12.08, SD = 4.05; see Figure 3B) was also significantly larger than chance-level, t(38) = 6.29, p < .001, d = 1.01, 95% CI = [0.32; 1.70]. Finally, for the equivalence block, we investigated whether participants responded in line with the contingency they experienced in the learning phase, in a MTS-task. Results of our analysis provided strong support that this was the case (M = 14.85, SD = 1.99; see Figure 3C), t(38) = 21.44, p < .001, d = 3.43, 95% CI = [2.42; 4.45].
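The relation between the reported means, standard deviations, t-values and effect sizes can be checked from the summary statistics alone. The sketch below uses the standard formulas for a one-sample t-test, t = (M − μ0) / (SD / √n), and Cohen’s d = (M − μ0) / SD, with n = 39 and a chance level of μ0 = 8. It is an illustration in Python rather than the R scripts used for the analysis, and the function name `one_sample_summary` is hypothetical.

```python
import math


def one_sample_summary(mean, sd, n, mu0):
    """Return (t, df, d) for a one-sample t-test from summary statistics."""
    d = (mean - mu0) / sd   # Cohen's d for a one-sample design
    t = d * math.sqrt(n)    # same as (mean - mu0) / (sd / sqrt(n))
    return t, n - 1, d


# Means and standard deviations as reported for each block.
blocks = {
    "reinforcement": (11.97, 4.21),
    "generalization": (12.08, 4.05),
    "equivalence": (14.85, 1.99),
}
for name, (m, sd) in blocks.items():
    t, df, d = one_sample_summary(m, sd, n=39, mu0=8)
    print(f"{name}: t({df}) = {t:.2f}, d = {d:.2f}")
```

Because the inputs are the rounded values printed in the text, the output agrees with the reported statistics only up to rounding (for example, t ≈ 5.89 rather than 5.90 for the reinforcement block).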

Discussion

The aim of the current paper was to expand the literature on complex learning effects. In particular, we investigated whether a relation can have the function of a conditioned Sr. To recapitulate, we speak of (standard) conditioned reinforcement when an event increases the probability of a response that produces that event, solely because of the conditioning history of that event (e.g., because it was previously paired with an effective reinforcer). In our first experiment, we investigated whether the identity-relation between a pair of symbols could be established as a conditioned Sr by pairing it with monetary gain, and then testing the effect of this contingency on participants’ responses in subsequent blocks.

Results of our analysis provided strong support for the hypothesis. We observed that, in the reinforcement block, participants more frequently (above chance-level) selected one response over another, as a function of the contingency they experienced in the stimulus pairing block. In other words, the probability of participants selecting the response that produced the gain-relation increased, which is exactly what conditioned reinforcement entails. To our knowledge, this is the first report of relations functioning as conditioned reinforcers. Furthermore, this pattern of responding also generalized to novel (verbal) responses, and to relations between novel symbols in the generalization block. Results from these blocks with never before presented stimuli confirm our hypothesis that it was indeed the identity-relation between the symbols (as opposed to any other aspect of the stimuli) acting as the Sr. Finally, we also observed that, in the MTS-task in the equivalence block, participants responded as if the relation and the ‘gain’- and ‘no-gain’-cues were equivalent, which is referred to as equivalence responding⁷ (see Sidman, 1994, 2009). The equivalence class, in this case, entailed a relation and an individual stimulus, and appears to arise as a result of the pairing procedure. Equivalence responding is an important notion in theories of relational learning. The implications of our findings will be more thoroughly discussed in the general discussion.

Figure 3. Raw response frequencies with means and confidence intervals for Experiment 1.

For each block, the frequency with which individual participants selected the response that produced the ‘gain’-relation is represented by black dots. Red dots indicate the sample mean for each block, with 95% confidence intervals in black. The red, dashed line indicates the chance-level. Note: * p < .05; ** p < .01; *** p < .001.

Experiment 2

The results of Experiment 1 provided strong evidence for the hypothesis that a relation can have the function of a conditioned Sr, and that it can acquire this function through a simple pairing procedure. For Experiment 2, the aim was to quasi-replicate and extend these findings, now using a different learning procedure to establish the function of conditioned Sr. The (passive) pairing procedure at the start of each phase in Experiment 1 (i.e., the stimulus pairing blocks) was replaced by an operant learning procedure. In the first block of each phase, participants performed an MTS-task in which, on each trial, a pair of either identical or non-identical symbols was presented as the sample, and the cues for monetary ‘gain’ and ‘no-gain’ as the comparison stimuli. Participants were instructed to pick one of the comparison stimuli and received feedback based on their responses, which they could use to learn the relevant contingency in the task. Analogous to Experiment 1, we then tested whether the identity-relation between elements in a pair served as a conditioned Sr, whether this effect would generalize to novel stimuli and responses, and whether participants could successfully respond in line with the contingency from the first block in another MTS-task without feedback.

⁷ Technically, this was a simplified procedure to test for equivalence responding, because we only tested one direction (i.e., choose ‘gain’ versus ‘no-gain’ in the presence of the sample), but not the other way around. For a discussion of this, see Sidman (2009).


The three testing blocks that followed the learning procedure (i.e., reinforcement, generalization and equivalence) were mostly identical to those in Experiment 1, aside from a few changes, which are discussed in more detail in the Method section. We expected participants to respond in a pattern similar to the pattern observed in Experiment 1. Because we replaced the passive learning procedure, which required participants to simply observe the stimuli, with an operant learning procedure that required participants to actively choose one of two responses and learn the contingency based on the feedback provided, we expected this procedure to be more effective and the pattern of unimodal responding to be even more strongly expressed in this second experiment.

Method

Participants and design.

Thirty participants (20 women) were recruited through the Sona Research Participation System of Ghent University, with ages ranging from 18 to 32 years (M = 21.55, SD = 2.64 years). Sample size was determined based on availability of resources and power analyses in G*Power, using the results of the previous study⁸. All participants spoke Dutch fluently (as assessed by the researcher), provided informed consent, and were explicitly told that they were allowed to quit whenever they felt uncomfortable, in line with the ethics protocol of Ghent University. They were paid €5 for completing this experiment and another⁹, which together took no more than half an hour. The experiment entailed a within-subjects design (see Procedure for counterbalancing), in which we examined the effect of a contingency between a relation and monetary gain on the frequency of participants’ choices resulting in that relation. Responses and reaction times were registered.

Materials.

The stimuli used in this experiment were identical to those used in Experiment 1, as were the response-keys on the keyboard (see Materials for Experiment 1 and Supplemental Materials). One exception was that for the generalization blocks in this experiment, four unique verbal responses (i.e., “boo”, “kef”, “nom” and “gez”) were used for counterbalancing purposes, instead of only two as in Experiment 1 (the result of an erroneous translation, see footnote 2). Furthermore, the experiment was administered using the same hardware, and controlled and analyzed using the same software as in Experiment 1. Reproducible code for the experiment and analysis was preregistered on OSF¹⁰ prior to data-collection.

⁸ Experiment 1 demonstrated that point estimates of effect size on all dependent variables, based on the preregistered, paired-sample t-test, were Cohen’s d > 1.0. A priori power analyses demonstrated that a sample of 30 participants provided 0.99 power to detect effect sizes of Cohen’s d > 0.95.

⁹ In one session, participants completed this experiment along with another experiment from a different, unrelated Master’s project. The order of the experiments was counterbalanced between participants. In between experiments, the participants were instructed to contact the researcher who set up the other experiment, were asked to take a short break, and were told that the two studies were unrelated.

Procedure.

Participants were invited to sit in front of the computer, were asked to provide informed consent and to enter their demographic information (i.e., age and gender). Subsequently, they were instructed that they would perform a task in which the amount of money they would receive depended on the stimuli they would see and the choices they made. The experiment itself consisted of two phases, each containing four blocks: MTS-training, reinforcement, generalization and equivalence. Afterwards, a short questionnaire was administered to assess participants’ explicit awareness of the contingency and whether they felt influenced by the researcher in any way (also available under the Supplementary Materials). An example trial structure is illustrated in Figure 4. At the start of the first block (i.e., MTS-training), participants were instructed that they would see a sample presented in the center of the screen, and that they had to pick one of the comparison stimuli on the top-left or top-right side of the screen by pressing the E- or I-key, respectively. They were further instructed to use the feedback, presented after their choice on every trial, to learn the relations between the stimuli. On each trial, a pair of identical (e.g., “&&& &&&”) or non-identical (e.g., “&&& \\\”) triplets of symbols from Stimulus Set 1 was first presented (1°) in white, in the middle of the screen. Subsequently, the ‘gain’- (“+€0.10”) and ‘no-gain’-cue (“+€0.00”) were presented (1°) in white at the top left and top right of the screen (positions randomized between trials). When participants had pressed a key to select either the ‘gain’- or ‘no-gain’-cue, a “correct” (in green font) or a “wrong” (in red font) feedback message was presented (height = 1°) for 750ms or 1500ms, respectively. Participants completed up to five blocks of 16 trials each (500ms ITI). When they responded correctly on at least 13 out of 16 trials within a block, or finished all five blocks without reaching this criterion (binomial probability of passing by chance = .01), they moved on to the next task.

¹⁰ Both the master (English) and the translated (Dutch) measures and analysis scripts are
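As a quick check on the chance-level figure for the training criterion, the sketch below computes the binomial probability of 13 or more correct responses in a single 16-trial block under random responding (p = .5). The function name is ours, and the five-block figure rests on an added assumption (independent blocks) that is not part of the reported analysis; it is included for illustration only.

```python
from math import comb


def prob_at_least(k_min: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# Chance of meeting the 13-out-of-16 criterion in one block by guessing.
p_block = prob_at_least(13, 16)

# Illustrative assumption (not from the text): up to five independent
# attempts at the criterion, each with the same chance of success.
p_any_of_five = 1 - (1 - p_block) ** 5

print(f"single block: {p_block:.3f}")
print(f"within five blocks: {p_any_of_five:.3f}")
```

The single-block value rounds to the reported .01; the cumulative value is somewhat larger because a guessing participant gets several attempts at the criterion.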

Figure 4. Trial structure for both contingencies in Experiment 2.

Each row shows one of the two possible contingencies experienced by participants (i.e., in the MTS-training blocks). In the top panel, selecting the ‘gain’-cue when a pair of identical symbols was presented was followed by positive feedback (and the ‘no-gain’-cue for non-identical pairs was followed by negative feedback), whereas in the bottom panel, the contingency was reversed. This contingency (i.e., technically, the order of contingencies, in the preregistered design) was counterbalanced between participants. In the reinforcement blocks, the left key (“E”) produced identical pairs, whereas the right key (“I”) produced non-identical pairs. Again, this mapping (i.e., reinforcement a versus b) was counterbalanced between participants (but see footnote 4), adding up to four unique block orders. Note: arrows underneath the response options in the equivalence blocks indicate correct responses.

The reinforcement blocks (16 trials, 1000ms ITI) in this experiment were identical to those in Experiment 1 (for exact stimulus presentation and timing characteristics, see the Method section of Experiment 1). Here too, the mapping of left and right responses resulting in identical versus non-identical pairs (i.e., whether ‘E’ resulted in an identical pair and ‘I’ produced a non-identical pair, or vice versa; referred to as reinforcement a versus b) was counterbalanced between participants. Analogous to Experiment 1, the generalization blocks in Experiment 2 were very similar in procedure to the reinforcement blocks (see Method section of Experiment 1 for details). In this experiment (unlike in Experiment 1, where an error was made), there were two sets of verbal responses (i.e., ‘boo’ and ‘kef’ versus ‘nom’ and ‘gez’) to reduce the possibility of effects driven by familiarity with these words (e.g., if ‘boo’ produced the pair of symbols that was paired with monetary gain in Phase 1, participants might prefer to choose ‘boo’ in the second phase as well). The response-consequence mappings were counterbalanced between participants.
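The two counterbalanced factors in this experiment (the order of contingencies and the reinforcement a versus b key mapping) cross to give four block orders. The sketch below enumerates them and rotates participants through the resulting cells. The condition labels and the rotation scheme are hypothetical; the text does not state how participants were actually allocated to cells.

```python
from itertools import product

# Hypothetical labels for the two counterbalanced factors.
CONTINGENCY_ORDERS = ("identical_gain_first", "non_identical_gain_first")
KEY_MAPPINGS = ("a", "b")  # reinforcement a versus b

# Crossing the two factors yields the four unique block orders.
CONDITIONS = list(product(CONTINGENCY_ORDERS, KEY_MAPPINGS))


def assign_condition(participant_index: int) -> tuple:
    """Rotate through the cells so they fill evenly (an assumed scheme)."""
    return CONDITIONS[participant_index % len(CONDITIONS)]


for index in range(8):
    print(index, assign_condition(index))
```

A simple rotation keeps the cells balanced whenever the sample size is a multiple of four; with 30 participants, two of the cells would receive one extra participant.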

Again, in the equivalence blocks (16 trials, 500ms ITI), we tested whether participants were able to reproduce the contingency they learned in the first block, by means of an MTS-like task. Compared to the equivalence blocks in Experiment 1, a few small changes were made. The colors of the target cues (i.e., green and red) were removed and the target stimuli were moved to the top of the screen. Participants were still instructed to pick one of two cues (i.e., the ‘gain’ or ‘no-gain’ cue) by making a left (i.e., the “E”-button) or right (i.e., the “I”-button) response, but the targets were now presented in the top left and top right corners of the screen (as opposed to the bottom left and right in Experiment 1). A trial started with the presentation of the sample, a pair of identical or non-identical symbols (center of screen, text height = 1°) from Stimulus Set 2, followed by the presentation (1°) of the ‘gain’ and ‘no-gain’ cues, which appeared in white (as opposed to green and red, respectively) on the top left- and right-hand sides (randomized between trials). The sample and target cues remained on the screen until the participant had responded.

Results

In this experiment, an identity-relation between a pair of symbols (i.e., identical or non-identical symbols) served the function of Sd in the training phase (i.e., an MTS-task), indicating which of the two cues (i.e., ‘gain’ or ‘no-gain’) to select. Participants experienced this contingency in the MTS-training blocks and were presented with feedback on their responses. To investigate whether an identity-relation could function as an Sr, we examined their responses in the blocks that followed this learning phase (i.e., reinforcement, generalization). We also examined whether participants could respond in line with the contingency (i.e., the equivalence block). Our analytic strategy for this experiment was identical to that of the previous experiment (see Results section for Experiment 1). Data for nine participants were discarded: for one participant because the data were incomplete¹¹, and for the other eight because they did not reach the preregistered accuracy-criterion for the learning procedure within five blocks. This resulted in an analytic sample of 21 (14 women), with ages ranging from 19 to 32 years (M = 21.71, SD = 2.78 years). Post-hoc power analyses in G*Power demonstrated that a sample size of 21 provided .80 power to detect effect sizes of Cohen’s d > 0.64 and .95 power to detect effect sizes of d > 0.83, for a two-tailed, one-sample t-test.

¹¹ More specifically, data for one of the reinforcement blocks were missing for this participant, likely due to an error that occurred during either the saving or transferring of the data.

Because of the small sample size that resulted from the exclusions, we first tested the normality of the data using the Shapiro-Wilk test (Shapiro & Wilk, 1965). For the reinforcement blocks, the distribution of the data did not significantly deviate from a normal distribution (W = 0.93, p = .16). The one-sample t-test showed that participants selected the response that produced the ‘gain’-relation significantly above chance-level, t(20) = 2.45, p = .02, Cohen’s d = 0.53, 95% CI = [-0.39, 1.46] (M = 10.14, SD = 4.02 out of 16 responses, see Figure 5A). However, the 95% confidence interval for the effect size included zero, so we cannot conclude that this effect was meaningful, which was likely, in part, due to the small sample size. The data for the generalization block were not normally distributed (Shapiro-Wilk test, W = 0.79, p < .001). Therefore, to compare the median frequency of participants’ responses to chance level, we used a two-tailed, one-sample Wilcoxon signed-rank test and r (Rosenthal, 1994) as a measure of the effect size. Participants selected the response that produced the ‘gain’-relation significantly more frequently than chance-level (Mdn = 8, MAD = 2.97, see Figure 5B), V = 101.5, p = .02, r = 0.52, 95% CI = [0.15; 0.74]. The distribution of the equivalence block data also deviated from a normal distribution (Shapiro-Wilk test, W = 0.76, p < .001). Results of the non-parametric Wilcoxon test showed that the median frequency with which participants selected the response producing the ‘gain’-relation (Mdn = 15, MAD = 1.48, see Figure 5C) was significantly higher than chance-level, V = 210, p < .001, r = 0.87, 95% CI = [0.82; 0.90].
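To make the link between the descriptive statistics and the reported test concrete, the following sketch recomputes the one-sample t statistic and Cohen’s d for the reinforcement block from the summary values given above. It assumes a chance level of 8 out of 16 responses (two response options) and the standard definitions t = (M − μ0) / (SD / √n) and d = (M − μ0) / SD; the function name is ours, and this is not the original analysis script.

```python
from math import sqrt


def one_sample_summary(mean: float, sd: float, n: int, mu0: float):
    """Return (t, df, d) for a one-sample t-test from summary statistics."""
    diff = mean - mu0
    t_value = diff / (sd / sqrt(n))  # standard error of the mean in the denominator
    cohens_d = diff / sd             # standardized mean difference from mu0
    return t_value, n - 1, cohens_d


# Reinforcement block, Experiment 2: M = 10.14, SD = 4.02, n = 21.
# Chance level of 8 (half of 16 trials) is an assumption stated in the lead-in.
t_value, df, cohens_d = one_sample_summary(10.14, 4.02, 21, 8.0)
print(f"t({df}) = {t_value:.2f}, d = {cohens_d:.2f}")
```

Because the inputs are themselves rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.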


Discussion

Our aim for this second experiment was to quasi-replicate our findings from Experiment 1, now establishing the function of conditioned Sr through an operant learning procedure. Instead of the passive pairing procedure from Experiment 1, participants completed an MTS-task in which they were instructed to match identical or non-identical symbol-pairs (presented as the sample) to the ‘gain’- or ‘no-gain’-cues (presented as the targets), and were provided with feedback to learn from. Aside from this new learning procedure, the design of the subsequent testing blocks remained the same as in Experiment 1, except for minor changes to the equivalence block. In the reinforcement and generalization blocks, we tested whether the identity-relation between the symbols (a regularity) actually functioned as a conditioned Sr, increasing the probability of a response that produced it as a function of the contingency learned in the MTS-task. Furthermore, we tested whether participants would treat the relations and the cues for ‘gain’ and ‘no-gain’ as equivalent in the equivalence block.

In contrast to the results of Experiment 1, our findings in this experiment were not as straightforward. Results from the one-sample t-test suggested that participants selected the response producing the ‘gain’-relation significantly more frequently than chance-level in the reinforcement block. However, the confidence interval for the effect size indicated that the effect was not meaningful, and thus, our findings were ambiguous. We did observe that, in the generalization blocks, the frequency with which participants selected a certain response increased as a function of the contingency they learned in the training blocks. Whereas the results of the reinforcement block appear to trend in the expected direction, the inconsistency between analyses might, in part, result from the small sample size. The fact that we did observe the expected response pattern in the generalization block (i.e., the identity-relations between novel symbols did function as Sr’s), even though it did not clearly appear in the reinforcement blocks, does support our hypothesis that relations can in fact function as an Sr, in line with our results from Experiment 1. Further investigation is needed to explain why we observed these inconsistencies.

Figure 5. Raw response frequencies with means and confidence intervals for Experiment 2.

For each block (i.e., the three panels), the frequency with which individual participants selected the response that produced the ‘gain’-relation is represented by black dots. Red dots indicate the sample mean for each block, with 95% confidence intervals in black. The red, dashed line indicates the chance-level. Note: * p < .05; ** p < .01; *** p < .001.

Furthermore, and in line with what we observed in Experiment 1, our results demonstrated that participants were able to respond in line with the contingency they learned, in an MTS-like task without feedback (i.e., the equivalence block). They responded as if the identity-relation and the ‘gain’- or ‘no-gain’-cues were equivalent. This equivalence relation appears to be established as a result of the operant MTS-procedure, in which correct responses were reinforced (see Sidman, 2000). As noted previously, the relevance of this finding in the context of relational learning will be discussed in the general discussion. Generally speaking, our findings resemble those of Experiment 1, but provide less clear support for our hypothesis that a relation can function as an Sr. Considering that we expected this new procedure to be more effective than the passive learning procedure in Experiment 1 (i.e., the stimulus-pairing block), the observed inconsistencies pose an even bigger problem. Given these difficult-to-interpret findings, we set up a follow-up experiment. To resolve the questions posed by the current findings, we made a number of methodological changes to the design of this experiment.

Experiment 3

In an attempt to resolve the inconsistent findings in Experiment 2, we set up an online quasi-replication, in which we made a few small methodological changes. First and foremost, the errors in counterbalancing of response-consequence mappings, discovered after conducting Experiments 1 and 2 (see footnote 4), were corrected. Second, because this experiment was conducted through an online platform, the generalization blocks were left out, as we would not be able to register verbal responses. Furthermore, we reversed the changes made to the equivalence blocks going from Experiment 1 to 2 (see Method section for Experiment 2), to exclude the possibility that these changes had any influence on the inconsistent results of Experiment 2. Finally, we made two more changes to focus participants’ attention on the identity-relation between the symbols in a pair. We opted to present the pairs surrounded by a white rectangle, and we made a subtle change to the instructions, in which the stimuli, instead of being referred to as “symbol-pairs”, were now referred to as “pairs of 2x3 symbols (e.g., xxx yyy)”.

Method

Participants and design.

Thirty participants (18 women) were recruited using the Prolific Academic online platform (http://www.prolific.ac/), with ages ranging from 18 to 51 years (M = 31.73, SD = 9.9). Sample size was determined based on availability of resources and power analyses in G*Power, using the results of the previous study (see footnote 8). All participants provided informed consent (online) and were informed that they were allowed to quit whenever they felt uncomfortable, in line with the ethics protocol of Ghent University. They were paid £5 for completing this experiment. The experiment entailed a within-subjects design (see Procedure for counterbalancing), in which we investigated the effect of a contingency between a relation and monetary gain on the frequency of participants’ choices resulting in that relation. Responses and reaction times were registered.

Materials.

As noted previously, the generalization blocks were not included in this experiment, and thus, only one set of stimuli was used. The symbol-pairs were identical to those of the first stimulus set used in Experiments 1 and 2 (see Supplementary Material). However, the pairs were now surrounded by a white rectangle in order to encourage participants to relate the two elements of the pair. The ‘gain’- and ‘no-gain’-cues were changed to “+£0.05” and “+£0.00”, respectively, to be compatible with the online platform. The responses were identical to those in Experiments 1 and 2, with the exception of the verbal responses. Nothing changed with regard to the software used for analysis or to code and control the experiment. Reproducible scripts for the latter were preregistered on OSF¹².

Procedure.

Participants were asked to provide informed consent and demographic information (gender and age) before proceeding to the task. They were instructed that the amount of money they would receive at the end of the experiment depended on the stimuli they would see and the buttons they pressed. The experiment itself consisted of two phases (however, only the first phase is discussed, see footnote 4), each containing three blocks: MTS-training, reinforcement and equivalence, which were followed by one question that assessed participants’ explicit awareness of the contingency (i.e., the first question of the questionnaire in Experiments 1 and 2, see Supplementary Materials). An example trial structure for this experiment is illustrated in Figure 6. The first block, MTS-training (up to five blocks of 16 trials, 500ms ITI), was mostly identical to that in the previous experiment (see Method section for Experiment 2). The only difference was that the symbol-pairs presented as the samples were now surrounded by a white rectangle (8° by 1.75°) to increase the probability that the two triplets of a pair would be related in terms of their identity. The same accuracy-criterion was applied for participants to proceed to the next block.

¹² Scripts used for analysis and coding of the measures can be found at: https://osf.io/9fuh3/?view_only=e21ec82258df448ebacb8dc8815d44da

Analogously, the reinforcement blocks (16 trials, 1000ms ITI) in this experiment were identical to those in the previous experiments (see Method section of previous experiments for details), aside from the rectangle that was added around the symbol pairs. Following the abovementioned reversal of the changes made to the equivalence blocks in Experiment 2, these blocks closely resembled those in Experiment 1 (see Method section of Experiment 1 for details). Here too, the only difference was that the symbol-pairs, presented as the samples in the MTS-task, were surrounded by a white rectangle (8° by 1.75°). After completing the block, participants were instructed that they would start a new phase, in which the contingencies of the previous phase were no longer relevant. For all blocks, the instructions were subtly changed wherever they mentioned the symbol-pairs, to increase participants’ attention to the relation between them. Instead of simply referring to them as “symbol-pairs”, we referred to them as “pairs of 2x3 symbols (e.g., xxx yyy)”, which we hoped would also increase the probability that the triplets would be related.


Figure 6. Trial structure for both contingencies in Experiment 3.

Each row shows one of the two possible contingencies experienced by participants (i.e., in the MTS-training blocks). In the top panel, selecting the ‘gain’-cue when a pair of identical symbols was presented was followed by positive feedback (and the ‘no-gain’-cue for non-identical pairs was followed by negative feedback), whereas in the bottom panel, the contingency was reversed. This contingency (i.e., technically, the order of contingencies, in the preregistered design) was counterbalanced between participants. In the reinforcement blocks, the left key (“E”) produced identical pairs, whereas the right key (“I”) produced non-identical pairs. Again, this mapping (i.e., reinforcement a versus b) was counterbalanced between participants, adding up to four unique block orders. Note: arrows underneath the response options in the equivalence blocks indicate correct responses.

Results

Participants completed a task similar to that in Experiment 2. Our aim, to investigate whether a relation can have the function of (conditioned) Sr, was the same as before, and thus, the analytic procedure was identical to that of the previous experiments (see Results section for Experiment 1). Data for nine participants, who did not reach the accuracy-criterion for the operant learning procedure within five blocks, were discarded. This resulted in a sample of 21 (14 women), with ages ranging from 18 to 49 years (M = 29.05, SD = 9.38 years). Given this small sample size, data were first checked for deviations from normality by means of the Shapiro-Wilk test. Neither the responses for the reinforcement blocks (W = 0.89, p = .02) nor those for the equivalence blocks (W = 0.77, p < .001) were normally distributed. Therefore, we tested the median response-frequency against chance-level with the one-sample Wilcoxon signed-rank test. In the reinforcement block, the median frequency with which participants selected the response that produced the ‘gain’-relation (Mdn = 8, MAD = 1.48, see Figure 7A) was not significantly higher than chance-level, V = 60, p = .32, r = 0.22, 95% CI = [-0.23; 0.59]. For the equivalence block, our analysis did yield a significant effect, V = 312.5, p < .001, r = 0.84, 95% CI = [0.78; 0.90], showing that participants responded in line with the contingency significantly more accurately (Mdn = 15, MAD = 1.48, see Figure 7B) than would be expected if they had responded randomly. These results were in line with what we observed in Experiment 2.
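For readers less familiar with the non-parametric test used here, the sketch below shows one way to compute the one-sample Wilcoxon signed-rank statistic V against a chance level, together with an effect size r = z / √n. Several choices are assumptions on our part and may differ from the original analysis: zero differences are dropped, tied ranks are averaged, z comes from the normal approximation without continuity correction, and n counts the non-zero differences. The data are made up for illustration and are not the study’s data.

```python
from math import sqrt


def wilcoxon_one_sample(values, mu0):
    """One-sample Wilcoxon signed-rank test against mu0.

    Returns (V, z, r): V is the sum of ranks of the positive differences,
    z is the normal approximation with tie correction, and r = z / sqrt(n),
    where n is the number of non-zero differences (an assumed convention).
    """
    diffs = [v - mu0 for v in values if v != mu0]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all observations equal mu0")

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda idx: abs(diffs[idx]))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        tie_size = end - start + 1
        tie_term += tie_size**3 - tie_size
        start = end + 1

    v_stat = sum(rank for rank, diff in zip(ranks, diffs) if diff > 0)
    mean_v = n * (n + 1) / 4
    var_v = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (v_stat - mean_v) / sqrt(var_v)
    return v_stat, z, z / sqrt(n)


# Hypothetical response frequencies (out of 16) for ten participants.
example = [16, 15, 15, 14, 16, 12, 15, 9, 16, 13]
v_stat, z, r = wilcoxon_one_sample(example, mu0=8)
print(f"V = {v_stat}, z = {z:.2f}, r = {r:.2f}")
```

With n non-zero differences, V can be at most n(n + 1) / 2, which is reached when every participant responds above chance, as in the made-up data here.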

Discussion

Given the difficult-to-interpret results of Experiment 2, we set up a follow-up experiment with some modifications to the design of the former. Errors were corrected, changes made going from Experiment 1 to Experiment 2 were reversed, and we made minor alterations to the stimulus presentation, intended to increase participants’ focus on the relation between the symbols in a pair. Through these revisions, we expected to resolve, and possibly also explain, the inconsistent results of the previous experiment. Our goal was the same as it was in Experiment 2. We aimed to find support for the hypothesis that a relation (the identity-relation between a pair of symbols) can function as a conditioned Sr, and that this function can be established by means of an operant learning procedure.

Figure 7. Raw response frequencies with means and confidence intervals for Experiment 3.

For each block (i.e., the four panels), the frequency with which individual participants selected the response that produced the ‘gain’-relation is represented by black dots. Red dots indicate the sample mean for each block, with 95% confidence intervals in black. The red, dashed line indicates the chance-level. Note: * p < .05; ** p < .01; *** p < .001.

Despite the changes that were made, our analyses again did not yield clear results. We did not find evidence that the contingency experienced in the learning phase significantly increased the frequency with which participants selected the response that produced the ‘gain’-relation in the reinforcement block. In line with our observations in Experiment 2, we found that participants were able to respond in line with the contingency in the equivalence block, responding as if the identity-relations and the ‘gain’- and ‘no-gain’-cues were equivalent. Furthermore, this equivalence relation appears to be established as a result of the operant MTS-procedure, in which correct responses were reinforced (see Sidman, 2000). The latter will be discussed in more detail in the general discussion. Interestingly, however, when we applied a different exclusion criterion, based on participants’ explicit awareness of the contingency (as opposed to the accuracy-criterion in the training blocks), our results suggested that the identity-relation functioned as a conditioned reinforcer in the reinforcement blocks, in support of our hypothesis (see Supplementary Material and Figure 11).

Notwithstanding, it must be said that the results were not entirely convincing and should be interpreted cautiously. Compared to the results of Experiment 1, the effects observed here are considerably smaller. Note also that, given the relatively large proportion of exclusions, sample sizes for Experiments 2 and 3 were small. Across our three experiments, we consistently found the strongest effects in the equivalence blocks. Conversely, in Experiment 3 and, to a lesser extent, in Experiment 2, as opposed to Experiment 1, results of the reinforcement blocks did not support our hypothesis that a relation can function as a conditioned Sr. It is possible, although unlikely, that this was due to the operant learning procedure, and that the function of conditioned Sr cannot be established through such procedures. Furthermore, despite the strong effect in the reinforcement blocks of Experiment 1, it could be said that the task in these blocks is not as clear as that in the equivalence block. Especially in the online study, this may have contributed to the contrasting results because the instructions that participants’ earnings would be a function of their performance, presented to them in the lab experiments, had to be left out for the online platform.


Experiment 4

In light of the results of Experiments 2 and 3, we set up a fourth and final online experiment. Across the three previous experiments, we found the strongest effects in the equivalence blocks, suggesting participants were well able to respond in line with the contingencies they experienced in an MTS-task. However, in Experiments 2 and 3, as opposed to Experiment 1, results of the reinforcement blocks did not support our hypothesis that a relation would reinforce certain responses, as a function of that contingency. We hypothesized that the task at hand in the reinforcement block is less explicit than the MTS-task in the equivalence block, and might thus not be clear to participants. This was especially the case in the online study, where the instructions for the reinforcement block were slightly altered, compared to the previous experiments in the lab.

This final experiment was a quasi-replication of Experiment 3, with two changes made to the latter. First, to make the task clearer, we updated the instructions for the reinforcement blocks to more closely resemble those of the lab experiments. We explicitly mentioned that participants could earn extra money in the reinforcement blocks, and that the exact amount would be a function of the symbols they saw and the responses they made (note, however, that for practical reasons, all participants received the extra £0.5, regardless of their performance). Second, based on time and financial considerations, participants completed only one phase in this experiment. The phase still consisted of the same MTS-training, reinforcement and equivalence blocks. The contingency (i.e., whether pairs of identical or non-identical symbols were paired with monetary gain) was counterbalanced between participants. Aside from these changes, the experiment was identical to Experiment 3.

Method

Participants and design.

Sixty participants (35 women, 1 non-binary), with ages ranging from 18 to 69 years (M = 35.08, SD = 11.85 years), were recruited online through Prolific Academic. Sample size was determined based on a priori power analyses in G*Power, using the results of Experiment 1 (see footnote 8), and taking into account the relatively high attrition rates in Experiments 2 and 3. All participants provided informed consent and were explicitly told they were allowed to quit whenever they felt uncomfortable, in line with Ghent University’s ethics protocol. They were compensated £1.5 for completing the experiment, which took no more than fifteen minutes. This experiment entailed a between-subjects design. We investigated the effect of a contingency between a relation and monetary gain on the frequency of participants’ responses producing that relation. Responses and reaction times were registered.
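The step of taking attrition into account when setting the sample size can be sketched as simple arithmetic. The snippet below is only an illustration: the function name `recruitment_target` and the input values are hypothetical and are not taken from the G*Power analysis reported here. It assumes attrition is expressed as a whole percentage of recruited participants who do not provide usable data.

```python
def recruitment_target(required_n: int, attrition_percent: int) -> int:
    """Return how many participants to recruit so that, after the expected
    attrition, at least `required_n` participants remain.

    Both arguments are hypothetical inputs: `required_n` would come from an
    a priori power analysis, `attrition_percent` from earlier experiments.
    """
    if required_n < 0:
        raise ValueError("required_n must be non-negative")
    if not 0 <= attrition_percent < 100:
        raise ValueError("attrition_percent must be in the range 0-99")
    retained_percent = 100 - attrition_percent
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-required_n * 100 // retained_percent)


# Illustrative values only (not the values used in this experiment).
print(recruitment_target(40, 25))  # 54
```

Rounding up rather than to the nearest integer keeps the retained sample at or above the size required by the power analysis.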

Materials.

Given that this was a quasi-replication, all stimulus materials and responses were identical to those used in Experiment 3 (see Method section for Experiment 3). Likewise, the same software was used for the analyses and for controlling and running the experiment. Reproducible scripts for both were preregistered on OSF (see footnote 13).

Procedure.

The procedure for this experiment was mostly identical to that of Experiment 3. One exception was that, after providing informed consent and demographic information (age and gender), participants completed three blocks: MTS-training, reinforcement and equivalence. The contingency (i.e., whether identical or non-identical symbols were paired with monetary gain) and response-consequence mappings were counterbalanced between participants. As before, participants were afterwards presented with one question that assessed their explicit awareness of the contingency (see Supplementary Materials). An example trial structure is illustrated in Figure 8.

The MTS-training blocks were identical to those of Experiment 3. The reinforcement blocks were identical to those in the previous experiment, except that the instructions presented before the block were adapted. Specifically, we added a short sentence to the instructions, stating that participants “could win up to £0.5 extra in this block”, and that “the amount of extra money you receive at the end of the experiment is a function of the symbols you see and the keys you press”. As before, the response-consequence mappings (i.e., whether a left and right response produced identical versus non-identical pairs or vice versa) were counterbalanced between participants. Finally, the equivalence blocks were also identical to those in the previous experiment (see Method section for Experiment 3 for exact stimulus and timing characteristics).
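The counterbalancing of the contingency and the response-consequence mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used in the experiment (see the OSF repository for that): the condition labels, the function names, and the choice to rotate through the four cells rather than assign them at random are all hypothetical.

```python
from collections import Counter
from itertools import cycle, product

# Hypothetical labels for the two counterbalanced factors.
CONTINGENCIES = ("identical_rewarded", "non_identical_rewarded")
MAPPINGS = ("left_produces_identical", "left_produces_non_identical")


def assign_conditions(participant_ids):
    """Rotate participants through the 2 x 2 counterbalancing cells.

    Assumption: a fixed rotation is used so cell sizes stay balanced;
    the original study may have used a different assignment scheme.
    """
    cells = cycle(product(CONTINGENCIES, MAPPINGS))
    return {
        pid: {"contingency": contingency, "mapping": mapping}
        for pid, (contingency, mapping) in zip(participant_ids, cells)
    }


def cell_counts(assignment):
    """Count how many participants ended up in each cell."""
    return Counter(
        (cond["contingency"], cond["mapping"]) for cond in assignment.values()
    )


# With 60 participants, each of the four cells receives 15 participants.
assignment = assign_conditions(range(60))
print(cell_counts(assignment))
```

A rotation keeps the cells equal in size when the sample is a multiple of four; with dropout, the final cell sizes would need to be checked afterwards.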

13. Both the master (English) and the translated (Dutch) measures and analysis scripts are available at: https://osf.io/7x8bj/?view_only=72b4a21407d44eb68ec5eda554a001d7

