Implicit artificial grammar learning: effects of complexity and usefulness of the structure

Bos, E.J. van den

Citation

Bos, E. J. van den. (2007, June 6). Implicit artificial grammar learning: effects of complexity and usefulness of the structure. Department of Cognitive Psychology, Leiden University Institute for Psychological Research, Faculty of Social Sciences, Leiden University. Retrieved from https://hdl.handle.net/1887/12037

Version: Corrected Publisher’s Version

License: Licence agreement concerning inclusion of doctoral thesis in the Institutional Repository of the University of Leiden

Downloaded from: https://hdl.handle.net/1887/12037

Note: To cite this publication please use the final published version (if applicable).

Chapter 3 Structural selection

Abstract

In the contextual cueing paradigm, Endo and Takeda (2004) recently provided evidence that implicit learning involves selection of the aspect of a structure that is most useful to one’s current task. The present study attempted to replicate this finding in artificial grammar learning to investigate whether or not implicit learning commonly involves such a selection. Participants in Experiment 1 had to perform a task in the induction phase that could be facilitated by the grammatical letter sequences in the exemplars and, for some participants, by a highly useful feature. The results suggested that the aspect of the structure that was most useful to the participants’ task was selected and learned implicitly. Experiment 2 provided evidence that, although salience affected participants’ awareness of the feature, the selection for implicit learning was based on usefulness.

Introduction

Implicit learning is often defined as a process that occurs without intention to learn, which results in knowledge that is not completely accessible to consciousness (e.g. Gomez, 1997; Mathews et al., 1989; Reber, 1989; Seger, 1994). It has been suggested to underlie the acquisition of complex patterns, such as motor skills, social rules (e.g. Seger, 1994) and the grammars of natural languages (Reber, 1976).

Originally, implicit learning was conceptualized as an invariant and ineluctable process that would abstract any structure present in the environment (e.g. Hayes & Broadbent, 1988; Reber, 1989). This passive view of implicit learning, however, was opposed by the episodic processing account (e.g. Wright & Whittlesea, 1998). By this account, structure learning is accidental; what is learned is determined by what is attended. Recently, Endo and Takeda (2004) proposed that people acquire the aspect of a structure that is most useful to the task they are engaged in. The present study describes two Artificial Grammar Learning (AGL; Reber, 1967) experiments that explored the viability of the view that implicit structure learning reliably involves selection of the aspect of a structure that is most useful to one’s current task.

Selection in implicit learning

Reber (1989) originally proposed that implicit learning automatically abstracts knowledge of covariation patterns from the environment. In addition, it would be an evolutionarily old process; robust with respect to disorders and aging, virtually invariant between individual humans and shared by other species (Reber, 1989, 1992). This characterization closely matches Hasher and Zacks’ (1979) description of an ‘innate automatic process’, which suggests that the process is not only initiated unintentionally but is also difficult to inhibit or modify (Hasher & Zacks, 1979; Shiffrin & Schneider, 1977). The invariance of implicit learning was also stressed by Hayes and Broadbent (1988), who proposed that the distinguishing factor between implicit and explicit learning is the selectivity of the processes. Explicit learning would involve active selection of a small amount of relevant information, whereas implicit learning would unselectively store the frequency of co-occurrence of all elements present.

The view that implicit learning is unintentional, ineluctable and inflexible was supported by studies demonstrating structure learning under conditions in which participants apparently had no reason to learn a structure. Numerous AGL-studies indicated that memorizing letter strings without knowing that they have been generated by an artificial grammar enables participants to judge the grammaticality of new exemplars as well as, and sometimes even better than, intentionally looking for the rules underlying grammatical exemplars (Brooks, 1978; Dienes, Broadbent & Berry, 1991; Dulany, Carlson & Dewey, 1984; Mathews et al., 1989; Perruchet & Pacteau, 1990; Reber, 1976; Shanks, Johnstone & Staggs, 1997). In addition, participants have been shown to learn about an artificial grammar by holding exemplars in memory for a few seconds to recognize them among distracters (Mathews et al., 1989) and to learn the words of an artificial language presented as background auditory stimulation in a drawing task (Saffran, Newport, Aslin, Tunick & Barrueco, 1997).

In contrast, Whittlesea and colleagues have argued that implicit learning does not passively capture any structure present in the stimuli. They demonstrated that the kind of knowledge acquired in implicit learning experiments could be modified by the task participants perform on the stimuli (Whittlesea & Dorken, 1993) and by accidental characteristics of the stimuli (e.g. familiarity, salience; Whittlesea & Wright, 1997) and the context (e.g. spatial organization; Wright & Whittlesea, 1998).

According to their episodic processing account, sensitivity to a structure (at test) is due to accidental overlap in information processing with earlier (learning) situations. What is learned can vary widely; structure has no special status in the selection process. Therefore, Whittlesea and colleagues see no need to invoke an implicit structure learning mechanism (Wright & Whittlesea, 1998). In this account, the knowledge acquired in any situation depends on what is attended and attention is treated as essentially unpredictable.

Recently, however, research using visual search tasks has provided evidence that the relationship between attention and learning is bi-directional: structure learning can also guide attention (Chun & Jiang, 1998; Lambert, 2003). In the contextual cueing paradigm, for example, participants are presented with search displays containing one target and several distracters. Half of the displays have configurations of targets and distracters that are repeated during the experiment. For these configurations, participants become increasingly faster at locating the targets, because they learn where to attend. Learning is considered implicit, because participants are not informed about the repeated configurations prior to the experiment and are unable to distinguish between old and new configurations on a subsequent recognition test (Chun & Jiang, 1998).

In the experiments of Endo and Takeda (2004), the location of the target could be predicted by both the configuration and the identity of the distracters. When both were predictive of target location, attention was guided by the strongest predictor and only distracter configuration was learned. However, when each relationship predicted the target location on half of the trials, both were learned. When distracter configuration was made less informative than distracter identity, by varying it randomly or associating it with target identity rather than location, only the identity predictor was learned. Together these experiments show that the aspect of a structure that is most useful to the current task is implicitly selected for learning. This demonstration of implicit selection of useful structure is interesting, because both the basis of selection and its implicitness are controversial in the light of findings from other implicit learning paradigms. These findings will be discussed below.

Selection of useful information

On the one hand, the finding that an aspect of a structure was implicitly selected for learning on the basis of its usefulness to the visual search task (Endo & Takeda, 2004) may be expected in other tasks as well. One might argue that knowledge of the structure is useful to the participants’ task in most demonstrations of implicit learning. For example, in serial reaction time (SRT) tasks (Nissen & Bullemer, 1987) participants have to react as quickly and accurately as possible to a stimulus appearing at some location on a computer screen by pressing the corresponding key on the keyboard. Knowing the sequence of locations in which the stimulus appears allows for faster responding. Similarly, memorizing the individual exemplars in the induction phase of an AGL-experiment is facilitated by acquiring knowledge of the underlying structure. Reber (1967) demonstrated that participants who memorized grammatical exemplars needed fewer learning trials to achieve accurate reproduction than participants who memorized stimuli that were randomly composed of the same letters. This suggests that implicit learning is commonly directed at useful structures.

On the other hand, several SRT-experiments have demonstrated implicit learning of the sequence of locations in the presence of a more reliable predictor of where the next stimulus would appear. The structure was shown to be acquired when the next stimulus’ location could be perfectly predicted from the identity of the present one (Jiménez & Méndez, 1999, 2001) and when the next location was indicated by an explicit cue (Cleeremans, 1997). This suggests that selection of the aspect of a structure that is most useful to one’s current task may not be a general finding in implicit learning.

The discrepancy between the results of Endo and Takeda’s (2004) contextual cueing experiments and the findings in the SRT-paradigm (Cleeremans, 1997; Jiménez & Méndez, 1999, 2001) may reflect different learning mechanisms. There is some evidence that different neural substrates underlie the formation of the spatial associations acquired in contextual cueing experiments and the spatiotemporal associations acquired in SRT-tasks (Howard, Howard, Dennis, Yankovich & Vaidya, 2004). Interestingly, Dominey (2003) proposed a similar distinction between a mechanism for spatiotemporal structure learning and a mechanism underlying AGL. This suggests that selection of the aspect of a structure that is most useful to the participant’s current task, observed in contextual cueing experiments (Endo & Takeda, 2004), but not in SRT-tasks (Jiménez & Méndez, 1999, 2001), may be replicated in AGL.

Implicitness

A second interesting point raised in the contextual cueing paradigm is that an aspect of a structure may be selected and learned implicitly (Chun & Jiang, 1998; Endo & Takeda, 2004). This suggestion is at odds with a previous finding that people have conscious control over selection of information. Haider and Frensch (1999) found that participants processed irrelevant information when they were instructed to perform a task as accurately as possible, but not when they were instructed to perform the task as fast as possible. They concluded that, although participants may implicitly assess whether or not information is relevant, they intentionally decide whether or not to process it.

Moreover, learning without awareness itself is controversial. Artificial grammar learning has been shown to result in explicit knowledge of bigrams (Dulany, Carlson & Dewey, 1984), which can be sufficient to achieve normal performance on a grammaticality judgment task (Perruchet & Pacteau, 1990). In addition, Shanks and St. John (1994) argued that early studies demonstrating knowledge without awareness had failed to detect explicit knowledge, because they used insensitive measures and focused on irrelevant information. At present, it is generally acknowledged that implicit learning produces a certain amount of explicit knowledge (Cleeremans, 1993; Reber, 1989).

However, there is some evidence that explicit knowledge is insufficient to explain performance (Mathews et al., 1989) and that it may not actually be used in making grammaticality judgments (Meulemans & Van der Linden, 2003). In addition, it has been proposed that implicit learning produces knowledge that is different from explicit knowledge in the sense that it is not accompanied by meta-knowledge (Dienes & Berry, 1997). In implicit learning, people may form representations that are not labeled as knowledge and, hence, cannot be recognized as such (Dienes & Perner, 1999). In sum, additional research is needed to establish the generality of the finding that an aspect of a structure can be selected and learned implicitly (Endo & Takeda, 2004).

The present study further investigated both the possibility that selection for implicit learning is based on a structure’s usefulness to one’s current task and the degree to which this selection can be performed implicitly. To address the first question, we explored whether structural selection of the aspect most useful to one’s current task could be observed in artificial grammar learning. We presented participants with exemplars that contained moderately useful letter sequences as well as a highly useful feature. If they only selected the aspect of the structure that was most useful to the task they were instructed to perform in the induction phase, learning would be restricted to the feature. To address the second question of this study, we investigated the degree to which learning was implicit by estimating performance on a classification test in the absence of explicit knowledge.

Experiment 1

The findings of Endo and Takeda (2004) suggest that implicit learning is limited to the aspect of a structure that most effectively facilitates a person’s current task. To explore the viability of this usefulness-hypothesis, we presented each participant with exemplars from two artificial grammars. For one group of participants, memorizing the exemplars and the side of the screen where they appeared could be facilitated both by the letter sequences specified by the grammars and by a highly useful feature. For the other group, the memorize-task could only be facilitated by the letter sequences. Participants working with the stimulus set without the feature were predicted to learn the letter sequences specified by the grammars.

Participants working with the stimulus set with a feature were predicted to learn the feature, but not the letter sequences. To reveal what participants had learned, they were presented with two kinds of stimuli at test. One half of these could be classified on the basis of both the feature and the grammatical letter sequences, while the other half could only be classified on the basis of the letter sequences.

Method

Participants. Fifty-six undergraduate students of Leiden University (17 male, 39 female, 17 - 29 years of age; M = 20.37, SD = 2.86) participated in this experiment.

They received either course credits or money for their participation. The reward depended on the duration of the experiment; experimental participants were paid € 4.50 and control participants were paid € 3.

Materials. The stimuli in this experiment were exemplars of four different artificial grammars of the finite state type (see Figure 1). The grammars were implemented in a computer program, which generated 56 different exemplars for each grammar, consisting of either seven or ten letters. From these exemplars two sets of stimuli were created. Set 1 consisted of the exemplars generated by A1 and B1; Set 2 consisted of the exemplars generated by A2 and B2 (see Appendix B).

[Figure 1: state diagrams of the finite-state grammars A1, B1, A2, B2, A3 and B3]

Figure 1. Artificial grammars used in this study. Grammars A1 and B1 generated Stimulus Set 1, with a simple feature. Grammars A2 and B2 generated Stimulus Set 2, without a simple feature. Grammars A3 and B3 generated Stimulus Set 3, with a non-salient feature. The grammars are based on those of Whittlesea and Dorken (1993, Experiment 1).

As shown in Figure 1, exemplars generated by A1 always started with an M, while exemplars generated by B1 never started with an M. So, for participants who memorized exemplars from Set 1 and the side of the screen where they appeared, their task could be facilitated by this simple feature as well as by the sequences of letters permitted by each of the grammars. Since the feature was invariant, while a specific letter sequence would not necessarily be part of each exemplar from a grammar, the feature would be more useful to the memorize task than the letter sequences. In Set 2, such a feature was not available and the task of memorizing could only be facilitated by learning letter sequences.
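As a rough illustration of how a finite-state grammar yields exemplars, the sketch below enumerates the strings of a small toy grammar. The transition table `TOY_GRAMMAR_A`, its state numbering and the function `generate` are hypothetical. They are not the grammars of Figure 1 and not the program used in the study. The toy grammar only mimics the property that every exemplar of one grammar begins with an M.

```python
# Hypothetical toy grammar: state -> list of (letter, next_state).
# A next_state of None marks the exit. This is NOT a grammar from Figure 1.
TOY_GRAMMAR_A = {
    0: [("M", 1)],
    1: [("T", 2), ("Q", 3)],
    2: [("P", 3), ("Z", None)],
    3: [("X", None), ("W", 2)],
}


def generate(grammar, max_len, state=0, prefix=""):
    """Enumerate every complete string of at most max_len letters."""
    results = []
    for letter, next_state in grammar[state]:
        word = prefix + letter
        if len(word) > max_len:
            continue  # prune paths that are already too long
        if next_state is None:
            results.append(word)  # reached the exit: a complete exemplar
        else:
            results.extend(generate(grammar, max_len, next_state, word))
    return results


exemplars = generate(TOY_GRAMMAR_A, max_len=6)
print(exemplars)
```

Because every path through the toy grammar leaves state 0 via M, the first letter alone identifies the grammar, whereas no single later letter sequence occurs in every exemplar.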

Each set was divided into 64 induction stimuli and 48 test stimuli, so that both groups consisted of an equal number of exemplars from grammars A and B. One half of the test stimuli, balanced for grammar and length, were presented as ‘complete exemplars’. The other half were presented as ‘fragments’, created by replacing the first letter of the exemplar by an underscore. The stimuli for practice in the test phase consisted of five additional exemplars for each stimulus set: one complete exemplar and one fragment generated by grammar A and two complete exemplars and one fragment generated by grammar B. The stimuli for practice in the induction phase consisted of numbers that bore no relation to the grammars.
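The construction of fragments can be sketched as follows. The function names and the example string are assumptions made for illustration (the string is not taken from Appendix B), and `classify_by_first_letter` is a hypothetical feature-only rule for Set 1, not a model of what participants did.

```python
def make_fragment(exemplar):
    """Replace the first letter of an exemplar by an underscore."""
    return "_" + exemplar[1:]


def classify_by_first_letter(stimulus):
    """Hypothetical feature-only rule: M -> grammar A, any other letter -> B.

    Returns None when the first letter is masked, because the rule
    then has nothing to go on.
    """
    first = stimulus[0]
    if first == "_":
        return None
    return "A" if first == "M" else "B"


complete = "MTQPZTX"  # made-up seven-letter string
fragment = make_fragment(complete)
print(fragment, classify_by_first_letter(complete), classify_by_first_letter(fragment))
```

A rule that relies only on the initial letter decides complete exemplars but is silent on fragments, which is what allows the two kinds of test stimuli to separate feature knowledge from knowledge of letter sequences.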

All stimuli were displayed on a computer monitor as black text (Arial 18, bold) against a white background. Participants were seated in front of the computer monitor at a distance of about 50 cm. They reacted by pressing keys on a keyboard and by writing their answer to an open question on a sheet of paper.

Design. Three independent variables were manipulated in the experiment. Firstly, the participants were divided into an experimental group and a control group. The experimental group was presented with an induction phase, in which they had to memorize each exemplar together with the side of the screen where it appeared. In addition, they had to type in the stimuli in order to guarantee that they would attend to the letters of each of the 64 exemplars. For the control group there was no induction phase.

Secondly, the stimulus set was varied between participants. One half of the participants worked with materials with the highly useful feature and the other half worked with materials without the feature. For each experimental participant, exemplars generated by grammar A were always presented on one side of the screen during the induction phase and exemplars generated by grammar B were always presented on the other side. The order of presentation of the exemplars was randomized. The side of the screen (left or right) with which each grammar was associated was balanced over all participants.

Thirdly, the type of exemplar was varied within-subjects in the test phase. All participants classified both complete exemplars and fragments, presented in random order. Knowledge of the letter sequences specified by each grammar would allow for accurate performance on both complete exemplars and fragments, while knowledge of the feature would allow for accurate performance on complete exemplars but not on fragments.

The main dependent variable was the proportion of exemplars correctly classified as belonging to the side of the screen associated with their grammar. Experimental participants knew from the induction phase which side of the screen each grammar was associated with. Control participants, however, did not know which side each grammar was assigned to. If, as expected, control participants were unable to distinguish between exemplars from the two grammars, they would perform at chance level irrespective of the mapping they had been randomly assigned to. However, if they consistently grouped exemplars from the same grammar together, their accuracy would depend on the side of the screen to which each grammar was assigned. To check whether the accuracy results were biased by the control group’s mapping, consistency was included as a second dependent variable. Consistency was defined as the difference between the number of exemplars from grammar A classified as belonging to one side and the number of exemplars from grammar B also classified as belonging to that side.
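The consistency measure defined above can be written out as a short sketch. The function name, the data layout (parallel lists of grammar labels and responses) and the choice of the left side as reference are assumptions for illustration only.

```python
def consistency(grammars, responses, side="left"):
    """Grammar-A items assigned to `side` minus grammar-B items assigned to it."""
    a_on_side = sum(1 for g, r in zip(grammars, responses) if g == "A" and r == side)
    b_on_side = sum(1 for g, r in zip(grammars, responses) if g == "B" and r == side)
    return a_on_side - b_on_side


# Made-up responses from three imaginary participants (six test items each).
grammars = ["A", "A", "A", "B", "B", "B"]
separated = ["left", "left", "left", "right", "right", "right"]
reversed_mapping = ["right", "right", "right", "left", "left", "left"]
guessing = ["left", "right", "left", "left", "right", "left"]

print(consistency(grammars, separated))         # A and B kept apart
print(consistency(grammars, reversed_mapping))  # kept apart, opposite sides
print(consistency(grammars, guessing))          # no grouping by grammar
```

In this sketch a participant who keeps the grammars apart obtains a large difference whichever side each grammar ends up on, so the magnitude of the score does not depend on the mapping in the way accuracy does.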

Procedure. Participants were tested individually in a dimly lit test booth. At the beginning of the experiment, experimental participants were told that it would consist of two parts. They were informed that they would first be presented with two groups of exemplars: a left group and a right group and that they would have to memorize each exemplar together with the side of the screen where it appeared. Subsequently, there were five practice trials. Participants were notified when the experimental trials began. Each trial started with a fixation cross appearing on the left or on the right of the screen. After 1 second the cross was replaced by an exemplar, centered at the fixation point. Participants performed the memorize instruction and typed in the letters. When they pressed the last key, their input was displayed on the screen underneath the original exemplar and a reminder of the instruction appeared. After 2 seconds the screen turned blank for 1 second and then the next trial began.

Control participants only performed the second part of the experiment, which was a classification task. At the beginning of this task, all participants were informed that they would be presented with both complete exemplars and fragments of exemplars. They were told that the stimuli would appear in the middle of the screen and that half belonged to the left and half belonged to the right group. They were required to indicate for each exemplar to which group it belonged by pressing either the left or the right arrow on the keyboard. They received 5 practice trials, followed by 48 experimental trials. Each trial began with a fixation cross appearing in the middle of the screen. After 1 second the cross was replaced by an exemplar centered at the fixation point. The exemplar remained on the screen until the participant pressed one of the arrows. Then there was a blank screen for 1 second before the next trial started.

After the participants had completed all trials, they were asked to write down whether they had noticed any differences between the left group and the right group and, if they had, what these differences were. Finally, the participants were thanked for their participation. The experiment took about 25 minutes for the experimental participants and 15 minutes for the control participants.

Analyses. The main analysis was a 2-between, 1-within mixed model analysis of variance (ANOVA) on the proportion of correct classifications, with group (memorize vs. control) and stimulus set (with feature vs. without feature) as between-subjects variables and type of exemplar (complete exemplar vs. fragment) as within-subjects variable. A similar analysis was performed on the consistency data.

If learning were restricted to the aspect of the structure that is most useful to the task in the induction phase, participants memorizing exemplars from the stimulus set with a highly useful feature would have learned the feature, but not the letter sequences. This would allow them to correctly classify the complete exemplars, but not the fragments. Participants memorizing exemplars from the stimulus set without a feature would have learned the letter sequences and would therefore be able to correctly classify both complete exemplars and fragments. Therefore, the hypothesis that selection for learning is based on the structure’s relevance to the task in the induction phase predicts an interaction between stimulus set and type of exemplar.

However, this effect would be restricted to the memorize group, as no learning was expected for the control group. In other words, the usefulness-hypothesis predicts a three-way interaction of group, stimulus set and type of exemplar. For analyses in which this three-way interaction is significant, main effects and two-way interactions will not be reported. Separate ANOVAs for each stimulus set and independent samples t-tests for each type of exemplar will subsequently be performed to test whether the pattern of results is in line with the hypothesis.

The second question of this study was whether (selective) learning could occur without leading to awareness. This question was addressed by estimating the experimental participants’ level of performance on the classification test in the absence of explicit knowledge. A measure of explicit knowledge was derived from the responses to the open question. Verbal reports have been criticized as insensitive measures of explicit knowledge (Shanks & St. John, 1994). However, naming differences between two groups of exemplars seems to be easier than describing what makes exemplars grammatical. Knowledge of the simple feature in particular would be easy to verbalize and should be revealed by this question.

The answers to the question were scored following a procedure developed by Dienes, Broadbent and Berry (1991). Firstly, the criteria provided by the participant were applied to the test stimuli in an attempt to determine for each exemplar whether it would be classified correctly or incorrectly. Secondly, exemplars that could not be classified, since none of the participant’s rules applied to them, were assumed to be guessed correctly in 50% of the cases. Therefore, the open question score was defined as the sum of the number of correctly classified exemplars and half the number of unclassifiable exemplars. For example, any mention of the feature would lead to correct classification of the 24 complete exemplars and correct guesses for 12 of the fragments, amounting to an open question score of 36. If participants did not have any relevant explicit knowledge, their open question score would be 24.
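The scoring rule can be checked against the two worked values given above (36 and 24) with a minimal sketch. The function name and the representation of a participant's stated rules as a list of predicted labels, with `None` for exemplars that no rule covers, are assumptions for illustration.

```python
def open_question_score(predictions, truth):
    """Correctly classified exemplars plus half of the unclassifiable ones.

    predictions: 'A', 'B', or None when none of the stated rules applies.
    truth: the grammar ('A' or 'B') that generated each test stimulus.
    """
    correct = sum(1 for p, t in zip(predictions, truth) if p is not None and p == t)
    unclassifiable = sum(1 for p in predictions if p is None)
    return correct + 0.5 * unclassifiable


# 48 test stimuli: the first 24 stand for complete exemplars, the last 24 for fragments.
truth = ["A", "B"] * 24

# A participant who reports the feature: complete exemplars decided, fragments not.
feature_only = truth[:24] + [None] * 24
# A participant who reports no usable rule at all.
no_knowledge = [None] * 48

print(open_question_score(feature_only, truth))
print(open_question_score(no_knowledge, truth))
```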

The open question score (independent variable) was entered in a regression analysis to predict the proportion correct on the classification test (dependent variable). Subsequently, a reliable estimate of the proportion of correct classifications associated with the score of 24 on the open question was derived from the regression equation. A predicted proportion of correct classifications that was significantly above chance indicated implicit knowledge.
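The logic of this estimate can be sketched with an ordinary least-squares line evaluated at an open question score of 24. The data below are invented solely to make the example run and are not the study's data; `fit_line` and `predict` are hypothetical helper names. The sketch gives only the point estimate and omits the confidence interval that is needed to test the prediction against chance.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted proportion correct at open question score x."""
    return intercept + slope * x


# Invented illustration data: open question scores and proportions correct.
scores = [24, 28, 32, 36, 40]
proportions = [0.52, 0.58, 0.64, 0.70, 0.76]

intercept, slope = fit_line(scores, proportions)
at_no_explicit_knowledge = predict(intercept, slope, 24)
print(round(at_no_explicit_knowledge, 3))
```

A predicted value above .5 at a score of 24 is what would be read as performance exceeding what the reported knowledge can explain, provided its confidence interval excludes chance.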

Results

Performance analysis. The ANOVA on the proportion of correct classifications showed that the three-way interaction of group, stimulus set and type of exemplar, illustrated by Figure 2, was significant (F(1,52) = 23.395, MSE = .010, p < .001). For the stimulus set with a feature, the interaction between group and type of exemplar was significant (F(1,26) = 33.905, MSE = .013, p < .001). Independent samples t-tests showed that more complete exemplars were classified correctly by the memorize group (M = .896, SD = .148) than by the control group (M = .521, SD = .174, t(26) = 6.139, p < .001). For fragments, however, there was no difference between the memorize group (M = .527, SD = .126) and the control group (M = .503, SD = .068). For the stimulus set without a feature, only the main effect of group was significant (F(1,26) = 17.137, MSE = .026, p < .001). The proportion of correct classifications was higher for the memorize group (M = .676, SD = .143) than for the control group (M = .497, SD = .075).

The consistency analysis showed the same pattern of results (see Table 1), indicating that the accuracy data were not biased by the way the grammars were assigned to the sides of the screen for the control group. In summary, participants who had memorized exemplars without a feature had learned the letter sequences that characterized the two groups of exemplars, whereas participants who had memorized exemplars with a highly useful feature had only learned this feature.

Table 1. Consistency analyses

Effect                                     Experiment 1           Experiment 2
Stimulus set x Group x Type of exemplar    F(1,52) = 19.496***
Without feature
  Group                                    F(1,26) = 12.519**
With feature
  Group x Type of exemplar                 F(1,26) = 19.866***    F(1,26) = 4.417*
  Complete exemplars: Group                t(26) = 5.315***       t(16.0) = 5.063***
  Fragments: Group                         t(26) = 1.604          t(14.6) = 3.287**

Note. * p < .05. ** p < .01. *** p < .001.

Figure 2. Mean proportion of correct classifications with 95% confidence interval for each stimulus set, group and type of exemplar at test in Experiment 1.

Type of knowledge. For the stimulus set without a feature, the experimental group had a mean open question score of 30.6 (SD = 7.3) out of 48. The regression analysis showed that the open question score was a significant predictor of the proportion of correct classifications (F(1,12) = 72.163, p < .001). The predicted proportion of correct classifications for an open question score of 24, corresponding to no explicit knowledge, was .555 (95% CI = .510 – .600). As this is significantly above chance, it can be concluded that memorizing exemplars without a simple feature led to knowledge that was partly implicit.

For the stimulus set with a highly useful feature, the performance analysis indicated that participants in the experimental condition had knowledge relevant to complete exemplars, but not to fragments. Therefore, the open question scores were only computed for complete exemplars. The mean open question score was 20.4 (SD = 5.2) out of 24. Nine participants had complete explicit knowledge of the feature. The regression analysis showed that the open question score was a significant predictor of the proportion of complete exemplars classified correctly (F(1,12) = 18.025, p = .001). The predicted proportion of correct classifications for an open question score of 12, corresponding to no explicit knowledge, was .708 (95% CI = .597 – .820). As this is significantly above chance, it can be concluded that memorizing exemplars with a highly useful feature led to knowledge of this feature that was partly implicit.

Discussion

We investigated the hypothesis that only the aspect of a structure that is most useful to a person’s current task is selected for implicit learning. The results of Experiment 1 showed that, after memorizing exemplars from two artificial grammars that could be distinguished on the basis of moderately useful letter sequences as well as a highly useful feature, participants were able to classify complete exemplars, while they were unable to classify exemplars from which the feature had been removed. This indicates that they had learned the feature, but not the letter sequences. In contrast, participants who could only use letter sequences to facilitate the memorize task were significantly better than the control group for both types of exemplars. Their overall performance of 67% correct was similar to the 66% correct found by Whittlesea and Dorken (1993, Experiment 1: unambiguous items), who also asked participants to assign exemplars to one of two grammars. In short, the results were in accordance with the hypothesis. When two aspects of the structure were useful to the participants’ task, the most useful aspect was selected for learning, whereas the other was not. However, the less useful aspect could be selected for learning if there was no more useful alternative.

In addition, the aspect most useful to a person’s current task could be selected and learned implicitly. Participants who memorized exemplars characterized by specific letter sequences and the side of the screen where they appeared acquired partly implicit knowledge of these letter sequences. The regression analysis indicated above chance performance even in the absence of explicit knowledge. Similarly, participants who memorized exemplars characterized by their initial letter together with the side of the screen where they appeared were shown to acquire partly implicit knowledge of this feature. This suggests that very simple information that is useful to one’s current task can be selected and learned without reaching awareness.

The results of the present experiment could be taken as evidence that the finding of implicit selection and learning of useful information (Endo & Takeda, 2004) generalizes from the contextual cueing paradigm to artificial grammar learning.

However, it could be argued that the feature was not only the most useful aspect of the structure in the present experiment, but also the most salient. Turner and Fischler (1993) have suggested that implicit learning could be facilitated by salient materials (though see Reber, Kassin, Lewis & Cantor, 1980). An alternative interpretation of our findings may therefore be that only the feature was learned, because attention was fully captured by its salience, leaving the rest of the exemplar unattended.

Two aspects of the data from the present experiment make this interpretation unlikely. Firstly, participants were required to type in each exemplar in the induction phase, which requires at least some attention to the subsequent letters in the string. Secondly, if the salient feature were the only aspect of the structure that was attended during the induction phase, it would seem unlikely that participants’ knowledge of it would be partly implicit. Nevertheless, we conducted a second experiment to investigate whether a simple feature that is useful to the task in the induction phase can be selected and learned implicitly when its salience is reduced.

Experiment 2

In this experiment, two grammars were used that could be characterized by invariant second letters (M and T) instead of an invariant first letter. Frick and Lee (1995) found that 79% of participants noticed an invariant first letter in otherwise random letter sequences, whereas only 24% noticed an invariant second letter. Therefore, removing the invariant letter from the first position was expected to reduce the feature’s salience, but not its usefulness to the task of memorizing each exemplar and the side of the screen where it appeared. The results of Endo and Takeda (2004) suggest that the feature will be selected for learning as long as it is the most efficient way to facilitate the participants’ task. This leads to the prediction that, as in Experiment 1, participants in Experiment 2 will learn the feature, but not the letter sequences. Alternatively, if selection of the feature had been based on its salience in Experiment 1, learning of the feature would be diminished by a reduction in salience. In that case, no difference between complete exemplars and fragments would be expected in Experiment 2.

Method

Participants. There were 28 participants in this experiment (9 male, 19 female; 19-34 years, M = 23.21, SD = 3.35). All participants were students of Leiden University and none of them had participated in Experiment 1. They received either course credits or money for their participation. The reward depended on the duration of the experiment; experimental participants were paid € 4.50 and control participants were paid € 2.

Materials. In Experiment 2, only a stimulus set with a simple feature was used (see Appendix B). The stimuli were generated by Grammar A3 and Grammar B3 in Figure 1 on page 35. These grammars produced exemplars that varied in length from 8 to 11 rather than from 7 to 10 letters and were characterized by a highly useful feature in the second position rather than in the first. Therefore, the fragments for the test phase were created by removing the second letter. This shift in position was assumed to make the feature less salient without diminishing its usefulness in memorizing the side of the screen where each exemplar appeared. In all other respects, the stimulus sets for Experiment 2 were created in the same way as those for Experiment 1.

Procedure. The procedure was the same as in Experiment 1.

Design and Analyses. Experiment 2 contained only one stimulus set. Apart from the omission of this independent variable, the design was the same as in Experiment 1. The data were analyzed by means of a 1-between, 1-within mixed model ANOVA with group (memorize vs. control) as between-subjects variable and type of exemplar (complete exemplar vs. fragment) as within-subjects variable. As the feature is the most reliable characteristic of the exemplars appearing on either side of the screen, the usefulness hypothesis would predict that participants in the memorize group would learn the feature, but not the letter sequences. This would enable them to classify more complete exemplars correctly than the control group, but not more fragments. To check that the results were not biased by the side of the screen to which the grammars were assigned for the control group, the same analysis was performed with consistency as the dependent variable. Implicit knowledge was estimated in the same way as in Experiment 1.

Results

Performance analysis. The ANOVA on the proportion of correct classifications showed that the interaction between group and type of exemplar was significant (F(1,26) = 7.453, MSE = .010, p = .011). For the memorize group, the proportion of complete exemplars classified correctly (M = .772, SD = .178) was higher than the proportion of fragments classified correctly (M = .621, SD = .169; t(13) = 3.002, p = .010). For the control group, there was no difference between complete exemplars (M = .506, SD = .081) and fragments (M = .503, SD = .040). Both the proportion of complete exemplars and the proportion of fragments classified correctly were higher for the memorize group than for the control group (t(18.2) = 5.103, p < .001 and t(14.4) = 2.541, p = .023, respectively).
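The fractional degrees of freedom in the between-group comparisons suggest that an unequal-variance (Welch) t-test was used. As a rough check under that assumption, the statistics can be recomputed from the rounded means and standard deviations reported above. The group size of 14 is inferred from the 28 participants and the 13 degrees of freedom of the within-group tests; the function name is illustrative.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


N = 14  # assumed group size: 28 participants split evenly over two groups

# Memorize group versus control group, using the rounded values reported above.
t_complete, df_complete = welch_t(.772, .178, N, .506, .081, N)
t_fragment, df_fragment = welch_t(.621, .169, N, .503, .040, N)

print(f"complete exemplars: t({df_complete:.1f}) = {t_complete:.2f}")
print(f"fragments:          t({df_fragment:.1f}) = {t_fragment:.2f}")
```

Because the inputs are rounded to three decimals, the recomputed values only approximate the reported ones. In this recomputation the larger statistic, with about 18 degrees of freedom, belongs to the complete exemplars and the smaller one, with about 14 degrees of freedom, to the fragments.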

The consistency analyses (see Table 1 on page 40) showed a pattern of results similar to that of the accuracy analyses. Although the control group classified complete exemplars more consistently than fragments (t(13) = 3.079, p = .009), consistency for either type of exemplar was lower than in the memorize group.

Type of knowledge. The performance analysis indicated that, although participants in the experimental condition could classify both complete exemplars and fragments, they were better at classifying complete exemplars. Therefore, implicit knowledge of the useful feature and implicit knowledge of letter sequences were estimated separately. The mean open question score for complete exemplars was 17.5 (SD = 5.1) out of 24. Three participants had complete explicit knowledge of the feature. The regression analysis showed that the open question score was a significant predictor of the proportion of complete exemplars classified correctly (F(1,12) = 11.536, p = .005). The predicted proportion of correct classifications for an open question score of 12, corresponding to no explicit knowledge, was .639 (95% CI = .524 – .754). As this is significantly above chance, it can be concluded that memorizing exemplars with a highly useful feature led to knowledge of this feature that was partly implicit.

The mean open question score for fragments was 14.5 (SD = 3.2) out of 24. The regression analysis showed that the open question score was a significant predictor of the proportion of fragments classified correctly (F(1,12) = 6.251, p = .028). The predicted proportion of correct classifications for an open question score of 12, corresponding to no explicit knowledge, was .541 (95% CI = .433 – .649). As the confidence interval includes .50, the knowledge of letter sequences demonstrated in the performance analysis cannot be said to be implicit.
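The decision rule applied to these two estimates can be written out as a small sketch. It uses only the predicted proportions and confidence limits reported above for Experiment 2; the dictionary layout and the function name are illustrative choices.

```python
CHANCE = 0.5  # two response options, so guessing yields 50% correct

# Predicted accuracy at an open question score of 12 (no explicit knowledge)
# with the lower and upper 95% confidence limits reported for Experiment 2.
estimates = {
    "complete exemplars": (.639, .524, .754),
    "fragments": (.541, .433, .649),
}


def partly_implicit(lower_limit, chance=CHANCE):
    """Knowledge counts as partly implicit when the interval lies above chance."""
    return lower_limit > chance


for item_type, (predicted, lower, upper) in estimates.items():
    verdict = "partly implicit" if partly_implicit(lower) else "not shown to be implicit"
    print(f"{item_type}: {predicted:.3f} [{lower:.3f}, {upper:.3f}] -> {verdict}")
```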

In contrast to participants who memorized exemplars containing a highly useful feature in Experiment 1, participants in Experiment 2 were able to classify both complete exemplars and fragments. Nevertheless, only their classification of complete exemplars was partly based on implicit knowledge. The mean open question scores for complete exemplars did not differ significantly between Experiment 1 and Experiment 2 (t(26) = 1.528, p = .139, 95% CI = -1.0 – 7.0). The number of participants with full explicit knowledge of the feature, however, was larger in Experiment 1 than in Experiment 2 (χ²(1) = 5.25, p < .05). This suggests that participants in Experiment 2 had explicit knowledge of information correlated with the feature.
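The comparison of the number of fully aware participants can be retraced with a minimal sketch of a Pearson chi-square test on a 2 x 2 table. The counts of nine and three aware participants are those reported for the two experiments; the group size of 14 and the absence of a continuity correction are assumptions, and the function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


GROUP_SIZE = 14  # assumed number of experimental participants per experiment

# Rows: Experiment 1, Experiment 2; columns: full explicit knowledge, no full knowledge.
chi2 = chi_square_2x2(9, GROUP_SIZE - 9, 3, GROUP_SIZE - 3)
p = p_value_df1(chi2)
print(f"chi-square(1) = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the statistic agrees with the reported value of 5.25.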

Indeed, the responses to the open question revealed that four participants had learned initial trigrams containing the invariant letter and the side of the screen where they appeared. Knowledge of these trigrams could, somewhat less straightforwardly, be applied to the fragments as well. In line with this suggestion, only experimental participants whose open question score for complete exemplars was above chance (binomial test: score ≥ 17, p = .032) were significantly better at classifying fragments (M = .687, SD = .188) than control participants (M = .503, SD = .040, t(6.3) = 2.570, p = .041). Experimental participants with an open question score for complete exemplars that did not differ from chance were not significantly better at classifying fragments (M = .554, SD = .128) than control participants (t(6.6) = 1.040, p = .335, 95% CI = -.067 – .170).
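The cutoff of 17 can be retraced with a short sketch of a one-sided binomial test. It assumes that the open question score behaves like the number of correct answers out of 24 items, each with a guessing probability of .5, which is consistent with a score of 12 representing no explicit knowledge; the function name is illustrative.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_ITEMS = 24  # maximum open question score

p_17 = binomial_upper_tail(17, N_ITEMS)
p_16 = binomial_upper_tail(16, N_ITEMS)
print(f"P(score >= 17) = {p_17:.3f}")
print(f"P(score >= 16) = {p_16:.3f}")
```

Under these assumptions, 17 is the lowest score whose one-sided probability under guessing falls below .05, which matches the reported p = .032.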

Discussion

Experiment 1 showed that, by memorizing exemplars from two artificial grammars that could be distinguished on the basis of both a distinctive feature and the letter sequences they contained, participants implicitly learned the feature, but not the letter sequences. Experiment 2 investigated whether the feature had been selected for implicit learning because of its salience or because it facilitated the participants’ task of memorizing each exemplar together with the side of the screen where it appeared. Participants were presented with exemplars from two artificial grammars containing a feature that was not salient, but highly useful. The usefulness hypothesis predicted that participants would be able to classify complete exemplars, but not fragments from which the feature had been removed. The salience hypothesis predicted no difference between complete exemplars and fragments.

The performance analysis showed an intermediate outcome: although experimental participants classified more complete exemplars correctly than fragments, they did better on the fragments than a control group. Subsequent analyses made clear that the latter result was based on explicit knowledge of trigrams containing the feature. There was no evidence for the acquisition of letter sequences in the absence of explicit knowledge. The reduction in salience mainly seemed to affect the number of participants who became aware of the invariant letter, which decreased from 9 in Experiment 1 to 3 in Experiment 2. When the second letter was invariant instead of the first, some participants acquired explicit knowledge of trigrams containing it rather than of the feature per se.

Reducing the salience of the feature, however, did not enhance implicit learning of the letter sequences; as in Experiment 1, only the feature was learned implicitly. As the feature employed in Experiment 2 was much less salient, this finding suggests that selection was based on the feature’s usefulness in the task of memorizing each exemplar and the side of the screen where it appeared. In conclusion, the finding from the contextual cueing paradigm that the aspect of a structure that is most useful to one’s current task is selected for implicit learning (Endo & Takeda, 2004) seems to generalize to artificial grammar learning.

General discussion

Implicit learning was initially characterized as a process that automatically and unselectively captures any regularity present in the environment (Hayes & Broadbent, 1988; Mathews et al., 1989; Reber, 1989). This view was opposed by Whittlesea and colleagues (Whittlesea & Dorken, 1993; Whittlesea & Wright, 1997; Wright & Whittlesea, 1998), who demonstrated that implicit learning is selective. According to their episodic processing account, what is learned is determined by what is attended. Structure learning is therefore not guaranteed. Recently, however, it has been argued that regularities in the environment can guide attention and thereby affect what is learned (Chun & Jiang, 1998; Lambert, 2003). The present study further investigated the proposal by Endo and Takeda (2004) that implicit learning involves selection of the aspect of a structure that is most useful to one’s current task. In addition, we explored the possibility raised by the work of Chun and Jiang (1998) that this structural selection can be made implicitly.

Selection on the basis of usefulness in AGL

To explore whether or not selection of the most useful aspect of a structure for implicit learning can be observed outside the contextual cueing paradigm, we tried to replicate the findings of Endo and Takeda (2004) in an AGL-experiment. In Experiment 1 of the present study, the task of memorizing exemplars from two different artificial grammars together with the side of the screen where they appeared could, for some participants, be facilitated by two aspects of the structure of the exemplars: a highly useful feature and the moderately useful sequences of letters permitted by each grammar. Participants presented with these exemplars acquired knowledge of the feature, but not of the letter sequences. Participants who only had letter sequences available to facilitate their task, in contrast, learned this aspect of the structure.

These findings suggested that participants implicitly learned the aspect of the structure that was most useful to their task in the induction phase. However, in Experiment 1, the feature was not only useful but also highly salient. Experiment 2 demonstrated that a non-salient feature could be selected for implicit learning on the basis of its usefulness. Although some participants in this experiment were able to classify fragments, from which the feature had been removed, they were shown to use knowledge of trigrams containing the non-salient feature, rather than of letter sequences unrelated to it. Moreover, this knowledge was shown to be explicit: in the absence of explicit knowledge, participants were unable to classify fragments. Only the highly useful feature was learned implicitly, even though it was not salient.

Contrary to the traditional views that characterize implicit learning as unselective (Hayes & Broadbent, 1988) and ineluctable (Reber, 1989), these results provide further evidence that implicit learning does not inflexibly acquire any structure that is present in the stimuli (cf. Whittlesea & Dorken, 1993; Whittlesea & Wright, 1997; Wright & Whittlesea, 1998). They suggest at minimum that, like familiarity, salience (Whittlesea & Wright, 1997) and spatial organization (Wright & Whittlesea, 1998), usefulness of a structure to one’s current task may affect what people learn. Furthermore, findings from the contextual cueing paradigm indicate that usefulness guides attention (Chun & Jiang, 1998; Endo & Takeda, 2004), suggesting that implicit learning will reliably result in knowledge of the most useful aspect of a structure.

Unselective learning in the SRT-task?

In contrast with the suggestion that implicit learning commonly involves selection of useful information, however, research using the SRT-task indicated that implicit learning of the sequence of locations was not hampered by the presence of a perfectly valid cue to the location of the next stimulus (Cleeremans, 1997; Jiménez & Méndez, 1999, 2001). As noted before, this discrepancy could be due to the involvement of different learning mechanisms in the acquisition of spatial and spatiotemporal associations (Dominey, 2003; Howard et al., 2004). However, there also seems to be a possibility that the seemingly redundant structure was acquired in these SRT-experiments, because it was, in fact, useful to the participants’ task.

In the experiment by Jiménez and Méndez (2001), for example, participants were presented with four different stimuli occurring at four locations on the screen. Two stimuli were designated targets and the others distracters. In addition to responding to the location of the stimulus, participants had to count the number of targets they were presented with. Although the identity of the present stimulus perfectly predicted the location of the next, participants still learned the sequence of locations. This redundant structure learning, however, could be the result of processing the stimuli as either targets or distracters rather than at the level of their unique identity. If the two targets and the two distracters were not distinguished from each other, they would only predict the next location with a validity of 50%. The simple cue would then be less predictive than the finite state grammar, which determined the next location on 80% of the trials. So, this study may be more like the contextual cueing experiment in which two aspects of the structure predicted the target location on half of the trials and both were acquired (Endo & Takeda, 2004).
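The drop in cue validity from 100% to 50% can be made concrete with a small sketch. It assumes a one-to-one mapping between the four stimuli and the four next locations and equally frequent stimuli; the stimulus labels and the function name are hypothetical and are not taken from Jiménez and Méndez (2001).

```python
from collections import Counter, defaultdict

# Hypothetical one-to-one mapping from the current stimulus to the next location.
next_location = {"target_1": 0, "target_2": 1, "distracter_1": 2, "distracter_2": 3}


def cue_validity(coding):
    """Best achievable prediction accuracy when stimuli are coded by `coding`,
    assuming that all four stimuli are equally frequent."""
    by_code = defaultdict(Counter)
    for stimulus, location in next_location.items():
        by_code[coding(stimulus)][location] += 1
    # For each code, the best guess is the location it most often precedes.
    correct = sum(max(counts.values()) for counts in by_code.values())
    return correct / len(next_location)


identity_validity = cue_validity(lambda s: s)                # unique identity
category_validity = cue_validity(lambda s: s.split("_")[0])  # target vs. distracter
GRAMMAR_VALIDITY = 0.8  # proportion of trials determined by the finite state grammar

print(identity_validity, category_validity, category_validity < GRAMMAR_VALIDITY)
```

When each stimulus is coded only as a target or a distracter, every category precedes two different locations equally often, so the best possible guess is right on half of the trials, which is below the 80% offered by the grammar.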

Further research will be needed to establish whether or not selection of the most useful aspect of a structure is involved in implicit learning on the SRT-task.

Selection in a broader context

Although it is not yet clear how selection of the aspect that is most useful to one’s current task and blocking of other aspects was achieved in the present study, it should be noted that similar findings have emerged outside the typical implicit learning paradigms. For example, it has been suggested that second language learners often fail to acquire tense markings on the verb, because the presence of temporal adverbs makes them redundant in understanding the meaning of sentences (Ellis, 2005). Similarly, according to a formal analysis of grammar induction, rules are only represented when they provide the simplest possible description of the language-input that has been received (Chater & Vitányi, submitted).

Moreover, the failure to learn additional information in the presence of a useful cue is a robust finding in classical conditioning. When two conditioned stimuli reliably precede an unconditioned stimulus, the stronger conditioned stimulus is likely to overshadow the other (e.g. Mackintosh, 1971). For example, blocking experiments have shown that no association is formed between a conditioned stimulus (e.g. a light) and an unconditioned stimulus (e.g. a shock) when the unconditioned stimulus has already been associated with another conditioned stimulus (e.g. a tone) (see Rescorla & Holland, 1982, for review). An eye-tracking study provided evidence that the blocking effect is due to learned attention: participants spent little time looking at the redundant cue (Kruschke, Kappenman & Hetrick, 2005). Similar studies in the AGL or contextual cueing paradigms may clarify the contribution of attentional processes to selection of the most useful aspect of a structure in implicit learning.
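The blocking design described above can be illustrated with a minimal simulation. The sketch uses the Rescorla-Wagner learning rule, which is brought in here only as a familiar error-driven account and is not a model proposed or tested in this chapter; it also differs from the learned-attention account of Kruschke et al. (2005). The learning rate, the number of trials and the function name are arbitrary assumptions.

```python
def rescorla_wagner(phases, alpha=0.3, lam=1.0):
    """Associative strengths after training; each phase is (cues, n_trials).

    On every trial all presented cues share one prediction error:
    error = lam - (summed strength of the presented cues).
    """
    strength = {}
    for cues, n_trials in phases:
        for _ in range(n_trials):
            error = lam - sum(strength.get(c, 0.0) for c in cues)
            for c in cues:
                strength[c] = strength.get(c, 0.0) + alpha * error
    return strength


# Blocking group: the tone is first paired with the shock on its own, then with the light.
blocking = rescorla_wagner([(("tone",), 50), (("tone", "light"), 50)])
# Control group: tone and light are paired with the shock as a compound from the start.
control = rescorla_wagner([(("tone", "light"), 50)])

print(f"light after pretraining with the tone: {blocking['light']:.3f}")
print(f"light without pretraining:             {control['light']:.3f}")
```

In this sketch the pretrained tone already predicts the shock, so almost no prediction error remains to be assigned to the light, whereas in the control group the two cues share the available associative strength.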

Selecting and learning implicitly

A second issue investigated by the present study is whether selection and learning of the aspect of a structure that is most useful to a person’s current task can occur without awareness. In both experiments, participants acquired implicit as well as explicit knowledge of the difference between the two groups of exemplars. The more salient the simple feature used in the experiment, the higher the number of participants who became aware of it. Nevertheless, participants also acquired implicit knowledge of both the salient and the non-salient feature. This suggests that even very simple information can be selected and learned implicitly.

This suggestion is in accordance with findings by Frick and Lee (1995), who presented participants with pseudorandom letter sequences containing an invariant letter at one position. Sequences containing the invariant letter at that position were judged to be more familiar than random sequences of the same letters, even by participants who remained unaware of the invariant. The authors concluded that very simple information that would be easy to articulate could nonetheless be learned implicitly. The present experiments suggest that implicit learning of a highly useful aspect of a structure may even occur without leading to awareness when it has to be selected from among other potentially useful aspects.

In conclusion, the present study indicates that implicit learning of artificial grammars does not occur automatically and unselectively. Participants learned the aspect of a structure that was most useful to their current task. This finding is in line with the view that selective attention affects the kind of knowledge acquired in implicit learning (Whittlesea & Wright, 1997; Wright & Whittlesea, 1998) and suggests that usefulness may guide attention. By conceptually replicating the finding from the contextual cueing paradigm that an aspect of a structure can be selected for learning on the basis of its usefulness (Endo & Takeda, 2004), this AGL-study provides evidence that such selection is a common component of implicit learning. In addition, the results suggest that the aspect of a structure that is most useful to one’s current task may be selected and learned without reaching awareness.