
The animacy bias in visual object recognition cannot fully be explained by processing speed

Reina van der Goot

Student number: 10786872
Supervisor: Daniel Lindh

Word count (without abstract): 3500

Abstract

When images are presented in a rapid serial visual presentation (RSVP), a second target (T2) is often missed when it falls within a 500 millisecond (ms) window after a first detected target. This phenomenon is contingent on subjects attending the first target, limiting the capacity to process the second target, and has thus been dubbed the Attentional Blink (AB). Influential models of the AB postulate a two-stage process, in which both targets are first processed up to a high-level representation. This is followed by a second, post-perceptual stage in which target representations are put into a more durable working memory (WM) state. Traditionally, the AB has been investigated using low-level feature stimuli, such as letters and digits, but there is a limited understanding of how natural images are affected by the AB. Previous research has found an animacy bias in the AB task, where the AB effect is smaller for animate objects. It is known that animate objects are represented on a continuum throughout the higher visual areas, and that this is related to the speed of processing. However, it is still unknown whether this animacy bias in the AB is solely due to the processing speed of the objects. Here we address this question by comparing behavioural performance on animate and inanimate objects in the AB and in a slower but similar working memory (N-back) task, in which subjects are given enough time to process the whole scene. An animacy bias was found in both the AB and the working memory task, providing evidence that the bias towards animate objects is not solely due to processing speed, but is present in later stages of processing as well.

Introduction

Humans are remarkably rapid at detecting objects: decisions about whether an animal is present in a scene can be made within 120 milliseconds (ms) (Kirchner & Thorpe, 2006). However, when images are presented in a rapid serial visual presentation (RSVP), a second target (T2) is often missed when it falls within a 200-500 ms window after a first detected target (T1). This phenomenon is contingent on subjects attending the first target, limiting the capacity to process the second target, and has thus been dubbed the Attentional Blink (AB; Raymond, Shapiro, & Arnell, 1992).

The AB has traditionally been investigated using highly familiar and overlearned low-level feature stimuli, such as letters and digits (Martens, Dun, Wyble, & Potter, 2010). Because there is a limited understanding of how natural images are affected by the AB, the natural images used in the present study provide a way of examining categorical objects with ecologically valid exemplars. Previous studies with natural images have shown that targets were less often detected and identified when faces were present among the distractors of an RSVP (Evans & Treisman, 2005). Furthermore, a smaller AB has been observed for animate objects (Guerrero & Calvillo, 2016), and faces did not show an AB at all (Awh et al., 2004), providing evidence for a bias towards animacy. The present study aims to explore how this animacy bias arises.

It is generally assumed that the AB arises from capacity limitations in the brain, a so-called bottleneck. Chun and Potter (1995) proposed one of the most influential models to account for the AB phenomenon: the two-stage model. In the first stage of the model, all visual information for object recognition is processed and conceptual representations are formed. These representations are volatile and quickly forgotten when other stimuli interfere. The second stage of the model is assumed to be capacity limited and to draw on attentional resources. In this stage the brief representations of the stimuli are consolidated into working memory (WM) and become stable and available for report (Chun & Potter, 1995). When T2 is presented within 200-500 ms after T1, it must wait for T1 to be consolidated in the capacity-limited stage 2. This leaves T2 vulnerable to interference from other items competing for representation: the longer it takes for T1 to be processed, the fewer attentional resources are left to consolidate T2 and the greater the chance of interference from other items (Dux & Marois, 2009). Evidence supporting the two-stage model comes from the finding that an N400, an electrophysiological marker of semantic mismatch, is still present even when T2 is not reported (Luck, Vogel & Shapiro, 1996). This finding suggests that the AB does not arise from capacity limitations in perceptual processing, but rather from capacity limitations in later stages of processing. More recent evidence from neuroimaging and electrophysiological studies demonstrated that early stages of processing (visual areas) respond to both missed and reported T2s, whereas later stages of processing (parietal-frontal regions), corresponding to the second stage in the model, only respond to reported T2s (Kranczioch, Debener, Schwarzbach, Goebel, & Engel, 2005). The involvement of WM has also been shown by several studies: a larger AB is observed when T1 encoding load or the strength of the mask is increased (Ouimet & Jolicœur, 2007; Dux & Coltheart, 2005).

One possible explanation for the animacy bias is that animate objects are processed faster than inanimate objects. An animate T2 would therefore need less processing time than an inanimate T2, and would thus have a greater chance of reaching the later stages of processing, where it can be consolidated and reported. The finding that animate objects (120 ms) are detected faster than scenes (160 ms) and vehicles (180 ms) (Crouzet, Joubert, Thorpe & Fabre-Thorpe, 2012) supports this idea. How fast an object is processed has been found to depend on its level of animacy: Carlson, Ritchie, Kriegeskorte and Durvasula (2013) constructed a representational decision boundary for animacy and showed that the more animate an object's representation was, the faster it was identified and thus the faster it was processed.

A distinction for animacy has also been found in the ventral vision pathway, where animate objects are represented on a continuum (Sha, Haxby, Abdi, Guntupalli, Oosterhof, Halchenko & Connolly, 2015). The most animate objects (primates) are represented more laterally on the continuum, whereas the least animate objects (invertebrates and insects) are represented more medially. The least animate objects share the same location on the continuum as inanimate objects, which shows that no distinct boundary is present (Sha et al., 2015). In sum, studies show that animate objects are processed faster than inanimate objects, but the precise nature of the bias is still unclear. One plausible notion is that the rapid processing speed of animate objects might aid the emergence of the second target into consciousness, which would relate to the first stage of processing. In the current study, I address the possibility that the animacy bias is also present in the second stage of processing.

The present study aims to answer the question whether the animacy bias is solely due to processing speed in the first stages of processing, or whether the bias is present in later stages of processing as well. Our goals in this study are therefore (1) to replicate previous findings of the animacy bias in the first stages of processing (Evans & Treisman, 2005; Crouzet et al., 2012; Carlson et al., 2013; Sha et al., 2015; Guerrero & Calvillo, 2016) and (2) to answer the question whether this bias is also present in later stages of processing. To answer these questions I will be looking at animacy differences across the AB task and a WM task. In order to test the influence of animacy on the second stage of processing, we employed a working memory task, an N-back, which is very similar to the AB but gives subjects enough time to process the whole scene. Both tasks rely on encoding, maintaining and comparing object representations, but processing speed is a limiting factor only in the AB (Jaeggi, Buschkuehl, Perrig & Meier, 2010).

I reasoned that if the animacy bias were solely a consequence of processing speed, an animacy bias should be found in the AB task but not in the WM task. This outcome would be consistent with previous research showing an animacy bias in the first stages of processing. Alternatively, if the animacy bias is not solely a consequence of processing speed, the bias should be found in the AB task as well as in the WM task. The present study would then be the first, to my knowledge, to show the animacy bias in later stages of processing.

Experiment 1

Experiment 1 was set up to examine whether an animacy bias was present in the AB.

Method

Participants. Twenty-two students (16 female; M age = 20.38 years, SD = 2.01) from the University of Amsterdam participated in the experiment for course credits and were recruited through an advertisement posted on a website dedicated to research participation (http://lab.uva.nl). All participants reported normal or corrected-to-normal visual acuity and were unfamiliar with the purpose of the experiment. The experiment was approved by the Ethical Committee of the Psychology Department at the University of Amsterdam. Before starting the experiment, all participants were asked to sign an informed consent form, which provided information about the research such as its purpose, procedure and contact information. They were informed that they could stop at any given moment.

Stimuli and apparatus. Sixteen natural images were selected from the ImageNet database (http://image-net.org; see Fig. 1a). The images were drawn from two categories, animate and inanimate, each comprising four groups of two images. The animate category consisted of bears, apes, beetles and butterflies, while the inanimate category consisted of cars, airplanes, closets and chairs. In the AB task all images and distractors were converted to greyscale to prevent ceiling effects and presented on a grey background. All images were square and presented at the centre of the screen, with the object centred in the image, at 5 degrees of visual angle. The stimuli used in this and subsequent experiments were displayed on a computer (screen resolution: 1920x1080, refresh rate: 60 Hz) in a dimly lit room at a viewing distance of 60 cm. The generation of the AB and N-back stimuli and the collection of the responses were controlled using the Psychophysics Toolbox extension (Kleiner et al., 2007) for MATLAB (Mathworks, Natick, MA). Participants made manual responses using the computer keyboard.
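As a side note on the display geometry, the on-screen stimulus size follows from the standard visual-angle formula, size = 2d·tan(θ/2). The Python sketch below illustrates the arithmetic; the 53 cm screen width is an assumed value for a typical 1920-pixel-wide monitor and is not reported in the text.

```python
import math

def visual_angle_to_pixels(deg, viewing_cm=60.0,
                           screen_width_cm=53.0, screen_width_px=1920):
    """Convert a visual angle in degrees to an on-screen size in pixels.

    Uses size = 2 * d * tan(theta / 2). The 53 cm screen width is an
    assumption for illustration; viewing distance and resolution are
    taken from the apparatus description.
    """
    size_cm = 2 * viewing_cm * math.tan(math.radians(deg) / 2)
    return size_cm * screen_width_px / screen_width_cm

# Stimuli were presented at 5 degrees of visual angle from 60 cm away.
print(round(visual_angle_to_pixels(5.0)))  # ~190 px under these assumptions
```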

The AB task is a well-studied, highly robust and reliable task (Dale & Arnell, 2013), and I therefore expected the task to be sensitive enough to detect an animacy bias if animals have priority in the first stages of processing.


Procedure. After signing the informed consent form, participants were seated and given an explanation of the task and the response menu, and a few trials of the task were demonstrated. The experiment consisted of 42 runs of 24 trials each; participants had the option to take breaks between runs. Each trial began with a 500 millisecond fixation point, followed by an RSVP stream of 19 images. Every image was presented for 20 milliseconds, followed by an 80 millisecond blank, so that each stimulus onset asynchrony (SOA) lasted 100 milliseconds. There were two conditions, the lag-2 condition and the lag-7 condition. T2 was always positioned at the 13th position in the RSVP, while the position of T1 varied: in the lag-2 condition T1 was placed at the 10th position in the stream and in the lag-7 condition T1 was placed at the 6th position (see Fig. 1c for an example of one trial). Animate and inanimate images were randomized so that both categories were evenly distributed over T1 and T2, and T2 was never the same object as T1. After 2400 milliseconds (500 ms fixation plus 1900 ms RSVP) a response menu was presented for T1 and T2 (see Fig. 1b). For the identification of T1, four words were presented, two displayed at the top and two at the bottom. Only one word matched T1; the other words were randomly drawn from the other image description words. A similar response menu was presented for T2. Subjects were asked to match T1 and T2 using the corresponding buttons: S and K for the top half of the words, X and M for the bottom half. The score range ran from roughly 25% (chance level) to 100% (when participants show no blink at all); the higher the score, the better participants were at detecting T2.
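For concreteness, the trial structure described above can be expressed as a short sketch. This is illustrative Python rather than the original MATLAB/Psychophysics Toolbox implementation; the function name and image pools are hypothetical.

```python
import random

# Design constants from the Procedure (times in ms).
STREAM_LENGTH = 19                   # images per RSVP stream
IMAGE_MS, BLANK_MS = 20, 80          # 20 ms image + 80 ms blank
SOA_MS = IMAGE_MS + BLANK_MS         # = 100 ms stimulus onset asynchrony
T2_POSITION = 13                     # T2 is always the 13th image
T1_POSITION = {"lag-2": 10, "lag-7": 6}   # T1 position per condition

def build_ab_trial(condition, targets, distractors):
    """Return the ordered image stream for one AB trial.

    'targets' and 'distractors' are hypothetical lists of image
    identifiers; sampling distractors with replacement is a
    simplification of the real design.
    """
    t1, t2 = random.sample(targets, 2)            # T2 never equals T1
    stream = random.choices(distractors, k=STREAM_LENGTH)
    stream[T1_POSITION[condition] - 1] = t1       # positions are 1-indexed
    stream[T2_POSITION - 1] = t2
    return stream

# Stream duration check: 19 x 100 ms SOA = 1900 ms, preceded by a
# 500 ms fixation point, giving the 2400 ms before the response menu.
assert STREAM_LENGTH * SOA_MS == 1900
```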

Analysis. To analyse the effects of animacy on correct T2 report, a 2 (animate vs inanimate) x 2 (lag-2 vs lag-7) repeated-measures ANOVA was performed. The analysis was performed over trials where T1 was correctly identified. To see how much each image suffered from attentional depletion, the AB magnitude (ABM) was calculated by subtracting the lag-2 performance from the lag-7 performance.
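A minimal sketch of this analysis pipeline in Python, assuming long-format data with hypothetical file and column names (the original analysis code is not part of this report):

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Assumed long format: one row per subject x animacy x lag cell, with
# mean T2 accuracy computed over trials where T1 was correctly identified.
df = pd.read_csv("ab_results.csv")  # columns: subject, animacy, lag, t2_acc

# 2 (animate vs inanimate) x 2 (lag-2 vs lag-7) repeated-measures ANOVA.
anova = AnovaRM(data=df, depvar="t2_acc", subject="subject",
                within=["animacy", "lag"]).fit()
print(anova)

# AB magnitude (ABM): lag-7 minus lag-2 performance, per subject and
# animacy category (assumes the lag column holds "lag2" / "lag7").
wide = df.pivot_table(index=["subject", "animacy"],
                      columns="lag", values="t2_acc")
wide["abm"] = wide["lag7"] - wide["lag2"]
print(wide.groupby("animacy")["abm"].mean())
```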

Figure 1. (a) Examples of images used in the study. (b) The response menu appeared twice after the 2400 ms RSVP, to identify T1 and T2. (c) Example of a trial sequence consisting of a 500 ms fixation point followed by 19 images (not all depicted), containing two targets.

Results

Data from one participant (number 22) were excluded because she performed only slightly better than chance level, indicating that she did not perform the task as instructed. No further outliers were removed from the data. Consistent with typical AB studies, we only included T2 trials in which participants correctly reported T1.


Figure 2. Mean T2 accuracy at lag-2 and lag-7 for inanimate and animate objects. Error bars show +/- 1 standard error.

To analyse the effects of animacy and lag on T2 report, a repeated-measures ANOVA was performed. Q-Q plots and histograms showed that the data for lags and animacy were normally distributed. The ANOVA showed a significant main effect of lag, F(1, 20) = 62.43, p < .001, with a large effect size, η² = .76. T2 was reported correctly more often at lag-7 (M = .910, SE = .014, 95% CI [.881, .939]) than at lag-2 (M = .809, SE = .020, 95% CI [.768, .850]). This means that the typical AB pattern, consistent with other AB studies, was found (see Fig. 2). There was also a significant main effect of animacy, F(1, 20) = 6.78, p = .017, with a medium effect size, η² = .25. T2 was reported correctly more often for animate (M = .874, SE = .018, 95% CI [.837, .911]) than for inanimate objects (M = .845, SE = .016, 95% CI [.812, .877]; see Fig. 3a and Fig. 3b). A significant interaction between animacy and lag was found, F(1, 20) = 10.85, p = .004, representing a large effect, η² = .35. At lag-2 the difference between animate objects (M = .833, SE = .021, 95% CI [.789, .877]) and inanimate objects (M = .785, SE = .021, 95% CI [.742, .829]) was larger than the difference between animate (M = .915, SE = .021, 95% CI [.881, .949]) and inanimate objects (M = .905, SE = .013, 95% CI [.877, .932]) at lag-7, meaning a larger AB was found for inanimate objects at lag-2.

Figure 3. (a) Mean AB magnitude (ABM) for animacy per participant. (b) Mean ABM for animacy. Error bars show +/- 1 standard error.

Experiment 2

In order to examine the bias towards animacy in WM, a 2-back task was performed with animate and inanimate objects. The 2-back task was performed in a second experiment to prevent participants from becoming too familiar with the images before the AB task.

Method

Stimuli and apparatus. The same stimuli and apparatus were used in the second experiment; the only difference was that for the N-back the images were presented in colour. As a measure of individual differences in WM capacity, the N-back task shows only modest reliability, but it is found to be very useful for experimental investigation of WM processes, such as in the present study (Jaeggi et al., 2010).

Procedure. After signing the informed consent form, participants were seated and given an explanation of the N-back task and the response menu, and a few trials of the task were demonstrated (see Fig. 4 for an example of a trial). The 2-back task consisted of 6 blocks, each containing 4 runs of 100 trials. Each trial began with a 500 millisecond fixation point, followed by a continuous stream of images. Within each run, 30% of the trials were N-back trials, where the same image was presented as two trials ago. Animate and inanimate objects were again evenly distributed, so that each category had the same number of N-back trials. Each image was shown for 700 milliseconds, with an SOA between 2.3 and 2.7 seconds. For every image, participants were asked to respond whether or not it was the same as the image they had seen two trials ago, using the corresponding buttons: Z for no and M for yes. The score range ran from 0% (no 2-backs correct) to 100% (every 2-back correct).
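A sketch of how such a stream with a 30% 2-back rate might be generated; this is illustrative Python with hypothetical names, not the original MATLAB/Psychophysics Toolbox code:

```python
import random

def build_2back_run(images, n_trials=100, target_rate=0.30):
    """Generate one run of a 2-back image stream.

    'images' is a hypothetical pool mixing animate and inanimate items
    evenly, so that both categories receive the same number of 2-backs.
    Returns the stream and a parallel list marking 2-back trials.
    """
    stream, is_target = [], []
    for i in range(n_trials):
        if i >= 2 and random.random() < target_rate:
            stream.append(stream[i - 2])   # repeat the image from 2 back
            is_target.append(True)
        else:
            # Avoid accidental 2-back repeats on non-target trials.
            pool = [im for im in images if i < 2 or im != stream[i - 2]]
            stream.append(random.choice(pool))
            is_target.append(False)
    return stream, is_target

def trial_soa_ms():
    """Jittered SOA: image shown 700 ms, onset asynchrony 2.3-2.7 s."""
    return random.uniform(2300, 2700)
```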


Analysis. To analyse the effect of animacy on N-back performance, a paired-samples t-test was performed between the correctly reported animate N-backs and the correctly reported inanimate N-backs.
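A sketch of this test in Python, including the effect size r reported in the Results below; the input arrays (one proportion-correct value per participant, in matched order) are hypothetical:

```python
import numpy as np
from scipy import stats

def compare_animacy(animate, inanimate):
    """Paired-samples t-test on per-subject 2-back accuracy.

    'animate' and 'inanimate' each hold one proportion-correct value
    per participant, in the same participant order.
    """
    t, p = stats.ttest_rel(animate, inanimate)
    df = len(animate) - 1
    # Effect size r = sqrt(t^2 / (t^2 + df)), as commonly reported
    # alongside paired t-tests.
    r = np.sqrt(t**2 / (t**2 + df))
    return t, p, r
```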

Results

Data from one participant (number 22) were excluded because she had not performed the earlier AB task as instructed. No further outliers were removed from the data. To test whether participants scored better on the 2-back for animate objects than for inanimate objects, a paired-samples t-test was performed. There were no significant outliers in the differences between animate and inanimate objects, and the Shapiro-Wilk test showed that these differences were normally distributed (p = .133). On average, participants performed better on the 2-back with animate objects (M = .719, SD = .215) than with inanimate objects (M = .698, SD = .222). The difference, .021, BCa 95% CI [.004, .039], was significant, t(21) = 2.55, p = .019, and represented a medium-sized effect, r = .495.

Figure 5. (a) Mean N-back score for animacy per participant. (b) Mean N-back score for animacy. Error bars show +/- 1 standard error.

Discussion

In this study we aimed to examine whether the animacy bias is solely due to processing speed in the first stages of processing, or whether the bias is also present in later stages, by using an AB task and an N-back task. The results showed an animacy bias in both the AB task and the N-back task. Since inanimate objects are processed within 180 ms (Crouzet et al., 2012), images in the N-back task can be assumed to have been fully processed during the 700 ms presentation time. We can therefore conclude that the animacy bias is present in the WM stage, and that it is implausible that the bias is solely due to processing speed.

Our results show a prominent AB effect with our stimuli: participants performed better at lag-7 than at lag-2 (see Fig. 2). Furthermore, we corroborated earlier findings (Guerrero & Calvillo, 2016) showing categorical differences in the ABM, where animate objects were less affected by the attentional depletion caused by T1. In the study by Guerrero and Calvillo (2016) the effect of animacy was smallest at lag-2, which is contrary to the typical AB pattern, where short lags lead to the most pronounced differences. The effect of animacy in the present study followed the typical AB pattern and was most pronounced at lag-2.

Until now, the animacy bias had only been shown to be present in the first stages of processing (Evans & Treisman, 2005; Crouzet et al., 2012; Carlson et al., 2013; Sha et al., 2015; Guerrero & Calvillo, 2016). One plausible notion was that the rapid processing speed of animate objects might aid the emergence of the second target into consciousness, which would relate to the first stage of the two-stage model. However, our results show an animacy bias in WM, providing evidence that the bias is not only present in the first stages of processing, but in later stages as well. Therefore, other factors besides processing speed must also contribute to the bias.


Monkey research has shown that prefrontal cortex (PFC) cells are reciprocally connected with the visual areas and show sustained activity during the delay period of a WM task (Wilson, O'Scalaidhe & Goldman-Rakic, 1993). This suggests that PFC cells activate perceptual representations in the posterior visual areas during a WM delay through recurrent activation, leading to an object being remembered (Ungerleider, Courtney & Haxby, 1998). Although the monkey research used colour or spatial cues (Ungerleider et al., 1998), this could also be the case for object features. Results from Carlson et al. (2013) and Sha et al. (2015) showed that animate objects have certain features that make them distinct from inanimate objects. One possible explanation for the animacy bias, besides processing speed, would be that animate features are easier to remember through this recurrent activation because they are more distinct and less ambiguous than inanimate features. At the moment this is not yet clear, and future research is needed to provide more insight into the other factors contributing to the animacy bias.

The present study had a few limitations. Low-level feature differences between categories could possibly cause differences in T2 report, leading to a bias in the results. There are, however, a few reasons to assume this is not the case. Guerrero and Calvillo (2016) used objects presented centred on a white background, removing low-level differences in the surroundings that could influence T2 report, and still found an animacy bias. Hagen and Laeng (2017) built on this finding by using images that were pre-experimentally balanced on low-level characteristics, and found an even larger effect size for animacy. These studies make it unlikely that the results of the present study were caused by differences in low-level features. Another limitation was the use of only unnatural (man-made) inanimate objects in combination with natural animate objects. It is possible that unnatural inanimate T2s have a different reportability than natural inanimate T2s, such as rocks.


Further research is needed to provide more insight into the influence of naturalness on the animacy bias.

In conclusion, the present study was, to my knowledge, the first to show that the animacy bias is not only present in the first stages of processing, but is also evident in later stages, thereby providing evidence that the animacy bias is not solely due to processing speed in the first stages of processing.

References

Awh, E., Serences, J., Laurey, P., Dhaliwal, H., van der Jagt, T., & Dassonville, P. (2004). Evidence against a central bottleneck during the attentional blink: Multiple channels for configural and featural processing. Cognitive Psychology, 48, 95-126.

Chun, M. M., & Potter, M. C. (1995). A two-stage model for multiple target detection in rapid serial visual presentation. Journal of Experimental Psychology: Human Perception and Performance, 21, 109-127.

Carlson, T. A., Ritchie, J. B., Kriegeskorte, N., Durvasula, S., & Ma, J. (2013). Reaction time for object categorization is predicted by representational distance. Journal of Cognitive Neuroscience, 26, 132-142.

Crouzet, S. M., Joubert, O. R., Thorpe, S. J., & Fabre-Thorpe, M. (2012). Animal detection precedes access to scene category. PLoS ONE, 7(12), e51471.

Dale, G., & Arnell, K. M. (2013). How reliable is the attentional blink? Examining the relationship within and between attentional blink tasks over time. Psychological Research, 77, 99-105.

Dux, P. E., & Coltheart, V. (2005). The meaning of the mask matters: Evidence of conceptual interference in the attentional blink. Psychological Science, 16, 775-779.


Dux, P. E., & Marois, R. (2009). The attentional blink: A review of data and theory. Attention, Perception & Psychophysics, 71, 1683-1700.

Evans, K. K., & Treisman, A. (2005). Perception of objects in natural scenes: Is it really attention free? Journal of Experimental Psychology: Human Perception and Performance, 31, 1476-1492.

Guerrero, G., & Calvillo, D. P. (2016). Animacy increases second target reporting in a rapid visual presentation task. Psychonomic Bulletin & Review, 23, 1832-1838.

Hagen, T., & Laeng, B. (2017). Animals do not induce or reduce attentional blinking, but they are reported more accurately in a rapid serial visual presentation task. i-Perception, 8(5), 2041669517735542.

Jaeggi, S. M., Buschkuehl, M., Perrig, W. J., & Meier, P. B. (2010). The concurrent validity of the N-back task as a working memory measure. Memory, 18, 394-412.

Kirchner, H., & Thorpe, S. J. (2006). Ultra-rapid object detection with saccadic eye movements: Visual processing speed revisited. Vision Research, 46, 1762-1776.

Kranczioch, C., Debener, S., Schwarzbach, J., Goebel, R., & Engel, A. K. (2005). Neural correlates of conscious perception in the attentional blink. NeuroImage, 24, 704-714.

Luck, S. J., Vogel, E. K., & Shapiro, K. L. (1996). Word meaning can be accessed but not reported during the attentional blink. Nature, 383, 616-618.

Martens, S., Dun, M., Wyble, B., & Potter, M. C. (2010). A quick mind with letters can be a slow mind with natural scenes: Individual differences in attentional selection. PLoS ONE, 5, e13562.

MATLAB and Statistics Toolbox Release 2012b, The MathWorks, Inc., Natick, Massachusetts, United States.

Ouimet, C., & Jolicoeur, P. (2007). Beyond task 1 difficulty: The duration of T1 encoding modulates the attentional blink. Visual Cognition, 15, 290-304.


Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18, 849-860.

Sha, L., Haxby, J. V., Abdi, H., Guntupalli, J. S., Oosterhof, N. N., Halchenko, Y. O., & Connolly, A. C. (2015). The animacy continuum in the human ventral vision pathway. Journal of Cognitive Neuroscience, 27, 665-678.

Shapiro, K. L., Raymond, J. E., & Arnell, K. M. (1997). The attentional blink. Trends in Cognitive Sciences, 1, 291-296.

Ungerleider, L. G., Courtney, S. M., & Haxby, J. V. (1998). A neural system for human visual working memory. Proceedings of the National Academy of Sciences, 95, 883-890.

Wilson, F. A., O'Scalaidhe, S. P., & Goldman-Rakic, P. S. (1993). Dissociation of object and spatial processing domains in primate prefrontal cortex. Science, 260, 1955-1958.
