
Attentional Bias Towards Animacy During the Attentional Blink

Ingmar Eiling

University of Amsterdam

Word count: 193 (abstract), 4334 (text body)

Abstract

Human object recognition is generally taken to involve hierarchical transformations from low- to high-level feature complexity. Several lines of research have shown this process to favor objects and natural scenes containing animacy. Whether animacy affects attentional processing as well has not yet been explored. To this end, participants were tasked with identifying two targets, either animate or inanimate objects, within a rapid serial visual presentation of distractor stimuli. An impairment in identifying the second target (T2) is induced if it is presented shortly after the first (T1), a phenomenon known as the attentional blink (AB). Animacy was hypothesized to reduce the AB due to an attentional bias for mid- to high-level animate features. The expected increase in T2 identification was found for animate compared to inanimate objects at the 100 and 200 ms lags, while report rates converged to similar levels at the 700 ms control lag. Additionally, a deep convolutional neural network was employed to investigate potential effects of high-level featural similarity between targets on T2 identification, yet no correlation with animacy or behavioral measures was observed. Overall, the presented work is in line with previous literature regarding a visual processing bias for animacy.


Introduction

The human visual system is continuously exposed to a large stream of information from our surroundings, and yet we are able to identify relevant objects apparently effortlessly. This is a sizeable computational feat given the vast total variation in object position, range, viewing angle, lighting and context (DiCarlo, Zoccolan, & Rust, 2012). The process of object identification is also very rapid: scenes containing animals, shown for only 20 ms, can be differentiated as such within as little as 120 ms (Kirchner & Thorpe, 2006). This happens during the hierarchically layered transformation of retinal images from low-level visual features, such as edge orientation and contrast, into more high-level object parts. When and how our attention interacts with these object representations is still unclear.

A frequently used method for discerning differences in attention over time is rapid serial visual presentation (RSVP). This method involves a fixed stream of stimuli shown for approximately 100 ms each. One particular phenomenon studied with RSVP is the impairment in reporting a second target (T2) in the stream when it appears around 200 to 500 ms after the first target (T1; Broadbent & Broadbent, 1987). Later coined the attentional blink (AB), this notable constraint in our ability to report sequential targets was shown to be an attentional bottleneck rather than a sensory limitation (Raymond, Shapiro, & Arnell, 1992). However, the second target is often spared from this limitation if it is shown earlier than the 200 ms mark without intervening distractors, commonly known as lag-1 sparing (Dux & Marois, 2009). The present study aims to explore how object types in natural scenes reach our attention during this sparing of the attentional blink.

Standard accounts of the AB usually employ alphanumeric characters, such as letters for targets and numbers for distractors. These targets are low in feature complexity and extremely well-learned by most literate humans (Martens, Dun, Wyble, & Potter, 2010). In this sense, extension of the AB effect towards real-world object recognition is limited. To this end, a set of experiments using natural scenes in RSVP was conducted by Evans and Treisman (2005). In this study no AB impairment was observed when participants were asked to merely detect animal targets as seen or not seen, rather than identify them by name. This suggested that feature sets of an image signifying particular animal shapes undergo parallel detection, while actual featural binding into identifiable objects is subject to serial processing. Whether this can explain a potential sparing effect for natural scenes is, however, unclear; their model allows for only a single object to be integrated and identified at a time.

Several subsequent RSVP studies did not report any sparing effects for short lags using object depictions (Einhäuser, Koch, & Makeig, 2007; Dux & Harris, 2007). Although Livesey and Harris (2011) found lag-1 sparing for objects at short lags, they were critical of its small effect size in comparison to standard alphanumeric characters. One study that did report regular lag-1 sparing used diverse categories of real-world objects (Potter, Wyble, Pandav, & Olejarczyk, 2010). Although they did not report any category-specific effects for animate or inanimate types, they rejected the serial feature binding during single attentional episodes proposed by Evans and Treisman (2005). Instead, the facilitation of T2 processing during lag-1 sparing was attributed to T1 feature binding: a T2 immediately following T1 has a larger chance of joining its attentional processing, resulting in higher target report, but not if it falls at the later lags typical for the AB.

Recently, it was found that reaction times for natural scene identification in RSVP can be just as quick as for simple features (Howe, 2017). Equally rapid identification of complex and simple features is at odds with any account dedicating feature binding to attentional processes, as involving this additional neural activity would take at least 20 to 30 ms longer. Early feature binding, before the onset of attentional processing, is therefore likely. Similarly, the recognition of animals within natural scenes has been shown to take precedence over scene detection itself (Crouzet, Joubert, Thorpe, & Fabre-Thorpe, 2012). If an animal is shown in a scenic context, saccades are triggered to register it before the rest of the scene is processed, unlike in scenes with vehicles. These involuntary quick eye movements are thus directed to prioritize animals (and human faces) in a natural scene on the basis of early object identification. Perhaps, then, animate feature binding is also pre-attentive, resulting in more rapid feature processing and faster or even prioritized entrance into attention.

It should be noted that animates and inanimates are not represented in discrete ‘dead or alive’ categories in cognition. A strong correlation was found between ‘how animate’ something looks and how fast it is identified as such (Carlson, Ritchie, Kriegeskorte, Durvasula, & Ma, 2013). In this case, the representational distance of an animate or inanimate exemplar to a decision boundary was modelled; the more objects resembled animate parts, the faster the reaction time was to treat them as such. A similar continuum of high to low animacy was also found in the human ventral visual pathway, although no differences in activation patterns were observed between living things with low animacy, such as insects, and nonliving objects (Sha et al., 2015). In sum, areas in the ventral stream responsible for processing mid- to high-level features will process objects in a graded manner from animate to inanimate, with higher animacy receiving faster processing. Why and how they would be prioritized in attentional processing is not yet well understood.

A possible explanation for such rapid animate feature binding is that integrated objects convey more information about potentially nearby animals, including humans, than separate feature sets do. New, Cosmides and Tooby (2007) hypothesized that high-level features drive our attentional priority for animals over otherwise similar inanimates signifying potential danger, such as vehicles. This would stem from the ‘hard-wired’ evolutionary importance of animals in our ancestral surroundings. The latter premise was tested in RSVP by Guerrero and Calvillo (2016), who explored animate and inanimate images that were rated either as threatening or not. They concluded that animacy, but not threat, facilitated T2 identification at both 200 and 400 ms lags. Despite the absence of a threat effect and the lack of control for the magnitude of the AB, these results suggest an attentional processing bias for animacy.

The present study aims to investigate this attentional bias by examining T2 identification for natural scenes with animate and inanimate objects. To accomplish this, RSVP is used with T2 presented at two AB lags: lag-1 at 100 ms and lag-2 at 200 ms after T1 presentation. To control for the AB magnitude, lag-7 is used at 700 ms, falling outside of the typical AB period. The images depict several animals, vehicles and pieces of furniture in their common settings, presented for 20 ms each within the stream. This matches the presentation duration used by Kirchner and Thorpe (2006) and potentially avoids a ceiling effect from faster natural scene processing. The animate images are taken to adequately represent the low- to high-animacy continuum by including mammals and insects, whereas the inanimate images include furniture and vehicles. Behavioral data from testing these images are also matched against their high-level feature values as extracted from a convolutional neural network (CNN; Krizhevsky, Sutskever, & Hinton, 2012).

The CNN was used to define feature spaces for each animate and inanimate target image and subsequently compare their similarity. Contemporary neural networks achieve object recognition comparable to human performance, making them usable as a tool for reproducing human visual feed-forward processing (Kriegeskorte, 2015). A CNN feeds an image through hierarchical layers of artificial neural nodes with receptive fields of increasing size, eventually matching the object against its learned database. This happens in similar fashion to human processing of low- to high-level feature complexity across the ventral stream (Güçlü & Van Gerven, 2015). Extracting high-level feature representations therefore gives an idea of how similar target images are treated by layered processing, and how animacy would be represented, before the target representation enters attention.

Building on Guerrero and Calvillo (2016), higher identification rates for animate targets are hypothesized to indicate elevated attentional processing for high-level animate features. The expectation follows that T2 report rates for animate objects are higher than for inanimate objects. This effect should be most pronounced at lag-2, as it would convey biased attentional processing for animacy in spite of the general AB effect. In addition, a potential effect of animacy showing at lag-1 would suggest enhanced attentional processing for rapidly bound animate features. As for target similarity, hypothetical switching costs should be incurred during the attentional processing of very different feature representations. Therefore, similar high-level features between targets should correlate with higher T2 recognition. It is assumed here that animate targets share high-level features with each other and are therefore expected to correlate more strongly than mixed animate and inanimate target pairs.

Method

Participants

Twenty-two native Dutch speakers (ages 18 to 26; 5 male, 18 right-handed) participated in the experiment for course credit. Recruitment was done via the University of Amsterdam lab website (http://lab.uva.nl) and word of mouth. Every participant reported normal or corrected-to-normal vision. Informed consent was obtained and the experimental procedures were approved by the Faculty Ethics Review Board (FMG) of the University of Amsterdam.

Materials

Sixteen images in total were gathered from the ImageNet database (http://image-net.org) and used as target stimuli. Of these, 8 were labelled as animate (in the categories bear, ape, butterfly and beetle) and 8 as inanimate (in the categories plane, car, chair and cabinet). These images were square with the object centered. All stimuli were displayed at 5 degrees of visual angle. Images and distractors were presented in greyscale in order to reduce potential ceiling effects from color-based object detection. Stimuli were shown on a 60 Hz, 1920x1080 resolution monitor using the Psychophysics Toolbox extension (Kleiner et al., 2007) for MATLAB (MathWorks, Natick, MA). They were presented in the screen center on a uniform grey background. The test setting was a closed cubicle with soft lighting and a mechanical response keyboard.

For target similarity, a pretrained CNN was implemented in Caffe for Python (Jia et al., 2014). This network was trained on ~1 million images from the ImageNet database, none of which were used as targets in the experiment. The CNN is a feature model that transforms an object stimulus from pixel values into non-linear feature representations. It consisted of five convolutional and three fully connected layers. In every convolutional layer, feature detectors with a certain receptive field are replicated all over the image to form a feature map. This means they detect whether a feature is present across the image space and feed that answer as a single value into the next layer, reducing the total spatial resolution. The three fully connected layers eventually take all features at all locations in the previous layer as input and compare them to similar patterns of unit activations learned from the total ImageNet database. All stimuli were processed and their layer 7 unit activations extracted from the CNN; these activations were later used to compute the featural similarity of the T1-T2 combination in each trial (see Results).
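The extraction and similarity computation could look roughly as follows. This is a minimal sketch only: it uses torchvision's pretrained AlexNet as a stand-in for the Caffe model described above, the stimulus file names are hypothetical, and pairwise similarity is computed as Fisher-transformed Pearson correlations as reported in the Results.

```python
# Sketch: extract "fc7"-like activations for the 16 target images and compute
# pairwise similarity. torchvision's AlexNet is a stand-in for the Caffe model.
import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),   # stimuli were greyscale
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.alexnet(pretrained=True).eval()     # newer torchvision: weights=...

def fc7_activations(path):
    """Return the 4096-d activation vector of the penultimate fully connected layer."""
    img = preprocess(Image.open(path)).unsqueeze(0)    # 1 x 3 x 224 x 224
    with torch.no_grad():
        x = model.features(img)
        x = model.avgpool(x)
        x = torch.flatten(x, 1)
        x = model.classifier[:6](x)                    # stop after fc7 + ReLU
    return x.squeeze(0).numpy()

# Hypothetical stimulus files: 8 animate and 8 inanimate targets.
paths = [f"stimuli/target_{i:02d}.png" for i in range(16)]
acts = np.stack([fc7_activations(p) for p in paths])   # 16 x 4096

# Pairwise similarity: Pearson correlation between activation vectors,
# Fisher z-transformed as in the Results section.
r = np.corrcoef(acts)                                  # 16 x 16
np.fill_diagonal(r, 0.0)                               # avoid arctanh(1) = inf
similarity = np.arctanh(r)                             # Fisher z
```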

Figure 1. Example of a lag-2 trial in RSVP in temporal order, with scrambled distractors and the butterfly and airplane as targets. Note that not all 17 distractors are depicted.

Procedure

After giving informed consent, participants were briefed and shown a short explanation of the experiment and the response keys. The experiment itself consisted of 42 runs of 24 trials each with a short break, totaling 1008 RSVP trials per participant. A trial presentation sequence is shown in Figure 1. Each trial started with a fixation cross centered on the screen for 500 ms, after which 19 images were shown in one stream containing two targets and 17 distractors. Each image was shown for 20 ms followed by an 80 ms blank (grey background). T2 was fixed in the 13th position of the stream, while T1 was placed at the 6th, 11th or 12th position, depending on the lag.

Animates and inanimates were pseudo-randomly presented as T1 and T2 during trials, with both categories appearing equally often as T1 and as T2 on average. Each image was shown as T2 an equal number of times while never being the same as the preceding T1. After each RSVP trial, a response menu for T1 was prompted, consisting of a question to identify the first image and 4 target identities in rows of 2 below the question. One was the correct word (e.g. beer for bear) while the others were drawn randomly from other image identities (e.g. kever, beer, auto and stoel for beetle, bear, car and chair, respectively). Subjects responded by pressing the corresponding key (S or K for the top half, X or M for the bottom half) within 5 seconds. The same response menu was then shown immediately afterwards for T2.
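For illustration, the structure of a single trial stream can be sketched as below. The experiment itself was run in MATLAB with Psychtoolbox; this Python sketch only shows how the T1 position follows from the lag given the fixed T2 position and the 100 ms item duration, and the stimulus file names are hypothetical.

```python
# Sketch of one RSVP stream: 19 items, T2 fixed at position 13, T1 position
# derived from the lag (each item lasts 100 ms: 20 ms image + 80 ms blank).
import random

N_ITEMS = 19      # 2 targets + 17 distractors
T2_POS = 13       # fixed T2 position (1-indexed)

def build_trial(lag, t1_image, t2_image, distractors):
    """Return the ordered list of images for one RSVP stream."""
    t1_pos = T2_POS - lag                       # e.g. lag-2 -> position 11
    stream = random.sample(distractors, N_ITEMS - 2)
    stream.insert(t1_pos - 1, t1_image)         # convert to 0-indexed
    stream.insert(T2_POS - 1, t2_image)
    return stream

# Example: a lag-2 trial with an animate T1 and an inanimate T2.
trial = build_trial(2, "butterfly_03.png", "plane_01.png",
                    [f"scrambled_{i:02d}.png" for i in range(17)])
```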

Design

For the analysis of animacy effects on correct T2 report rate, a 2 (animate or inanimate) x 3 (lag-1, 2 and 7) within-subjects factorial design was used. Lag was defined as the stimulus onset asynchrony (SOA) between T1 and T2: lag-1 amounted to a 100 ms SOA, lag-2 to a 200 ms SOA and lag-7 to a 700 ms SOA. Animacy refers to whether the target depicted an animal or a non-animal object. The dependent variable was the proportion of correct T2 reports conditional on a correct T1 report (T2|T1). Target similarity was analyzed by correlating paired target similarity values with correct T2 report at each lag.
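A minimal sketch of this analysis, assuming a long-format trial table with hypothetical column names, could use a repeated-measures ANOVA as provided by statsmodels:

```python
# Sketch of the 2 (animacy) x 3 (lag) within-subjects analysis on T2|T1 accuracy.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# trials: one row per trial with columns
#   subject, animacy ('animate'/'inanimate'), lag (1, 2, 7), t1_correct, t2_correct
trials = pd.read_csv("trials.csv")                    # hypothetical file

# T2|T1: proportion of correct T2 reports among trials with a correct T1 report.
valid = trials[trials["t1_correct"] == 1]
cell_means = (valid.groupby(["subject", "animacy", "lag"])["t2_correct"]
                   .mean()
                   .reset_index(name="t2_given_t1"))

res = AnovaRM(data=cell_means, depvar="t2_given_t1",
              subject="subject", within=["animacy", "lag"]).fit()
print(res.anova_table)
```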

Results

The data of one participant were excluded on the basis of chance-level T1 and T2 report rates (33.73% and 19.83%, respectively) and mostly instantaneous responses, indicating non-adherence to task instructions. After exclusion, correct report for T1 was 87.98% overall (n = 21), while T2 report conditional on correct T1 report (T2|T1) was 85.27%. Mean reaction time for correct T1 was 1077 ms (SD = 513 ms) and for correct T2 841 ms (SD = 577 ms). This 236 ms difference is probably due to increased response readiness for reporting T2 after the T1 menu, considering that the exact same text and position were used for both target response menus. Apart from T2 position in the RSVP, all presentation variables were kept (pseudo-)random and are therefore unlikely to explain this difference.

Analysis of T2 rates as a function of animacy and lag was done using a factorial repeated measures ANOVA. Histograms and Q-Q plots showed normally distributed data across lag and animacy, and the plotted standardized residuals showed no heteroscedasticity or non-linearity. Sphericity was assumed following a non-significant Mauchly's test. In the two-way ANOVA, there was a significant main effect of lag, F(2, 40) = 52.145, p < .001, η2p = .723.

Overall, T2 was more often correctly reported at lag-7 (M = .910, SE = .014, 95% CI [.881, .939]) than at lag-1 (M = .811, SE = .021, 95% CI [.766, .855]) or lag-2 (M = .809, SE = .020, 95% CI [.768, .850]). This confirms that the T2 impairment characteristic of the AB paradigm was found across participants, as lag-7 is the control SOA. A significant main effect of animacy was also observed, F(1, 20) = 7.548, p = .012, η2p = .274. T2 was reported correctly more often when it was an animate object (M = .858, SE = .019, 95% CI [.819, .896]) than when it was an inanimate object (M = .829, SE = .018, 95% CI [.792, .866]).

As seen in Figure 2, there was a significant interaction effect between animacy and lag, F(2, 40) = 4.369, p = .019, η2p = .179. Across participants, animate T2 report (M = .833, SE = .021, 95% CI [.789, .877]) was significantly higher than inanimate T2 report (M = .785, SE = .021, 95% CI [.742, .829]) at lag-2. This difference resulted in a significant interaction between animate and inanimate targets at lag-2 as compared to lag-7, F(1, 20) = 10.850, p = .004, η2p = .352. This considerable effect size reflects that the difference between animate and inanimate T2 report across participants was significantly larger at lag-2, the ‘blink’ lag, than at lag-7. For lag-1 versus lag-7, however, the levels of animate and inanimate T2 report did not significantly interact, F(1, 20) = 1.529, p = .231, η2p = .071. A paired-samples t-test did reveal the mean difference of 0.0281 (2.81%) between animate and inanimate T2 at lag-1 to be just significant, t(20) = 2.136, p = .045. For lag-2, the mean difference of 0.0478 (4.78%) was confirmed significant as well, t(20) = 3.378, p = .003. Both differences are observable in Figure 2.

To investigate the effect of target similarity on T2 reportability, unit activations within layer 7 of the CNN were correlated per image pair using Fisher-transformed Pearson correlation coefficients. Overall, values ranged from -.074 to .558 (M = .036, SD = .097) with a positive skew of 2.323 (SD = .018). Spearman's rank correlation coefficient, being robust to skewed distributions and outliers, was used to correlate the similarity values with correct T2|T1 reporting rates at lag-1, 2 and 7 separately. No significant correlation was found between the degree of target similarity and T2 reporting rates at lag-1, ρs = -.145, p = .113, lag-2, ρs = -.100, p = .277, or lag-7, ρs = .032, p = .731. A scatterplot of these values is included in Appendix A.

No linear relationship was revealed between target similarity and T2 report rates, at any lag for both animate and inanimate values. This finding is contrary to the expectation that higher target report would coincide with higher target similarity at lag-1 and 2. No significant correlation was observed between target similarity values and both targets being animate or inanimate either.
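For completeness, the similarity-behavior correlation described above can be sketched as follows; the arrays here are placeholder data standing in for the actual CNN similarity values and per-pair T2|T1 rates.

```python
# Sketch: Spearman correlation between Fisher-transformed pairwise similarity
# of each T1-T2 combination and its mean T2|T1 report rate, per lag.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)                   # placeholder data only
n_pairs = 240                                    # e.g. ordered T1-T2 pairs
similarity = rng.normal(0.04, 0.10, n_pairs)     # Fisher-z similarity values
t2_rate = {lag: rng.uniform(0.6, 1.0, n_pairs) for lag in (1, 2, 7)}

for lag in (1, 2, 7):
    rho, p = spearmanr(similarity, t2_rate[lag])
    print(f"lag-{lag}: rho = {rho:.3f}, p = {p:.3f}")
```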


Figure 2. Mean correct report for animate and inanimate T2 at each lag across participants. Whiskers denote standard error (±).

Discussion

The present study confirmed an effect of animacy on second target identification rates at lag-1 and lag-2. A clear overall impairment in reporting T2 was observed at lag-2, confirming a general AB effect in line with previous literature (e.g. Dux & Marois, 2009). The magnitude of the AB was considerably smaller for animate objects than for inanimate objects (see Fig. 2). The presented findings therefore support the hypothesis that attentional processing during the AB favors animacy. Target identification rates for both categories converged at lag-7, showing no marked difference. However, looking at lag-1 identification rates, no sparing was elicited in this experiment; i.e., mean T2 report was not considerably higher at lag-1 than at lag-2, let alone close to the lag-7 control. This result is not entirely in line with previous research where at least some sparing effect was observed for object types (Potter et al., 2010; Livesey & Harris, 2011). Despite this, the results point to an attentional bias towards processing animate objects. This finding supports evidence for a dedicated role of attention in quickly observing animals due to their evolutionary importance (New et al., 2007). Prioritized binding and entry into attentional processing for animate representations might underlie this role.

As for the CNN, target similarity did not produce the expected results; no higher T2 identification rates for similar, compared to dissimilar, T1-T2 combinations were observed (see Appendix A). The hypothesized increase in T2 identification due to feature similarity between both targets could therefore not be confirmed. One possible explanation is the lack of correlation between animate target pairs (such as bear for T1 and ape for T2) and overall similarity values. Animacy as a category is expected to share mid-level feature processing between exemplars, distinct from inanimate categories (Long, Störmer, & Alvarez, 2017). As the CNN has been shown to resemble a ventral feedforward model (Güçlü & Van Gerven, 2015), at least some correlation should be apparent between animate target pairs and the overall level of similarity. Similarity values across trials were quite low as well, perhaps because category exemplars were not matched on low-level features such as contrast energy and line orientations.

Another reason for this missing similarity effect is perhaps the extraction of layer 7 CNN values. Layer 7 is the penultimate layer, in which feature clusters are already being ‘matched’ against the exemplars the CNN was trained on. These clusters are often large parts or whole objects. Although high-level features are thought to already include some contextual scene information, animate features might be ‘fast-tracked’ in visual processing, preceding the scene itself (Crouzet et al., 2012). Again, taking the CNN as a model akin to human visual processing, perhaps animate target pairs in layer 7 (e.g. bear face and beetle shell) already differ just as much as other pairs (e.g. taxi window and chair armrest). An intermediate layer such as layer 4 might yield mid-level animacy features more fitting for analysis. It would therefore merit further investigation to compare extractions from different layers, and to refine the method for comparing CNN layer extractions with behavioral data, in future studies.
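Such a layer comparison could be set up with forward hooks, sketched below; torchvision's AlexNet again stands in for the Caffe model, and the chosen layer indices (conv4 as a mid-level layer, fc7 as the high-level layer) are assumptions.

```python
# Sketch: capture activations from a mid-level convolutional layer alongside fc7,
# so that layer-wise similarity can be compared.
import torch
from torchvision import models

model = models.alexnet(pretrained=True).eval()
captured = {}

def save_to(name):
    def hook(module, inputs, output):
        captured[name] = output.detach().flatten(1)   # 1 x n feature vector
    return hook

# conv4 (mid-level) and fc7 (high-level) in torchvision's AlexNet layout.
model.features[8].register_forward_hook(save_to("conv4"))
model.classifier[4].register_forward_hook(save_to("fc7"))

with torch.no_grad():
    model(torch.zeros(1, 3, 224, 224))                # replace with a real stimulus

print({k: v.shape for k, v in captured.items()})
```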

As for animacy, the results of Guerrero and Calvillo (2016) were partly corroborated. An effect of animacy on T2 report was indeed found. Whereas the effect in this study was most pronounced at lag-2, theirs increased from lag-2 to lag-4, contrary to the usual AB curvature. A recently published study also seeking to extend these results observed higher animate versus inanimate target report at lag-1 and lag-7 as well (Hagen & Laeng, 2017). Their task differed to a degree by having participants select images in a visual array for target identification, and by using such images as distractors as well. Still, it is quite possible that the present study would have shown a similar trend for animacy across lags but that it was obscured by a ceiling effect; mean lag-7 report was over 90% (see also Appendix B) and no particular effect for lag-7 was predicted on the basis of previous literature. The ceiling effect could perhaps be lowered in several ways: by making the response menu more sensitive to guesses, by using more ‘organic’ shapes in masking, or by demanding more specific responses (‘bear cub’ instead of ‘bear’, ‘fighter jet’ instead of ‘airplane’), thereby removing category-level responses and reducing a potential learning effect.

Hagen and Laeng (2017) took their effect of animacy regardless of lag as evidence that a processing advantage for animate objects does not result in a reduction of the AB effect. They concluded that the animal bias relates to post-attentive processing stages. This is inconsistent with previous literature on perceptual and attentional biases regarding animacy, as these studies report a specific, faster processing mechanism for animate features (e.g. Kirchner & Thorpe, 2006; Crouzet et al., 2012; Evans & Treisman, 2005). The present study, however, did find a much smaller effect of animacy at lag-1 (2.81%) than at lag-2 (4.78%). This discrepancy in animate target report marks an advantage for animate objects during the AB period, implying that animate representations remain intact more often than inanimate ones despite the general attentional processing deficit.

To my knowledge, this is the first study investigating attentional processing of animate objects at multiple lags. The smaller effect of animacy at lag-1 as compared to lag-2 (see Fig. 2) is hard to reconcile with existing literature covering lag-1 or natural scenes. As no lag-1 sparing was observed, with T2 report at lag-1 at roughly the same level as at lag-2, general theories concerning the AB do not fully apply (Dux & Marois, 2009). Although Potter et al. (2010) did observe lag-1 sparing for object images, they did not look specifically at animate categories despite including insects, marine animals and reptiles in the RSVP. Livesey and Harris (2011) found such a large difference in sparing magnitude between alphanumeric characters and natural objects that they concluded that sparing results from a different mechanism than the AB itself. This means that lag-1 sparing is perhaps only observable for well-learned alphanumeric stimuli and not for full objects or natural scenes. What this would imply for theories of the AB is beyond the scope of this study.

Yet there is a possible explanation for the lack of sparing generally found using natural images in RSVP: participants often seem to switch the temporal order of the presented targets during lag-1 sparing (Hommel & Akyürek, 2005). Whichever target produces the more lasting mental representation seemingly wins a competition for attentional resources, at a loss of target order information. A later extension of this study also provided evidence for actual perceptual integration of both targets, as if they were superimposed into a single representation (Akyürek et al., 2012). Integration of both target stimuli into the attentional episode triggered by T1 is therefore a likely reason for T2 escaping the attentional limitation at short lags, an explanation similar to that of Potter et al. (2010). No study to date seems to have controlled for temporal order switching with natural scenes. Doing so could reveal a marked increase in lag-1 identification or perhaps uncover patterns in which categories cause order reversals.

As the small-to-large effect of animacy at lag-1 and lag-2 cannot be fully explained by general theories of attention, this discrepancy is interpreted here as an attentional bias. Throughout this paper, higher T2 report and higher target identification are treated as equivalent. The animacy effect is thought to result from mid-level featural differences (Long et al., 2017) along the visual ventral pathway (Sha et al., 2015) that give animal features a ‘shortcut’ into attentional processing (Kirchner & Thorpe, 2006; Crouzet et al., 2012). In addition, this offers a post-perceptual advantage for the visual processing of animals apart from processing speed. The animacy bias during the attentional blink suggests either prioritized entry into attentional processing, or a selective mechanism for animacy which could partially negate the attentional deficit. The specifics of such a mechanism, likely relating to visual short-term memory or working memory, are not yet apparent and will require further research into the attentional dynamics of object recognition. Improving methods for reliably extracting a continuum of feature complexities from convolutional neural networks will certainly aid this cause.


Acknowledgments

I want to thank Daniel Lindh for supervising this bachelor thesis project and Reina van der Goot, Mels Boerkamp and Sophie Berkhout, all in the Brain and Cognition program group at the University of Amsterdam, for shared data collection.

References

Akyürek, E. G., Eshuis, S. A. H., Nieuwenstein, M. R., Saija, J. D., Başkent, D., & Hommel, B. (2012). Temporal target integration underlies performance at Lag-1 in the attentional blink. Journal of Experimental Psychology: Human Perception and Performance, 38, 1448-1464.

Broadbent, D. E., & Broadbent, M. H. P. (1987). From detection to identification: Response to multiple targets in rapid serial visual presentation. Perception & Psychophysics, 42, 105-113.

Carlson, T. A., Ritchie, J. B., Kriegeskorte, N., Durvasula, S., & Ma, J. (2013). Reaction time for object categorization is predicted by representational distance. Journal of Cognitive Neuroscience, 26(1), 132-142.

Crouzet, S. M., Joubert, O. R., Thorpe, S. J., & Fabre-Thorpe, M. (2012). Animal detection precedes access to scene category. PLoS One, 7(12), e51471.

Dux, P. E., & Harris, I. M. (2007). Viewpoint costs occur during consolidation: Evidence from the attentional blink. Cognition, 104, 47-58.

Dux, P. E., & Marois, R. (2009). The attentional blink: A review of data and theory. Attention, Perception, & Psychophysics, 71, 1683–1700.


Einhäuser, W., Koch, C., & Makeig, S. (2007). The duration of the attentional blink in natural scenes depends on stimulus category. Vision Research, 47, 597-607.

Evans, K. K., & Treisman, A. (2005). Perception of objects in natural scenes: Is it really attention free? Journal of Experimental Psychology: Human Perception and Performance, 31(6), 1476-1492.

Guerrero, G., & Calvillo, D. P. (2016). Animacy increases second target reporting in a rapid serial visual presentation task. Psychonomic Bulletin & Review, 23(6), 1823-1828.

Hagen, T., & Laeng, B. (2017). Animals do not induce or reduce attentional blinking, but they are reported more accurately in a rapid serial visual presentation task. i-Perception, 8(5), 2041669517735542.

Hommel, B., & Akyürek, E. G. (2005). Lag-1 sparing in the attentional blink: Benefits and costs of integrating two events into a single episode. The Quarterly Journal of Experimental Psychology, 58(8), 1415-1433.

Howe, P. D. (2017). Natural scenes can be identified as rapidly as individual features. Attention, Perception and Psychophysics, 79, 1674-1681.

Kirchner, H., & Thorpe, S. J. (2006). Ultra-rapid object detection with saccadic eye movements: Visual processing speed revisited. Vision Research, 46, 1762-1776.

Kleiner, M., Brainard, D., Pelli, D., Ingling, A., Murray, R., & Broussard, C. (2007). What’s new in Psychtoolbox-3. Perception, 36(14), 1.

Kriegeskorte, N. (2015). Deep neural networks: A new framework for modeling biological vision and brain information processing. Annual Review of Vision Science, 1, 417-446.

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 1097-1105.


Livesey, E. J., & Harris, I. M. (2011). Target sparing effects in the attentional blink depend on type of stimulus. Attention, Perception and Psychophysics, 73, 2104-2133.

Long, B., Störmer, V. S., & Alvarez, G. A. (2017). Mid-level perceptual features contain early cues to animacy. Journal of Vision, 17(6), 1–20.

Martens, S., Dun, M., Wyble, B., & Potter, M. C. (2010). A quick mind with letters can be a slow mind with natural scenes: Individual differences in attentional selection. PLoS ONE, 5(10), e13562.

New, J., Cosmides, L., & Tooby, J. (2007). Category-specific attention for animals reflects ancestral priorities, not expertise. PNAS, 104(42), 16598-16603.

Potter, M. C., Wyble, B., Pandav, R., & Olejarczyk, J. (2010). Picture detection in rapid serial visual presentation: Features or identity? Journal of Experimental Psychology: Human Perception and Performance, 36(6), 1486-1494.

Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: an attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18(3), 849-860.

Sha, L., Haxby, J. V., Abdi, H., Guntupalli, J. S., Oosterhof, N. N., Halchenko, Y. O., & Connolly, A. C. (2015). The animacy continuum in the human ventral vision pathway. Journal of Cognitive Neuroscience, 27(4), 665-678.

Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., & Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093.


Appendix A

Scatterplot of all pairwise similarity values (Pearson correlations of layer 7 CNN activations) plotted against their mean T2 identification rates. Triangles denote lag-1 trial types, crosses lag-2 and circles lag-7. Notice how the data cloud overall skews towards the upper left, with the lag-1 (triangles) and lag-2 (crosses) trials producing the most outliers. Lag-7 values remain clustered in the upper left. A linear relationship for any of the three trial types would generally entail the plotted markers spreading from the lower left to the upper right corner.


Appendix B

Histograms of the mean T2 report rate at lag-7 (one value per participant) with fitted normal distributions. On the left (a) the distribution is shown for animate objects and on the right (b) for inanimate objects. Notice how the report rate in (a) skews towards the maximum report rate of 1.00, indicating a ceiling effect; the fitted normal distribution is cut off at 1.00 as well. For inanimate objects in (b) the ceiling effect is less apparent.
