Predicting individual differences in Infancy : do infant scan patterns relate to infant cognition?

Academic year: 2021

Predicting Individual Differences in Infancy: do Infant Scan Patterns Relate to Infant Cognition?

Linda van den Berg 10660550

University of Amsterdam

Supervisors: Maartje Raijmakers, Daan van Renswoude, & Ingmar Visser

Second assessor: Yaïr Pinto

Abstract

Early predictors in infancy of individual differences in cognition have often been examined. Studies employing eye-tracking often focus on simple eye-tracking metrics, largely ignoring exploration over time (i.e., scanpaths; Hayes et al., 2011), which may provide a more complete understanding of complex cognitive processes (Noton & Stark, 1971). This study explores the usefulness of scanpath analysis as an additional tool for measuring cognition in infants. Fifty-four infants completed a free-viewing task and two cognitive functioning tasks. We investigated whether free-viewing scanpaths, captured by the novel successor representation scanpath analysis (SRSA), would predict individual differences in infant cognition as they did for adults (Hayes & Henderson, 2017). SRSA was unable to explain differences in cognitive functioning, nor was it able to extract free-viewing strategies related to individual differences. We conclude that, for now, SRSA may not be suited for infant free-viewing data, as different factors may have influenced infant scanpaths and the performance of SRSA.

Introduction

Researchers aim to uncover early predictors of cognitive functioning in childhood and later development. Individual differences in infancy moderately relate to future individual differences. Infants with higher scores on cognitive tests generally had higher scores on intelligence tests (Fagan, Holland, & Wheeler, 2007; Rose, Feldman, Jankowski, & van Rossem, 2012) and executive functioning tests later in life (Rose, Feldman, & Jankowski, 2012). However, the predictive validity of cognitive tests has often been poor to moderate, such as in habituation tasks (Kavšek, 2004; McCall & Carriger, 1993; McCall, Hogarty, & Hurlburt, 1972). Predictive validity problems are possibly due to shortcomings in the cognitive tests, such as limited construct validity and reliability. Firstly, it has been questioned whether cognitive tests accurately measure cognitive abilities. In early work, cognition was often measured using standardized and normed tests, such as the Bayley Scales of Infant Development (Bayley, 1993). These tests appear to target sensorimotor abilities, whereas tests of mature cognition target abstract abilities. This raises the question of whether these early tests are sensitive enough to infer complex cognitive functioning, and it means that validation of cognitive tests is needed. Recent efforts to develop adequate individual difference paradigms have made progress in changing cognitive tests to resemble mature tests. Tests now target recognition memory, attention, and processing speed (Rose et al., 2012; Rose, Feldman, & Jankowski, 2003). Secondly, test-retest reliability of infant individual difference tests is often limited, such as in the habituation tests used in early work. Although steps have been taken to make individual difference tests more reliable, they often remain only moderately reliable (Kavšek, 2004; McCall & Carriger, 1993; Rose et al., 2003; Rose et al., 2012). Fortunately, with improvements of cognitive tests came improvements in predictive validity (Rose et al., 2003). Nevertheless, there is still work to be done to achieve the goal of predicting cognitive functioning. An important next step might be to explore additional tools for accurately measuring cognitive functioning in infancy.

Cognitive abilities have often been inferred from simple behaviours already present in infancy, such as looking behaviour. Looking behaviour in the preverbal infant is an important gateway into how infants view the world and how they process information. Without looking behaviour research, we would assume infants have few cognitive abilities. Eye-tracking has enabled us to measure looking behaviour with high spatial and temporal precision, and eye movement analysis has become a favoured way of uncovering infant cognitive development (Aslin, 2007; Gredebäck, Johnson, & von Hofsten, 2009).

Scene viewing studies often encounter individual differences in infants’ eye movements. Only a few studies have related these age-related individual differences to specific individual differences in cognition. For example, Frank, Vul, and Johnson (2009) showed that semantics guide older infants’ viewing behaviour whereas salient features guide younger infants. Wass and Smith (2014) showed evidence for a relationship between eye movements in scene viewing and attentional individual differences. These first findings indicate that eye movements might contain useful information about individual differences. Our exploratory study therefore aims to build on work validating measurements of cognitive functioning by examining how scene viewing relates to cognitive individual differences.

There are some shortcomings in the current scene viewing and individual differences literature. First, most studies use simple and artificial stimuli (e.g., static geometric forms; Richards, 2010). With simple stimuli, infants do not use all their cognitive resources (Kidd, Piantadosi, & Aslin, 2012), thus hindering investigations of the full scope of cognitive processes. Recent work, however, has incorporated more complex stimuli (e.g., dynamic stimuli or free-viewing; Frank, Vul, & Saxe, 2012; Papageorgiou et al., 2014; Wass & Smith, 2014). A second shortcoming is that most studies rely on averaged eye movement measures, such as looking time, fixation duration, fixation location, or saccade amplitude (Hayes et al., 2011). Infants are known to allocate attention to learnable stimuli (i.e., neither too simple nor too complex; Kidd et al., 2012). Hence, averaged measures may cause us to miss important information on cognitive development in infants.

Although averaged measures show stability over time in infants and have been found to correlate with cognitive functioning (Hessels, Hooge, & Kemner, 2016; Wass & Smith, 2014), by reducing eye movement data to an averaged measure we fail to recognise that looking behaviour is more than a simple measure; it is a process that unfolds over time (Anderson, Anderson, Kingstone, & Bischof, 2015). Scanning patterns (i.e., scanpaths) have been thought to uncover how we look at visual information in time and what information attracts our attention first (Dewhurst et al., 2012; Hayes, Petrov, & Sederberg, 2011). There is a wealth of research attempting to quantify scanpaths (Cristino, Mathôt, Theeuwes, & Gilchrist, 2010; Jarodzka, Holmqvist, & Nyström, 2010). Commonly, scanpaths are treated as strings of letters, and the analysis counts how many transformations are needed to turn one scanpath into another (Anderson et al., 2015; Brandt & Stark, 1997; Dewhurst et al., 2012). This string-edit method is beneficial for comparing short scanpaths of the same length. However, it hinders examination of cognitive processes that take longer or of individual differences in processing length (Hayes et al., 2011). Scanpaths are also analyzed by calculating probabilities of making subsequent fixations based on regularities in scanpaths (Ellis & Stark, 1986). In contrast to string-edit, this method enables comparison of scanpaths of varying lengths. However, this method only captures the probability of moving from the current fixation to the next fixation (Hayes et al., 2011).

For decades, research has aimed to link scanpaths to cognitive functions using these scanpath methods. Early evidence suggested that adults process visual stimuli serially and fixate features in a fixed order (Noton, 1970). Scanpaths were found to be stable for a subject viewing the same image multiple times. However, with different images and different individuals, scanpaths varied, suggesting that scanpaths are related to individual differences in processing visual stimuli (Foulsham et al., 2012; Noton & Stark, 1971). Other work suggests that scanpaths may not be as stable as Noton and Stark (1971) thought, as individuals did not necessarily show similar scanpaths for different images (Dorr, Martinetz, Gegenfurtner, & Barth, 2010) or for the same images (Humphrey & Underwood, 2010).

Recently, however, Hayes et al. (2011) suggested that scan patterns may in fact be stable viewing behaviours that can be related to individual cognitive differences. Their rationale for examining this relationship originates from studies demonstrating that people solve problems differently. For instance, in Raven’s Advanced Progressive Matrices (APM; Raven, 2000), two distinct strategies have been found: an efficient strategy and an inefficient strategy. In the inefficient strategy, individuals go back and forth between the to-be-solved problem and the possible responses. People who employ this strategy have been found to score lower on the APM. By contrast, in the efficient strategy people explore and solve the problem prior to choosing from the response options, and they tend to score higher. These different strategies have been found to be observable in eye movements (Hayes et al., 2011; Vigneau, Caissie, & Bors, 2006). Hayes and colleagues argued that since people also differ in scanning strategies (e.g., Noton & Stark, 1971), scanpath analysis might help discern and predict (in)efficient problem solving strategies. However, they acknowledged that common scanpath quantification methods may limit the ability to gain insight into these complex cognitive processes, for the aforementioned reasons. If scanpaths were to be used to uncover cognitive processes, it was imperative to find ways of going beyond predicting only the next fixation. Hayes and colleagues therefore devised a method that looks further ahead than the next fixation. Their successor representation scanpath analysis (SRSA) extracts statistical regularities from eye movement sequences of varying lengths. Using this analysis, they not only found distinct scanpaths that reflected the two distinct problem solving strategies (Hayes et al., 2011), they also found cognitive individual differences to be related to other viewing behaviour (i.e., scene viewing behaviour). In scene viewing, individuals with higher cognitive abilities tended to follow a more systematic and efficient viewing strategy that was mostly directed towards central areas in scenes. However, as cognitive abilities decreased, less efficient scanning patterns emerged that were directed towards peripheral areas. In addition to detecting distinct strategies in viewing behaviour, their novel analysis of scanpaths was found to predict individual differences in cognition more accurately than averaged measures and common scanpath analyses (Hayes et al., 2011; Hayes & Henderson, 2017). Since scanpaths may allow for broader comprehension of cognitive processes (Hayes et al., 2011; Hayes & Henderson, 2017; Noton & Stark, 1971), it seems fruitful to explore whether scanpath analysis may be a beneficial tool for measuring cognitive functioning in infancy and whether individual differences in scan patterns relate to cognitive individual differences.

The aim of the current exploratory study is to investigate what scene viewing is able to tell us about cognitive functioning in infants. Specifically, we are interested in whether individual differences in scene viewing relate to infant individual differences in cognition. In order to investigate this, we conducted an experiment consisting of 3 tasks: a recognition memory task, an attentional disengagement task, and a free-viewing task of 29 real-world scenes. We analysed the data using the novel SRSA, devised by Hayes et al. (2011). Based on studies examining individual differences in adults (Hayes et al., 2011; Hayes & Henderson, 2017), we would expect SRSA to predict infants’ scores on recognition memory and attention. We would also expect SRSA to predict cognitive functioning better than averaged measures. Prior research has shown that averaged measures are stable over time (Hessels et al., 2016; Wass & Smith, 2014). Scanpath analysis of infant free-viewing data, however, is rare. Therefore, if we are to correlate scanpaths with cognitive functioning, we would ideally examine the stability of scanpaths over time. Hence, in this study we also made a first attempt at exploring whether the SRSA is a stable method and thus whether it may be considered a beneficial method for predicting cognitive functioning.

Methods

Participants

Infant participants were recruited through the municipality of Amsterdam. In total, we tested 59 infants, of which 5 infants were excluded from analysis. One infant was excluded due to failure to calibrate, 2 infants were unable to participate due to fussiness prior to the experiment, and one infant was excluded because the parent interfered with the experiment. Only one infant participant was unable to start the free-viewing task. As this study aimed to associate cognitive individual differences with individual differences in free-viewing, it was imperative that all infants completed at least 20% of the free-viewing task. Fifty-four 5.93- to 13.06-month-old infants (M = 9.39, SD = 2.19, see Figure 1; 26 females, 28 males) provided data on all three tasks (i.e., recognition memory (RM), gap-overlap (GO), and free-viewing (FV)). Only 4 infants were born preterm (i.e., born before 37 weeks), and caregivers did not communicate any developmental disorders or difficulties. Caregivers gave their written consent.

Figure 1. The distribution of ages in our sample.

Design

The experiment consisted of 3 different tasks: 2 cognitive tasks (RM and GO) and 1 FV task of real-world scenes. Prior research showed that infants prefer free-viewing over other types of tasks. Hence, participants first completed the cognitive tasks, followed by the free-viewing task. Compared with a counterbalanced order, this order of presentation would minimise the risk of missing data (Frank, Amso, & Johnson, 2014). The experiment was built in Python, and PyGaze was used to interact with the eye tracker. All tasks were completed using the remote EyeLink 1000 eye tracker (SR Research Ltd., Ontario, Canada). Data were collected at 500 Hz.

Stimuli

Recognition Memory (RM). We used RM as described by Rose and colleagues (e.g., Rose et al., 2012; De Jong, Verhoeven, Hooge, & Van Baar, 2016). The task consisted of 8 trials with two phases each, a familiarisation phase and a testing phase. In the former, infants looked at two identical faces for 8000 ms, after which a grey screen was shown for 500 ms. Subsequently, the left or right (familiarised) face was replaced with a novel face, and both the new and old face remained onscreen for 8000 ms. Prior to each trial, a centred attention getter was presented, which infants had to fixate in order to proceed with the next trial. This attention getter acted as the inter-trial interval. Stimuli were 10 children’s faces obtained from De Jong et al. (2016) and 6 faces of adults obtained from the Umeå University Database of Facial Expressions (Samuelsson, Jarnvik, Henningsson, Andersson, & Carlbring, 2012). We initially planned on using all children’s faces from the De Jong dataset. However, only 10 caregivers had given permission to use their children’s faces in other studies. To complete the face set, we matched adult faces on the colour of clothing, hair colour, eye colour, facial expression, and sex. The stimuli had a size of 425 by 646 pixels (approximately 10 x 15 degrees of visual angle) and were presented with a distance of 230 pixels (approximately 5 degrees of visual angle) between the stimuli. The stimuli and the position of the new stimulus (left or right) were presented in a fixed order. All stimuli were presented against a grey background (see Figure 2 for a trial example).

Areas of interest (AOIs) were the 2 faces in the testing phase, which were used to compute a measure of recognition: the novelty score. The novelty score was computed by dividing dwell times for novel faces by dwell times for both faces in the testing phase across trials (e.g., Rose et al., 2003).
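The novelty score can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name, the input layout, and the choice to pool dwell times over trials before dividing (rather than averaging per-trial ratios) are assumptions, and the dwell times are made up.

```python
def novelty_score(trials):
    """Proportion of test-phase dwell time spent on the novel face.

    `trials` is a list of (novel_ms, familiar_ms) dwell-time pairs, one per
    testing phase. This layout is a hypothetical choice for illustration.
    Dwell times are pooled across trials before dividing (an assumption).
    """
    novel_total = sum(novel for novel, _ in trials)
    both_total = sum(novel + familiar for novel, familiar in trials)
    if both_total == 0:
        return None  # the infant never looked at either face
    return novel_total / both_total


# Made-up dwell times (ms) for three testing phases.
example_trials = [(3000, 2000), (2500, 2500), (4000, 1000)]
score = novelty_score(example_trials)
```

Under this definition, a score above 0.5 means the infant looked longer at the novel face than at the familiarised face.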

Figure 2. An example of the recognition memory task: first familiarisation faces are shown and subsequently, the left or right face changes.

Gap-Overlap (GO). The stimuli in the GO task were similar to the stimuli in Cousijn, Hessels, Van Der Stigchel, & Kemner (2017): 1 central target (a clock) and 8 peripheral targets (dog, duck, elephant, fish, moon, penguin, sun, and turtle). Normally, the GO task consists of 3 conditions: a baseline condition, a gap condition, and an overlap condition. However, recent work has shown that the difference between the gap condition and the overlap condition may be a better representation of attentional disengagement (Cousijn et al., 2017). Hence, the baseline condition was omitted from this task. All stimuli in this task were gaze contingent, accompanied by various sounds, and presented against a grey background. Trials began with a rotating central clock (3 cm, 2.6° x 2.6°). Trials would proceed only when the central target was fixated for 100 consecutive milliseconds. In order to minimise anticipatory saccades, the stimulus would stay onscreen for 600-700 ms. Subsequently, a peripheral stimulus (3 cm, 2.6° x 2.6°) would appear 13° to the left or right of the central target. In the gap condition, the peripheral stimulus would appear 222 ms (+/- 35 ms) after the central stimulus disappeared. By contrast, in the overlap condition, the peripheral stimulus and the central stimulus would be onscreen concurrently and the latter would disappear 222 ms (+/- 35 ms) after peripheral appearance (see Figure 3). Fixating the peripheral stimulus would trigger a reward (i.e., the peripheral stimulus would vibrate and a beeping/twirling/bird sound would be played). If 1500 ms had elapsed with no fixation on the target, the same reward would be presented. The task consisted of 40 trials in total (20 trials per condition), and the conditions were presented in random order.

The measurement of interest in this task was the gap-overlap effect, that is, the difference in reaction time (RT) between the gap condition and the overlap condition. Hence, we measured the time it took infants to disengage from the central target and fixate the peripheral target after its onset. Trials were excluded from analysis when RTs were below 200 ms or exceeded 1500 ms (Elsabbagh et al., 2009; Cousijn et al., 2017; Wass, Porayska-Pomsta, & Johnson, 2011).
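As a minimal sketch of this measure, the snippet below filters trials with the 200 ms and 1500 ms bounds given above and then subtracts the mean gap RT from the mean overlap RT. The function name, the input layout, the use of the mean, and the direction of the subtraction (overlap minus gap) are assumptions made for illustration; the RTs are invented.

```python
from statistics import mean

RT_MIN_MS = 200   # lower exclusion bound stated in the text
RT_MAX_MS = 1500  # upper exclusion bound stated in the text


def gap_overlap_effect(trials):
    """Mean overlap RT minus mean gap RT, after excluding invalid trials.

    `trials` is a list of (condition, rt_ms) pairs, where condition is
    "gap" or "overlap" (a hypothetical layout for illustration).
    """
    valid = {"gap": [], "overlap": []}
    for condition, rt_ms in trials:
        if RT_MIN_MS <= rt_ms <= RT_MAX_MS:
            valid[condition].append(rt_ms)
    if not valid["gap"] or not valid["overlap"]:
        return None  # no valid trials left in at least one condition
    return mean(valid["overlap"]) - mean(valid["gap"])


# Invented RTs (ms); the 150 ms and 1600 ms trials are excluded.
example = [("gap", 250), ("gap", 350), ("gap", 150),
           ("overlap", 500), ("overlap", 600), ("overlap", 1600)]
effect = gap_overlap_effect(example)
```

With this sign convention, a larger positive value indicates slower disengagement when the central stimulus remains on screen.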

Figure 3. An illustration of the gap trial and overlap trial in the gap-overlap task. After fixating the clock, a grey screen would follow (gap condition) and after 222 ms (+/- 35 ms) a left or right target would appear. Alternatively, the clock and peripheral target would be presented simultaneously, and after 222 ms (+/- 35 ms) the central target would disappear (overlap condition).

Free-Viewing (FV). Twenty-nine real-world indoor scenes (600 x 800 pixels) were selected from the Object and Semantic Images and Eye-tracking (OSIE) dataset (Xu, Jiang, Wang, Kankanhalli, & Zhao, 2014). As this task was used in a parallel study assessing the influence of semantic knowledge on free-viewing behaviour, images were selected based on six exclusion criteria: the presence of people, the presence of animals, unclear scenes, poor image quality, fewer than four objects in the image, and indefinable objects (objects were used to detect object knowledge in the parallel project). Eight hundred images were inspected by one assessor, who first excluded photographs in which humans or animals were present. The resulting 400 images were then judged on the remaining exclusion criteria. A second assessor subsequently judged the remaining 100 images on eligibility. Both assessors agreed on inclusion of 26 photographs, after which assessor 1 included 3 additional photographs. Images were presented in full-screen, to prevent cut-offs of meaningful objects and to allow for longer sequential eye movements. All trials began with a central attention getter (clock, dog, duck, elephant, fish, moon, penguin, sun, or turtle), accompanied by beeping/twirling/bird sounds. Only after fixating this attention getter for at least 100 consecutive milliseconds were infants able to view an image. In order to examine scanpaths, it was important to present the scenes for a sufficient amount of time (Hayes et al., 2011). Therefore, we presented the photographs for eight seconds, long enough to allow for sufficient sequential eye movements but not so long as to exhaust the infants. The stimuli were presented in random order to avoid fatigue effects.

Procedure

All participants were tested in a baby car seat or on the caregiver’s lap, in a darkened room with dimmed lights. They were seated approximately 55 cm from a 17-inch screen that subtended a visual angle of 27° x 34°. Caregivers were present at all times in the testing room and were instructed not to interact with their child or to react or point to stimuli presented on the screen. A second camera in the testing room was directed towards the car seat to allow the experimenters to follow the infant’s behaviour during the experiment and judge whether the infant had shifted position, necessitating re-calibration.

Prior to stimulus presentation, point of gaze was calibrated using a 5-point calibration procedure. In order to engage the infant in the calibration, we presented the children’s book character Miffy in a wobbling plane against the same grey screen as in the tasks, accompanied by a twirling sound. When the validation error exceeded 1° from Miffy’s centre, calibration was repeated. After successful calibration, the infant started the first cognitive task: RM. After completion of the first task, the experimenters proceeded with the second task (GO) without calibration, unless the infant had changed position, resulting in eye-tracking difficulties. Subsequently, a short children’s film of Woezel and Pip was shown to redirect the infants’ attention. After this break, point of gaze was calibrated and the FV task began. When infants needed a break and were taken out of the car seat, calibration was repeated.

Data Pre-processing

The R-package Gazepath (Van Renswoude, Raijmakers, Koornneef, Johnson, Hunnius, & Visser, 2017) was used to classify raw eye tracking data into saccades and fixations. The algorithm implemented in Gazepath accounts for individual differences in data quality. To avoid fixations belonging to the inter-trial attention getters being used to compute scores or scanpaths, we removed all fixations around the attention getter (for RM) and fixations during presentation of the attention getter (for FV).

Scanpath Quantification

In order to quantify scanpaths for each scene, we overlaid the eye movement data pertaining to the images with 3 different types of AOI grids (i.e., state spaces) that captured tendencies in scanning behaviour. Because the SRSA method is rather novel and little is known about which AOI grid best supports scanning patterns and their successor representations, we quantified scanpaths according to the AOI grids described by Hayes and Henderson (2017). The state spaces were: a horizontal state space, a vertical state space, and a radiating state space (see Appendix A for state coordinates). All state spaces included an outermost rectangle that, according to Hayes and Henderson (2017), was meant to capture non-central bias tendencies. The horizontal state space starts on the left of an image and extends toward the other side of the image (Figure 4a), and is set to capture horizontal eye movements. In the vertical state space, the first AOI was located at the top of images while subsequent AOIs extended toward the bottom of images (Figure 4b). This was intended to capture vertical scanning of images. In the radiating state space, AOIs consisted of rectangles that started from the centre and extended toward the periphery (Figure 4c). This state space was intended to capture to what extent fixations moved between the centre and periphery.

Figure 4. Examples of how we overlaid areas of interest grids (i.e., state spaces) over the scenes to extract scanning patterns: a horizontal state space, a vertical state space, and a radiating state space.
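To make the mapping from fixations to states concrete, the sketch below assigns a fixation to one of 5 states in a radiating state space. It is only an illustration: the actual state coordinates are those in Appendix A, whereas the equal-step nested rectangles, the 800 x 600 pixel default image size, the 1-based state numbering, and the function names used here are assumptions.

```python
def radiating_state(x, y, width=800, height=600, n_states=5):
    """Return state 1 (centre) to n_states (outermost) for a fixation at (x, y).

    The nested rectangles share the image centre and grow in equal steps.
    This geometry is an assumption, not the coordinates from Appendix A.
    """
    if not (0 <= x <= width and 0 <= y <= height):
        return None  # the fixation fell outside the image
    # Normalised distance from the centre on each axis: 0 at centre, 1 at edge.
    dx = abs(x - width / 2) / (width / 2)
    dy = abs(y - height / 2) / (height / 2)
    eccentricity = max(dx, dy)  # rectangular rather than circular rings
    return min(int(eccentricity * n_states) + 1, n_states)


def to_state_sequence(fixations, **grid):
    """Map (x, y) fixations to states, dropping off-image fixations."""
    states = [radiating_state(x, y, **grid) for x, y in fixations]
    return [s for s in states if s is not None]


# Invented fixation coordinates (pixels).
sequence = to_state_sequence([(400, 300), (500, 300), (600, 300), (900, 300)])
```

A horizontal or vertical state space could be sketched the same way by binning only the x or only the y coordinate.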

Analyses

SRSA. Our main focus was to examine whether scanpath analysis of free-viewing behaviour is a useful tool for research concerning infant cognitive functioning. However, scanpath analysis of infant eye movements is rather novel; hence, all analyses in this study were exploratory. We analysed scanpaths using the successor representation scanpath analysis (SRSA; Hayes et al., 2011), which was implemented in R in a different project (source code available at https://github.com/Kucharssim/SRSA).

The SRSA was proposed to capture statistical regularities in free-viewing scanning behaviour and to predict cognitive scores based on these statistical regularities. The algorithm in this analysis follows the shifts in eye movements from one AOI (or state) to another, for every trial and participant. In the more common first-order transition probability methods, we calculate the probability of eye movements shifting from one AOI to each of the AOIs in a scene, which only captures the probability of the immediately following AOI given that a certain AOI was fixated. The SRSA, however, goes further than the prediction of only the next transition. It associates transitions from a certain AOI to another AOI. In this process, the method searches for transitions to this second AOI, originating from other AOIs in the scene. Based on this history of transitions, it will associate the transition starting from the first AOI with all other AOIs. The updating rule used for this (Equation 1) yields the expected number of transitions to all AOIs associated with the transition that was being followed (Hayes et al., 2011; Hayes & Henderson, 2017).

\[
\Delta \mathbf{M}_i = \alpha \left( \mathbf{I}_j + \gamma \mathbf{M}_j - \mathbf{M}_i \right) \tag{1}
\]

To make this algorithm easier to follow, consider the following illustration of the process. For the sake of simplicity, we will illustrate the radiating state space. Suppose an infant explores Figure 5 in a systematic manner, going from the centre to the periphery (state 1 → state 2 → state 3 → state 1 → state 2 → state 3, and so on). The transitions expected to follow AOI 1 will include the shift from AOI 1 to AOI 2 but also the shift from AOI 2 to AOI 3, whereby the latter transition is weighted by the gamma value. This calculation is done for the whole fixation sequence with Equation 1, which follows the shift from the current fixation on a state (subscript i) to subsequent fixations on other states (subscript j). This results in a matrix of size 5 x 5, because each state space consists of 5 states. This matrix represents the expected transitions from a given AOI in a given scene to another AOI in that same scene: the columns represent the location the transition started from and the rows represent the location where the expected transition is thought to end. This matrix of expected transitions is also known as the successor representation (SR; Hayes et al., 2011; Hayes & Henderson, 2017).

Figure 5. An example of systematic free-viewing scanning behaviour as an illustration of how the successor representation follows transitions. Every state is indicated by a different rectangle.

The SR algorithm (Equation 1) requires a 5 x 5 SR matrix with its values initially set to zero. Using the parameters α and γ, it uses the current fixations (i) and subsequent fixations (j) to update the matrix. The α parameter is the learning rate of the temporal-difference learning algorithm and controls how strongly the SR matrix is updated after each transition. Alpha values can vary between 0 and 1. In order to include temporally extended transitions, gamma is necessary to regulate the temporal discounting (ranging from 0 to 1). The more gamma increases, the farther the transitions in the sequence reach. Conversely, the closer gamma is to zero, the more the extracted regularities resemble a traditional first-order transition matrix. A gamma of zero will not extract extended statistical regularities from the scan pattern, but rather it will provide a matrix based on first-order transitions (Hayes et al., 2011; Hayes & Henderson, 2017).
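The update in Equation 1 can be sketched as follows. This is a minimal illustration of the temporal-difference rule as described in the text, not the R implementation used in the study; the function name, the 0-based state indices, and the nested-list matrix are assumptions. Columns index the state a transition starts from and rows the state where it is expected to end, as described above.

```python
def successor_representation(states, alpha, gamma, n_states=5):
    """Temporal-difference estimate of the SR matrix for one scanpath.

    `states` holds 0-based state indices in fixation order (an assumption;
    the text numbers states from 1). M[row][col] is the expected discounted
    number of visits to `row` after leaving `col`.
    """
    M = [[0.0] * n_states for _ in range(n_states)]
    for i, j in zip(states, states[1:]):
        # Copy column j before writing column i, so that a repeated
        # fixation within the same state (i == j) is handled correctly.
        col_j = [M[r][j] for r in range(n_states)]
        for r in range(n_states):
            one_hot = 1.0 if r == j else 0.0  # I_j: column j of the identity
            M[r][i] += alpha * (one_hot + gamma * col_j[r] - M[r][i])
    return M


# Systematic centre-to-periphery scanning, repeated twice.
scanpath = [0, 1, 2, 0, 1, 2]
sr_extended = successor_representation(scanpath, alpha=0.5, gamma=0.5)
sr_first_order = successor_representation(scanpath, alpha=0.5, gamma=0.0)
```

In this sketch, `sr_extended` assigns a nonzero value to reaching state 3 from state 1 (entry `[2][0]`), whereas `sr_first_order` leaves that entry at zero, which is the distinction between extended and first-order regularities described above.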

As in Hayes and Henderson (2017), SRs were computed for each state space and for each individual difference measure, and subsequently analysed as described below. Prior to SR calculation, we mapped the fixation locations to the states in each of the 3 state spaces. In the original use of the SRSA, fixations within the same state would not be taken into account in the analysis, as they would not contribute to differentiating scanning behaviour of the APM’s problem AOIs. However, as free-viewing scenes contain more information within states than an APM problem, it seemed appropriate to also base the SRSA on repeated fixations within states, as was done in Hayes and Henderson (2017). For each scene we created an SR matrix, resulting in 29 5 x 5 scene SR matrices per person. Infants are known to make fewer fixations than adults. For instance, infants in Van Renswoude, Visser, Raijmakers, Tsang, & Johnson (2017) on average made 4 to 5 fixations per scene. It is difficult to decide how many transitions are required for accurate scanpath analysis. As the images in Van Renswoude et al. (2017) were presented for 4 s less than the images in our study, we took the average number of fixations in the Van Renswoude study as the minimum number of fixations needed for a scanpath. We only calculated scene SRs when infants made more than 4 fixations. We also foresaw that infants might differ in the number of fixations made in each scene, due to fatigue or low motivation. Hence, we weighted the scene SR matrices by the number of fixations made in that scene. In order to ultimately predict individual differences in cognitive scores, these scene SRs were averaged per participant, resulting in 54 5 x 5 matrices: one SR matrix for each participant that captured all scan pattern regularities in a given state space for that particular participant (Hayes & Henderson, 2017).
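The per-participant aggregation can be sketched as below. The snippet assumes that "weighted by the number of fixations" means a fixation-weighted mean of the scene SR matrices; the function name and data layout are likewise hypothetical, and small 2 x 2 matrices are used only to keep the example readable.

```python
MIN_FIXATIONS = 4  # scenes need more than this many fixations, as in the text


def participant_sr(scene_srs, fixation_counts, n_states=5):
    """Fixation-weighted mean of one participant's scene SR matrices.

    Interpreting the weighting as a weighted mean is an assumption.
    """
    total = [[0.0] * n_states for _ in range(n_states)]
    weight_sum = 0
    for sr, n_fix in zip(scene_srs, fixation_counts):
        if n_fix <= MIN_FIXATIONS:
            continue  # too few fixations to form a scanpath
        weight_sum += n_fix
        for r in range(n_states):
            for c in range(n_states):
                total[r][c] += n_fix * sr[r][c]
    if weight_sum == 0:
        return None  # no scene met the fixation criterion
    return [[value / weight_sum for value in row] for row in total]


# Three invented scene SRs; the third scene has only 3 fixations and is skipped.
scenes = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[9, 9], [9, 9]]]
averaged = participant_sr(scenes, fixation_counts=[10, 5, 3], n_states=2)
```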

As these SR matrices would act as predictors of cognitive scores, the SRSA implementation vectorized the individual SRs, thus resulting in a matrix of 25 x 54. The next step in the SRSA was to perform a principal component analysis (PCA), which would guard against high dimensionality of SRs and overfitting (Hayes et al., 2011). The SRSA computed the first 20 components and picked the 5 components that correlated the strongest with the cognitive scores. The individual SRs were then projected onto these top 5 components, and the projections were used in a multiple linear regression as predictors, with the cognitive score as the dependent variable. An optimization procedure repeated these steps to find the alpha and gamma values that provided the best fit of the cognitive score prediction. The regression coefficients from the best fitting model were then multiplied by the top 5 principal components to create a prediction weight matrix that would capture distinct scanning strategies of all infants. Positive values indicated strategies belonging to higher cognitive scores and negative values indicated strategies belonging to lower cognitive scores (Hayes & Henderson, 2017). Due to the danger of overfitting, a cross-validation procedure was necessary to ensure generalization. We used a leave-one-out cross-validation, as was used in Hayes et al. (2011).
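The component selection step can be illustrated with a small sketch. It assumes that the component scores per participant have already been obtained from a PCA, and that "correlated the strongest" refers to the absolute Pearson correlation; both the function names and that reading are assumptions rather than details confirmed by the SRSA source code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if constant)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_components(component_scores, cognitive_scores, k=5):
    """Indices of the k components most strongly related to the scores.

    component_scores[c] lists every participant's score on component c.
    Ranking by absolute correlation is an assumption.
    """
    strength = [abs(pearson(scores, cognitive_scores))
                for scores in component_scores]
    ranked = sorted(range(len(strength)),
                    key=lambda c: strength[c], reverse=True)
    return sorted(ranked[:k])


# Invented scores for 3 components and 4 participants.
components = [[1, 2, 3, 4], [4, 1, 3, 2], [4, 3, 2, 1]]
chosen = select_components(components, cognitive_scores=[1, 2, 3, 4], k=2)
```

Because the components are chosen by their relation to the outcome, a selection like this is prone to overfitting, which is why the cross-validation mentioned above matters.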

Based on the prediction weight matrix it is possible to distil the expected transitions between states for infants with high and low cognitive scores, as indicated by positive and negative values respectively. However, to make the interpretation somewhat easier, we used the procedure described by Hayes and Henderson (2017) to visualise the scenes of which the scanpaths correlated strongly with the prediction weights (positively and negatively). Using the alpha and gamma values that provided the best fit of the cognitive score prediction, we selected the data pertaining to the 5 highest and 5 lowest scoring infants and computed an SR for each scene for each participant, which was then correlated with the prediction weights. This procedure resulted in 29 correlations per person (1 for each scene). The correlations of each person in the lowest scoring group were then subtracted from the correlations of each person in the highest scoring group. In this difference matrix, we searched for the biggest difference in correlations, which indicated the scene that best represented two scan patterns that correlated positively and negatively with the overall prediction weights. These two scan patterns would thus represent the scanning of high and low scoring infants, respectively.

Traditional analyses. The SRSA has never been applied to infant data. Therefore, we compared this analysis to traditional eye movement analyses. We first predicted cognitive scores using first-order transition models. To this end, we used the same models as in the SRSA, with the same alpha values. However, gamma values were set to zero throughout the whole process, so as to create first-order transitions (Hayes & Henderson, 2017). The first-order transition models were fitted for all 3 state spaces and used to predict both cognitive scores.

We also predicted cognitive scores using averaged eye movement measures: fixation duration, number of fixations, and saccade amplitude. In a multiple linear regression with these 3 variables as predictors, we predicted both cognitive scores. To be able to make conclusions about the benefits of the SRSA over traditional analyses, the latter analyses were also cross-validated using a leave-one-out cross-validation (Hayes et al., 2011; Hayes & Henderson, 2017).

Stability of Scanpaths. As a first estimation of the stability of the SRSA, we ran an SRSA for the first 14 trials and another SRSA for the other 15 trials. Subsequently, the prediction weight matrices of these two SRSAs were correlated, as these matrices represented the summed viewing strategies. However, the prediction weight matrices are based on SRs averaged over participants (Hayes et al., 2011; Hayes & Henderson, 2017). As this project’s aim was to investigate individual differences, one would want to examine the stability of the SRs at the level of the individual. In the SRSA, the projections that act as predictors of cognitive scores show to what extent the scanning behaviour of each individual participant resembles the viewing strategies (Hayes et al., 2011). The projections may thus be a more appropriate way of examining the stability of scanpaths. Hence, we correlated the projections from the first set of SRSAs with the projections from the second set of SRSAs. Both stability checks were done for all three state space SRSAs and for both cognitive scores.

Results

Out of 54 infants, 13 infants were placed on their parent’s lap during the experiment (1 infant started the experiment on their parent’s lap, 3 infants switched from the car seat to their parent’s lap during or after MT, 4 infants switched during or after GO, and 5 infants switched during or after FV). As the experiment lasted approximately 15 minutes, the experimenters kept track of the infants’ mood and behaviour. In general, infants seemed content during MT and GO. However, as the experiment progressed, infants became distracted or fussy. Thus, infants may have been distracted during FV.

Cognitive Tasks

Recognition Memory. In total, there were 9152 fixations on either the left or right face: 4953 in the familiarisation phase and 4199 in the test phase. On average, fixations had a root mean square (RMS) of 0.03° (SD = 0.02). Fixation durations ranged between 100ms and 7416ms during familiarisation (Median = 392, M = 490, SD = 370.45), whereas in the test phase, fixation durations ranged between 100ms and 4992ms (Median = 392, M = 487.50, SD = 356.49). As is often found in the infant eye-tracking literature, fixation durations significantly correlated with age (r = -0.37, p < 0.01, Table 1); as infants grow older, fixation durations get shorter.

Table 1. Correlation matrix of all measures pertaining to all 3 tasks: recognition memory, gap-overlap, and free-viewing.

Measure                     1         2       3         4         5         6       7       8      9
1. Age                      1.0
2. RM: Novelty score        0.16      1.0
3. RM: Median Duration      -0.37**   -0.06   1.0
4. GO: Median Gap RT        -0.27*    -0.16   0.41**    1.0
5. GO: Median Overlap RT    -0.44***  -0.21   0.31*     0.32*     1.0
6. GO effect                -0.25     -0.10   0.03      -0.37**   0.76***   1.0
7. FV: Median Duration      -0.18     0.18    0.54***   0.29*     0.38**    0.17    1.0
8. FV: NF                   0.39**    0.18    -0.29*    -0.21     -0.31*    -0.16   -0.15   1.0
9. FV: SA

*** p < .001; ** p < .01; * p < .05.

Note. RM = recognition memory; GO = gap-overlap; RM: Median Duration = median fixation duration in the memory testing phase; GO effect = the gap-overlap effect; FV: Median Duration = median fixation duration in the free-viewing; FV: NF = number of fixations in the free-viewing; FV: SA = free-viewing saccade amplitude.

Novelty scores. We calculated novelty scores for each infant by dividing the duration of fixations on novel faces across test phase trials by the duration of fixations on both faces across test phase trials. This score was then converted to a percentage. Novelty scores ranged from 25.55 to 70.50 (M = 52.98, SD = 9.10), and differed significantly from 50%, t(53) = 2.40, p = 0.02. There was no relationship between novelty scores and age, see Figure 6. The internal consistency of the novelty scores was examined, and a split-half reliability analysis indicated that the RM task had low reliability, r = -0.24, p = 0.01. It is difficult to pinpoint how this negative reliability arose. Perhaps the first half of the RM data consisted of fewer trials due to infants who were still getting accustomed to the room. Alternatively, the second half may have consisted of fewer trials due to declining attention of infants.
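The novelty score and its comparison against chance can be illustrated with a short sketch. The function names are hypothetical. The summary-statistic version of the one-sample t-test only re-derives the reported test from the reported mean, standard deviation, and sample size.

```python
from math import sqrt

def novelty_score(novel_ms, familiar_ms):
    """Percentage of test-phase looking time spent on the novel faces."""
    novel = sum(novel_ms)
    total = novel + sum(familiar_ms)
    return 100.0 * novel / total

def t_from_summary(mean, sd, n, mu=50.0):
    """One-sample t statistic computed from summary statistics."""
    return (mean - mu) / (sd / sqrt(n))

# Reported values: M = 52.98, SD = 9.10, n = 54, tested against 50%.
t = t_from_summary(52.98, 9.10, 54)
print(round(t, 2))  # about 2.41, in line with the reported t(53) = 2.40
```

The small difference from the reported value is consistent with rounding of the mean and standard deviation.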

Figure 6. The relationship between age and cognitive scores.

Lateral bias. Studies employing RM paradigms sometimes encounter infants who show a lateral bias, whereby their fixations are biased towards either the left or right stimulus. When infants show such a bias, their novelty scores will not be based on the comparison between novel and familiar faces (Fisher-Thompson & Peterson, 2004; Rose, Feldman, & Jankowski, 2001). We first examined this lateral bias for each infant in each trial, by calculating the percentage of time spent looking at the left or right face. If this percentage exceeded 80%, there was a lateral bias for that trial. We found 34 trials where infants showed a side bias, distributed among 42 infants, but these biases did not correlate with the corresponding novelty scores per trial nor with age. In order to investigate whether some infants had a complete bias toward either side, we calculated an overall lateral bias as the percentage of total looking time in the test phase that was devoted to either side, with the same criterion. Only two infants exclusively looked at one side of the stimuli. There was no relationship between the individual lateral bias and novelty scores, nor was there a relationship between the individual lateral biases and age.
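The 80% criterion can be made concrete with a small sketch. The function name and the choice to return a side label are hypothetical. The same function applies to a single trial or to looking times summed over the whole test phase.

```python
def lateral_bias(left_ms, right_ms, criterion=80.0):
    """Return 'left' or 'right' when one side exceeds the criterion, else None."""
    total = left_ms + right_ms
    if total == 0:
        return None  # no looking time recorded
    pct_left = 100.0 * left_ms / total
    if pct_left > criterion:
        return "left"
    if 100.0 - pct_left > criterion:
        return "right"
    return None
```

Because the criterion is "exceeds 80%", a trial with exactly 80% of looking time on one side is not flagged.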

Gap-Overlap. RTs in 5 trials (3 gap and 2 overlap) were shorter than 200ms or longer than 1500ms, and were thus excluded from analysis. RTs ranged from 205.5ms to 1366.1ms (M = 496.1, median = 457.6ms). Out of 2160 peripheral targets, infants did not look at 329 peripheral targets (i.e., 15%; 159 gap trials, 170 overlap trials). Inspection of normality, skewness, and kurtosis indicated that the data were non-normal and positively skewed. We used median RTs to calculate the gap-overlap effect, as median RTs are less sensitive to skewness (Whelan, 2008). There were correlations between age and RTs in the gap condition (r = -0.27, p = 0.05) and between age and RTs in the overlap condition (r = -0.42, p < 0.01). Older infants were faster to disengage from the central stimulus than younger infants. However, the gap-overlap effect did not significantly correlate with age, nor did it correlate with the RM. The internal consistency of the GO was also examined, and a split-half reliability analysis indicated that the GO task had moderate reliability, r = 0.40, p = 0.002.
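A minimal sketch of the RT filtering and the median-based gap-overlap effect is given below. It assumes that the effect is the median overlap RT minus the median gap RT, which is the usual definition but is not spelled out in this section. The function name is hypothetical.

```python
from statistics import median

def go_effect(gap_rts, overlap_rts, lo=200.0, hi=1500.0):
    """Gap-overlap effect in ms from the median RTs of valid trials."""
    def valid(rts):
        # Drop RTs shorter than 200 ms or longer than 1500 ms.
        return [rt for rt in rts if lo <= rt <= hi]
    # Assumed direction: overlap minus gap, so larger values mean slower
    # disengagement when the central stimulus stays on screen.
    return median(valid(overlap_rts)) - median(valid(gap_rts))
```

Using medians means that a single slow trial within the valid range has little influence on an infant's score.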

Free-viewing. In total, there were 19246 fixations in the free-viewing data, with an average of 290 (SD = 86) fixations per infant. To include only stable fixations, we removed fixations of which the standard deviation in point of gaze exceeded 0.7 degrees of visual angle and fixations of which the RMS exceeded 0.1°. This led to a total of 15640 available fixations for scanpath analysis (mean standard deviation in point of gaze = 0.20, mean RMS = 0.03). Fixation durations ranged from 100ms to 4936ms (M = 460.8, SD = 309.8, median = 394). Inspection of normality showed that fixation durations were positively skewed. As fixation durations were used to predict cognitive functioning in a linear model as a comparison with scanpath analysis, fixation durations were log-transformed to approximate a normal distribution. Median fixation durations correlated positively with RTs in gap (r = 0.29,


negatively correlated with the number of fixations infants made (r = -0.32, p < 0.05). This indicated that infants who disengaged more easily made shorter and more fixations. Moreover, the number of fixations was negatively correlated with fixation durations in RM (r = 0.39, p < 0.01), indicating that the shorter the fixations, the more fixations could be made (Table 1).

Scanpath analysis. We used the successor representation scanpath analysis (SRSA) to predict individual differences in cognitive scores, as described above. Part of the SRSA is to maximise the model fit of the cognitive scores, which was done by a procedure that searched for the alpha and gamma values that achieved the best fitting model (Hayes & Henderson, 2017). However, in the SRSA implementation the process of finding these two values sometimes fails to find the best possible solution. Instead of finding the best alpha and gamma values among all possible options, it only finds the optimal solution among neighbouring values. In order to ensure that the SRSA would find the best possible alpha and gamma, we computed R2 values for combinations of these values in steps of 0.04. The alpha and gamma values that resulted in the best R2 were then used as initial starting values for the SRSAs (source code available at https://github.com/Kucharssim/SRSA).
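The grid evaluation used to pick starting values can be sketched as an exhaustive search. The sketch assumes a caller-supplied function `fit_r2(alpha, gamma)` that returns the goodness-of-fit of one SRSA fit. That function, the [0, 1] range of the grid, and the function name are assumptions made for illustration and are not taken from the linked repository.

```python
import numpy as np

def grid_search(fit_r2, step=0.04):
    """Evaluate fit_r2 on a regular alpha x gamma grid over [0, 1].

    Returns (best_r2, alpha, gamma). Non-finite fits are skipped.
    """
    grid = np.round(np.arange(0.0, 1.0 + step / 2, step), 2)
    best = (-np.inf, None, None)
    for alpha in grid:
        for gamma in grid:
            r2 = fit_r2(alpha, gamma)
            if np.isfinite(r2) and r2 > best[0]:
                best = (r2, float(alpha), float(gamma))
    return best

# Toy objective with a single peak, standing in for a real SRSA fit.
toy = lambda a, g: -((a - 0.56) ** 2 + (g - 0.96) ** 2)
best_r2, alpha0, gamma0 = grid_search(toy)
```

A step of 0.04 gives 26 values per parameter, so 676 fits per state space and cognitive score. The grid optimum is then handed to the local optimiser as its starting point.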

SRSA FV: Recognition Memory

Horizontal State Space. To predict novelty scores, an SRSA of the FV data was performed with 0.56 and 0.96 as initial values for the learning rate parameter alpha and the discount parameter gamma, respectively. After the optimization process, the best prediction of novelty scores had a goodness-of-fit of R2 = 0.27. However, the goodness-of-fit decreased dramatically after cross-validation (R2 = 0.04), see also Table 2 for all SRSA results. This indicated that the SRSA of the horizontal state quantification of FV was unable to predict novelty scores. Although the SRSA was unable to predict novelty scores, we further examined the relationship between cognitive individual differences and individual differences in FV scanning strategies by inspecting the prediction weight matrices for this SRSA. This seemed a fruitful endeavour, as this study was exploratory and trends in strategies could perhaps be detected. A prediction weight matrix was composed of the top 5 PCA components multiplied by the regression coefficients from the SRSA regression model. Figure 7A shows the prediction weight matrix for the horizontal state space prediction of novelty scores. Following Hayes and Henderson’s (2017) interpretation of the prediction weights, negative values (red colours) indicate strategies followed by low scoring infants, whereas positive values (blue colours) represent strategies followed by high scoring infants. These values in the prediction weights were used to differentiate possible strategies in free-viewing.

Table 2. Results from the SRSA models and first-order transition models. For each cognitive score and each state space, the goodness-of-fit R2 is provided together with the cross-validated R2cv and the α and γ values that were used to achieve the best fit.

                          Horizontal State             Vertical State               Radiating State
Model                     R2     R2cv      α     γ     R2     R2cv     α     γ      R2     R2cv     α     γ
SRSA model
  RM                      0.27   0.04      0.69  1     0.34   0.006    0.62  0.58   0.28   0.09     0.62  0
  GO                      0.32   0.00005   0.19  0.69  0.38   0.0002   1     0      0.37   0.0005   0.90  0.50
First-order transition model
  RM                      0.22   0.06      0.58  0     0.24   0.004    0.97  0      0.27   0.09     1     0
  GO                      0.26   0.005     0.56  0     0.29   0.03     0.15  0      0.26   0.0002   0.31  0

Note. RM = recognition memory; GO = gap-overlap

Following Hayes et al. (2011), the prediction weights should be interpreted as the number of expected future transitions to a state (i.e., AOI, indicated by the matrix row) given that one was fixating a certain state (i.e., the matrix column). However, interpreting the prediction weights is rather challenging. The horizontal SRSA prediction weights did not show distinct clusters that would indicate different strategies. As Hayes and Henderson (2017) predicted, extracting viewing strategies from unconstrained free-viewing data was rather difficult. Hence, we followed the procedure described by Hayes and Henderson to aid the interpretation of free-viewing prediction weight matrices. Inspection of the most illustrative scan patterns seemed to imply somewhat distinct strategies. The low scoring infant showed one clear cluster of transitions located around the first 2 states of the horizontal space, as indicated by the 2 by 2 values in the first 2 rows of the illustrative scan pattern (Figure 7c). The low scoring infant thus only scanned the peripheral regions of the scene. The high scoring infant showed similar scanning behaviour in the early states, but additionally scanned all other states frequently (Figure 7b). These different scanning behaviours are also observable in Figures 7d and 7e. To recapitulate, the high scoring infant traversed the entire scene while the low scoring infant lingered in the first 2 states. This difference in scanning behaviour might thus imply that distinct strategies exist.


Figure 7. Interpretation of the horizontal SRSA correlated with recognition memory. A) shows the prediction weights where positive (blue) values indicate the expected scanning strategy of high scoring infants and negative (red) values indicate the expected scanning strategy of low scoring infants. The columns represent the current state from which the transition is initiated, whereas the rows represent the state where the transition is expected to land. B) and C) show the SRSA for the most illustrative scene where 2 infants showed the biggest difference in scanning behaviour: B) for the high scoring infant and C) for the low scoring infant. D) shows the state transition of each point in the scan pattern. E) visualises the scanpaths of the most illustrative scene. In all plots, blue coloured values or lines represent the high scoring infant and red coloured values or lines the low scoring infant.

Vertical State Space. The SRSA for novelty scores predicted by the vertical state space scanpath quantification of FV data was performed with starting values α = 0.1 and γ = 0.6. The SRs of the vertical state space initially accounted for more than 30% of the variance in novelty scores (R2 = 0.34). However, this best fit did not cross-validate (R2 = 0.006), indicating that the SRSA was unable to account for the variance in novelty scores. From the prediction weights, it was not clear whether distinct strategies existed. Expected transitions were scattered across the scene, without the appearance of clusters. Inspection of the illustrative scan patterns also failed to show distinct clusters of expected transitions. Both the high scoring infant and the low scoring infant tended to scan the first 2 states, as indicated by the values on the first two rows in Figures 8b and 8c (the receiver states). Both infants also tended to scan the bottom state of the scene and the border, which is also indicated by the values on the bottom 2 rows of these matrices. However, as can be seen in Figures 8d and 8e, the low scoring infant (red) made more transitions within the regions that both infants tended to scan. Based on the similar distribution of values in the scan pattern visualisation, it may be concluded that the vertical state space SRSA was unable to uncover distinct strategies that were related to individual differences in novelty scores.


Figure 8. Interpretation of the vertical SRSA correlated with recognition memory. A) shows the prediction weights. B) illustrative scanpath for the high scoring infant and C) illustrative scanpath for the low scoring infant. D) shows the state transition. E) visualises the scanpath. In all plots, blue coloured values or lines represent the high scoring infant and red coloured values or lines the low scoring infant.

Radiating State Space. We performed another SRSA with scan patterns quantified according to the radiating state space and predicted novelty scores using starting values α = 0.56 and γ = 0.76. The radiating state space SRSA initially predicted novelty scores relatively well (R2 = 0.28). Note, however, that this prediction was best achieved using a gamma value of 0, indicating that this SRSA was unable to capture elongated statistical regularities in scanpaths (Hayes & Henderson, 2017). Moreover, cross-validation revealed the SRSA to be unsuccessful at explaining variance in novelty scores (R2 = 0.09). As in the other two SRSAs, the prediction weights failed to show distinctive clusters. The illustrative scan patterns showed that the high scoring infant and the low scoring infant had a similar viewing strategy, as indicated by the distribution of values in the scan pattern matrices of Figures 9b and 9c. Values in these matrices are mostly located on rows 3 and 4 for both infants. The two infants thus scanned the same states. However, it is important to note that their scan patterns did differ in some respects. First, the low scoring infant only scanned the right side of the scene without returns to earlier states. The high scoring infant, however, came back to central states 1 and 2 more frequently (Figure 9d). Second, although the 2 infants scanned similar states, their scan patterns were not similar. The high scoring infant scanned the scene from the bottom left corner to the top right corner (and vice versa) while also scanning the central areas. The low scoring infant’s scan pattern, however, lingered in the mid right periphery (Figure 9e). Note, however, that the low scoring infant made fewer fixations than the high scoring infant (Figure 9d). In sum, state-wise the scanning behaviour was similar, with more exploration of these states in different directions by the high scoring infant.


Figure 9. Interpretation of the radiating SRSA correlated with recognition memory. A) shows the prediction weights. B) illustrative scanpath for the high scoring infant and C) illustrative scanpath for the low scoring infant. D) shows the state transition. E) visualises the scanpath. In all plots, blue coloured values or lines represent the high scoring infant and red coloured values or lines the low scoring infant.

SRSA FV: Gap-Overlap Task

Horizontal State Space. An SRSA was performed with initial values of alpha = 1.0 and gamma = 0.36 in order to predict GO effect scores. The goodness-of-fit R2 of the SRSA was 0.32, but after cross-validation this was reduced to R2 = 0.00005. This indicated that the SRSA did not explain variance in the GO effect scores. The prediction weights did not indicate differences in expected transitions to certain states (Figure 10a), as was apparent from the lack of clusters of transitions. The illustrative scan patterns did not reveal distinct clusters of expected transitions between the high scoring infant (10b) and the low scoring infant (10c). The scan patterns belonging to the two infants showed a similar distribution of expected transitions across the scan pattern matrix. The high scoring infant was expected to scan the first 4 states of a scene. The low scoring infant, by contrast, was expected to make more transitions to states 1, 3, and 4. The two illustrative scan patterns were thus rather similar, as is also observable in Figures 10d and 10e. This suggests that the horizontal SRSA was unable to show distinct strategies.

Figure 10. Interpretation of the horizontal SRSA correlated with the gap-overlap task. A) shows the prediction weights. B) illustrative scanpath for the high scoring infant and C) illustrative scanpath for the low scoring infant. D) shows the state transition. E) visualises the scanpath. In all plots, blue coloured values or lines represent the high scoring infant and red coloured values or lines the low scoring infant.

Vertical State Space. The SRSA prediction of the GO effect was performed with α = 1 and γ = 0. Cross-validation of the SRSA decreased the initial best fit (R2 = 0.38) of the GO effect prediction to R2 = 0.0002. Hence, the vertical SRSA of FV data was unsuccessful at predicting GO effect scores. Note, however, that the vertical state space SRSA was unable to extract elongated scan patterns from the FV data, as indicated by γ = 0 (see Table 2). The vertical state space quantification of the scanpaths thus resembled a first-order transition model (Hayes & Henderson, 2017). The prediction weights of the vertical state space SRSA did not show distinct clusters of positive or negative values. The illustrative scan patterns (11b and 11c) showed that the high scoring infant tended to scan the early vertical states (states 1 to 3). The low scoring infant also tended to scan the early states (states 1 and 2), although this infant did not scan the third state, see also Figures 11d and 11e. As the transitions of the scan patterns were similar in both infants, the vertical state SRSA appeared unable to reveal distinct viewing strategies related to GO effect scores.

Figure 11. Interpretation of the vertical SRSA correlated with the gap-overlap task. A) shows the prediction weights. B) illustrative scanpath for the high scoring infant and C) illustrative scanpath for the low scoring infant. D) shows the state transition. E) visualises the scanpath. In all plots, blue coloured values or lines represent the high scoring infant and red coloured values or lines the low scoring infant.


Radiating State Space. The SRSA of the GO effect was performed with initial values of α = 0.92 and γ = 0.56. The analysis was unable to account for the variance in the GO effect scores (R2 = 0.37; after cross-validation, R2 = 0.00005). The prediction weights for the radiating state space prediction of GO effect scores did not display any free-viewing strategies, as indicated by the lack of clusters in the matrix (Figure 12a). However, the illustrative scan patterns revealed 2 distinct strategies of viewing behaviour. The high scoring infant tended to scan the first 4 states of the scene, as indicated by the values on the first 4 rows of Figure 12b. This suggested that this infant scanned the central states of the scene. The low scoring infant scanned the periphery of the scenes, as indicated by the values on the bottom 2 rows of Figure 12c. Figures 12d and 12e also show this. This may suggest that high scoring infants move between central regions during scene viewing, whereas low scoring infants scan the periphery.

Figure 12. Interpretation of the radiating SRSA correlated with the gap-overlap task. A) shows the prediction weights. B) illustrative scanpath for the high scoring infant and C) illustrative scanpath for the low scoring infant. D) shows the state transition. E) visualises the scanpath. In all plots, blue coloured values or lines represent the high scoring infant and red coloured values or lines the low scoring infant.


In sum, the SRSAs for RM were unsuccessful at explaining individual differences in novelty scores. With the exception of the horizontal state space, the prediction weights were unable to show distinct strategies in viewing behaviour. Only the horizontal state space suggested that a high score entailed more elaborate scanning across all states, whereas a low score was indicative of scanning the periphery. The SRSAs for the GO task were also unsuccessful at predicting GO effect scores and unveiling distinct scan patterns, with the exception of the radiating state space SRSA for GO effect scores. The radiating space suggested that high scoring infants might be more inclined to traverse the entire scene, whereas low scoring infants might be biased towards the periphery. This scanning behaviour was similar to what the horizontal SRSA for RM uncovered.

However, the strategies found with the SRSAs of the RM and GO effect need to be interpreted with caution for several reasons. First, none of the SRSA models cross-validated, suggesting that scan patterns, as captured by SRs, are not related to cognitive scores in our sample. Second, our procedure for identifying strategies was flawed in one respect: strategies were inferred from only 2 illustrative scan patterns. It is questionable whether 2 scan patterns are able to reveal true differences in scanning behaviour that are linked to cognitive individual differences. Finally, the SRSA of infant FV data appears to be an unstable method for capturing infant scan patterns, as will be discussed below. This calls into question whether the SRSA is equipped to act as an additional tool for predicting future cognitive functioning.

Stability of the SRSA

We examined the degree to which the scan patterns, as captured by the SRs, were stable. For all three state spaces and both cognitive scores, SRSAs were run for the first 14 trials and for the next 15 trials, separately. For each state space and cognitive score, the prediction weights from the first set of SRSAs were subsequently correlated with those from the second set of SRSAs. Correlations for all state spaces and cognitive scores were close to zero, indicating a lack of stability of scanpaths (see Table 3 for all correlations).

Table 3. Stability of the SRSA. Correlations between the prediction weights from the two sets of SRSAs for each state space and cognitive score (recognition memory, gap-overlap).

                      Horizontal State   Vertical State   Radiating State
Recognition memory    -0.03              0.28             -0.14

The stability of the SRs was also examined at the level of the individual. The projections from both sets of SRSAs of each state space and cognitive score were correlated with each other. As is evident from the diagonal of Table 4, there were low, non-significant correlations between the projections. Hence, it may be concluded that the SRs (analysed with SRSA) of the participants in this sample were unstable.

Table 4. Stability of the SRSA. Correlations between the projections from SRSA sets, for all three state spaces and both cognitive scores: RM (recognition memory) and GO (gap-overlap).

                 Projection 1   Projection 2   Projection 3   Projection 4   Projection 5
Horizontal RM
  Projection 1   0.31           -0.11          -0.05          -0.09          0.11
  Projection 2   -0.25          0.12           0.09           0.17           0.08
  Projection 3   0.05           -0.20          0.04           -0.08          -0.08
  Projection 4   -0.11          -0.22          0.28           0.11           0.01
  Projection 5   0.09           0.17           0.15           -0.23          -0.14
Vertical RM
  Projection 1   0.11           0.13           -0.25          0.01           0.03
  Projection 2   -0.24          0.08           0.14           0.05           -0.01
  Projection 3   -0.20          -0.13          -0.13          0.03           0.03
  Projection 4   0.06           0.07           -0.47          -0.04          0.00
  Projection 5   0.05           -0.07          0.42           -0.08          0.18
Radiating RM
  Projection 1   -0.07          0.06           0.47           0.06           0.05
  Projection 2   0.24           -0.17          0.16           0.14           -0.02
  Projection 3   0.10           0.14           -0.14          -0.12          0.06
  Projection 4   -0.19          -0.14          -0.01          0.02           0.04
  Projection 5   -0.10          0.21           0.08           0.20           0.08
Horizontal GO
  Projection 1   0.07           -0.017         0.05           0.31           0.08
  Projection 2   -0.07          -0.03          -0.29          0.10           0.25
  Projection 3   -0.17          0.07           -0.01          -0.17          -0.08
  Projection 4   0.26           0.12           -0.02          -0.13          -0.01
  Projection 5   0.09           0.22           -0.58          -0.10          0.19
Vertical GO
  Projection 1   -0.07          -0.18          0.06           0.13           -0.11
  Projection 2   0.12           0.20           0.22           -0.19          0.31
  Projection 3   -0.07          0.04           -0.14          -0.03          0.05
  Projection 4   -.019          -0.03          0.00           0.05           0.01
  Projection 5   -0.04          0.18           0.61           0.10           -0.06
Radiating GO
  Projection 1   -0.05          0.05           -0.01          0.37           0.02
  Projection 2   0.11           -0.11          0.34           -0.12          0.14
  Projection 3   -0.03          -0.07          0.22           -0.09          0.04
  Projection 4   -0.14          0.07           0.12           -0.03          -0.01
  Projection 5   0.34           -0.10          0.01           0.22           -0.07


Alternative Analyses

First-Order Transition Analysis. To compare SRSA to more traditional analyses, we predicted all novelty scores and GO effects using first-order transitions in all three state spaces. We used the same model as in the SRSA described above, with the only modification being the discount factor gamma, which was set to zero. Setting gamma to zero ensured that first-order transition matrices were generated instead of SR matrices. In all first-order transition models, the same initial alpha was used as in the respective SRSA models (Hayes & Henderson, 2017). All state spaces were able to account for some amount of variance in novelty scores and GO effects. Goodness-of-fit values ranged from R2 = 0.22 to R2 = 0.29. However, the first-order transition models did not cross-validate. Table 2, presented earlier in the Results section, shows the results from the first-order transition models.

Before cross-validation, the SRSA models explained slightly more variance in cognitive scores than the first-order transition models. However, after cross-validation, both types of model performed equally poorly. Additionally, half of the SRSA models only succeeded at predicting cognitive scores using low gamma values that did not allow the model to capture elongated statistical regularities in scanning behaviour. This might imply that the SRSA and the first-order transition model perform equally poorly on infant FV data.

Traditional Eye Movement Analysis. The SRSA was also compared to traditional methods of eye movement analysis. Across all images, we calculated the average log-transformed fixation duration, average number of fixations, and average saccade amplitude for each participant, which were subsequently used as predictors in a multiple linear regression. With this linear regression we aimed to predict individual differences in cognitive scores. The multiple regression model explained little of the variance in novelty scores (R2 = 0.13), nor did it explain much of the variance in GO effect scores (R2 = 0.10). Moreover, neither model cross-validated: R2 = 0.005 for novelty scores and R2 = 0.008 for the GO effect. Thus, both models were unable to account for variance in novelty scores and GO effect scores. In summary, the SRSA, the first-order transition models, and the averaged eye movement measures all failed to predict cognitive scores.

Cluster analysis. The SRSA was unable to explain individual differences in cognitive scores, nor was it successful at uncovering free-viewing strategies that related to individual differences in cognition. A disadvantage of the SRSA is that the prediction weights are based on the top 5 principal components and on the regression coefficients that use these 5 components to predict cognitive scores. The analysis thus optimises the SRs based on the very cognitive scores that are being predicted. Moreover, this process is sensitive to overfitting (Hayes et al., 2011). This may restrict the degree to which individual cognitive differences can be explained and viewing strategies can be extracted. An alternative way of analysing SRs without the threat of overfitting may be to perform a cluster analysis on the SRs. The relationship between the clusters and the cognitive scores may then be used to infer whether differences in scanning behaviour relate to differences in cognition (https://github.com/Kucharssim/SRSA). Therefore, we performed k-means cluster analyses on the SRs.
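A k-means analysis of SRs can be sketched as below. The sketch rests on two assumptions that the text does not confirm: each infant's SR matrix is flattened into a single feature vector, and the centroids are initialised with a deterministic farthest-first rule instead of random starts. The SRs are synthetic, and none of the names come from the repository cited above.

```python
import numpy as np

def kmeans(X, k=2, n_iter=100):
    """Plain Lloyd's algorithm on the rows of X; returns labels, centroids."""
    # Farthest-first initialisation: start at the first row, then repeatedly
    # add the row that is farthest from all centroids chosen so far.
    centroids = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)

    labels = None
    for _ in range(n_iter):
        # Assign every row to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments no longer change
        labels = new_labels
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return labels, centroids

# Synthetic SRs: 20 hypothetical infants, 3 states, two artificial groups.
rng = np.random.default_rng(0)
srs = np.concatenate([
    rng.normal(0.2, 0.02, size=(10, 3, 3)),
    rng.normal(0.8, 0.02, size=(10, 3, 3)),
])
features = srs.reshape(len(srs), -1)  # one row of 9 SR entries per infant
labels, centroids = kmeans(features, k=2)
print(labels)
```

Because the cognitive scores play no part in forming the clusters, any relation between cluster membership and cognition has to be tested afterwards. This is what removes the overfitting risk described above.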

As in the analyses above, we quantified scanpaths using the 3 state spaces. Two SRs were calculated for each state space, using the starting alpha and gamma values of both the RM SRSA and the GO SRSA. In total, 6 cluster analyses were run, and in all analyses distinct patterns were captured in the SRs using 2 clusters. To visualise the clusters of scanpaths, as captured by SRs, we searched for scenes for which most infants provided data (i.e., more than 4 fixations, as this was the criterion for SRSA). For each scene we plotted the scanpaths of the first 5 infants from each cluster (see Figure 13). Panels A to C of Figure 13 illustrate 3 scenes with distinctive scanning behaviour between clusters that could easily be identified through visual inspection. These plots show that the five infants within each cluster showed similar scan patterns, whereas scanning behaviour differed between clusters. Infants in the second cluster (right-hand images) scanned the images more extensively than infants in the first cluster (left-hand images). Moreover, the clusters differed in how many objects were visited: scanpaths in the second cluster seemed to visit more objects than scanpaths in the first cluster. Although panels A to C show these distinctions, not all images were scanned in such a manner, as can be seen in panel D, where infants from both clusters showed rather similar scanpaths. To gain more insight into how the clusters differed from each other, we tested in an exploratory fashion whether they differed in cognitive scores, age, and number of fixations.

Based on cluster membership, we performed t-tests for each cluster analysis to explore whether the 2 clusters differed in cognitive scores or age. For all state space analyses, no significant differences in cognition were found between the clusters. The clusters differed only in age (all p-values < 0.05). This might imply that differences in scanning behaviour, as captured by SRs, are related to other age-related individual differences rather than to RM and GO. From Table 2 (above), it can be seen that age correlated with the average number of fixations that infants made during free-viewing. To further explore differences between the clusters, we therefore performed t-tests on the number of fixations in the two clusters of each cluster analysis. The number of fixations differed significantly between the two clusters in all analyses (all p-values < 0.001), which is also evident in the length of the scanpaths in Figure 13. Cluster membership was consistent across the cluster analyses for all but 3 infants. One cluster was characterised by fewer fixations and younger ages (M age = 8.52 months, SD age = 0.04; M fixations = 254.19, SD fixations = 26). The other cluster was characterised by older infants and more fixations (M age = 9.80 months, SD age = 0.10; M fixations = 352.38, SD fixations = 2.41). Perhaps this relates to the common finding that infants make shorter fixations as they grow older (Hunnius & Geuze, 2004) and are therefore able to make longer scanpaths.
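The cluster comparisons can be sketched with a two-sample t statistic. The text does not state which variant of the t-test was used, so the sketch assumes Welch's version, which does not require equal variances. The function name and the example values are made up for illustration and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up example: some measure (e.g. age in months) in two clusters.
cluster_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
cluster_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
t, df = welch_t(cluster_1, cluster_2)
print(t, df)
```

Turning t and df into a p-value requires the t distribution. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the same statistic together with the p-value.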

Figure 13. Examples of scanpaths of 5 infants in the clusters plotted on four example images.


As the cluster analyses showed 2 distinct clusters of SRs, we explored whether the cluster analysis would yield stable classifications of membership. Only 3 infants were not consistently classified as belonging to the same cluster, which may indicate that SRs yield stable scanpath representations when cluster analysis, rather than SRSA, is applied. To further explore the stability of scanpaths captured by cluster analysis of SRs, we computed separate SRs for the first 14 trials and for the remaining 15 trials, for every state space. This yielded 2 sets of cluster analyses, each covering all 6 state space SRs (using the starting alpha and gamma values from RM and GO). This again revealed that all but 3 infants were consistently classified as belonging to the same cluster. These initial findings thus suggest that cluster analysis may be a more robust method for analysing scanpaths.
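The consistency check between two cluster solutions can be sketched as follows. The sketch assumes exactly 2 clusters and takes into account that k-means numbers its clusters arbitrarily, so that cluster 0 in one solution may correspond to cluster 1 in the other. The function name and the labels are hypothetical.

```python
def cluster_agreement(labels_a, labels_b):
    """Proportion of infants placed in the same cluster by two
    2-cluster solutions, allowing for swapped cluster numbers."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both solutions must cover the same infants")
    n = len(labels_a)
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    # With 2 clusters, swapping the numbering turns every match into a
    # mismatch and vice versa, so the better of the two mappings is used.
    return max(same, n - same) / n

# Hypothetical labels from the first and the second half of the trials.
first_half = [0, 0, 1, 1, 1]
second_half = [1, 1, 0, 0, 1]
print(cluster_agreement(first_half, second_half))
```

In this toy case the numbering is swapped between the halves and one of the five infants changes cluster, so 4 of 5 are classified consistently.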

Discussion

In this study we used a novel scan pattern analysis technique (SRSA) to relate infant individual differences in free-viewing to cognitive individual differences. Infants completed a free-viewing task and two cognitive tasks: recognition memory and gap-overlap. The results showed that the SRSA models were unable to predict infants’ scores on these two tasks. Although two SRSA models suggested the presence of distinct viewing strategies, it is debatable whether these represent true free-viewing strategies, as they were inferred from an SRSA model that did not predict cognitive scores and from a comparison of only two infants. Hence, we may conclude that the SRSA was unable to uncover free-viewing strategies. We also showed that, on infant data, the SRSA does not perform better than traditional eye-movement analyses: all predicted cognitive scores equally poorly. The results also showed that the SRSA may not be a stable method for analysing infant free-viewing data, as scanpaths captured by SRs were unstable. Although the SRSA did not unveil free-viewing strategies, the cluster analysis did
