Automated classification of self-grooming in mice using open-source software

Cell Press 16, 1-11, September 19, 2016 1

Bastijn J.G. van den Boom (1,2), Hanne A. Mooij (1), and Ingo Willuhn (1,2,*)

1 Netherlands Institute for Neuroscience, Institute of the Royal Netherlands Academy of Arts and Sciences, Amsterdam, The Netherlands

2 Department of Psychiatry, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands

*Correspondence:

Ingo Willuhn, Netherlands Institute for Neuroscience (NIN), Meibergdreef 47, 1105 BA, Amsterdam, The Netherlands. Tel: +31-20-5665491. Email: i.willuhn@nin.knaw.nl

Abstract

Background: To understand how the brain produces behavior, it is necessary to annotate specific behaviors reliably. Manual quantification is labor intensive, time consuming, and subject to inter-rater variability. While considerable progress has been made in the direction of automated behavioral analysis, complex behavior such as grooming still lacks automation. However, because of its complex, repetitive, and sequentially patterned nature, grooming is often used in behavioral neuroscience.

New method: Here, we employ the Janelia Automatic Animal Behavior Annotator and show that our automated classifier is capable of annotating bouts and total duration of mouse grooming based on overhead video recordings.

Results: We show that the automated classifier annotates grooming in open field and elevated plus maze setups using two genetically different strains of mice, SAPAP3 knockout mice and wild type littermates. Our classifier quantifies grooming behavior expressed as bouts and total duration. Using our classifier, we confirmed that SAPAP3 knockout mice show increased grooming behavior.

Comparison with existing methods: Thus far, automated quantification was regarded as inferior to manual assessment. We show that our classifier is as reliable as expert observers, is cost efficient, requires minimal operator time, is non-invasive, provides high-throughput measurements of mouse grooming, and diminishes inter-rater variability.

Conclusion: We establish the use of the automated classifier in behavioral neuroscience, which paves the way for quantitative studies to elucidate the involvement of particular brain areas or neuronal networks in grooming and to phenotype and compare mice with different genotypes. This would increase our understanding of the brain in relation to its behavioral output.

1. Introduction

Behavioral neuroscience seeks to understand the brain function that underlies behavioral phenomena of both normal and pathological nature. Studies often involve behavioral observation of animals. To investigate complex cognitive and other deficits present in neuropsychiatric patients, genetically engineered animal models are employed (Decker, 2006; Levin and Buccafusco, 2006). For example, mouse models have been developed to study complex disorders such as obsessive compulsive disorder (OCD) (Welch et al., 2004), schizophrenia (Gerlai et


of individual bouts (Cromwell and Berridge, 1996). In contrast, genetic depletion of the synapse-associated protein 90/postsynaptic density protein 95-associated protein 3 (SAPAP3) in the striatum is associated with increases in both the number of bouts and the duration of grooming behavior (Welch et al., 2007). Neuromodulators such as dopamine, serotonin, and glutamate have been implicated in both grooming bouts and duration (Audet et al., 2006; Taylor et al., 2010). In addition, as grooming behavior is related to stress and de-arousal, environmental factors and stressors have been found to increase grooming bouts and duration in rodents (Kalueff et al., 2007). Notably, most behavioral neuroscience studies investigate grooming bouts and duration as translational rodent phenotype output (for a review see Kalueff et al., 2016).

Here, we aim to automate quantification of grooming behavior using open-source software analysis of overhead video recordings, an approach thus far regarded as inferior to manual assessment by a human investigator. For this purpose, we use a recently described video-based machine learning system for automatically computing interpretable, quantitative measures of animal behavior based on manual annotations (Kabra et al., 2013). After training the Janelia Automatic Animal Behavior Annotator (JAABA) classifier to detect grooming behavior in mice, we conclude that it can detect grooming bouts and duration reliably and with high throughput. Initially we tested our automated grooming classifier using video recordings of open field behavior, but we demonstrate that our classifier is also capable of detecting grooming on an elevated plus maze with a three-fold decrease in video resolution, without additional training of the automated classifier. Using the SAPAP3 knockout mouse as an example, we demonstrate that our classifier is able to detect novelty-induced differences in grooming behavior of genetically modified animals. Overall, in the present study we show that our classifier can be used to quantify differences in grooming behavior in mice, and thus proves to be useful in neurobehavioral research and behavioral phenotyping of genetically modified animals.

2. Experimental procedures

All animal procedures and methods were performed in accordance with Dutch law (Wet op de Dierproeven, 1996) and European regulations (Guideline 86/609/EEC). All animal experiments were approved by the Animal Experimentation Committee of the Royal Netherlands Academy of Arts and Sciences.

2.1 Animals

Male and female SAPAP3 knockout mice and wild type littermates, aged between 7 and 13 weeks, were used. One week prior to recording, mice were put on a reversed 12-hour light-dark cycle (lights on from 7 p.m. to 7

al., 2000), and anxiety (Bouwknecht and Paylor, 2002).

The ability to carry out thorough, automated, and quantitatively reliable observations in animal experiments will likely increase our understanding of brain function and neuropsychiatric disorders, and help pave the way for novel therapeutic treatments (Ohayon et al., 2013). Manual analysis of animal behavior is time consuming and labor intensive and thus results in a low rate of measurement throughput (Tecott and Nestler, 2004). Slow throughput is mainly caused by the inability of an experimenter to simultaneously annotate multiple animals across sessions and tasks (e.g., open field and elevated plus maze) (Ohayon et al., 2013). Automated classification of complex behaviors can tremendously increase the rate of throughput. Furthermore, reliability of measurements can be increased by applying automated annotation rules to behavioral observations, thereby preventing human inter-rater variability (Martin and Bateson, 2007).

Grooming is an innate, stereotyped, and frequently observed behavior in animals (Spruijt et al., 1992) that consists of complex strings of movements to clean and maintain fur and skin of the body (i.e., hygiene maintenance). Grooming is also involved in other physiologically important processes, including social and sexual interactions (Ferkin and Leonard, 2005; Yu et al., 2010), de-arousal and displacement (Kalueff and Tuohimaa, 2005, 2004a), and thermoregulation (Hainsworth, 1967). It is remarkably similar across species and strains (Spruijt et al., 1992). In awake rodents, grooming is one of the most frequently observed behaviors, and rodents spend a large part of their waking time grooming (30% to 50%) (Kalueff et al., 2010). Because of its complex, repetitive, and sequentially patterned nature, grooming is potentially useful for translational neuroscience research even though it cannot be considered an exact model of any human psychopathology (Kalueff et al., 2016).

Mature rodent grooming is known for its highly stereotypic pattern of sequential movements in a cephalocaudal stepwise pattern, also known as a sequential chain pattern (Kalueff et al., 2007; Richmond and Sachs, 1980). However, this syntactic chain pattern only accounts for approximately 10-15% of all observed grooming behavior (Kalueff et al., 2007). Interestingly, the remaining 85-90% of grooming follows less predictable, more flexible sequential patterns with similar individual movements, strokes, licking, and scratches (Aldridge et al., 2004; Kalueff et al., 2007). Recent studies suggest that different brain areas and neuromodulators are involved in functionally different aspects of grooming behavior. For example, lesion of the ventral pallidum, an output structure of the basal ganglia, has a significant negative effect on the number of grooming bouts (i.e., frequency of behavior/initiation of grooming), but not on average duration


empty chamber freely without restrictions. Animals were placed in a corner facing the wall, using the PVC tube provided in their home cage, after which 5 minutes of habituation started, followed by a one-hour video recording.

2.5 Elevated plus maze (EPM) test

The EPM was custom-made (NIN mechanical workshop) and consisted of two open arms and two enclosed arms (length arms 31 cm, total length EPM 69 cm, width arms 8 cm, height closed arms 11 cm), elevated 50 cm above the floor, with shielding panels (183 x 121 cm) surrounding the EPM to minimize disturbance and spatial cues. The Basler GigE camera was mounted on a custom rail centered above the EPM (100 cm), with an IR beam on each side, aligned above the enclosed arms (50 cm, camera to IR beam). IR beams were tilted and adjusted prior to the start of the experiments to gain maximal illumination of the area and animal. Again, no environmental cues or attributes such as water, bedding, or food were available for the duration of recording. The experimenter put the animal on the center square of the EPM facing the same open arm for every session and immediately started video recording. The animal was allowed to explore the EPM for 10 minutes while being video recorded.

2.6 Behavioral analysis

2.6.1 Definition of grooming

Grooming activity was recorded for each individual bout, including all patterns known to be involved in grooming behavior. In this study, all phases of the syntactic grooming chain, including additional tail and genital licking, were marked as grooming: elliptical strokes, small strokes, bilateral strokes, flank licks, and tail and genital licks. Elliptical strokes consist of paw licking and elliptical strokes over the nose and face, along the snout. Small strokes include fast, small, asymmetrical circular strokes over the eyes and head. Bilateral strokes start with large bilateral strokes behind the ears and over the neck. Flank licks include grooming or scratching of the body, performed by licking or biting the body or by scratching it with the hind paws (Berridge et al., 1987). Tail and genital licks are regularly shown by rodents and consist of leg licking/biting and grooming of the tail and genitals with the mouth (Kalueff and Tuohimaa, 2004b). In this study, grooming was defined as one or more of the above stated movements in a flexible, serial, non-chained order. Thus, animal behavior was marked as grooming when one of the above defined behaviors was present, not necessarily followed by the appropriate fixed chain sequence.

2.6.2 Grooming activity measures

Two quantitative ethological measures of grooming activity, frequently used in behavioral neuroscience (for a review see Kalueff et al., 2010), were evaluated: bouts (frequency of grooming) and total duration (cumulative

a.m.), housed individually, and put on 80-90% food restriction with daily delivery of chow food (Evigo Rms BV). Animals had ad libitum access to water and were handled daily. Standard cages (33 x 18 x 14 cm) with corn cob bedding, additional nesting material, and a PVC tube as enrichment were used in a temperature- (21-23 °C) and humidity-controlled (40-65%) room.

2.2 Procedure

Behavioral testing was conducted between 10:00 and 16:00 hours. On experimental days, mice were transported to the dark experimental room (dim red light available) and left undisturbed for 5 minutes prior to testing. Mice were placed individually in a clean, unfamiliar apparatus with a centered overhead video camera. Between subjects, each apparatus was cleaned thoroughly using 70% ethanol and purified water. During recordings, the animals were observed by an experienced experimenter, who sat in front of the recording computer (2 meters away from the apparatus).

2.3 Recording equipment

Overhead video recordings were performed using a high-end Basler GigE infrared camera (monochrome 1/2” Basler acA1300-60gm CS-mount) fitted with a C-mount lens to CS-mount camera adapter ring (5 mm) to hold the Kowa lens (1/1.8”, F 1.6, 4.4-11 mm) and the IR pass filter (43 mm, P = 0.75 mm). The camera was placed centrally above the setups and connected to a computer (Dell T3500 workstation, Windows 7 64-bit) via an Ethernet cable and network card (Basler GigE Vision Adapter). Video data was always captured at a frame rate of 30 fps and with 1024 x 768 pixel images in uncompressed AVI format (average recording time and file size were 1 hour and 600 MB for the open field, and 10 minutes and 70 MB for the elevated plus maze) using a custom written script in the open-source software Bonsai (Lopes et al., 2015). Illumination was accomplished by two infrared beams (Infrared Illuminator IR-56, 880±20 nm, 60±6° x 40±4° beam, Microlight). Camera gains and black levels were adjusted prior to the experiment to obtain high contrast between background and animal without saturating the light intensity of the recorded image.

2.4 Open field (OF) test

All experiments were performed in a custom-made square OF chamber (30 x 30 x 40 cm, NIN mechanical workshop) made of clear Perspex and shielded with white paper on the outside to minimize disturbance and spatial cues. The OF chamber was placed on a rack with the Basler GigE camera mounted on a rail centered above the OF (50 cm). One IR beam was located next to the camera. Below the Perspex OF chamber, another IR beam was mounted to increase illumination. No bedding, water, food, or other environmental attributes were available during video recordings. Animals were able to explore the


reflection in Perspex, feces). The program is capable of computing elliptic trajectories of multiple animals and therefore requires substantial processing power. Analyzing a one-hour video with one animal took our computer (Windows 7 64-bit, Intel Xeon W3530 CPU 2.80 GHz 4 cores, 12 GB RAM) approximately 12 hours, so we ran the analysis overnight and used multiple similar computers to calculate the trajectories of multiple animals efficiently. The output of Motr is a MATLAB file with x-y trajectories, which is required as input for JAABA. Next, JAABA requires the video and trajectory file to be placed in a single, separate directory, called an experimental directory, together with one additional directory; creating these takes just a few seconds. After creating this directory structure, JAABA is ready to train a classifier based on that video or to annotate the bouts and total time spent grooming in that video based on an existing classifier (previously trained with other video recordings). Our automated classifier annotated grooming in one-hour videos in approximately 30 seconds and was able to annotate multiple videos in sequence through a custom written MATLAB script. The output of our automated classifier was a MATLAB matrix file containing, for every frame,

time in seconds) spent grooming during the entire session. Measurements were manually scored by experienced observers using a custom written script in Bonsai to score behavior based on keyboard presses. Bonsai converted all keyboard presses into an Excel output file listing, per animal, the number of bouts and total duration of grooming behavior per session.

2.6.3 Training the automated classifier

We trained a JAABA classifier using 10-minute video recordings of 6 SAPAP3 knockout mice in the OF test. These mice are known to demonstrate increased grooming activity and thus provide enough grooming video frames to train the automated classifier. Using the MATLAB-based JAABA graphical user interface, we trained an automated classifier to recognize all the video recorded grooming movements previously described (see 2.6.1 Definition of grooming). Training this automated classifier to perform reliably took approximately 10 hours and multiple versions of the automated classifier. The final version of the automated classifier was trained on 21,790 single frames, of which 12,011 single frames displayed grooming behavior and 9,779 single frames did not show grooming. We noticed that our classifier reached some form of maximal reliability after training it on approximately 20,000 single frames. The experienced experimenter trained the automated classifier according to the guidelines of grooming behavior described above. The automated classifier recognized all different stages of grooming and combined these grooming events to output a quantitative measurement of general grooming behavior expressed as bouts and total duration.

2.7 Analysis of grooming

Bouts and total duration data were computed automatically after analyzing single video recordings. In summary, raw uncompressed AVI video recordings of grooming behavior in the OF and EPM were analyzed using the open-source, MATLAB-based Mouse Tracker (Motr) (Ohayon et al., 2013). Motr is a single-camera computer vision system that automatically learns the identity of the mouse and tracks that identity without confusing it with artifacts (e.g.

Figure 1 | Design of the open field (OF) and elevated plus maze (EPM). Drawing in side perspective of the OF setup used to score grooming behavior (a). Side perspective of our EPM used to score grooming behavior (b). Note the representative video frame on the computer screens. Horizontal, linear traces below the video show annotations used in this experiment. First row: manual scoring; second row: automated raw scores; third row: binary predictions. Grooming behavior is annotated in red and non-grooming in blue.

Figure 2 | Cross-validation and ground-truth testing. White bars represent correct annotation of grooming, whereas dark gray with white dots means incorrect annotation of grooming (false negative; type II error). Light gray represents correct non-grooming, while dark gray with dark dots represents incorrect non-grooming (false positive; type I error). The automated classifier correctly annotates individual grooming frames in 10,005 out of 12,011 frames and correctly annotates non-grooming in 7,903 out of 9,779 single frames (a). After applying a minimum groom bout length of 10 frames (0.33 seconds at 30 fps), the automated classifier correctly annotates 110 out of 118 grooming bouts and 243 out of 255 non-grooming bouts (b).
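The percentages reported in section 3.1 can be reproduced from the counts in the Figure 2 caption. The following is a minimal Python sketch rather than the authors' MATLAB code; the function name `rate` is a hypothetical helper introduced here for illustration.

```python
def rate(correct: int, total: int) -> float:
    """Fraction of items of one class that were annotated correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Counts taken from the Figure 2 caption.
frame_sensitivity = rate(10_005, 12_011)  # grooming frames (panel a)
frame_specificity = rate(7_903, 9_779)    # non-grooming frames (panel a)
bout_sensitivity = rate(110, 118)         # grooming bouts (panel b)
bout_specificity = rate(243, 255)         # non-grooming bouts (panel b)

for name, value in [
    ("frame sensitivity", frame_sensitivity),
    ("frame specificity", frame_specificity),
    ("bout sensitivity", bout_sensitivity),
    ("bout specificity", bout_specificity),
]:
    print(f"{name}: {value:.1%}")
```

Rounded to whole percentages, these ratios give the 83%, 81%, 93%, and 95% values quoted in the text.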


training data. We found a sensitivity of 83% for annotating frames labeled as grooming and a specificity of 81% for annotating frames labeled as non-grooming (Figure 2a). To increase the accuracy of our classifier, we smoothed the prediction by applying a post-processing method, setting a minimum bout length of 10 frames (0.33 seconds). Using these settings, we performed a ground-truth test in which we manually annotated every 50th frame (1.67 seconds) for 18,000 frames (10 minutes) and ran our classifier on these novel bouts to predict grooming behavior. Sensitivity for annotating grooming bouts increased to 93%, while specificity for annotating non-grooming bouts increased to 95%, resulting in lower false negative and false positive rates (Figure 2b).

3.2 Automated classifier compared to manual rater: total duration of grooming

After verifying reliable performance of our classifier, we asked an expert observer to manually score 10 OF grooming behavior videos of wild type mice to compute total duration of grooming. We then calculated the correlation between manual scoring and automatic scoring (Figure 3). The correlation was 0.947 (p < 0.0001; 95% CI 0.886 - 0.994; Pearson R square), showing that the automated classifier consistently identified 95% of the total grooming duration over one-hour videos. Thereafter, the same expert rated 10 OF grooming behavior videos of SAPAP3 knockout mice, which were correlated with automated scoring (Figure 4). This resulted in a similarly strong correlation of 0.996 (p < 0.0001; 95% CI 0.993 – 1; Pearson R square), suggesting that our automated classifier reliably predicts total grooming duration over one-hour sessions based on overhead video recordings, irrespective of the animal’s genotype.

the raw scores (values between 0 (non-grooming) and 1 (grooming)) and binary predictions of grooming behavior in that frame (subjected to post-processing smoothing). A custom written MATLAB script transformed this matrix into quantitative grooming measurements for every video.

2.8 Data analysis

All results are expressed as mean ± S.E.M. Data comparing manual annotation and JAABA annotation were analyzed by correlation analysis, and all correlation coefficients are expressed as R square. Data comparing wild type mice and SAPAP3 knockout mice were analyzed by independent-samples t-test. A probability of less than 0.05 was considered statistically significant.
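As an illustration of the correlation analysis described above, the squared Pearson coefficient between manual and automated scores can be computed as in the following sketch. It uses only the Python standard library; the function name and the example values are hypothetical and are not data from this study.

```python
from math import sqrt


def pearson_r_squared(manual, automated):
    """Squared Pearson correlation between two equally long score lists."""
    if len(manual) != len(automated) or len(manual) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_m = sum(manual) / len(manual)
    mean_a = sum(automated) / len(automated)
    # Sums of squares and cross-products around the means.
    s_ma = sum((m - mean_m) * (a - mean_a) for m, a in zip(manual, automated))
    s_mm = sum((m - mean_m) ** 2 for m in manual)
    s_aa = sum((a - mean_a) ** 2 for a in automated)
    if s_mm == 0 or s_aa == 0:
        raise ValueError("correlation is undefined for constant input")
    r = s_ma / sqrt(s_mm * s_aa)
    return r * r


# Hypothetical per-video grooming durations in seconds (illustrative only).
manual_s = [120.0, 310.0, 45.0, 560.0, 230.0]
classifier_s = [118.0, 295.0, 60.0, 540.0, 250.0]
print(f"R square = {pearson_r_squared(manual_s, classifier_s):.3f}")
```

Each pair of values stands for one video scored once by the observer and once by the classifier.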

3. Results

Our goal was to train, assess, and apply an automated classifier to reliably obtain quantitative measurements of grooming behavior, based on overhead video recordings of freely behaving mice. We trained a JAABA classifier to annotate total duration and bouts of grooming in video recordings of mice of different genotypes performing different tasks, and compared it to manual annotation (Figure 1).

3.1 Performance and reliability of automated classifier

To test whether our classifier can predict grooming reliably, we used the cross-validation provided by JAABA. Cross-validation quantitatively measures the automated classifier’s accuracy on frames based on 1/7 folding (Kabra et al., 2013). JAABA withholds a subset of the training data for testing based on the remainder

Figure 4 | Correlation of total grooming duration of mice with different genotype. Analysis of expert observer 1 and automated classifier based on OF SAPAP3 knockout mice videos (n = 10) shows similarly excellent correspondence (R² = 0.996, p < 0.0001) as for wild type animals (see Figure 3).

Figure 3 | Correlation of total grooming duration between manual and automated classifier annotation. Analysis of overhead OF video recordings of wild type animals (n = 10) scored by expert observer 1 and automated classifier shows excellent correspondence. R value computed from Pearson’s correlation (R² = 0.947, p < 0.0001). Red bands represent 95% confidence intervals.

[Figures 3 and 4, scatter plots: manual score versus classifier score, with regression line and 95% prediction band; R² = 0.947 and R² = 0.996.]


between the automated classifier and different manual expert raters on total grooming duration. To further investigate the similarities between manual and automated annotation, the two expert observers and a third expert observer scored a one-hour video recording consecutively, after which we analyzed their annotations using one-minute bin analysis. This provided a measurement of cumulative grooming duration in blocks of one minute. Figure 7 shows the one-minute block analysis for the three expert observers together with the block analysis of our automated classifier. We found a very similar, scalloped pattern of grooming duration annotation among all four annotators. At the end of the video recording (60 minutes), we found a small annotation difference in total duration between the three manual expert observers, exemplifying the risk of inter-rater variability when different manual raters score the same video recording.

3.5 Automated classifier compared to manual raters: grooming bouts

The similarities in the scalloped pattern of grooming behavior over time between manual and automated scoring show that both methods detect a similar amount of grooming duration within one-minute blocks. This suggests that both methods are capable of identifying grooming bouts, likely reflecting actual grooming bouts displayed by the animal. To investigate this hypothesis, we reanalyzed the first expert observer's annotations of the 10 SAPAP3 video recordings in the OF (Figure 4) and correlated grooming bouts between manual and automated scoring (Figure 8). As expected, we found a close relationship between our expert observer and the automated classifier,

3.3 Automated classifier compared to manual rater: different task and video resolution

To test how sensitive our classifier is to changes in pixel density and resolution, the same 10 SAPAP3 knockout mice used in the OF (Figure 4) spent 10 minutes on the EPM while being manually scored on grooming behavior by the same expert observer. Besides a slight change in light intensity and a three-fold decrease in video resolution, the EPM also had a different shape compared to the OF setup (Figure 1). A correlation index of 0.942 (p < 0.0001; 95% CI 0.877 – 0.993; Pearson R square) was found between manual and automatic scoring (Figure 5), showing similar overlap in total grooming duration annotation compared to OF recordings. Thus, our automated classifier can reliably predict total grooming duration independent of the experimental equipment used and across a broad range of video resolutions (approximately 0.7-2.1 pixels per mm).

3.4 Automated classifier compared to other manual raters: total duration of grooming

To demonstrate the potential of our classifier, we asked a second expert observer to manually score 10 OF grooming behavior video recordings of novel SAPAP3 mice according to our definition of grooming behavior (Figure 6). The use of different animals made it impossible for the second human rater to bias scoring in favor of our automated classifier, as outcomes were unknown during manual scoring. Correlation analysis showed a strong positive correlation between manual and automated scoring of 0.964 (p < 0.0001; 95% CI 0.922 – 0.996; Pearson R square). Thus, there was a strong correlation

[Scatter plot: manual score versus classifier score.]

Figure 5 | Correlation of total grooming duration on elevated plus maze (EPM). Analysis of grooming scored by expert observer 1 and automated classifier of SAPAP3 knockout mice (n = 10) performing on the EPM. Despite training the automated classifier on video recordings with a three-fold higher video resolution, excellent correlation is found (R² = 0.942, p < 0.0001).

Figure 6 | Correlation of total grooming duration with different manual observer. Analysis of grooming scored by manual expert observer 2 compared to automated classifier. Expert observer 2 scored videos of SAPAP3 knockout mice (n = 10), which revealed highly consistent results compared to automated classifier (R² = 0.964, p < 0.0001).

[Figures 5 and 6, scatter plots: manual score versus classifier score, with regression line and 95% prediction band; R² = 0.942 and R² = 0.964.]


and reliable method to score grooming behavior in mice based on overhead video recordings with high throughput. We show that our automated classifier is capable of annotating bouts and total duration of mouse grooming as reliably as expert observers, thus paving the way for automated analysis of grooming behavior. A particular advantage of using this method in studies on grooming behavior is that it is open-source and thus freely available and readily adjustable to individual needs. Also, minimal operator time is required, it is a non-invasive method whereby animals can move freely without restrictions, and the results match manual measurements within the limits of inter-observer variability. A limitation of the JAABA classifier is the offline processing of data. Fortunately, it is possible to use batch processing to analyze multiple video recordings consecutively. Also, JAABA needs trajectory input of the animal from other open-source software programs. We tried different programs but found Motr the most successful in reliably tracking our mice. However, as Motr demands considerable processing power, tracking takes approximately 12 times the duration of the video recording. Fortunately, with the use of multiple computers and running Motr overnight, this has little impact on the throughput of the experiment. Motr outputs x-y trajectories of every animal, enabling us to determine locomotion and activity levels of the mice during each experiment (not shown). As our classifier is trained on short video recordings of animals moving freely without any additional attributes, we found it is unable to reliably annotate grooming behavior of mice connected to cables or tubing. This is likely caused by the loss of frames when the cable blocks the camera lens, but might also be caused by

with the correlation index being 0.954 (p < 0.0001; 95% CI 0.901 – 0.995; Pearson R square). This shows that the automated classifier consistently identified 95% of the grooming bouts. We also reanalyzed the annotations of our second expert observer of the 10 SAPAP3 videos (Figure 6). We found a correlation in grooming bouts with a correlation index of 0.629 (p = 0.034; 95% CI 0.326 – 0.949; Pearson R square) (Figure 9). This difference in correlation strength between the two observers likely reflects inter-rater variability (Martin and Bateson, 2007).

3.6 Effect of genotype on grooming behavior

With the automated classifier validated for quantifying total duration and bouts of grooming, we examined whether there is a difference in grooming behavior between wild type and SAPAP3 knockout mice. Based on previous behavioral studies investigating SAPAP3 knockout mice, we expected to find an increase in both bouts and total duration of grooming (Burguière et al., 2013; Welch et al., 2007; Xu et al., 2013). As expected, with our automated classifier, we found a significant increase in total duration (Figure 10a, mean ± S.E.M.: 169.4 ± 31.73 wild type vs. 302.6 ± 49.95 SAPAP3, t(38) = 2.251, p = 0.03) as well as in grooming bouts (Figure 10b, mean ± S.E.M.: 89.65 ± 11.36 wild type vs. 137.9 ± 16.27 SAPAP3, t(38) = 2.432, p = 0.02) in the OF, showing that SAPAP3 knockout mice display more grooming behavior in general.
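The reported t values can be checked against the group means and standard errors given above. The sketch below is an illustrative Python calculation, not the authors' analysis script. It assumes equal group sizes (n = 20 per genotype, inferred here from the 38 degrees of freedom), in which case the pooled-variance t statistic reduces to the mean difference divided by the root of the summed squared standard errors. The function name is hypothetical.

```python
from math import sqrt


def t_from_summary(mean_a, sem_a, mean_b, sem_b):
    """Two-sample t statistic from group means and standard errors.

    Assumes equal group sizes, so that the pooled standard error of the
    difference equals sqrt(sem_a**2 + sem_b**2).
    """
    return (mean_b - mean_a) / sqrt(sem_a ** 2 + sem_b ** 2)


# Values as reported in section 3.6 (wild type first, SAPAP3 knockout second).
t_duration = t_from_summary(169.4, 31.73, 302.6, 49.95)
t_bouts = t_from_summary(89.65, 11.36, 137.9, 16.27)

print(f"t (total duration) = {t_duration:.3f}")
print(f"t (grooming bouts) = {t_bouts:.3f}")
```

Under this assumption the results agree with the reported t(38) = 2.251 and t(38) = 2.432 to within rounding.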

4. Discussion

4.1 Automatic annotation of complex behavior

A major limitation in the analysis of rodent behavior in a laboratory setting has been the acquisition of quantitative, reliable measurements with high throughput. While considerable progress has been made in the direction of automated behavioral analysis (Barry, 2012; Brash et al., 2005; Brodkin et al., 2014; Patel et al., 2014; Samson et al., 2015), complex behavior such as grooming still lacks automation. The aim of this study was to establish the use of the JAABA classifier as a novel, cost-efficient,

Figure 7 | One-minute bin analysis of annotation patterns of three different expert observers and automated classifier. Figure shows cumulative duration of grooming in one-minute bins over a one-hour session of a high-grooming SAPAP3 knockout mouse. The scalloped pattern of cumulative grooming duration is similar between different observers and automated classifier.

Session (min) C um ul at iv e du ra tio n (s ) 0 10 20 30 40 50 60 0 300 600 900 1200 Manual observer 1 Manual observer 2 Manual observer 3 Classifier score
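Curves like those in Figure 7 can be derived from per-frame annotations. The following is a minimal sketch, assuming a binary per-frame grooming label and a hypothetical frame rate `fps` (the recording frame rate is not stated in this section); the function name is illustrative.

```python
from itertools import accumulate

def cumulative_duration_per_minute(labels, fps):
    """Cumulative grooming duration (s) at the end of each one-minute bin.

    labels: sequence of 0/1 per-frame annotations (1 = grooming).
    fps: frames per second of the recording (an assumed value here).
    """
    frames_per_bin = int(fps * 60)
    # Grooming time inside each one-minute bin, in seconds.
    per_bin = [
        sum(labels[start:start + frames_per_bin]) / fps
        for start in range(0, len(labels), frames_per_bin)
    ]
    # Running total across bins gives the cumulative curve.
    return list(accumulate(per_bin))

# Toy 3-minute recording at a hypothetical 2 frames per second:
fps = 2
labels = [1] * 30 + [0] * 90 + [0] * 120 + [1] * 60 + [0] * 60
print(cumulative_duration_per_minute(labels, fps))  # [15.0, 15.0, 45.0]
```

Minutes without grooming leave the cumulative value unchanged, which produces the flat segments of the scalloped pattern.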

Figure 8 | Correlation of grooming bouts between expert observer 1 and automated classifier annotation. Analysis of grooming bouts scored by expert observer 1 compared to the automated classifier. OF video recordings of SAPAP3 knockout mice, same recordings as in Figure 4 (n = 10). A strong correlation between expert observer and automated classifier is found (R² = 0.954, p < 0.0001).

[Figure 8 plot. Axes: Classifier score and Manual score. Annotation: R² = 0.954. Legend: regression line, 95% prediction band.]
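The R² value in a comparison such as Figure 8 is the square of the Pearson correlation between per-video bout counts from the classifier and from the observer. The following is a minimal sketch of that computation; the bout counts are hypothetical placeholders, not the study's data, and the function name is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical bout counts for 10 videos (NOT the study's data).
classifier_bouts = [52, 88, 120, 143, 175, 201, 240, 268, 310, 355]
manual_bouts = [60, 80, 131, 150, 160, 215, 229, 280, 300, 370]

r = pearson_r(classifier_bouts, manual_bouts)
print(f"r = {r:.3f}, R^2 = {r ** 2:.3f}")
```

R² describes the share of variance in one set of counts that is explained by the other; it does not by itself show that the absolute counts agree.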


[...] a constant change in appearance of the animal as the cable is attached to its body. Training a novel classifier on recordings with cables might therefore solve this problem. Finally, as JAABA is open-source, the software relies on the expertise and generosity of its developers. Technical support is limited, but there is an active community that answers questions and offers support, and detected program bugs can be reported (jaaba@googlegroups.com).

4.2 Prevention of inter-rater variability using automated classifier

With our automated classifier, we found highly consistent results in total grooming duration compared to manual annotation (Figure 3). Interestingly, when comparing different expert observers with the automated classifier, we found strong and similar correlation coefficients (Figures 4 and 6). This indicates that total duration of grooming is largely independent of which manual observer annotates the recording. We performed a multi-rater bout analysis of a video recording of a high-grooming animal to visualize the annotation patterns of three different raters compared to the automated classifier (Figure 7). An almost identical scalloped pattern across one-minute bins emerged, which indicates that the expert observers as well as the automated classifier annotated closely related video frames as grooming. This suggests the use of highly comparable definitions of grooming to annotate video recordings. Taken together, total duration of grooming over a one-hour video recording is stable with regard to inter-rater variability. However, we found that grooming bout annotation is more prone to inter-rater variability, which could potentially affect manual annotation if different raters score the same animal during an experiment (e.g., different observers before and after treatment). Our classifier correlated more strongly with the first expert observer (R² = 0.954) than with the second expert observer (R² = 0.629), although the latter correlation is still considerable and significant (p = 0.034). It is not surprising to find a degree of inter-rater variability in grooming bout annotation, as it is hard to determine the start and end points of a grooming bout when annotating grooming behavior in real-time, which results in false-positive and false-negative bout annotations. Expert observers

Figure 9 | Correlation of grooming bouts between expert observer 2 and automated classifier. Grooming bout analysis between expert observer 2 and automated classifier reveals a moderate to strong correlation in annotation (R² = 0.629, p = 0.034). OF video recordings of SAPAP3 knockout mice, same recordings as in Figure 6 (n = 10).

[Figure 9 plot. Axes: Classifier score and Manual score. Annotation: R² = 0.629. Legend: regression line, 95% prediction band.]

Figure 10 | Effects of genotype on grooming behavior. Bar graphs show a significant difference in total grooming duration (a) over one-hour sessions in the OF setup between wild type (light gray with dots, n = 20, mean ± S.E.M.: 169.4 ± 31.73) and SAPAP3 knockout mice (dark gray with stripes, n = 20, mean ± S.E.M.: 302.6 ± 49.95) (conditions; t(38) = 2.251, p = 0.03). A significant effect was also found in grooming bouts (b), showing that SAPAP3 knockout mice (dark gray with stripes, n = 20, mean ± S.E.M.: 137.9 ± 16.27) initiate more grooming bouts than wild type animals (light gray with dots, n = 20, mean ± S.E.M.: 89.65 ± 11.36) (conditions; t(38) = 2.432, p = 0.02). Asterisk represents significance level: p < 0.05.

analyzed video recordings in real-time; they therefore have to average their predictions over multiple frames as the animal continues to behave while being recorded. In a post hoc analysis of the differences between our expert observers, we found that our second expert observer tended to score longer grooming bouts and did not necessarily stop scoring a grooming bout when the animal briefly interrupted grooming (e.g., when standing up or sniffing around). This analysis indicated that interruptions during a grooming bout should be considered the end of that bout, as the animal clearly interrupts its behavior, after which it starts a new grooming bout that again complies with our definition of grooming. Our first expert observer annotated bouts more consistently and rigorously, likely resulting in a closer prediction of actual grooming behavior. The automated classifier can predict grooming behavior on every individual frame and therefore lacks the ambiguity that arises from annotating in real-time. Taken together, we think that our automated classifier is capable of consistently detecting small changes in behavior, resulting in accurate predictions of grooming bouts, and eliminates inter-rater variability.

4.3 Sequential chain pattern detection

The automated classifier was trained to output the properties of grooming most frequently used in behavioral neuroscience (i.e., bouts and total duration) and is therefore unable to detect sequential chain patterns (Kalueff et al., 2007; Richmond and Sachs, 1980). Although this syntactic chain pattern only accounts for approximately 10-15% of all observed grooming behavior (Kalueff et al., 2007), it is known that this sequence is generated by brain structures in the basal ganglia (Cromwell and Berridge, 1996). Grooming bouts and duration can be unaffected while the sequence of this syntactic chain pattern is bidirectionally affected by experimental manipulations, including lesions, disruption of neuromodulator release, genetic manipulations, and environmental factors (Kalueff et al., 2007). Annotation of sensorimotor sequences, such as sequential chain patterns of grooming, is an additional and valuable tool in behavioral neuroscience for complex cognitive testing. It remains unknown whether a JAABA classifier is capable of recognizing all different phases of sequential chain patterns of grooming behavior and of detecting disturbances in the temporal execution of these patterns. The ability to detect sequential chain patterns would increase the quantitative output of grooming experiments, which could provide novel insights into the sequential control of complex behaviors and the involvement of different brain areas in the execution and maintenance of these fixed chain patterns.

4.4 Grooming behavior of the SAPAP3 knockout mouse

In this study, we confirm that there is a genetic basis underlying grooming behavior. By examining grooming behavior in wild type and SAPAP3 knockout animals, we verified that the post-synaptic density protein SAPAP3 is involved in grooming behavior. As grooming is an innate, stereotyped, and frequently observed behavior in rodents, it is potentially useful for behavioral neuroscience research (Ferkin and Leonard, 2005; Yu et al., 2010). The expression of grooming behavior depends on activity in different brain areas (Cromwell and Berridge, 1996), local release of multiple neurotransmitters (Audet et al., 2006; Taylor et al., 2010), environmental factors such as stress (Kalueff et al., 2007), and genetic variance (Welch et al., 2007). SAPAP3 knockout mice show an increase in both total duration and number of grooming bouts compared to wild type animals. Therefore, these mice can most likely be used as a model for deficits in complex and repetitive patterned behavior. A recent study showed that the lateral orbitofrontal cortex and its terminals in the striatum are involved in the excessive grooming behavior of SAPAP3 knockout mice (Burguière et al., 2013). This network plays a prominent role in psychiatric disorders in humans, resulting in repetitive and persistent behaviors (for a review, see Wood and Ahmari, 2015). Taken together, grooming behavior in SAPAP3 knockout mice potentially plays a role in translational neuroscience research for psychiatric disorders and, because of its highly repetitive and compulsive pattern, could be used in OCD, anxiety, and autism research (for a review, see Figee et al., 2015).

4.5 Future studies using automated classifiers

Our automated classifier paves the way for quantitative, high-throughput studies of grooming behavior. Grooming behavior of animals can be annotated multiple times during an experiment to study the effect of different drugs or environmental attributes on grooming behavior without inter-rater variability. Animals with different genotypes can be easily phenotyped and compared against wild type or other animal models. Although cables during video recordings bias the annotation of our automated classifier, it is possible to stimulate particular areas of the brain before video recording to explore long-term effects. Virally targeted manipulations of specific neuronal populations provide temporal control, which could provide novel insights into how specific brain areas, neuronal networks, and/or neuromodulator pathways are involved in grooming behavior and possibly in psychiatric disorders. This would increase our understanding of the brain and accelerate the discovery of cellular and molecular effects on behavior.
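Section 4.2 argues that frame-by-frame prediction avoids the ambiguity of real-time scoring and that any interruption ends a grooming bout. The following is a minimal sketch of how bouts and total duration could be derived from per-frame labels under that definition. The binary label format, the frame rate `fps`, and the function name are assumptions for illustration and do not describe JAABA's own output format.

```python
def bouts_from_frames(labels, fps):
    """Return (bouts, total_duration_s) from 0/1 per-frame grooming labels.

    A bout is a maximal run of consecutive grooming frames, so any
    interruption, however short, ends the current bout.
    Each bout is reported as (start_frame, end_frame_exclusive).
    """
    bouts = []
    start = None
    for i, label in enumerate(labels):
        if label and start is None:
            start = i                      # grooming begins
        elif not label and start is not None:
            bouts.append((start, i))       # grooming interrupted
            start = None
    if start is not None:                  # recording ends mid-bout
        bouts.append((start, len(labels)))
    total_duration_s = sum(end - begin for begin, end in bouts) / fps
    return bouts, total_duration_s

# Toy example at a hypothetical 10 frames per second:
labels = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
bouts, total = bouts_from_frames(labels, fps=10)
print(len(bouts), bouts, total)
```

In this toy example a single non-grooming frame splits the last two runs into separate bouts, which mirrors the stricter scoring of the first expert observer. A scorer who bridges short pauses would report fewer and longer bouts from the same frames.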

References

1. Aldridge, J.W., Berridge, K.C., Rosen, A.R., 2004. Basal ganglia neural mechanisms of natural movement sequences. Can. J. Physiol. Pharmacol. 82, 732–9. doi:10.1139/y04-061


2. Audet, M.-C., Goulet, S., Doré, F.Y., 2006. Repeated subchronic exposure to phencyclidine elicits excessive atypical grooming in rats. Behav. Brain Res. 167, 103–10. doi:10.1016/j.bbr.2005.08.026

3. Barry, M.J., 2012. Application of a novel open-source program for measuring the effects of toxicants on the swimming behavior of large groups of unmarked fish. Chemosphere 86, 938–944. doi:10.1016/j.chemosphere.2011.11.011

4. Berridge, K.C., Fentress, J.C., Parr, H., 1987. Natural syntax rules control action sequence of rats. Behav. Brain Res. 23, 59–68.

5. Bouwknecht, J.A., Paylor, R., 2002. Behavioral and physiological mouse assays for anxiety: a survey in nine mouse strains. Behav. Brain Res. 136, 489–501.

6. Brash, H.M., McQueen, D.S., Christie, D., Bell, J.K., Bond, S.M., Rees, J.L., 2005. A repetitive movement detector used for automatic monitoring and quantification of scratching in mice. J. Neurosci. Methods 142, 107–14. doi:10.1016/j.jneumeth.2004.08.001

7. Brodkin, J., Frank, D., Grippo, R., Hausfater, M., Gulinello, M., Achterholt, N., Gutzen, C., 2014. Validation and implementation of a novel high-throughput behavioral phenotyping instrument for mice. J. Neurosci. Methods 224, 48–57. doi:10.1016/j.jneumeth.2013.12.010

8. Burguière, E., Monteiro, P., Feng, G., Graybiel, A.M., 2013. Optogenetic stimulation of lateral orbitofronto-striatal pathway suppresses compulsive behaviors. Science 340, 1243–6. doi:10.1126/science.1232380

9. Cromwell, H.C., Berridge, K.C., 1996. Implementation of action sequences by a neostriatal site: a lesion mapping study of grooming syntax. J. Neurosci. 16, 3444–58.

10. Decker, M.W., 2006. Cognition Models and Drug Discovery, Animal Models of Cognitive Impairment. CRC Press/Taylor & Francis.

11. Ferkin, M. H., Leonard, S.T., 2005. Self-grooming by rodents in social and sexual contexts. Acta Zool Sin. 772–9.

12. Figee, M., Pattij, T., Willuhn, I., Luigjes, J., van den Brink, W., Goudriaan, A., Potenza, M.N., Robbins, T.W., Denys, D., 2015. Compulsivity in obsessive-compulsive disorder and addictions. Eur. Neuropsychopharmacol. doi:10.1016/j.euroneuro.2015.12.003

13. Gerlai, R., Pisacane, P., Erickson, S., 2000. Heregulin, but not ErbB2 or ErbB3, heterozygous mutant mice exhibit hyperactivity in multiple behavioral tasks. Behav. Brain Res. 109, 219–27.

14. Hainsworth, F.R., 1967. Saliva spreading, activity, and body temperature regulation in the rat. Am. J. Physiol. 1288–1292.

15. Kabra, M., Robie, A.A., Rivera-Alba, M., Branson, S., Branson, K., 2013. JAABA: interactive machine learning for automatic annotation of animal behavior. Nat. Methods 10, 64–7. doi:10.1038/nmeth.2281

16. Kalueff, A. V, Aldridge, J.W., LaPorte, J.L., Murphy, D.L., Tuohimaa, P., 2007. Analyzing grooming microstructure in neurobehavioral experiments. Nat. Protoc. 2, 2538–44. doi:10.1038/nprot.2007.367

17. Kalueff, A. V, Laporte, J.L., Bergner, C.L., 2010. Neurobiology of Grooming Behavior, 1st ed. Cambridge University Press.

18. Kalueff, A. V, Stewart, A.M., Song, C., Berridge, K.C., Graybiel, A.M., Fentress, J.C., 2016. Neurobiology of rodent self-grooming and its value for translational neuroscience. Nat. Rev. Neurosci. 17, 45–59. doi:10.1038/nrn.2015.8

19. Kalueff, A. V, Tuohimaa, P., 2005. The grooming analysis algorithm discriminates between different levels of anxiety in rats: potential utility for neurobehavioural stress research. J. Neurosci. Methods 143, 169–77. doi:10.1016/j.jneumeth.2004.10.001

20. Kalueff, A. V, Tuohimaa, P., 2004a. Grooming analysis algorithm for neurobehavioural stress research. Brain Res. Brain Res. Protoc. 13, 151–8. doi:10.1016/j.brainresprot.2004.04.002

21. Kalueff, A. V, Tuohimaa, P., 2004b. Contrasting grooming phenotypes in C57Bl/6 and 129S1/SvImJ mice. Brain Res. 1028, 75–82. doi:10.1016/j.brainres.2004.09.001

22. Levin, E.D., Buccafusco, J.J., 2006. Animal Models of Cognitive Impairment, Animal Models of Cognitive Impairment. CRC Press/Taylor & Francis.

23. Lopes, G., Bonacchi, N., Frazão, J., Neto, J.P., Atallah, B. V., Soares, S., Moreira, L., Matias, S., Itskov, P.M., Correia, P.A., Medina, R.E., Calcaterra, L., Dreosti, E., Paton, J.J., Kampff, A.R., 2015. Bonsai: an event-based framework for processing and controlling data streams. Front. Neuroinform. 9, 7. doi:10.3389/fninf.2015.00007

24. Martin, P., Bateson, P., 2007. Measuring Behaviour: An Introductory Guide, 3rd ed. Cambridge University Press, Cambridge.

25. Ohayon, S., Avni, O., Taylor, A.L., Perona, P., Roian Egnor, S.E., 2013. Automated multi-day tracking of marked mice for the analysis of social behaviour. J. Neurosci. Methods 219, 10–9. doi:10.1016/j.jneumeth.2013.05.013

26. Patel, T.P., Gullotti, D.M., Hernandez, P., O’Brien, W.T., Capehart, B.P., Morrison, B., Bass, C., Eberwine, J.E., Abel, T., Meaney, D.F., 2014. An open-source toolbox for automated phenotyping of mice in behavioral tasks. Front. Behav. Neurosci. 8, 349. doi:10.3389/fnbeh.2014.00349

27. Richmond, G., Sachs, B.D., 1980. Grooming in Norway Rats: The Development and Adult Expression of a Complex Motor Pattern. Behaviour 75, 82–96.

28. Samson, A.L., Ju, L., Ah Kim, H., Zhang, S.R., Lee, J.A.A., Sturgeon, S.A., Sobey, C.G., Jackson, S.P., Schoenwaelder, S.M., 2015. MouseMove: an open source program for semi-automated analysis of movement and cognitive testing in rodents. Sci. Rep. 5, 16171. doi:10.1038/srep16171

29. Spruijt, B.M., van Hooff, J.A., Gispen, W.H., 1992. Ethology and neurobiology of grooming behavior. Physiol. Rev. 72, 825–52.

30. Taylor, J.L., Rajbhandari, A.K., Berridge, K.C., Aldridge, J.W., 2010. Dopamine receptor modulation of repetitive grooming actions in the rat: potential relevance for Tourette syndrome. Brain Res. 1322, 92–101. doi:10.1016/j.brainres.2010.01.052

31. Tecott, L.H., Nestler, E.J., 2004. Neurobehavioral assessment in the information age. Nat. Neurosci. 7, 462–466. doi:10.1038/nn1225

32. Welch, J.M., Lu, J., Rodriguiz, R.M., Trotta, N.C., Peca, J., Ding, J.-D., Feliciano, C., Chen, M., Adams, J.P., Luo, J., Dudek, S.M., Weinberg, R.J., Calakos, N., Wetsel, W.C., Feng, G., 2007. Cortico-striatal synaptic defects and OCD-like behaviours in Sapap3-mutant mice. Nature 448, 894–900. doi:10.1038/nature06104

33. Welch, J.M., Wang, D., Feng, G., 2004. Differential mRNA expression and protein localization of the SAP90/PSD-95-associated proteins (SAPAPs) in the nervous system of the mouse. J. Comp. Neurol. 472, 24–39. doi:10.1002/cne.20060

34. Wood, J., Ahmari, S.E., 2015. A Framework for Understanding the Emerging Role of Corticolimbic-Ventral Striatal Networks in OCD-Associated Repetitive Behaviors. Front. Syst. Neurosci. 9, 171. doi:10.3389/fnsys.2015.00171

35. Xu, P., Grueter, B.A., Britt, J.K., McDaniel, L., Huntington, P.J., Hodge, R., Tran, S., Mason, B.L., Lee, C., Vong, L., Lowell, B.B., Malenka, R.C., Lutter, M., Pieper, A.A., 2013. Double deletion of melanocortin 4 receptors and SAPAP3 corrects compulsive behavior and obesity in mice. Proc. Natl. Acad. Sci. U. S. A. 110, 10759–64. doi:10.1073/pnas.1308195110

36. Yu, H., Yue, P., Sun, P., Zhao, X., 2010. Self-grooming induced by sexual chemical signals in male root voles (Microtus oeconomus Pallas). Behav. Processes 83, 292–8. doi:10.1016/j.beproc.2010.01.012
