
Recognition of facial composites : a comparison between PRO-fit and forensic sketches in relation to target viewing time


Academic year: 2021


Recognition of Facial Composites: a Comparison Between PRO-fit and Forensic Sketches in Relation to Target Viewing Time

Master’s thesis, Brain & Cognition, University of Amsterdam

M.A.L.A. Leijtens

Student number: 9788182

Supervisor: prof. dr. J. G. W. Raaijmakers

Word count: 6897


Abstract

Facial composites, graphical representations of an eyewitness’s memory of a face, are an important means of identifying suspects. The purpose of the current research was to compare two methods, PRO-fit (software) and artist sketches, in order to understand which method is more appropriate under different viewing circumstances (5 vs 30 seconds). We used videos of Dutch targets who were unknown to the UK witnesses but very familiar to our Dutch participants. We used a forensically realistic delay of 24 hours. In accordance with our expectations, we found general recognition to be very low (3.6%). A second naming round, used as an auxiliary measure of composite quality, showed an interaction effect between type of composite and duration of target presentation. As expected, recognition of sketches benefited from longer presentation times. However, contrary to our hypothesis, sketches did not perform better than PRO-fit composites at the longer presentation time. We conclude that composites might not be very effective in identifying a suspect, but if witnesses have seen a suspect for a very short amount of time, PRO-fit composites seem to be the better option.


Recognition of Facial Composites

Witnesses of a crime face the difficult task of describing a perpetrator’s appearance to the police. Often the witness has seen the perpetrator only for a short amount of time and, when the witness is also a victim, under a fair amount of stress. Recognition of the resulting composite, a graphical representation of an eyewitness’s memory of a face, is an important means of identifying perpetrators. These composites are either computer-produced or drawn by a forensic sketch artist.

PRO-fit, evaluated in the current study, is a typical feature-based software program in general police use in the UK (Frowd et al., 2005b). PRO-fit contains a large database of photographed facial features, which can be selected, resized, and positioned freely in the face by a trained operator, usually a police officer, until a composite face is produced that best matches the witness’s memory. Witnesses are interviewed while the operator selects features. After a general face is created based on the verbal description, the witness and the operator fine-tune the face until an optimal likeness is reached.

Sketching by forensic artists, on the other hand, is probably the oldest means of creating a composite based on a witness’s description. These artists are highly skilled in portraiture and draw faces with pencils and crayons based on verbal descriptions. Since composite quality depends on the artist’s skill, the quality of the portraits differs somewhat between artists (Laughery & Fowler, 1980). Although software such as PRO-fit is now the most used means of composite production, sketches remain an important tool for police work. For example, in a US study by McQuiston-Surrett, Topp and Malpass (2006), 43% of police officers reported relying on sketch or forensic artists to construct composites.


Facial composites have been the subject of many studies. In general the conclusion is that modern systems are able to produce pictures with a high degree of likeness when operators work directly from photographs (Brace, Pike, Allen, & Kemp, 2006; Davies, Van der Willik, & Morrison, 2000). However, although this tells us something about the qualities of the program or artist itself, it is not very representative of a normal situation in which witnesses have to describe a person’s face from memory. When relying on a description based on the memory of a witness, composite quality drops and naming of these composites is only about 20% correct, and even less for sketches (Brace, Pike, & Kemp, 2000; Frowd, McQuiston-Surrett, Anandaciva, Ireland, & Hancock, 2007). This, however, is after a delay of a couple of hours, while a more realistic delay for normal witnesses would be at least 24 hours and up to a couple of days. Under these circumstances performance is even worse, as quality of the image decreases and correct naming of composites drops to only a few percent at best (Frowd et al., 2005a, 2005b; Frowd et al., 2007).

One of the main reasons for poor recognition of facial composites is that verbally describing unfamiliar faces and recognising familiar faces depend on different cognitive and perceptual processes. For one, people seem to have some difficulties with the initial encoding of an unfamiliar face (Megreya & Burton, 2008). What is stored in memory is selective by nature and there is a tendency to code material in terms of categories instead of specific instances (Davies, Shepherd, & Ellis, 1978). Second, verbal description is based on recall of information and people are generally not very good at describing individual features and selecting the appropriate facial parts (Frowd, Bruce, Smith, & Hancock, 2008). Therefore, witnesses are generally unable to accurately construct the internal features of a face after long delays. These internal features, however, are important for recognition by another person later (Andrews, Davies-Thompson, Kingstone, & Young, 2010; Frowd, Bruce, McIntyre, & Hancock, 2007).

Face recognition, on the other hand, is a more holistic process emerging from parallel processing of all the individual features, and people are in general very good at recognising familiar faces (Bruce, 1982). Also, familiar and unfamiliar faces are represented in qualitatively different ways and their recognition is processed differently (Johnston & Edmonds, 2009). People are also in general not good at recognising unfamiliar faces (Hancock, Bruce, & Burton, 2000), which might impair the recognition of an unfamiliar-looking composite, but may also further complicate the process of creating a good composite. While unfamiliar face recognition generally means matching two unfamiliar faces by photograph or at a line-up, the same difficulty might extend to matching a mental image of an unfamiliar face with a composite of the same face. If a witness has trouble deciding whether the composite looks good enough (is it the same as the memory?), or which features need more work, then this will impair the composite’s quality. Since composite recognition depends on the entire chain of cognitive processes, from remembering and describing to composing and identifying, recognition of the composite will be harmed if one or more of these steps are compromised.

Although recognition of both sketch and software composites is generally low, a main difference between them is that sketches are more dependent on the witness’s verbal description; there is no reference face to compare the memory to, as there is in feature-based systems like PRO-fit. While this leaves more freedom to both the artist and the witness, it might be more difficult to accurately describe what a certain facial feature looks like than to know that a feature is not right. For many people it might be easier to narrow down than to start from scratch. There is, however, some evidence that sketch composites perform better after a longer delay. In one study by Frowd et al. (2005a), participant-witnesses made composites that were correctly named at only 3.2% overall, with sketches emerging as the best method, at 8.1%. One of the reasons for this finding could be that sketch artists are better at drawing internal features of faces than software; again, these internal features are crucial for facial recognition (Frowd et al., 2007). This, however, leads to the prediction that sketches would always perform better, which is not supported by earlier research. Another reason could be the procedure of sketching itself: a difference in the way a sketch artist interviews the witness and produces the composite. There is some evidence that the type of interview contributes to the quality of the composite (Frowd et al., 2008). Frowd (Wilkinson & Rynn, 2012) argues that ‘while sketching is still based on the selection of individual features, the initial focus is on configural information - the placement of features on the face - and then on increasing the detail in groups of features. This procedure would appear to encourage more natural, holistic face perception’. This more holistic process might be more beneficial when there is a longer delay between target viewing and composite making, since there is a weaker memory of individual features.

The purpose of the current experiment was to understand whether, and under which viewing circumstances, sketch composites produce better results than software composites such as those made with PRO-fit. In most previous research famous faces were chosen as targets to enable composite naming, with participant-witnesses selecting a famous face unknown to them as their target (e.g., Frowd et al., 2005b). Bias might influence this choice (picking a face that is easier to remember), and the possibility that the target face is still familiar to the witness cannot be ruled out. In the current research, however, we used Dutch targets who were unknown to the UK witnesses but very familiar to our participants.


Since recognition of composites is generally found to be low, especially after some delay, our first hypothesis was that general composite recognition would be lower than 10%. Our second hypothesis was that composites based on participant-witnesses’ verbal descriptions of targets that were presented longer (30 seconds vs 5 seconds) would show higher recognition rates, regardless of type of composite. Our third hypothesis was that an interaction between type of composite and duration of target presentation would be found: based on Frowd et al. (2005a) and the suggestion that sketch composite production is more holistic in nature, it was expected that sketch composites would benefit the most from longer durations, with higher rates of recognition as a result.

Method

Pilot

Frowd et al. (2005b) found that target photographs were correctly named between 54% and 100% of the time, owing to differences in target familiarity. One proposed solution is to use a conditional naming measure, based on the number of identified targets (as was done in the current study). Another, or additional, solution is to use targets with generally high familiarity. Therefore, in order to make sure the targets used were famous enough to be close to 100% identifiable, a pilot study of target naming was conducted. Fifteen famous Dutch male targets (unknown in the UK) were selected. A total of 24 first year psychology students of the University of Amsterdam volunteered to participate. Photographs of every target were presented sequentially for 30 seconds each with a projector in a small classroom. Participants could identify each target either by name or by descriptive, biographical information written down on paper. Ten of these 15 targets, each recognised at least 85% of the time, were used for composite production. For target names see Table 1; for photographs of these targets, see Appendix A.

Composite Construction

Materials

Target videos

Videos of the targets in which their faces are in full view, under good lighting conditions (e.g. television interviews), were downloaded from the video content website http://www.youtube.com through the website http://keepvid.com. The video editing program iMovie was used to edit clips into the desired format and length. Each final video started and ended with 5 seconds of blank screen so that the operator could start the video and leave the room while remaining blind to the target face. The 30 second target videos consisted of several edited clips from the same movie in order to show only the target face and not, for example, the interviewer. These transitions were smoothed out and hardly noticeable. The five second target video was taken from the 30 second video for consistency between conditions. Once all videos were completed they were digitally transferred to the UK experimenters for stage 1 of the experiment. As a final check, one researcher (not the operator) confirmed that these targets were unknown to a UK audience.

Table 1. Names and occupations of targets

Target   Celebrity                   Occupation
1        Matthijs van Nieuwkerk      TV presenter
2        Marco Borsato               Singer
3        Paul de Leeuw               TV presenter/Singer
4        Jeroen van Koningsbrugge    Actor
5        Frans Bauer                 Singer
6        Dennis Storm                TV presenter
7        Jan Smit                    Singer
8        Gordon                      Singer/TV

Participants

Forty participants (31 female and 9 male) were recruited from the University of Winchester (1st and 2nd year Psychology undergraduates), via a police volunteering group at Thames Valley, and from the local community. Mean age was 36 years.

Procedure

Composites were produced at the University of Winchester (UK). The basic procedure was the same for each condition (PRO-fit and Sketch). The operator was unfamiliar with the target movies and did not know whether participants were shown the 5 or 30 second movie. Since the audio of the video was in Dutch and could be considered distracting, no sound was used. Participants were randomly assigned to each condition. Each participant was told that the movie would contain an unfamiliar face and was instructed to view this target face thoroughly and remember it. After instruction, the viewing took place in front of a computer screen. After a 24 hour delay the participant returned and, depending on the condition, the operator either used PRO-fit to create a composite or made a sketch. For both sketches and PRO-fit composites, the operator used a face-recall cognitive interview, a specific type of interview designed to have witnesses describe a remembered face. All composites were produced by one operator/sketch artist, to keep variability between both methods at a minimum. The construction session was open-ended and lasted for about an hour. Once all composites were produced they were digitally transferred to us for stage 2 of the experiment. One of the composites (target 8, Gordon, 5secSketch) looked very different from both the other composites and the target photograph. Since the witness was participating in another face recognition study, the description did not match the actual target. This composite was therefore not used in this study. The composite was replaced with the 30sec Sketch composite of the same target. In the analysis this was entered as a missing value. See Appendix B for all composites, including the one not used.

Composite Recognition

Participants

Participants were 91 (31 male, 60 female) Dutch individuals ranging from 18 to 66 years of age (M = 24.65, SD = 8.01; see Figure 1 for age distribution). The number of participants was based on the recommendation by Simmons, Nelson and Simonsohn (2011) that when an a priori sample size cannot be computed in absence of earlier known effect sizes a minimum of 20 participants per cell is needed. Participants were recruited through UvA LAB, an online research application service. All participants were paid €5. First year psychology students could choose between earning 0.5 research credit or €5.

Research Design

The study was a 2 (type of composite) × 2 (time of observation) between-subjects factorial design with two independent variables. Type of composite had two levels, PRO-fit or sketch, and time of observation also had two levels, five or 30 seconds of seeing the target face. The dependent measure was the mean rate of identification of the composite. Photographs of the targets were also identified by the participants in order to be able to calculate a conditional naming rate.

Procedure

Participants were randomly assigned to one of four conditions. Participants were informed that the images they were to be shown were facial composites, such as they might have seen on the Dutch television program “Opsporing Verzocht” (analogous to “Crimewatch” in the UK). It was further explained that the composites depicted famous Dutch males and that these composites were likenesses, based on descriptions by a witness unfamiliar with the target. Each participant was shown composites from one condition (either sketch or PRO-fit), since it is reported that mixing composites from different systems could interfere with the recognition process (Frowd et al., 2005b). Testing was done in one session consisting of three stages or rounds, and was done individually and self-paced. Total duration of the experiment was between 15 and 30 minutes. In the first round, 10 composites were presented sequentially in randomised order on a computer screen. Participants were asked if they could identify this person. In cases where a composite could not be named but the participant could provide unequivocal biographical information (for example, the name of the television show the target is specifically known from), responses were accepted as correct. After each answer, participants rated how certain they were of the given answer on a scale from 1 to 7 (1 meaning not certain at all, 7 meaning very certain). In the second round participants were asked to also identify the target photographs. This way we could control for unfamiliar targets and compute a conditional naming rate. In the third round, participants were again presented the same composites as in round 1, in the same order. Again participants were asked if they could identify this person and to give a certainty rating on a scale from 1 to 7. They were then thanked for their participation and debriefed.

Results

Descriptives

Results of all 91 participants were included in the analysis. The first hypothesis was that overall mean recognition would be lower than 10%. Overall mean recognition rate for the first round of recognition (REC1) was 3.6%, which is very low (M = 0.36; SD = 0.59; see Table 2 for all three naming rounds), confirming the hypothesis. Out of 10 composites, 63 participants identified no composites at all, 23 participants identified 1 composite, and only five participants identified 2 composites (see Table 3 for frequencies of all three naming rounds). Overall mean recognition rate of targets in round 2 (REC2) was 87.7% (M = 8.77, SD = 2.02). Overall mean recognition rate of composites in round 3 (REC3) was 39.1% (M = 3.91; SD = 2.14). No difference in recognition was found between men and women in any of the three rounds. No correlation was found between age of the participants and recognition rates in any of the three rounds.
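The means, standard deviations and recognition rates above can be cross-checked against the frequency counts reported in Table 3. The following is a minimal sketch in Python, assuming the counts in Table 3 are complete; the variable and function names are illustrative and not part of the original analysis.

```python
from statistics import mean, stdev

# Frequency counts from Table 3: number of items named -> number of participants.
REC1 = {0: 63, 1: 23, 2: 5}
REC2 = {0: 1, 1: 1, 3: 2, 4: 1, 5: 3, 6: 1, 7: 3, 8: 13, 9: 19, 10: 47}
REC3 = {0: 5, 1: 11, 2: 9, 3: 8, 4: 22, 5: 17, 6: 9, 7: 6, 8: 2, 9: 2}


def summarise(freq, n_items=10):
    """Return (mean, sample SD, recognition rate in %) for a frequency table."""
    # Expand the table into one score per participant.
    scores = [score for score, count in freq.items() for _ in range(count)]
    m = mean(scores)
    return m, stdev(scores), 100 * m / n_items


for label, freq in (("REC1", REC1), ("REC2", REC2), ("REC3", REC3)):
    m, sd, pct = summarise(freq)
    print(f"{label}: M = {m:.2f}, SD = {sd:.2f}, rate = {pct:.1f}%")
```

With these counts the sketch gives M = 0.36 and SD = 0.59 for REC1, in line with the values reported in the text.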


Table 3. Recognition per target per round. Left column shows how many targets were identified.

Frequencies for REC1

Targets   Frequency   Percent   Valid Percent   Cumulative Percent
0         63          69.2      69.2            69.2
1         23          25.3      25.3            94.5
2         5           5.5       5.5             100.0
Total     91          100.0     100.0

Frequencies for REC2

Targets   Frequency   Percent   Valid Percent   Cumulative Percent
0         1           1.1       1.1             1.1
1         1           1.1       1.1             2.2
3         2           2.2       2.2             4.4
4         1           1.1       1.1             5.5
5         3           3.3       3.3             8.8
6         1           1.1       1.1             9.9
7         3           3.3       3.3             13.2
8         13          14.3      14.3            27.5
9         19          20.9      20.9            48.4
10        47          51.6      51.6            100.0
Total     91          100.0     100.0

Frequencies for REC3

Targets   Frequency   Percent   Valid Percent   Cumulative Percent
0         5           5.5       5.5             5.5
1         11          12.1      12.1            17.6
2         9           9.9       9.9             27.5
3         8           8.8       8.8             36.3
4         22          24.2      24.2            60.4
5         17          18.7      18.7            79.1
6         9           9.9       9.9             89.0
7         6           6.6       6.6             95.6
8         2           2.2       2.2             97.8
9         2           2.2       2.2             100.0

Main Analyses

Our second hypothesis was that composites from the 30 seconds conditions would be recognised more often than composites from the 5 seconds conditions, regardless of type of composite. However, general naming rates of 30 second and 5 second composites were both low (3.5% for 5 seconds and 3.8% for 30 seconds) with no significant difference between them (M = .35, SD = .559 for 5 seconds, and M = .38, SD = .628 for 30 seconds). Our third hypothesis was that there would be an interaction between type of composite and duration of target presentation; sketches would benefit the most from the 30 second viewing. Again, general naming rates for both types of composite, regardless of time, were very low: 4% for PRO-fit and 3.3% for sketches (M = .40, SD = .665 for PRO-fit and M = .33, SD = .516 for sketches). Since all three variables (type of composite, time and recognition) are binary, the association between type of composite, time and recognition was analysed by loglinear analysis, an extension of the Chi-Square method. Although Chi-Square is useful for determining relationships between categorical variables, it does not provide information about the strength and direction of the relationship. Specifically, because recognition is an outcome variable, a logit loglinear procedure was chosen for analysis. The three-way loglinear analysis produced a complete model with a goodness of fit of χ2(0) = 0. There was no significant interaction between time and type of composite, χ2(1) = 0.991. Both main effects, for type of composite, χ2(1) = 0.188, and for time, χ2(1) = 0.123, were also not significant.

The third naming round (REC3) was used as an auxiliary measure for composite quality. Mean naming rate was much higher than in the first round (see Table 4 and Figure 3). Mean recognition of PRO-fit was 44.9% (M = 4.49, SD = .287), while it was lower for sketches, 32.6% (M = 3.26, SD = .305). For analysis of the hypotheses a 2 x 2 factorial ANOVA was used. The effect of time on identification was of borderline significance, F(1, 87) = 5.54, p = .059, and there was a significant main effect of type of composite on identification, F(1, 87) = 8.71, p = .004. There was a significant interaction effect between the amount of time the witnesses had seen the targets and type of composite on identification, F(1, 87) = 21.62, p = .021. This indicates that the two types of composite were affected differently by time. Specifically, identification of PRO-fit composites was similar in both the 5 second condition and the 30 second condition, 45.9% and 44% respectively (M = 4.59, SD = 0.37 and M = 4.40, SD = 0.44). Identification of Sketch composites was significantly lower in the 5 seconds condition than in the 30 seconds condition, 23.6% and 41.5% respectively (M = 2.36, SD = 0.42 and M = 4.15, SD = 0.44).
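One way to read this interaction is as a difference of differences between the four cell means. The sketch below, in Python, uses the REC3 percentages reported above; it only illustrates the contrast and does not reproduce the ANOVA itself. All names are illustrative.

```python
# REC3 cell means in percent, as reported in the text.
rec3 = {
    ("Sketch", "5sec"): 23.6,
    ("Sketch", "30sec"): 41.5,
    ("PRO-fit", "5sec"): 45.9,
    ("PRO-fit", "30sec"): 44.0,
}


def time_effect(composite):
    """Simple effect of viewing time (30sec minus 5sec) for one composite type."""
    return rec3[(composite, "30sec")] - rec3[(composite, "5sec")]


sketch_gain = time_effect("Sketch")      # change in percentage points for sketches
profit_gain = time_effect("PRO-fit")     # change in percentage points for PRO-fit
interaction = sketch_gain - profit_gain  # difference of differences

print(f"Sketch: {sketch_gain:+.1f}, PRO-fit: {profit_gain:+.1f}, "
      f"interaction contrast: {interaction:+.1f} percentage points")
```

A contrast near zero would mean both composite types respond to viewing time in the same way; here the gain is concentrated in the sketches.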

Table 4. Mean recognition in percentage for each recognition round. *one participant was excluded from the conditional measure (CONREC3)

Type of composite   Time     REC1 %   REC2 %   REC3 %   CONREC3 %   N
Sketch              5 sec    3.6      84.5     23.6     31.4        22
Sketch              30 sec   4.5      84.5     41.5     49.6        20
PRO-fit             5 sec    3.4      92.1     45.9     50.5        29
PRO-fit             30 sec   3.0      88.0     44.0     48.5        20*
Total               5 sec    3.5      88.8     34.8     40.9        51
Total               30 sec   3.8      86.3     42.8     49.0        40

Figure 3. Mean recognition of REC3 in percentages.

[Figure: mean recognition REC3 (%) by viewing time (5sec, 30sec), with separate series for Sketch and ProFit.]


Figure 4. Mean conditional recognition of REC3 in percentages.

Since recognition of composites is dependent on knowing the targets, identification of targets in REC2 was also analysed. As expected, a one-way ANOVA showed no significant between-conditions differences of target recognition, F(3, 87) = .797, p = .499 (see Table 3 for mean recognition in percentages).

To control for familiarity with the targets, a new dependent variable called conditional recognition was calculated. Conditional recognition is expressed as a recognition percentage, how many of the known targets were recognised. For example: a participant recognised 8 out of 10 targets and identified 6 out of 10 composites. Conditional recognition is then expressed as 6 identifications out of 8 known targets, so 6 divided by 8, times 100 = 75%.
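The calculation described above can be written out as a short Python sketch. The function name is illustrative, and returning None when a participant knew no targets is an assumption of this sketch, since the measure is undefined in that case.

```python
def conditional_recognition(composites_named, targets_known):
    """Percentage of known targets whose composite was also named.

    Returns None when no targets were known (assumption of this sketch:
    the measure is undefined when the denominator is zero).
    """
    if targets_known == 0:
        return None
    return 100 * composites_named / targets_known


# Worked example from the text: 6 composites named, 8 targets known.
print(conditional_recognition(6, 8))  # 75.0
```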

For conditional recognition of REC1 the association between type of composite, time and recognition was analysed by a logit loglinear analysis. The three-way loglinear analysis produced a complete model with a goodness of fit of χ2(0) = 0. There was no significant interaction between time and type of composite, χ2(1) = 0.405. Both main effects, for type of composite, χ2(1) = 0.169, and for time, χ2(1) = 0.123, were also not significant.

For analysis of conditional recognition in REC3, a 2 x 2 factorial ANOVA was done. One participant was excluded from this analysis, since this participant recognised no targets in REC2 (and gave no answers at all), but nevertheless recognised two composites in REC3. There was no significant main effect of type of composite on conditional recognition, F(1, 86) = 3.24, p = .075 (see Table 3 for means and SD, and Figure 4). The main effect of time on conditional recognition was not significant, F(1, 86) = 2.61, p = .110. There was a significant interaction effect between the length of time the witnesses had seen the targets and type of composite on conditional recognition, F(1, 86) = 4.01, p = .048. This indicates that the two types of composite were affected differently by time. Specifically, conditional recognition of PRO-fit composites was similar in both the 5 second condition (M = 50.49, SD = 4.37) and the 30 second condition (M = 48.54, SD = 5.40). Conditional recognition of Sketch composites was significantly lower in the 5 seconds condition (M = 31.39, SD = 5.01) than in the 30 seconds condition (M = 49.55, SD = 5.26).

Analysis of composites

Quality of composites, variation in famousness of the targets and/or specific facial features might all contribute to differences in identification of composites.

REC2

First, famousness or familiarity with the targets was analysed. For REC2 a Chi-Square test was done. There was a significant association between target and identification, χ2 = 38.48, p < .001. Target 6 (Dennis Storm) was identified the least (68 times out of 91), while targets 7 and 8 (Jan Smit and Gordon) were recognised the most (both 86 times out of 91; see Table 4 and Figure 5).

Figure 5. Recognition of target photographs (REC2) in percentages.

REC1

Since identification of composites in round 1 was very low, not every composite was recognised at least once. A Chi-Square analysis (Monte Carlo) showed that there was a significant association between target and identification of composites, p < .001. Three targets/composites were identified significantly more often than the other ones (see Table 4 and Figure 6). These were target 3 (Paul de Leeuw), target 6 (Dennis Storm) and target 8 (Gordon). Three targets were never identified at all: target 5 (Frans Bauer), target 7 (Jan Smit) and target 10 (Nick Schilder).


Figure 6. Sum of identifications per target per condition in REC1.

[Figure: sum of identifications in REC1 per target (Matthijs, Marco, Paul, Jeroen, Frans, Dennis, Jan, Gordon, Bram, Nick), with separate series for 5secSketch, 30secSketch, 5secProfit and 30secProFit.]

Table 4. Counts of recognitions per target per condition in REC1.

condition Matthijs Marco Paul Jeroen Frans Dennis Jan Gordon Bram Nick N
5secSketch 3 4 22
30secSketch 1 1 2 4 2 20
5secProFit 1 1 2 7 29
30secProFit 4 1 1 20

REC3

Analysis with a Chi-Square test showed that there was a significant association between target and identification of composites, χ2 = 183.67, p < .001. Target 3 (Paul de Leeuw) was identified the most (70 times out of 91), and target 10 (Nick Schilder) was identified the least (20 times out of 91; see Table 6 and Figure 7).


Figure 7. Sum of identifications per target per condition in REC3.

[Figure: sum of identifications in REC3 per target (Matthijs, Marco, Paul, Jeroen, Frans, Dennis, Jan, Gordon, Bram, Nick).]

Table 6. Counts of recognitions per target per condition in REC3. *5 second sketch of Gordon was replaced with the 30 second sketch; these recognitions were discarded from analysis.

Condition     Matthijs   Marco   Paul   Jeroen   Frans   Dennis   Jan   Gordon   Bram   Nick   N
5secSketch    5          0       14     1        4       9        2     3*       7      5      22
30secSketch   12         2       14     6        3       14       4     11       11     4      20
5secProfit    13         17      23     12       7       18       11    16       9      8      29
30secProFit   14         8       19     9        9       8        6     6        5      3      20
Total         44         27      70     28       23      49       23    33       32     20     91

Ratings

For each identification a rating of 1 (not sure at all) through 7 (very sure) was asked. Mean rating for correct identifications in REC1 was M = 3.73, SD = .324, and mean rating for no recognition was M = 2.93, SD = .074. Since the experiment was set up in a way that participants also had to give a rating even when they didn’t give any answer at all, there were many trials with a rating of 1, which is meaningless but does lower the mean rating for no identification. After selecting only cases with answers, mean rating for incorrect identifications was M = 3.29, SD = .085. The same analysis was also done for REC3. Mean rating for correct identifications in REC3 was M = 4.86, SD = .090, and mean rating for no recognition was M = 3.36, SD = .093. After selecting only cases with answers, mean rating for incorrect identifications was M = 3.89, SD = .103.

Discussion

To summarise, the current study was done to compare two methods of composite production, PRO-fit and sketches, in relation to target viewing times. Importantly, this is also the first study in which it was certain that witnesses had never seen the targets before. One of the main findings was that overall identification was very low (3.6%), confirming our first hypothesis. This is comparable to the 3% reported by Frowd et al. (2005a). The other main finding was an interaction effect between time and type of composite; viewing times had no effect on PRO-fit naming, but 5 second sketches were recognised less often than 30 second sketches.

Although the current finding is comparable with Frowd et al. (2005a), previous studies reported higher recognition rates (e.g., Brace et al., 2000; Davies, van der Willik, & Morrison, 2000; Frowd et al., 2005b). Frowd et al. (2005b), for example, report a mean naming rate of 20% and Brace et al. (2006) report a mean naming rate of 12% for composites based on description from memory. As mentioned above, one important difference between these studies and the current one is familiarity of the participant-witnesses with the targets. In the study done by Frowd (2006) participants were familiar with the faces they had to describe to the operators in order to construct a composite. Although this is a useful method for testing the quality and limitations of the software and for comparing verbal description from memory with description from a photograph, it is not forensically realistic. In the study done by Frowd et al. (2005b) witnesses identified a photograph of an unknown celebrity and then inspected the photograph for one minute. One of the problems with this approach, however, is that while witnesses might not consciously know the identity of the celebrity, they might have seen this person on TV or in the newspaper without realising it. This could have influenced their verbal description in a positive way, resulting in composites of higher quality.

Another major difference with earlier studies is the time between witnesses seeing the target and construction of the composite. In Frowd et al. (2005b) the delay was 3-4 hours, which is relatively short compared to the 24 hours in the current study. Indeed, in another study done by Frowd et al. (2005a), a more realistic delay of 2 days was used and a much lower overall composite naming rate of 3% was found. Again, this is comparable with the current study’s finding, since we used a delay of 24 hours. Since the procedure of selecting an unfamiliar celebrity target face was the same in both earlier studies, it seems the delay between memorising the unfamiliar face and composite production is the most critical factor. As described by the Ebbinghaus curve, most of the forgetting seems to take place in the first few hours after the memory has been formed, and after that forgetting is much slower (Baddeley, 1997). In practice this would mean that in order to create the best possible composites, construction needs to take place within several hours after the witness has seen the suspect. If the delay is longer, quality decreases, but the difference in quality between a 24 hour delay and a delay of several days will be negligible. This, however, needs to be tested specifically.

Since recognition rates in the first round were too low, we used a second naming round of composites as an auxiliary measure of composite quality. This round was used to test the other hypotheses. To control for target familiarity, a conditional naming measure was also calculated. Since target familiarity in general was very high, no differences were found between these measures. Our second hypothesis was that composites of targets that were presented longer (30 seconds vs 5 seconds) would show higher recognition rates, regardless of type of composite. However, no significant effect of time was found. Our third hypothesis was that an interaction between type of composite and duration of target presentation would be found: sketch composites would benefit the most from longer durations, resulting in higher rates of recognition. Indeed, an interaction effect was found, but time only had an effect on the quality of the sketches. For PRO-fit composite quality it did not matter whether participant-witnesses viewed the targets for 5 or 30 seconds; mean recognition of these composites was the same. Sketches, however, performed worse under shorter viewing conditions, while recognition of 30 second sketches was the same as for PRO-fit. So while the hypothesis was that sketches would benefit the most (higher recognition than PRO-fit), the current finding suggests that performance of sketches is impaired by a shorter viewing time.

What could explain this? One reason could be that when making sketches there is no reference face, so the sketch artist is more dependent on the verbal description than a PRO-fit operator is. Although this also holds for the 30 second sketches, witness-participants might have better or stronger memory traces after longer viewing and be better able to describe the face. As hypothesised in the introduction, one would argue that PRO-fit should also benefit from the longer viewing time; however, this was not found, and perhaps having a reference face makes these memory traces less crucial for composite quality. It may be easier to point out something that is off (wrong) or missing in a face than to describe a face from scratch, unless perhaps one is trained to do so. Verbally describing a face while the operator keeps asking questions might also interfere with maintaining the mental image of the face.

Another reason could be that while the 5 second sketches are of good quality, they might miss some important information necessary for identification. In the study by Frowd et al. (2005b) it was noted that while the sketches were of better quality (measured by a sorting task) than PRO-fit composites, identification was lower. These sketches contained more detail for external features such as face shape and hair, but lacked detail in internal features such as the nose, which are important for recognition (Frowd et al., 2007). Recognition of familiar faces, after all, involves processing of internal features more than external features (Ellis, Shepherd, & Davies, 1979). In this study these particular details were not analysed, but a lack of detailed internal features in the 5 second sketches might have contributed to a lower overall recognition.

Although the current finding does not confirm our hypothesis, it may support the mechanism proposed in the introduction, namely that sketch composite production is more holistic in nature. Since there is no reference face, sketches are perhaps more dependent on the viewing time of the witness than PRO-fit composites are. Since there is a big difference in recognition between 5 second and 30 second sketches, composite quality may increase even further when witnesses have longer viewing times (such as 60 seconds). Findings from Frowd et al. (2005a) support this idea, since sketches performed better than PRO-fit (8% vs 3% recognition) when witnesses could look at the target photograph for 60 seconds. Whether sketches really benefit from this, however, remains to be tested specifically.

Why is the recognition rate in general so low? The problem with composite production and recognition is that it depends on four stages, each with its own task demands. As mentioned in the introduction, verbally describing unfamiliar faces for composite production and recognising a familiar face depend on different cognitive and perceptual processes (Davies et al., 1978). First, a witness needs to be able to remember an unfamiliar perpetrator’s face, which people are in general not very good at (Megreya & Burton, 2008). Then the witness has to verbally describe the features of the perpetrator. While recognising a face, and arguably encoding a face, depends on holistic processes, verbal description is feature based and might interfere with the mental image of the face itself. This is comparable to describing which muscles to use when riding a bicycle: the process is subconscious and not verbally accessible. It could be, however, that focussing on features and their configuration, and perhaps describing them, is trainable, as an artist can learn to do. The operator’s experience in translating these verbal descriptions into a composite of course also plays a role. All of the factors above influence composite quality.

One specific explanation for problems with recognition of composites is that people might consciously focus on certain features of the face that are not correct for this person, and attribute too much weight to these errors. Some participants in the current study reported exactly this: ‘he looks like … , but I don’t know, his hair and nose is different, so it must be someone else.’ Some composites, by chance, looked very much like another famous Dutch person, and participants could not unsee this. The idea that composite faces, like real faces, are processed holistically is supported by research by McIntyre, Hancock, Frowd, and Langton (2015). Composites contain inaccuracies, and these impair recognition by making it difficult to perceive the correct and identifiable features.

Again, an important point is that recognition of familiar faces depends on the analysis of internal features such as the mouth, nose and eyes. But although these features are represented holistically in face-selective regions of the brain, the same is true for external features such as hair and face outline (Andrews et al., 2010). This means that, besides the interference caused by consciously focussing too much on incorrect features, the holistic process of facial recognition is itself impaired by both incorrect internal and incorrect external features. This is supported by a study by Toseeb, Keeble, and Bryant (2012), in which it was found that internal features of a face give sufficient information for recognition, but that changes in hairstyle disrupted this holistic process. It is further corroborated by the finding that external facial features modify the representation of internal facial features in the fusiform face area (Axelrod & Yovel, 2010). It is then not hard to see why composites, which often contain incorrect internal and external features, are very hard to recognise.

Interestingly, five participants recognised two out of 10 composites in the first naming round, while most of the participants recognised no composites at all in the first round, and 23 recognised only one composite. Composite quality and therefore recognition is of course dependent on both the memory of the witness and the operator’s experience. There is however some evidence that individual differences in recognising faces might also be a contributing factor (Andersen, Carlson, Carlson, & Gronlund, 2014) and this trait might even be genetic (Wilmer et al., 2010).

One solution for the floor effect of the first naming round would be to use a smaller pool of targets, e.g. football players. Another solution could be to use a much larger sample than in the current study. One way of doing this would be to test a large group of people identifying the composites at the same time, in a lecture hall for example. Although some control will be sacrificed with this method, it resembles the way facial composites are usually encountered in practice: on a TV programme or in a newspaper.

This leads to the question of whether the use of composites for identifying suspects is a good method in general. The rationale of this method is of course that if only one out of thousands of viewers recognises a composite, the facial composite has been a success in helping to identify a perpetrator. But there are two sides to this. If thousands of people see a composite, probably only a few of them will actually know the perpetrator, but often the composite will look like someone else they do know, leading to false identifications. By pointing the police in the wrong direction, this could result in a lot of wasted police time and resources which could have been used elsewhere. In a worst case scenario, it might even lead to a false conviction.

A final question is what a recognition rate of 3.6% actually means. It means that, on average, about 28 people (1/0.036 ≈ 28) need to see a composite before it is recognised once. Again, while a police officer would say that even a single positive identification could mean a lead and perhaps an arrest and conviction of a perpetrator, not every person who thinks they recognise a composite will actually contact the police. Many people will be hesitant to do so, since they might see a ‘likeness’ but not be very certain that this is actually the person they think it is. Therefore, in practice composites might be recognised even less often than in this study. Also, general mean recognition is just that, general. Some composites were not identified even once, meaning that in practice some composites will never be recognised at all, no matter how many viewers see them. That by itself makes the use of composites very much a hit-and-miss procedure, more akin to gambling or a last resort than to a routinely useful procedure. The practical takeaway from the current results is that it might be a good idea to invest time and resources in police methods other than composites to find and identify a perpetrator. If, however, a composite is used, sketch and PRO-fit seem to perform equally well in general, but PRO-fit seems to be the better option if witnesses have seen a suspect for only a very short amount of time.
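The arithmetic behind the 3.6% figure can be made explicit with a minimal sketch. It assumes, purely for illustration, that every viewer is familiar with the target and independently names the composite correctly with the same probability p = 0.036; neither assumption was tested in this study, and the function names are hypothetical.

```python
import math


def expected_viewers(p: float) -> float:
    """Mean number of independent viewers until the first correct naming
    (mean of a geometric distribution with success probability p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return 1.0 / p


def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent viewers names the composite."""
    if not 0.0 <= p <= 1.0 or n < 0:
        raise ValueError("p must be in [0, 1] and n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def viewers_needed(p: float, target: float) -> int:
    """Smallest n for which prob_at_least_one(p, n) reaches the target probability."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    p = 0.036  # overall naming rate reported in this study
    print(math.ceil(expected_viewers(p)))   # viewers needed on average
    print(round(prob_at_least_one(p, 28), 3))  # chance of a hit among 28 viewers
    print(viewers_needed(p, 0.95))          # viewers needed for a 95% chance
```

Under these assumptions, 28 viewers is only the average: the chance that at least one of 28 viewers names the composite is roughly 64%, and about 82 viewers would be needed for a 95% chance. Because some composites were never named at all, a single shared p is optimistic for the worst composites.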

As a final thought, one could question how relevant the use of and research into facial composite recognition is at a time when more and more CCTV images and video from security cameras are available. In practice, however, not everything is being filmed yet, and therefore eyewitness accounts, although far from perfect, remain an important means of identifying suspects. From a theoretical standpoint this type of research is also still relevant, since insight into the way internal and external features of a face contribute to facial recognition translates to often blurry camera images and to suspects who have changed their hairstyle. Future evaluations will therefore hopefully provide much needed insight to improve the difficult task of describing a suspect’s appearance to the police.

References

Andersen, S. M., Carlson, C. A., Carlson, M. A., & Gronlund, S. D. (2014). Individual differences predict eyewitness identification performance. Personality and Individual Differences, 60, 36-40. doi:10.1016/j.paid.2013.12.011

Andrews, T. J., Davies-Thompson, J., Kingstone, A., & Young, A. W. (2010). Internal and external features of the face are represented holistically in face-selective regions of visual cortex. The Journal of Neuroscience, 30(9), 3544-3552. doi:10.1523/JNEUROSCI.4863-09.2010

Axelrod, V., & Yovel, G. (2010). External facial features modify the representation of internal facial features in the fusiform face area. NeuroImage, 52(2), 720-725. doi:10.1016/j.neuroimage.2010.04.027

Baddeley, A. D. (1997). Human memory: Theory and practice. Psychology Press.

Brace, N., Pike, G., & Kemp, R. (2000). Investigating E-FIT using famous faces. In A. Czerederecka, T. Jaskiewicz-Obydzinska, & J. Wojcikiwicz (Eds.), Forensic Psychology and Law (pp. 272-276). Krakow: Institute of Forensic Research Publishers.

Brace, N. A., Pike, G. E., Allen, P., & Kemp, R. I. (2006). Identifying composites of famous faces: Investigating memory, language and system issues. Psychology, Crime & Law, 12(4), 351-366.

Bruce, V. (1982). Changing faces: Visual and non-visual coding processes in face recognition. British Journal of Psychology, 73(1), 105-116.

Davies, G., van der Willik, P., & Morrison, L. J. (2000). Facial composite production: A comparison of mechanical and computer-driven systems. Journal of Applied Psychology, 85(1), 119-124. doi:10.1037//0021-9010.85.1.119

Davies, G. M., Shepherd, J. W., & Ellis, H. D. (1978). Remembering faces: Acknowledging our limitations. Journal of the Forensic Science Society, 18(1), 19-24.

Ellis, H. D., Shepherd, J. W., & Davies, G. M. (1979). Identification of familiar and unfamiliar faces from internal and external features: Some implications for theories of face recognition. Perception, 8(4), 431-439.

Frowd, C., Bruce, V., McIntyre, A., & Hancock, P. (2007). The relative importance of external and internal features of facial composites. British Journal of Psychology, 98(1), 61-77. doi:10.1348/000712606X104481

Frowd, C. D., Bruce, V., Smith, A. J., & Hancock, P. J. B. (2008). Improving the quality of facial composites using a holistic cognitive interview. Journal of Experimental Psychology: Applied, 14(3), 276-287. doi:10.1037/1076-898X.14.3.276

Frowd, C. D., Carson, D., Ness, H., McQuiston-Surrett, D., Richardson, J., Baldwin, H., & Hancock, P. (2005a). Contemporary composite techniques: The impact of a forensically-relevant target delay. Legal and Criminological Psychology, 10(1), 63-81.

Frowd, C. D., Carson, D., Ness, H., Richardson, J., Morrison, L., Mclanaghan, S., & Hancock, P. (2005b). A forensically valid comparison of facial composite systems. Psychology, Crime & Law, 11(1), 33-52. doi:10.1080/10683160310001634313

Frowd, C. D., McQuiston-Surrett, D., Anandaciva, S., Ireland, C. G., & Hancock, P. J. B. (2007). An evaluation of U.S. systems for facial composite production. Ergonomics, 50(12), 1987-1998. doi:10.1080/00140130701523611

Hancock, P. J., Bruce, V., & Burton, A. M. (2000). Recognition of unfamiliar faces. Trends in Cognitive Sciences, 4(9), 330-337. doi:10.1016/s1364-6613(00)01519-9

Johnston, R. A., & Edmonds, A. J. (2009). Familiar and unfamiliar face recognition: A review. Memory, 17(5), 577-596. doi:10.1080/09658210902976969

Laughery, K. R., & Fowler, R. H. (1980). Sketch artist and identi-kit procedures for recalling faces. Journal of Applied Psychology, 65(3), 307. doi:10.1037/0021-9010.65.3.307

McIntyre, A. H., Hancock, P. J. B., Frowd, C. D., & Langton, S. R. H. (2015). Holistic face processing can inhibit recognition of forensic facial composites. Law and Human Behavior. doi:10.1037/lhb0000160

McQuiston-Surrett, D., Topp, L. D., & Malpass, R. S. (2006). Use of facial composite systems in US law enforcement agencies. Psychology, Crime & Law, 12(5), 505-517. doi:10.1080/10683160500254904

Megreya, A. M., & Burton, A. M. (2008). Matching faces to photographs: Poor performance in eyewitness memory (without the memory). Journal of Experimental Psychology: Applied, 14(4), 364. doi:10.1037/a0013464

Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359-1366. doi:10.1177/0956797611417632

Toseeb, U., Keeble, D. R. T., & Bryant, E. J. (2012). The significance of hair for face recognition. PLoS ONE, 7(3), e34144. doi:10.1371/journal.pone.0034144

Wilkinson, C., & Rynn, C. (2012). Craniofacial identification. Cambridge University Press.

Wilmer, J. B., Germine, L., Chabris, C. F., Chatterjee, G., Williams, M., Loken, E., . . . Duchaine, B. (2010). Human face recognition ability is specific and highly heritable. Proceedings of the National Academy of Sciences of the United States of America, 107(11), 5238-5241. doi:10.1073/pnas.0913053107

Appendix A

Target Photographs and names

1: Matthijs van Nieuwkerk
2: Marco Borsato
3: Paul de Leeuw
4: Jeroen van Koningsbrugge
5: Frans Bauer
6: Dennis Storm

Appendix B

Overview of all composites
