Does the use of posed stimuli inflate recognition rates in emotion research?

Academic year: 2021

Does the use of posed stimuli inflate recognition rates in emotion research?

W. M. Waterbolk

Bachelorthesis Social Psychology Waterbolk, W. M.

10003420

University of Amsterdam May 2014

Abstract

Laughing, crying, being sad or angry: every day, people express emotions and try to read emotions in others. Emotional communication is an important part of our social lives and has been studied by researchers for many years. Most research has been done with posed stimuli. Posed expressions are high in experimental control, but their ecological validity has been questioned. Therefore, spontaneous expressions have been studied, because they are high in ecological validity. Could it be that recognition rates differ depending on the method used? Based on the literature that I review, a few conclusions can be drawn: (1) For static face stimuli, recognition rates in studies that have used spontaneous stimuli are lower than in studies that have used posed stimuli. There is, however, little evidence on the direct comparison between posed and spontaneous expressions in faces. (2) With vocal expressions of emotion, recognition is more accurate for posed stimuli than for spontaneous stimuli. (3) The use of multi-modal stimuli raises recognition rates for posed stimuli, and it is suggested that the same holds for spontaneous stimuli. Reasons why it is important to examine the difference between posed and spontaneous stimuli are discussed.

Introduction

Laughing, crying, being sad or angry; every day, people express emotions and try to read emotions in others. When a friend says ‘Hi!’ a smile appears on our face and when someone is crying we ask what is going on. Being aware of how other people feel is essential for interactions and relations with others (Diener & Mangelsdorf, 1999; Simpson, Collins, Tran, & Haydon, 2007; Van Kleef, 2010). Therefore, the ability to express and recognize emotions accurately is crucial in our lives.

There is no clear definition of what an emotion really is, although it is a widely accepted concept. On at least one point researchers agree: emotions are internal phenomena. The problem is that internal phenomena are not directly observable. Outward expressions and actions of internal phenomena, on the other hand, are observable. Researchers often isolate parts of the emotional experience and use them as the indicator of emotion (Niedenthal, Krauth-Gruber & Ric, 2006). Classic studies on emotion recognition have used facial expressions in photographs and found that people were quite accurate at recognizing the emotion (Ekman, Sorenson, & Friesen, 1969; Ekman, 1972; Elfenbein & Ambady, 2002). However, the photographs used in such studies show facial expressions that are still and posed. In real life, an emotion is never that static.

Since real life situations do not consist of posed expressions, posed emotional stimuli may lack ecological validity (Mesquita & Frijda, 1992). It has been suggested that people exaggerate or overact a posed expression to make sure that it is as clear as possible. Stereotypes of expressions can also influence the portrayal of the emotion. It is argued that posed expressions are much more intense and prototypical than the expressions found in natural emotions (Scherer, Clark-Polner, & Mortillaro, 2011).

Researchers need a method that obtains a high level of ecological validity but, on the other hand, keeps as much experimental control as possible. Therefore, spontaneous expressions have been studied [...] involuntary and thus stereotypes have less impact (Scherer, Clark-Polner, & Mortillaro, 2011). Spontaneous stimuli can be obtained in a laboratory by showing people videos or other material that evokes a specific emotion. For example, a funny video can be used to elicit happiness. Alternatively, spontaneous expressions can be obtained in a real life setting.

Most research on emotion recognition has been done with facial expressions. There are, however, other channels through which emotions are communicated. Vocal expressions of emotion have also received a lot of attention (see Scherer, 2003). It can be useful to draw parallels between the research on facial and vocal expressions. Each research team makes its own choices in experimental design: Ekman and colleagues (e.g., Ekman, 1972; Matsumoto & Ekman, 1988) used solely still facial photographs, whereas Scherer and his colleagues (e.g., Scherer et al., 2001) only worked with the voice.

The question I will address in this review is whether recognition rates differ when posed or spontaneous emotion stimuli are used. For example, would Ekman and colleagues (Ekman, Sorenson, & Friesen, 1969; Ekman, 1972; Matsumoto & Ekman, 1988) have found similar results if they had used spontaneous displays of facial expressions instead of posed photographs?

Firstly, recognition of facial expressions is discussed. The methods covered are posed photographs, spontaneous photographs, and acted dynamic displays. The difference between posed and spontaneous smiles is also described. Secondly, vocal expressions are discussed, examining posed and spontaneous expressions. Then I describe the difference between facial and vocal expressions and note the possibility of using multiple communication channels in studies of the recognition of emotion. Based on the literature that I review, a few conclusions can be drawn: (1) For static face stimuli, recognition rates in studies that have used spontaneous stimuli are lower than in studies that have used posed stimuli. There is, however, little evidence on the direct comparison between posed and spontaneous expressions in faces. (2) With vocal expressions of emotion, recognition is more accurate for posed stimuli than for spontaneous stimuli. (3) The use of multi-modal stimuli raises recognition rates for posed stimuli, and it is suggested that the same holds for spontaneous stimuli. Reasons why it is important to examine the difference between posed and spontaneous stimuli are discussed.

Facial expressions

Posed photographs

Posed expressions give researchers a high degree of control over their stimuli (Cowie & Cornelius, 2003; Scherer, 2003; Scherer, Clark-Polner & Mortillaro, 2011). However, expressions produced in a laboratory might differ from those in real life and thus give a less realistic image of how emotions are really expressed. This means lower ecological validity. Another criticism of posed photographs came from Russell (1994), who argued that with posed expressions, the poser chooses how to express that specific emotion and could therefore be subject to display rules. This means that when people pose an expression, they might think they act out the same expression that they would show when experiencing the emotion, but it is not entirely the same as what they look like when they express emotions in real life (Russell, 1994). For example, facial muscles that play a role in real emotion might not be active in posed expressions, or their activity could be almost invisible. Display rules are also present when people control their emotional expressions in certain situations because showing them is not appropriate in their culture. For example, Japanese people tend to smile or hide their feelings in the presence of an authority figure (Ekman, 1972).

Most studies on emotion have used posed facial stimuli (Cowie & Cornelius, 2003). With this method, Ekman and colleagues found that people in literate and preliterate cultures were quite accurate at recognizing the emotional expressions (Ekman, Sorenson, & Friesen, 1969; Ekman & Friesen, 1971; Ekman, 1972). In their opinion, it was not the posed expressions but the natural expressions that are affected by display rules. According to the researchers, posed facial expressions are similar to, or perhaps an exaggeration of, spontaneous expressions. The difference is that they convey one single specific emotion and have no distracting or irrelevant features.

Spontaneous photographs

The use of posed stimuli thus has its disadvantages. When spontaneous expressions are used as stimuli, ecological validity is higher. However, one of the major problems with spontaneous expressions is determining the underlying emotion, because the expression could still be affected by display rules, for example when somebody is laughing out of politeness instead of happiness (Ekman, 1972; Cowie & Cornelius, 2003; Scherer, 2003; Scherer, Clark-Polner & Mortillaro, 2011). In order to isolate spontaneous facial expressions that are not affected by display rules (Ekman, 1972), researchers turned to infants. In their opinion, infants were not yet affected by display rules and thus had the purest emotion signals (Izard, 1994). In a study by Yik, Meng, and Russell (1998), babies from 12 to 18 months of age were put in a situation that elicited a certain emotion, such as being given a toy to create happiness or having it taken away to make them angry. Photographs were taken of the facial expressions they showed in these situations. People from Canada, China, and Japan had to name the emotion shown in each photograph. The results showed that there was quite a lot of disagreement within and across cultures. Happy faces were recognized best, but there was poor recognition of the other facial expressions. It could be that infants have not fully developed their emotional system and therefore, in contrast to what was thought by Izard (1994), did not show highly recognizable facial expressions.

This was found not only with infants; studies with adults’ expressions have also shown that recognition of spontaneous facial expressions is relatively low. Naab and Russell (2007) showed American participants the photographs of the preliterate New Guineans’ spontaneous expressions that were taken in Ekman’s study (Ekman, 1980). Even though recognition of the correct emotion was above chance, an incorrect emotion label was selected more often for half of the expressions. An earlier study by Motley and Camden (1988) noticed that there might be a difference in recognition of posed and spontaneous expressions. Without the participants knowing, the researchers photographed the expressions of participants who were in a casual conversation with a confederate. Six emotions were elicited through the conversation topics that the confederate brought in: anger, confusion, disgust, happiness, sadness and surprise. For example, surprise was elicited by having the confederate recognize the participant by name and hometown, which surprised the participant because they did not know each other. Later, with awareness of the camera, the participants had to pose the expressions again. Posed facial expressions were accurately identified, whereas spontaneous ones were more ambiguous. These results suggest that with spontaneous expressions, people’s accuracy in identifying emotions is less impressive than with posed expressions. Apparently, facial expressions in natural interpersonal communication settings are more ambiguous than in the artificial settings that are usually studied.

Why might this be? According to Matsumoto, Olide, Schug, Willingham, and Callan (2009), other functions of the face affect the clarity of the emotion signal, which in turn affects recognition rates. In real life settings, talking, illustrating speech, and regulating conversation are also functions of the face (Bavelas & Chovil, 2000, 2006; see Matsumoto et al., 2009). Spontaneous expressions have lower signal clarity than posed expressions because these other functions are also at work. Is it just signal clarity that makes spontaneous facial expressions different from posed facial expressions? If so, the spontaneous facial expressions in the study by Motley and Camden (1988) were less clear because the participants were also engaged in other facial behavior that was necessary for the conversation, but which interfered with the clarity of the emotional signal. Or could it be that spontaneous expressions are more difficult to recognize because they are perceived in a completely different manner than posed ones? If that is the case, comparing posed and spontaneous facial expressions is like comparing apples and oranges: they are not comparable at all. This, of course, has implications for previous research that has been done on posed expressions.

As argued above, differences in recognition rates between posed and spontaneous facial expressions could occur because of inflated emotion signal clarity or because the two kinds of expression are perceived in different ways. The latter has been studied in research on smiles, which is reviewed in this section.

In the research on smiles, the genuine smile is also known as the Duchenne smile and has been widely studied (Williams, Senior, David, Loughland, & Gordon, 2001; Krumhuber & Manstead, 2009). Typical for a Duchenne smile are movements of both the zygomatic major muscle, which raises the corners of the mouth, and the orbicularis oculi muscle, which raises the cheeks and makes the eyes crinkle (Duchenne, 1862/1990, see Ekman, Davidson, & Friesen, 1990). People were found to be unable to show a Duchenne smile deliberately: in a deliberate, non-Duchenne smile, only the zygomatic major muscle contracts (Williams, Senior, David, Loughland, & Gordon, 2001). This means that a spontaneous smile is biologically different from a deliberate smile. Miles and Johnston (2007) examined the difference between spontaneous enjoyment smiles and deliberate non-enjoyment smiles. Smiles were classified as deliberate non-enjoyment smiles if there was evidence of zygomatic major contraction, but no contraction of orbicularis oculi. Spontaneous enjoyment smiles had contractions of both muscles. Enjoyment smiles were recorded while posers watched positively valenced video clips and photographs, and non-enjoyment smiles when posers were asked to pose a series of smiles as if a picture was being taken. Participants then had to judge how happy the face looked in the enjoyment and non-enjoyment stimuli. The results showed that participants were sensitive to the differences between spontaneous and deliberate smiles. They evaluated the faces displaying enjoyment smiles more positively than those displaying non-enjoyment smiles. Perceivers are thus sensitive to smile type; however, the study did not examine the recognition rates of the emotion.

A recent study by Krumhuber and Manstead (2009) examined how Duchenne and non-Duchenne smiles were perceived, and whether people made different judgments of these two smiles in posed and spontaneous stimuli. Spontaneous smiles were a reaction to amusing stimuli, while deliberate smiles were created with the instruction to pose a smile. [...] smile was rated as most genuine and amused, but perceivers also distinguished between spontaneous Duchenne smiles that varied in intensity. Duchenne and non-Duchenne smiles appeared in approximately equal proportions in the spontaneous and deliberate conditions, which casts doubt on the claim that Duchenne smiles are spontaneous signs of felt enjoyment that cannot be feigned (Ekman, 1985, 1989, 1993, see Krumhuber & Manstead, 2009). Although both smile types were recognized as positive emotions, the difference between posed and spontaneous stimuli is not that clear.

So far, the literature has shown that recognition rates for posed and spontaneous facial expressions differ, with lower recognition for spontaneous expressions. Research on smiles shows that people can accurately detect whether a facial expression is posed or spontaneously displayed. This suggests that posed and spontaneous smiles are physiologically different. However, there is also research showing that the muscle movements that characterize a spontaneous smile can occur in posed smiles. So, the distinction between posed and spontaneous smiles still has its limitations. Nevertheless, if the muscle movements of spontaneous expressions can be acted as well, acted expressions would be a good alternative in research design when spontaneous stimuli are difficult to collect. Here, we only looked at smiles. It would be interesting to see if other emotions show similar results when posed and spontaneous expressions are compared in research.

Posed acted displays

Another form of posed facial expressions are acted displays of emotions in films or TV series. Here, professional actors, instead of lay participants, produce an emotional expression based on emotion-typical scenarios. One study, by Carroll and Russell (1997), used Hollywood movies as emotion material. They compared the emotions shown in the movies to facial affect programs (FAPs; Ekman, 1972), which are specific facial patterns consisting of distinguishable parts. Combinations of these parts make a pattern for a specific emotion. They found that actors who were judged as happy showed FAPs congruent with happiness in almost all cases. However, actors showing other emotions, such as surprise, fear, anger, disgust, or sadness, only showed one or two parts of the FAPs for those emotions. Actors’ displays of emotions could be influenced by stereotypes of expressions or miss out on some subtle details that occur in spontaneous expressions (Wallbott & Scherer, 1996; Scherer, Clark-Polner & Mortillaro, 2011; Scherer & Bänziger, 2010). As we have seen with the Duchenne smile, realistic smiles can be produced in posed expressions, so acted stimuli might be a good compromise between posed and spontaneous stimuli for facial expressions. Future research could take this method into consideration.

Dynamic displays

Most of the time, posed expressions have been displayed in static stimuli. As we have seen with emotional material from movies, the possibility of putting posed expressions in a more dynamic setting seems appealing. Movies are, after all, multiple photographs viewed in rapid succession. In that view, dynamic stimuli contain multiple static images and therefore offer a larger sample of the expression than one single photograph. Additional information or cues could help to judge an ambiguous emotional expression (Ambadar, Schooler, & Cohn, 2005). In the study mentioned before, Krumhuber and Manstead (2009) also compared dynamic displays with static ones, combined with spontaneous or deliberate smiles, and genuine or non-genuine smiles. Display type had a significant impact on recognition rates. When smiles were presented in dynamic displays, spontaneous smiles were judged as more genuine and amused than deliberate smiles. However, this difference did not emerge for static displays. Apparently, information about the spontaneity of the expression was not transmitted through static cues. Furthermore, people could recognize a non-genuine smile faster in a video than in a photograph. One cue, for example, is the duration of the smile: non-genuine smiles tend to last longer than genuine smiles. Dynamic displays thus add a different perspective on the time-course of emotional expressions. These results are in line with previous research by Wehrle et al. (2000), who showed that the dynamic presentation of emotional expressions adds important cues that tend to improve recognition.

So far, we know that with posed photographs, recognition rates are high but ecological validity is lower. In contrast, spontaneous photographs are higher in ecological validity, but lower in recognition rates. A possible explanation from Matsumoto et al. (2009) would be that the lower emotional signal clarity of spontaneous expressions is due to the fact that the face has more functions than just expressing emotion. In line with this theory, posed photographs of facial expressions would have higher emotional signal clarity. Looking at smiles, it appears that people are quite accurate at distinguishing between spontaneous and posed smiles. However, using the Duchenne smile as an indicator of real happiness has its limitations, since people can fake the Duchenne smile.

An option for more realistic posed material would be acted displays such as movies. There is, however, only one study that looked at this material. Dynamic displays, such as video clips of people expressing emotions, show higher recognition rates because more information is present in a dynamic setting. A suggestion for future research is therefore to use video material rather than photographs. In addition, only smiles were reviewed here; future research should also look at other emotions.

Vocal expressions

Although facial expressions have been studied the most in research on emotion recognition, the voice has also received attention, especially since recording methods have improved. Nowadays, television, telephone, radio or the internet can provide enough varied material (Scherer, Clark-Polner, & Mortillaro, 2011). As with facial expressions, posed portrayals are mostly used in this line of research, covering almost all emotion vocalization studies (Juslin & Laukka, 2003). However, there have also been a few studies on spontaneous vocal expressions.

Posed vocal expressions

For posed vocal expressions, people are asked to portray a certain emotion in an experimental setting. Van Bezooijen, Otto, and Heenan (1983) asked people to say a sentence either in a neutral way or while expressing a specific emotion, which could be fear, anger, joy, sadness, disgust, surprise, interest, contempt, or shame. The spoken sentences were then judged by Japanese, Taiwanese and Dutch participants, who had to label the expression as neutral or as one of the specific emotions. All but one of the emotions were recognized at above-chance levels. Nevertheless, there were cultural differences in recognition accuracy, and some emotions were better recognized in one culture than in another. For example, Dutch people recognized happiness the best, and contempt the least (Van Bezooijen, Otto, & Heenan, 1983). Other studies also asked participants to read passages in a specific emotional tone or with emotional content and found similar results (McGilloway, 1997; Iriondo et al., 2000; see Douglas-Cowie, Campbell, & Cowie, 2003).

As with facial expressions, critics have questioned the ecological validity of posed vocal stimuli (Cowie & Cornelius, 2003; Scherer, 2003). Therefore, the use of professional actors was suggested, since they are used to acting out realistic emotional sounds. Professional actors deliver emotional speech that has high arousal, which makes it easier to distinguish and would thus lead to more accurate recognition (Ververidis & Kotropoulos, 2006; Douglas-Cowie, Campbell, Cowie, & Roach, 2003). In a study by Banse and Scherer (1996), actors were asked to perform an emotion-eliciting scenario, for which they had received a script a few days before. They could only start when they actually felt the emotion. Participants were then asked to judge which of fourteen emotions was expressed. Three emotions (anger, boredom and interest) produced high recognition rates. Recognition rates for the other emotions were lower, but still quite high.

Although posed vocalizations can be classified reliably, portrayals of emotions can still to some degree be influenced by stereotypes of expressions (Scherer, Clark-Polner, & Mortillaro, 2011). This is controversial though. On the one hand, Batliner, Fischer, Huber, Spilker, and Nöth (2000) suggested that when acting out an emotion, people are supposed to display their emotions, but this does not mean that people express the same emotions in the same way in real life, bearing in mind the display rules mentioned earlier. Wilting et al. (2006) also argue that when someone is acting out an emotion, the emotion is not truly felt and could be overacted. On the other hand, Scherer (2013) argues that posed vocalizations are less artificial, exaggerated or prototypical than critics suggest.

Nevertheless, the relationship between posed and spontaneous vocal expressions can only be clearly shown when it is empirically tested.

Spontaneous vocal expressions

There are different ways of collecting emotional vocalizations from real life situations. For each method, one has to weigh whether the lack of experimental control outweighs the advantages of a real life setting for recognition of the vocal expression. One way is to induce or elicit the emotion in people. Amir, Ron, and Laor (2000) asked their participants to recall personal experiences that evoked anger, fear, joy, sadness or disgust. Recalling would simulate the actual experience and therefore produce a more natural expression than just portraying an emotion on the spot. A computer algorithm could reliably distinguish between the emotions expressed in the voice of the speakers. Still, recalling an emotion in a laboratory setting also has its limitations. Fear, for example, can only partially be felt, since it is not possible to generate a real threat to one’s life. This, however, has more to do with the intensity of the emotion and not so much with recognition per se.

Another way of collecting spontaneous vocal expression stimuli is to use material from television, radio, or YouTube clips, although these do not necessarily control for external factors (Scherer, 2013). Interviews, on the other hand, can be more guided (Ververidis & Kotropoulos, 2006). A more recently used method is a human-machine interaction setup. Audibert, Aubergé, and Rilliard (2008) used such a setup to collect emotional vocal expressions (Wizard-of-Oz technique, see Schuller et al., 2011). French-speaking actors had to pose or spontaneously express an emotion, and their expressions were judged by French listeners in audio, visual, or audio-visual settings. Pairs of one spontaneous and one posed stimulus were presented, and participants had to point out which of the two was spontaneous. The researchers found highly significant recognition effects in all settings and concluded that people can discriminate between voluntary (posed) and involuntary (spontaneous) control of emotions. The researchers thus examined whether people can tell if an emotion is spontaneous or not. Their focus, however, was not so much on the recognition of the emotion.

Recognition rates of posed vocal expressions and real emotional speech have been compared. Barkhuysen, Krahmer, and Swerts (2007) examined the perception of audiovisual expressions of posed and real emotions in spoken language. Dutch participants were shown positive or negative sentences and asked to say the sentence in either a positive or negative way. When the mood of the sentence matched the way it was said, the expression counted as real. When the mood of the sentence was the opposite of how it was said, the expression counted as posed. After that, Czech participants had to judge these sentences, which they could not understand, in different settings: only audio, only visual, or audio-visual combined. Recognition rates were higher for posed emotional speech than for non-posed speech. In addition, the difference between recognition of posed and real speech was larger in negative speech than in positive speech. Real speech was judged as a less extreme display of emotion than posed expressions. Laukka, Audibert, and Aubergé (2012) suggest that posed expressions are acoustically distinct from spontaneous expressions and that the difference is therefore highly recognizable. Expressions that were posed were recognized as more typical and of higher intensity than spontaneous ones (Laukka et al., 2012).

So far, the comparison between posed and spontaneous vocal expressions has been discussed for speech. There are also studies of non-verbal vocalizations, such as spontaneous outbursts of laughter, screams and cries (Cowie & Cornelius, 2003; Scherer, 2003). Barker (2013) found that recognition of spontaneous vocalizations was less accurate than recognition of posed vocalizations. Based on these findings, Schenk (2013) compared enacted, acted and spontaneous vocalizations and found that enacted stimuli were more easily recognized than spontaneous ones. In addition, Schenk (2013) showed that differences in prototypicality and intensity did not predict emotion recognition. This contradicts what has been said by Laukka, Audibert, and Aubergé (2012). Therefore, additional research should clarify this matter.

As we have seen, as with facial expressions, naturalness is still a problem in posed vocal emotion expressions. The lack of experimental control makes it harder to determine the precise nature of the vocal expression, and the effect of regulation is problematic (Scherer, Clark-Polner & Mortillaro, 2011). Spontaneous vocalizations of emotions could, like facial expressions, be affected by emotional signal clarity (Matsumoto, Olide, Schug, Willingham, & Callan, 2009).

To sum up, vocal expressions seem to have the same problem as facial expressions. People can distinguish between specific emotional vocal expressions. A few emotions are better recognized than others, although overacting and the influence of stereotypes might be a problem. Recognition of posed vocal expressions is more accurate than recognition of spontaneous vocal expressions. Recent studies have compared posed and spontaneous stimuli within one experiment. However, with spontaneous vocalizations there is no consensus yet.

Face versus voice

Thus far, we have seen that both facial expressions and vocal expressions can be displayed as either posed or spontaneous expressions. Recognition rates are influenced by the different modalities and communication channels. Some emotions are better recognized in vocalizations than in facial expressions and vice versa. Happiness, for example, is best recognized from the face, but least from the voice. In contrast, anger and sadness are best recognized from the voice, but relatively less from the face (Elfenbein & Ambady, 2002). Hawk et al. (2009) found that expressions of anger, contempt, disgust, fear, sadness, and surprise were better recognized through vocalizations. In contrast, joy, pride, embarrassment and neutral expressions had higher recognition rates in posed facial expressions. In a way, the two channels seem to complement each other. Combining both communication channels discussed above could be useful to understand the contribution of each channel.

Multiple channels

Recently, a few studies have integrated the use of multiple communication channels to examine emotion recognition. Busso et al. (2004) compared recognition rates from only one channel with recognition rates from multiple channels. They found that recognition was more accurate when a multi-modal expression was used than when a uni-modal signal was used. Another study, by Paulmann and Pell (2011), combined facial, semantic and prosodic expressions of emotion. Their results also showed an advantage for multiple emotional stimuli over only one. Preference for a communication channel was studied by Collignon, Girard, Gosselin, Roy, Saint-Amour, Lassonde, and Lepore (2008). Participants saw a photograph of either a fearful or disgusted posed facial expression, heard the sound of the specific emotion, or heard and saw both. Audio-visual stimuli could be either incongruent, with different expressions in the two modalities, or congruent, with the same expression in the two modalities. People preferentially turned to the visual part of emotional stimuli when these were incongruent. However, if the visual stimulus was not reliable enough, people preferentially turned to the audio part. Recognition is therefore, as Collignon and colleagues argue, flexible and situation-dependent. Additional research is necessary in this area to see how different emotional communication channels interact with each other. Research has mostly focused on vocal and face stimuli, leaving the rest of the body relatively understudied. Recent studies have shown that postural expressions also convey emotion-specific information (Cowie et al., 2010). Pairing facial expressions with congruent posed body postures, such as a defensive pose for acting out fear, made recognition of the emotion faster than with only a posed facial expression (Meeren, van Heijnsbergen, & de Gelder, 2005).

The research on multiple channels, however, has almost exclusively used posed stimuli. To examine the influence of spontaneous stimuli, Poell (2014) tested the ecological validity of enacted and spontaneous multi-modal nonverbal vocalizations of emotion. Surprisingly, the results were the opposite of those found for single channels: for the multi-modal stimuli, spontaneous expressions were better recognized than posed ones. It could be, as was already suggested by Motley and Camden (1988), that spontaneous expressions need context to be better recognized. Multiple channels could provide that context. Further research, however, is needed to confirm this.

Conclusion and discussion

To summarize, (1) for static stimuli, recognition rates in studies that have used spontaneous stimuli are lower than in studies that have used posed stimuli. (2) For vocal expressions of emotion, posed stimuli are recognized more accurately than spontaneous stimuli. (3) The use of multi-modal stimuli raises recognition rates for posed stimuli, but disproportionately so for spontaneous stimuli.

The trend in emotion research seems to be that expressions should be as natural, authentic, and spontaneous as possible, resembling a real life situation. According to Scherer (2013):

“Researchers need to (1) be able to distinguish between what is true and what is false, or what is real and what is artificial or faked; (2) obtain access to the true and valid expressions; (3) ascertain that all requirements for experimental control in scientific research are fulfilled; and (4) make sure the expression material is appropriate for the question under investigation” (Scherer, 2013, p. 41).

As we have seen in this overview, selecting the right emotional stimuli is not that easy. The first point made by Scherer (2013) already raises questions. Terms such as posed and spontaneous, deliberate and non-deliberate, voluntary and involuntary, or universal and culture-specific are often used in emotion research. However, the Duchenne smile, for example, which has been thought to be a spontaneous sign of felt enjoyment that cannot be faked, apparently can be faked. Krumhuber and Manstead (2009) pointed out that people often produced a Duchenne smile deliberately, and that not all non-Duchenne smiles were posed; some were spontaneous. So, we have to be careful with interpreting spontaneous expressions as true or real. The same holds for deliberate and non-deliberate, or voluntary and involuntary: if we can control the expressions in our face or voice, control does not necessarily mean artificial or faked. And a universal emotion is not necessarily the only true, realistic emotion, because culture-specific emotions might be just as real. Researchers should define and operationalise true and false, be explicit about it, and leave it to the reader to agree or not. Some caution is therefore needed when comparing articles.

The second point, obtaining access to true and valid expressions, follows from the previous one. Not only the choice between posed and spontaneous expressions, but also the communication channel of the expression is essential. The emotional experience is more than just a facial or vocal expression. Expressions through posture have also been studied, although not as much as facial and vocal expressions. An example is the study by Coulson (2004), which found that emotions such as anger, sadness, and joy are recognized just as well through posed body posture as through facial expression. Tracy and Matsumoto (2008) also studied emotional expressions in posture and face. They looked at sighted Olympic and blind Paralympic athletes just after winning or losing a match and found that the expression of pride and shame was linked to success and failure. Posture expressions were the same for blind and sighted athletes, which suggests that posture expressions can be linked to a certain emotion. Blind athletes have never seen such an expression, so it must be something innate (Tracy & Matsumoto, 2008). Research on different communication channels should compare results across channels. In line with this review of the difference between posed and spontaneous expressions, the same comparison could be made for posture expressions of emotion.

It also makes sense to use more channels to create a realistic emotional expression, since real-life situations include multiple channels as well. For example, which emotions are better recognized when the stimulus set combines facial and vocal expressions, facial and posture expressions, or posture and vocal expressions?

Scherer’s third point was the need to ascertain that all requirements for experimental control in scientific research are fulfilled. Comparing posed and spontaneous expressions requires actual comparisons within a single study. We have seen that for facial expressions, only two studies compared recognition rates of posed and spontaneous stimuli in the same study. Motley and Camden (1988) were early with their research; it took about twenty years before another such study was done, by Krumhuber and Manstead (2009), and that study concerned only smiles. Because it is unclear whether the Duchenne smile is a reliable indicator of real happiness, further research should compare posed and spontaneous versions of different facial expressions of emotion, as has been done in vocal emotion research (Barkhuysen, Krahmer, & Swerts, 2007; Audibert, Aubergé, & Rilliard, 2008; Laukka, Audibert, & Aubergé, 2012).

The last point mentioned by Scherer (2013), keeping in mind the question under investigation, seems a logical yet difficult one. As mentioned in the introduction, researchers use observable expressions and actions to gain insight into an internal phenomenon such as emotion. In this case, expressions are used as indicators of internal states. However, the link between expressions and emotional experience is not entirely straightforward. Fernandez-Dols, Sanchez, Carrera, and Ruiz-Belda (1997) questioned whether there is actual coherence between experience and spontaneous facial expression. They argued that certain emotions are not a necessary or sufficient precondition of certain spontaneous expressions. On the other hand, a recent meta-analysis pointed out that there is coherence (Lench, Flores, & Bench, 2011). Furthermore, Scherer and Ceschi (1997, 2000) show that we have to be careful with recognition rates of spontaneous expressions collected in a natural setting. In their research, they analyzed vocalizations of airline passengers who were talking to airport agents about their lost baggage, which is considered an anger-evoking situation. Only half of the passengers reported that they felt angry and most of them claimed to be in good humor. Inferring an emotional state on the basis of the situation thus has its flaws. However, the alternative of using self-reports also has its limitations. No good way to measure emotional states has yet been found.

In much emotion recognition research, the central question also has to do with the universality of emotions. Ekman and colleagues (Ekman, Sorenson, & Friesen, 1969; Ekman & Friesen, 1971; Ekman, 1972) argued, on the basis of their research, that there is a set of universally recognized, so-called ‘basic’ emotions: happiness, surprise, fear, anger, contempt, disgust, and sadness. On this view, these emotions are recognized from facial expressions by all human beings, regardless of their cultural background. Critics such as Russell (1994) argued that drawing conclusions about universality based on posed stimuli is not justified. If recognition rates differ for spontaneous stimuli, what does this mean for the universality claim? Some have argued against the universality of emotion recognition (Naab & Russell, 2007), while others have proposed reasons other than non-universality to explain the different recognition rates (Matsumoto et al., 2009). The debate is still ongoing.

Since there is little evidence on the comparison between posed and spontaneous stimuli in facial expressions, future research should start by comparing posed and spontaneous dynamic facial expressions. As we have seen, the trend is to make stimuli as realistic as possible. Stimuli may lack essential cues used for the differentiation of emotions when they are static. Including both static and dynamic displays of posed and spontaneous facial expressions would therefore allow a less realistic situation to be compared with a more realistic one. Additionally, multiple channels should be examined.

Literature

Amir, N., Ron, S., & Laor, N. (2000). Analysis of an emotional speech corpus in Hebrew based on objective criteria. In ISCA Tutorial and Research Workshop (ITRW) on Speech and Emotion.

Atkinson, A. P., Dittrich, W. H., Gemmell, A. J., & Young, A. W. (2004). Emotion perception from dynamic and static body expressions in point-light and full-light displays. Perception, 33, 717-746.

Audibert, N., Aubergé, V., & Rilliard, A. (2008). How we are not equally competent for discriminating acted from spontaneous expressive speech. In P. A. Barbosa, S. Madureira, & C. Reis (Eds.), Proceedings of the Speech Prosody 2008 Conference (pp. 693-696). Campinas, Brazil: Editora RG/CNPq.

Banse, R., & Scherer, K. R. (1996). Acoustic profiles in vocal emotion expression. Journal of personality and social psychology, 70(3), 614.

Bänziger, T., & Scherer, K. R. (2010). Introducing the Geneva Multimodal Emotion Portrayal (GEMEP) corpus. Blueprint for affective computing: A sourcebook, 271-294.

Barker, P. L. (2013). Recognition of spontaneous and acted vocalizations of emotion. MSc thesis, University of Amsterdam.

Barkhuysen, P., Krahmer, E., & Swerts, M. (2007). Cross-modal perception of emotional speech. ICPhS, Saarbruecken, Germany, 2133-2136.

Batliner, A., Fischer, K., Huber, R., Spilker, J., & Nöth, E. (2000). Desperately seeking emotions or: Actors, wizards, and human beings. In ISCA Tutorial and Research Workshop (ITRW) on Speech and Emotion.

Buck, R., & VanLear, C. (2002). Verbal and nonverbal communication: Distinguishing symbolic, spontaneous, and pseudo-spontaneous nonverbal behavior. Journal of Communication, 52(3), 522–541.

Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C. M., Kazemzadeh, A., ... & Narayanan, S. (2004). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205-211). ACM.

Carroll, J. M., & Russell, J. A. (1997). Facial expressions in Hollywood's portrayal of emotion. Journal of Personality and Social Psychology, 72(1), 164.

Cohn, J., & Schmidt, K. (2004). The timing of facial motion in posed and spontaneous smiles. International Journal of Wavelets, Multiresolution and Information Processing, 2(2), 121–132.

Collignon, O., Girard, S., Gosselin, F., Roy, S., Saint-Amour, D., Lassonde, M., & Lepore, F. (2008). Audio-visual integration of emotion expression. Brain research, 1242, 126-135.

Coulson, M. (2004). Attributing emotion to static body postures: Recognition accuracy, confusions, and viewpoint dependence. Journal of nonverbal behavior, 28(2), 117-139.

Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor, J. G. (2001). Emotion recognition in human-computer interaction. Signal Processing Magazine, IEEE, 18(1), 32-80. Corpus. In LREC.

Cowie, R., & Cornelius, R. R. (2003). Describing the emotional states that are expressed in speech. Speech communication, 40(1), 5-32.

Diener, M. L., & Mangelsdorf, S. C. (1999). Behavioral strategies for emotion regulation in toddlers: Associations with maternal involvement and emotional expressions. Infant Behavior and Development, 22(4), 569-583.

Ekman, P. (1972). Universals and cultural differences in facial expressions of emotion. In J. Cole (Ed.), Nebraska Symposium on Motivation, 1971 (Vol. 19, pp. 207–282). Lincoln: University of Nebraska Press.

Ekman, P. (1980). Asymmetry in facial expression.

Ekman, P., & Friesen, W. V. (1971). Constants across cultures in the face and emotion. Journal of personality and social psychology, 17(2), 124.

Ekman, P., Sorenson, E. R., & Friesen, W. V. (1969). Pan-cultural elements in facial displays of emotion. Science, 164, 86–88.

Elfenbein, H. A., & Ambady, N. (2002). On the universality and cultural specificity of emotion recognition: A meta-analysis. Psychological Bulletin, 128, 203–235.

Fernandez-Dols, J. M., Sanchez, F., Carrera, P., & Ruiz-Belda, M.-A. (1997). Are spontaneous expressions and emotions linked? An experimental test of coherence. Journal of Nonverbal Behavior, 21(3), 163–177.

Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129(5), 770.

Krumhuber, E. G., & Manstead, A. S. (2009). Can Duchenne smiles be feigned? New evidence on felt and false smiles. Emotion, 9(6), 807.

Laukka, P., Audibert, N., & Aubergé, V. (2012). Exploring the determinants of the graded structure of vocal emotion expressions. Cognition and Emotion, 26(4), 710–719.

Lench, H. C., Flores, S. A., & Bench, S. W. (2011). Discrete emotions predict changes in cognition, judgment, experience, behavior, and physiology: a meta-analysis of experimental emotion elicitations. Psychological Bulletin, 137(5), 834–855.

Matsumoto, D., & Ekman, P. (1988). Japanese and Caucasian facial expressions of emotion (JACFEE)[Slides]. San Francisco, CA: Intercultural and emotion research laboratory, department of psychology, San Francisco State University.

Matsumoto, D., Olide, A., Schug, J., Willingham, B., & Callan, M. (2009). Cross-cultural judgments of spontaneous facial expressions of emotion. Journal of Nonverbal Behavior, 33(4), 213–238.

Matsumoto, D., & Willingham, B. (2009). Spontaneous facial expressions of emotion of congenitally and noncongenitally blind individuals. Journal of Personality and Social Psychology, 96(1), 1–10.

Mesquita, B., & Frijda, N. H. (1992). Cultural variations in emotions: a review. Psychological bulletin, 112(2), 179.

Miles, L., & Johnston, L. (2007). Detecting happiness: Perceiver sensitivity to enjoyment and non-enjoyment smiles. Journal of Nonverbal Behavior, 31(4), 259–275.

Motley, M. T., & Camden, C. T. (1988). Facial expression of emotion: A comparison of posed expressions versus spontaneous expressions in an interpersonal communication setting. Western Journal of Communication, 52(1), 1-22.

Naab, P. J., & Russell, J. A. (2007). Judgments of emotion from spontaneous facial expressions of New Guineans. Emotion, 7(4), 736.

Niedenthal, P. M., Krauth-Gruber, S., & Ric, F. (2006). Psychology of emotion: Interpersonal, experiential, and cognitive approaches. Psychology Press.

Paulmann, S., & Pell, M. D. (2011). Is there an advantage for recognizing multi-modal emotional stimuli? Motivation and Emotion, 35(2), 192-201.

Poell, L. S. (2014). Multi-modal Emotion Recognition from Spontaneous vs. Acted Vocalizations of Emotion. MSc thesis, University of Amsterdam.

Ruiz-Belda, M., Fernandez-Dols, J., Carrera, P., & Barchard, K. (2003). Spontaneous facial expressions of happy bowlers and soccer fans. Cognition and Emotion, 17(2), 315–326.

Russell, J. A. (1994). Is there universal recognition of emotion from facial expressions? A review of the cross-cultural studies. Psychological Bulletin, 115(1), 102.

Schenk, N. (2013). Recognition of Emotions in Acted, Spontaneous and Enacted Vocalisations. MSc thesis. University of Amsterdam.

Scherer, K. R. (2003). Vocal communication of emotion: A review of research paradigms. Speech communication, 40(1), 227-256.

Scherer, K. R. (2013). Vocal markers of emotion: Comparing induction and acting elicitation. Computer Speech & Language, 27(1), 40-58.

Scherer, K. R., Banse, R., & Wallbott, H. G. (2001). Emotion inferences from vocal expression correlate across languages and cultures. Journal of Cross Cultural Psychology, 32, 76–92.

Scherer, K. R., Clark-Polner, E., & Mortillaro, M. (2011). In the eye of the beholder? Universality and cultural specificity in the expression and perception of emotion. International Journal of Psychology, 46(6), 401-435.

Scherer, K. R., & Ceschi, G. (1997). Lost luggage: A field study of emotion–antecedent appraisal. Motivation and emotion, 21(3), 211-235.

Scherer, K. R., & Ceschi, G. (2000). Criteria for emotion recognition from verbal and nonverbal expression: Studying baggage loss in the airport. Personality and Social Psychology Bulletin, 26(3), 327-339.

Simpson, J. A., Collins, W. A., Tran, S., & Haydon, K. C. (2007). Attachment and the experience and expression of emotions in romantic relationships: a developmental perspective. Journal of personality and social psychology, 92(2), 355.

Schuller, B., Batliner, A., Steidl, S., & Seppi, D. (2011). Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge. Speech Communication, 53(9), 1062-1087.

Tracy, J. L., & Matsumoto, D. (2008). The spontaneous expression of pride and shame: Evidence for biologically innate nonverbal displays. Proceedings of the National Academy of Sciences of the United States of America, 105(33), 11655–11660.

Van Bezooijen, R. (1984). Characteristics and recognizability of vocal expressions of emotion (Vol. 5). Walter de Gruyter.

Van Heijnsbergen, C. C. R. J., Meeren, H. K. M., Grezes, J., & de Gelder, B. (2007). Rapid detection of fear in body expressions, an ERP study. Brain research, 1186, 233-241.

Van Kleef, G. A. (2010). The emerging view of emotion as social information. Social and Personality Psychology Compass, 4(5), 331-343.

Ververidis, D., & Kotropoulos, C. (2006). Emotional speech recognition: Resources, features, and methods. Speech communication, 48(9), 1162-1181.

Wallbott, H. G. (1998). Bodily expression of emotion. European journal of social psychology, 28(6), 879-896.

Wallbott, H. G., & Scherer, K. R. (1986). Cues and channels in emotion recognition. Journal of personality and social psychology, 51(4), 690.

Williams, L. M., Senior, C., David, A. S., Loughland, C. M., & Gordon, E. (2001). In search of the “Duchenne Smile”: Evidence from eye movements. Journal of Psychophysiology, 15(2), 122-127.

Wilting, J., Krahmer, E., & Swerts, M. (2006, September). Real vs. acted emotional speech. In INTERSPEECH.

Yik, M. S. M., Meng, Z. L., & Russell, J. A. (1998). Adults' freely produced emotion labels for babies' spontaneous facial expressions. Cognition and Emotion, 12(5), 723–730.
