
The Processing of Semantics, World Knowledge, and Irony in Tweets: A Behavioural Study


Academic year: 2021



Michelle Warta
S2729318
Master Neurolinguistics
University of Groningen
Dr. S. Popov
Dr. D. A. de Kok
University of Groningen
August 31, 2020
Word count: 14,181


Abstract

Language processing is necessary in order to communicate properly. There are different processes that are involved during language processing. However, for some aspects of language processing it remains unclear whether they belong to the same process, or whether they are different processes altogether. These aspects include semantics, world knowledge, and irony. Through two experiments, the present study set out to investigate whether the processing of tweets with semantic violations differs from the processing of tweets with world knowledge violations, as well as whether reading tweets containing irony requires different processes than reading literal tweets. The reason for using tweets rather than a more formal language construction is that Twitter, and consequently tweets, play a large role in modern language use. Moreover, this type of study in which tweets and hashtags are used is novel. Fifty-seven participants read tweets and accompanying hashtags in two experiments, after which they had to decide whether the hashtag fit the tweet in Experiment 1, and whether the hashtag was meant ironically or literally in Experiment 2. In the first experiment, a repeated measures ANOVA revealed that there is a significant difference in RTs when comparing semantic violations and world knowledge violations, with the RTs for the world knowledge condition being significantly longer than the RTs for the semantic condition. Moreover, a marginal trend was found when comparing the baseline condition and world knowledge violations, with the RTs for the world knowledge condition being longer than the RTs for the baseline condition. In the second experiment, an independent samples t-test revealed that there is a marginal difference between RTs of literal (baseline) tweets and RTs of ironic tweets, with the RTs for the ironic tweets being longer than the RTs for the literal tweets.
These trends from the two experiments suggest that semantics and world knowledge are processed on different levels, and that there is a difference between the processing of ironic tweets and literal tweets. Interestingly, no significant differences were found when comparing the baseline condition to the experimental conditions in either of the two experiments. This could be because the paradigm that was used in the current study is slightly different from a regular self-paced reading task that was mostly used in previous research. Moreover, it is also possible that this paradigm was not sensitive enough to detect the differences in processing between these conditions. Thus, more research should be carried out using another, more sensitive, paradigm.

Keywords: language processing, tweets, hashtags, semantics, world knowledge, literal language, ironic language


Table of Contents

Abstract
1. Introduction & Background
1.1. Semantic
1.1.1. Semantic Relatedness
1.1.2. The N400
1.2. World Knowledge
1.2.1. How World Knowledge Influences Language
1.2.2. World Knowledge and ERPs
1.2.3. World Knowledge Violations and Negation
1.2.4. Priming, Quantifiers, and Discourse
1.3. Irony
1.3.1. Understanding Irony
1.3.2. Irony and ERPs
1.4. Social Media and Twitter
1.5. #Hashtags
1.6. The Current Study
2. Experiment 1
2.1. Method
2.1.1. Ethical Approval
2.1.2. Participants
2.1.3. Materials
2.1.4. Procedure
2.1.5. Analysis
2.2. Results
2.3. Discussion
3. Experiment 2
3.1. Method
3.1.1. Ethical Approval
3.1.2. Participants
3.1.3. Materials
3.1.4. Procedure
3.1.5. Analysis
3.2. Results
3.3. Discussion
4. General discussion
5. Conclusion
6. References
7. Appendices
7.1. Appendix 1
7.2. Appendix 2
7.3. Appendix 3
7.4. Appendix 4
7.5. Appendix 5


1. Introduction & Background

When a person listens to spoken language or reads written language, the comprehension of said language is fully automated. In this process, multiple parts are necessary for full comprehension: the morphological structure, the syntactic structure, and meaning of the language. The meanings of words, otherwise known as the semantic representations of words, are activated in the brain as we hear spoken language, or see written language. It is generally believed that semantic representations are held within the semantic memory.

It has been argued that sentence processing occurs in a two-step manner (Hagoort & Van Berkum, 2007). During the first step, the meanings of the words in a sentence are interpreted by looking at the meanings of the words that are dictated by the syntax. Secondly, the meaning of the sentence is integrated with additional information, such as world knowledge. An example of this can be seen when reading the following sentence: “The present queen of England is divorced.” (example from Hagoort et al., 2004). Whereas the semantic interpretation of this sentence is immediately clear, the world knowledge of the reader will register this information as false. Thus, sentence processing requires more than only semantics, and it is possible that semantics and world knowledge could play a separate role within this process. Additionally, the evidence provided by the example suggests that the processing of world knowledge relies on context, in this case provided by the reader.

Another aspect that comes into play when processing sentences is irony. Irony, much like the validation of world knowledge in a sentence, relies heavily on context (Calmus & Caillies, 2014). The semantic aspects of an ironic and non-ironic sentence are mostly identical, and the intended meaning only becomes available when taking context into account. An example of this is the sentence “He is bright” (example from Calmus & Caillies, 2014). This could either mean that someone is smart, or, given the right, ironic, context, that someone is an idiot. However, unlike with the validation of world knowledge, irony does not immediately provide a clear interpretation of the sentence from the beginning. The interpretation only becomes clear from context, when the irony is processed. Thus, one could argue that this is yet another distinct process one can go through when reading and interpreting sentences.

This means that the following steps can be followed when processing language: 1) interpretation of the meaning of the literal sentence, 2) validation using world knowledge, and 3) the usage of context to consider other interpretations of the sentence. The combination of all these different aspects of language processing (semantics, world knowledge, and irony) has never been investigated before. It is currently unknown whether these three aspects are distinct processes that are accessed during language processing or are all part of the same process. The current study aims to take a closer look at this question by conducting two experiments.

1.1. Semantic

1.1.1. Semantic Relatedness

Multiple studies have found that semantic relatedness and semantic (in)congruity can affect reaction time (RT) (Fischler & Bloom, 1979; Henik et al., 1983; Huttenlocher & Kubicek, 1983; Schuberth & Eimas, 1977). A study by Schuberth and Eimas (1977) found that RTs were faster for words that were semantically congruent with context when compared to the RTs of words that were shown in isolation. Additionally, words that were semantically incongruent with context resulted in slower RTs than words that were shown in isolation. The authors suggested that semantic context has an effect on RTs, and concluded that not even a full sentence is necessary to induce this effect; even a semantically related word could have a positive effect on RTs. Similarly, Huttenlocher and Kubicek (1983) found that semantic relatedness has a large effect on RTs when naming pictures, but not when reading aloud. They argued that this is due to the differences in recognition processes: how long it takes for written language to be processed versus how long it takes to recognize and recall an object. A study carried out by Pratarelli (1994) looked at RTs for related and unrelated target words, and found that the RTs for unrelated target words were slightly slower, albeit not significantly. Additionally, a study carried out by Schwanenflugel and Lacount (1988) suggested that the effects surrounding semantic relatedness could vary depending on sentence type. This is because some sentences have more restrictions than others.

Fischler and Bloom (1979) and Henik and colleagues (1983) investigated the facilitation and inhibition effects related to semantic relatedness between words that were either highly likely or highly unlikely to occur. The former found that there was more inhibition for words that were unlikely to occur than facilitation for words that were likely to occur, but that facilitation still occurred. Henik and colleagues (1983) found that facilitation occurred when participants were semantically primed before a lexical decision task, whereas inhibition occurred if they were primed before a colour naming task. They suggested that the prime activation was closely related to semantic memory, which in turn activated the target word more quickly. However, Fischler and Bloom (1979) also suggested that, even though the lexical decision task is the most appropriate paradigm to test semantic relatedness and its effects on facilitation and inhibition, a lexical decision task is very unlike reading. They argued that the results of the lexical decision task can thus not be applied to reading processes. Furthermore, Schwanenflugel and Lacount (1988) discussed that, while reading sentences, people will use the meanings of all words in the sentence, and integrate this with the semantic representations that are activated. This means that semantic relatedness is an important factor during sentence processing.

1.1.2. The N400

Brain responses (seen as changes in electrical brain activity) to different stimuli can be measured with various techniques, one of which is Event-Related Potentials (ERPs). Normally, when a reader encounters a surprising or unexpected word in a sentence, the brain responds with a positive deflection 300 to 600 milliseconds (ms) after stimulus onset (P300) (Kutas & Hillyard, 1980). Kutas and Hillyard (1980) were the first to find that semantically anomalous words elicit a slightly different reaction, namely a negative deflection. This negative wave, currently commonly known as the N400, begins around 200 to 300 ms after the presentation of a stimulus, and reaches its peak around 400 ms (Lau et al., 2008). The N400 has not only been elicited in response to words; it can also occur when processing faces, environmental sounds, odours (Kutas & Federmeier, 2011), and pictures (Proverbio & Riva, 2009). Although all these stimuli can render an N400, the localizations of the N400 can differ per stimulus group. According to Lau and colleagues (2008), the N400 can be investigated with two paradigms: the semantic-priming paradigm and the semantic-anomaly paradigm. The semantic-priming paradigm shows a related or an unrelated word before the target word, thus priming the participant. The semantic-anomaly paradigm shows a word that is either congruent or incongruent within the sentence.

The first study looking into the N400 was carried out by Kutas and Hillyard (1980). They looked at short sentences in three conditions: sentences with a semantically appropriate ending, sentences with a semantically anomalous ending, and sentences with a semantically appropriate ending but bigger font size. They found that the sentences with a semantically anomalous ending rendered the biggest N400, whereas sentences with a bigger font size at the end rendered a late positivity. They argued that the N400 gives information about the timing, classification, and interactions of cognitive processes involved in language comprehension.

Three years later, Kutas and Hillyard (1983) followed up on this research by investigating whether the N400 is rendered solely through semantic anomalies, or whether grammatical anomalies can affect this component as well. They used short sentences with semantic errors or syntactic errors. Additionally, the errors did not only occur at the end of the sentence, but in different positions throughout the sentences. They found that semantic anomalies resulted in an N400, and it did not matter where in the sentence the error was positioned. These N400s were found in the central, posterior, and temporal scalp regions. Grammatical anomalies, on the other hand, rendered an N400 as well, but this effect was positioned in other regions. They argued that the differences between an N400 for semantic anomalies and for grammatical anomalies show that the N400 is only due to unexpected words.

After previous findings that an N400 would show regardless of where an error is positioned in the sentence, Kutas and colleagues (1988) decided to investigate this further. Their experiment consisted of short sentences, with semantic anomalies in various positions in the sentences. They replicated the results of Kutas and Hillyard (1983) that the N400 would occur regardless of the position of the error. Moreover, the results showed that the N400 was larger for sentences that contained an anomaly earlier in the sentence, compared to sentences that contained an anomaly later in the sentence.

More research has looked into the N400 since. Gunter and colleagues (1997) compared semantic anomalies to grammatical anomalies, similar to Kutas and Hillyard (1983). They used sentences in four different conditions: semantically appropriate and grammatically appropriate, semantic anomaly and grammatically appropriate, semantically appropriate and grammatical anomaly, and semantic anomaly and grammatical anomaly. As with the research by Kutas and Hillyard (1983), they found that the N400 was specifically linked more strongly to semantic anomalies. However, they also found that semantically appropriate sentences and sentences with a semantic anomaly both showed a positive wave with a peak around 600 ms post stimulus onset (P600) when these sentences contained a grammatical anomaly. They argued that the semantic processes (N400) and the syntactical processes (P600) were interlinked, and seemed to interact.

Kutas and Federmeier (2011) decided to investigate the N400 effect thirty years after the initial discovery by Kutas and Hillyard (1980). They suggested that even though the N400 is thought to be a language measure, it is much more than that. They argued that the N400 can be used to investigate how semantic information is stored in what is called the semantic memory. Moreover, they said that the N400 was additionally found in repetition tasks, thus the N400 can be used to investigate recognition memory. This is because old, seen-before words render a smaller N400 than new, never-seen-before words.


1.2. World Knowledge

1.2.1. How World Knowledge Influences Language

Research has also looked into whether our world knowledge, in addition to (lexical) semantic violations, affects language processing. Research carried out by Chaffin (1979) investigated whether two types of inferences, necessary inferences and invited inferences, had different effects on RTs. Necessary inferences are dependent on linguistic knowledge, whereas invited inferences depend on world knowledge. The results showed that necessary inferences resulted in faster RTs, but only when participants had to verify only one inference-type. When participants were unaware which sentences would require world knowledge beforehand, the RTs for both inference-types were comparable. These results imply that world knowledge is part of linguistic processing and sentence comprehension, and that it can only be ignored on very rare occasions.

Kounios and Holcomb (1992) investigated semantic memory. They argued that semantic memory is the cognitive structure mostly involved with both the storage and the processing of world knowledge, implying that semantic memory is vital for successful language processing and language comprehension. They found that trials where sentences needed to be verified resulted in slower RTs. Thus, the verification of world knowledge is what leads to slower RTs in these cases. Moreover, they implied that this verification of the sentences happens quite late during processing.

Chwilla and Kolk (2005) supplied some slightly different results. Participants were asked to look at word triplets. These triplets were either script-related or script-unrelated. They found that script-related triplets led to faster RTs than script-unrelated triplets. This implies that RTs become faster when the next word in a triplet, or alternatively a sentence, is expected. However, with world knowledge violations, the word is unexpected and inappropriate, which would suggest that RTs would be slower.

Finally, Duffy and Keir (2004) carried out eye-tracking experiments incorporating stereotypes and context to examine the effects of world knowledge on language processing. The stereotypes were professions, for example an electrician, which is typically a male profession. When there was a mismatch between the gender and stereotypes, this would interfere with the reading processes. The results showed longer fixation times for sentences that contained mismatches between gender and stereotypes. This implies that gender roles play a role with context, and thus with world knowledge. Moreover, it implies that people will fixate longer on world knowledge violations, as if to verify what they have read. Another eye-tracking study carried out by Cook and Myers (2004) had similar results. They used sentences with a script, like ‘rock band’. In a later sentence, they would use a script action that would either be appropriate or inappropriate for the script role. They found that the inappropriate action words showed longer fixation times than appropriate action words. However, if the inappropriate word had been encountered before, the results were slightly different. The previous encounter of an inappropriate word led to an advantage, for example earlier recognition and thus less fixation time. They suggested that these results cannot be explained with the help of models about general world knowledge, as these models imply that accessing world knowledge occurs before contextual processing during sentence comprehension. Their results suggest that semantics and world knowledge do not occur before contextual processing, but that it is an interactive process.

1.2.2. World Knowledge and ERPs

More research has been carried out to investigate sentences that are semantically correct, but that are incorrect with respect to the world knowledge of the reader. One study compared the ERPs elicited by native Dutch speakers in response to correct sentences, sentences with a semantic violation (e.g. “The Dutch trains are sour and very crowded”), and sentences with a world knowledge violation (e.g. “The Dutch trains are white and very crowded”) (Hagoort et al., 2004). The latter example counts as a world knowledge violation as only yellow trains existed in the Netherlands at that time, and no white ones. They found an N400 effect for sentences with semantic violations, as expected. However, they also found a similar effect in sentences with a world knowledge violation. Therefore, they argued that lexical semantic knowledge and general world knowledge are integrated into sentence processing at the same time. These results have been replicated by other studies (e.g. Hagoort & Van Berkum, 2007). Moreover, Hagoort and colleagues (2004) suggested that detecting whether a sentence is semantically incongruous takes just as long as detecting whether a sentence is true or not based on our world knowledge.

The number of studies investigating world knowledge violations using ERPs has further increased in the past few years. Metzner and colleagues (2015) examined world knowledge violations using German sentences in two conditions: correct sentences and world knowledge violations. They found a large N400 effect in the world knowledge violation condition in comparison with the correct condition in addition to a positivity in the frontal to frontotemporal region. Dudschig and colleagues (2016) looked at short German sentences, in which some sentences were correct, some contained a semantic violation, and some contained a world knowledge violation, similar to Hagoort et al. (2004). They found a significant difference in N400 amplitudes between correct sentences and sentences containing a semantic violation, as well as between correct sentences and sentences containing a world knowledge violation. However, they additionally found a significant difference in amplitude latency (270 ms after word onset) between sentences that contained a semantic violation and sentences that contained a world knowledge violation, suggesting that the N400 effect in each of these conditions was not completely identical. They argued that this might signify a difference in the integration of linguistic and non-linguistic knowledge sources.

1.2.3. World Knowledge Violations and Negation

World knowledge violations have also been investigated in combination with negation. A study by Haase et al. (2019) looked at negation and world knowledge in the language processing of native German speakers by presenting them with short sentences such as “George Clooney is an actor” or “George Clooney is not an actor”. These sentences were either affirmative or negative. They found that the N400 effect associated with world knowledge was larger for false affirmative sentences than true affirmative sentences, whereas the N400 effect was smaller for false negative sentences than for true negative sentences. They also found an effect in the right posterior region, which was more activated when reading false affirmative sentences. They suggested that the larger N400 amplitude for affirmative sentences could come from the possibility that anticipation of the upcoming content plays a role. This does not play a role as much in negative sentences, as it is more difficult to anticipate what someone or something is not. Dudschig et al. (2019) carried out research with native German speakers as well, only with three types of sentences: correct, world knowledge violations, and semantic violations. All three sentence types contained both affirmative and negative sentences. They found an N400 effect for both sentences containing a world knowledge violation and sentences with a semantic violation. However, negation did not alter the N400 effect, unlike the results found by Haase and colleagues (2019). More importantly, Dudschig and colleagues (2019) suggested that the difference in N400 size was due to the association between the noun and the adjective. Furthermore, they argued that the N400 does not reflect the full sentence processing, as the time point to which the N400 is locked and thus measured is not always the point at which the meaning of the sentence is reflected. This would imply that only part of the meaning of the sentence can be understood and interpreted at that point, and thus reflected in the N400. Other studies, however, have suggested that if the sentence is predictable, the full interpretation of the meaning of the sentence takes place immediately when reading, and thus is reflected in the N400 (Nieuwland, 2016).

1.2.4. Priming, Quantifiers, and Discourse

Finally, studies looking into world knowledge violations in combination with various other aspects of language will be discussed. Chwilla and Kolk (2005) investigated native Dutch students to examine if script priming affected the N400 when reading word triplets. In their first experiment, they found an N400 for script-related triplets in the frontal midline site and left hemisphere, unlike earlier research into the N400, which found activation in the bilateral central, posterior maximum, and right hemisphere (Kutas et al., 1988). In their second experiment they asked the participants after each word triplet whether the words made up a plausible scenario, based on their world knowledge. They found an N400 for script-related triplets as well, but this time the activation accompanying this was more widespread across the scalp. The findings regarding world knowledge violations and semantic violations in script priming replicated earlier work (e.g. Hagoort et al., 2004).

Nieuwland (2016) examined native English speakers processing quantifiers in sentences. The stimuli were short sentences containing quantifiers such as ‘many’ or ‘few’ and were divided into four conditions: true-positive, false-positive, true-negative and false-negative. Nieuwland (2016) found that all conditions showed a positive peak (P2), followed by an N400. However, the N400 was bigger for false-positive than for true-positive. As previous research claimed that the N400 was insensitive to quantifiers, Nieuwland (2016) argued that his results show that those claims were not fully accurate. Furthermore, he suggested that full quantifier interpretation is possible, if the quantifier is incorporated into the predictions for upcoming words successfully, and if these predictions are based on world knowledge. Thus, if the sentence is predictable based on world knowledge, quantifiers can be interpreted properly, and thus the meaning of the full sentence can also be interpreted properly.

Additionally, some studies have looked into world knowledge violations in combination with discourse context. Hald and colleagues (2007) investigated native Dutch speakers, using four sentence conditions: correct according to world knowledge (WK+), world knowledge violation (WK-), discourse, and baseline. They found a significant difference in the N400 between the WK+ condition and the WK- condition, as well as between. The largest N400 amplitude they found was for the WK- condition being incorrect according to both discourse and world knowledge in long-term memory. They argued that these results imply that different types of information, such as semantics, syntax, and world knowledge, are used at the same time during the interpretation of the sentence, as earlier studies have also suggested (Hagoort et al., 2004; Hagoort & Van Berkum, 2007). Moreover, the results suggest that world knowledge and discourse interact.

Nieuwland and Van Berkum (2006) also investigated the interaction between discourse and world knowledge violations in sentence processing. They investigated native Dutch speakers, using short stories made up of six sentences each, which the participants had to read. These short stories either contained a man (animate) or an object (inanimate). For the first experiment, the inanimate condition in the story would contain an animacy violation, functioning as a world knowledge violation. It was found that the N400 was biggest for the inanimate object. For the second experiment, the inanimate object would behave like an animate object (e.g. a peanut was dancing). Again, there were two conditions at one point during the story, only this time the inanimate object would either be in a canonical state (‘the peanut was salted’) or in an animate state (‘the peanut was in love’). Interestingly enough, they found an N400 effect for the canonical condition, but no N400 for the animate condition. They argued that discourse can change the reaction to animacy violations, and thus world knowledge violations. Moreover, this implies that discourse can overrule lexical-semantic violations, and that these violations are thus very sensitive to discourse context.

1.3. Irony

1.3.1. Understanding Irony

The interpretation of a sentence is not solely reliant on semantics or world knowledge. Instances in which something is said, but something else is meant, are commonly known as examples of non-literal language. Irony is one of these non-literal linguistic features. According to Attardo (2000), irony is not specifically about expressing alternative meanings, but rather about expressing one specific opposite meaning. Furthermore, irony can be divided into multiple subdivisions, namely verbal irony and situational irony. Verbal irony is a linguistic phenomenon, whereas situational irony is described as a state of the world that can be seen as an ‘unhappy coincidence’, for example the fire station burning down (example from Attardo, 2000). For the purpose of the current study, verbal irony will be discussed in more detail. Verbal irony is a complex part of language, as it carries a different meaning than what is actually said (Akimoto et al., 2014). According to Curcó (2000), there are various theories concerning verbal irony. Verbal irony can be described as echoic use of language or as a form of indirect negation. Calmus and Caillies (2014) argue that irony is a part of non-literal language that has no specific semantic criteria that are identifiable. Alternatively, Attardo (2000) suggests that irony can be classified as inappropriate comments that are still relevant for context. Nevertheless, (verbal) irony can be best described for its foundation on the contrast between context and the ironic statement that is being made (Calmus & Caillies, 2014). This is why, according to Akimoto and colleagues (2014), it is so difficult to understand irony. In order to understand irony completely, one would need knowledge of irony, as well as knowledge of context. To shed some more light onto the topic of irony, Akimoto and colleagues (2014) studied irony and its place within the brain during language processing. They concluded that the right anterior superior temporal gyrus is responsible for the understanding of irony. Understanding the context in which irony occurred was placed more in the medial prefrontal cortex and anterior inferior temporal gyrus, which influenced the knowledge of irony. The comprehension of irony, alternatively, had more activation in the amygdala, hippocampus and parahippocampal gyrus.

Some have argued that irony is simply a form of humour, and that some people might be more fluent in this form of humour than others. Calmus and Caillies (2014) investigated how much of a role humour plays during the processing of irony. They showed their participants word pairs, which were either ironic or literal. The participants then graded the pairs, first on their contrast, and second on their humour. It was found that ironic pairs were rated as having both more contrast and more humour than literal pairs. However, the humour was rated highest when contrast was moderate: if the contrast was higher, the humour rating would decrease. They argued that this could be because the contrast was too prominent if it surpassed ‘moderate’, which made it more difficult for the participants to understand the irony in the pairs, and thus resulted in a lower humour rating.

Some additional studies looked into the processing of irony. Giora and colleagues (Giora, 1997; Giora & Fein, 1999), for example, found that, during sentence processing, the more important and thus salient meaning is processed first, regardless of which meaning is intended to be activated. Later, if a less salient meaning is meant to be activated, it will be activated, but the more salient meaning will still be activated simultaneously. In ironic sentences, the less familiar targets render literal salient interpretations first. Only later will the ironic less salient interpretation become available. In ironic sentences with familiar targets, both literal and ironic interpretations will become available immediately. They have worded these findings into the salience hypothesis, in which ironic (less salient) interpretations will either be available at the same time or after literal (more salient) interpretations, but never before.

Filik and Moxey (2010) investigated the differential language processes for reading ironic and literal sentences. They found that reading ironic sentences takes longer compared to literal sentences. They argued that this is in line with the salience hypothesis put forward by Giora and colleagues (Giora, 1997; Giora & Fein, 1999), as the readers had both the literal and the ironic interpretation in mind when reading these sentences. According to the hypothesis, literal interpretations are accessed immediately, and ironic interpretations are accessed later, as these require additional processes before they can be accessed. Moreover, they suggested that reading times are longer for ironic sentences because readers may read these sentences again in order to access the proper interpretation. This does not happen for literal sentences, as the right interpretation is accessed immediately. Spotorno and Noveck (2014) focussed on the processes when reading ironic sentences. They let people read sentences that were either ironic or literal, and measured reading times. In their first experiment, they found that the reading times for ironic sentences were longer than the reading times for literal sentences, but they argued that this was only the case because they used decoys and fillers as well. They suggested that if no decoys or fillers were used, the participants would be able to anticipate the ironic sentences, thus shortening the reading times. In their second experiment, they removed the decoys and fillers, and reading times for ironic sentences declined over the course of the experiment. This implies that, if participants cannot anticipate irony, reading times for ironic sentences would be longer than those for literal sentences.

1.3.2. Irony and ERPs

However, irony has not only been investigated in behavioural studies, but also using EEG and ERPs. According to Spotorno and colleagues (2013) and Symeonidou (2018), understanding irony requires much more than the understanding of linguistic aspects such as syntax or semantics. Understanding the processing of irony also requires an understanding of the speaker’s mind. Even though the appreciation for irony starts developing during childhood, a full appreciation for irony only becomes available in adolescence. Symeonidou (2018) investigated the difference between the processing of irony in adults (24-34 years old) and adolescents (11-18 years old). She found that irony elicited a positive effect in the form of a P600 in both adults and adolescents. However, the results suggested adults and adolescents process irony slightly differently. Different brain regions were activated in adolescents in comparison to adults during the processing, which Symeonidou (2018) suggested might be due to structural changes in the brain that occur in adolescence. Moreover, a correlation was found with the P600 and empathy score for adults, but not for adolescents. Symeonidou (2018) believed that this might be due to an ongoing development in adolescents.

It is not straightforward, however, that the processing of irony would elicit a P600. According to Spotorno et al. (2013), the classic view of processing should lead to an N400 effect with regard to the processes involved in irony. This is because ironic sentences are incongruent with the context. However, generally no significant N400 effect has been found for irony processing (Amenta & Balconi, 2008; Regel et al., 2010, 2011; Spotorno et al., 2013). Amenta and Balconi (2008) did find a small but non-significant increase in N400 during the processing of irony. They suggested that the processing of ironic sentences required more cognitive resources than the processing of literal sentences. They concluded that this must mean that irony is not similar to a semantic anomaly, as it does not render an N400, and that something else happens in the brain. Furthermore, Regel and colleagues (2011) suggested that critical words and their meanings seemed to play a role, since irony is normally incongruent with the context, but renders no N400. Another study added that irony processing might not only be the processing of incongruencies with context, as previous research found that context can influence ERP outcomes (Spotorno et al., 2013).

Nevertheless, some studies did find an N400 during irony processing. Filik and colleagues (2014) found a larger N400 for unfamiliar ironies than for familiar ironies or literal sentences. This may suggest that the N400 was caused by the presence of an unexpected (unfamiliar) aspect, rather than the irony itself. Caillies and colleagues (2019) carried out an auditory experiment and found an N400 that was larger for ironic praise than for literal sentences. Additionally, they found a larger N400 for literal sentences than for ironic criticism. They concluded that the N400 seemed to be affected by prosody, as a function of emotional information. Moreover, they found some additional ERP results, namely a larger P600 for ironic criticism than for either ironic praise or literal sentences. Filik and colleagues (2014) found a P600 for all ironies, regardless of their familiarity. They suggested that, after the initial delay in processing the unfamiliar ironies, the processes seem to coincide here again for both familiar and unfamiliar ironies. Regel and colleagues (2010; 2011) identified a larger P600 for ironic sentences in comparison to literal sentences. They concluded that the P600 seemed to be related to the processing of irony, as distinct cognitive processes seemed to be at work while processing ironic sentences compared to the processing of literal sentences. Finally, Spotorno and colleagues (2013) also found a P600 for sentences that contained irony. Regel and colleagues (2011) concluded that there can be three separate views of the P600: 1) the P600 is present because of the predictability of the task and the stimuli, 2) the P600 is present because of emotional arousal when processing irony, as emotional arousal has been found to increase the P600, and 3) the P600 is present because the ‘incongruent’ meaning of ironic sentences is reintegrated into context with extralinguistic information. Spotorno and colleagues (2013) added that only one of these views was likely. Decoys in their experiment prevented possibility 1) from happening, and the stimuli did not contrast emotionally with the neutral sentences, preventing 2) from taking place. Thus, they concluded that the P600 is most likely present during the processing of irony because the incongruent meaning is reintegrated into context.

1.4. Social Media and Twitter

However, due to the importance of language and our ever-growing use of technology, language is not a static thing: it is constantly evolving. Therefore, it was not altogether surprising that the language we use daily was affected by a new form of socializing that entered society approximately fifteen years ago: social media (Treese, 2006). This new form came with various platforms and new ways in which people could express themselves. In 2017, approximately 37% of the world population actively used at least one social media application (Salazar, 2017). Among these platforms were Facebook, MySpace, Instagram, Snapchat, and Twitter. Twitter in particular is very interesting. Twitter is an application that allows its users to send and receive short messages, called tweets (Fitton et al., 2009; Weller et al., 2013). Moreover, people can choose to follow updates from a specific person, brand, or organization. This means that the content you see is not limited to people you know in real life, but can be extended to any Twitter user. The reason why Twitter has had such an impact on our language use is the brevity that characterizes Twitter. In 2017, Twitter expanded their original limit of 140 characters to 280 characters, giving the people on Twitter twice as many characters to work with (Rosen, 2017). However, these extra characters were barely used: only 5% of all tweets were longer than 140 characters, only 2% were longer than 190 characters, and only 1% reached the character limit (Rosen, 2017).

1.5. #Hashtags

The use of the hash (#) on Twitter came about in 2008, approximately 1 year after Twitter’s launch (Salazar, 2017). Before this, the hash and the asterisk (*) were given special functions on telephones, such as shortcuts for calling alarm numbers and altering ring time. However, when the hash was used in tweets for the first time, an alternative function was created: tagging. It was thought that the hash could be used to assign a keyword to an object of data, in this case the tweet, so that users could find it easily, and in turn subscribe, follow, mute,
or block other users who used this keyword (Salazar, 2017; Scott, 2015; Treese, 2006), thus creating the hashtag. In using the hashtag as a keyword, a hyperlink is created which makes both organization and classification easier for the users, as well as searching for specific data (Erz et al., 2018; Lee & Chau, 2018; Mulyadi & Fitriana, 2018; Powell, 2015; Scott, 2018). Nowadays, the hashtag is integrated into social media so well that young adults no longer link the hash symbol to ‘number’ or ‘pound’, but rather to a hashtag. The hashtag as a tagging system was only used on Twitter at first, but is now used on various other social media platforms as well (Martín et al., 2016; Mulyadi & Fitriana, 2018; Powell, 2015; Scott, 2018). Moreover, nowadays it is not uncommon for brands and businesses to use hashtags as well, as it is important for companies to keep up with social trends (Powell, 2015).

Since the hashtag has become a common feature within social media, it is not surprising that it has been investigated frequently. Hashtags can be used to convey emotions that would normally become clear through context or facial expressions, such as ‘school is right around the corner #sad’ (example from Lee & Chau, 2018). Additionally, the hashtag is currently also used to create communities within social media, as certain hashtags are only used by specific communities (Mulyadi & Fitriana, 2018). Moreover, the hashtags that are used in these communities can be used to keep track of the discussions within these communities in real time (Powell, 2015). Twitter and hashtags are also both being used for protests all over the world (Lee & Chau, 2018; Recuero et al., 2015). During protests, hashtags are not only used to provide context, but also to steer other users of the social media platform towards their cause, in this case the protests. In sum, in a short amount of time the hashtag has evolved from simply being a tool for categorization, to a tool for raising awareness, particularly if the hashtag becomes trending on social media (Recuero et al., 2015). Finally, the hashtag has also evolved when it comes to our daily spoken language. Scott (2018) found that hashtags are also used in spoken discourse, although this is a more recent phenomenon. Scott suggests that some restrictions fall away when the hashtag is used in spoken discourse, namely 1) the audience is no longer imagined, but rather a visible entity, 2) the length constraint that the speaker is subject to on Twitter disappears, and 3) context can be used more in order to understand both the conversation and the hashtag better. However, Scott (2018) states that, even when these restrictions fall away, it is still unclear whether the spoken hashtag will be used as much as the written hashtag. This is due to the fact that the written hashtag is already incorporated into Twitter and other social media platforms, whereas the spoken form is sometimes still met with resistance (‘you’re not on Twitter, hashtags don’t work here’, example from Scott, 2018).

In sum, semantics, world knowledge, and irony are all part of the process of understanding language, its meaning, and its accuracy. Semantics can be described as the meaning of the language, and often revolves around the (un)relatedness of the target word to the context. However, both world knowledge and irony delve much deeper than meaning, and rely heavily on context. World knowledge relies on context in order to demonstrate the accuracy of the statement, whereas irony relies on context to provide an alternative meaning. How these three aspects work and come together during language processing will be interesting to see, as this combination of semantics, world knowledge, and irony has not been investigated before. Moreover, the added layer of using an alternative form of communicating, namely tweets, is completely novel.

1.6. The Current Study

The current study aims to investigate the three separate aspects of language processing that are discussed above, namely semantics, world knowledge, and irony, using a violation paradigm. These three aspects of language processing are relatively closely related, yet, as the research above shows, vastly different as well. In the current study, these three aspects will be combined with hashtags, in the form of tweets, which has not been done before. We aim to see how the processes that are involved in these three conditions differ and how they overlap and interplay when reading tweets and hashtags. These three aspects of language processing will be investigated through two separate experiments. In the first experiment, semantic violations and world knowledge violations will be compared. Looking at semantic violations and world knowledge violations, the following research question has been established: Does the processing of tweets with semantic violations differ from the processing of tweets with world knowledge violations? In other words, is there overlap between semantic processing and the processing of world knowledge, or do they proceed in a two-step manner, as described by Hagoort et al. (2004)? It is hypothesized that, when encountering a semantic violation in a tweet, the process of reading will take longer than when reading a tweet without a semantic violation, as an unexpected word occurs. The unexpected word has not been accessed, and most likely cannot be accessed easily, due to the semantic unrelatedness of the word and the preceding context. Additionally, it is expected that reading tweets with world knowledge violations takes longer compared to reading tweets without world knowledge violations. This difference in reaction times would likely be due to the verification that is necessary when reading sentences with world knowledge violations; regardless of whether the statement in the tweet is true or not, it will need to be verified in order to comprehend the language properly. It is thus expected that both experimental conditions (semantics and world knowledge) render longer RTs than the baseline condition, and that different processes are accessed when reading tweets from the experimental conditions compared to when reading tweets from the baseline condition. Moreover, we believe that there will be a difference in RTs when reading tweets containing world knowledge violations, compared to when reading tweets with semantic violations. Previous research (Dudschig et al., 2016) has shown that, even though sentences with world knowledge violations and sentences with semantic violations both resulted in an N400, these N400 effects were slightly different. The difference was found 270 ms after word onset, which they believed to be due to a difference in the integration of linguistic knowledge (semantic violations) and the integration of non-linguistic knowledge (world knowledge violations). Therefore, it is expected that semantic violations and world knowledge violations are processed differently and separately, which will result in a difference in RTs. This difference should take the form of longer RTs for the world knowledge condition and slightly shorter RTs for the semantics condition, due to the verification of the information given in the tweets.

The second experiment will focus on irony. For the purpose of investigating the processing of irony in tweets, the following research question has been formulated: Does reading tweets containing irony require different processes than reading literal tweets? Specifically, when reading ironic tweets, does one go through two steps, as was found in research by Giora and Fein (1999)? It is hypothesised that tweets containing irony will have longer RTs, as irony takes longer to process, and should thus take longer to respond to. If the RTs for tweets containing irony are indeed longer, it will be safe to assume that there are different processes at play when reading literal tweets compared to when reading ironic tweets.

2. Experiment 1

2.1. Method

2.1.1. Ethical Approval

The present experiment received ethical approval from the local ethics committee (Research Ethics Committee (CETO), Faculty of Arts, University of Groningen).

2.1.2. Participants

Sixty native Dutch speakers (20 male, mean age 23.4) participated in the experiment. One speaker reported being a Frisian-Dutch bilingual. All participants signed an online informed consent form prior to participation. Participants were asked to provide their age, gender, and highest educational attainment (VMBO, HAVO, VWO, MBO, HBO or university) at the beginning of the experiment. They also had to specify whether they had dyslexia, AD(H)D, or other reading or concentration problems. Three speakers were excluded from the analysis because they reported having dyslexia. The participants who reported AD(H)D were checked for accuracy, on which they performed well. Therefore, their results were included in the analysis. All participants were recruited via a Facebook-page, and all participants were reimbursed 5 euros for their participation in both experiments. During the first experiment, participants were given a personal program-generated code, which they could use to verify their participation with the experiment leader in order to get paid, as well as to have their data removed, should they request this.

2.1.3. Materials

Stimuli were in the form of a tweet in Dutch, followed by a hashtag. The chosen hashtags were carefully selected, such that they would not exceed the four syllable maximum that was adopted for this experiment. Moreover, for consistency, the hashtags were of approximately the same orthographical length. The hashtags all corresponded to various themes throughout the experiment, namely food, places, and colours.

Subsequently, for each hashtag, three tweets were compiled: one baseline tweet in which the hashtag was correct (1), one tweet in which the hashtag caused a semantic violation (2), and one tweet in which the hashtag caused a world knowledge violation (3):

1) Ik ben laatst in Washington geweest #WitteHuis
   I recently went to Washington #WhiteHouse

2) Gezelschapsspellen zijn leuk om te spelen #WitteHuis
   Board games are nice to play #WhiteHouse

3) Ik ben laatst in Parijs geweest #WitteHuis
   I recently went to Paris #WhiteHouse

A total of 180 experimental items were compiled in this way. Additionally, the experiment contained 20 filler stimuli, which were neutral tweets with hashtags that were not used in any of the other conditions, as can be seen in the example (4):

4) Komende zomer ga ik naar Thailand en Indonesië #reizen
   This summer I am going to Thailand and Indonesia #travel

The 180 experimental stimuli were divided into three separate lists, which consisted of 20 baseline tweets, 20 tweets with a semantic violation and 20 tweets with a world knowledge violation (no hashtag was used more than once in different conditions in the same list). All three lists also contained the same 20 filler tweets. All three lists were pseudo-randomized, as shown in Appendices 1, 2, and 3.
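The distribution of items over the three lists can be sketched as a Latin-square rotation. The Python sketch below rests on assumptions that the text does not state: that the 180 items derive from 60 hashtags (180 / 3), that conditions were rotated across lists, and that the names `build_lists` and `hashtag_00` etc. are purely illustrative placeholders. The 20 fillers and the pseudo-randomization are left out.

```python
# Assumed condition labels; the thesis abbreviates them as BL, SEM, and WK.
CONDITIONS = ("baseline", "semantic", "world_knowledge")


def build_lists(hashtags, n_lists=3):
    """Rotate conditions across lists so that every hashtag occurs exactly
    once per list, and in a different condition in each list."""
    lists = [[] for _ in range(n_lists)]
    for item_index, tag in enumerate(hashtags):
        for list_index in range(n_lists):
            condition = CONDITIONS[(item_index + list_index) % len(CONDITIONS)]
            lists[list_index].append((tag, condition))
    return lists


# Hypothetical placeholder hashtags standing in for the real stimuli.
hashtags = [f"hashtag_{i:02d}" for i in range(60)]
lists = build_lists(hashtags)
```

Under these assumptions each list contains 60 experimental items, 20 per condition, and no hashtag occurs twice within a list.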

2.1.3.1. Sentence rating

All baseline tweets and tweets with a world knowledge violation were rated prior to the experiment, to see if the stimuli adhered to general knowledge. The tweets and hashtags were rated by 37 native Dutch volunteers (4 male, mean age 21.9). The participants in this rating were not compensated for their participation.

Participants were asked whether the hashtag fit the tweet. The correct response to baseline tweets would be ‘yes’, whereas the correct response to tweets with a world knowledge violation would be ‘no’. For every tweet, the ‘wrong’ ratings (i.e. ‘no’ for baseline tweets, and ‘yes’ for tweets with a world knowledge violation) were counted, and the percentage of wrong ratings relative to the total number of ratings was calculated. If this percentage was 20% or higher, the tweet was discarded, and a new tweet had to be compiled, using the same hashtag. This was the case for six baseline tweets and two tweets with world knowledge violations.
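The rating criterion can be illustrated with a short Python sketch. The function names and the example ratings are hypothetical; only the 20% threshold and the ‘yes’/‘no’ coding come from the description above.

```python
def wrong_rating_rate(responses, expected):
    """Proportion of raters whose response differs from the expected one."""
    wrong = sum(1 for response in responses if response != expected)
    return wrong / len(responses)


def needs_replacement(responses, expected, threshold=0.20):
    """A tweet is discarded when 20% or more of the ratings are wrong."""
    return wrong_rating_rate(responses, expected) >= threshold


# Hypothetical baseline tweet rated by 37 volunteers, 8 of whom answered 'no'.
example_ratings = ["yes"] * 29 + ["no"] * 8
discard = needs_replacement(example_ratings, expected="yes")
```

For a tweet with a world knowledge violation the same check would be run with `expected="no"`.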

The new tweets were again checked, this time by 13 native Dutch volunteers (3 male, mean age 23.8). Two participants were excluded from the final calculations, as they did not fill in the form to completion. The participants of this rating were also not compensated for their participation.

This time, no tweets had an error rate of 20% or higher. These tweets were used in the final experiment, together with the tweets from the previous rating that did not exceed the 20% error rate.

2.1.4. Procedure

Participants were asked to join the present study via a Facebook-page. They were provided with a link, which would direct them to a Qualtrics-survey (Qualtrics, 2005). In this survey, the participants were provided with all necessary information, after which they gave their informed consent to participate in the study. Subsequently, they were taken to the next page, which would present them with one of twelve lists (all combinations of the three lists of the present experiment and the two lists of the following experiment, counterbalanced for keys). Next, they were redirected to a new window, which would open the PsychoPy experiment (Peirce et al., 2019). First, the participants were asked some descriptive background questions. Following this, they were given the personal program-generated code.

Then, they would be presented with an instruction page, which would specify that a tweet would show up on screen and after 3 seconds, a hashtag would appear. It was then up to the participant to decide whether this hashtag fit the previous tweet, as quickly and as correctly as possible. They could press two keys, one for ‘yes’ (the hashtag fits the tweet) and one for ‘no’ (the hashtag does not fit the tweet). The keys were either ‘p’ or ‘q’, which were counterbalanced per list.
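The twelve versions mentioned in the procedure follow from crossing three Experiment 1 lists, two Experiment 2 lists, and two key mappings (3 × 2 × 2). A minimal Python sketch, in which the list labels and the rotation used to assign participants are assumptions rather than details reported here:

```python
from itertools import product

# Hypothetical labels for the lists of both experiments.
EXP1_LISTS = ("exp1_list1", "exp1_list2", "exp1_list3")
EXP2_LISTS = ("exp2_list1", "exp2_list2")
# (key for 'yes', key for 'no'); both mappings use the 'p' and 'q' keys.
KEY_MAPPINGS = (("p", "q"), ("q", "p"))

VERSIONS = list(product(EXP1_LISTS, EXP2_LISTS, KEY_MAPPINGS))


def version_for(participant_index):
    """Assumed assignment scheme: rotate through the versions in order."""
    return VERSIONS[participant_index % len(VERSIONS)]
```

Rotating through the versions keeps the number of participants per version as equal as possible; a random assignment would be an equally plausible reading of the procedure.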

Next, participants were asked to complete a short practice round consisting of three tweets, presented one tweet at a time. Participants were provided with feedback after every practice stimulus. Afterwards, another instruction screen appeared, stating that the practice round had ended, reminding the participant which keys to press for which option, and stating that the real experiment was about to begin, this time without feedback.

Lastly, the real experiment would begin, with one tweet at a time, and a hashtag appearing after 3s. The whole experiment battery, including experiment 2, would take approximately 30 minutes in total.

2.1.5. Analysis

After all the results from the experiment were gathered, all RTs < 200 ms and all RTs > 5 s were excluded. This was done because RTs < 200 ms would not allow enough time to process the stimuli, whereas RTs > 5 s suggest that the participant was distracted. The remaining RTs were used for calculations. Additionally, the RT data were log transformed prior to statistical analysis.
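The exclusion and transformation steps can be sketched as follows in Python. The cut-offs come from the text; the use of the natural logarithm is an assumption, as the base of the log transformation is not reported, and the function name is illustrative.

```python
import math

MIN_RT_MS = 200   # RTs below this value are excluded
MAX_RT_MS = 5000  # RTs above this value (5 s) are excluded


def trim_and_log(rts_ms):
    """Drop RTs outside the 200-5000 ms window, then log transform the rest.
    The natural logarithm is assumed; the thesis does not state the base."""
    kept = [rt for rt in rts_ms if MIN_RT_MS <= rt <= MAX_RT_MS]
    return [math.log(rt) for rt in kept]


# Hypothetical RTs in ms: the first and last values fall outside the window.
log_rts = trim_and_log([150, 200, 1281.19, 5000, 5200])
```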

For the statistical analysis an Excel file with all data was loaded into RStudio (RStudio Team, 2016). The appropriate statistical assumptions (i.e. normality, homogeneity of variance, and the sphericity of the variables) were examined. A repeated measures analysis of variance (ANOVA, two-sided) was carried out with the three conditions (baseline, semantic violations, world knowledge violations), using the function ‘aov’ in RStudio to determine whether there was any significant difference in RTs between the three conditions. The significance level was set at p < .05. Tukey’s HSD Test (Honestly Significant Difference) was performed as a post-hoc test to determine which conditions differed significantly from each other. The function ‘TukeyHSD’ in RStudio was used for this purpose. A similar approach was adopted to analyse the number of errors.
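For readers unfamiliar with the test, the F statistic of a one-way repeated measures ANOVA can be computed by hand from one value per participant per condition. The Python sketch below shows this textbook computation only; it is not the ‘aov’ call used in the present analysis, whose exact model formula is not reported, and the function name and example scores are hypothetical.

```python
from statistics import mean


def rm_anova_f(scores):
    """One-way repeated measures ANOVA.
    `scores` holds one row per participant, one value per condition."""
    n = len(scores)     # number of participants
    k = len(scores[0])  # number of conditions
    grand = mean(x for row in scores for x in row)
    cond_means = [mean(row[j] for row in scores) for j in range(k)]
    subj_means = [mean(row) for row in scores]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Residual variation after removing condition and participant effects.
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical scores for four participants in three conditions.
f_value, df_cond, df_error = rm_anova_f(
    [[1, 2, 4], [2, 3, 4], [3, 5, 6], [2, 4, 6]]
)
```

Because the variation between participants is removed from the error term, the test is more sensitive to condition differences than a between-subjects ANOVA on the same values.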

2.2. Results

Table 1 below contains the descriptive statistics, namely the mean RT as well as accuracy for each condition.

Table 1

Mean RT (in ms), mean number of errors, and percentage of the mean number of errors per condition

Condition   Mean RT in ms (sd)   Mean errors (sd)   % errors
BL          1281.19 (818.62)     0.72 (0.81)        3.60%
SEM         1204.59 (695.73)     0.37 (0.81)        1.84%
WK          1320.35 (775.24)     1.25 (1.62)        6.23%

BL: baseline condition, SEM: semantic condition, WK: world knowledge condition

The mean RT for the hashtags in the world knowledge (WK) condition was the highest. Additionally, this condition also had the highest mean number of errors, as well as the highest standard deviation in the mean number of errors.

The statistical analysis showed that there was a significant difference between the three conditions (F(3, 4438) = 10.16, p < .001). A post hoc test (pairwise comparison t-test), with Tukey’s HSD Test as p-value adjustment method, showed that there was a significant difference between RTs in the semantic condition and the world knowledge condition (p < .01), such that RTs for the world knowledge condition were longer than the RTs for the semantic condition. A marginal trend was found between the baseline condition and the world knowledge condition (p = .06), with the RTs for the world knowledge condition being longer than the RTs for the baseline condition. There was no significant difference between the baseline condition and the semantic condition (p > .1).

The statistical analysis of the number of errors showed that there was a significant difference between the three conditions (F(2, 168) = 8.24, p < .01). A post hoc test (pairwise comparison t-test), with Tukey’s HSD Test as p-value adjustment method, showed that there was a significant difference between the baseline condition and the world knowledge condition (p < .05), as well as a significant difference between the semantic condition and the world knowledge condition (p < .01), with the number of errors being higher in the world knowledge condition than in the other two conditions. There was no significant difference between the baseline condition and the semantic condition (p > .1).

2.3. Discussion

The first experiment set out to investigate whether the processing of tweets with semantic violations differs from the processing of tweets with world knowledge violations. This difference in processing was investigated through a violation paradigm, in which participants had to judge whether the hashtag was correct in regard to the previous tweet or not. This type of experiment, in which both semantic processing and world knowledge processing are combined, together with tweets and hashtags, has not been done before. It was hypothesized that both experimental conditions would render slower RTs than the baseline condition, thus belonging to different processes. Additionally, it was hypothesized that semantic violations and world knowledge violations are processed differently during language processing, which would result in different RTs in both conditions. This is because world knowledge processing needs more verification than semantic processing (Kounios & Holcomb, 1992), resulting in slower RTs when reading tweets with world knowledge violations compared to tweets with semantic violations.

A rather surprising result was the fact that the RTs in the baseline condition seemed to be slower (when looking at the means) than those in the semantic condition. This is surprising since previous research has found that unrelated words, as used in the semantic violation condition of the present experiment, usually result in longer RTs than related words (Pratarelli, 1994; Schubert & Eimas, 1977). However, the baseline condition of the present experiment incorporated world knowledge as well, but congruous world knowledge rather than world knowledge violations. This could mean that the baseline condition, similarly to the world knowledge violation condition, required a verification process by the participants. The semantic violation condition, on the other hand, was comprised of hashtags unrelated to the tweets that were presented. Thus, when encountering an unrelated word, this word could be more easily
dismissed as a violation, rather than having to verify the information. Alternatively, the different paradigm of the present experiment (i.e. violation paradigm) compared to previous experiments could have contributed to these novel findings as well. The participants were shown the context sentence (tweet) prior to the hashtag, and in its entirety rather than word by word, as is common in a self-paced reading paradigm. Next, an isolated word (hashtag) would appear and remain on the screen, together with the tweet. Thus, the setup of the first experiment is actually closer to a judgement task than to a regular self-paced reading task as used in previous studies. Whereas faster RTs are expected in the baseline condition than in the experimental conditions (semantics and world knowledge) during a self-paced reading task, a judgement task may result in completely different findings. In judgement tasks of this nature, violations stand out more and could elicit a faster response, as they are more salient than expected completions. The occurrence of an unexpected word when reading semantic violations or the longer verification process that happens when reading world knowledge violations could have rendered a possible lag. However, this possible lag could then have been compensated for by faster RTs when the judgement plays a role as well. This could have contributed to the difference in RTs between the baseline condition and the semantics condition, with the baseline condition having a longer RT.

As can be seen in Table 1, the data from the world knowledge condition do indeed have a longer mean RT than either the data from the baseline condition, or the data from the semantic condition. Moreover, the post-hoc test showed that the difference in RT between the world knowledge condition and the semantic condition is statistically significant, and that there was a marginal trend between the results from the baseline condition and the world knowledge condition. These results are a counterargument against findings in a study carried out by Hagoort et al. (2004), in which they stated that detecting semantic anomalies would take just as long as detecting world knowledge anomalies. The present results show that it takes longer to process world knowledge violations than it does to process semantic violations when reading sentences. Moreover, this is in line with an argument by Dudschig et al. (2016), who found a difference in the amplitudes of the N400s of sentences with semantic violations compared to sentences with world knowledge violations, 270 ms after word onset. To reiterate, they argued that this difference is due to different ways of integrating the knowledge: linguistic and non-linguistic knowledge are integrated at different rates. This can directly be seen in the results presented here, where the linguistic knowledge (the knowledge of semantic accuracy and violations) is integrated faster than the non-linguistic knowledge (world knowledge). In turn, this difference of integration is a strong indicator that there are two different processes in play
when looking at the processing of semantic violations and the processing of world knowledge violations.

Another surprising result can be found in the number of errors made in the experiment. As can be seen in Table 1, the mean number of errors in the world knowledge condition was higher than the mean number of errors in either of the other two conditions. Moreover, the statistical analysis showed that this difference is significant. The reason for this statistical difference is not easily explained through available literature. However, it could be that the participants were more prone to making mistakes when verifying world knowledge. We can see in Table 1 that the baseline condition also has a higher mean number of errors than the semantic condition. However, this difference is not significant. It could be that participants had more difficulty with verifying world knowledge information, and additionally had even more difficulty with marking world knowledge information as incongruent than as congruent. This would explain why there are more errors in the world knowledge condition and the baseline condition than in the semantic condition, as well as why there are more errors in the world knowledge condition than in the baseline condition. However, this is purely speculative.

The results of the current experiment imply that the processing of semantic violations and the processing of world knowledge violations are separate processes. Reading and properly comprehending semantic violations took significantly less time than reading and comprehending world knowledge violations, which suggests that different processes are most likely involved. This is in line with Dudschig et al. (2016), who found a difference in N400 between these two conditions. They attributed this difference to a difference in the integration of linguistic and non-linguistic knowledge. Semantic violations and world knowledge violations can in turn be assigned to linguistic and non-linguistic knowledge, respectively. However, the lack of significant differences between the baseline condition and the experimental conditions is even more surprising, and would suggest that the paradigm used in the present study might not be sensitive enough.


3. Experiment 2

3.1. Method

3.1.1. Ethical Approval

The present experiment received ethical approval from the local ethics committee (Research Ethics Committee (CETO), Faculty of Arts, University of Groningen).

3.1.2. Participants

For the present experiment, the same participants were used as in Experiment 1. The data from two participants were excluded from analysis, as their error rate exceeded 20%. This left 55 participants (18 male, mean age 23.4).
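The exclusion criterion above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name, the data layout (a dictionary mapping a participant ID to a list of correct/incorrect responses), and the example values are all assumptions made for the example.

```python
def exclude_high_error_participants(responses, max_error_rate=0.20):
    """Keep only participants whose error rate does not exceed the threshold.

    `responses` is a hypothetical layout: participant ID -> list of booleans,
    where True marks a correct response and False marks an error.
    """
    kept = {}
    for participant_id, correct in responses.items():
        error_rate = correct.count(False) / len(correct)
        # A participant is excluded only when the error rate exceeds 20%.
        if error_rate <= max_error_rate:
            kept[participant_id] = correct
    return kept


# Made-up example data: 40 trials per participant.
example_responses = {
    "p01": [True] * 38 + [False] * 2,   # 5% errors, kept
    "p02": [True] * 31 + [False] * 9,   # 22.5% errors, excluded
    "p03": [True] * 32 + [False] * 8,   # exactly 20% errors, kept
}
kept_participants = exclude_high_error_participants(example_responses)
```

Note that a participant at exactly 20% is retained here, since the text states that exclusion applied when the error rate exceeded 20%.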

3.1.3. Materials

Stimuli were in the form of a tweet in Dutch, followed by a hashtag. The hashtags were carefully selected so that they would not exceed the four-syllable maximum that was adopted for this experiment. Moreover, for consistency, the hashtags were of approximately the same orthographical length. All hashtags consisted of one word, with the exception of the hashtag ‘goed bezig’ (doing well). This hashtag was chosen instead of the shorter ‘goed’ (good) because the phrase ‘doing well’ has a stronger connection to irony than the word ‘good’.

Subsequently, for each hashtag, two tweets were compiled: one baseline tweet in which the hashtag was literal (5) and one tweet in which the hashtag caused irony (6):

5) De serie eindigde met een cliffhanger #spannend
The series ended with a cliffhanger #exciting

6) Tijdens het lezen viel ik in slaap #spannend
I fell asleep while reading #exciting

A total of 80 experimental items were compiled this way. Additionally, the experiment contained 20 filler stimuli, which were neutral tweets with hashtags that were not used in any of the other conditions, as can be seen in the example (7):

7) Ik ging zo op in mijn boek dat ik de bel niet eens hoorde #lezen
I was so invested in my book that I didn’t even hear the bell #reading


The 80 experimental stimuli were divided into two separate lists, each of which consisted of 20 baseline (literal) tweets and 20 ironic tweets. No hashtag appeared in both conditions within the same list. Both lists also contained the same 20 filler tweets. These lists were pseudo-randomized, as shown in Appendices 4 and 5.
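The division into two lists can be illustrated with a short sketch. It assumes 40 hashtags (80 experimental items, two tweets per hashtag) and assigns the conditions by alternating over the hashtags, which is one common way to obtain this kind of counterbalancing and not necessarily how the lists in the appendices were built. The placeholder item names and the plain seeded shuffle are also assumptions; the actual pseudo-randomization constraints are not described here.

```python
import random


def build_lists(hashtags, fillers, seed=0):
    """Split items over two lists so each hashtag occurs once per list.

    Hashtags at even positions are literal in list A and ironic in list B;
    hashtags at odd positions are the other way around (assumed scheme).
    """
    list_a, list_b = [], []
    for index, hashtag in enumerate(hashtags):
        if index % 2 == 0:
            list_a.append((hashtag, "literal"))
            list_b.append((hashtag, "ironic"))
        else:
            list_a.append((hashtag, "ironic"))
            list_b.append((hashtag, "literal"))

    rng = random.Random(seed)
    for stimulus_list in (list_a, list_b):
        # Both lists receive the same filler items.
        stimulus_list.extend((filler, "filler") for filler in fillers)
        # Plain shuffle as a stand-in for the study's pseudo-randomization.
        rng.shuffle(stimulus_list)
    return list_a, list_b


# Placeholder names; the real hashtags and fillers are in the appendices.
hashtags = [f"hashtag_{i:02d}" for i in range(40)]
fillers = [f"filler_{i:02d}" for i in range(20)]
list_a, list_b = build_lists(hashtags, fillers)
```

With this scheme, every participant sees each hashtag exactly once, and each hashtag contributes one literal and one ironic observation across the two lists.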

3.1.4. Procedure

The present experiment took place directly after the previous experiment. Once the participants had finished Experiment 1, the next page displayed the instructions for the second experiment. Again, these specified that a tweet would show up on screen and that a hashtag would appear after 3 s. It was then up to the participants to decide, as quickly and as accurately as possible, whether this hashtag was meant ironically or literally when combined with the preceding tweet. They could press one of two keys, one for ‘ironical’ (the hashtag is meant ironically) and one for ‘literal’ (the hashtag is meant literally). The keys were ‘p’ and ‘q’, and the key assignment was counterbalanced per list.

Next, participants were asked to complete a short practice round consisting of three tweets, one tweet at a time. Participants were provided with feedback after every practice stimulus. Afterwards, another instruction screen appeared, stating that the practice round had ended, reminding the participant which keys to press for which option, and stating that the real experiment was about to begin, this time without feedback.

Lastly, the real experiment would begin, with one tweet at a time, and a hashtag appearing after 3s. The whole experiment battery would take approximately 30 minutes in total.

3.1.5. Analysis

After all the results from the experiment were gathered, all RTs < 200 ms and all RTs > 5 s were excluded. This was done because 200 ms is not enough time to process the stimuli, whereas RTs > 5 s suggest that the participant was distracted. The remaining RTs were used for calculations. Additionally, the RT data were log transformed prior to statistical analysis.
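The trimming and transformation steps can be sketched as below. This is only an illustration of the stated criteria: the function name and the example RTs are hypothetical, and the natural logarithm is an assumption, since the base of the log transformation is not specified in the text.

```python
import math


def trim_and_log(rts_ms, lower_ms=200, upper_ms=5000):
    """Drop RTs below 200 ms or above 5 s, then log-transform the rest.

    RTs are given in milliseconds. The natural logarithm is assumed here.
    """
    kept = [rt for rt in rts_ms if lower_ms <= rt <= upper_ms]
    return [math.log(rt) for rt in kept]


# Made-up RTs in ms: 150 is too fast and 5200 is too slow, so both are dropped.
example_rts = [150, 200, 850, 5000, 5200]
log_rts = trim_and_log(example_rts)
```

RTs of exactly 200 ms or exactly 5 s are retained in this sketch, because the exclusion criteria are stated as strict inequalities.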

For the statistical analysis, an Excel file with all data was loaded into RStudio (RStudio Team, 2016). The appropriate statistical assumption (i.e. normality of the variables) was examined. The data were not normally distributed, but the t-test is robust to non-normality for larger samples, that is, samples with 30 or more observations, which is the case for the current data set. An independent samples two-sided t-test was carried out with the two
