Academic year: 2021

The Role of Language Proficiency in Verbal Deception Detection:

Comparing English Native and Non-native Speakers

Charlotte van den Hengel

10298657

University of Amsterdam

Word count: 10,830 (Abstract: 173)


Abstract

This study focuses on detecting malicious intent using a verbal deception detection based method and examines the role of language proficiency in the proportion of details and named entities in statements. The study was conducted online; participants were asked to answer either truthfully or deceptively to ten questions regarding their upcoming or past flights. In addition, two language proficiency tasks (i.e. LexTALE and category fluency) and self-rating measures were administered. Information specificity was extracted from the statements using the Linguistic Inquiry and Word Count (LIWC) for the proportion of details and the Named Entities Recognition System (NER) for the proportion of named entities. Results indicate that neither language proficiency nor veracity influences the proportion of details or named entities in statements. In addition, the results indicate that the proportion of details or named entities is not distinctive enough as a deception detection method to detect malicious intent in native and non-native speakers in low-stakes online studies. Explorative findings, implications and limitations of the study are discussed.


Introduction

Background

Recent terrorist attacks like the hostage taking in Paris (2015) and suicide bombings in Brussels (2016) raise the necessity of correctly identifying people with malicious intent to prevent further attacks from happening. Therefore, customs officers at borders and airports question thousands of passengers per day about their reasons and intentions for entering a country.

Intentions are defined as “a person’s mental representation of his or her planned future actions”, and answers to questions about intent can be truthful [i.e. true intent] or deceptive [i.e. false intent] (Vrij, Granhag, Mann & Leal, 2011a, p. 611). In order to estimate the possible harm of granting potentially unwanted people access to a country, discriminating between truthful and deceptive intentions is important, and this discrimination process is an emerging area in deception research. Since airports and borders are a hub for people with different native languages, it is important to investigate whether language proficiency plays a role in this discrimination process when verbal deception detection methods are used. Therefore, this article focuses on detecting malicious intent with verbal deception detection methods and examines the possible role of language proficiency.

Verbal deception detection

There are several deception detection methods that can be used to discriminate between truth-tellers and deceivers; verbal deception detection methods focus on the evaluation of verbal content (i.e. written or spoken statements). Most verbal deception detection methods are rooted in the same theoretical assumption, the Undeutsch Hypothesis (Undeutsch, 1967), stating that “the cognitive elaboration of an untruthful narrative differs from the elaboration of a truthful one, therefore this difference should be traceable in the features of the narrative itself” (Fornaciari & Poesio, 2013, p. 306). The developers of one such method (Reality Monitoring), Johnson and Raye, suggested that memories of perceived events contain more temporal, spatial, perceptual and semantic details (e.g. names of individuals or places) than memories of imagined events (Johnson & Raye, 1981; Nahari, Vrij & Fisher, 2014a). Recent studies show that truthful statements indeed contain more details overall, and more temporal, spatial and perceptual details in particular, than deceptive statements (e.g. Nahari et al., 2014a; Vrij, 2015).

In interrogation settings, verbal deception methods are usually applied by transcribing the statements provided by the suspect, after which the presence (or absence) of several criteria in these statements is analysed according to the method used. In the case of Reality Monitoring, statements are coded by independent raters, who examine to what extent statements contain emotional, sensory, spatial and time information, and to what extent statements are realistic, clear, vivid and easy to reconstruct (Masip, Sporer, Garrido & Herrero, 2005); these criteria were found to be more present in truthful statements, and are therefore considered indicators of truthfulness (Masip et al., 2005; Vrij, 2015; Vrij, Edward & Bull, 2001; Vrij, Edward, Roberts & Bull, 2000). Classification using the Reality Monitoring criteria yields rates of around 75% correct for true statements and 67.5% for deceptive statements (Masip et al., 2005), and accuracy rates lie around 70% (Vrij, 2015), peaking at 85% (Masip et al., 2005). Notably, using just temporal, spatial and perceptual details as criteria for truthfulness results in accuracy rates close to those obtained with all Reality Monitoring criteria (i.e. 71.7%; Nahari et al., 2014a).

Verbal deception detection methods have also been found suitable for discriminating between true and false intent (Sooniste, Granhag, Knieps & Vrij, 2013; Sooniste, Granhag, Strömwall & Vrij, 2014; Vrij et al., 2011a; Vrij, Leal, Mann & Granhag, 2011b). The first study on intentions with verbal deception methods was performed by Vrij and colleagues (2011a), who asked passengers at an airport to give either a truthful or a deceptive account of their upcoming trip, but found no differences in the amount of detail provided by truth-tellers and deceivers (Vrij et al., 2011a). In a later study, Vrij and colleagues (2011b) found that differences in details were more pronounced in the recall of past activities than in the recall of intentions, though they could not rule out that the lack of differences in detail is peculiar to lying about intentions (Vrij et al., 2011b). Sooniste and colleagues (2013) discriminated between true and false intentions by asking about the (past) planning phase to detect future intentions (Sooniste et al., 2013). They found that truth-tellers’ answers to the planning-phase questions were longer, clearer and more detailed than those of deceivers, while the questions about the future were answered in equal detail (Sooniste et al., 2013). In a later study by Sooniste and colleagues (2014), groups of participants planned a mock crime or a non-criminal event and were again asked about the planning phase and intentions. Truth-tellers’ stories contained more details than those of deceivers, and there was higher within-group inconsistency in answers about the planning phase (Sooniste et al., 2014). The results of these studies are inconsistent as to whether true and false intentions can be discriminated using details as the criterion.

However, it can be argued that airplane passengers have planned some things in advance (e.g. hotels, sight-seeing, excursions, transportation) and consequently have more detailed plans than participants in a lab study preparing a random non-criminal event, as is usually done in studies (Sooniste et al., 2013; 2014). Critically, in this case the assumption remains that true statements contain more details than deceptive statements, as truth-tellers can draw on their actual memory of the planning phase (Sooniste et al., 2013).

Automated deception detection software

Computerised coding software has been applied in psycholinguistic and forensic research (e.g. Bond & Lee, 2005; Kleinberg, Nahari & Verschuere, 2016; Vrij, 2000). It was found that computer analyses are more accurate in detecting lies (80%) than human coders (60%), whereas human coders are more accurate in detecting truths (80%) than computers (53%) (Vrij, 2000). Bond and Lee (2005) used automated linguistic software to discriminate between true and deceptive statements of prisoners, aiming to counteract suspicion and (human) bias in detection decisions. The software they used, the Linguistic Inquiry and Word Count (LIWC; Pennebaker, Booth, Boyd & Francis, 2015), captures specific categories of words in statements indicating perceptual, spatial and temporal details, and can therefore be used to discriminate between truth-tellers and deceivers. Using the Reality Monitoring criteria, the software correctly classified 71% of the deceptive statements, compared with correct classification rates of 71% and 50% for the human coding of different groups of prisoners (Bond & Lee, 2005). These results suggest that the automated LIWC software might be as accurate in detecting deceivers as human coding, but without the influence of human bias.

Notwithstanding, LIWC only captures word frequencies indicating perceptual, spatial and temporal details; it does not capture words indicating names of persons, locations and organisations (i.e. named entities; Kleinberg et al., 2016). Named entities indicate details that have been experienced with or witnessed by an identifiable person, or have been recorded through technology (Kleinberg et al., 2016), and deceivers often avoid mentioning such details in interrogation settings because fact-checking is possible (Nahari et al., 2014a; Nahari, Vrij & Fisher, 2014b). Based on this, it can be assumed that deceptive statements contain fewer named entities than truthful statements. Yet most methods do not use the named-entities criterion (DePaulo et al., 2003; Sporer, 2004; Vrij, 2015), although perceptual, spatial and temporal details do not necessarily indicate checkable facts. Therefore, a comparison of LIWC and a named entities recognition (NER) system in verbal deception detection has been proposed (Kleinberg et al., 2016), but has not yet been performed. Based on the definition of details (i.e. inclusions of specific descriptions of place, time, persons, objects and events; Vrij, 2015, p. 8), it can be suggested that such named entities can be captured using an NER system. Therefore, both LIWC and NER will be used in this study, and it is expected that truthful statements contain more details (i.e. spatial, temporal and perceptual details) and named entities than deceptive statements.
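To illustrate how a proportion-of-named-entities measure can be computed, a minimal sketch follows. The tokeniser and the tiny gazetteer standing in for a full NER back end are illustrative assumptions only; the NER system used in this study works differently.

```python
import re

# Toy gazetteer standing in for a trained NER model (illustrative assumption).
ENTITY_GAZETTEER = {"paris", "brussels", "monday", "schiphol"}

def proportion_named_entities(statement: str) -> float:
    """Return the share of tokens recognised as named entities."""
    tokens = re.findall(r"[A-Za-z']+", statement)
    if not tokens:
        return 0.0
    entity_count = sum(1 for t in tokens if t.lower() in ENTITY_GAZETTEER)
    return entity_count / len(tokens)
```

For the statement "I fly to Paris on Monday", two of six tokens are entities, giving a proportion of about 0.33; a production system would replace the gazetteer lookup with a trained NER model.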

Deceiving in a second language

Most studies using verbal deception detection methods have examined statements only in participants’ native language (e.g. Evans & Michael, 2014; Nahari et al., 2014a; Vrij et al., 2001), and therefore prior findings regarding the use of verbal deception detection methods apply only to native speakers. Since many travellers speak a non-native language while crossing borders, it is important to look at the possible differences in telling both true and deceptive statements in a non-native language.

According to the cognitive approach to deception (i.e. Cognitive Load Theory; Zuckerman, DePaulo & Rosenthal, 1981), deceiving is more cognitively demanding than telling the truth, since telling the truth is suggested to be the default setting of the mind (i.e. Truth-Default Theory; Levine, 2014; Verschuere & Shalvi, 2014). Deceiving generates differences in cognitive load between deceivers and truth-tellers, such as slower response times in deceivers compared to truth-tellers (Kleinberg et al., 2016; Vrij, Fisher, Mann & Leal, 2008a; Zuckerman et al., 1981). Based on this finding, multiple minimal interventions (e.g. unexpected questions and reverse-order techniques) can be applied in interrogations to enhance the cognitive differences between truth-tellers and deceivers (Kleinberg et al., 2016; Vrij et al., 2008a; Vrij et al., 2008b), and are suggested to elicit lower-quality details and more inconsistencies in deceivers’ stories (Vrij et al., 2008a).

However, speaking in a second language is already considered a cognitively resource-intensive task (Evans & Michael, 2014), and might enlarge these differences beyond the manipulation techniques, which in turn would result in more pronounced differences between deceivers and truth-tellers in terms of providing detailed accounts. Although the cognitively resource-intensive task of speaking in a second language also applies to truth-tellers, honesty is not thought to be as cognitively demanding as deceiving, because the additional requirement of maintaining a convincing false account is lacking (Evans & Michael, 2014; Vrij et al., 2008a). Therefore, non-native truth-tellers may be better at coping with the cognitive requirements of using a second language than non-native deceivers (Evans & Michael, 2014), though they could still provide fewer details in the non-native language than in the native language as a function of vocabulary. Thus, when trying to deceive someone in a second language, deceivers face two challenging tasks (i.e. telling a convincing deceptive statement and speaking another language), which could result in fewer details being provided when deceiving in a non-native language than when deceiving in the native language.


Consequences for the use and accuracy

Since most verbal deception methods have only been examined in participants' native language, little is known about the possible consequences of using these existing methods with non-native speakers. Prior research has focused on comparing different interview techniques (Ewens, Vrij, Mann & Leal, 2015), the use of interpreters in interrogation settings (Ewens, Vrij, Leal, Mann, Jo & Fisher, 2014) and differences using non-verbal detection methods (Duñabeitia & Costa, 2015). It was found that when using interpreters, interviewees provided fewer details than when speaking in their first language or in a foreign language (English) without an interpreter (Ewens et al., 2014). In addition, no significant differences were found between native and non-native speakers using non-verbal detection methods (e.g. pupil size, speech latencies and utterance durations; Duñabeitia & Costa, 2015). Nevertheless, studies have not examined whether interviewing people in a second (i.e. non-native) language would influence accuracy, although it has been found that there is a bias to believe that non-native (English) speakers are deceiving (i.e. increased false positives; Evans & Michael, 2014). Along these lines, this bias could result in non-natives being unfairly classified as deceivers due to poorer language proficiency. Given that verbal deception detection tools rely on the details provided for their classifications (Nahari et al., 2014a; Sooniste et al., 2013; Vrij et al., 2001), it is important to investigate the influence of language proficiency on the accuracy and validity of current deception detection methods, since deceiving in another language might lead to unwanted consequences such as misclassifying true statements of non-natives as deceptive (Evans & Michael, 2014).

In order to shed light on the discriminatory value of deception detection tests, it is important to distinguish between the sensitivity and specificity of tests. Sensitivity is the proportion of deceptive participants who are detected by the test as deceivers, thus how well the test detects deceptive participants (Wales, 2003). Specificity is the proportion of truth-tellers who are detected by the test as truth-tellers, thus how well the test rules out deceit when it is really absent (Wales, 2003). A sensitive test has few false negatives, and a specific test has few false positives, but the two are interdependent (Wales, 2003). Since it is expected that non-native truth-tellers provide fewer details than native truth-tellers, it is expected that they are more likely to be classified as deceivers (i.e. increased false positives), which results in a lower specificity of the deception detection method. Additionally, since it is expected that non-native deceivers provide fewer details than native deceivers, it is expected that they are more likely to be classified as deceivers (i.e. increased true positives), which results in a higher sensitivity of the deception detection method.
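These definitions translate directly into computations on a confusion matrix, treating deceivers as the "positive" class. A minimal sketch follows; the counts in the usage lines are hypothetical.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of deceivers correctly detected as deceivers."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of truth-tellers correctly detected as truth-tellers."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical outcomes for 100 deceivers and 100 truth-tellers.
sens = sensitivity(true_positives=70, false_negatives=30)   # 0.70
spec = specificity(true_negatives=80, false_positives=20)   # 0.80
```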

Language proficiency

Airports and borders are a hub for people with different native languages, and since these travellers cannot be assumed to be equally fluent in the language used in interviews at border controls (e.g. English), it is important to look at the role of language proficiency in this process.

Speakers who are more advanced and know more words in a language can more easily express themselves and describe things in that language. It can therefore be expected that an increase in language proficiency is associated with providing more details in that language. In this study, three different methods to measure language proficiency were adopted (a questionnaire about demographic variables, self-ratings, and two language proficiency tasks) in order to compare their outcomes and examine the association between language proficiency and the provided proportion of details and named entities.

Information protocols

The manipulation of specific instructions given to suspects is a promising and emerging technique in deception detection research (Vrij, Akehurst, Soukara & Bull, 2004; Vrij, Kneller & Mann, 2000). A recent study shows that when both deceivers and truth-tellers are informed that checkable details (e.g. identifiable witnesses) are used as an indicator of truthfulness, accuracy increases (77.5%) compared with uninformed participants (57.5%) (Harvey, Vrij, Nahari & Ludwig, 2016; Kleinberg et al., 2016). However, this has not yet been used in settings where intentions are examined, and the role of language proficiency in these cases has not been investigated yet. Therefore, both information protocols will be adopted in this study: half of the participants will receive instructions to give as much specific information as possible (i.e. the specific information protocol), and the other half will receive instructions to give as much information as possible (i.e. the standard information protocol). Since little is known about the use of specific information protocols in intention studies and the role of language proficiency therein, no specific expectations about the information protocols are formed.

Hypotheses and expectations

Based on prior research, a total of four expectations are formulated for the current study. First, based on research on verbal deception detection methods, it is expected that truthful statements contain more details and named entities than deceptive statements (Kleinberg et al., 2016; Nahari et al., 2014a; Sooniste et al., 2013; 2014; Vrij, 2015) (Hypothesis 1).

Secondly, based on Cognitive Load Theory (Zuckerman et al., 1981) and deception studies with non-natives (Evans & Michael, 2014), it is expected that truthful statements written in a native language contain more details than truthful statements written in a non-native language (Hypothesis 2a, Appendix 1), and consequently that truthful statements are more likely to be misclassified in the non-native language than in the native language (i.e. decreased specificity) (Hypothesis 2b).

Also based on Cognitive Load Theory (Zuckerman et al., 1981) and deception studies with non-natives (Evans & Michael, 2014), it is expected that deceptive statements written in the native language contain more details than deceptive statements written in the non-native language (Hypothesis 3a, Appendix 1), and consequently that deceptive statements written in a non-native language are more readily classified as deceptive than deceptive statements written in a native language (i.e. increased sensitivity) (Hypothesis 3b).

Lastly, three different measures of language proficiency were used in the current study, including a questionnaire about demographic variables, self-ratings and language proficiency tasks, and it is expected that these measures positively correlate with each other (Hypothesis 4a) and with the provided proportion of details (Hypothesis 4b).

Considering the different information protocols, it was found that specific information protocols can yield greater accuracy of deception detection methods (Harvey et al., 2016). However, since little is known about the use of specific information protocols in intention studies and the role of language proficiency therein, no specific expectations about the information protocols are formed; instead, they will be analysed and discussed in the explorative part of this study. The main hypotheses will be analysed without the influence of the different information protocols.

Method

Participants

A total of 518 participants were recruited through the online participant platform Crowdflower. Of the 518 participants who took part in this study, 105 were excluded due to missing data (n = 9), IP address exclusion (36 duplicate IP addresses, accounting for n = 94) and the age limitation (n = 2). From the remaining sample (n = 413), outliers (i.e. values more than 2.5 standard deviations above the mean) were excluded based on the number of weeks until or since flying (n = 23) and the frequency of visiting the destination (n = 10). Participants who indicated that they had not provided genuine information in the first place were also excluded (n = 28).
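The one-sided outlier rule (excluding values more than 2.5 standard deviations above the mean) can be sketched as follows; the function name and threshold parameter are illustrative, mirroring the cut-off described in the text.

```python
from statistics import mean, stdev

def exclude_upper_outliers(values: list[float], threshold: float = 2.5) -> list[float]:
    """Keep values that are at most `threshold` SDs above the sample mean."""
    m, sd = mean(values), stdev(values)
    return [v for v in values if v <= m + threshold * sd]
```

Applied to, say, twenty responses of 1 week and one of 100 weeks, only the extreme value is removed.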

As a result, 352 participants were included in the analyses (39.5% female, age M = 33.32, SD = 9.43), of whom 221 participants (62.8%) had upcoming flight plans within the next three months (M = 6.04 weeks, SD = 6.25). For the 131 participants (37.2%) who did not have any upcoming flights planned, the last flight was on average about half a year ago (M = 25.56 weeks, SD = 20.58). The main purposes for flying were going on holiday (49.1%), visiting family (21.3%), work (14.2%), visiting friends (9.1%), going back home (3.1%), study (2%), or attending a wedding (1.1%).

Participants were assigned to either the truthful or the deceptive condition and to a standard or specific information protocol condition. In total, 168 participants (47.7%) were assigned to the truthful condition and 184 participants (52.3%) to the deceptive condition; 190 participants (54%) were assigned to the standard information protocol and 162 participants (46%) to the specific information protocol. Of those who were going to fly (n = 221), 110 participants were in the truthful condition, 111 in the deceptive condition, and 120 (54.3%) were assigned to the specific information protocol. Of those who were not going to fly (n = 131), 58 participants were in the truthful condition, 73 in the deceptive condition, and 70 (53.4%) were assigned to the specific information protocol.

English was the native language of 192 participants (54.5%). The sample included 133 bilingual participants (37.8%), but only 88 bilinguals (25% of the total sample, 66% of all bilinguals) had English as one of their first languages. For the other 160 participants (45.5%), English was not a native language. 117 participants were born in an English-speaking country (i.e. the United Kingdom, United States of America, Australia, Canada or Ireland). Participants were paid $1.00 for their participation and had an additional chance to win a $100 voucher for Amazon.com.

Procedure

Only participants meeting the following criteria could take part in the experiment: (1) at least a 3-star participant rating on previous tasks on Crowdflower, and (2) participation through channels with at least 80% trustworthiness.

Condition allocation

The first part of the experiment consisted of questions regarding upcoming flight plans. Depending on the answer to the question “Are you going to fly in the next 12 weeks?”, people were randomly assigned to either the truthful or the deceptive condition, the deceiving-about-intentions (i.e. future) or past condition, and different information protocols (i.e. standard or specific). Participants who were going to fly in the near future could be assigned to the following conditions:


- truthful versus deceptive condition: tell a truthful account of their upcoming trip (i.e. truthful condition) or tell a deceptive account about their upcoming trip with an assigned destination and purpose (i.e. deceptive condition).

- intentions versus past condition: tell an account about their upcoming trip (i.e. intentions condition) or tell an account about their last trip (i.e. past condition).

- standard versus specific instructions: an information protocol with either specific instructions (“give as much specific information as possible”) or standard instructions (“give as much information as possible”).

Participants who did not have an upcoming flight planned were automatically assigned to the past condition, and were asked to tell a truthful account about their last trip (i.e. truthful condition) or a deceptive account about their last trip with an instructed destination and purpose (i.e. deceptive condition), guided by either standard or specific instructions. Ten questions were asked in a fixed order about their upcoming or recent flight (Appendix 2).
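The allocation logic above can be sketched as a small routine. The condition labels and the use of Python's random module are illustrative assumptions about the implementation, not the study's actual code.

```python
import random

def assign_conditions(has_upcoming_flight: bool, rng: random.Random) -> dict:
    """Randomly allocate a participant to veracity, timeframe and protocol conditions."""
    # Participants without flight plans are always asked about their last trip.
    timeframe = rng.choice(["intentions", "past"]) if has_upcoming_flight else "past"
    return {
        "veracity": rng.choice(["truthful", "deceptive"]),
        "timeframe": timeframe,
        "protocol": rng.choice(["standard", "specific"]),
    }
```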

Language proficiency

The second part of the experiment consisted of three different methods to measure language proficiency. Firstly, participants were asked about their gender, age, native language(s), country of origin and education. In addition, participants were asked to rate their own English language proficiency. The self-rating consisted of three questions: participants rated their English proficiency in general, in word recognition and in vocabulary, on a slider scale ranging from 0 (very poor) to 10 (expert). Lastly, participants completed two language proficiency tasks, consisting of a word recognition task and a word production task.

As the word recognition task, the LexTALE language proficiency task (Lemhöfer & Broersma, 2012) was used. The LexTALE is a simple unspeeded visual lexical decision task, consisting of 60 trials (words and pseudo-words). Participants have to decide whether the items are existing English words or not by pressing the A-key for “yes” and the L-key for “no” (Lemhöfer & Broersma, 2012). The LexTALE has been evaluated as a valid measure of English vocabulary knowledge and as an indicator of general English proficiency (Lemhöfer & Broersma, 2012). Research with Dutch and Korean samples shows that the LexTALE has weak to strong positive correlations with other, more extensive, measures of language proficiency (r = .33 with the TOEIC and r = .63 with the Quick Placement Test) (Lemhöfer & Broersma, 2012). It also correlates moderately with various self-rating measures (r = .53 to r = .72), and it is considered a good predictor of translation performance (Lemhöfer & Broersma, 2012). Scoring was done according to the developers’ guidelines (Lemhöfer & Broersma, 2012), and the percentage of correct answers was used as the measure of word recognition. A custom-made JavaScript version of the LexTALE was used to incorporate this task into the general flow of the experimental task.
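A minimal scoring sketch for the percentage-of-correct-answers measure follows. Note that the published LexTALE scoring averages the percentages correct for words and non-words separately; this simplified version, which treats all 60 trials alike, is an assumption for illustration.

```python
def percent_correct(responses: list[bool], answer_key: list[bool]) -> float:
    """Percentage of trials on which the response matches the answer key."""
    assert len(responses) == len(answer_key)
    hits = sum(r == k for r, k in zip(responses, answer_key))
    return 100.0 * hits / len(answer_key)
```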

As the word production task, a subtest of the verbal category fluency test was used, specifically the animal subtest. Verbal fluency tasks have been used to measure verbal ability, including lexical knowledge and lexical retrieval ability (Shao, Janse, Visser & Meyer, 2014), and have been found valid for assessing verbal ability in previous studies (Shao et al., 2014). In this part of the experiment, participants were asked to write down as many animals as they could come up with in 90 seconds. Scoring was done automatically by matching the answers against an animal dictionary, indicating whether the provided words were truly existing animals. The animal dictionary used consisted of 1,498 animal species (Lopez-Terrill, 2014). The number of correctly provided animals was used as the measure of word production.
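The automated fluency scoring can be sketched like this. The tiny dictionary stands in for the 1,498-entry list, and counting duplicate answers only once is an assumption about the scoring, as is the case-insensitive matching.

```python
# Stand-in for the 1,498-entry animal dictionary (illustrative assumption).
ANIMAL_DICTIONARY = {"cat", "dog", "horse", "lion", "zebra"}

def fluency_score(responses: list[str]) -> int:
    """Number of distinct responses that are real animals per the dictionary."""
    normalised = {r.strip().lower() for r in responses}
    return len(normalised & ANIMAL_DICTIONARY)
```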

Using demographic variables, self-ratings and language proficiency tasks, language proficiency was thus self-rated, quasi-experimentally manipulated as a dichotomous variable (i.e. native/non-native) and measured as a continuous variable with the help of the two language proficiency tasks. The main reason for this approach with different measures is to increase the applicability and ecological validity of this study. It is not feasible to ask every prospective airplane passenger to do a language proficiency task before answering questions in an interview, since this would be too resource-intensive. However, subjective self-ratings are often not as useful as data derived from objective language proficiency tasks and are often influenced by other factors, such as the timing of the question (Lemhöfer & Broersma, 2012). On the other hand, previous studies found a high positive correlation (r = .74) for Dutch students between their self-rating score and their score on the Quick Placement Test (i.e. a test of general language proficiency; Lemhöfer & Broersma, 2012). With this finding in mind, self-ratings might be a good estimate of language proficiency, but this has to be examined further in the context of verbal deception detection. Accordingly, it will be examined whether a self-rating serves as a good estimate of people’s actual language proficiency (measured with language proficiency tasks) by measuring the association between the self-rating scores and the scores on the two language proficiency tasks (i.e. LexTALE for word recognition and verbal fluency for word production).

Manipulation check

The last part of the experiment consisted of two manipulation check questions, varying slightly with the allocation of conditions. The first question asked whether the participant was actually flying, or had actually flown, to the provided destination in the provided number of weeks. In the second question, participants rated their motivation to appear convincing on a scale from 0 (not at all) to 10 (absolutely). In addition, participants rated the expectedness of each of the ten questions about their flight on a scale from 0 (not at all) to 10 (absolutely), for example: “How expected was the question: ‘who did you meet there and for which reason’?”.

Coding

Two automated coding methods were used to extract details and named entities from the statements: the Linguistic Inquiry and Word Count System (LIWC; Pennebaker et al., 2015) and the Named Entities Recognition System (NER). The LIWC is a computer-automated linguistic analysis programme that has been applied in psycholinguistic research (e.g. Bond & Lee, 2005; Kleinberg et al., 2016). It analyses statements and produces frequency tables of word categories that fit psychological processes, from which perceptual, spatial and temporal details are derived. The sum of these perceptual, spatial and temporal details, taken as a proportion of the total number of words provided, will be referred to as the “proportion of details”. The Named Entities Recognition System (Kleinberg, Nahari & Verschuere, 2016) has been proposed to measure the proportion of named entities. This automated statement coding software identifies named entities in written text (e.g. names, times, dates; see Kleinberg et al., 2016) and determines the proportion of named entities relative to the total number of words provided, which will be referred to as the “proportion of named entities”.
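The proportion-of-details measure can be illustrated with a small word-count sketch. The real LIWC category dictionaries are proprietary, so the toy word lists below are placeholders, and the whitespace tokenisation is a simplification.

```python
# Placeholder word lists; the actual LIWC category dictionaries are proprietary.
DETAIL_WORDS = {
    "saw", "heard", "felt",         # perceptual
    "near", "above", "inside",      # spatial
    "yesterday", "before", "soon",  # temporal
}

def proportion_of_details(statement: str) -> float:
    """Detail-category word count divided by total word count."""
    tokens = statement.lower().split()
    if not tokens:
        return 0.0
    return sum(t in DETAIL_WORDS for t in tokens) / len(tokens)
```

For "i saw him near the gate yesterday", three of seven words fall in a detail category, giving a proportion of about 0.43.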

Analysis Plan

The main analysis of this study will be a 2x2 between-subject ANOVA. The factors taken into the main analysis are the veracity conditions (i.e. truthful or deceptive) and language conditions (i.e. native English or non-native English) as the independent variables, and the proportion of details (LIWC) and proportion of named entities (NER) as the dependent variables.

Data will be analysed using the statistical analysis software IBM SPSS Statistics 23.0. First, objective measures will be retrieved from the data, showing outcomes on the language proficiency measures and on the dependent variables for the different categories. Secondly, the 2x2 between-subject ANOVA will be performed, after which sensitivity, specificity and accuracy rates will be computed. Afterwards, correlations between the different language proficiency measures and dependent variables will be extracted. Lastly, manipulation checks and explorative analyses will be performed.

Interpretation of Cohen’s d and Pearson’s r effect sizes is done according to Cohen’s guidelines for the social sciences (Cohen, 1988): Cohen’s d effect sizes of 0.2, 0.5 and 0.8 are interpreted as small, moderate and large respectively, and Pearson’s r effect sizes of 0.1, 0.3 and 0.5 are interpreted as small, moderate and large respectively. Unless stated otherwise, an alpha level of .05 is used as the threshold for significance.
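As a sketch of how these effect sizes are computed and labelled, the helper below implements Cohen's d for two independent groups (pooled-SD formula) together with the interpretation benchmarks above; the group values in the usage line are made up for illustration:

```python
from math import sqrt

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled SD."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

def label(d, cutoffs=(0.2, 0.5, 0.8)):
    """Interpret |d| against Cohen's (1988) benchmarks."""
    d = abs(d)
    if d < cutoffs[0]:
        return "negligible"
    if d < cutoffs[1]:
        return "small"
    if d < cutoffs[2]:
        return "moderate"
    return "large"

# Illustrative (hypothetical) group statistics
d = cohens_d(15.0, 3.5, 100, 13.0, 3.5, 100)
print(round(d, 2), label(d))  # prints: 0.57 moderate
```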

Results

Objective measures

Descriptive statistics are listed in Table 1. The average score on the LexTALE task (i.e. the task measuring word recognition) was 75.69, with scores ranging from 37.50 to 97.50. On average, native speakers performed better on the LexTALE (M = 79.70, SE = 1.19) than non-native speakers (M = 70.89, SE = .98), and this difference of 8.81 was significant, t(346.57) = -5.73, p < .001, d = 0.60. The average word production score was 10.72, with scores ranging from 0 to 29. On average, native speakers performed better on the word production task (M = 11.60, SE = .62) than non-native speakers (M = 9.67, SE = .52), and this difference of 1.93 was significant, t(346.56) = -2.39, p = .017, d = 0.31. The veracity conditions (i.e. truthful or deceptive) did not differ in gender, χ2(1) = .337, p = .562. Also, there was no significant difference in age between participants in the truthful condition (M = 33.10, SE = .72) and the deceptive condition (M = 33.52, SE = .71), t(350) = -.41, p = .680, d = 0.04.

Table 1

Means and standard deviations for the various language proficiency measures and the proportion of details and named entities for each veracity and language proficiency condition. GLP = General Language Proficiency, WR = Word Recognition and WP = Word Production.

                               Total          Truthful                       Deceptive
                                              Native         Non-native     Native         Non-native
N                              352            95             73             97             87
LexTALE WR Score               75.69
Fluency WP Score               10.72 (7.80)   12.08 (9.12)   8.71 (5.71)    11.13 (8.19)   10.47 (7.05)
Self-Rating GLP                7.95 (1.81)    8.60 (1.64)    7.01 (1.87)    8.77 (1.47)    7.09 (1.56)
Self-Rating WR                 8.13 (1.77)    8.77 (1.64)    7.27 (1.84)    8.87 (1.35)    7.34 (1.65)
Self-Rating WP                 7.66 (1.91)    8.40 (1.64)    6.62 (1.98)    8.52 (1.53)    6.76 (1.71)
Proportion of details          15.01 (3.61)   15.16 (4.06)   14.67 (3.39)   15.37 (3.46)   14.75 (3.45)
Proportion of named entities   1.32 (1.09)    1.26 (1.03)    1.28 (0.97)    1.28 (1.23)    1.45 (1.08)

Difference in details and named entities for veracity conditions

Firstly, it was expected that truthful statements contain more details and named entities than deceptive statements (Hypothesis 1). An independent-samples t-test showed no significant difference in the proportion of details between truthful statements (M = 14.95, SE = .29) and deceptive statements (M = 15.08, SE = .26), t(350) = .34, p = .736, d = 0.04. In addition, there was no significant difference in the proportion of named entities between truthful statements (M = 1.27, SE = .08) and deceptive statements (M = 1.36, SE = .09), t(350) = .81, p = .422, d = 0.09. Together, these findings suggest that veracity does not influence the proportion of details and named entities in the statements.

The role of language and veracity in proportion of details and named entities

To continue, it was expected that truthful statements written in a native language contain more details than truthful statements written in a non-native language (Hypothesis 2a), and that deceptive statements written in the native language contain more details than deceptive statements written in the non-native language (Hypothesis 3a). To test these hypotheses, a 2x2 between-subject ANOVA was performed to examine the effects of language and veracity, and their possible interaction, on the proportion of details and the proportion of named entities.


Firstly, there was no significant effect of language on the proportion of details, indicating that the proportion of details in native (M = 15.27, SD = 3.76) and non-native (M = 14.71, SD = 3.42) statements did not differ, F(1, 348) = 2.05, p = .153, ηp² = 0.01. Also, there was no significant effect of language on the proportion of named entities, indicating that the proportion of named entities in native (M = 1.27, SD = 1.13) and non-native statements (M = 1.37, SD = 1.03) did not differ, F(1, 348) = .14, p = .707, ηp² = 0.00.

Secondly, there was no significant effect of veracity on the proportion of details, indicating that the proportion of details in truthful (M = 14.95, SD = 3.79) and deceptive statements (M = 15.08, SD = 3.46) did not differ, F(1, 348) = .14, p = .707, ηp² = 0.00. Also, there was no significant effect of veracity on the proportion of named entities, indicating that the proportion of named entities in truthful (M = 1.27, SD = .99) and deceptive statements (M = 1.36, SD = 1.16) did not differ, F(1, 348) = .68, p = .409, ηp² = 0.00.

Lastly, there was no significant interaction between language and veracity for the proportion of details, indicating that the proportion of details in statements written in the native or non-native language did not differ according to their veracity (i.e. truthful or deceptive), F(1, 348) = .02, p = .876, ηp² = 0.00. Neither was there a significant interaction between language and veracity for the proportion of named entities, F(1, 348) = .40, p = .528, ηp² = 0.00.

Sensitivity, specificity and accuracy rates

Regarding the expectations that truthful statements are more likely to be misclassified as deceptive in the non-native language than in the native language (i.e. decreased sensitivity and increased false positives) (Hypothesis 2b), and that deceptive statements written in a non-native language are more easily classified as deceptive than deceptive statements written in a native language (i.e. increased specificity and increased true negatives) (Hypothesis 3b), a ROC curve analysis was performed and the sensitivity and specificity of the methods were calculated using Youden’s Index (Youden, 1950). This method maximises the sum of sensitivity and specificity to identify the best cut-off score.
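The cut-off selection can be sketched as follows. The detail scores below are invented for illustration, and the convention that higher detail scores indicate truthfulness is an assumption of this sketch (following the verbal deception literature), not a value taken from the study:

```python
def youden_cutoff(truthful, deceptive):
    """
    Pick the cut-off that maximises Youden's J = sensitivity + specificity - 1,
    treating detail scores >= cut-off as 'truthful'.
    Returns (cutoff, sensitivity, specificity).
    """
    best = None
    for c in sorted(set(truthful) | set(deceptive)):
        sens = sum(s >= c for s in truthful) / len(truthful)   # true positives
        spec = sum(s < c for s in deceptive) / len(deceptive)  # true negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical proportion-of-details scores for two small groups
truthful = [16.2, 15.8, 14.9, 17.0, 15.1]
deceptive = [13.0, 14.2, 15.5, 12.8, 13.9]
print(youden_cutoff(truthful, deceptive))  # prints: (14.9, 1.0, 0.8)
```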

For statements written in a native language, the analysis shows an area under the curve (AUC) of approximately 0.517 for the proportion of details and 0.473 for the proportion of named entities. For statements written in a non-native language, the analysis shows an AUC of approximately 0.519 for the proportion of details and 0.540 for the proportion of named entities. These AUC values indicate near-chance classification, in line with the non-significant veracity effect reported in the main analysis.

Using Youden’s (1950) calculation, the ideal cut-off in the proportion of details was 14.92 for statements written in a native language, with a corresponding sensitivity of 0.577 and specificity of 0.505. For statements written in a non-native language, the ideal cut-off was 14.93, with a corresponding sensitivity of 0.460 and specificity of 0.616. For the proportion of named entities, the ideal cut-off was 0.21 for statements written in a native language, with a corresponding sensitivity of 0.979 and specificity of 0.105. For statements written in a non-native language, the ideal cut-off was 1.01, with a corresponding sensitivity of 0.621 and specificity of 0.521. A detailed calculation of sensitivity, specificity and accuracy rates can be found in Appendix 3. The areas under the curves, the ideal cut-off scores and corresponding sensitivity, specificity and accuracy rates are listed in Table 2.

Table 2

The different sensitivity, specificity and accuracy measures for the proportion of details and named entities, distributed over native and non-native statements.

                               AUC      Cut-off   Sensitivity   Specificity   Accuracy
Proportion of details
  Native                       0.517    14.92     0.577         0.505         0.458
  Non-native                   0.519    14.93     0.460         0.616         0.469
Proportion of named entities
  Native                       0.473    0.21      0.979         0.105
  Non-native                   0.540    1.01      0.621         0.521         0.425

Sensitivity and specificity rates listed in Table 2 show that using the ideal cut-off for the proportion of details, sensitivity was lower and specificity was higher for non-native statements compared to native statements. Analyses regarding increases and decreases in false positives and false negatives (expressed as percentage points) are discussed below.

Regarding the proportion of details, the sensitivity rate suggests a lower sensitivity for non-native statements; only a slight increase (3 percentage points) in false positives was found in non-native statements (see Appendix 3). Though small, this result suggests that truthful statements are more likely to be misclassified as deceptive in the non-native language than in the native language when considering the proportion of details. The specificity rate suggests a higher specificity for the non-native statements; only a slight decrease (4 percentage points) in false negatives was found in non-native statements (see Appendix 3). Though small, this result suggests that deceptive statements are more likely to be classified as deceptive in the non-native language than in the native language when considering the proportion of details.

Regarding the proportion of named entities, the sensitivity rate suggests a lower sensitivity for non-native statements; a moderate increase (18 percentage points) in false positives was found in non-native statements (see Appendix 3). This result suggests that truthful statements are more likely to be misclassified as deceptive in the non-native language than in the native language when considering the proportion of named entities. The specificity rate suggests a higher specificity for the non-native statements; a moderate decrease (17 percentage points) in false negatives was found in non-native statements (see Appendix 3). This result suggests that deceptive statements are more easily classified as deceptive in the non-native language than in the native language when considering the proportion of named entities.

Nonetheless, the difference of 0.011 between the AUCs for the proportion of details in statements written in a native and a non-native language was not significant, Z = -0.18, p = .859. The same held for the difference of 0.028 between the AUCs for the proportion of named entities, Z = 0.45, p = .065. Thus, considering the sensitivity, specificity and the areas under the curves, neither the proportion of details nor the proportion of named entities was an accurate diagnostic measure for discriminating between deceivers and truth-tellers in this study.

Correlations between language proficiency measures

Regarding the language proficiency measures, it was expected that the different methods used to measure language proficiency (self-ratings and language proficiency tasks) would be positively correlated (Hypothesis 4a). The correlations between the different methods are listed in Table 3.

Table 3

Correlations of the Language Proficiency Measures. ** Significant at p < .01. GLP = General Language Proficiency, WP = Word Production and WR = Word Recognition.

                    LexTALE     Fluency     Self-Rating   Self-Rating
                    WR Score    WP Score    GLP           WR
LexTALE WR Score    x
Fluency WP Score    .406**      x
Self-Rating GLP     .480**      .294**      x
Self-Rating WR      .462**      .293**      .859**        x
Self-Rating WP      .423**      .302**      .839**        .855**

As listed in Table 3, strong positive correlations between the three self-rating measures were found (r = .859 with r² = .738, r = .839 with r² = .704 and r = .855 with r² = .731, all ps < .001). In addition, there was a moderate positive correlation between the LexTALE scores and the word production scores, r = .406, r² = .165, p < .001. This moderate, but not strong, correlation suggests that both scores are influenced by language proficiency, but that recognising and producing words may capture different aspects of general language proficiency. Further, moderate positive correlations were found between the self-rating for General Language Proficiency and the LexTALE scores (r = .480, r² = .230, p < .001) and the Fluency Word Production scores (r = .294, r² = .086, p < .001). Moreover, moderate positive correlations were found between the self-ratings for word recognition and the LexTALE scores (r = .462, r² = .213, p < .001), and between the self-ratings for word production and the word production task scores (r = .302, r² = .091, p < .001), even though this last association is weaker. To sum up, as expected, the different methods of measuring language proficiency were positively correlated.
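As a minimal sketch of how such a correlation (and the shared variance r²) is computed, the function below implements the Pearson product-moment formula; the six paired self-rating and LexTALE values are invented for illustration:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical self-ratings (0-10) and LexTALE scores (0-100) for six speakers
self_rating = [6, 7, 8, 8, 9, 10]
lextale = [62, 70, 75, 78, 85, 93]
r = pearson_r(self_rating, lextale)
print(round(r, 3), round(r ** 2, 3))  # prints: 0.995 0.99
```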

Correlations between language proficiency and proportion of details

Finally, it was expected that the different methods used to measure language proficiency would be positively correlated with the provided proportion of details (Hypothesis 4b). The correlations of the different methods with the proportion of details and named entities are listed in Table 4.

Table 4

Correlations of the Language Proficiency Measures with the Proportion of details and Proportion of named entities. * Significant at p < .05, ** Significant at p < .01. GLP = General Language Proficiency, WP = Word Production and WR = Word Recognition.

                               LexTALE     Fluency     Self-Rating   Self-Rating   Self-Rating   Proportion
                               WR Score    WP Score    GLP           WR            WP            of details
LexTALE WR Score               x
Fluency WP Score               .406**      x
Self-Rating GLP                .480**      .294**      x
Self-Rating WR                 .462**      .293**      .859**        x
Self-Rating WP                 .423**      .302**      .839**        .855**        x
Proportion of details          .300**      .187**      .263**        .256**        .231**        x
Proportion of named entities   -.199**     -.104       -.135*        -.145*        -.153*        .058


As listed in Table 4, significant correlations were found between the LexTALE scores and the proportion of details (r = .300, r² = .09, p < .001) and named entities (r = -.199, r² = .04, p < .001). A significant correlation was also found between the word production task scores and the proportion of details (r = .187, r² = .035, p < .001), but not the named entities (r = -.104, r² = .011, p = .050). These correlations indicate a weak positive association between language proficiency and the proportion of details. The opposite signs of the associations of the language proficiency measures with the proportion of details (positive) versus named entities (negative), together with the near-zero correlation between details and named entities (r = .058), indicate that details and named entities are less associated with each other than originally assumed. Further research is needed to establish whether these findings hold under more successful veracity manipulation conditions.

Manipulation checks

First, participants who indicated that they were actually telling the truth about flying in the future or past were excluded from the analyses (N = 28). Further, participants were motivated to appear convincing: on a scale from 0 to 10, participants rated their motivation to appear convincing at 8.01 on average (SD = 1.84). There was no difference in motivation to appear convincing between the truth-tellers (M = 7.90, SD = 1.88) and the deceivers (M = 8.11, SD = 1.79), t(350) = -1.07, p = .287. Considering the full range of 0 to 10, only 34 participants (9.7%) rated their motivation at 5.0 or below, and most participants rated their motivation at 10.0 (N = 100, 28.4%). These results indicate that participants were highly motivated to appear convincing, regardless of the veracity condition in which they had to write their statements.

Explorative analyses

Details and named entities in future and past flights


One of the main goals of this paper was to investigate the role of details in future intentions. It is therefore interesting to examine whether statements differ between people who are going to fly (intentions) and people who have flown, between true and deceptive future intentions, and according to the number of weeks until flying. It was found that, on average, participants writing statements about past flights provided more details (M = 15.77, SE = .31) than those writing about future flights (M = 14.57, SE = .24); the difference of 1.20 was significant, t(350) = -3.05, p = .002, and represented a small effect size of d = 0.33. For named entities, no such significant difference was found, t(350) = -0.26, p = .792.

Moreover, to investigate whether the number of weeks until flying influences the proportion of details or named entities in a statement, participants flying within 4 weeks (n = 109) and participants flying later (n = 112) were analysed separately. Weeks until flying was added to the ANOVA from the main analysis, resulting in a 2x2x2 between-subject ANOVA on the proportion of details and named entities with weeks until flying (i.e. less or more than 4 weeks), veracity (i.e. truthful or deceptive) and language (i.e. native English or non-native English) as independent variables. As before, no main effect of veracity or language was found on the proportion of details, F(1, 213) = 0.00, p = .966 (veracity) and F(1, 213) = 1.56, p = .213 (language), or on the proportion of named entities, F(1, 213) = 0.63, p = .427 (veracity) and F(1, 213) = 0.38, p = .537 (language). In addition, no significant effects of weeks until flying were found for the proportion of details, F(1, 213) = 0.17, p = .680, or the proportion of named entities, F(1, 213) = 0.02, p = .896. There were no significant interactions between weeks until flying and veracity for the proportion of details, F(1, 213) = 0.25, p = .617, or the proportion of named entities, F(1, 213) = 1.22, p = .217. These results indicate that the number of weeks until flying does not influence the proportion of details or named entities in the statements. The absence of an interaction between veracity and weeks until flying indicates that the proportion of details or named entities in the statements written by participants flying within 4 weeks or later did not differ according to their veracity (i.e. truthful or deceptive).

In addition, to see whether the number of weeks since flying influences the proportion of details or named entities in statements, participants who had flown within the last 4 weeks (n = 17) and participants who had flown earlier (n = 114) were analysed separately. Weeks since flying was added to the previous ANOVA analysis, resulting in a 2x2x2 between-subject ANOVA on the proportion of details and named entities with weeks since flying (i.e. less or more than 4 weeks), veracity (i.e. truthful or deceptive) and language (i.e. native English or non-native English) as independent variables. As before, no main effect of veracity or language was found on the proportion of details, F(1, 123) = 2.58, p = .111 (veracity) and F(1, 123) = 0.67, p = .414 (language), or on the proportion of named entities, F(1, 123) = 1.19, p = .278 (veracity) and F(1, 123) = 0.15, p = .702 (language). However, a significant main effect of weeks since flying was found for the proportion of details, F(1, 123) = 4.40, p = .038, but not for the proportion of named entities, F(1, 123) = 0.61, p = .438. There were no significant interactions between weeks since flying and veracity for the proportion of details, F(1, 123) = 3.49, p = .064, or the proportion of named entities, F(1, 123) = 1.30, p = .256. These results suggest that the number of weeks since flying influences the proportion of details in the statements, but not the proportion of named entities; this might be due to the influence of memory. The absence of an interaction between veracity and weeks since flying indicates that the proportion of details or named entities in the statements written by participants who had flown within the last 4 weeks or earlier did not differ according to their veracity (i.e. truthful or deceptive). An overview of the means and standard deviations of the proportion of details and named entities for the different conditions can be found in Table 5. These results suggest that statements about past events (drawing on memory) contain more details than statements about future (planned) events (intentions).


Table 5

Means and standard deviations for the proportion of details (LIWC) and named entities (NER) for future and past flights, allocated by condition (i.e. true or deceptive).

                               Total          Future flights (intentions)                  Past flights
                                              True           Deceptive      Total          True           Deceptive      Total
N                              352            110            111            221            58             73             131
Proportion of details          15.01 (3.61)   14.55 (3.70)   14.59 (3.46)   14.57 (3.57)   15.70 (3.87)   15.82 (3.34)   15.77 (3.57)
Proportion of named entities   1.32 (1.09)    1.25 (1.09)    1.35 (1.17)    1.30 (1.13)    1.29 (0.81)    1.37 (1.16)    1.34 (1.02)

Information protocol

Regarding the information protocol, prior research suggested that participants given a specific information protocol (e.g. "give as much specific information as possible") provide more details than participants given a standard information protocol (e.g. “give as much information as possible”). However, language proficiency might affect the efficiency of the specific information protocol. To test this, a 2x2x2 between-subject ANOVA was performed to examine the effects of information protocol, veracity and language, and their possible interactions. As in the analyses of hypotheses 2 and 3, there was no main effect of veracity or language on the proportion of details or named entities (for the detailed analysis, see the section on the role of language and veracity in the proportion of details and named entities).

Regarding the proportion of details, participants given the specific protocol provided on average more details (M = 15.45, SD = 3.66) than participants given the standard protocol (M = 14.51, SD = 3.50), F(1, 344) = 6.65, p = .010, ηp² = 0.02. In addition, regarding the proportion of named entities, participants given the specific protocol provided on average more named entities (M = 1.48, SD = 1.10) than participants given the standard protocol (M = 1.12, SD = 1.05), F(1, 344) = 10.8, p = .001, ηp² = 0.03. An additional t-test showed that the difference of .94 in the proportion of details was also significant, t(350) = 2.44, p = .015, and represented a small effect size of d = 0.26. The difference of .36 in the proportion of named entities was also significant, t(350) = 3.09, p = .002, and represented a small effect size of d = 0.33. These results indicate that the type of instruction influences the proportion of details and named entities in the statements.

However, no significant interaction between information protocol and veracity was found for the proportion of details, indicating that the proportion of details in the statements of participants given the standard or specific instructions did not differ according to their veracity (i.e. truthful or deceptive), F(1, 344) = .47, p = .492, ηp² = 0.00. Neither was there a significant interaction between information protocol and veracity for the proportion of named entities, F(1, 344) = .79, p = .374, ηp² = 0.00.

In addition, no significant interaction between information protocol and language was found for the proportion of details, indicating that the proportion of details in the statements of participants given the standard or specific instructions did not differ according to their language (i.e. native English or non-native English), F(1, 344) = .28, p = .596, ηp² = 0.00. Neither was there a significant interaction between information protocol and language for the proportion of named entities, F(1, 344) = .78, p = .379, ηp² = 0.00.

The absence of these interactions combined with the presence of a main effect of information protocol suggests that participants given a specific information protocol provide more details and named entities than participants given a standard information protocol, regardless of language or veracity. Participants given the specific information protocol were made aware that detailed information was wanted, and therefore likely provided more details than participants given the standard information protocol. The absence of an interaction with language suggests that the information protocol is also easy to understand for participants whose first language is not English. However, the absence of an interaction with veracity suggests that the specific information protocol did not create a greater discrepancy between truth-tellers and deceivers than the standard information protocol did. It might be that deceivers given the specific information protocol became more aware of what kind of information was needed to appear truthful, and therefore provided more details and named entities, just like the truth-tellers, resulting in the absence of a difference between these conditions.

Bilingualism

Prior research found that people who are raised with more than one native language (i.e. bilingual speakers) are better at inhibiting the unwanted language, and therefore show better cognitive control (Bialystok, Craik, Klein & Viswanathan, 2004; Caldwell-Harris & Ayçiçeği-Dinn, 2009; Martin-Rhee & Bialystok, 2008), than people who are not bilingually raised. Based on this, it might take less effort for bilingual speakers to inhibit the truth, allowing them to deceive more easily than monolinguals (in contrast to the Cognitive Load Theory). In this study, 88 participants (25%) were raised bilingually with English as one of their first languages. It was found that, on average, these bilinguals provided more named entities (M = 1.60, SE = 0.15) than participants who were not bilingually raised with English as one of their first languages (i.e. monolinguals; M = 1.22, SE = 0.06); the difference of .38 was significant, t(350) = -2.87, p = .004, and represented a small effect size of d = 0.31. However, there was no significant difference between bilinguals and the other participants in the proportion of details, t(350) = 1.78, p = .076.

Moreover, no difference in the proportion of details or named entities was found between the deceptive statements of bilinguals (n = 45) and the deceptive statements of the other participants (n = 139); the corresponding values were t(182) = 1.40, p = .162 and t(55.87) = -1.54, p = .128, respectively. However, bilinguals provided more named entities in their true statements (n = 43, M = 1.55, SE = 0.19) than the other participants did in theirs (n = 125, M = 1.17, SE = 0.79), and this difference of .38 was found to be significant. No such difference was found in the proportion of details between the true statements of bilinguals and those of the other participants, t(166) = -1.11, p = .269. These results indicate that bilinguals raised with English provide more named entities than other participants, a difference driven by their truthful rather than their deceptive statements. However, whether generating more named entities is a function of bilingualism is hard to determine, and whether an explanation for this is useful for deception research is something future research should decide.

Discussion and conclusions

The main focus of this study was to investigate the role of language proficiency in the detection of malicious intentions using verbal deception detection methods and to see whether language proficiency influences the proportion of details and named entities in provided statements and consequently the accuracy of detection methods.

In this study, truthful statements did not contain more details or named entities than deceptive statements, which is not in line with hypothesis 1. This finding suggests that veracity does not influence the proportion of details or named entities in statements.

Also, it was not found that truthful statements written in a native language contained more details or named entities than truthful statements written in a non-native language, which is not in line with hypothesis 2a. In addition, it was not found that deceptive statements written in the native language contained more details or named entities than deceptive statements written in the non-native language, which is not in line with hypothesis 3a. Also, no interaction between language and veracity was found. These results suggest that language proficiency does not influence the proportion of details or named entities in statements, and neither does veracity.

Moreover, it was found that truthful statements are more likely to be misclassified as deceptive in the non-native language than in the native language when considering the proportion of details and named entities as the method, which is in line with hypothesis 2b. Also, it was found that deceptive statements are more easily classified as deceptive in the non-native language than in the native language when considering the proportion of details or named entities as the method, which is in line with hypothesis 3b. However, evaluation of these methods as diagnostic measures shows that they are not accurate enough to discriminate between deceivers and truth-tellers.

Lastly, it was found that the different methods of measuring language proficiency were positively correlated with each other and with the provided proportion of details, but not with the proportion of named entities. This is broadly in line with hypotheses 4a and 4b, and these results indicate that people can estimate their own language proficiency quite well and that language proficiency can influence the proportion of details in statements, in contrast to named entities.

Explorative analyses indicate that statements about intentions (i.e. future flights) contain fewer details than statements about the past (i.e. past flights). Additionally, stories about more recent flights contained more details than stories about flights longer ago; however, no such difference was found for upcoming flights: stories about sooner upcoming flights did not contain more details or named entities than stories about flights further in the future. These results suggest that past events (drawing on memory) yield more details than future (planned) events (intentions). Explorative analysis of the different information protocols indicates that specific information protocols elicit more details and named entities than standard information protocols. Also, bilingualism does not seem to be a major factor in deception research.

Based on these results, it can be concluded that language does not influence the proportion of details or named entities in statements, and neither does veracity. In addition, it can be concluded that veracity does not influence the proportion of details or named entities enough to correctly differentiate between truth-tellers and deceivers. Also, the use of the proportion of details and named entities as criteria was not accurate enough to discriminate between truth-tellers and deceivers in this study. Three alternative explanations for these findings, limitations and suggestions for future research are discussed below.

First, an alternative explanation for the absence of a language effect could be that participants were recruited through an English online participant platform, suggesting that they already had a sufficient level of English proficiency and thus formed a homogeneous sample. However, scores on the language proficiency tasks differed significantly between participants whose first language was English and those for whom it was not, indicating that this quasi-experimental design did elicit observable differences in language proficiency between natives and non-natives, resulting in heterogeneous groups. Yet no effect of language was found, which points to the absence of an actual effect: the questions and instructions were apparently easy enough to understand across proficiency levels, and deceiving in a second language did not come at the expense of providing fewer details. A true interaction effect of language and veracity, however, can only be examined with an experimental manipulation of language, which is recommended for future research; such a study could be conducted, for example, with students whose second language is English.

Secondly, an alternative explanation for the absence of a difference in details or named entities between true and deceptive future intentions could be that the operationalisation of intentions in this study deviated from how it is usually done: participants were questioned about flights weeks away rather than with direct implication (e.g. passengers at the airport right before boarding; Vrij et al., 2011a). However, explorative analyses show no difference in details or named entities between participants flying soon (i.e. within 1, 2, 3 or 4 weeks) and those flying later, and most participants flew within 6 weeks. It was found, however, that stories about recent flights (less than 4 weeks ago) contained more details than stories about flights longer ago (more than 4 weeks ago), suggesting that memory influences the amount of detail provided. The lack of difference between the intention and past conditions remains, which could indicate that details and named entities are not sensitive enough as criteria to discriminate between true and false intent, as found in this paper. Vrij and colleagues argued that the lack of a difference in detail could be because descriptions of truthful intentions are generally no more detailed than descriptions of false intentions (Vrij et al., 2011b); in this case, truth-tellers may simply not yet know what they are going to do on their trip. It could also be that participants used parts of their true stories (people, places and genuine innocent activities) in their deceptive accounts, making it harder to differentiate between true and deceptive stories (Nahari & Vrij, 2015). However, Vrij and colleagues found that stories about false intent sound less plausible than those about true intent (Vrij et al., 2011b), so plausibility might be a more suitable criterion; whether plausibility can also discriminate between true and false intentions about events further in the future is something future research should decide.

In addition, an alternative explanation for the absence of a veracity effect could be the lack of interaction in this online study design. Online studies like this one are suitable for gathering data from hundreds of participants, but the options for influencing, motivating and persuading participants to sound convincing are limited. However, most truth-tellers and deceivers indicated that they were highly motivated to sound convincing during the experiment, suggesting that the lack of interaction did not reduce their motivation. Nevertheless, it would be interesting to see whether interaction could elicit a difference between truth-tellers and deceivers, as the fear of getting caught may be stronger in person than behind a computer. Interaction could be implemented in a more traditional way, such as in the lab or by questioning people at an airport or border (e.g. Vrij et al., 2011), or online through communication methods such as Skype or a chat function. Considering the possible applicability, it would be worthwhile to investigate the use of an interactive online information elicitation method in deception detection.

Further, a possible limitation of this study is the fully automated coding procedure. Most studies in this field still rely on human rather than automated coding, but given the feasibility of coding the statements of hundreds of participants (N = 352), automated coding was used here. It is unknown whether human coding would have produced significantly different results in this study. Prior research found computer analysis more accurate in detecting lies, but human coding more accurate in detecting truths (Vrij, 2000). This might be because the LIWC and NER systems only look at specific words rather than whole sentences in context, so the plausibility or realism of sentences is not assessed and devices such as sarcasm may be misinterpreted. Human coding, on the other hand, can involve bias, judgement or suspicion (Bond & Lee, 2005; Evans & Michael, 2014; Vrij, 2000). It would therefore be useful to investigate whether human and automated coding of online statements differ in accuracy, by comparing both on the same sample of statements. If automated coding proves as accurate as, or more accurate than, human coding, this could greatly benefit the field and make deception research more measurable. Until further research settles this question, much caution is needed when interpreting and classifying deceptive statements, and certainly when prosecuting suspects on the basis of automated coding.
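To illustrate why word-level automated coding misses context: a LIWC-style count simply tallies tokens against a category dictionary and returns a proportion, regardless of plausibility, realism or sarcasm. The sketch below uses a tiny hypothetical detail lexicon; the real LIWC dictionaries are proprietary and far more extensive.

```python
import re

# Hypothetical mini-lexicon standing in for a LIWC detail category
DETAIL_WORDS = {"airport", "gate", "ticket", "paris", "monday", "morning"}

def detail_proportion(statement: str) -> float:
    """LIWC-style measure: fraction of tokens matching the detail lexicon."""
    tokens = re.findall(r"[a-z']+", statement.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in DETAIL_WORDS)
    return hits / len(tokens)

text = "I booked a ticket to Paris and left from the airport on Monday morning."
print(f"{detail_proportion(text):.3f}")  # → 0.357 (5 detail words out of 14 tokens)
```

An ironic or implausible statement containing the same tokens would score identically, which is exactly the limitation discussed above.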

Lastly, explorative results regarding the types of instructions indicate that the specific instruction protocol elicits more details and named entities than the standard instruction protocol, which is in line with prior research (Harvey et al., 2016). A suggestion for future research would be to investigate which other criteria this type of instruction enhances, and to implement it in applied settings such as interrogations.

This study provides insight into the role of language proficiency in verbal deception detection methods. Even though no significant effects of language proficiency or veracity were found, no conclusive statements can be made about the role of language proficiency given the absence of an experimental manipulation of language. The explorative results point to multiple new variables that could play a role in verbal deception detection and can be addressed in future research.
