
Improving Online Communication between Digitising Governments and its Citizens



Improving Online Communication between Digitising Governments and its Citizens

SUBMITTED IN PARTIAL FULFILLMENT FOR THE DEGREE OF MASTER OF SCIENCE

D.M. de Vries

10756302

Master Information Studies

Information Systems

Faculty of Science

University of Amsterdam

June 26, 2019

1st Examiner: prof. dr. T.M. van Engers

2nd Examiner: dr. J.A.C. Sandberg


Improving Online Communication between Digitising Governments and its Citizens

Master Thesis Information Studies, University of Amsterdam

D.M. de Vries

UvA ID: 10756302

ABSTRACT

Over the last four decades, there has been a growing adoption of Artificial Intelligence (AI) within governments. The use of AI can unlock enormous potential for the operation of governmental authorities. Governments have started to use Automated Decision-making Systems (ADS) in fields where the citizen's stakes are high. In recent years, sub-symbolic AI-technologies, known under different names such as data analytics, machine learning and deep learning, have become popular. The models generated by these AI-technologies are typically hard to understand, even by AI-experts. However, for organisations it is essential that their decisions are explainable, transparent and comply with the rules set by law. This puts constraints on the mechanisms of these sophisticated AI-technologies. Therefore, there has been an increasing interest in Explainable AI (XAI) and, as a result, a renewed focus on the quality assessment of explanations. This study focuses on the current developments of AI within the Dutch government and its focus on providing citizens with a good motivation of (partly) automated decisions. A framework to assess the quality of explanations of legal decisions by public administrations was developed. Whilst Dutch governmental agencies already offer clear and straightforward letters communicating their decisions, communication with the citizen can be further improved by providing a more interactive letter. In that way, citizens could be offered more insight into the specific components of the decision made, the calculations applied and the sources of law that contain the rules underlying the decision-making process.

KEYWORDS

Artificial Intelligence, XAI, Automated Decision-making Systems, Transparency, Explanations, Government

1 INTRODUCTION

All over the world, governments have started to adopt new Artificial Intelligence (AI) technologies to improve the efficiency and effectiveness of public administration. Automated Decision-making Systems (ADS), for example, can help governmental agencies with various tasks such as deciding on tax assessment and student finance. In some of those fields, the citizens' stakes are high. Therefore, it is of great importance that those (partly) automated decision systems are transparent about their reasoning mechanisms and carefully explain their decisions. This study focuses on the improvement of explanations in governmental agencies' communications regarding (partly) automated decisions. Consequently, the main research question for this study is: HOW CAN GOVERNMENTAL AGENCIES IMPROVE THEIR DIGITAL COMMUNICATION TOWARDS CITIZENS CONCERNING (PARTLY) AUTOMATED DECISIONS?

Over the years, a substantial number of studies have been published on opening the black boxes of artificial intelligence [17, 40]. Only a few studies suggest procedures on how decisions made by artificial intelligence should be explained in a proper manner [3, 48]. Besides some studies by prestigious consulting firms, little academic research has been done on improving the explainability and transparency of automated decision-making systems in governments [6, 9, 48]. This research is relevant to the scientific community since explainability and transparency, as well as accountability and auditability, are crucial for governmental processes. Those fundamentals should be protected by governments, especially in this digitising era.

In order to answer the main research question, this study is organised as follows. First, this research discusses what kind of AI technologies are currently used by governmental authorities. Second, a widely renewed interest in more responsible and explainable AI is discussed. In the following subsection, an evaluative framework is described to assess the quality of explanations from governmental authorities. In the section that follows, this framework is used to analyse an automated decision letter from a Dutch governmental authority. A conceptual prototype of this letter is set up and subsequently tested. With the help of a survey, the attitude of students towards ADS within the Dutch government is examined. It is also investigated what kind of letter (original or conceptual) the students are most satisfied with. Subsequently, it is investigated how this influences their level of acceptance of the decision and how this influences the likelihood of objection or appeal. After that, the limitations of this study are discussed and directions for future research are given. Finally, the most important findings of this research are summarised and the research question is answered to provide recommendations for a digital communication strategy for governmental authorities.

1.1 Governments adopting Artificial Intelligence

AI-based systems have been used for decision-making purposes since the mid-1980s. For instance, in the Netherlands, the Dutch Tax and Customs Administration has used so-called expert systems for various tasks. These include the calculation of the appropriateness of Company Tax constructions according to Company Tax Law and International Treaties, and the risk assessment of tax returns by entrepreneurs using the KASSA system [41]. Most of these successful systems were rule-based systems, with the 'rules' elicited from (legal) experts. In those days, knowledge engineers, the name used for the developers of those AI-applications, mainly focused on getting the complex task done rather than on including transparent references to the legal resources relevant to the assessment and decision-making process at hand. Since the knowledge contained by these systems was the knowledge of (legal) experts, these systems were also called 'expert systems'.


Nowadays, expert systems are widely adopted by public administrations [1, 16]. The primary purpose for the government to invest in those systems, which these days hardly anybody would call AI anymore, is to provide better services and improve the effectiveness and efficiency of public administration.

Many successful AI-applications are knowledge-based systems, i.e. they contain explicit knowledge representations of the knowledge domain at hand; they are symbolic systems as they use symbols to represent domain concepts and have explicit reasoning mechanisms. These systems can be implemented as rule-based systems or case-based systems; in both cases the knowledge contained in these expert systems is limited to a specific knowledge domain. Expert systems can be realised using various programming languages or environments that were called knowledge engineering shells in the nineteen-eighties. Many expert systems have been built in logic programming languages such as Prolog and functional programming languages such as Lisp, but also using specific knowledge engineering shells that were popular in those days, such as AION, KBMS and others [41]. Expert systems have been developed to give advice as well as to support decision-making. Many of those decision-support systems make use of an automated rule-based reasoning mechanism that applies predetermined rules to come to a specific decision. This method is still one of the main components and a major example of symbolic AI, an approach which uses so-called production rules, such as if-then statements, to derive a particular conclusion from certain facts [18, 44]. With the increase of computer power, sheer unlimited data availability, and the boost of the internet, new AI-technologies have emerged and become popular. Particularly sub-symbolic AI technologies, known under various names such as machine learning, deep learning, and neural nets, became popular again in the 21st century [25]. Contrary to symbolic AI, which is typically connected to deductive approaches, sub-symbolic AI is typically connected to inductive approaches. The latter focuses on learning systematic patterns from the data and then applying those learned patterns to new input to determine the appropriate output [24].
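To make the symbolic approach concrete, the sketch below shows a minimal forward-chaining evaluation of if-then production rules in Python. It is only an illustration: the facts, rule contents and names are hypothetical assumptions, not taken from any system discussed in this study.

```python
# Minimal sketch of symbolic, rule-based reasoning with if-then production rules.
# The facts and rules below are hypothetical illustrations, not from a real ADS.

facts = {"lives_with_parents": True, "enrolled": True}

# Each rule: (condition over the facts, conclusion to add when the condition holds)
rules = [
    (lambda f: f.get("enrolled"), ("eligible_for_loan", True)),
    (lambda f: f.get("eligible_for_loan") and f.get("lives_with_parents"),
     ("loan_category", "living_at_home")),
    (lambda f: f.get("eligible_for_loan") and not f.get("lives_with_parents"),
     ("loan_category", "living_away")),
]

def forward_chain(facts, rules):
    """Apply rules until no new facts can be derived (naive forward chaining)."""
    derived = dict(facts)
    changed = True
    while changed:
        changed = False
        for condition, (key, value) in rules:
            if condition(derived) and derived.get(key) != value:
                derived[key] = value
                changed = True
    return derived

print(forward_chain(facts, rules))
# {'lives_with_parents': True, 'enrolled': True,
#  'eligible_for_loan': True, 'loan_category': 'living_at_home'}
```

Because every derived conclusion can be traced back to an explicit rule, such systems lend themselves to explanation far more readily than sub-symbolic models.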

While public administrations have been using various symbolic AI-technologies for several decades, sub-symbolic AI is used for various tasks as well. Since the end of the nineteen-nineties, machine vision methods have been used for various pattern recognition tasks, including the recognition of handwritten addresses on envelopes [29].

Research in AI has continued since its early dawn at the Dartmouth Conference of 1956, and various new technologies have emerged since then. Computational power has increased incredibly, and AI-technologies are now used all over the world by various organisations, including governmental agencies, to support a wide variety of tasks. Public administrators have used them, amongst others, to improve the efficiency of administration [22], to optimise traffic flows [28], to support tax assessment [8] and to prevent crime [38]. Within the Netherlands, governmental agencies and public administrations were amongst the early AI adopters. Nowadays, the majority of governmental agencies use some sort of AI to support their administrative tasks and decision making [1]. Typical examples of ADS can be found at the Dutch Tax Administration, which uses an ADS to calculate taxes to be paid, and at the Education Executive Agency, which uses an ADS to decide on the amount of students' monthly loans [11, 12]. The use of AI in fields where the stakes are high comes with some worries.

1.2 Challenges of AI

As stated, AI has the potential to improve the efficiency and effectiveness of governmental authorities. Recently, however, some negative consequences of adopting AI technologies in various fields have also become apparent. One example of a system that became infamous for the bias in its predictions is the COMPAS system, used for predicting the likelihood of recidivism for criminals in the United States. Researchers showed that this system is biased against African Americans [2]. Introducing a bias against a specific group within society could lead to more segregation and thereby decreasing opportunities for that specific group. As a result, the system will produce a self-fulfilling prophecy [43].

Also, Caliskan et al. [4] reported on gender bias in Google Translate. Google's machine-learning model seems to have adopted a specific cultural bias. As a result, it translates foreign languages into gender-stereotyped sentences. In Turkish, a gender-neutral pronoun "o" is used. In Google Translate, the sentence "o bir doktor" is translated as "he is a doctor", while the sentence "o bir hemşire" is translated as "she is a nurse". This shows that the system is biased in a particular manner.

The main use of AI-tools by administrations is for ADSs supporting their administrative tasks. While these tasks may vary per governmental agency, one commonality of the ADSs used is that they typically contain some normative reasoning, assessing whether something is allowed or disallowed, and in many cases some form of calculation, e.g. the amount of a student loan, taxes due, or a fine. Governmental agencies communicate these decisions to citizens on paper, but more and more by digital means. According to law, Dutch citizens are still entitled to receive official communication in the traditional way, but people are encouraged to choose electronic media.

In conclusion, the adoption of AI by governments and their public administrations has enormous potential benefits for both governments and their citizens. Although studies on the issue are limited, there is some evidence indicating that governmental authorities operate more efficiently and effectively by applying AI-technologies [22]. The improvement of efficiency may, however, come with some costs, as previously discussed.

Ever since the introduction of AI-technologies, people have feared the lack of human touch and empathy, the lack of transparency and unfairness when smart AI-components replace the human in the loop [38]. In order to be able to trust organisations to take (legally) justified decisions, these decisions, when produced by AI applications, need to be explained and argued for in such a way that the persons subjected to those decisions at least understand what the decision is based upon.

The main challenge addressed in this study is providing citizens with insight into the reasoning mechanisms of AI-algorithms. This is needed in order to check their correctness, fairness, normative compliance and sensitivity to potential biases in their judgements.

1.3 A Renewed Interest in Explainable AI

As discussed previously, widely used AI-technologies, and particularly sub-symbolic AI-systems that 'learn' from data, are vulnerable to bias. In most cases the models are hard to understand, which has raised the need for explainability of these


technologies, hence Explainable AI (XAI). Even for the experts who developed the algorithms, it can be difficult to understand how the algorithm comes to its conclusions, due to the characteristics of these sub-symbolic AI-technologies. It must be mentioned that issues like bias in AI-based decision-making and the misuse of data, such as the recent case of Facebook and Cambridge Analytica, have raised the need for an even broader concept, Responsible AI. XAI can be perceived as one aspect of Responsible AI, although responsibility could refer both to the AI-technology itself as well as to the developers and organisations exploiting these technologies.

The call for XAI has become louder after a few scandals, and it goes without saying that governmental agencies in particular, which deploy AI to support their tasks, have to meet the traditional government requirements for explainability, transparency, accountability and auditability [6].

In order to protect some fundamental social values, the Dutch Council of State published a report in 2018 on the influence of new technologies on constitutional relations [36]. The Council of State is the highest general administrative court and advises the government and Parliament on legislation. With this advice, the Council primarily aims to protect the citizen against a changing constitutional relation. The Council mentions that in the current situation, where automated decision-making processes are being applied, Dutch citizens cannot always determine which decision-rules and what data are used for a specific decision. The report furthermore mentions that it might be very challenging for some people to take part in the digital society. With respect to the first issue, the Council advises the government to pay closer attention to the motivation of their automated decisions. They demand that it should be clear which decision-rules (algorithms) and data the governmental authority used for a specific decision. Furthermore, it should be made clear which data are taken from other governmental authorities. Moreover, the government should be approachable for every individual for useful (online) contact.

The Council of State has concluded that AI is significantly affecting the relation between government and citizen. They urge governmental agencies to make the explainability of the reasoning processes of their systems their primary objective. These AI-systems should present their results, and explanations thereof, in such a manner that the explanation is understandable for the receiver of the decision. The Council of State is not the only body that has voiced serious concerns about the explainability of automated decision-making. In the Netherlands, as well as in other European member states and many other countries, XAI is high on the agenda, although explainability itself is not an entirely new topic. Recently, however, a renewed interest in explainability can be observed as one of the topics related to XAI that emerged over the past couple of years [5].

The interest in explainability in Europe can also be explained by the fact that it is addressed in the General Data Protection Regulation (GDPR) [14], which has applied since 25 May 2018 in all European member states. The right to privacy is a fundamental right, and the GDPR takes privacy in the digital society very seriously. Besides protecting citizens' privacy, the GDPR also includes Article 22 on "Automated individual decision-making, including profiling". This article forces organisations to be transparent about the decision-making process of their algorithms.

2 LITERATURE REVIEW

As stated in the previous section, the need for explainable AI is not an entirely new topic; it has been addressed in many reports and academic papers and is discussed at plenty of conferences, such as those of ACM's CHI community [23, 30, 42]. The increased popularity of sub-symbolic AI has simply put the topic back on the agenda.

2.1 Why Explanations Matter

One key part of XAI is the explanation itself. The Oxford English Dictionary defines EXPLANATION as: "1) A statement or account that makes something clear 2) A reason or justification given for an action or belief" [13]. Therefore, an explanation mainly aims to answer the how and why questions, which can be useful to clarify or justify the behaviour of an AI agent, respectively [32]. Within our daily lives, explanations are used by humans to share information and to better understand each other. Therefore, explanations lead to better acceptance of specific statements [39]. Over the years, studies from various disciplines have suggested that providing explanations of the mechanisms of AI systems improves the acceptance of the user with regard to the decisions, conclusions and recommendations of those systems [19, 32, 42, 45, 49]. As a result, systems that provide better explanations of their reasoning will improve citizens' acceptance of the outcomes of those systems. Other studies suggest that explanations from AI systems help to acquire or maintain the user's trust in the accuracy of those systems [7, 10, 29, 30, 35].

To conclude, explanations are used to clarify or justify the behaviour of a system. Exposing the reasoning and data behind an outcome leads to better acceptance of AI-based decisions by the user. For governments, this might also mean that the chance that a citizen will object to or appeal against a decision decreases. Eventually, one can expect that there will be more trust in the government's AI systems when good explanations for their decisions are provided. Therefore, to improve the satisfaction of the citizen, it is essential to identify the characteristics that make an explanation a good explanation.

2.2 Explaining Good Explanation

Research into explanations has a long history. Early examples of research on this subject include topics such as logic, causality and human discourse [15, 47]. Related work can be found in various areas such as philosophy and psychology. Based on earlier studies, an evaluative framework was developed that enables the evaluation of the quality of explanations. In the literature, several criteria have been described that can be used to determine the satisfaction with an explanation. The framework presented in this paper includes those criteria that are most frequently mentioned and extensively discussed in the cognitive science and AI literature. Six primary quality criteria for explanations were identified:

The first quality criterion for explanation is called EXTERNAL COHERENCE [46]. Some researchers suggest that the likelihood of acceptance of a decision increases when the explanation is consistent with one's former beliefs [31]. This means that explanations should be compatible with what the reader already knows in the specific context at hand [50].


The second quality criterion is INTERNAL COHERENCE. This concept refers to how well the several elements of an explanation fit together [50]. There should be a logical relation between propositions to improve the completeness of the explanation and improve the perceived understanding [34, 46].

The third quality criterion is SIMPLICITY. Two studies tested Thagard's theory of Explanatory Coherence [46] and found that people preferred explanations that invoke fewer causes [27, 37].

The fourth quality criterion is ARTICULATION. One particular study presents several linguistic markers that indicate the clear articulation of a letter [50]. One of the three markers is the number of words used in the explanation. Another is the average word length of the statement. The median word frequency of the text can also be used as an indicator [50]; a small scoring sketch for these markers is given after the list of criteria below.

The fifth quality criterion is CONTRASTIVENESS. This criterion expresses the clarity of the arguments that explain why event P happened rather than event Q [26, 31]. This specific factor also emphasises questions such as what would happen when a particular condition in the process is changed [20].

Finally, some research mentions that the user's satisfaction with an explanation might increase when the possibility for INTERACTION between the explainer and explainee is provided [33]. What is needed for an explanation also depends on what the explainee already knows and, specifically, still wants to know [21]. This criterion proposes new opportunities in the field of Human-Computer Interaction (HCI) [31]. By providing an interactive dialogue, the satisfaction of the user might increase.
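As a small illustration of the articulation markers mentioned above, the sketch below computes word count, average word length and a word-frequency measure for a piece of letter text. It is a minimal sketch under stated assumptions: tokenisation is a simple regular expression, and word frequency is counted within the text itself, whereas the cited work may rely on corpus frequencies; the example sentence is hypothetical.

```python
# Minimal sketch of the three articulation markers discussed above.
# Assumes simple tokenisation; frequencies are counted within the text itself.
import re
from collections import Counter
from statistics import median

def articulation_markers(text: str) -> dict:
    words = re.findall(r"[a-zA-Z']+", text.lower())
    counts = Counter(words)
    return {
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "median_word_frequency": median(counts[w] for w in words),
    }

letter = ("Your living situation has changed. "
          "Therefore your monthly student loan has been recalculated.")
print(articulation_markers(letter))
```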

The evaluative framework described here will be used later to analyse a specific ADS-generated governmental decision. Thereafter, the framework will be used to create an alternative presentation format for that decision, with the main goal of enhancing the citizen's satisfaction with and acceptance of the decision.

2.3 Hypotheses

The literature presented in section 2.1 and the theoretical framework presented in section 2.2 were used to derive the hypotheses. For this study, the following hypotheses were tested:

• H1: There is no relation between one's trust in government and trust in computer systems within the government.

• H2: The citizen's support for the deployment of AI by the government does not vary by case.

• H3: The presentation format of a governmental decision will have no influence on the citizen’s perceived satisfaction about that decision.

• H4: The presentation format of a governmental decision will have no influence on the chance a citizen will accept that decision.

• H5: The presentation format of a governmental decision will have no influence on the citizen’s urge to object to or appeal that decision.

3 CASE STUDY ON STUDENT LOANS IN THE NETHERLANDS

Ideally, this study would focus on an AI application that is representative of the approaches that raised the issue of explainability, in other words deep learning or similar sub-symbolic technologies.

The Council of State noted, however, that problems can still emerge with less complex technologies. The near absence of sub-symbolic tools in administrative practice means that a decision was made to examine explainability in the most popular tools for automated decision-making in the Dutch government, which use symbolic AI.

The case selected is the application that is used to decide on student loans, deployed by the Education Executive Agency (referred to as DUO in Dutch), an administrative agency that falls under the responsibility of the Dutch Ministry of Education. The ADS for deciding on student loans uses symbolic AI. More specifically, it is a rule-based system that contains different rules that are evaluated when deciding on the entitlement of students to financial support.

This tool is representative of other tools used by governmental organisations, including those used by the Dutch Tax and Customs Administration and the Immigration and Naturalisation Service (IND), as discussed in section 1.1 and confirmed by the experts.

A typical decision produced by that tool was analysed using the framework described in section 2.2. Based on that framework, a new presentation format for that decision has been developed.

3.1 Student Loans in the Netherlands

Every Dutch student who is enrolled in university or in higher or secondary vocational education can apply for financial support, which may consist of a loan at attractive conditions. Several factors influence the amount that a student can borrow monthly from the government. These factors include the student's parents' income and the student's living situation; students who live at their parents' home can borrow a smaller amount than students who live on their own. If the situation changes for students who receive financial support from DUO, this may affect their entitlement to the support received. If too much support is received considering the situation, this may lead to penalties and pay-back obligations that can have serious negative consequences for the student involved. This is also risky for DUO, as the debts may never be repaid. Therefore, it is of great importance that DUO receives any change in data describing the situation of the student, rechecks the entitlement to financial support, recalculates the amount of support and informs the students about its decisions.

This example case was used because the decision model used in this AI application is well understood and relatively straightforward, although some legal issues can be found in any regulation. Since this study focuses on XAI, the exact legal domain is less relevant than understanding what makes a good explanation of the decision made in general. Specific legal domains require some domain-specific explanatory elements in addition to the general requirements described in the framework presented in section 2.2.

The decision on student loans is communicated digitally via a letter through DUO's digital platform, MijnDUO. When the authority receives a change of address from a municipality, which maintains the basic registration of addresses in the Netherlands, the rule-based system will automatically notice the change in the student's address and, when necessary, adjust the current student loan accordingly.
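As an illustration of this kind of rule evaluation, the sketch below recalculates a monthly loan amount after a change of address. It is a minimal sketch under stated assumptions: the amounts, rule names and the 'living at home versus living away' distinction are hypothetical and do not reflect DUO's actual decision rules.

```python
# Hypothetical sketch of a rule-based recalculation triggered by an address change.
# Amounts and rules are illustrative only; they are not DUO's actual rules.

MAX_LOAN_AWAY = 500.00   # assumed monthly ceiling when living away from home
MAX_LOAN_HOME = 300.00   # assumed monthly ceiling when living with parents

def recalculate_loan(enrolled: bool, lives_with_parents: bool,
                     requested_amount: float) -> float:
    """Return the monthly loan after applying the (hypothetical) decision rules."""
    if not enrolled:
        return 0.0
    ceiling = MAX_LOAN_HOME if lives_with_parents else MAX_LOAN_AWAY
    return min(requested_amount, ceiling)

# The student in the analysed letter moved back to the parental home,
# so the previously granted amount is capped by the lower ceiling.
old_amount = recalculate_loan(enrolled=True, lives_with_parents=False,
                              requested_amount=450.00)
new_amount = recalculate_loan(enrolled=True, lives_with_parents=True,
                              requested_amount=450.00)
print(old_amount, "->", new_amount)  # 450.0 -> 300.0
```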


3.2 Analysing an Original Disposal

With the help of DUO, an anonymised disposal (letter) was obtained for the analysis (see Appendix C). This letter is addressed to a student who moved from his own house to his parents' house. Moving from one's parents' house to a student house and vice versa is quite common in the Netherlands. The six quality assessment criteria from section 2.2 were used for the evaluation of this letter.

As mentioned in the previous section, it is essential to provide disposals with explanations that are compatible with what the explainee already knows. In this case, the student knows he or she changed living situation and can therefore expect a letter from the authority. The letter communicated by DUO, however, does not state explicitly that the student has moved from his own house back to his parents' house.

Furthermore, several parts of the letter have their own purposes that fit together overall. First, the letter states that a change in address was received and that this impacts the student loan. After that, the authority warns the student that he or she has to end the loan in a timely manner when terminating his or her study in order to avoid fines. This is a general reminder and has nothing to do with the current living situation. Then there is a section where the authority notes the data on which their decision was made; this includes the living situation, education details, and the parents’ contribution. The letter ends with a closing remark that states that this message was generated automatically.

This explanation only uses the change of address as the cause to explain the change in the student's loan. The explanation provides no further information that clarifies and justifies the decision and is therefore evaluated as a relatively simple one. The text, however, is understandable for the target audience and written concisely.

As discussed in section 2.2, it is also essential to show the alternative outcome when motivating a specific decision. This is done in the part of the letter where a comparison is made between the student's old loan and the student's new loan.

The currently used digital letter contains one interactive element. By clicking on What does this mean for you? or On which data is our decision based?, more detailed overviews are presented to the student. This feature makes it easier for the student to check what data the ADS used to make its decision. The information presented then, however, is limited to the data known by DUO and the previous and current amount of the student’s loan.

3.3 Designing a Conceptual Disposal

After analysing the current letter from DUO, an improved version of the presentation format was developed using the principles described in the framework from section 2.2. This conceptual online letter was set up with the main goal of providing better insight into the reasoning mechanisms of the algorithm and the data used to make the decision, and presenting the decision in a clearer way. This interactive letter can be found in Appendix D. The six criteria for explanation, as defined earlier, were used to improve the letter in the following ways. First, the letter contains a section that informs the receiver about the change in address that affects the student's monthly loan (external coherence criterion). The order of messages, one per section, was reorganised to create a better relation between the various parts of the letter (internal coherence criterion). Different from the original letter, the conceptual letter explains the reasoning that led to the decision. As in the original letter, only one cause (change in address) was presented to explain the change in the loan to the student (simplicity criterion). The number of words in the letter was reduced for the conceptual disposal (articulation criterion). Furthermore, the student's old situation and new situation were presented together in a contrastive table (contrastiveness criterion). By offering the user the possibility to learn more about the decision via hyperlinks to more elaborate information, the student's understanding of the situation might increase as well (interaction criterion).

4 METHODOLOGY

In order to answer the research question, nine experts in the fields of business rules management, governmental administration and human-computer interaction were interviewed (Appendix F). These experts were selected using a snowball approach, starting with one expert who happens to be a PhD student at the University of Amsterdam, works for the IND and chairs the working group on eServices at the Manifesto Group, a collaboration between the largest Dutch governmental agencies. Because the main purpose of conducting these interviews was to select a suitable use case, and because of time limitations, neither transcription nor coding of the interviews was part of this study. The interviews with the experts contributed to a better grasp of the current developments in the adoption of both symbolic and sub-symbolic AI technologies by governmental authorities. They also clarified how the relationship between government and citizen has changed in the era of digitisation.

The research method chosen for this study is a case study. The case selected is a decision-making tool used for deciding on student loans provided by DUO. The original and conceptual versions of the presentation format (as described in section 3.2 and section 3.3 respectively) were subjected to an A/B test. The A/B test was included in an online survey using Qualtrics. Half of the subjects received the survey that included version A, the other half version B. Besides questions about the explainability of the presented version, the survey included questions that were used to measure the students’ attitudes towards the use of ADS in the Dutch government.

Chat service WhatsApp was used for contacting around 100 students, being the target audience for the application studied. Some of the students forwarded the questionnaire to other students, resulting in 133 students who completed the survey.

4.1 Outline of the Survey

First, the subjects were shown an introductory text that explained the current situation of AI use by the Dutch government and the purpose of the research.

Thereafter, a five-point Likert scale, ranging from strongly disagree (1) to strongly agree (5), was used to determine the participants' attitudes. The participants were asked to rate how strongly they agreed with specific statements on the use of symbolic AI in government.

Subsequently, participants were asked to evaluate a disposal of an automated decision from DUO. One original disposal was obtained from the agency itself; the other one was a more interactive disposal that was created specifically for this study and included all factors


that, according to theory, would enhance explainability. The participants were randomly assigned to one of the two versions and were then asked questions to survey their satisfaction with the disposal.

Finally, participants were asked to fill in some general information such as gender, age, and education level. The questionnaire ended by thanking the students for their time and effort in filling in the survey.

Before distribution, the survey was checked by three individuals to ensure understandability.

4.2 Participants

For finding subjects for the A/B test and the survey, a convenience sample was taken. The sample selection resulted in 133 subjects responding and completing the survey. The students recruited were enrolled in various universities and colleges in the Netherlands. From the total group, 60 students (45.1%) were female, and 73 students (54.9%) were male. All the participants were aged between 18 and 30, with an average age of 23.46 years (SD = 1.78). Most of the students were currently enrolled in an academic master’s programme (49.6%), followed by academic bachelor students (28.6%), and 14 respondents were enrolled in a bachelor’s programme at a university of applied sciences (10.5%). Additionally, there was one student enrolled in an applied sciences master’s programme (0.8%) and one student from college (0.8%). Thirteen participants noted that they were currently not in school (9.8%). The next section discusses the data preparation and analysis, and the results are then discussed.

5 ANALYSIS AND MAIN FINDINGS

SPSS, version 26, was used for statistical comparison between the two groups and to identify the students’ attitudes towards the use of symbolic AI within the government. The following analyses were performed: ANOVA, one-sample t-test, independent t-test, chi-squared test, post-hoc McNemar test, Shapiro-Wilk test and Mann-Whitney test. The correlation matrix can be found in Appendix B, the regular disposal in Appendix C, the conceptual disposal in Appendix D and the questionnaire itself in Appendix E.
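Although the analyses were run in SPSS, the core normality check and group comparison used later for the A/B test can be reproduced in Python with SciPy. The sketch below is illustrative only: the score arrays are hypothetical stand-ins, not the study's data.

```python
# Sketch of the normality check and group comparison used for the A/B test,
# using SciPy instead of SPSS. The clarity scores below are made up.
import numpy as np
from scipy import stats

clarity_original = np.array([3.2, 3.6, 2.8, 4.0, 3.4, 3.0, 3.8, 2.6])
clarity_conceptual = np.array([4.2, 4.6, 3.8, 4.4, 4.0, 4.8, 3.6, 4.4])

# Shapiro-Wilk: if either group deviates from normality, prefer a
# non-parametric test over the independent t-test.
for name, scores in [("original", clarity_original),
                     ("conceptual", clarity_conceptual)]:
    w, p = stats.shapiro(scores)
    print(f"Shapiro-Wilk {name}: W = {w:.3f}, p = {p:.3f}")

# Mann-Whitney U test comparing the two independent groups.
u, p = stats.mannwhitneyu(clarity_original, clarity_conceptual,
                          alternative="two-sided")
print(f"Mann-Whitney: U = {u:.1f}, p = {p:.4f}")
```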

5.1 Measuring the Students’ Trust

The first question measured the trust of the individual in the Dutch government. There was only one statement used ('I have confidence in the Dutch government') to measure the students' trust in government (Appendix E.2 Q1). The respondents were asked how much they agreed with this statement with the help of a five-point Likert scale ranging from strongly disagree (1) to strongly agree (5).

Additionally, four questions were used to measure the trust of the individual in the use and adoption of computer systems by the government. These questions concerned confidence, worries, level of support and demand for transparency of those systems (Appendix E.2 Q2-Q5). With a Cronbach's alpha of only .592, these questions do not form a reliable, internally consistent scale. Nevertheless, it was decided to use these four items, since the consistency cannot be significantly increased by leaving out one item and the inter-item correlations do not show any problems.
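For reference, Cronbach's alpha for such a set of items can be computed as shown in the minimal sketch below; the response matrix is a hypothetical stand-in for the four survey items, not the collected data.

```python
# Minimal sketch of Cronbach's alpha for a set of Likert items.
# Rows are respondents, columns are items; the values are hypothetical.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

responses = np.array([
    [4, 3, 4, 2],
    [5, 4, 4, 3],
    [2, 2, 3, 2],
    [4, 4, 5, 3],
    [3, 3, 3, 2],
])
print(round(cronbach_alpha(responses), 3))
```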

5.2 Measuring the Students’ Satisfaction

In the initial survey, seven questions were used to measure the concept of clarity of the letter. Of those seven questions, two were removed. The statement 'I expect this is an automated decision.' and the statement 'I prefer an interactive letter.' were not suitable for measuring a person's perceived clarity of a letter.

The other five statements ‘It’s clear what this letter is about.’, ‘This letter contains a clear argumentation.’, ‘The letter is written clearly.’, ‘It is clear to me what data are used for this decision.’ and ‘It is clear to me what reasoning is used to come to the decision.’ were used to identify the perceived clarity of the respondent (see Appendix E.4). With a Cronbach’s alpha of .855, the clarity of the disposal can be measured reliably and consistently with those five items.

5.3 Sampling Independence

As explained before, the A/B test contrasting the original with the redesigned letter was done with 133 students. This group was split into one group of 68 persons who received the survey on the original letter and 65 who received the survey on the conceptual letter.

A check for sampling independence between the two groups was then performed. No difference in gender (χ²(1) = 0.013, p = .910), age (t(131) = 0.662, p = .509) or education level between the groups (χ²(5) = 5.161, p = .397) was found.
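As a rough illustration, such an independence check between group assignment and a categorical variable like gender corresponds to a chi-squared test on a contingency table; the counts below are hypothetical, not the study's sample.

```python
# Sketch of a chi-squared independence check between group assignment and gender.
# Counts are hypothetical; the study used SPSS on its own sample.
import numpy as np
from scipy.stats import chi2_contingency

#                 original  conceptual
table = np.array([[31, 29],    # female
                  [37, 36]])   # male
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.3f}, p = {p:.3f}")
```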

5.4 Results

In this section, the results of the research are examined. To address the first hypothesis, a correlation between trust in government and trust in computer systems within government was found (F(1,131) = 14.137, p < .0005, R² = .097, b = 0.333, t(131) = 3.760, p < .0005). For this reason, hypothesis 1 (There is no relation between one's trust in government and trust in computer systems within the government) can be rejected.
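A simple regression of trust in computer systems on trust in government, analogous to the test reported above, could be run with SciPy as in the sketch below; the score arrays are illustrative and not the survey data.

```python
# Sketch of a simple linear regression (trust in computer systems ~ trust in government),
# analogous to the test reported above. The scores are hypothetical.
import numpy as np
from scipy import stats

trust_government = np.array([4, 3, 5, 2, 4, 3, 5, 2, 3, 4])
trust_systems = np.array([3.5, 3.0, 4.25, 2.5, 3.75, 2.75, 4.0, 2.25, 3.25, 3.5])

result = stats.linregress(trust_government, trust_systems)
print(f"b = {result.slope:.3f}, R^2 = {result.rvalue ** 2:.3f}, p = {result.pvalue:.4f}")
```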

Furthermore, the respondents were asked for which tasks they support the deployment of computer systems for governmental use. Students stated that they support the use of computer systems for the optimisation of traffic flows (91.7%), the calculation of student finance (84.2%) and the calculation of tax assessment (80.5%). Only 34.6% of the students are of the opinion that automated systems should be used for the rejection or granting of visas. Cochran's Q shows that agreement ratios for these four purposes are not identical (Cochran's Q(3) = 144.437, p < .0005). Post-hoc McNemar tests with Bonferroni correction showed that the students' support for automated systems for visa decisions is significantly lower than for the three other variables, while those three (optimising traffic flows, calculation of student finance and calculation of tax assessment) do not differ significantly from one another. Therefore, the students' support for the deployment of AI in government varies by use. Hypothesis 2 (The citizen's support for the deployment of AI by the government does not vary by case) can be rejected.
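The Cochran's Q test with post-hoc McNemar comparisons could be reproduced roughly as sketched below, assuming statsmodels' cochrans_q and mcnemar helpers are available; the binary support ratings are fabricated for illustration only.

```python
# Sketch of the Cochran's Q / post-hoc McNemar procedure on binary support ratings,
# assuming statsmodels' cochrans_q and mcnemar helpers. Responses are made up.
import numpy as np
from itertools import combinations
from statsmodels.stats.contingency_tables import cochrans_q, mcnemar

# Columns: support (1) or not (0) for traffic, student finance, tax, visa decisions.
responses = np.array([
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
])

q = cochrans_q(responses)
print(f"Cochran's Q = {q.statistic:.3f}, p = {q.pvalue:.4f}")

# Pairwise McNemar tests with Bonferroni correction over the six comparisons.
pairs = list(combinations(range(responses.shape[1]), 2))
for i, j in pairs:
    table = np.zeros((2, 2))
    for a, b in zip(responses[:, i], responses[:, j]):
        table[a, b] += 1
    res = mcnemar(table, exact=True)
    print(f"items {i} vs {j}: p = {min(res.pvalue * len(pairs), 1.0):.4f} (Bonferroni)")
```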

Since the dependent variables do not follow a normal distribution in either condition (Original Disposal: Shapiro-Wilk W(68) = .941, p = .003, Conceptual Disposal: Shapiro-Wilk W(65) = .936, p = .002), the t-test cannot be used. Therefore, a non-parametric Mann-Whitney U test is preferred to analyse the difference between the clarity of the two letters. One of the major findings of this study is


that students are more satisfied with the conceptual disposal than with the original disposal (U = 1082.5, z = 5.112, p < .0005). Furthermore, respondents also agreed with the statement 'I prefer an interactive (clickable) letter.' With an average score of 3.80 on the five-point Likert scale, this was significantly higher than the neutral value of 3.0 (t(132) = 10.805, p < .0005). Therefore, this study finds that students are more satisfied with a more interactive letter than with the original letter from DUO. For this reason, hypothesis 3 (The presentation format of a governmental decision will have no influence on the citizen's perceived satisfaction about that decision) can be rejected.
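The comparison of a mean Likert score against the neutral midpoint of 3.0, as reported above, corresponds to a one-sample t-test; a minimal sketch with hypothetical scores:

```python
# Sketch of a one-sample t-test of Likert scores against the neutral value 3.0.
# The scores are hypothetical.
import numpy as np
from scipy.stats import ttest_1samp

preference_scores = np.array([4, 5, 3, 4, 4, 5, 3, 4, 2, 5, 4, 4])
t, p = ttest_1samp(preference_scores, popmean=3.0)
print(f"t({len(preference_scores) - 1}) = {t:.3f}, p = {p:.4f}")
```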

Furthermore, it is shown that the letter type (original or conceptual) has a significant influence on the acceptance of the decision. Respondents agree significantly more easily with the statement 'The content of the letter convinces me to agree with the decision.' when receiving the conceptual letter (U = 1550, z = 3.331, p = .001). Therefore, the letter type, the presentation format of the governmental decision, has a significant influence on the acceptance of the decision by the student. A clearer explanation will therefore lead to a greater acceptance of the decision. Hypothesis 4 (The presentation format of a governmental decision will have no influence on the chance a citizen will accept that decision) can be rejected.

No significant difference between the two letter conditions was found in the urge to object to or appeal the decision (U = 1967, z = 1.186, p = .235). Therefore, hypothesis 5 (The presentation format of a governmental decision will have no influence on the citizen's urge to object to or appeal that decision) cannot be rejected. However, the explanation in the conceptual letter was found to be more beneficial for the support and argumentation of a potential objection or appeal (U = 1577, z = 2.979, p = .003). Also studied was the extent to which the students agreed with the statement that a good explanation of the decision would help to reduce the chance of objection or appeal. With an average score of 4.02 on the five-point Likert scale, this is significantly higher than neutral (which has the value 3.0) (t(132) = 13.319, p < .0005). Therefore, it can only be stated that the citizen's willingness to object to or appeal the decision might be reduced by offering a better explanation.

6 CONCLUSION

As stated in the introduction, the main goal of this study was to determine the current use of AI applications by the Dutch government and how this affects the interaction between the government and its citizens. The findings of this research provide insights into the current adoption of AI by governments worldwide and in the Netherlands specifically. The adoption of these new technologies brings challenges such as bias embedded in the algorithms exploited and a transformation of the way governments interact with their citizens. As a result, a renewed interest in XAI has emerged. This study aims to contribute to this growing area of research by exploring the principles of explanations, and it offers a framework to assess the quality of a given explanation. The analysis of one of the Dutch disposals shows that the government is already doing a good job with a clear, interactive and straightforward letter. However, the way the government currently interacts with citizens can still be significantly improved. The following provides an answer to the main question of this study:

How can governmental agencies improve their digital communica-tion towards citizens concerning (partly) automated decisions?

Several criteria that can improve the quality of an explanation are discussed in section 2.2. This study finds that the citizen's satisfaction and perceived clarity can be increased by providing a letter that is more understandable and uses a better explanation. In order to achieve a better understanding, the letter should be compatible with the existing knowledge of the citizen; the parts of the letter have to fit together and use as few causes as possible; and the letter should be written clearly, provide contrastive information and offer the opportunity to interact.

Several conclusions can be drawn from the quantitative study. A significant relation between one’s trust in government and trust in computer systems within the government was found.

Students are less willing to have AI take over the rejection or grant of visas compared to other fields such as traffic flow optimisation or the calculation of tax assessment and student finance. The citizen’s support for the deployment of AI by the government varies per use or case, and more research is necessary to better understand why.

This research demonstrates that students will be more satisfied with a more interactive letter than the current original letter from DUO.

Furthermore, it can be concluded that a clearer explanation of the decision will lead to a greater likelihood of accepting that decision, which also confirms the previous studies discussed in section 2.1. Therefore, governments can increase the acceptance rate of citizens by improving the clarity of their explanations, and this can create a new field of interest in explanation optimisation.

Lastly, the study found that letter type has no significant influence on the urge to object to or appeal the governmental decision. However, a good explanation of an automated governmental decision was found to help reduce the citizen's willingness to object to or appeal that decision.

Despite the exploratory nature of the current study, it indicates that the change in the constitutional relation between citizens and the government as reported by the Council of State may be affected by the quality of explanations provided by ADS.

The study also reconfirms that, while investments in AI supporting various tasks of public administrations are mainly driven by the need to improve efficiency and effectiveness, it is important to keep in mind that explainability, transparency, accountability and auditability are essential to governmental processes.

7 DISCUSSION

There are several limitations that need to be addressed for this study. First, this study mainly focuses on the adoption of rule-based systems within the Dutch government, a typically symbolic technology. As discussed earlier, sub-symbolic AI technologies have become more popular as well, and problems with these technologies raised the renewed interest in XAI. Therefore, it would be more relevant to research the adoption of sub-symbolic AI within the Dutch government. However, it was found that very few governmental agencies within the Netherlands make use of sub-symbolic technologies. Governmental agencies such as the Dutch Tax and Customs Administration (De Belastingdienst) said they were using sub-symbolic AI in various fields such as the prediction of fraud,


and other agencies are either exploiting or considering the use of such technologies for similar purposes. However, this authority did not want to provide materials on their reasoning mechanisms for this research because they were perceived to be confidential (intended lack of transparency). Therefore, a decision was made to collaborate with DUO, which provided materials on the reasoning mechanisms of their algorithms.

Furthermore, what kinds of technologies the governmental agencies in the Netherlands use and when they started to deploy certain AI applications is not documented. This makes it difficult to determine the number of objections to and appeals of decisions made by these tools in order to test their actual efficiency and effectiveness. In future research, the effects of the deployment of these AI applications in terms of the number of objections and appeals should be studied in more detail. This will provide information that helps the agencies to check to what extent AI technologies really improve their efficiency.

One more specific area that could be improved is the survey itself. Only one question was used to measure students' trust in government. Future research should include more questions to improve the reliability of this variable. For the measurement of trust in computer systems within government, four specific questions were used. However, with a Cronbach's alpha of .592, this cannot be seen as a very reliable scale.

The case study, being a quite simple example of symbolic AI technologies, may also have impacted the current results. It is yet unclear if similar results would be found with a case presenting an application using sub-symbolic AI technologies.

Also, for this study, some interviews with experts were conducted to gain a better understanding of the current adoption of AI technologies within Dutch governmental agencies. Those interviews were not transcribed, coded and analysed in order to validate the framework offered in section 2.2. This could be done in future research.

The case used in this study justified only including students within the Netherlands. The sample consisted of 133 respondents, and their level of education was not equally distributed. In addition, with a small sample size, caution must be applied, as the findings might not be transferable to the whole population of students. To test generalisability, further research using cases in multiple application domains is needed. Consequently, such a study should not only include students but also other target groups. This would be helpful to acquire more insight into people's preferences in the context of communicating decisions. In this study, how the visual layout of the letter influences the students' satisfaction was not examined. Further investigation and experimentation into explanation optimisation is strongly recommended. Future studies could use a similar experimental setup. Future studies in this field should also focus on sub-symbolic AI technologies, as they are the most problematic in terms of explainability.

ACKNOWLEDGEMENTS

Special thanks go to prof. dr. Tom van Engers, my thesis supervisor, for his professional guidance and helpful support. I would also like to thank dr. Jacobijn Sandberg as the second reader and examiner of this thesis. I would like to thank the Canadian Research Council for sponsoring the ACT Project. Lastly, I would like to acknowledge the experts who were involved in this study: Robert van Doesburg, Cees-Jan Visser, Matthijs van Kempen, Giovanni Sileno, Marlies van Eck, Mark Neerincx, Diederik Dulfer, Vincent Hoek and Koen Smit.

REFERENCES

[1] AINED. 2018. AI voor Nederland: vergroten, versnellen en verbinden. https://www.mkb.nl/sites/default/files/aivnl_20181106_0.pdf

[2] Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2016. Machine Bias. ProPublica (2016). https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

[3] David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. 2010. How to explain individual classification decisions. Journal of Machine Learning Research 11, Jun (2010), 1803–1831.

[4] Aylin Caliskan, Joanna J Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356 (2017), 183–186. https://doi.org/10.1126/science.aal4230

[5] Alex Campolo, Madelyn Sanfilippo, Meredith Whittaker, and Kate Crawford. 2017. AI Now 2017 Report. AI Now Institute at New York University (2017).

[6] Miguel Carrasco, Steven Mills, Adam Whybrew, and Adam Jura. 2019. The Citizens Perspective on the Use of AI in Government. Boston Consulting Group (2019).

[7] Jessie Y Chen, Katelyn Procci, Michael Boyce, Julia Wright, Andre Garcia, and Michael Barnes. 2014. Situation awareness-based agent transparency. Technical Report.

[8] Bjarne Corydon, Vidhya Ganesan, Martin Lundqvist, Emma Dudley, Diaan-Yi Lin, Matteo Mancini, and Jonathan Ng. 2016. Transforming Government Through Digitization. McKinsey & Company (2016). https://www.mckinsey.com/~/media/McKinsey/Industries/PublicSector/OurInsights/Transforminggovernmentthroughdigitization/Transforming-government-through-digitization.ashx

[9] Anusha Dhasarathy, Sahil Jain, and Naufal Khan. 2019. When governments turn to AI: Algorithms, trade-offs, and trust. McKinsey & Company (2019). https://www.mckinsey.com/industries/public-sector/our-insights/when-governments-turn-to-ai-algorithms-trade-offs-and-trust

[10] Derek Doran, Sarah Schulz, and Tarek R Besold. 2017. What does explainable AI really mean? A new conceptualization of perspectives. arXiv preprint arXiv:1710.00794 (2017).

[11] Dutch Digital Government. 2018. NL DIGIbeter. Digital Government Agenda (2018). https://www.nldigitalgovernment.nl/document/digital-government-agenda-2/

[12] Dutch Digital Government. 2019. NL Digitaal. Digital Government Agenda (2019). https://www.nldigitalgovernment.nl/document/data-agenda-government/

[13] English Oxford Dictionaries. 2019. Definition of 'explanation' in English. (2019). https://en.oxforddictionaries.com/definition/explanation

[14] European Union. 2016. Regulation 2016/679: General Data Protection Regulation. Official Journal of the European Communities (2016), 1–88. https://doi.org/pri/en/oj/dat/2003/l

[15] Andrea Falcon. 2008. Aristotle on Causality. In Stanford Encyclopedia of Philosophy.

[16] Joseph C Giarratano and Gary Riley. 1998. Expert Systems. PWS Publishing Co.

[17] David Gunning. 2017. Explainable Artificial Intelligence. Defense Advanced Research Projects Agency (DARPA) (2017).

[18] John Haugeland. 1985. Artificial Intelligence: The Very Idea. MIT Press.

[19] Jonathan L Herlocker, Joseph A Konstan, and John Riedl. 2000. Explaining collaborative filtering recommendations. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work. ACM, 241–250.

[20] Denis J Hilton. 1990. Conversational processes and causal explanation. Psychological Bulletin 107, 1 (1990), 65.

[21] Robert Hoffman, Shane Mueller, Gary Klein, and Jordan Litman. 2018. Metrics for Explainable AI: Challenges and Prospects. XAI Metrics (2018).

[22] Vincent Homburg. 2008. Understanding e-government: Information systems in public administration. Routledge.

[23] Eric Horvitz, David Heckerman, Bharat Nathwani, and Lawrence Fagan. 1986. The use of a heuristic problem-solving hierarchy to facilitate the explanation of hypothesis-directed reasoning. In Proceedings of Medinfo, Washington, DC. 27–31.

[24] Troy D Kelley. 2003. Symbolic and Sub-Symbolic Representations in Computational Models of Human Cognition: What Can be Learned from Biology? Theory & Psychology 13, 6 (2003), 847–860. https://doi.org/10.1177/0959354303136005

[25] Henry Lieberman. 2016. Symbolic vs. Subsymbolic AI. MIT Media Lab (2016). http://futureai.media.mit.edu/wp-content/uploads/sites/40/2016/02/Symbolic-vs.-Subsymbolic.pptx_.pdf

[26] Peter Lipton. 1990. Contrastive explanation. Royal Institute of Philosophy Supplements 27 (1990), 247–266.


[27] Tania Lombrozo. 2007. Simplicity and probability in causal explanation. Cognitive Psychology 55, 3 (2007), 232–257. https://doi.org/10.1016/j.cogpsych.2006.09.006

[28] Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li, and Fei-Yue Wang. 2014. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2014), 865–873.

[29] Hila Mehr. 2017. Artificial Intelligence for Citizen Services and Government. Harvard Ash Center Technology & Democracy (2017). https://ash.harvard.edu/files/ash/files/artificial_intelligence_for_citizen_services.pdf

[30] Joseph E Mercado, Michael A Rupp, Jessie Y C Chen, Michael J Barnes, Daniel Barber, and Katelyn Procci. 2016. Intelligent Agent Transparency in Human-Agent Teaming for Multi-UxV Management. Human Factors 58, 3 (2016), 401–415. https://doi.org/10.1177/0018720815621206

[31] Tim Miller. 2018. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence (2018).

[32] Robert Neches, William Swartout, and Johanna Moore. 1985. Enhanced Maintenance and Explanation of Expert Systems Through Explicit Models of Their Development. IEEE Transactions on Software Engineering SE-11, 11 (1985), 1337–1351. https://doi.org/10.1109/TSE.1985.231882

[33] Ingrid Nunes and Dietmar Jannach. 2017. A systematic review and taxonomy of explanations in decision support and recommender systems. User Modeling and User-Adapted Interaction 27, 3-5 (2017), 393–444. https://doi.org/10.1007/s11257-017-9195-0

[34] Nancy Pennington and Reid Hastie. 1993. The story model for juror decision making. Cambridge University Press Cambridge.

[35] Wolter Pieters. 2011. Explanation and trust: what to tell the user in security and AI? Ethics and information technology 13, 1 (2011), 53–64.

[36] Raad van State. 2018. Ongevraagd advies over de effecten van de digitalisering voor de rechtsstatelijke verhoudingen. Kamerstukken II 2017/18, 26643, nr. 557 (2018). https://www.raadvanstate.nl/adviezen/zoeken-in-adviezen/tekst-advies. html?id=13065

[37] Stephen J Read and Amy Marcus-Newhall. 1993. Explanatory coherence in social explanations: A parallel distributed processing account. Journal of Personality and Social Psychology 65, 3 (1993), 429.

[38] Melanie Reid. 2017. Rethinking the Fourth Amendment in the Age of Supercom-puters, Artificial Intelligence, and Robots. West Virginia Law Review 119 (2017), 863–890.

[39] Thomas R Roth-Berghofer and Jorg Cassens. 2005. Mapping goals and kinds of explanations to the knowledge containers of case-based reasoning systems. In International Conference on Case-Based Reasoning. Springer, 451–464. [40] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. 2017. Explainable

artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296 (2017).

[41] Robert Victor Schuwer. 1993. Het nut van kennissystemen. https://doi.org/10. 6100/IR394707

[42] Edward Hance Shortliffe and Bruce G Buchanan. 1985. Rule-based expert systems: the MYCIN experiments of the Stanford Heuristic Programming Project. Addison-Wesley Publishing Company.

[43] Giovanni Sileno, Alexander Boer, and Tom van Engers. 2018. The Role of Normware in Trustworthy and Explainable AI. (2018). http://arxiv.org/abs/1812. 02471

[44] Paul Smolensky. 1987. Connectionist AI, symbolic AI, and the brain. Artificial Intelligence Review 1, 2 (1987), 95–109. https://doi.org/10.1007/BF00130011 [45] Frode Sørmo, Jorg Cassens, and Agnar Aamodt. 2005. Explanation in case-based

reasoning-perspectives and goals. Artificial Intelligence Review 24, 2 (2005), 109–143.

[46] Paul Thagard. 1989. Explanatory coherence. Behavioral and brain sciences 12, 3 (1989), 435–467.

[47] Stephen E. Toulmin. 1958. The Uses of Argument. Cambridge University Press (1958). http://bilder.buecher.de/zusatz/22/22199/22199087_vorw_1.pdf [48] Matthijs van Kempen. 2019. Motivering van automatisch genomen besluiten.

Knowbility (2019).

[49] Richard Ye and Paul Johnson. 1995. The Impact of Explanation Facilities on User Acceptance of Expert Systems Advice. MIS Quarterly 19, 2 (1995), 157–172. https://doi.org/10.2307/249686

[50] Jeffrey C. Zemla, Steven Sloman, Christos Bechlivanidis, and David A. Lagnado. 2017. Evaluating everyday explanations. Psychonomic Bulletin and Review 24, 5 (2017), 1488–1500. https://doi.org/10.3758/s13423-017-1258-z

(11)

APPENDICES

A Framework

Criterion            Description
External coherence   Is the explanation compatible with existing knowledge and beliefs?
Internal coherence   Do the several parts of the explanation relate to each other in a sound way?
Simplicity           Is the number of causes used sufficient?
Articulation         Is the text written in an understandable manner?
Contrastiveness      Is the alternative outcome shown as well?
Interaction          Is there a possibility to interact between explainer and explainee?

Table 1: Explanation Satisfaction Framework for Governmental Disposals
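For readers who want to apply the framework in a structured way, the sketch below encodes the six criteria of Table 1 as a simple scoring rubric. It is only an illustration: the 1 to 5 rating scale, the unweighted mean, and all identifiers are assumptions of this sketch, not part of the framework itself.

```python
from dataclasses import dataclass, field

# Criteria taken from Table 1; the comments paraphrase the descriptions above.
CRITERIA = [
    "external_coherence",   # compatible with existing knowledge and beliefs?
    "internal_coherence",   # do the parts of the explanation relate soundly?
    "simplicity",           # is the number of causes used sufficient?
    "articulation",         # is the text written in an understandable manner?
    "contrastiveness",      # is the alternative outcome shown as well?
    "interaction",          # can explainer and explainee interact?
]

@dataclass
class ExplanationAssessment:
    """Scores one decision letter on the six criteria (1 = poor, 5 = good)."""
    scores: dict = field(default_factory=dict)

    def rate(self, criterion: str, score: int) -> None:
        if criterion not in CRITERIA:
            raise ValueError(f"Unknown criterion: {criterion}")
        if not 1 <= score <= 5:
            raise ValueError("Score must be on a 1-5 scale")
        self.scores[criterion] = score

    def overall(self) -> float:
        """Unweighted mean over the criteria rated so far."""
        return sum(self.scores.values()) / len(self.scores)

# Illustrative usage: the ratings below are made up, not taken from the study.
assessment = ExplanationAssessment()
assessment.rate("articulation", 4)
assessment.rate("contrastiveness", 5)
print(assessment.overall())  # -> 4.5
```

An assessor could fill in one such object per letter and compare the scores of the regular and the conceptual letter criterion by criterion.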

B Correlation Matrix

                                             M      SD     1       2      3      4       5       6       7
1 Lettertype                                 .49    .502   -
2 Gender                                     .55    .499   .010    -
3 Age                                        23.46  1.782  -0.58   -.089  -
4 Trust in Government                        4.04   .596   -0.11   -.044  .198*  (.592)
5 Trust in Computer-Systems in Government    3.14   .635   .023    -.021  .005   .312**  -
6 Clarity of Disposal                        3.83   .742   .420**  -.016  .073   .173*   .266**  (.855)
7 Agreement with the Decision                3.87   .848   .255**  -.048  .024   .160    .321**  .606**  -

** Correlation is significant at the 0.01 level (2-tailed). * Correlation is significant at the 0.05 level (2-tailed). Values in parentheses on the main diagonal: Cronbach's Alpha.
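The statistics in the matrix are standard: Pearson correlations between the scale scores and Cronbach's Alpha for scale reliability. The snippet below shows, on made-up data, how such values can be computed; the column names and the pandas/numpy-based implementation are assumptions of this sketch, not the exact procedure used in the study.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's Alpha for a set of Likert items (rows = respondents, columns = items)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: 100 respondents, five 5-point Likert items for one scale.
rng = np.random.default_rng(0)
responses = pd.DataFrame(
    rng.integers(1, 6, size=(100, 5)),
    columns=[f"clarity_{i}" for i in range(1, 6)],
)

# Scale score: the mean of the items, as is common for Likert scales.
clarity = responses.mean(axis=1)
lettertype = pd.Series(rng.integers(0, 2, size=100), name="lettertype")

print("alpha:", round(cronbach_alpha(responses), 3))
print("r:", round(clarity.corr(lettertype), 3))  # Pearson correlation, as reported in the matrix
```

On random data the alpha will of course be low; with the study's actual item responses the same two functions would reproduce the kind of reliability and correlation figures shown above.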


C Regular Letter from DUO


D Conceptual Letter from DUO

Figure 2: The conceptual letter based on the regular one, which was retrieved from the Education Executive Agency

Figure 3: The conceptual letter shows on what data the decision is based


Figure 4: The conceptual letter shows which rules are used for the decision

Figure 5: The conceptual letter shows the student’s old loan and the student’s new loan in a contrastive manner
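Figures 2 to 5 suggest that the conceptual letter is essentially a rendering of structured decision data: the data the decision is based on, the rules applied with their legal source, and a contrastive old-versus-new outcome. A minimal sketch of such a machine-readable structure is given below; all field names, amounts and the legal reference are illustrative assumptions and do not reflect DUO's actual data model.

```python
import json

# Hypothetical machine-readable decision letter; every value here is illustrative.
decision_letter = {
    "decision": "Adjustment of the monthly student loan",
    "data_used": {
        "registered_study_programme": "MSc Information Studies",
        "requested_loan_per_month": 300.00,
    },
    "rules_applied": [
        {
            "source": "Wet studiefinanciering 2000 (illustrative reference)",
            "rule": "The monthly loan may not exceed the statutory maximum.",
        }
    ],
    "contrastive_outcome": {
        "old_loan_per_month": 150.00,
        "new_loan_per_month": 300.00,
    },
}

# A front end could render each top-level key as an expandable section of the letter.
print(json.dumps(decision_letter, indent=2, ensure_ascii=False))
```

Rendering each top-level key as a clickable section is what gives the conceptual letter its interactive character: the citizen can drill down from the decision to the underlying data, rules and alternative outcome.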


E Questionnaire (administered in Dutch)

The survey was conducted in the Netherlands over a period of one week, from 29 May to 4 June 2019.

E.1 Intro.

All around us we see Artificial Intelligence (AI) applications. Computer systems increasingly determine how we perceive the world and how we interact with each other.

Various Dutch governmental agencies use computer systems when taking decisions. For example, computer systems calculate the amount of your student finance fully autonomously, without any human intervention.

The goal of this study is to gain insight into students' attitudes towards automated decisions by the government. This survey takes about five minutes of your time. Your answers are anonymous. Thank you very much for your cooperation; click to start the questionnaire.

E.2 Attitude.

Using a 5-point Likert scale, ranging from strongly disagree (1) to strongly agree (5).
(1) I have trust in the Dutch government.
(2) I have trust in decisions that are taken by the government with the help of computer systems.
(3) I am in favour of computer systems in decision-making by the government.
(4) I am concerned about the use of computer systems that take decisions autonomously for the government.
(5) I need transparency about the algorithms and data that are used by the government's computer systems.

E.3 Use-Case Support.

I think that computer systems may be deployed autonomously to: (multiple answers possible)
• Determine tax assessments
• Calculate student finance
• Grant or reject visas
• Optimise traffic flow

The questionnaire will now randomly show either the original letter or the conceptual letter to the respondent.

E.4 Clarity of the letter.

How satisfied are you with this disposal?

Using a 5-point Likert scale, ranging from strongly disagree (1) to strongly agree (5).
(1) It is clear to me what this letter is about.
(2) I think the letter contains clear argumentation.
(3) I think the disposal / letter is clearly written.
(4) It is clear to me which data were used for this decision.
(5) It is clear to me which reasoning was used to arrive at the judgment.
(6) I am of the opinion that this is an automatically taken decision.*
(7) I would appreciate an interactive (clickable) decision.*


E.5 Acceptance of the decision. To what extent do you accept the decision?

Using a 5-point Likert scale, ranging from strongly disagree (1) to strongly agree (5).
(1) The content of the letter convinces me to agree with the decision.
(2) The explanation given in the letter would invite me to contest the communicated decision.
(3) The explanation given in the letter helps me to further substantiate any objections I might have against the decision.
(4) A good explanation of the decision will help to reduce the number of objections lodged.

E.6 Demographics.
(1) What is your gender?
(2) What is your age?
(3) Do you receive student finance from DUO?
(4) Which study programme are you currently following?
(5) Do you have any remarks for the researchers?

Thank you very much for your cooperation. Your answers have been saved.


F Interviews

Each entry below lists the interviewed person, their organisation, and their contribution to this research.

Robert van Doesburg (Immigratie- en Naturalisatiedienst, IND): Provided insight into how the IND uses AI. Also approached the other experts in this list to collaborate in this research.

Cees-Jan Visser (Dienst Uitvoering Onderwijs, DUO): Provided insight into how DUO uses AI technologies. Mr. Visser was also able to provide materials from DUO about the reasoning mechanisms of their algorithms and an anonymised letter that was used for the case study.

Matthijs van Kempen (Knowbility): Explained how the digitising Dutch government was criticised by the Council of State and provided his article on how decisions can be motivated in a better way.

Giovanni Sileno, PhD (Universiteit van Amsterdam): Provided insight into the basics of democracy and discussed the quality assessment of explanations.

dr. Marlies van Eck (Universiteit Leiden): Provided very useful insights into how governmental authorities make use of automated chain decisions and how citizens might be affected negatively by those systems.

prof. dr. Mark Neerincx (Technische Universiteit Delft): Mentioned that explanations are in essence personalised and helped to stress the importance of human-computer interaction in the field of AI adoption in public administration.

Diederik Dulfer (De Belastingdienst): Provided insight into how the Dutch Tax and Customs Administration uses AI for several tasks.

Vincent Hoek (I-Interim Rijk): Stressed the importance of ethics by design and explained how agencies transfer citizens' data among one another. Mr. Hoek raised the question of how governments can improve their digital communication towards citizens.

dr. ing. Koen Smit (Hogeschool Utrecht): Explained that the current adoption of sub-symbolic AI by governmental agencies within the Netherlands is minimal and stated that it is hard for governmental agencies to justify the use of complex sub-symbolic AI technologies that even the experts who built them do not understand.
