
What do Bureaucrats Believe?

A survey exploring the perceptions of Dutch policy makers concerning different types of evidence that inform policy making.

Andreana de Jong, s2213583

Master of Public Administration - International and European Governance
Thesis Supervisor: Dr. Valérie Pattyn
Second Reader: Dr. Sarah Giest
Semester 2, Year 2018/2019


Table of Contents

Foreword
Abstract
Chapter One - Introduction
1.1 Social and Academic Relevance
1.2 Reading Guide
Chapter Two - Literature Review
2.1 Evidence-Informed Policy Making
2.2 Types of Evidence
2.3 Going Undercover
2.4 The Policy Process and Policy Makers
2.5 Polity, Politics and Policy in the Netherlands
Chapter Three - Conceptual Framework
3.1 Type of Evidence Typology
3.2 Variables of Measurement
3.3 A Focus on Policy Development
3.4 A Focus on Policy Makers
3.5 Construction of Hypotheses
Chapter Four - Research Method
4.1 Data Collection and Sampling
4.2 Response Rate
4.3 The Survey
Chapter Five - Analysis
5.1 Baseline Statistics
5.2 Descriptive Statistics
5.3 Top Two Box Scores
5.4 Standard Deviation
5.5 Repeated Measures ANOVA
5.6 Quantitative Data Analysis
5.7 Ranking Quality and Bias
Chapter Six - Discussion
6.1 Theoretical Discussion
6.2 Limitations
6.3 Future Research
Chapter Seven - Conclusion
References


Foreword

It has been almost seven years since I received my HBO Bachelor's degree. The experiences I have accumulated since then brought me to the moment in which I decided to pursue a master's degree in Public Administration. The past academic year has opened my eyes to the complexities, dynamics and layers of modern-day governance. It has sparked my interest in public policy and, more recently, in the trials and tribulations of policies becoming more evidence-informed.

This Master Thesis would not have been possible without crucial guidance and support from Dr. Valérie Pattyn. She was indispensable in guiding me through the world of evidence-informed policy making, and never steered me away from my ambitions and intentions but only refined them. A special mention is necessary for the policy makers who took the time and effort to participate in the survey research. I thought it might be impossible to reach this target population within the time and scope of the Master Thesis. Yet, policy makers responded with sincere enthusiasm to the concept of evidence-informed policy making. It is with great gratitude to those respondents that this Master Thesis has been put together.

Furthermore, my gratitude goes out to my family for always supporting me. Additionally, there have been so many people crucial to the realization of this thesis - university classmates, friends, acquaintances, and family members who have gone out of their way to help in many ways. This product would not have been possible without them.


Index

Dutch and English Versions of the Government Ministries in the Netherlands.

Ministerie van Algemene Zaken - Ministry of General Affairs
Ministerie van Binnenlandse Zaken en Koninkrijksrelaties - Ministry of the Interior and Kingdom Relations
Ministerie van Buitenlandse Zaken - Ministry of Foreign Affairs
Ministerie van Defensie - Ministry of Defence
Ministerie van Economische Zaken en Klimaat - Ministry of Economic Affairs and Climate Policy
Ministerie van Financiën - Ministry of Finance
Ministerie van Infrastructuur en Waterstaat - Ministry of Infrastructure and Water Management
Ministerie van Justitie en Veiligheid - Ministry of Justice and Security
Ministerie van Landbouw, Natuur en Voedselkwaliteit - Ministry of Agriculture, Nature and Food Quality
Ministerie van Onderwijs, Cultuur en Wetenschap - Ministry of Education, Culture and Science
Ministerie van Sociale Zaken en Werkgelegenheid - Ministry of Social Affairs and Employment
Ministerie van Volksgezondheid, Welzijn en Sport - Ministry of Health, Welfare and Sport


Abstract

At the very start of the 21st century, prominent public policy scholars in the field of evidence-informed policy making identified several types of evidence to which governments seemed to be responding. However, further empirical testing of these types of evidence in practice, in government or otherwise, has not received much attention in evidence-informed policy making scholarship. Additionally, rapid developments in information technology and information sharing might give rise to new types of evidence that can inform public policy development. This uninvestigated domain inspired the design of research that assesses different types of evidence and how policy makers perceive them. A cross-sectional survey was conducted with policy makers (in this case, non-elected policy staff tasked with policy making) employed in one of the twelve government ministries in the Netherlands. The survey results show a significant difference between the types of evidence in how they are perceived on measures of credibility, usefulness and how likely policy makers are to use them to inform policy development. Policy makers consistently scored the classically identified types of evidence much higher on all measured dependent variables than the new type of evidence. This supports the view that, even almost twenty years later, the classically identified types of evidence are still very much relevant to informing public policy making today.


Chapter One - Introduction

On July 6, 2018, the current Minister of Finance in the Netherlands wrote a White Paper outlining the government's ambition to increase the social impact of its policies (see Hoekstra, 2018). The government will run an operation called "Insight into Quality", an initiative towards assessing as well as achieving better quality in Dutch policy making. The cornerstone of the operation is that the government wants to gain more insight into what works and what does not, which should ultimately lead to new ways of working. The White Paper emphasizes that, at this moment in time, the connection between policies and their outcomes is still difficult to prove. The Minister of Finance stresses that as the world changes, the role of the government and its policies must be matched accordingly. One of the consequences is that new types of evidence and new ways of evaluating should be embraced (ibid.).

The Dutch government is not the first to express its motivation to practice "what works" in pursuit of effective policy making. Anglophone countries especially have placed an increased emphasis on evidence-informed policy (Sanderson, 2002; Curtain, 2000b; Weiss, 2001; Van den Berg, 2016). However, as public administrations across the globe embrace evidence as a means of providing more effective and efficient public services, the dissemination, definition and use of evidence still vary greatly across different policy subsystems and policy areas (Davies, Nutley & Smith, 2000).

Inside the larger frame of evidence-informed policy making, one finds different sorts of evidence that might inform policy development. In the early 2000s, classic scholars in the field identified several types of evidence to which governments were becoming particularly responsive (Davies, Nutley and Smith, 2000; Weiss, 2001). These evidence types (or evidence typology), as identified by Davies, Nutley and Smith (2000) and Weiss (2001), are evaluations, descriptive data, analytic information and analytic forecasts. Established almost twenty years ago, are these types of evidence still relevant to government policy makers in the 21st century? Additionally, with a predominant academic focus on Anglophone government systems, what do these different types of evidence mean for policy makers in the Netherlands?


Research into current perceptions of the different types of evidence is relevant due to the maturing of the earlier identified evidence types, the lack of research examining these types of evidence in practice (especially in non-Anglophone countries) and the possible proliferation of new types of evidence.

As paraphrased by Davies, Nutley and Smith (2000) from the Cabinet Office (1999, paras 7.1 and 7.22), “the raw ingredient of evidence is information.” Besides supporting existing or new policies, information also has the potential to confront policy makers with evidence that reveals certain policy failures. In the 21st century, this is intensified as interest groups, non-profits, corporations and the public have more tools at their disposal for dispersing information. Social media, the internet, drones and smartphones have created a “sharing” community but simultaneously also a community where everything can be recorded and documented as new information.

One emerging new type of evidence is undercover footage. Undercover footage can be produced by amateurs or experts, by the public or by interest groups, or even by government and enforcement agencies themselves, in any context or policy field. Public undercover footage campaigns can capture both public and political attention. For example, on 23 March 2017, undercover footage from a slaughterhouse in Tielt, Belgium, led to a multi-level, high-stakes confrontation between the private sector, interest groups, concerned citizens and government. Undercover footage from within the agriculture industry continued to be publicly exposed, and the Minister of Animal Welfare was criticized by industry interest groups for his management of the material and had to answer questions in Parliament (Landbouwleven, 2017). The undercover footage prompted industry measures to be enforced nationwide, including more video surveillance and obligatory external audits (Radar, 2017).

This short synopsis illustrates the high-stakes consequences that new types of evidence, such as undercover footage, can ignite in policy regimes. It further raises questions about the involvement of undercover footage as a source of policy input and content. Following from this example, would policy makers consider this type of evidence in policy development?

The lack of prior research investigating different types of evidence in evidence-informed policy making forms the groundwork of this Master thesis. The research goal is to uncover how policy makers in the Dutch administration perceive different types of evidence. Due to the Dutch


government's recent operation Insight into Quality, and its implementation across the various ministries, the Dutch administration provides an interesting research context. The data is collected through a cross-sectional survey distributed ministry-wide, which tests the perceptions of policy makers regarding different sorts of evidence. The research is designed to answer the following research question:

How are different types of evidence perceived by policy makers in the Dutch government ministries?

The following sub-questions will also be answered:

1) Is there variability in the answers provided by the policy makers regarding the different types of evidence on the measured dependent variables?

2) Are there significant differences in the perceptions of policy makers between the different types of evidence?

3) How do policy makers rank the different types of evidence on perceived quality and bias?

1.1 Social and Academic Relevance

On the social importance of research, Wilbertz (2013, p. 3) states: "the identification of the societal relevance of research projects is one way to achieve both "investment aims" and therefore to push research into a more fruitful direction by actually creating a measurable benefit for society as a whole or to solve a specific problem with implications for a subgroup of society." Parallels can be found between this emphasis on the social impact of research and the evidence-based movement, which also emphasizes the social benefit of policies. In relation to this Master thesis, societal policy input is becoming increasingly relevant with trends such as public participation in policy making (Weiss, 2001). If the public is increasingly becoming an important factor in policy decision making, a deeper investigation into the types of evidence informing policy is not only relevant on a government level, but also relevant for public use. The growing complexities of public problems and the politicization, internationalization and externalization of policy advice make it especially crucial for the public to understand what types of evidence are informing policy development (Van den Berg, 2016).


Furthermore, most academics endorse evidence-informed policy making as a tool for constructing better public policies (Weiss, 2001; Sanderson, 2002; Davies, Nutley and Smith, 2000). This is fostered by the idea that policy informed by systematic research results in more effective public outcomes. These public policies affect all subsystems of society. Assessing how those in policy making positions view different types of evidence might reveal whether there has truly been a transition to a "scientifically guided society" (Lindblom, 1990).

Lastly, new types or sources of evidence, such as undercover footage but also social media, create opportunities for public actors to move closer to the inner workings of policy subsystems. The proliferation of new types of evidence creates shorter communication channels between the multiple levels of governance and facilitates the dissemination, dispersion and insertion of information. How these types of evidence are perceived by policy makers offers greater social insight into the use of evidence in the policy making process and could facilitate (future) knowledge use and production.

In the academic realm, much (positive) scholarly attention has gravitated towards evidence-based and, more recently, evidence-informed policy making. It is important to remain critical regarding the impact of evidence-informed policy making in the public domain. Impact remains one of the most difficult variables to test in social research, and Wilbertz (2013) highlights the intrinsic problems with measuring impact in (social) science. Additionally, Nelkin (1992) finds little support for the claim that technical evidence-based information has much direct impact on decisions being made, even in the most technical of policy fields (ibid.). This is a superficial glimpse into the barriers that still need to be addressed in knowledge use (for more, see Weiss, 2001). The academic goal is not to test whether evidence-informed policies truly outperform non-informed policies, or why evidence is not being used. However, establishing the (base) perceptions regarding different types of evidence among government users might provide important insights into the larger debate regarding the (non-)use of evidence.

There is a gap between the identification of these types of evidence as having impact in public policy making by classic scholars, and testing them in practice. Do these types of evidence hold up in practice? Are they perceived as credible and useful by their end users in government? More importantly, are they still relevant today, and what new types of evidence are informing policies? As discussed above, this research is concerned with measuring the policy


makers' perceptions. This is an important starting point, as Cairney and Oliver (2017) identify that much of what those in government do is still guided by what they believe and by their underlying emotions.

Undercover footage is a completely uninvestigated area of policy evidence or information. In the Netherlands alone, sixty-four cases of undercover footage received national media attention in the last year (Nexis Uni search, April 2019). Increasingly, questions are answered in Parliament regarding (undercover) footage. A search for the term footage (beelden) on the official House of Representatives website results in 68 official Parliamentary questions submitted (Tweede Kamer, n.d.). This indicates that government policy makers may well benefit from being alert and skilled in the undercover footage debate.

Finally, recent research efforts have focused on the source of policy advice without looking at the content or the type of evidence (Rich, 2005; Doberstein, 2017). This work finds that academic advisory sources (for example, academic think tanks) are perceived as more credible (ibid.). However, less scholarly attention has been paid to testing different types of evidence without any source attribution. Therefore, this research looks exclusively at types of policy evidence regardless of their source. This can provide more insight into what types of evidence should be produced, dispersed and, ultimately, utilized in policy advisory systems and filtered into the evidence base.

1.2 Reading Guide

The following chapter presents a literature review, including a broader analysis of evidence-informed policy making. The literature review also dives deeper into the different types of evidence, including undercover footage. In the third chapter, the conceptual framework builds the foundation for the hypotheses that guide the data analysis. The fourth chapter explains the research design, including the construction of the survey and the survey sampling. The fifth chapter presents the data findings and the statistical analysis of the survey data. The sixth chapter discusses the major research findings and reflects on the research in terms of its academic and social relevance. It also features the research limitations and suggestions for future research endeavors. Finally, chapter seven presents the research conclusion.


Chapter Two - Literature Review

2.1 Evidence-Informed Policy Making

In a broad definition provided by Davies (2004), evidence-informed policy making is the use of scientific knowledge as input during the various stages of the policy cycle. Translated to the perception of policy makers, a study by Reid (2003) found that policy makers most commonly associate the term with policy that is informed significantly by research evidence in the development and implementation stages. The study highlights the critical importance of zooming in on the perceptions of the practitioners of evidence-informed policy making, namely the policy makers. The wider definition(s) hold little value if it remains unclear how those actors in policy making positions view, accept and utilize the concept. Additionally, the definition highlights the stages of development and implementation as important periods for evidence-informed policy making.

A further definition by Hope (2004) centers on using policy that is known and proven to work. The essence of this definition transitioned into the popular global 'What Works' movement in public policy administrations. As Sanderson (2000) describes it, the primary motivator for the introduction of evaluative systems is the ability to assess government performance and accumulate evidence to establish what works. In turn, this should lead to policy learning and improvement. This falls in line with the more 'positivistic' nature of rational decision-making processes in public policy (ibid.).

However, understanding of evidence-informed policy making has become more sophisticated as it has moved from an initial rational emphasis on 'What Works' to the recognition that policies might be informed in a variety of ways. As Nutley et al. (2007) discuss, research evidence can also reveal new ways of thinking about an issue. Evidence may sometimes directly support existing policies, but, importantly, it can also offer better and new information for policy decision debates (ibid.): evidence informing policies, rather than evidence basing policies. This is why the now more commonly preferred term is evidence-informed policy making rather than the more deterministic evidence-based policy (EBP) making, as it encompasses the variety of other factors and barriers that might still influence policy making (Sanderson, 2002; O'Dwyer, 2004; Nevo & Slonim-Nevo, 2011).


The 'What Works' movement has also received some academic criticism. Parkhurst (2017), applying a more political perspective, comments on shortcomings of the 'What Works' movement. His idea is that evidence of effect does not equate to social desirability (ibid.); evidence that a certain policy works well does not necessarily mean that it is also socially relevant. Starting from the perspective that policy makers must allocate scarce resources, it would perhaps be more appropriate to assess policies on the premise of "works to do what, exactly?" (Parkhurst, 2017, p. 19). This elaborates on the analysis that policy makers, in addition to knowing that something works, also need to know that it works for their public clients. Further doubts have been raised as to whether proven effective programs can be replicated in different policy contexts (Sundell, Ferrer-Wreder & Fraser, 2013).

Another issue with the 'What Works' movement is that desired outcomes are much more pre-determined in the clinical health sciences, where evidence-based practice started, than in public policy (Parkhurst, 2017). The Minister of Finance points to the same ambiguity in the White Paper: how do we really know that the (desired) outcome was obtained by effective policies? When it comes to determining desired outcomes in the public policy arena, political influence still plays a role. Parkhurst (2017) is critical of the lack of scrutiny in evidence-informed policy scholarship concerning the political nature of decision-making. He emphasizes concerns about the misuse of evidence for political purposes and, alternatively, the use of evidence to depoliticize political debates. What he proposes is the good governance of evidence: the appropriate utilization of evidence as well as establishing what constitutes good evidence. Ultimately, policy makers should ask: "will it work for us?" (Parkhurst, 2017, p. 20).

The consensus in the academic literature on evidence-informed policy making is that (1) evidence is a meaningful concept, (2) it should be readily available to those in advisory positions and (3) evidence-informed policy is superior to policy not informed by evidence (O'Dwyer, 2014). Yet, as illustrated by Parkhurst (2017), there is still room for improvement regarding the good governance of evidence. Stone (2002) raises the question of fashionability. Is it because governments in the UK, US and Australia are actively stimulating the use of evidence in constructing social policies that others are following this course? Additionally, are social scientists supporting the idea of evidence-informed policy because others are? Then another policy paradox arises (Stone, 2002), where the use of evidence becomes an ideology


without much (empirical) evidence to support it. As stated by Doherty (2000, pp. 179-80), "the notion that public policy is evidence driven is itself a reflection of an ideology, the ideology of scientism."

According to O'Dwyer (2014), there is also a lack of (empirical) evidence supporting the notion that evidence-based policy is better policy. On the matter, Reid (2003, p. 20) states the following: "one might be surprised by the lack of evidence for evidence-based policymaking, either as a process which can take place or as a process which will lead to better policy outcomes. It might be argued that, far from being unideological, evidence-based policy is, in itself, a kind of ideology; and one for which there is remarkably little supporting evidence." Not only does this indicate that the ideologies associated with evidence-based policy have perhaps stimulated its wider use, but also that there remains a gap in the empirical testing of whether evidence-informed decisions lead to better policy outcomes. This empirical testing does not lie within the scope of this research project. However, O'Dwyer (2014, p. 11) does point out that "it is not merely the use of evidence, but the type of evidence used that is important." An investigation into the different types of evidence might therefore provide more knowledge concerning the evidence-informed debate.

2.2 Types of Evidence

Davies, Nutley and Smith (2000, p. 3) operationalize evidence as that which "comprises the results of systematic investigation towards increasing the sum of knowledge … the accepted rules of evidence differ greatly between research cultures." Important in this definition is the mention of an incongruence between the accepted rules that exist among research cultures. Furthermore, the belief that "the output comes from more formal and systematic enquiries, generated by government departments, research institutes, universities, charitable foundations, consultancy organizations and a variety of agencies and intermediaries" (Davies, Nutley & Smith, 2000, p. 3) is also relevant. This Master thesis will test the more formal and systematic enquiries as identified by Weiss, and by Davies, Nutley and Smith, as well as a new type of evidence, undercover footage, without any specific source specification. This design decision falls in line with more contemporary evidence-informed research that includes such things as social media and editorials as types of evidence (Talbot & Talbot, 2014).


Additionally, a position paper released by NESTA also highlights some important features that support the inclusion of more informal types of evidence. NESTA is the UK's innovation foundation and is supported by the Economic and Social Research Council (ESRC), the UK's largest organisation for funding research on economic and social issues (Puttick & Mulgan, 2013, p. 2). NESTA has instructed the clearing houses that steer the evidence-based policy making process to (1) orchestrate all kinds of evidence, (2) involve the likely users of evidence in the shaping of work programmes, prioritization and good governance, and (3) influence the creation of new evidence (Puttick & Mulgan, 2013). Nielsen, Pedersen & Grunberger (2013) also adopted this standpoint in their recent research exploring what types of evidence decision makers in Denmark need.

Evolving from the health sciences, evidence-informed policy making now holds influence in multiple sectors (Evans, 2003). Scholarship into the use of scientific evidence has shifted from medicine to other areas of public interest. The ‘political’ element surrounding policy makers has also forced health professionals to investigate what heuristics come into play when it comes to the recognition and implementation of scientific evidence (Evans, 2003).

A clear hierarchy of evidence has been established in the health and medicine sector (Daly et al., 2006; Evans, 2003). To test the robustness of evaluations and reviews, many leading organizations employ the Maryland Scientific Methods Scale (SMS) (What Works Centre, n.d.). The leading type of evidence, scoring a level 5, demands a research design that involves strict randomization, the gold standard being randomized controlled trials (RCTs). Institutions such as the What Works Centre, which is sponsored by the ESRC, apply the SMS scale to UK and OECD evaluations to determine which impact evaluations meet the industry standard, and then produce an evidence review summary (ibid.). In this institutional setting, a number of evaluations pass through a rigorous and meticulous assessment, but no types of evidence other than impact evaluations are taken into account. As policy makers in the Dutch administration are encouraged to develop better policies through operation Insight into Quality, what hierarchy emerges amongst the different types of evidence available to public administrators in the Netherlands?

There are some differences between evidence-based practice in medicine and evidence-informed policy making in the public sector. In the health sector, a new drug might pass sixteen randomized controlled trials (RCTs) with positive statistical significance before the product is


brought onto the market (Parkhurst, 2017). This extent of systematic and scientific evidence, while undeniably proving what works, is not entirely transferable to the social sciences. RCTs are also considered the gold standard in the social sciences (Neuman, 2014) and certainly do take place. However, due to what Leicester (1999) has deemed "the seven enemies" of evidence-informed policy making, this will not be to the same extent as in the health sciences. The seven enemies of evidence-informed policy making include such things as bureaucratic logic, politics, civil service culture, cynicism and time (ibid.).

As previously discussed, classic scholars in the field of evidence-informed policy making identified different types of evidence which, according to them, were having an impact in public policy making. Davies, Nutley and Smith (2000) identify descriptive data, analytic findings, evaluative evidence and policy analysis in their work on evidence-based policy. Weiss (2001) reiterates these four types of evidence in her reflection on which types of evidence governments are responding to. This repetition provides some confidence that this evidence typology may still be influential in public policy today and is worthy of further investigation. The types of evidence presented by Weiss (2001) and Davies, Nutley and Smith (2000) provide a public administration approach to testing various types of evidence. This is especially relevant in light of the call from academics to incorporate more social science methods into evidence-informed policy making research (Stoker, 2016).

2.3 Going Undercover

The interest in investigating undercover footage as a new type of evidence is threefold. Firstly, there are occasions where undercover investigations have led to the installation of new procedures and policies (RTL, 2018; Zembla, 2018). Secondly, there is the scientification of undercover footage, with increasingly systematic investigative work performed by independent organizations (Schouten, 2018). Finally, there is the increasing coverage in mass media and the subsequently initiated public and official debates (Nexis Uni search, April 2019).

Undercover footage has sparked some controversy in certain policy sectors. The agriculture institute of the Netherlands (LTO) would like to see a trifecta of cooperation between farmers, the institute itself and police to stop interest groups from making undercover footage (LTO, 2019).


Industry resistance to undercover footage is also present in the United States. Industry lobbyists tried to pass an "ag-gag" law which would prohibit the release of any undercover footage (Adam, 2012). Adam (2012) argues that this goes against freedom of speech and raises the more important question of what the industry has to hide. That these controversies are not confined to national borders only highlights that there is an uninvestigated space behind the release of undercover footage and how it is perceived by various actors.

Despite industry resistance, there are also developments in the Netherlands that support the function of undercover footage as evidence. A fundamental decision by the College van Beroep voor het Bedrijfsleven required undercover footage to be viewed and included in the Minister of Agriculture's decision making regarding the sector (de Rechtspraak, 2018). Additionally, in an important precedent, an enforcing government agency recently issued official warnings and a fine based solely on evidence found in undercover footage (Vee en Gewas, 2019). This indicates that undercover footage is moving towards being accepted as evidence able to stand on its own in decision-making processes. The shifting dynamics of undercover footage and its societal relevance make it an interesting new type of evidence to test. Following from this, what do those in advisory and policy making positions make of this new type of evidence?

2.4 The Policy Process and Policy Makers

As all the established policy making models tell us, from the advocacy coalition framework to the issue attention cycle (for a full review, see Sabatier, 2007), the policy process involves a complex set of elements that interact over time. Problems, solutions and their political window of opportunity may all become available at different times (Kingdon, 2003). Only rarely will the conditions emerge for a pure and rational problem-solving model, involving a clear and mutual definition of the policy problem, timely and appropriate research answers, willing and able political actors and a lack of strong opposing forces (Ibid.).

Evidence has joined many other factors in determining policy choices, while at the same time various procedural, operational, substantive, institutional, financial and environmental factors limit the capacity of policymakers to take on information (Howlett, Ramesh, & Perl, 2009).


New policy evidence is not always enough to influence decision making in a contested policy environment. As the literature on policy makers shows, there are certain barriers to evidence-based policy making and the use of high-quality policy advice (Evans and Edwards, 2011). It has been shown that policy makers hold a 'ministerial indifference over the facts' (Ibid.). In their research, the argument is made that instead of evidence steering policy, evidence is often utilized to support decisions that have already been taken. This has been given the term 'policy-based evidence making' (Evans and Edwards, 2011).

In the literature, policy making is often described as a process that is messy, ambiguous and multifaceted, involving a range of actors interacting with one another in policy regimes over iterative policy cycles (Howlett, Ramesh, & Perl, 2009). Howlett (2011, p. 145) states that "understanding who these actors are and why and how they act the way they do is a critical aspect of all policy making activity, including policy instrument selection and in policy design." A study of these actors in the Netherlands, diving deeper into the perceptions and belief systems they hold, will therefore deepen our understanding of the policy process and subsequent actions.

2.5 Policy, Polity and Politics in the Netherlands

The Netherlands has not been at the heart of evidence-informed policy research. However, reports indicate that evidence-informed policy is relevant in the Dutch political discourse (Slob & Staman, 2012). How has the evidence-informed movement developed in the Netherlands? What is the connection between research and policy on a national level? Supporters of evidence-informed policy making have previously led government programs and even won elections. For example, in 1997, Tony Blair famously said, "what matters is what works" (cited in Davies, Nutley and Smith, 2000). This marked the start of an evidence-informed Blair Government, intent on moving away from ideological policy decision making (Davies & Nutley, 2000). With the prior academic focus on Anglophone countries, the Netherlands provides an alternative setting for an investigation of evidence-informed policy making. Slob and Staman (2012) state in an advisory report that the buzz of evidence-based policy has been around in political The Hague for more than 15 years. They remark that it is not just policy that should be evidence-informed but also politics itself (Ibid.).


Research by De Gier, Henke & Vijgen (2004) identifies the Netherlands as part of the 'second wave' of countries in which evaluative use has become institutionalized. Proper benchmarks and RCTs are not yet the reality or the norm in the Dutch setting (Ibid.). However, government ministries in the Netherlands are quickly adapting and expanding their evaluative capacities. For example, the Ministry of Foreign Affairs has an extensive monitoring and evaluation unit, the IOB, which performs international policy research and evaluation (Rijksoverheid, n.d.).

The institutional environment in the Netherlands is often described (domestically and internationally) as the poldermodel. De Gier et al. (2004, p.23) describe the concept as "the institutional dimension, referring to the deep-rooted Dutch tradition of administrative organisation based on consensus and compromise, mutual trust and decentralisation of responsibilities." A basic understanding of the decentralization of responsibilities provides insight into the limited margins of freedom that policy makers within the Dutch government might have to shift programs and policies.

Chapter 3 – Conceptual Framework

3.1 Types of Evidence Typology

The previous chapter discussed that, academically, there is a movement toward including more evidence, but also showed the varying definitions of what exactly constitutes evidence as part of an evidence base. In the early 2000s, a prominent group of scholars identified types of evidence to which governments were becoming increasingly responsive. Four types of evidence were first identified by Davies, Nutley and Smith (2000) and then reconfirmed by Weiss (2001). In this design, that evidence typology forms the foundation of the types of evidence to be tested. In the chosen typology, Weiss (2001) replaced policy analysis (identified by Davies, Nutley and Smith, 2000) with the term policy analytic forecasts. Furthermore, this research design introduces a novel type of evidence, namely undercover footage. Figure 1 provides the definitions of each type of evidence as defined by Weiss (2001).



Figure 1. Evidence Typology Definitions

Type of Evidence             Definition
Descriptive data             Data on economic conditions, figures, trend data.
Analytic information         Research that identifies factors associated with conditions.
Evaluation                   Looking at the effectiveness of existing policies and programs.
Policy analytic forecasts    Analyses of alternative future policies.

Source: Weiss, 2001, p. 288-289.

3.2 Variables of Measurement

This research is concerned with testing how different types of evidence are perceived by policy makers in the Netherlands. The independent variable is therefore the type of evidence, with four conditions. The dependent variables measured are the credibility, usefulness and likelihood of usage of each type of evidence, in order to gauge how policy makers perceive the evidence types.

The first concept to be conceptualized is credibility. In operationalizing the term, this research follows Landsbergen & Bozeman's (1987) definition of credibility: "an individual's assessment of the believability of an argument, and a function of the evidence presented, the logic of the argument, as well as other factors that may have little to do with the rigor or logic of the inquiry, such as cues associated with authority and credentials" (as referenced in Doberstein, 2017, p.265).

Additionally, the concept of usefulness is measured. Talbot and Talbot (2014), in their study of the British civil service and the utility of academic research, also incorporated the variable of usefulness. Their study found that the most useful type of research was considered to be case studies produced by British academics. The concept of usefulness is a welcome addition to the credibility variable, as the two might not be positively correlated: a type of evidence might be very credible but not very useful in the eyes of policy makers. Therefore, adding the usefulness variable provides more insight into how the types of evidence are perceived.


Thirdly, as expressed in the literature review, academics have been concerned with the (non-)use of knowledge in policy advisory systems (Weiss, 1997; Doberstein, 2017). To this end, Weiss developed a knowledge utilization model which shows multiple ways in which knowledge can be used in public policy (Weiss, 1997). This research does not go further into testing this model, but because of the academic interest in evidence use, a third variable is tested: the likelihood of usage of the evidence type. This helps in establishing baseline perceptions of how likely policy makers are to use the types of evidence in policy development.

3.3 A Focus on Policy Development

The literature identifies various stages of the policy process where policy makers might let evidence inform their course of action. As the study by Reid (2003) highlights, policy makers themselves define evidence-informed policy making as evidence informing policy development. Whereas undercover footage might place certain issues on the government's agenda, the decision by the Court in the Netherlands that undercover footage 1) must be watched by decision makers - and therefore not disregarded - and 2) must play a role in decision making, places it as a type of evidence that can inform policy makers in various stages of public policy development. The scope of this research project is not to analyze the impact of different types of evidence at different phases of the policy cycle, but rather to assess the credibility, usefulness and likely usage of different types of evidence in policy development. This refers to the development of what Jenkins (1978, p.15) defined to be public policy: "a set of interrelated decisions taken by a political actor or group of actors concerning the selection of goals and the means of achieving them within a specified situation where those decisions should, in principle, be within the power of those actors to achieve."


3.4 A Focus on Policy Makers

Figure 2. Policy Staff categorized in six circles and their policy advisory function

(Adapted from Van den Berg, 2017, p. 68)

Furthermore, the essence of this research is its focus on the perception of policy makers. Van den Berg (2017, p.68) identifies them as the first circle: "the innermost circle is the core of the public service, that is, the staff employed by general governmental institutions such as the ministerial departments." Starting the research with the innermost circle sets the framework for subsequent research on the outer circles and builds the academic knowledge pertaining to the perceptions of policy makers employed in the Dutch government ministries.

3.5 Construction of Hypotheses

The typology of different kinds of evidence was first introduced in the early 2000s. It is therefore likely that policy makers are familiar with, and have been in contact with, the sorts of evidence identified in the scholarship. As portrayed in the literature review, an evaluation culture has evolved in the OECD countries and renowned institutes use impact evaluations to deliver policy advice. Evaluations are as much part of the current policy discourse as the evidence-informed trend in policy making. So much so, that it would be hard to imagine a society without evaluations (Eliadis et al., 2011 in Pattyn, 2014). Due to the proliferation of evaluation use in government agencies, as well as the more common instrumental types of evidence such as descriptive data and analytic forecasts, it is assumed that these types of evidence will score higher on the variables of credibility and usefulness than the more controversial evidence type of undercover footage. Therefore:

H1: The evidence types identified by Davies, Nutley and Smith (2000) and Weiss (2001) will score higher on the dependent variables of credibility, usefulness, and the likelihood the evidence will be used for policy development, as compared to undercover footage.

As discussed in the literature review, there are clear hierarchies of evidence present in evidence-based health practices. This means there is a difference between the types of evidence, and not all evidence is seen as equal. The assumption is that some sort of hierarchy will also be present in the perception of the different types of evidence when it comes to informing policy development. Therefore:

H2: There will be a significant difference found between the different types of evidence and how they are perceived by policy makers.

Chapter 4 - Research Method

4.1 Data Collection and Sampling

The data collection method utilized is a cross-sectional survey amongst policy makers in the twelve Dutch ministries. Surveys remain the most commonly used data collection method in the social sciences (Neuman, 2014), and survey research proliferated within the positivist stream of social science research (Ibid.). To answer the research question, the chosen data collection method needs to be able to measure the perception of policy makers in the context of their current (or future) behavior; in this regard, "surveys are appropriate when we want to learn about self-reported beliefs or behaviors" (Neuman, 2014, p.317).

Furthermore, "the survey instrument allows researcher to assess, with a small sample, population attitudes, perceptions, and opinions about particular social issues, as well as factual knowledge" (Lee, Benoit-Bryan & Johnson, 2011, p.87). Instead of having different groups score the different types of evidence, the survey is designed to expose each participant to all four evidence types (in this case, each condition). This establishes a repeated measures design from which related means can be compared to see if there are any significant differences between the perceptions of policy makers regarding the different evidence types.

The survey target population is policy makers in the Netherlands employed by the government ministries. As in the survey research conducted in the United Kingdom by Talbot and Talbot (2014, p. 7), the choice was made to focus solely on unelected civil servants, who are considered to be the "gatekeepers" of the policy process. The Dutch government portal (Rijksoverheid.nl) is quite restricted in terms of communicating the organizational charts of ministries and providing statistics on employee distribution. The sampling frame was therefore mainly established by identifying policy makers - defined as those with Policy Officer or (Senior/Coördinerend) Beleidsmedewerker as their professional title - on the online professional platform LinkedIn.

As a rule of thumb, Stevens (2012) states that a sample should consist of 15 participants for each condition of the independent variable. In this case, there are four conditions (types of evidence) and the sample should therefore minimally comprise 60 respondents. The initial target was to collect responses from ten policy makers from each ministry, for a total of 120 respondents. Ample representation from each ministry increases the external validity of the data in terms of being able to generalize about the perception of policy makers regarding different types of evidence.

Using the sampling frame, a sample of N=376 potential respondents was identified on the LinkedIn platform through an extensive search on the job titles "Beleidsmedewerker" and "Policy Officer". As the survey is geared towards measuring the perception of policy makers ministry-wide, participants from all ministries were recorded. Everyone in the sampling frame was contacted to see if they would be interested in participating in a survey on evidence-informed policy making. LinkedIn restricts whom you are allowed to contact, so a connection invitation with a short introduction was sent out. Policy makers interested in participating could accept the LinkedIn connection, after which an official standard message was sent to them with a link to the survey and all information regarding the research and informed consent. To address informed consent, the message stated that participants were by no means obligated to partake in the research and that they could withdraw at any time. Additionally, respondents were informed that all answers and identities would remain anonymous.


The responses from government policy makers were supportive and there was considerable willingness to participate. In addition to reaching the target population through LinkedIn, a source within the Ministry of Justice and Safety was so kind as to distribute the survey by internal e-mail to policy staff within that ministry.

4.2 Response Rate

Through LinkedIn, 376 invitations were sent out to policy makers employed by Dutch government ministries. Of the policy makers that were sent an invitation to connect via LinkedIn, a total of 75 eventually completed the full survey. The drawbacks of the LinkedIn strategy were that not all policy makers received the message in the period the survey was active, and that the LinkedIn connection invitation allows a mere 120 characters. It is hard to secure survey participation with such a restricted outreach message.

Most social science research literature does not include LinkedIn as a means of distributing a survey; it is therefore quite a novel method of reaching a sample population. This made it difficult to assess whether policy makers could be successfully reached in the period of the research; conversely, if it proved successful, it could show the potential of this method for future survey research. The total response rate is 19.94%. Looking only at the policy makers that accepted the connection request, the response rate becomes much higher; if people are already in your network, response rates increase accordingly.

From the Ministry of the Interior and Kingdom Relations, I received the number of policy workers employed by the Rijksoverheid as of the benchmark date of 31 March 2019. This information is insightful because it provides the total size of the survey population: the policy workers currently employed by the Dutch government. The total number of policy makers in the Netherlands is 5,581. This means that the 75 respondents should reflect the perceptions of those 5,581 in office.



Figure 4. Overview of total number of policy makers employed by the Dutch government ministries per ministry and rank at the end of March, 2019.

(Ministerie van Binnenlandse Zaken en Koninkrijksrelaties, 2019)

4.3 The Survey

The survey presents policy makers with a short description of four different types of evidence (evaluations, descriptive data, undercover footage and analytic forecasts) without any attribution, so as not to influence them by the source of the evidence. According to Neuman (2014, p.321), "two key principles guide writing good survey questions: Avoid possible confusion and keep the respondent's perspective in mind." This was the driving force behind designing the survey questions, and a special focus was put on simplicity and avoiding confusion. Respondents were asked to score the types of evidence on scales of:


1. Credibility
2. Usefulness

Furthermore, the survey adds variables testing the use of the evidence:

3. If they have ever used the type of evidence in policy development;
4. If they would be likely to use this type of evidence in policy development;

and finally, the survey asks respondents to:

5. Rank the evidence on quality
6. Rank the evidence on bias

In addition, after respondents were presented with each type of evidence and the accompanying statement questions, they were given the opportunity to add any further thoughts or comments in an open-ended field.

With the LinkedIn distribution technique, no selection was made on a respondent's age or the duration of their employment. A scheme was made in Excel purely identifying policy makers in the different ministries. A stratification technique would not have been very effective due to the limited access to policy makers in government, which would have led to undistributed or unrepresented strata.

On the decision to employ only three of the four evidence types identified in the academic literature: an initial simple pre-test of the survey amongst civil servants (not policy makers directly, but associated with government) indicated that of the four types of evidence, they were least familiar with (and thus confused by) analytic information, in terms of its meaning and relevance. Therefore, the choice was made to omit analytic information as a type of evidence and to replace it with undercover footage. This choice was also made because the feedback indicated that the survey became too cumbersome when it included five different types of evidence. The final survey consists of 32 questions and takes between five and ten minutes to complete (see Appendix C for the full survey).



Chapter 5 - Data Analysis

The survey was active for two weeks in May 2019 in the university survey software Qualtrics. The total number of responses amounted to N=75 after three cases of attrition.

5.1 Baseline Statistics

Charts 1 to 3 illustrate the respondents' age, ministry of employment and the duration of their employment in years.

Chart 1: Number of respondents per age group

Chart 2: Number of respondents per Ministry

Chart 3: Number of respondents per duration employed at Ministry category

The baseline statistics reveal some insight into the profile of the policy makers who participated in the survey. Regarding age, all age groups are represented, but the highest number of respondents is found in the age group 25-34 years old. Regarding the ministries of the policy makers, responses are fairly evenly distributed across four ministries, with the most responses coming from the Ministry of Landbouw, Natuur en Voedselkwaliteit. Policy makers in the Ministry of Defense did not partake in or respond to the survey. Regarding the duration of employment at their current ministry, the highest response came from those employed between 1 and 4 years.

5.2 Descriptive Statistics

In order to answer the research question, which asks how policy makers perceive different types of evidence, the first step in the data analysis is to illustrate and compare the average (mean) answers the respondents supplied for the different types of evidence. The mean answers are compiled and presented in charts 4-6. To test the perception of policy makers, they were presented with statement questions regarding each type of evidence, namely: I find (evaluations/descriptive data/undercover footage/analytic forecasts) credible/useful/likely to use for policy development. The scale runs from 1 to 7, where 1 = completely disagree and 7 = completely agree.

Chart 4: Mean Credibility of the Different Types of Evidence

Chart 5: Mean Usefulness of the Different Types of Evidence

Chart 6: Mean Likelihood of Usage of the Different Types of Evidence

Due to the design of the survey, each policy maker was exposed to each type of evidence (each condition). This provides the advantage that the answers can be compared as repeated measures. The graphs are a visual representation of the respondents' mean answers to the main research question. The credibility, usefulness and likelihood of usage of evaluations, descriptive data, and analytic forecasts all score high, with means between 5 and 6 (somewhat agree to agree). The mean score for undercover footage falls lower on all three dependent variables compared to the other types of evidence. The statement that respondents were likely to use evaluations in policy development work achieved the highest mean (mean = 6.12).
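The column-wise mean comparison behind charts 4-6 can be sketched in a few lines of Python; the response matrix below is hypothetical, not the survey data:

```python
from statistics import mean

# Hypothetical 1-7 Likert responses: one row per respondent, one column
# per evidence type (evaluations, descriptive data, undercover footage,
# analytic forecasts).
responses = [
    [6, 6, 3, 5],
    [7, 5, 2, 6],
    [5, 6, 4, 5],
]

evidence_types = ["evaluations", "descriptive data",
                  "undercover footage", "analytic forecasts"]

# Because of the repeated measures design, each column holds the answers
# of the same respondents, so the column means are directly comparable.
means = {name: mean(col)
         for name, col in zip(evidence_types, zip(*responses))}
print({name: round(m, 2) for name, m in means.items()})
```

With the real data, `responses` would be the 75 x 4 matrix exported from Qualtrics.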

5.3 Top Two Box Score

The analysis of the data and the mean scores indicates that a large portion of the respondents agree that evaluations, descriptive data and analytic forecasts are credible, useful and likely to be used. By further coding the respondent answers into a top two box score, the data becomes more compact and can provide insight into what percentage of respondents truly and completely agree with the statements about the different types of evidence. On the answer scale of 1 to 7, only the respondents that answered a 6 or 7 (agree/totally agree) received a positive score of 1. From here on out, this will be reported as the respondent agreeing with the statement (meaning a respondent answer of agree (6) or totally agree (7)).
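The top two box recoding described above can be reproduced with a short helper; the score list is illustrative, not the actual survey responses:

```python
def top_two_box(responses, cutoff=6):
    """Code 1-7 Likert answers as 1 if in the top two boxes
    (agree = 6 / totally agree = 7), else 0, and return the
    share of respondents in the top two boxes."""
    coded = [1 if score >= cutoff else 0 for score in responses]
    return sum(coded) / len(coded)

# Hypothetical credibility scores for one evidence type.
scores = [7, 6, 5, 6, 4, 7, 3, 6, 5, 6]
print(f"{top_two_box(scores):.0%}")  # → 60%
```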

Chart 7 shows that 64% of the policy makers agree (agree/totally agree) that evaluations are credible for policy development, compared to 12% agreeing that undercover footage is credible for policy development. The percentage scores also indicate that there are differences between the credibility scores of the different types of evidence.

Chart 7: Top Two Box Score % Agree and Totally Agree Evidence Credibility

The total percentage of policy makers who agree that the evidence is useful for policy development is somewhat higher than for the credibility of the evidence types (Chart 8). For example, 68% agree that descriptive data is useful for policy development, compared to 57% agreeing that descriptive data is credible.

Chart 8: Top Two Box Score % Agree and Totally Agree Evidence Usefulness

Chart 9: Top Two Box Score % Agree and Totally Agree Evidence Likelihood of Usage

The percentages of policy makers that agree they are likely to use the different types of evidence in policy development also provide some insightful results. As can be seen in Chart 9, a high score of 80% of policy makers agree that they are likely to use evaluations in policy development work. The percentage that agrees they would use descriptive data in future policy work is 68%, against only 8% for undercover footage. Analytic forecasts received a top two box score of 40%.

5.4 Standard Deviation

The baseline and descriptive statistics are necessary to interpret the data and to answer the research question. Now that the perceptions of the policy makers have been compared across the different types of evidence with mean scores, the standard deviation can test if there was consistency in the answers provided. It helps to assess if policy makers largely feel the same way or if the perceptions vary significantly across the board.

As Hair, Money, Samouel & Page (2007) specify, if the standard deviation is small (<1.0) it means the respondent opinion is quite consistent. If it is large (>3) there is a lot of variability in their opinions (Ibid.). For the variable credibility, none of the standard deviations meet the (large) criterion but the perceived credibility of undercover footage is above 1.0, which shows that opinions concerning undercover footage deviate the most amongst all four types of evidence. Additionally, the least variation came from the responses regarding the credibility of evaluations with a SD score of .729.

                                     Std. Deviation   Variance
Credibility of Evaluations                .729           .531
Credibility of Descriptive Data           .917           .840
Credibility of Undercover Footage        1.420          2.015
Credibility of Analytic Forecasts         .854           .730
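The small (<1.0) and large (>3) cutoffs of Hair et al. can be checked directly with the Python standard library; the scores below are hypothetical, not the thesis data:

```python
from statistics import stdev, variance

# Hypothetical 1-7 credibility scores for two evidence types.
evaluations = [6, 6, 5, 6, 7, 6, 5, 6]   # clustered answers
undercover = [1, 6, 2, 7, 3, 5, 1, 6]    # widely spread answers

for name, scores in [("evaluations", evaluations),
                     ("undercover footage", undercover)]:
    sd = stdev(scores)  # sample standard deviation, as reported by SPSS
    label = "consistent" if sd < 1.0 else "variable"
    print(f"{name}: SD = {sd:.3f}, variance = {variance(scores):.3f} ({label})")
```

Note that `stdev` and `variance` compute the sample (n-1) statistics, matching the SPSS output reported in the tables.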

The variable of usefulness produces similar results: the usefulness of undercover footage received the most varying responses, while the responses regarding the usefulness of evaluations were the most consistent, with an SD of .753.


                                     Std. Deviation   Variance
Usefulness of Evaluations                 .753           .566
Usefulness of Descriptive Data            .900           .811
Usefulness of Undercover Footage         1.447          2.094
Usefulness of Analytic Forecasts          .932           .870

For the variable of the likely usage of the different types of evidence in policy development, the standard deviations all fall below the small threshold (<1.0), indicating little variation in the answers provided by the policy makers.

                                Std. Deviation   Variance
Usage of Evaluations                 .494           .244
Usage of Descriptive Data            .661           .437
Usage of Undercover Footage          .593           .351
Usage of Analytic Forecasts          .915           .838

5.5 Repeated Measures ANOVA

The repeated measures ANOVA is a parametric test suitable for comparing related means and for detecting significant differences in mean scores across the different conditions of the independent variable. In the current research, a repeated measures ANOVA can be run to determine whether there are significant differences between the means of the four types of evidence on the dependent variables credibility, usefulness and likelihood of usage.

Repeated measures ANOVA suits two types of research design. The first is a design that aims to establish changes between means measured over different time intervals - for example, measuring the same group of people and their stress level after three, six and nine months to establish whether there are significant differences in the mean stress scores between the time points. The second type of design for which a repeated measures ANOVA is appropriate establishes whether there is a difference in mean scores when a related test group experiences three or more different conditions (Laerd Statistics, n.d.) - for example, each person in a test group tastes chocolate, strawberry and vanilla ice cream, and the flavor scores are compared for significant differences in means between the conditions (Ibid.). This research falls in the latter category, as the 75 policy makers were exposed to four conditions: evaluations, descriptive data, undercover footage and analytic forecasts were measured on the same dependent variables of credibility, usefulness and likelihood of usage. To establish whether a repeated measures ANOVA could be used, the data were assessed for normality (Ibid.). This can be done with statistical tests or graphically; most researchers prefer to do this visually (Ibid.). When plotting the data on histograms, it was visually ascertained that the data passed the test for normality.
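The F statistic behind such a one-way repeated measures ANOVA can also be computed by hand from the standard sum-of-squares decomposition. The sketch below uses hypothetical Likert scores, not the thesis dataset:

```python
from statistics import mean

def rm_anova(data):
    """One-way repeated measures ANOVA.
    data: rows = subjects, columns = conditions (evidence types).
    Returns (F, df_conditions, df_error)."""
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    cond_means = [mean(col) for col in zip(*data)]
    subj_means = [mean(row) for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # within-subjects residual

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error

# Hypothetical 1-7 scores: 3 respondents x 4 evidence types.
scores = [[6, 6, 2, 5],
          [7, 6, 3, 5],
          [5, 5, 1, 4]]
f_stat, df1, df2 = rm_anova(scores)
print(f"F({df1}, {df2}) = {f_stat:.2f}")  # → F(3, 6) = 88.75
```

Because subject-to-subject differences are removed via `ss_subj`, the error term is smaller than in a between-groups ANOVA, which is exactly what makes the repeated measures design more powerful here.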

5.5.1 Repeated Measures ANOVA Credibility, Usefulness and Likelihood of Usage

Tests of Within-Subjects Effects
Measure: Credibility

Source                           Type III Sum of Squares   df   Mean Square      F      Sig.
Evidence (Sphericity Assumed)           171.637             3      57.212     60.163   .000

Mauchly's Test of Sphericity indicated that the assumption of sphericity had not been violated. We can therefore read the p-value under sphericity assumed, which is < 0.001. This significant p-value indicates a significant difference in credibility mean scores between the different types of evidence.

Tests of Within-Subjects Effects
Measure: Usefulness

Source                           Type III Sum of Squares   df   Mean Square      F      Sig.

Sphericity was also assumed for the second dependent variable, usefulness. Again, the results show a significant p-value, indicating a meaningful difference between the mean scores of the different types of evidence.

Tests of Within-Subjects Effects
Measure: Likelihood

Source                           Type III Sum of Squares   df   Mean Square      F       Sig.
Evidence (Sphericity Assumed)           405.930             3     135.310    109.903    .000

The third dependent variable, the likelihood to use the type of evidence for policy development, also did not violate Mauchly's Test of Sphericity. Once more, there is a significant p-value of < 0.001, which shows that there is a significant difference between the means of the evidence types on the likelihood of usage.

The repeated measures ANOVA has therefore established that there is a significant difference between the means of the different conditions (types of evidence) on each of the dependent variables of credibility, usefulness and likelihood of usage.

5.5.2 Repeated Measures ANOVA Credibility, Usefulness and Likelihood of Usage Post-Hoc Tests

Once the repeated measures ANOVA has established that there are significant differences between the types of evidence and the output indicates a significant p-value, SPSS generates post-hoc tests revealing the pairwise comparisons, which indicate which types of evidence differ significantly from each other.
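The pairwise comparison matrix itself is simply the difference of condition means for every ordered pair; below is a minimal sketch with hypothetical means (the significance tests, e.g. Bonferroni-corrected paired comparisons, are left to the statistics package):

```python
evidence = ["evaluations", "descriptive data",
            "undercover footage", "analytic forecasts"]
# Hypothetical mean credibility scores per evidence type.
means = [5.95, 5.90, 4.08, 5.51]

# Mean difference (I - J) for every ordered pair, mirroring the SPSS output.
pairwise = {(i_name, j_name): round(m_i - m_j, 3)
            for i_name, m_i in zip(evidence, means)
            for j_name, m_j in zip(evidence, means)
            if i_name != j_name}

print(pairwise[("evaluations", "undercover footage")])  # → 1.87
```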

Pairwise Comparison measure: Credibility

For credibility, the post-hoc test reveals significant differences between all four types of evidence, except between evaluations and descriptive data (the mean difference is significant at the 0.05 level). The bar charts at the start of the chapter are also a good indicator of these differences, as there is only a marginal difference between evaluations and descriptive data in the graphics.

(I) Evidence           (J) Evidence           Mean Difference (I-J)
1 Evaluations          2 Descriptive Data       ,053
                       3 Undercover Footage    1,867*
                       4 Analytic Forecasts     ,440*
2 Descriptive Data     1 Evaluations           -,053
                       3 Undercover Footage    1,813*
                       4 Analytic Forecasts     ,387*
3 Undercover Footage   1 Evaluations          -1,867*
                       2 Descriptive Data     -1,813*
                       4 Analytic Forecasts   -1,427*
4 Analytic Forecasts   1 Evaluations            -,440*
                       2 Descriptive Data       -,387*
                       3 Undercover Footage    1,427*

*. The mean difference is significant at the 0,05 level.

Table 1: Pair Wise Comparison Evidence Types on Credibility

Pairwise Comparison measure: Usefulness

Regarding the usefulness scores of the different types of evidence, the mean for evaluations differs significantly from those of undercover footage and analytic forecasts.

(I) Evidence           (J) Evidence           Mean Difference (I-J)
1 Evaluations          2 Descriptive Data       ,080
                       3 Undercover Footage    1,893*
                       4 Analytic Forecasts     ,453*
2 Descriptive Data     1 Evaluations           -,080
                       3 Undercover Footage    1,813*
                       4 Analytic Forecasts     ,373
3 Undercover Footage   1 Evaluations          -1,893*
                       2 Descriptive Data     -1,813*
                       4 Analytic Forecasts   -1,440*
4 Analytic Forecasts   1 Evaluations            -,453*
                       2 Descriptive Data       -,373
                       3 Undercover Footage    1,440*

Table 2: Pair Wise Comparison Evidence Types on Usefulness

Descriptive data also differs significantly from undercover footage. Undercover footage differs significantly from all other types of evidence, and analytic forecasts from evaluations and undercover footage; the difference between descriptive data and analytic forecasts is not significant.

Pairwise Comparison measure: Likelihood of Usage

When it comes to the likelihood of usage, the only non-significant difference between the evidence types is again between evaluations and descriptive data.

(I) Evidence           (J) Evidence           Mean Difference (I-J)
1 Evaluations          2 Descriptive Data       ,160
                       3 Undercover Footage    2,920*
                       4 Analytic Forecasts     ,853*
2 Descriptive Data     1 Evaluations           -,160
                       3 Undercover Footage    2,760*
                       4 Analytic Forecasts     ,693*
3 Undercover Footage   1 Evaluations          -2,920*
                       2 Descriptive Data     -2,760*
                       4 Analytic Forecasts   -2,067*
4 Analytic Forecasts   1 Evaluations            -,853*
                       2 Descriptive Data       -,693*
                       3 Undercover Footage    2,067*

Table 3: Pair Wise Comparison Evidence Types on Likelihood of Usage

5.5.3 Repeated Measures ANOVA Testing for Variable Duration Employed at Ministry

Having established, via the repeated measures ANOVA, that policy makers' perceptions of the different types of evidence differ significantly on the dependent variables of credibility, usefulness and likelihood of usage, the test can also be used to check whether other variables contribute to this difference. For example, does the time a policy maker has been employed at the ministry have a significant effect on the difference in means across the evidence types? Experience, or time employed, could be a factor in the answers provided in the survey. Therefore, another repeated measures ANOVA is run with the variable duration employed at the ministry to see whether it affects the difference in means on the dependent variable credibility.
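The structure of this check can be illustrated with a small sketch: compute the marginal mean for every combination of tenure category and evidence type, which is the pattern the interaction term in the repeated measures ANOVA formally tests (parallel profiles across categories suggest no interaction, diverging profiles suggest one). The tenure categories and scores below are invented for illustration, not taken from the survey:

```python
from collections import defaultdict
from statistics import mean

evidence = ["evaluations", "descriptive data", "undercover footage", "analytic forecasts"]
# Toy records: each respondent has a tenure category and one credibility
# score per evidence type (column order follows `evidence`).
respondents = [
    ("<1 year",   [5, 5, 2, 4]),
    ("<1 year",   [4, 5, 2, 3]),
    ("1-5 years", [4, 4, 2, 4]),
    ("1-5 years", [3, 4, 1, 3]),
    (">5 years",  [4, 4, 3, 4]),
]

# Collect all scores per (tenure category, evidence type) cell.
cells = defaultdict(list)
for tenure, row in respondents:
    for ev, score in zip(evidence, row):
        cells[(tenure, ev)].append(score)

# Marginal mean per cell: this is what Chart-10-style plots display.
for (tenure, ev), vals in sorted(cells.items()):
    print(f"{tenure:9s} {ev:20s} {mean(vals):.2f}")
```

The F test for the interaction then asks whether the variation between these cell-mean profiles is larger than would be expected from the error variance alone; this sketch only produces the descriptive cell means.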


Tests of Within-Subjects Effects

Measure: Credibility, by Duration Employed

Source                                          Type III Sum of Squares   df   Mean Square   F       Sig.
Credibility * Duration   Sphericity Assumed     15,409                    12   1,284         1,378   ,178

Adding the variable duration of employment to the model for the credibility of the different types of evidence yields a non-significant p-value of ,178. This means that a policy maker's duration of employment does not have a significant effect on the differences in mean credibility scores between the types of evidence. Chart 10 visualises the credibility scores of the different types of evidence across the duration-of-employment categories.

Chart 10: Credibility scores of the different types of evidence per duration employed at ministry

Although there are some interesting differences between the marginal means of some of the duration categories, such as the higher scores on evaluations and descriptive data by policy makers employed for less than one year, these are not statistically significant. The graph merely shows the distribution of the marginal means per duration category. A similar test was run for the dependent variable usefulness, but this also returned a non-significant result. However, for the last dependent variable, the likelihood of using the type of evidence, there
