
Master Thesis

On the Alignment between Business Strategy and Data Science Competence Development

What competences are needed in the Dutch Central Government to successfully drive Data Science Projects?

Master of Business Administration

Erwin Spekschoor, 10998381, erwin.spekschoor@planet.nl

Thesis Supervisors: Prof. Dr. C. Buengeler and Dr. C. Boon

Amsterdam Business School, University of Amsterdam

October 1, 2017

ACKNOWLEDGMENTS

I want to thank everyone who was involved in and contributed to the completion of this thesis.

First, I want to thank my UvA thesis supervisors, Prof. Dr. Claudia Buengeler and Dr. Corine Boon, for their guidance, supervision and inspiration during this research.

With respect to the Dutch Central Government, I want to thank Drs. Marcel Staring MIM for his supervision, for supporting me in shaping the research, for his knowledge of the organization and above all for his drive to introduce me to the interviewees. I would also like to thank two members of the 'begeleidingscommissie' (supervisory committee), Drs. Tim van Tongeren (Innovation and Strategy, Ministry of Infrastructure and Environmental Affairs) and Dr. Wim Rietdijk (Data Scientist, Ministry of Internal Affairs), for their input, review and inspiration.

I would like to thank Dr. Yuri Demchenko, Director of the Edison project, for the many discussions on the topic of DS competences, for his efforts to add insights and deliverables, and for rearranging some of the Edison material so that it could be bridged to this research. I would also like to mention Dr. Souley Madougou, Edison Researcher, with whom I worked closely to adjust the Edison text mining algorithm so that it would provide more accurate results.

A big thank you to all the interviewees who shared ideas, opinions and insights. Thank you for your time, interest and discussions.

Last but not least I want to thank my family, especially my wife Eline for her unconditional support during the last years.

Erwin Spekschoor, Amsterdam, October 2017

EXECUTIVE SUMMARY

Organizations are becoming increasingly aware that developing Data Science (DS) capabilities enhances performance, and there is growing empirical evidence in the literature that this is indeed the case (Akter et al., 2017). Governments are keen to investigate how Data Science can be used to enhance 'public value creation'. Themes like 'Data driven government' and 'datafication' are on the rise (Maak Waar!, 2016). To develop DS capabilities for the Dutch Central Government (DCG) in a structured and efficient way, a better understanding is needed of which competences are required to further drive DS in DCG. In contrast to the attention paid to competency development for specific roles in the Data Science job family, the development of competences for other groups in the organization ('non-data scientists') seems largely disregarded in the literature. This thesis addresses that gap and focuses on broader organizational DS competency development within DCG. It starts by exploring how a strategy of public value creation through DS, in a context of public values, is aligned with DS competency development. It then explores and describes how competence development for relevant groups in organizations like DCG can be structured and designed. Finally, the study validates the applicability of the EDISON (Education for Data Intensive Science to Open New science frontiers) framework for DCG.

Sixteen interviews with respondents in different domains (Business, Information Management, Data Science and IT delivery) and entities (Ministries, agencies and Independent Governing Bodies) have guided this research. Furthermore, a small subset of recently published job vacancies for Data Scientists has been text mined to scan for competences as formulated by the Edison Competence Framework.

Firstly, this research concludes that data governance should be placed at the forefront. Senior leaders throughout DCG might consider putting a 'Data Governance Manifesto' in place to further drive DS in DCG. It should guide business leaders and Data Science projects on the do's and don'ts when dealing with Big Data and Data Science. Ideally, this manifesto would also provide specific guidance on elements like 'permanent beta' and 'experimentation', as well as on the governance of models and algorithms. It should provide the necessary 'room to breathe' for business leaders to define Data driven strategies better and more safely.

Secondly, training for the Business functional domain should start first. Above all, senior managers must learn how to responsibly commission and consume Data Science projects and products. They should really understand the impact and content of Data Science. Specific Data Science competences need to be developed at least at awareness level. Furthermore, they should be trained in formulating, driving and implementing Data driven strategies, preferably within the boundaries defined by the Data Governance policies. Role modelling at the top is crucial to foster and stimulate bottom-up innovation, best illustrated by the one-liner: 'employees don't do what leaders say, they do what leaders do'.

Thirdly, this research indicates that Data Science competences will need to be developed within each functional domain (Business, Information Management, Data Science and IT delivery) and on each level in a domain (strategy, structure and operations). The extent to which specific DS competences need to be developed depends on the specific combination of functional domain and level. It has been validated that the Edison Data Science Competence Framework can be used for this. Concrete examples have been included in this thesis. As a result of this research, a first version of a Data Science Competence and Curriculum Configurator (DSCCC) has been developed together with the EDISON team. It supports generating a Data Science curriculum for each combination of functional domain and level.

Fourthly, it is strongly suggested that HR be positioned as a valuable partner in aligning strategy and broader organizational DS competency development. To support the strategy, HR should develop a coherent set of HR practices for attracting, building and retaining human capital in a data driven government. To be able to add value in this role, HR might need to develop additional competences.

The overarching objective of this thesis has been to contribute to theoretical and practical insights to further drive Data Science in organizations. Existing theories and frameworks have been used, adjusted and applied to a real-life case. Several managerial implications have been derived to support accelerating the use of Data Science in DCG.

Table of Contents

1 Introduction

2 Literature Review

2.1 STRATEGIC HRM

2.2 POSITIONING OF FRAMEWORK FOR COMPETENCES ON A GROUP LEVEL

2.3 COMPETENCES TO DRIVE DATA SCIENCE IN ORGANIZATIONS VERSUS DATA SCIENCE COMPETENCES

3 Edison Framework

4 Conceptual model

5 Dutch Central Government

6 Research Methodology

7 Research findings

7.1 ALIGNMENT BETWEEN STRATEGIC HRM DIMENSIONS AND DS COMPETENCES

7.2 COMPETENCES NEEDED TO PLAN, DRIVE, IMPLEMENT AND CONSUME DS IN DCG

7.2.1 Hard core DS competences (group A)

7.2.2 Competences for non-Data Science domains (group B)

8 Discussion

8.1 THEORETICAL IMPLICATIONS

8.2 MANAGERIAL IMPLICATIONS

8.3 LIMITATIONS AND SUGGESTIONS FOR FURTHER RESEARCH

9 Conclusion

References

Appendix 1 Case study protocol

Appendix 2 Edison Competence Areas and competences

Appendix 3 Coding categories

Appendix 4 Edison competences and model curriculum for 3 roles in DCG

1 INTRODUCTION

Organizations are increasingly focused on “Big Data Analytics”, which over the years has emerged as a growing domain of productivity and opportunity (Akter et al., 2017). Building strong Big Data Analytics competences is an important element in transforming the way in which organizations do business (Davenport & Harris, 2007). Organizations have realized they need to hire Data Scientists, and many publications describe Data Science (DS) as an important, even 'sexy', career choice (Provost & Fawcett, 2013).

Big Data Analytics (BDA) and related terminology (e.g. Data Science, Business Intelligence, Data Mining, Data-driven decision making) have not been uniformly defined in the literature. Although the boundaries of the field can be discussed extensively, Big Data Analytics and Data Science (used interchangeably throughout this document) can also simply be referred to as the ability to use Big Data for organizational decision making (Lavalle et al., 2011). Defined as such, Data Science is directly connected to the business strategy of an organization. Following a definition by Gartner (2013), Big Data Analytics distinguishes itself from statistical analysis and Business Intelligence reporting in that it is concerned with a variety of (often not well defined) data structures. It lacks pre-defined questions and is much more experimental in nature: a hypothesis might be defined only after the data have been studied and prepared. Following definitions by NIST (n.d.), Data Science refers to the conduct of data analysis as an empirical science. This can take the form either of collecting and analyzing data without an a priori hypothesis (data exploration or data mining), or of formulating a hypothesis and collecting (new) data to address it, followed by analytical confirmation or rejection.

In the context of this research, Data Science is defined as an empirical science that uses all data types (structured and unstructured) in large quantities (beyond what traditional IT infrastructures can process), along with the algorithms and models to analyze and visualize that data, ultimately leading to faster and better organizational decisions (definition partly after Dijo & Lyytinen, 2017). Following this definition, Data Science is again considered an integrative discipline linking strategy, business, analytics, operations and IT.

Strategic Human Resource Management (SHRM)

There is substantial and growing empirical evidence that developing Data Science capabilities enhances performance (Akter et al., 2017). Careful strategic planning of Human Capital and attracting, retaining and building Data Science competences seem of crucial importance. This requires HRM to play an important role. Over the last twenty to thirty years, a lot of research on Strategic HRM (SHRM) has emphasized integrating HRM into the strategic management process. “There is broad agreement that a strategic approach to Human Resource Management (HRM) involves designing and implementing a set of internally consistent policies and practices that ensure a firm's human capital (employees' collective knowledge, skills and abilities) contributes to the achievement of its business objectives” (Jackson & Schuler, 1997, p.171). In this research, configurations of Knowledge, Skills and Abilities are referred to as competences (McEvoy et al., 2005).

Competences to drive DS in organizations versus DS competences

Numerous publications and papers list the technical Data Science capabilities or competences required of Data Scientists (for example Davenport 2012 and Dichev 2017). In contrast to this focus on competency development for specific roles in Data Science, the development of competences that other groups in the organization (non-Data Scientists) need in order to plan, drive, implement and consume Data Science seems largely disregarded in the literature. However, it is hypothesized that these groups (for example IT or Business people) also need to possess specific competences to help drive DS further into organizations.

DS competence development for Governments

The Dutch government is keen to investigate how the benefits of Data Science can be integrated into public policies and public value creation. At the Dutch Central Government (DCG) there is a strong notion that Big Data will revolutionize the way we live, work and think. Many Data Science initiatives are already ongoing. For example, in the policy domain of security (broadly interpreted as ranging from national security and law enforcement to combating and preventing fraud), the number of programs that involve large-scale data collection, linking and analysis is on the rise (WRR, 2017). More and more, Data Science is and will be used to deliver public services and to create public value. Decisions about which Data Science competences are needed when delivering services may be different for public service organizations than for businesses. Above all, governments and non-profit organizations must be responsible to society and serve the public interest (Daley, 2012). For public service organizations, decisions to invest in certain Data Science competences cannot simply be based on projected increases in market share, revenue and profit, or customer satisfaction.

A standardized framework to discuss Data Science competences

In 2015, the European EDISON (Education for Data Intensive Science to Open New science frontiers) project was established by a consortium of European governmental and academic institutions. Its objective is to support universities, research centers and industry in coping with the potential shortfall of Data Scientists, and to define the framework of competences as well as the body of knowledge for this profession. For this research it provides a complete, broadly accepted and supported 'frame of reference' when discussing data science competences.

Observations, research questions and academic relevance

Based on the previous sections, the following observations have guided the research questions:

1 More and more public value will be created through Data Science. Aligning Data Science competency development with the concept of public value creation is important for public service organizations.

2 Although there are numerous publications on the desired competences of a Data Scientist, only a few publications address the development of broader organizational capabilities to drive DS in an organization. Additional research on this topic is desired.

3 With EDISON, a comprehensive generic framework of Data Science competences has been published. To the researcher's knowledge, there is no publication that has applied this framework to a large organization outside academic institutions. This is important for two reasons. On the one hand, the practical usability outside these institutions has not yet been researched. On the other hand, non-academic institutions might be able to strengthen Edison content by providing feedback when investigating its applicability. Furthermore, the extent to which the Edison framework can be used for non-Data Scientist roles has not yet been investigated.

The three research questions have been defined as follows:

1 What potential contributions can be suggested to strengthen the alignment between business strategy and Data Science competence development with a specific focus on the Dutch Central Government?

2 What (Data Science) competences are needed in an organization such as DCG, to successfully plan, drive, implement and consume Data Science?

3 To what extent can the Edison Framework be used as a standardized DS competency and curriculum framework for DCG?

To answer the first and second research questions, a primarily qualitative case study design has been used. The explorative nature of the research calls for an approach that incorporates contextual information. Therefore, interview data and documents from fifteen organizational entities within DCG have been used. To answer the third research question, a quantitative approach has also been used: published Data Scientist vacancies were projected onto the Edison framework by means of a text mining algorithm.

Structure of this thesis

In Chapter two, relevant literature on strategic alignment and HR practices is presented, and the definition and dimensions of strategic alignment are elaborated. This chapter also includes the conceptualization and explanation of a framework that can be used for classifying the group competences needed to successfully drive, plan, implement and use data science. Finally, literature describing organizational and DS competences is discussed. Chapter three gives a brief explanation of the EDISON framework. Chapter four illustrates the conceptual model for this research. Chapter five introduces the Dutch Central Government, which represents the unit of analysis in this research. Chapter six deals with the Research Methodology and describes the methods used for data collection and analysis. The research findings are presented in Chapter seven. Chapter eight, the discussion section, integrates the findings and evaluates them with reference to the conceptual framework. It also provides the theoretical and managerial implications of this research and concludes with the limitations of the study and recommendations for future research. Finally, the conclusion is summarized in Chapter nine.

2 LITERATURE REVIEW

This chapter starts by presenting academic findings on the alignment between Business Strategy and HR practices, followed by a review of literature on alignment in Public Service Organizations. Furthermore, a useful framework for discussing and positioning Data Science in a broader organizational context is presented. Finally, literature and publications on competences to drive DS in organizations are reviewed at the end of this chapter.

2.1 STRATEGIC HRM

Since the development of HRM as a field of scientific research in the 1980s, an important development has been the integration of HRM into the strategic management process (Boon, 2008). Strategic human resource management (SHRM) focuses on aligning HRM practices to build employees' knowledge, skills and abilities to support business objectives and strategies (Werbel & DeMarie, 2005). The degree to which various characteristics of the HR system are aligned with an organization's strategy is called 'fit'. Two types of fit are most commonly studied: vertical and horizontal fit.

The alignment between an organization's strategy and its HR strategy and practices is called strategic fit or vertical fit (Paauwe & Boon, 2009). Horizontal fit expresses the degree to which HR systems form a tightly coupled and consistent approach to managing people strategies (Baird & Meshoulam, 1988). Gratton and Truss (2003) added an action dimension to the horizontal and vertical alignment dimensions to account for the fact that 'make it happen in day to day reality' is as important as 'putting the strategy on paper'.

Most of these studies were focused on private sector companies. Objectives of private sector companies are generally related to profit and/or shareholder value maximization. Some scholars have taken a broader perspective on organizational goals and propose to also include metrics other than these bottom-line performance metrics. In 2003 Paauwe developed a Contextually Based Human Resource Theory (CBHRT). In this theory, HRM is determined on the one hand by the competitive mechanisms (Product/Market/Technology dimension) and on the other hand by the socio-political, cultural and legal dimension. In addition, Paauwe states that the historically grown configuration of a firm also impacts HRM policies and practices. The three dominant factors, namely the PMT dimension (competitive mechanisms), the SCL dimension (institutional context) and the configuration (organizational and administrative heritage), influence the degree of leeway the dominant coalition has in making strategic choices and developing HR strategies and practices. This is depicted in figure 1.

FIGURE 1: CBHRT (MAES, 2004)

On the one hand, HRM is determined by demands arising from product-market combinations and the appropriate technology (demands usually expressed in terms of, for example, efficiency, effectiveness and speed). On the other hand, Paauwe emphasizes that 'the market' is embedded in a specific cultural and social context. According to this model, market forces drive the development of HRM, but values, norms and legal restrictions might correct these. An important prerequisite for SHRM is a clear definition of the mission as well as a solid strategy to achieve it. This is especially complicated in the governmental sector, as there are no competitive mechanisms driving product-market combinations (the PMT dimension), and hence measuring the success of a strategy is more difficult than for companies in the private sector. Next to that, electoral cycles of four years (or sometimes even less) drive organizational focus and initiate shifts in strategy and/or tactics. Public service performance and the goals of public organizations are therefore often not that clear. In addition to the generic requirements mentioned previously, public-purpose governments and non-profit organizations must be responsible to society and serve the public interest (Daley, 2012).

Vandenabeele (2013, p. 37) argues that “it is necessary to return to the concepts of public value and public values in order to reconnect the concept of public service performance to the principal 'raison d'être' of public organizations”. He also argues that the concept of public values is a useful instrument in valuing the 'public value' created by public organizations. Public values can be seen as the reasons why people value things ('values related to the how'), whereas public value is related to the value itself that is meaningful for people ('the what'). To illustrate: 'law enforcement' can be considered a practical example of the delivery of public value ('the what'). Public values like 'protection of the rights of the individual' or 'adherence to the rule of law' refer to the way this is done ('the how'). According to Moore (2000, p. 186), the principal value delivered by the government sector is 'the achievement of the politically mandated missions of the organization and the fulfillment of the citizen aspirations that were more or less reflected in that mandate'. Public values have been described by various scholars. To illustrate the concept, Jorgensen and Bozeman (2007) constructed a detailed overview in which over 65 values are identified and grouped into seven categories. The categories are as follows (some values are listed for illustration purposes only):

o Public Sector contribution to society (Sustainability, Altruism, Social Cohesion, etc.).

o Transformation of interests to decisions (Democracy, Protection of minorities, etc.).

o Relationship between Public Administrators (PA) and politicians (accountability and responsiveness, etc.).

o Relationship between PA and their environment (responsiveness, openness, listening to public opinion, etc.).

o Relationship between PA and citizens/users/customers (Rule of law, justice, protection of the rights of the individual, etc.).

o Intra-organizational aspects of public administration (robustness, reliability, stability, etc.).

o Behavior of public sector employees (moral standards, ethical consciousness, integrity, etc.).

Following Vandenabeele (2013), the concepts of public value and public values can be seen as two key dimensions of public service performance. In the context of this research, the element 'Product' in the PMT dimension has been interpreted as 'Public value creation', the element 'Market' as 'Society in general' and the element 'Technology' as 'Data Science'. The public values are addressed in the SCL dimension of CBHRT. In this way the CBHRT model has been tailored more specifically to public service organizations like DCG.

Strategic HRM models generally do not focus on describing how the alignment between strategy and HRM specifically affects competency development. Alignment between strategy, HRM practices and competency development is described more specifically by the Line-of-Sight (LOS) construct (figure 2). LOS is defined as the alignment of organizational capabilities, group competences and norms, and individual KSAs (Knowledge, Skills, and Abilities) with one another and with the organization's strategy (Buller & McEvoy, 2012).

FIGURE 2: LINE OF SIGHT (BULLER & MCEVOY, 2012)

According to Buller & McEvoy, overall firm performance is a function of the vertical alignment of strategic priorities among three organizational levels (Organization, Group and Individual) and the horizontal alignment of HRM practices (recruitment/selection, performance appraisal, training and development, and compensation). As used here, organizational capabilities are 'system-level resources' like the ability of the organization to develop, learn and change. Group competences are a broader set of competences necessary to perform work in various job categories within the organization. This is an important notion: conceptually, it implies that reasoning about competence development for an entire organization can be decomposed into 'describing the group competences for all relevant groups that together form the organization'. The third level of analysis is that of individual-specific KSAs (as opposed to the group competences).

Competency has been defined (McEvoy et al., 2005) as a configuration of Knowledge (e.g. economics, statistics, etc.), Skills (e.g. communication and collaboration, etc.) and Abilities (e.g. proactivity, innovation, etc.) that enables one to perform well in a professional role. The notion of group competences allows us to reason about DS competence development at a lower aggregation level than the broadly formulated 'organizational competences' and at a higher aggregation level than the individual KSAs. This is important for multiple reasons. Firstly, it is an effective way to distinguish between the distinct competences needed for a class of (coherent) functions or even for a functional domain. Distinguishing between relevant coherent groups is important because education and training can then be organized and contracted effectively and efficiently and, more importantly, tailored to that specific group. Secondly, considering competence development on group level will increase flexibility (should some resources be redirected, others can easily step in) and might generate 'within-group mobility' (people themselves might want to move to other functions in a group). Above all, it also provides organizational clarity on who is positioned to do what.

2.2 POSITIONING OF FRAMEWORK FOR COMPETENCES ON A GROUP LEVEL

Current publications on competence development to drive Data Science are targeted either at broad and generic organizational competency development (organizational capabilities) or at specific roles in the data scientist job family. This makes it difficult to reason effectively about Data Science competency development, and to organize it, in a way that covers the entire organization but at the same time recognizes that groups of (coherent) functions might need different (DS) competences. After a scan of the literature, the Maes Information Management alignment framework has been considered a suitable starting point, as will be explained below.

Data science can be considered an integrative discipline connecting strategy, business, operations and technology. From a high-level perspective it has similarities with the Information Management discipline, which connects the same elements (but from a different point of view, as will be explained later on). In the IM arena the term “strategic alignment”, originally introduced by Henderson & Venkatraman (1993), has been used to manage the business-ICT relationship (Maes, 2007). Based on the work of Henderson and Venkatraman (1993), Maes (2007) developed an integrative positioning framework of the Information Management discipline.

Based on this framework, Information Management is described as the integrative, balanced management of the different domains as represented in figure 3. “It concerns strategic, structural and operational information-related issues (the 'vertical' dimension). Furthermore it relates the (external and internal) information and communication processes and their supporting technology to business aspects (the 'horizontal' dimension)” (Maes, 2003). As displayed in Figure 3, information is produced, interpreted and used from right to left. Maes states that each of the functional domains (columns) needs distinct competences or expertise, and refers to domain expertise, information expertise and technology expertise. This model has been frequently used by organizations in both government and the private sector (Maes, 2003). Many interpretations of this model have been published on the internet.

FIGURE 3: AN INTEGRATIVE FRAMEWORK FOR INFORMATION MANAGEMENT (AFTER MAES, 2003).

At a high level, data science is a set of fundamental principles that support and guide the extraction and transfer from data to information and knowledge (Provost, 2013). Data Science is 'the ability to use Big Data for decision making' (Lavalle et al., 2011). By choosing this formulation, the field is directly connected to the business strategy of an organization. Data science views business problems from a data perspective and draws on many traditional fields of study (for example business, statistics and IT). The boundaries between DS and IM could be discussed extensively in an academic setting. When looking at the Maes framework and taking the Information Management perspective to try to position Data Science conceptually, we notice a substantial overlap as far as connecting to other disciplines is concerned. This is illustrated in Figure 3. We also notice some differences, especially on the Strategic Business level, as 'DS outcomes' are potentially more strongly and more tightly connected to business strategy. Furthermore, when looking at the framework from a DS perspective, the framework doesn't account for essential key characteristics. From an IM perspective, the middle row and more specifically the center is related to Information Architecture: the blueprint of the Information and Communication capabilities of the Organization (Maes, 2003b). From a DS perspective, however, the center seems more related to the design and planning of DS and DS management products and services. In that sense, it has much more to do with the analytical and methodological capabilities to transform data into information and knowledge (like a production system transforming input into output). Despite the strong conceptual overlap from a high-level perspective, Data Science is positioned apart from the (traditional) Information Management discipline in order to position it correctly and to explore its competences. For Data Science to be effective in an organization, clearly distinct Data Science competences or expertise are needed. DS has been added as an additional discipline, as a 'column' in figure 4, together with the positioning of some roles/functions for illustration and clarification purposes.

Please note that, by depicting DS separately, it has not been the intention to position DS as a separate organizational entity or department. Data Science capabilities could reside anywhere in the organization.

FIGURE 4: POSITIONING OF DS AND TRADITIONAL IM (AFTER MAES, 2003).

Furthermore, the positioning of functions is for illustrative purposes only and is not meant to be normative in any way. In this view, Data Science is positioned between IM and IT. This is just a specific instantiation of the framework. Since most DS initiatives lack pre-defined questions and might be much more experimental in nature, DS capabilities might be organized in such a way that they are close to the products and processes in organizations, that is, co-located with business units (Debortoli, 2012). There might be other examples where IM is primarily 'only' managing IT on behalf of the business and can be positioned close to IT delivery. To systematically uncover the (Data Science) competences needed in an organization to successfully plan, drive, implement and consume Data Science, the twelve-cell framework of Figure 4 will be used. From the DS competency point of view, and inspired by the notion of group competences from the Line-of-Sight model, the cells of the framework are considered a valid aggregation level for investigating the competences needed at group level to drive DS into organizations.

2.3 COMPETENCES TO DRIVE DATA SCIENCE IN ORGANIZATIONS VERSUS DATA SCIENCE COMPETENCES

The first part of this paragraph reviews literature on competences needed to successfully drive Data Science into organizations. This is followed by literature describing desired competences of Data Scientists and ends with an overview of the Edison framework.

Competences to drive Data Science in organizations

In a literature study performed by Akter et al. (2016), three primary dimensions were identified that reflect Big Data Analytical Capabilities (BDAC) in organizations: BDA management capabilities (planning, investment, coordination and control), BDA technology capabilities (flexibility of the BDA technical platform) and BDA talent capabilities (skills and knowledge of the analytics professional). From a competency perspective (considering Skills, Knowledge and Abilities), this study primarily points towards the skills of the data analytics professional (i.e. the data scientist) and to a much lesser extent to management competences or Data Science competences needed for groups of non-data scientists within an organization. Dijo et al. (2017) examined how organizations can address the Big Data challenge and transform themselves by building competences that enable success with big data. They observe that organizations need to build competences on four themes: mobilize their big data technologies through new skills and constant technology acquisition, orient resources towards collaboration and continuous improvement, explore data through perpetual experimentation, and exploit related discoveries with market orientation. Within each of the themes multiple elements are listed. They are referred to as competences but do not necessarily comply with the SKA definition. A research paper by Dichev et al. (2017) describes the design and implementation of a course targeted at a non-technical audience and centered on data science literacy, with a focus on collecting, processing, analyzing and using data.

Data Science competences

Numerous publications elaborate on the characteristics of a Data Scientist. A comprehensive overview is given by Davenport (2014). He explains that a data scientist is a hacker (ability to code), a scientist (evidence-based decision making and the ability to design experiments), a trusted advisor (communication and relationship skills), a quantitative analyst (statistics and machine learning) and a business expert (domain knowledge: knows how the world works in a specific context).

According to this view a data scientist should possess all of these qualifications to a certain extent. Mauro et al. (2017) performed a job families and skills analysis on 2786 job posts containing 'Big Data'. They conclude that the main skill of Data Scientists is analytics (leveraging big data methods), but that they also need to understand the business context, possess project management skills and be able to access data warehouses and query databases. Looking at these skillsets, they conclude that there is some overlap with the skillsets of business analysts and big data engineers, although these roles can be quite easily distinguished from that of a data scientist.

In a study performed by Debortoli et al. (2014), a pool of over 1000 Big Data and 4000 Business Intelligence job ads was analyzed. There are similarities and differences, but when the competency requirements are compared from a high-level perspective, BI requires skills in the area of commercial software platforms, whereas Big Data relies more on software engineering, statistical skills and open-source products.

It can be concluded that although there are a lot of commonalities in literature on competences for Data Scientists and related roles, there are some differences as well. Sometimes the same competences are classified differently or might have a different meaning. Therefore, for the purpose of this research, to be able to reason about Data Science competences in a uniform way throughout DCG, the EDISON framework is being used as a frame of reference.

3 EDISON FRAMEWORK

In 2015, the European EDISON project was established by the EDISON consortium (primarily consisting of European government and academic institutions). Its objective is to support universities, research centers and industry in coping with the potential shortfall of Data Scientists and to define the framework of competences as well as the body of knowledge for this profession.

The Edison Data Science Framework (EDSF) provides a basis for the definition of the Data Science profession and is enabling the definition of the other components related to DS education, training, organizational roles definition and skills management.

Following the definitions by NIST (n.d.), Data Science refers to the conduct of data analysis as an empirical science. According to NIST, 'This can take the form of collecting data followed by open-ended analysis without preconceived hypothesis (data exploration or data mining). The second empirical method refers to the formulation of a hypothesis, the collection of data, new or pre-existing, to address the hypothesis and the analytical confirmation or rejection of the hypothesis. Note that the hypothesis may be driven by a business need, or can be the restatement of a business need in terms of a technical hypothesis. The key concept is that data science is an empirical science, performing the scientific process directly on the data'. NIST further states: 'Data science is the extraction of actionable knowledge directly from data through a process of discovery or hypothesis formulation and hypothesis testing'.

Based on the NIST definitions, Edison defines a data scientist as follows: 'A Data scientist is a practitioner who has sufficient knowledge in the overlapping regimes of expertise in the business needs, domain knowledge, analytical skills and programming and systems engineering expertise to manage the end-to-end scientific method process through each stage in the big data lifecycle' (Edison, n.d.).

The main components of the Edison Data Science Framework (EDSF) are:

CF-DS (Data Science Competence Framework). This provides the overall basis for the whole framework; its first version was published in November 2015. Since then it has been widely discussed at numerous workshops, conferences and meetings. The Core CF-DS includes common competences required for Data Scientists, which are classified in competence groups that are summarized as follows:

• DSDA. Data Analytics group including for example Machine Learning, statistical methods and business analytics.

• DSENG. Engineering group including software and infrastructure engineering.

• DSDM. Data Management group including data curation, preservation and data infrastructures.

• DSRM (Research Methods). Create new understandings and capabilities by using scientific methods (e.g. hypothesis formulation and experimentation).

DS-BOK (Data Science Body of Knowledge). This is the body of knowledge to support CF-DS and is organized in knowledge areas corresponding to the competence groups.

MC-DS (Data Science Model Curriculum). Learning outcomes are defined based on CF-DS competences and Learning Units are mapped to Knowledge Units in DS-BOK. It will allow for flexible curricula development.

DSPP (Data Science Professional profiles and occupations taxonomy).

Each competency group consists of six competences to account for the clustering of specific coherent competences (to illustrate: DSDA1 deals with a variety of Machine Learning techniques, whereas DSDA4 is related to performance and accuracy metrics for model evaluation). In total 24 DS competences have been recognized by the Edison CF (see Appendix 2 for an overview).

Edison further recognizes a portfolio of 22 Data Science roles that have been described in the Data Science Professional Profiles (DSPP), ranging from, for example, a Data Scientist and Data Analyst to a Data Steward and Data Science Manager. Each role is characterized by a different valuation of the importance of each of the twenty-four standardized competences (for example, some roles will have specific DSDA competences emphasized where others are more strongly focused on specific DSENG competences).

Following the Edison Architecture, it is possible to ´compose´ the associated Knowledge Units and Skills, given a specific DSPP profile (that by itself is characterized by a different valuation of the importance of each of the competences). These Knowledge Units can be used as a ´Functional Design´ to have specific courses developed or contracted.
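To illustrate the idea of 'composing' Knowledge Units from a role profile, a minimal sketch in Python is given here. It is not part of the Edison toolset. The importance weights, the threshold, the Knowledge Unit labels and the function name `compose_curriculum` are hypothetical and serve only to show the principle: competences that a profile values highly select the Knowledge Units attached to them.

```python
# Hypothetical importance weights (0.0-1.0) of some competences for one role.
# The competence codes follow the Edison naming; the values are invented.
ROLE_PROFILE = {"DSDA1": 0.9, "DSDA4": 0.7, "DSENG1": 0.2, "DSDM1": 0.5}

# Hypothetical mapping from competence to Knowledge Units (labels invented).
KNOWLEDGE_UNITS = {
    "DSDA1": ["KU-ML-basics", "KU-statistics"],
    "DSDA4": ["KU-model-evaluation", "KU-statistics"],
    "DSENG1": ["KU-software-engineering"],
    "DSDM1": ["KU-data-management"],
}


def compose_curriculum(profile, knowledge_units, threshold=0.5):
    """Collect Knowledge Units for competences valued at or above threshold.

    Competences are visited from most to least important, and each
    Knowledge Unit is listed only once.
    """
    selected = []
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    for competence, weight in ranked:
        if weight < threshold:
            continue
        for unit in knowledge_units.get(competence, []):
            if unit not in selected:
                selected.append(unit)
    return selected


if __name__ == "__main__":
    for unit in compose_curriculum(ROLE_PROFILE, KNOWLEDGE_UNITS):
        print(unit)
```

The resulting ordered list could then serve as the kind of 'Functional Design' mentioned above, under the assumption that a simple threshold is an acceptable way to decide which competences matter for a profile.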

4 CONCEPTUAL MODEL

This section presents the conceptual framework of this research. Based on the findings of the preceding literature review, the model combines theories, findings and frameworks that have been used in academic literature. From a high-level perspective, the proposed research framework combines two streams of research. The first stream is focused on Strategic HRM for public service organizations. The CBHRT framework from Paauwe (2004) and its dimensions have been used as a starting point. The PMT dimension in this framework has been replaced by the notion of a Public Value strategy and additional attention is paid to the public values in the SCL dimension. This adjusted framework allows us to explore and reason about the alignment between each of the three Strategic HRM dimensions of the conceptual model and the development of DS (related) competences. The second stream is focused on a framework adapted from Maes (2003), positioning Data Science as a connecting discipline between business, IT delivery and traditional Information Management on the one hand and between Strategy and Operations on the other. The adjusted Maes framework can be used to reason about the competences needed in each of the functional expertise domains (Business, Information Management, Data Science and IT) at each level of the organization (the rows). Considering competences at group level (the cells in the matrix, Figure 4) is supported by the notion of group competences, as described in the LOS theory (McEvoy & Buller, 2012).

Finally this research is exploring the extent to which the EDISON framework can be used to support competency development for groups classified in the Maes framework. The conceptual foundation of this research is depicted as follows:

5 DUTCH CENTRAL GOVERNMENT

The Dutch Central Government (DCG) consists of several entities. It covers thirteen departments ('ministeries'), thirty agencies ('baten-lasten diensten') and several hundred Independent Governing Bodies (IGB: 'zelfstandige bestuursorganen'). Defense, Police, Justice and Education are often positioned separately but are in fact part of the departments. The Dutch Central Government employs 150,000 civil servants.

Over the last years, as part of the Digital Transformation agenda, Data Science has taken a prominent place within DCG when it comes to the development of new policies or the execution of responsibilities. As such, 'data driven government', 'datafication', 'artificial intelligence' and machine learning are becoming important topics (Maak Waar!, 2016). This is also evidenced by another recent publication: “Governments are keen to investigate how the benefits of these new technologies will be integrated into public policies as well. In the policy domain of security – broadly interpreted as ranging from national security, law enforcement to the combat and prevention of fraud – the number of programs that involve large-scale data collection, linking and analyses are on the rise” (WRR-policy brief 6, 2017, p5). “Ultimately, digital transformation means reimagining virtually every facet of what government does, from headquarters to the field, from health and human services to transportation and defense” (Maak Waar!, p21).

It has also been recognized that digital transformation is to a certain extent entity bound: each entity takes care of its own processes and often uses a specific approach. This might cause overlap, is inefficient and costly, and is difficult to oversee (Maak Waar!, p11). To be able to develop DS competences within DCG in a structured and planned way, a standardized approach is needed that is applicable to the entire DCG. This was formulated by the Program Manager HR-ICT at the start of this research. As such this research is part of a bigger 'HR-ICT program' that is aimed at finding sustainable solutions for the shortage of high-quality personnel within DCG.

6 RESEARCH METHODOLOGY

Research design

This research primarily followed an inductive research approach in that it explores to what extent Business Strategy and the development of DS competences have been aligned and what competences are needed to successfully create public value through DS. As the approach has also been inspired and guided by existing frameworks, the research can also partly be considered deductive.

As DCG -as an entity- is being considered for this research, the research design is characterized as a Single Case design (Yin, 2009). To be able to obtain a good overview and to be able to generalize findings for DCG, multiple embedded ´Units of Analysis´ within DCG had to be considered and were included in this research. These embedded units (informants in departments in ministries, agencies or IGB´s) have been selected using non-probability sampling techniques called purposive sampling and snowball sampling (Saunders, 2012). Initially purposive sampling (using judgment to select units that are considered informative given the research objectives (Saunders, 2012)) was used but was followed by snowball sampling as during interviews new units/candidates have been identified that were considered to bring additional value to this research. As far as informants/interviewee selection is concerned, three important criteria were identified within the purposive and snowball sampling approach:

1. Informants, taken together, should be a fair representation of DCG. This means candidates have been selected from ministries, agencies and IGBs.

2. From the standpoint of representation fairness, informants from multiple functional domains should be selected (Business, Information Management, IT delivery and Data Science) and ideally from each of the three 'activity layers' (Strategy, Structure and Operations) as well. This is especially important to obtain a good and broad view on the competences needed to drive DS in DCG.

3. Each interviewee should be able to broadly contribute to the research objectives. This means each interviewee was expected to have a view on Business-HR-DS competency alignment as well as on the specific (DS) competences needed to further drive DS in DCG.

The principal researcher has had multiple discussions and conversations on the role of HR and on how to best include an HR view in this research. There is strong consensus in DCG that competency development is steered from the functional domains (Business, IM, IT and DS) and that HR involvement is quite limited. Therefore predominantly non-HR interviewees from the functional domains have been included. The HR view has nevertheless been addressed by the inclusion of one HR director, one interviewee in a functional domain who is also working on HR strategic planning and one interviewee concerned with IT Labor Market communication.

As the research questions are aimed at getting better insights in some phenomena (“exploration”) the case study method (in the form of interviews) has been selected as a primary means to achieve this. However a quantitative approach has been added in the form of analyzing quantitative data to be able to complement any findings of the interviews. The quantitative part is related to one smaller part of the research only: in gaining a better understanding of the competences of the Data Scientists currently sought after by DCG. By adding quantitative data the construct validity of a part of this research has increased because multiple sources of evidence will provide multiple measures of the same phenomenon.

The data collection plan was shaped by the theoretical frameworks described previously and is directly linked to the conceptual model as presented in Chapter 4. These frameworks have provided guidance in further exploring the research questions. Prior to the start of the data collection phase a research protocol was established. It is intended to guide the researcher in carrying out the data collection (Yin, 2009) and increases the reliability of the case study research. It provides an overview of the case study project and the field procedures and includes the data collection plan in the form of case study questions. Please see Appendix 1 for the case study questions. The interviews were semi-structured: the researcher has some themes and pre-defined questions, although their use may vary from interview to interview (Saunders, 2012).

Data collection - interviews

Based on the criteria mentioned, over twenty units were selected and invited to join the research. In total sixteen interviews were held in the period June 2017 – August 2017. The interviews were conducted face-to-face, primarily at the office of the interviewee, and typically lasted 90 minutes. Interviews took place in Dutch. Fourteen out of sixteen interviews were recorded; two could not be recorded due to malfunctioning of the recording device. As per research protocol, confidentiality was emphasized prior to, during and after the interviews. Prior to the interview, interviewees were asked to provide some background material to be studied by the researcher. Eight out of sixteen interviewees indeed provided background information on themes like strategy, competences and/or curriculum. As per the criteria previously mentioned, the positioning of interviewees over the functional domains (Business, Information Management, Data Science and IT delivery), levels (Strategy, Structure and Operations) and DCG entities (Ministries, Agencies and IGBs) is depicted in the overview below.

Each interview was followed by an interview report. In this report, parts of the interview have been summarized, some parts paraphrased and other parts literally quoted. Each interview report has been validated and approved by the interviewee concerned. This allowed the interviewees to reflect again on their contribution to the interview and to add new information, perspectives and context. In four cases interviewees indeed clarified some topics and/or added perspectives. This increased the construct validity of the study. During the interviews, some interviewees were asked to provide additional information with details on some of the topics discussed (for example vacancy profiles, descriptions of DS competences and DS related curricula). All sixteen interview reports and related documents are part of the Case Study Database that will be transferred to DCG. The Case Study Database is a key element in the so-called 'chain of evidence'. The principle is to allow another researcher to use these building blocks when replicating the derivation of evidence from the research questions to the case study conclusions (Yin, 2009). This adds to the reliability of the study. When exploring the DS competences needed to successfully drive DS in DCG, the interview questions were set up in such a way that interviewees did not limit themselves to elements within their own cell of the matrix. They were explicitly asked to comment on competences needed in other domains/cells. In this way, interviewees helped to provide views on competences in other cells of the matrix as well.

Quantitative data collection – job vacancies

The qualitative part of the research is aimed at exploring the alignment between Strategy and competences and at exploring what competences are needed to successfully create public value through DS. It has been expected that, to be able to achieve this, competences are needed in each of the functional areas (Business, IT, IM and DS). To better understand the required competences in one specific functional domain (Data Scientists in the DS functional domain), the qualitative part of the research has been followed by a quantitative element. The purpose of that part is to algorithmically scan (text mine) a series of Data Science vacancies formally published by DCG for Data Science competences as formulated by the Edison Competence Framework. By doing this an additional source is added to the research: the DS competences mentioned during the interviews can be matched with what is found in the automated discovery. To add understanding on the Data Science competences needed within the Dutch Central Government, relevant job vacancies have been collected. They have been collected either as a result of interviews (where the interviewee shared some vacancies in a Word document) or via 'scraping' of vacancies on a specific website (www.werkenvoornederland.nl). The objective was to capture as many vacancies as possible where the job title at least contained the words 'data scientist'. The purpose was twofold. First, any findings resulting from this analysis could complement the interview findings (triangulation), especially for the cell in the Maes framework containing the Data Scientists (Figure 4, cell 7). Second, if the competences listed in the vacancies, especially those of Data Scientists, are mapped on the Edison framework, it might become possible to uncover tendencies in the way DS competences have been formulated within DCG.

Data Analysis – Interviews

Based on Saldaña (2009), a two-stage coding approach was used. In the first stage a combination of initial (open) coding and provisional coding was applied to relevant text in the validated interview reports. Based on the conceptual framework, a provisional classification scheme was, at least implicitly, available to the researcher. The EDISON framework provided some guidance on a classification scheme for Data Science, and the Maes framework for competences related to Business, Information Management, Data Science and IT. The same goes for the types of fit with respect to Business & HR alignment. Although a formal provisional coding scheme was not used, the specific vocabularies belonging to each of the elements of the conceptual model were used for coding purposes. In the second stage, axial coding was used to derive categories from the initial coding phase. Again, some categories originating from the conceptual model were still considered valid during this coding step. Please find an overview of the coding categories used in Appendix 3.

Quantitative Data Analysis – DS competencies vacancies

To better understand the competences of Data Scientists within DCG (obviously a subset of the total of competences needed to drive DS in DCG), the qualitative part of the research has been followed by a quantitative element. A set of seven formally published Data Science vacancies has been scanned for competences as formulated by the Edison Competence Framework. The number was limited by the vacancies available on the website at the time (the website does not retain vacancies published in the past). In a series of iterations over a two-month period, the researcher provided input to the Edison project team to adjust and improve an existing text mining algorithm (in its initial form it was already part of the Edison toolset) so that it could automatically scan DCG vacancies for Edison DS competences.

During the first step, the algorithm, developed in Python by the Edison team, ingests relevant parts of a vacancy (a Word document per DCG standard layout containing textual formulations on 'function description' and 'function requirements'). These texts are automatically translated from Dutch to English (as the Edison content is expressed in English) and some Natural Language Processing (NLP) steps (like removing punctuation and so-called 'stop words' like 'the', 'and', etc.) are performed. Alongside the NLP processing, a key term extraction step is performed on the text data, in which multi-word key terms that belong together (so-called 'n-grams') can be given high relevance (see footnote 2). During the second step both the vacancy text and the Edison competences are represented as vectors, using a term frequency-inverse document frequency (TF-IDF) approach. In the next step these vectors are used to compute a similarity value (based on cosine similarity) between each vacancy and each of the twenty-four Edison DS competences. Finally, the end result is represented in the form of a visualization (a spider diagram). The presentation and visualization are written in Angular2. Please see Appendix 5 for relevant parts of the algorithm, written in Python (see footnote 3).

2 The Edison team used a state-of-the-art algorithm known as SGRank for this. This transformation was considered necessary as otherwise the TF-IDF in the second step would consider single words only, hampering the semantic power and expressiveness of n-grams.

3 Not all code has been included, as some code is not related to any algorithmic processing but, for example, only persists data in a database (MongoDB) or visualizes the end result.
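The vectorization and similarity steps described above can be illustrated with a minimal sketch in plain Python. This is not the Edison algorithm (relevant parts of which are in Appendix 5): the translation step and the SGRank key term extraction are omitted, the stop word list is a small invented sample, the smoothed IDF formula is one common variant chosen here as an assumption, and the competence texts are abbreviated stand-ins based on the four competence group descriptions rather than the twenty-four full competence definitions.

```python
import math
import re
from collections import Counter

# Abbreviated stand-ins for Edison competence texts (assumption: the real
# algorithm compares against the full descriptions of 24 competences).
COMPETENCES = {
    "DSDA": "machine learning statistical methods business analytics",
    "DSENG": "software engineering infrastructure engineering",
    "DSDM": "data management data curation preservation data infrastructures",
    "DSRM": "research methods hypothesis formulation experimentation",
}

# Small illustrative stop word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "and", "a", "an", "of", "in", "to", "with", "for", "is"}


def tokenize(text):
    """Lowercase the text, drop punctuation and remove stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        vector = {}
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            # Smoothed inverse document frequency (one common variant).
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1
            vector[term] = tf * idf
        vectors.append(vector)
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if one is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_vacancy(vacancy_text, competences=COMPETENCES):
    """Similarity between one vacancy text and each competence text."""
    names = list(competences)
    vectors = tfidf_vectors([vacancy_text] + [competences[n] for n in names])
    vacancy_vector = vectors[0]
    return {
        name: cosine_similarity(vacancy_vector, vector)
        for name, vector in zip(names, vectors[1:])
    }


if __name__ == "__main__":
    # Invented example sentence, not taken from a DCG vacancy.
    vacancy = ("We seek a data scientist with experience in machine "
               "learning and statistical methods.")
    for name, score in score_vacancy(vacancy).items():
        print(f"{name}: {score:.3f}")
```

The resulting scores per competence are the kind of values that could be plotted on the axes of a spider diagram. Because single words are used as terms here, the sketch also shows why the footnote considers n-grams necessary: 'machine' and 'learning' are matched separately rather than as one key term.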

7 RESEARCH FINDINGS

In the first part of this chapter, research findings are presented on the alignment between each of the three Strategic HRM dimensions of the conceptual model and the development of competences to further drive DS into DCG. That part is concluded with a summary depicting how the identified alignment issues feed into DCG competence development. Based on the data collected, the second part proposes the competences for each functional domain (Business, IT, IM and DS) and at each level (Strategy, Structure and Operations).

7.1 ALIGNMENT BETWEEN STRATEGIC HRM DIMENSIONS AND DS COMPETENCES

The alignment between each of the three Strategic HRM dimensions of the conceptual model and the development of DS competences has been explored in the interviews. The purpose of this paragraph is to depict a high level view for DCG, based on ´the common denominator of findings´ experienced during the interviews.

Public Value creation through DS (PMT dimension)

Interview data indicate that some entities have a clear view on the role of DS in the public value strategy (like Belastingdienst, RWS, CBDS and NFI). For some of the entities the application of Big Data and Data Science is daily reality and tightly connected to short and longer term objectives in delivering that value. On the other hand, 'i-strategy documents' often clearly mention Big Data and DS as strategic imperatives, but the current state of execution, as experienced by some interviewees, is still considered limited (i3, i7, i8). Almost all interviewees are concerned about the speed of adoption of (the use of) DS in policies and legislation. “Policy makers typically have a paper based multi-year horizon whereas the dynamics in our data driven activities typically are short cycled: the speed with which policies and legislation are being developed doesn't match the pace of society” (i10). Multiple interviewees notice that, potentially because of security, privacy and legal issues, senior management (and/or legal counsel) might be hesitant to follow through on absorbing DS into the primary process: “we need (senior) management to think in terms of opportunities, not in risks; clear legal policies would be key” (i5). Furthermore, multiple interviewees feel that senior leaders/managers need help on how to 'safely' develop a data-driven strategy (i6, i7 and i9). One interviewee mentioned: “at a certain time we were getting two highly classified Data Scientists available, as their project ended but our organization really didn't know how to make best use of them…of course these two DS found new assignments themselves outside this ministry” (i5).

Configuration

This dimension of the model deals with the alignment between key characteristics of the organization and the development of DS competences.

In general, at the ministries, the level of innovation within the IM domain is considered limited. “They are usually aimed at process control and compliancy, not on innovation” (i5). Multiple interviewees report that (DS) innovations have been initiated bottom-up. “We used to push DS innovations/products to the business, although we are experiencing increased involvement from the business” (i10). Others report: “there hardly is any enforcement at all from the business or CIO side, we ourselves want to work more data driven” (i8). These findings are consistent with the earlier finding that, potentially because of security, privacy and legal issues, senior management (and/or legal counsel) might be hesitant to follow through on speeding up the absorption of DS into the primary processes.

Some refer to multiple layers of Information Managers in the organization that all need to be aligned: with the agencies, at DG level as well as on CIO level and this might slow down innovation (i5).

As far as training and education are concerned, each 'hardcore DS entity' (RWS datalab, NFI, Belastingdienst and CBDS) has developed a DS curriculum that is mandatory for aspiring Data Scientists. It usually addresses 'the way we work as a DS entity' as well as professional DS training, primarily in house. Especially for NFI, strict adherence to and successful completion of this curriculum is considered crucial to be able to act as a Forensic Data Science Expert in court when needed. These four entities have typically been able to attract adequate talent (Master and Doctorate level) for DS positions as they have interesting projects, huge data volumes and mentoring capacity available to aspiring data scientists. In general, outside these hardcore DS entities, the degree of formal mandatory (DS related) training is limited: employees have a substantial degree of freedom to develop themselves and follow a variety of courses, in the field of DS amongst others. There are a few other entities (like I&M) that have developed a curriculum for managers and policy makers. This curriculum is currently being implemented on a small scale. Others report having followed a DS master class themselves to gain a better understanding of the topic.

By design, within the DCG, the shorter term political mandate following the 4-year electoral cycle needs to be balanced against the longer term strategy. “The longer term strategy basically is determined by the Secretary General (SG) and this might collide with the short view of the minister: sometimes IT initiatives are kept somewhat limited or not too explicit to avoid conflicts and potential risk” (i3). Balancing short term DS needs and longer term fixed budgeting is also mentioned as a potential problem. Some interviewees report that they could easily grow their DS team but that they experience budgetary constraints that can only be resolved, if possible at all, on short term (within the next 1-2 years). One interviewee reported: “Not being able to grow DS capacity is prohibiting DCG to collect additional taxes as there are substantial opportunities to better detect fraud” (i10).

The four ´hardcore DS entities´ (RWS datalab, NFI, Belastingdienst and CBDS) all, without exception, report having experienced very strong leadership support to drive, build and develop their DS departments or teams. Some report that having strong leadership has made all the difference in creating the necessary ´room to breathe´ for a DS team to operate, attract talented individuals and create a specific atmosphere (for example agile interior, table football, co-location of resources, etc.). Some even refer to “an entrepreneurial atmosphere”. A few interviewees fear the day that a specific senior leadership individual will leave (i11, i14). “This specific manager has made all the difference and allowed us to develop DS to where we are today” … “should he be replaced by a traditional ´old school´ senior government leader we are afraid to enter the risk-avoidance era again and it will set us back substantially”.

This contrasts with reports from others who do not experience strong leadership on DS topics: “their [´managers´] role in DS projects is really distant, they don´t drive it but basically play an ´admin role´ only” (i3). Role modelling, leadership, entrepreneurial spirit, thinking in terms of opportunities instead of risks, and leadership that really has a good understanding of the typical DS approach were mentioned frequently (i2, i3, i4, i10, i14).

Social-Cultural Legal dimension

This dimension of the model deals with the alignment between the social-cultural and legal mechanisms and the development of DS competences. Until recently, DCG has had a strategy and culture of outsourcing/out-tasking as many IT roles and functions as possible to external organizations or to internal DCG IT providers (Logius, SCC-ICT, DICTU, ICTU, etc.). This resulted in quite a large amount of external capacity, also in DS related roles. Although within NFI, Belastingdienst, RWS datalab and CBDS the amount of external capacity is not dominant (or sometimes even limited), there are DS projects ongoing that depend solely on external capacity. “There is a crucial project completely staffed with external capacity” … “should the external project manager and the Data Analysts be leaving, we will be having a real problem” (i5). Another interviewee reported a project “that is critical and classified as ´highly confidential´” … “the 3 DS in the project team of 6 are external capacity” (i1). Some interviewees, especially at IT services organizations (like DICTU and ICTU), clearly indicate that they currently do not intend to build hardcore DS capabilities themselves. The main reason mentioned is that, for a service provider to be able to employ hardcore DS capacity, there should be long term perspectives for these roles in the organization, driven by projects coming from internal DCG customers (“and we expect our customers to develop hardcore DS competences themselves” (i1)).

Some interviewees specifically refer to values that DCG needs to adhere to, as they are deemed crucial and need to be built into primary processes when using DS: transparency, traceability and testability (i5 and i10) or reliability of data and information (i7). As soon as senior leaders are confident these can be guaranteed, additional momentum is expected. “Creating confidence with our senior leaders by increasing awareness on and familiarity with DS and how a DS project typically is run would certainly help” (i5). “At the same time legal counsel needs to be involved upfront to develop policies that can be used to better guide our DS projects” (i5).

Some report that the tight labor market for DS and the difficulty of fitting senior DS capacity into FGR (Functiegebouw Rijk) in terms of salary make it hard to attract talent. Others report that, even in terms of compensation, DCG is able to offer ´market-equal-pay´, or mention the concept of ´Public Service Motivation´ as an important reason for talent to work for DCG. Throughout the four ´hardcore DS entities´ (RWS datalab, NFI, Belastingdienst and CBDS) there is a general notion that it is currently not that difficult to attract new academic talent or internal capacity. As one interviewee stated: “we offer a huge dataset, modern tools and processes (Agile/SCRUM), a DS curriculum, a mentor guiding new team members and the opportunity to really make a difference for society” … “we offer a huge and steep learning experience for new hires”. Some report it is more difficult to retain senior talent: “we really need to offer them exposure to new DS insights, research methods, etc.” (i10).

A few interviewees mention some characteristics of the current workforce: it is an aging population and highly protected (by legislation and reputation). In general it is considered a ´dependent workforce´ that feels comfortable with ´long-life employability´ in a strong and stable process- and compliance-driven culture. The speed with which new technology might change
