Analysis of Process Mining Reality

Timo Rafaël GRUWEL

Student number: 11270543

Supervisor University: Loek Stolwijk

Supervisor Ebicus: Mark Ormel

Abstract. In recent years, companies have increasingly applied process mining to gain more insight into their processes. However, an interesting question remains whether this technique truly reflects reality. Process mining uses event logs to discover business processes within a company. These event logs are based upon the interaction of employees with different IT systems. This research studies, by means of two cases, whether process mining reflects reality. The first case is an unstructured absence-through-illness process in an insurance company and the second is a structured change management process in an IT department. The results of the process mining analysis have been compared with the results of multiple interviews, which led to the conclusion that process mining does not always reflect reality. This study shows that process mining can discover real processes, but it also provides insights that are necessary to understand why process mining will not work in certain circumstances.

Keywords. Process Mining, Alignment, Process Analysis, Disco Fluxicon

1. Introduction

The way of doing business is evolving, and many new technologies enable this evolution. For some years, systems have been moving from single functionalities towards business process orientation (Burattin, 2015). As a result, a new research area called “process mining” is emerging. Process mining can be defined as the method of extracting a structured process from event data (Maruster, Weijters, van der Aalst, & van den Bosch, 2002). Burattin (2015) argues that process mining deals with several different activities that all aim to extract knowledge from available log files. Today’s information systems typically record the execution of process tasks in event logs (Bozkaya, Gabriels, & Werf, 2009; Suriadi, Andrews, Ter Hofstede, & Wynn, 2017). The amount of event data is increasing enormously, but there are still open challenges and many organisations are unaware of the potential of process mining (Van der Aalst, 2012).

Brighman and Introna (2007) mention a difference between formalised processes and the workarounds that make systems work. In this case, the predefined processes can be considered as the description of formalised processes (Brighman & Introna, 2007). By applying process mining, differences between the formalised (a-priori) processes and the process as registered in event logs can be analysed.

To obtain a better view of process mining, consider its environment as shown in Figure 1. In practice, the a-priori process (the predefined and intended process) often differs from what actually happens, as described by the real process (Mans, Schonenberg, Song, Van Der Aalst, & Bakker, 2009; Song & Van Der Aalst, 2007), which explains difference A.

Figure 1: Environment of Process Mining

The relation between an a-priori process and the process as analysed by process mining (relation B) is used for conformance checking. This means that the a-priori model is compared against the model as analysed by process mining (Mans et al., 2009; Van der Aalst, 2011). The two models may coincide or differ, depending on the process analysed.

The most interesting relationship, however, is between the process as analysed by process mining and the real process. Does process mining give insight into the real process? The event logs of an Enterprise Resource Planning (ERP) system (e.g. Oracle or SAP) or Customer Relationship Management (CRM) system (e.g. Siebel or Salesforce) are recorded automatically, but without a systematic approach. Even so, there is some level of guarantee that the event logs match reality, and they can thus be considered trustworthy (Van Der Aalst et al., 2012). Other authors add that solid process mining techniques applied to event logs reflect reality (Ter Hofstede, Van der Aalst, & Weske, 2003).

Although the event logs of ERP and CRM systems are considered to reflect reality in a trustworthy way, no research could be found that specifically states that relation C is an equality (i.e. that the process as analysed by process mining equals the real process). Van der Aalst et al. (2012) also mention that models derived from event logs provide only a view on reality. Although process mining is an objective, non-biased way of process analysis, the log will not reflect reality if people bypass information systems by doing tasks differently (Van der Aalst et al., 2003). In addition, problems such as incompleteness or noise (i.e. data that is incorrect or incomplete) within log files can affect process mining negatively (Burattin, 2015; Tiwari, Turner, & Majeed, 2008).

Since the application of process mining is increasing within organisations (Van der Aalst, 2014), it is important to know whether process mining actually provides an accurate assessment of reality. For that reason, the research question is:

“Does the application of process mining reflect the reality of the executed processes?”

By answering the research question, the aim is to examine the relation between the real process, as it is executed in reality, and the analysed process, as obtained with process mining techniques. The research question is decomposed into a set of sub-questions that help to answer it. They are listed below:

1. What is the relationship between the formalised a-priori processes and the real processes?

2. What is the relationship between the formalised a-priori processes and the processes as analysed by process mining?

3. What criteria need to be considered for process mining to give a correct representation of reality?

The remainder of the paper is organised as follows. First, process mining is explained in depth, along with the advantages and guidelines that are considered for its application within the two case studies. Second, process mining is put in a broader context to explain its position relative to related disciplines. Subsequently, the methodology, consisting of data collection, data analysis and interviews, is elaborated. Then, each sub-question is answered based on the available literature. Next, section 8 presents the results, and finally the conclusion and limitations are discussed.

2. Process Mining

Process mining is an emerging research area (Saylam & Sahingoz, 2013; Suriadi et al., 2017; Van der Aalst, 2012). It aims to extract knowledge from event logs to use it for the purpose of analysing processes (Mans et al., 2009). Initially, process mining was used to discover processes from event logs as is explained by Maruster et al. (2002). It could be used to analyse how procedures are really taking place or how people really behave in the bigger process (Van Der Aalst, Weijters, & Maruster, 2004). However, nowadays there are three types of process mining that are considered (Van der Aalst, 2016), which are positioned in Figure 2.

2.1. Types of Process Mining

The first type of process mining is discovery. The goal of process discovery is to derive a process model, and the input necessary for this is an event log. Just like any other form of data analysis, the input quality will define the output quality (Suriadi et al., 2017). The process mining manifesto describes a star rating which defines the maturity of event logs (Van Der Aalst et al., 2012). Once the maturity of the event log is high enough (i.e. three stars or more) and there is at least a case identifier, an activity name and some kind of timestamp, the event log can be used as input for process discovery (Bernard, Boillat, Legner, & Andritsos, 2016).

The second type is conformance, where the event log is compared against a predefined process model. Arpasat, Porouhan and Premchaiswadi (2015) argue that conformance can also be seen as auditing, since the purpose is to compare the models, look for deviations and identify the fitness between the model and the log. The severity of potential deviations can be analysed in measures of time or frequency (Van der Aalst, 2016). So, conformance checking is used to check whether reality (according to the log) conforms to the model.
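As a rough illustration of the idea behind conformance checking, the sketch below replays traces against a model expressed as a set of allowed directly-follows transitions and reports the fraction of conforming traces. This is a deliberately simplified stand-in for the fitness measures discussed in the literature; the model, the activity names and the function names are hypothetical and not taken from either case study.

```python
from typing import Iterable, Sequence, Set, Tuple

Transition = Tuple[str, str]


def trace_conforms(trace: Sequence[str], allowed: Set[Transition],
                   start: str, end: str) -> bool:
    """Return True if the trace starts and ends correctly and uses only allowed steps."""
    if not trace or trace[0] != start or trace[-1] != end:
        return False
    return all((a, b) in allowed for a, b in zip(trace, trace[1:]))


def log_fitness(traces: Iterable[Sequence[str]], allowed: Set[Transition],
                start: str, end: str) -> float:
    """Fraction of traces in the log that conform to the model."""
    traces = list(traces)
    if not traces:
        return 0.0
    conforming = sum(trace_conforms(t, allowed, start, end) for t in traces)
    return conforming / len(traces)


# Hypothetical a-priori model: register -> test -> implement -> close
MODEL = {("register", "test"), ("test", "implement"), ("implement", "close")}

# Hypothetical log with four cases
LOG = [
    ["register", "test", "implement", "close"],          # conforms
    ["register", "implement", "close"],                  # skips the test step
    ["register", "test", "implement", "close"],          # conforms
    ["register", "test", "test", "implement", "close"],  # repeats the test step
]

fitness = log_fitness(LOG, MODEL, start="register", end="close")
print(f"trace-level fitness: {fitness:.2f}")
```

A trace-level measure like this only says whether a case deviates, not how severely; measures based on time or frequency, as mentioned above, would need additional event attributes.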

The third and last type of process mining is enhancement. After finding differences between the current process (according to the log) and the predefined process model, it is possible to look for opportunities to enhance the process. Here, improvements and solutions are suggested to change or extend the process (Arpasat et al., 2015).

2.2. Advantages

Process mining offers several advantages. It provides objective information about how people work (Saylam & Sahingoz, 2013). The biggest advantage of process mining is the quick and objective analysis of business processes, which is possible by using data that is already available (Günther, 2014). Traditional approaches with interviews proved to give less accurate information due to the fragmented knowledge of actors within the process and the subjectivity with which actors tell how they should work instead of how the work is actually done (Dumas, La Rosa, Mendling, & Reijers, 2012). Process mining can also be used to compare the predefined process and the actual process, better known as conformance checking (Saylam & Sahingoz, 2013). This is done for auditing purposes (e.g. to detect fraud) or for business IT alignment (Burattin, 2015).

Another relevant application area for process mining is Business Activity Monitoring (BAM), which is used for real-time monitoring of business processes (Van Der Aalst et al., 2012). BAM tools allow managers to monitor business processes in real time by means of dashboards (De Weerdt, Schupp, Vanderloock, & Baesens, 2013). These tools can benefit from process mining, as stated by Van der Aalst et al. (2012), because process mining provides insights into conformance.

Nevertheless, these advantages will only be attained when the right data is used. According to Saylam & Sahingoz (2013), obtaining the right data is the most difficult part of process mining and is thus an important prerequisite for profiting from these benefits.

2.3. Conditions and Requirements

Several articles state that the bare minimum to apply process mining consists of an event log with at least a case identifier, a task identifier and a time notation (Bernard et al., 2016; Dumas et al., 2012; Van der Aalst et al., 2003). When these requirements are met, process mining can be applied. Naturally, additional attributes (e.g. resource, cost of resource etc.) are desirable to gain more insights.

Besides the bare minimum requirements that need to be available in an event log, several guidelines recur in the literature and are therefore worth mentioning (Van Der Aalst et al., 2012). In the next subparagraphs, the importance of the event log is explained, followed by a short explanation of six guiding principles.

2.3.1. Event Log

Before benefitting from the output of process mining, the input needs to be considered. Event logs are used as the starting point of process mining and serve as input. As explained previously, the data of the event log can be mined and different aspects can be analysed (Jans, Van Der Werf, Lybaert, & Vanhoof, 2011).

Figure 3: Example of a fragment of an event log (Van der Aalst, 2016, p. 36)

As can be seen in Figure 3, an event log records tasks (i.e. events/activities) for processes. The event identifier corresponds to an activity for a specific case. In the fragment, there are two cases, which means there are two trails of how the process is executed. The minimum requirements are met, namely a case id, an event id (or name), and a time notation. Besides, there are two additional attributes: resources and costs.
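To make the minimum requirements concrete, the following sketch loads a small event log, checks that the three required columns are present and groups the events into one ordered trace per case. The column names and all values are illustrative assumptions; they are not taken from Figure 3 or from the case study data.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical log fragment; the values are illustrative only.
RAW = """case_id,activity,timestamp,resource,cost
1,register request,2017-01-02 09:15,Anna,50
2,register request,2017-01-02 09:40,Ben,50
1,check request,2017-01-02 10:05,Carl,100
2,check request,2017-01-03 08:30,Anna,100
1,decide,2017-01-04 14:00,Dana,200
"""

# The bare minimum: a case identifier, an activity name and a timestamp.
REQUIRED = {"case_id", "activity", "timestamp"}


def load_traces(text):
    """Group events by case and order them by timestamp."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"event log lacks required columns: {sorted(missing)}")
    cases = defaultdict(list)
    for row in reader:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M")
        cases[row["case_id"]].append((ts, row["activity"]))
    # Sorting the (timestamp, activity) pairs puts each case in time order.
    return {cid: [act for _, act in sorted(events)] for cid, events in cases.items()}


traces = load_traces(RAW)
for case_id, trace in traces.items():
    print(case_id, " -> ".join(trace))
```

The additional attributes (resource and cost) are simply ignored here; they would only become relevant for analyses beyond the control flow.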

2.3.2. Guiding Principles for Process Mining

Van der Aalst (2012) mentions that the application of new technologies is often prone to errors. That is why six guiding principles emerged to help prevent mistakes. They can be seen in Table 1.

Table 1: Six Guiding Principles (Van Der Aalst et al., 2012)

First, the data collected in logs needs to be reliable. ERP and CRM systems record log data automatically, and this data is considered to be reliable. Business Process Management (BPM) and workflow systems record log data with an even higher maturity, making it more reliable still (Van Der Aalst et al., 2012). Second, process mining will be easier if it is driven by questions. IT systems can consist of many tables that could be relevant for data extraction. Scoping down before extracting data will help in finding meaningful data that can be used. Third, the basic workflow patterns, such as sequences, parallel routings (AND-splits/joins), choices (XOR-splits/joins) and loops, need to be supported. Fourth, events should be related to model elements, because conformance and enhancement rely on the relationship between elements in the model and events in the log. Fifth, a model derived from an analysed event log can be interpreted in different ways. It can be seen as a “geographic map”: depending on the use of the map, different views could be useful. Lastly, according to Van der Aalst (2012), process mining should not be a one-time activity because processes are dynamic and might change.

3. Process Mining in a Broader Context

Process mining is a broad field that overlaps with or is connected to several other disciplines. In the literature, process mining has often been compared to business intelligence, business process management and data mining, as can be seen in Figure 4. The next subsections explain how process mining is related to these fields.

3.1. Business Intelligence

According to Golfarelli et al. (2004), Business Intelligence (BI) can be defined as the process of converting data into valuable information, which in turn can be turned into knowledge (as cited in De Weerdt et al., 2013). BI is a widely used term; it aims to provide information based on analysed data that can be used to support decision making. Although BI lacks process mining capabilities, a part of BI is Business Process Intelligence (BPI), which focuses on analysing operational processes. BPI is sometimes even used as a synonym for process mining, but BPI tools do not support process discovery (Van der Aalst, 2016). Thus, in contrast to BI or BPI tools, process mining helps organisations to actually analyse their processes.

3.2. Business Process Management

Business Process Management (BPM) aims to support business processes by using methods and techniques to design, control and analyse processes (De Weerdt et al., 2013). Business processes can be modelled by using different techniques, but the most common are Business Process Modelling Notation (BPMN), Unified Modeling Language (UML) and Petri Nets. The most important difference between BPM and process mining is that BPM models are handmade, while process mining constructs process models automatically from facts (i.e. event logs) (Fahland & Aalst, 2012; Van Der Aalst et al., 2012).

Initially BPM was applied with the aim to design and implement business processes. Nowadays however, there is an increasing trend in the BPM community that focuses more on how to monitor and control processes. This trend is more data-driven and that is why process mining is frequently used in BPM (Van der Aalst, 2016).

3.3. Data Mining

Data mining is a data driven technique that aims to analyse big data sets to find relationships (Van der Aalst, 2016). Data mining is data-centric and is unable to discover processes (Van der Aalst, 2011). Process mining however, is process-centric which allows it to analyse processes deriving from event logs. Saylam & Sahingoz (2013) state that process mining can be seen as a link between data mining and BPM.

Nonetheless, there are some data mining techniques that are similar to process mining, such as sequence or episode mining. It is important to note, however, that these do not consider end-to-end process models. Besides, with process mining it is possible to analyse event logs in a relatively easy way (Van der Aalst, 2016).

4. Methodology

This research consists of four major elements. First, an initial literature review was carried out. This led to insights into the topic and to a research question. The research question was then divided into three sub-questions, which were answered by means of existing literature. Second, to answer the main question, event data of two specific cases were collected. Third, this data was filtered and analysed. Fourth and last, several interviews were conducted and compared with the results of the data analysis in order to obtain multiple perspectives on reality.

4.1. Literature Review

The literature review has been divided into two parts. In the first part, sections 2 and 3 provide an overview of process mining and how it is related to different fields. The concept has been explained in detail and put into a broader context in which BI and BPM play a part as well. The aim of this first part is to create a solid knowledge foundation.

The second part of the literature review consists of sections 5 to 7, in which the three sub-questions are elaborated. These three sub-questions are the prerequisites for answering the main question. Relations A and B from Figure 1 are explained in depth and the criteria for an optimal application of process mining are researched.

4.2. Data Collection

Within the available network of Ebicus, different clients have been compared and finally two different clients have been contacted successfully. Within this research, two cases are used because Bryman (2015) stated that many authors recently argued for the use of case studies that use more than one case. These “multi-case” studies improve the theory building and make the research more reliable.

4.2.1. Insurance NL

Insurance NL has been chosen because, according to De Weerdt et al. (2013), the financial services industry is of main interest for process mining. This is due to its human-centric processes, which means that the analysis of these event logs can be highly beneficial. The data was queried from Siebel 8, which is a CRM system owned by Oracle. This system is used within Insurance NL as a call centre system. Insurance claim and absence-through-illness processes are handled within this system (Interviewee A, 2017).

A CSV file with 320,434 rows covering a period of one year was created on the 25th of April 2017. The specific data that was collected is shown in Appendix A.

4.2.2. Dept. IT

The process analysed within Dept. IT concerns a change process within an Oracle environment. This process follows different steps, such as register RFC (Request for Change), test change, implement change etc. The data used was queried from Topdesk 5.7. Topdesk is a service management solution that helps companies to support their business departments.

Two XLS files were created on the 12th of May 2017, one containing all change activities from 2016 and the other containing all change activities of 2017. Combined, they contained 87,487 rows. The specific data that was collected is shown in Appendix B.

4.3. Data Analysis

The data for both case studies has been analysed according to the Process Mining Methodology Framework of De Weerdt et al. (2013). They stated that the application of process mining in real environments is an issue and their framework consists of a broad number of steps that aid this applicability. Most of the steps are followed within this research.

The first step was data extraction. This was done in cooperation with business experts, as explained in the previous paragraph. The data was then pre-processed to filter out unnecessary data and to clean the data that was needed.

The next step was the process data exploration, in which the data was imported into the process mining tool Disco. Here, an initial analysis was started to get familiar with the data. Depending on the case, these first steps can be repeated a few times, as can be seen in Figure 5.

Figure 5: Process Mining Framework adapted from (De Weerdt et al., 2013)

The last step was the discovery analysis. This started with the visualization of the event logs and was then scoped to more frequent behaviour to limit the variety as suggested by De Weerdt et al. (2013).
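Scoping to more frequent behaviour can be pictured as keeping only the most common trace variants. The sketch below is a minimal illustration of that idea and does not reproduce what Disco does internally; the `coverage` threshold and the example traces are assumptions made for this example.

```python
from collections import Counter


def frequent_variants(traces, coverage=0.8):
    """Return the most frequent variants that together cover `coverage` of all cases."""
    counts = Counter(tuple(t) for t in traces)
    total = sum(counts.values())
    kept, covered = [], 0
    for variant, n in counts.most_common():
        # Stop as soon as the variants kept so far cover enough cases.
        if total and covered / total >= coverage:
            break
        kept.append((variant, n))
        covered += n
    return kept


# Hypothetical log of ten cases spread over three variants
TRACES = (
    [["A", "B", "C"]] * 6
    + [["A", "C"]] * 3
    + [["A", "B", "B", "C"]] * 1
)

top = frequent_variants(TRACES, coverage=0.8)
for variant, n in top:
    print(n, " -> ".join(variant))
```

With these example traces the rare variant with the repeated activity is dropped, which limits variety at the price of hiding exceptional behaviour.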

4.4. Interviews

The interviews have been conducted with the goal to obtain another perspective of reality which can be compared to the findings gained by applying process mining. The method applied for these interviews is based on the seven steps of discovering processes (Sharp & Mcdermoot, 2009). These steps are described in Figure 6:

Figure 6: Seven Steps of Discovering Processes (adapted from Sharp & Mcdermoot, 2009)

In short, background information of the process is obtained. This was then further elaborated by means of initial interviews with the functional area representatives. Thereafter, detailed interviews with key-users were conducted. The results were then analysed and the activities were identified and linked per the input from the different interviews.

The interviews were semi-structured, because this format is more useful than structured interviews for exploring new areas and it allows respondents to answer in their own terms (Bryman, 2015). Prior to the interviews, an interview guide was created based on recommendations from Sauders, Lewis & Thornhill (2012). Finally, Atlas.ti v7.5.18 was used to code the transcripts according to the coding manual (Saldana, 2012). The transcripts can be seen in Appendix F.

5. Non-Alignment of Formalised Processes and Reality

Ideally, the aim of BPM is to design business processes and then to implement them (Van der Aalst, 2016). However, these processes need to be executed by means of IT systems. Brighman and Introna (2007) studied the alignment of IT systems and they came up with three concepts that influence this, namely hospitality, bricolage and enframing. Business IT Alignment is important because it explains the dynamic fit between business processes and IT (Henderson & Venkatraman, 1999).

Hence, these three concepts influence the alignment between the business processes and IT, and thus also how these business processes are executed. The first concept (Hospitality) explains how IT systems are received by the users. The second concept (Bricolage) concerns improvisation and workarounds. The third concept (Enframing) describes the means of IT systems (i.e. the role a system fulfils that provides a certain outcome).

The most important of the three concepts from Brighman and Introna (2007) is considered to be Bricolage. Improvisation and workarounds are forms of deviating behaviour that users consciously choose, even if they are aware of the procedures (Outmazgin & Soffer, 2016). This can occur in exceptional situations which are not described in the normal procedures. However, intentional non-compliance can also occur, for example: “consider a situation where a customer is urgently requesting some goods and a truck is about to embark in his direction. An employee might decide to immediately load the goods on the truck, while the paperwork of registering the order and the delivery will be done afterward in retrospect” (Outmazgin & Soffer, 2016, p. 310).

Even if there are specific reasons for deviating from the standard process, it is still a workaround. Outmazgin & Soffer (2016) discuss six generic workaround types. Table 2 shows an overview along with a short description per workaround type.

Table 2: Types of Workarounds with a Short Description (Outmazgin & Soffer, 2016, pp. 311–312)

Besides workarounds, Dumas et al. (2012) and Ter Hofstede et al. (2003) highlight that business processes evolve over time. As a result, the process deviates from how it was intended. Yet this is not necessarily a bad development. There are good reasons for changing processes. One of them is that the world is changing, as is the technology that is being used. Another reason could be that new competitors enter the market, which obliges a company to change its processes (Dumas et al., 2012).

Ter Hofstede et al. (2003) further argue that these process evolutions over time create interesting possibilities for the application of process mining, since the discovery is not limited to what is already known but can instead provide new insights. To sum up, much research shows that there is a non-alignment between formalised processes (i.e. as they were intended) and reality, which often deviates because of workarounds and the evolution of processes over time. Song & Van der Aalst (2007) state that in practice there is frequently a gap between what is prescribed and what is actually happening, thus confirming the non-alignment.

6. Conformance

Section 2 mentioned that conformance is one of the three types of process mining. This type checks relation B as seen in Figure 1. Thus, conformance checking is used to investigate whether reality (as registered in the log) “conforms” to the formalised processes (Van der Aalst, 2016). This relation can be verified after the automated process discovery. Process mining can analyse inefficiencies, workarounds that lead to non-compliance and other kinds of performance and deviation issues (Van der Aalst, 2014). Günther (2014) states that process mining can prove whether the process is compliant based on a thorough data analysis. As Saylam & Sahingoz (2013) recall, process mining checks for any differences between what is expected to happen and what actually happens as recorded in the event logs.

Although many studies explain what the relationship between a-priori processes and process mining is about (as seen in section 5), there is no research stating how often processes might deviate. However, different case studies in the literature show that a-priori processes frequently deviate from the processes as analysed by process mining (see Table 3).

| Company | Process | Source |
|---|---|---|
| Insurance Company A | Document Management System flow | De Weerdt et al. (2013) |
| Software Development Company B | Software Development Process | Lemos et al. (2011) |
| Government Company C | Issue Document | Bozkaya et al. (2009) |

Table 3: Case Studies from Literature with Applied Conformance Checking

The first case study concerns a Belgian insurance company whose products include life and non-life insurances as well as retirement savings. When a physical document is received, it is digitised and the subsequent steps are tracked. De Weerdt et al. (2013) found that the behaviour in their analysed process was non-compliant with the a-priori process. In some cases, the process execution took twice as long as had been expected.

The second case study explains the process that occurs when software is developed within a Brazilian software development company. Activities include: document analysis, proposal analysis, development, etc. The case study of Lemos et al. (2011) demonstrated that some projects followed the a-priori process, but most of them deviated from what was expected.

The last case study describes how certain documents were issued at a Dutch Government Company. In comparison to the first two cases, Bozkaya et al. (2009) found that their analysed process terminates correctly in 99% of the cases.

These three diverse case studies demonstrate that relationship B (as seen in Figure 1) can be either different or close to equal. They show that this is highly dependent on which process is monitored and how the process is organised within a company.

7. Criteria for an Optimal Reflection of Reality

There are different criteria necessary to create an optimal reflection of the reality. The first step within the procedure of applying process mining is to gather the right event/raw data (see Figure 7). This data needs to be available and needs to be registered correctly since this is the input side of process mining (Van der Aalst, 2014).

Figure 7: Data Validation Process (Suriadi et al., 2017)

Events can be simply defined as “things that occur” and they can be described by “references” and “attributes”. References refer to names or identifiers (e.g. case, resource etc.) while attributes refer to values (such as age or time). Van der Aalst (2014) defined twelve guidelines for logging based on these definitions (see Table 4).

Table 4: Twelve logging guidelines that provide a solid starting point (Van der Aalst, 2014, pp. 8–9)

These logging guidelines provide a solid starting point for process mining. However, there are some challenges that need to be considered. Dumas et al. (2012) note that in many cases, event data might not be in the required format or needs to be integrated because the necessary data is spread across different sources. Therefore, five challenges have been identified:

| Challenge | Description |
|---|---|
| Correlation | Some IT systems do not recognise process instances. Thus, identifying which event belongs to which case might be a problem. |
| Timestamps | Many IT systems do not log as a primary task. In some cases, the date is saved but an exact time is missing. Besides, different systems could have different time notations or even different time zones could exist. |
| Snapshots | When logging is done for a specific period it could exclude some full end-to-end cases. Such incomplete cases should be removed. |
| Scoping | Scoping could be a challenge when IT systems do not produce event logs directly. In that case, data could be extracted from different tables but such system expertise may not be available. |
| Granularity | Typically, process mining aims to conduct analysis on a conceptual level, but the event logs record much finer details. Therefore, it is difficult to define a process on a correct level of abstraction. |

Table 5: Challenges of Extracting Event Logs (Dumas et al., 2012)
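The snapshot challenge in particular lends itself to a short illustration. The sketch below keeps only those cases that both start and end with the expected activities, so that cases cut off by the extraction window are removed. The start and end activity names are borrowed from the Insurance NL process described in section 8.1.1; the case identifiers and traces are hypothetical.

```python
def complete_cases(traces, start_activity, end_activity):
    """Keep only cases that begin and end inside the extraction window."""
    return {
        case_id: trace
        for case_id, trace in traces.items()
        if trace and trace[0] == start_activity and trace[-1] == end_activity
    }


START = "Intake Employer"
END = "Call employer for full recovery"

# Hypothetical traces keyed by a hypothetical case identifier
TRACES = {
    "c1": [START, "Administrative - Care", END],  # complete end-to-end case
    "c2": ["Administrative - Care", END],         # started before the window
    "c3": [START, "Administrative - Care"],       # still open at extraction time
}

kept = complete_cases(TRACES, START, END)
print(sorted(kept))
```

This check assumes that the process has a single known start and end activity, which will not hold for every process.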

After the logging guidelines have been met and the challenges have been considered, the log analysis can begin. This typically starts with a first visualisation of the process model, which can later be filtered on frequent behaviour (De Weerdt et al., 2013).

One of the biggest challenges is to find a balance between overfitting and underfitting. A model is overfitting when it is too specific and contains all the exceptional behaviour. A model is underfitting when it is too general and allows behaviour that is very different from what was observed in the log. This balance is the crux of process discovery (Van der Aalst, 2016). When all of these criteria are closely followed and carefully considered, the information loss (as depicted in Figure 5) will be minimised. When an event log is of “high quality”, it is valid in the domain context and is most likely to produce a model that confirms reality (Suriadi et al., 2017).

8. Results

8.1. Process Mining Results

8.1.1. Insurance NL

As already mentioned, the financial services industry is of main interest for process mining (De Weerdt et al., 2013). Processes in this sector are human-centric which means that the analysis of these event logs can show new insights. In the absence through illness process this proved to be the case.

The process analysis resulted in a highly unstructured process. The process always starts with “Intake Employer”. This is the trigger of the process where the employer and insurance company are in contact with each other and the employer calls in a sick employee. Depending on the kind of sickness, different activities can follow which results in a highly unstructured process (for the result see Appendix C). The process always finishes with “Call employer for full recovery”. Here the employee is fully recovered, thus the process ends.


8.1.2. Dept. IT

In contrast to the previous process, this change process is highly structured (for the result see Appendix D). Hence, it does not provide new insights in the way the financial services case did (De Weerdt et al., 2013).

This process always starts with “ERP Registreren RFC en afstemmen Proceskring”, translated “Register RFC and align with stakeholders”. Then instead of many different possibilities throughout the process, this process is rather structured and has a general flow. The process ends with “Afsluiten wijziging”, translated “Close Change”.

8.2. Interview Results

The interview analysis resulted in codes, which in turn have been categorised. The full codebook that resulted from the analysis can be found in Appendix E. The interviews led to some interesting findings. One finding is that the code “Wrong Time Registration” only appeared in the Dept. IT case study, where time was registered manually. In the Insurance NL case, time per activity was registered automatically and the interviews did not show any problems with time registration.

Another finding is that the code “Mining Issue” was only encountered in the Insurance NL case, where the interviewees considered that the analysis did not reflect reality. In the Dept. IT case, by contrast, this code was not encountered and the interviewees did consider the analysis to reflect reality.

8.2.1. Insurance NL

The two interviews that have been conducted within Insurance NL proved that the analysed process did not match reality (Interviewee B, 2017; Interviewee C, 2017). The analysed process appeared to be too simplified. Because the process was highly unstructured, almost every activity could follow every other activity. This means that flows can be repetitive (e.g. A-B-A-B-A-B-A-B-A-C, instead of A-B-C) and that the process can take on unlimited variants. Besides that, the same activities were used for different tasks (e.g. “Administrative - Care” was used for all the administrative tasks). The above caused a serious limitation in analysing a generalised process model.
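The effect of repetition on the number of variants can be sketched as follows. This is a hypothetical illustration, not the procedure used in this study: the traces are invented, and `first_occurrences` is an assumed helper that keeps only the first occurrence of each activity. It also discards real information about rework, so it serves only to show why repetitive flows inflate the variant count.

```python
from collections import Counter


def count_variants(log):
    """Count how often each distinct activity sequence (variant) occurs."""
    return Counter(tuple(trace) for trace in log)


def first_occurrences(trace):
    """Keep only the first occurrence of each activity, preserving order."""
    seen = set()
    collapsed = []
    for activity in trace:
        if activity not in seen:
            seen.add(activity)
            collapsed.append(activity)
    return collapsed


# Hypothetical cases that differ only in how often the A-B loop is repeated.
log = [list("ABC"), list("ABAC"), list("ABABAC"), list("ABABABABAC")]

raw = count_variants(log)
collapsed = count_variants(first_occurrences(trace) for trace in log)

print(len(raw))        # 4 distinct variants in the raw log
print(len(collapsed))  # 1 variant once repetitions are ignored
print(first_occurrences(list("ABABABABAC")))  # ['A', 'B', 'C']
```

Every additional repetition creates a new variant, so a process in which almost every activity can follow every other activity has no practical upper bound on its number of variants.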

Despite that, few codes concerning the influences on reality recurred. Time is registered automatically, so the corresponding code did not appear in the transcripts. However, wrong administration did happen (e.g. forwarding activities to the wrong person).

Furthermore, both interviewees mentioned that they could deviate from procedures, but this was either not related to the process itself, or they were allowed and they had the freedom to do so. Thus, the process was executed well but because the activities were executed in a highly unstructured manner and because several mining issues were encountered, the interview results differed from the process mining analysis.

8.2.2. Dept. IT

The three interviews that have been conducted showed that the analysed process matched reality as perceived by the three key users from different departments (Interviewee X, 2017; Interviewee Y, 2017; Interviewee Z, 2017). However, an important finding was that times are often not registered well (Interviewee X, 2017; Interviewee Y, 2017).

Concerning the execution of the process, the interviews demonstrated that there is a good intention of acting and that shortcuts were sometimes taken for urgent changes. Yet, these changes still conformed to the analysed model.


9. Discussion and Conclusion

This research responds to the demand for practical applications of process mining within academic research (De Weerdt et al., 2013). The two cases that have been analysed provide insight into how process mining can be applied in real-life environments. Besides that, this research is the first to verify the reality of a process mining application by means of interviews. Thus, the findings are of use for academics and companies who want to know more about the reflection of reality when applying process mining.

The aim of this research was to validate whether the application of process mining reflects the reality of the analysed business process. Hence, the research question is: “Does the application of process mining reflect the reality of the executed processes?” To answer this question, three sub-questions were created: (1) What is the relationship between the formalised a-priori processes and the real processes? (2) What is the relationship between the formalised a-priori processes and the processes as analysed by process mining? (3) What criteria need to be considered for process mining to give a correct representation of reality?

The formalised a-priori processes and the real processes often deviate from each other because of workarounds and because processes evolve over time (Brighman & Introna, 2007; Dumas et al., 2012; Ter Hofstede et al., 2003).

The relationship between the formalised a-priori processes and the processes as analysed by process mining also proved to be different based on three case studies that can be found in literature (Bozkaya et al., 2009; De Weerdt et al., 2013; Lemos et al., 2011).

Before the relationship between reality and process mining was analysed using two case studies, the necessary criteria were researched. A few criteria proved to be important in literature, namely: the twelve logging guidelines of Van der Aalst (2014), the five challenges of process mining as described by Dumas et al. (2012) and the balance between overfitting and underfitting a process (Van der Aalst, 2016).

After the relevant criteria were known, the analysis of the two case studies started. The result was that the relationship between reality and process mining turned out to be case-dependent.

The analysed model from the Dept. IT case study matched the results from the three interviews with key users. All interviewees agreed that the discovered process model reflected how they execute their daily tasks. Some workarounds were made, but these conformed to the analysed process model.

The analysed model from the Insurance NL case study did not match the results from the interviews. Although the two key users agreed that the analysed model reflected the process in broad terms, there were some important differences. Activities could be highly repetitive (e.g. A-B-A-B-A-B-A-B-A-C, instead of A-B-C) and the same activity could be used for different tasks. These facts proved to be major issues for applying process mining. In conclusion, process mining can reflect reality (1) if each activity in the process being analysed has a single meaning and is used for only one particular task, (2) if no two activities have the same meaning (i.e. synonyms) and (3) if all the criteria mentioned in this research are followed and considered. In addition, depending on how the process is implemented, the employees need to interact genuinely with the IT system, which, as demonstrated in this research, cannot always be assumed.


If these requirements are not met, process mining can still be used on a conceptual level, but it will not reliably conform to reality. This occurred within the Insurance NL case study: the analysis provided conceptual insights, but the analysed process proved to be wrong on a more detailed level.

10. Limitations

It is important to note that this research has some limitations. The first limitation concerns generalisation. Although this is a multi-case study, there are many different organisations with different systems and procedures, which might show a different outcome.

Another limitation is that experience with the practical application of process mining was only gained during this research. However, the methods and research approaches that were used were based on existing literature.

Although five key users have been interviewed, more interviews could have given better results. For these reasons, future research on the reality of process mining could extend the outcome of this research.


References

Arpasat, P., Porouhan, P., & Premchaiswadi, W. (2015). Improvement of call center customer service in a Thai bank using disco fuzzy mining algorithm. International Conference on ICT and Knowledge Engineering, 2015–Decem, 90–96. https://doi.org/10.1109/ICTKE.2015.7368477

Bernard, G., Boillat, T., Legner, C., & Andritsos, P. (2016). When Sales Meet Process Mining: A Scientific Approach to Sales Process and Performance Management, (December).

Bozkaya, M., Gabriels, J., & Werf, J. M. Van Der. (2009). Process Diagnostics: A Method Based on Process Mining. International Conference on Information, Process, and Knowledge Management, (1), 22–27. https://doi.org/10.1109/eKNOW.2009.29

Brighman, M., & Introna, L. D. (2007). Strategy as Hospitality, Bricolage and Enframing: Lessons from the Identities and Trajectories of Information Technologies. In Information Management: Setting the Scene (pp. 159–172). Elsevier Ltd.

Bryman, A. (2015). Social Research Methods (4th ed.). New York: Oxford University Press.

Burattin, A. (2015). Process Mining Techniques in Business Environments. Springer International Publishing. https://doi.org/10.1007/978-3-319-17482-2

De Weerdt, J., Schupp, A., Vanderloock, A., & Baesens, B. (2013). Process Mining for the multi-faceted analysis of business processes - A case study in a financial services organization. Computers in Industry, 64(1), 57–67. https://doi.org/10.1016/j.compind.2012.09.010

Dumas, M., La Rosa, M., Mendling, J., & Reijers, H. A. (2012). Fundamentals of business process management.

Fahland, D., & Aalst, W. M. P. Van Der. (2012). Repairing Process Models to Reflect Reality. International Conference on Business Process Management, 229–245.

Günther, A. R. & C. (2014). The Added Value of Process Mining. BPTrends, (1), 1–12. Retrieved from http://www.bptrends.com/bpt/wp-content/uploads/04-01-2014-ART-AddedValueOfProcessMining-Rozinat-2-1.pdf

Henderson, J. C., & Venkatraman, N. (1999). Strategic Alignment Leveraging Information Technology for Transforming Organizations. IBM Systems Journal, 32(1), 472–783.

Jans, M., Van Der Werf, J. M., Lybaert, N., & Vanhoof, K. (2011). A business process mining application for internal transaction fraud mitigation. Expert Systems with Applications, 38(10), 13351–13359. https://doi.org/10.1016/j.eswa.2011.04.159

Lemos, A. M., Sabino, C. C., Lima, R. M. F., & Oliveira, C. A. L. (2011). Using process mining in software development process management: A case study. IEEE International Conference on Systems, Man, and Cybernetics, 1181–1186. https://doi.org/10.1109/ICSMC.2011.6083858

Mans, R. S., Schonenberg, M. H., Song, M., Van Der Aalst, W. M. P., & Bakker, P. J. M. (2009). Application of Process Mining in Healthcare – A Case Study in a Dutch Hospital. Proceedings of BIOSTEC 2008, 25, 425–438. https://doi.org/10.1007/978-3-540-92219-3_32

Maruster, L., Weijters, A. J. M. M., van der Aalst, W. M. P., & van den Bosch, A. (2002). Process mining: Discovering direct successors in process logs. International Conference on Discovery Science, 364–373. https://doi.org/10.1007/3-540-36182-0_37


work-arounds. Software and Systems Modeling, 15(2), 309–323. https://doi.org/10.1007/s10270-014-0420-6

Saldana, J. (2012). An Introduction to Codes and Coding. The Coding Manual for Qualitative Researchers, (2006), 1–8. https://doi.org/10.1519/JSC.0b013e3181ddfd0a

Saunders, M., Lewis, P., & Thornhill, A. (2012). Research Methods for Business Students (6th ed.). Pearson.

Saylam, R., & Sahingoz, O. K. (2013). Process Mining in BPM: Concepts and challenges. Electronics, Computer and Computation (ICECCO), International Conference on Digital Object Identifier, 131–134. https://doi.org/10.1109/ICECCO.2013.6718246

Sharp, A., & Mcdermoot, P. (2009). Workflow Modeling Tools for Process Improvement and Applications Development. Proceedings of the National Academy of Sciences of the United States of America (Second Edi, Vol. 104). Boston | London. https://doi.org/10.1073/pnas.0703993104

Song, M., & Van Der Aalst, W. M. P. (2007). Supporting process mining by showing events at a glance. In Proceedings of 17th Annual Workshop on Information Technologies and Systems, 139–147.

Suriadi, S., Andrews, R., Ter Hofstede, A., & Wynn, M. T. (2017). Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs. Information Systems, 64(October 2015), 132–150. https://doi.org/10.1016/j.is.2016.07.011

Ter Hofstede, A., Van der Aalst, W., & Weske, M. (2003). Fuzzy Mining – Adaptive Process Simplification Based on Multi-perspective Metrics, 2678(Bpm 2007), 1019-1019–1019. https://doi.org/10.1007/3-540-44895-0

Tiwari, A., Turner, C. J., & Majeed, B. (2008). A review of business process mining: state-of-the-art and future trends. Business Process Management Journal, 14(1), 5–22. https://doi.org/10.1108/14637150810849373

Van der Aalst, W. M. P. (2011). Process Mining. Springer. https://doi.org/10.1007/978-3-642-19345-3_1

Van der Aalst, W. M. P. (2012). Process Mining: Overview and Opportunities. ACM Transactions on Management Information Systems (TMIS), 3(2), 1–16. https://doi.org/10.1007/978-3-642-19345-3

Van der Aalst, W. M. P. (2014). Extracting Event Data from Databases to Unleash Process Mining. BPM - Driving Innovation in a Digital World SE - 8, 105–128. https://doi.org/10.1007/978-3-319-14430-6_8

Van der Aalst, W. M. P. (2016). Process Mining. Process Mining (Vol. 5). https://doi.org/10.1007/978-3-642-19345-3

Van Der Aalst, W. M. P., Adriansyah, A., De Medeiros, A. K. A., Arcieri, F., Baier, T., Blickle, T., … Wynn, M. (2012). Process mining manifesto. International Conference on Business Process Management, 99 LNBIP(Springer Berlin Heidelberg), 169–194. https://doi.org/10.1007/978-3-642-28108-2_19


Van der Aalst, W. M. P., Van Dongen, B. F., Herbst, J., Maruster, L., Schimm, G., & Weijters, A. J. M. M. (2003). Workflow mining: A survey of issues and approaches. Data and Knowledge Engineering, 47(2), 237–267. https://doi.org/10.1016/S0169-023X(03)00066-1

Van Der Aalst, W. M. P., Weijters, T., & Maruster, L. (2004). Workflow mining: Discovering process models from event logs. IEEE Transactions on Knowledge and Data Engineering, 16(9), 1128–1142. https://doi.org/10.1109/TKDE.2004.47


Appendices

Appendix A: Data Used for Insurance NL

Column Name Type

ROW_ID Case ID (system)

INSCLAIM_NUM Case ID (functional)

X_SUBTYPE_CD Activity

CREATED Timestamp

LAST_UPD Timestamp

OWNER_PER_ID Resource

STATUS_CD Other

EVT_STAT_CD Other

X_TYPE Other

X_SUBTYPE Other

Appendix B: Data Used for Dept. IT

Column Name Type

Wijzigingsnummer Case ID (system)

Activiteitnummer Activity (number)

Korte omschrijving Activity

Startdatum Timestamp

Datum afronding Timestamp

Behandelaarsgroep Resource

Status Other

Categorie Other
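The column roles listed in Appendices A and B can be read as a mapping from source columns to the three fields a process mining tool minimally needs: case ID, activity and timestamp. The sketch below uses the Dept. IT column names from the table above. The sample rows, the change number `W001`, the timestamps and the time format are hypothetical, and `build_traces` is an illustrative helper rather than the import routine of any particular tool.

```python
from collections import defaultdict
from datetime import datetime

# Role-to-column mapping taken from the Dept. IT table above.
ROLES = {
    "case_id": "Wijzigingsnummer",
    "activity": "Korte omschrijving",
    "timestamp": "Startdatum",
}


def build_traces(rows, roles=ROLES, time_format="%Y-%m-%d %H:%M"):
    """Group events per case and order each case's activities by timestamp."""
    events = defaultdict(list)
    for row in rows:
        moment = datetime.strptime(row[roles["timestamp"]], time_format)
        events[row[roles["case_id"]]].append((moment, row[roles["activity"]]))
    return {
        case: [activity for _, activity in sorted(evts, key=lambda e: e[0])]
        for case, evts in events.items()
    }


# Hypothetical rows, deliberately out of chronological order.
rows = [
    {"Wijzigingsnummer": "W001",
     "Korte omschrijving": "Afsluiten wijziging",
     "Startdatum": "2017-03-02 16:00"},
    {"Wijzigingsnummer": "W001",
     "Korte omschrijving": "ERP Registreren RFC en afstemmen Proceskring",
     "Startdatum": "2017-03-01 09:00"},
]

traces = build_traces(rows)
print(traces["W001"])
```

Because the ordering depends entirely on the timestamp column, manually registered times that are entered incorrectly would change the order of activities within a case.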


Appendix C: Analysed Process Absence Through Illness

The process model below shows the analysed Absence Through Illness process for Insurance NL. This model shows the frequencies. The thicker the arc, the more often this step takes place. The darker the activity, the more often it occurs. In total, there are 4875 cases of illness, but because some activities are executed multiple times within a case, it is possible that an activity occurs more often than 4875 times. The activity names have


Appendix D: Analysed Change Management Process

The process model below shows the Change Management Process for Dept. IT. This model shows the frequencies. The thicker the arc, the more often this step takes place. The darker the activity, the more often it occurs. In total, there are 92 changes.


Appendix E: Codebook

Category | Code | Definition

Process Description | Case Types | Types of cases within the process

Process Description | Process Trigger | The start of the process

Process Description | Most Intensive Activity | The activity that demands most time

Process Description | Process Perception | How the process is perceived by the interviewee

Process Execution | Deviation from Procedures | Different process execution than planned

Process Execution | Good Intention of Acting | Goodwill of executing the process well

Process Execution | Process Shortcuts | To skip necessary activities in the process

Process Improvement | Improvement Opportunity | An opportunity to improve the current process

Process Improvement | Process Shortcuts | To skip activities in the process for efficiency

Process Improvement | Lack of Knowledge | Employee is not aware of relevant information

Influence on Reality | Wrong Administration | Activity is registered incorrectly in the log

Influence on Reality | Wrong Time Registration | Incorrect registration of activity time

Influence on Reality | Mining Issue | A phenomenon that creates

Sample Quote (in Dutch): No quotes available in Public Version


Appendix F: Transcripts

Not available in Public Version

Transcripts Dept. IT:

- Release Management
- Finance Department
- Development Department

Transcripts Insurance NL:

- Care Department
- Intervention Department