Towards an integrated practice unit for cancer care: How can process mining help?


Master Thesis

Medical Informatics


March 2, 2017

Student

drs. Kim Elise Steenis
Master of Medical Informatics, University of Amsterdam
Student number: 5982480
E-mail: ke.steenis@gmail.com

Mentor

dr. ing. M. van de Vrugt
Consultant capacity management, Leiden University Medical Centre

R. Piening, MSc
Implementation and conversion specialist, Furore Informatica

Tutor

M. C. Schut PhD, Assistant professor
Department of Medical Informatics
Academic Medical Centre, University of Amsterdam

SRP duration

April 2016 – March 2017

SRP location

Furore Informatica
Bos en Lommerplein 280, 1055 RW Amsterdam

5.6. Conclusions
6. The cancer practice unit case study
6.1. Goals and research questions
6.2. Data sources
6.3. Preprocessing of data
6.4. Process mining methods and results
6.5. Conformance checking
6.6. Conclusion
7. Discussion
8. Literature list
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G

1. Summary

Introduction: Because of challenges such as the rising demand for high-quality healthcare and the aging population, Dutch hospitals are searching for the best strategies to optimize their care processes without losing quality of care. We set up two case studies to examine the oncological processes in a university medical center (UMC) by applying process mining techniques. The goal was to support the policy makers in creating an integrated oncology practice unit (IPU).

Methods: The first case study was focused on the radiotherapy department. We used Disco process mining software to discover the care pathways that were most frequently carried out and to find possible bottlenecks. After preprocessing the dataset, we used the Fuzzy miner to discover a process model. The second case study was focused on the oncology practice unit. We used the ProM framework to discover the most frequent care pathway for different cancer types. After preprocessing the data, we used the Inductive miner and several filters to discover process models. Finally, we applied conformance checking on the discovered models.

Results: We identified process models in both case studies. In the radiotherapy study, we found a representative process model after an iterative process that involved domain experts. Disco provides a very clear representation of the discovered model, including the activities and sojourn times. In the oncology practice unit case study, we applied many filters and discovered a number of models that could not be used because of high precision (i.e. overfitting) and because of inconsistencies compared to the corresponding dataset. The inconsistencies were also found in conformance checking.

Discussion: Both case studies provided insight into the usefulness of process mining techniques. The model that was found in the radiotherapy study was used by policy makers to detect a number of improvements that could be made to the process. Insights, knowledge, clear research questions and a clear focus are needed to apply process mining techniques.


2. Dutch summary (Nederlandse samenvatting)

Introduction: Because of challenges such as the rising demand for high-quality healthcare and the aging population, Dutch hospitals are looking for the best way to optimize their care processes without compromising the care that is delivered. In two case studies we examined the oncological care processes in a university hospital using process mining techniques. The goal was to support the policy makers in setting up an integrated oncology practice unit (IPU).

Methods: In the first case study, focused on the radiotherapy department, we used Disco process mining software to identify the most frequently followed care pathway in the department and to find possible bottlenecks in it. After preprocessing the data, the Fuzzy miner was applied to derive process models. The second case study was focused on the oncology-related processes. Here the ProM framework was used with the aim of finding the most frequently followed care pathway per tumor type. After preprocessing the data, the Inductive miner and several filters, among others, were applied to derive process models. Finally, we also applied conformance checking.

Results: Process models were identified in both case studies. In the radiotherapy study, an iterative process in collaboration with domain experts resulted in a process model that correctly represents the most frequently followed care pathway. The model generated by Disco clearly shows the activities in the care pathway and the corresponding throughput times. In the oncology case study, the repeated application of filters resulted in a number of models that could not be used because of a too high precision ("overfitting") and because of an inconsistent representation compared to the corresponding data, which also emerged in the conformance checking.

Discussion: Both case studies provide insight into the usefulness of process mining techniques. The model from the radiotherapy study was used by policy makers to map out process improvements. Process mining requires various insights, knowledge, clear research questions and a good focus in order not to stray from the goal.


3. Introduction

Today’s healthcare system puts its organizations under growing pressure and presents them with new challenges. Healthcare costs are rising. In order to control and slow down this trend, healthcare organizations place a strong emphasis on medical and organizational efficiency and effectiveness. They aim to provide the highest quality services at the lowest costs. Many solutions have been tried to slow down the rising costs (e.g. reducing errors, implementing electronic medical records and attacking fraud), but none has had a significant impact. In the Netherlands, the total healthcare expenditure has increased from 8% of the gross domestic product (GDP) in 1972 to over 13% in 2010. It is estimated that the expenditure will increase further to 19-31% of the GDP in 2040 (Van der Horst, Van Erp, & De Jong, 2011).

Porter and Lee (2013) therefore suggest a new overall strategy to maximize the value for patients in order to eventually reduce the costs. This implies changing the way in which clinicians are organized to deliver care, which in turn requires a shift from today’s fragmented organization, focused on specialty and separated services, to an integrated practice unit (IPU) that is efficiently organized around the patient’s medical condition. In an IPU, a dedicated team consisting of both clinical and nonclinical personnel provides a full care cycle for the patient at one location (Porter & Lee, 2013).

Our study was conducted in a Dutch hospital that is currently planning to set up such an IPU for cancer care to provide better care for patients in the region. The absolute number of new cancer patients in the Netherlands is projected to increase from 45,110 men and 41,690 women in 2007 to approximately 66,000 men and 57,000 women in 2020. This means an average increase of 3.5% per year for men and 2.8% per year for women (Meulepas & Kiemeney, 2011). Because this group is growing fast, there is a serious need for more efficient, high-quality care.

The current cancer care in this hospital is provided through a fragmented system in which departments mainly work independently from each other and often do not know exactly what is happening at other departments. By locating the total care cycle in one place, the quality and efficiency are expected to improve and the value for the patients should increase.

Redesigning oncology care from the current structure into an IPU is a complicated, large-scale task that will cost a lot of time and money. First of all, the policy makers of this project need to have an image of the current services and infrastructures. There are multiple types of analyses that can provide an overview, but most of them are costly and time-consuming because they involve multiple (expensive) staff members and resources. This is why the policy makers decided to explore the possibilities of process mining. Process mining techniques can be used to analyze large amounts of data and provide an accurate, quantitative picture of what has been going on in the organization. These techniques can help to develop a better understanding of processes and infrastructures (Kurniati, Johnson, Hogg, & Hall, 2016).

In this study, we explored the possibilities of process mining techniques. Process mining is a relatively young discipline that forms the missing link between computational intelligence and data mining on the one hand and process modeling and analysis on the other hand. The idea of process mining is to discover, monitor and improve real processes (i.e. not assumed processes) by extracting knowledge from datasets readily available in today’s systems (Van der Aalst et al., 2011).

In the first case study we applied process mining techniques to data from the radiotherapy department. In the second case study we applied these techniques to data from multiple oncology departments. The results of these analyses were provided to policy makers to support them in creating an IPU, and we drew meaningful insights about the use of process mining techniques from both case studies.

3.1. Objectives

In this study we investigated the oncology care processes in a Dutch hospital using process mining techniques. First, we aimed to provide the policy makers with an accurate representation of the realized patient processes as obtained from the hospital data. Second, we aimed to provide readers with meaningful insights into the use of process mining techniques.


4. Healthcare processes and process mining

In this section we summarize the characteristics that define healthcare processes, explain the concept of process mining, describe the software that we used for process mining and explain the strategy that we used to evaluate the process models resulting from the analysis (i.e. internal and external validity).

4.1. Healthcare processes

Healthcare processes can be classified as medical treatment processes or generic organizational processes. Medical treatment processes, also known as clinical processes, are directly linked to the patient. These processes are executed according to a diagnostic-therapeutic cycle, comprising observation, reasoning and action. This cycle depends on medical knowledge to deal with case-specific decisions that are made by interpreting patient-specific information. Generic organizational (or administrative) processes, in contrast, are process patterns that support the medical treatment processes in general. They aim to coordinate medical treatment among different people and organizational units, for example by scheduling patient appointments (Rebuge & Ferreira, 2012).

Healthcare processes are recognized to have the following characteristics: 1) they are highly dynamic, since process changes can occur due to a variety of reasons, such as technological developments, the discovery of new drugs, or the introduction of new procedures; 2) they are highly complex, due to many factors such as a complex medical decision process or the unpredictability of patients and treatments; 3) healthcare processes are increasingly multi-disciplinary, because healthcare organizations have an increased level of specialized departments and medical disciplines, and 4) healthcare processes are ad-hoc, meaning that the participants of healthcare procedures have the expertise and autonomy to decide their own working procedures, which can cause a high degree of variability (Rebuge & Ferreira, 2012).

4.2. Process mining

Hospitals nowadays rely on hospital information systems (HIS) that collect increasingly more data about the care processes, such as the details of the tasks that are performed, the people who are involved, the specific information from the equipment that is involved, and information about the time when the task is performed. These data reflect what happened in “the real world” and can be a valuable source of insight for understanding and improving the business. This information is often analyzed using data mining techniques in order to support business decisions, but these techniques do not inform about processes. At the same time, a lot of money and effort is spent on process modeling (i.e. creating workflow representations). In traditional process modeling, process models result from a collaborative effort between key stakeholders and process analysts, which combines inside knowledge of the organization with the expertise to represent that knowledge in process models. However, this approach has several problems. The first is that this technique is expensive and time-consuming because it implies discussions with workers, extensive document analysis, careful observation of participants etc. The second is that the result represents the process as described by people, not the actual process. Thirdly, since this form of modeling is done manually by observing processes and describing them, the models quickly become outdated and often end up somewhere on dead piles of paper that have no value. Also, when processes become more and more complex, this manual form of description becomes more and more difficult (Rebuge & Ferreira, 2012).

Process mining can be seen as a missing link that combines the strengths of data mining and process modeling. By automatically creating process models based on existing IT log data, it can create models that are connected to the business and can be updated at any point in time. Process mining takes on the challenge to process large amounts of data that simply cannot be evaluated by hand anymore. Unlike data mining, process mining focuses on the process perspective: it includes the temporal aspect and looks at a single process execution as a sequence of activities that have been performed. While data mining techniques extract abstract patterns such as rules or decision trees, process mining creates complete process models and then uses them to indicate where the bottlenecks are.

We have therefore used process mining techniques to investigate the described healthcare processes in two case studies. For process mining it is essential to extract event logs from data sources such as a HIS. In an event log, each event refers to a case (i.e. a process instance), an activity (i.e. a well-defined step in the process) and a point in time. Such a log can be seen as a collection of cases and a case can be seen as a trace or sequence of events. The three main types of process mining are 1) to discover new models from event logs, 2) to check the conformance of a model by checking whether the modeled behavior matches the observed behavior, or 3) to extend an existing model by projecting information extracted from logs onto some initial model (Figure 1). The three types can be executed independently and should not influence one another (Van der Aalst, et al., 2011).
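The structure of an event log described above can be illustrated with a minimal sketch. The activity names, case IDs and timestamps below are hypothetical and are not taken from the hospital data; the sketch only shows how events that each refer to a case, an activity and a point in time can be grouped into one trace per case.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: one (case ID, activity, timestamp) tuple per event.
events = [
    ("patient_1", "Consult", "2016-01-04 09:00"),
    ("patient_2", "Consult", "2016-01-04 10:00"),
    ("patient_1", "CT scan", "2016-01-05 11:30"),
    ("patient_1", "Radiation", "2016-01-12 08:15"),
    ("patient_2", "Radiation", "2016-01-11 14:00"),
]


def build_traces(event_rows):
    """Group events by case and order them by time, giving one trace per case."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in event_rows:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        by_case[case_id].append((moment, activity))
    return {
        case_id: [activity for _, activity in sorted(case_events)]
        for case_id, case_events in by_case.items()
    }


traces = build_traces(events)
```

Process discovery algorithms take such traces as input, whereas conformance checking compares them with the behavior allowed by a model.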

4.3. Software

There are a number of software systems that enable process mining techniques and algorithms to be applied to an event log in order to generate models for analysis. In healthcare, the most commonly used tool is ProM¹, which is an open source tool with more than 200 extensible plug-ins for process mining. In their literature review, Rojas et al. (2016) found that ProM was used in 42% of the examined case studies. They also found that the Disco software² was used in 10% of the examined cases, which makes it the second most commonly used tool. Since we were exploring the possibilities of process mining techniques for the first time, we used Disco and ProM, the two tools with the most reference work.

4.4. Process model evaluation

The challenge in process discovery is to find the ‘best’ process model given the recorded behavior in the log. This is difficult for two reasons: (1) there are no negative examples (i.e. a log shows what has happened but does not show what could not happen), and (2) the log typically contains only a fraction of all possible behaviors and the search space has a complex structure due to concurrency, loops, and choices. In order to find out whether a model is representative, one can test its validity internally and externally. To test the internal validity, one typically uses the four competing quality criteria that are depicted in Figure 2 (Van der Aalst, Discovering concurrency, 2017): fitness, precision, generalization and simplicity.

¹ http://www.promtools.org
² http://fluxicon.com/disco
³ Although we are aware that the details of the map are unreadable because of the small font size, we included this map because it gives an overall idea of what the process looks like.

4.5. Internal validity

Firstly, any definition of a workflow process should satisfy the soundness property, which consists of the following requirements: (1) whatever happens, every case can be completed, (2) after a case has been completed, no references are left behind to that case, and (3) every part of the process model is necessary for some case. According to the literature, the mining algorithms that guarantee soundness are the Inductive Miner (IM), the Inductive Miner – infrequent (IMi) and the Evolutionary Tree Miner (ETM) algorithm. Any resulting flower-model (FM) also guarantees soundness (Verbeek, 2004).

After discovering a process model, four quality dimensions (see Figure 2) are often used to put the results in perspective and to discuss them:

Fitness. This quantifies the extent to which the discovered model can accurately reproduce the cases recorded in the log. Perfect fitness implies that a model can reproduce all the traces of a log perfectly, but this also results in a precise or overfitting model that does not allow for much additional behavior that is not described in the log.

Precision. This quantifies the fraction of behavior allowed by the model that is not seen in the event log. A very simple process model that can reproduce the event log is of little value if it does so only because it allows any arbitrary finite sequence of events. Such models are generally referred to as flower-models (FM) (Rozinat & Van der Aalst, 2008). A relatively precise model allows only minimally more behavior than seen in the log.

Simplicity. This quantifies the complexity of a process model. Often, process discovery algorithms result in spaghetti-like process models which are very hard to read and which do not allow for future behavior that was not initially present in the log.

Generalization. This assesses the extent to which the resulting model will be able to reproduce future behavior of the process. This criterion can also be seen as a measure for the confidence on the precision of the model (Buijs, Van Dongen, & Van der Aalst, 2012).
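The trade-off between fitness and precision can be made concrete with a deliberately simplified sketch. It assumes that a model can be written down as a finite set of allowed traces, which is not how ProM computes these metrics (those are based on replaying the log on the model); the function names and example traces are hypothetical. Precision is expressed here as the share of allowed behavior that was observed, i.e. the complement of the unseen fraction.

```python
def trace_fitness(log_traces, model_traces):
    """Fraction of log traces that the model can reproduce exactly."""
    allowed = set(model_traces)
    return sum(1 for trace in log_traces if trace in allowed) / len(log_traces)


def trace_precision(log_traces, model_traces):
    """Fraction of the traces allowed by the model that were actually observed."""
    observed = set(log_traces)
    allowed = set(model_traces)
    return len(allowed & observed) / len(allowed)


# Hypothetical log with four cases and three distinct trace variants.
log = [("A", "B", "C"), ("A", "B", "C"), ("A", "C", "B"), ("A", "D")]

# A strict model that only allows the most frequent variant.
strict_model = {("A", "B", "C")}

# A loose model that allows every observed variant plus three unseen ones.
loose_model = {
    ("A", "B", "C"), ("A", "C", "B"), ("A", "D"),
    ("A", "C", "D"), ("A", "B", "D"), ("A", "D", "C"),
}

strict_scores = (trace_fitness(log, strict_model), trace_precision(log, strict_model))
loose_scores = (trace_fitness(log, loose_model), trace_precision(log, loose_model))
```

In this toy setting the strict model scores 0.5 on fitness and 1.0 on precision, while the loose model scores 1.0 on fitness and 0.5 on precision, which illustrates why the criteria are called competing.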

4.6. External validity

The external validity of a predictive model can be tested by adequately splitting the event log into a log that is used for model discovery and a separate log that is used to test the performance of the model. The underlying goal is to ensure that both the discovery set and the performance set separately cover the whole process and that they are not too dissimilar. According to Rozinat and Van der Aalst (2008), the external validity can be examined by calculating two metrics: fitness (i.e. the extent to which the log traces can be associated with valid execution paths specified by the process model) and appropriateness (i.e. the degree to which the process model describes the observed behavior, combined with the degree of clarity with which it is represented). Both metrics are calculated by the Conformance Checker tool in the ProM framework, which replays an event log within a Petri net model to quantify the conformance of the model. The result is a representation of the process model in which the problematic areas are shown through several visualization options. The Conformance Checker can be used to replay several scenarios, and the fitness measurement indicates whether a scenario corresponds to a possible execution sequence for the studied process.
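Splitting an event log for external validation could look roughly like the sketch below. It assumes a log stored as a mapping from case ID to trace and splits at random by case, so that no case is divided over both logs; the function names, the 30% test share and the coverage check are illustrative assumptions, not the procedure prescribed by Rozinat and Van der Aalst (2008).

```python
import random


def split_log_by_case(traces, test_fraction=0.3, seed=42):
    """Split a {case_id: trace} log into a discovery log and a test log.

    Cases are kept whole, so every trace ends up in exactly one of the two logs.
    """
    case_ids = sorted(traces)
    random.Random(seed).shuffle(case_ids)
    n_test = max(1, round(len(case_ids) * test_fraction))
    test_ids = set(case_ids[:n_test])
    discovery = {c: t for c, t in traces.items() if c not in test_ids}
    test = {c: t for c, t in traces.items() if c in test_ids}
    return discovery, test


def activity_coverage(discovery, test):
    """Fraction of activities in the test log that also occur in the discovery log."""
    seen = {activity for trace in discovery.values() for activity in trace}
    wanted = {activity for trace in test.values() for activity in trace}
    return len(wanted & seen) / len(wanted) if wanted else 1.0
```

A low coverage value would suggest that the two logs are too dissimilar to be used for discovery and testing of the same process.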

A process model can also be externally validated by domain experts. They evaluate the model individually and their responses are used to update it. This is a recurrent process in which the model is adjusted a number of times based on the responses from the experts. The result is a correct and complete model of the process as is experienced daily by the experts.


5. The radiotherapy case study

In this section we will describe the first case study at the radiotherapy department of a Dutch hospital. The structure of this section is as follows: (1) goals and research questions, (2) data sources, (3) preprocessing of data, (4) process mining methods, (5) the results and (6) our conclusions.

5.1. Goals and research questions

In order to improve their services for the patients, the policy makers of the radiotherapy department in this case study specifically aimed to increase the effectiveness and the efficiency of their main treatment process. To be able to do this they needed a realistic representation of the actual process that was executed in the past years. We aimed to provide such a representation by using process mining techniques. The first goal in this case study was to provide the policy makers with a realistic representation of the main process (i.e. a process model) in order to support them with their decisions. Related to the type of diagnosis, there are differences in the number of appointments for each patient as some are undergoing longer radiation treatment than others. However, every patient has to follow a specific sequence of activities with each activity carried out at least once, regardless of the type of diagnosis. It was therefore not necessary to discriminate on diagnosis in order to find the main process.

The second goal was to gain information about the average throughput time of the main process, in other words the average time that it takes for the patients to complete their traces. This information can be used by the policy makers to test whether the national quality standards are met. These quality standards imply that acute patients should be treated within one day, 80% of the sub-acute patients should be treated within seven days with a maximum waiting time of 10 days, and 80% of the remaining patients should be treated within 21 days with a maximum waiting time of 28 days (Nederlandse Vereniging voor Radiotherapie en Oncologie, 2000). It is therefore important to notice whether any activities take longer than necessary. The research questions for this case study are therefore as follows:

• Which care pathway (i.e. sequence of activities) is most frequently carried out in the radiotherapy department and what is the throughput time?

• Can we identify any bottlenecks in the processes of the radiotherapy department?
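The waiting-time norms mentioned above can be checked mechanically once the waiting time per patient is known. The following sketch is an illustration under assumptions: the function name, the group labels and the waiting times are hypothetical, and only the thresholds (7 and 10 days, 21 and 28 days, 80%) come from the norms quoted in the text.

```python
def meets_norm(waiting_days, target_days, max_days, required_share=0.8):
    """Check the waiting times of one patient group against a norm.

    The norm holds when at least `required_share` of the patients are treated
    within `target_days` and no patient waits longer than `max_days`.
    """
    within_target = sum(1 for d in waiting_days if d <= target_days) / len(waiting_days)
    return within_target >= required_share and max(waiting_days) <= max_days


# (target_days, max_days) per patient group, as quoted in the text.
NORMS = {"sub-acute": (7, 10), "regular": (21, 28)}

# Hypothetical waiting times in days.
subacute_waits = [3, 5, 6, 7, 7, 4, 2, 6, 9, 10]  # 8 of 10 within 7 days
regular_waits = [10, 14, 20, 21, 25, 30]          # one patient waits 30 days

subacute_ok = meets_norm(subacute_waits, *NORMS["sub-acute"])
regular_ok = meets_norm(regular_waits, *NORMS["regular"])
```

The norm for acute patients (treatment within one day) could be checked with the same function by setting both thresholds to 1 and `required_share` to 1.0.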

5.2. Data sources

The business intelligence unit (BIU) provided a dataset containing all of the appointments at the radiotherapy department between January 2015 and June 2016. For each appointment, or event, we asked for the three attributes that are needed for process mining: a pseudo ID of the patient (to keep the data anonymous), the type of event and the start time. In addition, in order to examine different patient categories, we asked for details about the patients and the events, such as the gender and age of the patient, the age group, the diagnosis, the literal code for each event, the name of the specialist, the planned duration of the event, the logged duration of the event, the number of times that the event was changed or moved, and the diagnosis-treatment-combination (DBC, used for billing purposes).

The provided dataset was extracted from the data warehouse of the hospital. The file contained detailed information about the diagnostic and treatment procedures of 5,509 unique patients at the radiotherapy department between January 1st 2015 and June 1st 2016. In total there were 123,109 events that could be divided into two types: the ones that involved the patient (e.g. a consult or a radiation appointment) and events that were carried out without the patient (e.g. patient discussions between specialists, or the preparation of the devices). An example of this dataset is represented in Dutch in Figure 3.

5.3. Preprocessing of data

In order to create a clean event log to use for mining, we needed to preprocess the dataset. The dataset contained all events from January 2015 until June 2016, but the first five months were discarded because in June 2015 new software was implemented at the radiotherapy department to plan the appointments. In the remaining log, we first listed all of the denotations and we translated them into single denotations with the help of two domain experts (i.e. medical planners) of the radiotherapy department. Because the dataset contained the names of the medical specialists, we anonymized the set further by assigning them numbers (e.g. Doctor 1). We also created a timestamp for the start of each event by combining two separate columns that contained the start date and the start time. Then, we added the planned duration of each event to the start timestamp in order to create an end timestamp. Finally, we saved the updated dataset in CSV (flat text file) format for use in the Disco software. Appendix A contains an overview of the contents of the dataset and the changes that were made.
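The timestamp step described above can be sketched as follows. The column names (`start_date`, `start_time`, `planned_minutes`) and the date format are assumptions made for this illustration; the actual column names and formats of the hospital dataset are listed in Appendix A and may differ.

```python
from datetime import datetime, timedelta


def add_timestamps(row):
    """Combine date and time columns into a start timestamp and derive an end timestamp.

    The end timestamp is the start timestamp plus the planned duration in minutes.
    """
    start = datetime.strptime(
        f"{row['start_date']} {row['start_time']}", "%d-%m-%Y %H:%M"
    )
    end = start + timedelta(minutes=int(row["planned_minutes"]))
    return {
        **row,
        "start_timestamp": start.isoformat(sep=" "),
        "end_timestamp": end.isoformat(sep=" "),
    }


# Hypothetical event row as it might be read from a CSV file.
example = add_timestamps({
    "case_id": "P001",
    "activity": "CT scan radiotherapie",
    "start_date": "22-06-2015",
    "start_time": "09:40",
    "planned_minutes": "20",
})
```

Note that an end timestamp derived in this way reflects the planned duration of an event, not the duration that was actually logged.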

5.4. Process mining methods

After preprocessing the file and converting it to CSV, we imported it into Disco. We mapped the data by indicating which columns should be imported as case ID, activity and timestamp. Disco applies a Fuzzy miner algorithm and creates an understandable process map that can be directly observed (Mans, Schonenberg, Song, Van der Aalst, & Bakker, 2008). After creating the first process map, it was examined by domain experts in order to see whether it covered all the activities in the right order. This strategy was repeated several times until the final process map was created. Based on the advice of these experts, we applied the following four filters to obtain the final map, which is presented in Figure 4:

1. Include all activities from June 22nd 2015 and later. The department had implemented new software for the planning and all the professionals were accustomed to working with it by this date. This is why we used it as a starting point for the current process;

2. Include only complete traces. Only the cases that executed both the specified start activity and one of the possible end activities of the process were considered;

3. Include only traces that start with the activity “Eerste consult nieuwe/oude patient” (i.e. first consultation new/old patient). These traces were included because they reflect the ideal process as closely as possible. Traces starting with other activities were therefore discarded;

4. Exclude all cases without a diagnosis. Cases without a diagnosis are mostly outliers, because they are often mainly treated at other departments. We found 258 cases without a diagnosis and we excluded them from analysis because they were not representative for the main process.
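The four filters above were applied through the Disco user interface. Purely as an illustration of their combined effect, they could be approximated in code as sketched below. The data layout, the end activity "Laatste bestraling" and the reading of a complete trace as one whose first and last events are the start activity and an allowed end activity are assumptions of this sketch, not details confirmed by the study.

```python
from datetime import datetime

START_ACTIVITY = "Eerste consult nieuwe/oude patient"
END_ACTIVITIES = {"Laatste bestraling"}  # hypothetical end activity
CUTOFF = datetime(2015, 6, 22)


def filter_log(cases):
    """Apply the four filters to a {case_id: case} mapping.

    Each case is a dict with a "diagnosis" (or None) and a list of
    (timestamp, activity) tuples under "events".
    """
    kept = {}
    for case_id, case in cases.items():
        # Filter 1: keep only events on or after the cutoff date.
        events = sorted(e for e in case["events"] if e[0] >= CUTOFF)
        if not events:
            continue
        # Filters 2 and 3: the trace must start with the start activity
        # and finish with one of the allowed end activities.
        if events[0][1] != START_ACTIVITY or events[-1][1] not in END_ACTIVITIES:
            continue
        # Filter 4: exclude cases without a diagnosis.
        if not case["diagnosis"]:
            continue
        kept[case_id] = [activity for _, activity in events]
    return kept
```

The order in which filters are applied matters in practice: removing events before the cutoff date first can turn a previously complete trace into an incomplete one.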

5.5. Results

The final process map created with Disco represents the process based on the filtered log, and it matched the perception of the domain experts (Figure 4a)³. The start of the process is illustrated by the triangle symbol at the top left and the end of the process by the stop symbol at the bottom right. Activities are represented by boxes and the process flow between activities is visualized by arrows. The dashed arrows indicate the activities that occurred at the very beginning or at the very end of the process. The colors of the activities and the thickness of the arrows correspond to the frequency of the activities and sequences. Finally, the numbers next to the arrows indicate the absolute frequency and the mean activity time (Rozinat, Disco User's Guide, 2015). Figure 4b is a zoomed-in image of Figure 4a showing the first part of the process. We can see that 1,754 cases in the log started with the activity “Eerste consult nieuwe/oude patient”. Another 71 cases came from a different activity, which results in a total of 1,821 cases that engaged in the specified start activity. After the first activity, the process splits into six alternative paths and 1,068 cases directly go to “CT scan radiotherapie”, which took 20 minutes on average (Figure 4b). It is important to notice that the software creates a model based on the most followed traces and that the number of activities and traces that are shown in the model can be altered. This explains why the numbers next to the activities and the traces may not be complete.
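The frequencies shown next to the arrows of a process map can be thought of as counts of how often one activity is directly followed by another. The sketch below shows one simple way to compute such counts; it is not the Fuzzy miner used by Disco, and the traces and the threshold of two are hypothetical. It also illustrates why hiding infrequent paths makes the visible numbers incomplete.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b over all traces."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


# Hypothetical traces of three cases.
traces = [
    ["Consult", "CT scan", "Radiation"],
    ["Consult", "CT scan", "Radiation"],
    ["Consult", "Radiation"],
]

arc_counts = directly_follows(traces)

# Show only arcs that occur at least twice, comparable to hiding infrequent paths.
visible_arcs = {arc: n for arc, n in arc_counts.items() if n >= 2}

# Cases leaving "Consult" along a visible arc: fewer than the three cases in the log.
visible_from_consult = sum(n for (a, _), n in visible_arcs.items() if a == "Consult")
```

With the infrequent arc hidden, only two of the three cases that start with "Consult" are accounted for in the visible part of the map.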


Because of technical software issues, we could not use the Disco software to determine the internal validity of the process map. However, the two domain experts from the radiotherapy department were continuously involved in the project and recurrently examined the process maps created with Disco. They have externally validated the process map in Figure 4a by reaching consensus about the coverage and order of the activities and the throughput times.

5.6. Conclusions

After recurrently applying process mining techniques and assessing the external validity of the results, we provided the policy makers with a complete visual representation of the main process, based on real data. This process model (Figure 4a) was then used on an operational level to compare the throughput times of the activities with the national norms and standards for treating radiotherapy patients. Based on this comparison, they were able to detect a number of bottlenecks and possible time savings that could optimize the process. For example, it seemed possible to shorten the waiting time between a first consult with the medical specialist and the first irradiation appointment. The policy makers are currently examining these options for improvement.

By using Disco (i.e. the Fuzzy Miner algorithm) we were able to create a clear and complete process model from the data. We used the dynamic view of Disco to replay the log on the model, which was immediately understandable for everyone involved in the project. This dynamic representation of the real data sparked the interest of the policy makers, the domain experts and the business intelligence unit about using process mining techniques for process analysis.


6. The cancer practice unit case study

This section describes the second case study in which we studied the log data from multiple departments in the cancer practice unit. The structure of this section is the same as in the first case study: goals and research questions, data sources, preprocessing of data, methods of process mining, the results and our conclusions.

6.1. Goals and research questions

Since the number of cancer patients in the Netherlands is growing fast and the costs of healthcare are rising, the policy makers of the Dutch hospital in this study aim to provide the patients with the highest quality services at the lowest costs by setting up an IPU for cancer care. In order to do this, they need to have an accurate description of the current ongoing processes, which we aimed to provide by using process mining techniques. In this case, we analyzed processes that are carried out across departments, so we needed to determine which characteristic defines a separate process. Since the goal of creating an IPU is to provide patients with a complete care cycle under one roof, we decided to look for processes based on cancer types, assuming that patients with the same type of cancer would on average follow the same sequence of treatment activities. Knowing that the processes could differ significantly per cancer type (e.g. the treatment of breast cancer is very different from the treatment of prostate cancer), we examined individual datasets for each cancer type. The following research question was stated:

• What are the most followed care pathways based on cancer types?

6.2. Data sources

We asked the BIU for a dataset containing all the appointments related to oncology between January 2013 and June 2015. For each activity, we asked for a pseudo ID of the patient, the type of activity and the start time so that we could use process mining techniques. Finally, we asked for additional details about the patients and the activities, consistent with the radiotherapy case study.

The BIU provided us with three separate datasets, one for each year. All three files contained the same level of detail about the diagnostic and treatment procedures of 23,522 unique patients, accounting for a total of 317,508 events. As in the radiotherapy study, the set contained both events that involved the patient and events that did not.


6.3. Preprocessing of data

First of all, we combined the datasets from 2013 and 2014 into one set for process discovery. The dataset from 2015 was processed separately to be used for conformance checking. Since these datasets together cover three years of appointments across departments, they contain a large number of distinct activities. Our goal was to consider the events at the level of cancer types, but some of the events lacked this information because they did not have a DBC description or a diagnosis description. In both datasets, we used Excel formulas to recover the missing information for as many events as possible. For example, to complete the events without a cancer type, we used the pseudoID of the patient in the “VLOOKUP” function to look for a cancer type noted in the other events of that patient and, if found, copied this value into the empty cell. To anonymize the data beyond the pseudoID of the patients, we replaced the name of each specialist with a number.
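The lookup step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Excel workbook that was actually used: the field names `pseudoID` and `Tumorwerkgroep` are assumed column names, and the three example events are invented.

```python
def fill_missing_cancer_type(events):
    """Return copies of the events with blank cancer types filled per patient."""
    # First pass: remember the first non-empty cancer type seen for each
    # patient (VLOOKUP likewise returns the first match it finds).
    known = {}
    for event in events:
        cancer_type = event.get("Tumorwerkgroep")
        if cancer_type and event["pseudoID"] not in known:
            known[event["pseudoID"]] = cancer_type

    # Second pass: fill the blanks where another event of the same patient
    # provides a value; otherwise the cell stays empty (None).
    filled = []
    for event in events:
        row = dict(event)  # copy, so the input is left untouched
        if not row.get("Tumorwerkgroep"):
            row["Tumorwerkgroep"] = known.get(row["pseudoID"])
        filled.append(row)
    return filled


# Invented example events (assumed field names).
events = [
    {"pseudoID": "P1", "activity": "Nieuwe patient", "Tumorwerkgroep": "Mamma"},
    {"pseudoID": "P1", "activity": "Scan", "Tumorwerkgroep": None},
    {"pseudoID": "P2", "activity": "Scan", "Tumorwerkgroep": None},
]
filled = fill_missing_cancer_type(events)
```

In this sketch, patient P1's second event inherits the cancer type of the first event, while the event of patient P2 stays empty because no other event of that patient carries a value.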

The next step was to simplify the datasets in order to zoom out to the higher level of care pathways across departments. With the help of a care administrator, we categorized all events into groups based on the corresponding department and the original event description. We then aggregated low-level activities by defining a representative for each group. For example, for the radiology department we defined “Scan” as the representative. A dataset that originally contains activities from the radiology department with descriptions such as “Ultrasound scan abdomen” or “X-ray upper left arm” would contain only “Scan” after mapping the low-level activities to this representative and removing any duplicates. After that, we again adjusted the date and time notations, since these fields were delivered in the same format as in the radiotherapy case study. Again, a timestamp for the end of each activity was created by adding the planned duration to the start timestamp. Finally, the updated datasets were saved in CSV format for process mining purposes. Appendix C contains a general overview of the characteristics of the datasets and the changes that were made.
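A minimal sketch of the aggregation and end-timestamp steps is given below. Several points are assumptions made for illustration only: the mapping table contains just the two examples mentioned above, a duplicate is taken to be an event with the same patient, representative and start time, and the planned duration is assumed to be given in minutes.

```python
from datetime import datetime, timedelta

# Illustrative mapping from low-level descriptions to a representative.
REPRESENTATIVE = {
    "Ultrasound scan abdomen": "Scan",
    "X-ray upper left arm": "Scan",
}


def aggregate(events):
    """Map activities to representatives, drop duplicates, add end times.

    Each input event is (case_id, activity, start, planned_minutes), with
    start formatted as "YYYY-MM-DD HH:MM" (an assumed format).
    """
    seen = set()
    result = []
    for case_id, activity, start, planned_minutes in events:
        label = REPRESENTATIVE.get(activity, activity)
        start_ts = datetime.strptime(start, "%Y-%m-%d %H:%M")
        key = (case_id, label, start_ts)
        if key in seen:  # assumed duplicate rule: same patient, label, start
            continue
        seen.add(key)
        end_ts = start_ts + timedelta(minutes=planned_minutes)
        result.append((case_id, label, start_ts, end_ts))
    return result


# Invented example: two scans at the same moment collapse into one event.
rows = aggregate([
    ("P1", "Ultrasound scan abdomen", "2013-01-07 09:00", 30),
    ("P1", "X-ray upper left arm", "2013-01-07 09:00", 15),
    ("P1", "X-ray upper left arm", "2013-01-08 10:30", 15),
])
```

With this duplicate rule the first of two colliding events is kept, so its planned duration determines the end timestamp; a different rule would be equally defensible.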

6.4. Process mining methods and results

In this case study we used the ProM framework, following the process mining literature. We imported the 2013-2014 file into ProM 6 and converted it into an XES file (the event log standard), indicating which columns should be imported as case ID, activity, timestamp or resource. Based on common practice, we used the “Mine Petri net with Inductive miner” plug-in to discover any structure in the data. The Inductive miner specifically deals with infrequent behavior (e.g. in healthcare processes) and is used to obtain understandable, sound models (Mans, Van der Aalst, & Verbeek). The noise threshold was left at its default of 0.2, meaning that the algorithm filters out 20% of the infrequent behavior in the log. The resulting Petri net shows an overfitting model that includes all possible outcomes (Figure 5). For example, the start activity on the left could be one of 17 possible activities.

Figure 5 – Part of the model that was discovered from all traces in 2013 and 2014. It is a straightforward model that provides choices of many activities. The model allows for all types of behavior from the log, which makes it precise and overfitting. The black boxes represent silent transitions, the circles represent places for tokens.

Because this model is too precise, we used two filters in ProM to split the dataset into smaller logs to increase the chance of finding generalizable models. The “Filter log using simple heuristics” plug-in was used to select only the traces that started with the “Nieuwe patient” activity, the Dutch notation for the first consult with a new patient. As with the radiotherapy case study, we aimed to include only the traces that started with this specific activity in order to remove as much noise as possible. Applying this filter with its default settings resulted in a dataset of 6,625 remaining unique patients (i.e. 28% of the original log) and 135,401 remaining events. Again we used the “Mine Petri net with inductive miner” plug-in, which resulted in the model presented in Figure 6. Because this model has great precision, it does not allow for any other behavior than what is recorded in the log. Therefore, this model is overfitting again and we decided to split up the dataset based on cancer type by using the “Filter Log by attributes” plug-in. This way we created 18 separate datasets, each based on a single cancer type (see Table 1). In this plug-in we used the “Trace with an event having this attribute” setting with “Tumorwerkgroep” (i.e. cancer type in Dutch) as the defining attribute, so we could filter out each cancer type.
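The effect of the two filters can be illustrated with a small Python sketch. It does not reproduce the ProM plug-ins themselves; it only mimics the two selection rules described above on an invented log, where each event is a tuple of case ID, activity, timestamp and cancer type (an assumed layout).

```python
from collections import defaultdict


def group_into_traces(events):
    """Group events per case and order each trace by timestamp."""
    traces = defaultdict(list)
    for row in events:
        traces[row[0]].append(row)
    for rows in traces.values():
        rows.sort(key=lambda r: r[2])  # ISO timestamps sort as text
    return dict(traces)


def keep_traces_starting_with(traces, start_activity):
    """Keep only the traces whose first event is the given activity."""
    return {cid: rows for cid, rows in traces.items()
            if rows[0][1] == start_activity}


def split_by_cancer_type(traces):
    """Build one sub-log per cancer type; a trace is included in a sub-log
    if at least one of its events carries that cancer type."""
    sublogs = defaultdict(dict)
    for cid, rows in traces.items():
        for cancer_type in {r[3] for r in rows if r[3]}:
            sublogs[cancer_type][cid] = rows
    return dict(sublogs)


# Invented example log.
events = [
    ("P1", "Nieuwe patient", "2013-01-07T09:00", "Mamma"),
    ("P1", "Scan", "2013-01-09T11:00", "Mamma"),
    ("P2", "Scan", "2013-02-01T08:30", "Colorectal"),
    ("P2", "Nieuwe patient", "2013-02-03T10:00", "Colorectal"),
    ("P3", "Nieuwe patient", "2013-03-01T10:00", "Colorectal"),
]
filtered = keep_traces_starting_with(group_into_traces(events), "Nieuwe patient")
sublogs = split_by_cancer_type(filtered)
```

In this example patient P2 is dropped by the first filter because the trace starts with a scan, so the colorectal sub-log contains only patient P3.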


Figure 6 – Part of the model that is discovered from the traces that start with “New patient”. The first clear box represents the “New patient”-activity and again each possible outcome is represented, making the model precise and overfitting.

After creating the cancer type datasets, we chose to examine the care pathways that were followed by most patients or that had already been worked out in detail by the hospital: (1) melanoma and skin, (2) gastroenterology, (3) esophagus and stomach and (4) colorectal. For each of these datasets, we used the “Mine Petri net with inductive miner” plug-in to discover a Petri net (see Appendices D, E, F and G). Unfortunately, none of the four models accurately represents its event log: we specifically filtered on traces that start with the “New patient”-activity, yet none of the discovered models starts with this activity. The models generally started with a choice of multiple activities other than the “New patient”-activity, which was depicted somewhere in the center of the models.


Cancer type                      # Cases   # Events
New patients (start activity)      6,625    270,802
Melanoma, skin                     1,489     30,344
Mamma                                999     69,342
Head, neck                           703     40,422
Urology                              656     29,770
Bone, soft tissue                    614     26,924
Gynecology                           475     22,630
Hematology-oncology                  423     18,178
Gastroenterology                     370     24,682
Neuro-endocrine oncology             366      6,438
Lungs                                303     19,194
Neurology-oncology                   257      9,556
Colorectal                           166     11,612
Malignancy other                     103      5,916
Esophagus, stomach                    49      5,758
Ophthalmology                         44      1,206
Primary tumors other                  25      2,172
Metastases                            24      2,122
Pancreas                              23        820

Table 1 – Overview of the 18 datasets based on cancer type, filtered from the “Nieuwe patient” dataset.

6.5. Conformance checking

Conformance checking is another perspective on process mining (as opposed to process discovery) in which a process model is compared with the reality reflected in an event log (or a “blueprint”) to assess its quality. Two different metrics are mainly used for conformance checking: fitness and appropriateness. Fitness is the extent to which the log traces can be associated with the execution paths specified by the process model. The fitness of a trace is high (i.e. 1) if the same sequence of activities is allowed by the model. Appropriateness is the degree of accuracy in which the process model describes the observed behavior, combined with the degree of clarity in which it is represented. The appropriateness metric consists on the one hand of evaluating how much behavior is allowed by the model that was not observed in the event log (i.e. behavioral appropriateness) and on the other hand of keeping the model’s structure minimal to clearly reflect the described behavior (i.e. structural appropriateness) (Rozinat & Van der Aalst, 2005).

We used the “Replay a log on Petri net for conformance analysis” plug-in to check the conformance of the four discovered models. This plug-in aligns each model with its corresponding event log to see if and how often the real process deviates from the modeled process. Frequently occurring deviations were found for each model, indicating that a number of traces do not fit the model. The plug-in also calculated the trace fitness, which was approximately 0.5 for each model, with the highest value of 0.55 for the melanoma model. This value indicates that about 50% of the moves carried out in the model were also performed in the log at the same time (i.e. “move in both”). The other 50% of the moves were artificially constructed for either the log (i.e. “move in log”) or the model (i.e. “move in model”) to properly end the process. This last notion was also visible in the models: a few places still contained a token after the process replay had finished (Rozinat & Van der Aalst, 2005).
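To make the three kinds of moves concrete, the sketch below classifies the steps of one alignment and computes the share of “move in both” steps. This is a simplified reading that matches the interpretation given above; the fitness value reported by ProM is derived from alignment costs, so the ratio here should not be taken as the plug-in's exact metric. The alignment and its activity names are invented, and `>>` is used as a placeholder for “no move” on one side.

```python
SKIP = ">>"  # placeholder meaning: no step on this side of the alignment


def classify_moves(alignment):
    """Count the move types in a list of (log_step, model_step) pairs."""
    counts = {"move in both": 0, "move in log": 0, "move in model": 0}
    for log_step, model_step in alignment:
        if log_step != SKIP and model_step != SKIP:
            counts["move in both"] += 1   # log and model agree on this step
        elif model_step == SKIP:
            counts["move in log"] += 1    # step only present in the log
        else:
            counts["move in model"] += 1  # step only present in the model
    return counts


def synchronous_share(alignment):
    """Fraction of alignment steps that are performed by log and model."""
    counts = classify_moves(alignment)
    total = sum(counts.values())
    return counts["move in both"] / total if total else 0.0


# Invented alignment of one trace against a model.
alignment = [
    ("Nieuwe patient", "Nieuwe patient"),
    ("Scan", SKIP),
    (SKIP, "Consult"),
    ("Operation", "Operation"),
]
share = synchronous_share(alignment)
```

In this example two of the four steps are synchronous, which corresponds to the situation of roughly 50% “move in both” described for the four models.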

Since all four models are very precise, the structural appropriateness is expected to be low. However, the behavioral appropriateness is estimated to be high, since the flowers might allow for much behavior not observed in the logs.

6.6. Conclusion

In this case study, we aimed to provide policy makers with an accurate description of the current oncology processes and to analyze the (dis)similarities between these processes that may be relevant for setting up an IPU. To define the current processes, we received three years of data about oncology-related activities performed in more than fifty departments, to which we applied specific techniques for process discovery and conformance checking. With the use of a number of filters and miners we discovered two models that unfortunately were highly precise and therefore overfitting. We then narrowed our focus down to eighteen separate datasets filtered by specific cancer types in order to discover more meaningful models. The ProM 6 framework was further used for process discovery in four of these specific datasets, but the resulting Petri nets again did not accurately reflect reality, with a moderate fitness of at most 0.55. Further work is needed to discover the right process models.

Figure 7 – Result of the “Replay a log on Petri net for conformance analysis” plug-in for the esophagus and stomach model explaining the legend and the calculated properties. The grey boxes represent silent activities and the circles represent the places holding the tokens.


7. Discussion

After studying two cases related to oncology care in a Dutch University medical center, we discovered a range of different process models. For the radiotherapy department we discovered an accurate process model, which was used by policy makers to make decisions about improving the radiotherapy process. Although we did not find generalizable process models to support the cancer practice unit, we have gained experience and insight in applying process mining techniques to healthcare data.

As is explained in the process mining manifesto (Van der Aalst, et al., 2011), it is very important to start process discovery with a clearly defined focus based on research questions. In the second case study, we were provided with a dataset containing the events of three consecutive years across at least fifty departments. Although process mining techniques can handle large amounts of data, it should be remembered that within normal time limits, very large datasets might not provide relevant answers and can even distract from the goal of the study. We discovered that preprocessing such large datasets is very time consuming, as is filtering for the relevant data. Another important point is to understand the characteristics of the different process mining algorithms in order to apply the right methods. This is not an easy task, since the ProM framework already offers the user more than 200 plug-ins. We learned that this search for the correct miners is time consuming and may distract from the original goals. Another important reason to choose the correct miners is to avoid representational bias. This bias arises when the chosen algorithm does not fit the data, which can result in skewed process models. For example, some algorithms may be unable to represent concurrency or hierarchy that is embedded in the dataset. According to Van der Aalst (2011), Petri nets are associated with a number of problems that can generate representational bias, which are also relevant to our cancer practice unit case study: the search space is too large, Petri nets cannot capture important process patterns in a direct manner, it is difficult to “invent” modeling elements and the representational bias does not help in finding a proper balance between overfitting and underfitting.

This study provided us with good insight into process mining. By trial and error, we gained experience in finding and using a number of miners and filters. After we presented some results to colleagues in the UMC, most of them were very interested in the benefits and prospects of using process mining techniques for the optimization of healthcare processes. Although the research questions did not necessarily have to be answered by applying process mining, we are convinced that the resulting models provide valuable insight into the realized processes. In the near future, we will transfer our knowledge of process mining to the colleagues in the UMC, so that it can be used to improve the currently discovered models. Once these models are worked out in detail, they can provide valuable information for setting up an IPU.


8. Literature list

1. Buijs, J., Van Dongen, B., & Van der Aalst, W. (2012). On the role of fitness, precision, generalization and simplicity in process discovery. OTM Confederated International Conferences "On the Move to Meaningful Internet Systems" (pp. 305-322). Springer Berlin Heidelberg.

2. Kurniati, A., Johnson, O., Hogg, D., & Hall, G. (2016). Process mining in oncology: A literature review. Information Communication and Management (ICICM), International Conference (pp. 291-297). IEEE.

3. Mans, R., Schonenberg, M., Song, M., Van der Aalst, W., & Bakker, P. (2008). Application of process mining in Healthcare - A case study in a Dutch Hospital. International Joint Conference on Biomedical Engineering Systems and Technologies (pp. 425-438). Springer Berlin Heidelberg.

4. Mans, S., Van der Aalst, W., & Verbeek, H. (n.d.). Defining and executing process mining workflows with RapidProM.

5. Meulepas, J., & Kiemeney, L. (2011). Kanker in Nederland tot 2020 - Trends en prognoses. Oisterwijk: KWF Kankerbestrijding.

6. Nederlandse Vereniging voor Radiotherapie en Oncologie. (2000, November 24). Kwaliteit Indicatoren. Retrieved from NVRO: http://www.nvro.nl/kwaliteit/indicatoren

7. Porter, M., & Lee, T. (2013). The strategy that will fix health care. Harvard Business Review, 1-36.

8. Rebuge, A., & Ferreira, D. (2012). Business process analysis in healthcare environments: A methodology based on process mining. Information Systems, 99-116.

9. Rojas, E., Munoz-Gama, J., Sepulveda, M., & Capurro, D. (2016). Process mining in healthcare: A literature review. Journal of biomedical informatics, 224-236.

10. Rozinat, A. (2015). Disco User's Guide. Discover your Process.

11. Rozinat, A., & Van der Aalst, W. (2005). Conformance testing: Measuring the fit and appropriateness of event logs and process models. International Conference on Business Process Management (pp. 163-176). Springer Berlin Heidelberg.

12. Rozinat, A., & Van der Aalst, W. (2008). Conformance checking of processes based on monitoring real behavior. Information Systems, 64-95.

13. Van der Aalst, W. (2011). Do Petri nets provide the right representational bias for process mining? ART@ Petri Nets, 85-94.

14. Van der Aalst, W. (2017). Aligning observed and modeled behavior. Retrieved from Coursera: https://www.coursera.org/learn/process-mining/lecture/RGAXo/4-7-aligning-observed-and-modeled-behavior

15. Van der Aalst, W. (2017, 2 2). Discovering concurrency. Retrieved from Process mining: http://www.processmining.org/_media/presentations/concur-wvdaalst-invited-talk-2011.pdf

16. Van der Aalst, W., Adriansyah, A., de Medeiros, A., Arcieri, F., Baier, T., Blickle, T., & Burattin, A. (2011). Process mining manifesto. International Conference on Business Process Management (pp. 169-194). Springer Berlin Heidelberg.

17. Van der Horst, A., Van Erp, F., & De Jong, J. (2011). Trends in gezondheid en zorg. Den Haag: Centraal Planbureau. Retrieved from https://www.cpb.nl/persbericht/3211095/zorguitgaven-blijven-stijgen

18. Verbeek, H. (2004). Verification of WF-nets. Eindhoven, The Netherlands: Technische Universiteit Eindhoven.

19. Verbeek, H., Buijs, J., Van Dongen, B., & Van der Aalst, W. (2010). ProM 6: The process mining toolkit. Proceedings of BPM Demonstration Track, (pp. 34-39).


APPENDIX A

Radiotherapy case study, a general description of the dataset and the Dutch terms.

• Pseudo ID, age, age group and gender of the patient.
• The event details from the radiotherapy agenda and its sub-agendas, including date and time of each appointment, the number of times an appointment was rescheduled, the duration of each appointment/activity and the attending specialist/physician.
• Information about the financial registration of each appointment (DBC).
• Information about the diagnosis of each patient.

Information about the patient.
pseudoID: pseudonym of the patient ID.
Geslacht: gender of the patient.
Leeftijd: age of the patient.
Leeftijdsgroep: age group of the patient per ten years.

Information about the timestamp of events.
AfspraakDatum: date of the event.
→ Each row also contained an empty 00:00:00.000 time notation which was deleted using the Search and Replace function for this specific column.
AfspraakTijd: time of the event.
Dag_Naam_Afspraak: name of the weekday of the event.
TimestampStart: timestamp of the planned start of the event.
→ Obtained by combining AfspraakDatum and AfspraakTijd using the TEXT function.
TimestampEnd: timestamp of the planned ending of the event.
→ Obtained/Calculated by adding the GeplandeDuur to TimestampStart.

Information about the resources of an event.
Agendaomschrijving: name of the main agenda.
SubAgendaNaam: name of the sub agenda.
SubAgendaOmschrijving: description of the sub agenda.

Information about the event.
Afspraakcode: short code of the event.
Afspraakcodeomschrijving: description of the event code.
SpecialistOmschrijving_Uitvoerder: name of the attending specialist.

Information about additional timestamps belonging to the event.
GeplandeDuur: planned duration of the event.
Toegangstijd (min): time between registering an event in the agenda and the actual time of the event (in minutes).
→ Later confirmed by the business intelligence unit that this time is represented in days, not minutes.
→ Name of the column was therefore changed to Toegangstijd(dagen).
Oproeptijd: time when a patient was called from the waiting room.
→ Not registered, column only contains 0.
BehandelDuur: planned duration of the event.
→ Not registered, column only contains 0.
EindTijdAfwijking: time between the planned end time and the actual end time.
→ Not registered, column only contains 0.
Verplaatsingen: number of times the event was rescheduled in time or per specialist.
→ Unfortunately, it is not clear what this number represents exactly.
VertrekTijd: time when a patient leaves the department.
→ Not registered, column only contains 0 except for three events.
Voldaan: indicator if the event was completed.
→ 0 indicating not completed, 1 indicating completed.
WachtDuur: waiting time in the waiting room.
→ Not registered, column only contains 0.
WerkelijkeDuur: actual duration of the event.
→ Not registered, column only contains 0.

Information about the diagnosis and the billing process.
DBC_code: code for the combination of diagnosis and treatment (DBC), needed for the billing process.
DBC_omschrijving: description of the DBC.
DiagnoseCode: hospital specific diagnosis code.
DiagnoseOmschrijving: description of the diagnosis.
→ This column contained NULL instances. The LOOKUP function was used to fill in the NULL instances for patients who had the same DiagnoseOmschrijving for all events. NULL instances for patients who only had one event or patients who had multiple DiagnoseOmschrijving instances were kept at NULL.
DiagnoseCode_ICD10: diagnosis code according to the standardized International Classification of Diseases and Related Health Problems.


APPENDIX B

One of the first discovered models, leading up to the final model. This model was still ambiguous since a number of patients started with the preparation of treatment instead of a first consult with a specialist.


APPENDIX C

Cancer practice unit case study, a general description of the dataset and the Dutch terms.

• Pseudonymized ID, age, age group and gender of the patient.
• The event details including date and time of each event, the number of times an event was rescheduled, and the attending physician.
• Information about the financial registration of each event (DBC).
• Information about the diagnosis of each patient.

Information about the patient.
pseudoID_dwh: pseudonym of the patient ID.
Geslacht: gender of the patient.
Leeftijd: age of the patient.
Leeftijdsgroep: age group of the patient per ten years.

Information about the timestamp of events.
AfspraakDatum: date of the event.
→ Each row also contained an empty 00:00:00.000 time notation which was deleted using the Search and Replace function for this specific column.
AfspraakTijd: time of the event.
TimestampStart: timestamp of the planned start of the event.
→ Obtained by combining AfspraakDatum and AfspraakTijd using the TEXT function.

Information about the resources of an event.
Agendaomschrijving: name of the main agenda and department.
SubAgendaNaam: name of the sub agenda.
Artstype_Uitvoerder: type of the attending specialist.
Specialistomschrijving_Uitvoerder: name of the attending specialist.

Information about the event.
Afspraakcodeomschrijving: description of the event code.

Information about additional timestamps belonging to the event.
Verplaatsingen: number of times the event was rescheduled in time or per specialist.
→ Unfortunately, it is not clear what this number represents exactly.

Information about the diagnosis and the billing process.
DBC_omschrijving: description of the DBC.
DiagnoseOmschrijving: description of the diagnosis.
→ This column contains blanks.


APPENDIX D

The model discovered from the melanoma and skin log using the Inductive miner in ProM. The model is straightforward, but it is inconsistent with the data. The log contains only traces that start with the “New patient”-activity, while this model starts with a choice of eleven other activities. The “New patient”-activity is presented elsewhere in the process.


APPENDIX E

The model discovered from the gastroenterology log using the Inductive miner in ProM. The model is straightforward, but it is inconsistent with the data. The log contains only traces that start with the “New patient”-activity, while this model starts with a choice of nine other activities. The “New patient”-activity is presented elsewhere in the process.


APPENDIX F

The model discovered from the esophagus and stomach log using the Inductive miner in ProM. The model is straightforward, but it is inconsistent with the data. The log contains only traces that start with the “New patient”-activity, while this model starts with a choice of ten other activities. The “New patient”-activity is presented elsewhere in the process.


APPENDIX G

The model discovered from the colorectal log using the Inductive miner in ProM. The model is straightforward, but it is inconsistent with the data. The log contains only traces that start with the “New patient”-activity, while this model starts with a choice of eighteen other activities. The “New patient”-activity is presented elsewhere in the process.
