Temporal data analysis on blood glucose and physical activity data for diabetic patients

Academic year: 2021


Faculty of Medicine

MSc. Medical Informatics

Temporal data analysis on blood glucose and physical activity data for diabetic patients

Master’s thesis

Lisanne J.E. Kruijver


Student

Lisanne J.E. Kruijver

Student number: 11992425

E-mail: l.j.kruijver@amsterdamumc.nl

Mentor

Prof. Lucia Sacchi

Laboratory for Biomedical Informatics “Mario Stefanelli”

Department of Electrical, Computer and Biomedical Engineering

Università di Pavia, Italy

E-mail: lucia.sacchi@unipv.it

Tutor

Dr. Martijn C. Schut

Department of Medical Informatics

Amsterdam UMC, location AMC

E-mail: m.c.schut@amsterdamumc.nl

Location of Scientific Research Project

Laboratory for Biomedical Informatics “Mario Stefanelli”

Department of Electrical, Computer and Biomedical Engineering

Università di Pavia, Italy

Via Ferrata, 5

27100 Pavia

Italy

Practice teaching period

February 2019 – February 2020

PREFACE

This Master’s thesis provides the background, methods, results and conclusions of the scientific research project I did at the Laboratory for Biomedical Informatics “Mario Stefanelli” in Pavia, Italy. First of all, I would like to thank Prof. Lucia Sacchi for giving me the opportunity to come to Pavia and write my thesis at the University of Pavia, and for her supervision. Furthermore, I would like to thank Dr. Martijn Schut for his guidance and advice.

The months in Pavia would not have been the same were it not for everyone in the lab. I would like to thank each one of you for making me feel welcome and for trying the food I brought from the Netherlands and from my oven. I will never forget your reaction to stroopwafels. It was a pleasure to work with you.

I would like to thank every Erasmus student I met in Pavia. Without you my time here would certainly have been less enjoyable, and it was great to explore the surroundings of Pavia with you. A special thanks goes to Mirna, for taking away my doubts about going abroad and encouraging me to write the email asking for a spot to do my thesis in Pavia.

Moreover, my sincere thanks go to my family and friends, Joyce, Monica, Kiona, Carmen and Roni in particular. It has been wonderful to explore a bit of Italy with you. It is always nice to make new memories with long-lasting friends and these were definitely ones I will not forget.

A final thank you goes to the readers of my Master’s thesis; I hope you will enjoy reading it. Thank you, grazie mille and bedankt.

Lisanne Kruijver

Wijdewormer, February 2020

CONTENTS

Preface
Contents
Summary
Samenvatting
1. Introduction
1.1 Research objectives
1.2 Chapter Organization
2. Background
2.1 Temporal abstraction
2.2 Temporal abstraction with JTSA
2.3 Temporal association rules
3. Data and Methods
3.1 Data
3.2 Data preparation
3.3 Descriptive statistics
3.4 JTSA for temporal abstraction detection and identifying critical patients
3.5 Detect the relation between heart rate and hypoglycemia during the night
3.5.1 New pattern definition
3.5.2 Variability in heart rate before hypoglycemia
3.6 Mining novel patterns
4. Results
4.1 Descriptive statistics
4.2 JTSA for temporal abstraction detection
4.3 Detect the relation between heart rate and hypoglycemia during the night
4.3.1 Creating a new pattern
4.3.2 Percentage variation
4.3.3 All heart rate values
4.4 Mining novel patterns
5. Discussion
5.1 What are main characteristics of the blood glucose and physical activity datasets?
5.2 What temporal patterns can be detected in blood glucose data? Are these patterns able to identify the most critical cases in the patients set?
5.3 Can novel patterns be mined from the data and can these help to promote a better management of the disease?
5.4 Implications and recommendations
6. Conclusion
References
Appendices
Appendix A
Appendix B
Appendix C
Appendix D

SUMMARY

Background Type 1 Diabetes Mellitus patients have insulin deficiency and so they rely on external insulin to maximize the time their blood glucose (BG) levels are in the target range. BG levels that are too low (hypoglycemia) or too high (hyperglycemia) can have serious consequences for the patient. Several external factors influence the BG level, including physical activity, sleep and meals. Patients were using a Flash Glucose Monitoring (FGM) system to continuously measure the BG levels and an activity tracker to measure the heart rate (HR), activities and sleep. This research focuses on analyzing BG, HR and activity data with the advanced data mining techniques temporal abstraction (TA) and temporal association rules (TAR).

Methods The main characteristics of the BG and physical activity dataset of pediatric patients were analyzed with R. The Java Time Series Abstractor framework was used for TA detection. Patterns were developed in collaboration with clinicians and were used to analyze the proportions of hypo-, normo-, and hyperglycemia episodes in patients. The percentage of nights with at least one hypoglycemia episode was also calculated. The relation between HR and hypoglycemia during sleep was analyzed by creating new patterns and by examining the percentage variation of the HR, and the HR values themselves, before the start of hypoglycemia episodes. TAR was used to try to discover novel patterns in the data without making any hypotheses on what to look for.

Results 17 patients were included in the pediatric dataset. Their age was 12 [10 – 13] years and 9 (53%) were female. Patients were using both the activity tracker and FGM device simultaneously for 45 [23 – 53] days. The activity tracker was used without a break by 71% of the patients and the FGM device was used without a break by 65% of the patients. The 5 most frequently occurring patterns were related to HR data and the pattern with the longest duration was Hypoglycemia. Analysis of glycemic control showed that 35% of the patients experienced almost no hypoglycemic events, 71% of the patients spent the majority of the time with high BG values and the other 29% of the patients spent the majority of the time with normal BG values. 15 patients experienced nocturnal hypoglycemia episodes and 2 patients had one or more hypoglycemia episodes in over 40% of the nights. The analyses of the relation between HR and hypoglycemia during sleep did not yield statistically significant results, and no new patterns that can promote a better management of diabetes were found by using TAR.

Conclusion The analyses of BG data have the potential to help clinicians and patients to correctly manage the diet and the dosage of therapies. The patterns can be used as indicators of glycemic control. With the help of TA it is possible to find patterns that are otherwise difficult to find in BG and HR data. However, with the available data it was not possible to find a relationship between the HR and BG levels before a nocturnal hypoglycemia episode, or to mine novel patterns that can promote a better management of type 1 diabetes. For future research it is recommended that the population size is increased and that more types of data are taken into account when performing TAR.

Keywords temporal data analysis; temporal abstraction; patient-generated health data; flash glucose monitoring; activity tracker

SAMENVATTING

Background Patients with type 1 diabetes mellitus (T1DM) do not produce insulin themselves and depend on administered insulin to maximize the time with normal blood glucose (BG) values. BG values that are too low (hypoglycemia/hypo) or too high (hyperglycemia/hyper) can have serious consequences for the patient. Several external factors influence the BG values, including physical activity, sleep and meals. Patients used a Flash Glucose Monitoring (FGM) system to measure their BG values and an activity tracker to measure the heart rate (HR), activities and sleep. This research focuses on analyzing BG, HR and activity data with the advanced data mining techniques temporal abstraction (TA) and temporal association rules (TAR).

Methods The main characteristics of the datasets with BG values and activities of pediatric patients were analyzed using R. The Java Time Series Abstractor framework was used for TA detection. The patterns were developed in collaboration with physicians and were used to analyze the proportions of normal-value, hypoglycemia and hyperglycemia episodes in patients. The percentage of nights with one or more hypoglycemia episodes was also calculated. The relation between HR and nocturnal hypoglycemia was analyzed using new patterns, the percentage variation of the HR, and the HR values themselves before the start of a hypoglycemia episode. TAR was used to discover new patterns in the data without assuming any hypotheses about what to look for.

Results 17 patients were included in the pediatric dataset. Their age was 12 [10 – 13] years and 9 (53%) were female. Patients used the activity tracker and the FGM simultaneously for 45 [23 – 53] days. The activity tracker was used without a break by 71% of the patients, and this was the case for the FGM for 65% of the patients. The 5 most frequent patterns are all HR related, and Hypoglycemia was the pattern with the longest episodes. Analysis of the BG values showed that 35% of the patients experienced almost no hypoglycemia, 71% of the patients spent most of the time with high BG values and the other 29% of the patients had normal BG values for most of the time. 15 patients had hypoglycemia during sleep, and 2 patients in more than 40% of the nights. The analyses of the relation between HR and nocturnal hypoglycemia yielded no statistically significant results, and no new patterns that can improve the treatment of diabetes were found using TAR.

Conclusion The analyses of BG data can help clinicians and patients to adjust the diet and the dosage of therapies where needed. The patterns can be used as indicators of glycemic control. With the help of TA it is possible to find patterns that are otherwise difficult to discover in HR and BG data. With the available data, however, it was not possible to find a relationship between the HR and BG values before a nocturnal hypoglycemia, or to mine new patterns that can improve the treatment of T1DM. For future research a larger population size is recommended, as is the use of more data types in TAR.

1. INTRODUCTION

Diabetes mellitus is a chronic illness characterized by the body’s inability to produce any or enough of the hormone insulin. The International Diabetes Federation estimated in 2017 that there were 425 million diabetes patients around the world, which makes it one of the most prevalent chronic diseases [1]. Around 9% of these patients have type 1 diabetes mellitus (T1DM) [2]. The main feature of these patients is insulin deficiency, so they rely on external insulin to maximize the time their blood glucose (BG) levels are in the target range. Normal BG levels are between 70 mg/dL and 180 mg/dL, and levels that are too low (hypoglycemia) or too high (hyperglycemia) can have serious consequences. Therefore, testing their BG levels is one of the most important behaviors of T1DM patients [3].

Typical symptoms such as sweating and dizziness can help to detect impending hypoglycemia at an early stage, which is critical to avoid severe hypoglycemia. When patients have an impaired awareness of hypoglycemia and have lost the capacity to sense these typical symptoms, the risk of severe hypoglycemia increases by a factor of 6 [4]. During sleep, warning symptoms for a decrease in BG levels are blunted or absent, and patients are unlikely to be awakened by nocturnal hypoglycemia. Hypoglycemia can have adverse consequences including fatigue, unconsciousness, seizures, and even death [5]. Hyperglycemia can also become severe when left untreated and can lead to serious complications, such as a diabetic coma. Long-term consequences of persistent hyperglycemia are complications that affect the eyes, kidneys, nerves and heart [6]. Diabetes treatments are largely focused on decreasing the chances of developing these complications.

Patients have to record their blood glucose levels, insulin dosages, meal intake, physical exercise and the occurrence of events that may affect glucose metabolism (e.g. illnesses and unusual stress). These data are evaluated by diabetologists in order to assess the patient’s glucose metabolism and to revise insulin therapy. To reduce the need for frequent blood glucose tests, devices for continuous glucose monitoring (CGM) became available in 1999. They measure BG levels in the interstitial fluid at regular intervals over the course of the entire day [7]. Another type of glucose monitoring system applies flash glucose monitoring (FGM). FGM systems do not require calibration, and the patient has to use a scanner to collect the BG values from a sensor placed on the arm. Both systems are often complemented with software that allows visualizing reports such as the daily glycemic profiles and the average glucose profile over a pre-defined time interval (e.g., the latest 2 weeks).

Some variables related to a patient’s lifestyle can influence or reflect the BG levels. Physical activity, for example, influences the glycemic values and may lead to hypoglycemic or hyperglycemic episodes. Because exercise can have a lasting effect on blood glucose and insulin [8], it is fundamental to keep track of exercise over time. Poor quality of sleep or too few hours of sleep also affect the glucose metabolism and can lead to insulin resistance [9]. This is known as the hyperglycemic effect and can last for several hours. When exercising or during stressful moments, the heart rate (HR) increases. Moreover, increasing HR values can also be indicative of hypoglycemia [10, 11].

Activity trackers are able to record parameters such as physical activity, sleep, and the heart rate of users. These trackers can be comfortably worn by the patients 24 hours a day. They can record the start time, end time and intensity of activities that take longer than a threshold duration and the start time and end time of sleep. Moreover, they can record awake periods during sleep and they can monitor the HR of the user continuously.

Since the use of activity trackers and the interest in integrating BG values with lifestyle information have increased over the years, various platforms have become available that enable patients to upload their BG data from a glucometer and their activity data from trackers. These platforms usually provide dashboards and sometimes allow sharing data between patient and physician [12–14]. By combining different types of information, it might be possible to discover the sources of undesirable glucose patterns [15].

1.1 RESEARCH OBJECTIVES

The project studies pediatric T1DM patients who monitor their BG using an FGM device and their daily activities and HR using a Fitbit activity tracker. Data are collected through a platform called AID-GM, which was recently developed by researchers of the Laboratory for Biomedical Informatics “Mario Stefanelli”. This platform allows exchanging data between patients and diabetologists, and users are able to perform advanced analysis on the glucose and activity data [16]. Because of this platform, physicians are able to get a more complete view of the patient’s blood glucose levels and physical activity than they previously could by analyzing data only during periodic encounters. Data from adult T1DM patients were also available, but because of time limitations it was decided to focus on the pediatric data.

The data collected by the platform will be analyzed with the aim of detecting temporal patterns of interest and identifying the most critical patients. The work will answer the following research questions:

- What are the main characteristics of the blood glucose and physical activity datasets?

- What temporal patterns can be detected in blood glucose and activity data? Are these patterns able to identify the most critical cases in the patients set?

- Can novel patterns be mined from the data and can these help to promote a better management of the disease?

To analyze the patient-generated data and answer the research questions, the advanced data mining techniques temporal abstraction (TA) and temporal association rules (TAR) will be used. The JTSA (Java Time Series Abstractor) framework, recently developed at the Department of Electrical, Computer and Biomedical Engineering of the University of Pavia in Italy, is used to perform these analyses.

1.2 CHAPTER ORGANIZATION

Chapter 2 gives a detailed description of the concepts and elements that were used to perform the data mining techniques. Temporal patterns and TA, which allow qualitative patterns in time series to be identified, are introduced first. The JTSA framework that was used to execute the TA algorithms is then described, followed by TAR. Chapter 3 describes where the data came from and states the methods used to answer the research questions. Chapter 4 presents the results and chapter 5 the discussion, in which the results are reflected upon in terms of strengths and weaknesses of the research methods. It also gives recommendations for further research. Chapter 6 concludes this research project.

2. BACKGROUND

This chapter provides background information on the basic elements used in this work. The data mining techniques TA and TAR are explained, as well as how TAs are detected using JTSA.

2.1 TEMPORAL ABSTRACTION

The information recorded by devices such as a Fitbit and an FGM device is provided in the form of time series. With the technique known as TA, these quantitative time series (raw data) can be transformed into a qualitative and interval-based representation in which the patterns of interest occur (e.g. an increase in heart rate) [17]. This methodology has been widely used in the medical field, where it has proven to be particularly useful. TA has been applied to the clinical histories of patients, which are made up of numerous series of clinical events (e.g. hospitalizations and drug intakes) and where data can have characteristics such as irregular sampling time or a great variability [18, 19].

Temporal data can be described using two temporal primitives: the time point and the time interval. The time point refers to one specific point in time t, while the time interval refers to a pair of points [t_start, t_end], where t_start ≤ t_end. A granularity can be selected, which represents the maximum temporal resolution used to represent a time point or time interval, for example minutes or hours [20]. An event is any variable v associated with a time point t and characterized as a pair (t, v), while an episode is the association between an interval I and a label L. This results in the pair (I, L), where the label L describes what is occurring in the interval I. Raw data that is collected over time often comes in the form of events.
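The two primitives can be sketched as small data structures. This is a minimal illustration in Python, not the representation used by JTSA; the class names `Event` and `Episode`, the minute granularity and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A value v observed at a single time point t, i.e. the pair (t, v)."""
    t: int      # time point in the chosen granularity (assumed here: minutes)
    v: float    # measured value, e.g. blood glucose in mg/dL


@dataclass(frozen=True)
class Episode:
    """A label L holding over an interval I = [start, end], i.e. the pair (I, L)."""
    start: int
    end: int
    label: str

    def __post_init__(self):
        # An interval is only valid when t_start <= t_end.
        if self.start > self.end:
            raise ValueError("an interval requires start <= end")

    @property
    def duration(self) -> int:
        return self.end - self.start


# Hypothetical raw data: four BG readings, 15 minutes apart.
raw = [Event(0, 95.0), Event(15, 88.0), Event(30, 64.0), Event(45, 61.0)]
# Hypothetical episode describing what happens during the last two readings.
low = Episode(start=30, end=45, label="hypoglycemia")
```

Raw data corresponds to a list of `Event` objects, while the output of a TA is a list of `Episode` objects.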

Figure 1 – The ontology of algorithms.

TAs can be extracted from data using a variety of algorithms (also known as TA mechanisms). Following the framework proposed in [21], the ontology of algorithms that is available for the extraction of TAs is shown in Figure 1. The framework includes algorithms for pre-processing a time series and algorithms for extracting TAs. There are three different types of TA:

1. BASIC TAs: receive a time series of events as input and return a time series of episodes. There are two types of basic TAs: qualitative and trend. Qualitative TAs map ranges of quantitative values to qualitative levels based on pre-defined thresholds, and are able to detect, for example, normal and abnormal states. Trend TAs detect decreasing, increasing and stationary patterns. An example is shown in Figure 2.

Figure 2 – An example of a time series (TS) abstracted with a three level qualitative TA (high H, middle M and low L) and with a trend TA (increase I, stationary S and decrease D).

2. AGGREGATION TAs: take as input a series of episodes and aggregate consecutive episodes that have the same label. For this type of TA, two parameters need to be specified: the minimum length that an episode needs to have before being included in the output series and the maximum distance between two successive episodes for them to be aggregated into one single episode. The aggregation algorithms are divided into Level and High Level: while Level aggregates according to the criteria defined by the user, the High Level aggregation TA also filters on the label of episodes, so that episodes the user does not consider useful can be eliminated. For example, a High Level aggregation TA is able to extract just the episodes marked as “low” in Figure 2 and aggregate consecutive “low” episodes according to the parameters set.

3. COMPLEX TAs: allow the definition of complex abstractions that are used to identify patterns that cannot be represented by basic TAs. The complex abstractions start from basic, aggregation or other complex episodes and can detect complex patterns in univariate or multivariate data. The patterns are defined with the use of one of Allen’s relational operators [22], presented in Table 1, and one of the combiners shown in Table 2. Thirteen relationships are possible between two episodes, since each of the seven operators except EQUALS also has an inverse in which the two episodes are swapped. The algorithm works on pairs of episodes and tries to detect whether the time relationship requested by the user is verified. If this is the case, a new episode of the complex TA is created. The combiners define how each pair of intervals has to be processed to obtain a single output interval for the resulting episode. This type of TA requires three parameters to be defined: the maximum distance between the starts of the two intervals (left shift), the maximum distance between the ends of the two intervals (right shift) and the maximum gap between the two. This is illustrated in Figure 3.

Figure 3 – An illustration of the parameters of a complex TA, including left shift, right shift and gap.
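The basic qualitative TA and the High Level aggregation TA described above can be sketched in a few lines of Python. This is an illustrative sketch and not the JTSA implementation: the function names, the tuple layout and the sample readings are assumptions, and the thresholds reuse the 70 and 180 mg/dL limits mentioned in the introduction. Consecutive readings are treated as contiguous regardless of the time between them.

```python
from typing import List, Tuple

Event = Tuple[int, float]        # (time in minutes, value)
Episode = Tuple[int, int, str]   # (start, end, label)


def qualitative_ta(events: List[Event], low: float = 70.0,
                   high: float = 180.0) -> List[Episode]:
    """Basic qualitative TA: map each value to a level and merge equal neighbours."""
    def level(v: float) -> str:
        if v < low:
            return "low"
        if v > high:
            return "high"
        return "normal"

    episodes: List[Episode] = []
    for t, v in events:
        lab = level(v)
        if episodes and episodes[-1][2] == lab:
            # Same level as the previous reading: extend the running episode.
            episodes[-1] = (episodes[-1][0], t, lab)
        else:
            episodes.append((t, t, lab))
    return episodes


def high_level_aggregation(episodes: List[Episode], label: str,
                           min_length: int, max_gap: int) -> List[Episode]:
    """Keep only `label` episodes, merge those at most `max_gap` apart,
    then drop merged episodes shorter than `min_length`."""
    merged: List[Episode] = []
    for start, end, lab in episodes:
        if lab != label:
            continue
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end, label)
        else:
            merged.append((start, end, label))
    return [e for e in merged if e[1] - e[0] >= min_length]


# Hypothetical BG readings every 15 minutes (mg/dL).
bg = [(0, 100), (15, 65), (30, 60), (45, 90), (60, 62), (75, 58), (90, 110)]
levels = qualitative_ta(bg)
lows = high_level_aggregation(levels, "low", min_length=15, max_gap=30)
```

With a maximum gap of 30 minutes the two low stretches are merged into one episode; with a maximum gap of 15 minutes they stay separate.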

Relational operator   Description
BEFORE                Episode x finishes before episode y starts
MEETS                 Episode x finishes at the same moment that episode y starts
OVERLAPS              Episode x starts before episode y, but finishes after episode y starts
STARTS                Episode x and y start at the same time, while episode x finishes before episode y
DURING                Episode y starts before episode x and episode y finishes later than episode x
FINISHES              Episode y starts before episode x and both episodes finish at the same time
EQUALS                Episode x and y start and finish at the same time

Table 1 – Allen’s relational operators [22].
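The seven operators can be written as comparisons on the interval end points. The sketch below is illustrative only: the function name `allen_relation` and the tuple representation are assumptions, intervals are assumed to have a start strictly before their end, and OVERLAPS additionally requires that x ends before y ends, as in Allen’s original definition.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (start, end), assumed start < end


def allen_relation(x: Interval, y: Interval) -> Optional[str]:
    """Return the operator that holds for x relative to y, or None when
    only an inverse relation holds (then allen_relation(y, x) is not None)."""
    xs, xe = x
    ys, ye = y
    if xs == ys and xe == ye:
        return "EQUALS"
    if xe < ys:
        return "BEFORE"
    if xe == ys:
        return "MEETS"
    if xs < ys < xe < ye:
        return "OVERLAPS"
    if xs == ys and xe < ye:
        return "STARTS"
    if ys < xs and xe < ye:
        return "DURING"
    if ys < xs and xe == ye:
        return "FINISHES"
    return None


relation = allen_relation((0, 10), (5, 20))
swapped = allen_relation((10, 20), (0, 5))
```

Here `relation` is "OVERLAPS", while `swapped` is None because only the inverse holds: `allen_relation((0, 5), (10, 20))` gives "BEFORE". Counting the seven operators plus the six inverses gives the thirteen relationships mentioned in the text.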

Combiner              Description
UNION                 Episode x and y are combined in one single interval, including the possible gap between them
INTERSECTION          The result is the time interval in which both episode x and y occur
GAP                   The result is the time interval between the end of episode x and the start of episode y
LONGEST               The result is the time interval of the longest episode (episode x in the illustrated example)
SHORTEST              The result is the time interval of the shortest episode (episode y in the illustrated example)
GAP BETWEEN ENDS      The result is the time interval between the end of episode x and the end of episode y
GAP BETWEEN STARTS    The result is the time interval between the start of episode x and the start of episode y
FIRST IN SERIES       The result is the time interval of the first episode (episode x in the illustrated example)
LAST IN SERIES        The result is the time interval of the last episode (episode y in the illustrated example)

Table 2 – Combiners for the construction of a new interval starting from two input intervals.
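The combiners in Table 2 can be sketched as a single function that takes two intervals and returns one. This is an illustration rather than the JTSA code: the function name `combine` is hypothetical, ties between equally long episodes are resolved in favour of x, and "first" and "last" are assumed to refer to the episode with the earlier and later start.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (start, end)


def combine(x: Interval, y: Interval, combiner: str) -> Optional[Interval]:
    """Build one output interval from two input intervals (see Table 2)."""
    xs, xe = x
    ys, ye = y
    if combiner == "UNION":
        return (min(xs, ys), max(xe, ye))
    if combiner == "INTERSECTION":
        s, e = max(xs, ys), min(xe, ye)
        return (s, e) if s <= e else None      # None: the episodes never co-occur
    if combiner == "GAP":
        return (xe, ys) if xe <= ys else None  # None: there is no gap
    if combiner == "LONGEST":
        return x if xe - xs >= ye - ys else y
    if combiner == "SHORTEST":
        return x if xe - xs <= ye - ys else y
    if combiner == "GAP BETWEEN ENDS":
        return (min(xe, ye), max(xe, ye))
    if combiner == "GAP BETWEEN STARTS":
        return (min(xs, ys), max(xs, ys))
    if combiner == "FIRST IN SERIES":
        return x if xs <= ys else y
    if combiner == "LAST IN SERIES":
        return y if xs <= ys else x
    raise ValueError(f"unknown combiner: {combiner}")


# Hypothetical pair: x is the longer, earlier episode; y the shorter, later one.
x, y = (0, 10), (15, 20)
union = combine(x, y, "UNION")
gap = combine(x, y, "GAP")
```

For this pair, `union` covers both episodes and the gap between them, and `gap` is the interval from the end of x to the start of y.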

2.2 TEMPORAL ABSTRACTION WITH JTSA

JTSA is a framework that provides a library of algorithms to detect TAs and can be integrated into other applications [21]. In order to extract the TAs, the JTSA framework requires a workflow file with corresponding properties files for the TAs. The workflow is an XML document that formalizes the series of steps that need to be followed to detect the pattern of interest. A workflow consists of several blocks that are divided into steps, each of which can run a JTSA algorithm. There are two types of blocks: pipeline blocks and complex blocks. The pipeline blocks take a single time series as input and contain a sequence of steps that use a basic algorithm or an aggregation algorithm. The output is a series of episodes. Complex blocks take two series of episodes as input and use a complex algorithm to combine the incoming series with a temporal operator and a combiner, as described in Table 1 and Table 2 respectively. The single blocks and entire workflows can be run over different data sets and in different medical applications. The data types that can be handled by JTSA are event time series (TS) and episode time series (A-TS). When temporal operators are applied to episodes, this results in time series of pairs of episodes (CA-TS). These are the series of pairs that verified the temporal operator and are the input for the combiner.

When a graphical renderer is added to the JTSA workflow, it is possible to visualize the different steps of TA. This can be helpful during the prototyping phase and to tune parameters. It is possible to visualize a partial output using a step renderer, or the final output by using a pipeline renderer. An example of a complex pattern in T1DM is the dawn effect, also known as the ‘dawn phenomenon’ and reported for the first time in 1981 [23]. This effect is linked to hormonal factors and means that a patient wakes up with hyperglycemia, preceded by normal blood glucose levels during sleep [24]. The data used for this pattern are the blood glucose levels during sleep and during routine (this means not during a work-out and not during sleep). The interval of the sleep and routine data can be precisely detected by using the Fitbit data. The workflow consists of three different blocks and Figures 4 – 6 show the plots generated by using a graphical renderer in JTSA:

1. Figure 4 shows the plots obtained after the first block is applied. Here, the episodes of normal blood glucose during sleep are detected. The input data for this step is the time series of BG data recorded during sleep. A pipeline block with two steps is used to achieve this. First, a basic qualitative TA maps the quantitative data to qualitative glycemia levels; second, a high level aggregation TA keeps only the intervals where the blood glucose levels are normal.

2. In the second block (Figure 5) the episodes of hyperglycemia in routine are detected in the same way as the normal levels during sleep, but in this case the input data is the time series of routine BG data and the high level aggregation TA is used to obtain only the hyperglycemia episodes.

3. The third and final block takes the episodes of normal blood glucose levels during sleep and hyperglycemia episodes during routine as input and, with the use of the BEFORE operator, detects the intervals of the dawn effect (Figure 6).
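The three blocks above can be sketched as follows. This is a simplified Python illustration and not a JTSA workflow: the function names, the 70 and 180 mg/dL thresholds, the `max_gap` parameter, the choice of a UNION combiner for the output interval and the sample readings are all assumptions made for the example. Consecutive readings of one series are treated as contiguous.

```python
from typing import List, Tuple

Event = Tuple[int, float]    # (time in minutes, BG in mg/dL)
Interval = Tuple[int, int]   # (start, end)


def episodes_with_label(events: List[Event], label: str,
                        low: float = 70.0, high: float = 180.0) -> List[Interval]:
    """Pipeline block: qualitative TA followed by a filter on one label."""
    def level(v: float) -> str:
        return "hypo" if v < low else "hyper" if v > high else "normal"

    out: List[Interval] = []
    prev = None
    for t, v in events:
        lab = level(v)
        if lab == label:
            if prev == label:
                out[-1] = (out[-1][0], t)   # extend the running episode
            else:
                out.append((t, t))          # open a new episode
        prev = lab
    return out


def dawn_effect(sleep_bg: List[Event], routine_bg: List[Event],
                max_gap: int) -> List[Interval]:
    """Complex block: normal-during-sleep BEFORE hyperglycemia-in-routine."""
    normal_sleep = episodes_with_label(sleep_bg, "normal")     # block 1
    hyper_routine = episodes_with_label(routine_bg, "hyper")   # block 2
    found = []
    for ns, ne in normal_sleep:                                # block 3
        for hs, he in hyper_routine:
            if ne < hs and hs - ne <= max_gap:  # BEFORE, limited by the gap
                found.append((ns, he))          # UNION combiner (assumed)
    return found


# Hypothetical night: normal BG while asleep, hyperglycemia after waking up.
sleep_bg = [(0, 110), (60, 120), (120, 130), (180, 140)]
routine_bg = [(200, 190), (215, 210), (230, 170)]
detected = dawn_effect(sleep_bg, routine_bg, max_gap=30)
```

With a maximum gap of 30 minutes one dawn effect interval is found, spanning the normal sleep episode and the following hyperglycemia episode; with a maximum gap of 10 minutes nothing is detected.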

Figure 4 – Extraction of normal episodes. The first graph shows the input time series, the second shows the mapping between ranges of quantitative values and qualitative levels and the last graph shows the resulting normal interval as extracted by the TA detection algorithm.

Figure 5 – Extraction of hyperglycemia episodes. The first graph shows the input time series, the second shows the mapping between ranges of quantitative values and qualitative levels and the last graph shows the resulting hyperglycemia interval as extracted by the TA detection algorithm.

Figure 6 – Extraction of the dawn effect. The first graph shows the normal episode, the second shows the hyperglycemia episodes and the last graph shows the resulting dawn effect interval as extracted by the TA detection algorithm.

2.3 TEMPORAL ASSOCIATION RULES

Another data mining technique that is used is TAR. With this technique, the goal is to discover frequent occurrences of temporal precedence between patterns. Unlike the extraction of patterns that are specified by the user, TARs are able to mine the patterns of interest from the data, potentially identifying novel behaviors. A TAR is an association rule in which an antecedent and a consequent are linked to each other by a temporal relationship. An antecedent can be composed of one or more patterns, while a consequent always consists of only one pattern, which can itself be arbitrarily complex (for example, a set of episodes precedes the occurrence of another episode of interest). In addition to the operators defined by Allen [22], an additional operator called PRECEDES can be used for the extraction of TARs from data. This operator synthesizes 6 of the 13 temporal operators by Allen: OVERLAPS, FINISHES, MEETS, BEFORE, EQUALS and STARTS. With the three parameters left shift, right shift and gap, mentioned before and depicted in Figure 3, it is possible to select a subset of the temporal operators instead of looking for all the relationships that are included in the PRECEDES relationship. To avoid ambiguity, it is important to set the parameters left shift, right shift and gap in such a way that only the two closest intervals are found [17].
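A possible reading of the PRECEDES operator with its three parameters is sketched below. The formal definition is not given in this text, so the sketch assumes that the antecedent must start no later and end no later than the consequent, and that left shift, right shift and gap are upper bounds on the distance between the starts, the distance between the ends, and the distance from the end of the antecedent to the start of the consequent. The function name and the sample intervals are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end)


def precedes(a: Interval, c: Interval,
             left_shift: int, right_shift: int, gap: int) -> bool:
    """Check whether antecedent episode a PRECEDES consequent episode c
    under the three temporal parameters (assumed semantics, see lead-in)."""
    a_start, a_end = a
    c_start, c_end = c
    # Assumed core condition: a starts no later and ends no later than c.
    if not (a_start <= c_start and a_end <= c_end):
        return False
    return (c_start - a_start <= left_shift
            and c_end - a_end <= right_shift
            and max(0, c_start - a_end) <= gap)  # overlapping episodes have gap 0


# Hypothetical episodes, in minutes.
antecedent, consequent = (0, 10), (20, 30)
loose = precedes(antecedent, consequent, left_shift=30, right_shift=30, gap=15)
strict = precedes(antecedent, consequent, left_shift=30, right_shift=30, gap=5)
```

Tightening the gap from 15 to 5 minutes turns the result from True into False, which shows how the parameters restrict PRECEDES to a subset of the relationships it would otherwise accept.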

A core aspect of TAR is defining the set of Abstractions of Interest (AoI), that is, the set of patterns from which the rules are extracted and which will form the antecedent and the consequent. Once the set of AoI is defined, each of its elements is selected in turn and used as the consequent of the rule. Then the temporal operator PRECEDES is verified on all the possible combinations of patterns in the AoI. Once all possible combinations have been evaluated, the consequent is changed. A TAR is defined as $A \rightarrow_{p} C$, where A is the antecedent and consists of a subset of the AoI, C is the consequent and consists of a series of episodes that belong to the abstraction of interest, and p is a vector of three temporal parameters: left shift, right shift and gap. The method used for temporal rule extraction is based on an Apriori-like search strategy [25], and looks for rules in which a number of intersecting TA episodes (the antecedent) has a PRECEDES relationship with another TA episode (the consequent).

Two important parameters for TAR are support and confidence. As in the Apriori algorithm, they are essential for an efficient search over the rule space. The original definitions of the two concepts were designed for rules in a static context, so they must be adapted to the temporal domain. The terminology used to define support and confidence is as follows:

- TSO: the time span of the observation period over which the rule is derived

- RTS: the rule time span, which corresponds to the union of the episodes in which both the antecedent and the consequent of the rule occur.

- NAT: the number of episodes in which the antecedent occurs during TSO

- NARTS: the number of episodes in which the antecedent occurs during RTS

And so, support and confidence are defined as:

$$\mathit{Support} = \frac{RTS}{TSO} \qquad \mathit{Confidence} = \frac{NARTS}{NAT}$$

Both indicators need to be evaluated together, since each provides important information about the quality of the rule. For example, when a precedence relationship happens several times, but over very short intervals during a long time series, the support would be low, but the confidence might be very high. An example that illustrates the definitions is shown in Figure 7.

Figure 7 – An illustration of the definition of confidence and support. The precedence relationship of interest is the one between abstraction a and abstraction c over the two episode sets Ea and Ec. Only on the intervals x and y the temporal association rule is satisfied. The figure shows the two intervals RTS and TSO and in this example NAT is equal to 2 (x and z) and NARTS is equal to 1, because only antecedent x is involved in the rule [17].
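The two definitions can be sketched in Python as follows. The sketch mirrors the situation of Figure 7 (two antecedent episodes, one of which is involved in the rule), but the interval end points and the TSO of 100 minutes are made-up illustration values. It also assumes that the RTS contribution of one rule occurrence runs from the start of the antecedent episode to the end of the consequent episode; the function names are hypothetical.

```python
def union_length(intervals):
    """Total length covered by (start, end) intervals, counting overlaps once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def support_and_confidence(matched_pairs, antecedent_episodes, tso):
    """matched_pairs holds (antecedent, consequent) intervals that satisfy the rule."""
    # Assumption: each occurrence spans antecedent start to consequent end.
    rule_spans = [(a[0], c[1]) for a, c in matched_pairs]
    rts = union_length(rule_spans)
    narts = len({a for a, _ in matched_pairs})  # antecedent episodes involved in the rule
    nat = len(antecedent_episodes)              # all antecedent episodes during TSO
    return rts / tso, narts / nat


# Hypothetical end points (minutes): antecedent episodes x and z, consequent y.
x, z = (10, 20), (60, 70)
y = (25, 40)
support, confidence = support_and_confidence([(x, y)], [x, z], tso=100)
print(support, confidence)
```

With these values the rule covers 30 of the 100 minutes and involves one of the two antecedent episodes, which illustrates why the two indicators have to be read together.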


3. DATA AND METHODS

This chapter outlines the methods that were chosen to answer the research questions. It starts with a section on the techniques used for data collection and on how the data were processed. Then the methods for obtaining the descriptive statistics are explained, followed by the use of JTSA for TA and for identifying critical patients. As part of this, detecting hypoglycemia during the night with the help of HR data was also explored. Finally, the methodology for mining new patterns is described. The time was divided 90/10 between confirming patterns with TA and discovering new patterns with TAR. Half of the time for confirming patterns was spent on detecting the relation between heart rate and hypoglycemia during the night.

3.1 DATA

The data used for this project were provided by two different hospitals in Pavia, Italy. The Pediatric Diabetology division of the IRCCS Policlinico San Matteo hospital in Pavia provided the dataset of children. The datasets were collected within two separate clinical studies, each one approved by the Ethical Committee of the respective hospital. In both studies, the datasets consist of data measured by an activity tracker and data measured by a blood glucose measurement system. At each hospital, a web server that runs AID-GM was installed. An eligibility condition for the studies was that patients were already using the Abbott Freestyle Libre FGM system for monitoring their BG levels. This system enables users to download the collected measurements in a text file. This file can be uploaded to AID-GM, after which the data are saved into a MySQL database located at the center where the study is performed. To get the data from the FGM device, the patient had to scan the sensor on their upper arm at least every 8 hours, which provided one BG measurement for every 15 minutes. Patients were asked to regularly upload the text file downloaded from the sensor to the AID-GM system, and to provide a form with their usual time schedule concerning meals, snacks and sleep. BG events were tagged according to this schedule.

To collect the HR, activity and sleep data, patients also wore a Fitbit device. The versions used are the Fitbit Charge 2, Fitbit Charge 3 and Fitbit Charge HR. Each tracker was connected to an application on the patient’s mobile phone. Upon synchronization, the data of the Fitbit were uploaded to the Fitbit cloud. Patients provided their consent to download the Fitbit data and use them in AID-GM through the OAuth 2.0 protocol [26]. The Fitbit data were downloaded directly from the Fitbit cloud every night and were stored in the same MySQL database as the BG data. With the help of the Fitbit data it was possible to tag the blood glucose data with workout, sleep and routine, which are referred to as Fitbit tags. The workout tag is assigned to a glycemic event if it occurs during a tracked workout and the sleep tag is used for glycemic events during a tracked sleep session. When an event occurs while a patient is neither asleep nor doing a workout, it is tagged as routine. For each workout, the start time, intensity and duration were stored and for each sleep record, the time that a subject fell asleep, the time of waking up, the time spent in restless sleep and the amount of time awake during the night were registered. The Fitbit devices also registered the HR values continuously with an interval of 1 minute. For this research, a dump of the two databases was used. Since the thresholds for hyperglycemia, hypoglycemia, tachycardia and bradycardia differ per patient, these were set by clinicians. In total 34 patients were enrolled in this study.


3.2 DATA PREPARATION

The dump of the MySQL database is imported into MySQL Workbench 6.3 CE. The analyses are done in R, version 3.5.2, by using RStudio version 1.1.463. The first step for every analysis is to make a connection to the MySQL database.

3.3 DESCRIPTIVE STATISTICS

Three datasets are obtained. The first dataset comprises information about each patient, such as age, sex and thresholds for hypoglycemia, hyperglycemia, tachycardia and bradycardia. The second dataset contains the time series of all heart rate data for every patient and the third dataset contains the time series of all blood glucose data for every patient. The first dataset is combined with the other two, which results in two datasets that include per data type (HR or BG) the patient information, as well as:

- number of measurements;

- follow-up in days;

- median and the interquartile range;

- whether the follow-up period included breaks in days: a gap of at least a day in between two consecutive measurements;

- number of these breaks;

- total time of these breaks;

- number of days the sensor was worn;

- effective duration: follow-up in days minus the total length of breaks.

With both types of data combined, the number of days when both sensors were worn was calculated.
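The per-patient quantities listed above can be derived from the measurement timestamps alone. The following Python sketch is an illustration only (the thesis analyses were done in R); the function name `monitoring_summary` and the example timestamps are hypothetical, and a break is taken to be a gap of at least one day between two consecutive measurements, as defined in the list.

```python
from datetime import datetime, timedelta


def monitoring_summary(timestamps, min_break=timedelta(days=1)):
    """Summarise one patient's measurement timestamps for one data type."""
    ordered = sorted(timestamps)
    one_day = timedelta(days=1)
    follow_up = ordered[-1] - ordered[0]
    # Time between each pair of consecutive measurements.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    breaks = [gap for gap in gaps if gap >= min_break]
    total_break = sum(breaks, timedelta())
    return {
        "n_measurements": len(ordered),
        "follow_up_days": follow_up / one_day,
        "n_breaks": len(breaks),
        "break_days": total_break / one_day,
        "effective_days": (follow_up - total_break) / one_day,
        "days_worn": len({t.date() for t in ordered}),
    }


# Hypothetical timestamps with a three-day break between 2 and 5 January.
stamps = [
    datetime(2019, 1, 1, 8), datetime(2019, 1, 1, 20),
    datetime(2019, 1, 2, 8),
    datetime(2019, 1, 5, 8), datetime(2019, 1, 5, 20),
]
summary = monitoring_summary(stamps)
print(summary)
```

The number of days on which both sensors were worn could then be obtained by intersecting the sets of dates of the HR and BG series.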

3.4 JTSA FOR TEMPORAL ABSTRACTION DETECTION AND IDENTIFYING CRITICAL PATIENTS

We evaluated 11 patterns that have been defined in agreement with diabetologists and are relevant for evaluating the diabetes outcome. Some of the patterns only use BG or HR data, while others combine the HR and BG trends. Thanks to the Fitbit tag, it is also possible to search for patterns during a specific condition (e.g. while asleep or during routine). The patterns that are used are listed in Table 3. Basic patterns can be extracted with the use of only one type of time series, while complex patterns, which consist of a combination of patterns, are potentially extracted by using different types of time series. Most of the patterns make use of thresholds for hypo- and hyperglycemic episodes, and tachy- and bradycardia. The thresholds for tachy- and bradycardia are patient-specific and defined by diabetologists. The algorithms for pattern detection have been tested and validated in research by Sacchi et al. [17, 21, 27].

A report with information on each episode of one of the patterns was obtained by running all the patterns with JTSA. With the use of R, we analyzed how often patterns occur and the duration of these patterns. Two different analyses were performed by using the information on patterns. The patterns Hypoglycemia, Hyperglycemia and BG Normal were used as indicators of the glycemic control of patients by calculating the percentage of time patients spend in each BG level. Furthermore, the hypoglycemia pattern during sleep is used to indicate the percentage of nights that a patient had one or more hypoglycemia episodes.
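Both indicators can be computed directly from the episode report. The Python sketch below is a hedged illustration with made-up episode durations and night identifiers; the function names and the input layout (one tuple per episode) are assumptions and do not reflect the actual JTSA report format.

```python
def percentage_time_per_level(episodes):
    """episodes: list of (level, duration_in_minutes) for one patient."""
    totals = {}
    for level, minutes in episodes:
        totals[level] = totals.get(level, 0) + minutes
    grand_total = sum(totals.values())
    return {level: 100 * minutes / grand_total for level, minutes in totals.items()}


def percentage_nights_with_hypoglycemia(night_ids, hypo_night_ids):
    """Share of tracked nights with one or more hypoglycemia episodes."""
    nights = set(night_ids)
    return 100 * len(nights & set(hypo_night_ids)) / len(nights)


# Hypothetical episodes for one patient.
episodes = [
    ("Hypoglycemia", 30), ("BG Normal", 120), ("Hyperglycemia", 150),
    ("BG Normal", 60), ("Hypoglycemia", 40),
]
time_share = percentage_time_per_level(episodes)
# Hypothetical nights 1 to 4; two episodes fall in night 2 and one in night 4.
night_share = percentage_nights_with_hypoglycemia([1, 2, 3, 4], [2, 2, 4])
print(time_share, night_share)
```

A night with several episodes is counted once, which matches the definition "one or more hypoglycemia episodes".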

| Pattern type | Pattern | Input data | Visual representation |
|---|---|---|---|
| Basic patterns | Hypoglycemia | BG | [image] |
| Basic patterns | Hyperglycemia | BG | [image] |
| Basic patterns | Increasing | BG or HR | [image] |
| Basic patterns | Decreasing | BG or HR | [image] |
| Basic patterns | Bradycardia | HR | [image] |
| Basic patterns | Tachycardia | HR | [image] |
| Basic patterns | Normal | BG or HR | [image] |
| Complex and/or multivariate patterns | Rebound Effect (hypoglycemia followed by hyperglycemia) | BG | [image] |
| Complex and/or multivariate patterns | Dawn Effect (normal BG values at night followed by hyperglycemia when waking up) | BG + sleep + routine | [image] |
| Complex and/or multivariate patterns | Tachycardia PRECEDES hypoglycemia | BG + HR | [image] |
| Complex and/or multivariate patterns | Hypoglycemia PRECEDES bradycardia | BG + HR | [image] |

Table 3 – The patterns of interest to evaluate the diabetes outcome. The input data states the source of the data (HR for the Fitbit device and BG for the Freestyle Libre FGM device) and if applicable if a specific Fitbit tag was used (sleep, routine or workout). In the visual representation, the red dots represent BG measurements, while blue dots represent HR measurements. The increasing, decreasing and normal pattern can be used for either BG or HR time series.


3.5 DETECT THE RELATION BETWEEN HEART RATE AND HYPOGLYCEMIA DURING THE NIGHT

A request from a pediatrician of the IRCCS Policlinico San Matteo hospital in Pavia was the basis for the definition of a new pattern to understand the possible relations between heart rate and hypoglycemic episodes during sleep. The pattern we had to create is intended to detect a situation as sketched in Figure 8. In this pattern, the heart rate increases before a patient experiences a hypoglycemic episode. After consulting another physician, a second approach was added, which relies on the hypothesis that the variation in HR before a hypoglycemia episode is higher than during other parts of the night.

Figure 8 – Sketch of the pattern where an increase of the heart rate happens before a hypoglycemia episode.

3.5.1 NEW PATTERN DEFINITION

With the use of the already defined blocks for TA, two different workflows for detecting the above described pattern have been defined:

1. Increase of heart rate values DURING a (decrease of blood glucose values PRECEDES a period of hypoglycemia) (shown in Figure 9).

2. Increase of heart rate values BEFORE a (decrease of blood glucose values PRECEDES a period of hypoglycemia) (shown in Figure 10).

Figure 10 – Workflow for HR increase BEFORE (BG decrease PRECEDES hypoglycemia).

The workflows were first run with a renderer added to JTSA. In this way, the output was shown as graphs and it was possible to verify whether the pattern detected the correct episodes and whether the highlighted output corresponded to a period in the BG and HR values where the requested pattern occurred. Once this was confirmed, the workflows were run with the version of JTSA that outputs all patient numbers with the corresponding date and time of the beginning and the end of each detected episode.

3.5.2 VARIABILITY IN HEART RATE BEFORE HYPOGLYCEMIA

Another hypothesis was that in the hour before hypoglycemia during the night, heart rate values show higher variability than at other times of the night that are not close to a period of hypoglycemia. We performed two different types of analyses: one using the percentage variation of the heart rate before a period of hypoglycemia and one using all heart rate values before a period of hypoglycemia. For both cases, the nights with at least one episode of hypoglycemia were separated from the nights without a period of hypoglycemia with the following steps:

1. Get all hypoglycemia episodes during sleep by using the pattern HypoglycemiaS.

2. Get HR data during sleep.

3. Mark all HR measurements that are the first one of a period of sleep.

4. For each patient:

   a. Get just the first measurements of each period of sleep.

   b. For each start time of a hypoglycemia episode:

      i. Check to which period of sleep this hypoglycemia belongs.

      ii. Put this night in the data frame that consists of the nights with one or more episodes of hypoglycemia.

5. Create the normal-nights data frame by taking the difference between all nights and the nights with hypoglycemia episodes.
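The steps above can be sketched in Python for a single patient. This is an illustration rather than the R code used in the thesis: it assumes that each sleep period is available as a (start, end) pair in minutes, whereas the thesis identifies a night through the first HR measurement of the sleep period. The function name `split_nights` and all example values are hypothetical.

```python
def split_nights(sleep_periods, hypo_starts):
    """Separate nights with one or more hypoglycemia episodes from normal nights.

    sleep_periods: list of (sleep_start, sleep_end) in minutes.
    hypo_starts:   start times of hypoglycemia episodes during sleep.
    """
    hypo_nights = []
    for period in sleep_periods:
        sleep_start, sleep_end = period
        # Step 4b: a night qualifies if any episode starts within it.
        if any(sleep_start <= t <= sleep_end for t in hypo_starts):
            hypo_nights.append(period)
    # Step 5: normal nights are all nights minus the nights with hypoglycemia.
    normal_nights = [p for p in sleep_periods if p not in hypo_nights]
    return hypo_nights, normal_nights


# Hypothetical data: three nights; one episode in the first, two in the third.
sleep_periods = [(0, 480), (1440, 1920), (2880, 3360)]
hypo_starts = [200, 3000, 3100]
hypo_nights, normal_nights = split_nights(sleep_periods, hypo_starts)
print(hypo_nights, normal_nights)
```

A night with several episodes appears only once in the list of nights with hypoglycemia.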

To consider periods of time sufficiently close to a hypoglycemic episode, only the data of 120 minutes before the start of a hypoglycemia was needed and the rest of the data was deleted from the report with hypoglycemia episodes. The next steps are different per analysis, and are described in the two following sections.


3.5.2.1 PERCENTAGE VARIATION

The data of 120 minutes before nocturnal hypoglycemia episodes were split into four different sections of 30 minutes as shown in Figure 11. The two sections furthest from the hypoglycemia episode are the period “far from hypoglycemia” and the two sections closest to the hypoglycemia episode are “close to hypoglycemia”.

Figure 11 – An illustration of the periods used as far from a hypoglycemia episode and close to a hypoglycemia episode.

The periods of hypoglycemia starting less than 120 minutes after another period of hypoglycemia were excluded, as well as periods of hypoglycemia where no heart rate data for the full 120 minutes before the start were available. To be able to calculate the percentage variation far from a hypoglycemia and close to a hypoglycemia, we computed the average heart rate in each of the 30-minute time windows:

$$\text{percentage variation far from hypoglycemia} = \frac{\bar{b} - \bar{a}}{\bar{a}} \times 100\%$$

$$\text{percentage variation close to hypoglycemia} = \frac{\bar{d} - \bar{c}}{\bar{c}} \times 100\%$$

All values of the percentage variation far from a hypoglycemia were compared to all values of the percentage variation close to hypoglycemia by using the Wilcoxon Matched-Pairs test. The same was done for a period of 60 minutes before a hypoglycemia, split into periods of 15 minutes. Finally, we also calculated the percentage variation between the period 90 to 60 minutes before the hypoglycemia episode and the period 60 to 30 minutes before the hypoglycemia episode. This resulted in three values of percentage variation per hypoglycemia episode, as shown in Figure 12. Over these three values of all hypoglycemia episodes, the 75th percentile was calculated and the points above this value were analyzed.
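The percentage variation can be illustrated with a short Python sketch. It assumes that a, b, c and d denote the four consecutive 30-minute windows in chronological order, with d ending at the start of the hypoglycemia episode, and that HR is sampled once per minute. The function names and the heart rate values are hypothetical, and the statistical test itself is not part of the sketch.

```python
from statistics import mean


def window_means(hr_samples, hypo_start, window=30, n_windows=4):
    """Mean HR per window before hypo_start, oldest window first.

    hr_samples: list of (minute, bpm) pairs.
    """
    means = []
    for i in range(n_windows, 0, -1):
        lo = hypo_start - i * window
        hi = hypo_start - (i - 1) * window
        values = [bpm for minute, bpm in hr_samples if lo <= minute < hi]
        means.append(mean(values))  # raises StatisticsError if a window has no data
    return means


def percentage_variation(first, second):
    """Relative change from the earlier to the later window, in percent."""
    return (second - first) / first * 100


# Hypothetical HR series: one sample per minute for the 120 minutes before
# an episode that starts at minute 120.
levels = [60, 66, 70, 63]  # constant HR within each 30-minute window
hr_samples = [(minute, levels[minute // 30]) for minute in range(120)]

a, b, c, d = window_means(hr_samples, hypo_start=120)
far = percentage_variation(a, b)    # roughly +10%
close = percentage_variation(c, d)  # roughly -10%
print(far, close)
```

Because an empty window raises an error, episodes without HR data for the full 120 minutes would have to be excluded beforehand, in line with the exclusion rule described above.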

Figure 12 – An illustration of a period 120 minutes before a hypoglycemia episode where the 120 minutes are split into three parts: far from the hypoglycemia episode, medium from the hypoglycemia episode and close to the hypoglycemia episode.


3.5.2.2 ALL HEART RATE VALUES

The same data were used as in the previous section, but for this analysis the raw heart rate data were used instead of the percentage variation. For this analysis, it was only necessary to split the data in two groups, as shown in Figure 13. All individual heart rate values for the two periods of 60 minutes were taken into account and the heart rates in the period far from a hypoglycemia were compared to the heart rates close to a hypoglycemia using a Mann-Whitney test. The same was done for a period of 60 minutes before a hypoglycemia, split into two periods of 30 minutes.

Figure 13 – In the analysis of all heart rates, the period before a hypoglycemia episode is split into two groups of 60 minutes.

3.6 MINING NOVEL PATTERNS

Up to now, we have defined the patterns to look for and run JTSA to extract them. The result of this process is that the occurrences of the required patterns are extracted, but no new patterns are discovered. To discover whether there are unexpected patterns in the data, we decided to run TAR on a set of initial patterns of interest, without making any prior hypothesis on what to look for. We decided to start by analyzing the BG data. First, a set of patterns of interest is selected. These patterns are retrieved from the data through TA. Then, temporal rules may be specified by setting some context-dependent parameters in the properties file, such as the minimum confidence and minimum support. The parameters set were:

- Minimum support = 0.6;

- Minimum confidence = 0.7.

When running JTSA, an Apriori-like algorithm looks for meaningful temporal relationships among the patterns of interest. This will create a report that states the input data (patients, patterns and the start, end and duration of these patterns) and the output data. This output data consists of the rules that are found, the support and confidence and the RTS. With the use of R, this report is analyzed in two different ways: the results per rule per patient and the results per rule in total.


4. RESULTS

The following results concern the pediatric dataset and follow the order of analyses given in the methods. A dataset with adult data was also available; the results of the analyses on descriptive statistics and JTSA for TA detection for the adult dataset are stated in Appendix A. All 34 patients in the pediatric dataset were treated in the IRCCS Policlinico San Matteo hospital in Pavia. Four patients were excluded from the analyses because they never uploaded any data, and a further 10 patients only uploaded BG data. Out of the 20 patients that had uploaded both BG and HR data, 17 used the Fitbit tracker and the Freestyle Libre device simultaneously. In the following, we show the results for this group of patients.

4.1 DESCRIPTIVE STATISTICS

In the group of 17 patients with data from both of the devices simultaneously, the median age at the moment of enrollment was 12 years with an interquartile range of [10 – 13] years and 9 (53%) of them were female. The distribution of age and sex is illustrated in Figure 14. There are no males in the age groups of 11-12, 15-16 and 17-18 and no females in the age group of 17-18 and 21+.

Figure 14 – Age and sex distribution of the pediatric patients who simultaneously collected heart rate and blood glucose data. [Bar chart: count (0 to 3) per age group (7-8, 9-10, 11-12, 13-14, 15-16, 17-18, 19-20, 21+), for male and female patients.]

Table 4 shows the median follow-up period, in days, of patients monitoring their HR and physical activity through a Fitbit device and their BG levels with a Freestyle Libre. The table also shows that the two sensors are not always used simultaneously. Since not all patients were using the devices continuously, the follow-up period of HR and BG monitoring is not the same as the effective duration of HR and BG monitoring. Five (29%) patients had breaks of one day or more in HR data, so 12 (71%) of the patients used a Fitbit device for the entire follow-up period. For BG monitoring, 6 (35%) patients had breaks in the data during the follow-up period. More detailed information on patient level is stated in Appendices B – D.

| Monitoring | Duration (days) |
|---|---|
| HR | 47 [37 – 56] |
| BG | 53 [39 – 61] |
| Only HR | 2 [1 – 16] |
| Only BG | 7 [7 – 7] |
| HR and BG simultaneously | 45 [23 – 53] |
| Effective HR | 47 [33 – 56] |
| Effective BG | 53 [26 – 61] |

Table 4 – Monitoring characteristics of the patient group. The median is provided, as well as the interquartile range in brackets.

4.2 JTSA FOR TEMPORAL ABSTRACTION DETECTION

To further characterize the patient population, JTSA was used to perform an analysis of the BG and HR data through pattern detection. As stated before, the thresholds that define tachycardia and bradycardia are patient-specific and are defined by clinicians. The normal BG range is between 70 and 180 mg/dL.

When there is at least one measurement of BG < 50 mg/dL, it is defined as an episode of Severe Hypoglycemia, whereas at least one measurement of BG > 250 mg/dL defines an episode of Severe Hyperglycemia. BG Increasing episodes are defined as an increase of the BG level of at least 15 mg/dL every 15 minutes for at least 35 minutes. BG Decreasing episodes are defined in the same way, but then with a decrease of the BG level. HR Increasing and HR Decreasing episodes are defined as a variation of the HR level of at least 1.3 bpm every minute for a period of at least 6 minutes.
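The threshold-based definitions can be illustrated with a small Python sketch that groups consecutive measurements satisfying a condition into episodes. This is a simplified stand-in, not the JTSA algorithm, which may treat gaps and minimum durations differently. The function name `threshold_episodes` and the BG series are hypothetical; the thresholds of 50, 70 and 250 mg/dL are those given in the text.

```python
def threshold_episodes(samples, predicate):
    """Group consecutive samples that satisfy predicate into episodes.

    samples: list of (minute, value) sorted by time.
    Returns a list of (first_minute, last_minute), one per maximal run.
    """
    episodes = []
    run_start = run_end = None
    for minute, value in samples:
        if predicate(value):
            if run_start is None:
                run_start = minute
            run_end = minute
        elif run_start is not None:
            episodes.append((run_start, run_end))
            run_start = None
    if run_start is not None:  # close a run that lasts until the final sample
        episodes.append((run_start, run_end))
    return episodes


# Hypothetical BG series (mg/dL), one measurement every 15 minutes.
bg = [(0, 90), (15, 65), (30, 48), (45, 72), (60, 260), (75, 255), (90, 170)]

hypo = threshold_episodes(bg, lambda v: v < 70)
severe_hypo = threshold_episodes(bg, lambda v: v < 50)
severe_hyper = threshold_episodes(bg, lambda v: v > 250)
print(hypo, severe_hypo, severe_hyper)
```

A single measurement below 50 mg/dL is enough to produce a Severe Hypoglycemia episode here, in line with the "at least one measurement" wording above.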

Table 5 summarizes the study population by providing patterns of BG and HR data. Not all patterns occur in the data of every patient; the more complex patterns in particular occur only a few times and in only a few patients. The table shows that some patterns happen more frequently than others and that different patterns have different episode durations. The 5 most frequent patterns are all HR related, because shorter periods of time are required for these patterns. Among the basic patterns, the one with the longest median episode duration is Hyperglycemia; over all patterns in Table 5, the Dawn Effect has the longest median episode duration. Further patterns have also been extracted (e.g. Stationary patterns, HR Normal PRECEDES Tachycardia), but they were considered less significant from a clinical viewpoint.

| Pattern | Number of patients | Total amount of episodes | Episode duration in min, median [interquartile range] |
|---|---|---|---|
| HR Normal | 17 | 21243 | 16 [8 - 35] |
| HR Decreasing | 17 | 9076 | 7 [6 - 8] |
| Tachycardia | 16 | 8898 | 9 [6 - 16] |
| HR Increasing | 17 | 8442 | 7 [6 - 8] |
| Bradycardia | 15 | 5301 | 12 [7 - 26] |
| BG Normal | 17 | 3059 | 105 [45 - 210] |
| BG Increasing | 17 | 2888 | 61 [45 - 61] |
| BG Decreasing | 17 | 2775 | 75 [46 - 108] |
| Hyperglycemia | 17 | 2204 | 181 [61 - 421.25] |
| Severe Hyperglycemia | 17 | 1621 | 135 [47 - 302] |
| Hypoglycemia | 16 | 821 | 32 [15 - 76] |
| Severe Hypoglycemia | 11 | 208 | 45 [17.75 - 105] |
| Tachycardia Precedes Hypoglycemia | 10 | 98 | 38 [31 - 55.75] |
| Dawn Effect | 12 | 20 | 279 [223 - 450.25] |
| Hypoglycemia Precedes Bradycardia | 1 | 1 | 35 [35 - 35] |

Table 5 – The found patterns and their occurrence.

The Hypoglycemia, BG Normal and hyperglycemia patterns can be used as indicators of glycemic control in patients. For each patient in this study, the percentages of time the BG levels are too low (hypoglycemia), normal (normoglycemia) or too high (hyperglycemia) are shown in Figure 15. From this figure, it is possible to observe that 6 patients (35%) show almost no hypoglycemic episodes (under 1% of the time), and 1 patient (6%) experiences hypoglycemic events for over 15% of the time. From this picture it is also easy to identify that 12 patients (71%) spend a majority of the time with high BG values and that the other 5 patients (29%) spend the majority of time with normal BG values.

Figure 15 – Percentages of time patients of the pediatric group spend in hypo-, normo- or hyperglycemia.

When patients are using the FGM and the Fitbit device at the same time, it is possible to extract patterns that take BG, HR and tracked sleep into account. By using tracked sleep and BG levels, it is possible to compare the patients in terms of the percentage of nights in which nocturnal hypoglycemia episodes occur. Figure 16 shows the number of nights in which a Fitbit and Freestyle Libre were both used (blue bars) and the percentage of nights with one or more hypoglycemic events during sleep (red line). Two patients (8 and 9) did not experience any nocturnal hypoglycemia episodes, while the other patients experienced nocturnal episodes in 2% to 44% of the nights. When comparing patient 12 to patient 14, it can be seen that the former used the Fitbit for fewer nights than the latter, but the proportion of nights with one or more hypoglycemic events is higher for patient 12 than for patient 14.

Figure 16 – The number of nights both sensors were worn and the percentage of these nights one or more hypoglycemic events were detected per patient in the pediatric group.

[Figure 15 axes: patient (1 to 17) versus percentage of time (0% to 100%); legend: Hypoglycemia, Normoglycemia, Hyperglycemia.]

[Figure 16 axes: patient (1 to 17) versus number of nights wearing both sensors (0 to 140) and percentage of nights with hypoglycemia (0% to 50%).]

4.3 DETECT THE RELATION BETWEEN HEART RATE AND HYPOGLYCEMIA DURING THE NIGHT

By only using the eleven existing patterns, it was not possible to detect the relation between heart rate and hypoglycemia during the night. Besides creating a new pattern, this relation was also analyzed by using the difference in percentage variation before hypoglycemia and by using the HR values before hypoglycemia. The results of the last two analyses have been split into two hours and one hour before the start of the hypoglycemia episode.

4.3.1 CREATING A NEW PATTERN

By adding a renderer to JTSA, it was possible to view whether the created patterns worked correctly. One HR increase BEFORE (BG decrease PRECEDES hypoglycemia) pattern is shown in Figure 17 and shows that the pattern is highlighting the correct time span of HR increase BEFORE (BG decrease PRECEDES hypoglycemia).

The result of running both the HR increase BEFORE (BG decrease PRECEDES hypoglycemia) workflow and the HR increase DURING (BG decrease PRECEDES hypoglycemia) workflow is shown in Table 6. The DURING pattern occurs more frequently than the BEFORE pattern for all patients. While there are occurrences of the patterns partly during the period when the patients are asleep, there are no occurrences of the patterns entirely during sleep.

Figure 17 – The result for one episode of HR increase BEFORE (BG decrease PRECEDES hypoglycemia). Left: the BG trend with the patterns BG decrease, hypoglycemia and BG decrease PRECEDES hypoglycemia. Right: the HR trend with the patterns HR increase, BG decrease PRECEDES hypoglycemia and the end result HR increase BEFORE (BG decrease PRECEDES hypoglycemia).

| Patient number | HR increase BEFORE (BG decrease PRECEDES hypoglycemia) | HR increase DURING (BG decrease PRECEDES hypoglycemia) | BEFORE pattern during sleep | DURING pattern during sleep |
|---|---|---|---|---|
| 1 | 2 | 7 | 0 | 0 |
| 2 | 1 | 7 | 0 | 0 |
| 3 | 0 | 3 | 0 | 0 |
| 4 | 1 | 11 | 0 | 0 |
| 5 | 1 | 12 | 0 | 0 |
| 6 | 0 | 9 | 0 | 0 |
| 10 | 0 | 7 | 0 | 0 |
| 11 | 2 | 3 | 0 | 0 |
| 12 | 0 | 3 | 0 | 0 |
| 13 | 0 | 1 | 0 | 0 |
| 14 | 0 | 13 | 0 | 0 |

Table 6 – The results of the two created patterns per pediatric patient.

4.3.2 PERCENTAGE VARIATION

The results of the analysis on percentage variation are split into two parts: two hours before hypoglycemia and one hour before hypoglycemia.

Two hours before hypoglycemia

During sleep, 14 out of 17 patients experienced hypoglycemia episodes. In total, 110 hypoglycemia episodes were detected, but only 40 episodes of hypoglycemia during sleep from 10 patients were analyzed when taking the HR data of two hours before the hypoglycemia episode. Excluded were:

- 19 episodes because there was a previous hypoglycemia episode within 120 minutes;

- 49 episodes because the hypoglycemia episode occurred within 120 minutes of falling asleep;

- 1 episode because HR data was not available from the 120 minutes before the hypoglycemia episode.

Figure 18 shows the percentage variation close to and far from hypoglycemia episodes during sleep when taking into account the heart rate 120 minutes before a hypoglycemia episode. The period from 120 to 60 minutes before the start of a hypoglycemic event is taken as far from hypoglycemia and the period from 60 to 0 minutes before the start of a hypoglycemic event is taken as close to hypoglycemia. It shows that for 6 patients the percentage variation is higher far from the hypoglycemia episodes than close to the hypoglycemia episodes, while for 4 patients it is the other way around.

Figure 18 – The percentage variation per patient close to and far from hypoglycemia episodes during sleep, where far from hypoglycemia is 120 to 60 minutes before a hypoglycemia episode and close to hypoglycemia is 60 to 0 minutes before the start of a hypoglycemia episode. N indicates the total amount of hypoglycemia episodes analyzed per patient.

The result of plotting all the percentage variations together, grouped as far from and close to a hypoglycemia episode, is shown in Figure 19. As can be seen, the two boxplots are very similar. The Wilcoxon Matched-Pairs test gave a p-value of 0.9417, so there is no statistically significant difference between the percentage variation far from hypoglycemia episodes and the percentage variation close to the hypoglycemia episodes.

Figure 19 – The percentage variation per group (far from or close to a hypoglycemia episode) for all patients together when taking a timeframe of 120 minutes.

When also plotting the percentage variation for the group in between far and close, so 90 to 30 minutes before hypoglycemia, the boxes still look similar. This can be seen in Figure 20. Table 7 shows the number of percentage variation measurements above the third quartile. When adding the values together per group, there are 12 percentage variation measurements above the third quartile close to the hypoglycemia and 10 in each of the other two groups, so the groups barely differ.

Figure 20 – The percentage variation for the groups far, medium and close, where the far group is 120 to 60 minutes before the start of a hypoglycemia episode, the medium group is 90 to 30 minutes before the start of a hypoglycemia episode and the close group is 60 to 0 minutes before the start of a hypoglycemia episode.

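| Patient number | Close | Medium | Far |
|---|---|---|---|
| 1 | 1 | 1 | 2 |
| 2 | 2 | 3 | 1 |
| 3 | 3 | 3 | 1 |
| 4 | 1 | 0 | 1 |
| 5 | 2 | 2 | 0 |
| 6 | 1 | 0 | 1 |
| 7 | 0 | 0 | 1 |
| 10 | 0 | 0 | 2 |
| 14 | 1 | 1 | 0 |
| 16 | 1 | 0 | 1 |

Table 7 – The amount of measurements above the third quartile per patient, per group.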

One hour before hypoglycemia

When taking HR data of one hour before the hypoglycemia episode, 61 hypoglycemia episodes were taken into account and they are divided between 11 patients. Out of the 110 episodes,

- 9 were excluded because there was a previous period of hypoglycemia within an hour;

- 37 were excluded because the period of sleep started within an hour before the hypoglycemia episode;

- 3 were excluded because there was no sufficient data available.

The results of the percentage variation in the two groups (far from and close to a hypoglycemia episode) are shown in Figure 21. Far from hypoglycemia is in this case 60 to 30 minutes before the start of an episode and close to hypoglycemia is 30 to 0 minutes before the start of an episode. In this figure it can be seen that there is no consensus on the increase or decrease of the percentage variation. For 6 patients the percentage variation is higher far from the hypoglycemia episode than close to the hypoglycemia episode, while for 5 patients this is the other way around.

Figure 21 – The percentage variation per patient close to and far from a period with hypoglycemia where n is the amount of hypoglycemia episodes during sleep per patient. Far from hypoglycemia is 60 to 30 minutes before an episode and close to hypoglycemia is 30 to 0 minutes before an episode.

Figure 22 shows the results of pooling the percentage variation per group for all patients. As with the period of 2 hours before a hypoglycemia episode, the median values are similar. The Wilcoxon Matched-Pairs test gave a p-value of 0.3286, so there is no statistically significant difference between the percentage variation 60 to 30 minutes before hypoglycemia and the percentage variation 30 to 0 minutes before hypoglycemia.

Figure 22 – The percentage variation of all patients per group when taking a timeframe of 60 minutes before the start


4.3.3 ALL HEART RATE VALUES

Two hours before hypoglycemia

All heart rate values from two hours before a hypoglycemia episode until the start of the hypoglycemia episode were also analyzed. A measurement of the heart rate was made every minute. These were also split into far from and close to the hypoglycemia episode, where far from hypoglycemia includes all heart rates 60 to 120 minutes before the start of the hypoglycemia episode, while close to hypoglycemia entails all heart rates 1 to 60 minutes before an episode. The results are shown in Figure 23 and it can be seen that the median heart rate is decreasing in 4 patients and increasing in 6 patients.

Figure 23 – All heart rate values starting from 120 minutes before a hypoglycemia episode until the episode, divided into two groups that each represent a timeframe of 60 minutes: far from and close to the start of the period of hypoglycemia.

Figure 24 shows the result after plotting all heart rate values for the groups far from hypoglycemia and close to hypoglycemia. It shows that the boxes of both groups are similar. The result of the Mann-Whitney test was a p-value of 0.4205, so there is no statistically significant difference between the heart rate values far from hypoglycemia episodes and the heart rate values close to the hypoglycemia episodes.


Figure 24 – All heart rates per group of 60 minutes, where far are the heart rate values from 120 to 60 minutes before the start of a hypoglycemia episode and close are the heart rate values 60 to 0 minutes before the start of a hypoglycemia episode.

One hour before hypoglycemia

The analysis of all heart rate values was also done for the period from 60 minutes before a hypoglycemia episode until the start of the episode. In this case, the period of 60 minutes is split into two periods of 30 minutes to create two groups. The result of this analysis is shown in Figure 25. In this analysis too, the trend of the values is the same for both periods.


Figure 25 – Analysis of all heart rate values one hour before hypoglycemia where far means 60 to 30 minutes before and close 30 to 0 minutes before the start of a hypoglycemia episode.

Figure 26 shows the result of plotting all heart rate values for the groups far from hypoglycemia and close to hypoglycemia. The boxes of both groups are similar. The Mann-Whitney test gave a p-value of 0.3452, so there is no statistically significant difference between the heart rate values far from hypoglycemia episodes and the heart rate values close to hypoglycemia episodes.

Figure 26 – All heart rates per group where far means 60 to 30 minutes before hypoglycemia and close 30 to 0 minutes


4.4 MINING NOVEL PATTERNS

Novel patterns were mined from all BG data of children through TAR. All BG patterns formed the set of initial patterns of interest. The rules found are listed in Table 8, together with the number of times each rule was found, the total RTS, the support and the confidence. The rule with the highest frequency is BG Decreasing AND Hyperglycemia PRECEDES BG Normal, and the rule with the longest time span is BG Increasing PRECEDES Hyperglycemia. The latter rule also has the highest support and confidence. This means that the relationship holds over long intervals of time.

| Rule | Count | Total RTS (min) | Support | Confidence |
|---|---|---|---|---|
| BG Decreasing AND Hyperglycemia PRECEDES BG Normal | 792 | 97023 | 0.73 | 0.84 |
| BG Increasing PRECEDES Hyperglycemia | 776 | 127923 | 0.77 | 0.89 |
| BG Decreasing PRECEDES BG Normal | 683 | 110079 | 0.73 | 0.84 |
| BG Increasing PRECEDES Severe Hyperglycemia | 512 | 82066 | 0.68 | 0.79 |
| BG Increasing PRECEDES BG Decreasing | 466 | 66957 | 0.64 | 0.74 |
| BG Increasing AND Hyperglycemia PRECEDES BG Decreasing | 464 | 62569 | 0.68 | 0.79 |
| Hyperglycemia PRECEDES BG Decreasing | 442 | 55852 | 0.64 | 0.74 |
| Hyperglycemia PRECEDES BG Normal | 353 | 46000 | 0.64 | 0.74 |
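To make the idea of counting occurrences of a PRECEDES rule concrete, the sketch below matches antecedent and consequent episodes represented as time intervals. It is an illustration only and not the TAR implementation used in this project: the interval format, the function names, the `max_gap` parameter and the precedence conditions (the antecedent starts and ends no later than the consequent, and the gap between them is bounded) are assumptions. The support and confidence reported in the table are not recomputed here, and the span summed below is not necessarily the RTS as defined in this thesis.

```python
def precedes(antecedent, consequent, max_gap):
    """Check an assumed PRECEDES relation between two (start, end) intervals.

    Times are in minutes. The three conditions below are assumptions made
    for this sketch.
    """
    a_start, a_end = antecedent
    c_start, c_end = consequent
    gap = c_start - a_end  # negative when the intervals overlap
    return a_start <= c_start and a_end <= c_end and gap <= max_gap


def match_rule(antecedents, consequents, max_gap=60):
    """Count matching episode pairs and sum the time they span together."""
    matches = [
        (a, c)
        for a in antecedents
        for c in consequents
        if precedes(a, c, max_gap)
    ]
    # Span from the start of the antecedent to the end of the consequent.
    total_span = sum(c[1] - a[0] for a, c in matches)
    return len(matches), total_span


# Usage with made-up episodes (minutes since the start of monitoring).
bg_increasing = [(0, 30), (100, 130)]
hyperglycemia = [(30, 90), (400, 450)]
print(match_rule(bg_increasing, hyperglycemia))  # (1, 90)
```

In the example, only the first BG Increasing episode is followed closely enough by a Hyperglycemia episode, so the rule is counted once with a span of 90 minutes.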
