
Impact of Medication Grouping on Fall Risk Prediction in Elders: A Retrospective Analysis of MIMIC-III Critical Care Database


Academic year: 2021


University of Amsterdam

Masters Thesis

Impact of Medication Grouping on Fall Risk Prediction in Elders: A Retrospective Analysis of MIMIC-III Critical Care Database

Student: Noman Dormosh
Student No.: 11412682
SRP Mentor: Dr. Martijn C. Schut
SRP Tutor: Prof. dr. Ameen Abu-Hanna

SRP Address:

Amsterdam University Medical Center - Location AMC Department Medical Informatics

Meibergdreef 9, 1105 AZ Amsterdam

Practice teaching period: November 2018 - June 2019

A thesis submitted in fulfillment of the requirements for the degree of Master of Medical Informatics


Abstract

Background: Falls are the leading cause of injury in elderly patients. Risk factors for falls include, among others, a history of falls, old age, and female gender. Research studies have also linked certain medications, known as fall-risk-increasing drugs (FRIDs), to an increased risk of falls; examples are psychotropics and cardiovascular drugs. However, the definitions of FRIDs are inconsistent across studies, and many studies did not use any systematic classification of medications.

Objective: The aim of this study was to investigate the effect of grouping medications at different levels of granularity of a medication classification system on the performance of fall risk prediction models.

Methods: This is a retrospective analysis of the MIMIC-III cohort database. We created seven prediction models that included demographic, comorbidity and medication variables. Medications were grouped using the Anatomical Therapeutic Chemical (ATC) classification system, starting from the most specific scope of medications and moving up to the more generic groups: one model used individual medications (ATC level 5), four models used medication grouping at levels one, two, three and four of the ATC, and one model did not include medications. We compared the predictive performance of these models with the performance of a model with predefined ATC groups of FRIDs based on the literature and expert review (Expert-Opinion). Medications and comorbidities were extracted from the discharge summaries. Logistic regression with the least absolute shrinkage and selection operator (LASSO) was used to construct the prediction models. The main performance measure was the area under the receiver operating characteristic curve (AUC-ROC). Additionally, we systematically evaluated performance using the area under the precision-recall curve (AUC-PR), the Akaike Information Criterion (AIC), sensitivity, specificity, precision and the Brier score. Calibration was assessed using calibration plots. To internally validate the results, we performed bootstrapping to obtain optimism-corrected estimates and used the Wilcoxon signed-rank test to assess the significance of the median differences in estimates between each model and the Expert-Opinion FRIDs model.

Results: The highest statistically significant AUC-ROC was achieved by grouping medications at level two of the ATC [0.7143 (95% CI: 0.710-0.718) compared to 0.698 (95% CI: 0.6927-0.7018) for the Expert-Opinion model]. In terms of AUC-PR, all the models with medications significantly outperformed the Expert-Opinion model, except the ATC-1 model. The values of the other performance measures varied between the models depending on the grouping level. All the models showed satisfactory calibration.

Conclusion: The performance of a fall risk prediction model is significantly affected by medication grouping. Discrimination performance is enhanced by grouping the medications using the ATC classification system instead of using a FRIDs list. The optimal grouping level can be determined through experimentation.


Summary

Background: Falls in older adults are a leading cause of injury. Risk factors for falls include, among others, a history of falls, old age and female gender. Research studies have also linked certain medications, the so-called fall-risk-increasing drugs (FRIDs) such as psychotropics and cardiovascular drugs, to an increased risk of falls. However, the definitions of FRIDs are inconsistent between studies, and many studies did not use a systematic classification of medications.

Objective: The aim of this study was to investigate the effect of grouping medications at different levels of granularity of a medication classification system on the performance of fall risk prediction models.

Methods: This is a retrospective analysis of the MIMIC-III cohort database. We created seven prediction models that included demographic, comorbidity and medication variables. Medications were grouped using the Anatomical Therapeutic Chemical (ATC) classification system, starting from the most specific scope of medications and moving up to the more generic groups: one model used individual medications (ATC level 5), four models used medication grouping at levels one, two, three and four of the ATC, and one model contained no medications. We compared the predictive performance of these models with the performance of a model with predefined ATC groups of FRIDs based on the literature and expert review (Expert-Opinion). Medications and comorbidities were extracted from discharge summaries. Logistic regression with the least absolute shrinkage and selection operator (LASSO) was used to construct the prediction models. The main performance measure was the area under the receiver operating characteristic curve (AUC-ROC). In addition, we systematically evaluated performance using the Akaike Information Criterion (AIC), the area under the precision-recall curve (AUC-PR), sensitivity, specificity, precision and the Brier score. To internally validate the results, we performed bootstrapping to obtain optimism-corrected estimates and used the Wilcoxon signed-rank test to assess the significance of the median differences in estimates between each model and the Expert-Opinion model.

Results: The highest statistically significant AUC-ROC was achieved by grouping medications at level two of the ATC [0.7143 (95% CI: 0.710-0.718) compared to 0.698 (95% CI: 0.6927-0.7018) for the Expert-Opinion model]. In terms of AUC-PR, all the models with medications significantly outperformed the Expert-Opinion model, except the ATC-1 model. The values of the other performance measures varied between the models depending on the grouping level. All the models showed satisfactory calibration.

Conclusion: The performance of fall risk prediction is significantly affected by medication grouping. The discrimination performance of fall prediction is improved by grouping all medications using the ATC classification system instead of using the FRIDs list. The optimal grouping level can be determined through experimentation.


Acknowledgements

I would like first of all to thank my wife Heba. She has been extremely supportive of me throughout my life and has made innumerable sacrifices to help me get to this point. Thanks to my children, Boshra, Mohammad, and the triplets Omar, Huda and Hamza, who have continually provided the requisite motivation to finish my degree. My parents, who taught me to pursue perfection instead of excellence, deserve special thanks. Without such a team behind me, I doubt that I would be in this place today.

I would like to thank my tutor, Prof. Dr. Ameen Abu-Hanna, and my daily mentor, Dr. Martijn Schut, for their prompt guidance and support throughout the SRP process. I am also grateful to Prof. Dr. Nathalie van der Velde, who provided valuable input and feedback on the SRP. Special thanks go to Dr. Ronald Cornet, who granted me permission to access the data used in this thesis, the MIMIC database.

Last but not least, thanks to you, reader. You have read at least one page of my thesis. Thank you.


Contents

Abstract
Abstract (Dutch)
Acknowledgements
1 Introduction
    1.1 Research Questions
    1.2 Outline of this Thesis
2 Background
    2.1 Fall-Risk-Increasing Drugs (FRIDs)
    2.2 Medical Information Mart for Intensive Care III (MIMIC-III) database
    2.3 Anatomical Therapeutic Chemical Classification System (ATC)
    2.4 Least Absolute Shrinkage and Selection Operator (LASSO)
3 Materials and Methods
    3.1 Study Design
    3.2 Population
    3.3 Patient and Event Inclusion
    3.4 Data Extraction
    3.5 Data Processing
    3.6 Clinical Outcome
    3.7 Outcome Measurements: Predictive Performance
    3.8 Statistical Analysis
    3.9 Models Validation
4 Results
    4.1 Study Population
    4.2 Extracted Variables
    4.3 Missing Variables
    4.4 Measures of Predictive Performance
5 Discussion and Future Direction
A Common Abbreviations Used in Medication Extraction
B Average Predictive Comparison (APC)


Chapter 1

Introduction

About one-third of community-dwelling elderly people aged 65 years and older fall at least once per year [1]. Falls are the major cause of injury among elders and result in high healthcare demand. Age-related problems, including falls, are expected to increase in the coming decades due to the aging population. As a consequence, falls are expected to become one of the major public health problems worldwide [2–4]. There are several risk factors for falls, including, among others, a history of falls, old age, female gender and mobility problems [5–8]. Research studies have also linked certain medications, known as fall-risk-increasing drugs (FRIDs), to an increased risk of falls among older adults. FRIDs include psychotropics such as antidepressants and sedative hypnotics, cardiovascular drugs such as diuretics and antihypertensives, and a miscellaneous group including analgesics [9–11].

Although numerous published studies have sought to establish an association between medication use and the risk of falls, a substantial number of these studies have not used any systematic classification of medications [9, 12]. Moreover, many studies were inconsistent in the way medications were classified. For instance, the class ‘antihypertensives’ sometimes includes diuretics and sometimes does not [76].

Several attempts have been made to establish a list of FRIDs that can serve as a tool in fall prevention programs [13, 14] or to improve medication safety in older adults [15–17]. However, much uncertainty still exists about which medications are supported by enough evidence to be included and how these medications should be classified. Moreover, the lack of consistency in the definition of FRIDs hinders researchers' efforts to compare results between studies. To illustrate this, consider the various prediction models created to estimate the future risk of falls [18–21]. Because FRIDs were inconsistently defined in these studies, it remains difficult to compare the results, and it is uncertain whether a different definition of FRIDs would have affected the results obtained. In fact, a recent systematic review and meta-analysis [22] found that pharmacological subgroups (e.g. loop diuretics) could lead to an increased risk of falls whereas the broader pharmacological group (e.g. diuretics) did not.


To increase quality, consistency, comparability and clinical implementation, medications are grouped according to their therapeutic effects or using the Anatomical Therapeutic Chemical (ATC) classification system [23]. The ATC classification system, recommended by the World Health Organization (WHO), is the most widely recognized classification system for drugs. This classification system has five levels; the first level classifies active substances into fourteen main groups according to the organ or system on which they act. The first level is divided into pharmacological/therapeutic subgroups (2nd level) and chemical/pharmacological/therapeutic subgroups (3rd and 4th levels), and the 5th level is the chemical substance. For example, captopril is coded C09AA01 (C = cardiovascular system; C09 = agents acting on the renin-angiotensin system; C09A = ACE inhibitors, plain; C09AA = ACE inhibitors, plain; C09AA01 = captopril).

Because medications represent one of the most modifiable risk factors for falls, there is an opportunity to address medication-related issues, especially in prevention programs. In fact, previous studies in community settings have shown that reducing FRIDs can result in a 39–66% reduction in falls [13, 16, 24, 25]. This thesis seeks to address the problem of inconsistent FRIDs classification, with a focus on fall risk prediction models. We hypothesize that we can obtain better discrimination performance (between fallers and non-fallers) by using all medications instead of a predetermined list of FRIDs. We exploit the hierarchical connection between medications and their groups in the ATC to create a fall risk prediction model. Because subgroups and individual medications may have different fall-risk-increasing properties, we explore the effect of grouping all medications at varying levels of the ATC.

The aim of this study is to compare the performance of fall risk prediction when using expert-based FRIDs grouping, individual medications, and all medications clustered at varying ATC levels.

1.1 Research Questions

The research question addressed in this thesis is the following:

1. To what extent does grouping medications affect the performance of a fall risk prediction model?

Performance is assessed systematically using several performance measures: the area under the receiver operating characteristic curve (AUC-ROC), the area under the precision-recall curve (AUC-PR), sensitivity, specificity, precision (positive predictive value), the Akaike Information Criterion (AIC), calibration plots and the Brier score.
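To make the definitions of some of these measures concrete, the following is a minimal sketch in plain Python. The function names, the 0.5 decision threshold and the toy data are assumptions made purely for illustration and are not taken from the thesis; AUC-PR, AIC and calibration plots are not covered here.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_measures(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity and precision (positive predictive value)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "precision": tp / (tp + fp) if tp + fp else nan,
    }


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc_roc(y_true, y_prob):
    """Probability that a random faller is scored above a random non-faller
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = faller, 0 = non-faller.
y_true = [1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
print(threshold_measures(y_true, y_prob))
print(brier_score(y_true, y_prob), auc_roc(y_true, y_prob))
```

The pairwise AUC-ROC computation is quadratic in the number of patients, which is fine for a toy example; a rank-based implementation would be preferable on a cohort of realistic size.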


1.2 Outline of this Thesis

The next chapter provides the necessary background for the reader of this thesis: information on fall-risk-increasing drugs (FRIDs), an overview of the Medical Information Mart for Intensive Care III (MIMIC-III) database that we used in our experiment, a description of the Anatomical Therapeutic Chemical (ATC) classification system and a brief overview of the Least Absolute Shrinkage and Selection Operator (LASSO). Chapter 3 presents the methodology of the experiment and chapter 4 shows the results. A detailed discussion, future recommendations and the conclusion are provided in chapter 5. To facilitate reading, long tables were placed in an appendix at the end of the thesis.


Chapter 2

Background

In this chapter we discuss FRIDs, the Medical Information Mart for Intensive Care III (MIMIC-III) database, the ATC classification system and the Least Absolute Shrinkage and Selection Operator (LASSO).

2.1 Fall-Risk-Increasing Drugs (FRIDs)

Many widely used drugs are associated with falls in elderly patients [26–29]. Drugs that increase fall risk are summarized under the acronym FRIDs (fall-risk-increasing drugs). Grouping them in this way may facilitate further analysis of fall risk and also help to update recommendations.

Drugs can induce falls in different ways, including through the expected and adverse effects of single drugs or of drug–drug interactions [30], and through dosage and changes in medication [31]. Both the pharmacokinetic and the pharmacodynamic properties of drugs change with aging [28]. As age increases, the half-life of drugs increases due to the increase in body fat mass. In frail and malnourished elders, serum albumin is often reduced, which increases the concentration of free drug. Some of the intended and unintended pharmacotherapeutic effects of drugs (e.g., sedation, cognitive changes, dizziness, and orthostatic hypotension) might increase the risk of falls.

Numerous published studies have sought to establish an association between medication use and the risk of falling. These medications include certain anti-hypertensives (e.g., diuretics, beta-blockers), anti-arrhythmics, anti-cholinergics, anti-histamines, sedatives and hypnotics (e.g., benzodiazepines), neuroleptics, antidepressants, narcotics, and nonsteroidal anti-inflammatory drugs (NSAIDs) [9, 10, 32–34]. However, this evidence is based primarily on observational data with little adjustment for confounders, dosage, or duration of therapy. Therefore, it is unclear whether the observed increase in falls is truly related to the use of these drugs or to the underlying conditions that the drugs are treating.


Much effort has been made to establish a list of FRIDs that can be used in fall prevention programmes or to improve medication safety in older adults [16, 24, 35]. However, for many drugs there is only minimal scientific evidence of their impact on falls; therefore, there is no definitive list of FRIDs, and the existing lists are inconsistent. Table 2.1 shows an example of a list of FRIDs based on the literature and expert reviews, according to the Swedish National Board of Health and Welfare (NBHW) [15].

ATC* code                 Drugs/group of drugs

Increase the fall risk
N02A                      Opioids
N05A (N05AN excluded)     Antipsychotics (lithium excluded)
N05B                      Anxiolytics
N05C                      Hypnotics and sedatives
N06A                      Antidepressants

May cause or worsen orthostatism
C01D                      Vasodilators used in cardiac diseases
C02                       Antihypertensives
C03                       Diuretics
C07                       Beta blocking agents
C08                       Calcium channel blockers
C09                       Renin-angiotensin system inhibitors
G04CA                     Alpha-adrenoreceptor antagonists
N04B                      Dopaminergic agents
N05A (N05AN excluded)     Antipsychotics (lithium excluded)
N06A                      Antidepressants

TABLE 2.1: Fall-risk-increasing drugs (FRIDs) and drugs that may cause or worsen orthostatism (ODs) according to the list from the Swedish National Board of Health and Welfare (NBHW)

*Anatomical Therapeutic Chemical classification system.

2.2 Medical Information Mart for Intensive Care III (MIMIC-III) database

Medical Information Mart for Intensive Care (MIMIC-III) is a large database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. The database was established in October 2003 as a Bioengineering Research Partnership between MIT, Philips Medical Systems, and Beth Israel Deaconess Medical Center.

The current release of the database, MIMIC-III, contains more than forty thousand patients who were admitted to Intensive Care Units (ICU) at Beth Israel Deaconess Medical Center (BIDMC), an academic, urban tertiary-care hospital. The data are de-identified and annotated, and are made openly available to the research community.


Besides patient information derived from the hospital, the MIMIC-III database contains detailed physiological and clinical charted data. One of the purposes of MIMIC-III is to develop and evaluate advanced ICU patient monitoring and decision support systems that will improve the efficiency, accuracy, and timeliness of clinical decision-making in critical care.

There are two main types of data in the MIMIC-III database. The first is clinical data derived from the electronic health record (EHR), such as patients’ demographics, diagnoses, laboratory values and vital signs. The second is the free text of various categories of clinical notes, which include discharge summaries, radiology reports and nursing notes, among others.

MIMIC-III is an open access database available to researchers around the world. However, researchers seeking to use the database must formally request access. The database is maintained by PhysioNet (http://physionet.org), a diverse group of computer scientists, physicists, mathematicians, biomedical researchers, clinicians, and educators around the world.

Such a database allows for extensive epidemiological studies that link patient data to clinical practice and outcomes. The extremely high granularity of the data allows for complicated analysis of complex clinical problems. A list of hundreds of published papers based on MIMIC data can be found on PhysioNet (https://mimic.physionet.org/about/publications/).

2.3 Anatomical Therapeutic Chemical Classification System (ATC)

The Anatomical Therapeutic Chemical (ATC) classification system, recommended by the World Health Organization (WHO), is currently the most widely recognized classification system for drugs. This classification system divides medications into fourteen main groups according to the organ or system on which they act. Additionally, the system classifies drugs at five levels: anatomical, therapeutic, pharmacological, chemical, and drug or ingredient, as shown in table 2.2. For example, captopril is coded C09AA01 (C = cardiovascular system; C09 = agents acting on the renin-angiotensin system; C09A = ACE inhibitors, plain; C09AA = ACE inhibitors, plain; C09AA01 = captopril). Also included are defined daily doses (DDDs) and administration routes, assigned to most drugs in accordance with the therapeutic and pharmacological groups.

ATC Code   Description                                     ATC Level   Classification
C          Cardiovascular system                           1st Level   Anatomical main group
C09        Agents acting on the renin-angiotensin system   2nd Level   Therapeutic subgroup
C09A       ACE inhibitors, plain                           3rd Level   Pharmacological subgroup
C09AA      ACE inhibitors, plain                           4th Level   Chemical subgroup
C09AA01    Captopril                                       5th Level   Chemical substance

TABLE 2.2: Classification of captopril according to the Anatomical Therapeutic Chemical (ATC) classification system

The ATC classification system is a strict hierarchy, meaning that each code has one and only one parent code, except for the fourteen codes at the topmost level, which have no parent. However, since an ingredient can have therapeutic applications in different anatomical sections, ATC assigns a different code to the same ingredient in each anatomical section. For example, the beta-blocker timolol has different codes when used as a cardiovascular drug (C07AA06) and as a treatment for glaucoma (S01ED01). Nevertheless, each of these codes has only one parent.
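Because the hierarchy is strict, the ancestor of a code at any level is simply a prefix of that code. The following minimal sketch illustrates this; the function names are hypothetical, and the prefix lengths (1, 3, 4, 5 and 7 characters for levels 1 to 5) follow the captopril example in table 2.2.

```python
# Number of characters in an ATC code at each of the five levels.
ATC_LEVEL_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def atc_at_level(code, level):
    """Return the ancestor of `code` (or the code itself) at the given level."""
    length = ATC_LEVEL_LENGTH[level]
    if len(code) < length:
        raise ValueError(f"{code!r} is above ATC level {level}")
    return code[:length]


def atc_level(code):
    """Return the ATC level (1-5) implied by the length of `code`."""
    for level, length in ATC_LEVEL_LENGTH.items():
        if len(code) == length:
            return level
    raise ValueError(f"{code!r} does not have a valid ATC code length")


def atc_parent(code):
    """Return the single parent of `code`, or None for a level-1 code."""
    level = atc_level(code)
    return None if level == 1 else atc_at_level(code, level - 1)


# Captopril at every level of the hierarchy.
print([atc_at_level("C09AA01", level) for level in range(1, 6)])
# Timolol has two codes, and each of them has exactly one parent.
print(atc_parent("C07AA06"), atc_parent("S01ED01"))
```

Grouping medications at a coarser level therefore amounts to truncating their level-5 codes, which is what makes the ATC convenient for varying the granularity of medication features.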

2.4 Least Absolute Shrinkage and Selection Operator (LASSO)

Real-world datasets, such as the one used in this thesis, often have a large number of features (they are high dimensional), many of which can be redundant (highly correlated with each other) or irrelevant for classification. Machine learning algorithms that learn a prediction model from such datasets suffer from two main issues. The first is the interpretability of the model, which is hampered by the high number of features. The second is over-fitting, which means that the model generalizes poorly to a different dataset. Over-fitting, or variance, increases with the complexity of the model, as shown in figure 2.1. As more and more features are added to a model, the complexity of the model rises and variance becomes our primary concern, while bias steadily falls. Several techniques can be used to alleviate these problems, such as cross-validation sampling, reducing the number of features, pruning, and regularization.

LASSO [36] is a regularization method used to fit a regression model that is sparse in the feature space. LASSO regression (also called L1 regularization) for generalized linear models can be explained as adding a penalty against complexity, which reduces the degree of over-fitting of a model by adding more bias (under-fitting). This penalty term, weighted by λ, is added directly to the cost function, which makes the computation more efficient. Besides its regularization capability, LASSO performs feature selection, as it shrinks the coefficients of the less important features to zero, thus removing some features altogether.
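For concreteness, the L1-penalized cost function for logistic regression can be written as follows. This is the standard textbook formulation rather than a formula quoted from this thesis; the symbols n (number of patients), p (number of features), x_i (feature vector of patient i), y_i (0/1 outcome), β_0 (intercept) and β (coefficient vector) are notation introduced here for illustration.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\, \beta}
\left\{ -\frac{1}{n} \sum_{i=1}^{n}
\Big[ y_i \,(\beta_0 + x_i^{\top}\beta)
      - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
+ \lambda \sum_{j=1}^{p} |\beta_j| \right\}
```

The first term is the average negative log-likelihood of the logistic model and the second term is the L1 penalty. The intercept β_0 is not penalized; with λ = 0 the expression reduces to ordinary logistic regression, and increasing λ drives more coefficients to exactly zero.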


FIGURE 2.1: Bias and variance contributing to total error.

In a nutshell, the regularization parameter λ penalizes all the parameters except the intercept, so that the model generalizes and does not over-fit. However, regularization is highly dependent on the tuning parameter λ, and this value should be chosen with care, because we want to avoid over-fitting and under-fitting at the same time. The optimal value of λ depends on the data and is usually chosen by cross-validation, which works as follows. (i) Divide the samples at random into 10 roughly equal-sized parts. (ii) For each part k, leave out this part, fit the model to the other nine parts over a range of λ values, and record the prediction error on the left-out part. (iii) Repeat step (ii) for k = 1, 2, ..., 10 and, for each value of λ, compute the average prediction error over the 10 parts. Finally, our estimate of λ is the value yielding the smallest average prediction error.

Traditional methods such as cross-validation and stepwise regression, used to handle over-fitting and perform feature selection, work well with a small set of features, but regularization techniques are a good alternative when dealing with a large set of features, mainly because of their efficiency.
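The cross-validation procedure in steps (i) to (iii) can be sketched as follows. To keep the example self-contained, it uses a deliberately simplified toy LASSO (squared-error loss, a single feature, no intercept) that has a closed-form solution; this toy estimator, the function names, the λ grid and the simulated data are all assumptions made for illustration and are not the penalized logistic regression used in this thesis.

```python
import random


def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso_1d(xs, ys, lam):
    """Toy LASSO with one feature and no intercept:
    minimise (1/2n) * sum((y - b*x)^2) + lam * |b|."""
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n
    z = sum(x * x for x in xs) / n
    return soft_threshold(rho, lam) / z


def make_folds(n, k, seed=0):
    """Step (i): randomly split indices 0..n-1 into k roughly equal parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_select_lambda(xs, ys, lambdas, k=10, seed=0):
    """Steps (ii) and (iii): average held-out error per lambda, pick the minimum."""
    folds = make_folds(len(xs), k, seed)
    avg_error = {}
    for lam in lambdas:
        errors = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(xs)) if i not in held]
            b = fit_lasso_1d([xs[i] for i in train], [ys[i] for i in train], lam)
            mse = sum((ys[i] - b * xs[i]) ** 2 for i in held_out) / len(held_out)
            errors.append(mse)
        avg_error[lam] = sum(errors) / k
    best = min(avg_error, key=avg_error.get)
    return best, avg_error


# Simulated data (hypothetical): y depends linearly on x plus a little noise.
rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]
lambdas = [0.0, 0.01, 0.1, 0.5, 1.0]
best_lambda, avg_error = cv_select_lambda(xs, ys, lambdas)
print(best_lambda, avg_error)
```

In this toy setting the largest penalties shrink the single coefficient so heavily that the held-out error rises, which is the under-fitting side of the trade-off described above.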

To address the limitations of LASSO, several other regularization techniques exist such as Ridge, Elastic Net and several variants of LASSO (e.g. group lasso and sparse group lasso). Further explanations of these algorithms can be found in [37].


Chapter 3

Materials and Methods

3.1 Study Design

We evaluated seven prediction models. Variables from MIMIC-III were included if there was evidence suggesting that they were associated with falls or had the potential to predict falls. We considered the following sets of variables:

• demographic variables - including age, gender, ethnicity, marital status.

• comorbidity variables - including anxiety, depression, heart disease, hypertension, lung disease, diabetes, stroke, urinary incontinence, arthritis, osteoporosis, cancer, hearing problems, sleep problems, gait disturbance, visual impairment, dizziness, polypharmacy, history of fall.

• medication variables - which can be grouped at different levels of the ATC classification system, as explained above.

We created seven prediction models: one model uses medications without grouping (ATC-5); four models group medications at ATC level one (ATC-1), ATC level two (ATC-2), ATC level three (ATC-3) and ATC level four (ATC-4); one model uses FRIDs grouped according to clinical expert knowledge (Expert-Opinion), as shown in table 3.1; and one model uses only demographic and comorbidity variables, without medications (NO-MED).
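The way the medication variables change between the ATC-1 to ATC-5 models can be illustrated with a small sketch. It assumes that each patient's home medications have already been mapped to ATC level-5 codes; the function names, the two example patients and the use of binary indicator variables are assumptions made for illustration, and the mixed-level groups of the Expert-Opinion model are not covered.

```python
# Number of characters in an ATC code at each of the five levels.
ATC_LEVEL_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def medication_features(patient_codes, level):
    """Map each patient's ATC level-5 codes to the set of groups at `level`."""
    length = ATC_LEVEL_LENGTH[level]
    return [{code[:length] for code in codes} for codes in patient_codes]


def indicator_matrix(feature_sets):
    """Turn per-patient sets into binary rows with a shared column order."""
    columns = sorted(set().union(*feature_sets))
    rows = [[1 if col in feats else 0 for col in columns] for feats in feature_sets]
    return columns, rows


# Hypothetical patients with home medications coded at ATC level 5.
patients = [
    ["C09AA01", "C03CA01"],  # captopril, furosemide
    ["C09AA02", "N05BA01"],  # enalapril, diazepam
]

for level in (5, 2, 1):
    columns, rows = indicator_matrix(medication_features(patients, level))
    print(f"ATC-{level}:", columns, rows)
```

Moving from ATC-5 to ATC-1 reduces the number of medication columns and merges drugs that share an ancestor: captopril and enalapril are separate variables at level 5 but fall into the same group C09 at level 2.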

3.2 Population

Data were obtained from the Medical Information Mart for Intensive Care (MIMIC-III, Version 1.4) [38]. The dataset comprises demographic elements such as age, gender and ethnicity; clinical elements including vital signs, laboratory data and diagnostic codes (in International Classification of Diseases, 9th Edition, Clinical Modification [ICD-9-CM] format); and outcome values. In addition, it contains various types of clinical notes, including discharge summaries and nursing notes.


Group Name                                        ATC Code                                   ATC Level
Alpha Blockers (Prostate Hyperplasia)             G04CA                                      4
Alpha Blockers (Antihypertensives)                C02CA                                      4
Angiotensin Converting Enzyme Inhibitors          C09AA                                      4
Angiotensin Receptor Blockers                     C09C                                       3
Antiadrenergics (Antihypertensives)               C02A, C02B, C02C                           3
Antiarrhythmics (Class I and III)                 C01B                                       3
Antiepileptic Drugs                               N03                                        2
Antihistamines                                    R06                                        2
Antiparkinson Drugs                               N04                                        2
Antipsychotic Drugs                               N05A                                       3
Beta Blockers (Non-Selective)                     C07AA, C07AG                               4
Beta Blockers (Selective)                         C07AB                                      4
Calcium Channel Blockers                          C08                                        2
Cardiac Glycosides                                C01A                                       3
High-ceiling Diuretic                             C03C                                       3
Insulins and Analogues                            A10A                                       3
Low-ceiling Diuretic                              C03A, C03B                                 3
Other Antidepressants                             N06AF, N06AG, N06AX                        4
Monoamine Reuptake Inhibitors (Non-Selective)     N06AA                                      4
Nitrates (Vasodilators used in Cardiac Disease)   C01DA                                      4
Non-Steroidal Anti-inflammatory Drugs             M01AA, M01AB, M01AC, M01AE, M01AG, M01AH   4
Opioids                                           N02A                                       3
Blood Glucose Lowering Drugs excl. insulins       A10B                                       3
Proton Pump Inhibitors                            A02BC                                      4
Selective Serotonin Reuptake Inhibitors (SSRI)    N06AB                                      4
Statins                                           C10AA                                      4
Urinary Incontinence Drugs                        G04BD                                      4

TABLE 3.1: FRIDs groups according to the literature and expert reviews.

3.3 Patient and Event Inclusion

Patients were included in the study cohort if they were 65 years or older and had a complete discharge summary. By a complete discharge summary we mean one that contained the header sections for both medications registered on admission and past medical history. For patients with multiple admissions, only the first hospital admission was considered for analysis.
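These inclusion criteria could be applied along the following lines. This is only a sketch under stated assumptions: the record layout and all field names ('patient_id', 'admit_time', 'age', 'has_admission_meds', 'has_past_history') are hypothetical, and the sketch assumes that the first admission is selected before the age and completeness criteria are checked, an ordering the text does not specify.

```python
def select_cohort(admissions):
    """Keep each patient's first admission, then apply the inclusion criteria.

    `admissions` is a list of dicts with the hypothetical keys 'patient_id',
    'admit_time' (sortable, e.g. an ISO date string), 'age',
    'has_admission_meds' and 'has_past_history'.
    """
    first = {}
    for adm in sorted(admissions, key=lambda a: a["admit_time"]):
        first.setdefault(adm["patient_id"], adm)  # earliest admission wins
    return [
        adm
        for adm in first.values()
        if adm["age"] >= 65 and adm["has_admission_meds"] and adm["has_past_history"]
    ]


# Hypothetical example records.
admissions = [
    {"patient_id": 1, "admit_time": "2102-03-01", "age": 71,
     "has_admission_meds": True, "has_past_history": True},
    {"patient_id": 1, "admit_time": "2101-01-01", "age": 70,
     "has_admission_meds": True, "has_past_history": True},
    {"patient_id": 2, "admit_time": "2101-05-05", "age": 60,
     "has_admission_meds": True, "has_past_history": True},
    {"patient_id": 3, "admit_time": "2101-07-07", "age": 80,
     "has_admission_meds": True, "has_past_history": False},
]
cohort = select_cohort(admissions)
print([(a["patient_id"], a["admit_time"]) for a in cohort])
```

If the criteria were instead applied before picking the first admission, a patient whose first admission lacks a complete discharge summary could still enter the cohort through a later admission, so the ordering is a design decision worth stating explicitly.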


3.4 Data Extraction

We explain the three types of variables which were extracted from both structured data and free text data.

Demographic variables: In the MIMIC-III database, ages above 89 were obscured. Following the MIMIC-III documentation, we replaced the obscured values with the median age of 91.4. Four primary ethnicities were designated in MIMIC-III (White, Black/African American, Hispanic/Latino and Asian), in addition to Unknown. Any ethnicity outside these four was placed in an Other category. With regard to marital status, there were six possible categories: Married, Single, Divorced, Widowed, Separated or Unknown. We also extracted the gender variable, which had two possible values: male and female.

Medication variables: MIMIC-III contains medications prescribed during the hospital stay. However, to predict falls, home medications registered on admission were the variables of interest. These were available as text in the discharge summaries. When we inspected the discharge summaries, we noticed that they contained semi-structured information. The noisy data consisted of different header sections in different formats, such as Medications on Admission, Social History, Past Medical History, etc. The common reference in each document was the punctuation mark “:” after each heading. We found that medications may fall into two categories, namely medications on admission and medications on discharge. The heading “medications on admission” did not occur consistently in the text. Some variants of the heading were “home medications”, “admission medications” and “medications”. Therefore, we composed a regular expression pattern to capture all these headings and to exclude any heading that includes the word “discharge”.

In order to recognize the medications, we compiled a dictionary lookup table using RxNorm [39]. It was arranged as key-value pairs, where the keys were the data items being searched (looked up) and the values were only the active ingredients. The keys were the active ingredients, synonyms, drug salts, brand names and drug combinations. For example, the drug atorvastatin has “lipitor”, “atrovastatin”, “atorvastatin calcium” and “atorvastatin calcium trihydrate” as keys, and “atorvastatin” as a value. The lookup table was also manually enriched with well-known pharmaceutical drug abbreviations, as shown in supplementary appendix A.

We followed a traditional natural language processing (NLP) workflow to pre-process the raw text. (1) All numbers and punctuation marks were removed. (2) The n-grams tokenizer from the nltk package [40] was used to split a string of text into term tokens, each containing one to four words. (3) All term tokens were converted to lower case. The lookup algorithm works iteratively: it tries to match the longest terms first and then the shorter ones. This is important to capture drug combinations and drug salts. For instance, the drug salt “atorvastatin calcium” should be mapped to “atorvastatin” instead of the two individual drugs “atorvastatin” and “calcium”.
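The heading filter and the longest-match-first lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the tiny `LOOKUP` table, the function names and the exact regular expression are hypothetical, and a plain-Python window over the tokens stands in for the nltk n-grams tokenizer.

```python
import re

# Hypothetical miniature lookup table (the study derived its table from RxNorm).
LOOKUP = {
    "atorvastatin": "atorvastatin",
    "lipitor": "atorvastatin",
    "atorvastatin calcium": "atorvastatin",
    "calcium": "calcium",
    "hctz": "hydrochlorothiazide",
}

# Assumed pattern: accept headings that mention "medications" unless they
# also mention "discharge" anywhere in the heading.
ADMISSION_HEADING = re.compile(r"^(?!.*discharge).*\bmedications\b", re.IGNORECASE)


def is_admission_heading(heading):
    """Return True for headings such as 'Home Medications'."""
    return ADMISSION_HEADING.search(heading) is not None


def preprocess(text):
    """Drop digits and punctuation, lower-case, and split into word tokens."""
    cleaned = re.sub(r"[^A-Za-z\s]", " ", text)
    return cleaned.lower().split()


def extract_medications(text, lookup=LOOKUP, max_n=4):
    """Scan left to right, trying the longest term (up to max_n words) first."""
    tokens = preprocess(text)
    found = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            term = " ".join(tokens[i:i + n])
            if term in lookup:
                found.append(lookup[term])
                i += n  # skip the words consumed by this match
                break
        else:
            i += 1  # no term starts at this token
    return found
```

Because the two-word term is tried before the one-word terms, “atorvastatin calcium” yields a single active ingredient rather than two separate drugs.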


In each iteration, if the algorithm fails to find a match, it tries different word orderings of the term and attempts to correct spelling errors. The following simple method was used to correct misspellings. First, we counted all words in the raw text and created a list of them. Second, we ordered the list by word frequency, most frequent first. Then, we used the Ratcliff and Obershelp algorithm, as implemented in the Python difflib module with the cutoff argument set to 0.6, to find close matches of the term in the list and to pick the match with the highest frequency. We assumed that the top words in the list do not contain typographical errors.
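The spelling correction step can be sketched with `difflib.get_close_matches`, which implements the Ratcliff and Obershelp similarity and accepts the `cutoff` argument mentioned above. The function names and the tie-breaking by frequency rank are assumptions made for illustration; the study's exact implementation is not given in the text.

```python
import difflib
from collections import Counter


def build_vocabulary(raw_text):
    """Return the distinct words of the text, most frequent first."""
    counts = Counter(raw_text.lower().split())
    return [word for word, _ in counts.most_common()]


def correct_spelling(term, vocabulary, cutoff=0.6):
    """Replace term by its most frequent close match, if there is one."""
    if not vocabulary:
        return term
    matches = difflib.get_close_matches(
        term, vocabulary, n=len(vocabulary), cutoff=cutoff
    )
    if not matches:
        return term  # nothing similar enough: leave the term unchanged
    rank = {word: position for position, word in enumerate(vocabulary)}
    # Among all close matches, prefer the one that is most frequent in the corpus.
    return min(matches, key=rank.__getitem__)
```

Ranking the candidates by corpus frequency reflects the assumption stated above that frequent words are unlikely to be typographical errors.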

Comorbidity variables: All comorbidity variables existed in free text, except the variable “history of fall”, which was defined by the existence of the ICD-9 code “V15.88” in the diagnoses table. Comorbidities were identified in the section “Past Medical History” of the discharge summaries. A regular expression was created to pull out that part of the text so that it could be parsed later on. Because some of the comorbidities may include different related concepts and synonyms, we used the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) to compile a list of related terms and synonyms. In SNOMED CT, each term has a unique identifier, the “concept id”. These terms are arranged into relationships with one another, representing hierarchies of common concepts and themes. For each comorbidity, we used its “concept id” to find all the subclasses recursively. For instance, the concept “heart disease” is the superclass of “myocardial infarction”, and this descendant has “heart infarction” as a synonym. We then repeated the same text pre-processing steps and the lookup algorithm described earlier on the extracted text to obtain the variables.
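The recursive collection of subclasses can be sketched as below. The hierarchy and synonym tables are a hypothetical toy example with placeholder identifiers, not real SNOMED CT concept ids, and the function names are illustrative only.

```python
# Toy hierarchy: parent concept -> direct subclasses (placeholder ids).
CHILDREN = {
    "heart_disease": ["myocardial_infarction", "heart_failure"],
    "myocardial_infarction": ["acute_myocardial_infarction"],
}

# Toy synonym table: concept -> terms that may appear in the text.
SYNONYMS = {
    "heart_disease": ["heart disease"],
    "myocardial_infarction": ["myocardial infarction", "heart infarction"],
    "heart_failure": ["heart failure"],
    "acute_myocardial_infarction": ["acute myocardial infarction"],
}


def descendants(concept_id, children=CHILDREN, seen=None):
    """Collect all subclasses of concept_id by following child links recursively."""
    if seen is None:
        seen = set()
    for child in children.get(concept_id, []):
        if child not in seen:  # guards against revisiting shared subclasses
            seen.add(child)
            descendants(child, children, seen)
    return seen


def terms_for(concept_id, children=CHILDREN, synonyms=SYNONYMS):
    """Return every term for a comorbidity and all of its subclasses."""
    concept_ids = {concept_id} | descendants(concept_id, children)
    return sorted(t for cid in concept_ids for t in synonyms.get(cid, []))
```

The resulting term list can then serve as the keys of a lookup table for the comorbidity, in the same way as the medication lookup.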

3.5 Data Processing

Medications extracted from the discharge summaries were plain text. As mentioned, we used the ATC to group these medications, which required mapping each medication to an ATC code. We applied the mapping provided by Observational Health Data Sciences and Informatics (OHDSI) [41] (version 5) to map each medication to a level-five ATC code. Because some medications may have multiple ATC codes (discussed earlier in 2.3), we included all possible codes.

After the medications were mapped, we obtained all the super (parent) groups of each medication code per patient. For example, if a patient was taking Atorvastatin (code: C10AA05), we included HMG CoA reductase inhibitors (code: C10AA), Lipid Modifying Agents, Plain (code: C10A), Lipid Modifying Agents (code: C10) and Cardiovascular System (code: C), which represent ATC levels five, four, three, two and one respectively. A polypharmacy variable was added if a patient had four or more concurrent medications.
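Because an ATC code encodes its own hierarchy, the parent groups can be obtained by truncating the level-five code to 1, 3, 4 and 5 characters. The sketch below illustrates this together with the polypharmacy flag; the function names are hypothetical and the code is not taken from the study.

```python
# Number of leading characters that identify each ATC level.
ATC_PREFIX_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def atc_ancestors(code):
    """Map a level-five ATC code to its code at every level (1 to 5)."""
    code = code.strip().upper()
    if len(code) != 7:
        raise ValueError("expected a 7-character level-five ATC code")
    return {level: code[:length] for level, length in ATC_PREFIX_LENGTH.items()}


def group_codes(level_five_codes, level):
    """Return the distinct ATC groups of a patient at the requested level."""
    return {atc_ancestors(code)[level] for code in level_five_codes}


def has_polypharmacy(medications, threshold=4):
    """Flag patients with at least `threshold` distinct medications."""
    return len(set(medications)) >= threshold
```

For a patient on atorvastatin (C10AA05) and simvastatin (C10AA01), `group_codes` at level four returns the single group C10AA, which shows how grouping reduces the number of variables.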


3.6 Clinical Outcome

The outcome of interest was fall incidents. A fall was defined as “an unexpected event in which the participants come to rest on the ground, floor, or lower level” [42]. We identified fallers using two sources in the MIMIC-III dataset: first, the presence of an ICD-9 code starting with “E88”, which represents any type of accidental fall; second, a reason of admission (available as free text in the admissions table) containing any of the words fall, fell or fallen. We assumed that the fall incident occurred before the admission or was the reason of admission.
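The two identification rules can be combined as in the following sketch. It assumes that ICD-9 codes are stored as strings without a decimal point and that the three words are matched as whole words; both are assumptions for illustration, and under the second one a plural such as “falls” would not be matched.

```python
import re

# Whole-word, case-insensitive match for the three words named in the text.
FALL_WORDS = re.compile(r"\b(fall|fell|fallen)\b", re.IGNORECASE)


def is_faller(icd9_codes, admission_reason):
    """Return True if either source indicates a fall."""
    has_fall_code = any(code.upper().startswith("E88") for code in icd9_codes)
    has_fall_word = FALL_WORDS.search(admission_reason or "") is not None
    return has_fall_code or has_fall_word
```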

3.7 Outcome Measurements: Predictive Performance

Performance was evaluated systematically using various performance measures. For discrimination (the ability of the model to correctly classify those with and without the event) we used the area under the receiver operating characteristic curve (AUC-ROC). We also used the area under the precision-recall curve (AUC-PR) as a performance measure suited to imbalanced datasets. Other performance measures included were sensitivity, specificity and precision (positive predictive value). To measure the relative goodness of fit, we used the Akaike Information Criterion (AIC). To assess calibration (the agreement between observed outcomes and predictions) we used calibration plots and the method proposed in [43], which fits a logistic regression model with the event indicator as the outcome and the linear predictor from the fitted model as the only covariate. Perfect calibration would occur when the intercept and slope are zero and one respectively, with the intercept calculated assuming a slope of one. To assess the accuracy of the probabilistic predictions we calculated the Brier score. For binary outcomes the Brier score is a quadratic scoring rule that quantifies how close predictions are to the actual outcomes by averaging the squared differences between actual outcomes y and predictions p, with values between 0 (perfect prediction) and 1 (worst prediction) [44].
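As a small worked illustration of two of these measures, the sketch below computes the Brier score and the threshold-based measures in plain Python. The function names are hypothetical, and the study itself performed the analysis in R.

```python
def brier_score(outcomes, predictions):
    """Mean squared difference between binary outcomes y and predictions p."""
    if not outcomes or len(outcomes) != len(predictions):
        raise ValueError("outcomes and predictions must be non-empty and aligned")
    squared = [(p - y) ** 2 for y, p in zip(outcomes, predictions)]
    return sum(squared) / len(squared)


def classification_measures(outcomes, predicted_labels):
    """Sensitivity, specificity and precision from binary labels.

    Assumes each denominator is non-zero, i.e. both classes occur and
    at least one positive label is predicted.
    """
    pairs = list(zip(outcomes, predicted_labels))
    tp = sum(1 for y, label in pairs if y == 1 and label == 1)
    fn = sum(1 for y, label in pairs if y == 1 and label == 0)
    fp = sum(1 for y, label in pairs if y == 0 and label == 1)
    tn = sum(1 for y, label in pairs if y == 0 and label == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }
```

A model that predicts 0.5 for every patient obtains a Brier score of 0.25, which is a useful reference point when reading the scores of the fitted models.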

3.8 Statistical Analysis

Since we had a high number of variables (up to hundreds), we used the Least Absolute Shrinkage and Selection Operator (LASSO) to allow for variable selection as part of fitting the model. LASSO considers a large number of variables while minimizing the model error as well as the risk of over-fitting [36]. LASSO works by penalizing the likelihood of the model, leading to shrinkage of variable coefficients, sometimes to zero (that is, unselecting them in the model). The tuning parameter λ controls the severity of the penalty and is usually chosen using cross-validation. The larger the parameter, the more coefficients are shrunk to zero. We used the glmnet library to perform LASSO regression. The analysis was performed using the R programming language, version 3.4, and data extraction was performed using the Python programming language. Missing data were imputed using single imputation (predictive mean matching for numeric data and logistic regression for categorical predictors) with the MICE package.
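The shrinkage effect of λ can be illustrated with the soft-thresholding operator. This is a simplified sketch, not the glmnet procedure used in the study: it corresponds to the special case of squared-error loss with orthonormal predictors, where the LASSO solution is the soft-thresholded unpenalized coefficient. The coefficient values below are made up for illustration.

```python
def soft_threshold(coefficient, lam):
    """Shrink a coefficient towards zero by lam; set it to zero if |coefficient| <= lam."""
    if coefficient > lam:
        return coefficient - lam
    if coefficient < -lam:
        return coefficient + lam
    return 0.0


def shrink_all(coefficients, lam):
    """Apply soft-thresholding to every named coefficient."""
    return {name: soft_threshold(value, lam) for name, value in coefficients.items()}


def count_selected(coefficients, lam):
    """Number of variables that remain in the model for a given lam."""
    return sum(1 for value in shrink_all(coefficients, lam).values() if value != 0.0)


# Hypothetical unpenalized coefficients, for illustration only.
EXAMPLE = {"fallhistory": 2.1, "N06": 0.35, "C10": -0.12, "A11": 0.04}
```

With these example values, increasing `lam` from 0.05 to 0.5 reduces the number of selected variables from three to one, mirroring the statement that a larger penalty shrinks more coefficients to zero.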

To find the most relevant variables in the prediction for each model, we used the average predictive comparison (APC) method [45]. The method estimates the expected difference in the outcome associated with a unit difference in one of the variables. It takes into consideration the uncertainty of the model parameters and averages over the population distribution of the variables, which is useful with nonlinear models. APC calculates the quantity of interest at two different values of the input of interest while holding the other inputs at their observed values, and averages the difference between the two quantities across all the observations.
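A simplified version of this computation for a logistic model is sketched below. It uses fixed coefficients and an unweighted average, so it deliberately omits the parameter uncertainty that the full method [45] accounts for; the function names and coefficient values in any usage are hypothetical.

```python
import math


def predict(coefficients, intercept, row):
    """Predicted probability of a logistic model for one observation."""
    z = intercept + sum(coefficients[name] * row[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def average_predictive_comparison(coefficients, intercept, rows, variable, low=0, high=1):
    """Average change in predicted probability when `variable` moves from low to high.

    All other inputs are held at their observed values for each observation.
    """
    differences = []
    for row in rows:
        row_low = dict(row, **{variable: low})
        row_high = dict(row, **{variable: high})
        differences.append(
            predict(coefficients, intercept, row_high)
            - predict(coefficients, intercept, row_low)
        )
    return sum(differences) / len(differences)
```

Because the difference is computed on the probability scale for every observation, the result depends on the observed values of the other variables, which is why the averaging step matters for a nonlinear model.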

3.9 Models Validation

The LASSO logistic regression models were validated with bootstrapping [46]. Bootstrapping is the preferred approach for validation of prediction models [47, 48]. To get an honest assessment of model performance, we included all modeling steps. In particular, we repeated LASSO model selection steps per bootstrap sample.

In bootstrapping we repeatedly sample from the original dataset, with replacement, forming a large number (B) of bootstrap datasets, each of the same size as the original data. To estimate the standard error of a parameter estimate, we fit our model to the original data, and then also fit the model to each of the B bootstrap datasets (B = 200 in our study). The variance of an estimator is the variance of its value across repeated samples from the population. Since we treat the bootstrap samples as if they were independent samples from the population, the variability of the point estimates across the B bootstrap datasets is our estimate of the variance of the parameter estimate obtained from fitting the model to the original observed data. To correct for optimism, we calculated for each bootstrap sample the difference between the predictive performance of the bootstrap model on that sample and its performance on the original dataset, and took the average across the 200 bootstrap samples. This estimate of optimism was then subtracted from the naive estimate of predictive performance on the original dataset.
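The optimism correction can be sketched generically as follows. The `fit` and `score` callables are placeholders supplied by the caller: `fit` is assumed to repeat all modeling steps (including LASSO selection and tuning) on the data it receives, and `score` to return a performance measure such as the AUC-ROC. The function name and signature are illustrative, not the study's code.

```python
import random


def optimism_corrected(data, fit, score, n_boot=200, seed=0):
    """Bootstrap optimism-corrected estimate of a performance measure.

    data  : list of observations
    fit   : callable(data) -> model, repeating every modeling step
    score : callable(model, data) -> float
    """
    rng = random.Random(seed)
    apparent = score(fit(data), data)  # naive estimate on the original data
    total_optimism = 0.0
    for _ in range(n_boot):
        # Resample with replacement, same size as the original dataset.
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        # Optimism: performance on the bootstrap sample minus performance
        # of the same model on the original data.
        total_optimism += score(model, sample) - score(model, data)
    return apparent - total_optimism / n_boot
```

Collecting the per-sample corrected values instead of only their average would give the distributions that are compared between models in the next paragraph.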

The distributions of the optimism-corrected values of AUC-ROC, AUC-PR, sensitivity, specificity, precision and the Brier score were compared between each model and the Expert-Opinion model. To test the statistical significance of the differences in the medians, we used the Wilcoxon signed-rank test. A p-value ≤


Chapter 4

Results

4.1 Study Population

The total number of patients was 46,520 in the MIMIC-III database. The number of patients aged 65 and older was 25,289 (54.4%). Of these older patients, 23,034 (91.1%) met the inclusion criteria. The number of fallers and non-fallers were 1,721 (7.5%) and 21,313 (92.5%) respectively. Table 4.1 depicts the baseline characteristics of the patients.

Variable | Non-Fallers | Fallers | p-value
N | 21,313 | 1,721 |
Age (mean (SD)) | 77.56 (7.7) | 81.2 (7.9) | <0.001
Ethnicity (%) | | | <0.001
  WHITE | 16,032 (75.2) | 1,453 (84.4) |
  BLACK | 1,603 (7.5) | 62 (3.6) |
  UNKNOWN | 2,304 (10.8) | 122 (7.1) |
  HISPANIC/LATINO | 424 (2.0) | 21 (1.2) |
  OTHER | 468 (2.2) | 33 (1.9) |
  ASIAN | 482 (2.3) | 30 (1.7) |
Marital Status (%) | | | <0.001
  MARRIED | 11,149 (52.3) | 828 (48.1) |
  WIDOWED | 5,608 (26.3) | 560 (32.0) |
  SINGLE | 3,044 (14.3) | 261 (15.2) |
  DIVORCED | 1,161 (5.4) | 67 (3.9) |
  LIFE PARTNER | 4 (0.0) | 0 (0.0) |
  SEPARATED | 155 (0.7) | 5 (0.3) |
  UNKNOWN | 192 (0.9) | 10 (0.5) |
Gender (%) | | | <0.001
  Male | 11,247 (52.8) | 828 (48.1) |
  Female | 10,066 (47.2) | 903 (51.9) |
Medications (mean (SD)) | 1.54 (1.31) | 1.42 (1.3) | <0.001
Polypharmacy (%) | | | 0.037
  Yes | 1,544 (7.2) | 101 (5.9) |
  No | 19,769 (92.8) | 1,630 (94.1) |


4.2 Extracted Variables

The number of extracted medications from the discharge summaries was 939. Fallers used 535 distinct medications while non-fallers used 926 distinct medications. Table 4.2 shows the top 20 medications registered on admission for both fallers and non-fallers, while Table 4.3 compares the comorbidities between fallers and non-fallers.

Medication | Non-Fallers (n=21,313) | Fallers (n=1,721) | p-value
Metoprolol (%) | 6,925 (32.5) | 462 (26.8) | <0.001
Aspirin (%) | 6,877 (32.3) | 370 (21.5) | <0.001
Furosemide (%) | 5,416 (25.4) | 368 (21.4) | <0.001
Lisinopril (%) | 4,958 (23.3) | 372 (21.6) | 0.126
Atorvastatin (%) | 4,708 (22.1) | 285 (16.6) | <0.001
Acetaminophen (%) | 3,827 (18) | 286 (16.6) | 0.173
Simvastatin (%) | 3,705 (17.4) | 285 (16.6) | 0.404
Warfarin (%) | 3,692 (17.3) | 351 (20.4) | 0.001
Docusate (%) | 3,406 (16) | 239 (13.9) | 0.024
Albuterol (%) | 3,126 (14.7) | 151 (8.8) | <0.001
Omeprazole (%) | 3,095 (14.5) | 242 (14.1) | 0.627
Levothyroxine (%) | 3,032 (14.2) | 243 (14.1) | 0.932
Atenolol (%) | 2,978 (14) | 239 (13.9) | 0.95
Pantoprazole (%) | 2,949 (13.8) | 144 (8.4) | <0.001
Hydrochlorothiazide (%) | 2,528 (11.9) | 210 (12.2) | 0.703
Clopidogrel (%) | 2,409 (11.3) | 145 (8.4) | <0.001
Amlodipine (%) | 2,392 (11.2) | 172 (10) | 0.129
Fluticasone (%) | 2,149 (10.1) | 115 (6.7) | <0.001
Ipratropium (%) | 2,056 (9.6) | 79 (4.6) | <0.001
Metformin (%) | 1,648 (7.7) | 116 (6.7) | 0.149

TABLE 4.2: The frequency of the top 20 medications registered on admission of fallers and non-fallers. The percentages represent within-group contribution of the medications.

Comorbidities | Non-Fallers (n=21,313) | Fallers (n=1,721) | p-value
Anxiety (%) | 1,104 (5.2) | 95 (5.5) | 0.579
Arthritis (%) | 1,923 (9) | 143 (8.3) | 0.341
Cancer (%) | 3,798 (17.8) | 199 (11.6) | <0.001
Depression (%) | 2,275 (10.7) | 216 (12.6) | 0.018
Diabetes (%) | 5,863 (27.5) | 375 (21.8) | <0.001
Dizziness (%) | 234 (1.1) | 19 (1.1) | 1
Gait Disturbance (%) | 210 (1) | 12 (0.7) | 0.295
Hearing Problem (%) | 450 (2.1) | 38 (2.2) | 0.857
Heart Disease (%) | 8,807 (41.3) | 541 (31.4) | <0.001
Hypertension (%) | 13,195 (61.9) | 1,000 (58.1) | 0.002
Lung Disease (%) | 2,933 (13.8) | 184 (10.7) | <0.001
Osteoporosis (%) | 1,662 (7.8) | 112 (6.5) | 0.06
Pain (%) | 1,189 (5.6) | 84 (4.9) | 0.244
Sleep Problem (%) | 684 (3.2) | 38 (2.2) | 0.026
Stroke (%) | 1,586 (7.4) | 132 (7.7) | 0.765
Urinary Incontinence (%) | 286 (1.3) | 29 (1.7) | 0.284
Visual Impairment (%) | 93 (0.4) | 5 (0.3) | 0.483

TABLE 4.3: The frequency of the extracted comorbidity variables for both fallers and non-fallers. The percentages represent within-group contribution of the comorbidities.


4.3 Missing Variables

Only the variable “marital status” had missing values (4.4% in total). The frequency of missing “marital status” in the fallers group (n=142, 8.3%) was double that of non-fallers.

4.4 Measures of Predictive Performance

Figure 4.1 compares the performance of the models. The highest AUC-ROC was achieved by the ATC-2 model. Compared with the Expert-Opinion model, the gain in AUC-ROC of the ATC-2 model was 1.6%, and this result was statistically significant (p-value <0.05). In contrast, the NO-MED model had the lowest AUC-ROC, but the difference from the Expert-Opinion model was not statistically significant. Figure 4.2 depicts the relationship between the AUC-ROC and the level of ATC grouping. It can be seen that the AUC-ROC initially increases when going up in the hierarchy of the ATC, peaks at level two and then drops at the first level of the ATC.

Regarding AUC-PR, the models ATC-2, ATC-3, ATC-4 and ATC-5 outperformed the Expert-Opinion model and the differences were statistically significant. Sensitivity was significantly higher for the models ATC-2 and ATC-3 compared to the Expert-Opinion model. On the other hand, the Expert-Opinion model was significantly better in terms of specificity. In terms of precision, the Expert-Opinion model had a higher value than all the other models except the ATC-4 model, though the difference between the ATC-4 model and the Expert-Opinion model was not statistically significant. In comparison with the Expert-Opinion model, the best significant Brier score values (lower is better) were achieved by the ATC-2 and ATC-3 models.

As shown in Table 4.4, the lowest Akaike Information Criterion (AIC) estimate was for the model ATC-2 (-16838.66), followed by the model ATC-3 (-16655.52). The numbers of selected variables for these two models were 110 for the model ATC-3 and 70 for the model ATC-2. The AIC and the number of selected variables for the Expert-Opinion model fell within the range of the other models (AIC: -16599.76, selected variables: 45).

Table 4.5 and Figure 4.3 show the calibration performance of the models. While the calibration intercepts and slopes were close to zero and one respectively (the 95% CIs spanned these values), indicating sufficient calibration, miscalibration was observed in the plots of the calibration curves of the ATC-4 and ATC-5 models in the intervals (0.2-0.5) and (0.2-0.7) respectively, suggesting an overestimation of the predicted probability.


FIGURE 4.1: Boxplots of various optimism-corrected performance measures after bootstrapping (B=200): (A) AUC-ROC, (B) AUC-PR, (C) sensitivity, (D) specificity, (E) precision and (F) Brier score. The reported p-value is that of the Wilcoxon signed-rank test of the difference in the median between every model and the Expert-Opinion model. The numbers are rounded to four decimal places.

Models | AIC | Total Variables | Selected Variables
NO-MED | -16406.67 | 38 | 28
ATC-1 | -16615.84 | 52 | 37
ATC-2 | -16838.66 | 118 | 65
ATC-3 | -16655.52 | 212 | 110
ATC-4 | -16444.1 | 465 | 121
ATC-5 | -16369.3 | 977 | 196
Expert-Opinion | -16599.76 | 65 | 45

TABLE 4.4: Akaike Information Criterion (AIC) and the number of selected variables after applying LASSO regression on the entire dataset.


FIGURE 4.2: The relationship between the AUC-ROC and the different levels of grouping using the ATC. Note that the y-axis does not start from zero.

Model | Calibration Intercept (CI 95%) | Calibration Slope (CI 95%)
NO-MED | 0.0132 (-0.1012-0.1243) | 0.9979 (0.9532-1.0441)
ATC-1 | 0.0133 (-0.1013-0.1244) | 0.9970 (0.9524-1.0433)
ATC-2 | 0.0220 (-0.0928-0.1333) | 0.9981 (0.953-1.0449)
ATC-3 | 0.0190 (-0.0961-0.1308) | 0.9917 (0.9465-1.0385)
ATC-4 | 0.0220 (-0.093-0.1336) | 0.9945 (0.9494-1.0413)
ATC-5 | 0.0148 (-0.1003-0.1266) | 0.9937 (0.9487-1.0404)
Expert-Opinion | 0.0129 (-0.1020-0.1244) | 0.9937 (0.9488-1.0401)

TABLE 4.5: Calibration intercepts (calibration-in-the-large) and calibration slopes for the models created by fitting LASSO models on 80% of the data and testing on the remaining 20%. The calibration intercepts were estimated assuming a slope of one, and the calibration slopes assuming an intercept of zero. Sufficient calibration would occur if the 95% confidence intervals for the calibration intercept and slope contain zero and one respectively. The numbers are rounded to four decimal places.

The average predictive comparisons (APC) of all the models are found in supplementary appendix B. The highest APC of 90% was for the history of fall in all the models, implying that, on average in the data, individuals with a history of fall are 90% more likely to fall, compared to individuals who did not experience a fall before. The APC calculated for a 5-year increase in age was 1% for all the models, suggesting a 1% increase in risk for every 5 years above the mean age. For the Expert-Opinion model, there was a positive association of fall with the use of certain medication groups such as SSRIs (APC = +3.96%), drugs used in urinary incontinence (APC = +3.96%), antiparkinson drugs (APC = +3.51%), antiepileptic drugs (APC = +2.25%) and cardiac glycosides (APC = +1.16%). Conversely, individuals who used other groups of medications were less likely to fall, for example antihistamines (APC = -1.81%), proton pump inhibitors (APC = -1.63%), vasodilators used in cardiac disease (APC = -1.47%) and selective beta blockers (APC = -0.63%). The model ATC-2 had new groups of medications besides the groups obtained in the Expert-Opinion model, for instance M05-drugs for treatment of bone diseases (APC = +1.27%), S03-ophthalmological and otological preparations (APC = +2.00%), D02-emollients and protectives (APC = +1.11%) and M04-antigout preparations (APC = +1.03%). Similarly, the ATC-3 model revealed other groups of medications, such as H03B-antithyroid preparations (APC = +2.37%), S03B-corticosteroids (APC = +2.33%) and B01A-antithrombotic agents (APC = +0.53%).

FIGURE 4.3: Calibration curves of the seven models created by fitting LASSO on 80% of the data and testing on the remaining 20%: (A) NO-MED, (B) ATC-1, (C) ATC-2, (D) ATC-3, (E) ATC-4, (F) ATC-5 and (G) Expert-Opinion. The diagonal line indicates the ideal calibration line.


Chapter 5

Discussion and Future Direction

Findings

This study compared expert-based grouping of FRIDs, based on literature and knowledge, with grouping of all medications using the ATC, with respect to the performance of fall risk prediction in elders who were admitted to the hospital due to a fall. Our main finding was that the performance of a fall prediction model is affected by the way medications are grouped. In particular, grouping all medications at level two of the ATC significantly outperformed the model with expert-based grouping of FRIDs.

Our reported AUC-ROC (71%) is in line with previous results of related works [18–21]. Additionally, we found that the discrimination performance increased by between 1.6% and 3.3%, depending on the ATC grouping level, when medications were added to the demographic and diagnosis variables. This finding is consistent with [19], which demonstrated that model performance is not driven solely by the demographic and diagnosis variables, as medication variables improved model performance. Although individual medications contributed to an increase in the AUC-ROC, this gain was noticeably higher when the ATC was used to group these medications. This finding broadly corroborates the findings of [49], which utilized two drug ontologies and a greedy top-down search strategy to perform variable selection based on the hierarchical drug ontology. It has been suggested that integrating a data-driven model with a domain hierarchy and applying Tree-LASSO [50] improves the generalizability and interpretability of a prediction model without an impact on the predictive performance [51, 52]. This does not appear to be the case in our study, as we also found an impact on the performance. Our finding of a 1.6% gain in AUC-ROC can be explained by the definition of FRIDs. The list of FRIDs used was based on expert opinion and literature reviews. Therefore, the performance of the Expert-Opinion model is limited to those FRIDs included by definition. In contrast, grouping all medications at a certain ATC level allows the investigation of new possible FRIDs that are not yet in the list of FRIDs. For example, we observed groups of medications that were associated with falls and are not in the list of FRIDs, such as H03B-antithyroid preparations, S03B-corticosteroids, M05B-drugs affecting bone structure and mineralization, S03-ophthalmological and otological preparations, M04A-antigout preparations and B01AA-vitamin K antagonists. Consistent with [9], this study supports the evidence of a possible association of falls in elderly patients with other groups of medications not yet identified.

The average increase in the AUC does not per se imply clinical relevance. On the one hand, an improvement in fall risk prediction is important for discriminating patients with an increased risk of falls, allowing clinicians to begin interventions earlier to minimize the risk. On the other hand, parsimonious models (simple models with great explanatory predictive power) are often preferred in clinical practice. That is, a simple model with few variables is preferable because it eases interpretation and decision making. However, the only difference between the models is the way medications are grouped, so the increase in the number of variables does not involve collecting new data or performing new measurements. Although the ATC-2 model had more selected variables than the Expert-Opinion model, its AIC was substantially lower, suggesting that the ATC-2 model is more parsimonious. Furthermore, a prediction model with higher sensitivity than specificity is important when it will be used to screen individuals at high risk of falling and at high risk of suffering fall-related injuries. For example, the ATC-2 model is able to identify 12% more people at risk of falling than the Expert-Opinion model (ATC-2 model sensitivity = 0.701 [95% CI, 0.683-0.718], Expert-Opinion model sensitivity = 0.583 [95% CI, 0.568-0.601]). However, the ATC-2 model had a lower specificity than the Expert-Opinion model, which is undesirable because of the increase in false positives and, consequently, in unwanted costs associated with the interventions. A net benefit analysis may be necessary to weigh this trade-off between sensitivity and specificity in order to select the desired model.

Although the calibration plots showed good calibration for all the models except ATC-4 and ATC-5, the calibration intercepts and slopes indicated good calibration for all the models. This apparent contradiction may be due to the low proportion of patients with a predicted probability between 0.2 and 1.0, which led to a deviation of the curve after smoothing. According to [53], the calibration intercept and slope are more informative and are the preferred method for assessing calibration. Therefore, all the models are deemed to be well calibrated and could be used in clinical practice.

Of note is the non-linear relationship between the level of medication grouping using the ATC and the performance measures. For example, the AUC-ROC increases when going up in the hierarchy of the ATC, peaks at level two and then drops at the first level of the ATC. The optimal point (tuning point) can be obtained through experimentation. Our finding of the influence of grouping variables using a classification system is consistent with the finding of [54]. However, we also found that the predictive performance depends on the selected performance measure. For example, the AUC-ROC increases with medication generalization while the AUC-PR decreases.

Strengths and limitations

To our knowledge, this is the first study to investigate the effect of medication grouping on fall prediction accuracy. In addition, our study is the first to develop a model for falls that result in a hospital visit using the MIMIC-III database, although other prediction models for the risk of future falls have been developed for residents of nursing homes [55], inpatients [56], members of a cohort study on aging and mobility [57] and the general elderly population [19]. Another strength is that we proposed a simple general algorithm to extract medications and medical concepts from narrative clinical text. Finally, we systematically evaluated different performance measures and validated these scores using bootstrapping.

Our study also has limitations. First, although the MIMIC-III database is quite large, the study population may not be fully representative of the population, as falls do not always result in a hospital visit. Second, although the MIMIC-III database has ICD-9 codes to encode the diagnoses of the patients, these codes were generated for billing purposes at the end of the hospital stay and do not represent the reason of admission. The outcome of “fall” was based on the presence of a corresponding ICD-9 code for fall. Therefore, it is unknown whether the fall incident occurred before the admission, was the reason of admission, or occurred during the hospital stay. This limitation may have affected the overall estimation of fall risk and the classification performance, because we considered only home medications registered on admission and not medications given during the hospital stay. Third, we could not validate the accuracy of the algorithms used to extract medications on admission and comorbidities because of the lack of an annotated dataset. In addition, a possible source of error is that we did not handle negated phrases when extracting comorbidity variables. For example, the phrase ‘no history of stroke’ was labeled incorrectly as having a stroke.

The effect of the abovementioned limitations is mitigated because our aim was to investigate the effect of medication grouping on prediction performance, not to estimate a causal link between the predictive variables and the outcome nor to establish a definitive prediction model. Therefore, we believe that these limitations had negligible consequences for the obtained results.


Implications

Our study has implications for clinicians. Although the FRIDs list used in this study was extensive and matches other international FRIDs lists [16, 24, 35], the present study raises the possibility that more medication groups are associated with falls. It might therefore be unnecessary to limit the scope of medications to FRIDs in prediction models.

Our findings also have implications for researchers establishing prediction models. The use of domain knowledge (e.g. ontologies and classification systems) not only reduces the dimensionality of the dataset, and hence yields a more interpretable model, but might also enhance the performance of the prediction model.

Future work

So far, we have only considered medication grouping at various levels of the ATC. A possible interesting extension is to apply Tree-LASSO, as it can exploit the hierarchical domain knowledge together with data. Another important work that warrants further investigation is validating our results on another dataset. In addition, developing a risk-profile for falls that result in a hospital visit might be an interesting area for future research.


Appendix A

Common Abbreviations Used in Medication Extraction

Abbreviation | Medication
cpm | chlorpheniramine
dl | deciliter
dx | dextromethorphan
dxm | dextromethorphan
ees | erythromycin
gg | guaifenesin
ggpe | guaifenesin/phenylephrine
hgb | hemoglobin
hct | hydrochlorothiazide
hctz | hydrochlorothiazide
dh | hydrocodone
hd | hydrocodone
hc | hydrocortisone
inh | isoniazid
mtx | methotrexate
msc | methscopolamine
ms | morphine sulfate
ms | multiple sclerosis
ntg | nitroglycerin
nph | insulin novolin n
pcn | penicillin g
pb | phenobarbital
peg | polyethylene glycols
ptu | propylthiouracil
pse | pseudoephedrine
smz-tmp | sulfamethoxazole/trimethoprim
tac | triamcinolone
tmp | trimethoprim


Appendix B

Average Predictive Comparison (APC)

ATC-1 Model: Average Predictive Comparison

TABLE B.1: The average predictive comparison of ATC-1 model

Variable | Falling Risk (%) | Prevalence (n) | Prevalence (%)
fallhistory | 91.4% + | 83 | 0.4%
heartdisease | 2.8% - | 9348 | 40.6%
cancer | 2.7% - | 3997 | 17.4%
ethnicityWHITE | 2.3% + | 17485 | 75.9%
ethnicityBLACK | 1.5% - | 1665 | 7.2%
marital_statusSEPARATED | 1.4% - | 160 | 0.7%
marital_statusUNKNOWN | 1.3% - | 202 | 0.9%
visualimpairment | 1.3% - | 98 | 0.4%
ethnicityOTHER | 1.2% + | 501 | 2.2%
hearingproblem | 1.2% - | 488 | 2.1%
depression | 1.1% + | 2491 | 10.8%
age (every 5 years increase above mean age) | 1.1% + | 23034 | 100.0%
osteoperosis | 0.8% - | 1774 | 7.7%
marital_statusSINGLE | 0.8% + | 3305 | 14.3%
lungdisease | 0.7% - | 3117 | 13.5%
pain | 0.6% - | 1273 | 5.5%
arthritis | 0.6% - | 2066 | 9.0%
marital_statusDIVORCED | 0.5% - | 1228 | 5.3%
hypertension | 0.4% - | 14195 | 61.6%
num_med | 0.4% - | 35310 | 153.3%
stroke | 0.3% + | 1718 | 7.5%
gaitdist | 0.3% - | 222 | 1.0%
diabetes | 0.2% - | 6238 | 27.1%
marital_statusMARRIED | 0.2% + | 11977 | 52.0%
genderF | 0.2% + | 10959 | 47.6%
marital_statusWIDOWED | 0.1% - | 6158 | 26.7%
ethnicityHISPANIC.LATINO | 0.0% - | 445 | 1.9%
dizziness | 0.0% - | 253 | 1.1%


ATC-2 Model: Average Predictive Comparison

TABLE B.2: The average predictive comparison of ATC-2 model

Variable | Falling Risk (%) | Prevalence (n) | Prevalence (%)
fallhistory | 89.83% + | 83 | 0.4%
C04.peripheral.vasodilators | 4.25% - | 41 | 0.2%
N06.psychoanaleptics | 3.73% + | 5539 | 24.0%
L03.immunostimulants | 3.40% + | 49 | 0.2%
cancer | 2.46% - | 3997 | 17.4%
A01.stomatological.preparations | 2.37% - | 8347 | 36.2%
D08.antiseptics.and.disinfectants | 2.34% - | 119 | 0.5%
heartdisease | 2.25% - | 9348 | 40.6%
N04.anti.parkinson.drugs | 2.19% + | 582 | 2.5%
ethnicityWHITE | 2.03% + | 17485 | 75.9%
S03.ophthalmological.and.otological | 2.00% + | 966 | 4.2%
J01.antibacterials.for.systemic.use | 1.77% - | 3391 | 14.7%
A02.drugs.for.acid.related.disorders | 1.59% - | 9534 | 41.4%
R01.nasal.preparations | 1.40% - | 4365 | 19.0%
C02.antihypertensives | 1.38% - | 1299 | 5.6%
marital_statusUNKNOWN | 1.34% - | 202 | 0.9%
ethnicityBLACK | 1.33% - | 1665 | 7.2%
N03.antiepileptics | 1.31% + | 2842 | 12.3%
M05.drugs.for.treatment.of.bone.diseases | 1.27% + | 1031 | 4.5%
D02.emollients.and.protectives | 1.11% + | 146 | 0.6%
age (every 5 years increase above mean age) | 1.07% + | 23034 | 100.0%
R06.antihistamines.for.systemic.use | 1.06% - | 1007 | 4.4%
M04.antigout.preparations | 1.03% + | 1674 | 7.3%
hearingproblem | 1.01% - | 488 | 2.1%
S01.ophthalmologicals | 0.98% - | 5477 | 23.8%
G01.gynecological.antiinfs.and.antisep | 0.95% - | 1877 | 8.1%
J02.antimycotics.for.systemic.use | 0.89% - | 427 | 1.9%
P01.antiprotozoals | 0.86% - | 787 | 3.4%
A07.antidiarrheals intestinal antiinflam antiinf | 0.84% - | 3293 | 14.3%
visualimpairment | 0.83% - | 98 | 0.4%
A16.other.alimentary.tract.and.metabolism | 0.80% - | 13 | 0.1%
marital_statusSEPARATED | 0.76% - | 160 | 0.7%
G04.urologicals | 0.74% + | 2704 | 11.7%
S02.otologicals | 0.71% + | 1749 | 7.6%
V04.diagnostic.agents | 0.64% + | 107 | 0.5%
osteoperosis | 0.63% - | 1774 | 7.7%
C10.lipid.modifying.agents | 0.60% - | 10734 | 46.6%
A06.drugs.for.constipation | 0.60% - | 4503 | 19.5%
ethnicityOTHER | 0.55% + | 501 | 2.2%
D11.other.dermatological.preparations | 0.54% - | 1115 | 4.8%
marital_statusSINGLE | 0.51% + | 3305 | 14.3%
B02.antihemorrhagics | 0.51% + | 59 | 0.3%
pain | 0.46% - | 1273 | 5.5%
marital_statusDIVORCED | 0.44% - | 1228 | 5.3%
arthritis | 0.44% - | 2066 | 9.0%
A03.drugs.for.func.gi.disorders | 0.38% - | 575 | 2.5%
R03.drugs.for.obstructive.airway | 0.33% - | 4355 | 18.9%
A12.mineral.supplements | 0.33% + | 3969 | 17.2%
R05.cough.and.cold.preparations | 0.28% - | 1100 | 4.8%
G03.sex.hormones.and.mod.of.genital | 0.27% - | 457 | 2.0%
hypertension | 0.26% - | 14195 | 61.6%
A11.vitamins | 0.24% + | 959 | 4.2%
C07.beta.blocking.agents | 0.22% - | 12384 | 53.8%
L02.endocrine.therapy | 0.16% - | 383 | 1.7%
H03.thyroid.therapy | 0.14% - | 3356 | 14.6%
stroke | 0.13% + | 1718 | 7.5%
marital_statusWIDOWED | 0.10% - | 6158 | 26.7%
marital_statusMARRIED | 0.10% + | 11977 | 52.0%
N07.other.nervous.system.drugs | 0.06% - | 313 | 1.4%
B01.antithrombotic.agents | 0.04% + | 11910 | 51.7%
diabetes | 0.03% - | 6238 | 27.1%
C03.diuretics | 0.02% - | 8984 | 39.0%


ATC-3 Model: Average Predictive Comparison

Table B.3: The average predictive comparison of the ATC-3 model

| Variable | Falling Risk (%) | Sign (+/-) | Prevalence (n) | Prevalence (%) |
|---|---|---|---|---|
| fallhistory | 89.97% | + | 83 | 0.4% |
| C03X.other.diuretics | 71.93% | + | 1 | 0.0% |
| R01B.nasal.decongestants.for.systemic.use | 15.12% | + | 28 | 0.1% |
| L01D.cytotoxic.antibiotics.and.related | 10.48% | + | 2 | 0.0% |
| D01B.antifungals.for.systemic.use | 8.67% | + | 12 | 0.1% |
| C04A.peripheral.vasodilators | 4.36% | - | 41 | 0.2% |
| B05A.blood.and.related.products | 3.78% | - | 6 | 0.0% |
| G02A.uterotonics | 3.69% | - | 21 | 0.1% |
| L03A.immunostimulants | 3.36% | + | 49 | 0.2% |
| N06A.antidepressants | 3.32% | + | 4983 | 21.6% |
| D03B.enzymes | 3.01% | - | 22 | 0.1% |
| cancer | 2.51% | - | 3997 | 17.4% |
| D02B.protectives.against.uv.radiation | 2.51% | + | 145 | 0.6% |
| J01C.beta.lactam.penicillins | 2.50% | - | 564 | 2.4% |
| H05A.parathyroid.hormoness | 2.49% | - | 10 | 0.0% |
| H03B.antithyroid.preparations | 2.37% | + | 76 | 0.3% |
| S03B.corticosteroids | 2.33% | + | 430 | 1.9% |
| D07X.corticosteroids.other.combinations | 2.25% | + | 841 | 3.7% |
| heartdisease | 2.23% | - | 9348 | 40.6% |
| N04B.dopaminergic.agents | 2.10% | + | 535 | 2.3% |
| R01A.decongestants.and.other.nasal.topical | 2.06% | - | 4357 | 18.9% |
| ethnicityWHITE | 2.03% | + | 17485 | 75.9% |
| G03C.estrogens | 1.97% | + | 71 | 0.3% |
| L02A.hormones.and.related.agents | 1.94% | - | 185 | 0.8% |
| B02B.vitamin.k.and.other.hemostatics | 1.84% | + | 54 | 0.2% |
| C05B.antivaricose.therapy | 1.76% | - | 922 | 4.0% |
| N06D.anti.dementia.drugs | 1.67% | + | 842 | 3.7% |
| N02B.other.analgesics.and.antipyretics | 1.64% | - | 7271 | 31.6% |
| J01X.other.antibacterials | 1.55% | - | 1245 | 5.4% |
| D08A.antiseptics.and.disinfectants | 1.53% | - | 119 | 0.5% |
| G03X.other.sex.hormones of the.genital | 1.47% | - | 140 | 0.6% |
| C02D.arteriolar.smooth.muscle | 1.46% | - | 549 | 2.4% |
| D06A.antibiotics.for.topical.use | 1.43% | + | 240 | 1.0% |
| G01A.antiinfectives antiseptics excl CS | 1.42% | - | 1877 | 8.1% |
| G04B.urologicals | 1.41% | + | 1009 | 4.4% |
| A16A.other.alimentary.and.metabol | 1.40% | - | 13 | 0.1% |
| N03A.antiepileptics | 1.38% | + | 2842 | 12.3% |
| A03B.belladonna.and.derivatives.plain | 1.38% | - | 174 | 0.8% |
| ethnicityBLACK | 1.37% | - | 1665 | 7.2% |
| J01M.quinolone.antibacterials | 1.34% | - | 1119 | 4.9% |
| R06A.antihistamines.for.systemic.use | 1.31% | - | 1007 | 4.4% |
| N07A.parasympathomimetics | 1.26% | - | 136 | 0.6% |
| A01A.stomatological.preparations | 1.25% | - | 8347 | 36.2% |
| A02B.drugs.for.peptic.ulcer.and.gerd | 1.20% | - | 8884 | 38.6% |
| M05B.drugs.affecting.bone.structure | 1.19% | + | 1031 | 4.5% |
| R05C.expectorants.excl.combi.cough | 1.18% | - | 403 | 1.7% |
| marital_statusUNKNOWN | 1.17% | - | 202 | 0.9% |
| J02A.antimycotics.for.systemic.use | 1.14% | - | 427 | 1.9% |
| D10A.anti.acne.preparations.for.topical.use | 1.11% | - | 553 | 2.4% |
| age (every 5 years increase above mean age) | 1.08% | + | 23034 | 100.0% |
| M04A.antigout.preparations | 1.06% | + | 1674 | 7.3% |
| D03A.cicatrizants | 1.04% | + | 348 | 1.5% |
| S01X.other.ophthalmologicals | 1.03% | - | 1667 | 7.2% |
| hearingproblem | 0.99% | - | 488 | 2.1% |
| L01C.plant.alkaloids.and.other.natural | 0.98% | - | 5 | 0.0% |
| J01F.macrolides.lincosamides.and.str | 0.93% | - | 334 | 1.5% |
| C01D.vasodilators.used.in.cardiac.diseases | 0.88% | - | 2278 | 9.9% |
| N07X.other.nervous.system.drugs | 0.87% | - | 9 | 0.0% |
| A02A.antacids | 0.87% | - | 1535 | 6.7% |
| visualimpairment | 0.87% | - | 98 | 0.4% |
| marital_statusSEPARATED | 0.87% | - | 160 | 0.7% |
| A12B.potassium | 0.87% | + | 821 | 3.6% |
| C09X.other.agents.acting.on.the.RAAS | 0.85% | - | 4 | 0.0% |
| P01B.antimalarials | 0.84% | - | 236 | 1.0% |
| V04C.other.diagnostic.agents | 0.82% | + | 107 | 0.5% |
| J01E.sulfonamides.and.trimethoprim | 0.73% | - | 424 | 1.8% |
| J01G.aminoglycoside.antibacterials | 0.69% | - | 119 | 0.5% |
| C01A.cardiac.glycosides | 0.67% | + | 1599 | 6.9% |
| A07E.intestinal.antiinflammatory.agents | 0.66% | - | 2147 | 9.3% |
| osteoperosis | 0.62% | - | 1774 | 7.7% |
| ethnicityOTHER | 0.62% | + | 501 | 2.2% |
| J01D.other.beta.lactam.antibacterials | 0.60% | - | 644 | 2.8% |
| C03C.high.ceiling.diuretics | 0.57% | - | 6189 | 26.9% |
| arthritis | 0.56% | - | 2066 | 9.0% |
| B01A.antithrombotic.agents | 0.53% | + | 11910 | 51.7% |
| C10A.lipid.modifying.agents.plain | 0.53% | - | 10734 | 46.6% |
| marital_statusSINGLE | 0.53% | + | 3305 | 14.3% |
| pain | 0.52% | - | 1273 | 5.5% |
| L01B.antimetabolites | 0.51% | + | 145 | 0.6% |
| A06A.drugs.for.constipation | 0.51% | - | 4503 | 19.5% |
| L01X.other.antineoplastic.agents | 0.50% | - | 355 | 1.5% |
| S01E.antiglaucoma.preparations.and.miotics | 0.47% | - | 1656 | 7.2% |
| marital_statusDIVORCED | 0.46% | - | 1228 | 5.3% |
| C03B.low.ceiling.diuretics.excl.thiazides | 0.46% | + | 279 | 1.2% |
| C02C.antiadrenergic.agents.peripherally.acting | 0.45% | - | 467 | 2.0% |
| N06B.psychostimulants.for.adhd.and.notropi | 0.43% | + | 273 | 1.2% |
| D11A.other.dermatological.preparations | 0.42% | - | 1115 | 4.8% |
| B03B.vitamin.b12.and.folic.acid | 0.33% | + | 1756 | 7.6% |
| C08C.selective.CCB.vascular.effects | 0.31% | - | 3387 | 14.7% |
| N01A.anesthetics.general | 0.27% | - | 406 | 1.8% |
| H03A.thyroid.preparations | 0.26% | - | 3281 | 14.2% |
| hypertension | 0.25% | - | 14195 | 61.6% |
| C03A.low.ceiling.diuretics.thiazides | 0.23% | + | 2751 | 11.9% |
| C07A.beta.blocking.agents | 0.22% | - | 12384 | 53.8% |
| N05A.antipsychotics | 0.22% | - | 1523 | 6.6% |
| A12A.calcium | 0.19% | + | 2513 | 10.9% |
| N02A.opioids | 0.19% | + | 2871 | 12.5% |
| marital_statusWIDOWED | 0.13% | - | 6158 | 26.7% |
| A11C.vitamin.a.and.d.incl.combi | 0.12% | + | 641 | 2.8% |
| stroke | 0.12% | + | 1718 | 7.5% |
| marital_statusMARRIED | 0.10% | + | 11977 | 52.0% |
| N05B.anxiolytics | 0.09% | + | 2376 | 10.3% |
| M02A.topical.for.joint.and.muscular.pain | 0.09% | - | 927 | 4.0% |
| A12C.other.mineral.supplements | 0.09% | - | 1130 | 4.9% |
| S01G.decongestants.and.antiallergics | 0.08% | + | 114 | 0.5% |
| A10A.insulins.and.analogues | 0.06% | - | 1131 | 4.9% |
| C09C.angiotensin.ii.antagonists.plain | 0.06% | - | 2279 | 9.9% |
| B03A.iron.preparations | 0.04% | - | 960 | 4.2% |
| C01C.cardiac.stimulants.excl.cardiac.glycosides | 0.01% | + | 220 | 1.0% |
