
Black-Box Predictive Classifier and Model-Agnostic Explanator

Application in Text Data: A Study of Interpretability

2019 – 2020

Assignment: Master’s Thesis MSc Marketing Intelligence

First Supervisor: dr. Keyvan Dehmamy

Second Supervisor: dr. P.S. (Peter) van Eck

Student: Dezaldy Irfianza Irfan

Student Number: S3156109

Date of Submission: 10 January 2020


Black-Box Predictive Classifier and Model-Agnostic Explanator

Application in Text Data: A Study of Interpretability

Faculty of Economics and Business

Rijksuniversiteit Groningen

Master’s Thesis

MSc Marketing Intelligence

First Supervisor: dr. Keyvan Dehmamy

Second Supervisor: dr. P.S. (Peter) van Eck

Date of Submission: 10 January 2020

Dezaldy Irfianza Irfan

S3156109


ABSTRACT

Interpretability of predictive classifiers in marketing is becoming more important due to growing concerns over the use of private consumer data to map preferences and behaviour. The author conducted a study to measure the interpretability of a predictive classifier on a marketing dataset. The research used the IMDB text dataset of 50,000 reviews classified into positive or negative sentiment. The predictive classifiers used were a neural network and sentiment analysis, representing black-box and white-box models respectively. LIME was used as the model-agnostic explanator. Comparing the accuracy of the predictive classifiers on the basis of complexity, the findings suggest that the black-box model performs better but suffers from low interpretability; the black-box neural network model classifies the data with 87.7% accuracy. Furthermore, a sample of 156 participants took part in a survey to measure interpretability. The results suggest that interpretability decreases as the complexity of the model increases (F = 31.503), but that incorporating a model-agnostic explanator significantly increases interpretability (t = -8.167). The research therefore supports the hypothesis that models with higher complexity achieve better accuracy at the expense of interpretability, which can be improved through the incorporation of a model-agnostic explanator.


ACKNOWLEDGEMENT

I hereby present the result of months of reading, writing, and programming in the form of a Master's research paper. This marks the close of a one-and-a-half-year journey as a student of the MSc Marketing Intelligence at Rijksuniversiteit Groningen. I am proud of the hours I put into this research and the work it took to get to this point. I would like to use this platform to thank my first supervisor, dr. Keyvan Dehmamy, for the guidance over the past few months. I hope that you found our relentless back-and-forth debates as meaningful as I did. I would also like to thank my second supervisor, dr. P.S. (Peter) van Eck, for investing the time to evaluate my work and provide constructive feedback.


TABLE OF CONTENTS

Introduction

Theoretical Framework

Macroeconomic Context

Predictive Accuracy and Interpretability

Model-Agnostic Explanator

Conceptual Framework

Research Design

Data

Procedure and Measurement

Hypothesis Testing

Results

Exploratory Analysis

Sentiment Analysis

Neural Network Analysis

Model-Agnostic Explanator

Interpretability Analysis

Discussions and Recommendations

Appendix B: Exploratory analysis

Appendix C: Sentiment analysis

Appendix D: Neural network model

Appendix E: Model-agnostic Explanator LIME

Appendix F: Interpretability

Appendix G: Survey question set example

LIST OF FIGURES AND TABLES

Figure 1: Conceptual framework

Figure 2: Visual representation of most used words in the IMDB dataset

Figure 3: Review text of a randomly selected observation

Figure 4: Fit of adjusted model of neural network

Figure 5: Visual explanation of LIME on the first observation

Table 1: Summary statistics of sentiment analysis


INTRODUCTION

In recent years, predictive classifiers have become increasingly popular in fields such as healthcare, transport, and marketing. By using sophisticated algorithms from machine learning methods, predictive classifiers are able to map feature variables and classify observations or subjects into predetermined categories (Guidotti et al 2018). As various markets grow in size and sophistication, the demand for scalable tools to process a myriad of information with high efficiency increases. Furthermore, the wide accessibility of the world wide web means that many industries have been tapping into and capitalising on the growing presence of their consumers online. The aforementioned industries have long shifted from a labour-intensive solution of employing millions of workers to utilising tools that do more heavy lifting in much less time and at lower cost (Ghazinour et al 2013).

Machine learning, and by extension the predictive classifier, grew from a tool whose main objective was to diagnose computer issues into a classification tool based on users' activities, interests, and behaviours. For predictive classifiers to be effective, the most important ingredient is rich data on the subject being studied. The main concern arises when the data a classifier is predicting concerns real people, whereby extensive observations need to be acquired, posing a concern to people's privacy. Moreover, as more features are included to capture real-world situations, the inner workings of predictive classifiers become a challenge to understand for the groups of people directly affected by them (Fong and Vedaldi 2017).


2019), to which each category is assigned its own customised content. The issue violates voter rights in the United States, as voters whose information was used without their consent were part of an elaborate campaign strategy in favour of a certain presidential candidate. In light of this, the European Council put into law the General Data Protection Regulation (GDPR) in May 2018. One provision of the GDPR entails that data subjects (e.g. consumers) have the right to be informed of what data collectors (e.g. e-commerce firms) do with the subjects' personal data (The European Parliament and Council of the European Union 2016, Articles 12-23).

As a result of the challenges faced by data collectors, transparency of predictive classifiers becomes a requirement. Data collectors and machine learning researchers are pressured into coming up with a way to explain the inner workings of machine learning methods and predictive classifiers through an explanator. In order to be universally usable, the explanator needs to be model-agnostic. Model-agnostic means that the method of explanation can be used against different types of learners and classifiers (Krause, Perer, and Ng 2016). This is particularly important in the field of marketing, where data collection is used to map the behaviour and preferences of consumers using different methods. We therefore conduct research to answer the question of whether a model-agnostic explanator can be applied to an illustrative marketing dataset. The objective of our research is to make predictive classifiers relatively more interpretable for a more universal audience in order to, among other things, mitigate privacy issues, develop pragmatic due diligence, and make better strategic decisions.

THEORETICAL FRAMEWORK

Macroeconomic Context:


lives easier for most people, as users are, for instance, able to obtain insights about relevant products or services based on their web searches or similar activities (Ghazinour et al 2013). On the other hand, the intuitiveness and scalability of predictive classifiers result in a lower incurred cost of human capital in improving the standard of living, at the expense of privacy (Bellovin et al 2013; Grimmer 2014; Jordan and Mitchell 2015). To illustrate, it is common practice for health insurance providers to use predictive classifiers to classify applicants into different insurance groups based on individual health information and subsequently process claims based on historical payment data (Murdoch and Detsky 2013). In addition, the advantages and disadvantages of classifiers can be seen in organisations with a digital presence, such as Amazon using personal information of users to promote its products. This is an important illustration of the use of classifiers in the field of marketing, particularly digital marketing. As the internet becomes accessible to more people, one way to mitigate privacy issues is to improve classifier protocols so that they pay attention to private and sensitive information and become more understandable to a general audience (Mohassel and Zhang 2017).


moving at a faster rate than the law (Barratt, Lenton, and Allen 2013; Oswell 1998; Wallsten 2005). It is interesting to witness, as predictive classifiers are extensively used in the practice of law, where precedents are included to assess whether a legal case holds enough weight to proceed to court (Možina, Žabkar, and Bench-Capon 2005). Before the GDPR, organisations were not mandated to disclose their internal know-how publicly, for competitive reasons. One advantage of organisations stems from how securely they are able to keep key information within the confines of the organisation. In an online environment, data collectors rely on information disclosed by consumers. Due to network effects and the increasing online presence of retailers, consumers are incentivised to give personal information willingly (Dingledine and Mathewson 2006; Gross et al 2005; Li, Sarathy, and Xu 2011; Saxton and Wang 2014).

Concerns regarding oversharing private information online are not new. Wirtz, Lwin, and Williams (2007) state that online privacy issues push governments to revise regulations in order to protect their citizens. A paper written two years later by Cho, Rivera-Sanchez, and Lim (2009) extends the notion by outlining the urgency revolving around online privacy at the global scale and the solutions local governments have implemented and should implement. Consequently, macroeconomic factors such as increasing privacy concerns and the legal context dictate the environment in which interpretable classifiers should become more prevalent within marketing.

Predictive Accuracy and Interpretability:


white-box models, as these methods yield outputs with clear coefficients and weights that illustrate interrelationships between variables and thereby provide a relatively intuitive pathway to a classification decision (Hammer et al 2012). This inevitably presents us with the accuracy-interpretability trade-off, whereby as the accuracy of a predictive classifier increases, its interpretability decreases (Ribeiro et al 2016). As a result, there has been a growing discussion about how much accuracy we are willing to sacrifice in order to make learners more interpretable (Guidotti et al 2018). A predictive classifier needs to be accurate enough to do justice to its predictive purpose, yet transparent enough for the public to understand the risks and for regulators to be able to formulate laws effectively.


In the field of marketing, adopting predictive classifiers has improved financial returns and redefined digital marketing strategy. This is illustrated by the research of Sundsøy et al (2014), which showed that classifiers outperform gut feeling in increasing conversion rates, whilst earlier research by Cheung et al (2003) paved the way by establishing a solid framework for more personalised targeting within an online environment. Further prior research on how classifiers improve digital marketing focuses on extending the possibilities of online targeting to capture user interest and push users to a point of purchase (Chintagunta et al 2016; Glance et al 2005; Huang et al 2007; Melville and Sindhwani 2009). Thus, marketing managers are pressured to have deeper knowledge of their digital marketing strategy, which increasingly involves classifiers, in order to align their digital marketing practice with greater scrutiny from stakeholders (Jordan and Mitchell 2015; Ribeiro et al 2016). Moreover, interpretability for a more general audience can, for instance, translate to more pragmatic law-making as regulators better understand how these systems work. The previously discussed Cambridge Analytica issue illustrates a lack of proper regulation of digital tools, which can be attributed to poor understanding by regulators (Kuner, Svantesson, and Care 2017; Surden 2014).

Model-Agnostic Explanator:


classify an observation, the classification has to come from the values of select variables, and thus LIME provides us with the reasoning. To illustrate, Ribeiro et al (2016) use the wolf and husky test, where a neural network engine is used to identify whether a picture is that of a wolf or a husky. Due to the similar physical features of the two animals, the neural network classifies the picture based on another feature it can distinguish: the background. It would classify a picture as a wolf if there is snow in the background, and as a husky otherwise. Ribeiro et al (2016) conclude that the wolf-husky classifier carries poor external validity. Based on the test, LIME provides faithful reasoning out of the classifier that is easily understandable to users and allows users to judge its trustworthiness.


groups most affected by privacy and security concerns (Brankovic and Estivill-Castro 1999; Fawcett and Provost 1996; Hoffman, Novak, and Peralta 1999).

Prior research on LIME has mostly focused on technicalities. Papers such as Doshi-Velez and Kim (2017) on developing a more robust framework for interpretability given a LIME output, Lundberg and Lee (2017) on introducing SHAP to develop further on the accuracy-interpretability trade-off, Lakkaraju et al (2019) on approximating global explanations from black-box models, and Selvaraju et al (2017) on visualising neural network interpretations using gradient-based localisation are extensions of the algorithm that Ribeiro et al (2016) introduced. There has been no significant development using real-world marketing objectives, data, and metrics with respect to LIME. Reflecting on the importance of interpretability and accuracy in marketing, we propose research incorporating the model-agnostic explanator LIME in real marketing decisions to illustrate the feasibility of LIME for a more generalised audience such as managers, regulators, or consumers.

Conceptual Framework:


Figure 1: Conceptual framework

Based on the conceptual underpinnings, we propose the following hypotheses. First, we hypothesise that black-box models are more complex than white-box models. Secondly, we hypothesise that accuracy is positively influenced by how complex a model is and that, by extension, complexity negatively influences the interpretability of the model. Furthermore, we hypothesise that explanations given by a model-agnostic explanator improve interpretability, thereby making a predictive classifier more interpretable without sacrificing accuracy.

RESEARCH DESIGN

Data:


researchers to test available tools and frameworks against the dataset to simulate real-world unstructured data. We will be using this dataset as the primary source of analysis.

In addition to that, the Keras library developed by Chollet (2015) comes pre-packaged with a tailored IMDB dataset, whereby each review is transformed into a string of integers. Each integer represents a different word that exists within the tailored IMDB dataset, rendering it more convenient for Keras-based analysis. As we extensively use the Keras API to feed the data into a predictive classifier, we will use the corresponding Keras IMDB dataset (See Appendix A). Additionally, the tailored IMDB dataset retains the 10,000 most frequently used words and removes rare ones in order to keep the dataset manageable (Chollet 2015). We will discuss the Keras API and the decision to use a neural network as our main classifier extensively in later sections. Nevertheless, we retain the original IMDB dataset in the event that we need to feed it into analyses that do not rely heavily on the API provided by Keras. The IMDB dataset consists of two variables and 50,000 observations of movie reviews: the content of the review and a sentiment identifier of either positive or negative. Departing from the discussion in the literature review and the IMDB dataset, our research focuses primarily on exploring the capabilities of a model-agnostic explanator against a marketing dataset using a black-box model. This research connects a widely used machine learning method with a high-volume dataset, and extends the development of machine learning interpretability in a marketing setting.
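As a point of reference, the lines below sketch how the tailored Keras dataset is loaded; this mirrors the listing in Appendix A and assumes the keras package and its backend are installed.

library(keras)
# keep only the 10,000 most frequent words, as described above
imdbx <- dataset_imdb(num_words = 10000)
c(trainData, trainLabels) %<-% imdbx$train   # integer-encoded reviews and 0/1 sentiment labels
c(testData, testLabels) %<-% imdbx$test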


social media distribution. As our research is aimed at a general audience, demographic questions were not asked in the survey; supporting identifiers of each participant are therefore not available. The independent and dependent variables used were complexity and interpretability respectively, as well as the variable explanation interpretability to compare interpretability following an explanation, which derives from incorporating a model-agnostic explanator.

Procedure and Measurement:

Our research uses R as the primary interface and tool for our analyses, supplemented by additional packages. Firstly, we conduct simple text analysis to get acquainted with the IMDB dataset. Our analyses use packages such as wordcloud, ggplot2, RColorBrewer, and tidyverse. In order to properly perform text analysis, we transform our IMDB dataset into a corpus format. The corpus consists of an outer list of length 50,000 with inner lists denoting each review. The content of the IMDB corpus is then simplified for exploratory analysis. Simplifications include transforming all characters into lowercase and removing numbers, punctuation, excess whitespace, and stop words (See Appendix B). Stop words are words used to express grammatical concepts but add no exposition to the content (Wilbur and Sirotkin 1992), such as "if", "until", and "are". These removals make each review simpler for further exploratory analysis. A practical understanding of our review data can be gained by examining the most frequent words within the dataset. To achieve this in a visually representative way, we conduct a word cloud analysis.
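A condensed sketch of these cleaning steps is shown below; it follows the full listing in Appendix B and assumes the tm, wordcloud, and RColorBrewer packages, with imdbFloor holding the first half of the reviews.

library(tm)
library(wordcloud)
library(RColorBrewer)

imdbCorpusF <- SimpleCorpus(VectorSource(imdbFloor$review))            # corpus of review texts
imdbCorpusF <- tm_map(imdbCorpusF, content_transformer(tolower))       # lowercase
imdbCorpusF <- tm_map(imdbCorpusF, removeNumbers)                      # drop numbers
imdbCorpusF <- tm_map(imdbCorpusF, removePunctuation)                  # drop punctuation
imdbCorpusF <- tm_map(imdbCorpusF, stripWhitespace)                    # collapse whitespace
imdbCorpusF <- tm_map(imdbCorpusF, removeWords, stopwords("english"))  # remove stop words

# term frequencies feed the word cloud of most used words (Figure 2)
termFreq <- sort(colSums(as.matrix(DocumentTermMatrix(imdbCorpusF))), decreasing = TRUE)
wordcloud(words = names(termFreq), freq = termFreq, max.words = 100,
          random.order = FALSE, colors = brewer.pal(8, "Dark2"))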


general-purpose Harvard IV-4 Dictionary developed by Hartman et al (1967) to classify words based on positive and negative affiliations. As a result, the output given by the sentiment analysis consists of sentiment scores (overall, negative, and positive) for each review observation. Moreover, the analysis gives clear outputs on why certain observations are classified into a sentiment, making this a "white-box" model (Hammer et al 2012).
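A minimal sketch of this step, following Appendix C and assuming the SentimentAnalysis package and the document-term matrix built during the exploratory analysis:

library(SentimentAnalysis)

imdbSentimentF <- analyzeSentiment(imdbcorpusMatrixF, language = "english")
imdbSentimentF <- as.data.frame(imdbSentimentF[, 1:4])  # word count plus GI sentiment scores
summary(imdbSentimentF$SentimentGI)                     # overall dictionary-based sentiment per review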

Furthermore, we want to compare white-box with black-box accuracy. Predictions or classifications are made by organisations to improve insight generation and thereby improve decisions. This entails causality between the features assigned to each observation and a certain classification. A more general approach to infer causality is through regression. By and large, regression is a type of machine learning method, but it is a supervised learner and therefore not a black-box model (Rasmussen 2004). The simplicity of the regression model defeats the purpose of explaining complex models to a more general audience. The unstructured text data that we base our research on can be challenging for a simple linear regression model to learn from accurately. Moreover, implications from simple statistical inference such as regression rest heavily on statistical significance illustrated through p-values. Consequently, readers would be quick to assume causality given the statistical significance. This pitfall is particularly apparent in large datasets commonly found in marketing-related studies, where statistical significance increases with the high volume of observations (Khalilzadeh and Tasci 2017). Thus, we will use another method instead of regression.


variables and features are kept hidden (Olden and Jackson 2002), providing no insight as to how the classification is done. Thus, the neural network is considered a "black-box" model. Moreover, we develop our neural network model heavily on the Keras API available through the keras package. As a result, we will use the tailored IMDB dataset that comes pre-installed within the package. Due to the contents of the dataset, we will also make use of a decoding function developed by Chollet (2015) in order to revert integers back to their word representations (See Appendix D). The function decodes the string of integers within each observation back into a paragraph of words. Before building the neural network model, the Keras API requires the dataset to be turned into tensors and padded into sequences of equal length, enabling it to be fed to the neural network model (Chollet 2015). We begin by building the baseline model with an arbitrary number of hidden nodes to see how it performs. Building on the baseline model, we develop an adjusted neural network model in an effort to improve accuracy and minimise loss. The adjusted models are modified through additional layer types, epoch configurations, or different loss and activation functions.
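The core of the baseline architecture is sketched below; it condenses the full listing in Appendix D and assumes the keras package with the integer-encoded training data described above.

library(keras)

# pad every integer-encoded review to a common length of 256
trainData <- pad_sequences(trainData, padding = "post", maxlen = 256)
testData  <- pad_sequences(testData,  padding = "post", maxlen = 256)

imdbxModel <- keras_model_sequential() %>%
  layer_embedding(input_dim = 10000, output_dim = 16) %>%   # 10,000-word vocabulary
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 16, activation = "relu") %>%          # hidden layer
  layer_dense(units = 1, activation = "sigmoid")            # probability of positive sentiment

imdbxModel %>% compile(optimizer = "adam",
                       loss = "binary_crossentropy",
                       metrics = "accuracy")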


Lastly, we conduct a survey to assess interpretability. The survey contains three sections with a similar line of questioning. We use the first three observations of the IMDB dataset as the basis of the survey. Each section starts with the review text of the observation and the question "Do you think this review text is more positive or negative?", followed by two choices of positive or negative (See Appendix G). Secondly, we present an output of the review from the white-box sentiment analysis, followed by the question "Model x classifies review n as positive/negative. Regardless of the classification, you think the model is:". Beneath this, two six-point scales are presented on complexity and interpretability, ranging from simple to complex and from unclear to understandable respectively. To account for explanation interpretability, we present a simple explanation of the output of the white-box model by colour-coding the positive and negative words in blue and red respectively, along with the follow-up question "What do you now think about the output?" and a second set of Likert-scale items on interpretability.


each for the independent variable complexity (α = .603), the dependent variable interpretability (α = .695), and the second dependent variable explanation interpretability (α = .761).
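A hypothetical sketch of this reliability check is shown below; the item column names are assumptions rather than the actual Qualtrics export, and psych::alpha is only one of several ways to compute Cronbach's alpha.

library(psych)

surveyData <- read.csv("Interpretability.csv")
# assumed item names; the three items per construct are averaged after this check
alpha(surveyData[, c("Complexity_1", "Complexity_2", "Complexity_3")])$total$raw_alpha
alpha(surveyData[, c("Interpret_1", "Interpret_2", "Interpret_3")])$total$raw_alpha
alpha(surveyData[, c("Explanation_1", "Explanation_2", "Explanation_3")])$total$raw_alpha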

Hypothesis Testing:

The accuracy of the simpler "white-box" model represented by sentiment analysis can be measured through the way the model analyses our review dataset. The library of words and sentiments available in the Harvard IV-4 Dictionary is well established and thereby static, meaning that it does not evolve or learn on different datasets (Hartman et al 1967). Moreover, the dictionary records the sentiments of standalone words, which means that it does not take into account nuances such as sarcasm, prior knowledge, or informal grammatical structures. Thus, the sentiment analysis model has relatively low external validity on one specific dataset such as ours, and thereby low accuracy. A limitation of this model is that we are unable to precisely compute an accuracy score because, unlike other predictive classifiers, the SentimentAnalysis package in R does not require training or test data against which to cross-examine the model. A solution is an eyeball test, where we compare the sentiment given by the sentiment analysis with each review observation it is based upon. Nevertheless, because we are able to observe the positive or negative weighting of each word belonging to a particular review, this model is highly interpretable.
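As a mechanical complement to the eyeball test, and not part of the analysis reported here, one could treat a positive overall GI score as a "positive" prediction and compare it with the labels in the original dataset; the snippet below assumes imdbFloor$sentiment holds the strings "positive" and "negative".

predictedSentiment <- ifelse(imdbSentimentF$SentimentGI > 0, "positive", "negative")
mean(predictedSentiment == imdbFloor$sentiment, na.rm = TRUE)  # rough agreement rate, not a true accuracy score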


increasingly hard to understand as the model grows (Andrychowicz et al 2016; Wythoff 1992). However, the increasing complexity comes at the expense of interpretability.

To mitigate the challenge of interpretability, we propose using a model-agnostic explanator. A model-agnostic explanator such as LIME provides output that explains how a black-box model classifies an observation into one sentiment or the other, along with an explanation fit. However, we need to test whether our hypothesis that LIME makes more accurate black-box models more interpretable is valid. Therefore, we conduct a survey to test interpretability. The survey provides us with insights on the complexity and interpretability of the white-box and black-box models. We hypothesise that the same person perceives the black-box model to be more complex than its white-box counterpart, and we test whether interpretability improves after an explanation is given through incorporating LIME. In order to do so, we measure these through within-subject comparisons involving all participants of the survey.

RESULTS

Exploratory Analysis:


only report one for simplicity. Movie-related terms such as "movie", "film", and "story" frequently appear in the dataset, as well as words associated with opinions such as "good", "like", and "bad", followed by supporting words such as "even", "also", and "can".

Figure 2: Visual representation of most used words in the IMDB dataset

Sentiment Analysis:


Dataset Min Median Mean Max

Ceiling -.269 .032 .034 .375

Floor -.333 .033 .035 .385

Table 1: Summary statistics of sentiment analysis

Moreover, to test the accuracy of the sentiment analysis, we take several examples of actual sentiments given by our dataset and compare them with the classification given by the sentiment analysis. Using examples randomly picked from the dataset, we can see that the function presents weights in the form of sentiment probabilities for each observation and estimates the overall sentiment (See Table 2). Randomisation is done through random.org, an online tool that uses atmospheric noise to ensure properly randomised results (Kenny 2005). We found that the sentiment scores of the randomly selected observations are rather small, with the highest probability amounting to a little under 17%. The low confidence of the sentiments given by the analysis may stem from the static nature of the dictionary it is based upon.

Data point  Word Count  Overall Sentiment  Negative Sentiment  Positive Sentiment  Actual Sentiment
250         96          .031               .073                .104                Positive
13          64          .031               .094                .125                Positive
409         72          -.014              .167                .153                Negative

Table 2: Example of sentiment output on three random observations
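For completeness, a short sketch of how the rows of Table 2 can be pulled from the sentiment output; the column names follow the SentimentAnalysis package and the row indices assume the floor half of the dataset.

picked <- c(250, 13, 409)  # the randomly drawn data points reported in Table 2
cbind(imdbSentimentF[picked, c("WordCount", "SentimentGI", "NegativityGI", "PositivityGI")],
      actual = imdbFloor$sentiment[picked])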


the original IMDB dataset is negative. Furthermore, this is effectively shown by looking into the review text of observation 13, which carries a negative connotation (See Figure 3).

Figure 3: Review text of a randomly selected observation 13

Neural Network Analysis:


IMDB dataset against our adjusted model (See Appendix D). The imported model returned an accuracy score of .869 and a loss score of .341 without adjustments.

Layer type Output shape No. of parameters

Embedding (na, na, 16) 160,000

Global Average Pooling 1D (na, 16) 0

Dense (na, 16) 272

Dense 1 (na, 1) 17

Table 3: Summary of imported and adjusted sequential model
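The held-out evaluation reported above corresponds to the following call from Appendix D:

# evaluate the fitted model on the 25,000 held-out test reviews
modelEval <- imdbxModel %>% evaluate(testData, testLabels, batch_size = 512, verbose = 1)
modelEval  # returns the test loss and accuracy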


Figure 4: Fit of adjusted model of neural network


Model-Agnostic Explanator:

Using the previously established and trained neural network model, we begin incorporating LIME as an explanator. We use the lime package in R along with its lime, explain, plot_text_explanations, and plot_features functions for our analysis. Furthermore, we feed our adjusted neural network model to the function. Through randomised sampling, we run the LIME explanator with 1,000 permutations to get an explanation fit of .7 on a random local observation (See Appendix E). However, the algorithm estimates an explanation at the local level, thus the explanation fit differs from one observation to the next. This is particularly important to note as the lime package lacks functionality for including foreign variables. LIME provides us with local estimates based on the trained dataset through a neural network model. As the estimates are local (observation-level output), the algorithm takes into account the variations of each review text, such as, but not limited to, text length or punctuation. Note, however, that text length in our adjusted neural network model is padded to 256 words. To illustrate how the LIME output provides explanations locally, we use the plot_text_explanations and plot_features functions on the first observation (See Figure 5).
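A condensed sketch of this set-up, following the full listing in Appendix E, where LIMEmodel is the Keras model trained on the tokenised reviews and get_embedding_explanation() reapplies the same tokenisation and padding:

library(lime)

explainer   <- lime(sentence_to_explain, model = LIMEmodel,
                    preprocess = get_embedding_explanation)
explanation <- explain(sentence_to_explain, explainer,
                       n_labels = 1, n_features = 10, n_permutations = 1000)

plot_text_explanations(explanation)  # highlighted review text (left panel of Figure 5)
plot_features(explanation)           # per-word weights and explanation fit (right panel of Figure 5)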

Figure 5: Visual explanation of LIME on the first observation


of padded words in the observation with respect to the neural network model. Nevertheless, we focus on the top 10 features to be explained. On the left side of the figure, we can see that the LIME algorithm highlights the words that are most representative of the positive or negative sentiment, whereas the right side of the figure illustrates the weight of each word in calculating the probability of a local observation belonging to a sentiment. For the first observation, we can see that the highest weight for positive sentiment is "charming" followed by "news", whereas the highest weight for negative sentiment is "power" followed by "familiar". The LIME algorithm classifies it as positive with .52 and .88 for the probability and explanation fit respectively. Cross-examination with the dataset confirms that the sentiment is positive.

Interpretability Analysis:

Based on the survey data, we conducted several statistical analyses (See Appendix F). The first analysis is a comparison of the complexity and interpretability of the white-box and black-box models. The Cronbach's alpha values of the items allow the items for each model and variable to be averaged (See Table 4).

Variable  Model  No. of Items in Scale  Cronbach's Alpha

Complexity White-box 3 .655

Black-box 3 .628

Interpretability White-box 3 .642

Black-box 3 .736

Table 4: Item scale and Cronbach’s Alpha (ɑ) of variables


conduct two paired samples t-tests. We found that complexity is significantly higher for black-box models (M = 3.876, SD = 1.145) compared to white-box models (M = 3.105, SD = 1.163) at the 95% confidence interval (t(155) = 12.243, p < .001). We also found interpretability to be significantly higher for white-box models (M = 4.098, SD = 1.092) compared to black-box models (M = 3.455, SD = 1.258) at the 95% confidence interval (t(155) = 5.664, p < .001).

Secondly, we conduct a linear regression analysis to test causal inference between complexity and interpretability as the independent and dependent variable respectively. The output indicates that our predictor explains 17% of the variance (R² = .17, F(1,154) = 31.503, p < .001). Based on the results, we find that complexity has a significant negative causal effect on interpretability (β = -.443, p < .001) at the 95% confidence interval (See Table 5). From this, we can infer that participants who perceive a model as more complex find the model harder to interpret. Lastly, we measure interpretability following an explanation. All participants were shown interpretability measures of the model in relation to the explanation, thus we compare within-subject means of interpretability before and after an explanation was given. In order to do so, we conduct a paired samples t-test and find that interpretability is significantly higher after (M = 4.345, SD = .913) than before an explanation was given (M = 3.777, SD = .940) at the 95% confidence interval (t(155) = -8.167, p < .001).

Coefficient Std. Error T-Value P-Value

Intercept/Constant 5.322 .284 18.751 .000

IV: Complexity -.443 .079 -5.613 .000


DISCUSSIONS AND RECOMMENDATIONS

Grounded Discussion:

The objective of our research is to examine how explanations provided by a model-agnostic explanator can improve interpretability in a more accurate but more complex black-box model. Our sentiment and neural network analyses, used to assess white-box and black-box models respectively, showed that the accuracy of a model is positively influenced by how complex the model is. The result of the sentiment analysis is intuitive to interpret, but it lacks an accuracy fit of the model against the entire dataset. This means that we would need to cross-examine the accuracy of each output against the data in the observation. Moreover, the accuracy of the neural network model is arguably high at over 80%, but it lacks a relative comparison because it is being compared to a model with no exact measure of accuracy. In addition to that, the fit of the LIME output differs from one observation to the next, as the algorithm estimates local (observational) fit and not a global (dataset) fit. Therefore, it is inherently challenging for us to conclude how good a fit LIME is to our dataset. Nevertheless, the importance of LIME for our dataset is, interestingly, up to the reader. As Ribeiro et al (2016) have demonstrated, explanations given through a LIME output simply act as a tool to aid readers in deciding for themselves whether or not to trust the explanation.


perceived complexity and interpretability of a visual representation of a certain model. Controlling for a priori knowledge of, for instance, machine learning methods is difficult, as applications of machine learning and predictive classifiers are widely abundant. Moreover, we also see that the mean difference in interpretability between before and after an explanation is given is only .568. In addition to that, participants mentioned several times in the survey's text entry that they found the questions too complicated and felt as if they needed to solve the model instead of assessing complexity or interpretability. Although this is to some extent the objective of the research, we fell short in clarifying whether "complexity" refers to how complex the model output looks or how complex the model itself is perceived to be. Somewhat conclusively, we were able to see from the paired samples t-tests that the perception of complexity contributes to the level of interpretability of a model.

Reflecting back on our hypotheses, we observed that black-box models are more complex than white-box models. The complexity of models conceptually increases the score for accuracy. Secondly, we observed that the more complex neural network model carries better accuracy compared to the less complex sentiment analysis model by examining the two models on our dataset. Furthermore, we found that as the complexity of a model increases, interpretability significantly decreases. However, the incorporation of an explanation by a model-agnostic explanator showed that the interpretability of the model can improve significantly, thus supporting our objective of making the output of a black-box model more understandable.

Limitations:


The Keras API for R provides an extensive library and resources for model-building, but features such as visualisation are limited in the R version of the Keras API. This leads to restricted capabilities. Tools such as TensorFlow, which is used primarily for building and visualising neural networks, work more extensively and flexibly with Python (Bahrampour et al 2015). This brings us to the next limitation, which is the relatively narrow architecture of our adjusted model. Most neural network models, especially those used by large organisations to handle large volumes of data, benefit from higher accuracy due to more complex architectures. Our adjusted model pales in comparison. We demonstrated how LIME was able to provide meaningful explanations for a more general audience to understand how predictive classifiers work; real-world applications might build on our research using more intricate model architectures. Furthermore, our research was heavily based on only one type of predictive classifier: the neural network. Other black-box predictive classifiers such as random forests or support vector machines are as widely used as neural networks (Hasan et al 2014; Nitze, Schulthess, and Asche 2012; Statnikov, Wang, and Aliferis 2008). Thus, we were unable to compare different black-box models to our neural network model in terms of accuracy, loss, or fit on the dataset. This is an important limitation to note, as we are unable to guarantee that our results extend successfully to other black-box models such as support vector machines or random forests. Comparing our model with every available black-box classification method would undoubtedly be costly and time-consuming, but it would give insight into how a model-agnostic explanator interacts with different models.


type used in marketing. Other types of unstructured data such as audio-visual content could be beneficial in this analysis to assess the extent to which our research can be applied to an array of different dataset types (Balducci and Marinova 2018; Belk and Kozinets 2005). Moreover, data regarding sales, customer insights, or market performance could provide meaningful insights into how LIME can be implemented on structured data. Unlike unstructured data such as text or images, sales or customer data benefit from being tabular and are thus easier to estimate linearly. It is therefore much more challenging to arrive at a good model fit for our black-box model and our LIME explanations, and the limitation of low accuracy and fit may not apply to a more structured dataset. We are aware that this is a major limitation of this research, as the text-rich IMDB dataset fed into a neural network engine is a specific prerequisite of this research that may not transfer well to other applications. However, this study nevertheless illustrates that explanations are essential to increasing the interpretability of a model at a time when understanding is crucial for data collectors and the people affected by them. The model-agnostic nature of LIME makes it usable for different predictive classifiers, though this is not explored further in this research beyond its application in text analysis on neural networks.


channels and snowball sampling. However, our lack of control variables made it difficult to analyse the underlying demographic skewness. A location identifier was available that pinpoints participants to the latitude and longitude of the device they used as they filled in the survey, but this has limited power to provide meaningful information on individual participants, as a person of one nationality could fill in our survey from another country.

Implications:


group of the population remains somewhat vague. As various types of organisations provide various products to various markets, regulators can rely on explanations of the inner workings of automated classifiers to regulate different markets accordingly. Therefore, regulations can be made more pragmatically and thus be more enforceable, as regulators better understand the reasoning behind decisions made by predictive classifiers.

In addition to that, we reflect further on the limitations of our research. We suggest that future research explore the effects of different white-box models and methods. Although the output of sentiment analysis is intuitive to understand and meaningful, it lacks flexibility. Different white-box classifiers such as logistic regression might provide additional insights. We also suggest exploring different black-box predictive classifiers and, by extension, observing how different model-agnostic explanators can be used to improve interpretability where LIME falls short. Extensions of LIME such as SHAP (Lundberg and Lee 2017) are interesting to explore for studying interpretability. Moreover, future research would benefit from extending our interpretability research to a much more targeted sample. Our analysis of interpretability relies heavily on the notion that explanations should be made generally understandable to a wider audience, but it lacks further insight into how explanations affect specific groups differently. An extension measuring the differences in complexity and interpretability across more targeted groups such as managers, privacy-concerned users, or regulators would generate insights into better explanation methods for relevant groups.

Conclusion:


REFERENCES

Andrychowicz, Marcin, Misha Denil, Sergio Gómez Colmenarejo, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, and Nando De Freitas (2016), "Learning to learn by gradient descent by gradient descent," in Advances in Neural Information Processing Systems.
Bahrampour, Soheil, Naveen Ramakrishnan, Lukas Schott, and Mohak Shah (2015), "Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning," ArXiv.
Balducci, Bitty and Detelina Marinova (2018), "Unstructured data in marketing," Journal of the Academy of Marketing Science.
Banz, Rolf W. (1981), "The relationship between return and market value of common stocks," Journal of Financial Economics.
Barratt, Monica J., Simon Lenton, and Matthew Allen (2013), "Internet content regulation, public drug websites and the growth in hidden Internet services," Drugs: Education, Prevention and Policy.
Bellovin, Steven M., Renee M. Hutchins, Tony Jebara, and Sebastian Zimmeck (2013), "When Enough is Enough: Location Tracking, Mosaic Theory, and Machine Learning," SSRN Electronic Journal.
Burrell, Jenna (2016), "How the machine 'thinks': Understanding opacity in machine learning algorithms," Big Data & Society.
Cheung, Kwok Wai, James T. Kwok, Martin H. Law, and Kwok Ching Tsui (2003), "Mining customer product ratings for personalized marketing," Decision Support Systems.
Chintagunta, Pradeep, Dominique M. Hanssens, and John R. Hauser (2016), "Marketing science and big data," Marketing Science.
Chollet, François (2015), "Keras: The Python Deep Learning library," Keras.io.
Dingledine, Roger and Nick Mathewson (2006), "Anonymity Loves Company: Usability and the Network Effect," Economics of Information Security.
Doshi-Velez, Finale and Been Kim (2017), "Towards A Rigorous Science of Interpretable Machine Learning," (Ml), 1–13.
Fong, Ruth C. and Andrea Vedaldi (2017), "Interpretable Explanations of Black Boxes by Meaningful Perturbation," in Proceedings of the IEEE International Conference on Computer Vision.
Ghazinour, Kambiz, Stan Matwin, and Marina Sokolova (2013), "Monitoring and recommending privacy settings in social networks," in ACM International Conference Proceeding Series.
Glance, Natalie, Matthew Siegler, Matthew Hurst, Robert Stockton, Kamal Nigam, and Takashi Tomokiyo (2005), "Deriving marketing intelligence from online discussion," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Goolsbee, Austan (2000), "In a world without borders: The impact of taxes on internet commerce," Quarterly Journal of Economics.
Grimmer, Justin (2014), "We are all social scientists now: How big data, machine learning, and causal inference work together," in PS - Political Science and Politics.
Gross, Ralph, Alessandro Acquisti, and H. John Heinz (2005), "Information revelation and privacy in online social networks," in WPES'05: Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society.
Guidotti, Riccardo, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi (2018), "A survey of methods for explaining black box models," ACM Computing Surveys.
Hammer, Barbara, Bassam Mokbel, Frank Michael Schleif, and Xibin Zhu (2012), "White box classification of dissimilarity data," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
Hartman, John J., Philip J. Stone, Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvia (1967), "The General Inquirer: A Computer Approach to Content Analysis," American Sociological Review.
Hasan, Md. Al Mehedi, Mohammed Nasser, Biprodip Pal, and Shamim Ahmad (2014), "Support Vector Machine and Random Forest Modeling for Intrusion Detection System (IDS)," Journal of Intelligent Learning Systems and Applications.
Huang, Jih Jeng, Gwo Hshiung Tzeng, and Chorng Shyong Ong (2007), "Marketing segmentation using support vector clustering," Expert Systems with Applications.
Jordan, M. I. and T. M. Mitchell (2015), "Machine learning: Trends, perspectives, and prospects," Science.
Kenny, Charmaine (2005), "Random number generators: An evaluation and comparison of random.org and some commonly used generators," Management Science and Information Systems Studies Project Report, Trinity College Dublin.
Khalilzadeh, Jalayer and Asli D.A. Tasci (2017), "Large sample size, significance level, and the effect size: Solutions to perils of using big data for academic research," Tourism Management.
Koh, Pang Wei and Percy Liang (2017), "Understanding black-box predictions via influence functions," in 34th International Conference on Machine Learning, ICML 2017.
Krause, Josua, Adam Perer, and Kenney Ng (2016), "Interacting with predictions: Visual inspection of black-box machine learning models," in Conference on Human Factors in Computing Systems.
Lakkaraju, Himabindu, Ece Kamar, Rich Caruana, and Jure Leskovec (2019), "Faithful and Customizable Explanations of Black Box Models."
LeNail, Alexander (2019), "NN-SVG: Publication-Ready Neural Network Architecture Schematics," Journal of Open Source Software.
Li, Han, Rathindra Sarathy, and Heng Xu (2011), "The role of affect and cognition on online consumers' decision to disclose personal information to unfamiliar online vendors," Decision Support Systems.
Lin, Min, Qiang Chen, and Shuicheng Yan (2014), "Network in network," in 2nd International Conference on Learning Representations, ICLR 2014 - Conference Track Proceedings.
Lins, Karl V., Henri Servaes, and Ane Tamayo (2017), "Social Capital, Trust, and Firm Performance: The Value of Corporate Social Responsibility during the Financial Crisis," Journal of Finance.
Lundberg, Scott M. and Su In Lee (2017), "A unified approach to interpreting model predictions," in Advances in Neural Information Processing Systems.
Maas, Andrew L., Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts (2011), "Learning word vectors for sentiment analysis," in ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
Mark, Gloria and Bryan Semaan (2009), "Expanding a country's borders during war: The internet war diary," in Proceedings of the 2009 ACM SIGCHI International Workshop on Intercultural Collaboration, IWIC'09.
Mohassel, Payman and Yupeng Zhang (2017), "SecureML: A System for Scalable Privacy-Preserving Machine Learning," in Proceedings - IEEE Symposium on Security and Privacy.
Možina, Martin, Jure Žabkar, Trevor Bench-Capon, and Ivan Bratko (2005), "Argument based machine learning applied to law," Artificial Intelligence and Law.
Murdoch, Travis B. and Allan S. Detsky (2013), "The inevitable application of big data to health care," JAMA - Journal of the American Medical Association.
Nitze, I., U. Schulthess, and H. Asche (2012), "Comparison of machine learning algorithms random forest, artificial neuronal network and support vector machine to maximum likelihood for supervised crop type classification," in Proceedings of the 4th Conference on GEographic Object-Based Image Analysis – GEOBIA 2012.
Oh, Seong Joon, Bernt Schiele, and Mario Fritz (2019), "Towards Reverse-Engineering Black-Box Neural Networks."
Olden, Julian D. and Donald A. Jackson (2002), "Illuminating the 'black box': A randomization approach for understanding variable contributions in artificial neural networks," Ecological Modelling, 154 (1–2), 135–50.
Oswell, David (1998), "The place of 'childhood' in internet content regulation: A case study of policy in the UK," International Journal of Cultural Studies.
Park, Hye Jung, Leslie Davis Burns, and Nancy J. Rabolt (2007), "Fashion innovativeness, materialism, and attitude toward purchasing foreign fashion goods online across national borders: The moderating effect of internet innovativeness," Journal of Fashion Marketing and Management.
Rasmussen, Carl Edward (2004), "Gaussian Processes in machine learning," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin (2016), "'Why should I trust you?' Explaining the predictions of any classifier," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
———, ———, and ——— (2016), "Model-Agnostic Interpretability of Machine Learning," 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY, USA, 91–95.
Saxton, Gregory D. and Lili Wang (2014), "The Social Network Effect: The Determinants of Giving Through Social Media," Nonprofit and Voluntary Sector Quarterly.
Selvaraju, Ramprasaath R., Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra (2017), "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization," in Proceedings of the IEEE International Conference on Computer Vision.
Statnikov, Alexander, Lily Wang, and Constantin F. Aliferis (2008), "A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification," BMC Bioinformatics.
Steinfield, C., T. Adelaar, and Ying Ju Lai (2002), "Integrating brick and mortar locations with e-commerce: Understanding synergy opportunities," in Proceedings of the Annual Hawaii International Conference on System Sciences.
Sundsøy, Pål, Johannes Bjelland, Asif M. Iqbal, Alex Pentland, and Yves Alexandre De Montjoye (2014), "Big data-driven marketing: How machine learning outperforms marketers' gut-feeling," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).
Teoh, Siew Hong, Yong George Yang, and Yinglei Zhang (2011), "R-Square and Market Efficiency," SSRN Electronic Journal.
The European Parliament and the Council of the European Union (2016), "Regulation (EU) 2016/679 (GDPR)," Official Journal of the European Union, Articles 12-23.
Wallsten, Scott (2005), "Regulation and internet use in developing countries," Economic Development and Cultural Change.
Westby, Joe (2019), "'The Great Hack': Cambridge Analytica is just the tip of the iceberg," Amnesty International (accessed October 24, 2019), [available at https://www.amnesty.org/en/latest/news/2019/07/the-great-hack-facebook-cambridge-analytica/].
Wilbur, W. John and Karl Sirotkin (1992), "The automatic identification of stop words," Journal of Information Science.
Wirtz, Jochen, May O. Lwin, and Jerome D. Williams (2007), "Causes and consequences of consumer online privacy concern," International Journal of Service Industry Management.
Wythoff, Barry J. (1992), "Backpropagation neural networks. A tutorial," Chemometrics and Intelligent Laboratory Systems.

APPENDICES

Appendix A: Dataset

# for exploratory analysis (`imdb` is the original 50,000-review data frame
# with the variables review and sentiment)
imdbFloor <- imdb[1:floor(nrow(imdb) * .5), ]
imdbCeiling <- imdb[(floor(nrow(imdb) * .5) + 1):nrow(imdb), ]
imdbTrain <- imdb[1:floor(nrow(imdb) * .75), ]
imdbTest <- imdb[(floor(nrow(imdb) * .75) + 1):nrow(imdb), ]

# for sentiment analysis, neural networks, and lime
library(keras)
imdbx <- dataset_imdb(num_words = 10000)
c(trainData, trainLabels) %<-% imdbx$train
c(testData, testLabels) %<-% imdbx$test

# imported data for interpretability analysis
surveyData <- read.csv("Interpretability.csv")

Appendix B: Exploratory analysis

# similar transformation and analysis is done for imdbCeiling
library(tm)
library(wordcloud)
library(RColorBrewer)
library(tidyverse)

imdbCorpusF <- SimpleCorpus(VectorSource(imdbFloor$review))  # to store textual data
imdbCorpusF <- tm_map(imdbCorpusF, stripWhitespace)
imdbCorpusF <- tm_map(imdbCorpusF, content_transformer(tolower))
imdbCorpusF <- tm_map(imdbCorpusF, removePunctuation)
imdbCorpusF <- tm_map(imdbCorpusF, removeWords, stopwords("english"))
imdbcorpusStemmedF <- tm_map(imdbCorpusF, stemDocument)

# making a matrix of unique documents x unique terms
imdbcorpusMatrixF <- DocumentTermMatrix(imdbCorpusF)
imdbcorpusstemmedMatrixF <- DocumentTermMatrix(imdbcorpusStemmedF)
wordSumsF <- as.data.frame(colSums(as.matrix(imdbcorpusMatrixF)))
wordSumsF <- rownames_to_column(wordSumsF)
colnames(wordSumsF) <- c("term", "count")
wordSumsF <- arrange(wordSumsF, desc(count))
wordHeadF <- head(wordSumsF, 100)  # assumed: the most frequent terms kept for the word cloud

imdbWordcloudF <- wordcloud(words = wordHeadF$term, freq = wordHeadF$count,
                            min.freq = 1000, max.words = 100, random.order = FALSE,
                            rot.per = 0.35, colors = brewer.pal(8, "Dark2"))

Appendix C: Sentiment analysis

# similar transformation and analysis is done for imdbCeiling
library(SentimentAnalysis)

imdbSentimentF <- analyzeSentiment(imdbcorpusMatrixF, language = "english")
imdbSentimentF <- imdbSentimentF[, 1:4]
imdbSentimentF <- as.data.frame(imdbSentimentF)  # organizing it as a data frame
summary(imdbSentimentF$SentimentGI)


Appendix D: Neural network model

library(keras)
library(tidyverse)

imdbWordindex <- dataset_imdb_word_index()
paste0("Training entries: ", length(trainData), ", labels: ", length(trainLabels))

imdbwordindexData <- data.frame(word = names(imdbWordindex),
                                idx = unlist(imdbWordindex, use.names = FALSE),
                                stringsAsFactors = FALSE)

# The first indices are reserved
imdbwordindexData <- imdbwordindexData %>% mutate(idx = idx + 3)
imdbwordindexData <- imdbwordindexData %>%
  add_row(word = "<PAD>", idx = 0) %>%
  add_row(word = "<START>", idx = 1) %>%
  add_row(word = "<UNK>", idx = 2) %>%
  add_row(word = "<UNUSED>", idx = 3)
imdbwordindexData <- imdbwordindexData %>% arrange(idx)

# function to turn a sequence of integers back into words
reviewReveal <- function(text) {
  paste(map(text, function(number) imdbwordindexData %>%
              filter(idx == number) %>%
              select(word) %>%
              pull()),
        collapse = " ")
}

# pad data to the same length for the neural network
trainData <- pad_sequences(trainData,
                           value = imdbwordindexData %>% filter(word == "<PAD>") %>% select(idx) %>% pull(),
                           padding = "post", maxlen = 256)
testData <- pad_sequences(testData,
                          value = imdbwordindexData %>% filter(word == "<PAD>") %>% select(idx) %>% pull(),
                          padding = "post", maxlen = 256)

# create validation data to check model accuracy
imdbxValidationX <- trainData[1:10000, ]
trainedPartialX <- trainData[10001:nrow(trainData), ]
imdbxValidationY <- trainLabels[1:10000]
trainedPartialY <- trainLabels[10001:length(trainLabels)]

# input shape is the vocabulary count used for the movie reviews (10,000 words)
imdbxModel <- keras_model_sequential()
imdbxModel %>%
  layer_embedding(input_dim = 10000, output_dim = 16) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")
imdbxModel %>% compile(
  optimizer = 'adam',
  loss = 'binary_crossentropy',
  metrics = list('accuracy'))

imdbxTest <- imdbxModel %>% fit(
  trainedPartialX, trainedPartialY,
  epochs = 25, batch_size = 512,
  validation_data = list(imdbxValidationX, imdbxValidationY),
  verbose = 1)  # assumed closing of fit(); the remainder of the call was cut off at a page break

imdbxResults  # check accuracy and loss
modelEval <- imdbxModel %>% evaluate(testData, testLabels, batch_size = 512, verbose = 1)
imdbxTestplot <- plot(imdbxTest)
Pred <- imdbxModel %>% predict_classes(testData[1965, ], verbose = 1, steps = NULL)

Appendix E: Model-agnostic Explanator LIME text <- imdb$review

y_train <- imdb$sentiment max_features <- 10000

tokenizer <- text_tokenizer(num_words = max_features) tokenizer %>% fit_text_tokenizer(text)

text_seqs <- texts_to_sequences(tokenizer, text) maxlen <- 256

batch_size <- 512 epochs <- 25

x_train <- text_seqs %>% pad_sequences(maxlen = 256) x_train[1:2,] LIMEValidationX <- x_train[1:10000, ] LIMEPartialX <- x_train[10001:nrow(x_train), ] LIMEValidationY <- y_train[1:10000] LIMEPartialY <- y_train[10001:length(y_train)] LIMEPartialY <- as.integer(LIMEPartialY) LIMEValidationY <- as.integer(LIMEValidationY) LIMEPartialY <- LIMEPartialY - 1 LIMEValidationY <- LIMEValidationY - 1 LIMEPartialY <- as.integer(LIMEPartialY) LIMEValidationY <- as.integer(LIMEValidationY) # mirror similar model to the neural network LIMEmodel <- keras_model_sequential() %>%

# mirror the architecture of the neural network model
LIMEmodel <- keras_model_sequential() %>%
  layer_embedding(input_dim = 10000, output_dim = 16) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

LIMEmodel %>% compile(
  loss = "binary_crossentropy",
  optimizer = "adam",
  metrics = "accuracy")

LIMEtest <- LIMEmodel %>% fit(
  LIMEPartialX, LIMEPartialY,
  batch_size = 512, epochs = 25,
  validation_data = list(LIMEValidationX, LIMEValidationY))

# preprocessing function passed to lime(): tokenise and pad text the same way as the keras model input
get_embedding_explanation <- function(text) {
  tokenizer %>% fit_text_tokenizer(text)
  text_to_seq <- texts_to_sequences(tokenizer, text)
  sentences <- text_to_seq %>% pad_sequences(maxlen = 256)
}

sentence_to_explain <- as.character(imdb$review[12])

explainer <- lime(sentence_to_explain, model = LIMEmodel,
                  preprocess = get_embedding_explanation)

explanation <- explain(sentence_to_explain, explainer,
                       n_labels = 1, n_features = 10, n_permutations = 1000)

plot_text_explanations(explanation)
plot_features(explanation)

Appendix F: Interpretability

# paired samples t-test: complexity (simple vs. complex model)
c <- t.test(surveyData$Sim_Complexity, surveyData$Com_Complexity, paired = TRUE)

# paired samples t-test: interpretability (simple vs. complex model)
i <- t.test(surveyData$Sim_Interpret, surveyData$Com_Interpret, paired = TRUE)

# paired samples t-test: interpretability before and after the explanation
e <- t.test(surveyData$DV_Interpret, surveyData$MO_Explanation, paired = TRUE)

# regression: interpretability ~ complexity
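# Minimal sketch of the regression referenced in the comment above (interpretability
# regressed on complexity). The long-format reshaping and the use of lm() are assumptions
# for illustration; they are not the original thesis code.
surveyLong <- data.frame(
  interpretability = c(surveyData$Sim_Interpret, surveyData$Com_Interpret),
  complexity       = c(surveyData$Sim_Complexity, surveyData$Com_Complexity))
r <- lm(interpretability ~ complexity, data = surveyLong)
summary(r)   # the thesis reports an R-squared of approximately 17% for this relationship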


Black-box Predictive Classifier and Model-Agnostic Explanator Application in Text Data:
A Study of Interpretability

Dezaldy Irfianza Irfan (3156109)
Master's Thesis, MSc Marketing Intelligence

"The purpose of information is not knowledge, it is being able to take the right action."
Peter Drucker (2012), Management Challenges for the 21st Century

Predictive classifiers are getting better at classifying consumers based on their activities, interests, and preferences. This leads to a better customer experience, lower costs, and gains in efficiency and performance [1-3]. These systems are increasingly complex and therefore suffer from a lack of interpretability and accountability [4].

Theoretical Framework

Within the context of privacy concerns over sensitive information and the scope of legal liability: the GDPR [5-6].

Predictive classifiers are subject to a trade-off between accuracy and interpretability: black-box models [1, 4, 7].

This trade-off can be mitigated using LIME, one example of a model-agnostic explanator [7-8].


LIME (Local Interpretable Model-agnostic Explanations), developed by Ribeiro et al. (2016) [7]:

‣ Can be used to interpret any classifier.

‣ Fits a locally faithful, interpretable approximation of the classifier on an interpretable data representation (a formal sketch of its objective follows below).
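As a brief formal sketch (the formulation below is reproduced from Ribeiro et al. (2016) [7], not from the thesis itself), LIME selects the explanation for an instance $x$ by balancing local fidelity to the black-box model against the complexity of the explanation:

$$\xi(x) = \underset{g \in G}{\arg\min}\; \mathcal{L}(f, g, \pi_x) + \Omega(g)$$

where $f$ is the black-box classifier, $G$ is a class of interpretable models (for example, sparse linear models over word indicators), $\pi_x$ weights perturbed samples by their proximity to $x$, $\mathcal{L}$ measures how poorly $g$ approximates $f$ in that neighbourhood, and $\Omega(g)$ penalises the complexity of $g$ (for example, the number of words used in the explanation).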

Conceptual Model


Research Design
‣ Manipulate the dataset and establish the model architecture.
‣ Test predictive classifier accuracy, complexity, and interpretability.
‣ Conduct a survey to measure real-world complexity and interpretability.
‣ Test the hypotheses.

‣ Text data: 50,000 IMDB review texts [9] and the preprocessed version from Keras [10].
‣ Survey data: 258 participants, of which 156 usable responses; collected via Qualtrics using snowball sampling.
‣ Three randomised text samples are used to assess the black-box and white-box models.
‣ Global visualisation of the neural network; local visualisation of the LIME explanation.

Results: Model Building

‣ Neural network analysis: adjusted accuracy = 87.7%, adjusted loss = 30.1%; test accuracy = 87.7%, test loss = 27.6%.
‣ LIME interpretability*: a global explainer fit is unavailable; local fits on randomised samples** fall between 70% and 80%.
‣ Sentiment analysis*: a global fit is unavailable because the estimates are local, but random sampling** reveals that it reaches a mere 17%.

* Sentiment analysis and LIME results need to be cross-examined against the data. ** Random sampling is done through random.org [11].

Results: Interpretability

Within-subject measurements (using paired samples t-tests):
‣ The black-box model has higher complexity compared to the white-box model.
‣ The black-box model has lower interpretability compared to the white-box model.
‣ Interpretability of the black-box model increases after an explanation is given.
‣ Linear regression: complexity has a negative effect on interpretability (R² = 17%).

Note: all results above are significant at the 95% confidence level.

Discussions

‣ Hypotheses confirmed: the black-box model is more complex and more accurate, but less interpretable; its interpretability improves after an explanation is given.

‣ Low explained variance: indicates that the study requires additional control variables, such as demographic data, which were unavailable in this study.

‣ Model-based limitation: the limited architecture of the neural network model and the lack of comparisons between different types of predictive classifiers.

‣ Data-based limitation: the study relies heavily on a text-rich dataset, even though other data types and model-agnostic explanators exist.

‣ Academic implication: provides a fundamental basis for using a model-agnostic explanator in an illustrative marketing setting and dataset.
‣ Managerial implication: the importance of adding an explanation so that a predictive classifier is understandable to a general audience.

In conclusion, the interpretability of the more accurate black-box classifier can be improved by using a model-agnostic explanator.

References

1. Guidotti, Riccardo, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi (2018), “A survey of methods for explaining black box models,” ACM Computing Surveys.

2. Ghazinour, Kambiz, Stan Matwin, and Marina Sokolova (2013), “Monitoring and recommending privacy settings in social networks,” in ACM International Conference Proceeding Series.

3. Fong, Ruth C. and Andrea Vedaldi (2017), “Interpretable Explanations of Black Boxes by Meaningful Perturbation,” in Proceedings of the IEEE International Conference on Computer Vision.

4. Lins, Karl V., Henri Servaes, and Ane Tamayo (2017), “Social Capital, Trust, and Firm Performance: The Value of Corporate Social Responsibility during the Financial Crisis,” Journal of Finance.

5. The European Parliament and the Council of the European Union (2016), "Regulation (EU) 2016/679 (GDPR)," Official Journal of the European Union, Articles 12-23.

6. Steinfield, C., T. Adelaar, and Ying Ju Lai (2002), "Integrating brick and mortar locations with e-commerce: Understanding synergy opportunities," in Proceedings of the Annual Hawaii International Conference on System Sciences.

7. Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin (2016), "'Why should I trust you?' Explaining the predictions of any classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
