Credit Cards Transaction Routing using Online Learning

Varesh Tapadia

(11434678)

Research submitted in partial fulfilment for the degree of MBA Big Data & Business Analytics

Amsterdam Business School University of Amsterdam

Abstract

With e-commerce on the rise across the globe, customers are using online shopping for even their smallest needs. All this shopping requires payments to be processed across the globe. Global Online, part of Ingenico Group, is a market leader in this area. However, there is always a need to innovate and improve the processing platform.

This thesis explores one such concept: Dynamic Routing. The hypothesis is that smart routing of transactions can help improve the overall performance and response time of the platform. The goal is to explore some machine learning algorithms and see whether models can be trained to make this decision.

The research followed the CRISP-DM approach. To prove the hypothesis, transaction data was extracted from the Global Online platform, various machine learning models were applied, and the results were evaluated. In total, around half a million transaction records were extracted from the data store. The Python programming language was used for data cleaning and preparation, model selection, and presentation of the results in graphs.

The analysis succeeded in proving that machine learning algorithms, especially online learning algorithms, can be used to make routing decisions to the Acquirers for transaction processing. The results show approximately 2.28% of the transactions benefitting directly from this, while the platform benefits indirectly.

Attestation

I understand the nature of plagiarism, and I am aware of the University’s policy on this.

I certify that this thesis reports original work by me during my University project.

Varesh Tapadia (UVA ID #11434678) 30th Sep 2018

Acknowledgements

At this point I would like to thank all the people who were involved in and helped with the completion of this thesis.

First, I would like to thank my UvA thesis supervisor: Prof. Dr. Maarten Soomer for his guidance, supervision and inspiration during the research.

With respect to Global Online, I would like to thank Jorge Villacampa, Data Scientist, for his supervision and support during my research. I would also like to thank Xiaoxi Zhang, Product Manager, for the many interesting discussions that helped me understand the business aspects of this research.

I had a lot of support from the management team at Global Online for allowing me to manage my work and study as per my convenience. Specifically, I would like to thank Vincent van Katoon, Sudeep Moses, Edwin Stam, Gennaro Buccini and Camiel Schoonens for their support.

Last but not least, I would like to thank my family, especially my wife Mrs. Nidhi Tapadia, for her unconditional support during the last 2 years.

4.6 Chapter Conclusion
5 Modelling
5.1 Model Overview
5.1.1 Model Selection
5.1.2 Model Tuning
5.2 Model evaluation
5.2.1 Model Metrics
5.2.2 Model Performance
5.3 Chapter Conclusion
6 Results
6.1 Metrics
6.2 Performance
6.3 Value for Business
6.4 Chapter Conclusion
7 Conclusion
7.1 Summary
7.2 Recommendation for Future Work
References

List of Figures

Figure 1: Simplistic Business Model
Figure 2: Authorization Process
Figure 3: Clearing and Settlement Process
Figure 4: CRISP-DM overview
Figure 5: General Routing Process
Figure 6: Overall and Transactional Success Rate
Figure 7: Failover Transaction Success Rate
Figure 8: Acquirer Transaction Results
Figure 9: Credit Card Company Transaction Success Rate
Figure 10: Authorization Type Success Rate
Figure 11: CVV Indicator Success Rate
Figure 12: Address Verification Success Rate
Figure 13: Additional features variability
Figure 14: Recurring Transaction Success Rate
Figure 15: Correlation matrix
Figure 16: Confusion Matrix
Figure 17: ROC AUC Curve

1 Introduction

Electronic commerce, commonly known as eCommerce, is a business model where buyers and sellers communicate via the Internet and trade goods and/or services. This mode of transaction has grown exponentially over the last several years. One of the reasons behind the success of this mode of business is the wide availability of the World Wide Web and the ease of doing business.

One of the most critical components of the whole eCommerce chain is Online Transaction Processing. Since the buyer and the seller could be sitting at two ends of the world, no real-time money exchange is possible. What actually happens is that a chain of agreements and commitments is made between various legal entities about the money in question. Once these agreements are in place, the seller is confident that it will receive the money, and the goods and/or services are delivered to the buyer.

As one can imagine, selling across the globe is complex. Each country has its own evolving laws, and each legal entity in the chain of a transaction might have its own rules and regulations. For a business, keeping up with all of this is not easy and definitely not cost effective. This is where the Payment Service Provider, commonly known as a PSP, comes into the picture.

1.1 What is a PSP

A Payment Service Provider (PSP) offers merchants with an online presence the means to accept electronic payments through a variety of payment methods, including Credit Card, Debit Card, Bank Transfer, Mobile Banking and other real-time payment options. A PSP works on a software-as-a-service model and provides a single interface (payment gateway) with which its clients (also referred to as merchants) can integrate and offer multiple payment methods.

Normally, a PSP connects to multiple acquiring banks, cards and payment networks. The PSP manages these technical connections, the relationships with external networks, and the bank accounts. This makes the merchant less dependent on a single financial institution and frees it from the task of establishing these connections directly.

An extension of the PSP is the full-service PSP. These companies offer risk management, transaction matching, reporting, fund management across currencies and multi-currency functionalities. They also provide alternative payment methods such as payments via wallets,

Figure 1: Simplistic Business Model

Global Online [1] is part of the Ingenico Group, which operates in the payment industry and provides full service to its merchants. Besides normal transaction processing, it also provides various fraud-related solutions, FX-related services and transaction analytics. However, from here on, the focus will be on the Credit Card transaction processing capabilities offered by the organization.

1.2 Credit Card Processing

To a user, a Credit Card transaction seems simple: the customer swipes the card, enters a PIN, and the transaction is done. Online it is even simpler: the customer gives the card details, possibly provides a password on the 3D Secure page, and the transaction is done. Although the process looks very simple, it involves multiple stages and many parties.

Parties Involved

We first need to know all the various parties that are involved in the process of a Credit Card Transaction:

• CARDHOLDER: As the name suggests, this is the person or legal entity that owns the Credit Card.

• MERCHANT: This is the store, online shop or vendor that sells goods or services to the Cardholder. The merchant needs to be able to accept Credit Card payments. The merchant sends the card details over the network and gets the confirmation of the sale.

• ACQUIRING BANK/MERCHANT’S BANK: The acquiring bank is responsible for receiving the payment authorization request from the merchant and sending it to the issuing bank via the appropriate channels. It also relays the response received from the issuing bank back to the merchant. The merchant, based on this response, decides whether to

• CREDIT CARD/SCHEME NETWORK: These entities operate the networks that process Credit Card payments worldwide. Some well-known examples are VISA, MasterCard, Discover and American Express. A Credit Card is normally associated with one network. The network forwards the payment details received from the Acquiring bank to the Issuing Bank and passes the response back to the Acquiring bank.

• ISSUING BANK: This is the financial institution that has issued the Credit Card involved in the transaction. It receives the payment authorization request from the Card network and either approves or declines the transaction.

Credit Card Transaction Process

Let’s take an example of an online purchase and the steps involved in the full process.

Stage 1: Authorization

In the authorization stage, the merchant must obtain approval for the payment from the issuing bank. This acts as a guarantee that the merchant can receive the money requested from the customer (via the issuing bank).

Figure 2: Authorization Process

1. Cardholder enters the credit card details on the merchant website.

2. Merchant sends the details of the credit card with the amount to the acquiring bank.

3. The acquiring bank processes and forwards the details to the credit card network.

4. The credit card network clears the payment and requests payment authorization from the issuing bank. This normally involves sending the Card Number, Card Expiration, Cardholder name, CVV, Address details and amount. Additional security check results (e.g. 3D Secure) may also be sent.

5. The Issuing bank validates the transaction against fraud and various security measures. Based on the results, the transaction is either approved or rejected. This response is sent back to merchant via the same channel: via Credit card network and Acquiring bank.

6. If the transaction is approved, the Issuing bank puts a hold for the purchase amount (the approved amount) on the cardholder account.

7. Once the merchant receives the response from the Issuing bank via the Acquiring bank, the goods or services are delivered to the cardholder. The merchant also keeps track of all these confirmed authorizations.
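The relay described in the steps above can be sketched in a few lines of Python. This is only an illustration and not the Global Online implementation: the function names, the test card identifier and the simple credit-limit rule applied by the issuing bank are all assumptions made for the example.

```python
# Hypothetical sketch of the authorization chain (steps 2 to 6).
# Function names, the test card id and the approval rule are assumptions.

def issuing_bank_authorize(accounts, card_number, amount):
    """Approve when available credit minus existing holds covers the amount."""
    account = accounts.get(card_number)
    if account is None:
        return "DECLINED"
    if amount <= account["credit"] - account["on_hold"]:
        # Step 6: place a hold for the approved amount on the account.
        account["on_hold"] += amount
        return "APPROVED"
    return "DECLINED"


def card_network_forward(accounts, card_number, amount):
    # Step 4: the card network requests authorization from the issuing bank.
    return issuing_bank_authorize(accounts, card_number, amount)


def acquiring_bank_process(accounts, card_number, amount):
    # Step 3: the acquiring bank forwards the request and relays the response.
    return card_network_forward(accounts, card_number, amount)


accounts = {"TEST-CARD-1": {"credit": 100.0, "on_hold": 0.0}}
first = acquiring_bank_process(accounts, "TEST-CARD-1", 60.0)
second = acquiring_bank_process(accounts, "TEST-CARD-1", 60.0)
print(first, second)
```

In this sketch the second request is declined because the hold placed by the first approval leaves too little available credit, which mirrors the purpose of the hold in step 6.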

Stage 2: Clearing and Settlement Process

In this phase the real money transfer takes place. The transaction is displayed on the cardholder account statement and the money is paid to the merchant account.

Figure 3: Clearing and Settlement Process

1. At the end of the day, all the approved authorizations are batched by the merchant and sent to the Acquiring bank.

2. The acquiring bank routes the batched transaction details to the credit card network for settlement.

3. The Credit Card network forwards all the approved transactions to the appropriate issuing bank.

4. The issuing bank revalidates the transactions and transfers the money to the Credit Card network less the Interchange Fee.

5. The Credit Card network credits the acquiring bank with the amount less Assessment fees. Sometimes this is combined at the issuing bank itself.

6. The acquiring bank, after receiving the funds, transfers the transaction amount to the merchant account after deducting a processing fee.

7. The merchant sees the transaction in his account and the Cardholder sees the transaction in his statement and pays the amount to the Issuing Bank.
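The fee deductions in the settlement steps above can be made concrete with a small worked example. The fee rates below are purely hypothetical and are not taken from any scheme or from Global Online; the sketch also assumes that each fee is a percentage of the original transaction amount.

```python
from decimal import Decimal

CENT = Decimal("0.01")


def settle(amount, interchange_rate, assessment_rate, processing_rate):
    """Follow one transaction amount through the settlement chain.

    All three rates are hypothetical inputs; each fee is assumed to be a
    percentage of the original amount, rounded to cents.
    """
    interchange_fee = (amount * interchange_rate).quantize(CENT)
    assessment_fee = (amount * assessment_rate).quantize(CENT)
    processing_fee = (amount * processing_rate).quantize(CENT)

    issuer_pays_network = amount - interchange_fee                # step 4
    network_pays_acquirer = issuer_pays_network - assessment_fee  # step 5
    acquirer_pays_merchant = network_pays_acquirer - processing_fee  # step 6
    return {
        "issuer_pays_network": issuer_pays_network,
        "network_pays_acquirer": network_pays_acquirer,
        "acquirer_pays_merchant": acquirer_pays_merchant,
    }


result = settle(
    Decimal("100.00"),
    interchange_rate=Decimal("0.015"),  # assumed 1.5%
    assessment_rate=Decimal("0.001"),   # assumed 0.1%
    processing_rate=Decimal("0.003"),   # assumed 0.3%
)
print(result)
```

With these assumed rates, a purchase of 100.00 results in 98.50 reaching the card network, 98.40 reaching the acquiring bank and 98.10 reaching the merchant.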

1.3 Scope and Objectives

All Payment Service Providers have a constant need to improve their platform. Global Online is no different. The business is always looking for solutions to improve the Authorization rate, improve Response Time and provide value added services.

The scope of this research is to look at alternatives to the current transaction routing logic (sending a transaction to an Acquirer for processing) and see whether Machine Learning can help in addressing the above-mentioned challenges.

In the end, we would like to quantify whether, with Machine Learning, a smart and dynamic routing solution could be built that helps improve the Global Online transaction processing platform. There could be other use cases and business value created along the way, but as an objective, we want to prove that we can make the machine a bit smarter when making routing decisions.

1.4 Thesis outline

This research follows the industry standard CRISP-DM [6] (CRoss Industry Standard Process for Data Mining) process for data-driven projects. The CRISP-DM model contains six phases: Business Understanding, Data Understanding, Data Preparation, Modelling, Evaluation and Deployment. The model follows an iterative approach to Data Mining projects, and the focus is always on the business needs.

The same approach has been followed in this research. At the start of the research, the business understanding was established and documented. Based on it, data sources were identified and data extraction was performed. The data in this case contained some sensitive information, which was masked or cleaned before further processing.

The next step is to prepare/enrich the data for modelling. Many columns were converted to categorical variables. Missing values were replaced carefully. Business transformations were performed to come out with more relevant features.

Figure 4: CRISP-DM overview

After this, models were selected in consultation with professors and on the basis of the literature researched. The various evaluation conditions were also explained in detail. The models were then fine-tuned and adjusted using the training data set. After the models were scored, the evaluation phase started.

During the Evaluation phase, the pre-defined evaluation results were published and the inferences were explained. These inferences were then cross-validated with the business and further refined.

2 Business Understanding

We start the research by exploring the Business needs and goals. We then document the research questions that we would like to answer as part of this research. We then document the Business requirements and define the scope of this research.

2.1 Business goals

For Global Online to provide a higher quality of service and have a competitive advantage over other Payment Service Providers, it needs to provide a more stable and better performing platform than the others. “Better performance” here means that when a transaction is sent to Global Online for processing, it has the highest chance of converting to a sale.

Figure 5: General Routing Process

Dynamic Routing is expected to improve the response times of the Global Online authorization APIs. This can also lead to an increase in the overall success rate of transactions processed by the Global Online platform.

2.1.1 Performance Gains

One of the most important performance gains is that a transaction is more likely to be directed to a more stable Acquirer with a higher success rate. This could keep transactions out of the fall-back scenario, in which a transaction rejected by one Acquirer is submitted for processing to another Acquirer.

This is an iterative implementation, and in some cases a transaction can be failed over to several more Acquiring connections for processing.

• Selection: Merchant Configuration, Acquirer Configuration

• Filtering: Merchant Configuration, Transaction Data

• Ordering: Priority, Random
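The three routing stages, together with the fall-back scenario described in this section, can be sketched as follows. This is a simplified illustration under stated assumptions and not the actual Global Online routing logic: the configuration layout, the field names (`active`, `schemes`, `priority`) and the `send` callback are hypothetical.

```python
import random


def order_acquirers(transaction, merchant_config, acquirer_config, rng=None):
    """Return acquirer names in the order in which they would be tried."""
    rng = rng or random.Random()
    # Selection: acquirers set up for the merchant that are currently active.
    selected = [name for name in merchant_config
                if acquirer_config.get(name, {}).get("active", False)]
    # Filtering: keep acquirers the merchant allows for this card scheme.
    filtered = [name for name in selected
                if transaction["card_scheme"] in merchant_config[name]["schemes"]]
    # Ordering: by priority. Shuffling first breaks ties randomly,
    # because Python's sort is stable.
    rng.shuffle(filtered)
    filtered.sort(key=lambda name: merchant_config[name]["priority"])
    return filtered


def authorize_with_failover(transaction, ordered, send):
    """Try each acquirer in turn until one approves (fall-back scenario)."""
    attempts = []
    for name in ordered:
        approved = send(name, transaction)
        attempts.append((name, approved))
        if approved:
            break
    return attempts


# Hypothetical configuration for one merchant.
merchant_config = {
    "P1": {"priority": 2, "schemes": {"VISA", "MASTERCARD"}},
    "P2": {"priority": 1, "schemes": {"VISA"}},
    "P3": {"priority": 3, "schemes": {"VISA"}},
    "P4": {"priority": 1, "schemes": {"MASTERCARD"}},
}
acquirer_config = {
    "P1": {"active": True}, "P2": {"active": True},
    "P3": {"active": False}, "P4": {"active": True},
}
transaction = {"card_scheme": "VISA"}

ordered = order_acquirers(transaction, merchant_config, acquirer_config)
# Stand-in for the real acquirer call: here only P1 approves.
attempts = authorize_with_failover(transaction, ordered, lambda name, txn: name == "P1")
print(ordered, attempts)
```

In this sketch the ordering step is static. Dynamic Routing, as explored in this research, would replace the fixed priority with an order learned from observed Acquirer behaviour.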

2.1.2 Service Gains

Most other Payment Service Providers are leveraging their data to provide value added services. Dynamic Routing is one such service: it uses the behaviour observed in the platform and adjusts accordingly. This will help Global Online increase its value added services based on data.

Offering Dynamic Routing as a service will also help reinforce the perception among merchants that Global Online has a mature platform with advanced analytics capabilities. This, in turn, will help sales retain existing customers and attract new ones in the highly competitive payment industry.

The traditional sources of revenue in the payment industry are already under pressure, and data-driven services like Dynamic Routing can open new avenues of revenue, helping to increase overall revenue for Global Online.

2.2 Research question

This section helps in structuring the research to answer the correct question and keep the focus in the right direction. The questions are divided into three sub-categories: Business, Academic and Model Evaluation.

2.2.1 Business related research question

This research is done in a business context and hence is highly influenced by the business requirements.

1. How can Global Online improve Credit Card Authorization rate?

2. How can Global Online improve its API response time?

3. How can Global Online use the data available to provide value added services?

4. What future steps will be needed to implement this solution?

2.2.2 Academic related research question

The research is done in an academic context and hence covers some academic aspects.

1. What kinds of data sources are used? What is the data quality?

2. What kind of models are used?

2.2.3 Model Evaluation related research questions

The end result of this research is to provide business with proof on the reliability of the machine learning models and to present the results.

1. Are the model results reliable?

2. How does the model perform?

3. What are the possible business gains?

2.3 Requirement Specification

The main goal of this research is to help understand the behaviour of an Acquirer and make smart routing decisions based on that behaviour. However, this can only be achieved when the model performs well and provides good results. The requirement specification section lists these conditions in terms of specific requirements.

2.3.1 Business Requirements

The research is driven by business needs and therefore the business requirements are the main drivers.

- The model should help in determining the success probability of a transaction with an acquirer.

- The model should be able to help the business in deciding the order of processing a transaction with multiple Acquirers.

- The model should be able to provide results quickly to avoid any delay in the processing of transactions.

- The model should be generally explainable to the merchants.

- The model should be learning regularly to avoid any manual intervention.
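One way to meet these requirements together is an online logistic regression that scores every candidate Acquirer for a transaction and is updated after each observed outcome. The sketch below is a minimal illustration of that idea and is not necessarily the model selected later in this thesis: the class, the feature naming and the toy training stream are assumptions made for the example.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression over sparse binary features, updated per example."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature string -> weight, inspectable for explanation

    def predict(self, features):
        z = sum(self.weights.get(f, 0.0) for f in features)
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, approved):
        # One stochastic gradient step on the log loss.
        error = self.predict(features) - (1.0 if approved else 0.0)
        for f in features:
            self.weights[f] = self.weights.get(f, 0.0) - self.learning_rate * error


def attempt_features(transaction, acquirer):
    """Acquirer, transaction fields, and their crosses as feature strings."""
    features = [f"acquirer={acquirer}"]
    for key, value in sorted(transaction.items()):
        features.append(f"{key}={value}")
        features.append(f"acquirer={acquirer}|{key}={value}")
    return features


def rank_acquirers(model, transaction, acquirers):
    """Order acquirers by predicted success probability, highest first."""
    scored = [(model.predict(attempt_features(transaction, a)), a) for a in acquirers]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored]


# Toy stream (assumed behaviour): P1 approves VISA, P2 approves MASTERCARD.
model = OnlineLogisticRegression()
stream = [
    ({"card_scheme": "VISA"}, "P1", True),
    ({"card_scheme": "VISA"}, "P2", False),
    ({"card_scheme": "MASTERCARD"}, "P1", False),
    ({"card_scheme": "MASTERCARD"}, "P2", True),
]
for _ in range(50):
    for txn, acquirer, approved in stream:
        model.update(attempt_features(txn, acquirer), approved)

visa_order = rank_acquirers(model, {"card_scheme": "VISA"}, ["P1", "P2"])
mc_order = rank_acquirers(model, {"card_scheme": "MASTERCARD"}, ["P1", "P2"])
print(visa_order, mc_order)
```

The crossed features (Acquirer combined with a transaction field) are what allow the ranking to differ per transaction; with only additive features every transaction would receive the same Acquirer order. Each update touches only the features of one attempt, which keeps scoring fast and lets the model keep learning without manual retraining.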

2.3.2 Research Scope

The scope of this research is to look at historic data and apply machine learning algorithms to see whether we can predict the probability of a transaction being successful with an Acquirer. However, the results should be read with a lot of caution.

First, the data used in this research is a very small subset of all the transactions that are processed on the Global Online platform. It does not include all the possible combinations that can be set up. The data set was reduced due to the sensitivity of the data and to limit the

added and more reliable results can be achieved. This research provides a glimpse of how Machine Learning algorithms can be used to improve the overall platform of Global Online.

This research does not aim to implement Dynamic Routing in the production system or to report results from it. The scope is to provide proof that the ordering of the Acquirers can be altered to give better response times and, in turn, a better transaction success rate.

2.4 Chapter Conclusion

This chapter presented the Business and Academic aspects of this research and documented the requirements that are used throughout it. The chapter also set out the scope of the research in the context of the questions raised. The chapters ahead will try to answer these questions.

3 Related Literature

3.1 Payment Industry

A lot of research has been done on using Machine Learning models in the payment industry. The main focus of this research has been Fraud Detection. The payment industry works with actual money, and fraud is one of the major risks to this industry. The amount of research that can be found in this area is therefore understandable and justifiable.

Most of the routing defined in the industry has been rules based: based on the issuing bank, the transaction needs to be sent to the respective bank, and based on the scheme of the card (e.g. Visa, MasterCard or AMEX), the transaction needs to be sent to the respective scheme.

However, during this research we were unable to find any specific literature or research on transaction routing. This is understandable, as not many Payment Service Providers offer the kind of flexibility in transaction routing that the Global Online platform offers. Also, these routing decisions depend heavily on the cost per transaction, which is determined by mutual agreement with the acquirer and is very static in nature. We are starting to see some dynamic pricing offered by Acquirers, and soon this will become an interesting topic.

3.2 Comparable Research

During the master programme, we were introduced to the concepts of online marketing as part of the Quantitative Marketing subject. CTR (Click Through Rate) calculations and online learning algorithms were covered as part of this subject. The dynamic nature of these algorithms and the ease of understanding them inspired this research to explore their use beyond the online marketing world.

CTR calculations are based on continuous learning of customer behaviour. The algorithm observes features of an advertisement and tries to predict whether a customer will click on it. The probability of a click is so low that it carries little meaning in absolute terms. However, the relative question is meaningful: given a particular advertisement and two customers with different characteristics, which one is more likely to click on it? This is very similar to the question in this research.

Using the same approach, this research focusses on the following question: given the set of features of a transaction, which acquirer is the best one to process it? The success or failure of a transaction depends on many fields that are not even available to a Payment Service Provider. The predicted probability is therefore not reliable in absolute terms and should not be used as such.

Acquirers change their behaviour based on the knowledge they have about the transactions and the underlying providers and schemes. This is something we have no control over. The only thing we can do is learn from this change in behaviour and adjust accordingly.

In the current industry, this is a manual process in which observations are made and then reacted upon. As part of this research, our focus is to show that this is something machines can do in real time, adjusting accordingly. The benefits of this adjustment will be reflected in improved customer satisfaction and better value for money.

3.3 Chapter Conclusion

The chapter started with the current research focus on Payment Industry and what lies in the future. It then compares the CTR calculation and builds a case of using algorithms defined for that purpose to be applicable for this research.

4 Understanding Data

Most solutions rely on the quality of the data and how easily it is available. Dynamic Routing is no different. The solution is heavily dependent on the available data and its characteristics. This chapter first describes the challenges associated with the data, including some data that was later discarded, and how the data was collected. It then describes the various characteristics of the data. Finally, it describes the data transformations that were performed to prepare the data for the model evaluation phase.

4.1 Data Challenges

Most of the transaction data is available while the transaction is in flight (or in progress). However, when looking at a transaction historically, some of these data points are not readily available or are not stored because of the various regulations that surround the payments industry, specifically Credit Card transactions. Below, some of these challenges are explained, together with the route followed in this research to avoid them.

4.1.1 Global Online Monitoring

The Global Online platform is monitored 24x7, and abnormal behaviour from any Acquirer triggers an incident, after which the transaction flow is migrated to another Acquirer. As a result, some failure scenarios are not recorded.

As part of this research, only a few acquirers and merchants have been used. For this set, a check was made to ensure that no configuration change happened during the period for which we extracted the data.

4.1.2 Merchant Categories

Transaction failures also depend strongly on the merchant category. For example, airline merchants have a very different success rate for transactions that include airline data than for transactions that do not. However, this information is not applicable to other merchant categories. In general, there are a lot of merchant categories (normally 1000+). Grouping merchants into a set of categories is a subject for research in itself.

As part of this research, only one specific category of merchants is selected, namely online retail merchants. These merchants have a high volume of data and have also implemented (or are in the process of implementing) most of the new features supported by our platform.

4.1.3 Fraud Score

It is a known fact that if a transaction has a poor fraud score, there is a high possibility of the transaction not completing successfully.

However, the Fraud score in our platform does not determine the transaction performance at the acquirer, because acquirers might have their own way of identifying a fraudulent transaction. Also, the Fraud results are normally not passed to the acquirer; it is for the merchant to choose whether to continue with the transaction or not.

Retail merchants of Global Online are special in this case and normally have their own fraud detection mechanism. This research focusses on merchants who are not using the Fraud score of our platform.

4.1.4 3D check

The 3D check is an additional security measure to prevent fraudulent transactions from being processed by the system. However, it normally results in a lot of drop-off scenarios. For example, the customer has to enter a code received via SMS; if the customer does not have the phone nearby, they might not continue, and the transaction is not executed. On the other hand, if we do not do the 3D check, the merchant runs the risk of chargebacks, where the customer claims that the transaction was not initiated by them.

Considering the importance of this, a mix of merchants with and without 3D enabled is used in this research. However, because of the merchants, currency and Acquirers that we have chosen, we do not expect many transactions to go through the 3D check process.

4.1.5 Amount in Currency

We wanted to use the transaction amount as a feature. The Global Online platform supports all currencies across the globe. However, the amount is dependent on the currency, and converting between currencies is a challenge.

To avoid this, only amounts in a single currency are used in this research. In an actual implementation, currency can also be added to the model. But for simplicity and to limit the scope of this research, only one currency is used.

4.1.6 Identifying fail-over transactions

This is a feature unique to the Global Online platform: a transaction that has failed on one acquirer can be sent to another acquirer and processed there. Because this feature was developed later than the original product design, the data model is not in a state that allows these failures to be identified easily. The data is spread across multiple tables and data sources, which are very hard to correlate on a large scale.

For this research we have already fixed the currency. This also helps us identify the Acquiring platforms that Global Online mostly uses for processing that currency. Using this subset, it was possible to extract the information and use it for this research.

4.2 Data Collection

The data is distributed and stored in multiple data sources in the Global Online platform, each fine-tuned for a specific purpose. For this research, the data was collected from the following main sources.

4.2.1 Merchant and Platform Configuration Data Store

All the data related to the configuration of the platform and the merchants is stored in this data source. This source helps in deciding the transaction flow for a merchant at any given time. In this research, this data source is used to identify the subset of merchants and their configuration.

4.2.2 Transaction Data Store

This is a highly resilient, high-performance data store where all the relevant transaction data is stored. It provides information on the transactions that have been performed on the Global Online platform.

4.2.3 Audit Data Store

This is the data store where information about the low-level communication with the Acquirers and external parties is stored for debugging purposes. This data store provides us with the information to identify fail-over scenarios, where the transaction failed at one provider and passed at another.

4.3 Data Description

This section explores the data used in this research and provides details and explanations where required. The data set consists of 487167 transactions overall.

4.3.1 Success Rate

A transaction is defined as success when the transaction is approved by the Acquirer and there are no network errors that would have resulted in the failures. In our dataset however, this is not

other Acquirer. For the final analysis, the raw transaction data needs to be split into individual attempt data. The charts below show that even though we have a high success rate, the actual transaction failures are quite high.

After converting the data from transaction level to attempt level, the total number of attempts to Acquirers increases to 513192 rows. The number of successful transactions remains the same, as expected; the failed attempts, however, are added to the dataset as new records.
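The conversion from transaction level to attempt level can be sketched as follows. With the numbers above, it adds 513192 - 487167 = 26025 rows, one for each failed attempt that preceded a fail-over. The record layout used here (an ordered list of Acquirers tried and a final approval flag) is an assumption for illustration and not the actual Global Online data model.

```python
def to_attempts(transactions):
    """Expand each transaction into one row per Acquirer attempt.

    Assumes the Acquirers are listed in the order they were tried and
    that only the last attempt can be the successful one.
    """
    attempts = []
    for txn in transactions:
        tried = txn["acquirers_tried"]
        for position, acquirer in enumerate(tried):
            is_last = position == len(tried) - 1
            attempts.append({
                "transaction_id": txn["id"],
                "acquirer": acquirer,
                "attempt_no": position + 1,
                "success": txn["approved"] and is_last,
            })
    return attempts


# Hypothetical sample: one direct success, one fail-over success, one failure.
transactions = [
    {"id": "T1", "acquirers_tried": ["P1"], "approved": True},
    {"id": "T2", "acquirers_tried": ["P1", "P2"], "approved": True},
    {"id": "T3", "acquirers_tried": ["P2", "P4"], "approved": False},
]
attempts = to_attempts(transactions)

transaction_success_rate = sum(t["approved"] for t in transactions) / len(transactions)
attempt_success_rate = sum(a["success"] for a in attempts) / len(attempts)
print(len(attempts), transaction_success_rate, attempt_success_rate)
```

In this toy sample the number of successes stays at two, while the success rate drops from two out of three transactions to two out of five attempts, which is the same effect as observed in the dataset.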

Figure 6: Overall and Transactional Success Rate

It is worth emphasising that, because of the failover implementation, Global Online already offers a better service to its merchants. Failover reduces the overall number of failed transactions and provides a better level of service. In the dataset, we can see that of the transactions that failed at the first Acquirer, 43.5% were successful at the next Acquirer.

4.3.2 Spread across Acquirer

We also look at the transaction attempts for each Acquirer below.

Figure 8: Acquirer Transaction Results

As can be seen, a lot of transaction attempts were made for Acquirers P1, P2 and P4. However, not many transactions went to Acquirer P3. This is not especially significant here, as the data selected is a subset of the whole flow in the Global Online platform. P3 is a provider for which we do not have large flows, whereas the other three Acquirers are where most of the flow for the chosen merchants was processed.

It can also be seen that the Acquirers have a success rate varying from 10% to 20%. This is a general observation in the payment industry, and hence the data seems to be well spread across all the Acquirers.
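The per-Acquirer figures, as well as the breakdowns by other attributes in the following sections, are grouped success rates over the attempt-level data. A minimal sketch of that calculation is given below; the field names and the sample rows are hypothetical.

```python
from collections import defaultdict


def success_rate_by(attempts, key):
    """Return the share of successful attempts for each value of `key`."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for attempt in attempts:
        totals[attempt[key]] += 1
        successes[attempt[key]] += int(attempt["success"])
    return {value: successes[value] / totals[value] for value in totals}


# Hypothetical attempt-level rows.
sample_attempts = [
    {"acquirer": "P1", "card_scheme": "VISA", "success": True},
    {"acquirer": "P1", "card_scheme": "MASTERCARD", "success": False},
    {"acquirer": "P2", "card_scheme": "VISA", "success": True},
    {"acquirer": "P4", "card_scheme": "VISA", "success": False},
]
by_acquirer = success_rate_by(sample_attempts, "acquirer")
by_scheme = success_rate_by(sample_attempts, "card_scheme")
print(by_acquirer, by_scheme)
```

The same function can be reused for any categorical attribute by changing the `key` argument.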

4.3.3 Credit Card Company

Looking at the transaction distribution across the Credit Card companies, namely VISA and MasterCard, we also see that the transactions are evenly distributed and have a similar success rate.

We do, however, see a slight increase in the overall failover rate for Visa cards. Visa recently introduced specifications for transactions in which the Credit Card is stored by the Merchant: the Merchant needs to provide more information on what the transaction is about and how the payment is triggered. This is all in an attempt to know more about the transaction and improve security. However, adoption of this requirement is not yet high. This could possibly be the reason for the slightly higher failure rate.


Figure 9: Credit Card Company Transaction Success Rate

4.3.4 Authorization Type

Authorization Type is a parameter that the merchant passes to the Acquirer, indicating when and how much of the authorized amount will be settled/captured. A Final authorization transaction needs to be settled more quickly than a Pre Auth transaction. Also, the Merchant can request a slightly higher amount for capture in a Pre Auth transaction, whereas in a Final Auth the amount cannot be changed.

Figure 10: Authorization Type Success Rate

Pre Auth transactions are very commonly used in hotels. During check-in the cashier does a Pre Auth for the amount of the stay. The customer might then take some drinks from the room bar, which can be added to the final invoice and billed to the card without asking the customer to present the card again. However, these transactions are also somewhat riskier, and therefore we observe more failures.

4.3.5 CVV Indicator

The CVV is the 3-digit security code printed on the back of the card. It was one of the first security measures introduced. The CVV indicator specifies whether this information is provided by the Merchant/Customer or not. According to the PCI compliance rules, the CVV cannot be stored by anyone anywhere in the system. The initial idea was to ensure that such a transaction can only be initiated by the person who is holding the card.

Figure 11: CVV Indicator Success Rate

However, with so much credit card fraud nowadays, it has become evident that this security measure is not good enough. The data also shows that a large part of the transactions is missing this information, without a large impact on the success rate.

4.3.6 AVS Indicator

AVS stands for Address Verification System. During a Credit Card transaction, merchants request Customers to fill in the billing address. This is the address used by the Address Verification System to validate the customer.

Figure 12: Address Verification Success Rate

The address, if provided, is then passed on to the Issuer, where it is checked for correctness. The AVS indicator here specifies that the merchant has submitted the Address details to our platform and that the information will be passed to the actual Acquirer.

We see a lot of transactions using this information. This is because when the Address details are submitted, the processing fee for the transaction is reduced. It is also very common in some


of the new address. This results in an incorrect address being submitted and the transaction being rejected or failed.

4.3.7 Additional Features

Besides the features defined so far, the following additional features are also extracted from the Global Online Transaction Data Store. The variability in this data is very low. This is mostly due to merchant-specific configuration or low adoption of the feature by the selected merchants.

Figure 13: Additional features variability

UCOF (Unscheduled Card on File) is a very recent mandate from the Visa scheme that applies to merchants saving the card to support use cases such as One-Click pay, UBER taxi, auto top-up, etc. In these cases, the card is saved by the merchant and transactions are initiated on an irregular schedule and for varying amounts. Since adoption is still low, we do not have enough values for a real analysis. However, it is worth mentioning because adoption is only increasing, and soon this will add a lot of variability to the data.

Most of the transaction attempts have the Fraud Results value ‘N’. This indicates that the Fraud Check was not requested via the Global Online platform. There are multiple Fraud providers, and bigger merchants either integrate directly with them or have their own Fraud Engine. For the research, we have selected merchants that make little use of Fraud results. However, the feature is kept for future use cases.

Recurring is another type of payment and in general has no special characteristics other than that it provides a bit less information. E.g. the CVV is never provided in recurring transactions. In recurring transactions, we provide a reference to the original transaction.

Figure 14: Recurring Transaction Success Rate

As can be seen above, the success rate appears to be stable for both one-off and recurring transactions. However, recurring transactions generally have a slightly lower failure rate than one-off transactions.

4.3.8 Correlation matrix

Correlation gives an indication of how two variables are related to each other. If two variables change in the same direction, they are positively correlated. If they change in opposite directions, they are negatively correlated. If two continuous variables are not related to each other, their correlation value is 0. This is useful for identifying whether the variables are independent or not.

Below is a graph showing the correlation matrix in heat-map format. Since all the categories of each variable are included, we can see some perfectly negative correlations in the graph. Normally, to avoid multicollinearity, the first or the last category is dropped. However, with the current model implementations this is not required, as the model removes them automatically.
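To illustrate why perfectly negative correlations appear when every category is kept, consider a two-category variable encoded as two indicator columns. The sketch below uses made-up card-brand values (not the thesis data) and a hand-written Pearson correlation; the helper name `pearson` is introduced here for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Illustrative values only: card brand of six attempts.
brands = ["VISA", "MC", "VISA", "VISA", "MC", "MC"]

# One indicator (dummy) column per category, keeping all categories.
is_visa = [1 if b == "VISA" else 0 for b in brands]
is_mc = [1 if b == "MC" else 0 for b in brands]

# The two columns always sum to 1, so one fully determines the other.
r = pearson(is_visa, is_mc)
```

Because each row has exactly one of the two indicators set, the columns are perfectly negatively correlated, which is the pattern visible in the heat map for complementary categories.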


Figure 15: Correlation matrix

From the above, we see some relations, which are explained below:

- Credit Card Company and Auth Type: This is because Auth Type is currently supported by Master Card only. Visa has yet to adapt to this, and therefore VISA transactions are correlated with the Unspecified Auth Type.

- UCOF Indicator and Requestor: These fields also have very low variability, hence the high correlation observed. Both fields normally go hand in hand.

- AVS Indicator and Provider 4: This could just be a coincidence, or the provider does not support the AVS check. In that case, even when the transaction contains an Address, the information is not passed on and the indicator is set to false.

4.4 Data Cleaning and Transformation

The following data cleaning and transformation steps are performed to ensure the quality of the data and make it ready for further analysis.


4.4.1 Transaction to Attempt Transformation

One of the critical features of Global Online is the Failover mechanism it provides to merchants. This ensures that any request received by the Global Online transaction processing platform will be tried with as many acquirers as possible, to give a seamless experience to the end customer. However, this feature was added after the initial platform was built and therefore adds some complexity to the data model.

All the attempts are available in the Audit Data Store. We therefore extracted the audit data for all the Acquirers for a given transaction, which is then available in raw format. This means that the raw extract contains entries at the level of merchant requests (transactions), but not at the level of attempts made towards the acquirers.

To get to attempt level, all records in the raw transaction data were scanned, and if information from any Acquirer attempt other than the last one was found, new rows were created to document that attempt. Care was also taken to keep the data in the correct order, so that the failed attempts appear in the generated data before the last (successful or unsuccessful) one.
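The expansion described above can be sketched as follows. This is a minimal illustration, not the code used in the research: the record layout (`txn_id`, `acquirer`, `success`, `earlier_attempts`) is a hypothetical stand-in for the audit data, and it assumes that every attempt before the last one failed.

```python
def to_attempts(transactions):
    """Expand transaction-level records into one row per Acquirer attempt.

    Assumed (hypothetical) input layout per transaction:
      txn_id            - transaction identifier
      acquirer, success - Acquirer and outcome of the last attempt
      earlier_attempts  - Acquirers tried before the last one, in order
    """
    rows = []
    for txn in transactions:
        attempt_no = 0
        # Earlier attempts are emitted first so failed rows precede the final one.
        for acquirer in txn.get("earlier_attempts", []):
            attempt_no += 1
            rows.append({"txn_id": txn["txn_id"], "attempt_no": attempt_no,
                         "acquirer": acquirer, "success": False})
        # The last attempt carries the final outcome of the transaction.
        rows.append({"txn_id": txn["txn_id"], "attempt_no": attempt_no + 1,
                     "acquirer": txn["acquirer"], "success": txn["success"]})
    return rows


raw = [
    {"txn_id": 1, "acquirer": "P2", "success": True, "earlier_attempts": ["P1"]},
    {"txn_id": 2, "acquirer": "P1", "success": False, "earlier_attempts": []},
]
attempt_rows = to_attempts(raw)
```

In this toy input, the first transaction yields two rows (a failed attempt at P1 followed by a successful one at P2), while the second yields a single failed row, so the number of successful rows is unchanged.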

4.4.2 3D Results Calculation

All attempts that pass through the 3D verification process involve complex logic. As a Payment Service Provider, we cannot be sure what the customer has done during the 3D validation. The information is queried from our server and retrieved from the third-party 3D providers.

This result is not a Boolean of success or failure, but a complex condition involving multiple values. The data was transformed to convert this into a categorical variable with the following values:

Approved: Fully Approved

Partial: Partial Approval

No Attempt: Merchant chose not to do 3D

Not Approved: Failed or cancelled by customer

Most retail merchants choose to continue with the transaction irrespective of the 3D results. This is normally due to the business model of these merchants, where the aim is to make the sale go through. This adds risk to the transaction in case of chargebacks; however, given the volume of online transactions, the risk is not that high.


It has also been observed that the 3D check in general results in a lot of dropouts (customers leaving before completing the purchase). For retail merchants, the cost of losing a sale outweighs the benefit of the additional security check, and therefore this configuration is selected.

It can also be seen that most retail merchants prefer customers to save their card information for future purchases. Features like one-click-pay are also promoted to avoid delay in making the sale. Schemes like VISA and Master Card are now enforcing more regulations around this practice, and many changes are expected in this regard in the future with 3D v2 and PSD2.

4.5 Training and Test Data

Once the complete dataset is transformed and cleaned, it is split into a training and a test data set. Because attempts are sequential events, their order is important, and it has been respected during the split (shuffling is turned off when splitting the dataset). The first 80% of the attempts are used for training the model and the last 20% for testing it.
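A minimal sketch of such an order-preserving split is shown below. The function name `ordered_split` is introduced here for illustration; in sklearn, a comparable result can be obtained with `train_test_split(..., test_size=0.2, shuffle=False)`.

```python
def ordered_split(rows, train_fraction=0.8):
    """Split rows into train/test parts without shuffling, preserving order."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Stand-in for the attempt rows: only their positions matter here.
attempts = list(range(10))
train, test = ordered_split(attempts)
```

Every training row precedes every test row, so the model is always evaluated on attempts that occurred after the ones it was trained on. Applied to 513192 attempts, this cut leaves 102639 attempts in the test part, which agrees with the test set size reported in the Results chapter.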

During cross-validation for finding the optimal parameters, the training data set will be split into 5 subsections, which will be used as Training and Validation data sets.

4.6 Chapter Conclusion

This chapter shows that the transaction data available from the Global Online platform is a complex data set with various characteristics. It also explains the steps that were taken so that the data can be used for modelling in the next step. However, for in-flight transactions, most of this data is readily available and could be fed directly into the modelling engine.


5 Modelling

Continuing with the CRISP-DM model, now that the data has been processed, cleaned and prepared for model analysis, we start with the Modelling phase of this research. This chapter describes the model selection considerations and the evaluation mechanism used.

5.1 Model Overview

One of the classifications used in defining models is offline vs. online. Offline models must be trained upfront before being used in the production setup, whereas online models are trained on the fly with live data and adjust accordingly.

With offline trained models, the results are repeatable, but with online model, the results may vary based on the learning rate and variation in data.
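To make the online setting concrete, the sketch below implements a small categorical Naïve Bayes that is updated one event at a time. It is an illustration under stated assumptions, not the model used in the research: the class name, the feature names and the toy stream are made up, and Laplace smoothing is our choice. In sklearn, the Naïve Bayes estimators and `SGDClassifier` offer `partial_fit` for the same purpose.

```python
from collections import defaultdict
from math import exp, log


class OnlineNaiveBayes:
    """Categorical Naive Bayes updated one labelled event at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)
        self.class_counts = defaultdict(int)
        # feature_counts[label][feature_name][value] -> count
        self.feature_counts = defaultdict(lambda: defaultdict(dict))
        self.values_seen = defaultdict(set)

    def learn_one(self, features, label):
        """Update the counts with a single labelled attempt."""
        self.class_counts[label] += 1
        for name, value in features.items():
            counts = self.feature_counts[label][name]
            counts[value] = counts.get(value, 0) + 1
            self.values_seen[name].add(value)

    def predict_proba_one(self, features):
        """Return P(label | features) for every label seen so far."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n_label in self.class_counts.items():
            score = log(n_label / total)
            for name, value in features.items():
                count = self.feature_counts[label][name].get(value, 0)
                n_values = len(self.values_seen[name] | {value})
                score += log((count + self.alpha) / (n_label + self.alpha * n_values))
            log_scores[label] = score
        # Normalise in a numerically stable way.
        top = max(log_scores.values())
        unnorm = {k: exp(v - top) for k, v in log_scores.items()}
        norm = sum(unnorm.values())
        return {k: v / norm for k, v in unnorm.items()}


model = OnlineNaiveBayes()
stream = [  # toy events: (features, 1 = success / 0 = failure)
    ({"acquirer": "P1", "cvv": "Y"}, 1),
    ({"acquirer": "P1", "cvv": "Y"}, 1),
    ({"acquirer": "P2", "cvv": "N"}, 0),
    ({"acquirer": "P1", "cvv": "N"}, 1),
    ({"acquirer": "P2", "cvv": "N"}, 0),
]
for features, label in stream:
    model.learn_one(features, label)
proba = model.predict_proba_one({"acquirer": "P1", "cvv": "Y"})
```

Each call to `learn_one` only increments counters, so the model can keep learning from live attempts without being retrained from scratch; the predictions therefore depend on the order and content of the events seen so far.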

5.1.1 Model Selection

In this research we explore both types of model. This is useful to show that we are able to determine the success probability of a transaction. Also, offline learning models are better at providing results because they use the complete training data at once, while online learning models use each event one by one.

For this research we have selected the following models: Naïve Bayes and SGD Classifier from the online learning perspective, and Random Forest and Decision Tree from the offline learning perspective. The rationale behind this is that we want to test how the online models perform with live data and how they compare to the more common offline learning models.

5.1.2 Model Tuning

For fine-tuning the models, we did a grid search on the tuning parameters and chose the best combination among them. This was done using the standard grid search with cross-validation implementation available in the Python sklearn library.

However, for an actual implementation of Dynamic Routing, an online model seems to be the better choice. This is because of the continuous learning capabilities these models have, which fit the continuous updates in the model behaviour nicely.
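The grid search described in this section can be sketched with sklearn's `GridSearchCV`. The data, the parameter grid and the scoring metric below are assumptions made for illustration; the actual grids and scoring used in the research are not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the thesis data is not reproduced here.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hypothetical parameter grid, chosen only to show the mechanics.
param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 10]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,               # five subsections of the training data
    scoring="roc_auc",  # assumed scoring metric
)
search.fit(X, y)
best_params = search.best_params_
```

`GridSearchCV` fits one model per parameter combination and fold, and exposes the best combination through `best_params_` and its mean validation score through `best_score_`.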

5.2 Model evaluation

There are standard approaches to assess and evaluate machine learning models. These standard measures provide a common understanding and allow comparison across model classes. However, data also has its own characteristics, which make it unique. This makes evaluation


For this reason, we will be focussing on two types of model evaluation.

5.2.1 Model Metrics

This is the first kind of data that we will use to evaluate the performance of the models. These metrics include model Accuracy, Precision, Recall and F1-Score. All these metrics rely on the correctness of the prediction. The parameters used for correctness are defined as follows:

1. True Positive (TP): Successful transaction predicted correctly by model

2. False Positive (FP): Failed transactions predicted as successful by model

3. True Negative (TN): Failed transactions predicted correctly by model

4. False Negative (FN): Successful transactions predicted incorrectly by model

Precision and recall are very well documented (Wikipedia). In general, a perfect score of 1.0 for Precision means that all the predicted positive values are true positives. However, it does not say how many actual positives were missed.

Similarly, a perfect score of 1.0 for Recall means that all the actual positives were identified. However, it does not provide information on how many negatives were also predicted as positive. This is expected to happen with our data if the model always predicts success/positive: the Recall value will be a perfect 1, but the algorithm is then not really predicting anything.

To avoid those drawbacks, F1-Score and Accuracy will also be used. These provide a very quick understanding of how the model has performed.
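As a worked illustration of these four metrics, the sketch below computes them from confusion-matrix counts. The function name is ours; the example counts describe a model that always predicts success on a test set of 102639 attempts of which 17205 failed, figures taken from the Results chapter.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Always predicting "success": every failed attempt becomes a false positive,
# and there are no true negatives or false negatives.
always_success = classification_metrics(tp=102639 - 17205, fp=17205, tn=0, fn=0)
```

Recall is exactly 1.0 in this case, while Precision merely equals the share of successful attempts in the test set. This is why Recall alone cannot show whether a model has learned anything about failures.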

Our analysis is a binary classification problem in which we try to determine whether an attempt will be successful or not. For evaluating this kind of model, the Receiver Operating Characteristic (ROC) curve is also very commonly used. The area under this curve is referred to as the AUC score. The ROC curve is constructed from the True Positive Rate (TPR, also known as Recall) and the False Positive Rate (FPR) at various threshold settings.

$$TPR\ (\text{True Positive Rate}) = \frac{TP}{TP + FN}$$

$$FPR\ (\text{False Positive Rate}) = \frac{FP}{FP + TN}$$
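The construction of the ROC curve and its area can be sketched in a few lines. This is an illustrative implementation using a threshold sweep and the trapezoidal rule; the function names and the four toy scores are ours, and the sketch assumes the labels contain at least one positive and one negative.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) points obtained by sweeping the threshold over all scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve through the given points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [0, 0, 1, 1]            # 1 = successful attempt
scores = [0.1, 0.4, 0.35, 0.8]   # predicted probability of success
curve = roc_points(y_true, scores)
area = auc(curve)
```

An AUC of 0.5 corresponds to ranking successes and failures at random, and 1.0 to ranking every success above every failure; unlike Accuracy, the value does not depend on a single threshold such as 0.5.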

5.2.2 Model Performance

The dataset used in this research (after preparation) is specific to the platform it is extracted from, in particular with respect to the failed transactions. There are multiple reasons for the failed transactions in the data set. In general they are as follows:


- Transaction rejected by multiple Acquirer

- Transaction rejected by First Acquirer (but processed OK by Next Acquirer).

As can be seen, predicting success is easy; it is the failure that is challenging. This is very similar to CTR (click-through rate) models, where predicting success is the challenge because of a much stronger class imbalance.

Since we are doing binary classification, it is interesting to know how often we predicted a probability of less than 0.5 for failed transactions. This gives an indication of whether the model is able to predict not just successful but also failed transactions.
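The check described above amounts to computing the share of failed attempts whose predicted success probability falls below 0.5. A minimal sketch, with a made-up function name and toy values:

```python
def true_negative_rate(y_true, proba_success, threshold=0.5):
    """Share of failed attempts whose predicted success probability is below threshold."""
    failed = [p for y, p in zip(y_true, proba_success) if y == 0]
    if not failed:
        raise ValueError("no failed attempts in y_true")
    correctly_flagged = sum(1 for p in failed if p < threshold)
    return correctly_flagged / len(failed)


y_true = [1, 1, 0, 0, 1, 0, 0]                      # 1 = success, 0 = failure
proba = [0.9, 0.8, 0.7, 0.3, 0.6, 0.45, 0.55]       # toy predicted probabilities
tnr = true_negative_rate(y_true, proba)
```

In this toy example two of the four failed attempts receive a probability below 0.5, so half of the failures are recognised as such.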

Another important characteristic of the data is the failover transactions: transactions that failed at one Acquirer and were processed successfully by another. For these transactions, the standard routing approach was used. We convert each such transaction into two different attempts in the dataset, one that failed with Acquirer 1 and another that succeeded with Acquirer 2.

Given the scenario selection, we know that the transaction could be routed to both these Acquirers. We therefore need to compare the probability of success of this specific transaction for both Acquirers. These probabilities can eventually help us in determining the order of the Acquirers.
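The comparison described above can be sketched as follows: the same transaction is scored once per candidate Acquirer, and the Acquirers are then ordered by predicted probability of success. Function names and probability values are illustrative assumptions.

```python
def rank_acquirers(success_proba):
    """Order Acquirers from highest to lowest predicted success probability."""
    return sorted(success_proba, key=success_proba.get, reverse=True)


def count_correct_orderings(pairs):
    """Count failover cases where the Acquirer that succeeded was scored higher.

    pairs: (probability at the failed Acquirer, probability at the successful Acquirer)
    """
    return sum(1 for p_failed, p_success in pairs if p_success > p_failed)


# Toy predicted probabilities for one transaction at three Acquirers.
order = rank_acquirers({"P1": 0.62, "P2": 0.81, "P4": 0.77})

# Toy failover cases: (failed Acquirer score, successful Acquirer score).
n_correct = count_correct_orderings([(0.40, 0.90), (0.70, 0.60), (0.55, 0.80)])
```

For routing, only the relative order of the probabilities matters, not whether an individual probability lies above or below 0.5.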

5.3 Chapter Conclusion

This chapter started with providing an overview of the Model selection and tuning methods. Then it explained the methods that we will be using to evaluate the model which includes the general metrics and also specific calculations to judge model success.


6 Results

This chapter presents the results of this research. All the results are based on the test data set. It starts by documenting the metrics produced by each method/algorithm. It then evaluates these models and presents the results using the evaluation methods defined in the previous chapter. The chapter closes with the value for businesses that want to explore Machine Learning algorithms for routing decisions in a Credit Card transaction processing platform.

6.1 Metrics

As explained in the previous chapter, model evaluation requires the data from the Confusion Matrix. Below are the charts with the confusion matrices of all 4 models.

Figure 16: Confusion Matrix

An important observation from the matrices above is that neither the SGD Classifier nor the Random Forest Classifier was able to identify any failed transactions. This is evident from the fact that these methods did not classify a single attempt as failed.

Another important figure to look at is the True Negatives. We see that Decision Tree predicted the most True Negatives, and Naïve Bayes was also not that bad. This is an important characteristic because it shows that the model was able to identify transaction features


Using the above data, following table is generated that presents the Accuracy, Precision, Recall, F1-Score and AUC score.

Model            Accuracy  Precision  Recall  F1-Score  AUC
Naïve Bayes      0.8468    0.8587     0.9757  0.9138    0.7114
SGD Classifier   0.8323    0.8324     1.0000  0.9085    0.4428
Random Forest    0.8323    0.8324     1.0000  0.9085    0.6882
Decision Tree    0.8722    0.8681     0.9981  0.9286    0.6622

Looking at these numbers, it is clear that the Decision Tree algorithm produced the best results in terms of Accuracy, Precision and F1-Score. However, looking closely, Naïve Bayes (from the online learning methods) has not performed badly either. The AUC score also shows that Naïve Bayes had the best value.

Figure 17: ROC AUC Curve

The important features used by each model are documented in Appendix 1. In this research, not all transaction-specific features could be used because of compliance rules in the payment industry. For an implementation, however, more features should be considered for analysis.


6.2 Performance

As explained in the previous chapter, besides the Success scenarios, we are also interested in the failure scenarios. If a model/method is able to predict the failures correctly, then it means it has learned about features that lead to failed results quite correctly. To determine this, we need to check the True Negative Rate (TNR) of the models. TNR is determined using the following formula:

$$TNR\ (\text{True Negative Rate}) = \frac{TN}{TN + FP}$$

The results for each model are presented below. They show that Naïve Bayes and Decision Tree are not far apart in predicting the failed transactions, but overall the rate is still very low.

Model            TN    TN + FP  TNR
Naïve Bayes      3558  17205    0.2068
SGD Classifier   0     17205    0
Random Forest    0     17205    0
Decision Tree    4254  17205    0.2472

Since SGD Classifier and Random Forest never predicted a transaction as a failure, they score 0% on correctly predicting failed transactions.

The other performance measure described in the previous chapter compares the predicted probabilities for successful transactions that had a failed attempt with another Acquirer. During the pre-processing of the data, we added a flag to the failover transactions. This helped in identifying attempts that failed and were passed on to the next Acquirer. Using this flag, we are able to distinguish failed transactions that failed over to another acquirer from those that did not.

After this, we looked at the next entry in the dataset which actually is the next attempt and looked at the results. We only picked up those records where we had the success in this row. This set of records now represent the scenario that we explained in the previous chapter (failed Acquirer 1, Success Acquirer 2). In the test data set, we had a total of 2401 attempts that were successful and had Failover flag set to True. We calculated the predicted probability using all the models for the test data set and focussed on these records. Below is the visual representation of the predicted probability for failed and success attempts.


Figure 18: Predicted Probability of Successful Failover transactions

Since we are not interested in the actual result of the attempt but in which Acquirer to send the transaction to first, these results look very promising. With the exception of the SGD Classifier, all the models predicted a higher probability for the successful attempt than for the failed attempt. The Naïve Bayes algorithm has shown promising results on this dataset. The Random Forest and Decision Tree algorithms have also been able to do this to a significant degree.

Comparing the results on the individual transaction level:

Successful Acquirer - Probability Higher  Naïve Bayes  SGD   RF    Decision Tree
False                                     54           2398  37    270
True                                      2347         3     2364  2131

For the complete test data set, with 102639 records/attempts, this approach would have saved approx. 2347 attempts to less favourable Acquirers. This translates to approx. 2.28% of the total attempts. On a global level, with the transaction volumes processed by platforms like Global Online, these numbers can mean improvements in response time and benefits for merchants and the platform alike.
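The arithmetic behind these figures can be retraced with a few lines. The counts are taken from the table above and from the text; the variable names are ours.

```python
total_attempts = 102639      # attempts in the test data set
failover_successes = 2401    # successful attempts with the Failover flag set

# Cases where the successful Acquirer received the higher predicted probability.
higher_for_successful = {
    "Naive Bayes": 2347,
    "SGD": 3,
    "RF": 2364,
    "Decision Tree": 2131,
}

# Share of failover cases ordered correctly, per model.
share_correct = {model: n / failover_successes
                 for model, n in higher_for_successful.items()}

# Share of all test attempts that could have been avoided (Naive Bayes).
share_of_all_attempts = higher_for_successful["Naive Bayes"] / total_attempts
```

For Naïve Bayes, the second quantity reproduces the share of roughly 2.3% quoted above up to rounding, and the first shows that the model orders nearly all of the 2401 failover cases correctly.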


6.3 Value for Business

The business model of Global Online, or in general of any Payment Service Provider, is to provide ease of integration with multiple Acquirers around the globe. In the simplest terms, we receive input, transform it and output it to the Acquirer. This makes our system very efficient, but we are bound by the performance of our Acquirer partners.

Based on the results, we can see that by implementing such a solution, we can reduce the response time of the Global Online API. This in turn will add capacity to the platform on the current infrastructure (faster responses free up resources for additional flow).

This could also be sold as an additional Value Added Service, giving Global Online a new revenue stream. It could also be used in sales pitches to demonstrate superior capabilities compared to other Payment Service Providers.

Using the figures from the previous section, approx. 2.28% of the transactions on the Global Online platform will see an immediate improvement in response time, and a similar percentage reduction in the attempts to Acquirers can be observed.

Also, in some cases, Merchants (including Payment Service Providers) are charged for failed authorizations. In such cases, we can choose a more stable Acquirer where the success rate of the transaction is higher, which might even save the costs of these failures.

6.4 Chapter Conclusion

This chapter shows that good results can be obtained if the models used in this research are used to implement dynamic routing in the Global Online platform. The Naïve Bayes algorithm in particular provides good results and is also comparatively easy to implement. The algorithm can be further fine-tuned with more features that are available for an in-flight transaction.


7 Conclusion

7.1 Summary

This research started with understanding how we can improve the Global Online transaction processing platform and specifically the

7.2 Recommendation for Future Work

Though we have shown that machine learning models can be used for a Dynamic Routing implementation, some more work is still required before it is really production ready.

We first need to look at the all possible features that are available to us and not bound by the restrictions in the context of this research. There is only so much data that could be used in this research due to the compliance restrictions. The way forward would be to understand and test more features that could possibly impact the performance of an Acquirer. Adding more touch points will only improve the model and provide even better results.

The model works best when it is also learning all the time. To avoid the problem of local optima, we can choose ε-greedy or Thompson Sampling algorithms. These algorithms help in balancing exploration and exploitation: they choose the best-performing option most of the time, but also explore the suboptimal ones so that they are not ignored completely. Future research could look into optimizing epsilon for the most cost-effective solution.
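Both strategies can be sketched in a few lines when each Acquirer is treated as an arm with a running count of successes and trials. This is an illustrative sketch using only the Python standard library; the function names, the uniform Beta(1, 1) prior and the example counts are assumptions, not part of the research.

```python
import random


def epsilon_greedy_choice(success, trials, epsilon, rng=random):
    """Explore a random arm with probability epsilon, otherwise exploit the best one."""
    arms = list(trials)
    if rng.random() < epsilon:
        return rng.choice(arms)
    # Untried arms get an optimistic estimate of 1.0 so they are tried at least once.
    return max(arms, key=lambda a: success[a] / trials[a] if trials[a] else 1.0)


def thompson_choice(success, trials, rng=random):
    """Pick the arm with the highest draw from its Beta posterior.

    Assumes a uniform Beta(1, 1) prior, giving Beta(successes + 1, failures + 1).
    """
    draws = {a: rng.betavariate(success[a] + 1, trials[a] - success[a] + 1)
             for a in trials}
    return max(draws, key=draws.get)


# Toy counts per Acquirer (made up for illustration).
success = {"P1": 10, "P2": 30}
trials = {"P1": 100, "P2": 100}

greedy_pick = epsilon_greedy_choice(success, trials, epsilon=0.0)
sampled_pick = thompson_choice(success, trials, rng=random.Random(0))
```

With ε-greedy, the amount of exploration is fixed by epsilon, which is why its value would need tuning. Thompson Sampling explores less as the posteriors become narrower, without an explicit exploration parameter.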

We have not looked into the class weights for different categories, but it is commonly understood that some features are more important than other, more generic features. Future research can also focus on optimizing those. An analysis of class weights is also required for an actual implementation in a transaction processing platform.


References

[1] Global Online Services, Ingenico, https://epayments.developer-ingenico.com/best-practices/services/gateway-gateway-and-gateway-service

[2] How Credit Card Transaction Processing Works: Steps, Fees & Participants, Odysseas Papadimitriou, WalletHub CEO, https://wallethub.com/edu/credit-card-transaction/25511/, April 2009

[3] Credit Card Processing explained, https://www.mccs-wi.com/credit-card-processing-explained-and-mobile-readers-reviewed, Feb 2018

[4] E-Commerce and Credit Card Payments, Wikipedia, https://en.wikipedia.org/wiki/E-commerce_credit_card_payment_system

[5] DeGennaro, Ramon P., Credit Card Processing: A Look Inside the Black Box. Economic Review, Vol. 91, No. 1, pp. 27-42, 2006. Available at SSRN: https://ssrn.com/abstract=904027

[6] Shearer, C. (2000). The CRISP-DM model: the new blueprint for data mining. Journal of data warehousing, 5 (4), 13-22.


Appendix 1 – Feature Importance

One of the key outputs of a model is its feature importance. Below is a list of the important features per model.

Naïve Bayes

Naïve Bayes model seems to be relying heavily on the country codes.

SGD Classifier

The SGD Classifier model did not produce useful results, which is quite evident from the features it relied on, such as NA/Not Attempt/N.


Random Forest

CVV Indicator, Amount, Auth Type and AVS Indicator appear to be the most relevant features for the Random Forest model.

Decision Tree

AVS Indicator, CVV Indicator, US card issuers and Amount seem to be the most relevant features for this model.
