Search Engine Advertising: How to keep a constant Return on Investment?

Academic year: 2021

Search Engine Advertising: How to keep a constant Return on Investment?


Master’s Thesis Econometrics

Supervisor University of Groningen: dr. Evert de Haan

Second assessor University of Groningen: prof. dr. Tammo Bijmolt

Search Engine Advertising: How to keep a constant Return on Investment?

Abstract


Contents

1 Introduction
2 Literature research
  2.1 Metrics for examining the effects of SEA
  2.2 Bid calculations
3 Problem description
4 Analysis ROI
  4.1 First aggregation level: All data, excluding brand name
  4.2 Second aggregation level: specific brand account
5 Simulating the ROI with and without bid adjustments
  5.1 Simulating the ROI
  5.2 Analyzing the conversion rate
  5.3 Simulating the conversion rate
  5.4 Bid adjustment methods
6 Hierarchy
  6.1 Damping function
  6.2 Results bid adjustment method including hierarchy
7 Determining parameters in real-life
  7.1 Bisection algorithm
  7.2 Probabilistic Bisection Algorithm
8 Conclusion and discussion
A ARIMA models ROI
  A.1 Different models of the ROI, first aggregation level, calculated each time group
  A.2 Different models of the ROI of the specific brand calculated daily
B Mean reverted conversion rates
C Comparison acf1 equals zero with min(MAD)

1 Introduction

Search Engine Advertising (SEA) has become a multi-billion dollar business and one of the dominant forms of advertising on the Internet (Zenetti et al., 2014). It is a powerful tool to improve the visibility of advertisements (ads) in search engines and it creates the opportunity to target customers based on their search term (the words or phrases entered by a user in the search engine) and hence on their current interests (Rutz and Trusov, 2011). Stated differently, SEA allows advertisers to reach a targeted audience on a relatively low budget (Ghose and Yang, 2009). Yang and Ghose (2010) found that organic search results and paid search results have a positive interdependence on each other’s click-through rates due to the “second opinion effect”, indicating that SEA is a powerful tool even for companies that appear in the organic search results as well as in the paid search results.

Nowadays, most search engines use auction mechanisms to determine which ads appear and in which order. Advertisers select relevant keywords and bid on them; when the search term of a user matches a keyword, a real-time auction is conducted and the winners of the auction are displayed to the user (Varian, 2007). The ads are ranked based on a combination of characteristics such as bid amount, ad quality, and context of the search term. Ad quality is measured by a quality score, but search engine providers do not publish their exact algorithms for determining these scores (Abou Nabout and Skiera, 2012).

The Cost Per Click (CPC) is one of the most prevalent payment mechanisms in SEA; advertisers only pay for clicks on their ads (Varian, 2007). Using this payment mechanism, advertisers can obtain many impressions and improve their visibility in search engines without paying for impressions alone. In order to use this payment mechanism, advertisers need to submit bids on keywords representing the maximum amounts they are willing to pay for a click on an advertisement. The amount charged for a click is often smaller than the exact bid. For example, with Google Ads advertisers pay the minimum amount that is necessary to obtain their eligible rank (Google, 2020b). This implies that advertisers pay the bidding price of the next best advertiser plus 1 cent. This is approximately a generalized, second-price, sealed-bid auction.

Companies that use SEA want to determine bids that satisfy their marketing strategy. Various bidding methods and models proposed in the literature focus on profit maximization. For example, Skiera and Abou Nabout (2013) developed a fully automated bidding decision support system (PROSAD) that maximizes the advertiser’s profit per keyword. Furthermore, Adikari and Dutta (2019) developed the Auto Pricing Strategy (APS) model, which maximizes the number of conversions under the condition of staying within a certain budget. Thus, PROSAD maximizes profit directly and APS maximizes profit indirectly, by maximizing the number of conversions while staying below budget.

This thesis is written in cooperation with a Dutch online retailer, specialized in .
For privacy reasons the exact name of the company cannot be published, so we will refer to it as “the company” throughout this paper. The strategy of the company regarding SEA is to keep a constant Return on Investment (ROI) equal to a fixed target ROI (first part of the strategy) while . The ROI used in this thesis is defined as the profit received from the advertisements divided by the costs of the advertisements. This paper focuses on the first part of the strategy: keeping a constant ROI. The second part of the strategy is not the focus of this paper.

By stabilizing the ROI, market changes (e.g. changes in the behaviour of competitors or customers) are captured. The company observed that

Furthermore, internal research of the company indicated that

Combining these two observations gives rise to the strategy of the company: if the ROI is smaller than the target ROI, the profit compared to the cost is too low and the CPC bids should decrease. If the ROI is larger than the target ROI, more money can be spent on SEA to increase the metrics of their interest while still keeping a desirable profit-to-cost ratio.

The company currently uses a complex self-designed algorithm to determine the optimal bids. This algorithm uses a . Unfortunately, this algorithm can . Highly fluctuating ROIs are observed during, for instance, promotional periods. This is shown in Figure 1. These fluctuations are not captured by the current algorithm due . Therefore, the company aims to develop an extension of the current model that is able to change the bids based on recent data. This thesis focuses on developing such an extension.

Figure 1: Observed ROI of the company (daily calculated ROI, indexed with respect to the target ROI, plotted against the target ROI, July 2019 to April 2020).

Several steps are taken to develop the bid adjustment model extension. The first step is to get more insight into the data by analyzing the historical behaviour of the ROI together with internal sales data of the company and pricing data. ARIMA models are developed, and the fit of the models and their forecasting power are analyzed. The second step is to simulate time series of the ROI to examine the effect of different bid adjustment methods. This results in an optimal bid adjustment method which calculates bid adjustments based on the first lag of the ROI. This method includes one parameter, which needs to be estimated. Therefore, the next step is to develop an algorithm that can be used to determine the parameter for the bid adjustment method in practice.

The resulting optimal bid adjustment method is simple and adaptive. The simulations suggest that the mean absolute deviation between the obtained ROIs and the target ROI decreases by 36%. In the future, the company will test the bid adjustment method in practice.

The first part of this thesis, given in Section 2, explains the basic definitions regarding SEA and summarizes some key studies. Subsequently, Section 3 describes the problem addressed in this thesis. Section 4 analyzes historical data of the ROI and examines the fit of ARIMA models. Thereafter, Section 5 proposes a method to simulate the ROI with and without bid adjustments. Section 6 expands the bid adjustment method by including the hierarchical structure. Next, Section 7 describes algorithms that can be used to quickly determine good parameters for the bid adjustment methods in real-life. The thesis ends with a conclusion and discussion in Section 8.

2 Literature research

Search engine advertising (SEA) is a branch of online marketing, which focuses on paid advertising on search engines. Ghose and Yang (2009) defined SEA as the phenomenon where advertisers pay a fee to search engines to be displayed alongside organic search results. When a user of the search engine


For advertisers, it is especially of interest how to select relevant keywords, how to examine and measure the effects of SEA, and how to determine the bids. Much research has been conducted over the last twenty years regarding these subjects. For example, Lu and Zhao (2014) examined how specific keywords and general keywords influence the direct sales of the advertised products and the indirect sales of other products. Their results suggest that specific keywords improve the direct sales of the advertised products, while general keywords improve the indirect sales of other products. Therefore, they suggest that advertisers should use different types of keywords, depending on the preferences of the company.

2.1 Metrics for examining the effects of SEA

The appearance of a paid advertisement on a user’s screen is called an impression. For example, if a user of a search engine searches for “ ” and an advertisement appears on their screen, it is counted as an impression. The Impression Share (IS) is defined as the number of obtained impressions divided by the number of impressions that the ads were able to receive. The calculation of the total number of impressions that the ads were able to receive depends on many factors. The IS is a commonly examined performance metric.

When an advertisement has an IS approaching one (or 100%), increasing the bids does not result in more impressions if the behaviour of the competitors stays the same. If an advertisement has an IS approaching zero (or 0%), it is evident that the bid price or the ad quality is too low to be effective. Zenetti et al. (2014) found indirect positive effects of SEA on advertising and brand awareness even among consumers who did not click on the advertisement. Therefore, companies that prioritize increasing brand awareness might want to focus on maximizing the number of impressions and the impression share.

Another common metric to evaluate the effects of SEA is the Click-Through Rate (CTR), calculated as the number of clicks divided by the number of impressions, where a click is defined as a click on an advertisement. Agarwal et al. (2011) and Ghose and Yang (2009) found that the CTR decreases with ad rank, meaning that better ad ranks lead to higher CTRs and vice versa. High CTRs are in general preferred over low CTRs, because the CTRs are included by the search engines when calculating the ad quality; higher CTRs lead to higher ad quality scores (Ghose and Yang, 2009; Google, 2020d). The ad quality, measured by a quality score, is an important performance metric, because higher quality scores can lead to lower advertisement costs and better ad positions (Google, 2020d). Unfortunately, search engine providers do not publish their exact algorithms for determining the quality scores (Abou Nabout and Skiera, 2012).

An additional frequently analyzed metric is called the Conversion Rate (CR), calculated as the number of conversions divided by the number of clicks, where the definition of a conversion depends on the goal of the company. A conversion could for example be a purchase, a download, or a phone call to the company. Agarwal et al. (2011) found that the CR is higher for more specific keywords. They also found that better ad ranks lead to lower CRs and vice versa. Interestingly, this result is not in line with the results of Ghose and Yang (2009) as they found the opposite relationship between the CR and ad rank.
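The three metrics above are simple ratios of raw counts; a minimal sketch (the helper names and all numbers are hypothetical):

```python
def impression_share(impressions: int, eligible_impressions: int) -> float:
    """IS: obtained impressions / impressions the ads were able to receive."""
    return impressions / eligible_impressions

def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR: clicks / impressions."""
    return clicks / impressions

def conversion_rate(conversions: int, clicks: int) -> float:
    """CR: conversions / clicks; what counts as a conversion is company-specific."""
    return conversions / clicks

# Hypothetical counts for one keyword
print(impression_share(8_000, 10_000))  # 0.8
print(click_through_rate(400, 8_000))   # 0.05
print(conversion_rate(20, 400))         # 0.05
```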

Calculating the conversion rate of an advertisement depends on the attribution system for allocating the number of conversions. The most straightforward attribution system is last-click (or last-touch) attribution, where the advertisement on which a user last clicked before making a conversion obtains full credit for the conversion (Berman, 2018). However, this attribution system is not always the most accurate. For example, Berman (2018) states that last-click attribution often overincentivizes ad exposures. Also, this attribution system ignores the impact of other advertisements that may have contributed to the sales (Kireyev et al., 2016).

Various other attribution systems have been developed over the years, for example multitouch attribution systems that give credit to multiple advertisements. Within multitouch attribution, many different methods have been created. For example, if a consumer clicked on multiple advertisements before making a conversion, all previously clicked advertisements can be assigned an equal share of the conversion. More complicated multitouch attribution systems have also been designed; for example, Berman (2018) proposed a multi-channel attribution system based on Shapley values. This way, the conversion shares assigned to the previously clicked advertisements differ from each other.
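The contrast between last-click and equal-share multitouch attribution can be sketched as follows (ad names and click paths are hypothetical):

```python
from collections import defaultdict

def equal_share_attribution(paths):
    """Assign each conversion in equal shares to all ads clicked on the path.
    `paths` is a list of click paths, one per converting customer,
    e.g. ["ad_a", "ad_b"] means the customer clicked ad_a and then ad_b."""
    credit = defaultdict(float)
    for path in paths:
        share = 1.0 / len(path)
        for ad in path:
            credit[ad] += share
    return dict(credit)

def last_click_attribution(paths):
    """Give the full conversion credit to the last-clicked ad."""
    credit = defaultdict(float)
    for path in paths:
        credit[path[-1]] += 1.0
    return dict(credit)

paths = [["ad_a", "ad_b"], ["ad_b"]]
print(equal_share_attribution(paths))  # {'ad_a': 0.5, 'ad_b': 1.5}
print(last_click_attribution(paths))   # {'ad_b': 2.0}
```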

Next to the attribution system, another difficulty arises in measuring the effects of SEA. Blake et al. (2015) showed that returns from SEA are a fraction of conventional returns. They used an experimental design in which a random subset of users saw ads while a control group did not. Measuring the effects of SEA without incorporating this substitution effect could therefore overestimate the effect of SEA. On the other hand, Yang and Ghose (2010) found that organic search results and paid search results have a positive interdependence on each other’s click-through rates due to the “second opinion effect”, which contradicts the findings of Blake et al. (2015).

2.2 Bid calculations

Nowadays, the Cost-Per-Click (CPC) is one of the most prevalent payment mechanisms. Using this payment mechanism, advertisers pay for each click on their advertisements and not for impressions alone. Advertisers need to submit bids on keywords representing the maximum amounts they are willing to spend for a click on an advertisement. The amount charged for a click is often smaller than the exact bid. For example, with Google Ads advertisers pay the minimum amount that is necessary to obtain their eligible rank (Google, 2020b). This implies that advertisers pay the bidding price of the next best advertiser plus 1 cent. This is approximately a generalized, second-price, sealed-bid auction.

Moreover, search engines explicitly ask advertisers to specify spending limits (Shin, 2015). Shin (2015) analyzed the generalized second-price auction with budget constraints in SEA and found that the budget constraint may induce advertisers to increase their bids as much as possible, to accelerate the elimination of a budget-constrained competitor and to reduce their own advertising costs. However, as Shin (2015) also points out, most studies neglect the budget constraint in SEA. This is not a big problem when advertisers have strategies that do not involve an absolute budget.

Advertisers can have various strategies: for example, maximizing profit, the CTR, the CR, the IS, the absolute number of clicks, the absolute number of conversions, or the absolute number of impressions. Advertisers can also set the same goal as the company in this paper: stabilizing the ROI. Including an additional budget constraint can be part of the strategy as well. Start-ups might have less money and need to set a budget constraint, while big advertisers might only consider profit. Furthermore, start-ups might focus more on gaining brand awareness and hence set strategies corresponding to maximizing the number of impressions (Zenetti et al., 2014), while advertisers with more brand awareness might focus more on, for instance, profit maximization.

Although much research has been conducted regarding SEA, little literature exists on actual bidding models, methods, or algorithms. Skiera and Abou Nabout (2013) did develop a fully automated bidding decision support system (PROSAD) for SEA. This system maximizes the advertiser’s profit per keyword, but it is based on last-click attribution and bid budgets are neglected. Furthermore, Adikari and Dutta (2019) developed the Auto Pricing Strategy (APS) model, which maximizes the number of conversions under the condition of staying within a certain budget. The authors do not mention the attribution system used. As far as we are aware, there are no bidding models, methods, or algorithms yet that stabilize the ROI.

In Google Ads, advertisers can use bid multipliers to adjust the bids for specific properties such as date and time or demographics (Google, 2020e). If the CPC bid of a company is usually €0.50 and a bid multiplier of 1.4 is applied, then the CPC bid becomes €0.50 × 1.4 = €0.70. If instead a bid multiplier of 0.6 is applied, then the CPC bid becomes €0.50 × 0.6 = €0.30. So, the CPC bid increases with a multiplier larger than one and decreases with a multiplier smaller than one.
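The multiplier arithmetic is a single multiplication; a minimal sketch of the example above (amounts in euros, function name illustrative):

```python
def apply_multiplier(cpc_bid: float, multiplier: float) -> float:
    """A multiplier larger than one raises the CPC bid; smaller than one lowers it."""
    return cpc_bid * multiplier

print(apply_multiplier(0.50, 1.4))  # 0.7
print(apply_multiplier(0.50, 0.6))  # 0.3
```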


Figure 5: (a) ACF and (b) PACF of the ROI, all accounts, calculated daily.

The company wants the opportunity to change the bids as frequently as possible, preferably hourly. Therefore, a finer time unit than daily needs to be considered. Figure 4b shows the time series of the ROI calculated on an hourly basis. Since the minimum and maximum ROI are not even close to the minimum and maximum ROI calculated at a daily level, we can conclude that these ROIs are not good representations of medium-term ROIs. This is primarily due to the low number of clicks (and hence low costs) and the low number of conversions (and hence low profit), giving individual sales a high weight when calculating the ROI. Therefore, we conclude that in the case of the company, we need to calculate the ROI at a coarser time unit than hourly.

To deal with the problem of few observations during the night, time groups can be made based on the average number of sales. Four different time groups are formed such that all time groups obtain approximately the same percentage of sales. The first, second, third, and fourth time group are from respectively. The time series of the ROI calculated per time group is given in Figure 4c. These ROIs are much more realistic than the ROIs calculated on an hourly basis, and therefore we continue using these time groups.

In order to find a model representing the ROI sufficiently over time, we again look at the acf and pacf plots, given in Figure 6. Because the ROI is calculated per time group (four per day), we observe a repeating significant acf at lags 4, 8, 12, 16, and 20, as well as a significant pacf at lag 4.

This indicates that
The correlations between the first, second, third, and fourth time group and the ROI are −0.180, −0.274, −0.027, and 0.484, respectively. These are Pearson correlations, and therefore Pearson’s correlation table can be used to conclude whether the correlation coefficients are significant (Best and Roberts, 1975). Based on the values in this table, it can be concluded that all correlations are significant at a 5% level, except for the correlation between the ROI and the third time group.

Therefore, we expect that an OLS model with dummy variables for the time groups as regressors and ARIMA errors (Hyndman and Athanasopoulos, 2018) might be a good fit for the data. The seasonality effects with respect to the time of the day are captured by including the time group dummy variables.
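As an illustration of fitting a regression with autocorrelated errors, the sketch below uses synthetic data (the real ROI series is confidential; the group effects are loosely inspired by the correlations above) and the classical Cochrane-Orcutt iteration, a simpler error structure (AR(1)) than the full ARIMA errors used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400  # synthetic observations, four time groups per day

# Hypothetical data standing in for the confidential ROI series:
# a time-group effect plus AR(1) errors.
group = np.tile([0, 1, 2, 3], n // 4)
group_effect = np.array([-0.18, -0.27, 0.0, 0.48])[group]
errors = np.zeros(n)
for t in range(1, n):
    errors[t] = 0.5 * errors[t - 1] + rng.normal(scale=0.1)
roi = 1.0 + group_effect + errors

# Design matrix: intercept plus dummies for time groups 2-4 (group 1 is baseline)
X = np.column_stack([np.ones(n)] + [(group == g).astype(float) for g in (1, 2, 3)])

# Cochrane-Orcutt: estimate by OLS, estimate the AR(1) coefficient of the
# residuals, quasi-difference, and re-estimate until rho stabilizes.
rho = 0.0
for _ in range(20):
    ys = roi[1:] - rho * roi[:-1]
    Xs = X[1:] - rho * X[:-1]
    beta = np.linalg.lstsq(Xs, ys, rcond=None)[0]
    resid = roi - X @ beta
    rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
    if abs(rho_new - rho) < 1e-8:
        break
    rho = rho_new

print(rho, beta)
```

The quasi-differencing step removes the serial correlation before the dummy coefficients are re-estimated; the thesis instead fits full ARIMA error structures following Hyndman and Athanasopoulos (2018).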

Figure 6: (a) ACF and (b) PACF of the ROI, all accounts, calculated per time group.

Furthermore, data about the total number of business and non-business sales (including sales without a click on an advertisement) of the company is available. The correlations between the ROI calculated per time group and the total number of business sales, non-business sales, and business plus non-business sales are −0.237, −0.031, and −0.167, respectively. These correlations are significant at a 5% level, except for the correlation with non-business sales, which is somewhat surprising.

For forecasting the ROI, the number of sales in the same period cannot be used as this information is not available in advance in real-life. Historical data can be used for forecasting the ROI, so consider the first lag of the sales. The correlations between the ROI calculated each time group and the lag


Figure 8: (a) ACF and (b) PACF of the ROI, brand-specific account, calculated daily.

Again we are interested in the correlation of the ROI with different variables. The prices of competitors are available each hour, so the correlation between the ROI and some price index can be calculated. The price index used in this paper is calculated as follows.

Afterwards, the price index is calculated as

The of the brand of interest are exclusively considered in calculating the price index.

The correlation between the ROI of the specific brand account and the price index is insignificant. However, the correlation between the ROI and the price index with respect to the second competitor is significant (correlation equals −0.25). The negative correlation indicates that the ROI of the company goes up if the prices of the company are on average cheaper than the prices of the second competitor, which is as expected. The insignificant first-competitor price index could be because the first competitor is not always a shop where customers want to buy, while the second competitor might be seen as a more reliable online shop and therefore be preferred by customers. However, these are just speculations and the correlations might differ for customers of other brands.

The correlation between the ROI and the first lag of the price index of the second competitor equals −0.23 and is still significant. Other lagged variables that show significant correlations with the ROI of the specific brand are the ROI of all accounts together (correlation equals 0.300) and the total non-business sales of the specific brand (correlation equals 0.180). Table 2 gives the AIC, BIC, and MAE of different fitted models. All models, including coefficients, are given in Appendix A.2. The first model is an ARIMA model obtained using the R function auto.arima (Hyndman et al., 2007), which returns the ARIMA model with the lowest AIC value and no external regressors. The second model is an ARIMA model based on the acf and pacf plots. The other models are OLS models with ARIMA errors, where the included regressors are based on the significant correlations as described in the previous paragraphs. The best fitting model is the model with the lowest AIC, BIC (Kuha, 2004), and MAE, and this model is given by

ROI_t = −0.58 PI_{2,t−1} − 2.12 ROI_{all,t−1} + ε̂_t,

where ε̂_t = 37.8 − 0.41 ε̂_{t−1} + 0.46 ε̂_{t−2} + 0.83 û_{t−1} + û_t, PI_{2,t} denotes the price index of the second competitor in period t, and ROI_{all,t} denotes the lag of the ROI of all accounts together in period t.

Table 2: Different models of the ROI of the specific brand calculated daily

Model                                                                   AIC   BIC   MAE
ARIMA(2,0,1) with non-zero mean                                         1225  1239  26.77
ARIMA(1,0,2) with non-zero mean                                         1229  1243  27.19
OLS with ARIMA(2,0,1) errors, regressors: L(PI_2), L(ROI_all), L(NB_b)  1181  1203  26.40
OLS with ARIMA(2,0,1) errors, regressors: L(PI_2), L(ROI_all)           1179  1198  26.40

L(PI_2) denotes the lag of the price index of the second competitor, L(ROI_all) the lag of the ROI of all accounts together, and L(NB_b) the lag of non-business sales of the specific brand.

If the bids are adjusted such that the difference between the one-step-ahead forecast of the ROI and the target ROI equals 0, then the MAD would equal the MAE. The MAD of the ROI was historically on average during this period, so it would approximately be reduced by %. This reduction is low compared to the % reduction obtained with the first aggregation level. This is not surprising, as the relative noise increases when reducing the number of observations.

Furthermore, it is interesting to note that including external regressors slightly decreases the MAE, but it is not a big change. This suggests that it is not necessary to include these variables in the final bid adjustment method. We will continue using the first aggregation level. The next section explains a method to simulate the ROI and examines the effect of different bid adjustment methods.

5 Simulating the ROI with and without bid adjustments

It is important to test a bid adjustment method before applying it in practice. Historical data can be used to test various bid adjustment methods. However, limited data is available to simulate longer time periods of the ROI under constant bids. The conversion rate can be used to estimate the ROI under constant bids, because


Therefore, this thesis uses the conversion rate as a proxy that indicates the market under constant bids. The relationship between the ROI and the conversion rate will be explained in the following subsection.

Furthermore, to get a broader picture of the effects of different bid adjustment methods, this thesis simulates different paths of the conversion rate. This way, the bid adjustment methods can be tested on various time series. This is useful because historical data is not guaranteed to reflect future data. By testing the bid adjustment method on numerous time series, it is more likely to fit future data well.

5.1 Simulating the ROI

The ROI in percentages can be approximately calculated as

ROI_t = 100 × (Profit of sales of accounts of interest in t − Total SEA costs of accounts of interest in t) / (Total SEA costs of accounts of interest in t)
      = 100 × (P × CR_t × C_t − CPC_t × C_t) / (CPC_t × C_t)
      = 100 × (P × CR_t − CPC_t) / CPC_t,    (1)

where P denotes a constant representing the profit margin, CR_t denotes the conversion rate in period t, and C_t denotes the number of clicks in period t.

The company uses multiplicative bidding, with bid multipliers defined as

M_t = CPC^bid_t / CPC^bid_{t−1},

where M_t denotes the bid multiplier in period t and CPC^bid_t denotes the CPC bid in period t. This way, all bid types can be multiplied with the multiplier and no exact bids have to be calculated. This assumption greatly decreases the complexity of the bidding problem.

For simplicity, assume that CPC^bid_t = c × CPC_t for some constant c for all t. The company has corroborative evidence that this holds on average, as long as the IS is not approaching one. Thus,

M_t = CPC_t / CPC_{t−1}, and hence CPC_t = M_t × CPC_{t−1}.    (2)

If the bid multiplier changes, and hence the average CPC changes, the new ROI can be calculated using Equation 1, with CPC_t calculated as in Equation 2.
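Equations 1 and 2 can be combined into a small sketch (the profit margin, conversion rate, and CPC values are hypothetical; note that Equation 1 keeps the conversion rate fixed when the CPC changes):

```python
def roi_percent(profit_margin: float, conversion_rate: float, cpc: float) -> float:
    """Equation (1): ROI_t = 100 * (P * CR_t - CPC_t) / CPC_t."""
    return 100.0 * (profit_margin * conversion_rate - cpc) / cpc

def next_cpc(prev_cpc: float, multiplier: float) -> float:
    """Equation (2): CPC_t = M_t * CPC_{t-1}."""
    return multiplier * prev_cpc

P, cr, cpc = 40.0, 0.05, 1.00   # profit per conversion, conversion rate, avg CPC
print(roi_percent(P, cr, cpc))  # 100.0: profit per click (2.00) is twice the cost
cpc = next_cpc(cpc, 1.25)       # raise the bids by 25%
print(roi_percent(P, cr, cpc))  # 60.0: same conversion rate, higher cost per click
```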

The next subsection analyzes the historical conversion rate and shows how the conversion rate can be simulated.

5.2 Analyzing the conversion rate

Data from 05-01-2019 until 04-30-2020 is used to analyze the conversion rate. This data is based on the first aggregation level, so all data together excluding the brand name account. Figure 9a shows the conversion rate over time, where one period equals one hour. Figure 9b shows the conversion rate over time, where one period equals one day. The standard deviation of the conversion rate calculated per hour is considerably higher than that of the conversion rate calculated per day, because the number of clicks and the number of sales highly depend on the hour of the day. For instance, almost no orders are placed during the night, while many orders are placed during the afternoon. This leads to few observations during night hours, causing the high standard deviation. Due to the unequal number of observations, it is not desirable to compare the conversion rate of one hour in the middle of the night with the conversion rate of one hour in the afternoon.

On the other hand, it is also not desirable to calculate the conversion rate once a day, because high peaks or low troughs can be noticed too late. For instance, during Black Friday (11-29-2019) a high peak is visible, but this peak consists of only a single point when calculating the conversion rate once a day. When this peak is observed after collecting the data, the peak is already over and the opportunity to change the bids has passed. For this specific day, it would be better to calculate the conversion rate more often.

To deal with both problems, periods can be based on a specific number of sales instead of specific time intervals. This way, each calculated conversion rate consists of approximately the same number of sales, and the rates can therefore be compared to each other. Also, the number of sales can be chosen such that there are sufficient observations to draw conclusions. Moreover, this method is adaptive, meaning that changes in the number of sales of the company are directly reflected in the period length. However, since Google Ads data is only reported each hour (and not each minute or second), it is not possible to match the number of clicks in a period to the number of sales if the period does not consist of whole hours. Therefore, the sales periods are rounded up to whole hours: a period ends at the first full hour in which there are at least x sales, where x is determined as follows.

Multiple values of x are considered, and the corresponding autocorrelations at lag one (acf1) are compared to conclude which period length is the most appropriate. The smallest number of sales such that the autocorrelation of the conversion rate was still sufficiently large is chosen. The exact number of sales is not documented because this is confidential information.
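The construction of sales-based periods rounded up to whole hours can be sketched as follows (the hourly sales counts and the threshold x are hypothetical; the real x is confidential):

```python
def sales_periods(hourly_sales, x):
    """Split a sequence of hourly sales counts into consecutive periods of
    whole hours, where each period ends at the first hour in which the
    cumulative number of sales reaches at least x."""
    periods, current, total = [], [], 0
    for hour, sales in enumerate(hourly_sales):
        current.append(hour)
        total += sales
        if total >= x:
            periods.append(current)
            current, total = [], 0
    return periods  # a trailing partial period, if any, is discarded

# Hypothetical hourly sales; x = 10 sales per period
print(sales_periods([1, 0, 2, 8, 5, 6, 12, 3], 10))
# [[0, 1, 2, 3], [4, 5], [6]]
```

Note that the periods have unequal lengths in hours (quiet night hours are merged into long periods) but roughly equal numbers of sales, which is exactly the adaptivity described above.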

As we are interested in the conversion rate relative to the previous conversion rate, we look at log-returns, i.e.

lr_t = ln(CR_t / CR_{t−1}),

where lr_t denotes the log-return in period t and CR_t denotes the conversion rate in period t. The acf and pacf plots of the conversion rate with periods of approximately x sales can be found in Figure 10.

Figure 10: (a) ACF and (b) PACF plots of the log-returns of the conversion rate with periods of approximately x sales.

The ARIMA(p, d, q) model with the lowest AIC value is the following ARIMA(1, 0, 1) model with zero mean:

lr_t = −0.9427 lr_{t−1} + 0.5660 ε̂_{t−1} + ε̂_t.    (3)

After fitting the ARIMA model to the log-returns, the log-returns should be transformed back to conversion rates. This works as follows:

lr_t = ln(CR_t / CR_{t−1})  ⟺  CR_t = exp(lr_t) × CR_{t−1} = exp(lr_t) × exp(lr_{t−1}) × · · · × exp(lr_1) × CR_0,

where CR_0 is the first observed conversion rate.
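The back-transformation is a cumulative sum of the log-returns followed by exponentiation; a minimal sketch with hypothetical conversion rates (the helper name is illustrative):

```python
import numpy as np

def log_returns_to_cr(log_returns, cr0):
    """Invert lr_t = ln(CR_t / CR_{t-1}):
    CR_t = exp(lr_t) * ... * exp(lr_1) * CR_0 = exp(cumsum(lr)) * CR_0."""
    return cr0 * np.exp(np.cumsum(log_returns))

cr = np.array([0.04, 0.05, 0.045])        # hypothetical conversion rates
lr = np.log(cr[1:] / cr[:-1])             # log-returns of the series
recovered = log_returns_to_cr(lr, cr[0])  # reproduces cr[1:] up to rounding
print(recovered)
```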

5.3 Simulating the conversion rate

The ARIMA model from Equation 3 can be used to simulate the conversion rate. The error terms of the simulated time series are sampled with replacement from the residuals of the original ARIMA model, and the starting value (CR_0) is set equal to the mean of the original conversion rate time series.

Unfortunately, this method does not exclusively lead to realistic patterns. For example, Figure 11a displays a simulated time series for the conversion rate using the ARIMA model and setting CR_0 equal to the mean of the observed conversion rates. The conversion rate keeps decreasing and its standard deviation gets smaller. This is not a desired time series, since it is not a realistic representation of actual conversion rates. To prevent the mean of the simulated conversion rates from drifting up or down over time, a mean reverting term can be added to force the simulated time series to have a more or less constant mean over time.

The time series of the conversion rate including the mean reverting terms is simulated as

CR_t = exp(lr_t / μ_t) × exp(lr_{t−1} / μ_{t−1}) × ⋯ × exp(lr_1 / μ_1) × CR_0,

where μ_t denotes the rolling mean of the previous 50 estimated log-returns. Including mean reverting terms is realistic in the case of the company, since its current model keeps the ROI constant in the long run.

Figure 11b shows the same time series as Figure 11a but this time series includes the mean reverting term. Appendix B shows some additional mean reverted time series of the conversion rate.


Figure 11: Simulated conversion rates with and without mean reverting term

Figure 12 shows the average autocorrelation function and partial autocorrelation function of 10,000 simulated paths of the conversion rate, with length 10,000. Note that the acf1 is large.


Figure 12: Average ACF and PACF plots of the conversion rate with periods of approximately x sales. 10,000 paths of length 10,000 are simulated.


Figure 13a displays a histogram of the MAD of 10,000 simulated ROI paths (of length 1000) when no bid adjustments take place, so M_t = 1 for all t. Most simulations obtain a MAD between 20 and 30. Figure 13b displays the corresponding autocorrelation at lag 1. This correlation is between 0.59 and 0.68 for all simulations.


Figure 13: Histograms of the MAD and autocorrelation at lag 1 of 10,000 simulated ROI paths of length 1000

With the simulated conversion rates, the simulated ROIs can be calculated as explained in Section 5. A baseline CPC_0 and P are needed in order to start the simulation. For P, some fixed constant is used. CPC_0 is calculated by setting the mean of the next 10 observed ROIs equal to the target ROI. The absolute values of P and CPC_0 are not relevant, since multiplicative bidding is applied and only the relative difference is of interest.

5.4 Bid adjustment methods

This thesis compares the results of three different bid adjustment methods. Figures 5, 6, and 8 showed the acf and pacf plots of the ROI for different time periods and different aggregation levels. All these figures show a high peak at lag one. Figure 14 shows the acf and pacf plots of the ROI for periods of x sales; again there is a high peak at lag one, while the pacf is insignificant for lags higher than one. Note that the acf1 is approximately the same for the ROI calculated daily as for the ROI calculated per period of x sales. Therefore, the first bid adjustment method only includes the first lag of the ROI to calculate bid multipliers. This bid adjustment method is defined as

1. M_t = 1 + p_1(ROI_{t−1} − ROI_target),

where p_1 denotes a yet to be determined parameter.


Figure 14: ACF and PACF plots of the ROI, all accounts, for periods of x sales

The second analyzed bid adjustment method looks similar, but it is slightly more elaborate as it penalizes negative and positive deviations from the target ROI differently. This bid adjustment method is given by

2. M_t = 1 + p_2(ROI_{t−1} − ROI_target) if ROI_{t−1} ≤ ROI_target,
   M_t = 1 + p_3(ROI_{t−1} − ROI_target) if ROI_{t−1} > ROI_target,

where p_2 and p_3 denote yet to be determined parameters.

The last bid adjustment method shown in this thesis uses one-step-ahead forecasting to calculate bid multipliers. This method is given by

3. M_t = ROI_forecasted / ROI_target.

More elaborate bid adjustment methods were also analyzed during the research, for instance non-linear methods, methods including more lags, and methods including more variables. However, these performed worse than the simple methods shown above and are therefore excluded from this paper.

The goal of the company is to minimize the difference between the ROI and the target ROI. Therefore, the best bid adjustment method is assumed to be the method resulting in the smallest MAD with respect to the target ROI. For the first and the second bid adjustment method, simulations are performed to test which bid adjustment method yields the smallest MAD on average. In order to perform the simulations, the same steps are undertaken as described in the previous section. First the conversion rate is simulated for 10,000 periods and thereafter the ROIs are calculated using the simulated conversion rates.

Next, for bid adjustment methods 1 and 2, the optimal parameters are calculated for each specific time series by minimizing the corresponding MAD, using the Brent (Brent, 1971) optimization method for method 1 and Nelder-Mead (Nelder and Mead, 1965) for method 2. The optimization is repeated for 10,000 different paths (all with 10,000 periods). For each path, the optimal parameters are returned as well as the MAD of the ROI for those parameters. Table 3 displays descriptive statistics of the MADs over the 10,000 paths under the optimal parameters for methods 1 and 2.
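The parameter search can be sketched as below. The ROI simulator is a deliberately simple stand-in (deterministic mean reversion plus the method-1 multiplier), not the thesis's full simulation, and a grid search stands in for the Brent and Nelder-Mead routines (in practice available via e.g. `scipy.optimize`).

```python
def mad(xs, target):
    """Mean absolute deviation of a path from the target ROI."""
    return sum(abs(x - target) for x in xs) / len(xs)

def simulate_roi(p1, roi0=60.0, target=50.0, n=500):
    """Toy ROI dynamics: the ROI mean-reverts towards the target, and the
    method-1 multiplier 1 + p1*(ROI_{t-1} - target) raises the bids after an
    overshoot, which dilutes the next ROI."""
    path, roi = [], roi0
    for _ in range(n):
        m = 1 + p1 * (roi - target)                 # bid adjustment method 1
        roi = (target + 0.65 * (roi - target)) / m  # higher bids lower the ROI
        path.append(roi)
    return path

grid = [i * 0.0005 for i in range(21)]              # candidate p1 values 0..0.01
best = min(grid, key=lambda p: mad(simulate_roi(p), 50.0))
```

Under these toy dynamics, any positive parameter pulls the ROI back to the target faster than no adjustment at all, which is the effect the optimization exploits.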

The simulations indicate that the MAD decreases on average by 24% when the optimal parameter setting is used. Furthermore, we observe slightly better MADs for the second bid adjustment method


Table 4: Comparison of MAD from bid adjustment methods 1 and 3

                          Mean(MAD)  Median(MAD)  Min(MAD)  Max(MAD)  sd(MAD)
Bid adjustment method 1   18.60      18.60        17.42     19.66     0.50
Bid adjustment method 3   18.52      18.52        17.13     19.92     0.49

100 simulations of length 1000

Figure 15: Simulated ROI indexed w.r.t. the target ROI, (a) without bid adjustments (MAD = 23.8) and (b) with bid adjustments (MAD = 18.2)

Since for future data it is not possible to obtain the best parameter using the Brent or Nelder-Mead optimization methods, it is interesting to observe how volatile the MAD is with respect to its parameter. To test this, 10,000 simulations of length 10,000 are performed, and for each simulation many different parameters are used to calculate the bid multipliers. Figure 21a shows the means (the dots) and the standard deviations (the lines) of the MAD over the 10,000 simulations. For parameters between 0.0025 and 0.00625, the mean MAD is within approximately 0.5 of the minimum MAD. This is a relatively small difference, and we therefore conclude that all parameters between 0.0025 and 0.00625 are sufficient for the company.

The first bid adjustment method depends exclusively on the previous ROI. Therefore, if all available information in the first autocorrelation term is used, the autocorrelation at lag 1 of the simulated ROIs equals zero. We would expect the parameter corresponding to the smallest MAD to be approximately the same as the parameter at which the autocorrelation at lag 1 equals zero. However, it need not be the exact same point, because the underlying simulation also contains moving average parts, as shown in Figure 12. The conversion rate in the simulations does not follow an ARIMA(1,0,1) model exactly, due to the permutations of the log-returns and the mean reverting term. Also, in reality it is difficult to say what the underlying model will be after changing the bids, as this changes the behaviour of the ROI.

However, if the difference between the MAD at the parameter where the autocorrelation at lag 1 is zero and the MAD at the parameter where the MAD is minimized is negligible, then it is easier to search for the point where the autocorrelation at lag 1 is approximately zero than for the point where the MAD is minimized. Table 5 displays descriptive statistics of the MADs over 10,000 paths with parameters optimized such that the absolute value of the first autocorrelation is minimized (so acf1 = 0). Compared to the results in Table 3, the MADs are slightly larger, but this difference is negligible in relative terms for the company.


time period. Calculating the ROI based on a small number of clicks is not representative for the medium term ROI as observed in Section 4.1. Therefore, this section proposes an algorithm that calculates the multiplier on the overarching level first and thereafter checks if enough statistics exists for varying the multiplier on lower levels.

The developed algorithm is defined as follows.

1. Calculate the multiplier on the overarching level as

M_t = 1 + p_1(ROI_{t−1} − ROI_target).

2. Calculate for each bid object type the corresponding multiplier as

M_{t,b} = 1 + D(C_{b,t−1}, CR_b) × p_1(ROI_{t−1,b} − ROI_target),

where ROI_{t−1,b} denotes the ROI of bid object type b in period t − 1 and D denotes a yet to be explained damping function of C_{b,t−1} (the number of clicks on bid object type b in period t − 1) and CR_b (the average conversion rate of bid object type b over the last 30 days).

3. Scale the multipliers on bid object type level such that the click-weighted average of the multipliers equals the multiplier on the overarching level. So multiply all M_{t,b} with

M_t Σ_{b=1}^{B} C_{b,t−1} / (Σ_{b=1}^{B} C_{b,t−1} M_{t,b}),

where C_{b,t−1} denotes the number of clicks on bid object type b in period t − 1 and B denotes the total number of bid object types.

4. Calculate for each bid object the corresponding multiplier as

M_{t,b,o} = 1 + D(C_{b,o,t−1}, CR_{b,o}) × p_1(ROI_{t−1,b,o} − ROI_target),

where ROI_{t−1,b,o} denotes the ROI of bid object o inside bid object type b in period t − 1.

5. Scale the multipliers on bid object level such that the click-weighted average of the multipliers equals the multiplier of the corresponding bid object type. So for each b separately, multiply all M_{t,b,o} with

M_{t,b} Σ_{o=1}^{O_b} C_{b,o,t−1} / (Σ_{o=1}^{O_b} C_{b,o,t−1} M_{t,b,o}),

where C_{b,o,t−1} denotes the number of clicks on bid object o inside bid object type b in period t − 1 and O_b denotes the total number of bid objects in bid object type b.

It is important to note that the autocorrelation structure is assumed to be equal across bid objects, bid object types, and the overarching level. Future work can analyze whether this is realistic and whether it can be improved. Furthermore, scaling the multipliers on bid object and bid object type level is not the neatest way to deal with the hierarchy, because of the compensation afterwards. However, it appears to work sufficiently well and it is an intuitive and simple method to deal with the hierarchy.
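Steps 2 and 3 (and, analogously, steps 4 and 5) can be sketched as follows. The ROIs, click counts, and damping values in the example are made up.

```python
def scale_to_parent(parent_m, multipliers, clicks):
    """Rescale child multipliers so their click-weighted average equals the
    parent multiplier (step 3 for types, step 5 for objects)."""
    weighted = sum(c * m for c, m in zip(clicks, multipliers))
    factor = parent_m * sum(clicks) / weighted
    return [m * factor for m in multipliers]

def type_multipliers(parent_m, rois, clicks, damping, roi_target, p1):
    # Step 2: damped, per-type version of M_t = 1 + p1*(ROI - target)...
    raw = [1 + d * p1 * (roi - roi_target) for d, roi in zip(damping, rois)]
    # ...then step 3: scale so the click-weighted mean matches the parent.
    return scale_to_parent(parent_m, raw, clicks)

# A low-click type (30 clicks) gets a heavily damped multiplier (D = 0.2).
ms = type_multipliers(parent_m=1.02, rois=[55, 40, 52], clicks=[400, 30, 570],
                      damping=[1.0, 0.2, 1.0], roi_target=50, p1=0.004)
```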


6.1 Damping function

The damping function in the developed algorithm is a function between zero and approximately one. If the damping function equals zero, then the corresponding multiplier equals one which indicates no bid adjustment. However, since the multipliers on the lower segment levels get scaled, the bid objects or bid object types still get bid adjustments based on the data on the higher segment levels. For bid objects and bid object types with small amounts of clicks, the ROI is not representative for the medium-term ROI. Therefore, the damping function should go to zero when the number of clicks goes to zero. If a bid object has a large amount of clicks, then the bid object can get a multiplier based on the ROI of that specific bid object and hence the damping function can be one.

The question is how to determine the damping function. The damping function should go to one when the number of clicks increases and to zero when the number of clicks decreases. Furthermore, as only the first lag of the ROI is used when calculating the multipliers, the main goal is to find a damping function that decreases when the acf1 of a bid object or bid object type decreases, and vice versa. To estimate such a damping function, we assume that the behaviour of the ROI is similar to the behaviour of the number of conversions: if the number of conversions goes up, the ROI also goes up, and vice versa, ceteris paribus. It therefore seems reasonable that the autocorrelation properties of the ROI are largely dictated by the underlying binomial process of clicks to conversions.

The number of conversions can be estimated using a binomial distribution where the number of clicks (C) is the number of trials and the conversion rate (CR) is the probability of a conversion. Hence,

CV_t ∼ BN(C_t, CR),

where CV_t denotes the number of conversions in period t.

To find an appropriate damping function, the acf1 of the number of conversions is calculated for different numbers of clicks and different average conversion rates. Conversion rates are therefore simulated as explained in Section 5.3. To obtain conversion rates with different means, the simulated time series are multiplied by a constant.

The following simulations are carried out. Set a fixed C_t and a fixed constant for calculating CR_t, and run 1000 simulations as follows.

1. Simulate the conversion rate as explained in Section 5.3 for 100 periods.

2. Multiply the simulated time series with the constant to obtain a simulated conversion rate with a different average value.

3. For each conversion rate in the time series, draw a random value from BN(C_t, CR_t) and save this as the number of conversions in period t.

4. Calculate and save the acf1 of the simulated numbers of conversions and save the mean of the conversion rate.

Next, calculate the mean of the acf1 and the mean of the conversion rate over the 1000 simulations. Thereafter, repeat the same steps for another combination of C_t and the fixed constant for calculating CR_t. The results of repeating these steps for many possible combinations are shown in Figure 17.

When the number of clicks gets smaller, the acf1 gets smaller and vice versa. The same holds for the conversion rate.

The calculated acf1 function is used to calculate the damping function as follows:

D(C_t, CR) = acf1(C_t, CR) / acf1(C_{t,all}, CR_{t,all}),
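A sketch of estimating the acf1 by simulation and forming the damping ratio. The conversion-rate path, click counts, and repetition count are illustrative; in the thesis the paths come from the Section 5.3 simulator.

```python
import math
import random

def acf1(xs):
    """Sample autocorrelation at lag 1."""
    n = len(xs)
    m = sum(xs) / n
    den = sum((x - m) ** 2 for x in xs)
    if den == 0:
        return 0.0  # constant series: no autocorrelation to measure
    return sum((xs[t] - m) * (xs[t - 1] - m) for t in range(1, n)) / den

def binomial(rng, n, p):
    # Draw from BN(n, p) as a sum of Bernoulli trials (fine for small n).
    return sum(rng.random() < p for _ in range(n))

def simulated_acf1(clicks, cr_path, reps=100, seed=0):
    """Average acf1 of conversion counts CV_t ~ BN(clicks, CR_t)."""
    rng = random.Random(seed)
    return sum(acf1([binomial(rng, clicks, p) for p in cr_path])
               for _ in range(reps)) / reps

def damping(clicks, cr_path, clicks_all, cr_path_all):
    # D(C, CR) = acf1(C, CR) / acf1(C_all, CR_all)
    return simulated_acf1(clicks, cr_path) / simulated_acf1(clicks_all, cr_path_all)

# Illustrative persistent conversion-rate path (level and amplitude made up).
path = [0.03 + 0.01 * math.sin(t / 3) for t in range(30)]
d_small = damping(20, path, 500, path)  # few clicks: acf1 shrinks, so D is expected to be well below one
```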


7.1 Bisection algorithm

The Bisection algorithm is a basic root-finding algorithm (Jones et al., 2014). Using the same notation as Jones et al. (2014), the algorithm can be explained as follows.

Start with a continuous function f and x_l < x_r such that f(x_l)f(x_r) < 0, and some small number ε.

1. If x_r − x_l ≤ ε, then stop.

2. Put x_m = (x_l + x_r)/2; if f(x_m) = 0 then stop.

3. If f(x_l)f(x_m) < 0, then put x_r = x_m; otherwise put x_l = x_m.

4. Go back to step 1.
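A direct transcription of the algorithm above, illustrated on f(x) = x² − 2, whose root on [0, 2] is √2:

```python
def bisection(f, xl, xr, eps=1e-10):
    """Textbook bisection: f continuous with f(xl)*f(xr) < 0."""
    assert f(xl) * f(xr) < 0
    while xr - xl > eps:
        xm = (xl + xr) / 2
        if f(xm) == 0:
            return xm
        if f(xl) * f(xm) < 0:
            xr = xm  # the root lies in [xl, xm]
        else:
            xl = xm  # the root lies in [xm, xr]
    return (xl + xr) / 2

root = bisection(lambda x: x * x - 2, 0.0, 2.0)
```

Each iteration halves the bracket, so reaching width ε from a bracket of width w takes about log2(w/ε) iterations.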

In our case, f denotes the autocorrelation at lag 1 and the x values denote the parameters set for the bid adjustment method. As the possible parameters can only lie between x_l and x_r, it is important that the bounds only get adjusted when the MAD corresponding to x_l differs significantly from the MAD corresponding to x_r. Figure 19 shows a simulated path with bid adjustment method 1 and parameter 0.004, split into periods of length 50 that all use the same parameter. The graph shows that the MAD obtained with one constant parameter depends strongly on the period.

Figure 19: MAD per period for one simulation, with periods of length 50 and parameter = 0.004

Therefore, it is difficult to conclude whether the MAD of one parameter is actually better in the long run than that of another parameter. To reduce this risk, we decided to add bootstrap hypothesis tests; the specific tests are described in Appendix D. However, only changing the parameters when the test is significant results in slow convergence, because the parameter is only rarely updated. To speed up the process, we also consider a bootstrap hypothesis test regarding the standard deviation of the ROI.

Lower standard deviations are preferred over higher standard deviations when aiming to stabilize the ROI. So, if the standard deviation of the ROI under one parameter is significantly smaller than under the other parameter, and the MAD of that parameter is also smaller than the MAD of the other parameter, then the bounds are adjusted as well. This speeds up the process. For the significance level, α = 15% is chosen, because with a smaller significance level the parameters were rarely adjusted.

In our case, the algorithm is used as follows. Start with x_l = 0.001 and x_r = 0.01, and set x_m = (x_l + x_r)/2. One period has length 50 and each parameter is tested during a third of the period.


1. Perform the bootstrap hypothesis tests.

2. If f(x_l)f(x_m) < 0 and the p-value for the right MAD being smaller than the left MAD is below 0.15, or if f(x_l)f(x_m) < 0, the p-value for the right sd being smaller than the left sd is below 0.15, and MAD_l < MAD_r, then put x_r = x_m and x_m = (x_l + x_r)/2.

If f(x_l)f(x_m) ≥ 0 and the p-value for the left MAD being smaller than the right MAD is below 0.15, or if f(x_l)f(x_m) ≥ 0, the p-value for the left sd being smaller than the right sd is below 0.15, and MAD_r < MAD_l, then put x_l = x_m and x_m = (x_l + x_r)/2.

3. Go back to step 1.

Figure 20 shows the MAD, the autocorrelation at lag 1, and the left and right parameter of one sample path.

Figure 20: Bisection method, one simulation, with periods of length 50 (left and right parameter, MAD, and acf1 per period)

7.2 Probabilistic Bisection Algorithm

As the bounds of the bisection method are static while the time series can change, it is not usable in the long run. Also, the MAD is very noisy with respect to the parameter of the multiplier. In other words, querying the MAD in a certain period with a specific parameter does not yield the exact MAD that would be obtained with that parameter in the long run, as was also shown in Figure 19. Therefore, finding the parameter corresponding to the smallest MAD is a stochastic optimization problem.

Finding the parameter at which the acf1 equals zero is a stochastic root-finding problem. An algorithm that solves stochastic root-finding problems is the Probabilistic Bisection Algorithm (PBA) (Horstein, 1963). This algorithm updates prior beliefs on the location of the root based on noisy responses to queries at chosen points (Frazier et al., 2019). In our application, the noisy responses denote the MADs and the chosen points denote the parameters. This section explains the basics of the PBA; more in-depth information can be found in Frazier et al. (2019) or Waeber et al. (2011).

Let r denote a parameter and let A(r) denote the autocorrelation at lag 1 as a function of r. The PBA considers monotonically decreasing functions g : [0, 1] → R for which there exists x* ∈ [0, 1] such that g(x) > 0 for x < x* and g(x) < 0 for x > x*. The goal of the PBA is to estimate x*. Figure 21b shows that A(r) is a monotonically decreasing function.

Figure 21: (a) Mean and standard deviation of the MAD per parameter; (b) mean and standard deviation of the acf1 per parameter. 10,000 simulations of length 10,000.

When querying A(r) for some r, we obtain the A(r) of a specific period (denoted by A_p(r)) instead of the long-run A(r) (denoted by A_LR(r)). Therefore, by querying A(r), we obtain the long-run A(r) plus some stochastic noise: A_p(r) = A_LR(r) + ε(r), where ε(r) represents the stochastic noise.

The PBA learns about the location of the optimal r* by looking at the direction of A_p(r), denoted by Z(r) = sign(A_p(r)). This direction is assumed to be correct with probability p ∈ (1/2, 1] and incorrect with probability 1 − p.

As an initial guess, the PBA starts with a Bayesian prior density function f_0 on the root r*, mapping from [0, 1], that is positive everywhere. Let F_0 denote the corresponding cumulative distribution function. Then for each step (n = 0, 1, ..., N) the PBA iterates as follows:

1. Determine the next parameter r_n as the median of f_n, hence r_n = F_n^{-1}(1/2).

2. Plug in r_n to obtain the corresponding A_p(r_n) and Z(r_n).

3. Update the density f_n:

if Z(r_n) = +1, then f_{n+1}(y) = 2p f_n(y) if y ≥ r_n, and f_{n+1}(y) = 2(1 − p) f_n(y) if y < r_n;

if Z(r_n) = −1, then f_{n+1}(y) = 2(1 − p) f_n(y) if y ≥ r_n, and f_{n+1}(y) = 2p f_n(y) if y < r_n.

4. Repeat the same steps for n = n + 1.
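A sketch of the update rule with a piecewise-constant density on [0, 1]; the grid size and p are illustrative. Querying at the median keeps the updated density (approximately) normalized.

```python
class PBA:
    """Probabilistic bisection with a piecewise-constant density on [0, 1]."""

    def __init__(self, n_bins=1000, p=0.7):
        self.p = p
        self.n = n_bins
        self.f = [1.0] * n_bins  # uniform prior f0

    def median(self):
        """r_n = F_n^{-1}(1/2), the next point to query."""
        acc = 0.0
        for i, v in enumerate(self.f):
            acc += v / self.n
            if acc >= 0.5:
                return (i + 0.5) / self.n
        return 1.0

    def update(self, r, z):
        """Shift mass towards the side indicated by z = sign(A_p(r))."""
        cut = int(r * self.n)
        for i in range(self.n):
            if (i >= cut) == (z == +1):
                self.f[i] *= 2 * self.p        # the believed side gets weight 2p
            else:
                self.f[i] *= 2 * (1 - self.p)  # the other side gets 2(1 - p)

pba = PBA()
r0 = pba.median()   # 0.5 under the uniform prior
pba.update(r0, +1)  # a positive sign pushes the belief to the right
```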


As the PBA needs a prior density function f_0 on r* mapping from [0, 1], but the optimal r* lies somewhere between 0 and 0.01 (as can be seen in Figure 21a), it is not efficient to start with roots between 0 and 1. Therefore, we adjust steps 2 and 3 by plugging r_n/100 instead of r_n into A_p and Z.

Figure 22 shows the first 15 steps of the density function of one simulation with period lengths of 50 of the PBA, with starting parameter 0.005 and p = 0.7 (the probability that the direction of the obtained acf1 is correct). It nicely shows that the algorithm finds the best parameter by updating the density function. Also, the peak of the density function moves over time, so the optimal parameter is not fixed, and the bounds are not fixed as with the Bisection algorithm. Based on the relatively low peaks in the density functions, the algorithm appears to converge slowly in our application. However, Table 6 shows the percentage of parameters located in the good range described in Section 5.4 (between 0.00250 and 0.00625) for different starting parameters and different values of p, and we observe that most simulations are in the good range after eight steps.

Based on the results in Table 6, we advise setting p = 0.7, because the results for all three starting values are sufficient for this probability. As a side note, it is not really realistic to assume that p is constant; it is more intuitive to assume that p gets smaller as the parameter converges towards the optimum. However, it is hard to decide whether the obtained parameter is close to the optimal parameter, and assuming a non-constant p greatly complicates the algorithm. Moreover, the results with constant p = 0.7 satisfy the current needs of the company. Therefore, we decide to use a constant p instead of a more complicated method.


Figure 22: Density plots of PBA with starting parameter 0.005 and p = 0.7


Table 6: Results Probabilistic Bisection Algorithm

Starting parameter    0.002    0.005    0.00714
p = 0.55              99.6%    95.2%    85.5%
p = 0.6              100.0%    94.8%    96.2%
p = 0.7               99.5%    94.8%    99.8%

Percentage of parameters in the sufficient range after 8 steps, over 10,000 simulated paths with periods of length 50.

8 Conclusion and discussion

In cooperation with an online retailer, a bid adjustment method for SEA is created that attempts to minimize the difference between the ROI and the fixed target ROI of the company. Simulations are carried out to find the best bid adjustment method and appropriate corresponding parameters. The best bid adjustment method proposed in this thesis uses the autocorrelation of the ROI at lag one to calculate suitable bid multipliers. This bid adjustment method appears to decrease the mean absolute deviation between the ROI and the fixed target ROI of the company by approximately % compared to the original observed ROI.

This thesis adds to the existing literature by analyzing a different type of SEA strategy than previous literature. Furthermore, a method is proposed to simulate the effects of bid adjustments on the ROI, which other researchers and companies can use as well. Next to that, the focus on the first autocorrelation of the ROI and the conversion rate is not conventional in the existing SEA literature and might inspire others. Lastly, the Probabilistic Bisection Algorithm is applied in this field for the first time.

There are also several limitations to the research. For example, the data used from the company depends heavily on competitors' behaviour in the auctions and on the historical bid changes of the company. Therefore, it is not guaranteed that the methods used in this thesis also suffice in other fields and for other companies. Researchers and companies interested in the same goal (a constant ROI) can use the applied methods and simulations, but no assurances can be given regarding the fit to different data.

Furthermore, the carried out simulations assume that the future conversion rate is independent of the multiplier. However, the company found in previous internal research a

Unfortunately, it is beyond the scope of this study to research the relationship between the multiplier and the conversion rate and to incorporate it in the simulations. This will not be a big issue in practice, because the Probabilistic Bisection Algorithm adapts the parameter based on recent data.

Another limitation of this thesis includes the definition of the ROI. The ROI in this thesis is calculated based on


As a final remark, it should be noted that this thesis focuses exclusively on SEA, without looking at substitution effects between, for example, organic search and paid search, or online versus offline conversions. Being aware of these substitutions and accounting for them might improve the performance, but this is also out of the scope of this research.

References

Abou Nabout, N. and B. Skiera (2012). Return on quality improvements in search engine marketing. Journal of Interactive Marketing 26 (3), 141–154.

Adikari, S. and K. Dutta (2019). A new approach to real-time bidding in online advertisements: Auto pricing strategy. INFORMS Journal on Computing 31 (1), 66–82.

Agarwal, A., K. Hosanagar, and M.D. Smith (2011). Location, location, location: An analysis of profitability of position in online advertising markets. Journal of Marketing Research 48 (6), 1057–1073.

Berman, R. (2018). Beyond the last touch: Attribution in online advertising. Marketing Science 37 (5), 771–792.

Best, D.J. and D.E. Roberts (1975). Algorithm AS 89: The upper tail probabilities of Spearman's rho. Journal of the Royal Statistical Society. Series C (Applied Statistics) 24 (3), 377–379.

Blake, T., C. Nosko, and S. Tadelis (2015). Consumer heterogeneity and paid search effectiveness: A large-scale field experiment. Econometrica 83 (1), 155–174.

Brent, R.P. (1971). An algorithm with guaranteed convergence for finding a zero of a function. The Computer Journal 14 (4), 422–425.

Burnham, K.P. and D.R. Anderson (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research 33 (2), 261–304.

Efron, B. and R.J. Tibshirani (1994). An introduction to the bootstrap. Chapman & Hall/CRC.

Frazier, P.I., S.G. Henderson, and R. Waeber (2019). Probabilistic bisection converges almost as quickly as stochastic approximation. Mathematics of Operations Research 44 (2), 651–667.

Ghose, A. and S. Yang (2009). An empirical analysis of search engine advertising: Sponsored search in electronic markets. Management Science 55 (10), 1605–1622.

Google (2020a). About your account organization.

https://support.google.com/google-ads/answer/1704396?hl=en. [Online; accessed 02-October-2020].

Google (2020b). Actual cost-per-click (cpc): Definition.

https://support.google.com/google-ads/answer/6297?hl=en. [Online; accessed 28-July-2020].

Google (2020c). Google ads. https://ads.google.com/. [Online; accessed 18-September-2020].

Google (2020d). Google ads. https://support.google.com/google-ads/answer/140351?hl=en. [Online; accessed 21-October-2020].

(35)

Google (2020e). Use bid multipliers to fine-tune your bidding.

https://support.google.com/displayvideo/answer/7378954?hl=en. [Online; accessed

02-October-2020].

Horstein, M. (1963). Sequential transmission using noiseless feedback. IEEE Transactions on Information Theory 9 (3), 136–143.

Hyndman, R.J. and G. Athanasopoulos (2018). Forecasting: principles and practice (2nd ed.). OTexts.

Hyndman, R.J., Y. Khandakar, et al. (2007). Automatic time series for forecasting: the forecast package for R. Number 6/07. Monash University, Department of Econometrics and Business Statistics.

Jones, O., R. Maillardet, and A. Robinson (2014). Introduction to scientific programming and simulation using R. CRC Press.

Kireyev, P., K. Pauwels, and S. Gupta (2016). Do display ads influence search? Attribution and dynamics in online advertising. International Journal of Research in Marketing 33 (3), 475–490.

Kuha, J. (2004). AIC and BIC: Comparisons of assumptions and performance. Sociological Methods & Research 33 (2), 188–229.

Lu, X. and X. Zhao (2014). Differential effects of keyword selection in search engine advertising on direct and indirect sales. Journal of Management Information Systems 30 (4), 299–326.

Nelder, J.A. and R. Mead (1965). A simplex method for function minimization. The Computer Journal 7 (4), 308–313.

Rutz, O.J. and R.E. Bucklin (2011). From generic to branded: A model of spillover in paid search advertising. Journal of Marketing Research 48 (1), 87–102.

Rutz, O.J. and M. Trusov (2011). Zooming in on paid search ads—a consumer-level model calibrated on aggregated data. Marketing Science 30 (5), 789–800.

Shin, W. (2015). Keyword search advertising and limited budgets. Marketing Science 34 (6), 882–896.

Skiera, B. and N. Abou Nabout (2013). Practice prize paper—PROSAD: A bidding decision support system for profit optimizing search engine advertising. Marketing Science 32 (2), 213–220.

Varian, H.R. (2007). Position auctions. International Journal of Industrial Organization 25 (6), 1163–1178.

Waeber, R., P.I. Frazier, and S.G. Henderson (2011). A bayesian approach to stochastic root finding. In Proceedings of the 2011 Winter Simulation Conference (WSC), pp. 4033–4045. IEEE.

Yang, S. and A. Ghose (2010). Analyzing the relationship between organic and sponsored search advertising: Positive, negative, or zero interdependence? Marketing Science 29 (4), 602–623. Zenetti, G., T.H.A. Bijmolt, P.S.H. Leeflang, and D. Klapper (2014). Search engine advertising

effec-tiveness in a multimedia campaign. International Journal of Electronic Commerce 18 (3), 7–38.


A    ARIMA models ROI

A.1    Different models of the ROI, first aggregation level, calculated each time group

This subsection summarizes the models given in Table 1.

The first model, an ARIMA(4,0,2) model with non-zero mean, is given by

ROI_t = 56.74 + 0.23(ROI_{t-1} - 56.74) - 0.42(ROI_{t-2} - 56.74) + 0.20(ROI_{t-3} - 56.74)
        + 0.46(ROI_{t-4} - 56.74) + ε̂_t + 0.02 ε̂_{t-1} + 0.55 ε̂_{t-2}.
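Read as a one-step-ahead prediction rule, the equation above can be sketched in code. The function below is illustrative (its name and interface are not from the thesis); it plugs the reported coefficient estimates into the mean-centered AR terms and the MA terms.

```python
def arima_402_step(roi_lags, eps_lags, mu=56.74,
                   ar=(0.23, -0.42, 0.20, 0.46), ma=(0.02, 0.55)):
    """One-step mean of the fitted ARIMA(4,0,2) model above.

    roi_lags = (ROI_{t-1}, ..., ROI_{t-4}); eps_lags = (eps_{t-1}, eps_{t-2}).
    """
    # AR part: weighted deviations of the lagged ROI from the mean.
    ar_part = sum(a * (r - mu) for a, r in zip(ar, roi_lags))
    # MA part: weighted past shocks.
    ma_part = sum(m * e for m, e in zip(ma, eps_lags))
    return mu + ar_part + ma_part

# At the mean with no past shocks, the prediction is the mean itself:
print(arima_402_step((56.74,) * 4, (0.0, 0.0)))  # 56.74
```

A one-unit increase of ROI_{t-1} above the mean raises the prediction by the first AR coefficient, 0.23, as can be verified by calling the function with shifted lags.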

The second model, an OLS model with ARIMA(0,0,2) errors and time group dummies as regressors, is given by

ROI_t = -43.38 t_1 - 49.57 t_2 - 33.32 t_3 + ε̂_t,

where ε̂_t = 88.26 + û_t + 0.40 û_{t-1} + 0.18 û_{t-2} and t_i denote dummy variables indicating the time group (i = 1, 2, 3).

The third model, an OLS model with ARIMA(0,0,4) errors and time group dummies as regressors, is given by

ROI_t = -43.37 t_1 - 49.55 t_2 - 33.31 t_3 + ε̂_t,

where ε̂_t = 88.25 + û_t + 0.26 û_{t-1} + 0.14 û_{t-2} + 0.17 û_{t-3} + 0.32 û_{t-4} and t_i denote dummy variables indicating the time group (i = 1, 2, 3).

The fourth model, an OLS model with ARIMA(1,0,2) errors and time group dummies as regressors, is given by

ROI_t = -43.39 t_1 - 49.59 t_2 - 33.32 t_3 + ε̂_t,

where ε̂_t = 88.25 + 0.91 ε̂_{t-1} + û_t - 0.60 û_{t-1} - 0.05 û_{t-2} and t_i denote dummy variables indicating the time group (i = 1, 2, 3).

The fifth model, an OLS model with ARIMA(1,0,4) errors and time group dummies as regressors, is given by

ROI_t = -43.37 t_1 - 49.55 t_2 - 33.30 t_3 + ε̂_t,

where ε̂_t = 88.23 + 0.66 ε̂_{t-1} + û_t - 0.36 û_{t-1} - 0.01 û_{t-2} + 0.09 û_{t-3} + 0.24 û_{t-4} and t_i denote dummy variables indicating the time group (i = 1, 2, 3).

The sixth model, an OLS model with ARIMA(1,0,4) errors with time group dummies and the lag of business sales as regressors, is given by

ROI_t = -43.37 t_1 - 49.33 t_2 - 33.23 t_3 - 0.00 B_{t-1} + ε̂_t,

where ε̂_t = 88.23 + 0.66 ε̂_{t-1} + û_t - 0.36 û_{t-1} - 0.01 û_{t-2} + 0.09 û_{t-3} + 0.24 û_{t-4}, t_i denote dummy variables indicating the time group (i = 1, 2, 3), and B_t denotes the total number of business sales in period t.

The last model, an OLS model with ARIMA(1,0,4) errors with time group dummies and the lag of non-business sales as regressors, is given by

ROI_t = -35.32 t_1 - 54.82 t_2 - 34.63 t_3 + 0.02 NB_{t-1} + ε̂_t,

where ε̂_t = 71.90 + 0.70 ε̂_{t-1} + û_t - 0.46 û_{t-1} + 0.00 û_{t-2} + 0.08 û_{t-3} + 0.20 û_{t-4}, t_i denote dummy variables indicating the time group (i = 1, 2, 3), and NB_t denotes the total number of non-business sales in period t.
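To illustrate how an OLS model with ARIMA(0,0,2) errors of this form generates data, the sketch below simulates the second model above. It assumes four time groups per period, with the fourth group as the baseline absorbed by the intercept 88.26, and an arbitrary shock standard deviation; both are illustrative assumptions, not estimates from the thesis.

```python
import random

def simulate_roi(n=12, sigma=1.0, seed=0):
    """Simulate ROI_t = -43.38 t_1 - 49.57 t_2 - 33.32 t_3 + eps_t with
    eps_t = 88.26 + u_t + 0.40 u_{t-1} + 0.18 u_{t-2} (second model above).
    Time groups are assumed to cycle 1..4; sigma is illustrative."""
    rng = random.Random(seed)
    beta = {1: -43.38, 2: -49.57, 3: -33.32, 4: 0.0}   # group 4 = baseline
    u = [0.0, 0.0]                                     # past shocks u_{t-2}, u_{t-1}
    roi = []
    for t in range(n):
        group = t % 4 + 1
        u_t = rng.gauss(0.0, sigma)
        eps = 88.26 + u_t + 0.40 * u[-1] + 0.18 * u[-2]  # ARIMA(0,0,2) error
        roi.append(beta[group] + eps)
        u.append(u_t)
    return roi

# With sigma = 0 the series collapses to the deterministic group means:
print(simulate_roi(sigma=0.0)[:4])
```

Setting sigma to zero is a quick sanity check: the four printed values are the intercept plus each time group dummy, so the dummies can be read directly as ROI differences relative to the baseline group.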

A.2    Different models of the ROI of the specific brand calculated daily

This subsection summarizes the models given in Table 2.

The first model, an ARIMA(2,0,1) model with non-zero mean, is given by

ROI_t = 48.85 - 0.40(ROI_{t-1} - 48.85) + 0.49(ROI_{t-2} - 48.85) + 0.84 ε̂_{t-1} + ε̂_t.

The second model, an ARIMA(1,0,2) model with non-zero mean, is given by

ROI_t = 48.78 - 0.61(ROI_{t-1} - 48.78) - 0.18 ε̂_{t-1} + 0.00 ε̂_{t-2} + ε̂_t.

The third model, an OLS model with ARIMA(2,0,1) errors and, as regressors, the lag of the price index with respect to the second competitor, the lag of the ROI of all data together, and the lag of non-business sales, is given by

ROI_t = -0.65 PI_{2,t-1} - 1.37 ROI_{all,t-1} - 0.01 NB_{b,t-1} + ε̂_t,

where ε̂_t = 37.8 - 0.41 ε̂_{t-1} + 0.46 ε̂_{t-2} + 0.83 û_{t-1} + û_t, PI_{2,t} denotes the price index with respect to the second competitor in period t, ROI_{all,t} denotes the ROI of all accounts together in period t, and NB_{b,t} denotes the non-business sales of the specific brand in period t.

The last model, an OLS model with ARIMA(2,0,1) errors and, as regressors, the lag of the price index with respect to the second competitor and the lag of the ROI of all data together, is given by

ROI_t = -0.58 PI_{2,t-1} - 2.124 ROI_{all,t-1} + ε̂_t,

where ε̂_t = 37.8 - 0.41 ε̂_{t-1} + 0.46 ε̂_{t-2} + 0.83 û_{t-1} + û_t, PI_{2,t} denotes the price index with respect to the second competitor in period t, and ROI_{all,t} denotes the ROI of all accounts together in period t.

B    Mean reverted conversion rates

Figure 23: Two different simulated conversion rates, with mean reverting terms. (Both panels plot the simulated conversion rate over 500 periods, indexed with respect to mean(CR).)

(38)

0.0 0.5 1.0 1.5 2.0 2.5 0 100 200 300 400 500 Period Inde x ed CR w r.t. mean(CR)

Simulated conversion rate, without mean converting term

(a) 0.0 0.5 1.0 1.5 2.0 2.5 0 250 500 750 Period Inde x ed CR w r.t. mean(CR)

Simulated conversion rate, without mean converting term

(b)

Figure 24: Two different simulated conversion rates, with mean reverting terms
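The contrast between Figures 23 and 24 can be reproduced with a simple simulation: adding a term that pulls the conversion rate back toward its mean keeps the indexed series near 1, while omitting it lets the series drift like a random walk. The dynamics and parameter values below are purely illustrative and are not the exact simulation model of Section 5.3.

```python
import random

def simulate_cr(n=500, mean_reverting=True, kappa=0.05, sigma=0.03, seed=1):
    """Illustrative conversion-rate path, indexed so that 1.0 is the mean.

    With kappa > 0 a mean reverting term pulls the path back toward 1;
    with mean_reverting=False the path drifts freely. kappa and sigma
    are arbitrary illustration values, not thesis estimates.
    """
    rng = random.Random(seed)
    cr, path = 1.0, []
    for _ in range(n):
        pull = kappa * (1.0 - cr) if mean_reverting else 0.0
        cr = max(cr + pull + rng.gauss(0.0, sigma), 0.0)  # CR cannot go negative
        path.append(cr)
    return path

reverting = simulate_cr(mean_reverting=True)   # stays close to 1 (cf. Figure 23)
drifting = simulate_cr(mean_reverting=False)   # wanders off (cf. Figure 24)
```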

C    Comparison of acf1 equals zero with min(MAD)

The conversion rate is simulated as explained in Section 5.3. The underlying model used in this thesis is the ARIMA(1,0,1) model. For this underlying model, the parameter corresponding to acf1 = 0 lies in the range [min(MAD) - 0.5, min(MAD) + 0.5]. To test whether this range also holds for other underlying models, we executed simulations with different underlying models. For each underlying model, 100 simulations of length 10,000 are run, and mean and standard deviation plots of these simulations are shown in Figures 25 to 30. The dotted lines depict the parameter bounds such that the mean MAD corresponding to those parameters differs by at most 0.5 from the minimum MAD. For all simulations, the mean and standard deviation of the MAD corresponding to acf1 equals zero lie in the range of sufficient parameters.
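The two summary statistics compared in this appendix can be computed as follows; the helper names are illustrative. For each candidate parameter value one would simulate the ROI series and then evaluate both its lag-1 autocorrelation (acf1) and its mean absolute deviation (MAD) from the target level.

```python
def acf1(x):
    """Lag-1 sample autocorrelation of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def mad(x, target):
    """Mean absolute deviation of a series from a target level."""
    return sum(abs(v - target) for v in x) / len(x)

# A strictly alternating series is strongly negatively autocorrelated:
print(acf1([1, -1] * 50))     # -0.99
# A series sitting exactly on target has zero MAD:
print(mad([5.0] * 10, 5.0))   # 0.0
```

Sweeping the adjustment parameter over a grid and plotting the mean and standard deviation of these two statistics across repeated simulations yields figures of the kind shown below.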

Figure 25: Underlying model is the ARIMA(1,0,1). (Panel (a): mean and standard deviation of the MAD per parameter; panel (b): mean and standard deviation of the acf1 per parameter; 100 simulations of length 10,000.)


Figure 26: Underlying model ARIMA(0,0,1). (Panel (a): mean and standard deviation of the MAD per parameter; panel (b): mean and standard deviation of the acf1 per parameter; 100 simulations of length 10,000.)

Figure 27: Underlying model ARIMA(0,0,2). (Panel (a): mean and standard deviation of the MAD per parameter; panel (b): mean and standard deviation of the acf1 per parameter; 100 simulations of length 10,000.)
