Overcoming privacy concerns with the generation of fake data.
Gilian Ponte
S2591634
Master Thesis Defense
July 4, 2019
Agenda: Introduction - Theory - Methodology - Results (H1, H2, H3, Empirical issues) - Discussion.
Facebook and Cambridge Analytica.
- Data of 87 million users harvested without consent.
- Consequences:
  - A loss of 119 billion dollars in market value (≈ McDonald's total value).
  - Termination of Cambridge Analytica.
  - The FTC investigates Facebook.
(Peltier et al., 2013; Reuters, 2019)
Increasing relevance of privacy.
- Firms annually spend around $36 billion.
- Customers respond negatively to a firm's collection and use of individual-level data.
- General Data Protection Regulation (GDPR).
- Martin, Borah & Palmatier (2017) describe that a data breach leads to:
  - significant negative stock performance;
  - spillover effects;
  - effects that are mitigated by transparency and control.
- Privacy is one of the highest priorities of companies and research.
(Columbus, 2014; van Doorn & Hoekstra, 2013; Kannan & Li, 2017; Marketing Science Institute, 2018)
Generative adversarial networks (GANs).
Generative adversarial networks.
- The discriminator D scores real data x and fake data generated by G(z).
- D and G train jointly in a minimax game:

min_G max_D V(D, G) = E_x~p_data[log D(x)] + E_z~p_z[log(1 − D(G(z)))],

where the first term is the expected score over samples from the real distribution and the second term is the expected score over fake data generated by G(z).
(Goodfellow, 2016)
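The minimax value function above can be illustrated numerically. This is a minimal sketch (not from the thesis): the toy distributions and the two hand-picked discriminators are assumptions, chosen only to show that a discriminator that separates real from fake pushes V(D, G) up, while a blind discriminator yields V = −log 4.

```python
import math
import random

random.seed(0)

def V(D, real_samples, fake_samples):
    """Monte Carlo estimate of the GAN value function
    V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]."""
    real_term = sum(math.log(D(x)) for x in real_samples) / len(real_samples)
    fake_term = sum(math.log(1 - D(x)) for x in fake_samples) / len(fake_samples)
    return real_term + fake_term

# Toy setup: real data ~ N(4, 1), fake data ~ N(0, 1).
real = [random.gauss(4, 1) for _ in range(10_000)]
fake = [random.gauss(0, 1) for _ in range(10_000)]

sharp_D = lambda x: 1 / (1 + math.exp(-4 * (x - 2)))  # separates the two modes
blind_D = lambda x: 0.5                               # cannot tell real from fake

print(V(blind_D, real, fake))  # exactly 2*log(1/2) = -log 4 ≈ -1.386
print(V(sharp_D, real, fake))  # much closer to 0: this D wins the game
```

At the (theoretical) optimum of G, no discriminator can do better than the blind one, which is why −log 4 is the value of the game at convergence.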
Training a GAN.
Noise z ∼ Normal(0, 1) is fed to the generator; the discriminator compares the generated samples G(z) against samples from the real data set.
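The alternating training procedure can be sketched on toy 1-D data. Everything here is an illustrative assumption (distributions, parameterisation, learning rates), not the thesis setup: the generator is a simple shift G(z) = z + θ, the discriminator a logistic function, and both are updated with analytic gradients.

```python
import math
import random

random.seed(0)

def sigmoid(u):
    # Numerically stable logistic function.
    return 1 / (1 + math.exp(-u)) if u >= 0 else math.exp(u) / (1 + math.exp(u))

# Toy GAN: real data ~ N(3, 1); generator G(z) = z + theta with z ~ N(0, 1);
# discriminator D(x) = sigmoid(a*x + b).
theta, a, b = 0.0, 0.0, 0.0
lr, batch, trace = 0.02, 64, []

for step in range(2000):
    real = [random.gauss(3, 1) for _ in range(batch)]
    fake = [random.gauss(0, 1) + theta for _ in range(batch)]

    # Discriminator step: gradient ascent on log D(x) + log(1 - D(G(z))).
    d_real = [sigmoid(a * x + b) for x in real]
    d_fake = [sigmoid(a * g + b) for g in fake]
    grad_a = (sum((1 - d) * x for d, x in zip(d_real, real))
              - sum(d * g for d, g in zip(d_fake, fake))) / batch
    grad_b = (sum(1 - d for d in d_real) - sum(d_fake)) / batch
    a += lr * grad_a
    b += lr * grad_b

    # Generator step: gradient ascent on the non-saturating loss log D(G(z)).
    d_fake = [sigmoid(a * g + b) for g in fake]
    theta += lr * a * sum(1 - d for d in d_fake) / batch
    trace.append(theta)

print(trace[-1])  # theta drifts toward the real mean (3)
```

The shift parameter moves toward the real mean, but, as the non-convergence slide below notes, GAN dynamics of exactly this kind can oscillate around the equilibrium rather than settle on it.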
Non-convergence.
- GANs are a very new method!
- Mode collapse.
- Evaluation of training is difficult.
(Radford et al., 2015; Theis, Oord & Bethge, 2015; Goodfellow, 2016; Salimans et al., 2016; Arjovsky, Chintala & Bottou, 2017)
Wasserstein - Generative adversarial network.
- Wasserstein distance:
  - Defined even when the real and generated distributions barely overlap.
  - Differentiable.
  - Removes the dependency of G on D.
  - Linear activation function in the output layer (the "critic").
- The standard GAN loss, by contrast, is not defined without overlap.
(Arjovsky, Chintala & Bottou, 2017)
(Goodfellow, 2016; Beaulieu-Jones et al., 2018; Kumar, Biswas & Sanyal, 2018)
* Three data sets: artificial churn data set (1), real churn data set (2) and market data set (3).
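A sketch of why the Wasserstein distance stays informative without overlap: for two equal-size 1-D samples it has a closed form, the mean absolute difference between the sorted samples. The toy distributions below are assumptions for illustration, not thesis data.

```python
import random

random.seed(1)

def wasserstein_1d(xs, ys):
    """W1 between two equal-size 1-D empirical distributions:
    the mean absolute difference of the sorted samples."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

real  = [random.gauss(0, 1) for _ in range(5000)]
close = [random.gauss(0.5, 1) for _ in range(5000)]  # overlapping distributions
far   = [random.gauss(10, 1) for _ in range(5000)]   # essentially disjoint support

print(wasserstein_1d(real, close))  # ≈ 0.5: small, informative
print(wasserstein_1d(real, far))    # ≈ 10: large but finite, still a usable signal
```

A Kullback-Leibler-based loss would already be undefined in the disjoint case; the Wasserstein distance degrades gracefully, which is what gives the critic a usable gradient.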
Hypotheses*
- Theoretically, when a GAN successfully converges, the real density surface is approximated.
H1: The correlation matrix from the fake data set significantly correlates with the real data correlation matrix.
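One way to operationalise H1 is to correlate the off-diagonal entries of the two correlation matrices. The sketch below assumes a hypothetical data-generating process (a random but fixed covariance structure); `good_fake` stands in for what a converged GAN should produce, `bad_fake` for independent noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def corr_of_corrs(real, fake):
    """Pearson correlation between the lower-triangle entries of the
    two data sets' correlation matrices (one way to test H1)."""
    idx = np.tril_indices(real.shape[1], k=-1)
    return np.corrcoef(np.corrcoef(real, rowvar=False)[idx],
                       np.corrcoef(fake, rowvar=False)[idx])[0, 1]

# Toy "real" process with a random (but fixed) correlation structure.
A = rng.normal(size=(6, 6))
cov = A @ A.T
real      = rng.multivariate_normal(np.zeros(6), cov, size=20_000)
good_fake = rng.multivariate_normal(np.zeros(6), cov, size=20_000)  # same structure
bad_fake  = rng.normal(size=(20_000, 6))                            # independent noise

print(corr_of_corrs(real, good_fake))  # close to 1: structure preserved
print(corr_of_corrs(real, bad_fake))   # far from 1: structure lost
```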
Correlations - Artificial churn data set.
[Figure: correlation matrices for the real, GAN and WGAN data; r = .99***, r = .70***.]
GAN.
[Figure: generated versus real distributions for the artificial churn data set (i = 50,000), the real churn data set (i = 500,000) and the market data set (i = 300,000).]
Multiple experiments in training the GANs were run to better approximate the real data.
Wasserstein GAN.
[Figure: generated versus real distributions for the artificial churn data set (i = 1,000,000), the real churn data set (i = 1,000,000) and the market data set.]
Multiple experiments in training the WGANs were run to better approximate the real data.
Conditional on successfully generating fake data...
H2: The predictive accuracy of machine learning techniques is significantly lower on fake data than on real data.
H2a: The addition of generated fake data to real data significantly increases the predictive accuracy of machine learning techniques compared to only real data.
H2: Artificial churn data set.
H2: Real churn data set.
H2a: Additional data - Artificial churn data set.
*n = 512,000
H2a: Additional data - Real churn data set.
The normative value of OLS estimations for lemonade sales?
H3: The parameters are equal between the estimation based on generated fake data and an estimation on the real data.
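The lemonade-sales framing of H3 can be sketched as follows. The data-generating process (sales = 10 + 3·temperature + noise) is a hypothetical assumption, and the "fake" data here is an idealised stand-in: a fresh draw from the same process, which is what a perfectly converged GAN would approximate.

```python
import numpy as np

rng = np.random.default_rng(42)

def ols(X, y):
    """OLS coefficients via least squares, intercept prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

# Hypothetical lemonade-sales process: sales = 10 + 3*temp + noise.
n = 50_000
temp_real  = rng.normal(25, 5, n)
sales_real = 10 + 3 * temp_real + rng.normal(0, 2, n)

# Idealised "fake" data: a fresh draw from the same process.
temp_fake  = rng.normal(25, 5, n)
sales_fake = 10 + 3 * temp_fake + rng.normal(0, 2, n)

b_real = ols(temp_real, sales_real)
b_fake = ols(temp_fake, sales_fake)
print(b_real, b_fake)  # both close to [10, 3]
```

Under H3 the two coefficient vectors should agree up to sampling error; how far an actual GAN-generated data set falls short of this ideal is exactly what the results below measure.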
H3: Artificial churn data set.
Significantly different parameters and variances (H = 194, H = 1,449), but they correlate highly (r = .99, r = .90).
H3: Market data set.
Significantly different estimations (H = 10, H = 5,208), but they correlate highly (r = .95, r = .65).
Empirical issues...
How does the generated fake data influence the MAPE (Mean Absolute Percentage Error), compared to the estimation on the real data?
How does the generated fake data influence the RAE (Relative Absolute Error) and the Theil U-statistic, compared to the estimation on the real data?
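The three accuracy measures can be sketched directly. Definitions vary in the literature; here RAE is taken relative to a mean-prediction null model, and the U1 variant of Theil's statistic is assumed. The two prediction vectors are illustrative placeholders, not thesis results.

```python
def mape(actual, pred):
    """Mean Absolute Percentage Error (actuals must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

def rae(actual, pred):
    """Relative Absolute Error: absolute errors relative to the
    errors of a mean ("null-model") prediction."""
    mean_a = sum(actual) / len(actual)
    num = sum(abs(a - p) for a, p in zip(actual, pred))
    den = sum(abs(a - mean_a) for a in actual)
    return num / den

def theil_u1(actual, pred):
    """Theil's U (U1 form): RMSE scaled by the root mean squares of
    actuals and predictions; 0 is perfect, values near 1 are poor."""
    n = len(actual)
    rmse = (sum((a - p) ** 2 for a, p in zip(actual, pred)) / n) ** 0.5
    denom = ((sum(a * a for a in actual) / n) ** 0.5
             + (sum(p * p for p in pred) / n) ** 0.5)
    return rmse / denom

actual    = [100, 120, 140, 160]
pred_real = [102, 118, 143, 158]  # e.g. a model estimated on real data
pred_fake = [90, 130, 120, 180]   # e.g. a model estimated on fake data

print(mape(actual, pred_real), mape(actual, pred_fake))
```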
* Null-model
Discussion & limitations.
- Very new method.
- Differential privacy.
- Artificial intelligence & marketing.
- Time-series data.
- Increase in predictive accuracy!
- Implications for privacy, data sharing and the development of theory.
- Cambridge Analytica & Facebook?
References
1. Arjovsky, M., Chintala, S., & Bottou, L. 2017. Wasserstein GAN. arXiv. https://arxiv.org/abs/1701.07875.
2. Beaulieu-Jones, B. K., Wu, Z. S., Williams, C., Lee, R., Bhavnani, S. P., et al. 2018. Privacy-preserving generative deep neural networks support clinical data sharing. http://doi.org/10.1101/159756.
3. Columbus, L. 2014, June 27. 2014: The Year Big Data Adoption Goes Mainstream In The Enterprise. Forbes. https://www.forbes.com/sites/louiscolumbus/2014/01/12/2014-the-year-big-data-adoption-goes-mainstream-in-the-enterprise/#10a2418c2055.
4. Doorn, van J., & Hoekstra, J. C. 2013. Customization of Online Advertising: The Role of Intrusiveness. Marketing Letters, 24(4): 339-351.
5. Goodfellow, I. 2016. NIPS 2016 Tutorial: Generative Adversarial Networks. arXiv. https://arxiv.org/abs/1701.00160.
6. Kannan, P. K., & Li, H. A. 2017. Digital marketing: A framework, review and research agenda. International Journal of Research in Marketing, 34(1): 22-45.
7. Kumar, A., Biswas, A., & Sanyal, S. 2018. eCommerceGAN: A Generative Adversarial Network for E-commerce. arXiv. https://arxiv.org/abs/1801.03244.
8. Marketing Science Institute. 2018. Research Priorities 2018-2020. Cambridge, Mass.: Marketing Science Institute.
9. Peltier, J. W., Zahay, D., & Lehmann, D. R. 2013. Organizational Learning and CRM Success: A Model for Linking Organizational Practices, Customer Data Quality, and Performance. Journal of Interactive Marketing, 27(1): 1-13.
10. Radford, A., Metz, L., & Chintala, S. 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv. https://arxiv.org/abs/1511.06434.
11. Reuters. 2019, February 15. Facebook may face multibillion-dollar US fine over privacy lapses – report. The Guardian. https://www.theguardian.com/technology/2019/feb/14/facebook-ftc-privacy-cambridge-analytica-fine.
12. Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., & Chen, X. 2016. Improved Techniques for Training GANs. In Advances in Neural Information Processing Systems, 2234-2242.
13. Schneider, M. J., Jagpal, S., Gupta, S., Li, S., & Yu, Y. 2017. Protecting customer privacy when marketing with second-party data. International Journal of Research in Marketing, 34(3): 593-603.
14. Theis, L., Oord, A. van den, & Bethge, M. 2015. A note on the evaluation of generative models. arXiv. https://arxiv.org/abs/1511.01844.
Loss function G.
- When the discriminator has high certainty over the fake samples, D(G(z)) will be very small (e.g., .001).
- Assuming that D is very certain (D(G(z)) = .001):
  - log(1 − D(G(z))) = log(.999) ≈ −.001
  - log(D(G(z))) = log(.001) ≈ −6.91
- Maximising log(D(G(z))) instead of minimising log(1 − D(G(z))) allows for a larger gradient and faster learning.
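The arithmetic on this slide can be checked directly; the gradients with respect to D(G(z)) make the difference between the two generator losses even more visible. A minimal sketch:

```python
import math

d_fake = 0.001  # D is very certain the sample is fake: D(G(z)) = .001

# Original minimax loss for G: minimise log(1 - D(G(z))).
saturating = math.log(1 - d_fake)      # log(.999) ≈ -0.001 -> nearly flat
# Non-saturating alternative: maximise log D(G(z)).
non_saturating = math.log(d_fake)      # log(.001) ≈ -6.91  -> far from flat

# Gradients with respect to D(G(z)):
grad_saturating = -1 / (1 - d_fake)    # ≈ -1.0: tiny learning signal
grad_non_saturating = 1 / d_fake       # ≈ 1000: strong learning signal

print(saturating, non_saturating, grad_saturating, grad_non_saturating)
```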
- Similar to how we humans learn: from the most difficult examples (Goodfellow et al., 2014).
- More variability in the examples given to the network (Leeflang et al., 2015).
- Better able to explain variability in the dependent variable.
- Better generalization to new data through training on more, different samples.
- Risselada et al. (2010): accuracy is highly related to the data set.
Dependency of G on D.
[Figure: real data; D captures the real distribution; G captures the distribution from D.]
Difference between the Kullback-Leibler and Wasserstein distance.
- Definitions:
  - Kullback-Leibler divergence = infinity (without overlap).
  - The loss function is then not defined!
  - Wasserstein distance ≠ infinity.
- Implications:
  - It is possible to take the gradient.
  - No dependence of G on D.
  - It does not matter how strong D is.
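The contrast can be made concrete with two point-mass distributions whose supports do not overlap: the KL divergence is infinite (no gradient to follow), while the Wasserstein distance is the finite transport cost. The grid and the two distributions below are illustrative assumptions.

```python
import math

# Two point-mass distributions on a grid that do not overlap.
grid = [0, 1, 2, 3]
p = [1.0, 0.0, 0.0, 0.0]  # real distribution: all mass at x = 0
q = [0.0, 0.0, 0.0, 1.0]  # generated distribution: all mass at x = 3

def kl(p, q):
    """KL(p || q); returns inf as soon as q(x) = 0 where p(x) > 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total

def wasserstein(grid, p, q):
    """W1 on the real line via the CDF formula: sum |F_p - F_q| * dx."""
    total, cp, cq = 0.0, 0.0, 0.0
    for i in range(len(grid) - 1):
        cp += p[i]
        cq += q[i]
        total += abs(cp - cq) * (grid[i + 1] - grid[i])
    return total

print(kl(p, q))                 # inf -> no usable gradient for G
print(wasserstein(grid, p, q))  # 3.0 -> distance still informative
```

As the generated mass moves toward the real mass, the Wasserstein distance shrinks smoothly, whereas the KL divergence stays infinite until the supports actually overlap.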
Some first results from a medical trial.
(Beaulieu-Jones et al., 2018).
Correlations - Real churn data set.
[Figure: correlation matrices; r = .89***, r = .43***.]
Correlations - Market data set.
[Figure: correlation matrices; r = .93***, r = .62***.]