Collecting Contextual Information About a DDoS Attack Event Using Google Alerts

Motivation:

Distributed Denial of Service (DDoS) attacks can cause massive economic damage to victims. In most cases, the damage is dictated by the circumstances surrounding the attack (i.e., its context). One way to collect information on the context of an attack is to use the online articles written about it.

Goals:

• Introduce a dataset, collected using Google Alerts, that provides contextual information related to DDoS attacks.

• Invite other researchers to collaborate.

Data Collection:

• Step 1: Export the emails and store them in local file storage.

• Step 2: Scrape the text from the emails using the mailbox package (Python) and extract the following features from each alert using regular expressions: 1) Alert Header, 2) Associated Text, 3) Type of Alert (News or Web). Then filter out duplicate alerts, as the same alert may be reported by both trigger words ('ddos' and 'denial of service').

• Step 3: Add two features to the dataset: 1) the language of the alert, 2) the historical Alexa rank of the source of the alert.

• Step 4: Store all data in a relational database.

Data Sharing & Future Work:

• We are working on a web portal to make the dataset public. We will also share all the scripts used for scraping and preparing the data. In the near future, we plan to build an algorithm to label and track articles belonging to a single attack event.

• The dataset contains not only reports that describe attacks but also articles on law-enforcement actions against DDoS attackers and studies of DDoS attack trends by researchers. We are developing a supervised machine learning algorithm to classify articles into each of these categories automatically.
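As an illustration of Steps 1 and 2 of the data collection, the following is a minimal sketch and not the authors' actual script. It assumes the exported emails are in mbox format and that each plain-text body lists alerts in blank-line-separated blocks under a hypothetical `=== News ===` or `=== Web ===` section line. The real Google Alerts layout may differ, so the regular expression and all function names here are assumptions.

```python
import mailbox
import re

# Hypothetical section marker; the real Google Alerts email layout may differ.
SECTION_RE = re.compile(r"^=== (News|Web) ===$")


def body_text(msg):
    """Return the first text/plain part of an email message as a string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def parse_alerts(text):
    """Extract header, associated text and type from an (assumed) alert body."""
    alerts, alert_type = [], None
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        match = SECTION_RE.match(lines[0])
        if match:  # a section line switches the current alert type
            alert_type = match.group(1)
            lines = lines[1:]
        if alert_type and len(lines) >= 2:
            alerts.append(
                {"header": lines[0], "text": " ".join(lines[1:]), "type": alert_type}
            )
    return alerts


def deduplicate(alerts):
    """Keep the first occurrence of each (header, type) pair."""
    seen, unique = set(), []
    for alert in alerts:
        key = (alert["header"].lower(), alert["type"])
        if key not in seen:
            seen.add(key)
            unique.append(alert)
    return unique


def alerts_from_mbox(path):
    """Parse every email in an mbox file and drop duplicate alerts."""
    box = mailbox.mbox(path)
    try:
        alerts = []
        for msg in box:
            alerts.extend(parse_alerts(body_text(msg)))
    finally:
        box.close()
    return deduplicate(alerts)
```

Deduplicating on the lower-cased header and alert type is one simple choice; a URL-based key would likely be more robust if the alert links are also extracted.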


Abhishta Abhishta* (s.abhishta@utwente.nl)
Reinoud Joosten* (r.a.m.g.joosten@utwente.nl)
Mattijs Jonker† (m.jonker@utwente.nl)
Wim Kamerman* (info@wimkamerman.com)
Lambert J. M. Nieuwenhuis* (l.j.m.nieuwenhuis@utwente.nl)

* Industrial Engineering and Business Information Systems (IEBIS), Faculty of Behavioural, Management and Social Sciences (BMS), University of Twente, Enschede, The Netherlands.

† Design and Analysis of Communication Systems (DACS), Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, The Netherlands.

Fig. 1: #Articles about the attack.

Fig. 2: #Articles about the attack after filtering noise.

Analysis:

• We analyse the metadata of the articles related to four major DDoS attack events within the first 20 days following each attack.

• Fig. 1 shows the number of articles related to each of the attacks within 20 days of the attack.

• We use a machine learning algorithm to classify articles reporting a DDoS attack. Fig. 2 shows that this removes the noise from our dataset: there are no attack-reporting articles before the attack day.

Observations:

• We record a relatively large number of articles just after the attack day. This indicates that our data collection strategy successfully tracks articles reporting DDoS attacks.

• More articles discussed the attack on Pokemon than the attack on OVH, which suggests that the popularity of an attack on web forums is not proportional to its intensity.
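The per-day article counts behind Fig. 1 and Fig. 2 could be computed along the following lines. This is a minimal sketch under assumptions: the function name, the `days_before` window and the dates in the usage note are hypothetical, and only the 20-day window after the attack comes from the poster.

```python
from collections import Counter
from datetime import date


def articles_per_day(article_dates, attack_day, days_before=5, days_after=20):
    """Count articles by day offset from the attack (0 = attack day).

    Articles outside the window [-days_before, days_after] are ignored.
    """
    counts = Counter()
    for published in article_dates:
        offset = (published - attack_day).days
        if -days_before <= offset <= days_after:
            counts[offset] += 1
    return dict(sorted(counts.items()))
```

For example, with made-up publication dates, `articles_per_day([date(2016, 1, 1), date(2016, 1, 3), date(2016, 1, 3)], date(2016, 1, 1))` returns `{0: 1, 2: 2}`. Any non-zero count at a negative offset would indicate articles that cannot be reporting the attack, which is the noise check described for Fig. 2.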

Collection Start Date: 20th of August 2015

Table 1: #Email alerts on trigger word

Year    'ddos'    'denial of service'
2015    2763      132
2016    8084      350
2017    7256      349
2018    4863      313

Table 2: #Articles tagged as News or Web

Year    News    Web
2015    1427    3653
2016    4458    9387
2017    5805    9658
2018    5230    7005

Table 3: #Sources and #Languages

Year    # Sources    # Languages
2015    2467         37
2016    4889         42
2017    5692         44
2018    5071         45
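Yearly counts such as those in the tables could be derived from the relational database of Step 4 with a simple aggregate query. The sketch below uses SQLite purely for illustration. The poster does not name the database system, and the `alerts` table, its columns and the function name are all assumptions.

```python
import sqlite3

# Hypothetical schema; the poster does not describe the actual tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS alerts (
    id           INTEGER PRIMARY KEY,
    received_on  TEXT NOT NULL,  -- ISO date of the alert email
    trigger_word TEXT NOT NULL,  -- 'ddos' or 'denial of service'
    alert_type   TEXT NOT NULL CHECK (alert_type IN ('News', 'Web')),
    header       TEXT NOT NULL,
    body         TEXT,
    language     TEXT,
    source       TEXT
);
"""


def yearly_counts_by_type(conn):
    """Return (year, alert_type, count) rows, ordered by year and type."""
    query = (
        "SELECT substr(received_on, 1, 4) AS year, alert_type, COUNT(*) "
        "FROM alerts GROUP BY year, alert_type ORDER BY year, alert_type"
    )
    return conn.execute(query).fetchall()
```

Storing the date as ISO text keeps the sketch portable; grouping on `trigger_word` instead of `alert_type` would give the per-trigger breakdown in the same way.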