
Python Design Defects Detection

Nikola Vavrová

vavrova.n@gmail.com

December 2015, 46 pages

Supervisor: Dr. Vadim Zaytsev Host organisation: University of Amsterdam

Universiteit van Amsterdam

Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Master Software Engineering


Contents

Abstract
1 Introduction
    1.1 Motivation
    1.2 Background
    1.3 Related Work
2 Problem & Approach
    2.1 Problem Statement
    2.2 Research Questions
    2.3 Method & Approach
        2.3.1 Design Defect Detector Architecture
        2.3.2 Version Support
3 Parsing
    3.1 Grammar
        3.1.1 Resources
        3.1.2 Engineering
        3.1.3 Testing
        3.1.4 Result
4 AST Building
5 Code Model
    5.1 Design
    5.2 Construction
6 Analysis
    6.1 Technical Implementation
7 Design Defect Detection
    7.1 Technique
    7.2 Design Defect Catalogue
        7.2.1 Feature Envy
        7.2.2 Data Class
        7.2.3 Long Method
        7.2.4 Long Parameter List
        7.2.5 Large Class
        7.2.6 Blob
        7.2.7 Swiss Army Knife
        7.2.8 Functional Decomposition
        7.2.9 Spaghetti Code
8 Evaluation Data Set
    8.1 Collected GitHub Repository Links
    8.2 Filtering: First Step
    8.3 Initial Data Set
    8.4 Filtering: Second Step
    8.5 Final Data Set
9 Results
    9.1 Answers
10 Discussion
    10.1 Threats to Validity
    10.2 Future Work
11 Conclusion
Bibliography
Appendices
Appendix A AST structure
    A.1 Root
    A.2 Level 1
    A.3 Level 2
    A.4 Level 3


Abstract

Design defects are bad design practices in source code and have a negative impact on software maintenance. The majority of the research conducted in the field of design defect detection focuses only on Java source code. Assuming that the context of a programming language is a significant one, we concentrated on the detection of design defects in Python source code, to see whether there are any notable differences from Java-based findings.

To detect design defects in Python, we developed a tool called Design Defect Detector. This tool is compatible with every Python version in current use, i.e. from Python 2.5 to Python 3.5. To achieve this, we have developed a combined Python ANTLR4 grammar.

This tool was used on a data set of 4,121 GitHub repositories, consisting of 32,058,823 lines of Python code. As a result, we found that 8 out of the 9 design defects we chose for our study were detectable in Python. We also found that code smells were more common in Python than antipatterns. We compared our results to DECOR [27] and found that the density of detected design defects was slightly lower in Python than in Java code.


Chapter 1

Introduction

1.1 Motivation

Most software systems go through constant evolution driven by the need to keep up with changing requirements. Since design flaws are known to have a large negative impact on maintenance [2], this introduces a requirement for quality software design. There exist multiple well-known, industry-established design practices, such as design patterns [11] and heuristics for good object-oriented programming [37], to aid the process of creating good design.

In our work, we focus on the counterparts of good design practices, called design defects. Moha et al. defined this term as the embodiment of bad design practices in the source code of programs [30]. This means that design defects span all levels of software design. They include low-level issues such as code smells [10], but also architectural issues, e.g. antipatterns [2].

Detection of the different types of design practices aids in determining the quality of software design. The work that has been done in this field is described in further detail in Chapter 1.3.

Design defect detection is challenging for numerous reasons. Firstly, it is difficult to do manually because the analyzed systems are often very large. Design defects can span different subsystems, so they cannot be detected locally [6]. Most of the techniques in use therefore employ various levels of automation.

However, the largest difficulty with detecting design defects is that they are defined in a very loose manner. Unlike design patterns, which have precise UML definitions and concrete descriptions, design defect descriptions are mostly textual and often contain phrases open to interpretation, such as a "class with a large number of attributes, operations, or both" [2]. Such definitions are not only ambiguous, but also very context dependent: a normal-sized class in one project or programming language can be considered a large class in another project or language. Brown et al. actually define an antipattern as a pattern in an inappropriate context [2].

Most existing design defect detection studies have focused on Java in their experiments [8, 9, 15, 16, 26–28, 30, 31, 33, 34, 38, 40]. There are a couple of approaches that claim language independence, such as the one by Llano and Pooley [22] and by Cortellessa et al. [7]; however, neither of them seems to have gained much popularity.

In our work, we consider the context of a programming language a significant one. Due to the differences between languages, the code produced in them often differs. These differences go beyond the purely syntactic level. Some languages promote certain ways of solving problems over others. They also have different limitations and recommended practices. To give an example, Table 1.1 shows some of the elementary differences between Python and Java.

Python                         Java
functional & object oriented   object oriented
dynamically typed              statically typed
concise                        verbose

Table 1.1: Differences between Java & Python


This thesis partially bridges the gap in design defect detection research for programming languages other than Java. For this purpose, we create a tool, Design Defect Detector, for automatic detection of design defects in Python. The requirements for this tool are explained in Chapter 2 and its separate components are described in Chapters 3–6.

We define the design defects in a concrete manner in Chapter 7 and use the Design Defect Detector in a quantitative experiment on a data set of GitHub projects. This set is further described in Chapter 8. We inspect the results obtained from the experiment in Chapter 9, discuss them in Chapter 10 and finally conclude our work in Chapter 11.

There exist initiatives similar to ours, such as Pylint1, which detects deviations from coding standards and looks for code smells. However, to the best of our knowledge, there is currently no scientific study of detecting design defects in Python.

1.2 Background

This section briefly explains essential concepts used in this thesis.

Python

Python is a high-level, interpreted programming language. It is classified as both functional and object oriented. It has strong, dynamic typing. At the present time, there are two major releases in use, 2.x and 3.x. Python's official website and documentation can be found at https://www.python.org/.

PyPi

PyPi, or Python Package Index, is the official repository of open-source, third-party software for Python. It can be found at https://pypi.python.org/pypi.

GitHub

GitHub (https://github.com/) is a popular, web-based repository hosting service based on a version control system called Git. Git is explained in detail by Loeliger [23].

Grammars

Grammars, in the context of theoretical computer science, systematically describe the syntax of programming language constructs like expressions and statements [1]. A grammar consists of terminals, nonterminals, production rules and a start symbol. Context-free grammars are grammars in which each production rule describes the possible form of a single nonterminal, not a combination of them. Grammars are described in detail by Aho et al. [1] and Grune et al. [13].

Parsing

Grammars are an essential component in a process called parsing. A parser uncovers the implicit grammatical structure in an input text. The result of parsing is represented in the form of a parse tree [42]. If this cannot be done, the parser reports a syntax error for the input text [1]. This process is further described in the literature by Aho et al. [1] and Grune and Jacobs [12].

ANTLR4

Writing parsers is a difficult and rather time-consuming task. However, since parsing is a very well established field [42], there are nowadays numerous tools that automatically generate parsers from an input grammar, such as Yacc, Bison++, etc. In our work, we use ANTLR4 for this purpose. ANTLR4 [35] is the newest version of ANTLR, a parser generator which uses LL(*), a top-down parsing strategy [36].

Fact extraction

Fact extraction is the process of parsing input source code and producing a fact base about it. The fact base is usually used to further analyze the given software. In our work, the extracted facts are used to build a code model: a representation of the given software which stores the relevant software metrics. Fact extraction is described in more detail by Lin and Holt [21].


1.3 Related Work

The most commonly mentioned and researched design defects are code smells [10] and antipatterns [2]. Since their introduction, a multitude of different methods for their detection (and sometimes also correction) have been proposed in the literature. They range from manual approaches to semi-automatic or automatic ones.

Mäntylä et al. [24] use one of the manual approaches for detecting bad code smells. They evaluate developer questionnaires about their occurrence and then explore correlations between the smells. A different manual approach was used by Moha et al. [31], who classified design pattern defects into groups. Their work is based on the definition of design defects as design patterns implemented in a wrong way. They demonstrated the presence of these design defects by having students look for design patterns in code and examining the distortion of those design patterns.

Dhambri et al. [8] propose a semi-automatic method for detecting design flaws. They automatically detect certain symptoms and visualize them, but rely on a human analyst to make the final conclusions. Ciupke [6] proposes an automatic method which queries a meta-model of the source code for design problems. Guéhéneuc and Albin-Amiot [14] introduce a method for automatically detecting and correcting inter-class design defects. They identify distorted forms of design patterns by using constraint relaxation on a source code meta-model and apply transformation rules based on the relaxed constraints to fix the defects.

An automatic method, which includes visualization of the results, was proposed by van Emden and Moonen [40]. It is based on a source model, which stores primitive smell aspects.

Different approaches for detecting design defects based on source code metrics were proposed by Marinescu [25], Munro [33], and Fontana and Maggioni [9].

The extensive work of Moha et al. proposes an automatic way to detect design defects using so-called rule cards and a DSL [30], and results in DECOR [26–29], a state-of-the-art method to automatically generate design defect detection algorithms.

Khomh et al. [16] propose an approach which extends DECOR and accommodates uncertainty by using Bayesian Belief Networks to rank and prioritize classes based on their probability of being part of an antipattern.

Oliveto et al. [34] automatically identify antipatterns using B-Splines: interpolation curves built from a set of metrics and their values for a given class. Antipatterns are detected based on the distance of a class's B-Spline from the B-Splines of known antipattern classes and of known good-quality classes.

A logic-based approach was proposed by Stoianov and Sora [38], who define Prolog rules to automatically detect design patterns and antipatterns.

An approach to automatically detect and correct design defects based on genetic programming has been proposed by Kessentini et al. [15]. This approach, unlike most others, does not require upfront definition of detection rules.

Most of the methods proposed in the literature are heavily language dependent. One of the exceptions is an approach proposed by Llano and Pooley [22], who define UML specifications for antipatterns at the design level and guidelines for manual refactoring of these antipatterns. A second language-independent approach was developed by Cortellessa et al. [7]. They formalize performance antipatterns and use an OCL specification language to describe expressions on UML models and query them.

Lastly, an approach to automatic refactoring of design defects based on relational algebra is described by Moha et al. [32].


Chapter 2

Problem & Approach

2.1 Problem Statement

The main goal of this project is to create a tool, Design Defect Detector, which automatically detects design defects in Python, and use it in an experiment. For the development of this tool, there are numerous considerations to be taken into account.

Detecting design defects requires a top-down approach, meaning the design defects to be detected have to be specified upfront, along with their characteristics. The reason is simple: a bottom-up approach yields information about abnormal metrics, but it is very difficult to tell which design defect they actually point towards [25].

As previously mentioned in Chapter 1.1, design defects are often defined in a very loose manner. Therefore, in order to automatically detect them, it is necessary to first transform their loose, textual definitions into quantifiable, concrete rules.

Operating on the level of source code has the benefit of full availability of technical information. However, design defects are defined at the design level. Thus, to detect design-level problems in source code, it is necessary to first abstract from the concrete implementation [6].

A relevant consideration for the Design Defect Detector is which Python version(s) should be supported. Python as a language is still evolving and, at the present time, multiple different versions are being actively used by developers. An online survey [3] about the year 2014, with 6,746 respondents, has shown that although Python 2.7 and Python 3.4 were the most widely used versions of Python, other versions were still rather common at that time. For the exact results of the survey, see Figure 2.1.

[Figure 2.1: Python versions used by the survey respondents: Python 2.5: 89 (1.3%), Python 2.6: 609 (9%), Python 2.7: 5,510 (81.7%), Python 3.2: 191 (2.8%), Python 3.3: 798 (11.8%), Python 3.4: 2,922 (43.3%), Python 3.5: 138 (2%)]


Some of the available Python versions have major differences between them; for instance, Python 3.x is not backwards compatible with Python 2.x.

Distinguishing between the versions is a very time-consuming task. Most projects on GitHub, which is the source of our data set, do not state which precise version of Python they are written in. To the best of our knowledge, there are no automatic tools that perform this task. In addition, a large portion of the projects are not written in a single Python version, but support both Python 2 and Python 3.

2.2 Research Questions

1. Which of the well known design defects can be detected in Python code?

2. Do Java code and Python code have comparable design defects?

3. What is the density of these design defects in Python code?

4. Do Java code and Python code have comparable design defect density?

2.3 Method & Approach

2.3.1 Design Defect Detector Architecture

The process used by our tool to detect design defects can be divided into 4 steps: (1) parsing, (2) AST construction, (3) model construction and (4) model analysis. This workflow is displayed in Figure 2.2. The first two steps of this process reflect the abstraction requirement. At the start, the input source code is parsed by an ANTLR-generated parser (Chapter 3). Because the resulting parse tree is not a sufficient level of abstraction for extracting the necessary information about the source code, the next step transforms it into an AST (Chapter 4).

After an adequate level of abstraction is achieved in the form of the AST, the following step focuses on creating a source code model by using fact extraction (Chapter 5). The main idea behind the model is making the necessary metrics and code characteristics easily accessible to the analyzer.

The final step is the code model analysis (Chapter 6), which detects and records the occurrences of the concrete design defects (as specified in Chapter 7).
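The four-step workflow can be sketched in miniature. This is only an illustration: Python's built-in ast module stands in for the ANTLR parser and the custom AST of steps 1 and 2, and the model and analysis steps below are simplified stubs, not the tool's real components.

```python
import ast

# Miniature sketch of the four-step workflow. Python's built-in `ast`
# module stands in for the ANTLR parser and custom AST (steps 1-2);
# the model and analysis steps are simplified, hypothetical stubs.

def build_model(tree):
    """Step 3: extract simple facts (here, method counts per class)."""
    model = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
            model[node.name] = {"methods": len(methods)}
    return model

def analyze(model, method_limit):
    """Step 4: flag classes whose metric exceeds an (illustrative) limit."""
    return [name for name, facts in model.items()
            if facts["methods"] > method_limit]

source = "class A:\n    def f(self): pass\n    def g(self): pass\n"
tree = ast.parse(source)          # steps 1-2: parse and build an AST
model = build_model(tree)         # step 3: model construction
defects = analyze(model, 1)       # step 4: model analysis
```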

2.3.2 Version Support

To make the Design Defect Detector widely applicable, it needs to support more than one Python version. We have decided to support all currently used versions, i.e. from Python 2.5 to Python 3.5.


Chapter 3

Parsing

The first step of the Design Defect Detector consists of parsing the source code of the Python projects. Since writing a parser manually is a tedious and time-consuming task, an automatic parser generator was used for this purpose. We decided to use ANTLR4, as there is a Python 3.3.5 grammar available, written by Bart Kiers1.

However, this grammar alone was insufficient for the Design Defect Detector because, as explained in Chapter 2.1, supporting a single version of Python would make the tool usable only under very specific conditions. To achieve wider support, we have combined the grammars for different Python versions into a single grammar.

3.1 Grammar

3.1.1 Resources

As previously mentioned, the grammar written by Bart Kiers was used as a baseline. The remaining resources include the official Python documentation pages2, which offer full grammar specifications for the different Python versions, and an ANTLR3 grammar for Python 2.5 authored by Frank Wierzbicki3.

3.1.2 Engineering

To create a grammar that covers all versions of Python starting at 2.5, we used a grammar adaptation method known as iterative grammar engineering. Essentially, this process consists of converging two existing grammars into a single, overapproximating one. The resulting grammar covers a superset of both languages described by the original grammars. The techniques of grammar adaptation and grammar convergence are explained in greater detail by Lämmel [17] and by Lämmel and Zaytsev [19], respectively.

The grammar written by Bart Kiers served as a baseline to be extended. To combine the grammars, we translated the available Python 2.5 grammar from ANTLR3 into ANTLR4 and subsequently merged it with the base grammar of Python 3.3. Afterwards, all the other grammar specifications from Python's documentation were compared to the base grammar, and the necessary parts were translated to ANTLR4 and merged in as well.

In general, the difficulty of the combination varied. Some parts were very straightforward, such as the definition of the small_stmt rule; see Table 3.1.

However, some of the rules differed more from each other, and the differences did not span a single rule only, but also its subrules and/or rules dependent on it; see Table 3.2.

Furthermore, there exists an issue with reserved keywords: print and exec are reserved keywords in Python 2.x but not in Python 3.x, and nonlocal is a reserved keyword in Python 3.x but not in Python 2.x. In addition, Python 3.5 introduced the keywords async and await. To account for this disparity, we added a production rule for identifier, which covered the standard identifiers shared among all versions, but also print, exec, nonlocal, async and await.

1 https://github.com/antlr/grammars-v4/tree/master/python3
2 https://docs.python.org

Python 2.7 specification / Python 2.7 ANTLR grammar (identical alternatives):

small_stmt
    : expr_stmt | print_stmt | del_stmt | pass_stmt | flow_stmt
    | import_stmt | global_stmt | exec_stmt | assert_stmt
    ;

Python 3.3 ANTLR grammar:

small_stmt
    : expr_stmt | del_stmt | pass_stmt | flow_stmt
    | import_stmt | global_stmt | nonlocal_stmt | assert_stmt
    ;

Converged 2.7 & 3.3 ANTLR grammar:

small_stmt
    : expr_stmt | print_stmt | del_stmt | pass_stmt | flow_stmt
    | import_stmt | global_stmt | exec_stmt | nonlocal_stmt | assert_stmt
    ;

Table 3.1: Merging of Python 2.7 and Python 3.3 grammars

3.1.3 Testing

To test the combined grammar's capabilities, we used a data set of nearly 40 million lines of Python code. This data set is further described in Chapter 8. The code was parsed by the ANTLR-generated parser. Each file that did not parse correctly was manually inspected afterwards. During the inspection, the file was categorized as either (1) not conforming to any of the supported grammars or (2) conforming to one or more of the supported grammars.

The files which belonged to the second category were collected and combined into a smaller data set, designated for quick testing of the discovered grammar issues and their fixes. All of the known issues were fixed.

3.1.4 Result

The resulting grammar is a level 4 grammar in the quality model of Lämmel and Verhoef [18], which means it was tested to parse several million lines of code and a realistic parser can be derived from it. The converged grammar consists of 143 terminal symbols, 217 nonterminal symbols and 394 production rules (following the standard definitions of the TERM, VAR and PROD metrics [4] as calculated by GrammarLab [41]).


Python 2.7 specification:

list_for: 'for' exprlist 'in' testlist_safe [list_iter]
testlist_safe: old_test [(',' old_test)+ [',']]
list_iter: list_for | list_if
list_if: 'if' old_test [list_iter]
old_test: or_test | old_lambdef
old_lambdef: 'lambda' [varargslist] ':' old_test

Python 2.7 ANTLR grammar:

list_for: FOR exprlist IN testlist_safe list_iter? ;
testlist_safe: old_test ((',' old_test)+ ','?)? ;
list_iter: list_for | list_if ;
list_if: IF old_test list_iter? ;
old_test: or_test | old_lambdef ;
old_lambdef: LAMBDA varargslist? ':' old_test ;

Python 3.3 ANTLR grammar:

comp_for: FOR exprlist IN or_test comp_iter? ;
comp_iter: comp_for | comp_if ;
comp_if: IF test_nocond comp_iter? ;
test_nocond: or_test | lambdef_nocond ;
lambdef_nocond: LAMBDA varargslist? ':' test_nocond ;

Combined 2.7 & 3.3 ANTLR grammar:

comp_for: FOR exprlist IN test_nocond ((',' test_nocond)+ ','?)? comp_iter? ;
comp_iter: comp_for | comp_if ;
comp_if: IF test_nocond comp_iter? ;
test_nocond: or_test | lambdef_nocond ;
lambdef_nocond: LAMBDA varargslist? ':' test_nocond ;

Table 3.2: Merging of the comprehension rules of Python 2.7 and Python 3.3


Chapter 4

AST Building

After the input source code is parsed into a parse tree, the next step in the Design Defect Detector is building an AST. In our implementation, we used a parse tree visitor for this purpose. A highly simplified layout of the AST is displayed in Figure 4.1.

The AST can be viewed as a stepping stone to the code model (further explained in Chapter 5). Although the parse tree generated by ANTLR contains the same information as our AST and more (with the exception of accurate LOC measurements), this information is not easily accessible. The parse tree is too bloated and segmented into far too many small parts for our purposes. Thus, while creating the code model directly from the parse tree would be possible in theory, we deemed creating an AST beforehand a better option.

Figure 4.1: Simplified AST structure

The AST structure is explained in detail in Appendix A.
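The contrast between a bloated parse tree and a compact AST can be illustrated with Python's own ast module. This is only an analogy: the thesis builds its own AST from the ANTLR parse tree, and the structures differ, but the principle of keeping only structurally relevant nodes is the same.

```python
import ast

# For `x = 1 + 2`, a full parse tree keeps every token and every
# intermediate grammar rule the input passed through; an AST keeps
# only the nodes that matter structurally. (Python's built-in `ast`
# is used here as an analogy, not as the thesis's actual AST.)

tree = ast.parse("x = 1 + 2")
assign = tree.body[0]                                # a single Assign node
node_types = [type(n).__name__ for n in ast.walk(tree)]
```

The whole statement collapses to fewer than ten nodes, whereas a raw parse tree for the same input would thread through dozens of intermediate expression rules.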


Chapter 5

Code Model

The two previous parts of the Design Defect Detector workflow, parsing (Chapter 3) and AST building (Chapter 4), dealt with abstraction from the concrete implementation of the supplied source code. The construction of the code model goes further than that. The main purpose of the code model is to simplify querying for the necessary metrics and other characteristics by linking together the different parts (such as classes and variables). Building the code model is therefore based on fact extraction, which is explained in more detail by Lin and Holt [21].

5.1 Design

The code model is a structure inspired by the general structure of any project based on the object-oriented paradigm. At the heart of the code model there are four classes: Project, Module, Class and Subroutine. A simplified structure of the code model is displayed in Figure 5.1; a detailed view is displayed in Figure 5.2.

Figure 5.1: Simplified view of the source code model

The individual parts of the code model are explained below.

Project

Project is the root node of the code model's heart and, at the same time, the simplest of all the code model parts. Its main purpose is to aggregate the modules based on their location in the folder structure.


Figure 5.2: Source code model structure

ContentDefinitions

The abstract class ContentDefinitions contains the definitions of the different Classes, Subroutines and Variables.

ContentContainer

ContentContainer is an abstract class which, in addition to the definitions, also contains references to the different Classes, Subroutines and Variables.

Module

Module, apart from being an implementation of ContentContainer, also holds references to all the Classes and Modules that were imported inside of it.

Class

Class, another implementation of ContentContainer, is the most heavily used object during the analysis of the code model. Many design defects are defined at the class level, so classes play a crucial role in the process of detecting them. In our code model, a class holds references to its parent object (e.g., a Module) and to its superclasses.


Subroutine

Subroutines are the simplest type of ContentContainer. Their most important task is holding a reference to their parent object (i.e., a Class or Module).

5.2 Construction

The construction of the code model works mostly with the provided AST. It utilizes an AST visitor and requires two passes. The first pass focuses on the primitive properties, i.e. properties which can be directly observed in code. An example of a primitive property is the name of a variable used in a particular class. The second pass obtains the derived properties, i.e. properties that are inferred from other properties. An example of a derived property is a reference to a class that was imported by a specific module.

The first step is the construction of a skeleton and filling it with simple, observable information. The AST visitor creates the individual classes and adds the important information, such as the names of superclasses.

The second pass consists mostly of linking the model parts to each other. A second, smaller AST visitor is used for this. It visits the import statements and, based on the existing skeleton, creates references from one class to another. After its completion, it continues to resolve dependencies such as references to a variable of a different class. This process is facilitated by creating a Scope in each Module and passing it down to the defined Classes and Subroutines.
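The two passes can be sketched against Python's built-in ast module. The thesis visits its own custom AST, so the class names and data layout below are illustrative assumptions; only the pass structure (observable facts first, derived links second) mirrors the text.

```python
import ast

# Illustrative two-pass construction over Python's built-in `ast`
# module; the class names and dictionary layout are assumptions,
# not the tool's real implementation.

class SkeletonPass(ast.NodeVisitor):
    """Pass 1: collect primitive, directly observable properties."""
    def __init__(self):
        self.classes = {}
    def visit_ClassDef(self, node):
        bases = [b.id for b in node.bases if isinstance(b, ast.Name)]
        self.classes[node.name] = {"bases": bases, "resolved": []}
        self.generic_visit(node)

def link_pass(classes):
    """Pass 2: derive properties by linking names against the skeleton."""
    for facts in classes.values():
        facts["resolved"] = [b for b in facts["bases"] if b in classes]

tree = ast.parse("class Base: pass\nclass Child(Base, External): pass\n")
skeleton = SkeletonPass()
skeleton.visit(tree)            # first pass: names and superclass names
link_pass(skeleton.classes)     # second pass: resolve references
```

Only `Base` is resolvable within the skeleton; `External` stays an unresolved name, just as an import from outside the analyzed project would.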


Chapter 6

Analysis

After the code model is constructed, the code analyzer uses it to detect design defects. This constitutes the nal step of the Design Defect Detector.

The most important components of the analyzer are shown in Figure 6.1 and explained below.

Figure 6.1: Simplified overview of the analyzer classes

Metrics

Metrics is the class that holds the knowledge about all the metrics required for the semantical relative filters or statistical filters, e.g. the top 20% of values (further explained in Chapter 7). It also supplies the knowledge about the limits for these filters once all the metrics are stored.

IntMetricVals


filters. Our implementation only uses integer metrics, such as LOC; however, this could easily be extended to different types.

Detectors

Each design defect has its own, separate detector. A Detector is the component which decides whether an object (class or subroutine) is a design defect or not. This decision is a two-step process, facilitated by the Metrics.

In the first step, the detector checks whether the object possesses those properties of the given design defect which are easily observable from the code model (e.g., whether the class uses global variables). If so, it is judged to be a candidate for a design defect and the Detector stores the metrics specific to this object for further inspection.

The second step occurs after all the objects of all projects have been through the first step. The Detector then checks all the design defect candidates and compares their metrics against the limits for semantical relative filters or statistical filters, supplied by the Metrics. At this point, the Detector either confirms or rejects the object as a design defect.
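The two-step decision can be sketched for a single, hypothetical detector; the class name, metric and thresholds below are illustrative, not the tool's actual implementation.

```python
# Sketch of the two-step decision for one (hypothetical) detector.
# The metric, the candidate threshold and the final limit are all
# illustrative values, not the tool's real configuration.

class LongMethodDetector:
    def __init__(self):
        self.candidates = {}              # object name -> stored LOC

    def first_step(self, name, loc, min_loc=10):
        # Step 1: cheap check on directly observable properties;
        # promising objects are stored as candidates.
        if loc >= min_loc:
            self.candidates[name] = loc

    def second_step(self, limit):
        # Step 2: once every object of every project has passed step 1,
        # compare stored metrics against the limit supplied by Metrics.
        return [n for n, loc in self.candidates.items() if loc > limit]

detector = LongMethodDetector()
for name, loc in [("a", 5), ("b", 40), ("c", 120)]:
    detector.first_step(name, loc)
confirmed = detector.second_step(limit=100)
```

Splitting the decision this way matters because the second-step limit (an outlier bound or percentile cutoff) can only be computed after the metrics of all projects have been collected.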

Register

Register is a simple class that facilitates registering all the Detectors and supplies them with Metrics. It also triggers the metric collection phase and the finalization phase.

6.1 Technical Implementation

The analysis was the technically most challenging part to implement, due to the large quantities of source code processed by the Design Defect Detector. The analyzer had to go through the code model of each individual project and store the required information about it.

Since the data was far too large to be stored in memory, our implementation makes extensive use of the file system. The information about all collected metrics, and also about potential design defects, was stored in files. For this purpose, we implemented file-based collections, a diagram of which is displayed in Figure 6.2.

When running the Design Defect Detector on our data set of about 32 million LOC, the necessary temporary data reached 2.5 GB in size.


Chapter 7

Design Defect Detection

7.1 Technique

To detect the design defects, we have used the mechanisms of filtering and composition of metrics [25] to define the design defect detection rules in a quantifiable manner.

Composition is a simple application of logical operations, such as and, or and not. Filtering is based on applying different types of filters, a classification of which can be seen in Figure 7.1.

Figure 7.1: Different types of filters defined by Marinescu [25]

For the Design Defect Detector, we mostly use a relative semantical percentage-based filter, e.g. TopValues(10%), and a statistical filter, i.e. the box plot (Figure 7.2) with 2 types of considered outliers, mild and extreme.

Figure 7.2: Box plot structure

In our definitions of design defects, we use the following functions:

• MildOutlier(M): the metric M is a mild outlier among all measured values of M. Mild outliers are at least 1.5 IQRs1 away from the median.


• ExtremeOutlier(M): the metric M is an extreme outlier among all measured values of M. Extreme outliers are at least 3 IQRs away from the median.

• TopXPercent(M): the metric M is in the top X percent of all measured values of M.
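The three functions above translate directly into code. Note that this sketch follows the thesis's wording and measures outlier distance from the median (a textbook box plot would measure it from the quartiles); the quartile computation itself is an implementation choice.

```python
import statistics

# Direct transcription of the three filter functions defined above.
# Outlier distance is measured from the *median*, as the definitions
# state; the choice of quantile method is an assumption.

def _iqr(values):
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

def mild_outlier(m, values):
    return abs(m - statistics.median(values)) >= 1.5 * _iqr(values)

def extreme_outlier(m, values):
    return abs(m - statistics.median(values)) >= 3 * _iqr(values)

def top_x_percent(m, values, x):
    k = max(1, round(len(values) * x / 100))    # size of the top slice
    return m >= sorted(values, reverse=True)[k - 1]
```

For the values 1 through 10, the median is 5.5 and the IQR is 4.5, so a measurement of 13 is a mild outlier but not an extreme one.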

7.2 Design Defect Catalogue

The design defects detected by the Design Defect Detector are listed in Table 7.1.

Design Defect             Description  Type         Definition
Feature Envy              § 7.2.1      Code smell   Fowler et al. [10]
Data Class                § 7.2.2      Code smell   Fowler et al. [10]
Long Method               § 7.2.3      Code smell   Fowler et al. [10]
Long Parameter List       § 7.2.4      Code smell   Fowler et al. [10]
Large Class               § 7.2.5      Code smell   Fowler et al. [10]
Blob                      § 7.2.6      Antipattern  Brown et al. [2]
Swiss Army Knife          § 7.2.7      Antipattern  Brown et al. [2]
Functional Decomposition  § 7.2.8      Antipattern  Brown et al. [2]
Spaghetti Code            § 7.2.9      Antipattern  Brown et al. [2]

Table 7.1: Design defects detected by Design Defect Detector

7.2.1 Feature Envy

Description

Feature Envy is a code smell defined by Fowler et al. [10] as a method that seems more interested in the data of a different class than in that of the one it is in. This often means invoking a large number of accessor methods.

Implementation

The implementation of Feature Envy in our Design Defect Detector is based on the definition proposed by Li & Shatnawi [20].

Formal denition:

AID > 4 ∧ Top10Percent(AID) ∧ ALD < 3 ∧ NRC < 3

Used metrics & properties:

• AID – Access of Import Data, the amount of referenced variables that do not belong to this class

• ALD – Access of Local Data, the amount of referenced variables that do belong to this class
• NRC – Number of Related Classes, the amount of different classes referenced in this one
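For intuition, a hypothetical example (the class and method names are illustrative, not from the evaluated corpus): the shipping_label method reads Customer's data through accessors on almost every line, while touching no data of its own class, so its AID is high and its ALD is low.

```python
class Customer:
    def __init__(self, name, street, city):
        self._name, self._street, self._city = name, street, city
    def get_name(self):   return self._name
    def get_street(self): return self._street
    def get_city(self):   return self._city

class Invoice:
    def __init__(self, customer):
        self.customer = customer
    def shipping_label(self):
        # "Envious" method: every access targets Customer's data,
        # none targets Invoice's own fields
        c = self.customer
        return "%s\n%s\n%s" % (c.get_name(), c.get_street(), c.get_city())
```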

7.2.2 Data Class

Description

Data Class, a code smell defined by Fowler et al. [10], is a class that only contains data fields and accessor/mutator methods for these fields. A Data Class is used solely for the purpose of holding data.


Implementation

Formal definition:

ExtremeOutlier(AOPuF) ∨ ExtremeOutlier(AOA)

Used metrics & properties:

• AOPuF – Amount of Public Fields
• AOA – Amount Of Accessors
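A minimal illustrative example (ours, not from the evaluated corpus): the class below holds data and exposes only trivial accessors and mutators, which is exactly the shape the AOA metric reacts to.

```python
class Point:
    """A Data Class: private fields plus trivial accessors/mutators only."""
    def __init__(self, x, y):
        self._x, self._y = x, y
    def get_x(self):    return self._x
    def set_x(self, x): self._x = x
    def get_y(self):    return self._y
    def set_y(self, y): self._y = y
```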

7.2.3 Long Method

Description

Long Method, a code smell defined by Fowler et al. [10], is a method that is too long and should be decomposed into smaller pieces. Fowler et al. state that comments (signaling some kind of semantic distance), conditionals or loops often signify a place where the code should be extracted to a separate method.

Implementation

The implementation of Long Method in our Design Defect Detector is based on the definition proposed by Moha et al. [29].

Formal definition:

ExtremeOutlier(LOC)

Used metrics & properties:

• LOC – amount of Lines Of Code for the given method; excludes comments and blank lines

7.2.4 Long Parameter List

Description

A code smell defined by Fowler et al. [10], Long Parameter List pertains to method definitions having too many parameters.

Implementation

The implementation of Long Parameter List in our Design Defect Detector is based on the definition proposed by Moha et al. [29].

Formal definition:

ExtremeOutlier(NOP)

Used metrics & properties:

• NOP – Number Of Parameters the given function takes

7.2.5 Large Class

Description

Fowler et al. [10] have defined a Large Class as a code smell occurring when a single class is trying to do too much. The instance variables of such a class are used in a lot of places, possibly as part of duplicated code.


Implementation

The implementation of Large Class in our Design Defect Detector is based on the definition proposed by Moha et al. [29].

Formal definition:

ExtremeOutlier(NMD + NAD)

Used metrics & properties:

• NMD – Number of Methods Defined
• NAD – Number of Attributes Defined

7.2.6 Blob

Description

Blob, as defined by Brown et al. [2], is an antipattern that consists of one complex class surrounded by multiple Data Classes. The complex class monopolizes all the processing, while the only responsibility of the Data Classes is to encapsulate data.

Implementation

The implementation of Blob in our Design Defect Detector is based on the definition proposed by Moha et al. [27].

Formal definition:

(HasControllerName ∨ HasControllerMethods) ∧ MildOutlier(LOC) ∧ MildOutlier(LCOM) ∧ RDC > 2

Used metrics & properties:

• HasControllerName – the class' name contains words such as Manage, Process, Control, etc.
• HasControllerMethods – the class contains methods that have a controller name

• LOC – amount of Lines Of Code for the given class; excludes comments and blank lines
• LCOM – Lack of Cohesion of Methods [5]

• RDC – amount of Related Data Classes, i.e. classes that have Top15Percent(AOA)
• AOA – Amount Of Accessors
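The Blob rule relies on LCOM; the thesis cites Chidamber & Kemerer [5] but does not spell the exact variant out here. Below is a minimal sketch of the original Chidamber & Kemerer formulation, with an assumed input shape (a mapping from method names to the set of fields each method uses); it is an illustration, not necessarily the Design Defect Detector's implementation.

```python
from itertools import combinations

def lcom(method_fields):
    """Chidamber & Kemerer LCOM: P (method pairs sharing no field)
    minus Q (pairs sharing at least one field), floored at zero."""
    p = q = 0
    for a, b in combinations(method_fields.values(), 2):
        if a & b:          # the two methods use at least one common field
            q += 1
        else:              # the two methods are field-disjoint
            p += 1
    return max(p - q, 0)
```

A high LCOM signals that the class's methods operate on disjoint groups of fields, i.e. the class bundles unrelated responsibilities.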

7.2.7 Swiss Army Knife

Description

Brown et al. [2] define the Swiss Army Knife antipattern as a class with too many responsibilities. This can easily be observed by having a large number of methods, implementing too many interfaces and/or using multiple inheritance.

Implementation

The implementation of Swiss Army Knife in our Design Defect Detector is based on the definition proposed by Moha et al. [27].

Formal denition:

ExtremeOutlier(SUP)

Used metrics & properties:


7.2.8 Functional Decomposition

Description

Functional Decomposition was defined by Brown et al. [2] as a class that does not leverage object-oriented principles such as inheritance and polymorphism. It usually has a single action as a function and all its attributes are private and used only inside the class. It often has a function-like name (e.g., CalculateInterest).

Implementation

The implementation of Functional Decomposition in our Design Defect Detector is based on the definition proposed by Moha et al. [27].

Formal denition:

HasProceduralName ∧ NoInheritance ∧ RCOMPF > 2

Used metrics & properties:

• HasProceduralName – the class' name contains words such as Make, Create, Exec, etc.
• NoInheritance – the class has no superclasses

• RCOMPF – amount of Related Classes with One Method and a lot of Private Fields, i.e. classes that only have one method and have MildOutlier(AOPrF)

• AOPrF – Amount of Private Fields

7.2.9 Spaghetti Code

Description

This antipattern is defined by Brown et al. [2] as a piece of code lacking any structure. This is often detectable by the use of global variables instead of parameters for methods. Spaghetti Code does not make use of basic OO concepts such as inheritance or polymorphism.

Implementation

The implementation of Spaghetti Code in our Design Defect Detector is based on the definition proposed by Moha et al. [27].

Formal denition:

HasProceduralName ∧ NoInheritance ∧ UsesGlobals ∧ HasLongMethod ∧ Top15Percent(MNP)

Used metrics & properties:

• HasProceduralName – the class' name contains words such as Make, Create, Exec, etc.
• NoInheritance – the class has no superclasses

• UsesGlobals – at least one of the variables referenced by this class is a global variable
• HasLongMethod – at least one of the methods of this class has Top15Percent(LOC)
• LOC – amount of Lines Of Code; excludes comments and blank lines


Chapter 8

Evaluation Data Set

To obtain the data set used in our experiment, we have used PyPI in combination with GitHub and the GitHub API. For more information about PyPI and GitHub, see Chapter 1.2.

8.1 Collected GitHub Repository Links

Initially, we automatically collected a comprehensive list of all GitHub links belonging to any PyPI package submitted under the category Python 2.5 and above. We have obtained over 17 thousand unique GitHub repository links.

To ensure that the data set is suitable for the purposes of this research, this list was filtered in a two-step process. The overview of the process is shown in Table 8.1.

Collected unique GitHub links                              17 568
Unavailable, fork or non-Python projects                   −3 307
Initial data set                                           14 261

Initial data set                                           14 261
Small projects or projects with insufficient parse ratio  −10 140
Final data set                                              4 121

Table 8.1: Projects obtained from GitHub and the filtering leading to the final data set

8.2 Filtering: First Step

In the first filtering step, there were three criteria that the repositories had to fulfill to become part of the initial data set:

1. The repository is publicly available on GitHub.

2. The repository is a Python repository. We defined a Python repository as a repository in which the usage of each non-Python language individually is lower than that of Python. For example, if a repository consists of 40% Python, 30% HTML and 30% CSS, it is a Python repository.

3. The repository is not a fork. As one of our objectives is to find out the density of design defects, it would be detrimental to include multiple forks of the same project in our research. To prevent this, we queried the GitHub API for information about the collected GitHub repositories. Any fork repository was removed from the result set and instead replaced with its parent (recursively).
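Criteria 2 and 3 can be sketched over the metadata shapes returned by the GitHub API (the /repos/{owner}/{repo} resource exposes a fork flag and a nested parent object, and /repos/{owner}/{repo}/languages maps each language to its bytes of code). This is a simplified sketch of the two checks, not the harvesting code used in the study:

```python
def is_python_repository(languages):
    """Criterion 2: every non-Python language is used less than Python.
    `languages` maps a language name to bytes of code, in the shape of the
    GitHub /repos/{owner}/{repo}/languages response."""
    python = languages.get("Python", 0)
    return all(count < python
               for lang, count in languages.items() if lang != "Python")

def resolve_fork(repo_info):
    """Criterion 3: replace a fork by its (recursive) parent, given
    repository metadata with a `fork` flag and a nested `parent` object."""
    while repo_info.get("fork") and repo_info.get("parent"):
        repo_info = repo_info["parent"]
    return repo_info["full_name"]
```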


This process used the information from the GitHub API and as a result 3 307 repository links were filtered out.

8.3 Initial Data Set

After the first filtering step, there were over 14 thousand GitHub repository links left. These projects created our initial data set. The measured statistics about this set can be seen below in Table 8.2.

                Total        Correctly Parsed   Parsed %
Modules         331 361      329 945            99.57
Lines Of Code   39 474 492   39 386 269         99.78
Classes         551 358

Table 8.2: Initial Data Set – statistics about 14 261 projects (Note: the number of classes was obtained from each module's parse tree. The total amount of classes is unknown, as it was not possible to build all parse trees correctly.)

This data set was used to validate our combined grammar, as explained in Chapter 3. It was also used as an input for the second filtering step.

8.4 Filtering: Second Step

To apply the second filtering step, we performed a basic meta-analysis on the initial data set. Each of the projects was parsed with the combined grammar and we have obtained information about the amount of lines of code, modules, defined classes and parse ratio.

We have applied two more filters with arbitrarily chosen limits:

1. Removing all projects that are too small. A small project contains less than 20 classes. This filter was applied because a large portion of the design defects which we are trying to detect are related to OOP. If a project is mostly functional, it is irrelevant to our research.

2. Removing all projects whose parse ratio is lower than 90%. If a project does not parse correctly, it is not possible to infer the relationships between different classes, methods and variables and therefore it is also impossible to perform the design defect detection.

8.5 Final Data Set

After the filtering, we have obtained the final data set of 4 121 repositories. The information about this data set is displayed in Table 8.3, Figure 8.1 and Figure 8.2.

                Total        Correctly Parsed   Parsed %
Modules         238 861      238 503            99.85
Lines Of Code   32 094 184   32 058 823         99.89
Classes         488 008

Table 8.3: Final Data Set – statistics about 4 121 projects (Note: the number of classes was obtained from each module's parse tree. The total amount of classes is unknown, as it was not possible to build all parse trees correctly.)


Figure 8.1: Distribution per project - Lines Of Code, number of modules and number of classes


Chapter 9

Results

We ran the Design Defect Detector on the data set of 32 million lines of code (described in Chapter 8). The amount of detected design defects is shown in Table 9.1. Table 9.2 shows the amount of detected design defects (and their density) per design defect type.

Design Defect              Instances
Feature Envy                   5 103
Data Class                    13 537
Long Method                   79 367
Long Parameter List           10 429
Large Class                   16 576
Blob                              24
Functional Decomposition           0
Spaghetti Code                     1
Swiss Army Knife              30 011

Table 9.1: Results – amount of found occurrences per design defect

Design Defect Type   Amount   Instances   Average Density
Code Smells               5     125 012              7.79
Antipatterns              4      30 036              2.34

Table 9.2: Results – amount of found occurrences per design defect type

We can see that the amount of detected defects was relatively high for Feature Envy, Data Class, Long Method, Long Parameter List, Large Class and Swiss Army Knife. The amount of detected Blob and Spaghetti Code defects was very low. We were not able to detect Functional Decomposition. To compare the density of design defects, we used the data Moha et al. [27, 29] obtained with DECOR. This data is listed in Table 9.3.

Design Defect              Instances   Total LOC   Source
Long Method                       22      21 267     [29]
Long Parameter List               43      21 267     [29]
Large Class                       13      21 267     [29]
Blob                             150     516 092     [27]
Functional Decomposition         179     516 092     [27]
Spaghetti Code                   363     516 092     [27]
Swiss Army Knife                 441     516 092     [27]

Table 9.3: Design defect data obtained with DECOR [27, 29]


The calculated density of each design defect per 10 000 lines of code for both the Design Defect Detector and DECOR is shown in Table 9.4.

                               Density
Design Defect                DDD    DECOR
Long Method                24.73    10.34
Long Parameter List         3.25    20.22
Large Class                 5.16     6.11
Blob                        0.01     2.90
Functional Decomposition    0.00     3.47
Spaghetti Code              0.00     7.03
Swiss Army Knife            9.35     8.54
Average                     6.07     8.37

Table 9.4: Comparison of normalized design defect density (per 10 000 LOC) for Python and Java, detected by Design Defect Detector (DDD) and DECOR respectively
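The densities in Table 9.4 are plain normalizations of the counts in Tables 9.1 and 9.3; for example:

```python
def density_per_10k_loc(instances, total_loc):
    """Design defect density normalized per 10 000 lines of code."""
    return round(instances / total_loc * 10000, 2)

# Long Method: 79 367 instances in our 32 094 184 LOC Python data set
ddd_long_method = density_per_10k_loc(79367, 32094184)    # 24.73
# Long Method: 22 instances in DECOR's 21 267 LOC Java data set [29]
decor_long_method = density_per_10k_loc(22, 21267)        # 10.34
```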

Figure 9.1: Graphical representation of Table 9.4 (bar chart of the DDD and DECOR densities)

9.1 Answers

This section contains answers to this project's research questions based on the obtained results.

1. Which of the well-known design defects can be detected in Python code?

We can conclude that the following design defects are detectable: Feature Envy, Data Class, Long Method, Long Parameter List, Large Class, Blob, Spaghetti Code and Swiss Army Knife. It is not possible to say whether Functional Decomposition can be detected or not.

2. Do Java code and Python code have comparable design defects?

Since we have been able to detect the majority of the design defects that have previously been detected in Java source code, we conclude that similar design defects can be detected in Java and Python source code.

3. What is the density of these design defects in Python code?

On average, we have detected a density of 6.07 design defects per 10 000 lines of code. The individual measurements per design defect are listed in Table 9.4.


4. Do Java code and Python code have comparable design defect density?

The measured average densities of design defects per 10 000 lines of code for Python and Java were 6.07 and 8.37 respectively. The individual measurements per design defect are listed in Table 9.4 and displayed in Figure 9.1. This shows that the design defect densities are comparable and, according to our results, the density is usually slightly lower in Python code.


Chapter 10

Discussion

Python is generally considered much more concise than Java, meaning that developers can accomplish the same task with fewer lines of code. Our results show that in most cases, the design defect density per 10 000 lines of code in Python is lower than that in Java. This result suggests that Python is superior to Java in the aspect of design defects. However, there are arguments which undermine this theory, such as that the concrete definitions which work for Java source code are insufficient in the context of Python and need to be tailored to this specific context.

Another interesting fact is that in our experiment, the detection rate of code smells in Python was higher than that of antipatterns. This could mean that whilst Python code contains low-level issues just like Java code, it has fewer architectural problems compared to Java code.

10.1 Threats to Validity

A common threat to the validity of all design defect detection studies is introduced by the loose, textual definitions of design defects. Due to their openness to interpretation, personal judgment impacts the selection of suitable metric combinations and their thresholds.

Another threat is the data set we have used for evaluation. It only consists of public GitHub repositories. Although we have examined 4 121 repositories, we cannot guarantee it to be a representative sample. For instance, there could be fundamental differences between corporate Python code and the open source code available on GitHub. One could also argue that GitHub attracts certain types of developers and projects.

The study is also limited by the amount of design defects we have studied. It would require a more extensive study to examine the density of Python design defects in general and make a comparison to, for instance, Java. Furthermore, it is also possible that Python as a language has its own specific design defects. Although we are aware of this option, its further consideration falls outside of the scope of this project.

Lastly, we have not been able to make any statement about the precision or recall of the Design Defect Detector. This means that the comparison to results in Java might be skewed one way or the other. If the Design Defect Detector produces a significant number of false negatives, the Python design defect density could be higher than that of Java. On the other hand, if the Design Defect Detector produces a lot of false positives, the design defect density of Python would be lower than what we have measured. This issue would also mean that the definitions successfully used for detecting design defects in Java code first need to be adjusted for our context to work well.

10.2 Future Work

In this thesis, we have developed a tool to detect a subset of known design defects in Python source code and compared the density of these defects to the data from previous studies conducted on Java. This work could be further expanded by adding detection for the design defects that were not included in our study. Conducting a study with a larger set of design defects would bring more accurate results about their average density. Furthermore, measuring the precision and recall of the Design Defect Detector would help support (or dismiss, as stated in Chapter 10.1) the claims made based on its results.

There are numerous other object-oriented languages, such as C++ or C#, which could be studied in a similar fashion and compared to the results of our predecessors and also to our results.

Another topic worth exploring would be the relation between the detected design defects in Python and the bugginess of the software, similarly to what Taba et al. did for Java [39].

Lastly, it would also be very valuable to study the possibility that Python has its own, context-specific design defects that do not occur (or occur only rarely) in other programming languages.


Chapter 11

Conclusion

We have confirmed the presence of various design defects in Python source code. Our tool, the Design Defect Detector, has been able to detect 8 out of 9 design defects chosen for our study. The size of the inspected data set was 4 121 GitHub repositories, consisting of 32 million lines of code.

The density of the found design defects varied per type. The average density was 6.07 design defects per 10 000 lines of code. The most commonly found design defects were Long Method (79 367 found instances) and Swiss Army Knife (30 011 found instances). The least commonly found design defects were Spaghetti Code (1 found instance) and Blob (24 found instances). Overall, the measured density was higher for code smells than for antipatterns.

We have compared the detected design defect density in Python to the results of DECOR [27], a state-of-the-art tool built for the detection of design defects in Java source code. Generally, the density we have measured in Python (on average 6.07 design defects per 10 000 lines of code) was slightly lower than the density measured in Java (on average 8.37 design defects per 10 000 lines of code).

Our research has reaffirmed that the context of a programming language is one of the essential factors in design defect detection. Because of this, the existing research is still very incomplete and requires studies similar to ours to be conducted on different programming languages in the future. Furthermore, the possibility of design defects specific to Python should also be examined in future studies.


Bibliography

[1] Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, 2006.

[2] William J. Brown, Raphael C. Malveau, Hays W. Skip McCormick III, and Thomas J. Mowbray. AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. Wiley, 1998.

[3] Bruno Cauet. Python 2.x vs 3.x usage survey, 2014 edition, Jan 2015. http://blog.frite-camembert.net/python-survey-2014.html.

[4] Julien Cervelle, Matej Črepinšek, Rémi Forax, Tomaž Kosar, Marjan Mernik, and Gilles Roussel. On Defining Quality Based Grammar Metrics. In Proceedings of the International Multiconference on Computer Science and Information Technology, pages 651–658. IEEE, 2009.

[5] Shyam R. Chidamber and Chris F. Kemerer. Towards a metrics suite for object oriented design. Conference proceedings on Object-oriented programming systems, languages, and applications – OOPSLA '91, 1991.

[6] Oliver Ciupke. Automatic detection of design problems in object-oriented reengineering. Proceedings of Technology of Object-Oriented Languages and Systems – TOOLS 30 (Cat. No. PR00278), 1999.

[7] Vittorio Cortellessa, Antinisca Di Marco, Romina Eramo, Alfonso Pierantonio, and Catia Trubiani. Digging into UML models to remove performance antipatterns. Proceedings of the 2010 ICSE Workshop on Quantitative Stochastic Models in the Verification and Design of Software Systems – QUOVADIS '10, 2010.

[8] Karim Dhambri, Houari Sahraoui, and Pierre Poulin. Visual detection of design anomalies. 2008 12th European Conference on Software Maintenance and Reengineering, Apr 2008.

[9] Francesca Arcelli Fontana and Stefano Maggioni. Metrics and antipatterns for software quality evaluation. 2011 IEEE 34th Software Engineering Workshop, Jun 2011.

[10] Martin Fowler, Kent Beck, John Brant, William Opdyke, and Don Roberts. Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, 1999.

[11] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design patterns: elements of reusable object-oriented software. Pearson Education, 1994.

[12] Dick Grune and Ceriel J. H. Jacobs. Parsing Techniques: A Practical Guide. Springer, second edition, 2010.

[13] Dick Grune, Kees van Reeuwijk, Henri E. Bal, Ceriel J. H. Jacobs, and Koen G. Langendoen. Modern Compiler Design. Springer, second edition, 2012.

[14] Yann-Gaël Guéhéneuc and Hervé Albin-Amiot. Using design patterns and constraints to automate the detection and correction of inter-class design defects. Proceedings 39th International Conference and Exhibition on Technology of Object-Oriented Languages and Systems. TOOLS 39, 2001.


[15] Marouane Kessentini, Wael Kessentini, Houari Sahraoui, Mounir Boukadoum, and Ali Ouni. Design defects detection and correction by example. 2011 IEEE 19th International Conference on Program Comprehension, Jun 2011.

[16] Foutse Khomh, Stéphane Vaucher, Yann-Gaël Guéhéneuc, and Houari Sahraoui. A Bayesian approach for the detection of code and design smells. 2009 Ninth International Conference on Quality Software, Aug 2009.

[17] Ralf Lämmel. Grammar adaptation. FME 2001: Formal Methods for Increasing Software Productivity, pages 550–570, 2001.

[18] Ralf Lämmel and Chris Verhoef. Semi-automatic grammar recovery. Software: Practice and Experience, 31(15):1395, Dec 2001.

[19] Ralf Lämmel and Vadim Zaytsev. An introduction to grammar convergence. Integrated Formal Methods, pages 246–260, 2009.

[20] Wei Li and Raed Shatnawi. An empirical study of the bad smells and class error probability in the post-release object-oriented system evolution. Journal of Systems and Software, 80(7):1120–1128, Jul 2007.

[21] Yuan Lin and Richard C. Holt. Formalizing fact extraction. Electronic Notes in Theoretical Computer Science, 94:93–102, May 2004.

[22] Maria Teresa Llano and Rob Pooley. UML specification and correction of object-oriented antipatterns. 2009 Fourth International Conference on Software Engineering Advances, Sep 2009.

[23] Jon Loeliger. Version Control with Git: Powerful Tools and Techniques for Collaborative Software Development. O'Reilly Media, Inc., 1st edition, 2009.

[24] Mika Mäntylä, Jari Vanhanen, and Casper Lassenius. A taxonomy and an initial empirical study of bad smells in code. International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings., 2003.

[25] Radu Marinescu. Detection strategies: Metrics-based rules for detecting design flaws. 20th IEEE International Conference on Software Maintenance, 2004. Proceedings., 2004.

[26] Naouel Moha. Detection and correction of design defects in object-oriented designs. Companion to the 22nd ACM SIGPLAN conference on Object oriented programming systems and applications companion - OOPSLA '07, 2007.

[27] Naouel Moha, Yann-Gaël Guéhéneuc, Laurence Duchien, and Anne-Françoise Le Meur. DECOR: A method for the specification and detection of code and design smells. IEEE Transactions on Software Engineering, 36(1):20–36, Jan 2010.

[28] Naouel Moha, Yann-Gaël Guéhéneuc, Anne-Françoise Le Meur, and Laurence Duchien. A domain analysis to specify design defects and generate detection algorithms. Lecture Notes in Computer Science, pages 276–291, 2008. Shorter version of [29].

[29] Naouel Moha, Yann-Gaël Guéhéneuc, Anne-Françoise Le Meur, Laurence Duchien, and Alban Tiberghien. From a domain analysis to the specification and detection of code and design smells. Formal Aspects of Computing, 22(3):345–361, May 2009.

[30] Naouel Moha, Yann-Gaël Guéhéneuc, and Pierre Leduc. Automatic generation of detection algorithms for design defects. 21st IEEE/ACM International Conference on Automated Software Engineering (ASE'06), 2006.

[31] Naouel Moha, Duc-loc Huynh, and Yann-Gaël Guéhéneuc. A taxonomy and a first study of design pattern defects. In 1st International Workshop on Design Pattern Theory and Practice, part of the STEP'05 workshop, 2005.


[32] Naouel Moha, Amine Mohamed Rouane Hacene, Petko Valtchev, and Yann-Gaël Guéhéneuc. Refactorings of design defects using relational concept analysis. Lecture Notes in Computer Science, pages 289–304, 2008.

[33] Matthew James Munro. Product metrics for automatic identification of bad smell design problems in Java source code. 11th IEEE International Software Metrics Symposium (METRICS'05), 2005.

[34] Rocco Oliveto, Foutse Khomh, Giuliano Antoniol, and Yann-Gaël Guéhéneuc. Numerical signatures of antipatterns: An approach based on B-splines. 2010 14th European Conference on Software Maintenance and Reengineering, Mar 2010.

[35] Terence Parr. ANTLR – ANother Tool for Language Recognition, 2008. http://antlr.org.

[36] Terence Parr and Kathleen Fisher. LL(*): the Foundation of the ANTLR Parser Generator. In Mary W. Hall and David A. Padua, editors, Proceedings of the 32nd Conference on Programming Language Design and Implementation (PLDI), pages 425–436. ACM, 2011.

[37] Arthur J. Riel. Object-oriented design heuristics, volume 338. Addison-Wesley Reading, 1996.

[38] Alecsandar Stoianov and Ioana Sora. Detecting patterns and antipatterns in software using Prolog rules. 2010 International Joint Conference on Computational Cybernetics and Technical Informatics, 2010.

[39] Seyyed Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Ahmed E. Hassan, and Meiyappan Nagappan. Predicting bugs using antipatterns. 2013 IEEE International Conference on Software Maintenance, Sep 2013.

[40] Eva van Emden and Leon Moonen. Java quality assurance by detecting code smells. Ninth Working Conference on Reverse Engineering, 2002. Proceedings., 2002.

[41] Vadim Zaytsev. Grammar Zoo: A Corpus of Experimental Grammarware. Fifth Special issue on Experimental Software and Toolkits of Science of Computer Programming (SCP EST5), 98:28–51, February 2015.

[42] Vadim Zaytsev and Anya Helene Bagge. Parsing in a Broad Sense. In Jürgen Dingel, Wolfram Schulte, Isidro Ramos, Silvia Abrahão, and Emilio Insfran, editors, Proceedings of the 17th International Conference on Model Driven Engineering Languages and Systems (MoDELS 2014), volume 8767 of LNCS, pages 50–67, Switzerland, October 2014. Springer.


Appendix A

AST structure

To explain the AST in more detail, we will use a layered, breadth-first approach.

A.1 Root

Package ast

Figure A.1 and Figure A.2 display the root of the AST. The root node has three concrete children – Decorator, Module and Suite. Decorator represents Python's function and class decorators, e.g. Figure A.3. Module is the representation of the different files in the analyzed projects. Suite represents a collection of statements; for instance, a function's body is a suite.


Figure A.2: ast package (2/2)

@login_required(login_url='/accounts/login/')
def home_view(request):
    ...

Figure A.3: Function decorator in Python

A.2 Level 1

Package ast.param

The ast.param package contains the function definition parameters, see Figure A.4. A function definition has Params, divided into three groups – regular parameters, positional parameters (often denoted *args) and keyword parameters (often denoted **kwargs). Each parameter can be either typed or untyped, e.g. Figure A.5.


Figure A.4: ast.param package

def my_function(a, b: int, c, *args, d, **kwargs):
    ...

Figure A.5: Parameters in Python

Package ast.path

The structure of ast.path is shown in Figure A.6. Generally, paths are used in import statements, e.g. Figure A.7.

Figure A.6: ast.path package

from general.util.storage import *

Figure A.7: Import path in Python

Package ast.expression

The ast.expression package is displayed in Figure A.8. In Python, expressions can be conditional (e.g., Figure A.9) or non-conditional.


Figure A.8: ast.expression package

0 if is_even(x) else 1

Figure A.9: Conditional expression in Python

Package ast.statement

Because the amount of statements in Python is rather large, they have been divided into three groups – simple, flow and compound. See Figure A.10.

Figure A.10: ast.statement package

Package ast.argument

The different types of Python arguments are displayed in Figure A.11. Arguments are used to pass the values for function calls, decorator definitions and class definitions, e.g. Figure A.12.

Figure A.11: ast.argument package

parser.parse(input_file)


A.3 Level 2

Package ast.statement.flow

The flow statements consist of the statements influencing the flow of the program, i.e. which statement is going to be executed next. However, they do not contain any further instructions/statements. They are Continue, Raise, Yield, Return and Break, see Figure A.13.

Figure A.13: ast.statement.flow package
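As an illustration (ours, in the spirit of the other example figures), four of these flow statements can appear in a single generator:

```python
def positive_prefix(values):
    """Continue, Break, Yield and Return combined in one generator."""
    for v in values:
        if v is None:
            continue        # skip missing values
        if v < 0:
            break           # stop at the first negative value
        yield v             # produce the value
    return                  # end the generator

# list(positive_prefix([1, None, 2, -1, 3])) yields [1, 2]
```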

Package ast.statement.simple

Simple statements are the most basic type of statements. They do not affect the flow of the program and only consist of a single command; they have no body and contain no more statements. They are Print, Global, Assign, Import, Delete, Assert, Pass, Nonlocal and Exec, see Figure A.14 and

Figure A.15.


Figure A.15: ast.statement.simple package (2/2)

Package ast.statement.compound

Compound statements are defined as statements which have some sort of body. They are statements such as If, Except, Function, Try, ClassDef, With, While, For and WithItem, see Figure A.16 and

Figure A.17.

Figure A.16: ast.statement.compound package (1/2)

Figure A.17: ast.statement.compound package (2/2)

Package ast.statement.compiter

The ast.statement.compiter package stands for comprehension iteration and its contents are displayed in Figure A.18. The iteration can have two types: (1) for or (2) if. Neither one of them is a statement on its own; they are mostly used for list comprehensions, dictionary makers and set makers, such as in Figure A.19.


Figure A.18: ast.expression.compiter package

def filter_larger_numbers(list, val):
    return [x for x in list if x <= val]

Figure A.19: Comprehension iteration example

Package ast.expression.nocond

Non-conditional expressions do not contain a condition, such as an if condition. The ast.expression.nocond package is displayed in Figure A.20. Naturally, most expressions fall under this category. Non-conditional expressions include Atom, Trailer, Arithmetic, Bitwise and Logical expressions and a Non-conditional Lambda expression.

Figure A.20: ast.expression.nocond package

A.4 Level 3

Package ast.expression.nocond.atom

The ast.expression.nocond.atom package is displayed in Figure A.21 and Figure A.22. It contains the most elementary constructs, such as Booleans, Strings, Numerics, Identifiers and None (i.e., Python's equivalent of a null value). In addition, it also contains Makers, Comprehensions, Yields and Trailed Atoms.


Figure A.21: ast.expression.nocond.atom package (1/2)

Figure A.22: ast.expression.nocond.atom package (2/2)

Package ast.expression.nocond.trailer

The ast.expression.nocond.trailer package contains possible trailers for an atom. In Python, we distinguish between three types of trailers: (1) Slice, (2) Index Element and (3) Argument List. However, Slice and Index Element can easily be grouped together, as their syntax is highly similar, see Figure A.23.

Figure A.23: ast.expression.nocond.trailer package
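A short sketch of the three trailer kinds (identifiers are hypothetical):

```python
data = [10, 20, 30, 40]

sliced = data[1:3]     # Slice trailer
element = data[0]      # Index element trailer
length = len(data)     # Argument list trailer (a call)
```

The first two share the bracket syntax, which is why Slice and Index Element can be grouped together.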

Package ast.expression.nocond.arithmetic

Package ast.expression.nocond.arithmetic contains three types of arithmetic expressions: Unary (e.g., negation), N-ary (e.g., addition) and Power. The package contents are displayed in Figure A.24.


Figure A.24: ast.expression.nocond.arithmetic package
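The three arithmetic expression types can be sketched as follows (hypothetical identifiers):

```python
x = 5
negated = -x         # Unary arithmetic expression
total = 1 + 2 + 3    # N-ary arithmetic expression
squared = x ** 2     # Power expression
```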

Package ast.expression.nocond.logical

Logical expressions consist of Not, Comparison and Binary logical expressions (either and or or). They are a part of the ast.expression.nocond.logical package, displayed in Figure A.25.

Figure A.25: ast.expression.nocond.logical package
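As an illustrative sketch (identifiers are hypothetical):

```python
flag = not False           # Not
in_range = 0 <= 5 < 10     # Comparison (chained)
both = flag and in_range   # Binary logical expression: and
either = False or both     # Binary logical expression: or
```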

Package ast.expression.nocond.bitwise

The contents of the ast.expression.nocond.bitwise package are shown in Figure A.26. This package consists of simple bitwise operations, such as And, Or, Xor and Shift (which can be left or right).

Figure A.26: ast.expression.nocond.bitwise package
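The bitwise operations can be sketched as follows (identifiers are hypothetical):

```python
a, b = 0b1100, 0b1010
anded = a & b     # And
ored = a | b      # Or
xored = a ^ b     # Xor
left = a << 2     # Shift (left)
right = a >> 2    # Shift (right)
```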

A.5 Level 4

Package ast.expression.nocond.atom.comprehension

The ast.expression.nocond.atom.comprehension package, shown in Figure A.27, expresses the two ways of creating a list of values: (1) through a Comprehension and (2) Enumeration of all values.


Figure A.27: ast.expression.nocond.atom.comprehension package

even_comprehension = [2 * x for x in range(10)]
even_enumeration = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Figure A.28: If comprehension iteration

Package ast.expression.nocond.atom.yield

The ast.expression.nocond.atom.yield package contains the two different types of yields in Python. It is displayed in Figure A.29.

Figure A.29: ast.expression.nocond.atom.yield package
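As a sketch, assuming the two variants correspond to Python's plain yield and the delegating yield from (the generator names below are hypothetical):

```python
def inner():
    yield 1              # plain yield
    yield 2

def outer():
    yield 0
    yield from inner()   # delegating yield (Python 3.3+)
```

Iterating over `outer()` produces the values 0, 1, 2.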

Package ast.expression.nocond.atom.maker

There are two types of makers in the ast.expression.nocond.atom.maker package: (1) Dictionary Maker and (2) Set Maker. Their structure is displayed inFigure A.30.

Figure A.30: ast.expression.nocond.atom.maker package
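In Python source, the two makers correspond to the brace literals (hypothetical identifiers):

```python
dict_maker = {"a": 1, "b": 2}   # Dictionary maker
set_maker = {1, 2, 3}           # Set maker
```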

Package ast.expression.nocond.atom.numeric

The ast.expression.nocond.atom.numeric package, displayed in Figure A.31, is a very simple package containing the different numeric types in Python: Imaginary, Long, Integer and Float.


Figure A.31: ast.expression.nocond.atom.numeric package
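Three of the four numeric types have literal forms in Python 3 (hypothetical identifiers); the Long literal is Python 2 only:

```python
imaginary = 3j     # Imaginary
integer = 42       # Integer
floating = 3.14    # Float
# A Long literal such as 10L exists only in Python 2;
# Python 3 merged long into int.
```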

Package ast.expression.nocond.atom.trailed

The last package, ast.expression.nocond.atom.trailed, contains trailed atoms. A trailed atom is an atom that has a trailer attached to it. Python's trailed atoms are Slice, Attribute Reference and Call. In our implementation we also distinguish between a simple function call and a call of an object method.
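The trailed atoms, including the function-call versus method-call distinction, can be sketched as follows (identifiers are hypothetical):

```python
text = "design defect"

sliced = text[0:6]       # Slice on a trailed atom
method_ref = text.upper  # Attribute reference
called = len(text)       # Call of a plain function
shouted = text.upper()   # Call of an object method
```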
