
Bachelor Informatica

VisualBlame - An interactive Git blame visualization

Daan Siepelinga

June 9, 2016

Supervisor(s): Raphael Poss

Informatica - Universiteit van Amsterdam


Abstract

Developers often have a hard time understanding a codebase, running into problems such as not understanding the rationale behind code. Almost all codebases make use of a version control system like Git though, which contains data that can help solve these problems. There are already many tools available to gather the different kinds of data from Git, but they show the data separately. This makes it hard to satisfy the user's need for an overview of Git's data related to a piece of code the user is interested in. This research proposes VisualBlame, a new tool developed to explore the opportunities of grouping Git's data in one view. This required more knowledge about what the important information related to a piece of code is; therefore the tool was developed in iterations, with user researches between the iterations. Through this process more was learned about how users understand codebases in practice, revealing which tools are important and when the different kinds of information from Git are useful. It also showed that it is possible to create a tool that presents the different kinds of related data in one view without losing the overview, giving users a new option to help them understand a codebase.


Contents

1 Introduction
  1.1 Version control systems
  1.2 Problem statement
  1.3 Contribution

2 Background
  2.1 Understanding a codebase
  2.2 Git

3 Related work
  3.1 Related research
  3.2 Related tools

4 Design and architecture
  4.1 User interface
  4.2 Technical design
  4.3 Technical choices

5 User-driven implementation scoping
  5.1 Proof of concept
  5.2 Concept validation
    5.2.1 Understanding a codebase in practice
    5.2.2 Concept feedback
  5.3 Further development
  5.4 Related change importance
    5.4.1 Importance of the points of interest
    5.4.2 Feedback on the improved tool
  5.5 Final development

6 Technical Evaluation
  6.1 Validation
  6.2 Performance
  6.3 Comparison

7 Conclusions
  7.1 Future work

Appendices


CHAPTER 1

Introduction

Most software projects nowadays are constantly growing and evolving. Tasks such as maintaining code have therefore become common for many developers, which means developers often have to deal with code that was written by other developers. This can leave the developer with a lot of questions, as code can be written in many ways. A developer could for example wonder what the rationale behind a piece of code is or what it is supposed to do. Studies have shown that these are common cases: developers frequently ask themselves these kinds of questions [1, 2, 3]. It has also been shown that answering them is a time-consuming process, usually involving scanning through the code or asking another developer for help.

These kinds of questions are even more frequent for developers joining a new project, because they have not yet written any code and therefore exclusively deal with code written by other developers. The new developer thus has a low understanding of the project's codebase, making issues such as having trouble understanding the rationale behind code more likely to occur. This can make the task of understanding the project's codebase harder for the new developer.

Developers have multiple information sources available that can help during this process of understanding a codebase. The one information source that is available in every software project is the source code itself. This code may not be clear to the developer though, and could raise the issues discussed above. Other information sources that could help with these issues include comments in the code, documentation, bug repositories and version control systems. However, comments and documentation contain subjective data and may be of low quality or outdated. They are also not guaranteed to be available in the first place, a problem that bug repositories suffer from as well. The version control system on the other hand is both very common and contains a mix of objective and subjective data, which guarantees that it contains at least some useful information.

1.1 Version control systems

A version control system (VCS) is a system that allows multiple developers to work on a project simultaneously. Every developer works on their own copy of the project and adds their changes to the main copy of the project, which is stored on a server in a repository. The VCS stores these changes as a revision and adds them to this repository. This creates a history of changes to the project, with context information about the changes. This information includes objective data, such as by whom the changes were made and when they were added to the repository. It can also include subjective data in the form of a message that may explain why the changes were made.

Git is currently one of the most popular version control systems [4]. It is a distributed VCS, meaning that the full repository, including the historical data of the project, is also stored locally by every developer. This results in faster operations on this data, as no latency is involved. The historical data is formed by the revisions, which are called commits in Git and contain the context information of the revision. This context information is also called the metadata of the commit, and it includes the objective and subjective data of the revision. Besides this metadata, commits also contain the code of the project in the state it was in when the commit was created.

Git also comes with a set of standard tools to acquire this commit information. For example, the git log command gives a timeline of commits focused on their metadata and the git diff command can be used to see the code changes between commits. There is also the git blame command, which annotates every line of code in a file with the metadata of the commit in which the line was last changed. Its existence shows that there is a need for more context-oriented tools.

1.2 Problem statement

The context information from the Git database can already be obtained through multiple tools, including the Git commands. There is also the text-based tool Tig, there are web interfaces such as GitHub and there are desktop interfaces like SourceTree. These interfaces are all based around the Git commands and try to improve upon them by showing the context information in a more accessible view and by showing more relevant information at once. A more detailed description of them can be found in section 3.2.

These tools suffer from the same problem as the Git commands though: they only show a limited amount of the context information available for a piece of code per view. To get a full overview of the context information of one piece of code, the user has to enter multiple commands or switch between multiple views. The overview is quickly lost while doing this, as all the different results are shown in separate views that contain a lot of data. There is thus a need to decrease the effort required to find all the related context information of a piece of code that is available in the Git database.

1.3 Contribution

Following from the problem statement, here are the research questions that will be explored in this thesis:

• What does the process of understanding a codebase look like?

• What information in a Git database helps to understand a software project's codebase?

• How can the results from the Git commands be shown in one view without cluttering the view?

These questions were explored by designing an interactive tool called VisualBlame that shows all the relevant data of a piece of code found in the Git database in one view. That way the user is presented with more relevant data in one place, with options to continue browsing through this relevant data if necessary. VisualBlame was implemented in multiple iterations, with a user research in between the iterations. The goal of these user researches was to gather feedback about the tool and to gain more insight into the subjects of the research questions. This way the potential users could influence the tool’s implementation process.

This thesis is organized as follows. First more background information is provided about both the process of understanding code and about Git in chapter 2. The related work is explored in more detail in chapter 3. In chapter 4 the rationale behind the design and architecture of the tool is described. Chapter 5 covers the implementation process of VisualBlame, including both the implementation details and the user researches. In chapter 6 the tool is evaluated to validate that it works correctly, to find out what it adds over other tools and to determine whether its performance is acceptable or not. Finally the thesis concludes in chapter 7.


CHAPTER 2

Background

In order to design a tool that helps developers when they are in the process of understanding a codebase, it is important to learn more about this process. This gives insights that can shape the tool’s design towards supporting this process. It is also important to learn more about Git’s inner workings. This can show what the limitations and possibilities are of the data that can be gathered from the Git database. That is why this chapter provides more background information about both the process of understanding a codebase and the inner workings of the Git database.

2.1 Understanding a codebase

Multiple theoretical models have been developed in the past that try to capture how developers comprehend programs. In the top-down model the developer first forms hypotheses about the program based on reconstructed domain knowledge, after which the developer looks to confirm these hypotheses in the source code and possibly refines them. There is also the bottom-up model, where the developer first looks at the individual statements in the source code and mentally chunks them into higher-level abstractions. It was found that the top-down approach is mostly used when the code is familiar, whereas the bottom-up approach is used with unfamiliar code. These models were combined into new models that integrate both approaches, telling us that developers switch between the approaches as their mental model of the program improves during program comprehension [5]. In the case of understanding a codebase, the code is unfamiliar and thus the bottom-up approach should initially be the more commonly used approach.

These models only cover the strategies used by developers to understand how a codebase works, while developers also need information about the rationale behind it. This need comes from the fact that the same thing can be achieved in many different ways with code, which means there is a reason why the code is written the way it is and not in another way. The original developers have a rationale behind the way the codebase looks; they possess the theories behind the codebase that can only partly be communicated through tools like documentation [6]. In order to contribute to the codebase in a way that is in line with these theories, the new developer needs to understand this rationale. The commit metadata can reveal some of this rationale, although this relies heavily on the optional subjective data of the commits.

Furthermore, Maalej et al. [7] showed that in practice context information is indeed valuable during the task of program comprehension. They found that the implementation rationale helps to comprehend code, but it is often not available. There is also a gap between the interest of developers in this information and the willingness of developers to document it. This results in developers relying heavily on communication for program comprehension tasks.


2.2 Git

One of the strengths of Git as a version control system is its simple underlying concepts. At its core Git is a key-value store: the values are Git objects and each of them has an SHA-1 hash as its key. Git's object model consists of only three main objects: the commit, tree and blob objects [4]. The commit objects are the top-level objects and contain the metadata of the commit. They also have pointers, one to the previous commit and one to the top-level tree object. A tree object represents a directory, so the top-level tree object represents the root directory of the repository. Tree objects can contain pointers to other tree objects and to blob objects. These pointers also contain the associated data of the object they point to, such as the name of the object. Lastly, the blob objects represent the files in the directories; they only contain the content of the file they represent.

Figure 2.1: Git’s object model in a small example repository [4].

An example of a simple Git repository containing three commits is shown in figure 2.1. The commit, tree and blob objects are on the right side of the figure. The pointers there show one of Git's optimizations: tree and blob objects are not bound to one top-level tree object. If a file did not change in a commit, for example, the top tree object of that commit will have a pointer to the last blob object of that file instead of a newly created blob object. The left side of the example shows the references pointing to the commit objects. Git has a branching model in which each branch is a reference pointing to the last commit on that branch. Only the user's current branch is relevant to this thesis; Git stores a pointer to this reference in a file called HEAD. This is the entry point to Git's object model, which can be traversed by following the pointers.

With this information about Git's internal workings, some points of interest that can be gathered from its database can be identified. One point of interest is the commit history around a commit, which can be retrieved by going to the commit the HEAD pointer resolves to and following the pointers to parent commits. Each of these commit objects contains the commit metadata, another point of interest. Lastly, the changes in a commit form a point of interest. These changes can be found by running a difference algorithm on the content of the new blob objects in a commit and the content of their previous versions found in earlier commits.
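To make the first two points of interest concrete, the short sketch below walks the commit history from HEAD with the pygit2 library (the library used later in this thesis) and prints each commit's metadata. It is a minimal illustration, not code from VisualBlame; the repository path is a placeholder.

    import pygit2

    repo = pygit2.Repository("/path/to/repository")  # placeholder path
    head_commit = repo[repo.head.target]              # commit that HEAD resolves to

    # Follow the parent pointers from HEAD: the commit history point of interest.
    for commit in repo.walk(head_commit.id, pygit2.GIT_SORT_TIME):
        # The commit metadata point of interest: author, hash and message.
        summary = commit.message.splitlines()[0] if commit.message else ""
        print(commit.short_id, commit.author.name, summary)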


CHAPTER 3

Related work

Various research has already been done on using version control system (VCS) data to help developers understand a codebase. There are also many tools available with the same goal. In the sections below, both relevant research and relevant tools are discussed.

3.1 Related research

Van Rysselberghe and Demeyer [8] researched how a visualization of the VCS data can help find the most important code of a codebase. They visualized the VCS data as a simple scatterplot, with files on the x-axis and time on the y-axis. A point in the plot indicates that the corresponding file was changed at the corresponding time. The resulting plot provides insights into the evolution of a software system, such as how the design changed over time. Using an open source system as a case study, they were able to successfully identify relevant changes such as unstable components and related entities in the system. However, the visualization only provides symptoms that can indicate these components; manual inspection is needed to confirm that the indications are correct.

Research by Hassan and Holt [9] made use of the metadata stored in the VCS. They located the static dependencies in a software system used as a case study with the help of a dependency graph. Then they attached the VCS metadata to these dependencies as notes. Afterwards they successfully found the rationale behind the unexpected dependencies in the system through these notes. This confirms that the VCS metadata can explain the rationale behind code, although it relied heavily on the quality of the commit messages in the case study.

Other research has been done that uses the data from the VCS in combination with other data sources. Holmes et al. [10] developed the Deep Intellisense tool as a plugin for the Visual Studio integrated development environment. The plugin combines data from a VCS with data from bug reports, specifications and emails. Its goal is to give developers a more complete view of the code history by combining the information from these artifacts and presenting it in one tool. While no case study was done with the tool, interviews were held with developers and testers from one company before its implementation. From these interviews it became clear that developers have a need to find the correct contact person and that developers are more interested in finding information at the statement level. This is in line with the earlier findings about the codebase understanding process in section 2.1.

3.2 Related tools

There are already various tools available that present information from Git to the user. Git comes with a set of standard tools that were already mentioned in the introduction, such as git log, git diff and git blame. With these commands the points of interest from the Git database identified in section 2.2 can be retrieved. They also indicate that some of the points of interest can be used in a different way to create a new point of interest. The git diff command has a command line option (--stat) that shows a summary of the changes in a commit, providing a different point of interest than the changed lines themselves. The git blame command shows that the objective part of the commit metadata can be used as annotations to the lines of a file. Each line is annotated with the objective part of the metadata of the commit in which it was last changed, creating a new point of interest.

There are also many graphical user interfaces available that allow the user to browse a Git repository, often either web based or desktop based. Examples of desktop based interfaces are SourceTree1 and GitHub Desktop2; well-known web based interfaces are GitHub3 and BitBucket4. Besides showing the Git repository in a graphical interface, they contain graphical views for the different Git commands with their own small additions. For example, GitHub contains a blame view in which every line number has a color based on the age of the commit the corresponding line was last changed in. Next to this, it also shows the subjective part of the metadata of these commits next to the lines in the blame view.

Tig5 is an interactive tool with a text-based interface that enhances the Git commands with a cleaner visual look and options to interactively browse through the results. This is done by presenting the results with colors in the terminal and by letting the user browse through the results with a keyboard. It also supports showing multiple points of interest in one view. For example, a line in the blame view of Tig can be selected with the enter key, which opens another view below the blame view. This view contains the commit information, change summary and changes from the commit the selected line was last changed in, essentially showing four points of interest from the Git database at once. Lastly Tig also simplifies some advanced options of the Git commands. For example, the git blame command can be used with a commit hash parameter to start looking from the commit with that hash instead of from the current commit. This results in the blame point of interest of an older version of the file. Tig makes using this powerful functionality a lot more user friendly, as only one key is required in the blame view to show this older version of the file with its blame point of interest.

1 https://www.sourcetreeapp.com/
2 https://desktop.github.com/
3 https://github.com/
4 https://bitbucket.org/
5 https://github.com/jonas/tig


CHAPTER 4

Design and architecture

4.1 User interface

The user interface is the most important part of VisualBlame, because it determines what the user can do with the tool. The main requirement for this user interface was that it needs to be able to show all the Git points of interest identified in sections 2.2 and 3.2 in a single view. This means separate subviews were required for most of the points of interest and that their data had to be filtered; otherwise their data could already fill a whole view, as shown by the Git commands. A graphical user interface (GUI) is more suitable to satisfy this requirement than a text-based user interface (TUI), because it is generally better at showing multiple things in one view. This is why the tool was designed with a GUI instead of a TUI.

The first important choice for VisualBlame was that it would be based around the git blame functionality. The reason behind this is that the git blame functionality works at the line level of code, which corresponds well with the observation in section 2.1 that users most likely browse unfamiliar code from the line level. Another reason to use the git blame functionality as the basis of the GUI is that it outputs a commit per line; these commits can be used to gather the other points of interest. This means a view was needed to present the file the GUI was started with, called the blame view. However, there was no room to immediately show all the blame annotations, because the other points of interest need space too. Therefore it was chosen to give the user the ability to select a line in the blame view, after which the metadata of the commit the line was last changed in is shown in a separate view. All the lines of the file in the blame view that were last changed in this commit are highlighted afterwards. This way the relevant git blame data is still shown to the user, and there is also room for the other points of interest.

The other points of interest can also be triggered after a line in the blame view is selected. With the commit metadata retrieved by selecting this line, the commit changes, the commit change summary and the commit history of this commit can be retrieved. The main idea was that the commit changes could be shown per file in a view similar to the blame view, called the diff view. This allows the user to compare the file contents of the blame view more easily with the file contents and changes found in the diff view. The other files that were changed in the same commit, together with a summary of how many lines were changed in them, could then be placed in a separate list. The commit history point of interest could be shown in another separate view called the log view, which shows the commits with their metadata in a horizontal list to allow for a more intuitive chronological ordering. Now that it was determined how the tool would be used, the final design could be created.


Figure 4.1: The graphical user interface design of the tool.

The final design of the GUI was created in a few iterations and is shown in figure 4.1. View 3 represents the blame view. The diff view was placed right next to this view, so that the contents of the two views can be compared more easily. The commit metadata view and the diff summary list were also placed on the right side, so that all the points of interest are grouped. The log view was placed on its own at the top of the view.

Besides these main components, some other views were added to give the user more information and interaction options. For example, beneath the blame view is another commit context view, showing the commit that contains the file of the blame view in that state. With button 8 the changes of this commit can be retrieved, showing them in the views on the right. The most important additional functionality is behind button 9: clicking this button moves the file of the diff view to the blame view. This allows the user to browse through the whole code history of the project, by moving files from previous commits to the blame view and then selecting lines there.

However, this only allows the user to go to previous versions of files in the repository. That is why two options were designed to allow the user to go back to newer versions of files. First, there is the button tab panel in view 2, which shows the history of files that were in the blame view and allows the user to put these files back in the blame view. Secondly, the log view's commits were intended to be clickable, showing the changes of those commits on the right side.

4.2 Technical design

In order to support the GUI design, a back end architecture was designed. This architecture was designed to support the functionality needed to retrieve the points of interest in a modular way, ensuring that functionality for retrieving new points of interest can easily be added. Besides that, the architecture was also designed to support the basic requirements that the GUI should always stay responsive and that data should be reused when possible.


Figure 4.2: The technical architecture of the tool

Based on these requirements, the final architecture design was created after multiple iterations, as shown in figure 4.2. The architecture consists of five components, two of which are the main components: the GUI component and the modules component. They respectively provide the front end GUI logic and the back end data retrieval logic.

To fulfil the requirement that the GUI stays responsive, a scheduler component was introduced. This component is called whenever the GUI wants to retrieve data from the Git database. It starts the correct module concurrently and then returns control back to the GUI. It also takes care of returning the result of the module back to the GUI.

A cache component was introduced because of the requirement to reuse data where possible. While this does not make the initial request for certain data faster, a significant speedup can be achieved when the same data is requested again, by returning the data from the cache instead of calling the module. This is especially important when working with large repositories, as the modules are expected to take more time in that case due to processing larger amounts of data from the Git database. The reason to choose a cache over the alternative of a database is that a database likely gives more overhead due to working with the user's file system. The main advantage of a database is that the tool could reuse the gathered data over multiple sessions. This is not a significant advantage in this case though, as it is not likely that the user will look at the same files many times over multiple sessions with VisualBlame.

Lastly there is the event layer, which serves as the main communication channel between the GUI and the modules. A standard API was not an option, as the modules run concurrently with the GUI. Therefore an event-driven mechanism was chosen in the form of the event layer, which turned out to be similar to the publish-subscribe pattern. The basic idea is that modules and GUI components can subscribe to any event and can trigger any event. The modules subscribe to their own event; the GUI components trigger this event and subscribe to the corresponding result event. This way the GUI and back end are decoupled from each other: the GUI components only need to call the right event with valid arguments and be subscribed to the corresponding result event.

4.3 Technical choices

Before VisualBlame could be implemented, the main technologies to create it with had to be chosen. These were limited to the technologies that could support the GUI and the back end that interacts with the Git database. Python was chosen to create the GUI with, because it is a high level language featuring mechanisms such as a garbage collector that takes care of memory management and high level data structures that adjust their size when needed. These kinds of features allow for quick development with Python. This is important, as the time to create the tool was limited and the visual design contains many components.

Python has many GUI frameworks that could be used to implement the visual design. These were first filtered based on the objective requirements that the framework had to be open source, actively maintained and able to support both Linux and Mac. This narrowed the selection down to Kivy, PyQt, Tkinter and libavg. The final decision was made based on subjective requirements, such as how well the framework is documented and how easy it is to work with. Kivy was selected based on these requirements. It contains a clear set of base widgets that can be inherited from to create custom widgets. Kivy is also actively maintained, has a fair amount of documentation and supports Linux and Mac. Lastly, another benefit of Kivy is that it contains its own markup language, separating the logic of a widget from its appearance styling.

The initial choice for the back end was to use the libgit2 C library. This library contains functions to interact with the Git database and return the corresponding objects, such as a commit object. This is more convenient than using a shell process with the Git commands, as that adds more complexity in the form of manually parsing the plain text output of the Git commands. While libgit2 also has a Python version, the initial choice was to use the C version, because C is a low level language, meaning that performance can be gained if it is used properly. There are also more options to use concurrency in C, which could result in an additional performance gain. However, this required creating communication code between the C and Python code. This code would also need to transform the low level C objects to Python objects and interact with Python's global interpreter lock. The Python version of libgit2 already contains similar communication code in the form of bindings to the C version of libgit2. Therefore, to increase the development speed by avoiding having to write this communication code, the Python version of libgit2 was used, at the cost of having fewer concurrency options.


CHAPTER 5

User-driven implementation scoping

The implementation started with a proof of concept of VisualBlame. This was followed by a first user research to gain feedback on the concept. The tool was then further developed, followed by yet another user research. Finally, the tool was developed into its final state for this thesis.

5.1 Proof of concept

The proof of concept's goal was to create a basic version of VisualBlame that could be shown in the first user research. This gives the users a good idea of how the tool works and allows them to provide better feedback. The main focus was therefore on creating the blame view, as this is the view the user starts interacting with. The blame view on its own gives little information though, so it was also important to create one of the commit context views in the proof of concept. Working in this scope meant that first versions of almost all the components in the architecture were required. Only the cache component was not yet needed, because its only purpose is to speed up the module operations by reusing data when possible.

The components of the architecture were directly translated to Python objects. The GUI has a corresponding class VisualBlame, which inherits from the App class of Kivy. This class contains the functionality to start the GUI and load the widgets the user can see and interact with. The event layer was translated to the EventManager class that makes use of the fact that Python has first class functions, allowing it to use a dictionary mapping events to functions that will be called when the event is triggered. The scheduler was implemented as the Scheduler class and it contains the functionality to call the modules in a concurrent way. A separate class was created for every module, all inheriting from the same base class.

One goal of the proof of concept was to keep these classes simple. As it was not yet known what all the modules and GUI widgets would look like, this makes sure that the main classes can be changed or extended more easily when needed. It also lowered the development time of the proof of concept, which was important to decrease the chance that the first user research would have to be delayed.

This simplicity can be seen in the proof of concept's primary communication class, the EventManager. This class contains a dictionary mapping events to a list of functions. Besides this it only has two methods: one to add a function to the list of an event and one to call all functions in the list of an event with data. One instance of the EventManager is created at the start of the program and is given to both the VisualBlame and the Scheduler class. The widgets of the VisualBlame class, such as the blame view, then add their functions to the events they want to receive data from. The Scheduler class adds its own function to the events it has corresponding modules for. This way a widget from the GUI can for example trigger the event "blame" through the EventManager. This results in the function of the Scheduler being called, which calls the module corresponding to the event. When the module is done, it returns its data to the Scheduler, which triggers the return event "blame result". Finally, through the EventManager, the widget functions registered to that event are called with the returned data.
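A minimal sketch of this EventManager idea is given below; the class and method names are illustrative rather than the actual implementation, and the callbacks stand in for the widget and scheduler functions.

    class EventManager:
        def __init__(self):
            self._listeners = {}  # event name -> list of callback functions

        def register(self, event, callback):
            self._listeners.setdefault(event, []).append(callback)

        def trigger(self, event, data=None):
            for callback in self._listeners.get(event, []):
                callback(data)

    # Usage: the GUI triggers "blame", the scheduler listens and would later
    # trigger "blame result", to which the blame widget is subscribed.
    events = EventManager()
    events.register("blame", lambda args: print("scheduler starts blame module:", args))
    events.register("blame result", lambda data: print("widget receives:", data))
    events.trigger("blame", {"file": "utils.py", "line": 40})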

The other classes were also kept simple. The GUI widgets inherit from Kivy's existing widgets where possible. For example, the blame view widget inherits from Kivy's ListView class and only adds functionality to select the correct lines after receiving the blame data. This data comes in through the event communication, where the Scheduler calls the Blame module in a new thread to keep the user interface responsive while the module retrieves its data. It was considered to use processes instead of threads, as they can make use of multiple cores in a system, while threads are limited by Python's global interpreter lock. Threads were still chosen though, because the data sharing between processes and the initialization of processes are more expensive. In the Blame module the data is then retrieved with the pygit2 library and processed to return only the relevant lines back to the widget.
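The thread-based approach can be illustrated with a small, self-contained sketch: a module's work runs in a background thread and the result is handed back through a callback, which stands in for the "blame result" event. The function and variable names are illustrative, not taken from the thesis code.

    import threading

    def run_module_async(module_func, args, on_result):
        # Run the module in a background thread so the calling (GUI) thread
        # stays responsive, then pass the result to the callback.
        thread = threading.Thread(target=lambda: on_result(module_func(args)))
        thread.start()
        return thread

    # Example with a dummy module standing in for the pygit2-based Blame module.
    worker = run_module_async(lambda path: "blame data for " + path,
                              "kivy/utils.py",
                              print)
    worker.join()  # only for this script; the GUI would instead receive an event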

Some styling was added to the proof of concept after all the classes were finished, to better communicate what the concept does to the participants of the first user research. The styling was added with the kv language of Kivy and included things such as adding clear background colors to the selected lines and using a monospace font. One of the screenshots used in the user research can be seen in figure 5.1. In this screenshot one of the lines was selected, after which the commit metadata of the commit that line was last changed in was shown. All lines that were changed in this commit then received a different background color.

Figure 5.1: The proof of concept, showing the result of clicking on line 40 of the utils.py file from the Kivy repository. The author and committer names are hidden.


5.2 Concept validation

After the proof of concept was finished, the first user research was done. The main goals of this research were to find out what potential users think of the proof of concept and what changes to the tool they would like to see. The secondary goal was to find out more about the process of understanding a codebase in practice. This research was done in the form of a survey, containing questions regarding these goals and screenshots of the proof of concept and the visual design of the tool. To gather as many responses as possible, incomplete responses were also accepted, resulting in some questions having fewer answers. The survey ended with 11 participants; their relevant programming experience is summarized in figure 5.2. Unfortunately, due to the small number of participants, the conclusions drawn from the data hold less value.

Figure 5.2: The years of programming experience in software projects with multiple people of the first survey’s participants

5.2.1 Understanding a codebase in practice

The questions about the process of understanding a codebase regarded what tools developers currently use to analyze unfamiliar code, what information is important to them to help them understand a codebase and how data from the version control system (VCS) can help understand a codebase. Note that the survey allowed the user to enter multiple answers to the open questions, explaining why there can be more answers than participants in the summarized results.

Tool                                    Amount of answers
Graphical user interfaces for Git       9
The standard Git commands               7
Integrated development environments     2

Table 5.1: Summarized top results of the question: "What tools do you currently use when analyzing and reviewing code written by other people?"

The Git commands scored best on the question regarding the tools developers use to analyze unfamiliar code. This is not a surprising result, as they are both powerful and accessible to anyone who uses Git. It also confirms that the data shown by these tools is important to users when dealing with unfamiliar code. However, table 5.1 shows that the graphical user interfaces score even higher when the results of the different interfaces are summarized into one final result. This shows that either the Git tools themselves do not always give a clear overview of the needed data, or that the small additions the graphical interfaces make to the different Git results are very useful to their users.


Information source     Amount of answers
Documentation          6
Commit metadata        6
Code history           3

Table 5.2: Summarized top results of the question: "What information helps you when trying to understand a new codebase?"

Table 5.2 shows that the most useful information for understanding a codebase is the documentation and the commit metadata of a codebase. This means that even though high quality is not guaranteed for these information sources, they are very useful when it is present. After these, the most useful information turned out to be the code history. The VCS thus contains a lot of the useful data that helps to understand a codebase, as it contains both the code history and the commit metadata.

The results of the questions regarding the VCS data did not reveal anything new. The first question concerned how the VCS data is useful for discovering the rationale behind code. The most common answer was that the commit messages can show the rationale, which is in line with the findings in chapter 3. The second question regarded in which situations the code history of the VCS is more useful and in which situations the commit metadata is more useful. Unfortunately, this question turned out to be poorly formulated and gathered too few valid responses to draw conclusions from.

5.2.2 Concept feedback

The questions regarding VisualBlame started with a question asking whether the tool would be an improvement over the current way the participants explore a codebase. Of the 11 responses, 54.5% indicated this was the case. There were however many different answers regarding the reasoning behind this; there was no common answer. This makes it hard to draw conclusions from this data, showing the weakness of the low number of participants. Furthermore, none of the answers mentioned that the tool would be able to show more relevant data, indicating that the visual design did not make this clear. The participants that did not see the tool as an improvement did have a most common argument: they saw no added value in the tool over existing tools. This again could indicate that the advantage of VisualBlame over other tools is not clear.

The last question about the concept asked the participants what features the tool would need to make it an important part of their workflow. This resulted in a wide range of answers, with no feature appearing significantly more often than the others. Summarizing the answers, the three features that appeared slightly more often were Linux support, the ability to use the tool in the terminal and integration with other tools. The tool already supports Linux, as it is developed in Python on Linux with libraries supporting Linux. The second feature was not considered feasible, because the tool was designed with a graphical user interface. Lastly, while integration with other tools could benefit VisualBlame, it was not considered relevant to this thesis.

5.3 Further development

The feedback gathered from the first user research did not have a direct impact on the development of VisualBlame, because the results did not reveal that the design had to be changed. It did however show that the advantage of the tool over existing tools was not clear. Therefore the goal of this development cycle was to implement most of the views showing the data relevant to the selected line, so that the purpose of VisualBlame would be clearer in the next user research. This limited the scope of this development cycle to creating the diff view and the list of changed files above it, because these are two points of interest that are important in the visual design.


In order to gather the commit changes, the diff module was implemented. It uses the pygit2 diff functionality to gather the initial data, using the commit of the blame view and its parent as input. This caused a problem with the first commit though, as that commit has no parent; custom logic had to be added to cover this edge case. The pygit2 format of the diff data was also not directly usable. The diff module therefore contains logic to transform the different diff objects of pygit2 into a custom format. This format consists of a list of hunks per file. A hunk consists of a list of lines and an origin indicating whether these lines were added, removed or unchanged in this commit. This is the data given to the GUI widgets, which add line numbers and background colors to it.
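A rough sketch of this transformation is given below. It assumes pygit2's diff, patch and hunk objects and an illustrative repository path; it is not the module's actual code. The first-commit edge case is handled here by diffing against an empty tree.

    import pygit2

    repo = pygit2.Repository("/path/to/repository")  # placeholder path
    commit = repo[repo.head.target]

    if commit.parents:
        diff = repo.diff(commit.parents[0], commit)
    else:
        # Edge case: the first commit has no parent, so diff against an empty tree.
        diff = commit.tree.diff_to_tree(swap=True)

    # Custom format: a list of hunks per file, each hunk a list of
    # (origin, line) pairs where origin is '+', '-' or ' '.
    hunks_per_file = {}
    for patch in diff:
        path = patch.delta.new_file.path
        for hunk in patch.hunks:
            lines = [(line.origin, line.content.rstrip("\n")) for line in hunk.lines]
            hunks_per_file.setdefault(path, []).append(lines)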

The diff view, like the blame view, needed a ListView widget. As these two views are placed next to each other, the lines in them had a lot less space than before, when there was only the blame view. In order to solve the problem of long lines not being shown fully, the ListView implementation was improved to show long lines across multiple lines. However, during this implementation it became clear that the ListView could not support list items with variable height, because it uses lazy loading and calculates its initial height by multiplying the height of one of its items with the number of items it contains. Therefore a custom list view named the CodeScrollView was created, inheriting from the simpler ScrollView widget of Kivy. The list view logic required by both the blame and the diff view was added to this class. Then two more specialized classes inheriting from this class were created for the blame and the diff view, respectively implementing the selection logic required by the blame view and the background color highlighting required by the diff view.
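The sketch below shows the standard Kivy pattern such a CodeScrollView can build on; it is not the thesis implementation. A GridLayout whose height follows the total height of its children is placed inside a ScrollView, so list items of variable height are supported.

    from kivy.uix.gridlayout import GridLayout
    from kivy.uix.label import Label
    from kivy.uix.scrollview import ScrollView

    class CodeScrollView(ScrollView):
        """A scrollable list of labels whose items may have different heights."""

        def __init__(self, lines, **kwargs):
            super(CodeScrollView, self).__init__(**kwargs)
            layout = GridLayout(cols=1, size_hint_y=None)
            # Let the layout's height grow with the total height of its children.
            layout.bind(minimum_height=layout.setter("height"))
            for text in lines:
                label = Label(text=text, size_hint_y=None)
                # Give each label the height of its rendered text.
                label.bind(texture_size=lambda lbl, size: setattr(lbl, "height", size[1]))
                layout.add_widget(label)
            self.add_widget(layout)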

The list of changed files did not cause many problems. It only had to receive the data from the diff module, initialize its own widgets and initialize the diff view with the selected file's diff data. However, this required yet another event registration. This was done at the initialization of the VisualBlame class, with a few lines of code per event that had to be registered or triggered by the GUI widgets. While this worked, it was not ideal, and an extension of the EventManager could provide a more flexible solution that would make adding new widgets easier. In order not to delay the second user research, this was postponed to the final development phase.

Lastly, a first version of the cache was implemented as the Cache class. This was a very simple version though, as the development of the diff view was more important and took longer than expected due to the issues discussed above. The cache was therefore only a simple Python object holding a dictionary at this point. The keys were the event name combined with the input arguments to that event, and the values were the return data of the modules. This meant that if the exact same action was done twice, its result could be retrieved from the cache. This was not ideal, as some data was stored multiple times. For example, the blame data was stored twice in the case where the user selects two lines with the same last commit in the blame view. An improvement of the cache was therefore needed; it was postponed to the final development phase.

Finally, some styling was added to VisualBlame, for the same reasons as in the proof of concept. One of the screenshots used in the second user research is shown in figure 5.3.


Figure 5.3: VisualBlame after the second development cycle, showing the result of clicking on line 3 of the utils.py file from the Kivy repository. The author and committer names are hidden.

5.4 Related change importance

The main goal of the second and last user research was to gather results to use in the tool comparison evaluation and to learn more about the usefulness of the different kinds of data in the Git database. Its secondary goals were to gain additional feedback on the tool and to gain feedback about additional features that could possibly be added to it. The research was again in the form of a survey. This time there were 25 participants; their experience is summarized in figure 5.4.

Figure 5.4: The years of programming experience in software projects with multiple people of the second survey’s participants


5.4.1 Importance of the points of interest

Five scenarios were described at the start of the survey. These scenarios were formed through a few discussions and were validated by asking the users how frequently they encounter them. These results are used during the technical evaluation to weight the different scenarios. The users were also asked whether they frequently encounter any other scenario. This resulted in only four responses with specific scenarios, indicating that no common scenarios were missing from the survey. The objectives of the scenarios are described below.

1. Contact person: searching the best contact person to explain a piece of code.

2. Age: find out how old certain code is.

3. Speed of growth: figure out how fast a piece of code has grown over time.

4. Iterations: find out what previous iterations of some code look like.

5. Related entities: find the dependencies to a piece of code.

After the questions regarding the scenarios, the participants were asked how often each point of interest from the Git database was useful in the different scenarios. To ensure that the users knew what information the point of interest includes, screenshots of each point of interest gathered with the Git commands were added to their corresponding question.

Figure 5.5: Results of how often the commit metadata is useful in the different scenarios.

The results in figure 5.5 show that the commit metadata is most often useful in the contact person and the age scenarios. This is a logical result, as the commit metadata includes information about the person that made the commit. It is also shown that the metadata is not often useful in the speed of growth scenario. Lastly, the results for the iterations and related entities scenarios are interesting, as they show no clear tendency towards often or never, while the points of interest in general should either not be useful or often be useful in a scenario. This could indicate that "sometimes" is a comfort pick and that the question should have had another answer option.


Figure 5.6: Results of how often the blame annotation is useful in the different scenarios.

The blame annotation was expected to score well on the contact person and age scenarios, due to it annotating each line with an author and date. The results in figure 5.6 match these expectations. It is also clear that this point of interest is not useful in the related entities scenario. The other two scenarios show no preference for any answer, indicating that it is very situation dependent whether the blame annotation is useful in those scenarios.

Figure 5.7: Results of how often the patch content is useful in the different scenarios.

It was expected that the patch content would score well on the iterations and the related entities scenarios, as it shows the changed lines of all the changed files in a commit. These expectations were matched by the results in figure 5.7. The patch content showed a tendency towards not being useful in all the other scenarios, showing that those three scenarios have a very different information need.


The diff summary was expected to score well on the related entities scenario, as it shows which files were changed in a commit. It could also help in the speed of growth scenario, as it shows how much the files changed in a commit. These expectations were partially matched by the results in figure 5.8: the diff summary was found to be most useful in the related entities scenario. It did not score well on any of the other scenarios though, making this point of interest the least versatile so far.

Figure 5.9: Results of how often the history position is useful in the different scenarios.

The history position was expected to be most useful in the speed of growth scenario, because it shows multiple commits in a time line. These expectations were not matched by the results in figure 5.9. The scenarios this point of interest did score well on were the age and iterations scenarios. Scoring well on the age scenario is a logical result, as it shows the age of the commits in the time line. That it scores so well on the iterations scenario is less logical, but could be explained by the fact that the commits could point towards older iterations of certain code.

5.4.2 Feedback on the improved tool

The questions regarding VisualBlame again started with the question asking whether the tool would be an improvement over the current way the participants explore a codebase. This time there were 24 responses, and 62.5% of them indicated that VisualBlame would be an improvement. This is a higher percentage than in the previous user research, indicating that the purpose of the tool has become clearer. This was confirmed by the answers: the most common answer of the participants that indicated VisualBlame would be an improvement was that it gave a better overview of the context information than the Git tools. The most common answer of the participants that did not see VisualBlame as an improvement was that they thought other tools already covered this, which is the same as the most common answer in the previous user research.

During this user research some potential features that could be added to VisualBlame were described to the participants, who could indicate how often these features would be useful to them. These features were formed through discussions and the first user research's results. The results of this question are shown in figure 5.10 and indicate that the most useful features would be the "open in editor" and "related files based on line content" features. These would respectively give the user the option to open the current file in an editor, and show the user more files related to the selected lines based on their content. If a line contains a function, for example, related files would be files that also use this function. Even though this means that even more data is shown in the view, it still turned out to be the most useful potential feature according to the results.


Figure 5.10: Results of the question: “If you were to use the tool, do you think the potential features would improve the tool to you?”

Lastly, the survey asked what features the participants would like to see in the tool, just like the previous survey. Unfortunately, there were only five responses to this question, which may indicate that VisualBlame is already quite feature complete. There was a most common answer though: IDE integration. This would mean VisualBlame would have to be implemented as a plugin for an IDE, so that it can make use of advanced features like syntax highlighting and locating the definitions of language objects such as functions. Just like in the previous survey, this was not considered relevant to this thesis.

5.5 Final development

Due to the issues found in the last development cycle, the first goal of the final development was to improve the EventManager. Another goal was to add the remaining points of interest and interaction features to VisualBlame. The last goal was to improve the cache, so that it would not store data multiple times. It was also considered to implement a simple cache eviction algorithm, but this had the lowest priority, as the other goals have a higher impact on the results of the technical evaluation that would follow this development cycle.

As the registering of events was not flexible enough, the EventManager was extended to support the use of a configuration file in which all the GUI widgets specify the events they want to listen to and the events they want to be able to trigger. This was implemented as two dictionaries mapping widget identifiers to specialized namedtuples, defined in the EventManager file, in which the event details can be specified. The first dictionary specifies the events a widget wants to listen to, the second specifies the events the widget wants to trigger. The dictionaries also support extra functionality, like the ability to specify the caller widget identifier of an event the widget wants to listen to and the ability to chain more events after the results of the first event. This was especially useful for the blame widget, which has to trigger the diff, commit context and log views after getting its blame results. Adding new widgets using the event communication also became easier, as only a few lines in the configuration dictionaries have to be added.

The first configuration dictionary is passed to the EventManager through the VisualBlame class. At the start of VisualBlame, the VisualBlame class gives the namedtuples of this dictionary and a standard receive function of the widget to the EventManager. The EventManager then processes the configuration by registering the functions to the specified events in its event dictionary. The standard function to receive events is defined in the new EventWidget class; the widgets using the event communication have to inherit from this class. It also contains a function that the widgets can call to trigger the events they specified in the second configuration dictionary. Lastly, this base class also takes care of processing the results in a concurrency-safe way through one of Kivy's functions.
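A minimal sketch of such a configuration is shown below, reusing the EventManager idea from the sketch in section 5.1. The namedtuple fields and widget identifiers are illustrative assumptions, not the actual configuration format.

    from collections import namedtuple

    # One entry per widget: the result event it listens to and the events that
    # should be chained after that result arrives.
    ListenConfig = namedtuple("ListenConfig", ["event", "chained_events"])

    LISTEN_CONFIG = {
        "blame_view": ListenConfig("blame result", ["diff", "commit context", "log"]),
        "diff_view": ListenConfig("diff result", []),
    }

    def register_widgets(event_manager, widgets):
        # widgets maps widget identifiers to objects with a receive(data) method.
        for widget_id, config in LISTEN_CONFIG.items():
            event_manager.register(config.event, widgets[widget_id].receive)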

The cache was extended too. All the data of the different modules was evaluated to discover what the best option for a cache would be. As all the modules were implemented as separate independent classes, the cache could start by grouping the results per module. To allow for more flexibility, a new way of interacting with the cache was provided to the modules. Before their execute function is started in a new thread, they now first receive all the cached data of their event. They can then check themselves whether their result is already in this data, and return it if it is. If it is not, their execute function is still called in a new thread. Additionally, functionality was added to let them make a call specifically for putting data in the cache, allowing them to also specify how their data is stored. This way the modules have a lot more control over the cache, while not knowing much about its implementation details. These are hidden in the base module class and the Scheduler class.
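The sketch below illustrates this interaction; the class, method names and synchronous flow are illustrative assumptions (threading is omitted for brevity). The module decides how to search and store its own cached data, while the storage itself stays hidden behind the runner.

    class CachedModuleRunner:
        """Give a module a chance to answer from its cached data before running it."""

        def __init__(self):
            self._cache = {}  # event name -> module-specific cached data

        def run(self, event, module, args):
            cached = self._cache.setdefault(event, {})
            result = module.lookup(cached, args)    # module-defined cache search
            if result is not None:
                return result
            result = module.execute(args)           # normally started in a new thread
            module.store(cached, args, result)      # module-defined cache storage
            return result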

The data of the Blame module did not work well with this cache though, because it was stored as a list of hunks containing the lines and the last commit these lines were changed in. This means the selected line had to be searched for in the line lists of the hunks, which was inefficient. Therefore a better way to store this data was implemented. The solution makes use of Python's reference-to-object way of storing data. The final results are stored as a list in which the indices correspond to the lines of the file. Each index in the list gets its own hunk, but the indices of lines last changed in the same commit do not each store a copy of the hunk; they only contain the same reference to the hunk's Python object. This way the data can be retrieved quickly by getting the hunk of the selected line through its corresponding list index, without storing the blame data multiple times.
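The idea can be shown with a small Python example: lines that share a last-changing commit share one hunk object by reference, and the selected line's hunk is found by direct index lookup. The commit hashes are made up for illustration.

    # Two hunks: lines 1-3 were last changed in commit abc123, line 4 in def456.
    hunk_a = {"commit": "abc123"}
    hunk_b = {"commit": "def456"}

    # Index i holds the hunk of line i + 1; indices 0-2 reference the same object.
    blame_by_line = [hunk_a, hunk_a, hunk_a, hunk_b]

    selected_line = 2
    print(blame_by_line[selected_line - 1]["commit"])  # direct lookup: abc123
    print(blame_by_line[0] is blame_by_line[2])        # True: shared reference, no copies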

Two more points of interest were then added: the history position and the diff summary. These caused no significant problems, showing off the flexibility of the architecture. The diff summary turned out to be easy to gather from the diff results and was thus added to the diff module. The history position did require a new module; this module simply walks the commit history until it finds the commit it is looking for and then returns it together with its surrounding commits. The details of these commits had to be gathered through the commit context module, which was extended to handle multiple commits in one call. Lastly, some new GUI widgets were added and existing ones changed to support the output data of these modules; they were registered to the events through the new event configuration.
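
A minimal sketch of such a walk, assuming the list of commit hashes (newest first, e.g. as produced by git rev-list) is already available; the function name and the context size are illustrative:

def history_position(all_hashes, target_hash, context=3):
    """Return the position of target_hash and its surrounding commits."""
    for position, commit_hash in enumerate(all_hashes):
        if commit_hash == target_hash:
            start = max(0, position - context)
            end = position + context + 1
            return position, all_hashes[start:end]
    return None  # commit not found in this history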

The blame feature that starts looking from a specified commit required an extension of the blame functionality: the commit hash to start looking from now had to be specified too. The button triggering this turned out to be simple. It only had to request the data from the diff view, determine the right commit and pass both to the blame view. The blame view could then be used in exactly the same way, with the difference that it now showed an older point in time of the repository.
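
With plain Git this corresponds to passing a revision to git blame, as in the sketch below; VisualBlame's internal API for this may differ, so the function is only illustrative.

import subprocess

def blame_from(repo_path, commit_hash, file_path):
    """Blame file_path as it was at commit_hash, using `git blame <rev> -- <path>`."""
    output = subprocess.run(
        ["git", "blame", "--porcelain", commit_hash, "--", file_path],
        cwd=repo_path, capture_output=True, text=True, check=True)
    return output.stdout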

Unfortunately, some features didn't make it because their implementation would have cost too much development time. The biggest feature that didn't make it was the history of the files shown in the blame view. The other big missing feature was clicking on a commit in the log view to get its diff results back, just as when clicking on a line in the blame view. This means there were no options to go back to newer versions of files, so VisualBlame could still be significantly improved.


CHAPTER 6

Technical Evaluation

The technical evaluation of VisualBlame consisted of three parts. In the first part the tool was validated, to check whether its results are correct. In the second part the tool's performance was measured in terms of time and memory, to see whether the tool performs to an acceptable standard. In the last part the tool was compared to some of the other tools discussed earlier in the related work chapter. In this comparison the tools were used in different scenarios and the number of points of interest per action was measured. The data from the last user research was used to weight the points of interest in the different scenarios.

Three repositories were considered for use during the technical evaluation. Firstly, the Linux repository was considered, as it is known to be very active and old. Secondly, the Rust repository was considered, because it is an open source project that currently gets a lot of attention. Lastly, the Kivy repository was considered, as it had become familiar during the creation of the tool. To decide which repository to use in each part, some statistics were gathered from the repositories, listed in the table below. The decisions themselves are discussed in the different parts of the technical evaluation.

                            Linux          Rust           Kivy
Age (years)                 11             6              5.5
Total contributors          15375          1525           320
Total commits               602000         54000          10000
Average monthly commits     4392           746            142
Lowest monthly commits      946 (2005-04)  101 (2010-09)  36 (2011-12)
Average commit changes      117            162            43

Table 6.1: Repository statistics with rounded values. The data was gathered on May 31, 2016.

6.1 Validation

The goal of the validation was to confirm that VisualBlame shows the correct data. Therefore the results of the tool were compared to those of the Git commands, as their output can be assumed to be correct. The Linux repository was used during the validation, because it contains a lot of commits and is very active, as shown in table 6.1. The results were gathered from the kernel/Makefile file, a file with over one hundred contributors. The results of VisualBlame and the Git commands are shown below.


Figure 6.1: The results after clicking on line 6 of kernel/Makefile, which is displayed in the left code view.

Figure 6.2: Result from the command git blame kernel/Makefile.


Figure 6.4: Result from the command git show 5cee96459

Figure 6.5: Result from the command git log --pretty=format:"%C(yellow)%h%Creset%x09%an%x09%ad%x09%s" --date=short 9e1e0

Figure 6.6: Result from the command git diff 5cee96459^ 5cee96459 --stat --find-renames

The results from VisualBlame in figure 6.1 on the preceding page match the results from the Git commands shown in the other figures below it. This validates that the tool gathers and processes the data from the Git database correctly. The only notable difference is that the diff summary of VisualBlame doesn't show renamed files without changes, as can be seen when comparing it to the result in figure 6.6. This is a small weakness of the tool, as it thereby misses a bit of relevant information; a future version of VisualBlame could improve on this.

6.2 Performance

While performance was not the focus of this thesis, it was still considered important to give potential users an indication of VisualBlame's performance. Therefore time and memory measurements were performed while using VisualBlame in a small session. During this session a file of the Kivy repository was used, because Kivy has an active repository with a considerable number of commits and contributors, as can be seen in table 6.1 on page 29. This makes it a realistic repository for VisualBlame to be used on. The measurements were performed on two different laptops, an Asus X301A-RX079V and an Asus X550CC-XX424H.


Number  Action                                        Results
1       Start VisualBlame with kivy/uix/boxlayout.py  Initial widgets loaded
2       Select line 205                               Blame, diff and log results
3       Select line 206                               Blame, diff and log results
4       Select file kivy/uix/gridlayout.py            Diff view content changed

Table 6.2: The actions in the session that was measured.

The actions of the session and their visual results are shown in table 6.2. The time and memory usage were measured at the start of each action and at the moment its visual results were rendered, except that the time of the first action was not measured, as no significant work is done during it. The expectation was that action 2 would take the longest, because at that moment no data of that action was in the cache yet. The other actions can use part or all of their data from the cache and should therefore take less time. The biggest memory increase was also expected after action 2, because it populates the diff view for the first time and fills the cache with data from each point of interest.
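
The thesis does not state how the measurements were instrumented; one possible way to take such snapshots, assuming a Unix system where ru_maxrss reports peak resident memory in kilobytes, is sketched below.

import resource
import time

_START = time.perf_counter()

def snapshot(label):
    """Print elapsed wall-clock time and peak resident memory of this process."""
    elapsed = time.perf_counter() - _START
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # kB on Linux
    print(f"{label}: t={elapsed:.3f}s, peak_rss={peak_kb} kB")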

Figure 6.7: Results from the time measurement, the results are averages of 5 measurements.

Figure 6.8: Results from the memory measurement, the results are averages of 5 measurements.

As can be seen in figure 6.7, the time measurements don't match the initial expectation. While action 2, as expected, took more time to complete than action 3, action 4 unexpectedly took the longest to complete, even though action 4 can get all its data from the cache whereas actions 2 and 3 can't. The factor that likely caused this unexpected result is that the file clicked in action 4 contains almost three times as many lines as the originally opened file, meaning that many more widgets in the diff view have to be rendered. This could also explain the unexpected result of the memory measurements in figure 6.8, where the memory increases after action 4 even though no new data is stored in the cache. The other memory results did match the initial expectations. The measurements also show that the performance is consistent on both laptops: the same increase and decrease patterns can be observed between the different measurements. Lastly, the memory measurements do indicate that VisualBlame uses quite some memory. Especially the observation that the memory increases by more than a factor of two between action 1 and 2 raises concerns, but further in-depth measurements are needed to identify how much of this increase comes from the code views and how much comes from the cache.

6.3 Comparison

In this last part of the technical evaluation VisualBlame is compared to some of the existing tools. The main goal of this comparison was to validate that VisualBlame shows more useful information related to a piece of code than other tools. This is done by going through the five scenarios explained in section 5.4.1 on page 23 with VisualBlame and the other tools, keeping track of the points of interest found per action by the different tools. These points of interest are then weighted based on their importance in the scenario. The scenarios themselves were weighted in the same way as the points of interest, which makes it possible to compare results between scenarios. The weights are calculated based on the results of the second user research, where participants could indicate how often each point of interest was useful in the different scenarios. They could choose between the two extremes "often" and "never", and the middle ground "sometimes". The extremes were represented as 1 and 0, resulting in the following simple formula to calculate the weight of a point of interest (poi) in a scenario:

\[
\text{poi weight} = \text{often percentage} + \tfrac{1}{2} \cdot \text{sometimes percentage}
\tag{6.1}
\]

During each scenario the goal was to find as many points of interest relevant to the current scenario as possible. In order to make the comparison more objective, there was no subjective goal that had to be reached in the scenarios. Instead, five actions were performed with each tool per scenario. As the next action to take is still a subjective choice, a few additional rules were composed to make this choice more objective. Firstly, the tool's next action had to be the one finding the most points of interest relevant to that scenario. Secondly, it was required to look at older versions of the file used in the scenario if no more relevant points of interest could be found at the current version. After the points of interest were found, the final score of the tool for that scenario was computed using the following formula, where n is the number of points of interest (poi) found:

\[
\text{score} = \frac{\sum_{i=1}^{n} \text{poi}_i \cdot \text{poi}_i\ \text{weight}}{\text{actions}} \cdot \text{scenario weight}
\tag{6.2}
\]

Tig was chosen as one of the tools to compare VisualBlame with, because it also has options to show more related data of a piece of code in the same view. Github was the other chosen tool, representing the GUI category, because it is a popular GUI for Git. The Kivy repository was used during the comparison, because it is more familiar than the other repositories as a result of using it during this thesis. This allows for more realistic starting points for each scenario. The results of the comparison are shown in figure 6.9 on the following page; the weights and the actions taken can be found in appendix chapter A on page 41.
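
A literal Python reading of formulas 6.1 and 6.2 could look as follows; the function and argument names are illustrative, and the thesis does not prescribe any particular implementation.

def poi_weight(often_pct, sometimes_pct):
    # Formula 6.1: extremes count as 1 ("often") and 0 ("never"),
    # "sometimes" counts for half.
    return often_pct + 0.5 * sometimes_pct

def scenario_score(found_counts, weights, actions, scenario_weight):
    # Formula 6.2: found_counts[i] is how often point of interest i was found,
    # weights[i] its weight from formula 6.1.
    total = sum(n * w for n, w in zip(found_counts, weights))
    return total / actions * scenario_weight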


Figure 6.9: The results of the comparison, showing the weighted points of interest per action of the tools in the different scenarios.

The results show that VisualBlame indeed scores better than the other tools in the different scenarios. However, this was expected: VisualBlame was created with this purpose and should thus, by design, score better on the points of interest per action metric than the other tools. Furthermore, the results show that the difference between the tools in the more important iterations and related entities scenarios is relatively small. This means that, of the three most frequent scenarios, VisualBlame may not add much over the other tools in two of them. This is hard to judge with the points of interest per action metric though, as it doesn't directly translate to how useful the tool is in a scenario.

The comparison also has some other weaknesses. For instance, it doesn't take into account that the required knowledge of a scenario may already be found within fewer than five actions. The way the score works also resulted in some illogical actions giving a higher score than logical actions. This was for example seen in the iterations scenario with Tig. In this scenario, using Tig's functionality to go back to a previous iteration of the file would be a logical action, but this only yields the blame annotation point of interest. Looking at the diff changes resulted in a higher score, because the points of interest found by that action combine into a higher score. Lastly, the comparison doesn't consider the overlap between some of the points of interest; the commit metadata, for example, doesn't offer more useful information than the blame annotations in the age scenario, where only the date is important. However, even with these weaknesses, the comparison still validates that VisualBlame gives a better overview of the Git data related to a piece of code in one action than the other tools do.


CHAPTER 7

Conclusions

In this thesis a new interactive tool called VisualBlame was proposed to help developers during the process of understanding a codebase. The tool was designed to support the different kinds of data found in the Git database, while providing interaction options that were influenced by background research on how programmers understand a codebase. The tool was successfully implemented in multiple iterations, with a user research between the iterations to gather feedback on the tool. The user researches also served to learn more about how developers understand a codebase and what information is important during this process.

Through the user researches it was found that Git is one of the more popular information sources to assist with understanding a codebase. Especially its commit messages were found to be helpful by the participants; they were used as much as the documentation and more than the code history. It was also discovered how useful the different kinds of data found in the Git database are in different types of situations, and that each of them was useful in at least some situations.

The technical evaluation of VisualBlame validated that it shows the correct data. This means the tool successfully manages to show all the different data from the Git database in one view. The performance measurements of the tool did raise a few concerns though: the tool's performance seemed to be strongly influenced by the number of lines of the viewed file. As performance wasn't the focus of this thesis, more in-depth measurements are needed to confirm these observations. Lastly, the tool was compared to other tools using the points of interest per action metric, with the points of interest weighted by their importance in the measured scenario, calculated from the user research data. This confirmed that VisualBlame indeed does a better job at showing an overview of Git's data relevant to a piece of code than the other tools. However, the metric used has its weaknesses and favors VisualBlame, which was designed to score well on such a metric.

7.1 Future work

VisualBlame can still be improved in a number of ways. Due to a lack of time, the cache currently has no size limit. While the performance measurements did indicate this is most likely not an issue, they only covered a small session and more in-depth measurements are needed to confirm this. Two other features that didn't make it are the interaction options that would allow the user to go back to previously opened files and to view the changes of a commit shown in the top view. These options would improve the tool by allowing the user to go back to newer versions of files in the repository, instead of only going to older versions. Lastly, the second user research revealed that potential users responded well to extensions of the tool in the form of showing more related files based on line content and allowing them to open the currently viewed file in their preferred editor.


More research could also be done into the process of understanding a codebase. While the first user research made it clearer what tools and information are important to users while understanding a codebase, it was limited by its small number of participants. A similar study with more participants could therefore reveal different answers.
