
“Recombination Contest”: Crowdsourcing Software Architecture and Design

 

Final Version: 14th of August, 2014
Supervisor: P. Lago

First Examiner:          Signature:

Second Examiner:         Signature:

Author: Luxi Jiang
Student Number: 10634193

Thesis Master Information Science – Business Information Systems
University of Amsterdam, Faculty of Science


1 Table of Contents

2 Abstract
  2.1 Keywords
  2.2 General Terms
3 Introduction
4 Theoretical Background
  4.1 What is Crowdsourcing?
  4.2 Advantages of Crowdsourcing
  4.3 Design Working Styles
  4.4 Competition-based Crowdsourcing
  4.5 Three Conditions to be Crowd Smart
  4.6 Recombination
  4.7 Our Method and Justification
5 Research Questions
  5.1 Scientific Relevance
6 Methods
  6.1 Study Design
    6.1.1 Overview
  6.2 Participants
    6.2.1 Recruiting Process
    6.2.2 Selection and Target Size
    6.2.3 Demographics
    6.2.4 Compensation and Reimbursement
    6.2.5 Informed Consent Process
    6.2.6 Privacy
  6.3 Tasks
    6.3.1 Summary of Design Prompt
  6.4 Procedure
    6.4.1 Detailed Description
    6.4.2 Participant Dropouts and Final Sample
    6.4.3 Data Collected
    6.4.4 Data Analysis
7 Findings and Results
  7.1 Designs
    7.1.1 Design Process
    7.1.2 Structure and Characteristics of Designs
    7.1.3 Predictors of Design Quality
  7.2 Design Evaluations
    7.2.1 Winners of Contest
    7.2.2 Accuracy of Peer Rankings
    7.2.3 Ranking Criteria
    7.2.4 Ranking Strategy
    7.2.5 Accuracy of Self-Ranking
    7.2.6 Reactions to Seeing Other Designs
  7.3 Design Changes
    7.3.1 Improvement
    7.3.2 Predictors of Improvement
    7.3.3 Recombination Process
  7.4 On the Contest
    7.4.1 Contest Effectiveness
    7.4.2 Gamification
    7.4.3 Suggestions
8 Discussion and Critical Reflection
  8.1 Discussion of Main Findings
  8.2 Limitations and Future Recommendations
9 Conclusion
10 Acknowledgements
References
11 Appendices
  11.1 Appendix A: Recruitment Email
  11.2 Appendix B: Design Prompt
  11.3 Appendix C: Interview Protocol
  11.4 Appendix D: Interview Transcriptions
    11.4.1 Coconut
    11.4.2 Banana
    11.4.3 Lemon
    11.4.4 Cantaloupe
    11.4.5 Watermelon
    11.4.6 Mandarin
    11.4.7 Apricot
    11.4.8 Lime
    11.4.9 Pomelo
    11.4.10 Peach
  11.5 Appendix E: Individual Peer Rankings
    11.5.1 Round 1 – Experimental Group
    11.5.2 Round 2 – Control Group
    11.5.3 Round 2 – Experimental Group

2 Abstract  

This study addresses the question of whether the remarkable successes that crowdsourcing is having in revolutionizing other domains (such as medical drug development, data mining, and logo design) can be brought specifically to software architecture and design. Software development tasks are often interdependent, complex, and heterogeneous, and are therefore difficult to break down into the simpler, distributable tasks required by the conventional crowdsourcing model. In this paper, we present our first experiment with our proposed method (called “Recombination Contest”) to enable crowdsourcing of software architecture and design processes.

Our proposed method consists of a contest setup with two design rounds and one recombination step in between. In the first round, participants had to submit an initial design, and in the second round they were given the opportunity to improve and revise their design. During the recombination step, participants were exposed to the initial designs of their peers and were encouraged to borrow ideas from each other for their revised designs. The experimental condition was a ranking task in the first round: the experimental group was required to rank the initial designs submitted by their peers. In the second round, both the control and the experimental group had to rank the revised designs of their peers. Finally, each contestant was interviewed about his or her design activities, ranking process, and overall opinions about the contest.

We were primarily interested in whether (1) our approach could help designers improve design quality and (2) peer ranking had a positive effect on improvement. In addition, we analyzed (3) whether the crowd is able to accurately identify good designs. The main findings of our study all provide support for the “Recombination Contest” crowdsourcing method.

2.1 Keywords

Crowdsourcing; recombination; collaboration; competition; software architecture and design; software development

2.2 General Terms

Design

3 Introduction  

Crowdsourcing is a widely used strategy that emerged in the 1990s (Thurlow & Yue, 2012). It involves delegating a variety of tasks conventionally carried out by professionals to an unknown workforce (i.e., the crowd), which performs them in a collaborative manner (Stol & Fitzgerald, 2014). With today’s Web 2.0 technologies, organizations are able to tap into a workforce comprising anyone with an Internet connection (Stol & Fitzgerald, 2014). This emerging and promising approach has been adopted quite successfully in various fields, such as chemical structure identification (Oprea et al., 2009), data mining (Brew et al., 2010), medical drug development (Norman et al., 2011), logo design (Araujo, 2013), and software development (Jayakanthan & Sundararajan, 2011).

One of the leading crowdsourcing platforms is Amazon Mechanical Turk (AMT; https://www.mturk.com/mturk/welcome). The tasks posted on AMT are known as HITs (Human Intelligence Tasks) or micro-tasks. Micro-tasks are characterized as short, self-contained, simple, and repetitive pieces of work (Stol & Fitzgerald, 2014). By dividing the overall task into micro-tasks, work can be performed independently and in parallel, requiring little time, cognitive effort, and specialized skill (Stol & Fitzgerald, 2014). Crowdsourcing is particularly effective for micro-tasks, such as tagging images (Nowak & Rüger, 2010) and translating small snippets of text (Zaidan & Callison-Burch, 2011).

This study addresses the question of whether the same remarkable successes that crowdsourcing is having in revolutionizing other domains can be brought to software architecture and design specifically. According to Stol & Fitzgerald (2014), software engineering is rapidly shifting away from small, isolated groups of developers towards organizations and communities involving many people. Companies are increasingly using crowdsourcing to execute particular software development tasks (Stol & Fitzgerald, 2014). In contrast to micro-tasks, software development tasks are often interdependent, complex, and heterogeneous, and hence difficult to break down into the simpler, distributable tasks required by the conventional crowdsourcing model (Park et al., 2013). However, cases of crowdsourcing complex tasks can be found on, for example, InnoCentive (http://www.innocentive.com/), which deals with problem solving and innovation projects.

Examples of software crowdsourcing platforms include uTest (http://www.utest.com/) and TopCoder (http://www.topcoder.com/). Software crowdsourcing can be broadly classified into two categories, namely (1) outsourcing only, and (2) outsourcing with competitions (Wu et al., 2013). uTest outsources without competitions, and TopCoder is an example of the latter, in which people contribute their software and compete for prizes.

Crowdsourcing is a multi-disciplinary research topic, and to date studies in the software engineering domain are scarce (Stol & Fitzgerald, 2014). More specifically, research in crowdsourcing software architecture and design remains unattempted. In this paper, we aim to provide preliminary support for CrowdDesign (i.e., crowdsourcing software architecture and design) by suggesting a new competition-based method to enable crowdsourcing of software architecture and design processes. This new method is different from existing competition-based crowdsourcing methods as it allows contestants to borrow ideas from each other for their designs during a so-called recombination step.

In the remainder of this article, we first provide theoretical background on the definition of crowdsourcing and its advantages. Next, we analyze existing design working styles and provide justification for our own method. We then propose our main research questions regarding the new method and describe the research methods used, including the study design, participants, tasks, and procedure. After analyzing the data gathered from the participants (including submitted designs, peer rankings, and conducted interviews), our findings and results are reported. Last, we critically reflect on and discuss the limitations of our current research design, and propose future recommendations.

4 Theoretical  Background  

4.1 What  is  Crowdsourcing?  

Multiple definitions exist for the term ‘crowdsourcing’. Howe (2006), who coined the term, defines it as follows:

Crowdsourcing is the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call.

For crowdsourcing software development, Stol & Fitzgerald (2014) present the following definition:

The accomplishment of specified software development tasks on behalf of an organization by a large and typically undefined group of external people with the requisite specialist knowledge through an open call.

4.2 Advantages  of  Crowdsourcing  

Stol & Fitzgerald (2014) list several potential benefits to the use of crowdsourcing in general, which are also applicable to the field of software development specifically:

- Cost reduction (lower development costs)
- Faster time-to-market (across time zones and parallel development)
- Higher quality (broad participation)
- Creativity and open innovation (variety of expertise)

4.3 Design Working Styles

Park et al. (2013) identify two broad elements of the industrial design process, namely collaboration and competition. They categorize and place different design working styles (also applicable to software design) along these two dimensions, including their own method (i.e., Crowd vs. Crowd; see Figure 1).

- Single design team: A single body is responsible for all design activities. There is no external competition or collaboration; the crowd is not involved.
- Design contest: A large number of independent bodies externally compete to win, without collaboration.
- User involvement: A single design team externally collaborates by involving users.
- Crowd vs. Crowd: A crowd of people is motivated to actively participate in design on an open platform, where they form several competing teams.

Figure 1. Design working styles (Park et al., 2013)

4.4 Competition-based Crowdsourcing

In competition-based crowdsourcing, multiple agents simultaneously produce versions of a design at the request of a principal who is after the design of the highest quality. This contest model matches one of the standard approaches to crowdsourcing, in which numerous agents invest effort to submit a design and compete for prizes (Cavallo & Jain, 2012). Whether a design wins depends on its quality relative to the designs submitted by the other contestants (Cavallo & Jain, 2012).

Very few studies have considered that simultaneous collaboration and competition can be beneficial in contest communities and that both need to be emphasized at the same time (Hutter et al., 2011). Hutter et al. (2011) argue that their concept of ‘communitition’ (a portmanteau of community-based collaboration and competition) should include the principles of competitive participation without removing the space for collaboration, as user discussions and comments improve the quality of submitted designs.

4.5 Three  Conditions  to  be  Crowd  Smart  

According to Surowiecki (2005), there are three conditions necessary for the crowd to be smart and wise:

1. Diversity of opinion (each person should have some private information, even if it is just an eccentric interpretation of known facts)


2. Independence (people’s opinions are not determined by the opinions of those around them)

3. Decentralization (people are able to specialize and draw on local knowledge)

Furthermore, groups benefit from members talking to and learning from each other; however, this comes at the cost of the crowd’s independence, which means that too much communication can actually make the group as a whole less intelligent (Surowiecki, 2005). The real key is not perfecting a particular method, but rather satisfying the conditions that a group needs to be smart: diversity, independence, and decentralization.

4.6 Recombination  

Yu & Nickerson (2011) proposed an iterative design process of idea generation, evaluation, and combination to design a chair for children. They showed that the crowd could produce more creative ideas through combining and transforming the previous generation of design ideas. However, the design output of this method did not reach the level of design quality needed for real-life application. Park et al. (2013) attributed this to the fact that the study involved a non-designer crowd lacking the necessary design skills and expertise.

4.7 Our  Method  and  Justification  

The three central problematic areas (cost, time, and quality) inherent in the so-called “software crisis” (Karlsson et al., 2014) are directly addressed by the first three advantages provided by the use of crowdsourcing mentioned in Section 4.2. Hence, it is not surprising that quite a few authors have argued that crowdsourcing could become a standard approach to software development (Stol & Fitzgerald, 2014). Our proposed method for crowdsourcing software architecture and design is called “Recombination Contest”, which is placed in between “Design Contest” and “Crowd vs. Crowd” in the graph (see Figure 2). Basically, “Recombination Contest” is a two-round design contest with a recombination step in between.

Figure 2. Our method, “Recombination Contest”, added to the graph

 

With existing working styles, we observe that, when competition is used, it comes with either direct collaboration (Crowd vs. Crowd) or no collaboration (Design Contest). According to Hutter et al. (2011), it is favorable to have ‘communitition’, however Surowiecki (2005) argues that too much communication can make the crowd less intelligent. Therefore, we want to explore somewhere in between the extremes and name it “indirect collaboration”, which means collaborating without direct communication. Recombination of designs can be recognized as “indirect collaboration”, as it does not involve direct communication between contestants.

(8)

During the recombination step, designers are exposed to the initial designs of other designers and, therefore, are able to borrow ideas from each other for their revised designs. The recombination step acts as collaboration without direct communication among participants, as sharing designs can be perceived as some kind of indirect communication, and borrowing ideas perceived as collaboration.

The combination of competition and indirect collaboration (Recombination Contest) balances the needs of collaboration (to improve the quality of designs; Hutter et al., 2011) and less communication (to make the crowd more intelligent; Surowiecki, 2005).

Furthermore, we chose competition-based crowdsourcing to ensure higher quality, as people with software development talent self-select on the basis that they have the necessary expertise to participate in contests where the design with the highest quality will be chosen as the winner (Stol & Fitzgerald, 2014).

Following the study conducted by Yu & Nickerson (2011) on recombination, Park et al. (2013) argue that to maintain a sufficient level of design quality, the necessary design skills and expertise are needed in the crowd. Also, returning to our working definition of crowdsourcing software development by Stol & Fitzgerald (2014), the crowd needs to possess the requisite specialist knowledge. Therefore, taking this requirement into consideration, we prescreened all candidates prior to testing our new method to ensure a designer crowd with the requisite specialist knowledge.

5 Research Questions

We were primarily interested in whether our approach could help designers improve design quality and if peer ranking had a positive effect on improvement. In addition, we wanted to analyze whether the crowd is able to identify good designs through peer ranking. We have tested our new crowdsourcing method with the following three main research questions:

1. Do designers improve their designs after recombining their own design with other designs?

2. Does ranking other designs have a positive effect on improving design quality of the revised design?

3. Are designers able to identify good designs?

Section 6.4.1 offers explanations for these research questions.

5.1 Scientific Relevance

Given the importance of architecture and design activities to software development, and the importance of software to society as a whole, potential social and scientific benefits are substantial. This study may lead to new and improved methods to crowdsource software architecture and design processes for developers that significantly increase the productivity and/or the quality of their work, and also make software development less tedious and frustrating. This, in turn, could result in less expensive and faster ways to produce software, making it cheaper, more readily available, and more responsive in its development to user needs.


6 Methods  

6.1 Study Design

6.1.1 Overview

Participants had the option to enroll in one of two design competitions: one for software architecture and design, and one for user interface design. This research paper focuses solely on the software architecture and design competition. Both competitions were run in parallel. The competition consisted of two rounds in which selected participants were required to submit an initial design (first round) and a revised design (second round). In between the two rounds, participants were given the opportunity to look at the initial designs submitted by other participants in their pool and were strongly encouraged to use these as inspiration for their own revised design (i.e., a recombination step). The participants were divided equally into a control and an experimental group. The experimental condition was to have the participants rank the initial designs of the other participants in their group (excluding their own design). However, all participants, in both the control and the experimental group, were required to rank the revised designs of other participants after the second round. Finally, all participants took part in an interview. See Table 1 for an overview of the experimental conditions together with the time given for each task in the competition.

               First round      Ranking?   Second round     Ranking?   Interview
Time given     One week         3 days     One week         3 days     30-45 min
Control        Initial design   No         Revised design   Yes        Yes
Experimental   Initial design   Yes        Revised design   Yes        Yes

Table 1. Experimental conditions and timeline

6.2 Participants  

6.2.1 Recruiting  Process

The recruitment process involved both passive and active recruitment through direct contact. Passive recruitment involved postings to professional mailing lists and websites (e.g., Reddit, Facebook, Craigslist, 99Designs, TopCoder, Wordpress) and postings to student mailing lists (i.e., undergraduates and graduates in the Donald Bren School of Information and Computer Sciences). See Appendix A for the email that was used for announcements to mailing lists. Active recruitment involved direct correspondence with contacts in industry, and with students and staff at the University of California, Irvine, and other universities, such as Carnegie Mellon University and the University of Southern California. Candidates had to fill out a Google Docs form (https://docs.google.com/forms/d/160nFaxGcUffSHoJPvcqH5EwsW9FYVTJEPQZx0qj1b1k/viewform?usp=send_form) for pre-screening purposes.

6.2.2 Selection  and  Target  Size

Candidates were selected for participation in the contest based on the number of years of industry experience in software architecture and design (minimum of two years), educational background, and skills in languages and tools. We selected twenty qualified participants, equally divided over the control and experimental groups. This target size was determined by our capacity to analyze the data.

6.2.3 Demographics

All participants are developers, people who write software as part of their studies or professionally as part of their job. Most participants were students. The developer population has an overrepresentation of males. In our selected sample, there were only three female participants. The age of the participants ranged between 23 and 34, with years of experience in software development ranging between 2 and 10.

6.2.4 Compensation  and  Reimbursement

Participants were compensated $100 for their participation at the conclusion of the full study period (28 May – 27 June 2014). Participants who dropped out of the competition were compensated in a pro-rated manner (i.e., participants were paid $45 for completing the first-round design, with the interview weighted at 10% of the full compensation). Additionally, two $1000 prizes were awarded, one for the best software architecture and design at the end of the first round, and one for the second round. The winning participant from the first round remained eligible for the prize in the second round.

6.2.5 Informed  Consent  Process

After participants were selected for inclusion, written consent forms were obtained from them prior to the beginning of the competition. They were asked to print, sign, scan (as all of them were participating remotely), and return a copy of the informed consent form by email.

6.2.6 Privacy

To prevent private or confidential information from being publicly disclosed, participants were coded into fruit names.

6.3 Tasks  

6.3.1 Summary  of  Design  Prompt

Participants were given a design prompt in which they were tasked with designing a traffic flow simulation program for students that shows how traffic signal timing works by allowing them to create different traffic signal timing schemes. Traffic signal timing involves determining the amount of time that each traffic light spends being green, yellow, and red at an intersection to allow cars from each direction to flow through the intersection smoothly.

There are four system requirements that participants need to follow when designing the system. Students must be able to:

1. Create a visual map of an area, laying out roads and intersections (it should accommodate at least six intersections).

2. Describe the behavior of the traffic lights at each of the intersections (it should accommodate protected left turns).

3. Simulate traffic flows on the map, based on the map and intersection timing schemes they have created.

4. Change the traffic density that enters the map on a given road.
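To make the prompt concrete, the sketch below outlines one possible decomposition of such a simulation program. It is purely illustrative: the class and attribute names are assumptions made here and are not taken from the design prompt or from any participant's submission.

# Illustrative sketch of one possible decomposition of the prompted system.
# All names (Road, TimingScheme, Intersection, TrafficMap, Simulator) are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Road:
    name: str
    endpoints: Tuple[str, str]        # ids of the intersections it connects
    inbound_density: float = 0.0      # requirement 4: adjustable traffic density

@dataclass
class TimingScheme:
    green: float                      # seconds spent green
    yellow: float
    red: float
    protected_left: bool = False      # requirement 2: protected left turns

@dataclass
class Intersection:
    ident: str
    timings: Dict[str, TimingScheme] = field(default_factory=dict)  # per approach (N/S/E/W)

@dataclass
class TrafficMap:
    intersections: Dict[str, Intersection] = field(default_factory=dict)  # requirement 1: at least six
    roads: List[Road] = field(default_factory=list)

class Simulator:
    """Requirement 3: advance traffic flow over the map in discrete time steps."""
    def __init__(self, traffic_map: TrafficMap) -> None:
        self.map = traffic_map

    def step(self, dt: float) -> None:
        # Move vehicles according to light states and road densities (omitted here).
        pass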

Participants were asked to create the architecture and design of the system; user interface design was not needed. This required them to focus on the important design decisions that form the foundation of the implementation. They could use whatever representations of design they found appropriate. Participants were informed that their designs would primarily be evaluated based on their elegance, clarity, and completeness. For the full description of the given design prompt, see Appendix B.

6.4 Procedure

6.4.1 Detailed  Description

All communication with selected participants was done through email. Participants were sent an email with the design prompt and were given a one-week period to work individually on their design and upload it to Dropbox (via http://www.dropitto.me/). Participants were free to choose how much time they wished to spend on the task. They were not allowed to exchange ideas on the design and worked independently. Designs had to be submitted in PDF format.

At the conclusion of the one-week period, a recombination step took place in which participants were instructed to read half of the designs produced by the other participants in the competition, depending on which group they belonged to (either control or experimental). Designs were distributed to the participants by sharing a Dropbox link. Half of the participants, i.e., the experimental group, were given three days to rank the designs of the other participants in their group and submit their ranking by email. The reason for doing this is to analyze whether there is an effect on the quality of the revised designs: we assume that participants might look more carefully through the designs of other participants if they have to rank them, and hence be better able to incorporate the good aspects of those designs into their own revised design (see Research Question 2). In addition, these peer rankings were used to assess whether participants were able to identify good designs (Research Question 3), and consequently to borrow primarily the good ideas from those designs.

After having received all rankings, we gave participants a second one-week period in which they had the chance to revise their initial design and submit a new one to Dropbox, based on what they learned from examining (and ranking) the other designs. We wanted them to submit a revised design in order to analyze any design-quality improvements after the recombination step (see Research Question 1).

At the conclusion of the second one-week period, participants were again asked to read the subset of revised designs from other participants in their pool, and all participants (both control and experimental) were given three days to rank the revised designs and submit the rankings by email. This allowed us to analyze whether participants were able to accurately rank their peers and identify good designs (see Research Question 3).

Finally, participants were required to take part in a brief 30-45 minute semi-structured interview over Skype. Interview questions mainly served to inform us about participants’ design and ranking approach during the competition, and their opinions on it. For the interview protocol, see Appendix C.

Winners of the design challenge were announced via email at the end of the experiment. They had the option to be publicly recognized as a winner of the competition on a public website: http://softwaredesignuci.wordpress.com/. See Figure 3 for the experimental setup.


Figure 3. Experimental setup

6.4.2 Participant  Dropouts  and  Final  Sample

Eight participants dropped out during the first round and two participants dropped out during the second round, resulting in a final sample of ten participants (N=10). See Table 2 for an overview of the participants (coded by their fruit names) divided into the control and experimental group.

Control group (N=5) Experimental group (N=5)

Watermelon Lime

Cantaloupe Peach

Banana Pomelo

Coconut Mandarin

Lemon Apricot

Table 2. Participants in control and experimental group

6.4.3 Data  Collected

The collected data include all designs from participants in the first and second round, the rankings by the experimental group of half of the designs submitted in the first round, the rankings by all participants of all designs submitted in the second round, and recorded interview data.

6.4.4 Data  Analysis

We used a combination of qualitative and quantitative methods for analyzing our data. Quantitatively, we have quality ratings for each design (from both rounds) produced by an expert panel of four using double-blind judging. Based on the quality ratings given for each design, a final ranking was produced (apart from the ranking given by the participants), separately for the control and experimental group as well as in absolute terms, meaning that the rankings of both groups are combined. Winners of the contest in the first and second round were selected according to the absolute expert rankings. In addition, we examined the extent to which participants improved the quality of their designs in the second round using the expert quality ratings in a dependent samples t-test, and whether ranking the first-round designs (experimental condition) helped participants improve the quality of their designs more by performing an independent samples t-test. Furthermore, we analyzed whether participants were able to rank designs in correspondence with the rankings produced by the expert panel by using comparison matrices.

Quantitative data were extracted from the interviews by manually entering all numbers into an Excel spreadsheet. These include hours spent on creating designs, self-rankings, perceived task difficulty, perceived ranking difficulty, hours spent on ranking designs, and years of industry experience. Other quantitative data, such as the number of pages submitted for the designs and the amount of ideas borrowed by the participant, were counted from the designs and entered into Excel. Subsequently, we imported all these quantitative data from Excel into SPSS, together with the previously mentioned quantitative data (such as expert ratings, expert rankings, and participant rankings). We performed Pearson’s correlation analyses to investigate the relationship between each pair of quantitative variables in our data set. The list of variables used in the analyses can be found in Table 3.

Variable Name: Meaning

Hrs_first: Number of hours spent on creating the design in the first round by the participant.
Hrs_second: Number of hours spent on creating the design in the second round by the participant.
Rank_first_self: Participant's rank in the first round according to self.
Rank_first_participants (only experimental): Participant's rank in the first round according to other participants.
Rank_first_experts: Participant's rank in the first round according to the expert panel; separate ranks for control and experimental group.
Absolute_rank_first: Participant's absolute rank in the first round according to expert panel grading, which means that the rankings for control and experimental groups are combined into one ranking.
Grading_first: Participant's grade for their design in the first round given by the expert panel.
Rank_second_self: Participant's rank in the second round according to self.
Rank_second_participants: Participant's rank in the second round according to other participants.
Rank_second_experts: Participant's rank in the second round according to the expert panel; separate ranks for control and experimental group.
Absolute_rank_second: Participant's absolute rank in the second round according to expert panel grading, which means that the rankings for control and experimental groups are combined into one ranking.
Grading_second: Participant's grade for their design in the second round given by the expert panel.
Task_difficulty: Perceived difficulty of the design task on a Likert scale of 1-7.
Rank_difficulty: Perceived difficulty of ranking other people's designs on a Likert scale of 1-7.
Hrs_rank_first (only experimental): Number of hours spent on ranking other people's designs in the first round.
Hrs_rank_second: Number of hours spent on ranking other people's designs in the second round.
Improvement: Improvement in the grade given by the expert panel in the second round compared to the first round, calculated as Grading_second minus Grading_first.
Yrs_experience: Participant's number of years of industry experience.
Page_first: Number of pages submitted by the participant in the first round.
Page_second: Number of pages submitted by the participant in the second round.
Page_increase: Increase in the number of pages submitted by the participant in the second round compared to the first round, calculated as Page_second minus Page_first.
Amount_ideas: Amount of ideas borrowed by the participant from other designs during the recombination step.

Table 3. List of variables used in analyses
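As an illustration of the quantitative analyses described above, the sketch below shows how the same tests (a dependent samples t-test, an independent samples t-test, and Pearson correlations) could be run with SciPy instead of SPSS. It is a minimal sketch under stated assumptions: the file name contest_data.csv and the Group column are hypothetical, and the other column names mirror Table 3.

# Minimal sketch of the analyses in Section 6.4.4 using SciPy instead of SPSS.
# The file name "contest_data.csv" and the "Group" column are assumptions for illustration;
# the remaining column names follow Table 3.
import pandas as pd
from scipy import stats

data = pd.read_csv("contest_data.csv")

# Research Question 1: did design quality improve between rounds? (dependent samples t-test)
t_dep, p_dep = stats.ttest_rel(data["Grading_second"], data["Grading_first"])

# Research Question 2: did the ranking task help? (independent samples t-test on revised designs)
experimental = data.loc[data["Group"] == "experimental", "Grading_second"]
control = data.loc[data["Group"] == "control", "Grading_second"]
t_ind, p_ind = stats.ttest_ind(experimental, control)

# Pearson correlation between a pair of quantitative variables,
# e.g. years of industry experience versus first-round grade.
r, p = stats.pearsonr(data["Yrs_experience"], data["Grading_first"])

print(f"paired t = {t_dep:.3f} (p = {p_dep:.3f}), "
      f"independent t = {t_ind:.3f} (p = {p_ind:.3f}), r = {r:.3f} (p = {p:.3f})")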

Qualitatively, two persons independently compared each participant’s revised design with their initial design and tried to find possible changes made from looking at the other designs to which this participant had access. The findings of each person were put in a “who copied what from whom”-matrix and later combined into one matrix, with overlapping findings merged into one (increasing reliability). We used this matrix during the interviews to confirm participants’ borrowing activities. Furthermore, we transcribed all interviews (see Appendix D for interview transcriptions), and two researchers separately built cards for card sorting (Cataldo, Johnson, Kellstedt, & Milbrath, 1970) in order to extract qualitative data. Four researchers helped in creating categories for card sorting. This qualitative data was used to support our findings regarding the perceived difficulty of the design task, the criteria participants used in ranking other designs, ranking strategy, participants’ reactions to seeing other designs after the first round, most popular ideas borrowed, types of changes people made in their revised designs, borrowing process (what was easy or difficult; reasons for not incorporating ideas), contest effectiveness, gamification of the competition, and suggestions for improvement.

7 Findings  and  Results  

7.1 Designs  

7.1.1 Design  Process

Participants spent on average 10.65 ± 5.45 and 10.6 ± 11.73 hours on their first and second design, respectively. The average hours spent on the first and second design are about the same, however the standard deviation is high, which means that there is a high variability in the amount of time spent on designs between participants. See Table 4 for the hours spent on the designs by each participant.

Fruit name    Hrs_first   Hrs_second
Watermelon    20          37.5
Cantaloupe    9           24
Banana        5           2.5
Coconut       6           5
Lemon         5.5         2
Lime          11          4
Peach         12          15
Pomelo        10          2
Mandarin      8           8
Apricot       20          6
AVERAGES      10.65       10.6

Table 4. Hours spent on designs

On average participants perceived the design task as moderately difficult, rated as a 4.8 ± 0.63 on a scale from 1 to 7. See Table 5 for the perceived task difficulty of all participants.

Fruit name    Task_difficulty
Watermelon    5
Cantaloupe    4
Banana        5
Coconut       4.5
Lemon         5
Lime          3.5
Peach         5.5
Pomelo        4.5
Mandarin      5
Apricot       5
AVERAGES      4.8

Table 5. Perceived task difficulty (on a scale from 1 to 7)

Participants found that a lack of domain/background knowledge on traffic lights made the design task hard. For example, Lemon reported “There were some specific situations like managing cars in the traffic lights or possible situations that might happen, for those cases you need to be careful on the elements you provide as part of the architecture”, and Mandarin reported “This traffic control is not familiar for me”. Quite a few participants (Mango, Pomelo, Lemon, Cantaloupe, Apricot, and Watermelon) also mentioned that, due to time pressure, tools, and technical issues, they were not able to create a better design, as Pomelo reported “I think mine was least complete, because I didn’t really have much time; I haven’t been around designing UML stuff for a while”. Furthermore, Apricot reported that the task was simple in terms of scope, but there were challenging user needs. The complexity of the system is another indicator of perceived difficulty for participants (Coconut, Mandarin, Watermelon, Cantaloupe, Banana, and Lime), as, for example, Mandarin reported “It was more difficult than a normal task because of the complexity on the technical side”. The recombination step in between the two rounds made the design task easier for Peach, as he could use other designs as examples and see what other people were doing.

7.1.2 Structure  and  Characteristics  of  Designs

The average number of pages submitted by participants in the first round is 10.3 ± 4.92 and in the second round 12 ± 4.59, which results in an increase of 1.7 pages ± 1.57 on average. See Table 6 for number of pages submitted by each participant in each round.

Fruit name    Page_first   Page_second   Page_increase
Watermelon    3            8             5
Cantaloupe    5            7             2
Banana        11           12            1
Coconut       12           13            1
Lemon         14           15            1
Lime          19           20            1
Peach         9            12            3
Pomelo        5            5             0
Mandarin      14           17            3
Apricot       11           11            0
AVERAGES      10.3         12            1.7

Table 6. Page count of designs

The typical sections appearing in the designs are system requirements, quality attributes, simulation design, detailed design, software components, assumptions, benefits, and limitations. The types of diagrams typically used in the designs are class diagrams, sequence diagrams, use case diagrams, static view, dynamic view, and high-level architecture.

7.1.3 Predictors  of  Design  Quality

We have identified several possible predictors of the quality of participants’ designs:

- Expertise of participants

- Time spent on designing

- Ranking task in first round (experimental condition; predictor of second round design quality)

- Time spent on ranking in first round (experimental condition; predictor of second round design quality)

- Perceived task difficulty

- Length of document (number of pages)

Other subjective predictors could be:

- Amount of ideas borrowed from other designs (predictor of second round design quality)

- Clarity of presentation

These predictors of design quality are each analyzed in more detail in the following sections. Design quality is measured by the sum of the grades given for elegance, clarity, and completeness by the expert panel.
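As a concrete illustration using the expert scores reported later in Table 11: Lemon’s first-round quality grade of 18.25 is the sum of the grades for completeness (6.25), elegance (5.75), and clarity (6.25).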

7.1.3.1 Expertise: Were the designers of the best designs the most experienced?

The age of the participants ranged between 23 and 34. Most participants were students. Participants had between 3 and 8 years of experience in software development. The two participants with the most professional experience (8 years) placed first and second in the competition. The winner of both rounds (Lemon) knows the fewest programming languages (one) but is the most highly educated. However, the winner is not the oldest participant. The first runner-up (Lime) is the oldest participant and knows four programming languages. This implies that the best designers are among the most experienced ones in terms of professional experience, but not necessarily in terms of knowledge of languages and tools.

In addition, when running a correlation analysis in SPSS between years of industry experience and grade received for both first and second round, a strong significant positive correlation could be found (first round: r = .664; p = .036, second round: r = .612; p = .025). This also provides evidence that the higher the years of industry experience, the higher the grade for the design.

7.1.3.2 Design  Time:  Did  good  designs  cost  more  time?  

The average number of hours spent on the first design is 10.65 ± 5.45. The winner and first runner-up (Lemon and Lime) spent 5.5 and 11 hours creating the design, respectively. The bottom design (Cantaloupe) spent 9 hours on their design.

The average number of hours spent on the second design is 10.6 ± 11.73. The winner and first runner-up (Lemon and Lime) spent 2 and 4 hours revising their design, respectively. The bottom design (Pomelo) spent 2 hours on their design. This seems to indicate that strong designs do not necessarily take more time to create.

By running a correlation analysis in SPSS, no significant correlation can be found between the time spent creating the designs (both first and second round) and their grades received from the expert panel (first round: r = -.180; p = .620, second round: r = -.486; p = .155).

7.1.3.3 Ranking Task: Did the designers that ranked other designs after the first round produce better revised designs? (experimental condition)

Participants in the experimental group were asked to rank the initial designs of their peers, while the control group was not required to rank. We analyzed whether this ranking task had a positive effect on the grades received from the expert panel for participants’ revised designs. On average, the experimental group scored 14.6 ± 3.66 for the design quality of their revised designs, and the control group 13.2 ± 3.90. This means that the experimental group produced better designs than the control group by 1.4 points, on average. However, an independent samples t-test in SPSS shows that this difference is not statistically significant (t = .5851; p = .5746).

7.1.3.4 Ranking  Time:  Did  the  designers  of  good  designs  spend  more  time  on   ranking?  (experimental  condition)  

Participants in the experimental group were asked how much time they spent on ranking the initial designs of the first round. The second-round winner in the experimental group (Lime) spent 3 hours on ranking the first-round designs. On average, participants in the experimental group spent 2.23 ± 1.79 hours on ranking the designs for the first round. The second-round bottom design (Pomelo) spent half an hour ranking the second-round designs. The top design spent more than the average time on ranking and the bottom design spent less than the average time on ranking. Designers of strong designs thus might have spent more time on ranking. However, the correlation between time spent on ranking designs after the first round and design grades for the second round is not statistically significant (r = .147; p = .813). This indicates that spending more time on ranking in the first round did not necessarily lead to better designs in the second round; in other words, designers of good designs did not necessarily spend more time on ranking designs.

7.1.3.5 Perceived  Task  Difficulty:  Did  designers  of  the  good  designs  think  the  task   is  easier?  

Participants were asked how difficult they believed the task to be on a scale from 1 to 7, with 1 meaning very easy and 7 meaning highly difficult. The winner of both rounds (Lemon) rated the difficulty of the task as 5. The runner-up of both rounds (Lime) gave the task a 3.5. The bottom-ranked designs, Cantaloupe, Watermelon, and Pomelo, rated it a 4, 5, and 4.5, respectively. On average, participants rated the task a 4.8 ± 0.63, indicating moderate difficulty. Given that the winner of both rounds found the task more difficult than average and the runner-up found it easier than average, we can infer that designers of strong designs did not necessarily find the task easy. By looking at the statistics, we also cannot infer a significant correlation between perceived task difficulty and grading for either the first or the second round designs (first round: r = -.037; p = .920, second round: r = 0.127; p = .726).

7.1.3.6 Length  of  Document:  Were  good  designs  bigger?  

The top two designs in the first round had the highest numbers of pages. In the second round, they were among the top 3 in number of pages; Lemon’s 15-page design was exceeded in page count by Mandarin (ranked 5th), who had 17 pages. The bottom designs had 5 pages in the first round (Cantaloupe) and 5 in the second (Pomelo). See Table 7 for a complete overview of the page count of designs in the order of their absolute ranking. The average number of pages is 10.3 ± 4.92 and 12 ± 4.59 in the first and second round, respectively. This implies that stronger designs are generally longer than average. Looking at the statistics, a very strong positive correlation can be found between the page count and the grading of the designs given by experts for both the first and second round, i.e., the higher the grade, the higher the number of pages in the document (first round: r = .801; p = .005, second round: r = .760; p = .011).

Absolute ranking   Round 1       Page count   Round 2       Page count
#1                 Lemon         14           Lemon         15
#2                 Lime          19           Lime          20
#3                 Apricot       11           Apricot       11
#4                 Banana        11           Banana        12
#5                 Mandarin      14           Mandarin      17
#6                 Peach         9            Peach         12
#7                 Coconut       12           Coconut       13
#8                 Pomelo        5            Cantaloupe    7
#9                 Watermelon    3            Watermelon    8
#10                Cantaloupe    5            Pomelo        5
AVERAGES                         10.3                       12

Table 7. Page count of designs in the order of their absolute ranking

7.1.3.7 Amount  of  Ideas  Borrowed:  Did  good  designs  get  more  influences  from   other  designs?  

Second round winner Lemon made three changes that were influenced by Banana’s design (added sensors, concept of events, and framework to design optimal path). Runner-up Lime was influenced by Mandarin, Apricot, and Papaya, and made four changes that were inspired by these designs (UI mockup, functional requirements, off-the-shelf statistical distribution generator, and map controller logic). The bottom design, Pomelo, took three ideas from other participants (system requirements and components, traffic simulator, and traffic flow controller). This leads to the belief that good designs were not necessarily more influenced by others than lower-ranked designs. Moreover, when running a correlation analysis in SPSS with the amount of ideas borrowed and second round design grading received by experts, no statistically significant result can be found (r = .314; p = .377).

See Tables 8 and 9 for the “who copied what from whom”-matrices of the control and experimental group, respectively. Note: Guava and Papaya dropped out after the first round, which means participants were exposed to their initial designs during the recombination step, and some participants borrowed ideas from them.

Control

group Watermelon Cantaloupe Banana Coconut Lemon

Guava (disqualified) Watermelon Quality attributes, static view of architecture Cantaloupe Extended domain model Extended domain model Directions (N, S, E, W) Banana Bi-directional traffic Light scheme Coconut Assumptions, benefits, limitations Lemon Sensors, concept of events, framework to design optimal path

Table 8. “Who copied what from whom”-matrix control group (vertical copies from

horizontal)

Experimental

group Lime Pomelo Mandarin Apricot Peach

Papaya (disqualified) Lime UI mockup, functional requirements Off-the-shelf statistical distribution generator UI mockup, map controller logic Pomelo System requirements and components System requirements and components, traffic simulator, traffic flow controller Mandarin Model elements, simulation sequence Apricot System diagram, road graph, and subcomponents diagram System diagram, road graph, and subcomponents diagram System diagram, road graph, and subcomponents diagram Peach Graphical representation dependency graph, multiple turns at streets Cheat sheet of classes

Table 9. “Who copied what from whom”-matrix experimental group (vertical copies from

horizontal)

7.1.3.8 Clarity  of  Presentation:  Were  the  strong  designs  clearer?  How?  

The winning design (Lemon) contains [...] text and many diagrams. It does not contain any code or a UI mockup, as was said in the design prompt. Other designs have either a lot of text and little to no diagrams, or the other way around. This is also reflected in the experts’ grading, which gave Lemon’s design a 6.3 (first round) and 6.5 (second round) for clarity (on a 1-7 scale). The first-round bottom design (Cantaloupe) only contains 5 pages and has a UI mockup. Cantaloupe scored a 2.8 on clarity. The second-round bottom design (Pomelo) contains hardly any text and was given a 3 for clarity. This implies that strong designs scored higher on clarity than weaker designs. There is a strong positive correlation between the grade received for clarity and the total quality grade received from the experts (which defines the stronger designs) for both the first and second round (first round: r = .981; p = .000, second round: r = 0.990; p = .000).

Note: This result is not surprising as clarity was one of the three criteria used to assess the quality of designs. However, we were interested in looking at clarity in specific, as clear designs do not necessarily have to be high in overall design quality. For example, a design could score high on clarity, but low on completeness and elegance (the other two criteria), resulting in a lower overall grade for design quality.

7.2 Design  Evaluations  

7.2.1 Winners  of  Contest

Winners of the contest were selected based on the absolute expert rankings (see Table 10). A breakdown of the grades can be found in Table 11.

Absolute ranking   Round 1       Grade    Round 2       Grade
#1                 Lemon         18.25    Lemon         18.5
#2                 Lime          17.5     Lime          17.5
#3                 Apricot       15.75    Apricot       17.25
#4                 Banana        15.75    Banana        16.25
#5                 Mandarin      13.75    Mandarin      15.5
#6                 Peach         9.5      Peach         14.25
#7                 Coconut       8.75     Coconut       10.75
#8                 Pomelo        8.5      Cantaloupe    10.5
#9                 Watermelon    8        Watermelon    10
#10                Cantaloupe    7.75     Pomelo        8.5

Table 10. Absolute expert rankings

Round 1

Group          Ranking   Fruit name    Completeness   Elegance   Clarity   Total
Control        #1        Lemon         6.25           5.75       6.25      18.25
Control        #2        Banana        5.75           4.75       5.25      15.75
Control        #3        Guava         3              3.25       3.75      10.0
Control        #4        Coconut       2.5            3          3.25      8.75
Control        #5        Watermelon    2.75           2.5        2.75      8.0
Control        #6        Cantaloupe    2.5            2.5        2.75      7.75
Experimental   #1        Lime          6.25           5.25       6         17.5
Experimental   #2        Apricot       5.25           5.25       5.25      15.75
Experimental   #3        Mandarin      5              4.5        4.25      13.75
Experimental   #4        Peach         3.25           3          3.25      9.5
Experimental   #5        Papaya        3.25           3.25       2.5       9.0
Experimental   #6        Pomelo        2.75           2.75       3         8.5

Round 2

Group          Ranking   Fruit name    Completeness   Elegance   Clarity   Total
Control        #1        Lemon         6.5            5.5        6.5       18.5
Control        #2        Banana        5.75           4.75       5.75      16.25
Control        #3        Coconut       3.5            3.25       3.75      10.75
Control        #4        Cantaloupe    4              3          3.5       10.5
Control        #5        Watermelon    3.5            3.25       3.25      10.0
Experimental   #1        Lime          6              5.25       6.25      17.5
Experimental   #2        Apricot       5.5            5.5        6.25      17.25
Experimental   #3        Mandarin      5.75           4.75       5         15.5
Experimental   #4        Peach         4.75           4.75       4.75      14.25
Experimental   #5        Pomelo        2.5            3          3         8.5

Table 11. Ranking based on expert scores for completeness, elegance, and clarity on a scale from 1 to 7; the average is taken from the expert panel of four. Note: Guava and Papaya are missing in the second round ranking because they dropped out.

7.2.2 Accuracy  of  Peer  Rankings

Only the experimental group was asked to rank the designs of their peers after the first round, while all participants had to rank them after the second round. The accuracy of peer ranking in comparison with the expert rankings (per group, not the absolute rankings) is evaluated by means of a ranking matrix (see Table 12). By aggregating each participant’s ranking, collective peer rankings are calculated (see Table 13) and compared against the expert rankings (see Table 14). For the individual participants’ rankings in the experimental group of round 1, and the control and experimental groups of round 2, see Appendix E.

From the matrix, we can infer that participants were quite accurate in predicting the final rankings, as 11 out of the 16 total ranks (69%) were ranked the same as by the experts, 3 ranks were only 1 place off, and 2 ranks were 2 places off.

Participants \ Experts' rank   #1   #2   #3   #4   #5   #6
R1-experimental                0    -2   +1   0    +2   0
R2-control                     0    0    0    0    0
R2-experimental                0    -1   +1   0    0

Table 12. Ranking matrix (negative = participants ranked design lower than experts; positive = participants ranked design higher than experts; R1 = round 1; R2 = round 2); Note: there are no 6th ranks in the second round due to dropouts.

Ranking R1-experimental R2-control R2-experimental

#1 Lime Lemon Lime/Mandarin

#2 Mandarin Banana Lime/Mandarin

#3 Papaya Coconut Apricot/Peach

#4 Peach/Apricot Watermelon/Cantaloupe Apricot/Peach

#5 Peach/Apricot Watermelon/Cantaloupe Pomelo

#6 Pomelo N/A N/A

Table 13. Participants’ collective ranking of other designs; Note: Papaya dropped out in the second round.

Ranking R1-experimental R2-control R2-experimental

#1 Lime Lemon Lime

#2 Apricot Banana Apricot

#3 Mandarin Coconut Mandarin

#4 Peach Cantaloupe Peach

#5 Papaya Watermelon Pomelo

#6 Pomelo N/A N/A

Table 14. Experts’ ranking of other designs; Note: Papaya dropped out in the second round.
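The exact rule used to aggregate the individual peer rankings (Appendix E) into the collective rankings of Table 13 is not spelled out above. A simple rule that also naturally produces the ties shown there is to average each design's rank position across rankers; the sketch below illustrates this mean-rank aggregation. The ballots in the example are invented placeholders, not the participants' actual rankings.

# Sketch of mean-rank aggregation of peer rankings into a collective ranking.
# The example ballots below are invented placeholders, not real study data.
from collections import defaultdict

def aggregate(rankings):
    """Average each design's rank position (1 = best) across all rankers."""
    positions = defaultdict(list)
    for ranking in rankings:
        for place, design in enumerate(ranking, start=1):
            positions[design].append(place)
    mean_rank = {design: sum(p) / len(p) for design, p in positions.items()}
    return sorted(mean_rank.items(), key=lambda item: item[1])

ballots = [                      # three hypothetical rankers
    ["Lime", "Mandarin", "Apricot"],
    ["Mandarin", "Lime", "Apricot"],
    ["Lime", "Apricot", "Mandarin"],
]
for design, rank in aggregate(ballots):
    print(f"{design}: mean rank {rank:.2f}")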

Very strong positive correlations can be found between rankings provided by the expert panel and rankings given by participants for the designs from both the first and second round. However, only the correlation for the second round is statistically significant (first round: r = .843; p = .073, second round: r = .821; p = .004). Note that there was more data in the second round, as both the control and experimental group had to rank their peers. These statistical results show that participants were better able to accurately rank the designs in the second round, where both groups had to rank their peers.

7.2.3 Ranking  Criteria  

We asked participants what they found important when ranking the designs of other participants. The most popular criteria mentioned are the ones we talked about in the design prompt (elegance, clarity, and completeness), of which clarity and completeness are the two main factors. 4 out of 10 participants (Mandarin, Pomelo, Coconut, and Lemon) considered clarity an important factor in determining the ranking, and 7 out of 10 participants (Lemon, Banana, Apricot, Pomelo, Mango, Mandarin, and Lime) looked for completeness. Simplicity as a part of elegance was one of the criteria used by Banana, and overall elegance was used by Lime and Lemon. Other examples of criteria that participants assessed were the level of detail in the design (Cantaloupe and Coconut), flexibility (Apricot and Banana), being easily implementable (Lime), and visuals (Apricot). In addition, Coconut reported that he was looking for diagrams: “Does it contain diagrams? Design should contain 3 views: static, dynamic, and physical”. Also, Pomelo looked at whether participants included UML diagrams and not just coding.

7.2.4 Ranking  Strategy

Participants primarily used criteria-based strategies for ranking the designs; 7 out of 10 participants (Lime, Lemon, Peach, Watermelon, Mango, Pomelo, and Cantaloupe) used this strategy. It basically means that they went through each design individually and ranked the designs against a set of criteria. Coconut used a grouping strategy, meaning that he separated the good designs from the bad (two groups) and read through them again per group. Banana and Apricot used a comparative strategy, meaning that they went through the designs one at a time, compared each with the previous ones, and fit it somewhere in between.

Some participants (Mango, Banana, and Lime) reported that they found ranking similar designs hard; for them, the variety in designs thus had an impact on the perceived difficulty of the ranking task. Furthermore, in terms of quality, a couple of participants found it more difficult to rank the top ones (Mandarin and Peach), while another participant (Pomelo) found it easy to identify the top and the worst. Cantaloupe perceived the ranking task as moderately difficult criteria-wise, as he reported “I [...]”.

Differences between the rankings in the first versus the second round include, for example, taking less time to rank after the second round because other participants made few changes (Lime), and Apricot used different criteria in the second round, as she started to look at improvements made compared to others’ initial designs.

7.2.5 Accuracy  of  Self-­‐Ranking

We asked participants to rank themselves for the first and second round during the interview in order to compare this rank with their actual rank in their group (either control or experimental) given by the expert panel. See Table 15 for participants’ self-rankings compared to their actual rank.

                              Round 1                   Round 2
Group          Fruit name     Actual rank   Self rank   Actual rank   Self rank
Control        Watermelon     5             2-3         5             2
Control        Cantaloupe     6             4           4             2-3
Control        Banana         2             1           2             1
Control        Coconut        4             2           3             2-3
Control        Lemon          1             1-2         1             1-2
Experimental   Lime           1             3           1             2
Experimental   Peach          4             4           4             2
Experimental   Pomelo         6             6           5             5
Experimental   Mandarin       3             1           3             1-2
Experimental   Apricot        2             3-4         2             3-4

Table 15. Participants’ self-rankings compared to their actual rank (given by experts); Note: The disqualified participants, Guava (control group, ranked 3rd by experts in the first round) and Papaya (experimental group, ranked 5th), are excluded from this table, which causes the 6th ranks in the table.

The first and second round winner (Lemon) ranked himself either 1st or 2nd for both rounds, as he reported that he did not change many things for the second round and there was one other strong design that did not improve much during the second round. Furthermore, he felt that there was still a gap between the stronger designs and the other ones in the second round, even though the weaker designs did improve.

The first runner-up (Lime) ranked himself 3rd in the first round and 2nd in the second round, because he was missing some parts (e.g. model design) in his initial design, and therefore ranked himself lower in the first round.

The bottom design of the first round (Cantaloupe) believed his rank to be 4th, as he reported that he was not sure what the expectations were and left quite a few methods blank, but he did feel that he had more detail than the designs he would have ranked lower than 4th.

The bottom design of the second round (Pomelo) believed her rank to be 5th, as she reported “I think mine was least complete because I didn’t really have much time. So from a completeness point probably mine is the last because I was not really able to include whatever I wanted to”.

Most participants (6 out of 10: Watermelon, Banana, Coconut, Lemon, Lime, and Mandarin) ranked themselves fairly high and believed their design to be in the top 3 after the first round. For example, Banana reported “I would rank mine at the top. I [...] anything significant, as I felt pretty confident with my design from the beginning”.

The same holds true for the second round: 8 out of 10 participants (all except Pomelo and Apricot) ranked themselves fairly high and believed their design to be in the top 3. For example, Mandarin reported “after second round I would place myself as either first or second, because there was only one technically superior than mine”.

On average, participants expected their initial design to be ranked 2.85 ± 1.58 and their revised design to be ranked 2.35 ± 1.16, indicating that most designers were overall confident about their own work and even more confident about their revised design than about their initial design.

Most participants believed that their designs would end up high in the rankings, even in the case of weaker designs; accordingly, no significant correlation can be found between self-rankings in either round and the final ranking (first round: r = .596; p = .069, second round: r = .451; p = .191). This indicates that participants’ own judgment is a poor indicator of their final ranking, as participants tend to be biased towards their own design (IKEA effect; Norton et al., 2012).

7.2.6 Reactions  to  Seeing  Other  Designs

Most participants (Coconut, Watermelon, Mandarin, Lime, and Peach) were encouraged by seeing good designs during the recombination step, as it gave them the opportunity to improve: seeing that others put in more time and effort pushed them to improve their own designs. For example, Coconut reported “I found some good designs, one design that was in my section and this design encouraged me”, and Mandarin reported “When I see that something can be made better, I actually see how I can implement it in my design, and that motivates me”. However, Pomelo felt that seeing good designs was discouraging, as it made her think that her own design is really bad. Banana was encouraged by seeing worse designs than his own, as he reported “I feel that that made me feel like I put more effort than other people did into the design process, and that I stood a pretty good chance”.

Some participants (Coconut, Cantaloupe, and Apricot) found that seeing other designs during the recombination step provided a form of feedback on their own designs. For example, Coconut felt that the other designs gave him negative feedback on what was expected, as no participant designed the system the way he did, and this was discouraging. However, Cantaloupe and Apricot felt encouraged, as it gave them more clarification on the design task and helped them to be more critical and reflective about their own designs.

7.3 Design  Changes  

7.3.1 Improvement

Participants’ initial and revised designs were given a quality score by the expert panel. Participants’ improvement in design quality is calculated by subtracting the grade for the initial design from the revised design (see Table 16).
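For example, in Table 16 Watermelon’s improvement of 2 points is the second-round grade of 10 minus the first-round grade of 8.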

Fruit name    Grading_first   Grading_second   Improvement
Control
Watermelon    8               10               2
Cantaloupe    7.75            10.5             2.75
Banana        15.75           16.25            0.5
