Bachelor Informatica

On using heuristics to approximate paintings from polygons

Teun C. Mathijssen

June 25, 2019

Informatica
University of Amsterdam


Abstract

RGB images can be approximated using semi-transparent, colored polygons. Little scientific research on image approximation using polygons has been done to date, which makes this research important for the exploration of the problem. In the approximation process, the locations of the vertices, the colors of the polygons and the drawing order of the polygons need to be adjusted in such a manner that we reach a more accurate approximation of the target image. This is an optimization problem with an enormous solution space. This research extends recent research by Paauw and van den Berg [23] by applying multiple nature-inspired optimization algorithms to the aforementioned optimization problem. We implement two versions of a stochastic hillclimbing algorithm: one from their paper and one using relative mutations. We also use a simulated annealing algorithm, but with a different cooling scheme than theirs. Finally, we develop a genetic algorithm that is specifically tailored to approximating images. We also redefine the experimental setup to use only triangles and hexagons during a run. We discuss the performance of the algorithms, the difference between using triangles and hexagons, and the ’polygonicity ranking’ proposed by Paauw and van den Berg. We found that we can construct a simulated annealing algorithm that generally performs better than the other algorithms, while the genetic algorithm we implemented often did not outperform the other algorithms. We also found that relative mutations do not lead to more optimal results than absolute mutations in terms of MSE values when approximating images. Furthermore, we have shown that using triangles to approximate images does not necessarily lead to better results than using hexagons, although there are slight differences for some of the algorithms. Finally, we have shown that the ’polygonicity ranking’ as defined in [23] is also apparent in our results.


Contents

1 Introduction
2 Related work
3 Theoretical background
  3.1 ’Rendering’ and the color model
  3.2 Objective image quality assessment
4 Method
  4.1 Experimental setup
  4.2 Paintings
  4.3 Mutations
  4.4 Algorithms
    4.4.1 Stochastic Hillclimbing
    4.4.2 Genetic Algorithm
    4.4.3 Simulated Annealing
  4.5 Experiments
5 Results
  5.1 Average final MSE per painting
  5.2 Triangles and hexagons
  5.3 Algorithm performance
  5.4 Individual runs
6 Discussion
  6.1 Conclusion


CHAPTER 1

Introduction

Heuristics are useful for solving problems that lack an exact solving method or whose solution space is too big to explore efficiently (in polynomial time). Examples of such problems are optimization problems, many of which are NP-hard [16]. Another example is the category of timetabling problems [28, 11], which are often dealt with in industry.

Several experiments that have not been peer-reviewed [13, 30, 3, 7, 12] were done on approximating images using semi-transparent, partially overlapping colored polygons. Initially, these polygons have random colors and are placed at random locations. These experiments led to published work in 2019 by Paauw and van den Berg [23] that further explores this process by approximating several well-known paintings. An example of this is shown in figure 1.1. Research into applying heuristic algorithms to problems like these is important because it might give us more insight into the structure of fitness landscapes. Several experiments have been done on approximating paintings from polygons (we refer to the Related Work section), but most of them do not appear to have been peer-reviewed scientifically. Because of this novelty, we believe it is important to expand on the research that has been done. Exploring how transparent polygons behave when approximating paintings might also lead us to discover (image) compression algorithms that are more space efficient; compression is currently done using techniques like the Discrete Cosine Transform (DCT) [33], a wavelet transform [22] or Block Truncation Coding (BTC) [14]. Finally, earlier work on approximating paintings (we refer to the Related Work section) shows that the process can lead to aesthetically pleasing results.

Image approximation using heuristics is done by minimizing an objective function, in this case the Mean Squared Error (MSE), between a generated painting and the target painting. Therefore, the MSE is the objective function of these experiments, and calculating the MSE between a rendered image and the target image is an evaluation of this function. Before performing an evaluation, one or more randomly chosen mutations are done. These mutations include: moving a vertex to a new location, changing the color of a polygon, changing the drawing order of a polygon and transferring a vertex from one polygon to another.


Figure 1.1: Progression of rendering the Mona Lisa. Shown below the rendered images is a plot of the MSE of the rendering progression, which is used as the objective function. This figure is taken directly from Paauw and van den Berg [23].

In their paper, Paauw and van den Berg employ three different heuristic algorithms to approximate the target images. The first algorithm they use is a stochastic hillclimber (SH). It is relatively basic, as it randomly performs a mutation and accepts it when it leads to a better approximation (a lower MSE value) of the target painting than the previously generated painting. However, because of its greedy nature, this heuristic is prone to getting stuck in local optima, ridges or plateaus of the objective function [25].

The second algorithm they treat is the plant propagation algorithm (PPA). PPA is a relatively new nature-inspired algorithm that provides a balance between intensification and diversification within the optimization process [27]. These two characteristics conflict: diversification provides a means of escaping the attraction regions of local optima, while intensification allows an algorithm to search locally and explore these optima. PPA balances the two by exploring many new solutions near current, more optimal solutions, while exploring fewer new solutions far from current solutions that are less good. As a result, the algorithm is well suited to avoiding local minima of the objective function.

The third and final algorithm they employ is simulated annealing (SA). Simulated annealing, introduced by Kirkpatrick et al. [20], provides an analog for solving optimization problems [26]. In metallurgy, material is annealed, i.e. heated to a high temperature and then cooled slowly, so that the material freezes and obtains finer crystalline structures [25]. This algorithm has a probability of accepting a worse approximation of the target image, which decreases over time. This probability is inversely related to how much the MSE increases in the worse approximation. At the same time, the probability is related to the ’temperature’ variable of the simulation: a worse approximation is more likely to be accepted at a high temperature than at a low temperature.

Our research extends that of Paauw and van den Berg. We use a similar experimental setup and the same paintings to reason about the effectiveness of our algorithms. We start off with theory on color in image representation. Then we describe how image approximation is performed (by a process called rendering). Next, we provide a detailed explanation of the algorithms that we use.

A new algorithm that we use in this paper is a genetic algorithm. To enable recombination, we modify the experiments such that we only use triangles and hexagons throughout the runs. In contrast, Paauw and van den Berg start a run with mostly triangles, which can transform into other types of polygons throughout the run by receiving a vertex from another type of polygon. In both our experiments and those in [23], the total number of vertices used in the simulation is kept equal to be able to compare results.

We also run a simple stochastic hillclimber algorithm identical to theirs. Next, we design another version of the stochastic hillclimber that uses mutations with a relative range (the only algorithm we discuss here to do so). Finally, as the cooling scheme parameters for simulated annealing in [23] likely made the algorithm perform worse than the SH and PPA algorithms, we attempt to use a cooling scheme that performs much better.

Hence, in this thesis, we answer the following research question: can we design algorithms that perform better than the ones used in [23]? To answer this question, we answer the following subquestions: (RQ1): Can we implement a genetic algorithm and an annealing scheme that yield an acceptable approximation of the target images? (RQ2): Do relative mutations lead to better results than absolute mutations? (RQ3): Is there a difference between using triangles or hexagons in approximating paintings? We conclude with a metric of how well paintings can be approximated using polygons (called polygonicity by Paauw and van den Berg) and by providing pointers for future research.


CHAPTER 2

Related work

This research is further motivated by earlier experiments using polygons to approximate paintings. In 2008, Roger Johansson posted a blog entry on a simple genetic algorithm experiment in which he approximates the Mona Lisa using transparent polygons [19]. The setup of his experiment is different from that of the experiment done by Paauw and van den Berg [23]. They start off with a set number of vertices and polygons, while Johansson iteratively increases the number of polygons within the generated image in order to produce the target image. Paauw and van den Berg suspect that this program is biased towards increasing numbers of polygons and vertices, hence they fix both the number of polygons and the total number of vertices.

Another author, Chris Cummins, made an online application ”Grow Your Own Picture” as a spin-off of Johansson’s original idea [12]. This application also makes use of a genetic algorithm that models the individuals of a population using strings of DNA. Each DNA string is then visualized as an image. The visualized images are compared with the target image, which subsequently leads to ’fitness’ values. He then describes ’breeding’ the fittest individuals of the population to select the most accurate representation of the target image.

Further ideas have been published on the web as blog posts, such as [13], [30], [3] and [7]. Yann N. Dauphin [13] uses an implementation of the problem in the Clojure language, in which a genetic algorithm is used to approximate images. Phil Stubbings [30] also uses a genetic algorithm, selecting individuals for the next generation proportionally based on their fitness (fitness proportionate selection). A website by the user with the pseudonym ’AlteredQualia’ [3] uses an approach based on a simulated annealing algorithm, while also allowing the user to select different mutation types. Finally, Peter Collingridge [7] uses the same process, but instead of polygons he uses transparent circles with a variable diameter. The aforementioned projects all have in common that they do not appear to have been peer-reviewed scientifically. They use one single strategy to perform the image approximations and do not compare the performance of different methods.

Paauw and van den Berg [23] instead compared the performance of different algorithms in a controlled setup. They did so by examining the MSE values obtained at the end of each run. They found that the stochastic hillclimber (SH) and the plant propagation algorithm (PPA) could successfully lead to a large decrease in MSE compared to the initial situation. Simulated annealing (SA), however, led to the worst final MSE values. Also, when examining typical runs of the algorithms, it was shown that SA did not lead to much of a decrease in MSE value compared to the SH and the PPA. The temperature cooling scheme they use is the Geman & Geman cooling scheme, the only scheme that is guaranteed to find the global optimum of the objective function [17]. However, this guarantee only holds for an infinite number of evaluations.

Finally, Berg et al. [6] recently wrote a paper on evolved art with transparent geometric shapes. In this work they approximate paintings by using circles and lines in addition to polygons. They again tackle the problem using a genetic algorithm and try combinations of shapes as well. More specifically, they allow for absolute mutations, relative mutations and a combination of the two. Parameters they include in the genes used in the genetic algorithm include the radius for circles and the thickness for lines, all of which potentially differ between runs.


A very strong point of their research is the use of multiple shapes in approximating images. However, they only approximate one image using one algorithm, whereas we specifically aim to explore the application of different algorithms to multiple images.

As many of the aforementioned experiments employ genetic algorithms or variations thereof, it is clear that research into nature-inspired heuristic algorithms in optimization is widespread. Examples are the aforementioned plant propagation algorithm [27], the bat algorithm [34], particle


CHAPTER 3

Theoretical background

3.1 ’Rendering’ and the color model

Several different color models exist in digital imaging; red-green-blue (RGB) and hue-saturation-lightness (HSL) are common examples. Depending on application support, these models allow a fourth parameter, the alpha value (α), which represents the degree of transparency of an object. The extended versions of these color models are called RGBA and HSLA, respectively.

In our research, the Pillow library is used [9]. We import Portable Network Graphics (PNG) files, which store color data in 8-bit unsigned RGB format [31]. While approximating images, we generate an image canvas and draw polygons on top of it in RGBA mode [8]. In this mode, four 8-bit unsigned integers express the values of the red, green, blue and alpha channels of the object that is to be drawn. An alpha value of 0 means that the polygon is completely transparent (and therefore not visible in the image); an alpha value of 255 means that the polygon is fully opaque. Any value in between means that the polygon is semi-transparent.

Drawing polygons on top of one another is called ’rendering’. Even though the polygons have RGBA color values, the black canvas that we start with and the image after the rendering process are fully opaque. When drawing polygons on top of the black canvas, a process called ’alpha compositing’ takes place. To composite semi-transparent objects onto an opaque background, the following formula is used:

channel_new ← α · channel_polygon + (1 − α) · channel_old

assuming a normalized alpha value between 0 and 1 [31]. This calculation is done separately for every pixel and for each of the red, green and blue channels. Stacking two different polygons P1 and P2 is therefore non-commutative: P1 ∘ P2 ≠ P2 ∘ P1. A trivial example is the composition of two fully opaque polygons (both with alpha values set to 1), as shown in figure 3.1. The formula then reduces to red_new ← red_polygon. This noncommutativity comes into play when defining our mutations.

To conclude, knowledge about alpha compositing is important in the rest of this work. When rendering an image, we start with a fully opaque, black canvas. Then, semi-transparent polygons (with alpha-values ranging between 0 and 255) are iteratively drawn on top of this canvas using the aforementioned alpha compositing formula. In this process, the RGB-color of the polygon that is being drawn on the canvas will blend with the RGB-color of the pixels already on the canvas, depending on the alpha value of the polygon that is being drawn. The resulting image will again be a fully opaque image. The compositing process is automatically handled by Pillow [9].
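As a concrete illustration, the per-pixel blend above can be sketched in a few lines of Python with NumPy. This is a minimal sketch, assuming a boolean mask marks the pixels a polygon covers; it is not the thesis implementation, which delegates compositing to Pillow.

```python
import numpy as np

def composite(canvas, mask, color_rgba):
    """Alpha-composite one flat-colored polygon region onto an opaque RGB canvas.

    canvas:     H x W x 3 uint8 array (the fully opaque background)
    mask:       H x W boolean array, True where the polygon covers a pixel
    color_rgba: (r, g, b, a) tuple of 8-bit unsigned channel values
    """
    r, g, b, a = color_rgba
    alpha = a / 255.0  # normalize alpha to [0, 1]
    out = canvas.astype(np.float64)
    # channel_new = alpha * channel_polygon + (1 - alpha) * channel_old
    out[mask] = alpha * np.array([r, g, b], dtype=np.float64) + (1.0 - alpha) * out[mask]
    return np.round(out).astype(np.uint8)
```

Compositing a fully opaque green polygon on top of a fully opaque red one (or vice versa) with this helper reproduces the noncommutativity of figure 3.1.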


(a) Red on top of green (b) Green on top of red

Figure 3.1: Alpha compositing is generally noncommutative. A trivial example is the compositing of two fully opaque polygons: in the figure to the left, the red polygon is drawn on top of the green one, and in the figure to the right, the green polygon is drawn on top of the red one.

3.2 Objective image quality assessment

In this research, we will compare images using the Mean Squared Error (MSE), as in the original paper [23]. The MSE is calculated across the red, green and blue channel values of all pixels and is a dimensionless quantity. We regard our target image and rendered image as separate, flattened arrays with equal ordering of positions and channel values. An image is thus represented by an array of 3n channel values, where n = height · width. Therefore, the MSE between two RGB images can be calculated using the following equation:

MSE = (1/n) · ∑_{i=0}^{3n−1} (Rendered_i − Target_i)²   (3.1)

where Rendered_i is a color channel value in the rendered polygon constellation and Target_i is the corresponding color channel value in the target image. This is the equation implemented in [23], so we chose to implement equation 3.1 accordingly. Regardless of the width and height of the painting, the minimum and maximum values of this equation can be determined exactly. The minimum value is 0, reached when the rendered and target paintings are exactly equal in terms of pixel values. The maximum value is 255² · 3: this occurs when all channels of all pixels of the rendered and target image differ by 255.
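Equation 3.1 translates directly into NumPy. The sketch below assumes images are stored as height × width × 3 uint8 arrays; the function name is illustrative, not taken from the thesis code.

```python
import numpy as np

def mse(rendered, target):
    """MSE of equation 3.1: sum the squared differences over all 3n channel
    values, divide by the number of pixels n = height * width, so the
    maximum possible value is 3 * 255**2."""
    diff = rendered.astype(np.int64) - target.astype(np.int64)
    n = rendered.shape[0] * rendered.shape[1]  # number of pixels
    return float((diff ** 2).sum()) / n
```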

The size of the solution space S of the polygon constellations is as follows [23]:

|S| = (height · width)^v · (256⁴)^(v/vp) · (v/vp)!

where v is the total number of vertices and vp is the number of vertices per polygon (3 for triangles, 6 for hexagons). For the larger segments of the formula, (height · width)^v represents the vertex positions, (256⁴)^(v/vp) represents the polygon colors and (v/vp)! represents all drawing orders of the polygons. Compared to the size of the solution space as described in [23], the integer partition function [4] no longer plays a role, because we fix the number of vertices per polygon (vp). Using our formula, we can directly calculate the size of the solution space given the number of vertices v and the number of vertices per polygon vp in an experiment. As mentioned before, the solution space is very large. For example, when running a simulation on a picture of height 240, width 180 and just 1 triangle (3 vertices), the solution space is already (240 · 180)³ · 256⁴ · 1! ≈ 3.46 × 10²³.
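The size formula can be evaluated exactly with integer arithmetic, for instance as the small Python helper below (illustrative name, not thesis code):

```python
from math import factorial

def solution_space_size(height, width, v, vp):
    """|S| = (height*width)**v * (256**4)**(v/vp) * (v/vp)!, with v the total
    number of vertices and vp the number of vertices per polygon."""
    assert v % vp == 0, "v must be a multiple of vp"
    p = v // vp  # number of polygons
    return (height * width) ** v * (256 ** 4) ** p * factorial(p)
```

For the 240 × 180 single-triangle example above, this reproduces the quoted ≈ 3.46 × 10²³ constellations.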

We refer to all of the polygons collectively as a polygon constellation. Ultimately, we are trying to find a polygon constellation in the solution space that minimizes the fitness function, the MSE between the rendered image and the target image:

arg min_{s ∈ S} MSE(s)

Given the novelty of this method of approximating paintings by compositing semi-transparent polygons, no known algorithms exist in the literature that directly find a constellation minimizing this function. We therefore use heuristic programming techniques in an attempt to find good candidates.


CHAPTER 4

Method

4.1 Experimental setup

We start off each simulation by generating a random polygon constellation. Next, we iteratively mutate and select new constellations each iteration. The mutations are defined further on in this chapter. The constellations are rendered and compared with the target image by MSE. After evaluating the rendered constellations a predetermined number of times, the rendered constellation with the smallest MSE is our closest approximation to the target image. Throughout this process, we always keep the best solution found so far during the run; this approach is referred to as ’elitist’ [5]. The process is further outlined in algorithm 1.

Algorithm 1: Process of finding approximations of paintings using an elitist approach.

    targetImage ← image to approximate (one of the paintings);
    constellation ← random polygon constellation;
    while maximum number of evaluations has not been reached do
        mutate constellation(s);
        evaluate the rendered mutated constellation(s) against targetImage by MSE;
        select mutated constellation(s) for the next iteration;
    end
    return constellation with smallest MSE-value;
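Algorithm 1 can be sketched in Python as follows. All helper functions (`mutate`, `select`, `evaluate`) are assumed, illustrative names; `select` encodes the per-algorithm selection policy, while the elitist best-so-far bookkeeping is shared.

```python
import random

def approximate(target, initial, mutate, select, evaluate, max_evaluations):
    """Elitist search loop of algorithm 1: whatever the selection policy
    does, the best constellation found so far is always remembered."""
    current, current_mse = initial, evaluate(initial, target)
    best, best_mse = current, current_mse
    for _ in range(max_evaluations):
        candidate = mutate(current)
        candidate_mse = evaluate(candidate, target)
        if candidate_mse < best_mse:  # elitism: keep the best-so-far
            best, best_mse = candidate, candidate_mse
        current, current_mse = select((current, current_mse),
                                      (candidate, candidate_mse))
    return best, best_mse
```

With a greedy `select` this loop reduces to the stochastic hillclimber; other choices of `select` yield the other algorithms in this chapter.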

4.2 Paintings

We approximate eight different images using polygons. Seven of these are famous paintings, directly taken from Paauw and van den Berg [23]; these paintings provide a means to compare their results to ours. The paintings are shown in figure 4.1. An eighth image, generated using uniform random noise, was added.


Figure 4.1: The paintings used in the study. From the top left, clockwise (all sources are directly taken from [23]): Portrait of J.S. Bach (1746, Elias Gottlieb Hausmann); The Kiss (1908, Gustav Klimt); Convergence (1952, Jackson Pollock); The Persistence of Memory (1931, Salvador Dali); Mona Lisa (1503, Leonardo da Vinci); Composition with Red, Yellow, and Blue (1930, Piet Mondriaan); The Starry Night (1889, Vincent van Gogh); Random noise (generated by a script)

4.3 Mutations

Similar to [23], we define mutations that can dramatically change the rendered image. For each mutation, there is a small chance that the mutated constellation undergoes no state change at all. We regard these as identity mutations, which are also listed for completeness.

1. Move Vertex: select a random vertex within a randomly chosen polygon. The vertex is then relocated to anywhere within the painting. An example is shown in figure 4.2. Identity mutation: assign the same coordinates to the vertex.

2. Change Color: select a random polygon and assign either the red, green, blue or alpha channel a new value. An example is shown in figure 4.3. Identity mutation: change the selected color channel to the same value.

3. Change Drawing Index: select a random polygon and randomly assign it a new drawing index. An example is shown in figure 4.4. This mutation is necessary because drawing transparent polygons is a noncommutative operation, as noted in section 3.1. Identity mutation: assign the same drawing index to the selected polygon.

Both the Move Vertex and Change Color mutations are different for the variant of the stochastic hillclimber algorithm that we define later. In [23], a fourth mutation was also defined: Transfer Vertex, used to transfer a vertex from one polygon to another. A key difference in our research is that we fix the number of vertices per polygon to a constant value throughout the run: we perform our experiments either with only triangles (3 vertices per polygon) or with only hexagons (6 vertices per polygon). Hence, the Transfer Vertex mutation is redundant. We chose to fix the types of polygons throughout the simulation as this simplifies the recombination stage of our genetic algorithm.

(a) Before the mutation (b) After the mutation

Figure 4.2: Showing the Move Vertex mutation. In this case, this mutation randomly moves a vertex from the red polygon.


(a) Before the mutation (b) After mutating a color channel

(c) After mutating the alpha channel

Figure 4.3: Showing the Change Color mutation. In the middle, the green color channel is randomly changed such that the blue polygon becomes more green. To the right, the alpha channel is changed such that the blue polygon becomes more transparent.

(a) Before the mutation (b) After the mutation

Figure 4.4: Showing the Change Drawing Index mutation. Here, the mutation picked the blue polygon and put it in front of the red polygon.
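The three mutations can be sketched as follows. The flat constellation representation (a list of polygon dicts) is an illustrative assumption for the sketch, not the thesis data structure.

```python
import random

# Hypothetical representation: a constellation is a list of polygons, each a
# dict with 'vertices' (a list of (x, y) tuples, drawn in list order) and
# 'color' (a list [r, g, b, a] of 8-bit channel values).

def move_vertex(constellation, width, height):
    """Relocate one random vertex of one random polygon anywhere on the canvas."""
    poly = random.choice(constellation)
    i = random.randrange(len(poly["vertices"]))
    poly["vertices"][i] = (random.randrange(width), random.randrange(height))

def change_color(constellation):
    """Assign a new value to one random channel (R, G, B or A) of one polygon."""
    poly = random.choice(constellation)
    poly["color"][random.randrange(4)] = random.randrange(256)

def change_drawing_index(constellation):
    """Move one random polygon to a random position in the drawing order."""
    poly = constellation.pop(random.randrange(len(constellation)))
    constellation.insert(random.randrange(len(constellation) + 1), poly)
```

Each identity mutation from the list above corresponds to the random draw reproducing the old value (or old index) by chance.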

4.4 Algorithms

In this section, we present the algorithms that we explore in this paper.

4.4.1 Stochastic Hillclimbing

The stochastic hillclimber (SH) is the simplest algorithm; it was also implemented in the original paper. The steps of the algorithm involve trial and error to obtain a state with a lower MSE than the previous one. Each iteration, one mutation is chosen; every mutation has an equal probability (1/3) of occurring. After mutating the current constellation, the constellation is rendered to an image and evaluated against the target painting. If the MSE decreases, the mutated constellation is kept. If the MSE increases, however, the mutated constellation is thrown away. This process continues until the maximum number of evaluations is reached.
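A single SH iteration thus amounts to a greedy accept/reject step; a minimal sketch with assumed `mutate` and `evaluate` helpers:

```python
def sh_step(current, current_mse, mutate, evaluate):
    """One stochastic-hillclimber iteration: apply one randomly chosen
    mutation and keep the result only if the MSE decreases."""
    candidate = mutate(current)
    candidate_mse = evaluate(candidate)
    if candidate_mse < current_mse:
        return candidate, candidate_mse  # accept the improvement
    return current, current_mse          # reject: keep the old constellation
```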

We designed a modification of the SH algorithm that uses relative mutations: whenever a mutation is chosen, the new value becomes the current value plus a delta value. We will refer to this version of the hillclimber as the delta hillclimber (DH). Because we no longer pick a uniform random value from an entire range, but instead add a uniformly randomly picked value, the newly assigned value is clipped if it exceeds the allowed range. Hence, we have redefined the mutations for the DH as follows:

1. Move Vertex: pick two delta values, one for the x-coordinate and one for the y-coordinate of the current vertex. Add these values to the current x- and y-coordinates. Clip the resulting values should they exceed the x- and y-boundaries of the image.

2. Change Color: pick a delta value and add it to one of the R, G, B or A channels. Clip the resulting value should it fall outside the range [0, 255].


3. Change Drawing Index: this mutation remains identical, as changing the drawing index of a polygon also potentially changes the drawing indices of other polygons.

The delta value is picked by multiplying the range of values with a mutation range factor. This factor is set to 1 at the start of the simulation and linearly decreases to 0 at the end of the simulation. This way, a single mutation can still move a polygon to any point in 2D and give it any color at the start of the simulation, but the mutation range becomes limited over time. We suspect that limiting the mutation range makes it easier for polygons to ’fit’ into small gaps later in the run.
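The shrinking relative mutation can be sketched as follows, shown here for a single color channel (helper names are illustrative, not thesis code):

```python
import random

def mutation_range_factor(evaluation, max_evaluations):
    """Linearly decreases from 1 at the start of the run to 0 at the end."""
    return 1.0 - evaluation / max_evaluations

def delta_color_channel(value, evaluation, max_evaluations):
    """Relative Change Color mutation of the DH: add a delta drawn from the
    shrinking range, then clip the result to [0, 255]."""
    limit = 255 * mutation_range_factor(evaluation, max_evaluations)
    delta = random.uniform(-limit, limit)
    return min(255, max(0, round(value + delta)))
```

The Move Vertex variant works the same way per coordinate, clipping against the image boundaries instead of [0, 255].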

4.4.2 Genetic Algorithm

The third algorithm we employ is a genetic algorithm (GA). This algorithm consists of three stages: selection, recombination and mutation, and we have to make design decisions for each of them. According to Russell and Norvig [25], it is unknown under which conditions genetic algorithms perform well. Therefore, we both experiment and turn to the literature when determining our parameters.

In the selection stage, we select a number of constellations to proceed to the next iteration. Some experiments have been done on determining an optimal population size [2, 18], but those problems are not comparable to ours. In [18], population sizes of up to 160 are described. In our experiments, we found a population size N = 200 to work well. Selection is done using a process called Roulette Wheel Selection (RWS) [35]. Since the optimal fitness we can reach is 0, we need to rescale the fitness values of each generation. This is done according to equation 4.1.

f_i ← 1 / (f_i − minFitness + 100)   (4.1)

where minFitness is the minimum MSE value of the current generation and f_i is the MSE value of the i-th individual in the population. This equation rescales the fitness values such that smaller MSE values map to larger fitness values and vice versa. After fitness rescaling, selection probabilities p_i for each individual are assigned according to equation 4.2:

p_i = f_i / ∑_{j=1}^{N} f_j   (4.2)

where N is the population size and f_i and f_j are the rescaled fitness values of the i-th and j-th individuals in the population. Since our algorithms follow an elitist approach, the fittest individual bypasses this selection process by always being selected for the next generation. After selecting the other 199 individuals for the next generation, we are ready to recombine the individuals and generate offspring. Recombining individuals can be done directly, as all individuals are represented in an equal way: an ordered list of polygons (either triangles or hexagons) within the constellation. Recombination is done using two-point crossover. First, two parents are randomly selected. Next, two randomly chosen crossover points are determined. Finally, the two children are alternately assigned segments of polygons from both parents. This process is illustrated in figure 4.5.
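The rescaling and selection steps can be sketched together as follows, using Python's built-in weighted sampling (the function name is illustrative):

```python
import random

def roulette_wheel_select(mse_values, k):
    """Roulette Wheel Selection with the rescaling of equation 4.1:
    f_i = 1 / (mse_i - minFitness + 100), then p_i = f_i / sum_j f_j."""
    min_mse = min(mse_values)
    fitness = [1.0 / (v - min_mse + 100.0) for v in mse_values]
    total = sum(fitness)
    weights = [f / total for f in fitness]
    # sample k population indices with probability proportional to fitness
    return random.choices(range(len(mse_values)), weights=weights, k=k)
```

The +100 offset keeps the denominator positive for the generation's best individual (whose MSE equals minFitness) while still rewarding lower MSE values with higher selection probability.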


Figure 4.5: Showing the recombination step of the genetic algorithm: two parent constellations have been selected and two cutoff points are chosen. Next, the two children are alternately assigned segments of polygons from both parents.

Finally, both of the generated children are mutated once according to the unaltered mutations we defined for the stochastic hillclimber algorithm. The children are subsequently put back into the population, which then consists of all parents and all children generated during the current iteration. Thus, 400 children are generated this way. The process is repeated until the maximum number of evaluations has been reached.
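The two-point crossover described above can be sketched as follows (illustrative function name; polygon lists stand in for constellations):

```python
import random

def two_point_crossover(parent_a, parent_b):
    """Two-point crossover on two equal-length ordered polygon lists: the
    children exchange the segment between the two randomly chosen cut points."""
    assert len(parent_a) == len(parent_b)
    i, j = sorted(random.sample(range(len(parent_a) + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b
```

Fixing the polygon type per run (only triangles or only hexagons) is what makes this direct list crossover possible: every position in the two lists holds a polygon with the same number of vertices.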

4.4.3 Simulated Annealing

Simulated annealing (SA) is an algorithm that has been successfully applied to scheduling and layout problems in the past [20, 21]. It has a connection to statistical mechanics, which makes it a popular nature-inspired heuristic. SA works very similarly to the SH algorithm: it uses the same mutations and always accepts constellations that decrease the MSE value. However, SA also allows for a probability of accepting mutations that increase the MSE value. This probability is given by:

p(accepting a higher MSE) = e^(−∆MSE / T)   (4.3)

where ∆MSE is the increase in MSE value and T is the temperature. Numerous different cooling schedules exist, sometimes even combined with reheating [1]. Even though the Geman & Geman schedule (equation 4.5) used by Paauw and van den Berg [23] is proven to find the global optimum as the number of evaluations goes to infinity, the tempo of the cooling is impractical. We are left with two options: decrease the value of the c-parameter or choose a different cooling schedule. Roa-Sepulveda and Pavez-Lazo describe the cooling criterion as the main key to obtaining good solutions when using the algorithm: we need a schedule that has a high initial temperature and cools down slowly [24]. We therefore chose to implement a different, exponential cooling schedule that we can fine-tune based on custom start and end temperatures T_start and T_end:

T = T_start · (T_end / T_start)^(evaluation / max evaluations)   (4.4)

Determining these values requires sample runs. For our experiments, we set T_start to 50 and T_end to 0.002; these values were found to work well. The temperature thus decreases exponentially from 50 to 0.002 throughout one million evaluations.
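The acceptance rule of equation 4.3 and the cooling schedule of equation 4.4 can be sketched as follows (illustrative names; the default temperatures are the values chosen above):

```python
import math
import random

def sa_accept(delta_mse, temperature):
    """Metropolis criterion of equation 4.3: always accept improvements,
    accept a worse constellation with probability exp(-delta_mse / T)."""
    if delta_mse <= 0:
        return True
    return random.random() < math.exp(-delta_mse / temperature)

def temperature_at(evaluation, max_evaluations, t_start=50.0, t_end=0.002):
    """Exponential cooling of equation 4.4: t_start at evaluation 0,
    decaying geometrically to t_end at the final evaluation."""
    return t_start * (t_end / t_start) ** (evaluation / max_evaluations)
```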

For reference, the Geman & Geman schedule (equation 4.5) with parameter c is:

T = c / ln(1 + evaluation)   (4.5)


4.5 Experiments

The simulation accepts four parameters: the target image, the number of vertices v, the number of vertices per polygon vp and the total number of evaluations to run the simulation for. We use one million evaluations, as in Paauw and van den Berg [23].

The total number of vertices v is fixed per run at one of 60, 300, 600, 900 or 1200. These numbers were chosen arbitrarily, but as multiples of 6, because we only use triangles and hexagons in our experiments. Hence, we also fix the number of vertices per polygon vp in a run to either 3 or 6. As a result, the total number of polygons p stays constant throughout the run and depends on v and vp: p = v / vp.
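To make the parameter combinations concrete, the grid of run configurations can be enumerated as below. The names are illustrative and not taken from our codebase.

```python
VERTEX_COUNTS = [60, 300, 600, 900, 1200]   # total vertices v
VERTICES_PER_POLYGON = [3, 6]               # triangles and hexagons

def experiment_grid():
    """Yield (v, vp, p) for every run configuration; the polygon
    count p = v / vp is fixed for the whole run."""
    for v in VERTEX_COUNTS:
        for vp in VERTICES_PER_POLYGON:
            yield v, vp, v // vp
```

Since every v is a multiple of 6, the division is exact for both polygon types, giving ten configurations per painting per algorithm.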

Rendering the polygons is a time-consuming process, as all the polygons have to be alpha-composited onto the canvas. Like Paauw, van den Berg, we do this a million times per run, once for every evaluation [23]. The process is not parallelizable across steps, as the implemented algorithms make decisions based on the evaluation directly before them. We can, however, speed up the alpha compositing itself by using Pillow-SIMD, a drop-in replacement for the Pillow library. This library performs Single-Instruction-Multiple-Data (SIMD) alpha compositing, leading to a large general speedup throughout the runs [10]. It does, however, require an AVX2-enabled CPU to reach optimal compositing performance.
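The per-pixel operation that dominates the rendering cost is standard "source-over" alpha compositing. A sketch of that rule for a single pixel, with channels normalized to [0, 1], is given below; Pillow-SIMD accelerates exactly this kind of computation by vectorizing it over many pixels at once.

```python
def over(fg, bg):
    """Composite one RGBA foreground pixel over a background pixel
    using the source-over rule; channels are floats in [0, 1]."""
    fr, fg_, fb, fa = fg
    br, bg_, bb, ba = bg
    out_a = fa + ba * (1.0 - fa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # fully transparent result
    def blend(f, b):
        # weighted average of the two channel values, premultiplied
        # by their effective alpha contributions
        return (f * fa + b * ba * (1.0 - fa)) / out_a
    return (blend(fr, br), blend(fg_, bg_), blend(fb, bb), out_a)
```

Applying this rule to every covered pixel of every polygon, once per evaluation, is what makes rendering the bottleneck of a run.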

The results were obtained by running the experiments on the Lisa Cluster, a service managed by SURFsara [29]. The Lisa cluster consists of several hundred multi-core nodes running the Linux operating system. We run our experiments on 16-core Silver 4110 2.10 GHz nodes that support the AVX2 instruction set; other nodes on the cluster are not AVX2-enabled. By running our experiments on the cluster, we are often able to run more than 100 program executions concurrently.


CHAPTER 5

Results

The results of the approximations are shown below. For reference, all paintings used in this paper are shown again in figure 5.1.

Figure 5.1: The paintings used in the study. From the top left, clockwise (all sources are directly taken from [23]): Portrait of J.S. Bach (1746, Elias Gottlieb Hausmann); The Kiss (1908, Gustav Klimt); Convergence (1952, Jackson Pollock); The Persistence of Memory (1931, Salvador Dali); Mona Lisa (1503, Leonardo da Vinci); Composition with Red, Yellow, and Blue (1930, Piet Mondriaan); The Starry Night (1889, Vincent van Gogh); Random noise (generated by a script)

5.1 Average final MSE per painting

In the following pages, we display several figures that show the average final MSE for the four algorithms on all paintings, for all numbers of vertices v ∈ {60, 300, 600, 900, 1200}, and for both triangles and hexagons. The results are averaged over 5 runs.

As can be seen in figures 5.2 and 5.3, the final MSE differs greatly with the total number of vertices used in the approximation. For all four algorithms, the final MSE value is highest when using 60 vertices. Performance of the stochastic hillclimber seems to improve as we increase the number of vertices, but when using 1200 vertices, results become dramatically worse. Performance of the DH seems to decrease when using larger numbers of vertices. Simulated annealing shows a continuous improvement when using more vertices. A wholly different trend appears for the genetic algorithm: suddenly, when using 900 vertices, the final MSE value becomes much better for every painting.

Furthermore, note the apparent ordering in the results of the algorithms using triangles, as shown in figure 5.2. This ordering is also apparent in the experiments with hexagons in figure 5.3. In both cases, the ordering does not seem to hold for every number of vertices used. For the stochastic hillclimber, the order remains consistent, except for the Dali and Jackson Pollock paintings. The same ordering appears to occur for the DH algorithm as well. For the genetic algorithm, a different ordering occurs; only when using 900 vertices does the same ordering as for the other algorithms appear. The ordering is also visible for the simulated annealing algorithm, again for all numbers of vertices greater than 60.

5.2 Triangles and hexagons

When examining figure 5.4, we can see that there is a difference between using triangles and hexagons. Apart from a few outliers, the trend appears to be that the lowest final MSE is attainable by using triangles. Examining figure 5.5, it appears that this is not the case for the DH. For example, when approximating the Mona Lisa, hexagons generally lead to a more accurate approximation in terms of final MSE. Examining figure 5.6, it appears that using hexagons is better about half the time. Look at the values for Bach, for example. Finally, examining figure 5.7, there appears to be a trend: apart from the one outlier in the Mondriaan values, all other vertex values seem to be lower for triangles. When using simulated annealing, the most accurate approximations of the target image appear to be obtained by using triangles.

5.3 Algorithm performance

We now examine several executions of the four algorithms in more detail. In figure 5.8, we can see that the SH initially performs very well on the Dali painting, but is later overtaken by the DH and the SA algorithms. The GA consistently shows the worst performance in this case. In figure 5.9, we can see that the SA far outperforms the other three algorithms in this configuration on the Mondriaan painting; the other three actually remain very close during the run. As the Mondriaan painting has large, clearly divided sections, we suspect that the SA is able to fit the polygons properly in those sections, while the other three algorithms get stuck in local minima. In figure 5.10, we can see one of the few cases where the GA was able to outperform all the other algorithms. Judging by the trend of the SA, it would likely still beat the GA when using more than a million evaluations. Finally, in figure 5.11, something very interesting happens: both the DH and SA appear to get stuck in local minima. Then, after about 400,000 evaluations, the SA manages to escape its minimum and obtain a final MSE value very similar to that of the SH and GA.


[Plot: final MSE (log scale) versus number of vertices (60–1200); one panel per algorithm (Stochastic Hillclimber, Delta Hillclimber, Genetic Algorithm, Simulated Annealing); one line per painting (Bach, Dali, Jackson Pollock, Klimt, Mona Lisa, Mondriaan, Random noise, Starry Night).]

Figure 5.2: Final MSE values for the four implemented algorithms on five numbers of vertices for triangles over a million evaluations. These values were obtained over five runs.


[Plot: final MSE (log scale) versus number of vertices (60–1200); one panel per algorithm (Stochastic Hillclimber, Delta Hillclimber, Genetic Algorithm, Simulated Annealing); one line per painting (Bach, Dali, Jackson Pollock, Klimt, Mona Lisa, Mondriaan, Random noise, Starry Night).]

Figure 5.3: Final MSE values for the four implemented algorithms on five numbers of vertices for hexagons over a million evaluations. These values were obtained over five runs.


[Plot: final MSE versus number of vertices (60–1200); one panel per painting; each panel compares triangles with hexagons.]

Figure 5.4: Final MSE values for the stochastic hillclimber algorithm per painting over a million evaluations. These values were obtained over five runs. In this figure, the values for triangles are compared with those for hexagons.

[Plot: final MSE versus number of vertices (60–1200); one panel per painting; each panel compares triangles with hexagons.]

Figure 5.5: Final MSE values for the delta hillclimber algorithm per painting over a million evaluations. These values were obtained over five runs. In this figure, the values for triangles are compared with those for hexagons.


[Plot: final MSE versus number of vertices (60–1200); one panel per painting; each panel compares triangles with hexagons.]

Figure 5.6: Final MSE values for the genetic algorithm per painting over a million evaluations. These values were obtained over five runs. In this figure, the values for triangles are compared with those for hexagons.

[Plot: final MSE versus number of vertices (60–1200); one panel per painting; each panel compares triangles with hexagons.]

Figure 5.7: Final MSE values for the simulated annealing algorithm per painting over a million evaluations. These values were obtained over five runs. In this figure, the values for triangles are compared with those for hexagons.


5.4 Individual runs

Below, several notable individual runs of the algorithms are shown.

[Plot: best MSE (log scale) versus evaluation (0–1,000,000); one line per algorithm (Stochastic hillclimber, Delta hillclimber, Genetic algorithm, Simulated annealing).]

Figure 5.8: Showing the progress of the best MSE value for the Dali painting. This figure was obtained by running all four algorithms with three hundred vertices using triangles over one million evaluations. The experiments were repeated five times.

[Plot: best MSE (log scale) versus evaluation (0–1,000,000); one line per algorithm (Stochastic hillclimber, Delta hillclimber, Genetic algorithm, Simulated annealing).]

Figure 5.9: Showing the progress of the best MSE value for the Mondriaan painting. This figure was obtained by running all four algorithms with twelve hundred vertices using triangles over one million evaluations. The experiments were repeated five times.

[Plot: best MSE (log scale) versus evaluation (0–1,000,000); one line per algorithm (Stochastic hillclimber, Delta hillclimber, Genetic algorithm, Simulated annealing).]

Figure 5.10: Showing the progress of the best MSE value for the Mondriaan painting. This figure was obtained by running all four algorithms with nine hundred vertices using hexagons over one million evaluations. The experiments were repeated five times.

[Plot: best MSE versus evaluation (0–1,000,000); one line per algorithm (Stochastic hillclimber, Delta hillclimber, Genetic algorithm, Simulated annealing).]

Figure 5.11: Showing the progress of the best MSE value for the random noise image. This figure was obtained by running all four algorithms with nine hundred vertices using triangles over one million evaluations. The experiments were repeated five times.


CHAPTER 6

Discussion

We have approximated several different paintings from polygons. These are Bach, Klimt, Jackson Pollock, Dali, Mona Lisa, Mondriaan, Starry Night and finally, the generated uniform random noise image. The approximation was done by using different numbers of vertices v and by using either triangles or hexagons. In this section, we discuss the results.

We have seen bad results for v = 1200 in most of the algorithms. With this many vertices, the SH likely has to perform too many mutations and gets stuck on a relatively bad approximation of the target image. The same happens for the DH when using 1200 vertices, although it already starts to perform worse on constellations with more than 300 vertices. This can be explained by the slowly decreasing mutation range: we suspect that the delta hillclimber gets stuck in local minima throughout the run. Interestingly, the SA algorithm does not show this decline for v = 1200, except for approximations of the Mondriaan painting. This relatively good performance for v = 1200 is likely due to the algorithm's ability to escape local minima by occasionally accepting a worse rendering than the previous iteration. We found no explanation for the outlier in approximating the Mondriaan painting.

The ordering shown in both figures 5.2 and 5.3 is apparent. This phenomenon was referred to by Paauw, van den Berg as 'polygonicity ranking' [23]. Ignoring a few outliers in both plots (especially those at v = 60 and v = 1200), we can see that this ordering is shared by the SH, DH and SA algorithms. This is further experimental evidence that there exists a 'polygonicity ranking' between the paintings. For the genetic algorithm, however, the ordering is markedly different. This can perhaps be explained by its somewhat poorer performance compared to the other algorithms: the genetic algorithm likely gets stuck at local optima that are less optimal than those of the other algorithms. When using 900 vertices, the final MSE value becomes much better for every painting using the GA. For this, we have no direct explanation. We hypothesize that the population diversity is very large when using this number of vertices, such that more optimal recombinations can take place and the genetic algorithm is able to escape local minima more easily.

6.1 Conclusion

Based on the results, we can answer our research questions. (RQ1): We can conclude that it is possible to implement a genetic algorithm and an annealing scheme that yield acceptable approximations of the target images. The results have shown that the SA algorithm even outperforms the other implemented algorithms for large numbers of vertices. (RQ2): We can conclude that relative mutations do not lead to more optimal results in terms of MSE than absolute mutations, at least for our way of designing these relative mutations. This can be derived from the relatively poor results of the stochastic hillclimber for larger numbers of vertices when compared with the delta hillclimber algorithm. (RQ3): We cannot conclude that approximating paintings using triangles leads to more optimal approximations than approximating paintings using hexagons. There is an apparent difference for the stochastic hillclimber and simulated annealing algorithms, but this difference is not pronounced for the others. Finally, we can conclude with more certainty that the hypothesized polygonicity ranking [23] exists: it is shown consistently for larger numbers of vertices, for both triangles and hexagons, in all of the implemented algorithms.

6.2 Future work

We think that a viable extension to this work should focus on data compression. Since we can store a list of polygons using only their vertices and colors, perhaps we can achieve competitive data compression when approximating paintings from polygons. This could be an alternative to traditional compression formats such as ZIP.

More future work could be focused towards the polygonicity ranking that was mentioned earlier. What exactly is causing this ranking? Perhaps it is possible to describe image features that influence this ranking. Maybe it is even possible to find a measurement that describes a priori how hard it is to approximate a painting using polygons.

Several other methods are also usable in objective image quality assessment. These include the MSE-related Peak Signal to Noise Ratio (PSNR, measured in dB) and a structural information-based Structural Similarity index (SSIM) [32]. We have not further discussed these other image quality assessment methods in the paper, but we suspect that they can lead to very different results.
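PSNR, for instance, is a direct transformation of the MSE, so it would rank our approximations identically; SSIM, by contrast, is computed from local image structure and could genuinely reorder the results. A minimal sketch of the PSNR conversion for 8-bit images:

```python
import math

def psnr(mse, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB for images with channel
    values in [0, max_value]. Lower MSE maps to higher PSNR."""
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is monotone in the MSE, using it as the objective would not change the optimization behavior, only the reported scale.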

Further experimentation can be done by approximating paintings on a toroidal grid that allows for wraparounds, rather than a regular 2D grid. We suspect that the polygons are biased towards placement that covers the more central areas of the grid. This might lead to less optimal results when compared to approximations that are executed on a torus-like structure where the polygons are allowed to wrap to the other side of the image.


Bibliography

[1] David Abramson, Mohan Krishnamoorthy, Henry Dang, et al. "Simulated annealing cooling schedules for the school timetabling problem". In: Asia Pacific Journal of Operational Research 16 (1999), pp. 1–22.

[2] Jarmo T Alander. "On optimal population size of genetic algorithms". In: CompEuro 1992 Proceedings computer systems and software engineering. IEEE. 1992, pp. 65–70.

[3] AlteredQualia. Image Evolution. 2009. url: https://alteredqualia.com/visualization/evolve/.

[4] George E Andrews and Kimmo Eriksson. Integer partitions. Cambridge University Press, 2004. isbn: 9780521600903.

[5] Anne Auger and Benjamin Doerr. Theory of randomized search heuristics: Foundations and recent developments. Vol. 1. World Scientific, 2011. isbn: 9789814282666.

[6] Joachim Berg et al. "Evolved Art with Transparent, Overlapping, and Geometric Shapes". In: arXiv preprint arXiv:1904.06110 (2019).

[7] P. W. Collingridge. Evolving images. 2009. url: http://www.petercollingridge.co.uk/blog/evolving-images/.

[8] Pillow contributors. Pillow color modes. 2018. url: https://pillow.readthedocs.io/en/5.1.x/handbook/concepts.html.

[9] Pillow Contributors. Pillow: the friendly PIL fork. 2019. url: https://python-pillow.org/.

[10] Pillow-SIMD Contributors. Pillow Performance. 2019. url: https://python-pillow.org/pillow-perf/.

[11] Tim B Cooper and Jeffrey H Kingston. "The complexity of timetable construction problems". In: International Conference on the Practice and Theory of Automated Timetabling. Springer. 1995, pp. 281–295.

[12] Chris Cummins. Grow Your Own Picture. 2013. url: https://chriscummins.cc/s/genetics/.

[13] Yann N. Dauphin. Clojure: Genetic Mona Lisa problem in 250 beautiful lines. 2009. url: http://npcontemplation.blogspot.com/2009/01/clojure-genetic-mona-lisa-problem-in.html.

[14] E. Delp and O. Mitchell. "Image Compression Using Block Truncation Coding". In: IEEE Transactions on Communications (Sept. 1979), pp. 1335–1342. issn: 0090-6778. doi: 10.1109/TCOM.1979.1094560.

[15] Marco Dorigo and Gianni Di Caro. "Ant colony optimization: a new meta-heuristic". In: Proceedings of the 1999 congress on evolutionary computation-CEC99 (Cat. No. 99TH8406). Vol. 2. IEEE. 1999, pp. 1470–1477.

[16] Michael R Garey, David S Johnson, and Larry Stockmeyer. "Some simplified NP-complete problems". In: Proceedings of the sixth annual ACM symposium on Theory of computing. ACM. 1974, pp. 47–63.

[17] Stuart Geman and Donald Geman. "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images". In: Readings in computer vision. Elsevier, 1987, pp. 564–584.

[18] John J Grefenstette. "Optimization of control parameters for genetic algorithms". In: IEEE Transactions on systems, man, and cybernetics 16.1 (1986), pp. 122–128.

[19] Roger Johansson. Genetic Programming: Evolution of Mona Lisa. 2008. url: https://rogerjohansson.blog/2008/12/07/genetic-programming-evolution-of-mona-lisa/.

[20] Scott Kirkpatrick, C Daniel Gelatt, and Mario P Vecchi. "Optimization by simulated annealing". In: Science 220.4598 (1983), pp. 671–680.

[21] Christos Koulamas, SR Antony, and R Jaen. "A survey of simulated annealing applications to operations research problems". In: Omega 22.1 (1994), pp. 41–56.

[22] AS Lewis and G Knowles. "Image compression using the 2-D wavelet transform". In: IEEE Transactions on Image Processing (1992), pp. 244–250.

[23] Misha Paauw and Daan van den Berg. "Paintings, Polygons and Plant Propagation". In: International Conference on Computational Intelligence in Music, Sound, Art and Design (Part of EvoStar). Springer. 2019, pp. 84–97.

[24] C.A. Roa-Sepulveda and B.J. Pavez-Lazo. "A solution to the optimal power flow using simulated annealing". In: International Journal of Electrical Power & Energy Systems 25.1 (2003), pp. 47–57. issn: 0142-0615. doi: 10.1016/S0142-0615(02)00020-0. url: http://www.sciencedirect.com/science/article/pii/S0142061502000200.

[25] Stuart J Russell and Peter Norvig. Artificial intelligence: a modern approach. Malaysia; Pearson Education Limited, 2016. isbn: 978-0-13-604259-4.

[26] Rob A Rutenbar. "Simulated annealing algorithms: An overview". In: IEEE Circuits and Devices magazine (1989), pp. 19–26.

[27] Abdellah Salhi and Eric S Fraga. "Nature-inspired optimisation approaches and the new plant propagation algorithm". In: Proceedings of the International Conference on Numerical Analysis and Optimization (ICeMATH2011) (2011).

[28] Andrea Schaerf. "Tabu Search Techniques for Large High-school Timetabling Problems". In: Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 1. AAAI'96. Portland, Oregon: AAAI Press, 1996, pp. 363–368. isbn: 0-262-51091-X. url: http://dl.acm.org/citation.cfm?id=1892875.1892930.

[29] SURFsara services. SURFsara. 2019. url: https://userinfo.surfsara.nl/.

[30] Phil Stubbings. Genetic Algorithm for Image Evolution. 2012. url: http://parasec.net/blog/image-evolution/.

[31] W3C. PNG Specification. 2003. url: https://www.w3.org/TR/2003/REC-PNG-20031110/#6AlphaRepresentation.

[32] Zhou Wang et al. "Image quality assessment: from error visibility to structural similarity". In: IEEE transactions on image processing 13.4 (2004), pp. 600–612.

[33] Andrew B Watson. "Image compression using the discrete cosine transform". In: Mathematica journal (1994), p. 81.

[34] Xin-She Yang. "A new metaheuristic bat-inspired algorithm". In: Nature inspired cooperative strategies for optimization (NICSO 2010). Springer, 2010, pp. 65–74.

[35] Jinghui Zhong et al. "Comparison of performance between different selection strategies on simple genetic algorithms". In: International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06). Vol. 2. IEEE. 2005.
