
Where Sometimes the Really Hard Problems Are

Thomas J. Meeus 11276363

Bachelor thesis, Credits: 18 EC

Bachelor's programme in Artificial Intelligence (Kunstmatige Intelligentie), University of Amsterdam

Faculty of Science, Science Park 904, 1098 XH Amsterdam

Supervisor

Dhr. drs. D. van den Berg, Informatics Institute, Faculty of Science, University of Amsterdam, Science Park 904, 1098 XH Amsterdam

June 28th, 2019


Where Sometimes the Really Hard Problems Are

A replication study of the paper by Cheeseman et al.

Thomas Meeus

University of Amsterdam

Bachelor AI Faculty of Science

Email: thomas.meeus@student.uva.nl

Abstract—NP-complete problems are widely studied problems that no known algorithm can solve in polynomial time. In the paper "Where the Really Hard Problems Are", Cheeseman et al. show that a set of NP-complete problems exhibit a phase transition: a transition from problem instances that have a high probability of being solvable to problem instances that have a low probability of being solvable. They also show that the hardest instances of an NP-complete problem are located near the phase transition. One of those NP-complete problems is the SAT problem, in which a formula in conjunctive normal form has to be solved. The algorithm that Cheeseman et al. used to find the phase transition for the SAT problem has been replicated and the graphs have been reproduced. In addition, two other algorithms have been implemented to compare results: a conflict-driven clause learning algorithm and a look-ahead algorithm. The graph showing the solvability of Cheeseman et al.'s algorithm has been successfully replicated and the phase transition has been found around an M/N ratio of ∼4.2 for the 3-SAT problem. The graphs that display the computational cost in iterations show that, for the tested algorithms, the hardest instances in terms of iteration cost can be found near the phase transition. The graphs that display the computational cost in seconds show that the hardest instances in terms of cost in seconds are not always located near the phase transition. Both Cheeseman et al.'s algorithm and the look-ahead algorithm show that, for lower N values, the phase transition is not necessarily where the hardest problem instances are located when regarding the computational cost in seconds. Furthermore, the results of the algorithms can be used to find the hardest SAT instances in terms of both time and iteration cost.

Keywords–Phase transition; Critical value; SAT problem; Backtracking; Complete algorithm; Resolution; DPLL algorithm; CDCL algorithm; Look-ahead based SAT solver

I. INTRODUCTION

Within computer science there are different classes of problems, including the classes P and NP. Problems in class P are solvable in polynomial time by the best-known algorithm. This means that the runtime for solving the problem increases polynomially as the problem's state space increases [1] [2]. An example of a problem in class P is finding the lowest value of a given list; the complexity of this problem is O(n), meaning that in the worst case the problem is solved by going through the input (which is n items long) exactly once. This is an example of polynomial time. Problems in class NP are problems that no known algorithm can solve in polynomial time; they are solved in non-polynomial time. This means that when an instance of an NP problem is large

enough, no known algorithm will be able to solve it within a reasonable amount of time. However, for problems in class NP a proposed solution can be verified in polynomial time. Problems where even this cannot be done in polynomial time belong to the class NP-hard [2].

Within the class NP there is also a subclass of problems: the NP-complete problems. A problem that is NP-complete is generally seen as one of the hardest NP problems and can be transformed into any of the other NP-complete problems [2]. This means that if someone managed to create an algorithm that solves one NP-complete problem in polynomial time, every NP-complete problem would be solvable in polynomial time. Whether it is even possible to create such an algorithm has not yet been confirmed; the P = NP question is still unresolved to this day [3] [4] [5].

A widely studied NP problem is the Boolean satisfiability problem (SAT). This problem belongs to the constraint satisfaction problem (CSP) family, meaning that it consists of a set of objects whose state must satisfy a number of constraints [6]. In the SAT problem a logical formula in conjunctive normal form (CNF) has to be solved; an example of such a formula can be seen in formula (1). Every literal, which is a variable like x or y possibly preceded by a negation, can be assigned a Boolean value: true or false. The ¬ symbol denotes a negation, which means that the Boolean value of the literal is reversed. For example, if y is true, then ¬y is false. The ∨ symbol denotes an "or" operation. This means that in the clause (x ∨ y ∨ w) at least one of the literals has to be true to make that clause true. A clause is a set of literals between parentheses. The ∧ symbol denotes an "and" operation. This means that in the clauses

(x ∨ y ∨ w) ∧ (¬z ∨ ¬x ∨ ¬y), both (x ∨ y ∨ w) and (¬z ∨ ¬x ∨ ¬y)

have to be true to make that part of the formula true. The entire formula is solved when every clause in the formula is true, which means that at least one of the literals in every clause has to be true. This problem is NP-complete because no known algorithm can solve it in polynomial time in the worst case. The best known algorithm for solving the SAT problem to date is the PPSZ algorithm, which has a complexity of O(2^0.386n) for 3-SAT problems, meaning that it is still non-polynomial [7] [8].

(x ∨ y ∨ w) ∧ (¬z ∨ ¬x ∨ ¬y) ∧ (¬y ∨ z ∨ ¬w)   (1)

Although NP-complete problems cannot be solved in polynomial time, most problem instances are still relatively easy


to solve. The really hard instances of the problems can be found at a certain value of a given order parameter [1] [9]. For example, for the graph coloring problem this order parameter can be the average connectivity of a node in the graph. The critical value of the order parameter can be found at the point where instances of a certain problem change from having a high probability of being solvable to having a low probability of being solvable. This phenomenon is called the phase transition [1]. When a set of problem instances has an order parameter lower than the critical value, almost all instances are solvable. When the order parameter is higher than the critical value, almost all instances are unsolvable. When a set of problem instances has an order parameter close to or equal to the critical value, some instances are solvable and some are not. This is where the hard problems occur. In this context, hard means that a complete algorithm takes a long time or needs many iterations to solve a problem instance or to show that it is unsolvable. This is the result of the algorithm going through many almost-solutions, which take time to work through [1]. In the paper "Where the Really Hard Problems Are", P. Cheeseman, B. Kanefsky and W. Taylor (henceforth: "Cetal") show that the phase transition is present in a set of NP-complete problems. Cetal's research on the SAT problem from that paper is replicated in this thesis. The algorithm that they made is reproduced and the thesis is extended with two extra algorithms to compare the different results.

Replication studies are very important in science for a few reasons. First, it is often useful to check whether the experiments from a study are correct. Second, checking whether papers are still relevant can prevent mistakes from being made based on false information. In two papers by C. Camerer et al., multiple social science and economics papers have been replicated to check whether they still hold [10] [11]. Some of the replicated experiments still produce the same data, but some show different results from the original papers.

Replicating papers is also important within computer science. For example, a key part of making and maintaining software is experimenting with it to constantly evaluate its efficiency and correctness. This can be done by testing the software in replication papers [12]. The same applies to algorithms, which need to be checked and possibly improved in terms of performance. In the paper by G. van Horn et al., the Hamiltonian cycle problem from Cetal's paper has been replicated and the algorithm reproduced, just like this thesis does for the SAT problem. That paper successfully found the phase transition and showed that Cetal's work is valid [13]. So, replication studies are very important for checking correctness and keeping information relevant.

The next section explains how the different CNFs are made, what sort of algorithms are used and how the different graphs are produced. Section III contains detailed descriptions of the algorithms that were used, including Cetal's DPLL algorithm, a conflict-driven clause learning algorithm and a look-ahead algorithm. In section IV the results are presented and the different algorithms are compared. In section V conclusions are drawn and some discussion points are raised. Section VI contains the acknowledgements, and this thesis ends with the references.

II. METHODS

The SAT formulas were generated using different parameters, with k being the most important one. Because this parameter is so important for the structure of a SAT problem, many papers call the problem the k-SAT problem. Because Cetal do not state the meaning of k, it had to be deduced from the literature. Two possible meanings of k were found. The first is that k is the maximum number of literals per clause: if k has a value of 3, a clause can have a length between 1 and 3 [9]. Every length has the same chance of occurring, so the probability that a clause is one literal long is equal to the probability that it is three literals long. The second possible meaning is that k indicates the exact length of the clauses: when k is, for example, 4, every clause in the CNF is four literals long. The latter interpretation has been used for this thesis, for two reasons. First, most of the papers on the SAT problem that were consulted use k as the exact length of the clauses [14] [15] [16]. Second, when using k as the maximum length of the clauses, there is always a small chance that a CNF with k = 2 is exactly the same as a CNF with k = 4. This would mean that formula (1) could be an example of a 2-SAT, 3-SAT or even a 4-SAT problem, which might result in unreliable data.

The second and third parameters that are important for the SAT formulas are M, the number of clauses in the CNF, and N, the total number of variables. So, formula (1) in the introduction has k = 3, M = 3 and N = 4 as parameters. M and N are mainly used for creating the data and for finding the phase transition.

The CNFs used in this thesis are made in the following manner. The parameters discussed above have to be known before a CNF can be made. The first step in creating the formula is making an empty list to which the clauses can be added. Then a list of N different values is made, from which the literals are created at a later stage. After this, the clauses are made: M different clauses of length k are generated iteratively. A clause is made by adding k different literals to an empty list; adding identical literals to one clause is avoided. Each literal stores its variable number, its Boolean value and whether it is negated. The number of the literal is randomly chosen from the value list and every literal has a 50/50 chance of being negated. The Boolean value is initially set to none; it is changed to true or false by the different algorithms. The clauses are made individually and added to the CNF. When this has been done M times, the CNF is complete.
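To make the generation procedure concrete, the sketch below generates a random k-SAT CNF in the way described above. The representation (a clause as a list of (variable, negated) pairs) and the function name are choices made for this illustration, not the exact code used for the thesis.

import random

def generate_cnf(n_vars, n_clauses, k):
    """Generate a random k-SAT CNF: a list of M clauses, each a list of k
    distinct literals represented as (variable number, negated) pairs."""
    variables = list(range(1, n_vars + 1))
    cnf = []
    for _ in range(n_clauses):
        chosen = random.sample(variables, k)                       # k different variables per clause
        clause = [(var, random.random() < 0.5) for var in chosen]  # 50/50 chance of a negation
        cnf.append(clause)
    return cnf

# Example: a 3-SAT instance with N = 20 variables and M/N = 4.2, i.e. M = 84 clauses.
formula = generate_cnf(n_vars=20, n_clauses=84, k=3)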

To reproduce the results from Cetal's paper, it had to be known how many and what kind of CNFs they used. They divided their CNFs in the following way. First, they used a total of three k-SAT problems to find the phase transitions: 2-SAT, 3-SAT and 4-SAT. Second, for every k-SAT group multiple variable groups were made, with every group having its own N value. For the 3-SAT problem a total of four different variable groups were made, with the N values 7, 10, 15 and 25. For the 2-SAT and 4-SAT problems only an N value of 15 was used. Third, the different N value groups were plotted against various order parameter values. However, this presents a problem: the horizontal axis that Cetal used as order parameter is not clear. First of all, what "avg. constraints per variable" means for the SAT problem is not


known. No literature has been found that uses the same order parameter for the SAT problem. Second, the values of Cetal's order parameter are unreadable (see the top graph in Figure 4). Because Cetal's order parameter presents these problems, the decision was made to use M/N as order parameter. This parameter is easy to use for generating large amounts of data and is used in different papers about the SAT problem [14] [15].

In this thesis a total of 70 different M/N ratios were used per N value, ranging from 1 to 8 with increments of 0.1. Because these values were used as order parameter, the different N value groups that Cetal used had to be changed: using the N values that they used results in M values (numbers of clauses) with decimals. For example, if the N value for a CNF is 7 and the order parameter is 4.5 M/N, this results in a CNF with 31.5 clauses, which is impossible to make because partial clauses do not exist. This is why the N values of 7, 10, 15 and 25 have been changed to 10, 20, 30 and 40 for 3-SAT. For 2-SAT and 4-SAT the N value of 30 has been used; this is the third biggest N value, and Cetal also used the third biggest N value for 2-SAT and 4-SAT. Furthermore, for every M/N ratio a total of 200 different CNFs were created to make the graphs as reliable as possible. As a result, a total of 56,000 CNFs were generated for the 3-SAT problem and a total of 14,000 for the 2-SAT and 4-SAT problems each. This is most likely far more than Cetal used, making this replication more extensive than the original paper.
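A minimal sketch of the resulting experimental sweep is shown below, reusing the generator sketched earlier. The exact ratio endpoints and loop structure are assumptions for illustration; the point is that with N a multiple of 10 and ratios in steps of 0.1, M = ratio * N is always a whole number.

n_values = [10, 20, 30, 40]                          # 3-SAT variable groups used in this thesis
ratios = [round(0.1 * i, 1) for i in range(11, 81)]  # 70 M/N ratios between 1 and 8 (assumed endpoints)
instances_per_point = 200

for n in n_values:
    for ratio in ratios:
        m = round(ratio * n)                         # number of clauses; integral because n is a multiple of 10
        batch = [generate_cnf(n, m, 3) for _ in range(instances_per_point)]
        # each CNF in the batch is then handed to the solvers described in section III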

To be able to show the phase transition for the SAT problem, complete algorithms had to be used. A complete algorithm returns a solution when there is one and reports that there is no solution when there is none. A complete algorithm is useful because it always returns an answer, but because it goes through almost the entire state space in the worst case, it is exhaustive. This means that the cost of solving a problem can become very high when the problem's state space grows. For showing the phase transition of an NP-complete CSP such as SAT, it is beneficial to use a complete algorithm, because it is able to show the transition from solvable instances to unsolvable instances even when a problem instance is extremely big. The different algorithms are discussed in section III.

Three different graphs have been made for every algorithm. The first graph covers the solvability of the algorithms: it shows the percentage of solvable CNFs per M/N ratio for every k-SAT problem. This graph is used to show where the phase transition is located. The second graph shows the average iteration cost of the different algorithms and is used to find where the computational cost in iterations is highest; it only contains the 3-SAT problem instances. The third graph shows the average time in seconds that an algorithm uses to solve the different CNFs and is used to find where the computational cost in seconds is highest; it also only contains the 3-SAT problem instances. Furthermore, to get a better understanding of how good the various algorithms are in terms of performance, a blank baseline algorithm has been made. This is a simple DPLL algorithm without any heuristics and without any form of resolution. It is a useful baseline because the algorithms that were tested for this thesis are all variations on the

simple DPLL algorithm. For this algorithm a cutoff is set at 15,000 iterations, because the algorithm takes an extremely long time to solve the CNFs with an N value of 40. Furthermore, when 50% or more of the CNFs for a specific M/N ratio reach the cutoff, the data point is not plotted.

III. ALGORITHMS

A. Cetal’s DPLL algorithm

Because this thesis replicates the SAT part of Cetal's paper, the first algorithm implemented was the one that they used to search for the really hard problem instances. They state that "a form of resolution has been used to reduce random k-SAT problems before applying a simple backtrack algorithm with a most-constrained-first heuristic". Three parts of this sentence are not specific: what form of resolution is used, what simple backtrack algorithm is used, and what the most constrained variable in a SAT problem is.

1) Resolution: The most basic form of resolution is the one that M. Davis and H. Putnam used in their paper "A Computing Procedure for Quantification Theory" [17]. This form of resolution (henceforth: "Davis-Putnam resolution") can be used in the following way. Suppose there are two clauses (¬x ∨ y) ∧ (x ∨ z). These clauses are logically equivalent to the clause (y ∨ z). This is justified because when x is true, the only way for (¬x ∨ y) to be true is if y is true. Alternatively, when x is false, the only way for (x ∨ z) to be true is if z is true. So, regardless of the Boolean value of x, one of the literals y and z has to be true, which results in the clause (y ∨ z). Thus, every time there are two or more instances of the same literal in different clauses, some with and some without a negation, the clauses can be merged and that literal removed [17]. This can also be used for larger clauses. See Figure 1 for an example of Davis-Putnam resolution applied to a 3-SAT problem instance.

Figure 1. An example of Davis-Putnam resolution on a 3-SAT problem instance.

Another form of resolution is called unit resolution [18] [19]. Suppose that while solving a CNF the following clauses are found: (x) ∧ (¬x ∨ y) ∧ (x ∨ z). Because a CNF can only be solved when every clause is true, clauses with only a single literal in them have to be true too. This means that the literal in a single-literal clause has to get the Boolean value it needs to be true. In the given example the clause with the single literal contains only x; to solve this CNF, the literal x has to be assigned the Boolean value true. This results in the following CNF: (y). What happens is that clauses that contain a true literal are removed and literals that are false are removed from their clauses. So, the main idea behind unit resolution is that clauses with a single literal in them have to be true. This means that when one occurs, the literal in that clause can be assigned a Boolean


value first so the clause becomes true. This resolution method can be used during the simple backtrack algorithm.
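The sketch below shows this unit resolution step on the clause representation introduced earlier: repeatedly pick a single-literal clause, force its value, and reduce the CNF. It is an illustrative sketch, not Cetal's code; the function name and return values are assumptions.

def unit_propagate(cnf):
    """Apply unit resolution until no single-literal clauses remain, or until
    an empty clause (a conflict) appears. Returns the reduced CNF and the
    forced assignments as a {variable: Boolean} dictionary."""
    assignments = {}
    while True:
        unit = next((clause for clause in cnf if len(clause) == 1), None)
        if unit is None:
            return cnf, assignments
        var, negated = unit[0]
        assignments[var] = not negated               # the value that makes the unit clause true
        reduced = []
        for clause in cnf:
            if (var, negated) in clause:             # clause now contains a true literal: drop it
                continue
            # the opposite literal is now false: remove it from the clause
            reduced.append([lit for lit in clause if lit != (var, not negated)])
        cnf = reduced
        if any(len(clause) == 0 for clause in cnf):  # an empty clause was created: stop
            return cnf, assignments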

The last form of resolution, unit resolution, was used to replicate Cetal's algorithm, for the following reasons. First, using unit resolution in combination with the simple backtrack algorithm explained in the following section is a logical step to make; the combination of the two makes the algorithm much more efficient in solving SAT problems [19]. Second, Davis-Putnam resolution is part of a bigger algorithm for solving SAT problems, the Davis-Putnam (DP) algorithm [17], and using a part of a bigger algorithm to reduce random SAT formulas does not seem logical. Third, none of the papers found uses Davis-Putnam resolution to reduce SAT formulas, whereas unit resolution has been used in combination with backtrack algorithms in different papers [14] [16] [19]. For these reasons the assumption is made that Cetal used unit resolution to reduce the CNFs where possible during the backtrack algorithm.

2) Backtrack algorithm: As backtrack algorithm Cetal used a "simple backtrack algorithm" with the most-constrained-first heuristic. Again, there is no mention of which simple backtrack algorithm is used; whether it uses a lazy backtrack structure or normal backtracking is not stated. This means that a calculated guess had to be made about which backtrack algorithm had been used. One of the best-known backtrack algorithms for solving SAT problems is a variation on the DP algorithm in which M. Davis et al. replaced the resolution step with a backtrack step. So, rather than using Davis-Putnam resolution to solve the CNFs, the search space is traversed iteratively by assigning Boolean values to the literals. This algorithm is called the Davis-Putnam-Logemann-Loveland (DPLL) algorithm [20].

Algorithm 1 DPLL(CNF)

1: if CNF = ∅ then
2:     return true
3: if empty clause ∈ CNF then
4:     return false
5: x ← unassigned literal from CNF
6: if DPLL(CNF[x ← true]) = true then
7:     return true
8: if DPLL(CNF[x ← false]) = true then
9:     return true
10: return false

The algorithm works as follows. Every iteration starts by checking whether the CNF is empty; if so, return true. Next, the algorithm checks whether there is an empty clause; if so, return false. When these checks have been done, the first unassigned literal is searched for. This literal is assigned a Boolean value, with true always tried first. The algorithm is then called recursively with the given literal and Boolean value. The CNF is updated by assigning every instance of the given literal its Boolean value. When a literal in the CNF evaluates to true, the clauses that contain that literal can be removed, because a clause only needs one true literal to be true. When a literal evaluates to false, the literal is removed from its clauses, so the algorithm will not use this literal again in this search branch. If a branch results in an unsatisfiable CNF, which means that the CNF contains an empty clause, the algorithm backtracks over the last assigned literal: instead of true, the literal is assigned false. The main idea behind backtracking is that when an unsatisfiable CNF has been found, the last decision made is changed. This is repeated until the CNF is satisfied or every possible combination of Boolean values has been tried, which means that the CNF is unsatisfiable. See Algorithm 1 for the pseudocode; the input of the algorithm is a CNF formula.

Figure 2. An example of how the simple DPLL algorithm works on a 2-SAT problem instance.

This algorithm is applied to the example formula (p ∨ ¬q) ∧ (q ∨ r) ∧ (¬r ∨ ¬p) in the following manner (see Figure 2). First, because neither check is satisfied yet, a literal is picked from the CNF. This is p, because it is the first unassigned literal, and it is assigned the value true. As a result, the clause (p ∨ ¬q) is removed because p is true, and in the clause (¬r ∨ ¬p) the literal ¬p is removed. The resulting CNF after these removal steps is (q ∨ r) ∧ (¬r). Now the next literal, q, is chosen and assigned a Boolean value, again true. By removing the clauses that contain a true literal and the literals that are false, the new CNF becomes (¬r). The only literal left to work with is r, which is again assigned the value true. Because of the negation before r, the literal ¬r evaluates to false. Because false literals are removed from clauses without removing the clause itself, the resulting CNF is (). An empty clause indicates that the clause has no true literal in it, meaning that the entire clause is false, so the CNF is not solvable with the used assignments. Now the algorithm backtracks over the last decision, which was assigning the literal r the value true, and gives that literal the opposite value false. This results in an empty CNF, meaning that the CNF is satisfiable. The satisfying assignment for this CNF is p := true, q := true, r := false.

The resolution step is added to this algorithm before the two checks. So, when a CNF contains a single-literal clause, the unit resolution procedure takes over. It keeps working as long as a new single-literal clause appears after a resolution cycle, and it stops and returns the CNF when there are no single-literal clauses left or when an empty clause has been made. When the backtrack algorithm finds an empty clause, a backtrack step is made if possible; otherwise the conclusion is drawn that the CNF is unsatisfiable.
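Putting the pieces together, the sketch below is a minimal recursive DPLL in the spirit of Algorithm 1, with the unit resolution step from the earlier sketch placed before the two checks. Branching on the first unassigned literal is a placeholder; Cetal's most-constrained-first heuristic, discussed next, would replace that choice.

def dpll(cnf):
    """Recursive DPLL with unit resolution; returns True when the CNF is
    satisfiable and False otherwise. Uses the (variable, negated) clause
    representation from the sketches above."""
    cnf, _ = unit_propagate(cnf)                     # resolution step before the two checks
    if not cnf:                                      # empty CNF: every clause satisfied
        return True
    if any(len(clause) == 0 for clause in cnf):      # empty clause: conflict in this branch
        return False
    var, _ = cnf[0][0]                               # branch on the first unassigned literal
    for value in (True, False):                      # try true first, backtrack to false
        satisfied = (var, not value)                 # literal made true by this assignment
        falsified = (var, value)                     # literal made false by this assignment
        reduced = [[lit for lit in clause if lit != falsified]
                   for clause in cnf if satisfied not in clause]
        if dpll(reduced):
            return True
    return False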


3) Most-constrained-first heuristic: The most-constrained-first (MCF) heuristic that Cetal mention presents another small problem: what exactly the most constrained variable is in a CNF is not clear. For problems like the graph coloring problem, papers were found that state the meaning: there, the most constrained variable is the node with the smallest number of possible colors left [21]. Because no papers were found that use the MCF heuristic while solving SAT problems, a decision had to be made about what the most constrained variable is in a CNF. In this thesis the most constrained literal is the literal that occurs in the most clauses. The rationale behind this decision is that by filling in the literal that occurs in the most clauses, the depth of the search tree stays smaller. Suppose a specific literal appears most often in the different clauses. By assigning this literal a Boolean value, many clauses and literals are removed, so in the next iteration fewer literals and clauses have to be filled in, resulting in a faster algorithm.
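Under the interpretation chosen here, the heuristic reduces to a simple occurrence count, as in the sketch below (again using the clause representation from the earlier sketches; the function name is illustrative).

from collections import Counter

def most_constrained_variable(cnf):
    """Return the variable that occurs in the most clauses: the MCF
    interpretation used in this thesis."""
    counts = Counter(var for clause in cnf for var, _ in clause)
    return counts.most_common(1)[0][0]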

B. Conflict-driven clause learning algorithm

The first algorithm that was looked into to find a phase transition is the conflict-driven clause learning (CDCL) algorithm, which is largely inspired by the DPLL algorithm [22]. Just like DPLL, CDCL uses a form of backtracking to find a solution: when there is a solution, the algorithm returns that the CNF is solvable, and when there is no solution, it states this too. Because the algorithm always returns whether the CNF is solvable, it is a complete algorithm, just like DPLL. The biggest difference between CDCL and DPLL, however, is the way the two algorithms backtrack. DPLL backtracks recursively; CDCL instead keeps a list of the operations that have been done and backtracks by removing certain operations from this list. This method of backtracking has several advantages, including backjumping and clause learning [22] [23].

The algorithm works as follows (see Algorithm 2 for the pseudocode). The algorithm receives a CNF and an empty operation list that will be filled in. The first operation is setting the decision level to 0; the decision level is needed for backtracking. While the CNF is not solved, meaning that not every clause has at least one true literal in it, the algorithm keeps working. The first thing that happens in the while-loop is emptying the Boolean values in the given CNF and filling it in again with the operations from the operation list, so that the rest of the functions work with the right Boolean values. When unit resolution is not possible, the most constrained literal is chosen. This literal is used to create an operation that is added to the operation list. The operation receives the following variables: the literal's value, the Boolean value that was assigned, a decision value and the decision level. Because this operation is not the result of unit resolution, the Boolean value and the decision value are both set to true; the Boolean value is set to false when backtracking (this can also be done the other way around). The decision value is set to true because this is needed for the backtrack step; more on this later. When the operation is added to the list, the decision level increases and the algorithm proceeds with a new iteration.

When unit resolution is possible, the following steps are taken.

Algorithm 2 CDCL(CNF, operation list)

1: decision level ← 0
2: while CheckCDCLSolved(CNF, operation list) = false do
3:     CNF ← EmptyBooleansFromCNF(CNF)
4:     CNF ← RunOperationList(CNF)
5:     if CheckResolutionClauses(CNF) = true then
6:         operation list ← UnitResolution(CNF, operation list)
7:     else
8:         operation list ← PickBranchingLiteral(CNF, operation list)
9:         decision level ← decision level + 1
10:    if ConflictCheck(CNF, operation list) = true then
11:        conflict level ← ConflictAnalysis(CNF, operation list)
12:        new clause ← ConflictClause(CNF, operation list)
13:        if conflict level = -1 then
14:            return false
15:        else
16:            operation list ← BacktrackFunction(CNF, operation list, conflict level)
17:            decision level ← conflict level
18: return true

First, the clause where unit resolution is possible is found. A new operation can be made by checking what Boolean value the literal in that clause needs to make the entire clause true. The operation is assigned the following values: the literal's value, the Boolean value that makes the clause true, a decision value and the decision level. The decision value in the operation is false, because with unit resolution the Boolean value cannot be decided; it has to be derived from the clause. When the operation is added to the operation list, another check for unit resolution is done. This only stops when no unit resolution steps can be done or when a conflict occurs. An important note: unit resolution does not increase the decision level. The decision level is only increased when an operation is made without unit resolution, when the Boolean value of a given literal can be decided. This is important for the backtracking step.

Every iteration of the algorithm ends with a conflict check. This step checks whether the given operation list results in a clause with only false literals in it. When this occurs, the following steps are taken. First, an analysis is made to find the decision level to backtrack to. With normal backtracking, this is the level of the last operation with a decision value of true. When no operation with a decision value of true is left, the function returns a decision level of -1, which results in the algorithm declaring the CNF unsolvable. Because CDCL backtracks using an operation list instead of the traditional way, backjumping becomes a possibility.
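For the normal backtracking case, the analysis described above amounts to scanning the operation list from the end for the most recent decision operation, as in the sketch below. The tuple layout of an operation is an assumption made for this illustration.

def conflict_analysis(operation_list):
    """Return the decision level to backtrack to: the level of the last
    operation with a decision value of true, or -1 when none is left
    (meaning the CNF is unsatisfiable). Each operation is assumed to be a
    (variable, boolean_value, is_decision, decision_level) tuple."""
    for variable, boolean_value, is_decision, decision_level in reversed(operation_list):
        if is_decision:
            return decision_level
    return -1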

Backjumping works by returning the decision level of a decision operation that plays no role in the conflict. For instance, suppose a decision operation is made that sets the literal y to true. The next decision literal is x, which is also set to true. However, setting the Boolean value of x to true results in a conflict. Normal backtracking goes back one decision and sets the Boolean value of x to false; this also results in a conflict, after which the Boolean value of y is backtracked over and changed. However, by looking at what caused the conflict, backjumping can be used. Suppose the Boolean value of x was just set to true and this results in the following conflict clauses: (¬x ∨ ¬z) ∧ (¬x ∨ z). Literal y does not play any role in this conflict, making backtracking over x completely redundant, so a backjump can be made to the literal y. This is also implemented in the algorithm made for this thesis: by searching for a decision operation that does not play a role in a given conflict and


returning the corresponding decision level, a backjump can be made. However, it is important not to jump past a decision operation that does play a role in the conflict: when a decision operation is chosen that has a lower decision level than another decision operation that does play a role, the algorithm will make mistakes. When backjumping is done correctly, a part of the search tree that an algorithm like DPLL would search through is skipped, which should increase the efficiency of the algorithm [22].

After the analysis is done, another step is taken to improve the efficiency of the algorithm. This step creates new clauses from the clauses that caused a conflict; in other words, the algorithm learns from its mistakes. For instance, suppose the following conflict clauses: (x ∨ y ∨ ¬z) ∧ (¬x ∨ q ∨ ¬z). Keep in mind that all the literals in the clauses are filled in, otherwise the conflict would not (yet) be present. The conflict literal in this conflict is x, because if it is true the second clause becomes false, and if it is false the first clause becomes false. In this situation the form of resolution that was not used to reduce the CNFs (Davis-Putnam resolution, described earlier in this section) can be used. This is justified by the following facts: when x is set to true, literal q or literal ¬z has to be true, and when x is set to false, literal y or literal ¬z has to be true. Combining the clauses without the conflict literal results in the clause (y ∨ ¬z ∨ q), which can then be added to the CNF. By doing this, some search branches that would normally be searched through can be avoided, increasing the efficiency.

The CDCL algorithm made for this thesis creates clauses in the following manner. The conflict literal that caused the conflict in the CNF is always the last operation in the operation list, because of the way the algorithm is built. Because the conflict literal is already known, the conflict clauses containing it can be found. These conflict clauses are divided into two lists: one with the clauses containing the conflict literal in its normal form and one with the clauses containing its negation. By going through the two lists and applying resolution between the selected clauses, new clauses can be made. When a clause is tautological, meaning that it contains both the normal form and the negation of one literal, it is removed, for two reasons: such clauses are redundant because they are always satisfied, and they can lead to mistakes. The remaining clauses are added to the CNF.
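A minimal sketch of this clause learning step is given below: resolve every pair of clauses from the two lists on the conflict variable and keep the non-tautological resolvents. The function name and list arguments are assumptions for illustration, using the same literal representation as the earlier sketches.

def learn_clauses(conflict_var, clauses_with_pos, clauses_with_neg):
    """Resolve each clause containing the positive conflict literal with each
    clause containing its negation; return the non-tautological resolvents."""
    learned = []
    for pos_clause in clauses_with_pos:
        for neg_clause in clauses_with_neg:
            # Merge both clauses and drop both polarities of the conflict variable.
            resolvent = {lit for lit in pos_clause + neg_clause if lit[0] != conflict_var}
            # Skip tautologies: a clause containing both a literal and its negation.
            if any((var, not neg) in resolvent for var, neg in resolvent):
                continue
            learned.append(sorted(resolvent))
    return learned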

There is a small difference between the way clauses are made in this thesis and in the paper that was used to create the CDCL algorithm [22]. In that paper, the clauses are made using not only the conflict literal, but also the other literals present in the conflict clause, which results in even more learned clauses that can be used to solve the CNFs. The problem with this is that it is not clear which clauses can be used for resolution and which cannot. So, only the conflict literal was used for making new clauses, because using the other literals might result in unwanted mistakes.

After the right decision level to backtrack to has been found and the new clauses have been added to the CNF, the actual backtrack function is called. This function goes through the operation list and finds the decision operation with the corresponding decision level. This operation gets a Boolean value and decision value of false, and the operations after it are removed from the list. When the decision level from the analysis is -1, the algorithm returns that the CNF is not solvable. When the backtrack step is done, the decision level is set to the decision level of the last decision operation in the operation list and the algorithm begins a new iteration. When every clause in the CNF has a true literal in it, the CNF is satisfiable.

C. Look-ahead algorithm

The last algorithm that was made to look for a phase transition is the look-ahead algorithm. This algorithm is also based on the DPLL framework (see Algorithm 3), which means that it is also complete [24] [25]. However, there are some differences that make the algorithm worth testing. The first and most important difference is the way the algorithm decides which literal to assign a Boolean value to. Instead of only checking which literal is best in the current CNF (using the MCF heuristic, for example), the look-ahead algorithm checks which literal is best by looking at the CNF after different assignments have been done (see Figure 3). This is done by a look-ahead function (see Algorithm 4), which works as follows. First, a set of preselected literals is made that the function uses. This set contains the literals that are located in binary clauses, i.e., clauses that contain only two literals. The idea behind picking these literals is that the resulting search tree becomes more balanced, which means that some almost-solutions are avoided, increasing the efficiency of the algorithm [24] [25]. The preselection is only used when ten or more literals are found that occur in binary clauses; if, for example, only two literals were used in the look-ahead function, the chance of missing the best literal would become very high. When fewer than ten literals are found, every literal is used for evaluation [24] [25].

Figure 3. DPLL-inspired path finding (left) and look-ahead-inspired path finding (right); the ? symbol denotes the target. This figure is adapted from the paper "Look-Ahead Based SAT Solver" by M. Heule and H. van Maaren.

When the set of preselected literals has been picked, the look-ahead function goes through them and stops when no changes have been made to the CNF. Every literal is assigned the Boolean values true and false in turn, to look at the resulting CNF. Three things can happen when a literal is assigned the Boolean values. When both Boolean values result in an empty clause, the CNF is returned with the literal filled in (it does not matter with which Boolean value); this results in a backtrack step by the DPLL structure. When only one Boolean value, true or false, results in an empty clause, the literal gets the Boolean value assigned that did not result in the empty clause. This is justified because, when the goal is solving the CNF, the literal can only be assigned the Boolean value that does not result in a conflict. When both Boolean


values do not result in a conflict, the heuristic score (h-score) of the literal is measured with a decision heuristic. This heuristic looks at the difference in structure between the current CNF and the CNF with the literal assigned. For example, suppose the current CNF has a total of 10 literals; assigning the literal the value true leaves a CNF with 7 literals, and assigning it the value false leaves a CNF with 5 literals. The h-score is calculated by taking the difference between the original CNF and the CNF with the true literal, and the difference between the original CNF and the CNF with the false literal, and taking the product of those two values. In the example this gives a score of (10 − 7) * (10 − 5) = 15. The idea behind this h-score is that the best literal to pick is the literal that reduces the CNF the most: the smaller the CNF, the faster a solution or a conflict is found. When no changes are being made to the CNF and every unassigned literal has its h-score, the literal with the highest h-score is given back to the DPLL algorithm. After the look-ahead function, another check for empty clauses and empty CNFs is done. Also, when the look-ahead function returns no decision literal, which can happen when a conflict occurs or when every preselected literal has been assigned a Boolean value by the function, the CNF is returned so a new iteration can start.
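The worked example above corresponds to a decision heuristic along the lines of the sketch below, where the size of a CNF is measured as its total number of remaining literals (an assumption made for this illustration).

def literal_count(cnf):
    """Total number of literal occurrences left in the CNF."""
    return sum(len(clause) for clause in cnf)

def h_score(cnf, cnf_if_true, cnf_if_false):
    """Product of how much each assignment shrinks the CNF; for the example
    above this gives (10 - 7) * (10 - 5) = 15."""
    before = literal_count(cnf)
    return (before - literal_count(cnf_if_true)) * (before - literal_count(cnf_if_false))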

Algorithm 3 DPLL(CNF)

1: if CNF = ∅ then
2:     return true
3: if empty clause ∈ CNF then
4:     return false
5: (CNF, x_decision) ← Look-ahead(CNF)
6: if CNF = ∅ then
7:     return true
8: if empty clause ∈ CNF then
9:     return false
10: if x_decision = none then
11:     return CNF
12: Boolean ← DirectionHeuristic(CNF, x_decision)
13: if DPLL(CNF[x ← Boolean]) = true then
14:     return true
15: if DPLL(CNF[x ← ¬Boolean]) = true then
16:     return true
17: return false

The second difference from the DPLL algorithm is that the look-ahead algorithm also checks which Boolean value is best to assign to the literal first; this is called the direction heuristic [25]. The direction heuristic looks at whether the literal occurs more often with or without a negation. When the literal occurs most often without a negation, it is better to assign the value true; when it occurs most often with a negation, it is better to assign the value false. The idea behind this is that the more clauses that can be removed from the CNF, the better: the algorithm finds a solution or a conflict faster when there are fewer literals left in the CNF, which helps the efficiency of the algorithm. When the chosen assignment results in a conflict at a later stage, the Boolean value is changed using backtracking.
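The direction heuristic reduces to comparing the occurrence counts of the two polarities, as in the sketch below (function name and representation as in the earlier sketches; illustrative only).

def direction_heuristic(cnf, var):
    """Return True when the variable should be tried as true first, i.e. when
    it occurs more often without a negation than with one."""
    positive = sum(1 for clause in cnf for v, neg in clause if v == var and not neg)
    negative = sum(1 for clause in cnf for v, neg in clause if v == var and neg)
    return positive >= negative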

Using the look-ahead function in combination with the DPLL algorithm might result in the CNFs being solved in fewer iterations.

Algorithm 4 Look-ahead(CNF)

1: P ← Preselect(CNF)
2: while no changes in CNF do
3:     for all variables x_i ∈ P do
4:         if empty clause ∈ CNF[x ← true]
5:         and empty clause ∈ CNF[x ← false] then
6:             return (CNF[x ← true], none)
7:         else if empty clause ∈ CNF[x ← false] then
8:             CNF ← CNF[x ← true]
9:         else if empty clause ∈ CNF[x ← true] then
10:            CNF ← CNF[x ← false]
11:        else
12:            H(x_i) ← DecisionHeuristic(CNF, CNF[x ← false], CNF[x ← true])
13: return (CNF, x_i with highest H(x_i))

Many almost-solutions are avoided by selecting the next literal to use in a smart way. However, there is one downside to this algorithm: the look-ahead function is very time consuming. Selecting the best literal means that the CNF has to be filled in with a large set of literals, checked for an empty clause and, if needed, evaluated with an h-score. Filling in and checking a CNF takes time, especially when the CNF is 200 or more clauses long. But because the function creates a balanced search tree, fewer almost-solutions are gone through when solving a CNF. This could mean that the algorithm underperforms when solving CNFs outside the phase transition, but performs well when solving CNFs near the phase transition.

IV. RESULTS

A. Cetal’s DPLL algorithm

Cetal's DPLL algorithm has been replicated and the bottom graph of Figure 4 has been generated. The results show that there is indeed a phase transition for the 3-SAT formulas. The critical value for the 3-SAT problem lies around ∼4.5 M/N and moves towards a value of ∼4.2 when the N value (the number of variables in a CNF) of the formulas becomes higher and the CNFs thus become bigger; this is similar to what was found in the literature [1] [9] [15] [16] [19]. The 2-SAT problem also shows a phase transition; the critical value of 2-SAT lies around ∼1.8 M/N, which matches the literature [26]. This also matches Cetal's graph in the sense that the critical value of the 2-SAT problem is lower than the critical value of the 3-SAT problem. Looking at how few data points there are in the phase transition of the 2-SAT problem, the assumption can be made that the transition is very sharp. This is most likely the result of the CNFs being relatively big in relation to the k value: the bigger the k value, the bigger the CNFs have to be to show a sharp phase transition. This phenomenon can also be seen in the 3-SAT data. As for the 4-SAT problem, the replication graph in Figure 4 does not show a phase transition while the original graph does. This is likely because the critical value of 4-SAT, with the M/N ratio as order parameter, is bigger than the maximum M/N ratio used to make the replication graphs. So, 4-SAT most likely still has a phase transition, but the M/N ratio of the critical value is too big to show in the graph. This has been shown in a paper about the phase transition for various SAT problems; the critical value for the 4-SAT problem is approximately ∼9.75 M/N [27].

Next, the computational cost in iterations for solving the 3-SAT and 4-SAT problems has been evaluated. Figure 5 shows the difficulty graphs that Cetal made using their


Figure 4. The top graph shows Cetal's solvability graph; the horizontal axis is the "avg. constraints per variable" and the vertical axis is unreadable. The bottom graph is the satisfiability graph for every used algorithm; the horizontal axis is the M/N ratio and the vertical axis is the percentage of satisfied CNFs.

algorithm. There are a few aspects of these graphs that are not entirely clear. First, the vertical axes of both graphs are unreadable; the vertical axis used to show the computational cost in the replication graphs is the average number of iterations needed to solve a CNF of a specific size. Second, the number of variables per CNF used for making the graphs is not stated; in the replication graph for 3-SAT every N value is used to show the computational cost, and in the replication graph for 4-SAT an N value of 30 is used. Third, no line has been drawn in Cetal's difficulty graph for 4-SAT, which makes comparing it to the replication graph impossible. The top graph of Figure 7(b) and the top graph of Figure 6 show the replication graphs. As for 3-SAT, Cetal's graph and the replication graph are very similar. In Cetal's graph the line starts with a sharp increase in difficulty just before the critical value and ends with a slow decrease after the critical value. This is also found in the replication graph when looking at the N values of 20 and higher. The data points for the N value of 10 most likely form a similar shape too, but because the values on the vertical axis reach too high this is not noticeable. A small difference between Cetal's graph and the replication graph is that the descent in difficulty in Cetal's graph

is a bit jittery, which is not the case in the replication graph. This is most likely because the replication used much more data than Cetal did, resulting in a smoother looking graph. In the replication graph for 4-SAT shown in Figure 6, a sharp increase in iteration cost can be seen at the end of the graph, around an M/N ratio of ∼7. This indicates that there most likely is a phase transition with an increase in iteration cost just outside the range of the used horizontal axis. Nothing similar can be found in the bottom graph of Figure 6.

A small thing to note, however, is that the replication graph with the computational cost in iterations for 3-SAT (top graph of Figure 7(b)) shows a small decline in iteration cost around an M/N ratio of ∼2.7, just before the sharp increase near the critical value. This small decline is not present in Cetal's graph. Why exactly this happens is unknown, and no literature has been found that describes this phenomenon. A possible explanation is that unit resolution works better when the CNFs are bigger. This, combined with the fact that unit resolution can be applied multiple times in one iteration, could explain why there is a small decline in iteration cost around the M/N value of ∼2.7. This decline most likely disappears when using a vertical axis that shows the average number of literal assignments needed to solve a CNF.

As stated before, the computational cost in seconds has also been measured for every algorithm. The bottom graph of Figure 7(b) shows the computational cost in seconds for the different 3-SAT instances for Cetal's algorithm. This graph again shows a sharp increase in computational cost just before the critical value, but in most cases no slow decrease after the critical value. This means that the phase transition might contain the hardest instances in terms of iteration cost, but not necessarily in terms of cost in seconds. This is most clear for the N values of 10, 20 and 30; for the N value of 40 the slow decrease is present again. As for 4-SAT, the sharp increase is again cut short because the horizontal axis is not long enough (see the bottom graph of Figure 6).

Another difference from the iteration graphs is that the small bump in the beginning of the top graph of Figure 7(b) is not present in the time graph. So, even though the smallest CNFs need more iterations to be solved, they do not necessarily need more time. Again, this could be because unit resolution works better with bigger CNFs in terms of iterations. However, the bottom graph of Figure 7(b) shows that unit resolution does not necessarily perform better with bigger CNFs in terms of time.

B. Conflict-driven clause learning algorithm

The top graph of Figure 7(c) shows the computational cost in iterations for the CDCL algorithm. It can be seen that the CDCL algorithm drastically outperforms the simple DPLL algorithm in terms of iterations for the bigger N values. The CDCL algorithm also outperforms the simple DPLL algorithm for the smaller N values, but because the vertical axis of the top graph of Figure 7(a) reaches too high this cannot be seen. Furthermore, the graph is almost exactly the same as the top graph in Figure 7(b). This means that the CDCL algorithm also exhibits a phase transition where the hardest instances of the SAT problem are located when regarding the cost in iterations. A difference from Cetal's algorithm, however, is that the CDCL algorithm needed more iterations overall to solve the different CNFs. This is curious considering that the CDCL


algorithm uses advanced solving methods like backjumping and clause learning to solve the CNFs. An explanation is that using iterations to compare the two algorithms might be misleading. For instance, one iteration in Cetal's algorithm performs unit resolution and the assignment of the most constrained literal, whereas one iteration in the CDCL algorithm performs unit resolution or the assignment of the most constrained literal. This is most likely the reason that Cetal's algorithm "outperformed" the CDCL algorithm. If the CDCL algorithm had a structure more similar to Cetal's algorithm (so that one iteration would use unit resolution and the assignment of the most constrained literal), or if, instead of the average number of iterations, the number of attempted literal assignments were used to compare the algorithms, the CDCL algorithm would most likely outperform Cetal's algorithm. Nevertheless, it can be said that the CDCL algorithm exhibits a phase transition with the hardest instances located at that point.

As for the computational cost in seconds (bottom graph of Figure 7(c)), the CDCL algorithm also outperforms the simple DPLL algorithm by a large margin. However, there is a noticeable difference between the time graph of Cetal's algorithm and the time graph of the CDCL algorithm regarding the 3-SAT problem. The time graph of the CDCL algorithm has a more pronounced peak for the N values of 30 and 40 than the time graph of Cetal's algorithm. This means that the CDCL algorithm

Figure 5. Cetal's difficulty graphs; the top graph is for the 3-SAT problems and the bottom graph is for the 4-SAT problems. The horizontal axis of both graphs is the "avg. constraints per variable" and both vertical axes are unreadable.

Figure 6. The top graph shows the computational cost in iterations for the 4-SAT problem; the horizontal axis is the M/N ratio and the vertical axis is the average computational cost in iterations. The bottom graph shows the computational cost in seconds for the 4-SAT problem; the horizontal axis is the M/N ratio and the vertical axis is the average computational cost in seconds.

has a harder time solving the CNFs located near the phase transition than the CNFs located far from it. So, for the used CDCL algorithm it can be stated that the CNFs located near the phase transition are the hardest in terms of both iteration cost and cost in seconds. However, this might again be due to the structure in which the CDCL algorithm has been built for this thesis; with a structure more similar to Cetal's algorithm, the bottom graphs of Figure 7(b) and Figure 7(c) might have been more alike. Furthermore, the CDCL algorithm needs more time in general to solve the CNF formulas than Cetal's algorithm. This might be the result of the way the CDCL algorithm works: instead of constantly saving the CNF while solving it, the operations that have been made are saved in an operation list. This means that this list and the entire CNF have to be gone through multiple times while solving the CNF; Cetal's algorithm does this less because the CNF constantly becomes smaller. This, combined with the fact that the algorithm needs more iterations overall to solve the CNFs, results in the CDCL algorithm being slower. However, because the CDCL algorithm only has to save one operation list instead of multiple CNFs, it probably uses much less memory than Cetal's algorithm. When memory is an issue for solving a SAT problem, the CDCL algorithm might thus be a better pick.


(a) Simple DPLL algorithm (b) Cetal’s DPLL algorithm (c) CDCL algorithm (d) Look-ahead algorithm

Figure 7. All the results of the different tested algorithms on the 3-SAT problem instances. The top row shows the computational cost in iterations graphs and the bottom row shows the computational cost in seconds graphs. The horizontal axis for every graph is the M/N ratio, the vertical axis for the top row is the average computational cost in iterations and the vertical axis for the bottom row is the average computational cost in seconds. Note that the 40 N value line for the simple DPLL algorithm does not show every data point, because a data point is not shown when 50% or more of the corresponding CNFs reach the cutoff point.

C. Look-ahead algorithm

The top graph of Figure 7(d) shows the computational cost in iterations for the look-ahead algorithm. Just like the other two algorithms, this algorithm is much more efficient in solving the CNFs in terms of iterations than the simple DPLL algorithm. Furthermore, the graph of the look-ahead algorithm looks almost exactly the same as the graph of Cetal's algorithm (top graph of Figure 7(b)), meaning that the look-ahead algorithm also has a phase transition where the hardest instances in terms of iteration cost are located. There are only two small differences. The first is that the look-ahead graph contains a few data points that do not follow the main trend. This might be the result of the algorithm not picking the correct literal to branch on, which results in an increase in iterations. For instance, if during the preselect procedure the actual best literal is not picked multiple times in a row, the algorithm could get stuck in multiple almost-solutions. The second difference is that the look-ahead algorithm performs just a bit worse than Cetal's algorithm. This is most likely the result of the flexibility of the look-ahead algorithm: almost every part of the algorithm can be changed to better suit a certain size and type of CNF. To give a few examples: the preselect heuristic can be changed to better suit smaller CNFs, a look-ahead reasoning heuristic can be added to the look-ahead function, the decision heuristic can be changed to better suit smaller CNFs, and so on. If it were known which combination of heuristics and functions performs best on the CNFs used for this thesis, the results would most likely be much better.

The computational cost in seconds graph shows a very different picture (see bottom graph of Figure 7(d)). Although it still outperforms the simple DPLL algorithm, it is much slower than Cetal's algorithm and the CDCL algorithm. However, this does not come as a surprise. The look-ahead function in the algorithm has to go through the CNF many times to estimate which literal is best to assign. Moreover, this process repeats until no changes have been made to the CNF, so the function does a lot of evaluating for a single iteration. Combined with the fact that solving a CNF can take 50 or more iterations, the algorithm ends up doing a lot of work for a long time. The look-ahead algorithm is therefore probably not the best choice for solving large SAT problems.
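The rough skeleton below shows why a single look-ahead step is so expensive: both polarities of every candidate variable are propagated through the whole CNF, and the scan is repeated until a pass forces no new literals. It is a sketch under assumptions, not the thesis implementation; in particular, the helper unit_propagate(cnf, lit) is assumed to return the simplified CNF after assigning lit, or None on a conflict.

```python
# Hypothetical skeleton of the costly look-ahead step described above.
def look_ahead(cnf, candidates, unit_propagate):
    scores = {}
    changed = True
    while changed:                            # repeat until a pass forces no new literals
        changed = False
        scores = {}
        for var in candidates:
            pos = unit_propagate(cnf, var)    # full pass over the CNF
            neg = unit_propagate(cnf, -var)   # and another full pass
            if pos is None and neg is None:
                return None, cnf              # both polarities fail: backtrack
            if pos is None:                   # var is forced to False
                cnf, changed = neg, True
            elif neg is None:                 # var is forced to True
                cnf, changed = pos, True
            else:                             # score by how much each polarity shrinks the CNF
                scores[var] = (len(cnf) - len(pos)) * (len(cnf) - len(neg))
    best = max(scores, key=scores.get) if scores else None
    return best, cnf
```

Each pass costs two full propagations per candidate, which explains why the cost in seconds grows quickly with the CNF size even when the number of iterations stays modest.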

Furthermore, just like for Cetal's algorithm, the phase transition is not always the place where the hardest instances are located with regard to the computational cost in seconds for the look-ahead algorithm. This can be seen for the N values of 10, 20 and 30. For the N value of 40 the decline after the phase transition is noticeable again.

V. CONCLUSION AND DISCUSSION

It can be safely said that Cetal's results for the SAT problem are reproducible and valid when using the replication algorithm of this thesis. Figure 4 shows that the solvability graph from Cetal's paper and the reproduction graph created with the replicated algorithm are almost exactly the same. The figure shows that the reproduction graph has phase transitions for 2-SAT and 3-SAT, and the corresponding critical values are similar to the values found in various papers [9] [15] [16] [19] [26]. Considering Figure 6, 4-SAT most likely also has a phase transition, with a critical value outside the range of the horizontal axis. This has been shown in a paper on the phase transition for various SAT problems: the critical value for the 4-SAT problem is approximately 9.75 M/N [27].

Furthermore, the effect of the phase transition can also be seen when comparing the 3-SAT difficulty graph of Cetal in Figure 5 with the reproduction graph in Figure 7(b). The line in Cetal's graph shows that the hardest instances of the 3-SAT problem lie near the phase transition, and this is also the case for the reproduction graph. The two graphs are almost exactly the same in shape, showing that Cetal's algorithm is reproducible. The only difference is the small bump at the beginning of the reproduction graph, which is most likely the result of using unit resolution in the algorithm. This bump would most likely disappear if a different parameter were used for the vertical axis. As for 4-SAT, because Cetal's graph for the 4-SAT difficulty showed no line, no conclusions can be drawn from the replication graph.

In addition to the computational cost in iterations, the computational cost in seconds has been evaluated for Cetal's algorithm. Here the effect of the phase transition is also noticeable for 3-SAT in the sharp increase in cost in seconds before the critical value; however, for the N values of 10, 20 and 30 there is no decrease after the critical value. This means that the phase transition does not necessarily contain the hardest SAT problems with regard to the computational cost in seconds for Cetal's algorithm. For the N value of 40 there is a decline after the critical value, showing that for that N value the hardest instances are located near the phase transition. 4-SAT again shows an increase in computational cost at the end of the graph, indicating that the critical value of 4-SAT is larger than 8 M/N [27].

Also, for the CDCL and look-ahead algorithms the phase transition has been shown to exist. Both show a computational cost in iterations curve very similar in shape to Cetal's difficulty graph for 3-SAT problems. They outperformed the simple DPLL algorithm by a large margin, but performed slightly worse than Cetal's algorithm: the CDCL algorithm probably because of its structure, and the look-ahead algorithm probably because of its flexibility. Nevertheless, they can be used as benchmarks to show where the hardest problems are located with regard to iteration cost. As for the computational cost in seconds, there are some differences from Cetal's algorithm. Both still drastically outperformed the simple DPLL algorithm, but were again outperformed by Cetal's algorithm. The CDCL algorithm showed more defined peaks at the critical value for higher N values, showing that the hardest instances of the 3-SAT problem for the CDCL algorithm are located near the phase transition with regard to the computational cost in seconds. The computational cost in seconds graph for the look-ahead algorithm is almost exactly the same in shape as the graph for Cetal's algorithm. The biggest difference is that the look-ahead algorithm performed much worse than Cetal's algorithm for larger N values and thus larger CNFs; this is the result of the look-ahead function being computationally very expensive. However, just like for Cetal's algorithm, the phase transition is not always the place where the hardest instances are located with regard to the computational cost in seconds for the look-ahead algorithm. This can be seen for the N values of 10, 20 and 30.

Knowing where the hardest problem instances are located can be very beneficial. Some real-life problems can be translated into a SAT problem; building circuits is one example [28]. When it is known where the really hard problem instances are located, manufacturers can keep this in mind and work more efficiently by using the right algorithms or by managing their time better. Papers like the one of Cetal show where the hardest problem instances are located, making it easier for manufacturers to work more efficiently.

So, it has been shown that Cetal's results are reproducible and that the phase transition is present when using the replicated algorithm. It can also be concluded that the phase transition is not always the location of the hardest SAT problems with regard to the computational cost in seconds. Using the graphs created with the different algorithms in this thesis and, most importantly, the graphs from Cetal's paper, a good indication is given of where and when a phase transition happens and whether it contains the hardest SAT instances.

VI. ACKNOWLEDGEMENT

This thesis could never have been made without the constant support of Daan van den Berg. This thesis would not have been the same without his supervision and knowledge. Also, many thanks to Teun Mathijssen, Angelo Groot and Reitze Jansen who helped when it was needed.

REFERENCES

[1] P. C. Cheeseman, B. Kanefsky, and W. M. Taylor, “Where the really hard problems are.” in IJCAI, vol. 91, 1991, pp. 331–337.

[2] M. R. Garey and D. S. Johnson, Computers and intractability. New York: W. H. Freeman, 2002, vol. 29.

[3] M. Yannakakis, “Expressing combinatorial optimization problems by linear programs,” Journal of Computer and System Sciences, vol. 43, no. 3, 1991, pp. 441–466.

[4] L. Fortnow, “The status of the p versus np problem,” Communications of the ACM, vol. 52, no. 9, 2009, pp. 78–86.

[5] S. Cook, “The p versus np problem,” The millennium prize problems, 2006, pp. 87–104.

[6] S. C. Brailsford, C. N. Potts, and B. M. Smith, “Constraint satisfaction problems: Algorithms and applications,” European journal of opera-tional research, vol. 119, no. 3, 1999, pp. 557–581.

[7] R. Paturi, P. Pudlák, M. E. Saks, and F. Zane, “An improved exponential-time algorithm for k-sat,” Journal of the ACM (JACM), vol. 52, no. 3, 2005, pp. 337–364.

[8] M. Pătraşcu and R. Williams, “On the possibility of faster sat algorithms,” in Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms. SIAM, 2010, pp. 1065–1075.

[9] B. Selman, D. G. Mitchell, and H. J. Levesque, “Generating hard satisfiability problems,” Artificial intelligence, vol. 81, no. 1-2, 1996, pp. 17–29.

[10] C. F. Camerer, A. Dreber, E. Forsell, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, J. Almenberg, A. Altmejd, T. Chan et al., “Evaluating replicability of laboratory experiments in economics,” Science, vol. 351, no. 6280, 2016, pp. 1433–1436.

[11] C. F. Camerer, A. Dreber, F. Holzmeister, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, G. Nave, B. A. Nosek, T. Pfeiffer et al., “Evaluating the replicability of social science experiments in nature and science between 2010 and 2015,” Nature Human Behaviour, vol. 2, no. 9, 2018, p. 637.

[12] N. Juristo and O. S. Gómez, “Replication of software engineering experiments,” in Empirical software engineering and verification. Springer, 2010, pp. 60–88.

[13] G. van Horn, R. Olij, J. Sleegers, and D. van den Berg, “A predictive data analytic for the hardness of hamiltonian cycle problem instances,” DATA ANALYTICS, 2018, pp. 91–96.

[14] B. Hayes, “Computing science: Can’t get no satisfaction,” American scientist, vol. 85, no. 2, 1997, pp. 108–112.

[15] S. Kirkpatrick and B. Selman, “Critical behavior in the satisfiability of random boolean expressions,” Science, vol. 264, no. 5163, 1994, pp. 1297–1301.


[16] I. P. Gent and T. Walsh, “Easy problems are sometimes hard,” Artificial Intelligence, vol. 70, no. 1-2, 1994, pp. 335–345.

[17] M. Davis and H. Putnam, “A computing procedure for quantification theory,” Journal of the ACM (JACM), vol. 7, no. 3, 1960, pp. 201–215.

[18] W. F. Dowling and J. H. Gallier, “Linear-time algorithms for testing the satisfiability of propositional horn formulae,” The Journal of Logic Programming, vol. 1, no. 3, 1984, pp. 267–284.

[19] D. Mitchell, B. Selman, and H. Levesque, “Hard and easy distributions of sat problems,” in AAAI, vol. 92, 1992, pp. 459–465.

[20] M. Davis, G. Logemann, and D. Loveland, “A machine program for theorem-proving,” Communications of the ACM, vol. 5, no. 7, 1962, pp. 394–397.

[21] T. Hogg and C. P. Williams, “The hardest constraint problems: A double phase transition,” Artificial Intelligence, vol. 69, no. 1-2, 1994, pp. 359–377.

[22] J. Marques-Silva, I. Lynce, and S. Malik, “Conflict-driven clause learning sat solvers,” Handbook of satisfiability, vol. 185, 2009, pp. 131–153.

[23] L. Zhang, C. F. Madigan, M. H. Moskewicz, and S. Malik, “Efficient conflict driven learning in a boolean satisfiability solver,” in Proceedings of the 2001 IEEE/ACM international conference on Computer-aided design. IEEE Press, 2001, pp. 279–285.

[24] M. Heule, M. Dufour, J. Van Zwieten, and H. Van Maaren, “March eq: Implementing additional reasoning into an efficient look-ahead sat solver,” in International Conference on Theory and Applications of Satisfiability Testing. Springer, 2004, pp. 345–359.

[25] M. Heule and H. van Maaren, “Look-ahead based sat solvers,” Handbook of satisfiability, vol. 185, 2009, pp. 155–184.

[26] R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, and L. Troyansky, “Determining computational complexity from characteristic phase transitions,” Nature, vol. 400, no. 6740, 1999, p. 133.

[27] I. P. Gent and T. Walsh, “The sat phase transition,” in ECAI, vol. 94. PITMAN, 1994, pp. 105–109.

[28] R. E. Bryant, “On the complexity of vlsi implementations and graph representations of boolean functions with application to integer multiplication,” IEEE transactions on Computers, vol. 40, no. 2, 1991, pp. 205–213.
