Integral Points on Curves Defined by the Equation $Y^2 = X^3 + aX^2 + bX + c$


MSc Mathematics

Master Thesis


Author: Vadim J. Sharshov

Supervisor: Dr. S.R. Dahmen

Examination date: Thursday 28th July, 2016


Abstract

In this project I investigate, explain and compare different methods for finding all integral points on curves given by the equation $Y^2 = X^3 + aX^2 + bX + c$ (with a, b, c integers). The theory is based on the book of Nigel P. Smart, "The Algorithmic Resolution of Diophantine Equations"; most proofs of the theorems that I use in this project can be found there. I have applied this knowledge, worked out some of the examples and written programs that use these techniques to determine all integral points on the curves defined by the given equations.

Title: Integral Points on Curves Defined by the Equation $Y^2 = X^3 + aX^2 + bX + c$
Author: Vadim J. Sharshov, wadishar@hotmail.com, 6061524
Supervisor: Dr. S.R. Dahmen
Second Examiner: Dr. M. Shen
Examination date: Thursday 28th July, 2016

Korteweg-de Vries Institute for Mathematics, University of Amsterdam
Science Park 105-107, 1098 XG Amsterdam
http://kdvi.uva.nl

Department of Mathematics, VU Amsterdam
De Boelelaan 1081a, 1081 HV Amsterdam
http://www.math.vu.nl/


Introduction

At the start of the previous century, the mathematician David Hilbert [9] published a list of twenty-three mathematical problems that were unsolved at the time. One of them concerned Diophantine equations. A Diophantine equation is a polynomial equation with integer coefficients for which only integral solutions, that is, solutions all of whose coordinates are integers, are sought. Hilbert’s problem asked for a universal algorithm that decides whether a given Diophantine equation has at least one integral solution. In 1970 the Russian mathematician Yuri V. Matiyasevich [11] showed that no such algorithm exists. However, under certain conditions it is possible not only to decide whether there is at least one integral solution, but even to find all integral solutions of the Diophantine equation.

In this thesis a specific kind of Diophantine equation is studied, namely those of the form

$$Y^2 = X^3 + aX^2 + bX + c \qquad (0.1)$$

where a, b, c ∈ N, to be solved in integers X and Y. A further restriction is that we consider only triples (a, b, c) for which the cubic polynomial $X^3 + aX^2 + bX + c$ has three different roots; in other words, the discriminant of $X^3 + aX^2 + bX + c$ should be nonzero. It is known that equations of this kind have only finitely many integral solutions. We will study some of the methods for finding those solutions. For two of the methods I have implemented the theory in programs of my own. For this I used the mathematics software system SageMath, since my thesis is in the area of Algebraic Number Theory. SageMath is a free open-source mathematics software system that has many built-in commands from this field. The code of these programs can be found in Appendices A and B.

In Chapter 1 we discuss Skolem’s method, which makes use of p-adic numbers. We will explain more about these numbers in the next chapter. This method illustrates the basic approach to finding integral solutions of Diophantine equations. We will illustrate Skolem’s method with examples and introduce some important theorems for the theory of finding integral solutions of equations, namely Strassmann’s theorem and Hensel’s theorem.

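In Chapter 2 we give the theory of how to find all integral solutions of Thue equations. A Thue equation is a Diophantine equation of the form F(X, Y) = m, where F is a homogeneous polynomial with integer coefficients of degree at least 3 such that F(X, 1) is irreducible, and m is a nonzero integer.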

This theory will be used in Chapter 3, where we discuss the process of finding all integral solutions of our equation (0.1). The program that is based on the theory explained in Chapter 2 is called ’programFindBoundThueEquation’. It is an essential part of the program ’programone’, which is the program that finds all integral solutions of the equation (0.1). The code of the programs ’programFindBoundThueEquation’ and ’programone’ can be found in Appendix A. Both have been written by me and tested for various triples of integers (a, b, c).

In Chapter 4 we consider the Mordell-Weil group of the curve defined by the equation (0.1). This is the group of all rational points on our curve. We state the Mordell-Weil theorem, which tells us that the Mordell-Weil group has finitely many generators, and then consider different methods by which these generators can actually be determined. We do not implement these methods in our program, since SageMath already has a built-in function that finds the generators of the Mordell-Weil group based on this theory. Then in Chapter 5 we explain how the generators of the Mordell-Weil group can be used to determine all integral solutions of our equation. Based on this method I have written the program ’programTwo’, which finds all integral points on our curve. The code of this program can be found in Appendix B.

We finish with Chapter 6, where we discuss the differences between the method that uses Thue equations and the method that uses generators of the Mordell-Weil group. We also consider some alternative methods for solving this problem. Finally, we think about possibilities for further generalization of our problem and propose ideas for further research.


Notation

This thesis considers a problem from the area of Algebraic Number Theory, so we quickly recall some basic notation from this field of mathematics. For more information we refer to the article of F. Beukers [2] or to Chapter 4 of [6]. The following notation will be used throughout this thesis.

$\mathbb{Q}(\theta)$ will denote a number field; we often write K for the same field. A number field is a field extension of the rational numbers $\mathbb{Q}$ generated by an element θ that is a root of an irreducible polynomial with rational coefficients. $\theta^{(i)}$ will stand for the i-th conjugate of the element θ.

$\mathcal{O}_K$ will denote the ring of integers of a number field K.

$\|g\|$ will denote the absolute value of the norm of an element g of $\mathcal{O}_K$.

$\eta_i$ will denote (in the context of number fields) the i-th fundamental unit, so one of the generators of the unit group of the ring of integers $\mathcal{O}_K$ of a number field K.

$\varepsilon$ will denote (in the context of number fields) a unit from the unit group, so an element of $\mathcal{O}_K$ whose norm has absolute value one.

$\mathbb{Q}_p$ will denote the field of p-adic numbers, with p a prime number. The norm of a p-adic number will be denoted by $|\cdot|_p$.

log(α) will denote a discontinuous function from $\mathbb{C}^*$ to $\mathbb{C}$ defined in the following way:
for $\alpha \in \mathbb{C} \setminus \mathbb{R}_{\le 0}$ we take the analytically extended logarithm with log(1) = 0, defined on $\mathbb{C} \setminus \mathbb{R}_{\le 0}$, so with the branch cut along the negative real axis;
for $\alpha \in \mathbb{R}_{\le 0}$ (with α ≠ 0) we take the analytically extended logarithm with log(1) = 0, defined on $\mathbb{C} \setminus i\mathbb{R}_{\le 0}$, so with the branch cut along the negative imaginary axis.

We will work with Thue equations, and in doing so we will often use the following factorization principle. When we consider a polynomial of the form $X^m - aY^m$ (a ∈ Z), we set $\theta = \sqrt[m]{a}$ and use the fact that $X - \theta Y$ divides $X^m - aY^m$:

$$X^m - aY^m = (X - \theta Y)\sum_{i=0}^{m-1} X^{m-1-i}(\theta Y)^{i}.$$

1. Skolem’s method

To illustrate the development of the various methods used to find integral points, we first present a method that was the main method in the first half of the twentieth century. This method for finding solutions of Diophantine equations dates back to Thoralf Skolem [16]. This chapter is based on Chapter III of [17]; we use the same examples, but work them out in more detail. In the following chapters of this thesis we will see how the idea behind this method is still used in other methods.

1.1. Introduction of Strassmann’s theorem

Strassmann’s theorem uses p-adic numbers and p-adic convergence, so before we introduce the theorem we recall what the p-adic numbers are.

For any prime number p the set of p-adic numbers, denoted $\mathbb{Q}_p$, is a completion of $\mathbb{Q}$, i.e. it consists of $\mathbb{Q}$ together with the limits of all Cauchy sequences. The difference with $\mathbb{R}$, the usual completion of $\mathbb{Q}$, is the norm with respect to which the completion is taken. For more about completions of $\mathbb{Q}$ and the theory of p-adic fields we refer to Chapter 4 of [5]. Certain points that are considered ’far apart’ in $\mathbb{R}$ are close to each other in $\mathbb{Q}_p$. We will now introduce this p-adic norm. Every nonzero p-adic number n can be uniquely written as

$$n = p^k \sum_{i=0}^{\infty} n_i p^i,$$

where $k \in \mathbb{Z}$, $n_i \in \{0, \ldots, p-1\}$ for i > 0 and $n_0 \in \{1, \ldots, p-1\}$. From this representation the p-adic order and the p-adic norm of a p-adic number are defined as follows:

$$\mathrm{ord}_p(n) = k \quad \text{(p-adic order)},$$

$$|n|_p = p^{-k} \quad \text{(p-adic norm)}.$$
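The two definitions above can be tried out on rational numbers with a few lines of Python. This is only a minimal sketch: the helper names `ord_p` and `norm_p` are chosen here for illustration and are not taken from the programs in the appendices. The last loop looks at the partial sums of the geometric series $1 + p + p^2 + \ldots$ and measures their p-adic distance to $1/(1-p)$.

```python
from fractions import Fraction

def ord_p(x, p):
    """p-adic order of a nonzero rational x (illustrative helper)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is +infinity by convention")
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:      # powers of p in the numerator raise the order
        num //= p
        k += 1
    while den % p == 0:      # powers of p in the denominator lower the order
        den //= p
        k -= 1
    return k

def norm_p(x, p):
    """p-adic norm |x|_p = p**(-ord_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(x, p))

p = 3
print(ord_p(54, p), norm_p(54, p))                          # 54 = 2 * 3**3
print(ord_p(Fraction(5, 9), p), norm_p(Fraction(5, 9), p))  # 3**2 in the denominator

# Partial sums of 1 + p + p^2 + ... get p-adically close to 1/(1 - p).
limit = Fraction(1, 1 - p)
for N in range(1, 6):
    partial = sum(p**i for i in range(N))
    print(N, norm_p(partial - limit, p))
```

The distance printed in the loop shrinks by a factor p at every step, although the partial sums grow without bound in the usual absolute value.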

Thus rational numbers with "high" powers of p in the numerator have a small p-adic norm and a positive order. In particular, for the series of integers

$$\sum_{i=0}^{\infty} p^i$$

the p-adic norm of each term is less than the p-adic norm of the previous term, and it tends to zero. It is not difficult to prove that the p-adic norm is an ultrametric norm. Lemma 4.1.18 of [5] tells us that in a complete ultrametric space such as $\mathbb{Q}_p$ a series converges if and only if its terms tend to 0 in norm. Thus the series $\sum_{i \in \mathbb{N}} p^i$ converges in $\mathbb{Q}_p$.

1.1.1. Theory of the method

Skolem’s method makes use of Strassmann’s theorem, introduced by Reinhold Strassmann [19] in 1928. This very useful theorem bounds the number of zeros of a convergent p-adic power series.

Theorem 1.1 (Strassmann’s theorem). Let $(a_i)_{i \in \mathbb{N}}$ be a sequence of p-adic numbers, not all zero, and let

$$f(X) = \sum_{i \ge 0} a_i X^i$$

denote a power series which converges for all $x \in \mathbb{Z}_p$. If N is defined such that $|a_N|_p = \max_{i \ge 0} |a_i|_p$ and $|a_i|_p < |a_N|_p$ for all i > N, then there are at most N elements $\alpha \in \mathbb{Z}_p$ such that f(α) = 0.

The proof of this theorem can be found in Theorem 4.5.1 of [5].
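To make the role of the index N concrete, the following Python sketch computes it for a finite list of rational coefficients. It is an illustration only and rests on an assumption stated in the code: all coefficients that are not listed must have strictly smaller p-adic norm than the maximum found. The names `ord_p` and `strassmann_bound` are hypothetical helpers, not part of the thesis programs.

```python
from fractions import Fraction

def ord_p(x, p):
    """p-adic order of a nonzero rational number."""
    x = Fraction(x)
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k

def strassmann_bound(coeffs, p):
    """Largest index N at which |a_i|_p is maximal, i.e. ord_p(a_i) is minimal.

    Assumption: coeffs = [a_0, a_1, ...] is a truncation of the power series
    and every omitted coefficient has strictly larger order than the minimum.
    """
    orders = [(ord_p(a, p), i) for i, a in enumerate(coeffs) if a != 0]
    if not orders:
        raise ValueError("at least one coefficient must be nonzero")
    min_order = min(o for o, _ in orders)
    return max(i for o, i in orders if o == min_order)

# f(k) = -15k + 18k^2: the linear coefficient dominates 3-adically, so N = 1.
print(strassmann_bound([0, -15, 18], 3))
# f(s) = 1 + 9s + 63s^2: the constant term dominates, so N = 0 (no zeros).
print(strassmann_bound([1, 9, 63], 3))
```

A value N = 1 together with one known zero pins that zero down as the only one, which is the pattern used in the examples of this chapter.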

1.1.2. Examples of Skolem’s method

The application of this theorem will be illustrated by finding the integral solutions of the following equations: $X^4 - 2Y^4 = \pm 1$, $X^3 + 6Y^3 = \pm 1$ and $X^3 + 2Y^3 = \pm 1$. A lot of the calculations for these examples can be done with the help of SageMath.

Equation $X^3 + 6Y^3 = \pm 1$

We consider the number field K = Q(θ), where θ is a solution of the equation $\theta^3 + 6 = 0$. We can calculate that the discriminant of this number field is $-972 = -1 \cdot 2^2 \cdot 3^5$. For the definition of the discriminant we refer to Proposition 4.4.1 of [6].

The rank of $\mathcal{O}_K^*$ is one, so as a basis we can choose one fundamental unit, say $1 + 6\theta + 3\theta^2$. If X, Y ∈ Z are a solution of our equation, then $X - \theta Y$ has norm $X^3 + 6Y^3 = \pm 1$ and therefore $X - \theta Y \in \mathcal{O}_K^*$. Because we have taken $1 + 6\theta + 3\theta^2$ as the basis element of $\mathcal{O}_K^*$, there exists a k ∈ Z such that:

$$X - \theta Y = \pm(1 + 6\theta + 3\theta^2)^k.$$

Expanding the right-hand side with the multinomial theorem and using $\theta^3 = -6$ gives a series in k:

$$\begin{aligned}
X - \theta Y &= \pm \sum_{\substack{0 \le i, j \\ i + j \le k}} \binom{k}{i,\, j,\, k-i-j}\, 1^i \cdot (6\theta)^j \cdot (3\theta^2)^{k-i-j} \\
&= \pm\Big(1 + 3\Big[2\tbinom{k}{k-1,\,1,\,0}\theta + \tbinom{k}{k-1,\,0,\,1}\theta^2\Big] + 9\Big[\tbinom{k}{k-2,\,2,\,0}(2\theta)^2 + \tbinom{k}{k-2,\,1,\,1}(2\theta^3) + \tbinom{k}{k-2,\,0,\,2}\theta^4\Big] + 27(\ldots) + \ldots\Big) \\
&= \pm\Big(1 + 3\Big[2\tbinom{k}{k-1,\,1,\,0}\theta + \tbinom{k}{k-1,\,0,\,1}\theta^2\Big] + 9\Big[4\tbinom{k}{k-2,\,2,\,0}\theta^2 + \tbinom{k}{k-2,\,1,\,1}\cdot(-12) - 6\tbinom{k}{k-2,\,0,\,2}\theta\Big] + 27(\ldots) + \ldots\Big) \\
&= \pm\Big(1 + 3\big(2k\theta + k\theta^2\big) + 9\big(2k(k-1)\theta^2\big) + 27\big(-4k(k-1) - k(k-1)\theta + \ldots\big) + \ldots\Big)
\end{aligned}$$

In other words there are $c_1, c_2, c_3 \in \mathbb{Z}[k]$ such that $X - \theta Y = c_1 + c_2\theta + c_3\theta^2$. By construction the $c_i$ contain ascending powers of 3, so they converge in the 3-adic norm, which is a necessary requirement if we want to use Strassmann’s theorem. Since the polynomial $X^3 + 6$ of our initial equation admits no non-trivial factorization into a linear and a quadratic factor over $\mathbb{Q}_3$, it is irreducible over $\mathbb{Q}_3$. We also use Theorem 4.8.8 of [6], which states that if a prime number p (in our case p = 3) divides the discriminant of the number field K, then the ideal (p) is ramified. Using the fact that $X^3 + 6$ is an Eisenstein polynomial, we get from Proposition 4.4.34 of [5] that the extension $\mathbb{Q}_3(\theta)/\mathbb{Q}_3$ is ramified. Therefore $\{1, \theta, \theta^2\}$ is linearly independent over $\mathbb{Q}_3$ and we can compare the coefficients of 1, θ and θ² on both sides. Since the left-hand side $X - \theta Y$ has no θ² term, we conclude that $c_3 = 0$:

$$0 = 3k + 9 \cdot 2k(k-1) + \ldots$$

We have constructed the coefficients such that they tend to 0 in the 3-adic norm, so the series converges 3-adically. Now we can apply Strassmann’s theorem: the coefficient of k has 3-adic norm 1/3 and all later coefficients have smaller norm, so N = 1 and there is at most one solution. Since k = 0 is a solution, it is the unique solution of this 3-adic equation. With this value for k we obtain the solutions (X, Y) = (±1, 0).
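The expansion above can be sanity-checked with exact integer arithmetic in Z[θ]/(θ³ + 6). The sketch below is not the SageMath code of the thesis; the helpers `mul`, `power` and `norm` are illustrative names, elements are stored as triples (a, b, c) meaning a + bθ + cθ², and only exponents k ≥ 0 are tested (negative k would need the inverse unit). It checks that 1 + 6θ + 3θ² has norm 1 and that the θ² coefficient of its k-th power agrees with 3k + 18k(k − 1) modulo 27.

```python
def mul(u, v, d):
    """Multiply a + b*theta + c*theta^2 triples in Z[theta]/(theta^3 - d)."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    t3 = a1 * b2 + a2 * b1          # coefficient of theta^3 before reduction
    t4 = a2 * b2                    # coefficient of theta^4 before reduction
    # reduce with theta^3 = d and theta^4 = d*theta
    return (a0 * b0 + d * t3, a0 * b1 + a1 * b0 + d * t4, a0 * b2 + a1 * b1 + a2 * b0)

def power(u, k, d):
    """u**k for an integer k >= 0."""
    result = (1, 0, 0)
    for _ in range(k):
        result = mul(result, u, d)
    return result

def norm(u, d):
    """Norm of a + b*theta + c*theta^2 when theta^3 = d."""
    a, b, c = u
    return a**3 + d * b**3 + d * d * c**3 - 3 * d * a * b * c

D = -6                 # theta^3 = -6
ETA = (1, 6, 3)        # the unit 1 + 6*theta + 3*theta^2

print(norm(ETA, D))    # a unit must have norm +1 or -1
for k in range(6):
    c3 = power(ETA, k, D)[2]
    print(k, c3, (c3 - (3 * k + 18 * k * (k - 1))) % 27)   # last column should be 0

# exponents 0 <= k < 15 for which the theta^2 coefficient vanishes
zero_ks = [k for k in range(15) if power(ETA, k, D)[2] == 0]
print(zero_ks)
```

Such a finite check proves nothing about all k, but it catches sign or coefficient slips in a hand expansion before Strassmann’s theorem is applied.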


Equation $X^3 + 2Y^3 = \pm 1$

For this equation we again consider the number field K = Q(θ), but this time with θ a solution of the equation $\theta^3 + 2 = 0$. We find that the discriminant of this field is $-108 = -1 \cdot 2^2 \cdot 3^3$.

The rank of $\mathcal{O}_K^*$ is again one, and we can choose the fundamental unit $-1 - \theta$ as the basis of $\mathcal{O}_K^*$. So for some k ∈ Z we can write

$$X - \theta Y = \pm(-1 - \theta)^k.$$

The prime ideal lying above (3) is completely ramified; note that 3 divides the discriminant of K. For Strassmann’s theorem we again want convergence in the 3-adic norm, so we do the following quick calculation: $(-1 - \theta)^3 = 1 - 3\theta(1 + \theta)$. It therefore makes sense to write k = 3s + r with r ∈ {0, 1, 2}, which leads to three different series.

First, the case k ≡ 0 (mod 3), i.e. k = 3s. This gives the following:

$$\begin{aligned}
X - \theta Y &= \pm(-1 - \theta)^{3s} = \pm(1 - 3\theta(1 + \theta))^s \\
&= \pm\Big(1 + 3\tbinom{s}{1}(-\theta(1 + \theta)) + 9\tbinom{s}{2}(-\theta(1 + \theta))^2 + \ldots\Big) \\
&= \pm\Big(1 + 3\tbinom{s}{1}(-\theta - \theta^2) + 9\tbinom{s}{2}(\theta^2 + 2\theta^3 + \theta^4) + \ldots\Big) \\
&= \pm\Big(1 + 3\tbinom{s}{1}(-\theta - \theta^2) + 9\tbinom{s}{2}(\theta^2 + 2 \cdot (-2) - 2\theta) + \ldots\Big) \\
&= \pm\Big(1 + 3s(-\theta - \theta^2) + \frac{9s(s-1)}{2}(\theta^2 - 4 - 2\theta) + \ldots\Big)
\end{aligned}$$

Using the fact that the ideal (3) is ramified, we can compare the coefficients of θ², which gives us

$$0 = -3s + \frac{9s(s-1)}{2} + \ldots$$

Strassmann’s theorem tells us that this equation has at most one solution. This solution is s = 0, and thus k = 0. This gives the solutions (X, Y) = (±1, 0) of the initial equation.


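The case k ≡ 1 (mod 3), i.e. k = 3s + 1, gives the following, where we use $(1 + \theta)^3 = -(1 - 3\theta(1 + \theta))$:

$$\begin{aligned}
X - \theta Y &= \pm(-1 - \theta)^{3s+1} = \pm(-1 - \theta)(1 - 3\theta(1 + \theta))^s \\
&= \pm(-1 - \theta)\Big(1 + 3\tbinom{s}{1}(-\theta(1 + \theta)) + 9\tbinom{s}{2}(-\theta(1 + \theta))^2 + \ldots\Big) \\
&= \pm\Big(-1 - \theta + 3s\,\theta(1 + \theta)^2 + 9\tbinom{s}{2}\big(-\theta^2(1 + \theta)^3\big) + \ldots\Big) \\
&= \pm\Big(-1 - \theta + 3s\,\theta(1 + 2\theta + \theta^2) + 9\tbinom{s}{2}\,\theta^2(1 - 3\theta(1 + \theta)) + \ldots\Big) \\
&= \pm\Big(-1 - \theta + 3s(\theta + 2\theta^2 + \theta^3) + 9\tbinom{s}{2}(\theta^2 - 3\theta^3 - 3\theta^4) + \ldots\Big) \\
&= \pm\Big(-1 - \theta + 3s(\theta + 2\theta^2 - 2) + 9\tbinom{s}{2}(\theta^2 + 6 + 6\theta) + \ldots\Big)
\end{aligned}$$

Comparing the coefficients of θ² on both sides gives:

$$0 = 6s + \frac{9s(s-1)}{2} + \ldots$$

Therefore, by Strassmann’s theorem, there is at most one solution of this equation, namely s = 0, and thus k = 1. This leads to another solution of the initial equation, namely (X, Y) = (∓1, ±1).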

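Finally we consider the case k ≡ 2 (mod 3), i.e. k = 3s + 2. The same procedure, again with $(1 + \theta)^3 = -1 + 3\theta + 3\theta^2$, gives us:

$$\begin{aligned}
X - \theta Y &= \pm(-1 - \theta)^{3s+2} = \pm(-1 - \theta)^2(1 - 3\theta(1 + \theta))^s \\
&= \pm(1 + 2\theta + \theta^2)\Big(1 + 3\tbinom{s}{1}(-\theta(1 + \theta)) + 9\tbinom{s}{2}(-\theta(1 + \theta))^2 + \ldots\Big) \\
&= \pm\Big(1 + 2\theta + \theta^2 - 3s\,\theta(1 + \theta)^3 + 9\tbinom{s}{2}\,\theta^2(1 + \theta)^4 + \ldots\Big) \\
&= \pm\Big(1 + 2\theta + \theta^2 - 3s(-\theta + 3\theta^2 - 6) + 9\tbinom{s}{2}(-\theta^2 + 2\theta^3 + 6\theta^4 + 3\theta^5) + \ldots\Big) \\
&= \pm\Big(1 + 2\theta + \theta^2 - 3s(-\theta + 3\theta^2 - 6) + 9\tbinom{s}{2}(-\theta^2 - 4 - 12\theta - 6\theta^2) + \ldots\Big) \\
&= \pm\Big(1 + 2\theta + \theta^2 - 3s(-\theta + 3\theta^2 - 6) - 9\tbinom{s}{2}(7\theta^2 + 12\theta + 4) + \ldots\Big)
\end{aligned}$$

Comparing the coefficients of θ² on both sides gives

$$0 = 1 - 9s - \frac{63s(s-1)}{2} + \ldots$$

Every term on the right-hand side except the constant 1 is divisible by 3, so this equation has no solutions for any s. Thus, for k ≡ 2 (mod 3), we get no solutions of the initial equation.

Hence the equation $X^3 + 2Y^3 = \pm 1$ has the following integral solutions: (∓1, ±1) and (±1, 0).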
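As a cross-check of this example, the following Python sketch (illustrative only, with hypothetical helper names) does two things. It computes the powers $(-1-\theta)^k$ for 0 ≤ k < 40 in Z[θ]/(θ³ + 2) and records those with vanishing θ² coefficient, and it runs a naive search for solutions in a small box. Neither step replaces the 3-adic argument: negative exponents and large coordinates are simply not examined here.

```python
def mul(u, v):
    """Product in Z[theta]/(theta^3 + 2); (a, b, c) stands for a + b*theta + c*theta^2."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    t3 = a1 * b2 + a2 * b1      # theta^3 coefficient, theta^3 = -2
    t4 = a2 * b2                # theta^4 coefficient, theta^4 = -2*theta
    return (a0 * b0 - 2 * t3, a0 * b1 + a1 * b0 - 2 * t4, a0 * b2 + a1 * b1 + a2 * b0)

unit = (-1, -1, 0)              # the fundamental unit -1 - theta
current = (1, 0, 0)
hits = []                       # pairs (k, unit**k) with no theta^2 term
for k in range(40):
    if current[2] == 0:
        hits.append((k, current))
    current = mul(current, unit)

# X - theta*Y = +-(a + b*theta) means (X, Y) = +-(a, -b)
for k, (a, b, _) in hits:
    print(k, (a, -b), (-a, b))

# Independent brute-force search in the box |X|, |Y| <= 50.
box = range(-50, 51)
brute = {(x, y) for x in box for y in box if x**3 + 2 * y**3 in (1, -1)}
print(sorted(brute))
```

Both computations return the same four points, in agreement with the list of solutions above.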


Equation $X^4 - 2Y^4 = \pm 1$

This time we let θ be a solution of the equation $\theta^4 - 2 = 0$ and use the corresponding number field Q(θ). We compute that the discriminant of this number field is $-2048 = -2^{11}$. In this case the rank of $\mathcal{O}_K^*$ is 2, and we can take the two fundamental units $1 + \theta^2$ and $1 + \theta$ as the basis of $\mathcal{O}_K^*$. The smallest prime number that stays prime in Q(θ) is 5. We consider the images of the elements $1 + \theta^2$ and $1 + \theta$ in $\mathcal{O}_K/(5)\mathcal{O}_K$ and find that their multiplicative orders are 12 and 312, respectively.

Moreover we can write:

$$(1 + \theta^2)^{12} = 1 + 5 \cdot 2\theta^2 + 5^2 \cdot (\ldots) + \ldots$$

and

$$(1 + \theta)^{312} = 1 + 5 \cdot (4\theta^2 + 3\theta^3) + 5^2 \cdot (\ldots) + \ldots$$

Now we consider:

$$X - \theta Y = \pm(1 + \theta^2)^{\delta_1}(1 + \theta)^{\delta_2}\big(1 + ((1 + \theta^2)^{12} - 1)\big)^{k_1}\big(1 + ((1 + \theta)^{312} - 1)\big)^{k_2}.$$

After the same steps as in the previous examples we can compare the coefficients of θ² and θ³, which gives for each of them a series in the two variables $k_1$ and $k_2$. Substituting one relation into the other eventually gives a series in one variable, to which we can apply Strassmann’s theorem. But we have to repeat this procedure for $\delta_1$ ranging from 0 to 11 and $\delta_2$ ranging from 0 to 311. This requires a lot of calculations; in the next section we explain how their number can be decreased.
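The two multiplicative orders used in this example, 12 for 1 + θ² and 312 for 1 + θ (the exponents appearing in the expansions above), can be recomputed by brute force in the finite ring F_5[θ]/(θ⁴ − 2). The sketch below is a plain-Python illustration with hypothetical helper names; it assumes the element is invertible and gives up after p⁴ steps otherwise.

```python
def mul_mod(u, v, p=5, c=2):
    """Multiply in F_p[theta]/(theta^4 - c); elements are 4-tuples of coefficients."""
    raw = [0] * 7
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            raw[i + j] += ui * vj
    for k in (6, 5, 4):                 # theta^k = c * theta^(k-4)
        raw[k - 4] += c * raw[k]
    return tuple(x % p for x in raw[:4])

def order(u, p=5, c=2):
    """Multiplicative order of u, found by repeated multiplication."""
    one = (1, 0, 0, 0)
    current, n = tuple(x % p for x in u), 1
    while current != one:
        current = mul_mod(current, u, p, c)
        n += 1
        if n > p**4:                    # safety stop for non-invertible input
            raise ValueError("element does not seem to be a unit")
    return n

eta1 = (1, 0, 1, 0)    # 1 + theta^2
eta2 = (1, 1, 0, 0)    # 1 + theta
print(order(eta1), order(eta2))
```

The product of the two orders, 12 · 312 = 3744, is the number of pairs (δ1, δ2) that the unimproved method would have to treat.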

1.2. Improvement of Skolem’s method

In the previous section we have seen how we can find integral solutions of equations by working in an appropriate number field Q(θ) to reduce the problem to an easier equation on which we could apply Strassmann’s theorem to get a bound on the number of possible solutions. However, with some equations it was needed to repeat this procedure multiple times.

This number of repetitions can be reduced when we use Hensel’s theorem together with Strassmann’s theorem. The theorem is named after Kurt Hensel, a German mathematician who described the p-adic numbers in 1897 in his publication [8]. It is also known as Hensel’s lifting theorem, and broadly speaking it states that, under certain conditions, a solution of a polynomial equation modulo a prime number lifts uniquely to solutions modulo higher powers of that prime. We will state the Multi-Dimensional Hensel theorem later on.

When we combine Skolem’s method with the Multi-Dimensional Hensel theorem, we can find a smaller bound for the maximum exponent, which reduces the number of computations needed. We will now introduce the Multi-Dimensional Hensel theorem and illustrate it with an example. In this theorem we use the symbol $J_{\vec f}(\vec a)$ to denote the determinant of the Jacobian matrix, which is the matrix with entries $\partial f_i / \partial a_j$ evaluated at $\vec a$.

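Theorem 1.2 (Multi-Dimensional Hensel). Let p be a prime, let $n \in \mathbb{Z}_{\ge 0}$ and let $\vec f$ be an n-vector of power series in n variables with coefficients in $\mathbb{Z}_p$. Suppose there is a vector $\vec a \in \mathbb{Z}_p^n$ such that

$$\vec f(\vec a) \equiv \vec 0 \pmod{p^{2\delta + 1}}, \quad \text{where } \delta = \mathrm{ord}_p(J_{\vec f}(\vec a)) < \infty.$$

Then there is a unique zero $\vec\alpha \in \mathbb{Z}_p^n$ of $\vec f$ such that

$$\vec\alpha \equiv \vec a \pmod{p^{\delta + 1}}.$$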

This is the Multi-Dimensional Hensel theorem, which states when an approximate zero of a vector of power series lifts to a unique exact p-adic zero. The theorem goes by different names and has many equivalent formulations. For a proof of this theorem we refer to Section 4.1.7 of [5]. We will now illustrate Skolem’s method together with the application of the Multi-Dimensional Hensel theorem for the curve $X^4 - 2Y^4 = \pm 1$.
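The lifting idea is easiest to see in one variable. The Python sketch below treats only the special case n = 1 and δ = 0 of the theorem (a simple root modulo p) and lifts it by Newton steps; it is not the multi-dimensional statement itself. The function name `hensel_lift` is an illustrative choice, and the three-argument `pow` with exponent −1 needs Python 3.8 or later. As test polynomial we take x⁴ − 2 with p = 7, whose roots modulo 7 are 2 and 5.

```python
def hensel_lift(f, df, a, p, k):
    """Lift a simple root a of f modulo p to a root modulo p**k.

    One-variable case with delta = 0: f(a) = 0 (mod p) and f'(a) != 0 (mod p).
    """
    if f(a) % p != 0 or df(a) % p == 0:
        raise ValueError("need f(a) = 0 (mod p) and f'(a) != 0 (mod p)")
    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # Newton step: a <- a - f(a) / f'(a), computed modulo the new modulus
        a = (a - f(a) * pow(df(a), -1, modulus)) % modulus
    return a

f = lambda x: x**4 - 2
df = lambda x: 4 * x**3

for start in (2, 5):
    root = hensel_lift(f, df, start, 7, 6)
    print(start, root, f(root) % 7**6)   # the last value should be 0
```

Each lifted value is congruent to its starting value modulo 7 and is the only such root modulo 7⁶, which is the uniqueness part of the theorem in this special case.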

1.2.1. Example of the improved method

Let us take the equation $X^4 - 2Y^4 = \pm 1$. To find all integral solutions, we first study the quartic number field K = Q(θ), with θ a solution of the equation $\theta^4 - 2 = 0$. The unit rank of the ring of integers is two, so we need two fundamental units; we take $\eta_1 = 1 + \theta^2$ and $\eta_2 = 1 + \theta$. We now need to find all possible exponents $a_1, a_2$ such that

$$X - \theta Y = \pm\eta_1^{a_1}\eta_2^{a_2}. \qquad (1.1)$$

The smallest prime number that is still prime in K is 5. In the residue field $\mathcal{O}_K/(5)\mathcal{O}_K$ the image of $\eta_1$ has multiplicative order 12 and the image of $\eta_2$ has multiplicative order 312. When we calculate $\eta_1^{12}$ and $\eta_2^{312}$, we can write them as:

$$\eta_1^{12} = 1 + 5 \cdot (2\theta^2) + 5^2 \cdot (\ldots) + \ldots$$

$$\eta_2^{312} = 1 + 5 \cdot (4\theta^2 + 3\theta^3) + 5^2 \cdot (\ldots) + \ldots$$

Writing $a_1 = \beta_1 + 12k_1$ and $a_2 = \beta_2 + 312k_2$ and filling this in equation (1.1) gives:

$$X - \theta Y = \pm\eta_1^{\beta_1}\eta_2^{\beta_2}\big(1 + (\eta_1^{12} - 1)\big)^{k_1}\big(1 + (\eta_2^{312} - 1)\big)^{k_2}.$$

Now we compare the coefficients of θ² and θ³ on both sides and get two power series in the two variables $k_1$ and $k_2$. But we would need to do this for all values $0 \le \beta_1 \le 11$ and $0 \le \beta_2 \le 311$, that is, for 12 · 312 = 3744 cases.

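We get four equations from the four roots $\theta_1, \ldots, \theta_4$ of $\theta^4 - 2 = 0$; they are of the form $X - \theta_i Y = \beta_i$.

Now we eliminate X and Y from these equations. We get two equations, which are also called Siegel’s identities:

$$(\theta_3 - \theta_2)\beta_1 + (\theta_1 - \theta_3)\beta_2 + (\theta_2 - \theta_1)\beta_3 = 0$$

$$(\theta_4 - \theta_2)\beta_1 + (\theta_1 - \theta_4)\beta_2 + (\theta_2 - \theta_1)\beta_4 = 0$$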
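That these identities hold for every pair (X, Y) can be checked by substituting $\beta_i = X - \theta_i Y$ and collecting terms. The short derivation below does this for the first identity; the second follows in the same way with $\theta_4, \beta_4$ in place of $\theta_3, \beta_3$.

```latex
\begin{aligned}
&(\theta_3-\theta_2)(X-\theta_1 Y) + (\theta_1-\theta_3)(X-\theta_2 Y) + (\theta_2-\theta_1)(X-\theta_3 Y) \\
&\quad = X\big[(\theta_3-\theta_2) + (\theta_1-\theta_3) + (\theta_2-\theta_1)\big]
   - Y\big[(\theta_3-\theta_2)\theta_1 + (\theta_1-\theta_3)\theta_2 + (\theta_2-\theta_1)\theta_3\big] \\
&\quad = X \cdot 0
   - Y\big[\theta_1\theta_3 - \theta_1\theta_2 + \theta_1\theta_2 - \theta_2\theta_3 + \theta_2\theta_3 - \theta_1\theta_3\big] \\
&\quad = 0 .
\end{aligned}
```

So the identities contain no information about X and Y individually; all the information sits in the unit equation obtained by writing each $\beta_i$ as a product of powers of conjugate units.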

Now we consider the polynomial modulo 7, because the prime 7 decomposes in the field K as a product of three prime ideals, one of degree 2 and two of degree 1. We see that:

$$x^4 - 2 \equiv (x + 2)(x + 5)(x^2 + 4) \pmod 7.$$
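The factorization modulo 7 is easy to confirm by multiplying the three factors out again. The following Python sketch (with the illustrative helper `poly_mul_mod`, coefficients listed from lowest to highest degree) does this and also checks that the quadratic factor has no root modulo 7, so that it is irreducible over F_7.

```python
def poly_mul_mod(f, g, p):
    """Multiply two polynomials (coefficient lists, lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

p = 7
linear_part = poly_mul_mod([2, 1], [5, 1], p)        # (x + 2)(x + 5)
product = poly_mul_mod(linear_part, [4, 0, 1], p)    # times (x^2 + 4)
target = [(-2) % p, 0, 0, 0, 1]                      # x^4 - 2 modulo 7

print(product, target)

# x^2 + 4 has no root modulo 7, so this quadratic factor is irreducible.
roots_of_quadratic = [x for x in range(p) if (x * x + 4) % p == 0]
print(roots_of_quadratic)
```

The roots of the two linear factors are −2 ≡ 5 and −5 ≡ 2 modulo 7.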

Because 7 is not an index divisor, we can take $\theta_1$ and $\theta_2$ to be 7-adic roots of $x^4 - 2$, given by $\theta_1 = 2 + 7 \cdot (\ldots)$ and $\theta_2 = 5 + 7 \cdot (\ldots)$. We take $\theta_3 = \Omega$ and $\theta_4 = \Omega'$ to be the roots of the polynomial

$$g(x) = \frac{x^4 - 2}{(x - \theta_1)(x - \theta_2)}.$$

In the localization of K the elements $\eta_i$ satisfy $\eta_1^6 \equiv 1 \pmod 7$ and $\eta_2^{48} \equiv 1 \pmod 7$. So we write $a_1 = b_1 + 6k_1$ and $a_2 = b_2 + 48k_2$, and we need to find which values $0 \le b_1 \le 5$ and $0 \le b_2 \le 47$ solve the equations defined above modulo 7. These are only 6 · 48 = 288 possibilities, and of these only 6 are possible, namely:

(b1, b2) = (0, 0); (0, 1); (2, 23); (3, 24); (3, 25); (5, 47).

In each of these six cases we need to expand the 7-adic power series. Every case gives at most two solutions, so we find at most 12 solutions in total, of which only 6 turn out to exist. We get the following identities:

$$(\theta_3 - \theta_2)(\eta_1^{(1)})^{b_1}(\eta_2^{(1)})^{b_2} + (\theta_1 - \theta_3)(\eta_1^{(2)})^{b_1}(\eta_2^{(2)})^{b_2} + (\theta_2 - \theta_1)(\eta_1^{(3)})^{b_1}(\eta_2^{(3)})^{b_2} \equiv 0 \pmod 7,$$

$$(\theta_4 - \theta_2)(\eta_1^{(1)})^{b_1}(\eta_2^{(1)})^{b_2} + (\theta_1 - \theta_4)(\eta_1^{(2)})^{b_1}(\eta_2^{(2)})^{b_2} + (\theta_2 - \theta_1)(\eta_1^{(4)})^{b_1}(\eta_2^{(4)})^{b_2} \equiv 0 \pmod 7.$$

We go through all 6 · 48 = 288 possible pairs $(b_1, b_2)$ and check whether they satisfy these congruences, which come from Siegel’s identities. As a result we get the following 6 pairs:

(b1, b2) = (0, 0); (0, 1); (2, 23); (3, 24); (3, 25); (5, 47).

For each of these pairs we have to expand our power series in the two variables $k_1, k_2$. We get the following:

(1) The case $(b_1, b_2) = (0, 0)$.

$f_1 = 5k_1 + k_2 + 6\Omega k_2 + 7(\ldots)$

$f_2 = 5k_1 + k_2 + \Omega k_2 + 7(\ldots)$

(2) The case $(b_1, b_2) = (0, 1)$.

$f_1 = 5k_1 + 6k_2 + 5\Omega k_1 + 7(\ldots)$

(3) The case $(b_1, b_2) = (2, 23)$.

$f_1 = 4 + 5k_1 + 3k_2 + \Omega(2k_1 + 5k_2) + 7(\ldots)$

$f_2 = 4 + 5k_1 + 3k_2 + \Omega(5k_1 + 2k_2) + 7(\ldots)$

(4) The case $(b_1, b_2) = (3, 24)$.

$f_1 = 4 + 2k_1 + 6k_2 + \Omega(4 + k_2) + 7(\ldots)$

$f_2 = 4 + 2k_1 + 6k_2 + \Omega(3 + 6k_2) + 7(\ldots)$

(5) The case $(b_1, b_2) = (3, 25)$.

$f_1 = 5 + 2k_1 + k_2 + \Omega(1 + 2k_1) + 7(\ldots)$

$f_2 = 5 + 2k_1 + k_2 + \Omega(6 + 5k_1) + 7(\ldots)$

(6) The case $(b_1, b_2) = (5, 47)$.

$f_1 = 6 + 2k_1 + 4k_2 + \Omega(5k_1 + 2k_2) + 7(\ldots)$

$f_2 = 6 + 2k_1 + 4k_2 + \Omega(2k_1 + 5k_2) + 7(\ldots)$

In all cases we apply the Multi-Dimensional Hensel theorem from [17]. It shows that in each of our six cases there is exactly one solution in $\mathbb{Z}_7^2$. Every such solution corresponds to two solutions of our Thue equation $X^4 - 2Y^4 = \pm 1$, so an upper bound on the number of solutions is 12. After testing we get the following solutions:

b1   b2    X    Y
 0    0   -1    0
 0    0    1    0
 0    1    1   -1
 0    1   -1    1
 5   47   -1   -1
 5   47    1    1

Table 1.1: All solutions for $X^4 - 2Y^4 = \pm 1$
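The six solutions in the table can be compared with a naive search over a small box. The sketch below is only a plausibility check, since the box size 200 is an arbitrary choice made here and a finite search cannot rule out larger solutions; that is exactly what the p-adic bound is for.

```python
# Naive search for X^4 - 2*Y^4 = +-1 with |X|, |Y| <= 200.
# The bound 200 is arbitrary and only serves as a sanity check.
BOUND = 200
found = sorted(
    (x, y)
    for x in range(-BOUND, BOUND + 1)
    for y in range(-BOUND, BOUND + 1)
    if x**4 - 2 * y**4 in (1, -1)
)
print(found)
print(len(found))
```

The search returns the same six points as Table 1.1 and no others inside the box.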

We have found 6 solutions. Since the bound is 12, it is possible that 6 others exist, so we can repeat this method with another prime to find out whether there are other solutions. I have also tried this for p = 11, but did not find any new solutions. So we have solved the problem only partially. We proceed to the next chapter, where we will learn another method for finding all integral solutions of a Thue equation.


2. Thue equations

In the next chapter we will search for integral points on curves that are defined by $Y^2 = X^3 + aX^2 + bX + c$, for a, b, c ∈ Z.

Thue equations play an important part in that method: to be able to find all integral points on such a curve, we need to be able to find all integral solutions of certain Thue equations of degree 4. In this chapter we follow the theory of Chapter VII of [17]. We explain how the method for finding all integral solutions of a Thue equation works. We start by explaining what a Thue equation is.

2.1. Solving Thue equations

A Diophantine equation

F (X, Y ) = m (2.1)

where F(X, Y) ∈ Z[X, Y] is a homogeneous polynomial of degree at least 3 and m is a given nonzero integer, is called a Thue equation if F(X, 1) is an irreducible polynomial. We are interested only in the solutions (X, Y) with X and Y both integers. A polynomial F(X, Y) of degree n can in general be written as:

$$F(X, Y) = \sum_{i=0}^{n} a_i X^i Y^{n-i} \quad \text{with } a_i \in \mathbb{Z}.$$

If $a_n \ne 1$, in other words if F(X, 1) is not monic, the equation can be transformed into an equivalent Thue equation F′(X, Y) = m′ where F′(X, 1) is a monic polynomial. To get F′ and m′ we multiply the Thue equation (2.1) by $a_n^{n-1}$ and substitute $X/a_n$ for X. Every integral solution (X, Y) of (2.1) then gives the integral solution $(a_n X, Y)$ of the new equation, so the solutions of (2.1) can be recovered from those of the new equation. Therefore we will from now on assume that our Thue equation F(X, Y) = m has a monic polynomial F(X, 1).
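The transformation to a monic equation is a small computation on the coefficient list. The following Python sketch shows one way to carry it out; the function names and the sample form $2X^3 + XY^2 - 3Y^3$ are made up for illustration (the sample is not claimed to be irreducible, it only serves to check the identity $F'(a_n X, Y) = a_n^{n-1} F(X, Y)$).

```python
def make_monic(coeffs, m):
    """Turn F(X, Y) = sum a_i X^i Y^(n-i) = m into a monic equation F'(X', Y) = m'.

    coeffs[i] is a_i.  Multiplying by a_n^(n-1) and writing X' = a_n * X gives
    the new coefficients a_i * a_n^(n-1-i) for i < n, leading coefficient 1,
    and the new right-hand side m' = a_n^(n-1) * m.
    """
    n = len(coeffs) - 1
    a_n = coeffs[n]
    new_coeffs = [coeffs[i] * a_n ** (n - 1 - i) for i in range(n)] + [1]
    return new_coeffs, m * a_n ** (n - 1)

def evaluate(coeffs, x, y):
    """Value of the binary form with the given coefficients at (x, y)."""
    n = len(coeffs) - 1
    return sum(a * x**i * y ** (n - i) for i, a in enumerate(coeffs))

F = [-3, 1, 0, 2]                      # hypothetical form 2X^3 + X*Y^2 - 3Y^3
F_monic, m_monic = make_monic(F, 15)
print(F_monic, m_monic)

# (2, 1) solves F = 15, so (a_n * 2, 1) = (4, 1) should solve F' = m'.
print(evaluate(F, 2, 1), evaluate(F_monic, 4, 1))
```

After solving the monic equation one keeps the solutions (X′, Y) with X′ divisible by $a_n$ and divides X′ by $a_n$ to return to the original equation.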

The easiest way to find all solutions of a Thue equation is to find an upper bound on |X| or |Y| and simply check all possibilities. If the polynomial F(X, 1) has no real roots, it is easy to find such a bound.

Lemma 2.1. Let F(X, Y) = m be a Thue equation where F(X, 1) has no real roots, and let θ be one of its roots. If (X, Y) is a solution, then

$$|Y| \le \frac{|m|}{\min_{1 \le i \le n} |\Im(\theta^{(i)})|}.$$

Proof. Let X, Y ∈ Z be such that (X, Y) is a solution of the Thue equation. By the definition of θ we know that $F(X, 1) = \prod_{i=1}^{n}(X - \theta^{(i)})$. We get that $\prod_{i=1}^{n}(X - Y\theta^{(i)}) = m$, since (X, Y) is a solution. The smallest of these n factors has absolute value at most $|m|^{1/n} \le |m|$, so there is an i such that $|X - Y\theta^{(i)}| \le |m|$, and since X has no imaginary part, $|Y \cdot \Im(\theta^{(i)})| \le |X - Y\theta^{(i)}| \le |m|$. From this it follows that

$$|Y| \le \frac{|m|}{|\Im(\theta^{(i)})|} \le \frac{|m|}{\min_{1 \le i \le n} |\Im(\theta^{(i)})|}.$$
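Lemma 2.1 turns directly into a search procedure. The sketch below applies it to the sample equation $X^4 + Y^4 = 17$, chosen here for illustration because $x^4 + 1$ is irreducible and has no real roots. Two things in the code are additions for this sketch and not part of the lemma: the bound on |X|, which follows from the proof (some conjugate satisfies $|X - Y\theta^{(i)}| \le |m|$, hence $|X| \le |m| + |Y| \max_i |\theta^{(i)}|$), and the extra margin of 1 that guards against floating-point rounding.

```python
import cmath
import math

def solve_no_real_roots(F, roots, m):
    """Enumerate integer solutions of F(X, Y) = m when F(X, 1) has no real roots."""
    min_imag = min(abs(r.imag) for r in roots)
    max_abs = max(abs(r) for r in roots)
    y_max = math.floor(abs(m) / min_imag) + 1             # Lemma 2.1, plus margin
    solutions = []
    for y in range(-y_max, y_max + 1):
        x_max = math.floor(abs(m) + abs(y) * max_abs) + 1  # from the proof, plus margin
        for x in range(-x_max, x_max + 1):
            if F(x, y) == m:
                solutions.append((x, y))
    return solutions

# Roots of x^4 + 1: the primitive 8th roots of unity.
roots = [cmath.exp(1j * math.pi * (2 * k + 1) / 4) for k in range(4)]
F = lambda x, y: x**4 + y**4

solutions = solve_no_real_roots(F, roots, 17)
print(sorted(solutions))
```

For this equation the lemma gives |Y| ≤ 17√2, so only a few thousand pairs have to be tested.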

If F(X, 1) does have a real root, it is harder to find the solutions of the Thue equation. In this case, let θ be a root of F(X, 1), with the conjugates of θ ordered in the standard way, that is

$$\theta^{(i)} \in \mathbb{R} \quad \text{if } 1 \le i \le s,$$

$$\theta^{(i)} = \overline{\theta^{(i+t)}} \quad \text{if } s + 1 \le i \le s + t.$$

We work over the number field K = Q(θ). The absolute value of the norm of an element g of $\mathcal{O}_K$ stays the same when g is multiplied by a unit ε. By Dirichlet’s Unit theorem we know that the unit group $\mathcal{O}_K^*$ is finitely generated. The unit group $U = \mathcal{O}_K^*$ acts on $\mathcal{O}_K$ by multiplication; the orbit of an element g is $\mathcal{O}_K^* \cdot g := \{\varepsilon g : \varepsilon \in \mathcal{O}_K^*\}$. We want to define a complete set of representatives of the elements of $\mathcal{O}_K$ of a certain norm. First we define the set $M'_m = \{x \in \mathcal{O}_K \mid \|x\| = |m|\}$, the set of all elements whose norm has absolute value |m|. We divide this set by our action of U and get the quotient $M'_m / U$. Choosing one representative in $M'_m$ for every orbit gives a set $M_m \subset M'_m$, and $M_m$ is exactly a complete set of representatives of the elements of $\mathcal{O}_K$ of norm m; in other words $U \cdot M_m = M'_m$. Now we formulate an important lemma.

Lemma 2.2. Let K be any number field and let $M_m$ be a complete set of representatives of the elements of $\mathcal{O}_K$ of norm m, as defined earlier. Then $M_m$ is finite.

Proof. The absolute value of the norm of an element $g \in \mathcal{O}_K$ is the same as the norm of the ideal (g). In a Dedekind domain every ideal factors uniquely into prime ideals. Moreover, there are only finitely many prime ideals of a given norm; we refer to Theorem 5.17c) of [18]. Thus there are only finitely many ideals of a given norm. Two generators of the same principal ideal differ by a unit, which implies that there are only finitely many elements of a given norm modulo the action of $\mathcal{O}_K^*$. So $M_m$ is a finite set.

For any solution (X, Y) of the Thue equation we set $\beta^{(i)} = X - \theta^{(i)} Y$. By unique factorization of the ideal $(X - \theta Y)$, we get that

$$\beta^{(i)} = \mu^{(i)} \varepsilon^{(i)},$$

where ε is a unit and $\mu \in M_m$ as defined earlier. From Lemma 2.2 we know that there are only finitely many possibilities for μ.

During the first method it will be necessary to solve Thue equations of degree 4. Consider the case where F(X, 1) has s ≥ 1 real roots and t pairs of complex conjugate roots. If deg(F(X, Y)) = 4, this means that s + 2t = 4.

The set of fundamental units in $\mathcal{O}_K$ is $\{\eta_1, \ldots, \eta_r\}$, with r = s + t − 1. The formulas hold for all conjugates, but we formulate them for the identity conjugate θ and write β, μ and $\eta_i$. With this we get

$$\beta = \mu \prod_{i=1}^{r} \eta_i^{a_i}$$

for a set of integers $\{a_1, \ldots, a_r\}$. Hence to find all possible β we need to find an upper bound on $A = \max_i(|a_i|)$.

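We will use linear forms in logarithms to find this A. For a given index i, define

$$\Lambda^{k,j}_{i} := \log(-\alpha) + \sum_{l=1}^{r} a_l \log\left(\frac{\eta_l^{(k)}}{\eta_l^{(j)}}\right) + a_0 \cdot 2\pi\sqrt{-1}, \quad \text{with} \quad \alpha := \frac{\mu^{(k)}(\theta^{(i)} - \theta^{(j)})}{\mu^{(j)}(\theta^{(k)} - \theta^{(i)})},$$

for all combinations of k, j different from i. Because we have defined a discontinuous logarithm, a term $a_0 \cdot 2\pi\sqrt{-1}$ with $a_0 \in \mathbb{Z}$ appears, but it will not play a role.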
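The reason why this linear form is small can be sketched from Siegel’s identity for three conjugates i, j, k. The derivation below uses the notation of this section, with the summation index over the fundamental units written as l; it is a sketch of the idea and ignores the choice of branch of the logarithm, which is what the term $a_0 \cdot 2\pi\sqrt{-1}$ accounts for.

```latex
\begin{aligned}
0 &= (\theta^{(i)}-\theta^{(j)})\,\beta^{(k)} + (\theta^{(j)}-\theta^{(k)})\,\beta^{(i)} + (\theta^{(k)}-\theta^{(i)})\,\beta^{(j)}
  && \text{(Siegel's identity)} \\[4pt]
\frac{(\theta^{(i)}-\theta^{(j)})\,\beta^{(k)}}{(\theta^{(k)}-\theta^{(i)})\,\beta^{(j)}} + 1
  &= -\,\frac{(\theta^{(j)}-\theta^{(k)})\,\beta^{(i)}}{(\theta^{(k)}-\theta^{(i)})\,\beta^{(j)}}
  && \text{(divide by } (\theta^{(k)}-\theta^{(i)})\beta^{(j)}\text{)} \\[4pt]
-\alpha \prod_{l=1}^{r} \left(\frac{\eta_l^{(k)}}{\eta_l^{(j)}}\right)^{a_l}
  &= 1 + \frac{(\theta^{(j)}-\theta^{(k)})\,\beta^{(i)}}{(\theta^{(k)}-\theta^{(i)})\,\beta^{(j)}}
  && \text{(insert } \beta = \mu \textstyle\prod_l \eta_l^{a_l}\text{)}
\end{aligned}
```

Taking logarithms on both sides shows that $\Lambda^{k,j}_i$ equals the logarithm of a number close to 1 whenever $|\beta^{(i)}|$ is small compared with $|\beta^{(j)}|$, so the linear form is then close to 0.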

We want to know how small the linear forms can become, because this gives us a bound on the largest value that the coordinates of an integral point on the curve can take. To find the bound for the linear forms we introduce an auxiliary lemma, which relates the size of the logarithm of a unit to the maximum exponent of the chosen fundamental units.

Lemma 2.3 (Lemma for the bound on the maximum exponent). Let K be a number field with r fundamental units $\eta_i \in K$ and let $a_i \in \mathbb{Z}$ for 1 ≤ i ≤ r. Set $A := \max(|a_i|)$ and $\varepsilon = \prod_{i=1}^{r} \eta_i^{a_i}$. Let $I = \{i_1, \ldots, i_r\}$ be an arbitrary set of r different indices from the set {1, . . . , r + 1}. Then the matrix

$$U_I = \begin{pmatrix} \log(|\eta_1^{(i_1)}|) & \cdots & \log(|\eta_r^{(i_1)}|) \\ \vdots & \ddots & \vdots \\ \log(|\eta_1^{(i_r)}|) & \cdots & \log(|\eta_r^{(i_r)}|) \end{pmatrix}$$

is invertible. Let t ∈ I be such that

$$|\log(|\varepsilon^{(t)}|)| = \max_{1 \le i \le r+1} |\log(|\varepsilon^{(i)}|)|;$$

it then holds that

$$|\log(|\varepsilon^{(t)}|)| \ge \frac{A}{\|U_I^{-1}\|_\infty}.$$

Proof. For the proof we refer to the proof of Lemma VII.2 of [17].
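The linear algebra behind this lemma can be tried out numerically. The sketch below uses the field from Chapter 1, K = Q(θ) with θ⁴ = 2 and the units 1 + θ² and 1 + θ, so that s = 2, t = 1 and r = 2. It uses floating-point logarithms and a hand-written 2×2 inverse, and it checks, for every index set I, the inequality $\max_{i \in I} |\log|\varepsilon^{(i)}|| \ge A / \|U_I^{-1}\|_\infty$, which is the statement that follows directly from writing the exponent vector as $U_I^{-1}$ times the vector of logarithms. All names in the code are illustrative.

```python
import itertools
import math

t = 2 ** 0.25
# theta^(1), theta^(2): real embeddings; theta^(3): one of the complex pair
embeddings = [t, -t, 1j * t]
units = [lambda th: 1 + th**2, lambda th: 1 + th]   # eta_1, eta_2

def log_abs_row(theta):
    """Row (log|eta_1|, log|eta_2|) at the embedding theta."""
    return [math.log(abs(u(theta))) for u in units]

def inverse_inf_norm(rows):
    """Maximum absolute row sum of the inverse of a 2x2 matrix."""
    (a, b), (c, d) = rows
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return max(abs(r[0]) + abs(r[1]) for r in inv)

def lemma_holds(a1, a2, tol=1e-9):
    """Check the inequality for eps = eta_1^a1 * eta_2^a2 and every index set I."""
    A = max(abs(a1), abs(a2))
    log_eps = [a1 * row[0] + a2 * row[1] for row in map(log_abs_row, embeddings)]
    for I in itertools.combinations(range(3), 2):
        U = [log_abs_row(embeddings[i]) for i in I]
        largest = max(abs(log_eps[i]) for i in I)
        if largest < A / inverse_inf_norm(U) - tol:
            return False
    return True

for I in itertools.combinations(range(3), 2):
    U = [log_abs_row(embeddings[i]) for i in I]
    print(I, 1 / inverse_inf_norm(U))

print(all(lemma_holds(a1, a2) for a1 in range(-5, 6) for a2 in range(-5, 6)))
```

The printed values $1/\|U_I^{-1}\|_\infty$ are the factors by which A is multiplied in the lower bound, one for each choice of I.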

When we apply this lemma in our situation, we have to distinguish several cases, and in each of them we calculate a different bound $A_i$. In the end our bound on all of the $a_i$ will be the maximum of $\{A_1, A_2, A_3, A_4\}$. We now explain how these $A_i$ can be determined in each of the cases.

2.1.1. Determining the value of A1

Before we can determine the value of $A_1$ we have to calculate some important constants. We took θ to be a root of F(X, 1), and we have set $\theta^{(k)}$ to be the conjugates of θ in the order explained earlier. Then we define

$$c_1 = \frac{2}{\min_{i \ne j} |\theta^{(i)} - \theta^{(j)}|}, \qquad c_2 = \max_{i \ne j \ne k \ne i} \left|\frac{\theta^{(i)} - \theta^{(j)}}{\theta^{(j)} - \theta^{(k)}}\right|, \qquad c_3 = c_1 \cdot c_2.$$

To determine $c_4$ we use the matrix defined in Lemma 2.3. Here r is the number of fundamental units that we have, i.e. r = s + t − 1. Recall that $I = \{i_1, \ldots, i_r\}$ is an r-subset of {1, . . . , s + t}, and that the matrix $U_I$ is defined in the following way:

$$U_I = \begin{pmatrix} \log(|\eta_1^{(i_1)}|) & \cdots & \log(|\eta_r^{(i_1)}|) \\ \vdots & \ddots & \vdots \\ \log(|\eta_1^{(i_r)}|) & \cdots & \log(|\eta_r^{(i_r)}|) \end{pmatrix}$$

We calculate $\|U_I^{-1}\|_\infty$, where $U_I^{-1}$ is the inverse of the matrix $U_I$ just defined. We set $c_4 = 1 / \max_J \|U_J^{-1}\|_\infty$, where J runs over all r-subsets of {1, . . . , s + t}, so that Lemma 2.3 gives $|\log(|\varepsilon^{(t)}|)| \ge c_4 \cdot A$. Next we take $c_5$ to be a positive real number smaller than $\frac{c_4}{n-1}$; for example one can take $c_5 = \frac{c_4}{n - 0.9999}$.

Recalling that the μ are the elements of $\mathcal{O}_K$ of norm m, we set $c_6 = \max_{1 \le i \le n} |\mu^{(i)}|^{-1}$.
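The first three constants depend only on the roots of F(X, 1) and are easy to compute numerically. The sketch below does this for the roots of $x^4 - 2$, reading $c_1$ as 2 divided by the smallest distance between two distinct conjugates and $c_2$ as the largest ratio of two such distances with a common index; this reading of the formulas is an assumption of the sketch, as are the function and variable names.

```python
import itertools
import math

def constants_c1_c2_c3(roots):
    """c1 = 2 / min |th_i - th_j|,  c2 = max |(th_i - th_j) / (th_j - th_k)|,  c3 = c1 * c2."""
    c1 = 2 / min(abs(a - b) for a, b in itertools.combinations(roots, 2))
    c2 = max(abs((a - b) / (b - c)) for a, b, c in itertools.permutations(roots, 3))
    return c1, c2, c1 * c2

t = 2 ** 0.25
roots = [t, -t, 1j * t, -1j * t]       # the four conjugates of theta with theta^4 = 2

c1, c2, c3 = constants_c1_c2_c3(roots)
print(c1, c2, c3)
```

For this polynomial the four roots form a square in the complex plane, so the smallest distance is the side and the largest is the diagonal, which gives $c_1 = 2^{1/4}$, $c_2 = \sqrt{2}$ and $c_3 = 2^{3/4}$.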

Now we are ready to calculate $A_1$. This is the case where $|\beta^{(i)}| > e^{-c_5 \cdot A}$ (the index i is chosen such that $|\beta^{(i)}|$ is the smallest of all the conjugates of β) and where $|\varepsilon^{(t)}| \ge e^{c_4 \cdot A}$, with t as in Lemma 2.3. Since $c_6$ is the maximum of all $\frac{1}{|\mu^{(t)}|}$, we get

$$e^{c_4 \cdot A} \le |\varepsilon^{(t)}| = \frac{|\beta^{(t)}|}{|\mu^{(t)}|} \le c_6 \cdot |\beta^{(t)}|.$$

We get an upper bound on $|\beta^{(t)}|$ as follows:

$$|\beta^{(t)}| = |m| \prod_{l \ne t} |\beta^{(l)}|^{-1} \le |m| \, |\beta^{(i)}|^{-(n-1)} < |m| \, e^{c_5 \cdot (n-1) \cdot A}.$$

Combining everything we get the following inequality:

$$e^{c_4 \cdot A} < c_6 \cdot |m| \, e^{c_5 \cdot (n-1) \cdot A}.$$

We take the logarithm on both sides to get $c_4 \cdot A < \log(c_6 \cdot |m|) + c_5 \cdot (n-1) \cdot A$. This leads to the following bound on A:

$$A < \frac{\log(c_6 \cdot |m|)}{c_4 - c_5 \cdot (n-1)}.$$

So the value of $A_1$ is equal to $\frac{\log(c_6 |m|)}{c_4 - (n-1) c_5}$.

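To determine $A_2$ we consider the other case, where $|\beta^{(i)}| > e^{-c_5 \cdot A}$ and $|\varepsilon^{(t)}| \le e^{-c_4 \cdot A}$. In this case all the $c_i$ determined earlier stay the same; we only define in addition $c_7$, which is equal to $\min_{1 \le i \le n} |\mu^{(i)}|^{-1}$. Since $|\beta^{(i)}|$ is the smallest of the conjugates of β, we get the following inequalities:

$$e^{-c_4 \cdot A} \ge |\varepsilon^{(t)}| = \frac{|\beta^{(t)}|}{|\mu^{(t)}|} \ge c_7 \cdot |\beta^{(t)}| \ge c_7 \cdot |\beta^{(i)}| > c_7 \cdot e^{-c_5 \cdot A}.$$

Again we take the logarithm on both sides and get $-c_4 \cdot A \ge \log(c_7) - c_5 \cdot A$. We get the following bound on A:

$$A \le \frac{-\log(c_7)}{c_4 - c_5}.$$

Therefore we take $A_2$ to be equal to $\frac{|\log(c_7)|}{c_4 - c_5}$.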

The third case is that $|\beta^{(i)}| \leq e^{-c_5 A}$. Here we again distinguish two cases: either $A > \frac{\log(2c_3)}{c_5}$ or not. If this is not the case, then the value of the third bound is $A_3 = \frac{\log(2c_3)}{c_5}$.

The last case is that $|\beta^{(i)}| \leq e^{-c_5 A}$ and $A > \log(2c_3)/c_5$. In this case we get that

$$|\Lambda| \leq 2 c_3 e^{-c_5 A}.$$
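The three elementary bounds can be evaluated directly once the constants are known. The following is a minimal sketch, not the code of the thesis programs; the function name `first_three_bounds` is hypothetical, the value `c4 = 2.0` is a made-up placeholder, and `c3 = 2 ** 0.75` is an assumed value for the equation $X^4 - 2Y^4 = \pm 1$ treated later in this chapter.

```python
import math


def first_three_bounds(c3, c4, c5, c6, c7, m, n):
    """Return (A1, A2, A3) from the constants c3..c7, the norm m and the degree n."""
    if not 0 < c5 < c4 / (n - 1):
        raise ValueError("c5 must satisfy 0 < c5 < c4 / (n - 1)")
    a1 = math.log(c6 * abs(m)) / (c4 - (n - 1) * c5)  # case |eps^(t)| >= e^(c4 A)
    a2 = abs(math.log(c7)) / (c4 - c5)                # case |eps^(t)| <= e^(-c4 A)
    a3 = math.log(2 * c3) / c5                        # case |beta^(i)| <= e^(-c5 A)
    return a1, a2, a3


# Assumed inputs: c4 is a placeholder, the other values mimic the worked example.
A1, A2, A3 = first_three_bounds(c3=2 ** 0.75, c4=2.0, c5=0.65,
                                c6=1.0, c7=1.0, m=1, n=4)
print(A1, A2, round(A3, 3))
```

With $c_6 = c_7 = 1$ and $|m| = 1$ both logarithms vanish, which is why the first two bounds are trivial in that situation.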

We use the theory of linear forms in logarithms from Chapter V of [17]. To define the bound $A_4$ we make use of two height functions that are defined for all algebraic numbers: the absolute logarithmic Weil height $h(\cdot)$ and the modified height $h_m(\cdot)$. For the definition of the absolute logarithmic height we refer to [3]; the modified height is defined as follows:

$$h_m(a) = \max\left\{ h(a), \frac{|\log(a)|}{d}, \frac{1}{d} \right\}.$$

Now when $A > 3$, in order to bound $A$ we need to introduce one more constant, $c_8$, whose value is very large:

$$c_8 = 18 \cdot (n+1)! \, n^{n+1} (32 \cdot D)^{n+2} \log(2nD) \prod_i h_m(\alpha_i).$$

Here the integer $D$ is the degree of the splitting field $S$, in other words $D = [S : \mathbb{Q}]$, and the $\alpha_i$ are such that $S = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$. Using Theorem A.1 of Appendix A of [17] we obtain a bound on $A$, which we define as $A_4'$:

$$A_4' := \frac{2}{c_5} \left( \log(2c_3) + c_8 \log\left(\frac{s+t}{2}\right) + c_8 \log\left(\frac{c_8}{c_5}\right) \right).$$

Since $c_8$ is a "big" constant, $A_4'$ becomes an even larger number. With the help of the LLL-algorithm we can try to lower the bound $A_4'$ to a "smaller" bound that we define as $A_4$. How $A_4$ is constructed from $A_4'$ is explained with an example in the next section. As the resulting bound on $A$ we take the maximum of all previously defined $A_i$ and 3, in other words:

$$A \leq \max(A_1, A_2, A_3, A_4, 3).$$

2.2. LLL-algorithm

We have come across linear forms; in this section we explain how we use them. Given a linear form

$$\Lambda = |d_1 a_1 + d_2 a_2 + \cdots + d_r a_r|,$$

where the $a_i$ are the variables and the $d_i \in \mathbb{C}$ are the coefficients, let $A = \max_{1 \leq i \leq r} |a_i|$. Our goal is to find a reasonably small bound on $A$.

The LLL-algorithm takes the columns of a matrix as a basis of a lattice and transforms them into a reduced basis consisting of shorter vectors. This algorithm was invented by Arjen Lenstra, Hendrik Lenstra and László Lovász in 1982 and is now used by many programs and computer algebra systems, including SageMath. To read more about this algorithm we refer to their publication [10] and, for more recent information, to Chapter 2 of [6].

We use the LLL-algorithm to reduce the bound $A_4'$, which we take as the initial bound. We choose a constant $C$ that is larger than $(A_4')^r$, where $r$ is the number of variables in our linear form. For the linear form inequality

$$\Lambda = |d_1 a_1 + d_2 a_2 + \cdots + d_r a_r| \leq f_1 \cdot e^{-f_2 \cdot A^q}, \quad \text{for some constants } f_1, f_2 \text{ and } q \in \mathbb{Z},$$

we construct the $r \times r$ matrix with the following columns ($\Re$ denotes the real part of a complex number and $\Im$ its imaginary part):

$$\begin{pmatrix} 1 & \cdots & 0 & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & \cdots & 1 & 0 & 0 \\ |C \cdot \Re(d_1)| & \cdots & |C \cdot \Re(d_{r-2})| & |C \cdot \Re(d_{r-1})| & |C \cdot \Re(d_r)| \\ |C \cdot \Im(d_1)| & \cdots & |C \cdot \Im(d_{r-2})| & |C \cdot \Im(d_{r-1})| & |C \cdot \Im(d_r)| \end{pmatrix}$$

Using the columns of this matrix as a basis, we run the LLL-algorithm to reduce this basis. To compute the new bound on $A$ we first define the constants $S = (r-2) \cdot C^2$, $T = \frac{1 + r \cdot C}{\sqrt{2}}$ and $C_{special} = k_1^{-1} \cdot \|b_1\|$. Here $b_1$ is the first column of the LLL-reduced basis, $k_1 = \max_{1 \leq i \leq r} \frac{\|b_1\|}{\|b_i^*\|}$, and the $b_i^*$ form the orthogonalized basis obtained from the reduced basis. Then the new bound $A_{new}$ on $A$ is given by the formula

$$A_{new} := \sqrt[q]{\frac{1}{f_2} \left( \log(C \cdot f_1) - \log\left(\sqrt{C_{special} - S} - T\right) \right)}.$$

We can repeat this process a couple of times, until the bound on $A$ stops getting smaller or until the bound is small enough to run through all possible cases. This last bound we define as $A_4$.
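To make the reduction step concrete, here is a small sketch of the textbook LLL-algorithm in exact rational arithmetic. It is only an illustration under stated assumptions: the function names are hypothetical, the basis vectors are stored as rows rather than columns, the parameter `delta = 3/4` is the classical choice, and the Gram-Schmidt data is recomputed from scratch in every step for readability. It is not the implementation used by SageMath or by the programs of this thesis.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return the orthogonalized vectors b_i^* and the coefficients mu[i][j]."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i][j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll(basis, delta=Fraction(3, 4)):
    """LLL-reduce a list of linearly independent integer row vectors."""
    basis = [list(b) for b in basis]
    k = 1
    while k < len(basis):
        # Size reduction of b_k against b_{k-1}, ..., b_0.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(basis)
            q = round(mu[k][j])
            if q:
                basis[k] = [x - q * y for x, y in zip(basis[k], basis[j])]
        ortho, mu = gram_schmidt(basis)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:      # Lovasz condition holds, move on
            k += 1
        else:               # otherwise swap and step back
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return basis


reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print(reduced)
```

In the bound reduction above, the quantity of interest is the first vector of the reduced basis, since its length enters the constant $C_{special}$.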

2.3. Example of solving a Thue equation

We now illustrate the explained theory on an example of a Thue equation. Before we start we introduce two height functions, defined on an element $a$ of a number field $K$ of degree $d$, that we use in this example: the absolute logarithmic Weil height $h(a)$, for whose definition we refer to [3], and the modified height $h_m$. The modified height is defined as

$$h_m(a) = \max\left\{ h(a), \frac{|\log(a)|}{d}, \frac{1}{d} \right\}.$$

We will find all the solutions of the equation $X^4 - 2Y^4 = \pm 1$. We set $\theta$ to be a root of the equation $\theta^4 - 2 = 0$ and we take the two fundamental units to be $\eta_1 = 1 + \theta^2$ and $\eta_2 = 1 + \theta$. Hence we have to find all possible exponents $a_i$ such that $X - \theta Y = \beta = \pm \eta_1^{a_1} \eta_2^{a_2}$. We label the roots of $X^4 - 2 = 0$ so that

$$\theta^{(1)} = -\sqrt[4]{2}, \qquad \theta^{(2)} = \sqrt[4]{2}, \qquad \theta^{(3)} = \overline{\theta^{(4)}} = -\sqrt[4]{2}\sqrt{-1}.$$

We can compute $c_1$, $c_2$, $c_3$ and $c_4$. We take $c_5 = 0.65$, so that it satisfies the inequality $c_5 < \frac{c_4}{n-1} = \frac{c_4}{3}$. We can also take the constants $c_6$ and $c_7$ equal to 1, so we get the trivial upper bounds $A_1 = A_2 = 0$. For $Y \geq Y_1 = 1$ we need to consider two cases ($i = 1$ or $i = 2$) for the index $i$ such that $|\beta^{(i)}| = \min_{1 \leq l \leq 4} |\beta^{(l)}|$.
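The constants $c_1$, $c_2$, $c_3$ of this example are easy to check numerically from the four roots. The sketch below assumes the definitions $c_1 = 2/\min_{i \neq j}|\theta^{(i)} - \theta^{(j)}|$, $c_2 = \max |(\theta^{(i)} - \theta^{(j)})/(\theta^{(j)} - \theta^{(k)})|$ and $c_3 = c_1 c_2$; the helper name `thue_constants` is hypothetical.

```python
import math
from itertools import permutations


def thue_constants(roots):
    """Return (c1, c2, c3) for the conjugates theta^(1..n), given as complex numbers."""
    min_gap = min(abs(a - b) for a, b in permutations(roots, 2))
    c1 = 2 / min_gap
    c2 = max(abs((a - b) / (b - c)) for a, b, c in permutations(roots, 3))
    return c1, c2, c1 * c2


t = 2 ** 0.25  # real fourth root of 2
roots = [complex(-t, 0), complex(t, 0), complex(0, -t), complex(0, t)]
c1, c2, c3 = thue_constants(roots)

c5 = 0.65                      # value chosen in the text
A3 = math.log(2 * c3) / c5     # third bound, log(2 c3) / c5
print(round(c1, 4), round(c2, 4), round(c3, 4), round(A3, 3))
```

Under these assumptions $c_3 = 2^{3/4}$, and the resulting value of $\log(2c_3)/c_5$ is about 1.866.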

• Let us first consider the case $i = 1$. We set $j = 3$ and $k = 2$. We consider the linear form in logarithms and use the LLL-algorithm:

$$\Lambda = \log(-\alpha_2) + \sum_{i=1}^{r} a_i \log\left( \frac{\eta_i^{(2)}}{\eta_i^{(3)}} \right) + a_0 2\pi\sqrt{-1}, \quad \text{with } a_0 \in \mathbb{Z}.$$

We notice that

$$\alpha_2 = \pm \frac{\theta^{(1)} - \theta^{(3)}}{\theta^{(2)} - \theta^{(1)}} = \pm \frac{1 - \sqrt{-1}}{2},$$

and we also have

$$\frac{\eta_1^{(2)}}{\eta_1^{(3)}} = \frac{1 + \theta^2}{1 - \theta^2}, \qquad \frac{\eta_2^{(2)}}{\eta_2^{(3)}} = \frac{1 - \theta}{1 + \sqrt{-1}\,\theta}.$$

The first has minimal polynomial $X^2 + 6X + 1$ and the second has minimal polynomial $X^8 + 8X^7 + 44X^6 - 136X^5 + 230X^4 - 136X^3 + 44X^2 + 8X + 1$. Using SageMath we compute the heights:

$$h(\alpha_2) = 0.3465, \qquad h\left( \frac{\eta_1^{(2)}}{\eta_1^{(3)}} \right) = 0.88137, \qquad h\left( \frac{\eta_2^{(2)}}{\eta_2^{(3)}} \right) = 0.61211.$$

Now we consider the modified heights, using Appendix A1 of [17]:

$$h_m(\alpha_2) = \max\{0.3465, 0.1075, 0.125\} = 0.3465,$$
$$h_m\left( \frac{\eta_1^{(2)}}{\eta_1^{(3)}} \right) = \max\{0.88137, 0.4503, 0.125\} = 0.88137,$$
$$h_m\left( \frac{\eta_2^{(2)}}{\eta_2^{(3)}} \right) = \max\{0.61211, 0.1171, 0.125\} = 0.61211.$$

Using the theory of Appendix A1 of [17] we can give a lower bound for $|\Lambda|$:

$$\log|\Lambda| \geq -c_8 \log\left( \frac{3A}{2} \right),$$

where

$$c_8 = 18 \cdot 5! \cdot 4^5 \cdot 256^6 (\log 64) \, h_m(-1) \, h_m(\alpha_2) \, h_m\left( \frac{\eta_1^{(2)}}{\eta_1^{(3)}} \right) h_m\left( \frac{\eta_2^{(2)}}{\eta_2^{(3)}} \right) = 1.902 \cdot 10^{20}.$$

Now we can see that if $A \geq A_3 = 1.866$, then $A \leq A_4' = 2.8 \cdot 10^{22}$. We use the LLL-algorithm to try to get a smaller bound on $A$. Using the LLL-algorithm with $C = 10^{46}$ we get $A_{new}' = 49$; after this we repeat the algorithm with $C = 10^5$, which eventually gives us $A_4 = 8$. This bound is reasonably small and we can enumerate all possibilities to find all solutions.

• Let us now consider the case $i = 2$. We again choose $j$ and $k$, now $j = 3$ and $k = 1$. We consider the values of $\Lambda$. We see that

$$\alpha_2 = \pm \frac{\theta^{(2)} - \theta^{(3)}}{\theta^{(1)} - \theta^{(2)}} = \pm \frac{1 + \sqrt{-1}}{2},$$

and we also get

$$\frac{\eta_1^{(1)}}{\eta_1^{(3)}} = \frac{1 + \theta^2}{1 - \theta^2}, \qquad \frac{\eta_2^{(1)}}{\eta_2^{(3)}} = \frac{1 + \theta}{1 + \sqrt{-1}\,\theta}.$$

These two numbers have the minimal polynomials $X^2 + 6X + 1$ and $X^8 + 8X^7 + 44X^6 - 136X^5 + 230X^4 - 136X^3 + 44X^2 + 8X + 1$. We compute the heights and then the modified heights, and we get $c_8 = 1.902 \cdot 10^{20}$. So we get that $A \leq A_4' = 2.8 \cdot 10^{22}$. Using the LLL-algorithm with $C = 10^{40}$ we get $A_{new}' = 49$, and after using the same algorithm with $C = 10^4$ we get $A_4 = 7$.
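The size of $c_8$ and of the initial bound $A_4'$ in both cases can be reproduced with a few lines of floating point arithmetic. This is a sketch under assumptions: the three modified heights are copied from the text, $h_m(-1)$ is evaluated as $\pi/8$ (taking $h(-1) = 0$, $|\log(-1)| = \pi$ and $d = 8$, which matches the term $1/d = 0.125$ above), $D = 8$ is read off from the factor $256 = 32 \cdot 8$, $s + t = 3$ from the term $\log(3A/2)$, and $c_3 = 2^{3/4}$ is an assumed value for this equation.

```python
import math

n, D, d = 4, 8, 8          # degree, splitting field degree, degree used in h_m
s, t = 2, 1                # real embeddings and pairs of complex embeddings
c3, c5 = 2 ** 0.75, 0.65   # c3 is an assumed value, c5 is taken from the text


def modified_height(h, abs_log, degree):
    """h_m(a) = max{h(a), |log a| / d, 1 / d}."""
    return max(h, abs_log / degree, 1 / degree)


heights = [
    modified_height(0.0, math.pi, d),  # a = -1
    0.3465,                            # h_m(alpha_2), from the text
    0.88137,                           # h_m(eta_1 quotient), from the text
    0.61211,                           # h_m(eta_2 quotient), from the text
]

c8 = (18 * math.factorial(n + 1) * n ** (n + 1) * (32 * D) ** (n + 2)
      * math.log(2 * n * D) * math.prod(heights))

A4_prime = (2 / c5) * (math.log(2 * c3)
                       + c8 * math.log((s + t) / 2)
                       + c8 * math.log(c8 / c5))
print(f"c8 = {c8:.3e}, A4' = {A4_prime:.2e}")
```

The printed values agree with $1.902 \cdot 10^{20}$ and $2.8 \cdot 10^{22}$ up to the rounding of the heights.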

So in both cases $A \leq 8$, and we need to check all candidates below this bound. We then find that the only integral solutions of the equation $X^4 - 2Y^4 = \pm 1$ are

$$(X, Y) \in \{\pm(1, 0), \pm(1, -1), \pm(1, 1)\}.$$
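The final enumeration can be sketched with exact integer arithmetic in $\mathbb{Z}[\theta]$, $\theta^4 = 2$. The code below loops over all exponents $|a_1|, |a_2| \leq 8$ and both signs, and keeps those $\beta = \pm\eta_1^{a_1}\eta_2^{a_2}$ that have the shape $X - \theta Y$. The helper names are hypothetical, and the inverses $\eta_1^{-1} = \theta^2 - 1$ and $\eta_2^{-1} = -1 + \theta - \theta^2 + \theta^3$ are worked out by hand from $\theta^4 = 2$; this is an illustration, not the program of the thesis.

```python
from itertools import product


def mul(u, v):
    """Multiply two elements of Z[theta], theta^4 = 2 (coefficients of 1, theta, theta^2, theta^3)."""
    out = [0] * 7
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    for e in (6, 5, 4):            # reduce theta^e using theta^4 = 2
        out[e - 4] += 2 * out[e]
    return tuple(out[:4])


def power(base, inverse, exponent):
    """Return base^exponent, using the given inverse for negative exponents."""
    result = (1, 0, 0, 0)
    factor = base if exponent >= 0 else inverse
    for _ in range(abs(exponent)):
        result = mul(result, factor)
    return result


ETA1, ETA1_INV = (1, 0, 1, 0), (-1, 0, 1, 0)      # 1 + theta^2 and theta^2 - 1
ETA2, ETA2_INV = (1, 1, 0, 0), (-1, 1, -1, 1)     # 1 + theta and its inverse

BOUND = 8
solutions = set()
for a1, a2, sign in product(range(-BOUND, BOUND + 1), range(-BOUND, BOUND + 1), (1, -1)):
    beta = mul(power(ETA1, ETA1_INV, a1), power(ETA2, ETA2_INV, a2))
    beta = tuple(sign * coeff for coeff in beta)
    if beta[2] == 0 and beta[3] == 0:             # beta = X - theta * Y
        solutions.add((beta[0], -beta[1]))

print(sorted(solutions))
```

Every pair found this way satisfies $X^4 - 2Y^4 = \pm 1$, because the norm of a unit is $\pm 1$.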

This means that in the previous chapter we did determine all possible solutions to the Thue equation $Y^2 - X^4 = -2$.

2.4. Method of Bilu and Hanrot

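This is an alternative method for determining the upper bound on $A$ for the Thue equation. We let $U_I$ be the matrix defined in Lemma 2.3, we take an $i \in \{1, \ldots, s+t\} \setminus I$ and we let $(u_{i,j}) = U_I^{-1}$, with $I$ being the same as before, so that

$$|\beta^{(i)}| = \min_{1 \leq l \leq r+1} |\beta^{(l)}|.$$

From the equation

$$U_I \begin{pmatrix} a_1 \\ \vdots \\ a_r \end{pmatrix} = \begin{pmatrix} \log\left| \frac{x - \theta^{(i_1)} y}{\mu^{(i_1)}} \right| \\ \vdots \\ \log\left| \frac{x - \theta^{(i_r)} y}{\mu^{(i_r)}} \right| \end{pmatrix}$$

we deduce for $k = 1, \ldots, r$ that

$$\begin{aligned} a_k &= \sum_{j=1}^{r} u_{k,j} \log\left| \frac{x - \theta^{(i_j)} y}{\mu^{(i_j)}} \right| \\ &= \sum_{j=1}^{r} u_{k,j} \log|y| + \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\mu^{(i_j)}} \right| \\ &= \sum_{j=1}^{r} u_{k,j} \log|y| + \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\theta^{(i)} - \theta^{(i_j)}}{\mu^{(i_j)}} \right| + \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \\ &= \delta_k \log|y| + \lambda_k + \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\theta^{(i)} - \theta^{(i_j)}} \right|. \end{aligned}$$

Here we have defined $\delta_k = \sum_{j=1}^{r} u_{k,j}$ and $\lambda_k = \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\theta^{(i)} - \theta^{(i_j)}}{\mu^{(i_j)}} \right|$. When $y$ becomes large enough, we get that

$$\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \leq \frac{n+2}{n+1}.$$

Using this fact we can estimate the last sum:

$$\left| \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \right| \leq \left| \sum_{j=1}^{r} u_{k,j} \log\frac{n+2}{n+1} \right| \leq \sum_{j=1}^{r} |u_{k,j}| \log\frac{n+2}{n+1} \leq \frac{1}{n} \sum_{j=1}^{r} |u_{k,j}|.$$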

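Then it follows that $A \leq c_{10} \log|y| + c_{11}$, where

$$c_{10} = \max_{1 \leq k \leq r} |\delta_k| \quad \text{and} \quad c_{11} = \max_{1 \leq k \leq r} \left( \frac{1}{n} \sum_{j=1}^{r} |u_{k,j}| + |\lambda_k| \right).$$

Rewriting the previous inequality we get $|y|^{-1} \leq c_{12} e^{-c_{13} A}$, where $c_{12} = \exp\left( \frac{c_{11}}{c_{10}} \right)$ and $c_{13} = c_{10}^{-1}$. Now when $y$ is sufficiently large, we see that

$$\begin{aligned} \left| \log\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \right| &\leq \log\left( 1 + \left| \frac{\frac{x}{y} - \theta^{(i)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \right) \leq 2 \left| \frac{\frac{x}{y} - \theta^{(i)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \leq c_1 \left| \frac{x}{y} - \theta^{(i)} \right| \\ &\leq c_1 c_9 |y^{-n}| \leq c_1 c_9 c_{12}^{n} e^{-n c_{13} A} = c_{14} e^{-c_{15} A}. \end{aligned}$$

Using this fact we get the following inequality:

$$|\delta_k \log|y| - a_k + \lambda_k| \leq \left| \sum_{j=1}^{r} u_{k,j} \log\left| \frac{\frac{x}{y} - \theta^{(i_j)}}{\theta^{(i)} - \theta^{(i_j)}} \right| \right| \leq \sum_{j=1}^{r} |u_{k,j}| \, c_{14} e^{-c_{15} A} \leq c_{16} e^{-c_{15} A},$$

where we define

$$c_{16} := c_{14} \max_{1 \leq k \leq r} \sum_{j=1}^{r} |u_{k,j}| = c_{14} \|U_I^{-1}\|_\infty.$$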

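Now we write down the Bilu and Hanrot inequality. First we define the index $h$ to be the one for which $|\delta_h| = \max_{1 \leq k \leq r} |\delta_k|$ holds, and we let $g$ be any index other than $h$. Then we define $\delta = \frac{\delta_g}{\delta_h}$ and $\lambda = \frac{\delta_g \lambda_h - \delta_h \lambda_g}{\delta_h}$, and we get the following:

$$\begin{aligned} |a_g - \delta a_h + \lambda| &= \left| -a_g + \delta a_h - \frac{\delta_g \lambda_h - \delta_h \lambda_g}{\delta_h} \right| \\ &= |\delta_g \log|y| - a_g + \lambda_g - \delta_g \log|y| + \delta a_h - \delta \lambda_h| \\ &= |(\delta_g \log|y| - a_g + \lambda_g) - \delta(\delta_h \log|y| - a_h + \lambda_h)| \\ &\leq |\delta_g \log|y| - a_g + \lambda_g| + |\delta| \, |\delta_h \log|y| - a_h + \lambda_h| \\ &\leq (1 + |\delta|) c_{16} e^{-c_{15} A} \\ &\leq 2 c_{16} e^{-c_{15} A}. \end{aligned}$$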

We can use this inequality to reduce the bound on $A$ using LLL reduction of two-dimensional lattices. However, it is more efficient to use continued fractions. Once the bound on $A$ has been reduced, we can use the inequality to search for the small solutions of the equation. A direct search would mean that we need to check $(2A_6 + 1)^r$ possibilities.

But we can shorten the sieving over the exponents. Rewriting the right-hand sides in terms of $|y|$ instead of $A$, we get in the cases where $|y|$ becomes big:

$$\left| a_g - \frac{1}{\delta_h} (\delta_g a_h - \delta_g \lambda_h + \delta_h \lambda_g) \right| < \frac{1}{2}.$$

So we loop through all possible values of $a_h$, and then the values of $a_g$ for all $g \neq h$ are determined explicitly. Hence we need to check only $2A_6 + 1$ exponent vectors.

This implies that the final search depends only on $A_6$ and not on $r$, and hence not on the degree of the equation. One still needs to compute the fundamental units and the complete set of representatives for the $\mu$, but after that solving the Thue equation is a fast process. For more information about this method we refer to Section VII.3 of [17].
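The reduced search can be sketched as follows: for every value of $a_h$ the other exponents are obtained by rounding $(\delta_g a_h - \delta_g \lambda_h + \delta_h \lambda_g)/\delta_h$ to the nearest integer. The function name and the numerical values of `deltas` and `lambdas` below are hypothetical and serve only to make the sketch runnable; the rounding step is valid only in the range where the inequality with bound $\frac{1}{2}$ holds, and every candidate vector still has to be tested in the Thue equation.

```python
def candidate_exponents(bound, deltas, lambdas):
    """Yield one exponent vector (a_1, ..., a_r) for each a_h in [-bound, bound]."""
    r = len(deltas)
    h = max(range(r), key=lambda k: abs(deltas[k]))  # index with the largest |delta_k|
    for a_h in range(-bound, bound + 1):
        vector = [0] * r
        vector[h] = a_h
        for g in range(r):
            if g != h:
                target = (deltas[g] * a_h - deltas[g] * lambdas[h]
                          + deltas[h] * lambdas[g]) / deltas[h]
                vector[g] = round(target)            # nearest integer to the target
        yield tuple(vector)


# Hypothetical data with r = 2 fundamental units.
vectors = list(candidate_exponents(bound=6, deltas=[1.0, 0.25], lambdas=[0.0, 0.1]))
print(len(vectors), vectors[:3])
```

The number of candidates is $2 \cdot 6 + 1 = 13$ here, independent of $r$.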

2.5. Program

I have written the program programFindBoundThueEquation, based on the theory of this chapter. It solves Thue equations; for the code of this program we refer to Appendix A. This program is an essential part of programone, which will be explained in the next chapter. I have tried to write programFindBoundThueEquation so that it can solve the general case of a Thue equation, but unfortunately in the end I had to make some restrictions.

The program takes as input a homogeneous polynomial $Q(p, q)$ in the variables $p, q$ and an array of numbers $m_i$ for which you want to solve the Thue equation

$$Q(p, q) = m_i^2, \quad m_i^2 \in \mathbb{Z}.$$

Note that it is important here that $m_i^2 \in \mathbb{Z}$; this construction was chosen for the smooth running of programone. It is therefore allowed to choose imaginary numbers and square roots for the $m_i$, as long as $m_i^2$ stays in $\mathbb{Z}$.

The beginning of programFindBoundThueEquation does not depend on the degree of the polynomial. Only at the end is there a restriction on the number of fundamental units. Since programone always produces Thue equations of degree four, I have restricted the end of programFindBoundThueEquation to the cases that can occur in degree four extensions of $\mathbb{Q}$, namely 1, 2 or 3 fundamental units.

If you want to solve a Thue equation of higher degree, you can use my program with a slight extension: you then need to add the other possible cases for the number of fundamental units. How programFindBoundThueEquation works is made clearer in Appendix A and in the last section of the next chapter, where we discuss the working of programone.


3. Finding integral points using Thue equations

3.1. Theory of the method

In this chapter we explain how the theory of Thue equations can be applied to find all integral points on a curve defined by the equation $Y^2 = X^3 + aX^2 + bX + c$, where $a, b, c \in \mathbb{Z}$. Recall that we consider only those $a, b, c$ for which $X^3 + aX^2 + bX + c$ does not have multiple roots; in other words, the discriminant should be nonzero. We follow the principle explained in Section 2 of [4].

We denote by $A$ the algebra $A \cong \mathbb{Q}[X]/(X^3 + aX^2 + bX + c)$. Then there are three possibilities for $A$.

• If $X^3 + aX^2 + bX + c$ has three different roots that are integers, then $A = \mathbb{Q} \times \mathbb{Q} \times \mathbb{Q}$,

• If $X^3 + aX^2 + bX + c$ has one root that is an integer and two quadratic roots $\theta = \theta^{(1)}, \theta^{(2)}$, then $A = \mathbb{Q} \times \mathbb{Q}(\theta)$,

• If $X^3 + aX^2 + bX + c$ has no roots in $\mathbb{Z}$, so that $X^3 + aX^2 + bX + c$ is irreducible, then there is a $\theta$ such that $\theta^3 + a\theta^2 + b\theta + c = 0$, and in this case $A = \mathbb{Q}(\theta)$.

We consider the last case. We start by writing $A = \prod_{i=1}^{r} A_i$, where $A_i = \mathbb{Q}(\theta_i)$ for $i \in \{1, 2, 3\}$ and $\theta_i$ is the corresponding root of the cubic polynomial

$$X^3 + aX^2 + bX + c = (X - \theta_1)(X - \theta_2)(X - \theta_3).$$

Suppose $(X, Y)$ is an integral solution of our equation; then $X - \theta_i \in A_i$ is an algebraic integer. We consider a prime ideal $\wp$ of $O_{L_i}$ that divides the ideal $(X - \theta_i)$ to an odd power.

From the unique factorization of ideals and the equation of ideals $(Y)^2 = (X - \theta_i)(X - \theta_j)(X - \theta_k)$ we get the following information. Since $\wp$ divides the right-hand side, it has to divide the left-hand side, and thus it divides the ideal $(Y)$. Since it divides $(Y)$, an even power of $\wp$ divides the left-hand side, and because an odd power of $\wp$ divides the ideal $(X - \theta_i)$, $\wp$ has to divide the ideal $(X - \theta_j)$ for some $j \neq i$. Because $\wp$ divides both ideals $(X - \theta_i)$ and $(X - \theta_j)$, it also has to divide their difference, so $\wp$ divides the ideal $(\theta_j - \theta_i)$. From this it follows that $\wp$ divides the discriminant of $X^3 + aX^2 + bX + c$. So we are interested in the set of prime ideals that divide the discriminant; we call this set $S_i$.

We can now write

$$X - \theta_i = \alpha_i \beta_i^2,$$

with the following three conditions:

$\alpha_i \in A_i(S_i, 2)$ ($\alpha_i$ is square free; we denote by $A_i(S_i, 2)$ the set $A_i^*/(A_i^*)^2$ with the added generators of the prime ideals that divide the discriminant);

$\beta_i = (x b_1 + y b_2 + z b_3) \in A_i$ (where $b_1, b_2, b_3$ form a basis of $A_i$);

$\prod_{i=1}^{r} N_{A_i/\mathbb{Q}}(\alpha_i)$ is a rational square.

In this situation three quadratic forms $Q_1, Q_2, Q_3$ with integral coefficients in the three variables $x, y, z$ can always be constructed, just by rewriting $\alpha_i \beta_i^2$ in the form $Q_1 \theta_1^2 + Q_2 \theta_1 + Q_3$, using the equality $\theta_1^3 = -a\theta_1^2 - b\theta_1 - c$. After equating the coefficients in the equality $X - \theta_i = \alpha_i \beta_i^2$ we see that finding an integral point on our curve is equivalent to finding integers $x, y, z$ that solve the system:

$$Q_1(x, y, z) = 0,$$
$$Q_2(x, y, z) = -1,$$
$$Q_3(x, y, z) = X.$$

We illustrate this method with the following example.

3.2. Example of the method

We consider the curve $Y^2 = X^3 - 6X - 14$. We take $\theta$ to be a solution of $\theta^3 - 6\theta - 14 = 0$. We determine that the class number of $\mathbb{Q}(\theta)$ is one, and that one fundamental unit forms a basis of $O_K^*$; we take for it $\eta = \frac{5 - 8\theta + 2\theta^2}{3}$. We also find that as integral basis of $\mathbb{Q}(\theta)$ we can take $\omega_1 = 1$, $\omega_2 = \theta$ and $\omega_3 = \frac{1 - \theta + \theta^2}{3}$. The discriminant of $X^3 - 6X - 14$ is

$$-4 \cdot 1 \cdot (-6)^3 - 27 \cdot 1^2 \cdot (-14)^2 = -4428 = -2^2 \cdot 3^3 \cdot 41.$$

So we want to find the generators of the prime ideals that divide $(2)$, $(3)$ and $(41)$, which split in $O_K$ into:

$$(2) = \wp_2^3, \qquad (3) = \wp_3 (\wp_3')^2, \qquad (41) = \wp_{41} (\wp_{41}')^2.$$
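The discriminant and its prime factors can be double-checked with a few lines of plain Python. This is a sketch only: the function names are hypothetical, the discriminant formula is the standard one for a monic cubic $X^3 + aX^2 + bX + c$, and the factorization uses simple trial division, which is sufficient for numbers of this size.

```python
def cubic_discriminant(a, b, c):
    """Discriminant of X^3 + a X^2 + b X + c."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c


def factorize(n):
    """Trial-division factorization of |n|, returned as {prime: exponent}."""
    n, factors, p = abs(n), {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


disc = cubic_discriminant(0, -6, -14)
print(disc, factorize(disc))
```

The primes that appear are exactly the ones whose prime ideal factors have to be considered.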

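As generators of these prime ideals we can take the following:

$\pi_2 = \frac{8 + \theta - \theta^2}{3}$ for the prime ideal $\wp_2$,

$\pi_3 = \frac{5 + \theta - \theta^2}{3}$ for the prime ideal $\wp_3$, $\pi_3' = \frac{1 + 2\theta + \theta^2}{3}$ for the prime ideal $\wp_3'$,

$\pi_{41} = \frac{1 + 5\theta - 2\theta^2}{3}$ for the prime ideal $\wp_{41}$, $\pi_{41}' = \frac{5 + 4\theta + \theta^2}{3}$ for the prime ideal $\wp_{41}'$.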

Now we see that $\mathbb{Q}(\theta)(S, 2) \cong \langle -1 \rangle \times \langle \eta \rangle \times \langle \pi_2 \rangle \times \langle \pi_3 \rangle \times \langle \pi_3' \rangle \times \langle \pi_{41} \rangle \times \langle \pi_{41}' \rangle$, so it is a group of order $2^7$, because we work modulo the squares. But we need only the elements whose norm is a square in $\mathbb{Q}$. After simple calculations we are left with the following subgroup $H$ of order 8: $H = \langle -\eta \rangle \times \langle -\pi_3 \pi_3' \rangle \times \langle -\pi_{41} \pi_{41}' \rangle$.

For each element $\alpha$ of this subgroup we need to determine whether there are integral solutions $x, y, z$ of the equation

$$X - \theta = \alpha (x\omega_1 + y\omega_2 + z\omega_3)^2 \tag{3.1}$$
$$= \alpha \left( x + y\theta + z \frac{1 - \theta + \theta^2}{3} \right)^2. \tag{3.2}$$

After we equate the coefficients of $1, \theta, \theta^2$ we get the following conditions on $x, y, z$:

$$0 := Q_1(x, y, z), \tag{3.3}$$
$$-1 := Q_2(x, y, z), \tag{3.4}$$
$$X := Q_3(x, y, z). \tag{3.5}$$
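For $\alpha = 1$ the three forms can be obtained mechanically by squaring $x + y\theta + z\frac{1 - \theta + \theta^2}{3}$ and reducing with $\theta^3 = 6\theta + 14$. The sketch below does this with exact fractions at sample integer points and compares the result with the closed forms $Q_1 = y^2 + z^2 + \frac{2xz}{3} - \frac{2yz}{3}$, $Q_2 = 2xy - \frac{2xz}{3} + \frac{14yz}{3}$, $Q_3 = x^2 - 3z^2 + \frac{2xz}{3} + \frac{28yz}{3}$. The function names are hypothetical and this is not the author's programForQ1Q2.

```python
from fractions import Fraction
from itertools import product


def mul_mod_cubic(u, v, b=-6, c=-14):
    """Multiply u, v in Q(theta), theta^3 + b*theta + c = 0 (coefficients of 1, theta, theta^2)."""
    out = [Fraction(0)] * 5
    for i, s in enumerate(u):
        for j, t in enumerate(v):
            out[i + j] += s * t
    for e in (4, 3):               # theta^e = theta^(e-3) * (-b*theta - c)
        out[e - 2] += -b * out[e]
        out[e - 3] += -c * out[e]
    return out[:3]


def beta(x, y, z):
    """Coordinates of x + y*theta + z*(1 - theta + theta^2)/3 in the basis 1, theta, theta^2."""
    third = Fraction(z, 3)
    return [x + third, y - third, third]


def q_forms(x, y, z):
    """Closed forms (Q1, Q2, Q3) for alpha = 1."""
    x, y, z = Fraction(x), Fraction(y), Fraction(z)
    q1 = y * y + z * z + 2 * x * z / 3 - 2 * y * z / 3
    q2 = 2 * x * y - 2 * x * z / 3 + 14 * y * z / 3
    q3 = x * x - 3 * z * z + 2 * x * z / 3 + 28 * y * z / 3
    return q1, q2, q3


agree = all(
    mul_mod_cubic(beta(x, y, z), beta(x, y, z)) == list(reversed(q_forms(x, y, z)))
    for x, y, z in product(range(-3, 4), repeat=3)
)
print(agree)
```

Since two quadratic forms that agree on such a grid of points are identical, this confirms the coefficients of $1$, $\theta$ and $\theta^2$ for the case $\alpha = 1$.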

From our 8 candidates we can eliminate 3, since they are not locally soluble. We are left with the set:

$$\alpha \in \{1, \; -\pi_3 \pi_3', \; \eta \pi_3 \pi_3', \; \eta \pi_{41} \pi_{41}', \; -\eta \pi_3 \pi_3' \pi_{41} \pi_{41}'\}.$$

So now we consider all five cases.

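(i) This is the case that $\alpha = 1$. We equate the coefficients of $\{1, \theta, \theta^2\}$ on the left-hand side with those on the right-hand side and then compute $Q_1(x, y, z)$, $Q_2(x, y, z)$ and $Q_3(x, y, z)$. We start by rewriting the right-hand side, using that $\theta^3 - 6\theta - 14 = 0$:

$$\begin{aligned} \left( x + y\theta + z\frac{1 - \theta + \theta^2}{3} \right)^2 &= \theta^4 \left( \frac{z^2}{9} \right) + \theta^3 \left( -\frac{2z^2}{9} + \frac{2yz}{3} \right) + \theta^2 \left( y^2 + \frac{z^2}{3} + \frac{2xz}{3} - \frac{2yz}{3} \right) \\ &\quad + \theta \left( -\frac{2z^2}{9} + 2xy - \frac{2xz}{3} + \frac{2yz}{3} \right) + \left( x^2 + \frac{z^2}{9} + \frac{2xz}{3} \right) \\ &= \theta^4 \left( \frac{z^2}{9} \right) - \left( \frac{z^2}{9} \right)(\theta^4 - 6\theta^2 - 14\theta) + \theta^3 \left( -\frac{2z^2}{9} + \frac{2yz}{3} \right) + \theta^2 \left( y^2 + \frac{z^2}{3} + \frac{2xz}{3} - \frac{2yz}{3} \right) \\ &\quad + \theta \left( -\frac{2z^2}{9} + 2xy - \frac{2xz}{3} + \frac{2yz}{3} \right) + \left( x^2 + \frac{z^2}{9} + \frac{2xz}{3} \right) \\ &= \theta^3 \left( -\frac{2z^2}{9} + \frac{2yz}{3} \right) - \left( -\frac{2z^2}{9} + \frac{2yz}{3} \right)(\theta^3 - 6\theta - 14) + \theta^2 \left( y^2 + z^2 + \frac{2xz}{3} - \frac{2yz}{3} \right) \\ &\quad + \theta \left( -\frac{2z^2}{9} + 2xy - \frac{2xz}{3} + \frac{2yz}{3} + \frac{14z^2}{9} \right) + \left( x^2 + \frac{z^2}{9} + \frac{2xz}{3} \right) \\ &= \theta^2 \left( y^2 + z^2 + \frac{2xz}{3} - \frac{2yz}{3} \right) + \theta \left( 2xy - \frac{2xz}{3} + \frac{14yz}{3} \right) + \left( x^2 - 3z^2 + \frac{2xz}{3} + \frac{28yz}{3} \right). \end{aligned}$$

So we find that $Q_1(x, y, z) = y^2 + z^2 + \frac{2xz}{3} - \frac{2yz}{3} = 0$, $Q_2(x, y, z) = 2xy - \frac{2xz}{3} + \frac{14yz}{3} = -1$ and $Q_3(x, y, z) = x^2 - 3z^2 + \frac{2xz}{3} + \frac{28yz}{3} = X$. Thus $Q_1(x, y, z) = \frac{2xz + 3y^2 - 2yz + 3z^2}{3} = 0$, and from this the following result can be determined:

$$gx = -3p^2 + 2pq - 3q^2,$$
$$gy = 2pq,$$
$$gz = 2q^2 \quad \text{for } p, q, r \in \mathbb{Z}.$$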

We now explain how we got the previous result. We take our equation $Q(x, y, z) = 2xz + 3y^2 - 2yz + 3z^2 = 0$ and look for an easy non-trivial solution $(x_0, y_0, z_0)$. In our case an easy solution is $(x_0, y_0, z_0) = (1, 0, 0)$. We parametrize all solutions with this easy one in the following way:

$$x = r x_0,$$
$$y = r y_0 + p,$$
$$z = r z_0 + q, \quad \text{where } p, q, r \in \mathbb{Z}.$$

Since these are solutions of our equation $Q$, we substitute these $(x, y, z)$ into the equation. We get:

$$\begin{aligned} 2xz + 3y^2 - 2yz + 3z^2 &= 0, \\ 2(r x_0)(r z_0 + q) + 3(r y_0 + p)^2 - 2(r y_0 + p)(r z_0 + q) + 3(r z_0 + q)^2 &= 0, \\ r^2 Q(x_0, y_0, z_0) + r\big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) + (3p^2 - 2pq + 3q^2) &= 0, \\ r\big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) + (3p^2 - 2pq + 3q^2) &= 0, \quad \text{because } Q(x_0, y_0, z_0) = 0, \\ r\big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) &= (2pq - 3p^2 - 3q^2). \end{aligned}$$

Now we multiply both sides first by $x_0$, then by $y_0$ and lastly by $z_0$ to get:

$$\begin{aligned} \big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) r x_0 &= (2pq - 3p^2 - 3q^2) x_0, \\ \big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) x &= (2pq - 3p^2 - 3q^2) x_0; \\ \big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) r y_0 &= (2pq - 3p^2 - 3q^2) y_0, \\ \big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big)(y - p) &= (2pq - 3p^2 - 3q^2) y_0; \\ \big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) r z_0 &= (2pq - 3p^2 - 3q^2) z_0, \\ \big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big)(z - q) &= (2pq - 3p^2 - 3q^2) z_0. \end{aligned}$$

To simplify we set $g = p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)$. We get:

$$\begin{aligned} gx &= (2pq - 3p^2 - 3q^2) x_0, \\ gy &= (2pq - 3p^2 - 3q^2) y_0 + p\big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big) \\ &= p^2(-3y_0 + 6y_0 - 2z_0) + pq(2y_0 + 2x_0 - 2y_0 + 6z_0) + q^2(-3y_0), \\ gz &= (2pq - 3p^2 - 3q^2) z_0 + q\big(p(6y_0 - 2z_0) + q(2x_0 - 2y_0 + 6z_0)\big). \end{aligned}$$

Recalling that in our case $(x_0, y_0, z_0) = (1, 0, 0)$, we get:

$$gx = 2pq - 3p^2 - 3q^2,$$
$$gy = 2pq,$$
$$gz = 2q^2.$$

Now the integer $g$ should be a divisor of the determinant of the matrix $A$ such that

$$g \begin{pmatrix} x \\ y \\ z \end{pmatrix} = A \begin{pmatrix} p^2 \\ pq \\ q^2 \end{pmatrix}.$$

This matrix is as follows:

$$A = \begin{pmatrix} -3 & 2 & -3 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$

The determinant of this matrix is $-3 \cdot 2 \cdot 2 = -12$, so $g$ should be a divisor of 12. We take only the positive divisors, since later on we substitute $g$ into a quadratic form and thus the minus sign has no effect. We use a trick to reduce the number of possible values for $g$. For $g$ to be a suitable number, there has to be a solution in coprime integers $p, q$ of the system of congruences:

$$-3p^2 + 2pq - 3q^2 \equiv 0 \pmod{g},$$
$$2pq \equiv 0 \pmod{g},$$
$$2q^2 \equiv 0 \pmod{g}.$$

Using this trick we can discard two of the possible values of $g$, namely $g = 4$ and $g = 12$. In the case $g = 4$, from $2q^2 \equiv 0$ modulo 4 we see that $2 \mid q$, but then from $-3p^2 + 2pq - 3q^2 \equiv 0$ modulo 4 we see that $p$ should also be divisible by 2, which contradicts that $p, q$ are coprime. In the case $g = 12$, from $2q^2 \equiv 0$ modulo 12 we see that $6 \mid q$, and then from $-3p^2 + 2pq - 3q^2 \equiv 0$ modulo 12 it follows that $2 \mid p$, so again $p, q$ are not coprime, which is a contradiction. So we are left with values for $g$ from the set $\{1, 2, 3, 6\}$.

We substitute all the solutions into our $Q_2(x, y, z)$. Our equation $Q_2(x, y, z)$ is the following:

$$Q_2(x, y, z) = \frac{-2xz + 6xy + 14yz}{3} = -1.$$

After substitution this becomes:

$$\begin{aligned} \frac{-2xz + 6xy + 14yz}{3} &= -1, \\ \frac{-2(-3p^2 + 2pq - 3q^2) \cdot 2q^2 + 6(-3p^2 + 2pq - 3q^2) \cdot 2pq + 14 \cdot 2pq \cdot 2q^2}{3} &= -g^2, \\ \frac{q(36p^2 q + 12pq^2 + 12q^3 - 36p^3)}{3} &= -g^2, \\ 4q(q^3 + pq^2 + 3p^2 q - 3p^3) &= -g^2. \end{aligned}$$

Because both $p, q$ are integers, the left-hand side is divisible by 4, so $g$ can only be 2 or 6. We consider first the case $g = 2$. We get $q(q^3 + pq^2 + 3p^2 q - 3p^3) = -1$. So $q = 1$ or $q = -1$, which implies respectively $q^3 + pq^2 + 3p^2 q - 3p^3 = -1$ and $q^3 + pq^2 + 3p^2 q - 3p^3 = 1$. But if we find a solution of $q(q^3 + pq^2 + 3p^2 q - 3p^3) = -1$ with $(p, q) = (a, b)$, then $(p, q) = (-a, -b)$ is also a solution, because the left-hand side is homogeneous of even degree in $p, q$. Thus it is sufficient to consider only positive $q$. So we consider only the case $q = 1$, and we get:

$$1 + p + 3p^2 - 3p^3 = -1,$$
$$p + 3p^2 - 3p^3 = -2,$$
$$p(1 + 3p - 3p^2) = -2.$$

We see that there are only 4 possibilities for $p$: $p$ can be $\pm 1$ or $\pm 2$.

| $p$ | $1 + 3p - 3p^2$ | $p(1 + 3p - 3p^2)$ |
|-----|-----------------|--------------------|
| 1   | 1               | 1                  |
| -1  | -5              | 5                  |
| 2   | -5              | -10                |
| -2  | -17             | 34                 |

Table 3.1: case $g = 2$, $q = 1$

So there are no solutions in the case $g = 2$, $q = 1$. As mentioned before, we do not need to consider the other case $q = -1$, so there are no solutions for $g = 2$.

Now we consider the case $g = 6$. We get $q(q^3 + pq^2 + 3p^2 q - 3p^3) = -9$. This gives us 6 possible values for $q$: $q$ can be $\pm 1$, $\pm 3$ or $\pm 9$. By the same argument as before it is enough to consider only the positive values, so $q$ being 1, 3 or 9.

(a) The case that $q = 1$, and therefore $q^3 + pq^2 + 3p^2 q - 3p^3$ should be equal to $-9$:

$$q^3 + pq^2 + 3p^2 q - 3p^3 = -9,$$
$$1 + p + 3p^2 - 3p^3 = -9,$$
$$p + 3p^2 - 3p^3 = -10,$$
$$p(1 + 3p - 3p^2) = -10.$$

We find that $p$ can be $\pm 1$, $\pm 2$, $\pm 5$ or $\pm 10$. When $p$ is either $\pm 5$ or $\pm 10$, then $1 + 3p - 3p^2$ should be respectively $\mp 2$ or $\mp 1$, but because in all these cases $|p| \geq 5$, we can remark that $|1 + 3p - 3p^2| \geq |p^2 + p^2 - 3p^2| \geq |p^2| \geq 25 > 2$. So we need not consider these values. The other four values are simple calculations.

| $p$ | $1 + 3p - 3p^2$ | $p(1 + 3p - 3p^2)$ |
|-----|-----------------|--------------------|
| 1   | 1               | 1                  |
| -1  | -5              | 5                  |
| 2   | -5              | -10                |
| -2  | -17             | 34                 |

Table 3.2: case $g = 6$, $q = 1$

We find only one solution, namely $(p, q) = (2, 1)$. We substitute it back to find $x, y, z$ and we get:

$$(x, y, z) = \left( \frac{-3 \cdot 2^2 + 2 \cdot 2 \cdot 1 - 3 \cdot 1^2}{6}, \frac{2 \cdot 2 \cdot 1}{6}, \frac{2 \cdot 1^2}{6} \right) = \left( -\frac{11}{6}, \frac{2}{3}, \frac{1}{3} \right).$$

We see that $x, y, z$ are not integers, so this is not a valid solution.

(b) The case that $q = 3$, and therefore $q^3 + pq^2 + 3p^2 q - 3p^3$ should be equal to $-3$:

$$q^3 + pq^2 + 3p^2 q - 3p^3 = -3,$$
$$27 + 9p + 9p^2 - 3p^3 = -3,$$
$$3p + 3p^2 - p^3 = -1 - 9,$$
$$p(3 + 3p - p^2) = -10.$$

We find that $p$ can be $\pm 1$, $\pm 2$, $\pm 5$ or $\pm 10$. When $p$ is either $\pm 5$ or $\pm 10$, then $3 + 3p - p^2$ should be respectively $\mp 2$ or $\mp 1$, but because in all these cases $|p| \geq 5$, we can remark that $|3 + 3p - p^2| \geq |p| \geq 5 > 2$. So we need not consider these values. The other four values are simple calculations.

| $p$ | $3 + 3p - p^2$ | $p(3 + 3p - p^2)$ |
|-----|----------------|-------------------|
| 1   | 5              | 5                 |
| -1  | -1             | 1                 |
| 2   | 5              | 10                |
| -2  | -7             | 14                |

Table 3.3: case $g = 6$, $q = 3$

We find no solutions.

(c) The case that $q = 9$, and therefore $q^3 + pq^2 + 3p^2 q - 3p^3$ should be equal to $-1$:

$$q^3 + pq^2 + 3p^2 q - 3p^3 = -1,$$
$$729 + 81p + 27p^2 - 3p^3 = -1.$$

We see that the left-hand side is divisible by 3 and the right-hand side is not, so we cannot find an integer $p$ that solves the equation $729 + 81p + 27p^2 - 3p^3 = -1$, which implies that there are no solutions.

Now we have finished the first case; there are still four more to go. We will not consider them in detail, but we will give the results.

(ii) This is the case that $\alpha = -\pi_3 \pi_3' = -\frac{5 + \theta - \theta^2}{3} \cdot \frac{1 + 2\theta + \theta^2}{3}$. We proceed in the same way as in the previous case with $\alpha = 1$, except that we do the calculations by computer to simplify them. As a result we get no solutions.

(iii) This is the case that $\alpha = \eta \pi_3 \pi_3' = \frac{5 - 8\theta + 2\theta^2}{3} \cdot \frac{5 + \theta - \theta^2}{3} \cdot \frac{1 + 2\theta + \theta^2}{3}$. We repeat the same procedure as for $\alpha = 1$. This is a more interesting case, so we show some of the intermediate results. After repeating the procedure we get the following Thue equations that need to be solved:

$$p^4 - 2780p^3 q + 2906184p^2 q^2 - 1353926432pq^3 + 237157630216q^4 = -73560059,$$
$$p^4 - 2780p^3 q + 2906184p^2 q^2 - 1353926432pq^3 + 237157630216q^4 = -294240236.$$

In this case we cannot factorize the left-hand side of the Thue equations as we did when $\alpha$ was 1. Here we apply the theory of Chapter 2. We consider the number field $K = \mathbb{Q}(\theta)$ with $\theta$ being a root of the polynomial $p^4 - 2780p^3 + 2906184p^2 - 1353926432p + 237157630216$. In this number field we find that the unit group is generated by two fundamental units $(\eta_1, \eta_2)$, which we take to be

$$\eta_1 := \frac{4}{175561}\theta^3 - \frac{8187}{175561}\theta^2 + \frac{13312}{419}\theta - 7205,$$
$$\eta_2 := \frac{85}{351122}\theta^3 - \frac{89239}{175561}\theta^2 + \frac{149888}{419}\theta - 84381.$$

For both equations we want to determine an upper bound $A$ on the $a_i$ in $\beta = \mu \prod_{i=1}^{2} \eta_i^{a_i}$, where $\mu$ comes in the first case from the set of representatives of norm $-73560059$ and in the second case from the set of representatives of norm $-294240236$. In both cases we find that the biggest of the $A_i$ as defined in Section 2.1 is $A_4 = 15$.

Then we enumerate all the possibilities and find one solution for the first equation, with $p = 2095$ and $q = 3$. For the second Thue equation we find no solutions. This solution of the Thue equation gives us the following points on the curve defined by the equation $Y^2 = X^3 - 6X - 14$:

$$(X, Y) = (5, \pm 9).$$

These are the only points we find for this $\alpha$. Thus we move on to the next possible value of $\alpha$.

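(iv) This is the case that $\alpha = \eta \pi_{41} \pi_{41}' = \frac{5 - 8\theta + 2\theta^2}{3} \cdot \frac{1 + 5\theta - 2\theta^2}{3} \cdot \frac{5 + 4\theta + \theta^2}{3}$. Unfortunately in this case we do not get any solutions.

(v) This is the last case that $\alpha = -\eta \pi_3 \pi_3' \pi_{41} \pi_{41}' = -\frac{5 - 8\theta + 2\theta^2}{3} \cdot \frac{5 + \theta - \theta^2}{3} \cdot \frac{1 + 2\theta + \theta^2}{3} \cdot \frac{1 + 5\theta - 2\theta^2}{3} \cdot \frac{5 + 4\theta + \theta^2}{3}$.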

This finishes the illustration of the method of finding integral points on the curve $Y^2 = X^3 - 6X - 14$ using the techniques for solving Thue equations. In the end we have determined that the only integral points on our curve are $(X, Y) = (5, \pm 9)$.
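The hand computation for the first case, $\alpha = 1$, can be replayed by a small search. The sketch below checks that the parametrization $gx = -3p^2 + 2pq - 3q^2$, $gy = 2pq$, $gz = 2q^2$ satisfies the conic $2xz + 3y^2 - 2yz + 3z^2 = 0$, and then looks for integer pairs $(p, q)$ with $4q(q^3 + pq^2 + 3p^2 q - 3p^3) = -g^2$ for $g \in \{1, 2, 3, 6\}$. The function names and the search box $|p|, |q| \leq 50$ are assumptions made for the illustration; the divisibility arguments in the text are what actually rule out larger values.

```python
from fractions import Fraction


def conic(x, y, z):
    """Numerator of Q1: 2xz + 3y^2 - 2yz + 3z^2."""
    return 2 * x * z + 3 * y * y - 2 * y * z + 3 * z * z


def parametrized(p, q):
    """(gx, gy, gz) for the base solution (x0, y0, z0) = (1, 0, 0)."""
    return -3 * p * p + 2 * p * q - 3 * q * q, 2 * p * q, 2 * q * q


def quartic(p, q):
    """Left-hand side 4q(q^3 + pq^2 + 3p^2 q - 3p^3) after substitution into Q2."""
    return 4 * q * (q ** 3 + p * q * q + 3 * p * p * q - 3 * p ** 3)


LIMIT = 50  # assumed search box
hits = []
for g in (1, 2, 3, 6):
    for p in range(-LIMIT, LIMIT + 1):
        for q in range(-LIMIT, LIMIT + 1):
            if quartic(p, q) == -g * g:
                xyz = tuple(Fraction(v, g) for v in parametrized(p, q))
                hits.append((g, p, q, xyz))

for g, p, q, xyz in hits:
    integral = all(v.denominator == 1 for v in xyz)
    print(g, p, q, [str(v) for v in xyz], "integral" if integral else "not integral")
```

Only $g = 6$ with $(p, q) = \pm(2, 1)$ shows up, and the corresponding $(x, y, z)$ is not integral, in line with the conclusion that the case $\alpha = 1$ gives no points.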

3.3. Program based on Thue equation method

The previous theory is used to create programs in SageMath. Before these programs were written, two functionalities that are used numerous times in the methods were implemented in separate helper functions called coefficient and remainder. Their functionality is straightforward: they give respectively the coefficient or the remainder of a polynomial with respect to a certain variable.

coefficient(5x^2+5xy+6y^2, x) --output--> 5x+5y
remainder(5x^2+5xy+6y^2, x) --output--> 6y^2
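The behaviour of the two helpers in the example above can be mimicked in plain Python. This sketch is based only on my reading of that example: `coefficient` keeps the terms divisible by the variable and divides them by it once, `remainder` keeps the terms free of the variable. The dictionary representation (exponent tuples mapped to coefficients) is hypothetical; the actual helpers in the appendix work on SageMath polynomials.

```python
def coefficient(poly, var):
    """Terms of poly that contain variable number var, divided once by that variable."""
    return {
        tuple(e - (i == var) for i, e in enumerate(mono)): coef
        for mono, coef in poly.items()
        if mono[var] > 0
    }


def remainder(poly, var):
    """Terms of poly that do not contain variable number var."""
    return {mono: coef for mono, coef in poly.items() if mono[var] == 0}


# 5x^2 + 5xy + 6y^2, with exponent tuples (degree in x, degree in y)
f = {(2, 0): 5, (1, 1): 5, (0, 2): 6}
X_VAR = 0

print(coefficient(f, X_VAR))  # represents 5x + 5y
print(remainder(f, X_VAR))    # represents 6y^2
```

By construction, multiplying the coefficient part by the variable and adding the remainder part gives back the original polynomial.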

The rest of the programs are less trivial. In this section their purpose is explained in the order in which they are run. The code of these programs can be found in the Appendix.

The main executing program, called FinalProgramMethodOne, takes as input the three coefficients $a, b, c$ of the equation $Y^2 = X^3 + aX^2 + bX + c$ and outputs all integral points on this curve. The output consists of points of the form $(X, Y)$ and $(X, -Y)$ such that $Y = \sqrt{X^3 + a \cdot X^2 + b \cdot X + c}$. Since we do not consider polynomials with a zero discriminant, the program gives an (implemented) error if the input represents such a polynomial. This program requires the help of programSubstitutingGxGyGz, which gives all integral solutions for $X$, and of programone, which requires a little more explanation.
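To illustrate only the input and output behaviour described here, the following sketch performs a naive bounded search. It is explicitly not the method of FinalProgramMethodOne: a bounded search cannot prove that the list of points is complete, which is exactly what the Thue equation method provides. The function name and the parameter `bound` are hypothetical.

```python
from math import isqrt


def integral_points_naive(a, b, c, bound):
    """Integral points (X, Y) on Y^2 = X^3 + aX^2 + bX + c with |X| <= bound."""
    disc = a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c
    if disc == 0:
        raise ValueError("X^3 + aX^2 + bX + c has a multiple root")
    points = []
    for X in range(-bound, bound + 1):
        rhs = X ** 3 + a * X * X + b * X + c
        if rhs < 0:
            continue
        Y = isqrt(rhs)
        if Y * Y == rhs:           # rhs is a perfect square
            points.append((X, Y))
            if Y:
                points.append((X, -Y))
    return points


print(integral_points_naive(0, -6, -14, bound=20))
```

For the curve $Y^2 = X^3 - 6X - 14$ the search in this small range returns the points $(5, \pm 9)$.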

The program programone analyses the polynomial $X^3 + a \cdot X^2 + b \cdot X + c$, given by the input variables $(a, b, c)$. One of the output variables of this program is the list Z, which consists of all possible values of $\alpha$ that should be tested, as explained in the example in the previous section. Another output variable is the polynomial F1 in $x, y, z$, which can be found in (3.1) in the previous section. We are interested in all integral triples $(x, y, z)$ that solve the system of equations (3.3) and (3.4), since these triples give us the $X$-coordinate of an integral point on our curve.

At the start of this chapter we have seen how, given any number $\alpha$ and polynomial $f$, the polynomial coefficients of $1$, $\theta$ and $\theta^2$ are calculated. This process is implemented in the auxiliary program programForQ1Q2, where the coefficients are given respectively by the polynomials Q3, Q2 and Q1. We run this program for all elements of the list Z that we got from programone. The input of programForQ1Q2 is the polynomial $z \cdot F1$ (where $z$ is an element of Z, i.e. a value of $\alpha$, and F1 is the second output of programone, the polynomial in terms of $x, y, z$). The output polynomials are important, because when
