Height Bounds for Mordell Equations Using Modularity

MSc Mathematics

Master's Thesis


Author: Josha Box

Supervisor: dr. S.R. Dahmen

Examination date: June 14, 2017

Korteweg-de Vries Institute for Mathematics


Abstract

We describe a method introduced by Von Känel and Murty-Pasten for determining height bounds for certain Diophantine equations using the modularity theorem. This theorem is used to derive an upper bound for the Faltings height of an elliptic curve in terms of its conductor. This is applied to Mordell equations, for which we slightly improve on a result of Matschke and Von Känel to obtain the current best height bound. As a corollary we obtain that for each $(x, y) \in \mathbb{Z}^2$ such that $y^2 = x^3 + a$ (with $a \in \mathbb{Z}$ fixed), we have $\log |x| \leq 1310|a| \log(1728|a|)$. This bound for $|x|$ is exponential in $|a|$, but significantly smaller than height bounds obtained previously using logarithmic forms. Prior to proving this, we describe the conductor of an elliptic curve, Néron models of elliptic curves, Tate’s algorithm, the Eichler-Shimura relation and the Faltings height of an elliptic curve in detail. Along the way we provide a detailed proof of a theorem of Igusa.

Title: Height Bounds for Mordell Equations Using Modularity

Author: Josha Box, joshabox@msn.com, 10206140

Supervisor: dr. S.R. Dahmen

Second Examiner: dr. A. Kret

Examination date: June 14, 2017

Korteweg-de Vries Institute for Mathematics, University of Amsterdam

Science Park 105-107, 1098 XG Amsterdam

http://kdvi.uva.nl


Introduction

Diophantine equations

What do we see when we look at the integers

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, . . .?

Which patterns arise and which do not? What is unusual? And why? These are the kinds of questions number theorists attempt to answer. For example, consider the consecutive numbers 8 and 9: $8 = 2^3$ is a cube and $9 = 3^2$ is a square. Does it happen more often that the difference between a square and a cube is only 1? If so, when exactly? Does it happen an infinite number of times or not? Such questions can be formulated in terms of symbols as follows: which numbers $x$ and $y$ satisfy the equation

$$y^2 = x^3 + 1?$$

This is an example of a Diophantine equation, named after the Hellenistic mathematician Diophantus of Alexandria, who studied similar kinds of equations in the third century AD. To be precise, a Diophantine equation is a polynomial or exponential equation in multiple variables with whole number coefficients for which we seek whole number solutions. Diophantus wrote thirteen books, called the Arithmetica, containing such equations and their solutions, which inspired mathematicians for centuries. Mathematicians often find it convenient to think in terms of equations, but bear in mind that each Diophantine equation formulates a very basic question about the integers.

Ever since Diophantus, Diophantine equations have been studied extensively by mathematicians from all over the world, one of whom was Louis Mordell (1888-1972). This British mathematician studied Diophantine equations of the kind

$$y^2 = x^3 + a,$$

where $a$ is a fixed integer, such as the example from before where $a = 1$. Another example is $y^2 = x^3 + 5$, i.e. $a = 5$. If we also consider negative numbers, we see that $x = -1$ and $y = \pm 2$ solve this equation: $2^2 = (-1)^3 + 5$, as shown in Figure 0.1. Are these the only solutions? In 1923, Mordell partially answered this question: he showed that for each possible value of $a$, there are only finitely many integral solutions $(x, y)$ [42] [43]. Consequently, such equations are now called Mordell equations and the constant $a$ is referred to as the Mordell constant. The next question is, of course: for any given $a$, exactly which integers $x$ and $y$ satisfy $y^2 = x^3 + a$? An algorithm to find all solutions in theory was first discovered by Baker [4] in 1968, after which Masser [34] and Zagier [68] introduced practical approaches that allowed them to find all solutions for many values of $a$. Then in 1998, Pethő, Zimmer, Gebel and Herrmann [48] found all solutions to the Mordell equations $y^2 = x^3 + a$ for each fixed $a$ of size $|a| \leq 10{,}000$. At that moment, one could say the net around Mordell equations had closed: they held no more mysteries for us.

This, however, did not keep Von Känel [63] and Murty and Pasten [45] from considering Mordell equations again in recent years, after finding a new method for solving multiple types of Diophantine equations. Matschke and Von Känel [35] then showed that this new method could be used to solve Mordell equations with larger constants $a$ than before, in less time. In this thesis we describe their new method in detail, while using Mordell equations to illustrate the theory.


The modularity theorem

The main ingredient in the approach of Von Känel and Murty and Pasten is the modularity theorem, which deserves special mention. It was in the margin of one of Diophantus’ books that Pierre de Fermat in the early seventeenth century wrote his famous words:

“It is impossible to separate a cube into two cubes, or a fourth power into two fourth powers, or in general, any power higher than the second, into two like powers. I have discovered a truly marvelous proof of this, which this margin is too narrow to contain.”

In other words, he claimed to have proved that for each integer $n > 2$, there are no positive integers $x$, $y$ and $z$ such that $x^n + y^n = z^n$. This became known as “Fermat’s Last Theorem”, because it was the last of Fermat’s claims to remain unproven. Since Fermat’s claim, mathematicians have attempted to find his “marvelous proof”, without any luck; the problem became known as one of the hardest problems in mathematics. Finally, in 1994, Andrew Wiles proved Fermat’s Last Theorem, using, however, modern methods unknown to Fermat at the time. In fact, Wiles had not been working with this equation at all; instead, he proved an important case of the modularity theorem. Before Wiles, it was known that it was sufficient to prove Fermat’s Last Theorem for exponents $n > 2$ that are prime numbers. In 1974, Hellegouarch [23] came up with the idea to associate to a hypothetical triple of positive integers $x, y, z$ such that $x^p + y^p = z^p$, with $p > 2$ prime, the elliptic curve

$$E_{(x,y,z)} : Y^2 = X(X - x^p)(X + y^p).$$

An elliptic curve as above is the geometric object determined by all points $X$ and $Y$ (not only integers) satisfying the equation; an example of an elliptic curve is shown in Figure 0.1. Then, after a suggestion of Frey [20] in 1986, Serre [53] and Ribet [50] proved that such an elliptic curve $E_{(x,y,z)}$ had unusual properties, which would contradict a conjecture made by Taniyama and Shimura. This conjecture says that there is an explicit relation between elliptic curves on one hand and certain types of modular forms on the other. These modular forms are holomorphic functions on the upper half-plane of the complex numbers with convenient properties. Wiles (partially) proved this conjecture, which became known as the modularity theorem. Next to proving Fermat’s Last Theorem, the modularity theorem has had far-reaching implications in mathematics: it was one of the first successes of the Langlands programme, which attempts to relate two apparently dissimilar fields of mathematics. For more about this programme, see [5].

Moreover, this idea of associating an elliptic curve to a (hypothetical) solution of a Diophantine equation and applying the modularity theorem turned out to be successful in many more cases; now this widely used technique is often called the modular approach. The elliptic curves associated to solutions are known as Frey-Hellegouarch curves or just Frey curves. Von Känel and Murty and Pasten were the first to apply the modular approach to Mordell equations. Though it was probably “common knowledge” that this was possible, these three authors noticed that the Frey curve corresponding to a Mordell equation had the particularly convenient property that one can find an upper bound for its conductor that is independent of the hypothetical solution itself, but depends only on the constant $a$. This means that Ribet’s level lowering theorem, commonly used in the modular approach, is not necessary. Moreover, this makes it possible to find height bounds for solutions to Mordell’s equation, as we describe in more detail in the next section.

Height bounds

From now on throughout this thesis, let $S$ be a finite set of rational prime numbers, $N_S := \prod_{p \in S} p$ and $\mathcal{O} := \mathbb{Z}[1/N_S]$. Consider $a \in \mathcal{O} \setminus \{0\}$. In this thesis, we study pairs $(x, y) \in \mathcal{O}^2$ satisfying

$$y^2 = x^3 + a, \qquad (0.1)$$

i.e. solutions to the generalised Mordell equation over $\mathcal{O}$, with a particular interest in the special case $\mathcal{O} = \mathbb{Z}$. The Weierstrass equation $t^2 = s^3 + xs + y$ with $x, y \in \mathbb{Q}$ and $s, t$ indeterminates has discriminant $\Delta = -27y^2 - 4x^3$ and this equation is ostensibly similar to the Mordell equation $a = y^2 - x^3$. Indeed, tweaking the coefficients in the Weierstrass equation yields a discriminant that is a constant multiple of $y^2 - x^3$: the equation

$$E_{(x,y)} : t^2 = s^3 - 27xs - 54y \quad \text{with } s, t \text{ indeterminates} \qquad (0.2)$$

has discriminant $-3^9 \cdot 2^6 \cdot a$. Denote by $E_{(x,y)}$ also the elliptic curve defined by this Weierstrass equation.
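The discriminant claimed for (0.2) can be checked numerically. The following Python sketch assumes the standard normalisation $-16(4A^3 + 27B^2)$ for the discriminant of $t^2 = s^3 + As + B$; the function names and the sample points are illustrative choices and do not come from the text.

```python
def short_weierstrass_discriminant(A, B):
    """Discriminant of t^2 = s^3 + A*s + B in the standard normalisation."""
    return -16 * (4 * A**3 + 27 * B**2)


def frey_curve_discriminant(x, y):
    """Discriminant of E_(x,y): t^2 = s^3 - 27*x*s - 54*y."""
    return short_weierstrass_discriminant(-27 * x, -54 * y)


# Sample integral points (x, y); each one solves y^2 = x^3 + a for a = y^2 - x^3.
samples = [(2, 3), (-1, 2), (3, 5), (0, 1)]
for x, y in samples:
    a = y**2 - x**3
    # The discriminant depends on (x, y) only through a.
    assert frey_curve_discriminant(x, y) == -(2**6) * (3**9) * a
```

For every sample the discriminant equals $-2^6 \cdot 3^9 \cdot a$, so it depends on the solution only through the Mordell constant.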

We thus obtain a map

$$\{(x, y) \in \mathcal{O}^2 \mid y^2 = x^3 + a\} \longrightarrow \{\text{elliptic curves over } \mathbb{Q}\}, \qquad (x, y) \mapsto E_{(x,y)},$$

illustrating how Mordell equations are amenable to the modular approach. This modular approach leads to the algorithm for solving Mordell equations described in Section 4.1 and Appendix A. However, we focus on a slimmed down modular approach using height bounds, which turns out to be more successful (see the end of Section 4.7 and [35]).

When $x = a/b \in \mathbb{Q}^\times$ with $a, b \in \mathbb{Z}$ coprime, the height of $x$ is defined to be

$$h(x) := \log \max\{|a|, |b|\}. \qquad (0.3)$$

When $x \in \mathbb{Z} \setminus \{0\}$, note that $h(x) = \log |x|$. Unlike the absolute value of $x$, the height of $x$ has the crucial property that there are only finitely many $x \in \mathbb{Q}$ of bounded height, as one easily verifies. Therefore, if one can show that all solutions to Mordell’s equation must have their height bounded from above by some number, then one has in theory an algorithm for computing all solutions by verifying all possibilities below this bound. However (cf. (0.4) and Example 4.7.3), such height bounds are usually far too large to directly yield a practical algorithm for solving the equation. Yet, as described below, height bounds are essential to obtaining efficient practical algorithms for solving the equations. Furthermore, height bounds for Diophantine equations are of theoretical interest because they allow for a comparison of different techniques.
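To make the "verify all possibilities below the bound" idea concrete, here is a minimal Python sketch of the height (0.3) and of a naive search for integral solutions. The function names and the search bound are hypothetical; the bound used in the example is a small illustrative value, far below any proven height bound.

```python
from fractions import Fraction
from math import isqrt, log


def height(x):
    """Logarithmic height h(a/b) = log max(|a|, |b|) of a nonzero rational."""
    q = Fraction(x)  # Fraction keeps numerator and denominator coprime
    return log(max(abs(q.numerator), abs(q.denominator)))


def integral_solutions(a, x_bound):
    """All (x, y) in Z^2 with y^2 = x^3 + a and |x| <= x_bound."""
    solutions = []
    for x in range(-x_bound, x_bound + 1):
        rhs = x**3 + a
        if rhs < 0:
            continue  # a negative number is never a square
        y = isqrt(rhs)
        if y * y == rhs:
            solutions.append((x, y))
            if y != 0:
                solutions.append((x, -y))
    return solutions


print(integral_solutions(5, 10))  # the two solutions with x = -1 mentioned earlier
```

A search of this kind is only feasible for tiny bounds, which is why the reduction techniques discussed next matter in practice.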

Traditionally, height bounds were obtained via a method of Baker [4] that uses the theory of logarithmic forms. De Weger and Tzanakis [62] successfully applied such bounds to find an efficient algorithm for solving Thue equations and later, Masser [34] and Zagier [68] did the same for elliptic curves such as the Mordell equation. The approach of the latter, often referred to as the “elliptic logarithm reduction process”, allows a substantial reduction of the initial height bound to a bound $N_1$, such that all points of height greater than $N_1$ are exceptional (cf. [35, Definition 11.14]) and can be efficiently computed separately.

Recently, Murty and Pasten [45] and Von Känel [63] independently obtained stronger height bounds for multiple types of Diophantine equations using the modularity theorem instead of logarithmic forms. To be precise, they related the height of a solution to the Faltings height of its corresponding elliptic curve (see Section 4.2 for the definition of the Faltings height) and used the modularity theorem to find an upper bound for the Faltings height of that elliptic curve. Later, in 2016, Matschke and Von Känel [35] optimised these height bounds for many kinds of equations, including Mordell, S-unit, Ramanujan-Nagell and cubic Thue equations. Moreover, they used a geometric interpretation to improve Zagier’s reduction process. Using their “elliptic logarithm sieve”, Matschke and Von Känel were able to compute solutions more efficiently than before, taking as input a given height bound. For example, they computed all solutions to $y^2 = x^3 - 2520963512$ over $\mathcal{O} = \mathbb{Z}[1/(2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19 \cdot 23 \cdot 29)]$ in less than 23 seconds (see [35, p. 53]). We note that their method does involve the computation of a Mordell-Weil basis. This is almost always possible in practice, but no algorithm for this has yet been proven to be effective.

In this thesis, we study the approach of Von Känel [63] and Murty and Pasten [45] for obtaining height bounds, including the improvements to these made by Matschke and Von Känel [35], all of which we apply to Mordell equations. Moreover, we make two minor improvements (see Remark 4.3.5 and Theorem 2.4.4) on the results of the latter pair for Mordell equations, leading to the best currently known height bound for Mordell equations in the case $\mathcal{O} = \mathbb{Z}$. This result is presented in Theorem 4.7.2, a slightly weakened version of which says that for all $(x, y) \in \mathbb{Z}^2$ with $x \neq 0$ solving (0.1) with $a \in \mathbb{Z}$, we have

log |x| ≤ 1310|a| log(1728|a|). (0.4)

By comparison, as described in [35, (4.5)], the best known upper bound for log |x| using the theory of logarithmic forms is of order |a|2(log |a|)10, due to Pethő, Zimmer, Gebel and Herrmann [48].

Overview

The main results of this thesis are Theorems 4.7.1 and 4.7.2, which give explicit height bounds for elliptic curves and the solutions to Mordell equations respectively. The road to get there is a long one, however, because the theory needed to understand the proof is studied first.

As mentioned before, these height bounds are obtained using the modularity theorem. In order to formulate this theorem, one needs the notion of the conductor of an elliptic curve. In Chapter 1 we define the conductor in a coherent way and study its properties. In particular, we show that the conductor is a rational integer and we provide an upper bound for its local description at the primes 2 and 3.

Then in Chapter 2 we introduce Néron models of elliptic curves. Néron models are essential in the proof of Tate’s algorithm, which makes it possible to compute the conductor of an elliptic curve. Néron models also play a role in Section 4.2 when defining the Faltings height. We finish Chapter 2 by applying Tate’s algorithm to obtain an upper bound for the conductor of the elliptic curve associated to a solution of the Mordell equation, in terms of the Mordell constant.

In Chapter 3 we study the modularity theorem in more detail. It is completely beyond reach to prove the theorem in a master’s thesis, so we focus on one direction of the bijection, also called the Eichler-Shimura construction. We prove the Eichler-Shimura relation, which allows us to describe the properties of this construction. In the process, a detailed proof of Igusa’s theorem is presented.

Finally, in Chapter 4 we reap the fruits of our labour. First, we explain a standard modular approach for solving Mordell equations in Section 4.1. Then we derive a technique to obtain height bounds for Diophantine equations via the modularity theorem. In particular, we compute explicit height bounds for Mordell equations. In order to achieve this, we introduce the Faltings height of an elliptic curve in Section 4.2 and establish its main properties, before proving that the modular degree of a rational newform divides its congruence number in Section 4.5.

An effort has been made to keep this thesis relatively self-contained, but the reader may benefit from being familiar with (some of) the basic theory of elliptic curves (such as in Silverman [55]), modular forms (such as the first five chapters in Diamond and Shurman [14]), algebraic geometry (such as the first three chapters in Hartshorne [22]) and algebraic number theory (such as the first four chapters in Serre [52] and the first three chapters in Stewart and Tall [59]).

Acknowledgements

I would like to thank Rob Tijdeman for suggesting the topic of this thesis, Peter Bruin for taking the time to discuss the proof of Igusa’s theorem and Arno Kret for agreeing to read my lengthy work. Most of all, I am grateful to Sander Dahmen for many interesting conversations and for encouraging me to pursue the questions I encountered.


1. The conductor of an elliptic curve

In this chapter we define the conductor of an elliptic curve and establish its main properties. The book of Silverman [55] is followed for the standard theory of elliptic curves, but for the definition of the conductor, we deviate from this book. Instead, we take a more representation-theoretic viewpoint to define the conductor in a more unified way. A large part of this section is dedicated to proving that the conductor is a well-defined rational integer, for which multiple sources, such as the books of Serre [54] [52] and of Artin and Tate [2], have been combined.

1.1. Good and bad reduction

Because the conductor of an elliptic curve measures the type of reductions of the curve at the primes, the different types of reduction are defined first. Consider an elliptic curve $E$ over a global number field $K$. We use the standard notation, such as in Silverman [55, III.1], to denote the invariants associated to $E$. Denote by $\mathcal{O}_K$ the ring of integers of $K$, by $K_{\mathfrak p}$ the completion of $K$ at the prime $\mathfrak p$ and by $k(\mathfrak p)$ the residue field of $K_{\mathfrak p}$. Let $\widetilde{E}_{\mathfrak p}$ be the reduction (at $\mathfrak p$) of $E/K_{\mathfrak p}$, as defined in Silverman [55, VII.2]. Recall that the point at infinity is never singular.

Definition 1.1.1. We say that $E/K_{\mathfrak p}$ has good reduction when $\widetilde{E}_{\mathfrak p}$ is a non-singular curve over $k(\mathfrak p)$. Else, $E/K_{\mathfrak p}$ has bad reduction. Here we distinguish two cases. If $P = (x_0 : y_0 : 1)$ is the (necessarily unique) singular point and $f(x, y)$ the defining equation of $\widetilde{E}_{\mathfrak p}$, write

$$f(x, y) - f(x_0, y_0) = (y - y_0 - \alpha(x - x_0))(y - y_0 - \beta(x - x_0)) + (x - x_0)^3.$$

If $\alpha \neq \beta$, we say $E/K_{\mathfrak p}$ has multiplicative reduction. Else, $E/K_{\mathfrak p}$ has additive reduction. When the reduction type is not additive, we say $E/K_{\mathfrak p}$ has semi-stable reduction. The reduction type of $E/K$ at $\mathfrak p$ is defined as the reduction type of $E/K_{\mathfrak p}$.

The reduction type at $\mathfrak p$ can be determined by inspecting the discriminant and the $c_4$-invariant of a minimal Weierstrass equation of $E$ over $K_{\mathfrak p}$. This minimal discriminant is denoted $\Delta(E/K_{\mathfrak p})$. It holds that $E$ has good reduction at $\mathfrak p$ if and only if the minimal discriminant $\Delta(E/K_{\mathfrak p}) \notin \mathfrak p$. Moreover, if $\widetilde{E}_{\mathfrak p}$ is singular, the reduction is multiplicative if and only if $c_4 \notin \mathfrak p$, where $c_4$ is the $c_4$-invariant of a minimal Weierstrass equation of $E$ over $K_{\mathfrak p}$. Both statements can be proved by an explicit computation; see Silverman [55, III Proposition 1.4].
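The criterion above can be illustrated for $K = \mathbb{Q}$ with a short Python sketch. It is restricted to a short Weierstrass equation $t^2 = s^3 + As + B$ with integer coefficients and to primes $p \geq 5$, and it assumes that the given equation is minimal at $p$, which is guaranteed when $v_p(\Delta) < 12$. The function names are hypothetical; the formulas $\Delta = -16(4A^3 + 27B^2)$ and $c_4 = -48A$ are the standard ones for this shape of equation.

```python
def valuation(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def reduction_type(A, B, p):
    """Reduction type at a prime p >= 5 of t^2 = s^3 + A*s + B (A, B integers).

    Assumption: the equation is minimal at p; we only accept inputs with
    v_p(discriminant) < 12, where minimality is automatic.
    """
    if p < 5:
        raise ValueError("this sketch only treats primes p >= 5")
    disc = -16 * (4 * A**3 + 27 * B**2)
    c4 = -48 * A
    if disc == 0:
        raise ValueError("singular equation: not an elliptic curve")
    if valuation(disc, p) >= 12:
        raise ValueError("equation may not be minimal at p")
    if disc % p != 0:
        return "good"
    # Bad reduction: multiplicative exactly when p does not divide c4.
    return "multiplicative" if c4 % p != 0 else "additive"
```

For instance, under these assumptions the Mordell curve $t^2 = s^3 + 5$ has additive reduction at 5, since $c_4 = 0$ for every curve of the form $t^2 = s^3 + B$.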

1.2. The Tate module

The conductor should be a measurement of the reduction types at the various primes. It is therefore natural to define the conductor as a product of the local data:

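$$N_{E/K} = \prod_{\mathfrak p \neq (0)} \mathfrak p^{f(E/K_{\mathfrak p})}, \qquad (1.1)$$

where each exponent $f(E/K_{\mathfrak p})$ depends only on $E/K_{\mathfrak p}$. Therefore we let $K$ be a finite field extension of $\mathbb{Q}_p$ for now, with residue field $k$. Let $E$ be an elliptic curve over $K$. There is an obvious action of the absolute Galois group $\operatorname{Gal}(\overline{K}/K) \to \operatorname{Aut}(E(\overline{K}))$.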

It is desirable to turn this into a representation over a ring of characteristic zero. By a representation of a group $G$ over a ring $R$, we mean a homomorphism $G \to \mathrm{GL}(V)$, where $V$ is a free $R$-module. Furthermore, if $G$ is a Galois group and $V$ a topological space, the representation is called a Galois representation when it is continuous with respect to the profinite topology on $G$. For each positive integer $m$, let $E[m]$ denote the $m$-torsion points of $E(\overline{K})$. If $\ell \neq p$ is a prime number, we know that $E[\ell^n]$ is a $\mathbb{Z}/\ell^n\mathbb{Z}$-module for every $n$. In fact, from the theory of isogenies, we know exactly what kind of module.

Proposition 1.2.1. Suppose $K$ is a field and $m \in \mathbb{N}$ such that $\operatorname{char}(K) \nmid m$. Then there exists a group isomorphism $E[m] \simeq \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$.

Proof. See [55, III Proposition 6.4(b)].

Definition 1.2.2. For each prime $\ell$, the $\ell$-adic Tate module of $E$ is defined as

$$T_\ell(E) = \varprojlim_n E[\ell^n],$$

where the inverse limit is defined by the multiplication-by-$\ell$ maps $E[\ell^n] \to E[\ell^{n-1}]$.

The Tate module naturally has the structure of a free $\mathbb{Z}_\ell$-module of rank 2 and the Galois action on $E[\ell^n]$ becomes a Galois representation

$$\rho_{E,\ell} : \operatorname{Gal}(\overline{K}/K) \to \mathrm{GL}(T_\ell(E)).$$

We might as well extend scalars to $\mathbb{Q}_\ell$ and consider the vector space $V_\ell(E) = T_\ell(E) \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$. Let $L = K(E[\ell])$ be the extension of $K$ defined by adding the $x$- and $y$-coordinates of points $(x : y : 1) \in E[\ell]$ on the affine chart. This is a finite field extension of $K$ because $E[\ell]$ is finite. Moreover, because $E[\ell]$ consists of the torsion points over $\overline{K}$, the coordinates of the points of $E[\ell]$ are defined as all the roots of a number of equations on the coordinates, making $L/K$ Galois.

The reduction type is closely related to the Galois action on the Tate module. Let $I(\overline{K}/K)$ be the profinite limit of the inertia groups $I(L/K)$ of $L/K$, where $L/K$ are finite Galois extensions.

Theorem 1.2.3 (Criterion of Néron-Ogg-Shafarevich). Let $E$ be an elliptic curve over a finite field extension $K$ of $\mathbb{Q}_p$. Then the following are equivalent:

(i) $E$ has good reduction,

(ii) the extension $K(E[m])/K$ is unramified for all $m$ coprime to $\operatorname{char}(k)$ and

(iii) the inertia group $I(\overline{K}/K)$ acts trivially on $T_\ell(E)$ for every $\ell \neq \operatorname{char}(k)$.

Proof. See [55, VII Theorem 7.1] for the full proof; here we sketch the connection between the three notions. When $E$ has good reduction, we obtain an action of $\operatorname{Gal}(\overline{K}/K)$ on $T_\ell(\widetilde{E})$, where $\widetilde{E}$ denotes the reduction of $E$, on which $I(\overline{K}/K)$ acts trivially by definition. However, the map $T_\ell(E) \to T_\ell(\widetilde{E})$ is an isomorphism when $\ell \neq \operatorname{char}(k)$, proving (iii). Also, if the inertia group acts trivially on $T_\ell(E)$, it leaves $E[\ell^n]$ fixed and hence $K(E[\ell^n])/K$ is unramified.

From this theorem, we see that it makes sense to define the conductor as an invariant of the Tate module representation in order to measure the reduction type.


1.3. The Artin conductor of a Galois representation

Let $p$ be a prime number. In this section, we consider a finite field extension $K$ of $\mathbb{Q}_p$, a finite Galois extension $L/K$ and a representation $\rho : \operatorname{Gal}(L/K) \to \mathrm{GL}(V)$, where $V$ is a finite-dimensional vector space over a field $k$ of characteristic zero. When $\lambda : G \to \mathrm{GL}(W)$ is a representation over a ring $R$, the character of $\lambda$ is the composition of $\lambda$ with the trace map $\mathrm{GL}(W) \to R$. We denote by $\chi$ the character of $\rho$. Furthermore, we define $G := \operatorname{Gal}(L/K)$, $f_{L/K}$ to be the residue field degree of $L/K$ and $v_F$, $\mathcal{O}_F$ and $\mathfrak{m}_F$ to be, respectively, the (normalised) valuation, ring of integers and maximal ideal of a finite field extension $F$ of $\mathbb{Q}_p$.

From the Néron-Ogg-Shafarevich theorem, we see that the reduction type is related to ramification in the extensions $K(E[m])/K$. We recall the definition of the ramification groups, which are a measure of ramification. For each integer $i \geq 0$, let

$$G_i := \ker\big(G \to \operatorname{Aut}(\mathcal{O}_L/\mathfrak{m}_L^{i+1})\big)$$

be the $i$-th lower numbering ramification group. An important property of these groups is that $G_1$ is a $p$-group and that $G_i/G_{i+1}$ is a direct product of cyclic groups of order $p$ for each $i \geq 1$ (see [52, p. 67]). For $t \in (-1, \infty)$, we define $G_t := G_{\lceil t \rceil}$. Consider the function $\varphi(u) = \int_0^u g_t/g_0 \, dt$, where $g_s := \#G_s$ for each $s$. This is a bijection $(-1, \infty) \to (-1, \infty)$ (see [52, p. 73]) and we define $G^t := G_{\varphi^{-1}(t)}$ to be the $t$-th upper numbering ramification group. Unlike the lower numbered ones, these ramification groups are compatible with quotients, i.e. $(G/H)^t = \operatorname{Im}(G^t \to G/H)$ for each $t$ and each normal subgroup $H \subset G$ (see [52, p. 74]).
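As a small illustration of the passage from lower to upper numbering, the following Python sketch evaluates $\varphi$ at integers from the orders $g_i$ and lists the upper numbering jumps $\varphi(i)$ at the indices $i$ with $G_i \neq G_{i+1}$. The function names are hypothetical, and the sample orders $4, 4, 2, 2$ are an assumption: they are taken to be the ramification group orders of $\mathbb{Q}_2(\zeta_8)/\mathbb{Q}_2$, as in the standard cyclotomic example in [52, IV §4].

```python
from fractions import Fraction


def herbrand_phi(orders, u):
    """phi(u), the integral of g_t / g_0 over [0, u], for an integer u >= 0.

    orders[i] is g_i = #G_i; groups past the end of the list are trivial.
    Because G_t = G_ceil(t), the integrand equals g_i / g_0 on (i - 1, i].
    """
    def g(i):
        return orders[i] if i < len(orders) else 1

    return sum((Fraction(g(i), orders[0]) for i in range(1, u + 1)), Fraction(0))


def upper_jumps(orders):
    """Values phi(i) at the indices i >= 0 where G_i differs from G_{i+1}."""
    padded = list(orders) + [1]
    return [herbrand_phi(orders, i)
            for i in range(len(orders)) if padded[i] != padded[i + 1]]


# Assumed example: lower numbering orders g_0, ..., g_3 for Q_2(zeta_8)/Q_2.
print(upper_jumps([4, 4, 2, 2]))
```

Under the stated assumption the lower numbering jumps at 1 and 3 become upper numbering jumps at 1 and 2, which are integers, as one expects for an abelian extension.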

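Next, consider $\pi_L \in \mathcal{O}_L$ such that $\mathcal{O}_L = \mathcal{O}_K[\pi_L]$ (cf. [52, III Proposition 12]). For each $\sigma \in G$, define $i_{L/K}(\sigma) = v_L(\sigma(\pi_L) - \pi_L)$. We then define

$$\mathrm{Ar}_{L/K}(\sigma) = -f_{L/K}\, i_{L/K}(\sigma) \ \text{ if } \sigma \neq 1 \quad \text{and} \quad \mathrm{Ar}_{L/K}(1) = f_{L/K} \sum_{\sigma \neq 1} i_{L/K}(\sigma).$$

This is a class function, meaning that $\mathrm{Ar}_{L/K}(\tau^{-1}\sigma\tau) = \mathrm{Ar}_{L/K}(\sigma)$ for all $\sigma, \tau \in \operatorname{Gal}(L/K)$. Indeed, note that

$$i_{L/K}(\tau^{-1}\sigma\tau) = v_L\big(\tau^{-1}(\sigma\tau(\pi_L) - \tau(\pi_L))\big) = v_L(\sigma(\tau(\pi_L)) - \tau(\pi_L)).$$

Note that $G_i = \{\sigma \in G \mid i_{L/K}(\sigma) \geq i + 1\}$, so we see that $i_{L/K}$ is indeed independent of the chosen element $\pi_L$ such that $\mathcal{O}_L = \mathcal{O}_K[\pi_L]$. It follows that $v_L(\sigma(\tau(\pi_L)) - \tau(\pi_L)) = i_{L/K}(\sigma)$, showing that $i_{L/K}$ and $\mathrm{Ar}_{L/K}$ are class functions.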

Definition 1.3.1. The conductor exponent of $\chi$ is defined to be

$$f(\chi) = (\chi, \mathrm{Ar}_{L/K}) := \frac{1}{g} \sum_{\sigma \in G} \chi(\sigma)\, \mathrm{Ar}_{L/K}(\sigma),$$

where $g = |\operatorname{Gal}(L/K)|$. Sometimes we write $f(\rho)$ instead.

Recall that the characters of irreducible representations of $G$ form an orthogonal basis for the space of class functions on $G$ with respect to the inner product $(\cdot, \cdot)$ (see [54, Chapter 2 Theorem 6]), so $f(\chi)$ is the coefficient of $\mathrm{Ar}_{L/K}$ for $\chi$ with respect to this basis. We first rewrite this definition in terms of the ramification groups. In what follows, we write $g_i = |G_i|$.

Lemma 1.3.2. We have

$$f(\chi) = \sum_{i=0}^{\infty} \frac{g_i}{g_0} \big(\chi(1) - \chi(G_i)\big) = \sum_{i=0}^{\infty} \frac{g_i}{g_0} \dim(V/V^{G_i}) = \int_{-1}^{\infty} \dim(V/V^{G^t}) \, dt, \qquad (1.2)$$

where $\chi(H) = \frac{1}{|H|} \sum_{\sigma \in H} \chi(\sigma)$.

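Proof. We split the sum defining $f(\chi)$ in parts where $\mathrm{Ar}_{L/K}$ is known:

$$f(\chi) = \frac{f_{L/K}\,\chi(1)}{g} \sum_{\sigma \neq 1} \sum_{i=0}^{\infty} 1_{\sigma \in G_i} - \frac{f_{L/K}}{g} \sum_{\sigma \in G_0 \setminus G_1} \chi(\sigma) - \frac{2 f_{L/K}}{g} \sum_{\sigma \in G_1 \setminus G_2} \chi(\sigma) - \ldots$$

$$= \frac{\chi(1)}{g_0} \sum_{i=0}^{\infty} (g_i - 1) - \frac{1}{g_0} \sum_{i=0}^{\infty} \Big( \sum_{\sigma \in G_i} \chi(\sigma) - \chi(1) \Big).$$

In the latter equality we have merely rearranged both double sums and used the fact that $f_{L/K}\, g_0 = [L : K]$. Now we can take the two sums together to obtain $f(\chi) = \sum_{i=0}^{\infty} \frac{g_i}{g_0} (\chi(1) - \chi(G_i))$. For the second equality, we remark that $\chi(1) = \dim V$ and that by [54, p. 56] we have

$$\chi(G_i) = (\chi|_{G_i}, \chi_{\mathrm{triv}}) = (\rho|_{G_i}, \rho_{\mathrm{triv}}) = \dim V^{G_i},$$

where $\chi_{\mathrm{triv}}$ and $\rho_{\mathrm{triv}}$ denote, respectively, the trivial character and the trivial representation and the inner product $(\lambda_1, \lambda_2)$ of two representations $\lambda_i : G \to \mathrm{GL}(V_i)$ ($i \in \{1, 2\}$) is defined as $\dim_k \operatorname{Hom}_G(V_1, V_2)$. Finally, the last equality in (1.2) follows from a simple substitution using the definition of the upper ramification groups.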
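The middle expression in (1.2) is easy to evaluate once the orders $g_i$ and the dimensions $\dim(V/V^{G_i})$ are known. The Python sketch below does this; the function name is hypothetical, and the sample data are an assumption: the orders $4, 4, 2, 2$ are taken to be those of the ramification groups of $\mathbb{Q}_2(\zeta_8)/\mathbb{Q}_2$ (cf. [52, IV §4]), and the two characters are degree 1 characters that are non-trivial on all of $G_0, \ldots, G_3$, respectively only on $G_0$ and $G_1$.

```python
from fractions import Fraction


def conductor_exponent(orders, codims):
    """f(chi) = sum over i of (g_i / g_0) * dim(V / V^{G_i}), as in (1.2).

    orders[i] = #G_i and codims[i] = dim(V / V^{G_i}); both lists may stop
    once G_i is trivial, since all later terms vanish.
    """
    g0 = orders[0]
    return sum(Fraction(g, g0) * c for g, c in zip(orders, codims))


# Assumed example: g_0, ..., g_3 = 4, 4, 2, 2 for Q_2(zeta_8)/Q_2.
orders = [4, 4, 2, 2]
print(conductor_exponent(orders, [1, 1, 1, 1]))  # character non-trivial on every G_i
print(conductor_exponent(orders, [1, 1, 0, 0]))  # character trivial on G_2 and G_3
```

Under these assumptions the two values are 3 and 2, matching the exponents of 2 in the conductors 8 and 4 of the corresponding Dirichlet characters, and both are integers although the individual summands $g_i/g_0$ are not.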

The lemma implies that f (χ) ∈ Q. In the end, we mean for f (χ) to be the exponent of a prime ideal, so f (χ) should be integral. Hence we focus on proving the following theorem.

Theorem 1.3.3 (Artin). The conductor exponent f (χ) is a non-negative integer.

The first term in the sum of (1.2) is integral, so we define the Swan conductor exponent as

$$\delta(\chi) := \sum_{i=1}^{\infty} \frac{g_i}{g_0} \dim(V/V^{G_i}) = f(\chi) - \dim(V/V^{G_0}).$$

To prove Theorem 1.3.3 it thus suffices to show integrality of the Swan conductor exponent. This turns out to be a generalisation of the Hasse-Arf Theorem ([52, p. 93]), which says that the jumps in the upper numbering ramification groups occur at integer values when the Galois group is abelian.

Let us first sketch why the Hasse-Arf Theorem follows from Theorem 1.3.3. Assume Theorem 1.3.3 holds true and that $G$ is abelian. We may and do assume that $G = G_0$. Let $f_0$ be the smallest $f > 0$ such that $G^f \neq G^{f - \epsilon}$ for all $\epsilon > 0$. As $G_0 \cong G_1 \rtimes G_0/G_1$ (see [52, IV Cor 4]) and $G_1/G^{f_0}$ is a direct product of cyclic groups of order $p$ (see [52, IV Corollary 3]), there exists a subgroup $H \subset G_0$ of index $p$ containing $G^{f_0}$. If $\chi$ is the character on $G$ defined by a non-trivial degree 1 character (the degree of a character is the dimension of the vector space corresponding to its representation) on $G/H$, then

$$f(\chi) = \int_{-1}^{f_0} 1 \, dt = 1 + f_0 \in \mathbb{Z},$$

so the position of the first jump $f_0 \in \mathbb{Z}$. Using this fact, we can inductively show that the other jumps occur at integer places as well, proving the Hasse-Arf theorem. Next, we use the Hasse-Arf theorem to prove Theorem 1.3.3.

Lemma 1.3.4. We have $f(\chi) \in \mathbb{Z}$ when $\chi$ is a degree 1 character.

Proof. We observe that

$$\dim(V/V^{G_i}) = \begin{cases} 0 & \text{if } G_i \subset \ker \chi \text{ and} \\ 1 & \text{otherwise.} \end{cases}$$

It follows from Lemma 1.3.2 that $f(\chi) = \inf\{a \geq -1 \mid G^a \subset \ker \chi\} + 1$. If $G$ were abelian, the Hasse-Arf theorem would tell us that the jumps of the upper ramification groups occur at integer values. But $\chi$ is a homomorphism to an abelian group, so $G/H$ is abelian, where $H = \ker \chi$. By definition $\chi|_{G^t}$ acts trivially on $V$ if and only if $G^t \subset H$. This happens if and only if $(G/H)^t$ is trivial, because (cf. [52, IV Proposition 14]) $(G/H)^t = \operatorname{Im}(G^t \to G/H)$. By the Hasse-Arf theorem, the jumps in $(G/H)^t$ occur at integer values, so $f(\chi) \in \mathbb{Z}$.

For a finite separable extension $F/K$, let $d_{F/K}$ be the different of $F/K$ as defined in (but not as denoted in) [52, p. 50]. The following observation will be used to reduce to the degree 1 case.

Proposition 1.3.5. For any subgroup $H$ of $G$ corresponding to the subfield $L/K'/K$ we have

$$\mathrm{Ar}_{L/K}|_H = f_{K'/K}\, v_{K'}(d_{K'/K})\, r_H + f_{K'/K}\, \mathrm{Ar}_{L/K'},$$

where $r_H$ denotes the character of the regular representation of $H$.

Proof. For $\sigma \neq 1$ the equality is easily verified when applied to $\sigma$. For $\sigma = 1$ the equality is a direct consequence of the transitivity of the different: $d_{L/K} = d_{K'/K}\, d_{L/K'}$ as ideals in $\mathcal{O}_L$. See [52, VI Proposition 4] for more details.

It follows using Frobenius reciprocity (cf. [54, Chapter 7 Theorem 13]) for a character $\chi$ of $H$ that

$$f(\chi^*) = (\chi^*, \mathrm{Ar}_{L/K}) = (\chi, \mathrm{Ar}_{L/K}|_H) = f_{K'/K}\, v_{K'}(d_{K'/K})\, \chi(1) + f_{K'/K}\, f(\chi), \qquad (1.3)$$

where $\chi^*$ is the induced character of $\chi$ on $G$. Hence, if $f(\chi)$ is an integer, so is $f(\chi^*)$. We would now like to write any character as a finite integral combination of induced degree 1 characters. If $k = \mathbb{C}$, Brauer showed this is always possible, which proves Theorem 1.3.3 in that case.

Theorem 1.3.6 (Brauer). If $G$ is a finite group and $\chi$ the character of a finite-dimensional representation of $G$ over $\mathbb{C}$, then we can write

$$\chi = \sum_{i=1}^{k} n_i \chi_i^*$$

with $n_i \in \mathbb{Z}$, where each $\chi_i^*$ is the induced character of a degree 1 character $\chi_i$ of a subgroup $H_i$ of $G$.

Proof. See [54, Chapter 10 Theorem 2].

However, for our application to the Tate module representation of an elliptic curve, we need a similar result with $\mathbb{C}$ replaced by $\mathbb{Q}_\ell$. To this end, we first use properties of the ramification groups to reduce the problem to $G_1$, which is a $p$-group. For $p$-groups, analogues of Brauer’s theorem do exist. First we recall the following fact about ramification groups.

Lemma 1.3.7. If $s \in G_0$ and $t \in G_i$ for $i \geq 1$ then $sts^{-1}t^{-1} \in G_{i+1}$ if and only if $s^i \in G_1$ or $t \in G_{i+1}$.

Proof. See [52, p. 69].

The following crucial lemma is taken from [2, p. 140].

Lemma 1.3.8. We have $e_0 \mid g_0 f(\chi)$, where $e_0 = g_0/g_1$.

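Proof. Consider a representative $\tau \in G_0$ of the generator of $G_0/G_1$ (this group is cyclic, cf. [52, p. 67]). Then for any $\sigma \in G_i \setminus G_{i+1}$ ($i \geq 1$) we have $\tau^k \sigma \tau^{-k} \notin G_{i+1}$ for each $k$ because $G_{i+1}$ is normal in $G_0$. Define an equivalence relation $\sim$ on $G_i \setminus G_{i+1}$ by $\sigma \sim \sigma'$ if and only if there exists $k$ such that $\sigma' = \tau^k \sigma \tau^{-k}$. Let $r_\sigma$ be the number of elements in the equivalence class of $\sigma$. Then clearly $r_\sigma$ is the least $k > 0$ such that $\tau^k \sigma \tau^{-k} = \sigma$. Hence

$$\tau^{r_\sigma} \sigma \tau^{-r_\sigma} \sigma^{-1} = 1 \in G_{i+1}$$

and since $\sigma \notin G_{i+1}$ we have $\tau^{i r_\sigma} \in G_1$ by the previous lemma. As $\tau$ was a generator of $G_0/G_1$ we see that $e_0 \mid i r_\sigma$.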

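Now as in the proof of Lemma 1.3.2, we rewrite the conductor in terms of the differences $G_i \setminus G_{i+1}$ to find

$$g_0 f(\chi) = \chi(1) g_0 - \sum_{\sigma \in G_0} \chi(\sigma) + \chi(1) \sum_{i=1}^{\infty} i (g_i - g_{i+1}) - \sum_{i=1}^{\infty} \sum_{\sigma \in G_i \setminus G_{i+1}} i \chi(\sigma). \qquad (1.4)$$

The first two terms, $\chi(1) g_0$ and $\sum_{\sigma \in G_0} \chi(\sigma) = g_0 \dim V^{G_0}$, are both divisible by $e_0$. For the third and the fourth term, we use that $\chi$ is a class function to find that

$$n_i := \sum_{\sigma \in G_i \setminus G_{i+1}} i \chi(\sigma) = \sum_{\sigma \in (G_i \setminus G_{i+1})/\sim} i r_\sigma \chi(\sigma).$$

As $e_0 \mid i r_\sigma$ we see that $n_i/e_0$ is an integral linear combination of the $\chi(\sigma)$’s, which are sums of roots of unity and hence integral over $\mathbb{Z}$. On the other hand, $\sum_{\sigma \in G_i \setminus G_{i+1}} \chi(\sigma) = g_i \dim V^{G_i} - g_{i+1} \dim V^{G_{i+1}} \in \mathbb{Z}$, so $n_i/e_0 \in \mathbb{Q}$ as well. It follows that $n_i/e_0 \in \mathbb{Z}$, so the fourth term in (1.4) is divisible by $e_0$. Lastly, note that taking $\chi$ to be the trivial character in the previous argument shows that the third term in (1.4) is divisible by $e_0$ as well.

By definition of the Swan conductor exponent, we have $\delta(\chi|_{G_1}) = e_0 \delta(\chi)$. We have already shown that $e_0 \mid g_0 \delta(\chi)$, so because $(g_1, e_0) = 1$ and $g_1 e_0 = g_0$ it suffices to show that $g_1 \mid g_0 \delta(\chi)$, or equivalently that $f(\chi|_{G_1}) \in \mathbb{Z}$. As $G_1$ is a $p$-group, this follows from the following theorem.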

Theorem 1.3.9. Suppose $G$ is a $p$-group and $\chi$ is the character of a representation over a field $k$ of characteristic unequal to $p$. Then there exists a subgroup $H \subset G$ and a degree 1 character $\chi'$ on $H$ taking values in $k$ such that $\chi$ is induced by $\chi'$.

Proof. See [2, p. 145].

This finishes the proof of Theorem 1.3.3: the conductor exponent is integral.

1.4. The conductor of an elliptic curve over a local field

Consider the notation from the previous section and let $E$ be an elliptic curve over $K$. We have defined the conductor of representations of finite Galois groups $G = \mathrm{Gal}(L/K)$. However, the conductor only depends on the inertia group $G_0 = I(L/K)$, i.e. $f(\chi) = f(\chi|_{G_0})$, as can be seen from Lemma 1.3.2. Therefore, in order to define the conductor of an elliptic curve $E/K$ via its Tate module representation $\rho_{E,\ell}$, we wonder whether there exists a finite Galois extension $F/K$ such that the action of $I(\overline{K}/K)$ on $T_\ell(E)$ factors via $I(F/K)$, i.e. such that $I(\overline{K}/F)$ acts trivially on $T_\ell(E)$. By Néron-Ogg-Shafarevich, such an $F$ exists if and only if $E$ has good reduction over some finite extension of $K$. We say that an elliptic curve $E/K$ has potential good reduction if there exists a finite extension $F/K$ such that $E$ has good reduction over $F$. This turns out to be easy to verify.

Proposition 1.4.1. The $j$-invariant of $E$ is integral if and only if $E$ has potential good reduction.

Proof. See [55, p. 196].

Definition 1.4.2. Let $\ell \neq p$ be a prime. Suppose that $E/K$ has integral $j$-invariant and consider a finite Galois extension $F/K$ such that $E$ has good reduction over $F$. Then the conductor of $E/K$ is $N_{E/K} = \mathfrak{p}^{f(E/K)}$, where $\mathfrak{p}$ is the unique non-zero prime of $O_K$ and $f(E/K) = f(\rho_{E,\ell}|_{I(F/K)})$.

This definition is independent of the choice of $F$ by Lemma 1.3.2 and the fact that the upper ramification groups are compatible with taking quotients. We will see later that it is also independent of $\ell$. It follows from Lemma 1.3.2 that $I(\overline{K}/K)$ acts trivially on $T_\ell(E)$ if and only if $f(E/K) = 0$. Also, we see from the definition: the more “non-trivial” the action of the ramification groups on $T_\ell(E)$ is, the larger the conductor exponent of $E/K$.

It is our aim to generalise this definition to arbitrary elliptic curves. In order to do that, we generalise the conductor of a Galois representation to infinite Galois groups. A problem with the lower numbering ramification groups is that they are not compatible with taking quotients, which makes it hard to generalise them to infinite Galois groups. The upper numbering ramification groups do not have this defect, which is one of their virtues.

Definition 1.4.3. If $G = \mathrm{Gal}(K'/K)$ is a (possibly infinite) Galois group, we define
\[ G^u := \varprojlim_{\substack{K'/F/K \\ [F:K] < \infty}} \mathrm{Gal}(F/K)^u \]
to be the $u$-th upper numbering ramification group of $G$.

This is well-defined because the upper ramification groups are compatible with quotients.

Definition 1.4.4. The conductor exponent of a continuous finite-dimensional representation $\rho$ of $G_K = \mathrm{Gal}(\overline{K}/K)$ on a vector space $V$ is
\[ f(\rho) = \int_{-1}^{\infty} \dim\bigl(V/V^{G_K^t}\bigr)\, dt, \]
provided the integral exists. If moreover $f(\rho) \in \mathbb{Z}$, we define the conductor of $\rho$ to be $N(\rho) = \mathfrak{p}^{f(\rho)}$. Lastly, the conductor exponent of $E/K$ is $f(E/K) := f(\rho_{E,\ell})$, where $\ell \neq p$. If $E'$ is an elliptic curve over a number field $F$, the conductor of $E'/F$ is
\[ N_{E'/F} := \prod_{\mathfrak{q}} \mathfrak{q}^{f(E'/F_\mathfrak{q})}, \]
where the product runs over the non-zero primes $\mathfrak{q} \subset O_F$.

This definition agrees with Definition 1.4.2 in the case where $I(\overline{K}/K)/\ker \rho$ is finite. The remainder of this section is dedicated to proving the following theorem.

Theorem 1.4.5. For each elliptic curve over a finite extension $K$ of $\mathbb{Q}_p$, the conductor exponent exists and is an integer independent of $\ell$.

In order to prove this, we define the Swan conductor exponent of a continuous finite-dimensional representation $\rho$ of $G_K$ as
\[ \delta(\rho) := \int_{0}^{\infty} \dim\bigl(V/V^{G_K^t}\bigr)\, dt = f(\rho) - \dim\bigl(V/V^{I(\overline{K}/K)}\bigr). \]
We denote the remainder by $\varepsilon(\rho) := \dim(V/V^{I(\overline{K}/K)})$. For finiteness and integrality questions, it suffices to consider $\delta(\rho)$. We note that $\delta(\rho)$ depends only on the restrictions $\rho|_{G_K^u}$ for $u > 0$. Denote by $\chi_{E,\ell}$ the


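Lemma 1.4.6. Consider a prime $\ell \neq p$. For every $u > 0$ we have
\[ \dim_{\mathbb{Q}_\ell} V_\ell(E)/V_\ell(E)^{G^u} = \dim_{\mathbb{F}_\ell} E[\ell]/E[\ell]^{G^u}. \]

Proof. We first show that $E[\ell]^{G^u} = T_\ell^{G^u} \otimes_{\mathbb{Z}_\ell} \mathbb{F}_\ell$, where $T_\ell := T_\ell(E)$. This suffices because $V_\ell := V_\ell(E) = T_\ell \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$ and hence $V_\ell^{G^u} = T_\ell^{G^u} \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$. To this end, we note that $E[\ell] = T_\ell \otimes_{\mathbb{Z}_\ell} \mathbb{F}_\ell$ and this fits into an exact sequence
\[ 0 \to T_\ell \to T_\ell \to T_\ell \otimes_{\mathbb{Z}_\ell} \mathbb{F}_\ell \to 0, \]
where the first map is given by multiplication by $\ell$. We are done when the induced sequence
\[ 0 \to T_\ell^{G^u} \to T_\ell^{G^u} \to (T_\ell \otimes_{\mathbb{Z}_\ell} \mathbb{F}_\ell)^{G^u} \to 0 \]
is also exact. It is not hard to see that the functor $M \mapsto M^{G^u}$ on $G$-modules $M$ is left-exact, so it remains to prove that the right-hand map is surjective. So suppose that we have $x \in T_\ell$ such that $Ax \equiv x \bmod \ell T_\ell$ for all $A \in \rho_{E,\ell}(G^u)$. For every $u > 0$ we know that $G^u$ is a pro-$p$ group with $p \neq \ell$, so by [18, §1.3.1: $\ell$-adic representations of local fields], $\rho_{E,\ell}(G^u) \subset \mathrm{GL}(T_\ell(E)) \simeq \mathrm{GL}_2(\mathbb{Z}_\ell)$ is finite of order prime to $\ell$. Therefore, $|\rho_{E,\ell}(G^u)|$ is invertible in $\mathbb{Z}_\ell$, so we can define
\[ z := \frac{1}{|\rho_{E,\ell}(G^u)|} \sum_{A \in \rho_{E,\ell}(G^u)} Ax \in T_\ell. \]
By construction, $z \in T_\ell^{G^u}$, and since $Ax \equiv x \bmod \ell T_\ell$ for all $A \in \rho_{E,\ell}(G^u)$, also $z \equiv x \bmod \ell T_\ell$, so $z$ is the desired element in the inverse image of $x$.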
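The averaging step in this proof can be tried out on a toy example. The sketch below is only an illustration under assumptions that are not taken from the text: the Tate module is replaced by $(\mathbb{Z}/27\mathbb{Z})^2$ (so $\ell = 3$, truncated at $\ell^3$), and the finite image of $G^u$ is replaced by the hypothetical group of order 2 generated by the matrix `A` below. It checks that a vector fixed only modulo $\ell$ is averaged to a vector that is fixed exactly and has the same reduction modulo $\ell$.

```python
# Toy illustration of the averaging argument (assumed example, not from the text):
# T = (Z/27Z)^2 stands in for the Tate module with l = 3, and the group {I, A}
# of order 2 (prime to l) stands in for the finite image of G^u.

MOD = 27   # l^3 with l = 3
ELL = 3

def mat_vec(m, v, mod=MOD):
    """Apply a 2x2 matrix to a vector, reducing modulo `mod`."""
    return tuple(sum(m[i][j] * v[j] for j in range(2)) % mod for i in range(2))

IDENTITY = ((1, 0), (0, 1))
A = ((1, 3), (0, -1))          # A * A = I, so {I, A} is a group of order 2
GROUP = [IDENTITY, A]

def average(x):
    """Return z = |G|^{-1} * sum_{g in G} g x, computed modulo 27."""
    inv_order = pow(len(GROUP), -1, MOD)   # |G| = 2 is invertible modulo 27
    total = [0, 0]
    for g in GROUP:
        gx = mat_vec(g, x)
        total = [(total[i] + gx[i]) % MOD for i in range(2)]
    return tuple((inv_order * t) % MOD for t in total)

def reduce_mod_ell(v):
    """Reduce a vector modulo l."""
    return tuple(c % ELL for c in v)

x = (1, 3)                     # fixed by the group only modulo l
z = average(x)

fixed_mod_ell = all(reduce_mod_ell(mat_vec(g, x)) == reduce_mod_ell(x) for g in GROUP)
fixed_exactly = all(mat_vec(g, z) == z for g in GROUP)
same_reduction = reduce_mod_ell(z) == reduce_mod_ell(x)
print(z, fixed_mod_ell, fixed_exactly, same_reduction)
```

The modular inverse via `pow(n, -1, m)` requires Python 3.8 or later. The choice of modulus and matrix is arbitrary; any finite matrix group of order prime to $\ell$ would serve equally well.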

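Corollary 1.4.7. The Swan conductor of $E/K$ equals
\[ \delta(\rho_{E,\ell}) = \sum_{i=1}^{\infty} \frac{|\mathrm{Gal}(L/K)_i|}{|\mathrm{Gal}(L/K)_0|} \dim_{\mathbb{F}_\ell} E[\ell]/E[\ell]^{\mathrm{Gal}(L/K)_i}, \]
where $L = K(E[\ell])$.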

Proof. By the previous lemma, we may restrict ourselves to considering only the action on $E[\ell]$. Because $E[\ell]$ is finite, $L/K$ is a finite extension and the Galois action on $E[\ell]$ factors via $\mathrm{Gal}(L/K)$. Now it is a simple substitution to rewrite the integral over the upper ramification groups into one over the lower ramification groups, which is a sum.

This shows that the conductor of an elliptic curve is a well-defined rational number. Also, this shows that our definition of the conductor agrees with that of Silverman in [56, p. 380]. We proceed to show that the conductor does not depend on $\ell$. The following proposition is a corollary of the non-degeneracy of the Weil pairing; see [55, p. 99] for a proof.

Proposition 1.4.8. If $\phi \in \mathrm{End}(E)$ induces $\phi_\ell : T_\ell(E) \to T_\ell(E)$ then $\det \phi_\ell = \deg \phi$.

Next, notice that if $\sigma \in \mathrm{Gal}(\overline{K}/K)$ acts trivially on $T_\ell(E)$, then $\deg \phi = 1$ so $\phi$ acts trivially on $E$. Therefore $\ker \rho_{E,\ell}$ consists of those elements of the Galois group that act trivially on $E$. In particular, this is independent of $\ell$. Secondly, for any $2 \times 2$ matrix $A$, we have $\mathrm{Tr}\, A = 1 + \det A - \det(I - A)$, so that in the above proposition also $\mathrm{Tr}\, \phi_\ell$ is independent of $\ell$. Consider $\sigma \in G_K$. By the above, $\mathrm{Tr}\, \rho_{E,\ell}(\sigma) = 2 - \deg(I - \sigma)$ is independent of $\ell$. Suppose that $u > 0$. Then the image $\rho_{E,\ell}(G^u) \simeq G^u/\ker \rho_{E,\ell}$ is finite as in the proof of Lemma 1.4.6. We thus find that
\[ \dim V_\ell(E)^{G^u} = \dim V_\ell(E)^{\rho_{E,\ell}(G^u)} = \frac{1}{|G^u/\ker(\rho_{E,\ell})|} \sum_{\sigma \in G^u/\ker(\rho_{E,\ell})} \mathrm{Tr}\, \rho_{E,\ell}(\sigma). \]
By the previous remarks, this is independent of $\ell$, thus showing that the Swan conductor is independent of $\ell$. It remains to consider $\varepsilon(\rho_{E,\ell})$.
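The matrix identity $\mathrm{Tr}\, A = 1 + \det A - \det(I - A)$ used above follows by expanding $\det(I - A) = 1 - \mathrm{Tr}\, A + \det A$. As a quick sanity check, the following sketch tests it on random integer matrices; the helper names and the sample range are arbitrary choices made for this illustration.

```python
import random

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace2(m):
    """Trace of a 2x2 matrix."""
    return m[0][0] + m[1][1]

def identity_minus(m):
    """Return the matrix I - m."""
    return [[1 - m[0][0], -m[0][1]], [-m[1][0], 1 - m[1][1]]]

def trace_identity_holds(m):
    """Check Tr(m) == 1 + det(m) - det(I - m)."""
    return trace2(m) == 1 + det2(m) - det2(identity_minus(m))

random.seed(0)
samples = [
    [[random.randint(-50, 50) for _ in range(2)] for _ in range(2)]
    for _ in range(1000)
]
all_hold = all(trace_identity_holds(m) for m in samples)
print(all_hold)
```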


Theorem 1.4.9. For $\ell \neq p$ we have
\[ \varepsilon(\rho_{E,\ell}) = \dim_{\mathbb{Q}_\ell} V_\ell(E)/V_\ell(E)^{I(\overline{K}/K)} = \begin{cases} 0 & \text{if } E/K \text{ has good reduction,} \\ 1 & \text{if } E/K \text{ has multiplicative reduction and} \\ 2 & \text{if } E/K \text{ has additive reduction.} \end{cases} \]
In particular, this is independent of $\ell$.

Remark 1.4.10. This generalises the criterion of Néron-Ogg-Shafarevich.

Proof. First we note that $V_\ell(E(\overline{K}))^{I(\overline{K}/K)} = V_\ell(E(K^{ur}))$, where $K^{ur}$ is the maximal unramified extension of $K$. This has residue field $\overline{k}$, the algebraic closure of the residue field $k$ of $K$. We have two exact sequences
\[ 0 \to E_0(K^{ur}) \to E(K^{ur}) \to E(K^{ur})/E_0(K^{ur}) \to 0, \]
\[ 0 \to E_1(K^{ur}) \to E_0(K^{ur}) \to \widetilde{E}_{ns}(\overline{k}) \to 0, \]
where $E_0(K^{ur})$ is the subgroup of $E(K^{ur})$ of points with non-singular reduction, $\widetilde{E}_{ns}$ is the non-singular locus of the reduction of $E$ and $E_1(K^{ur})$ is defined by the latter exact sequence. It is a consequence of Hensel's lemma (cf. [52, II Proposition 7]) that the map $E_0(K^{ur}) \to \widetilde{E}_{ns}(\overline{k})$ is surjective. The functor $V_\ell$ from abelian groups to $\mathbb{Q}_\ell$-vector spaces given by $V_\ell(A) = (\varprojlim_{n \geq 0} A[\ell^n]) \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell$ is exact and can thus be applied to the above exact sequences. It is a consequence of Tate's algorithm (see Corollary 2.3.10) that $E(K^{ur})/E_0(K^{ur})$ is finite. Also $E_1(K^{ur})$ is isomorphic to the formal group of $E$ and thus has no $\ell$-torsion (see [55, VII Proposition 3.1]). Therefore, $V_\ell(E_1(K^{ur})) = V_\ell(E(K^{ur})/E_0(K^{ur})) = 0$ and we conclude that
\[ V_\ell(E(K^{ur})) \simeq V_\ell(\widetilde{E}_{ns}(\overline{k})). \]
Now we are basically done, because $\widetilde{E}_{ns}(\overline{k})$ is determined by the reduction type (see [55, III Proposition 2.5]). It is not hard to see that $V_\ell(\overline{k}^{\times}) \simeq \mathbb{Q}_\ell$ and $V_\ell(\overline{k}^{+}) \simeq \{0\}$ when $\ell \neq p$, and we already know that $V_\ell(\widetilde{E}) \simeq \mathbb{Q}_\ell^2$ when $\widetilde{E}$ is an elliptic curve over a field of characteristic prime to $\ell$. Now the statement follows.

We conclude that the conductor of an elliptic curve is independent of $\ell$, but depends instead on the reduction type of $E$. We proceed by showing the Swan conductor is integral.

Remark 1.4.11. It appears natural to try to argue as in Section 1.3, using instead the $\mathbb{F}_\ell$-representation of $\mathrm{Gal}(\overline{K}/K)$ on $E[\ell]$, because this does factor via the Galois group of a finite extension $L = K(E[\ell])$ over $K$. In that case, however, characters take values in $\mathbb{F}_\ell$, so we cannot use them to define the conductor as in Definition 1.3.1. Instead, for an $\mathbb{F}_\ell$-representation $\rho$ we need to work with the definition
\[ f(\rho) = \int_{-1}^{\infty} \dim_{\mathbb{F}_\ell} E[\ell]/E[\ell]^{G^t}\, dt. \]
Gabor Wiese attempted to prove integrality of the conductor this way in his lecture notes [66]. It turns out (see [66]) that one can prove the same formula (1.3) for the conductor of an induced representation in terms of the original one, using Mackey's formula instead of Frobenius reciprocity. In order to apply Theorem 1.3.9, we need to somehow reduce to a group of order coprime to $\ell$. This is problematic, because the argument used in the proof of Lemma 1.3.8 to reduce to the wild ramification group makes crucial use of the formula
\[ |G| \dim V^G = \sum_{\sigma \in G} \chi(\sigma), \]
that holds when $\chi$ is the character of a representation $V$ of $G$ over a field of characteristic zero. In case of characteristic $\ell$, however, this is only an equality modulo $\ell$, thus spoiling the argument. Wiese attempted to circumvent this problem in [66, Proposition 3.1.40], but there appears to be a fatal mistake in the proof: the second displayed exact sequence is in general not exact.


As a general representation-theoretic argument appears to be hard, we “cheat” and make use of the special properties of the Tate module representation of an elliptic curve.

Theorem 1.4.12. If $p \geq 5$ or $E/K$ has good or multiplicative reduction, we have $\delta(\rho_E) = 0$, so
\[ f(E/K) = \varepsilon(\rho_E) = \begin{cases} 0 & \text{if } E/K \text{ has good reduction,} \\ 1 & \text{if } E/K \text{ has multiplicative reduction and} \\ 2 & \text{if } E/K \text{ has additive reduction.} \end{cases} \]

Proof. When $E/K$ has good reduction, $\delta(\rho_E) = 0$ by Corollary 1.4.7 and Néron-Ogg-Shafarevich. For the proof when $E/K$ has multiplicative reduction, see [56, IV Theorem 10.2]. Assume now that $p \geq 5$. By Corollary 1.4.7, it suffices to consider the action of $G$ on $E[\ell]$. As $K(E[\ell])/K$ is finite, this action will factor via a faithful representation of a finite group $\mathrm{Gal}(M/K)$. But $\mathrm{GL}_2(\mathbb{F}_\ell)$ contains $\ell(\ell - 1)^2(\ell + 1)$ elements, so $|\mathrm{Gal}(M/K)|$ divides this number. Also $\mathrm{Gal}(M/K)^u$ is a $p$-group for $u > 0$ (cf. [52, p. 67]), so if $p$ is coprime to $\ell(\ell - 1)^2(\ell + 1)$ then $\mathrm{Im}(G^u \to \mathrm{Gal}(M/K)) = \mathrm{Gal}(M/K)^u = \{1\}$ and thus $G^u$ acts trivially on $E[\ell]$. It remains to note that for every prime $p \geq 5$, there exists a prime $\ell \neq 2, p$ such that $\ell \bmod p \notin \{0, 1, -1\}$.
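The last step of this proof is easy to make explicit. The sketch below searches for the smallest auxiliary prime $\ell \neq 2, p$ with $\ell \bmod p \notin \{0, 1, -1\}$ and confirms that $p$ is then coprime to $|\mathrm{GL}_2(\mathbb{F}_\ell)| = \ell(\ell-1)^2(\ell+1)$. The function names are hypothetical and chosen for this illustration only. Note that $\ell = 3$ always works, since $3 \bmod p = 3$ is different from $0$, $1$ and $p - 1$ as soon as $p \geq 5$.

```python
def gl2_order(ell):
    """Order of GL_2(F_ell), namely ell * (ell - 1)^2 * (ell + 1)."""
    return ell * (ell - 1) ** 2 * (ell + 1)

def is_prime(n):
    """Trial-division primality test, sufficient for small inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def auxiliary_prime(p):
    """Smallest prime ell != 2, p with ell mod p not in {0, 1, -1}."""
    if p < 5 or not is_prime(p):
        raise ValueError("p must be a prime with p >= 5")
    ell = 3
    while True:
        if is_prime(ell) and ell != p and ell % p not in (0, 1, p - 1):
            return ell
        ell += 2

for p in (5, 7, 11, 13, 101):
    ell = auxiliary_prime(p)
    # p does not divide |GL_2(F_ell)|, so no non-trivial p-subgroup can occur.
    assert gl2_order(ell) % p != 0
    print(p, ell, gl2_order(ell))
```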

Next, we consider the case where $j_{E/K}$ is non-integral and $p \in \{2, 3\}$. Over the complex numbers, elliptic curves correspond with complex tori $\mathbb{C}/\Lambda$, where $\Lambda$ is of the form $\mathbb{Z} \oplus \tau\mathbb{Z}$. Applying the exponential map $z \mapsto \exp(2\pi i z)$, such a torus is biholomorphic to $\mathbb{C}^*/q^{\mathbb{Z}}$, where $q = \exp(2\pi i \tau)$. Over a finite extension $K$ of $\mathbb{Q}_p$, the analogy with lattices is lost and we may not have a (global) exponential map, but we can consider $K^{\times}/q^{\mathbb{Z}}$ for $q \in K^{\times}$. This is called a Tate curve. For convergence reasons, this indeed corresponds to an elliptic curve $E_q/K$ when $|q| < 1$, and then $|j_{E_q}| = 1/|q| > 1$. Conversely, every elliptic curve over $K$ with non-integral $j$-invariant equals $E_q$ for some $q \in K^{\times}$. This allows us to view elliptic curves over $K$ with non-integral $j$-invariant as ‘lattices’ in a manner similar to the case over the complex numbers. A consequence of this theory is the following lemma.

Lemma 1.4.13. If $j_E \notin O_K$ there exists a field $K'/K$ such that $[K' : K] \leq 2$ and $K'(E[\ell]) = K'(\zeta_\ell, q^{1/\ell})$ for some $q \in K'$ and primitive $\ell$-th root of unity $\zeta_\ell$. Moreover, $K'/K$ is unramified when $E$ has multiplicative reduction and ramified when $E$ has additive reduction.

Proof. See [56, V.5] for a proof of the lemma and for more details on the theory of Tate curves.

We now have everything we need.

Proposition 1.4.14. Suppose that $E/K$ has non-integral $j$-invariant. Then $\delta(E/K) = 0$ when the residue field characteristic $p = 3$. When $p = 2$, $\mathrm{Gal}(K(E[\ell])/K)_1 \in \{\{1\}, \mathbb{Z}/2\mathbb{Z}\}$. In both cases, $f(E/K) \in \mathbb{Z}$.

Proof. From the previous lemma we see that the ramification index of $K(E[\ell])/K$ divides $2\ell$. So the extension is either tamely ramified or $p = 2$ and $\mathrm{Gal}(K(E[\ell])/K)_1 = \mathbb{Z}/2\mathbb{Z}$ (we may and do assume $\ell \neq 2, 3$). In the latter case, if $\sigma$ is the generator, we see from normality that $\sigma\tau = \tau\sigma$ for all $\tau \in \mathrm{Gal}(K(E[\ell])/K)_0$, so $\mathrm{Gal}(K(E[\ell])/K)_1$ is contained in the center of $\mathrm{Gal}(K(E[\ell])/K)_0$. It follows from choosing $s$ to be a generator of $\mathrm{Gal}(K(E[\ell])/K)_0/\mathrm{Gal}(K(E[\ell])/K)_1$ in Lemma 1.3.7 that $\sigma \notin \mathrm{Gal}(K(E[\ell])/K)_{i+1}$ is only possible when $i$ is a multiple of $e_0 := |\mathrm{Gal}(K(E[\ell])/K)_0|/|\mathrm{Gal}(K(E[\ell])/K)_1|$. As $\mathrm{Gal}(K(E[\ell])/K)_1$ has prime order, the next jump in the ramification groups is also the last, so $e_0$ was the only possible denominator occurring in $f(E/K)$.


1.5. Upper bounds for the conductor

Again let $K$ be a finite extension of $\mathbb{Q}_p$ with residue field $k$. In Chapter 4, upper bounds for the conductor are an important part of finding height bounds for Mordell equations. If $p \geq 5$, we have seen that the conductor exponent $f(E/K)$ of an elliptic curve $E/K$ is at most 2. In this section, we find an upper bound for the conductor exponent at the primes 2 and 3. We follow the reasoning in [56, IV Theorem 10.4], but we include a proof of the key statement (iii) omitted by Silverman, for which we use the original article of Brumer and Kramer [8]. First a bound is determined for the number of non-trivial lower ramification groups. For this, a bound on the valuation of the different is needed. When $L/K$ is a finite field extension, we write $e_{L/K}$ for the ramification index of $L/K$.

Lemma 1.5.1. Let $L/K$ be a finite extension of fields. Then
\[ v_L(d_{L/K}) \leq e_{L/K} - 1 + v_L(e_{L/K}). \]

Proof. See [52, p. 58].

The following lemma is based on parts of the paper [8] by Brumer and Kramer, where the result is, however, not mentioned explicitly. This lemma is key to the proof of Theorem 1.5.3.

Lemma 1.5.2. Suppose $L/K$ is a finite Galois extension with Galois group $G$ and let $r$ be the smallest integer such that $G_r = \{1\}$. Then
\[ r \leq 1 + \frac{v_L(p)}{p - 1}. \]

Proof. If $r = 0$ or $r = 1$ the lemma trivially holds. Hence it suffices to consider the extension $L/M$ where $M$ is the fixed field of $G_1$, so without loss of generality we assume that $G_1 = G$. In particular $L/K$ is now totally ramified. Also $G$ is quite tangible: it is a $p$-group. We seek to upper bound the place of the first jump. Let $f_0$ be the smallest integer $f$ such that $G_f \neq G$. As $G/G_{f_0}$ is a direct product of cyclic groups of order $p$ (see [52, p. 67]), there exists a non-trivial character $\chi$ of $G$ of degree 1 with $G_{f_0} \subset \ker \chi$ such that $G/\ker \chi$ is cyclic of order $p$. Then by Lemma 1.3.2,
\[ f_0 = \sum_{i=0}^{f_0 - 1} 1 = \sum_{i=0}^{\infty} \frac{|G_i|}{|G_0|} (1 - \chi(G_i)) = f(\chi). \]
Suppose $H = \ker \chi$ corresponds via Galois theory to the subfield $L_\chi$ and that $\chi = \overline{\chi} \circ \pi$, where $\pi : G \to G/H$ is the quotient map. Then $f(\chi) = f(\overline{\chi})$ by Lemma 1.3.2 because the upper numbering ramification groups are compatible with quotients.

It thus remains to bound the conductor of a non-trivial degree 1 character $\overline{\chi}$ on the cyclic group $G/H$ of order $p$. Each non-trivial degree 1 character on $G/H$ is given by mapping the generator to a primitive $p$-th root of unity, and so $\sum_{\psi \in \widehat{G/H}} \psi(g) = 0$ for every $g \neq 1$ and $\sum_{\psi \in \widehat{G/H}} \psi(1) = |\widehat{G/H}| = |G/H|$, where $\widehat{G'}$ denotes the group of characters on a group $G'$. From this computation we see that $\sum_{\psi \in \widehat{G/H}} \psi$ equals the character $\chi_{reg}$ of the regular representation of $G/H$. The regular representation is induced by the trivial representation $\chi_{triv}$ on the trivial subgroup, and thus we know from (1.3), Lemma 1.5.1 and the fact that $L_\chi/K$ is totally ramified that
\[ \sum_{\psi \in \widehat{G/H}} f(\psi) = f(\chi_{reg}) = v_{L_\chi}(d_{L_\chi/K}) \leq p - 1 + p v_K(p). \]
It remains to notice that $f(\chi_{triv}) = 0$ and that all $p - 1$ non-trivial characters of the group $G/H$ of prime order have trivial kernel and thus have the same conductor. We obtain
\[ f_0 = f(\chi) = f(\overline{\chi}) \leq 1 + \frac{p v_K(p)}{p - 1}. \]
We can now do the same for $G_{f_0}$ instead of $G$ to find the second jump $f_1$, and the third $f_2$ and so on. The bounds we find will be the same, but with $K$ replaced by the fixed field of the group we consider. At worst, the last jump will occur over a base field $K'$ with $[L : K'] = p$. Therefore,
\[ r \leq 1 + \frac{p v_{K'}(p)}{p - 1} \leq 1 + \frac{v_L(p)}{p - 1}, \]
as desired.
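The two lemmas above can be checked against a classical example that is not treated in the text: $L = \mathbb{Q}_p(\zeta_{p^2})$ over $K = \mathbb{Q}_p$. The sketch below assumes the standard description of its lower ramification filtration ($|G_0| = p(p-1)$, $|G_i| = p$ for $1 \leq i \leq p - 1$ and $G_i = \{1\}$ for $i \geq p$) and computes the valuation of the different via the formula $v_L(d_{L/K}) = \sum_{i \geq 0} (|G_i| - 1)$. All function names are hypothetical.

```python
from fractions import Fraction

def lower_orders(p):
    """Orders |G_0|, ..., |G_p| for Q_p(zeta_{p^2})/Q_p (assumed standard filtration)."""
    return [p * (p - 1)] + [p] * (p - 1) + [1]

def check_lemmas(p):
    """Return (r, bound on r, v_L(different), bound on v_L(different))."""
    g = lower_orders(p)
    e = g[0]                                  # totally ramified: e = [L:K] = p(p-1)
    r = g.index(1)                            # smallest r with G_r trivial
    v_L_p = e                                 # v_L(p) = e * v_K(p) with v_K(p) = 1
    bound_r = 1 + Fraction(v_L_p, p - 1)      # right-hand side of Lemma 1.5.2
    v_L_diff = sum(order - 1 for order in g)  # sum over i of (|G_i| - 1)
    v_L_e = v_L_p                             # v_L(p(p-1)) = v_L(p), as p-1 is a unit
    bound_diff = e - 1 + v_L_e                # right-hand side of Lemma 1.5.1
    return r, bound_r, v_L_diff, bound_diff

for p in (2, 3, 5, 7):
    r, bound_r, v_diff, bound_diff = check_lemmas(p)
    assert r <= bound_r and v_diff <= bound_diff
    print(p, r, bound_r, v_diff, bound_diff)
```

In this family $r = p$ while the bound of Lemma 1.5.2 equals $p + 1$, so the lemma is off by only one here.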

Theorem 1.5.3. If $E$ is an elliptic curve over $K$, we have
\[ f(E/K) \leq 2 + 3 v_K(3) + 8 v_K(2). \]

Remark 1.5.4. This bound can be slightly improved. Brumer and Kramer [8] proved the stronger bound $f(E/K) \leq 2 + 3 v_K(3) + 6 v_K(2)$ and this is the strongest possible.

Proof. Let $L = K(E[\ell])$. We use the bound $\varepsilon(E/K) \leq 2$, so it suffices to consider $\delta(E/K)$. Therefore, by Corollary 1.4.7, we may and do assume that $\mathrm{Gal}(L/K) = \mathrm{Gal}(L/K)_0$, i.e. that $L/K$ is totally ramified. Using Corollary 1.4.7, we see that
\[ f(E/K) = \varepsilon(E/K) + \sum_{i=1}^{r} \frac{g_i}{g_0} \dim_{\mathbb{F}_\ell}\bigl(E[\ell]/E[\ell]^{\mathrm{Gal}(L/K)_i}\bigr) \leq 2 \sum_{i=0}^{r-1} \frac{g_i}{g_0} = \frac{2r}{g_0} + \frac{2}{g_0} \sum_{i=0}^{\infty} (g_i - 1), \]
where $g_i = |\mathrm{Gal}(L/K)_i|$ and $r$ is as in Lemma 1.5.2. We need that $\sum_{i=0}^{\infty} (g_i - 1) = v_L(d_{L/K})$, a result easily proved by rewriting $g_i$ as a sum of indicator functions $1_{i_{L/K}(\sigma) \geq i+1}$ and swapping the order of summation (see [52, p. 64]). Together with the previous two lemmas, we obtain
\[ f(E/K) \leq \frac{2 v_L(p)}{e_{L/K}(p - 1)} + \frac{2}{g_0} \bigl(v_L(d_{L/K}) + 1\bigr) \leq \frac{2 v_L(p)}{e_{L/K}(p - 1)} + \frac{2}{g_0} \bigl(e_{L/K} + v_L(e_{L/K})\bigr). \]
Recall that $g_0 = e_{L/K}$ and $v_L(p) = e_{L/K} v_K(p)$. Then the above becomes
\[ 2 + \frac{2 v_K(p)}{p - 1} + 2 v_K(e_{L/K}). \]
It remains to find a tight bound for $v_K(e_{L/K})$. We have by definition of $L$ that $\mathrm{Gal}(L/K)_0 \hookrightarrow \mathrm{GL}_2(\mathbb{F}_\ell)$, which has order $\ell(\ell - 1)^2(\ell + 1)$. Choosing for example $\ell = 5$, we see that $e_{L/K} \mid 480$, so when $p = 3$ this yields $v_K(e_{L/K}) \leq v_K(3)$ and in total
\[ f(E/K) \leq 2 + 3 v_K(3). \]
When $p = 2$ we can take $\ell = 3$ to find that $v_K(e_{L/K}) \leq 4 v_K(2)$. But we can do a little better. We will show that the image of $\mathrm{Gal}(L/K)$ in $\mathrm{GL}_2(\mathbb{F}_3)$ is actually contained in $\mathrm{SL}_2(\mathbb{F}_3)$. Note that the conductor is invariant under unramified extensions of the base field, and that from the Weil pairing, we know that $L$ contains a third root of unity since it contains $E[3]$ (see [55, III Corollary 8.1.1]). As $K(\zeta_3)/K$ is unramified, we may and do assume that $\zeta_3 \in K$. Let $v, w$ be an $\mathbb{F}_3$-basis of $E[3]$ and let $e_3$ be the Weil pairing on $E[3]$. Then
\[ e_3(v, w) = \sigma(e_3(v, w)) = e_3(\sigma(v), \sigma(w)) = e_3(v, w)^{\det \rho_E(\sigma)}, \]
by standard properties of the Weil pairing (see [55, III Proposition 8.1]). Hence by the non-degeneracy of the pairing, $\det \rho_E(\sigma) = 1$. Lastly, note that $|\mathrm{SL}_2(\mathbb{F}_3)| = 2^3 \cdot 3$, so we save one power of 2.
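The numerical constants in this proof can be recomputed mechanically. The sketch below is a hypothetical helper, not part of the original argument. It uses that $e_{L/K}$ divides the order of the relevant matrix group, so that $v_K(e_{L/K}) \leq v_p(|\text{group}|) \cdot v_K(p)$, and returns the coefficient $c$ in the resulting bound $f(E/K) \leq 2 + c \cdot v_K(p)$ obtained from $2 + \frac{2 v_K(p)}{p-1} + 2 v_K(e_{L/K})$.

```python
from fractions import Fraction

def valuation(n, p):
    """Exponent of the prime p in the positive integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def gl2_order(ell):
    """Order of GL_2(F_ell)."""
    return ell * (ell - 1) ** 2 * (ell + 1)

def sl2_order(ell):
    """Order of SL_2(F_ell), the kernel of the determinant on GL_2(F_ell)."""
    return gl2_order(ell) // (ell - 1)

def bound_coefficient(p, group_order):
    """Coefficient c with f(E/K) <= 2 + c * v_K(p), given that e_{L/K} divides group_order."""
    return Fraction(2, p - 1) + 2 * valuation(group_order, p)

print(bound_coefficient(3, gl2_order(5)))   # p = 3 with ell = 5
print(bound_coefficient(2, gl2_order(3)))   # p = 2 with ell = 3, using GL_2
print(bound_coefficient(2, sl2_order(3)))   # p = 2 with ell = 3, using SL_2
```

The three values correspond to the coefficient of $v_K(3)$ in the theorem, the naive coefficient of $v_K(2)$, and the improved coefficient of $v_K(2)$ after restricting to $\mathrm{SL}_2(\mathbb{F}_3)$.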


2. The Néron model of an elliptic curve and Tate's algorithm

In this chapter, we discuss an algorithm of Tate that allows for the computation of the conductor of any elliptic curve over a number field. The main tool for this is the Néron model of an elliptic curve, which is a scheme capturing the elliptic curve together with its reductions at the primes. The main reference used for this chapter is [56, IV]. A thorough discussion of Néron models could be a thesis in itself, so not all aspects are discussed here in great detail. Instead of presenting a sketchy overview, we have chosen to discuss some proofs in greater detail than in [56, IV] and to omit others. The focus is on arithmetic aspects, rather than general algebraic geometry.

2.1. Néron models

For the geometric definitions in this section and those that follow, we refer to the book of Hartshorne [22]. By $O_X$ we denote the structure sheaf of a scheme $X$, by $O_X(U)$ its sections over an open $U \subset X$ and by $O_{X,x}$ or $O_x$ its stalk at $x \in X$. When $R$ is a commutative ring, $X = \mathrm{Spec}\, R$ and $M$ is an $R$-module, we denote by $\widetilde{M}$ the induced $O_X$-module. When $Y \to \mathrm{Spec}\, R$ is a scheme over $R$, we denote by $Y_\mathfrak{p}$ the fibre of $Y$ at a prime $\mathfrak{p} \subset R$. From now on, let $R$ be a Dedekind domain of characteristic zero with field of fractions $K$ and $E$ an elliptic curve over $K$.

We can always find a Weierstrass equation over $R$ for $E$:
\[ E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 \quad (\text{all } a_i \in R). \]
The naive scheme to capture $E$ and its reductions simultaneously would be the sub-$R$-scheme $\mathcal{E}$ of $\mathbb{P}^2_R$ defined by the above equation. This is a surface with the nice property that $\mathcal{E}(R) = E(K)$. Indeed, every point $(a : b : c) \in E(K)$ can be rescaled to give a point in $\mathcal{E}(R)$.

All the non-singular reductions of $E$ admit a group law induced by the group law on $E$, and even when $\widetilde{E}_\mathfrak{p}$ is singular, its non-singular locus $\widetilde{E}_{\mathfrak{p},ns}$ has a group law induced by $E$ (see [55, III 2.5]). Therefore, one might expect some $R$-scheme resembling $\mathcal{E}$ to be a group scheme. In general, $\mathcal{E}$ itself cannot be a group scheme, as that would yield a group law on all the reductions $\widetilde{E}_\mathfrak{p}$, which is impossible when $\widetilde{E}_\mathfrak{p}$ is singular. A possible solution would be to consider instead the subset $\mathcal{E}^0$ of those points of $\mathcal{E}$ that are non-singular fibre points. The base map $\pi : \mathcal{E} \to \mathrm{Spec}\, R$ is continuous and a non-zero prime $\{\mathfrak{p}\} \subset \mathrm{Spec}\, R$ is closed ($R$ has dimension 1), so $\pi^{-1}(\{\mathfrak{p}\}) \subset \mathcal{E}$ is also closed and this is homeomorphic to the fibre $\mathcal{E} \times_{\mathrm{Spec}\, R} k(\mathfrak{p}) = \widetilde{E}_\mathfrak{p}$. Moreover, the singular locus in $\widetilde{E}_\mathfrak{p}$ is closed and contains at most a single point, so the unique singular point on a singular fibre $\widetilde{E}_\mathfrak{p}$ is closed in $\mathcal{E}$. Therefore, removing these (finitely many) singular points yields an open subscheme $\mathcal{E}^0 \subset \mathcal{E}$ with smooth fibres. The equations for the group law make this into a group scheme. A downside of removing points on the fibres is, however, that we may have lost the point extension property (i.e. $E(K) \neq \mathcal{E}^0(R)$). Therefore, a more abstract viewpoint needs to be taken, while keeping the above approach in mind.

We need the notion of an excellent ring. A ring $S$ is called excellent when it satisfies the properties described in [36, p. 259], one of which is that $S$ is Noetherian. Furthermore, a scheme $X$ is called excellent if it has an open cover of affine subschemes defined by excellent rings. Most rings frequently encountered, such as Dedekind domains of characteristic zero and fields, are excellent (see [36, p. 260]). The following definition describes which schemes over $R$ are particularly “nice”.


Definition 2.1.1. An arithmetic surface over R is an integral, normal, excellent R-scheme π : C → Spec R, flat and of finite type over R, such that the generic fibre is a smooth connected projective curve C/K, and the other fibres are unions of curves over the residue field of the prime considered.

Note that we do not require an arithmetic surface to be regular over R, to have smooth fibres or to be proper over R. An arithmetic surface is indeed a surface. This follows from flatness and the fact that all fibres have equal dimension 1, see [22, III Corollary 9.6].

We now examine whether $\mathcal{E}$ and $\mathcal{E}^0$ as defined above are arithmetic surfaces. In what follows, we write $M_\mathfrak{p}$ for the localisation of an $R$-module $M$ at the prime $\mathfrak{p}$, and $f$ for the Weierstrass polynomial defining $\mathcal{E}$.

Lemma 2.1.2. The scheme $\mathcal{E}^0$ is a regular arithmetic surface. If $\mathcal{E}$ is normal, then $\mathcal{E}$ is also an arithmetic surface.

Proof. By definition, the generic fibre of $\mathcal{E}$ is an elliptic curve and the other fibres are unions of curves. Also, $\mathcal{E}$ is integral because all local rings are integral ($f$ being irreducible in $R[x, y]$). Moreover, $R$ is Dedekind of characteristic zero, so excellent, and every finitely generated algebra over an excellent ring is again excellent (see [36, p. 260]). For flatness, we note that $R[x, y]/(f)$ has no zero divisors and thus the result follows from [22, III Proposition 9.7].

For $\mathcal{E}^0$, note that flatness, excellence, being of finite type over $R$ and integrality are local properties, so they still hold on $\mathcal{E}^0$. Moreover, $\mathcal{E}^0$ is $\mathcal{E}$ with finitely many closed points on non-generic fibres removed, so the generic fibre remains $E/K$ and the special fibres are unions of curves. Lastly, we note that $\mathcal{E}$ is regular at all points that are non-singular in its special fibre. Indeed, if $z \in \mathrm{Spec}\, R[x, y]$ and $z/(\mathfrak{p} + (f))$ in $(R/\mathfrak{p})[x, y]/(f) = R[x, y]/(\mathfrak{p} + (f))$ is non-singular, then in the localisation $R[x, y]_z/(\mathfrak{p} + (f))_z$ at $z$, we can write $z_z = \mathfrak{p}_z + (f) + (a) + z_z^2$ for some $a \in R[x, y]$. As $\mathfrak{p}_\mathfrak{p} = (\pi) \subset R_\mathfrak{p}$ is principal and $\mathfrak{p} = z \cap R$, we find $z_z = (f)_z + (a, \pi)_z + z_z^2$, which shows that the corresponding ideal in $(R[x, y]/(f))_{z/(f)}$ can be generated by two elements modulo its square. This proves that $z/(f)$ is regular in $\mathcal{E}$. Consequently, $\mathcal{E}^0$ is regular and in particular normal.

The Néron model will be an arithmetic surface, but there is no need to define it that way.

Definition 2.1.3. A Néron model $\mathcal{N}(E/K)/R$ for $E/K$ is a smooth group scheme over $R$ that has $E/K$ as its generic fibre (with the induced group structure), satisfying the following Néron mapping property: for every smooth $R$-scheme $X$ and rational map $\phi : X_K \to E$ of their generic fibres, there exists a unique $R$-morphism $\widetilde{\phi} : X \to \mathcal{N}(E/K)$ extending $\phi$.

It is clear from the Néron mapping property that Néron models, if they exist, are unique up to unique isomorphism. Taking X = Spec R in the Néron mapping property, we see that the natural inclusion N (E/K)(R) → E(K) is a bijection, as was the case for E. The Néron mapping property is a natural scheme-theoretic extension of this particular instance.

Recall that smoothness is an open condition (cf. [22, p. 268]): if $\mathcal{C}$ is an $R$-scheme, the set $\mathcal{C}^0$ of $x \in \mathcal{C}$ such that $\mathcal{C} \to \mathrm{Spec}\, R$ is smooth at $x$ is a smooth open subscheme. It is the largest smooth subscheme contained in $\mathcal{C}$. Smoothness is closely connected with regularity. If $\mathcal{C}$ is an arithmetic surface and $R$ is a discrete valuation ring (DVR), $\mathcal{C}$ is smooth over $R$ if and only if the special and the generic fibre are both regular (see [56, IV Proposition 2.9]). This is the definition we will work with. So when $R$ is a DVR, the open subscheme $\mathcal{E}^0$ defined before by discarding the singular points on the special fibre indeed equals the largest subscheme of $\mathcal{E}$ smooth over $R$.

The following equivalent definition of properness gives us a hint where to look for schemes obeying the Néron mapping property.

Proposition 2.1.4 (Valuative criterion of properness). A morphism $\phi : X \to S$ of finite type between Noetherian schemes is proper if and only if for each discrete valuation ring $R$ with fraction field $K$ and every commuting square
\[ \begin{array}{ccc} \mathrm{Spec}\, K & \longrightarrow & X \\ \downarrow & \nearrow_{\exists !} & \downarrow \\ \mathrm{Spec}\, R & \longrightarrow & S \end{array} \]
there is a unique morphism $\mathrm{Spec}(R) \to X$ making the entire diagram commute.

Proof. See [22, II 4.7 and Exercise 4.11].

This proposition shows that proper schemes over DVRs obey the particular instance of the Néron mapping property. Indeed, if $R$ is a DVR and $\mathcal{C} \to \mathrm{Spec}(R)$ is proper with generic fibre $C/K$ and $P \in \mathcal{C}(R)$, then $P((0)) \in C(K)$. This map $\mathcal{C}(R) \to C(K)$ is injective because the generic point $(0) \in \mathrm{Spec}(R)$ is dense. Surjectivity follows from the commuting diagram
\[ \begin{array}{ccccc} \mathrm{Spec}\, K & \longrightarrow & C & \longrightarrow & \mathcal{C} \\ \downarrow & & & & \downarrow \\ \mathrm{Spec}\, R & & \longrightarrow & & \mathrm{Spec}\, R \end{array} \]
and the valuative criterion of properness.

We also note that any Néron model $\mathcal{N}/R$ has smooth fibres. Intuitively, this clashes with properness, as (cf. the beginning of the section) singular points on the fibres need to be removed to ensure smooth fibres, and that disturbs the valuative criterion described above: the unique extension $\mathrm{Spec}(R) \to \mathcal{N}$ may have had one of these singular points in its image.

On the other hand, if $\mathcal{C}$ is a proper $R$-scheme, we can again consider the largest $R$-subscheme $\mathcal{C}^0$ having smooth fibres. We thus wonder in which case the sections $P : \mathrm{Spec}(R) \to \mathcal{C}$ map exclusively to non-singular fibre points.

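Proposition 2.1.5. Suppose that $\mathcal{C}$ is a regular arithmetic surface over the Dedekind domain $R$, $P \in \mathcal{C}(R)$ and $\mathfrak{p} \in \mathrm{Spec}(R)$. Then $P(\mathfrak{p})$ is a non-singular point on the fibre $\mathcal{C}_\mathfrak{p}$ of $\mathfrak{p}$.

Proof. Let $x = P(\mathfrak{p})$. By definition $x$ is non-singular when $\mathfrak{p}$ is the generic point, so we may assume $\mathfrak{p}$ is a maximal ideal of $R$. This implies that $x$ is a closed point of $\mathcal{C}_\mathfrak{p}$ (and hence of $\mathcal{C}$ too), so we may work with the ‘cotangent space definition’ of singularity. Let $P^*$ be the induced homomorphism of local rings $O_{\mathcal{C},x} \to R_\mathfrak{p}$. By definition of a scheme morphism, this homomorphism is local, i.e. if $\mathfrak{m}_x$ is the maximal ideal of $O_{\mathcal{C},x}$ then $P^*(\mathfrak{m}_x) \subset \mathfrak{p}_\mathfrak{p}$. The composition
\[ \pi \circ P : \mathrm{Spec}(R) \to \mathcal{C} \to \mathrm{Spec}(R) \]
equals the identity morphism, so the induced composition
\[ P^* \circ \pi^* : R_\mathfrak{p} \to O_{\mathcal{C},x} \to R_\mathfrak{p} \]
also equals the identity morphism. From this we see that $\pi^*(\mathfrak{p}) \subset \mathfrak{m}_x^2$ implies that $\mathfrak{p} \subset \mathfrak{p}^2$, which is absurd for maximal ideals in a Dedekind domain. Hence $\pi^*(\mathfrak{p}) \not\subset \mathfrak{m}_x^2$. We shall show that this implies that $P(\mathfrak{p})$ is non-singular. Considering an affine neighbourhood of $x$, one sees that $O_{\mathcal{C}_\mathfrak{p},x} = O_{\mathcal{C},x}/\pi^*(\mathfrak{p}_\mathfrak{p}) O_{\mathcal{C},x}$. We can write $\mathfrak{p}_\mathfrak{p} = t R_\mathfrak{p}$ for some $t \in R_\mathfrak{p}$ as $R_\mathfrak{p}$ is a DVR, and
\[ \pi^*(t) = a_1 f_1 + a_2 f_2 \bmod \mathfrak{m}_x^2 \]
for some $a_1, a_2 \in O_{\mathcal{C},x}$ and an $O_{\mathcal{C},x}/\mathfrak{m}_x$-basis $f_1, f_2$ of $\mathfrak{m}_x/\mathfrak{m}_x^2$ (here we use that $\mathcal{C}$ is regular). As $\pi^*(t) \notin \mathfrak{m}_x^2$ we have without loss of generality that $a_1 \notin \mathfrak{m}_x$, so $a_1 \in O_{\mathcal{C},x}^{\times}$. We can hence write
\[ \mathfrak{m}_x = \pi^*(t) O_{\mathcal{C},x} + f_2 O_{\mathcal{C},x} + \mathfrak{m}_x^2 \]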


Definition 2.1.6. Suppose that $C/K$ is a non-singular projective curve over $K$, where $R$ is still assumed to be Dedekind. A regular arithmetic surface $\mathcal{C}$ with generic fibre $C/K$ that is proper over $R$ is called a regular proper model for $C/K$.

Corollary 2.1.7. Suppose that $C/K$ is a non-singular projective curve and $\mathcal{C}$ a regular proper model for $C/K$. Then $\mathcal{C}^0(R) = C(K)$, where $\mathcal{C}^0 \subset \mathcal{C}$ is the largest subscheme smooth over $R$.

This shows that the largest smooth subscheme of a regular proper model obeys the particular instance of the Néron mapping property, so it may be a good candidate for a Néron model. Of course, if a proper regular model $\mathcal{C}/R$ exists, any regular proper scheme $X \to \mathcal{C}$ with the same generic fibre as $\mathcal{C}$ is also a proper regular model. We are looking for a Néron model, so if $C/K$ is elliptic we desire $\mathcal{C}$ to be sufficiently small for the group law on $C/K$ to extend to $\mathcal{C}^0/R$.

Definition 2.1.8. A proper regular model C for C/K is called minimal when for every other proper regular model X for C/K the isomorphism XK → CK of the generic fibres extends to an R-morphism X → C.

When R is a DVR, the generic point {η} is open, so the map on the generic fibres XK → CK defines a

rational map X → C. When R is merely Dedekind, it is also true that the isomorphism of the generic fibres between two proper regular models X and C of a non-singular curve determines a rational map X → C, but this is less trivial (see Liu [30, Proposition 9.3.19]). Hence in all cases, the obtained morphism X → C in the previous definition is uniquely determined.

A minimal proper regular model is the unique one that every proper regular model factors through in the right way. When C has positive genus, it always exists.

Theorem 2.1.9. For every non-singular projective curve C/K of genus at least 1, a minimal proper regular model exists.

Proof. See [28, Theorem 4.4].

By definition of the minimality condition, if $\mathcal{C}$ is the minimal proper regular model for $C/K$, we see that any $K$-automorphism $\tau : C \to C$ extends to an $R$-automorphism $\mathcal{C} \to \mathcal{C}$. Moreover, if $x \in \mathcal{C}^0$, the largest subscheme that is smooth over $R$, then for any open $x \in U \subset \mathcal{C}^0$, we see that $\tau(U)$ is open in $\mathcal{C}$ as $\mathcal{C}^0$ is open, and hence $\tau(U)$ is also a smooth $R$-scheme, so $\tau(x) \in \mathcal{C}^0$. This shows that $\tau$ also extends to give a morphism $\mathcal{C}^0 \to \mathcal{C}^0$. In particular, if $C$ is an elliptic curve, the addition maps $\tau_P : Q \mapsto Q + P$ extend to $\mathcal{C}$ and $\mathcal{C}^0$. We hope now to have provided sufficiently compelling evidence for the following theorem.

Theorem 2.1.10. Suppose that $E/K$ is an elliptic curve and $\mathcal{C}/R$ is a minimal proper regular model for $E/K$. Then $\mathcal{C}^0$ is a Néron model for $E/K$.

Proof. See [56, IV.6].

The task of constructing Néron models is thus reduced to finding the minimal proper regular model. An example of a Néron model can be found in Section 2.3.

The following theorem will be of crucial importance for computing conductors of elliptic curves in practice. It also shows the usefulness of Néron models and minimal proper models. When $R$ is a DVR with residue field $k$, we denote by $m(E/K)$ the number of irreducible components of the special fibre over $\overline{k}$ of a minimal proper regular model of $E/K$.

Theorem 2.1.11 (Ogg's formula). Suppose that $R$ is a DVR. Then the conductor exponent of $E/K$ satisfies
\[ f(E/K) = v_K(\Delta(E/K)) - m(E/K) + 1. \]

Proof. A detailed exposition of the proof in the case where the residue field of K has characteristic greater than 3 can be found in [56, IV.11]. See Saito’s paper [51] for a full proof in a more general case and Liu’s work [29] for an exposition of Saito’s proof for elliptic curves.
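To make Ogg's formula concrete, the following Python sketch evaluates its right-hand side from the valuation of the minimal discriminant and the Kodaira type of the special fibre. It is an illustration and not part of the original text: the function names `num_components` and `ogg_exponent` are hypothetical, and the component counts are the standard values from the Kodaira-Néron classification (1 for I_0, n for I_n with n ≥ 1, then 1, 2, 3 for II, III, IV, n + 5 for I_n^*, and 7, 8, 9 for IV^*, III^*, II^*). The two sample curves and their local data are well-known examples that are assumed here, not derived.

```python
import re

# Component counts m for the Kodaira types that do not depend on n.
# These are the standard values from the Kodaira-Neron classification.
_FIXED_COMPONENTS = {"II": 1, "III": 2, "IV": 3, "IV*": 7, "III*": 8, "II*": 9}


def num_components(kodaira: str) -> int:
    """Return m for a Kodaira symbol such as 'I0', 'I5', 'III' or 'I2*'."""
    if kodaira in _FIXED_COMPONENTS:
        return _FIXED_COMPONENTS[kodaira]
    match = re.fullmatch(r"I(\d+)(\*?)", kodaira)
    if match is None:
        raise ValueError(f"unrecognised Kodaira symbol: {kodaira!r}")
    n = int(match.group(1))
    if match.group(2):      # type I_n^*
        return n + 5
    return max(n, 1)        # I_0 has one component, I_n has n for n >= 1


def ogg_exponent(v_disc: int, kodaira: str) -> int:
    """Right-hand side of Ogg's formula: v_K(Delta) - m + 1.

    v_disc must be the valuation of the *minimal* discriminant.
    """
    return v_disc - num_components(kodaira) + 1


if __name__ == "__main__":
    # y^2 + y = x^3 - x^2 - 10x - 20 at p = 11: v(Delta) = 5, type I_5.
    print(ogg_exponent(5, "I5"))    # 5 - 5 + 1 = 1
    # y^2 = x^3 - x at p = 2: Delta = 64, so v(Delta) = 6, type III.
    print(ogg_exponent(6, "III"))   # 6 - 2 + 1 = 5
```

The two values agree with the conductors 11 and 32 = 2^5 of these curves. The sketch takes the Kodaira type as input; determining that type is the job of Tate's algorithm.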

In order to describe the conductor of an elliptic curve, we now concentrate on finding minimal proper regular models.

2.2. Weil divisors on an arithmetic surface

In this section, we introduce Weil divisors on arithmetic surfaces. They are a tool for doing intersection theory on these surfaces in the next section, which in turn helps to determine when a minimal model has been found: to find the minimal model one needs to compute intersection indices (see Definition 2.3.1). Therefore, we have chosen to study this in detail. In this section, we fix an arithmetic surface π : C → Spec R over a Dedekind domain R of characteristic zero.

Since $C$ is integral, normal and Noetherian (for it is excellent), every local ring $\mathcal{O}_{C,x}$ of dimension 1 is a DVR. Let $X \subset C$ be a closed subscheme. By Noetherianity, $X$ has a unique decomposition

\[ X = \bigcup_{i=1}^{n} F_i \]

as a topological space into distinct closed irreducible subspaces. For each $F_i$ we have a unique ideal sheaf $I_i = \{f \in \mathcal{O}_X \mid f \notin \mathcal{O}_{X,x}^{\times} \text{ for all } x \in F_i\}$ that makes $(F_i, \mathcal{O}_X/I_i)$ into a closed reduced subscheme (cf. [22, II Proposition 5.9]). These $F_i$ are thus integral; denote their generic points by $\eta_i$. We assume that each $F_i$ has dimension 1. Then looking at an open affine subscheme $\operatorname{Spec} A \subset C$ containing $\eta_i$, the fact that $\dim C = 2$ and $\dim \overline{\{\eta_i\}} = 1$ shows that $\mathcal{O}_{C,\eta_i} = A_{\eta_i}$ has Krull dimension 1. By the above, this forces the local rings $\mathcal{O}_{C,\eta_i}$ to be DVRs.

Definition 2.2.1. Let $X \subset C$ be a closed subscheme corresponding via [22, II Proposition 5.9] to the ideal sheaf $I$, such that in the decomposition $X = \bigcup_{i=1}^{n} F_i$ into integral subschemes, each $F_i$ is a curve. We define

\[ v_{F_i}(X) := v_{\mathcal{O}_{C,\eta_i}}(I_{\eta_i}), \]

where $v_{\mathcal{O}_{C,\eta_i}}$ is the normalised valuation on the DVR $\mathcal{O}_{C,\eta_i}$ and $\eta_i$ is as above.

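Note that $F_i \subset X$ implies that $I_{\eta_i}$ is contained in the maximal ideal of $\mathcal{O}_{C,\eta_i}$ (the one corresponding to $\eta_i$), so $v_{F_i}(X) \geq 1$ for each $i$.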
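As a small illustration of Definition 2.2.1, consider the following worked example. It is not taken from the text: the surface (the projective line over $\mathbb{Z}_p$, viewed through its affine chart $\operatorname{Spec} \mathbb{Z}_p[t]$) and the ideal $(p^2 t)$ are assumptions chosen only to keep the computation short.

```latex
% Assumed setting: R = \mathbb{Z}_p, C = \mathbb{P}^1_R, chart \operatorname{Spec} A with A = \mathbb{Z}_p[t].
% X is a closed subscheme of C whose restriction to the chart is \operatorname{Spec} A/I.
\[
  A = \mathbb{Z}_p[t], \qquad I = (p^2 t), \qquad
  X \cap \operatorname{Spec} A = V(p) \cup V(t).
\]
% F_1 = V(p) is the special fibre of the chart, with generic point \eta_1 = (p).
% F_2 = V(t) is the horizontal section t = 0, with generic point \eta_2 = (t).
\[
  v_{F_1}(X) = v_{(p)}(p^2 t) = 2, \qquad
  v_{F_2}(X) = v_{(t)}(p^2 t) = 1 .
\]
% On this chart the associated formal sum is therefore 2 F_1 + F_2.
```

Both components are curves, and both multiplicities are at least 1, as the remark after the definition predicts.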

Proposition 2.2.2. The scheme $X$ is uniquely determined by the formal sum $\sum_{i=1}^{n} v_{F_i}(X) F_i$.

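Proof. Clearly $X$ is determined by the sum as a topological space. We will show that for every open affine $\operatorname{Spec} A \subset C$, the structure sheaf $\mathcal{O}_{X \cap \operatorname{Spec} A}$ is uniquely determined by the formal sum. Let $I$ be the ideal in $A$ such that $X \cap \operatorname{Spec} A = \operatorname{Spec} A/I$ (then $\mathcal{O}_{X \cap \operatorname{Spec} A} = \mathcal{O}_C/\widetilde{I}$). The set of integral curves in $\operatorname{Spec} A$ corresponds to the set of minimal nonzero prime ideals of $A$: if $\mathfrak{p} \subset A$ is such a prime, then $\overline{\{\mathfrak{p}\}}$ is an integral curve, and conversely, if $Y$ is an integral curve in $\operatorname{Spec} A$, its generic point is such a prime ideal. Because $C$ is Noetherian and normal and $\mathcal{O}_{C,x} = A_x$ is a DVR for each minimal nonzero prime $x$ of $A$, the ring $A$ is a Krull ring. Therefore,

\[ I = \bigcap_{\substack{\mathfrak{p} \in \operatorname{Spec} A \\ \text{minimal nonzero}}} I_{\mathfrak{p}}, \]

so $I$ is determined by its localisations at the minimal nonzero prime ideals. This is precisely what we needed to show.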
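The last step of the proof can be checked by hand in a simple case. The example below is a sketch under assumptions that do not appear in the text: the chart $\operatorname{Spec} \mathbb{Z}_p[t]$ of the projective line over $\mathbb{Z}_p$ and the principal ideal $I = (p^2 t)$. All intersections are taken inside the fraction field of $A$.

```latex
% Assumed setting: A = \mathbb{Z}_p[t], a unique factorisation domain, and I = (p^2 t).
% The height-one primes \mathfrak{q} of A are generated by irreducible elements.
\[
  I A_{(p)} = p^2 A_{(p)}, \qquad
  I A_{(t)} = t A_{(t)}, \qquad
  I A_{\mathfrak{q}} = A_{\mathfrak{q}} \quad \text{for every other height-one prime } \mathfrak{q}.
\]
% Intersecting the localisations recovers I from the multiplicities 2 and 1 alone:
\[
  \bigcap_{\mathfrak{q}} I A_{\mathfrak{q}}
  = \bigl\{ f \in \operatorname{Frac}(A) :
      v_{(p)}(f) \ge 2,\ v_{(t)}(f) \ge 1,\ v_{\mathfrak{q}}(f) \ge 0 \text{ otherwise} \bigr\}
  = p^2 t \, A = I .
\]
```

So in this case the formal sum $2\,V(p) + V(t)$ does determine the ideal, and hence the subscheme, on the chart.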