The Effective Topos and its Sheaf Subtoposes



Joost van Dijk

May 16, 2017

Master project Mathematics

Supervisor: dr. Jaap van Oosten

Second reviewer: dr. Benno van den Berg

Third reviewer: dr. prof. Lenny Taelman

KdV Instituut voor Wiskunde

Faculteit der Natuurwetenschappen, Wiskunde en Informatica Universiteit van Amsterdam


Abstract

In this thesis two toposes (categories very similar to the category of sets) will be studied: One which gives the ‘world of mechanically computable mathematics’ and another which gives the ‘world of hyperarithmetical mathematics’. In particular the connections between arithmetic in such a topos and realizability will be studied.

Information

Title: The effective topos and its sheaf subtoposes
Author: Joost van Dijk, joost.vandijk@student.uva.nl, 10202323
Supervisor: dr. Jaap van Oosten
Second reviewer: dr. Benno van den Berg
Third reviewer: dr. prof. Lenny Taelman
End date: May 16, 2017

Korteweg de Vries Instituut voor Wiskunde
Universiteit van Amsterdam
Science Park 904, 1098 XH Amsterdam
http://www.science.uva.nl/math


Introduction

Topos theory straddles the line between mathematics and logic. Since I love both mathematics and logic, topos theory is magical to me. Accordingly, my master's thesis had to be in that subject. I decided to write my thesis under the supervision of Jaap van Oosten, and he wanted me to look into an article he wrote a while ago (see [35]).

The specific category (a topos) which is studied in the article is a subcategory (subtopos) of a category called the effective topos. An effective procedure is an operation for which the outcome can be mechanically calculated by a machine in finitely many steps. A topos is a category which is finitely complete, cartesian closed and which has a subobject classifier. Crucially, there is a way to reason ‘within’ any topos in an intuitionistic way. The way of reasoning within the effective topos is heavily connected to effective procedures via a reasoning construct named realizability. For this reason, the inventor of the effective topos, Martin Hyland, called it ‘the world of effective mathematics’ in [9].

On any topos there can be defined a ‘Lawvere-Tierney topology’, which induces a ‘notion of sheaves’. This notion of sheaves induces a subtopos of the original topos. This generalizes the situation between the category of presheaves on a small category C and the subcategory of sheaves on a site of C: both are toposes and a Lawvere-Tierney topology generalizes the notion of small sites. The generalization is useful for the effective topos, which is not a category of presheaves or sheaves induced from a site. The subtoposes of sheaves of the effective topos induced by Lawvere-Tierney topologies are intricately connected to computability. In the article of Van Oosten one such subtopos of sheaves is shown to be linked to functions and sets which are said to be ‘hyperarithmetical’. The class of hyperarithmetical functions (respectively sets) is an extension of the class of computable functions (respectively sets). In section 2.5 it will be seen that this gives a reason to call this topos of sheaves ‘the world of hyperarithmetical mathematics’.

In my thesis I describe the knowledge needed to understand what the article says. In writing this thesis, I found that very many things can be said about topos theory, perhaps even too much. But the more I learn about it, the more I want to expand my sphere of knowledge to this area.

A rough outline of the chapters of my thesis is as follows.

Chapter 0 is used to introduce three topics. The sections of the form 0.1.a describe the theory of computability to the extent needed for this thesis. Thus it is explained there what computable functions are, Kleene’s enumeration theorem is given, hyperarithmetical functions are defined and Church’s thesis is explained. Sections of the form 0.2.a introduce the reader to intuitionistic reasoning in the form of the Brouwer-Heyting-Kolmogorov interpretation of intuitionistic logic, Kleene’s realizability of intuitionistic arithmetic, and the Russian constructive school of computable mathematics led by Markov. The sections of the form 0.3.a explain the basics of topos theory. The definitions of lattices, Heyting algebras, and toposes are given. Then it is shown that the subobjects of an object of a topos form a Heyting algebra. This Heyting algebra structure enables an interpretation of intuitionistic logic in any topos. Finally, the partial map classifier of a topos is given.

Chapter 1 examines the structure of the effective topos. In particular, the effective tripos is defined in section 1.1, followed by an interpretation of logic inside the effective tripos in section 1.2. Then comes the construction of the effective topos as a category in 1.3 and then as a topos in 1.4. The internal logic associated to the effective topos is given in 1.5. The last section of this chapter is used to show the connection between the arithmetic of the effective topos and realizability.

Chapter 2 will discuss the theory of sheaf subtoposes and in particular the specific sheaf subtopos of the effective topos studied by Van Oosten. Fundamental is the concept of Lawvere-Tierney topologies. In section 2.1 the definition is given. In section 2.2 a definition of sheaves for a Lawvere-Tierney topology is given, which is then shown to induce a subtopos of sheaves. In the following four sections the Lawvere-Tierney topologies for the effective topos are studied. In section 2.3 a specific subtopos of sheaves first defined by Pitts is studied. Then in sections 2.4 and 2.5 we follow Van Oosten in giving the arithmetic of this subtopos and its connection to hyperarithmetical functions and sets. In the final section a non-standard model of arithmetic due to Skolem is given. It is usually constructed in sets, but Van Oosten has shown that it also exists in the sheaf subtopos he studied.

Finally, I would like to mention that there is an index, a nomenclature, and a bibliography, which can all be found at the end of this thesis.

A word of thanks

A journey of a thousand miles begins with a single step. This thesis hopefully represents such a first step. But no journey can be made without help. Foremost I want to thank my teacher Jaap van Oosten, whose help made this excursion into the effective topos possible. In addition, I would like to express my gratitude to Benno van den Berg, who gave invaluable advice after reading an earlier version of this thesis. Moreover, I am indebted to everybody who has contributed to topos theory, for any thought I have on the subject must surely, in some form, be theirs originally. Furthermore, my parents and brother have been supporting me as only family can, and they have helped me to weather some storms which I found on my path. Support also came from Hessel Posthuma, who assisted me in tackling the administration of the University of Amsterdam, for Jaap comes from the University of Utrecht. And I want to thank the reader, who is investing time in reading my thesis.


0 Preliminaries

The purpose of this chapter is to explain three topics needed to understand the effective topos: computability theory, realizability and topos theory, treated in sections 0.1, 0.2 and 0.3 respectively. What relevance do these topics have to the effective topos? Let me give a rough sketch of the objects and arrows of the effective topos in order to explain their importance.

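Before we start, I would like to fix the notation $\langle x, y\rangle$ for a (computable) bijective pairing function $\mathbb{N} \times \mathbb{N} \to \mathbb{N}$, and $(-)_1$, $(-)_2$ for the (computable) ‘projection’ functions $\mathbb{N} \to \mathbb{N}$. With this I mean that these functions satisfy

$$(\langle a, b\rangle)_1 = a, \qquad (\langle a, b\rangle)_2 = b, \qquad \langle (c)_1, (c)_2\rangle = c.$$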

For more about these functions and computability see section 0.1.1.
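As an illustration, one possible choice of such a pairing function is the Cantor pairing. The thesis does not fix a particular pairing, so the choice below and the names `pair` and `unpair` are assumptions made only for this sketch.

```python
from math import isqrt


def pair(a: int, b: int) -> int:
    """Cantor pairing <a, b>: a bijection from N x N to N (assumed choice)."""
    s = a + b
    return s * (s + 1) // 2 + b


def unpair(c: int) -> tuple[int, int]:
    """Inverse of pair: returns the two projections ((c)_1, (c)_2)."""
    # w is the largest s with s * (s + 1) / 2 <= c
    w = (isqrt(8 * c + 1) - 1) // 2
    b = c - w * (w + 1) // 2
    a = w - b
    return a, b
```

With this choice the three equations above become `unpair(pair(a, b)) == (a, b)` and `pair(*unpair(c)) == c`. Both functions use only arithmetic and an integer square root, so they are computable.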

The effective topos is one of the main categories in this thesis. An object of this category consists of a set $X$ equipped with a function $=_X \colon X \times X \to \mathcal{P}(\mathbb{N})$ such that there are (total) computable functions $f, g \colon \mathbb{N} \to \mathbb{N}$ that satisfy for $x, y, z \in X$ and $a, b \in \mathbb{N}$ two requirements:

• If $a \in (x =_X y)$ and $b \in (y =_X z)$ then $f(\langle a, b\rangle) \in (x =_X z)$.

• If $a \in (x =_X y)$ then $g(a) \in (y =_X x)$.

Morphisms from one such pair $(X, =_X)$ to another one $(Y, =_Y)$ are equivalence classes of functions $F \colon X \times Y \to \mathcal{P}(\mathbb{N})$ such that there are total computable functions $f, g, h, l \colon \mathbb{N} \to \mathbb{N}$ that satisfy for $x, x' \in X$, $y, y' \in Y$ and $a, b, c \in \mathbb{N}$ the following requirements.

• If $a \in F(x, y)$, $b \in (x =_X x')$ and $c \in (y =_Y y')$ then $f(\langle a, b, c\rangle) \in F(x', y')$.

• If $a \in F(x, y)$ then $(g(a))_1 \in (x =_X x)$ and $(g(a))_2 \in (y =_Y y)$.

• If $a \in F(x, y)$ and $b \in F(x, y')$ then $h(\langle a, b\rangle) \in (y =_Y y')$.

• If $a \in (x =_X x')$ then $l(a) \in \bigcup \{F(x, y) \mid y \in Y\}$.

A rationalization of these objects and morphisms can be given as follows. Erase in your mind the numbers $a$, $b$ and $c$ and the computable functions $f, g, h, l$ for a moment, and suppose that $F$ really is a function $f \colon X \to Y$ and that $=_X$ and $=_Y$ are real equalities. The first set of statements then reduces to ‘equality is transitive and symmetric’. The second set of statements reduces to ‘if $x = x'$ then $f(x) = f(x')$’, ‘$f(x) = y$ implies $x = x$ and $y = y$’, ‘there is only one outcome $f(x)$ for any input $x$’ and ‘on input $x$ there is an output $f(x)$’. The numbers $a$, $b$ and $c$ and the computable functions $f, g, h, l$ try to follow the validity of these ‘statements’ for constructive purposes.

The approach, originally from Hyland [9] and Pitts [24], is based on recursive realizability by Kleene [13]. Recursive realizability describes, in some sense (section 1.6), what happens in the case of the object $(\mathbb{N}, =_{\mathbb{N}})$ where

$$(a =_{\mathbb{N}} b) = \begin{cases} \{a\} & \text{if } a = b, \\ \emptyset & \text{if } a \neq b. \end{cases}$$

This, however, is of course not the way Kleene originally defined realizability. Hyland used the original definition of recursive realizability to build the effective topos, and it is useful to first study it as a simple case. This way, the ideas of the effective topos will be easier to understand. Pitts generalized the construction of the effective topos to a very general setting, which he named tripos theory after a suggestion of his PhD supervisor Johnstone.

The whole idea described above can also be carried out using hyperarithmetical functions instead of computable functions (see subsection 0.1.2 and the last four sections of chapter 2).

In order to fully understand what the effective topos is really about, we first need to concentrate on computability theory, realizability and topos theory. This will be done in this chapter in sections 0.1, 0.2 and 0.3.

As the main goal of this chapter is to make sure my readers understand the subject, I will give some examples that will not be used later on.

I will not write down any proofs.

0.1 Computability theory

From computability theory we only need a few elementary definitions and results. The first subsection is about primitive recursive functions, computable functions and Kleene’s enumeration theorem. The second subsection deals with different ways to define hyperarithmetical functions. In the third subsection some words are spent on Church’s thesis.

0.1.1 Computable functions

In this subsection we are going to focus on computable functions. All functions that are considered in this section will be partial functions from $\mathbb{N}^k$ to $\mathbb{N}$. The intuition behind a computable function is that it is mechanically computable, say, that there is a computer program (an algorithm) capable of producing an output $f(a)$ when given an input $a$, without using outside information. The computable functions we will be working with are partial functions. In terms of computer programs we would like to think that the program outputs $f(a)$ in finitely many steps for any $a$ in the domain of $f$ and does not terminate when $a$ is outside the domain of $f$. In this way any computer program that accepts $a \in \mathbb{N}^k$ as input and outputs $f(a) \in \mathbb{N}$ if the program terminates (if this happens at all) will induce a computable partial function $f \colon \mathbb{N}^k \to \mathbb{N}$.

There are several ways to formalize computable functions. These formalizations are equivalent in the sense that they define the same class of functions. We will look at a formalization using µ-recursive functions. After that we will discuss the all-important Kleene enumeration theorem, which says that every partially computable function can be represented by a natural number.

We start by defining primitive recursion.

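Definition 0.1. Let $f \colon \mathbb{N}^k \to \mathbb{N}$ and $h \colon \mathbb{N}^{k+2} \to \mathbb{N}$ be partial functions. Then a new partial function $g \colon \mathbb{N}^{k+1} \to \mathbb{N}$ can be defined by primitive recursion in the following way:

$$g(0, \vec{a}) := f(\vec{a}), \qquad g(n+1, \vec{a}) := h(n, g(n, \vec{a}), \vec{a}).$$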
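The recursion scheme can be read as a higher-order operation that turns $f$ and $h$ into $g$. The following is a minimal sketch for total functions only; the names `prim_rec`, `add` and `mul` are introduced here for illustration and are not part of the thesis. In the code, `h` receives the counter, the previous value and the parameters, in that order.

```python
from typing import Callable


def prim_rec(f: Callable[..., int], h: Callable[..., int]) -> Callable[..., int]:
    """Return g with g(0, *a) = f(*a) and g(n + 1, *a) = h(n, g(n, *a), *a)."""
    def g(n: int, *a: int) -> int:
        acc = f(*a)                # value of g(0, a)
        for i in range(n):
            acc = h(i, acc, *a)    # value of g(i + 1, a), computed from g(i, a)
        return acc
    return g


# Addition: add(0, a) = a and add(n + 1, a) = add(n, a) + 1.
add = prim_rec(lambda a: a, lambda n, prev, a: prev + 1)

# Multiplication: mul(0, a) = 0 and mul(n + 1, a) = add(mul(n, a), a).
mul = prim_rec(lambda a: 0, lambda n, prev, a: add(prev, a))
```

Here `add` is built from the identity and the successor, and `mul` is built from the zero function and `add`, which mirrors how the usual arithmetic operations are obtained by repeated primitive recursion.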

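Definition 0.2. The following functions from $\mathbb{N}^k$ to $\mathbb{N}$ are the basic primitive recursive functions:

• The zero function $0 \colon \mathbb{N}^k \to \mathbb{N} \colon \vec{a} \mapsto 0$.

• The successor function $S \colon \mathbb{N}^k \to \mathbb{N} \colon \vec{a} \mapsto a_1 + 1$.

• The projection functions $\pi_i \colon \mathbb{N}^k \to \mathbb{N} \colon \vec{a} \mapsto a_i$ for all $1 \le i \le k$.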

The class of primitive recursive functions is the closure of the above functions (for any $k \ge 0$) under composition and primitive recursion. That means:

• Composition: If $f \colon \mathbb{N}^k \to \mathbb{N}$ and $g_i \colon \mathbb{N}^{n_i} \to \mathbb{N}$ are primitive recursive functions then $f(g_1, \ldots, g_k) \colon \mathbb{N}^m \to \mathbb{N}$ is primitive recursive, where $m := \sum_{i=1}^{k} n_i$.

• Primitive recursion: If $f \colon \mathbb{N}^k \to \mathbb{N}$ and $h \colon \mathbb{N}^{k+2} \to \mathbb{N}$ are primitive recursive then the function $g \colon \mathbb{N}^{k+1} \to \mathbb{N}$ defined by primitive recursion is also primitive recursive.

A primitive recursive set is a subset $A \subseteq \mathbb{N}^n$ such that the indicator function $c_A \colon \mathbb{N}^n \to \mathbb{N}$ is primitive recursive.

The class of µ-recursive functions is the closure of the class of primitive recursive functions under minimization.

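Definition 0.3. Let $f \colon \mathbb{N}^{k+1} \to \mathbb{N}$ be a partial function. Define the minimization $\mu(f) \colon \mathbb{N}^k \to \mathbb{N}$ of $f$ as follows. Consider the set

$$A := \{y \in \mathbb{N} \mid (i, \vec{a}) \in \mathrm{dom}(f) \text{ for } 0 \le i \le y \text{ and } f(y, \vec{a}) = 0\}.$$

Take the minimization to be

$$\mu(f)(\vec{a}) := \begin{cases} \min(A) & \text{if } A \neq \emptyset, \\ \text{undefined} & \text{if } A = \emptyset. \end{cases}$$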
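Operationally, minimization is an unbounded search. The sketch below is one way to phrase it, with the hypothetical names `mu` and `ceil_sqrt`. Being ‘undefined’ shows up as non-termination, either because no zero exists or because an earlier call `f(i, *a)` does not return.

```python
from typing import Callable


def mu(f: Callable[..., int]) -> Callable[..., int]:
    """Unbounded minimization: mu(f)(*a) is the least y with f(y, *a) == 0."""
    def search(*a: int) -> int:
        y = 0
        # f(0, a), f(1, a), ... are evaluated in order, so every earlier
        # value must be defined before a zero at y can be reported.
        while f(y, *a) != 0:
            y += 1
        return y
    return search


# Example: the least y with y * y >= a (the square root of a, rounded up).
ceil_sqrt = mu(lambda y, a: 0 if y * y >= a else 1)
```

For instance `ceil_sqrt(10)` searches $y = 0, 1, 2, 3, 4$ and stops at the first $y$ with $y^2 \ge 10$. Calling `mu` on a function that never returns zero would loop forever, which is exactly how partial functions arise from total ones.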


Definition 0.4. The class of µ-recursive functions or partially computable functions is the closure of the primitive recursive functions under minimization, composition and primitive recursion.

A computable set is a subset $A \subseteq \mathbb{N}^n$ such that the (total) indicator function $c_A \colon \mathbb{N}^n \to \mathbb{N}$ is computable.

Some examples. I will not prove that they are partially computable.

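Example 0.5. (i): Many of the usual operations and sets of natural numbers are computable. A few examples are $+, \cdot \colon \mathbb{N}^2 \to \mathbb{N}$, $(-)! \colon \mathbb{N} \to \mathbb{N}$, $n \mapsto n$-th prime, $\{(x, y) \mid x < y\}$, $\{x \mid x \text{ is prime}\}$, $\mathrm{sg} \colon \mathbb{N} \to \mathbb{N} \colon n \mapsto c_{n > 0}$, and $\dot{-} \colon \mathbb{N}^2 \to \mathbb{N} \colon (a, b) \mapsto (a - b)\, c_{a > b}$.

Actually, by Church’s thesis all mechanically calculable functions are computable. See subsection 0.1.3.

(ii): For any natural number $n$, there is a computable bijective pairing function $\langle -, \ldots, - \rangle \colon \mathbb{N}^n \to \mathbb{N}$ and there are computable ‘projection’ functions $(-)_1, \ldots, (-)_n \colon \mathbb{N} \to \mathbb{N}$ such that

$$a = \langle (a)_1, \ldots, (a)_n \rangle, \qquad (\langle a_1, \ldots, a_n \rangle)_i = a_i$$

for each $i$.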

Definition 0.6. If $f, g \colon \mathbb{N}^n \to \mathbb{N}$ are two partially computable functions then $f \simeq g$ will denote that $\mathrm{dom}(f) = \mathrm{dom}(g)$ and $f(x) = g(x)$ for all $x \in \mathrm{dom}(f)$. Moreover, $x \in \mathbb{N}^n$ is said to be defined for $f$, or $f(x)$ is defined, if $x \in \mathrm{dom}(f)$. This is denoted by $f(x){\downarrow}$.

One of the examples has shown that it is possible to encode $k$-tuples of natural numbers $(n_1, \ldots, n_k)$ in a single natural number $\langle k, \langle n_1, \ldots, n_k \rangle \rangle$. This fact enabled Kleene in 1943 [12] to build a primitive recursive surjective assignment from the natural numbers to the partial computable functions from $\mathbb{N}^k$ to $\mathbb{N}$. This is called Kleene’s enumeration theorem. In the same paper Kleene showed that the function $\varphi_e$ associated to a number $e$ can be written in a certain form. That is called Kleene’s normal form theorem.

Theorem 0.7 (Kleene’s enumeration and normal form theorem). There is a primitive recursive relation $T \subseteq \mathbb{N}^3$ and a primitive recursive function $U \colon \mathbb{N} \to \mathbb{N}$ such that a partial function $f \colon \mathbb{N}^n \to \mathbb{N}$ is computable if and only if there is a number $e$ such that for all $(a_1, \ldots, a_n) \in \mathbb{N}^n$, $f(a_1, \ldots, a_n) \simeq U(\mu y.\, T(e, \langle a_1, \ldots, a_n \rangle, y))$. Note that $f(a_1, \ldots, a_n)$ is defined if and only if $T(e, \langle a_1, \ldots, a_n \rangle, y)$ holds for some $y$.

Definition 0.8. The number $e$ in the theorem is called an index for $f$. The index is not necessarily unique. For any $e$, we write $e(\vec{x}) := \varphi_e(\vec{x}) := U(\mu y.\, T(e, \langle x_1, \ldots, x_n \rangle, y))$. The predicate $T$ is called Kleene’s $T$-predicate, and the function $U$ the extraction function. Let us denote an arbitrary index of a partial computable function $f$ by $\lambda^{pc}\vec{x}.f(\vec{x})$.

Note, however, that every partial computable function has infinitely many indices.

Proof. A sketch of the proof of the enumeration part: Any partially computable function $f$ is obtained from the basic functions by a finite sequence of compositions, primitive recursions and minimizations. Each of these steps gives a function, so this gives a finite sequence of functions $(f_1, \ldots, f_\ell)$ such that

• $f_1$ is a basic primitive recursive function.

• Each $f_i$ is either a basic primitive recursive function or the result of composition, primitive recursion or minimization applied to the $f_j$ with $j < i$.

• $f_\ell = f$.

If each of the possible applications is properly labeled by a tuple of numbers using pairing functions then we are done. This can be done by recursion. See Normann [22] pages 14-17 for a complete explanation.

0.1.2 Hyperarithmetical sets and functions

In this subsection hyperarithmetical sets and functions will be defined. Three equivalent definitions will be given.

For the first definition it is necessary to know what first-order and second-order formulas of arithmetic are. A formal definition of these kinds of formulas will be given in subsection 0.3.5. For now the following informal definition will be enough.

A formula is a statement in mathematics. Say a statement is ‘basic’ if it is of the form $\vec{a} \in A$ (abbreviated $A(\vec{a})$) for a computable set $A \subseteq \mathbb{N}^n$ for some $n$. The ‘compound statements’ we will now consider are those built up from basic statements by ‘and’ $\wedge$, ‘or’ $\vee$, ‘implies’ $\to$, ‘not’ $\neg$, ‘the existential number quantifier’ $\exists n : \mathbb{N}$, ‘the universal number quantifier’ $\forall n : \mathbb{N}$, ‘the existential set quantifier’ $\exists A \subseteq \mathbb{N}$ and ‘the universal set quantifier’ $\forall A \subseteq \mathbb{N}$. A compound statement $\varphi(n)$ will be called a first-order formula of arithmetic if it is built up in the above way without using the set quantifiers. The compound statements $\varphi(A, x)$ built as above that do use set quantifiers will be called second-order formulas of arithmetic. Again, a formal definition can be found a bit further ahead, in subsection 0.3.5, example 0.47.

First-order and second-order formulas are going to be used to define arithmetical and hyperarithmetical sets.

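Definition 0.9. Let $A \subseteq \mathbb{N}$ be a subset. Then $A$ is called arithmetical if $A = \{a \mid \varphi(a)\}$ for a first-order formula $\varphi(a)$ of arithmetic.

A function $f \colon \mathbb{N} \to \mathbb{N}$ is arithmetical if $\{\langle n, f(n) \rangle \mid n \in \mathbb{N}\}$ is arithmetical.

Definition 0.10. Let $A \subseteq \mathbb{N}$ be a subset. Then $A$ is called hyperarithmetical if

$$A = \{a \mid \exists B \subseteq \mathbb{N}\,(\varphi(B, a))\} = \{a \mid \forall B \subseteq \mathbb{N}\,(\psi(B, a))\}$$

for some formulas $\varphi(B, a)$ and $\psi(B, a)$ that are first-order formulas of arithmetic for every subset $B \subseteq \mathbb{N}$, with $a$ a variable. More precisely, $\exists B \subseteq \mathbb{N}\,(\varphi(B, a))$ and $\forall B \subseteq \mathbb{N}\,(\psi(B, a))$ are second-order arithmetical formulas with only one second-order set quantifier.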


For the first equivalent definition, I will be following Normann [22] chapter 2 who shows hyperarithmetical sets are connected to ‘Kleene computable’ sets.

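Define the following functional:

$$E \colon \mathbb{N}^{\mathbb{N}} \to \mathbb{N} \colon f \mapsto \begin{cases} 1 & \text{if for some } a \in \mathbb{N},\ f(a) > 0, \\ 0 & \text{else.} \end{cases}$$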

This functional cannot in general be mechanically calculated. The following thought process, inspired by a function of Brouwer, gives an argument for this.

Example 0.11. Consider the following function of Brouwer:

$$f(n) := \begin{cases} 0 & \text{if there are no 99 consecutive nines in the decimals of } \pi \text{ up until } n, \\ 1 & \text{if there are 99 consecutive nines in the decimals of } \pi \text{ up until } n. \end{cases}$$

Clearly $f$ is computable, i.e. we can calculate $f(n)$ for every $n$, but we cannot in a reasonable way compute $E(f)$: If we can find an $n_0$ for which $f(n_0) > 0$, then there is no problem: $E(f) = 1$. But if no such $n_0$ can be found, then checking this fact for $f(0), f(1), \ldots$ one after the other with a machine never stops. The argument that either $E(f) = 0$ or $E(f) = 1$ does not help: it does not give us a way to effectively determine $E(f)$ with a machine. The whole point of this is of course not that an answer can be found for the problem of finding 99 consecutive nines, but that problems of which we do not yet know the answers might lead to infinite, never ending procedures looking for answers, while $E(f)$ can only be calculated with a machine if a finite procedure deciding the truth of $\exists n\,(f(n) > 0)$ is known. But as can be seen in subsection 0.1.3, such a procedure does not in general exist. However, in our case, if we were ever to prove that such an $n_0$ does exist, but without providing $n_0$ itself in a calculable way, then it could not be denied that the original algorithm calculating $f(0), f(1), \ldots$ one after the other is an effective procedure: Just proceed until an $m$ is found that satisfies $f(m) > 0$: such an $m$ can be found in at most finitely many steps. This argument might be called a version of Markov’s rule (which is usually described in intuitionistic logic): Markov’s rule says that an algorithm which can be shown not not to stop is effective (however, not necessarily efficient). In the next definition, the definition of calculation is in some way extended to include the answer of such infinite procedures.
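The asymmetry in this example can be made concrete with a small sketch. The function name `search_witness` and the `budget` parameter are hypothetical and only serve the illustration: the search can confirm $E(f) = 1$ by finding a witness, but it can never confirm $E(f) = 0$.

```python
from typing import Callable, Optional


def search_witness(f: Callable[[int], int],
                   budget: Optional[int] = None) -> Optional[int]:
    """Look for the least m with f(m) > 0.

    With budget=None this is the unbounded search from the text: it halts
    exactly when E(f) = 1 and runs forever when E(f) = 0.
    With a finite budget it gives up after `budget` candidates and returns
    None, which says nothing about the value of E(f).
    """
    m = 0
    while budget is None or m < budget:
        if f(m) > 0:
            return m
        m += 1
    return None
```

A returned number is a certificate that $E(f) = 1$. A returned `None` only reports that no witness was found below the budget, which is why the functional $E$ itself is not computed by this procedure.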

We extend the computable functions using two extra clauses. See my source, [22] definition 2.2.2 and 2.2.4 for a more general construction.

Definition 0.12. The Kleene computable functions in $E$ are the functions obtained from the closure of the partially computable functions under the clauses

• If $f \colon \mathbb{N} \times \mathbb{N}^n \to \mathbb{N}$ is total and Kleene computable in $E$, then $g(\vec{a}) := E(x \mapsto f(x, \vec{a}))$ is Kleene computable in $E$. All computable functions defined thus far had an index by Kleene’s enumeration theorem. This new operation is given an as yet unused index.

• The following function is Kleene computable in $E$ and is recursively given an unused index² (it is necessary to give the index beforehand, as the definition can be applied to its own index): $f(e, \vec{a}) \simeq \varphi_e(\vec{a})$.

The new indexes will be called hyperarithmetical indexes to prevent confusion with the earlier defined indexes for computable functions.

A set $A \subseteq \mathbb{N}^n$ is Kleene computable in $E$ if its indicator function $c_A \colon \mathbb{N}^n \to \{0, 1\}$ is Kleene computable in $E$.

The following characterization of hyperarithmetical sets comes (according to page 71 of [22]) from Kleene.

Proposition 0.13. A set is hyperarithmetical if and only if it is Kleene computable in E.

Proof. From [22], see corollary 2.4.23, definition 2.4.1, and the analytical hierarchy defined on page 74, and from [28], page 382 (the new subsection there). This might sound complicated, but really is not. The point is that [22] defines hyperarithmetical sets to be the sets Kleene computable in $E$, and then uses an equivalent definition of the analytical hierarchy, which [28] tells us to be equivalent to the one we use.

There is also a connection between arithmetical sets and a concept called Turing reducibility, which I will not work out any further than saying that $A$ is arithmetical if, for some $n$, $A$ is Turing reducible to $\emptyset^{(n)}$, where $\emptyset^{(n)}$ is the result of applying the jump operator to $\emptyset$ $n$ times. For more information, compare the definition of arithmetical sets on page 258, corollary VIII(d) on page 316 and section 14.4 of [28].

The second equivalent way to define hyperarithmetical sets is obtained from the Suslin-Kleene theorem. The following version was stated on pages 387-391 of Odifreddi’s book [23].

Theorem 0.14 (the Suslin-Kleene theorem). (i): The class of hyperarithmetical sets is the smallest indexed class of sets $C = \{X_a\}_{a \in A}$ with $A \subseteq \mathbb{N}$ and $X_a \subseteq \mathbb{N}$ for each $a$, such that there are partial computable functions $f, g, h \colon \mathbb{N} \to A$ such that

• $X_{f(n)} = \{n\}$ for every natural number $n$.

• $X_{g(n)} = X_n^C$ (the complement in $\mathbb{N}$) for every natural number $n$.

• If $e$ is an index for a total computable function, then $X_{h(e)} = \bigcup_{n \in \mathbb{N}} X_{e(n)}$.

In each case the function is defined in $n$ if it is used.

(ii): The smallest such class is of the following form. Define the set $A_0$ by recursion by saying

• $\langle 0, n \rangle \in A_0$ for every $n$.

• If $n \in A_0$ then $\langle 1, n \rangle \in A_0$.

• If $e$ is considered as an index for a total computable function and if $e(n) \in A_0$ for all $n$, then $\langle 2, e \rangle \in A_0$.

The class $C_0 = \{X_n\}_{n \in A_0}$ is defined by

• $X_{\langle 0, n \rangle} := \{n\}$,

• $X_{\langle 1, n \rangle} := X_n^C$,

• $X_{\langle 2, n \rangle} := \bigcup_{i \in \mathbb{N}} X_{n(i)}$.

² See for instance [22], definition 1.2.19 and definition 2.2.2, on how to do that. Note that definition 1.3.18 is not needed for our definition. Actually, the index itself is not of great importance; what matters is that it exists in a primitive recursive way.

A consequence of combining the two statements is that we get a concrete way to build hyperarithmetical sets. The theorem will be referred to in section 2.5.

0.1.3 Church’s thesis

The famous Entscheidungsproblem of Hilbert (see Ramsey [26] for an early English account) was the question whether there exists an effective procedure to decide the truth of formulas in logic. In 1936 both Church ([3], [4] and [5]) and Turing [32] showed that such a decision procedure does not exist. To prove this, Church developed a notion of (in Church’s words) “effective calculability which is thought to correspond satisfactorily to the somewhat vague notion [of effectively calculable functions]”. Turing, on the other hand, defined the computable real numbers by describing a machine (nowadays called a Turing machine) and gave three kinds of arguments why his (in his words) “‘computable’ numbers include all numbers which would naturally be regarded as computable”. By nature the arguments that Church and Turing gave are not precise mathematical ones and they thus rely on intuition. The assertion that their definition of computable functions correctly characterizes all functions that can be computed by humans or machines is called the Church-Turing thesis, or simply Church’s thesis. I will use the latter name.

I previously defined the partially computable functions by using the primitive recursive functions and µ-recursion. According to Kleene in 1936 ([11] page 343) this definition was due to Herbrand and Gödel. This definition is equivalent to the ones used by Church (see Kleene 1936 [11] page 343) and Turing (see Turing 1936 [32]).

In short,

Church’s thesis 0.15. The µ-recursive functions or partially computable functions define all functions reasonably calculable by hand.

In section 0.2.3 it will be discussed that in some variants of constructive logic, Church’s thesis is true. After that, on page 60 it will be seen that Church’s thesis is true in the effective topos.


0.2 Realizability

The goal of this section is to understand realizability and its connection to constructive mathematics and Church’s thesis. The first subsection explains the Brouwer-Heyting-Kolmogorov interpretation of intuitionistic logic, which is an intuitive explanation of what the logical statements of intuitionistic logic are. The second subsection will explain realizability itself. The third subsection explains the connection between realizability, Church’s thesis, Markov’s principle and the Russian school of mathematics.

0.2.1 The BHK-interpretation of logic

Constructivism is a school in mathematics that interprets the logical connectives in a different way than the logic used by most mathematicians. The resulting logic is called intuitionistic logic. (Beware not to confuse intuitionistic logic with intuitionism, which is a school of constructivists starting with Brouwer.) There is an interpretation of what it ought to mean to prove a statement in intuitionistic logic. The interpretation was implicitly used by Brouwer, and later written down by Heyting in [7], [8] and Kolmogorov [14]. I find the approach used by Kolmogorov [14] the clearest one (see the English translations written down in Mancosu [19]), but I will follow the description found in Troelstra and Van Dalen [31, p.9], which is almost exactly the same.

The Brouwer-Heyting-Kolmogorov interpretation or BHK-interpretation says the following: Let φ and ψ be any statements. Then

• A proof of ‘φ and ψ’ consists of a proof of φ and a proof of ψ.

• A proof of ‘φ or ψ’ consists of a proof of φ or a proof of ψ.

• A proof of ‘φ implies ψ’ consists of a construction that can turn a proof of φ into a proof of ψ.

• There is no proof of ‘absurdity’.

• A proof of ‘not φ’ consists of a proof of ‘φ implies absurdity’. So it is a construction that turns a proof of φ into a contradiction.

• A proof of ‘for some $x \in X$, $\varphi(x)$’ consists of a pair consisting of an element $x \in X$ together with a proof of $\varphi(x)$.

• A proof of ‘for all $x \in X$, $\varphi(x)$’ consists of a construction that, on input $x \in X$, gives a proof of $\varphi(x)$.

To emphasize what it means to have a proof of a universal statement, let us use Kolmogorov’s own (translated) words: “To solve the problem $(x)(a(x))$ means to be able to solve for any given single value $x_0$ of $x$ the problem $a(x_0)$ after a finite number of steps known in advance (prior to the choice of $x_0$).” So the constructions or methods involved need to work prior to any choice of element or proof.

Mathematicians who reason classically would unequivocally accept the proofs in intuitionistic logic. The controversy around constructive mathematics is not about accepting the truth of the above ways of reasoning, but rather about the loss of ways to prove things. For instance, no way is given to prove ‘not not φ implies φ’, so a statement cannot be proven by assuming its negation and showing that this produces a contradiction. Intuitionistic logic does not, however, reject proof by contradiction altogether, for it allows the negative statement ‘not φ’ to be proven by assuming φ and showing it to give a contradiction. So in concrete terms, it would allow the ordinary proof of ‘not there are finitely many prime numbers’ or ‘the square root of 2 is not rational’ (indeed, these are in some sense the very definitions of infinite and irrational), and it would not allow the traditional proof of the intermediate value theorem, because that proof uses proof by contradiction to prove an existence statement (there are constructive versions of the intermediate value theorem, however). This creates the situation that some traditionally equivalent notions are not equivalent in intuitionistic logic. There occurs a splitting of notions. For instance, there are multiple inequivalent notions of fields, called Heyting fields, discrete fields and residue fields (not the traditional algebraic one). Some constructivists close a few of the gaps by assuming new axioms, sometimes with the consequence that the new theory becomes classically untrue. Brouwer could use his new axioms to prove continuity, for instance. The effective topos is in fact connected to a different school of constructivists than Brouwer’s, a Russian school of recursive mathematics, for which all functions from and to $\mathbb{N}$ are computable.

Besides the impossibility of proving a statement from its double negation, there are also other important aspects of the BHK-interpretation. A proof of ‘φ or ψ’ will always directly say which of the two is true. And a proof of ‘for some $x \in X$, $\varphi(x)$’ will always specify the $x$ for which $\varphi(x)$ is true. If this is combined with an honest method for constructions, then intuitionistic proofs ‘construct’ things, hence the name constructivists.

What is an honest method for construction? Traditional mathematics works with hypotheses and reasoning. In the next section we see a form of intuitionistic arithmetic which is called realizability. A proof of a statement will be a natural number called a realizer of the statement. The method of construction will then be a computable function $f \colon \mathbb{N} \to \mathbb{N}$ sending one proof $n$ to another one $f(n)$. This gives a stronger version of the BHK-interpretation. Unlike in ordinary intuitionistic logic, there are statements that can be realized but are classically untrue.

0.2.2 Realizability

Realizability, also called recursive realizability or Kleene’s realizability, gives an interpretation of intuitionistic arithmetic by Kleene [13] using classical mathematics. A more modern account of realizability can be found in chapter 6 of [2]. Kleene’s notion of ‘n realizes φ’ can be inductively defined as follows. Let a be a natural number.

• ‘a realizes φ ∧ ψ’ if a = ⟨n, m⟩ and ‘n realizes φ’ and ‘m realizes ψ’.

• ‘a realizes φ ∨ ψ’ if a = ⟨i, n⟩ and if i = 0 then ‘n realizes φ’ and if i ≠ 0 then ‘n realizes ψ’.

• ‘a realizes φ → ψ’ if a is considered as an index for a partial computable function and for all n ∈ N, if ‘n realizes φ’ then a(n) is defined and ‘a(n) realizes ψ’.

• ‘a realizes ¬φ’ if ‘a realizes φ → (1 = 0)’. That happens if and only if there are no numbers ‘realizing φ’.

• ‘a realizes ∃n(φ(n))’ if a = ⟨n, m⟩ and ‘m realizes φ(n)’.

• ‘a realizes ∀n(φ(n))’ if a is considered as an index for a partial computable function and for all n ∈ N, a(n) is defined and ‘a(n) realizes φ(n)’.

If we want, we can also add new (atomic) statements A by taking a computable subset A ⊆ N and specifying

• ‘a realizes A’ if a ∈ A.

Moreover, this interpretation can be extended⁴ to second-order formulas (quantifying over subsets of N) by adding the rules

• ‘a realizes (i ∈ A)’ if and only if ⟨a, i⟩ ∈ A.

• ‘a realizes ∀A ⊆ N (φ(A))’ if and only if for all A ⊆ N, a realizes φ(A).

• ‘a realizes ∃A ⊆ N (φ(A))’ if and only if there exists an A ⊆ N such that a realizes φ(A).

By using realizability we no longer have the ambiguity mentioned before: proofs are numbers n realizing the statement and constructions are partial computable functions from N to N.
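The clauses above can be tried out on a small fragment. The following is only a sketch under stated assumptions: the Cantor pairing function stands in for the coding ⟨n, m⟩, atomic statements are given by decidable sets of realizers, and a true closed equation s = t is taken to be realized solely by s (the convention used in the example with n = n). Implication and universal quantification are left out, because deciding them would require quantifying over all indices of partial computable functions. All function names are hypothetical.

```python
from math import isqrt

def pair(n, m):
    """Cantor pairing, one standard computable coding <n, m> of N x N into N."""
    return (n + m) * (n + m + 1) // 2 + m

def unpair(a):
    """Inverse of pair: returns the unique (n, m) with pair(n, m) == a."""
    w = (isqrt(8 * a + 1) - 1) // 2
    m = a - w * (w + 1) // 2
    return w - m, m

def atom(members):
    """Atomic statement given by a decidable set of realizers (a predicate on N)."""
    return ("atom", members)

def equals(s, t):
    """Assumed convention: a true closed equation s = t is realized solely by s."""
    return atom(lambda a: s == t and a == s)

def realizes(a, phi):
    """Decide 'a realizes phi' for the fragment built from atoms, and, or, exists."""
    tag = phi[0]
    if tag == "atom":
        return phi[1](a)
    n, m = unpair(a)
    if tag == "and":
        return realizes(n, phi[1]) and realizes(m, phi[2])
    if tag == "or":
        # first component 0 selects the left disjunct, anything else the right one
        return realizes(m, phi[1]) if n == 0 else realizes(m, phi[2])
    if tag == "exists":
        # phi[1] maps a witness n to the instance phi(n)
        return realizes(m, phi[1](n))
    raise ValueError(f"unsupported connective: {tag}")

# 'there exists n with n + n = 6'
double_is_six = ("exists", lambda n: equals(n + n, 6))
print(realizes(pair(3, 6), double_is_six))   # witness 3, and 6 realizes 3 + 3 = 6
print(realizes(pair(2, 4), double_is_six))   # 2 + 2 = 6 is false, so this fails
```

A realizer of an existential statement carries its witness in the first component, which is exactly the constructive content described by the BHK-interpretation.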

The realizer of a statement can tell you a lot about the difference between a statement and its double negation. Consider for instance the statements n = n and ¬¬(n = n). The former is realized solely by n. And note that no number realizes ¬(n = n). The latter is therefore realized by all natural numbers a. So these classically equivalent formulas are not realized by the same numbers. The difference arises because a negation is either realized by all numbers or realized by no numbers.

Another consequence of using this definition is that, unlike with the BHK-interpretation, we can actually show that some classically true statements are not realizable. Kleene shows on page 116 of [13] that there are certain number-theoretic formulas A(x) for which ∀x(A(x) ∨ ¬A(x)) has no realizer. Therefore, by what it means to realize a negation, all numbers realize ¬∀x(A(x) ∨ ¬A(x)). At the same time, it is possible to use classical logic to reason about numbers realizing formulas. Following page 114 of Kleene’s paper, the principle of the excluded middle of classical logic shows for instance that if a statement B has no free variables, either B is realizable or B is not realizable, and B not being realizable implies that ¬B is realizable, thus B is realizable or ¬B is realizable. Why then is ¬∀x(A(x) ∨ ¬A(x)) realizable? That is because there is no computable total function f such that f(x) realizes A(x) ∨ ¬A(x) for all numbers x.

Kleene actually did something more: he assumed a given language for first-order arithmetic and then defined what it means for a formula in that language to be realizable. I did not do this because I define this kind of language only in a later subsection, in example 0.47 of subsection 0.3.5. So instead I used the commonly used version based on the BHK-interpretation.

0.2.3 Church’s thesis and constructive mathematics

As we have seen, realizability interprets constructions as computable functions. The logic of the effective topos generalizes this. The main property of the generalization will be that all (partial) functions from N to N will be⁵ computable functions. This is related to the so-called Russian school of constructive thought led by Markov in the 1950s and 1960s⁶. The Russian school held that all mathematical constructions must be built using algorithms. For example, a real number x should be an algorithm giving the decimals of x. To enforce this line of thought they used intuitionistic mathematics together with a few extra axioms. The most important axiom says that Church’s thesis applies to all functions.

Axiom 0.16 (Church’s thesis). All functions f : N → N are computable.

Later, on page 60, we will see different versions of this axiom using Kleene’s predicate T and the extraction function U. These axioms are actually ‘true’ in the effective topos. A second significant axiom used by the Russian school is called Markov’s rule or Markov’s principle. The rule says that if an algorithm does not not stop, then it does stop. Although this axiom does not seem constructive, Markov was convinced that this way of reasoning was in fact valid. He reasoned that if it is not the case that the algorithm does not stop, then you just run the algorithm until it does stop. The formulation of Markov’s rule uses the fact that we have Church’s thesis as an axiom, so all atomic predicates and functions involved are automatically computable.

Axiom 0.17 (Markov’s rule). Let A be an atomic predicate. Then

(∀x(A(x) ∨ ¬A(x)) ∧ ¬¬∃xA(x)) → ∃xA(x).

In a version using functions, let f : N → N be a function, then

¬¬∃x(f(x) = 0) → ∃x(f(x) = 0).
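Markov’s informal argument (‘just run the algorithm until it stops’) amounts to an unbounded search. The sketch below illustrates this for the function version of the rule; the name markov_search is hypothetical, and the search terminates only when a zero of f really exists.

```python
def markov_search(f):
    """Unbounded search for the least x with f(x) == 0.

    Terminates exactly when such an x exists; otherwise it loops forever,
    which is why the existence of a zero has to be known beforehand.
    """
    x = 0
    while f(x) != 0:
        x += 1
    return x

# A computable f : N -> N that is known to have a zero.
print(markov_search(lambda x: abs(x * x - 49)))
```

The search itself uses no case distinction on whether a zero exists. Markov’s rule is the claim that knowing ‘it is impossible that there is no zero’ is enough to trust that this loop halts.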

⁵ Actually from and to the natural numbers object of the effective topos. See lemma 1.24.

⁶ For a far more detailed history, see p. 25-29 of [31]. In fact, the notes at the end of every chapter of [31] give a lot of history about constructivism and algorithms. The information for most of this subsection comes from there.

Assuming both Church’s thesis and Markov’s rule as axioms does not imply the law of the excluded middle, for the formulas ∃x T(a, a, x) and ∀x ¬T(a, a, x) are not computable, while T is primitive recursive. See Kleene, page 49 of [12]. Kleene proves much more general statements in that paper; for instance, he gives whole classes of non-computable predicates built up from computable ones by alternating the universal and existential quantifiers. Thus for these formulas we cannot use Markov’s rule. Note that the set {a | ∃x T(a, a, x)} cannot be assumed to be an atomic predicate beforehand, as atomic predicates are by assumption computable and this set is not.

Markov’s rule will also be true for the effective topos, see that same page 60.

There are a few other axioms that are used by the Russian school that are also true for the effective topos. Let me name a few of them: countable choice⁷, Brouwer’s principle⁸, Shanin’s principle⁹, the uniformity principle¹⁰.

The Russian school is not the only way constructive mathematics can be done. Bishop’s school of constructive mathematics, for example, does not accept any extra axioms, and therefore his school does not disagree with classical mathematics. That is, the mathematics done by Bishop-style mathematicians does not differ from traditional mathematics, save that everything is done constructively. Intuitionism, the school initiated by Brouwer, however, differs very much from traditional mathematics. Unlike constructive recursive mathematics, which restricts mathematics to the things computable, intuitionism enlarges the class of allowable sequences by allowing choice sequences. A choice sequence is a sequence in which each element x_n is chosen successively by some thought construct.

Indulge me for a moment. Or skip the following paragraph and go to the next section, as it does not have much importance for the rest of the thesis. Let us briefly give an example of a choice sequence. Suppose each x_n is chosen at time n, say one for each second, starting with t = 0 right now. Suppose that we have until now (until this second t) chosen x_n to be 0. This means, if t is the second you started reading this sentence, then x_n = 0 for n ≤ t. Then, assuming intuitionistic logic, a statement like ∀n(x_n = 0) ∨ ∃n(x_n ≠ 0) becomes practically useless: a proof would require that you have a proof of either ∀n(x_n = 0) or ∃n(x_n ≠ 0), but the process of choosing a new element x_n is never complete, as time n = ∞ is never reached. So after you say that you have a proof at time t, we could just choose x_{t+1} to be 1 if you say you have a proof of ∀n(x_n = 0), or choose x_{t+1} and all later elements to be 0 if you say you have a proof of ∃n(x_n ≠ 0). So an intuitionistic proof of ∀n(x_n = 0) ∨ ∃n(x_n ≠ 0) will never, at no time t, be found. So it is useless to even consider it. Of course, this example presupposes that mathematics is a physical process that continues through time. And classical mathematics does not allow this intermingling of time for creating sequences and time for the proving process. Instead, it says that mathematics lives in an ideal world, and that proofs are idealized constructs independent of time.

⁷ See page 137 of [34].
⁸ Page 124 of [34].
⁹ Page 127 of [34].
¹⁰ Page 128 of [34].

0.3 Topos theory

The theory of toposes is a branch of mathematics which started with Grothendieck’s work on algebraic geometry in the early 1960s. Grothendieck’s toposes are categories of sheaves on a (small) site. The word ‘topos’ means ‘place’ in ancient Greek, and Grothendieck thought of topos theory as the study of spaces, so one can see why he chose this name.

In the late 1960s and early 1970s Lawvere realized that the most important properties of Grothendieck’s toposes ought to follow from a more general kind of category. He defined a new kind of category called an elementary topos, which is intricately connected to intuitionistic logic. All of Grothendieck’s toposes are elementary toposes; they are simply called Grothendieck toposes.

Somewhat later, in 1982, Hyland [9] introduced an elementary topos which is called the effective topos. The effective topos is special in that it is an elementary topos but not a topos in Grothendieck’s sense: it is not a category of sheaves on a site. Specifically, it is not cocomplete¹¹. It is finitely cocomplete, however (any elementary topos is finitely cocomplete).

In the topos theory literature the term topos is usually used either for Grothendieck toposes or for elementary toposes. In this master thesis a topos will always be an elementary topos. The most important toposes studied here are not Grothendieck toposes.

The first two subsections review the theory of lattices and Heyting algebras. In the third subsection the basic definition of a topos is given, with some examples. The fourth subsection connects Heyting algebras and toposes by showing that the set of relations of an object forms a Heyting algebra, and a pair of adjoint functors is given. This structure will be shown to induce an ‘internal logic inside the topos’ in the fifth subsection. And finally, in the sixth subsection, partial map classifiers will be explained.

0.3.1 Lattices

The main source for this subsection and the next one is the book “A Course in Universal Algebra” by Stanley Burris and H.P. Sankappanavar [29]. I will put the necessary references in footnotes. Please note that the millennium edition of this book is made freely available by the authors, for personal and non-commercial use, and can be found at their website https://www.math.uwaterloo.ca/~snburris/htdocs/ualg.html. At the request of the authors (on their website) I also place a link to https://www.math.uwaterloo.ca/~snburris, where more interesting free material can be found, for instance about the history of mathematical logic.

There are two equivalent definitions of lattices¹².

Definition 0.18. A lattice is a poset (P, ≤) such that for every pair of points (a, b) the supremum sup{a, b} and infimum inf{a, b} exist.

¹¹ See proposition 3.2.3 of [34].
¹² See chapter 1, section 1 from [29].

Definition 0.19. A lattice is a set P equipped with two maps ∨, ∧ : P × P → P, called join and meet, such that the following laws are satisfied:

x ∨ y = y ∨ x                  x ∧ y = y ∧ x                  (commutative)

x ∨ (y ∨ z) = (x ∨ y) ∨ z      x ∧ (y ∧ z) = (x ∧ y) ∧ z      (associative)

x ∨ x = x                      x ∧ x = x                      (idempotent)

x ∨ (x ∧ y) = x                x ∧ (x ∨ y) = x                (absorption)

Lemma 0.20. If (P, ≤) is a lattice in the first sense, then (P, ∨, ∧) with (x ∨ y) := sup(x, y) and (x ∧ y) := inf(x, y) is a lattice in the second sense.

If (P, ∨, ∧) is a lattice in the second sense, then (P, ≤) with (a ≤ b) :⇔ (a ∧ b = a) is a lattice in the first sense.
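As a concrete instance of the two definitions, one can take the divisors of 12 with join = least common multiple and meet = greatest common divisor. This example is not from the thesis; it is a small sketch that checks the four pairs of laws by brute force and recovers the order (here: divisibility) from the meet as in lemma 0.20.

```python
from itertools import product
from math import gcd

P = [d for d in range(1, 13) if 12 % d == 0]   # 1, 2, 3, 4, 6, 12

def meet(x, y):
    return gcd(x, y)

def join(x, y):
    return x * y // gcd(x, y)                  # least common multiple

def satisfies_lattice_laws(elements, join, meet):
    """Check commutativity, associativity, idempotence and absorption."""
    for x, y, z in product(elements, repeat=3):
        laws = [
            join(x, y) == join(y, x), meet(x, y) == meet(y, x),
            join(x, join(y, z)) == join(join(x, y), z),
            meet(x, meet(y, z)) == meet(meet(x, y), z),
            join(x, x) == x, meet(x, x) == x,
            join(x, meet(x, y)) == x, meet(x, join(x, y)) == x,
        ]
        if not all(laws):
            return False
    return True

def leq(a, b):
    """Order recovered from the algebraic structure: a <= b iff a ∧ b = a."""
    return meet(a, b) == a

print(satisfies_lattice_laws(P, join, meet))
print(all(leq(a, b) == (b % a == 0) for a in P for b in P))
```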

Henceforth the context will make clear which of the equivalent definitions is used. Next we define and characterize distributive lattices¹³.

Definition 0.21. A distributive lattice is a lattice (P, ∨, ∧) that satisfies

x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)    (0.1)

x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)    (0.2)

for all x, y, z ∈ P.

There is a nice and simple characterization of distributive lattices involving diamonds and pentagons.

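Definition 0.22. The diamond lattice or M5 is the lattice with elements 0, a, b, c, 1 in which 0 < a, b, c < 1 and a, b, c are pairwise incomparable:

        1
      / | \
     a  b  c
      \ | /
        0

The pentagon lattice or N5 is the lattice with elements 0, a, b, c, 1 in which 0 < b < a < 1 and 0 < c < 1, while c is incomparable with both a and b:

        1
       / \
      |   a
      c   |
      |   b
       \ /
        0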

Lemma 0.23. Given a lattice (P, ≤), the following are equivalent:

• P is distributive.

• P satisfies either (0.1) or (0.2).

• The diamond lattice M5 and pentagon lattice N5 do not embed into P .
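That M5 and N5 themselves fail distributivity can be checked mechanically. The sketch below builds join and meet from a list of strict order relations and tests law (0.1) by brute force; the helper names are hypothetical, and the orientation chosen for N5 (0 < b < a < 1 with c on the other side) is an assumption about the picture.

```python
from itertools import product

def make_lattice(elements, below):
    """Return (join, meet) for the order generated by pairs (x, y) meaning x < y."""
    leq = {(x, x) for x in elements} | set(below)
    changed = True
    while changed:                       # reflexive-transitive closure
        changed = False
        for (a, b), (c, d) in product(list(leq), repeat=2):
            if b == c and (a, d) not in leq:
                leq.add((a, d))
                changed = True

    def join(x, y):
        uppers = [u for u in elements if (x, u) in leq and (y, u) in leq]
        return next(u for u in uppers if all((u, v) in leq for v in uppers))

    def meet(x, y):
        lowers = [l for l in elements if (l, x) in leq and (l, y) in leq]
        return next(l for l in lowers if all((v, l) in leq for v in lowers))

    return join, meet

def is_distributive(elements, join, meet):
    """Brute-force check of law (0.1): x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)."""
    return all(join(x, meet(y, z)) == meet(join(x, y), join(x, z))
               for x, y, z in product(elements, repeat=3))

E = ["0", "a", "b", "c", "1"]
M5 = make_lattice(E, [("0", "a"), ("0", "b"), ("0", "c"),
                      ("a", "1"), ("b", "1"), ("c", "1")])
N5 = make_lattice(E, [("0", "b"), ("b", "a"), ("a", "1"),
                      ("0", "c"), ("c", "1")])
SQUARE = make_lattice(["0", "a", "b", "1"],
                      [("0", "a"), ("0", "b"), ("a", "1"), ("b", "1")])

print(is_distributive(E, *M5), is_distributive(E, *N5))
print(is_distributive(["0", "a", "b", "1"], *SQUARE))
```

In M5 the failing instance is x = a, y = b, z = c: the left side of (0.1) is a, the right side is 1. In N5 it is x = b, y = a, z = c, which gives b on the left and a on the right.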

Definition 0.24. A lattice (P, ≤) is called bounded if there are elements 0, 1 ∈ P such that 0 ≤ a ≤ 1 for all a ∈ P.

Definition 0.25. A lattice (P, ≤) is called complete if arbitrary suprema exist in P. The supremum of a set {a_i | i ∈ I} will be denoted by ⋁_{i∈I} a_i.

0.3.2 Heyting algebras

Four different but equivalent definitions of Heyting algebras will be given¹⁴.

Definition 0.26. A Heyting algebra is a bounded lattice (H, ∨, ∧, 0, 1) such that for every a, b ∈ H there is an element (a → b) ∈ H that satisfies

c ≤ (a → b) ⇔ (a ∧ c) ≤ b

for all c ∈ H. Here → is called the Heyting implication of H.

Lemma 0.27. A bounded distributive lattice H is a Heyting algebra if and only if for all a, b ∈ H the supremum of {c ∈ H | a ∧ c ≤ b} exists. The Heyting implication is given by

(a → b) = ⋁{c ∈ H | a ∧ c ≤ b}.
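The formula of lemma 0.27 can be evaluated directly on a finite example. The sketch below uses the three-element chain 0 < 1 < 2 (with meet = min and join = max), which is an example chosen here and not taken from the thesis. It computes the Heyting implication as a supremum, checks the defining adjunction of definition 0.26, and shows that the middle element violates both double negation elimination and the excluded middle.

```python
H = [0, 1, 2]            # a chain: bottom 0, middle 1, top 2
BOTTOM, TOP = 0, 2

def implies(a, b):
    """Heyting implication a → b as the supremum of {c | a ∧ c ≤ b}."""
    return max(c for c in H if min(a, c) <= b)

def neg(a):
    """Negation ¬a := (a → 0)."""
    return implies(a, BOTTOM)

# The adjunction c ≤ (a → b)  ⇔  (a ∧ c) ≤ b holds for all a, b, c.
print(all((c <= implies(a, b)) == (min(a, c) <= b)
          for a in H for b in H for c in H))

# The middle element behaves non-classically.
print(neg(neg(1)), max(1, neg(1)))
```

Since ¬¬1 is the top element and 1 ∨ ¬1 is not, this chain is a Heyting algebra that is not a Boolean algebra.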

Lemma 0.28. A complete bounded distributive lattice H is a Heyting algebra if and only if the following distributive law is satisfied:

a ∧ ⋁_{i∈I} b_i = ⋁_{i∈I} (a ∧ b_i).

Lemma 0.29. A bounded distributive lattice H is a Heyting algebra if and only if there is a map → : H × H → H that satisfies the following four axioms:

(a → a) = 1

a ∧ (a → b) = a ∧ b

b ∧ (a → b) = b

(a → (b ∧ c)) = (a → b) ∧ (a → c)

A redundant fifth axiom is

((a ∨ b) → c) = (a → c) ∧ (b → c).

Definition 0.30. The negation of a ∈ H is defined by ¬a := (a → 0).

Definition 0.31. A Boolean algebra is a Heyting algebra (B, ∨, ∧, 0, 1) that satisfies a ∧ ¬a = 0 and a ∨ ¬a = 1 for any element a ∈ B.

The theory of Heyting algebras is closely connected to intuitionistic logic, and the Boolean algebras are connected to classical logic.

¹⁴ The various definitions can be found in chapter 2 section 1 of [29]. See in particular example 11 and exercise 5. From chapter 1 section 4 see exercise 7.

0.3.3 Toposes

A topos is a category highly similar to the category of sets. In fact, in many parts of mathematics, the category of sets could be replaced by any topos. The similarity stretches quite far. In the category of sets, the subsets of a set X form a Boolean algebra (using set comprehension {x ∈ X | φ}). In a topos, the subobjects of an object form a Heyting algebra. But a topos has more structure in common with the category of sets: a topos is finitely complete, finitely cocomplete, locally cartesian closed, has power objects and has a subobject classifier. In fact, the category of sets could in a certain sense¹⁵ be characterized up to equivalence as a cocomplete well-pointed topos where the hom-sets are sets (and not classes). Here a category is called well-pointed if for all morphisms f, g : X → Y, f = g if and only if fx = gx for every morphism x : 1 → X.

Let us start with the definition of a subobject classifier. A subobject classifier generalizes the fact that for a set X all subsets A ⊆ X are uniquely characterized by a morphism χ_A : X → {0, 1}.

Definition 0.32. Let C be a category with finite limits. A map t : 1 → Ω from the terminal object to another object Ω is called a subobject classifier if for every object X and every subobject A of X there is a unique map χ_A : X → Ω, called the characterizing map of A, such that the following diagram is a pullback:

         !_A
    A ---------> 1
    |            |
    |            | t
    v            v
    X ---------> Ω
         χ_A

Notice that, since t is a subobject of Ω, the pullback of t along an arbitrary map f : X → Ω gives a subobject A of X, and by uniqueness of pullbacks, this subobject must be the unique (up to isomorphism) subobject of X characterized by f.

The composition t ∘ !_A : A → 1 → Ω will be denoted by t_A : A → Ω.
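In the category of sets this correspondence is easy to make concrete. The following sketch (with hypothetical function names, and finite sets only) sends a subset to its characterizing map and recovers the subset again as the pullback of t along that map, taking t to be the element 1 of {0, 1}.

```python
def characteristic_map(X, A):
    """chi_A : X -> {0, 1}, sending x to 1 exactly when x lies in A."""
    return {x: int(x in A) for x in X}

def pullback_of_true(chi):
    """Pullback of t : 1 -> {0, 1} along chi, i.e. the fibre of chi over 1."""
    return {x for x, value in chi.items() if value == 1}

X = {0, 1, 2, 3, 4, 5}
A = {x for x in X if x % 2 == 0}

chi_A = characteristic_map(X, A)
print(chi_A)
print(pullback_of_true(chi_A) == A)
```

Going the other way, any map f : X → {0, 1} is the characterizing map of the subset it pulls back to, which is the uniqueness part of definition 0.32 for sets.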

Definition 0.33. An elementary topos is a category E that is finitely complete, cartesian closed and that has a subobject classifier Ω. The exponent of an object Y by an object X will be denoted by Y^X.

Example 0.34. (i): The category of sets is a topos. The subobject classifier is {0, 1} and for any sets X and Y, Y^X = Hom_Set(X, Y).

(ii): For any category C, the category of presheaves Psh(C) = Set^(C^op) is a topos. Let C be an object of C. A sieve on C is a set of arrows R ⊆ Ar(C) such that (1) each arrow f ∈ R has codomain C and (2) if f : A → B and g : B → C are arrows such that g ∈ R then gf ∈ R. The subobject classifier Ω is the presheaf defined by taking Ω(C) to be the set of sieves on C and Ω(f)(R) := f*(R), where for any arrow f : B → C, f*(R) = {g ∈ Ar(C) | cod(g) = B and fg ∈ R}. If X and Y are presheaves, then the exponent presheaf Y^X is defined for an object C by Y^X(C) = Hom_Psh(C)(y(C) × X, Y), where y is the Yoneda embedding. For a morphism f : B → C it is defined by precomposition, Y^X(f) = (−) ∘ (y(f) × Id_X).

¹⁵ In the sense of Lawvere’s Elementary Theory of the Category of Sets (ETCS). See Leinster [16] for a nice exposition.

(iii): Let (C, Cov) be a site, that is, a small category C equipped with a Grothendieck topology Cov. Then the category of sheaves Sh(C, Cov) will be a topos. In fact, a Grothendieck topos is a topos of this form.

(iv): The category of finite sets is a topos which is not a Grothendieck topos. This can be seen from the fact that a Grothendieck topos is cocomplete, and the category of finite sets is not.

(v): The effective topos (see chapter 2) is also a topos which is not a Grothendieck topos, as it is also not cocomplete¹⁶.

(vi): If E is a topos and j : Ω → Ω a Lawvere-Tierney topology (see section 2.1), then the category of j-sheaves is a topos. This example generalizes example (iii), see section 2.2. In this thesis we are interested in sheaves on the effective topos.

(vii): If E is a topos and X is an object of E, then the slice category E/X is also a topos.

It is interesting to note that while Grothendieck toposes connect topos theory to geometry, the effective topos and its sheaf subtoposes connect topos theory to computability (to various notions of realizability).

Let us work through a few basic properties of toposes.

In topos theory we tend to regard morphisms with codomain X as the ‘elements’ of X. The motivating examples are the elements x : 1 → X of a set X. These different kinds of ‘elements’ are given the following names: Let E be a topos, X an object. A generalized element of X is a morphism x : A → X for some object A. A global element of X is a morphism x : 1 → X. In both cases this will often be denoted by x : X, in clear analogy to the membership predicate x ∈ X of sets. Note that in the topos of sets it is enough to work with global elements, since they determine the generalized elements as well. In a presheaf or Grothendieck topos, it is enough to look at morphisms of the kind y(C) → X for any object C of C, and this kind of morphism corresponds by the Yoneda lemma to the elements of X(C).

A morphism f : X → Y can be used to send a generalized element x : X to another generalized element f(x) : Y, by letting f(x) := f ∘ x. Thus we have a kind of function application.

Consider for a moment products. If we have a generalized element z : X × Y, then x := π_1(z) : X and y := π_2(z) : Y, and by the product property z = ⟨x, y⟩. In what follows I will often denote such z by z = (x, y).

The exponential objects Y^X serve as internal hom-sets. To elaborate this idea, consider a generalized element f : Y^X, i.e. a morphism f : A → Y^X for some object A. By the adjunction associated to exponential objects, there exists a morphism f̄ : A × X → Y, the transpose of f. If f was considered to be a global element instead of a generalized element, then f̄ would have been a morphism X → Y. Morphisms of the form A × X → Y for any A are of course related in a similar way to generalized elements A → Y^X, and ordinary morphisms X → Y to the global elements 1 → Y^X. Consider the evaluation morphism ev : Y^X × X → Y. By the adjunction it satisfies f̄ = ev ∘ (f × Id_X). Let x : B → X be a generalized element. Consider the morphism Id_A × x : A × B → A × X. Applying f̄ to Id_A × x gives us that

‘f(x)’ := f̄ ∘ (Id_A × x) = ev ∘ (f × Id_X) ∘ (Id_A × x) = ev ∘ (f × x)

is a generalized element ‘f(x)’ : A × B → Y. So, in some sense, ‘applying’ f : Y^X to a generalized element x : X gives a generalized element ‘f(x)’ := f̄ ∘ (Id_A × x) : Y, as one would hope. In this sense, the generalized element f : Y^X really is a “function from X to Y”.

A property of the subobject classifier works as follows. Let x : X be a generalized element and A be a subobject of X. Say that x : B → X ‘sits inside’ A (denoted temporarily by x ∈_X A) if x factors through A ↪ X. Then x ∈_X A if and only if χ_A(x) = t_B. Notice that since A is a subobject of X, x can only factor through A in a unique way.

It is important to note that Ω acts like “a set of truth values”. A “truth value” x : Ω given by x : A → Ω acts like the value true if x = t_A. There can be multiple different “truth values”. For instance, for presheaves there are usually more than two sieves. However, for the topos of sets there are exactly two “truth values”: 0, 1 : 1 → {0, 1}.

An important property of a topos is that it has power objects. These generalize the power sets P(X) of the category of sets.

Definition 0.35. Let E be a topos. The power object of an object X is defined by P(X) := Ω^X. There is also the so-called membership predicate ∈_X : X × P(X) → Ω, which is defined as the evaluation morphism ev : X × Ω^X → Ω.

The adjunction for exponential objects can also be used for power objects. Thus, for instance, we see that a generalized element B : A → P(X) gives rise to a morphism B̄ : A × X → Ω. Using the subobject classifier, it induces a subobject B̃ of A × X. Moreover, membership works as follows: If x : X is a generalized element, then

(x ∈_X B) := ∈_X(B × x) = B̄ ∘ (Id_A × x).

So if x is given by x : C → X, then x ∈_X B is true (i.e. it is equal to t_{A×C}) if and only if Id_A × x factors through the subobject B̃ of A × X, which happens if and only if χ_B̃(Id_A × x) = t_{A×C}, which happens if and only if (Id_A × x) ∈_{A×X} B̃ (in the sense defined above the definition of power objects).

A natural numbers object in a topos is an object with the same role as the natural numbers, in the sense that there exists a kind of induction for a natural numbers object.

Definition 0.36. An object N of a topos E, equipped with a global element 0 : N and a function S : N → N, is a natural numbers object if for all global elements a : A and all functions f : A → A there is a unique morphism h : N → A such that h(0) = a and hS = fh. The definition implies h(n) = f^n(a), where n := S^n(0). In diagrammatic form (with X and x in place of A and a):

        0         S
   1 ------> N ------> N
     \       |         |
    x  \     | h       | h
         v   v         v
             X ------> X
                  f

Both Grothendieck toposes and the effective topos have a natural numbers object: the sheafification of the constant presheaf on N works for a Grothendieck topos, and lemma 1.24 shows that the effective topos has one. An example of a topos without a natural numbers object is the topos of finite sets.
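In the topos of sets the unique morphism h of definition 0.36 is just definition by recursion. The sketch below (the name recursor is hypothetical) builds h from a and f and makes the two defining equations h(0) = a and hS = fh checkable on examples.

```python
def recursor(a, f):
    """Return the map h : N -> A with h(0) = a and h(n + 1) = f(h(n))."""
    def h(n):
        value = a
        for _ in range(n):      # apply f exactly n times, so h(n) = f^n(a)
            value = f(value)
        return value
    return h

def successor(n):
    return n + 1

# Example: a = 1 and f = doubling give h(n) = 2^n.
double = lambda x: 2 * x
h = recursor(1, double)
print([h(n) for n in range(6)])
print(all(h(successor(n)) == double(h(n)) for n in range(20)))
```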

We can also think about equality in a topos. Let Δ_X ⊆ X × X be the diagonal of X. The characteristic function χ_{Δ_X} : X × X → Ω can be seen as an ‘equality function’. If (x, y) : A → X × X is a generalized element, then ‘(x = y)’ := χ_{Δ_X}((x, y)) is true (i.e. equal to t_A) if and only if (x, y) factors through Δ_X. Of course, this only happens if x := π_1((x, y)) = π_2((x, y)) =: y. The singleton map {·}_X : X → P(X), obtained by using the adjunction of exponential objects on χ_{Δ_X}, is also of interest. For any generalized element x : A → X it gives a generalized element {x}_X : A → P(X). This morphism will be studied further in example 0.57 and section 0.3.6.

It should be said that any topos also has finite colimits. The proof seems to me rather involved, and can be found in [27]. Luckily, the colimits of the effective topos are easy to find; see example 1.22.

The following definitions will be used in later subsections.

The image im(f) of a morphism f : X → Y is the smallest subobject of Y that f factors through. A cover is a morphism whose image is the codomain. A cover f : X → Y is stable under pullback if for all morphisms g : Z → Y, the pullback g^#(f) : V → Z of f along g is a cover as well.

          f
    X ---------> Y
    ^            ^
    |            | g
    |            |
    V ---------> Z
        g^#(f)

A regular category is a category that is finitely complete, where all morphisms have images, and where all covers are stable under pullback¹⁷.

Lemma 0.37. (i): Any topos is a regular category.

(ii): A topos E is locally cartesian closed (i.e. the slice categories E/X are cartesian closed), and therefore pullbacks preserve colimits.

In the next section we will study subobjects of an object of a topos.

¹⁷ This definition can be found in section 25.1 of [20]. The author of [20] does use some different names, but those can be found in the same section as well.

0.3.4 Subobjects as a Heyting algebra

Suppose we are given a topos E. In this section it will be shown that the poset of subobjects Sub_E(X) of an object X of E forms a Heyting algebra¹⁸.

Definition 0.38. Let φ : Y → X be a morphism in E and let A be a subobject of X. Define φ^#(A) to be the pullback of A along φ.

    φ^#(A) ------> A
      |            |
      v            v
      Y ---------> X
            φ

It should be noted that φ^#(A) automatically forms a subobject of Y: pullbacks preserve monic arrows. The pullback can also be defined for arrows. If A, B are two subobjects of X and ψ : A → B is a morphism of subobjects (i.e. if i : A ↪ X and j : B ↪ X then i = j ∘ ψ), then the pullback gives a unique morphism of subobjects φ^#(ψ) : φ^#(A) → φ^#(B) making the following diagram commute:

    φ^#(A) ------> A
      |            |
      | φ^#(ψ)     | ψ
      v            v
    φ^#(B) ------> B
      |            |
      v            v
      Y ---------> X
            φ

The uniqueness part implies that φ^# : Sub(X) → Sub(Y) defines a contravariant functor.

Any subobject A of X with characteristic function χ_A : X → Ω is given by the pullback χ_A^#(t) of the subobject classifier t : 1 → Ω. If f : Y → X is a morphism, the subobject f^#(A) = f^#(χ_A^#(t)) = (χ_A f)^#(t) has characteristic function χ_A f. Therefore, if A is a subobject of X and B is a subobject of Y and if χ_B = χ_A f for some morphism f : Y → X, then B = (χ_A f)^#(t) = f^#(χ_A^#(t)) = f^#(A).

Definition 0.39. Let ∧ : Ω × Ω → Ω be the characteristic function of the subobject ⟨t, t⟩ : 1 → Ω × Ω. If A, B are subobjects of X and if the morphism ⟨χ_A, χ_B⟩ : X → Ω × Ω is considered, then the morphism (χ_A ∧ χ_B) := ∧ ∘ ⟨χ_A, χ_B⟩ : X → Ω × Ω → Ω induces a subobject A ∧ B of X with characteristic function (χ_A ∧ χ_B). Suppose that x : Z → X is a morphism. Then,

(χ_A ∧ χ_B)(x) = ∧ ∘ (⟨χ_A, χ_B⟩ ∘ x) = ∧ ∘ ⟨χ_A(x), χ_B(x)⟩ = (χ_A(x) ∧ χ_B(x)).

By definition of ∧, the arrow x satisfies ∧ ∘ ⟨χ_A(x), χ_B(x)⟩ = t_Z if and only if ⟨χ_A(x), χ_B(x)⟩ = ⟨t_Z, t_Z⟩. Therefore, (χ_A ∧ χ_B)(x) = t_Z precisely if χ_A(x) = t_Z and χ_B(x) = t_Z. Thus, x factors through A ∧ B if and only if x factors through A and B. In particular, if x factors through A and B then x factors through A ∧ B in a unique way, so A ∧ B is a pullback.

    A ∧ B ------> B
      |           |
      v           v
      A --------> X

¹⁸ Proposition 0.44 comes from theorem 2.5 of [21], but that version was for presheaves, so it was adapted to work for toposes. For this I used definitions found in [20] chapter 13 section 3, 4, 6. I had to rework these definitions to make it work. This is what is done in this section. The discussion of internal Heyting algebras also comes from there.

Definition 0.40. Consider the subobjects t × Ω and Ω × t of Ω × Ω. The coproduct (t × Ω) + (Ω × t) gives an arrow (t × Ω) + (Ω × t) → Ω × Ω. Define a morphism ∨ : Ω × Ω → Ω as the characteristic arrow

∨ := χ_{im((t × Ω) + (Ω × t) → Ω × Ω)} : Ω × Ω → Ω

of the image of that arrow. And, for subobjects A, B of X, let A ∨ B be the subobject with characteristic arrow χ_{A ∨ B} := ∨ ∘ ⟨χ_A, χ_B⟩. We have by definition

A ∨ B = ⟨χ_A, χ_B⟩^#(∨^#(t)) = ⟨χ_A, χ_B⟩^#(im((t × Ω) + (Ω × t) → Ω × Ω)).

Then, since the pullback preserves images (lemma 0.37),

A ∨ B = im(⟨χ_A, χ_B⟩^#((t × Ω) + (Ω × t)) → X).

By the same lemma the pullback preserves coproducts, so we have

⟨χ_A, χ_B⟩^#((t × Ω) + (Ω × t)) = (χ_A^#(t) ×_X χ_B^#(Ω)) + (χ_A^#(Ω) ×_X χ_B^#(t)) = (A ×_X X) + (X ×_X B).

Therefore

A ∨ B = im((A ×_X X) + (X ×_X B) → X).

And, since the morphism (A ×_X X) + (X ×_X B) → X is (i_A π_1) + (i_B π_2), this gives

A ∨ B = im(A + B → X).

Definition 0.41. Let ≤_1 be the equalizer

                    ∧
    ≤_1 ---> Ω × Ω ====> Ω
                    π_1

of the parallel arrows ∧ and π_1. The name of the characterizing arrow of this subobject is the implication arrow → : Ω × Ω → Ω. Assume that A and B are subobjects of X. Define the implication from A to B to be the subobject A ⇒ B by taking (χ_A → χ_B) := → ∘ ⟨χ_A, χ_B⟩ as the characterizing arrow.
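For the topos of sets, where Ω = {0, 1}, the equalizer of definition 0.41 can be written out by hand. The sketch below is only an illustration for this special case: it computes ≤_1 as the set of pairs on which ∧ and π_1 agree and defines the implication arrow as its characteristic function, which turns out to be the usual classical truth table.

```python
from itertools import product

OMEGA = (0, 1)                      # truth values of the topos of sets

def conj(a, b):
    """The arrow ∧ : Ω × Ω → Ω; it is 1 only on the pair (1, 1)."""
    return 1 if (a, b) == (1, 1) else 0

def proj1(a, b):
    """The first projection π_1 : Ω × Ω → Ω."""
    return a

# Equalizer of ∧ and π_1: all pairs on which the two arrows agree.
LEQ1 = {(a, b) for a, b in product(OMEGA, repeat=2) if conj(a, b) == proj1(a, b)}

def implies(a, b):
    """Implication arrow: the characteristic function of the subobject ≤_1."""
    return 1 if (a, b) in LEQ1 else 0

print(sorted(LEQ1))
print({(a, b): implies(a, b) for a, b in product(OMEGA, repeat=2)})
```

The equalizer consists exactly of the pairs (a, b) with a ≤ b, which explains the notation ≤_1.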

Definition 0.42. The truth arrow is the arrow t : 1 → Ω and the falsity arrow is the characterizing arrow ⊥ := χ_{0 ↪ 1} : 1 → Ω of the subobject 0 ↪ 1.

Let X be an object. The subobject ⊤ of X is defined to be the maximal subobject Id_X : X → X, and ⊥ the subobject 0 ↪ X. The negation arrow is the arrow ¬ := → ∘ (Id_Ω × ⊥) : Ω ≅ Ω × 1 → Ω × Ω → Ω. The negation of a subobject A is defined to be the subobject ¬A with characterizing arrow ¬ ∘ χ_A.

Definition 0.43. Let m : A → X and n : B → X be subobjects of X. Define the relation ≤ on subobjects by A ≤ B iff m factors through n. Equivalently, A ≤ B iff χ_B m = t_A, by definition of the pullback.

Proposition 0.44. For any object X of E, the poset of subobjects Sub_E(X) with subobject relation ≤ forms a Heyting algebra with the structure ∧, ∨, ⊤, ⊥ and →.

Proof. Consider two subobjects m : A ↪ X and n : B ↪ X. Suppose that i : C → X is a subobject that satisfies i ≤ m and i ≤ n. That means that there are arrows f : C → A and g : C → B such that m ∘ f = i = n ∘ g. By the definition of the pullback there is a unique arrow h : C → A ∧ B such that (m ∧ n)h = i. Put differently, C ≤ A ∧ B. Thus A ∧ B is the infimum in the poset structure.

Take the same two subobjects, and let j : A + B → X be the unique arrow obtained from m and n. Let i_A, i_B be the coproduct inclusions of A + B. Suppose that i : C → X is a subobject that satisfies m ≤ i and n ≤ i. That means that there are arrows f : A → C and g : B → C such that i ∘ f = m and i ∘ g = n. By the definition of the coproduct there is a unique arrow h : A + B → C such that f = h i_A and g = h i_B. Then we have i h i_A = m and i h i_B = n, therefore ih = j. But A ∨ B is the smallest subobject j factors through, so A ∨ B ≤ C. Thus A ∨ B is the supremum in the poset structure.

It has now been shown that the poset of subobjects is a lattice. This lattice is a bounded one, since ⊤ is a top and ⊥ is a bottom of the lattice. It remains to be proved that → is a Heyting implication. Take any subobjects A, B, C of X. Suppose that m : A ↪ X is the inclusion of A. The condition A ≤ (B ⇒ C) is equivalent to → ∘ (⟨χ_B, χ_C⟩ ∘ m) = (χ_B → χ_C)(m) = t_A, which is in turn equivalent to ⟨χ_B(m), χ_C(m)⟩ ∈_{Ω×Ω} ≤_1 being true. But ≤_1 is the equalizer of ∧ and π_1, so this happens if and only if χ_B(m) ∧ χ_C(m) = χ_B(m). This is true exactly when (A ∧ B) ∧ (A ∧ C) = (A ∧ B). And that is equivalent to A ∧ B ≤ A ∧ C, that is, to A ∧ B ≤ C. Thus → indeed is a Heyting implication.

This proposition can be used to show that the subobject classifier Ω has an internal Heyting algebra structure (Ω, ∨, ∧, 0, 1), where the structure, namely the arrows ∨, ∧, → : Ω × Ω → Ω, the arrows t, ⊥ : 1 → Ω and the arrow ¬ : Ω → Ω, was defined above.

It works like this: In the logic of the topos (see next section), ∧ and ∨ are (internally) idempotent, associative, commutative and satisfy the absorption laws. Moreover, → satisfies

E ⊨ ∀abc((c ≤ a → b) ↔ (a ∧ c ≤ b)).

0.3.5 Logic of toposes

The internal logic of a topos will be explained in this section¹⁹.

The idea is that we can give formulas in logic an interpretation in the topos by using the Heyting algebra structure from the previous subsection. To give a concrete example right away, given a topos with a natural numbers object N, we want to have subobjects like the even numbers ⟦n : N. ∃m : N (n = 2m)⟧ ≤ N. To be able to do this, we must know how to interpret the logical connectives and the quantifiers in terms of something concrete. It turns out that the Heyting algebra structure of the subobjects from the previous subsection is ideally suited to do this. For the quantifiers we still need to work a little bit.

A many-sorted language $L$ consists of three collections of symbols: a collection of types $X$ (also called sorts); a collection of relation symbols $R$, each equipped with a finite sequence of sorts $X_1, \ldots, X_n$ called the type of $R$, which is denoted by $R \rightarrowtail X_1, \ldots, X_n$; and a collection of function symbols $f$, each equipped with a finite sequence of sorts $X_1, \ldots, X_n$ called the domain of $f$ and a sort $Y$ called the codomain of $f$, which is written as $f : X_1, \ldots, X_n \to Y$. By convention, if $n = 0$ then the list $X_1, \ldots, X_n$ will be denoted by $1$.

Observe that the sorts, relation symbols and function symbols from the definition are just symbols, and nothing else. Assume that each sort $X$ comes with an infinite supply of symbols written like $x : X$ or $x^X$, called the variables of $X$.

So we assume that there is an endless supply of variables $x^X$ of type $X$ for each sort $X$. We inductively define an $L$-term of type $Y$ to be either a variable of type $Y$, or, if $t_i$ is a previously defined $L$-term of type $X_i$ for $1 \le i \le n$ and $f : X_1, \ldots, X_n \to Y$ is a function symbol, an expression of the form $f(t_1, \ldots, t_n)$.

Again, this is just a concatenation of symbols.
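The inductive definition of $L$-terms can be sketched as a small data structure. The following Python sketch is only an illustration: the names `Var`, `FunctionSymbol`, `App` and `sort_of`, the representation of sorts as strings, and the example signature with `zero` and `succ` are assumptions introduced here, not notation from the text.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A variable x : X; sorts are represented by plain strings."""
    name: str
    sort: str


@dataclass(frozen=True)
class FunctionSymbol:
    """A function symbol f : X1, ..., Xn -> Y."""
    name: str
    domain: Tuple[str, ...]
    codomain: str


@dataclass(frozen=True)
class App:
    """The term f(t1, ..., tn), built from previously defined terms."""
    symbol: FunctionSymbol
    args: Tuple["Term", ...]


Term = Union[Var, App]


def sort_of(term: Term) -> str:
    """Return the sort of a term, or raise TypeError if it is ill-formed."""
    if isinstance(term, Var):
        return term.sort
    arg_sorts = tuple(sort_of(t) for t in term.args)
    if arg_sorts != term.symbol.domain:
        raise TypeError(
            f"{term.symbol.name} expects {term.symbol.domain}, got {arg_sorts}"
        )
    return term.symbol.codomain


# Hypothetical one-sorted signature: a constant and a unary function symbol.
zero = FunctionSymbol("zero", (), "N")   # the case n = 0: empty domain list
succ = FunctionSymbol("succ", ("N",), "N")
x = Var("x", "N")
term = App(succ, (App(succ, (x,)),))     # the term succ(succ(x))
```

Here `sort_of(term)` returns the sort `"N"`, while an application with the wrong number or sorts of arguments, such as `App(succ, ())`, is rejected. The terms themselves remain purely syntactic objects, as the text stresses.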

An atomic $L$-formula is either $\bot$, or $\top$, or $R(t_1, \ldots, t_n)$ for some relation symbol $R \rightarrowtail X_1, \ldots, X_n$ and some terms $t_1, \ldots, t_n$ of types $X_1, \ldots, X_n$, or it is $t(\vec{x}) = s(\vec{x})$ for two terms of the same type $Y$ considered in the same context. An $L$-formula is either an atomic formula, or, for some previously defined formulas $\varphi, \psi$, the conjunction $\varphi \wedge \psi$, or the disjunction $\varphi \vee \psi$, or the implication $\varphi \to \psi$, or the negation $\neg \varphi$, and, if $x^X$ is a variable, also the formulas $\forall x : X\, \varphi$ and $\exists x : X\, \varphi$.

Let $\varphi$ be a formula. A variable $x : X$ is free in $\varphi$ if it has an occurrence in $\varphi$ that is not in the scope of a quantifier naming $x$. Thus $\forall x : X\, (R(x, f(y)))$ has $y : Y$ as free variable and not $x : X$. Variables in $\varphi$ that are not free are called bound.
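Computing the free variables of a formula follows the inductive definition of formulas directly. The sketch below assumes a hypothetical encoding of terms and formulas as tagged tuples, such as `("var", name)` or `("forall", name, body)`, and ignores sorts for brevity. The function names `term_vars` and `free_vars` are likewise introduced only for this illustration.

```python
def term_vars(term):
    """Variables occurring in a term: ('var', name) or ('app', f, [t1, ..., tn])."""
    if term[0] == "var":
        return {term[1]}
    _, _, args = term
    return set().union(*(term_vars(t) for t in args))


def free_vars(formula):
    """Free variables of a formula encoded as a tagged tuple."""
    tag = formula[0]
    if tag in ("top", "bottom"):
        return set()
    if tag == "rel":                     # ('rel', R, [t1, ..., tn])
        return set().union(*(term_vars(t) for t in formula[2]))
    if tag == "eq":                      # ('eq', t, s)
        return term_vars(formula[1]) | term_vars(formula[2])
    if tag == "not":                     # ('not', phi)
        return free_vars(formula[1])
    if tag in ("and", "or", "implies"):  # (tag, phi, psi)
        return free_vars(formula[1]) | free_vars(formula[2])
    if tag in ("forall", "exists"):      # (tag, x, body): x becomes bound
        return free_vars(formula[2]) - {formula[1]}
    raise ValueError(f"unknown formula tag: {tag}")


# The example from the text: a universal quantifier over x applied to R(x, f(y)).
example = (
    "forall",
    "x",
    ("rel", "R", [("var", "x"), ("app", "f", [("var", "y")])]),
)
```

Applied to `example`, `free_vars` yields only `y`, whereas the unquantified body `example[2]` has both `x` and `y` free.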

A context $\vec{x}$ of a term $t$ or a formula $\varphi$ is a list of variables that contains all the free variables of $t$ or $\varphi$. If we consider a term or formula $a$ within a context $\vec{x}$, then we will write this as $a(\vec{x})$ or $\vec{x}.a$, depending on the situation.

An $L$-sentence is a formula considered in an empty context. This implies that sentences have no free variables.

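Definition 0.45. Let $\mathcal{E}$ be a topos and let $L$ be a language. An interpretation of $L$ in $\mathcal{E}$ is an assignment that assigns to each sort $X$ an object $\llbracket X \rrbracket$, to each relation symbol $R \rightarrowtail X_1, \ldots, X_n$ a subobject $\llbracket R \rrbracket \rightarrowtail \llbracket X_1 \rrbracket \times \cdots \times \llbracket X_n \rrbracket$ and to each function symbol $f : X_1, \ldots, X_n \to Y$ a morphism $\llbracket f \rrbracket : \llbracket X_1 \rrbracket \times \cdots \times \llbracket X_n \rrbracket \to \llbracket Y \rrbracket$. By an abuse of notation, we often equate the symbols $S$ with their interpretation $\llbracket S \rrbracket$.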

Later we will see that interpretations can be extended to the terms and formulas giving morphisms and subobjects.
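To make the definition concrete, here is a minimal sketch of an interpretation in the topos of finite sets, where objects are finite sets, subobjects are subsets and morphisms are functions. Everything specific in it is an assumption made for illustration: the sort `N` is sent to the six-element set {0, ..., 5} (a stand-in, not a natural numbers object), the function symbol `double` to doubling modulo 6, and the relation symbol `leq` to the usual order. The last function previews, in the ordinary set-theoretic way, how an existential formula would be sent to a subset; it is not derived from the text.

```python
from itertools import product

# Interpretation of sorts: each sort is sent to a finite set.
sorts = {"N": frozenset(range(6))}

# Interpretation of function symbols: (domain sorts, codomain sort, function).
functions = {"double": (("N",), "N", lambda n: (2 * n) % 6)}

# Interpretation of relation symbols: (type, subset of the product of the sorts).
relations = {
    "leq": (
        ("N", "N"),
        frozenset((a, b) for a, b in product(range(6), repeat=2) if a <= b),
    ),
}


def is_valid_interpretation():
    """Check that every symbol is interpreted with the right domain and codomain."""
    for domain, codomain, fn in functions.values():
        for args in product(*(sorts[s] for s in domain)):
            if fn(*args) not in sorts[codomain]:
                return False
    for domain, subset in relations.values():
        if not subset <= frozenset(product(*(sorts[s] for s in domain))):
            return False
    return True


def interpret_exists_double():
    """The subset of [[N]] of those n with n = double(m) for some m in [[N]]."""
    _, _, double = functions["double"]
    return frozenset(
        n for n in sorts["N"] if any(double(m) == n for m in sorts["N"])
    )
```

In this toy model the formula "there is an m with n = double(m)" is interpreted as the subset of even residues, which mirrors the subobject of even numbers mentioned at the start of this section.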

Example 0.46. The standard example of a language is the Mitchell-Bénabou language of a topos $\mathcal{E}$. This language has one type $X$ for each object $X$, one relation symbol
