
Algebraic Coding Theory

Eline Filius

July 14, 2017

Bachelor thesis

Supervisor: B.M. (Bart) Litjens MSc

Korteweg-de Vries Instituut voor Wiskunde


Abstract

The subject of this bachelor thesis is algebraic coding theory. We first describe some basic notions of coding theory: what is a codeword, what is a code, and how can we describe the distance between words? We also define perfect and linear codes, after which we discuss the Hamming code. The final goal of this thesis is to prove the first theorem in the article "Two theorems on perfect codes", written by H.W. Lenstra in 1972 [8]. This theorem is a stronger version of Lloyd's theorem, which states the following for a prime power q: if a perfect e-error-correcting code of length n over an alphabet [q] exists, then the Lloyd polynomial has e distinct integral zeros in {1, . . . , n}. We prove, as in Lenstra's article, that this condition holds for every q ≥ 2.

Title: Algebraic Coding Theory

Author: Eline Filius, eline.filius@outlook.com, 10542973
Supervisor: B.M. (Bart) Litjens MSc

First examiner: prof. dr. L.D.J. (Lenny) Taelman
Second examiner: dr. V.S. (Viresh) Patel

Date: July 14, 2017

Korteweg-de Vries Instituut voor Wiskunde
Universiteit van Amsterdam
Science Park 904, 1098 XH Amsterdam
http://www.science.uva.nl/math


Acknowledgement

I would like to thank Bart Litjens for his good guidance during this project. I really appreciate his patience and the suggestions he had. Besides, I am glad he came up with this subject: somewhat against my own expectations, I found it very interesting. I would also like to thank my first and second examiner, Lenny Taelman and Viresh Patel, for taking the time to read my thesis.


Contents

Introduction

1 Coding theory

2 Linear codes and the Hamming code

3 A strengthening of Lloyd's theorem

Conclusion

Popular summary in Dutch


Introduction

Codes are everywhere. Look around and you will probably see something in which codes are used. Think of mobile phones, computers, or even books (as they have an ISBN number). It may be clear that in this digital era, research in coding theory is of great importance.

Coding theory is a very young subject. Its origin lies in 1948, when Shannon's article "A mathematical theory of communication" [13] was published [6, 3]. It is a subject within discrete mathematics which mostly uses methods from (linear) algebra. Coding theory is also studied in computer science and information theory. This thesis, however, is about algebraic coding theory.

Coding theory is all about information transmission. For example, one could think of sending an e-mail, listening to a CD or saving data on a computer. For all these purposes, information is first encoded, then it is transmitted over a channel, and finally it is decoded for the receiver. The following diagram gives a schematic representation of this process [6].

Source → Encoder → Channel → Decoder → Receiver

A simple example from Hill's book [3] is the following. Say the message is YES or NO, and it is encoded by YES = 11111 and NO = 00000. Assume YES is transmitted over a noisy channel, and instead of 11111 we receive 10110. The decoder decodes this to 11111, since this is the 'nearest' codeword. Even though a few errors occurred, the receiver still got the right message. This is therefore an example of an error-correcting code.

An important goal in coding theory is finding good error-correcting codes, not only because you want to be able to listen to your CD even when there is a little dust on top, but also because you do not want your data to get corrupted when doing scientific research. In computer memory, the error-correcting 'Hamming code' is widely used [4]. The Hamming code is an example of a perfect code and will be discussed in this thesis.

The final goal of this thesis is to prove a theorem which gives an important necessary condition for perfect codes. Lloyd's theorem gives this condition for a prime power q. The article "Two theorems on perfect codes", written by H. W. Lenstra [8], is used to prove Lloyd's theorem for every q ≥ 2.

In the first chapter some basic notions of coding theory are discussed, among which perfect codes. The second chapter is about linear codes; in particular we define the Hamming code. In the third and last chapter we prove the strengthening of Lloyd's theorem and discuss some corollaries.


1 Coding theory

Before talking about specific types of codes, we introduce a few basic notions of coding theory. We start by explaining what codes are and then discuss some other terminology, following Chapters 2, 4 and 5 of Ling and Xing's book [6]. The reader is expected to be familiar with the basics of algebra, linear algebra and set theory.

We start by giving the definition of an alphabet and a word.

Definition 1.1 (Alphabet, word). Let q, n ∈ N. An alphabet [q] is defined as [q] := {0, 1, . . . , q − 1}. A word is an element v ∈ [q]^n, i.e., a word v is an n-tuple of the form (v_1, v_2, . . . , v_n), with v_i ∈ [q] for 1 ≤ i ≤ n.

Often we just write v = v_1 v_2 . . . v_n, if it is clear from the context that we are talking about words. Using the above definition we easily define a code.

Definition 1.2 (Code). A code C of length n over an alphabet [q] is a subset of [q]^n. Words in C are called codewords. The size of a code C is the number of codewords it contains. A code C is called a q-ary code if the alphabet contains q letters.

A very important concept in coding theory is the Hamming distance.

Definition 1.3 (Hamming distance). The Hamming distance d(v, w) of two words v, w ∈ [q]^n is the number of positions in which v and w differ, i.e.,

d(v, w) = |{i | v_i ≠ w_i}|.

The Hamming distance is a metric on the set of words of a fixed length n over an alphabet [q].

Definition 1.4 (Minimum distance). The minimum distance d(C) of a code C is the minimum of the Hamming distances over all pairs of distinct codewords in C, i.e.,

d(C) = min{d(c, c′) | c, c′ ∈ C, c ≠ c′}.

It is also simply called the distance of C.

Example 1. Let q = 2 and n = 4. Then C = {1000, 1110, 1011, 1101, 0111, 0001} ⊆ [2]^4 is a binary code of size 6. We see that d(1110, 0001) = 4, but d(C) = 2.

As mentioned in the introduction, coding theory is about the transmission of encoded information. Now that we know the concept of distance, we introduce a way to decode transmitted codewords. This method is called nearest neighbour decoding, or also minimum distance decoding. Let C ⊆ [q]^n be a code and assume that codewords in C are sent over a noisy channel. If a word v ∈ [q]^n is received, we can decode v to its 'nearest neighbour'. In other words, v is decoded to a codeword c ∈ C such that d(v, c) ≤ d(v, c′) for all c′ ∈ C with c′ ≠ c. If the inequality is not strict, there are at least two codewords in C which are closest to v, which means there is no unique way of decoding v. If we use complete decoding, one of these nearest codewords is chosen at random. When we use incomplete decoding, retransmission is requested.
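To make the decoding rule concrete, the following Python fragment (our illustration, not part of the original thesis) implements the Hamming distance, the minimum distance of a code and complete nearest neighbour decoding, and applies them to the YES/NO code from the introduction.

```python
from itertools import combinations

def hamming_distance(v, w):
    """Number of positions in which the words v and w differ."""
    return sum(a != b for a, b in zip(v, w))

def minimum_distance(code):
    """d(C): minimum Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(c, c2) for c, c2 in combinations(code, 2))

def nearest_neighbour_decode(v, code):
    """Complete decoding: return a codeword at minimal distance to v.
    Ties are broken arbitrarily (here: by the order of the codewords)."""
    return min(code, key=lambda c: hamming_distance(v, c))

code = ["11111", "00000"]                       # YES = 11111, NO = 00000
print(nearest_neighbour_decode("10110", code))  # -> 11111, i.e. YES
print(minimum_distance(code))                   # -> 5
```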

Using the nearest neighbour decoding rule, we now define error-detecting and error-correcting codes. Fix η ∈ N.

Definition 1.5 (Error-detecting). A code C is called η-error-detecting if after transmission of a codeword c ∈ C, up to η errors can be detected. In other words, if a codeword c changes in at least one and at most η positions, the obtained word v is not in C.

Definition 1.6 (Error-correcting). A code C is called η-error-correcting if after transmission of a codeword c ∈ C, up to η errors can be corrected. This means that if e ≤ η errors occur, i.e., c is changed into a word v with d(c, v) = e, then there is no c′ ∈ C with c′ ≠ c and d(c′, v) ≤ e.

Example 2. Consider the code of Example 1. We take the codeword c = 1000 and change it in one position, say to v = 1001 ∈ [2]^4. Then v ∉ C. This means that we know some error has occurred, so C is at least 1-error-detecting.

Now assume two errors occur; for example, c is changed into c′ = 1110. Then c′ ∈ C, so we cannot tell that this is not the original codeword c. Thus C is not 2-error-detecting.

Example 3. Let q = 2 and n = 5. Consider the code C = {10000, 11110, 00011} and look at the codeword c = 10000. If c changes in one position to a word v ∈ [2]^5, say v = 11000, then the nearest codeword in C to v is c; hence the code is 1-error-correcting. If two errors occur, say c is changed into w = 11100, then the codeword with smallest distance to w is 11110, hence C is not 2-error-correcting.

We give some theorems about the error-detecting and error-correcting properties of a code, again assuming nearest neighbour decoding is used.

Theorem 1.7. A code C is η-error-detecting if and only if d(C) ≥ η + 1.

Proof. '⇒' Assume C is η-error-detecting but d(C) < η + 1. Then d(C) ≤ η implies that there exist codewords c, c′ ∈ C with d(c, c′) = d(C) ≤ η. Now suppose c is transmitted and the d(C) ≤ η errors occur exactly in the positions where c and c′ differ. Then c′ is the received word. But c′ ∈ C, so the code is not η-error-detecting, a contradiction. This implies d(C) ≥ η + 1.

'⇐' Assume d(C) ≥ η + 1 and suppose at least one and at most η errors occur during transmission. Let c be the transmitted codeword and v the received word. Because at least one and at most η positions are changed, 1 ≤ d(c, v) ≤ η; because d(C) ≥ η + 1 we now know v ∉ C, so C is η-error-detecting.

Theorem 1.8. A code C is η-error-correcting if and only if d(C) ≥ 2η + 1.

Proof. '⇒' Assume C is η-error-correcting and suppose d(C) ≤ 2η. Then there exist codewords c and c′ with d(c, c′) = d(C) ≤ 2η. We may assume d(c, c′) ≥ η + 1, for otherwise C is not even η-error-detecting (Theorem 1.7) and hence not η-error-correcting. Now assume c is transmitted and the received word v differs from c in exactly η positions, all of them positions in which c and c′ differ. Then d(c, v) = η and d(c′, v) = d(C) − η ≤ η = d(v, c). So either d(c′, v) < d(v, c) or d(c′, v) = d(v, c). In the first case v is decoded to c′, which is not the transmitted codeword. In the second case it depends on whether the complete or the incomplete decoding method is used, but in both cases the error cannot be corrected. This implies d(C) ≥ 2η + 1.

'⇐' Assume d(C) ≥ 2η + 1 and suppose that up to η errors occur during transmission. Take c ∈ C; then for the received word v we know d(c, v) ≤ η. For every c′ ∈ C with c′ ≠ c we know by the triangle inequality that d(c, c′) ≤ d(c, v) + d(v, c′). This implies d(v, c′) ≥ d(c, c′) − d(c, v) ≥ 2η + 1 − η = η + 1 > η ≥ d(v, c). So the nearest neighbour of the received word v is c. Thus C is an η-error-correcting code.

From the above theorems it follows that a code C with d(C) = d is a (d − 1)-error-detecting code and a ⌊(d − 1)/2⌋-error-correcting code.

Example 4. The code in Example 1 has minimum distance 2, so the above theorems confirm our finding that the code is 2 − 1 = 1-error-detecting.

The code from Example 3 has minimum distance 3, so Theorem 1.8 gives that it is ⌊(3 − 1)/2⌋ = 1-error-correcting. This also matches what we saw in that example.

Similar to the definition of the Hamming distance is the definition of the Hamming weight.

Definition 1.9 (Hamming weight). The Hamming weight wt(v) of a word v is the number of positions in which v is not zero, i.e.,

wt(v) = |{i | v_i ≠ 0}|.

The Hamming weight of a word v is equal to the Hamming distance between v and 0.

Lemma 1.10. If v, w ∈ [q]^n, then d(v, w) = wt(v − w), where the subtraction is taken coordinatewise modulo q.

Proof. One computes that wt(v − w) = |{i | (v − w)_i ≠ 0}| = |{i | v_i − w_i ≠ 0}| = |{i | v_i ≠ w_i}| = d(v, w).

The Hamming weight can also be defined for a code C instead of words.

Definition 1.11 (Minimum Hamming weight). The minimum Hamming weight wt(C) of a code C is the minimum of the Hamming weights of all non-zero codewords c ∈ C, i.e.,

wt(C) = min{wt(c) | c ∈ C, c ≠ 0}.

Next, we show that we can always assume that the zero word is in a code C. To this end we define the isometry group.


Definition 1.12 (Isometry group). The isometry group of a metric space, denoted by G_iso, is the group of bijective distance-preserving maps from the space to itself. The group operation is composition, and the identity element is the identity map.

This group acts on [q]^n by permuting the positions of the words and by permuting the symbols in fixed positions. If C′ is obtained from C by applying isometries, we call C and C′ isometric. Since the distance is preserved, two isometric codes have the same distance-based properties, such as the extent to which they are error-correcting.

It is clear that via a permutation of symbols we can achieve that the zero word is in C.

Lemma 1.13. Every code C is isometric to a code C0 which contains the zero word 0 = 00 . . . 0.

Now that we know some basic concepts of coding theory, we define something a little more interesting.

Definition 1.14. For a given alphabet [q], length n and d ∈ N, we define A_q(n, d) as the maximum size a code over [q] of length n with minimum distance at least d can have, i.e.,

A_q(n, d) = max{|C| : C ⊆ [q]^n and d(C) ≥ d}.

A code C for which |C| = A_q(n, d) holds is called an optimal code with respect to q, n and d.

From an example this definition will probably become much clearer. Two easy examples are A_q(n, 1) = q^n and A_q(n, n) = q. The first one holds since [q]^n itself is a code over [q] of length n with minimum distance 1. The second equality is true since if n is the minimum distance of a length-n code, then the distance between every two distinct codewords is exactly n. Therefore, in each position the symbols of the codewords are mutually different, so the size of such a code is at most q.

Let us give another example.

Example 5. Set q = 2, n = 4 and d = 3. We want to know the maximum size of a code C ⊆ [2]^4 with d(C) ≥ 3. We may assume 0000 ∈ C (Lemma 1.13). Any further codeword then has weight at least 3; up to permuting positions, we may assume its non-zero entries are in the first positions, so the only two options are 1110 and 1111. In both cases one checks that no other word of [2]^4 can be added while keeping the minimum distance of C at least 3. Hence the maximum size of C is 2, i.e., A_2(4, 3) = 2.
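For parameters as small as in Example 5, the value of A_q(n, d) can also be found by exhaustive search. The sketch below (ours; it only terminates quickly for very small q and n) tries all subsets of [q]^n from large to small and returns the first size at which some subset has minimum distance at least d.

```python
from itertools import combinations, product

def hamming_distance(v, w):
    return sum(a != b for a, b in zip(v, w))

def A(q, n, d):
    """Exhaustive computation of A_q(n, d), feasible only for tiny q and n."""
    words = list(product(range(q), repeat=n))
    for size in range(len(words), 0, -1):
        for cand in combinations(words, size):
            if all(hamming_distance(u, v) >= d for u, v in combinations(cand, 2)):
                return size
    return 0

print(A(2, 4, 3))   # -> 2, confirming Example 5
```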

We can translate the problem of finding A_q(n, d) into a problem in graph theory. Consider the graph G = (V, E) where the vertices are given by V = [q]^n and the edge set is given by E = {uv | u ≠ v and d(u, v) < d}. Then finding A_q(n, d) is the same as determining the size of a maximum independent set of the graph G, called the independence number. Since finding the independence number is an NP-complete problem [2], we know it is difficult to determine A_q(n, d) in general. This is also known as the main coding theory problem. However, it is fairly easy to give a lower and an upper bound.

In the figures below we roughly show how to picture this. For some q, n and minimum distance d, the rectangles denote the space [q]^n, the bold dots are codewords in a code C ⊆ [q]^n and the smaller dots are the other words in [q]^n. The codewords are the centres of the spheres. Figure 1.1 shows how we obtain a lower bound on the number of codewords in C, by dividing the total number of words in [q]^n by the number of words in each sphere of radius d − 1. The empty space in the spheres, and the fact that they may overlap while each contains exactly one codeword, explain why this only gives a lower bound for A_q(n, d). In Figure 1.2 we see that if A_q(n, d) attains its upper bound, then the union of the spheres of radius e = ⌊(d − 1)/2⌋ is the whole space.

Figure 1.1: Sphere-covering bound. Figure 1.2: Sphere-packing bound.

From now on, for c ∈ C ⊆ [q]^n, the sphere with centre c and radius r is denoted by

B(c, r) := {v ∈ [q]^n | d(c, v) ≤ r}.

The number of words in B(c, r) is given by

\[ |B(c, r)| = \sum_{i=0}^{r} \binom{n}{i} (q-1)^i, \tag{1.1} \]

since the number of words at distance i from an arbitrary word c is \(\binom{n}{i}(q-1)^i\).

Theorem 1.15 (Sphere-covering bound). Let q, n, d ∈ N with q > 1 and 1 ≤ d ≤ n. Then

\[ A_q(n, d) \ge \frac{q^n}{\sum_{i=0}^{d-1} \binom{n}{i} (q-1)^i}. \]

Proof. Let C = {c_1, . . . , c_M} be an optimal code with respect to q, n and d, so M = A_q(n, d). If v ∈ [q]^n \ C, then d(v, c_i) < d for at least one c_i ∈ C: if not, we could add v to C, and the resulting code would still have minimum distance at least d and size bigger than M, contradicting optimality. Hence for every v ∈ [q]^n \ C there exists at least one c_i ∈ C with d(v, c_i) ≤ d − 1, i.e., v ∈ B(c_i, d − 1). Therefore

\[ [q]^n \subseteq \bigcup_{i=1}^{M} B(c_i, d-1), \quad \text{which implies} \quad q^n \le \Big| \bigcup_{i=1}^{M} B(c_i, d-1) \Big| \le M \cdot |B(c_1, d-1)|. \]

Hence with (1.1) we get

\[ q^n \le A_q(n, d) \sum_{i=0}^{d-1} \binom{n}{i} (q-1)^i, \]

which completes the proof.

Theorem 1.16 (Sphere-packing bound). Let q, n, d ∈ N with q > 1 and 1 ≤ d ≤ n. Then

\[ A_q(n, d) \le \frac{q^n}{\sum_{i=0}^{e} \binom{n}{i} (q-1)^i}, \]

where e = ⌊(d − 1)/2⌋.

Proof. Take C an optimal code with respect to q, n and d. Using the notation from the previous proof, we see that for c ∈ C the spheres B(c, e) are pairwise disjoint, since d(C) ≥ 2e + 1, and that ⨆_{c∈C} B(c, e) ⊆ [q]^n. This implies |C| · |B(c, e)| ≤ q^n and therefore the theorem holds.

The sphere-packing bound is also called the Hamming bound. When the size of a code attains the Hamming bound, we obtain a special code.

Definition 1.17 (Perfect code). A code C ⊆ [q]^n with d(C) = 2e + 1 is called perfect if it satisfies

\[ |C| = \frac{q^n}{|B(c, e)|}. \]

Such a code is e-error-correcting and is called a perfect e-code.

In other words, a code is perfect if for all v ∈ [q]^n there exists a unique c ∈ C with d(v, c) ≤ e (see also Figure 1.2).

Example 6. There are so-called trivial perfect codes. If C contains just one word, equals [q]^n, or is the binary repetition code {00 . . . 0, 11 . . . 1} ⊆ [2]^n with n odd, then C is a perfect e-code: with e = 0 if C = [q]^n, with e = n if |C| = 1, and with e = (n − 1)/2 for the so-called repetition code.


2 Linear codes and the Hamming code

In the first chapter we have seen some basic notions of coding theory. In this chapter we focus on a specific type of codes: linear codes. As a special example we discuss the Hamming code. Unless mentioned otherwise, we follow Chapter 4 in Ling and Xing [6].

For q a prime power, we write F_q for the finite field with q elements.

Definition 2.1 (Linear code). Let q, n ∈ N. A linear code C of length n is a subspace of the vector space F_q^n. The dimension of a code is the dimension of the subspace C, denoted dim(C). If dim(C) = k we call C an [n, k]-code. Moreover, if d(C) = d, then C is called an [n, k, d]-code.

The size of a linear code C ⊆ F_q^n is given by |C| = q^{dim(C)}. As an example of a non-linear code we have the code of Example 1, since 0 ∉ C.

Lemma 2.2. If C is a linear code, then wt(C) = d(C).

Proof. Assume wt(C) = d. Then there exists a c ∈ C with wt(c) = d(c, 0) = d, and since 0 ∈ C this implies d(C) ≤ d, i.e., d(C) ≤ wt(C). Now suppose d(C) = d. Then there exist c, c′ ∈ C with d(c, c′) = d = wt(c − c′) (Lemma 1.10), and c − c′ ∈ C by linearity, which implies wt(C) ≤ d(C). Hence wt(C) = d(C).

Example 7. Define C = {000, 100, 001, 101} ⊆ F_2^3; then C is a linear code. A basis for C is the set {100, 001}. Consider the matrix

\[ G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Then we have that C is generated by the rows of G, i.e., C is the row space of G. We define such a matrix in general.

Definition 2.3 (Generator matrix). A generator matrix for an [n, k]-code C is a k × n matrix whose rows are the elements of a basis for C. We say that G is in standard form if G = (I_k | X) for some k × (n − k) matrix X.

In the introduction we gave an example of so-called source encoding: we encoded the words YES and NO into the codewords 11111 and 00000, respectively. If we encode already encoded information, we call this channel encoding. For this, the generator matrix can be used.

Example 8. Consider the matrix G from Example 7. Then, using this generator matrix, a word v ∈ F_2^2 is encoded as w = vG.
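Channel encoding with a generator matrix is a single vector-matrix product over F_q. A minimal sketch (ours), using the matrix G from Example 7 over F_2:

```python
import numpy as np

# Generator matrix G from Example 7 (a [3, 2]-code over F_2).
G = np.array([[1, 0, 0],
              [0, 0, 1]])

def encode(v, G, q=2):
    """Channel encoding: map a message v in F_q^k to the codeword vG."""
    return np.mod(np.asarray(v) @ G, q)

# The four messages in F_2^2 and their codewords (= the code C of Example 7):
for v in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(v, "->", encode(v, G))
```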


Just as in linear algebra, we define for every code C a dual code C⊥.

Definition 2.4 (Dual code). The dual code C⊥ of a linear code C ⊆ F_q^n is defined by

C⊥ = {c′ ∈ F_q^n | c′ · c = 0 for all c ∈ C},

where the dot product is defined as c′ · c = ∑_{i=1}^{n} c′_i c_i. The dual code C⊥ is also a linear code and is called the orthogonal complement of C.

Definition 2.5 (Self-orthogonal, self-dual). A code C is called self-orthogonal if C ⊆ C⊥. When C = C⊥ we call C self-dual.

Example 9. Let q = 2 and n = 4. The linear code C = {0000, 1001, 0110, 1111} ⊆ F_2^4 is self-dual, since C⊥ = {c_1c_2c_3c_4 ∈ F_2^4 | c_1 = c_4 and c_2 = c_3} = C.

Definition 2.6 (Parity-check matrix). A parity-check matrix for an [n, k]-code C is an (n − k) × n matrix that is a generator matrix for C⊥.

The parity-check matrix is very useful in determining errors. To show this, we first prove the following theorem.

Theorem 2.7. Let G be a k × n generator matrix for C. Then c′ ∈ C⊥ if and only if c′Gᵀ = 0, i.e., if and only if c′ is orthogonal to every row of G.

Proof. '⇒' Assume c′ ∈ C⊥. Then c′ · c = 0 for all c ∈ C. In particular, the inner product of c′ with every element of a basis of C is zero, so c′ is orthogonal to every row of G. Let r_i denote the rows of G, for 1 ≤ i ≤ k. Then c′Gᵀ = (c′ · r_1, . . . , c′ · r_k), hence we have c′Gᵀ = 0.

'⇐' Assume c′ is orthogonal to every row r_i of G, so c′Gᵀ = (c′ · r_1, . . . , c′ · r_k) = 0. Since the rows of G form a basis of C, every c ∈ C can be written as c = λ_1 r_1 + . . . + λ_k r_k for certain λ_i ∈ F_q. Then

\[ c' \cdot c = c' \cdot (\lambda_1 r_1 + \dots + \lambda_k r_k) = \lambda_1 (c' \cdot r_1) + \dots + \lambda_k (c' \cdot r_k) = 0. \]

So c′ ∈ C⊥.

An equivalent formulation of the above theorem is the following: if H is a parity-check matrix for C, then c ∈ C if and only if cHᵀ = 0. Therefore, using the parity-check matrix we can easily check whether an error has occurred. Moreover, it can tell us in which position the codeword has changed. Define for 1 ≤ i ≤ n the standard basis vectors b_i as the vectors consisting of all zeros and a 1 in position i. Let c be a codeword in a code C and assume one error has occurred in position i, so c changed into the word v = c + λb_i for some non-zero λ ∈ F_q. This gives

\[ vH^\top = (c + \lambda b_i)H^\top = cH^\top + \lambda (b_i H^\top) = \lambda (b_i H^\top), \]

where the last equality holds because cHᵀ = 0. The right-hand side of the equation is a multiple of the i-th row of Hᵀ, i.e., a multiple of the i-th column of H. So if vHᵀ is a multiple of the i-th column of H, the error occurred in position i [5].


Corollary 2.8. Let G be a generator matrix for an [n, k]-code C. An (n − k) × n matrix H is a parity-check matrix for C if and only if its rows are linearly independent and HGᵀ = 0.

Proof. '⇒' If H is a parity-check matrix for C, then the rows of H form a basis for C⊥. This directly implies that the rows of H are linearly independent, and for every row of H the inner product with every row of G is zero (Theorem 2.7). So HGᵀ = 0.

'⇐' HGᵀ = 0 implies that the rows of H are orthogonal to every row of G, so the rows of H lie in C⊥. Since the rows are linearly independent, the row space of H has dimension n − k, the number of rows. Since the dimension of C⊥ is also n − k, the rows span C⊥ and hence H is a parity-check matrix for C.

Theorem 2.9. If G = (I_k | X) is a generator matrix in standard form for a code C, then H = (−Xᵀ | I_{n−k}) is a parity-check matrix for C.

Proof. We need to check that the rows of H are linearly independent and that HGᵀ = 0 (Corollary 2.8). The rows of H are linearly independent since the last n − k columns form the identity matrix. A direct calculation gives HGᵀ = −Xᵀ + Xᵀ = 0, so H is a parity-check matrix for C.
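Theorem 2.9 translates directly into code. The following sketch (our illustration; the matrix G is an arbitrary small example, not one from the text) builds H = (−Xᵀ | I_{n−k}) from G = (I_k | X) and checks HGᵀ = 0 as in Corollary 2.8:

```python
import numpy as np

def parity_check_from_standard_form(G, q):
    """Given G = (I_k | X) over F_q, return H = (-X^T | I_{n-k}) (Theorem 2.9)."""
    k, n = G.shape
    X = G[:, k:]
    return np.hstack([np.mod(-X.T, q), np.eye(n - k, dtype=int)])

# An arbitrary [4, 2]-code over F_2 in standard form:
G = np.array([[1, 0, 1, 1],
              [0, 1, 0, 1]])
H = parity_check_from_standard_form(G, q=2)
print(H)                    # [[1 0 1 0]
                            #  [1 1 0 1]]
print(np.mod(H @ G.T, 2))   # the zero matrix, as Corollary 2.8 requires
```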

From this theorem it follows that a linear code can also be defined by giving a parity-check matrix.

Definition 2.10 (Equivalent). Two linear codes C and C′ in F_q^n with generator matrices G and G′ respectively, are called equivalent if G′ can be constructed from G by the following operations:

(i) rearranging the order of the columns of G;

(ii) multiplying the symbols in a certain column of G by a non-zero scalar.

Lemma 2.11. Every linear code C is equivalent to a code C′ whose generator matrix is in standard form.

Proof. Let C be a linear code with generator matrix G. We know from linear algebra that we can bring G into row reduced echelon form by using elementary row operations. Then by rearranging the columns we get a matrix G′ in standard form. The code C′ given by the row space of G′ is equivalent to C.

Example 10. Consider the generator matrix G of Example 7. We bring G into standard form by interchanging the second and third columns. Then we get

\[ G' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \]

By Theorem 2.9 we see that the 1 × 3 matrix

\[ H = \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \]

is a parity-check matrix for C′. Indeed, the row space of H, the set {000, 001}, is equal to (C′)⊥.


Theorem 2.12. Let C be a linear code and H a parity-check matrix of C. Then d(C) = d if and only if every d − 1 columns of H are linearly independent and some d columns of H are linearly dependent.

Proof. Assume d(C) = d. Then wt(C) = d (Lemma 2.2), so there exists a c ∈ C with wt(c) = d. This means d positions of c are non-zero; assume these are the first d positions, so if c = c_1c_2 . . . c_n, then c_1, . . . , c_d ≠ 0 and c_{d+1} = . . . = c_n = 0. By Theorem 2.7 we know c ∈ C ⇔ cHᵀ = 0. Let k_i denote the columns of H; then

\[ 0 = cH^\top = c_1 k_1^\top + c_2 k_2^\top + \dots + c_d k_d^\top + c_{d+1} k_{d+1}^\top + \dots + c_n k_n^\top = c_1 k_1^\top + c_2 k_2^\top + \dots + c_d k_d^\top, \]

so these d columns are linearly dependent. Because d(C) = d, we also know that C does not contain any non-zero codeword v with wt(v) < d. If some d − 1 columns of H were linearly dependent, the coefficients of such a dependency would form a non-zero word v with wt(v) ≤ d − 1 and vHᵀ = 0, hence v ∈ C by Theorem 2.7, a contradiction. So every d − 1 columns are linearly independent. The converse follows by running the same arguments backwards.

Now that we are familiar with linear codes, we can give some examples of nontrivial perfect codes. The first example needs some introduction.

For v, w ∈ F_q^r \ {0} with q ≥ 2, we know that ⟨v⟩ = ⟨w⟩ if and only if there exists a λ ∈ F_q \ {0} with v = λw. This implies there are (q^r − 1)/(q − 1) different one-dimensional subspaces of F_q^r.

Definition 2.13 (Hamming code). Let r ≥ 2 and q a prime power. A linear code C with parity-check matrix H is called a Hamming code if the columns of H consist of one non-zero vector from each of the one-dimensional subspaces of F_q^r. We denote Hamming codes by Ham(r, q).

Since the definition fixes neither the order of the columns nor the choice of the representatives, we have for fixed r and q that all Hamming codes Ham(r, q) are equivalent.

Theorem 2.14. The Hamming code Ham(r, q) has the following properties:

(i) it is a [(q^r − 1)/(q − 1), (q^r − 1)/(q − 1) − r, 3]-code;

(ii) it is a perfect 1-code.

Proof. (i) The number of columns of the parity-check matrix H of Ham(r, q) is (q^r − 1)/(q − 1). By definition of the parity-check matrix this is the length of the code. The number of rows of the parity-check matrix is r, hence the dimension of Ham(r, q) is (q^r − 1)/(q − 1) − r. The minimum distance is determined with Theorem 2.12. Any two columns of H are linearly independent by definition of the Hamming code. On the other hand, H contains columns that are non-zero scalar multiples of (1, 0, . . . , 0)ᵀ, (0, 1, 0, . . . , 0)ᵀ and (1, 1, 0, . . . , 0)ᵀ. These three columns form a linearly dependent set, hence the minimum distance is 3.

(ii) From part (i) we know that the minimum distance of Ham(r, q) is 3, hence the Hamming code is 1-error-correcting. To prove it is perfect we show that the size of Ham(r, q) attains the Hamming bound. Let n = (q^r − 1)/(q − 1) be the length and e = 1; then

\[ \frac{q^n}{|B(c, e)|} = \frac{q^n}{1 + n(q-1)} = \frac{q^{\frac{q^r-1}{q-1}}}{1 + \frac{q^r-1}{q-1}(q-1)} = \frac{q^{\frac{q^r-1}{q-1}}}{q^r} \]

is the Hamming bound. The number of elements of Ham(r, q) is

\[ |\mathrm{Ham}(r, q)| = q^{\dim(\mathrm{Ham}(r, q))} = q^{\frac{q^r-1}{q-1} - r} = \frac{q^{\frac{q^r-1}{q-1}}}{q^r}. \]

Hence we conclude the Hamming code is a perfect 1-code.

Let us give two examples of binary Hamming codes. The first one is the binary repetition code of length 3, mentioned in Example 6. The cover image is a geometric representation of this code.

Example 11. Let r = 2 and q = 2. Every non-zero vector in F_2^2 generates a one-dimensional subspace, and because q = 2 these subspaces are all distinct. Hence the columns of the parity-check matrix H of Ham(2, 2) are exactly the non-zero vectors in F_2^2. So we get

\[ H = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}. \]

By Theorem 2.9 the matrix

\[ G = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix} \]

is a generator matrix for Ham(2, 2). Hence we have Ham(2, 2) = {000, 111}. Another example of a Hamming code is Ham(3, 2).

Example 12. Let r = 3 and q = 2. Then we have for Ham(3, 2) the parity-check matrix

\[ H = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}, \]

formed by all non-zero vectors in F_2^3. This Hamming code is a [7, 4]-code.
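The code Ham(3, 2) and its properties can be verified by brute force, and the error-locating procedure described earlier in this chapter can be demonstrated on it. In the sketch below (ours; the column order of H differs from Example 12, which gives an equivalent code) we list all codewords, read off the minimum weight, and locate a single error via the syndrome vHᵀ:

```python
from itertools import product
import numpy as np

# Parity-check matrix: the columns are all non-zero vectors of F_2^3.
H = np.array([v for v in product([0, 1], repeat=3) if any(v)]).T

# Ham(3, 2) is the null space of H over F_2: all v in F_2^7 with vH^T = 0.
code = [v for v in product([0, 1], repeat=7)
        if not np.mod(np.array(v) @ H.T, 2).any()]

print(len(code))                            # -> 16 = 2^(7-3) codewords
print(min(sum(c) for c in code if any(c)))  # -> 3 = wt(C) = d(C), by Lemma 2.2

# Flip one bit of a codeword; the syndrome vH^T equals the column of H
# at the corrupted position, which identifies the error.
c = np.array(code[5])
v = c.copy()
v[2] ^= 1                                   # error in position 3 (index 2)
syndrome = np.mod(v @ H.T, 2)
print([j for j in range(7) if (H[:, j] == syndrome).all()])   # -> [2]
```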

Other important examples of linear perfect codes are the binary and the ternary Golay code. The binary Golay code is denoted by G_23. For a certain 12 × 11 matrix A, this code is generated by the 12 × 23 matrix G = (I_12 | A). The matrix A can be found in Ling and Xing [6], and for more details on the background of this matrix we refer to Van Lint [11]. The binary Golay code is a perfect 3-code. The ternary Golay code is denoted by G_11 and is a perfect 2-code. Codes with the same parameters as the binary and the ternary Golay code are known to be unique up to equivalence.

3 A strengthening of Lloyd’s theorem

In this chapter we prove a strengthening of Lloyd's theorem, as done by H. W. Lenstra in his article "Two theorems on perfect codes" [8]. Lloyd's theorem gives a necessary condition for the existence of perfect codes. As described in [10], this theorem was first proved by Lloyd for q = 2 and then generalized to prime powers q by F. J. MacWilliams and many others. It was first shown by Delsarte in 1971 that the theorem holds for all q ≥ 2. One year later, Lenstra gave another proof of this strengthening of Lloyd's theorem. This is the proof we follow in this chapter.

First, we briefly give an overview of some knowledge that is necessary for the proof of the theorem: we recall the Chinese Remainder Theorem and modules, and we introduce the Krawtchouk polynomials.

For the proof of Theorem 3.1 and Corollary 3.2 we refer to Lang [7].

Theorem 3.1 (Chinese Remainder Theorem). Let I_1, . . . , I_n be ideals of a commutative ring R which are mutually co-prime, so I_i + I_j = R for all i ≠ j. Then for given elements x_1, . . . , x_n ∈ R there exists an x ∈ R such that x ≡ x_i (mod I_i) for all i.

The following corollary of the Chinese Remainder Theorem is used in Lemma 3.13.

Corollary 3.2. Let I_1, . . . , I_n be mutually co-prime ideals of a commutative ring R, so I_i + I_j = R for all i ≠ j. Then the map

\[ f : R \to \prod_{i=1}^{n} R/I_i, \]

induced by the canonical map of R onto R/I_i for each factor, is surjective, and the kernel of f is ⋂_{i=1}^{n} I_i. Hence we have an isomorphism

\[ R \Big/ \bigcap_{i=1}^{n} I_i \xrightarrow{\;\sim\;} \prod_{i=1}^{n} R/I_i. \]

We define a module as in Atiyah [1].

Definition 3.3 (Module, submodule). Let R be a ring. A left R-module is an abelian group M, written additively, together with an operation

R × M → M, (r, m) ↦ r · m,

such that for all r, s ∈ R and m, n ∈ M we have

\[ (r + s) \cdot m = r \cdot m + s \cdot m, \qquad r \cdot (m + n) = r \cdot m + r \cdot n, \qquad (rs) \cdot m = r \cdot (s \cdot m), \qquad 1 \cdot m = m. \]


Equivalently, M is an R-module if there exists a ring homomorphism R → End(M ). An R-submodule of M is a subgroup N of M so that for every r ∈ R, n ∈ N we have r · n ∈ N .

Next, we briefly discuss Krawtchouk (or Kravčuk) polynomials. These polynomials play an important role in coding theory [11].

Definition 3.4 (Krawtchouk polynomial). For k, n, q ∈ N, the Krawtchouk polynomial is defined by

\[ K_k(x; n, q) := \sum_{j=0}^{k} (-1)^j \binom{x}{j} \binom{n-x}{k-j} (q-1)^{k-j}, \]

where \(\binom{x}{j} = \frac{x!}{j!\,(x-j)!}\). This polynomial has degree k and we often denote it by K_k(x).

To formulate Lloyd's theorem we need the Krawtchouk polynomial K_k(x − 1; n − 1, q). It is also called the Lloyd polynomial, and is denoted by L_k(x).
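Since in what follows the Lloyd polynomial only ever has to be evaluated at non-negative integers x, it can be computed directly from Definition 3.4 with integer binomial coefficients. A small sketch (ours):

```python
from math import comb

def krawtchouk(k, x, n, q):
    """K_k(x; n, q) evaluated at an integer x >= 0 (comb(a, b) = 0 for b > a)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) * (q - 1) ** (k - j)
               for j in range(k + 1))

def lloyd(k, x, n, q):
    """Lloyd polynomial L_k(x) = K_k(x - 1; n - 1, q)."""
    return krawtchouk(k, x - 1, n - 1, q)

# Ham(3, 2) is a perfect 1-code of length 7, so by Lloyd's theorem L_1
# must have an integral zero in {1, ..., 7}; indeed L_1(x) = 8 - 2x:
print([x for x in range(1, 8) if lloyd(1, x, 7, 2) == 0])   # -> [4]
```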

Let us recall the definition of a perfect code.

Definition 3.5 (Perfect code). A code C ⊆ [q]^n with d(C) = 2e + 1 is called perfect if it satisfies

\[ |C| = \frac{q^n}{|B(c, e)|}. \]

This definition gives a necessary condition for an e-error-correcting code to be perfect, namely that |B(c, e)| divides q^n. As mentioned before, Lloyd's theorem gives another necessary condition for a code to be perfect.

Theorem 3.6 (Lloyd's theorem). Let q be a prime power. If C is a perfect e-code of length n over [q], then the Lloyd polynomial L_e(x) has e distinct zeros among the integers {1, . . . , n}.

We will prove that this theorem holds for all q ≥ 2. After we finish the proof of this strengthening of Lloyd’s theorem, we give an example where the first perfectness condition holds, but for which Lloyd’s theorem tells us that there does not exist such a perfect code.

In this thesis, the proof of the theorem as given by Lenstra is split into several smaller parts. The lemmas we prove finally lead to the proof of the actual theorem.

First we introduce some notation. Assume q ≥ 2, let [q] = {0, 1, . . . , q − 1} as before and put N = {1, 2, . . . , n}. Let K denote a field of characteristic zero and M the vector space over K with the elements of [q]^n as basis vectors, so dim(M) = q^n. Furthermore, for a subset D ⊆ [q]^n we write ∑D for ∑_{v∈D} v.

Lemma 3.7. Let K[X_1, X_2, . . . , X_n] be the commutative polynomial ring in n symbols, and let B denote the ideal generated by {X_i^2 − qX_i | 1 ≤ i ≤ n}. Define R := K[X_1, X_2, . . . , X_n]/B. Then M is an R-module.

Proof. We want to show there exists a ring homomorphism f : R → End_K(M). The ring End_K(M) consists of the K-linear endomorphisms of M; these are just endomorphisms with an extra property, so by Definition 3.3 such a ring homomorphism makes M into an R-module. Define for all basis vectors v ∈ [q]^n the K-endomorphisms φ_i of M, for 1 ≤ i ≤ n, by

\[ \varphi_i(v) = \sum \{ v' \in [q]^n \mid v'_j = v_j \text{ for all } j \neq i \}. \tag{3.1} \]

For all φ_i we show

\[ \varphi_i \varphi_j = \varphi_j \varphi_i, \tag{3.2} \]

\[ \varphi_i^2 = q \varphi_i. \tag{3.3} \]

Writing Y = {v′ ∈ [q]^n | v′_k = v_k for all k ≠ j}, we have

\[ \varphi_i \varphi_j(v) = \varphi_i\Big( \sum Y \Big) = \sum_{v' \in Y} \varphi_i(v') = \sum_{v' \in Y} \sum \{ v'' \in [q]^n \mid v''_l = v'_l \text{ for all } l \neq i \} = \sum \{ v''' \in [q]^n \mid v'''_m = v_m \text{ for all } m \neq i, j \} = \varphi_j \varphi_i(v). \]

And for φ_i^2(v), now with Y = {v′ ∈ [q]^n | v′_k = v_k for all k ≠ i}, we see

\[ \varphi_i^2(v) = \varphi_i\Big( \sum Y \Big) = \sum_{v' \in Y} \varphi_i(v') = \sum_{v' \in Y} \sum \{ v'' \in [q]^n \mid v''_k = v_k \text{ for all } k \neq i \} = |Y| \, \varphi_i(v) = q \varphi_i(v). \]

There exists a K-linear ring homomorphism g : K[X_1, X_2, . . . , X_n] → End_K(M), since the φ_i commute (3.2), which maps 1 to the identity map and X_i to φ_i. By (3.3) we see B ⊆ ker(g), and hence there exists a homomorphism f : R → End_K(M) [7] mapping 1 to the identity and x_i := (X_i mod B) to φ_i. Therefore M is an R-module, by setting r · m = f(r)(m).

For the following lemma we define y_I = ∏_{i∈I} (x_i − 1) ∈ R for I ⊆ N.

Lemma 3.8. The set {y_I | I ⊆ N} is a K-basis for R.

Proof. First we show that for I ⊆ N and v ∈ [q]^n, the element y_I acts on the basis vector v of M as follows:

\[ y_I \cdot v = \sum \{ v' \in [q]^n \mid v_j = v'_j \iff j \notin I \}. \tag{3.4} \]

Indeed, for j, k ∈ I,

\[ y_I \cdot v = f(y_I)(v) = \prod_{i \in I} (\varphi_i - \mathrm{id})(v) = \prod_{i \in I \setminus \{j\}} (\varphi_i - \mathrm{id}) \Big( \sum \{ v' \in [q]^n \mid v'_l = v_l \iff l \neq j \} \Big) = \prod_{i \in I \setminus \{j, k\}} (\varphi_i - \mathrm{id}) \Big( \sum \{ v'' \in [q]^n \mid v''_m = v_m \iff m \notin \{j, k\} \} \Big) = \dots = \sum \{ v' \in [q]^n \mid v_j = v'_j \iff j \notin I \}. \]

From this we see that for different I and J, there is no word u occurring both in y_I · v and in y_J · v. Hence {y_I · v | I ⊆ N} is linearly independent over K, for every v ∈ [q]^n. Therefore {y_I | I ⊆ N} is linearly independent over K as well: if it were not, there would exist k_i ∈ K, not all zero, such that, with p denoting the number of subsets I ⊆ N, we have k_1 y_{I_1} + k_2 y_{I_2} + . . . + k_p y_{I_p} = 0. This would imply

\[ (k_1 y_{I_1} + k_2 y_{I_2} + \dots + k_p y_{I_p}) \cdot v = k_1 (y_{I_1} \cdot v) + k_2 (y_{I_2} \cdot v) + \dots + k_p (y_{I_p} \cdot v) = 0, \]

in contradiction with {y_I · v | I ⊆ N} being linearly independent. Besides, we see that |{y_I | I ⊆ N}| = 2^n = dim_K(R), since R has a K-basis of 2^n elements. So {y_I | I ⊆ N} is a K-basis for R.

The group S_n acts on R by permuting the x_i ∈ R, for 1 ≤ i ≤ n: for σ ∈ S_n we have σ(x_i) = x_{σ(i)} and σ(k) = k for k ∈ K. We define A as the subring of invariants under S_n,

\[ A = \{ r \in R \mid \sigma(r) = r \text{ for all } \sigma \in S_n \}. \tag{3.5} \]

Define z_j = ∑_{I⊆N, |I|=j} y_I for 0 ≤ j ≤ n.

Lemma 3.9. The set {zj | 0 ≤ j ≤ n} forms a K-basis for A.

Proof. Let us define, for 1 ≤ i ≤ n, the elementary symmetric polynomials e_i(X_1, . . . , X_n) by e_1 = ∑_i X_i, e_2 = ∑_{i<j} X_i X_j, . . . , e_n = X_1 X_2 ⋯ X_n. Besides, we set e_0 = 1. So for 0 ≤ i ≤ n, the e_i are invariant under S_n. We see that z_j = e_j(x_1 − 1, . . . , x_n − 1), so clearly z_j ∈ A. By the same reasoning as in the previous lemma we see that {z_j | 0 ≤ j ≤ n} is linearly independent over K. Since the symmetric polynomials form a K-basis for A, we have |{z_j | 0 ≤ j ≤ n}| = n + 1 = dim_K(A). Hence we conclude that {z_j | 0 ≤ j ≤ n} is a K-basis for A.

We see, for j ∈ N and v ∈ [q]^n, that z_j acts on the basis vector v of M as follows:

\[ z_j \cdot v = \sum_{\substack{I \subseteq N \\ |I| = j}} f(y_I)(v) = \sum_{\substack{I \subseteq N \\ |I| = j}} \sum \{ v' \in [q]^n \mid v_k = v'_k \iff k \notin I \} = \sum \{ v' \in [q]^n \mid d(v, v') = j \}. \tag{3.6} \]

Here the second equality follows from Lemma 3.8, equation (3.4). The third equality holds because v_k = v′_k ⟺ k ∉ I implies v_k ≠ v′_k ⟺ k ∈ I; since we sum over all I ⊆ N with |I| = j, this gives exactly the words v′ with d(v, v′) = j.

Furthermore M is an A-module, as A is a subring of R.

Now let S_{[q]^n} denote the full permutation group of [q]^n, i.e., the set of bijections from [q]^n to [q]^n. For fixed u ∈ [q]^n, consider the subgroup

(G_iso)_u = {τ ∈ S_{[q]^n} | τ(u) = u and d(v, v′) = d(τ(v), τ(v′)) for all v, v′ ∈ [q]^n}.

From now on we take u to be the zero word 0 = 0 . . . 0. The group (G_iso)_0 consists of the isometries that fix 0. So the elements of (G_iso)_0 permute the positions of the vectors, and the non-zero symbols in any position.

We consider the set of invariants in M under the group (G_iso)_0, denoted by

M^{(G_iso)_0} = {m ∈ M | τ(m) = m for all τ ∈ (G_iso)_0}.

Lemma 3.10. The set of invariants M(Giso)0 is an A-submodule of M .

Proof. We know M^{(G_iso)_0} is a subgroup of M, so we only need to show that for all a ∈ A and m ∈ M^{(G_iso)_0} we have a · m ∈ M^{(G_iso)_0}, i.e., τ(a · m) = a · m for all τ ∈ (G_iso)_0. By permuting the basis vectors, the group (G_iso)_0 acts K-linearly on M. We show this action is even A-linear:

\[ \tau(z_j \cdot v) = \tau\Big( \sum \{ v' \in [q]^n \mid d(v, v') = j \} \Big) = \sum \{ \tau(v') \in [q]^n \mid d(v, v') = j \} = \sum \{ v' \in [q]^n \mid d(v, \tau^{-1}(v')) = j \} = \sum \{ v' \in [q]^n \mid d(\tau(v), v') = j \} = z_j \cdot \tau(v). \]

Here we get the third equality by substituting τ(v′) for v′, which only changes the order in which the terms of the sum appear, and the fourth equality follows from d(v, τ⁻¹(v′)) = d(τ(v), τ(τ⁻¹(v′))) = d(τ(v), v′) for τ ∈ (G_iso)_0. From the A-linearity it directly follows that τ(a · m) = a · τ(m) = a · m for all a ∈ A and m ∈ M^{(G_iso)_0}, hence a · m ∈ M^{(G_iso)_0}.

In order to further study the structure of M^{(G_iso)_0}, we define the map T : M → M^{(G_iso)_0} by

\[ T(m) = \sum_{\tau \in (G_{\mathrm{iso}})_0} \tau(m). \tag{3.7} \]

It follows from the fact that (G_iso)_0 acts A-linearly on M that T is an A-homomorphism.

We will use this map in Lemma 3.12, but first we determine the orbits of the action of (G_iso)_0 on [q]^n. For v ∈ [q]^n we have

\[ (G_{\mathrm{iso}})_0 v = \{ \tau(v) \mid \tau \in (G_{\mathrm{iso}})_0 \} = \{ v' \in [q]^n \mid wt(v) = wt(v') \}. \]

The inclusion '⊆' holds because wt(v) = d(v, 0) = d(τ(v), τ(0)) = d(τ(v), 0) = wt(τ(v)); conversely, any two words of the same weight can be mapped to each other by permuting positions and non-zero symbols. Hence the collection of all orbits is {{v ∈ [q]^n | wt(v) = j} | 0 ≤ j ≤ n}. We define m_j = ∑{v ∈ [q]^n | wt(v) = j} ∈ M for 0 ≤ j ≤ n.

Lemma 3.11. The set {mj | 0 ≤ j ≤ n} is a K-basis for M(Giso)0.

Proof. Again, the set {m_j | 0 ≤ j ≤ n} is linearly independent over K, by the same argumentation as in Lemma 3.8. Moreover, the m_j generate M^{(G_iso)_0}. Indeed, write m ∈ M^{(G_iso)_0} as m = ∑_{v∈[q]^n} k_v v for certain k_v ∈ K; this is possible because M is generated by [q]^n. For every τ ∈ (G_iso)_0 we have

\[ m = \tau(m) = \tau\Big( \sum_{v \in [q]^n} k_v v \Big) = \sum_{v \in [q]^n} k_v \tau(v), \]

where the first equality holds because m ∈ M^{(G_iso)_0}. Comparing coefficients gives k_v = k_{τ(v)} for all v and τ. Since the orbits of (G_iso)_0 are exactly the weight classes, k_v depends only on wt(v), say k_v = k_j whenever wt(v) = j, and hence m = ∑_{j=0}^{n} k_j m_j. It follows that {m_j | 0 ≤ j ≤ n} is a K-basis for M^{(G_iso)_0}.

Now that we have these bases for both M^{(G_iso)_0} and A (Lemma 3.11 and Lemma 3.9), we show that they are isomorphic as A-modules.

Lemma 3.12. We have A ∼= M(Giso)0 as A-modules.

Proof. Define an A-homomorphism ψ : A → M^{(G_iso)_0} by ψ(a) = a · 0. Then

\[ \psi(z_j) = z_j \cdot 0 = \sum \{ v \in [q]^n \mid d(0, v) = j \} = \sum \{ v \in [q]^n \mid wt(v) = j \} = m_j, \]

where the third equality holds because d(v, 0) = wt(v). We see that ψ maps the elements of a K-basis for A to the elements of a K-basis for M^{(G_iso)_0}, hence ψ is an isomorphism. Therefore A ≅ M^{(G_iso)_0} as A-modules.

Before we begin with the proof of the actual theorem, we have one more lemma to prove.

Lemma 3.13. Let A and K be as before. Then A ≅ ∏_{x=0}^{n} K as rings.

Proof. We show that there exists a ring isomorphism A → ∏_{x=0}^{n} K. To this end we first define for all I ⊆ N the ring homomorphism χ_I : R → K by

χ_I(k) = k for k ∈ K, χ_I(x_i) = 0 if i ∈ I, χ_I(x_i) = q if i ∉ I.

(This is well defined on R, since both 0 and q satisfy the relation X² = qX.) This map is surjective, hence there exists an isomorphism R/ker(χ_I) ≅ K [1], which implies that ker(χ_I) is a maximal ideal [7] of R for every I ⊆ N. (3.8)

These maximal ideals are mutually co-prime, i.e., ker(χ_I) + ker(χ_J) = R for I ≠ J. Indeed, if I ≠ J we can, after interchanging I and J if necessary, choose an i ∈ I with i ∉ J. Then χ_I((1/q)x_i) = (1/q)χ_I(x_i) = (1/q) · 0 = 0, hence (1/q)x_i ∈ ker(χ_I), and χ_J(1 − (1/q)x_i) = χ_J(1) − (1/q)χ_J(x_i) = 1 − 1 = 0, so 1 − (1/q)x_i ∈ ker(χ_J). Moreover,

\[ \tfrac{1}{q} x_i + \Big( 1 - \tfrac{1}{q} x_i \Big) = 1. \]

Now we define another ring homomorphism, from R to ∏_{I⊆N} K:

\[ \chi := \prod_{I \subseteq N} \chi_I : R \to \prod_{I \subseteq N} K. \]

We have

\[ R/\ker(\chi) = R \Big/ \ker\Big( \prod_{I \subseteq N} \chi_I \Big) = R \Big/ \bigcap_{I \subseteq N} \ker(\chi_I) \cong \prod_{I \subseteq N} R/\ker(\chi_I) \cong \prod_{I \subseteq N} K, \]

where the first isomorphism follows from the Chinese Remainder Theorem (Corollary 3.2) and the second isomorphism is true by (3.8). Hence we know by Corollary 3.2 that χ is surjective. Counting dimensions gives dim_K(R) = 2^n = dim_K(∏_{I⊆N} K), which implies that χ is a ring isomorphism. So R ≅ ∏_{I⊆N} K as rings.

We are interested in χ restricted to A; therefore we first consider χ_I|_A. For σ ∈ S_n, I ⊆ N and r ∈ R we have χ_{σ(I)}(σ(r)) = χ_I(r), since χ_{σ(I)}(x_{σ(i)}) = 0 if σ(i) ∈ σ(I), i.e. if i ∈ I, and χ_{σ(I)}(x_{σ(i)}) = q if σ(i) ∉ σ(I), i.e. if i ∉ I.

Since for every I, J with |I| = |J| there exists a σ ∈ S_n with I = σ(J), we have for the basis elements z_j of A that χ_I(z_j) = χ_{σ(J)}(σ(z_j)) = χ_J(z_j), where the first equality holds because z_j is invariant under σ. Hence χ_I|_A = χ_J|_A for all I, J ⊆ N with |I| = |J|. This implies

\[ \chi(A) \subseteq Z := \Big\{ (k_I)_{I \subseteq N} \in \prod_{I \subseteq N} K \;\Big|\; k_J = k_{J'} \text{ if } |J| = |J'| \Big\}. \]

Because dim_K(χ(A)) = n + 1 = dim_K(Z), the inclusion is an equality. Hence, putting I_x := {1, 2, . . . , x} and defining χ_x := χ_{I_x}|_A for 0 ≤ x ≤ n, the ring homomorphism

\[ \chi' := \prod_{x=0}^{n} \chi_x : A \to \prod_{x=0}^{n} K \]

is a K-linear ring isomorphism. So A ≅ ∏_{x=0}^{n} K as rings.

Now, by using the above lemmas and notation, we prove the strengthening of Lloyd's theorem.

Theorem 3.14 (Strengthening of Lloyd's theorem). Let q ≥ 2. If a perfect e-code of length n over [q] exists, then the polynomial

\[ L_e(x) = \sum_{i=0}^{e} (-1)^i \binom{n - x}{e - i} \binom{x - 1}{i} (q - 1)^{e - i} \]

has e distinct integral zeros in N.

Proof. Suppose there exists a perfect e-code C ⊆ [q]^n. Then by applying isometries we can construct e + 1 perfect e-codes C_0, C_1, . . . , C_e ⊆ [q]^n with i ∈ {wt(c) | c ∈ C_i} =: w(C_i). We may assume 0 ∈ C (Lemma 1.13), so 0 ∈ w(C). Denote C by C_0. From C_0 we obtain a new code C_1 by applying to the first position of every codeword a permutation of the symbols that does not fix 0. Then c_1 0 . . . 0 ∈ C_1 for some non-zero c_1 ∈ [q], so 1 ∈ w(C_1). The code C_2 is obtained by doing the same in the first two positions of every codeword of C_0, which gives a perfect e-code with 2 ∈ w(C_2), and so on, until we have constructed C_e by permuting the symbols in the first e positions of the codewords of C_0. Since the C_i are perfect e-codes, in which distinct codewords have distance at least 2e + 1, we even have w(C_i) ∩ {0, 1, . . . , e} = {i}.

Consider

\[ \{ T(\textstyle\sum C_i) \mid 0 \le i \le e \} \subseteq M^{(G_{\mathrm{iso}})_0}, \tag{3.9} \]

with T defined as in (3.7). We prove that this set is linearly independent over K. Let T(∑C_i) = ∑_{j=0}^{n} k_{ij} m_j be the expansion on the basis of M^{(G_iso)_0}, with k_{ij} ∈ K. Since w(C_i) ∩ {0, 1, . . . , e} = {i}, we have for 0 ≤ i ≤ e and 0 ≤ j ≤ e, by the definition of m_j, that k_{ij} ≠ 0 ⟺ i = j. Therefore the given set is linearly independent.

Now put s = ∑_{j=0}^{e} z_j ∈ A. Then

\[ s \cdot \sum C_i = \sum_{j=0}^{e} z_j \cdot \sum C_i = \sum_{j=0}^{e} \sum_{c \in C_i} (z_j \cdot c) = \sum_{c \in C_i} \sum_{j=0}^{e} \sum \{ v' \in [q]^n \mid d(c, v') = j \} = \sum [q]^n, \tag{3.10} \]

where the third equality holds by (3.6), since c ∈ C_i ⊆ [q]^n, and the last equality is true by the perfectness of C_i: every word in [q]^n lies within distance e of exactly one codeword of C_i. Applying the A-linear map T to both sides of equation (3.10), we get

\[ s \cdot T\Big( \sum C_i \Big) = T\Big( \sum [q]^n \Big). \]

This shows that for i ≠ j we have T(∑C_i) − T(∑C_j) ∈ {m ∈ M^{(G_iso)_0} | s · m = 0}, which implies dim_K{m ∈ M^{(G_iso)_0} | s · m = 0} ≥ e. Now because A ≅ M^{(G_iso)_0} (Lemma 3.12) we have

\[ \dim_K \{ a \in A \mid s \cdot a = 0 \} \ge e. \tag{3.11} \]

Since A ≅ ∏_{x=0}^{n} K as rings (Lemma 3.13), it follows that for the image k = (k_0, k_1, . . . , k_n) ∈ ∏_{x=0}^{n} K of s we have dim_K{k′ ∈ ∏_{x=0}^{n} K | k · k′ = 0} ≥ e. Also we have

\[ k \cdot k' = (k_0 k'_0, k_1 k'_1, \dots, k_n k'_n) = 0 \iff k_x k'_x = 0 \text{ for all } 0 \le x \le n, \]

which gives dim_K{k′ ∈ ∏_{x=0}^{n} K | k · k′ = 0} = |{x | 0 ≤ x ≤ n, k_x = 0}|. Now let k = χ′(s); then, since (χ′(s))_x = χ_x(s), we see

\[ |\{ x \mid 0 \le x \le n,\ \chi_x(s) = 0 \}| \ge e. \tag{3.12} \]

Now we want to prove that L_e(x) = χ_x(s); therefore we first compute χ_x(z_j):

\[ \chi_x(z_j) = \sum_{\substack{I \subseteq N \\ |I| = j}} \chi_{I_x}(y_I) = \sum_{\substack{I \subseteq N \\ |I| = j}} \chi_{I_x}\Big( \prod_{i \in I} (x_i - 1) \Big) = \sum_{\substack{I \subseteq N \\ |I| = j}} \prod_{i \in I \cap I_x} (0 - 1) \prod_{i \in I \setminus I_x} (q - 1) = \sum_{\substack{I \subseteq N \\ |I| = j}} (-1)^{|I \cap I_x|} (q - 1)^{|I \setminus I_x|} = \sum_{i=0}^{j} \binom{x}{i} \binom{n - x}{j - i} (-1)^i (q - 1)^{j - i}. \tag{3.13} \]

The last equality is obtained as follows. Let |I| = j and |I ∩ I_x| = i; then |I \ I_x| = |I| − |I ∩ I_x| = j − i. There are \(\binom{x}{i}\) options to pick I ∩ I_x ⊆ I_x and \(\binom{n-x}{j-i}\) possibilities for I \ I_x ⊆ N \ I_x. Together, for every i ≤ j there are exactly \(\binom{x}{i}\binom{n-x}{j-i}\) options to take I ⊆ N with |I| = j and |I ∩ I_x| = i. This gives the final summation.

Hence we get the following:

\[ \chi_x(s) = \chi_x\Big( \sum_{j=0}^{e} z_j \Big) = \sum_{j=0}^{e} \chi_x(z_j) = \sum_{j=0}^{e} \sum_{i=0}^{j} \binom{x}{i} \binom{n - x}{j - i} (-1)^i (q - 1)^{j - i} \tag{3.14} \]

\[ = \sum_{i=0}^{e} (-1)^i \binom{n - x}{e - i} \binom{x - 1}{i} (q - 1)^{e - i} = L_e(x). \]

The third equality holds by (3.13). The fourth follows from rewriting (3.14), by grouping on the power of q − 1, as

\[ \sum_{i=0}^{e} \sum_{j=0}^{i} (-1)^j \binom{x}{j} \binom{n - x}{e - i} (q - 1)^{e - i}, \]

and using the identity

\[ (-1)^i \binom{x - 1}{i} = \sum_{j=0}^{i} (-1)^j \binom{x}{j}. \]


For χ_x(s) we know by (3.12) that it has at least e distinct zeros in the set of integers {0, 1, . . . , n}, which implies that the same is true for L_e(x). Since the degree of L_e(x) is e, there are exactly e of them. For x = 0 we have

\[ L_e(0) = \sum_{i=0}^{e} \binom{n}{e - i} (q - 1)^{e - i}, \]

which is not zero, since \(\binom{n}{e-i}\) > 0 because i ≤ e ≤ n, and also q − 1 ≠ 0 since q ≥ 2. Hence we conclude that L_e(x) has e distinct zeros in N.

Let us give a few examples of the use of Lloyd's theorem. As promised in the beginning of this chapter, we give an example of parameters for which one might expect that a perfect code exists, but where Lloyd's theorem shows differently. This example comes from MacWilliams and Sloane [12].

Example 13. Consider the parameters n = 90, e = 2 and q = 2. By the divisibility condition which followed from the definition of perfect codes, one might expect that a perfect code with these parameters exists: since

\[ |B(c, e)| = \sum_{i=0}^{e} \binom{n}{i} (q - 1)^i = \binom{90}{2} + 90 + 1 = 4096 = 2^{12}, \]

which divides 2^90, this condition is satisfied. Now consider the Lloyd polynomial with these parameters:

\[ L_e(x) = \sum_{i=0}^{2} (-1)^i \binom{90 - x}{2 - i} \binom{x - 1}{i} (2 - 1)^{2 - i} = 2(x^2 - 91x + 2048). \]

The zeros of L_e(x) are x = (91 ± √89)/2, which are both not integers. By Lloyd's theorem it follows that a binary perfect 2-code of length 90 cannot exist.
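The computation in Example 13 is easy to verify mechanically. The sketch below (ours) checks that the expanded form 2(x² − 91x + 2048) agrees with the defining sum at all integers 1, . . . , 90, and that none of them is a zero:

```python
from math import comb

def lloyd(e, x, n, q):
    """L_e(x) as in Theorem 3.14, evaluated at an integer x >= 1."""
    return sum((-1) ** i * comb(n - x, e - i) * comb(x - 1, i) * (q - 1) ** (e - i)
               for i in range(e + 1))

expanded = lambda x: 2 * (x * x - 91 * x + 2048)

print(all(lloyd(2, x, 90, 2) == expanded(x) for x in range(1, 91)))   # -> True
print([x for x in range(1, 91) if lloyd(2, x, 90, 2) == 0])           # -> []
# Indeed the discriminant 91^2 - 4*2048 = 89 is not a perfect square,
# so the zeros (91 +/- sqrt(89)) / 2 are irrational.
```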

The following corollary of Lloyd's theorem was proven by Van Lint [9]. We only briefly give the idea of the proof.

Corollary 3.15. For q ≥ 3 and n > 3, the Golay code is the only nontrivial perfect 3-code.

Sketch of proof. In Van Lint's article it is shown that for q ≥ 3, n > 3 and e = 3, the sign of the Lloyd polynomial changes between two consecutive integers. This implies that at least one zero of the Lloyd polynomial is not an integer. Hence, by the strengthening of Lloyd's theorem, no other nontrivial perfect 3-codes with q ≥ 3 and n > 3 exist.

It is still unknown whether there exist perfect 1- and 2-codes over an alphabet [q] if q is not a prime power [10]. Van Lint and Tietäväinen have shown that if q is a prime power, then a nontrivial perfect code over F_q has the same parameters as either one of the Hamming codes or one of the Golay codes.

Conclusion

In this thesis we began by giving a short introduction to coding theory. From there we defined perfect codes and linear codes, and we gave examples of both, such as the Hamming code. Finally, we proved the strengthening of Lloyd's theorem as Lenstra did in the article "Two theorems on perfect codes".

Originally the plan was also to study cyclic codes; unfortunately, time did not allow us to do so. Besides, it would have been useful to explore more examples and applications of coding theory.

A lot of further academic research can be done. For example, for many parameters q, n and d, the maximum code size A_q(n, d) is still unknown. Closer to the topic of this thesis, and also mentioned in the last chapter, it is not yet known whether there exist perfect 1- or 2-codes over an alphabet [q] if q is not a prime power.


Popular summary in Dutch

Codes are everywhere. Look around and you will probably see something in which codes are used. Think of your phone, your laptop, or even your books (have you ever wondered what that ISBN number is?). In this digital era, coding theory is therefore enormously important.

Coding theory is about the transmission of information. Think, for example, of sending an e-mail, listening to a CD, or storing information on a computer. To do so, the information is first encoded, for example into zeros and ones, then it is transmitted, and finally it is decoded again. You can picture this as follows.

Source → Encoder → Channel → Decoder → Receiver

Let us give a simple example to get an idea of this. We make a code consisting of zeros and ones. The information we want to send is YES or NO, and it is encoded as YES = 11111 and NO = 00000. In other words, the code is {11111, 00000}. We call 11111 and 00000 the codewords. This code has length 5, because the codewords consist of five digits. Of course, many more five-digit combinations of 0 and 1 can be made, such as 10110 or 00010. All these other possibilities we call words. Now suppose we send YES, but something goes wrong during the transmission, so that some zeros turn into ones. Instead of the codeword 11111 we receive, for example, the word 10110. This word is not in our code, so it is clear that an error has occurred. But the decoder decodes words to the 'nearest' codeword, that is, the codeword that differs from the received word in the fewest positions. The distance between 10110 and 11111 is 2, because they differ in the second and fifth position. The distance between 10110 and 00000 is 3, because they differ in positions one, three and four. So 10110 is decoded to 11111, which means YES. Thus, even though an error occurred during the transmission, we still received the right message. A code in which errors that occur can be corrected is called an error-correcting code. We also always note how many errors can be corrected, so we call the code above a 2-error-correcting code.

There are also codes that are not error-correcting. Suppose, for example, that {0000, 1111} is the code and we send 0000. If two errors now occur, we receive for instance 1100 instead of the intended 0000. This cannot be decoded properly: after all, 1100 has distance 2 to 0000, but 1100 also has distance 2 to 1111, so neither of the two codewords is 'the nearest'.

For some codes it is clear, for every word, to which codeword it should be decoded; such codes are called perfect. The code in our YES/NO example is such a perfect code. Check for yourself: whether you look at the word 10110, 00010 or 11100, for each word you can determine whether it lies closest to 11111 or to 00000. All possible words of length five consisting of 0 and 1 can be decoded properly.

The examples we gave here are codes consisting of zeros and ones, but you can also make codes in which you may choose from more digits. Think for example of a code such as {10200, 11034, 22244}, in which the codewords consist of the digits 0, 1, 2, 3 or 4.

We have explained various notions from coding theory by means of examples. From now on we denote the length of a code by the letter l, the number of digits we choose from by the letter c, and the number of errors that can be corrected we call, as before, n. For l, c and n we can thus fill in all sorts of numbers, and in this way we get different kinds of codes. For example, for l = 4, c = 2 and n = 1 we are talking about a code of length 4, for which we can choose from 2 digits (the digits 0 and 1) and for which the number of errors that can be corrected is 1, such as the code {0000, 1111}.

The goal of this thesis was to prove Lloyd's theorem. Before we can state it, we need one more new notion. For every combination of numbers l, c and n there exists the so-called Lloyd polynomial. This is a function L(x) that looks different for different l, c and n. For example, for l = 4, c = 2 and n = 1 we get L(x) = 2x² − 10x + 1, but the Lloyd polynomial for l = 4, c = 3 and n = 2 is L(x) = 4x² − 28x + 34.

Lloyd's theorem gives an important property of perfect codes, for a perfect code does not exist for every l, c and n. The theorem says: if a perfect code exists for certain l, c and n, then the corresponding L(x) has solutions of the equation L(x) = 0 only for whole numbers x between 1 and l. For the first example of the previous paragraph this means that L(x) = 0 only for x = 1, x = 2, x = 3 or x = 4. Conversely, the theorem gives: if L(x) = 0 for an x that is not a whole number, then no perfect code exists for those l, c and n.

Let us test whether a perfect code with l = 4, c = 2 and n = 1 exists. We compute the zeros of 2x² − 10x + 1 with the quadratic formula, and find that the function is 0 for x = 5/2 ± √23/2. These are certainly not whole numbers, and so we now know by Lloyd's theorem that there is no perfect 1-error-correcting code of length 4 consisting of the digits 0 and 1.

Besides giving the proof of Lloyd's theorem, in this thesis we discussed some basic notions of coding theory. We saw several examples of codes, in particular examples of perfect codes. Finally, we treated some corollaries of Lloyd's theorem.


Bibliography

[1] Atiyah, M. (1994). Introduction to Commutative Algebra. Westview Press.

[2] Bomze, I. M., Budinich, M., Pardalos, P. M., & Pelillo, M. (1999). The maximum clique problem. In Handbook of Combinatorial Optimization (pp. 1-74). Springer US.

[3] Hill, R. (1986). A First Course in Coding Theory. Oxford University Press.

[4] Hodgart, M. S., & Tiggeler, H. A. B. (2000). A (16, 8) error correcting code (t = 2) for critical memory applications. European Space Agency Publications ESA SP, 457, 659-664.

[5] Igodt, P., & Veys, W. (2015). Lineaire algebra. Universitaire Pers Leuven.

[6] Ling, S., & Xing, C. (2004). Coding Theory: A First Course. Cambridge University Press.

[7] Lang, S. (2002). Algebra (revised third edition). Graduate Texts in Mathematics 211. Springer.

[8] Lenstra, H. W. (1972). Two theorems on perfect codes. Discrete Mathematics, 3(1-3), 125-132.

[9] Van Lint, J. H. (1970). On the nonexistence of perfect 2- and 3-Hamming-error-correcting codes over GF(q). Information and Control, 16(4), 396-401.

[10] Van Lint, J. H. (1975). A survey of perfect codes. Rocky Mountain Journal of Mathematics, 5(2).

[11] Van Lint, J. H. (2012). Introduction to Coding Theory (Vol. 86). Springer Science & Business Media.

[12] MacWilliams, F. J., & Sloane, N. J. A. (1977). The Theory of Error-Correcting Codes. Elsevier.

[13] Shannon, C. E. (2001). A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communications Review, 5(1), 3-55.
