An Application of Fourier analysis on Boolean functions in Theoretical Computer Science


Ismani Nieuweboer

Bachelor Thesis

Double bachelor’s programme Mathematics & Computer Science

Supervisors: dr. Guus Regts & dr. Jop Briët

Korteweg-de Vries Institute for Mathematics Faculty of Science


Abstract

After laying a foundation of Fourier analysis on Boolean functions (functions whose domain is the n-fold Cartesian product of the integers modulo 2), the Level-k inequalities and Chang’s lemma are proven. Subsequently, Chang’s lemma is used together with additive combinatorics to prove the quasi-polynomial Freiman-Ruzsa theorem, originally proven by Sanders. Using this, an overview is given of linearity testing for Boolean functions and for functions from and to the Boolean/Hamming cube, together with the complexity of these tests.

Title: An Application of Fourier analysis on Boolean functions in Theoretical Computer Science
Author: Ismani Nieuweboer, ismani94@protonmail.ch; student number 10502815

Supervisors: dr. Guus Regts (Korteweg-de Vries Instituut), dr. Jop Briët (Centrum Wiskunde & Informatica)

Second graders: prof. dr. Tom Koornwinder (Korteweg-de Vries Instituut), prof. dr. Ronald de Wolf (Centrum Wiskunde & Informatica)
End date: July 8th, 2016

Korteweg-de Vries Institute for Mathematics (KdVI) University of Amsterdam

Science Park 105, 1098 XH Amsterdam


Acknowledgments

First off I want to thank my supervisors, Jop and Guus, for their time, patience, and guidance. Not to forget their suggestion for the topic of this thesis which I found very interesting due to the immense diversity of it. Secondly, I want to show my appreciation to all my friends for their moral support, given that a few times I needed it more than I thought. Last but not least many thanks to my parents back in Suriname, for allowing me to get to this point in life and for supporting me in the choices I make today. Ismani Nieuweboer


Introduction

Theoretical computer science leans broadly on the fundamentals laid down by mathematics. One can argue a lot about where the boundary lies between the two. The theory described in this thesis is pioneered by computer scientists and mathematicians alike.

As a particular example, Boolean functions are used to describe properties of n-bit strings. The Boolean/Hamming cube, $\mathbb{F}_2^n$ and $\{\pm 1\}^n$ are equivalent ways to denote these strings. Fourier analysis is used to analyze the properties of Boolean functions, with applications throughout (theoretical) computer science. Some examples of these applications are property testing, extremal/additive combinatorics, random graph theory, social choice theory, cryptography, circuit complexity, learning theory and pseudorandomness [6].

The goal of this thesis is to describe linearity tests and their complexity. Given a function that is either linear or far from linear, i.e. not linear on many pairs of points, the linearity test determines which of the two cases applies. Two classes of functions are discussed: Boolean functions ($\mathbb{F}_2^n \to \mathbb{F}_2$) and functions over Boolean space ($\mathbb{F}_2^n \to \mathbb{F}_2^n$), up to isomorphism of $\mathbb{F}_2$ with other groups. Over $\mathbb{F}_2$, a function $f$ is linear precisely when $f(x) + f(y) = f(x + y)$ for all $x, y \in \mathbb{F}_2^n$.

For testing Boolean functions, only standard results from Fourier analysis are needed. For functions from and to the Hamming cube, however, much more machinery is needed. Chang’s lemma, a consequence of hypercontractivity, is derived. This result gives a logarithmic bound on the dimension of the space spanned by elements with large Fourier coefficients. Freiman and Ruzsa proved an exponential bound and conjectured a polynomial bound on the size of the span of sets A with small doubling. Recently Sanders proved that for a subset within A with a quasi-polynomial lower bound on its size, the size of the span has a polynomial upper bound. This theorem, called the quasi-polynomial Freiman-Ruzsa theorem, is proven using techniques from additive combinatorics and Fourier analysis. This result has been a major breakthrough in theoretical computer science.

Finally, we describe an explicit application of this to the area of property testing in the form of an algorithm for testing linearity of Boolean functions, and of functions over Boolean space. The complexity of the algorithm describes the number of points on which the function agrees with a certain linear map. In the case of $\mathbb{F}_2^n \to \mathbb{F}_2^n$ this complexity is directly related to the parameters that appear in the quasi-polynomial Freiman-Ruzsa theorem.

Chapters two and three can be considered mathematics. In chapter three the result discussed has many other applications in theoretical computer science [4] than the one discussed in this thesis. Therefore chapter three can also be considered to belong to the field of theoretical computer science. The goal of giving proofs of correctness for an algorithm concludes chapter four, which makes that chapter theoretical computer science.


1 Preliminaries

1.1 Measure theory

Let (S, Σ, µ) be a measure space. For p ∈ R≥1 consider the Lp-space over R.

Definition 1.1.1. [Inner product over L2] Let $f, g \in L^2$. The inner product between $f$ and $g$ is defined as
\[ \langle f \mid g \rangle := \int_S f g \, d\mu. \]

Definition 1.1.2. [p-norm] Let $(S, \Sigma, \mu)$ be a measure space. Then the p-norm of $f$ is defined as
\[ \|f\|_p := \Big( \int_S |f|^p \, d\mu \Big)^{1/p}. \]

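Remark. If $\mu$ is a probability measure, we have $\|f\|_p = \mathbf{E}[|f|^p]^{1/p} = \mathbf{E}_x[|f(x)|^p]^{1/p}$ for $x$ a random variable on $S$ distributed according to $\mu$ (uniformly chosen when $\mu$ is the uniform measure).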

Theorem 1.1.3. [Hölder’s inequality (Theorem 12.2 in [10])] For $f, g$ measurable on $S$, and $p, q \in [1, \infty]$ Hölder conjugates (a Hölder pair), i.e. $\frac{1}{p} + \frac{1}{q} = 1 \iff p - 1 = \frac{1}{q - 1}$, it holds that
\[ \|fg\|_1 \le \|f\|_p \|g\|_q. \]
In particular, if $f, g \in L^2$ we have
\[ |\langle f \mid g \rangle| \le \|f\|_p \|g\|_q \]
(using the triangle inequality for integration).

Additionally, when $p, q \in (1, \infty)$, $f \in L^p$ and $g \in L^q$, the inequality is sharp if and only if $|f|^p$ and $|g|^q$ are linearly dependent in $L^1$.

Remark. A calculation shows that the second inequality can be made sharp by the choice $g(x) := f(x)|f(x)|^{\frac{p}{q} - 1}$. This in particular implies that
\[ \sup_{\|g\|_q = 1} \langle f \mid g \rangle = \|f\|_p, \]
since the choice of $g$ can always be normalized.

Theorem 1.1.4. [Jensen’s inequality (Theorem 12.14 in [10])] Suppose x is a random variable and ϕ a convex function. Then ϕ(E[x]) ≤ E[ϕ(x)].

Theorem 1.1.5. [Monotonicity of the p-norm (in probability spaces)] Suppose $\mu(S) = 1$, i.e. $(S, \Sigma, \mu)$ is a probability space. Then $\|\cdot\|_p$ is monotonically non-decreasing in $p$.

Proof. Let $f \in L^p(S)$ and let $q, q' \ge 1$ be Hölder conjugates. Then
\[ \|f\|_p^p = \int |f|^p \, d\mu = \int |f|^p \cdot 1 \, d\mu = \langle |f|^p \mid 1 \rangle \le \big\| |f|^p \big\|_q \|1\|_{q'} = \Big( \int |f|^{pq} \, d\mu \Big)^{1/q} \cdot \mu(S)^{1/q'} = \big( \|f\|_{pq}^{pq} \big)^{1/q} \cdot 1 = \|f\|_{pq}^p. \]
Hence $\|f\|_p \le \|f\|_{pq}$. Since every exponent $r \ge p$ can be written as $r = pq$ with $q = r/p \ge 1$, the claim follows.

Theorem 1.1.6. [Markov’s inequality (Proposition 10.12 in [10])] Suppose $x \ge 0$ is a nonnegative random variable and let $a > 0$. Then
\[ \mathbf{P}[x \ge a] \le \frac{\mathbf{E}[x]}{a}. \]

Theorem 1.1.7. [Marcinkiewicz-Zygmund inequality (Theorem 10.3.2 in [3])] Let $p \ge 1$. There exists a constant $C > 0$ (only depending on $p$) such that for every sequence $x_1, \dots, x_l$ of independent, zero-mean random variables, each with finite $p$-th moment (i.e. $\mathbf{E}[|x_i|^p] < \infty$), it holds that
\[ \mathbf{E}\Big| \sum_{i=1}^{l} x_i \Big|^p \le (Cp)^{p/2} \, \mathbf{E}\Big[ \Big( \sum_{i=1}^{l} |x_i|^2 \Big)^{p/2} \Big]. \]

Corollary 1.1.8. Let $p \ge 1$. There exists a constant $C > 0$ (only depending on $p$) such that for every sequence $x_1, \dots, x_l$ of independent, zero-mean random variables such that $|x_i| \le 1$ for each $i \in [l]$, it holds that
\[ \mathbf{E}\Big| \frac{1}{l} \sum_{i=1}^{l} x_i \Big|^p \le \Big( \frac{Cp}{l} \Big)^{p/2}. \]

Proof. Note that bounded random variables also have finite p-th moment. Hence the Marcinkiewicz-Zygmund inequality can be applied to give

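\[ \mathbf{E}\Big| \frac{1}{l} \sum_{i=1}^{l} x_i \Big|^p = \frac{1}{l^p} \mathbf{E}\Big| \sum_{i=1}^{l} x_i \Big|^p \le \frac{1}{l^p} (Cp)^{p/2} \, \mathbf{E}\Big[ \Big( \sum_{i=1}^{l} |x_i|^2 \Big)^{p/2} \Big] \le \frac{1}{l^p} (Cp)^{p/2} \Big( \sum_{i=1}^{l} 1 \Big)^{p/2} = \Big( \frac{Cp}{l} \Big)^{p/2}. \]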

2 Fourier analysis on Boolean functions

This chapter closely follows the book Analysis of Boolean Functions by Ryan O’Donnell [6]. Specific references will be given at sections and theorems.

2.1 Basic theory

This section follows parts of chapter 1 of O’Donnell [6].

We will consider bit strings of a fixed length $n \in \mathbb{Z}_{>0}$. One can apply a bit-wise XOR operation to two of these bit strings. The strings together with this operation then give rise to an Abelian group.

Mathematically one can describe this group in multiple ways. The first way is as $\mathbb{F}_2^n$ (here $\mathbb{F}_2 = \{0, 1\}$) with addition modulo 2, and a second way is as $\{\pm 1\}^n$ with multiplication. A third way is by subsets of $[n] := \{1, 2, \dots, n\}$ with the symmetric difference between sets as operation. The third way is needed to describe the space on which the Fourier transforms of functions take their arguments, which will be done later in this section.

We identify $\mathbb{F}_2$ and $\{\pm 1\}$ with each other using the group isomorphism $\mathbb{F}_2 \to \{\pm 1\} : x \mapsto (-1)^x$, which extends coordinate-wise to $n$ dimensions. Furthermore $\mathbb{F}_2^n$ is isomorphic to $2^{[n]}$, with the isomorphism given by $x \mapsto S := \{i : x_i = 1\}$, where we see that $S$ contains $i$ if and only if $x$ has a 1 at position $i$. In short, we have
\[ (\mathbb{F}_2^n, +) \cong (\{\pm 1\}^n, \cdot) \cong (2^{[n]}, \Delta). \]
It is important to note that addition and subtraction are the same in $\mathbb{F}_2$, as are multiplication and division in $\{\pm 1\}$, however trivial this might be.
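The three descriptions of the group can be checked against each other by brute force. The following is a minimal sketch for n = 3; the helper names `to_pm`, `to_set`, `xor` and `mul` are illustrative choices and not notation from the thesis.

```python
from itertools import product

n = 3
cube = list(product((0, 1), repeat=n))  # all elements of F_2^n

def to_pm(x):
    """Map a bit vector in F_2^n to {+1, -1}^n via x_i -> (-1)^(x_i)."""
    return tuple((-1) ** xi for xi in x)

def to_set(x):
    """Map a bit vector to the subset {i : x_i = 1} of [n] (1-indexed)."""
    return frozenset(i + 1 for i, xi in enumerate(x) if xi == 1)

def xor(x, y):
    """Group operation on F_2^n: coordinate-wise addition modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def mul(u, v):
    """Group operation on {+1, -1}^n: coordinate-wise multiplication."""
    return tuple(a * b for a, b in zip(u, v))

def maps_are_homomorphisms():
    """Check that both maps turn XOR into the target group operation."""
    return all(
        to_pm(xor(x, y)) == mul(to_pm(x), to_pm(y))
        and to_set(xor(x, y)) == to_set(x) ^ to_set(y)  # ^ is symmetric difference
        for x in cube
        for y in cube
    )
```

Both maps are also injective on the cube, so together with the homomorphism check they are isomorphisms for this small n.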

We now give the definition of the type of functions that are at the center of this thesis.

Definition 2.1.1. [Boolean function] A function $f$ is said to be an $\mathbb{R}$-valued Boolean function if it is specified as
\[ f : \{\pm 1\}^n \to \mathbb{R}. \]
We may also use terminology like $\{\pm 1\}$-valued Boolean functions for $f : \{\pm 1\}^n \to \{\pm 1\}$. The domain of the function can be switched with $\mathbb{F}_2^n$.

The space $\{f : \{\pm 1\}^n \to \mathbb{R}\}$ of such functions forms a vector space under pointwise addition and scalar multiplication.

In chapter 3 (on additive combinatorics) and section 4.2 we will switch from {±1} to F2.

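Two simple examples of Boolean functions are
\[ \mathrm{min}_2(x_1, x_2) = -\tfrac{1}{2} + \tfrac{1}{2} x_1 + \tfrac{1}{2} x_2 + \tfrac{1}{2} x_1 x_2 \]
and
\[ \mathrm{Maj}_3(x_1, x_2, x_3) = \tfrac{1}{2} x_1 + \tfrac{1}{2} x_2 + \tfrac{1}{2} x_3 - \tfrac{1}{2} x_1 x_2 x_3. \]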

Let $f, g$ be Boolean functions. In Definition 1.1.1, take $\mu$ the uniform measure over $\{\pm 1\}^n$. We then have that the inner product between $f$ and $g$ is equal to
\[ \langle f \mid g \rangle = \frac{1}{2^n} \sum_{x \in \{\pm 1\}^n} f(x) g(x). \]
The stochastic interpretation of this is as follows. Let $x \sim A$ denote that the random variable $x$ is chosen uniformly from a (finite) set $A$. Then $\langle f \mid g \rangle = \mathbf{E}_{x \sim \{\pm 1\}^n}[f(x) g(x)]$.

Expected values and probabilities written without a random variable are implicitly taken over a uniformly chosen one, i.e. $\mathbf{E}[f] = \mathbf{E}_{x \sim \{\pm 1\}^n}[f(x)]$.

We now define a type of Boolean functions, characters, which give an orthonormal basis for the space of Boolean functions. Furthermore two equivalent ways of describing characters are given.

Definition 2.1.2. [Character] Define the character $\chi_S : \{\pm 1\}^n \to \{\pm 1\}$ for $S \subseteq [n]$ by
\[ \chi_S : x \mapsto \prod_{i \in S} x_i. \]
Now $(\chi_S)_{S \subseteq [n]}$ is an orthonormal basis for the space of functions. We have $\langle \chi_S \mid \chi_T \rangle = \delta_{S,T}$, where $\delta_{S,T}$ is the Kronecker delta, which is equal to 1 if $S$ equals $T$ and equal to 0 otherwise.

This can be proven in two ways:

• Using the identification $(\{\pm 1\}^n, \cdot) \cong (2^{[n]}, \Delta)$ and proving it directly;

• Using representation theory, since there are exactly $2^n$ of these characters and $\{\pm 1\}^n$ is an Abelian group (see for example [11]).

When writing Boolean functions as $f : \mathbb{F}_2^n \to \mathbb{R}$, one can use
\[ \chi_S : x \mapsto \prod_{i \in S} (-1)^{x_i} \]
as an orthonormal basis of characters instead. This follows from the isomorphism between $\{\pm 1\}$ and $\mathbb{F}_2$ given earlier.

We now give a third way to describe characters, where we replace the index from $2^{[n]}$ with an index from $\mathbb{F}_2^n$. For given $S \subseteq [n]$, let $z \in \mathbb{F}_2^n$ be such that $z_i = 1$ if and only if $i \in S$. Using the characters of Boolean functions over $\mathbb{F}_2^n$, we then have
\[ \prod_{i \in S} (-1)^{x_i} = (-1)^{\sum_{i \in S} x_i} = (-1)^{\sum_{i=1}^{n} x_i z_i} \quad \text{for } x \in \mathbb{F}_2^n. \]
For $z \in \mathbb{F}_2^n$, another equivalent way to describe the basis of characters is therefore $\chi_z : \mathbb{F}_2^n \to \{\pm 1\}$ with
\[ \chi_z : x \mapsto (-1)^{x \cdot z} = (-1)^{\sum_{i=1}^{n} x_i z_i}. \]

Using the first description of the basis of characters, a Boolean function $f$ can be written as
\[ f = \sum_{S \subseteq [n]} \langle f \mid \chi_S \rangle \chi_S. \]
The coefficients $\langle f \mid \chi_S \rangle$ are called Fourier coefficients. Following directly from this is the Fourier transform:

Definition 2.1.3. [Fourier transform] The Fourier transform $\hat{f}$ of a Boolean function $f$ is defined by $\hat{f}(S) := \langle f \mid \chi_S \rangle$ for $S \subseteq [n]$. In particular, $\hat{f} : 2^{[n]} \to \mathbb{R}$.

The decomposition using the orthonormal basis of characters can then be written using these Fourier coefficients as
\[ f = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S. \]
We call this the Fourier expansion of $f$. Note that $\mathrm{min}_2$ and $\mathrm{Maj}_3$ have already been written out in their Fourier expansions.
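For small n the Fourier coefficients can be computed directly from the definition $\hat{f}(S) = \langle f \mid \chi_S \rangle$ by summing over the whole cube. The sketch below does this by brute force and recovers the expansions of min2 and Maj3 stated earlier; the function names `subsets`, `chi` and `fourier_coefficients` are illustrative and not taken from the thesis.

```python
from itertools import combinations, product
from math import prod

def subsets(n):
    """All subsets S of [n] = {1, ..., n}, as frozensets."""
    return [frozenset(c) for k in range(n + 1)
            for c in combinations(range(1, n + 1), k)]

def chi(S, x):
    """Character chi_S(x) = product of x_i over i in S (empty product is 1)."""
    return prod(x[i - 1] for i in S)

def fourier_coefficients(f, n):
    """Return {S: f^(S)} with f^(S) = 2^(-n) * sum_x f(x) chi_S(x)."""
    cube = list(product((1, -1), repeat=n))
    return {S: sum(f(x) * chi(S, x) for x in cube) / 2 ** n for S in subsets(n)}

def min2(x):
    return min(x)

def maj3(x):
    return 1 if sum(x) > 0 else -1

min2_hat = fourier_coefficients(min2, 2)
maj3_hat = fourier_coefficients(maj3, 3)
```

The cost is exponential in n (4^n character evaluations), so this is only meant for checking small examples by hand.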

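It is easy to see that taking the Fourier transform is a linear map. Let $\alpha, \beta \in \mathbb{R}$, $S \subseteq [n]$, and let $f, g$ be Boolean functions. Then one has
\[ \widehat{(\alpha f + \beta g)}(S) = \langle \alpha f + \beta g \mid \chi_S \rangle = \alpha \langle f \mid \chi_S \rangle + \beta \langle g \mid \chi_S \rangle = \alpha \hat{f}(S) + \beta \hat{g}(S). \]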

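Proposition 2.1.4. [Plancherel’s identity] Let $\langle \cdot \mid \cdot \rangle_{L^2}$ be the inner product on the space of Boolean functions corresponding to the uniform measure on $\{\pm 1\}^n$, as stated earlier. Let $\langle \cdot \mid \cdot \rangle_{\ell^2}$ be the inner product for the function space on $2^{[n]}$ corresponding to the counting measure on $2^{[n]}$. Then, for Boolean functions $f, g$,
\[ \langle f \mid g \rangle_{L^2} = \langle \hat{f} \mid \hat{g} \rangle_{\ell^2}. \]

Proof. Taking the Fourier expansion of both functions and then using linearity and orthonormality gives
\begin{align*}
\langle f \mid g \rangle_{L^2}
&= \Big\langle \sum_{S \subseteq [n]} \hat{f}(S) \chi_S \;\Big|\; \sum_{T \subseteq [n]} \hat{g}(T) \chi_T \Big\rangle_{L^2}
= \sum_{S, T \subseteq [n]} \hat{f}(S) \hat{g}(T) \langle \chi_S \mid \chi_T \rangle_{L^2} \\
&= \sum_{S, T \subseteq [n]} \hat{f}(S) \hat{g}(T) \delta_{S,T}
= \sum_{S \subseteq [n]} \hat{f}(S) \hat{g}(S)
= \langle \hat{f} \mid \hat{g} \rangle_{\ell^2},
\end{align*}
which proves Plancherel’s identity.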
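Plancherel’s identity can be sanity-checked numerically on random real-valued functions. The following is a minimal sketch for n = 4, assuming functions are stored as dictionaries from points of the cube to values; the names `hat`, `lhs` and `rhs` are illustrative.

```python
import random
from itertools import combinations, product
from math import isclose, prod

n = 4
cube = list(product((1, -1), repeat=n))
# Subsets of {0, ..., n-1}, used as 0-based index sets for the characters.
subsets = [c for k in range(n + 1) for c in combinations(range(n), k)]

def hat(f):
    """Fourier transform of f (a dict x -> f(x)) under the uniform measure."""
    return {S: sum(f[x] * prod(x[i] for i in S) for x in cube) / len(cube)
            for S in subsets}

rng = random.Random(0)  # fixed seed so the check is reproducible
f = {x: rng.uniform(-1, 1) for x in cube}
g = {x: rng.uniform(-1, 1) for x in cube}

# <f | g> with respect to the uniform measure on the cube.
lhs = sum(f[x] * g[x] for x in cube) / len(cube)

# <f^ | g^> with respect to the counting measure on the subsets.
f_hat, g_hat = hat(f), hat(g)
rhs = sum(f_hat[S] * g_hat[S] for S in subsets)
```

The two sides agree up to floating-point rounding; note the different normalizations (a factor $2^{-n}$ on the function side, none on the Fourier side).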

Remark (Parseval’s identity). With $f = g$ this is also called Parseval’s identity and the proposition states
\[ \|f\|_{L^2} = \|\hat{f}\|_{\ell^2}. \tag{2.1} \]
For inner products the $L^2$, $\ell^2$ subscripts will be omitted. For norms they will mostly be replaced with a subscript indicating the 2-norm, since the domain of the function implicitly specifies the measure used on the space.

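Definition 2.1.5. [Fourier weights] For $\circ \in \{=, <, >, \le, \ge\}$ and $k \in [n] \cup \{0\}$ define
\[ f^{\circ k} := \sum_{|S| \circ k} \hat{f}(S) \chi_S \quad \text{and} \quad W^{\circ k}[f] := \|f^{\circ k}\|_2^2 = \sum_{|S| \circ k} \hat{f}(S)^2, \]
where the last equality follows by Parseval’s identity.

Definition 2.1.6. [ε-spectrum of a function] For $\varepsilon \in (0, 1)$ define the ε-spectrum
\[ \mathrm{Spec}_\varepsilon[f] := \{S \subseteq [n] : |\hat{f}(S)| \ge \varepsilon\}. \]

Definition 2.1.7. [Density (Definition 1.20 in [6])] A function $\varphi : \{\pm 1\}^n \to \mathbb{R}_{\ge 0}$ is called a (probability) density if $\mathbf{E}[\varphi] = 1$.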

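Definition 2.1.8. [Random variable drawn from associated probability distribution] Let $\varphi$ be a density. One writes $\mathbf{x} \sim \varphi$ for a random variable $\mathbf{x}$ chosen from the probability distribution associated with $\varphi$, defined by
\[ \mathbf{P}_{\mathbf{x} \sim \varphi}[\mathbf{x} = x] = \frac{1}{2^n} \varphi(x) \]
for $x \in \{\pm 1\}^n$. Note that
\[ \sum_{x \in \{\pm 1\}^n} \mathbf{P}_{\mathbf{x} \sim \varphi}[\mathbf{x} = x] = \sum_{x \in \{\pm 1\}^n} \frac{1}{2^n} \varphi(x) = \mathbf{E}_{x \sim \{\pm 1\}^n}[\varphi(x)] = 1, \]
which means this probability distribution is well-defined.

Remark. When $A \subseteq \{\pm 1\}^n$, the probability density corresponding to uniformly choosing a random variable from $A$ is $\varphi_A := \frac{1_A}{\mathbf{E}[1_A]}$. This holds since
\[ \mathbf{P}_{\mathbf{x} \sim \varphi_A}[\mathbf{x} = x] = \frac{1}{2^n} \frac{2^n}{|A|} 1_A(x) = \frac{1}{|A|} 1_A(x) = \mathbf{P}_{\mathbf{x} \sim A}[\mathbf{x} = x], \]
which shows that $\mathbf{x} \sim \varphi_A$ if and only if $\mathbf{x} \sim A$.

The sets are often omitted for densities of singleton sets $\{x\}$ for $x \in \{\pm 1\}^n$; i.e. we write $\varphi_x$ instead of $\varphi_{\{x\}}$.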

Remark. Suppose that the random variable $\mathbf{x} \sim A$ is uniformly chosen from a set $A \subseteq \{\pm 1\}^n$. Let $B \subseteq \{\pm 1\}^n$ be another set. Then with
\[ \mathbf{P}_{\mathbf{x} \sim A}[\mathbf{x} \in B] = \mathbf{E}_{\mathbf{x} \sim A}[1_B(\mathbf{x})] = \mathbf{E}_{\mathbf{x} \sim \{\pm 1\}^n}[\varphi_A(\mathbf{x}) 1_B(\mathbf{x})] = \langle \varphi_A \mid 1_B \rangle \tag{2.2} \]
one sees that the probability of $\mathbf{x}$ being contained in $B$ can be written as the inner product of a density and an indicator.

Parseval’s identity gives a bound on the dimension of the space spanned by the ε-spectrum of density functions. Here, the dimension of a span of a subset of $2^{[n]}$ is to be interpreted as the dimension of the span of the corresponding set in $\{\pm 1\}^n$. The bound given here will be significantly improved at the end of this chapter (Chang’s lemma, Theorem 2.3.18).

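Proposition 2.1.9. Let $A \subseteq \{\pm 1\}^n$ have volume $\alpha = \mathbf{E}[1_A] = \frac{|A|}{2^n}$, and let $\varepsilon > 0$. Define $d := \dim(\mathrm{Sp}(\mathrm{Spec}_\varepsilon[\varphi_A]))$, using the ε-spectrum as defined in Definition 2.1.6. Then $d \le \varepsilon^{-2} \alpha^{-1}$.

Proof. Since $\varphi_A = 1_A / \alpha$, Parseval’s identity gives
\[ \alpha^{-1} = \frac{\mathbf{E}[1_A^2]}{\alpha^2} = \|\varphi_A\|_2^2 = \|\hat{\varphi}_A\|_2^2 = \sum_{S \subseteq [n]} |\hat{\varphi}_A(S)|^2 \ge \sum_{S \in \mathrm{Spec}_\varepsilon[\varphi_A]} |\hat{\varphi}_A(S)|^2 \ge \sum_{S \in \mathrm{Spec}_\varepsilon[\varphi_A]} \varepsilon^2 = |\mathrm{Spec}_\varepsilon[\varphi_A]| \, \varepsilon^2. \]
Consequently, $\dim \mathrm{Sp}(\mathrm{Spec}_\varepsilon[\varphi_A]) \le |\mathrm{Spec}_\varepsilon[\varphi_A]| \le \varepsilon^{-2} \alpha^{-1}$.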
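The bound of Proposition 2.1.9 can be illustrated on a concrete set. The sketch below works over $\mathbb{F}_2^n$ with characters $\chi_z(x) = (-1)^{x \cdot z}$ and takes for A the Hamming ball of radius 1 in $\mathbb{F}_2^4$ with ε = 1/2; both choices, and the helper names `spectrum` and `rank_gf2`, are assumptions made for this example only.

```python
from itertools import product

n = 4
cube = list(product((0, 1), repeat=n))  # F_2^n; characters are indexed by z in F_2^n

def spectrum(A, eps):
    """All z with |phi_A^(z)| >= eps, where phi_A = 1_A / E[1_A]."""
    spec = []
    for z in cube:
        # phi_A^(z) = E_x[phi_A(x) chi_z(x)] = (1/|A|) * sum_{x in A} (-1)^(x.z)
        coeff = sum((-1) ** sum(a * b for a, b in zip(x, z)) for x in A) / len(A)
        if abs(coeff) >= eps:
            spec.append(z)
    return spec

def rank_gf2(vectors):
    """Dimension of the span over F_2, by XOR-elimination on bitmasks."""
    basis = []
    for v in vectors:
        m = int("".join(map(str, v)), 2)
        for b in basis:
            m = min(m, m ^ b)  # clears the leading bit of b in m if it is set
        if m:
            basis.append(m)
    return len(basis)

A = [x for x in cube if sum(x) <= 1]  # Hamming ball of radius 1, |A| = 5
eps = 0.5
alpha = len(A) / 2 ** n
spec = spectrum(A, eps)
d = rank_gf2(spec)
bound = eps ** -2 / alpha  # epsilon^(-2) * alpha^(-1)
```

Here the spectrum consists of the vectors of Hamming weight 0, 1 and 4, so it has 6 elements and spans all of $\mathbb{F}_2^4$, while the bound evaluates to 12.8.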

Now the definition of an operation between Boolean functions will be given, which can also be defined for general groups. One can think of this operation as “smoothing” one of the functions using the other.

Definition 2.1.10. [Convolution of Boolean functions] Let $f, g$ be Boolean functions. The convolution of $f$ and $g$ is the Boolean function defined by
\[ (f * g)(x) := \mathbf{E}_{y \sim \{\pm 1\}^n}[f(y) g(xy)]. \]

Lemma 2.1.11. [(Exercise 1.25 in [6])] The convolution as defined in Definition 2.1.10 is commutative and associative.

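Note that for a Boolean function $f$
\[ (\varphi_x * f)(y) = \frac{1}{2^n} \sum_{z \in \{\pm 1\}^n} f(zy) \varphi_x(z) = \frac{1}{2^n} f(xy) \varphi_x(x) = f(xy). \]
Due to this, convolutions with densities on singletons are also called shifts. Related to this, we see that
\[ (\varphi_A * f)(x) = \mathbf{E}_{y \sim \{\pm 1\}^n}[\varphi_A(y) f(xy)] = \frac{1}{2^n} \sum_{y \in \{\pm 1\}^n} \frac{2^n}{|A|} 1_A(y) f(xy) = \frac{1}{|A|} \sum_{y \in A} f(xy) = \mathbf{E}_{y \sim A}[f(xy)]. \]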

A few useful properties about convolutions which also hold over a general Abelian group will now follow.

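Lemma 2.1.12. Let $f$ be a Boolean function, $x \in \{\pm 1\}^n$ and $p \ge 1$. Then $\|\varphi_x * f\|_p = \|f\|_p$.

Proof. Writing out gives
\[ \|\varphi_x * f\|_p^p = \mathbf{E}_y[|(\varphi_x * f)(y)|^p] = \mathbf{E}_y[|f(xy)|^p] = \mathbf{E}_z[|f(z)|^p] = \|f\|_p^p, \]
which proves the lemma.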

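Remark. For $A \subseteq \{\pm 1\}^n$ a subset and $x \in \{\pm 1\}^n$ it holds that $\varphi_A * \varphi_x = \varphi_{Ax}$. Indeed,
\[ (\varphi_x * \varphi_A)(y) = \varphi_A(xy) = \frac{2^n}{|A|} 1_A(xy) = \frac{2^n}{|Ax|} 1_{Ax}(y) = \varphi_{Ax}(y). \]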

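Lemma 2.1.13. [(Fact 1.26 in [6])] Suppose $\varphi, \psi$ are densities. Let $\mathbf{x} \sim \varphi$, $\mathbf{y} \sim \psi$ be independently distributed random variables. Define the random variable $\mathbf{z} := \mathbf{x}\mathbf{y}$. Then $\varphi * \psi$ is a density that represents the random variable $\mathbf{z}$.

Proof. Let $z \in \{\pm 1\}^n$. Then one sees that
\[ \mathbf{P}[\mathbf{z} = z] = \sum_{x} \mathbf{P}[\mathbf{x} = x] \, \mathbf{P}[\mathbf{y} = zx] = \sum_{x} \frac{\varphi(x)}{2^n} \frac{\psi(zx)}{2^n} = \frac{1}{2^n} (\varphi * \psi)(z), \]
with the last equality following by definition. This proves the lemma, given Definition 2.1.8.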

The following theorem is often stated in a more general form, for (Abelian) groups. Take the space of Boolean functions with the convolution on one side, and the space of Fourier transformed Boolean functions on the other. Then the Fourier transform provides an isomorphism between the two function spaces. The theory in its more general form can for example be found in [11].

Theorem 2.1.14. [Fourier transform as isomorphism over algebras (Theorem 1.27 in [6])] Let $f, g$ be Boolean functions. Then
\[ \widehat{f * g} = \hat{f} \cdot \hat{g}. \]
For a proof we refer to O’Donnell [6].

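Remark. Convolutions can be “flipped across” an inner product using Plancherel’s identity:
\[ \langle f * g \mid h \rangle = \langle \widehat{f * g} \mid \hat{h} \rangle = \langle \hat{f} \cdot \hat{g} \mid \hat{h} \rangle = \langle \hat{f} \mid \hat{g} \cdot \hat{h} \rangle = \langle f \mid g * h \rangle. \tag{2.3} \]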
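The statement $\widehat{f * g} = \hat{f} \cdot \hat{g}$ of Theorem 2.1.14 can be verified numerically for small n. The sketch below is a brute-force check on random functions for n = 3, with functions stored as dictionaries; `convolve`, `hat` and `mul` are illustrative names, not notation from the thesis.

```python
import random
from itertools import combinations, product
from math import prod

n = 3
cube = list(product((1, -1), repeat=n))
subsets = [c for k in range(n + 1) for c in combinations(range(n), k)]

def mul(x, y):
    """Group operation on {+1, -1}^n."""
    return tuple(a * b for a, b in zip(x, y))

def hat(f):
    """Fourier transform of f (a dict x -> f(x))."""
    return {S: sum(f[x] * prod(x[i] for i in S) for x in cube) / len(cube)
            for S in subsets}

def convolve(f, g):
    """(f * g)(x) = E_y[f(y) g(xy)] with y uniform on the cube."""
    return {x: sum(f[y] * g[mul(x, y)] for y in cube) / len(cube) for x in cube}

rng = random.Random(1)  # fixed seed for reproducibility
f = {x: rng.gauss(0, 1) for x in cube}
g = {x: rng.gauss(0, 1) for x in cube}

fg_hat = hat(convolve(f, g))
f_hat, g_hat = hat(f), hat(g)
max_error = max(abs(fg_hat[S] - f_hat[S] * g_hat[S]) for S in subsets)

# Commutativity (Lemma 2.1.11) can be checked the same way.
fg, gf = convolve(f, g), convolve(g, f)
max_commutator = max(abs(fg[x] - gf[x]) for x in cube)
```

Both `max_error` and `max_commutator` should be zero up to floating-point rounding.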

2.2 Noise stability

This section follows chapter 2 of O’Donnell [6]. Concepts closely related to social choice theory are defined. We start with the definition of an operator which is needed in section 2.3; then we explore properties of this operator.

Definition 2.2.1. [Noise operator (Definition 2.46 and Proposition 2.47 in [6])] Let $\rho \in \mathbb{R}$, and let $f$ be a Boolean function. Define the noise operator $T_\rho$ by
\[ T_\rho f := \sum_{S \subseteq [n]} \rho^{|S|} \hat{f}(S) \chi_S. \]
Note that the noise operator is linear by definition, and $T_1 = \mathrm{Id}$.

For $\rho \in [-1, 1]$ one can give a stochastic interpretation to this operator as the expectation of a random variable. For this we need the concept of applying “noise” to an n-bit string, which boils down to flipping each bit independently with a certain probability.

Definition 2.2.2. [ρ-correlation (Definition 2.40 in [6])] Let $\rho \in [-1, 1]$, and let $x \in \{\pm 1\}^n$ be fixed. Let $\mathbf{y}$ be a random variable in $\{\pm 1\}^n$ such that for every $i \in [n]$ independently,
\[ \mathbf{P}[\mathbf{y}_i = x_i] = \frac{1 + \rho}{2}, \qquad \mathbf{P}[\mathbf{y}_i = -x_i] = \frac{1 - \rho}{2}. \]
The random variable $\mathbf{y}$ is then called ρ-correlated to $x$, denoted by $\mathbf{y} \sim N_\rho(x)$.

Remark. Due to the symmetry in $x$ and $y$ in the previous definition one can also consider a pair of random variables $(x, y)$. This pair is called a ρ-correlated pair if first $x \sim \{\pm 1\}^n$ is chosen uniformly, and then $y \sim N_\rho(x)$ is chosen, i.e. $y$ is chosen ρ-correlated to $x$.

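Theorem 2.2.3. [Stochastic interpretation of the noise operator (Definition 2.46 and Proposition 2.47 in [6])] Let $\rho \in [-1, 1]$, and $f$ a Boolean function. Then
\[ T_\rho f(x) = \mathbf{E}_{y \sim N_\rho(x)}[f(y)]. \]

Proof. By linearity it suffices to consider how the noise operator acts on characters. One has
\begin{align*}
T_\rho \chi_S(x) = \rho^{|S|} \chi_S(x) = \prod_{i \in S} \rho x_i &= \prod_{i \in S} \mathbf{E}_{y \sim N_\rho(x)}[y_i] \tag{2.4} \\
&= \mathbf{E}_{y \sim N_\rho(x)}\Big[ \prod_{i \in S} y_i \Big] \tag{2.5} \\
&= \mathbf{E}_{y \sim N_\rho(x)}[\chi_S(y)].
\end{align*}
Here equation 2.4 holds since
\[ \mathbf{E}_{y \sim N_\rho(x)}[y_i] = x_i \mathbf{P}[y_i = x_i] + (-x_i) \mathbf{P}[y_i = -x_i] = \frac{x_i}{2}(1 + \rho) - \frac{x_i}{2}(1 - \rho) = \rho x_i, \]
which can also be seen as the “damping” of $x_i$ by $\rho$. Furthermore equation 2.5 follows from independence.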
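The agreement between the Fourier definition of $T_\rho$ and its stochastic interpretation can be checked exactly for small n, since the distribution $N_\rho(x)$ assigns probability $\prod_i \frac{1 + \rho x_i y_i}{2}$ to each point y. The following is a minimal sketch for Maj3; the function names are illustrative and the enumeration is exponential in n.

```python
from itertools import combinations, product
from math import prod

n = 3
cube = list(product((1, -1), repeat=n))
subsets = [c for k in range(n + 1) for c in combinations(range(n), k)]

def chi(S, x):
    return prod(x[i] for i in S)

def fourier(f):
    return {S: sum(f(x) * chi(S, x) for x in cube) / len(cube) for S in subsets}

def noise_via_fourier(f, rho, x):
    """T_rho f(x) = sum_S rho^|S| f^(S) chi_S(x)  (Definition 2.2.1)."""
    f_hat = fourier(f)
    return sum(rho ** len(S) * f_hat[S] * chi(S, x) for S in subsets)

def noise_via_expectation(f, rho, x):
    """E_{y ~ N_rho(x)}[f(y)], summing f(y) against the exact probability of y."""
    return sum(prod((1 + rho * a * b) / 2 for a, b in zip(x, y)) * f(y)
               for y in cube)

def maj3(x):
    return 1 if sum(x) > 0 else -1

max_gap = max(
    abs(noise_via_fourier(maj3, rho, x) - noise_via_expectation(maj3, rho, x))
    for rho in (-1.0, -0.3, 0.0, 0.5, 1.0)
    for x in cube
)
```

Each factor $(1 + \rho x_i y_i)/2$ equals $(1 + \rho)/2$ when $y_i = x_i$ and $(1 - \rho)/2$ otherwise, matching Definition 2.2.2.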

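Lemma 2.2.4. [Exercise 2.34 in [6]] The noise operator satisfies $|T_\rho f| \le T_\rho |f|$ pointwise, for all Boolean $f$.

Proof. For all $x \in \{\pm 1\}^n$ it holds that
\[ |T_\rho f(x)| = |\mathbf{E}_{y \sim N_\rho(x)}[f(y)]| \le \mathbf{E}_{y \sim N_\rho(x)}[|f(y)|] = \mathbf{E}_{y \sim N_\rho(x)}[|f|(y)] = T_\rho |f|(x), \]
where the inequality is the triangle inequality for expectations.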

Lemma 2.2.5. [Exercise 2.32 in [6]] The noise operator is a multiplicative homomorphism in $\rho$.

Proof. First note that for all Boolean functions $f$:
\[ \widehat{T_\rho f}(S) = \rho^{|S|} \hat{f}(S) \]
by definition of the noise operator and uniqueness of the Fourier expansion. It follows that
\[ \widehat{T_{\tau\rho} f}(S) = (\tau\rho)^{|S|} \hat{f}(S) = \tau^{|S|} \rho^{|S|} \hat{f}(S) = \tau^{|S|} \widehat{T_\rho f}(S) = \widehat{T_\tau T_\rho f}(S) \]
and therefore $\widehat{T_{\tau\rho} f} = \widehat{T_\tau T_\rho f}$. This immediately gives $T_{\tau\rho} f = T_\tau T_\rho f$ for all Boolean functions $f$ and hence
\[ T_{\tau\rho} = T_\tau \circ T_\rho \]
as was claimed.

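Lemma 2.2.6. The noise operator is a Hermitian operator for every $\rho \in \mathbb{R}$.

Proof. Let $f$, $g$ be Boolean functions. Then for every $\rho \in \mathbb{R}$:
\[ \langle T_\rho f \mid g \rangle = \Big\langle \sum_{S \subseteq [n]} \rho^{|S|} \hat{f}(S) \chi_S \;\Big|\; \sum_{T \subseteq [n]} \hat{g}(T) \chi_T \Big\rangle = \sum_{S \subseteq [n]} \rho^{|S|} \hat{f}(S) \hat{g}(S) = \Big\langle \sum_{S \subseteq [n]} \hat{f}(S) \chi_S \;\Big|\; \sum_{T \subseteq [n]} \rho^{|T|} \hat{g}(T) \chi_T \Big\rangle = \langle f \mid T_\rho g \rangle, \]
using linearity of the inner product and that $\langle \chi_S \mid \chi_T \rangle = \delta_{S,T}$.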

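Theorem 2.2.7. [Contraction theorem (Exercise 2.33 in [6])] $T_\rho$ is a contraction on $L^p(\{\pm 1\}^n)$ for $p \ge 1$ and $\rho \in [-1, 1]$, i.e. for all Boolean functions $f$:
\[ \|T_\rho f\|_p \le \|f\|_p. \]

Proof. Writing out one gets
\[ \|T_\rho f\|_p^p = \mathbf{E}_{x \sim \{\pm 1\}^n} |T_\rho f(x)|^p = \mathbf{E}_x \big| \mathbf{E}_{y \sim N_\rho(x)}[f(y)] \big|^p. \tag{2.6} \]
Since $t \mapsto |t|^p$ is convex for $p \ge 1$, one can apply Jensen’s inequality (Theorem 1.1.4) to equation 2.6 and then switch the random variables around:
\[ \mathbf{E}_x \big| \mathbf{E}_{y \sim N_\rho(x)}[f(y)] \big|^p \le \mathbf{E}_x \mathbf{E}_{y \sim N_\rho(x)} \big[ |f(y)|^p \big] = \mathbf{E}_{(x,y)\ \rho\text{-correlated}} \big[ |f(y)|^p \big] = \mathbf{E}_y \mathbf{E}_{x \sim N_\rho(y)} \big[ |f(y)|^p \big]. \tag{2.7} \]
Since $x$ does not appear in the argument of the inner expectation in equation 2.7, the proof concludes with
\[ \mathbf{E}_y \mathbf{E}_{x \sim N_\rho(y)} \big[ |f(y)|^p \big] = \mathbf{E}_y |f(y)|^p = \|f\|_p^p. \]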

Definition 2.2.8. [Noise stability (Definition 2.42 and Fact 2.48 in [6])] Let $f$ be a Boolean function, $\rho \in \mathbb{R}$. Then the noise stability of $f$ at $\rho$ is defined as
\[ \mathrm{Stab}_\rho[f] = \langle T_\rho f \mid f \rangle. \]

Proposition 2.2.9. [Stochastic interpretation of noise stability (Definition 2.42 and Fact 2.48 in [6])] Let $f$ be a Boolean function, and $\rho \in [-1, 1]$. Then
\[ \mathrm{Stab}_\rho[f] = \mathbf{E}_{(x,y)\ \rho\text{-correlated}} [f(x) f(y)]. \]

Proof. Writing out the noise stability gives
\begin{align*}
\mathrm{Stab}_\rho[f] = \langle T_\rho f \mid f \rangle &= \mathbf{E}_y[T_\rho f(y) f(y)] \\
&= \mathbf{E}_y \Big[ \mathbf{E}_{x \sim N_\rho(y)}[f(x)] \, f(y) \Big] = \mathbf{E}_{(x,y)\ \rho\text{-correlated}} [f(x) f(y)],
\end{align*}
as was claimed.
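As a worked example, the noise stability of Maj3 can be computed in two ways. By Plancherel’s identity and Definition 2.2.1, $\mathrm{Stab}_\rho[f] = \sum_S \rho^{|S|} \hat{f}(S)^2$, which for the Fourier expansion of Maj3 given earlier equals $\frac{3}{4}\rho + \frac{1}{4}\rho^3$. The sketch below compares this closed form with the expectation over ρ-correlated pairs from Proposition 2.2.9, computed by exact enumeration; the function name `stability` is illustrative.

```python
from itertools import product
from math import isclose, prod

n = 3
cube = list(product((1, -1), repeat=n))

def maj3(x):
    return 1 if sum(x) > 0 else -1

def stability(f, rho):
    """E[f(x) f(y)] over rho-correlated pairs (x, y), by exact enumeration."""
    total = 0.0
    for x in cube:
        for y in cube:
            # x is uniform; given x, each y_i equals x_i with probability (1 + rho) / 2.
            p_xy = (1 / len(cube)) * prod((1 + rho * a * b) / 2 for a, b in zip(x, y))
            total += p_xy * f(x) * f(y)
    return total

def maj3_closed_form(rho):
    """sum_S rho^|S| * Maj3^(S)^2 = 3 * (1/4) * rho + (1/4) * rho^3."""
    return 0.75 * rho + 0.25 * rho ** 3
```

For instance, `stability(maj3, 0.5)` and `maj3_closed_form(0.5)` agree up to rounding, and at ρ = 1 both give 1 since Maj3 takes values in {±1}.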

2.3 Hypercontractivity

This section follows parts of chapters 9 and 10 of O’Donnell [6].

It has been shown that $T_\rho$ is a contraction on $L^p(\{\pm 1\}^n)$ for $p \ge 1$. The goal of this section is to prove a stronger statement, namely that $T_\rho$ is a hypercontraction from $L^p(\{\pm 1\}^n)$ to $L^q(\{\pm 1\}^n)$ for certain $p$ and $q$.

Let $\mu(A) = \mathbf{E}[1_A] = \frac{|A|}{2^n}$ be the uniform probability measure on $\mathbb{F}_2^n \cong \{\pm 1\}^n$ in Definition 1.1.2. For $f \in L^p(\{\pm 1\}^n)$ this gives in particular
\[ \|f\|_p = \Big( \int_{\{\pm 1\}^n} |f|^p \, d\mu \Big)^{1/p} = \Big( \sum_{x \in \{\pm 1\}^n} |f(x)|^p \frac{1}{2^n} \Big)^{1/p} = \mathbf{E}_{x \sim \{\pm 1\}^n}[|f(x)|^p]^{1/p}. \]

Theorem 2.3.1. [Bonami-Beckner inequality, Hypercontractivity theorem (page 247, 284 in [6])] Let $f : \{\pm 1\}^n \to \mathbb{R}$, $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$. Then $f$ is $(p, q, \rho)$-hypercontractive, i.e.
\[ \|T_\rho f\|_q \le \|f\|_p. \]

Two degenerate cases can be dealt with immediately:

• The case $\rho = 1$ implies
\[ 1 = \rho \le \Big( \frac{p-1}{q-1} \Big)^{1/2} \le \Big( \frac{q-1}{q-1} \Big)^{1/2} = 1 \iff \frac{p-1}{q-1} = 1 \iff p = q, \]
for $q \ne 1$. Since $T_1 = \mathrm{Id}$ the inequality then states $\|f\|_p \le \|f\|_p$, which is trivially valid.

• The other case, $q = 1$, implies $1 \le p \le q = 1$ and therefore $p = 1 = q$. The condition on $\rho$ is now not well-defined, but it is already proven that $T_\rho$ is a contraction (Theorem 2.2.7), which exactly proves the Bonami-Beckner inequality for $p = q$.

Both cases therefore need not be considered in the remainder of this chapter.

Considering $T_\rho$ as a linear operator from $L^p(\{\pm 1\}^n)$ to $L^q(\{\pm 1\}^n)$, this inequality states that $T_\rho$ is a bounded operator, and therefore also continuous.

The Bonami-Beckner inequality implies the operator norm can be bounded by $\|T_\rho\| \le 1$. Sharpness is achieved by $T_\rho \chi_\emptyset = \chi_\emptyset$. In particular this implies $\|T_\rho\| = 1$.

The proof of this inequality consists of the following parts:

• Formulating a Two-function version of the Hypercontractivity theorem, and showing its equivalence with the Bonami-Beckner inequality.

• Reducing the theorem for $n = 1$ to a statement on uniform $\{\pm 1\}$-bits, for $1 \le p < q \le 2$ and $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$.

• Showing that the Bonami-Beckner inequality holds for $n = 1$; this is also called the Two-Point inequality.

• Proving the general statement of the Two-function version by induction on $n$.

We will start with the Two-function version of the Bonami-Beckner inequality. The reductions, the case $n = 1$ and the induction on $n$ are discussed in later sections.

Theorem 2.3.2. [Two-function Hypercontractivity Theorem (page 284 in [6])] Let $f, g : \{\pm 1\}^n \to \mathbb{R}$ be Boolean functions, and $r, s \ge 0$, $0 \le \rho \le (rs)^{1/2} \le 1$. Then
\[ \mathbf{E}_{(x,y)\ \rho\text{-correlated}} [f(x) g(y)] \le \|f\|_{1+r} \|g\|_{1+s}. \]

The following proposition and its corollary show equivalence of the Hypercontractivity theorems.

Proposition 2.3.3. [Proposition 10.4 in [6]] Let $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$. Then the following two statements are equivalent, where $q'$ is the Hölder conjugate of $q$:
\[ \|T_\rho f\|_q \le \|f\|_p \ \text{ for all Boolean } f \iff \langle T_\rho f \mid g \rangle \le \|f\|_p \|g\|_{q'} \ \text{ for all Boolean } f, g. \]

Proof. Suppose that $\|T_\rho f\|_q \le \|f\|_p$ for every Boolean function $f$. Then for all Boolean functions $f$, $g$:
\[ \langle T_\rho f \mid g \rangle \le \|T_\rho f\|_q \|g\|_{q'} \le \|f\|_p \|g\|_{q'}, \]
with the first inequality following by Hölder’s inequality and the second by assumption. On the other hand, suppose that $\langle T_\rho f \mid g \rangle \le \|f\|_p \|g\|_{q'}$ for all Boolean functions $f$, $g$. Then for every Boolean function $f$ one has
\[ \|T_\rho f\|_q = \sup_{\|g\|_{q'} = 1} \langle T_\rho f \mid g \rangle \le \sup_{\|g\|_{q'} = 1} \|f\|_p \|g\|_{q'} = \|f\|_p, \]
where the first equality holds by sharpness of Hölder’s inequality and the inequality by assumption.

Corollary 2.3.4. [Equivalence of Hypercontractivity theorems] The Hypercontractivity Theorem (Theorem 2.3.1) and Two-function Hypercontractivity Theorem (Theorem 2.3.2) are equivalent.

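Proof. Let $p = 1 + r$ and $q' = 1 + s$ in the previous proposition. Similarly to Proposition 2.2.9 we have
\[ \mathbf{E}_{(x,y)\ \rho\text{-correlated}} [f(x) g(y)] = \mathbf{E}_y \Big[ \mathbf{E}_{x \sim N_\rho(y)}[f(x)] \, g(y) \Big] = \mathbf{E}_y[T_\rho f(y) g(y)] = \langle T_\rho f \mid g \rangle. \]
Furthermore $\big( \frac{p-1}{q-1} \big)^{1/2} = ((p-1)(q'-1))^{1/2} = (rs)^{1/2}$, which shows that $\rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$ if and only if $\rho \le (rs)^{1/2}$.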

2.3.1 Reductions

Suppose $f : \{\pm 1\} \to \mathbb{R}$, i.e. $n = 1$. The Fourier expansion of $f$ is then given by $f = \hat{f}(\emptyset) \chi_\emptyset + \hat{f}(\{1\}) \chi_{\{1\}}$, which gives $f(x) = \hat{f}(\emptyset) + \hat{f}(\{1\}) x$. Therefore, write $f(x) = a + bx$ for $a, b \in \mathbb{R}$. Denoting $f$ like this also gives $T_\rho f(x) = a + \rho b x$. Since $x \in \{\pm 1\}$ only takes two values it is useful to think of the function $f : x \mapsto a + bx$ as the random variable $a + bx$, for $x$ uniformly chosen from $\{\pm 1\}$.

Hence hypercontractivity of f can be expressed as hypercontractivity of the uniform {±1}-bit x, as follows:

Definition 2.3.5. Let $x$ be a random variable uniformly chosen from $\{\pm 1\}$, i.e. $x \sim \{\pm 1\}$. Define for any function $f$ in $x$:
\[ \|f(x)\|_p := \|f\|_p = \mathbf{E}_x[|f(x)|^p]^{1/p}. \]
Let again $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$. The uniform $\{\pm 1\}$-bit $x$ is now called $(p, q, \rho)$-hypercontractive if for all $a, b \in \mathbb{R}$, it holds that
\[ \|a + \rho b x\|_q \le \|a + b x\|_p. \]

It is essential to note that by definition the {±1}-bit x is hypercontractive if and only if the function f : x 7→ a + bx is hypercontractive.

The following two lemmas are proven in the context of general Boolean functions, but applied for n = 1.

Lemma 2.3.6. Suppose $f : \{\pm 1\}^n \to \mathbb{R}$ is $(p, q, \rho)$-hypercontractive. Let $c \in \mathbb{R}$. Then $cf$ is also $(p, q, \rho)$-hypercontractive.

Proof. We have
\[ \|T_\rho (cf)\|_q = \|c \cdot T_\rho f\|_q = |c| \, \|T_\rho f\|_q \le |c| \, \|f\|_p = \|cf\|_p. \]
This proves the lemma.

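Suppose hypercontractivity is proven for the $\{\pm 1\}$-bit $x$ with $a = 1$. Using the previous lemma we then know that $x$ is hypercontractive for any $a \ne 0$.

In the case that $a = 0$ one is left to prove
\[ \|\rho b x\|_q \le \|b x\|_p \iff \rho \|x\|_q \le \|x\|_p \iff \rho \le \frac{\|x\|_p}{\|x\|_q} = 1, \]
where the last equality holds because $|x| = 1$ and the norms are taken with respect to the uniform probability measure. The resulting condition holds since $\rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$ and $p \le q$ imply $\rho \le \big( \frac{p-1}{q-1} \big)^{1/2} \le 1$.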

Lemma 2.3.7. [Exercise 9.7 in [6]] Let $1 \le p \le q$ and $0 \le \rho < 1$. Suppose we know that every non-negative Boolean function $f : \{\pm 1\}^n \to \mathbb{R}$, $f \ge 0$, is $(p, q, \rho)$-hypercontractive. Then the same holds for arbitrary Boolean functions $g : \{\pm 1\}^n \to \mathbb{R}$.

Proof. We have to prove that the Boolean function $g$ is $(p, q, \rho)$-hypercontractive, i.e. that $\|T_\rho g\|_q \le \|g\|_p$. We see that
\[ \|T_\rho g\|_q = \big\| |T_\rho g| \big\|_q \le \big\| T_\rho |g| \big\|_q \le \big\| |g| \big\|_p = \|g\|_p, \]
using Lemma 2.2.4 for the first inequality and the assumption with $f = |g| \ge 0$ for the second.

We already know it is sufficient to prove hypercontractivity of a single bit $x \sim \{\pm 1\}$ (Definition 2.3.5) for $a = 1$. Now suppose this is proven also with $|b| \le 1$. Let $f : x \mapsto 1 + bx$. It holds for all $x \in \{\pm 1\}$ that $f(x) \ge 0$ if and only if $bx \ge -1$, exactly when $b \ge -1$ and $-b \ge -1$, i.e. $|b| \le 1$. Applying the previous lemma restricted to functions $f$ as defined, we then know that every function of this form is hypercontractive. Hence, for $1 \le p \le q \le \infty$ and $0 \le \rho \le \big( \frac{p-1}{q-1} \big)^{1/2}$, we know for the bit $x$ that
\[ \|1 + \rho b x\|_q \le \|1 + b x\|_p \]
holds for every $b \in \mathbb{R}$, which was sufficient for proving hypercontractivity of it. Hence, from this point on, this is exactly what needs to be proven in the case $n = 1$.
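The single-bit inequality $\|1 + \rho b x\|_q \le \|1 + b x\|_p$ with $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$ is easy to probe numerically, because both norms are averages over the two values of x. The following sketch evaluates the difference of the two sides on a grid; the chosen pairs (p, q), the grid for b and the function names are assumptions made for illustration, and a numerical check of this kind is of course no substitute for the proof.

```python
from math import sqrt

def norm(c, p):
    """p-norm of the random variable 1 + c*x for a uniform bit x in {+1, -1}."""
    return (0.5 * abs(1 + c) ** p + 0.5 * abs(1 - c) ** p) ** (1 / p)

def two_point_gap(p, q, b):
    """||1 + b x||_p - ||1 + rho b x||_q with rho = sqrt((p - 1) / (q - 1))."""
    rho = sqrt((p - 1) / (q - 1))
    return norm(b, p) - norm(rho * b, q)

pairs = [(1.0, 2.0), (1.5, 2.0), (2.0, 4.0), (1.2, 3.0)]  # sample (p, q) with p < q
grid = [i / 20 for i in range(-60, 61)]                   # b in [-3, 3]
smallest_gap = min(two_point_gap(p, q, b) for p, q in pairs for b in grid)
```

The gap vanishes at b = 0 and is only of fourth order in b nearby, which reflects that the two sides agree up to second order at this value of ρ.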

The following lemma shows that the Hypercontractivity theorem only needs to be proven for $\rho = \big( \frac{p-1}{q-1} \big)^{1/2}$.

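Lemma 2.3.8. [Exercise 9.11 in [6]] Suppose that a uniform $\{\pm 1\}$-bit $x$ is $(p, q, \rho)$-hypercontractive. Then $x$ is also $(p, q, \tau)$-hypercontractive for all $\tau < \rho$.

Proof. Let $a \in \mathbb{R}$. Using Lemma 2.3.6 we prove hypercontractivity for $b = 1$. Since uniform $\{\pm 1\}$-bits have mean 0, we have
\[ |a| = |a + \mathbf{E}[x]| = |\mathbf{E}[a + x]| \le \mathbf{E}[|a + x|] = \|a + x\|_1 \le \|a + x\|_q, \tag{2.8} \]
using Jensen’s inequality and the fact that the q-norm with respect to the uniform measure is monotonically non-decreasing in $q$. Furthermore for $\rho < 1$ we have
\begin{align*}
\|a + \rho x\|_q &= \|(1 - \rho) a + \rho (a + x)\|_q \le \|(1 - \rho) a\|_q + \|\rho (a + x)\|_q \\
&= (1 - \rho) |a| + \rho \|a + x\|_q && \text{using that } 0 \le \rho < 1 \\
&\le (1 - \rho) \|a + x\|_q + \rho \|a + x\|_q && \text{with equation 2.8} \\
&= \|a + x\|_q. \tag{2.9}
\end{align*}
Finally, let $0 \le \tau < \rho$. Then
\begin{align*}
\|a + \tau x\|_q &= \rho \Big\| \frac{a}{\rho} + \frac{\tau}{\rho} x \Big\|_q \\
&\le \rho \Big\| \frac{a}{\rho} + x \Big\|_q && \text{since } \tfrac{\tau}{\rho} < 1 \text{, using equation 2.9} \\
&= \|a + \rho x\|_q \le \|a + x\|_p && \text{since } x \text{ is hypercontractive.}
\end{align*}
The conclusion is that $x$ is $(p, q, \tau)$-hypercontractive for all $0 \le \tau < \rho$.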

The following two lemmas show that the Bonami-Beckner inequality needs only to be proven for $1 \le p \le q \le 2$. Letting $2 \le q', p' \le \infty$ be the Hölder conjugates of $q$ and $p$ we then have
\[ q' - 1 = \frac{1}{q - 1} \le \frac{1}{p - 1} = p' - 1 \implies q' \le p'. \]

Lemma 2.3.9. [Proposition 9.19 in [6]] Let $f$ be a real-valued Boolean function. Suppose we prove $(p, q, \rho)$-hypercontractivity of $f$ for $1 \le p \le q \le 2$. Let $2 \le q' \le p' \le \infty$ be their Hölder conjugates. Then $f$ is also $(q', p', \rho)$-hypercontractive.

Proof. We have $\|T_\rho f\|_q \le \|f\|_p$. Let $g$ be any real-valued Boolean function, then
\[ \|T_\rho g\|_{p'} = \sup_{\|f\|_p = 1} \langle f \mid T_\rho g \rangle = \sup_{\|f\|_p = 1} \langle T_\rho f \mid g \rangle \le \sup_{\|f\|_p = 1} \|T_\rho f\|_q \|g\|_{q'} \le \sup_{\|f\|_p = 1} \|f\|_p \|g\|_{q'} = \|g\|_{q'}, \]
using sharpness of Hölder’s inequality, that $T_\rho$ is Hermitian, Hölder’s inequality, and the assumption, respectively.

Lemma 2.3.10. [Exercise 9.17 in [6]] Suppose that for every $p, q$ such that $p < 2 < q$ we have that a Boolean function $f$ is $(2, q, (q - 1)^{-1/2})$- and $(p, 2, (p - 1)^{1/2})$-hypercontractive. Then $f$ also is $\big( p, q, \big( \frac{p-1}{q-1} \big)^{1/2} \big)$-hypercontractive.

Proof. Define $\tau := (q - 1)^{-1/2}$ and $\sigma := (p - 1)^{1/2}$. We have for all Boolean functions $f$:
\[ \|T_\tau f\|_q \le \|f\|_2, \qquad \|T_\sigma f\|_2 \le \|f\|_p. \]
Let $\rho = \big( \frac{p-1}{q-1} \big)^{1/2} = \tau\sigma$. Then for all Boolean functions $f$:
\[ \|T_\rho f\|_q = \|T_{\tau\sigma} f\|_q = \|T_\tau T_\sigma f\|_q \le \|T_\sigma f\|_2 \le \|f\|_p. \]

2.3.2 Base case and induction

At this point we need to prove much less to actually prove the entire Bonami-Beckner inequality, or Hypercontractivity theorem. Proving the base case and doing induction on the Two-function Hypercontractivity theorem now suffices. First, we give a lemma that is needed for the base case:

Lemma 2.3.11. For $t \ge 0$ and $0 \le \theta \le 1$ we have $(1 + t)^\theta \le 1 + \theta t$.

This lemma can be checked by taking derivatives in $t$.

Theorem 2.3.12. [Two-point inequality (page 286 in [6])] Suppose $1 \le p \le q \le \infty$, $0 \le \rho \le \left(\frac{p-1}{q-1}\right)^{\frac12}$ and let $f : \{\pm1\} \to \mathbb{R}$ be a function on a single bit. Then $\|T_\rho f\|_q \le \|f\|_p$.

Proof. The proof is trivial for $\rho = 1$. Using the reductions from the previous section (2.3.1), it suffices to prove, for $1 \le p < q \le 2$ (since $T_\rho$ is a contraction (Theorem 2.2.7) and the norms can be "flipped" (Lemma 2.3.9)), a uniform $\{\pm1\}$-bit $x$, $\rho = \left(\frac{p-1}{q-1}\right)^{\frac12}$ and $|\epsilon| < 1$, that

$$\|1+\rho\epsilon x\|_q \le \|1+\epsilon x\|_p.$$

Expanding this one gets

$$\|1+\rho\epsilon x\|_q^p \le \|1+\epsilon x\|_p^p \iff \mathbb{E}_x\left[(1+\rho\epsilon x)^q\right]^{\frac{p}{q}} \le \mathbb{E}_x\left[(1+\epsilon x)^p\right],$$

dropping the absolute values since $d\epsilon x > -1$ for $d = \rho$ or $d = 1$. Now write out the expected values, with $x \sim \{\pm1\}$:

$$\left(\tfrac12(1+\rho\epsilon)^q + \tfrac12(1-\rho\epsilon)^q\right)^{p/q} \le \tfrac12(1+\epsilon)^p + \tfrac12(1-\epsilon)^p.$$


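Since $|\epsilon| < 1$, the Generalized Binomial Theorem can be applied to obtain

$$\left(\frac12\sum_{k\ge0}\binom{q}{k}(\rho\epsilon)^k + \frac12\sum_{k\ge0}\binom{q}{k}(-\rho\epsilon)^k\right)^{p/q} \le \frac12\sum_{k\ge0}\binom{p}{k}\epsilon^k + \frac12\sum_{k\ge0}\binom{p}{k}(-\epsilon)^k,$$

in which all uneven terms cancel out to give

$$\left(1+\sum_{l\ge1}\binom{q}{2l}(\rho\epsilon)^{2l}\right)^{p/q} \le 1+\sum_{l\ge1}\binom{p}{2l}\epsilon^{2l}. \qquad (2.10)$$

Given $1 \le q \le 2$ and since

$$\binom{q}{2l} = \frac{1}{(2l)!}\prod_{j=0}^{2l-1}(q-j) = \frac{1}{(2l)!}\,q(q-1)\prod_{j=2}^{2l-1}(j-q) \ge 0,$$

with Lemma 2.3.11 one has

$$\left(1+\sum_{l\ge1}\binom{q}{2l}(\rho\epsilon)^{2l}\right)^{p/q} \le 1+\frac{p}{q}\sum_{l\ge1}\binom{q}{2l}(\rho\epsilon)^{2l} = 1+\sum_{l\ge1}\frac{p}{q}\left(\frac{p-1}{q-1}\right)^{l}\binom{q}{2l}\epsilon^{2l},$$

which means it is sufficient to prove

$$\sum_{l\ge1}\frac{p}{q}\left(\frac{p-1}{q-1}\right)^{l}\binom{q}{2l}\epsilon^{2l} \le \sum_{l\ge1}\binom{p}{2l}\epsilon^{2l}.$$

Comparing terms, it therefore suffices to prove for each $l$ that

$$\frac{p}{q}\left(\frac{p-1}{q-1}\right)^{l}\binom{q}{2l} \le \binom{p}{2l}. \qquad (2.11)$$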

Now note that for $p = 1$ equation 2.11 reads $0 \le \binom{1}{2l} = 0$, which holds. Therefore, we can assume from this point on that $p \ne 1$. Equation 2.11 can now be written out as

$$\frac{p}{q}\left(\frac{p-1}{q-1}\right)^{l}\frac{1}{(2l)!}\,q(q-1)\prod_{j=2}^{2l-1}(j-q) \le \frac{1}{(2l)!}\,p(p-1)\prod_{j=2}^{2l-1}(j-p)$$

$$\iff \left(\frac{p-1}{q-1}\right)^{l-1}\prod_{j=2}^{2l-1}(j-q) \le \prod_{j=2}^{2l-1}(j-p)$$

$$\iff \prod_{j=2}^{2l-1}\frac{j-q}{(q-1)^{\frac12}} \le \prod_{j=2}^{2l-1}\frac{j-p}{(p-1)^{\frac12}}.$$

Noting that for $j \ge 2$, $r > 1$ one has

$$\frac{d}{dr}\,\frac{j-r}{(r-1)^{\frac12}} = \frac{-(r-1)^{\frac12} - \frac12(r-1)^{-\frac12}(j-r)}{r-1} = -\frac{2r-2+j-r}{2(r-1)^{\frac32}} = -\frac{r+j-2}{2(r-1)^{\frac32}} < 0,$$

it follows that $\frac{j-r}{(r-1)^{1/2}}$ decreases in $r$. Therefore $p < q$ implies for each $j \ge 2$:

$$\frac{j-q}{(q-1)^{\frac12}} \le \frac{j-p}{(p-1)^{\frac12}},$$

and since all factors are non-negative for $q \le 2 \le j$, this concludes the proof.

Now induction is done for the Two-function Hypercontractivity theorem. Doing induction on the Two-point inequality to derive the Bonami-Beckner inequality directly is possible, but needs more concepts on Boolean functions than described in this thesis. See also Remark 10.5 in O’Donnell [6].

Theorem 2.3.13. [Two-function Hypercontractivity Induction Theorem (page 261 in [6])] Suppose that the base case of the Two-function Hypercontractivity theorem holds. Then it holds for all n.

Proof. For $n > 1$, let $f, g : \{\pm1\}^n \to \mathbb{R}$ be Boolean functions. Choose $(x, y)$ $\rho$-correlated. Denote $x' = (x_i)_{i=1}^{n-1}$ and $x = (x', x_n)$, and similarly for $y$. This makes both $(x', y')$ and $(x_n, y_n)$ $\rho$-correlated pairs by definition. Also denote $f_{x_n} = f_{[n-1]\mid x_n}$ for the restriction of $f$ with the last coordinate fixed to $x_n$; similarly for $g$.

Then

$$\mathbb{E}_{(x,y)\ \rho\text{-correlated}}[f(x)g(y)] = \mathbb{E}_{(x_n,y_n)}\Big[\mathbb{E}_{(x',y')}\big[f_{x_n}(x')\,g_{y_n}(y')\big]\Big] \le \mathbb{E}_{(x_n,y_n)}\big[\|f_{x_n}\|_p\,\|g_{y_n}\|_q\big],$$

where the induction hypothesis is used for the inequality. Define $F(x_n) := \|f_{x_n}\|_p$ and $G(y_n) := \|g_{y_n}\|_q$, then

$$\mathbb{E}_{(x_n,y_n)}\big[\|f_{x_n}\|_p\,\|g_{y_n}\|_q\big] = \mathbb{E}_{(x_n,y_n)}[F(x_n)G(y_n)] \le \|F\|_p\|G\|_q,$$

where the inequality follows by the base case. Writing out one gets

$$\|F\|_p^p = \mathbb{E}_{x_n}[|F(x_n)|^p] = \mathbb{E}_{x_n}\big[\|f_{x_n}\|_p^p\big] = \mathbb{E}_{x_n}\big[\mathbb{E}_{x'}[|f_{x_n}(x')|^p]\big] = \mathbb{E}_x[|f(x)|^p] = \|f\|_p^p$$

and similarly for $G$, which implies

$$\mathbb{E}[f(x)g(y)] \le \|f\|_p\|g\|_q.$$

By applying the equivalence of the Hypercontractivity theorems for one and two functions (Proposition 2.3.3) on the Two-point inequality (Theorem 2.3.12), applying induction by the previous theorem, and again noting equivalence of the Hypercontractivity theorems, the Bonami-Beckner inequality is proven.

As an addendum, the following proposition shows that one cannot weaken the conditions on the Bonami-Beckner inequality.


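Proposition 2.3.14. [Exercise 9.10b in [6]] If $\|1+\rho\epsilon x\|_q \le \|1+\epsilon x\|_p$ holds for all sufficiently small $\epsilon$, then $\rho \le \left(\frac{p-1}{q-1}\right)^{\frac12}$; in particular, the Bonami-Beckner inequality cannot improve on this bound.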

Proof. Starting with the expansion in equation 2.10 in the proof of Theorem 2.3.12, one has

$$\|1+\rho\epsilon x\|_q \le \|1+\epsilon x\|_p \implies \left(1+\sum_{l\ge1}\binom{q}{2l}(\rho\epsilon)^{2l}\right)^{p/q} \le 1+\sum_{l\ge1}\binom{p}{2l}\epsilon^{2l}.$$

Now cut off both series to their second-order expansions, with corresponding remainder terms expressed using asymptotic notation, to get

$$\left(1+\frac{q(q-1)}{2}\rho^2\epsilon^2 + O(\epsilon^4)\right)^{p/q} \le 1+\frac{p(p-1)}{2}\epsilon^2 + O(\epsilon^4).$$

We now again want to take a second-order expansion to eliminate the power taken on the left-hand side. For this one can use the Generalized Binomial Theorem, or note that with $g(\epsilon) = \frac{q(q-1)}{2}\rho^2\epsilon^2 + O(\epsilon^4)$ one has $g(\epsilon) = O(\epsilon^2)$, so that in the Taylor expansion of $(1+g(\epsilon))^{p/q}$ the second term is of the second order and the third term is of the fourth order. Hence one has for $\epsilon$ small enough:

$$1+\frac{p}{q}\,\frac{q(q-1)}{2}\rho^2\epsilon^2 + O(\epsilon^4) \le 1+\frac{p(p-1)}{2}\epsilon^2 + O(\epsilon^4).$$

Writing this out gives

$$(q-1)\rho^2\epsilon^2 + O(\epsilon^4) \le (p-1)\epsilon^2 + O(\epsilon^4) \implies (q-1)\rho^2 + O(\epsilon^2) \le p-1+O(\epsilon^2)$$

$$\implies (q-1)\rho^2 \le p-1 \implies \rho \le \left(\frac{p-1}{q-1}\right)^{\frac12}, \quad \text{letting } \epsilon \to 0.$$
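The Two-point inequality and the sharpness of its bound can also be probed numerically. The following Python sketch is an illustration only and not part of the proofs; the helper names `norm` and `two_point_holds`, the parameter choice $p = 1.5$, $q = 3$ and the grid of test functions are assumptions made for this example. It uses that on a single bit $f(x) = a + bx$ and $T_\rho f(x) = a + \rho b x$.

```python
import itertools

def norm(values, r):
    """r-norm of a function on a uniform finite space, given as the list of its values."""
    return (sum(abs(v) ** r for v in values) / len(values)) ** (1.0 / r)

def two_point_holds(a, b, p, q, rho, tol=1e-12):
    """Check ||T_rho f||_q <= ||f||_p for f(x) = a + b*x on {-1, +1}."""
    f = [a + b, a - b]                      # values f(+1), f(-1)
    t_rho_f = [a + rho * b, a - rho * b]    # T_rho only damps the degree-1 part
    return norm(t_rho_f, q) <= norm(f, p) + tol

p, q = 1.5, 3.0                              # hypothetical example parameters
rho_max = ((p - 1) / (q - 1)) ** 0.5         # critical noise rate sqrt((p-1)/(q-1))

# At the critical noise rate the inequality should hold on the whole grid.
grid = [i / 10 for i in range(-20, 21)]
all_ok = all(two_point_holds(a, b, p, q, rho_max)
             for a, b in itertools.product(grid, grid))

# Slightly above the critical rate it already fails for f = 1 + eps*x with eps = 0.1.
violated = not two_point_holds(1.0, 0.1, p, q, 0.6)
print(all_ok, violated)
```

The tolerance `tol` only absorbs floating-point rounding in the equality cases (for instance $b = 0$).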

2.3.3 Small-set expansion theorem, Level-k inequalities, and Chang’s lemma

In this section sets $A \subseteq \{\pm1\}^n$ are considered, with the exception of Chang's lemma. We now are going to prove a theorem that touches upon the idea of the Hamming cube being a small-set expander, namely that most of the "weight" of a subset of the Hamming cube lies at its boundary. More about these ideas is explained in O'Donnell [6].

Theorem 2.3.15. [Small-set expansion theorem (page 264 in [6])] Suppose $A \subseteq \{\pm1\}^n$ has volume $\alpha \in [0,1]$, i.e. $\mathbb{E}[1_A] = \alpha$. Then for all $0 \le \tau \le 1$ it holds that

$$\mathrm{Stab}_\tau[1_A] \le \alpha^{\frac{2}{1+\tau}}.$$


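Proof. By definition, Lemma 2.2.5, Lemma 2.2.6 and non-negativity of $\tau$ we have

$$\mathrm{Stab}_\tau[1_A] = \langle T_\tau 1_A \mid 1_A\rangle = \langle T_{\sqrt\tau} 1_A \mid T_{\sqrt\tau} 1_A\rangle = \|T_{\sqrt\tau} 1_A\|_2^2.$$

Furthermore, by the Bonami-Beckner inequality (Theorem 2.3.1) with $q = 2$, $p - 1 = \rho^2 = \tau$ it holds that $\|T_{\sqrt\tau} 1_A\|_2 \le \|1_A\|_{\tau+1}$. Combining these results gives

$$\mathrm{Stab}_\tau[1_A] = \|T_{\sqrt\tau} 1_A\|_2^2 \le \|1_A\|_{\tau+1}^2 = \mathbb{E}\big[|1_A|^{\tau+1}\big]^{\frac{2}{\tau+1}} = \mathbb{E}[1_A]^{\frac{2}{\tau+1}} = \alpha^{\frac{2}{\tau+1}},$$

which concludes the proof.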

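In particular, given the assumptions of the theorem just proved, with Proposition 2.2.9 we have

$$\mathrm{Stab}_\rho[1_A] = \mathbb{E}_{(x,y)\ \rho\text{-correlated}}[1_A(x)1_A(y)] = \mathbb{E}_{(x,y)\ \rho\text{-correlated}}[1_{A^2}(x,y)] = \mathbb{P}_{(x,y)\ \rho\text{-correlated}}[(x,y) \in A^2] \le \alpha^{\frac{2}{1+\rho}}.$$

This means that, for a uniformly chosen $x$ and its noisy copy $y$, the probability that both lie in $A$ is at most $\alpha^{2/(1+\rho)}$. Equivalently, conditioned on $x \in A$, the probability of staying inside $A$ when applying noise is at most $\alpha^{(1-\rho)/(1+\rho)}$, which is small when the volume $\alpha$ of the set $A$ is small and $\rho < 1$.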
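For small $n$ the noise stability can be computed exactly, which gives a quick sanity check of the Small-set expansion theorem. The sketch below is an illustration under assumptions: the function names and the two example sets (a subcube and a Hamming ball in $\{\pm1\}^4$) are chosen here and do not come from the text. It uses that each coordinate of $y$ agrees with that of $x$ with probability $\frac{1+\rho}{2}$, independently.

```python
import itertools

def noise_stability(A, n, rho):
    """Exact Stab_rho[1_A] = P[x in A and y in A] for a rho-correlated pair (x, y)."""
    total = 0.0
    for x in A:
        for y in A:
            agree = sum(xi == yi for xi, yi in zip(x, y))
            # probability of y given x: coordinates agree w.p. (1+rho)/2 each
            total += ((1 + rho) / 2) ** agree * ((1 - rho) / 2) ** (n - agree)
    return total / 2 ** n                    # x is uniform on the cube

n = 4
cube = list(itertools.product([1, -1], repeat=n))
subcube = [x for x in cube if x[0] == 1 and x[1] == 1]       # volume 1/4
ball = [x for x in cube if sum(v == -1 for v in x) <= 1]     # volume 5/16

def sse_holds(A, rho):
    """Compare Stab_rho[1_A] with the bound alpha^(2/(1+rho))."""
    alpha = len(A) / 2 ** n
    return noise_stability(A, n, rho) <= alpha ** (2 / (1 + rho)) + 1e-12

results = [sse_holds(A, r / 10) for A in (subcube, ball) for r in range(11)]
print(all(results))
```

For the subcube the stability can also be computed by hand as $\left(\frac{1+\rho}{4}\right)^2$, which matches the output of `noise_stability`.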

In the following two theorems, bounds are given on the Fourier weight up to a certain level $k$.

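Theorem 2.3.16 (Level-k inequalities (page 264 in [6])). Suppose a non-empty set $A \subseteq \{\pm1\}^n$ has volume $\alpha \in (0,1]$, i.e. $\mathbb{E}[1_A] = \alpha$. Furthermore take $k \in \mathbb{Z}_{>0}$ such that $k \le 2\ln(\alpha^{-1})$. Then

$$W^{\le k}[1_A] \le \left(\frac{2e}{k}\ln(\alpha^{-1})\right)^k \alpha^2.$$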

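Proof. Let $0 < \rho \le 1$. Writing out the Fourier weights of $1_A$ at degrees at most $k$:

$$\rho^k W^{\le k}[1_A] = \rho^k\sum_{|S|\le k}\widehat{1_A}(S)^2 = \sum_{|S|\le k}\rho^k\,\widehat{1_A}(S)^2 \le \sum_{|S|\le k}\rho^{|S|}\,\widehat{1_A}(S)^2 \le \sum_{S\subseteq[n]}\rho^{|S|}\,\widehat{1_A}(S)^2$$

$$= \Big\langle \sum_{S\subseteq[n]}\rho^{|S|}\,\widehat{1_A}(S)\chi_S \,\Big|\, \sum_{T\subseteq[n]}\widehat{1_A}(T)\chi_T\Big\rangle = \langle T_\rho 1_A \mid 1_A\rangle = \mathrm{Stab}_\rho[1_A] \le \alpha^{\frac{2}{1+\rho}},$$

where the last step holds by the Small-Set Expansion Theorem.

When $\rho = 1$, one has $\alpha^{\frac{2}{1+\rho}} = \alpha \le 1 = \alpha^{2(1-\rho)}$. Now assume $\rho < 1$. Then $\frac{1}{1+\rho} = \sum_{n\ge0}(-\rho)^n = 1 - \rho + \rho^2 - \rho^3 + \ldots$, and $\rho^{2m} \ge \rho^{2m+1}$ implies $\rho^{2m} - \rho^{2m+1} \ge 0$ for every $m$; hence all terms higher than first order can be estimated away to give the inequality $\frac{1}{1+\rho} \ge 1 - \rho$. Since $\alpha \le 1$, this yields $\alpha^{\frac{2}{1+\rho}} \le \alpha^{2(1-\rho)}$. Therefore

$$W^{\le k}[1_A] \le \rho^{-k}\alpha^{2(1-\rho)} \qquad (2.12)$$

holds for each $0 < \rho \le 1$. We now want to minimize the right-hand side. Setting the derivative of the right side with respect to $\rho$ to zero gives

$$-k\rho^{-k-1}\alpha^{2(1-\rho)} + \rho^{-k}\cdot\big(-2\ln(\alpha)\big)\alpha^{2(1-\rho)} = 0 \iff -k\rho^{-k-1} + \rho^{-k}\cdot\big(-2\ln(\alpha)\big) = 0$$

$$\iff k + 2\rho\ln(\alpha) = 0 \iff \rho = \frac{k}{2\ln(\alpha^{-1})} \le \frac{k}{k} = 1,$$

so that $\rho$ satisfies the conditions of the Small-Set Expansion Theorem. Substituting this value in the right-hand side of equation 2.12 gives

$$\rho^{-k}\alpha^{2(1-\rho)} = \left(\frac{2}{k}\ln(\alpha^{-1})\right)^k \alpha^{2\left(1-\frac{k}{2\ln(\alpha^{-1})}\right)} = \left(\frac{2e}{k}\ln(\alpha^{-1})\right)^k\alpha^2,$$

since $\alpha^{-\frac{k}{\ln(\alpha^{-1})}} = \alpha^{\frac{k}{\ln(\alpha)}} = e^{\frac{k}{\ln(\alpha)}\cdot\ln(\alpha)} = e^k$. We conclude $W^{\le k}[1_A] \le \left(\frac{2e}{k}\ln(\alpha^{-1})\right)^k\alpha^2$.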
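The Level-k inequalities can be checked by brute force on a small example. The sketch below is illustrative only: the names `fourier_weights` and `level_k_bound`, and the choice of $A$ as a subcube of $\{\pm1\}^6$ with three fixed coordinates, are assumptions for this example. It computes every Fourier coefficient $\widehat{1_A}(S) = \mathbb{E}_x[1_A(x)\chi_S(x)]$ directly.

```python
import itertools
import math

def fourier_weights(A, n):
    """Return [W^{=0}, ..., W^{=n}] of the indicator 1_A on {-1, 1}^n (brute force)."""
    weights = [0.0] * (n + 1)
    for S in itertools.product([0, 1], repeat=n):        # S encoded as a 0/1 mask
        # hat{1_A}(S) = E_x[1_A(x) * prod_{i in S} x_i]; only x in A contribute
        coeff = sum(math.prod(x[i] for i in range(n) if S[i]) for x in A) / 2 ** n
        weights[sum(S)] += coeff ** 2
    return weights

def level_k_bound(alpha, k):
    """Right-hand side ((2e/k) ln(1/alpha))^k * alpha^2 of the Level-k inequality."""
    return (2 * math.e / k * math.log(1 / alpha)) ** k * alpha ** 2

n = 6
# Hypothetical example set: the subcube where the first three bits equal +1.
A = [x for x in itertools.product([1, -1], repeat=n) if x[:3] == (1, 1, 1)]
alpha = len(A) / 2 ** n
weights = fourier_weights(A, n)

k_max = int(2 * math.log(1 / alpha))                      # largest admissible k
checks = [sum(weights[: k + 1]) <= level_k_bound(alpha, k)
          for k in range(1, k_max + 1)]
print(alpha, weights[:4], checks)
```

For this subcube the only non-zero coefficients are those with $S \subseteq \{1,2,3\}$, each equal to $\alpha$, so the weights can be verified by hand.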

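Theorem 2.3.17 (Sharp form of the Level-1 inequality (Exercise 9.18 in [6])). Suppose a non-empty set $A \subseteq \{\pm1\}^n$ has volume $\alpha \in (0,1]$, i.e. $\mathbb{E}[1_A] = \alpha$. Then

$$W^{=1}[1_A] \le 2\ln(\alpha^{-1})\alpha^2.$$

Proof. Let $0 < \rho \le 1$. Then

$$\rho W^{=1}[1_A] = \rho\sum_{|S|=1}\widehat{1_A}(S)^2 = \sum_{|S|=1}\rho^{|S|}\,\widehat{1_A}(S)^2 \le \sum_{S\ne\emptyset}\rho^{|S|}\,\widehat{1_A}(S)^2 = \sum_{S\subseteq[n]}\rho^{|S|}\,\widehat{1_A}(S)^2 - \widehat{1_A}(\emptyset)^2$$

$$= \mathrm{Stab}_\rho[1_A] - \alpha^2 \quad \text{; since } \mathbb{E}[1_A] = \widehat{1_A}(\emptyset)$$

$$\le \alpha^{\frac{2}{1+\rho}} - \alpha^2 \quad \text{by the Small-Set Expansion Theorem.}$$

Hence

$$W^{=1}[1_A] \le \frac{1}{\rho}\left(\alpha^{\frac{2}{1+\rho}} - \alpha^2\right).$$

Taking the limit as $\rho \downarrow 0$ and applying l'Hôpital's rule, it follows that

$$W^{=1}[1_A] \le \lim_{\rho\downarrow0}\frac{\alpha^{\frac{2}{1+\rho}} - \alpha^2}{\rho} = \lim_{\rho\downarrow0}\frac{-\frac{1}{(1+\rho)^2}\ln(\alpha^2)\,\alpha^{\frac{2}{1+\rho}}}{1} = 2\ln(\alpha^{-1})\alpha^2,$$

which proves the theorem.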

As promised, we will improve the bound given in Proposition 2.1.9. The theorem that gives this bound, named after Mei-Chu Chang, has a more general version, stated by Chang as Lemma 3.1 in [2]. That article touches upon the additive combinatorics described in chapter 3. We focus on a version for the Hamming cube, from O'Donnell [5], in which $\mathbb{F}_2^n$ is used to describe the domain of both original and Fourier transformed Boolean functions.

Again the dimension of a span of a subset of $2^{[n]}$ is to be interpreted as the dimension of the span of the corresponding set in $\mathbb{F}_2^n$.

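Theorem 2.3.18. [Chang's lemma] Let $A \subseteq \mathbb{F}_2^n$ have volume $\alpha = \mathbb{E}[1_A] = \frac{|A|}{2^n}$, and let $\epsilon > 0$. Define $d := \dim(\mathrm{Sp}(\mathrm{Spec}_\epsilon[\varphi_A]))$, using the $\epsilon$-spectrum as defined in Definition 2.1.6. Then

$$d \le \frac{2}{\epsilon^2}\ln(\alpha^{-1}).$$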

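Proof. In this proof we will use $(\mathbb{F}_2^n, +)$ and $\chi_z(x) = (-1)^{x\cdot z}$ to describe the Fourier basis.

Let $\Gamma \subseteq \mathrm{Spec}_\epsilon[\varphi_A]$ be a maximal linearly independent subset (which can be generated by iterating across elements and adding those from $\mathrm{Spec}_\epsilon[\varphi_A]$ which are not already in the span of $\Gamma$). Since $\Gamma$ is maximal we have $|\Gamma| = \dim(\mathrm{Sp}(\Gamma)) = d$, so one can write $\Gamma = \{v_1, \ldots, v_d\}$.

Now let $M : \mathbb{F}_2^n \to \mathbb{F}_2^n$ be an invertible linear map such that for all $i \in \{1, \ldots, d\}$ we have $v_i = Me_i$.

Define $\varphi := \varphi_A$ and $\psi := \varphi \circ M^{-T}$, where $M^{-T}$ is the inverse transpose of $M$. Then

$$\widehat{\psi}(e_i) = \widehat{(\varphi\circ M^{-T})}(e_i) = \mathbb{E}_x\big[\varphi(M^{-T}x)(-1)^{x\cdot e_i}\big] = \mathbb{E}_y\big[\varphi(y)(-1)^{M^Ty\cdot e_i}\big] = \mathbb{E}_y\big[\varphi(y)(-1)^{y\cdot Me_i}\big]$$

$$= \mathbb{E}_y\big[\varphi(y)\chi_{Me_i}(y)\big] = \widehat{\varphi}(Me_i) = \widehat{\varphi}(v_i).$$

Since one chooses $x$ uniformly if and only if one chooses $y = M^{-T}x$ uniformly, the map $\psi$ is a density: $\mathbb{E}_x[\psi(x)] = \mathbb{E}_x[\varphi(M^{-T}x)] = \mathbb{E}_y[\varphi(y)] = 1$.

This proves that without loss of generality we can assume that $\Gamma = \{e_1, \ldots, e_d\}$, since the transformation preserves the volume of the set considered. Using the Level-1 inequality and $\varphi_A = \frac{2^n}{|A|}1_A = \frac{1}{\alpha}1_A$, one has

$$2\ln(\alpha^{-1})\alpha^2 \ge W^{=1}[1_A] = W^{=1}[\alpha\varphi_A] = \alpha^2 W^{=1}[\varphi_A].$$

Hence

$$2\ln(\alpha^{-1}) \ge W^{=1}[\varphi_A] \ge \sum_{i=1}^d\widehat{\varphi_A}(e_i)^2 \ge \sum_{i=1}^d\epsilon^2 = d\epsilon^2,$$

which implies $d \le \frac{2}{\epsilon^2}\ln(\alpha^{-1})$, proving the theorem.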

Comparing this inequality to Proposition 2.1.9, where we obtained a bound of $d \le \epsilon^{-2}\alpha^{-1}$, this is a significant improvement. Suppose we take $A = \mathbb{F}_2^n$, so $\alpha = 1$, i.e. $\varphi_A = \chi_\emptyset \equiv 1$. Then only the empty set has Fourier coefficient at least $\epsilon$, since $\widehat{\chi_\emptyset}(S) = \langle\chi_\emptyset \mid \chi_S\rangle = \delta_{\emptyset,S}$. Therefore $\mathrm{Spec}_\epsilon[\varphi_A] = \{\emptyset\}$. This corresponds to the null space $\{0\} \subseteq \mathbb{F}_2^n$, which has dimension zero. While the bound derived from Parseval's identity does not give a sharp estimate, it is correctly given by Chang's lemma, where $d \le 2\epsilon^{-2}\ln(1) = 0$.

It is worthwhile to note that the Level-k inequalities and Chang's lemma are independent of the domain of the Boolean functions involved.
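Chang's lemma can be tested on small sets by computing the spectrum explicitly. The following sketch is an illustration, not part of the thesis: vectors of $\mathbb{F}_2^n$ are encoded as integer bitmasks, the function names are hypothetical, and it is assumed that the $\epsilon$-spectrum of Definition 2.1.6 consists of all $z$ with $|\widehat{\varphi_A}(z)| \ge \epsilon$. The dimension of the span is found by Gaussian elimination over $\mathbb{F}_2$.

```python
import math

def fourier_coefficient(A, z):
    """hat{phi_A}(z) = E_{a ~ A}[(-1)^{a . z}], vectors of F_2^n stored as bitmasks."""
    return sum(-1 if bin(a & z).count("1") % 2 else 1 for a in A) / len(A)

def spectrum(A, n, eps):
    # Assumption: the eps-spectrum collects all z with |hat{phi_A}(z)| >= eps.
    return [z for z in range(2 ** n) if abs(fourier_coefficient(A, z)) >= eps]

def rank_f2(vectors):
    """Dimension of the span over F_2, by elimination on the highest set bit."""
    pivots = {}                               # highest set bit -> basis vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]                  # cancel the leading bit and continue
    return len(pivots)

def chang_check(A, n, eps):
    """Return (d, bound) with d = dim Sp(Spec_eps) and bound = 2 ln(1/alpha) / eps^2."""
    alpha = len(A) / 2 ** n
    return rank_f2(spectrum(A, n, eps)), 2 * math.log(1 / alpha) / eps ** 2

n = 4
subspace = [0b0000, 0b0001, 0b0010, 0b0011]   # hypothetical example, alpha = 1/4
print(chang_check(subspace, n, 1.0))
print(chang_check(list(range(2 ** n)), n, 0.5))
```

The second call reproduces the case $A = \mathbb{F}_2^n$ discussed above, where both the dimension and the bound are zero.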


3 Additive Combinatorics

Additive combinatorics is an area within mathematics that considers estimates, often combinatorial, involving addition and subtraction within arbitrary subsets of groups. Working over $\mathbb{F}_2$, subtraction is exactly the same as addition, which gives additional structure. At first sight it might not look like this area of mathematics has anything to do with Fourier analysis. The convolution of Boolean functions from Definition 2.1.10 plays a large role in allowing Fourier analysis to be used as a toolset in additive combinatorics on $\mathbb{F}_2^n$; addition can be turned into so-called shifts, convolutions with densities on singletons. The largest result from the previous chapter, Chang's lemma (Theorem 2.3.18), is going to be needed.

The proof techniques in this chapter sometimes vary significantly from earlier chapters: a lot more is done with the domain of the Boolean functions involved. For this reason $(\mathbb{F}_2^n, +)$ is used for the domain of both Boolean functions themselves, and the Fourier transform of Boolean functions. Since Chang's lemma is independent of the domain of the Boolean functions involved, this is a legitimate choice to make.

This chapter follows a survey written by Lovett [4], with modifications for readability and completion.

First we need a few definitions.

Definition 3.0.1. Let $A \subseteq G$ be a subset of an Abelian group $(G, +)$ and let $k \in \mathbb{Z}_{>0}$. Define the $k$-sumset of $A$ as

$$kA = \Big\{\sum_{j=1}^k a_j : a_j \in A\ \ \forall j \in [k]\Big\}.$$

Let $G$ be a group and $A \subseteq G$. Note that it always holds that $|2A| \ge |A|$; let $a \in A$, then $x \mapsto a + x$ is an injection from $A$ to $2A$. Therefore we are interested in the case that $|2A|$ is not too big in relation to $|A|$, i.e. when for a fixed $K \in \mathbb{R}_{\ge1}$, it holds that $|2A| \le K|A|$. Since one can prove that $|2A| = |A|$ if and only if $A = g + H$ for a certain $g \in G$ and a subgroup $H \le G$, one can view the sets that meet the requirement of not being too big as "almost-cosets". Hence the following definition:

Definition 3.0.2. Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq G$ be a subset of a group $(G, +)$. We say $A$ has doubling $K$ if

$$|2A| \le K|A|.$$

A subset $A$ always has doubling $K$ for $K \ge \frac{|2A|}{|A|}$. Terminology like "small" doubling therefore only makes sense when $K$ is small relative to $\frac{|2A|}{|A|}$.
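The definitions of sumset and doubling are easy to experiment with over $\mathbb{F}_2^n$. In the sketch below (an illustration with hypothetical function names), elements of $\mathbb{F}_2^n$ are encoded as integer bitmasks, so that addition is bitwise XOR. The two example sets are assumptions chosen to show the extremes: a coset of a subspace, and a set of basis vectors.

```python
def sumset(A, k):
    """k-fold sumset kA in F_2^n; elements are integer bitmasks and addition is XOR."""
    result = set(A)
    for _ in range(k - 1):
        result = {s ^ a for s in result for a in A}
    return result

def doubling_constant(A):
    """Smallest K for which A has doubling K, i.e. |2A| / |A|."""
    return len(sumset(A, 2)) / len(A)

# A coset g + H of the subspace H = {0, 1, 2, 3} with g = 4: here |2A| = |A|.
coset = {4 ^ h for h in (0, 1, 2, 3)}

# The standard basis vectors of F_2^4: 2A consists of 0 and all pairwise sums.
basis = {1, 2, 4, 8}

print(doubling_constant(coset), doubling_constant(basis))
```

The coset attains the minimal value $\frac{|2A|}{|A|} = 1$, in line with the characterisation of equality mentioned above, while the basis vectors have a visibly larger ratio.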


Theorem 3.0.3. [Quasi-polynomial Freiman-Ruzsa theorem] Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$. Then there is a subset $A' \subseteq A$ such that $|A'| \ge K^{-O(\log(K)^3)}|A|$ and $|\mathrm{Sp}(A')| \le 2K^5|A'|$.

The theorem just stated follows from another large result that is similar in form, proved originally by Sanders [9]:

Theorem 3.0.4. [Quasi-polynomial Bogolyubov-Ruzsa theorem] Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$. Then there exists a linear subspace $V \subseteq 4A$ such that $|V| \ge K^{-O(\log(K)^3)}|A|$.

This theorem will be proven, and at the end of the chapter the reduction from this theorem to the quasi-polynomial Freiman-Ruzsa theorem will be given.

The name quasi-polynomial refers to the fact that the subset $A'$, respectively the subspace $V$, has size at least $K^{-O(\log(K)^3)}|A|$, a loss that is quasi-polynomial in the doubling constant $K$. Similarly, one can formulate the polynomial Freiman-Ruzsa and polynomial Bogolyubov-Ruzsa conjectures [4]. At the moment of writing it is an open problem to either prove or disprove (one of) these conjectures.

We state the following lemma, to be used a few times over the course of this chapter.

Lemma 3.0.5. [Plünnecke-Ruzsa inequality] Fix $K \in \mathbb{R}_{\ge1}$. Suppose $A \subseteq G$ is a subset of a group such that $|2A| \le K|A|$. Then for each $t \in \mathbb{Z}_{>0}$ it holds that $|tA| \le K^t|A|$.

For a proof of this lemma, we refer to Corollary 6.28 in [12]. We now need the notion of a Freiman homomorphism.

Definition 3.0.6. [Freiman homomorphism] Let $A \subseteq \mathbb{F}_2^n$. A linear map $\phi : \mathbb{F}_2^n \to \mathbb{F}_2^m$ is called a Freiman homomorphism on $A$ of order $t$ if $\phi$ is injective on $tA$, i.e. by linearity for all $a_i, b_i \in A$, $i \in [t]$:

$$\sum_{i=1}^t\phi(a_i) = \sum_{i=1}^t\phi(b_i) \implies \sum_{i=1}^t a_i = \sum_{i=1}^t b_i.$$

Lemma 3.0.7. Let $A \subseteq \mathbb{F}_2^n$ and $t \ge 1$. Choose $m$ (depending on $t$) the smallest integer such that a Freiman homomorphism $\phi : \mathbb{F}_2^n \to \mathbb{F}_2^m$ on $A$ of order $t$ exists. Then $\phi(2tA) = \mathbb{F}_2^m$.

Proof. When $m = n$, the identity map $\mathrm{Id} : \mathbb{F}_2^n \to \mathbb{F}_2^n$ is a Freiman homomorphism of order $t$, which means the minimal $m$ exists.

We know by definition of the image that $\phi(2tA) \subseteq \mathbb{F}_2^m$. Furthermore for $x \in tA$, $0 = 2x \in 2tA$ gives us $0 = \phi(0) \in \phi(2tA)$.

So, suppose $0 \ne y \in \mathbb{F}_2^m$. Let $\psi : \mathbb{F}_2^m \to \mathbb{F}_2^{m-1}$ be a surjective linear map for which it holds that $\psi(y) = 0$ (such a map exists; extend $y$ to a basis of $\mathbb{F}_2^m$ and take the linear map that sends $y$ to $0$ and the remaining $m-1$ basis elements to a basis of $\mathbb{F}_2^{m-1}$. This results in an $(m-1) \times m$ matrix).


Define the map $\phi' = \psi\circ\phi : \mathbb{F}_2^n \to \mathbb{F}_2^{m-1}$, which is linear by composition of linear maps.

Suppose $\phi'$ is injective on $tA$. Then it is a Freiman homomorphism on $A$ of order $t$, which contradicts the minimality of $m$.

Therefore, we know $\phi'$ is not injective on $tA$. This means there are distinct $x_1, x_2 \in tA$ such that (by linearity) $\psi(\phi(x_1+x_2)) = \phi'(x_1+x_2) = 0$, i.e. $\phi(x_1+x_2) = 0$ or $\phi(x_1+x_2) = y$ since $\mathrm{Ker}(\psi) = \{0, y\}$. As $x_1 + x_2 \ne 0$, the first case is impossible by injectivity of $\phi$ on $tA$; therefore $y = \phi(x_1+x_2) \in \phi(2tA)$ since $x_1, x_2 \in tA$, hence $x_1+x_2 \in 2tA$.

Hence also $\mathbb{F}_2^m \subseteq \phi(2tA)$ and so $\mathbb{F}_2^m = \phi(2tA)$.

The notion of a Freiman homomorphism is now going to be applied in proving that the Bogolyubov-Ruzsa theorem only needs to be proven in the context of “large” sets. “Large” is here defined in terms of the doubling constant K ≥ 1.

Proposition 3.0.8. Fix $K \in \mathbb{R}_{\ge1}$. It is sufficient to prove Theorem 3.0.4 for "large" sets, i.e. $A \subseteq \mathbb{F}_2^n$ with doubling $K$ such that $\mathbb{E}[1_A] = \frac{|A|}{2^n} \ge K^{-1}$.

Proof. Fix $K \in \mathbb{R}_{\ge1}$. We assume Theorem 3.0.4 for "large" sets.

Let $A \subseteq \mathbb{F}_2^n$ such that $A$ has doubling $K$.

Replacing $A$ with $A + a$ for any $a \in A$ we may assume $0 \in A$, and since $|A+a| = |A|$, $|2(A+a)| = |2A|$ this does not matter for the theorem.

Let $\phi : \mathbb{F}_2^n \to \mathbb{F}_2^m$ be a minimal Freiman homomorphism of $A$ of order 12. Define $B = \phi(A)$.

Now 0 ∈ A implies φ is a Freiman homomorphism on A of orders t for any t ≤ 12, as one can choose any of the elements ai, bi as zero. In particular for t = 1, 2 we get that

|A| = |B|, |2A| = |2B|. Hence |2B| = |2A| ≤ K|A| = K|B|.

Therefore by Theorem 3.0.4 we know a linear subspace $W \subseteq 4B$ with $|W| \ge K^{-O(\log(K)^3)}|B|$ exists. Also, using Lemma 3.0.7, the fact that $\phi|_A$ maps $A$ onto $B$ and therefore $\phi(tA) = tB$ for $t \in \mathbb{Z}_{>0}$, and Lemma 3.0.5 we have

$$2^m = |\mathbb{F}_2^m| = |\phi(24A)| = |24B| \le K^{24}|B| \implies \mathbb{E}[1_B] = \frac{|B|}{2^m} \ge K^{-24},$$

so $B$ is large in $\mathbb{F}_2^m$.

Since $\phi$ is injective on $12A$, $\phi|_{12A} : 12A \to \phi(12A)$ is a bijection and therefore an inverse $\phi|_{12A}^{-1}$ exists.

Now define $V := \phi|_{12A}^{-1}(W) \subset 4A$; this will be the linear subspace to be found in the theorem. Note that since $0 = \phi(0) \in \phi(A) = B$, one has $W \subseteq 4B \subseteq 12B$, so $V$ is well-defined.

Let $x, y \in V$. Then $x' := \phi(x)$, $y' := \phi(y) \in W$, and since $W$ is a linear subspace we have $z' := x' + y' \in W$. Define $z := \phi|_{12A}^{-1}(z')$. Since $x, y, z \in V \subset 4A$, it holds that $x + y + z \in 12A$; lastly, $\phi(x+y+z) = x'+y'+z' = x'+y'+x'+y' = 0$ and due to injectivity of the linear map $\phi$ on $12A$ we therefore have $x + y + z = 0 \iff x + y = z \in V$. We conclude that $V \subset 4A$ is a linear subspace.

The following result shows that a [0, 1]-valued Boolean function, when smoothed, is invariant to many shifts.


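Lemma 3.0.9. [Croot-Sisask] Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq \mathbb{F}_2^n$ such that $\mathbb{E}[1_A] = \frac{|A|}{2^n} \ge K^{-1}$. Furthermore let $f : \mathbb{F}_2^n \to [0,1]$, $p \ge 1$, and $\epsilon > 0$. Then there exists a subset $X \subseteq \mathbb{F}_2^n$ with $\mathbb{E}[1_X] = \frac{|X|}{2^n} \ge K^{-O(p/\epsilon^2)}$ such that for every $x \in X$:

$$\|\varphi_x * \varphi_A * f - \varphi_A * f\|_p \le \epsilon.$$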

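Proof. Take the assumptions of the lemma, let $l \ge C\,4^{1+\frac1p}\,\frac{p}{\epsilon^2}$, and let $(a_1, \ldots, a_l)$ be a sequence of independent random variables distributed uniformly over $A$.

Define the random variables

$$z := \Big\|\varphi_A * f - \frac1l\sum_{i=1}^l\varphi_{a_i} * f\Big\|_p, \qquad z_i := \varphi_A * f - \varphi_{a_i} * f.$$

Then

$$\Big\|\frac1l\sum_{i=1}^l z_i\Big\|_p = \Big\|\frac1l\sum_{i=1}^l\big(\varphi_A * f - \varphi_{a_i} * f\big)\Big\|_p = \Big\|\varphi_A * f - \frac1l\sum_{i=1}^l\varphi_{a_i} * f\Big\|_p = z,$$

and all $z_i(x) = \varphi_A * f(x) - \varphi_{a_i} * f(x) = \mathbb{E}_{a\sim A}[f(x+a)] - f(x+a_i) \in [-1, 1]$ have mean 0 (as both $a, a_i \sim A$ uniformly). We now have, by simply writing out:

$$\mathbb{E}_{a_1,\ldots,a_l\sim A}[z^p] = \mathbb{E}_{a_i}\Big[\Big\|\frac1l\sum_{i=1}^l z_i\Big\|_p^p\Big] = \mathbb{E}_{a_i,\,x\sim\mathbb{F}_2^n}\Big[\Big|\frac1l\sum_{i=1}^l z_i(x)\Big|^p\Big] \le \Big(\frac{Cp}{l}\Big)^{\frac p2},$$

using Corollary 1.1.8. As a consequence,

$$\mathbb{P}_{a_1,\ldots,a_l\sim A}\Big[z \le \frac\epsilon2\Big] \ge 1 - \mathbb{P}\Big[z \ge \frac\epsilon2\Big] = 1 - \mathbb{P}\Big[z^p \ge \Big(\frac\epsilon2\Big)^p\Big] \ge 1 - \frac{\mathbb{E}[z^p]}{(\epsilon/2)^p} \quad \text{; by Markov's inequality}$$

$$\ge 1 - \Big(\frac2\epsilon\Big)^p\Big(\frac{Cp}{l}\Big)^{\frac p2} = 1 - \Big(\frac{4}{\epsilon^2}\cdot\frac{Cp}{l}\Big)^{\frac p2} \ge 1 - \frac12 = \frac12 \quad \text{; using the choice of } l.$$

Now define

$$S(A) := \Big\{(a_1, \ldots, a_l) \in A^l : \Big\|\varphi_A * f - \frac1l\sum_{i=1}^l\varphi_{a_i} * f\Big\|_p \le \frac\epsilon2\Big\} \subseteq \mathbb{F}_2^{nl}.$$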


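Since $\mathbb{P}_{a_1,\ldots,a_l\sim A}[z \le \frac\epsilon2] \ge \frac12$, the sequences contained in $S(A)$ have to make up at least half of $A^l$. Furthermore shifting by $x \in \mathbb{F}_2^n$ is a bijection (over any finite set), so

$$|S(A+x)| \ge \frac12|A|^l \ge \frac12\,2^{nl}K^{-l}.$$

Defining $w(a) := \sum_{x\in\mathbb{F}_2^n}1[a \in S(A+x)]$ for $a \in \mathbb{F}_2^{nl}$, we see that

$$\sum_{a\in\mathbb{F}_2^{nl}}w(a) = \sum_{a\in\mathbb{F}_2^{nl}}\sum_{x\in\mathbb{F}_2^n}1[a \in S(A+x)] = \sum_{x\in\mathbb{F}_2^n}|S(A+x)| \ge 2^n\cdot\frac12\,2^{nl}K^{-l}.$$

Hence, choosing $a' \in \mathbb{F}_2^{nl}$ such that $w(a')$ is maximal, we have

$$w(a') \ge \frac{1}{2^{nl}}\sum_{a\in\mathbb{F}_2^{nl}}w(a) \ge \frac{2^n\cdot\frac12\,2^{nl}K^{-l}}{2^{nl}} = \frac12\,2^nK^{-l}.$$

Define $X' := \{x \in \mathbb{F}_2^n : a' \in S(A+x)\}$. Let $y \in X'$ be fixed, and define $X := X' + y$. Note that $|X| = |X'| = w(a')$, which is where the bound on the volume of $X$ comes from. Every $x \in X$ is of the form $x = z + y$ for $z \in X'$. Now, writing out, we have

$$\|\varphi_x * \varphi_A * f - \varphi_A * f\|_p = \|\varphi_y * (\varphi_{A+x} * f - \varphi_A * f)\|_p = \|\varphi_{A+x+y} * f - \varphi_{A+y} * f\|_p,$$

using that the $p$-norm is invariant under shifts (Lemma 2.1.12). Working towards an estimate, and using $A + x + y = A + z$, this is equal to

$$\Big\|\varphi_{A+z} * f - \frac1l\sum_{i=1}^l\varphi_{a'_i} * f + \frac1l\sum_{i=1}^l\varphi_{a'_i} * f - \varphi_{A+y} * f\Big\|_p$$

$$\le \Big\|\varphi_{A+z} * f - \frac1l\sum_{i=1}^l\varphi_{a'_i} * f\Big\|_p + \Big\|\frac1l\sum_{i=1}^l\varphi_{a'_i} * f - \varphi_{A+y} * f\Big\|_p \le \frac\epsilon2 + \frac\epsilon2 = \epsilon,$$

where the second inequality holds given that $a' \in S(A+z)$ and $a' \in S(A+y)$ imply that the first, respectively second term are both at most $\frac\epsilon2$. This concludes the proof.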

A more general version of the previous lemma has been proven by Sanders [9]. This is done in a more representation theoretic context.

Note that

$$\mathbb{P}_{a_1,a_2\sim A}[a_1 + a_2 \in 2A] = 1,$$

by definition. The following result, which is a corollary of Croot-Sisask, shows that shifting $2A$ by elements from a $t$-sumset does not lower this probability much.


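Lemma 3.0.10. Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq \mathbb{F}_2^n$ such that $\frac{|A|}{2^n} \ge K^{-1}$. Let $t = O(\log(K))$, and $\delta \in (0,1)$. Then there exists a subset $X \subseteq \mathbb{F}_2^n$ with $\mathbb{E}[1_X] = \frac{|X|}{2^n} \ge K^{-O(\log(K)^3/\delta^2)}$ such that for every $y \in tX$ it holds that

$$\mathbb{P}_{a_1,a_2\sim A}[a_1 + a_2 + y \in 2A] \ge 1 - \delta.$$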

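Proof. Take the assumptions from the lemma. Set $f = 1_{2A}$, $p = \log(K)$, $\epsilon = \frac{\delta}{2t}$ and take the $X \subseteq \mathbb{F}_2^n$ from Lemma 3.0.9. Then

$$\mathbb{E}[1_X] \ge K^{-O(p/\epsilon^2)} = K^{-O(\log(K)^3/\delta^2)}.$$

We first claim that for every $y \in tX$

$$\|\varphi_y * \varphi_A * 1_{2A} - \varphi_A * 1_{2A}\|_p \le t\epsilon,$$

which will be proven using a telescoping argument. Let $y = \sum_{i=1}^t y_i \in tX$ with $y_i \in X$, define $s_i := \sum_{j=1}^i y_j$ for $i \in \{1, \ldots, t\}$, and $g := \varphi_A * 1_{2A}$. Then $y = s_t$ and $0 = s_0$, hence we have

$$\|\varphi_y * g - g\|_p = \Big\|\varphi_{s_t} * g + \sum_{i=1}^{t-1}\varphi_{s_i} * g - \sum_{i=2}^{t}\varphi_{s_{i-1}} * g - \varphi_{s_0} * g\Big\|_p = \Big\|\sum_{i=1}^t\varphi_{s_i} * g - \sum_{i=1}^t\varphi_{s_{i-1}} * g\Big\|_p = \Big\|\sum_{i=1}^t\big(\varphi_{s_i} * g - \varphi_{s_{i-1}} * g\big)\Big\|_p,$$

where $\varphi_{s_t}$ and $\varphi_{s_0}$ have been incorporated into their corresponding sums. Continuing, and applying the triangle inequality,

$$\|\varphi_y * g - g\|_p \le \sum_{i=1}^t\|\varphi_{s_{i-1}+y_i} * g - \varphi_{s_{i-1}} * g\|_p = \sum_{i=1}^t\|\varphi_{s_{i-1}} * (\varphi_{y_i} * g - g)\|_p = \sum_{i=1}^t\|\varphi_{y_i} * g - g\|_p.$$

Here the common factor $\varphi_{s_{i-1}}$ is taken out and then can be eliminated due to invariance of the $p$-norm under shifts (Lemma 2.1.12). Finally, for each $i \in [t]$:

$$\|\varphi_{y_i} * \varphi_A * 1_{2A} - \varphi_A * 1_{2A}\|_p \le \epsilon,$$

by Lemma 3.0.9 since $y_i \in X$ for each $i$. Hence

$$\|\varphi_y * \varphi_A * 1_{2A} - \varphi_A * 1_{2A}\|_p \le \sum_{i=1}^t\epsilon = t\epsilon,$$

and the claim is proven. Substituting $\epsilon = \frac{\delta}{2t}$ now gives for all $y \in tX$

$$\|\varphi_y * \varphi_A * 1_{2A} - \varphi_A * 1_{2A}\|_p \le \frac\delta2.$$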


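Let $q$ be the Hölder conjugate of $p$. We have

$$\|\varphi_A\|_q = \frac{2^n}{|A|}\mathbb{E}\big[|1_A|^q\big]^{\frac1q} = \frac{2^n}{|A|}\mathbb{E}[1_A]^{\frac1q} = \frac{2^n}{|A|}\Big(\frac{|A|}{2^n}\Big)^{\frac1q} = \Big(\frac{2^n}{|A|}\Big)^{1-\frac1q} = \Big(\frac{2^n}{|A|}\Big)^{\frac1p} = \Big(\frac{2^n}{|A|}\Big)^{\frac{1}{\log(K)}} \le K^{\frac{1}{\log(K)}} = 2.$$

Note that the case $K = 1$ is irrelevant as then $A = \mathbb{F}_2^n$ and the lemma holds trivially.

Let $y \in tX$. From the remarks at equation 2.2 and equation 2.3 we have

$$\mathbb{P}_{a_1,a_2\sim A}[a_1+a_2+y \in 2A] = \langle\varphi_A * \varphi_A * \varphi_y \mid 1_{2A}\rangle = \langle\varphi_y * \varphi_A * 1_{2A} \mid \varphi_A\rangle.$$

We have $\langle\varphi_A * 1_{2A} \mid \varphi_A\rangle = \langle\varphi_A * \varphi_A \mid 1_{2A}\rangle = \mathbb{P}_{a_1,a_2\sim A}[a_1+a_2 \in 2A] = 1$. Hence it follows that

$$\mathbb{P}_{a_1,a_2\sim A}[a_1+a_2+y \in 2A] = 1 - \langle\varphi_A * 1_{2A} \mid \varphi_A\rangle + \langle\varphi_y * \varphi_A * 1_{2A} \mid \varphi_A\rangle = 1 - \langle\varphi_A * 1_{2A} - \varphi_y * \varphi_A * 1_{2A} \mid \varphi_A\rangle.$$

Finally with Hölder's inequality (Theorem 1.1.3) and $\|\varphi_A\|_q \le 2$ as derived,

$$\mathbb{P}_{a_1,a_2\sim A}[a_1+a_2+y \in 2A] \ge 1 - \|\varphi_A * 1_{2A} - \varphi_y * \varphi_A * 1_{2A}\|_p\,\|\varphi_A\|_q \ge 1 - \frac\delta2\cdot2 = 1 - \delta.$$

This proves the lemma.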
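The probability appearing in the lemma can be computed exactly for small sets, which makes the effect of a shift $y$ visible. The sketch below is an illustration only: the function name and the example set (the four standard basis vectors of $\mathbb{F}_2^4$, encoded as bitmasks with XOR as addition) are assumptions made here, and the shifts are chosen by hand rather than taken from a set $X$ as in the lemma.

```python
from fractions import Fraction

def shifted_sum_probability(A, y):
    """Exact P[a1 + a2 + y in 2A] for independent uniform a1, a2 in A (XOR arithmetic)."""
    A = list(A)
    two_A = {a ^ b for a in A for b in A}
    hits = sum((a1 ^ a2 ^ y) in two_A for a1 in A for a2 in A)
    return Fraction(hits, len(A) ** 2)

A = [0b0001, 0b0010, 0b0100, 0b1000]     # hypothetical example set

print(shifted_sum_probability(A, 0b0000))   # no shift
print(shifted_sum_probability(A, 0b0011))   # shift by an element of 2A
print(shifted_sum_probability(A, 0b0001))   # shift by an element of odd weight
```

Here $2A$ consists of the vectors of Hamming weight 0 or 2, so a shift of odd weight always leaves $2A$, while a shift inside $2A$ keeps most sums inside.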

We already have seen a fair share of how Fourier analysis appears in additive combinatorics. We will now prove the quasi-polynomial Bogolyubov-Ruzsa theorem, which was stated to be one of the largest results of this chapter. In the proof the largest result from the previous chapter, Chang's lemma (Theorem 2.3.18), will be applied.

Theorem 3.0.4 (Quasi-polynomial Bogolyubov-Ruzsa theorem). Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$. Then there exists a linear subspace $V \subseteq 4A$ such that $|V| \ge K^{-O(\log(K)^3)}|A|$.

Proof. Take the assumptions of the theorem. For $t = \log(10K)$, using Lemma 3.0.10 with $\delta = \frac{1}{10}$ one has the existence of $X \subseteq \mathbb{F}_2^n$ with volume $\alpha := \mathbb{E}[1_X] \ge K^{-O(\log(K)^3)}$ such that for every $x \in tX$:

$$\mathbb{P}_{a_1,a_2\sim A}[a_1+a_2+x \in 2A] \ge \frac{9}{10}.$$

Hence

$$\mathbb{P}_{a_1,a_2\sim A,\ x\sim tX}[a_1+a_2+x \in 2A] \ge \frac{9}{10}.$$


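Set $V := \mathrm{Spec}_{\frac12}[\varphi_X]^\perp$ as the vector space that lies orthogonal to all vectors in the $\frac12$-spectrum of $\varphi_X$. Using Chang's lemma (Theorem 2.3.18) we know with $d := \dim(\mathrm{Sp}(\mathrm{Spec}_{\frac12}[\varphi_X]))$ that $d \le \frac{2}{(1/2)^2}\ln(\alpha^{-1}) = 8\ln(2)\log(\alpha^{-1})$. Therefore, using that each free coordinate takes exactly two values in $\mathbb{F}_2^n$ and the dimension theorem:

$$|V| = 2^{\dim(V)} = 2^n2^{-d} \ge |A|\,2^{-8\ln(2)\log(\alpha^{-1})} = 2^{\log(\alpha^{8\ln(2)})}|A| = \alpha^{8\ln(2)}|A| \ge K^{-O(\log(K)^3)}|A|.$$

It remains to show that $V \subset 4A$. To avoid a clash with the volume $\alpha$, Fourier indices are denoted by $\gamma$. Write

$$\mathbb{P}_{a_1,a_2\sim A,\ x\sim tX}[a_1+a_2+x \in 2A] = \langle\varphi_A^{*2} * \varphi_X^{*t} \mid 1_{2A}\rangle = \langle\widehat{\varphi_A}^2\,\widehat{\varphi_X}^t \mid \widehat{1_{2A}}\rangle = \sum_{\gamma\in\mathbb{F}_2^n}\widehat{\varphi_A}(\gamma)^2\,\widehat{\varphi_X}(\gamma)^t\,\widehat{1_{2A}}(\gamma),$$

and, with $v$ drawn uniformly from $V$,

$$\mathbb{P}_{a_1,a_2\sim A,\ x\sim tX,\ v\sim V}[a_1+a_2+x+v \in 2A] = \langle\varphi_A^{*2} * \varphi_X^{*t} * \varphi_V \mid 1_{2A}\rangle = \sum_{\gamma\in\mathbb{F}_2^n}\widehat{\varphi_A}(\gamma)^2\,\widehat{\varphi_X}(\gamma)^t\,\widehat{\varphi_V}(\gamma)\,\widehat{1_{2A}}(\gamma).$$

We have

$$\widehat{\varphi_V}(\gamma) = \mathbb{E}_{y\sim\mathbb{F}_2^n}[\varphi_V(y)\chi_\gamma(y)] = \mathbb{E}_{y\sim V}[\chi_\gamma(y)] = \frac{1}{|V|}\sum_{y\in V}(-1)^{y\cdot\gamma}.$$

For $\gamma \in V^\perp$ we have $\gamma\cdot y = 0$ for all $y \in V$, so

$$\widehat{\varphi_V}(\gamma) = \frac{1}{|V|}\sum_{y\in V}(-1)^0 = 1.$$

When $\gamma \notin V^\perp$, there is a $y_0 \in V$ with $y_0\cdot\gamma = 1$. Since $y \mapsto y + y_0$ is a bijection of $V$, we have

$$\sum_{y\in V}(-1)^{y\cdot\gamma} = \sum_{y\in V}(-1)^{(y+y_0)\cdot\gamma} = -\sum_{y\in V}(-1)^{y\cdot\gamma}, \qquad (3.1)$$

so the sum vanishes and $\widehat{\varphi_V}(\gamma) = 0$.

Hence $\widehat{\varphi_V}(\gamma) = 1(\gamma \in V^\perp)$. Combining these results,

$$\mathbb{P}[a_1+a_2+x \in 2A] - \mathbb{P}[a_1+a_2+x+v \in 2A] = \sum_{\gamma\notin V^\perp}\widehat{\varphi_A}(\gamma)^2\,\widehat{\varphi_X}(\gamma)^t\,\widehat{1_{2A}}(\gamma).$$

For this sum $\gamma \notin V^\perp = \mathrm{Spec}_{\frac12}[\varphi_X]^{\perp\perp} \supseteq \mathrm{Spec}_{\frac12}[\varphi_X]$, which gives $\gamma \notin \mathrm{Spec}_{\frac12}[\varphi_X]$. Therefore $|\widehat{\varphi_X}(\gamma)|^t < \frac{1}{2^t}$, and $|\widehat{1_{2A}}(\gamma)| = |\mathbb{E}[1_{2A}\chi_\gamma]| \le \mathbb{E}[|1_{2A}|] \le 1$, so

$$\mathbb{P}[a_1+a_2+x \in 2A] - \mathbb{P}[a_1+a_2+x+v \in 2A] \le \sum_{\gamma\notin V^\perp}\widehat{\varphi_A}(\gamma)^2\,2^{-t} \le \sum_{\gamma\in\mathbb{F}_2^n}\widehat{\varphi_A}(\gamma)^2\,2^{-t} = 2^{-t}\|\varphi_A\|_2^2 \quad \text{; by Parseval's identity}$$

$$= 2^{-t}\Big(\frac{2^n}{|A|}\Big)^2\|1_A\|_2^2 = 2^{-t}\Big(\frac{2^n}{|A|}\Big)^2\mathbb{E}[1_A^2] = 2^{-t}\Big(\frac{2^n}{|A|}\Big)^2\frac{|A|}{2^n} \le 2^{-t}K = 2^{\log((10K)^{-1})}K = \frac{1}{10}.$$

It follows that $\mathbb{P}[a_1+a_2+x+v \in 2A] \ge \mathbb{P}[a_1+a_2+x \in 2A] - \frac{1}{10} \ge \frac45$. Define the random variable $b := a_1+a_2+x$, which takes values in $2A+tX$, then

$$\mathbb{P}_{b,\ v\sim V}[v \in (2A+b)] = \mathbb{P}_{b,\ v\sim V}[b+v \in 2A] \ge \frac45.$$

Now choosing $b'$ such that $\mathbb{P}_{v\sim V}[v \in (2A+b')]$ is maximal, and using that a maximum is at least the average over $b$, we have

$$\frac{|V\cap(2A+b')|}{|V|} = \mathbb{P}_{v\sim V}[v \in (2A+b')] \ge \mathbb{E}_b\big[\mathbb{P}_{v\sim V}[v \in (2A+b)]\big] = \mathbb{P}_{b,\ v\sim V}[v \in (2A+b)] \ge \frac45. \qquad (3.2)$$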

Note that $0 \in 4A$, so fix $0 \ne v \in V$. Define $W := V\cap(2A+b')$. Then $|W| \ge \frac45|V| > \frac{|V|}{2}$. Write out the decomposition $v = v_1 + v_2$ with $v_1, v_2 \in V$, which can be done in $\frac{|V|}{2}$ ways; choose $v_1 \in V$ and set $v_2 = v_1 + v$. Shifting a set by addition with $v$ is a bijection, so the pairs are counted twice. Define the sets

$$W_1 := \{w \in W^C : w + v \in W^C\}, \qquad W_2 := \{w \in W^C : w + v \in W\}, \qquad \tilde W := \{w \in W : w + v \in W^C\},$$

where $W^C$ denotes the set complement of $W$ relative to $V$. Note that $\tilde W \subseteq W$ and that $W_2$ and $\tilde W$ are in bijection with each other by a shift with $v$. Now $V = W \,\dot\cup\, W_1 \,\dot\cup\, W_2$ can be written as a disjoint union of sets. Hence with $|W| > \frac{|V|}{2}$, we have

$$2|W| - 2|\tilde W| > |V| - 2|\tilde W| = |W| + |W_1| + |W_2| - 2|\tilde W| = (|W| - |\tilde W|) + |W_1| \ge |W| - |\tilde W|,$$

which implies $|W| - |\tilde W| > 0$ and so $W\setminus\tilde W$ is not empty. This set difference exactly represents the elements of $W$ of which the shift by $v$ is also contained in $W$; since it is not empty there must be a pair $v_1, v_2 \in W \subseteq 2A + b'$ such that $v = v_1 + v_2 \in 2(2A+b') = 4A$.

So, $V \subseteq 4A$, as desired.
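A key step in the proof is that the Fourier transform of the density of a subspace $V$ is the indicator of $V^\perp$. This can be confirmed by direct computation on a small example. The sketch below is illustrative: the function names and the subspace of $\mathbb{F}_2^4$ spanned by $0011$ and $0101$ are assumptions made here, with vectors encoded as bitmasks.

```python
from fractions import Fraction

def dot(x, y):
    """Inner product over F_2 of two bitmask-encoded vectors."""
    return bin(x & y).count("1") % 2

def span(generators):
    """All F_2-linear combinations of the given generators."""
    elements = {0}
    for g in generators:
        elements |= {e ^ g for e in elements}
    return elements

def density_hat(V, gamma):
    """hat{phi_V}(gamma) = E_{y ~ V}[(-1)^{y . gamma}]."""
    return Fraction(sum((-1) ** dot(y, gamma) for y in V), len(V))

n = 4
V = span([0b0011, 0b0101])                  # hypothetical example subspace
V_perp = {g for g in range(2 ** n) if all(dot(g, y) == 0 for y in V)}

indicator_matches = all(
    density_hat(V, g) == (1 if g in V_perp else 0) for g in range(2 ** n)
)
print(sorted(V), sorted(V_perp), indicator_matches)
```

Note that over $\mathbb{F}_2$ a subspace and its orthogonal complement may intersect non-trivially, so the check is done against $V^\perp$ computed from its definition.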

When inspecting the arguments given in the proof of the quasi-polynomial Bogolyubov-Ruzsa theorem, one can remark that the parameters $\delta$, $t$ and the $\frac12$ from the $\frac12$-spectrum can be chosen differently. It only has to hold that

$$\frac{|V\cap(2A+b')|}{|V|} > \frac12$$

(equation 3.2), since the crucial step in the proof is finding a pair of elements of $V\cap(2A+b')$ into which a given element of the candidate vector space can be decomposed.

Having proved the quasi-polynomial Bogolyubov-Ruzsa theorem, we will now use it to derive the quasi-polynomial Freiman-Ruzsa theorem.

Theorem 3.0.3 (Quasi-polynomial Freiman-Ruzsa theorem). Fix $K \in \mathbb{R}_{\ge1}$. Let $A \subseteq \mathbb{F}_2^n$ have doubling $K$. Then there is a subset $A' \subseteq A$ such that $|A'| \ge K^{-O(\log(K)^3)}|A|$ and $|\mathrm{Sp}(A')| \le 2K^5|A'|$.

Proof. Let $A \subseteq \mathbb{F}_2^n$ such that $|2A| \le K|A|$. Then by Theorem 3.0.4 there exists a linear subspace $V \subseteq 4A$ with $|V| \ge K^{-O(\log(K)^3)}|A|$.

Choose $S \subseteq A$ maximally such that for all distinct $s, s' \in S$ we have $s + s' \notin V$ (which holds if and only if $s + V \ne s' + V$ for all distinct $s, s' \in S$). This can be done by writing $\mathbb{F}_2^n = \dot\bigcup_s (s+V)$ as a disjoint union of cosets, with the distinct representatives $s$ chosen from $A$ if possible.

Now for distinct $s, s' \in S$ and $v, v' \in V$ we have $v + v' \in V$ and therefore $s + s' \ne v + v'$, that is $s + v \ne s' + v'$. So the map $(s, v) \mapsto s + v$ from $S\times V$ to $S + V$ is injective (and therefore bijective, since it is surjective by definition of $S + V$), which gives

$$|S||V| = |S\times V| = |S+V| \le |A+V| \le |A+4A| = |5A| \le K^5|A|$$

by Lemma 3.0.5. This implies

$$\frac{1}{|S|} \ge \frac{|V|}{K^5|A|} \ge \frac{K^{-O(\log(K)^3)}}{K^5} \ge K^{-O(\log(K)^3)} \quad\text{and}\quad |V| \le K^5\frac{|A|}{|S|}.$$


Now choose $s \in S$ such that the size $|A'|$ of $A' := A\cap(V+s)$ is maximal. Then $\frac{|A|}{|S|} \le |A'|$, since the $s + V$ are cosets and therefore the sets $A\cap(V+s)$ for $s \in S$ form a partition of $A$:

$$|A| = \Big|\dot\bigcup_{s\in S}A\cap(V+s)\Big| = \sum_{s\in S}|A\cap(V+s)| \le \sum_{s\in S}|A'| = |S||A'|.$$

This gives us the required $|A'| \ge \frac{|A|}{|S|} \ge K^{-O(\log(K)^3)}|A|$.

Finally, since $V$ is a vector space, the span of the coset $V + s$ can at most have twice as many elements as $V$ (working over $\mathbb{F}_2$):

$$|\mathrm{Sp}(A')| \le |\mathrm{Sp}(V+s)| \le 2|\mathrm{Sp}(V)| = 2|V| \le 2K^5\frac{|A|}{|S|} \le 2K^5|A'|.$$
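The last two steps of the proof, partitioning $A$ along the cosets of $V$ and bounding the span of the largest piece, can be illustrated on a toy example. The sketch below is an assumption-laden illustration: the function names, the subspace $V = \{0,1,2,3\} \subseteq \mathbb{F}_2^4$ and the set $A$ are chosen by hand (in particular $V$ is not obtained from the Bogolyubov-Ruzsa theorem), with vectors encoded as bitmasks.

```python
def span(generators):
    """All F_2-linear combinations of the given bitmask-encoded vectors."""
    elements = {0}
    for g in generators:
        elements |= {e ^ g for e in elements}
    return elements

def coset_pieces(A, V):
    """Partition A along the cosets of the subspace V."""
    pieces = {}
    for a in A:
        representative = min(a ^ v for v in V)    # canonical label of the coset a + V
        pieces.setdefault(representative, set()).add(a)
    return pieces

V = {0, 1, 2, 3}                     # hypothetical subspace of F_2^4
A = {4, 5, 6, 8, 13}                 # hypothetical set meeting three cosets of V

pieces = coset_pieces(A, V)
A_prime = max(pieces.values(), key=len)           # largest piece A' = A ∩ (V + s)

print(sorted(A_prime), len(pieces), len(span(A_prime)), 2 * len(V))
```

In this example the largest piece lies in the coset $V + 4$ and its span has exactly $2|V|$ elements, so the factor 2 in the final estimate cannot be dropped in general.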
