
Geometric complexity theory

Jeroen Zuiddam

Thesis MSc Mathematics

Supervisor: prof. dr. Eric Opdam

Title: Geometric complexity theory
Author: Jeroen Zuiddam
Submitted: 25 August 2014
Supervisor: prof. dr. Eric Opdam
Second grader: prof. dr. Harry Buhrman
Program: MSc Mathematics
Coordinator: prof. dr. Ale Jan Homburg

Korteweg-de Vries Institute for Mathematics
Faculty of Science
University of Amsterdam
Science Park 904, 1098 XH Amsterdam
http://www.science.uva.nl/math


Abstract

Geometric complexity theory provides a mathematical framework for attacking problems in complexity theory, by viewing them as orbit separation problems. This thesis is an exposition of recent results in this field, in particular of a result of Kumar [Kum11] regarding occurrence obstructions.

We introduce the reader to two problems that motivate studying geometric complexity theory, namely the permanent versus determinant problem and the matrix multiplication rank problem. Then we focus on the permanent versus determinant problem and outline its translation into an orbit separation problem. Next, we discuss some techniques for separating orbits in the permanent versus determinant setting. Finally, we focus on the occurrence obstruction approach and give an exposition of the negative result of Kumar regarding this approach.


Acknowledgements

I am enormously grateful to Eric Opdam for his supervision of this graduation project. His enthusiasm and open-mindedness motivated me tremendously, and with his optimism he softened every setback.

I thank the KdVI, and Evelien Wallet in particular, for all the practical support. I also spent a lot of time at the CWI in the group of Harry Buhrman. I am very grateful to Harry for this opportunity.

I thank my fellow students, in particular Jason, Rasila, Bart, Bart and Sjoerd, for the countless coffee breaks and the countless fruitful collaborations over the past years.

Finally, I thank Joris, Emma, Laurens, Raoul and Nadine for all their support and distraction.


Contents

1 Two motivating problems
  1.1 Permanent versus determinant
  1.2 Matrix multiplication rank
  1.3 Common approach
A Complexity theory
  A.1 Complexity of families of polynomials
  A.2 Multiplicative complexity of bilinear maps
2 Geometric complexity theory
  2.1 From C-topology to Zariski topology
  2.2 Permanent versus determinant
  2.3 Strategies
  2.4 Language of symmetric tensors
B Representation theory
  B.1 Representations
  B.2 Partitions, diagrams and tableaux
  B.3 Symmetric group
  B.4 General linear group
  B.5 Schur-Weyl duality
  B.6 Branching
C Algebraic geometry
  C.1 Affine algebraic geometry
  C.2 Linear algebraic groups
  C.3 Algebraic actions
3 Separation techniques
  3.1 Stabilisers
  3.2 Explicit highest weight vectors
  3.3 Semigroup property
  3.4 Inheritance property
  3.5 Stability
4 Occurrence obstructions
  4.1 The determinant orbit
  4.2 The ambient space
  4.3 The determinant orbit closure


Introduction

Geometric complexity theory is an approach to solving lower bound problems in computational complexity theory. The first step in this approach is to translate the problem at hand into the problem of separating certain orbit closures. The second step is to view the orbit closures as geometric objects. One then employs tools from algebraic geometry and representation theory to separate the orbit closures.

Example. We give an example of an orbit closure separation problem. Let V be the set of polynomials of the form f(x, y) = ux^2 + vxy + wy^2 with u, v, w complex numbers. Let

A = (a b; c d)

be an invertible matrix. Then we define Af to be the polynomial

(Af)(x, y) = u(ax + by)^2 + v(ax + by)(cx + dy) + w(cx + dy)^2.

Note that Af is again in V. Now consider the set of all polynomials of the form Af for some invertible 2 × 2 matrix A. This is the orbit of f. Now view V as the three-dimensional vector space C^3 with basis x^2, xy, y^2, endowed with the Euclidean topology. The orbit closure of f is the Euclidean closure of the orbit of f, that is, the orbit of f plus its limit points in V. Now consider the polynomials x^2 and xy in V. The sequence (f_n) defined by

f_n(x, y) := (1 0; 1 1/n) xy = x(x + (1/n) y)

is contained in the orbit of xy and converges to x^2. Hence x^2 is in the orbit closure of xy, and it is not hard to see that therefore

orbit closure of x^2 ⊆ orbit closure of xy.

The orbit separation problem in this case is to show that the inclusion is strict: orbit closure of x^2 ⊊ orbit closure of xy. We will see in Section 3.6 how one can do this.
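The convergence in this example is easy to check symbolically. The following sketch (assuming sympy is available; the helper apply_matrix is our illustration, not notation from the thesis) applies the matrix (1 0; 1 1/n) to xy for growing n:

```python
import sympy as sp

x, y = sp.symbols('x y')

def apply_matrix(f, A):
    """Apply a 2x2 matrix A = ((a, b), (c, d)) to a form f(x, y) by
    substituting x -> a*x + b*y and y -> c*x + d*y."""
    (a, b), (c, d) = A
    return sp.expand(f.subs({x: a*x + b*y, y: c*x + d*y}, simultaneous=True))

f = x*y  # a point in the orbit of xy
for n in [1, 10, 100]:
    fn = apply_matrix(f, ((1, 0), (1, sp.Rational(1, n))))
    print(n, fn)  # the xy-coefficient is 1/n, so fn tends to x**2
```

As n grows the coefficient 1/n of xy vanishes, so the limit x^2 indeed lies in the orbit closure of xy.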

History. Orbit closures appeared in complexity theory in the seventies in the work of Bini, Strassen and others on the complexity of matrix multiplication [Str87]. In the nineties Mulmuley and Sohoni proposed to attack the famous P versus NP problem using orbit closures, in particular using tools from algebraic geometry and representation theory like Kempf's criterion and highest weight vectors [MS01]. They introduced the name geometric complexity theory. Bürgisser and Ikenmeyer then applied these tools in the context of matrix multiplication. They set up a geometric complexity theory framework in such a way that it fits both the matrix multiplication problem and the permanent versus determinant problem, an algebraic sister of the P versus NP problem [Ike13]. We will follow this framework and use the permanent versus determinant and matrix multiplication complexity problems as our motivating examples.

Permanent versus determinant. The determinant and permanent of an n × n matrix (a_ij) are the polynomials

det_n(a_ij) = Σ_{σ ∈ S_n} sign(σ) Π_{i=1}^n a_{iσ(i)},    perm_n(a_ij) = Σ_{σ ∈ S_n} Π_{i=1}^n a_{iσ(i)},

where S_n is the symmetric group on n symbols. The permanent is thus equal to the determinant except that 'it has no signs'. For 2 × 2 matrices, the permanent can be computed using a determinant, by changing the signs of the matrix elements only:

perm (a b; c d) = det (a −b; c d).

In an exercise in the Archiv der Mathematik und Physik, volume 20, Pólya asks whether we can compute the permanent of degree n by the determinant of degree n by changing the signs of the matrix elements only, see the recto. Szegő shows that this is impossible in volume 21.
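Both the definitions and the 2 × 2 sign trick can be checked by brute force. The following sketch (assuming sympy; not part of the thesis) also verifies Szegő's impossibility claim for n = 3 by trying all 2^9 fixed sign patterns:

```python
import itertools
import sympy as sp

def perm(M):
    """Permanent of a sympy Matrix: the determinant 'without signs'."""
    n = M.rows
    return sp.expand(sum(
        sp.prod(M[i, s[i]] for i in range(n))
        for s in itertools.permutations(range(n))))

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b], [c, d]])

# For 2 x 2 matrices, flipping one sign turns the determinant into the permanent.
assert perm(A) == sp.expand(sp.Matrix([[a, -b], [c, d]]).det())

# Szegő's claim for n = 3: no assignment of fixed signs eps_ij in {+1, -1}
# makes det(eps_ij * x_ij) equal to perm_3.
X = sp.Matrix(3, 3, sp.symbols('x:3:3'))
p3 = perm(X)
found = any(
    sp.expand(sp.Matrix(3, 3, [e * x for e, x in zip(signs, X)]).det()) == p3
    for signs in itertools.product([1, -1], repeat=9))
assert not found
```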

We might then ask whether we can compute a permanent using a larger determinant and, more precisely, whether we can compute a permanent of degree n using a determinant of degree polynomial in n. This question has tight connections to questions about the circuit complexity of polynomials.

Matrix multiplication. When we multiply two 2 × 2 matrices by hand, we use 8 multiplications. Surprisingly, there is a clever algorithm that uses 7 multiplications (in exchange for extra additions) due to Strassen [Str69], see Theorem A.17. It turns out that 7 multiplications is the optimum. The real benefit of this exchange appears when we apply the clever algorithm recursively to larger matrices, by computing the product blockwise. This method uses O(n^{log_2 7}) = O(n^{2.81}) operations, improving the complexity O(n^3) of the naive algorithm.
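The recursive scheme can be sketched as follows (a minimal illustration assuming numpy, restricted to matrices whose size is a power of two; not part of the thesis):

```python
import numpy as np

def strassen(A, B):
    """Strassen's recursive scheme for 2^k x 2^k matrices:
    7 block multiplications per level instead of 8."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4,           M1 - M2 + M3 + M6]])

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (8, 8))
B = rng.integers(-5, 5, (8, 8))
assert (strassen(A, B) == A @ B).all()
# For n = 8 this uses 7^3 = 343 scalar multiplications instead of 8^3 = 512.
```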

We might then ask whether we can do 3 × 3 matrix multiplication using fewer than 27 multiplications. If we can do it with fewer than 22 multiplications, applying this algorithm recursively will improve on the asymptotic complexity O(n^{log_2 7}). The problem of finding a good algorithm is tightly connected to determining the tensor rank of the matrix multiplication tensor.
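The threshold 22 comes from comparing recursion exponents: multiplying in 3 × 3 blocks with r scalar multiplications costs O(n^{log_3 r}). A quick check (our illustration, not from the thesis):

```python
import math

# Recursing on 3x3 blocks with r scalar multiplications costs O(n^{log_3 r}),
# which beats Strassen's O(n^{log_2 7}) exactly when log_3 r < log_2 7.
assert math.log(21, 3) < math.log2(7) < math.log(22, 3)
# So 22 multiplications for 3x3 would not improve the exponent, but 21 would.
```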

Organisation. In Chapter 1 we introduce the permanent versus determinant problem and the matrix multiplication rank problem rigorously and translate them to orbit closure separation problems. In Chapter A we provide more background on the complexity theory side of the story. From Chapter 2 on we focus on the setting of the permanent versus determinant problem. We introduce


424. The rule for expanding a determinant consists of two parts; the first part specifies which products of the elements are to be formed, while the second part determines the sign of the products so formed.

In the case of the two-row determinant, the second part can be simplified in the following way: assign to the elements a_{11}, a_{12}, a_{21}, a_{22} the signs +, +, −, + respectively; then let each element enter the products prescribed by the first part with its assigned fixed sign.

Now it is to be shown that a corresponding simplification is impossible for determinants with more than two rows; that is, it is impossible to assign to the n^2 elements n^2 fixed signs in such a way that, when the elements enter the products prescribed by the first part of the expansion rule with their assigned signs, the correct sign automatically results for all products.

Budapest. G. Pólya.

Aufgabe 424 from Archiv der Mathematik und Physik 20, 1913 [Pó13], translated from the German. See page 66 for the solution.

the occurrence obstruction approach for separating orbit closures. Let X and Y be orbit closures and C[X] and C[Y] their coordinate rings. We then try to find a highest weight that occurs in one of the coordinate rings but not in the other, which implies that one inclusion cannot exist. Such a highest weight is an occurrence obstruction. In the following background chapters we collect some basics on algebraic geometry and representation theory. Next, in Chapter 3 we give an exposition of some useful properties and techniques for separating orbits in the permanent versus determinant setting. Finally, in Chapter 4, we take a closer look at how one could find occurrence obstructions. The main result in this chapter is a result of Kumar showing that many highest weights are not occurrence obstructions, under the assumption of the Column Latin square conjecture.


Chapter 1

Two motivating problems

In this chapter we will gently introduce the reader to two concrete open problems in the intersection of mathematics and computer science. They serve as a motivation for the rest of this thesis. Briefly, the permanent versus determinant problem asks whether we can compute the permanent of a matrix using a small determinant, and the matrix multiplication rank problem asks how many multiplications we need to compute the product of two square matrices. We follow the framework of Ikenmeyer [Ike13].

1.1 Permanent versus determinant

Let A be an n × n matrix. The determinant and permanent of A are defined as

det_n A = Σ_{σ ∈ S_n} sign(σ) Π_{i=1}^n A_{iσ(i)},    perm_n A = Σ_{σ ∈ S_n} Π_{i=1}^n A_{iσ(i)},

where S_n is the symmetric group on n symbols. They are homogeneous polynomials of degree n in n^2 variables.

1.1 Definition. Take polynomials f ∈ C[X_1, ..., X_n] and g ∈ C[Y_1, ..., Y_m]. We say f is a projection of g, and write f ≤ g, if there are elements a_1, ..., a_m in C ∪ {X_1, ..., X_n} such that f(X_1, ..., X_n) = g(a_1, ..., a_m).

1.2 Example. The polynomial X_1 is a projection of Y_1 + Y_2, but not the other way around. The polynomial X_1 + X_2 is a projection of the degree 2 determinant det_2.
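Both claims of Example 1.2 amount to explicit substitutions as in Definition 1.1. A sympy sketch (the chosen substitution values are one possibility among many; not part of the thesis):

```python
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')

# X1 is a projection of Y1 + Y2: substitute Y1 -> X1, Y2 -> 0.
assert (y1 + y2).subs({y1: x1, y2: 0}) == x1

# X1 + X2 is a projection of det_2 = ad - bc:
# choose a = x1, b = x2, c = -1, d = 1, so ad - bc = x1 + x2.
a, b, c, d = sp.symbols('a b c d')
det2 = a*d - b*c
assert sp.expand(det2.subs({a: x1, b: x2, c: -1, d: 1})) == x1 + x2
```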

Determinantal complexity

1.3 Definition. Let f be a polynomial. The determinantal complexity dc(f) is the minimal n such that f is a projection of det_n.

Valiant showed that for any polynomial f the determinantal complexity dc(f) is finite [Val79, Theorem 1]. We are interested in the rate at which dc(perm_n) grows in n. The permanent versus determinant problem is to show that dc(perm_n) grows faster than any polynomial in n. This would in some sense show that the permanent is fundamentally harder to compute than the determinant.


1.4 Motivating Problem. Show that the determinantal complexity of the permanent dc(perm_n) is not bounded by a polynomial in n.

1.5 Remark. The permanent versus determinant problem might seem somewhat arbitrary. Why do we investigate these particular polynomials? In the optional Chapter A we will see that the permanent versus determinant problem corresponds to the problem in complexity theory of separating the complexity classes VP_ws and VNP, an algebraic sister of the famous P versus NP problem.

Currently, only the following quadratic lower bound and exponential upper bound are known for dc(perm_n). Note that these bounds do not answer Problem 1.4. We recommend the reader to read the original proofs; they do not employ heavy machinery.

1.6 Theorem (Mignon and Ressayre [MR04, Theorem 1.5]). (1/2) n^2 ≤ dc(perm_n).

1.7 Theorem (Grenet [Gre11]). dc(perm_n) ≤ 2^n − 1.

1.8 Example. Grenet's proof of Theorem 1.7 is constructive: he constructs for every n a directed weighted graph G on 2^n − 1 vertices such that the determinant of the adjacency matrix of G is the degree n permanent. As an illustration we give the adjacency matrix for n = 3, where we order the vertices as ∅, {1}, {2}, {3}, {12}, {23}, {13}. [In the original a picture of the graph is shown, in which the two nodes labelled ∅ are identified.]

[ 0    x11  x12  x13  0    0    0   ]
[ 0    1    0    0    x22  0    x23 ]
[ 0    0    1    0    x21  x23  0   ]
[ 0    0    0    1    0    x22  x21 ]
[ x33  0    0    0    1    0    0   ]
[ x31  0    0    0    0    1    0   ]
[ x32  0    0    0    0    0    1   ]


Besides our understanding of the asymptotic behaviour of dc(perm_n) being poor, the exact values of dc(perm_n) are also still a mystery. The following problem is open.

1.9 Problem. By Theorem 1.6 and Theorem 1.7 we have 5 ≤ dc(perm_3) ≤ 7. Determine the value of dc(perm_3).

Approximation

In computations, it is often easier to approximate a solution to a problem instance than to compute the exact answer. In an attempt to better understand the determinantal complexity we introduce a complexity measure which allows us to ‘approximate the solution’.

In the following, we endow the polynomial ring P(C^n) = C[X_1, ..., X_n] with a topology by viewing the homogeneous parts P^m(C^n) as finite dimensional vector spaces over C endowed with the Euclidean topology. We call this topology the C-topology.

1.10 Definition. For a polynomial f, the approximate determinantal complexity dc̲(f) is the minimal n such that there exists a sequence of polynomials (g_k) with dc(g_k) ≤ n and f = lim_{k→∞} g_k in the C-topology.

1.11 Conjecture (Mulmuley and Sohoni [MS01, Conjecture 4.3]). The approximate determinantal complexity of the permanent dc̲(perm_n) is not bounded by a polynomial in n.

1.12 Remark. Every polynomial f is the limit of the constant sequence (f), so trivially dc̲(f) ≤ dc(f). Hence proving Conjecture 1.11 solves Motivating Problem 1.4.

Linear transformations

Intuitively, applying a linear transformation is 'computationally cheap'. We introduce a complexity measure that incorporates this. It turns out that this new measure differs only polynomially from dc and thus is very useful in light of Motivating Problem 1.4. To talk about applying linear transformations, we use the language of group orbits. Let G be a group with unit e, say the general linear group GL_{n^2}, and X a set. Then an action of G on X is a map G × X → X : (g, x) ↦ g · x such that (gh) · x = g · (h · x) and e · x = x. The orbit of an element x ∈ X is the result of applying every element of G to x, that is, the orbit Gx is the set {g · x : g ∈ G}. We endow the space P^n C^m of homogeneous polynomials of degree n in m variables with an action of GL_m by setting (A · p)(v) := p(A^{-1} v) for A in GL_m, p in P^n C^m and v in C^m.

1.13 Definition. Let M ≤ n^2 and m ≤ n. Let f be a polynomial in P^m C^M and embed P^n C^M in P^n C^{n^2} by identifying the set of coordinates {x_1, ..., x_M} with a subset of {x_{i,j} : i, j ∈ [n]}. Let z := x_{1,1}. The determinantal orbit closure complexity docc(f) is the minimal n such that z^{n−m} f is contained in the closure of GL_{n^2} det_n in P^n C^{n^2}.


As stepping stones to Proposition 1.16, the following two propositions say that applying a linear transformation to a polynomial or 'taking the homogeneous part of a polynomial' can increase the determinantal complexity only polynomially.

1.14 Proposition. There is a polynomial p such that for every g ∈ GL_{n^2}, dc(g · det_n) ≤ p(n).

Proof. It would be interesting to find a direct proof of this. The result follows from a result in circuit complexity theory of Malod and Portier [MP08, Proposition 5]. We refer to Proposition 2.3.7 in [Ike13] for the reduction.

1.15 Proposition. For a polynomial f denote by f^{(d)} the homogeneous degree d part. There is a polynomial p such that for all polynomials f, dc(f^{(d)}) ≤ p(dc(f)).

Proof. Again, I do not know a direct proof. I refer to Ikenmeyer for a proof using circuits [Ike13, Lemma 2.1.13].

1.16 Proposition. The functions dc̲ and docc are polynomially equivalent.

Proof. Let f ∈ P^m C^M. Let n be docc(f). Then by definition the polynomial z^{n−m} f is contained in the closure of GL_{n^2} det_n. This means that there is a sequence (g_k)_k in GL_{n^2} such that lim_{k→∞} g_k det_n = z^{n−m} f. Evaluating z at 1 gives

lim_{k→∞} (g_k det_n)|_{z=1} = f.    (1.1)

Proposition 1.14 says there is a polynomial p, independent of g_k, such that dc(g_k det_n) ≤ p(n), and hence dc((g_k det_n)|_{z=1}) ≤ p(n), since substituting a constant for a variable is a projection. Together with (1.1) this implies dc̲(f) ≤ p(n).

Conversely, let n = dc̲(f). Then by definition there is a sequence (f_k)_k such that dc(f_k) ≤ n for every k and lim_{k→∞} f_k = f. Let f_k^{(m)} be the homogeneous part of degree m of f_k. Since f is homogeneous of degree m, lim_{k→∞} f_k^{(m)} = f. Proposition 1.15 says that there is a polynomial p, independent of f, such that dc(f_k^{(m)}) ≤ p(n). This means that there are p(n) × p(n) matrices A_k with entries in {x_1, ..., x_M} ∪ C such that f_k^{(m)} = det(A_k). Let B_k be the matrix obtained from A_k by multiplying all constant entries by z. Then det(B_k) = z^{p(n)−m} f_k^{(m)}. The entries of B_k are linear in the variables, so there are p(n)^2 × p(n)^2 matrices g_k such that det(B_k) = g_k det_{p(n)}. The group GL_{p(n)^2} lies dense in the space of p(n)^2 × p(n)^2 matrices, and hence there exist h_k ∈ GL_{p(n)^2} such that lim_{k→∞} h_k det_{p(n)} = z^{p(n)−m} f. This means z^{p(n)−m} f is an element of the closure of GL_{p(n)^2} det_{p(n)}. Therefore, docc(f) ≤ p(n).

1.17 Approach. We conclude that, as an approach to solving Motivating Problem 1.4, we can try to, given m, find an n as large as possible such that z^{n−m} perm_m is not contained in the closure of GL_{n^2} det_n in P^n C^{n^2}.


1.2 Matrix multiplication rank

Let W be an m-dimensional complex vector space with basis {e_1, ..., e_m}. Extend the inner product ⟨·,·⟩ on W to a bilinear map ⟨·,·⟩ : W^{⊗3} × W^{⊗3} → C by ⟨u_1 ⊗ u_2 ⊗ u_3, v_1 ⊗ v_2 ⊗ v_3⟩ = ⟨u_1, v_1⟩⟨u_2, v_2⟩⟨u_3, v_3⟩. Let f be an element in W ⊗ W ⊗ W. We can identify f with a bilinear map f : W × W → W by

f(w_1, w_2) = Σ_{i=1}^m ⟨f, w_1 ⊗ w_2 ⊗ e_i⟩ e_i.

We thus have an isomorphism of vector spaces

W ⊗ W ⊗ W ≅ {f : W × W → W bilinear}.

The ideas in this chapter also work for elements in spaces U ⊗ V ⊗ W, where U, V and W are unequal.

1.18 Example. The bilinear map C^2 × C^2 → C^2 : (x, y) ↦ (x_1 y_1 + x_2 y_2, x_1 y_1 − x_2 y_2) corresponds to the tensor e_1 ⊗ e_1 ⊗ e_1 + e_2 ⊗ e_2 ⊗ e_1 + e_1 ⊗ e_1 ⊗ e_2 − e_2 ⊗ e_2 ⊗ e_2, or, in braket notation, |111⟩ + |221⟩ + |112⟩ − |222⟩.
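The correspondence f(w_1, w_2) = Σ_i ⟨f, w_1 ⊗ w_2 ⊗ e_i⟩ e_i can be checked numerically for this example. A numpy sketch (not part of the thesis):

```python
import numpy as np

# The tensor of Example 1.18 in C^2 (x) C^2 (x) C^2.
T = np.zeros((2, 2, 2))
T[0, 0, 0] = 1   # |111>
T[1, 1, 0] = 1   # |221>
T[0, 0, 1] = 1   # |112>
T[1, 1, 1] = -1  # |222>

def bilinear(T, u, v):
    """The bilinear map f(w1, w2) = sum_i <T, w1 (x) w2 (x) e_i> e_i."""
    return np.einsum('abc,a,b->c', T, u, v)

x = np.array([2.0, 3.0])
y = np.array([5.0, 7.0])
# (x1*y1 + x2*y2, x1*y1 - x2*y2) = (10 + 21, 10 - 21) = (31, -11)
assert np.allclose(bilinear(T, x, y), [31.0, -11.0])
```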

1.19 Notation. Occasionally, we use braket notation for vectors in tensor spaces. In this notation we denote vectors by |v⟩ and dual vectors by ⟨v|. The standard basis elements of C^m are labelled by numbers: |1⟩, ..., |m⟩; for C^{n^2} we use the notation |11⟩, |12⟩, ..., |nn⟩. We write the tensor product of two vectors compactly as |v⟩|w⟩ or even as |vw⟩.

1.20 Definition. The multiplication of n × n matrices is the bilinear map C^{n^2} × C^{n^2} → C^{n^2} : (e_{ij}, e_{ℓk}) ↦ δ_{jℓ} e_{ik}. The corresponding matrix multiplication tensor is

M_n := Σ_{i,j,k=1}^n |ij⟩|jk⟩|ik⟩.

1.21 Remark. In the literature, the matrix multiplication tensor M_n is often defined as the more symmetric Σ_{i,j,k=1}^n |ij⟩|jk⟩|ki⟩, which corresponds to the transpose of the matrix product. The tensor ranks of the two tensors, however, are equal.
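Definition 1.20 can be tested numerically: contracting M_n with the flattened matrices A and B, as in the identification above, must give the flattened product AB. A numpy sketch (not part of the thesis):

```python
import numpy as np

def mm_tensor(n):
    """M_n = sum_{i,j,k} |ij>|jk>|ik> in (C^{n^2})^{(x)3}."""
    M = np.zeros((n * n, n * n, n * n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                M[i * n + j, j * n + k, i * n + k] = 1
    return M

n = 3
M = mm_tensor(n)
rng = np.random.default_rng(1)
A, B = rng.standard_normal((2, n, n))
# Contracting M_n with vec(A) and vec(B) gives vec(A @ B).
C = np.einsum('abc,a,b->c', M, A.ravel(), B.ravel()).reshape(n, n)
assert np.allclose(C, A @ B)
```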

Tensor rank

In the previous section we defined determinantal complexity as a measure of complexity of polynomials. Analogously, we define tensor rank as a measure of complexity of bilinear maps.

1.22 Definition. Let f be an element in W ⊗ W ⊗ W or the corresponding bilinear map. The tensor rank R(f) is the minimal n such that f can be written as f = Σ_{i=1}^n u_i ⊗ v_i ⊗ w_i for some u_i, v_i, w_i ∈ W.


In a very precise sense, the tensor rank of a bilinear map is a measure for how hard it is to compute the bilinear map, see Chapter A. Since matrix multiplication is important in many algorithms in linear algebra, we are interested in its computational complexity and thus in its tensor rank.

From the definition of M_n the rank of M_n is at most n^3. For n = 2 this gives R(M_2) ≤ 8. Not surprisingly this is also the number of multiplications needed in the naive algorithm for multiplying two n × n matrices. It turns out that we can write M_2 with one summand fewer.

1.23 Theorem (Strassen, Winograd, Hopcroft and Kerr). R(M_2) = 7.

Proof. This follows from Strassen's algorithm, see Chapter A, and the lower bound 7 ≤ R(M_2) found by Winograd [Win71, Theorem 3.1] and by Hopcroft and Kerr [HK71, Theorem 3].
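The upper bound can be checked directly: the seven rank-one tensors coming from Strassen's algorithm sum to M_2. A numpy sketch (the u, v, w vectors follow the usual presentation of Strassen's algorithm; index order for 2 × 2 entries is (1,1), (1,2), (2,1), (2,2); not part of the thesis):

```python
import numpy as np

# Rows are the 7 rank-one terms u_i (x) v_i (x) w_i.
U = np.array([[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
              [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]])
V = np.array([[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
              [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]])
W = np.array([[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
              [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]])

# Sum the 7 rank-one tensors.
T = np.einsum('ra,rb,rc->abc', U, V, W)

# The matrix multiplication tensor M_2 = sum |ij>|jk>|ik>.
M2 = np.zeros((4, 4, 4))
for i in range(2):
    for j in range(2):
        for k in range(2):
            M2[2 * i + j, 2 * j + k, 2 * i + k] = 1

assert np.array_equal(T, M2)  # hence R(M_2) <= 7
```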

1.24 Motivating Problem. Compute the tensor rank of the matrix multiplication tensor M_m = Σ_{i,j,k=1}^m |ij⟩|jk⟩|ik⟩.

We note that the general computational problem of deciding whether the rank of a three-dimensional tensor is at most r, for some r, is NP-hard [Hås90]. Also, note that our problem is a generalisation of the problem of computing the rank of a two-dimensional tensor, that is, of a matrix. The latter can be done efficiently by Gaussian elimination. The following is known about the rank of matrix multiplication.

1.25 Theorem.

2n^2 + n − 2 ≤ R(M_n),    (1.2)
(5/2) n^2 − 3n ≤ R(M_n),
(8/3) n^2 − 7n ≤ R(M_n),
(11/4) n^2 − 17n ≤ R(M_n).

Proof. The first two bounds are from Bläser [Blä03, Blä99]. The last two bounds are from Massarenti and Raviolo [MR14].

1.26 Problem. Using the lower bound (1.2) and an upper bound from Laderman [Lad76], we have 19 ≤ R(M_3) ≤ 23. Compute the tensor rank of M_3.

Following the example of the permanent versus determinant problem, we define a special element E_n and a relation ≤ that correspond nicely to tensor rank.

1.27 Definition. The unit tensor in W ⊗ W ⊗ W is the tensor

E_n := Σ_{i=1}^n |i⟩|i⟩|i⟩.

Let f ∈ (C^m)^{⊗3} and g ∈ (C^n)^{⊗3}. We say f is a restriction of g, and write f ≤ g, if there are linear maps g_1, g_2, g_3 : C^n → C^m such that f = (g_1 ⊗ g_2 ⊗ g_3) g.


1.28 Proposition (Strassen). Let f ∈ W ⊗ W ⊗ W. Then R(f) ≤ n if and only if f ≤ E_n. In particular the set of restrictions of E_n is independent of the choice of basis in W.

Proof. Suppose R(f) ≤ n. Then there exist x_i, y_i, z_i ∈ C^m such that f can be written as Σ_{i=1}^n x_i ⊗ y_i ⊗ z_i. Define linear maps g_1, g_2, g_3 : C^n → C^m by setting g_1(|i⟩) = x_i, g_2(|i⟩) = y_i, g_3(|i⟩) = z_i. Then f = (g_1 ⊗ g_2 ⊗ g_3) E_n.

On the other hand, suppose f = (g_1 ⊗ g_2 ⊗ g_3) E_n. Then f can be written as Σ_{i=1}^n g_1(|i⟩) ⊗ g_2(|i⟩) ⊗ g_3(|i⟩).
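The first half of this argument can be checked numerically: a random tensor of rank at most n is reproduced as a restriction of E_n. A numpy sketch (not part of the thesis):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 4

# A tensor of rank at most n in (C^m)^{(x)3}: f = sum_i x_i (x) y_i (x) z_i.
X, Y, Z = rng.standard_normal((3, n, m))
f = np.einsum('ia,ib,ic->abc', X, Y, Z)

# The unit tensor E_n, and the restriction (g1 (x) g2 (x) g3) E_n,
# where g1, g2, g3 : C^n -> C^m send |i> to x_i, y_i, z_i respectively.
En = np.zeros((n, n, n))
for i in range(n):
    En[i, i, i] = 1
g = np.einsum('ai,bj,ck,ijk->abc', X.T, Y.T, Z.T, En)

assert np.allclose(f, g)  # f is a restriction of E_n, so f <= E_n
```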

1.29 Definition. Let f ∈ W ⊗ W ⊗ W. The border rank R̲(f) is the smallest n such that there exists a sequence (f_k)_k in W ⊗ W ⊗ W with R(f_k) ≤ n and lim_{k→∞} f_k = f.
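A standard illustration of this definition, not taken from this thesis: the tensor |112⟩ + |121⟩ + |211⟩ is known to have tensor rank 3 but border rank 2, since it is a limit of rank-2 tensors. A numpy sketch:

```python
import numpy as np

e1, e2 = np.eye(2)

def triple(u, v, w):
    """The rank-one tensor u (x) v (x) w."""
    return np.einsum('a,b,c->abc', u, v, w)

# The target tensor |112> + |121> + |211>.
Wt = triple(e1, e1, e2) + triple(e1, e2, e1) + triple(e2, e1, e1)

# Rank-2 approximations f_eps = (1/eps)((e1 + eps*e2)^{(x)3} - e1^{(x)3}).
for eps in [1e-1, 1e-3, 1e-5]:
    v = e1 + eps * e2
    f = (triple(v, v, v) - triple(e1, e1, e1)) / eps
    print(eps, np.max(np.abs(f - Wt)))  # the error shrinks like eps
```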

We will later use the following trivial lower bound on the border rank of matrix multiplication. There are better bounds, closer to the bounds for the tensor rank of matrix multiplication.

1.30 Theorem ([CW82, Corollary 3.7]). The border rank R̲(M_n) is at least n^2 + 1.

Let G_n be the product group GL_n × GL_n × GL_n. It acts naturally on V_n := C^n ⊗ C^n ⊗ C^n by (g_1, g_2, g_3) · (v_1 ⊗ v_2 ⊗ v_3) = g_1 v_1 ⊗ g_2 v_2 ⊗ g_3 v_3. For m ≤ n let i_{m,n} be the natural embedding V_m ↪ V_n.

1.31 Theorem (Strassen). Let f ∈ V_m. For n ≥ m, we have R̲(f) ≤ n if and only if i_{m,n}(f) is in the closure of G_n E_n. In particular, if R̲(f) ≥ m, then R̲(f) is the minimal n such that i_{m,n}(f) is in the closure of G_n E_n.

Proof. We follow Ikenmeyer [Ike13, Theorem 2.5.12]. We show the first statement; the second statement follows directly from the first. Let f ∈ V_m. Assume i_{m,n}(f) is in the closure of G_n E_n. This means that there is a sequence (g_k)_k in G_n such that lim_{k→∞} g_k E_n = i_{m,n}(f). Let p : V_n → V_m be the linear projection such that p ∘ i_{m,n} = id. Then lim_{k→∞} (p ∘ g_k) E_n = f, and by Proposition 1.28 we have R((p ∘ g_k) E_n) ≤ n. This implies R̲(f) ≤ n.

On the other hand, assume R̲(f) ≤ n. By definition this means that there is a sequence (f_k)_k in V_m with R(f_k) ≤ n and lim_{k→∞} f_k = f. By Proposition 1.28, f_k ≤ E_n, so f_k = g_k E_n for an element g_k = g_k^1 ⊗ g_k^2 ⊗ g_k^3 with g_k^i ∈ C^{m×n}. Embed C^{m×n} in C^{n×n} by adding n − m rows of zeros, and let g'^i_k be the image of g_k^i under this embedding, so that g'^i_k = i_{m,n} ∘ g_k^i. Since GL_n is dense in C^{n×n}, there exist elements g̃_k^i ∈ GL_n close enough to g'^i_k such that, with g̃_k := g̃_k^1 ⊗ g̃_k^2 ⊗ g̃_k^3, we have lim_{k→∞} g̃_k E_n = i_{m,n}(f). This implies that i_{m,n}(f) is in the closure of G_n E_n.

1.32 Approach. Because of Theorem 1.30 we get the following from Theorem 1.31. Suppose that for some n and m the element i_{m^2,n}(M_m) is not contained in the closure of G_n E_n. Then R̲(M_m) > n. We conclude that, as an approach to solving Motivating Problem 1.24, we can try to find lower bounds for the border rank of M_m by, given m, finding n as large as possible such that i_{m^2,n}(M_m) is not contained in the closure of G_n E_n.


1.3 Common approach

In the previous sections we have introduced two problems and stated approaches for solving them in terms of orbit closure separation problems, Approach 1.17 and Approach 1.32. We can state the two orbit closure approaches as a single problem. Let G, V, W, h and e be as in the following table, and let i be the natural embedding W ↪ V.

        per vs. det           matrix mult.
  n     n ≥ m + 1             n ≥ m^2
  G     GL_{n^2}              GL_n × GL_n × GL_n
  V     P^n C^{n^2}           C^n ⊗ C^n ⊗ C^n
  W     P^n C^{m^2+1}         C^{m^2} ⊗ C^{m^2} ⊗ C^{m^2}
  h     z^{n−m} perm_m        M_m
  e     det_n                 E_n

1.33 Approach. Given m, find n as large as possible such that h_{m,n} := i(h) is not in the closure of Ge_n. Or, equivalently, given m, find n as large as possible such that the orbit closure of h_{m,n} is not contained in the orbit closure of e_n.

In Chapter 2 we will see how algebraic geometry and representation theory could be used to carry out this approach.


Appendix A

Complexity theory

Both the permanent versus determinant problem and the matrix multiplication rank problem from Chapter 1 are tightly related to and motivated by problems in complexity theory. In this chapter we will explain the connections. The determinantal complexity of a polynomial is related to the circuit complexity of the polynomial. The tensor rank of a bilinear map is related to the multiplicative complexity of the bilinear map. In both settings we need the notion of an arithmetic circuit.

A.1 Definition. An arithmetic circuit, or circuit for short, is a directed graph without directed cycles, where the vertices with indegree zero are labelled with indeterminates or constants in C^×, and the other vertices have indegree 2 and are labelled with ∗ or +. A circuit with n vertices labelled by indeterminates and m vertices with outdegree zero naturally defines a function f : C^n → C^m. We say that the circuit computes the function f.

A.2 Example. The following circuit computes the product of the degree 1 polynomials ax + b and cx + d. This can be seen as a function f : C^4 → C^3. [Figure: a circuit with input gates a, c, b, d, four ∗ gates and one + gate.]

A.3 Definition. Let f be a function C^n → C^m. The complexity L(f) of f is the minimal number of vertices in a circuit computing f. The multiplicative complexity L^*(f) of f is the minimal number of nonscalar multiplications in a circuit computing f.

A.4 Remark. Historically, the notation L comes from the German word Länge [Str72].

A.5 Example. Continuing Example A.2, we have L^*(f) ≤ 4. It turns out that L^*(f) = 3. The circuit comes from an algorithm known as Karatsuba's algorithm.
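The three-multiplication circuit is Karatsuba's identity, which can be checked symbolically (assuming sympy and the notation of Example A.2; not part of the thesis):

```python
import sympy as sp

a, b, c, d, x = sp.symbols('a b c d x')

# Karatsuba: three multiplications suffice to multiply ax + b and cx + d.
m1 = a * c
m2 = b * d
m3 = (a + b) * (c + d)
product = m1 * x**2 + (m3 - m1 - m2) * x + m2

assert sp.expand(product - (a*x + b)*(c*x + d)) == 0
```

The middle coefficient ad + bc is recovered as m3 − m1 − m2 at the cost of extra additions, the same trade that Strassen's algorithm makes for matrices.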


A.1 Complexity of families of polynomials

In this section we define Valiant's algebraic complexity classes VP and VNP, and a derived class VP_ws, and we explain how the permanent and determinant play a special role in these classes. We say a function f : N → N is polynomially bounded if there exists a polynomial p such that f(n) ≤ p(n) for all natural numbers n.

A.6 Definition. A p-family is a sequence (f_n) of polynomials such that the degree of f_n is polynomially bounded in n.

A.7 Example. Examples of p-families are (det_n) and (perm_n). A non-example is the sequence of polynomials (x^{2^n}) in x.

Intuitively, a family of polynomials is efficiently computable if it can be computed by small circuits. These efficiently computable families will make up the class VP. We can imagine having a family of polynomials for which we do not know a small circuit that computes it, but we do know small circuits that compute the coefficients of the polynomials – for example, the permanent family! These families (and their projections, to be defined) will make up the class VNP.

A.8 Definition. Let (f_n) be a p-family. Then (f_n) belongs to VP if L(f_n) is polynomially bounded in n. Let d_n be the number of variables of f_n. Then (f_n) belongs to VNP if there exists a polynomial p and a sequence (g_n) ∈ VP such that g_n is a polynomial in d_n + p(d_n) variables and

f_n(x_1, ..., x_{d_n}) = Σ_{e ∈ {0,1}^{p(d_n)}} g_n(x_1, ..., x_{d_n}, e_1, ..., e_{p(d_n)}).

A.9 Remark. We note that there is an equivalent definition of VP, in which the sequences are not required to have polynomially bounded degree, but the circuits are [MP08, Definition 3]. Here the degree of a circuit is defined as follows: the degree of an input gate is 1, the degree of a + gate is the maximum of the incoming degrees, and the degree of a ∗ gate is the sum of the incoming degrees. Bounding the degree of a circuit thus makes sure that the 'intermediate results' of the circuit are bounded.

The characterisation of VNP as the class of families for which the coefficients can be computed by small circuits, plus their projections, can be found in the original definition of VNP by Valiant [Val79, Section 3].

When Valiant defined VP and VNP his aim was to demonstrate that the tendency of 'problems' to cluster in 'completeness classes' also manifests itself in an algebraic setting. Here the clustering is induced by so-called p-projections.

A.10 Definition. Let f and g be polynomials. In Chapter 1, we defined f to be a projection of g if there exist elements a_1, ..., a_m in C ∪ {x_1, ..., x_n} such that f(x_1, ..., x_n) = g(a_1, ..., a_m). Let (f_n) and (g_n) be p-families. Then (f_n) is a p-projection of (g_n) if there exists a polynomially bounded function t(n) such that f_n is a projection of g_{t(n)} for all n.

A.11 Definition. Let C be a set of p-families and let (f_n) be a p-family. Then (f_n) is C-complete if (f_n) belongs to C and every p-family in C is a p-projection of (f_n).


A.12 Theorem ([Val79]). The permanent family (perm_n) is VNP-complete.

The determinant family (det_n) is in VP. One might hope that the determinant family is VP-complete, but this is not known. We define a reasonably natural subclass of VP for which the determinant family is complete.

A.13 Definition. A weakly skew circuit is a circuit such that for each multiplication gate, at least one parent gate forms an isolated subcircuit with its ancestors. Let f be a function C^n → C^m. The weakly skew complexity L_ws(f) of f is the minimal number of vertices in a weakly skew circuit computing f. Let (f_n) be a p-family. Then (f_n) belongs to VP_ws if L_ws(f_n) is polynomially bounded in n.

A.14 Theorem ([MP08]). The determinant family (det_n) is VP_ws-complete.

Finally, we can see the relevance of the permanent versus determinant problem to the theory of circuit complexity of polynomials. Namely, we have

VP_ws ⊆ VP ⊆ VNP

and it is unknown whether the inclusions are strict. If one solves Motivating Problem 1.4, then the permanent is not a p-projection of the determinant and hence the class VP_ws is a proper subset of the class VNP.

A.2 Multiplicative complexity of bilinear maps

The tensor rank of a bilinear map and the number of multiplications needed to compute the bilinear map using circuits are related in a very direct manner; the values differ only by a constant factor. We define the multiplicative complexity of bilinear maps in the obvious way.

A.15 Definition. Let f be a bilinear map C^{n_1} × C^{n_2} → C^m. We define the multiplicative complexity L^*(f) of f by naturally viewing f as a function C^{n_1+n_2} → C^m.

A.16 Theorem([Str73, Section 4]). Let f be a bilinear map. Then we have

the inequalities L(f) ≤ R(f) ≤ 2L(f).

Proof. We follow [Blä13, Theorem 4.7]. We first show the left inequality. Let R(f) = n. Then by definition, f = Σ_{i=1}^n u_i ⊗ v_i ⊗ w_i for some linear forms u_i, v_i and vectors w_i ∈ W. This yields a circuit with exactly n nonconstant multiplications. For example, for n1 = n2 = 2 we get the following circuit, where a node labelled by a vector denotes the subcircuit that takes a linear combination with coefficients from the vector, and where an edge labelled by a constant denotes multiplying by this constant.

[Figure: inputs x1, x2, y1, y2 feed linear-combination nodes u1, u2 and v1, v2; two multiplication gates compute the products u_i · v_i, and addition gates combine these products with coefficients w11, w21, w12, w22.]

We thus have L(f) ≤ n.

Let L(f) = n. Then by definition there is a circuit C computing f with exactly n nonconstant multiplications. View C as a layered circuit where the first layer contains the indeterminates and the other layers alternate between two types: the first type computes linear combinations of the outcomes of the previous layer; the second type computes products of pairs of outcomes of the previous layer. The layered circuit C can only have a single products layer, since f is bilinear. We are left with a linear combinations layer, followed by a products layer, followed by a linear combinations layer. Let e_1, . . . , e_m be the standard basis of W and let f_k denote the k-th coordinate of f. From our layer discussion it follows that there exist indexed coefficients α, β, α′, β′, γ such that

f_k = Σ_{ℓ=1}^n γ_{kℓ} ( Σ_i α_{ℓi} x_i + Σ_j β_{ℓj} y_j ) ( Σ_i α′_{ℓi} x_i + Σ_j β′_{ℓj} y_j ).   (A.1)

Because f is bilinear in x and y, all terms of the form x_i x_j and y_i y_j vanish. Hence,

f_k = Σ_{ℓ=1}^n γ_{kℓ} ( Σ_i α_{ℓi} x_i )( Σ_j β′_{ℓj} y_j ) + Σ_{ℓ=1}^n γ_{kℓ} ( Σ_i α′_{ℓi} x_i )( Σ_j β_{ℓj} y_j ).

In order to explicitly write down a tensor of rank at most 2n, let

u_ℓ = Σ_i α_{ℓi} x_i,   v′_ℓ = Σ_j β′_{ℓj} y_j,   u′_ℓ = Σ_i α′_{ℓi} x_i,   v_ℓ = Σ_j β_{ℓj} y_j.

Then

f = Σ_{k=1}^m e_k ⊗ ( Σ_{ℓ=1}^n γ_{kℓ} u_ℓ ⊗ v′_ℓ + Σ_{ℓ=1}^n γ_{kℓ} u′_ℓ ⊗ v_ℓ )
  = Σ_{ℓ=1}^n ( Σ_{k=1}^m γ_{kℓ} e_k ) ⊗ u_ℓ ⊗ v′_ℓ + Σ_{ℓ=1}^n ( Σ_{k=1}^m γ_{kℓ} e_k ) ⊗ u′_ℓ ⊗ v_ℓ,

so R(f) ≤ 2n.

We finish with Strassen’s famous algorithm for 2 × 2 matrix multiplication, which also shows R(M2) ≤ 7. The last statement will follow only from the special structure of the algorithm. We note that Strassen’s algorithm can be applied recursively to larger matrices of size 2^n, because matrix multiplication of block matrices obeys the same formulas as matrix multiplication of scalar matrices.

A.17 Theorem ([Str69]). Multiplication of 2 × 2 matrices takes at most 7

multiplications.

Proof. Let A = (aij) and B = (bij) be 2 × 2 matrices and let C = (cij) be the

matrix product AB. Let

m1 = (a11 + a22)(b11 + b22)
m2 = (a21 + a22) b11
m3 = a11 (b12 − b22)
m4 = a22 (b21 − b11)
m5 = (a11 + a12) b22
m6 = (a21 − a11)(b11 + b12)
m7 = (a12 − a22)(b21 + b22)

Computing the mi takes 7 multiplications. Now C can be computed by adding

and subtracting the mi as follows:

c11= m1+ m4− m5+ m7

c12= m3+ m5

c21= m2+ m4

c22= m1− m2+ m3+ m6

We can thus compute C using 7 multiplications. Moreover, the c_ij can be written in the form (A.1) with β and α′ zero, so that every product gate has one factor linear in the a-variables and one factor linear in the b-variables. By the proof of Theorem A.16 it follows that R(M2) ≤ 7.
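For concreteness, the scheme above can be transcribed directly into code. The following sketch is our own and not part of the thesis; it uses the standard form of Strassen's identities, in which m7 = (a12 − a22)(b21 + b22).

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices (nested tuples) with Strassen's
    7 multiplications instead of the naive 8."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # the seven products
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # the entries of C = AB are linear combinations of the m_i
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4, m1 - m2 + m3 + m6))
```

For instance, `strassen_2x2(((1, 2), (3, 4)), ((5, 6), (7, 8)))` returns `((19, 22), (43, 50))`, agreeing with the ordinary matrix product.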

Chapter 2

Geometric complexity theory

In this chapter we will translate the permanent versus determinant problem and the matrix multiplication rank problem (Problem 1.4 and Problem 1.24) into geometric problems, in order to employ the machinery of representation theory and algebraic geometry. From Section 2.2 onwards we will focus completely on geometric complexity theory as a tool for investigating the permanent versus determinant problem. We note, however, that there is an analogous story for the matrix multiplication rank problem, for which we refer to Ikenmeyer [Ike13]. This chapter is more technical than Chapter 1. In Appendix B and Appendix C we have collected some background material.

2.1

From C-topology to Zariski topology

We need the following basic theorem.

2.1 Theorem (See [Spr08, Theorem 1.9.5]). Let φ : X → Y be a morphism of varieties. Then φ(X) contains a non-empty open subset of its closure.

Let G be a linear algebraic group and let V be a G-variety.

2.2 Theorem. The G-orbits in V are quasi-affine varieties.

Proof. The map G → Gx : g ↦ g · x is a regular map. Hence by Theorem 2.1, Gx contains an open subset of its closure; let U be such an open subset. Since g ↦ gx is continuous and G is a group, gU is open for every g ∈ G. We have Gx = ∪_{g∈G} gU, hence Gx is open in its closure. This means that Gx is quasi-affine.

Thus any orbit Gv in V is a G-variety. Let X be a subset of an affine variety. We call X constructible if X is a finite union of quasi-affine varieties. In particular, the G-orbits in V are constructible. Constructible sets have the following special property.

2.3 Theorem(See [Kra85, Section AI.7.2 Folgerung]). For constructible sets,

the Zariski closure and C-closure are the same.

It follows from Theorem 2.2 and Theorem 2.3 that the C-closures of the orbit of e_n and the orbit of h_{m,n} from Approach 1.33 are equal to their Zariski closures.


2.4 Proposition. The closure of an orbit Gv is a G-variety.

Proof. Let g ∈ G and write \overline{Gv} for the closure of Gv. We have Gv = gGv ⊆ g\overline{Gv}, and the set g\overline{Gv} is closed, so \overline{Gv} ⊆ g\overline{Gv}. Therefore, g^{-1}\overline{Gv} ⊆ \overline{Gv} for every g ∈ G, so \overline{Gv} is stable under G.

In the rest of this thesis we will think of these orbits and their closures as (quasi-)affine G-varieties endowed with the Zariski topology.

2.2

Permanent versus determinant

From now on we will focus on the permanent versus determinant scenario. First we will redefine the objects of interest. Then we will state the conjecture that we aim to show in the new notation.

Let n < m be variables with positive integer values. We will omit n and m when possible to keep notation tidy. Let G = GL_{m^2} and let

Q_m := P^m(C^{m^2})

be the affine G-variety of homogeneous polynomials of degree m on m × m matrices. We have det_m ∈ Q_m. Let perm_{m,n} be the m,n-padded permanent polynomial on m × m matrices defined by

(a_{ij}) ↦ a_{11}^{m−n} · perm_n((a_{ij})|_n),

where (a_{ij})|_n is the lower-right n × n block in (a_{ij}). Then perm_{m,n} ∈ Q_m. For example,

perm_{3,2} = x_{11}(x_{22}x_{33} + x_{23}x_{32}) ∈ Q_3.
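For illustration (our own code, not from the thesis), the padded permanent can be computed directly from this definition; `perm` here is the usual permanent computed by brute force over all permutations.

```python
from itertools import permutations
from math import prod

def perm(M):
    """Permanent of a square matrix: sum over all permutations
    of products of entries (no signs, unlike the determinant)."""
    n = len(M)
    return sum(prod(M[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

def padded_perm(A, n):
    """The m,n-padded permanent of an m x m matrix A:
    a_11^(m-n) times the permanent of the lower-right n x n block."""
    m = len(A)
    block = [row[m - n:] for row in A[m - n:]]
    return A[0][0] ** (m - n) * perm(block)
```

For a 3 × 3 matrix A with n = 2 this evaluates a_11 · (a_22 a_33 + a_23 a_32), matching perm_{3,2} above.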

Let X_m be the G-orbit closure of det_m in Q and let Y_{m,n} be the G-orbit closure of perm_{m,n} in Q,

X_m := \overline{G · det_m},   Y_{m,n} := \overline{G · perm_{m,n}}.

As we have seen in the previous section, both X and Y have a G-variety structure.

2.5 Conjecture(Geometric version of permanent versus determinant problem).

Let n be a natural number. Let m(n) be the smallest natural number m such that P_{m,n} ∈ X_m. Then m(n) is not bounded by a polynomial in n.

2.6 Example. Trivially, perm_{1,1} = det_1, hence Y_{1,1} = X_1. So m(1) = 1. From the equality

perm ( a b ; c d ) = det ( a −b ; c d )

it follows that G · perm_{2,2} = G · det_2, hence Y_{2,2} = X_2. So m(2) = 2. Already for n = 3, the value of m(n) is unknown. We only know the bounds 5 ≤ m(3) ≤ 7 [LMR10, Gre11].
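The 2 × 2 identity is easy to check numerically; a small sketch of our own:

```python
def det2(a, b, c, d):
    # determinant of the 2x2 matrix (a b; c d)
    return a * d - b * c

def perm2(a, b, c, d):
    # permanent of the 2x2 matrix (a b; c d)
    return a * d + b * c

# substituting b -> -b turns the determinant into the permanent
for (a, b, c, d) in [(1, 2, 3, 4), (2, -3, 5, 7), (0, 1, 1, 0)]:
    assert perm2(a, b, c, d) == det2(a, -b, c, d)
```

The substitution b ↦ −b is exactly the kind of projection that realises perm_2 from det_2.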

2.3

Strategies

2.7 Polynomial obstructions. Suppose P_{m,n} is not contained in X = \overline{G · D_m}. Then, since X is closed, there exists a polynomial f ∈ C[Q] which is zero on X but not on P_{m,n}. On the other hand, clearly if a polynomial f ∈ C[Q] is zero on X but nonzero on P_{m,n}, then P_{m,n} is not contained in X. This means that exhibiting such a polynomial f is a way of proving that P_{m,n} is not contained in X.

2.8 Multiplicity obstruction. The coordinate rings C[X] and C[Y] have a canonical G-module structure. The crucial observation is that both coordinate rings have a decomposition into homogeneous parts, and the homogeneous parts are rational G-representations (Proposition C.5). Hence, both coordinate rings are completely reducible and have an isotypic decomposition.

Suppose that for some m, n we have P ∈ X. Then Y ⊆ X. So we have a G-equivariant surjection of coordinate rings C[X] ↠ C[Y] given by restriction. Hence, the multiplicity of an irreducible G-module in C[Y] is at most the multiplicity of this module in C[X]. We can thus separate C[X] and C[Y] by finding an irreducible G-module such that the multiplicity in C[Y] is strictly larger than the multiplicity in C[X]. We call such a module a multiplicity obstruction.

2.9 Occurrence obstruction. An extreme instance of a multiplicity obstruction is when an irreducible occurs in C[Y] that does not occur in C[X] at all. Such a module is called an occurrence obstruction. Mulmuley and Sohoni suggest finding an occurrence obstruction as a strategy for proving Conjecture 2.5.

2.10 Conjecture (Occurrence obstruction). If P ∉ X then there exists an irreducible G-module that appears in C[Y] but not in C[X].

In Chapter 4 we will look more closely at how to find occurrence obstructions.

2.4

Language of symmetric tensors

In the coming chapters we will sometimes view the polynomials we are concerned with as symmetric tensors, because this is a very natural notion in representation theory. Consider the tensor product T = ⊗^d C^n. It has a diagonal action of GL_n by g(v_1 ⊗ ··· ⊗ v_d) = gv_1 ⊗ ··· ⊗ gv_d, making it a GL_n-module. There is also a natural action of the symmetric group S_d on T by permuting the tensor legs: σ(v_1 ⊗ ··· ⊗ v_d) = v_{σ^{-1}(1)} ⊗ ··· ⊗ v_{σ^{-1}(d)}. The action of S_d commutes with the action of GL_n, so the subspace Sym^d C^n ⊆ ⊗^d C^n of tensors invariant under the action of S_d is a submodule. There is a GL_n-projection

P_d : ⊗^d C^n → Sym^d C^n : v ↦ (1/d!) Σ_{σ∈S_d} σ · v.

We note that Sym^d((C^n)^*) and (Sym^d C^n)^* are isomorphic as GL_n-modules and as S_d-modules. We will use this fact without mentioning it.

2.11 Theorem (See [GW09, Proposition B.2.4]). Let P^m C^n be the space of homogeneous polynomials of degree m in n variables. The map ξ : P^m C^n → Sym^m((C^n)^*) defined on monomials by

ξ(x_{i_1} x_{i_2} ··· x_{i_m}) = Σ_{σ∈S_m} σ · (e_{i_1} ⊗ e_{i_2} ⊗ ··· ⊗ e_{i_m})

is an isomorphism of GL_n-modules, called polarization. The inverse ξ^{-1} is given by sending a tensor v to the polynomial w ↦ ⟨v, w^{⊗m}⟩, where ⟨·, ·⟩ is the evaluation pairing.


2.13 Example.

D_2 = ½ (x_{11} ⊗ x_{22} + x_{22} ⊗ x_{11} − x_{12} ⊗ x_{21} − x_{21} ⊗ x_{12}) ∈ Sym² C⁴.
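To connect Theorem 2.11 with this example, the following sketch (our own code, not from the thesis) polarises det_2 = x11 x22 − x12 x21. We use the averaging convention with a factor 1/m!, an assumption on our part that reproduces the factor ½ above; with a plain sum over S_m the result would differ by a factor m!.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def polarize(poly, m):
    """Polarise a degree-m polynomial {monomial: coeff} (monomials as
    tuples of variable names) into a symmetric tensor {word: coeff},
    averaging over S_m (note: averaging, not plain summing)."""
    out = {}
    for mono, c in poly.items():
        for sigma in permutations(mono):
            out[sigma] = out.get(sigma, Fraction(0)) + Fraction(c, factorial(m))
    return {w: c for w, c in out.items() if c != 0}

det2 = {("x11", "x22"): 1, ("x12", "x21"): -1}
D2 = polarize(det2, 2)
# D2 = 1/2 (x11⊗x22 + x22⊗x11 - x12⊗x21 - x21⊗x12)
```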

Appendix B

Representation theory

In this chapter we briefly collect some basic concepts from representation theory, mostly without proofs. The first paragraph tries to give some intuition for representations for those who have never seen them. The later paragraphs assume the reader is familiar with representations. We refer to Fulton and Harris [FH91] for an accessible introduction to representations and to Fulton [Ful97] for a gentle introduction based on the theory of Young tableaux.

B.1 Intuition. Groups often arise as a description of the symmetries of an object; such objects are, for example, geometric objects or polynomials.

Groups ←— symmetries —— Objects

In the opposite direction, given an abstract group, one could try to find objects for which the group describes symmetries.

Groups —— representations —→ Objects

For example, consider an equilateral triangle in the plane. Label the vertices by 1, 2 and 3.

[Figure: an equilateral triangle with vertices labelled 1, 2 and 3.]

We can think of the symmetries of this object as all permutations of 1, 2 and 3; this yields the symmetric group S3. On the other hand, we can try to find

objects for which S3provides symmetries. Consider the vector space

V = C3

with standard basis e1, e2, e3. The group S3 naturally acts on V by permuting

the basis elements according to their label. We say V is a representation of S3.

We see that our action of S_3 leaves the line spanned by e_1 + e_2 + e_3 invariant. It turns out that the representation V decomposes as a direct sum W_1 ⊕ W_2, where W_1 := {(u, v, w) : u + v + w = 0} and W_2 := {(z, z, z)} are S_3-subrepresentations that cannot be decomposed further. Note that in W_1 we can recognize the triangle we started with. Representation theory is concerned with describing the irreducible representations and describing the decomposition of representations into irreducibles.

B.1

Representations

B.2 Definition. Let G be a group. A representation of G is a finite dimensional

complex vector space V together with a homomorphism of groups ρ: G → GL(V ).

Let g ∈ G, v ∈ V. We abbreviate ρ(g)(v) to g · v or gv. A representation of G is ‘the same thing as’ a C[G]-module, where C[G] is the group algebra of G over C. We will thus sometimes use the language of modules. A morphism of G-representations V → W is a morphism of vector spaces φ : V → W such that g · φ(v) = φ(g · v). We also call this a G-morphism or G-equivariant linear map. These morphisms form a vector space, which we denote by Hom_G(V, W). An isomorphism of representations is an invertible morphism of representations. The direct sum of representations V and W is the vector space V ⊕ W with action g(v ⊕ w) = gv ⊕ gw. The tensor product of representations V and W is the vector space V ⊗ W with action on simple tensors g(v ⊗ w) = gv ⊗ gw. The dual of a representation V is the dual space V^* with action (gφ)(v) = φ(g^{-1}v).

Let V be a representation of G. Then W is a subrepresentation of V if W is a subspace of V and the set GW = {g · w : g ∈ G, w ∈ W } equals W . The zero representation is the zero-dimensional vector space with trivial action. A representation V is irreducible if the only subrepresentations are V itself and the zero representation.

Let G and H be groups. The irreducible representations of the product group G × H are the tensor products V ⊗ W with V an irreducible G-representation and W an irreducible H-representation.

B.3 Schur’s Lemma. Let V_1 and V_2 be representations of a group. Let φ : V_1 → V_2 be a nonzero morphism of representations. (i) If V_1 is irreducible, then φ is injective; (ii) if V_2 is irreducible, then φ is surjective. Thus, if both V_1 and V_2 are irreducible, then φ is an isomorphism.

Let V be an irreducible representation of a group and let φ : V → V be a morphism of representations. Then φ = λ · Id for some λ in C.

Proof. The kernel K of φ is a subrepresentation of V1. Since φ is nonzero, K is

not V1. By irreducibility of V1, K is the zero space. Similarly, the image I of φ

is a subrepresentation of V2. Since φ is nonzero, I is not the zero space. By

irreducibility of V2, I is V2.

The field C is algebraically closed so the characteristic polynomial p(x) = det(φ − x Id) has a root λ. Then φ − λ Id is a morphism of representations V → V, which is not an isomorphism since by construction its determinant is zero. By the above, it is the zero map.

B.4 Definition. Let G be a group. A semisimple or completely reducible

representation of G is a direct sum of irreducible representations.

We note that not all representations are semisimple. The ones that we deal with, however, are.


B.5 Isotypic components. Let G be a group and let V be a completely reducible representation of G, that is, V decomposes as a direct sum of irreducibles. Let W be an irreducible representation of G. The isotypic component of type W in V is the direct sum of the irreducibles occurring in the decomposition of V that are isomorphic to W. The decomposition of V into irreducibles is not unique. The decomposition into isotypic components, however, is unique. Let c_W be the number of times W occurs in a decomposition of the isotypic component of type W in V. We then write

V = ⊕_W c_W · W.

B.2

Partitions, diagrams and tableaux

We will use the language of partitions, diagrams and tableaux for describing the irreducible representations of the symmetric group, the general linear group and the special linear group in the sections to come.

B.6 Partitions. Let n be a nonnegative integer. A partition of n is a finite nonincreasing sequence of nonnegative integers λ = (λ_1, λ_2, . . . , λ_k). We denote the number of nonzero elements in λ by ℓ(λ) and say λ has length ℓ(λ). We say λ has weight n and write |λ| = n. We write λ ⊢_k n to say λ is a partition of n of length at most k. We will identify partitions that only differ by trailing zeros. We use the shorthand notation (a_1^{m_1}, a_2^{m_2}, . . . , a_n^{m_n}) for the partition consisting of m_1 times a_1 followed by m_2 times a_2, etc.

To a partition λ we associate a diagram consisting of boxes arranged in rows of length λ_1, λ_2, etc. For example, λ = (4, 2, 1) corresponds to a diagram with four boxes in the first row, two in the second and one in the third. We say the diagram has shape λ. The transpose λ^τ of a partition λ is the sequence of column lengths of the diagram of shape λ. A filling of a diagram is the result of writing natural numbers in the boxes of the diagram. A semistandard tableau is a filling where the entries are nondecreasing in each row from left to right and increasing in each column from top to bottom. A standard tableau or tableau is a semistandard tableau of shape λ where the entries are 1, 2, . . . , |λ|. For example, for shape (4, 3, 2):

1 1 1 1        1 2 3 7
2 2 2          4 5 6
3 3            8 9

(left: a semistandard tableau; right: a standard tableau)
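The transpose can be computed as column counts of the diagram; a small helper of our own:

```python
def transpose(p):
    """Transpose of a partition: the j-th entry counts the parts of p
    that are greater than j, i.e. the column lengths of the diagram."""
    if not p:
        return ()
    return tuple(sum(1 for part in p if part > j) for j in range(p[0]))
```

For example, `transpose((4, 2, 1))` gives `(3, 2, 1, 1)`, and transposing twice recovers the original partition.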

B.7 Generalised partitions. A generalised partition is a finite nonincreasing sequence of integers λ = (λ_1, λ_2, . . . , λ_k). The dual λ^* of λ is (−λ_k, . . . , −λ_1).


B.3

Symmetric group

In this section we describe the representations of the symmetric group. By the following theorem, the representations of the symmetric group decompose into irreducibles. We will thus focus on describing the irreducibles.

B.8 Theorem. Representations of finite groups are completely reducible.

Let λ be a partition of n and T(λ) the tableau of shape λ obtained by filling in the numbers 1, . . . , n in text reading direction. For example, T(3, 2, 1) is

1 2 3
4 5
6

Let R(λ) be the subgroup of Sn consisting of permutations preserving the rows

of T (λ). Likewise, let C(λ) be the subgroup of Sn consisting of permutations

preserving the columns of T(λ). The Young symmetrisers of λ are the following elements in the group ring C[S_n]:

a_λ := Σ_{σ∈R(λ)} σ,   b_λ := Σ_{µ∈C(λ)} ε(µ) µ,   c_λ := b_λ a_λ.

B.9 Example. Let λ = (2, 1). Then R(λ) = {id, (12)} and C(λ) = {id, (13)}.

So the Young symmetrisers are

a_λ = id + (12),   b_λ = id − (13),   c_λ = id + (12) − (13) − (123).
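The computation in this example can be checked mechanically by multiplying in the group ring. In the sketch below (our own code), permutations are tuples of images and the product applies the right factor first, matching the convention (13)(12) = (123).

```python
def compose(p, q):
    """Group product p*q: apply q first, then p.
    Permutations are tuples of images of 1..n."""
    return tuple(p[q[i] - 1] for i in range(len(p)))

def algebra_product(f, g):
    """Product of two group-algebra elements given as {permutation: coeff}."""
    out = {}
    for p, a in f.items():
        for q, b in g.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

ID, S12, S13, C123 = (1, 2, 3), (2, 1, 3), (3, 2, 1), (2, 3, 1)
a_lam = {ID: 1, S12: 1}                # a_lambda = id + (12)
b_lam = {ID: 1, S13: -1}               # b_lambda = id - (13)
c_lam = algebra_product(b_lam, a_lam)  # c_lambda = b_lambda a_lambda
# c_lam equals id + (12) - (13) - (123)
```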

B.10 Theorem. Let λ be a partition of n. The subspace [λ] := C[Sn]cλ of

C[Sn] is an irreducible representation of Sn under left multiplication. Every

irreducible representation of Sn is isomorphic to [λ] for a unique partition λ

of n.

B.11 Corollary. The representations of the symmetric group are self-dual.

Proof. It is enough to prove the statement for irreducible representations, since ‘the dual commutes with the direct sum’. From character theory, we know that irreducible S_n-representations (ρ, W) are uniquely determined by their character χ_W : G → C : g ↦ tr(ρ(g)) and that the character of the dual representation W^* is the conjugate character of W, χ_{W^*}(g) = \overline{χ_W(g)}. Let V be an irreducible S_n-representation. By Theorem B.10 the action of S_n on V is given by matrices with rational entries, so the character χ_V is real-valued and hence equal to its conjugate; that is, V is self-dual.

B.12 Corollary. Let λ and µ be partitions of d. We have

dim([λ] ⊗ [µ])^{S_d} = 1 if λ = µ, and 0 otherwise.

Proof. By Corollary B.11, [λ] is isomorphic to [λ]^*, so as S_d-representations, [λ] ⊗ [µ] is isomorphic to [λ]^* ⊗ [µ]. The latter is naturally isomorphic to Hom([λ], [µ]), where S_d acts on Hom([λ], [µ]) by (σφ)(v) = σφ(σ^{-1}v) for σ in S_d, φ in Hom([λ], [µ]) and v in [λ]. As vector spaces, (Hom([λ], [µ]))^{S_d} equals Hom_{S_d}([λ], [µ]). By Schur’s Lemma, the dimension of the latter is one if λ = µ and zero otherwise.

B.4

General linear group

Let G = GLn. A representation V of G is called polynomial if the corresponding

homomorphism ρ : G → GL(V ) is given by polynomials, and rational if ρ is given by rational functions. We assume all our representations to be rational.

B.13 Theorem. The rational representations of the general linear group GLn

and the special linear group SLn are completely reducible. In the more general

theory of linear algebraic groups we call this property linearly reductive.

Irreducibles and highest weight vectors

Let Tn be the subgroup of diagonal matrices in GLn. We can thus view a

GLn-representation as a Tn-representation by restricting the action.

B.14 Proposition. A GLn-representation V decomposes as a representation

of Tn as

V = ⊕_{z∈Z^n} V_z,   (B.1)

where T_n acts on V_z via diag(α_1, . . . , α_n) · v = α_1^{z_1} ··· α_n^{z_n} · v. We say the vectors in V_z are weight vectors of weight z.

Let Bn ⊆GLn be the subgroup of upper triangular matrices. A nonzero

weight vector v in V is a highest weight vector if the line Cv is stable under Bn.

The weight of v is then a generalised partition λ of length at most n.

B.15 Theorem. The GLn-submodule generated by a highest weight vector is

irreducible and a GLn-module is irreducible if and only if it contains exactly

one Bn-stable line. For each generalised partition λ of length at most n, there

exists an irreducible GLn-module {λ} with highest weight vector of weight λ.

Two irreducible GLn-modules are isomorphic if and only if their highest weight

vectors have the same highest weight.

We thus parametrise the irreducible GL_n-representations by generalised partitions λ of length at most n. We will denote the irreducible GL_n-representations by {λ}. For integers k we denote by D^{⊗k} the one-dimensional representation GL_n → C^× given by g ↦ det(g)^k. The representation D is the determinant representation, which can be realised as ∧^n C^n. The following theorem says that all rational representations can be obtained from a polynomial representation by tensoring with the determinant representation.

B.16 Theorem. The irreducible polynomial representations of GL_n are the representations {λ} for λ a partition with at most n parts. Let α be a generalised partition with at most n parts. The irreducible GL_n-module of highest weight α can be realised as {λ} ⊗ D^{⊗k}, for a partition λ with at most n parts and k ∈ Z with λ_i = α_i − k ≥ 0.

B.17 Example. The representation corresponding to the empty partition is the trivial representation. For natural d the GL_n-representation {(d)} is the vector space Sym^d C^n with action induced by the action on ⊗^d C^n. For natural d ≤ n the representation {(1^d)} is the vector space ∧^d C^n with action induced by the action on ⊗^d C^n.

Special linear group

The representation theory of the special linear group is very similar to the representation theory of the general linear group. Essentially, the theories are the same except that as an SLn-representation the determinant representation

is trivial. In terms of highest weights this means the following.

B.18 Theorem. Every GL_n-module is naturally an SL_n-module via the natural inclusion SL_n ↪ GL_n. On the other hand, two irreducible GL_n-modules {λ} and {µ} are isomorphic as SL_n-modules if and only if λ − µ is a constant sequence.

Plethysms

Plethysms are compositions of representations. Let G be a group. Let ρ : G → GL_k be a representation of G and let τ : GL_k → GL_m be a representation of GL_k. The plethysm of ρ and τ is the G-representation given by τ ◦ ρ. We will be concerned with the case G = GL_ℓ and in particular with the plethysm Sym^d Sym^n C^ℓ. We note that this plethysm has a nice description as the image of a GL_ℓ-projection

P_{d[n]} : ⊗^{dn} C^ℓ → ⊗^{dn} C^ℓ,

which is the composition of the natural ‘inner symmetrisation’ and the natural ‘outer symmetrisation’,

P^{inner}_{d[n]} : ⊗^{dn} C^ℓ → ⊗^d Sym^n C^ℓ,   P^{outer}_{d[n]} : ⊗^d Sym^n C^ℓ → Sym^d Sym^n C^ℓ.

For example, for d = n = 2 and ℓ = 3,

P_{2[2]}(|1123⟩) = ¼ (|1123⟩ + |2311⟩ + |1132⟩ + |3211⟩).
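This example can be verified computationally. The sketch below is our own (basis tensors are tuples of indices, coefficients are exact rationals); it applies the inner and then the outer symmetrisation as averages over the respective groups.

```python
from fractions import Fraction
from itertools import permutations

def plethysm_projection(word, d, n):
    """Apply P_{d[n]} (inner then outer symmetrisation) to the basis
    tensor |word>, where word is a tuple of d*n indices.
    Returns {basis word: rational coefficient}."""
    assert len(word) == d * n
    blocks = [tuple(word[i * n:(i + 1) * n]) for i in range(d)]
    # inner symmetrisation: average over S_n acting inside each block
    terms = [((), Fraction(1))]
    for block in blocks:
        perms = list(permutations(block))
        terms = [(prefix + (p,), c / len(perms))
                 for prefix, c in terms for p in perms]
    # outer symmetrisation: average over S_d permuting the blocks
    result = {}
    outers = list(permutations(range(d)))
    for blocks_perm, c in terms:
        for sigma in outers:
            w = tuple(x for i in sigma for x in blocks_perm[i])
            result[w] = result.get(w, Fraction(0)) + c / len(outers)
    return {w: c for w, c in result.items() if c != 0}
```

Applied to the word (1, 1, 2, 3) with d = n = 2, this returns the four words above, each with coefficient ¼.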

B.5

Schur-Weyl duality

Consider the tensor space W := ⊗^d C^n. The canonical action of GL_n on C^n makes W a GL_n-module: for g ∈ GL_n,

g(v_1 ⊗ ··· ⊗ v_d) = gv_1 ⊗ ··· ⊗ gv_d.

At the same time, permuting the tensor legs makes W an S_d-module: for σ ∈ S_d,

σ(v_1 ⊗ ··· ⊗ v_d) = v_{σ^{-1}(1)} ⊗ ··· ⊗ v_{σ^{-1}(d)}.

Note that the actions of GL_n and S_d commute. We can thus view W as a GL_n × S_d-module. The following famous result will be very helpful for us.

B.19 Theorem (Schur-Weyl duality). As GL_n × S_d-modules,

⊗^d C^n = ⊕_{λ ⊢_n d} {λ} ⊗ [λ].
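A cheap numerical consistency check of Schur-Weyl duality compares dimensions on both sides: n^d = Σ_{λ ⊢_n d} dim{λ} · dim[λ]. The sketch below (our own code) computes dim{λ} by the Weyl dimension formula and dim[λ] by the hook length formula; neither formula is stated in this appendix, so treat them as imported facts.

```python
from math import factorial, prod

def partitions(d, k):
    """Partitions of d into at most k parts, largest part first."""
    def gen(rem, maxpart, parts_left):
        if rem == 0:
            yield ()
            return
        if parts_left == 0:
            return
        for first in range(min(rem, maxpart), 0, -1):
            for rest in gen(rem - first, first, parts_left - 1):
                yield (first,) + rest
    return list(gen(d, d, k))

def dim_gl(lam, n):
    """Dimension of the irreducible GL_n-module {lam} (Weyl dimension formula)."""
    l = list(lam) + [0] * (n - len(lam))
    num = prod(l[i] - l[j] + j - i for i in range(n) for j in range(i + 1, n))
    den = prod(j - i for i in range(n) for j in range(i + 1, n))
    return num // den

def dim_sym(lam):
    """Dimension of the irreducible S_d-module [lam] (hook length formula)."""
    d = sum(lam)
    hooks = prod(
        (lam[i] - j) + sum(1 for r in lam if r > j) - i - 1
        for i in range(len(lam)) for j in range(lam[i])
    )
    return factorial(d) // hooks

def schur_weyl_dim(n, d):
    """Dimension of the right-hand side of Schur-Weyl duality; should be n^d."""
    return sum(dim_gl(p, n) * dim_sym(p) for p in partitions(d, n))
```

For instance, for n = 2 and d = 3 the partitions (3) and (2, 1) contribute 4 · 1 + 2 · 2 = 8 = 2³.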

B.6

Branching

Let φ : H → G be a group homomorphism. Then a G-representation (ρ, V) naturally yields an H-representation (ρ ◦ φ, V), which we write as V↓^G_H. If V is irreducible, then V↓^G_H is not necessarily irreducible. The act of viewing V as an H-representation via φ is called branching, and a description of the decomposition of V↓^G_H is called a branching law. In this section we will be concerned with branchings for the diagonal embeddings S_d → S_d × S_d and GL_n → GL_n × GL_n.

Via Schur-Weyl duality, these coefficients also occur in the branching laws for the homomorphisms GL_a × GL_b → GL_{ab} : (A, B) ↦ A ⊗ B and S_a × S_b → S_{a+b}, where S_a permutes the symbols 1, 2, . . . , a and S_b permutes the symbols a + 1, a + 2, . . . , a + b.

Kronecker coefficients

Consider the diagonal embedding S_d → S_d × S_d. Let λ and µ be partitions of d. Then there exist numbers k_{λµν} such that the irreducible S_d × S_d-representation [λ] ⊗ [µ] restricted to S_d decomposes as

[λ] ⊗ [µ] ≅ ⊕_{ν⊢d} k_{λµν} [ν].

We call these numbers Kronecker coefficients. We use a symmetric notation for the three parameters of the Kronecker coefficient since the value is invariant under permuting the three parameters, by the following proposition.

B.20 Proposition. Let λ, µ and ν be partitions of d. Then

k_{λµν} = dim([λ] ⊗ [µ] ⊗ [ν])^{S_d},

that is, k_{λµν} is the dimension of Hom_{S_d}([λ], [µ] ⊗ [ν]).

Proof. The S_d-representation [λ] ⊗ [µ] is isomorphic to ⊕_{ρ⊢d} k_{λµρ} [ρ]. Consider the S_d × S_d-representation obtained by tensoring the S_d-representation [λ] ⊗ [µ] with [ν]. Since the tensor product ‘commutes’ with the direct sum, this representation is isomorphic to ⊕_{ρ⊢d} k_{λµρ} ([ρ] ⊗ [ν]). Taking S_d-invariants of the latter yields the vector space ⊕_{ρ⊢d} k_{λµρ} ([ρ] ⊗ [ν])^{S_d}, which by Corollary B.12 has dimension equal to k_{λµν}.

Via Schur-Weyl duality Kronecker coefficients occur in branching laws of the general linear group.

B.21 Proposition. Let λ ⊢_{ab} d. The group homomorphism GL_a × GL_b → GL_{ab} : (A, B) ↦ A ⊗ B gives the decomposition

{λ}↓^{GL_{ab}}_{GL_a × GL_b} = ⊕_{µ ⊢_a d, ν ⊢_b d} Hom_{S_d}([λ], [µ] ⊗ [ν]) ⊗ {µ} ⊗ {ν},

where the action on the Hom-space is trivial, or equivalently,

{λ}↓^{GL_{ab}}_{GL_a × GL_b} = ⊕_{µ ⊢_a d, ν ⊢_b d} k_{λµν} · {µ} ⊗ {ν}.
