
Tensor network for the Motzkin spin chain

Lars van Geest

August 6, 2019

Bachelor thesis Mathematics and Physics & Astronomy
Supervisors: dr. Maris Ozols, dr. Michael Walter

Institute of Physics

Korteweg-de Vries Institute for Mathematics
Faculty of Sciences


Abstract

In this thesis we study the Motzkin spin-1 chain. This system is interesting in quantum information theory because it is both critical and frustration-free. We first study the ground state of this system by finding a correspondence between orthonormal basis states and combinatorial paths. Using this correspondence we find the ground state, derive the Schmidt coefficients and thereby give an upper bound on the entanglement entropy. We then use tensor networks, networks of tensors that can be seen as a generalisation of quantum circuits, in our study of the ground state. We construct a renormalization network that prepares the ground state of the Motzkin spin chain. The resulting network is scale-invariant, a property that may be linked to the criticality of the system.

Title: Tensor network for the Motzkin spin chain

Author: Lars van Geest, lars.vangeest@student.uva.nl, 11318724
Supervisors: dr. Maris Ozols, dr. Michael Walter

Second graders: prof. dr. Eric Opdam, dr. Vladimir Gritsev
End date: August 6, 2019

Institute of Physics University of Amsterdam

Science Park 904, 1098 XH Amsterdam http://www.iop.uva.nl

Korteweg-de Vries Institute for Mathematics University of Amsterdam

Science Park 904, 1098 XH Amsterdam http://www.kdvi.uva.nl


Contents

1. Introduction
2. Mathematical preliminaries
   2.1. Probability distributions
   2.2. Tensors
   2.3. Tensor contraction
   2.4. Tensor networks
3. Physical preliminaries
   3.1. Quantum systems and quantum states
   3.2. Schmidt decomposition
   3.3. Entanglement
   3.4. Hamiltonians
   3.5. Frustration and critical systems
4. Motzkin spin chain
   4.1. Finding the ground state
   4.2. Motzkin paths
   4.3. Schmidt decomposition
5. The renormalization network for the Motzkin state
   5.1. The Motzkin state as a tensor network
   5.2. Binary height tensor network
   5.3. Triangle tensor
   5.4. Zipping up
6. Conclusion
Popular summary
A. The Catalan numbers
B. Shannon information


1. Introduction

In this thesis we will study the Motzkin spin-1 chain. This is a one-dimensional chain of spin-1 particles with a Hamiltonian that is a sum of local Hamiltonians acting on nearest-neighbour interactions only. The system is interesting because it is critical and frustration-free. A frustration-free system is a system with the property that the ground state minimises the energy of every local Hamiltonian. Criticality is the property that, as the number of particles increases, the energy difference between the ground state and the first excited state shrinks to zero. It was an open question in physics whether a frustration-free Hamiltonian can describe a critical spin-1 chain [3], but this system shows that it is possible. This is remarkable, since [6] showed that there is no such spin-1/2 chain.

We mainly follow Bravyi et al. [3] in searching for the ground state of the system. We do this by establishing a correspondence between orthonormal basis states of the spin-1 system and paths consisting only of up, down and flat moves. Using these paths we find the ground state, give its Schmidt decomposition, prove that the Hamiltonian is frustration-free and give measures of the entanglement of the system.

After we have done this, we construct a tensor network for the ground state of the Motzkin spin chain. Tensor networks graphically describe the building blocks of a general tensor. They were introduced by Roger Penrose in 1970 [7] and have been used for many different applications since; for example, the quantum circuits of Deutsch can be seen as a specific case of tensor networks. Tensor networks are increasingly used in the analysis of highly entangled 1D critical quantum systems, and they can give insight into the entanglement structure of a many-body system [1].

Again using the path correspondence, we build a tensor network for the Motzkin ground state that is self-similar. Self-similarity is often found in physics when describing a critical point or a phase transition; think of the renormalization group treatment of ferromagnets. Because of this we hope to connect the self-similarity of the network to the criticality of the system.


2. Mathematical preliminaries

In this chapter we will define the mathematical tools which we need to analyse physical systems.

2.1. Probability distributions

Let X be a random variable with a finite event space and let (p_1, p_2, ..., p_n) be a probability distribution, with p_i the probability corresponding to the event X = i. What we want is a function that measures how random this distribution is: the more spread out, the higher the function value.

The first thing one could think of is the number of values that can be realised by X, which is the number of non-zero probabilities in the distribution. Our first try at measuring the spread of a distribution is called the rank R(X): the number of non-zero terms in the probability distribution of X.

Since this measure discards a lot of information, we try to find a more accurate measure, using the Shannon information. The Shannon information I of an event i happening with probability p_i is defined as I(p_i) = log_2(1/p_i) = -log_2(p_i). For a motivation and intuition of this quantity, take a look at Appendix B.

The entropy of a random variable, H(X), is the expected information we get out of an observation,
$$H(X) = \sum_{i \in X} p_i \cdot I(p_i) = \sum_{i \in X} p_i \cdot \log_2\left(\frac{1}{p_i}\right).$$
One might encounter a problem when p_i = 0, but we define 0 \cdot \log_2(1/0) = 0.

Taking the example of 10/90 and 50/50 coin flips, we find
$$H(C_{10/90}) = 0.1 \cdot I(0.1) + 0.9 \cdot I(0.9) = 0.1 \cdot \log_2(10) + 0.9 \cdot \log_2\left(\frac{1}{0.9}\right) \approx 0.469,$$
$$H(C_{50/50}) = 0.5 \cdot I(0.5) + 0.5 \cdot I(0.5) = I(0.5) = \log_2(2) = 1.$$

For this specific example, the entropy seems to measure how “spread-out” a distribution is. Let us see whether we can put these numbers in perspective and give an upper bound for the entropy. We also relate the entropy to the rank we defined earlier.
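As a quick numerical illustration (a minimal sketch of our own, not part of the thesis; the helper names are hypothetical), both measures can be computed directly from a distribution:

```python
import math

def rank(p):
    """R(X): the number of non-zero probabilities."""
    return sum(1 for pi in p if pi > 0)

def entropy(p):
    """H(X) = sum_i p_i * log2(1/p_i), with the convention 0 * log2(1/0) = 0."""
    return sum(pi * math.log2(1 / pi) for pi in p if pi > 0)

print(entropy([0.1, 0.9]))  # ~0.469, the 10/90 coin
print(entropy([0.5, 0.5]))  # 1.0, the 50/50 coin
print(rank([0.1, 0.9]))     # 2
```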

Theorem 2.1. For every random variable X with an event space of size n and distribution (p_1, ..., p_n),
$$0 \le H(X) \le \log_2(R(X)) \le \log_2(n).$$


Proof. We know by the definition of information that it is always non-negative, so the expected value of information, H(X), will be non-negative as well.

For the third inequality we use that R(X) ≤ n, since n is the size of the event space. Thus the number of events with non-zero probabilities is bounded by n. Therefore log2(R(X)) ≤ log2(n), because the log2 function is increasing.

Now for the second inequality, we create a second random variable X̂ which has value 1/p_i with probability p_i ≠ 0 and 0 if p_i = 0 (in contrast to X, which has value i with probability p_i). Now E(X̂) = R(X), so we find
$$H(X) = \sum_{i=1}^{n} p_i \log_2\left(\frac{1}{p_i}\right) = E(\log_2(\hat{X})) \le \log_2(E(\hat{X})) = \log_2(R(X)),$$
where we used Jensen's inequality and the fact that log_2 is a concave function.

An immediate consequence of this theorem is that the uniform distribution on n events will always be the distribution of maximum entropy, since
$$\sum_{i=1}^{n} \frac{1}{n} \cdot I\left(\frac{1}{n}\right) = \sum_{i=1}^{n} \frac{1}{n} \log_2(n) = \log_2(n).$$

More informally, this states that the entropy function reaches its maximum when the distribution is fully spread out. This explains the name of the function, because the entropy in thermal physics is used to describe the disorder, or randomness, of a system.

2.2. Tensors

In the following sections we will start with the preliminaries of tensor networks. The papers of Biamonte and Bergholm [2] and Bridgeman and Chubb [4] were used to find the most suitable definitions of tensor networks. In order to get there we first take a look at tensors.

Definition 2.1. Let V_1, ..., V_n be vector spaces. An n-tensor T on V_1, ..., V_n is an element of the n-fold tensor product space $\bigotimes_{i=1}^{n} V_i$.

If d_k denotes the dimension of V_k and we pick a basis (e_{i_k} | i_k ∈ {1, ..., d_k}) for each V_k, we can write our tensor as
$$T = \sum_{i_1=1}^{d_1} \sum_{i_2=1}^{d_2} \cdots \sum_{i_n=1}^{d_n} T_{i_1, i_2, \dots, i_n} \cdot e_{i_1} \otimes \cdots \otimes e_{i_n}, \qquad (2.1)$$
where the T_{i_1,i_2,...,i_n} are scalars in the field over which we take the vector spaces. In this thesis we will usually take V_k = C^{d_k} with e_i the standard basis vectors, so tensors are given by their coefficients T_{i_1,...,i_n}, which we will call the (tensor) entries from now on.


An n-tensor can thus be viewed as an n-dimensional array of numbers given by the entries T_{i_1,...,i_n} (this leads to a correspondence 1-tensors ↔ vectors and 2-tensors ↔ matrices), where i_1 denotes the row, i_2 the column, i_3 the depth, et cetera.

To denote an n-tensor pictorially, we use a node with n half-edges departing from it, where each half-edge corresponds to an index position. If we put two different tensors in the same picture, the whole diagram can be seen as a new tensor which is the tensor product of the two [4].

Definition 2.2. Let A be an m-tensor with entries A_{i_1,...,i_m} and B an n-tensor with entries B_{j_1,...,j_n}. We define the tensor product A ⊗ B as the (m + n)-tensor with entries
$$(A \otimes B)_{i_1, \dots, i_m, j_1, \dots, j_n} = A_{i_1, \dots, i_m} \cdot B_{j_1, \dots, j_n}.$$

Remark. The associativity of the tensor product follows directly from its definition. Therefore we can draw multiple tensors in one diagram and do not need to specify the order in which they are multiplied.

(a) A single 5-tensor.

(b) Putting a 5-tensor and a 3-tensor in one diagram gives an 8-tensor by using the tensor product. We find C = A ⊗ B, so for the entries C_{k_1,k_2,k_3,k_4,k_5,i_1,i_2,i_3} = A_{k_1,k_2,k_3,k_4,k_5} \cdot B_{i_1,i_2,i_3}.

Figure 2.1.
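A minimal numerical sketch of our own (not from the thesis): in numpy the tensor product of Definition 2.2 is an outer product of the entry arrays.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 2, 2, 2, 2))  # a 5-tensor
B = rng.random((2, 2, 2))        # a 3-tensor

# (A ⊗ B)_{k1..k5,i1,i2,i3} = A_{k1..k5} * B_{i1,i2,i3}
C = np.tensordot(A, B, axes=0)   # axes=0: no contracted axes, pure tensor product
print(C.shape)                   # (2, 2, 2, 2, 2, 2, 2, 2), an 8-tensor
print(np.isclose(C[1, 0, 1, 1, 0, 0, 1, 0], A[1, 0, 1, 1, 0] * B[0, 1, 0]))  # True
```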

2.3. Tensor contraction

The tensor product is not the only operation we want to perform on tensors. We will introduce tensor contraction, a generalisation of matrix multiplication.

Let A_{i_1,...,i_k,...,i_l,...,i_n} be the entries of an n-tensor A. If the index sets of i_k and i_l are equal, denoted Ind_A(k) = Ind_A(l) for certain k, l ∈ {1, ..., n}, we can contract the corresponding indices by connecting the half-edges i_k and i_l. We get a resulting tensor B, with entries given by
$$B_{i_1, \dots, \widehat{i_k}, \dots, \widehat{i_l}, \dots, i_n} = \sum_{x \in \mathrm{Ind}_A(k)} A_{i_1, \dots, x, \dots, x, \dots, i_n}, \qquad (2.2)$$
where x is placed at index positions k and l. The hats on indices mean they are omitted, so the resulting tensor B is an (n − 2)-tensor. The picture of this contraction is drawn in figure 2.2.

Figure 2.2.: On the left side a general tensor A. On the right side the contraction B, where we connected half-edges i_k and i_l. The resulting tensor B is given by equation 2.2.

Let us look at matrix multiplication and verify that it is a special case of tensor contraction. If we have two matrices A and B with entries A_{i_1,i_2} and B_{j_1,j_2}, we assume Ind_A(2) = Ind_B(1). We first put both 2-tensors in the same diagram to create the 4-tensor A ⊗ B, then we contract i_2 with j_1 by connecting the half-edge i_2 with the half-edge j_1, see figure 2.3. We find
$$C_{i_1, j_2} = \sum_{x \in \mathrm{Ind}_{A \otimes B}(2)} (A \otimes B)_{i_1, x, x, j_2} = \sum_{x \in \mathrm{Ind}_A(2)} A_{i_1, x} \cdot B_{x, j_2}. \qquad (2.3)$$


Figure 2.3.: On the left side, the matrices A and B as disconnected 2-tensors; this is the tensor A ⊗ B. On the right side the matrix multiplication is given by connecting the half-edge i_2 of A and the half-edge j_1 of B. We replace both indices with an x over which we sum. The resulting matrix C is given by formula 2.3.

As another example we take a look at a square matrix M. We can denote the trace by contracting the matrix with itself, as drawn in figure 2.4.

Figure 2.4.: On the left-hand side a matrix M. On the right-hand side the contraction $\sum_i M_{i,i} = \mathrm{Tr}(M)$, obtained by connecting the two half-edges of M.
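Both examples can be checked with numpy's einsum, which implements exactly this "sum over paired indices" rule (a sketch of our own, not part of the thesis):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 4))
B = rng.random((4, 5))
M = rng.random((3, 3))

# Matrix multiplication as a contraction: C_{i1,j2} = sum_x A_{i1,x} B_{x,j2}
C = np.einsum('ix,xj->ij', A, B)
print(np.allclose(C, A @ B))       # True

# Trace as a self-contraction: sum_i M_{i,i}
t = np.einsum('ii->', M)
print(np.isclose(t, np.trace(M)))  # True
```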


2.4. Tensor networks

We have defined tensor contraction for one pair of indices. However, we want to end up having a network of tensors, so we need to know if the order of contracting multiple pairs of indices is important.

Let A be a general 4-tensor, with entries A_{i_1,i_2,i_3,i_4}. If we contract (i_1, i_3) first to get tensor B, and afterwards (i_2, i_4) to get the 0-tensor (the number) C, we find
$$B_{i_2, i_4} = \sum_{y \in \mathrm{Ind}_A(1)} A_{y, i_2, y, i_4}, \qquad C = \sum_{x \in \mathrm{Ind}_B(2)} B_{x, x} = \sum_{x \in \mathrm{Ind}_A(2)} \sum_{y \in \mathrm{Ind}_A(1)} A_{y, x, y, x}.$$
If we do it the other way around, first creating B' by contracting (i_2, i_4) and later contracting (i_1, i_3) to get C', we find
$$B'_{i_1, i_3} = \sum_{x \in \mathrm{Ind}_A(2)} A_{i_1, x, i_3, x}, \qquad C' = \sum_{y \in \mathrm{Ind}_{B'}(1)} B'_{y, y} = \sum_{y \in \mathrm{Ind}_A(1)} \sum_{x \in \mathrm{Ind}_A(2)} A_{y, x, y, x}.$$
We find that C = C'. Concluding, it does not matter in which order we contract the paired indices. The example above is summarised in the diagram in figure 2.5. We can prove this for any n-tensor using the same reasoning. Keeping this in mind, we can define the notion of tensor networks.

Figure 2.5.: First contracting indices (i1, i3) and later (i2, i4)


A tensor network is a collection of tensors together with the information of how to contract them (paired indices). Since we have proven that the order in which we contract does not matter, tensor networks can be depicted as graphs, with the vertices as tensors, the edges indicating contractions, and the remaining half-edges representing the remaining indices of the resulting tensor. An example is given in figure 2.6.

Figure 2.6.: The contraction of tensors W, X, Y, Z with entries W_{w_1}, X_{x_1,...,x_4}, Y_{y_1,y_2,y_3} and Z_{z_1,z_2,z_3}. We first take the tensor product of all the tensors and afterwards contract the indices in any order we like. The resulting tensor R becomes
$$R_{x_1, y_1, z_2} = \sum_{a,b,c,d} W_a \cdot X_{x_1, b, c, a} \cdot Y_{y_1, c, b} \cdot Z_{d, z_2, d}.$$
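A hedged sketch of our own: the whole network of figure 2.6 is a single einsum call, where repeated letters are the contracted edges and the remaining letters are the free half-edges.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.random(2)             # W_a
X = rng.random((2, 2, 2, 2))  # X_{x1,b,c,a}
Y = rng.random((2, 2, 2))     # Y_{y1,c,b}
Z = rng.random((2, 2, 2))     # Z_{d,z2,d}

# R_{x1,y1,z2} = sum_{a,b,c,d} W_a X_{x1,b,c,a} Y_{y1,c,b} Z_{d,z2,d}
# ('dzd' contracts Z's first and third legs with each other, a self-loop)
R = np.einsum('a,xbca,ycb,dzd->xyz', W, X, Y, Z)
print(R.shape)  # (2, 2, 2)
```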

Remark. The tensor product also commutes with the contraction of indices; this is the reason we can contract parts of the network and analyse them separately, see figure 2.7.

Figure 2.7.: Partially contracting a network, to study pieces of it. Figure taken from [4].

It is not hard to see that tensor networks are a generalisation of the quantum circuits we have seen in quantum computing. We now discuss a concrete example.


For i_1, i_2, i_3 ∈ {0, 1}, let
$$\mathrm{COPY}_{i_1, i_2, i_3} = \delta_{i_1, i_2} \cdot \delta_{i_2, i_3}.$$

Figure 2.8.: Definition of the COPY tensor.

So COPY_{1,1,1} = COPY_{0,0,0} = 1 and COPY_{i_1,i_2,i_3} = 0 for all other values of i_1, i_2, i_3.

For j_1, j_2, j_3 ∈ {0, 1}, let
$$\mathrm{XOR}_{j_1, j_2, j_3} = \delta_{j_3, j_1 \oplus j_2}.$$

Figure 2.9.: Definition of the XOR tensor.

So XOR_{j_1,j_2,j_3} = 1 whenever j_3 = j_1 ⊕ j_2 and otherwise 0, where ⊕ denotes addition mod 2. As we contract the XOR and the COPY tensor together, we get a new tensor which we call the CNOT tensor.


The entries of the CNOT tensor can be found as follows:
$$\mathrm{CNOT}_{i_1, j_1, i_3, j_3} = \sum_{x \in \{0,1\}} \mathrm{COPY}_{i_1, x, i_3} \cdot \mathrm{XOR}_{j_1, x, j_3} = \sum_{x \in \{0,1\}} \delta_{i_1, x} \cdot \delta_{x, i_3} \cdot \delta_{j_3, j_1 \oplus x} = \delta_{i_1, i_3} \cdot \delta_{j_3, j_1 \oplus i_1}.$$

We see that CNOT_{i_1,j_1,i_3,j_3} = 1 if i_3 = i_1 and j_3 = j_1 ⊕ i_1, and that it is zero for the other index values. The CNOT tensor can be written as a matrix
$$\mathrm{CNOT} = \sum_{i_1, j_1, i_3, j_3 \in \{0,1\}} \mathrm{CNOT}_{i_1, j_1, i_3, j_3} \, |i_1, j_1\rangle\langle i_3, j_3| ,$$
where we grouped the i_1 and j_1 indices as the input and the i_3 and j_3 indices as the output. We let this matrix act on the standard basis vectors |k, l⟩ and we find

$$\mathrm{CNOT}\,|k, l\rangle = \sum_{i_1, j_1, i_3, j_3 \in \{0,1\}} \mathrm{CNOT}_{i_1, j_1, i_3, j_3} \, |i_1, j_1\rangle\langle i_3, j_3|k, l\rangle = \sum_{i_1, j_1 \in \{0,1\}} \mathrm{CNOT}_{i_1, j_1, k, l} \, |i_1, j_1\rangle = \sum_{i_1, j_1 \in \{0,1\}} \delta_{i_1, k} \cdot \delta_{l, j_1 \oplus i_1} \, |i_1, j_1\rangle = \sum_{j_1 \in \{0,1\}} \delta_{l, j_1 \oplus k} \, |k, j_1\rangle = |k, k \oplus l\rangle ,$$
where the last step follows from the fact that if j_1 ⊕ k = l, then j_1 = l ⊕ k. We recognise that the tensor CNOT indeed corresponds to the controlled-NOT gate from quantum computing. We conclude that the CNOT gate can be written as a contraction of two 3-tensors, the XOR and the COPY tensor.

Remark. Since the tensor entries of the COPY and the XOR tensors are only zeros and ones, we could have derived the tensor entries of CNOT in another way that will be useful for the rest of this thesis.

Consider the values of i_1 and j_1 as given. We will find the values of i_3 and j_3 for which the tensor entry CNOT_{i_1,j_1,i_3,j_3} is one. First of all, i_3 = i_1, since otherwise the first factor in
$$\mathrm{CNOT}_{i_1, j_1, i_3, j_3} = \sum_{x \in \{0,1\}} \mathrm{COPY}_{i_1, x, i_3} \cdot \mathrm{XOR}_{j_1, x, j_3}$$
would be zero. Also the x in this equation, i.e. the connected edge, must have the same value as i_1. Since x = i_1, the second factor will only be one if j_3 = j_1 ⊕ i_1. The conclusion is the same as before: i_3 = i_1 and j_3 = j_1 ⊕ i_1.


3. Physical preliminaries

3.1. Quantum systems and quantum states

Since this thesis is a combination of mathematics and physics, I will introduce some physical conventions and notations.

Definition 3.1. A pure state is a normalised (length 1) vector in a complex Euclidean vector space often denoted with H (from Hilbert space).

Where mathematicians use letters like ψ to denote column vectors, physicists often denote pure states as |ψ⟩ (“ket psi”). If we flip the notation around we get ⟨ψ| (“bra psi”), which denotes the conjugate transpose of this vector (⟨ψ| = |ψ⟩†). This notation becomes quite useful when it comes to inner products (the “bra-ket”) ⟨ψ|φ⟩, which is just the same as ⟨ψ| · |φ⟩. The other way around is possible as well; for example, if |i⟩ is the i'th standard basis vector then $\sum_i |i\rangle\langle i|$ is the identity matrix. The tensor product of two pure states can also be written in a more compact way: |ψ⟩ ⊗ |φ⟩ = |ψ, φ⟩.

3.2. Schmidt decomposition

The basis of a tensor product of two spaces of dimensions m and n contains m · n elements, so one would think that m · n coefficients are needed to describe a state in such a space. In fact, only min(m, n) coefficients are needed if we pick the corresponding vectors a little more cleverly. This is called the Schmidt decomposition.

Theorem 3.1. Let |ψ⟩ be a pure state in the space H_A ⊗ H_B, with dimensions m and n respectively, and let l = min(m, n). There exist sets of orthonormal vectors {|u_1⟩, ..., |u_l⟩} ⊂ H_A and {|v_1⟩, ..., |v_l⟩} ⊂ H_B and numbers {Λ_1, ..., Λ_l} ⊂ R_{≥0} such that
$$|\psi\rangle = \sum_{k=1}^{l} \Lambda_k \, |u_k\rangle \otimes |v_k\rangle . \qquad (3.1)$$

Proof. Let |ψ⟩ be a pure state in the space H_A ⊗ H_B, with dimensions m and n respectively. It can be written in the form
$$|\psi\rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} \, |i\rangle \otimes |j\rangle , \qquad (3.2)$$
where |i⟩ and |j⟩ are the orthonormal basis vectors of H_A and H_B respectively, and the coefficients C_{ij} form an m × n grid


of numbers, with C_{ij} being the entry at the i'th row and the j'th column. Let us call this matrix C. The singular value decomposition tells us that C can be decomposed as C = UΛV, where U is an m × m unitary matrix, V is an n × n unitary matrix and Λ is a diagonal m × n matrix with the real, non-negative singular values Λ_k of C on the diagonal (note that there are l singular values).

If we write out the matrix multiplication, we find that
$$C_{ij} = \sum_{k=1}^{l} U_{ik} (\Lambda V)_{kj} = \sum_{k=1}^{l} U_{ik} \Lambda_k V_{kj}.$$

Let |u_k⟩ be the k'th column of U and |v_k⟩ the transpose of the k'th row of V. Then U_{ik} is the i'th element of |u_k⟩ and V_{kj} is the j'th element of |v_k⟩. Hence we can write C_{ij} as
$$C_{ij} = \sum_{k=1}^{l} \Lambda_k \langle i|u_k\rangle \langle j|v_k\rangle ,$$

with |i⟩ and |j⟩ being the i'th and j'th standard basis vectors respectively. Inserting this expression for C_{ij} into equation 3.2 and using the bilinearity and distributivity of the tensor product, we get
$$|\psi\rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{k=1}^{l} \Lambda_k \langle i|u_k\rangle \langle j|v_k\rangle \, |i\rangle \otimes |j\rangle = \sum_{k=1}^{l} \Lambda_k \left( \sum_{i=1}^{m} |i\rangle\langle i|u_k\rangle \right) \otimes \left( \sum_{j=1}^{n} |j\rangle\langle j|v_k\rangle \right) = \sum_{k=1}^{l} \Lambda_k \, |u_k\rangle \otimes |v_k\rangle . \qquad \square$$

This bipartite decomposition of a pure state is called the Schmidt decomposition, and the coefficients Λ_k are often called the Schmidt coefficients. They say a lot about the entanglement of the state.
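The proof is constructive: numerically, the Schmidt coefficients are just the singular values of the coefficient matrix C. A minimal sketch of our own (the function name is hypothetical):

```python
import numpy as np

def schmidt_coefficients(psi, m, n):
    """Schmidt coefficients of a pure state psi in C^m (x) C^n.

    Reshape the coefficient vector into the m x n matrix C of the proof
    and take its singular values."""
    C = psi.reshape(m, n)
    return np.linalg.svd(C, compute_uv=False)

# Example: the Bell state (|0,0> + |1,1>)/sqrt(2)
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
print(schmidt_coefficients(bell, 2, 2))   # [0.7071 0.7071]

# A product state |0> (x) |+> has a single non-zero Schmidt coefficient
prod = np.kron([1, 0], np.array([1, 1]) / np.sqrt(2))
print(np.round(schmidt_coefficients(prod, 2, 2), 7))  # [1. 0.]
```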

3.3. Entanglement

We easily see that if only one of the Λ_k in the Schmidt decomposition is one and the rest are zero, then our state is a product state with respect to the partition A : B, which means that it can be written as a tensor product of a state in H_A and a state in H_B. If more than one Schmidt coefficient is non-zero, the state cannot be written in this way (it is an entangled state). This analysis gives rise to the definition of the Schmidt rank, which is the number of non-zero coefficients in the Schmidt decomposition, denoted χ(|ψ⟩). If the Schmidt rank is low then the system H_A is not correlated much with H_B, and the other way around.

The Schmidt rank can be seen as the rank R(X) of section 2.1, where the probability distribution is given by p_k = Λ_k², (the Λ_k are real and positive by the Schmidt decomposition, and their squares sum to 1 since |ψ⟩ is normalised).

We can also apply the Shannon entropy H(X) to our decomposition to find a more precise way to describe how “entangled” two subsystems are.

Definition 3.2. The entanglement entropy of a pure state |ψ⟩_{AB} ∈ H_A ⊗ H_B with respect to the partition A : B is given by
$$S(|\psi\rangle_{AB}) = \sum_{k=1}^{l} \Lambda_k^2 \log_2\left(\frac{1}{\Lambda_k^2}\right),$$
where the Λ_k are the Schmidt coefficients of |ψ⟩.

This follows from the fact that the reduced density matrix ρA of the state |ψiAB given

in equation 3.1 traced over system B becomes the diagonal matrix

ρA= l

X

k=1

Λ2k|ukihuk| .

Plugging this into the previous formula leads to the desired result.

Now we have two measures for the entanglement, namely the Schmidt rank and the entanglement entropy. By Theorem 2.1 we know how they relate to each other, namely

0 ≤ S(|ψi) ≤ log2(χ(|ψi)) ≤ log2(l).
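Combining this with the SVD sketch above, both entanglement measures follow directly from the Schmidt coefficients (again our own illustration, with hypothetical helper names):

```python
import numpy as np

def entanglement_entropy(psi, m, n, tol=1e-12):
    """S = sum_k L_k^2 log2(1/L_k^2) and the Schmidt rank chi, via the SVD."""
    L = np.linalg.svd(psi.reshape(m, n), compute_uv=False)
    p = L[L > tol] ** 2                 # p_k = Lambda_k^2
    S = np.sum(p * np.log2(1 / p))
    chi = len(p)
    return S, chi

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
S, chi = entanglement_entropy(bell, 2, 2)
print(S, chi)                           # 1.0 2, and indeed S <= log2(chi)
```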

3.4. Hamiltonians

Definition 3.3. A Hamiltonian H on the space H is a Hermitian linear operator (H†= H) on H. The eigenvalues of the Hamiltonian are called the energies and the eigenvectors are called eigenstates.

We use the word “energy” because in a physical system one can create a Hamiltonian such that its eigenvalues are exactly the possible outcomes of a measurement of the energy. The fact that our Hamiltonian is Hermitian comes in handy if we want to calculate the energy of a state |ψ⟩. Because H is Hermitian, we have an orthonormal basis of eigenvectors (|φ_1⟩, |φ_2⟩, ..., |φ_n⟩) of H, with their respective eigenvalues (or energies) (E_1, E_2, ..., E_n), where n is the dimension of H. If we physically measure the energy of a system that is in an eigenstate, the measurement will output the corresponding eigenvalue or energy. If we measure a pure state |ψ⟩ (which is not an eigenstate), we have a probability of |⟨ψ|φ_i⟩|² that the state |ψ⟩ collapses to the state |φ_i⟩ and the energy output is E_i.

So we know what the energy of an eigenstate is, but not yet for an arbitrary pure state |ψ⟩. We define it to be E_{|ψ⟩} := ⟨ψ|H|ψ⟩. This makes sense because
$$\langle\psi| H |\psi\rangle = \langle\psi| H \left( \sum_{i=1}^{n} |\varphi_i\rangle \langle\varphi_i|\psi\rangle \right) = \sum_{i=1}^{n} \langle\psi| H |\varphi_i\rangle \langle\varphi_i|\psi\rangle = \sum_{i=1}^{n} E_i \langle\psi|\varphi_i\rangle \langle\varphi_i|\psi\rangle = \sum_{i=1}^{n} E_i \cdot |\langle\psi|\varphi_i\rangle|^2. \qquad (3.3)$$
We find that this value has to be real, since it is a sum of real eigenvalues (spectral theorem) multiplied by non-negative numbers. The intuition behind this definition is that the energy of a state is the sum of the energies of all eigenstates, weighted by the probability that we measure the state to be in that eigenstate (it is the expected value of the energy of a measurement).

A particularly interesting state of a system is the ground state. It is defined as the state whose energy is the lowest. Since equation 3.3 expresses E_{|ψ⟩} as a weighted average of the eigenvalues, it is minimised by the eigenstate |φ_i⟩ for which E_i is the lowest. This is the state with the least energy, so a system is often found in this state.

3.5. Frustration and critical systems

All systems we will be looking at are spin chains.

Definition 3.4. A spin-s chain of n particles is a system with underlying vector space (C^{2s+1})^{⊗n}, with orthonormal basis vectors of the form |i_1, ..., i_n⟩, where each i_j runs from −s to s in steps of 1 and s ∈ {1/2, 1, 3/2, ...}.

Note that s can also be a half-integer. A spin chain is called a chain because it describes a model where multiple quantum particles are lined up and only interact with their direct neighbours. Each particle can be described by its own subsystem C^{2s+1}, with its own basis vectors {|−s⟩, |−s+1⟩, ..., |s−1⟩, |s⟩}.

The two key concepts that make the Motzkin spin chain interesting are frustration and criticality. In physics they appear to go hand in hand: when a system is critical, it is expected to be frustrated as well. However, we will find that the Motzkin chain is frustration-free although it is critical. Let us define these concepts.


A critical system has many special properties, but we will describe it according to the closure of the spectral gap.

Definition 3.5. The spectral gap of a Hamiltonian is the difference in energy between the ground state of a system and the next lowest energy. This is the same as the difference between the lowest two eigenvalues.

For most spin chains, the spectral gap tends to get smaller when we add more particles. We can create a spin chain of length n and for every n write a Hamiltonian H_n. Each individual Hamiltonian H_n has a spectral gap. If the sequence of these gaps goes to zero as n goes to infinity, we call the system critical.

To define the term “frustration”, we first need to know what a local Hamiltonian is. A local Hamiltonian Π is a term in the (full) Hamiltonian H that acts only on a strict subsystem of the composite system the Hamiltonian acts on. Often in spin chains the Hamiltonian is given by the sum of local Hamiltonians that only describe the nearest neighbour interactions and therefore the total energy is the sum of the energies of the local Hamiltonians.

A system is called frustration-free when the ground state minimises all of the energies of the local Hamiltonians. We will study this property in an example.

We study a spin-1/2 chain of 4 particles with boundary conditions. The two basis vectors for each particle are |1/2⟩ and |-1/2⟩. We define the Hamiltonian as
$$H = |\tfrac{1}{2}\rangle\langle\tfrac{1}{2}|_1 + |\tfrac{1}{2}\rangle\langle\tfrac{1}{2}|_4 + \sum_{j=1}^{3} \left[ \, |\tfrac{1}{2},\tfrac{1}{2}\rangle\langle\tfrac{1}{2},\tfrac{1}{2}|_{j,j+1} + |{-\tfrac{1}{2}},{-\tfrac{1}{2}}\rangle\langle{-\tfrac{1}{2}},{-\tfrac{1}{2}}|_{j,j+1} \, \right].$$
The subscripts indicate that the operators only act on specific particles. For example, the operator
$$|\tfrac{1}{2},\tfrac{1}{2}\rangle\langle\tfrac{1}{2},\tfrac{1}{2}|_{2,3} = I \otimes |\tfrac{1}{2},\tfrac{1}{2}\rangle\langle\tfrac{1}{2},\tfrac{1}{2}| \otimes I,$$
where I is the identity operator, acts only on subsystems (particles) 2 and 3. To minimise the energy of this specific operator, our ground state cannot contain a term of the form |ψ_1⟩ ⊗ |1/2, 1/2⟩ ⊗ |ψ_4⟩ for any |ψ_1⟩, |ψ_4⟩ ∈ C².

First of all we see that H is diagonal (it contains only terms of the form |φ⟩⟨φ| with |φ⟩ a basis state), so the eigenvectors are just the standard basis vectors, and so is the ground state. Note that every local term has a state of energy 0, so deciding whether the system is frustration-free reduces to deciding whether the ground state of the composite system has energy 0. Minimising the energy of every term, we conclude from the first two terms (the boundary conditions) that particles one and four must both be in state |-1/2⟩. The terms in the sum couple neighbours (they are often called “coupling terms”); to minimise their energy, we can never have two neighbouring particles in the same state. Together with the boundary conditions we find particle one in state |-1/2⟩, therefore particle two in state |1/2⟩ and particle three in state |-1/2⟩. We already concluded that particle four is in state |-1/2⟩, so the overall state becomes |ψ⟩ = |-1/2, 1/2, -1/2, -1/2⟩. But now the local term |-1/2, -1/2⟩⟨-1/2, -1/2|_{3,4} is not minimised, since
$$\langle\psi| \, |{-\tfrac{1}{2}},{-\tfrac{1}{2}}\rangle\langle{-\tfrac{1}{2}},{-\tfrac{1}{2}}|_{3,4} \, |\psi\rangle = 1 \neq 0.$$
In this system there is no state that minimises every local term simultaneously, so the system is frustrated.

If we study the same system but now with 5 particles instead of 4, there is no frustration. The ground state is |-1/2, 1/2, -1/2, 1/2, -1/2⟩, and this state minimises every term of the Hamiltonian.
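A quick numerical check of this example (our own sketch, not from the thesis; we encode |1/2⟩ as basis index 0 and |-1/2⟩ as index 1, and the helper names are hypothetical):

```python
import numpy as np
from functools import reduce

def site_op(op, pos, n):
    """Embed an operator op acting on sites pos..pos+k-1 of an n-site spin-1/2 chain."""
    k = round(np.log2(op.shape[0]))
    ops = [np.eye(2)] * pos + [op] + [np.eye(2)] * (n - pos - k)
    return reduce(np.kron, ops)

def hamiltonian(n):
    P_up = np.diag([1.0, 0.0])                          # |1/2><1/2|, 1/2 encoded as 0
    P_dn = np.diag([0.0, 1.0])                          # |-1/2><-1/2|
    H = site_op(P_up, 0, n) + site_op(P_up, n - 1, n)   # boundary terms
    for j in range(n - 1):
        H += site_op(np.kron(P_up, P_up), j, n)         # penalise equal neighbours
        H += site_op(np.kron(P_dn, P_dn), j, n)
    return H

# H is diagonal, so the ground energy is the smallest diagonal entry
print(np.min(np.diag(hamiltonian(4))))  # 1.0 -> frustrated
print(np.min(np.diag(hamiltonian(5))))  # 0.0 -> frustration-free
```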


4. Motzkin spin chain

In this chapter we study the Motzkin spin chain using an equivalence relation between strings. We follow the article by Bravyi et al. [3].

4.1. Finding the ground state

We study a spin-1 chain. This means that each particle has 3 orthonormal basis vectors {|-1⟩, |0⟩, |1⟩} ⊂ C³, so the state of a system can be described by a vector in (C³)^{⊗n}.

Definition 4.1. The Hamiltonian of the Motzkin spin chain is given by a sum of local terms:
$$H = |\text{-}1\rangle\langle\text{-}1|_1 + |1\rangle\langle 1|_n + \sum_{j=1}^{n-1} \Pi_{j,j+1},$$
where the subscripts mean that these operators act on subsystems j and j + 1. We define the local Hamiltonians as
$$\Pi_{j,j+1} = |\phi\rangle\langle\phi|_{j,j+1} + |\psi\rangle\langle\psi|_{j,j+1} + |\theta\rangle\langle\theta|_{j,j+1},$$
with
$$|\phi\rangle = |1, 0\rangle - |0, 1\rangle, \qquad |\psi\rangle = |0, \text{-}1\rangle - |\text{-}1, 0\rangle, \qquad |\theta\rangle = |1, \text{-}1\rangle - |0, 0\rangle.$$

Because of the way the local terms are defined, we will define an equivalence relation on the orthonormal basis {|-1i , |0i , |1i}⊗n⊂ (C3)⊗n.

Definition 4.2. Two strings s, t ∈ {-1, 0, 1}^n are equivalent, denoted s ∼ t, when one can get from s to t by applying the moves
$$(\dots, 0, 1, \dots) \leftrightarrow (\dots, 1, 0, \dots), \qquad (\dots, 0, \text{-}1, \dots) \leftrightarrow (\dots, \text{-}1, 0, \dots), \qquad (\dots, 1, \text{-}1, \dots) \leftrightarrow (\dots, 0, 0, \dots).$$
It is immediately clear that this relation is reflexive, symmetric and transitive, so we have defined an equivalence relation, which gives a partition of our orthonormal basis {|-1⟩, |0⟩, |1⟩}^{⊗n}. The next question is how we are going to represent the equivalence classes.

Lemma 4.1. For every string s ∈ {-1, 0, 1}^n there are unique p, q ∈ N ∪ {0} such that
$$s \sim \underbrace{\text{-}1, \dots, \text{-}1}_{p} \, \underbrace{0, \dots, 0}_{n-p-q} \, \underbrace{1, \dots, 1}_{q}.$$


Proof. Because of the first two moves in definition 4.2, we do not have to consider the zeroes: we can put them in any positions we like, so we can just ignore them. Now we only have -1's and 1's. We can annihilate every -1 with a 1 using the third move, provided the 1 is directly to the left of the -1. We remain with zeroes (which we again ignore) and only -1's without a 1 to their left. This is exactly the string we wanted, once we have used the first two moves to put the zeroes between the group of -1's and the group of 1's.

The uniqueness of p and q follows from the fact that different representatives are never equivalent. Suppose
$$\underbrace{\text{-}1, \dots, \text{-}1}_{p} \, \underbrace{0, \dots, 0}_{n-p-q} \, \underbrace{1, \dots, 1}_{q} \sim \underbrace{\text{-}1, \dots, \text{-}1}_{p'} \, \underbrace{0, \dots, 0}_{n-p'-q'} \, \underbrace{1, \dots, 1}_{q'}$$
with p ≠ p'. Assume without loss of generality p > p'; then to get from p to p' we would need to annihilate some of the -1's. Since this is not possible from this string, the strings are not equivalent. We can do the same for q. Therefore every string is equivalent to exactly one representative.

To illustrate this proof, let us take an example with n = 11: start from
0, -1, 1, 1, 0, -1, -1, 0, 0, 1, 1.
First move all of the zeroes to the left:
0, 0, 0, 0, -1, 1, 1, -1, -1, 1, 1.
Annihilate a pair 1, -1 using the third move:
0, 0, 0, 0, -1, 1, 0, 0, -1, 1, 1.
Move all of the zeroes to the left again:
0, 0, 0, 0, 0, 0, -1, 1, -1, 1, 1.
Annihilate again:
0, 0, 0, 0, 0, 0, -1, 0, 0, 1, 1.
Let the zeroes separate the -1's and the 1's:
-1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1.
This is in the desired form, with p = 1 and q = 2.
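The reduction in this proof is easy to implement (our own sketch): ignore zeroes, cancel matched (1, -1) pairs like brackets, and read off p and q.

```python
def representative(s):
    """Reduce a string over {-1, 0, 1} to its class representative (p, q).

    Zeroes move freely and each 1 directly left of a -1 cancels, so a
    bracket-matching scan counts the surviving -1's (p) and 1's (q)."""
    p, q = 0, 0
    for x in s:
        if x == 1:
            q += 1
        elif x == -1:
            if q > 0:
                q -= 1      # a 1 to the left annihilates this -1
            else:
                p += 1      # unmatched -1 survives
    return p, q

print(representative([0, -1, 1, 1, 0, -1, -1, 0, 0, 1, 1]))  # (1, 2)
```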

Now we have a representative for every class, so we can give every class a name.

Definition 4.3. The equivalence class C_{p,q} ⊂ {-1, 0, 1}^n is the set of all strings equivalent to the representative given in Lemma 4.1:
$$C_{p,q} := \left\{ s \in \{\text{-}1, 0, 1\}^n \;\middle|\; s \sim \underbrace{\text{-}1, \dots, \text{-}1}_{p} \, \underbrace{0, \dots, 0}_{n-p-q} \, \underbrace{1, \dots, 1}_{q} \right\}.$$


To find the ground state of the Hamiltonian, we will first try to find a state that minimises the energy of every separate term. If such a state exists (it will turn out that it does), we are sure that it is the ground state. Furthermore, the existence of such a state will prove that the Motzkin Hamiltonian is frustration-free.

Theorem 4.2. The ground state of the Hamiltonian given in definition 4.1 is the uniform superposition of all strings in the class C_{0,0}. Furthermore, the Hamiltonian is frustration-free.

Proof. We want to find a state |χ⟩ that minimises the energy of every local term; in other words, ⟨χ|Π_{j,j+1}|χ⟩ = 0 for every j. For this state we need
$$\langle\chi|\phi\rangle = 0, \qquad \langle\chi|\psi\rangle = 0, \qquad \langle\chi|\theta\rangle = 0.$$
Intuitively, |χ⟩ should contain, at every spin pair (j, j + 1), as much of |1, 0⟩ as of |0, 1⟩, as much of |0, -1⟩ as of |-1, 0⟩, and as much of |1, -1⟩ as of |0, 0⟩. Formally, this means that if we take the inner product of the state |χ⟩ with |s⟩ for an arbitrary string s ∈ {-1, 0, 1}^n, and use some of the moves of definition 4.2 to alter the string s, the inner product should keep the same value. In short, a state |χ⟩ whose inner products are constant on equivalence classes,
$$\langle s|\chi\rangle = \langle t|\chi\rangle \quad \text{if } s \sim t,$$
will have zero energy on the coupling terms in the Hamiltonian. Because we picked a basis, we can use the familiar trick
$$I \cdot |\chi\rangle = \sum_{s \in \{-1,0,1\}^n} |s\rangle\langle s| \cdot |\chi\rangle = \sum_{s \in \{-1,0,1\}^n} \langle s|\chi\rangle \, |s\rangle = \sum_{p=0}^{n} \sum_{q=0}^{n-p} \chi_{p,q} \sum_{s \in C_{p,q}} |s\rangle ,$$
where χ_{p,q} = ⟨r_{p,q}|χ⟩, with r_{p,q} the representative of C_{p,q}.

Now we only need the boundary conditions to prove the theorem. Because of the first term in our Hamiltonian, the ground state should not contain a -1 in the first position. Therefore the classes over which we sum can only be the C_{0,q} classes; any class with p > 0 has a representative with at least one -1 at the start of the string. Applying the same reasoning to the second boundary condition, we find that the only class over which we sum is C_{0,0}. So our ground state reduces to
$$|\chi\rangle = \frac{1}{\sqrt{\#C_{0,0}}} \sum_{s \in C_{0,0}} |s\rangle .$$
Since we have found a state that minimises every single term in the Hamiltonian, we conclude that the Hamiltonian is frustration-free.

Remark. The normalised superposition of all strings in a class C_{p,q} will from now on be written as the state |C_{p,q}⟩ (not to be confused with the representative of the class C_{p,q}).


4.2. Motzkin paths

The strings in the class C_{0,0} are special in the sense that they can be represented in a neat way. Let us identify every 1 in the string with an up move ↗, every 0 with a flat move →, and every -1 with a down move ↘. A string in C_{0,0} can then be identified with a path on a rectangular grid built from these moves; we call this the Motzkin path corresponding to the string. An example is shown in figure 4.1.

with these moves, we call this the Motzkin path corresponding to the string. An example is shown in figure 4.1.

1, 1, 0, -1, 0, 1, -1, -1, 1, 0, -1

Figure 4.1.: The connection between a string and its Motzkin path.

If we pick a string in C_{0,0} and look at any initial segment, we find that it always contains at least as many 1's as -1's. This is because each 1 has to come before its corresponding -1. Relating this to the Motzkin path, the number of 1's minus the number of -1's in an initial segment is the height of the Motzkin path at that point. Because we are looking at strings in C_{0,0}, we know that this height is always greater than or equal to zero.

Definition 4.4. A Motzkin path of length n is a path of n up, down or flat moves such that it never goes below zero and ends at zero. Due to the correspondence between these paths and elements in C0,0, we will use the two notions interchangeably.

We can generalise this view of paths to any string. For any string in C_{p,q} there is an initial segment with p more -1's than 1's. This means that the height in the diagram can go below zero (the minimal height will be −p). Considering the representatives, we see that the representative of C_{p,q} contains p surplus -1's and q surplus 1's, so the height at the end is q − p. Since the height at the end does not change when we apply the moves given in definition 4.2, we conclude that the final height is q − p for every path in C_{p,q}.


Given these facts, we can deduce that −p corresponds to the lowest point in the path and q corresponds to the height difference between the endpoint and the lowest point. We can now calculate the number of paths in a class. To better understand this technique of counting paths, it might be wise to look at the proof in appendix A.1 first.

Theorem 4.3. The number of paths of length n in C_{0,m} is
$$\#C_{0,m} = \sum_{\substack{i \ge 0 \\ 2i+m \le n}} \frac{m+1}{i+m+1} \binom{2i+m}{i} \cdot \binom{n}{2i+m}.$$

Proof. We start by looking at the paths which only use up and down moves, with no flat moves allowed. We call such paths Catalan paths and use a generalisation of the reflection method given in appendix A.1.

We want to find #C_{0,m}, the number of paths starting at zero, having zero as their lowest point and ending at height m. We first count the Catalan paths with 2i + m moves, consisting of i down and i + m up moves, never going below 0. We call this number of paths D_{i,m}. Just as in the appendix, we first ignore the non-negativity constraint and consider all paths going from zero to m in 2i + m steps using only up and down moves. If we pick the places for the i down moves out of the 2i + m possible spots, the up moves are already determined (they must be at the remaining spots). The number of possible paths going from 0 to m in 2i + m steps (with or without crossing the zero line) is therefore
$$\binom{2i+m}{i}.$$
We want to subtract from this the number of paths which are eliminated by crossing the zero line. We count these paths using the same reflection argument as in the proof of theorem A.1. The reflection through the line at height −1/2 sends an invalid path ending at height m to a path ending at height −m − 1. The number of these paths is
$$\binom{2i+m}{i+m+1} = \binom{2i+m}{i-1},$$
because we now have to pick i + m + 1 down moves from a total of 2i + m moves. We find the number of Catalan paths ending at height m by subtracting this from the total


number of paths:
$$D_{i,m} = \binom{2i+m}{i} - \binom{2i+m}{i-1} = \frac{(2i+m)!}{(i+m)! \, i!} - \frac{(2i+m)!}{(i-1)! \, (i+m+1)!} = \left(1 - \frac{i}{i+m+1}\right) \binom{2i+m}{i} = \frac{m+1}{i+m+1} \binom{2i+m}{i}.$$
The only thing left to do is to translate these Catalan paths into Motzkin paths. Let n be the number of moves in a Motzkin path. The idea is that every Motzkin path with n − (2i + m) flat moves can be constructed by taking a Catalan path of length 2i + m and inserting those n − (2i + m) flat moves at every possible position. If we have n − 2i − m flat moves and n positions, there are $\binom{n}{n-2i-m} = \binom{n}{2i+m}$ possible ways to place the flat moves. We can calculate the number of Motzkin paths by summing over i:
$$\#C_{0,m} = \sum_{\substack{i \ge 0 \\ 2i+m \le n}} D_{i,m} \cdot \binom{n}{2i+m} = \sum_{\substack{i \ge 0 \\ 2i+m \le n}} \frac{m+1}{i+m+1} \binom{2i+m}{i} \cdot \binom{n}{2i+m}.$$
Now we know the number of Motzkin paths for every possible height of the endpoint.
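A brute-force check of this count for small n (our own sketch; the function names are hypothetical):

```python
from itertools import product
from math import comb

def count_formula(n, m):
    """#C_{0,m}: paths of length n that stay >= 0 and end at height m."""
    total, i = 0, 0
    while 2 * i + m <= n:
        total += (m + 1) * comb(2 * i + m, i) * comb(n, 2 * i + m) // (i + m + 1)
        i += 1
    return total

def count_bruteforce(n, m):
    def ok(s):
        h = 0
        for x in s:
            h += x
            if h < 0:
                return False
        return h == m
    return sum(ok(s) for s in product((-1, 0, 1), repeat=n))

print(count_formula(6, 0), count_bruteforce(6, 0))  # 51 51, the Motzkin number M_6
print(count_formula(6, 2), count_bruteforce(6, 2))  # both agree
```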

4.3. Schmidt decomposition

To analyse the ground state a little more, we will find a Schmidt decomposition and see if it is entangled. Let us call the first n/2 particles set A and the remaining particles set B (assuming n to be even). If we split a Motzkin path like this, the first n/2 particles form a path that stays above 0 and ends at a certain height m ∈ {0, ..., n/2}, so the first part becomes a path in C^{n/2}_{0,m} (where the superscript denotes that this class consists of paths of length n/2 instead of the usual n). The second part of the path goes from height m to 0 with lowest point 0. Therefore, this part is a path in C^{n/2}_{m,0}.


The Schmidt decomposition of the ground state |C_{0,0}⟩ with respect to the partition A : B will look like
$$|C_{0,0}\rangle = \sum_{m=0}^{n/2} \Lambda_m \, |C^{n/2}_{0,m}\rangle \otimes |C^{n/2}_{m,0}\rangle .$$
The coefficient Λ_m is the square root of the probability that a path picked uniformly from C_{0,0} has height m at spot n/2. This probability is given by the number of paths of height m at n/2 divided by the total number of paths; in other words,
$$\Lambda_m = \sqrt{\frac{\#C^{n/2}_{0,m} \cdot \#C^{n/2}_{m,0}}{\#C_{0,0}}} \overset{*}{=} \frac{\#C^{n/2}_{0,m}}{\sqrt{\#C_{0,0}}},$$

where * follows from the fact that reading the string from right to left and flipping all 1's and -1's (mirroring the path through a vertical line in the middle) creates a bijection between C_{m,0} and C_{0,m}. By filling in the values of #C_{0,m} and #C_{0,0} from theorem 4.3 we can find concrete values for these Schmidt coefficients. Since Λ_m is only zero when #C^{n/2}_{0,m} = 0, we can easily deduce the Schmidt rank. The number of length-n/2 Motzkin paths starting at height 0 and ending at height m is greater than zero whenever 0 ≤ m ≤ n/2. So all of the n/2 + 1 Schmidt coefficients are non-zero, which means the Schmidt rank is
$$\chi(|C_{0,0}\rangle) = \frac{n}{2} + 1.$$

Using Stirling's approximation and some integration trickery, [3] rewrites Λ_m into something more feasible; substituting it into the entropy formula gives
$$S(|C_{0,0}\rangle) = -\sum_{m=0}^{n/2} \Lambda_m^2 \log_2(\Lambda_m^2) \approx \frac{1}{2}\log_2(n) + c_n, \quad \text{where } c_n \to 0.14(5) \text{ as } n \to \infty.$$
In the article of Bravyi et al. [3], bounds on the first energy level above the ground state are found. These bounds prove the criticality of the system and are found using the Schmidt decomposition. Since this derivation is quite long and not in line with the rest of this thesis, we skip it.
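The Schmidt spectrum and the entanglement entropy can be evaluated for small chains with the counting formula of theorem 4.3 (our own illustration; the helper names are hypothetical):

```python
import math
from math import comb

def c0m(n, m):
    """#C_{0,m} via the formula of theorem 4.3."""
    return sum((m + 1) * comb(2 * i + m, i) * comb(n, 2 * i + m) // (i + m + 1)
               for i in range((n - m) // 2 + 1))

def schmidt_spectrum(n):
    """Lambda_m^2 for the Motzkin ground state on n sites (n even)."""
    half = [c0m(n // 2, m) for m in range(n // 2 + 1)]
    total = c0m(n, 0)                    # #C_{0,0}: the number of Motzkin paths
    return [c * c / total for c in half]

n = 16
p = schmidt_spectrum(n)
print(abs(sum(p) - 1) < 1e-12)           # True: the Lambda_m^2 sum to 1
S = sum(pk * math.log2(1 / pk) for pk in p)
print(S, 0.5 * math.log2(n))             # S grows like (1/2) log2(n) + c_n
```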


5. The renormalization network for the Motzkin state

In this chapter we will create a tensor network which prepares the ground state of the Motzkin spin chain. We follow the article of Alexander et al. [1] while adding some extra steps to make the derivation of the network clearer.

We start with the general Motzkin tensor MOT. This tensor has n legs, each leg of dimension 3, because the ground state lives in the space (C3)⊗n. Until now we have worked with index sets ranging from 1 to some positive integer, but this time we introduce index values of 0 and −1, to keep the network intuitive (we relabel the index set {1, 2, 3} to {−1, 0, 1}).

We want the tensor entry MOT_{k_1,...,k_n} to be 1/√|C_{0,0}| whenever the string (k_1, ..., k_n) is a Motzkin path. This way the tensor prepares the uniform superposition of all Motzkin paths. At this point we let go of the normalisation factor and just make the tensor entries 1 or 0; this will make it easier for us to split up the network.

With k_1, ..., k_n ∈ {-1, 0, 1}: MOT_{k_1,...,k_n} = 1 when (k_1, ..., k_n) forms a Motzkin path, and MOT_{k_1,...,k_n} = 0 otherwise.

Figure 5.1.: The definition of the MOT tensor, representing the Motzkin state as a vector in (C³)^{⊗n}.

5.1. The Motzkin state as a tensor network

We want to split this tensor up into multiple tensors. Our first step is to think of how we would check whether a given string is a Motzkin path. There are two conditions:

1. The height never goes below zero (every partial sum of the elements is non-negative).

2. The height at the endpoint (the sum of all elements) is zero.

Using these rules we think of a tensor network that keeps track of the height of the input string (k1, . . . , kn). We use the 3-tensor H to create this network.

With k ∈ {-1, 0, 1} and n_1, n_2 ∈ Z: H_{n_1, k, n_2} = 1 whenever n_1 + k = n_2 and n_2 ≥ 0; otherwise H_{n_1, k, n_2} = 0.

Figure 5.2.: The definition of the H tensor.

The other tensor we need is the zero tensor, which has a tensor entry of one only when its index is zero (note that the zero tensor is a 1-tensor which might be confusing).

With i ∈ Z: 0_i = δ_{i,0}.

Figure 5.3.: Definition of the zero tensor

The MOT tensor can be written as a network of these two tensors, see figure 5.4.

Figure 5.4.: A tensor network equivalent to the MOT tensor (on 8 particles). The re-sulting tensor of this network has entry 1 only whenever the index values correspond to a Motzkin path.

This tensor network keeps track of the height of a Motzkin path. If we read the tensor network from left to right, we find that at every node the current string element of the


Motzkin path is added to the partial sum until that point. The left and right tensors are there because we want our path to start and end at height 0. Since the tensor entry vanishes when n2 < 0, we also know that every partial sum will always be non-negative.

These are exactly the two conditions for a Motzkin path.

Note that the dimensions of the spaces of the horizontal edges are actually infinite, since the indices are in N. When we pick a number of particles n, we can restrict this dimension to n, because this is the maximal height of any path.
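To see concretely that this network reproduces the MOT tensor, one can contract it for a small n (a sketch of our own; we truncate the height index to {0, ..., n}):

```python
import numpy as np
from itertools import product

def mot_from_network(n):
    """Contract the H-tensor chain of figure 5.4 into the full MOT tensor."""
    D = n + 1                                   # truncated height dimension
    H = np.zeros((D, 3, D))                     # indices (n1, k, n2), k in {-1,0,1}
    for n1 in range(D):
        for ki, k in enumerate((-1, 0, 1)):
            n2 = n1 + k
            if 0 <= n2 < D:
                H[n1, ki, n2] = 1
    zero = np.zeros(D); zero[0] = 1             # the zero tensor
    state = zero
    for _ in range(n):
        # attach one H tensor; the free physical legs accumulate on the left
        state = np.tensordot(state, H, axes=([-1], [0]))
    return np.tensordot(state, zero, axes=([-1], [0]))

n = 4
MOT = mot_from_network(n)
# check against the direct definition of the MOT tensor
for ks in product((-1, 0, 1), repeat=n):
    h, ok = 0, True
    for k in ks:
        h += k
        ok = ok and h >= 0
    assert MOT[tuple(k + 1 for k in ks)] == (1 if ok and h == 0 else 0)
print("network matches the MOT definition for n =", n)
```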

5.2. Binary height tensor network

To split up the previous network further, we will use the binary addition tensor.

With k_1, k_3 ∈ {0, 1} and k_2, k_4 ∈ {−1, 0, 1}: BIN_{k_1 k_2 k_3 k_4} = 1 when (k_1, k_2, k_3, k_4) is one of

k_1  k_2  k_3  k_4
 0   -1    1   -1
 0    0    0    0
 0    1    1    0
 1   -1    0    0
 1    0    1    0
 1    1    0    1

and BIN_{k_1 k_2 k_3 k_4} = 0 otherwise.

Figure 5.5.: The definition of the binary addition tensor.

The tensor entries can be remembered by thinking of k_1 as a digit in the binary expansion of a number and k_2 as the number we want to add to this digit. The tensor entry is 1 if k_3 = k_1 ⊕ k_2 (addition modulo 2) and if k_4 is the “carry digit” of the sum k_1 + k_2, which denotes what happens to the next digit (for example, if you add k_2 = 1 to k_1 = 1, the last digit becomes k_3 = 0 and you carry k_4 = 1 to the next place). The carry digit of the sum k_1 + k_2 can be written as k_4 = ⌊(k_1 + k_2)/2⌋. In this way we can see the digits k_1 and k_2 as input, determining what the output k_3 and k_4 should be. Whenever the input and output indices are compatible, the corresponding tensor entry is one.

If we stack multiple copies of this tensor, we can create a tensor network that works the same way as the H tensor we defined earlier, see figure 5.6. As input we get a string of digits representing the binary expansion of the current height, and at the bottom a −1, 0 or 1 denoting the current entry of the Motzkin path, which is added to the current height; see figure 5.6 for a pictorial example.


Figure 5.6.: On the left an example of the H tensor where the height was 4, the Motzkin path has entry −1 and the resulting height is therefore 3. On the right is the same example, but now with the binary expansions of the heights. Note that when the “input” indices (the current height and the Motzkin entry) for both tensors are known, the “output” indices must return a 3, otherwise the tensor entry is 0.
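The table of figure 5.5 is exactly binary addition with carry; a short check of our own:

```python
from itertools import product

# BIN_{k1,k2,k3,k4} = 1 iff k3 = (k1 + k2) mod 2 and k4 = floor((k1 + k2) / 2)
table = [(0, -1, 1, -1), (0, 0, 0, 0), (0, 1, 1, 0),
         (1, -1, 0, 0), (1, 0, 1, 0), (1, 1, 0, 1)]
for k1, k2 in product((0, 1), (-1, 0, 1)):
    k3, k4 = (k1 + k2) % 2, (k1 + k2) // 2  # Python's % and // floor correctly
    assert (k1, k2, k3, k4) in table
print("table matches digit/carry arithmetic")
```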

Replacing every H tensor in figure 5.6 with a stack of BIN tensors, we find the binary height tensor network given in figure 5.7.

Figure 5.7.: The binary height tensor network, which has tensor entry one only when the indices ki form a Motzkin path.


The question that arises is how many layers of BIN tensors we need in our tensor network. If a stack of BIN tensors contains k tensors, then the binary expanded height is at most 2^k − 1. The maximal height of a Motzkin path of length n is n/2, in the middle of the path. To encode this height we need a stack of ⌈log_2(n/2 + 1)⌉ BIN tensors (when n = 8 we need 3 layers). If the carry digit at the top of such a stack is non-zero, we know that the height of the path either becomes negative at this point or exceeds the maximal n/2. If we contract the top carry digit with a zero tensor, the tensor entry will be zero whenever (k_1, ..., k_n) is not a Motzkin path.

Note that the zero tensors on top of the network are not exactly the same as defined in figure 5.3, since their index set is now only {−1, 0, 1}. The zero tensors on the left and right side of the network are there so that the Motzkin path starts and ends at zero height. The index set of these tensors is {0, 1}.

The network in figure 5.7 can be written in a more compact way. We use the zipper lemma (lemma 5.1) to rewrite the network so that the number of tensors grows linearly with n; the number of tensors in figure 5.7 is of order n log_2(n). To get to the compact network we introduce an unphysical index ω into the index sets of the tensors we have so far. We define a new tensor B instead of BIN.

With k_1, k_3 ∈ {0, 1} and k_2, k_4 ∈ {−1, 0, 1, ω}: B_{k_1 k_2 k_3 k_4} = 1 when (k_1, k_2, k_3, k_4) is one of

k_1  k_2  k_3  k_4
 0   -1    1   -1
 0    0    0    0
 0    1    1    0
 1   -1    0    0
 1    0    1    0
 1    1    0    1
 1    ω    1    0
 0    ω    0    ω

and B_{k_1 k_2 k_3 k_4} = 0 otherwise.

Figure 5.8.: The definition of the B tensor.

Note that when the values of the indices k_2 and k_4 are restricted to {−1, 0, 1}, the B tensor is exactly the same as the BIN tensor. This tensor still has the property that if we know the values of indices k_1 and k_2 (input), then the indices k_3 and k_4 (output) are uniquely determined, given that the tensor entry is one. We also define a 2-tensor Π = I − |ω⟩⟨ω|, with indices k_1, k_2 ∈ {−1, 0, 1, ω}. Its entries are 1 if k_1 = k_2 ≠ ω, and 0 otherwise. It can be written in terms of Kronecker deltas as Π_{k_1, k_2} = δ_{k_1, k_2} · (δ_{k_1, -1} + δ_{k_1, 0} + δ_{k_1, 1}). If we put this tensor at the bottom of every column of the network in figure 5.7, the string at the bottom cannot contain any ω's, since otherwise the tensor entry would be zero. To give a new tensor network for the MOT tensor, we replace all BIN tensors by B tensors and every column gets a Π tensor at the bottom (the index set of the top zero tensors


obviously changes to {−1, 0, 1, ω}).

The reason why we use the B tensors instead of the BIN tensors will become clear in the next section where we use the index ω to exclude a certain case.

Figure 5.9.: The altered binary height tensor network has tensor entry one only when the indices k_i form a Motzkin path (if one of the path elements is ω, the tensor entry becomes 0).

5.3. Triangle tensor

We introduce the T (triangle) tensor in figure 5.10. This tensor has a special property in combination with the B tensors, given in lemma 5.1. We will use this lemma to zip up the network into the desired compact form.


With k_1, k_2, k_3 ∈ {−1, 0, 1, ω}: T_{k_1 k_2 k_3} = 1 when (k_1, k_2, k_3) is one of

k_1  k_2  k_3
-1    0   -1
-1    1    ω
 0   -1   -1
 0    0    0
 0    1    1
 0    ω    ω
 1   -1    0
 1    0    1
 1    ω    1
 ω   -1   -1
 ω    0    ω
 ω    ω    ω

and T_{k_1 k_2 k_3} = 0 otherwise.

Figure 5.10.: The definition of the triangle tensor.

One would hope that again the values of k_1 and k_2 (input) determine k_3 (output) given that T_{k_1, k_2, k_3} = 1, and that is still true. There is however a minor detail: there are pairs (k_1, k_2) that do not occur in the table. The pairs (−1, −1), (−1, ω), (1, 1), (ω, 1) immediately give a tensor entry of zero.


Figure 5.11.: We claim that this tensor network represents the same tensor as the tensor network given in figure 5.9. The difference is that the pyramid of T tensors has replaced the tensor |0i⊗n.

We claim that this tensor network is equivalent to the MOT tensor. To prove this, we only have to show that it is equivalent to the tensor network given in figure 5.9. The only difference between these tensor networks is that the |0⟩^{⊗n} tensor at the top has been replaced by a pyramid of T tensors. Therefore it is enough to show that the indices at the bottom of the pyramid (at the top of the grid of boxes) in figure 5.11 are all zero, assuming a tensor entry of one. We prove this by contradiction. Assume that there are non-zero indices at the bottom of the pyramid and that the tensor entry is non-zero. Let µ be the column of B tensors beneath the leftmost non-zero index. There are three options:

• The index on top of µ is −1. We walk up the pyramid from here and consider what can happen at every step. First we note that if we meet a triangle where k_1 = 0, then k_3 = k_2, so the index is carried on. Because of this we can ignore all zeroes to the left of µ and pretend that the −1 is the leftmost index at the bottom of the pyramid. Looking at the table in figure 5.10, we conclude that if k_1 = −1 or k_1 = ω, then k_3 must be either −1 or ω, whatever the value of k_2 is. So the index on top of the pyramid will be −1 or ω, and thus the network contracts to a zero tensor entry. This is a contradiction. This case is exactly why we needed the unphysical index ω. Alexander et al. [1] discuss a version of the network with periodic boundary conditions, where they use simpler versions of the introduced tensors (without ω), but for this case we need the triangle tensor to give different outputs for the values (k_1, k_2) = (1, −1) and (k_1, k_2) = (−1, 1).

• The index on top of µ is ω. This is the same as the previous case: we can ignore the zeroes on the left, we find that k_1 = −1 or k_1 = ω implies k_3 = −1 or k_3 = ω, and therefore the index on top of the pyramid will be −1 or ω, so the network contracts to zero.

• The index on top of µ is 1. Then the indices on the left of µ are all ones (see the table in figure 5.8). As discussed in section 5.2, this gives a binary encoded height greater than n/2, which can never happen and leads to a tensor entry of zero because of the boundary conditions.

We find that the leftmost non-zero index at the bottom of the pyramid does not exist, so all the indices at the bottom of the pyramid are zero. This proves that the networks in figure 5.11 and figure 5.9 are the same.

5.4. Zipping up

Finally we will discuss the reason why we introduced the T tensor.

Lemma 5.1. The relation between the B tensor and the triangle tensor given in figure 5.12 holds.


Figure 5.12.: The zipper lemma

Figure 5.13.: We prove that for two connected B tensors, denoted as B2, the tensor entry is zero if (n, m) ∈ {(1, 1), (−1, −1), (−1, ω), (ω, 1)}.

Proof. First we prove that if we have two connected B tensors as in figure 5.13, then the tensor entry is 0 if the upper two indices (n, m) ∈ {(1, 1), (−1, −1), (−1, ω), (ω, 1)}. We stick to the index labelling given in figure 5.13 and assume that the tensor entry is non-zero. Assume n = 1; then the table in figure 5.5 shows that we must have o = 0. So the input k_1 of the right tensor is a 0, and looking at the table again we find that the “carry digit” m can never be a one. We reach a contradiction, and therefore B²_{ijkl,1,1} = 0. We can use similar reasoning to show that B²_{ijklmn} = 0 for all of the other pairs in the set.


Now consider the reduced problem given in figure 5.14. If we are able to show that the equality holds when (b, c) ∉ {(1, 1), (−1, −1), (−1, ω), (ω, 1)}, then the zipper lemma holds as well.

Figure 5.14.: Reduced problem, where we assume that (b, c) ∉ {(1, 1), (−1, −1), (−1, ω), (ω, 1)}.

Let us call the left tensor A and the right one B. We first take a look at A and try to find all values of the indices for which A_{abcde} = 1. We do this by filling in all possible values of a, b and c (ignoring those where (b, c) ∈ {(1, 1), (−1, −1), (−1, ω), (ω, 1)}) and using the tables in figures 5.8 and 5.10 to calculate what d and e should be for the tensor entry to be one. Unfortunately this is just a tedious exercise, because finding a smart way to give expressions for the indices is harder than checking all cases.

As a demonstration, we check the case (a, b, c) = (0, ω, -1). Looking at the right-hand side of figure 5.14, we know that for this tensor to be non-zero, the connected edge must have value -1; see figure 5.10 with k_1 = ω and k_2 = -1. Now we can look at figure 5.8 with k_1 = 0 and k_2 = -1 and conclude that the only way the tensor is non-zero is when d = 1 and e = -1.

Filling in (a, b, c) = (0, ω, -1) on the left side (and assuming a tensor entry of 1) causes the bottom connected edge to have value 0 and the left connected edge to have value ω, by analysing the left B tensor. Analysing the right B tensor then gives d = 1 and the right connected edge value -1. Analysing the T tensor (with k_1 = ω and k_2 = -1) gives e = -1. So in the case (a, b, c) = (0, ω, -1), the equality of figure 5.14 holds.

Checking all possible (a, b, c) where (b, c) ∉ {(1, 1), (−1, −1), (−1, ω), (ω, 1)} gives table 5.1 as the values where both A_{abcde} = 1 and B_{abcde} = 1 (they are zero otherwise).


a    b    c    d    e
0   -1    0    1   -1
0   -1    1    0    0
0    0   -1    1   -1
0    0    0    0    0
0    0    1    1    0
0    0    ω    0    ω
0    1   -1    0    0
0    1    0    1    0
0    1    ω    1    0
0    ω   -1    1   -1
0    ω    0    0    ω
0    ω    ω    0    ω
1   -1    0    0    0
1   -1    1    1    0
1    0   -1    0    0
1    0    0    1    0
1    0    1    0    1
1    0    ω    1    0
1    1   -1    1    0
1    1    0    0    1
1    1    ω    0    1
1    ω   -1    0    0
1    ω    0    1    0
1    ω    ω    1    0

Table 5.1.: The only 24 index values for which A_{abcde} = 1, and also the only 24 index values for which B_{abcde} = 1. This proves that the equality of figure 5.14 holds if (b, c) ∉ {(1, 1), (−1, −1), (−1, ω), (ω, 1)}.

Repeatedly applying lemma 5.1 to the tensor network in figure 5.11 yields the so-called renormalization tensor network given in figure 5.15.


Figure 5.15.: The renormalization tensor network representing the Motzkin state.

This network has some interesting properties, some of which can immediately be spotted. It is built up recursively from a repeating unit of tensors: the left-hand side of figure 5.14 is repeated throughout the network. Assume for now that n = 2^k for some k ∈ N. Then the number of repeating units is
$$\sum_{i=0}^{k-1} 2^i = 2^k - 1 = n - 1.$$

The repeating unit is a tensor with three legs of underlying dimension 4 and two legs of underlying dimension 2. This leads to 2² · 4³ = 256 tensor entries per unit, so 256(n − 1) tensor entries in the whole network (ignoring the zero and Π tensors). In contrast, the MOT tensor in figure 5.1 has 3^n entries, and the binary addition tensor network in figure 5.7 has n · log_2(n) box tensors (n is the width of the network and log_2(n) the height), which make for a total of n · log_2(n) · 4² · 2² = 64 · n log_2(n) entries (again ignoring the zero tensors). We conclude that of all the networks we have seen, the renormalization network is the only one where the number of entries grows linearly with n. Therefore, it can be seen as a very compact description of the tensor (not too many numbers are needed to store the network).


6. Conclusion

The main goal of this thesis was to examine the Motzkin spin-1 chain and learn more about its properties. We started in Chapter 2 by giving the definitions of tensor networks, a whole new way of studying states in a quantum system. We first considered tensors and then defined a way to contract them: start with the tensor product of all the tensors together and then sum over paired indices.

In the next chapter we defined the physical properties a system can have. We gave the proof of the Schmidt decomposition of a pure state and defined two measures of entanglement. We also defined what the terms frustration and criticality mean.

In Chapter 4 we finally start analysing the Motzkin spin chain. We begin by defining the Motzkin Hamiltonian. Then we define an equivalence relation on the orthonormal basis of the system's vector space and prove that elements of an equivalence class have the same coefficient in the basis expansion of the ground state. Together with the boundary conditions, we conclude that the ground state is the uniform superposition of all elements of a certain equivalence class $C_{0,0}$. By finding the ground state, we also proved that the Hamiltonian is frustration-free, since the ground state minimises every local Hamiltonian.

We then introduce a correspondence between elements of $C_{0,m}$ and Motzkin paths. The number of Motzkin paths to any endpoint is counted, and this helps us find the Schmidt coefficients for the decomposition of the state $|C_{0,0}\rangle$.

After analysing the ground state in Chapter 4, we build the self-similar tensor network in Chapter 5. Starting with a general tensor, we split it up into a series of height addition tensors (figure 5.4). These tensors keep track of the height at each entry of the path. We expand this idea by splitting every height addition tensor into multiple BIN tensors, in order to make their index sets finite. We make some alterations to the binary addition tensors, like adding a pyramid of T tensors on top. Then we use the zipper lemma to compress the network into a self-similar form.

The self-similar form of the renormalization tensor network is expected to be linked to critical behaviour [1]. This connection could be investigated in future studies, as there was not enough time to examine it in this thesis. However, the renormalization tensor network gives us a tensor network in which the number of required entries scales linearly with the number of particles in the chain. This means we have found an efficient way to store the Motzkin ground state.


Popular summary

The main goal of this thesis is to study the Motzkin spin-1 chain. This is a physical system with interacting particles lined up in a chain. Each particle can be in three different states, which we call $|1\rangle$, $|0\rangle$ and $|{-1}\rangle$ (they can be thought of as spins of the particle). This way, the state of the whole system can be described by a string of these states, for example $|1, 1, 0, -1, 0, -1, -1\rangle$. Since this system is quantum mechanical, combinations of states are also possible: for example, a particle can be half in the $|1\rangle$ state and half in the $|0\rangle$ state.

This chain of particles comes with a rule called the Hamiltonian, which assigns an energy to each state. The Hamiltonian of the Motzkin chain is a sum of local Hamiltonians, each of which only says something about the energy between two neighbouring particles. More importantly, the system is frustration-free, which means that there exists a state that minimises the energy of all the local Hamiltonians at the same time.

For example, consider the chain of 5 particles that is wrapped up into a circle, where only the states $|0\rangle$ and $|1\rangle$ are allowed, see figure 6.1. Say that every edge in this diagram corresponds to a local Hamiltonian, defined in such a way that to minimise its energy, the two neighbouring particles must be in different states. One can try to satisfy every local Hamiltonian, but one quickly comes to the conclusion that there is no state that minimises every local Hamiltonian. Therefore this system is not frustration-free. If we take the same system, but now with 6 particles, then there are states which minimise every local Hamiltonian, for example the one given in figure 6.1. Thus the 6 particle system is frustration-free.


Figure 6.1.: On the left the described 5 particle system with frustration, and on the right the 6 particle system that is frustration-free. The filled-in state minimises every local term of the Hamiltonian.

What makes our system special is that it is also critical: when we add more particles to the system, the energy levels of the states get closer and closer together. Not many examples have been found of systems that are both frustration-free and critical, which makes our system interesting to study.

We study the system using tensor networks. These are diagrams of shapes and lines that represent states of the system, for example figure 6.2. By studying tensor networks for the minimal energy state of the Motzkin chain, we can learn about properties of the system. Each box node with n lines in a tensor network corresponds to an n-dimensional table of numbers.


A. The Catalan numbers

The problem of the Catalan numbers is a nice combinatorial problem which comes in handy in this thesis. It is formulated as follows: for $n$ pairs of parentheses, find the number $D_n$ of possible balanced configurations. For example $D_3 = 5$, with the following configurations: $((()))$, $(()())$, $(())()$, $()(())$, $()()()$.

To find a formula for this we translate these configurations into paths (similar to Motzkin paths). Let's identify every opening parenthesis with a step up and every closing one with a step down, as illustrated in diagram A.1. The path then starts at height zero, ends at height zero (every pair of parentheses must eventually be closed) and never goes below the x-axis (we cannot have more closing parentheses than opening ones in any prefix). So for a given $n$ we can calculate the number of paths to a point $p$ by

Figure A.1.: A diagram with all possible Catalan paths for n = 4. The numbers denote the number of possible paths from start to that point.

just adding the number of ways to get to the two points which can lead to $p$. This way of adding is reminiscent of Pascal's triangle, which suggests the proof of the following theorem.

Theorem A.1. The Catalan numbers are given by the formula

$$D_n = \frac{1}{n+1} \binom{2n}{n}.$$

Proof. We tackle this problem by first dropping the restriction that every prefix must contain at least as many opening parentheses as closing ones. This is equivalent to allowing paths that cross the bottom line. Now a path


from start to finish is fixed once we pick, out of the $2n$ places, the $n$ places where we go up (the remaining $n$ places are then necessarily down moves). So the number of paths without the ground restriction is $\binom{2n}{n}$. From this number we subtract the number of paths that are made false by the ground restriction.

Imagine a path is false. Then it will certainly hit the red dotted line in diagram A.2. Now define a map $\sigma$ which reflects the part of the path after hitting the red line (for the first time) through the red line.

Figure A.2.: $\sigma$ acting on a false Catalan path.

The reflected paths always end one point lower than the finish; let's call this point $F'$. Now we find that $\sigma$ is a bijection, because every path $\gamma$ ending in $F'$ has a unique false Catalan path as its inverse image. So there are exactly as many false Catalan paths as paths from start to $F'$, of which there are $\binom{2n}{n-1}$. So our formula becomes

$$D_n = \binom{2n}{n} - \binom{2n}{n-1} = \frac{(2n)!}{n! \cdot n!} - \frac{(2n)!}{(n-1)! \cdot (n+1)!} = \frac{(n+1) \cdot (2n)!}{n! \cdot (n+1)!} - \frac{n \cdot (2n)!}{n! \cdot (n+1)!} = \frac{(2n)!}{n! \cdot (n+1)!} = \frac{1}{n+1} \binom{2n}{n}.$$
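As an illustration, the closed formula can be checked against the Pascal-like path counting of figure A.1. The sketch below (plain Python) counts Catalan paths by the same adding scheme and compares the result with $\frac{1}{n+1}\binom{2n}{n}$:

    from math import comb

    def catalan_formula(n):
        # closed formula from theorem A.1 (comb(2n, n) is divisible by n + 1)
        return comb(2 * n, n) // (n + 1)

    def catalan_paths(n):
        # heights[h] = number of valid paths currently at height h;
        # paths never dip below the x-axis (the ground restriction)
        heights = [1] + [0] * n
        for _ in range(2 * n):
            new = [0] * (n + 1)
            for h, ways in enumerate(heights):
                if ways and h + 1 <= n:
                    new[h + 1] += ways   # step up (opening parenthesis)
                if ways and h - 1 >= 0:
                    new[h - 1] += ways   # step down (closing parenthesis)
            heights = new
        return heights[0]                # end back at height zero

    assert all(catalan_paths(n) == catalan_formula(n) for n in range(1, 10))
    print([catalan_formula(n) for n in range(1, 6)])  # [1, 2, 5, 14, 42]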


B. Shannon information

This appendix follows a series of slides created by Tom Carter [5]. Assuming four axioms, we will show that the information function defined in section 2.1 is the only function that satisfies these axioms (up to a constant factor).

We want to define the information we gain from an observation. Say we observe a flip of a biased coin. If the coin is made such that it always lands heads, then we know the result in advance, so we gain no information from the experiment. However, if it is a fair coin, we cannot predict the outcome with certainty, so we do receive information from measuring. If we instead have a biased coin with a 10/90 rate, we gain a lot of information when the unlikely outcome occurs.

It’s the same as attending a lecture where a lot of predictable material is given. Then the information density is really low. However if a lecture is attended where the events are unpredictable, there is a lot information to process.

We want a continuous, differentiable information function $I$ that gives the information of an event that happens with probability $p$. Intuitively, it should satisfy the following axioms:

1. I(p) ≥ 0.

2. $I(p)$ is monotonically decreasing in $p$: the bigger the probability of an event, the less information we gain from it.

3. I(1) = 0, a certain event does not provide information.

4. $I(p_1 \cdot p_2) = I(p_1) + I(p_2)$. We call this property additivity: if two independent events happen, the information gained is the sum of both.

Using these axioms we will try to find a function $I : (0, 1] \to \mathbb{R}_{\geq 0}$.

By the fourth axiom, our function should satisfy $I(p^n) = n I(p)$ for all $n \in \mathbb{N}$. We can extend this to all positive rationals by noting

$$\frac{n}{m} I(p) = \frac{1}{m} I(p^n) = \frac{1}{m} I\big((p^{n/m})^m\big) = I(p^{n/m}).$$

Now we extend this to the rule $a I(p) = I(p^a)$ for all $a \in \mathbb{R}_{\geq 0}$, using the continuity of $I$ and the fact that every positive real number is the limit of a sequence of positive rationals.

Differentiating $a I(p) = I(p^a)$ with respect to $a$, we find
$$I(p) = I'(p^a) \, p^a \log(p).$$


So filling in $a = 1$ gives us $I(p) = I'(p) \, p \log(p)$. Now it is just a matter of solving this differential equation:
$$\frac{I'(p)}{I(p)} = \frac{1}{p \log(p)},$$
$$\int_{t_0}^{t} \frac{I'(p)}{I(p)} \, dp = \int_{t_0}^{t} \frac{1}{p \log(p)} \, dp,$$
$$\log(I(t)) - \log(I(t_0)) = \log(\log(t)) - \log(\log(t_0)),$$
$$\frac{I(t)}{I(t_0)} = \frac{\log(t)}{\log(t_0)}, \qquad I(t) = C \cdot \log(t),$$
where $C = \frac{I(t_0)}{\log(t_0)}$ is a degree of freedom and can be picked as any number in $\mathbb{R}$. Now we check all of the axioms and find that 3 and 4 always hold. To get 1 and 2 we need to pick $C \leq 0$. In a more manageable form, $I(p) = C' \cdot \log(1/p)$ with $C' = -C \geq 0$, which can be rewritten as $I(p) = \log_b(1/p)$, where $b = e^{1/C'}$ (excluding the uninteresting case $I(p) = 0$).

Taking $b = 2$, the information is given in 'bits'; for $b = 3$ in 'trits'; and for $b = e$ the unit is called 'nats' (from the natural logarithm). Between these units there is only a constant factor. In this thesis we use bits as the unit of information.
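As a small numerical illustration (plain Python), here is the resulting information function in bits, together with a check of the additivity axiom:

    import math

    def information(p, b=2.0):
        # information of an event with probability p; b = 2 gives bits
        assert 0 < p <= 1
        return math.log(1 / p, b)

    print(information(1.0))   # 0.0  -- a certain event carries no information
    print(information(0.5))   # 1.0  -- a fair coin flip carries one bit
    print(information(0.1))   # 3.32 -- the rare side of a 10/90 coin

    # axiom 4 (additivity) for two independent events:
    p1, p2 = 0.5, 0.1
    assert math.isclose(information(p1 * p2), information(p1) + information(p2))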


Bibliography

[1] Rafael N. Alexander, Glen Evenbly, and Israel Klich. Exact holographic tensor networks for the Motzkin spin chain. arXiv preprint arXiv:1806.09626, 2018.

[2] Jacob Biamonte and Ville Bergholm. Tensor networks in a nutshell. arXiv preprint arXiv:1708.00006, 2017.

[3] Sergey Bravyi, Libor Caha, Ramis Movassagh, Daniel Nagaj, and Peter W. Shor. Criticality without frustration for quantum spin-1 chains. Physical Review Letters, 109(20), 2012.

[4] Jacob C Bridgeman and Christopher T Chubb. Hand-waving and interpretive dance: an introductory course on tensor networks. Journal of Physics A: Mathematical and Theoretical, 50(22):223001, 2017.

[5] Tom Carter. An introduction to information theory and entropy, 2000.

[6] Jianxin Chen, Xie Chen, Runyao Duan, Zhengfeng Ji, and Bei Zeng. No-go theorem for one-way quantum computing on naturally occurring two-level systems. Physical Review A, 83(5):050301, 2011.

[7] Roger Penrose. Applications of negative dimensional tensors. Combinatorial Mathematics and its Applications, 1:221–244, 1971.
