
University of Amsterdam

MSc Physics

Theoretical Physics

Master Thesis

Using tensor network renormalization to simulate the 2D classical Ising model at criticality

by Schelto Crone (10285288)
July 2016
60 ECTS
Supervisor: dr. P. Corboz
Examiner: prof. dr. J. de Boer


Abstract

Recently a new coarse-graining algorithm for tensor networks has been introduced: the tensor network renormalization (TNR) algorithm. By employing the techniques of entanglement renormalization it can properly coarse-grain a 2D classical system around and at its critical point and reproduce the correct fixed-point structure. In this thesis this method is implemented and its capabilities are tested in the context of the 2D classical Ising model. It is shown that the algorithm can accurately reproduce both the trivial and the critical fixed points of the model. This allows it to properly coarse-grain the system at criticality, resulting in a scale-invariant system. The results are compared with previous coarse-graining schemes, namely the tensor renormalization group and the corner transfer matrix method, in terms of accuracy and computational cost. It is found that, although TNR can simulate a system at its critical point, its large computational cost prohibits simulations at large bond dimensions. Still, the qualitative features produced by the algorithm allow for accurate simulations at criticality.


Popular summary

A many-body system is a model consisting of many elementary quantum particles that interact with each other. Describing such many-body systems with a strong interaction term is in most cases not possible exactly: only a few solutions are known. Many interesting phenomena, for example high-temperature superconductivity or the fractional quantum Hall effect, can only be explained by such models. Numerical methods are therefore of essential importance for obtaining information about these kinds of models.

In the 1990s a method was developed by S. White to do this efficiently for one-dimensional quantum systems: the DMRG method. This method tries to find the most relevant states of the system in an efficient way. Thanks to its very high accuracy, this method caused a revolution in numerical simulation. It is one of the first methods that implements a well-functioning numerical renormalization group transformation. Later it turned out that the accuracy of this algorithm can be explained with the help of a network of tensors.

The algorithm was designed for one-dimensional quantum systems that are not at a critical point. A critical point can be regarded as the point where the system is at a phase transition. To simulate systems at a critical point a different approach is needed. This was later solved with the MERA algorithm, which can simulate this critical point by removing all the short-range correlations from the system.

The DMRG method has also been applied to two-dimensional classical systems, but those methods are likewise not accurate at the critical point. Recently a new method has been proposed, TNR, which should be able to simulate the classical system at its critical point. In this thesis this new method is investigated and compared with the established methods. The methods are compared by looking at their qualitative properties, their accuracy, and the required computational power. It was found that this algorithm can simulate the system truly at the critical point, making new kinds of simulations possible. However, the gain in accuracy of the algorithm unfortunately does not outweigh the much larger computational power needed to run it properly. This could probably be improved, but more research is needed.


Contents

1 Introduction
2 Theoretical background
 2.1 Truncating the singular values: White's rule
 2.2 Entanglement renormalization
 2.3 Intuition from the renormalization group
3 Tensor network methods
 3.1 Tensor networks in graphical notation
 3.2 Permuting and reshaping
 3.3 Contracting tensors efficiently
 3.4 The singular value decomposition
 3.5 Tensor network states and the area law
4 The 2D Ising model
 4.1 Introduction to the 2D Ising model
 4.2 Tensor network representation
 4.3 Calculation of observables from the tensor network representation
 4.4 1D quantum 2D classical correspondence
5 Algorithms
 5.1 The corner transfer matrix (CTM)
 5.2 The tensor renormalization group (TRG)
 5.3 Tensor network renormalization (TNR)
 5.4 The symmetrization step
  5.4.1 Finding the isometries and disentangler
  5.4.2 Details of the implementation
6 Results
 6.1 Free energy
 6.2 Spontaneous magnetization
 6.3 The two-point correlation function
 6.4 Qualitative comparison between TNR and TRG
7 Discussion
 7.1 The computational cost of the algorithms
 7.2 Quantitative comparison of the algorithms
 7.3 Qualitative review of TNR
8 Conclusion
A Additional figures


1 Introduction

The capability to produce accurate simulations of strongly correlated many-body systems is a key challenge in condensed matter physics. In these systems the strong interactions between the particles give rise to exotic collective behaviour, for example the fractional quantum Hall effect or high-temperature superconductivity. However, the models that describe these systems are only in a few cases analytically solvable, and good numerical simulations can give real insight into their behaviour and different phases. Exact numerical calculations are still hard to perform for large systems due to the exponential scaling of the Hilbert space, which prevents any exact-diagonalization-like approach. Therefore proper approximate methods are necessary to solve these models. Some systems can be accurately simulated with Monte Carlo methods; however, the negative sign problem prevents these simulations for fermionic or frustrated models [29]. A method that does not suffer from this sign problem is White's density matrix renormalization group (DMRG) [35]. The DMRG allows simulations of 1D quantum systems and, due to its accuracy, has become the state-of-the-art method for simulating these systems [25, 23]. Later it was shown that the DMRG has an underlying tensor network variational ansatz called the matrix product state (MPS) [31, 32], which allows the method to be rewritten as a tensor network. The approach of DMRG is to truncate the Hilbert space in order to find a fixed number of most relevant states. This idea is in the same spirit as the renormalization group, as originally introduced by Kadanoff and Wilson [14, 36, 37]. One can view it as a renormalization group transformation in the Hilbert space, resulting in a more effective Hilbert space. From quantum information theory it is known that the ground states of local gapped Hamiltonians are only locally entangled [1], which is usually called an area law for the entanglement. This means that only a small portion of the Hilbert space needs to be considered for accurate simulations of the system. A representation of the ground state which automatically possesses this property of local entanglement is the MPS [31]. The DMRG algorithm tries (and succeeds) to employ this local entanglement to find these relevant states, resulting in very accurate simulations [32, 23].

These methods however fail at criticality, because there the system no longer follows a perfect area law. Such systems do not possess a gapped Hamiltonian with only local entanglement, but have entanglement at each length scale [34, 1]. Although it has been shown that 1D systems can still be simulated reasonably accurately by an MPS [31], there is much improvement to be made here. This problem was eventually fixed by Vidal [33] with his multi-scale entanglement renormalization ansatz (MERA). This is an ansatz that can accurately simulate 1D (and perhaps higher-dimensional) quantum systems at criticality. By inserting disentanglers this ansatz fully removes all the short-range entanglement, allowing systems which follow this logarithmic correction to the area law to be properly simulated. The MERA can therefore simulate 1D quantum systems at criticality.

All the above-mentioned methods are designed to simulate 1D quantum systems; however, the same methods can be applied to 2D classical systems. The partition function of a 2D classical system has a direct correspondence with the partition function of a 1D quantum system, and therefore the same methods can be applied to a 2D classical system [17]. The first algorithms which employ this are the corner transfer matrix (CTM) [21] and the tensor renormalization group (TRG) [17]. Although both work well away from criticality, at criticality they suffer from the same problem as the DMRG method [17]. Many variations of TRG [39, 40, 11] have been proposed, but all suffer from this failure at criticality. Recently, in the same spirit as the MERA, the tensor network renormalization (TNR) algorithm has been introduced [5, 2]. It coarse-grains classical systems by employing entanglement renormalization. One can view it as a variation on the TRG algorithm with added disentanglers, and a clear connection with the MERA has been shown [7]. Using this, a proper coarse-graining of a classical system at criticality can be made.

In this thesis the recently proposed TNR algorithm is investigated. The theory behind TNR suggests that it should have certain qualitative properties: it should be able to properly coarse-grain the system and produce the right fixed-point structure. Therefore it should also be able to flow to the correct critical fixed point and accurately calculate quantities at criticality. In this thesis both the qualitative and the quantitative behaviour of TNR is investigated. TNR is then compared with some previous coarse-graining algorithms, namely the previously mentioned TRG [17] and CTM [21] algorithms. Both the accuracy and the computational cost are investigated to conclude which algorithm performs best in which situation. All the calculations are performed on the tensor network representation of the partition function of the two-dimensional classical Ising model, although they can easily be adapted to other networks.

This thesis is organised in the following way: in section 2 the main theoretical foundation of all the algorithms is presented. A way of truncating the Hilbert space effectively is shown and the principles of entanglement renormalization are presented. Both are motivated in the context of a 1D quantum spin-1/2 chain, although they are also applicable to the 2D classical Ising model. A short overview of the principles of tensor networks, and why they are relevant in the context of many-body physics, is given in section 3. In section 4 the 2D classical Ising model is introduced together with its exact solution. There a way of rewriting the partition function of the 2D classical Ising model as a tensor network is also presented, and the correspondence between the 2D classical Ising model and the 1D quantum Ising model is used to argue why the same methods can be used in this context. After these sections the algorithms are explained in more detail; they are all reviewed in section 5. Finally, the results are presented in section 6 and discussed in section 7. The conclusions of this discussion are given in section 8.


2 Theoretical background

In this section the main theoretical background of the algorithms is presented. First the method of truncating a system to its largest singular values, as first introduced by Steven White in his DMRG algorithm [35], is discussed. This method can be seen as the basis for the CTM and TRG algorithms. Then the problem of the accumulation of short-range entanglement is introduced, which can be solved by the method of entanglement renormalization [33]. The techniques are presented in the context of the 1D quantum spin chain; however, they can also be applied (as will be the case in the rest of the thesis) to 2D classical models. Why this is allowed will be argued later in the context of the 2D classical Ising model. Last, a short qualitative review of some of the basic principles of the renormalization group is given.

2.1 Truncating the singular values: White’s rule

The main approach of all the algorithms is to apply a renormalization-group-like (further abbreviated as RG) transformation on the configuration space (in the 2D classical system). This results in a reduction of the degrees of freedom while still describing the relevant physics. The RG flow through this space allows the model to be represented faithfully while at the same time removing the exponential scaling. Still, the question remains how this transformation can be performed in such a way that the system is still represented properly. The first to invent a way of doing this efficiently for 1D quantum spin chains was S. White with his DMRG method [35]. The justification of this method can be found in the ideas of quantum information and the finite scaling of the entanglement entropy. In the DMRG an RG transformation is applied on the Hilbert space, but the same principle can be applied to the configuration space of a classical system. Therefore an outline of the ideas is presented below. Because they are best motivated by looking at the 1D spin chain, this is the context in which they are presented.

The techniques can best be explained by looking at a 1D quantum spin-1/2 chain with a gapped Hamiltonian (one can imagine for example a transverse-field Ising Hamiltonian; the specific Hamiltonian does not matter for the following argument). The 1D quantum spin-1/2 chain has on each lattice site a Hilbert space given as $\mathcal{H}_{1/2} = \{ |\!\downarrow\rangle, |\!\uparrow\rangle \}$, assuming these are the eigenstates of the local Hamiltonian. An $N$-particle spin-1/2 chain therefore has a Hilbert space given as

$$\mathcal{H}_N = \mathcal{H}_{1/2} \otimes \mathcal{H}_{1/2} \otimes \dots \otimes \mathcal{H}_{1/2} = (\mathcal{H}_{1/2})^{\otimes N}. \tag{1}$$

Here one can see that the size of the Hilbert space for this spin chain grows exponentially ($|\mathcal{H}_N| = 2^N$). Due to this exponential scaling, exact diagonalization of a Hamiltonian $H$ in this basis is not efficient. To solve this, a block of lattice sites (2 or more) is, in accordance with the RG ideas, coarse-grained into an effective site. In this blocking step the total number of degrees of freedom of the system is reduced: the new Hilbert space should have a smaller dimension than the original Hilbert spaces combined. The question of when and how this is possible will be addressed later. Applying this RG transformation multiple times results in an 'effective' Hilbert space with such a small dimension that it has become computationally tractable. In each step the Hamiltonian should also be updated to an effective Hamiltonian $H'$ that acts on the truncated Hilbert space. Combining these two, an RG-like flow arises which can be written as

$$(\mathcal{H}, H) \to (\mathcal{H}', H') \to (\mathcal{H}'', H'') \to \dots \to (\mathcal{H}_{\mathrm{eff}}, H_{\mathrm{eff}}) \tag{2}$$

where $|\mathcal{H}| \geq |\mathcal{H}'|$.

The coarse-graining of the Hilbert space can be performed by introducing isometries. These are operators which act on two (or more) lattice sites (more specifically, on the Hilbert spaces associated with these lattice sites) and coarse-grain them into a single site. An isometry $v$, which acts on two lattice sites with vector space $\mathbb{V}$ (in the first step of our lattice this would be $\mathcal{H}_{1/2}$) and coarse-grains them into a single lattice site with vector space $\mathbb{V}'$, can be written as a map

$$v : \mathbb{V} \otimes \mathbb{V} \to \mathbb{V}'. \tag{3}$$

Figure 1: The idea of coarse-graining the lattice with isometries, shown pictorially. The isometries truncate the Hilbert space of the original spin lattice $\mathcal{H}_{1/2}$ into a new, effective Hilbert space $\mathcal{H}'$, which should describe the same physics.

If the vector space $\mathbb{V}$ has dimension $n$, then the vector space $\mathbb{V}'$ has dimension $m \leq n^2$. When $m = n^2$ this coarse-graining is exact and $\mathbb{V}'$ contains the same degrees of freedom as the original lattice. However, when $m < n^2$ the isometry truncates the vector spaces. This naturally leads to the property of the isometry that $vv^\dagger = I$, but $v^\dagger v \approx I$. There is a simple reasoning for this: $v^\dagger$ maps $\mathbb{V}'$ into the larger (or equal) vector space $\mathbb{V} \otimes \mathbb{V}$, and if this is again truncated into $\mathbb{V}'$ no information is lost, so this combined operation is exactly an identity operation. The other way around ($v^\dagger v$), however, starts with a truncation ($n^2 \to m$) and is then brought back to a vector space of dimension $n^2$. This can only be achieved without loss of information when $m = n^2$; if $m < n^2$ it is not exactly an identity. Still, it can be seen as a good approximation of the identity, because the isometry is constructed in such a way that the most relevant information is kept. When the lattice is uniform the same isometry can be used at each lattice site, which can reduce the computational time. An example of this truncation for our original spin-1/2 lattice can be seen in figure 1.
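To make these properties concrete, here is a minimal numpy sketch (an illustration with toy dimensions, not code from the thesis, which used MATLAB): an isometry built from the leading eigenvectors of a stand-in density matrix satisfies $vv^\dagger = I$ exactly, while $v^\dagger v$ is only a projector.

```python
import numpy as np

n, m = 2, 3                      # local dimension n, truncated dimension m <= n**2
rho = np.random.rand(n*n, n*n)   # stand-in for a reduced density matrix
rho = rho + rho.T                # symmetrize so the eigenvectors are orthonormal

# keep the m eigenvectors with the largest eigenvalues; they form the rows of v
w, U = np.linalg.eigh(rho)
v = U[:, np.argsort(w)[::-1][:m]].T     # v : V (x) V -> V', shape (m, n**2)

print(np.allclose(v @ v.T, np.eye(m)))     # True:  v v^dag = I exactly
print(np.allclose(v.T @ v, np.eye(n*n)))   # False: v^dag v is only a rank-m projector
```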

The question still remains how to perform this truncation in such a way that it is efficient. For a random state vector in the Hilbert space one would not expect it to be possible to find a general scheme for this: a priori nothing is known about its behaviour, and therefore the full Hilbert space is needed to properly describe such a vector. Luckily, in most cases one is only interested in the ground state of the system (or low-lying excitations). When looking for the ground state, it turns out that it is less entangled than a random state, which allows this truncation. This is a very important result from quantum information theory and was first used by White [35] in his DMRG paper. He suggested that the most efficient truncation can be found by looking at the reduced density matrix of the ground state. Below a review of his argument is given. One can see the additional structure by investigating the entanglement in the system. To make this more concrete, consider a lattice $\mathcal{L}$ of $N$ spins and a block $B$ of size $L$. A wave function on this lattice can in general always be written as

$$|\Psi\rangle = \sum_{ij} \Psi_{ij} |i\rangle_B |j\rangle_{\mathcal{L}-B} \tag{4}$$

where $|i\rangle \in \mathcal{H}^{\otimes L}$ and $|j\rangle \in \mathcal{H}^{\otimes (N-L)}$, and $\Psi_{ij}$ is a matrix with the respective weights of each combination of basis states.

With the help of the singular value decomposition (SVD, see section 3.4) one can rewrite this as

$$|\Psi\rangle = \sum_{ij} \Psi_{ij} |i\rangle_B |j\rangle_{\mathcal{L}-B} = \sum_{ij} \Big(\sum_n U_{in} s_{nn} V^\dagger_{nj}\Big) |i\rangle |j\rangle = \sum_n^{r} \Big(\sum_i U_{in} |i\rangle\Big) s_{nn} \Big(\sum_j V^\dagger_{nj} |j\rangle\Big) = \sum_n^{r} s_{nn} |n\rangle_i |n\rangle_j. \tag{5}$$

This process is called the Schmidt decomposition, and it shows that the substructure of the wave function allows it to always be written in a diagonal form. The index of summation runs to $r$, which due to the SVD is defined to be $r \leq \min(|\mathcal{H}_i|, |\mathcal{H}_j|)$. If this were applied to our block of $L$ spin-1/2 lattice sites, $r$ would be equal to $r = |\mathcal{H}_{1/2}^{\otimes L}| = 2^L$. The singular values squared sum to unity by construction (the wave function is normalized), which allows them to be interpreted as the square roots of the probabilities of the states [25, 19]. The reduced density matrix (tracing out the whole lattice except the block $B$) is given as

$$\rho_B = \mathrm{Tr}_{\mathrm{env}} |\Psi\rangle\langle\Psi| = \sum_{n,m}^{r} s_{nn} s_{mm} |n\rangle_i \langle m|_i \, \mathrm{Tr}(|n\rangle_j \langle m|_j) = \sum_{n,m}^{r} s_{nn} s_{mm} |n\rangle_i \langle m|_i \, \delta_{mn} = \sum_{n}^{r} s_{nn}^2 |n\rangle_i \langle n|_i = \sum_{n}^{r} P_n |n\rangle_i \langle n|_i \tag{6}$$

where $s_{nn}^2$ can be seen as a probability measure: it gives the probability $P_n$ of encountering the state $|n\rangle_i \langle n|_i$. A reduced density matrix can be interpreted as a measure of how much one would learn about our states when the rest of the lattice is measured. It can also be viewed as a way of seeing the entanglement between the block and the rest of the lattice. This can be better quantified in terms of the entanglement entropy. The von Neumann entanglement entropy is defined as

$$S[\rho_B] = -\mathrm{Tr}[\rho_B \log \rho_B] = -\sum_{i=1}^{r} P_i \log(P_i). \tag{7}$$

The entanglement entropy has a very nice intuitive interpretation. It counts how many qubits one would need to describe the entangled state, or in a similar definition it gives a measure of the number of states (which scales exponentially with the entropy) one would at least need to represent this entanglement structure. This second interpretation is the reason why it is interesting to investigate, because it gives insight into how many states are needed to represent the state faithfully.

This can be made more concrete by looking at the behaviour of this measure. When $r = 1$ we have $S = 0$, which means that there is no entanglement between the block and the rest of the lattice. The wave function can then be written as a product state $|\Psi\rangle = |i\rangle_B |j\rangle_{\mathcal{L}-B}$, which can easily be seen from equation 6. When $r > 1$, the block is entangled with the rest of the lattice, and the entropy is maximal when $P_i = 1/r$ for each $|n\rangle_i$. This can easily be checked by differentiating equation 7 and setting the derivative equal to zero.
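These two limiting cases are easy to check numerically. The sketch below (an illustration, not from the thesis) computes the entropy of eq. 7 from the singular values of $\Psi_{ij}$ for a product state and for a Bell state:

```python
import numpy as np

def entanglement_entropy(psi_matrix):
    """S = -sum_i P_i log P_i, with P_i the squared singular values of Psi_ij."""
    s = np.linalg.svd(psi_matrix, compute_uv=False)
    p = s**2 / np.sum(s**2)          # normalize so the P_i are probabilities
    p = p[p > 1e-12]                 # drop numerical zeros
    return -np.sum(p * np.log(p))

product = np.outer([1, 0], [0, 1])               # |0>|1>: r = 1, so S = 0
bell = np.array([[0, 1], [1, 0]]) / np.sqrt(2)   # (|01>+|10>)/sqrt(2): P_i = 1/2
print(entanglement_entropy(product))   # 0.0
print(entanglement_entropy(bell))      # log(2) ~ 0.693, the maximal value for r = 2
```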

Luckily, for a 1D quantum system with a local gapped Hamiltonian it can be shown that the entropy of the ground state follows an area law [34]

$$S(L) \leq S_{\max} \qquad \text{(1D non-critical system)} \tag{8}$$

where $L$ stands for the size of the block one is considering. One can see that the entropy for a 1D non-critical system grows to a certain value and then saturates. This means that it should be possible to describe the state by a fixed number of states. From the definition of the entanglement entropy one can deduce that a 1D non-critical system is only locally entangled, with no entanglement at length scales much beyond the correlation length. This means that the values $P_i$ show a spectrum which decays rapidly; otherwise the entropy would increase as $L$ increases. This is a very non-trivial result from quantum information theory, which allows for very accurate simulations of the ground state. When the system goes to its critical point it no longer follows the area law, due to the scale invariance of the system (see section 2.3), and a logarithmic correction is found. The entanglement then grows as

$$S(L) \sim \log(L) \qquad \text{(1D critical system).} \tag{9}$$

A direct connection between the 1D non-critical scaling relations (in the case of the Rényi entanglement entropy) and the possible truncation of a matrix product state (MPS) has been shown exactly [31, 26]. In Ref. [31] it is proven that the singular values of the density operator of a wave function decay quickly, which suggests that the wave function can be truncated efficiently by taking only a fixed number of singular values. It is also proven, in the case of an MPS, that this approximate wave function describes all the local and non-local properties of the system, which makes the approximation very powerful. Concretely it says that the wave function can be approximated by

$$|\Psi\rangle = \sum_n^{r} s_{nn} |n\rangle_i |n\rangle_j \approx \sum_n^{\chi} s_{nn} |n\rangle_i |n\rangle_j \tag{10}$$

with $\chi \leq r$. It has been shown that this choice minimizes the error $\| |\Psi\rangle - |\Psi'\rangle \|^2$, where the double vertical lines denote the Hilbert-Schmidt norm (see section 5.3) and $|\Psi'\rangle$ is the approximated wave function.

It has also been shown that this error can, for a finite bond dimension, become exponentially small, which allows for a proper approximation [26]. This approximation scheme can then be used to determine the isometries for the system. The isometry is chosen such that the states with the largest singular values of the system are kept. Finding the isometries is the crucial part of the algorithms, and the specific way of finding them will be presented in the section on each algorithm.
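White's rule itself is a one-liner around the SVD. The following hypothetical numpy sketch (not the thesis implementation) keeps the $\chi$ largest singular values, and shows that the squared Hilbert-Schmidt error equals the sum of the discarded singular values squared:

```python
import numpy as np

def truncate(psi_matrix, chi):
    """Keep only the chi largest singular values (White's rule)."""
    U, s, Vh = np.linalg.svd(psi_matrix, full_matrices=False)
    return U[:, :chi] @ np.diag(s[:chi]) @ Vh[:chi, :], s

psi = np.random.rand(64, 64)          # a generic wave-function matrix Psi_ij
psi /= np.linalg.norm(psi)            # normalize
approx, s = truncate(psi, chi=16)

# the squared error equals the sum of the discarded singular values squared
print(np.linalg.norm(psi - approx)**2)
print(np.sum(s[16:]**2))              # same number (Eckart-Young theorem)
```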

2.2 Entanglement renormalization

In addition to the previously mentioned methods, TNR employs an additional theoretical technique called entanglement renormalization [33]. As mentioned earlier (eq. 9), the entanglement of a 1D critical quantum system grows logarithmically with system size. This would imply that when the system is at criticality it is not possible to efficiently describe it with a fixed number of states $\chi$: because the entropy scales with the system size, a similar scaling would be needed in the required $\chi$ to properly represent the system. This prevents the system from being accurately simulated in the thermodynamic limit. Luckily, by adding the additional techniques of entanglement renormalization, a system at criticality can be well approximated by a finite number of states. This is done by properly removing all the short-range entanglement of the system.

Figure 2: The accumulation of short-range entanglement over multiple RG transformations. The entanglement is between the states within the dotted oval; after multiple poorly placed isometries this entanglement is promoted to higher RG levels.

The previously mentioned coarse-graining scheme, which can be viewed as successive isometries in a sort of tree network, has a subtle but significant problem with coarse-graining the short-range entanglement at each RG layer. By short-range entanglement the entanglement between a site and its direct neighbours is meant. The problem can best be explained by looking at four neighbouring lattice sites $r_1 s_1 s_2 r_2$ embedded in a larger lattice. Imagine that the state we would like to find has the following entanglement structure on these sites:

$$|\Psi\rangle = \frac{1}{2}(|0\rangle_{r_1}|1\rangle_{s_1} + |1\rangle_{r_1}|0\rangle_{s_1})(|0\rangle_{r_2}|1\rangle_{s_2} + |1\rangle_{r_2}|0\rangle_{s_2}). \tag{11}$$

This state is a product of two Bell states, which ensures that the pairs of lattice sites are maximally entangled. Imagine that this state is coarse-grained by the use of isometries; then there are two possibilities for how to place them. Either the isometries are placed onto the sites $r_1 s_1$ and $r_2 s_2$, in which case the entanglement lies within the blocks and the isometries can coarse-grain this state properly. The problem arises when the isometries are placed between the two Bell states: then there would be an isometry between $r_1$ and its left neighbour, between $s_1$ and $s_2$, and between $r_2$ and its right neighbour. In that case the above-mentioned short-range entanglement is not removed, and it gets promoted to the next RG level. Maybe it then gets removed by a luckily placed isometry, but if not it is again promoted to the next coarse-graining level. This problem is called the accumulation of short-range entanglement, and it is what prevents a proper coarse-graining of systems, especially at criticality. In fig. 2 this problem is shown pictorially.

In systems at criticality there is entanglement at all length scales, due to the scale invariance of the system (see section 2.3). The entanglement at each successive layer prevents a proper coarse-graining with only isometries. A scheme where only isometries are inserted will suffer from the above-mentioned accumulation of short-range entanglement; still, if the entanglement stops growing at a certain level, a finite bond dimension can represent all of it (although one larger than strictly necessary, due to this promotion of entanglement). If, however, the entanglement keeps growing at each layer, a different scheme is needed: isometries alone cannot properly represent a system with an entanglement structure with a logarithmic correction. The scheme that successfully achieves this goal is called entanglement renormalization [33]. This scheme fully removes the short-range entanglement, and as a result it can also represent the logarithmic scaling of the entropy properly.

Figure 3: The disentanglers placed in a lattice with short-range entanglement. One can quickly see that the disentanglers allow the entanglement to be completely removed.

The idea of entanglement renormalization is that all the short-range entanglement can be removed by inserting local unitary operators, called disentanglers, between the coarse-grained blocks. These can again be written as a map between vector spaces,

$$u : \mathbb{V} \otimes \mathbb{V} \to \mathbb{V} \otimes \mathbb{V}, \tag{12}$$

where it is good to note that they only connect a site with its nearest neighbour; they do not truncate the system. Due to the unitarity property they satisfy $uu^\dagger = u^\dagger u = I$. This insertion of disentanglers ensures that all the short-range entanglement is removed from a site, even when the isometry is not placed between the two sites. This is because the site now becomes connected to, and disentangled (hence the name) from, the site outside of its block. An example is shown in figure 3.

A good example of the disentangling is obtained by looking again at the state of equation 11 (this example is taken from Ref. [18]). Again we have two sites $s_1 s_2$ and their direct neighbouring sites $r_1 r_2$, entangled as a product of two maximally entangled Bell states:

$$|\Psi\rangle = \frac{1}{2}(|0\rangle_{r_1}|1\rangle_{s_1} + |1\rangle_{r_1}|0\rangle_{s_1})(|0\rangle_{r_2}|1\rangle_{s_2} + |1\rangle_{r_2}|0\rangle_{s_2}) = \frac{1}{2}(|0101\rangle + |0110\rangle + |1001\rangle + |1010\rangle). \tag{13}$$

A graphical representation of the example can be found in figure 4. One can look at the reduced density matrix of this state, which is given as

$$\rho_{s_1 s_2} = \mathrm{Tr}_{r_1 r_2}\, \rho_{r_1 s_1 s_2 r_2} = \sum_{r_1, r_2 \in \{0,1\}} \langle r_1 r_2 |\Psi\rangle\langle\Psi| r_1 r_2\rangle = \frac{1}{4}(|00\rangle\langle00| + |01\rangle\langle01| + |10\rangle\langle10| + |11\rangle\langle11|), \tag{14}$$

where in the last line the kets are defined on the sites $s_1 s_2$. From the reduced density matrix one can compute the entanglement entropy, which is here given as

$$S[\rho_{s_1 s_2}] = -\sum_{i=1}^{4} \tfrac{1}{4} \log \tfrac{1}{4} = \log 4. \tag{15}$$

This state is maximally locally entangled and can greatly benefit from the use of a disentangler. Remember that after the disentangling step isometries are constructed which keep the most relevant eigenstates of $\rho_{s_1 s_2}$. Without the disentangling there is no way to construct isometries which truncate the system without losing much information (each state is equally likely, cf. eq. 14). A disentangler $u$ can be constructed that acts on a single site and its adjacent neighbour ($r_1 s_1$ or $s_2 r_2$) such that the state is disentangled. The disentangler can be written as an operator

$$U = \sum_{ijkl} U^{ij}_{kl} |ij\rangle\langle kl|, \tag{16}$$

where

$$U^{00}_{01} = U^{00}_{10} = U^{01}_{01} = \frac{1}{\sqrt{2}}, \qquad U^{01}_{10} = -\frac{1}{\sqrt{2}}, \qquad U^{10}_{00} = U^{11}_{11} = 1 \tag{17}$$

and the remaining $U^{ij}_{kl}$ are zero if not otherwise specified. It can be checked that this is a unitary operator, e.g.

$$\begin{aligned} U^\dagger U &= \sum_{i'j'k'l'ijkl} \bar{U}^{i'j'}_{k'l'} U^{ij}_{kl}\, |k'l'\rangle \langle i'j'|ij\rangle \langle kl| = \sum_{k'l'ijkl} \bar{U}^{ij}_{k'l'} U^{ij}_{kl}\, |k'l'\rangle\langle kl| \\ &= (U^{00}_{01}U^{00}_{01} + U^{01}_{01}U^{01}_{01}) |01\rangle\langle01| + (U^{00}_{10}U^{00}_{10} + U^{01}_{10}U^{01}_{10}) |10\rangle\langle10| \\ &\quad + (U^{00}_{10}U^{00}_{01} + U^{01}_{10}U^{01}_{01}) |10\rangle\langle01| + (U^{00}_{01}U^{00}_{10} + U^{01}_{01}U^{01}_{10}) |01\rangle\langle10| \\ &\quad + U^{10}_{00}U^{10}_{00} |00\rangle\langle00| + U^{11}_{11}U^{11}_{11} |11\rangle\langle11| \\ &= |00\rangle\langle00| + |01\rangle\langle01| + |10\rangle\langle10| + |11\rangle\langle11| = I. \end{aligned} \tag{18}$$

Here one can see nicely that the values of $U^{ij}_{kl}$ are chosen such that the cross terms drop out. If we let this operator act on the wave function (on a single Bell pair only) we find

$$U |\Psi\rangle_{r_1 s_1 / r_2 s_2} = (U^{00}_{01} + U^{00}_{10}) \tfrac{1}{\sqrt{2}} |00\rangle + (U^{01}_{01} + U^{01}_{10}) \tfrac{1}{\sqrt{2}} |01\rangle = |00\rangle, \tag{19}$$

which is exactly the behaviour a disentangler should exhibit: it removes the local entanglement of the wave function. This can be seen even better by looking again at the density operator, which with the disentanglers is given as (due to the trace, the product $(u_1 \otimes u_2)^\dagger (u_1 \otimes u_2)$ can be inserted and cyclically permuted)

$$\rho_{s_1 s_2} = \mathrm{Tr}_{r_1 r_2}[(u_1 \otimes u_2)\, \rho_{r_1 s_1 s_2 r_2}\, (u_1 \otimes u_2)^\dagger] = \sum_{r_1, r_2 \in \{0,1\}} \langle r_1 r_2 | (u_1 \otimes u_2) |\Psi\rangle\langle\Psi| (u_1 \otimes u_2)^\dagger | r_1 r_2 \rangle = |00\rangle\langle00|. \tag{20}$$

Thus with the disentangler the reduced density matrix gets mapped onto a new state which can be truncated efficiently. The entanglement entropy is now given as $S = \log 1 = 0$, and the state can, without loss of information, be truncated to a single state.

Figure 4: Left, the example with only the isometry. On the right the disentanglers are added, which completely disentangle the wave function.

Although this example was custom made for a disentangler, it really illustrates the potential of this method. The question remains how to find the proper disentangler; this is specific to the algorithm and will be discussed in the section on TNR. With the disentangling all the short-range entanglement is removed, which was one of the problems of simulating the system at criticality. A proper coarse-graining flow is created because there is now no bias in where the isometry is placed.
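The worked example of eqs. 16-20 can be verified numerically. The sketch below (an illustration, not thesis code) builds the $4 \times 4$ matrix with the entries of eq. 17, checks unitarity, and applies it to a Bell pair:

```python
import numpy as np

# basis ordering |00>, |01>, |10>, |11>; U[row, col] = U^{ij}_{kl} of eqs. 16-17
U = np.zeros((4, 4))
U[0, 1] = 1/np.sqrt(2)    # U^{00}_{01}
U[0, 2] = 1/np.sqrt(2)    # U^{00}_{10}
U[1, 1] = 1/np.sqrt(2)    # U^{01}_{01}
U[1, 2] = -1/np.sqrt(2)   # U^{01}_{10}
U[2, 0] = 1.0             # U^{10}_{00}
U[3, 3] = 1.0             # U^{11}_{11}

print(np.allclose(U.T @ U, np.eye(4)))       # unitary: U^dag U = I (eq. 18)

bell = np.array([0, 1, 1, 0]) / np.sqrt(2)   # (|01> + |10>)/sqrt(2)
print(U @ bell)                              # [1, 0, 0, 0] = |00>  (eq. 19)
```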

Further, systems at criticality have equal amounts of entanglement at all length scales. The short-range entanglement can be seen as the entanglement associated with the current length scale; all the longer-range entanglement (the entanglement between a site and its next-to-nearest neighbour, for example) can be seen as the short-range entanglement of one of the next RG layers. Therefore, by removing all the short-range entanglement one removes all the entanglement associated with the current layer. This allows the scheme to properly coarse-grain a system with a logarithmic correction in the scaling of the area law. The logarithmic correction is correctly represented by the disentanglers, and therefore the scheme can simulate quantum systems at criticality. The same argument appears in the MERA [33], where one can quickly see graphically that the ansatz can represent a logarithmic growth at each layer. The entropy represented by the MERA can be linked to the number of bonds that act between two blocks. Because the number of bonds grows logarithmically in system size, the amount of entanglement the MERA can represent has this same growth. This allows the MERA to naturally represent systems at criticality [4].

2.3 Intuition from the renormalization group

All of the algorithms used in this thesis try to find an effective tensor which represents several original tensors (i.e. a block of tensors) in such a way that the tensor network is coarse-grained into a smaller network while still describing all the relevant physics. This can be viewed as a renormalization group transformation, in the same spirit as the renormalization group first introduced by Wilson and Kadanoff [37, 14]. Here a short overview of the main ideas is given, but this should by no means be seen as a full review of the subject. Some of the main ideas and concepts which are relevant for the rest of the thesis are highlighted, and a qualitative explanation is given in the context of the Ising model.

The renormalization group tries to describe a model by removing the microscopic details and looking at an effective system. This idea was suggested by Kadanoff in his original block-spin transformation of the Ising model. He suggested that one can describe the relevant physics of the model by blocking the spins into a single effective spin. The way this block-spin transformation is originally performed is by a majority-rule transformation: the new 'site' is assigned the value of the majority of the sites in the block. This idea, which effectively coarse-grains the microscopic details of the system, can be seen as a qualitative implementation of a renormalization group transformation. The idea of Kadanoff was later made more concrete and general by Wilson in his renormalization group [36, 37]. Still, the block-spin method gives a very intuitive way of viewing an RG transformation; a sketch of the majority rule is given below.
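A minimal sketch of the majority-rule block-spin transformation (a hypothetical illustration; the $2 \times 2$ block size and the random tie-breaking are assumptions, not prescriptions from the thesis):

```python
import numpy as np

def block_spin(spins):
    """Coarse-grain a 2D array of +/-1 spins by a 2x2 majority rule.
    Ties (block sum == 0) are broken randomly."""
    L = spins.shape[0]
    blocks = spins.reshape(L//2, 2, L//2, 2).sum(axis=(1, 3))  # 2x2 block sums
    signs = np.sign(blocks)                                    # majority vote
    ties = signs == 0
    signs[ties] = np.random.choice([-1, 1], size=np.count_nonzero(ties))
    return signs.astype(int)

spins = np.random.choice([-1, 1], size=(8, 8))   # a disordered configuration
print(block_spin(spins))                         # the 4x4 coarse-grained lattice
```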

It can also be used to qualitatively describe the fixed-point behaviour of the Ising model. Independent of the initial conditions of a configuration, after multiple RG transformations each system should flow to one of a certain number of fixed points. To which fixed point the system flows depends on its initial conditions, and as it turns out these fixed points can be used to describe certain phases of the system. The fixed points are a way of arguing that the qualitative behaviour of the system is independent of the microscopic details. One can see the intuition behind this very well by looking at the 2D Ising model. Imagine the model at a temperature $T < T_c$, where the system is in an ordered phase. Most of the spins are aligned (either $\uparrow$ or $\downarrow$; this does not matter), with a few spins misaligned. If the block-spin transformation is applied, most of the misaligned spins will be removed: the majority of the spins are aligned, so they dominate the block-spin transformation. Eventually, after multiple RG transformations, the system will consist only of spins which are completely aligned. This can be viewed as a fixed point of the model: if another RG transformation is applied, the system does not change. A similar point can be found when $T > T_c$, where the system is in its disordered phase. All the spins are disordered, and under multiple RG transformations they stay in the disordered phase, which again can be viewed as a fixed point of the system. One can now see that all possible values $T < T_c$ flow to the same fixed point, namely the Ising model at $T = 0$, and all systems with $T > T_c$ flow to the other fixed point: the Ising model at $T = \infty$.

So with each phase one can associate such a fixed point, but what happens when the model starts at $T = T_c$? The system is then at its phase transition: one cannot determine whether the configuration is in the ordered phase or in the disordered phase. But after multiple RG transformations (one could again take the block-spin transformation) one should still not be able to determine in which phase the system is. Therefore the system should look identical under multiple RG transformations (if the system changed, it would eventually flow to one of the two previously mentioned fixed points). The system should therefore become scale invariant: the physics at each scale should be identical. One can imagine that the clusters of spins possess a fractal-like structure, which also has this property of scale invariance. One can also describe the 2D Ising model at criticality with a CFT [8], which captures this scale-invariant behaviour. It is good to note that in general a critical point does not automatically imply that the states can be described by a CFT; however, the critical point of the 2D Ising model can be.

Also, because of this scale invariance, the model should have correlations at all length scales. This results in a correlation length $\xi = \infty$, which poses a problem for the area law mentioned earlier. The area law was discussed in the context of the 1D Ising model, but it can also be applied to the 2D Ising model. However, if there is entanglement at all length scales, the entropy can no longer remain constant. A logarithmic correction is needed, which can be seen as arising from the multiple length scales at which the entanglement acts. This correction is what prevents the DMRG and TRG from successfully simulating the system at criticality. Luckily, as argued earlier, the technique of entanglement renormalization allows the system to be simulated at criticality. When taking the MERA [33] as an ansatz wave function, the entanglement grows logarithmically, and therefore it can faithfully represent systems at criticality. The infinite MERA also possesses this fractal-like structure: each layer looks identical to the previous one up to a scale transformation [38, 34, 13, 20, 4].


3 Tensor network methods

In this thesis the partition function of the 2D Ising model is mapped to a tensor network, which is then coarse-grained to calculate observables. Before continuing with the main part of the thesis, an overview of the formalism of tensor networks is given here. Most of it is based on the review found in Ref. [27]. It is by no means a comprehensive review; it highlights the most important aspects with an emphasis on the methods used in this thesis, and it motivates the use of tensor networks.

3.1 Tensor networks in graphical notation

A tensor network is, as the name already says, a network of multiple tensors connected with each other. A connection between tensors implies a contraction over one or more of the indices of the tensors. By a tensor $T$ a multidimensional array of complex numbers is meant, e.g. $T_{i_1 i_2 i_3 \dots i_n}$, where $n$ is the number of indices. This is often called an $n$-dimensional tensor or a tensor of rank $n$. By the size of a tensor the number of elements of each index is meant: for example, a $\chi \times \chi \times \chi$ tensor is a tensor with three indices where each index runs from 1 to $\chi$. A tensor is a generalization of a matrix to more indices and, like a matrix, can also be viewed as a representation of a linear map between vector spaces. There is however one subtlety: for a matrix the map is unambiguously defined between two vector spaces, e.g.

$$M : \mathbb{V} \to \mathbb{V}'. \tag{21}$$

A tensor, however, has multiple indices. Therefore it is necessary to clearly specify which indices are the ingoing indices $I$ and which are the outgoing ones $O$. Then the tensor can be seen as a map between vector spaces by defining

$$T : \mathbb{V}_{\mathrm{in}} \to \mathbb{V}_{\mathrm{out}} \tag{22}$$

where $\mathbb{V}_{\mathrm{in}} = \bigotimes_{i \in I} \mathbb{V}_i$ and $\mathbb{V}_{\mathrm{out}} = \bigotimes_{j \in O} \mathbb{V}_j$. By the Hermitian conjugate $T^\dagger$ of a tensor the reverse map is meant, and if the tensor is unitary the combination of these linear maps gives an identity operation

$$T^\dagger T : \mathbb{V}_{\mathrm{out}} \to \mathbb{V}_{\mathrm{in}} \to \mathbb{V}_{\mathrm{out}}. \tag{23}$$

In the language of tensor networks this distinction might sometimes not be very clear. When one considers for example an MPS [25] or the tensor network representation of the 2D Ising model (section 4.2), one cannot clearly see which indices are defined as ingoing or outgoing. In these cases the tensors should only be seen as multidimensional arrays. An isometry, as introduced in section 2.1, can very well be seen as such a linear operation between vector spaces. The isometries are constructed such that the Hermitian conjugate acts as a (one-sided) inverse ($vv^\dagger = I$). In cases where this ambiguity might arise the tensor will be presented as a linear map. It is good to note that in the rest of the thesis only the isometries and disentanglers should be viewed as linear maps; all the other tensors can be seen as multidimensional arrays without this clear mapping between vector spaces.

A tensor network can very quickly become cumbersome to represent in mathematical notation. The network can become quite large, and keeping track of which indices of which tensors are contracted quickly becomes complicated and unclear in formulas. To solve this a graphical notation is introduced, which allows a large network of tensors to be depicted in a very natural way. The main idea of this representation is that each tensor is represented by a shape (the kind of shape carries no meaning), and each index by a leg attached to this shape. When two tensors are contracted with each other they share the same leg. As a result the whole network is represented by a network of shapes with shared legs, while the free legs stand for the free indices of the network. Some examples of this graphical notation, showing simple tensors and elementary contractions, are given in figure 5.

Figure 5: Some elementary tensor networks. In the upper line a scalar, a vector and a matrix are represented in the graphical notation. The second line shows a rank-four tensor and a very elementary example of the contraction of a vector with a matrix, which yields, as can clearly be seen from the graph, again a vector. The last line shows the trace taken over five matrices.

This graphical notation also gives very natural insight into the structure of a network. One can quickly see the shape of the result to expect from the contraction of a large network, simply by counting the number of free legs. The simple example of matrix-vector multiplication shown in the figure already illustrates this principle, but one can imagine that for very large networks the benefit is much greater. Further, one can quickly see which legs of which tensors are contracted with each other, which gives some intuition about the behaviour of the network. As an example, the figure shows a trace over five matrices: this becomes a circle, from which one can very quickly deduce the cyclic property of the trace. This graphical notation will be used extensively in the rest of the thesis, where much larger networks are considered; the same principles as mentioned above always hold.

3.2 Permuting and reshaping

There are two operations which are commonly used for manipulating tensor networks, namely the permutation and the reshaping of tensors. These two operations are reversible and allow the network to be brought into a desired representation before contracting. Permutation is simply the rearranging of the indices of a tensor. Transposing a matrix is a very elementary example of permuting, e.g. $M^T_{ij} = M_{ji}$, and this generalizes easily to multidimensional arrays. Although the permuted tensor is a new, different tensor, it shows the same behaviour as the original tensor in that only the labelling is changed. The other operation, reshaping, might need a bit more introduction. By reshaping the fusing or splitting of indices is meant. As an example one can reshape a tensor $T_{abc} \to T'_{ad}$, where $|d| = |b||c|$ and the index $d$ runs over all possible combinations of $b$ and $c$. Concretely, the new index corresponds to the old indices as $d = b + (c-1)|b|$, i.e. in order $d \in \{(1,1), (2,1), \dots, (|b|-1, |c|), (|b|, |c|)\}$.

Figure 6: On the left the reshaping of a rank-four tensor $T_{ijkl}$ into $T_{(ij)(kl)}$ (assuming the labelling goes clockwise starting from the top leg) is shown. On the right the permutation of the same tensor as $T_{ijkl} \to T_{kjil}$ is shown.

So again this gives a new tensor $T'_{ad}$, but it has a close connection with the original tensor: reshaping does not change the tensor, it only represents the labelling of the indices in a different manner.

Both of these operations are used extensively in the language of tensor networks, because they allow the network to be formatted into a more desirable shape in a very flexible way. They are both reversible and they do not change the network, only its representation. Therefore one can bring the network into the desired shape with these operations, perform the operation one would like to perform, and then reverse the network to its original shape. This yields consistent results, and therefore these operations can be used freely. Examples of operations where reshaping and permuting are used are shown below, namely the contraction of tensors and the singular value decomposition applied to tensors. In both cases the tensors are first transformed into a matrix, then the operation is performed, and finally the matrix is reshaped back into the original tensors. The permutation and reshaping of a network can also be represented graphically, as shown in figure 6. Note the very intuitive way of seeing permutations and reshaping in the graphical notation.
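In numpy (used here as an illustrative stand-in for the MATLAB used in the thesis) permuting and reshaping correspond directly to transpose and reshape. Note that numpy fuses indices in row-major order, whereas the convention $d = b + (c-1)|b|$ above is column-major, as in MATLAB:

```python
import numpy as np

T = np.random.rand(2, 3, 4)      # a rank-3 tensor T_abc

T_perm = T.transpose(2, 0, 1)    # permute: T_abc -> T_cab, shape (4, 2, 3)
T_fused = T.reshape(2, 12)       # fuse b and c: T_abc -> T'_ad with |d| = |b||c|

# both operations are reversible: undoing them recovers the original tensor
assert np.array_equal(T_perm.transpose(1, 2, 0), T)
assert np.array_equal(T_fused.reshape(2, 3, 4), T)
```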

3.3 Contracting tensors efficiently

Although the contraction of a network is a straightforward calculation, the way it is performed can make a large difference in computation and memory cost. The cost of contracting two tensors can be decreased by rewriting the contraction in terms of matrix multiplication. If the problem consists only of matrix multiplications, well-known linear algebra routines can be employed, which in most cases are already highly optimized. These extra steps decompose the summation into pairwise multiplications, which is worth the effort because it makes the contraction step more efficient. This method is implemented in a MATLAB function called NCON [24], which is publicly available and used extensively for the calculations performed in this thesis.

The following example is for two 4-legged tensors, but it is easily generalized to larger or smaller tensors. Imagine two tensors $A_{abcd}$ and $B_{efgh}$, and suppose one wants to calculate the contraction of these tensors over two legs, e.g.

$$C_{cdef} = \sum_{ab} A_{abcd} B_{ebfa}. \tag{24}$$

Instead of directly calculating this contraction, the original tensors are first permuted such that the contracted indices are the last indices of the tensor that comes first in the summation, and the first indices of the tensor that comes second. After this the tensors are reshaped into matrices: all the contracted indices are fused into a single index, and the remaining indices are fused into a single index. This can be written as

$$A_{abcd} \to A'_{cdab} \to A''_{(cd)(ab)}, \qquad B_{ebfa} \to B'_{abef} \to B''_{(ab)(ef)} \tag{25}$$

such that the summation becomes

$$C_{(cd)(ef)} = \sum_{(ab)} A''_{(cd)(ab)} B''_{(ab)(ef)}. \tag{26}$$

Then $C_{(cd)(ef)}$ can be reshaped and permuted back into $C_{cdef}$, and we have the desired result of equation 24.
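A sketch of this permute-reshape-multiply procedure for eq. 24 (the thesis used the MATLAB routine NCON [24]; this numpy version is only an illustration):

```python
import numpy as np

chi = 5
A = np.random.rand(chi, chi, chi, chi)   # A_abcd
B = np.random.rand(chi, chi, chi, chi)   # B_ebfa

# eq. 25: permute so the contracted indices (a, b) sit last in A and first in B
A2 = A.transpose(2, 3, 0, 1).reshape(chi**2, chi**2)   # A''_(cd)(ab)
B2 = B.transpose(3, 1, 0, 2).reshape(chi**2, chi**2)   # B''_(ab)(ef)

# eq. 26: a single matrix multiplication, then reshape back
C = (A2 @ B2).reshape(chi, chi, chi, chi)              # C_cdef

# check against a direct summation of eq. 24
C_direct = np.einsum('abcd,ebfa->cdef', A, B)
print(np.allclose(C, C_direct))   # True
```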

When the network gets increasingly complicated, the order of contraction becomes important. The contraction happens in a step-by-step process, and it is important to specify the order in which the tensors are contracted. The order determines which intermediate tensors occur, and therefore what the computational cost of the contraction is. Usually a fixed bond dimension (denoted by $\chi$) is chosen for the tensor network, and the leading order (defined in terms of the scaling in $\chi$) of the contraction is given as the main measure of the efficiency of an algorithm. One wants to keep this as low as possible, and therefore a careful analysis of the contraction order is necessary.

An example is given in figure 7, where the same network is contracted in two different ways. The computational cost can be calculated by multiplying the dimensions of the legs of the tensors that are contracted, making sure that contracted legs are only counted once. The memory cost can be calculated by multiplying the dimensions of the legs of each intermediate tensor; in both cases the leading term is of course the largest such product. In the example the top branch has a leading order of $O(\chi^7)$, while the lower branch can be calculated with a leading order of $O(\chi^6)$. Therefore, already for this simple example, the contraction can be done more efficiently by choosing the right contraction order. It is good to note that the leading order does not by itself determine whether an algorithm is more efficient; it only determines the scaling in the bond dimension. Because accurate simulations require large $\chi$ this is very important, but one has to be aware that the overall cost of an algorithm depends on more details.

Figure 7: Two different contraction orders for the same contraction. In the top branch the network has a leading contraction cost of order $O(\chi^7)$, while in the lower branch the cost is of order $O(\chi^6)$. Therefore, for this simple but tailor-made example, one can already see that the computational cost can be significantly reduced by finding the optimal order of contraction. The memory requirement is also lower in the lower branch ($O(\chi^6)$ vs $O(\chi^4)$).

3.4 The singular value decomposition

Another frequently used linear algebra technique is the singular value decomposition (SVD). It states that each $m \times n$ matrix $M$ can be decomposed into a product of three matrices,

$$M = U s V^\dagger \tag{27}$$

such that

• $U$ is an $m \times m$ unitary matrix ($U^\dagger U = I$);

• $s$ is an $m \times n$ diagonal matrix containing the singular values of the matrix in decreasing order (if $m \neq n$ the remaining rows are filled with zeros);

• $V$ is an $n \times n$ unitary matrix.

This decomposition can be found by looking at the product $M^\dagger M$. If we assume that this decomposition exists, then one can deduce that

$$M^\dagger M = (U s V^\dagger)^\dagger (U s V^\dagger) = V s U^\dagger U s V^\dagger = V s^2 V^\dagger, \tag{28}$$

which is just an eigenvalue decomposition. The singular value matrix can be found by taking the square root of the eigenvalues of $M^\dagger M$. For the singular value matrix the eigenvalues should be sorted in decreasing order, with the eigenvectors ordered accordingly. The other unitary matrix $U$ can be constructed from the eigenvalue decomposition of the product $M M^\dagger$ [16, 27].

This process of the singular value decomposition is well known for matrices, but can be generalised to tensors. In a similar way as for the contraction, one first reshapes the tensor into a matrix, then performs the decomposition into $U$, $s$ and $V$ matrices, and then splits $U$ and $V$ back into the original legs of the tensor. In this way an exact decomposition of a tensor into three different tensors can be found. It is good to note that, in contrast to a matrix, there is an ambiguity in how to split the tensor: one can pick different axes, and this results in different decompositions [27]. Another way of performing the SVD for tensors is the higher-order SVD, also called the Tucker decomposition. Here the tensor is decomposed into a core tensor with its singular values, and a unitary matrix for each leg [39].
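The same reshape-then-decompose pattern, as an illustrative numpy sketch; the chosen bipartition of the legs, $(a, b)$ versus $(c, d)$, is one of several valid choices:

```python
import numpy as np

T = np.random.rand(3, 4, 5, 6)                 # rank-4 tensor T_abcd

# choose a bipartition of the legs: here (a, b) | (c, d)
M = T.reshape(3*4, 5*6)
U, s, Vh = np.linalg.svd(M, full_matrices=False)

# split U and V back into the original legs
r = s.size
U_tensor = U.reshape(3, 4, r)                  # legs (a, b, bond)
V_tensor = Vh.reshape(r, 5, 6)                 # legs (bond, c, d)

# the decomposition is exact: contracting back recovers T
T_back = np.einsum('abn,n,ncd->abcd', U_tensor, s, V_tensor)
print(np.allclose(T_back, T))   # True
```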

3.5 Tensor network states and the area law

One of the reasons why tensor networks are used so widely in many-body physics is that they are a good ansatz for systems which follow an area law in their entanglement entropy. In this section, by looking at a few standard models, this rather vague statement is made more concrete. The area law, as introduced in equation 8, says that the entanglement entropy saturates for 1D systems away from criticality. It turns out that a tensor network can reproduce this behaviour rather well.

Figure 8: On the left a graphical representation of a matrix product state (MPS) is shown; on the right the MERA ansatz. Both figures are taken from Ref. [23].

The most general wave function of an $N$-site lattice can be written as

$$|\Psi\rangle = \sum_{i_1, \dots, i_N} \Psi_{i_1, \dots, i_N} |i_1\rangle \otimes \dots \otimes |i_N\rangle \tag{29}$$

where $\Psi_{i_1,\dots,i_N}$ can be viewed as an $N$-legged tensor. By applying multiple SVD steps one can rewrite this $N$-legged tensor into a network of three-legged tensors. Each index of the original tensor has $d$ states ($d = 2$ for spin-1/2), and the corresponding index of the three-legged tensor has the same size. The large tensor can be split by reshaping it into a $d \times d^{N-1}$ matrix, applying an SVD to this matrix and then absorbing the singular values back into the unitary matrices. Then $U$ can be reshaped into a rank-three tensor (actually the first becomes a rank-two tensor in the case of open boundaries) and $V^\dagger$ can be reshaped back into a tensor on $N-1$ sites. Repeating this procedure along the entire lattice, the big tensor is split into a smaller tensor for each lattice site. The wave function is then written as

$$|\Psi\rangle = \sum_{i_1, \dots, i_N} A_1^{i_1} \cdots A_N^{i_N} |i_1\rangle \otimes \dots \otimes |i_N\rangle. \tag{30}$$

This is called the matrix product state, shown graphically in figure 8. The outer tensors ($A_1$ and $A_N$) have two legs, while the remaining tensors have three legs (as can be seen in the figure). With each lattice site a tensor is associated; the bonds pointing out (which correspond to the indices of the original tensor) are called the physical legs, of size $d$, and the bonds between the tensors are called the virtual legs. These virtual legs are (usually) truncated to a fixed bond dimension $\chi$ [25, 32].

Without elaborating too much on the details of the MPS, one can show that it reproduces an area law. Imagine looking at the entanglement between two blocks in the lattice, block $A$ of size $L_A$ and block $B$ of size $L_B$. The wave function can again be represented very generally as

$$|\Psi\rangle = \sum_{ij} \Psi_{ij} |i\rangle_{s_A} |j\rangle_{s_B}, \tag{31}$$

similarly as done earlier in the theory section. This $\Psi_{ij}$ can then be rewritten as two matrix product states, which are connected by the summation over the bond where the split is made. It can be shown that the amount of entanglement between the blocks that this ansatz can represent is linked to the bond dimension $\chi$ of that bond, since the blocks are not connected in any other way. This can be linked directly to the entanglement entropy, because it is in essence the same as rewriting the earlier general wave function into two matrix product states. This gives an upper limit to the amount of entanglement entropy represented by the MPS.

This limit can be made more concrete: imagine splitting the above wave function into two parts (MPS existing of two tensors):

|Ψi =X

ijk

ΨAikΨBkj|iis

A|jisB (32)

where ΨAik is a matrix of size 2LA × χ and ΨB

ik is of size χ × 2LB. The reduced density matrix of

ρa only depends on the matrix ΨAik, this can be seen by using again a Schmidt decomposition on

both tensors: |Ψi = X ijknm (UinA snnAVnkA†)(UkmB sBmmVmjB†) |iis A|jisB =X knm

(snnA VnkA†)(UkmB sBmm) |nii|mij =X kn (sAnnVnkA†) |nii|kij =X n sAnn|nii|nij. (33)

This can be used to write the reduced density matrix as
\[
\rho_A = \sum_n \left(s^A_{nn}\right)^2 |n\rangle\langle n|, \tag{34}
\]

which only depends on the singular values of the original $\Psi^A_{ik}$ matrix. Since $\Psi^A_{ik}$ has size $2^{L_A} \times \chi$, it can contain at most $\chi$ non-zero singular values. This shows that the amount of entanglement that can be represented by the MPS is bounded by the bond dimension of the bond where the blocks are split. This bound on the reduced density matrix can also be written as

\[
\mathrm{rank}(\rho_A) \le \chi^{n_A}, \tag{35}
\]

where $n_A$ stands for the number of bonds that connect block A with block B (in the example above this is one, but the bound holds in general). The entanglement entropy is bounded in a similar way:

\[
S_A = -\mathrm{Tr}\,\rho_A \log \rho_A \le n_A \log(\chi). \tag{36}
\]

This can be seen by noting that in the maximally entangled case all eigenvalues of $\rho_A$ are equal. Because the number of states is given by $\mathrm{rank}(\rho_A)$ and the eigenvalues are normalized, each eigenvalue of the density matrix then equals $1/\mathrm{rank}(\rho_A)$; combining this with equation 35 gives equation 36. If the bond dimension scales with the number of states, the representation can become exact. If however $\chi$ is fixed to a finite value, the entropy becomes a constant:

\[
S_A \sim \mathrm{constant}. \tag{37}
\]

This equation shows that matrix product states reproduce the area law behaviour. In 1D, an MPS has a fixed number of bonds one can cut (a single bond in the case of open boundaries, two in the case of periodic boundaries), so an MPS can describe a system whose entanglement entropy is constant in the block size. Fortunately, the ground states of 1D quantum systems with a local Hamiltonian away from criticality show exactly this scaling, and can therefore be


described very accurately by an MPS. However, this also reveals the problem with an MPS-like description at criticality: there a logarithmic correction is needed, and this cannot be represented properly by an MPS in the thermodynamic limit.
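The bound of equations 35 and 36 is easy to verify numerically. Below is a small illustrative sketch (block sizes and the bond dimension are arbitrary example values): a random two-block state is truncated to bond dimension $\chi$ as in equation 32, and the entropy is computed from the singular values as in equation 34.

```python
import numpy as np

LA, LB, chi = 4, 4, 3                 # example block sizes and bond dimension
psi = np.random.rand(2**LA, 2**LB)
psi /= np.linalg.norm(psi)

# Split into two tensors and truncate the bond to dimension chi (eq. 32)
U, s, Vh = np.linalg.svd(psi, full_matrices=False)
s = s[:chi] / np.linalg.norm(s[:chi]) # keep chi singular values, renormalize

# Entanglement entropy from the singular values (eqs. 33-34)
p = s**2                              # eigenvalues of the reduced density matrix
S_A = -np.sum(p * np.log(p))
print(S_A, np.log(chi))               # S_A never exceeds log(chi), eq. 36
```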

In fig. 8 on the right the MERA ansatz is shown [33]. Without going into too much detail, this ansatz can be seen as the combination of isometries and disentanglers described earlier. One of its interesting properties is that it can describe the logarithmic scaling of the entanglement entropy. If one again divides the system into two separate parts, the number of bonds that have to be cut now grows logarithmically: every extra layer contributes extra bonds, but the size of the layers decreases exponentially with depth. New bonds are therefore only introduced logarithmically in the block size, so the MERA ansatz can describe the logarithmic correction needed for systems at $T_c$ [4]. The same scaling analysis of the MPS and MERA indicates which ansatz is appropriate for a given system.
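This counting can be made slightly more explicit (a rough sketch of the argument, not a rigorous derivation): in a MERA with branching factor 2 each layer halves the number of sites, so a block of $L_A$ sites is affected by roughly $\log_2 L_A$ layers, each of which contributes only a bounded number of bonds crossing the boundary of the block. Hence $n_A \sim \log_2 L_A$, and equation 36 gives
\[
S_A \le n_A \log(\chi) \sim \log_2(L_A)\,\log(\chi),
\]
which is precisely the logarithmic scaling required at criticality.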


4 The 2D Ising model

In this thesis tensor network techniques are implemented and investigated for the 2D classical Ising model. The 2D Ising model is one of the best known interacting models in statistical physics. Furthermore, it exhibits a second order phase transition between an ordered and a disordered phase. The model describes spins on a two dimensional lattice with a very simple nearest neighbour interaction. Due to this simplicity it can be seen as the most basic model of interacting spins one can propose. Although it is an interacting model, it was solved analytically by Onsager in 1944 [22]. This makes it a good benchmark for new computational methods, and therefore it is the main topic of this thesis. In the sections below the model is introduced more concretely and the exact solution is presented. The partition function can be represented as an infinite network of uniform tensors, which is described in more detail in section 4.2.

4.1 Introduction to the 2D Ising model

The 2D classical Ising model is a model of spin particles on a lattice. The spins have only a nearest neighbour interaction, where the energy of a state increases when neighbouring spins are not aligned. The model can also include an interaction term with an external magnetic field, but in that case no exact solution is known. This external magnetic field is not considered in this thesis and the term is therefore omitted. The Hamiltonian of the Ising model is given as

\[
H(\{\sigma\}) = -J \sum_{i,j} \left(\sigma_{i,j}\sigma_{i+1,j} + \sigma_{i,j}\sigma_{i,j+1}\right), \tag{38}
\]

where each $\sigma_{i,j}$ stands for a spin site and can take the values $\pm 1$ (corresponding of course to ↑ and ↓). The summation runs over all neighbouring spin sites. The minus sign ensures that the lowest energy state is the one where all spins are aligned, either ↑ or ↓. The factor $J$ sets the coupling strength, but because there is no magnetic field it can be set to unity by choosing appropriate units. The partition function of this model is given by definition as

\[
Z(\beta) = \sum_{\{\sigma\}} e^{-\beta H(\{\sigma\})}. \tag{39}
\]
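For very small lattices the sum in equation 39 can be carried out by explicit enumeration, which is useful later as a check on the tensor network representation of section 4.2. A minimal sketch (assuming periodic boundaries; the helper name is illustrative):

```python
import itertools
import numpy as np

def ising_Z_bruteforce(L, beta, J=1.0):
    """Sum exp(-beta*H) over all 2**(L*L) configurations of an L x L
    periodic lattice (eqs. 38-39). Only feasible for very small L."""
    Z = 0.0
    for config in itertools.product([-1, 1], repeat=L * L):
        s = np.array(config).reshape(L, L)
        # nearest-neighbour energy with periodic boundaries
        H = -J * np.sum(s * np.roll(s, 1, axis=0) + s * np.roll(s, 1, axis=1))
        Z += np.exp(-beta * H)
    return Z

print(ising_Z_bruteforce(3, beta=0.5))
```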

This partition function was evaluated exactly by Onsager [22] using transfer matrices in the thermodynamic limit; his results are presented here without derivation. The solution for the Helmholtz free energy per site is given as

\[
f = -\frac{\log(2)}{2\beta} - \frac{1}{2\pi\beta} \int_0^{\pi} \log\!\left[\cosh(2\beta J)\cosh(2\beta J) + \frac{1}{k}\sqrt{1 + k^2 - 2k\cos(2\theta)}\,\right] \mathrm{d}\theta, \tag{40}
\]
where $k = 1/\big(\sinh(2\beta J)\sinh(2\beta J)\big)$. The spontaneous magnetization per site is given by

\[
m(T) =
\begin{cases}
\left(1 - \sinh^{-4}(2\beta J)\right)^{1/8}, & \text{if } T < T_c \\[2pt]
0, & \text{if } T > T_c.
\end{cases} \tag{41}
\]
Here one can see the two phases of the model. Below $T_c$ there is a finite spontaneous magnetization, which means that the expectation value $m \equiv \langle \sigma_{i,j} \rangle > 0$; in this phase most of the spins point in the same direction.


When $T > T_c$, there is enough energy in the system for thermal fluctuations to break down the ordering of the spins, and the system reaches a disordered phase. The exact value of the critical temperature at which this phase transition occurs was first derived using the Kramers–Wannier duality, which relates the low temperature expansion to the high temperature expansion of the Ising model [15]. The value found with this duality is exact and given as

\[
T_c = \frac{2}{\log\!\left(1 + \sqrt{2}\right)} \approx 2.269 \tag{42}
\]

in the case where $J = 1$ and $h = 0$. Note that the ordered phase is a phase with spontaneous symmetry breaking: the system has to choose whether to align the spins up or down. Both alignments have the same energy, so there is a clear symmetry between the two orderings. However, for $T < T_c$ the system really has to pick one of the two orderings and break this symmetry. Without a magnetic field this happens through an accumulation of random fluctuations, which results in the breaking of the symmetry. With an external magnetic field the system favours one of the two alignments and this symmetry is no longer present.
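The closed-form expressions above are straightforward to evaluate numerically; they serve as reference values throughout this thesis. A small sketch using scipy (illustrative only, names are not from the thesis code):

```python
import numpy as np
from scipy.integrate import quad

J = 1.0
Tc = 2.0 / np.log(1.0 + np.sqrt(2.0))          # eq. 42, approximately 2.2692

def free_energy(T):
    """Onsager's free energy per site, eq. 40."""
    beta = 1.0 / T
    k = 1.0 / np.sinh(2 * beta * J) ** 2
    integrand = lambda theta: np.log(
        np.cosh(2 * beta * J) ** 2
        + np.sqrt(1 + k**2 - 2 * k * np.cos(2 * theta)) / k)
    integral, _ = quad(integrand, 0, np.pi)
    return -np.log(2.0) / (2 * beta) - integral / (2 * np.pi * beta)

def magnetization(T):
    """Spontaneous magnetization per site, eq. 41."""
    if T < Tc:
        beta = 1.0 / T
        return (1.0 - np.sinh(2 * beta * J) ** (-4)) ** 0.125
    return 0.0

print(free_energy(2.0), magnetization(2.0))    # m(2.0) is about 0.9113
```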

For the 2D Ising model there is also an exact expression for the two-point correlation function, although it took considerably more effort to find than the observables mentioned above. It was eventually solved for the general case in Ref. [28]; here only the solution is quoted. The exact solution is given in terms of Toeplitz matrices. For the comparison only the correlation along one direction, $\langle \sigma_{i,j}\sigma_{i+N,j} \rangle$, is needed, so only this case is shown below. Due to translational invariance this correlator does not depend on the values of $i$ and $j$ but only on the separation $N$. Therefore the correlator can be written as

\[
G\big((i,j),(k,j)\big) = G(|i - k|) = \langle \sigma_{i,j}\sigma_{k,j} \rangle - \langle \sigma_{i,j} \rangle\langle \sigma_{k,j} \rangle = \langle \sigma_{i,j}\sigma_{i+N,j} \rangle - m^2, \tag{43}
\]
where in the last step the definition $m = \langle \sigma_{i,j} \rangle$ is used. The two-point correlation function is given in terms of a Toeplitz determinant. First define
\[
z = \tanh\!\left(\frac{J}{kT}\right), \qquad z^* = \frac{1 - z}{1 + z}, \tag{44}
\]
and $\alpha_1 = z z^*$, $\alpha_2 = z^*/z$.

Then the exact solution is given as
\[
\langle \sigma_{i,j}\sigma_{i+N,j} \rangle =
\begin{vmatrix}
a_0 & a_{-1} & a_{-2} & \cdots & a_{-N+1} \\
a_1 & a_0 & a_{-1} & \cdots & a_{-N+2} \\
a_2 & a_1 & a_0 & \cdots & a_{-N+3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{N-1} & a_{N-2} & a_{N-3} & \cdots & a_0
\end{vmatrix}, \tag{45}
\]
where
\[
a_n = \frac{1}{2\pi} \int_0^{2\pi} e^{-in\theta} \phi(\theta)\, \mathrm{d}\theta, \qquad
\phi(\theta) = \sqrt{\frac{(1 - \alpha_1 e^{i\theta})(1 - \alpha_2 e^{-i\theta})}{(1 - \alpha_1 e^{-i\theta})(1 - \alpha_2 e^{i\theta})}}. \tag{46}
\]


It is good to note that equation 46 contains a branch cut due to the square root. This can be removed by rewriting the equation. First write

\[
z_1 = (1 - \alpha_1 e^{i\theta}) \quad \text{and} \quad z_2 = (1 - \alpha_2 e^{-i\theta}), \tag{47}
\]

and by definition

\[
\bar{z}_1 = (1 - \alpha_1 e^{-i\theta}) \quad \text{and} \quad \bar{z}_2 = (1 - \alpha_2 e^{i\theta}). \tag{48}
\]

These can be used to rewrite equation 46 as
\[
\phi(\theta) = \sqrt{\frac{z_1 z_2}{\bar{z}_1 \bar{z}_2}} = \sqrt{\frac{(z_1 z_2)^2}{\bar{z}_1 \bar{z}_2\, z_1 z_2}} = \sqrt{\frac{(z_1 z_2)^2}{|z_1 z_2|^2}} = \frac{z_1 z_2}{|z_1 z_2|}, \tag{49}
\]
which finally results in

\[
\phi(\theta) = \frac{(1 - \alpha_1 e^{i\theta})(1 - \alpha_2 e^{-i\theta})}{\left|(1 - \alpha_1 e^{-i\theta})(1 - \alpha_2 e^{i\theta})\right|}. \tag{50}
\]

This rewriting removes the branch cut of the original $\phi(\theta)$ function, so the exact solution can always be evaluated; without the rewriting, discontinuities occur in the $T < T_c$ range.
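The determinant of equation 45, with the branch-cut-free $\phi(\theta)$ of equation 50, can then be evaluated directly. Below is a small illustrative sketch in which the Fourier coefficients $a_n$ are computed by numerical quadrature, which is accurate enough for modest $N$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.linalg import toeplitz

def correlator(N, T, J=1.0):
    """<sigma_{i,j} sigma_{i+N,j}> from the Toeplitz determinant (eq. 45),
    using the branch-cut-free phi(theta) of eq. 50."""
    z = np.tanh(J / T)
    zs = (1.0 - z) / (1.0 + z)
    a1, a2 = z * zs, zs / z                       # alpha_1 and alpha_2, eq. 44

    def phi(theta):
        num = (1 - a1 * np.exp(1j * theta)) * (1 - a2 * np.exp(-1j * theta))
        den = abs((1 - a1 * np.exp(-1j * theta)) * (1 - a2 * np.exp(1j * theta)))
        return num / den

    def a(n):                                     # Fourier coefficient a_n, eq. 46
        re, _ = quad(lambda t: (np.exp(-1j * n * t) * phi(t)).real, 0, 2 * np.pi)
        im, _ = quad(lambda t: (np.exp(-1j * n * t) * phi(t)).imag, 0, 2 * np.pi)
        return (re + 1j * im) / (2 * np.pi)

    col = [a(n) for n in range(N)]                # a_0 ... a_{N-1} (first column)
    row = [a(-n) for n in range(N)]               # a_0 ... a_{-N+1} (first row)
    return np.linalg.det(toeplitz(col, row)).real

print(correlator(4, T=2.0))
```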

This solution will be compared with the simulations found by the tensor network representation.

4.2 Tensor network representation

Because this model has an exact solution, it is a good benchmark for the accuracy of the different tensor network coarse-graining algorithms. First, however, the model has to be translated into the language of tensor networks. It turns out that the partition function can rather easily be represented as an infinite tensor network built from copies of a single four-legged tensor $a$. To do this the partition function is rewritten as

\[
Z(\beta) = \sum_{\{\sigma\}} e^{-\beta H(\{\sigma\})} = \sum_{\{\sigma\}} \prod_{\langle i,j \rangle} e^{\beta \sigma_i \sigma_j} = \sum_{\{\sigma\}} \prod_{\langle i,j \rangle} Q_{s,s'}. \tag{51}
\]

Because the spins take only two values, the Boltzmann factor in the last line can be written as a simple two-by-two matrix $Q_{s,s'}$, given as
\[
Q_{s,s'} = \begin{pmatrix} e^{\beta} & e^{-\beta} \\ e^{-\beta} & e^{\beta} \end{pmatrix}, \tag{52}
\]
where the indices $s, s'$ run over the possible values of the spins, in the Ising case $\{1, -1\}$ (aligned spins receive the weight $e^{\beta}$). Using this, the partition function can be represented as a network of these $Q_{s,s'}$ matrices. On

each bond between two lattice sites a $Q_{s,s'}$ matrix is placed, and on each lattice site a four-legged delta tensor $\delta_{ijkl}$ is placed, contracting the adjacent $Q_{s,s'}$ matrices. The resulting network is shown in fig. 10. The delta tensor represents all possible states of the lattice site (in the Ising case ↑ and ↓), and the $Q_{s,s'}$ matrix gives the interaction between the lattice sites with the proper weight. Because the network is fully contracted, all the deltas and interaction matrices are summed over. This generates all possible configurations of the model, with each weight


properly included due to the chosen $Q_{s,s'}$ matrix. Therefore this gives a representation of the partition function of the Ising model.

To argue a bit more for why this is the correct representation, an example is presented below in which the system is grown from a small one, adding lattice sites one by one. A number of small systems are shown in fig. 9. The first system (fig. 9a) contains only two lattice sites, so all possible configurations are ↑↑, ↑↓, ↓↑ and ↓↓. The partition function sums over all these configurations with the proper weights. Using equation 39 this gives

\[
Z_{2\,\text{sites}} = e^{-\beta H(\uparrow,\uparrow)} + e^{-\beta H(\uparrow,\downarrow)} + e^{-\beta H(\downarrow,\uparrow)} + e^{-\beta H(\downarrow,\downarrow)}. \tag{53}
\]

The tensor network representation here contains two one-legged 'delta' functions $\delta_i$ and $\delta_j$ and a single interaction matrix $Q_{ij}$. The delta functions might look trivial because they only have a single index, but they ensure that there is a summation over the indices $i$ and $j$. The whole summation of the TN representation is then a summation over all possible values of $Q_{ij}$, which is given as:

\[
Z_{TN} = Q_{\uparrow,\uparrow} + Q_{\uparrow,\downarrow} + Q_{\downarrow,\uparrow} + Q_{\downarrow,\downarrow} = Z_{2\,\text{sites}}, \tag{54}
\]

which holds due to the definition of $Q_{s,s'}$. This is exactly the partition function, so the representation works for two lattice sites. The next figure shows three sites (fig. 9b): a new delta function $\delta_l$ is added and the middle delta function is extended to a two-legged delta function $\delta_{kj}$. The summation now runs over one extra index, which gives the representation as:

\[
Z_{TN} = Q_{\uparrow,\uparrow}Q_{\uparrow,\uparrow} + Q_{\uparrow,\uparrow}Q_{\uparrow,\downarrow} + Q_{\uparrow,\downarrow}Q_{\downarrow,\uparrow} + Q_{\uparrow,\downarrow}Q_{\downarrow,\downarrow} + Q_{\downarrow,\uparrow}Q_{\uparrow,\uparrow} + Q_{\downarrow,\uparrow}Q_{\uparrow,\downarrow} + Q_{\downarrow,\downarrow}Q_{\downarrow,\uparrow} + Q_{\downarrow,\downarrow}Q_{\downarrow,\downarrow} = \sum_{\{\sigma\}} \prod_{\langle i,j \rangle} Q_{s,s'} = Z_{3\,\text{sites}}. \tag{55}
\]

The last line can easily be seen to hold, because all possible spin configurations appear in the summation with the proper weights. A quicker way to see this is to view the expression as the product of two two-site partition functions (eq. 54) that share a single site, which forces the second index of the first $Q_{s,s'}$ to equal the first index of the second $Q_{s',s''}$. This again yields the partition function.
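The equalities 54 and 55 are easy to check numerically. A minimal sketch with the $Q$ matrix of equation 52 (the inverse temperature is an arbitrary example value):

```python
import numpy as np

beta = 0.4                                        # example value
Q = np.array([[np.exp(beta), np.exp(-beta)],
              [np.exp(-beta), np.exp(beta)]])     # eq. 52

# Two sites (eq. 54): the two one-legged deltas simply sum over both indices
Z2_tn = Q.sum()
Z2 = sum(np.exp(beta * s1 * s2) for s1 in (1, -1) for s2 in (1, -1))
print(np.isclose(Z2_tn, Z2))                      # True

# Three sites in a row (eq. 55): the two-legged middle delta forces the
# shared index of the two Q matrices to coincide
Z3_tn = np.einsum('ij,jk->', Q, Q)
Z3 = sum(np.exp(beta * (s1 * s2 + s2 * s3))
         for s1 in (1, -1) for s2 in (1, -1) for s3 in (1, -1))
print(np.isclose(Z3_tn, Z3))                      # True
```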

When looking more closely at the above representation, note that the multiplication over all bonds is automatically taken into account. Also, due to the delta function the lattice site in the middle is not over-counted: the delta function ensures that each site is counted once as an up spin and once as a down spin. Adding one more lattice site to complete the square gives the system shown in fig. 9c. Again the summation in combination with the delta functions ensures that this is a representation of the partition function; the explicit calculation is omitted here. Extending this idea to large (or infinite) systems gives the network shown in fig. 10. This network is already a proper representation, but for simplicity it is transformed into a network of a single uniform tensor. To do this, first the interaction matrix $Q_{s,s'}$ is split into two identical matrices such

that $Q_{s,s'} = \sum_i (\sqrt{Q})_{s,i} (\sqrt{Q})_{i,s'}$. This can be done numerically, but after some algebra one can also check that the solution is given by
\[
\sqrt{Q} = \frac{1}{\sqrt{2}}
\begin{pmatrix}
\sqrt{\cosh\beta} + \sqrt{\sinh\beta} & \sqrt{\cosh\beta} - \sqrt{\sinh\beta} \\
\sqrt{\cosh\beta} - \sqrt{\sinh\beta} & \sqrt{\cosh\beta} + \sqrt{\sinh\beta}
\end{pmatrix}. \tag{56}
\]

Using this, all the interaction matrices are split, as shown in the second step of fig. 10. A new uniform tensor $a_{ijkl}$ is then defined as the contraction of a single delta tensor with four $\sqrt{Q}$ matrices. This tensor is, by construction, completely symmetric under permutation of its indices. This finally gives the full partition function of the Ising model as a uniform network of $a$ tensors.
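The construction of the uniform tensor $a_{ijkl}$ takes only a few lines. The sketch below (index order, lattice size and temperature are choices made here for illustration) also checks the resulting network against a brute-force sum on a small periodic lattice:

```python
import itertools
import numpy as np

beta = 0.4
Q = np.array([[np.exp(beta), np.exp(-beta)],
              [np.exp(-beta), np.exp(beta)]])

# Symmetric square root of Q, eq. 56
ch, sh = np.sqrt(np.cosh(beta)), np.sqrt(np.sinh(beta))
sqrtQ = np.array([[ch + sh, ch - sh],
                  [ch - sh, ch + sh]]) / np.sqrt(2.0)
assert np.allclose(sqrtQ @ sqrtQ, Q)

# Contract one delta tensor with four sqrt(Q) matrices into a_{ijkl}
delta = np.zeros((2, 2, 2, 2))
delta[0, 0, 0, 0] = delta[1, 1, 1, 1] = 1.0
a = np.einsum('uldr,ui,lj,dk,rm->ijkm', delta, sqrtQ, sqrtQ, sqrtQ, sqrtQ)

def Z_brute(L):
    """Brute-force partition function on an L x L periodic lattice, J = 1."""
    Z = 0.0
    for cfg in itertools.product([-1, 1], repeat=L * L):
        s = np.array(cfg).reshape(L, L)
        H = -np.sum(s * np.roll(s, 1, 0) + s * np.roll(s, 1, 1))
        Z += np.exp(-beta * H)
    return Z

# Four a-tensors on a 2x2 torus, index order (up, left, down, right)
Z_tn = np.einsum('bfae,decf,ahbg,cgdh->', a, a, a, a)
print(np.isclose(Z_tn, Z_brute(2)))               # True
```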


Figure 9: An example where the lattice and the TN representation are grown one site at a time, to argue that the TN representation faithfully represents the Ising partition function. (a) Two lattice sites. (b) Three lattice sites. (c) Four lattice sites.

Figure 10: All the steps of the Ising model tensor network representation. Starting from the lattice of spins, one constructs a tensor network by placing a delta tensor (shown in grey) on each lattice site and a properly chosen interaction matrix (shown in yellow, see eq. 52) on each bond between lattice sites. The interaction matrix is then split into two identical matrices, which are finally contracted with the deltas to form a new fully symmetric tensor $a_{ijkl}$.
