
MSc Physics

Track: Theoretical Physics

Master Thesis

Tensor Network algorithms for 1D spin chains

Ground state searches and entanglement entropy minimization

by

Leon Schoonderwoerd

10000848

July 29, 2016

60 EC, August 2015 – July 2016
Supervisor: P.R. Corboz
Second Examiner: C.J.M. Schoutens

Leon Schoonderwoerd: Tensor Network algorithms for 1D spin chains, Ground state searches and entanglement entropy minimization. © July 2016. Supervisor: P.R. Corboz. Second examiner: C.J.M. Schoutens. Time frame: August 2015 – July 2016.

ABSTRACT

Tensor networks provide a relatively novel and powerful way to think about and perform calculations with lattice-based many-body quantum systems. This thesis treats tensor networks in the form of Matrix Product States (MPS) and Matrix Product Operators (MPO), which are used for calculations on (1D) spin chains. The Density Matrix Renormalization Group (DMRG) algorithm, concerned with finding the ground state and its energy, is introduced. As the numerical efficiency of DMRG and the MPS representation of a state are limited by entanglement, a way to minimize entanglement (entropy) could improve these methods.

This thesis aims to provide a proof of principle for such an approach for the minimization of entanglement entropy using local unitary basis transformations. The idea for this method is based on [14]. However, instead of a parameter-based approach, an environment network-based algorithm is used here. The combination of the entropy minimization with DMRG is discussed. A main limitation to the entanglement minimization method is the scaling of MPO dimensions under the unitary transformations. An approach to overcome this limitation by using 'projectors' to constrain MPO dimensions is given.

SUMMARY

Within physics, condensed matter physics concerns itself with the study of matter and materials. These materials are built up out of (thermodynamically) large numbers of particles. For this reason, the physics of large numbers of interacting particles plays a major role within condensed matter physics.

A specific example of an area within this field is the description of microscopic models for magnetism. These models are generally lattice models, in which a macroscopic magnet consists of a lattice with a microscopic magnetic moment ('spin') at each lattice site. To simplify calculations, these models are often applied to low-dimensional systems. This thesis considers one-dimensional spin lattices: 'spin chains'.

Although much work is put into finding exact solutions for many-body systems, many models turn out not to be exactly solvable. For this reason, heavy use is made of computational methods in this field, with which such systems can still be studied. A number of widely used computational methods, such as Monte Carlo simulation, suffer from the so-called fermionic sign problem. This problem makes simulations of a certain class of models ('fermionic models') highly inaccurate.

A simulation technique that does not suffer from the fermionic sign problem is the Density Matrix Renormalization Group (DMRG). By defining this technique in a tensor network fashion, in terms of 'matrix product states' (MPS) and 'matrix product operators' (MPO), an intuitive and effective way arises to find ground states of many-body systems in one dimension. MPS, MPO and DMRG are explained in this thesis, and some results of the DMRG algorithm are shown.

DMRG and the representation of a state by an MPS are limited by the entanglement within the system, which can be quantified by means of the entanglement entropy. A method to 'disentangle' the system could therefore yield an improvement of these techniques.

In this thesis, a proof of principle is given for a method to minimize the entanglement entropy. The idea for this is based on [14], but instead of a parameter-based minimization algorithm, an 'environment network' is used here. By showing results of the minimization method, the thesis demonstrates that this method works in principle. It is also shown that the method still suffers from some problems, in particular the growth of the dimensions of the MPO. A first attempt to overcome these problems is described, but requires further development.

ACKNOWLEDGMENTS

This thesis is the culmination of a year of hard work, which would not have been possible without the help and support of a number of people. Here, I want to express my gratitude to those people who, directly or indirectly, have helped me to complete my year of research and produce this thesis.

First and foremost, I would like to thank my supervisor, Philippe Corboz. Your guidance and readiness to answer any questions throughout the year was of great help to me. You are the person most responsible for changing me from a student into a physicist. I hope that we get to meet (and perhaps work together) again later in my scientific career.

Second, I would like to thank Kareljan Schoutens for interesting conversations about my work and my presentation. These helped me to see the place of my work in a bigger picture, and added to the completeness of my thesis.

I also want to thank Schelto Crone and Ido Niesen, the other students of tensor networks at the UvA. Talking to you and listening to your talks has helped me express my thoughts on tensor networks in a comprehensible way.

For providing valuable insights into their work via email, I would like to thank Örs Legeza and Christian Krumnow. You helped me understand and replicate your methods, which I would not have been able to do without your hints. I would also like to thank Christian together with Jens Eisert for interesting discussions.

Last but not least, this word of thanks would be incomplete without mentioning my fellow master room inhabitants—in particular Emiel, Eyzo and Robert—my “brothers in arms”, so to speak. Without you, this year would have been a vastly different experience. Those theses really did not write themselves!

CONTENTS

List of Figures
List of Algorithms
1 Introduction
2 Background
2.1 Tensor Networks
2.2 MPS and MPO
2.3 Entanglement entropy in Matrix Product States
3 Density Matrix Renormalization Group algorithms
3.1 Variational Approach
3.2 Implementation: single-site
3.3 Implementation: double-site
3.4 Results
4 Parameter-based entanglement entropy minimization
4.1 Procedure
4.2 Parameter-based unitary search
5 Environment network-based entanglement entropy minimization
5.1 Cost function as tensor network
5.2 Optimizing the environment
5.3 Method comparison
5.4 Results
6 Constraining MPO dimensions
6.1 Exponential MPO growth
6.2 Projectors
6.3 Interaction with DMRG
7 Conclusion
7.1 Summary
7.2 Conclusion
7.3 Outlook
A Pseudocodes
A.1 Single-site DMRG
A.2 Two-site DMRG
A.3 Unitary basis transformations
A.4 MPO compression
Bibliography

LIST OF FIGURES

Figure 1 Graphical representation of mathematical objects.
Figure 2 Graphical representation of mathematical operations.
Figure 3 Example of a tensor network.
Figure 4 Small tensor network.
Figure 5 Small tensor network with labeled bonds.
Figure 6 Different network contractions.
Figure 7 Optimal contraction procedure.
Figure 8 Graphical representation of an SVD.
Figure 9 State tensor.
Figure 10 Performing an SVD on a state tensor.
Figure 11 Decomposition of a state tensor into an MPS.
Figure 12 Calculation of the norm of a mixed-canonical MPS.
Figure 13 Moving nonunitarity in MPS.
Figure 14 Matrix Product Operator.
Figure 15 Applying an MPO to an MPS.
Figure 16 Expectation value.
Figure 17 Two-block system.
Figure 18 The DMRG optimization problem.
Figure 19 L- and R-expressions.
Figure 20 Two-site DMRG problem.
Figure 21 Energy per site in single-site DMRG.
Figure 22 Energy per site in double-site DMRG.
Figure 23 Energy per site vs. system size (single-site DMRG).
Figure 24 Energy per site vs. system size (double-site DMRG).
Figure 25 Convergence of energy per site (single-site DMRG).
Figure 26 Mean converged value of the energy per site (single-site DMRG).
Figure 27 Convergence of energy per site (double-site DMRG).
Figure 28 Mean converged value of the energy per site (two-site algorithm).
Figure 29 Minimizing the entanglement entropy.
Figure 30 Applying the unitary transformations to an MPO.
Figure 31 Global transformation.
Figure 32 Von Neumann entropy landscape.
Figure 33 Steps for the Nelder-Mead algorithm.
Figure 34 Powell optimization for different cost function landscapes.
Figure 35 Calculation of the vector-4-norm (equation 17) in tensor network form.
Figure 36 Vector-4-norm (more compact).
Figure 37 Network for environment-based optimization.
Figure 38 Von Neumann entropy after optimization versus system size for different optimization methods.
Figure 39 Vector-1-norm after optimization versus system size for different optimization methods.
Figure 40 Second Rényi entropy after optimization versus system size for different optimization methods.
Figure 41 Von Neumann entropy after optimization versus Dmax, for different optimization methods.
Figure 42 Vector-1-norm after optimization versus Dmax, for different optimization methods.
Figure 43 Second Rényi entropy after optimization versus Dmax, for different optimization methods.
Figure 44 Computational cost of optimization algorithms.
Figure 45 Double-site DMRG with unitary transformations: energy convergence.
Figure 46 Double-site DMRG with unitary transformations: energy convergence (zoomed).
Figure 47 Double-site DMRG with unitary transformations: energy convergence from delayed start.
Figure 48 Double-site DMRG with unitary transformations: energy convergence from converged start.
Figure 49 Double-site DMRG with unitary transformations: Von Neumann entropy.
Figure 50 Double-site DMRG with unitary transformations: Von Neumann entropy (zoomed).
Figure 51 Double-site DMRG with unitary transformations: Von Neumann entropy, delayed start.
Figure 52 Double-site DMRG with unitary transformations: singular values.
Figure 53 Double-site DMRG with unitary transformations: Dmax rampdown.
Figure 54 Double-site DMRG with unitary transformations: Dmax rampdown (zoomed).
Figure 55 Double-site DMRG with unitary transformations: Dmax rampup.
Figure 56 Double-site DMRG with unitary transformations: Dmax rampup (strict convergence criterion).
Figure 57 Increase in MPO dimensions.
Figure 58 Projectors.
Figure 59 Network used to generate the projectors.
Figure 60 Energy convergence as a function of sweeps, for the two-site DMRG algorithm including unitary transformations and projectors.
Figure 61 Converged energy per site as a function of system size, for the two-site DMRG algorithm including unitary transformations and projectors.
Figure 62 Energy convergence for the two-site DMRG algorithm including unitary transformations and projectors on an odd-length system.
Figure 63 Tensor network as used in a single step of the single-site DMRG algorithm.
Figure 64 Tensor network as used in a single step of the two-site DMRG algorithm.
Figure 65 Networks used for the computations of projectors.

LIST OF ALGORITHMS

Algorithm 1 Single-site DMRG calculations: main loop, Sweep and IsConverged helper functions.
Algorithm 2 Sweep function for two-site DMRG.
Algorithm 3 Parameter-based optimization of unitary transformations.
Algorithm 4 Environment-based optimization of unitary transformations.
Algorithm 5 Sweep function for two-site DMRG in combination with unitary transformations.
Algorithm 6 Algorithm to compress MPO dimensions using projectors.

1 INTRODUCTION

Condensed matter physics is an important part of physics as a whole. It is concerned with the study of matter and materials that are encountered on a daily basis. As these are built up from thermodynamically large numbers of quantum mechanical particles, the study of quantum many-body physics plays a prominent part within condensed matter physics.

In line with the aim of microscopic descriptions of macroscopic behaviour, this thesis is concerned with models for magnetism. These models are lattice-based spin models: magnetic moments at the quantum level, living on a lattice (which models the crystalline structure of solids). This thesis is restricted to the study of one-dimensional lattices (spin chains): while these exhibit some properties that make working with them far easier than higher-dimensional systems, they can still provide insights into nontrivial physics.

Although the search for exact solutions to many-body quantum systems is a very active field (see for instance [29]), these solutions tend to be hard to come by. Even if it is possible to write down an algebraic solution to such a model, this solution is often limited by approximations such as a temperature of absolute zero, or an infinite system size. This is where numerical methods come into play, providing a way to study these models that is not exact, but still fully theoretical so as not to suffer from experimental difficulties.

For several numerical methods like Monte Carlo simulation, the fermionic sign problem (see for instance [16]) poses a big difficulty. This problem arises due to the non-positiveness of the statistical weights used in such simulations. An example of a model that is hard to attack due to the sign problem is the two-dimensional Hubbard model [25, 34], which is somewhat of a 'holy grail' model for condensed matter physicists. Other numerical approaches suffer from their own problems, as is summed up nicely in section 3.1 of [21].

A numerical approach that does not suffer from the sign problem (or the other problems mentioned in [21]) is the Density Matrix Renormalization Group (DMRG). DMRG is a variational method first described by White in 1992 [35]. Although the method is one of the most effective for the study of one-dimensional systems (the focus of this thesis), it can, with some difficulty, be generalized to higher dimensions.

DMRG is, essentially, a tensor network method. Tensor networks provide a novel and often powerful way to study quantum many-body systems. Therefore, they have become increasingly popular over recent years. There exists a wide range of tensor network descriptions and models for different kinds of systems [2, 21, 32]. For this thesis, only specific tensor networks for one-dimensional systems are used.


These are Matrix Product States (MPS) and Matrix Product Operators (MPO).

Formulated in terms of MPS and MPO, DMRG provides an intuitively insightful and efficient way to partly and approximately diagonalize Hamiltonians [26–28]. The efficiency of the MPS representation of a state is largely limited by the interaction length of the system and the corresponding entanglement. Quantifying this entanglement by the entanglement entropy, a method to minimize the entropy could provide an improvement of the MPS representation.

DMRG can also be generalized to infinite-size [19] and two-dimensional [15, 36] systems. This latter generalization makes use of a mapping of the two-dimensional state onto a one-dimensional, 'snake-like' MPS. This approach suffers from long-range entanglement in one of the two system dimensions, which even in the presence of area laws (see section 2.3) causes the numerical representation of the state to scale exponentially in the size of the system in that dimension. Therefore, this method could also benefit greatly from a disentanglement procedure.

The main aim of this thesis is to describe a proof of principle of an entropy minimization method. To do so, the concepts of MPS, MPO and DMRG for one-dimensional spin lattices will be covered. Both the single- and double-site DMRG algorithms will be discussed. All of this will be done in terms of tensor networks; this is what makes the approach as intuitive as has been claimed above.

The core part of this thesis will treat methods for the minimization of entanglement entropy in spin chains. These methods use a patchwork of local basis transformations, creating a global transformation that 'transforms away' the entanglement. To find the local transformations, use is made of parameter-based as well as environment network-based optimization algorithms.

This entropy minimization method leads to an exponential increase in MPO dimensions. An idea for a way to prevent this increase in dimensions is to compress the MPO using projectors. These projectors are a resolution of the identity formed out of a part of the tensor network that the algorithms are working with anyway. The projector idea is described in the thesis, but leads to problems of its own which will have to be worked out in future research.

The thesis is outlined as follows: chapter 2 covers the basics of MPS and MPO, emphasizing the aspects of these that are relevant for the rest of the thesis. Chapter 3 will treat the DMRG algorithms. Chapter 4 will introduce the idea behind the entanglement entropy minimization and cover the parameter-based approach to this problem. Chapter 5 will continue on entropy minimization, detailing the environment network-based approach and providing evidence for the proper functioning of this approach. Chapter 6 will explain the method used to constrain MPO dimensions during the entanglement minimization. Finally, a summary, conclusion and possible directions for future research are given in chapter 7.

2 BACKGROUND

In this chapter, the theoretical foundations for this thesis will be provided. Section 2.1 will give an introduction to tensor networks, including an explanation of the diagrammatic formalism that is used to describe these networks. Section 2.2 introduces Matrix Product States (MPS) and Matrix Product Operators (MPO), which will be used in further chapters. Section 2.3 will explain how we can relate the notion of entanglement entropy to Matrix Product States.

2.1 TENSOR NETWORKS

To introduce tensor networks, we have to start by explaining the graphical notation in which we describe these networks. In figure 1, the graphical representation of a vector, matrix, 3-tensor and 4-tensor is given. The key idea of this representation is that every shape (such as a circle) represents a mathematical object (such as a matrix), while every 'leg' corresponds to an index.

Figure 1: Graphical notation of a vector, matrix, 3-tensor and 4-tensor. The legs coming out of the circles represent the indices of the object.

With the notation in place, the natural next step is to describe mathematical operations such as matrix-vector contraction graphically. The notation for these operations (which, apart from contraction, also includes the transposition and reshaping of tensors) is shown in figure 2. These images can be easily understood by again referring to the correspondence between legs and indices: contracting over an index corresponds to connecting the legs associated with that index, transposition corresponds to 'twisting' legs around, and reshaping the tensor (which is particularly important when doing numerics) corresponds to 'glueing' legs together or 'cutting' them apart.

Figure 2: Basic mathematical operations depicted in the graphical tensor network notation. (a) Matrix-vector contraction, corresponding to the expression $M_{ik}v_k = v'_i$; in this thesis, the convention that a connected leg is contracted (or 'summed over') will be maintained. (b) Reshaping a tensor, corresponding to the expression $T'_{i_1,(i_2,i_4),i_3} = T_{i_1,i_2,i_3,i_4}$. (c) Permuting the indices of a tensor, corresponding to the expression $T'_{i_1,i_2,i_3,i_4} = T_{i_1,i_4,i_3,i_2}$.
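As a concrete illustration of these three operations, here is a minimal numpy sketch (the array names and the choice of dimension are illustrative, not from the thesis):

```python
import numpy as np

d = 2                              # dimension of every index, for example

M = np.random.rand(d, d)           # matrix M_{ik}
v = np.random.rand(d)              # vector v_k
T = np.random.rand(d, d, d, d)     # 4-tensor T_{i1 i2 i3 i4}

# (a) Contraction: connecting legs sums over the shared index,
#     v'_i = sum_k M_{ik} v_k.
v_prime = np.einsum('ik,k->i', M, v)

# (c) Permutation ('twisting' legs): T'_{i1 i2 i3 i4} = T_{i1 i4 i3 i2}.
T_perm = np.transpose(T, (0, 3, 2, 1))

# (b) Reshaping ('glueing' legs): fuse (i2, i4) into one combined index,
#     leaving legs (i1, i3, (i2 i4)) as in figure 2b.
T_fused = np.transpose(T, (0, 2, 1, 3)).reshape(d, d, d * d)
```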

Figure 3: Example of a tensor network (appearing in context in figure 36). This network contracts to a scalar.

A 'tensor network' is now simply a (generally large) number of tensors, forming a network. An example of this is given in figure 3. Note that this network corresponds to a scalar, as there are no non-connected legs, thus no free indices.

2.1.1 Efficient contraction

Although it does not matter analytically, when working with tensor networks numerically, the efficiency of any algorithm using tensor networks depends strongly on the way in which tensors are contracted. Consider the tensor network shown in figure 4. If we label the top row of tensors $M^{(i)}$, the middle row $W^{(i)}$ and the bottom row $M^{(i)\dagger}$, this tensor network corresponds to the expression

$$\sum_{\sigma_i,\sigma_i'}\sum_{a_i,a_i'}\sum_{b_i}\Big(M^{(1)}_{\sigma_1,a_1}M^{(2)}_{\sigma_2,a_1,a_2}M^{(3)}_{\sigma_3,a_2,a_3}M^{(4)}_{\sigma_4,a_3}\Big)\times\Big(W^{(1)}_{\sigma_1,\sigma_1',b_1}W^{(2)}_{\sigma_2,\sigma_2',b_1,b_2}W^{(3)}_{\sigma_3,\sigma_3',b_2,b_3}W^{(4)}_{\sigma_4,\sigma_4',b_3}\Big)\times\Big(M^{(1)\dagger}_{\sigma_1',a_1'}M^{(2)\dagger}_{\sigma_2',a_1',a_2'}M^{(3)\dagger}_{\sigma_3',a_2',a_3'}M^{(4)\dagger}_{\sigma_4',a_3'}\Big)\qquad(1)$$

(showing the improvement of the pictographic representation of this expression over the algebraic representation). Here, the $\sigma_i$ and $\sigma_i'$ correspond to the vertical legs (upper and lower row, respectively), the $a_i$ and $a_i'$ correspond to the horizontal legs (uppermost and lowermost row, respectively) and the $b_i$ correspond to the horizontal legs in the middle row. Figure 5 shows which indices correspond to which legs for the first 'column' of tensors. Because every index is summed over, the network contracts to a scalar.

Figure 4: Small tensor network. Consider every bond to have dimension D. The top row of (blue, round) tensors are labelled $M^{(i)}$, the bottom row are their conjugates $M^{(i)\dagger}$. The middle row of (red, square) tensors are named $W^{(i)}$. This network corresponds to equation (1).

Figure 5: The same tensor network as in figure 4, but with the bonds labelled as described in the text. Compare this figure to equation (1) to see the correspondence between indices and bonds.

For the analysis of efficient contraction, the key point is that a contraction of two tensors in the network has a numerical complexity that is the product of the dimensions of all indices of those tensors. For instance, if we consider for simplicity every bond in this network to be of dimension D (meaning every index runs over D values), contracting $M^{(1)}$ and $M^{(2)}$ (figure 6a) has a complexity of $D^4$, while contracting $W^{(2)}$ with $W^{(3)}$ (figure 6b) has a complexity of $D^7$.

Figure 6: Different contractions of the network in figure 4, with different numerical complexities. All bonds are assumed to have dimension D. (a) Contracting figure 4 to this network has a complexity of $D^4$. (b) Contracting figure 4 to this network has a complexity of $D^7$.

Figure 7: Optimal contraction procedure for the network shown in figures 4 and 5. Some numbers appear twice, as some tensors are contracted with the rest of the network over two indices simultaneously.

Using this idea, the goal is to implement all algorithms in a manner that is optimal with respect to the efficiency of tensor contractions. For the network contraction treated here (which turns out to be the calculation of an expectation value, see figure 16), the optimal procedure is outlined in figure 7, resulting in a numerical complexity of $O(D^5)$.
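To see how the counting works in practice, consider a small sketch with made-up tensors of the shapes used above; counting the distinct indices involved in each pairwise contraction reproduces the $D^4$ and $D^7$ estimates:

```python
import numpy as np

D = 8  # take every bond, physical and auxiliary, to have dimension D

M1 = np.random.rand(D, D)          # M^(1)_{sigma1 a1}
M2 = np.random.rand(D, D, D)       # M^(2)_{sigma2 a1 a2}
W2 = np.random.rand(D, D, D, D)    # W^(2)_{sigma2 sigma2' b1 b2}
W3 = np.random.rand(D, D, D, D)    # W^(3)_{sigma3 sigma3' b2 b3}

# Contracting M1 with M2 over a1 involves 4 distinct indices: cost O(D^4).
M12 = np.einsum('sa,tab->stb', M1, M2)

# Contracting W2 with W3 over b2 involves 7 distinct indices: cost O(D^7).
W23 = np.einsum('stab,uvbc->stauvc', W2, W3)
```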

2.2 MPS AND MPO

The forms of tensor network that will be used most in this thesis are the Matrix Product State (MPS) and the Matrix Product Operator (MPO). This section will describe these networks. To do so, some understanding of Singular Value Decompositions (SVDs) is needed.

2.2.1 Singular Value Decompositions

An SVD is used to decompose any (complex-valued) matrix into a product of three matrices, of which two are unitary and the third is a diagonal matrix:

$$M = USV^\dagger, \quad\text{where}\quad UU^\dagger = 1,\; VV^\dagger = 1,\; S = \mathrm{diag}(s_1,\ldots,s_n). \qquad(2)$$

The $s_i$ in this expression are called the singular values of the matrix M. (The notation $S = \mathrm{diag}(s_1,\ldots,s_n)$ implies that S is a square diagonal matrix, where the elements on the diagonal are the $s_i$.)

Any SVD mentioned in this thesis will have the following properties:

• The singular values are ordered, real and nonnegative: $\mathbb{R} \ni s_1 \geq s_2 \geq \ldots \geq s_n \geq 0$.

• The matrix M can have any dimensions m × n. The dimensions of U, S and V† that follow from the SVD are m × k, k × k and k × n respectively, where k = min(m, n).

A graphical representation of an SVD is given in figure 8.

Figure 8: Graphical representation of an SVD. (a) An SVD with dim M = m × n, m > n. (b) An SVD with dim M = m × n, n > m.
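A minimal numpy illustration of these properties (not from the thesis); note that for the 'thin' SVD computed here, $U^\dagger U = 1$ always holds, while $UU^\dagger = 1$ holds only if $m \le n$:

```python
import numpy as np

m, n = 6, 4
M = np.random.rand(m, n) + 1j * np.random.rand(m, n)

U, s, Vh = np.linalg.svd(M, full_matrices=False)  # thin SVD, k = min(m, n)

k = min(m, n)
assert np.allclose(U.conj().T @ U, np.eye(k))        # isometry: U†U = 1
assert np.all(s >= 0) and np.all(np.diff(s) <= 0)    # ordered, nonnegative
assert np.allclose(U @ np.diag(s) @ Vh, M)           # M = U S V†
```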

2.2.2 Matrix Product States

An MPS is a tensor network representation of a state. For this thesis, an understanding of MPS in one (spatial) dimension is sufficient, although the concept can be generalized to systems in higher dimensions.

2.2.2.1 Generating an MPS

Figure 9: State tensor for a one-dimensional system of 10 sites. Every 'leg' corresponds to an index running over the local state space.

Consider a state for an N-site spin chain. This state is represented by an N-tensor, a 10-site example of which is shown in figure 9. The legs are the 'local' indices of the state: they each run over the basis states of the local system on the corresponding site. That is, for a spin-1/2 system, each spin is represented by two complex numbers: $|\vec{s}\,\rangle \in \mathbb{C}^2$.

In this thesis, the convention will always be used that a diagram with legs pointing down corresponds to a state $|\psi\rangle$, while a diagram with legs pointing up corresponds to a costate $\langle\psi|$.

The state tensor from figure 9 can be decomposed into a tensor network containing as many tensors as there are spins in the system. This is done by splitting off parts of the tensor through an SVD. A single split-off is shown in figure 10. The procedure of the split-off is as follows. First, reshape the tensor, grouping all but one leg together. Second, perform an SVD. Third, reshape the matrix V† back into an (N − 1)-tensor. Finally, contract the matrix S with the (N − 1)-tensor, leaving a unitary tensor split off of the rest of the state. (Reshaping the tensor is necessary to be able to do the SVD; this can only be done with matrices.)

Figure 10: Performing an SVD on a state tensor. The leftmost tensor is now unitary. (a) State vector with one SVD performed; note that the entire right part of this network is the (reshaped) matrix V†. (b) Reabsorbing the matrix S back into the state vector by contracting T = SV†.

To get a full MPS from the state, the procedure described above is repeated as often as necessary. The first tensor is split off and ignored for the rest of the calculation. Then the procedure is repeated on the remaining tensor, as shown in figure 11. In the final result (the complete MPS), if all tensors except one are unitary, the MPS is said to be in canonical form. Canonical forms will be described in some more detail in section 2.2.2.3.

Figure 11: Decomposition of a state tensor into an MPS. (a) State vector with one tensor split off (as in figure 10b). (b) State vector with two tensors split off; both tensors on the left are unitary. (c) State fully decomposed into an MPS; all tensors except for the rightmost are unitary.
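The repeated split-off can be sketched in a few lines of numpy; this is an illustrative helper (hypothetical name, no truncation yet) that assumes the state is supplied as a full vector for a spin-1/2 chain:

```python
import numpy as np

def state_to_mps(psi, N, d=2):
    """Decompose a full state vector of length d**N into a left-canonical
    MPS: a list of N tensors of shape (D_left, d, D_right)."""
    mps = []
    Dl = 1                            # left bond dimension of current site
    T = psi.reshape(Dl, -1)
    for _ in range(N - 1):
        T = T.reshape(Dl * d, -1)     # group all remaining legs together
        U, s, Vh = np.linalg.svd(T, full_matrices=False)
        k = s.size
        mps.append(U.reshape(Dl, d, k))   # unitary tensor split off
        T = np.diag(s) @ Vh               # reabsorb S into the remainder
        Dl = k
    mps.append(T.reshape(Dl, d, 1))       # last (nonunitary) tensor
    return mps

# Usage: a random 6-site spin-1/2 state.
N = 6
psi = np.random.rand(2**N)
mps = state_to_mps(psi, N)
```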

2.2.2.2 MPS dimensions

Because the legs of the state in figure 9 run over the local state space, their dimension (the number of values that the corresponding index can take) is called the physical dimension d. As has been mentioned, for a spin-1/2 system, every local spin is an element of $\mathbb{C}^2$. Thus, each leg has dimension 2. For more general systems, this dimension is equal to the dimension of the local Hilbert space:

$$d = \dim \mathcal{H}_{\mathrm{loc}}. \qquad(3)$$

The MPS representation of the state (figure 11c) not only has outgoing legs, but contains internal, auxiliary legs as well. These legs carry all information about the correlation between the spins. Their dimension is called the bond dimension D. As can be seen from the properties of SVDs (see section 2.2.1), the bond dimension increases from the sides of the MPS to the center bond. For a system with physical dimension d, the bond dimensions are (from left to right):

$$d, d^2, \ldots, d^{L/2}, \ldots, d^4, d^2. \qquad(4)$$

The exponential scaling in the center poses a problem when using numerical methods: large systems would be exponentially difficult to represent due to memory and processor constraints.

Luckily, it is possible to represent a state approximately by constricting the bond dimension to some value Dmax. To this end, when generating the MPS, the matrices being generated by the SVD are truncated to have at most (auxiliary) dimensions Dmax. That this truncation provides a good approximation to the actual state is not immediately clear: it depends critically on the spectrum of the singular values.
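In code, this truncation is a one-line modification of the SVD step; a minimal sketch, with a hypothetical helper name:

```python
import numpy as np

def truncated_svd(M, Dmax):
    """SVD of a matrix, keeping at most the Dmax largest singular values."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    k = min(Dmax, s.size)
    return U[:, :k], s[:k], Vh[:k, :]
```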

In section 2.2.1, we have seen that the singular values are put in decreasing order. Thus, the truncation mentioned above is good if this spectrum decays 'fast enough'. Furthermore, in [26] it is shown that the singular values in a gapped one-dimensional system generally decay exponentially; in [31] it is shown that the representation of ground states of (gapped) local Hamiltonians by MPS is good. Finally, for finite systems under a gapless Hamiltonian, the representation is still good: the entropy here scales as log(L) [4]. In that case, the required bond dimension grows only polynomially with system size. As a result, the MPS, when chosen well, should represent the state very well. Note that all sources mentioned here consider mainly ground states of local Hamiltonians (explicitly or implicitly). As this thesis is concerned only with precisely these ground states, this poses no problem.

A potential problem might be that when system size increases, Dmax needs to increase as well to retain a faithful representation of the state. However, [5] shows that the entanglement entropy of one-dimensional lattice systems with a gapped, local Hamiltonian such as treated in this thesis obeys an area law, which essentially indicates that the maximal value of Dmax needed for a good description of the state tends to a constant as the system size increases. A more in-depth treatment of entanglement entropy and its connection to MPS is given in section 2.3.

2.2.2.3 Unitarity and canonical forms

Canonical forms are MPS where all tensors except one are unitary. Three types of canonical form are recognized:

1. Left-canonical MPS, in which the rightmost tensor is nonunitary.
2. Right-canonical MPS, in which the leftmost tensor is nonunitary.
3. Mixed-canonical MPS, in which the nonunitary tensor is somewhere in the middle of the MPS.

Figure 12: Calculation of the norm of a mixed-canonical MPS. (a) Norm $\langle\psi|\psi\rangle$ of a full mixed-canonical MPS: contraction of the MPS with itself; the green tensor (third from the left) is nonunitary, all other tensors are unitary. (b) All unitary tensors are contracted together, yielding identities and leaving only the nonunitary tensor. (c) The three legs are grouped together into a single leg.

Having an MPS in a canonical form has the advantage of radically simplifying some computations. In figure 12, the norm of an MPS in mixed-canonical form is computed. The canonical form here simplifies this to the norm of a single tensor: contracting the unitary tensors with their conjugate tensors results in identities. This simplifies some calculations in, for instance, the algorithms treated in sections 3.2 and 3.3.

For the implementation of the algorithms treated later in this thesis, it is important to be able to move the location of the nonunitary tensor in a canonical-form MPS. This is done by contracting tensors and performing an SVD (see section 2.2.1). Moving the 'nonunitarity' is achieved by storing either the U or the V† from the SVD as unitary tensor, and storing the product of the remaining two matrices (including the matrix S) as nonunitary tensor. The procedure for moving the nonunitary tensor through an MPS in canonical form is shown in figure 13.

Figure 13: Moving the nonunitary tensor from left to right through an MPS in mixed-canonical form. (a) MPS in mixed-canonical form; the nonunitary (green) tensor is the third from the left. (b) The nonunitary tensor is contracted with the tensor on its right. (c) An SVD is performed; the tensor that was nonunitary is now replaced by the (unitary) matrix U, and the tensor to its right is replaced by the (nonunitary) product SV†. Moving from right to left is achieved in a similar manner, with the difference that V† is stored as unitary tensor and the product US is stored as nonunitary tensor.
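A sketch of one such move to the right, following figure 13 and assuming MPS tensors stored with index order (left bond, physical, right bond); moving left works analogously, keeping V† as the unitary tensor:

```python
import numpy as np

def move_center_right(mps, i):
    """Make mps[i] unitary and push the nonunitarity into mps[i+1]."""
    Dl, d, _ = mps[i].shape
    _, d2, Dr = mps[i + 1].shape
    # Contract the nonunitary tensor with the tensor on its right ...
    theta = np.einsum('apb,bqc->apqc', mps[i], mps[i + 1])
    # ... and split the pair again with an SVD.
    U, s, Vh = np.linalg.svd(theta.reshape(Dl * d, d2 * Dr),
                             full_matrices=False)
    k = s.size
    mps[i] = U.reshape(Dl, d, k)                        # unitary U
    mps[i + 1] = (np.diag(s) @ Vh).reshape(k, d2, Dr)   # nonunitary S V†
```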

2.2.3 Matrix Product Operators

Matrix Product Operators (MPO) are the operator counterpart to MPS. As contracting an operator with a state should yield another state (consider $\hat{O}|\psi\rangle = |\psi'\rangle$, where any eigenvalues are assumed to be contained in $|\psi'\rangle$), an MPO has both 'incoming' as well as 'outgoing' legs. Figure 14 shows an example of a 10-site MPO. Although an MPO could be constructed in a similar fashion as the MPS generation described in section 2.2.2.1, by decomposing the full (matrix-form) operator, for the MPOs in this thesis an easier and more elegant construction is possible.


Figure 14: An MPO representation for an operator on a one-dimensional system of 10 sites.

2.2.3.1 Generating an MPO

A general description of how to construct a general MPO for a spin-chain Hamiltonian is given in [8, 18]; some examples are found in section 6.1 of [28]. For a different, more 'mathematical' approach, one could look at [22]. The explanation here builds strongly on those sources.

To build the MPO, we need a localized description of the operators in the Hamiltonian. For a one-dimensional chain, this description takes the form of a 4-tensor (all but the left- or right-most tensors in figure 14). These 4-tensors can be written as:

$$W_i^{(1)} = \begin{pmatrix} 1 & 0 \\ X_i & 1 \end{pmatrix} \quad\text{for one-body terms } X_i, \qquad(5a)$$

$$W_i^{(2)} = \begin{pmatrix} 1 & 0 & 0 \\ Y_i & 0 & 0 \\ 0 & X_i & 1 \end{pmatrix} \quad\text{for two-body terms } X_iY_{i+1}, \qquad(5b)$$

and so on. The $W_i$ in equations (5) are the MPO elements for site i. (As an example of the operators X and Y, consider for instance the Pauli matrices $\sigma^x$, $\sigma^y$ and $\sigma^z$. Not entirely coincidentally, these are the operators that will be used the most in this thesis.) All matrix elements in equations (5) have local dimensions (i.e. they have dimension 2 × 2 for a spin-1/2 system). A Hamiltonian containing several terms, of which some are one-body and some are two-body (or higher), can be constructed by combining terms:

$$W_i^{(1,2)} = \begin{pmatrix} 1 & 0 & 0 \\ Z_i & 0 & 0 \\ X_i & Y_i & 1 \end{pmatrix}. \qquad(6)$$

The first and last elements of the MPO look slightly different: as can be seen in figure 14, they have a different number of indices. The leftmost MPO element is the bottom row of the inner tensors, the rightmost element is the left column:

$$W_1 = \begin{pmatrix} X_1 & Y_1 & 1 \end{pmatrix}, \qquad W_L = \begin{pmatrix} 1 \\ Z_L \\ X_L \end{pmatrix}. \qquad(7)$$

The complete Hamiltonian can be reconstructed by multiplying these tensors. As an example, the XXZ antiferromagnetic Heisenberg Hamiltonian for a 3-site, spin-1/2 system is constructed as follows (the spin operators $S^i$ are related to the Pauli spin matrices as $S^i = \frac{\hbar}{2}\sigma^i$, with $i \in \{x, y, z\}$, and $S^\pm = S^x \pm iS^y$):

$$
\begin{aligned}
H &= W_1 \otimes W_2 \otimes W_3 \\
&= \begin{pmatrix} -hS^z_1 & \tfrac{J}{2}S^-_1 & \tfrac{J}{2}S^+_1 & J^zS^z_1 & \mathbb{1}_1 \end{pmatrix}
\otimes
\begin{pmatrix}
\mathbb{1}_2 & 0 & 0 & 0 & 0 \\
S^+_2 & 0 & 0 & 0 & 0 \\
S^-_2 & 0 & 0 & 0 & 0 \\
S^z_2 & 0 & 0 & 0 & 0 \\
-hS^z_2 & \tfrac{J}{2}S^-_2 & \tfrac{J}{2}S^+_2 & J^zS^z_2 & \mathbb{1}_2
\end{pmatrix}
\otimes
\begin{pmatrix} \mathbb{1}_3 \\ S^+_3 \\ S^-_3 \\ S^z_3 \\ -hS^z_3 \end{pmatrix} \\
&= \frac{J}{2}\left(S^+_1S^-_2 + S^+_2S^-_3 + S^-_1S^+_2 + S^-_2S^+_3\right) + J^z\left(S^z_1S^z_2 + S^z_2S^z_3\right) - h\left(S^z_1 + S^z_2 + S^z_3\right) \\
&= \frac{J}{2}\sum_{\langle ij\rangle}\left(S^+_iS^-_j + S^-_iS^+_j\right) + J^z\sum_{\langle ij\rangle}S^z_iS^z_j - h\sum_i S^z_i, \qquad(8)
\end{aligned}
$$

where J is the interaction strength in the x- and y-directions, $J^z$ is the interaction strength in the z-direction, h is an external field (in the z-direction) and the notation $\langle ij\rangle$ is taken to mean nearest-neighbour terms.

The dimensions of an MPO are generally well-manageable. The vertical legs in figure 14 have the same physical dimension d as the MPS (which is clear when considering the application of an MPO to an MPS, as in figure 15). The horizontal indices have a dimension that is constrained by the number of terms and the interaction length in the Hamiltonian. For example, using a Hamiltonian containing terms as in equation (6), the horizontal indices have a dimension of 3, as can be seen from the size of the $W_i$.
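The same construction is easy to write down in code. The sketch below builds the XXZ MPO of equation (8) for general L, with the couplings placed in the bottom row as in the bulk tensor above; the index order (left bond, right bond, physical, physical) and the function name are choices made here, not fixed by the thesis:

```python
import numpy as np

# Spin-1/2 operators (in units where hbar = 1).
Sz = 0.5 * np.array([[1.0, 0.0], [0.0, -1.0]])
Sp = np.array([[0.0, 1.0], [0.0, 0.0]])   # S+
Sm = np.array([[0.0, 0.0], [1.0, 0.0]])   # S-
I2 = np.eye(2)

def xxz_mpo(L, J=1.0, Jz=1.0, h=0.0):
    """XXZ Heisenberg MPO as a list of L tensors, each with index order
    (bond_left, bond_right, d, d)."""
    W = np.zeros((5, 5, 2, 2))
    W[0, 0] = I2                     # left column: 'right' operators
    W[1, 0] = Sp
    W[2, 0] = Sm
    W[3, 0] = Sz
    W[4, 0] = -h * Sz
    W[4, 1] = (J / 2) * Sm           # bottom row: 'left' operators
    W[4, 2] = (J / 2) * Sp
    W[4, 3] = Jz * Sz
    W[4, 4] = I2
    W_first = W[4:5, :]              # bottom row, shape (1, 5, 2, 2)
    W_last = W[:, 0:1]               # left column, shape (5, 1, 2, 2)
    return [W_first] + [W] * (L - 2) + [W_last]
```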

The contraction of an MPS with an MPO can now be done by connecting the tensor networks over their free legs. An operation $\hat{O}|\psi\rangle$ is shown in figure 15, an expectation value $\langle\psi|\hat{O}|\psi\rangle$ is shown in figure 16.

Figure 15: Applying an MPO to an MPS ($\hat{O}|\psi\rangle$). (a) MPS-MPO product before contraction. (b) MPS-MPO product after contraction; generally, only a single horizontal bond would be shown.

Figure 16: Expectation value ($\langle\psi|\hat{O}|\psi\rangle$).

2.3 ENTANGLEMENT ENTROPY IN MATRIX PRODUCT STATES

In general, because spin chain Hamiltonians contain interaction terms, (ground) states corresponding to these Hamiltonians contain some entanglement. One measure of the 'amount' of entanglement present in a state is entanglement entropy.

For this thesis, the important aspect of entanglement entropy is its connection to MPS. To see this connection, first consider a system in a two-block state, as shown in figure 17. The state of this system can be expressed in terms of the states of the subsystems:

$$|\psi\rangle = \sum_i \lambda_i |\psi_A^i\rangle \otimes |\psi_B^i\rangle. \qquad(9)$$

(A completely unentangled state can be written as $|\psi\rangle = |\psi_A\rangle \otimes |\psi_B\rangle$, without the summation; this is also called a product state.)

This is called the Schmidt decomposition of the state; the $\lambda_i$ are the Schmidt coefficients. It is possible to express the Von Neumann entanglement entropy in terms of these coefficients:

$$S = -\sum_i \lambda_i^2 \log \lambda_i^2, \qquad(10)$$

where the log is understood to be base-2. (The reduced density matrix is found by taking a trace over a part of the full density matrix: $\rho_A = \mathrm{Tr}_B\,\rho$. This result can be derived in more detail by calculating the Von Neumann entropy as a trace over the reduced density matrix, $S = -\mathrm{Tr}\,\rho_A \log \rho_A$, writing this reduced density matrix as $\rho_A = \sum_i \lambda_i^2\,|\psi_A^i\rangle\langle\psi_A^i|$.)

To relate this expression for the entanglement entropy to an MPS, an identification is made between the Schmidt coefficients and the singular values corresponding to an SVD between substates A and B. The (Von Neumann) entanglement entropy can thus be directly written in terms of the singular values:

$$S = -\sum_i s_i^2 \log s_i^2. \qquad(11)$$
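As a small illustration (not from the thesis), equation (11) can be evaluated directly from the singular values across a bond; this assumes the state is normalized so that $\sum_i s_i^2 = 1$ and uses the base-2 logarithm stated above:

```python
import numpy as np

def von_neumann_entropy(s, eps=1e-12):
    """Equation (11): S = -sum_i s_i^2 log2 s_i^2."""
    p = np.asarray(s) ** 2
    p = p[p > eps]              # drop (numerically) vanishing weights
    return -np.sum(p * np.log2(p))
```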

Different measures of entanglement entropy can also be calculated from the singular values, by making the same connection to the reduced density matrix. For instance, the Rényi entropy (of any order α > 0) is defined as

$$S_\alpha = \frac{1}{1-\alpha}\log \mathrm{Tr}\,\rho_A^\alpha = \frac{1}{1-\alpha}\log \sum_i s_i^{2\alpha}. \qquad(12)$$

Figure 17: System divided into two subsystems A and B.
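Equation (12) translates just as directly; again a sketch, using the same base-2 convention:

```python
import numpy as np

def renyi_entropy(s, alpha):
    """Equation (12): S_alpha = log2(sum_i s_i^(2 alpha)) / (1 - alpha),
    for alpha > 0, alpha != 1."""
    s = np.asarray(s)
    return np.log2(np.sum(s ** (2 * alpha))) / (1 - alpha)
```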

3 DENSITY MATRIX RENORMALIZATION GROUP ALGORITHMS

The Density Matrix Renormalization Group (DMRG) is a numerical approach to solving many-body lattice problems. First described by White [35], its combination with the MPS formalism has created a powerful computational tool for such systems. (An overview of the development of DMRG and MPS and their combination is given in the introduction to [28].) The goal of this chapter is to outline how DMRG is used to compute ground states and ground state energies.

3.1 VARIATIONAL APPROACH

DMRG is, at its heart, a variational optimization scheme. The goal herein is to minimize the energy of a state $|\psi\rangle$ with respect to some Hamiltonian H. That is, the goal is to find $|\psi\rangle$ such that the expression

$$E = \frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} \qquad(13)$$

is minimized. This corresponds to extremizing the expression $\langle\psi|H|\psi\rangle - \lambda\langle\psi|\psi\rangle$, where λ is a Lagrange multiplier corresponding to the energy. The $|\psi\rangle$ resulting from this problem will be the ground state; the value of λ will be the ground state energy.

Figure 18: The DMRG optimization problem expressed in terms of tensor networks. The circled tensor is the tensor being optimized. (a) The complete optimization problem. (b) The reduced optimization problem, making use of the unitary properties of the MPS as described in section 2.2.2.3.

The key to DMRG is to not perform this extremization for the entire system at once, but to localize it. In terms of the MPS diagrams, this means optimizing a single MPS tensor at a time. Figure 18 shows the tensor networks corresponding to this problem. The tensor which is circled in red is the tensor that locally solves the optimization problem. The missing tensor is due to the nature of the optimization: extremizing $\langle\psi|H|\psi\rangle - \lambda\langle\psi|\psi\rangle$ with respect to the state corresponds to solving

$$\frac{d}{d|\psi\rangle}\Big(\langle\psi|H|\psi\rangle - \lambda\langle\psi|\psi\rangle\Big) = 0. \qquad(14)$$

Changing this equation from a differential with respect to a full state to a differential with respect to a single local tensor results in the differential operator 'taking out' a single tensor from the tensor network.

To go from figure 18a to figure 18b, the unitary properties of the MPS, introduced in section 2.2.2.3, are used. This necessitates the MPS being in mixed-canonical form, with its only nonunitary tensor at the site that is currently being optimized.

The final optimization problem shown in figure 18b corresponds to an (ordinary) eigenvalue problem. (Similarly, the diagram in figure 18a corresponds to a generalized eigenvalue problem of the form $Mv - \lambda Nv = 0$, where both M and N are matrices.) Reshaping the three open legs and the three legs connecting to the circled tensor into a single leg each, the problem takes the form $Mv - \lambda v = 0$, where the 'matrix' M is built up from the MPO and all MPS tensors that are not currently optimized, λ is the eigenvalue, and v corresponds to the tensor under optimization.

The dimensions of this eigenvalue problem are what makes DMRG an efficient numerical method. Diagonalizing the Hamiltonian exactly would result in an eigenvalue problem of dimensions $d^L \times d^L$, where L is the system size. In each local optimization in this scheme, the matrix to be diagonalized is of dimensions $dD^2 \times dD^2$, as can be seen by looking at the dimensions of the legs in figure 18. This is a clear improvement: as explained in section 2.2.2.2, D tends to a constant for large L. Local optimization is not full diagonalization, however; as the following section will explain, the local procedure needs to be repeated a potentially large number of times. This will eventually result in having to diagonalize multiple eigenvalue problems, but the exponential scaling of the problem with L is gone for good.

3.2 IMPLEMENTATION: SINGLE-SITE

As mentioned in the previous section, local optimization is not immediately equivalent to finding the global ground state. In order to achieve the latter result, an iterative procedure consisting of several 'sweeps' is used. A sweep starts at the left- or rightmost site of the system, and consists of the following steps:

1. Make sure the MPS is in canonical form, with the nonunitary tensor at the site under optimization.
2. Reshape the MPO and MPS into a matrix M.
3. Reshape the MPS tensor under optimization into a vector v.
4. Solve the eigenvalue problem Mv − λv = 0.
5. Reshape v back into an MPS tensor and store it in the MPS.
6. Store λ as the current best guess for the ground state energy.
7. Rework the MPS into a canonical form with the nonunitary tensor at the next site.

To turn the local optimization into a global solution, these steps are repeated back and forth through the system. Every step changes the matrix M with respect to which the next tensor is optimized. This process continues until some convergence criterion is satisfied. The convergence criterion that is used in this thesis is the maximum difference in the last 10 values of the energy:

$$\Delta E = \max(\text{Energies}[-10:]) - \min(\text{Energies}[-10:]), \qquad(15)$$

which is compared to some small number, e.g. $\epsilon = 10^{-5}$. If $\Delta E < \epsilon$, the algorithm is converged. (The notation Energies[−10:] indicates a slicing of the list of energy values, taking only the last 10 values into account.) A slightly more detailed pseudocode of this algorithm is given in algorithm 1 in appendix A.
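This criterion is a one-liner in code; a sketch with hypothetical names, matching equation (15):

```python
def is_converged(energies, eps=1e-5, window=10):
    """Spread of the last `window` energy values below eps?"""
    if len(energies) < window:
        return False
    tail = energies[-window:]
    return max(tail) - min(tail) < eps
```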

To speed up the DMRG computations, the algorithm does not actually contract the full network shown in figure 18b, but uses so-called L- and R-expressions. These expressions contain essentially all MPS and MPO tensors to either side of the tensor under optimization, as depicted in figure 19. By contracting the MPS and MPO tensors on the 'active' site with the L/R that is to their left (when moving L-to-R) or to their right (when moving R-to-L), the algorithm ensures that it never has to contract the entire network at once.

Figure 19: The L- and R-expressions contain all tensors except for those on the active site. The big tensor on the left of the right-hand side is named L, the big tensor on the right is named R.

The procedure outlined above is called the single-site DMRG algorithm, because optimization is performed on a single site at a time. A problem with this method is that it can get stuck in local minima. This problem can be mitigated by optimizing multiple sites at once, which avoids local minima but increases the complexity of the local eigenvalue problem. A good balance between these two points is found by optimizing two sites at a time. The procedure for this is outlined in the following section.

3.3 IMPLEMENTATION: DOUBLE-SITE

The DMRG algorithm for two sites is very similar to that for a single site. Figure 20 shows the first term in the two-site eigenvalue problem (corresponding to the left network in figure 18). The matrix input for the eigenvalue problem now has dimensions $d^2D^2 \times d^2D^2$; thus, it consists of 4 times as many numbers when working with a spin-1/2 system.

To achieve the two-site algorithm, before reshaping the MPS and MPO elements into the matrix M and vector v, the MPS tensors at the current site and the site next to it are contracted, creating the 4-tensor shown in figure 20. After solving the eigenvalue problem, the resulting vector v is reshaped back into the two-site tensor. An SVD is performed on this tensor to divide it back into two tensors (similar to what was done in section 2.2.2.1), making sure the nonunitary part of the SVD output is one site further than the initial starting site.

Figure 20: The first term in the eigenvalue problem for the two-site DMRG algorithm. The circled tensor is again the tensor under optimization.
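The closing SVD step can be sketched as follows (hypothetical helper; tensor shapes assumed as (left bond, physical, right bond), sweeping left to right, with truncation to Dmax as in section 2.2.2.2):

```python
import numpy as np

def split_two_site(theta, Dl, d, Dr, Dmax):
    """Split the optimized two-site tensor theta (shape (Dl, d, d, Dr))
    into a unitary left tensor and a nonunitary right tensor."""
    U, s, Vh = np.linalg.svd(theta.reshape(Dl * d, d * Dr),
                             full_matrices=False)
    k = min(Dmax, s.size)                                # truncate the bond
    left = U[:, :k].reshape(Dl, d, k)                    # unitary
    right = (np.diag(s[:k]) @ Vh[:k]).reshape(k, d, Dr)  # carries S
    return left, right
```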

3.4 RESULTS

In this section, results from both versions (single-site and double-site) of the DMRG algorithm will be presented. (All figures from this section can be found at the end of the section, after the text.) The system under consideration here, and in the rest of the thesis, is the spin-1/2 antiferromagnetic XXZ Heisenberg model from equation (8).

Figure 21 shows the convergence of the energy per site through a single run of the single-site DMRG algorithm. For this computation, a 20-site system with Dmax = 10 was used. Figure 21a shows the convergence from the start of the algorithm. Note that for the first four sweeps, the algorithm seems to be stuck in a non-global minimum. Figure 21b has been zoomed in on the final part of the convergence to show the convergence in slightly more detail. Both these plots are compared to the exact ground-state solution for the infinite-size system, which is $\frac{1}{4} - \ln 2$ ([10], as mentioned in [30]). The fact that this result is not reached by the algorithm is not surprising, as 20 sites is still far from the thermodynamic limit.

Figure 22 is similar to figure 21, but was created using the double-site DMRG algorithm. Again, the convergence of the energy with sweeps is shown both through the full algorithm and zoomed in on the final part of the convergence. Note that the double-site algorithm gets out of the local minimum faster than the single-site algorithm, and needs fewer sweeps to reach final convergence.

Figure 23 shows the converged value of the energy per site from the DMRG algorithm as a function of system size. Results from both the single- and double-site algorithms are shown, with Dmax = 15 for all computations. Plotted as well is the exact solution for the infinite-size system. Note that both algorithms show exactly the same behaviour. Also, as system size increases, the DMRG result gets closer to the thermodynamic limit.

Figure 24 shows the converged value of the energy per site of a 10-site system from the DMRG algorithm as a function of Dmax. Results from both the single- and double-site algorithms are shown. Plotted as well is the exact (finite-size) solution for this system. Note that the plots again look very similar, and that both algorithms eventually go to the exact value for Dmax > 16, which is the value of Dmax at which only a single bond in the MPS is truncated.

Figures 25 through 28 show the sensitivity of both algorithms to local minima in more detail. Figure 25 shows how the energy per site converges as a function of the number of sweeps in the single-site DMRG algorithm. The Hamiltonian used here is the Heisenberg Hamiltonian (the final line in equation (8)) with no external field. Convergence is plotted for different values of the system size L, at a maximum bond dimension Dmax = 10.

The relationship between the DMRG algorithm and the maximal bond dimension Dmax, discussed in section 2.2.2.2, is important for the functioning of the algorithm. Figure 26 shows how DMRG convergence is dependent on Dmax. As can be observed, a Dmax that is too low will result in the algorithm getting stuck in a local minimum, as too much information is lost in truncating the MPS. Increasing Dmax leads to the algorithm converging on some runs, but not all (as can be seen from the error bars; the fact that the plot is decreasing overall shows that the algorithm converges more often for larger Dmax). Finally, when Dmax gets large enough, the algorithm converges consistently.

Figures 25 through 28 also show the improvements of the two-site algorithm as opposed to the single-site version. Both the single-site algorithm getting stuck in local minima and the smaller number of sweeps needed for the algorithms to converge are visible. Comparing figure 27 to figure 25 shows the two-site algorithm now converging for all system sizes from L = 8 to L = 25, and needing fewer sweeps to do so. Comparing figure 28 to figure 26 shows that the Dmax needed to consistently achieve convergence to the correct energy is lower for the two-site algorithm than it is for the one-site algorithm. As mentioned, these improvements come at the cost of having to solve a larger eigenvalue problem at each step, increasing the computational time of each sweep.

These results can be summarized as follows. The Density Matrix Renormalization Group algorithms perform an accurate computation of ground states. The results from these algorithms approach the infinite-size exact solution when increasing system size. Both algorithms approach and ultimately reach the finite-size exact solution when increasing Dmax. The double-site algorithm performs slightly better than the single-site algorithm, in particular with respect to avoiding local minima.

Figure 21: Energy (per site) convergence of a 20-site system (Dmax = 10) through the single-site DMRG algorithm. (a) Global convergence. (b) Zoomed in on (a). The dashed line in both plots represents the exact solution for the infinite-size system, $\frac{1}{4} - \ln 2$.

Figure 22: Energy (per site) convergence of a 20-site system (Dmax = 10) through the double-site DMRG algorithm. (a) Global convergence. (b) Zoomed in on (a). The dashed line in both plots represents the exact solution for the infinite-size system, $\frac{1}{4} - \ln 2$.

Figure 23: The converged value of the energy per site as a function of system size, for both DMRG algorithms. (a) Single-site DMRG result. (b) Double-site DMRG result. For both plots, Dmax = 15. The dashed line is the exact solution for an infinite-size system, $\frac{1}{4} - \ln 2$.

Figure 24: The converged value of the energy per site as a function of Dmax for both DMRG algorithms. (a) Single-site DMRG result. (b) Double-site DMRG result. Both plots are made with a 10-site system. The dashed line is the exact result for this finite system.

Figure 25: Convergence of energy per site for different system sizes (L = 8 to L = 25) at Dmax = 10, using the single-site algorithm. Note that sometimes the algorithm does not converge, and the number of sweeps needed for convergence grows as system size increases. (While it may be hard to see, there are actually three lines that do not converge.)

Figure 26: Mean converged value of the energy per site for different maximum values of the bond dimension Dmax, using the single-site algorithm. Every computation is started from a new, random initial state. For all cases, the system size is L = 20. Error bars show one standard deviation over 10 runs.

Figure 27: Convergence of energy per site for different system sizes (L = 8 to L = 25) at Dmax = 10, using the two-site algorithm. As opposed to figure 25, all sizes converge for this value of Dmax.

Figure 28: Mean converged value of the energy per site for different maximum values of the bond dimension Dmax, using the two-site algorithm. Every computation is started from a new, random initial state. For all cases, the system size is L = 20. Error bars show one standard deviation over 10 runs.

4 PARAMETER-BASED ENTANGLEMENT ENTROPY MINIMIZATION

As was discussed in section 2.3, an MPS generally contains a certain amount of entanglement, quantified by the entanglement entropy. This entanglement puts an upper bound on the efficiency of the MPS representation: the stronger the entanglement, the bigger the bond dimension has to be to capture all the relevant physics in the system. The goal of this chapter is to investigate ways to decrease the amount of entanglement in the system by using local unitary basis transformations. This idea is based mainly on [13].

This chapter is outlined as follows: section 4.1 explains the global method of minimizing the entanglement entropy. Section 4.2 details the way this minimization is done using parameter-based minimization schemes.

4.1 Procedure

The general procedure for minimizing the entanglement entropy in an MPS, as described by [13], rests on the application of local unitary transformations to the MPS. The procedure is as follows: for a given bond between two MPS tensors, some algorithm determines an optimal unitary transformation. This transformation is then contracted with the two MPS tensors, effectively changing their basis, after which the resulting tensor is brought back into MPS form using an SVD.

This procedure results in a local change in the entanglement entropy (if the transformations are chosen well). The procedure is schematically shown in figure 29. To make sure that observables such as the expectation value of a Hamiltonian in a given state remain the same, unitary transformations are applied to the MPO as well, resulting in identities of the form UU† = 1 when contracting the MPS with the MPO. This is depicted in figure 30.
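To make the four steps of figure 29 concrete, here is a minimal NumPy sketch of the local step. The index conventions (MPS tensors with legs (left bond, physical, right bond)), the function name, and the symmetric splitting of the singular values are illustrative assumptions, not the thesis implementation.

import numpy as np

def disentangle_bond(A1, A2, U, Dmax):
    """One local step of figure 29: contract a two-site unitary U into
    the MPS tensors A1 (l, s1, m) and A2 (m, s2, r), then split the
    result back into two tensors with a truncated SVD."""
    # Steps 1-3: merge the two sites and apply the unitary to the
    # physical legs; U has legs (s1', s2', s1, s2)
    theta = np.einsum('lam,mbr->labr', A1, A2)
    theta = np.einsum('cdab,labr->lcdr', U, theta)
    # Step 4: SVD over the bipartition (l, s1') | (s2', r)
    l, d1, d2, r = theta.shape
    u, lam, vt = np.linalg.svd(theta.reshape(l * d1, d2 * r),
                               full_matrices=False)
    D = min(Dmax, len(lam))
    # One possible convention: absorb the singular values symmetrically
    sq = np.sqrt(lam[:D])
    A1_new = (u[:, :D] * sq).reshape(l, d1, D)
    A2_new = (sq[:, None] * vt[:D]).reshape(D, d2, r)
    return A1_new, A2_new, lam[:D]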

The step from local transformations to a global transformation optimizing the entire system is quite reminiscent of the DMRG algorithms treated in chapter 3. On each site of the system, the MPS is optimized locally. Just like in DMRG, we then 'sweep' to the next site, moving back and forth through the system. In this way, the global transformation is built up out of a large number of local transformations, creating a kind of 'patchwork' as shown in figure 31.

Because the optimization is so similar to DMRG, the two algorithms are readily combined. This is done by performing a local entanglement minimization step after each (2-site) DMRG step. Pseudocode for the resulting algorithm is given in algorithm 5 in appendix A.3.

The problem now is how to find the specific transformation that minimizes the entanglement entropy.



(a) Step 1: the dashed line shows the bond about to be optimized.

(b) Step 2: A unitary transformation U is applied to the MPS.

(c) Step 3: The unitary tensor is contracted with the two MPS tensors.

(d) Step 4: The tensor resulting from (c) is split back into two tensors, with less entanglement between them.

Figure 29: Schematic procedure for minimizing the entanglement entropy over the middle bond of a 10-site MPS.

Figure 30: Applying the unitary transformations to an MPO.

Since, for a spin-1/2 system, the transformations contain 2 × 2 × 2 × 2 complex values (determined by the dimensions of the unitary's indices), there are potentially 32 real numbers to be found. While this number is reduced to 16 by simply demanding unitarity, the task at hand is still a complicated one. The next section treats the algorithms that are used to find these transformations.

4.2 Parameter-based unitary search

The most obvious way to attempt to find the transformations is to run some multidimensional optimization algorithm, optimizing a unitary matrix in terms of a number of parameters. Algorithms like this make use of a cost function, which will be treated in subsection 4.2.1. Subsection 4.2.2 will cover different ways to parametrize unitary transformations. Subsections 4.2.3 and 4.2.4 will treat the optimization schemes themselves.



Figure 31: A large number of local transformations creating a single global transformation. Here, only the transformations from a single sweep are shown; for a complete global transformation, multiple sweeps (in both directions) are used.

Gradient-based schemes are found in [17] and [3], both of which are mentioned and used in the appendix of [13].

4.2.1 Cost functions

Most standard optimization algorithms need a cost function to work. The cost function is a function that, given a certain point in the parameter space of the optimization algorithm, returns the 'cost' for that point. This tells the algorithm how well it is doing.

The cost function used in [13] is the vector-1-norm. (In general, the vector-n-norm is given by \|\vec{\lambda}\|_n = \left( \sum_i |\lambda_i|^n \right)^{1/n}.) The cost function reads

f^{(1)} = \|\vec{\lambda}\|_1 = \sum_i |\lambda_i|,  (16)

where the λi are the singular values of a bipartition over the bond under current optimization. [13] also suggests using a cost function based on the vector-4-norm if the optimization problem is large in dimension. This cost function is

f^{(4)} = -\|\vec{\lambda}\|_4^4 = -\sum_i |\lambda_i|^4,  (17)

where \|\vec{\lambda}\|_4^4 is to be read as (\|\vec{\lambda}\|_4)^4. This norm is closely related to the second Rényi entropy, equation (12) with α = 2.

As can be seen, both versions of the vector-n-norms are based on the singular values corresponding to a certain bond. This suggests other possibilities for the cost function. For instance, the Von Neumann entropy and the Rényi entanglement entropies for α ≠ 2 can both be computed directly from the singular values, and thus offer more potential cost functions.
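All of these candidates are elementary functions of the singular values of the bond, so they share a minimal implementation. The sketch below (function names are my own) assumes the state is normalized so that the squared singular values sum to one.

import numpy as np

def f1(lam):
    """Vector-1-norm cost, equation (16)."""
    return np.sum(np.abs(lam))

def f4(lam):
    """Negative fourth power of the vector-4-norm, equation (17);
    minimizing it is equivalent to minimizing the second Renyi
    entropy S_2 = -log(sum_i lambda_i^4)."""
    return -np.sum(np.abs(lam) ** 4)

def von_neumann(lam, eps=1e-12):
    """Von Neumann entropy S = -sum_i lambda_i^2 log(lambda_i^2)."""
    p = np.abs(lam) ** 2
    p = p[p > eps]              # drop vanishing singular values
    return -np.sum(p * np.log(p))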

4.2.2 Parametrizing unitary transformations

For a parameter-based optimization scheme to work, the unitary transformation needs to be expressed in terms of the optimization parameters. This can be done in various ways. Section 4.2.2.1 treats a highly general way to express a unitary matrix in terms of 16 parameters. Section 4.2.2.2 treats a way to express the matrix in only 2 parameters, which is specialized towards spin-1/2 systems.

4.2.2.1 N² parameters

As mentioned above, a general N × N (complex-valued) unitary matrix is completely determined by fixing N² real parameters, i.e. 16 for the 4 × 4 transformations considered here. A way to do this parametrization is given in [11].

To write the parametrization, a general N × N unitary matrix is first written as a product of three unitary matrices (using the superscript (N) to denote an N × N matrix):

U^{(N)} = \Phi^{(N)}(\vec{\alpha}) \, V^{(N)} \, \Phi^{(N)}(\vec{\beta}),  (18)

where the matrices Φ are diagonal matrices containing only complex phases:

\Phi^{(N)}(\vec{\alpha}) = \mathrm{diag}\left( e^{i\alpha_1}, e^{i\alpha_2}, \ldots, e^{i\alpha_N} \right), \quad \Phi^{(N)}(\vec{\beta}) = \mathrm{diag}\left( e^{i\beta_1}, e^{i\beta_2}, \ldots, e^{i\beta_N} \right), \qquad \alpha_i, \beta_i \in \mathbb{R}.  (19)

The matrix V^{(N)} can now be written as

V^{(N)} = \begin{pmatrix} V^{(N-1)} + (1 - c_N)\,|A^{(N-1)}\rangle\langle B^{(N-1)}| & s_N\,|A^{(N-1)}\rangle \\ s_N\,\langle B^{(N-1)}| & c_N \end{pmatrix},  (20)

where |A^{(N-1)}⟩ and |B^{(N-1)}⟩ are complex-valued vectors containing N − 1 components, c_N = cos θ_N and s_N = sin θ_N. These vectors are related through multiplication by −V^{(N−1)} or its Hermitian conjugate.

For a full count of the number of independent parameters, we start with the vectors |A^{(N-1)}⟩ and |B^{(N-1)}⟩. As they are related, normalized, and the matrices Φ can absorb an overall phase, these vectors carry 2(N − 2) parameters. The matrix V^{(N−1)} gives us (N − 2)² parameters. The angle θ_N is another parameter. Finally, the matrices Φ carry a total of 2N − 1 parameters (not 2N, as the α_i and β_i only enter the final matrix through the sums α_i + β_j, removing one degree of freedom). Thus, the total number of parameters describing U is

2(N - 2) + (N - 2)^2 + 1 + 2N - 1 = N^2,  (21)

as it should be. For a full description of how to build V^{(N)} iteratively and how to pick the parameters, see [11]. It is important to note that all parameters appear in either a sine, a cosine, or a complex exponential, so all parameters lie in the interval [0, 2π].
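As a concrete illustration of equations (18)–(20), the sketch below builds V^{(N)} recursively and assembles a full unitary. The specific relation |A⟩ = −V^{(N−1)}|B⟩ between the two vectors is an assumption on my part (one choice that makes V^{(N)} unitary; [11] gives the precise prescription), so this is an illustration rather than the parametrization actually used in the thesis.

import numpy as np

def build_V(V_prev, B, theta):
    """Assemble V^(N) from V^(N-1), a vector B and an angle theta_N,
    following equation (20). Assumes A = -V_prev @ B."""
    B = np.asarray(B, dtype=complex)
    B = B / np.linalg.norm(B)                  # |B> is normalized
    A = -V_prev @ B                            # assumed relation
    c, s = np.cos(theta), np.sin(theta)
    N = V_prev.shape[0] + 1
    V = np.zeros((N, N), dtype=complex)
    V[:-1, :-1] = V_prev + (1 - c) * np.outer(A, B.conj())
    V[:-1, -1] = s * A
    V[-1, :-1] = s * B.conj()
    V[-1, -1] = c
    return V

def phase_matrix(angles):
    """Phi = diag(exp(i alpha_1), ..., exp(i alpha_N)), equation (19)."""
    return np.diag(np.exp(1j * np.asarray(angles)))

# Build a 3x3 unitary U = Phi(alpha) V Phi(beta), equation (18)
rng = np.random.default_rng(0)
V2 = build_V(np.eye(1, dtype=complex), rng.normal(size=1), 0.3)
V3 = build_V(V2, rng.normal(size=2) + 1j * rng.normal(size=2), 0.7)
U = phase_matrix(rng.uniform(0, 2 * np.pi, 3)) @ V3 @ \
    phase_matrix(rng.uniform(0, 2 * np.pi, 3))
assert np.allclose(U.conj().T @ U, np.eye(3))  # U is indeed unitary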



4.2.2.2 2 parameters

The previous subsection showed how to fully parametrize an N × N unitary matrix in N² real parameters, i.e. 16 for the case at hand. Optimization in a 16-dimensional parameter space can be both costly and difficult. Therefore a reduction in the number of parameters is strongly desired.

As indicated in [13], for certain systems (specifically spin-1/2 systems obeying certain symmetries), the number of parameters needed can be reduced drastically. The first step towards achieving this is realizing that, for a spin-1/2 system, a single transformation targets the following spin sectors:

\begin{pmatrix} S_z = 1 & & \\ & S_z = 0 & \\ & & S_z = -1 \end{pmatrix},  (22)

with the S_z = ±1 sectors being 1 × 1 and the S_z = 0 sector being 2 × 2. The transformation now only really needs to work on the S_z = 0 sector, as the other two sectors are fully disentangled already and there is no mixing between sectors. The unitary transformation thus can be written as

U = \begin{pmatrix} 1 & & \\ & \tilde{U} & \\ & & 1 \end{pmatrix},  (23)

where Ũ is a 2 × 2 unitary matrix.

The number of parameters describing the transformation can be further reduced by considering the symmetries of the system. As is written in [13], the transformation can be restricted to be an element of U(2)/U(1) × U(1). Personal correspondence with Christian Krumnow and Örs Legeza pointed out that, using the Euler angle representation

U_{2\times 2} = e^{ia_1 \mathbb{1}}\, e^{ia_2 \sigma_z}\, e^{ia_3 \sigma_y}\, e^{ia_4 \sigma_x},  (24)

this quotient can easily be performed, and yields

\tilde{U} = e^{ia_3 \sigma_y}\, e^{ia_4 \sigma_x}.  (25)

The full unitary transformation, with two parameters, is now found by reworking equation (25) into matrix form and plugging the result back into equation (23). The matrix form is found using a Taylor series:

\tilde{U} = \left( \sum_n \frac{(ia_3)^n}{n!} \sigma_y^n \right) \left( \sum_m \frac{(ia_4)^m}{m!} \sigma_x^m \right)
= \left( \mathbb{1} + ia_3\sigma_y - \frac{a_3^2}{2}\mathbb{1} - \frac{ia_3^3}{6}\sigma_y + \ldots \right) \left( \mathbb{1} + ia_4\sigma_x - \frac{a_4^2}{2}\mathbb{1} - \frac{ia_4^3}{6}\sigma_x + \ldots \right)
= \left( \mathbb{1}\cos a_3 + i\sigma_y \sin a_3 \right) \left( \mathbb{1}\cos a_4 + i\sigma_x \sin a_4 \right)
= \begin{pmatrix} \cos a_3 & \sin a_3 \\ -\sin a_3 & \cos a_3 \end{pmatrix} \begin{pmatrix} \cos a_4 & i\sin a_4 \\ i\sin a_4 & \cos a_4 \end{pmatrix}.  (26)

The two parameters on which this transformation depends again appear in complex exponentials or (co)sines only, and thus are again in the interval [0, 2π].
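Equations (23) and (26) translate directly into code. The sketch below assumes the two-site basis ordering |↑↑⟩, |↑↓⟩, |↓↑⟩, |↓↓⟩, so that the S_z = 0 sector is the middle 2 × 2 block; this ordering is my assumption, not something fixed above.

import numpy as np

def two_parameter_unitary(a3, a4):
    """U of equation (23), with the S_z = 0 block given by eq. (26).
    Basis ordering assumed: |uu>, |ud>, |du>, |dd>."""
    Ry = np.array([[np.cos(a3), np.sin(a3)],
                   [-np.sin(a3), np.cos(a3)]])       # exp(i a3 sigma_y)
    Rx = np.array([[np.cos(a4), 1j * np.sin(a4)],
                   [1j * np.sin(a4), np.cos(a4)]])   # exp(i a4 sigma_x)
    U = np.eye(4, dtype=complex)
    U[1:3, 1:3] = Ry @ Rx          # act only on the S_z = 0 sector
    return U

# The result is unitary for any choice of the two angles
U = two_parameter_unitary(0.3, 1.1)
assert np.allclose(U.conj().T @ U, np.eye(4))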

Because the unitary transformations treated here are parametrized by only two parameters, any cost function depending on these transformations can in principle be solved by brute force and graphed directly. Such a 'landscape' is given in figure 32. However, using brute-force computations to solve the problem is very inefficient, which is why optimization schemes are used to find the minimum.


Figure 32: Cost function 'landscape' for the Von Neumann entropy on the middle bond of a 10-site system. The value of the cost function is shown against the values for the parameters a3 and a4 that parametrize the transformation as in equation (26).
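A landscape of this kind can be reproduced in spirit with a brute-force grid scan. The sketch below applies the two-parameter unitary to a random two-site block (the state and the bond dimensions are made up for illustration) and records the Von Neumann entropy of the resulting bipartition.

import numpy as np

def two_parameter_unitary(a3, a4):
    # As in the previous sketch: eq. (26) embedded per eq. (23)
    Ry = np.array([[np.cos(a3), np.sin(a3)], [-np.sin(a3), np.cos(a3)]])
    Rx = np.array([[np.cos(a4), 1j * np.sin(a4)], [1j * np.sin(a4), np.cos(a4)]])
    U = np.eye(4, dtype=complex)
    U[1:3, 1:3] = Ry @ Rx
    return U

def entropy_after(theta, a3, a4):
    """Von Neumann entropy of the (l s1)|(s2 r) bipartition after
    applying the unitary; theta has legs (l, s1, s2, r)."""
    l, _, _, r = theta.shape
    th = theta.reshape(l, 4, r)                      # merge (s1, s2)
    th = np.einsum('ab,lbr->lar', two_parameter_unitary(a3, a4), th)
    lam = np.linalg.svd(th.reshape(l * 2, 2 * r), compute_uv=False)
    p = lam ** 2 / np.sum(lam ** 2)                  # Schmidt weights
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

# Brute-force scan over the two angles for a random two-site block
rng = np.random.default_rng(1)
theta = rng.normal(size=(3, 2, 2, 3)) + 1j * rng.normal(size=(3, 2, 2, 3))
grid = np.linspace(0, 2 * np.pi, 50)
landscape = np.array([[entropy_after(theta, a3, a4) for a4 in grid]
                      for a3 in grid])
i, j = np.unravel_index(landscape.argmin(), landscape.shape)
print('minimum entropy', landscape[i, j], 'at a3, a4 =', grid[i], grid[j])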

4.2.3 Nelder-Mead optimization

The optimization scheme suggested by [13] is the Nelder-Mead algorithm. This algorithm (sometimes also called the 'simplex' or 'amoeba' algorithm) is a multidimensional optimization scheme first introduced in [20]. A good treatment of the method, including extensive pseudocode, is given in [24].

The algorithm works as follows: given an N-dimensional parameter space, N + 1 points are chosen, of which N should be linearly independent. These N + 1 points then form a simplex. The algorithm evaluates the cost function for all points in the simplex, ordering them from best to worst result. The algorithm then has several ways it 'moves' the simplex to a better position: reflecting the worst point through the centroid of the remaining points, expanding such a reflection when it improves the result, contracting towards the better points, and, as a last resort, shrinking the whole simplex towards the best point.
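In practice one would hand the cost function to an off-the-shelf Nelder-Mead implementation rather than coding these moves by hand. A sketch using SciPy, reusing `entropy_after` and `theta` from the previous sketch as the cost function (the tolerances are arbitrary choices, and the initial simplex is left to SciPy's default):

import numpy as np
from scipy.optimize import minimize

# Minimize the entropy over (a3, a4) from a random starting point
rng = np.random.default_rng(2)
x0 = rng.uniform(0, 2 * np.pi, size=2)
result = minimize(lambda x: entropy_after(theta, x[0], x[1]), x0,
                  method='Nelder-Mead',
                  options={'xatol': 1e-8, 'fatol': 1e-10})
print('optimal (a3, a4):', result.x, 'entropy:', result.fun)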
