
Literature Thesis

Chemistry in the Enlightening Age of Quantum Computing: Calculating Energy Derivatives

by

Hugo Beks

November 26, 2019

Student Number: 10541837
Daily Supervisor: D.P. Kooi
Assessed by: Prof. Dr. P. Gori-Giorgi (Research Group)


Abstract

In this study, we give an overview of the current state of quantum computing applied to quantum chemistry. Firstly, an overview of the current quantum chemistry methods using conventional, classical computers is provided and the need for computers with significantly more computational power is shown. Secondly, the concept of quantum computing and how this can be applied to quantum chemistry is introduced. Thirdly, we evaluate the Quantum Phase Estimation algorithm and the Variational Quantum Eigensolver, which are currently the two most well-known algorithms that can be applied to chemistry.

In the final part of this thesis, we take a closer look at a more specific problem: calculating energy derivatives using a quantum computer. After reflecting on current classical methods, we take a closer look at three proposed quantum algorithms. In the first place, the algorithm proposed by Kassal et al. in 2009 is studied,1 which uses a variation of Jordan's Gradient Estimation algorithm to determine energy derivatives. Here, it is shown that the time required to calculate energy derivatives no longer depends on the dimension of said derivative when using a quantum computer. Secondly, the two algorithms proposed by O'Brien et al. are studied:2 the Propagator and Phase Estimation algorithm and the Eigenstate Truncation Approximation algorithm. We show that both these algorithms have huge potential to improve on classical methods, but that far more research in both quantum hardware and software is required before these algorithms can be implemented experimentally.


Contents

1 Introduction
1.1 Quantum Chemistry
1.1.1 The Hartree-Fock Approximation
1.1.2 Basis Sets
1.1.3 Second Quantization of the Hamiltonian
1.1.4 The N-representability problem
1.1.5 Configuration Interaction Method
1.1.6 Coupled Cluster Method
1.2 Quantum Computing
1.3 Qubit Mappings
1.3.1 Jordan-Wigner Transformation
1.3.2 Parity approach
1.3.3 Bravyi-Kitaev Mapping
2 Complexity Theory
3 Quantum Algorithms
3.1 Quantum Phase Estimation
3.1.1 The Quantum Phase Estimation Algorithm
3.1.2 Quantum Fourier Transform
3.2 Variational Quantum Eigensolver
4 Energy Derivatives using Quantum Computation
4.1 Classical Methods
4.2 Quantum Algorithms
4.2.1 Jordan's Gradient Estimation Algorithm
4.3 The PPE and ETA algorithms
4.3.1 Basis of algorithms
4.3.2 Measurement and Scaling
4.3.3 The Propagator and Phase Estimation Algorithm
4.3.4 Eigenstate Truncation Approximation Algorithm
4.3.5 Complexity
4.3.6 Sources of Error


1 Introduction

Even with the ever-growing power of classical computers, it is not possible to simulate all natural phenomena. An exponential wall arises, which we will not be able to surpass using classical methods. Because of this, a new field of computer science has emerged: the field of quantum computing. In the 1980s, the idea of quantum computing was first proposed by Feynman.3 With the proposal of this new field arose two questions: How can we build a computer out of quantum particles? And how should we adapt existing algorithms, and design new ones, to run on a quantum computer?

The field started out purely theoretically, seeing as no one had yet proposed a viable way of building a computer out of quantum systems. However, in the 1990s, interest in quantum computers grew due to Shor's algorithm.4 This was the first algorithm proposed to exponentially speed up an important class of cryptanalysis, assuming large enough quantum computers would exist. Currently, interest and investment in quantum computing have spiked enormously. Large companies, such as Google5 and IBM6, are attempting, and succeeding, in building small-scale quantum computers. Furthermore, academics and companies around the world are working on countless different applications and algorithms to be applied when quantum computers eventually become available.

So what is the big challenge, and why has a usable quantum computer not yet been created? The main reason is that we cannot observe a quantum system without disturbing its state, i.e. we have to keep our quantum system completely isolated from the outside world. However, we also want our quantum particles (otherwise known as qubits) to interact with each other, and we wish to control their actions from the outside. These conflicting requirements make it extremely challenging to create a perfect, so-called fault-tolerant quantum computer, in which extra qubits are used to error-correct, and thereby protect the information on, the qubits of interest. This is where the concept of Noisy Intermediate-Scale Quantum (NISQ) computers was introduced: NISQ devices are the smaller, noisy quantum computers that will (hopefully) be available in the next few years, which lack the large qubit overhead required for full error correction. NISQ computers can be seen as an 'intermediate' step towards full-scale quantum computers. Unfortunately, we cannot predict how far away in the future full-scale quantum computers are.

Following the hardware side of quantum computing, the next challenge is the development of quantum algorithms. Firstly, sets of problems that can be solved faster on a quantum computer than on a classical computer have to be found. Secondly, algorithms that solve these problems efficiently have to be created. As the field of quantum computing is still young, there is still much to learn and much to be improved about quantum algorithms.

In the first part of this thesis, we give an overview of the classical methods currently used in quantum chemistry. Next, we will introduce complexity theory and introduce current quantum algorithms. Finally, we will specifically review quantum algorithms that calculate energy derivatives and give an overview of current studies.

1.1 Quantum Chemistry

In quantum chemistry, the properties of molecular systems are estimated by solving the Schrödinger equation. Seeing as the Schrödinger equation is too complex to solve for many-electron systems, we have to resort to approximations. Since the mass of the atomic nuclei is much larger than that of the electrons, the motion of the nuclei can often be neglected. We call this the Born-Oppenheimer approximation. After applying this approximation, the Hamiltonian we are left with is:

$$H(\{\mathbf{r}_i\}) = -\frac{1}{2}\sum_{i=1}^{N} \nabla_i^2 + \sum_{i>j}^{N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} - \sum_{i=1}^{N} \sum_{\alpha=1}^{N_n} \frac{Z_\alpha}{|\mathbf{r}_i - \mathbf{r}_\alpha|}, \qquad (1.1)$$

which is a sum of the electronic kinetic energy, the electron-electron interaction energy and the nuclei-electron interaction energy. Here, $N$ is the number of electrons, $N_n$ the number of nuclei, $\mathbf{r}_i$ the electronic coordinates, $\mathbf{r}_\alpha$ the nuclear coordinates and $Z_\alpha$ the nuclear charges. The lowest eigenfunction of this Hamiltonian corresponds to the groundstate wavefunction and the corresponding eigenenergy to the groundstate energy. Higher eigenvalues correspond to the respective excited-state energies, and the corresponding eigenfunctions are the wavefunctions of these states.

Seeing as we cannot solve the eigenvalue problem of this Hamiltonian exactly, we have to resort to approximations. There has been extensive research into approximations of the Schrödinger equation using classical computers. In the following sections, we present the most widely used approximations, which later in this thesis form the basis of our quantum algorithms.

1.1.1 The Hartree-Fock Approximation

To describe the molecular orbital (MO) of electron $i$, let

$$\psi_i(x) = \sum_j c_{ij}\, \phi_j(x), \qquad (1.2)$$

where the set $\{\phi_j(x)\}$ are atomic orbitals (AOs), $x$ denotes the spin and spatial coordinate of an electron and $C = [c_{ij}]$ are the corresponding coefficients. The wavefunction of our entire system $\Psi(X, C)$ can then be constructed from the wavefunctions of these single electrons. Because electrons are fermionic, our many-electron wavefunction must be antisymmetric. We can write our

many-electron wavefunction as a so-called Slater determinant:

$$\Psi(X, C) = \begin{vmatrix} \psi_{i_1}(x_1) & \psi_{i_1}(x_2) & \cdots & \psi_{i_1}(x_n) \\ \psi_{i_2}(x_1) & \psi_{i_2}(x_2) & \cdots & \psi_{i_2}(x_n) \\ \vdots & & \ddots & \vdots \\ \psi_{i_n}(x_1) & \psi_{i_n}(x_2) & \cdots & \psi_{i_n}(x_n) \end{vmatrix}. \qquad (1.3)$$

Here, $\{\psi_{i_1}, \psi_{i_2}, \ldots, \psi_{i_n}\}$ denote the states the electrons occupy and $X = \{x_1, x_2, \ldots, x_n\}$ are the positions and spins of the electrons. The exact wavefunction is a sum of these determinants. In the Hartree-Fock (HF) approximation, it is assumed that the many-electron wavefunction can be described by one such determinant, which is optimised using the variational principle.
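To make eq. 1.3 concrete, here is a small NumPy sketch (our own toy example with two hypothetical 1D orbitals, not part of the original thesis) that evaluates a Slater determinant amplitude, including the conventional $1/\sqrt{n!}$ normalisation, and checks that it is antisymmetric under exchange of two electrons:

```python
import numpy as np
from math import factorial

def slater_amplitude(orbitals, xs):
    # mat[i, j] = psi_i(x_j); the determinant antisymmetrises the product
    mat = np.array([[phi(x) for x in xs] for phi in orbitals])
    return np.linalg.det(mat) / np.sqrt(factorial(len(orbitals)))

# Two toy 1D orbitals (hypothetical, for illustration only)
orbitals = [lambda x: np.exp(-x**2 / 2),
            lambda x: x * np.exp(-x**2 / 2)]

a = slater_amplitude(orbitals, [0.3, -0.7])
b = slater_amplitude(orbitals, [-0.7, 0.3])   # the two electrons exchanged
print(np.isclose(a, -b))                      # True: antisymmetric
```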

1.1.2 Basis Sets

In equation 1.2 we assumed that the set of basis functions $\{\phi_i\}$ forms a complete basis set, which means that the set is of infinite size. To be more precise, to describe the electronic Hamiltonian of eq. 1.1, our Hilbert space has infinite dimensions. We thus need infinitely many basis functions to describe the AOs correctly. Unfortunately, we cannot compute with infinitely many basis functions and hence have to reduce the basis set size. However, a well-chosen basis set can still produce highly accurate results.

The most commonly used orbital representations are Slater-type orbitals (STOs) and Gaussian-type orbitals (GTOs). The starting point of STOs are the AOs of hydrogen-like atoms. The molecular orbitals are then represented by a linear combination of these AOs. The basis functions are of the following form:

$$\phi_{\mathrm{STO}}(r; \alpha) \propto p(r)\, e^{-\alpha r}, \qquad (1.4)$$

where $p(r)$ is a polynomial in $r$, which is the distance of the electron from the nucleus, and $\alpha$ is a scaling factor. Even though STOs describe the AOs of hydrogen-like atoms well, they are difficult to use in numerical integration. GTOs have an advantage in this respect, as they decay as the exponential of the square of the distance:

$$\phi_{\mathrm{GTO}}(r; \alpha) \propto p(r)\, e^{-\alpha r^2}. \qquad (1.5)$$

Because the product of two GTOs is a new GTO, integrals over these functions can be easily simplified. One problem, however, is that GTOs do not correctly describe AOs. A linear combination of multiple GTOs can be used to represent a single AO. This brings us to a commonly used basis set: STO-nG, which employs a combination of n GTOs to represent a single STO.
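The STO-nG idea is easy to demonstrate numerically. The sketch below (ours; the starting exponents are arbitrary assumptions, not the published STO-3G parameters) least-squares fits a sum of three Gaussians to a 1s Slater function $e^{-r}$; the fit is good everywhere except near the nuclear cusp at $r = 0$, where a Gaussian has zero slope:

```python
import numpy as np
from scipy.optimize import curve_fit

r = np.linspace(0.01, 6.0, 400)
sto = np.exp(-r)                        # 1s Slater-type orbital, zeta = 1

def sto_3g(r, c1, c2, c3, a1, a2, a3):
    # Linear combination of three s-type Gaussians (the STO-3G idea)
    return (c1 * np.exp(-a1 * r**2) + c2 * np.exp(-a2 * r**2)
            + c3 * np.exp(-a3 * r**2))

p0 = [0.4, 0.4, 0.2, 0.1, 0.4, 2.0]     # rough initial guess (assumption)
popt, _ = curve_fit(sto_3g, r, sto, p0=p0, maxfev=10000)
print(f"max fit error: {np.max(np.abs(sto_3g(r, *popt) - sto)):.3f}")
```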


1.1.3 Second Quantization of the Hamiltonian

Because the Hartree-Fock approximation determines the energy of the Hamiltonian from a single determinant, static and dynamic electron correlation is ignored. Post-HF methods attempt to improve on the Hartree-Fock approximation by adding additional electronic configurations. To simplify these methods, it is preferred to formulate the Hamiltonian in its second-quantized form.

We first represent our wavefunction as a function of the optimised HF orbitals, now in Dirac notation:

$$|\Psi\rangle \equiv |i_1 i_2 \ldots i_n\rangle. \qquad (1.6)$$

This is an $n$-electron state where the orbitals $\{i_1 i_2 \ldots i_n\}$ are filled and the rest of the orbitals are empty. We now introduce two important operators: the creation ($a_i^\dagger$) and annihilation ($a_i$) operators. The creation operator 'creates' an electron in orbital $i$:

$$a_i^\dagger |j_1 j_2 \ldots\rangle = |i\, j_1 j_2 \ldots\rangle. \qquad (1.7)$$

In this fashion, we can represent our wavefunction by a string of creation operators working on the vacuum state $|0\rangle$:

$$|\Psi\rangle = a_{i_1}^\dagger a_{i_2}^\dagger \ldots a_{i_n}^\dagger |0\rangle. \qquad (1.8)$$

The annihilation operator applies the opposite of the creation operator and deletes an electron from orbital $i$:

$$a_i |i\, j_1 j_2 \ldots\rangle = |j_1 j_2 \ldots\rangle. \qquad (1.9)$$

As we have stated, the wavefunction is antisymmetric due to its fermionic nature. Because of this, the creation and annihilation operators obey the following canonical anticommutation relationships:

$$\{a_i^\dagger, a_j^\dagger\} = a_i^\dagger a_j^\dagger + a_j^\dagger a_i^\dagger = 0, \qquad \{a_i, a_j\} = 0, \qquad \{a_i, a_j^\dagger\} = \delta_{ij}. \qquad (1.10)$$

Using the newly defined creation and annihilation operators, we can write our Hamiltonian in second-quantized form:

$$\hat{H} = \sum_{ij} h_{ij}\, a_i^\dagger a_j + \frac{1}{2} \sum_{ijkl} V_{ijkl}\, a_i^\dagger a_j^\dagger a_l a_k, \qquad (1.11)$$


where $h_{ij}$ and $V_{ijkl}$ are our one- and two-electron integrals:

$$h_{ij} = \int \psi_i^*(x) \left[ -\frac{1}{2} \nabla^2 - \sum_\alpha \frac{Z_\alpha}{|\mathbf{r} - \mathbf{r}_\alpha|} \right] \psi_j(x)\, dx,$$
$$V_{ijkl} = \int \psi_i^*(x_1)\, \psi_j^*(x_2)\, \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}\, \psi_k(x_2)\, \psi_l(x_1)\, dx_1\, dx_2. \qquad (1.12)$$

If we assume the set of atomic orbitals $\{\phi_i\}$ forms a complete basis set, the second-quantized Hamiltonian is exact, i.e. equal to eq. 1.1. In modern computational chemistry applications, we often limit ourselves to $M$ basis functions, thus causing the second-quantized Hamiltonian to be an approximation. The Hamiltonian contains $O(M^4)$ terms.

1.1.4 The N-representability problem

The size of the variational space spanned by $\Psi$ grows exponentially with the system size $N$. To circumvent this issue, the N-representability approach was introduced,7 which notes that to retrieve the expectation value of the Hamiltonian, only the two-electron density matrix of $\Psi$ is needed. Because the full many-electron wavefunction holds more information than explicitly needed, it is unnecessary to calculate the entire wavefunction. By studying just the first- and second-order reduced density matrices, the latter called the two-body reduced density matrix (2RDM), we can retrieve the properties we seek.

In the second quantized form, the function for the energy then becomes:

$$E = \frac{\sum_{i,j} \rho^{(1)}_{i,j}\, h_{i,j} + \sum_{i,j,k,m} \rho^{(2)}_{i,k,j,m}\, g_{i,k,j,m}}{\sum_i \rho^{(1)}_{i,i}}, \qquad (1.13)$$

where $\rho^{(1)}_{i,j} = \langle \Psi | a_i^\dagger a_j | \Psi \rangle$, with a similar definition for $\rho^{(2)}$. The N-representability problem is to check whether a given 2RDM is consistent with a single many-electron wavefunction.

1.1.5 Configuration Interaction Method

The Hartree-Fock method merely uses one Slater determinant to represent the many-electron wavefunction. When using all possible appropriately weighted determinants we speak of the Full Configuration Interaction (FCI) method. The wavefunction is then given by:

$$|\Psi_{\mathrm{FCI}}\rangle = \sum_i c_i\, |i\rangle. \qquad (1.14)$$

If our system consists of $M$ molecular orbitals with $N$ electrons, we have $\binom{M}{N}$ possible determinants, a number that scales exponentially in $M$. An FCI calculation is therefore intractable for many-electron systems.


In second-quantized form, the FCI wavefunction can be constructed from the HF state $|\Psi_{\mathrm{HF}}\rangle$, the HF determinant. The FCI wavefunction then becomes:

$$|\Psi_{\mathrm{FCI}}\rangle = \left( \sum_{i,a} c_i^a\, a_a^\dagger a_i + \sum_{ij,ab} c_{ij}^{ab}\, a_a^\dagger a_b^\dagger a_i a_j + \cdots \right) |\Psi_{\mathrm{HF}}\rangle, \qquad (1.15)$$

where $\{i, j, \ldots\}$ and $\{a, b, \ldots\}$ denote the occupied and virtual orbitals, respectively. Each term in eq. 1.15 denotes a set of excitations from the original HF wavefunction (i.e. single excitations, double excitations, etc.). If all levels of excitation are included, the FCI wavefunction is exact. However, truncations of the FCI expansion are often employed. For example, FCI Singles Doubles (FCISD) includes only the first and second excitation levels.

1.1.6 Coupled Cluster Method

The next method we will discuss is the Coupled Cluster (CC) method. The CC wavefunction is as follows:

$$|\Psi_{\mathrm{CC}}\rangle = \exp\left( \sum_{i,a} t_i^a\, a_a^\dagger a_i + \sum_{ij,ab} t_{ij}^{ab}\, a_a^\dagger a_b^\dagger a_j a_i + \cdots \right) |\Psi_{\mathrm{HF}}\rangle. \qquad (1.16)$$

If one includes all excitation levels, the CC and FCI wavefunction are the same.

To obtain the energy of our system we have to solve the Schrödinger equation, $\hat{H} |\Psi_{\mathrm{CC}}\rangle = E |\Psi_{\mathrm{CC}}\rangle$. Next, we define the Coupled Cluster operator $\hat{T}$:

$$\hat{T} = \hat{T}_1 + \hat{T}_2 + \cdots = \sum_{i,a} t_i^a\, a_a^\dagger a_i + \sum_{ij,ab} t_{ij}^{ab}\, a_a^\dagger a_b^\dagger a_j a_i + \cdots, \qquad (1.17)$$

thus giving us:

$$|\Psi_{\mathrm{CC}}\rangle = e^{\hat{T}} |\Phi_0\rangle, \qquad (1.18)$$

where $|\Phi_0\rangle$ is our starting wavefunction, which is usually the HF wavefunction. Because the determinants $|\Phi_{ij\ldots}^{ab\ldots}\rangle = a_a^\dagger a_b^\dagger \ldots a_j a_i |\Phi_0\rangle$ form an orthonormal basis set, the CC equations become:

$$\left\langle \Phi_0 \left| e^{-\hat{T}} \hat{H} e^{\hat{T}} \right| \Phi_0 \right\rangle = E, \qquad \left\langle \Phi_{ij\ldots}^{ab\ldots} \left| e^{-\hat{T}} \hat{H} e^{\hat{T}} \right| \Phi_0 \right\rangle = 0. \qquad (1.19)$$

We can simplify the operator $e^{-\hat{T}} \hat{H} e^{\hat{T}}$ using the Baker-Campbell-Hausdorff (BCH) expansion.8 Furthermore, we note that the Hamiltonian $\hat{H}$ in its second-quantized form (eq. 1.11) contains at most double-excitation operators. This leaves us with:

$$e^{-\hat{T}} \hat{H} e^{\hat{T}} = \hat{H} + [\hat{H}, \hat{T}] + \frac{1}{2!} [[\hat{H}, \hat{T}], \hat{T}] + \frac{1}{3!} [[[\hat{H}, \hat{T}], \hat{T}], \hat{T}] + \frac{1}{4!} [[[[\hat{H}, \hat{T}], \hat{T}], \hat{T}], \hat{T}], \qquad (1.20)$$

where we see that this series terminates exactly at fourth order.

1.2 Quantum Computing

Now that we have given an introduction to the current classical methods for quantum chemistry, let us introduce the concept of quantum computing. A classical computer uses bits, which have a value of either 0 or 1. A quantum computer, on the other hand, uses qubits, which can be in a superposition of the two states $|0\rangle$ and $|1\rangle$ as follows:

$$|\psi\rangle = \alpha |0\rangle + \beta |1\rangle \equiv \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \qquad (1.21)$$

where $\alpha$ and $\beta$ are complex numbers, normalised such that $|\alpha|^2 + |\beta|^2 = 1$. If the qubit $|\psi\rangle$ were measured, it would report a value of 0 with probability $|\alpha|^2$ and 1 with probability $|\beta|^2$. Moreover, we can define our qubit with

two parameters, $\theta$ and $\phi$, such that:

$$|\psi\rangle = \cos\frac{\theta}{2} |0\rangle + e^{i\phi} \sin\frac{\theta}{2} |1\rangle, \qquad (1.22)$$

where θ ∈ [0, π] and φ ∈ [0, 2π]. The qubit can now be seen as a unit vector on a sphere, where θ and φ are its polar coordinates. This representation is called the Bloch sphere.

When working with $n$ qubits, we can form the basis set $B$, which contains all possible bitstrings of our $n$ qubits, i.e. $B = \{|00\ldots000\rangle, |00\ldots001\rangle, |00\ldots010\rangle, \ldots, |11\ldots111\rangle\}$, where we can represent the state $|00\ldots000\rangle$ as $|0\rangle^{\otimes n}$. Our $n$-qubit state then becomes:

$$|\Psi\rangle = \sum_{z \in B} c_z |z\rangle, \qquad (1.23)$$

where now $\sum_z |c_z|^2 = 1$.

Sets of qubits, or quantum states, can be classified in two different ways: separable or entangled. A quantum state is separable if it is a tensor product of single-qubit states; otherwise, it is an entangled state. For example, the two-qubit state

$$|\Psi\rangle = \tfrac{1}{2}\big( |00\rangle + |01\rangle + |10\rangle + |11\rangle \big) \qquad (1.24)$$

is separable, as it factors into the product $\tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes \tfrac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. However, the two-qubit state

$$|\Psi\rangle = \tfrac{1}{\sqrt{2}}\big( |00\rangle + |11\rangle \big) \qquad (1.25)$$

is entangled. If the first qubit of the entangled state were measured, one would immediately know the value of the second qubit.

A quantum algorithm cleverly manipulates the qubits to retrieve the desired outcome from a starting state (often the $|0\rangle^{\otimes n}$ state). This is done by applying a number of one- or two-qubit gates to our system, after which the final state is measured; if we wish to manipulate more than two qubits, a combination of one- and two-qubit gates is used. We can represent these gates as matrices that act on our qubits. There are a number of important gates, which we present here. The Pauli gates act on single qubits and can be seen as a rotation by an angle $\pi$ about their respective axes:

$$\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (1.26)$$

Other important single-qubit gates are the Hadamard gate H and the T gate:

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}. \qquad (1.27)$$

The Hadamard gate maps the basis states $|0\rangle$ and $|1\rangle$ to $\frac{|0\rangle + |1\rangle}{\sqrt{2}}$ and $\frac{|0\rangle - |1\rangle}{\sqrt{2}}$, respectively, which causes a subsequent measurement to yield 0 or 1 with equal probability. The T gate is an example of a phase gate, which applies a phase $e^{i\pi/4}$ to $|1\rangle$ and leaves $|0\rangle$ unchanged.

Furthermore, there are a number of two-qubit gates, which can be used to entangle sets of qubits. One example is the controlled-NOT, or CNOT, gate. This applies the $\sigma^x$ gate to the second, or target, qubit only if the first, or control, qubit is in state $|1\rangle$. We can represent this gate as follows:

$$\mathrm{CNOT} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes \sigma^x, \qquad (1.28)$$

where $I$ is the single-qubit identity matrix.

All above-mentioned gates together form a universal set of quantum gates. This means that any operation performed on a quantum computer can be decomposed into a combination of the above-described one- and two-qubit gates.9
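As a concrete illustration (ours, not from the thesis), the following NumPy sketch builds these gates as matrices, prepares the entangled state of eq. 1.25 by applying a Hadamard and a CNOT to $|00\rangle$, and verifies the entanglement via the rank of the amplitude matrix:

```python
import numpy as np

I = np.eye(2)
sx = np.array([[0, 1], [1, 0]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# CNOT = |0><0| (x) I + |1><1| (x) sigma_x   (eq. 1.28)
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
CNOT = np.kron(P0, I) + np.kron(P1, sx)

ket00 = np.array([1, 0, 0, 0], dtype=complex)
bell = CNOT @ np.kron(H, I) @ ket00
print(bell.round(3))          # (|00> + |11>)/sqrt(2), the state of eq. 1.25

# A two-qubit state is separable iff its 2x2 amplitude matrix has rank 1
print("entangled:", np.linalg.matrix_rank(bell.reshape(2, 2)) > 1)  # True
```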

1.3 Qubit Mappings

In this section, we explain the concept of qubit mappings, which encode the electronic structure problem onto a set of qubits. Here, we present the Jordan-Wigner transformation, the closely related parity approach, and the Bravyi-Kitaev encoding method.


1.3.1 Jordan-Wigner Transformation

One method to encode spin orbitals onto qubits is called the Jordan-Wigner transformation, where each qubit represents a spin orbital. An occupied orbital is represented by a $|1\rangle$ qubit and an unoccupied orbital by a $|0\rangle$ qubit. The number of qubits needed then depends on the size of our basis set, requiring $M$ qubits to encode a wavefunction with basis set size $M$.

The next step is to specify the creation and annihilation operators in terms of qubit gates. We can represent the creation operator as $|1\rangle\langle 0|$, which converts a $|0\rangle$ qubit to a $|1\rangle$ and removes the chance of measuring a $|0\rangle$. In a similar fashion, we can represent the annihilation operator as $|0\rangle\langle 1|$, converting a $|1\rangle$ qubit to a $|0\rangle$ and removing the chance of measuring a $|1\rangle$. However, we have to add a phase to obey the anticommutation relationships of eq. 1.10, as follows:

$$a_i^\dagger = \sigma_1^z \otimes \sigma_2^z \otimes \cdots \otimes \sigma_{i-1}^z \otimes \big( |1\rangle\langle 0| \big)_i, \qquad (1.29)$$
$$a_i = \sigma_1^z \otimes \sigma_2^z \otimes \cdots \otimes \sigma_{i-1}^z \otimes \big( |0\rangle\langle 1| \big)_i, \qquad (1.30)$$

where we apply the $\sigma^z$ gate to every qubit with an index lower than $i$. This only flips the sign of the $|1\rangle$ components of the qubits preceding qubit $i$, thus not changing the probability of measuring a $|0\rangle$ or $|1\rangle$ on any of those qubits, but it ensures that the anticommutation relationships of eq. 1.10 are obeyed. Furthermore, we apply either $|1\rangle\langle 0|$ or $|0\rangle\langle 1|$ to qubit $i$ to create or remove an electron, respectively. Consequently, the creation and annihilation operators carry out two actions: first, they change the occupation locally, and secondly, by applying a string of $\sigma^z$ gates to the qubits preceding qubit $i$, they apply a phase according to the parity (even or odd) of those qubits. Hence, the parity is stored non-locally. Due to the many parity-counting phase gates that have to be applied for each creation/annihilation operator, the Jordan-Wigner transformation can be costly.

When observing the Jordan-Wigner transformation, we immediately see the advantage that quantum computers have over classical computers for quantum chemistry. As we have seen, the FCI expansion requires a number of determinants that scales exponentially with the number of electrons. Using a quantum computer, however, we can map the entire FCI wavefunction onto only $M$ qubits, which can be in a superposition of $2^M$ basis states, and every Slater determinant of the FCI expansion corresponds to one of these basis states. The Jordan-Wigner transformation thus stores the FCI wavefunction efficiently.
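Equations 1.29-1.30 can be checked by brute force. The sketch below (our own verification for a small register of M = 4 spin orbitals, not part of the thesis) builds the Jordan-Wigner operators as Kronecker products and confirms the anticommutation relations of eq. 1.10:

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
sz = np.diag([1.0, -1.0])
create = np.array([[0.0, 0.0], [1.0, 0.0]])   # |1><0|
annihilate = create.T                         # |0><1|

def jw(local, i, M):
    """sigma_z on qubits 0..i-1, `local` on qubit i, identity elsewhere."""
    return reduce(np.kron, [sz] * i + [local] + [I2] * (M - i - 1))

M = 4
a_dag = [jw(create, i, M) for i in range(M)]
a = [jw(annihilate, i, M) for i in range(M)]

for i in range(M):
    for j in range(M):
        # {a_i, a_j^dag} = delta_ij and {a_i, a_j} = 0   (eq. 1.10)
        assert np.allclose(a[i] @ a_dag[j] + a_dag[j] @ a[i],
                           np.eye(2**M) * (i == j))
        assert np.allclose(a[i] @ a[j] + a[j] @ a[i], 0)
print("Jordan-Wigner operators satisfy eq. 1.10")
```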

1.3.2 Parity approach

A different method is the parity encoding approach, where each qubit represents the parity of the number of electrons in the orbitals up to and including it, thus encoding the parity locally.10 If we let $f_i$ represent the occupation ($f_i = 1$) or unoccupation ($f_i = 0$) of orbital $i$, then in the parity encoding method, $q_i \in \{0, 1\}$ denotes the state of qubit $i$. The wavefunction can then be written as the following string of qubits:

$$|\Psi\rangle = |q_{M-1}, q_{M-2}, \ldots, q_0\rangle, \qquad (1.31)$$

where each qubit $q_i$ is defined as:

$$q_i = \left[ \sum_{j=0}^{i} f_j \right] \bmod 2, \qquad (1.32)$$

so we now see that the parity is stored locally while the occupation is stored non-locally. We can furthermore define our creation/annihilation operators in terms of qubit gates:

$$a_i^\dagger = \Big( |0\rangle\langle 0|_{i-1} \otimes \big( |1\rangle\langle 0| \big)_i - |1\rangle\langle 1|_{i-1} \otimes \big( |0\rangle\langle 1| \big)_i \Big) \otimes \sigma_{i+1}^x \otimes \sigma_{i+2}^x \otimes \cdots \otimes \sigma_{M-1}^x, \qquad (1.33)$$
$$a_i = \Big( |0\rangle\langle 0|_{i-1} \otimes \big( |0\rangle\langle 1| \big)_i - |1\rangle\langle 1|_{i-1} \otimes \big( |1\rangle\langle 0| \big)_i \Big) \otimes \sigma_{i+1}^x \otimes \sigma_{i+2}^x \otimes \cdots \otimes \sigma_{M-1}^x. \qquad (1.34)$$

Because the parity encoding method stores the parity locally, two qubits can be removed for a typical molecular system. Our Hamiltonian conserves the number of electrons and the total spin $z$-component. Because the last qubit ($q_{M-1}$) stores the total number of electrons (mod 2), we can replace this qubit by a constant. And if we arrange the qubits such that the first half represent the spin-up orbitals and the second half the spin-down orbitals, we can also remove the middle qubit, as this stores the number of spin-up electrons (mod 2), which can likewise be replaced by a constant. We can thus remove two qubits in the representation of a molecule.

Just as for the Jordan-Wigner transformation, the number of Pauli spin operators required scales as $O(M)$, each acting on a different qubit.
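Equation 1.32 is simply a running parity, as the following short sketch (ours, for a hypothetical occupation vector) shows; note how the last qubit carries the total electron number mod 2, which is why it can be replaced by a constant:

```python
import numpy as np

f = np.array([1, 0, 1, 1, 0, 1])   # occupations f_i of six spin orbitals
q = np.cumsum(f) % 2               # eq. 1.32: q_i = (sum_{j<=i} f_j) mod 2
print(q)                           # [1 1 0 1 1 0]; last entry = 4 mod 2 = 0
```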

1.3.3 Bravyi-Kitaev Mapping

We have now seen two different methods for qubit mapping, the Jordan-Wigner transformation and the parity approach, in which the parity and occupation were stored either completely locally or completely non-locally. The Bravyi-Kitaev (BK) mapping sits somewhere in the middle,11 where each qubit stores partial sums of occupation numbers. These partial sums are defined by the BK matrix $\beta_{ij}$, and the qubits are now defined as follows:

$$q_i = \left[ \sum_{j=0}^{i} \beta_{ij} f_j \right] \bmod 2, \qquad (1.35)$$


and the BK matrix is defined recursively as follows:

$$\beta_1 = [1], \qquad \beta_{2^{x+1}} = \begin{pmatrix} \beta_{2^x} & \mathbf{0} \\ A & \beta_{2^x} \end{pmatrix}, \qquad (1.36)$$

where $A$ is a $(2^x \times 2^x)$ matrix filled with 0's, except for the bottom row, which is filled with 1's, and $\mathbf{0}$ is a $(2^x \times 2^x)$ matrix completely filled with 0's. As an example, the matrix $\beta_4$ looks like this:

$$\beta_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix}. \qquad (1.37)$$

We see that we can only define the BK matrix for powers of two. If the number of qubits (M ) is not a power of two, we use a matrix with the next greatest power of two and only use the first M rows. We already see that the Bravyi-Kitaev mapping is more involved than the two previously-presented methods. For a derivation of the creation and annihilation operators, we refer the reader to the study by Seeley et al.10 The biggest advantage of the BK mapping method is the scaling of the qubit operations required, which reduces to O(log M ).
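The recursion of eq. 1.36 is straightforward to implement. The sketch below (ours) builds $\beta$ for the next power of two and truncates, reproducing the $\beta_4$ of eq. 1.37:

```python
import numpy as np

def bk_matrix(n_qubits):
    """Bravyi-Kitaev matrix, built for the next power of two >= n_qubits
    via the recursion of eq. 1.36, truncated to the first n_qubits rows."""
    beta = np.array([[1]], dtype=int)
    while beta.shape[0] < n_qubits:
        m = beta.shape[0]
        A = np.zeros((m, m), dtype=int)
        A[-1, :] = 1                              # bottom row of A is ones
        beta = np.block([[beta, np.zeros((m, m), dtype=int)],
                         [A, beta]])
    return beta[:n_qubits, :n_qubits]

print(bk_matrix(4))   # reproduces the beta_4 of eq. 1.37
```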

We have now presented three mapping methods that form the basis of current qubit mapping methods. It should be noted that more research is being carried out to improve the scaling of these transformations even further.10,12–18


2 Complexity Theory

As stated in the introduction, we cannot solve the Schrödinger equation exactly on a classical computer. A key question that scientists have recently been asking is: can we solve the Schrödinger equation, assuming we have a fully working quantum computer? This brings us to the next section of this thesis, where we address the computational complexity of quantum chemistry. The theory of computational complexity seeks to characterise the hardness of computational problems, i.e. how much time it would cost to solve certain problems on a computer and how the resources required grow with increasing problem size. Classifying the complexity of the Schrödinger equation is important if we wish to know whether quantum computers will significantly speed up calculations compared to classical computers.

In complexity theory, the complexity of a problem is often defined by its worst-case scenario; simpler instances of a problem might need significantly less time. A problem of size $n$ is deemed efficiently solvable if it can be solved in a time that is at most a polynomial function of $n$. Here, $n$ denotes the size of our problem, which in the case of the search for the groundstate of a molecular Hamiltonian is the basis set size.

Let us first define decision problems, which are problems with a ‘yes’ or ‘no’ answer. If one can efficiently solve a decision problem, it is said to be in complexity class P, which means deterministic polynomial time. An example of a problem that is in P is the decision problem: ‘is N a prime number?’.19 The size of the problem is then defined by how large of a number N is.

If one is given an answer to a decision problem, and one can verify this answer efficiently, then the problem is in complexity class NP, which means non-deterministic polynomial time. An example of a problem in NP is the ‘travelling salesman problem’, which, given a matrix of distances between n cities, determines if there is a route that visits all cities with a total distance less than k.20

The next classes we look at are the hard classes, where hard means that a problem is at least as difficult as any problem in that class. This means that if there were an efficient algorithm to solve this problem, we could solve all problems in the class efficiently. Finally, we have the complete classes. These are the problems that can be seen as the hardest problems in each class: they are both members of the class itself and hard for the class. An NP-complete problem is thus both in NP and NP-hard.

In all above-mentioned problems, access to a classical computer was assumed, thus defining P/NP as the classes of problems efficiently solvable/verifiable on a classical computer. The arrival of quantum computers has led to new complexity classes. The first we discuss is the class BQP, or the Bounded-error Quantum Polynomial-time class, which is the equivalent of the class P for a quantum computer. It should be noted that we compute the complexity of a problem by determining the number of gates needed to compute it. If the number of gates needed on a quantum computer scales polynomially with the system size, the problem is said to be in the BQP class. Subsequently, the class QMA, which stands for Quantum Merlin-Arthur, is the quantum-computer equivalent of the NP class. A problem is thus said to be QMA-complete if it is unlikely to be efficiently solvable on a quantum computer, while an answer can be verified in polynomial time.

An interesting aspect that should be noted is that a quantum computer will not have unlimited power. We do not expect quantum computers to solve all NP-hard problems instantly. The travelling salesman problem, for example, is such a challenging combinatorial search problem that even a quantum computer will merely be able to speed up the exhaustive search,21 and only modestly.22 NP-hard problems will most likely still be hard on a quantum computer.

Next, we address the complexity of several problems in quantum chemistry. Solving the Hartree-Fock equations, as presented in section 1.1.1, has been proven to be NP-complete.23 However, in quantum chemistry, we often describe far simpler systems than the worst-case scenario, which is why it is possible to use the HF approximation to solve small molecular systems. Furthermore, it has been shown that finding the groundstate of a system is QMA-complete.24 Moreover, calculating the energy of the many-electron wavefunction using the N-representability approach has been proven to be QMA-complete.25 Even though it is unlikely that we will find efficient algorithms for NP- or QMA-complete problems, fortunately, there are methods that have gained some success in obtaining approximate solutions.

Another positive note is that the Hamiltonians that frequently occur in quantum chemistry are far more restricted than the Hamiltonians considered when determining the computational complexity of their class.26 We thus often do not face the worst-case scenario of the class.


3 Quantum Algorithms

In this section, we present the two most commonly used algorithms for quantum chemistry on quantum computers: the first is the Quantum Phase Estimation algorithm and the second the Variational Quantum Eigensolver.

3.1 Quantum Phase Estimation

The idea of the Quantum Phase Estimation (QPE) algorithm is to use a Fourier transform of the autocorrelation of the time-evolving state of our wavefunction, which will have peaks at the eigenenergies of our Hamiltonian. An input wavefunction is made to collapse into one of its eigenstates, of which we calculate the energy.

The algorithm consists of three parts. Firstly, the Hamiltonian is written as a sum of Pauli spin operators. Secondly, we convert all of these operators into unitary gates, such that they represent an approximation to a unitary propagator exp(−iHt). Finally, the phase estimation algorithm is used to approximate an eigenvalue of the eigenstate of the input wavefunction.

Different mappings can be used to represent our Hamiltonian as a sum of spin operators, as described in section 1.3. The next step is to decompose our unitary propagator $\exp(-iHt)$. In the second-quantized form of our Hamiltonian, we can decompose the Hamiltonian into a sum of 1- and 2-local terms, and hence decompose the unitary propagator as follows, using the first-order Trotter decomposition:

$$\exp(-iHt) = \left[ \exp(-ih_1 \Delta t)\, \exp(-ih_2 \Delta t) \cdots \exp(-ih_n \Delta t) \right]^{t/\Delta t} + O(t\, \Delta t), \qquad (3.1)$$

where $t/\Delta t$ is the Trotter number27 and the $h_n$ are our 1- and 2-local Hamiltonians.
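The error term of eq. 3.1 can be observed directly in a toy simulation. The sketch below (ours, with a random pair of non-commuting Hermitian matrices standing in for the local terms $h_n$) compares the Trotterised propagator with the exact one as $\Delta t$ shrinks:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
def rand_herm(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

h1, h2 = rand_herm(4), rand_herm(4)    # two non-commuting terms
H, t = h1 + h2, 1.0
exact = expm(-1j * H * t)

for steps in (1, 10, 100):
    dt = t / steps                     # Trotter number = t / dt = steps
    step = expm(-1j * h1 * dt) @ expm(-1j * h2 * dt)
    err = np.linalg.norm(np.linalg.matrix_power(step, steps) - exact)
    print(steps, err)                  # error shrinks roughly as 1/steps
```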

If we wish to determine an eigenvalue of an eigenstate, let us first consider the phase of the eigenstate of our Hamiltonian, which evolves conditionally on a register qubit:

$$|0\rangle |\psi_n\rangle + \exp(-iHt) |1\rangle |\psi_n\rangle = |0\rangle |\psi_n\rangle + \exp(-iE_n t) |1\rangle |\psi_n\rangle. \qquad (3.2)$$

If we now define $E_n = 2\pi(\phi - K)/t$, where $0 \le \phi < 1$ and $K$ is some integer, then we can determine the eigenvalue by measuring the relative phase of the register qubit in $|0\rangle + \exp[-2\pi i(\phi - K)] |1\rangle$ using the phase estimation algorithm, which we describe in the next section.


3.1.1 The Quantum Phase Estimation Algorithm

To estimate the phase $\Phi$ of a unitary propagator $\exp(i\Phi)$ (which in our case is $\exp(-iHt)$), we use the Quantum Phase Estimation algorithm. This algorithm uses two sets of qubits. The first set consists of a number of ancilla qubits, which are used to read out a binary representation of $\lambda_m$, the eigenenergy corresponding to an eigenstate of our Hamiltonian. The second set of qubits, $|\psi_m\rangle$, is presumed to be an approximation of this eigenstate. If we want to obtain groundstate energies when using QPE, we have to assume that the input wavefunction has significant overlap with the groundstate wavefunction. First, we initialise each ancilla qubit to the state $|0\rangle$, after which we apply the Hadamard gate to each of these qubits to prepare them in an equal superposition of the states $|0\rangle$ and $|1\rangle$.

The goal of the QPE algorithm is then to achieve the best $L$-bit estimate of the phase $\phi$, where $L$ denotes the number of ancilla qubits, and we define $\phi$ as the following fractional binary expansion:

$$\phi = 0.j_0 j_1 j_2 \ldots j_{L-1} = \frac{j_0}{2} + \frac{j_1}{4} + \frac{j_2}{8} + \cdots + \frac{j_{L-1}}{2^L}. \qquad (3.3)$$

Here, $j_k$ is either 0 or 1 and is the measured value of the $k$th qubit. Our unitary propagator is $U(t) = \exp(-iHt)$, of which the relative phase (which we get from eq. 3.2) is $\exp(2\pi i \phi)$. If we now apply $U(2^k t)$, we obtain a relative phase $\exp(2\pi i\, 2^k \phi)$ and we observe that $2^k \phi = j_0 j_1 \ldots j_{k-1}.j_k j_{k+1} \ldots$ From this, we can thus obtain the binary expansion of length $L$.

First, we implement $U(2^L t)$, from which we obtain a relative phase $\exp(2\pi i\, j_0 \ldots j_{L-1}.j_L) = \exp(2\pi i\, j_L / 2)$. By applying the controlled-$U^{2^L}$ to the $L$th qubit and the state preparation register $|\psi_m\rangle$, we can transform the state as follows: $(|0\rangle + |1\rangle) |\psi_m\rangle \rightarrow (|0\rangle + e^{2\pi i j_L/2} |1\rangle) |\psi_m\rangle$. Because we know that $j_L$ is either 0 or 1, the relative phase is either $+1$ or $-1$, respectively, resulting in either the state $|0\rangle + |1\rangle$ or $|0\rangle - |1\rangle$ on register qubit $L$. We can subsequently do this for all of our register qubits. After applying an inverse Quantum Fourier Transform (which we explain in the next section) to all ancilla qubits to obtain the state $|j_0 j_1 \ldots j_L\rangle$, we can measure all values of $j_k$, retrieving the binary expansion of the eigenenergy of our Hamiltonian.


Figure 1. Quantum circuit implementation of the Quantum Phase Estimation algorithm. First, the Hadamard gate is applied to all (in this case 3) ancilla qubits. Next, the controlled $U(2^k)$ gates are applied between the ancilla qubits and the eigenstate $|\psi_k\rangle$, after which the inverse Quantum Fourier Transform is applied to retrieve the state $|j_0 j_1 \ldots j_L\rangle$, which is measured.
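The read-out of Figure 1 can be simulated with plain linear algebra. In the sketch below (our own illustration, with the controlled-$U(2^k)$ ladder summarised by the phase kickback it produces), the ancilla register ends up in $\frac{1}{\sqrt{N}} \sum_k e^{2\pi i k \phi} |k\rangle$, and the inverse QFT of the next section concentrates the probability on the best $L$-bit estimate of $\phi$:

```python
import numpy as np

def qpe_estimate(phi, L):
    """Best L-bit estimate of a phase phi in [0, 1), QPE-style."""
    N = 2 ** L
    # Ancilla register after the controlled-U(2^k t) ladder (phase kickback)
    psi = np.exp(2j * np.pi * phi * np.arange(N)) / np.sqrt(N)
    # Inverse QFT matrix: entries exp(-2*pi*i*j*k/N) / sqrt(N)
    jk = np.outer(np.arange(N), np.arange(N))
    iqft = np.exp(-2j * np.pi * jk / N) / np.sqrt(N)
    probs = np.abs(iqft @ psi) ** 2
    return np.argmax(probs) / N        # most likely measurement outcome

print(qpe_estimate(0.3, L=8))          # ~0.3008, the best 8-bit estimate
```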

3.1.2 Quantum Fourier Transform

The last step of the QPE algorithm is the inverse Quantum Fourier Transform (QFT),28 an important algorithm used in many different quantum computing applications. The QFT converts the phases of the different ancilla qubits into a binary representation of the concerned eigenstate. The QFT performs a similar action to the classical Fourier transform: it transforms the basis from the computational to the Fourier basis. Mathematically, it acts on an orthonormal basis $|0\rangle, |1\rangle, \ldots, |N-1\rangle$ as follows:

$$\mathrm{QFT}\, |j\rangle = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{2\pi i jk/N}\, |k\rangle. \qquad (3.4)$$

If $|j\rangle$ is a multi-qubit state $|j_1, j_2, \ldots, j_n\rangle$, the QFT acts as follows on the entire state:

$$\mathrm{QFT}\, |j_1, j_2, \ldots, j_n\rangle = \frac{\big( |0\rangle + e^{2\pi i\, 0.j_n} |1\rangle \big) \big( |0\rangle + e^{2\pi i\, 0.j_{n-1} j_n} |1\rangle \big) \cdots \big( |0\rangle + e^{2\pi i\, 0.j_1 j_2 \ldots j_n} |1\rangle \big)}{2^{n/2}}, \qquad (3.5)$$

where $0.j_1 j_2 \ldots j_n$ denotes a binary fraction. The QFT is applied using a quantum circuit consisting of a series of Hadamard gates and controlled rotations. In QPE, the inverse of the QFT is applied, as we have to transform a binary representation back into the computational basis. But, as all quantum gates are unitary, all quantum circuits are reversible; the inverse QFT is therefore simply the reverse circuit of the QFT itself.

The problem that arises with the application of the QPE algorithm is the number of gate operations needed to obtain accurate results. For example, to achieve a chemically accurate estimation of the groundstate energy of H2,

approximately 600 gate operations need to be applied.29 The number of gate operations quickly stacks up, requiring on the order of millions or even billions of gates for more interesting systems, whereas in the near future only quantum computers executing tens or hundreds of gates are achievable.30 For this reason, alternative methods to determine the eigenvalues of the Hamiltonian have been proposed, of which the most promising is the Variational Quantum Eigensolver, explained in the next section.

3.2 Variational Quantum Eigensolver

The main idea of the Variational Quantum Eigensolver (VQE) is to combine the strength of a quantum computer with that of a conventional, classical computer.31 Firstly, a quantum state $|\Psi(\vec{\theta})\rangle$ is prepared. Secondly, the expectation value of the energy is calculated by splitting up the Hamiltonian using its second-quantized form. Thirdly, the parameters $\vec{\theta}$ are updated using a classical computer. The second and third steps are repeated until the energy has converged.

The first part of the VQE is to prepare a parametrized quantum state $|\Psi(\vec{\theta})\rangle$. We prepare this state from an initial state $|\Psi_0\rangle$: $U(\vec{\theta}) |\Psi_0\rangle = |\Psi(\vec{\theta})\rangle$. To prepare this trial state, there are a number of different ansatze, and the performance of the VQE largely depends on the quality of the ansatz; examples are the Unitary Coupled Cluster ansatz32, the Hamiltonian variational ansatz33, low-depth circuit ansatze34 and hardware-efficient ansatze35. We will here present the (much used) Unitary Coupled Cluster ansatz.

We have already presented the Coupled Cluster theory in section 1.1.6. Unitary Coupled Cluster (UCC) is a different formulation of this theory, where now:32

$$|\Psi\rangle = \exp(T - T^\dagger)\, |\Psi\rangle_{\mathrm{ref}}, \qquad (3.6)$$

where the cluster operator $T$ is defined in equation 1.17 and the reference state $|\Psi\rangle_{\mathrm{ref}}$ is often the HF groundstate. We see that in the UCC method the operator $T$ is replaced by the anti-Hermitian operator $(T - T^\dagger)$, so that we are now working with a unitary operator $U = \exp(T - T^\dagger)$. An efficient implementation of the UCC ansatz on a classical computer has not yet been developed, due to the fact that the BCH series of the UCC ansatz does not terminate as the BCH series of the CC method does (see eq. 1.20). However, the unitary operation $\exp(T - T^\dagger)$ is natural on a quantum computer, and we can thus efficiently prepare a state using the UCC method on a quantum computer. We define the truncated cluster operator as:

$$T^{(k)} = \sum_{i=1}^{k} T_i, \qquad (3.7)$$

where $k$ denotes the number of excitation levels included in $T$; e.g. in Unitary Coupled Cluster Singles Doubles (UCCSD), we only include the terms $T_1$ and $T_2$. The number of terms in the parameterization then scales as $O(N^4)$. Correspondingly, the number of gates scales as $O(fN^4)$, where $f$ denotes the number of gates needed to implement the qubit mapping method (see section 1.3).

The UCC ansatz is merely one example of the preparation of a trial state, and there are many different ansatze that can be applied in the VQE.33–36 Once we have a parametrized trial state $|\Psi(\vec{\theta})\rangle$, we wish to calculate the expectation value of the Hamiltonian to retrieve the energy, $E(\vec{\theta}) = \langle \Psi(\vec{\theta}) | H | \Psi(\vec{\theta}) \rangle$, to within a certain precision $\epsilon$. This is done by iteratively applying the VQE algorithm. The expectation value of the Hamiltonian is first calculated using the wavefunction prepared by our ansatz on a quantum computer. After this, our parameters $\vec{\theta}$ are updated using a classical computer. We then repeat this procedure until we have reached sufficient accuracy. In quantum chemistry applications, we often wish to calculate chemical properties, and since many of these properties are exponentially sensitive to a change in the energy, the concept of chemical accuracy was introduced, which states that a calculation is accurate if the energy error is < 1 kcal/mol,37 which is thus often the chosen value for the precision $\epsilon$.

If we had access to a quantum computer with error correction, we could use QPE to estimate the energy of the Hamiltonian. However, current NISQ devices are error-prone, and to avoid a huge overhead of error-correction qubits, new methods to estimate the energy have been proposed, such as the Hamiltonian averaging method, which we present here.31

Because we can write our annihilation and creation operators in terms of Pauli operators using the different qubit mapping methods, we can define our Hamiltonian as:

$$H = \sum_{i\alpha} h_{i\alpha}\, \sigma_\alpha^i + \sum_{ij\alpha\beta} h_{ij\alpha\beta}\, \sigma_\alpha^i \sigma_\beta^j + \cdots, \qquad (3.8)$$

where $i, j, \ldots$ denote the type of Pauli matrix and $\alpha, \beta, \ldots$ are the indices of the qubits. Because of the linearity of observables, we can define the expectation value of the Hamiltonian $\langle H \rangle$ as:

$$\langle H \rangle = \sum_{i\alpha} h_{i\alpha}\, \langle \sigma_\alpha^i \rangle + \sum_{ij\alpha\beta} h_{ij\alpha\beta}\, \langle \sigma_\alpha^i \sigma_\beta^j \rangle + \cdots. \qquad (3.9)$$

This reduces the problem of finding the expectation value of the Hamiltonian, $\langle H \rangle$, to a sum of a polynomial number of expectation values of Pauli spin operators for a certain quantum state. Fortunately, we can efficiently calculate the expectation value of a tensor product of Pauli spin operators.38 If we tried to achieve this on a classical computer, we would suffer from the N-representability problem, which we introduced in section 1.1.4. Since the N-representability problem is QMA-hard, we wish to avoid it. Thankfully, a quantum computer can store a quantum state with exponentially fewer resources than a classical computer.
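A minimal sketch of Hamiltonian averaging (ours, for a hypothetical two-qubit Hamiltonian, and with the hardware measurement replaced by an exact statevector expectation value): the energy of eq. 3.9 is assembled term by term from Pauli-string expectation values.

```python
import numpy as np
from functools import reduce

paulis = {'I': np.eye(2),
          'X': np.array([[0, 1], [1, 0]]),
          'Y': np.array([[0, -1j], [1j, 0]]),
          'Z': np.diag([1.0, -1.0])}

def pauli_string(label):
    return reduce(np.kron, [paulis[c] for c in label])

# Hypothetical mapped Hamiltonian: H = 0.5*ZI - 0.3*ZZ + 0.2*XX
terms = {'ZI': 0.5, 'ZZ': -0.3, 'XX': 0.2}
state = np.array([1, 0, 0, 1]) / np.sqrt(2)    # trial state (a Bell state)

# Eq. 3.9: <H> = sum_i h_i <P_i>, one expectation value per Pauli string
energy = sum(h * np.real(state.conj() @ pauli_string(p) @ state)
             for p, h in terms.items())
print(energy)                                   # -0.1
```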

The total energy is then obtained by summing the individually measured expectation values of all Pauli operators.39 Because we can measure each of these in parallel on a quantum computer, only a constant cost in time is added. The disadvantage of this approach, when compared to QPE, is the number of operations added for the required precision, which is $O(\epsilon^{-2})$ for VQE versus $O(\epsilon^{-1})$ for QPE. This scaling furthermore reflects the number of state preparations needed in VQE, which is constant in QPE.

The final part of the VQE algorithm is the update of the parameters $\vec{\theta}$, based on the measured expectation value of the Hamiltonian, using a non-linear optimisation routine. As the field of non-linear optimisation is well established, there are several different methods which can be applied in the VQE algorithm. In the original introduction of the VQE,31 the derivative-free Nelder-Mead optimisation algorithm was applied. However, different studies have shown that the Nelder-Mead method has trouble converging due to the large parameter space required.40,41 Several different optimisation procedures have been studied and applied to the VQE using classical35,41–44 and quantum computers41,45.

The accuracy of current experimental implementations of the VQE algorithm is affected by the fact that we work with NISQ devices. This is caused by the fluctuations in the measured properties due to noise and finite measurement statistics, and is currently still a big challenge for VQE applications. An advantage of gradient-free methods, such as the Nelder-Mead method, is that they better tolerate these fluctuations. However, as we have stated above, the Nelder-Mead method has trouble converging when the problem has many parameters to optimise.40 A method called the Simultaneous Perturbation Stochastic Approximation (SPSA),46,47 which includes numerical gradients computed in random directions, has recently been applied on a quantum computer.35 The SPSA method lends itself better to optimising a large number of parameters. However, because a numerical gradient is involved, the accuracy of this method depends strongly on the step size used for the gradient discretisation. To be more precise, the number of measurements required to estimate the gradient at a fixed precision scales quadratically with the inverse of the step size.41 When employing methods that include the numerical gradient, we thus have to take care to tune our hyper-parameters optimally. More effort is required to improve the optimisation procedure for the experimental application of the VQE algorithm.

After we have optimised the parameters using said optimisation routine, we repeat the Hamiltonian averaging procedure and optimisation routine until we have minimised the energy sufficiently.
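Putting the whole loop together, the sketch below (ours; a single-qubit toy Hamiltonian, with the quantum expectation value replaced by exact linear algebra) runs the VQE iteration with the gradient-free Nelder-Mead optimiser discussed above:

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-qubit Hamiltonian H = Z + 0.5*X; exact ground energy = -sqrt(1.25)
H = np.array([[1.0, 0.5], [0.5, -1.0]])

def ansatz(theta):
    # R_y(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def energy(params):
    # On hardware this number would come from Hamiltonian averaging
    psi = ansatz(params[0])
    return psi @ H @ psi

res = minimize(energy, x0=[0.1], method='Nelder-Mead')
print(res.fun, -np.sqrt(1.25))   # optimiser reaches the exact ground energy
```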


4 Energy Derivatives using Quantum Computation

We have discussed the algorithms that form the basis of quantum chemistry using quantum computation. There are many different applications of these algorithms in quantum chemistry. In this thesis, we would like to discuss one in particular: the calculation of derivatives of the molecular energy. Calculating energy derivatives has several different applications. For example, by calculating the derivative with respect to external electromagnetic fields, we can determine a molecule's electric properties. Furthermore, calculating the gradient of the molecular energy with respect to the nuclear coordinates is a very common way to determine potential energy surfaces and, as such, perform geometry optimisation procedures. Moreover, we can calculate higher-order derivatives to determine different properties; for example, by determining third- and fourth-order anharmonic constants, we can predict vibrational absorption spectra.37 In this section, we present the currently proposed alterations of the previously mentioned quantum algorithms to calculate energy derivatives and analyse the prospects for the near future.

4.1 Classical Methods

Before we dive into the quantum algorithms, it is helpful to present the classical methods that are used to calculate energy derivatives. If we have an external perturbation λ, we can expand our electronic energy using a Taylor series:

$$E(\lambda) = E^{(0)} + \lambda^\top E^{(1)} + \frac{1}{2} \lambda^\top E^{(2)} \lambda + \cdots. \qquad (4.1)$$

Here, the $E^{(n)}$ denote the molecular properties, which are the responses of the system to the concerned perturbation.48 Considering time-independent properties, we can define these properties by differentiating the energy at $\lambda = 0$:

$$E^{(n)} = \left. \frac{d^n E}{d\lambda^n} \right|_0. \qquad (4.2)$$

On a classical computer, there are two different ways of calculating energy derivatives, analytically and numerically, both of which are discussed here. When calculating a derivative numerically, the energy is calculated at several well-chosen points and numerical techniques are used to obtain an estimate of the derivative. The simplest numerical technique is the finite difference. In one dimension, the estimate of the first derivative is:

$$\left. \frac{dE}{d\lambda} \right|_0 \approx \frac{E(h) - E(0)}{h}, \qquad (4.3)$$

where $h$ is a small step size. Thus, to estimate the first derivative in one dimension, at least two evaluations of the energy are required. Similarly, we need $d + 1$ evaluations of the energy in $d$ dimensions. Moreover, if we require higher-order derivatives, we need at least $d^n + 1$ energy evaluations for the $n$th derivative.
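For illustration, a short sketch (ours, with a hypothetical toy energy surface in place of a real electronic structure calculation) of the forward-difference gradient of eq. 4.3, using exactly the $d + 1$ energy evaluations just described:

```python
import numpy as np

def fd_gradient(E, lam0, h=1e-5):
    """Forward-difference gradient of eq. 4.3: d + 1 energy evaluations."""
    E0, d = E(lam0), len(lam0)
    grad = np.empty(d)
    for i in range(d):
        lam = lam0.copy()
        lam[i] += h                    # one extra evaluation per dimension
        grad[i] = (E(lam) - E0) / h
    return grad

E = lambda lam: np.sum(lam**2) + lam[0] * lam[1]   # toy energy surface
print(fd_gradient(E, np.array([1.0, 2.0])))        # ~[4., 5.]
```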

One problem that arises with numerical gradient techniques is their susceptibility to numerical instability, especially when employing finite-precision arithmetic, where the sum of different small errors can be hugely amplified by division by a small number (in our case $h$). The accumulation of the many potential sources of error in the energy evaluation can therefore cause large errors in the computed derivative.

On the other hand, analytical derivative techniques, which directly compute the derivative by evaluating the analytical expression, do not suffer from this problem and have therefore largely replaced numerical techniques for derivative evaluation in quantum chemistry on classical computers. For most electronic structure techniques and perturbations, analytical gradient techniques exist.

To describe analytical derivatives of fully variational wavefunctions, we first write our molecular energy function as $E(\lambda, \theta(\lambda))$, where the energy is a function of the external perturbation and the wavefunction parameters $\theta(\lambda)$ (which we will henceforth write as $\theta$). These parameters, which are for example the CC coefficients, completely describe the wavefunction. The energy is fully variational only if, for any perturbation $\lambda$, $\theta$ assumes a value $\theta^*$ such that the following variational condition holds:

$$\left. \frac{\partial E(\lambda, \theta)}{\partial \theta} \right|_* = 0, \qquad (4.4)$$

where $*$ means that $\theta = \theta^*$, and we can thus write $E(\lambda) = E(\lambda, \theta^*)$.

We can now define the gradient of the energy for fully variational wavefunctions with respect to $\lambda$ as follows:

$$\frac{dE(\lambda)}{d\lambda} = \left. \frac{\partial E(\lambda, \theta)}{\partial \lambda} \right|_* + \left. \frac{\partial E(\lambda, \theta)}{\partial \theta} \right|_* \cdot \frac{\partial \theta}{\partial \lambda} = \left. \frac{\partial E(\lambda, \theta)}{\partial \lambda} \right|_* = \left\langle \theta^* \left| \frac{\partial H}{\partial \lambda} \right| \theta^* \right\rangle, \qquad (4.5)$$

where we have used the variational condition of eq. 4.4 in the first step and the Hellmann-Feynman theorem in the last step.

As we see, we do not have to calculate the first-order wavefunction response $\partial\theta/\partial\lambda$. The problem of calculating the derivative of the energy thus reduces to the calculation of the expectation value of the derivative of the Hamiltonian once we know $|\theta^*\rangle$, which is as difficult, up to a small constant factor, as calculating the expectation value of the Hamiltonian.49 When we wish to determine higher-order derivatives, however, we do need to determine the wavefunction response. This is a direct consequence of Wigner's $2m + 1$ rule, which states that we need to know the first $m$ wavefunction responses to calculate the $(2m + 1)$th-order derivative. The calculation of wavefunction responses is often difficult, which is one reason why calculating higher-order derivatives is more expensive. The cost of calculating the first-order derivative scales similarly to the cost of calculating the energy. In comparison, the cost of calculating higher-order derivatives scales as $O(d^{n/2})$, where $n$ denotes the order of the derivative and $d$ is the number of degrees of freedom, i.e. the dimension of $\lambda$. This is roughly a quadratic speed-up over common numerical methods, which scale as $O(d^n)$.

As we see, the computational cost of both numerical and analytical classical techniques depends on $d$. For some properties, this is not a problem; for example, if we wish to determine the derivative with respect to the electric field, $d = 3$, and classical techniques exist to determine these properties. However, if we wish to calculate derivatives with higher dimensions, e.g. with respect to the nuclear coordinates of large systems, classical methods fall short, since the computational cost becomes too high due to the dependence on the dimension of the perturbation.

4.2 Quantum Algorithms

Now that we have shown the basis of calculating energy derivatives using classical computation methods, we wish to look at three different proposed algorithms using quantum computers. All three algorithms determine the energy derivatives numerically.

4.2.1 Jordan’s Gradient Estimation Algorithm

The first quantum algorithm we discuss in this study is the algorithm proposed by Kassal et al. in 2009,1 which is based on Jordan's quantum gradient estimation algorithm.50 This is a numerical method which computes the gradient of a function $F$ using a black box that can calculate the value of $F$ for arbitrary inputs. This quantum algorithm can evaluate the gradient using merely a single query to $F$, in contrast to the minimum of $d + 1$ queries to $F$ required by classical numerical methods. This is achieved by sampling along all dimensions in superposition. In this section, we describe this quantum gradient estimation algorithm and how to apply it to quantum chemistry.

The molecular energy is assumed to be a smooth, bounded function of the perturbation, $E : [-h/2, h/2]^d \rightarrow \mathbb{R}$, where $h$ is sufficiently small such that $E$ varies slowly over the domain. The second assumption we make is that we have a black box that can, for a given perturbation $\theta$, output the energy $E(\theta)$. This might, for instance, be one of the algorithms we described in the previous section; however, the manner of the energy calculation does not matter for the quantum algorithm, as long as it can be executed on a quantum computer. The algorithm begins by choosing the number of bits of precision $L$ that we desire and then starts in an equal superposition over $d$ registers of $L$ qubits each (thus using $Ld$ qubits in total),39

$$\frac{1}{\sqrt{N^d}} \sum_{k_1=0}^{N-1} \cdots \sum_{k_d=0}^{N-1} |k_1\rangle \cdots |k_d\rangle = \frac{1}{\sqrt{N^d}} \sum_{\mathbf{k}} |\mathbf{k}\rangle, \qquad (4.6)$$

where $N = 2^L$, the $|k_i\rangle$ are integers represented by $L$ qubits in binary, and $|\mathbf{k}\rangle$ is a $d$-dimensional vector containing all $k_i$'s.

This input state goes into the above-mentioned black box. For every integer vector $|\mathbf{k}\rangle$ in the superposition, a phase proportional to the energy at perturbation $\theta = h(\mathbf{k} - \mathbf{N}/2)/N$ is appended, where $\mathbf{N}$ is the vector $(N, N, \ldots, N)$, subtracted to centre the sampling region on the origin. For maximum precision using the fewest qubits, an estimate $m$ of the largest value of any of the first derivatives of $E$ is needed. We then scale the energy evaluated by the black box by $2\pi N / hm$. The black box needs only one query to operate on all $|\mathbf{k}\rangle$ in the superposition at once, leaving us with the following state:

$$\frac{1}{\sqrt{N^d}} \sum_{\mathbf{k}} \exp\left[ \frac{2\pi i N}{hm}\, E\!\left( \frac{h}{N} (\mathbf{k} - \mathbf{N}/2) \right) \right] |\mathbf{k}\rangle \approx \frac{1}{\sqrt{N^d}} \sum_{\mathbf{k}} \exp\left[ \frac{2\pi i N}{hm} \left( E(0) + \frac{h}{N} (\mathbf{k} - \mathbf{N}/2) \cdot \left. \frac{dE}{d\lambda} \right|_0 \right) \right] |\mathbf{k}\rangle, \qquad (4.7)$$

where we have expanded $E$ into a Taylor series containing only the first derivative. If we choose $h$ to be sufficiently small, we can neglect the quadratic terms in $h$, as these only lead to a negligible error.50 We can then separate this final state as follows:

$$\frac{e^{i\Phi(0)}}{\sqrt{N^d}} \sum_{k_1=0}^{N-1} \exp\left[ \frac{2\pi i}{m} k_1 \left. \frac{\partial E}{\partial \lambda_1} \right|_0 \right] |k_1\rangle \cdots \sum_{k_d=0}^{N-1} \exp\left[ \frac{2\pi i}{m} k_d \left. \frac{\partial E}{\partial \lambda_d} \right|_0 \right] |k_d\rangle, \qquad (4.8)$$

where the phase is

$$\Phi(0) = 2\pi \left( \frac{N}{hm} E(0) - \frac{\mathbf{N}}{2m} \cdot \left. \frac{dE}{d\lambda} \right|_0 \right). \qquad (4.9)$$

By now applying the inverse Quantum Fourier Transform (see section 3.1.2) to each of the $d$ registers, we obtain the gradient:

$$e^{i\Phi(0)} \left| \frac{N}{m} \left. \frac{\partial E}{\partial \lambda_1} \right|_0 \right\rangle \cdots \left| \frac{N}{m} \left. \frac{\partial E}{\partial \lambda_d} \right|_0 \right\rangle = e^{i\Phi(0)} \left| \frac{N}{m} \left. \frac{dE}{d\lambda} \right|_0 \right\rangle. \qquad (4.10)$$


The scaling factor $N/m$ makes sure that the term $(N/m)(dE/d\lambda)$ is an integer vector with $L$ bits of precision in each dimension. We emphasise that in this algorithm only a single call to $E$ was made, compared to the $d + 1$ calls required in the classical case.

To sum up, the gradient estimation algorithm performs the following transformation:

$$|0\rangle \rightarrow e^{i\Phi(0)} \left| \frac{N}{m} \left. \frac{dE}{d\lambda} \right|_0 \right\rangle. \qquad (4.11)$$

If we wish to compute the Hessian and higher-order derivatives, we can iterate the algorithm. If we instead make the black box evaluate $E(\theta - \nu)$ rather than $E(\theta)$, we perform the following transformation:

If we wish to compute the Hessian and higher-order derivatives, we can iterate the algorithm. If we attempted to find E(θ − ν) from the black box instead of E(θ), we would perform the following transformation:

$$|0\rangle \rightarrow e^{i\Phi(\nu)} \left| \frac{N}{m} \left. \frac{dE}{d\lambda} \right|_\nu \right\rangle \qquad (4.12)$$

with the following global phase:

$$\Phi(\nu) = 2\pi \left( \frac{N}{hm} E(\nu) - \frac{\mathbf{N}}{2m} \cdot \left. \frac{dE}{d\lambda} \right|_\nu \right). \qquad (4.13)$$

The phase in eq. 4.13 becomes a relative phase, instead of a global phase, once the procedure is applied as a subroutine. To remove this relative phase, we have to evaluate $E$ at $\nu$. Overall, this defines a new black box, which yields $|(N/m)(dE/d\theta)|_\nu\rangle$ for a given $\nu$ by calling the original black box for $E$ twice.

We can thus produce the state:

$$e^{i\Phi^{(2)}(0)} \left| \frac{N}{m^{(2)}} \left. \frac{d^2 E}{d\lambda^2} \right|_0 \right\rangle. \qquad (4.14)$$

This state is a $2d$-array of $d^2$ quantum registers, containing all elements of the Hessian matrix of $E$; $m^{(2)}$ is an estimate of the largest second derivative, and the phase is as follows:

$$\Phi^{(2)}(0) = 2\pi \left( \frac{N}{hm^{(2)}} \left. \frac{dE}{d\lambda} \right|_0 - \frac{\mathbf{N}}{2m^{(2)}} \cdot \left. \frac{d^2 E}{d\lambda^2} \right|_0 \right). \qquad (4.15)$$

To compute higher-order derivatives, we require an additional factor of two black-box calls at each step to remove the global phase. To evaluate the $n$th derivative we thus need $2^{n-1}$ queries to the black box that evaluates $E$. This is still exponential in $n$, but is an improvement over the $d^n + 1$ queries needed by numerical gradient techniques on classical computers. We therefore see that the number of queries is independent of the dimension, and thus of the system size (even though the energy evaluation itself still is not).
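For $d = 1$ the whole procedure can be simulated classically. The sketch below (ours; the toy energy function and the choice $m = 4$ are assumptions) writes the black-box phases of eq. 4.7 into a state vector and applies numpy's FFT, which coincides with the inverse QFT up to normalisation; the most likely outcome encodes $(N/m)\, dE/d\lambda|_0$:

```python
import numpy as np

def jordan_gradient_1d(E, h=1e-3, m=4.0, L=8):
    """Simulate Jordan's gradient estimation for d = 1 (eqs. 4.6-4.10).
    m must upper-bound |dE/dlambda|."""
    N = 2 ** L
    k = np.arange(N)
    # Oracle: phase proportional to E at perturbation h*(k - N/2)/N (eq. 4.7)
    psi = np.exp(2j * np.pi * N / (h * m)
                 * E(h * (k - N / 2) / N)) / np.sqrt(N)
    j = np.argmax(np.abs(np.fft.fft(psi)))   # peak of the inverse QFT
    if j > N // 2:
        j -= N                               # unwrap negative derivatives
    return j * m / N

E = lambda x: 1.7 * x + 0.2 * x**2           # toy energy; E'(0) = 1.7
print(jordan_gradient_1d(E))                 # ~1.7 from one oracle query
```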

Just as their classical counterparts, these quantum numerical techniques are affected by numerical instability; that is, if there are errors in the energy evaluation query, these quickly accumulate into a faulty energy derivative. However, if large-scale fault-tolerant quantum computers become available, which is definitely not a certainty, it will, in principle, be possible to evaluate the molecular energy numerically exactly, so that numerical instability would no longer be a problem.

To summarise, I. Kassal et al. have shown that Jordan's quantum gradient estimation algorithm can be applied to evaluate derivatives of the energy numerically. The algorithm requires a black box that evaluates the energy of an electronic structure problem, for which quantum algorithms are already known. The quantum algorithm provides a speed-up over classical analytical derivative techniques, which scale as $O(d^{n/2})$, reducing the query complexity to $2^{n-1}$. The complexity of this algorithm thus depends on the complexity of the energy evaluation method (the black box), of which the most difficult part is the preparation of the groundstate wavefunction. It has already been shown that finding the groundstate is QMA-complete.24 There have, however, been promising studies showing that this problem can be solved numerically in polynomial time using adiabatic state preparation.51 The most promising feature of the gradient estimation algorithm is that its complexity is independent of the dimension $d$, giving an advantage when calculating derivatives of larger systems. For example, calculating the Hessian would require only twice as many resources as calculating the molecular gradient.

4.3 The PPE and ETA algorithms

In 2019, O'Brien et al. published a paper on calculating energy derivatives using a quantum computer,2 providing two new quantum algorithms for evaluating energy derivatives, which we present and discuss in this section.

4.3.1 Basis of algorithms

Before we go into more detail about the two newly proposed quantum algorithms, let us first take a closer look at the Hellmann-Feynman theorem and how it can help us compute first-order derivatives on a quantum computer. The Hellmann-Feynman theorem states that the first-order derivative of the energy with respect to a perturbation $\lambda$ is:1,52

$$\frac{\partial E_0}{\partial\lambda} = \left\langle \Psi_0 \left| \frac{\partial\hat{H}}{\partial\lambda} \right| \Psi_0 \right\rangle \qquad (4.16)$$

Using the second-quantized form of the Hamiltonian, we can split our Hamiltonian into a sum of one-electron ($h_{ij}$) and two-electron ($V_{ijkl}$) integrals (see section 1.1.3). We can write our one-electron integrals in the MO basis as:
$$h_{ij} = \sum_{\mu\nu}^{AO} c_{\mu i}\, c_{\nu j}\, h_{\mu\nu} \qquad (4.17)$$

and our two-electron integrals as:
$$V_{ijkl} = (ij|kl) = \sum_{\mu\nu\rho\tau}^{AO} c_{\mu i}\, c_{\nu j}\, c_{\rho k}\, c_{\tau l}\,(\mu\nu|\rho\tau) \qquad (4.18)$$

We can obtain the first-order derivative of $\hat{H}$ by differentiating these integrals with respect to some perturbation $\lambda$, giving:

$$\frac{\partial h_{ij}}{\partial\lambda} = \sum_{\mu\nu}^{AO}\left(\frac{\partial c_{\mu i}}{\partial\lambda}\, c_{\nu j}\, h_{\mu\nu} + c_{\mu i}\,\frac{\partial c_{\nu j}}{\partial\lambda}\, h_{\mu\nu} + c_{\mu i}\, c_{\nu j}\,\frac{\partial h_{\mu\nu}}{\partial\lambda}\right) \qquad (4.19)$$
and
$$\begin{aligned}\frac{\partial g_{ijkl}}{\partial\lambda} = \sum_{\mu\nu\rho\tau}^{AO}\Bigg(&\frac{\partial c_{\mu i}}{\partial\lambda}\, c_{\nu j}\, c_{\rho k}\, c_{\tau l}\,(\mu\nu|\rho\tau) + c_{\mu i}\,\frac{\partial c_{\nu j}}{\partial\lambda}\, c_{\rho k}\, c_{\tau l}\,(\mu\nu|\rho\tau) + c_{\mu i}\, c_{\nu j}\,\frac{\partial c_{\rho k}}{\partial\lambda}\, c_{\tau l}\,(\mu\nu|\rho\tau) \\ &+ c_{\mu i}\, c_{\nu j}\, c_{\rho k}\,\frac{\partial c_{\tau l}}{\partial\lambda}\,(\mu\nu|\rho\tau) + c_{\mu i}\, c_{\nu j}\, c_{\rho k}\, c_{\tau l}\,\frac{\partial(\mu\nu|\rho\tau)}{\partial\lambda}\Bigg) \qquad (4.20)\end{aligned}$$
where
$$\frac{\partial c_{\mu i}}{\partial\lambda} = \sum_{m}^{MO} U^{(1)}_{mi}\, c_{\mu m}. \qquad (4.21)$$

This newly introduced matrix $U^{(1)}(\lambda)$ contains the first-order changes of the MO coefficients. We can solve for this matrix using the coupled-perturbed Hartree-Fock equations.53–55
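As an illustration of how eqs. 4.19 and 4.21 combine in practice, the sketch below assembles the MO derivative integrals with numpy contractions. All arrays are random placeholders standing in for quantities that would come from a converged (coupled-perturbed) Hartree-Fock calculation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                            # number of basis functions (toy size)
c = rng.normal(size=(n, n))      # MO coefficients c_{mu i}
U1 = rng.normal(size=(n, n))     # response matrix U^(1)_{mi} from CPHF
h_ao = rng.normal(size=(n, n))   # AO integrals h_{mu nu}
dh_ao = rng.normal(size=(n, n))  # AO derivative integrals dh_{mu nu}/dlambda

# eq. 4.21: dc_{mu i}/dlambda = sum_m U^(1)_{mi} c_{mu m}
dc = c @ U1

# eq. 4.19: the three terms of the MO one-electron derivative integrals
dh_mo = (np.einsum('mi,nj,mn->ij', dc, c, h_ao)
         + np.einsum('mi,nj,mn->ij', c, dc, h_ao)
         + np.einsum('mi,nj,mn->ij', c, c, dh_ao))
print(dh_mo.shape)  # (4, 4)
```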

The last terms of eqs. 4.19 and 4.20 contain the derivatives of the one- and two-electron integrals in the AO basis, which are given by:

$$\frac{\partial h_{\mu\nu}}{\partial\lambda} = \left\langle\frac{\partial\chi_\mu}{\partial\lambda}\right|\hat{H}\left|\chi_\nu\right\rangle + \left\langle\chi_\mu\right|\hat{H}\left|\frac{\partial\chi_\nu}{\partial\lambda}\right\rangle + \left\langle\chi_\mu\left|\frac{\partial\hat{H}}{\partial\lambda}\right|\chi_\nu\right\rangle \qquad (4.22)$$
and
$$\frac{\partial(\mu\nu|\rho\tau)}{\partial\lambda} = \left(\frac{\partial\chi_\mu}{\partial\lambda}\chi_\nu\middle|\chi_\rho\chi_\tau\right) + \left(\chi_\mu\frac{\partial\chi_\nu}{\partial\lambda}\middle|\chi_\rho\chi_\tau\right) + \left(\chi_\mu\chi_\nu\middle|\frac{\partial\chi_\rho}{\partial\lambda}\chi_\tau\right) + \left(\chi_\mu\chi_\nu\middle|\chi_\rho\frac{\partial\chi_\tau}{\partial\lambda}\right), \qquad (4.23)$$


respectively. The first-order derivative of the Hamiltonian then becomes:
$$\frac{\partial\hat{H}}{\partial\lambda} = \sum_{pq}\frac{\partial h_{pq}}{\partial\lambda}\,\hat{E}_{pq} + \frac{1}{2}\sum_{pqrs}\frac{\partial g_{pqrs}}{\partial\lambda}\left(\hat{E}_{pq}\hat{E}_{rs} - \delta_{qr}\hat{E}_{ps}\right), \qquad (4.24)$$

where $\hat{E}_{pq} = \hat{a}_p^\dagger \hat{a}_q$ are excitation operators. This can be mapped to a quantum computer using any of the previously mentioned mapping methods (see section 1.3). If we then wish to determine the first-order derivative of the energy of eq. 4.16, we can do the following:
$$\frac{\partial E_0}{\partial\lambda} = \left\langle\Psi_0\left|\frac{\partial\hat{H}}{\partial\lambda}\right|\Psi_0\right\rangle \;\to\; \sum_i \frac{\partial h_i}{\partial\lambda}\left\langle\Psi_0\left|\hat{P}_i\right|\Psi_0\right\rangle \qquad (4.25)$$

where we can determine $\partial h_i/\partial\lambda$ classically, as described above, and $\langle\Psi_0|\hat{P}_i|\Psi_0\rangle$ using some quantum algorithm, such as VQE; here $\hat{P}_i$ denotes a string of Pauli operators. The great advantage of this approach is that the expectation values on the right-hand side of eq. 4.25 only have to be measured once, as they are the same for each term $\partial h_i/\partial\lambda$; and because these terms can be computed quickly on a classical machine, this is a very suitable method for evaluating first-order derivatives of the Hamiltonian on a quantum computer.
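The structure of eq. 4.25 is simply a dot product between a classical and a quantum vector of numbers, as the following sketch shows. The coefficients and the "measured" expectation values below are invented placeholders; in practice the latter would come from a VQE-style measurement routine.

```python
import numpy as np

# Classical input: derivatives dh_i/dlambda of the Hamiltonian coefficients
# for each Pauli string P_i (placeholder values for a 3-term Hamiltonian).
dh_dlambda = np.array([0.12, -0.05, 0.03])

# Quantum input: expectation values <Psi_0|P_i|Psi_0>, measured once and
# reusable for every perturbation lambda (placeholder values).
pauli_expectations = np.array([0.98, -0.31, 0.07])

# eq. 4.25: the first-order energy derivative
dE_dlambda = dh_dlambda @ pauli_expectations
print(dE_dlambda)
```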

Next, we look at higher-order derivatives, which are unfortunately far more troublesome to solve. We can write a second-order energy derivative as:

$$\frac{\partial^2 E_0}{\partial\lambda_1\partial\lambda_2} = \left\langle\Psi_0\left|\frac{\partial^2\hat{H}}{\partial\lambda_1\partial\lambda_2}\right|\Psi_0\right\rangle + \sum_{j\neq 0} 2\,\mathrm{Re}\!\left[\left\langle\Psi_0\left|\frac{\partial\hat{H}}{\partial\lambda_1}\right|\Psi_j\right\rangle\left\langle\Psi_j\left|\frac{\partial\hat{H}}{\partial\lambda_2}\right|\Psi_0\right\rangle\right]\frac{1}{E_0 - E_j} \qquad (4.26)$$

where we define our amplitudes as:
$$A_1(j) = \left\langle\Psi_0\left|\frac{\partial\hat{H}}{\partial\lambda_1}\right|\Psi_j\right\rangle\left\langle\Psi_j\left|\frac{\partial\hat{H}}{\partial\lambda_2}\right|\Psi_0\right\rangle \qquad (4.27)$$
$$A_2 = \left\langle\Psi_0\left|\frac{\partial^2\hat{H}}{\partial\lambda_1\partial\lambda_2}\right|\Psi_0\right\rangle, \qquad (4.28)$$

with the following energy coefficients:
$$f_1(E_0; E_j) = \frac{2\,(1-\delta_{j,0})}{E_0 - E_j}. \qquad (4.29)$$
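Before generalising, it is instructive to evaluate eq. 4.26 for a toy three-level system. All matrix elements and energies below are made up; the response term is exactly the sum over states weighted by $f_1$.

```python
import numpy as np

E = np.array([-1.0, -0.3, 0.5])      # eigenenergies E_j (toy values)
dH1 = np.array([0.02, 0.11, -0.04])  # <Psi_0| dH/dlambda_1 |Psi_j> (toy, real)
dH2 = np.array([0.02, -0.07, 0.09])  # <Psi_j| dH/dlambda_2 |Psi_0> (toy, real)
d2H = 0.015                          # <Psi_0| d^2H/dlambda_1 dlambda_2 |Psi_0>

# response term of eq. 4.26: sum over j != 0 of 2 Re[A_1(j)] / (E_0 - E_j)
response = sum(2 * (dH1[j] * dH2[j]) / (E[0] - E[j]) for j in range(1, 3))
d2E = d2H + response
print(d2E)
```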

In this manner, we can define a general equation for the $n$th-order derivative:
$$D = \sum_A \;\sum_{j_1,\ldots,j_{X_A-1}} \mathrm{Re}\!\left[A(j_1,\ldots,j_{X_A-1})\right] \times f_A\!\left(E_0; E_{j_1},\ldots,E_{j_{X_A-1}}\right), \qquad (4.30)$$

where $X_A$ denotes the number of excitations, which differs from the number of linear responses. $X_A$ thus does not follow Wigner's $2m + 1$ rule, which was described in the previous section. Instead, to determine an $n$th-order derivative, we know that $X_A \leq n$, where the amplitudes take the following form:
$$A(j_1,\ldots,j_{X_A-1}) = \left\langle\Psi_0\left|\hat{P}_{X_A}\prod_{x=1}^{X_A-1}\Big(|\Psi_{j_x}\rangle\langle\Psi_{j_x}|\,\hat{P}_x\Big)\right|\Psi_0\right\rangle. \qquad (4.31)$$

We can estimate these amplitudes, together with their corresponding energies $E_{j_x}$, by applying sequential rounds of QPE. After calculating the energy coefficients $f_A$ classically, one can evaluate eq. 4.30. This is the first algorithm described by O'Brien et al., called the 'propagator and phase estimation' (PPE) algorithm.2 We will go into more detail on this algorithm in section 4.3.3. The second algorithm that was proposed is the 'eigenstate truncation approximation' (ETA), which we explain in section 4.3.4.

To explain the Eigenstate Truncation Approximation, we first introduce the quantum subspace expansion (QSE) method.56 Using this expansion, we generate a set of $N_E$ vectors $|\chi_j\rangle$, defined by applying $N_E$ linear excitation operators $\hat{E}_j$ to the groundstate:
$$|\chi_j\rangle = \hat{E}_j|\Psi_0\rangle. \qquad (4.32)$$

Because the groundstate is not easily accessible with classical methods, neither is the new space spanned by the vectors $|\chi_j\rangle$. These states are thus not necessarily orthonormal, and the overlap matrix (eq. 4.33) is not necessarily the identity:
$$S^{(QSE)}_{j,k} = \langle\chi_j|\chi_k\rangle \qquad (4.33)$$

We now wish to create a set of orthonormal approximate eigenstates $|\tilde{\Psi}_j\rangle$. We can form these by calculating the projected Hamiltonian matrix:
$$H^{(QSE)}_{j,k} = \left\langle\chi_j\middle|\hat{H}\middle|\chi_k\right\rangle \qquad (4.34)$$
and then solving the generalised eigenvalue problem:
$$\hat{H}^{(QSE)}\,\vec{v}^{(j)} = \tilde{E}_j\,\hat{S}^{(QSE)}\,\vec{v}^{(j)} \quad\to\quad \left|\tilde{\Psi}_j\right\rangle = \sum_l \vec{v}^{(j)}_l\,|\chi_l\rangle. \qquad (4.35)$$
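Once $H^{(QSE)}$ and $S^{(QSE)}$ have been measured on the quantum device, the remaining work in eq. 4.35 is an entirely classical generalised eigenvalue problem; the sketch below solves it with scipy for invented 2x2 matrices.

```python
import numpy as np
from scipy.linalg import eigh

# Placeholder QSE matrices for two expansion vectors |chi_0>, |chi_1>
# (eqs. 4.33 and 4.34); real values would be measured on the device.
H_qse = np.array([[-1.10, -0.02],
                  [-0.02, -0.45]])
S_qse = np.array([[1.00, 0.10],
                  [0.10, 1.00]])

# eq. 4.35: H v = E~ S v; scipy handles the non-identity overlap directly.
energies, v = eigh(H_qse, S_qse)
print(energies)   # approximate eigenenergies E~_j
print(v[:, 0])    # expansion coefficients of |Psi~_0> in the |chi_l> basis
```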
