
Linear Quantum Entropy Inequalities beyond Strong Subadditivity and their Applications


Academic year: 2021



Master Physics & Astronomy

Theoretical Physics

Master thesis

Linear Quantum Entropy Inequalities beyond Strong

Subadditivity and their Applications

by

Laurens Ligthart

10709932

July 2020,

60 EC,

Conducted between September 2019 and July 2020

Supervisor: Dr. Christian Majenz

Supervisor & Examiner: Dr. Michael Walter

Second Examiner: Dr. Māris Ozols


Abstract

Entropy and entropy inequalities have been central topics in the field of (quantum) information theory since the field's inception. Despite the enormous progress that has been made in our understanding of these quantities, there is still plenty to be discovered. In this thesis we aim to contribute to this interesting field of study by looking into new types of entropy inequalities and their applications.

The thesis consists of two main parts. In the first part we discuss quantum entropy inequalities for general quantum states. In particular, we review the von Neumann inequalities and prove a Zhang-Yeung entropy inequality for a particular type of classical-quantum states, called ccqq states. To prove this, the copy lemma is adapted and used. It is also shown that the copy lemma cannot hold for any other type of classical-quantum state.

We have investigated two topics from classical information theory in which the non-Shannon Zhang-Yeung inequality is known to play a role. In Overhead constrained Private Information Retrieval (OPIR) the goal is for a user to retrieve a message out of a set of messages from databases, without the databases gaining knowledge about which message is retrieved. In this setup the databases have a limited amount of storage space. Classically, it is known that a Zhang-Yeung inequality is important for the upper bound on the capacity of a specific OPIR scheme. We define a quantum OPIR setup and show that our new ccqq Zhang-Yeung inequality is not relevant in this setup. Secondly, causal structures have been investigated. These Directed Acyclic Graphs (DAGs) denote correlation and causation between random variables. One is interested in the entropy inequalities that govern the observable variables in the causal structure. For certain causal structures, among which the triangle scenario, it has been shown that the classical Zhang-Yeung inequality is relevant in the marginal scenario of the observable variables in the structure [13, 14]. We show that the new ccqq Zhang-Yeung inequality is not relevant in the triangle scenario, or in any other causal structure for which we could perform a Fourier-Motzkin elimination. For those causal structures where the Fourier-Motzkin elimination was computationally too hard, Farkas' lemma was used. This again suggested that the ccqq Zhang-Yeung inequality is not relevant in the tested causal structures, but gave no definitive proof. It is an interesting open question whether the ccqq Zhang-Yeung inequality is ever relevant in causal structures, and to find a proof if this is the case.

In the second part of the thesis we discuss entropy inequalities that are specific to Gaussian quantum states. Gaussian states are the thermal and ground states of quadratic Hamiltonians. They are conveniently described by their first and second moments, also known as the displacement vector and covariance matrix, respectively. We use a heat flow argument and a connection with the Fisher information, based on the approach by Berta et al. [16], to prove a geometric Brascamp-Lieb type inequality for these Gaussian quantum states. This inequality is a generalisation of strong subadditivity and also generalises the result of Berta et al. by including conditioning. We prove a similar generalised subadditivity inequality for the Rényi-2 entropy of Gaussian states, but can there extend our results to the non-geometric case. The second part of the thesis is concluded with a discussion of the Markov property for Gaussian quantum states. It is shown that this property cannot be described in terms of the covariance matrix in a similar way as in the classical case. Instead, we give a set of covariance matrices that do obey the Markov property. It is unknown whether this set is complete.


Samenvatting

Entropy and entropy inequalities have been a central topic in (quantum) information theory since the field's inception. Despite the fact that much progress has been made, plenty remains to be investigated. In this thesis I try to contribute to this interesting branch of quantum information theory by looking into new types of inequalities.

This thesis consists of two parts. In the first part, quantum entropy inequalities for general quantum states are discussed. The von Neumann inequalities are reviewed and a Zhang-Yeung-type inequality is proven for a special kind of classical-quantum states, called ccqq states. To prove this, an adapted version of the copy lemma is used. It is also shown that the copy lemma does not hold for any other kind of classical-quantum state.

Two topics from classical information theory in which the Zhang-Yeung inequality is known to play a role have been investigated. In "Overhead constrained Private Information Retrieval" (OPIR), the goal is for a user to receive one message out of a set of messages from a number of databases, without the databases finding out which message was received. In the OPIR setup, the databases have a limited amount of storage space. In the classical case it is known that the Zhang-Yeung inequality matters for determining an upper bound on the capacity of a particular OPIR scheme. We define a quantum OPIR setup and show that our new ccqq Zhang-Yeung inequality is not relevant in this setup. Secondly, we have investigated causal structures. These Directed Acyclic Graphs (DAGs) indicate correlations and causation between variables. It is interesting to know which entropy inequalities determine the values that the entropies of the observable variables can take. For certain causal structures, among which the triangle scenario, it has been shown that the classical Zhang-Yeung inequality is relevant in the marginal distribution of the observable variables in the structure. We have shown that the new ccqq Zhang-Yeung inequality is not relevant in the triangle scenario, nor in any other causal structure for which we could apply Fourier-Motzkin elimination. For the causal structures for which the Fourier-Motzkin algorithm was computationally too demanding, we used Farkas' lemma. This lemma again indicated that the ccqq Zhang-Yeung inequality is not relevant in the tested causal structures, but gave no conclusive proof. It is an interesting open question whether the ccqq Zhang-Yeung inequality is ever relevant in causal structures and, if so, to find a proof.

The second part of this thesis treats entropy inequalities that hold specifically for Gaussian quantum states. Gaussian quantum states are the thermal and ground states of quadratic Hamiltonians. They can conveniently be expressed in terms of their first and second moments, also known as the displacement vector and covariance matrix. We use heat flow and a connection with the Fisher information to prove a geometric Brascamp-Lieb inequality for these Gaussian quantum states. This inequality is a generalisation of strong subadditivity. We prove a similar inequality for the subadditivity of the Rényi-2 entropy of Gaussian states, but can there extend the results to the non-geometric case. The second part of the thesis is concluded with a discussion of the Markov property for Gaussian states. It is shown that this property cannot be described in terms of the covariance matrix in the same way as in the classical case. Instead, we give a set of covariance matrices that do satisfy the Markov condition. It is not known whether this set is complete.


Contents

Abstract

Samenvatting

1 Introduction
  1.1 Motivation and goal
  1.2 Outline of the thesis

2 Quantum Information Theory
  2.1 Finite dimensional quantum states
    2.1.1 Purification and Schmidt decomposition
    2.1.2 Separability & entanglement
  2.2 Continuous quantum systems
    2.2.1 The uncertainty principle
  2.3 Quantum channels
    2.3.1 Representations of quantum channels
    2.3.2 Measurements
  2.4 Entropy
    2.4.1 Shannon entropy
    2.4.2 Classical entropy inequalities
    2.4.3 Von Neumann entropy
    2.4.4 Quantum entropy inequalities
    2.4.5 Differential entropy
  2.5 The Markov property

3 Entropy Cones
  3.1 Cones and entropy vectors
    3.1.1 Shannon and von Neumann cones
  3.2 Beyond Shannon inequalities
  3.3 Quantum Zhang-Yeung inequalities
    3.3.1 Proof of the Zhang-Yeung inequality for ccqq states
    3.3.2 The Zhang-Yeung inequality for other states
    3.3.3 Infinite families of inequalities
  3.4 Numerical analysis of Zhang-Yeung inequalities

4 Overhead constrained Private Information Retrieval
  4.1 OPIR scheme setup
  4.2 OPIR capacity proof
  4.3 Quantum OPIR
    4.3.1 QOPIR setup
    4.3.2 Alternative QOPIR setups
  4.4 Zhang-Yeung in QOPIR

5 Causal Structures
  5.1 Classical and quantum causal structures
  5.2 Previous results for small causal structures
  5.3 ccqq Zhang-Yeung inequalities in the (extended) triangle scenario
    5.3.1 Fourier-Motzkin elimination
    5.3.2 Farkas' lemma

6 Gaussian quantum states
  6.1 Phase space formalism
    6.1.1 The symplectic group
    6.1.2 Canonical commutation relations
    6.1.3 Coherent states
    6.1.4 Characteristic functions
    6.1.5 Wigner function
  6.2 Representations of Gaussian states
    6.2.1 Fock space representation
    6.2.2 Phase space representations
    6.2.3 The uncertainty principle
  6.3 Gaussian maps
    6.3.1 General Gaussian CP maps and their dual
  6.4 Gaussian states and maps for general subspaces

7 Brascamp-Lieb inequalities for Gaussian states
  7.1 Brascamp-Lieb inequalities
  7.2 Brascamp-Lieb inequalities for the Rényi-2 entropy
    7.2.1 Rényi entropy
    7.2.2 Geometric and non-geometric Brascamp-Lieb inequalities
  7.3 The Markov property for Gaussian states
    7.3.1 Probability densities
    7.3.2 Gaussian quantum states

8 Summary and open questions

Acknowledgements

A Useful relations between entropy quantities

B Rényi-2 Brascamp-Lieb inequality heat flow proof

Index


1 Introduction

1.1 Motivation and goal

For the past century, quantum mechanics has revolutionised our understanding of nature on the smallest scales. This year marks the 99th anniversary of Einstein's Nobel prize for the photoelectric effect. The confirmation of this theory was one of the first signs that certain phenomena of nature that were previously thought to be continuous in fact come in quanta. Moving forward to the present, we are now able to use these quantum properties to our advantage, for example to improve our computational power, to both break and strengthen our cryptographic schemes, or to teach us more about the properties of larger quantum systems.

Recently the first instance of quantum supremacy was accomplished by the Google AI Quantum team [1]. This shows that quantum computing can be a powerful tool for certain specific problems. As another example of an improvement to our computational power we have the famous Grover search algorithm, which gives a quadratic speedup for finding solutions to a search problem [2, 3].

In cryptography, Shor’s algorithm is able to solve the prime factorisation and discrete logarithm problem in polynomial time with the help of quantum computers [4]. Once quantum computers become powerful enough, this will break many cryptographic schemes that are currently considered safe, because they rely on the task of prime factorisation, which is considered hard for classical computers [5]. On the other hand, even before Shor published his algorithm, Bennett and Brassard came up with a secure scheme to distribute keys in cryptographic setups with the help of quantum mechanics [6], which would help make our communication more secure.

Additionally, we can also use quantum computing and quantum simulation as tools to better understand quantum theory itself, for example in condensed matter systems, particle physics and possibly even cosmology [7, 8, 9].

This thesis discusses the notion of entropy for multi-partite quantum systems and the inequalities that govern these entropies. The relevance of such inequalities is clear from many examples in (quantum) information theory. For example, the optimal rate of communication between two parties is conveniently expressed in terms of the Shannon entropy classically and the von Neumann entropy in the quantum case. Furthermore, the amount of shared information between parties is given by the mutual information, a quantity that is also most easily expressed in terms of entropies. The study of information theory therefore depends heavily on entropic quantities.

The goal of this thesis is to describe these entropy inequalities for quantum systems and to find additional relevant inequalities and their applications. For systems of up to three parties, the most general linear inequalities for the von Neumann entropy are known and have been proven to be tight [10]. However, once one looks at four-party quantum systems, it is unclear whether there are additional inequalities that hold for all quantum states besides those that are known for three or fewer parties. Classically, it has been shown by Zhang and Yeung [11] that there exist relevant additional inequalities that all classical probability distributions of four parties or more have to obey. The proof of this statement uses the so-called copy lemma. The name of this lemma already suggests that the same trick might not work for quantum systems, and indeed in general it does not. However, in this thesis it is shown that these Zhang-Yeung inequalities do hold for a certain subset of quantum states, known as ccqq states. These inequalities go beyond what are normally called the Shannon-type and von Neumann-type inequalities and are not yet well understood.

Classically, the Zhang-Yeung inequalities have some scenarios in which they are of relevance. We will discuss two of these scenarios and consider their quantum equivalents. The first is known as Private Information Retrieval (PIR) in which the goal is for a user to retrieve a message from a number of databases, without the databases learning what message was retrieved. In a special case of the PIR scenario, called OPIR, the classical capacity is limited by the Zhang-Yeung inequality [12].

The second application discussed is that of certain causal structures. These are directed acyclic graphs (DAGs) that depict correlation and causation between random variables or quantum systems. In the theory of causal structures, one is interested in the probability distribution and entropy of the observable variables in the structure. The entropy inequalities that normally hold for the observable systems are then supplemented by the inequalities that are implied by the causal structure. It has been shown that there are causal structures in which the Zhang-Yeung inequality further restricts the possible entropy assignments [13, 14].

Apart from the Zhang-Yeung inequalities, there are other relevant inequalities to investigate. An example of such an inequality is an entropic Brascamp-Lieb inequality. These inequalities have a dual description: on the one hand they can be expressed in terms of the Shannon entropy classically and the von Neumann entropy quantumly; on the other hand, there is an analytical description that does not use the notion of entropy at all, but is stated in integral form as a generalisation to Young’s inequality [15, 16]. The entropic Brascamp-Lieb inequalities are also not categorised as Shannon-type or von Neumann-type inequalities and are therefore novel and of interest.

In the special case of Gaussian quantum states, it is possible to prove certain entropic Brascamp-Lieb inequalities. Some of these were previously found by Berta et al. [16], but others are proven in this work. In particular, we show a generalisation of strong subadditivity by proving a result for geometric Brascamp-Lieb inequalities. In order to do this, we introduce the Gaussian formalism, which uses the symplectic group and phase space methods. Gaussian states are ubiquitous in the theory of optics and optical quantum computing [17]. The entropy inequalities are especially relevant for optical quantum computers: optical channels are currently one of the promising directions for the development of quantum computing and the quantum internet [18], the latter of which is being developed by several universities in the Netherlands.

1.2 Outline of the thesis

The information theoretic quantities and theories that are necessary as prerequisites to this thesis are discussed in chapter 2. By introducing these quantities and theorems, the notation that is used throughout the thesis is established.

Afterwards, in chapter 3, the conic description of entropy inequalities is discussed. This chapter also treats the classical Zhang-Yeung inequality, a proof for the copy lemma and Zhang-Yeung inequality for ccqq states, and a proof for why the copy lemma will not work for any other types of cq states. A numerical analysis of the Zhang-Yeung inequalities is also briefly discussed.

Chapters 4 and 5 discuss applications of the Zhang-Yeung inequality in certain information theoretic schemes. In Overhead constrained Private Information Retrieval (OPIR), the task of secretly retrieving information from databases is limited by the amount of data such a database can store. It turns out that, classically, the Zhang-Yeung inequality is relevant for the capacity of this task. Chapter 4 reviews why this is the case and argues why the Zhang-Yeung inequality for ccqq states is not relevant for this example.

In the theory of causal structures, the goal is to be able to exclude certain causal relations between random variables or quantum systems. This is done by investigating the probability dis-tributions and entropies that are permitted by the causal structures and comparing them with the obtained data. The classical Zhang-Yeung inequality is relevant in this scenario for certain causal structures where there are enough correlated random variables. Chapter 5 investigates whether there are causal structures for which the ccqq Zhang-Yeung inequality is relevant. We will see that the task of determining this becomes computationally infeasible very quickly. In order to make a qualitative statement about the relevance of the ccqq Zhang-Yeung inequality in some structures, Farkas’ lemma is employed. This concludes the first part of the thesis about Zhang-Yeung type inequalities.

Chapter 6 discusses the theory of Gaussian quantum states, which are the ground and thermal states of quadratic Hamiltonians. The phase space formalism is introduced and an equation for the von Neumann entropy of Gaussian states is rederived. Furthermore, the action of Gaussian maps on Gaussian quantum states is treated.

After the introduction of the Gaussian formalism, chapter 7 treats Brascamp-Lieb inequalities for these Gaussian states. Using heat flow, under which the system evolves according to the diffusion Liouvillian, a generalised strong subadditivity inequality is derived for the von Neumann entropy from geometric Brascamp-Lieb inequalities. Following this, a similar Brascamp-Lieb inequality is derived for the Rényi-2 entropy of Gaussian states. To achieve this, an earlier result from Gross and Walter [19] is used that relates the Rényi-2 entropy to the classical differential entropy. Using this link with the differential entropy, a non-geometric type of Brascamp-Lieb inequality is derived with some restrictions. The Markov property for covariance matrices is also briefly discussed.

The thesis concludes with a summary of the results and a discussion. In this discussion, some questions are stated that were left unanswered but are still considered to be of relevance.


2 Quantum Information Theory

This chapter discusses the basic notions of quantum information theory. We start by introducing the notation through the definitions of states and the Hilbert spaces they live in. After that, we define quantum entropy functions and several of their properties, and discuss a number of useful lemmas and theorems.

2.1 Finite dimensional quantum states

Finite dimensional quantum states will be described as density operators on a Hilbert space. A Hilbert space is a vector space over the real or complex numbers that is equipped with an inner product and is a complete metric space with respect to the distance measure induced by the inner product. The collection of density operators on such a Hilbert space $\mathcal{X}$ is given by
\[
\mathcal{D}(\mathcal{X}) = \{\rho \in \operatorname{Pos}(\mathcal{X}) : \operatorname{Tr}\rho = 1\}, \tag{2.1}
\]
where $\operatorname{Pos}(\mathcal{X})$ is the set of positive semidefinite operators on $\mathcal{X}$. Let $\Gamma$ be a finite set called an alphabet, such that $\mathcal{X} = \mathbb{C}^{\Gamma}$ with an orthonormal basis given by $\{|x\rangle\}_{x \in \Gamma}$.

A quantum state $\rho$ is said to be pure if it can be written as $\rho = |\psi\rangle\langle\psi|$, with $|\psi\rangle$ a complex vector of dimension $\dim\mathcal{X}$ in Dirac's bra-ket notation and $\langle\psi| = |\psi\rangle^*$ its adjoint. The adjoint is also often denoted by a $\dagger$ symbol, i.e. $\langle\psi| = |\psi\rangle^\dagger$. A pure state $\rho$ is thus a rank-1 density operator. If a quantum state can be written in this way, the vector $|\psi\rangle$ is also often called a state. A state that is not pure is said to be a mixed quantum state and can be written as a convex combination of pure states:
\[
\rho = \sum_j p(j)\, |\psi_j\rangle\langle\psi_j|, \tag{2.2}
\]
where the $p(j)$'s form a probability distribution.

Whether a state $\rho$ is pure or not can easily be checked by calculating the purity $\operatorname{Tr}\rho^2$. This quantity equals 1 if and only if the state is pure: writing $\rho$ in its spectral decomposition, $\operatorname{Tr}\rho^2 = \sum_j p(j)^2 \le \sum_j p(j) = 1$, with equality if and only if a single $p(j)$ equals 1.
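As a quick numerical illustration of the purity criterion (a NumPy sketch; the states below are arbitrary examples, not taken from the thesis):

```python
import numpy as np

def purity(rho):
    """Tr(rho^2), which equals 1 if and only if rho is pure."""
    return np.trace(rho @ rho).real

# A pure state rho = |psi><psi| for a normalised |psi>.
psi = np.array([1.0, 1.0j]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())

# A mixed state: an equal convex combination with |0><0|.
zero = np.array([1.0, 0.0])
rho_mixed = 0.5 * rho_pure + 0.5 * np.outer(zero, zero)

print(purity(rho_pure))   # ~1.0
print(purity(rho_mixed))  # ~0.75 < 1
```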

The maximally mixed state is given by
\[
\rho = \frac{I}{d}, \tag{2.3}
\]
with $d = \dim\mathcal{X}$.

For a quantum system that consists of several subsystems $\mathcal{X}_1, \ldots, \mathcal{X}_n$, the Hilbert space is given by $\mathcal{X} = \mathcal{X}_1 \otimes \cdots \otimes \mathcal{X}_n$ with alphabets $\Gamma_1, \ldots, \Gamma_n$ such that $\mathcal{X}_i = \mathbb{C}^{\Gamma_i}$. The easiest example is given by qubits with alphabet $\Gamma_i = \{0,1\}$. This yields a Hilbert space of dimension $2^n$ with alphabet $\{0,1\}^n$. For a small number of subsystems, the Hilbert spaces of the subsystems are often denoted by the first letters of the alphabet, e.g. $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ for a bipartite quantum system. States that live on such a Hilbert space are then denoted by $\rho_{AB}$.

If one is interested in the state on one or more of the subsystems of a state $\rho_{[n]} \in \mathcal{D}(\mathcal{X}_1 \otimes \cdots \otimes \mathcal{X}_n)$, where $[n] = (1, \ldots, n)$, one can perform a partial trace over the other subsystems. This defines the reduced density matrix $\rho_I$ on the set of subsystems $I \subseteq [n]$,
\[
\rho_I = \operatorname{Tr}_{I^c} \rho_{[n]} = \sum_{x \in \Gamma_{I^c}} (I_I \otimes \langle x|_{I^c})\, \rho_{[n]}\, (I_I \otimes |x\rangle_{I^c}),
\]
with $I^c = [n] \setminus I$ the complement of $I$. The state $\rho_I$ describes the state on the subsystem $\mathcal{X}_{i_1} \otimes \cdots \otimes \mathcal{X}_{i_k}$, with $i_j \in I \subseteq [n]$.
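The partial trace is easy to implement by reshaping the density matrix so that each ket and bra factor gets its own index (a NumPy sketch; the helper `partial_trace` and the example states are ours, not from the thesis):

```python
import numpy as np

def partial_trace(rho, dims, keep):
    """Reduced density matrix on the subsystems listed in `keep`,
    tracing out the rest. `dims` lists the local dimensions."""
    n = len(dims)
    rho = rho.reshape(dims + dims)   # axes: kets 0..n-1, bras n..2n-1
    # Trace out the unwanted systems, highest index first so that the
    # remaining ket/bra axis pairs keep their relative positions.
    for i in sorted((i for i in range(n) if i not in keep), reverse=True):
        rho = np.trace(rho, axis1=i, axis2=i + rho.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return rho.reshape(d, d)

# Sanity check on a product state: Tr_B(rho1 (x) rho2) = rho1.
rho1 = np.array([[0.75, 0.0], [0.0, 0.25]])
rho2 = np.array([[0.5, 0.5], [0.5, 0.5]])
assert np.allclose(partial_trace(np.kron(rho1, rho2), [2, 2], keep=[0]), rho1)
```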

If a state on a Hilbert space defined by a compound alphabet can be written as
\[
\rho = \rho_1 \otimes \cdots \otimes \rho_n, \tag{2.4}
\]
it is called a product state.

A quantum state $\rho_A$ with alphabet $\Gamma$ is said to be a classical state if it can be written as
\[
\rho_A = \sum_{a \in \Gamma} p(a)\, |a\rangle\langle a|, \tag{2.5}
\]
with $p(\Gamma)$ a probability distribution. In that case the density matrix is diagonal and is just given by a probability distribution over classical elements. Note, for example, that the maximally mixed state (2.3) is a classical state with $p(a) = 1/d$ for all $a \in \Gamma$. If a state $\rho_{AB}$ can be written as
\[
\rho_{AB} = \sum_a p(a)\, |a\rangle\langle a| \otimes \rho_B^{(a)}, \tag{2.6}
\]
it is called a classical-quantum state, or cq state. This name arises from the fact that the first register is in a classical state, while the second register is any, possibly mixed, quantum state that depends on the first register. Note that the states $\rho_B^{(a)}$ on the second register are in general not orthogonal.

2.1.1 Purification and Schmidt decomposition

Every mixed state $\rho$ on a Hilbert space $\mathcal{H}_A$ can be written as a subsystem of a pure state $|\psi\rangle\langle\psi|$ on some larger Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. The pure state $|\psi\rangle\langle\psi|$ is then called a purification of $\rho$. Such a purification always exists for a large enough Hilbert space $\mathcal{H}_B$. This statement is captured by the following theorem.

Theorem 2.1 (Purification). Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be Hilbert spaces and let $\rho \in \mathcal{D}(\mathcal{H}_A)$ be a (mixed) quantum state. Then there exists a pure state $|\psi\rangle\langle\psi| \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$, called a purification of $\rho$, such that $\operatorname{Tr}_B(|\psi\rangle\langle\psi|) = \rho$, if and only if $\dim(\mathcal{H}_B) \ge \operatorname{rank}(\rho)$.

The easiest way to see that such a state $|\psi\rangle$ exists is by writing $\rho$ in terms of its spectral decomposition,
\[
\rho = \sum_{i=1}^{\operatorname{rank}(\rho)} p(i)\, |\psi_i\rangle\langle\psi_i|, \quad \text{and writing} \quad |\psi\rangle = \sum_{i=1}^{\operatorname{rank}(\rho)} \sqrt{p(i)}\, |\psi_i\rangle \otimes |i\rangle, \tag{2.7}
\]
with $\{|i\rangle\}$ any orthonormal basis. It is then clear that indeed $\operatorname{Tr}_B(|\psi\rangle\langle\psi|) = \rho$. It also turns out that purifications are unique up to unitary equivalence. That is, if $\operatorname{Tr}_B(|\psi\rangle\langle\psi|) = \operatorname{Tr}_B(|\phi\rangle\langle\phi|)$ for states $|\psi\rangle, |\phi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$, then $|\phi\rangle = (I_A \otimes U_B)|\psi\rangle$ for some $U_B \in \mathcal{U}(\mathcal{H}_B)$ [20].
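The construction in (2.7) can be checked numerically; writing the purifying vector as a coefficient matrix $M$ makes tracing out $B$ a matrix product (a NumPy sketch with an arbitrary example state):

```python
import numpy as np

# A mixed qubit state (arbitrary example).
rho = np.array([[0.7, 0.2], [0.2, 0.3]])

# Spectral decomposition rho = sum_i p(i) |psi_i><psi_i|.
p, V = np.linalg.eigh(rho)             # columns of V are eigenvectors

# Purification |psi> = sum_i sqrt(p(i)) |psi_i> (x) |i>, stored as a
# d_A x d_B coefficient matrix M; then Tr_B |psi><psi| = M M^dagger.
M = V @ np.diag(np.sqrt(p))
psi = M.reshape(-1)                    # the purifying vector on H_A (x) H_B

assert np.allclose(M @ M.conj().T, rho)       # tracing out B recovers rho
assert np.isclose(np.vdot(psi, psi), 1.0)     # the purification is normalised
```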

Equation (2.7) suggests that a pure bipartite state can always be written as a sum over products of orthonormal basis vectors on the subsystems, with positive coefficients. This is known as the Schmidt decomposition, which is specified in theorem 2.2 below.

Theorem 2.2 (Schmidt decomposition). Let $|\psi\rangle \in \mathcal{H}_A \otimes \mathcal{H}_B$ be a pure bipartite quantum state on the Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. Then there exists a decomposition of this state, known as the Schmidt decomposition, as
\[
|\psi\rangle = \sum_j s_j\, |e_j\rangle \otimes |f_j\rangle, \tag{2.8}
\]
with $s_j > 0$, $\{|e_j\rangle\}$ an orthonormal basis on $\mathcal{H}_A$ and $\{|f_j\rangle\}$ an orthonormal basis on $\mathcal{H}_B$.

The reduced states on $\mathcal{H}_A$ and $\mathcal{H}_B$ can then be written as
\[
\rho_A = \sum_j s_j^2\, |e_j\rangle\langle e_j| \quad \text{and} \quad \rho_B = \sum_j s_j^2\, |f_j\rangle\langle f_j|, \tag{2.9}
\]
respectively.


One can find the values $s_j$ by, for example, calculating the eigenvalues of the reduced density operators.
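In practice the Schmidt coefficients are most easily obtained from a singular value decomposition of the state's coefficient matrix, which is equivalent to diagonalising the reduced states (a NumPy sketch with an arbitrary state):

```python
import numpy as np

# A normalised pure two-qubit state |psi> = sum_{ij} C[i,j] |i>|j>.
C = np.array([[0.8, 0.1], [0.2, 0.3]])
C = C / np.linalg.norm(C)

# The singular values of C are the Schmidt coefficients s_j.
s = np.linalg.svd(C, compute_uv=False)

# Their squares are the eigenvalues of both reduced density operators.
rho_A = C @ C.conj().T
rho_B = C.T @ C.conj()
assert np.allclose(np.sort(np.linalg.eigvalsh(rho_A))[::-1], s**2)
assert np.allclose(np.sort(np.linalg.eigvalsh(rho_B))[::-1], s**2)
assert np.isclose(np.sum(s**2), 1.0)   # normalisation of |psi>
```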

2.1.2 Separability & entanglement

An important property of a quantum state is whether or not it is entangled. To introduce the notion of entanglement, we first need to define separability. Consider a pure, bipartite state $|\Psi\rangle_{AB}$ on $\mathcal{H}_A \otimes \mathcal{H}_B$. If $|\Psi\rangle_{AB}$ can be written as a tensor product of states $|\psi\rangle_A$ on $\mathcal{H}_A$ and $|\phi\rangle_B$ on $\mathcal{H}_B$, i.e. if
\[
|\Psi\rangle_{AB} = |\psi\rangle_A \otimes |\phi\rangle_B, \tag{2.10}
\]
it is said to be separable. If this is not possible, $|\Psi\rangle$ is entangled.

More generally, a mixed quantum state $\rho_{AB} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$ is separable if there exist an alphabet $\Gamma$ and a probability distribution $p$ such that $\rho_{AB}$ can be written as
\[
\rho_{AB} = \sum_{x \in \Gamma} p(x)\, \rho_x \otimes \sigma_x,
\]
with $\rho_x \in \mathcal{D}(\mathcal{H}_A)$ and $\sigma_x \in \mathcal{D}(\mathcal{H}_B)$. The set of such separable density operators is denoted by $\operatorname{SepD}(\mathcal{H}_A : \mathcal{H}_B)$. Density operators that are not separable are entangled.

The notion of entanglement is only mentioned a few times throughout this thesis, but is extremely important in the area of quantum information. The fundamental building block of entanglement is the Bell state
\[
|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle), \tag{2.11}
\]
also known as an EPR pair (after Einstein, Podolsky and Rosen) or an ebit. It is a maximally entangled 2-qubit state and is therefore often used as a measure of entanglement in quantum systems through entanglement distillation (see e.g. [20, sec. 6.2.2]). As is probably known to most readers, entanglement can be used, among other things, as a resource for state teleportation [21], superdense coding [22] and secure quantum key distribution [23].
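For pure bipartite states, entanglement shows up in the reduced state: the Bell state's reduced state is maximally mixed, while a product state's reduced state stays pure (a NumPy sketch):

```python
import numpy as np

def purity(rho):
    return np.trace(rho @ rho).real

# |Phi+> = (|00> + |11>)/sqrt(2), written as a 2x2 coefficient matrix.
C_bell = np.eye(2) / np.sqrt(2)
rho_A_bell = C_bell @ C_bell.conj().T      # = I/2, maximally mixed
assert np.isclose(purity(rho_A_bell), 0.5)

# A separable pure state |0> (x) |+> gives a pure reduced state.
C_prod = np.outer([1.0, 0.0], [1.0, 1.0]) / np.sqrt(2)
rho_A_prod = C_prod @ C_prod.conj().T
assert np.isclose(purity(rho_A_prod), 1.0)
```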

2.2 Continuous quantum systems

In the previous section, discrete quantum systems were discussed. This section treats continuous-variable quantum systems. We define a continuous quantum state on the Hilbert space of square integrable functions $\mathcal{H} = L^2(\mathbb{R}^n)$ as a positive semidefinite, normalised, self-adjoint trace-class operator $\rho \in \mathcal{D}(\mathcal{H})$. That is, $\langle v|\rho|v\rangle \ge 0$ for all $|v\rangle \in \mathcal{H}$, $\operatorname{Tr}(\rho) = 1$ and $\rho^* = \rho$.

All linear operators on the Hilbert space $L^2(\mathbb{R}^n)$ that represent observables have to be self-adjoint. We will generally write such operators that act on Hilbert space vectors with a hat. This does not necessarily mean that the bras $\langle a|$ and kets $|a\rangle$ that define the operator are normalised vectors; the only requirement is that the inner product between $|a\rangle$ and a vector $|v\rangle \in \mathcal{H}$ is finite and well-defined. For example, for the position operator $\hat{x}$ we can write
\[
\hat{x} = \int_{-\infty}^{\infty} x\, |x\rangle\langle x|\, dx, \tag{2.12}
\]
which is valid because the inner product of the (improper) position kets is given by $\langle x|y\rangle = \delta(x - y)$.

The dynamics of continuous quantum systems are governed by a Hamiltonian $\hat{H}$. In the Schrödinger picture, the evolution of a state vector $|\psi\rangle$ is given by the Schrödinger equation
\[
\frac{d}{dt}|\psi\rangle = -i\hat{H}|\psi\rangle. \tag{2.13}
\]
For mixed states it is more convenient to consider the von Neumann equation for time evolution, under which a density matrix $\rho$ evolves as
\[
\frac{d}{dt}\rho = -i[\hat{H}, \rho], \tag{2.14}
\]

which is equivalent to the Schrödinger equation. This notation resembles, up to an important sign, the Heisenberg picture of operator evolution, where operators $\hat{O}$ evolve in time according to
\[
\frac{d}{dt}\hat{O} = i[\hat{H}, \hat{O}]. \tag{2.15}
\]

If the quantum system interacts with the environment, one can calculate the dynamics on the larger global system and find the effect this has on the local system. Under certain assumptions, this type of evolution is given by the time-local quantum master equation of Lindblad form (here for self-adjoint Lindblad operators $\hat{L}_j$):
\[
\frac{d}{dt}\rho = \sum_j \left( \hat{L}_j \rho \hat{L}_j - \frac{1}{2}\big( \hat{L}_j^2 \rho + \rho \hat{L}_j^2 \big) \right) = \sum_j \mathcal{D}(\hat{L}_j)(\rho). \tag{2.16}
\]
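As a toy example of (2.16), a single self-adjoint Lindblad operator proportional to $\sigma_z$ generates dephasing; the rate and time grid below are arbitrary choices for illustration:

```python
import numpy as np

sz = np.diag([1.0, -1.0])
L = np.sqrt(0.5) * sz                    # a self-adjoint Lindblad operator

def dissipator(L, rho):
    """The Lindblad dissipator D(L)(rho) for self-adjoint L."""
    return L @ rho @ L - 0.5 * (L @ L @ rho + rho @ L @ L)

# |+><+| dephases towards the maximally mixed state.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)

dt = 1e-3
for _ in range(5000):                    # crude Euler integration to t = 5
    rho = rho + dt * dissipator(L, rho)

assert np.isclose(np.trace(rho).real, 1.0)   # the evolution preserves the trace
assert abs(rho[0, 1]) < 1e-2                 # the coherences have decayed
```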

We will see an example of such an evolution in chapter 7.

2.2.1 The uncertainty principle

An important notion in quantum mechanics is the Heisenberg uncertainty principle. In continuous variable systems it is best known in its position-momentum form (in natural units),
\[
\sigma_x \sigma_p \ge \frac{1}{2}, \tag{2.17}
\]
with $\sigma_a$ the standard deviation of the operator $\hat{a}$.

The principle, however, is more general and stems from the commutation relations between observables. The uncertainty principle for any two operators $\hat{A}$ and $\hat{B}$ is given by
\[
\sigma_A^2 \sigma_B^2 \ge \left( \frac{1}{2i} \left\langle [\hat{A}, \hat{B}] \right\rangle \right)^2, \tag{2.18}
\]
with $\langle\,\cdot\,\rangle$ the expectation value [24]. That is, two observables cannot be measured simultaneously with arbitrary precision if their operators do not commute, i.e. if they are not diagonal in the same basis. It is therefore sufficient to express the uncertainty principle in terms of the commutator between two observables. For position and momentum this boils down to
\[
[\hat{x}, \hat{p}] = i \tag{2.19}
\]
in natural units.
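The Robertson bound (2.18) is easy to check in finite dimensions; for the spin operators $\sigma_x$, $\sigma_y$ in the state $|0\rangle$ the bound is even saturated (a NumPy sketch):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

def moments(A, psi):
    """Expectation value and variance of A in the state |psi>."""
    expA = np.vdot(psi, A @ psi).real
    varA = np.vdot(psi, A @ A @ psi).real - expA**2
    return expA, varA

psi = np.array([1.0, 0.0], dtype=complex)       # |0>
_, var_x = moments(sx, psi)
_, var_y = moments(sy, psi)

comm = sx @ sy - sy @ sx                        # [sx, sy] = 2i sz
bound = (np.vdot(psi, comm @ psi) / 2j).real ** 2

assert var_x * var_y >= bound - 1e-12
print(var_x * var_y, bound)   # 1.0 1.0: the bound is saturated for |0>
```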

A further treatment of continuous variable systems, the uncertainty principle and the use of phase space can be found in chapter 6, where we give an introduction to Gaussian quantum information.

2.3 Quantum channels

In quantum information theory the change of one state to another is realised by the application of a quantum channel. A quantum channel Φ is a completely positive, trace preserving (CPTP) linear map from one space of square operators L(X ) to another space L(Y). Let the set of such channels be given by C(X , Y). A map Φ : L(X ) → L(Y) is said to be completely positive if for all choices of a complex Euclidean space Z and a positive semidefinite operator P ∈ Pos(X ⊗ Z) it holds that

(Φ ⊗ IZ)(P ) ∈ Pos(Y ⊗ Z). (2.20)

The map Φ is called trace preserving if

Tr(P ) = Tr(Φ(P )). (2.21)

Note that taking a (partial) trace is itself also a channel, since it is obviously trace preserving and the (partial) trace of a positive semidefinite operator is again positive semidefinite.
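Complete positivity of a concrete map can be tested via its Choi matrix J(Φ) = Σ_{i,j} Φ(|i⟩⟨j|) ⊗ |i⟩⟨j|, which is positive semidefinite exactly when Φ is completely positive (Choi's theorem; this standard criterion is not introduced in the text above, and the depolarizing channel below is an arbitrary example of ours):

```python
import numpy as np

def choi(phi, d):
    """Choi matrix J(Phi) = sum_{i,j} Phi(|i><j|) ⊗ |i><j|."""
    J = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1.0
            J += np.kron(phi(E), E)
    return J

# example channel: the completely depolarizing qubit channel rho -> Tr(rho) I/2
dep = lambda M: np.trace(M) * np.eye(2) / 2
J = choi(dep, 2)

# CP iff the Choi matrix is positive semidefinite
is_cp = bool(np.all(np.linalg.eigvalsh(J) >= -1e-12))
# TP iff tracing J over the output factor gives the identity on the input
out_traced = np.einsum('abae->be', J.reshape(2, 2, 2, 2))
is_tp = bool(np.allclose(out_traced, np.eye(2)))
```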


We define the adjoint of the channel Φ as the unique linear map Φ* : L(Y) → L(X) such that

⟨Y, Φ(X)⟩ = ⟨Φ*(Y), X⟩, (2.22)

for every X ∈ L(X ) and Y ∈ L(Y), where

⟨A, B⟩ = Tr(A*B). (2.23)

The adjoint map Φ* is not in general a channel, since it might not be trace preserving. It is, however, completely positive and unital. A unital map Φ* : L(A) → L(B) satisfies Φ*(I_A) = I_B.

2.3.1 Representations of quantum channels

The action of a channel on a quantum state can be represented in several ways. In this section we briefly discuss the two representations that will appear in this thesis. These results can also be found, for example, in the book by Watrous [20].

The first, and most common representation is known as the Kraus representation. For any channel Φ ∈ C(X , Y) there exists an alphabet Γ and linear operators

{Aa : a ∈ Γ} (2.24)

from X to Y, such that

Φ(X) = Σ_{a∈Γ} A_a X A_a*, (2.25)

for every X ∈ L(X), with Σ_{a∈Γ} A_a* A_a = I_X. The operators A_a are in general not unique. From the definition of the adjoint map, it follows that

Φ*(Y) = Σ_{a∈Γ} A_a* Y A_a, (2.26)

for every Y ∈ L(Y).
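To make the Kraus representation concrete, here is a small sketch for the amplitude damping channel (a standard textbook example, chosen here for illustration; the value of γ is arbitrary), checking the completeness relation, trace preservation, and unitality of the adjoint:

```python
import numpy as np

gamma = 0.3  # damping parameter, arbitrary illustrative value
A0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]])
A1 = np.array([[0, np.sqrt(gamma)], [0, 0]])
kraus = [A0, A1]

def channel(rho):
    """Phi(rho) = sum_a A_a rho A_a*, eq. (2.25)."""
    return sum(A @ rho @ A.conj().T for A in kraus)

def adjoint(Y):
    """Phi*(Y) = sum_a A_a* Y A_a, eq. (2.26)."""
    return sum(A.conj().T @ Y @ A for A in kraus)

rho = np.array([[0.25, 0.1], [0.1, 0.75]])  # an arbitrary qubit state
completeness_ok = bool(np.allclose(sum(A.conj().T @ A for A in kraus), np.eye(2)))
tp_ok = bool(np.isclose(np.trace(channel(rho)), 1.0))   # trace preserving
unital_ok = bool(np.allclose(adjoint(np.eye(2)), np.eye(2)))  # adjoint is unital
```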

The second representation we will elaborate on is the Stinespring representation. For every channel Φ ∈ C(X, Y) there exists a Hilbert space Z and an operator A ∈ L(X, Y ⊗ Z) such that

Φ(X) = Tr_Z(A X A*), (2.27)

with A*A = I_X, for every X ∈ L(X). This is known as the Stinespring representation. As in the case of the Kraus representation, the operator A is not unique. The adjoint of the channel Φ is given in terms of the operator A as

Φ*(Y) = A*(Y ⊗ I_Z)A, (2.28)

for all Y ∈ L(Y).

The two representations are related in the following way: If we have a channel Φ ∈ C(X , Y) with Kraus representation

Φ(X) = Σ_{a∈Γ} A_a X A_a*, (2.29)

then a Stinespring representation for Z = C^Γ is given by

Φ(X) = Tr_Z(A X A*), (2.30)

with

A = Σ_{a∈Γ} A_a ⊗ |a⟩. (2.31)
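The Kraus-to-Stinespring relation above can be checked numerically. The sketch below (reusing the amplitude damping channel as an arbitrary example of ours) builds the isometry A = Σ_a A_a ⊗ |a⟩ and verifies that tracing out Z reproduces the Kraus form:

```python
import numpy as np

gamma = 0.3
A0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]])
A1 = np.array([[0, np.sqrt(gamma)], [0, 0]])
kraus = [A0, A1]

# Stinespring isometry A = sum_a A_a ⊗ |a>, mapping C^2 into Y ⊗ Z = C^2 ⊗ C^2
basis = np.eye(len(kraus))
A = sum(np.kron(Aa, basis[:, [a]]) for a, Aa in enumerate(kraus))

def ptrace_Z(M, dY, dZ):
    """Partial trace over the second (Z) tensor factor."""
    return np.einsum('aibi->ab', M.reshape(dY, dZ, dY, dZ))

rho = np.array([[0.25, 0.1], [0.1, 0.75]])
isometry_ok = bool(np.allclose(A.conj().T @ A, np.eye(2)))  # A*A = I_X
match_ok = bool(np.allclose(
    ptrace_Z(A @ rho @ A.conj().T, 2, 2),
    sum(Aa @ rho @ Aa.conj().T for Aa in kraus)))  # Tr_Z(A rho A*) = Phi(rho)
```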

2.3.2 Measurements

Measurements, generally called POVMs (Positive Operator Valued Measures), are mathematically described by a map µ : Ω → Pos(X) from the outcome set Ω to positive operators on the Hilbert space X, with the property

Σ_{ω∈Ω} µ(ω) = I_X. (2.32)

The probability of obtaining outcome ω when applying the measurement µ to ρ is then given by

p(ω) = Tr(µ(ω)ρ). (2.33)

We can view a measurement as a special case of a quantum channel. It is a channel given by

Φ(X) = Σ_a Tr(µ(a)X) |a⟩⟨a|. (2.34)

Such channels are known as quantum-to-classical channels, since they map any quantum state to a classical state.

Often after measurement, the state ρ ceases to exist. Think for example of the absorption of a photon by a detector. However, if we have applied a non-destructive measurement and found outcome a, the post-measurement state will be

ρ_a = A_a ρ A_a* / p(a), (2.35)

where A_a is the Kraus operator that corresponds to the measurement µ(a) with outcome a and p(a) = Tr(A_a ρ A_a*).¹ If the measurement allows for multiple outcomes I ⊂ Ω, the post-measurement state becomes

ρ_I = (Σ_{a∈I} A_a ρ A_a*) / p(I), (2.36)

where p(I) = Tr(Σ_{a∈I} A_a ρ A_a*) and there are now multiple Kraus operators that correspond to the multiple outcomes.

In the same way we can also deduce the remaining state after only a part of the state has been (destructively) measured. Suppose we have a state ρ_AB, on which we perform a measurement of subsystem A, with outcome a. Then the post-measurement state will be

ρ_B = Tr_A((µ(a) ⊗ I_B)ρ_AB) / p(a), (2.37)

where p(a) = Tr((µ(a) ⊗ I_B)ρ_AB) is the probability of outcome a.
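Equations (2.33) and (2.37) can be illustrated on the maximally entangled state (a worked example of ours, not from the text): measuring subsystem A of |φ+⟩ in the computational basis gives each outcome with probability 1/2 and collapses B to the corresponding basis state. A minimal sketch:

```python
import numpy as np

phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)  # |φ+> = (|00> + |11>)/√2
rho_AB = np.outer(phi, phi)

def ptrace_A(M):
    """Partial trace over the first (A) tensor factor of a two-qubit operator."""
    return np.einsum('aiaj->ij', M.reshape(2, 2, 2, 2))

probs, states_ok = [], True
for a in range(2):
    mu = np.zeros((2, 2)); mu[a, a] = 1.0               # POVM element |a><a|
    p = np.trace(np.kron(mu, np.eye(2)) @ rho_AB).real      # eq. (2.33)
    rho_B = ptrace_A(np.kron(mu, np.eye(2)) @ rho_AB) / p   # eq. (2.37)
    probs.append(p)
    states_ok = states_ok and np.allclose(rho_B, mu)  # B collapses to |a><a|
```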

2.4 Entropy

This section discusses the notion of information entropy, both in the classical and in the quantum case. We start with the classical Shannon entropy and the inequalities that govern it as an introduction to its more complex quantum version, the von Neumann entropy. Though we will see many similarities between the classical and quantum cases, we will also encounter some important differences.

1Examples of non-destructive measurements can be found in some setups of the famous Stern-Gerlach experiment,


Figure 1: A graph of the binary Shannon entropy h(p). It is a concave function with a maximum at p = 1/2.

2.4.1 Shannon entropy

Most of the basic theory on classical entropy and classical communication was discovered and published by Shannon in his two famous papers from '48 and '49 [25, 26]. This section treats some important concepts and results as an introduction to the quantum equivalents of these concepts.

Let X be a random variable with probability distribution p(X) and alphabet Γ. Then the Shannon entropy H(X) of X is defined as

H(X) = −Σ_{x∈Γ} p(x) log(p(x)), (2.38)

where the logarithm has base 2 by convention and 0 log(0) is set to 0. The Shannon entropy measures the amount of uncertainty in a random variable. It is a concave function that obeys

0 ≤ H(X) ≤ log(|Γ|), (2.39)

where equality on the left side is obtained by a distribution where p(x) = 1 for one specific outcome x and 0 for all the other outcomes, while equality on the right side is achieved by the uniform distribution. To strengthen our intuition, let us look at a graph of the binary Shannon entropy, defined as

h(p) = −p log(p) − (1 − p) log(1 − p), (2.40)

for 0 ≤ p ≤ 1, in figure 1. This is the most basic example of a Shannon entropy function, with |Γ| = 2. It is clearly concave and has a maximum of h(1/2) = 1, as expected by the upper bound in eq. (2.39).
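The binary entropy is easy to explore numerically; the following sketch (an illustration of ours) evaluates h(p) on a grid and checks the maximum at p = 1/2 and midpoint concavity:

```python
import math

def h(p):
    """Binary Shannon entropy in bits, eq. (2.40), with 0 log 0 := 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

ps = [i / 100 for i in range(101)]
max_at_half = all(h(p) <= h(0.5) for p in ps)
# midpoint concavity: h((p+q)/2) >= (h(p) + h(q))/2
concave = all(h((p + q) / 2) >= (h(p) + h(q)) / 2 - 1e-12 for p in ps for q in ps)
```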

For two random variables X and Y , we can define the joint entropy as the entropy of the joint probability distribution p(XY ). If the two random variables are independent, the probabilities can be written as p(xy) = p(x)p(y) for all x ∈ X and y ∈ Y, which yields H(XY ) = H(X) + H(Y ).

Furthermore, we can define the conditional entropy of X given Y in the following way

H(X|Y) = Σ_y p(y) H(X|y), (2.41)

with

H(X|y) := −Σ_{x∈Γ} p(x|y) log(p(x|y)) (2.42)


for the conditional probabilities p(x|y) = p(xy)/p(y). The conditional entropy measures the amount of randomness that is left in a random variable X when another random variable Y is already known. With these definitions it also holds that

H(X|Y) = H(XY) − H(Y), (2.43)

which is known as the chain rule.
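The chain rule (2.43) can be verified directly from the definitions on a small joint distribution (the numbers below are an arbitrary choice of ours):

```python
import math

def H(pmf):
    """Shannon entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

pxy = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}  # arbitrary p(x, y)
py = {y: sum(p for (x, y2), p in pxy.items() if y2 == y) for y in (0, 1)}

# H(X|Y) via eq. (2.41): average of the entropies of the conditionals p(x|y)
HXgY = sum(py[y] * H({x: pxy[(x, y)] / py[y] for x in (0, 1)}) for y in (0, 1))
chain_rule_gap = abs(HXgY - (H(pxy) - H(py)))  # eq. (2.43): should vanish
```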

Entropy is a measure for the amount of uncertainty stored in a random variable, but very often we are interested in the amount of correlation between two random variables. A quantity that captures this property is the mutual information, defined by

I(X; Y ) := H(X) + H(Y ) − H(XY ) (2.44)

= H(X) − H(X|Y ), (2.45)

which denotes the difference between the uncertainty in X and the uncertainty that is left in X when Y is known, which is indeed the information that X and Y share. Similarly, we can define the conditional mutual information as

I(X; Y|Z) := H(X|Z) + H(Y|Z) − H(XY|Z) (2.46)
= H(X|Z) − H(X|YZ) (2.47)
= H(XZ) + H(YZ) − H(XYZ) − H(Z), (2.48)

which is equivalent to

I(X; Y|Z) = Σ_z p(z) I(X; Y|z), (2.49)

where I(X; Y |z) = H(X|z)+H(Y |z)−H(XY |z). The conditional mutual information calculates the correlation between random variables that is left when one (or more) of the other random variables is already known.

Through the chain rule for the entropy, we can also find a chain rule for the (conditional) mutual information. Specifically

I(WX; Y|Z) = I(X; Y|Z) + I(W; Y|XZ). (2.50)

Finally, we introduce a quantity called the relative entropy or Kullback-Leibler divergence of two probability distributions p and q with the same alphabet X. It is defined as

D(p||q) = Σ_{x∈X} p(x)(log(p(x)) − log(q(x))), (2.51)

where by convention D(p||q) = ∞ if q(x) = 0, while p(x) > 0 for some x ∈ X . This quantity is used to see how different two probability distributions are and is closely related to the mutual information. In fact

I(X; Y) = D(p_XY || p_X ⊗ p_Y). (2.52)
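Relation (2.52) between mutual information and relative entropy is likewise easy to confirm on a small joint distribution (again an arbitrary example of ours):

```python
import math

def H(pmf):
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

pxy = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}  # arbitrary p(x, y)
px = {x: pxy[(x, 0)] + pxy[(x, 1)] for x in (0, 1)}
py = {y: pxy[(0, y)] + pxy[(1, y)] for y in (0, 1)}

I_XY = H(px) + H(py) - H(pxy)                    # eq. (2.44)
D_pq = sum(p * math.log2(p / (px[x] * py[y]))    # eq. (2.51) with q = p_X ⊗ p_Y
           for (x, y), p in pxy.items() if p > 0)
identity_gap = abs(I_XY - D_pq)                  # eq. (2.52): should vanish
```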

Most of the quantities in this section can be expressed in terms of some of the other quantities. A useful overview for three random variables is given in appendix A in the form of a Venn diagram.

2.4.2 Classical entropy inequalities

The entropic quantities introduced in the previous section obey certain inequalities known as entropy inequalities. This section will discuss these inequalities and give proofs or proof sketches.

We’ve already seen in the previous section that the Shannon entropy is a non-negative quantity. This also matches with our intuition that an amount of uncertainty cannot be negative. Similarly, it would be surprising if the mutual information between two variables could become negative. And in fact one can indeed prove in the following way that the mutual information is also non-negative:


Proposition 2.3. For any two random variables it holds that

I(X; Y ) ≥ 0, (2.53)

with equality if and only if X and Y are independent.

Proof.

I(X; Y) = H(X) + H(Y) − H(XY)
= −Σ_{x,y} p(xy)(log(p(x)) + log(p(y)) − log(p(xy)))
= −Σ_{x,y} p(xy) log(p(x)p(y)/p(xy))
≥ −log( Σ_{x,y : p(xy)>0} p(x)p(y) )
≥ −log(1) = 0,

where we have used Jensen's inequality to get the first inequality. If X and Y are independent, then p(XY) = p(X)p(Y) and all steps are equalities.

This inequality is known as subadditivity, due to the fact that it can also be written as

H(X) + H(Y ) ≥ H(XY ). (2.54)

This, too, has an intuitive interpretation: it will always be at least as difficult to express all information contained within two random variables separately as it is to express it jointly.

From subadditivity, we can also derive an upper bound on the conditional entropy. This is captured in lemma 2.4.

Lemma 2.4. For any two random variables X and Y it holds that

0 ≤ H(X|Y ) ≤ H(X), (2.55)

with equality on the left side if and only if x = f (y) for some function f : Y → X and equality on the right side if and only if X and Y are independent.

Proof. The lower bound follows trivially from the definition and the bound on H(X|y). Equality follows from the fact that if x = f(y), we have p(x|y) = 1. The upper bound is found by writing

H(X|Y) = H(XY) − H(Y) = H(X) − I(X; Y) ≤ H(X), (2.56)

with equality if and only if X and Y are independent by proposition 2.3.

Hence, knowing the variable Y can never increase the uncertainty of the random variable X. This inequality is also often written as

H(XY ) ≥ H(Y ), (2.57)

and is therefore called monotonicity of the Shannon entropy.

One can generalise the result of subadditivity to the conditional mutual information. This inequality is called strong subadditivity (SSA) and is given for three random variables X, Y and Z by

I(X; Y |Z) ≥ 0. (2.58)

It follows almost trivially from equation (2.49) and subadditivity. Equality is obtained if and only if X and Y are independent given Z.
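Strong subadditivity can be spot-checked by evaluating I(X;Y|Z) via eq. (2.48) on randomly generated three-variable distributions (a numerical illustration of ours, not a proof):

```python
import math, random

def H(pmf):
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(p3, keep):
    """Marginal pmf on the coordinates listed in keep."""
    out = {}
    for key, p in p3.items():
        k = tuple(key[i] for i in keep)
        out[k] = out.get(k, 0.0) + p
    return out

random.seed(1)
cmi_values = []
for _ in range(200):
    keys = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
    w = [random.random() for _ in keys]
    p3 = {k: wi / sum(w) for k, wi in zip(keys, w)}
    # I(X;Y|Z) = H(XZ) + H(YZ) - H(XYZ) - H(Z), eq. (2.48)
    cmi_values.append(H(marginal(p3, (0, 2))) + H(marginal(p3, (1, 2)))
                      - H(p3) - H(marginal(p3, (2,))))
min_cmi = min(cmi_values)  # should never be significantly negative
```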


Furthermore, there exists an inequality between random variables and their mappings, which is known as the data processing inequality (DPI). It states that for a map Φ : Y → Z

I(X; Y) ≥ I(X; Φ(Y)) = I(X; Z), (2.59)

that is, a mapping of the random variable Y that is independent of X cannot increase the mutual information with X.

Finally, we have two inequalities for the relative entropy. First, it holds that

D(p||q) ≥ 0, (2.60)

which can be proven with Jensen’s inequality in a similar way as subadditivity. Secondly, we have the monotonicity of the relative entropy under a map Φ

D(p||q) ≥ D(Φ(p)||Φ(q)). (2.61)

This inequality is very similar in spirit to the data processing inequality and turns out to be equivalent to strong subadditivity.

Many of the previous inequalities are special cases of the strong subadditivity inequality or are equivalent to it; e.g. ordinary subadditivity follows from taking Z trivial in (2.58). The inequalities defined in this section and all the inequalities that follow from them are known as the Shannon inequalities.

2.4.3 Von Neumann entropy

The quantities introduced in section 2.4.1 for classical random variables can be extended to quantum systems. This section describes the quantum mechanical analog of the Shannon entropy, known as the von Neumann entropy. The other entropic quantities then follow naturally through the same definitions as in the classical case.

We define the von Neumann entropy for a quantum state ρ_A on a (finite dimensional) Hilbert space H_A as

S(ρ_A) = S(A) := −Tr(ρ_A log(ρ_A)), (2.62)

where a function f of an operator is understood in terms of its spectral decomposition A = Σ_{i=1}^m λ_i Π_i as

f(A) = Σ_{i=1}^m f(λ_i) Π_i, (2.63)

for eigenvalues λ_i and projection operators Π_i onto the space spanned by the eigenvectors that correspond to the eigenvalue λ_i. Alternatively, one can write

S(ρ_A) = S(A) = H(λ(ρ_A)), (2.64)

where λ(ρ_A) is the vector of eigenvalues of ρ_A. The von Neumann entropy is invariant under isometries due to the cyclicity of the trace and the fact that for any isometry V ∈ U(H_A, H_B) it holds that log(V ρV*) = V log(ρ)V*. Thus we have that

S(V ρ_A V*) = S(ρ_A). (2.65)

The other quantum entropic quantities are now defined according to their classical counterparts. That is, for a quantum state ρ_AB on a Hilbert space H_A ⊗ H_B we have the joint entropy

S(AB) = −Tr(ρ_AB log(ρ_AB)), (2.66)

the conditional entropy

S(A|B) = S(AB) − S(B), (2.67)

and the mutual information

I(A; B) = S(A) + S(B) − S(AB), (2.68)

where S(A) = S(Tr_B(ρ_AB)) and similarly for S(B). For quantum states ρ_ABC on a Hilbert space H_A ⊗ H_B ⊗ H_C we have the conditional mutual information

I(A; B|C) = S(AC) + S(BC) − S(ABC) − S(C). (2.69)

As in the classical case, we have that if two systems are independent, i.e. if ρ_AB = ρ_A ⊗ ρ_B, the entropy is additive and we get S(AB) = S(A) + S(B) and I(A; B) = 0.

The quantum relative entropy is defined similarly to the classical relative entropy, but with a slight caveat. Let ρ, σ ∈ D(H_A); then the quantum relative entropy is defined as

D(ρ||σ) = Tr(ρ(log(ρ) − log(σ))) if im(ρ) ⊆ im(σ),
D(ρ||σ) = ∞ otherwise. (2.70)

The condition for the relative entropy to be finite can be explained by the following reasoning: the logarithm of the density operator σ is well-defined on the subspace im(σ), since σ restricted to its image is positive definite. So as long as the image of ρ is contained in that of σ, it is clear how to calculate ρ log(σ). However, if some part of im(ρ) is not contained in im(σ), then the logarithm on that subspace is ill-defined and we set the relative entropy to ∞.

2.4.4 Quantum entropy inequalities

Even though the definitions of the quantum entropic quantities mirror their classical counterparts, the inequalities that bound these quantities are slightly different. For example, it turns out that monotonicity is no longer true. This can be seen by taking ρ_AB = |φ+⟩⟨φ+|, with |φ+⟩ = (1/√2)(|00⟩ + |11⟩), the maximally entangled ebit. Since this is a pure state, the only nonzero eigenvalue is 1 and the von Neumann entropy of the joint state ρ_AB is thus 0. However, when we look at the subsystem on H_A by taking the partial trace over H_B, we get

ρ_A = Tr_B(|φ+⟩⟨φ+|) (2.71)
= I_2/2, (2.72)

which leads to

S(A) = −Tr( (I_2/2) log(I_2/2) ) = 1. (2.73)

This means that for this example S(A) > S(AB), and so monotonicity doesn’t generally hold. The inequalities that are known to hold will be outlined in the rest of this section.
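The computation above is easy to reproduce numerically; the sketch below (an illustration of ours) evaluates the eigenvalues of ρ_AB and ρ_A for the ebit and confirms S(A) > S(AB):

```python
import numpy as np

phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)  # |φ+> = (|00> + |11>)/√2
rho_AB = np.outer(phi, phi)

def S(rho):
    """Von Neumann entropy in bits via the eigenvalue form, eq. (2.64)."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

rho_A = np.einsum('aibi->ab', rho_AB.reshape(2, 2, 2, 2))  # partial trace over B
S_AB, S_A = S(rho_AB), S(rho_A)  # 0 and 1: S(A) > S(AB), monotonicity fails
```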

Let us start with the positivity of the von Neumann entropy. We have that for a density operator ρ_A on Hilbert space H_A

0 ≤ S(A) ≤ log(dim(H_A)), (2.74)

where equality is attained on the left side by pure states and on the right side by ρ_A = I_{H_A}/dim(H_A). Note furthermore that for bipartite pure states ρ_AB it holds that S(A) = S(B). This can be seen from the statement (2.9) that follows from the Schmidt decomposition of pure states.

We will now state a famous result by Lieb and Ruskai [27]: strong subadditivity also holds for the von Neumann entropy. This means that for any tripartite state ρ_ABC ∈ D(H_A ⊗ H_B ⊗ H_C) we have that

S(AC) + S(BC) ≥ S(ABC) + S(C). (2.75)

The proof of this statement is much more involved than the classical proof and its discovery was of great relevance. In the same paper, Lieb and Ruskai also derived the weak monotonicity inequality

S(AB) + S(AC) ≥ S(B) + S(C), (2.76)

which can be derived from SSA by purification, as will be shown below.

Proof. Let ρ_ABC ∈ D(H_A ⊗ H_B ⊗ H_C) be a density operator and let |ψ⟩⟨ψ| ∈ D(H_A ⊗ H_B ⊗ H_C ⊗ H_D) be a purification of ρ_ABC. Then, by the properties of pure states it holds that S(ABC) = S(D) and S(BC) = S(AD), which yields

S(AC) + S(AD) − S(C) − S(D) = S(AC) + S(BC) − S(C) − S(ABC) ≥ 0, (2.77)

by strong subadditivity, which is exactly the statement of weak monotonicity.

Subadditivity also follows from SSA by setting ρABC= ρAB⊗ ρC.

Additionally, we also have the Araki-Lieb inequality [20]

S(AB) ≥ |S(A) − S(B)|. (2.78)

Our proof of this inequality again uses the purification symmetry, but this time only subadditivity is needed.

Proof. Let ρ_AB ∈ D(H_A ⊗ H_B) be a density operator and let |ψ⟩⟨ψ| ∈ D(H_A ⊗ H_B ⊗ H_C) be a purification of ρ_AB. Then, by the properties of pure states it holds that S(AB) = S(C) and S(AC) = S(B), which yields

S(AB) = S(C) ≥ S(AC) − S(A) (2.79)
= S(B) − S(A), (2.80)

where the first equality follows from the purification symmetry, the inequality is subadditivity and the final step follows from the purification symmetry again. Similarly

S(AB) ≥ S(A) − S(B), (2.81)

which concludes the proof.
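The Araki-Lieb inequality (and subadditivity) can be spot-checked on random two-qubit density operators; the sketch below is a numerical illustration of ours, not a proof:

```python
import numpy as np

def S(rho):
    """Von Neumann entropy in bits."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

rng = np.random.default_rng(2)
araki_lieb_ok = subadditivity_ok = True
for _ in range(100):
    G = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    rho = G @ G.conj().T
    rho /= np.trace(rho).real                        # random two-qubit state
    M = rho.reshape(2, 2, 2, 2)
    S_A = S(np.einsum('aibi->ab', M))                # trace out B
    S_B = S(np.einsum('aiaj->ij', M))                # trace out A
    S_AB = S(rho)
    araki_lieb_ok &= S_AB >= abs(S_A - S_B) - 1e-9   # eq. (2.78)
    subadditivity_ok &= S_A + S_B >= S_AB - 1e-9
```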

Similar to the classical case, we also have a quantum version of the data processing inequality (DPI). Let ρ_AB ∈ D(H_A ⊗ H_B) be a quantum state and let Φ ∈ C(H_B, H_C) be a quantum channel. It holds that

I(A; B)_ρ ≥ I(A; C)_ω, (2.82)

with ω_AC = (I_A ⊗ Φ)(ρ_AB). The DPI is again equivalent to strong subadditivity.

Finally, we have two inequalities in terms of the relative entropy. We start with the quantum equivalent of the non-negativity of the relative entropy, which is known as Klein’s inequality. It states that for P, Q ∈ Pos(X ) with X a complex Euclidean space and Tr(P ) ≥ Tr(Q) it holds that

D(P ||Q) ≥ 0. (2.83)

In particular, this means that for any two density operators ρ and σ on the same Hilbert space HA,

we have

D(ρ||σ) ≥ 0. (2.84)

The proof of this statement can for example be found in [20, prop. 5.22].

The other inequality that uses the relative entropy is the monotonicity of the relative entropy, given by

D(ρ||σ) ≥ D(Φ(ρ)||Φ(σ)) (2.85)

for any quantum channel Φ. This inequality is equivalent to strong subadditivity. That the monotonicity of the relative entropy implies strong subadditivity can be seen as follows:

Let ρ = ρ_ABC ∈ D(H_A ⊗ H_B ⊗ H_C) and σ = ρ_A ⊗ ρ_BC ∈ D(H_A ⊗ H_B ⊗ H_C). Let the map Φ be given by Φ(ρ) = Tr_B(ρ). Then from eq. (2.85) we see that

ρ_ABC(log(ρ_ABC) − log(ρ_A ⊗ ρ_BC)) ≥ ρ_AC(log(ρ_AC) − log(ρ_A ⊗ ρ_C)), (2.86)

which leads to

−S(ABC) + S(A) + S(BC) ≥ −S(AC) + S(A) + S(C) (2.87)

after taking the trace. This statement is exactly strong subadditivity.

Conversely, it also holds that strong subadditivity implies the monotonicity of the relative en-tropy, but this proof is slightly more involved and was first shown to hold by Lindblad in ’74 [28].

To summarise, we list the von Neumann inequalities that will be relevant in this work. These are also written down in appendix A for easy reference. We also refer the interested reader to the excellent review article by Ruskai [29] on quantum entropy inequalities and their relations.

S(A) ≥ 0 (non-negativity of the entropy)
I(A; B) ≥ 0 (subadditivity)
I(A; B|C) ≥ 0 (strong subadditivity)
S(AB) + S(AC) ≥ S(B) + S(C) (weak monotonicity)
S(AB) ≥ |S(A) − S(B)| (Araki-Lieb)
I(A; B)_ρ ≥ I(A; C)_ω for Φ ∈ C(H_B, H_C) (DPI)
D(ρ||σ) ≥ 0 (Klein's inequality)
D(ρ||σ) ≥ D(Φ(ρ)||Φ(σ)) (monotonicity of relative entropy)

2.4.5 Differential entropy

The Shannon and von Neumann entropy functions that we have discussed so far were defined for finite probability distributions and density operators. One could ask if similar quantities exist for probability densities. Such a quantity, called the differential entropy, indeed exists in the classical case and is defined by replacing the sum by an integral and the probability distribution by a prob-ability density in the definition of the Shannon entropy. That means that for a classical random variable X with probability density f (x) on the event space Ω we define the differential Shannon entropy by

H(X) = −∫ f(x) log(f(x)) dx. (2.88)

By convention, the logarithm is taken to be the natural logarithm.

In many cases the differential entropy is not finite, but for certain special cases, like Gaussian distributions, it is finite and can be calculated. In the classical case, we get the following easy expression for the entropy of Gaussian distributions [30]:

H(X) = (1/2) log((2πe)^n |σ|), (2.89)

with σ = E((X − E(X))(X − E(X))^T) the covariance matrix and |σ| = det(σ).
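Formula (2.89) can be checked against a direct numerical integration of eq. (2.88) for a one-dimensional Gaussian (the mean, width, and integration grid below are arbitrary choices of ours; natural logarithm convention):

```python
import math

mu, sigma = 0.0, 1.5  # arbitrary mean and standard deviation
f = lambda x: math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# midpoint-rule integral of -f ln f, eq. (2.88), over a wide interval
a, b, n = -20.0, 20.0, 100000
dx = (b - a) / n
numeric = -sum(f(a + (k + 0.5) * dx) * math.log(f(a + (k + 0.5) * dx)) * dx
               for k in range(n))

closed_form = 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)  # eq. (2.89), n = 1
entropy_err = abs(numeric - closed_form)
```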

All additional entropic quantities from previous sections are defined in exactly the same way for the continuous variable case.

For quantum states on the infinite-dimensional Hilbert space H_A = L²(Rⁿ) the definition of the von Neumann entropy (2.62) for finite quantum systems suffices, because there are still countably many eigenvalues. Note, however, that the entropy can still become infinite. Gaussian quantum states will be of interest, because they, too, allow for an easy expression of the entropy, as we will see in section 6.


2.5 The Markov property

As we have seen, an important notion in (quantum) information theory is the correlation between random variables or states. It is therefore also of interest to know whether a random variable Z gives any additional information about a random variable X when another random variable Y is already known, i.e. whether I(X; Z|Y) > 0 (and similarly for quantum states). When this is not the case, that is when

I(X; Z|Y ) = 0, (2.90)

the system is said to have the Markov property [30]. For classical random variables this corresponds to the statement that

p(X|Y Z) = p(X|Y ). (2.91)

This property is often also denoted as

X → Y → Z, (2.92)

or

X ↔ Y ↔ Z,

since it also holds that p(Z|XY ) = p(Z|Y ), as can be seen from the symmetry of equation (2.90). In the quantum case, the Markov property has a slightly different interpretation. Since a quantum state is in general a superposition, there is no clear way to condition on a quantum state. Instead, we just define the conditional entropies analogous to the classical case: S(A|B) = S(AB) − S(B) as we’ve seen in section 2.4.3. Similarly, we define the Markov property A → B → C for a quantum state ρABC as I(A; C|B) = 0. This is equivalent to saying that there exists a special recovery map

P_{B→BC} : H_B → H_B ⊗ H_C, called a Petz map, such that [31, 32, 33]

(I_A ⊗ P_{B→BC})(ρ_AB) = ρ_ABC. (2.93)

Figure 2: A polyhedral cone in R³. It can most easily be described in terms of its extremal rays, or in terms of the (hyper)planes that form the boundary of the cone.

3 Entropy Cones

It turns out that the possible entropy values of a probability distribution or a quantum state are nicely captured in terms of a conic description. In this chapter we will discuss this description, starting with the formal definition of a cone and the proof that all possible entropy values of distributions and states indeed form a so-called entropy cone. We then investigate whether this cone is completely described by the inequalities that were introduced in the previous chapter. When we discover that this is not always the case, we proceed by looking at Zhang-Yeung type inequalities, which supplement the set of inequalities by Shannon. We will then investigate whether similar inequalities hold in the quantum case.

3.1 Cones and entropy vectors

Let us start by mathematically defining what a cone is.

Definition 3.1 (Convex cone). Let E be a set of points in R^N. E is a convex cone if

(1) ∀X ∈ E, λ ∈ R₊ : λX ∈ E (scaling)
(2) ∀X, Y ∈ E, 0 ≤ λ ≤ 1 : λX + (1 − λ)Y ∈ E (convexity)

Verifying these two conditions is necessary and sufficient to show that a set is a convex cone.

A polyhedral cone allows for two equivalent, but dual descriptions. Consider, for example, the cone in figure 2. One can describe this cone by denoting the vectors

(0, 0, 1), (1, 0, 1), (0, 1, 1)

and considering all linear combinations with non-negative coefficients. These vectors are called the extremal rays of the cone. If possible, the extremal rays are scaled such that they contain integers to increase readability.

Alternatively, one can look at the (hyper)planes that enclose the cone and consider all points that lie within this enclosure. In the case of figure 2 these planes are defined by the equations y = 0, x = 0 and z = x + y. So we are considering all points that obey the inequalities

x ≥ 0,
y ≥ 0,
z − x − y ≥ 0.

The cone in figure 2 is then given by all points that lie within the half-spaces defined by these inequalities.
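For this example, the agreement between the two descriptions can be tested directly: every non-negative combination of the three extremal rays satisfies the three half-space inequalities. A minimal sketch of ours:

```python
import random

# H-description of the cone in figure 2
def in_cone(x, y, z):
    return x >= 0 and y >= 0 and z - x - y >= 0

# V-description: the extremal rays
rays = [(0, 0, 1), (1, 0, 1), (0, 1, 1)]

random.seed(0)
all_inside = True
for _ in range(1000):
    c = [random.uniform(0, 5) for _ in rays]  # non-negative coefficients
    v = tuple(sum(ci * r[k] for ci, r in zip(c, rays)) for k in range(3))
    all_inside = all_inside and in_cone(*v)
```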

Going from the half-space description (often denoted by H-description) to the dual extremal ray description (denoted by V-description, for vertex) is in general an NP-hard problem and therefore becomes exponentially difficult with growing dimensions and more complicated cones [34]. We will encounter this problem later in section 5.3.

Every cone also satisfies two additional properties, known as additivity and approximate diluability [10].

Definition 3.2 (Additivity). Let E be a set of points in R^N. This set is additive if

∀X, Y ∈ E : X + Y ∈ E.

Definition 3.3 (Approximate diluability). Let E be a set of points in R^N. This set is approximately diluable if ∀ε > 0 there exists a δ > 0 such that ∀X ∈ E and 0 ≤ λ ≤ δ there exists a Y ∈ E such that ‖λX − Y‖_∞ ≤ ε, where ‖X‖_∞ = max_{1≤i≤N} |X_i|.

Proposition 3.4. If E is a convex cone, then E is additive and approximately diluable.

Proof. Let E be a convex cone and X, Y ∈ E. Then (1/2)X + (1/2)Y ∈ E by property (2) of definition 3.1 and then X + Y ∈ E by property (1), which shows additivity.

Now let Y = λX for λ ≥ 0; then Y ∈ E by property (1), and thus ‖λX − Y‖_∞ = 0 ≤ ε for all ε ≥ 0, which proves approximate diluability.

However, it turns out that it is not true that every set that satisfies these two properties is a cone. As a counterexample one can take E as the set of rationals, which is a subset of R. This set is additive and approximately diluable, but not a convex cone. Instead, the following theorem holds [10], for which we need the definition of topological closure:

Definition 3.5 (Topological closure). The topological closure of a set S is the set S̄, which contains all points in S and their limit points, i.e. all points for which every neighbourhood contains at least one point in S.

Theorem 3.6 (Pippenger [10]). If E is additive and approximately diluable, then Ē is a convex cone, where Ē is the topological closure of E.

Proof. It is sufficient to prove the properties (1) and (2) from def. 3.1 for Ē.

Let E be an additive and approximately diluable set and let A ∈ Ē and µ, η > 0. Now take ε = η/(µ + 1) and let δ > 0 be as in definition 3.3. Let U ∈ E be such that ‖A − U‖_∞ ≤ ε, which is possible for any ε > 0 by the definition of the topological closure. By additivity, we have mU ∈ E for every positive integer m. Take m = ⌈µ/δ⌉ and λ = µ/m ≤ δ. Finally, let X = mU, such that by approximate diluability there exists a Y ∈ E such that ‖λX − Y‖_∞ = ‖µU − Y‖_∞ ≤ ε. Then, by the triangle inequality,

‖µA − Y‖_∞ ≤ ‖µA − µU‖_∞ + ‖µU − Y‖_∞ ≤ µε + ε = η, (3.1)

which proves that Ē obeys property (1), by the definition of the topological closure.

Now let X, Y ∈ Ē and 0 ≤ λ ≤ 1. The following shows that ∀ζ > 0 there exists a point Z ∈ E such that

‖λX + (1 − λ)Y − Z‖_∞ ≤ ζ,

which proves that λX + (1 − λ)Y ∈ Ē and thus (2) holds. Choose A = X, µ = λ and η = λζ in eq. (3.1). We get

‖λX − V‖_∞ ≤ λζ,

for some point V ∈ E. Similarly,

‖(1 − λ)Y − W‖_∞ ≤ (1 − λ)ζ,

for another point W ∈ E. Now let Z = V + W. Then by additivity and the triangle inequality

‖λX + (1 − λ)Y − Z‖_∞ ≤ ‖λX − V‖_∞ + ‖(1 − λ)Y − W‖_∞ ≤ λζ + (1 − λ)ζ = ζ,

for all ζ > 0.

By virtue of theorem 3.6, we can show that the closure of the set of vectors comprised of the entropies of all possible subsystems of a quantum state also forms a convex cone. Let us first define what such an entropy vector looks like.

Definition 3.7 (Entropy vector). The collection of entropies {S(ρ_I)}_{I⊂[n]} is called the entropy vector for the quantum state ρ ∈ D(H_1 ⊗ … ⊗ H_n). This is a vector in R^{2^n − 1}, because we omit the empty set. Let A_n ⊆ R^{2^n − 1} be the set of entropy vectors of all such n-part quantum states.

We can similarly define entropy vectors for classical states and weakly symmetric states. We treat both as special cases of quantum entropy vectors, even though the classical entropy vector could be defined without reference to the quantum entropy vector. With these definitions we will be able to unify the proofs that the closures of these sets form cones.

Definition 3.8 (Classical entropy vector). A classical entropy vector {H(X_I)}_{I⊂[n]} of a random variable X = X_1 … X_n with probability distribution p(x_1, …, x_n) is defined by the entropy vector of the classical state

ρ = Σ_{x_1∈Γ_1, …, x_n∈Γ_n} p(x_1, …, x_n) |x_1⟩⟨x_1| ⊗ … ⊗ |x_n⟩⟨x_n|

with the same probability distribution, such that H(X_I) = S(ρ_I). The set of classical entropy vectors is denoted by A_n^class.

Definition 3.9 (Weakly symmetric quantum state). We say that an n-part quantum state is weakly symmetric if S(ρ_I) depends on I only through |I|, the number of elements in I. The set of entropy vectors of weakly symmetric n-part quantum states is denoted by C_n.

For |I| = i, we thus have (n choose i) equal entries S(ρ_i). The entropy vector of a weakly symmetric quantum state can therefore be denoted as a vector in R^n, where we again haven't included the empty set.

We are now ready to prove that the closures of the sets of entropy vectors form cones.

Theorem 3.10 (Pippenger [10]). The sets Ā_n, Ā_n^class and C̄_n are cones.

Proof. To prove that Ā_n is a cone, it is by theorem 3.6 sufficient to show that A_n is additive and approximately diluable.

Additivity: Let ρ_X, ρ_Y ∈ D((C^d)^⊗n) be n-part quantum states with entropy vectors {S(ρ_{X,I})}_{I⊂[n]} and {S(ρ_{Y,I})}_{I⊂[n]} respectively. Now let ρ_Z = ρ_X ⊗ ρ_Y, which by grouping the two i-th factors together can be written as a state on (C^{d²})^⊗n and is thus still an n-part quantum state. The eigenvalues of ρ_{Z,I} = Tr_{I^c}(ρ_Z) are the products λ_{X,I,x} λ_{Y,I,y}, with {λ_{X,I,x}} the set of eigenvalues of ρ_{X,I} = Tr_{I^c}(ρ_X) and similarly for ρ_{Y,I}. The entropy of ρ_{Z,I} then becomes

S(ρ_{Z,I}) = −Σ_{x,y} λ_{X,I,x} λ_{Y,I,y} log(λ_{X,I,x} λ_{Y,I,y})
= −Σ_{x,y} λ_{X,I,x} λ_{Y,I,y} log(λ_{X,I,x}) − Σ_{x,y} λ_{X,I,x} λ_{Y,I,y} log(λ_{Y,I,y})
= −Σ_x λ_{X,I,x} log(λ_{X,I,x}) − Σ_y λ_{Y,I,y} log(λ_{Y,I,y})
= S(ρ_{X,I}) + S(ρ_{Y,I}),

which proves additivity.

Approximate diluability: Choose ε > 0 and 0 < δ ≤ 1/2 such that h(δ) ≤ ε, with h(x) the binary Shannon entropy. Suppose µ ≤ δ. Let ρ_X be an n-part quantum state. Now construct an n-part quantum state ρ_Y such that

‖µS(ρ_{X,I}) − S(ρ_{Y,I})‖_∞ ≤ ε

in the following way: Let ρ_Y = µρ_X ⊕ (1 − µ)|0⟩⟨0|^⊗n ∈ D((C^{d+1})^⊗n). Now calculate S(ρ_{Y,I}):

S(ρ_{Y,I}) = −Σ_x µλ_{X,I,x} log(µλ_{X,I,x}) − (1 − µ) log(1 − µ)
= −µ Σ_x λ_{X,I,x} log(λ_{X,I,x}) − µ log(µ) − (1 − µ) log(1 − µ)
= µS(ρ_{X,I}) + h(µ),

such that indeed

‖µS(ρ_{X,I}) − S(ρ_{Y,I})‖_∞ = h(µ) ≤ ε.

This proves approximate diluability, which concludes the proof that Ā_n is a convex cone.

Note that this also almost immediately proves that the closure of the set of classical entropy vectors is a cone, since classical states are mapped into classical states by the constructions in the proof above. Similarly, weakly symmetric states are mapped into weakly symmetric states. We thus conclude that Ā_n, Ā_n^class and C̄_n are cones.

Classical states and weakly symmetric states thus each generate individual entropy cones. It is also clear from the proof that A_n^class ⊆ A_n and C_n ⊆ A_n.

3.1.1 Shannon and von Neumann cones

In the previous section, we've seen that a cone can be described by a set of inequalities that denote the half-spaces in which the points in the cone have to lie. This poses the question which inequalities define the boundary of the entropy cones. The first sets of inequalities that come to mind are the Shannon and von Neumann inequalities that were outlined in sections 2.4.2 and 2.4.4. These sets of inequalities indeed form cones for all n and are denoted by B_n^class for the classical Shannon cone and by B_n for the von Neumann cone. Since all probability distributions have to obey the Shannon inequalities and all quantum states have to obey the von Neumann inequalities, it is clear that the cones B_n^class and B_n are at least outer approximations of the actual entropy cones Ā_n^class and Ā_n.

Pippenger [10] has shown that for n ≤ 3 the classical entropy cone and the Shannon cone coincide, as well as the quantum entropy cone and the von Neumann cone. We thus have

( Aclass

n = Bclassn

An = Bn

(27)

Figure 3: The cone from figure 2 with a new inequality, given by the blue plane, that cuts off a piece of this cone, defining a new, tighter cone

For n ≥ 4 it is unknown whether the von Neumann inequalities are sufficient for quantum states and it has in fact been shown that the Shannon inequalities are insufficient for classical probability distributions [11]. We will elaborate on this in the next section.

3.2 Beyond Shannon inequalities

It was first shown by Zhang and Yeung [11] that for systems of n ≥ 4 parties the Shannon inequalities are insufficient to describe the classical entropy cone. They proved a new inequality that all probability distributions have to obey, but that cannot be derived from any combination of Shannon inequalities for four parties. This inequality, from now on referred to as the Zhang-Yeung inequality, is given for random variables X_i, i ∈ {1, 2, 3, 4} by

2I(X1; X2|X3) + I(X1; X2|X4) + I(X3; X4) − I(X1; X2) + I(X1; X3|X2) + I(X2; X3|X1) ≥ 0.   (3.3)

Note that the term that makes this inequality non-trivial is −I(X1; X2), because without this term the inequality would simply follow from repeated application of (strong) subadditivity.
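Inequality (3.3) is easy to probe numerically. The sketch below (our own illustration; the function names `H`, `I` and `zhang_yeung` are ad hoc) evaluates the left-hand side of (3.3) on random joint distributions of four binary variables and confirms it is never negative, up to floating-point noise.

```python
import numpy as np

def H(p, subset):
    """Shannon entropy (bits) of the marginal on the axes in `subset`."""
    axes = tuple(i for i in range(p.ndim) if i not in subset)
    marg = np.asarray(p.sum(axis=axes)).ravel()
    marg = marg[marg > 1e-15]
    return float(-np.sum(marg * np.log2(marg)))

def I(p, A, B, C=()):
    """Conditional mutual information I(A;B|C) in bits."""
    A, B, C = set(A), set(B), set(C)
    return H(p, A | C) + H(p, B | C) - H(p, A | B | C) - H(p, C)

def zhang_yeung(p):
    # LHS of eq. (3.3); axes 0..3 stand for X1..X4.
    return (2 * I(p, {0}, {1}, {2}) + I(p, {0}, {1}, {3})
            + I(p, {2}, {3}) - I(p, {0}, {1})
            + I(p, {0}, {2}, {1}) + I(p, {1}, {2}, {0}))

rng = np.random.default_rng(0)
for _ in range(100):
    p = rng.random((2, 2, 2, 2))
    p /= p.sum()  # normalize to a joint distribution p(x1,x2,x3,x4)
    assert zhang_yeung(p) >= -1e-9
```

Of course, a random search only illustrates the inequality; the actual proof via the copy lemma follows later in this section.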

The meaning of the Zhang-Yeung inequality is still poorly understood. As we've seen for the Shannon inequalities in section 2.4.2, there is an intuitive explanation for why they should hold. For the Zhang-Yeung inequality, such an explanation has not yet been found. It is even unclear how to write the inequality in order to see its meaning. One could, for example, write eq. (3.3) in terms of Shannon entropies instead of (conditional) mutual informations, but it would still be unclear how to interpret the inequality.

Returning to the conic description, the Zhang-Yeung inequality, combined with the already known Shannon-type inequalities, defines a new, tighter cone (B^class_n)' for n ≥ 4. To get a better intuition, consider the cone we've previously seen in figure 2 and imagine a piece of this cone is cut off by a new inequality as illustrated in figure 3 (note, however, that this is only an illustration and not the actual Shannon cone for a 4-party probability distribution). The new cone now forms a tighter outer approximation of the true 4-party entropy cone A^class_n. Again, we can ask whether (B^class_n)' = A^class_n, but again this turns out not to be true. Several more independent non-Shannon-type inequalities were found by Dougherty et al. [35, 36]. Later, it was even shown by Matúš [37] that there exist infinite families of linearly independent non-Shannon-type inequalities. This immediately implies that our current best approximation to the classical 4-party entropy cone is not polyhedral.


Let us now give a proof of the Zhang-Yeung inequality by using the copy lemma, the proof of which is repeated below.

Lemma 3.11 (Copy lemma). Let X1, X2, X3, X4 be random variables with joint probability distribution p(X1, X2, X3, X4) and alphabets 𝒳1, …, 𝒳4. Then there exists a random variable X5 with alphabet 𝒳5, jointly distributed with X1, X2, X3, X4, such that

1. p(X1, X2, X3) = p(X1, X2, X5),
2. I(X3X4; X5|X1X2) = 0,

where p(X1, X2, X3) is the marginal probability distribution of (X1, X2, X3) and similarly for p(X1, X2, X5). If these properties hold, we call X5 an X4-copy of X3 over (X1X2).

Proof. Choose 𝒳5 = 𝒳3. Let x1 ∈ 𝒳1 be an element of the alphabet 𝒳1 and similarly for the other random variables. Now define the joint probability distribution of X1, …, X5 as

p′(x1, x2, x3, x4, x5) := p(x1, x2, x3, x4) p′(x5),   (3.4)

with

p′(x5) := [∑_{x4∈𝒳4} p(x1, x2, x5, x4)] / [∑_{x3∈𝒳3, x4∈𝒳4} p(x1, x2, x3, x4)]   ∀ xi ∈ 𝒳i, i ∈ {1, 2, 3, 4, 5}   (3.5)

(note that p′(x5) implicitly depends on x1 and x2 as well).

This is a valid probability distribution, which can easily be checked by summing over x5:

∑_{x5∈𝒳5} p′(x1, x2, x3, x4, x5) = p(x1, x2, x3, x4) [∑_{x5∈𝒳5, x4∈𝒳4} p(x1, x2, x5, x4)] / [∑_{x3∈𝒳3, x4∈𝒳4} p(x1, x2, x3, x4)]   (3.6)
                                 = p(x1, x2, x3, x4) p(x1, x2) / p(x1, x2)   (3.7)
                                 = p(x1, x2, x3, x4).   (3.8)

One can check property 1 by comparing the marginal probability distributions

∑_{x4, x5} p′(x1, x2, x3, x4, x5) = ∑_{x4} p(x1, x2, x3, x4)   (3.9)

and

∑_{x3, x4} p′(x1, x2, x3, x4, x5) = [∑_{x3, x4} p(x1, x2, x3, x4)] [∑_{x4} p(x1, x2, x5, x4)] / [∑_{x3, x4} p(x1, x2, x3, x4)]   (3.10)
                                  = ∑_{x4} p(x1, x2, x5, x4)   (3.11)

and noting that they are the same, with x5 replacing x3.

The proof of property 2 is slightly more involved and is most easily done in terms of the Shannon entropy. Recall that we can write condition 2 as

H(X1X2X3X4X5) = H(X1X2X3X4) + H(X1X2X5) − H(X1X2).   (3.12)

Calculating the Shannon entropy, we get

H(X1X2X3X4X5) = −∑_{x1,…,x5} p′(x1, x2, x3, x4, x5) log(p′(x1, x2, x3, x4, x5))
              = −∑_{x1,…,x5} p′(x1, x2, x3, x4, x5) [ log(p(x1, x2, x3, x4)) + log(∑_{x4} p(x1, x2, x5, x4)) − log(∑_{x3, x4} p(x1, x2, x3, x4)) ]
              = H(X1X2X3X4) + H(X1X2X5) − H(X1X2),

which proves property 2.
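The construction in the proof translates directly into a few lines of numpy (our own sketch; the variable name `pp` for p′ and the helper `H` are ad hoc). For a random four-variable distribution it builds p′ from eqs. (3.4)-(3.5) and checks that p′ extends p, that property 1 holds, and that property 2 holds in the form of eq. (3.12):

```python
import numpy as np

def H(p):
    """Shannon entropy (bits) of a probability array."""
    q = np.asarray(p).ravel()
    q = q[q > 1e-15]
    return float(-np.sum(q * np.log2(q)))

rng = np.random.default_rng(1)
p = rng.random((2, 2, 2, 2))  # joint p(x1,x2,x3,x4), all binary
p /= p.sum()

p12 = p.sum(axis=(2, 3))      # p(x1,x2)
p123 = p.sum(axis=3)          # p(x1,x2,x3)

# p'(x1,x2,x3,x4,x5) = p(x1,x2,x3,x4) * p(x1,x2,x5) / p(x1,x2)
pp = (p[:, :, :, :, None] * p123[:, :, None, None, :]
      / p12[:, :, None, None, None])

# p' is a valid extension of p (eqs. (3.6)-(3.8))
assert np.allclose(pp.sum(axis=4), p)
# Property 1: p'(x1,x2,x5) = p(x1,x2,x3) with x5 in place of x3
assert np.allclose(pp.sum(axis=(2, 3)), p123)
# Property 2 via eq. (3.12): H(12345) = H(1234) + H(125) - H(12)
assert abs(H(pp) - (H(p) + H(p123) - H(p12))) < 1e-9
```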
