

average-case complexity under the uniform distribution.

If a ≥ 3/10, the expected number of queries for step 2 is

$$\sum_{i=100}^{\log N} \Pr[\tilde{a}_1 \le 2/10, \ldots, \tilde{a}_{i-1} \le 2/10 \mid a \ge 3/10] \cdot i\,c \;\le\; \sum_{i=100}^{\log N} \Pr[\tilde{a}_{i-1} \le 2/10 \mid a \ge 3/10] \cdot i\,c \;\le\; \sum_{i=100}^{\log N} 2^{-(i-1)} \cdot i\,c \;\in\; O(1).$$
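As a quick numeric sanity check (not part of the proof), the geometric tail sum bounding step 2's expected queries, ∑_{i≥100} 2^{-(i-1)}·i·c, can be evaluated directly; the value c = 1 below is a stand-in, since the argument holds for any constant c:

```python
# The constant c from step 2 of the algorithm; its exact value does not matter.
c = 1

# Truncating the sum at i = 2000 is harmless: the summand decays geometrically.
tail = sum(2 ** -(i - 1) * i * c for i in range(100, 2000))
assert tail < 1  # step 2 contributes only O(1) expected queries
print(tail)      # a tiny number, far below any constant bound
```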

The probability that step 4 is needed (given a ≥ 3/10) is at most 2^{-(c log N)/c} = 1/N. This adds (1/N)·N = 1 to the expected number of queries.

Under the uniform distribution, the probability of the event a < 3/10 is at most 2^{-cN} for some constant c. This case contributes at most 2^{-cN}(N + (log N)^2) ∈ o(1) to the expected number of queries. Thus in total the algorithm uses O(1) queries on average, hence R_2^{unif}(f) ∈ O(1). Since Q_2^{unif}(f) ≤ R_2^{unif}(f), we also have Q_2^{unif}(f) ∈ O(1).

Since a deterministic classical algorithm for f must be correct on every input x, it is easy to see that it must make at least N/10 queries on every input, hence

D^{unif}(f) ≥ N/10. □

Accordingly, we can have huge gaps between D^{unif}(f) and Q_2^{unif}(f). However, this example tells us nothing about the gaps between quantum and classical bounded-error algorithms. In the next section we exhibit an f where Q_2^{unif}(f) is exponentially smaller than the classical bounded-error complexity R_2^{unif}(f).

5.4 Average-Case: Randomized vs. Quantum

5.4.1 The function

We use the following modification of Simon's problem from Section 1.5.²

Modified Simon's problem:

We are given x = (x_1, . . . , x_{2^n}), with x_i ∈ {0,1}^n. We want to compute a Boolean function defined by: f(x) = 1 iff there is a non-zero k ∈ {0,1}^n such that for all i ∈ {0,1}^n we have x_{i⊕k} = x_i.

Here we treat i ∈ {0,1}^n both as an n-bit string and as a number between 1 and 2^n, and ⊕ denotes bitwise XOR (addition modulo 2). Note that this function is total, unlike Simon's original promise function. Formally, f is not a Boolean function because the variables x_i are {0,1}^n-valued. However, we can replace

²The preprint [90] proves a related but incomparable result about another modification of Simon's problem.

every variable x_i by n Boolean variables and then f becomes a Boolean function of N = n2^n variables. The number of queries needed to compute the Boolean function is at least the number of queries needed to compute the function with {0,1}^n-valued variables (because we can simulate a query to the Boolean input-variables by means of a query to the {0,1}^n-valued input-variables, just ignoring the n − 1 bits we are not interested in) and at most n times the number of queries to the {0,1}^n-valued input variables (because one {0,1}^n-valued query can be simulated using n Boolean queries). As the numbers of queries are so closely related, it does not make a big difference whether we use the {0,1}^n-valued input variables or the Boolean input variables. For simplicity we count queries to the {0,1}^n-valued input variables.
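To make the definition concrete, here is a small brute-force sketch of f in Python. The function name `simon_f` and the encoding of each x_i ∈ {0,1}^n as an integer in range(2^n) are ours, for illustration only; the brute force is exponential and says nothing about query complexity:

```python
def simon_f(x, n):
    """f(x) = 1 iff some non-zero k in {0,1}^n satisfies x[i XOR k] == x[i] for all i.

    x is a tuple of 2^n values; each x[i] is an int in range(2**n),
    standing in for a string in {0,1}^n."""
    return int(any(all(x[i ^ k] == x[i] for i in range(2 ** n))
                   for k in range(1, 2 ** n)))

print(simon_f((3, 3, 3, 3), 2))  # 1: a constant input has every non-zero k as a period
print(simon_f((0, 1, 2, 3), 2))  # 0: all values distinct, so no non-zero period exists
```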

We are interested in the average-case complexity of this function. The main result is the following exponential gap, to be proven in the next sections:

5.4.1. Theorem (Ambainis & de Wolf [13]). For f as above, we have that Q_2^{unif}(f) ≤ 22n + 1 and R_2^{unif}(f) ∈ Ω(2^{n/2}).

5.4.2 Quantum upper bound

Our quantum algorithm for f is similar to Simon's. Start with the 2-register superposition

$$\sum_{i \in \{0,1\}^n} |i\rangle|\vec{0}\rangle$$

(for convenience we ignore normalizing factors). Apply a query to obtain

$$\sum_{i \in \{0,1\}^n} |i\rangle|x_i\rangle.$$

Measuring the second register gives some j and collapses the first register to

$$\sum_{i : x_i = j} |i\rangle.$$

Applying a Hadamard transform to each qubit of the first register gives

$$\sum_{i : x_i = j} \sum_{i' \in \{0,1\}^n} (-1)^{(i,i')} |i'\rangle. \qquad (5.1)$$

Notice that if f(x) = 1, then |i'⟩ has non-zero amplitude only if (k, i') = 0. Hence if f(x) = 1, measuring the final state gives some i' orthogonal to the unknown k.

To decide if f(x) = 1, we repeat the above process m = 22n times. Let i_1, . . . , i_m ∈ {0,1}^n be the results of the m measurements. If f(x) = 1, there must be a non-zero k that is orthogonal to all i_r (r ∈ {1, . . . , m}). Compute the subspace S ⊆ {0,1}^n that is generated by i_1, . . . , i_m (i.e., S is the set of binary vectors obtained by taking linear combinations of i_1, . . . , i_m over GF(2)).
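This classical post-processing step (deciding whether i_1, . . . , i_m generate all of {0,1}^n) amounts to computing a rank over GF(2). A sketch, with n-bit vectors encoded as integers and the helper name `spans_all` our own:

```python
def spans_all(vectors, n):
    """Return True iff the given n-bit vectors (encoded as ints) span {0,1}^n
    over GF(2). Standard Gaussian elimination with an 'xor basis'."""
    basis = [0] * n          # basis[b]: a stored vector whose highest set bit is b
    rank = 0
    for v in vectors:
        for b in range(n - 1, -1, -1):
            if not (v >> b) & 1:
                continue     # bit b of v is 0: nothing to eliminate here
            if basis[b] == 0:
                basis[b] = v # v becomes a new basis vector with leading bit b
                rank += 1
                break
            v ^= basis[b]    # eliminate bit b of v and keep reducing
    return rank == n

print(spans_all([0b011, 0b101, 0b100], 3))  # True: rank 3, so only k = 0^n remains
print(spans_all([0b011, 0b101, 0b110], 3))  # False: 3 ^ 5 ^ 6 == 0, rank only 2
```

If `spans_all` returns True, the only k orthogonal to all i_r is 0^n and the algorithm can safely output f(x) = 0.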

If S = {0,1}^n, then the only k that is orthogonal to all i_r is k = 0^n (clearly i_r · 0^n = 0 for all i_r), so then we know that f(x) = 0. If S ≠ {0,1}^n, we just query all 2^n values x_{0...0}, . . . , x_{1...1} and then compute f(x). Of course, this latter step is very expensive, but it is needed only rarely:

5.4.2. Lemma. Assume that x = (x_0, . . . , x_{2^n−1}) is chosen uniformly at random from {0,1}^N. Then, with probability at least 1 − 2^{-n}, f(x) = 0 and the measured i_1, . . . , i_m generate {0,1}^n.

Proof. It can be shown by a small modification of [4, Theorem 5.1, p. 91] that with probability at least 1 − 2^{-c2^n} (c > 0), there are at least 2^n/8 values j such that x_i = j for exactly one i ∈ {0,1}^n (and hence f(x) = 0). We assume that this is the case in the following.

If i_1, . . . , i_m generate a proper subspace of {0,1}^n, then there is a non-zero k ∈ {0,1}^n that is orthogonal to this subspace. We estimate the probability that this happens. Consider some fixed non-zero vector k ∈ {0,1}^n. The probability that i_1 and k are orthogonal is at most 15/16, as follows. With probability at least 1/8, the measurement of the second register gives a j such that x_i = j for a unique i. In this case, the measurement of the final superposition (5.1) gives a uniformly random i. The probability that a uniformly random i has (k, i) ≠ 0 is 1/2. Therefore, the probability that (k, i_1) = 0 is at most 1 − (1/8)·(1/2) = 15/16.

The vectors i_1, . . . , i_m are chosen independently. Therefore, the probability that k is orthogonal to each of them is at most (15/16)^m = (15/16)^{22n} < 2^{-2n}. There are 2^n − 1 possible non-zero k, so the probability that there is a k that is orthogonal to each of i_1, . . . , i_m is ≤ (2^n − 1)·2^{-2n} < 2^{-n}. □

Note that this algorithm is actually a zero-error algorithm: it always outputs the correct answer. Its expected number of queries on a uniformly random input is at most m = 22n for generating i_1, . . . , i_m and at most 2^{-n}·2^n = 1 for querying all the x_i if the first step does not give i_1, . . . , i_m that generate {0,1}^n. This completes the proof of the first part of Theorem 5.4.1. In contrast, in Section 5.4.4 we show that the worst-case zero-error quantum complexity of f is Ω(N), which is near-maximal.

5.4.3 Classical lower bound

Let D_1 be the uniform distribution over all inputs x ∈ {0,1}^N and D_2 be the uniform distribution over all x for which there is a unique k ≠ 0 such that x_i = x_{i⊕k} (and hence f(x) = 1). We say that an algorithm A distinguishes between D_1 and D_2 if the average probability that A outputs 0 is ≥ 2/3 under D_1 and the average probability that A outputs 1 is ≥ 2/3 under D_2.

5.4.3. Lemma. If there is a bounded-error algorithm A that computes f with m = T_A^{unif} queries on average, then there is an algorithm that distinguishes between D_1 and D_2 and uses O(m) queries on all inputs.

Proof. Without loss of generality we assume A has error probability ≤ 1/10.

Under D_1, the probability that A outputs 1 is at most 1/10 + o(1) (1/10 is the maximum probability of error on an input with f(x) = 0, and o(1) is the probability of getting an input with f(x) = 1), so the probability that A outputs 0 is at least 9/10 − o(1). We run A until it stops or makes 10m queries. The average probability (under D_1) that A does not stop before 10m queries is at most 1/10, for otherwise the average number of queries would be more than (1/10)·(10m) = m.

Therefore the probability under D_1 that A outputs 0 after at most 10m queries is at least (9/10 − o(1)) − 1/10 = 4/5 − o(1). In contrast, the D_2-probability that A outputs 0 is ≤ 1/10, because f(x) = 1 for any input x from D_2. We can use this to distinguish D_1 from D_2. □
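The distinguisher built in this proof can be sketched as a wrapper that cuts A off after 10m queries. The interface `A(x, budget)` and the toy collision-finding stand-in below are hypothetical, purely to illustrate the construction:

```python
def distinguish(A, x, m):
    """Run A on input x for at most 10*m queries; A is assumed to return its
    answer (0 or 1), or None if it did not halt within the budget."""
    answer = A(x, budget=10 * m)
    if answer is None:      # A exceeded its budget; under D_1 this happens with prob. <= 1/10
        return "D1"
    return "D2" if answer == 1 else "D1"

def toy_A(x, budget):
    """Toy stand-in for A: scans x and answers 1 iff it sees a repeated value."""
    seen = set()
    for i, v in enumerate(x):
        if i >= budget:
            return None     # out of queries
        if v in seen:
            return 1
        seen.add(v)
    return 0

print(distinguish(toy_A, [0, 1, 2, 3], m=1))  # no repeat within 10 queries -> "D1"
print(distinguish(toy_A, [0, 1, 1, 3], m=1))  # repeated value found -> "D2"
```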

5.4.4. Lemma. A classical randomized algorithm A that makes m ∈ o(2^{n/2}) queries cannot distinguish between D_1 and D_2.

Proof. Suppose m ∈ o(2^{n/2}). For a random input from D_1, the probability that all answers to m queries are different is

$$\prod_{i=1}^{m-1}\left(1 - \frac{i}{2^n}\right) \;\ge\; 1 - \binom{m}{2}\frac{1}{2^n} \;=\; 1 - o(1).$$

For a random input from D_2, the probability that there is an i such that A queries both x_i and x_{i⊕k} (k is the hidden vector) is at most

$$\binom{m}{2}\frac{1}{2^n - 1} \;\in\; o(1),$$

since k is uniformly distributed over the 2^n − 1 non-zero vectors. If no pair x_i, x_{i⊕k} is queried, the probability that all answers are different is again 1 − o(1). It is easy to see that all sequences of m different answers are equally likely.

Therefore, for both distributions D_1 and D_2, we get a uniformly random sequence of m different values with probability 1 − o(1) and something else with probability o(1). Thus A cannot "see" the difference between D_1 and D_2 with sufficient probability to distinguish between them. □
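The birthday-style estimate under D_1 can be checked exactly for small parameters: the probability that m independent uniform n-bit answers are all distinct is a simple product, and it dominates the bound 1 − C(m, 2)/2^n used above (the helper name `all_distinct_prob` is ours):

```python
from math import comb, prod

def all_distinct_prob(m, n):
    """Exact probability that m independent uniform n-bit strings are all distinct."""
    return prod(1 - i / 2 ** n for i in range(m))

n, m = 20, 100   # m is far below 2^(n/2) = 1024
p = all_distinct_prob(m, n)
assert p >= 1 - comb(m, 2) / 2 ** n  # the bound from the proof
print(p)  # close to 1: m queries almost never see a repeated answer
```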

The second part of Theorem 5.4.1 now follows: a classical algorithm that computes f with an average number of m queries can be used to distinguish between D_1 and D_2 with O(m) queries (Lemma 5.4.3), but then O(m) ∈ Ω(2^{n/2}) (Lemma 5.4.4).

5.4.4 Worst-case quantum complexity of f

For the sake of completeness, we will here show a lower bound of Ω(N) queries for the zero-error worst-case complexity Q_0(f) of the function f on N = n2^n binary variables defined in Section 5.4. (We count binary queries this time.) Consider a quantum algorithm that makes at most T queries and that, for every x, outputs either the correct output f(x) or, with probability ≤ 1/2, outputs "don't know".

Consider the polynomial P which is the acceptance probability of our T-query algorithm for f. It has the following properties:

1. P has degree d ≤ 2T.

2. If f(x) = 0 then P(x) = 0.

3. If f(x) = 1 then P(x) ∈ [1/2, 1].

We first show that only very few inputs x ∈ {0,1}^N make f(x) = 1. The number of such 1-inputs for f is at most the number of ways to choose k ∈ {0,1}^n − {0^n}, times the number of ways to choose 2^n/2 independent x_i ∈ {0,1}^n. This is (2^n − 1)·(2^n)^{2^n/2} < 2^{n(2^n/2+1)}. Accordingly, the fraction of 1-inputs among all 2^N inputs x is < 2^{n(2^n/2+1)}/2^{n2^n} = 2^{-n(2^n/2−1)}. These x are exactly the x that make P(x) ≠ 0.
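For n = 2 this counting can be verified by brute force (an illustration, not part of the proof): the bound (2^n − 1)·(2^n)^{2^n/2} = 3·16 = 48 over-counts inputs that have more than one valid k, and the exact count by inclusion-exclusion is 3·16 − 3·4 + 4 = 40:

```python
from itertools import product

n = 2
count = 0
for x in product(range(2 ** n), repeat=2 ** n):  # all 4^4 = 256 inputs
    # f(x) = 1 iff some non-zero k is a period of x
    if any(all(x[i ^ k] == x[i] for i in range(2 ** n))
           for k in range(1, 2 ** n)):
        count += 1

print(count)  # 40
assert count < (2 ** n - 1) * (2 ** n) ** (2 ** n // 2)  # the bound of 48
```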

However, the following result is known [148, 133]:

5.4.5. Lemma (Schwartz). If P is a non-constant N-variate multilinear polynomial of degree d, then

$$\frac{|\{x \in \{0,1\}^N \mid P(x) \ne 0\}|}{2^N} \;\ge\; 2^{-d}.$$

This implies d ≥ n(2^n/2 − 1) and hence T ≥ d/2 ≥ n(2^n/4 − 1/2) ≈ N/4.

Thus we have proved that the worst-case zero-error quantum complexity of f is near-maximal:

5.4.6. Theorem (Ambainis & de Wolf [13]). Q_0(f) ∈ Ω(N).