SHUFFLING CARDS with Random Transpositions

Bachelor’s Project Mathematics

Name: A.F.G. Pelzer

First supervisor: Dr. D. Rodrigues Valesin

Second supervisor: Prof. Dr. A.C.D. van Enter

Date: 2 July 2018

Contents

1. Introduction
2. Shuffling cards described mathematically
2.1. Preliminaries on Group Theory
2.2. Preliminaries on Markov Chains and Random Walks
2.3. Shuffling cards in mathematics
2.4. Examples
3. Random Transpositions
3.1. Natural method
3.2. The lower bound for the variation distance
3.3. Preliminaries on Representation Theory
3.4. The upper bound for the variation distance
4. Conclusion
5. Epilogue
References


1. Introduction

Playing card games have entertained people for many centuries. The first playing card set was made in China in the ninth century, and playing cards first appeared in Europe around 1370-1400. The most widely used card game was created in France around the year 1480. [12][11]

In this game the playing cards are numbered in the order Ace, 2, 3, 4, ..., 10, J (Jack), Q (Queen) and K (King), and the cards are divided into four suits with the following symbols: Hearts, Diamonds, Clubs and Spades. These symbols stand for the four medieval classes: the Hearts for the clergy, the Spades for the nobility, the Diamonds for the merchants and the Clubs for the farmers.

Other playing card sets can have different symbols; for example, some have Cups instead of Hearts. [10][11]

Since the end of the 17th century the Kings, Queens and Jacks in this French card game have carried fixed names: the Jack of Diamonds is Hector (the hero of the Trojan war, or Lancelot's brother), the Jack of Clubs is Lancelot (a Knight of the Round Table), the Queen of Spades is Pallas (the Greek goddess of wisdom and art) [13], the King of Diamonds is Julius Caesar, and so on. Another interesting fact is that the women on the cards are not actually queens and are not the wives of the Kings.

As in the card game just described, each card has a unique face and a back, and within one deck all backs are identical. Furthermore, every card has the same size and shape.

A card game is played with a deck of $n$ cards, where $n$ can be 32, 52, 54 and so on. The French card game uses 52 cards. [10][11]

Therefore there are $52!$ different ways to put a deck of 52 cards in some order, and $52! \approx 8.0658 \times 10^{67}$, which is a huge number. So for this game it is impossible to inspect all possible orderings of the playing cards, though for a deck with only a few cards we could.
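As a quick check of this arithmetic, one can compute $52!$ directly, e.g. in Python:

```python
import math

# Number of distinct orderings of a standard 52-card deck.
print(math.factorial(52))            # 80658175170943878... (68 digits)
print(f"{math.factorial(52):.9e}")   # ~8.065817517e+67
```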

However, we can never see directly whether a deck of $n$ cards is well shuffled, no matter what $n$ is.

First we will describe shuffling cards mathematically, using Group Theory, Markov chains and Random Walks, in Section 2.

In Section 3.1 the Natural method will be described. This is a shuffling method in which we use random transpositions. The main question is therefore as follows.

How many random transpositions do we need to shuffle a deck of n cards until it is well shuffled?

To answer this main question we will prove theorems giving a lower bound and an upper bound for the variation distance, in Sections 3.2 and 3.4. For the proof of the upper bound we will use Representation Theory, described in Section 3.3. The last section presents the conclusions about shuffling cards.


2. Shuffling cards described mathematically

2.1. Preliminaries on Group Theory.

An important group for describing shuffling cards mathematically is the symmetric group $S_n$. This is the group whose elements are the bijective functions on the set $\{1, 2, \ldots, n\}$, with composition of functions as the group operation. The group $S_n$ has the identity element $\mathrm{id}$, given by $\mathrm{id}(k) = k$, and for every $\sigma \in S_n$ there exists an inverse $\sigma^{-1} \in S_n$. Before we describe shuffling cards mathematically, we need some more information about cyclic groups, cycles and permutations. For this basic material on Group Theory, see [16][19][15].
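As a concrete aside, a permutation $\sigma \in S_n$ can be stored as the tuple $(\sigma(1), \ldots, \sigma(n))$; a minimal Python sketch of the group operations (the helper names are our own):

```python
# A permutation sigma in S_n stored as a tuple p with p[i-1] = sigma(i).
def compose(sigma, tau):
    """Return sigma∘tau, i.e. the permutation i -> sigma(tau(i))."""
    return tuple(sigma[tau[i - 1] - 1] for i in range(1, len(sigma) + 1))

def inverse(sigma):
    """Return sigma^{-1}."""
    inv = [0] * len(sigma)
    for i, j in enumerate(sigma, start=1):
        inv[j - 1] = i
    return tuple(inv)

identity = (1, 2, 3, 4)
sigma = (3, 1, 2, 4)                  # sigma(1)=3, sigma(2)=1, sigma(3)=2, sigma(4)=4
assert compose(sigma, inverse(sigma)) == identity
```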

Definition 1. Let $G$ be a group and $H \subset G$. Let $\langle H \rangle$ be the smallest subgroup of $G$ that contains all the elements of the set $H$. If $\langle H \rangle = G$, then $H$ generates $G$.

Definition 2. A group $G$ is called cyclic if there exists an element $g \in G$ such that
$$G = \langle g \rangle = \{g^n : n \in \mathbb{Z}\}.$$

Definition 3. Let $\sigma \in S_n$ be a permutation and assume that $a_1, a_2, \ldots, a_k \in \{1, 2, \ldots, n\}$ are distinct integers. If
$$\sigma(a_i) = a_{i+1} \text{ for } 1 \le i < k, \qquad \sigma(a_k) = a_1, \qquad \sigma(x) = x \text{ for } x \notin \{a_1, a_2, \ldots, a_k\},$$
then $\sigma$ is a cycle of length $k$, also called a $k$-cycle in $S_n$, denoted by
$$\sigma = (a_1\ a_2\ \ldots\ a_k).$$
A 2-cycle is called a transposition.

Moreover, if two cycles $(a_1 a_2 \ldots a_k)$ and $(b_1 b_2 \ldots b_l)$ have the property $\{a_1, a_2, \ldots, a_k\} \cap \{b_1, b_2, \ldots, b_l\} = \emptyset$, then these two cycles are disjoint.

So the general form of the notation of a cycle is $\sigma = (a_1 a_2 \ldots a_k)$, where $\sigma(a_1) = a_2$, $\sigma(a_2) = a_3$, ..., $\sigma(a_{k-1}) = a_k$ and $\sigma(a_k) = a_1$.

Remark 1. Disjoint cycles commute.

Theorem 1. Every permutation $\sigma \in S_n$ can be written as a product of pairwise disjoint cycles, $\sigma = \sigma_1 \cdots \sigma_r$, where the cycles $\sigma_i$, $i \in \{1, 2, \ldots, r\}$, are pairwise disjoint. This product is unique, apart from the order of the $\sigma_i$.

Proof. We want to show that every $\sigma \in S_n$ can be written as a product of pairwise disjoint cycles $\sigma_i$, $i \in \{1, 2, \ldots, r\}$.

First we prove the existence of this product of disjoint cycles by induction on $n$.

For $n = 1$ the only permutation is $\sigma = (1)$, which trivially is a product of disjoint cycles.

Let $n > 1$ and assume that the existence of this product holds for all permutations in $S_m$, where $m < n$.

Now let $\sigma \in S_n$; then $\{1, \sigma(1), \sigma^2(1), \ldots\} \subset \{1, 2, \ldots, n\}$. Therefore there exist $k$ and $l$ with $k < l$ such that $\sigma^k(1) = \sigma^l(1)$. This is the same as $\sigma^{l-k}(1) = 1$, so there exists $s \in \mathbb{Z}_{>0}$ with $\sigma^s(1) = 1$.

Let the least such number be denoted by $q$; then the integers $1, \sigma(1), \ldots, \sigma^{q-1}(1)$ are pairwise distinct. Thus $\sigma$ contains the $q$-cycle $\sigma_1 = (1\ \sigma(1)\ \ldots\ \sigma^{q-1}(1))$.

Consider the remaining integers in $\{1, 2, \ldots, n\}$. If this set is empty, then $\sigma = \sigma_1$ and we are done.

If it is not empty, then $\sigma$ acts as a permutation on this set. Applying the induction hypothesis for $S_m$ with $m < n$, the restriction of $\sigma$ to this subset can be written as a product of disjoint cycles $\sigma_2, \sigma_3, \ldots, \sigma_r$.

Viewing these cycles as permutations on $\{1, 2, \ldots, n\}$, we obtain $\sigma = \sigma_1 \sigma_2 \cdots \sigma_r$.

Now we show the uniqueness of this product, apart from the order of the $\sigma_i$, $i \in \{1, 2, \ldots, r\}$.

Assume for contradiction that some permutation $\sigma$ can be written as two different products of pairwise disjoint cycles.

Fix $i$ and $j$ such that $\sigma(i) = j$. Then in both products there exists exactly one cycle $\sigma_{r_1}$ of the form $(\ldots\ i\ j\ \ldots)$.

Next, suppose $\sigma(j) = h$; then there exists exactly one cycle $\sigma_{r_2}$ in both products of the form $(\ldots\ j\ h\ \ldots)$.

Suppose that $\sigma(h) = g$. Similarly, there exists one cycle $\sigma_{r_3}$ of the form $(\ldots\ h\ g\ \ldots)$ in both products.

Repeating this argument, we conclude that both products contain the same cycles, a contradiction. Thus every $\sigma \in S_n$ can be written as a product of pairwise disjoint cycles, and this product is unique apart from the order of the cycles. $\square$
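The existence part of this proof is directly algorithmic; a sketch in Python, using the same tuple convention as above:

```python
def cycle_decomposition(sigma):
    """Decompose sigma (tuple, 1-based values) into disjoint cycles,
    as in the proof: start at the smallest unvisited point and
    iterate sigma until it returns."""
    seen, cycles = set(), []
    for start in range(1, len(sigma) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = sigma[x - 1]
        if len(cycle) > 1:            # omit fixed points (1-cycles)
            cycles.append(tuple(cycle))
    return cycles

print(cycle_decomposition((3, 1, 2, 4)))   # [(1, 3, 2)]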

A consequence of the last theorem is the following.

Theorem 2. Every permutation $\sigma \in S_n$ can be written as a product of 2-cycles.

Proof. From Theorem 1 we know that every permutation $\sigma \in S_n$ can be written as a product of cycles $\sigma_i$, $i \in \{1, 2, \ldots, k\}$. Since every cycle $\sigma_i = (a_{i1} a_{i2} \ldots a_{im})$ can be written as a product of 2-cycles as
$$(a_{i1} a_{i2} \ldots a_{im}) = (a_{i1} a_{i2})(a_{i2} a_{i3}) \cdots (a_{i,m-1} a_{im}),$$
$\sigma$ itself can be written as a product of 2-cycles. $\square$

Now we will discuss even and odd permutations.

First some notation: for $n \ge 2$ we write
$$\bar{X} := \{(i, j) \in \mathbb{Z} \times \mathbb{Z} : 1 \le i < j \le n\}.$$

For $\sigma \in S_n$ let $f_\sigma(i, j) = (\min\{\sigma(i), \sigma(j)\}, \max\{\sigma(i), \sigma(j)\})$ and let $h_\sigma : \bar{X} \to \mathbb{Q}$ be given by
$$h_\sigma(i, j) = \frac{\sigma(j) - \sigma(i)}{j - i}.$$

From this a useful lemma follows.

Lemma 1. Let $n \ge 2$. Then:
1) For $\sigma, \tau \in S_n$, $f_{\sigma\tau} = f_\sigma \circ f_\tau$.
2) $f_\sigma$ is a bijection on $\bar{X}$.
3) $\prod_{(i,j) \in \bar{X}} h_\sigma(i, j) = \pm 1$.

We will only prove 2) and 3), because 1) is evident; for the proof of 1) see [19].

Proof. Let $n \ge 2$.

2) $f_\sigma$ is a bijection on $\bar{X}$, since
$$\mathrm{id} = f_{\mathrm{id}} = f_{\sigma\sigma^{-1}} = f_{\sigma^{-1}\sigma} = f_{\sigma^{-1}} \circ f_\sigma = f_\sigma \circ f_{\sigma^{-1}}$$
by 1), so $f_\sigma$ has an inverse, namely $f_\sigma^{-1} = f_{\sigma^{-1}}$.

3) The absolute value of $\prod_{(i,j) \in \bar{X}} h_\sigma(i, j)$ is
$$\prod_{(i,j) \in \bar{X}} |h_\sigma(i, j)| = \prod_{(i,j) \in \bar{X}} \left|\frac{\sigma(j) - \sigma(i)}{j - i}\right| = \frac{\prod_{(i,j) \in \bar{X}} |\sigma(j) - \sigma(i)|}{\prod_{(i,j) \in \bar{X}} (j - i)},$$
because $i < j$.

Since $f_\sigma(\bar{X}) = \bar{X}$, the numerator above equals the product of all $(l - k)$ with $(k, l) \in \bar{X}$. Therefore
$$\prod_{(i,j) \in \bar{X}} |h_\sigma(i, j)| = \frac{\prod_{(k,l) \in \bar{X}} (l - k)}{\prod_{(k,l) \in \bar{X}} (l - k)} = 1.$$
So the absolute value is 1; hence
$$\prod_{(i,j) \in \bar{X}} h_\sigma(i, j) = \pm 1. \qquad \square$$

To determine whether a permutation is even or odd, we need the following definition.

Definition 4. For $n \ge 2$, $\epsilon(\sigma)$ is called the sign of a permutation $\sigma \in S_n$ and is given by
$$\epsilon(\sigma) = \prod_{(i,j) \in \bar{X}} h_\sigma(i, j) = \prod_{1 \le i < j \le n} \frac{\sigma(j) - \sigma(i)}{j - i} = \pm 1.$$
If $n = 1$, then $\epsilon(\sigma) = 1$.

A permutation $\sigma$ is even if $\epsilon(\sigma) = 1$, and odd if $\epsilon(\sigma) = -1$.

The sign $\epsilon(\sigma)$ has the following properties.

Theorem 3. The sign $\epsilon : S_n \to \{\pm 1\}$ is a homomorphism.

Proof. Let $\sigma, \tau \in S_n$; then $f_\tau$ is bijective on $\bar{X}$. Therefore
$$\epsilon(\sigma) = \prod_{(i,j) \in \bar{X}} h_\sigma(i, j) = \prod_{(i,j) \in \bar{X}} h_\sigma(f_\tau(i, j)) = \prod_{1 \le i < j \le n} \frac{\sigma(\tau(j)) - \sigma(\tau(i))}{\tau(j) - \tau(i)}.$$
So
$$\epsilon(\sigma\tau) = \prod_{(i,j) \in \bar{X}} \frac{(\sigma\tau)(j) - (\sigma\tau)(i)}{j - i} = \left(\prod_{1 \le i < j \le n} \frac{\sigma(\tau(j)) - \sigma(\tau(i))}{\tau(j) - \tau(i)}\right) \left(\prod_{(i,j) \in \bar{X}} \frac{\tau(j) - \tau(i)}{j - i}\right) = \epsilon(\sigma)\,\epsilon(\tau).$$
So the sign $\epsilon$ is a homomorphism. $\square$
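The defining product of Definition 4 can be evaluated literally and compared against known signs (a transposition is odd, a 3-cycle is even, as shown in Lemma 2 below); a small numerical sketch:

```python
from itertools import combinations

def sign_by_definition(sigma):
    """epsilon(sigma) = prod over i<j of (sigma(j)-sigma(i))/(j-i), per Definition 4."""
    prod = 1.0
    for i, j in combinations(range(1, len(sigma) + 1), 2):
        prod *= (sigma[j - 1] - sigma[i - 1]) / (j - i)
    return round(prod)               # the product is exactly +-1 up to rounding

assert sign_by_definition((2, 1, 3)) == -1   # the transposition (12) in S_3
assert sign_by_definition((2, 3, 1)) == 1    # the 3-cycle (123) is even
```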

Furthermore:

Lemma 2.
1) For all $\tau \in S_n$ and any $l$-cycle $(a_1 a_2 \ldots a_l) \in S_n$,
$$\tau (a_1 a_2 \ldots a_l) \tau^{-1} = (\tau(a_1)\ \tau(a_2)\ \ldots\ \tau(a_l)).$$
2) Every 2-cycle $(a_1 a_2)$ satisfies $\epsilon((a_1 a_2)) = -1$.

Proof. 1) Let $\tau \in S_n$ and let $(a_1 a_2 \ldots a_l)$ be an $l$-cycle in $S_n$. We have
$$(\tau (a_1 \ldots a_l) \tau^{-1})(\tau(a_l)) = (\tau (a_1 \ldots a_l))(a_l) = \tau(a_1).$$
Similarly, for $1 \le k < l$,
$$(\tau (a_1 \ldots a_l) \tau^{-1})(\tau(a_k)) = (\tau (a_1 \ldots a_l))(a_k) = \tau(a_{k+1}),$$
and for all remaining $i \in \{1, 2, \ldots, n\}$,
$$(\tau (a_1 \ldots a_l) \tau^{-1})(i) = i.$$
So
$$\tau (a_1 a_2 \ldots a_l) \tau^{-1} = (\tau(a_1)\ \tau(a_2)\ \ldots\ \tau(a_l)).$$

2) Assume that $(a_1 a_2)$ is a 2-cycle in $S_n$ and fix a permutation $\tau \in S_n$ such that $\tau(a_1) = 1$ and $\tau(a_2) = 2$. Since $\epsilon$ is a homomorphism, we get
$$\epsilon((12)) = \epsilon((\tau(a_1)\ \tau(a_2))) = \epsilon(\tau (a_1 a_2) \tau^{-1}) = \epsilon(\tau)\,\epsilon((a_1 a_2))\,\epsilon(\tau)^{-1} = \epsilon((a_1 a_2)),$$
so all transpositions have the same sign.

Applying Definition 4 to $\sigma = (12)$ we get
$$\epsilon((12)) = \frac{\sigma(2) - \sigma(1)}{2 - 1} = \frac{1 - 2}{2 - 1} = -1.$$
Hence any 2-cycle $(a_1 a_2)$ satisfies $\epsilon((a_1 a_2)) = -1$. $\square$

From these facts it follows that for any 2-cycle $(a_1 a_2) \in S_n$ and all $\sigma \in S_n$,
$$-\epsilon(\sigma) = \epsilon((a_1 a_2))\,\epsilon(\sigma) = \epsilon((a_1 a_2)\sigma).$$

The alternating group consists of all even permutations in the symmetric group:

Definition 5. For $n \ge 1$, the alternating group $A_n$ is the subgroup of $S_n$ that consists of all even permutations of $S_n$.


2.2. Preliminaries on Markov Chains and Random Walks.

In this subsection we will follow the book [16].

First, a finite Markov chain is a process which moves among the elements of a finite set $\Omega$ in the following way: if it is at $x \in \Omega$, then the next position is chosen according to a fixed probability distribution $P(x, \cdot)$. So:

Definition 6. Let $\Omega$ be the state space, $P$ the transition matrix and $x$ the current state. Assume that $(X_0, X_1, \ldots)$ is a sequence of random variables satisfying the following: for all $x, y \in \Omega$, all $t \ge 1$, and all events $H_{t-1} = \bigcap_{s=0}^{t-1} \{X_s = x_s\}$ satisfying $\mathbb{P}(H_{t-1} \cap \{X_t = x\}) > 0$, we have
$$\mathbb{P}\{X_{t+1} = y \mid H_{t-1} \cap \{X_t = x\}\} = \mathbb{P}\{X_{t+1} = y \mid X_t = x\} = P(x, y). \tag{2.1}$$
Then $(X_0, X_1, \ldots)$ is a Markov chain with state space $\Omega$ and transition matrix $P$.

Equation (2.1) is called the Markov property. It means that the probability of moving from state $x$ to state $y$ is the same no matter what happened in the sequence $x_0, x_1, \ldots, x_{t-1}$ before the current state $x$.

Definition 7. Assume that the Markov chain $(X_0, X_1, \ldots)$ has finite state space $\Omega$ and transition matrix $P$. Let $x \in \Omega$. The hitting time for $x$ is the first time at which the chain visits state $x$:
$$\tau_x = \min\{t \ge 0 : X_t = x\}.$$
When we only count visits to $x$ at times $t \ge 1$, we use the notation
$$\tau_x^+ = \min\{t \ge 1 : X_t = x\}.$$
If $X_0 = x$, then $\tau_x^+$ is called the first return time.

Definition 8. A stopping time $\tau$ for $(X_t)$ is a $\{0, 1, \ldots\} \cup \{\infty\}$-valued random variable such that, for all $t$, the event $\{\tau = t\}$ is determined by $X_0, X_1, \ldots, X_t$.

If $\tau$ is a stopping time, then using the previous definition and the Markov property we get
$$\mathbb{P}_{x_0}\{(X_{\tau+1}, X_{\tau+2}, \ldots, X_{\tau+l}) \in A \mid \tau = k \text{ and } (X_1, X_2, \ldots, X_k) = (x_1, x_2, \ldots, x_k)\} = \mathbb{P}_{x_k}\{(X_1, \ldots, X_l) \in A\}$$
for all $A \subset \Omega^l$, which is the strong Markov property.

Definition 9. A chain $P$ is irreducible if for any states $x, y \in \Omega$ there exists $t \in \mathbb{Z}_{>0}$ such that $P^t(x, y) > 0$.


This means that from any state we can reach any other state with a positive probability. Note that t can depend on the two states x and y.

Before we derive a result from the last definition: what is a random walk?

Definition 10. The random walk on a group $G$ with increment distribution $\mu$ is defined as the Markov chain with state space $G$ which moves by multiplying the current state on the left by a random element of $G$ selected according to $\mu$. So the transition matrix $P$ of this Markov chain is
$$P(g, hg) = \mu(h), \quad \forall g, h \in G.$$

Remark 2. We multiply on the left to be consistent with the usual notation for composition of functions in the non-commutative case. In the commutative case it does not matter, since multiplying on the left or on the right is then the same.

Thus from the two previous definitions and Definitions 1 and 2 we get the following result.

Proposition 1. Let $G$ be a finite group and let $S = \{g \in G : \mu(g) > 0\}$. The random walk on $G$ with increment distribution $\mu$ is irreducible if and only if $\langle S \rangle = G$.

Proof. '$\Rightarrow$': Let $\bar{g}$ be an arbitrary element of $G$. Assume that the random walk is irreducible, so there exists $r > 0$ such that $P^r(\mathrm{id}, \bar{g}) > 0$. Therefore there is a sequence $s_1, \ldots, s_r \in S$ such that $\bar{g} = s_r s_{r-1} \cdots s_1$. So $\langle S \rangle = G$.

'$\Leftarrow$': Now assume that $\langle S \rangle = G$. Let $\bar{g}, \tilde{g} \in G$ be arbitrary; then
$$\tilde{g}\bar{g}^{-1} = s_r s_{r-1} \cdots s_1, \quad s_i \in S,\ i \in \{1, 2, \ldots, r\},$$
because every element of $G$ has finite order, so every inverse occurring in a word for $\tilde{g}\bar{g}^{-1}$ can be rewritten as a positive power of the same group element. For $m = r$ we therefore have
$$P^m(\bar{g}, \tilde{g}) \ge P(\bar{g}, s_1\bar{g})\, P(s_1\bar{g}, s_2 s_1\bar{g}) \cdots P(s_{r-1} s_{r-2} \cdots s_1\bar{g}, (\tilde{g}\bar{g}^{-1})\bar{g}) = \mu(s_1)\mu(s_2)\cdots\mu(s_r) > 0.$$
So the random walk on $G$ with increment distribution $\mu$ is irreducible. $\square$

If all states have period 1, then the chain is aperiodic; if the chain is not aperiodic, it is periodic. The period of a state is defined in the following way.

Definition 11. Let $T(x) := \{t \ge 1 : P^t(x, x) > 0\}$ be the set of times at which it is possible for the chain to return to the starting position $x$. The period of the state $x$ is the greatest common divisor of $T(x)$, denoted by $\gcd(T(x))$.


This implies the following result.

Lemma 3. Let $\mu$ be a probability distribution on a group $G$. If $\mu(\mathrm{id}) > 0$, then the random walk with increment distribution $\mu$ is aperiodic.

Proof. Let $g \in G$; then $\mu(\mathrm{id}) = P(g, \mathrm{id} \cdot g) = P(g, g) > 0$. So $1 \in \{t : P^t(g, g) > 0\}$, therefore $\gcd\{t : P^t(g, g) > 0\} = 1$. Thus $\gcd(T(g)) = 1$ for every $g$, and the chain is aperiodic. $\square$

Definition 12. A distribution $\pi$ on a finite set $\Omega$ is called a stationary distribution of the Markov chain if $\pi$ satisfies
$$\pi = \pi P,$$
where $P$ is the transition matrix.

Before we say something about mixing times, we need the definition of the total variation distance.

Definition 13. The total variation distance between two probability distributions $\mu$ and $\nu$ on a finite set $\Omega$ is given by
$$||\mu - \nu||_{TV} = \max_{A \subset \Omega} |\mu(A) - \nu(A)|. \tag{2.2}$$

Proposition 2. Let $\mu$ and $\nu$ be two probability distributions on $\Omega$. Then
$$||\mu - \nu||_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|. \tag{2.3}$$

[Figure 1: the two distributions $\mu$ and $\nu$ plotted together; region I lies between the curves where $\mu \ge \nu$ (on $B$) and region II where $\nu > \mu$ (on $B^c$).]


Proof. Let $B = \{x : \mu(x) \ge \nu(x)\}$, as in Figure 1, so that $B^c = \{x : \mu(x) < \nu(x)\}$.

In Figure 1, region I has area $A(\mathrm{I}) = \mu(B) - \nu(B)$ and region II has area $A(\mathrm{II}) = \nu(B^c) - \mu(B^c)$. Note that the total area under each of $\mu$ and $\nu$ is one. Therefore
$$\mu(B) + \mu(B^c) = 1 \Rightarrow \mu(B) = 1 - \mu(B^c), \qquad \nu(B) + \nu(B^c) = 1 \Rightarrow \nu(B) = 1 - \nu(B^c).$$
So
$$A(\mathrm{I}) = \mu(B) - \nu(B) = \nu(B^c) - \mu(B^c) = A(\mathrm{II}),$$
hence regions I and II have the same area.

Now let $A \subset \Omega$ be any event. First, any $x \in A \cap B^c$ satisfies $\mu(x) - \nu(x) < 0$, so the difference in probability cannot decrease if we eliminate all these elements $x \in A \cap B^c$:
$$\mu(A) - \nu(A) \le \mu(A \cap B) - \nu(A \cap B).$$
Including more elements of $B$ cannot decrease the difference either, therefore
$$\mu(A \cap B) - \nu(A \cap B) \le \mu(B) - \nu(B).$$
Hence
$$\mu(A) - \nu(A) \le \mu(B) - \nu(B).$$
Similarly,
$$\nu(A) - \mu(A) \le \nu(A \cap B^c) - \mu(A \cap B^c) \le \nu(B^c) - \mu(B^c).$$
Taking $A = B$ gives
$$|\mu(A) - \nu(A)| = \nu(B^c) - \mu(B^c) = \mu(B) - \nu(B).$$
So
$$\max_{A \subset \Omega} |\mu(A) - \nu(A)| = ||\mu - \nu||_{TV} = \frac{1}{2}\big[\mu(B) - \nu(B) + \nu(B^c) - \mu(B^c)\big] = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|.$$
Hence
$$||\mu - \nu||_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|. \qquad \square$$

Let $P$ be the transition matrix of the Markov chain and $\pi$ its stationary distribution. Define the maximal distance between $P^t(x, \cdot)$ and $\pi$ as
$$d(t) = \max_{x \in \Omega} ||P^t(x, \cdot) - \pi||_{TV}.$$
Then:

Definition 14. The mixing time measures the time required by a Markov chain for the distance to stationarity to be small. The mixing time is defined by
$$t_{\mathrm{mix}}(\epsilon) = \min\{t : d(t) \le \epsilon\}.$$
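Both expressions (2.2) and (2.3) for the total variation distance from Proposition 2 are easy to evaluate on a small state space; a minimal Python sketch (names are our own) checking that they agree, by brute force over all events, so only for a tiny $\Omega$:

```python
from itertools import chain, combinations

def tv_half_sum(mu, nu):
    """||mu - nu||_TV via the half-sum formula (2.3)."""
    return 0.5 * sum(abs(mu[x] - nu[x]) for x in mu)

def tv_max_over_events(mu, nu):
    """||mu - nu||_TV via (2.2): maximise |mu(A) - nu(A)| over all A subset Omega."""
    omega = list(mu)
    subsets = chain.from_iterable(combinations(omega, r) for r in range(len(omega) + 1))
    return max(abs(sum(mu[x] for x in A) - sum(nu[x] for x in A)) for A in subsets)

mu = {"a": 0.5, "b": 0.3, "c": 0.2}
nu = {"a": 0.2, "b": 0.3, "c": 0.5}
assert abs(tv_half_sum(mu, nu) - tv_max_over_events(mu, nu)) < 1e-12   # both 0.3
```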

2.3. Shuffling cards in mathematics.

Now the question is: how can we describe shuffling cards mathematically? We again follow [16].

An ordered arrangement of a deck of $n$ cards can be seen as an element of the symmetric group $S_n$, which consists of all permutations $\sigma$ of the numbers $\{1, 2, \ldots, n\}$.

We interpret permutations of $S_n$ as acting on the locations of the cards. For example, if $\sigma(1) = 3$, $\sigma(2) = 1$, $\sigma(3) = 2$ and $\sigma(4) = 4$, this means that card 1 goes to position 3, card 2 to position 1, card 3 to position 2, and card 4 stays in place 4. This can also be written in cycle notation: $\sigma = (132)(4)$.

Assume that $\mu$ is a probability distribution on $S_n$. We can define a shuffling procedure based on $\mu$ as follows: apply the permutation $\sigma \in S_n$ to the deck with probability $\mu(\sigma)$. This gives the following definition.

Definition 15. Under the above procedure, repeatedly shuffling the deck is the same as running the random walk on the group $S_n$ with increment distribution $\mu$.

Now let $K$ be the support of $\mu$, that is, $K = \{\sigma \in S_n : \mu(\sigma) > 0\}$. From Proposition 1 we get that $\langle K \rangle = S_n$ if and only if the chain obtained by repeatedly shuffling the deck is irreducible. Moreover, from Definition 12 one checks that every such shuffle chain has the uniform distribution as its stationary distribution.


2.3.1. Generating an exactly uniform Random Permutation.

Following [16], we describe a simple algorithm for generating an exactly uniform random permutation.

Definition 16. Let $\sigma_0$ be the identity permutation and let $k \in \{1, 2, \ldots, n\}$. We construct the permutation $\sigma_k$ from the previous permutation $\sigma_{k-1}$ by swapping the cards at locations $k$ and $J_k$, where $J_k$ is an integer chosen uniformly from $\{k, \ldots, n\}$, independently of $\{J_1, J_2, \ldots, J_{k-1}\}$. So
$$\sigma_k(i) = \begin{cases} \sigma_{k-1}(i) & \text{if } i \ne J_k,\ i \ne k, \\ \sigma_{k-1}(J_k) & \text{if } i = k, \\ \sigma_{k-1}(k) & \text{if } i = J_k. \end{cases} \tag{2.4}$$
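The procedure of Definition 16 is a version of the classical Fisher-Yates shuffle; a minimal sketch in Python:

```python
import random

def uniform_random_permutation(n):
    """Build sigma_{n-1} as in Definition 16: start from the identity and,
    for k = 1, ..., n-1, swap positions k and J_k with J_k uniform on {k,...,n}."""
    sigma = list(range(1, n + 1))          # sigma_0 = identity (1-based values)
    for k in range(1, n):                  # k = 1, ..., n-1
        jk = random.randint(k, n)          # J_k uniform on {k, k+1, ..., n}
        sigma[k - 1], sigma[jk - 1] = sigma[jk - 1], sigma[k - 1]
    return tuple(sigma)

print(uniform_random_permutation(52))      # one exactly-uniform deck order
```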

Now we want to prove that this generates a uniformly chosen element of $S_n$.

Lemma 4. Let $J_1, \ldots, J_{n-1}$ be independent integers, where $J_k$ is uniform on the set $\{k, k+1, \ldots, n\}$. Assume that $\sigma_{n-1}$ is the random permutation obtained by Definition 16. Then $\sigma_{n-1}$ is uniformly distributed on $S_n$.

Proof. We prove that $\sigma_{n-1}$ is uniformly distributed on $S_n$ by showing that
$$\mathbb{P}\big(\sigma_k(j) = \eta(j) \text{ for } j \in \{1, 2, \ldots, k\}\big) = \prod_{i=0}^{k-1} (n - i)^{-1} \tag{2.5}$$
by induction on $k \in \{1, 2, \ldots, n-1\}$.

Definition 16 gives us the equations
$$\sigma_k(i) = \begin{cases} \sigma_{k-1}(i) & \text{if } i \ne J_k,\ i \ne k, \\ \sigma_{k-1}(J_k) & \text{if } i = k, \\ \sigma_{k-1}(k) & \text{if } i = J_k. \end{cases}$$
Let a specific permutation $\eta \in S_n$ be given.

Base step: for $k = 1$,
$$\mathbb{P}\big(\sigma_1(j) = \eta(j) \text{ for } j \in \{1\}\big) = \prod_{i=0}^{0} (n - i)^{-1} = n^{-1},$$
since $\sigma_1(1) = \sigma_0(J_1) = J_1$ is uniform on $\{1, \ldots, n\}$ (with $\sigma_0$ the identity). So equation (2.5) is true for $k = 1$.

Induction step: assume that (2.5) holds for $k$; we have to prove it for $k + 1$. Let $\eta(1), \ldots, \eta(k+1)$ be distinct elements of $\{1, 2, \ldots, n\}$.

Then
$$\mathbb{P}(\sigma_{k+1}(1) = \eta(1), \ldots, \sigma_{k+1}(k) = \eta(k), \sigma_{k+1}(k+1) = \eta(k+1))$$
$$= \mathbb{P}(\sigma_{k+1}(1) = \eta(1), \ldots, \sigma_{k+1}(k) = \eta(k)) \cdot \mathbb{P}(\sigma_{k+1}(k+1) = \eta(k+1) \mid \sigma_{k+1}(1) = \eta(1), \ldots, \sigma_{k+1}(k) = \eta(k))$$
$$= \mathbb{P}(\sigma_k(1) = \eta(1), \ldots, \sigma_k(k) = \eta(k)) \cdot \mathbb{P}(\sigma_{k+1}(k+1) = \eta(k+1) \mid \sigma_k(1) = \eta(1), \ldots, \sigma_k(k) = \eta(k)),$$
applying Definition 16 (positions $1, \ldots, k$ are no longer touched after step $k$). The last expression equals
$$\prod_{i=0}^{k-1} (n - i)^{-1} \cdot \frac{1}{n - k} = \prod_{i=0}^{k} (n - i)^{-1},$$
so (2.5) holds for $k + 1$.

Thus by induction
$$\mathbb{P}\big(\sigma_k(j) = \eta(j) \text{ for } j \in \{1, 2, \ldots, k\}\big) = \prod_{i=0}^{k-1} (n - i)^{-1}$$
for all $k \in \{1, 2, \ldots, n-1\}$. Hence $\sigma_{n-1}$ is uniformly distributed on $S_n$. $\square$


2.4. Examples.

Here are some useful examples, from [16], of deciding whether a random walk is irreducible.

Example 1. Let $T$ be the set of all 3-cycles in $S_n$ and let $\mu$ be uniform on $T$. Every permutation in $T$ is even, so every product of 3-cycles is even; hence not every permutation in $S_n$ can be written as a product of 3-cycles, and $T$ does not generate $S_n$. By Proposition 1, the random walk with increment distribution $\mu$ is not irreducible.

Example 2. Now let $T$ be the set of all transpositions in $S_n$ and let $\mu$ be the uniform probability distribution on $T$, so all permutations in $T$ are odd. Since every permutation in $S_n$ can be written as a product of 2-cycles by Theorem 2, the set $T$ generates $S_n$. Therefore the random walk with increment distribution $\mu$ is irreducible by Proposition 1.

However, every element of the support of $\mu$ is odd. So if the random walk is started at the identity, its position must be an even permutation after an even number of steps, and an odd permutation after an odd number of steps; in particular the chain is periodic.

Example 3 (Lazy random transpositions). A more natural way of shuffling cards is the following. Let each card be labeled with a number between 1 and $n$. At time $t$ the shuffler chooses two cards $L_t$ and $R_t$ independently and uniformly at random. If $L_t$ and $R_t$ are different, transpose these two cards; otherwise, do nothing.

We get the following increment distribution $\mu$:
$$\mu(\sigma) = \begin{cases} \dfrac{1}{n} & \text{if } \sigma = \mathrm{id}, \\[4pt] \dfrac{2}{n^2} & \text{if } \sigma \text{ is a transposition in } S_n, \\[4pt] 0 & \text{otherwise.} \end{cases} \tag{2.6}$$
Indeed, there are $n^2$ equally likely pairs $(L_t, R_t)$ in total and $n$ of them give the identity, so the probability of the identity is $n/n^2 = 1/n$. Also, there are two pairs giving the transposition $(ij)$, namely $(i, j)$ and $(j, i)$, so the probability of the transposition $(ij)$ is $2/n^2$.

We will use this method in the next section; a sketch of one step follows below.
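A Python sketch of one step of this lazy shuffle. Here we pick two uniform positions of the deck, which induces the same increment distribution (2.6) as picking two uniform card labels:

```python
import random

def random_transposition_step(deck):
    """One shuffle step of Example 3: choose L_t, R_t independently and
    uniformly; swap the two cards if they differ, otherwise do nothing."""
    n = len(deck)
    l, r = random.randrange(n), random.randrange(n)
    deck[l], deck[r] = deck[r], deck[l]    # a no-op when l == r (the lazy case)
    return deck

deck = list(range(1, 53))
for _ in range(100):                       # k = 100 random transpositions
    random_transposition_step(deck)
print(deck)
```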


3. Random Transpositions

3.1. Natural method.

There are many shuffling methods. One of them is the top-in-at-random shuffle [2]: the top card is removed and inserted into the deck at a random position, and this process is repeated a number of times.

Another method is the riffle shuffle [2][16], the one most often used to shuffle real decks of 52 cards. It works as follows: first the shuffler cuts the deck into two piles, then the piles are riffled together.

We use the random transposition shuffle [16][7], because it is simpler than the other methods. This method is also called the Natural method, because it is a more natural way of shuffling cards. The procedure was described in Example 3.

On the symmetric group $S_n$ we have a probability measure $T$, which models a random transposition as explained in Example 3; from now on we write $T$ instead of $\mu$.

The convolution of this probability measure $T$ with itself $k$ times models the result of $k$ random transpositions. This convolution and the uniform distribution on $S_n$ are as follows.

Definition 17. $T^{*k}$ is the convolution of $T$ with itself $k$ times; it models the result of $k$ random transpositions. $U$ is the uniform distribution on the symmetric group $S_n$, so
$$U(\sigma) = \frac{1}{n!}, \quad \forall \sigma \in S_n. \tag{3.1}$$

Using equations (2.2) and (2.3), the total variation distance is in our case
$$||T^{*k} - U||_{TV} = \frac{1}{2} \sum_{\sigma \in S_n} |T^{*k}(\sigma) - U(\sigma)| = \frac{1}{2} \sum_{\sigma \in S_n} \left|T^{*k}(\sigma) - \frac{1}{n!}\right| = \max_{A \subset S_n} |T^{*k}(A) - U(A)|. \tag{3.2}$$

A lower bound and an upper bound for this variation distance are given in the following sections. First we need the Inclusion-Exclusion formula. We abbreviate $A_i \cap A_j$ by $A_i A_j$ for all $i, j$.


Theorem 4 (Inclusion-Exclusion Formula). If $\mathbb{P}$ is a probability function and $A_1, A_2, \ldots, A_n$ are any sets in $\mathcal{B}$, then
$$\mathbb{P}\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i=1}^{n} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i A_j) + \sum_{1 \le i < j < k \le n} \mathbb{P}(A_i A_j A_k) - \sum_{1 \le i < j < k < l \le n} \mathbb{P}(A_i A_j A_k A_l) + \ldots + (-1)^{n+1} \mathbb{P}\Big(\bigcap_{i=1}^{n} A_i\Big). \tag{3.3}$$

Proof. The proof is by induction on $n$. Assume that $\mathbb{P}$ is a probability function and $A_1, A_2, \ldots, A_n$ are any sets in $\mathcal{B}$.

First the case $n = 2$: $A_1$ and $A_2 \cap A_1^c$ are disjoint, since $A_1$ and $A_1^c$ are disjoint. So
$$\mathbb{P}(A_1 \cup A_2) = \mathbb{P}(A_1) + \mathbb{P}(A_2 \cap A_1^c). \tag{3.4}$$
Since $A_2 = \{A_2 \cap A_1\} \cup \{A_2 \cap A_1^c\}$, we have
$$\mathbb{P}(A_2) = \mathbb{P}(A_2 \cap A_1) + \mathbb{P}(A_2 \cap A_1^c) \iff \mathbb{P}(A_2 \cap A_1^c) = \mathbb{P}(A_2) - \mathbb{P}(A_1 \cap A_2).$$
So (3.4) becomes
$$\mathbb{P}(A_1 \cup A_2) = \mathbb{P}(A_1) + \mathbb{P}(A_2) - \mathbb{P}(A_1 \cap A_2).$$

Now the case $n = 3$:
$$\mathbb{P}(A_1 \cup A_2 \cup A_3) = \mathbb{P}(A_1 \cup (A_2 \cup A_3)).$$
Using the case $n = 2$, this equals
$$\mathbb{P}(A_1 \cup A_2) + \mathbb{P}(A_3) - \mathbb{P}((A_1 \cup A_2) \cap A_3).$$
Using the case $n = 2$ a few more times, this equals
$$\mathbb{P}(A_1) + \mathbb{P}(A_2) - \mathbb{P}(A_1 \cap A_2) + \mathbb{P}(A_3) - \mathbb{P}((A_1 \cap A_3) \cup (A_2 \cap A_3))$$
$$= \mathbb{P}(A_1) + \mathbb{P}(A_2) + \mathbb{P}(A_3) - \mathbb{P}(A_1 \cap A_2) - \mathbb{P}(A_1 \cap A_3) - \mathbb{P}(A_2 \cap A_3) + \mathbb{P}(A_1 \cap A_2 \cap A_3).$$

Assume the formula holds for the case $n$:
$$\mathbb{P}\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i=1}^{n} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i A_j) + \sum_{1 \le i < j < k \le n} \mathbb{P}(A_i A_j A_k) - \ldots + (-1)^{n+1} \mathbb{P}\Big(\bigcap_{i=1}^{n} A_i\Big).$$
We have to show it for the case $n + 1$, i.e. that
$$\mathbb{P}\Big(\bigcup_{i=1}^{n+1} A_i\Big) = \sum_{i=1}^{n+1} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n+1} \mathbb{P}(A_i A_j) + \sum_{1 \le i < j < k \le n+1} \mathbb{P}(A_i A_j A_k) - \ldots + (-1)^{n+2} \mathbb{P}\Big(\bigcap_{i=1}^{n+1} A_i\Big).$$
Indeed, using the case $n = 2$ and the case $n$ we get
$$\mathbb{P}\Big(\bigcup_{i=1}^{n+1} A_i\Big) = \mathbb{P}(A_{n+1}) + \mathbb{P}\Big(\bigcup_{i=1}^{n} A_i\Big) - \mathbb{P}\Big(A_{n+1} \cap \bigcup_{i=1}^{n} A_i\Big). \tag{3.5}$$

Then
$$\mathbb{P}\Big(A_{n+1} \cap \bigcup_{i=1}^{n} A_i\Big) = \mathbb{P}\Big(\bigcup_{i=1}^{n} (A_i \cap A_{n+1})\Big) = \sum_{i=1}^{n} \mathbb{P}(A_i A_{n+1}) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i A_j A_{n+1}) + \ldots$$
Therefore (3.5) is equal to
$$\mathbb{P}(A_{n+1}) + \sum_{i=1}^{n} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i A_j) - \sum_{i=1}^{n} \mathbb{P}(A_i A_{n+1}) + \sum_{1 \le i < j < k \le n} \mathbb{P}(A_i A_j A_k) + \sum_{1 \le i < j \le n} \mathbb{P}(A_i A_j A_{n+1}) - \ldots$$
$$= \sum_{i=1}^{n+1} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n+1} \mathbb{P}(A_i A_j) + \sum_{1 \le i < j < k \le n+1} \mathbb{P}(A_i A_j A_k) - \ldots + (-1)^{n+2} \mathbb{P}\Big(\bigcap_{i=1}^{n+1} A_i\Big).$$
Hence, this proves the Inclusion-Exclusion formula. $\square$
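As a quick numerical sanity check of (3.3), one can compare both sides directly on small random events; a sketch (the uniform measure here is illustrative):

```python
from itertools import combinations
import random

def union_prob(p, sets):
    """Left side of (3.3): P(union of the A_i) for a probability function p on points."""
    union = set().union(*sets)
    return sum(p[x] for x in union)

def inclusion_exclusion(p, sets):
    """Right side of (3.3): alternating sum over all intersections."""
    total, n = 0.0, len(sets)
    for r in range(1, n + 1):
        for idx in combinations(range(n), r):
            inter = set.intersection(*(sets[i] for i in idx))
            total += (-1) ** (r + 1) * sum(p[x] for x in inter)
    return total

points = range(10)
p = {x: 0.1 for x in points}               # uniform probability function
sets = [set(random.sample(points, 4)) for _ in range(3)]
assert abs(union_prob(p, sets) - inclusion_exclusion(p, sets)) < 1e-12
```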

Also useful is the following asymptotic notation, from [4].

Definition 18. We write
$$f(n) = o(g(n))$$
if and only if
$$\lim_{n \to \infty} \frac{|f(n)|}{g(n)} = 0.$$
We write
$$f(n) = O(g(n))$$
if and only if there exists some constant $C > 0$ in $\mathbb{R}$ such that
$$|f(n)| \le C\,g(n).$$


3.2. The lower bound for the variation distance.

Theorem 5 (The lower bound for the variation distance). Let $T$ be the probability measure on $S_n$ given by equation (2.6) in Example 3 (writing $T$ instead of $\mu$), and let $T^{*k}$ be its $k$-fold convolution. Let $U$ be the uniform distribution on $S_n$, so that $U(\sigma) = \frac{1}{n!}$ for all $\sigma \in S_n$. Set
$$c = c(k, n) = \frac{k - \frac{1}{2} n \log n}{n}.$$
Then for all $k$,
$$||T^{*k} - U||_{TV} \ge \Big(\frac{1}{e} - e^{-e^{-2c}}\Big) + o(1), \tag{3.6}$$
as $n \to \infty$.

Remark 3. This lower bound is useful if $c < 0$, that is, if $k < \frac{1}{2} n \log n$.

So by the last remark we only have to consider the case $k < \frac{1}{2} n \log n$ in the proof of Theorem 5.

We now give the proof of Theorem 5, following the sources [7][8].

Proof of the lower bound for the variation distance.

Let $A$ be the set of all permutations in $S_n$ with one or more fixed points. For $i \in \{1, 2, \ldots, n\}$, $i$ is called a fixed point of the permutation $\sigma$ if $\sigma(i) = i$. We have the following claim.

Claim 1.
$$U(A) = 1 - \frac{1}{e} + O\Big(\frac{1}{n!}\Big). \tag{3.7}$$

Proof of Claim 1.

For $\sigma$ drawn uniformly from $S_n$,
$$U(A) = \mathbb{P}(\text{at least one fixed point of the permutation}) = \mathbb{P}\Big(\bigcup_{i=1}^{n} \{\sigma(i) = i\}\Big).$$
Using the Inclusion-Exclusion formula (3.3),
$$\mathbb{P}\Big(\bigcup_{i=1}^{n} \{\sigma(i) = i\}\Big) = \sum_{i=1}^{n} \mathbb{P}(\sigma(i) = i) - \sum_{1 \le i < j \le n} \mathbb{P}(\sigma(i) = i,\ \sigma(j) = j) + \ldots + (-1)^{n+1} \mathbb{P}\Big(\bigcap_{i=1}^{n} \{\sigma(i) = i\}\Big)$$
$$= n\,\mathbb{P}(\sigma(1) = 1) - \binom{n}{2} \mathbb{P}(\sigma(1) = 1, \sigma(2) = 2) + \binom{n}{3} \mathbb{P}(\sigma(1) = 1, \sigma(2) = 2, \sigma(3) = 3) - \ldots$$
We have
$$\mathbb{P}(\sigma(1) = 1) = \frac{1}{n}, \quad \mathbb{P}(\sigma(1) = 1, \sigma(2) = 2) = \frac{1}{n(n-1)}, \quad \mathbb{P}(\sigma(1) = 1, \sigma(2) = 2, \sigma(3) = 3) = \frac{1}{n(n-1)(n-2)},$$
and so on, whence
$$U(A) = n \cdot \frac{1}{n} - \frac{n(n-1)}{2!} \cdot \frac{1}{n(n-1)} + \frac{n(n-1)(n-2)}{3!} \cdot \frac{1}{n(n-1)(n-2)} - \ldots,$$
using, for example, $\binom{n}{2} = \frac{n!}{2!(n-2)!} = \frac{n(n-1)}{2!}$. So
$$U(A) = 1 - \frac{1}{2!} + \frac{1}{3!} - \frac{1}{4!} + \ldots + (-1)^{n+1}\frac{1}{n!} \approx 1 - \frac{1}{e},$$
because $e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}$, so
$$e^{-1} = \sum_{i=0}^{\infty} \frac{(-1)^i}{i!} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \ldots = \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \ldots$$

Furthermore, we want to prove that
$$U(A) - (1 - e^{-1}) = O\Big(\frac{1}{n!}\Big)$$
in the sense of Definition 18. We have
$$(1 - e^{-1}) - U(A) = (-1)^{n}\frac{1}{(n+1)!} + (-1)^{n+1}\frac{1}{(n+2)!} + \ldots$$
Then, using the triangle inequality,
$$\big|U(A) - (1 - e^{-1})\big| \le \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \ldots = \sum_{i=1}^{\infty} \frac{1}{(n+i)!}.$$

Moreover,
$$\sum_{i=1}^{\infty} \frac{1}{(n+i)!} \le \frac{1}{n!} + \sum_{i=1}^{\infty} \frac{1}{(n+i)!} = \frac{1}{n!}\Big(1 + \frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \ldots\Big) \le \frac{1}{n!}\Big(1 + \frac{1}{n} + \frac{1}{n^2} + \frac{1}{n^3} + \ldots\Big) \le \frac{C}{n!},$$
where $C$ is some constant in $\mathbb{R}$, since $1 + \frac{1}{n} + \frac{1}{n^2} + \frac{1}{n^3} + \ldots$ is a convergent geometric series. Hence
$$\big|U(A) - (1 - e^{-1})\big| \le C\,\frac{1}{n!}$$
for some $C > 0$. Therefore, by Definition 18,
$$U(A) - (1 - e^{-1}) = O\Big(\frac{1}{n!}\Big),$$
which is the same as
$$U(A) = 1 - e^{-1} + O\Big(\frac{1}{n!}\Big),$$
and this proves the claim. $\square$

We have an equation for $U(A)$; now we want a lower bound for $T^{*k}(A)$. Let the random transpositions be $(L_1, R_1), \ldots, (L_k, R_k)$ and let $B$ be the event that the set of labels $\{L_i, R_i\}_{i=1}^{k}$ is strictly smaller than $\{1, 2, \ldots, n\}$. The second claim is as follows.

Claim 2.
$$\mathbb{P}(B) = 1 - e^{-n e^{-2k/n}} + o(1) \tag{3.8}$$
as $n \to \infty$, for all $k \in [0, \lfloor \frac{1}{2} n \log n \rfloor)$. Also, for every $\epsilon > 0$ there exists $n_0$ such that for $n \ge n_0$,
$$\big|\mathbb{P}(B) - \big(1 - e^{-n e^{-2k/n}}\big)\big| < \epsilon, \quad \text{for any } 0 \le k < \tfrac{1}{2} n \log n.$$

Proof of Claim 2.

The event $B$ can be viewed as follows: $2k$ balls are dropped into $n$ boxes (cells), and the probability of $B$ equals the probability that at least one cell is empty.

Each arrangement has probability $\frac{1}{n^{2k}}$. Let $A_j$ be the event that cell $j$ is empty, $j \in \{1, 2, \ldots, n\}$. On $A_j$ one cell is empty, so the $2k$ balls can be placed in the remaining $n-1$ cells in $(n-1)^{2k}$ different ways; for two empty cells we have $(n-2)^{2k}$ different ways, and so on. Therefore
$$\mathbb{P}(A_{j_1}) = \frac{(n-1)^{2k}}{n^{2k}} = \Big(1 - \frac{1}{n}\Big)^{2k}, \qquad \mathbb{P}(A_{j_1} \cap A_{j_2}) = \frac{(n-2)^{2k}}{n^{2k}} = \Big(1 - \frac{2}{n}\Big)^{2k},$$
and so on.

Then, using the Inclusion-Exclusion formula (3.3),
$$\mathbb{P}(\text{at least one cell is empty}) = \sum_{i=1}^{n} \mathbb{P}(A_i) - \sum_{1 \le i < j \le n} \mathbb{P}(A_i A_j) + \ldots + (-1)^{n+1}\mathbb{P}\Big(\bigcap_{i=1}^{n} A_i\Big)$$
$$= n\,\mathbb{P}(A_1) - \binom{n}{2}\mathbb{P}(A_1 \cap A_2) + \ldots + (-1)^{n+1}\mathbb{P}(A_1 \cap \ldots \cap A_n) = \sum_{i=1}^{n} (-1)^{i+1}\binom{n}{i}\Big(1 - \frac{i}{n}\Big)^{2k}.$$
Let $S_i = \binom{n}{i}\big(1 - \frac{i}{n}\big)^{2k}$; then
$$\mathbb{P}(\text{all cells are occupied}) = P_0(2k, n) = 1 - \mathbb{P}(\text{at least one cell is empty}) = \sum_{i=0}^{n} (-1)^i S_i.$$
Our goal is to find the limiting formula for the probability $P_0(2k, n)$ as $n$ tends to infinity with $k < \frac{1}{2} n \log n$.

Let $\lambda = n e^{-2k/n}$. Since $k < \frac{1}{2} n \log n$, $\lambda$ is positive; we let $2k$ and $n$ grow in such a way that $\lambda$ stays in a finite interval $(a, b)$, where $a, b \in \mathbb{R}$ and $0 < a < b$. We follow the proof of Theorem 3 in the book [8].

First we estimate $S_i$. Note that
$$n(n-1)\cdots(n-i+1) = \frac{n!}{(n-i)!}$$
and
$$(n-i)^i < \frac{n!}{(n-i)!} < n^i,$$
which gives
$$(n-i)^i\Big(1 - \frac{i}{n}\Big)^{2k} < i!\,S_i < n^i\Big(1 - \frac{i}{n}\Big)^{2k} \iff n^i\Big(1 - \frac{i}{n}\Big)^{i+2k} < i!\,S_i < n^i\Big(1 - \frac{i}{n}\Big)^{2k}. \tag{3.9}$$

To get another expression for (3.9), we need the following. The geometric series gives
$$\frac{1}{1 - t} = 1 + t + t^2 + t^3 + \ldots = \sum_{j=0}^{\infty} t^j, \quad \text{for } -1 < t < 1,$$
and therefore
$$\frac{t}{1 - t} = t + t^2 + t^3 + t^4 + \ldots$$
Moreover, the Taylor expansion of the natural logarithm is
$$\log t = \sum_{j=1}^{\infty} (-1)^{j-1}\frac{(t-1)^j}{j}, \quad \text{for } |t - 1| \le 1,\ t \ne 0.$$
Substituting $1 - t$ for $t$ in this expansion we get
$$\log(1 - t) = -\sum_{j=1}^{\infty} \frac{t^j}{j} \Rightarrow -\log(1 - t) = \sum_{j=1}^{\infty} \frac{t^j}{j} = t + \frac{1}{2}t^2 + \frac{1}{3}t^3 + \ldots, \quad \text{for } -1 < t < 1.$$
So, for $0 < t < 1$,
$$t < -\log(1 - t) < \frac{t}{1 - t} \iff e^{-t} > 1 - t > e^{-\frac{t}{1-t}}.$$
Using the last inequality with $t = i/n$, the expression (3.9) becomes
$$n^i e^{-\big(\frac{i/n}{1 - i/n}\big)(i + 2k)} < i!\,S_i < n^i e^{-\frac{2ki}{n}} \iff \Big(n e^{-\frac{i+2k}{n-i}}\Big)^i < i!\,S_i < \Big(n e^{-\frac{2k}{n}}\Big)^i. \tag{3.10}$$
Moreover, for a fixed $i$ we get
$$\frac{\big(n e^{-\frac{2k}{n}}\big)^i}{\big(n e^{-\frac{i+2k}{n-i}}\big)^i} \longrightarrow 1 \quad \text{and} \quad \frac{\big(n e^{-\frac{i+2k}{n-i}}\big)^i}{\big(n e^{-\frac{2k}{n}}\big)^i} \longrightarrow 1,$$
as $2k$ and $n$ tend to infinity in such a way that $0 < a < \lambda < b$. Since $\lambda = n e^{-2k/n}$, it follows that
$$\frac{i!\,S_i}{\lambda^i} \longrightarrow 1 \Rightarrow S_i \longrightarrow \frac{\lambda^i}{i!} \Rightarrow 0 \le \frac{\lambda^i}{i!} - S_i \longrightarrow 0, \tag{3.11}$$
if $2k$ and $n$ increase in such a way that $\lambda$ is bounded. The relation (3.11) still holds if $\lambda$ tends to zero, because then $S_i$ tends to zero too.

This relation (3.11) still holds, if λ tends to zero, because then Si tends to zero too.

Since 2k and n increase, such that 0 < a < λ < b, we can use (3.11) to get the following expression.

P0(2k, n) =

n

X

i=0

(−1)iSi

n

X

i=0

(−1)iλi

i! ≈ e−λ Moreover we want to show that

−P0(2k, n) + e−λ = o(1), using Definition 18.

Indeed,
$$\lim_{n \to \infty} \big|e^{-\lambda} - P_0(2k, n)\big| = \lim_{n \to \infty}\left|\sum_{i=0}^{\infty}\frac{(-\lambda)^i}{i!} - \sum_{i=0}^{n} (-1)^i S_i\right| = \left|\sum_{i=0}^{\infty}\frac{(-\lambda)^i}{i!} - \sum_{i=0}^{\infty}\frac{(-\lambda)^i}{i!}\right| = 0,$$
since $2k \to \infty$ and $n \to \infty$ in such a way that $\lambda$ is bounded, so (3.11) applies term by term. Thus, using Definition 18,
$$e^{-\lambda} - P_0(2k, n) = o(1) \iff \mathbb{P}(B) = 1 - e^{-\lambda} + o(1).$$

Now, for any $k < \frac{1}{2} n \log n$, we want to prove that for an arbitrary $\epsilon > 0$ there exists $n_0$ such that for $n \ge n_0$,
$$\big|(1 - e^{-\lambda}) - \mathbb{P}(B)\big| < \epsilon.$$

Indeed,
$$\big|(1 - e^{-\lambda}) - \mathbb{P}(B)\big| = \big|\mathbb{P}(B) - (1 - e^{-\lambda})\big| = \big|e^{-\lambda} - P_0(2k, n)\big| = \left|\sum_{i=0}^{\infty}\frac{(-\lambda)^i}{i!} - \sum_{i=0}^{n} (-1)^i S_i\right|,$$
and by (3.10) this difference is controlled by the tail of the convergent series $\sum_{i \ge 1} (-1)^{i+1}\lambda^i/i!$, which is bounded because $\lambda$ is positive and bounded ($0 < a < \lambda < b$, so $0 < e^{-\lambda} < 1$). Hence for $n$ large enough the difference is smaller than $\epsilon$, i.e.
$$\big|(1 - e^{-\lambda}) - \mathbb{P}(B)\big| < \epsilon.$$
Hence
$$\mathbb{P}(B) = 1 - e^{-n e^{-2k/n}} + o(1)$$
as $n \to \infty$, for all $k \in [0, \lfloor \frac{1}{2} n \log n \rfloor)$; and for every $\epsilon > 0$ there exists $n_0$ such that for $n \ge n_0$,
$$\big|\mathbb{P}(B) - \big(1 - e^{-n e^{-2k/n}}\big)\big| < \epsilon \quad \text{for any } 0 \le k < \tfrac{1}{2} n \log n.$$
This proves the second claim. $\square$
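Claim 2 can be checked empirically: drop 2k balls into n boxes and estimate the probability that at least one box stays empty. A Monte Carlo sketch (the parameters are illustrative):

```python
import math
import random

def prob_some_box_empty(n, k, trials=20000):
    """Monte Carlo estimate of P(B): 2k uniform ball placements
    leave at least one of the n boxes empty."""
    hits = 0
    for _ in range(trials):
        seen = {random.randrange(n) for _ in range(2 * k)}
        hits += len(seen) < n
    return hits / trials

n, k = 52, 80                                   # here (1/2) n log n is about 102.7
approx = 1 - math.exp(-n * math.exp(-2 * k / n))
print(prob_some_box_empty(n, k), "vs", approx)  # the two values should be close
```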

Recall that $c = \frac{k - \frac{1}{2} n \log n}{n}$, so
$$e^{-2c} = e^{-2k/n + \log n} = n e^{-2k/n}.$$
Therefore
$$\mathbb{P}(B) = 1 - e^{-n e^{-2k/n}} + o(1) = 1 - e^{-e^{-2c}} + o(1).$$
Since $B \subset A$ (if some label is never touched, the corresponding card is a fixed point), we have
$$T^{*k}(A) \ge \mathbb{P}(B) = 1 - e^{-e^{-2c}} + o(1). \tag{3.12}$$
Thus, using the variation distance (3.2) and the formulas (3.7) and (3.12), we get
$$||T^{*k} - U||_{TV} \ge |T^{*k}(A) - U(A)| \ge T^{*k}(A) - U(A) \ge 1 - e^{-e^{-2c}} + o(1) - 1 + e^{-1} + O\Big(\frac{1}{n!}\Big) = \Big(\frac{1}{e} - e^{-e^{-2c}}\Big) + o(1)$$
for all $k < \frac{1}{2} n \log n$, as $n \to \infty$, and this proves Theorem 5. $\square$
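The quantity driving this bound is concrete: $T^{*k}(A)$ is the probability that $k$ steps of the lazy shuffle of Example 3, started at the identity, leave at least one fixed point, while $U(A) \approx 1 - 1/e$ by Claim 1. A simulation sketch (parameters illustrative):

```python
import math
import random

def has_fixed_point_after_k_transpositions(n, k):
    """Run k steps of the lazy shuffle of Example 3 from the identity and
    report whether the resulting permutation has at least one fixed point."""
    deck = list(range(n))
    for _ in range(k):
        l, r = random.randrange(n), random.randrange(n)
        deck[l], deck[r] = deck[r], deck[l]    # no-op when l == r
    return any(deck[i] == i for i in range(n))

n, trials = 52, 20000
for k in (40, 80, 120, 160):
    # Estimate T^{*k}(A) by Monte Carlo; U(A) is about 1 - 1/e.
    tk_A = sum(has_fixed_point_after_k_transpositions(n, k)
               for _ in range(trials)) / trials
    print(k, tk_A - (1 - 1 / math.e))          # a lower bound on ||T^{*k} - U||_TV
```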


3.3. Preliminaries on Representation Theory.

Before we can state and prove the upper bound for the variation distance, we need some background on Representation Theory, following the article [7] and the book [17]. In this subsection we will not prove every result we state; the reader can find the missing proofs in [7].

What is a representation? Let $V$ be a vector space over the field $\mathbb{C}$, let $G$ be a finite group and let $GL(V) = \mathrm{Aut}(V)$ be the general linear group on $V$, where $\mathrm{Aut}(V)$ is the group of isomorphisms of $V$ onto itself.

Definition 19. A linear representation of $G$ is a homomorphism
$$\rho : G \to GL(V).$$
In other words, a linear representation is a map $\rho$ such that
$$\rho(st) = \rho(s)\rho(t), \quad \forall s, t \in G.$$
Furthermore $V$ is called the representation space of $G$, or shortly the representation of $G$.

If $\rho$ is the representation map and $V$ the representation space, we write $(\rho, V)$. The degree of a representation $\rho$, denoted $d_\rho$, is the dimension of $V$. We assume that $V$ has finite dimension, so $GL(V)$ can be identified with the group of invertible $d_\rho \times d_\rho$ matrices over the field $\mathbb{C}$.

Example 4. The trivial representation is $\rho(s) = \mathrm{id}$ for all $s \in G$.

Example 5. A representation $\rho$ with $d_\rho = 1$ is a map
$$\rho : G \to GL_1(\mathbb{C}) = \mathbb{C} \setminus \{0\} = \mathbb{C}^\times.$$
Since every element of $G$ has finite order, the values $\rho(s)$ are roots of unity.

Example 6. Let $G = \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ and let $\rho : G \to GL_2(\mathbb{C})$ be the homomorphism given by
$$\rho(0, 0) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \rho(1, 0) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \rho(0, 1) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \rho(1, 1) = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}.$$
Then $\rho$ is a linear representation of $G$ of degree 2.

Definition 20. Let $\rho$ and $\pi$ be representations of the same group $G$. The representations $(\rho, V_\rho)$ and $(\pi, V_\pi)$ are equivalent (or isomorphic) if there exists a bijective linear map $T : V_\rho \to V_\pi$ such that
$$T \circ \rho(s) = \pi(s) \circ T, \quad \forall s \in G.$$
Equivalent representations have the same degree.
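As a quick sanity check of Example 6, one can verify by brute force that the map $\rho$ there respects the group operation; a minimal numpy sketch (helper names are our own):

```python
import itertools
import numpy as np

rho = {
    (0, 0): np.array([[1, 0], [0, 1]]),
    (1, 0): np.array([[-1, 0], [0, -1]]),
    (0, 1): np.array([[0, 1], [1, 0]]),
    (1, 1): np.array([[0, -1], [-1, 0]]),
}

def add(s, t):                                 # group law of Z/2Z x Z/2Z
    return ((s[0] + t[0]) % 2, (s[1] + t[1]) % 2)

for s, t in itertools.product(rho, repeat=2):
    # Homomorphism property of Definition 19: rho(st) = rho(s) rho(t).
    assert np.array_equal(rho[add(s, t)], rho[s] @ rho[t])
```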


Here is the definition of a subrepresentation.

Definition 21. Let $\rho : G \to GL(V)$ be a linear representation and let $W$ be a $G$-invariant vector subspace of $V$, meaning that $\rho(s)w \in W$ for all $w \in W$ and all $s \in G$. The restriction $\rho(s)|_W$ of $\rho(s)$ to $W$ is then an isomorphism of $W$, and
$$\rho|_W : G \to GL(W), \qquad \rho|_W(st) = \rho|_W(s)\,\rho|_W(t).$$
Therefore $\rho|_W$ is a linear representation of $G$ in $W$, called a (linear) subrepresentation of $G$; likewise $W$ is called a subrepresentation of $V$.

The last definition leads to the following.

Definition 22. A representation $\rho : G \to GL(V)$ is called irreducible if $V$ is not zero and the only $G$-invariant subspaces of $V$ are the trivial ones, $\{0\}$ and $V$.

Now we discuss the transform of a function at a representation.

Definition 23. Let $G$ be a finite group and $\rho : G \to GL(V)$ a representation of $G$. For a function $P : G \to \mathbb{C}$, the analog of the Fourier transform of $P$ at the representation $\rho$ is
$$\hat{\rho}(P) = \sum_{\eta \in G} P(\eta)\rho(\eta).$$
Furthermore, if $P_1$ and $P_2$ are functions on a finite group $G$, the convolution of $P_1$ and $P_2$ is given, for $\gamma \in G$, by
$$P_2 * P_1(\gamma) = \sum_{\eta \in G} P_2(\gamma\eta^{-1})P_1(\eta).$$
Therefore:

Lemma 5. If $P_1$ and $P_2$ are two functions on $G$ and $\rho$ is a representation of $G$, then
$$\hat{\rho}(P_1 * P_2) = \hat{\rho}(P_1)\,\hat{\rho}(P_2).$$

Let $\hat{G}$ be the set of all irreducible representations of $G$. For a function $P : G \to \mathbb{C}$ we have the map $\rho \mapsto \hat{\rho}(P)$, where $\rho \in \hat{G}$ and $\hat{\rho}(P) \in GL(V)$; so $P$ determines a matrix-valued function on $\hat{G}$.

Let $|G|$ denote the number of elements of $G$, also called the order of $G$. Further, let $\mathrm{Tr}[\,\cdot\,]$ denote the trace of a matrix and $*$ the complex conjugate transpose. Then:

Lemma 6. For any function $P : G \to \mathbb{C}$, the Plancherel formula holds:
$$\sum_{\eta \in G} |P(\eta)|^2 = \frac{1}{|G|}\sum_{\rho \in \hat{G}} d_\rho\,\mathrm{Tr}[\hat{\rho}(P)\hat{\rho}(P)^*]. \tag{3.13}$$

The character of the representation $\rho$ is denoted by $\chi_\rho$; it is the function $\chi_\rho : G \to \mathbb{C}$ such that
$$\chi_\rho(s) = \mathrm{Tr}[\rho(s)], \quad \forall s \in G.$$
A character of an irreducible representation is called an irreducible character.

Since the $\mathbb{C}$-vector space $V$ has dimension $d_\rho$, we have $\rho(1) = \mathrm{id}$ and $\mathrm{Tr}(\mathrm{id}) = d_\rho$, therefore $\chi_\rho(1) = d_\rho$. The irreducible characters of $G$ are denoted by $\chi_1, \ldots, \chi_r$, with $\chi_i(1) = d_i$ for $1 \le i \le r$.

Before we can go further, we need the definition of the regular representation.

Definition 24. Let $V$ be a vector space of dimension $|G|$ with a basis $(e_t)_{t \in G}$. For each $s \in G$ let $\rho_s : V \to V$ be the linear map such that
$$\rho_s(e_t) = e_{st}.$$
Then $s \mapsto \rho_s$ is called the regular representation of $G$ and $V$ is the regular representation space of $G$. Note that the degree of the regular representation equals the order $|G|$ of $G$, and $e_s = \rho_s(e_1)$.

To make this concrete, here is an example [18].

Example 7. In $S_3$ the vector space $V$ has the basis
$$(e_t)_{t \in S_3} = (e_{\mathrm{id}}, e_{(12)}, e_{(13)}, e_{(23)}, e_{(123)}, e_{(132)})$$
by Definition 24. If we take for example $s = (12)$ in $S_3$, then
$$\rho_{(12)} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \end{pmatrix},$$
because $\mathrm{id} = (1)$ and $(1)(12) = (12)$, so we get a one in the second place of the first column. Further, $(12)(12) = \mathrm{id}$, so there is a one in the first place of the second column; $(13)(12) = (123)$, so there is a one in the fifth place of the third column, and so on. For the last column there is a one in the fourth place, since $(132)(12) = (23)$.
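This matrix can also be generated mechanically; a sketch that builds it from the stated basis order, using the same composition convention as the computation above (apply the right factor first; helper names are our own):

```python
import numpy as np

# Basis order of Example 7; permutations of S_3 as maps on {1, 2, 3}.
basis = ["e", "(12)", "(13)", "(23)", "(123)", "(132)"]
perm = {"e": {1: 1, 2: 2, 3: 3}, "(12)": {1: 2, 2: 1, 3: 3},
        "(13)": {1: 3, 2: 2, 3: 1}, "(23)": {1: 1, 2: 3, 3: 2},
        "(123)": {1: 2, 2: 3, 3: 1}, "(132)": {1: 3, 2: 1, 3: 2}}

def compose(a, b):
    """The product ab as used in the text: apply b first, then a."""
    m = {x: perm[a][perm[b][x]] for x in (1, 2, 3)}
    return next(name for name in basis if perm[name] == m)

s = "(12)"
M = np.zeros((6, 6), dtype=int)
for col, t in enumerate(basis):
    M[basis.index(compose(t, s)), col] = 1   # column e_t gets a 1 in row e_{t s}
print(M)
```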

Let $\chi_G$ be the character of the regular representation. If $s \ne 1$, then $st \ne t$ for all $t$, so the diagonal elements of the matrix $\rho_s$ are zero and $\chi_G(s) = \mathrm{Tr}(\rho_s) = 0$. Moreover, if $s = 1$, then $\rho_s$ is the identity matrix, so $\chi_G(1) = |G|$.
