Numerical GPU simulations of the hydrogen 1s ground state in Stochastic Electrodynamics: A fully relativistic 3D treatment

Numerical GPU simulations of the hydrogen 1s ground state in Stochastic Electrodynamics

A fully relativistic 3D treatment

Matthew T. P. Liska

Supervised by dr. Theo M. Nieuwenhuizen

Second corrector: prof. dr. Bernard Nienhuis

30-08-2015

MSc. thesis GRAPPA physics track

Institute of Physics (IoP)



Abstract

During this research project we tested, using numerical simulations, whether stochastic electrodynamics (SED) is able to explain the shape and stability of the 1s hydrogen ground state.

For this we improved on previous work (Cole and Zou 2003) by a full 3D treatment of the zero-point field, longer run times and the inclusion of relativistic corrections plus spin-orbit coupling. On top of this, we compared our results to a conjecture for the angular momentum and energy distribution of an electron in the 1s ground state.

This was made possible by mathematical simplifications and a self-developed high-performance OpenCL code utilising the enormous power of graphics processing units (GPUs). The results are similar in all cases, with or without the inclusion of position dependence and/or relativistic effects. However, the answer to the stability question is negative.

While we can achieve close correlation to a hydrogen ground state on intermediate time scales using controlled conditions, we also see serious deficiencies at high eccentricities and inconsistent results for different cutoffs of the zero-point field. Another result is that our system ionises on timescales on the order of 10⁶ electron orbits. This is in contrast to previous simulations (Cole and Zou 2003), which did not encounter ionisation on shorter timescales.

We also propose a remedy for this ionisation to be investigated in future work: improving on the point-charge approximation used for the electron.


Contents

1 Dutch summary
2 Introduction
3 The zero-point field
3.1 The zero-point field: Physical representation
3.2 The zero-point field: Numerical representation
3.3 The zero-point field: Fixed vs moving cutoffs
4 Deriving the equations of SED
4.1 Deriving the equations of SED: The Coulomb force
4.2 Deriving the equations of SED: The Abraham-Lorentz force
4.3 Deriving the equations of SED: Final equation of motion
4.4 Deriving the equations of SED: Switching to Bohr units
4.5 Deriving the equations of SED: Canonical equations of motion
5 Relativistic corrections in the hydrogen problem
6 Conjecture for the ground state phase space density
7 Numerics
7.1 Numerics: Limitations of previous simulations
7.2 Numerics: Our code
8 Computation
8.1 Computation: CPUs not becoming faster
8.2 Computation: The tremendous speed of GPUs
8.3 Computation: Implementing our algorithm on a GPU
8.4 Computation: Used system and performance
9 Results
9.1 Results: Moving cutoff
9.2 Results: Fixed cutoff
10 Other work
10.1 Other work: Theoretical prediction about ionisation
10.2 Other work: A protocol for the electron's conserved quantities
11 Discussion


1 Dutch summary

Quantum mechanics has been, and still is, very successful at making predictions in atomic physics. For instance, it can explain the fine and hyperfine structure of, among others, the hydrogen atom, which was very important in the context of studying stellar spectra at the beginning of the last century. It was also shown that particles, just like waves, can exhibit interference patterns, which indicates that particles and waves can possess each other's properties. Predictions from quantum mechanics are still widely used in science, for example in the chip industry. In other words, it has developed into a very well-functioning effective theory.

At the fundamental level, however, quantum mechanics was not always well received, because its purely probabilistic character is incompatible with our intuition. This probabilistic character means that the outcome of a quantum mechanical experiment cannot be predicted in advance. Albert Einstein is said to have once remarked: 'God does not play dice'. In quantum mechanics the wavefunction, which consists of states with different probabilities, determines the outcome of an experiment only upon measurement. Only then does the wavefunction collapse and assume a definite state. A famous example is Schrödinger's cat, which is locked in a box and is a superposition of a living and a dead cat. Suppose we open the box and find a dead cat: can we then conclude that the cat was dead all along? According to quantum mechanics we cannot. Even stronger, it says that if we had perhaps waited a while with the 'measurement', we might have found a living cat.

These and similar counter-intuitive results from quantum mechanics still fuel the debate about whether a sub-quantum mechanical theory exists, one that would, for example, make it possible to predict whether the cat is alive. A promising candidate is stochastic electrodynamics (SED). It states that a so-called zero-point field is present all around us, the ground state of the electromagnetic field, which 'carries' all photons. Classically this zero-point field is represented by electromagnetic waves, which can of course exert a force on the matter around us. Stochastic electrodynamics attempts to reproduce the results of quantum mechanics with the help of this zero-point field, but in a better and more fundamental way.

Stochastic electrodynamics, however, fell into decline at the end of the last century, because it had made too little progress in explaining non-linear problems such as the ground state of the hydrogen atom. There one assumes a classical electron that moves in a Kepler orbit around the hydrogen nucleus. Relativistic electrodynamics would imply that the electron radiates away all its energy and falls onto the nucleus of the atom, which is of course incorrect. The zero-point field must then return enough energy, via an electromagnetic force on the electron, to keep it stable and to explain the quantum mechanical shape of the ground state.

At the beginning of the last decade, non-relativistic computer results suggested that stochastic electrodynamics could indeed explain the ground state of the hydrogen atom. These simulations were, however, limited by a lack of computational power at the time. They could only be carried out in 2D and with a restricted selection of resonant modes from the zero-point field. This came on top of a very short run time, which could not map out the full dynamics.

In recent years, however, computational physics has advanced very rapidly. Beyond Moore's law, it became possible, with the help of video cards designed for rendering graphics in computer games, to speed up many codes by more than a factor of 100.

In this research we have made use of these new techniques to test stochastic electrodynamics once more. For this we developed our own 3D relativistic code, which attempted to compute the ground state of the hydrogen atom under the assumption of a zero-point field and classical mechanics.

In contrast to earlier results, we found a somewhat less pleasing correspondence with the ground state of the hydrogen atom (figure 1). The electron also turned out to ionise, because it absorbs too much energy in orbits with high eccentricity. We can therefore conclude that the current implementation of the theory works less well than the earlier, less accurate simulations suggested.

Nevertheless, a clear relationship between our simulated distribution and the quantum mechanical ground state of the hydrogen atom is visible. This suggests that something may be missing somewhere in the theory used, or that a wrong assumption is hidden in it, which could save the theory.

It is for instance possible to try out other forms of the zero-point field in the future, or to assign a realistic charge distribution to the electron. For both of these, theoretical arguments can be conceived that could solve the problem of the ionisation.

Fig. 1 a): Histogram for the energy and radius distribution of the electron as measured in the


2 Introduction

The statistical interpretation of quantum mechanics has been the subject of a long-standing dispute. SED (stochastic electrodynamics), for example, claims that everywhere in our universe there is a so-called zero-point electromagnetic field (ZPF), a ground-state remnant familiar from the otherwise excellently working quantum field theory (QFT). This ZPF would explain all quantum mechanical behaviour of particles and waves, from the double slit experiment to the hydrogen atom, all the way to the probabilistic Schrödinger equation with wavefunctions as solutions. Thus it would introduce a so-called local hidden variable, based on the local value of the ZPF, and be in contradiction with mainstream thinking, where quantum mechanics constitutes the most fundamental theory.

This debate eventually turned into a merely philosophical one, since quantum mechanics works excellently at least as an effective theory, with SED being deemed irrelevant and/or ruled out based on controversial experiments. Nevertheless, it is interesting to give SED renewed consideration, because we can now test SED on non-linear problems, such as the 1s state of the hydrogen atom, using cutting-edge simulations that were impossible 10 years ago.

The biggest argument in favour of a purely statistical interpretation is the Bell inequalities, which claim to rule out the existence of a local hidden variable. However, previous experiments, which measured the quantities needed to plug into Bell's formula, may be misinterpreted, since differences in the experimental setup between measurements have not been taken into account. Each component of such an experimental setup possibly has a different hidden-variable distribution due to differences in the experimental setup/context in which the quantities are measured. The violation of Bell's inequalities thus merely demonstrates that different contexts can't be combined [3]. Thus it may still be worthwhile to keep an open mind and test whether SED is able to reproduce its quantum mechanical equivalent for various problems.

The derivation of the Schrödinger equation from first-principles SED has been reported [4, 5]. The authors show that the Liouville equation describing the spatial and temporal evolution of the probability distributions of the position and the angular momentum for a free particle reduces, under the action of a zero-point field, to a Schrödinger-like equation under the assumption of energy balance, which states that the energy going into the system through the zero-point field is equal to the total energy radiated via the Abraham-Lorentz force. The authors also show that the commutator [x, p] reduces to its quantum mechanical analog [x, p] = iℏ under the action of a zero-point field, which introduces Planck's constant into SED via the spectrum of the zero-point field.

Since matter couples to this zero-point field, it is able to manipulate it. This may for example explain the wave-like nature of electrons observed in the double slit experiment. There, waves from the zero-point field may interfere and thus influence the trajectory of such an electron, making the conclusion that the electron itself exhibits wavelike properties invalid. On the other hand, it is hard to imagine how this zero-point field could carry entanglement across large distances. Thus there are arguments both in favour of and against SED.

Finally, it has been realised that the energy throughput due to SED keeps matter stable, with the energy from stochastic fields going in to compensate for the effect of radiative damping. This can be seen as an arrow of time, called the subquantum arrow of time, which is more fundamental than the entropic (second law of thermodynamics) and cosmological (expansion) arrows of time [8].

Nevertheless, SED is known to work well for harmonic (linear) problems such as the harmonic oscillator, see e.g. [1]. However, there is dispute about its correctness for non-linear problems such as the ground state of hydrogen. Claverie and Soto, for example, put forward that the motion is non-recurrent for highly eccentric orbits, due to the plunging of orbits to lower angular momenta and higher eccentricities [9]. However, this treatment may be deemed flawed, since it doesn't include relativistic corrections, which according to T. Boyer 2004 [13] should suppress orbits with high eccentricities. Puthoff puts forward a comparison between energy gain and loss terms, concluding that the atom should be stable [10]. However, calculating average energy gain and loss terms doesn't properly take into account orbital resonances, which can lead to runaway situations.

The theory has been tested numerically on the hydrogen ground state in a 2D approximation by Cole and Zou 2003, who observed close correspondence to the 1s ground state without ionisation. However, their simulations employed a so-called window approximation and insufficient simulation time, due to the unavailability of sufficient computational resources at that time. These simulations, together with the advent of GPUs, which eventually sped up our code by more than two orders of magnitude, inspired us to redo these simulations with higher numerical precision/longer runtimes (paper 1 [14]) and the inclusion of relativistic effects (paper 2 [15]).

In this thesis I will first describe the classical theory of a hydrogen atom and show how we included the zero-point field in our numerical simulations for our first paper [14]. Then I will add the theory behind the relativistic corrections proposed in T. Boyer 2012 and implemented in our second paper [15]. Subsequently I will describe how the code was developed and give a short introduction about how GPUs benefited this project. Then I will present our results and compare them to a conjecture for the hydrogen ground state, before finishing off with a discussion.


3 The zero-point field

SED assumes the presence of a fluctuating vacuum field called the zero-point field (ZPF). This field is theorized to be present everywhere in our universe and could be the origin of the universe’s dark energy. The origin of this ZP-field is postulated in quantum field theory, which describes the quantisation of fields such as the electromagnetic (EM) and Higgs fields in terms of quanta. We will limit our discussion to the EM-field for this thesis.

QFT postulates that the EM-field at every point in space and time has to be quantised, with the excited states (quanta) being the photons propagating as EM-waves through vacuum. Each of these EM-waves has a wavelength λ = 2π/k and a complex waveform in the radiation gauge similar to:

A = A₀ e^{i(k·x − ωt)},  E = −∂A/∂t,  B = ∇ × A   (1)

The most suitable quantum mechanical system describing this quantised EM-field is the quantum harmonic oscillator. It has the following Hamiltonian for a 2D system with complex displacement z = x + iy and unit mass m = 1:

H = p²/2 + ω²z²/2   (2)

Compare this to the energy density ρ of an electromagnetic wave and one sees the correspondence that A plays the role of the displacement in the Hamiltonian, assuming ω = ck and c = 1:

ρ = (E·E* + B·B*)/2 = Ȧ·Ȧ*/2 + k² A·A*/2   (3)

The total energy per mode becomes, as for every quantum harmonic oscillator:

E_n = (n + 1/2) ℏω   (4)

The number of photons occupying each state is given by n. For the ZPF our chosen value of A₀ dictates n = 0, which gives a zero-point energy of ℏω/2 for the case where no photons occupy the relevant mode. The origin of this discrepancy with classical mechanics lies in the Heisenberg uncertainty principle: a photon occupying a harmonic oscillator state has to have uncertainty in either momentum (kinetic energy) or position (potential energy), which leads to it having an average minimum energy of ℏω/2.

Now we can represent each possible wave with wavevector k = (2π/L)(n_x, n_y, n_z) as occupying one quantum harmonic oscillator state with ω = ck. The total EM-field at a point in space would be the summation over all harmonic oscillator states in ω-space, with each mode contributing E = ℏω/2.

This zero-point field has been observed in different experiments. The most famous experiment studied the Casimir effect, where, because two conducting plates are separated by a finite distance, the ZP-field is able to exert a force on them. Since these plates are conducting, they serve as boundary conditions for electromagnetic waves, i.e. an electromagnetic wave can't traverse the plates. Since their separation distance is finite, fewer modes are admitted between the plates than outside in the vacuum, because the maximum wavelength between plates separated by λ becomes 2λ. This radiation overpressure causes the plates to be drawn together.

Another example is the shift in energy levels between the 2P and 2S states, called the Lamb shift. The Dirac equation predicts no energy shift between these levels, while in reality there is a shift whose leading terms may be explained by the inclusion of an electromagnetic field. Explaining the non-linear terms of this Lamb shift using SED simulations would constitute a strong point for its validity, though numerical simulations have not reached the point where they are even able to replicate the 2P and 2S states, dimming this hope in the near term.

There is also a big problem with the zero-point field, since the number of possible modes ω in a finite-sized box is infinite. Let's consider the EM-field energy density ρ, which can be obtained by integrating the density of states g(ω), giving the number of modes per unit volume between ω and ω + dω:

g(ω) = 4πω² / (2πc)³   (5)

multiplied by the average ZP-energy of each mode, E = ℏω/2, over ω-space:

ρ = ∫₀^ωmax g(ω) E dω = ∫₀^ωmax 2πℏω³/(2πc)³ dω = ℏω_max⁴ / (16π²c³)   (6)

We thus see that the energy density diverges with the fourth power of the cutoff frequency ω_max. One partial remedy is to define a frequency cutoff at the Planck scale, though this still gives us a value 100 orders of magnitude bigger than the measured bounds on the cosmological constant. This is called the vacuum catastrophe, the biggest discrepancy with reality arising from quantum field theory, and it is also present in SED.
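To make the quartic divergence of equation (6) concrete, here is a small numerical sketch (our own illustrative Python, not from the thesis code; the prefactor follows the form reconstructed above, and the Planck-frequency value is only a rough figure):

```python
import math

HBAR = 1.054571817e-34  # J*s
C = 2.99792458e8        # m/s

def zpf_energy_density(omega_max, steps=100_000):
    """Midpoint-rule integration of rho = int_0^wmax g(w) * (hbar*w/2) dw,
    with g(w) = 4*pi*w^2/(2*pi*c)^3 as in eq. (5)."""
    dw = omega_max / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * dw
        total += 2 * math.pi * HBAR * w**3 / (2 * math.pi * C)**3 * dw
    return total

def zpf_closed_form(omega_max):
    """Closed form hbar*wmax^4/(16*pi^2*c^3) of eq. (6)."""
    return HBAR * omega_max**4 / (16 * math.pi**2 * C**3)

w_planck = 1.85e43  # rad/s, rough Planck-scale angular frequency (assumed value)
rho = zpf_energy_density(w_planck)
# numeric integral agrees with the closed form
assert abs(rho / zpf_closed_form(w_planck) - 1) < 1e-6
# quartic divergence: doubling the cutoff multiplies rho by 16
assert abs(zpf_energy_density(2 * w_planck) / rho - 16) < 1e-3
```

The second assertion is the vacuum-catastrophe statement in miniature: the integral is dominated entirely by the choice of cutoff.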

3.1 The zero-point field: Physical representation

Thus the bottom line of the above story is that, while a ZP-field is able to explain some things in nature, the theory also has serious flaws, as evident from the still unresolved vacuum catastrophe. Also, the exact shape of this ZP-field, along with effects such as its interaction with charges, remains unresolved.

Nevertheless, for this thesis we represent the ZP-field as a classical EM-field in a finite volume of dimensions V = L_x L_y L_z with an exponential cutoff at the Compton time τ_c = ℏ/(mc²). We have chosen V such that it is a good approximation for an infinite universe (see later sections).

In the Coulomb gauge the SED vector potential in a cube of volume V is a sum of plane waves with random (for this work Gaussian) coefficients,

A(r, t) = Σ_{k,λ} √(ℏ/(2ϵ₀ω_k V)) e^{−ω_k τ_c/2} ε̂_{kλ} [A_{kλ} sin(k·r − ω_k t) + B_{kλ} cos(k·r − ω_k t)]   (7)


E(r, t) = Σ_{k,λ} √(ℏω_k/(2ϵ₀V)) e^{−ω_k τ_c/2} ε̂_{kλ} [A_{kλ} cos(k·r − ω_k t) − B_{kλ} sin(k·r − ω_k t)]   (8)

B(r, t) = Σ_{k,λ} √(μ₀ℏω_k/(2V)) e^{−ω_k τ_c/2} (k̂ × ε̂_{kλ}) [A_{kλ} cos(k·r − ω_k t) − B_{kλ} sin(k·r − ω_k t)]   (9)

The wave vector components k_a = 2πn_a/V^{1/3} involve integers n_a = 1, 2, ··· ∞ (a = 1, 2, 3). The ε̂_{kλ} with λ = 1, 2 are polarisation vectors.

The A_{kλ} and B_{kλ} are independent random Gaussian variables with average zero and unit variance. This is an often-used assumption in SED, since the exact distribution of these random variables is not known in SED; a fixed amplitude with a random phase has also been used in past research (Pena and Cetto, 2005). The only requirement is that for each (k, λ) term the average energy

∫_V d³r ( (ϵ₀/2)⟨E²⟩ + (1/(2μ₀))⟨B²⟩ ) = ½ ℏω_k exp(−ω_k τ_c)   (10)

becomes equal to the photon zero-point energy combined with an exponential cutoff. This confirms that the amplitude A₀ = √(ℏ/(2ϵ₀ω_k V)) is well chosen.

3.2 The zero-point field: Numerical representation

Numerically calculating the above 3D sum over the k-values is presently impossible, even on a GPU cluster. But one nice property of a sum of Gaussian variables is that it can be replaced by a simpler sum of Gaussian variables, as long as the correlators of these sums remain the same. If the correlators remain the same, then the field will look statistically the same at time t, independent of the exact representation of the Gaussian sum.

We adopt a uniform grid in ω-space with spacing Δω_n = 1/N, N ≫ 1, so that

ω_n = n/N,  (n = 1, 2, ···)   (11)

which corresponds to (n/N) ω₀ in physical units. Next we assume, for each n and for each direction a = x, y, z, two independent Gaussian random variables A_n^a and B_n^a, with average 0 and variance 1, and consider the 1d sum

E^a(t) = Σ_{n=1}^∞ √(Δω_n ω_n³/π) e^{−½Z²α²ω_n} (−A_n^a cos ω_n t + B_n^a sin ω_n t)   (12)

Its two-point correlation function reads

⟨E_i(t)E_j(0)⟩ = δ_ij C_EE(t) = δ_ij (1/(8πN⁴)) ℜ [ (3 + 2 sinh²u) / sinh⁴u ],  u = (Z²α² + it)/(2N)   (13)


At fixed t it reproduces, in the limit N → ∞, the autocorrelation function of the original field in equation (8) in Bohr units:

⟨E_i(t)E_j(0)⟩ = δ_ij C_EE(t) = δ_ij (6/π) ℜ [ 1/(t − iZ²α²)⁴ ]   (14)

Plotting both autocorrelation functions shows that for finite N the discretization is reliable for times up to t ≃ N. Ideally N is bigger than the total simulation time, which was (almost) satisfied for our first article.

In our second article we applied the same idea, but for a position-dependent autocorrelator. Due to the mathematical complexity we refer the interested reader to our second article for the specifics. We did, however, encounter the problem that the alternative EM-field needed 16 times as much memory, which precluded us from satisfying this condition (t ≃ N) for a whole run. Though, since the shape of the field changes as the electron evolves, we don't think this necessarily invalidates our results.
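The equal-time and finite-time agreement between the discretised mode sum of equation (12) and the closed form of equation (13) can be checked directly. The sketch below is our own illustrative Python (not the thesis' OpenCL); the values of s = Z²α² and N are artificial, chosen so the sum converges within a few thousand terms:

```python
import math
import cmath

def correlator_sum(s, N, t=0.0, terms=5000):
    """Direct mode sum: sum_n dw * w_n^3/pi * exp(-s*w_n) * cos(w_n*t),
    with w_n = n/N and dw = 1/N, i.e. <E(t)E(0)> built from eq. (12)."""
    total = 0.0
    for n in range(1, terms + 1):
        w = n / N
        total += (1 / N) * w**3 / math.pi * math.exp(-s * w) * math.cos(w * t)
    return total

def correlator_closed(s, N, t=0.0):
    """Closed form of eq. (13): Re[(3 + 2 sinh^2 u)/sinh^4 u]/(8 pi N^4),
    u = (s + i t)/(2N)."""
    u = (s + 1j * t) / (2 * N)
    sh = cmath.sinh(u)
    return ((3 + 2 * sh**2) / sh**4).real / (8 * math.pi * N**4)

# artificially large damping s = Z^2 alpha^2 so the tail is negligible
s, N = 0.5, 4.0
assert abs(correlator_sum(s, N) / correlator_closed(s, N) - 1) < 1e-6
assert abs(correlator_sum(s, N, t=1.0) / correlator_closed(s, N, t=1.0) - 1) < 1e-6
```

For the physical value Z²α² ≈ 5·10⁻⁵ the direct sum would need enormously many terms, which is exactly why the closed form is useful.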

3.3 The zero-point field: Fixed vs moving cutoffs

Numerically it is impossible to calculate a ZPF with a cutoff either at infinity or at the Planck scale. Previous work (Cole and Zou 2004b [12]) showed that the most important frequencies for the electron's dynamics lie around the Keplerian frequency or a multiple of it (1 < n < 12). Let's call these multiples of the Keplerian frequency 'harmonics'. SED simulations of the hydrogen atom in 2D (Cole and Zou 2003 [11]) defined a small window of around 3 percent below and above the first harmonic. Since we had much more computational power, we removed the window and instead defined a cutoff at a frequency of 1.5 harmonics at an energy of E = −1.6. This corresponds to 52 harmonics at an energy of E_ion = −0.05, where E_ion is defined as the ionisation energy. We thus state that the electron ionised if it stays above E_ion for a time totally inconsistent with the quantum mechanical prediction.

Since these simulations introduced noise, in the sense that the 50th harmonic is 2700 times stronger (the sum of equation (12) goes as n²) than the first harmonic, we also worked with a moving cutoff during this project. We cut off the frequency at either 2.5, 4.5 or 6.5 harmonics. To prevent discontinuous jumps in the EM-field, we only changed the cutoff when it differed by more than 20 percent from the previously used cutoff frequency, which happens on the order of 10² electron orbits. Otherwise we would introduce a lot of noise in the field strength due to adding or removing modes every timestep.
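The moving-cutoff rule above amounts to the following sketch (our own illustrative Python; `maybe_update_cutoff` and its parameter names are hypothetical, not from the thesis code):

```python
def maybe_update_cutoff(current_cutoff, keplerian_freq, n_harmonics=2.5, tol=0.2):
    """Return the (possibly unchanged) cutoff frequency.

    The field is rebuilt only when the target cutoff (a fixed number of
    harmonics of the instantaneous Keplerian frequency) drifts more than
    `tol` (20 percent) away from the cutoff currently in use."""
    target = n_harmonics * keplerian_freq
    if abs(target - current_cutoff) / current_cutoff > tol:
        return target          # rebuild the field with the new cutoff
    return current_cutoff      # keep the field continuous

cut = maybe_update_cutoff(2.5, 1.0)    # on target: field kept as-is
assert cut == 2.5
cut = maybe_update_cutoff(2.5, 1.3)    # 30 percent drift: cutoff moves
assert cut == 2.5 * 1.3
```

The hysteresis (20 percent dead band) is what keeps field rebuilds rare, on the order of once per 10² orbits.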


4 Deriving the equations of SED

In this section we will derive the SED equations and introduce Bohr units. We will start with the non-relativistic, zero-spin equations, since they are simple and represent the physics correctly. We will also discuss the canonical equations of motion, implemented to reduce numerical errors. Later in this section we will derive the relativistic equations of motion with spin-orbit coupling, which were implemented to remedy the problem where the atom ionised, though unsuccessfully.

4.1 Deriving the equations of SED: The coulomb force

Let’s start with only the Coulomb force assuming a stationary nucleus with charge Z:

Fc= −Ze

2r

4πϵ0r3 (15)

In this case we get an orbit according to Kepler's laws with energy E, angular momentum L and eccentricity vector ε̄, where we abbreviate k = Ze²/(4πϵ₀):

E = p²/(2m) − k/r,  L = r × p,  ε̄ = (1/(mk)) p × L − r̂   (16)

We can show that the expression for the eccentricity vector reduces to a Keplerian orbit by noting that:

ε̄ · r = (1/(mk)) r · (p × L) − r = (1/(mk)) (r × p) · L − r = L²/(mk) − r   (17)

Rearranging gives the well-known Keplerian orbit:

1/r = (mk/L²) (1 + ε̄ cos θ)   (18)

Taking the dot product of ε̄ with itself gives the relationship between eccentricity, angular momentum and energy:

ε̄ · ε̄ = ε̄² = 1 + 2EL²/(mk²)   (19)
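Equation (19) is an exact identity for any phase-space point, which makes it a convenient unit test for an orbit code. A minimal sketch (our own Python, in units m = k = 1; the sample state is arbitrary):

```python
import math

def cross(a, b):
    return (a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

m, k = 1.0, 1.0                     # mass and Coulomb constant k = Ze^2/(4 pi eps0)
r = (0.8, 0.3, 0.0)                 # arbitrary sample position
p = (-0.2, 0.9, 0.1)                # arbitrary sample momentum
rn = math.sqrt(dot(r, r))

E = dot(p, p)/(2*m) - k/rn          # energy, eq. (16)
L = cross(r, p)                     # angular momentum, eq. (16)
pxL = cross(p, L)
ecc = tuple(pxL[i]/(m*k) - r[i]/rn for i in range(3))  # eccentricity vector

# eq. (19): ecc^2 = 1 + 2 E L^2/(m k^2), exact for any (r, p)
assert abs(dot(ecc, ecc) - (1 + 2*E*dot(L, L)/(m*k**2))) < 1e-12
```

Tracking ε̄ and L during a run gives the slowly varying orbital elements of the otherwise fast Kepler motion.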

4.2 Deriving the equations of SED: The Abraham-Lorentz force

We know, however, from the theory of electromagnetism that an accelerating charged particle will radiate energy through self-interaction with its EM-field, making a stable Keplerian orbit impossible. This force is called the Abraham-Lorentz force F_L. Since the light travel time between the accelerating charge and the observer is non-negligible, we introduce the concept of retarded time:

t_r′ = t − |r − r′|/c   (20)

This is the time at which the photons contributing to the measured electromagnetic field were emitted. Then we define the scalar and vector Liénard-Wiechert potentials:

ϕ(r, t) = (1/(4πϵ₀)) ∫ ρ(r′, t_r′)/|r − r′| d³r′,  A(r, t) = (μ₀/(4π)) ∫ J(r′, t_r′)/|r − r′| d³r′   (21)

Now calculate the electric and magnetic fields:

E = −∇ϕ − ∂A/∂t,  B = ∇ × A   (22)

The Poynting flux vector becomes:

S = (E × B)/μ₀   (23)

The power per unit area along direction n̂ becomes:

dP/dΩ = (S · n̂) R²   (24)

with the total power for the accelerating charge given by:

P = ∮ (dP/dΩ) dΩ = μ₀q²a²/(6πc)   (25)

To get the Abraham-Lorentz force from this expression for an accelerating charge, we need to realise that the radiated energy satisfies:

W = ∫ P dt = ∫ (μ₀q²/(6πc)) (dv/dt)·(dv/dt) dt   (26)

Assuming periodic motion of period T and integrating by parts gives:

W = (μ₀q²/(6πc)) a·u |₀^T − ∫₀^T (μ₀q²/(6πc)) (d²u/dt²)·u dt = 0 − ∫₀^T (μ₀q²/(6πc)) (d²u/dt²)·u dt   (27)

Since the radiation reaction force must do the work −W on the charge, ∫ F_L·u dt = −W, we thus get for F_L:

F_L = (μ₀q²/(6πc)) da/dt   (28)

4.3 Deriving the equations of SED: Final equation of motion

Adding the electric and magnetic force is trivial, which yields the final SED equation of motion:

F = F_c + F_L − e(E + u × B) = −Ze²r/(4πϵ₀|r|³) + (μ₀e²/(6πc)) da/dt − e(E + u × B)   (29)

Since solving for ȧ would incur extra numerical noise, we iterated the equation once, so we only have to solve a second-order differential equation. This is valid since the damping term is much smaller than the Coulomb term and the Lorentz term. In most of the runs we took this even a step further by only iterating the Coulomb term, since it is far stronger than the Lorentz term itself:

ȧ ≈ Ḟ_c/m   (30)
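The once-iterated damping term of equation (30) needs the time derivative of the Coulomb force along the trajectory. A small sketch of this substitution (our own Python, not the thesis' OpenCL code; function names are hypothetical), with the analytic Ḟ_c checked against a finite difference:

```python
def coulomb_force(r):
    """F_c = -r/|r|^3 in Bohr units (Z absorbed into the units)."""
    n = (r[0]**2 + r[1]**2 + r[2]**2) ** 0.5
    return tuple(-x / n**3 for x in r)

def coulomb_force_dot(r, v):
    """Analytic dF_c/dt = -v/|r|^3 + 3 (r.v) r/|r|^5, used in place of the
    full jerk so the equation of motion stays second order (eq. (30))."""
    n = (r[0]**2 + r[1]**2 + r[2]**2) ** 0.5
    rv = sum(x*y for x, y in zip(r, v))
    return tuple(-v[i]/n**3 + 3*rv*r[i]/n**5 for i in range(3))

# finite-difference check of the analytic derivative along a short path
r, v, h = (1.0, 0.2, 0.0), (0.1, 0.5, 0.0), 1e-6
r2 = tuple(r[i] + h*v[i] for i in range(3))
fd = tuple((coulomb_force(r2)[i] - coulomb_force(r)[i]) / h for i in range(3))
an = coulomb_force_dot(r, v)
assert all(abs(fd[i] - an[i]) < 1e-4 for i in range(3))
```

With this closed form, the damping contribution β²ȧ can be evaluated from (r, v) alone at every timestep.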

4.4 Deriving the equations of SED: Switching to Bohr units

Working in SI units is not the way forward, since it leads to values much smaller or bigger than 1. We therefore switch to Bohr units, where a₀ gives the orbital radius of a classical hydrogen atom in the ground state, with orbital period P₀ = 2πτ₀:

a₀ = ℏ/(Zαmc),  τ₀ = 1/ω₀ = ℏ/(Z²α²mc²),  ϵ₀ = 1   (31)

The equation of motion in these units reads:

r̈ = −r/r³ + β² d³r/dt³ − β [E(Zαr, t) + Zα ṙ × B(Zαr, t)]   (32)

In this form we see as well that the B-field and the positional dependence of the E/B-fields are suppressed by a factor α and are thus not very relevant. Furthermore, both the fluctuations and the damping involve the small parameter:

β = √(2/3) Z α^{3/2} = Z/1964.71,  α = e²/(4πϵ₀ℏc) ≈ 1/137   (33)

This parameter β sets the timescale of the simulation. We call 1/β² the damping time, which is the timescale on which the electron would fall into the nucleus. Choosing Z > 1 makes the electric and magnetic forces linearly stronger, while reducing the damping time (in units of τ₀) by a factor Z². This implies that we can increase Z to make the simulation reach n damping times with fewer Keplerian orbits, thus saving computational time, which can be used to, for example, increase the precision and timespan of our simulations.
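Equation (33) and the Z-scaling of the damping time are easy to verify numerically (our own illustrative Python; the CODATA value of α is assumed):

```python
import math

ALPHA = 7.2973525693e-3  # fine-structure constant (CODATA)

def beta(Z=1):
    """Dimensionless damping parameter of eq. (33): sqrt(2/3) * Z * alpha^(3/2)."""
    return math.sqrt(2.0/3.0) * Z * ALPHA**1.5

# matches the quoted value beta = Z/1964.71
assert abs(beta(1) - 1/1964.71) / (1/1964.71) < 1e-3
# damping time 1/beta^2 (in units of tau_0) shrinks as 1/Z^2
assert abs(beta(2)**-2 / beta(1)**-2 - 0.25) < 1e-12
```

The second assertion is the computational trick of the section: at Z = 2 a damping time costs a quarter as many Keplerian orbits as at Z = 1.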

4.5 Deriving the equations of SED: Canonical equations of motion

We solved the standard Newtonian second-order equation of motion numerically in most of our runs. However, we also built a code that solved the non-relativistic canonical problem derived from Hamilton's equations, to give us extra confidence in our results:

ṗ = f(r),  q̇ = p + βA_s + β²f(r),  r = q + βC_g   (34)

This implies a physical momentum ṙ = p + β(A_s + Ċ_g) + β²f and reduces correctly to the second-order ordinary differential equation (ODE). The EM-fields were defined as:

A = A_s + Ċ_g   (35)

A_s = Σ_{n=1}^{N₁} √(n/(πN²)) (A_n sin(nt/N) + B_n cos(nt/N))   (36)

Ċ_g = Σ_{n=N₁+1}^{N₂} √(n/(πN²)) (A_n sin(nt/N) + B_n cos(nt/N))   (37)

C_g = Σ_{n=N₁+1}^{N₂} √(1/(πn)) (−A_n cos(nt/N) + B_n sin(nt/N))   (38)


This approach was chosen because for large ω_n the coefficients of the E-field grow as ω_n^{3/2}, which may cause numerical errors, especially if we define a cutoff quite high. In that case the first harmonic, which contributes most to orbital resonances, can be 1000 times weaker than, for example, the 35th harmonic. This could cause a small interpolation error (the main source of error in the field strength) in the 35th harmonic to influence our results (see discussion). Thus we may want to integrate the E-field twice numerically, because the resulting C-field coefficients grow as ω_n^{−1/2} and thus the higher modes become well behaved. But then the low-frequency modes become too strong, so we chose to integrate the low-frequency modes, up to N₁, only once, using the A-field.
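The split in equations (36)-(38) is consistent only if differentiating a C_g mode reproduces the corresponding Ċ_g mode. A quick check (our own illustrative Python; the mode number, N and the Gaussian coefficients are arbitrary sample values):

```python
import math

def Cg_term(n, t, N, A, B):
    """One mode of the twice-integrated field C_g, eq. (38)."""
    return math.sqrt(1/(math.pi*n)) * (-A*math.cos(n*t/N) + B*math.sin(n*t/N))

def Cg_dot_term(n, t, N, A, B):
    """One mode of C_g-dot, eq. (37); same form as the A_s modes of eq. (36)."""
    return math.sqrt(n/(math.pi*N**2)) * (A*math.sin(n*t/N) + B*math.cos(n*t/N))

# consistency: numerically differentiating C_g must reproduce C_g-dot,
# since sqrt(1/(pi*n)) * (n/N) = sqrt(n/(pi*N^2))
n, N, A, B, t, h = 35, 8.0, 0.7, -1.1, 2.0, 1e-7
fd = (Cg_term(n, t+h, N, A, B) - Cg_term(n, t-h, N, A, B)) / (2*h)
assert abs(fd - Cg_dot_term(n, t, N, A, B)) < 1e-6
```

The point of the construction is visible in the prefactors: C_g modes shrink as n^{−1/2} while E-field modes grow as n^{3/2}, taming the high harmonics.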

The present scheme is valid only when N₁ and N₂ stay constant. We do, however, update N₁ and N₂ as the orbit of the electron changes. Let us assume that our simulation covers n_h + 1/2 harmonics of the orbit, with n_h = 2 or 4, or ···. At the initial time we set N₂ = (n_h + 1/2)k³N, next to N₁ = k³N.

At some later time t′, where k has evolved to some k′, we may wish to update not only N₁ but also N₂, to become N₁′ = k′³N and N₂′ = (n_h + 1/2)k′³N. This change is also covered in the above formulae, where now C involves the limits N₁ + 1 and N₂ before t′, while the updated C′ involves the limits N₁′ + 1 and N₂′ after t′. Likewise, A involves the limits 1 and N₁, and A′ involves the limits 1 and N₁′.

All in all, this leads to discontinuous jumps in the position r and momentum p. To compensate for this we introduce δA and δC in the dynamics:

ṗ = f(r),  q̇(t) = p + β²f + β[A(t) + δA],  r(t) = q(t) + β[C(t) + δC],  f(r) = −r/r³   (40)

In the initial period one simply has δA = δC = 0. However, after the first change of N₁ and N₂ one works with the updates A′ and C′, which involve N₁′ and N₂′ rather than N₁ and N₂. Matching at t′ yields

δA = A(t′) − A′(t′) + Ċ(t′) − Ċ′(t′),  δC = C(t′) − C′(t′)   (41)

For subsequent changes of N₁, N₂ one repeats this schedule. One must add the new shifts to the previous ones, which amounts in total to:

δA = Σ_{t′<t} [A(t′) − A′(t′) + Ċ(t′) − Ċ′(t′)],  δC = Σ_{t′<t} [C(t′) − C′(t′)]   (42)

Note that this scheme, like the Newtonian scheme, still introduces discontinuous jumps in the acceleration a during field switchover. Nevertheless, this scheme gave us some extra confidence that our results are robust when the high harmonics (up to the 52nd harmonic) introduce an error similar to 20 percent of the 1st harmonic's amplitude. Since this approach didn't lead to better results for the non-relativistic case, we decided that we didn't need this extra 'test' for the relativistic version (second paper).


5 Relativistic corrections in the hydrogen problem

We implemented the previous equation of motion in our first paper and obtained negative results for the stability of the hydrogen atom. We found that it ionised within a few thousand to a few million orbits, depending on the zero-point field. We encountered the problem that the orbits of the electron became wider and wider, with higher and higher eccentricities. This is the same mechanism by which asteroids can be ejected from our solar system by the orbital resonances due to the Sun and Jupiter. However, while in classical mechanics the electron can have arbitrarily low angular momentum (corresponding to high eccentricities), this is not the case in special relativity, since this could imply that the speed at the perihelion would exceed the speed of light (T. Boyer 2004).

Thus the next step for this project was the inclusion of the leading order relativistic terms. These go as 1/c², hence as α², with α ≈ 1/137 the fine-structure constant, and are thus very weak, but they may exert a force close to the nucleus that could prevent the system from ionising. To derive these equations we follow a slightly different approach and start from the Schrödinger equation. For an electron of mass m and charge −e in the field of a nucleus with charge Ze (e > 0), with Bohr magneton µB = eħ/2m in SI units, the spin-orbit interaction reads:

H_{SO} = \frac{\mu_B}{\hbar m_e c^2}\,\frac{1}{r}\frac{dV}{dr}\, L \cdot S = \frac{Ze^2}{8\pi\epsilon_0 m^2 c^2 r^3}\, L \cdot S    (43)

The relativistic Hamiltonian including the Darwin term reads:

H_{rel} = \sqrt{m^2 c^4 + p^2 c^2} - \frac{Ze^2}{4\pi\epsilon_0 r} + \frac{Ze^2}{8\pi\epsilon_0 m^2 c^2}\,\frac{L \cdot S}{r^3} + \frac{\hbar^2}{8 m^2 c^2}\,\frac{Ze^2}{\epsilon_0}\,\delta(r)    (44)

Bohr units allow us to introduce dimensionless variables r → a0 r and p → (ħ/a0) p, implying that L = r × p → ħL, and to take consistently S → ħS. Keeping |S| = √(1/2 (1/2 + 1)) = √3/2, one arrives at:

\frac{H_{rel}}{mc^2} = \sqrt{1 + Z^2\alpha^2 p^2} - \frac{Z^2\alpha^2}{r} + \frac{Z^4\alpha^4}{2}\,\frac{L \cdot S}{r^3} + \frac{\pi}{2}\, Z^4\alpha^4\, \delta(r)    (45)

So the dimensionless nonrelativistic Hamiltonian H = (H_{rel} − mc²)/(Z²α² mc²) picks up the leading relativistic corrections:

H = \frac{p^2}{2} - \frac{1}{r} - \frac{Z^2\alpha^2}{8}\, p^4 + \frac{Z^2\alpha^2}{2}\,\frac{L \cdot S}{r^3} + \frac{\pi}{2}\, Z^2\alpha^2\, \delta(r).    (46)

This leads to corrections to the energy of order α2, which is of the same order as the positional dependence of the E-field or the effect of the magnetic field B.


From Hamilton's equations \dot r = \partial H/\partial p and \dot p = -\partial H/\partial r the dynamics reads, for r ≠ 0:

\dot r = \Big(1 - \frac{Z^2\alpha^2}{2}\, p^2\Big)\, p + \frac{Z^2\alpha^2}{2}\,\frac{S \times r}{r^3},    (47)

\dot p = -\frac{r}{r^3} + \frac{Z^2\alpha^2}{2}\,\frac{S \times p}{r^3} + \frac{3 Z^2\alpha^2}{2}\,\frac{L \cdot S}{r^5}\, r.    (48)

Note that the Darwin term, describing the zitterbewegung of the electron close to the nucleus, has disappeared in this treatment, which cannot properly handle a δ(r) term.

The spin precesses approximately as:

\dot S = \frac{Z^2\alpha^2}{2}\,\frac{L \times S}{r^3} = \frac{Z^2\alpha^2}{2}\,\frac{(p\,r - r\,p) \cdot S}{r^3}.    (49)

It is trivial to see that the spin magnitude S² is conserved, and we verified that the total angular momentum J = L + S is conserved up to terms of order α⁴.

An external electromagnetic field is added by the minimal substitution p → p̄ = p + βA(Zαr, t), with β given below and the spatial scale factor Zα expressing the ratio of the Bohr radius to the wavelength of a photon with Bohr energy α²mc². This represents a renormalisation of the rest mass δm_e c² in quantum mechanics.

From Eq. (49) it is confirmed that S² remains conserved, as desired.

Stochastic Electrodynamics leads to a specific stochastic electric and magnetic field, as well as to a \dddot{r} damping term, see NL1 and Ref. [1]. When we neglect terms of order α^{7/2} and higher, the contribution ∇_i p̄ to (48) and the time derivative of (47) lead to the Abraham-Lorentz or Braffort-Marshall equation for a particle with spin:

\ddot r = -\frac{r}{r^3} - \beta\,(E + \dot r \times B) + \beta^2\, \dddot r + Z^2\alpha^2\,\frac{\dot r^2\, r + 2\,(\dot r \cdot r)\,\dot r}{2 r^3} + \frac{Z^2\alpha^2}{2 r^3}\Big( S \times \dot r - 3\,\frac{r \cdot \dot r}{r^2}\, S \times r + 3\,\frac{S \cdot L}{r^2}\, r \Big),    (50)


6 Conjecture for the ground state phase space density

For a dynamics with weak EM-noise and damping, the stationary distribution in phase space must be a function of the conserved quantities, since only these determine its dynamics. Here these are the parameters E, L and/or ε.

A conjecture for the phase space density of several states of the relativistic H-atom was made in a previous paper [16]. This was done by requiring that the momentum-space integral of the phase space density P_{pr}(r, p) reduces to the 1s ground state.

Here we restrict ourselves to the ground state in the non-relativistic limit, although the conjecture is also valid for higher states and consistent with the relativistic Hamiltonian discussed in the previous chapter. The conjecture reduces to:

P_{pr}(r, p) = f\big( E(r,p), L(r,p) \big)    (51)

f(E, L) = \frac{2 L\, e^{2/E}}{\pi^3 |E|^3}\,\Phi(E) = \frac{2}{\pi^3}\, L\, R^3\, e^{-2R}\,\Phi(E)    (52)

Here we included a relativistic correction factor Φ(E), which takes the value 1 in the non-relativistic limit, and an inverse energy R = −1/E for notational convenience. The first task is to verify that the ground state density emerges after integrating over the momenta. At given r one can take the p_z-axis along r, so that:

p = p\,(\sin\mu \cos\nu,\; \sin\mu \sin\nu,\; \cos\mu), \qquad p = \sqrt{2E + \frac{2}{r}} = \sqrt{\frac{2(R - r)}{rR}},    (53)

with r ≤ R ≤ ∞, 0 ≤ µ ≤ π, 0 ≤ ν ≤ 2π. The volume element reads

d^3p = dp\, d\mu\, d\nu\; p^2 \sin\mu = dR\, d\mu\, d\nu\, \sqrt{\frac{2(R - r)}{r R^5}}\, \sin\mu.    (54)

Since L = p r sin µ, Eq. (51) indeed reproduces the QM result, viz.

P_r(r) = \int d^3p\; P_{pr}(r, p) = \frac{4}{\pi} \int_r^\infty dR\, (R - r)\, e^{-2R} = \frac{e^{-2r}}{\pi}.    (55)

This can indeed be written as:

P_r(r) = \psi_0^2(r)\, Y_{00}^2(\theta, \phi), \qquad \psi_0(r) = 2 e^{-r}, \qquad Y_{00}(\theta, \phi) = \frac{1}{\sqrt{4\pi}},    (56)

and leads to P_r(r) = r^2 \psi_0^2(r) = 4 r^2 e^{-2r} with normalisation \int_0^\infty dr\, P_r(r) = 1.

For P_{EL}(E, L) we have the definition

P_{EL}(E, L) = \int d^3r \int d^3p\; \delta\big(E - E(r,p)\big)\,\delta\big(L - L(r,p)\big)\, P_{pr}(p, r) = 4\pi f(E, L) \int dr\, r^2 \int_0^{2\pi} d\nu \int_0^\pi d\mu\, \sqrt{\frac{2(R - r)}{rR}}\,\sin\mu\; \delta\Big( r \sqrt{\frac{2(R - r)}{rR}}\,\sin\mu - L \Big)    (57)


Hence, taking into account the contributions from µ = µ̄ < π/2 and from µ = π − µ̄,

P_{EL}(E, L) = 16\pi^2 f(E, L) \int_{r_-}^{r_+} dr\; \frac{r\, L\, \sqrt{R/2}}{\sqrt{rR - r^2 - \tfrac12 L^2 R}}.

Expressing κ, which lies between 0 and 1, as

\kappa = \frac{L}{L_{max}} = \frac{L}{\sqrt{R/2}} = \sqrt{1 - \varepsilon^2},    (58)

and using that r_\pm = \tfrac12 R\,(1 \pm \varepsilon), this reduces to

P_{EL}(E, L) = \frac{8\sqrt{2}\, L^2}{|E|^{9/2}}\, e^{-2/|E|},    (59)

where L ≤ L_{max}. Because the latter depends on E, the result does not factorise. However, since both ε and κ lie between 0 and 1, the weight P_{EL}(E, L)\, dE\, dL can be factored in the forms P_E(E)\, dE\; P_\varepsilon(\varepsilon)\, d\varepsilon and P_E(E)\, dE\; P_\kappa(\kappa)\, d\kappa, where

P_E(E) = \frac{4}{3 |E|^6}\, e^{-2/|E|}, \quad (-\infty < E < 0),
P_\varepsilon(\varepsilon) = 3\varepsilon \sqrt{1 - \varepsilon^2}, \quad (0 \le \varepsilon < 1),    (60)
P_\kappa(\kappa) = 3\kappa^2, \quad (0 < \kappa \le 1).

An often made mistake is to take some fixed value out of the conjecture as initial condition for the simulation. This will lead to a convolution with the conjecture if the simulation doesn't run long enough, which is often hard to tell by merely looking at the data. Fluctuations in the radius may then, for example, suggest a stable solution in the quantum regime, while fluctuations in, for example, the eccentricity occur on longer timescales.

This was exactly the problem with the simulations of Cole and Zou 2003, who claimed a stable 'quantum' solution for long run times. However, they summed multiple runs with the same initial conditions (a circular orbit at E = −0.5), which led them to miss variations in, for example, the angular momentum occurring on longer timescales, and eventually to miss the fact that the atom ionises and in fact doesn't even form a 'quantum solution' on shorter timescales.

For our initial non-relativistic run we chose reasonable values out of the conjecture, while for the relativistic results we chose values that were around 2σ off from the conjecture. Both led to similar results, showing that we reached a stable solution up until ionisation (more on this later).


7 Numerics

7.1 Numerics: Limitations of previous simulations

Authors like Cole and Zou tried to solve the hydrogen problem for SED in 2003 using simulations. These simulations, though, were flawed on some points, because a lack of computational power forced them to make some unphysical assumptions. Firstly, due to the available computational power at that time, they were only able to simulate the electron for short times, which is flawed, since the angular momentum evolves on longer timescales. Instead they summed 11 runs with the same initial conditions, giving nothing more than a random fluctuation around the initial conditions. This accidentally reduced to something similar to a 1s state, since the same initial conditions were chosen every run. The difference with our simulations is that they found circular orbits, while we found eccentricities even above ϵ > 0.9. Secondly, they also had to assume a tight window about the n = 1 harmonic, which probably led to these circular orbits with maximum angular momentum. By implementing a tight window about the n = 1 harmonic, we did reproduce these results.

Finally, they simulated the problem in 2D, which has a different radius distribution than our conjecture:

2D: \quad P(r) = \frac{16}{2\pi}\, e^{-4r}, \qquad 3D: \quad P(r) = \frac{1}{\pi}\, e^{-2r}    (61)

However, they used the 3D result for their 'semi'-2D simulations, which is probably invalid!

7.2 Numerics: Our code

Our simulation code was developed from scratch in C/C++, while the data analysis part was trivially implemented in Mathematica. The summation of the field was done on a GPU, as described in the next section. Furthermore, the code was thoroughly hardware profiled to extract maximum performance on the CPU/GPU, and convergence testing was performed to guarantee valid results. It is able to simulate a system with 10⁷ modes in a reasonable amount of time on a single 250 euro GPU and any CPU (the CPU is not the bottleneck).

Here we give a pointwise summary of the main features/caveats of our self-developed code:

1) The considered frequencies are ω_n = n/N. Due to the presence of a vast amount of random variables we had to set N = 10⁵ instead of N = 10⁶, as was done in our non-relativistic version of the code [12]. Up to t = N the autocorrelator of equation (13) reduces to the autocorrelator of the exact E-field in equation (14). For larger t we may thus no longer simulate a genuine 3D problem in the relativistic case. This could be a problem, since this time is shorter than the damping time 1/β² and possibly also shorter than other timescales relevant to the electron's dynamics.

2) We used the 'classical' Runge-Kutta fourth order integration scheme (RK4), with full energy and angular momentum conservation up to eccentricities of 0.99 and higher. We also tested other ODE schemes, such as several of the Adams-Bashforth type and the simpler Euler method. Though still very accurate, they were less accurate than the RK4 method and thus discarded. On top of that, we varied the number of iterations per orbit from run to run and achieved consistent results. Thus we can conclude that our results are ODE-scheme independent, with the biggest numerical error induced by the other points mentioned in this paragraph.

3) We used 4000 iterations per orbit, while 600-2000 were found to be enough for high eccentricities using the RK4 method. This was done because solving the ODE was not the computational bottleneck and we wanted to make sure to have the maximum achievable precision. The EM-field (the bottleneck), though, was only updated 25-30 times per period of the highest EM-mode, with the other points determined by interpolation using 4th order Lagrange polynomials. We calculated this EM-field 1.5 to 2 times as often as in the non-relativistic version of the code [12], since higher numerical precision was required to assure that the effects of the relativistic corrections, the magnetic field, the positional dependence of the EM-field and the spin-orbit coupling were fully accounted for.

4) We updated the moving cut-off frequency when the orbital period of the electron changed by more than 20%. This was done to minimise the presence of discontinuities in the equation of motion (9). Note that this field switching every ~10² orbits could still induce errors in our solution.

5) Since the electron's energy can drop below E = −1.6, at which point our GPU runs out of memory and/or the code becomes very slow due to the presence of many more modes than at higher energies, we artificially increase the electron's energy by giving it a 'push'. For this code we chose a scheme in which we randomly gave the electron a push either parallel or perpendicular to its velocity axis, which should give less bias towards higher angular momenta than a previous version of our code, where we only gave it a push parallel to its velocity axis. The advantage of this push scheme is also that it is very simple, while the results are robust against the exact implementation. This push scheme shouldn't influence our final results, since according to the conjecture of the previous section the electron should stay out of this regime more than 99% of the time. We also tested whether the electron recovers from very low energies like E = −4.0, and it indeed seems so.

6) The code does the first 8 summation steps of the EM-field in single precision on the GPU. The rest of the summation steps, as well as solving the second order Newtonian ODE, are done in double precision. Using single precision in this first summation step can speed up the code by a factor 2, while the error in this EM-field remains dominated by interpolation noise. Thus this so-called hybrid-precision scheme works really well.


8 Computation

8.1 Computation: CPUs not becoming faster

We tried to improve on the previously discussed simulations more than a decade later. The only problem was that since approximately 2009 the single core, non-vectorized performance of CPUs has not scaled up as predicted by Moore's law. The problem is that the clock frequency of CPUs has been fixed around 3-4 GHz for more than a decade. During each clock cycle most CPUs can handle at most two addition (ADD) or multiply (MUL) instructions, at least if these instructions are independent of each other. This is called instruction level parallelism and can be exploited in practice with varying success per code. Our code performs relatively well in that regard.

The first explicit optimisation we implemented was to use multiple cores via OpenMP. This sped up the code by almost a factor 4 on a quad-core CPU, which is a very good number.

Another approach to speed up the code we considered was to use vectorized datatypes (float4, double4, int4), since modern CPUs can execute, for example, a double4 (double x, double y, double z, double w) operation in a single step, giving a theoretical speedup by a factor 4. However, writing vector code is difficult: as soon as only one component of a double4 needs to be manipulated, all components have to be copied back from the vector registers, acted upon by the vector 'SIMD' units, to scalar registers, acted upon by the arithmetic logic and floating point units. In practice we only managed to get a modest speedup of less than 30 percent from vectorization of the code.

Thus it seemed that having long run times and 3D, even with the smart discretization, was impossible on CPUs, since the frequency of CPUs is not expected to increase until semiconductor manufacturers start to move away from silicon.

8.2 Computation: The tremendous speed of GPUs

However, since approximately 2008 graphics card manufacturers have also brought their GPUs to the professional market. Besides graphics rendering in single precision, modern GPUs can also do double precision scientific calculations. Simply said, they are much faster because they trade control and cache area on their dies for ALU (arithmetic logic unit) and FPU (floating point unit) space.

To discuss how GPUs deliver such amazing performance for parallel algorithms, more than two orders of magnitude versus a single CPU core, we first have to consider what determines the speed of a CPU/GPU, assuming sufficient memory bandwidth. There are two factors: instruction throughput and instruction latency.

Instruction throughput determines the theoretical maximum number of a certain instruction executed per second. The famous FLOPS number, for example, shows how many floating point operations a CPU/GPU can execute per second. For this benchmark one usually uses MAD/FMA (fused multiply-add) instructions, which perform a MUL and an ADD in a single cycle through some clever tricks. Such instructions occur a lot in codes and are thus a good metric.


Fig. 2 a): GPUs trade control flow and cache area for ALUs and FPUs. This makes them much faster in a heavily parallelised environment.

The second factor determining CPU/GPU performance is instruction latency. If a CPU, for example, can read 100 values from the data cache with a latency of 12 cycles, it can only reach its maximum instruction throughput if all instructions are independent. If the first instruction depends on the second, the second depends on the third, and so on, the CPU will only achieve 1/12th of its stated performance. There are two remedies for this: clever flow control and thread level parallelism (TLP).

CPUs usually choose clever flow control, which includes features such as branch prediction, where a CPU tries to guess the branch taken in an if-else statement, and instruction reordering, where a CPU looks for independent instructions in a code with the intent of changing the order in which instructions are executed. This flow control costs a lot of die area. On top of this, CPUs have a large part of the die area reserved for the L3 cache, which can be seen as a low latency RAM, since it has a factor 20 less latency. It is always a challenge for the programmer to make sure that all relevant data is loaded in this cache, and for the CPU manufacturers to make the cache big enough.

GPUs, however, chose a different approach to latency handling. They have a modest sized register file, which is the fastest memory available (only 4 cycles of latency). The register file can often (except in very complex code) accommodate multiple processes. The idea of this multi-threading/thread level parallelism is that the GPU can run multiple threads in parallel. If thread A is waiting for, say, a memory read of 4 cycles to complete, the other threads can do calculations during that time.

This makes GPUs massively parallel systems, since they are built up out of around 3000 stream processors, each running up to 40 threads. Each of these stream processors is much weaker than a single CPU core (usually 2 to 4 cores per CPU), but taken together they are tens to hundreds of times faster than a CPU. To accommodate this performance, the memory bus width and speed are also higher than for a normal CPU. In total, GPUs are on a per core basis much slower than CPU cores, but as a whole much faster in both performance and memory bandwidth.

On paper the speedup is a factor 25-100 (GPUs have a smaller double to single precision performance ratio), but sometimes much higher speedups are attainable, since GPUs don't rely on difficult to achieve vectorization or instruction level parallelism for their performance. Instead they have scalar registers and use thread level parallelism to achieve their tremendous speedups. Also, their (and OpenCL's) clever usage of the caches and registers can reduce the requested memory bandwidth, speeding up memory bandwidth limited codes.

So a program has to be able to run on around 120000 independent threads to exploit the full power of a modern GPU. This is three orders of magnitude more than required for a modern CPU with 4 cores, which can only run up to 8 threads with HyperThreading (HT) enabled.

To run 10 threads per stream processor, the program has to optimise the register file usage in its code; otherwise fewer threads can be run and the full power of the GPU may not be exploited. In practice this entails writing efficient code, splitting up kernels, reusing variables and making good use of the memory hierarchy of GPUs. In our case we managed to achieve an occupancy of 60 percent, which can be called pretty good.

8.3 Computation: Implementing our algorithm on a GPU

OpenCL is an extension to C/C++ that makes it possible to parallelise the summation in equation (12) on a GPU (GPUs don't run normal C/C++ code). Normally the summation of the modes is performed within for loops, where all elements are summed serially on a single CPU.

In the OpenCL paradigm our GPU is called a compute device. Since we possess a single GPU, we utilise only one compute device. This compute device possesses 44 compute units, which are subdivided into 4 16-wide SIMDs (Single Instruction, Multiple Data). Thus there are 44 × 4 × 16 = 2816 processing elements. Each of these compute units can process up to 40 wavefronts of GPU-specific size 64 simultaneously, limited only by the register (GPR) and local memory size (processing multiple workgroups of data is done to hide memory latency).

The sum in equation (12) is then computed by all of these processing elements using parallel reduction. The global work size is defined by the total number of elements to sum. These elements are summed in workgroups of size 256, i.e., 4 times the wavefront size for an AMD GCN (Graphics Core Next) GPU. These workgroups share local memory, such that every work item reads its value from global memory and copies it to local memory, where it is summed in 8 steps (2⁸ = 256) within a workgroup. During the first step the first 128 work items are summed with the last 128 work items in pairs of two. In this way the number of partial sums is reduced by a factor of 2 each step. This reduces the total number of sums by a factor of 256, after which the remainder is copied to pinned RAM memory via the PCI Express bus and summed by the CPU in double precision.

8.4 Computation: Used system and performance

It is an annoying fact that people in science often don't state the speed of their codes or the specs of the hardware used. This makes it very difficult to judge the feasibility of performing the same or similar simulations in the future using the hardware available to them.

We ran the simulations on a state of the art PC, consisting of an Intel Core i7 2600K overclocked to 4.6 GHz (Core i7 4770K+ equivalent in performance), together with 16 GB of RAM. In our earlier simulations we used an AMD HD6970 GPU, which was later upgraded to an AMD R9-290X GPU. This GPU delivers 5.6 TFLOPS of single precision floating point performance and 350 gigabytes per second of memory bandwidth.

Overall, the performance improvement amounts to a factor of 270 against a double precision vectorized single core C++ implementation using the Intel C/C++ compiler, or a factor 160 against a vectorized single core CPU Intel OpenCL hybrid precision implementation. We compare in this case against a 2.8 GHz Intel Sandy Bridge CPU, since these are widely used in CPU clusters. This allowed us to simulate on the order of 10⁷ modes in real time for several million orbits in a few days, allowing us to tackle the problem in 3D without the previously mentioned deficiencies of earlier simulations (Cole & Zou 2003). Note that these simulations sometimes ran for several days; getting that much time on a CPU cluster would have been difficult for SED simulations.

Using a GPU performance profiler we found that the code is memory bandwidth limited, despite clever usage of the local memory and efficient code. It reaches 70 percent of its peak memory bandwidth, while the instruction throughput stays at a modest 13 percent of its peak value, which makes a total utilisation factor of around 80 percent, suggesting, together with an occupancy of 60 percent, that the high level of parallelism of the GPU is well used. This furthermore suggests an average performance of around 350-370 GFLOP/s and a memory bandwidth (excluding registers, local memory and caches) of 245 GB/s.


9 Results

During this project we conducted more than 25 simulations, ranging from simple 3D runs to fully relativistic 3D runs with spin dependence. Nevertheless, all runs gave similar distributions, so we only present the results for the moving and fixed cutoff cases. The longer runs took 50 to 200 GPU-hours to complete.

9.1 Results: Moving cutoff

We ran the code several times using different values for the cut-off frequency. In some runs we chose Z = 1 as the proton charge, while in others Z = 3 was chosen to reduce the number of orbits per damping time. However, no difference in the distributions for different Z was found, only in the ionisation time of the electron. Here we present the results of our most promising run, up to the point that the electron energy went above E = −0.05, which we define as ionisation. The moving cutoff was set at 2.5 times the electron's Keplerian frequency. In figures 3a, 3b we show the time series for the energy and radius, while in figures 4a, 4b we show the corresponding probability distributions compared to the 1s ground state and the earlier published conjecture [13]. In table 1 we give the run time of this simulation before ionisation and compare it to the damping timescale 1/β². In table 2 we do the same for the simulations of Cole and Zou 2003 as a comparison.

These results suggested that the electron remained stable and had a probability distribution closer to the conjecture [12, 13] for a longer time than was presented in our first article [12]. However, we found that the time until the electron's ionisation varied quite drastically, between t = 10⁶ t0 and t = 10⁷ t0 for different runs, using either the relativistic or the non-relativistic code. Thus we can conclude that there is no statistically significant difference between this run and the idealised run in our first article [12]. Runs with the moving cutoff set at 4.5 or 6.5 times the orbital frequency showed ionisation on even shorter timescales (10⁵ − 10⁶ t0).

We furthermore observed a clear lack of low angular momenta/high eccentricity. In the next chapter we *try* to give an explanation for this behaviour in the non-relativistic limit.

property    value            duration (s)
t_total     2.05 × 10⁷ t0    5.5 × 10⁻¹¹ s
t_damp      4.28 × 10⁵ t0    1.15 × 10⁻¹² s
N_orbit     3.26 × 10⁶
N_damp      48

Table 1: Time and number of orbits for our simulation with 2.5 harmonics and Z = 3.

property    value                 duration (s)
t_total     11 × 6.21 × 10⁵ t0    11 × 1.5 × 10⁻¹¹ s
t_damp      3.8 × 10⁶ t0          9.36 × 10⁻¹¹ s
N_orbit     11 × 1.0 × 10⁵
N_damp      11 × 0.16

Table 2: Duration and number of orbits for the Cole-Zou simulation with Z = 1 and a 5% window around the first harmonic in 2D. The factor 11 is the number of different runs.


Fig. 3 a): Time series for the energy, in Bohr units. b): Time series for the radius.

Fig. 4 a): Histogram for the energy data of Fig. 3a. The red curve is the conjecture. b): Histogram for the radius data of Fig. 3b. The red curve is the conjecture.

Fig. 5 a): Histogram for the angular momentum data. We see a clear discrepancy at lower L with the conjecture given in red. b): Histogram for the eccentricity data. We see a clear discrepancy at high ϵ with the conjecture given in red.

9.2 Results: Fixed cutoff

Multiple runs with a fixed cutoff set at 1.5 times the Kepler frequency for E = −1.6 even showed ionisation within t = 10⁴ t0. We see in figure 7 an increasing eccentricity until the electron picks up so much energy that it ionises (figure 6). For these fixed cut-off runs, though, the accuracy of the interpolation scheme (the limiting factor) may be brought into question, especially when we want to account for the higher order effects discussed in this article. This is because for high electron energies very strong EM-modes are included, the frequency of which is up to 52 times the Keplerian frequency of the electron, while we expect only the first few 'harmonics' to have a significant contribution [11, 12]. The integrated strength of all modes (assuming a Gaussian sum) up to the first harmonic is, for example, more than 2700 times weaker than the integrated strength of all modes up to n = 52.

Fig. 6: Energy as a function of time for Z = 1 with a fixed cutoff, exposing the trend towards ionisation at E = 0, ε = 1.

Fig. 7: Eccentricity as a function of time for Z = 1 with a fixed cutoff, exposing the trend towards ionisation at E = 0, ε = 1.

The numerical error in the value of the EM-field was measured by comparing the output of the CPU code with the GPU-calculated and Lagrange-interpolated EM-fields. It was found to be as large as 20 percent of the first harmonic between interpolation points. It is quite difficult to argue that this noise averages out over multiple orbits, especially if we want to properly take into account the previously discussed relativistic effects. However, the fact that the ionisation times become shorter with more and more harmonics (when the numerical error is quite small) does suggest a real physical, and not numerical, ionisation problem.

We first tried to suppress this numerical error with exponential cutoffs, ranging from the value 1 at the lowest frequency to the value 0.1 at the cutoff frequency. This didn't help, so we tried switching to the canonical formulation of the dynamics, which also didn't improve our results.


10 Other work

10.1 Other work: Theoretical prediction about ionisation

Thus we did observe ionisation at high eccentricities in the simulations, because somehow the damping term doesn't compensate for the energy gained at high velocities close to the nucleus. In the non-relativistic regime this is not totally unexpected theoretically, and it can be shown using perturbation theory, where we directly integrate the equations of motion over one period to obtain the change in energy, angular momentum, eccentricity and position per orbit n analytically.

For this, consider a stable Keplerian orbit with position vector r0(t), energy E0(t) and (scalar) angular momentum L0(t), perturbed by a field E. The dynamics become, up to first order:

r(t) = r_0(t) + \beta r_1(t), \qquad E = E_0(t) + \beta E_1(t), \qquad L = L_0(t) + \beta L_1(t)    (62)

The perturbed variables for the energy E1 and angular momentum L1 are updated in the following way in the presence of an E-field:

\dot E_1(t) = E_{par}\,\dot r + E_{perp}\,\frac{L_0(t)}{r}, \qquad \dot L_1(t) = E_{perp}\, r    (63)

The same can be done for the radiation damping. After several pages of algebra, outside the scope of this thesis, we arrive at the following results after approximating for small energy and high eccentricities:

\Big\langle \frac{dE}{n} \Big\rangle = \Big\langle \frac{dE_{field}}{n} \Big\rangle + \Big\langle \frac{dE_{rad}}{n} \Big\rangle = 3\pi\beta^2\, \frac{f(0) - L}{L^6}    (64)

\Big\langle \frac{dL}{n} \Big\rangle = \Big\langle \frac{dL_{field}}{n} \Big\rangle + \Big\langle \frac{dL_{rad}}{n} \Big\rangle = -2\pi\beta^2\, \frac{g(0) + L}{L^3}    (65)

The detailed derivation will be laid out in an upcoming paper [TMN, to appear]. We can see that in this regime there is a possibility, independent of the electron's total energy E, that the angular momentum can shrink while the energy keeps growing, for L < f(0). This is exactly what we see in our simulation during ionisation. It is counterintuitive, though, that this mechanism still works in special relativity, since special relativity predicts a minimum angular momentum L > αc (T. Boyer 2004), which should suppress plunging orbits.

10.2 Other work: A protocol for the electron’s conserved quantities

We also extended the treatment of the previous paragraph to a second order accurate protocol for the energy, angular momentum and eccentricity evolution of the electron. In that case we calculate separately the perturbation of these values every orbit for each mode, and thus don't have to solve the Newtonian second order equation of motion, which could make the calculation faster. After 20 pages of math, we end up with update values for the conserved quantities in the form of Bessel functions. This works well for modes that are not a multiple of the Keplerian frequency of the electron, but for those that are close to this orbital frequency, the
