Simulating β-sheet formation with a coarse grained model of proteins


Wopke Hellinga

February 28, 2016


1 Introduction

1.1 Proteins

Proteins are large molecules that are, aside from water, the most abundant molecules in organisms. They fulfill a wide range of functions, both structural and biochemical. Examples are the transport of molecules, the copying of DNA and the catalysis of reactions. Proteins are made up of amino acids which are connected by peptide bonds into chains, called polypeptides. A protein consists of one or more of these polypeptides. The amino acids determine the properties of the protein: changing the amino acids in a protein, or even the order in which they appear, can drastically change its properties. Most proteins contain up to 20 different amino acids [36].

Amino acids are built up as follows. A carbon atom, called the Cα atom, is connected to three groups of atoms: the N-H group, the C-O group and the sidechain. The Cα atom, the N-H group and the C-O group form the backbone of the peptide and are the same for all amino acids. By connecting the N-H group of one amino acid to the C-O group of the next, a chain of multiple amino acids is formed.

The sidechain consists of a few to tens of atoms and determines the properties of the amino acid. Considering the amino acid as a whole, we distinguish several properties. First, there is the stiffness of the backbone, for which we consider three degrees of freedom: stretching, bending and torsion. These will be discussed further in chapter 2. In addition, an amino acid can carry a non-zero charge, and it has a hydrophobic or hydrophilic potential. These potentials are the most important when describing the behaviour of an amino acid sequence. However, there are many more interactions that act on a smaller energy scale: the backbone and the sidechains contain dipoles which interact with each other, and amino acids can form hydrogen bonds between them.

Although the contributions of these interactions are energetically small, they have a large impact on the structure and function of a protein.

The structure of a protein can be described on three levels: primary, secondary and tertiary. The primary structure is the sequence of amino acids. The secondary structure is defined by the local folding of the protein. On this level, the protein can be an unfolded coil, to which no specific secondary structure can be assigned. The protein can also be folded into specific structures, of which the most common are the α-helix and the β-sheet. In an α-helix, the protein is coiled into a spiral conformation. A β-sheet consists of stretched-out strands which are connected to each other laterally. (See figure 1.)

Figure 1: Examples of secondary structures. β-sheet above and α-helix below. The blue bands show the spatial structure, and the dotted lines represent hydrogen bonds.


The tertiary structure is the configuration of these secondary structures into complete folded proteins. It describes, for example, how a β-sheet is folded around an α-helix. Here too, hydrogen bonds and dipole-dipole interactions play crucial roles [28].

1.2 NPC

The nucleoplasm inside a cell nucleus is in contact with the cytoplasm through the nuclear pore complexes (NPC). The NPC takes care of transport between the cytoplasm and nucleoplasm.

Small molecules (≤ 40 kDa) or small neutral particles can diffuse through the NPC. Larger molecules have to be bound to a chaperone protein to pass through. This kind of transport is directed, which allows the cell nucleus to maintain higher concentrations of certain macromolecules than the cytoplasm [18].

Figure 2: The nuclear pore complex

The NPC consists of a ring scaffold embedded in the nuclear envelope, as shown in figure 2. On the nuclear side is a nuclear basket, and on the other side cytoplasmic fibrils extend into the cytoplasm. The central meshwork inside the scaffold is composed of proteins called nucleoporins, or Nups. About 30 different Nups exist, and they are proposed to be responsible for the selective transport of macromolecules. The Nups in the central meshwork contain repeated units of Phenylalanine and Glycine (FG) in their sequence and are therefore called FG-Nups. These FG-repeats have proven to be essential for the selective transport through the NPC [12].

As discussed, macromolecules can be transported through the NPC with the help of chaperone proteins. This role is fulfilled by karyopherins, or Kaps. In the cytoplasm, the Kaps bind to a selection of macromolecules. Once bound to a Kap, a macromolecule can move through the NPC into the nucleoplasm, where the Kap is removed from the macromolecule by a protein called Ran-GTP. By maintaining a higher concentration of Kaps in the cytoplasm than in the nucleoplasm, directed transport can be achieved. How this difference in concentration is established is discussed in reference [18].

The mechanism responsible for the selective permeability of the NPC, which lets macromolecules bound to Kaps pass while blocking unbound macromolecules, is still under debate. Multiple models have been proposed [30, 32, 38]; however, only the selective phase model [17] will be discussed here, as it is the subject of this research. The selective phase model proposes that the FG-repeats act as cross-links to form a sieve-like structure. This sieve would allow passive transport of small molecules but block large macromolecules, comparable to small and large fish in a fishnet. Large molecules bound to Kaps are proposed to pass because the Kaps are able to break the cross-links, owing to the high FG-Kap affinity. As a result, the sieve-like structure would locally melt when interacting with a macromolecule-Kap complex, so that these complexes could pass through easily. In vitro experiments have shown examples where Nups form a hydrogel under certain conditions, which suggests that a sieve-like structure is being formed. There is, however, no evidence that these sieve-like structures are also formed in vivo [1, 16].

1.3 Aggregation

Increasing evidence shows that neurodegenerative diseases such as Huntington’s disease (HD), Alzheimer’s disease (AD), Parkinson’s disease (PD), amyotrophic lateral sclerosis (ALS) and prion diseases are caused by the same molecular mechanism [31]. This mechanism, called aggregation, is the accumulation of proteins into toxic clumps or plaques. These can appear in the brain both intracellularly and extracellularly, and are found in different areas of the brain depending on the disease. Although the proteins showing this behaviour can differ considerably in sequence, they form these aggregates through a similar mechanism. Indeed, the formation of aggregates seems to be a generic process among a wide range of proteins, and it is a major problem in, for instance, biotechnology, where aggregation can make the expression and purification of proteins very difficult [22].

Because aggregation is such a general principle, the aggregation of a different protein is often used as a model for the aggregation of the protein causing the disease under investigation. For instance, to study aggregation of the Huntingtin protein, pure polyQ peptide chains are used [5, 15]. In the interest of research on, among others, Alzheimer’s disease and spongiform encephalopathies, the aggregation of the yeast prion Sup35 is studied in vitro [4, 13] as well as in silico [8, 21, 37]. Sup35 has proven to be a good model for investigating amyloid formation, and because it forms fibrils in vitro and can be crystallized for structural study, it is often used as such. Balbirnie et al. have shown that specifically the sequence GNNQQNY in the N-terminal part of Sup35 triggers its aggregation. Interestingly, isolated GNNQQNY also aggregates into a fibril, and has the advantage of crystallizing particularly well. After structural analysis of the packing of GNNQQNY microcrystals, Balbirnie et al. propose a unit cell composed of parallel β-sheets of three GNNQQNY strands. Because these microcrystals share principal properties with amyloid fibrils, this adds to the evidence that the formation of β-sheets is a key factor in stabilizing amyloid fibrils and aggregates [4].

Because the aggregates are relatively unstructured and hard to crystallize, it remains difficult to study their exact structure and folding pathway. Therefore, comparisons with subunits are often necessary in order to understand these mechanisms. In silico research does not need a crystallized structure to study the configuration of a protein aggregate. However, the long timescales and often large systems involved in these processes make it hard to do detailed modelling of complete systems.

1.4 MD simulations

Biochemistry is the foundation of all organisms. Therefore, a lot of effort is put into this field of research in order to increase our understanding of life. Many biochemical processes are very difficult to study experimentally because of their timescale and stochastic behaviour. A method to gain insight into processes at these scales is molecular dynamics (MD). One could use an all-atom model in which each atom is modelled as an interacting sphere. Covalent bonds are represented by harmonic potentials, and non-bonded interactions can be modelled by a Lennard-Jones potential.

Simulating a biological system with such a model can, however, be limited by the available computational power. The computation time for n particles is proportional to n², because for every particle one has to compute the interaction with every other particle. By setting a maximum distance beyond which interactions are not calculated, the computation time can be reduced to an n log(n) proportionality [3]. This still prevents calculations on large systems, however.

To simulate large systems within a reasonable time, models are often coarse grained (CG). This means that a group of atoms is represented by one interaction site, which approximates the behaviour of the whole group. In this way one can still represent the same system with fewer calculations. This comes, of course, at the cost of the detailed interactions that arise from the specific configuration of all atoms. Despite this, coarse graining has proven to be a very successful method for simulating large systems over long time scales [29].


2 The original model

The work discussed in the following chapters is based on an existing model by A. Ghavami et al. [18]. This coarse grained model represents every amino acid by one interaction site, or bead. The model is designed to represent the average shape and function of disordered proteins.

2.1 Backbone bonded potentials

In the original model by A. Ghavami, consecutive beads are connected by a single bond. The length of the bond is fixed at 0.38 nm. To translate the degrees of freedom of the all-atom backbone to the coarse grained backbone as shown in figure 3, the bending angle τ and the peptide bond torsion ω are assumed to be constant. This leaves two degrees of freedom, φ and ψ (see figure 3 (a)), to be translated to the coarse grained bending angle θi and dihedral angle αi. This translation is described in reference [35].

Figure 3: (a) The all atom representation of a peptide backbone (b) the one-bead-per-amino acid representation of a peptide backbone.

From Ramachandran plots based on experiments, the distribution of the all-atom degrees of freedom φ and ψ can be deduced [19], see figure 4(a). As this figure shows, the distribution of φ and ψ is restricted to certain regions. This is mainly the result of steric effects of the sidechains and of interactions with the solvent. All beads have been divided into three categories: Proline (P), Glycine (G) and Rest (X). This is done because Proline and Glycine show distinct distributions. From these Ramachandran plots, a probability distribution is deduced for α and θ using the mapping given in [35]. The resulting distribution (see figure 4(b)) is then inverted to a potential curve using Boltzmann inversion, U(q) = −k_B T ln(P(q)), see figure 4(c). The bending and torsion potentials for all combinations are given in appendix 2A.
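The inversion step U(q) = −k_B T ln(P(q)) can be illustrated with a minimal Python sketch. This is not the code used for the model: the function name `boltzmann_inversion`, the temperature of 300 K and the choice to shift the minimum of the potential to zero are assumptions made only for this illustration.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def boltzmann_inversion(probabilities, temperature=300.0):
    """Turn a binned probability distribution P(q) into a potential U(q).

    Empty bins (P = 0) get an infinite potential. The result is shifted so
    that the most probable bin has U = 0 (an assumption of this sketch; the
    shift does not change the forces).
    """
    kt = K_B * temperature
    potential = [-kt * math.log(p) if p > 0 else math.inf for p in probabilities]
    u_min = min(potential)
    return [u - u_min for u in potential]


# Example: a three-bin distribution of a bending angle (made-up numbers).
print(boltzmann_inversion([0.5, 0.25, 0.25]))
```

A bin that is half as probable as the most probable bin ends up k_B T ln 2 higher in energy, which is about 1.73 kJ/mol at 300 K.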


Figure 4: Construction of coarse grained bending and torsion potentials (a) Ramachandran plot showing distribution of φ and ψ for the X-X-P combination. (b) The probability distribution for θ. (c) Resulting potential for θ.

By using this model, A. Ghavami et al. were able to predict the radius of gyration of denatured proteins. For this purpose, each bead was given a repulsive potential giving it an excluded sphere with radius 0.38 nm. The results from this simulation correspond very well to experiments on the radius of gyration of denatured proteins [27].

2.2 Long range potentials

To represent long range interactions between beads, two potentials are introduced: the electrostatic potential, which represents the attraction or repulsion between two charged amino acids, and the hydrophobic potential, which represents the attraction or repulsion caused by the hydrophobicity or hydrophilicity of amino acids.

The electrostatic potential is given by a modified Coulomb law:

\[
\phi_{el} = \frac{q_i q_j}{4\pi\varepsilon_0\,\varepsilon_r(r)\,r}\,\exp(-\kappa r), \tag{1}
\]

where the Debye screening coefficient κ = 1.0 nm⁻¹ accounts for the implicit solvent. The distance-dependent dielectric constant is given by

\[
\varepsilon_r(r) = S_s\left[1 - \frac{r^2}{z^2}\,\frac{e^{r/z}}{\left(e^{r/z} - 1\right)^2}\right], \tag{2}
\]

where S_s = 80 and z = 0.25 nm.
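As an illustration of equations 1 and 2, the following Python sketch evaluates the screened electrostatic potential between two beads. It is not the implementation used in the simulations. The prefactor 1/(4πε₀) = 138.935 kJ mol⁻¹ nm e⁻² and the function names are assumptions of this sketch, and the dielectric function is written with exp(−r/z) instead of exp(r/z), which is algebraically equivalent but avoids overflow at large r.

```python
import math

KE = 138.935  # assumed value of 1/(4*pi*eps0) in kJ mol^-1 nm e^-2
S_S = 80.0    # dielectric constant at large distance
Z = 0.25      # nm
KAPPA = 1.0   # Debye screening coefficient in nm^-1


def dielectric(r):
    """Distance-dependent dielectric constant of equation 2 (r in nm, r > 0)."""
    x = r / Z
    em = math.exp(-x)
    # x^2 e^x / (e^x - 1)^2 rewritten as x^2 e^-x / (1 - e^-x)^2
    return S_S * (1.0 - x * x * em / (1.0 - em) ** 2)


def phi_el(r, q_i, q_j):
    """Screened Coulomb potential of equation 1 in kJ/mol (charges in e)."""
    return KE * q_i * q_j / (dielectric(r) * r) * math.exp(-KAPPA * r)


# Example: a Lysine (+1) and an Aspartic acid (-1) bead at 0.6 nm.
print(phi_el(0.6, 1, -1))
```

The dielectric constant grows from small values at contact distance towards S_s = 80 at large separation, so nearby charges are screened less strongly than distant ones.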

The hydrophobic potential is given by

\[
\phi_{hp}(r) = \varepsilon_{rep}\left(\frac{\sigma}{r}\right)^{8} - \varepsilon_{ij}\left[\frac{4}{3}\left(\frac{\sigma}{r}\right)^{6} - \frac{1}{3}\right] \quad \text{if } r \le \sigma, \tag{3}
\]

where σ = 0.6 nm, ε_rep = 10 kJ/mol and ε_ij = ε_hp √((ε_i ε_j)^α), with ε_hp = 13 kJ/mol and α = 0.27. The parameters ε_i and ε_j are the hydrophobicity scales of the two interacting amino acids. The value of ε_i for each amino acid was obtained by fitting the Stokes radius and end-to-end distance of proteins in the model to experimental values. The proteins used for this fit are 16 Nups from the yeast NPC and three peptides: poly-Proline, poly-Glutamine and poly-Glycine. More details on this procedure can be found in reference [20]. All simulated Stokes radii deviated less than 20% from the experimental value.

The resulting values for the hydrophobicity scales and the charges are given in table 1.

Amino acid    εi        qi
A             0.7       0
R             0         1
N             0.33      0
D             0.0005    -1
C             0.68      0
Q             0.68      0
E             0.0005    -1
G             0.41      0
H             0.53      0
I             0.98      0
L             0.6       0
K             0.0005    1
M             0.78      0
F             1         0
P             0.65      0
S             0.45      0
T             0.51      0
W             0.96      0
Y             0.82      0
V             0.94      0

Table 1: Relative hydrophobic strength εi and charge qi for each amino acid.
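To show how the values of table 1 enter the hydrophobic potential, here is a small Python sketch. It assumes that the combination rule reads ε_ij = ε_hp √((ε_i ε_j)^α), and it only covers the branch r ≤ σ that is given in equation 3; the function names are illustrative and not taken from the original implementation.

```python
import math

# Hydrophobicity scales from table 1
EPS = {"A": 0.7, "R": 0.0, "N": 0.33, "D": 0.0005, "C": 0.68, "Q": 0.68,
       "E": 0.0005, "G": 0.41, "H": 0.53, "I": 0.98, "L": 0.6, "K": 0.0005,
       "M": 0.78, "F": 1.0, "P": 0.65, "S": 0.45, "T": 0.51, "W": 0.96,
       "Y": 0.82, "V": 0.94}

SIGMA = 0.6     # nm
EPS_REP = 10.0  # kJ/mol
EPS_HP = 13.0   # kJ/mol
ALPHA = 0.27


def eps_ij(a, b):
    """Pair strength, assuming eps_ij = eps_hp * sqrt((eps_i * eps_j)^alpha)."""
    return EPS_HP * math.sqrt((EPS[a] * EPS[b]) ** ALPHA)


def phi_hp(r, a, b):
    """Hydrophobic potential of equation 3 in kJ/mol, for 0 < r <= sigma only."""
    if not 0.0 < r <= SIGMA:
        raise ValueError("this sketch only covers the branch r <= sigma")
    s = SIGMA / r
    return EPS_REP * s ** 8 - eps_ij(a, b) * (4.0 / 3.0 * s ** 6 - 1.0 / 3.0)


# Example: two Phenylalanine beads in contact at r = sigma.
print(phi_hp(0.6, "F", "F"))
```

For two Phenylalanine beads (ε_i = 1) at r = σ, the potential equals ε_rep − ε_ij = −3 kJ/mol, whereas a pair involving Arginine (ε_i = 0) is purely repulsive.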

2.3 Nuclear pore complex

The model described above has been used by A. Ghavami et al. to simulate the distribution of FG-Nups inside the yeast NPC. The scaffold is composed of repulsive spheres which have no further interaction with the amino acids. The yeast NPC contains 128 Nups. Details on the construction of the NPC can be found in reference [20].

As shown in figure 5(A), the FG-Nups inside the NPC arrange into a doughnut-shaped configuration. Figure 5(B) and (C) show that the charged amino acids reside close to the scaffold, whereas the FG-repeats are found further inside the NPC, forming a doughnut-like shape.

Figure 5: Iso-surface plots of Nup distribution in the yeast NPC. (A) Distribution of all amino acids, plot represents mass density of 140 mg/ml. (B) Distribution of charged amino acids. (C) Distribution of FG-repeats, plot represents average distance between FG-repeats of 2.7 nm.

Changing the charged amino acids to neutral beads and removing their hydrophilic potentials, leaving only a repulsive sphere with a radius of 0.6 nm, results in a remarkably different density distribution of Nups in the NPC: the Nups become concentrated close to the scaffold. Denaturing the Nups or reversing their sequence also destroys the doughnut shape. This shows that the doughnut shape is the result of the collective interaction of specific sequences of amino acids.


2.4 Limitations

Although the model is capable of simulating the density distribution of Nups inside the NPC, it is not able to confirm the formation of hydrogels inside the NPC as proposed by Frey and Görlich [17]. The model does show a density inside the doughnut shape of ≥ 200 mg/ml, at which a hydrogel could be formed. However, for the proposed sieve-like structure to be formed, β-sheets should be formed between FG-repeats, and the model created by A. Ghavami et al. is not able to form these β-sheets. The stability of a β-sheet arises from many interactions, not all of which are captured by this model. A β-sheet has hydrogen bonds between the backbones of the two strands participating in the sheet. In addition, modelling an amino acid as a sphere centered at the Cα atom causes steric problems and excludes the possibility of capturing interactions in which only the sidechains play a role.

The objective of this research project is to modify the existing model such that it is able to simulate the formation of β-sheets while maintaining the original properties of the model. This means that the model should still be able to reproduce the results discussed above and in reference [18]. On top of that, it should remain coarse grained enough to simulate a complete NPC in a reasonable time. Since the goal of coarse graining is to allow larger system sizes and/or longer simulation times, the resulting model should perform reasonably in these terms compared to the original model.


3 Implementing a folding mechanism

Many attempts have been made to capture protein folding in a coarse grained model. Because folding depends on a collection of interactions, many of which are not taken into account in a coarse grained model, this remains a difficult task. Often, a model has to be biased toward a certain configuration in order for folding to occur [26], or a lower degree of coarse graining has to be used, representing an amino acid with multiple beads [8, 25, 33, 34]. Other attempts to capture folding in an unbiased one-bead-per-amino-acid model involve modelling hydrogen bonds as non-bonded, angle-dependent interactions [14, 39]. In that way the complex behaviour of a group of atoms can be captured by one interaction site. However, this involves a significant increase in computation time and cannot be implemented in the computational method used by A. Ghavami et al. In this research project, a hydrogen bond model by Chen et al. is tested for its contribution to folding behaviour in the model by Ghavami. The model for backbone dipole-dipole interactions by Alemani et al. is also considered. It is stated in the literature that the interaction between the backbone dipoles is essential for the formation of stable β-sheets [8].

In this chapter, the implementation of these models into the model by Ghavami will be discussed.

3.1 Backbone hydrogen bonds

The one-bead-per-amino-acid model by Chen et al. [10, 11, 24] models the amino acids as hard spheres. The bonds connecting the beads have a fixed length of 0.38 nm and a fixed bending angle of 105°. The dihedral angle can move freely, however.

Hydrogen bonding between two amino acids is simulated using two pseudo atoms per amino acid. These pseudo atoms cannot move with respect to the amino acid they are connected to, but rather act as interaction sites. Pseudo O-atoms can only interact with pseudo H-atoms and vice versa; a pseudo H-atom cannot interact with other pseudo H-atoms or with beads. The pseudo atoms are placed such that, if a pseudo O-atom overlaps with a pseudo H-atom, a hydrogen bond is considered to be formed between the C-O group and the N-H group of an all-atom backbone. Thus, a pseudo H-atom is not placed at the position of the real H-atom. Although a pseudo H-atom is intended to lie in the plane defined by Cα-N-H, it extends a bit further than the H-atom in order to overlap with a pseudo O-atom in case of a hydrogen bond. The same holds for the pseudo O-atom.

In figure 6, the geometry of the model is shown. The large dots are the Cα beads i − 1, i and i + 1, from left to right. The thick arrows represent the bonds $\vec{u}_i$ and $\vec{u}_{i+1}$. The dotted line in the right-hand figure is the normal to the plane spanned by the three Cα beads, given by

\[
\vec{n}_i = \frac{\vec{u}_i \times \vec{u}_{i+1}}{\lVert \vec{u}_i \times \vec{u}_{i+1} \rVert}. \tag{4}
\]

The pseudo atoms O and H are represented by red and white dots respectively. The O-atom is positioned at $\tfrac{1}{3}\vec{u}_i$ along the bond and displaced by $\tfrac{\sigma}{2}\vec{n}_i$ above it, where σ = 0.38 nm. The H-atom is positioned at $\tfrac{2}{3}\vec{u}_i$ and displaced by $-\tfrac{\sigma}{2}\vec{n}_i$. Note that the displacement $\tfrac{\sigma}{2}\vec{n}_i$ does not change unless the cross product of $\vec{u}_i$ and $\vec{u}_{i+1}$ changes sign, in which case $\tfrac{\sigma}{2}\vec{n}_i$ flips to the opposite direction.
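The placement of the pseudo atoms can be sketched in a few lines of Python. This is only an illustration of the geometry, not the GROMACS virtual-site code: it assumes that the bond u_i points from bead i − 1 to bead i and that positions along the bond are measured from bead i − 1. The helper names are hypothetical.

```python
import math

SIGMA = 0.38  # nm


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def _scale(a, s):
    return tuple(x * s for x in a)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pseudo_atoms(r_prev, r_i, r_next):
    """Return the positions of the pseudo O- and H-atom on bond u_i.

    Assumes u_i = r_i - r_prev and u_{i+1} = r_next - r_i. The three beads
    must not be collinear, otherwise the normal of equation 4 is undefined.
    """
    u_i = _sub(r_i, r_prev)
    u_next = _sub(r_next, r_i)
    c = _cross(u_i, u_next)
    n = _scale(c, 1.0 / math.sqrt(sum(x * x for x in c)))  # equation 4
    o_atom = _add(_add(r_prev, _scale(u_i, 1.0 / 3.0)), _scale(n, SIGMA / 2.0))
    h_atom = _add(_add(r_prev, _scale(u_i, 2.0 / 3.0)), _scale(n, -SIGMA / 2.0))
    return o_atom, h_atom


# Example: three beads in the xy-plane, so the normal points along z.
print(pseudo_atoms((0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.38, 0.38, 0.0)))
```

In this example the O-atom lies 0.19 nm above the plane of the three beads and the H-atom 0.19 nm below it.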


Figure 6: Geometry of the pseudo-atoms, left: top view, right: side view. Black dots are Cα beads, red dots are pseudo O-atoms and white dots are pseudo H-atoms.

The interaction between pseudo O- and H-atoms is described by a shifted Lennard-Jones potential:

\[
V_{OH} = 4\epsilon\left[\left(\frac{\sigma}{r + r_0}\right)^{12} - \left(\frac{\sigma}{r + r_0}\right)^{6}\right], \tag{5}
\]

where r is the distance between the pseudo atoms O and H, σ = 0.38 nm and r₀ = 2^{1/6} σ. Note that r₀ is chosen such that the energy is minimal and equal to −ε when r = 0.
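A short Python sketch of the shifted Lennard-Jones potential of equation 5 makes the role of r₀ explicit. The well depth is left as a parameter because its value is not stated in this section; the default of 1.0 is a placeholder and not a value from the model.

```python
SIGMA = 0.38                      # nm
R0 = 2.0 ** (1.0 / 6.0) * SIGMA   # shift that moves the minimum to r = 0


def v_oh(r, epsilon=1.0):
    """Shifted Lennard-Jones potential between a pseudo O- and H-atom.

    r is the distance in nm; epsilon is the well depth (placeholder value).
    """
    s6 = (SIGMA / (r + R0)) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


# At full overlap the energy equals -epsilon; it decays towards zero with r.
print(v_oh(0.0), v_oh(0.2), v_oh(1.0))
```

Because (σ/r₀)⁶ = 1/2, the bracket in equation 5 evaluates to 1/4 − 1/2 at r = 0, which gives V_OH = −ε. For r > 0 only the attractive tail of the Lennard-Jones curve is sampled, so overlapping pseudo atoms are never repelled.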

Figure 7: Chen’s coarse grained model of a peptide forming (A) an α-helix and (B) a β-sheet. Grey spheres are amino acids, centered at the Cα’s; the small dark grey and black spheres are pseudo hydrogen and pseudo oxygen atoms respectively.


3.2 Backbone dipole-dipole interactions

Bereau et al. state in a 2009 paper [8] that dipole-dipole interactions are necessary in order to fold a peptide into stable β-sheets. Alemani et al. present a one-bead-per-amino-acid model in which the backbone dipole-dipole interaction is modelled as an angle-dependent potential. As mentioned above, this is computationally very expensive. However, a dipole can also be constructed physically by placing two charges of opposite sign, which is what Alemani et al. do when two dipoles are close to each other. In this project, this implementation will be used for all dipole-dipole distances.

Alemani et al. present a one-bead-per-amino-acid model which, like the model by Chen et al., does not differentiate between amino acids. All beads have no charge and a hydrophobic potential fitted to that of Alanine beads. To get a peptide into a given configuration (α-helix or β-sheet), they change the parameters of the dihedral and bending potentials. In this way they bias the peptide towards a certain secondary structure. The backbone dipole is unbiased, however [2].

Figure 8: Construction of the backbone dipole by Alemani et al.

The orientation and position of the backbone dipole are defined in a local reference system with axes v, n and m, see figure 8. The origin of the reference system is placed halfway along the peptide bond connecting C_{α,i} and C_{α,i+1}. The vector v points to C_{α,i+1}, and n is orthogonal to the plane defined by C_{α,i−1}, C_{α,i} and C_{α,i+1}. Finally, m = v × n. In this reference system the dipole is defined as follows:

\[
\mu_{i,v} = \mu_0\cos\varphi, \qquad \mu_{i,n} = \mu_0\sin\varphi\cos\vartheta, \qquad \mu_{i,m} = \mu_0\sin\varphi\sin\vartheta. \tag{6}
\]

Here ϑ is the angle of rotation around v and φ is the angle of rotation around n. The angle φ is fixed. However, ϑ is fitted to a dataset as a function of the bending angle γ. This dataset is created by extracting the orientation of the backbone dipole from PDB structures of peptides folded either into an α-helix or a β-sheet, see the inset of figure 8 [9]. The fit of ϑ to γ is given by

\[
\vartheta_i = -1.607\,\gamma_i + 0.094 + \frac{1.883}{\exp\left[(\gamma_i - \gamma_0)/\sigma\right] + 1}, \tag{7}
\]

with γ₀ = 1.730 rad and σ = 0.025 rad. The dipole will be implemented as two charges of opposite sign with a magnitude of 0.33 e, separated by 0.2273 nm. The charges interact with each other through the Coulomb law given in equation 1, with the exception that ε_r = 10.
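The dependence of the dipole orientation on the bending angle can be sketched as follows in Python. The sketch evaluates the fit of equation 7 and the components of equation 6 in the local (v, n, m) frame. The value of the fixed angle φ is not given in this section, so it is a required argument here, and the dipole magnitude μ₀ is taken as charge times separation (0.33 e × 0.2273 nm), which is an assumption of this illustration.

```python
import math

GAMMA_0 = 1.730  # rad
WIDTH = 0.025    # rad, the sigma of equation 7
CHARGE = 0.33    # e
SEPARATION = 0.2273  # nm


def theta(gamma):
    """Rotation angle around v as a function of the bending angle (equation 7)."""
    return (-1.607 * gamma + 0.094
            + 1.883 / (math.exp((gamma - GAMMA_0) / WIDTH) + 1.0))


def dipole_components(gamma, phi, mu_0=CHARGE * SEPARATION):
    """Dipole components (mu_v, mu_n, mu_m) in the local frame (equation 6)."""
    th = theta(gamma)
    return (mu_0 * math.cos(phi),
            mu_0 * math.sin(phi) * math.cos(th),
            mu_0 * math.sin(phi) * math.sin(th))


# Example: the fit jumps by about 1.883 rad around gamma_0.
print(theta(1.6), theta(1.73), theta(1.9))
```

The logistic term switches the orientation over a narrow window of bending angles around γ₀, which separates the helical from the extended regime in the fitted dataset.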

3.3 Method

Simulations have been performed using GROMACS 4.5.3 [6], with the Stochastic Dynamics integrator and a timestep of 0.02 ps. In order to implement the pseudo atoms, virtual sites have been used [23]. To create the configurations described for the models by Chen et al. and Alemani et al., a separate customization of the GROMACS source code was necessary for each model. Both alterations have been included in the Appendix. They have only been tested with GROMACS version 4.5.3. In the case of the model by Chen et al., the alterations ensure that the position of the virtual site relative to its constructing amino acid remains fixed. In the case of the model by Alemani et al., the alterations make the orientation of the dipole depend on the bending angle of the constructing amino acids.


4 Results

4.1 Simulating a β-hairpin

To test the capability of the model to represent the secondary structure of a β-sheet, simulations have been run on a peptide of 16 beads whose native state is a stable β-hairpin (PDB code 1GB1m3). This peptide is designed to have a highly stable native state at room temperature.

To analyze the configuration of the peptide (unfolded or folded into the native state), the average distance between neighbouring beads from the two strands is measured: the distances between beads 1 and 16, 2 and 15, 3 and 14, etc. are averaged in each time frame. In order for the peptide to be able to reach a folded configuration, the hydrophobicity scales of the charged amino acids Aspartic acid, Glutamic acid, Arginine and Lysine are replaced with that of Alanine. Ghavami et al. made the hydrophilic potential of these amino acids very repulsive in order to fit the Stokes radius of charged segments of FG-Nups. This was successful for modelling the configuration of disordered proteins. However, when applied to a peptide with a folded native state, these potentials prevented the strands from approaching each other. The hydrophobicity scale of Alanine was chosen because Alemani et al. use the hydrophobic potential of Alanine for all their beads. Before valid predictions of the folding behaviour can be made, these scales should be refitted. For qualitative comparisons between different models, however, replacing the highly repulsive hydrophilic potentials of the charged amino acids with that of Alanine is assumed to be sufficient. The results of the simulations of 1GB1m3 are shown in figure 9.
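The average neighbour distance used as order parameter can be computed per frame with a few lines of Python. This is a sketch of the measure as described in the text, not the original analysis script. The function name and the optional `n_pairs` argument are assumptions; by default all pairs (1, N), (2, N − 1), ... up to the middle of the chain are included, since the text does not state where the pairing stops.

```python
import math


def average_neighbour_distance(coords, n_pairs=None):
    """Average distance between bead k and bead N+1-k (1-based) in one frame.

    coords is a list of (x, y, z) positions in nm. n_pairs limits the number
    of pairs counted from the chain ends; the default uses N // 2 pairs.
    """
    n = len(coords)
    if n_pairs is None:
        n_pairs = n // 2
    pairs = [(k, n - 1 - k) for k in range(n_pairs)]
    return sum(math.dist(coords[i], coords[j]) for i, j in pairs) / len(pairs)


# Example: an idealized 16-bead hairpin with strands 0.5 nm apart.
hairpin = ([(0.38 * k, 0.0, 0.0) for k in range(8)]
           + [(0.38 * (15 - k), 0.5, 0.0) for k in range(8, 16)])
print(average_neighbour_distance(hairpin))
```

For a fully stretched chain the same measure grows with the chain length, so small values indicate that the two strands lie next to each other.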

Figure 9: Average distance between all beads that are neighbours in the native state. Using Ghavami’s model in combination with (A) the dipole-dipole interaction by Alemani et al. (B) no additional interactions (C) the hydrogenbond interaction by Chen et al.

To illustrate the probability for each model to reside in a given configuration, the probability of average distances is shown in figure 10.


Figure 10: The probability for the peptide 1GB1m3 to have a given average neighbour distance, for each model.

Figure 10 shows that the distribution of the average neighbour distance does not change much when the hydrogen bonds or the dipole-dipole interaction are implemented. The original model shows a slightly broader peak, and the hydrogen bond model shows the smallest average neighbour distance. The model including the dipole-dipole interaction shows the largest tail, representing unfolded configurations.

4.1.1 Considering the hydrogenbonds by Chen et al.

Looking at the shape of the peptide during a simulation with hydrogen bonds reveals a collapsed peptide. Although this does result in small average neighbour distances, it does not result in a straight β-hairpin as seen in the native state. In addition, the virtual sites that represent hydrogen bonds do not form stable connections. The decrease in potential energy due to the hydrogen bonds does not seem to differ significantly between a shape that approximates the native state and a random collapsed state. It should be noted that the radius of the beads used by Chen et al. is 0.38 nm, whereas the radius of the beads used in the simulations with hydrogen bonds discussed here is 0.47 to 0.6 nm. This will of course cause problems with the geometric requirement that the virtual sites overlap in order to form a hydrogen bond. However, decreasing the radius of the beads would invalidate the model for simulations of disordered proteins as well as for simulations of natively folded proteins. To compensate for the larger bead radius, simulations have been run with the positions of the virtual sites extended according to the bead size. This did not lead to a significant difference in the results.

4.1.2 Considering the backbone dipoles by Alemani et al.

Figure 10 shows that the model including the dipoles has the narrowest peak, by a small margin. Studying the simulation, a configuration can be seen that resembles the native state better than the more collapsed configuration of the model with hydrogen bonds. The two strands of the β-hairpin are more stretched than in the model with hydrogen bonds. Figure 10 also shows that the distribution has a relatively large tail of occurrences of large average


these larger average neighbour distances as a result. Although the β-hairpin is far from stable, a model including backbone dipoles seems to represent the native state of 1GB1m3 better than the model including hydrogen bonds. Although the hydrophobicity scales of the charged amino acids, among other adjustments, should be refitted before quantitative predictions can be made, the backbone dipole model was first studied in different simulations.

4.2 Aggregation of GNNQQNY

As discussed in the introduction, a lot of research is done on the formation of protein aggregates. A one-bead-per-amino-acid model could allow simulations of larger systems or longer time scales in order to better understand the mechanism of aggregate formation. To study the folding capabilities of the model with the backbone dipoles, a simulation has been performed on a system containing three peptides with the sequence GNNQQNY. These are the relevant segments from the Sup35 protein (see section 1.3). Balbirnie et al. found that GNNQQNY forms parallel β-sheets of three peptides [4], see figure 11.

Figure 11: A schematic of crystallized GNNQQNY aggregates. The structure was revealed by X-ray crystallography [4].

Simulations at 300 K in a periodic box show that three strands of GNNQQNY do not form a stable aggregate: occasionally a third strand separates from the other two. Moreover, the structure of the aggregate is not ordered at all, but rather dynamic. To study the energetically favourable state, the simulation has also been performed at 100 K. The result is a more stable aggregate. However, the resulting β-sheet is anti-parallel. As stated earlier by Gsponer et al. [21], the backbone dipoles result in a high propensity for anti-parallel sheet formation, and the interactions of the sidechains are the cause of the final configuration being parallel. Because the model by Ghavami et al. does not explicitly include sidechains, it is not surprising that an anti-parallel β-sheet is the result of this simulation.

4.3 Simulating a hydrogel with β-sheet crosslinks

Experiments by Ader et al. [1] show that the segment of Nsp1 consisting of residues 2-175 has a high propensity to form a hydrogel. Therefore, a simulation has been run on a system of three of these segments in a periodic box of 8 × 12 × 6 nm. This size was chosen to match the concentration of 3 mM used in the experiments by Ader et al.

To quantitatively analyze the formation of β-sheets, a different tool was used. In the simulation of the β-hairpin, the amino acid pairs that are neighbours in the folded state are known beforehand. In the case of a hydrogel, these pairs are not known, so each possible combination has to be checked. The basic principle of the analysis tool is that a β-sheet has to meet four requirements to be counted as such:

• Two neighbouring beads have to be within a given distance.

• Two neighbouring beads have to be oriented within a given angle from each other.

• A β-sheet has to have a minimum number of bead pairs (i.e. length).

• A β-sheet has to exist for a minimum number of time frames.

The threshold for each requirement can be set freely. The analysis tool and the requirements are explained in more detail in Appendix 4A.

Simulations were run in which the requirements for both the minimal length of a β-sheet and the minimal number of time frames were varied. The maximal distance between neighbouring beads was set to 0.8 nm and no requirement was set on the orientation yet. To study the effect of including the interactions between backbone dipoles, the simulation was performed with and without the backbone dipoles. The results are shown in table 2. Increasing the minimal number of time frames that a β-sheet has to exist leads to a slight decrease in the fraction of beads participating in a β-sheet, and this effect grows gradually as the time threshold is increased. Increasing the minimal length of a β-sheet has a much larger impact: raising the minimum number of beads by 2 decreases the fraction by almost one order of magnitude.

For a minimal length of 5 or 7 beads, the model without backbone dipoles gives the larger fraction. However, the model with backbone dipole interaction leads to a larger fraction of beads participating in β-sheets when a threshold of 9 beads is set for the minimum length. Visual inspection of the simulation shows that the dipoles cause the protein to reside in a more extended configuration. This results in fewer β-sheets, but when a β-sheet is formed, it is usually longer.

                           minimal length   minimal length   minimal length
                           5 beads          7 beads          9 beads
minimal time    without    0.2155           0.0300           0.0026
1 frame         with       0.1373           0.0252           0.0028
minimal time    without    0.2146           0.0296           0.0025
5 frames        with       0.1366           0.0249           0.0027
minimal time    without    0.2113           0.0283           0.0023
10 frames       with       0.1343           0.0241           0.0026

Table 2: Average fraction of beads participating in a β-sheet (with and without dipoles).

To see whether the backbone dipole-dipole interaction has a significant effect on the orientation of the beads involved in a β-sheet, the dependence of the fraction of beads participating in a β-sheet on the maximal angle was tested. The minimum length for a β-sheet was set at 5 beads. Table 3 shows that the angle criterion rules out most of the β-sheets. Note that only one of the beads has to fail the angle criterion for the whole β-sheet to be rejected. However, in this case we see that the model with backbone dipoles gives the larger fraction. Although the difference is not very large, it is consistent, even when the maximum angle is increased to 60 degrees.

                           maximal angle    maximal angle    maximal angle
                           30 degrees       45 degrees       60 degrees
minimal time    without    0.0016           0.0114           0.0392
1 frame         with       0.0020           0.0132           0.0413
minimal time    without    0.0014           0.0107           0.0376
6 frames        with       0.0018           0.0125           0.0399

Table 3: Average fraction of beads participating in a β-sheet (with and without dipoles).

The contribution of the backbone dipole interaction seems to come from the interaction of sequential dipoles (the dipoles of beads i and i + 1) much more than from other dipole pairs. However, as discussed in section 2.1, the bonded interactions also include this effect. The bonded potentials are constructed from the Ramachandran data of denatured proteins. The denaturation was performed by solvating the proteins in urea. This cancels the hydrophobic effects and also much of the long-range electrostatic effects, but it has little effect on short-range interactions. Therefore, much of the dipole-dipole interaction between sequential peptide bonds is preserved in these data [7] and is counted twice when the backbone dipoles are added. Indeed, the radius of gyration of the model with backbone dipoles is significantly larger than that of the model without them.

Therefore, the interaction between sequential dipoles was excluded. This reduced the radius of gyration back to the experimental values. However, the propensity to form structured β-sheets is then no longer significantly different from that of the simulation without backbone dipoles. This confirms that the effect of the backbone dipoles came mostly from the interaction between sequential dipoles.


5 Discussion

5.1 Hydrogen bonds

As discussed in section 4.1.1, including the hydrogen bonds as described by Chen et al. has only a small effect on the average neighbour distance when simulating a β-hairpin. The pseudo atoms do not seem to be able to form stable hydrogen bonds, and they do not contribute to a structured configuration. Multiple aspects might have caused this. Firstly, the bead radius in the model used in this work is larger than that used by Chen et al. This means that the pseudo atoms will not properly overlap, and thus the hydrogen bonds will have a decreased strength. However, extending the pseudo atoms further out from the Cα bead, such that they can properly overlap again, did not yield stable hydrogen bonds either. A possible reason is that the distance between two strands is now larger because of the larger bead size and the extended pseudo atoms. Because the rest of the geometry stays the same, a β-hairpin with this increased average neighbour distance may not be a configuration that the peptide is able to adopt: the bond length or other parameters can prevent the peptide from folding into it. Secondly, the pseudo atoms often get close to arbitrary other pseudo atoms. Hydrogen bonds are then formed at random, so they are not specific for certain configurations but lower the total energy for a large range of configurations.

The model for hydrogen bonds as presented in section 4.1.1 does not seem to be an effective mechanism to fold a peptide into a β-sheet. Although it brings the peptide into a more collapsed configuration and thus decreases the average neighbour distance, it fails to bring structure and stability to the peptide.

5.2 Backbone dipoles

Including backbone dipoles leads to a more stretched and structured peptide in the simulation with the β-hairpin when compared to the simulation with hydrogen bonds. The model is not able to form a stable configuration, but the average configuration resembles the native state better. This is caused by the effect that the dipoles have on the secondary structure. Where the hydrogen bonds often make random connections, the backbone dipole steers the peptide to a specific configuration.

Using this model for the simulation of aggregation of GNNQQNY does not result in realistic aggregates. Instead of a parallel β-sheet, an anti-parallel β-sheet is formed. This is caused by the absence of interactions between sidechains, which are responsible for the favorability of the parallel configuration. Therefore, the simulation of aggregation events with a one-bead-per-amino-acid model will not be possible with the model used in this report.

Section 4.3 argues that the backbone dipole-dipole interaction of sequential beads is already incorporated in the bonded potentials of the model. Taking this interaction into account again through the backbone dipoles therefore leads to an overestimation of this effect. Indeed, the radius of gyration increased significantly in this situation, due to the straightening effect the backbone dipole has on the chain. When the interaction between sequential dipoles is excluded, the effect of the dipoles on the structure becomes negligible. Apparently, the contribution of backbone dipoles can be captured by the bonded interactions of Ghavami et al.


5.3 Steric interactions

When simulating disordered proteins, the representation of an amino acid as a single bead can be sufficient. However, when a protein folds, amino acids come very close together. The final configuration is not only determined by local interactions like hydrogen bonds and dipole-dipole interactions. The steric shape of an amino acid also strongly determines what the possible configurations are.

Because the amino acids are connected to each other by peptide bonds attached to the Cα atom, the model by Ghavami et al. places the beads at the Cα position. However, this does not correspond to the actual center of an amino acid. To incorporate the steric shape of the sidechains, the model uses a bead radius of 0.6 nm, which is even larger than the bond length (0.38 nm). As a result, the sidechains often extend further than the single bead can cover, while on the other side of the amino acid an area is sterically blocked by the bead although in reality this space is unoccupied. This means that the distance between two backbones whose amino acids are sterically touching is much larger in the current model than in reality. This has proven to be a large problem when folding a peptide into a β-sheet, because the backbones cannot approach each other as closely as they would in reality.

Changing the steric shape of the model by Ghavami et al. would mean that the hydrophobicity scales of the model have to be refitted. Therefore, attempts have been made to incorporate the folding mechanism in the existing model. As described above, this has not led to a model that can successfully model the folding of proteins. It should be stressed again that none of the results discussed in chapter 4 have been obtained with the hydrophobicity scales as published by Ghavami et al. In order to get folding events at all, the repulsive hydrophilic potential of the charged beads has been replaced with either a steric or a hydrophobic potential. Therefore, it can be concluded that the model as is does not support simulating the formation of folded configurations.


6 Considering sidechains

As stated in chapter 1, the goal of this project is to design a model that can simulate folding events while remaining computationally fast. When implementing hydrogen bonds or dipoles, the model went from one to three beads per amino acid. However, this did not lead to a large increase in computation time. The extra beads were virtual sites, which act as extra interaction sites but do not introduce extra degrees of freedom. (For a peptide of 56 beads, the computation time increased by 37 % when pseudo O and H atoms were introduced.) Because of the low computational cost of virtual sites, attempts have been made to represent the sidechain by a virtual site. Introducing a sidechain as a virtual site gives the opportunity to approximate the steric shape of an amino acid much better. The Cα bead can be reduced in size to better approximate the shape of the backbone. For every type of amino acid, not only the size of the sidechain bead but also the distance between the Cα bead and the sidechain bead can be specified. In addition, the hydrophobic potential can be placed on the sidechain, giving a more realistic interaction site for this potential.

6.1 Implementing sidechains

Before the Cα-sidechain distance and the sidechain size are optimized for each amino acid, a simplified model is tested to study the potential of the proposed model.

The sidechains are implemented at a distance of 0.2 nm from the Cα beads. The sidechain of bead i is placed in the plane defined by the Cα beads i−1, i and i+1, and points away from the triangle formed by these three beads. That is, the vector from Cα bead i to the sidechain is oriented along $\mathbf{u}_i - \mathbf{u}_{i+1}$, where $\mathbf{u}_i = \mathbf{C}_{\alpha,i} - \mathbf{C}_{\alpha,i-1}$ and $\mathbf{u}_{i+1} = \mathbf{C}_{\alpha,i+1} - \mathbf{C}_{\alpha,i}$. As can be seen in figure 12, the sidechains overlap very well with the actual position of the sidechains in the all atom model, even for this simplified model in which the distance and size of the sidechains are not varied.

The interaction site for the hydrophobic and charged potentials is moved from the Cα beads to the sidechain beads. The Cα beads now have a steric radius of 0.38 nm and are slightly hydrophobic. The sidechains have a steric radius of 0.5 nm. The bonded interactions are kept the same. As stated in section 2.1, the bonded potentials are also influenced by the steric interactions of the sidechains. Implementing these sidechains explicitly will probably overestimate this effect. However, because these interactions are steric, and thus represent a 'forbidden' configuration instead of a configuration which is just energetically unfavorable, it is assumed that this overestimation does not influence the final results. Still, the radius of gyration of the model should again be validated against experimental data before any quantitative conclusion can be drawn with this model.

6.2 Results

The simulation in section 4.1 is repeated for this model to study the average neighbour distance. The average neighbour distance is much more stable over time, but shows a distribution similar to that of the simulations in section 4.1. As can be seen in figure 13, the peptide is much more structured, similar to the model with the backbone dipoles. In contrast to that model, however, the resulting β-hairpin is much more stable in this simulation.


Figure 12: Visualization of the sidechains. The black lines represent the all atom model of 1GB1. The yellow beads represent the sidechains and the colored beads represent the Cα beads. The size of the beads does not correspond to the actual steric size (the steric size is larger).

Figure 13: Left: Average neighbour distance for a simulation of 1GB1 with sidechains. Right: the distributions of average neighbour distances for this simulation.

6.3 Including hydrogen bonds

In previous simulations, the hydrogen bonds implemented as pseudo atom interactions did not yield a structured configuration. Because the sidechains gave rise to a very structured peptide, a simulation was performed with these pseudo atoms included again. Figure 14 shows that the average neighbour distance is lower in this case. The peptide also resides longer in a given state. Thus the hydrogen bonds stabilize the β-sheet. This can also be seen in the smaller peak in the distribution of average neighbour distances.

Figure 14: Left: Average neighbour distance for a simulation of 1GB1 with sidechains and hydrogen bonds. Right: the distributions of average neighbour distances for this simulation.

6.4 Revisiting hydrogels

To test the capability of the model to form β-sheet crosslinks in the Nsp1 protein, the simulation described in section 4.3 was repeated with the sidechain model. The criteria were a minimal β-sheet length of 5 beads, a maximal neighbour distance of 0.8 nm, a minimal time of 1 frame and a maximal angle of 30 degrees. These criteria correspond to the first cell of table 3. Previously, when backbone dipoles were included, the average fraction of beads participating in a β-sheet was 0.0020 with these criteria. A simulation with the sidechain model shows an average fraction of 0.0289, an increase by more than a factor of 10 (0.0289/0.0020 ≈ 14). A closer study of the simulation reveals that β-hairpins are also being formed. These structures are not crosslinks, but they do contribute to the increase of the average fraction of β-sheets. Nonetheless, the model shows a significant increase in propensity towards β-sheets without collapsing into a single lump.

6.5 Future work

Including sidechains seems a promising method to simulate folded proteins. Demonstrations with a simple example of this method have led to more stable β-hairpins than were achieved with the one-bead-per-amino-acid model. Also, in a simulation of hydrogels, a significant increase in the propensity to form β-sheets was observed. Even bigger differences are found in the structure of peptides and proteins. The steric interactions between sidechains often prevent the amino acid chain from collapsing randomly, forcing it into more realistic configurations.

Including sidechains creates a large freedom in fitting parameters. Due to lack of time, the parameters for the sidechains were extremely simple in the simulations discussed here, using the same size and position for each sidechain. However, the distance between the Cα bead and the sidechain bead, and the sidechain bead size, can be changed for every type of amino acid to best represent the steric shape of the amino acid. On top of that, the hydrophobic interaction can be placed on the sidechain bead and fitted similarly to the method used by Ghavami et al. It might also be possible to make the hydrophobic interaction of the Cα bead specific for each type of amino acid. In this way, the folding affinity of the backbone can be made amino acid specific. Also, long


the sidechain potential, similar to the dipoles used by Alemani et al.

Fitting the sidechain parameters so that these opportunities can be utilized is left for future work.


7 Conclusion

In this report, attempts to include β-sheet folding in a one-bead-per-amino-acid model have been discussed. The original model by Ghavami et al. was expanded firstly with hydrogen bonds as presented by Chen et al. and secondly with backbone dipoles following the model by Alemani et al. Although the hydrogen bonds did increase the propensity for β-sheet formation, the resulting peptide was unstructured. The contribution of backbone dipoles to the final structure originates mainly from the interaction of two sequential backbone dipoles. However, this interaction was already included in the original model by Ghavami et al. When interactions between sequential backbone dipoles are excluded, the dipoles end up having a negligible effect on the final structure.

A new model is proposed, modelling the Cα atom and the sidechain as two separate beads. Simulations done with a simple implementation of this model show a β-sheet propensity which is higher than that of the other models discussed in this report, and the resulting structures resemble the native configuration much more closely. The proposed model can be greatly improved by choosing the position, size and hydrophobic potential of the Cα and sidechain beads specifically for each amino acid. However, due to lack of time, this is left for future work.


References

[1] Ader, C., Frey, S., Maas, W., Schmidt, H. B., Görlich, D., and Baldus, M. Amyloid-like interactions within nucleoporin FG hydrogels. Proceedings of the National Academy of Sciences 107, 14 (2010), 6281–6285.

[2] Alemani, D., Collu, F., Cascella, M., and Dal Peraro, M. A nonradial coarse-grained potential for proteins produces naturally stable secondary structure elements. Journal of Chemical Theory and Computation 6, 1 (2009), 315–324.

[3] Anandakrishnan, R., Daga, M., and Onufriev, A. V. An N log N generalized Born approximation. Journal of Chemical Theory and Computation 7, 3 (2011), 544–559.

[4] Balbirnie, M., Grothe, R., and Eisenberg, D. S. An amyloid-forming peptide from the yeast prion Sup35 reveals a dehydrated β-sheet structure for amyloid. Proceedings of the National Academy of Sciences 98, 5 (2001), 2375–2380.

[5] Barton, S., Jacak, R., Khare, S. D., Ding, F., and Dokholyan, N. V. The length dependence of the polyQ-mediated protein aggregation. Journal of Biological Chemistry 282, 35 (2007), 25487–25492.

[6] Bekker, H., Berendsen, H., Dijkstra, E., Achterop, S., Vondrumen, R., Vanderspoel, D., Sijbers, A., Keegstra, H., Renardus, M., DeGroot, R., et al. GROMACS: a parallel computer for molecular-dynamics simulations.

[7] Bennion, B. J., and Daggett, V. The molecular basis for the chemical denaturation of proteins by urea. Proceedings of the National Academy of Sciences 100, 9 (2003), 5142–5147.

[8] Bereau, T., and Deserno, M. Generic coarse-grained model for protein folding and aggregation. The Journal of chemical physics 130, 23 (2009), 235106.

[9] Cascella, M., Neri, M. A., Carloni, P., and Peraro, M. D. Topologically based multipolar reconstruction of electrostatic interactions in multiscale simulations of proteins. Journal of Chemical Theory and Computation 4, 8 (2008), 1378–1385.

[10] Chen, J. Z., and Imamura, H. Universal model for α-helix and β-sheet structures in protein. Physica A: Statistical Mechanics and its Applications 321, 1 (2003), 181–188.

[11] Chen, J. Z., Lemak, A. S., Lepock, J. R., and Kemp, J. P. Minimal model for studying prion-like folding pathways. Proteins: Structure, Function, and Bioinformatics 51, 2 (2003), 283–288.

[12] Chook, Y., and Blobel, G. Karyopherins and nuclear import. Current opinion in structural biology 11, 6 (2001), 703–715.

[13] DePace, A. H., Santoso, A., Hillner, P., and Weissman, J. S. A critical role for amino-terminal glutamine/asparagine repeats in the formation and propagation of a yeast prion. Cell 93, 7 (1998), 1241–1252.

[14] Enciso, M., and Rey, A. Improvement of structure-based potentials for protein folding by native and nonnative hydrogen bonds. Biophysical journal 101, 6 (2011), 1474–1482.

[15] Fluitt, A. M., and de Pablo, J. J. An analysis of biomolecular force fields for simulations of polyglutamine in solution. Biophysical journal 109, 5 (2015), 1009–1018.

[16] Frey, S., and Görlich, D. FG/FxFG as well as GLFG repeats form a selective permeability barrier with self-healing properties. The EMBO Journal 28, 17 (2009), 2554–2567.

[17] Frey, S., Richter, R. P., and Görlich, D. FG-rich repeats of nuclear pore proteins form a three-dimensional meshwork with hydrogel-like properties. Science 314, 5800 (2006), 815–817.

[18] Ghavami, A. Coarse-grained molecular dynamics simulation of transport through the nuclear pore complex.

[19] Ghavami, A., van der Giessen, E., and Onck, P. R. Coarse-grained potentials for local interactions in unfolded proteins. Journal of Chemical Theory and Computation 9, 1 (2012), 432–440.

[20] Ghavami, A., Veenhoff, L. M., van der Giessen, E., and Onck, P. R. Probing the disordered domain of the nuclear pore complex through coarse-grained molecular dynamics simulations. Biophysical journal 107, 6 (2014), 1393–1402.

[21] Gsponer, J., Haberthür, U., and Caflisch, A. The role of side-chain interactions in the early steps of aggregation: Molecular dynamics simulations of an amyloid-forming peptide from the yeast prion Sup35. Proceedings of the National Academy of Sciences 100, 9 (2003), 5154–5159.

[22] Gsponer, J., and Vendruscolo, M. Theoretical approaches to protein aggregation. Protein and peptide letters 13, 3 (2006), 287–293.

[23] Hess, B., van Der Spoel, D., and Lindahl, E. GROMACS user manual version 4.5.4. University of Groningen, The Netherlands (2010).

[24] Imamura, H., and Chen, J. Z. Dependence of folding dynamics and structural stability on the location of a hydrophobic pair in β-hairpins. Proteins: Structure, Function, and Bioinformatics 63, 3 (2006), 555–570.

[25] Irbäck, A., Sjunnesson, F., and Wallin, S. Three-helix-bundle protein in a Ramachandran model. Proceedings of the National Academy of Sciences 97, 25 (2000), 13614–13618.

[26] Klimov, D., Betancourt, M., and Thirumalai, D. Virtual atom representation of hydrogen bonds in minimal off-lattice models of α helices: effect on stability, cooperativity and kinetics. Folding and Design 3, 6 (1998), 481–496.

[27] Kohn, J. E., Millett, I. S., Jacob, J., Zagrovic, B., Dillon, T. M., Cingel, N., Dothager, R. S., Seifert, S., Thiyagarajan, P., Sosnick, T. R., et al. Random-coil behavior and the dimensions of chemically unfolded proteins. Proceedings of the National Academy of Sciences of the United States of America 101, 34 (2004), 12491–12496.

[28] McKee, T., and McKee, J. R. Biochemistry: the molecular basis of life. Oxford University Press, 2012.

[29] Noid, W. Perspective: coarse-grained models for biomolecular systems. The Journal of chemical physics 139, 9 (2013), 090901.

[30] Peters, R. Translocation through the nuclear pore complex: selectivity and speed by reduction-of-dimensionality. Traffic 6, 5 (2005), 421–427.


[32] Rout, M. P., Aitchison, J. D., Magnasco, M. O., and Chait, B. T. Virtual gating and nuclear transport: the hole picture. Trends in cell biology 13, 12 (2003), 622–628.

[33] Smith, A. V., and Hall, C. K. Protein refolding versus aggregation: computer simulations on an intermediate-resolution protein model. Journal of molecular biology 312, 1 (2001), 187–202.

[34] Takada, S., Luthey-Schulten, Z., and Wolynes, P. G. Folding dynamics with nonadditive forces: a simulation study of a designed helical protein and a random heteropolymer. The Journal of chemical physics 110, 23 (1999), 11616–11629.

[35] Tozzini, V., Rocchia, W., and McCammon, J. A. Mapping all-atom models onto one-bead coarse-grained models: General properties and applications to a minimal polypeptide model. Journal of chemical theory and computation 2, 3 (2006), 667–673.

[36] Wade Jr, L. Amino acids, peptides and proteins. Organic Chemistry (2010), 1153–1193.

[37] Wu, C., and Shea, J.-E. Coarse-grained models for protein aggregation. Current opinion in structural biology 21, 2 (2011), 209–220.

[38] Yamada, J., Phillips, J. L., Patel, S., Goldfien, G., Calestagne-Morelli, A., Huang, H., Reza, R., Acheson, J., Krishnan, V. V., Newsam, S., et al. A bimodal distribution of two distinct categories of intrinsically disordered structures with separate functions in FG nucleoporins. Molecular & Cellular Proteomics 9, 10 (2010), 2205–2224.

[39] Yap, E.-H., Fawzi, N. L., and Head-Gordon, T. A coarse-grained α-carbon protein model with anisotropic hydrogen-bonding. Proteins: Structure, Function, and Bioinformatics 70, 3 (2008), 626–638.

8 Appendix

8.1 2A

Figure 15: Bending potentials for all combinations of amino acids. G stands for Glycine, P for Proline and X for all other amino acids. O represents any type of amino acid and Y represents any type except P.


8.2 3A

8.2.1 Alternate code for simulations of model by Chen et al.

    static void constr_vsite3OUT(rvec xi, rvec xj, rvec xk, rvec x,
                                 real a, real b, real c, t_pbc *pbc)
    {
        rvec xij, xik, temp;

        pbc_rvec_sub(pbc, xj, xi, xij);
        pbc_rvec_sub(pbc, xk, xi, xik);
        cprod(xij, xik, temp);
        c /= norm(temp);
        /* 15 Flops without division by norm(temp).
         * temp is normalized to keep the coordinates for the virtual
         * sites constant. See spread_vsite3OUT().
         * This was done on May 19 2015 by Wopke Hellinga */

        x[XX] = xi[XX] + a*xij[XX] + b*xik[XX] + c*temp[XX];
        x[YY] = xi[YY] + a*xij[YY] + b*xik[YY] + c*temp[YY];
        x[ZZ] = xi[ZZ] + a*xij[ZZ] + b*xik[ZZ] + c*temp[ZZ];
        /* 18 Flops */
    }

    static void spread_vsite3OUT(t_iatom ia[], real a, real b, real c,
                                 rvec x[], rvec f[], rvec fshift[],
                                 t_pbc *pbc, t_graph *g)
    {
        rvec    xvi, xij, xik, fv, fj, fk, temp;
        real    cfx, cfy, cfz;
        atom_id av, ai, aj, ak;
        ivec    di;
        int     svi, sji, ski;

        av = ia[1];
        ai = ia[2];
        aj = ia[3];
        ak = ia[4];

        sji = pbc_rvec_sub(pbc, x[aj], x[ai], xij);
        ski = pbc_rvec_sub(pbc, x[ak], x[ai], xik);
        /* 6 Flops */

        copy_rvec(f[av], fv);

        /* The two lines beneath are added to divide c (the third
         * coordinate of the virtual site) by the norm of the cross
         * product of vectors rij and rik (xij and xik resp.). This is
         * done to take into account that the third coordinate of the
         * virtual site is normalized, i.e. the cross product in
         * 'constr_vsite3OUT' is divided by the norm of the resulting
         * vector. This is done to make the third coordinate independent
         * of the angle between rij and rik.
         * Wopke Hellinga May 19 */
        cprod(xij, xik, temp);
        c /= norm(temp);

        cfx = c*fv[XX];
        cfy = c*fv[YY];
        cfz = c*fv[ZZ];
        /* 3 Flops */

        fj[XX] =  a*fv[XX]     - xik[ZZ]*cfy + xik[YY]*cfz;
        fj[YY] =  xik[ZZ]*cfx  + a*fv[YY]    - xik[XX]*cfz;
        fj[ZZ] = -xik[YY]*cfx  + xik[XX]*cfy + a*fv[ZZ];

        fk[XX] =  b*fv[XX]     + xij[ZZ]*cfy - xij[YY]*cfz;
        fk[YY] = -xij[ZZ]*cfx  + b*fv[YY]    + xij[XX]*cfz;
        fk[ZZ] =  xij[YY]*cfx  - xij[XX]*cfy + b*fv[ZZ];
        /* 30 Flops */

        f[ai][XX] += fv[XX] - fj[XX] - fk[XX];
        f[ai][YY] += fv[YY] - fj[YY] - fk[YY];
        f[ai][ZZ] += fv[ZZ] - fj[ZZ] - fk[ZZ];
        rvec_inc(f[aj], fj);
        rvec_inc(f[ak], fk);
        /* 15 Flops */

        if (g) {
            ivec_sub(SHIFT_IVEC(g, ia[1]), SHIFT_IVEC(g, ai), di);
            svi = IVEC2IS(di);
            ivec_sub(SHIFT_IVEC(g, aj), SHIFT_IVEC(g, ai), di);
            sji = IVEC2IS(di);
            ivec_sub(SHIFT_IVEC(g, ak), SHIFT_IVEC(g, ai), di);
            ski = IVEC2IS(di);
        } else if (pbc) {
            svi = pbc_rvec_sub(pbc, x[av], x[ai], xvi);
        } else {
            svi = CENTRAL;
        }

        if (fshift && (svi != CENTRAL || sji != CENTRAL || ski != CENTRAL)) {
            rvec_dec(fshift[svi], fv);
            fshift[CENTRAL][XX] += fv[XX] - fj[XX] - fk[XX];
            fshift[CENTRAL][YY] += fv[YY] - fj[YY] - fk[YY];
            fshift[CENTRAL][ZZ] += fv[ZZ] - fj[ZZ] - fk[ZZ];
            rvec_inc(fshift[sji], fj);
            rvec_inc(fshift[ski], fk);
        }

        /* TOTAL: 54 flops */
    }
