Adaptation and Application of Random Graph Theory to the Polymerization of Linseed Oil

(1)

University of Amsterdam

MSc Computational Science

Master Thesis

Adaptation and Application of Random Graph

Theory to the Polymerization of Linseed Oil

by

Tamika E. van ’t Hoff, BSc

10769250

June 2019

42 ECTS

Research conducted between November 2018 and June 2019

Supervisors:

Prof. Dr. P. Iedema

Ms. V. Schamb¨

ock, MSc

Ms. Y. Orlova, MSc

Examiner:

Prof. Dr. B. de Bruin

Research Group:

Van ’t Hoff Institute

for Molecular Sciences

(2)

Abstract

Restoration of old oil paintings has always been a challenging task due to the enor-mous complexity of the materials used by artists. Oil paint mainly consists of binder and pigment, where linseed oil is known to be the most common binder, dating from the 15th century. It polymerizes at room temperature and forms hard films of paint. These films are nothing but complex polymer networks. The molecule of linseed oil consists of three fatty acids that cause the polymerization: oleic, linoleic and linolenic acid. This work focuses on the polymerization of linseed oil by first researching the reduced models of ethyl oleate, ethyl linoleate and ethyl linolenate. Then, the more complex reduced model of triolein is explored and, finally, linseed oil itself. The methodologies that are used for this modeling are: automated reaction network generation to obtain the detailed kinetic model, and random graph theory to further extract the global properties of the polymer network. Examples of global properties are the molecular weight distribution and the gel point.

(3)

Populair wetenschappelijke samenvatting

De restauratie van olieverf schilderijen is door de jaren heen een grote uitdaging. Niet alleen vraagt het restaureren de benodigde vaardigheden van de restaurateur, ook de oliev-erf die gebruikt werd door de grote meesters is erg complex. Olievoliev-erf bestaat voornamelijk uit bindmiddel en pigment, waarbij pigment de verf kleurt en het bindmiddel ervoor zorgt dat verf opdroogt. Het meest voorkomende bindmiddel is lijnzaadolie. Dit is al zo sinds de 15eeeuw. Het drogen van de olieverf is een polymerisatiereactie die verloopt bij kamertem-peratuur. Hierbij ontstaan grote moleculen met een netwerkstructuur. In dit geval ontstaat er een harde laag van gedroogde verf die bestaat uit complexe polymeren.

De moleculen die ontstaan zijn triglyceriden. Deze esters bestaan uit glycerol waaraan drie verschillende vetzuren gebonden zijn: oliezuur, linolzuur en linoleenzuur. De vetzuren veroorzaken de polymerisatie van het gehele molecuul.

Het doel is om een model te maken om de eigenschappen van deze polymeren te voor-spellen om uiteindelijk schilderijen beter te kunnen conserveren. Eerst worden er gere-duceerde modellen onderzocht. Dit zijn de veresterde vetzuren die de polymerisatie van lijnzaadolie zelf veroorzaken. Daarna worden er complexere modellen onderzocht: eerst een versimpelde vorm van lijnzaadolie, maar daarna het molecuul zelf.

Bij het modelleren worden er twee methodes gebruikt. De eerste methode is het gener-eren van een reactienetwerk om het verloop van de concentraties van monomeerstructuren door de tijd heen te verkrijgen. Ten tweede worden complexe netwerken gebruikt om de stofeigenschappen van het polymeer te kunnen voorspellen. Voorbeelden van deze eigen-schappen zijn massaspectra en het precieze moment dat het polymeer gevormd wordt.

(4)

2.2.1 A mono-variate system . . . 15 2.2.2 A tri-variate system . . . 23 2.2.3 A tetra-variate system . . . 28 2.2.4 A penta-variate system . . . 34 3 Application 40 3.1 Ethyl oleate . . . 40 3.1.1 Kinetic model . . . 40 3.1.2 Global properties . . . 42 3.2 Ethyl linoleate . . . 44 3.2.1 Kinetic model . . . 46 3.2.2 Global properties . . . 48 3.3 Ethyl linolenate . . . 51 3.3.1 Kinetic model . . . 53 3.3.2 Global properties . . . 55

3.4 Behaviour of different fatty acids . . . 58

3.4.1 Kinetic model . . . 60 3.4.2 Global properties . . . 60 3.5 Triolein . . . 62 3.5.1 Kinetic model . . . 63 3.5.2 Global properties . . . 65 3.6 Linseed oil . . . 68 3.6.1 Kinetic model . . . 68 3.6.2 Global properties . . . 71 4 Discussion 78 5 Outlook 80 6 Acknowledgements 81

(5)

7 References 82 A Possible nodes described by a tri-variate degree distribution 84 B Partial moments of a trivariate degree distribution 85 C Tri-variate model; weighted average size assuming gel 86 D Tetra-variate model; weighted average size assuming no gel 87 E Tetra-variate model; weighted average size assuming gel 91 F Penta-variate model; weighted average size assuming no gel 99 G Penta-variate model; weighted average size assuming gel 117

H Gel point criterion penta-variate system 167

I Specifications of models used in application 170 J Additional results of the ethyl linoleate model 171 K Symmetric input molecule ethyl linolenate 172 L Additional results of the ethyl linolenate model 173

(6)

1 Introduction

From the very beginning of mankind, painting and creating statues or images have been a part of life. Through time, a lot of famous paintings like the ‘Mona Lisa’ by Leonardo da Vinci, or the ‘Birth of Venus’ by Sandro Botticelli have been created.1, 2 These masterpieces still inspire people all around the world. However, they are fragile and susceptible to chemical reactions with the air, the light, and the dust surrounding them.3, 4 _{As result, the paint undergoes} chemical degradation and the paint layer is damaged. Generally, oil paint consists of two major components: pigments and linseed oil, where the pigments cause the paint to be coloured, while linseed oil forms the binding medium of oil paint. As the paint dries, linseed oil forms polymer networks, which is a polymerization process.

To prevent further degradation from happening and to enable efficient restoration of paint-ings, an in-depth understanding of the formed polymer networks in the paint and the poly-merization process of the binding medium is necessary. In this thesis the free-radical photo-polymerization of linseed oil (Figure 1) is explored, which is a triglyceride with three fatty acids that can be of the following types: saturated acids such as palmetic and stearic acid, oleic acid, linoleic acid and linolenic acid. The fatty acids are attached to glycerol with an ester bond and vary in their natural abundance (Table 1).5 As paint dries, the individual linseed oil molecules in the paint react with oxygen and each other, due to the unsaturations in the fatty acids. These reactions result in crosslinks between individual linseed oil molecules, forming higher order oligomers with time. As a result of this process, the liquid paint starts to dry until one large network-like structure is the result of the crosslinking reactions; the paint has dried completely. The large polymers are by no means unreactive, so they, just as paintings decay, undergo degradation reactions, causing the network to be destroyed again. In this research, the focus lies only on the drying process, so these degradation reactions are not included.

While linseed oil is interesting for art conservation, the field of polymer modelling is mostly used in the industry. Examples of applications for which modelling is used in industry are plastics and rubbers. The advantage of modelling compared to experiments is that different reactions can be studied in great detail. Furthermore, the conditions under which the reactions take place are easily varied to provide a better understanding of the underlying mechanisms.

Fatty acid Chemical formula Unsaturation Abundancy (%)

Palmetic and C16H32O2 10

stearic acid C18H36O2

Oleic acid C18H34O2 mono 22 Linoleic acid C18H32O2 di 16 Linolenic acid C18H30O2 tri 52

Table 1 The fatty acids that are present in linseed oil with their chemical formula, their (un) saturation, and abundances in natural linseed oil.

(7)

Figure 1 Linseed oil, composed of the three most abundant fatty acids: A oleic acid, B linoleic acid, and C linolenic acid.

This makes modelling often faster, cheaper and more sustainable than experiments.6 How-ever, modelling of a polymerization process can be very difficult as a result of the size of the exponentially growing polymer.

There exist a few commonly used modelling methods, each with their advantages and disad-vantages. First of all, there is the use of analytical solutions. Although these are extremely fast and computationally cheap, they are only available for a limited number of chemical systems, such as reaction mechanisms.7, 8 The polymerisation of linseed oil is a far too complicated sys-tem to be modelled with analytical solutions. Secondly, there are the Monte Carlo simulations which can be used to model polymerization, but it is slow.9 _{An advantage of using Monte} Carlo simulations is that it includes many physical processes as diffusion and steric effects dur-ing polymerization, makdur-ing it possible to describe the reality more accurately. .10 _However, because of the stochasticity involved in Monte Carlo simulations, there are many runs necessary to reduce the variations. Furthermore, due to the many intra- and intermolecular interactions included, it is computationally expensive. Finally, the system size of linseed oil is too large for Monte Carlo simulations to be effective. A third commonly used method, population balance equations (PBE), defines equations for every state of the polymer. Consequently, new and more complex equations must be constructed with every change in the polymer. Since these constantly change during the process of polymerization the model suffers from the curse of dimensionality and requires a cut-off.11, 12

Required for all these methods is information about all the reactions occurring in the system. This is another reason that these methods are not suitable to model the polymerization of linseed oil: the reaction mechanism is highly complex and is not fully known. To remedy this complication, a monomer approach is used where PBEs are still used to model. However, rather than the polymer state, the focus lies on the monomer states. These have only a limited dimensionality, so the model does not suffer from it and the cut-off does not result in the loss of significant information.13 In addition to these advantages, the random graph model allows us to derive analytical solutions and some numerical solutions that describe the global properties of the described polymers.

(8)

The aim of this research is to develop a model of linseed oil that includes all the reactions that happen during its polymerization. In addition, we want to predict the topology of the polymer network that is created as a result of these reactions. Examples of these global properties of the polymer network are the size distribution, the gel fraction, and the mass distribution. To acquire these, both the kinetic model, developed by Orlova et al.,14 _{and the random graph} model, developed by Schamb¨ock et al.,13 _{are combined.}

This combination produces a mathematical model that combines the monomer PBE (mPBE) approach based on the automated reaction generation adapted for the polymerization processes with random graph modelling in the form of the configuration model. The polymer is viewed as a set of nodes with different connectivities, depending on a probability distribution defined by the concentration profiles obtained from the mPBEs. Note that in the model, monomers within polymer molecules are represented by nodes in connected components. The chemical bonds formed during polymerization are called crosslinks and can be viewed as edges placed between nodes. With the information regarding the connectivity, a graph or network is con-structed that describes global properties of our system during the polymerization process.15 The graphs is constructed by randomly placing edges between nodes with similar half-edges, where every pair of half-edges has the same probability to connect. This way the placement of edges is independent of both the size of the connected component and the number of edges per node. We model infinite-sized systems and do not allow small, finite-sized systems as cyclization in the system.

The main difference between classical polymer reaction engineering methods and random graph is that it is a monomer approach, rather than a polymer approach. The polymeriza-tion is modelled by keeping track of the local connectivity of the monomeric structures and their concentrations. Then, the molecular species in the system are converted into nodes with connectivities that are based on their chemical nature. Finally, the polymer is constructed for every moment in time by connecting nodes and the global properties are calculated on each modelled time step. To acquire the concentrations through time, mPBEs are solved numerically for all monomer states possible in the system. This approach is previously used by Schamb¨ock et al.13 and Kryven et al.16 on smaller systems, while in this thesis the method is applied to larger systems: the drying process of linseed oil.

To make the mathematical model accurately describe the system, chemical restrictions must be included. As linseed oil reacts in numerous different ways, it makes the chemistry highly complex. Therefore, we need the automated reaction generation to construct a reaction network, which size is entirely dependent on the structure of the monomer and the detail of the considered reaction mechanism. When all possible monomeric species in the system are included in the reaction network, ordinary differential equations (ODEs) are constructed. These describe the evolution of the concentrations of all species present in the system. This type of kinetic modelling provides a realistic and dynamic description of the composition of the system for

(9)

Figure 2 Three molecules with the same reactivity as the fatty acids: ethyl oleate (EO), ethyl linoleate (EL), and ethyl linolenate (ELn). These molecules serve as models for initial computations.

every moment in time.17Finally, the concentrations of molecular species are lumped according to various functional groups and normalized to 1 in order to obtain a probability distribution for the further use in the random graph model.

As the desired probability distribution is derived directly from the chemical system, the con-structed random graph describes the polymer network. Then, using random graph formalism, global properties are extracted from the network, such as the size distribution. In addition, the phase transition that is observed in the drying of linseed oil is also predicted using graph theory. Therefore, the exact moment of the formation of a polymer, or entering the gel regime, is predicted as well.

Although the aim is to apply this methodology on linseed oil (Figure 1), this is too complex for initial computations. Therefore, we first develop the methodology for three molecules that possess the same reactivity as the fatty acids: ethyl oleate, ethyl linoleate, and ethyl linolenate. These individual fatty acids are already complex in their structure and form three chemically different crosslinks. Then, we proceed with a triglyceride were we introduce the monomer fragmentation technique: triolein. This molecule contains a glycerol head and tree oleic acid tails. Compared with linseed oil, triolein has a reduced complexity. Lastly, the methodology is applied to linseed oil itself on the early stages of the drying. For each model we demonstrate the global properties of the polymer network after applying the methodology.

(10)

2 Methodology

To model the polymerization of a molecule accurately with a random graph model, chemically enforced restrictions must be included. To acquire these, a kinetic model is constructed where all reactions and reaction products are modelled through time. As output, the reaction profiles of all possible molecules in the system are transformed into a degree distribution, which contains these chemical restrictions restrictions and functions as input for the random graph. In this section, the methodologies of both the kinetic model and the random graph model are discussed.

2.1 The Kinetic model

To acquire the degree distribution needed for the random graph model, all the reactions that linseed oil undergoes are included in a reaction network. The initial focus lies on the individual fatty acids as these cause the polymerization of linseed oil (Figure 2).

2.1.1 Free-radical polymerization

In general, polymerization consists of three main steps: initiation, propagation, and termina-tion. Here, they are explained in more detail, taking ethyl linoleate (EL) as an example. The whole reaction mechanism is illustrated in Figure 4 and is described in more detail in the fol-lowing sections. Note that reactions included in the models of EL and ELn overlap in most cases. Whenever there is a difference, it is mentioned. For the reaction mechanism of EO, one can check the paper by Porter 1995.18

Initiation The free-radical polymerization is initiated with the release of a radical in the system. The initiator causes the hydrogen abstraction of the bis-allylic hydrogen from EL (Figure 2 and 4). Because of the electron drawing nature of the double bonds, the allylic carbon atoms become more reactive. Especially the bis-allylic carbon in EL is very reactive as a result of the enhanced electro-positivity (Figure 3).18, 19 _{When the molecule has undergone} hydrogen abstraction, a pentadienyl radical is formed.20 Although there is delocalization, the compound is very reactive, making it susceptible to further reactions .

This initiation extends to EO and ELn as well. Since ELn possesses three unsaturations, it contains two bis-allylic carbons and is consequently more reactive. Still, hydrogen abstraction results in only a pentadienyl, since the third unsaturation is not connected to the conjugated system. In EO there is only one unsaturation, meaning that it does not possess a bis-allylic site. This decreases the reactivity of EO significantly.18, 21

Propagation The second main process that makes up polymerization is propagation, where the radical propagates through the system. There are two types of propagation reactions included in the model. In both of these a radical containing compound reacts with a molecule

(11)

Figure 3 A visualization of the allylic and bis-allylic carbon atoms present in ethyl linoleate (EL). The pink areas correspond to allylic carbons, while the green area corresponds to the bis-allylic carbon.

Figure 4 The reaction mechanism of ethyl linoleate (EL). Note that the set of reaction incorporated in the model is iteratively used: there are multiple oxidation steps that are represented by only one reaction rule.

(12)

without radical. As a consequence, the radical propagates from one molecule to another. As first propagation process, there are the reactions between peroxide radical containing compounds and other molecules. Needed for this is the oxidation of the pentadienyl group created as result of the hydrogen abstraction. The resulting compound is highly reactive due to the functional group and reacts with other molecules. When this takes place, the radical is propagated through the system and is not terminated. As second propagation process, there are addition reactions, which occur at the sites of a conjugated system.19, 22Although termination reactions contribute, without addition reactions no polymerization would be possible.

Termination As a final polymerization process, radicals undergo termination in the system. Here, two radical containing compounds react with each other producing a product without radical. One termination process is Russell termination, where the reaction does not result in a bond between the reacting agents: two peroxide radicals result into one ketone, hydroxide group and one oxygen molecule. The key observation is that the polymer does not grow! Rather, the chain reaction is terminated, as no radical is left for propagation.

In addition to Russell termination, there is radical recombination. Here, two radical con-taining compounds react with each other and a new bond is formed between them: a crosslink. Depending on the position of the radical on either compound, a chemically different crosslink is formed. When, the radical is positioned on a peroxide group, recombination results in a peroxide crosslink (Figure 5A). When, the radical is positioned on an alkoxy group, an ether crosslink is formed with recombination (Figure 5B). Finally, when the radical is positioned on a simple carbon atom, an alkyl crosslink is produced (Figure 5C). Since recombination occurs solely around unsaturations, every molecule can form a limited amount of crosslinks. This is called the maximal functionality (F ) of the molecule and is primarily dependent on the amount of double bonds the initial molecule possesses, as well as on the position of these double bonds. The maximal functionalities of EO, EL, and ELn are 1, 3, and 6, respectively.

Other reactions Besides the three main processes of polymerization, there are a lot of other reactions that play a role in the polymerization of EL. First of all, there are the β−scission reactions, where the radical species are unstable and split their chains. Since the radical can be localized at different carbons in EL, there are multiple β−scission reactions that can take place and, thus, a lot of different reaction products that are produced. Similarly to EL, EO and ELn also undergo these scission reactions.

A different type of additional reactions are the decomposition reactions, where unstable functional groups decompose into two pieces. An important decomposition reaction is the de-composition of the peroxide crosslink. Although the crosslink connects two monomeric struc-tures, its configuration and oxidative state make it unstable.23_{This reaction destroys the formed} polymer, but makes it also possible to form ether crosslinks.

(13)

Figure 5 The crosslinks that ethyl linoleate (EL), ethyl linolenate (ELn), and ethyl oleate (EO) can form as a result of radical recombination. From left to right: peroxide, ether and alkyl.

2.1.2 Construction of Reaction Rate Equations

In general, a set of dependent reactions is modelled using ordinary differential equations (ODEs), which are differential equations that contain the derivatives of one or more functions of independent variable(s).24When applied to a reaction, the equations are dependent on the derivatives from the other relevant compounds in the system, and are constructed like Reaction Rate Equations. With this method, the concentrations of all molecular species in the model are assumed to vary continuously in time.25As a consequence, the effect on the instantaneous rate of change is proportional to the product of the concentrations of the reacting species. An example of a set of ODEs is constructed and given below. Here, the same reaction scheme is used as in Figure 7. dA(t) dt = −k1A(t)B(t) (1) dB(t) dt = −k1A(t)B(t) (2) dC(t) dt = k1A(t)B(t) (3) dD(t) dt = k1A(t)B(t) (4)

As the consumption of the concentrations of A and B both depends on the concentration of each other and the rate of the reaction, this is included in the ODE describing them. Similarly, the production of species C and D is dependent on rate k1 and concentrations A and B. After solving the ODEs for a specified amount of time, the output is the concentrations of all species in the system for every moment in time.

An important remark is that in real-life chemical systems, the reaction rates tend to decrease with time. This diffusion aspect is not included in the Reaction Rates Equations. Consequently, the model is only accurate for a perfectly mixed model, where the reaction rates remain con-stant.

(14)

There is, however, a drawback to using the Reaction Rate Equation approach: as the system size increases, the complexity of the equations increase as well. When taking EL as an example, the system is too complicated to construct the ODEs by hand; the reaction products react as substrates as well. Because of this, all monomeric species in the system are checked iteratively to apply the reactions that are included in the system. To make this more efficient, the Automated Reaction Generation (ARG) was used, developed by Orlova et al.14

2.1.3 Automated Reaction Generation

The ARG is an algorithm that employs various tools to automatically generate the complete re-action mechanism.14 _{When applying the ARG algorithm, the following underlying assumption} is made: the polymer and the process of polymerization can be described solely by the con-centrations of individual monomers in the system. This is called the monomer approach. The molecule is implemented as a molecular graph, where nodes are atoms and edges are bonds between atoms (Figure 6). The connectivity of the molecule is defined by the connectivity matrix and the elements in the molecule are stored in the list of labels. Then, all reactions possible with the initial, but also with its reaction products, are implemented as reaction rules. These are transformations applied to a functional group of a reactant to produce the functional group of a product. Whenever molecule M can react according to reaction rule r1, the pro-gram recognizes a pattern in the molecular graph of M and rule r1 is applied. This pattern is the mathematical representation of the reacting functional group in the molecular species. It is the fragment of a molecule that undergoes changes during a reaction. The pattern that is recognized in the molecular graph is replaced by the pattern of the functional group in the product of the reaction. Note that this replacement of patterns can be viewed as the change in functional groups during a chemical reaction.14 _{After the change in molecular graph, the} reaction has taken place and a new molecular species is present in the system. This process is done for all reaction rules that are included in the system and repeated for every molecular species in the system. When no new species are found, the process is completed and we arrive at a list of species with characteristics and patters.

The next step is to construct the Reaction Network (RN) of the entire system of species and reactions. A reaction network is a graph containing two types of nodes: species nodes and reaction nodes. It is a convenient tool to acquire the concentration profiles, from which the connectivity of the molecule follows. A short example can be seen in Figure 7 Here, the molecular species are A, B, C, and D and the reaction proceeds with a rate k1. The reaction network of this reaction consists of five nodes, where four of them are species nodes and one is a reaction node. To represent the relation between reactants and products, nodes A and B, and nodes C and D are connected through the reaction node k1 (Figure 7). Although this representation is not necessary with only one reaction, it becomes much more usefull when the reaction mechanism consists of multiple reaction steps. Then, there is only one species node

(15)

Figure 6 An example of a molecular graph. On the left, the connectivity matrix with list of labels. On the right the molecule the matrix denotes.

per monomeric species present in the system. Even though one monomeric species contributes to n reactions. It makes the system more structured, which is a great advantage with the construction of the ODEs. According to the Reaction Rate Equations method, the ODEs are constructed and we arrive at the profiles of the concentrations of monomers present in the system for every moment in time.

Finally, the construction of a degree distribution from all molecular species and their concen-trations through time is necessary. As link between the kinetic model and the random graph, this forms the input for the random graph model. The degree distribution is a probability distribution that groups the monomeric species in the system according to their connectivi-ties. The degree of a monomer species is the amount of crosslinks it can make. Moreover, the degree distribution is a probability distribution depending on the degree of the monomers in the system. During the construction of this distribution, the kinetic model is essentially structured. Only the relative properties for the random graph modelling are included in the random graph to decrease the complexity. By recognizing the chemical crosslinks in each of the species in their molecular graph, the connectivity of the species is determined. Then, by combining the connectivity with the normalized concentrations through time, the distribution is constructed. This way, the degree distribution contains system-specific, chemically induced restrictions. Note that the normalization step produces a probability distribution and no longer concentrations. Taking EL as an example, the degree distribution is defined as a tri-variate degree distribution: u(i, j, k). In u(i, j, k), the variates i, j, and k are a consequence of the three different chemical crosslinks present. As the process of polymerization is simulated and the concentrations change in time, a degree distribution is constructed for every moment in time.

(16)

Figure 7 An easy example of a reaction network (right) constructed from a general reaction scheme (left).

2.2 Random Graph Modelling

Starting with the degree distribution, the polymer is reconstructed using the configuration model, which is a type of random graph modelling.26 _{Here, a set of L nodes each with a} pre-specified amount of half-edges is randomly linked with each other to form larger structures. By randomly selecting a set of two nodes with half-edges, an edge is placed between them. This process is independent on the distance between the selected nodes.26_{As more nodes are linked} together, larger connected components emerge such as the giant component. This is the largest component of the system and is of the order of the system size.27

By enforcing chemically dependent restrictions to the Random Graph model, a polymeriz-ing system can be modelled accurately.28 _{Taking EL as an example, three different chemical} crosslinks can be formed (Section 2.1.1). As the degree distribution that is extracted from the kinetic model is tri-variate, the configuration model recognizes three different coloured half-edges. When edges are placed, only similarly coloured half-edges are allowed to link together. This way, the coloured edges represent the three crosslinks in the chemical system. Note that the nodes in the random graph are considered as monomers, while the edges are regarded as the crosslinks between monomers. The amount of edges attached to a node as its degree.

As a consequence of being able to model the polymerization process with a random graph approach, random graph formalism is used. This makes it possible to obtain general properties of the formed giant component, as well as the represented polymer. For a range of different systems, the following global properties of the modelled polymer were derived:

• Component size distribution

• Weighted average size of the connected component • Component mass distribution

• Gel point criterion

2.2.1 A mono-variate system

In this section, we introduce the random graph formalism for a system with only one type of crosslink without directionality. In the following sections we will extend the formalism to

(17)

Figure 8 The mono-variate degree distribution with F = 3 results in 4 possible nodes that are present in the system.

several types of (directional) crosslinks, which is necessary to apply the methodology to EO, EL, ELn, triolein and linseed oil.

Although real-life molecular systems are more complex, reducing it to a few basic rules can very well perform the initial exploration. Taking EL as an example, it is known that the system consists of three different crosslinking groups. Furthermore, the maximal functionality F is equal to 3. For the transformation to a simple mono-variate system, we assume that the crosslinks do not differ chemically. This implies that the degree distribution is no longer tri-variate. Rather than u(i, j, k), we have a reduced degree distribution, defined as u(d):

u(d) = d X j=0 d−j X k=0 u(d − j − k, j, k) and d = i + j + k (5) The configuration model for random graphs allows for an arbitrary degree distribution with an arbitrarily large maximum degree or maximal functionality F . However, F is usually limited to relative low degrees in the case of chemical systems. For instance, recall the maximal functionalities of EO, EL, and ELn (1, 3 an 6, respectively).

We will refer to the transformation of a tri-variate system of EL to a mono-variate system as complexity reduction. As a result of the reduced system, the number of possible states of nodes that the degree distribution can describe with a maximum functionality F = 3 is equal to four (Figure 8). We illustrate the complexity reduction on the example of EL. As a result of the reduction of possible nodes, both the complexity of the mathematical description and the computational power needed for the random graph model decrease notably.For applying random graph formalism, we need the generating function that depends on the degree distribution. A generating function is a mathematical tool that enables the manipulation of infinite series.29 Alike to the reduced degree distribution u(d), the generating function is mono-variate and defined as:

U (x) =X i

u(i)xi. (6)

(18)

derived for the mono-variate random graph model.

Component size distribution As the degree distribution gives the probabilities of possible types of nodes at every single point in time, the random graph model reconstructs the connected components possible through time. Let us select a random edge in the system and follow it to one of its end nodes. The reached node is described by the biased degree distribution u1(i), defining the probability that it has i additional edges that are connected to other nodes, apart from the edge that was used to reach the node. The biased degree distribution is given by

u1(i) = 1 µ1

(i + 1)u(i + 1). (7)

The additional multiplication term (i + 1) expresses the increased probability to reach nodes with more edges. The value µ1is a partial moment that is dependent on the degree distribution. The definition of the partial moments of the degree distribution is given in Equation 8.

µl= " x∂ ∂x l U (x) # |x=1 (8)

From the definition of the biased degree distribution and the definition of the generating function of the true degree distribution, Equation 6, the biased generating function can be constructed:

U1(x) = 1 µ1

d

dxU (x), with U1(1) = 1. (9) To calculate the weighted size distribution of the connected component w(s), its corresponding generating function W (x) is used. According to Schamb¨ock et al., the generating function of the size distribution is defined as:13

W1(x) = x U1 W1(x) (10) W (x) = x UW1(x). (11)

As there is no analytical expression known for W (x), a numerical solution is obtained by solving Equation 10 iteratively and then evaluating Equation 11. From this solution, the weighted component size distribution for the connected component is calculated with the use of the Cauchy integral:

(19)

w(s) = 1 2πi

I _{W (x)}

xs+1dx (12)

Here, the largest possible contour was used: |x| = 1, with x ∈ C. Note that w(s) gives the probability of a random node to be part of a component of size s.

In addition to the weighted size distribution w(s), there exists a number size distribution n(s) that also describes the system. This distribution provides the probability of having a component of size s. Note that this regards the connected components as a whole, while w(s) describes the components according to the nodes it consists. Consequently, the probabilities provided by w(s) are weighted according to size s, in comparison with n(s):

n(s) = w(s) s 1 P s w(s) s (13) w(s) = sn(s)P 1 sn(s) . (14)

When the system is defined correctly, the initial component size distribution gives a prob-ability of p = 1 at a component size s = 1, as the system contains only monomers with functionality of i = 0.

Weighted average size As the number of edges increases through time, the nodes form large connected components of various sizes. The weight average size hsi_wof these components is calculated in two distinct ways: (I) using the component size distribution of the connected component, and (II) using a derived formula depending on the partial moments of the degree distribution.

hsi_w using the size distribution The first method is to use the weighted component size distribution w(s). Recall that for each moment in time, the probabilities for all selected sizes are given. To calculate the accurate hsi_w, all the probabilities are first normalized to 1 for each moment in time, Equation 15.

wnorm(s) = _P0n−20w(s) s w(s0)

, ∀ s ≤ n − 20. (15)

(20)

compo-nent size s, the hsi_wfor that moment in time is given by: hsi_w=X s w(s)s = P ss 2_n(s) P s0s0n(s0) . (16)

hsi_wis a weighted average as a result of the definition of w(s); it gives the probability for every node to be part of a component of size s. The probability of being part of a large component size is relatively high because of the amount of nodes of which the component consists. The number average is computed in a similar fashion:

hsi_n=X s

n(s)s. (17)

Although both the number and weighted average produce different outcomes, the trend is similar: they both increase in value when a giant component is formed as there is a probability of encountering large component sizes in the system. As for the weighted average, it even approaches infinity at this gel point.

Finally, there are some drawbacks to computing hsi_w using the w(s). For a start, the calculation can take a long time with larger and more complicated systems. In addition to this, the weighted size distribution is only computed up to a pre-specified maximum component size. Hence, it is truncated. This introduces errors in the average size, since big components are not included in the average. Especially around the gel point, these errors are substantial.

hsi_w using a formula To counter-act these issues, a quicker method is developed where a formula is derived, depending on the partial moments of the degree distribution. According to Schamb¨ock et al. and Kryven, the weighted average size hsi_wcan be derived from the following general formula:13 , 16

hsi_w= W

0_(x)|x=1

W (1) . (18)

Using the definition of W (x) given in Equation 11, we arrive at an expression depending on the generating functions of both the degree distribution and the size distribution.

W (1) hsi_w= UW1(x)|x=1+ ∂UW1(x) ∂W1(x) dW1(x) dx |x=1 (19)

(21)

W (1) = 1, while in the gel regime W (1) < 1. Hence, the gel fraction is given by gf = 1−W (1).16 For further derivation, we distinguish two methods, both based on one assumption: whether or not a giant component is present in the system.

When no giant component is present in the system, the gel point is not reached yet (t < tg). Consequently, no giant component is present in the system (gf = 0), indicating that the normalization factor in Equation 19 equals to 1. The formula computing the weighted average size hsi_w of a mono-variate system, assuming t < tg, is defined in Equation 20.

hsi_w= 1 − µ 2 1

µ2− 2µ1 (20)

Here, µ1 and µ2 are partial moments of the degree distribution with different values through time. When the gel, or giant component, emerges in the system, hsi_w diverges to infinity, due to the infinitely sized giant component. Since this formula is based on the assumption that no gel has formed, the values are no longer accurate when a gel is formed.

To remedy this, another formula is derived for systems in the gel regime. Here, the assump-tion W (1) = 1 does no longer hold. Consequently, the formula is not entirely independent of the generating functions describing the system at each moment in time: also of the amount of gel gf in the system. The expression for the weighted average size of the connected component assuming t > tg is given in Equation 21.

hsi_w= 1 + µ1(1 − gf) 2

(1 − gf)(1 − U₁0(r0), where r0= W1(1) (21) In this expression, the term U₁0(r0) is the derivative of the generating function of the weak connected component U1(x), evaluated for x = r0. Furthermore, the weighted average size is dependent on the partial moments of the degree distribution, Equation 8. Due to the fact that the formula is directly dependent on the gel fraction, this formula is accurate for every moment in time.

Important to note is that both of the formulas describe only the sol part in the system, which are the finite sized components. Hence, when a giant component is present in the system, only the non-gel nodes are included in the weighted average size computation. This is a consequence of the defined infinitely sized giant component.13

Another key point is that the chemical structure of the monomer represented by the nodes does not matter; each component is a collection of connected nodes of size 1. Since no in-dividual properties of the nodes or monomers are taken into account, no comparisons with experimental data can be made. Nevertheless, the weighted average size does provide insight into the polymerization process.

(22)

Element Molar mass (g/mol)

C 12.000000

O 15.994915

H 1.007825

Table 2 The used molar masses for calculating the total molecular weight of a monomer species.

Component mass distribution Although both the size distribution and the weighted av-erage size of the system are able to predict the behaviour of the system, no comparison with experimental data is made. One of the few experimental measurements that is applied to poly-mers and the process of polymerization is mass spectroscopy. This is an analytical technique that sorts ionized chemical species and loose parts of them into a spectrum based on their mass-to-charge ratio.30

The weighted component mass distribution wM(m) is similarly computed as the weighted component size distribution. The key difference is that monomer specific characteristics are included in the computation: the mass of each monomer. This makes the computation a lot larger and more time consuming. The described nodes are no longer equal to each other; they differ in mass and and therefore need to be distinguished from each other in the degree distribution.

Initially, the computation begins by calculating the molar masses of all the monomeric species present in the system. These are calculated from known molar masses of chemical elements: carbon (C), oxygen (O), and hydrogen (H), which are given in Table 2.

When the molar masses of all monomer species are calculated, the degree distribution is constructed. Again, the chemical base of the species is used to determine the connectivity. Along with the concentrations through time, the probabilities of occurring in the system is computed. The unique masses are included as well. Therefore, the degree distribution is no longer only dependent on the amount of crosslinks i, but also on a mass vector α(m): u(i, α(m)). The reason for the use of the mass vector α(m) rather than the masses m is that the masses do not necessarily have a value near each other. To reduce the size of the degree distribution, we map the occurring masses to the indices of the mass vector. An example is given here: mass vector α(m) [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ]

mass m [ 39 54 55 56 67 68 70 71 72 73 82 83 84 85 87 ]

Note that in the degree distribution u(i, α(m)) the indices follow hierarchically and evenly spaced. When the true masses are necessary, they are extracted from the list of masses where the index in α(m) corresponds with the indicated value in the list of masses.

Now the mass dependent degree distribution is correctly defined, the derivation of wM(m) or the weighted mass distribution can begin. The derivation leading up to wM(m) follows

(23)

the same idea as the derivation to the weighted component size distribution w(s). However, in contrast to the size distribution derivation, the mass distribution requires several degree distributions: for each unique mass in the system. Below, this idea is visualized for time t.

u[α(m)](i) =    u(i, α(m)) mass = α(m) 0 mass 6= α(m) (22)

Note that for a system with n different unique masses, the computation of wM(m) involves n different degree distributions. A generating function is constructed for each degree distribution:

U[α(m)](x) =X i

xiu[α(m)](i). (23)

Then, similarly to the computation of the weighted size distribution, we define also a biased de-gree distribution U₁[α(m)]. Even though the probability of the node being linked to another node with m 6= 0 is 1, the degree distribution becomes biased, similarly as in the size distribution, Equation 7. Using the generating functions of the biased degree distributions, the generating function of the distribution of the biased weak component is constructed.13

WM,1(x) =X m

xmU₁[α(m)]WM,1(x) (24)

Equation 24 explores the mass of the connected component by the multiplication of the masses of every single node: the factor xm. When computing the size of the connected component was present as well. However, as the size of each node equals to 1, this factor becomes x1_{. To} obtain the generating function WM(x), Equation 24 is numerically solved and substituted into:

WM(x) =X m

xmU[α(m)]WM,1(x). (25)

Again, the weighted component mass distribution for the connected component can be calcu-lated with the use of the Cauchy integral, Equation 12, where wM(m) is acquired, rather than w(s). Necessary for comparison with experiments is the number mass distribution nM(m). This is acquired by unweighting and normalizing wM(m) to 1.

Gel point criterion Around the phase transition from the sol regime, or the soluble part of the system, to the gel regime, the weighted average diverges to infinity. This phenomenon is accounted for by the emergence of the giant component: the weighted average formula becomes

(24)

singular at gel point tg.31 Taking the formula for the weighted average size into consideration, Equation 20, hsi_w becomes singular when the denominator of the fraction equals to 0. Thus, the criterion for the phase transition to gel for the mono-variate degree distribution becomes:

µ2− 2µ1= 0. (26)

Here, both µ1and µ2are partial moments, depending solely on the degree distribution, Equation 8.

2.2.2 A tri-variate system

Now the basis for each global property has been explained regarding a simple mono-variate system, it is extended to more complicated systems. First, a tri-variate system is considered where one molecule is able to crosslink with three distinct types of crosslinks. This particular model is used to simulate the individual fatty acids of linseed oil: ethyl oleate (EO), ethyl linoleate (EL), and ethyl linolenate (ELn). Each of these three are able to crosslink with other molecules by either a peroxide group, an ether group, or an alkyl group (Section 2.1.1). This defining property of the chemical system is translated into the random graph model by recognizing three different coloured half-edges. Only similarly coloured half-edges are connected during the placement of edges in the system.15_{As a result of this distinction of crosslinks, the} degree distribution is adapted appropriately: u(i, j, k). Note that the total degree d consists of three contributing factors::

d = i + j + k. (27)

Due to chemical restrictions, the maximum functionality still has a value of 3. This value is molecule specific and differs when the model is applied to a different polymerization process. The possible configurations of the nodes are visualized in Appendix A.

Following from the degree distribution, the corresponding generating function is constructed, Equation 28.

U (x, y, z) =X i,j,k

u(i, j, k)xiyjzk (28)

As the total probability at any given time is equal to 1, both the generating function and the degree distribution satisfy the following statement.

(25)

U (1, 1, 1) =X i,j,k

u(i, j, k) = 1 (29)

For the derivation of global properties, the partial moments of the degree distribution are evaluated at the point (x, y, z) = (1, 1, 1) and defined as:

µlmn= " x∂ ∂x l y ∂ ∂y m z ∂ ∂z n U (x, y, z) # |x=y=z=1. (30)

Since the tri-variate system is more complex than the mono-variate system, more partial mo-ments can be defined. The full range of these momo-ments for the is given in Appendix B. Component size distribution In a similar fashion as with the mono-variate degree distri-bution, the weighted component size distribution is computed. Because there are now three different undirected edges present in the system, there are three different biased degree distri-butions at time t : u1(i, j, k) = 1 µ100 (i + 1)u(i + 1, j, k) (31) u2(i, j, k) = 1 µ010 (j + 1)u(i, j + 1, k) (32) u3(i, j, k) = 1 µ001(k + 1)u(i, j, k + 1). (33)

Note that there are different first moments in each biased degree distribution (µ100, µ010, µ001). These are the three first moments of each of the variates, recall Equation 30. The three biased degree distributions lead to three biased generating functions from which the size distribution is computed: U1(x, y, z) = 1 µ100 ∂ ∂xU (x, y, z), with U1(1, 1, 1) = 1 (34) U2(x, y, z) = 1 µ010 ∂ ∂yU (x, y, z), with U2(1, 1, 1) = 1 (35) U3(x, y, z) = 1 µ001 ∂ ∂zU (x, y, z), with U3(1, 1, 1) = 1. (36)

(26)

with the generating functions of the biased weak components that are reached by following a randomly picked edge to its end.13

W1(x) = xU1W1(x), W2(x), W3(x) (37) W2(x) = xU2W1(x), W2(x), W3(x) (38) W3(x) = xU3W1(x), W2(x), W3(x) (39) W (x) = xUW1(x), W2(x), W3(x) (40)

With the use of the Cauchy integral, Equation 12, the weighted component size distribution w(s) is acquired. As mentioned in Section 2.2.1, the number component size distribution n(s) is gained by unweighting w(s), followed by normalization to 1.

Weighted average size There are two general ways to compute the weighted average size hsi_w of the small polymers that are not part of the gel. In the first case, the weighted size distribution w(s) is used and hsi_w is obtained from Equation 16. The other method is with the derivation of a formula for hsi_w by assuming the existence of gel (t < tg) or not (t > tg) The definition of hsi_w is given by Equation 18. Using the definition for W (x) defined in the previous section and applying both the product and the chain rule, the following expression is the intermediate product:

W (1) hsi_w= UW1(x), W2(x), W3(x)|x=1+ ∂UW1(x), W2(x), W3(x) ∂W1(x) dW1(x) dx |x=1 + ∂UW1(x), W2(x), W3(x) ∂W2(x) dW2(x) dx |x=1+ ∂UW1(x), W2(x), W3(x) ∂W3(x) dW3(x) dx |x=1. (41) The values for hsi_w and W(1) are solely dependent on the degree distribution, indicating that they are computed for every point in time. For systems that do not reach the gel, Equation 30 is used to calculate the moments of the degree distribution. As there is no gel, the term W (1) = 1 and is ignored for the rest of the derivation.16 _{Then, after the substitution of the} definitions of moments into the terms in Equation 41, the following expression gives hsi_w.

(27)

hsi_w= 1 +A + B + C + D + F G + H + I + J , with (42) A = µ100µ010(4µ001(µ100+ µ010+ µ001) − 2µ002(µ100+ µ010) + 4µ001µ110+ 4µ001µ101− 2µ110µ002+ 2µ101µ011+ 4µ001µ011) B = µ2₁₀₀(µ020µ002− µ2 011) C = µ100µ001(−2µ020(µ100+ µ001) + 2µ110µ011− 2µ101µ020) D = µ2₀₁₀(µ200µ002− µ2101) + µ 2 001(µ200µ020− µ2110) F = µ010µ001(−2µ200(µ010+ µ001) + 2µ110µ101− 2µ011µ200) G = µ100(8µ010µ001− 4µ010µ002− 4µ001µ020− 2µ2 011+ 2µ020µ002) H = µ010(−4µ001µ200− 2µ2 101+ 2µ200µ002) I = µ001(−2µ2110+ 2µ200µ020) J = µ110(µ110µ002− 2µ101µ011) + µ2101µ020+ µ2011µ200− µ200µ020µ002

Note that this formula is only valid if no gel is present. When a gel is formed, the output of this formula cannot be trusted. To remedy this, another formula was derived where there is assumed to be gel in the system. The following expression is valid both in the sol regime and the gel regime:

hsi_w= U (r1, r2, r3) W (r1, r2, r3)− 1 W (r1, r2, r3) ∂U (r1, r2, r3) ∂x x1 D − 1 W (r1, r2, r3) ∂U (r1, r2, r3) ∂y y1 D − 1 W (r1, r2, r3) ∂U (r1, r2, r3) ∂z z1 D. (43)

The values for x1, y1, z1, and D are given in Appendix C. In previous equations, the functions were evaluated to x = 1. This is no longer valid as the generating functions do not describe the entire system, anymore; only the soluble part consisting finite sized components. Now, the generating functions are evaluated to r1, r2, and r3.

r1= W1(1) (44)

r2= W2(1) (45)

(28)

Although the outcome of Equation 43, is true for all moments in time, the calculation takes longer than for the first formula, Equation 42.

Component mass distribution For the computation of the weighted component mass dis-tribution, monomer specific characteristics are included in the degree distribution. This effec-tively causes the degree distribution to be dependent on the molar masses of the molecules: u(i, j, k, α(m)). In here, the variates denote the peroxide crosslinks i, the ether crosslinks j, and the alkyl crosslinks k, and the mass of each monomer m. To reduce the matrix size of u, a mass vector α(m) is used where the indices correspond to molar masses. This idea is explained more elaborately in the mono-variate section (Section 2.2.1). Recall that the monomers are no longer accurately described by only one degree distribution. For each unique mass in the system, there is a different degree distribution:

u[α(m)](i, j, k) =    u(i, j, k, α(m)) mass = α(m) 0 mass 6= α(m). (47)

As a consequence, for a system with n unique masses, there exist n degree distributions and n generating functions for time t. An example of the generating function for the degree distribu-tion of monomers with mass mapping(a) is given here:

U[α(m)](x, y, z) =X i,j,k

xiyjzku[α(m)](i, j, k). (48)

To calculate wM(m), or the weighted mass distribution, a random edge is selected and followed to one of the two linked nodes. As the degree distribution becomes biased due to this operation, multiple generating functions of the biased degree distributions are defined. In addition, the generating functions of the distributions of the biased weak components are constructed.

WM,1(x) =X m xmU₁[α(m)]WM,1(x), WM,2(x), WM,3(x) (49) WM,2(x) =X m xmU₂[α(m)]WM,1(x), WM,2(x), WM,3(x) (50) WM,3(x) = X m xmU₃[α(m)]WM,1(x), WM,2(x), WM,3(x) (51)

These generating functions are used to numerically solve the generating functions of the weighted mass distribution itself:

(29)

WM(x) =X m

xmU[α(m)]WM,1(x), WM,2(x), WM,3(x) (52)

To gain the weighted component mass distribution for the connected component, the Cauchy integral is used, Equation 12. For comparison with experimental mass spectra, the number mass distribution nM(m) is necessary. To convert wM(m) to n(m), the distribution is unweighted by consecutively dividing with the mass and by normalizing it to 1. This operation explained in more detail in Section 2.2.1.

Gel point criterion As nodes cluster together and form a giant component, or gel, the weighted average size formula diverges to infinity, Equation 42. This singularity occurs when the denominator of the fraction is equal to 0. Thus, the criterion for the phase transition from sol to gel for the tri-variate degree distribution equals to:

2µ100(4µ010µ001− 2µ010µ001− 2µ001µ020− µ2 011+ µ020µ002) + 2µ010(−2µ001µ200− µ 2 101 + µ200µ002) + 2µ001(−µ2₁₁₀+ µ200µ020) + µ110(µ110µ002− 2µ101µ011) + µ2 101µ020 + µ2011µ020− µ200µ020µ002= 0. (53)

Note that the partial moments used in this gel point criterion are dependent on the degree distribution u(i, j, k), and are defined by the formula given in Equation 30.

2.2.3 A tetra-variate system

As second more elaborate system, a tetra-variate system is considered where one molecule can form 3 crosslinks, but where one bond is regarded as a directional crosslink. This directionality occurs when a chemical reaction employs two distinct reactants (e.g. radical and functional group, or two distinct types of functional groups). For visualization, consider the asymmetric formation of a peroxide crosslink in EL (Figure 9a). As there are two distinct substrates, the reaction is considered asymmetric. To model this appropriately, the asymmetry of the crosslink is included in the random graph. Therefore, the crosslink between the two monomers is represented as a directional edge. Therefore, the crosslink between the two monomers is represented as a directional edge (Figure 9b). When the formation of the peroxide crosslink is considered an asymmetric edge, the tetra-variate random graph model is applicable as well.

In order to include directed edge in the configuration model, the edge is treated as though it is built up from two different half-edges: the in-edge i, and the out-edge j (Figure 9b). When

(30)

Figure 9 The visualization of the asymmetric peroxide crosslink formation, both from a chemical point of view (a) and in network theory (b).

edges are placed, the edges only connect to out-edges and vice versa. As a consequence, in-cluding a directional involves two variates in the degree distribution: u(i, j, k, l). This increases the complexity of the system and makes the simulation more computationally demanding. The generating function for any particular moment in time that is used further in the random graph formalism is given in Equation 54.

U (x, y, z, v) = X i,j,k,l

u(i, j, k, l)xiyjzkvl (54)

Again, both the degree distribution and its corresponding generating function are normalized to 1. Considering that the degree distribution still describes the individual fatty acids EO, EL and ELn, the maximal functionality F equals to 3. As a consequence, 70 different configurations of connected nodes are allowed in the system.

Component size distribution Following the same procedure as in the mono- and tri-variate random graph model, four biased degree distributions are defined at time t :

uin(i, j, k, l) = 1 µ1000(i + 1)u(i + 1, j, k, l) (55) uout(i, j, k, l) = 1 µ0100 (j + 1)u(i, j + 1, k, l) (56) u1(i, j, k, l) = 1 µ0010 (k + 1)u(i, j, k + 1, l) (57) u2(i, j, k, l) = 1 µ0001 (l + 1)u(i, j, k, l + 1). (58)

The partial moments of the degree distribution are calculated according to the following defi-nition. µlmno= " x ∂ ∂x l y ∂ ∂y m z ∂ ∂z n v ∂ ∂v o U (x, y, z, v) # |x=y=z=v=1 (59)

(31)

As random graph formalism makes use of the generating functions of the biased degree distri-butions, these are given in Equations 60, 61, 62, and 63.

Uin(x, y, z, v) = 1 µ1000 ∂ ∂xU (x, y, z, v) (60) Uout(x, y, z, v) = 1 µ0100 ∂ ∂yU (x, y, z, v) (61) U1(x, y, z, v) = 1 µ0010 ∂ ∂zU (x, y, z, v) (62) U2(x, y, z, v) = 1 µ0001 ∂ ∂vU (x, y, z, v) (63)

The weighted size distribution w(s) is acquired with the use of the generating function of w(s) and the generating functions of the biased weak components.

Win(x) = xUinWout(x), Win(x), W1(x), W2(x) (64) Wout(x) = xUoutWout(x), Win(x), W1(x), W2(x) (65)

W1(x) = xU1 Wout(x), Win(x), W1(x), W2(x) (66) W2(x) = xU2 Wout(x), Win(x), W1(x), W2(x) (67) W (x) = xUWout(x), Win(x), W1(x), W2(x) (68)

Note that the terms Wout(x) and Win(x) are flipped in the definitions of the generating func-tions. The reason for this is that in-edges i only bind to out-edges j. Edges between two in-edges or two out-edges are not allowed. When flipping these terms in the generating func-tion, this distinction is incorporated in the configuration model. As a result, the first moments of both the in-edges and the out-edges are by definition equal to each other:13

µ1000= µ0100. (69)

Since these partial moments are equal, they are given the value of µ for the rest of the derivations to simplify the notation. After applying the Cauchy integral in Equation 12, the weighted com-ponent size distribution w(s) is unweighted to gain the number distribution of the comcom-ponent size n(s).

(32)

Weighted average size For calculating the weighted average size hsi_w of the tetra-variate system, the weighted size distribution w(s) is used according to the method discussed in detail in the mono-variate random graph model. In addition, two formula’s are derived that compute hsi_wfrom the partial moments of the degree distribution and the gel fraction. By starting with Equation 18 and using the definition of W (x) for the tetra-variate system, this expression is the intermediate product of the derivation:

W (1) hsi_w= UWout(x), Win(x), W1(x), W2(x) |x=1 + ∂UWout(x), Win(x), W1(x), W2(x) ∂Wout(x) dWout(x) dx |x=1 +

∂UWout(x), Win(x), W1(x), W2(x) ∂Win(x)

dWin(x) dx |x=1 +

∂UWout(x), Win(x), W1(x), W2(x) ∂W1(x) dW1(x) dx |x=1 + ∂UWout(x), Win(x), W1(x), W2(x) ∂W2(x) dW2(x) dx |x=1. (70)

When assuming there is no gel formed in the system, the terms in the intermediate formula substituted the definitions of different moments. Furthermore, the expressions of the generating functions of the biased weak components are entered as well. Moreover, the value W (1) is equal to 1 when no gel is present in the system. After substitution and rearranging, an expression is of obtained of following shape.

hsi_w= 1 − µx1 x2 − µ y1 y2 − µ0010 z1 z2 − µ0001 v1 v2 (71)

The values of x1, x2, y1, y2, z1, z2, v1, and v2 are dependent on all first and second partial moments of U (x, y, z, v) and are given in Appendix D. Recall that when gel has formed in the system, this formula is no longer valid. This next formula is valid for all moments in time, as there is assumed to be gel in the system, t > tg.

(33)

hsi_w= U (rout, rin, r1, r2) − 1 W (rout, rin, r1, r2) ∂U (rout, rin, r1, r2) ∂x x1+ x2 x3+ x4 +∂U (rout, rin, r1, r2)

∂y

y1+ y2 x3+ x4

−∂U (rout, rin, r1, r2) ∂z

z1+ z2 z3+ z4 +∂U (rout, rin, r1, r2)

∂v

v1+ v2 z3+ z4

(72)

Again, the extensive expressions of x1, x2, x3, x4, y1,y2, z1, z2, z3, z4, v1, and v2 are given in Appendix E. Furthermore, Equation 72 is evaluated at rout, rin, r1, and r2:

rout= Wout(1) (73)

rin= Win(1) (74)

r1= W1(1) (75)

r2= W2(1). (76)

Component mass distribution To formulate the tetra variate system so that it depends on the molar masses of the represented monomers, the degree distribution is extended with a fifth variate: the mass.

u(i, j, k, l, α(m)) (77)

With the use of a mass vector, the unique masses that are present in the system are included in the degree distribution. The details on the mapping are explained more elaborate in the mono-variate system. When the masses are necessary for the computation, they are extracted from the list with unique masses.

As a consequence of the inclusion of masses in the degree distribution, it is split into n sub-degree distributions describing only a subset of the nodes present in the system.

u[α(m)](i, j, k, l) =    u(i, j, k, l, α(m)) mass = m 0 mass 6= m (78)

For each of these sub-degree distributions, a generating functions exists. Below, the generating function for time t is defined.

(34)

U[α(m)]x, y, z, v= X i,j,k,l

xiyjzkvlu[α(m)](i, j, k, l) (79)

For calculating wM(m), the generating functions of the biased weak components need to be constructed. In previous sections, this idea is more elaborately explained.

WM,in(x) =X m xmU_in[α(m)]WM,out(x), WM,in(x), WM,1(x), WM,2(x) (80) WM,out(x) =X m xmUout[α(m)] WM,out(x), WM,in(x), WM,1(x), WM,2(x) (81) WM,1(x) = X m xmU₁[α(m)]WM,out(x), WM,in(x), WM,1(x), WM,2(x) (82) WM,2(x) = X m xmU₂[α(m)]WM,out(x), WM,in(x), WM,1(x), WM,2(x) (83) WM(x) =X m xmU[α(m)]WM,out(x), WM,in(x), WM,1(x), WM,2(x) (84)

To acquire wM(m), Equation 84 is solved numerically with the generating functions of the biased weak components and using the Cauchy integral, Equation 80 – 83 and 12. For comparison with experimental data, the number mass distribution nM(m) is needed, which is acquired according to the method described in Section 2.2.1.

Gel point criterion When a gel has formed in the system, the weighted average size hsi_w diverges to infinity, as a result of the denominator being equal to 0. Consequently, the gel point criterion is this equality:

µ0002(µ2₁₁₀₀(−2µ0010+ µ0020) + µ(2µ0010(2µ1100− µ2000) − (µ1010− µ0110)2₎ + µ0110(−2µ1100µ1010+ µ0110µ2000) + µ0200(2µ0010(−µ + µ2000) + µ2₁₀₁₀) + µ0020(µ(−2µ1100+ µ2000+ µ0200) − µ2000µ0200)) + µ0020(µ1001(µ1001(−µ + µ0200) + 2µ0101(µ − µ1100)) + µ0001(2µ(2µ1100− µ2000− µ0200) − 2µ21100+ 2µ2000µ0200) + µ2₀₁₀₁(−µ + µ2000)) + µ20011(µ(2µ1100− µ0200− µ2000) + µ2000µ0200− µ21100) + µ2₁₀₀₁(2µ0010(−µ0200+ µ) − µ2₀₁₁₀) + µ2₁₀₁₀(2µ0001(−µ0200+ µ) − µ2₀₁₀₁) + 2(µ0010µ2₀₁₀₁+ µ0001µ2₀₁₁₀)(−µ2000+ µ) + 2µ(2µ0010(µ0001(−2µ1100+ µ2000+ µ0200) − µ1001µ0101) + µ0011(µ1001− µ0101)(µ1010− µ0110)) + 2µ0110(µ0001(µ1100µ1010 − 2µµ1010) + µ1001(µ1010µ0101+ µ1100µ0011)) + 4µ0010(µ0001(µ21100− µ2000µ0200)

(35)

+ µ1100µ1001µ0101) + µ0011(2µ0101(−µ0110µ2000+ µ1100µ1010) − 2µ1010µ1001µ0200) = 0. (85) Note that the partial moments of the degree distribution that are used in this gel point criterion are given by Equation 59.

2.2.4 A penta-variate system

As final system, a penta-variate system is considered which describes a system with 4 types of edges: 3 undirected edges and 1 directed edge. The involvement of the directed edge results in two separate indices in the degree distribution: u(i, j, k, l, n), namely i and j. The corresponding generation function becomes then:

U (x, y, z, v, w) = X i,j,k,l,n

u(i, j, k, l, n)xiyjzkvlwn (86)

Both u(i, j, k, l, n) and U (x, y, z, v, w) are normalized to 1 for every moment in time. An example of the application for the penta-variate system is the molecule of triolein and linseed oil, where the directed edge represents the ester bond between the glycerol head and the fatty acid tail. Although this is no crosslink in the true molecule, we assume it to be. This, as we apply a fragmentation approach to the triglyceride. Using this fragmentation, the kinetic model is reduced in complexity. However, we do need to reassemble the molecule, so we assume there are crosslinks between the glycerol and the fatty acid. The three undirected edges represent the crosslinks the tails themselves can form (k, l, and n). Considering triolein and linseed oil, the maximal functionality of the two separate types of nodes is:

Head-nodes F = 3 Tail-nodes triolein F = 1 Tail-nodes linseed oil F = 6

Component size distribution Using the generating fuction of the penta-variate degree distribution, the weighted size distribution w(s). First, five biased degree distributions at time t are introduced to the system.

(36)

uin(i, j, k, l, n) = 1 µ(i + 1)u(i + 1, j, k, l, n) (87) uout(i, j, k, l, n) = 1 µ(j + 1)u(i, j + 1, k, l, n) (88) u1(i, j, k, l, n) = 1 µ00100(k + 1)u(i, j, k + 1, l, n) (89) u2(i, j, k, l, n) = 1 µ00010 (l + 1)u(i, j, k, l + 1, n) (90) u3(i, j, k, l, n) = 1 µ00001 (m + 1)u(i, j, k, l, n + 1) (91)

Where µ, µ00100, µ00010, and µ00001 are the first partial moments of the generating function of the unbiased degree distribution. Note that as the directed edge in the system is indicated by two variates, their first moments are equal to each other. For simplification, these are written as µ. The general definition of the moments is as follows:

µlmnop= " x∂ ∂x l y ∂ ∂y m z ∂ ∂z n v ∂ ∂v o w ∂ ∂w p U (x, y, z, v, w) # |x=y=z=v=w=1. (92) By using the partial moments, the generating functions of the biased degree distributions is defined as: Uin(x, y, z, v, w) = 1 µ ∂ ∂xU (x, y, z, v, w) (93) Uout(x, y, z, v, w) = 1 µ ∂ ∂yU (x, y, z, v, w) (94) U1(x, y, z, v, w) = 1 µ00100 ∂ ∂zU (x, y, z, v, w) (95) U2(x, y, z, v, w) = 1 µ00010 ∂ ∂vU (x, y, z, v, w) (96) U3(x, y, z, v, w) = 1 µ00001 ∂ ∂wU (x, y, z, v, w). (97)

To acquire w(s), the definitions for the generating functions for w(s) and of the generating functions of the biased weak components are used.

(37)

Win(x) = xUinWout(x), Win(x), W1(x), W2(x), W3(x) (98) Wout(x) = xUout Wout(x), Win(x), W1(x), W2(x), W3(x) (99) W1(x) = xU1 Wout(x), Win(x), W1(x), W2(x), W3(x) (100) W2(x) = xU2Wout(x), Win(x), W1(x), W2(x), W3(x) (101) W3(x) = xU3Wout(x), Win(x), W1(x), W2(x), W3(x) (102) W (x) = xUWout(x), Win(x), W1(x), W2(x), W3(x) (103)

Using the Cauchy integral, the generation function W (x) is numerically solved, acquiring in the weighted size distribution w(s). Note that the entries for the in- and out-edge are flipped in the definitions of the generating functions. These signify the fact that in-edges are solely connected to out-edges, and vice versa.

Weighted average size The weighted average hsi_wis computed using two methods. Firstly, using the weighted size distribution w(s), hsi_wis computed by multiplication of the probability and the size for all sizes in the system, Equation 16. Furthermore, using Equation 18, the following ‘intermediate’ formula giving the weighted average size is gained.

W (1) hsi_w= UWout(x), Win(x), W1(x), W2(x), W3(x) +

∂UWout(x), Win(x), W1(x), W2(x), W3(x) ∂Wout(x) dWout(x) dx + ∂UWout(x), Win(x), W1(x), W2(x), W3(x) ∂Win(x) dWin(x) dx +

∂UWout(x), Win(x), W1(x), W2(x), W3(x) ∂W1(x)

dW1(x) dx +

dW2(x) dx +

dW3(x)

dx (104)

Here, the definitions of the generating functions of w(s), and the weak connected components for the penta-variate systems were used, Equations 98 – 102. From this ‘intermediate’ expression,

(38)

two different expressions giving hsi_ware derived. The main difference between the two formulas is the underlying assumption that defines the value of W (1) and the evaluation of the generating functions. The equation stating the value of W (1) is given below:

W (1) = 1 − gf. (105)

In the first formula, we assume that no gel is present in the system, causing the value of W (1) to equal 1. After substitution of the definitions of the partial moments of the degree distribution, Equation 92, into Equation 104, the following expression is the result:

hsi_w= 1 + µ x1+ x2+ x3+ x4+ x5 x6+ x7+ x8+ x9+ x10 + µ −y1 y2 − y3(y4+ y5+ y6+ y7) y9(y10+ y11+ y12+ y13) + µ00100 −z1 z2 − z3(z4+ z5+ z6+ z7+ z8) z9(z10+ z11+ z12+ z13+ z14+ z15) + µ00010 −v1 v2− v3(v4+ v5+ v6+ v7+ v8+ v9) v10(v11+ v12+ v13+ v14+ v15) + µ00001 −w1 w2 − w3(w4+ w5+ w6+ w7+ w8) w9(w10+ w11+ w12+ w13+ w14) . (106)

The definitions of the various terms in this equation are given in Appendix F. As this equation is no longer accurate after gel has formed in the system, a different equation is derived where there is assumed to be gel, t > tg. Note that this formula is valid for every moment in time.

hsi_w= U (rout, rin, r1, r2, r3) W (rout, rin, r1, r2, r3)

+ 1

W (rout, rin, r1, r2, r3) ·

−∂U (rout, rin, r1, r2, r3) ∂x x1+ x2+ x3+ x4+ x5+ x6 d1+ d2+ d3+ d4+ d5+ d6 −∂U (rout, rin, r1, r2, r3) ∂y y1+ y2+ y3+ y4+ y5 d1+ d2+ d3+ d4+ d5+ d6 +∂U (rout, rin, r1, r2, r3)

∂z z1+ z2+ z3+ z4+ z5 d1+ d2+ d3+ d4+ d5+ d6 +∂U (rout, rin, r1, r2, r3) ∂v v1+ v2+ v3+ v4+ v5 d7+ d8+ d9+ d10+ d11+ d12 +∂U (rout, rin, r1, r2, r3)

∂w

w1+ w2+ w3+ w4+ w5 d1+ d2+ d3+ d4+ d5+ d6

(107)