• No results found

University of Groningen Modelling species interactions in macroevolution and macroecology Xu, Liang

N/A
N/A
Protected

Academic year: 2021

Share "University of Groningen Modelling species interactions in macroevolution and macroecology Xu, Liang"

Copied!
35
0
0

Bezig met laden.... (Bekijk nu de volledige tekst)

Hele tekst

(1)

Modelling species interactions in macroevolution and macroecology

Xu, Liang

DOI:

10.33612/diss.125954510

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date: 2020

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Xu, L. (2020). Modelling species interactions in macroevolution and macroecology. University of Groningen. https://doi.org/10.33612/diss.125954510

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

(2)

3

Inferring the effect of species

interactions on trait evolution

Liang Xu, Sander van Doorn, Hanno Hildenbrandt

& Rampal S. Etienne

This chapter is under review.

(3)

3

Abstract

Models of trait evolution form an important part of macroevolutionary biology. The Brownian motion model and Ornstein-Uhlenbeck models have become classic (null) models of character evolution, in which species evolve indepen-dently. Recently, models incorporating species interactions have been de-veloped, particularly involving competition where abiotic factors pull species toward an optimal trait value and competitive interactions drive the trait val-ues apart. However, these models assume a fitness function rather than derive it from population dynamics and they do not consider dynamics of the trait variance. Here we develop a general coherent trait evolution framework where the fitness function is based on a model of population dynamics, and therefore it can, in principle, accommodate any type of species interaction. We illustrate our framework with a model of abundance-dependent competi-tive interactions against a macroevolutionary background encoded in a phy-logenetic tree. We develop an inference tool based on Approximate Bayesian Computation and test it on simulated data (of traits at the tips). We find that inference performs well when traits are clearly different between the species. We then fit the model to empirical data of baleen whale body sizes, using three different summary statistics, and compare it to a model without population dynamics and a model where competition depends on the total metabolic rate of the competitors. We show that the unweighted model per-forms best for the least informative summary statistic, while the model with competition weighted by the total metabolic rate fits the data slightly better than the other two models for the two more informative summary statistics. Regardless of the summary statistic used, the three models substantially differ in their predictions of the abundance distribution. Therefore, data on abundance distributions will allow us to better distinguish the models from one another, and infer the nature of species interactions. Thus our framework provides a conceptual approach to reveal species interactions underlying trait evolution and identifies the data needed to do so in practice.

Key words: species interaction; competition; trait evolution; simulations;

(4)

3

Introduction

hile it is generally acknowledged that nothing in evolution or ecology makes sense except in the light of the other. Pelletier et al.14 , how exactly this mutual interaction can be understood is an area of active research Schoener1 . e are starting to understand how ecology, i.e. species interactions and population dynam-ics, depends on evolutionary history Schoener1 and how evolutionary history, such as morphological trait evolution and phylogenetic information, explains eco-logical processes and patterns Clarke et al. , Ives and Godfray 1, Pigot and Etienne 1 , Rafferty and Ives 1 , Rezende et al. 1 2 . However, the relative roles of biotic and abiotic factors in driving evolution remain elusive. This is in part because establishing a more explicit and mechanistic connection between the pro-cesses of trait evolution and community dynamics remains a difficult task Narwani et al.1 2 . More specifically, we do not fully understand how population dynamics influence trait evolution and vice versa. This is particularly important if evolution is fast Hairston et al. 2, Reznick et al.1 , Reznick and Ghalambor1 4 .

Species tend to evolve towards a phenotype that best exploits the most abun-dant resource Darwin , but also tend to diverge, in order to avoid competition with other species feeding on the same resources. The balance between these opposing tendencies is governed by two factors, niche width and population size, which ultimately determine how many species coexist in a given environment and how they partition the available ecological niches among them. Larger populations exert a larger impact on other species competing with them in the same part of niche space. Conversely, how much competition is experienced by a species ulti-mately determines its population size and may lead to a shift in niche space.

Classic methods attempting to infer the relative roles of the forces shaping trait evolution from trait distribution data use models that describe trait dynamics with Brownian motion or Ornstein Uhlenbeck processes Hansen and Martins 4, Raup et al.1 , Theodore Garland and Ives2 , with the latter describing evolution to-wards an optimum. These models assume independent evolution for each species and thus do not account for species interactions Pennell and Harmon 14 . Re-cently, several phylogenetic comparative tools have been developed to study how the abiotic environment imposing a trait optimum and competition among species jointly drive trait dynamics. Specifically, Nuismer and Harmon Nuismer and Har-mon 14 derived an analytical likelihood formula based on Lande’s work Lande

1 4 to investigate how phylogenetic relationships influence trait evolution and rates of interaction among species. This method was extended to incorporate biogeo-graphical factors Drury et al.4 , and generalized to other ecological interactions Manceau et al.11 . However, none of these studies took into account population dynamics thereby ignoring eco-evolutionary feedbacks of population size on trait evolution. Moreover, to obtain a tractable expression for the likelihood, they relied on a linear approximation of the fitness gradient function that has the following conse uence: the more different species are, the more they repel each other. This is unrealistic because we expect that for species diverging more than their niche width, the rate of trait divergence will decrease.

(5)

3

Here, we present a general framework of trait evolution that can accommodate, in principle, any type of species interaction. e do this by defining the fitness func-tion which determines the direcfunc-tion in which traits will evolve in terms of a popu-lation dynamics model. Species interactions ranging from predation and parasitism to mutualism can then be accounted for in the population dynamics model. e illustrate our framework with an application to competitive interactions and show that we resolve the above-mentioned drawbacks of pre-existing methods Drury et al. 4 , 44, Nuismer and Harmon 14 . Particuilarly our model predicts an in-termediate trait distance between species where repulsion between them is most intense, which reflects a nonlinear relationship between competition and trait simi-larity. Furthermore, our model allows trait evolution to depend on the intraspecific variance in traits. And last but not least.we allow for an effect of population size on trait evolution, and we refer to this model as the abundance-weighted competition A C model. e assume that the traits in our model do not drivepopulation diver-gence speciation , but we let species evolve along a given phylogenetic tree. The model is not amenable to analytical treatment, so we used Approximate Bayesian Computation to investigate parameter inference. e find that our method is capa-ble of recovering the generating parameters from simulated data sets under certain conditions; if these conditions are not satisfied, very different parameter sets can yield very similar patterns, so we expect that no method will be able to infer the parameters under these conditions. e then illustrate our approach using data on the evolution of total body length in baleen whales Mysticeti and compare our model with two variants: one that ignores population dynamics, i.e., a model with unweighted competition the U C model and one that assumes that competition is weighted by total metabolic rate abundance multiplied by per capita metabolic rate rather than only abundance, i.e., a model with metabolic-rate-weighted com-petition the M C model . e find that competition played an important role in shaping trait distribution patterns in baleen whales. e discuss how our model sets the stage for a new generation of models that allow determining the relative roles of various biotic and abiotic drivers of trait evolution.

Model description and analysis

Trait evolution and population dynamics on trees

e derive a model for the population size, mean and variance of the trait of each species in Box 1. e consider trait and population dynamics along given phyloge-netic trees. The phylogephyloge-netic trees can be either reconstructed trees, containing only extant species, or full trees, containing extinct species as well. e assume that data may be available on the final trait distribution and species abundances at the tips of the tree. e used simulations to explore these distributions. e initial-ized the simulations with two ancestral species of identical trait means e ual to and variances e ual to . . The initial population sizes were drawn from a normal distribution with a mean of individuals and a variance of 1 . ithout loss of generality, we assumed zero as the trait optimum set by the environment. To be in

(6)

3

line with the previous models of phenotype evolution along a given phylogeny, we assumed that at the branching points of the tree the two daughter species inherit the trait value from their parent. The abundance of the mother lineage is divided over the two daughter species following a binomial distribution that is truncated at the bottom and the top, so that the daughter species always have positive abun-dance. hen a species goes extinct according to the phylogeny, we just remove the species from the simulation but keep its trait means and variances and abundance stored for the time they were extant.

Parameter inference using ABC-SMC

The complexity of our model precludes analytical approaches to fit the model to data. Hence, we developed an inference framework using Approximate Bayesian Computation based on Se uential Monte Carlo ABC-SMC . In ABC-SMC, first intro-duced by Toni et al.2 , one starts with a large number of parameter sets sampled from the prior these are called particles in the terminology of the field , which are then used to simulate many data sets. e then evaluate the similarity of the simulated data to the empirical data measured by one or more summary statis-tics . This similarity is the goodness-of-fit GOF . The GOFs for all these data sets are used as weights to sample parameter sets in the next iteration generation in ABC-SMC jargon , with some random noise added to it. After a few iterations, the parameter sets will form the posterior distribution. The details of the algorithm, including the computation of the GOF, can be found in the supplementary material. The choice of an efficient summary statistic is crucial to evaluate the similarity between simulation and data. In the simulation study we use the Euclidean dis-tance between simulated and observed trait values. Because of the stochasticity of the trait inheritance after speciation, traits of a focal species can differ substantially across replicate simulations. Nevertheless, the gap between adjacent traits re-gardless of species identities reflects the true strength of environmental stabilizing selection and competition. Therefore, we do not label the species in our simu-lation and sort both the empirical traits and the simulated traits in an increasing order before computing the Euclidean distance of these two vectors. e refer to this summary statistic as the sorted mean trait distance SMTD . e also compute the Euclidean distance of the variance vectors corresponding to the reordered trait means.

Choosing the Euclidean distance of sorted traits may not be the best way to fit our model to empirical data, because we ignore information on the empirical order of trait values across the phylogenetic tree. Therefore, in the empirical study see below , we considered phylogenetic independent contrasts PICs Felsenstein 1

as an alternative set of summary statistics. The PICs are designed to transform the original 𝑛 traits of species to 𝑛 − 1 independently and identically distributed contrasts between pairs of related species or estimated ancestral nodes Garland . Because the PICs have one dimension less than the trait data, we combined the PICs with the unsorted mean trait distance UMTD to obtain a third set of summary statistics, referred to as UMTD PICs. e compared results between the three sets of summary statistics.

(7)

3

Scenarios Trees 𝜆 𝜇 𝐾 Pruned Time scales

1 1 .4 1 No 1 2 2 .4 No 1 .4 1 No 1 4 4 .4 .2 1 No 1 .4 .2 No 1 .4 .2 1 No 1 . 1 No 1 . No 1 . 1 No 1 1 1 . .2 1 No 1 11 11 . .2 No 1 12 12 . .2 1 No 1 1 1 . .4 1 No 1 14 14 . . 1 No 1 1 4 .4 .2 1 Yes 1 1 .4 .2 Yes 1 1 .4 .2 1 Yes 1 1 1 . .2 1 Yes 1 1 11 . .2 Yes 1 2 12 . .2 1 Yes 1 21 1 . .4 1 Yes 1 22 14 . . 1 Yes 1 2 . 1 No 1 ,2 24 12 . .2 1 No 1 ,2 2 1 . .4 1 No 1 ,2 2 14 . . 1 No 1 ,2 2 . 1 No 2 2 12 . .2 1 No 2 2 1 . .4 1 No 2 14 . . 1 No 2

Table .1: The experimental setup for testing the influence of phylogenetic information. The first 14 scenarios are generated under various diversification rates speciation rate and extinction rate and clade-specific carrying capacities . Pruning these trees from extinct species results in the Scenarios 1 -22 only for nonzero extinction rates . Scenarios 2 -2 are designed for studying the effect of the rate of evolution. The observations are generated under a scaling parameter of 1 microevolutionary time steps per unit of macroevolutionary time while the algorithm uses 2 . For Scenarios 2 - ,

(8)

3

Simulation setup

To assess the behavior of our model, we first simulated data for known pa-rameter sets and explored whether the papa-rameter values can be correctly inferred. e considered six different values(0, 0.001, 0.01, 0.1, 0.5, 1) for both the stabilizing selection coefficient 𝛾 and the competition coefficient 𝛼 leading to a total of parameter combinations for a given phylogenetic tree. e set 𝑅 = 𝑒 i.e. the mathematical constant 2. 1 ,𝛽 = 10 and the mutation rate 𝜈 = 10 for all simulations. e focused on the inference of three parameters, namely𝛾, 𝛼 and 𝜈. To study how the phylogenetic information influences the evolution process, we generated several phylogenetic trees, including extinct branches, under the diversity-dependent diversification model Etienne et al. 1 for various parameter settings of this macroevolutionary model see Table. .1. In addition, to mimic the fact that in practice complete trees with extinct species are often not available, we reconstructed phylogenies of only extant species by pruning the extinct species. This means that we generated the trait data under the full tree, but estimated the parameters of our trait evolution model using only the reconstructed tree. Compar-ing this inference to inference usCompar-ing the complete tree informs us to what extent the loss of information of extinct species affects parameter estimation.

The ratio of the time scale of trait evolution and population dynamics to the time scale set by the phylogeny i.e. the number of time-steps of trait and population size dynamics in each unit of time of the phylogeny is a crucial factor. e denote this ratio in our model by the scaling parameter𝑠. For instance, given a phylogenetic tree with a crown age of 1 million years, trait and population dynamics involves 15 × 𝑠 time steps. The value of 𝑠 may influence our parameter estimates. So to assess how not knowing the true number of time steps i.e. the time scale of trait evolution to the time scale of speciation affects parameter inference, we generated data under𝑠 = 10000 and then ran our inference algorithm under 𝑠 = 10000 and 𝑠 = 20000 and compared their performance in parameter estimation see Table. S1 .

In summary, we generated a total of 14 phylogenetic trees and pruned these trees when extinction rates were nonzero, resulting in 22 trees in total see Table. S1 . e designed scenarios to investigate the influence of tree size, speciation rate, extinction rate, removal of extinct species and number of time steps see Table. .1. e simulated our model for parameter combinations for each scenario. e applied our inference algorithm on the simulated data and examined if the generating parameters could be recovered correctly. In the inference process, we set iterations and 2 particles for each iteration. For the analysis of a single scenario, we exploited a cluster of high performance computers with 2 threads running on each computer. Each parameter combination for each scenario analysis took between 2 and hours, depending on the number of evolutionary events and tree size of the specific scenario. All the code is available on Github

(9)

3

Applying the model to baleen whale body size evolution

Baleen whales represent the largest extant animal species and are distributed globally. They are filter-feeders on small fish and crustaceans. Body mass is an ideal trait that responds both to the abiotic factors Smith et al. 1 2 and biotic competitors but measurements of body mass are rarely available. However, data on total length are available. So because it has been shown that whale total length scales with body mass raised to a power of Lockyer 111 , we used the total length as a proxy for body mass Slater et al.1 1 . e log-transformed base 1 the body length, because the log scale is a more natural scale on which evolution takes place Gingerich . Results based on the untransformed total length can be found in the supplementary material, see Fig. S -S and Fig. S 2-S 4. e fitted our model to mean trait data given a reconstructed phylogeny with 1 extant species Slater et al. 1 1 . e did not use abundance or trait variance data, as these were not available.

e designed scenarios to fully assess the effects of environmental stabilizing selection and competition: four values of the time scaling parameter 𝑠 2 , 4 , and corresponding to four reasonable generation times , 2 , 1 . , 12. years/generation, respectively and two heritability values ℎ = 0.5, 1 . In contrast to the simulation study, we also estimated the variance due to mutation and segregation,𝑉 , and the trait optimum, 𝜃. The remaining parameter settings were identical to the simulation study. In the ABC-SMC algorithm, we set 4 particles for each iteration and in total 4 iterations for each scenario.

e developed two additional models for comparison. The first model the U C model assumes that competition does not depend on species abundance and hence does not need population dynamics. The U C model is similar to Druryet al.’s non-linear extension Drury et al.44 of Nuismer & Harmon’s model Nuismer and Har-mon14 . However, it differs in the competition kernel, i.e. from the population dy-namics model it follows that pairwise competition is described as(𝜇 −𝜇 )⋅𝑒 ( ) instead of Druryet al’s ratherad hocchoice of sign(𝜇 − 𝜇 ) ⋅ 𝑒 ( ) E . 1 in Drury et al. 44 . The second additional model the M C model assumes that competition depends on total metabolic rate, in which abundance is multiplied by the per capita metabolic rate, which depends on body length see E . S 2-S & S -S in the supplementary material . That is, the pairwise competition is 𝑒 ( , ,) 𝐵

, instead of𝑒 ( , ,) 𝑁, as used in the A C model , where𝐵,

is the total metabolic rate of species𝑗 at the 𝑡−th generation. Because the loga-rithms of body length and body mass of whales are strongly correlated with a slope of Lockyer111 and per capita metabolic rate has a scaling with body mass of

Brody1 , Brody and Procter1 , Etienne et al.4 , leiber , the total metabolic rate depends on body size and abundance as follows: 𝐵 = 𝑁 ⋅ 𝐵 ⋅ 𝜇 / . Here 𝐵 is a basal metabolic rate per kg BMR/kg that is assumed to be constant across whale species, and therefore drops out of our e uations because only the relative metabolic rate matters. Thus, large-bodied species have more competitive power than small-bodied species. This asymmetry makes the trait distribution at present highly dependent on the optimum trait. e therefore also estimated the optimum

(10)

3

trait 𝜃. Because more free parameters re uire more simulations in each iteration in the algorithm to obtain sufficient sampling, we here used 4 particles per iteration instead of 2 in the simulation study. For these two additional models we again estimated five parameters (𝛾, 𝛼, 𝜈, 𝑉 , 𝜃) . but we considered only one scenario of heritability and time scaling(ℎ = 1; 𝑠 = 20000 , because the analyses are computationally demanding, and because we found that the scenarios were similarly supported for the A C model see Results .

e used three alternative sets of summary statistics, i.e. the sorted mean trait distance SMTD , the phylogenetic independent contrasts only PICs and the unsorted mean trait distance with the phylogenetic independent contrasts UMTD PICs . To compare the goodness-of-fit of the eight scenarios for the A C model among each other, we took the simulations with the highest GOF-values across all scenarios and computed the percentage of simulations represented by each scenario in these best fitting simulations as a measure of the support of that scenario Toni et al. 2 . e did this for each of the three sets of sum-mary statistics.For comparing the three models we used the exact same procedure; support of a model is thus measured by its representation among the best GOF-values across all three models. Lastly, for each model we used the param-eter estimates to generate 1 data sets to compare the predicted phylogenetic independent contrasts with the empirical observations.

Results

The patterns of traits and abundances strongly depend on the combination of the stabilizing selection coefficient 𝛾 and the competition coefficient 𝛼. hen the competition coefficient𝛼 is smaller than 𝛾, the plots show highly compressed traits at the optimum trait value 𝜃 = 0 . The traits are most diverse when the competition coefficient is moderate, due to the nonlinear nature of competition Fig. .1. This is more pronounced when environmental stabilizing selection is weak. e also find that when environmental stabilizing selection is absent 𝛾 = 0 , the species at the edge of the trait spectrum have larger population sizes than species with intermediate traits Figs. S 2-S11 . However, for all the nonzero values of the stabilizing selection coefficient 𝛾 we explored, the species with intermediate trait values i.e. closer to the optimum are more abundant than species with low or high trait values Figs. S 2-S11 . Specie with intermediate trait values also had larger trait variance than those with low or high trait values Fig. S114-S1 .

e find the most accurate inference when the uantity √𝛼/𝛾, is roughly e ual to the number of tips in the phylogeny. Thus, when 𝛼 is larger than 𝛾, but not too large, the traits are informative for parameter inference because the traits are sufficiently separated, and we will call these parameter values the informative pa-rameter domain and otherwise the uninformative papa-rameter domain. For example, in Scenario 1 where the tree has 1 tips see Fig. .2, the estimates for𝛾 and 𝛼 are most accurate in the combinations of the generating parameters being𝛾 = 0.01 and𝛼 = 1. In Scenario 4 where the tree has tips in Figure. S2 the most accurate inference is found when𝛾 = 0.01 and 𝛼 = 0.5.

(11)

3

α =1 α =0.5 α =0.1 α =0.01 −0.25 0.00 0.25 −10 −5 0 5 10 µi− µj Competitiv e repulsion α ( µi − µj )e −α ( µi −µj ) 2

Figure .1: The nonlinear strength of competitive repulsion of two species with identical population size that follows from our model. hen the traits of two competitors are very similar, they experience intense competition but little directional repulsion. ith increasing difference in traits, the repulsion force increases. Eventually, with a further increase in trait difference, repulsion decreases again because competition is avoided. Different values cause different shapes of the strength of repulsion. A large implies strong repulsion when competitors are very similar in traits but this competitive strength drops uickly, implying a small competitive interaction distance. In contrast, a small implies a large competitive interaction distance but mild repulsion.

Influence of phylogenetic information

Tree size has a large impact on the estimation of the competition coefficient and mutation rate, but not on the estimation of the stabilizing selection coefficient Fig. . and Figs. S2-S4 . In general, when comparing across the scenarios with different carrying capacities𝐾 of the diversity-dependent diversification model, large trees tend to cause large variation in parameter estimates. For these large trees, the variance becomes even larger when environmental stabilizing selection becomes stronger. For very small trees e.g. Scenarios 1, 4, , 1 with carrying capacity of 1 , the variance of the estimates first declines and then increases with increasing𝛼 while keeping 𝛾 fixed see Fig. . in the main text and Figs. S2-S4 . The same pattern is also found when increasing𝛾 and fixing 𝛼. The estimates of 𝜈 show a peculiar pattern that depends on the tree size. For small trees the estimates of𝜈 are a bit overestimated but with a low variance. For large trees the median of the estimates is close to the true value but the variance is large.

A high speciation rate tends to bias the estimation of𝛼 and 𝜈 and increases the variance of the estimates for the scenarios with nonzero extinction while for the pure birth scenarios it improves the estimation see Figs. S -S in the supplementary material for the scenarios with no extinction and Figs. S -S1 for the scenarios with extinction . For the estimation of𝛾, there is no substantial effect of the speciation and extinction rates. However, the estimation is likely affected by the range of the resulting traits, e.g. a small range of the resulting traits leads to inferring large

(12)

3

Figure .2: The phylogenetic tree and the corresponding trait trees along Tree 1 across all parameter combinations. Different colors denote the traits of different species. The simulation is initialized with two ancestral species with identical trait mean and variance . and the population size is randomly chosen from a normal distribution centering on with variance of 1 . The parameters used to generate the phylogenetic tree are . , , with a crown age of 1 Myr. The time scaling parameter is indicating a total of 1 , time steps in the trait simulation. Trees are shown only for the informative parameters domain ; when , the trait trees are all almost identical: a highly compressed line such as the pattern on the diagonal.

(13)

3

𝛾. The estimation of the competition coefficient and the mutation rate is likely to be affected by large speciation and extinction rate because the fre uent speciation and extinction prevent the trait evolution from reaching e uilibrium.

Generally, extinction may cause more variance in the estimates of 𝛼 but the median of the estimates resembles the true values for the trees that are not too small see Fig. S11-S1 . For small trees, for example Scenarios 4 and 1 see Fig. S ,𝛼 is greatly underestimated when there is no stabilizing selection and the generating competition coefficient is large 𝛼 = 1 , while 𝛾 is well estimated except when stabilizing selection is weak and the competition coefficient is large for the scenario with very large extinction rate 𝜇 = 0.6 see Scenario 14 in Fig. S1 . Although the estimates of𝜈 show an increasing variance when the extinction rate increases, the true value is still recovered on average. Interestingly, in the cases where 𝛾 and 𝛼 are both large 𝛾 = 0.5, 𝛼 = 1 a nonzero extinction rate greatly improves the estimation of all parameters when using the full tree Scenarios 4-and 1 -14 .

Influence of reconstructing phylogenetic trees

Surprisingly, from the overall comparison among the full-tree scenarios and the corresponding pruned-tree scenarios, there is no substantial difference in parame-ter estimates for most of the parameparame-ter combinations in the informative parameparame-ter domain see Figs. S1 -S24 in the supplementary material . Only a slight underes-timation of 𝛼 is observed for the pruned-tree scenarios compared to the full-tree scenarios. The estimation of𝛾 is only substantially biased for the scenarios using extremely large extinction rate Scenarios 14 and 22 with extinction rate𝜇 = 0.6, see Fig. S24 and small stabilizing selection coefficient. There seems to be no clear influence of pruning extinct species on the estimate of 𝜈. The overall relative in-sensitivity of the parameter estimates to phylogenetic reconstruction for only extant species may be due to the fact that species extinction will almost immediately lead to another species occupying its niche.

Influence of the rate of evolution

Choosing a larger value of the scaling parameter 𝑠 = 20000 in estimation than the one used to generate the data 𝑠 = 10000 does not lead to substantial difference in estimation for most of the parameter combinations see Fig. S2 -S2 . The competition coefficient is only slightly underestimated for the scenarios with large 𝛼 except for one special case when 𝛾 = 0 and 𝛼 = 1 in the pure birth tree Scenario 2 , see Fig. S2 . The estimation of𝛾 is accurate in the informative parameter domain. The estimates of𝜈 show a somewhat larger variance but still match the true value in most of the parameter combinations.

A larger value of the scaling parameter used both in generating the data and in the parameter inference generally does not improve the performance of parameter inference Figs. S2 -S 2 . A significant improvement only appears in the estimation of𝛼 when 𝛾 = 0 for the scenarios with small or zero extinction rates. The stabilizing selection coefficient𝛾 is normally e ually well estimated for the scenarios with small

(14)

3

Figure . : Parameter inference for the parameter combinations under Scenarios 1, 2 and Table.

.1 the corresponding phylogenetic trees are shown at the bottom left . The dashed lines in three colors denote three parameter values used to generate the data. Note that the scale of mutation rate is . Box plots indicate the distribution of inferred parameter values, where the whiskers extend from the minimum to the maximum value. Some plots show no estimates for part of the parameter combinations, because there is no complete simulation under these parameter combinations after 1 , attempts. The shared parameters used to generate the phylogenetic trees are . , with a crown age of 1 Myr. The clade-level carrying capacity is , , for trees 1, 2, , respectively. Data are shown only for the informative parameter domain ; when , the parameter estimation show major bias and substantial variance.

(15)

3

extinction rates for both scaling parameters. By contrast, for the scenarios that have large extinction rates, the parameter inference is worst when stabilizing selection is weak. hen stabilizing selection becomes strong, parameter estimation for all three variables is substantially improved.

Environmental stabilizing selection and competition in baleen whales

To examine the phylogenetic signal in the whale body length data, i.e. to study to what extent the correlation in traits reflects their shared evolution history, we computed Blomberg’s 𝐾 Blomberg et al. 1 and Pagel’s 𝜆 Pagel 14 . These two statistics represent two different uantitative measures of the phylogenetic signal, measuring the correlations between species relative to what is expected under Brownian motion and a scaled ratio of the variance among species over the contrasts variance, respectively. Blomberg’s𝐾 for the baleen whales is . 1 and is significantly different from0 with a 𝑝-value of . 2, implying a moderate degree of phylogenetic signal close to an O-U process with a weak adaptation effect Blomberg et al.1 , Diniz-Filho et al.42 . Pagel’s𝜆 is . , significantly different from with a 𝑝−value of . 2. This suggests a strong phylogenetic signal, i.e. resemblance among close relative species close to a Brownian motion process which has𝜆 = 1 . These findings suggests that there is a clear phylogenetic signal in the body length data, making it reasonable to apply our model.

In all scenarios, the estimates of𝛾 are much smaller than the estimates of 𝛼. Heritability tends to affect the inference of the environmental stabilizing selection coefficient𝛾, and the competition coefficient 𝛼 but not so much the mutation rate 𝜈, the maximum variance by mutation 𝑉 and the optimum trait 𝜃 see Fig. .4 . ith a larger heritability ℎ = 1 , the estimates of 𝛾 and 𝛼 are smaller and less variable than forℎ = 0.5 across all three summary statistics. Alternative sets of summary statistics lead to similar results in the estimation of the parameters 𝜈, 𝑉 and 𝜃, but for 𝛾 and 𝛼, the inferences using PICs and UMTD PICs differ from those using SMTD particularly whenℎ = 0.5 see Fig. .4for SMTD, Fig. S 4 for PICs and Fig. S for UMTD PICs . The estimated𝛾 is smallest, around 6, when using PICs. The value increases when the summary statistics include absolute trait information SMTD and UMTD PICs , reaching 9 for the algorithm using SMTD only. In contrast, the estimations for𝛼 using SMTD are much smaller than using the other two summary statistics. The mutation rate 𝜈 is consistently inferred to be 0.001 except when using PICs, in which increasing 𝑠 to 80000 increases the estimate of 𝜈 to 0.002. e did not find substantial differences in the estimates under different time scaling parameters, except that with only PICs as the set of summary statistics, the predicted 𝛼 decreases when increasing the time scaling parameter and/or heritability. The estimation of 𝜃 is estimated to be around 3.05 across all three summary statistics, although a larger time scaling parameter leads to a larger variance when using only PICs.

Comparing the support of the scenarios across ℎ and 𝑠 computed from the GOF-values tells us that when using SMTD as summary statistic ℎ = 0.5 better fits the data than ℎ = 1 while all values of 𝑠 are similarly supported. Using PICs and UMTD PICs as summary statistics leads to e ually good fits for different ℎ

(16)

3

and𝑠 see Fig. . .

e note that the uantitative values of the estimates for stabilizing selection and competition coefficients are not comparable among the three models of trait evolution A C, U C and M C because these models assume different factors affecting competition. Nevertheless, the estimations all suggest a small environ-mental stabilizing selection coefficient and a large competition coefficient see Fig. S -S 1 for the three sets of summary statistics . hen using the summary statis-tics SMTD, the best fitted value of the optimum trait𝜃 is close to the mean of the extant species traits . for the A C model and the U C model, and around 2. for the M C model. A similar pattern is also found in the estimation using UMTD PICs but a large variance emerges for the M C model. Using PICs but not SMTD and UMTD PICs leads to large variance in estimations of𝜃 in all three models. The abundance distribution substantially differs for the three models Fig. S regardless of the choice of summary statistics. The A C model generates a symmetric unimodal abundance distribution around the optimum trait. For the M C model, the abundance distribution shows a decrease of abundance with increasing body length. The prediction of the intraspecific variance also differs substantially among the three models see Fig. S . The A C model produces higher intraspe-cific variance than the U C and M C models. Comparing the supports of the three models we find that the U C model is favored when using SMTD as summary statis-tic while all three models show a similar fit with the M C model slightly favored when using the other two more informative summary statistics . and see Fig. S1 -S1 for the GOF distributions of the three models for the three summary statistics .

Using the estimated parameters, we generated 1 replicate trait data sets for the three models from the posterior distributions and compared the predicted phy-logenetic independent contrasts to the empirical data see Fig. . for the results using UMTD PICs and Fig. S -S for the results using PICs and SMTD . e ob-serve a good fit for most of the contrasts but substantial differences for some pairs of species and clades. For example, the contrasts between the pairs ofB.musculus and its daughter clade,M.novaeangliaeandB.physalusare underestimated.

Discussion

e have developed a novel, coherent, framework to study how species inter-actions affect trait evolution. e have illustrated it with a model of trait evolution and population dynamics to explore how environmental stabilizing selection and competition shape trait patterns. e employed Approximate Bayesian Computa-tion with a Se uential Monte Carlo algorithm to infer the parameters of interest and measure the performance of our method for simulation data. Our analysis reveals that the trait data at the present is generally informative for parameter inference, particularly when the number of species allowed by the strength of stabilizing selec-tion and competiselec-tion is similar to the number of species present in the phylogeny, which implies that the competition coefficient needs to be larger than the stabi-lizing selection coefficient. In contrast, when environmental stabistabi-lizing selection is

(17)

3

Figur e .4: P ar ameter inf er ence under par ameter set tings i.e. al l combinations of , , , and ., to test the influence of the number of time steps and heri tabi lit y on par ameter estimation using SMTD as the summary statistic f or the resul ts using PICs and UMTDPICs, see Fig. S4-S. The thr ee dashed lines in the viol in plot ar e 2th per centile, median and th per centile uanti le of the samples in the last iter ation of the AB C algori thm that pr oduce the best fi ts to the baleen whale data.

(18)

3

greater than competition, our inference approach is limited due to an uninformative trait pattern that is highly compressed to the optimum trait. Our empirical study shows a small environmental stabilizing selection coefficient and a relatively large competition coefficient, suggesting a rapid repulsion among whale species.

Our model advances previous studies Drury et al. 4 ,44, Manceau et al.11 , Nuismer and Harmon 14 in three ways. First, our model is based on an explicit definition of fitness based on population dynamics, instead of chosenad hoc. This property makes our model more coherent. Second, we have relaxed the assumption of a linear competition kernel, and thereby capture a realistic mechanism in species interactions: when two populations have very similar trait values, the substantial overlap in their intraspecific trait distributions neutralizes directional selection al-though competition is very intense there. Conse uently, the rate of traits to evolve apart is also low. However, once stochastic mutation produces imbalance, com-petition leads to character displacement. Thus, the force of directional selection dramatically increases at the onset of a division in traits. hen species interactions decrease due to increasing dissimilarity, the force of directional selection drops. This process continues until the two populations evolve apart in traits to the extent that competition is balanced by stabilizing attraction. The third advantage of our model is that we have made population size a factor in the force of competition

which seems more realistic , and modeled the population dynamics.

Our simulation study reveals that phylogenetic information, i.e. tree size, branch-ing times and extinct branches, is informative for the inference of stabilizbranch-ing selec-tion but carries little informaselec-tion on species interacselec-tions in phylogenies with nonzero extinction rate Nuismer and Harmon14 . Large trees show more variance in esti-mations. On the one hand, this is because large trees have more speciation events that may result in species branching before they can evolve to e uilibrium. On the other hand, the potential mismatch between the species carrying capacity𝐾 used to generate phylogenetic trees and the number of coexisting species that is allowed by the strength of environmental stabilizing selection and competition could play a role in the performance of the inference method. Large environmental stabilizing selection results in a narrow width of natural resources, which may sustain only a few species. However, our simulation conditioning on a prescribed individual car-rying capacity and a given phylogenetic tree forces more species to survive than stabilizing selection would naturally allow. This conflict may produce more variance in parameter estimation. High speciation rate generally improves parameter infer-ence in phylogenies resulting from pure birth process, because the community can reach the clade-level carrying capacity faster, allowing more time to form a specific pattern in traits. Conversely, extinction prevents the community from reaching the carrying capacity and generates more macroevolutionary events, allowing less time for trait evolution. Thus, whether parameter inference is accurate depends on the rate of evolution. Generally our study suggests that if the rate of evolution is fast relative to the extinction rate, extinction rate has little influence on the behavior of the parameter inference. If the extinction is so high that the rate of evolution cannot lead species to evolve uickly enough to fill the trait niches left by extinct species, bias in parameter inference is expected even when speciation rate is high.

(19)

3

Figur e .: Pr ediction of the ph ylogenetic independent contr asts PICs of baleen whale log-tr ansf ormed body length, simulated using the par ameters estimated using UMTDPICs . The time scal ing par ameter is , , corr esponding to y ears per gener ation and heri tabi lit y . The ph ylogen y is the reconstructed tr ee of the Mysticeti wi th the -axis in uni ts of mi llion y ears Slater et al. 11 . The bo x plots show the distributions of PICs acr oss 1 simulations under the thr ee models against the true data the black dots wi th the -axis denoting the absolute ph ylogenetic independence contr asts. The bars on the right show the untr ansf ormed body length of the species.

(20)

3

Figure . : Model comparison. Panel A shows support of various and for the three summary statis-tics under the A C model. The width of the bar denotes the support of the corresponding parameter combination. Panel B shows support across the three candidate models A C, U C and M C for the three summary statistics. The height of the bar denotes the support of the corresponding model.

(21)

3

Reconstructing trees by pruning extinct species has a similar effect on parame-ter inference as using full trees if environmental stabilizing selection is present and the number of time steps is large. One might expect that extinction leaves gaps in the trait distribution suggesting a wider effective competition zone corresponding to a smaller competition coefficient𝛼, which is observed in our results. However, any possible bias that extinction may cause will disappear if given enough time steps because extant species eventually take over empty niches left by species that went extinct. Thus, if the rate of evolution is fast enough, the trait pattern of extant species only can still inform about species interactions. That is to say, under these conditions phylogenetic information is not necessarily needed for the inference of species interactions. This conclusion is consistent with Nuismer & Harmon’s find-ing for the phenotype differences model Nuismer and Harmon 14 . This result increases the robustness of our inference approach when applied to empirical data, where we often have only molecular phylogenies and morphological trait data for extant species.

In our example of baleen whales we find that𝛼 > 𝛾 suggesting strong competi-tion, but we must interpret this result with caution. A large value of the competition coefficient𝛼 does not immediately imply strong competition because of the shape ofΩ see Fig. .1 where larger𝛼 not only increases the intensity of the repulsion between two species with different traits, but at the same time reduces the inter-action distance, that is, species with very similar trait values experience stronger interaction, but species with dissimilar trait values experience less interaction. More important than the parameter estimates is the model fit and comparison. Our sim-ulation study demonstrates the ability of our method to recover the generating pa-rameters when √𝛼/𝛾 is close to the number of species at the tips of the phylogeny, while our empirical study among three models reveals that different processes can produce similar patterns in traits but result in substantial differences in popula-tion sizes. Thus, the difference in the predicpopula-tions of populapopula-tion sizes for different trait values seems informative for distinguishing between mechanisms. This un-derscores the importance of implementing population dynamics in trait evolution. Because the fitness function E . .1 can be altered to accommodate any type of species interactions, using our approach for model selection allows one to unravel underlying species interactions.

e find some differences between observed and predicted contrasts regardless of the set of summary statistics used in ABC , showing a limitation in the ability of our model to predict body sizes. That is, some close relatives show larger con-trasts than predicted by our model. This suggests that 1 other traits than body size also determine the strength of competitive interactions, and/or 2 some of the baleen whales explore different niche dimensions e.g. they utilize different types of resources , i.e. they evolve to adapt to new niches to avoid competition. For example, the largest underestimation of the contrasts is found in the pair of B.musculusand its daughter clade and the pair ofB.physalusand its sister lineages which seems to imply that the two largest species find niches that favor gigan-tism Slater et al.1 1 . This additional mechanism in trait evolution also reduces the phylogenetic signal, as suggested by Blomberg’s𝐾. This does not mean that

(22)

3

competition has not played a role in baleen whale trait evolution: competition may have led to diversification in different niche dimensions. Both explanations involve multiple traits, and hence extending our model to incorporate evolution of multiple traits presents an interesting avenue for further research.

ith the three models we studied in this paper we have introduced additional mechanisms in the competition kernel. The formulation of the pairwise competi-tion can be written in the general form𝑒 ( , ,) 𝐹(𝑁

, , 𝜇, ), where 𝐹(𝑁, , 𝜇, )

describes the mechanism of interest. For example, the U C model assumes e ual competitive power 𝐹(𝑁, , 𝜇, ) = 1 to all species while the A C model assumes

an abundance-dependent competition 𝐹(𝑁, , 𝜇 , ) = 𝑁, and the M C model sets

a metabolism-dependent competition 𝐹(𝑁, , 𝜇 , ) = 𝐵, . The A C model agrees

with the intuition that the species that can utilize the most available natural resource is the most abundant even if competition is most intense there. The M C model captures the fact that the relationship between body size and abundance is gen-erally negative Damuth , , Peters and Raelson1 1, Peters and assenberg

1 2 . Our general formulation E . . - .4 allows for other ways of weighing the effect of competitive interactions on trait evolution.

Our model can be further extended in various directions. For example, our model currently only incorporates demographic stochasticity; the noise term de-creases rapidly with population size. One can easily add environmental noise, which is independent of population size. Other possible extensions would be to replace the single optimum of a single trait by multiple optima, and to relax the assump-tion that all species interact with each other to only local interacassump-tions in a spatially explicit model see e.g. Drury et al.44, Manceau et al.11 , u and Etienne22 . Furthermore, the optimum trait value,𝜃, may be made time-dependent as the abi-otic environment e.g. temperature, resource availability has obviously not been constant over macroevolutionary time scales. It may even vary on much shorter time scales. For example, Darwin finches on the Gal pagos show different beak sizes in different years due to the changing seed size distribution Grant and Grant , Lack 1 2 . Another extension of our model would be to model the effect of trait evolution on diversification, along the lines of Aristide & Morlon Aristide and Morlon who integrated trait evolution in a macroevolutionary model to study the effect of competition on biodiversity and phenotypic diversity but without popula-tion dynamics . Insights from adaptive dynamics on evolupopula-tionary branching may be useful here Geritz et al. 1 . Finally, our model and the ABC-approach also allow for joint inference of macroevolutionary speciation and extinction dynamics and microevolutionary trait dynamics.

In summary, we have presented a general framework to model trait evolution subject to species interactions and provided an inference tool to perform model selection based on trait data with or without abundance data. e have shown that the availability of abundance data may be crucial to distinguish between different models. The availability of data on intraspecific trait variation may further aid model selection. Our approach thus represents an important step in the eco-evolutionary synthesis and testing this on empirical data.

(23)

3

Box 1. The derivation of the trait evolution model with population dynamics. e consider𝑛 species competing for the same spectrum of resources with a fixed and unimodal distribution Mahler et al.11 .

e define fitness for an individual of species 𝑖 with trait 𝑧 through the per capita growth function

𝜔(𝑧 ) =𝑁, 𝑁,

𝑓, (𝑧 )

𝑓, (𝑧 )

. .1

where 𝑁, denotes the population size at the 𝑡-th generation and 𝑓, and

𝑓, are the densities of traits of the two generations respectively. After

applying simplifying assumptions such as weak selection, we can derive a model on the species level that describes the dynamics of the population size𝑁, ofthe mean species trait value 𝜇 and of the intraspecific variance of each species’ trait𝑉 in terms of the fitness function see the supplementary material for the full derivation Harmon et al. :

𝑁, = 𝑁, ⋅ 𝜔(𝜇, ) .2 𝜇, = 𝜇, + ℎ ⋅ 𝑉, ⋅ 𝜔 (𝜇, ) 𝜔(𝜇, ) . 𝑉, = (1 − 1 2ℎ )𝑉, + 1 2ℎ ⋅ 𝑉, ⋅ 𝜔 (𝜇, ) 𝜔(𝜇, ) + 2𝑁, 𝜈𝑉 1 + 4𝑁, 𝜈 ⋅ ℎ , .4 where 𝑁, , 𝜇, and 𝑉, denote the population size, trait mean and trait

variance of species𝑖 at the 𝑡-th generation. Our framework is amenable to all sorts of species interactions through the fitness function 𝜔(𝜇, ) which

can be a function of the traits and abundances of other species as well as of the abiotic environment. In the model,𝜔 (𝜇, ) and 𝜔 (𝜇, ) are the first

and second derivatives of the fitness function with respect to the trait value, evaluated at the species mean of the trait 𝜇, . The first derivative of the

fitness function is proportional to the rate at which the mean trait in the population evolves uphill on the fitness landscape. The second derivative of the fitness function determines whether selection is stabilizing or disruptive, which influences the variance of the trait and, thereby, the rate of evolution. . E . .2, which is is essentially Lande’s formula E . in Lande1 4 , links the population size to a user-specified fitness function. In the literature, the trait variance is usually assumed to be constant Drury et al. 4 , 44, Lande1 4, Nuismer and Harmon14 , but this may be an oversimplification Barnett and Simpson 1 , an alen 2 . Therefore, we allow variance dynamics E . .4 in trait evolution which consists of three components that capturea reduction of variance due to sexual reproduction (1− ℎ )𝑉, , the

effect of selection ℎ ⋅ 𝑉, ⋅ ( ,)

( ,) , and an inflow of new variance imura

and Crow due to segregation and mutation ,

(24)

3

heritability of the trait, i.e. the degree of phenotypic resemblance between parents and offspring Falconer and Mackay , 𝜈 is the average rate of mutation of the alleles existing in a diploid population. 𝑉 represents the maximum variance of the trait that can be supported by mutation in an infinitely large population, in the absence of selection on the trait. In the simulation study we set𝑉 e ual to 1 and the environmental contribution to phenotypic variance to , such that trait heritabilityℎ is e ual to 1. In the empirical application we estimate 𝑉 as a free parameter, and we consider two distinct values of heritabilityℎ .

e allow for stochastic evolutionary trait change due to drift by adding a noise term𝜂, to E . . Lenormand et al.1 , Nuismer and Harmon14

that follows a normal distribution with mean and a variance that is inversely proportional to the effective population size which we set e ual to the actual population size 𝑁, , i.e. 𝜂, ∼ 𝑁(0, ,

, ). e incorporate demographic

stochasticity by drawing species abundances from a zero-truncated Poisson distribution with a mean determined by E . .2. Hence, we do not allow for extinction due to demographic stochasticity, but species can become extinct if the phylogeny tells us so see below .

. Here, we define the fitness of the mean phenotype𝜇, at the𝑡-th

genera-tion using a Ricker type form of discrete-time populagenera-tion dynamics see the supplementary material :

𝜔(𝜇, ) =

𝑁, 𝑁,

∶= 𝑅(𝜇, )𝑒 (⃗⃗⃗,⃗⃗⃗⃗)/ . .

Here, 𝑅(𝜇, ) is the growth factor that depends on the trait value and the

parameter𝜃 that represents the optimum trait favored by abiotic stabilizing selection as follows:

𝑅(𝜇, ) = 𝑅 𝑒 ( ,) . .

where𝑅 is the optimal growth factor and 𝛾 determines the strength of sta-bilizing selection towards the optimum. Furthermore, the function𝛽 in E .

.1 uantifies the intensity of competition. Assuming a Gaussian competition kernel, we define 𝛽 as

𝛽(⃗⃗⃗𝜇, ⃗⃗⃗𝑁) = ∑(𝑒 ( , ,) 𝑁

, ). .

E . . states that the strength of competition between two species with trait means𝜇, and𝜇 , increases with similarity in these trait means, and that the effect of competition on species𝑖 increases with population size of species 𝑗. The competition coefficient𝛼 scales the strength of the interaction and de-termines the effective interaction length. Finally, the parameter𝛽 in E . .1

(25)

3

has a similar interpretation as an individual-scale carrying capacity of each species Abrams 1 , as it sets the scale at which competitive interactions start to strongly impact the growth of the population. Because E . .1is an increasing function of𝛽 , the ecological e uilibrium 𝜔(𝜇, ) = 1 is reached

at a carrying capacity set by the e uilibrium condition𝛽 = 𝛽 ⋅ ln 𝑅 where environmental stabilizing selection and competition balance each other. Inserting our fitness function E . .1in E . .2, . , .4and adding stochas-ticity leads to 𝑁, ∼ Pois (𝑁, 𝑅 𝑒 ( ,) ⋅ 𝑒 ∑ ( ( , , ) ,)/ ) . 𝜇, = 𝜇, + ℎ ⋅ 𝑉, (2𝛾(𝜃 − 𝜇, ) + Ω) + 𝜂, . 𝑉, = (1 − 1 2ℎ )𝑉, + 2𝑁, 𝜈𝑉 1 + 4𝑁, 𝜈 ⋅ ℎ +1 2ℎ ⋅ 𝑉, [2𝛾(−1 + 2𝛾𝜃 ) .1 + 𝜕Ω 𝜕𝜇, − (2𝛾(2𝜃 − 𝜇, ) + Ω) ⋅ (2𝛾𝜇, − Ω)]

whereΩ = ∑ (𝜇, − 𝜇, )𝑒 ( , ,) 𝑁, and Pois(⋅) denotes the

zero-truncated Poisson distribution. E . .2- .4advances Nuismer and Harmon

14 ’s model by relaxing the simplification of a linear species interaction, as in Drury et al. 44’s model. The nonlinear pairwise competitive repulsion, 𝛼(𝜇, − 𝜇, )𝑒 ( , ,) as a function of the trait difference𝜇, − 𝜇, , is

il-lustrated in Fig. .1for several values of𝛼. Moreover, in contrast to Drury’s models Drury et al.4 ,44 it takes into account the abundance of competi-tors in the community as a weight in the competition kernel, and includes the dynamics of trait variance. Furthermore, the full derivation of our model is based on a coherent definition of fitness.

(26)

3

Supporting Information

Model description and analysis

e considered the coevolution of uantitative trait and abundance under stabi-lizing selection to an environmental optimum and interspecific competition among 𝑛 species in a community. These species are assumed to belong to a single trophic level such that they are competing for one type of natural resource. The distribution of natural resources or environmental factor is assumed to be an unimodal distri-bution for simplicity. Thus the trait optimum that is favored by the environment is fixed. e begin by defining an individual-based fitness function associated with population growth and for a given trait value. e then derive a model describing the dynamics of trait evolution and population dynamics.

The fitness of an individual with trait 𝑧 is related to the change of abundance and fre uency of the trait

𝑁 𝑓 (𝑧, 𝜇 + Δ𝜇, 𝑉 + Δ𝑉 ) = 𝑁 𝑓 (𝑧, 𝜇, 𝑉 )𝜔(𝑧) .11 where 𝑁 denotes the population size of the focal species at the 𝑡-th generation and𝑓 (𝑧) is the probability densities of the trait at the 𝑡-th generation. e assume that the trait distribution follows a normal distribution but with a small change in mean𝜇 and variance 𝑉 in the next generation. Thus, we define the distribution at the𝑡-th generation as follows

𝑓 (𝑧, 𝜇, 𝑉 ) = 1 √2𝜋𝑉 𝑒

( )

.12 and at the𝑡 + 1-th generation with the slight difference in mean Δ𝜇 and variance Δ𝑉

𝑓 (𝑧, 𝜇 + Δ𝜇, 𝑉 + Δ𝑉 ) = 1

√2𝜋(𝑉 + Δ𝑉 )𝑒

( )

.1 The increase of the population size at the𝑡+1th generation 𝑁 thus can be simply expressed in terms of the mean fitness of the population Lande1 4 :

𝑁 = 𝑁 ∫ 𝑓 (𝑧, 𝜇, 𝑉 )𝜔(𝑧)𝑑𝑧 = 𝑁 ̄𝜔 .14 where the integration 𝜔 = ∫ 𝑓 (𝑧, 𝜇, 𝑉 )𝜔(𝑧)𝑑𝑧 measures the mean fitness of thē population at the𝑡-th generation.

Substituting e . .14 into e . .11 yields the following dynamics of the fre-uency of trait𝑧:

𝑓 (𝑧, 𝜇 + Δ𝜇, 𝑉 + Δ𝑉 ) = 𝑓 (𝑧, 𝜇, 𝑉 )𝜔(𝑧)

̄

𝜔 . .1

Expanding𝜔(𝑧) around the mean trait 𝜇 up to the first order gives

(27)

3

Figure . : Illustration of the model derivation. The orange curve denotes the fitness landscape with a wide and flat shape due to the assumption of weak selection. The dashed curves are the trait distribution of species in which the blue dashed curve is for the focal species with the density function ( ) and the mean trait and the variance , . The trait evolution strategy of species depends on the attraction of

the stabilizing selection and the interspecific competition. The orange arrow accounts for the attraction toward the optimum favored by the environment while the dashed gray arrows are the competition forces from the competitors.

The mean fitness𝜔 can then be approximated by see Fig.̄ .

̄

𝜔 = ∫ 𝜔(𝑧)𝑓 (𝑧)𝑑𝑧

≈ ∫ (𝜔(𝜇) + 𝜔 (𝜇)(𝑧 − 𝜇)) 𝑓 (𝑧)𝑑𝑧

≈ 𝜔(𝜇) .1

Applying Taylor expansion to the left hand side of .1 with respectΔ𝜇, Δ𝑉 yields:

Δ𝜇 = 𝑉 𝜔 (𝜇)

̄

𝜔 .1

Δ𝑉 = 𝑉 𝜔 (𝜇)

𝜔(𝜇). .1

Considering heritabilityℎ = where𝑉 is the genetic variance, the change of the trait meanΔ𝜇 must be multiplied by ℎ , which yields:

Δ𝜇 = ℎ 𝑉 𝜔 (𝜇)

𝜔(𝜇). .2

The phenotypic variance results from two factors, i.e. genetic variance 𝑉 and environmental sources 𝑉 . The genetic variancechanges due to selection, which is formulated in the above section e . .1 , due to reproduction, and due to mutation.

(28)

3

Change in variance due to selection At the𝑡-th generation, according to the dynamics E . .1 after selection the variance changes to

𝑉, ,after selection= 𝑉, ,before selection+ 𝑉, ,before selection

𝜔 (𝜇)

𝜔(𝜇) .21

= 𝑉, ,before selection⋅ (1 +

𝜔 (𝜇)

𝜔(𝜇)𝑉, ,before selection) .22

where𝑉, ,before selectionis just𝑉, . Thus, the genetic variance𝑉, ,after selectionyields:

𝑉, ,after selection= ℎ 𝑉, ,before selection⋅ (1 + 𝜔 (𝜇)

𝜔(𝜇) 𝑉, ,before selection). .2

Change in variance due to sexual reproduction e can compute the trait distribution of offspring with trait value𝑧 according to its parents’ traits 𝑧 − , 𝑧 + that independently follow the trait distribution of the previous generation, where𝛿 is the deviation of the parents from the offspring’s trait𝑧. For clarity, we define the trait distribution by

𝐻(𝑧, 𝜇, 𝑉) ∼ 𝒩(𝜇, 𝑉)

where 𝒩(𝜇, 𝑉) is a normal distribution with mean 𝜇 and variance 𝑉. The density function of the offspring’s trait is therefore

𝑓(𝑧) = ∫ 𝐻(𝑧 −𝛿 2, 𝜇, 𝑉) ⋅ 𝐻(𝑧 + 𝛿 2, 𝜇, 𝑉)𝑑𝛿 = 𝐻(𝑧, 𝜇,𝑉 2). .24

E . .24 tells us that the reproduction will lose half of the variance in the trait distribution of offspring.

Change in variance due to mutation Another factor that affects the trait dis-tribution is mutation. The change of variance by mutation is given by imura and Crow

𝑀 = 2𝑛𝜈𝑉,

1 + 4𝑛𝜈 .2

where𝜈 is the mutation rate and 𝑉, is the maximum additive genetic variance.

The expected change in the trait by mutation is , thus it has no contribution to the mean of the trait distribution.

Overall, due to reproduction and mutation, the additive genetic variance at the 𝑡 + 1th yields:

𝑉, = 1

2𝑉, ,after selection+

2𝑛𝜈𝑉,

(29)

3

Incorporating the environmental variance𝑉 yields:

𝑉, = 𝑉, + 𝑉 .2 = 1 2ℎ 𝑉, + 1 2ℎ 𝜔 (𝜇) 𝜔(𝜇)𝑉, + 2𝑛𝜈𝑉, 1 + 4𝑛𝜈 + (1 − ℎ )𝑉, .2 = (1 −1 2ℎ )𝑉, + 1 2ℎ 𝜔 (𝜇) 𝜔(𝜇) 𝑉, + 2𝑛𝜈𝑉, ⋅ ℎ 1 + 4𝑛𝜈 .2

where𝑉 = 𝑉, −𝑉, is the environmental variance which is assumed to be constant. In summary, the trait and population dynamics are governed by the following set of e uations: 𝜔(𝜇, ) = 𝑁, 𝑁, . 𝜇, = 𝜇, + ℎ 𝑉, 𝜔 (𝜇, ) 𝜔(𝜇, ) . 1 𝑉, = (1 − 1 2ℎ )𝑉, + 1 2ℎ 𝜔 (𝜇, ) 𝜔(𝜇, ) 𝑉, + 2𝑁, 𝜈𝑉, ⋅ ℎ 1 + 4𝑁, 𝜈 . 2 This novel model tells us that the growth of abundance for one specific species depends on the fitness of its average trait. The evolution of the average trait in response to selection is in the direction that increases the mean fitness in the pop-ulation Lande 1 4 . In the subse uent section, we build a model for the mean fitness function associated with stabilizing selection to an environmental optimum and competitive interactions.

e have derived a general trait and population dynamics model with dynamical variance at the species scale. To complete the model we have to specify the mean fitness function. e define the mean fitness function for species using the growth ratio of abundances in subse uent generations:

𝜔(𝜇, ) =

𝑁, 𝑁,

= 𝑅𝑒 / .

where 𝜇, denotes the trait mean of species 𝑖 at the 𝑡-th generation. The growth

factor𝑅 depends on the trait value and the parameter 𝜃 that can be interpreted as the optimum trait favored by abiotic stabilizing selection:

𝑅 = 𝑅 𝑒 ( ,) . . 4

where 𝑅 is the optimal growth factor. Here, 𝛾 determines the strength of stabi-lizing selection towards the optimum. In E . . ,𝛽 uantifies the intensity of competition. Assuming a Gaussion competition kernel, we define𝛽 as

𝛽 = ∑(𝑒 ( , ,) 𝑁

(30)

3

E . . tells us that the strength of competition between two species depends on the similarity in their mean traits weighted by their abundances. Thus, we have fitness function in the𝑡-th generation

𝜔(𝜇, ) = 𝑅 𝑒 ( ,) ⋅ 𝑒 . .

The parameter𝛽 in E . . and in E . . is a coefficient that can be interpreted as the carrying capacity of the community.

Finally, we incorporate demographic stochasticity in the population dynamics by drawing species abundances from a zero-truncated Poisson distribution de-noted by Pois(⋅) with the mean determined by the population dynamics. The zero-truncation ensures that the population cannot go extinct when present in the phylogeny. Only when extinction occurs in the phylogeny, we remove the species.

Inserting this fitness function in model E . . - . 2 and adding demographic stochasticity𝜂, yields 𝑁, ∼ Pois (𝑁, 𝑅 𝑒 ( ,) ⋅ 𝑒 / ) . 𝜇, = 𝜇, + ℎ 𝑉, (2𝛾(𝜃 − 𝜇, ) + 2𝛼 𝛽 ∑(𝜇, − 𝜇, )𝑒 ( , ,) 𝑁 , ) + 𝜂, . 𝑉, = (1 − 1 2ℎ )𝑉, + 2𝑁, 𝜈𝑉, 1 + 4𝑁, 𝜈 ⋅ ℎ . +1 2ℎ 𝑉, [2𝛾(−1 + 2𝛾𝜃 ) − 𝛽 ∑ (4𝛼 (𝜇, − 𝜇, ) − 2𝛼) 𝑒 ( , ,) 𝑁 , − (2𝛾(2𝜃 − 𝜇, ) + 2𝛼𝛽 ∑(𝜇, − 𝜇, )𝑒 ( , ,) 𝑁, ) ⋅ (2𝛾𝜇, − 2𝛼𝛽 ∑(𝜇, − 𝜇, )𝑒 ( , ,) 𝑁, )] .

The demographic stochasticity follows a normal distribution with mean and a population-related variance. It is defined as

𝜂, ∼ 𝑁(0, 1 2 𝑉, 𝑁, , )

where𝑉, is still the phenotypic variance,𝑁, , is the effective population size which for simplicity is assumed to be𝑁, in our case. This random variable describes the relationship between the variance and the population size. For large population size, the trait distribution will have small variance.

(31)

3

To answer the uestion whether population dynamics really matters in the trait evolution, we developed two additional models for comparison. One model is to simply remove the population dynamics and abundance weights in the competition kernel, yielding: 𝜇, = 𝜇, + ℎ 𝑉, (2𝛾(𝜃 − 𝜇, ) + 2𝛼 ∑(𝜇, − 𝜇, )𝑒 ( , ,) ) + 𝜂, .4 𝑉, = (1 − 1 2ℎ )𝑉, + 2𝑁, 𝜈𝑉 1 + 4𝑁, 𝜈 ⋅ ℎ + 1 2ℎ 𝑉, [2𝛾(−1 + 2𝛾𝜃 ) − ∑ (4𝛼 (𝜇, − 𝜇 , ) − 2𝛼) 𝑒 ( , ,) − (2𝛾(2𝜃 − 𝜇, ) + 2𝛼 ∑(𝜇, − 𝜇, )𝑒 ( , ,) ) ⋅ (2𝛾𝜇, − 2𝛼 ∑(𝜇, − 𝜇, )𝑒 ( , ,) )] . .41

The competition kernel of this model is similar to Drury et al. 44 ’s model in the sense that when the traits of two competitors are very different, repulsion will be small due to competition being avoided. But it differs from Drury et al. 44 ’s model in that when the traits of the two competitors are similar, Drury et al.’s model just assumes strong repulsion, whereas our model shows a weak directional repulsion which emerges as a result of modelling the population dynamics. Such a weak di-rectional repulsion for similar trait values is realistic because if the trait distributions of two populations largely overlap, there is no strong tendency to evolve either way.

e refer to this model as the Unweighted competition U C model.

The other model we consider weighs competition by the total metabolic rate and hence we refer to it as the Metabolism-weighted competition M C model. Here we assume that it is not so much the abundance that affects competitive strength, but the species’ energy consumption.The basal metabolic rate Brody 1 , Brody and Procter1 , leiber scales with body mass and body mass scales with body length in whales Lockyer111 , so the basal metabolic rate scales with body length as

𝐵, ,per capita= 𝐵 𝜇,/ . .42

The total metabolic rate of species𝑖 is therefore

(32)

3

Thus, the trait-metabolism coevolution model yields: 𝑁, ∼ Pois (𝑁, 𝑅 𝑒 ( ,) ⋅ 𝑒 / (∑ ( , , ) ,)) .44 𝜇, = 𝜇, + ℎ 𝑉, (2𝛾(𝜃 − 𝜇, ) + 2𝛼 𝛽 ∑(𝜇, − 𝜇, )𝑒 ( , ,) 𝐵 , ) + 𝜂, .4 𝑉, = (1 − 1 2ℎ )𝑉, + 2𝑁, 𝜈𝑉 1 + 4𝑁, 𝜈 ⋅ ℎ .4 + 1 2ℎ 𝑉, [2𝛾(−1 + 2𝛾𝜃 ) − 𝛽 ∑ (4𝛼 (𝜇, − 𝜇, ) − 2𝛼) 𝑒 ( , ,) 𝐵 , − (2𝛾(2𝜃 − 𝜇, ) + 2𝛼𝛽 ∑(𝜇, − 𝜇, )𝑒 ( , ,) 𝐵, ) ⋅ (2𝛾𝜇, − 2𝛼𝛽 ∑(𝜇, − 𝜇, )𝑒 ( , ,) 𝐵, )] .

e used an ABC-SMC algorithm to compare the three models on the baleen whales data. e estimated free variables, i.e. 𝛾, 𝛼, 𝜈, 𝑉 , 𝜃.

Parameter inference and model selection using the ABC-SMC

algo-rithm

The complexity of our model precludes analytical approaches to fit the model to data. Hence, we developed an inference framework using Approximate Bayesian Computation based on Se uential Monte Carlo ABC-SMC . In ABC-SMC, first intro-duced by Toni et al.2 , one starts with a large number of parameter sets sampled from the prior these are called particles in the terminology of the field , which are then used to simulate data. The similarities of the simulated data to the empirical data measured by one or more summary statistics are then used as weights to sample parameter sets in the next generation, with some random noise added to it. After a few iterations generations in the terminology of the field , the parameter sets will form the posterior distribution.

Specifically, the algorithm for parameter inference consists of the following steps for our model:

1. Set a particle indicator𝑗 = 1, 2, ⋯ , 𝑛, where 𝑛 is the total number of particles in each iteration for each candidate model, and a maximum𝑇 to the number of iterations.

2. Initialize the algorithm with parameter sets 𝜃( ), . The parameters are sampled independently from a prior distribution𝜋 (𝜃) of model 𝑘. e used

Referenties

GERELATEERDE DOCUMENTEN

F inally it is pointed out th a t the general theory offered should be con­ sidered no t as a prognostic instrum ent, bu t exclusively as a functional

Counterexamples were found (Hemker et al., 1996) for the models from the divide-by-total class in which c~ij varied over items or item steps or both, and for all models from the

Then the theory of the extremal types theorem will be used to determine extreme value distribution models of large data sets3. We use both exact known distributions to compute

In this thesis, I explored how biological interactions on both species and individual scales influence the pattern of biodiversity, morphological traits, abundance structure

Understanding the effect of competition during evolutionary radiations : an integrated model of phenotypic and species diversification.. Competition, seed predation, and

In dit hoofdstuk hebben we een algemeen, coherent, soorteigenschappen evolutie- framework ontwikkelt waarbij de fitnessfunctie is gebaseerd op een model van po- pulatiedynamica,

This also applies to the big family of the Theoretical Research in Evolutionary Life Sciences TR S group... You helped me a lot in the first year of

Large population size can act as a large fist in species duels of the same trophic level (Chapter 3) but also serves as a feast to natural enemies (Chapter 4). Population dynamics