• No results found

University of Groningen Divergence and adaptive capacity of marine keystone species Fietz, Katharina

N/A
N/A
Protected

Academic year: 2021

Share "University of Groningen Divergence and adaptive capacity of marine keystone species Fietz, Katharina"

Copied!
50
0
0

Bezig met laden.... (Bekijk nu de volledige tekst)

Hele tekst

(1)

University of Groningen

Divergence and adaptive capacity of marine keystone species Fietz, Katharina

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version

Publisher's PDF, also known as Version of record

Publication date: 2017

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):

Fietz, K. (2017). Divergence and adaptive capacity of marine keystone species. Rijksuniversiteit Groningen.

Copyright

Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

(2)

Chapter 4

Mind the Gut: Genomic insights to population divergence

and gut microbial composition of two marine keystone

species

(3)

Mind the Gut:

Genomic insights to population divergence and gut microbial composition of two marine keystone species

Katharina Fietz1,2*, Christian Olaf Rye Hintze1, Mikkel Skovrind1, Tue Kjærgaard Nielsen3, Morten T. Limborg1, Marcus A. Krag1, Per J. Palsbøll2, Lars Hestbjerg Hansen3, Peter Rask Møller1, M. Thomas P. Gilbert1,4*

1

Natural History Museum of Denmark, University of Copenhagen, Section for Evolutionary Genomics, Øster Voldgade 5-7, 1350 Copenhagen Denmark

2

Marine Evolution and Conservation, Faculty of Science and Engineering, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands

3

Department of Environmental Science, Environmental Microbial Genomics Group, Aarhus University, Frederiksborgvej 399, 4000 Roskilde, Denmark

4

NTNU University Museum, 7491 Trondheim, Norway

Keywords sand lance, population genomics, holobiome, microbiome, Baltic Sea, local adaptive potential

Corresponding authors*

Katharina Fietz, katharina.fietz@snm.ku.dk, Natural History Museum of Denmark, University of Copenhagen, Section for Evolutionary Genomics, Øster Voldgade 5-7, 1350 Copenhagen, Denmark

M. Thomas P. Gilbert, tgilbert@snm.ku.dk, Natural History Museum of Denmark, University of Copenhagen, Section for Evolutionary Genomics, Øster Voldgade 5-7, 1350 Copenhagen, Denmark

(4)

Abstract

Deciphering the mechanisms governing population genetic divergence and local adaptation across heterogeneous environments is a central theme in marine ecology and conservation. While population divergence and ecological adaptive potential are classically viewed at the genetic level, it has been argued that their microbiomes may also contribute to such processes. We explored whether this might be plausible along the well-described environmental gradient of the Baltic Sea, using two sand lance species (Ammodytes tobianus and Hyperoplus lanceolatus). Specifically, we characterized both their population genomic and gut microbiomic composition variation, and investigated not only which environmental parameters correlate with the observed variation, but whether host genome also correlates with microbiome variation. We find clear genetic structure separating the high salinity North Sea and low salinity Baltic Sea sand lances. The observed genetic divergence seems to be not simply caused by isolation by distance, but rather correlates with environmental parameters that include salinity, sea surface temperature, and in the case of A. tobianus possibly even water bacteria. Furthermore, Baltic A. tobianus exist as two distinct genetic stocks that we hypothesize indicate sympatric spawning types. We additionally find evidence in support of the hypothesis that host genetics may play a role in regulating the gut microbiome at not only the inter-specific, but also intraspecific level. As sequencing costs continue to drop, we anticipate that future studies that include full genome and microbiome sequencing will be able to explore the full relationship and its potential adaptive implications for these species.

Introduction

A major current focus within marine ecology and conservation is to improve our understanding of the mechanisms governing population genetic divergence and local adaptation across heterogeneous environments. Although gene flow may hamper local adaptation, genetic outlier loci across environmental gradients in several marine fishes imply possible local adaptation despite low overall levels of population genetic divergence (Limborg et al. 2012; Berg et al. 2015; Guo et al. 2016). In a time in which the planet’s oceans are expected to undergo considerable changes in oxygen, temperature and salinity levels (IPCC 2014), leading to extensive changes in the conditions of available habitats, an organism’s ability to adapt swiftly to these changes will be vital for its survival. It is therefore particularly important to understand which processes drive the genetic divergence that may be at the basis of ecological adaptation, yet our understanding of them is only in its infancy. Studying the genetic basis of ecological adaptation in natural populations is particularly difficult when population sizes are limited, because the signals of natural selection and genetic drift can be confounding. In the marine realm

(5)

however, there are many ideal systems, for example numerous fish species whose populations exhibit extremely large numbers and span different ecological conditions (Nielsen et al. 2009a; Martinez Barrio et al. 2016). Thus a number of studies have begun to study their adaptive genetic variation as a response to environmental change. However, while this growing body of research correlates outlier loci signatures with key environmental parameters such as salinity and water temperature (Limborg et al. 2012; Berg et al. 2015; Vilas et al. 2015; Martinez Barrio et al. 2016)), very few studies have gone beyond these standard parameters.

Although the debate about adaptation to different environments is classically viewed at the genomic level, it has recently been argued that an organisms’ associated microbial component might also play a role (Alberdi et al. 2016). This argument is nested within the holobiome concept, which views an organism as an entity that encompasses its own as well as the sum of its microbial symbionts’ genetic information (Rosenberg & Zilber-Rosenberg 2011; Bordenstein & Theis 2015). With regards to adaptation, Alberdi et al. argued that, given (i) gut microbiome communities can have significant and rapid phenotypic effects on their hosts and (ii) the relatively short time-frame of many environmental changes, microbiome community changes may provide an important mechanism for adaptation. While challenging to study directly, first steps in this direction can come from characterizing the natural variation in host species’ microbial communities in light of genomic and environmental divergence. While a number of studies have begun to consider host genomic-microbiomic relationships, and even environmental-microbiomic relationships, few have attempted to take a full hologenomic approach (both genomic and microbiomic) across environmental variation.

We explored the potential of such an approach by characterising both population genomic and gut microbiomic variation in two sand lance species along the well-described environmental gradient of the Baltic Sea. The Baltic seascape is characterized by a steep aquatic gradient; it is a semi-enclosed brackish water basin in Northern Europe with a gradient in environmental conditions ranging from nearly limnetic to almost fully marine. Its inhabitants must therefore cope with a wide range of environmental conditions. Fish species with low levels of genetic divergence at presumed neutral markers, such as herring (Lamichhaney et al. 2012; Limborg et al. 2012; Guo et al. 2016), cod (Nielsen et al. 2009b; Berg et al. 2015) and three-spined stickleback (Guo et al. 2015), have been shown to exhibit substantial divergence at single nucleotide polymorphisms (SNPs) inferred to be associated with genome regions under selection. The divergence at these outlying SNPs is likely the result of the subsequent colonization of the Baltic Sea by marine species after its formation as a marine habitat ca. 8,000 BP and of resulting specific adaptations to this environment (Emeis et al. 2003; Ojaveer &

(6)

Kalejs 2005; Bjorck 2008), although alternative scenarios such as a secondary contact zone cannot be ruled out (Gaggiotti et al. 2009).

Five species of sand lances (family Ammodytidae) occur in the Northeast Atlantic and adjacent waters. These highly abundant fish are known to be closely associated with specific soft substrates (Nelson et al. 2016) and are characterized by their resident nature and small dispersal ranges (Gauld 1990; Proctor et al. 1998; Christensen et al. 2009). Such characteristics may restrict long-distance gene flow besides the absence of physical barriers. As a food source for many marine birds, mammals, and other fish species, they are keystone organisms in their environment (Rindorf et al. 2000; Furness 2002; Frederiksen et al. 2006). Furthermore, sand lances are also a major component of commercial fisheries and thus have considerable economic value (Anderson et al. 2012). Although these characteristics render them both relevant to study in the context of marine management and as an interesting potential model organism for studying local adaptation, previous research has principally focused on defining sand lance species and populations using few markers (Kim et al. 2010; Warner 2011; Han et al. 2012; Orr et al. 2015; Thiel & Knebelsberger 2016).

We generated genome-wide SNP data and gut microbiome taxonomic composition data for two ecologically and economically important sand lance species, Ammodytes tobianus and Hyperoplus lanceolatus, along the North Sea - Baltic Sea gradient. We used this data to firstly explore their levels of population differentiation across this gradient. Secondly, we tested whether the observed population genomic structure correlates with environmental factors, including not only salinity and sea surface temperature (SST), but also the relative bacterial composition of the water at sampling sites. Thirdly, we characterized each species’ gut microbiome, its inter-specific variation, and its relationship with both environmental as well as host genomic parameters. Lastly, in light of our findings, we discuss the future potential of how a hologenomic approach such as that taken here, may help improve our understanding of marine species’ ecological adaptive potential.

Material and Methods

Sample collection

Sand lances were collected at multiple sites across the environmental gradient from the brackish Inner Baltic Sea to the marine North Sea. Samples from A. tobianus and H. lanceolatus were collected during May through September 2015 from 12 and 15 different sites, respectively (Fig.

(7)

1, Table 1). Individual fishes were collected during commercial and research trawls, or caught with near-shore seine nets. Individual muscle tissue samples were collected upon capture (seine nets) or 1 – 6 hours (commercial trawls) post-mortem. If no direct sampling was possible, individuals were stored in 96% ethanol directly upon capture and stored at -20 C° until sub-sampling. Guts were collected under sterile conditions directly upon capture from a subset of individuals. The samples collected for the gut microbiome analysis were collected from the frontal gut located directly behind the stomach and ending in a tight loop in the gut (hereafter referred to as gut). The external part of the gut was cleaned with sterile equipment to remove any tissue, and the gut wall including contents was used as a sample. Gut samples were stored in 96% ethanol at -20 C° until further processing.

Environmental data

We collected data on salinity, SST and sea water bacterial taxonomic composition for correlation analyses between genetic data and environmental parameters (Table S1). Salinity and SST data were retrieved from www.smhi.se and www.ices.dk. We estimated the annual average salinity during the period 2010 to 2014, as well as the annual, minimum and maximum average SST in the same period. Data of the relative abundance of six major bacterial taxa in the water near our sampling sites were obtained from Hu et al. (2016). Our rationale for using these water data from 2013 was that due to the long residence time of water in the Baltic basin (3 – 30 years (Reissmann et al. 2009)) and identical sampling season, those water samples would be broadly representative of the water at sampling sites in our study.

Genotyping-By-Sequencing (GBS)

DNA extraction and sequencing library preparation

Whole genomic DNA was extracted using the KingFisher™ Duo Prime Purification System (Thermo Fisher Scientific Inc., Waltham, USA) following the manufacturer’s protocol for the cell and tissue DNA purification kit. DNA quantity was measured on a QubitTM 2.0 fluorometer (Thermo Fisher Scientific Inc., Waltham, USA). The size range of DNA extractions was measured in a subset of DNA extractions using an Agilent 4200 TapeStation. DNA extractions were then cleaned using the ZR-96 Genomic DNA Clean & ConcentratorTM following the manufacturer’s protocol (Zymo Research, Orange, CA, USA). Population genomic data were then generated using the GBS approach developed by Elshire et al. (2011) using the Cornell University Institute of Biotechnology commercial service following their standard pipeline

(8)

(Elshire et al. 2011). The six base cutter restriction enzyme EcoT221 was used for GBS library generation, producing fragments 200 – 380 base pair (bp) in size.

GBS sequencing and SNP calling

GBS libraries were sequenced by the Cornell service as single-end 64bp sequencing on an Illumina HiSeq2000™ platform (Illumina Inc., San Diego, US). Subsequent analytical steps were conducted separately for each fish species. Details of data filtration and SNP calling can be found in the supplementary material.

Population genomic analyses

We employed the AMOVA (Excoffier et al. 1992) implemented in GENODIVE v.2.0 to estimate overall and pairwise levels of genetic divergence (FST). We assumed an Infinite Alleles Model

(Kimura & Crow 1964) using 999 permutations. To further investigate population genomic structuring, we conducted a Principal Component Analysis (PCA) using the SMARTPCA program in the EIGENSOFT package (Patterson et al. 2006). The datasets were reduced to ten eigenvectors and principal components 1 and 2 as well as 1 and 3 were plotted using the Perl script PLOTEIG (EIGENSOFT package). In order to infer ancestry among sand lances in various areas, we used a model-based approach implemented in the software ADMIXTURE v.1.3.0 (Alexander et al. 2009). ADMIXTURE analyses were performed for values of K from 2 to 14 clusters. Convergence was assumed when the log-likelihood difference between iterations was below 10-4. We used the 5-fold cross-validation method to select the most suitable number of genetic clusters (K) (Alexander & Lange 2011). The ADMIXTURE analysis was undertaken with and without filtering out loci that deviated significantly from HWE in >75% of sampling sites.

Detection of outlier loci

We applied three different approaches based on Bayesian algorithms to investigate if loci deviated from the neutral model and if we could detect correlations with environmental parameters. We used an FST-based approach implemented in BAYESCAN v.2.1 (Foll & Gaggiotti

2008) to identify outlier loci. In order to test for associations between population genetic divergence and environmental parameters, we employed two additional approaches, implemented in BAYESCENV v.1.1 (de Villemereuil & Gaggiotti 2015) and BAYENV V.2 (Gunther & Coop 2013). Details of the estimation parameters are listed in the supplementary material. We further created allele frequency plots for all outlier loci to investigate the potential presence of geographic clinal patterns.

(9)

Microbial 16S profiling

DNA extraction and purification

We extracted genomic DNA from the gut using the MoBio Power Soil kitTM (MoBio Laboratories Inc., Carlsbad, USA) following the manufacturer’s instructions. The extracted DNA was resolved in a final volume of 100µ L DNA. DNA concentrations were quantified using a QubitTM 2.0 Fluorometer (Thermo Fisher Scientific Inc., Waltham, USA), and diluted to 10ng/µ L for amplification if necessary.

Library preparation and amplicon sequencing

We employed a two-step PCR amplification approach for library preparation. The universal

primer set 341F (5’-CCTAYGGGRBGCASCAG-3) and 806R

(5’-GGACTACNNGGGTATCTAAT-3) was used to amplify the V3-V4-regions of the bacterial 16S rRNA gene (Hansen et al. 2012). A subsequent PCR amplification was performed using the Nextera™ XT index primers (Illumina Inc., San Diego, US) to attach Illumina MiSeq™ sequence adaptors and barcodes. Full details of 16S library preparation and sequencing are listed in the supplementary material. In the following, the term microbiome refers to these bacterial 16S data.

Data filtering and OTU clustering

We performed data filtering and clustering in USEARCH v.8.1.1861 (Edgar 2010). Raw sequence reads were de-multiplexed according to barcode sequences. Details of data filtration and SNP calling can be found in the supplementary material.

Gut microbiome taxonomic composition

We employed QIIME (Quantitative Insights into Microbial Ecology v.1.8.0 (Caporaso et al. 2010)) to estimate the alpha diversity within the sampling sites as the Shannon-Wiener (Shannon 1948) and Chao1 indices (Chao & Shen 2003). We employed an ANOVA to assess the differences in OTU frequencies among sampling sites, and corrected the results for multiple comparisons using a step-down resampling algorithm as in Westfall and Young (1993). We report all bacteria with P < 0.05 at the lowest possible taxonomic level (genus being the lowest level). We further investigated patterns of beta diversity among sampling sites. Dissimilarities in microbiome composition among sites were quantified with a Principal Coordinate Analysis

(10)

(PCoA) using a weighted unifrac distance matrix that takes into account OTU abundance as well as phylogenetic information (Lozupone et al. 2011). We tested for significant differences between sampling sites at family level using a non-parametric Kruskal-Wallis H-test (Kruskal & Wallis 1952), and applied the Benjamini-Hochberg False Discovery Rate correction to adjust for multiple testing (Benjamini & Hochberg 1995).

Predictors of gut microbiome composition

In order to determine which factors correlate with changes in gut microbiome composition, we implemented various approaches in the vegan (Oksanen et al. 2016) and phyloseq (McMurdie & Holmes 2013) packages in R (R Development Core Team 2015). In the case of A. tobianus, in addition to salinity and SST, we also included the relative abundances of major bacterial taxa found in the Baltic water as independent parameters.

We employed a permutational multivariate analysis of variance (Permanova, adonis function) using a distance matrix to identify each parameter that has a significant influence on our dataset variation. Results were then plotted by non-metric multidimensional scaling (NMDS) analysis. A biologically plausible combination of significant environmental parameters based on knowledge of the biological system was fitted onto our unconstrained NMDS ordination.

We then used a constrained analysis of principal coordinates (CAP) ordination implemented in the phyloseq package (McMurdie & Holmes 2013) in R (R Development Core Team 2015) to identify which combination of independent parameters explains the largest proportion of variation in gut microbiome composition. To this end, we included those environmental parameters that proved significant in the Permanova described above. The order of parameter inclusion in the model was chosen so that the parameter explaining most variation when tested individually was included first. In both NMDS and CAP ordination, collinearity among variables was accounted for by excluding any variable showing collinearity with variables already included in the model.

Impact of environment versus host genetics on the gut microbiome composition

We used an interspecific dataset including gut microbiome data from A. tobianus, H. lanceolatus, and a number of other Baltic fish species sampled in Køge Bugt during 2016 (listed in Fig. S5). The other species served as references to contrast the influence of environmental factors with the influence of host genotype on gut microbiome composition. We tested influence of the host species identity with a Permanova and a subsequent CAP ordination with the phyloseq package (McMurdie & Holmes 2013) in R (R Development Core Team 2015).

(11)

In order to investigate the degree to which the sand lance gut microbiome reflects the microbiome composition of the surrounding environment, we compared the relative abundance of major bacterial taxa in our data to the relative abundance of those found in the Baltic Sea. We therefore used data collected along a 2000km transect in the Baltic Sea in 2013 and reported in Hu et al. (2016). We focused on six bacterial taxa at phylum and class level and report the relative abundances per sample of non-normalized reads: Alphaproteobacteria, Verrucomicrobia, Bacteroidetes, Betaproteobacteria, Actinobacteria, and Gammaproteobacteria. We used a Chi2 -test of homogeneity implemented in R (R Development Core Team 2015) to -test whether bacterial abundance changed significantly among sampling sites.

Results

Genotyping-By-Sequencing (GBS)

In order to decide on a minimal amount of accepted missing data per individual while including the maximum number of individuals in analyses, we investigated SNP coverage in our UNEAK-filtered dataset. For A. tobianus, no individuals were genotyped at more than 20% of the overall identified loci, while no H. lanceolatus were genotyped at more than 30% of loci (Fig. S1). We excluded all individuals that had more than 90% missing data in either species to keep filtering criteria consistent and to include the maximum possible number of individuals in analyses. Following data filtration, we were left with a final dataset consisting of 4,039 SNPs and a genotyping rate of 0.976 for A. tobianus (n = 286), and with 4,328 SNPs and a genotyping rate of 0.980 for H. lanceolatus (n = 163). All following analyses were conducted for both species, unless otherwise indicated.

Population genomic divergence

Overall estimates of genetic divergence (FST) were 0.012 (95% CI 0.011 – 0.013) for A. tobianus,

and 0.017 (95% CI 0.015 – 0.018) for H. lanceolatus. Pairwise FST values in A. tobianus ranged

from 0.001 to 0.041, the highest divergence was observed between the most brackish and the marine sampling sites. Divergence between the geographically intermediate sampling sites and the brackish and marine sampling sites, respectively, was intermediate, between 0.011 and 0.030 (Table S2a). A similar pattern emerged in H. lanceolatus, pairwise FST values ranging from

0.002 to 0.039. As in A. tobianus, genetic divergence was largest between the Inner Baltic and the North Sea sampling sites, while the geographically intermediate sampling site at Helsingør displayed intermediate FST values around 0.012 (Table S2b).

(12)

ADMIXTURE analyses revealed that A. tobianus most likely consists of three genetically distinct clusters and H. lanceolatus most likely of two (Figs. 1 and 2, Table S3). The signature remains consistent whether or not the data is filtered for SNPs deviating from HWE in at least 75% of sampling sites (to keep consistent with other studies that used the UNEAK pipeline, we continue with data that included the HWE filtering step). We find that A. tobianus from marine sampling sites (Texel to Horns Rev) belong to the same genetic cluster, while those in the Kattegat and Western Baltic (Læsø to Køge Bugt) are one third admixed with Baltic ancestry (K = 3 in Fig. 2). In the other Western Baltic and Inner Baltic locations (Køge Bugt to Bönan), A. tobianus consists of two genetically distinct Baltic Sea clusters. Individuals of either cluster occur together in Bornholm and Åland, while Køge Bugt harbors A. tobianus with mainly a North Sea or a Baltic Sea ancestry. Ammodytes tobianus in Bönan have pure Baltic ancestry.

Overall, H. lanceolatus show a similar pattern: individuals in the North Sea sampling sites (Texel to NW-Hanstholm) have pure marine ancestry, while individuals at the two Inner Baltic locations have pure Baltic Sea ancestry (K = 2 in Fig. 2). Helsingør in the Western Baltic harbors both H. lanceolatus with pure North Sea as well as pure Baltic Sea ancestry. These results were further supported by the PCA that also grouped A. tobianus into three clusters and H. lanceolatus into two (Fig. S2). In agreement with the pairwise FST results, geographically intermediate

sampling sites in A. tobianus include individuals that are genetically different from both North and Inner Baltic Sea (upper left cluster in Fig. S2, left plot), while the geographically intermediate sampling sites in the case of H. lanceolatus (Helsingør, Fig. S2, right plot) includes individuals with admixed genomes (i.e., equal ancestry from the Baltic Sea and North Sea clusters). In the following, we will refer to the different clusters as ‘populations’. This adheres to Waples & Gaggiotti’s evolutionary population paradigm under which a population is constituted by a group of conspecific individuals living in close enough proximity that random mating can potentially take place (Waples & Gaggiotti 2006).

Loci under putative local adaptation

Outlier analyses detected numerous SNPs where the variation correlated with environmental parameters. A total of 43 and 72 SNPs were identified as outliers by all three approaches in A. tobianus and H. lanceolatus, respectively (Fig. 3a, Table S4). Of these, the allele frequencies at 22 (A. tobianus) and 29 (H. lanceolatus) SNPs displayed a spatial allele frequency pattern along the environmental gradient, suggesting that these loci might be subject to selection by factors co-varying with the environment (Fig. 3b, Fig. S3). Most of these outlier SNPs in A. tobianus were correlated with the relative proportions of three major bacterial taxa found in the water (Actinobacteria, Alphaproteobacteria, and Gammaproteobacteria) as well as with the minimum and maximum SST (Table S4a). The allele frequencies at ten outlier SNPs correlated with the

(13)

change in salinity, while the allele frequency change in seven SNPs correlated significantly with the change in nearly all environmental parameters. In H. lanceolatus, 12 SNPs were correlated with all tested environmental parameters (salinity and SST) (Table S4b).

Microbial 16S profiling

Following data filtration and exclusion of samples with low read coverage, our final microbial dataset consisted of 31 A. tobianus gut microbiome samples from four sampling sites in which a total of 210 OTUs were detected. Nineteen H. lanceolatus gut microbiome samples from two sampling sites revealed 107 OTUs (Table 1). Read depth per sample ranged from 1,051 to 25,352 in A. tobianus, and from 1,348 to 12,492 in H. lanceolatus (Table S5). Both extraction and PCR negative control samples resulted in very low read coverage suggesting that contamination was negligible, and were not included in the final dataset.

Sand lance gut microbiome composition

At a high bacterial taxonomic level, the gut microbiome composition of A. tobianus and H. lanceolatus was similar and varied only slightly between fish species (Fig. 4a). Proteobacteria was the only phylum present in all individuals in both species. The phyla Tenericutes, Cyanobacteria, Firmicutes, Actinobacteria, Bacteroidetes and Spirochaetes were present in a large percentage of individuals in both species. Variation in the percentage of occurrence was largest for Verrucomicrobia which only occurred in 5% of H. lanceolatus, but in 45% of A. tobianus individuals.

Alpha gut-microbiome diversity was overall higher in A. tobianus (Chao1 = 8 – 101; Shannon = 2.89 – 5.63) than in H. lanceolatus (Chao1 = 5 – 35.6; Shannon = 2.21 – 4.96). Both the Shannon-Wiener index and the Chao1 index were higher in A. tobianus samples from brackish sampling sites than from marine sampling sites, while no trend was evident in H. lanceolatus (Fig. 4b). Relative abundance of four bacterial genuses changed significantly between sampling sites in A. tobianus, while relative abundance of seven bacterial genuses changed significantly between the two H. lanceolatus sampling sites (P < 0.05) (Fig. 4c). Among these, two genuses belong to the obligate phototrophic Synechococcus. It is worth noting that only two of these genuses overlap between sand lance species.

In the PCoA, the first principal coordinate axis explained 35.9% of variation in gut microbial communities between sampling sites for A. tobianus, while for H. lanceolatus the first axis explained 30.5% of variation (Fig. 5). In A. tobianus, the level of variation within sites was less than among sites with the exception of Faxe Bugt where three outliers group with individuals

(14)

from other sites. In H. lanceolatus, the level of variation within and between the two sites was similar.

Predictors of gut microbiome composition

We tested a range of environmental and host genetic factors to assess how well they explained the observed variance in sand lance gut microbiome composition. Our inclusion of water bacterial data led to exclusion of the sampling site Halsskov in A. tobianus due to lack of water bacterial data for this site. We defined the ancestry fraction Q of the North Sea group as identified by ADMIXTURE for the most likely K as the host genetic component for either species. We included this host genetic component as an ‘environmental parameters’ in our analyses. The Permanova identified ten and three environmental parameters that correlated significantly with the gut microbiome composition in A. tobianus and H. lanceolatus, respectively (P < 0.001) (Table S6). We visualized the data using a 2-dimensional NMDS ordination to assess possible multivariate interaction of the gut microbiome composition with the environmental parameters (Fig. S4). The main axis among which A. tobianus sampling sites were separated was characterized by differences in salinity, SST, and the relative abundances of four water bacterial taxa. In H. lanceolatus, the gut microbiome among sampling sites was not as clearly differentiated, the main axis being characterized by differences in SST, distance and date, describing the same variation in opposite directions (Fig. S4, Table S6).

We used CAP ordination to test for statistical significance in the multivariate interactions among environmental parameters and gut microbiome composition. In A. tobianus, the combination of parameters describing most variation in the gut microbiome composition included SST, geographic distance from the westernmost sampling site, and the host genetic component (CAP1 = 26.7%) (Fig. 6). In H. lanceolatus, the analysis resulted in a one-dimensional ordination due to the binary character of variables among only two sampling sites. Distance between sampling sites explained 19.8% of variation (not visualized).

Impact of environment versus host genetics on the gut microbiome composition

We conducted a CAP ordination in order to extend our understanding of the potential influence of host species factors on gut microbiome composition. This revealed that the host species identity is a key explanatory variable for gut microbiome composition (Permanova P < 0.001; PC1 = 6.5%, PC2 = 5.5%) (Fig. S5). We then displayed the gut microbiome composition of multiple species from three adjacent sampling sites to illustrate the different scales of similarity in gut microbiome composition among species and sampling sites (Fig. S6). We therefore chose A. tobianus samples from two sites 131km apart, and samples from an outgroup of fishes (sand

(15)

goby (Pomatoschistus minutus), flatfish (species unknown), stickleback (Gasterosteus aculeatus)) from a geographically intermediate sampling site (Køge Bugt). Our hypothesis was that gut microbiome composition would be more similar within than between host species, even if individuals of the same species were sampled at different sites. With the exception of individual P611035, the gut microbiome composition was visibly less variable among A. tobianus across sites than it was among A. tobianus and other fish species from sites that are geographically closer to each other.

Correlations between gut microbiome composition and relative abundance trends of bacterial taxa in the environment revealed significant changes in the relative abundances of some of the major bacterial taxa along the North Sea – Baltic Sea environmental gradient (Fig. 7, Table S7). While some bacterial taxa demonstrated identical abundance trends in gut and environment, others showed opposing relative abundance trends. Specifically, trends were identical in Gammaproteobacteria which became proportionately less dominant in both environmental (water) and gut samples as environmental salinity decreased (Fig. 7). In Alphaproteobacteria, relative abundance increased in guts and decreased in the environment, whereas the opposite trend was observed in Actinobacteria (Fig. 7).

Discussion

To our knowledge, this is the first study to investigate population genetic divergence of two non-model fish species that draws on information from both fish genomics and from host internal and external bacterial communities. Our findings have relevance for both the population structure of two commercial species, but also provide insights into potentially relevant genomic and microbiomic factors with regards to their adaptation across the North Sea – Baltic Sea environmental gradient. Furthermore, our findings provide evidence that sand lance gut microbial communities are influenced by both host genetics and environmental parameters.

Isolation of Baltic Sea sand lances

In accordance with population genomic patterns observed in other marine fishes, both sand lance species displayed genomic divergence between the Baltic Sea and North Sea. In both species we found pairwise genetic divergence to be highest between the North Sea and Inner Baltic Sea sampling sites. This is similar to neutral differentiation in Atlantic cod (Berg et al. 2015), herring (Guo et al. 2016), flounder (Florin & Hoglund 2008), and sprat (Limborg et al. 2009). The rate of change in divergence was highest between intermediate sampling sites; this is comparable to the divergences observed in several other Baltic Sea fish species and appears to reflect the

(16)

steeper gradient in salinity in this region (Fig. 2 in Limborg et al. (2009)). Sand lances in the North Sea tend to be resident and associated with specific habitat types (Gauld 1990). Consequently, most dispersal is likely during the larval phase but even this is thought to be limited between areas (Proctor et al. 1998; Christensen et al. 2009). Assuming that these characteristics are shared with Baltic Sea sand lances, they match the genetic divergence between these two Seas.

Ammodytes tobianus in the Baltic Sea consists of two genetically differentiated populations

Ammodytes tobianus in the Baltic Sea – mainly between Køge Bugt and Åland in the Western to Inner Baltic Sea – seems to belong to two genetically differentiated populations. We suggest that this observation may be attributed to either spatial or temporal segregation. Spatial segregation would occur if different breeding stocks left their spawning areas and mixed at sampling sites outside the spawning season. As we did not sample during the spawning season, we cannot exclude this option. We find it unlikely, however, given the sand lances’ residential behavior and limited dispersal capacity (Proctor et al. 1998; Christensen et al. 2009). Another possible cause for the observed genetic pattern may be temporal segregation. This would occur if different spawning types would co-occur in the same spawning habitat, but be reproductively isolated in time (Jorgensen et al. 2005). Populations of A. tobianus may indeed consist of two distinct, but often sympatric spawning types, an autumn and a spring spawning contingent (Kandler 1941) that are known to occur together e.g. off the coast of West Ireland (Oconnell & Fives 1995). These spawning types generally differ inter alia in the mean number of vertebrates which is higher in the autumn group (Whitehead et al. 1986). As early as 1934, it was suggested that A. tobianus in the Baltic Sea consists of two spawning types (Bahr 1934). More recently, Wiecaszek et al. (2007) investigated A. tobianus in the Southern Baltic Sea in the Gulf of Gdansk and based on vertebral counts assigned the individuals of this area to the autumn-spawning component of the stock. While we did not find significant differences in vertebral counts in our fishes that would support a hypothesis of different spawning types (data not shown), we hypothesize that the two genetic A. tobianus Baltic populations that we detect here may nonetheless represent two sympatric spawning types. In order to gain certainty about co-occurrence of different spawning types in the Baltic Sea, future studies will have to sample individuals during the respective spawning season (autumn and spring) and investigate the presence of ripe gonads in adult females. The presence of different spawning types in the same habitat is also known from other species such as herring (Jorgensen et al. 2005; Lamichhaney et al. 2012). Our results highlight the power of using genetic markers as a tool to monitor the relative proportion of the two spawning types in fishery landings.

(17)

Sand lances along the North Sea - Baltic Sea environmental gradient display signatures consistent with local adaptation

As a geologically young sea that has seen extreme changes in environmental conditions in the last 8,000 years, the Baltic Sea provides a popular model in which to study marine organisms’ genetic adaptations to their surroundings. Our study supports previous work hypothesizing that marine fish populations show signatures of potential local adaptation at much smaller spatial scales than might be expected if considering their high dispersal potential compared to terrestrial and freshwater species (Corander et al. 2013; Berg et al. 2015). Given the relatively small number of SNPs, the lack of a reference genome, and the fact that we aimed to exercise great care to avoid over-interpreting our results, we refrained from blasting outlier loci and chose to instead focus on drawing conclusions from the observed correlations and allele frequency plots. In both A. tobianus and H. lanceolatus we find elevated levels of genetic divergence at outlier loci that correlate with SST and salinity (we discuss the relationship with water bacteria in a following section). Being osmoregulators that maintain a stable internal salinity of around 9 PSU (Kultz 2015), fishes have evolved physiologies that fundamentally differ in high and low saline waters. Divergent selection, especially at loci that are associated with salinity tolerance, may therefore be expected to promote reproductive isolation between Baltic Sea and North Sea populations. A study on Atlantic cod (Berg et al. 2015) combined outlier analyses with a landscape genomic approach to show that various regions across the cod genome show signatures of genetic divergence that are correlated with salinity. Berg et al. argue that these loci are located within an osmoregulatory framework of genes that are involved with survival of eggs and larvae in low-saline environments. Besides adaptations that are associated with increased offspring survival, studies have also investigated the genetic baseline underlying physiological traits. Kultz et al. (2013) used acclimation experiments and molecular phenotyping to reveal that salinity stress seems to have an influence on gill design of a euryhaline cichlid fish. We hypothesize that similar mechanisms related to physiology, egg and larval survival in low-saline waters cause genetic divergence between Baltic and North Sea sand lances. Next to salinity, water temperature plays a similarly pivotal role for aquatic organisms. Ambient temperature is important in affecting metabolic reactions especially of poikilotherm organisms such as fishes. Water temperature has been shown to correlate with patterns of adaptive evolution of Atlantic cod on either side of the Atlantic Ocean (Bradbury et al. 2010). The fact that the degree of genetic divergence in many of the sand lance outlier loci was significantly associated with salinity and water temperature led us to the hypothesis that environmental heterogeneity in these, or correlated, factors may be an important driving force behind the genetic divergence between the Baltic and North Sea sand lance populations.

(18)

Gut microbiome composition and diversity

We incorporated gut microbiome data from a subset of each sand lance species in order to explore associations between host genomics and environment on gut microbiome composition. At a general level, the sand lance gut microbial community composition is comparable to that of other fish. As observed in other fish species, members of the Proteobacteria were by far the most common phylum in both sand lance species. Among other common microbial phyla were Actinobacteria, Tenericutes and Cyanobacteria. Proteobacteria are known to be one of the major phyla in the guts of many other studied fish species (as are Actinobacteria and Tenericutes), and are found in high abundance in fish guts (Givens et al. 2015; Sullam et al. 2015; Llewellyn et al. 2016). Cyanobacteria were also detected at high abundances in catfishes (Di Maiuta et al. 2013). The observed gut microbial composition is consistent with taxa that aid in nutrient absorption and maintaining homeostasis (Colston & Jackson 2016). For example, Proteobacteria help break down and ferment complex sugars, Actinobacteria have been shown to also aid in maintaining host homeostasis, as well as in inhibition of Gram-negative pathogens and lactic acid fermentation, and Tenericutes are known to help nutrient processing (Colston & Jackson 2016). Microbial diversity estimates of both sand lance species in our study varied substantially between individuals, and were overall somewhat higher in A. tobianus than in H. lanceolatus. Chao1 diversity estimates were substantially lower in our sand lances than in zebrafishes (Roeselers et al. 2011), while they overlapped in part with the gut microbiome richness of coral reef surgeon fishes (Miyake et al. 2015). Shannon diversity estimates on the other hand were higher in both sand lance species than in zebrafishes or sturgeon fishes. While gut microbiome diversity is likely correlated with diet (Ley et al. 2008; Bolnick et al. 2014) – including some bacterial uptake from environmental water during feeding – we did not include dietary analyses in our study. The diet – gut microbial relationship is not straight-forward: while some authors argue that a more diverse diet will increase gut microbial diversity (Laparra & Sanz 2010), others found that the opposite is true in some fishes (Bolnick et al. 2014). What we do understand from this is the high complexity of these communities that are impacted by multiple environmental variables. We conclude that the gut microbial diversity increases with decreasing salinity in A. tobianus. Inclusion of diet analysis and more spatio-temporal replicates in future studies will help us better understand the processes shaping this pattern.

Microbial communities of the sand lance gut are not mere reflections of environmental microbial communities

In organisms ranging from ants (Sanders et al. 2014) over fishes (Li et al. 2014) through humans (Goodrich et al. 2014; Blekhman et al. 2015), it has been shown that the composition of the core

(19)

microbes of the gut is not a mere reflection of the environment, but that it correlates to some degree with host genetic factors. Whether this composition is more reflective of host phylogenetic history or of host ecology is likely influenced by the mode of microbiome acquisition: via either vertical transmission or from the environment (Colston & Jackson 2016). In order to gain first insights into what may drive the gut microbiome composition in sand lances, we investigated the potential correlation of host (fish) species identity with the microbial community composition of the gut, and indeed found it to be significant. This analysis faces the caveat of fish samples stemming from different sampling sites. As such, causes for these visible differences – apart from host species identity – may be differences in the physical parameters of each sampling site, or fish age and associated differences in their diet. In order to account for this caveat, we visualized the gut microbiome composition of A. tobianus from a subset of locations, in addition to samples from other fish species from a geographically intermediate site. Our assumption was that similarity of composition should be greater within a species even among different sampling sites than among species. We found this visually confirmed in Fig. S6. A potential influence of host genetic factors on gut microbiome composition seems further confirmed by our CAP analysis. The host genetic component was among the parameters that best explain observed variation in gut microbiome composition. On the other hand, the presence of obligate photoautotrophic bacteria (Synechococcus) suggests at least partial bacterial uptake from the environment. Indeed, it seems that an influence of host genetic factors depends on the bacterial taxon under investigation. In our comparison of changes in relative abundance of major bacterial taxa between fish guts and surrounding water along the North Sea - Baltic Sea environmental gradient, Gammaproteobacteria showed identical trends. This implies that the relative abundance of Gammaproteobacteria in sand lance guts may be driven by their relative abundance in the surrounding environment. In Alphaproteobacteria and Actinobacteria on the other hand, gut-bacterial abundance trends were opposite to those of the surrounding water, suggesting that relative abundances in these taxa might depend more strongly on host species. Last, we expanded our environmental dataset in the population-genomic outlier analyses of A. tobianus to include – besides SST and salinity – the relative abundance trends of major taxa of environmental bacteria. We found significant associations with relative abundances of such bacteria at outlier loci that also show elevated levels of genetic divergence. Relative abundance of Actinobacteria, Alphaproteobacteria, and Gammaproteobacteria correlated with nearly all outlier loci. Actinobacteria are known to be part of the gut, mucosa and skin in fishes, while Proteobacteria are thought to be the dominant phylum in the guts of many fishes. Among the Proteobacteria, Gammaproteobacteria typically break down and ferment complex sugars and provide particularly important digestive roles (Colston & Jackson 2016). This may indicate that bacteria adopted from the environment play important roles in a population’s adaptability to its local habitat. However, we acknowledge that our study design lacks the power to identify such

(20)

functional effects, and importantly, salinity and SST correlate with environmental bacteria in the Baltic Sea (Herlemann et al. 2016). Thus, controlled laboratory experiments would be needed in future studies in order to differentiate between correlations of environmental bacteria and an organisms’ gut microbiome alone, and/or with additional environmental parameters, ultimately fully linking the system together. The combination of in situ work and common garden experiments seems particularly promising for moving beyond the detection of correlations and to assess the causal roles of microbiome and environmental factors in shaping the adaptive potential of wild vertebrate species. However, taken together, our combined results support a model in which the gut microbial community composition is, at least partly, shaped by the host genomic background.

The potential of combining population genomics with gut microbial data to gain insight to an organism’s ecological adaptive potential

It has recently been suggested that in order to cope with rapidly changing environmental conditions, organisms may rely on high phenomic plasticity conferred by their gut bacterial symbionts (Alberdi et al. 2016). Changes at the genetic level that bring about evolutionary change often need many generations to be established in a population. For vertebrates, with their slow reproductive strategies and long generation times, this can be insufficient for short-term adaptation and survival. Phenotypic plasticity may hence be an essential factor in determining how well vertebrates adapt to fast environmental change (Alberdi et al. 2016). Among host-associated microorganisms, the bacteria of the gut are thought to be the most influential symbiotic community, affecting health, immunity, digestive metabolism and consequently fitness of their hosts (Tremaroli & Backhed 2012; Amato et al. 2013; Vatanen et al. 2016). If indeed an organisms’ adaptive potential may be enhanced through help of its gut microbiome, it stands much better chances of survival when faced with the need to adapt swiftly. Given this putative central role of the gut microbiome, knowledge of such communities is a vital addition to population genomics in studies of local adaptation (Benson 2016; Colston & Jackson 2016). Recent research investigating the genetic basis of microbes (the “who”) hypothesizes that bacteria might confer certain abilities or tolerances to their hosts. However, only very few studies have explicitly tested whether certain bacteria actually do confer such abilities or tolerances (the “how” and “why”) (Kohl et al. 2016). In fishes, microbiome research has mainly focused on lab-reared (Rawls et al. 2004; Rawls et al. 2006; Di Maiuta et al. 2013), captured (Bolnick et al. 2014) or aquaculture fish (Huber et al. 2004), while to date there is only a limited number of studies on wild populations, and only on few economically important species such as salmon (Llewellyn et al. 2016). From few studies involving wild populations, we know that inferences about the gut microbiome made from captive animals may not always be transferred

(21)

to their wild counterparts (Sullam et al. 2015; Eichmiller et al. 2016). In this sense, studies of wild non-model organisms are invaluable to gain a better understanding of how the gut microbiome and host act together as an entity in facilitating rapid ecological adaptation. In this study, we for the first time use both population genomic and gut microbial data to make inferences about the population genetic divergence of two fish species in their natural habitat. Our study may serve as a benchmark for future work that aims to integrate population genomic with gut microbial data to investigate an organisms ecological adaptive potential.

Acknowledgements

The Universities of Copenhagen and Groningen are acknowledged for providing a PhD scholarship to KF. We are most grateful to Morten Tange Olsen for very helpful discussions and comments on previous manuscript versions. Thanks to George A. Pacheco for assistance with laboratory work. We also wish to thank Filipe J.G. Viera and Shyam Gopalakrishnan for useful discussions and support in data analyses. Further thanks to Henrik Carl, Michelle Svendson, Felipe Torquato, Fredrik Landfors, Tore Holm-Hansen, Kristian Vedel, Lara Puetz, Chris Höhne, Andrea-Pil Holm, Hannah Jensen, Flora Laughier, Kay Panten, Mikkel Holger Strander Sinding, and Mads Holger Strander Sinding for support in sample acquisition. We most kindly wish to thank the fishermen Hans Holger Strander Petersen, Bo Nielsen and crew, Lars, Kristian and Jens Olsen, and Kenneth Søbye for providing samples.

(22)

Fig. 1: Sampling sites for A. tobianus (above) and H. lanceolatus (below). Pie charts represent genetic ancestry proportions per sampling site as estimated in ADMIXTURE v.1.3.0 for K = 3 (A. tobianus) and K = 2 (H. lanceolatus). Pie charts encircled in blue indicate sites from which we included 16S data in addition to GBS data. Sampling sites for H. lanceolatus for which we had <8 individuals are indicated as hollow circles. TE = Texel, SA = W-Sylt A, SB = W-Sylt B, SHB = SW-Hanstholm B, DB = Doggerbanke, TB = Tannisbugt, HR = Horns Rev, SHA = SW-Hanstholm A, NWH = NW-Hanstholm, LA = Læsø, EB = Ebeltoft, HB = Hornbæk, HØ = Helsingør, HK = Halsskov, MB = Musholm Bugt, KB = Køge Bugt, FB = Faxe Bugt, BH = Bornholm, ÅL = Åland, BӦ = Bönan

(23)

Table 1: Overview of tissue samples for GBS analyses, gut samples for 16S amplicon analyses, and environmental parameters. AT = A. tobianus, HL = H. lanceolatus, OTU = Operational Taxonomic Unit, AB = Actinobacteria, AP = Alphaproteobacteria, BC = Bacteroidetes, BP = Betaproteobacteria, GP = Gammaproteobacteria, VM = Verrucomicrobia. For sampling site abbreviations, see Fig. 1.

(24)

Fig. 2: Plots of ancestral fractions from the ADMIXTURE cluster analysis for both sand lance species. Each vertical bar represents one individual, while the colors indicate the likelihood of this individual belonging to a particular ancestral population. K refers to the number of ancestral populations that were assumed to be present in the dataset. K = * indicates the most likely K. * in head of H. lanceolatus indicates fish from sampling sites with <8 individuals that were removed for population-level analyses. Samples are sorted from North Sea (left) to Baltic Sea (right) sampling sites.

A. tobianus

H. lanceolatus

*

(25)

Fig. 3: a) Venn diagram displaying outlier SNPs identified with different outlier detection software (A. tobianus = left, H. lanceolatus = right). b) Allele frequency plots for a subset of candidate SNPs for divergent selection, as well as sampling-site-specific values for annual average salinity and SST.

a)

(26)

Fig. 4: Composition, diversity, and differentiation of A. tobianus and H. lanceolatus gut microbial communities. a) Barplot depicting the percentage of individuals with the occurrence of the bacterial phyla found in A. tobianus and H. lanceolatus guts; b) Alpha diversity (Chao1 Index left, Shannon-Wiener Index right) increased significantly with decreasing salinity level in A. tobianus, while no clear trend is observed in H. lanceolatus; c) Heatmap of OTUs that changed significantly in relative abundance (%) as a function of sampling site at genus level (A. tobianus and H. lanceolatus combined).

a)

(27)

c)

Fig. 5: Principal Coordinate analysis (PCoA) of the dissimilarity between sand lance gut microbial communities in different sampling sites for A. tobianus (left) and H. lanceolatus (right). Color of the circles denotes the sampling site.

(28)

Fig. 6: Results of CAP analyses displaying the combination of environmental parameters explaining the largest amount of variation in gut bacterial communities in A. tobianus. The shape of symbols indicates the sampling site while the color of symbols indicates the salinity.

Fig. 7: Relative abundance (of non-normalized read numbers) of the six most abundant bacterial taxa at phylum and order level of A. tobianus guts (above) and in environmental (below) water samples (Hu et al. 2016).

(29)

Supplementary Material

Table S1: Environmental parameters that were included in GBS and 16S studies.

Parameter Measurement Details of Recording

Date Julian Day -

Sampling Site Distance

For each species, we appointed the most Westerly sampling site as 0, and calculated distances of all other sampling sites respective to this first site.

Salinity (PSU)

ICES: www.ices.dk

SMHI: http://opendata-download-ocobs.smhi.se/explore/?parameter=4#

Annual average over five years (2010-2014)

Sea Surface Temperature (SST) ICES: www.ices.dk SMHI: http://opendata-download-ocobs.smhi.se/explore/?parameter=4#

Annual average over five years (2010-2014)* Minimum average over five years (2010-2014)* Maximum average over five years (2010-2014)* Coefficient of variance in SST (2010-2014)* Baltic Sea

water

metagenomic data

Abundance of six most abundant bacterial phyla and classes in the water column, data from Hu et al. 2016

Methods detailed in Hu et al. 2016**

Host genetic variable***

The ancestry fraction Q of the North Sea

population Individual-specific ADMIXTURE output

* for Bönan only 2013-2014 were used as reference (no quality control of data available before this time) (SMHI pers comm) ** (Hu et al. 2016)

(30)

GBS Data Filtration and SNP Calling

Raw reads were de-multiplexed and filtered using the UNEAK3 pipeline (Lu et al. 2013). Tags were defined as groups of >5 identical reads in the UMergeTaxaTagCountPlugin. In order to minimize false SNP calls while maximizing the number of generated SNPs, we kept the error tolerance rate in the UTagCountToTagPairPlugin at the default 0.03, and the minimum minor allele frequency (MAF) in the UMapInfoToHapMapPlugin at 0.05. Following UNEAK filtering, data were re-genotyped using an in-house script with the following: Genotype likelihoods were estimated using an in-house script (available from the authors upon request) and only sites with a genotype probability ≥90% and a phred score ≥30 were called as SNPs. Further, all SNPs with >2 alleles were removed, as were SNPs that were not in Hardy-Weinberg equilibrium (HWE) in >75% of sampling sites. Lastly, we removed all SNPs with >10% missing data using PLINK (Purcell et al. 2007) and for all population-level estimates excluded any sampling sites with < 8 individuals.

Signatures of Local Adaptation and Association with Environmental Parameters

We detected loci deviating from the neutral model with an FST-based approach in BAYESCAN

v.2.1 (Foll & Gaggiotti 2008). To test model consistency, we used prior odds for the neutral model of 10, 100, 1,000 and 10,000 and compared results in a preliminary step. For the final analyses, we used prior odds for the neutral model of 100 as recommended in the BAYESCAN V.2.1 manual. In order to tune the acceptance rates of posterior distributions, we used 20 successive pilot runs of 5,000 iterations length each. We conducted three independent analyses per species with a thinning interval of 20 and 100,000 total iterations following a 100,000-iteration burn-in period. We used Geweke’s convergence diagnostic to ensure convergence of each MCMC analysis (Geweke 1991), and the Gelman diagnostic to monitor the convergence of our three independent MCMC outputs (Gelman & Rubin 1992).

To test for associations between population genetic differentiation and environmental parameters, we employed two additional approaches. Among the outlier detection methods that take environmental variables into consideration, BAYESCENV is particularly well suited for species with high dispersal rates (de Villemereuil & Gaggiotti 2015). We used the software with a prior preference p for the locus-specific model = 0.2 as the ideal trade-off to have high power while keeping the false positive rate acceptable, as has been shown to be most appropriate for species with high dispersal rates (de Villemereuil & Gaggiotti 2015). The prior preference for the non-neutral model was set to 0.01 in order to be comparable to our BAYESCAN analyses. We included all environmental variables normalized relative to a reference, and standardized data by dividing each sampling site’s reference value by the population standard deviation. For salinity,

(31)

PSU as the environmental salinity of the ancestral state. For average annual SST, we set 8°C as a reference as sand lance larvae are known to be most abundant at water temperatures >7°C. For all other environmental variables, we used the average as reference. We selected the q-value as a test statistic to test for significance, and included only those SNPs with a q-value < 0.05 in two independent analyses per environmental variable.

We further analyzed the same marker set with BAYENV v.2(Gunther & Coop 2013). This approach controls for demographic effects by using a covariance matrix based on neutral markers when estimating correlations between environmental parameters and genetic differentiation (Coop et al. 2010). We first estimated a covariance matrix based on neutral SNPs using 100,000 iterations, then tested a set of environmental variables (Table 1) for association with genetic variation, again using 100,000 iterations and three independent analyses. SNPs were identified as outliers in BayEnv2 only if they were very strongly (Bayes Factor 32-100) or decisively (Bayes Factor 100- inf) significant in all three runs according to Jeffreys’ interpretation (Jeffreys 1948; Robert et al. 2009).

Microbial 16S Library Preparation and Amplicon Sequencing

Total PCR reaction volume for PCR I was 20 µL and consisted of 12µL AccuPrime SuperMix II, 1.5 µL of each forward and reverse primer, 3 µL H2O, and 2 µL DNA. PCR conditions were set

to an initial denaturation cycle at 95°C for 2 minutes; followed by 38 cycles, each at 95°C for 15s, 55°C for 15s, 68°C for 40s; and a final single extension step at 68°C for 4 minutes.

For PCR II, the total PCR reaction volume was 28 µL, consisting of 12µL AccuPrime SuperMixII, 2 µL of each forward and reverse primer, 7 µLH2O, and 5 µL PCR I product. The

PCR conditions were as follows: a single initial denaturation cycle at 98°C for 1 minute, followed by 13 cycles each at 98°C for 10s, 55°C for 20s, 68°C for 40s, and a final single extension step at 68°C for 5 minutes. PCR amplification products were cleaned using Agencourt AMPure™ XP magnetic beads following the manufacturer’s protocol (Beckman Coulter Inc., Brea, USA). The PCR products were size-fragmented by gel electrophoresis on a 2% agarose gel, visualized by UV-light exposure, and pooled at equimolar ratios as determined by a QubitTM 2.0 Fluorometer in a final volume of 20 µL. Negative controls from DNA extraction and amplification steps and were included in the final sequencing library. The PCR amplification products were sequenced as 250bp paired-end sequencing on an Illumina MiSeq™ 1.9 (Illumina Inc., San Diego, US) with 20% phiX spike-in according to the manufacturer’s specifications.

Microbial 16S Data Filtering and OTU Clustering

(32)

identified using the RDP Gold reference database v.9 (Cole et al. 2014) and were also excluded from the dataset. We then identified operational taxonomic units (OTUs) when the sequence identity level exceeded 97% using the UPARSE algorithm, which constructs representative de novo OTU sequences from amplicon data (Edgar 2013). The OTU table was then built by mapping the remaining reads back to OTUs. Finally, taxonomy was assigned to each OTU using the software LCAClassifier v.2.0.4 and the SilvaMod reference database (Lanzen et al. 2012) with 50 database matches per OTU, and any eukaryotic OTUs were removed. In order to account for differential abundance and read depth in our data, we normalized our dataset with the cumulative-sum scaling method (Paulson et al. 2013). As suggested by the authors, we excluded any samples with <1000 reads and in addition used the relative abundances of OTUs per sample for downstream analyses.

(33)

Fig. S1: Number of individuals of either species that has < than a certain amount of missing SNP data.

Table S2: Pairwise FST estimates for A. tobianus (A) and H. lanceolatus (B). Significant values are displayed in bold (P < 0.05). For sampling site abbreviations, see Fig. 1.

(a)

TE SB HR LA EB HB HK KB FB BH ÅL TE SB 0.001 HR 0 0.001 LA 0.005 0.005 0.005 EB 0.008 0.009 0.007 0 HB 0.006 0.006 0.004 0 0 HK 0.011 0.009 0.007 0.001 0 0.001 KB 0.016 0.018 0.013 0.008 0.006 0.008 0.005 FB 0.028 0.03 0.024 0.017 0.015 0.018 0.012 0.001 BH 0.028 0.029 0.024 0.012 0.009 0.012 0.008 0.001 0.001 ÅL 0.036 0.038 0.032 0.023 0.02 0.023 0.017 0.004 0.001 0.002 0.039 0.041 0.035 0.022 0.019 0.022 0.017 0.013 0.014 0.008 0.009

(b)

TE SA HR SHA NWH HØ FB TE SA 0.002 HR 0 0.001 SHA 0.004 0.002 0.002 NWH 0 0 0 0.001 0.012 0.011 0.012 0.012 0.007 FB 0.039 0.037 0.038 0.039 0.032 0.007 BH 0.039 0.038 0.038 0.039 0.032 0.008 0 0 50 100 150 200 250 300 80 84 85 86 87 88 89 90 91 92 93 N u m b er o f In d iv id u a ls % missing data A. tobianus 0 50 100 150 200 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 N u m b er o f In d iv id u a ls % missing data H. lanceolatus

(34)

Table S3: CV error and Log-likelihood from five independent ADMIXTURE analyses for A. tobianus and H. lanceolatus. * denotes the K with the lowest CV error/log-likelihood.

A. tobianus H. lanceolatus CV error K=2 K=3* K=4 CV error K=2* K=3 K=4 seed1 0.52252 0.52128 0.52471 seed1 0.51018 0.51968 0.53437 seed2 0.52253 0.52121 0.52418 seed2 0.51062 0.52197 0.53204 seed3 0.52251 0.52142 0.52395 seed3 0.51021 0.52268 0.53108 seed4 0.52268 0.52128 0.52456 seed4 0.51023 0.52153 0.53029 seed5 0.52242 0.52125 0.52455 seed5 0.50997 0.52106 0.53037 Average 0.522532 0.521288 0.52439 Average 0.510242 0.521384 0.53163

Fig. S2: Principal Component analysis (PCA) displaying the dissimilarity between sampling sites in A. tobianus (left) and H. lanceolatus (right).

Referenties

GERELATEERDE DOCUMENTEN

A study based on coalescent models for mitochondrial DNA even estimated a 95% decline in the North Atlantic humpback whale population pre- versus post-whaling (Roman &amp;

2 Prehistoric relative abundance of grey seals in northern Europe based on the number of zooarchaeological sites with grey seal remains in the North Sea (NS; white) and Baltic Sea

Table 3: Genetic diversity estimates, Hardy-Weinberg proportions, and neutrality test estimates for 10 microsatellite loci among humpback whales in North

Keystone species have a disproportionately large effect on their environment and are often central for maintaining an ecosystem in a healthy, stable state (Mills

Foote AD, Newton J, Piertney SB, Willerslev E, Gilbert MTP (2009) Ecological, morphological and genetic divergence of sympatric North Atlantic killer whale populations?. Foote

The aim of her present work is to investigate the divergence patterns and adaptive potential of marine keystone organisms using genetic and genomic techniques to

Keystone species have a disproportionately large effect on their environment and are often central for maintaining an ecosystem in a healthy, stable state (Mills

Divergence and adaptive capacity of marine keystone species Fietz, Katharina.. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to