Genetic diversity of tropical liverworts along altitudinal gradients in Southeast Asia

Academic year: 2021

Research Report

Student: Sarah Vergeer | Graduation period: January to June 2018 | Company: Naturalis Biodiversity Center | Supervisor: J.M. de Winter | In-Company Mentor: M. Stech


Image: Hepaticae, from Kunstformen der Natur by Ernst Haeckel [CITATION Hae04 \l 1043 ]


Abstract

Bryophytes are small, haploid, non-vascular plants that can be divided into three distinct phyla: the mosses (Musci), the liverworts (Hepaticae) and the hornworts (Anthocerotae) [ CITATION Gra96 \l 1043 ], [ CITATION Tou16 \l 1043 ]. They are valuable indicators of environmental change, because water easily enters the plant body, carrying harmful contaminants and pollutants with it, and because they occur in almost all environments [ CITATION Gig01 \l 1043 ]. Species from the liverwort phylum are the subject of this research, which aims to verify whether the altitudinal gradients of Mount Gede in the Gede-Pangrango National Park (West-Java) influence genetic diversity. This is done by combining morphological taxonomy with molecular analysis using DNA barcoding, a technique that uses standardized parts of an organism's genome for species identification and phylogeny [CITATION STE10 \t \l 1043 ]. The marker regions trnL-F, part of the chloroplast DNA, and ITS, part of the nuclear ribosomal DNA, are commonly sequenced when assessing plant phylogeny [ CITATION Qua04 \l 1043 ].

The taxa under investigation are various Bazzania species and Heteroscyphus coalitus, both collected in 2017 on the aforementioned mountain. First, the use of column-based, automatic DNA extraction had to be justified over DNA extraction with CTAB lysis, which is designed specifically for plant material. Then, PCR amplification of the genetic markers was optimized prior to high-throughput processing and subsequent Sanger sequencing. Afterwards, sequences were edited and aligned, and their molecular phylogeny was assessed with the statistical approach of Bayesian Inference, incorporating a Metropolis-coupled Markov chain Monte Carlo (MCMCMC) analysis. The resulting phylogenetic trees were used to gain insight into intraspecific variability.

High-throughput DNA extraction was justified, and basic PCR reaction mixtures were found to be optimal for amplification of both the trnL-F and the ITS marker. It was established that the combination of morphological and molecular analysis can be used to identify as yet unidentified Bazzania species, but that it should be adjusted in order to analyse Heteroscyphus coalitus, as the current method resulted in a low PCR yield. The technique allowed previously unidentified Bazzania species to be identified, and could thus help visualise the distribution of the Bazzania species over the altitudinal gradient on Mount Gede. The method also revealed intraspecific variation in some Bazzania species, which could subsequently be linked to their altitudinal distribution. Recommendations for future research focus on expanding the data set, incorporating additional analyses of phylogenetic relationships and using other techniques that could help define the influence of environmental factors on liverwort growth.


Table of contents

1. Introduction
2. Theoretical framework
2.1 An introduction to bryophytes
2.2 Epiphytic liverworts
2.2.1 The liverwort taxa of interest
2.3 The effect of altitudinal gradients on bryophytes
2.4 DNA barcoding
2.4.1 The chloroplast marker; the trnT-F region
2.4.2 The nuclear marker; ITS
2.5 Data processing; Geneious and Bayesian Inference of Phylogeny
2.5.1 Geneious
2.5.2 Bayesian Inference
3 Research Design
3.1 Species collection
3.2 Subsampling; preparing the samples for DNA extraction
3.3 DNA extraction
3.3.1 DNA extraction by CTAB lysis
3.3.2 Column extraction
3.3.3 High-throughput, automatic DNA extraction; preparation of the KingFisher extraction robot
3.4 PCR for marker amplification
3.4.1 PCR primers
3.4.2 PCR program, PCR reaction mixture and gel electrophoresis
3.4.3 PCR additives
3.5 Analysis of the sequence data
4 Results
4.1 Subsampling
4.2 Comparing DNA extraction methods
4.3 PCR mixture optimization
4.4 High throughput DNA extraction, PCR amplification and Sanger sequencing
4.4.1 High-throughput PCR amplification; success rate
4.4.2 Sequence products; marker sequence length and GC-content
4.5 Phylogenetic relationships
4.5.1 Bayesian Inference; aiding in taxonomy
4.5.2 The molecular relationships of the Bazzania species
4.5.4 Genotypic variation in the B. tridens species, and its relation to altitude
5 Discussion
6 Conclusions
7 References
Appendix I: Protocol for column extraction using NucleoSpin® Plant II
Appendix II: Protocol for DNA extraction by CTAB lysis
Appendix III: DNA extraction plant material – KingFisher
Appendix IV: E-gel
Appendix V: Sample sheet preparation and BaseClear
Appendix VI: Geneious
Appendix VII: PCR Mixtures and Additives
Appendix VIII: High-throughput PCR amplification results


1. Introduction

Bryophytes is the common name for a paraphyletic group of small, non-vascular plants that can be divided into three smaller groups: mosses, hornworts and liverworts [ CITATION Qiu06 \l 1043 ], [CITATION Mor18 \l 1043 ], [CITATION Ren07 \l 1043 ]. Species of the latter phylum, the liverworts, are the subject of this investigation, which examines whether altitude has an effect on intraspecific genetic variability. Research on bryophytes has been extensive because of their potential use as indicators of changes in the environment [ CITATION Tou16 \l 1043 ]. Because bryophytes are often morphologically similar, visual identification of bryophyte species is often difficult and imprecise [ CITATION Gra96 \l 1043 ]. Therefore, morphological species identification can be combined with DNA barcoding, a molecular identification technique that uses standardised regions of an organism’s genome to determine its species [CITATION Pom07 \l 1043 ]. This integrated taxonomy approach can also be applied to analyse intraspecific genetic variation and to assess whether altitude is the cause of it. The DNA markers used during this project are the chloroplast trnL-F region and the nuclear ribosomal ITS region, which are standard in bryophyte and plant phylogeny [CITATION STE10 \t \l 1043 ]. Liverwort specimens, presumably of the genus Bazzania and the species Heteroscyphus coalitus, were collected along the altitudinal gradients of the Gede-Pangrango National Park, West-Java. Prior to high-throughput DNA extraction and PCR amplification, these methods are optimized and it is verified whether automatic DNA extraction with a KingFisher extraction robot is justifiable. After successful PCR, the samples are Sanger sequenced by BaseClear, edited and aligned using Geneious software and analysed for their phylogeny with Bayesian Inference incorporating the MCMC algorithm using MrBayes.

The main question to be answered is:

Does altitude have an influence on the genetic diversity of epiphytic liverwort species along the altitudinal gradients in Gede-Pangrango National Park, West-Java?

Sub-questions that help answer the main question are:

- Is DNA extraction from liverwort plant material more efficient with CTAB lysis or with column DNA extraction, which can be performed in bulk?

- Which PCR mixture is optimal for the amplification of the selected DNA markers?

- Can as yet unidentified liverwort species be identified by combining morphological examination with molecular analysis such as DNA barcoding?

- Do the techniques used differentiate between the different liverwort species?

- How does altitude influence liverwort species diversity?

- Is the technique used adequate to assess genotypic variation within the liverwort species?

- Can the influence of altitude on genotypic variation in liverworts be assessed with the technique used?

This report begins with an introduction to bryophytes and provides more detailed information on the liverwort species. Concepts such as DNA barcoding and genetic sequence markers are also explained, followed by a description of the bioinformatics tools used. Then, an outline of the research design is provided, followed by the results. Lastly, the results are discussed and conclusions are drawn. The appendices referred to can be found at the end of the report.


2. Theoretical framework

Before the liverwort samples can be analysed, some general knowledge of liverworts needs to be established. Therefore, a short introduction to bryophytes and some liverwort species is given, followed by an explanation of DNA barcoding, the technique used for the taxonomic and phylogenetic analysis of the samples. The principles of the bioinformatics tool Geneious and of the statistical analysis strategy Bayesian Inference are also explained.

2.1 An introduction to bryophytes

Bryophytes is a general term to describe small, non-vascular plants [ CITATION Tou16 \l 1043 ]. Phylogenetic research [CITATION Qiu06 \m Mor18 \m Ren07 \l 1043 ] revealed that this group is paraphyletic and thus does not include all descendants of its most recent common ancestor (figure 1). Consequently, bryophytes were divided into three distinct phyla: the Bryophyta, the Marchantiophyta and the Anthocerotophyta, also known as the mosses (Musci), the liverworts (Hepaticae) and the hornworts (Anthocerotae) (figure 2) [ CITATION Gra96 \l 1043 ].

Mosses comprise about two-thirds of all bryophyte species, with an estimated 12,500 species. The liverwort phylum comprises approximately 5,200 species and the hornworts are the smallest group, with around 100 species [ CITATION Tou16 \l 1043 ],[ CITATION Gra96 \l 1043 ].

Figure 2. Mosses (A), Liverworts (B) and Hornworts (C) (Jones, Ougham, Thomas, & Waaland, 2013).

The body structure of bryophytes varies per phylum: mosses have leafy structures, hornworts have thallous ones, and liverworts can have either leafy or thallous structures, see figure 2 [ CITATION Gra96 \l 1043 ]. Within a phylum, morphological taxonomic identification is difficult due to phenotypic similarities. However, the bryophyte phyla are easily distinguished from the other plant phyla. For one, the dominant life stage of bryophytes is the haploid gametophyte, while for most vascular plants this is the mature sporophyte [CITATION Cam14 \t \l 1043 ]. Also, bryophytes lack a vascular system and instead are able to absorb water with nutrients over their entire surface [ CITATION Tou16 \l 1043 ]. Therefore, bryophytes are small plants and are often found in moist areas.

Figure 1. Highlights of plant evolution, from Campbell’s “Biology, a Global Approach”, chapter 29 [CITATION NCa141 \t \l 1043 ].

Bryophytes are readily susceptible to changes in their environment [ CITATION Tou16 \l 1043 ]. Because they absorb water so readily from their surroundings, harmful and undesired substances also easily enter the plant body. In combination with the fact that bryophytes grow in almost all terrestrial and freshwater environments, this makes them valuable indicators that are useful in documenting environmental change [ CITATION Gig01 \l 1043 ]. Bryophyte species reproduce either sexually or vegetatively (asexually). For sexual reproduction, the sperm cells need a film of water to reach the female sex organ, and fertilization is most successful when gametophores are spaced closely together. One example of vegetative reproduction is the production of small brood bodies (small plantlets), which can grow into genetically identical gametophytes. These brood bodies, or gemmae, are formed either at the fringes of leaves or in specialised organs. In other forms of asexual reproduction, broken-off pieces of the gametophyte, such as a leaf or a branch, can grow into a new gametophyte.

2.2 Epiphytic liverworts

Liverwort (Hepaticae) is the common name for the phylum Marchantiophyta. The name is derived from the Latin word hepaticus, which means liver, and refers to the shape of some thallous gametophytes, which resembles the lobes of a liver [CITATION Cam14 \t \l 1043 ], see figure 2 (B) and 3 (right). Most liverworts are leafy, but there are also thallous, semi-leafy and semi-thallous liverworts, the latter two having leaf-like attachments [ CITATION Tou16 \l 1043 ]. In contrast to mosses, liverwort leaves grow spirally and are asymmetrical [ CITATION Gra96 \l 1043 ]. The asymmetry is the result of the initial leaf cell splitting into two daughter cells before the leaf forms, each of which gives rise to half of the leaf. As a consequence, liverwort leaves never have a single sharp tip but rather a blunt, two-tipped or four-tipped end (figure 3, left). In addition, liverworts have unique oil bodies embedded in their leaf cells, although their function is not yet fully understood [ CITATION Tou16 \l 1043 ].

Furthermore, liverwort sporophytes mature fully inside protective tissue made by the gametophyte before being extended outwards for spore dispersal [ CITATION Tou16 \l 1043 ]. After fertilization, the archegonium (the female sex organ) contains a diploid zygote and develops into a protective chamber. In this chamber, called the calyptra, the spores mature. Gametophytes of thallous liverworts protect the sporophyte with extra tissue, the involucrum (figure 3, right). The seta (stalk) of the sporangium also develops before the sporangium matures, and extends as soon as the sporangium is ripe. The capsule of the sporophyte is of simple construction; upon exposure to air it quickly desiccates and bursts open in four flaps, dispersing the spores. From every spore only a small, thallous protonema grows, which produces only one gametophyte.

2.2.1 The liverwort taxa of interest

The epiphytic liverwort taxa under investigation are various species of the genus Bazzania and the species Heteroscyphus coalitus, both leafy liverworts of the order Jungermanniales. Bazzania (figure 4, left) is a genus of the family Lepidoziaceae [ CITATION Hon88 \l 1043 ] and comprises circa 100 to 150 species. It grows mostly in montane tropical environments, on bark, rocks, rotten wood and nutrient-rich soil [ CITATION Gra96 \l 1043 ]. Heteroscyphus coalitus (figure 4, right) is part of the family Geocalycaceae [ CITATION Sep10 \l 1043 ] and grows mainly in South-East Asia and in southern temperate, sub-Antarctic climates, on soil, rotten wood, bases of trunks and moist rocks [ CITATION Gra96 \l 1043 ].

Figure 3. Leafy and thallous liverworts, with their protective tissues [ CITATION Gra96 \l 1043 ].

Figure 4. Bazzania sp., (left) photo by Felipe Osorio-Zúñiga in Chile [ CITATION FOs11 \l 1043 ], and Heteroscyphus coalitus (right) (Hook.) Schiffn., photo by Yumizuchi in Sasayama, Hyōgo, Japan [ CITATION Yum15 \l 1043 ].

2.3 The effect of altitudinal gradients on bryophytes

The liverwort species under investigation (see figure 4) were collected along altitudinal gradients in the Gede-Pangrango National Park, West Java, a mountain forest with a hyper-diverse tropical ecosystem. Several studies have investigated the bryophyte zonation along different altitudes [ CITATION Ree83 \l 1043 ], [ CITATION Gra921 \l 1043 ], [CITATION Fra91 \t \l 1043 ], [CITATION Fra02 \t \l 1043 ], [CITATION San10 \l 1043 ], [ CITATION Den15 \l 1043 ], as well as patterns of bryophyte species richness [CITATION Han06 \m Sub10 \m AhP12 \m Sun13 \m San15 \l 1043 ].

Provided that the altitudinal transect is long enough, the bryophyte species richness pattern observed by various studies is hump-shaped: species richness is low at lower altitudes, increases in mid-altitude zones and decreases again at higher altitudes [ CITATION Ree83 \l 1043 ], [CITATION Fra91 \t \l 1043 ], [ CITATION Gra921 \l 1043 ], [CITATION Fra02 \t \l 1043 ], [ CITATION Gra07 \l 1043 ], [ CITATION AhP12 \l 1043 ], [ CITATION Den15 \l 1043 ], [ CITATION San15 \l 1043 ]. For complete sampling, samples should be collected over the entire transect. Studies that used different or incomplete sampling found different patterns [CITATION Rah04 \m DNo08 \m McC10 \l 1043 ].
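To make the definition of a hump-shaped pattern concrete, the following minimal Python sketch checks whether a series of richness counts along a transect rises to a single interior peak and then falls. The function name and the example counts are hypothetical and serve illustration only; they are not data from this report or from the cited studies.

```python
def is_hump_shaped(richness):
    """Return True if richness rises to a single interior peak and then falls.

    `richness` lists species counts ordered from the lowest to the highest
    altitude zone. Ties between neighbouring zones are tolerated.
    """
    if len(richness) < 3:
        return False
    peak = richness.index(max(richness))
    # The peak must not lie at either end of the transect.
    if peak == 0 or peak == len(richness) - 1:
        return False
    rising = all(a <= b for a, b in zip(richness[:peak], richness[1:peak + 1]))
    falling = all(a >= b for a, b in zip(richness[peak:], richness[peak + 1:]))
    return rising and falling


# Hypothetical counts per altitude zone, lowest zone first.
example_counts = [4, 9, 15, 11, 5]
print(is_hump_shaped(example_counts))
```

A transect that stops before the peak, such as `[1, 2, 3, 4]`, is reported as not hump-shaped, which mirrors the remark above that incomplete sampling can suggest a different pattern.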

2.4 DNA barcoding

DNA barcoding is a rapid, accurate and automatable identification technique that uses certain standardized parts of an organism’s genome for species identification [CITATION Pom07 \l 1043 ]. The strategy of combining morphological taxonomy with molecular analysis for species delimitation is called integrated taxonomy, an approach that integrates information from different data sets to classify species with more confidence [ CITATION Pan14 \l 1043 ], [ CITATION Fuj12 \l 1043 ]. Besides taxonomic applications, DNA barcoding is also used as a means to assess inter- and intraspecific genetic variation [CITATION Nil08 \m Sun16 \l 1043 ], [CITATION Ste05 \t \l 1043 ], [CITATION Ste032 \t \l 1043 ], [CITATION Ste033 \t \l 1043 ]. The standardized parts of the genome that are compared between species are called (genome) markers. Ideally, a single marker region is enough to identify an organism, but plant taxonomy turned out to require more regions [CITATION STE10 \t \l 1043 ]. Nuclear ribosomal DNA, particularly the ITS region, is widely used to reconstruct phylogenetic relationships in mosses. The same goes for chloroplast DNA, especially its trnT-F region including the trnL-F spacer and trnL intron. The latter region is used not only in the study of bryophyte groups but also in that of other plants [ CITATION Par14 \l 1043 ]. As sequence variability of chloroplast markers is quite low in bryophytes [ CITATION Qua04 \l 1043 ], the analysis of multiple marker regions of the same sample may be necessary for accurate species identification in at least some groups.
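The idea of identifying a specimen by comparing a marker sequence with reference sequences can be sketched as follows. This is a simplified illustration that uses the uncorrected p-distance (the proportion of differing sites) on already aligned sequences; it is not the workflow of this report, which relies on Bayesian Inference. The function names, species labels and sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def closest_reference(query, references):
    """Return (name, distance) of the reference closest to the query."""
    distances = {name: p_distance(query, seq) for name, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]


# Hypothetical, already aligned marker fragments (not real sequences).
references = {
    "species_A": "ACGTTGCAAC",
    "species_B": "ACGATGCTAC",
}
query = "ACGTTGCA-C"
print(closest_reference(query, references))
```

When the distances to two references are nearly equal, a single marker does not separate them, which is the situation in which sequencing an additional marker region becomes useful.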

2.4.1 The chloroplast marker; the trnT-F region

In bryophyte phylogeny, the trnT(UGU)-trnF(GAA) region has been the most widely used marker region of the past decade. The general structure of this region is similar in bryophytes and other land plants, which makes it a useful marker for phylogenetic analysis [CITATION STE10 \t \l 1043 ]. It is located in the large single copy region (LSC) of the chloroplast DNA, and consists of four coding regions for the genes of three tRNAs: tRNA-Thr, tRNA-Leu and tRNA-Phe. Especially the intragenic trnL intron and the intergenic trnL-F spacer have been extensively sequenced since the introduction of universal primers [ CITATION Qua04 \l 1043 ], [CITATION STE10 \t \l 1043 ].

In figure 5, the complete trnT-F region in bryophytes is depicted, with the coding regions shown as bold squares and the variable regions accented in grey. To sequence both the trnL intron and the trnL-F spacer, primers C and F should be used. The marker region trnL-F includes part of the 5’ exon, the complete 3’ exon, the trnL intron, the trnL-F spacer and part of the tRNA-Phe gene.

Figure 5. The trnT(UGU)-trnF(GAA) region, present in the chloroplast DNA of bryophytes and other land plants. The exons (depicted as bold black squares) code for the tRNA molecules tRNA-Thr, tRNA-Leu and tRNA-Phe. The highly length-variable regions are labelled V1-V7 and the prominent primer regions A-F. From: [ CITATION Qua04 \l 1043 ].

2.4.1.1 The trnL intron

The trnL intron, with a length of 242-495 bp in liverworts, is a group I intron, meaning that it joins the two trnL exons post-transcriptionally by splicing itself out [ CITATION Hed13 \l 1043 ]. A characteristic of such introns is the presence of multiple conserved “core” structures (IGS, P, Q, R, S, figure 6) and various variable regions (e.g. V4 and V5 in figure 5, named P6 and P8 in figure 6). Mutations and variations in sequence length are less frequent within the conserved core of the intron, but nonetheless contribute a high number of lineage-specific mutations, which can be used for taxonomic applications. Sequence variation is, however, considerably higher in the highly variable stem-loop regions P8 and P6 (figure 6) [ CITATION Qua04 \l 1043 ]. Therefore, these regions are a promising source of information regarding intraspecific variation. Figure 6 serves as a “map” that helps to identify conserved and variable regions within the trnL intron during sequence processing. The initial general secondary structure of group I introns [CITATION TRC94 \l 1043 ] was adjusted by Quandt and Stech to represent its structure in Campylopus flexuosus, a moss [ CITATION Qua04 \l 1043 ]. The boxes with letters represent observed single


Figure 6. The secondary structure of the trnL(UAA) intron of Campylopus flexuosus. Depicted are the conserved regions (IGS, P, Q, R and S). The highlighted G at the 3’ end marks where the 3’ exon of trnL begins; the 5’ end marks where the 5’ exon ends. This figure was adjusted from: [ CITATION Qua04 \l 1043 ].

2.4.1.2 The trnL-F spacer

The spacer between the 3’ exon of the trnL gene and the trnF gene is fairly short, on average 73.8 bp (SD 14.9) [ CITATION Qua04 \l 1043 ], especially when compared to the other non-coding parts of the trnT-F region. In most liverworts and mosses this spacer consists largely of two conserved promoter elements of the aforementioned genes, flanked by two variable regions, see V6 and V7 in figure 5. The promoter elements, although quite conserved, are more variable in some bryophyte species and thus make homology analysis nearly impossible [ CITATION Qua04 \l 1043 ].

2.4.2 The nuclear marker; ITS

Working with nuclear regions during DNA barcoding takes more effort, because nuclear DNA is generally more complicated than plastid DNA [CITATION STE10 \t \l 1043 ]. This is due to the fact that “nuclear genes are organised in gene families and generally present in multiple copies entailing the problem of specifically amplifying orthologous loci”. Nuclear ribosomal DNA, however, is assumed to undergo concerted evolution, a process in which the copies of the transcription unit are homogenised and which makes amplification and sequencing easier. Concerted evolution is not yet fully understood, but it is a gene conversion mechanism that is thought to be responsible for “the homogenization of multigene families such as the nuclear ribosomal gene clusters” [CITATION TijdelijkeAanduiding1 \l 1043 ]. Therefore, nuclear ribosomal DNA has been used extensively for plant phylogeny, most often its ITS1-5.8S-ITS2 region (hereafter referred to as ITS). The ITS2 region in particular is highly variable and can be used to distinguish between species [ CITATION Mai97 \l 1043 ]. The nuclear ribosomal DNA encodes the RNA of the small and large subunits of the ribosome, and this region is in general easy to amplify and provides high levels of sequence variation for addressing relationships at the genus or population level [CITATION STE10 \t \l 1043 ].

The arrangement of the genes of the nuclear ribosomal DNA is similar in bacteria, archaea and eukaryotes. As can be seen in figure 7, the small subunit gene (18S, in eukaryotes) is separated from the large subunit gene (25S/28S, in eukaryotes) by two internal transcribed spacers (ITS1 and ITS2) and a 5.8S gene [ CITATION Laf01 \l 1043 ], [CITATION Rog11 \l 1043 ]. To amplify ITS, primers are positioned at the end of the 18S gene and at the beginning of the 25S gene.

Figure 7. The ITS region in nuclear ribosomal DNA. Image courtesy of R. Gama, Naturalis Biodiversity Center, Leiden [ CITATION Gam18 \l 1043 ].

2.5 Data processing; Geneious and Bayesian Inference of Phylogeny

Sequence data first needs to be edited and aligned after sequencing before phylogenetic analysis can be run. For the editing and aligning of sequence data, Geneious is used, after which a Bayesian Inference analysis is run to create a phylogenetic tree.

2.5.1 Geneious

Geneious is a bioinformatics tool that allows the user to analyse sequence data by offering aligning, editing and tree-building options [ CITATION Kea12 \l 1043 ]. Sequences can be uploaded into the program and assessed for their sequence quality and consensus options. Then, a multiple alignment of sequences can be created using the Geneious Aligner option (figure 8). This alignment gives an impression of which parts of the sequences are conserved and which ones show variation within or between species. The alignment is then used for further species delimitation analysis, such as a Bayesian Inference analysis (2.5.2), and the result is a phylogenetic tree in which inter- and intraspecific relationships can be analysed.

Figure 8. An impression of the visualization of multiple alignments of the trnL-F region of various Bazzania samples, made using Geneious Aligner. The conserved regions of the sequence are without highlighted regions, while the highlighted sections visualise differences in base-pair sequence.
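What an alignment viewer highlights can be approximated with a few lines of Python. The sketch below is a simplified illustration, not a description of how Geneious works internally: it lists the alignment columns in which the aligned samples differ. The function name and the toy alignment are hypothetical, and gaps are simply ignored.

```python
def variable_columns(alignment):
    """Return the 0-based indices of alignment columns that are not conserved.

    `alignment` is a list of equal-length, aligned sequences. A column counts
    as variable when it contains more than one distinct base; gaps ('-') are
    ignored in this simplified version.
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    variable = []
    for index, column in enumerate(zip(*alignment)):
        bases = {base.upper() for base in column if base != "-"}
        if len(bases) > 1:
            variable.append(index)
    return variable


# Hypothetical toy alignment of three samples.
toy_alignment = [
    "ACGTAC-TGA",
    "ACGTACGTGA",
    "ACCTACGTTA",
]
print(variable_columns(toy_alignment))
```

For the toy alignment, columns 2 and 8 are reported as variable, while the column containing the gap is treated as conserved because the remaining bases agree.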


2.5.2 Bayesian Inference

Bayesian Inference is a statistical approach based on Bayes’ theorem, which combines the prior probability of a tree, P(A), with the likelihood of the data given that tree, P(B|A), and the overall probability of the data, P(B), to produce a posterior probability distribution on trees, P(A|B). The analysis thereby accounts for phylogenetic uncertainty: it assigns a number to each clade of the resulting tree, indicating the probability that the clade is correct [CITATION Hue01 \l 1043 ].
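As a worked illustration of Bayes’ theorem, P(A|B) = P(B|A) · P(A) / P(B), the sketch below computes posterior probabilities for three candidate trees. All names and numbers are hypothetical; in a real analysis the number of possible trees is far too large to enumerate, which is why the posterior is approximated by sampling.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem to a finite set of candidate trees.

    priors:      dict mapping tree name -> P(tree)
    likelihoods: dict mapping tree name -> P(data | tree)
    Returns a dict mapping tree name -> P(tree | data).
    """
    # P(data) is the sum of prior * likelihood over all candidate trees.
    evidence = sum(priors[t] * likelihoods[t] for t in priors)
    return {t: priors[t] * likelihoods[t] / evidence for t in priors}


# Hypothetical values: three candidate topologies with equal priors.
priors = {"tree_1": 1 / 3, "tree_2": 1 / 3, "tree_3": 1 / 3}
likelihoods = {"tree_1": 0.008, "tree_2": 0.002, "tree_3": 0.0005}
post = posterior(priors, likelihoods)
for name, value in post.items():
    print(name, round(value, 3))
```

With equal priors the posterior is proportional to the likelihood, so the first tree receives about 76% of the posterior probability in this toy case.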

MCMC methods became popular in phylogenetics soon after their introduction in the late 1990s, partly due to their improved methodology but also because of relatively easy-to-use software such as MrBayes. The MCMC algorithm works as follows: first a phylogenetic tree is randomly proposed, which is then either accepted or rejected based on the Metropolis-Hastings algorithm [ CITATION Met53 \l 1043 ], [ CITATION Has70 \l 1043 ]. When a tree is accepted, it is randomly and slightly adjusted and becomes the subject of the next generation, in which the adjusted tree is again accepted or rejected. Nowadays, a Metropolis-coupled Markov chain Monte Carlo (MCMCMC) analysis is more common. Monte Carlo methods incorporate a large number of generations to decrease the probability of reaching wrong conclusions by chance. This Monte Carlo integration uses Markov chains to randomly propose phylogenetic trees and assess the posterior probability [ CITATION Ron12 \l 1043 ]. With a Metropolis-coupled MCMC, multiple parallel chains are run simultaneously, which helps the analysis explore the posterior distribution more thoroughly [ CITATION Alt04 \l 1043 ]. During the analysis, millions of generations are run to create phylogenetic trees. Depending on the set parameters, a tree is saved every 1000-5000 generations. To provide ‘rooting’ for the tree, an outgroup of samples can be included. This aids in assessing the phylogenetic relationships of closely related species [ CITATION Mad84 \l 1043 ].
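The accept-or-reject step can be illustrated with a toy sampler. The sketch below is a plain Metropolis sampler with a symmetric proposal over three hypothetical trees; it is not the implementation used by MrBayes and it does not include Metropolis coupling. The weights, the seed and the function name are assumptions made for the example.

```python
import random


def metropolis_sample(weights, n_generations, seed=0):
    """Sample states in proportion to unnormalised posterior weights.

    weights: dict mapping state -> prior * likelihood (need not sum to 1)
    Returns the list of visited states, one per generation.
    """
    rng = random.Random(seed)
    states = list(weights)
    current = rng.choice(states)
    chain = []
    for _ in range(n_generations):
        # Symmetric proposal: pick one of the other states uniformly.
        proposal = rng.choice([s for s in states if s != current])
        # Metropolis rule: always accept an improvement, otherwise accept
        # with probability equal to the ratio of the weights.
        ratio = weights[proposal] / weights[current]
        if rng.random() < min(1.0, ratio):
            current = proposal
        chain.append(current)
    return chain


# Hypothetical unnormalised posterior weights for three topologies.
weights = {"tree_1": 0.008, "tree_2": 0.002, "tree_3": 0.0005}
chain = metropolis_sample(weights, n_generations=20000)
frequency = {s: chain.count(s) / len(chain) for s in weights}
print(frequency)
```

Over many generations, the fraction of generations spent on each tree approaches that tree’s posterior probability, without the normalising constant P(B) ever being computed.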

After the analysis, a consensus tree is made from all saved trees except the so-called ‘burn-in’: the samples with a low posterior probability at the beginning of the chain. The result (figure 9) is a consensus tree of all input data, shown on the left as output by MrBayes and on the right as stylised in FigTree v1.4.3. Each branch is assigned a posterior probability between zero and one, one being the highest. A probability below 0.95 is not accepted, as the credibility of the corresponding branch is uncertain.

Figure 9. The output of a Bayesian analysis of a multiple alignment of the trnL-F region of various Bazzania and Heteroscyphus samples, in MrBayes (left) and FigTree v1.4.3 (right).
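How clade support follows from the saved trees can be sketched as below. This is a simplified illustration, not the summarising routine of MrBayes: each tree is reduced to a set of clades, the burn-in is discarded, and the support of a clade is the fraction of retained trees that contain it. The burn-in fraction of 25%, the function name and the toy trees are assumptions; the 0.95 threshold is the one mentioned in the text.

```python
from collections import Counter


def clade_support(sampled_trees, burnin_fraction=0.25):
    """Posterior support per clade after discarding the burn-in.

    sampled_trees: list of trees in sampling order; each tree is given as a
                   set of clades, and each clade as a frozenset of taxon names.
    Returns a dict mapping clade -> fraction of retained trees containing it.
    """
    n_burnin = int(len(sampled_trees) * burnin_fraction)
    retained = sampled_trees[n_burnin:]
    if not retained:
        raise ValueError("no trees left after discarding the burn-in")
    counts = Counter(clade for tree in retained for clade in tree)
    return {clade: n / len(retained) for clade, n in counts.items()}


# Hypothetical sample of eight trees over taxa A-D; the first two are burn-in.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ac = frozenset({"A", "C"})
sampled = [{ac}, {ac}, {ab, cd}, {ab, cd}, {ab}, {ab, cd}, {ab, cd}, {ab, cd}]
support = clade_support(sampled, burnin_fraction=0.25)
accepted = {clade for clade, p in support.items() if p >= 0.95}
print(support[ab], support[cd], ab in accepted, cd in accepted)
```

In this toy sample the clade (A, B) occurs in all six retained trees and is accepted, whereas (C, D) occurs in five of six and falls below the 0.95 threshold.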


3 Research Design

In this section, the method for the analysis of the target liverwort taxa, Heteroscyphus coalitus and Bazzania species, is described, followed by directions on how to optimize DNA extraction and PCR amplification, with subsequent submission to BaseClear for Sanger sequencing. Finally, directions for the analysis of the sequencing data are provided.

3.1 Species collection

The liverwort samples were collected in May 2017 by E. Iskandar (PhD graduate student, personal communication) in the Gede-Pangrango National Park, West Java, where small pieces of liverwort were morphologically identified as species of interest. In total, 85 samples of various presumed Bazzania (B.) species and 16 samples of what is most likely Heteroscyphus coalitus (H. coalitus) were collected. The samples were transferred in paper envelopes and dried on silica.

The samples have been collected along a transect of roughly 1300 metres, see table 1 for the corresponding elevation.

Table 1. An overview of the approximate altitudinal distribution of collected specimens on Mount Gede, West-Java. The taxonomic identification of the liverwort species is, until confirmed by molecular analysis, alleged.

Name species, followed by number of specimens per altitude (m a.s.l.)

Altitude: ±1500 ±1700 ±1900 ±2000 ±2100 ±2300 ±2500 ±2600 ±2700 ±2800

B. japonica 1 7 10 8 9

B. tridens 2 8 6 9 2

Bazzania sp. 1 6 8

Bazzania sp.2. 5 2 1

H. coalitus 6 4 2 3 1

3.2 Subsampling; preparing the samples for DNA extraction

The liverwort collection was subsampled by taking a single gametophore per sample and cleaning it manually with water and tweezers under a microscope, to remove any contamination, such as sand, that might interfere with proper molecular analysis. The gametophore was then placed in a sterile collection microtube, after which 3 mm glass micro-beads were added. The tubes were centrifuged for 20 to 30 seconds at 3700 rpm to ensure that all of the sample was at the bottom of the tube. This was also done in bulk, on a microplate. Then, the microtubes were frozen in liquid nitrogen (LN2) and placed in the TissueLyser II (Qiagen), where they were shaken twice for 1.5 minutes at 25 Hz, in different orientations.

3.3 DNA extraction

Extracting DNA from plant material can prove difficult, both because of the small size of the specimens and because plant tissue is generally harder to extract DNA from. Ideally, the DNA extraction process is as quick as possible, which can be realised by using high-throughput, automatic DNA extraction. This is performed by a KingFisher Flex (KF) extraction robot (ThermoFisher), which can extract the DNA from 95 samples at the same time. It uses magnetic-bead technology: the beads bind DNA and are subsequently transported through a series of steps that remove cell residues. However, another requirement for DNA extraction is that enough DNA is obtained, otherwise PCR is not successful. CTAB (cetyltrimethylammonium bromide) lysis is designed specifically to yield a DNA-rich extract from plant material. It lyses the plant cells and retains lipids and proteins in micelles, which are easily separated from nucleic acids in water. In essence, all DNA present in the sample is extracted, resulting in a DNA-rich extract. However, this is its only benefit. One of the multiple disadvantages of CTAB lysis is the necessity of using the carcinogenic chloroform-isoamyl alcohol for DNA purification. Also, this method is time-inefficient and can result in an impure DNA extract as a result of its robust extraction principle.

Therefore, this method was compared to kit-based column DNA extraction. The latter method relies on the same DNA extraction principle as automatic DNA extraction with the KF extraction robot. If column extraction yields a DNA-rich extract similar to that of CTAB extraction, it is justified to continue with high-throughput DNA extraction.

3.3.1 DNA extraction by CTAB lysis

Until DNA precipitation, this step was executed solely in the fume cabinet. Firstly, CTAB-buffer was prepared, with an end concentration of 2% hexadecyltrimethylammonium bromide (95%, Cat. No 855820, Aldrich), 1.4 M NaCl (Cat. No. S7653, Sigma-Aldrich), 100 mM Tris, pH 8 (Cat. No. 11130, Sigma), 20 mM EDTA (≥99%, Lot. No. BCBQ6846V, Sigma-Aldrich) and 2% 2-mercapto-ethanol (Cat. No. M6250, Aldrich). After subsampling (see 3.2), 1 ml of warm CTAB-buffer was added to each sample tube. The samples were then incubated for 1 hour at 65 °C, while constantly being shaken on a rotation block and regularly (every 10-15 minutes) manually inverted.
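The buffer composition above can be turned into a small volume calculation. The following is only a sketch: the final concentrations come from the text, but the stock concentrations (5 M NaCl, 1 M Tris, 0.5 M EDTA) are hypothetical, and the 2% values are assumed to mean w/v for CTAB and v/v for 2-mercaptoethanol. The report's own recipe is in appendix II.

```python
# Final concentrations from the text (mol/l).
FINAL_MOLAR = {"NaCl": 1.4, "Tris pH 8": 0.100, "EDTA": 0.020}

# Hypothetical stock concentrations (mol/l); not given in the report.
STOCK_MOLAR = {"NaCl": 5.0, "Tris pH 8": 1.0, "EDTA": 0.5}


def ctab_buffer_recipe(final_volume_ml):
    """Return amounts per component for a given final buffer volume."""
    recipe = {}
    for name, final_conc in FINAL_MOLAR.items():
        # C1 * V1 = C2 * V2, solved for the stock volume V1.
        recipe[name + " stock (ml)"] = final_volume_ml * final_conc / STOCK_MOLAR[name]
    # Assumption: 2% CTAB is weight per volume (g per 100 ml).
    recipe["CTAB (g)"] = 0.02 * final_volume_ml
    # Assumption: 2% 2-mercaptoethanol is volume per volume.
    recipe["2-mercaptoethanol (ml)"] = 0.02 * final_volume_ml
    liquids = sum(v for k, v in recipe.items() if k.endswith("(ml)"))
    # Approximation: the volume taken up by dissolved CTAB is ignored.
    recipe["water (ml)"] = final_volume_ml - liquids
    return recipe


if __name__ == "__main__":
    for component, amount in ctab_buffer_recipe(100).items():
        print(f"{component}: {amount:.1f}")
```

In practice the buffer would be topped up to the final volume rather than relying on the computed water volume.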

To separate the nucleic acids from other cellular components such as proteins, lipids and cell residue, 450 µl of chloroform-isoamyl alcohol (24:1, room temperature, Cat. No. MKCF2063, Sigma-Aldrich) was added and mixed with the solution by inverting for five minutes. After centrifugation (in the fume hood) for 10 minutes at 20,000 x g, three phases became visible, with the nucleic acids in the top (water) phase. 800 µl of this top phase was carefully removed and transferred to a fresh 2 ml tube, and the step was repeated from the point where the chloroform-isoamyl alcohol was added. After this second centrifugation, 550 µl instead of 800 µl was transferred to the 2 ml tube. For subsequent DNA precipitation, 550 µl of ice-cold isopropyl alcohol (99.7%, -20 °C, Cat. No. W292907, Sigma-Aldrich) was added to the water phase and mixed by carefully inverting for 5 minutes at room temperature. In case of a high DNA yield a white substance appeared, which was pelleted by centrifugation for 10 minutes at 20,000 x g. Afterwards, the alcohol was drained and the pellets were re-dissolved in 150 µl of TE by incubating at 37 °C for half an hour. Any RNA was removed by adding 3 µl of RNase and incubating again at 37 °C for 30 minutes. At this point the extract could be stored at 4 °C, as long as the sample was brought back to 37 °C before further processing (such as PCR). See appendix II: Protocol for DNA extraction by CTAB lysis for reagents, solutions, kits, chemicals, equipment and a step-by-step procedure.

3.3.2 Column extraction

After tissue lysis, column-based DNA extraction was performed using the NucleoSpin® Plant II kit (Cat. No. 740770.250, Macherey-Nagel). First, 400 µl of PL1 buffer and 10 µl of RNase were added to the samples, which were then incubated at 65 °C for around 10 minutes to lyse the cells and remove RNA. Impurities were then removed by filtering the solution over a NucleoSpin® Filter, after which the collected flow-through was centrifuged for 10 minutes at 11,000 x g. The DNA was then bound to the NucleoSpin® Plant II Column by adding 450 µl of PC buffer and centrifuging for 1 minute at 11,000 x g. The DNA was then washed, first with 400 µl of PW1 buffer, then with 600 µl of PW2 buffer and finally with 200 µl of PW2 buffer. After each washing buffer was loaded, the column was centrifuged at 11,000 x g for one minute to ensure the column was saturated with the buffer. After the final wash step, the column was allowed to dry and the DNA was eluted by adding 50 µl of PE buffer. Following a short incubation of five minutes at 70 °C, the tubes were centrifuged for one minute at 11,000 x g and the elution step was repeated. The eluates contain the DNA template, which can be used for PCR amplification. See appendix I: Protocol for


3.3.3 High-throughput, automatic DNA extraction; preparation of the KingFisher extraction robot

For automatic DNA extraction using the KingFisher extraction robot, the NucleoMag® Tissue kit (Cat. No. 744300.24, Macherey-Nagel) was used. After subsampling (see 3.2) in a 96-well plate, 500 µl of lysis buffer MC1 and 10 µl of RNase were added to each well. The wells were then sealed with tape, briefly vortexed and centrifuged for 20 to 30 seconds at 3700 x g. The plate was then placed in a pre-heated 56 °C shake-incubator, where it was incubated at 250 rpm for at least an hour. After incubation, the microtubes with the lysed plant material underwent a final centrifugation of 20 minutes at 3700 x g, after which the samples were ready for DNA extraction. Prior to DNA extraction in the KF extraction robot, the lysis, wash and elution plates were prepared according to table 1 in section 5.2 of appendix III: DNA extraction plant material – KingFisher. Refer to this appendix also for the reagents, solutions, chemicals and equipment, and for the further procedure and manual to initiate DNA extraction. When the KF extraction robot had finished, part of the eluate was stored in the freezer (-80 °C) and part was prepared for PCR. The microtube plates were scanned and catalogued, obtaining an NCBN-number which could be used to create a PCR sheet; see appendix V: Sample sheet preparation and BaseClear.

3.4 PCR for marker amplification

3.4.1 PCR primers

PCR was run on the DNA template with primer sets targeting the desired marker regions, ITS and trnL-F. For the ITS marker, primers were designed specifically for liverwort DNA by M. Stech (in-company mentor). Ideally, the chosen primer sets cover the entire marker region. However, degraded DNA can be difficult to amplify, so for initial optimization a smaller section of the large ITS marker, the ITS2 region (see figure 7), was amplified. Because initial optimization concerned DNA extraction and PCR additives addressing contaminants in the DNA, this was an acceptable approach and would justify subsequent use of other marker regions for ITS.

Table 2. Primer sequences for the ITS and trnL-F regions, respectively. The primers were ordered from IDT and are preceded by an M13-tail.

Marker region | Primer | Orientation | Sequence
ITS | Bryo18SF | F | 5’- GGT GAA GTT TTC GGA TCG CG
ITS | Bryo26SR | R | 5’- AGA TTT TCA AGC TGG GCT
ITS2 | 5.8SF-M13 | F | 5’- GCA ACG ATG AAG AAC GCA GC
ITS2 | 25SR-M13 | R | 5’- TCC TCC GCT TAG TGA TAT GC
trnL-F | M13-Cm | F | 5’- CGA AAT TGG TAG ACG CTG CG
trnL-F | M13-Fm | R | 5’- ATT TGA ACT GGT GAC ACG AG

The forward and reverse primers carried a so-called M13-tail at the 5’ end, a universal primer tail that is not complementary to the template DNA but is used as primer for sequencing. The advantage of using a universal primer tail such as M13 is that during sequencing only one forward and one reverse sequencing primer have to be used for different PCR products on the same sequencing plate. This facilitated the high-throughput sequencing of the 96-well PCR products, because otherwise a mirror plate with corresponding primers would have had to be provided.

Table 3. The standard M13-tail sequences, used as sequence primer for Sanger sequencing

Primer-tail Sequence

M13 (F) 5’- TGT AAA ACG ACG GCC AGT
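As an illustration of how a tailed primer is composed, the sketch below prepends the forward M13-tail from table 3 to the forward primers from table 2. It assumes that the sequences in table 2 are the template-specific parts without the tail; the function name is hypothetical. Only the forward tail is used, since that is the one listed above.

```python
# Forward M13-tail as listed in table 3.
M13_FORWARD_TAIL = "TGT AAA ACG ACG GCC AGT"

# Forward primers as listed in table 2 (assumed to exclude the tail).
FORWARD_PRIMERS = {
    "Bryo18SF": "GGT GAA GTT TTC GGA TCG CG",
    "5.8SF-M13": "GCA ACG ATG AAG AAC GCA GC",
    "M13-Cm": "CGA AAT TGG TAG ACG CTG CG",
}


def add_tail(tail, primer):
    """Return the tail followed by the primer, 5' to 3', without spaces."""
    clean = lambda seq: seq.replace(" ", "").upper()
    return clean(tail) + clean(primer)


if __name__ == "__main__":
    for name, seq in FORWARD_PRIMERS.items():
        tailed = add_tail(M13_FORWARD_TAIL, seq)
        print(f"{name}: 5'-{tailed}-3' ({len(tailed)} nt)")
```

Because the tail sits at the 5’ end, the 3’ end of the oligonucleotide remains the template-specific part that anneals during PCR.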


3.4.2 PCR program, PCR reaction mixture and gel electrophoresis

The PCR programmes, designed specifically for the target marker regions [CITATION Gam15 \t \l 1043 ], were adjusted to match the optimal annealing temperature of the primer sets. An annealing temperature of 57 °C was used for ITS and 65 °C for ITS2 (table 4). For the trnL-F marker, which used a two-step PCR program (table 5), the initial annealing temperature was 62 °C and decreased by 1 °C every cycle. During the second step, a constant annealing temperature of 55 °C was applied, while the initial annealing time of 50 seconds increased by 1 second per cycle.

Table 4. The PCR program for amplification of the ITS or ITS2 marker region, using either a Bryo18SF-M13 and Bryo26SR-M13 primer-set, or a 5.8S-Bryo26SR-M13 and 25R-Bryo26SR-M13 primer set.

Step | Process | Temp. (°C) | Time (min:sec)
1 | Initial denaturation | 95 | 05:00
2 | Denaturation | 95 | 00:20
3 | Annealing | 57* / 65** | 00:30
4 | Extension | 72 | 01:00
5 | Go to step 2, 39x | |
6 | Final extension | 72 | 07:00
7 | Pause | 12 | ∞

* The annealing temperature for the primer set used in ITS amplification
** The annealing temperature for the primer set used in ITS2 amplification

Table 5. The PCR program for amplification of the trnL-F marker region with use of a C(M)-M13 and F(M)-M13 primer-set.

Step | Process | Temp. (°C) | Time (min:sec)
1 | Initial denaturation | 94 | 04:00
2 | Denaturation | 94 | 00:30
3 | Annealing | 62 (-1 °C/cycle) | 00:50
4 | Extension | 68 | 01:15
5 | Go to step 2, 10x | |
6 | Denaturation | 94 | 00:30
7 | Annealing | 55 | 00:50 (+1 sec/cycle)
8 | Extension | 68 | 01:20
9 | Go to step 6, 25x | |
10 | Final extension | 70 | 07:00
11 | Pause | 12 | ∞
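The two-step trnL-F program can be written out cycle by cycle. The sketch below lists the annealing temperature and annealing time per cycle, under two assumptions that the table does not settle: that the repeat counts (10 and 25) are the total number of cycles in each phase, and that the first cycle of the second phase anneals for exactly 50 seconds. The function name is hypothetical.

```python
def trnlf_annealing_schedule(touchdown_cycles=10, second_phase_cycles=25):
    """Return a list of (annealing temperature in °C, annealing time in s)."""
    schedule = []
    # Phase 1: start at 62 °C and drop 1 °C per cycle, 50 s each.
    for cycle in range(touchdown_cycles):
        schedule.append((62 - cycle, 50))
    # Phase 2: constant 55 °C, annealing time grows by 1 s per cycle.
    for cycle in range(second_phase_cycles):
        schedule.append((55, 50 + cycle))
    return schedule


if __name__ == "__main__":
    for number, (temp_c, seconds) in enumerate(trnlf_annealing_schedule(), start=1):
        print(f"cycle {number:2d}: anneal at {temp_c} °C for {seconds} s")
```

Under these assumptions the program runs 35 cycles in total, with the last annealing step lasting 74 seconds.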

The basic PCR reaction mixtures for amplification contained a primer set (see table 2), dNTPs (2.5 mM, Lot. No. 157039937, Qiagen), CoralLoad PCR buffer (10x, Lot No. 145030107, Qiagen) and Taq DNA polymerase (5 units/µl, Lot. No. 157047479, Qiagen). See appendix VII: PCR Mixtures and Additives for the compositions of the PCR mixtures for amplification of either the ITS or the trnL-F marker regions.

The results of the PCR reactions were visualised by agarose gel electrophoresis. For general throughput, the samples were run for 40 minutes at 100 V on a 1% agarose gel. The gel with the PCR amplification products was subsequently stained with 1 µg/ml ethidium bromide (EtBr) (10 mg/ml in H2O, Cat. No. E1510, Sigma), which binds to DNA and can be visualised using UV detection. Pictures of the EtBr-stained amplification products were therefore made using Gel-dock UV photography. Because EtBr is a mutagen and a suspected carcinogen, the entire procedure was performed in a fume cabinet while wearing gloves. For small numbers of samples, a gel was poured manually and run on a Horizontal Electrophoresis System (MUPID-exU) with a GeneRuler 1 kb Plus DNA ladder (ThermoFisher). For the analysis of high-throughput samples, E-Gel® 96 Agarose Gels (2% agarose with EtBr, G7008-02, Thermo-Fisher) were used, which were run on an E-Base integrated Power System (Mother E-Base™, Thermo-Fisher). For this high-throughput electrophoresis, no ladder was needed. Refer to appendix IV: E-gel for directions on how to use this system.

3.4.3 PCR additives

Prior to high-throughput PCR, the use of PCR additives in the reaction mixture had to be optimized. The standard PCR additives MgCl2 (25 mM, Lot. No. 157053840, Qiagen) and Q-solution (5x, Lot. No. 157055477, Qiagen) were assessed for their influence on the reaction. Additionally, for the nuclear DNA marker ITS, the effect of the additives betaine (5 M, Lot. No. B0300-1VL, Sigma), DMSO (>99.5%, Lot. No. 45003236, Ultra) and BSA (10 mg/ml, R396A, Promega) was tested. For the chloroplast DNA marker, TBT-PAR (5x, see table 2 in appendix VII: PCR mixtures and additives), an additive that has been found to improve the amplification of older plant material, was tried. To see whether the DNA extracts contained PCR inhibitors, the DNA template was diluted both 5 and 10 times and tested against the undiluted extract. Furthermore, a low PCR yield prompted an increase in the volume of DNA template.

3.5 Analysis of the sequence data

Sanger sequencing was performed by BaseClear BV (Leiden, Netherlands), who used the aforementioned standard M13-tail as sequencing primer. The sequenced strands, both forward and reverse, were retrieved online from the BaseClear Order Portal within 48 hours. See appendix V: Sample sheet preparation and BaseClear for directions concerning sample submission and subsequent downloading of the sequences. Sequence data could then be viewed in the program Geneious (see appendix VI: Geneious), where the quality scores, sequence length and GC content of the sequences were presented and where contigs of the two sequence directions were made. Sequences were aligned in the phylogenetic data editor PhyDE®. Sequences were edited using both Geneious and PhyDE®, which involved assessing the identity of ambiguous bases. Initially, the identity of an ambiguous base was estimated by reviewing the chromatograms produced by the sequencing machines. Later, this could be confirmed when a multiple alignment of the sequences showed that bases at the same position were conserved. If base identity varied, the ambiguity was not corrected. In addition, the extensive sequence tails were removed by cutting them off at the beginning and end of the marker regions.

Bayesian analysis was run using MrBayes software; its principle is explained in paragraph 2.5.2. The parameters were standard and aimed at creating the tree with the highest possible posterior probability. The number of substitution types (Nst) was set to 6, with gamma-distributed rate variation across sites. The coding was set to “variable”, so only variable characters could be sampled, and the data type was set to “mixed”. The MCMCMC analysis was run with 4 chains, with the temperature set to 4, for 10 million generations. The Markov chain was sampled 1000 times, in order to reduce the amount of output, and the number of generations between calculations of MCMC diagnostics was set to 30,000. This determines how often diagnostics such as the acceptance ratios of moves and swaps are printed to file. The Heteroscyphus samples were set as outgroup. MrBayes was run on the science gateway CIPRES to save computational power. The Bayesian analysis resulted in a folder with multiple files, summing up all statistical output to support the trees it created. After analysis, it was checked whether the standard deviation of split frequencies between the runs had dropped below 0.01. This indicates that the trees sampled by the runs had become very similar. The first 25% of the created trees were discarded, because their standard deviation was well above 0.01 and they would not be accurate. The program then presents the phylogenetic tree with the highest posterior probability, which was then stylised in FigTree v1.4.3.
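The settings described above could be collected in a MrBayes command block. The sketch below only assembles such a block as text; it is not the command file used in this research. Several points are assumptions: the sampling setting is read as a sampling frequency of every 1000 generations, the outgroup is given as a single hypothetical taxon label, the 25% discard is expressed as a relative burn-in, and partition-specific settings for the “mixed” data type are left out.

```python
def mrbayes_block(outgroup_taxon, ngen=10_000_000, nchains=4, temp=4,
                  samplefreq=1000, diagnfreq=30_000, burninfrac=0.25):
    """Return a MrBayes command block mirroring the settings in the text."""
    commands = [
        # MrBayes accepts one taxon name or number as outgroup.
        f"outgroup {outgroup_taxon};",
        # Six substitution types, gamma rates, variable-only coding.
        "lset nst=6 rates=gamma coding=variable;",
        # Chain settings; samplefreq is an assumed reading of the text.
        f"mcmc ngen={ngen} nchains={nchains} temp={temp} "
        f"samplefreq={samplefreq} diagnfreq={diagnfreq};",
        # Discard the first fraction of sampled trees when summarising.
        f"sumt relburnin=yes burninfrac={burninfrac};",
    ]
    body = "\n".join("  " + command for command in commands)
    return f"begin mrbayes;\n{body}\nend;\n"


if __name__ == "__main__":
    # "Hcoalitus_01" is a hypothetical taxon label.
    print(mrbayes_block("Hcoalitus_01"))
```

Such a block would normally be appended to the NEXUS alignment file before submitting the job.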


4 Results

After subsampling the targeted liverwort taxa Heteroscyphus (H.) coalitus and Bazzania (B.) species, it was confirmed that it was justifiable to use column-based DNA extraction over DNA extraction with CTAB-lysis. Also, the influence of PCR additives on PCR amplification of the trnL-F and ITS2 marker regions was analysed, after which high-throughput DNA extraction and amplification was performed. Successful PCR products were Sanger sequenced by BaseClear, after which the sequences were edited and aligned using the phylogenetic data editors Geneious and PhyDE®. Multiple sequence alignments of either genetic marker region, or combinations of both, were assessed for their phylogenetic relationships by running a Bayesian Inference analysis. This analysis was run using MrBayes software, incorporating an MCMCMC-algorithm, and resulted in phylogenetic trees that divided the liverwort data according to their molecular relationships.

4.1 Subsampling

The morphology of the targeted liverwort taxa was recorded during subsampling and cleaning (figure 10). The morphological difference between H. coalitus and the Bazzania species can be seen. Also, the morphology of Bazzania sp. stands out, as it differs from that of the other Bazzania species.

Figure 10. Examples of Bazzania tridens (1), Bazzania japonica (2) detail (3), Bazzania sp. (4), and Heteroscyphus coalitus (5), details (6,7). Pictures taken by S. Vergeer.

4.2 Comparing DNA extraction methods

To justify the use of kit-based, column DNA extraction over DNA extraction with CTAB-lysis, the DNA of various liverwort samples was extracted using both methods and the trnL-F marker region was amplified. The results of subsequent agarose (2%) gel electrophoresis are shown in figure 11. The samples in slots 5 and 3 of the column-extracted samples were also loaded in slots 7 and 8, respectively, of the CTAB-extracted samples and served as positive and negative control.


Figure 11. The results of the PCR product after DNA extraction with both column (left) and CTAB (right) extraction. Sample 5 and 3 of the column extraction method were used as positive and negative control during PCR of the samples of CTAB extraction and are loaded on position 7 and 8. A GeneRuler 1 kb Plus DNA Ladder (ThermoFisher) was used as ladder. Bands were visualised using an EtBr solution (1 µg/ml) with subsequent UV-detection.

For column DNA extraction, slots 1, 5 and 6 gave positive PCR results; for DNA extraction with CTAB-lysis, slots 2, 5 and 6 were positive.

4.3 PCR mixture optimization

For the optimization of the PCR reaction mixture, amplification of ITS2 and trnL-F was attempted using multiple combinations of additives in addition to the basic mix of milliQ, buffer, primers, dNTPs and Taq polymerase; see tables 1, 2, 3 and 4 in appendix VII: PCR Mixtures and Additives. Figures 12 and 13 show how various PCR mixtures enabled the amplification of both marker regions.

Of a H. coalitus and a B. tridens sample the ITS2 region was amplified in 8 different PCR master mixes (table 1 and 3 in appendix VII: PCR Mixtures and Additives). Figure 12 shows that the ITS2 marker region of this B. tridens sample could be amplified regardless of the PCR master mix. However, the ITS2 region of this H. coalitus could only be amplified with mix 8; the basic mix without additives (highlighted with dashed lines).

Figure 12. The effect of various reaction mixtures on the amplification of the ITS2 marker regions of a Heteroscyphus coalitus and a Bazzania tridens DNA extract. See table 1 and 3 (appendix VII: PCR Mixtures and Additives) for the composition of the PCR mixes. A GeneRuler 1 kb Plus DNA Ladder (ThermoFisher) was used as ladder. Bands were visualised using an EtBr solution (1 µg/ml) with subsequent UV-detection. Highlighted with dashed lines is the basic PCR mix without additives.


The trnL-F marker of an H. coalitus sample and three B. tridens samples was amplified using six different PCR master mixes (tables 1 and 4 in appendix VII: PCR Mixtures and Additives). Judging by the intensity of the bands (figure 13), it was not evident whether amplification without additives (highlighted with dashed lines) yielded more product, as was seen in figure 12. What the figure does illustrate is that amplification is not always successful, even for samples that are supposedly the same species.

Figure 13. The effect of various reaction mixtures on the amplification of the trnL-F marker regions of a Heteroscyphus coalitus and some Bazzania tridens DNA extracts. See table 1 and 4 (appendix VII: PCR Mixtures and Additives) for the composition of the PCR mixes. A GeneRuler 1 kb Plus DNA Ladder (ThermoFisher) was used as ladder. Bands were visualised using an EtBr solution (1 µg/ml) with subsequent UV-detection. Highlighted with dashed lines is the basic PCR mix without additives.

4.4 High throughput DNA extraction, PCR amplification and Sanger sequencing

After the use of column-based DNA extraction was justified, a full 96-well plate was loaded with liverwort samples, which were subjected to high-throughput DNA extraction in the KF extraction robot. Subsequent PCR amplification of the ITS or the trnL-F marker region was done with basic PCR mixtures (see 3.4.2). Because initial amplification resulted in a low yield, the DNA template was diluted (5x) to lessen the effect of inhibitory factors.

4.4.1 High-throughput PCR amplification; success rate

The results of the amplifications were visualized using E-gel electrophoresis. Of the 85 Bazzania samples, 79 trnL-F regions and 63 ITS regions were amplified. This is a PCR success rate of 93% for trnL-F and 74% for ITS. Of the 16 H. coalitus samples, 6 trnL-F regions and 2 ITS regions were amplified. This is a PCR success rate of 38% for trnL-F and 13% for ITS. See the results and calculations in appendix VII: High-throughput PCR amplification results.
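The success rates follow directly from the counts above. A minimal sketch of the arithmetic, rounding to whole percentages (half rounds up, which is an assumption about how the report rounded); the function name is hypothetical:

```python
def success_rate(amplified, total):
    """Return the PCR success rate as a whole percentage, half rounded up."""
    return int(100 * amplified / total + 0.5)


# Counts taken from the text: (amplified, total) per taxon and marker.
COUNTS = {
    ("Bazzania", "trnL-F"): (79, 85),
    ("Bazzania", "ITS"): (63, 85),
    ("H. coalitus", "trnL-F"): (6, 16),
    ("H. coalitus", "ITS"): (2, 16),
}

if __name__ == "__main__":
    for (taxon, marker), (amplified, total) in COUNTS.items():
        print(f"{taxon} {marker}: {success_rate(amplified, total)}%")
```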


4.4.2 Sequence products; marker sequence length and GC-content

After Sanger sequencing of the positive PCR products of the liverwort DNA extracts, followed by ambiguity editing and removal of the extensive sequence tails in Geneious and PhyDE®, 72 high-quality trnL-F sequences and 54 high-quality ITS sequences remained for the Bazzania samples. Of these sequences, 49 came from Bazzania samples for which both markers were available, and these were used for a combined phylogenetic analysis. For H. coalitus, there were 6 trnL-F sequences and one ITS sequence of high quality. From here on, the H. coalitus samples were used as outgroup during Bayesian analysis and were no longer considered separately for their molecular phylogenies.

For Bazzania samples, the sequence length of the trnL-F marker region varied from 460 to 491 bp., with an average GC content of 33.7%. For ITS, sequence length varied between 859 and 991 bp., with an average GC content of 57.8%.
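GC content is the percentage of bases in a sequence that are G or C. In this research it was reported by Geneious; the sketch below only illustrates the calculation, using the Bryo18SF primer from table 2 as a stand-in sequence. The function name is hypothetical.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases in a DNA sequence."""
    bases = sequence.replace(" ", "").upper()
    if not bases:
        raise ValueError("empty sequence")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100 * gc_count / len(bases)


if __name__ == "__main__":
    # Bryo18SF primer from table 2, used here only as example input.
    print(gc_content("GGT GAA GTT TTC GGA TCG CG"))
```

An average such as the 33.7% reported for trnL-F would be the mean of this value over all sequences of that marker.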

4.5 Phylogenetic relationships

4.5.1 Bayesian Inference; aiding in taxonomy

Figure 14 is a phylogenetic tree, made using Bayesian analysis incorporating the MCMCMC-algorithm, of the Bazzania samples for which both markers could be sequenced. The numbers on the branches indicate that the branch has enough statistical support to assume that its division is correct. Branches with no number are hence not supported by statistical evidence. The results of the same analysis, but with only the ITS marker, only the trnL-F marker or all these alignments combined, are attached in appendix IX: Results of Bayesian Inference Phylogenics. After this phylogenetic analysis, samples that were as yet unidentified could be classified by considering their position in the tree and revising their morphological features. E. Iskandar (PhD student, personal communication) concluded that some B. japonica samples that were placed in a separate clade (clade II in figure 14) should be re-classified as B. javanica. Also, the previously unidentified Bazzania sp. samples in clade I were found to be B. vittata, while other unidentified Bazzania sp. and Bazzania sp.2. samples were sorted into the clades of either B. tridens or B. japonica.

4.5.2 The molecular relationships of the Bazzania species

Referring to figure 14, the organization of the clades suggests that B. japonica is sister to the three remaining clades: B. tridens, B. javanica and B. vittata. The latter two are in turn sister to B. tridens. In appendix IX: Results of Bayesian Inference Phylogenics, figure 1 is the phylogenetic tree created with all available sequences of the marker regions, regardless of whether a sample had only one or both markers sequenced. Figure 2 in this appendix is the phylogenetic tree created from solely the ITS sequence data, and figure 3 the tree created from exclusively the trnL-F marker sequence data. This last tree, however, has a different clade division; there, B. vittata is related to the other three clades, with B. javanica as sister clade to both B. tridens and B. japonica. Besides this, there is no incongruence between the


Figure 14. The tree as result of a Bayesian analysis of liverwort samples for which both the trnL-F and ITS marker regions could be amplified and sequenced. A total of 49 sequences was used to create this tree, which was stylised in FigTree after analysis with MrBayes software incorporating the MCMCMC-algorithm. The numbers on the branches indicate that the branch has enough statistical support (posterior probability) to assume that its division is correct. Branches with no number have a posterior probability below 95% and are hence deemed unreliable. Abbreviations; Bs: Bazzania sp., Bs2: Bazzania sp.2., Bt: Bazzania tridens, Bj: Bazzania japonica.


4.5.3 Distribution of Bazzania species over altitude

After the phylogenetic separation of the Bazzania species (figure 14) and the subsequent morphological re-identification by E. Iskandar, the original altitudinal distribution of the Bazzania samples on Mount Gede (table 1) could be redefined (table 6).

Table 6. The taxonomy of the analysed Bazzania species, per altitude (m a.s.l.). Taxonomy was confirmed with morpho-molecular phylogenetic research, combining morphological taxonomy with DNA barcoding of the trnL-F and ITS marker regions and subsequent Bayesian Inference phylogenies. See table 1 (Research Design) for the initial morphological analysis and the total species collection. The PCR success rate was 93% for trnL-F marker region amplification and 74% for ITS, resulting in a combined-marker phylogeny of 47 samples.

Name species Altitude (m a.s.l.)

Total #samples ±1900 ±2000 ±2100 ±2300 ±2500 ±2600 ±2700 ±2800 B. japonica 1 2 6 9 6 5 29 B. tridens 2 4 4 10 B. vittata 2 5 7 B. javanica 1 1

With the current taxonomical results, the following plot (figure 15) can be created. It compares altitude to the number of Bazzania species that were collected there. Most species were found around 2300-2500 m a.s.l., whereas at altitudes above or below that range only some appear to occur.

[Figure 15: plot titled “The amount of species collected per altitude”. x-axis: Altitude (m a.s.l.), 1900 to 2800; y-axis: Number of species, 0 to 4. Values per altitude (1900, 2000, 2100, 2300, 2500, 2600, 2700, 2800): 1, 1, 1, 3, 3, 1, 1, 1.]

Figure 15. The sum of Bazzania species collected on Mount Gede, per altitude (in m a.s.l.). The data labels represent the amount of samples that were taken at that altitude.

4.5.4 Genotypic variation in the B. tridens species, and its relation to altitude

Considering intraspecific variation in figure 14, the B. japonica clade has the largest amount of data, but shows only one clade that is supported with maximum posterior probability. In the B. vittata clade, too, only a single clade is supported. When focusing on the intraspecific variation of the B. tridens clade (figure 14), more prominent intraspecific variability is visible, with two sub-divisions within the clade that have maximum posterior probability. One of those sub-clades contains mainly samples from lower altitudes and the other consists largely of samples from higher altitudes, with some overlap in altitude.

The B. tridens samples are plotted against their altitudinal point of collection to visualise the distribution of the samples over the gradient of Mount Gede, figure 16.

[Figure 16: plot titled “The altitudinal distribution of B. tridens”. x-axis: Sample; y-axis: Altitude (m a.s.l.), 1500 to 2500; series: B. tridens, sub-clade I and B. tridens, sub-clade II. Data labels: 2094, 2104, 2289, 2301, 2301, 2306.]

Figure 16. The distribution of B. tridens samples according to their division in the phylogenetic tree of figure 14 and the altitudinal point of collection, in m a.s.l.


5 Discussion

The optimization of DNA extraction and amplification yielded useful information that aided the high-throughput processing of the samples. As there was no discernible advantage of DNA extraction with CTAB lysis over kit-based extraction, it was concluded that automatic DNA extraction with the KingFisher extraction robot was justified. If DNA extraction with CTAB had proved to be considerably more effective, high-throughput processing would not have been possible and the project would have been delayed considerably. The effect of the various PCR additives, which were thought to aid the amplification of the liverwort DNA, was surprisingly negative. Amplification was not clearly improved, and was even impeded in some cases. This led to the use of only a basic PCR mix, which was cost- and time-efficient.

Both manual and high-throughput DNA extraction and marker amplification of the Heteroscyphus coalitus samples were not very successful: only five trnL-F and two ITS sequences were obtained. Therefore, these samples were not analysed separately but used as outgroup during Bayesian analysis of the Bazzania samples, which could be amplified and sequenced more easily. The analysis shows that the Heteroscyphus samples are indeed not directly related to the various Bazzania species, while the kinship of these genera is evident from their close affiliation. Something that might explain the difficulty of amplifying certain liverwort samples is the presence of oil bodies, organelles containing secondary compounds that are uniquely found in liverworts [ CITATION Tou16 \l 1043 ],[ CITATION HeX13 \l 1043 ]. These often vary in chemical composition [CITATION Asa13 \l 1043 ], [ CITATION Asa17 \l 1043 ] and might pose an obstacle for amplification. This hypothesis is supported by the fact that diluting the DNA extract increased the PCR success rate. By diluting the DNA that is extracted from (plant) material, contaminants and other chemical compounds are diluted accordingly and hence interfere less with PCR amplification.

PCR success rates for the ITS region of Bazzania (74%) and H. coalitus samples (13%) are lower than the success rates for the trnL-F region of Bazzania (93%) and H. coalitus (38%) samples. This is due to the larger sequence length of ITS (859-991 bp.) and its higher GC-content (57.8%) compared to trnL-F (460-491 bp., GC-content: 33.7%). The amplification of the ITS region can normally be improved by adding betaine to the PCR reaction mixture [ CITATION Hen97 \l 1043 ]. Also, the ITS1 and ITS2 regions could be amplified separately to decrease the sequence length. However, this did not clearly increase the PCR success for either H. coalitus or B. tridens. Therefore, the ITS region might not be a good candidate region for single-marker, intra-species level phylogeny [ CITATION Fel07 \l 1043 ].

Considering that the phylogenetic analyses of the separate markers did not show incongruence regarding the samples’ phylogenetic relationships, combining the marker sets was allowed. Therefore, the statistical support of the clade separation and the objective of the research can both be considered when deciding which marker region or combination of regions is most applicable. If the objective is to study intraspecific variation, either the ITS marker alone or a combination of the two marker sets will give the highest statistical support and the most intraspecific variability. However, the number of samples for which both marker regions can be amplified is lower, due to the difficulty of amplifying the complete ITS region. If statistical support is of less importance, or if the aim is to perform molecular analysis using only one genetic marker region, the trnL-F marker region alone can be used instead, in order to increase the number of data points. A combination of genetic markers is recommended as the best way to guarantee accuracy [ CITATION Ste13 \l 1043 ], [ CITATION Bąc17 \l 1043 ], although others found that the trnL-F region alone can be used to differentiate between closely related liverwort species [ CITATION Fie98 \l 1043 ] and to differentiate genetic variability [ CITATION YLi10 \l 1043 ].


After reviewing the phylogeny of the liverwort samples, the morphological taxonomy of B. japonica and B. tridens proved, for the larger part, correct. However, molecular analysis of the unidentified Bazzania samples proved necessary, as morphological identification alone was too inaccurate. The samples classified as Bazzania sp. and Bazzania sp.2 could thus be assigned to the clade with which they had the highest molecular similarity, and other Bazzania spp. and some B. japonica samples were morphologically redefined as two different species, B. vittata and B. javanica respectively. The redefinition was done by E. Iskandar (personal communication), by re-examining the morphology of the samples and confirming their identity. Various researchers studying bryophyte species use this morpho-molecular strategy to provide accurate species delimitation [ CITATION Pat17 \l 1043 ], [ CITATION San16 \l 1043 ], [ CITATION Bri17 \l 1043 ], [ CITATION Dir16 \l 1043 ], [ CITATION Dra15 \l 1043 ], [CITATION TijdelijkeAanduiding2 \l 1043 ].

After definitive species identification, the distribution of the samples over the altitudinal transect could be reviewed. The number of different species per altitude resembles a hump-shaped pattern, which indicates that some Bazzania species do not grow in the lower or higher zones of Mount Gede. However, the current data contain no information on biomass or species diversity, as they merely represent the distribution of the samples that were targeted and collected during this research. Moreover, hump-shaped patterns are usually based on a more extensive data set, in which total biodiversity is plotted against altitude [ CITATION Ree83 \l 1043 ], [ CITATION Gra921 \l 1043 ], [CITATION Fra02 \t \l 1043 ], [CITATION Gra07 \l 1043 ], [CITATION AhP12 \l 1043 ], [CITATION Den15 \l 1043 ].
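As an illustration of how the number of species per altitude could be tabulated, the following is a minimal Python sketch. The function name `species_per_band`, the 200 m band width and the example records are assumptions made for this illustration only; they are not the sampling data or the procedure of this research.

```python
from collections import defaultdict


def species_per_band(records, band_width=200):
    """Count distinct species per altitude band.

    records: iterable of (species_name, altitude_in_metres) tuples.
    Returns a dict mapping the lower bound of each band (in metres)
    to the number of distinct species recorded in that band.
    """
    bands = defaultdict(set)
    for species, altitude in records:
        # Integer division assigns each sample to the band it falls in.
        lower_bound = (altitude // band_width) * band_width
        bands[lower_bound].add(species)
    return {lower: len(names) for lower, names in sorted(bands.items())}


# Hypothetical records, invented for illustration (not the collected samples).
example_records = [
    ("B. japonica", 1450),
    ("B. japonica", 1720),
    ("B. tridens", 1650),
    ("B. vittata", 1780),
    ("B. tridens", 1900),
    ("B. javanica", 1850),
    ("B. japonica", 2100),
]

richness = species_per_band(example_records)
for lower, count in richness.items():
    print(f"{lower}-{lower + 200} m: {count} species")
```

Such a tally only reflects the samples that were collected; as noted above, it does not replace a full biodiversity survey.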

Genetic variation within species differed among the clades. Almost no genetic variation was found in the maximally supported B. japonica clade, even though this species had the largest data pool. In the B. tridens clade, on the other hand, two sub-clades were separated with maximal posterior probability, confirming intraspecific variability. Unfortunately, the B. vittata and B. javanica clades did not contain enough samples to assess intraspecific variation with confidence.
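One simple way to put a number on the variation within a clade is the mean uncorrected pairwise distance (p-distance) between aligned sequences. The Python sketch below illustrates that measure; it is not the Bayesian analysis used in this research, and the function names and the handling of gaps and ambiguous bases are assumptions chosen for the example.

```python
from itertools import combinations

# Characters treated as missing data and skipped (an assumption of this sketch).
MISSING = set("-N?")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in MISSING or base_b in MISSING:
            continue  # ignore sites with gaps or ambiguous bases
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_distance(sequences):
    """Mean p-distance over all pairs of sequences in one clade."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

A value close to zero for a clade with many samples would correspond to the situation described for B. japonica, whereas clades with only a few samples give too few pairs for a reliable estimate.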

When comparing the genotypes within the B. tridens clade, samples from one sub-clade were collected at lower altitudes, while the other sub-clade contained samples from higher altitudes. The two overlapped in the middle region, where samples from both sub-clades co-occurred at the same altitude. This result can lead to two different conclusions: one confirming the influence of altitude on genotypic variation, the other suggesting that the two sub-clades are actually two Bazzania species. To test for a cryptic species, the morphology of the two sub-clades should be compared again. In any case, it is evident that environmental conditions somehow influence the genotype distribution of this Bazzania species. Researchers studying the diversity of bryophytes in montane areas observed a possible influence of altitude on genotypic traits [CITATION Sim15 \l 1043 ], [ CITATION Mac12 \l 1043 ]. However, the latter source was a morphological study and did not confirm that this variation was genotypic.
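The altitudinal overlap between two sub-clades can be described as the intersection of their altitude ranges. The following Python sketch shows this under the assumption that each sub-clade is represented by a list of sampling altitudes in metres; the function name and the input format are hypothetical.

```python
def altitude_overlap(altitudes_a, altitudes_b):
    """Return the (low, high) altitude interval shared by two groups.

    Each argument is a non-empty list of sampling altitudes in metres.
    Returns None when the two altitude ranges do not overlap.
    """
    low = max(min(altitudes_a), min(altitudes_b))
    high = min(max(altitudes_a), max(altitudes_b))
    return (low, high) if low <= high else None
```

A range comparison like this only describes where both sub-clades were sampled; it does not by itself distinguish an altitude effect from a cryptic species.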

To confirm whether environmental conditions impair the growth of certain Bazzania species, the relatively new technique of using airborne pollen for molecular taxonomic identification could be applied [ CITATION Leo17 \l 1043 ]. The technique uses DNA barcoding to taxonomically identify the pollen in air samples and relates it to the surrounding biodiversity. For bryophytes, the technique needs to be adjusted to analyse spores instead of pollen. It would provide conclusive answers to the question whether ecology prevents growth, because it allows the evaluation of the spores’ ability to disperse and establish: if spores of a species are detected in the air but the species is not observed in the surrounding biodiversity, the environmental conditions (including altitude) impede growth. Also, intraspecific variability might be more pronounced when the altitudinal gradient is longer or when the sample pool is larger. Therefore, subsequent analyses should focus on patterns of intraspecific diversity in B. tridens in other tropical montane regions with either higher elevations or a longer altitudinal range.


Other techniques that might complement the current morpho-molecular research and enhance the analysis of intraspecific variability are DNA fingerprinting techniques. These might improve the assessment of B. japonica, because standard sequence markers show some, but not enough, intraspecific variation. Multiple methods exist, including RAPD (random amplified polymorphic DNA), RFLP (restriction fragment length polymorphism) and AFLP (amplified fragment length polymorphism), the latter having previously been used together with DNA sequence data [CITATION STE10 \t \l 1043 ], [CITATION PVo95 \l 1043 ]. Next-generation sequencing (NGS) fingerprinting of plants has become a useful tool to analyse multiple loci of an organism’s genome, including autosomes, mitochondrial DNA and sex chromosomes [ CITATION Yan14 \l 1043 ]. Besides these techniques, additional marker regions could be evaluated for their contribution to the statistical significance of clade division.
