DNA-based Non-invasive Prenatal Paternity Testing in Rape Cases

Academic year: 2021

INSTITUTE FOR INTERDISCIPLINARY STUDIES

FACULTY OF SCIENCE

DNA-based Non-invasive Prenatal Paternity Testing in Rape Cases

A systematic review of current scientific knowledge and recommendations for future research.

Heleen Coreelman

Student ID 12284815

Literature Thesis

MASTER FORENSIC SCIENCE

Supervisor: prof. dr. A.D. (Ate) Kloosterman

Co-assessor: mw. dr. P.J. (Pernette) Verschure

Academic year 2019-2020

Word count: 8820

Abstract

According to the Central Bureau of Statistics, 1900 rapes were reported in the Netherlands in 2018. The Centre of Sexual Violence reports that the victim is female in 90% of rape cases and that about 7% of these women become pregnant after being raped. In case of pregnancy, it can be uncertain whether the pregnancy resulted from rape or from consensual intercourse. In such cases, non-invasive prenatal paternity testing (NIPAT) in early pregnancy is desired and can be valuable for the criminal investigation. Recently, researchers have started to focus on single nucleotide polymorphism (SNP) analysis for paternity determination. Therefore, the following research question is investigated: 'Can SNP genotyping be used for NIPAT?'.

This review summarises the current state-of-the-art of SNP-based NIPAT and its challenges. It involves several aspects: (1) cell-free foetal DNA (cffDNA) in maternal plasma, (2) the application of SNP genotyping for NIPAT, and (3) relevant considerations including the difference between male and female foetuses, the absence of a forensic SNP database, and the running costs for and accessibility of SNP-based NIPAT.

The fraction of cffDNA in maternal plasma appears to be influenced by biological factors (i.e. gestational age and maternal body mass index or weight) and pre-analytical factors (i.e. extraction and cffDNA enrichment methods). Massive parallel sequencing approaches seem to be the most promising methods for applying SNP genotyping to NIPAT. To overcome the main challenge of differentiating cffDNA from the background of highly homologous cell-free maternal DNA, researchers have suggested analysing effective autosomal SNP loci. These are SNP loci for which the mother is homozygous, which allows paternally inherited foetal alleles to be inferred. Although it has been shown that SNP genotyping can be used for NIPAT with high evidential values from gestational week six onwards, further research is needed. Recommendations include, for example, setting up a forensic SNP database, increasing discriminatory power using multi-allelic SNPs, developing statistical models that consider stochastic effects, and performing large-scale trials. The outcomes of this future research are essential for the development of a widely accepted and validated SNP-based NIPAT approach that can be executed in early pregnancy.

Keywords: Cell-free foetal DNA (cffDNA); Non-invasive prenatal paternity testing (NIPAT); First trimester

Contents

Abstract
Contents
1. Introduction
1.1. Research question
2. SNP-based NIPAT
2.1. CffDNA in maternal plasma
2.1.1. Foetal fraction of cell-free DNA
2.1.2. Cell-free DNA extraction from maternal plasma and cffDNA enrichment
2.2. Application of SNP genotyping for NIPAT
2.2.1. The evolution towards massive parallel sequencing-based methods for SNP-based NIPAT
2.2.2. Future outlooks for massive parallel sequencing-based NIPAT studies
2.3. Several relevant considerations
2.3.1. Male versus female foetus
2.3.2. Absence of a forensic database containing SNP profiles
2.3.3. Running costs for and accessibility of SNP-based NIPAT
3. Conclusion and perspectives
3.1. Management summary for chain partners and policy makers
References

1. Introduction

In the Netherlands, the Central Bureau of Statistics reported 1535, 1760 and 1900 registered rapes in 2016, 2017 and 2018, respectively [1]. According to a factsheet published in 2013 by the Centre of Sexual Violence, 90% of rape victims are female and 7% become pregnant after being raped [2]. In case of pregnancy, it may be important for the victim to know whether the biological father of the foetus is the rapist or a partner. Forensic deoxyribonucleic acid (DNA) paternity testing is often used in rape casework because it may answer this question. DNA-based paternity testing involves the examination of DNA profiles to determine whether two individuals have a biological parent-child relationship. Although DNA-based paternity testing is essential for criminal investigations, where it may enable the criminal justice system to identify the biological father of the foetus, the method can also be valuable in civil lawsuits in which the parentage of a child is at issue.

After being raped, pregnant women may choose to have an abortion. DNA-based paternity testing on aborted foetal material can be performed to identify the biological father of the aborted foetus. However, this means that parentage is determined only after the abortion, so the decision to terminate cannot take into account whether the pregnancy resulted from rape or from consensual intercourse. DNA-based prenatal paternity testing methods, in contrast, may help the victim decide whether or not to continue the pregnancy. In the Netherlands, abortion is allowed up to the 24th gestational week (GW) [3].

In essence, prenatal paternity testing refers to paternity testing performed before the baby is born, and sampling techniques can generally be subdivided into two categories:

(1) Invasive methods. At present, it is possible to determine the biological father of a foetus through prenatal paternity testing methods that rely on invasive procedures. The two commonly used techniques are chorionic villus sampling (CVS) and amniocentesis (AC) [4]. With CVS, a needle is used to obtain a sample of placental tissue, whereby the needle is inserted either transabdominally (i.e. through the abdomen) or transcervically (i.e. vaginally through the cervix). With AC, a needle is inserted through the pregnant woman’s abdominal wall into the uterus to withdraw a sample of amniotic fluid. However, invasive prenatal paternity testing methods have several disadvantages. These include the risk of foetal loss [5], as well as the late sampling inherent to these methods [4]. Specifically, CVS is preferably performed between GWs 10-13, and AC is preferably performed between GWs 15-18.

(2) Non-invasive method. Developments in genetic testing have made it possible to determine the biological father of a foetus through non-invasive prenatal paternity testing techniques. In 1997, Lo and colleagues [6] reported the existence of circulating cell-free foetal DNA (cffDNA) in the plasma of pregnant women. Circulating cffDNA is extracellular foetal DNA that circulates in the maternal plasma, allowing its sampling through venous whole blood collection from the mother (Figure 1). If a woman wants to decide whether or not to terminate the pregnancy by means of an abortion, such as in rape cases, non-invasive and early prenatal paternity testing is preferred. Therefore, the discovery of circulating cffDNA in maternal plasma has been a key step in the development of non-invasive prenatal paternity testing (NIPAT) (Figure 1). The advantages of NIPAT, and thus of using cffDNA, are the non-invasiveness of the test and early sampling [7], which reduce the physical and emotional stress on pregnant women. Specifically, cffDNA can be detected from GW6 onwards. Another advantage is the rapid turnover of cffDNA, allowing the investigation of events at multiple time points during pregnancy [8]. Since cffDNA is completely cleared from maternal plasma within one day after delivery, interference from a previous pregnancy is improbable when performing NIPAT. Because of these advantages, this review focuses only on NIPAT.

Figure 1 | The presence of circulating cffDNA in the pregnant woman’s blood circulation allows NIPAT. Circulating cffDNA in the maternal circulation primarily originates from the placenta. A sample for NIPAT is obtained by drawing venous whole blood from the pregnant woman. This maternal blood sample contains circulating cffDNA that is a key factor to perform parentage testing. DNA: deoxyribonucleic acid; cffDNA: cell-free foetal DNA; NIPAT: non-invasive prenatal paternity testing. Figure partly adapted from [9].

A commonly used tool for forensic DNA typing uses widely accepted and validated short tandem repeats (STRs) as genetic markers [10]. For NIPAT, however, it is difficult to amplify enough informative loci using conventional STR genotyping because cffDNA is largely fragmented and most fragments are 140-200 base pairs in length [11,12], whereas conventional STR genotyping assays amplify fragments of about 100-400 base pairs in length [13]. Another complication may arise from the excess of cell-free maternal DNA, which can mask the foetus’ STR profile [14]. Specifically, stutter peaks are likely to be present in an electropherogram due to the polymerase chain reaction (PCR) STR amplification process, and it may be hard to discriminate these PCR artefacts from paternally inherited foetal alleles. Recent advances in single nucleotide polymorphism (SNP) research have led to the proposal of this marker type as a valuable alternative to STRs for establishing paternity prenatally [15-22]. SNPs are single base variations in the genomic DNA in which the least frequent sequence alternative (or allele) has an abundance of 1% or greater [23]. SNPs are more suitable for cffDNA because these markers can be amplified from shorter DNA fragments. Moreover, compared to STRs, SNPs have a much higher mutational stability, which makes them very suitable for human identification [24]. Specifically, the mutation rate of STRs can reach the order of magnitude of 10⁻³, while that of SNPs is in the order of approximately 10⁻⁸.

Previously, research has been carried out in order to gain a better insight into the use of SNPs for NIPAT [15-22]. Despite the various studies, no in-depth review covering the use of SNPs for NIPAT has been published yet. Therefore, this thesis aims to provide a state-of-the-art overview of current scientific knowledge regarding SNP analysis for NIPAT and give recommendations for future scientific research.

1.1. Research question

Specifically, the main research question of this study is: 'Can SNP genotyping be used for NIPAT?'. The outline of this thesis is as follows. The review begins with a general introduction to the aspects of SNP-based NIPAT that will be discussed. It then gives a detailed description of cffDNA in maternal plasma. Next, the methods for SNP genotyping of cffDNA and the paternity determination results obtained in the different studies are discussed. Subsequently, several relevant considerations are highlighted: the difference between a male and female foetus, the absence of a forensic DNA database containing SNP profiles, and the running costs for and accessibility of SNP-based NIPAT. Finally, a conclusion with regard to the research question is provided and recommendations for future research are outlined.

2. SNP-based NIPAT

Establishing a well-accepted approach for SNP-based NIPAT poses challenges due to its various facets that need to be taken into account. Figure 2 reveals the general workflow of SNP-based NIPAT. Since cffDNA is found in the blood circulation of pregnant women, the isolation of a plasma sample is the first step for SNP-based NIPAT. Subsequent DNA extraction and possibly cffDNA enrichment need to be performed. Then, SNP genotyping is required, followed by data analysis for paternity determination. Figure 2 also indicates the three major aspects regarding SNP-based NIPAT methods that this thesis will cover: (1) cffDNA in maternal plasma, (2) the application of SNP genotyping for NIPAT, and (3) several relevant considerations. The presence of cffDNA in maternal plasma forms the basis for sample collection, as well as DNA extraction and cffDNA enrichment. The application of SNP genotyping for NIPAT forms the basis for SNP genotyping and data analysis to determine paternity. The relevant considerations are general remarks that include the difference between a male and female foetus, the absence of a forensic database containing SNP profiles, as well as the running costs for and accessibility of SNP-based NIPAT.

The subsequent sections give a review of the qualitative and quantitative information available in the literature regarding the three facets of SNP-based NIPAT.

Figure 2 | Schematic representation of the general workflow of SNP-based NIPAT and the aspects that the literature thesis will discuss. A plasma sample of a pregnant woman needs to be collected, followed by DNA extraction and possibly cffDNA enrichment. Finally, SNP genotyping and data analysis for parentage determination are performed. The three facets that the thesis will cover are cffDNA in maternal plasma, the application of SNP genotyping for NIPAT, and several relevant considerations. SNP: single nucleotide polymorphism; NIPAT: non-invasive prenatal paternity testing; DNA: deoxyribonucleic acid; cffDNA: cell-free foetal DNA.

2.1. CffDNA in maternal plasma

Due to the process of trophoblastic apoptosis, the main source of cffDNA in the maternal circulation during pregnancy is the placenta [25]. Trophoblasts are cells that contribute to placenta formation. Their turnover leads to apoptosis, which is a form of programmed cell death that is important for normal development of the placenta [26]. Apoptosis is characterised by morphological changes, such as membrane blebbing, nuclear condensation, and DNA fragmentation. Without initiating any inflammatory response in the mother, apoptosis ultimately causes cffDNA release in the maternal circulation. Another proposed source of circulating cffDNA is foetal haematopoietic cells that undergo apoptosis in the maternal plasma [25]. However, as a result of their relatively low concentration, foetal haematopoietic cells cannot account for the majority of circulating cffDNA in maternal plasma. A low cffDNA fraction, also called foetal fraction, in maternal plasma can present challenges for subsequent SNP genotyping and data analysis, thereby leading to a higher test failure rate [27]. In the following subsections a detailed description of foetal fraction of cell-free DNA is provided, with a focus on its biological influence factors. In addition, appropriate methods for extracting cell-free DNA from maternal plasma and for enriching the foetal fraction are described.

2.1.1. Foetal fraction of cell-free DNA

Foetal fraction is defined as the proportion of the total cell-free DNA in maternal plasma that originates from the foetus. Circulating cffDNA represents the minor fraction of plasma cell-free DNA. As already mentioned, accurate foetal fraction estimation is important because sufficient circulating cffDNA in maternal plasma is needed to obtain reliable and accurate SNP-based NIPAT outcomes [27]. Previous studies have reported on various biological factors that affect the foetal fraction [21,22,28-31]. These biological factors lead to a high variability in the concentration of circulating cffDNA in the maternal plasma of different pregnant women. Identifying these factors may aid in the development of improved procedures for SNP typing of cffDNA.

Influence of gestational age. Lo and colleagues [32] developed a real-time PCR assay to quantify cffDNA in maternal plasma. Plasma samples of 25 early and 25 late pregnancies were taken from pregnant women who were in GWs 11-17 and 37-43, respectively. Thirteen early and 14 late pregnancy samples were from women carrying male foetuses. The β-globin and SRY concentrations of this subset were used as measures of the total cell-free DNA and cffDNA amounts, respectively. The data revealed mean cffDNA concentrations corresponding to 3.4% (interval 0.39%-11.9%) of the total cell-free DNA in early pregnancy and 6.2% (interval 2.33%-11.4%) of the total cell-free DNA in late pregnancy. However, Lun and colleagues [29] considered these values an underestimation. They used digital PCR for foetal fraction quantification. The plasma samples were all from women bearing male foetuses, and the numbers of ZFX and ZFY molecules were determined for subsequent foetal fraction estimation (i.e. $\frac{2 \times ZFY}{ZFY + ZFX} \times 100$). The researchers reported median foetal fractions of 9.7%, 9%, and 20.4% in 10 maternal plasma samples each from pregnant women who were in GWs 12-14, 17-22, and 38-39, respectively. Zhou and colleagues [30] used a method based on massive parallel sequencing to estimate foetal fraction in maternal plasma of 22 650 pregnant women bearing male foetuses who were in GW10 or further. They also showed that cffDNA fraction is positively correlated with gestational age. Specifically, the data demonstrated a median cffDNA fraction at GW10 of approximately 9%, with a subsequent increase of only 1% from GW10 to GW21. After GW21, the foetal fraction increased by about 1% each week. Recently, Christiansen and colleagues [21] also used a method based on massive parallel sequencing to calculate the foetal fraction in maternal plasma. In essence, the number of unique reads covering SNP loci for which the foetus’ mother is homozygous, thereby allowing the paternally inherited foetal alleles to be inferred, was used to estimate foetal fraction (i.e. $\frac{2 \times \text{reads of the non-maternal allele}}{\text{total reads of the two SNP alleles}} \times 100$). Using maternal plasma of 15 pregnant women, the researchers calculated median foetal fractions of 0%, 3.9% (interval 0%-6.4%), 5.1% (interval 0%-8.2%), 5.2% (interval 0%-11.2%), and 4.7% (interval 3.5%-8.8%) at GWs 4, 7, 12, 16, and 20, respectively. Essentially, the foetal fractions in maternal plasma increased slightly from GW7 to GW20. Nevertheless, the data revealed a high variability in foetal fraction between the pregnant women at each time point as well as between the time points for each pregnant woman. Chang and colleagues [22] also estimated foetal fraction using massive parallel sequencing, in a similar manner. They quantified cffDNA in maternal plasma of 349 pregnant women who were in GWs 6-34. Their calculations showed a mean foetal fraction of 4.37% at GW6 (interval 3.43%-5.30%), and an increase of about 0.76% per week from GW7 to GW12. From GW13 to GW24, the foetal fraction increase was only approximately 0.15% per week. After GW25, the foetal fraction increased by about 1.44% per week. However, the researchers found differences in foetal fraction values between pregnant women, and variability in the rate of increase within each pregnant woman.
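The two foetal fraction estimators quoted in the paragraph above can be written out as a short Python sketch. The function names and the example counts are hypothetical and chosen only for illustration; the factor 2 reflects that a male foetus carries one ZFY and one ZFX copy, and that a foetus heterozygous at a maternally homozygous locus contributes the non-maternal allele in only half of its fragments.

```python
def foetal_fraction_zfy_zfx(zfy: float, zfx: float) -> float:
    """Foetal fraction (%) for a male foetus from ZFY and ZFX molecule counts.

    Implements 2 * ZFY / (ZFY + ZFX) * 100 as quoted in the text.
    """
    total = zfy + zfx
    if total <= 0:
        raise ValueError("total molecule count must be positive")
    return 2 * zfy / total * 100


def foetal_fraction_snp_reads(non_maternal_reads: float, total_reads: float) -> float:
    """Foetal fraction (%) at SNP loci where the mother is homozygous.

    Implements 2 * (reads of the non-maternal allele) / (total reads of the
    two SNP alleles) * 100 as quoted in the text.
    """
    if total_reads <= 0:
        raise ValueError("total read count must be positive")
    return 2 * non_maternal_reads / total_reads * 100


if __name__ == "__main__":
    # Hypothetical counts, not taken from any of the cited studies.
    print(foetal_fraction_zfy_zfx(zfy=50, zfx=950))
    print(foetal_fraction_snp_reads(non_maternal_reads=25, total_reads=1000))
```

In practice the read-based estimate would be aggregated over many informative loci; how the cited studies aggregate is not specified here, so the sketch treats a single pooled count.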

The results are summarised in Table 1. The information obtained from the studies suggests that circulating cffDNA can be detected in maternal plasma at GW6 with a mean foetal fraction of about 4.5%, and that circulating cffDNA generally constitutes <15% of the total plasma DNA during the period that abortion is allowed in the Netherlands. Nevertheless, there is a high variability in foetal fraction between as well as within pregnant women at particular time points. Furthermore, the studies specify a positive correlation between foetal fraction in maternal plasma and gestational age, meaning that pregnancy progression will result in higher foetal fractions.

Other influence factors. In addition to the influence of gestational age on foetal fraction, Zhou and colleagues [30] investigated the influence of maternal body mass index (BMI) and maternal age on foetal fraction in plasma samples of pregnant women with BMIs ranging from 15-33 kg/m² and ages ranging from 16-49 y. They showed that maternal BMI is negatively correlated with foetal fraction, with a decrease in median foetal fraction of about 8% from maternal BMI 15 kg/m² to maternal BMI 33 kg/m². Furthermore, they revealed that maternal age is not correlated with foetal fraction. Likewise, Hestand and colleagues [31] studied the influence of maternal weight and maternal age on the foetal fraction in 14 379 maternal plasma samples of women with weights ranging from 51-132 kg and ages ranging from 18-49 y. They showed that maternal weight is negatively correlated with foetal fraction. This was illustrated by a decrease of median foetal fraction of approximately 7% from maternal weight 51 kg to maternal weight 132 kg. They also found that maternal age is not correlated with cffDNA fraction in maternal plasma.

The results are summarised in Table 1. Overall, the information obtained from the studies demonstrates that gestational age is not the only factor influencing foetal fraction. Foetal fraction is also inversely correlated with maternal BMI or weight: a higher maternal BMI or weight results in a lower foetal fraction. The reason for this could be the increased turnover of adipocytes in overweight and obese women, which increases the cell-free maternal DNA fraction and decreases the cffDNA fraction in maternal plasma [33]. Alternatively, the decrease in foetal fraction might be due to a diluting effect caused by an increased total blood volume [33]. Furthermore, the circulating cffDNA fraction in maternal plasma is not influenced by the pregnant woman’s age.

Table 1 | Results of the quantitative analysis of foetal fraction in different studies. Studies investigating whether gestational age, maternal BMI or weight, and maternal age influence the foetal fraction are indicated. The study, relevant information about the recruited subjects and the quantitative analysis of foetal fraction are mentioned. The subjects investigated are singleton pregnant women. PCR: polymerase chain reaction; GW: gestational week; cons. TPs: consecutive time points; BMI: body mass index; kg: kilogrammes; m²: square metres; y: years.

| Study (method) | Information about the recruited subjects | Quantitative analysis of foetal fraction |
|---|---|---|
| Influence of gestational age | | |
| Lo et al. (1998) [32] (real-time PCR) | 25 pregnancies: GWs 11-17; 25 pregnancies: GWs 37-43 | Mean (subset: 13 early and 14 late pregnancies): GWs 11-17: 3.4% (interval 0.39%-11.9%); GWs 37-43: 6.2% (interval 2.33%-11.4%) |
| Lun et al. (2008) [29] (digital PCR) | 10 pregnancies: GWs 12-14; 10 pregnancies: GWs 17-22; 10 pregnancies: GWs 38-39 | Median: GWs 12-14: 9.7%; GWs 17-22: 9%; GWs 38-39: 20.4% |
| Zhou et al. (2015) [30] (massive parallel sequencing) | 424 pregnancies: GWs 10-12+6; 4229 pregnancies: GWs 13-16+6; 14 719 pregnancies: GWs 17-21+6; 3231 pregnancies: GWs 22-27+6; 355 pregnancies: GWs 28-31+6; 98 pregnancies: GWs >32 | Median (subset: 22 650 pregnancies): GW10: 9%; GWs 10-21: 1% increase in total; >GW21: 1% increase each week |
| Christiansen et al. (2019) [21] (massive parallel sequencing) | 15 pregnancies (cons. TPs): GWs 4, 7, 12, 16, 20 | Median: GW4: 0%; GW7: 3.9% (interval 0%-6.4%); GW12: 5.1% (interval 0%-8.2%); GW16: 5.2% (interval 0%-11.2%); GW20: 4.7% (interval 3.5%-8.8%) |
| Chang et al. (2019) [22] (massive parallel sequencing) | 349 pregnancies: GWs 6-34 | Mean: GW6: 4.37% (interval 3.43%-5.30%); GWs 7-12: 0.76% increase each week; GWs 13-24: 0.15% increase each week; >GW25: 1.44% increase each week |
| Influence of maternal BMI or weight | | |
| Zhou et al. (2015) [30] (massive parallel sequencing) | 23 067 pregnancies: BMIs 13.7-34.2 kg/m² | Median (subset: 22 650 pregnancies): BMIs 15-32 kg/m²: 8% decrease in total |
| Hestand et al. (2019) [31] (massive parallel sequencing) | 14 379 pregnancies: weights 51-132 kg | Median: weights 51-132 kg: 7% decrease in total |
| Influence of maternal age | | |
| Zhou et al. (2015) [30] (massive parallel sequencing) | 23 067 pregnancies: ages 16-50 y | Median (subset: 22 650 pregnancies): ages 16-49 y: no correlation |
| Hestand et al. (2019) [31] (massive parallel sequencing) | 14 379 pregnancies: ages 18-49 y | Median: ages 18-49 y: no correlation |

2.1.2. Cell-free DNA extraction from maternal plasma and cffDNA enrichment

Besides the biological factors that influence the circulating cffDNA fraction in maternal plasma, the methods for cell-free DNA extraction from maternal plasma and cffDNA enrichment also influence cffDNA yield [34,35]. In addition, the low foetal fraction and fragmented nature of cffDNA in maternal plasma [11,12] make the extraction of cell-free DNA and enrichment of cffDNA technically challenging.

Cell-free DNA extraction method. As it is simple to perform, column-based DNA extraction has been broadly applied for cell-free DNA extraction from maternal plasma in NIPAT [15,17-22]. The commercially available kit that was mostly used in these studies is the QIAamp® Circulating Nucleic Acid Kit (Qiagen, Hilden, Germany). Using this kit enables reproducible cell-free DNA yield and enhanced recovery of fragmented, low concentration cffDNA [34].

Another approach for cell-free DNA extraction is magnetic-based DNA extraction. A commercially available kit that can be used for this approach is the QIAsymphony® Circulating DNA Kit (Qiagen, Hilden, Germany) [36]. In 2016, Wolf and colleagues [37] compared manual cell-free DNA extraction using the QIAamp® Circulating Nucleic Acid Kit with automated cell-free DNA extraction using the more recently developed QIAsymphony® Circulating DNA Kit. They demonstrated that both extraction methods recover cell-free DNA with a similar size distribution and similar total cell-free DNA extraction efficiency. Interestingly, they also showed that extraction using the QIAsymphony® Circulating DNA Kit yields a higher foetal fraction as compared to extraction using the QIAamp® Circulating Nucleic Acid Kit. Due to this enhanced cffDNA extraction efficiency, the QIAsymphony® Circulating DNA Kit might gain more attention for cell-free DNA extraction from maternal plasma in NIPAT.

CffDNA enrichment method. CffDNA enrichment based on the difference in size distribution between cffDNA and cell-free maternal DNA has been researched [38]. CffDNA fragments in maternal plasma are normally <300 base pairs in length, whereas about 20% of cell-free maternal DNA fragments are generally >300 base pairs in length. Therefore, the researchers established a PCR-based method that selectively amplifies the shorter DNA fragments while maintaining DNA integrity during amplification. This was achieved by keeping the denaturation temperature within a suitable range. Although the method is time-consuming and does not allow the increase in foetal fraction to be determined, it might be worth investigating further.

Additionally, cffDNA enrichment based on differentially methylated regions (DMRs) has been described [39]. DNA methylation is an epigenetic modification that is involved in, amongst other processes, development. This means that there are differences in the methylation pattern between cffDNA and cell-free maternal DNA. A recently developed method for examining DMRs is called methylated DNA immunoprecipitation. The researchers identified 2000 DMRs, mostly found in non-genic areas with poor CpG density. However, the investigators only focussed on a few chromosomes with the aim of developing a non-invasive prenatal test for detecting certain foetal chromosomal aneuploidies. Therefore, it might be interesting to explore the usefulness of the method for NIPAT by investigating DMRs on more chromosomes.

Furthermore, formaldehyde could be used to increase the relative amount of cffDNA in maternal plasma, as it is a cell membrane stabilising chemical that inhibits the release of maternal cell DNA into maternal plasma [16]. Nevertheless, these results have not been reproduced in the literature [35]. Hence, more research is needed to investigate whether formaldehyde increases the foetal fraction.

2.2. Application of SNP genotyping for NIPAT

Several SNP-based methods have been suggested for NIPAT [15-22]. Recently, the focus has shifted towards the use of massive parallel sequencing for SNP genotyping in NIPAT applications. Massive parallel sequencing, also called next or second generation sequencing, comprises various high-throughput methods for DNA sequencing with single-base resolution that enable the processing of multiple DNA sequences simultaneously. Investigating the evolution towards massive parallel sequencing-based approaches for SNP-based NIPAT helps to understand why these approaches were developed, and reveals their advantages compared to the initial efforts towards SNP-based NIPAT methods. Additionally, the limitations of studies that use massive parallel sequencing-based methods give insights into possible directions for future research.

2.2.1. The evolution towards massive parallel sequencing-based methods for SNP-based NIPAT

In 2011, Tynan and colleagues [15] demonstrated the usefulness of a multiplexed SNP genotyping method, combined with restriction endonuclease digestion, that can confirm the presence of circulating cffDNA at low fractional concentration in maternal plasma. They showed that genotyping 92 SNP loci with a minor allele frequency (MAF) of 0.4 results in a minimum of four effective SNP loci in >99% of cffDNA samples. Effective SNP loci were defined as loci in the genome for which the foetus’ mother is homozygous for a certain digestible SNP allele, while the foetus is heterozygous for that particular SNP. Specifically, the foetal identification method uses restriction endonucleases to remove amplifiable maternal DNA alleles before PCR, thereby increasing the fractional concentration of cffDNA in maternal plasma and thus enhancing the detection of paternally inherited foetal SNP alleles that are not digested by the restriction endonuclease. Indeed, as few as 10 copies of a paternally inherited foetal SNP allele were detected in 1000 copies of a DNA mixture consisting of 2% progeny DNA, which is heterozygous for the digestible SNP allele, and 98% maternal DNA, which is homozygous for the digestible SNP allele. The authors suggested that comparable approaches could be valuable for NIPAT.

Guo and colleagues [16] analysed cell-free DNA in maternal plasma samples obtained from 30 women who were in the first or second trimester of pregnancy (GWs 8-14). Along with each of the maternal plasma samples, a buffy coat sample from the mother, the biological father, and one unrelated man randomly chosen from a set of 29 unrelated men was examined. To determine paternity, the researchers looked at the effective SNP loci in the genetic profiles. Effective SNP loci were defined as loci for which the mother and one of the two males are homozygous and have the same SNP variant, whereas the other male is homozygous for the other SNP variant. The biological father was correctly identified in all the prenatal paternity testing cases.
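To make the notion of effective SNP loci in the sense of Tynan and colleagues more concrete, the following Python sketch estimates how many loci in a panel can be expected to be effective. It is an illustration only: it assumes Hardy-Weinberg proportions, an unrelated father drawn from the same population, and independent loci, none of which is stated in the source, and the function names are hypothetical. It is not claimed to reproduce the >99% figure reported in the study.

```python
from math import comb


def effective_locus_probability(digestible_allele_freq: float) -> float:
    """P(mother homozygous for the digestible allele and foetus heterozygous).

    Assumption: Hardy-Weinberg proportions. The mother is homozygous with
    probability p**2 and the father transmits the other allele with
    probability (1 - p).
    """
    p = digestible_allele_freq
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return p * p * (1 - p)


def prob_at_least(k: int, n_loci: int, per_locus_prob: float) -> float:
    """P(at least k of n_loci independent loci are effective), binomial model."""
    below_k = sum(
        comb(n_loci, i) * per_locus_prob**i * (1 - per_locus_prob) ** (n_loci - i)
        for i in range(k)
    )
    return 1 - below_k


if __name__ == "__main__":
    # Panel size and allele frequency mirror the numbers quoted in the text;
    # which allele is the digestible one is an assumption of this sketch.
    per_locus = effective_locus_probability(0.4)
    print(per_locus, prob_at_least(4, 92, per_locus))
```

The result is sensitive to whether the digestible allele is the minor or the major allele, which is why the frequency is left as a parameter.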

Both approaches are labour-intensive. Furthermore, it should be noted that the discriminatory power of every single SNP locus is rather low due to the bi-allelic nature of the SNP loci used in the studies. Therefore, microarrays or massive parallel sequencing, which can interrogate many loci at once, would be more suitable for achieving accurate paternity testing results with SNP-based NIPAT.

Ryan and colleagues [17] used a Human CytoSNP-12 microarray and informatics-based techniques to analyse >300 000 SNP loci in cell-free DNA from maternal plasma of 21 women in the first or second trimester of pregnancy (GWs 6-21). The foetal fraction in maternal plasma varied from 0.6% to 11.7%. Paternity determination was performed by investigating effective SNP loci. The latter are loci in the genome for which the foetus’ mother is homozygous, thereby enabling the identification of non-maternal SNP alleles. For every combination of mother and alleged father, a test statistic was created that specified how well the SNP genotype of the alleged father was explained by the SNP genotype of the foetus. Paternity inclusion, an indeterminate outcome, and paternity exclusion were reported when the p-value of the alleged father’s test statistic, compared with the test statistic distribution of unrelated males, was <10⁻⁴, between 10⁻⁴ and 0.02, and >0.02, respectively. The test statistic distribution of unrelated males was based on data from 5000 unrelated men for the parentage case in question. One maternal plasma sample was not used for analysis because its foetal fraction was <2%. The other maternal plasma samples were subjected to paternity tests using the biological father and 1820 unrelated males. Paternity was confirmed with 100% accuracy from GW6 onwards. Of the tests performed using the unrelated men, 99.95% excluded paternity whereas 0.05% were indeterminate. Still, it should be noted that the SNP loci covered by this assay are mainly related to human diseases. This has two major drawbacks [22]. First, privacy concerns are at issue. Second, SNPs associated with human disease are usually rare variants in the population.
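
The three-way decision rule described for the microarray study can be written down compactly. The following is only a sketch of the reported cut-offs (<10⁻⁴, 10⁻⁴ to 0.02, >0.02); the function name and the handling of p-values that fall exactly on a cut-off are assumptions made for illustration, and the computation of the test statistic itself is not reproduced here.

```python
def classify_paternity(p_value, inclusion_cutoff=1e-4, exclusion_cutoff=0.02):
    """Map a p-value to the three outcomes reported for Ryan et al. [17].

    The p-value compares the alleged father's test statistic with the
    distribution obtained for unrelated males. Boundary handling is an
    assumption: values exactly on a cut-off are treated as indeterminate.
    """
    if p_value < inclusion_cutoff:
        return "inclusion"
    if p_value > exclusion_cutoff:
        return "exclusion"
    return "indeterminate"


# Hypothetical p-values, for illustration only
for p in (3e-7, 0.005, 0.4):
    print(p, classify_paternity(p))
```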

Although the results presented are promising, a widely accepted SNP-based NIPAT method with sophisticated data analysis and interpretation to accurately determine paternity is still to be established. As already mentioned, focus has been shifted towards the development of massive parallel sequencing for the application of SNP-based NIPAT.

In 2016, Jiang and colleagues [18] developed a SNP-based NIPAT using massive parallel sequencing for SNP genotyping of cell-free DNA. Cell-free DNA from maternal plasma of 17 second and third trimester (GWs 12-30+6) pregnancies was analysed using the Illumina® HiSeq™ platform. Either 663 243 (4 samples), 8242 (11 samples), or 5011 (2 samples) SNP loci were investigated. The foetal fraction in maternal plasma varied from 5.68% to 29.74%. The paternity index (PI), or likelihood ratio, was estimated from the effective SNP loci. These are loci in the genome for which the foetus’ mother is homozygous, thereby allowing the paternally inherited foetal SNP alleles to be inferred. The PI is defined as the probability of obtaining the DNA evidence at a certain effective locus given the hypothesis that the alleged father is the biological father of the foetus, divided by the probability of obtaining the DNA evidence at that locus given the hypothesis that a random man is the biological father of the foetus. The combined PI (CPI) was calculated by multiplying the individual PI values. The CPI was determined for the real biological father and 90 random Han Chinese males. According to guidelines for the interpretation of conventional paternity testing data, the log(CPI) value should exceed 4 or be lower than -4 in order to support or exclude paternity, respectively.

The researchers initially used one paternity case, and simulation data based on that case, to examine several factors that could influence this massive parallel sequencing-based NIPAT. First, the influence of MAF was investigated. Modelling with the binomial distribution showed that the number of effective SNP loci is positively correlated with both the MAF and the total number of SNP loci. If the sequencing depth was low, high frequency SNPs (MAF >0.3) were preferred over low frequency SNPs (MAF <0.3). Second, the influence of the foetal fraction was considered. Raising the foetal fraction from 1% to 30% increased the log(CPI) when the sequencing coverage was 75-fold and the number of effective SNP loci was 10³. This showed that CPI and foetal fraction are significantly positively correlated. However, there was only a substantial effect on the CPI when the foetal fraction was increased from 1% to 10%. When the foetal fraction was <3% and the sequencing coverage was 75-fold, the number of effective SNP loci had to be increased to >10⁵ in order to allow SNP-based NIPAT analysis. Third, the influence of the sequencing depth was evaluated. An increase of the sequencing coverage from 10-fold to 200-fold led to a log(CPI) increase when the foetal fraction was 10% and the number of effective SNP loci was 10³. This revealed that CPI and sequencing coverage are positively correlated. Yet, there was only a substantial effect on the CPI when the sequencing coverage was increased from 10-fold to 75-fold. When the sequencing coverage was <30-fold, the log(CPI) values were lower than zero, meaning that paternity determination was inappropriate, and changing the number of effective SNP loci did not alter this. When the foetal fraction was about 1%, a sequencing coverage >125-fold or a number of effective SNP loci >10⁵ was suggested in order to allow SNP-based NIPAT analysis.

Taken together, these results suggest that SNP-based NIPAT methods should have the following settings to reach a high accuracy (>99.99%): high frequency SNPs (MAF >0.3), 1 × 10³ to 2 × 10³ effective SNP loci (5 × 10³ to 8 × 10³ total SNP loci), a foetal fraction >3%, and a sequencing coverage >75-fold. With a foetal fraction <3%, a sequencing coverage of >125-fold or >10⁵ effective SNP loci was suggested. Using high frequency SNP alleles, the log(CPI) of the first sample was 9.8774 × 10⁴, thereby supporting paternity. The researchers then performed a validation study to test the SNP-based NIPAT technique on the other 16 prenatal paternity testing cases. The log(CPI) values for the biological father and the 90 unrelated men were 2.7888 × 10³ (interval 176.78 to 1.5523 × 10⁴) and -4.5534 × 10³ (average interval -2.87 × 10⁴ to -153.72), respectively. The biological father was correctly identified in all real prenatal paternity determination cases. Nevertheless, it should be mentioned that linkage testing was not performed for various SNPs, making the CPI results obtained in this study debatable [20].
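
To illustrate why the MAF and the total number of SNP loci drive the number of usable loci, the sketch below computes the expected number of loci at which the mother is homozygous and the foetus carries a different paternal allele. It is not the model used by Jiang and colleagues; it assumes bi-allelic SNPs, Hardy–Weinberg equilibrium, random mating, and unlinked loci, and the function names are hypothetical.

```python
def informative_locus_probability(maf):
    """P(mother homozygous AND foetus heterozygous) at one bi-allelic SNP.

    Assumes Hardy-Weinberg equilibrium and random mating:
    mother AA (p^2) with paternal allele a (q), or
    mother aa (q^2) with paternal allele A (p). The sum simplifies to p*q.
    """
    p, q = 1.0 - maf, maf
    return p * p * q + q * q * p


def expected_informative_loci(total_loci, maf):
    """Expected count of such loci in a panel of unlinked SNPs with equal MAF."""
    return total_loci * informative_locus_probability(maf)


for maf in (0.1, 0.3, 0.5):
    for total in (5000, 8000):
        print(maf, total, round(expected_informative_loci(total, maf)))
```

Under these assumptions, a panel of 5 × 10³ to 8 × 10³ SNPs with a MAF of 0.3 would be expected to yield roughly 1050 to 1680 such loci, which is of the same order as the 1 × 10³ to 2 × 10³ effective SNP loci recommended above.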

In 2018, Yang and colleagues [19] developed a SNP panel to identify paternally inherited foetal SNP alleles in maternal plasma using the Ion Torrent™ Personal Genome Machine™ platform. This AmpliSeq™ panel amplifies 697 unlinked autosomal (linkage disequilibrium: r² <0.08 and D’ <0.27; MAF ≥0.3) and 23 Y-chromosomal SNP loci, resulting in amplicons <140 base pairs. For the study, maternal plasma of 11 first trimester (GWs 9-12) and 9 second trimester (GWs 17-21) pregnancies was analysed. A predefined threshold of 2% was used to distinguish paternally inherited foetal SNP alleles from background signals, and SNP loci showing a sequencing depth <1000-fold or heterozygote imbalance were excluded from further analysis. This resulted in a mean cell-free DNA profile completion of 82.44% ± 0.05782. Paternally inherited foetal autosomal SNP alleles were deduced by investigating effective SNP loci. The mean non-maternal allele fractions were significantly higher in later stage pregnancies than in early pregnancies, and the foetal fraction in maternal plasma varied from 4.28% to 10.7%. In 19 prenatal paternity testing cases, at the effective loci, at least one autosomal SNP allele in the profile of the alleged father was found among the assumed paternally inherited foetal autosomal SNP alleles. This was in contrast to the results obtained with 94 random East Asian men. However, the PI and CPI were not reported. Comparison of the inferred paternally inherited foetal autosomal SNP alleles with the SNP profiles generated from foetal tissue showed that 49.43% to 89.47% and 88.46% to 100% of the paternally inherited autosomal SNP alleles were correctly recognised in plasma samples of the first and second trimester pregnancies, respectively. Three cases also presented drop-ins that could be explained by stochastic effects. In addition, Y-chromosomal SNP alleles were observed in maternal plasma of 11 pregnant women, demonstrating the presence of a male foetus.
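
The threshold-based allele calling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline of Yang and colleagues: the input structure, the function name, and the choice to call an allele when its read fraction is at or above the threshold are hypothetical, and the heterozygote imbalance filter is not modelled.

```python
def call_paternal_alleles(loci, fraction_threshold=0.02, min_depth=1000):
    """Call non-maternal alleles at loci where the mother is homozygous.

    loci: list of dicts with keys 'id', 'maternal_allele' and
          'counts' (allele -> read count in maternal plasma).
    Returns a dict locus id -> called non-maternal allele.
    """
    calls = {}
    for locus in loci:
        depth = sum(locus["counts"].values())
        if depth < min_depth:
            continue  # locus discarded because of low sequencing depth
        non_maternal = {a: n for a, n in locus["counts"].items()
                        if a != locus["maternal_allele"]}
        if not non_maternal:
            continue
        allele, reads = max(non_maternal.items(), key=lambda item: item[1])
        if reads / depth >= fraction_threshold:
            calls[locus["id"]] = allele  # signal above background threshold
    return calls


# Hypothetical read counts, for illustration only
example = [
    {"id": "snp1", "maternal_allele": "A", "counts": {"A": 1940, "G": 60}},
    {"id": "snp2", "maternal_allele": "C", "counts": {"C": 1990, "T": 10}},
    {"id": "snp3", "maternal_allele": "G", "counts": {"G": 480, "A": 20}},
]
print(call_paternal_alleles(example))
```

In this example only the first locus yields a call: the second stays below the 2% threshold and the third is discarded for insufficient depth.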


Qu and colleagues [20] also established a massive parallel sequencing-based method for NIPAT. They used the Illumina® HiSeq™ platform to investigate 1795 unlinked autosomal SNP loci (linkage disequilibrium: r² <0.2; MAF 0.3; Jiang et al. [18]) in cell-free DNA of maternal plasma obtained from 11 first trimester (GWs 9-12) and 23 second trimester (GWs 13-21) pregnancies. Data interpretation was done by two methods, namely the non-maternal allele counting method to detect paternally inherited foetal SNP alleles, and mathematical algorithms based on Bayes’ theorem to compute the PI. The PI was calculated for all effective SNP loci, taking the sequencing error rate and the foetal fraction into consideration. Since SNPs are characterised by a high mutational stability, mutation events were not taken into account in the determination of the PI. Ultimately, the CPI was calculated by multiplying the PI values of the effective autosomal SNP loci, which was allowed because the SNP loci were considered unlinked. A mean sequencing coverage of at least 75-fold was required, and all 34 cell-free DNA samples met this prerequisite and were analysed further. With the counting method, up to 30.82% of actual paternally inherited foetal SNP alleles did not reach the a priori defined allele fraction threshold of 2.5%. Because too few effective SNP loci remained, NIPAT was not possible, particularly in maternal plasma samples with a low foetal fraction. Additionally, 12 cases showed drop-ins, probably due to stochastic effects, ultimately leading to genetic contradictions in these cases. In contrast to the counting method, the method based on Bayesian statistics used all effective SNP loci, which make up about 40% of the total SNP loci.

The latter method revealed that the log(CPI) value distributions of the biological father (interval 68.23 to 158.01) and the set of 90 unrelated individuals from the East Asian population (average interval -15.94 to -724.34) were significantly different. The biological father was correctly identified in all but one real prenatal paternity determination case, even if the foetal fraction in maternal plasma was low. These findings indicated that the Bayesian approach is the preferred option for sequencing data interpretation and weight of evidence calculation. The paternity of one random male in one case was not excluded, as the log(CPI) value was slightly higher than the threshold value of -4. The authors suggested that the most probable reason for this result is insufficient testing efficiency of excluding non-paternity in the study.
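
How the foetal fraction and the sequencing error rate can enter a per-locus PI may be illustrated with a deliberately simplified likelihood model. The sketch below is not the algorithm of Qu and colleagues. It assumes a locus at which the mother is homozygous, binomially distributed read counts, a foetus that is either heterozygous (expected non-maternal read fraction of about half the foetal fraction) or homozygous (non-maternal reads arise from sequencing error only), and allele transmission under Mendelian inheritance. All names and the default error rate are hypothetical.

```python
import math
import sys


def _log_kernel(k, n, p):
    """Log of p^k * (1-p)^(n-k); the binomial coefficient cancels in the ratio."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def log10_pi(alt_reads, depth, father_alt_copies, alt_freq,
             foetal_fraction, error_rate=0.001):
    """log10 paternity index at one locus where the mother is homozygous.

    alt_reads: reads of the non-maternal allele; depth: total reads;
    father_alt_copies: 0, 1 or 2 copies in the alleged father;
    alt_freq: population frequency of the non-maternal allele.
    """
    # Expected non-maternal read fraction for a heterozygous / homozygous foetus
    p_het = (foetal_fraction / 2.0) * (1.0 - error_rate) \
        + (1.0 - foetal_fraction / 2.0) * error_rate
    p_hom = error_rate
    log_het = _log_kernel(alt_reads, depth, p_het)
    log_hom = _log_kernel(alt_reads, depth, p_hom)
    shift = max(log_het, log_hom)  # rescale to avoid floating-point underflow
    w_het, w_hom = math.exp(log_het - shift), math.exp(log_hom - shift)
    t_father = father_alt_copies / 2.0  # P(alleged father transmits the allele)
    numerator = t_father * w_het + (1.0 - t_father) * w_hom
    denominator = alt_freq * w_het + (1.0 - alt_freq) * w_hom
    tiny = sys.float_info.min  # floor so that log10 never receives zero
    return math.log10(max(numerator, tiny) / max(denominator, tiny))


def interpret(log10_cpi, cutoff=4.0):
    """Apply the log(CPI) > 4 / < -4 guideline mentioned in the text."""
    if log10_cpi > cutoff:
        return "paternity supported"
    if log10_cpi < -cutoff:
        return "paternity excluded"
    return "inconclusive"


# Hypothetical loci: (alt_reads, depth, father_alt_copies, alt_freq)
loci = [(50, 1000, 2, 0.4), (48, 1000, 1, 0.3), (1, 1000, 0, 0.5)]
log10_cpi = sum(log10_pi(k, n, g, q, foetal_fraction=0.10) for k, n, g, q in loci)
print(round(log10_cpi, 3), interpret(log10_cpi))
```

Summing the log10 PI values is equivalent to multiplying the PI values, which is only justified when the loci are unlinked. The example also shows why stochastic effects matter: in this simple model a single locus with clear non-maternal reads that the alleged father cannot have transmitted produces a very strongly negative log10 PI, so a single drop-in could dominate the CPI unless the error model accounts for it.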

Recently, Christiansen and colleagues [21] used massive parallel sequencing to investigate paternally inherited foetal SNP alleles in plasma of 15 pregnant women at five consecutive time points during GWs 4-20, regardless of foetal gender. Of the 90 autosomal SNP loci targeted by the Precision ID Identity Panel and analysed by the Ion S5™ Sequencer, the median number of loci for which the mother was homozygous was 42 (interval 35-48), and the median number for which the child was also heterozygous was 19 (interval 12-27). An a priori defined threshold was calculated from the sequencing read data to distinguish paternally inherited foetal SNP alleles from background noise. The median number of paternally inherited foetal autosomal SNP alleles in maternal plasma was 0 (interval 0-1), 3 (interval 0-17), 9 (interval 0-22), 10 (interval 2-22), and 12 (interval 3-23) at GWs 4, 7, 12, 16, and 20, respectively. These numbers differed between the pregnant women at each time point but tended to increase with increasing GW within each pregnant woman. Comparison of the consensus SNP profiles of the foetuses generated from the duplicate genotyping with the SNP profiles of the neonates revealed a median percentage of correctly recognised paternally inherited autosomal SNP alleles of 0%, 21.4%, 64.3%, 60%, and 65% at GWs 4, 7, 12, 16, and 20, respectively. Moreover, the drop-out rate for amplicons >146 base pairs was significantly higher than the drop-out rate for amplicons ≤146 base pairs at GWs 12, 16, and 20. Drop-ins were also observed in the first and second genotyping, but only 0.7% of the drop-ins were reproduced. Additionally, Y-chromosomal SNP alleles were observed in plasma of the seven pregnant women bearing male foetuses from GW7 onwards. Each of these SNP profiles matched that of the biological father.

Furthermore, the consensus autosomal SNP profiles of the foetuses at GWs 12 and 20 were used for CPI calculation in 121 prenatal paternity testing cases. The CPI was calculated for the real biological father using all effective SNP loci. The mean CPI value for the biological father was 24.4 (interval 0.0035 to 8389) and 198.7 (interval 5.1 to 30 137) at GWs 12 and 20, respectively. The value of 0.0035 in one case was due to a false positive SNP allele call. For the simulated rape case scenarios in the study, the likelihood ratio is defined as the probability of obtaining the DNA evidence given the hypothesis that the partner is the biological father of the foetus, divided by the probability of obtaining the DNA evidence given the hypothesis that the offender (i.e. one of the 120 unrelated Danish men) is the biological father of the foetus. In total, 1815 and 1936 parentage comparisons were performed at GWs 12 and 20, respectively. The likelihood ratio was >1000 in 89.8% and 92% of the simulated rape case scenarios at GWs 12 and 20, respectively. The remaining scenarios revealed a likelihood ratio between 0.001 and 1000. Thus, the data supported the hypothesis that the partner is the biological father of the foetus.
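
Because this study reports CPI values on a linear scale whereas other studies report log(CPI), a base-10 logarithm is needed to compare them. The short sketch below converts the CPI intervals quoted above; the dictionary layout and function name are illustrative only.

```python
import math

# CPI intervals for the biological father as quoted in the text
reported_cpi = {
    "GW 12": (0.0035, 8389),
    "GW 20": (5.1, 30137),
}


def log10_interval(interval):
    """Convert a (low, high) CPI interval to log10(CPI), rounded to 4 decimals."""
    low, high = interval
    return round(math.log10(low), 4), round(math.log10(high), 4)


for gestational_week, interval in reported_cpi.items():
    print(gestational_week, log10_interval(interval))
```

On this scale none of the converted values at GW 12 reaches the log(CPI) threshold of 4 used in conventional paternity testing, which reflects the small number of loci (90) in the panel rather than a failure of the principle.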

Chang and colleagues [22] also used massive parallel sequencing of cell-free DNA to determine the SNP genotypes for NIPAT. A total of 5457 unlinked SNP loci (linkage disequilibrium: r² <0.05; MAF >0.4; observed heterozygosity >0.4; Hardy–Weinberg equilibrium) from SNP data of East Asian populations were validated in 1508 random Han Chinese males. SNP loci that did not show Hardy–Weinberg equilibrium or that had a low sequencing coverage were excluded from further analysis. This resulted in 4151 and 4735 SNP loci with a local MAF >0.4 and >0.3, respectively. Plasma of 14 pregnant women (GWs 6-24) and seven negative controls (i.e. four non-pregnant women and three males) was obtained and cell-free DNA was sequenced using the BGISEQ-500 platform. A sequencing coverage from 106-fold to 468-fold was reported. The CPI was calculated using effective SNP loci. The number of SNP loci for which the mother was homozygous varied from 934 to 2486. Of the maternal homozygous SNP loci, 33.8% to 59.8% revealed non-maternal allele sequencing reads in the plasma of the pregnant women. In contrast, only 12.5% to 21.1% and 14% to 21% of the homozygous SNP loci showed non-maternal allele sequencing reads in the plasma of the non-pregnant women and men, respectively. In the negative controls, 99.2% of the homozygous SNP loci showing non-maternal allele sequencing reads had a non-maternal allele sequencing read fraction of <2%. In contrast, the non-maternal allele sequencing read fraction in the plasma of pregnant women was >2.4%. Therefore, a threshold of 2% was selected to identify paternally inherited foetal SNP alleles. Comparison of the inferred paternally inherited foetal SNP alleles with the SNP profiles generated from foetal reference samples showed that 99.54% to 100% of the paternally inherited foetal SNP alleles were correctly recognised in the maternal plasma samples. Furthermore, cell-free DNA extracted from 349 maternal plasma samples of pregnant women (GWs 6-34) underwent massive parallel sequencing for subsequent paternity determination. The CPI values of the biological father and the 348 unrelated males were 1.25 × 10²² to 1.46 × 10¹⁶⁵ and <10⁻³⁷, respectively. Therefore, the biological father was correctly identified in all real prenatal paternity determination cases. Finally, a sensitivity analysis was performed using artificial plasma samples with a foetal fraction of 0.5%, 1%, 2%, 3%, 5%, 7%, 10%, 20%, 30%, and 50%, at a sequencing coverage of 100-fold. Increasing the foetal fraction from 0.5% to 50% resulted in an increase of effective SNP loci from 13 to 603. At a foetal fraction of 1%, only 33 effective SNP loci could be found and the CPI was 73.2. An increase in foetal fraction to 2% led to 98 effective SNP loci, giving a CPI of 2.89 × 10¹³. If the foetal fraction was ≥10%, all heterozygous foetal SNP loci could be obtained.

Taken together, the current focus on massive parallel sequencing-based approaches for SNP-based NIPAT indicates the preference for and potential of these approaches. To avoid foetal allelic suppression by cell-free maternal DNA, all studies emphasise the need to focus only on effective SNP loci for paternity determination (Figure 3). Effective SNP loci are loci in the genome for which the foetus’ mother is homozygous, thereby permitting the inference of non-maternal SNP alleles at these loci. The non-maternal alleles are defined as paternally inherited foetal SNP alleles if they exceed an a priori set threshold that distinguishes paternally inherited foetal SNP alleles from background signals. The stochastic effect data and obtained log(CPI) values in the different studies are summarised in Table 2. The results show that reliable paternity determination can be obtained with high evidential values from GW6 onwards. The studies also highlight the importance of considering various influencing factors. First, the foetal fraction plays a role in the success rate of NIPAT: low foetal fractions may result in failure of paternity determination. Second, an optimal threshold value is needed to distinguish paternally inherited foetal SNP alleles from background signals (e.g. sequencing errors). The higher the threshold value, the more background noise is filtered out, but also the more effective SNP loci fail to reach the threshold. A high threshold therefore aids accurate foetal genotyping but decreases the success rate of paternity determination. An appropriate threshold value balances correctness against detection efficiency. Third, a suitable sequencing depth is required to reduce the impact of sequencing errors, since the distinction between sequencing errors and true paternally inherited foetal SNPs relies on the sequencing depth. An appropriate sequencing depth will improve the sequencing accuracy and testing efficiency. Fourth, the MAF of the effective SNP loci that are examined is important. Especially when the sequencing depth is low, high frequency SNPs are preferred over low frequency SNPs. Fifth, the number of effective SNP loci that is investigated is critical for an appropriate discriminatory power in NIPAT. Although the potential of massive parallel sequencing-based NIPAT is shown in all described proof-of-concept studies, several future improvements can readily be identified for each of the studies.
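
The interplay between foetal fraction, sequencing depth, and calling threshold can be made concrete with a simple binomial sketch of the allele drop-out probability. It is an illustration under stated assumptions rather than a model taken from any of the reviewed studies: the foetus is assumed heterozygous at a locus where the mother is homozygous, reads are assumed to be sampled independently so that the paternal allele appears with probability equal to half the foetal fraction, sequencing errors are ignored, and an allele is assumed to be called when its read fraction is at or above the threshold.

```python
from math import ceil, comb


def dropout_probability(foetal_fraction, depth, threshold):
    """P(paternal allele stays below the calling threshold) at one locus.

    The number of paternal-allele reads is modelled as
    Binomial(depth, foetal_fraction / 2).
    """
    p = foetal_fraction / 2.0
    # Smallest read count that reaches the threshold (rounded to limit
    # floating-point artefacts in threshold * depth)
    min_reads = ceil(round(threshold * depth, 9))
    return sum(comb(depth, k) * p**k * (1.0 - p) ** (depth - k)
               for k in range(min_reads))


for f in (0.01, 0.03, 0.10):
    for depth in (30, 100, 200):
        print(f, depth, round(dropout_probability(f, depth, threshold=0.02), 3))
```

In this sketch, drop-out becomes more likely when the foetal fraction falls or the threshold rises, which is the trade-off described above. The effect of depth is less uniform, because the integer read count needed to reach a fixed fraction threshold changes with depth.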

Figure 3 | The principle to avoid foetal allelic suppression by cell-free maternal DNA for the purpose of detecting paternally inherited foetal SNP alleles in maternal plasma. Effective SNP loci, which are autosomal loci in the genome for which the pregnant woman is homozygous, are investigated. This allows the inference of non-maternal alleles at these loci in the maternal plasma sample. The non-maternal alleles are defined as paternally inherited foetal SNP alleles if they exceed an a priori defined threshold. DNA: deoxyribonucleic acid; SNP: single nucleotide polymorphism.


Table 2 | Results of the stochastic effect and log(CPI) data in different studies. Studies that use massive parallel sequencing methods for SNP-based NIPAT are indicated. The study, sample information, total number of SNP loci investigated, stochastic effects and log(CPI) are mentioned. The subjects investigated are women with singleton pregnancies. GW: gestational week; cons. TPs: consecutive time points; log(CPI): logarithm base 10 of the combined paternity index.

| Study (massive parallel sequencing method) | Sample information | Total # SNP loci investigated | Stochastic effects | Log(CPI) |
| --- | --- | --- | --- | --- |
| Jiang et al. (2016) [18] (Illumina® HiSeq™ platform) | 17 samples: GWs 12-30+6 | 663 243 (4 samples); 8242 (11 samples); 5011 (2 samples) | not reported | biological father: 176.78 to 9.8747 × 10⁴; 90 unrelated males: -2.87 × 10⁴ to -153.72 |
| Yang et al. (2018) [19] (Ion Torrent™ Personal Genome Machine™ platform) | 11 samples: GWs 9-12; 9 samples: GWs 17-21 | 697 | drop-outs: GWs 9-12: 10.53% to 50.47%; GWs 17-21: 0.00% to 11.54%; drop-ins in 3 cases | not reported |
| Qu et al. (2018) [20] (Illumina® HiSeq™ platform) | 11 samples: GWs 9-12; 23 samples: GWs 13-21 | 1795 | drop-outs: GWs 9-21: 0.00% to 30.82%; drop-ins in 12 cases | biological father: 68.23 to 158.01; 90 unrelated males: -15.94 to -724.34 |
| Christiansen et al. (2019) [21] (Ion S5™ Sequencer) | 15 samples (cons. TPs): GW 4, 7, 12, 16, 20 | 90 | drop-outs: GW 4: 100%; GW 7: 78.6%; GW 12: 35.7%; GW 16: 40%; GW 20: 35%; drop-ins: 0.7% reproduced | biological father: GW 12: -2.4559 to 3.9237; GW 20: 0.7076 to 4.4791; 120 unrelated males: not reported |
| Chang et al. (2019) [22] (BGISEQ-500 platform) | 14 samples: GWs 6-24; 349 samples: GWs 6-34 | 5457 | drop-outs: GWs 6-24: 0.00% to 0.46% | biological father: GWs 6-34: 22.097 to 165.16; 348 unrelated males: GWs 6-34: less than -37 |


2.2.2. Future outlooks for massive parallel sequencing-based NIPAT studies

It is essential to include maternal plasma samples obtained from first trimester pregnancies in the investigation, for two main reasons. First, early NIPAT is favoured because it reduces the physical and emotional stress on pregnant women and allows decisions regarding abortion to be made earlier in pregnancy. Second, maternal plasma samples from early pregnancy generally have lower foetal fractions, making it more challenging to perform NIPAT. As already mentioned, low foetal fractions may lead to paternity determination failure. Therefore, foetal fraction enrichment methods before performing NIPAT could offer enhancements and are recommended to be investigated in the future.

Additionally, most studies focussed on a particular ethnic group to establish a SNP-based NIPAT using massive parallel sequencing. However, specific frequencies of SNP alleles might vary across different ethnicities. Therefore, it is important to verify the applicability of the developed NIPAT in various ethnic backgrounds.

Moreover, the adoption of more effective SNP markers will increase the discriminatory power of the SNP-based NIPAT. Indeed, if more information – via increasing the number of total and thus also effective SNP markers – can be included in the CPI calculation, this will result in an increased discrimination. Although this is one promising way of improving the discriminatory power, the latter might also result from the implementation of tri- or tetra-allelic SNP markers instead of the conventional bi-allelic SNP markers used in the described studies. In this manner, more information is obtained via increasing the information that is available per marker.
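
The gain in information per marker from multi-allelic SNPs can be illustrated with a small calculation of the probability that a random, unrelated man cannot be excluded. The sketch assumes Hardy–Weinberg equilibrium, unlinked loci, error-free genotyping, and equally frequent alleles at each marker; these are simplifying assumptions for illustration, and the function names are hypothetical.

```python
def random_man_inclusion_probability(allele_freq):
    """P(random man carries at least one copy of the paternal allele).

    Under Hardy-Weinberg equilibrium this is 1 - (1 - q)^2 for an allele
    with population frequency q.
    """
    return 1.0 - (1.0 - allele_freq) ** 2


def chance_inclusion(allele_freq, n_loci):
    """P(random man is not excluded at any of n unlinked loci)."""
    return random_man_inclusion_probability(allele_freq) ** n_loci


# Equally frequent alleles: 2 alleles -> q = 0.5, 4 alleles -> q = 0.25
for label, q in (("bi-allelic", 0.5), ("tetra-allelic", 0.25)):
    print(label, random_man_inclusion_probability(q), chance_inclusion(q, 20))
```

Under these assumptions a random man carries a given paternal allele with probability 0.75 at a bi-allelic locus but only 0.4375 at a tetra-allelic locus, so fewer multi-allelic markers are needed to reach the same exclusion power.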

Also, studies should be performed to test whether successful paternity discrimination can be obtained between the biological father and his family members, such as brothers. Generally, more genetic material is shared between relatives as compared to unrelated persons.

Furthermore, optimisation is required for SNPs that have a low sequencing depth, heterozygote imbalance, and high background signals. The reliability of NIPAT studies could be optimised by identifying a suitable foetal fraction threshold, an appropriate threshold to distinguish paternally inherited foetal SNP alleles from background signals, a proper sequencing coverage threshold, an optimal MAF, and a minimum number of total and/or effective SNP loci to be investigated.

Another future outlook is to conduct statistical studies that establish a prediction model for the success rate of the developed NIPAT. This model should consider the stochastic effects (i.e. allele drop-out and drop-in) within the bioinformatics algorithms that define massive parallel sequencing outcomes and the performance of NIPAT.

Finally, large scale experiments including large sample sizes are essential to assess the accuracy of the developed NIPAT in clinical trials.


2.3. Several relevant considerations

There are several relevant considerations regarding SNP-based NIPAT that are worthwhile to mention and discuss. The first consideration is the difference in SNP-based NIPAT between a male and female foetus as the male sex is defined by the Y-chromosome, which is absent in females. The second consideration is the absence of a forensic database containing SNP profiles. The third consideration is the running costs for and accessibility of SNP-based NIPAT.

2.3.1. Male versus female foetus

The Y-chromosome is a useful genetic difference between females and males, and thus between a mother and her male foetus. Indeed, a crucial sign of paternally inherited foetal alleles is the presence of Y-chromosomal SNP alleles. Using Y-chromosomal DNA for SNP-based NIPAT has several advantages. First, foetal fraction estimation based on Y-chromosomal DNA is straightforward. Second, Y-chromosomal SNP analysis is an attractive tool to circumvent the interference of the excessive amounts of cell-free maternal DNA. Although information obtained from analysis of the Y-chromosome is shared between all individuals in the same paternal lineage, it can be valuable to perform Y-chromosomal analysis. Specifically, Y-chromosomal SNP analysis of cffDNA might be important for (1) the exclusion of a male suspect as being the biological father of a foetus, and (2) the identification of the paternal lineage of the biological father of a foetus. However, as the individuals of the same paternal lineage share the same Y-chromosomal information, autosomal SNP analysis should be considered to specifically determine paternity. Nevertheless, identifying paternally inherited autosomal SNPs is more difficult, and approaches such as the ones discussed in this thesis should be performed.

Since females do not have a Y-chromosome, foetal fraction estimation and SNP-based NIPAT using Y-chromosomal DNA cannot be performed if the foetus is female. Therefore, analysis is more challenging and must focus on applications revealing paternally inherited autosomal SNPs. As already mentioned above, because cell-free maternal DNA represents the major fraction of plasma DNA, it can be difficult to distinguish cffDNA from cell-free maternal DNA. Consequently, approaches such as the ones discussed in this thesis should be performed.

2.3.2. Absence of a forensic database containing SNP profiles

A forensic DNA database that stores genetic information of known offenders allows DNA profiles obtained from a crime scene to be compared with the ones collected in the database in order to gather evidence for criminal investigations. Moreover, it might lead to the identification of possible suspects in the criminal investigation. Even if the offender is not yet included in the database, the search may reveal close relatives, thereby providing an investigative lead. However, there is currently no national forensic database that stores SNP profiles of perpetrators convicted for a variety of serious crimes. Consequently, if there is no suspect, a deduced SNP profile cannot be matched against a database. Thus, it is recommended that such databases be set up in the future.


2.3.3. Running costs for and accessibility of SNP-based NIPAT

Yang and colleagues [19] included a cost estimation for SNP-based NIPAT in their study. They reported that the primer costs for each sample were approximately $11 for 720 amplicons and that the panel could be made within 15 working days. The price for library construction, emulsion PCR, and massive parallel sequencing per genomic DNA and cell-free DNA sample was around $200 and $250, respectively. The authors also mentioned that several other studies used sequencing libraries prepared by hybridisation and capture for detecting paternally inherited foetal SNP alleles in maternal plasma. This required a panel with oligonucleotide probes, which cost more than $200 per sample. Furthermore, developing such a panel took approximately 30 to 60 working days.

Qu and colleagues [20] mentioned that the costs for library preparation and massive parallel sequencing with high sequencing coverage were about $300 and $100 per sample, respectively. In contrast, microarrays used for SNP-based NIPAT, which cover SNP loci with only a limited number of probes, cost about $600 per sample.

DNA Diagnostics Centre® (DDC®) [40] developed a SNP-based NIPAT that is at present commercially available. DDC® is a laboratory that has been accredited by, amongst others, the ANSI National Accreditation Board ISO/IEC 17025, the American Association of Blood Banks, and the Ministry of Justice. It offers the Certainty™ NIPAT that can determine paternity with 99.9% accuracy from GW7 onwards, and costs about $1500 to $2000. This NIPAT was previously based on the SNP microarray technology [17] but DDC® reveals that it currently uses the next generation sequencing technology for SNP-based NIPAT [40]. Nevertheless, no further information is provided regarding the particular next generation sequencing method applied and the SNP loci investigated. Additionally, paternity determination results can be obtained within approximately 7 working days. DDC® is also able to perform paternity tests that are admissible in court.

Taking into account the increasing focus on massive parallel sequencing for SNP-based NIPAT and the expectation of its further decreasing costs, the application of massive parallel sequencing for SNP-based NIPAT rather than the SNP microarray technology is more likely to be implemented in forensic practices.


3. Conclusion and perspectives

To return to the main research question of this literature thesis, ‘Can SNP genotyping be used for NIPAT?’, the answer tends towards ‘yes’. However, a widely accepted and validated approach to NIPAT has not yet been established because of the various challenges that still need to be addressed. The existence of circulating cffDNA in the plasma of pregnant women forms the basis for the development of SNP-based NIPAT methods. As discussed in this review, the cffDNA fraction in maternal plasma shows inter- and intra-individual variation and is influenced by both biological and pre-analytical factors. Lower foetal fractions have been detected in early pregnancy than in later pregnancy, and an inverse relationship has been described between foetal fraction and maternal BMI or weight. Interestingly, no correlation has been found between foetal fraction and maternal age. Since cffDNA yield also depends on the DNA extraction method, it is essential to select an appropriate extraction method before performing NIPAT. Compared with the manual column-based extraction that is currently used, automated magnetic-based extraction might be more valuable, as it has been reported to give a higher cffDNA yield while being less time-consuming and less labour-intensive. With regard to cffDNA enrichment, size-, DMR-, and formaldehyde-based methods all need further research before potential implementation in NIPAT methods. In recent studies, massive parallel sequencing methods have increasingly been applied for SNP-based NIPAT and are likely to be the approaches that will ultimately be used in forensic cases. To circumvent the interference of the excess of cell-free maternal DNA in maternal plasma, researchers have suggested considering only effective autosomal SNP loci when calculating the CPI and determining paternity. These are loci for which the mother is homozygous, thereby allowing the paternally inherited foetal SNP alleles to be inferred. Additionally, paternity determination with high evidential values has been reported from GW6 onwards. Although the prospects of SNP-based NIPAT methods are promising for implementation in, for example, rape cases resulting in pregnancy, forensic databases storing SNP profiles are not yet available, and their development is recommended. It should also be mentioned that DDC® has developed a SNP-based NIPAT that is currently commercially available. It is expected that more laboratories will follow in the near future.
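As a rough illustration of the effective-loci idea summarised above, the following Python sketch selects loci at which the mother is homozygous, infers the paternally inherited foetal allele as the single non-maternal allele seen in plasma, and multiplies per-locus paternity indices into a CPI. All genotypes, allele frequencies, locus names, and function names are hypothetical and chosen only for illustration. The per-locus paternity index is assumed here to be the probability that the alleged father transmits the inferred allele divided by that allele's population frequency, and the loci are assumed to be independent. The sketch ignores mutation, sequencing error, and stochastic effects such as allelic drop-out, and it is not the statistical model of any particular study discussed in this review.

```python
from math import prod

# Hypothetical genotypes: locus -> pair of alleles.
mother = {"rs1": ("A", "A"), "rs2": ("C", "T"), "rs3": ("G", "G")}
alleged_father = {"rs1": ("A", "G"), "rs2": ("C", "C"), "rs3": ("T", "T")}

# Hypothetical alleles detected in maternal plasma, assumed to have
# already passed a background-noise threshold.
plasma = {"rs1": {"A", "G"}, "rs2": {"C", "T"}, "rs3": {"G", "T"}}

# Hypothetical population allele frequencies.
freq = {
    "rs1": {"A": 0.75, "G": 0.25},
    "rs2": {"C": 0.5, "T": 0.5},
    "rs3": {"G": 0.5, "T": 0.5},
}


def paternal_alleles(mother, plasma):
    """Infer the paternal foetal allele at maternal-homozygous loci."""
    inferred = {}
    for locus, (a1, a2) in mother.items():
        if a1 != a2:
            continue  # mother heterozygous: not an effective locus
        extra = plasma[locus] - {a1}
        if len(extra) == 1:  # exactly one non-maternal allele observed
            inferred[locus] = next(iter(extra))
    return inferred


def paternity_index(father_genotype, allele, allele_frequency):
    """Simplified PI: transmission probability / population frequency."""
    transmission = father_genotype.count(allele) / 2  # 0, 0.5 or 1
    return transmission / allele_frequency


def combined_paternity_index(mother, father, plasma, freq):
    """Product of per-locus PIs over the effective loci."""
    inferred = paternal_alleles(mother, plasma)
    return prod(
        paternity_index(father[locus], allele, freq[locus][allele])
        for locus, allele in inferred.items()
    )


cpi = combined_paternity_index(mother, alleged_father, plasma, freq)
print(cpi)
```

In this toy example only rs1 and rs3 count as effective loci, because the mother is heterozygous at rs2. Loci at which the foetus inherited the same allele from both parents carry no non-maternal allele and are therefore skipped by this simplified selection.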

In conclusion, this literature thesis provides an up-to-date overview of the use of SNP genotyping for NIPAT and the factors influencing its performance. Additionally, it emphasises the knowledge that is still missing in this research field. Studies should be performed that aim at the development of a widely accepted and validated SNP-based NIPAT that can be used early in pregnancy. These studies should focus on investigating the variability across different ethnicities and on compiling a widely applicable set of SNP markers that includes more discriminatory multi-allelic SNPs. They should also define thresholds for foetal fraction, for differentiating background noise from paternally inherited foetal SNP alleles, for sequencing depth, for MAF, and for the total and/or effective number of SNP loci. Studies involving male family members of the alleged father should also be carried out. Additionally, statistical models that consider stochastic effects should be implemented, and large-scale experiments should be performed.


3.1. Management summary for chain partners and policy makers

In a considerable number of cases, female victims of rape become pregnant. At present, many medical and forensic laboratories lack the option of a non-invasive prenatal DNA paternity test for the foetus carried by the victim. According to recent scientific publications, several research groups have studied new methods that enable non-invasive prenatal paternity testing using a blood sample drawn from the mother. It has been reported that sufficient cell-free DNA from the foetus circulates in the blood of women in early pregnancy, typically from gestational week six onwards, to allow reliable paternity determination with extremely high evidential values. Single nucleotide polymorphisms are genetic variations that have recently been researched as potential markers for non-invasive prenatal paternity testing. The objective of this thesis is to give an in-depth overview of single nucleotide polymorphism-based non-invasive prenatal paternity testing methods and their challenges. Evaluation of the different methods has revealed that massive parallel sequencing technology is the most promising direction to follow. This technology can provide a high sequencing depth to obtain accurate sequencing results, and it enables simultaneous processing of multiple DNA sequences. Furthermore, a cost assessment has suggested that massive parallel sequencing is a commercially viable method for single nucleotide polymorphism-based non-invasive prenatal paternity testing. DNA Diagnostics Centre® is a laboratory that has acquired the relevant accreditations for offering single nucleotide polymorphism-based non-invasive prenatal paternity testing using massive parallel sequencing. The whole procedure currently costs about $1500 to $2000. However, it is expected that further development and optimisation of the techniques will lead to a decrease in costs. Interestingly, DNA Diagnostics Centre® might even perform paternity testing to produce a legal document with results that can be used in court proceedings.


References

[1] Centraal Bureau voor Statistiek, “Geregistreerde criminaliteit; soort misdrijf, regio.,” 2019. [Online]. Available: https://opendata.cbs.nl/statline/#/CBS/nl/dataset/83648NED/table?ts=1571170465757. [Accessed: 25-Oct-2019].

[2] Centrum Seksueel Geweld, “Factsheet: cijfers over seksueel geweld in Nederland.,” 2013.

[3] Rijksoverheid, “Abortus.” [Online]. Available: https://www.rijksoverheid.nl/onderwerpen/abortus. [Accessed: 25-Oct-2019].

[4] Z. Alfirevic, K. Navaratnam, and F. Mujezinovic, “Amniocentesis and chorionic villus sampling for prenatal diagnosis,” Cochrane Database Syst. Rev., vol. 9, p. CD003252, Sep. 2017.

[5] R. Akolekar, J. Beta, G. Picciarelli, C. Ogilvie, and F. D’Antonio, “Procedure‐related risk of miscarriage following amniocentesis and chorionic villus sampling: a systematic review and meta‐analysis,” Ultrasound Obstet. Gynecol., vol. 45, no. 1, pp. 16–26, Jan. 2015.

[6] Y. M. D. Lo et al., “Presence of fetal DNA in maternal plasma and serum,” Lancet, vol. 350, no. 9076, pp. 485–487, Aug. 1997.

[7] A. Vaiopoulos, K. Athanasoula, N. Papantoniou, and A. Kolialexi, “Review: advances in non-invasive prenatal diagnosis.,” In Vivo (Brooklyn)., vol. 27, pp. 165–170, 2013.

[8] Y. M. D. Lo, J. Zhang, T. N. Leung, T. K. Lau, A. M. Z. Chang, and N. M. Hjelm, “Rapid Clearance of Fetal DNA from Maternal Plasma,” Am. J. Hum. Genet., vol. 64, no. 1, pp. 218–224, Jan. 1999.

[9] Quantum Diagnostics, “Non-invasive Prenatal Testing (NiPT),” 2017. [Online]. Available: https://quantumdxs.com/patients/nipt. [Accessed: 27-Oct-2019].

[10] J. M. Butler, “The future of forensic DNA analysis,” Philos. Trans. R. Soc. B Biol. Sci., vol. 370, no. 1674, 2015.

[11] K. C. A. Chan et al., “Size distributions of maternal and fetal DNA in maternal plasma.,” Clin. Chem., vol. 50, no. 1, pp. 88–92, Jan. 2004.

[12] N. B. Y. Tsui et al., “High resolution size analysis of fetal DNA in the urine of pregnant women by paired-end massively parallel sequencing.,” PLoS One, vol. 7, no. 10, p. e48319, 2012.

[13] E. L. Romsos and P. M. Vallone, “Rapid PCR of STR markers: Applications to human identification,” Forensic Science International: Genetics, vol. 18. Elsevier Ireland Ltd, pp. 90–99, 21-Apr-2015. [14] M. R. Whittle, C. W. Francischini, and D. R. Sumita, “Routine implementation of noninvasive prenatal

paternity testing with STRs,” Forensic Sci. Int. Genet. Suppl. Ser., vol. 6, pp. e233–e234, Dec. 2017. [15] J. A. Tynan, P. Mahboubi, L. L. Cagasan, D. van den Boom, M. Ehrich, and P. Oeth, “Restriction

enzyme-mediated enhanced detection of circulating cell-free fetal DNA in maternal plasma.,” J. Mol. Diagn., vol. 13, no. 4, pp. 382–9, Jul. 2011.

[16] X. Guo et al., “A Noninvasive Test to Determine Paternity in Pregnancy,” N. Engl. J. Med., vol. 366, no. 18, pp. 1743–1745, May 2012.

[17] A. Ryan et al., “Informatics-based, highly accurate, noninvasive prenatal paternity testing,” Genet. Med., vol. 15, no. 6, pp. 473–477, Jun. 2013.

[18] H. Jiang et al., “Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.,” PLoS One, vol. 11, no. 9, p. e0159385, 2016.
