Forensic DNA phenotyping of adult body height

Academic year: 2021

Forensic DNA phenotyping of adult body height

Current scientific and technical knowledge and recommendations for future research

Melde Witmond (student ID 11285699)

Literature thesis

Master Forensic Science

January 2018

Supervisor: Prof. dr. Ate Kloosterman

Co-assessor: Dr. ir. Huub Hoefsloot

Abstract

Forensic DNA phenotyping (FDP) is the prediction of externally visible characteristics (EVCs), such as eye colour or height, from a DNA sample. FDP has the potential to provide a profile of the EVCs of an individual, which could produce new leads in a police investigation. Reliable prediction is already possible for some EVCs, namely eye and hair colour; expanding the set of EVCs that can be predicted would increase the value of FDP in an investigation. Body height prediction would provide particularly useful information, as height is easily observed by witnesses and difficult to manipulate. Therefore, this literature study aims to create an overview of the research into adult body height prediction and to provide recommendations on how to proceed towards implementation of height prediction into forensic practice. Genotype-phenotype relations are pivotal for accurate predictions from DNA. Using several approaches, researchers have associated almost 700 common and over 80 rare variants with height, which together explain approximately 27.4% of height heritability. Several studies have attempted to predict height using the variants known at that time, with promising results. However, a height prediction model has not yet been developed. There are FDP prediction models available for other EVCs, such as the HIrisPlex model and the Identitas v1 Forensic Chip. These models should serve as guides for the development of a prediction model for height. Thus, research into FDP of adult body height is still far from implementation into practice, but a promising start has been made, and much of the necessary knowledge for the development of a prediction model is available. Although research into the genotype-phenotype relations and prediction of height is important, future research should focus on the development of a reliable multi-trait prediction model that includes height, eye and hair colour, biogeographical ancestry and biological sex.

Keywords: forensic intelligence, forensic DNA phenotyping, prediction, adult body height, externally visible characteristics, genotype-phenotype relations

Contents

Abstract
Contents
1. Introduction
2. Research into forensic DNA phenotyping
2.1 Research methods for genotype-phenotype relations
2.2 Development and validation of a forensic prediction model
2.3 Legal and ethical considerations
3. Genotype-phenotype relations of adult body height
3.1 Common variants
3.2 Rare variants
3.3 Biological pathways and processes
4. Prediction of adult body height
4.1 Research done into height prediction
4.2 Other FDP models as examples
5. Conclusions and recommendations
5.1 Conclusions
5.2 Recommendations
References

1. Introduction

In forensic DNA phenotyping (FDP), certain traits of a donor are predicted from the DNA by genotyping loci that are linked to specific phenotypes [1]. DNA phenotyping can be used to predict many traits; in the biomedical field, it is usually applied to predict diseases [2], while in the forensic field, it can be used to determine externally visible characteristics (EVCs) such as eye and hair colour [1,3–5]. Other EVCs that have a genetic factor, and thus can be predicted from DNA, are biological sex [3], hair structure [6], skin colour [7,8], facial features [9] (e.g. eye distance, chin shape), age [10], and body height [11,12]. Some researchers include biogeographical ancestry as an EVC, while others argue that someone’s genetic origin cannot be observed and that it is therefore not an EVC [1].

FDP can be used to compile a profile of the EVCs of the donor, which could be a great addition to the police investigation, especially when the donor is unknown and conventional DNA methods fail [3]. Currently, when crime scene material containing DNA is found, short tandem repeat (STR) analysis is performed to obtain a DNA profile that can be used for identification of the donor by comparison to reference profiles or a DNA database. However, if no DNA profile match is found, the police investigation might come to a standstill, especially when there is limited other evidence [1,3,4]. In these cases, FDP and the subsequent EVC profile could provide new insights and leads in the investigation, for example by narrowing down the list of possible individuals to only those matching the EVC profile. FDP could be particularly helpful in cases where skeletonised human remains are discovered or where an unknown perpetrator left a DNA-containing trace. Thus, FDP might be the solution to furthering criminal investigations that have come to a standstill or to reopening cold cases. However, FDP can currently predict only a limited range of phenotypes reliably, and only for a few EVCs [1]: for example only blue and brown eye colour and only red, blond and brown hair colour. If a trace is small, which it often is in forensic cases, it is undesirable to waste sample material on a test that gives limited or no useful information. If more research is done into the genotype-phenotype relations of EVCs, more EVCs can be reliably predicted, and a more informative test can be developed.

As many researchers are performing their own studies on trait prediction, both in the biomedical field and the forensic field, it is important to create an overview of the research done so far. This will allow for a better understanding of what is still to be researched before FDP can be applied in forensic practice. Such an overview should be created for each EVC, but this would be too much for one study. Thus, in this literature study, I will focus on one EVC, adult body height, for several reasons. First of all, biological sex, eye colour, hair colour and skin colour have been extensively researched, while research into the genotype-phenotype relations of body height is still emerging. EVCs such as hair structure and facial features have not been studied as much, and thus, an overview is not yet needed. Secondly, although height is a more complex trait than eye and hair colour, it is less complex than facial features. This makes body height the logical next EVC to investigate thoroughly, after eye and hair colour. Thirdly, compared to other EVCs, body height can give information that is very relevant to a police investigation: height is more easily observable than eye colour, it is not as easy to change as hair colour, and it raises fewer racial and ethical issues than skin colour.

As it is my opinion that including adult body height in an FDP analysis would be of great value, this literature study will provide an overview of the research so far on body height and give recommendations on how to progress with the research and development of a prediction model. The research question of this study is, therefore: “How far from implementation into practice is the research into forensic DNA phenotyping of adult body height?”. To answer this research question, forensic DNA phenotyping in general will first be discussed in more depth. Then, the genotype-phenotype relations of body height will be described. This is followed by a discussion of the research into the prediction of height, including a brief description of FDP prediction models for other EVCs. Lastly, conclusions and several recommendations for future directions will be given.

2. Research into forensic DNA phenotyping

As mentioned in chapter 1, FDP is the process of predicting EVCs of a donor from his or her DNA. EVCs are considered complex traits: their phenotype is influenced by multiple genetic loci and environmental factors [3,13]. This makes it harder to identify the genetic loci of EVCs, and thus to predict the phenotypes, than for Mendelian inherited traits, which only have a genetic factor [13]. Understandably, FDP starts with thoroughly characterising the genotype-phenotype relations of an EVC. This chapter will discuss the most widely used research methods to unravel these relations, as well as an approach to develop and validate a prediction model for forensic practice. Several legal and ethical considerations of FDP are discussed as well, as those are important in the implementation of FDP in practice.

2.1 Research methods for genotype-phenotype relations

Genotype-phenotype relations, where single nucleotide polymorphisms (SNPs) are linked to a phenotypic characteristic [2,14], are fundamental to FDP. For example, a particular nucleotide at a SNP location is observed more often in individuals with blue eyes, while another nucleotide at that location is observed more often in individuals with brown eyes. Genotyping this SNP, together with several other SNPs that are associated with eye colour, can be used to predict the eye colour of an individual. As EVCs are complex traits, there are many SNPs associated with an EVC, and their combination determines the phenotype. There are several research methods for investigating genotype-phenotype relations; Table 1 provides a comparative overview of these methods.

One of the most used methods to study these genotype-phenotype relations is genome-wide association studies (GWAS) [2,12,15]. Many thousands of participants are genotyped with SNP arrays and their phenotypes are recorded. Depending on the study, these phenotypes can be diseases, EVCs or both; usually, multiple phenotypes are recorded and investigated in one study. Participants are grouped and compared to find SNPs that significantly differ between a group with a certain variant of a trait and a group with another variant of that trait [2,16].
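The group comparison described above can be illustrated for a single SNP. The sketch below is a minimal example and not the procedure of any cited study: it assumes a biallelic SNP, two phenotype groups, and a Pearson chi-square test on allele counts, which is one common choice for such a comparison. The function name `allele_chi_square` and all counts are hypothetical.

```python
import math


def allele_chi_square(group_a, group_b):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    group_a, group_b: (count_allele_1, count_allele_2) for each phenotype group.
    Returns (chi2, p_value).
    """
    a1, a2 = group_a
    b1, b2 = group_b
    n = a1 + a2 + b1 + b2
    row_a, row_b = a1 + a2, b1 + b2
    col_1, col_2 = a1 + b1, a2 + b2
    if min(row_a, row_b, col_1, col_2) == 0:
        raise ValueError("every row and column needs at least one allele")
    chi2 = n * (a1 * b2 - a2 * b1) ** 2 / (row_a * row_b * col_1 * col_2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one SNP in two phenotype groups.
tall_group = (620, 380)
short_group = (500, 500)
chi2, p_value = allele_chi_square(tall_group, short_group)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

In a real GWAS this test would be repeated for every SNP on the array, so the resulting p-values would need a correction for multiple testing before any SNP is called significant.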

There are several aspects of GWAS that need to be considered as they influence the findings. The first aspect is the selection of participants; more and more GWAS are using population-based approaches to participant selection, called cohort studies [2]. Cohort studies are especially useful for investigating EVCs, as EVCs usually have a continuous spectrum of phenotypes. Another factor in participant selection is the biogeographical ancestry, or ethnicity, of the participants [2,14,17]. It is important to investigate whether significant findings are truly associated with the EVC under study or with the biogeographical ancestry [15]. Preferably, in an initial study, participants of one ethnic group are selected [2]. A follow-up study can then investigate whether the findings in the ethnic group of the first study are also observed in other ethnic groups [17].

Secondly, the data generation and analysis design must be carefully considered. Most GWAS use arrays with common SNPs, either with a DNA pooling approach or with individual genotyping [2,14,16]. DNA pooling, where the DNA of all participants of one group is pooled and typed together, can reduce the costs greatly. However, the power of the analysis is reduced and individual genotype data is lost. The latter is of particular concern, as it means that the data cannot be reused for future research. Although individual genotyping is more expensive, data typed with comparable arrays from multiple studies can be combined to create larger datasets. These larger datasets give enough power to identify new genes involved in the EVC under investigation and to find SNPs with smaller effect sizes [2]. This is a great advantage of individual genotyping approaches in GWAS, and it might save costs in the long run.

Then, there are some general aspects of GWAS that must be mentioned. Replication and/or validation of GWAS findings is extremely important [2,16,17]. Findings can be true associations or false positives; validation studies can establish which is the case, as well as the source of potential bias or error. Thus, when performing validation studies, it is pivotal to use independent samples [2,15,17].

[…] are located in non-coding regions, meaning that they are most likely non-causal variants when associated with a phenotype. Causal variants are more reliable for predicting a phenotype than non-causal variants [2]. Thus, when non-causal variants are found to be associated with an EVC, the region surrounding the variant should be closely investigated in an attempt to identify a causal variant. A last, but important note on GWAS is that it is very useful for identifying common variants [2,13,16], with a minor allele frequency (MAF) >5%, but less able to identify low-frequency variants (1% < MAF < 5%) and rare variants (MAF <1%) [2]. These variants with lower frequency (MAF <5%) can have larger effect sizes and can thus be informative in predicting very complex traits [12,18,19].
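The frequency classes above can be written down as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the SNP is assumed to be biallelic, and because the text does not say where a MAF of exactly 1% or 5% belongs, both boundary values are assigned to the low-frequency class here.

```python
def minor_allele_frequency(count_allele_a, count_allele_b):
    """Frequency of the less common of the two alleles at a biallelic SNP."""
    total = count_allele_a + count_allele_b
    if total <= 0:
        raise ValueError("need at least one observed allele")
    return min(count_allele_a, count_allele_b) / total


def classify_variant(maf):
    """Map a MAF to the classes used in the text: common, low-frequency, rare.

    Assumption: MAF values of exactly 1% and 5% count as low-frequency.
    """
    if not 0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency lies between 0 and 0.5")
    if maf > 0.05:
        return "common"
    if maf >= 0.01:
        return "low-frequency"
    return "rare"


# Hypothetical allele counts, each out of 10,000 genotyped chromosomes.
for counts in [(7400, 2600), (9700, 300), (9960, 40)]:
    maf = minor_allele_frequency(*counts)
    print(f"MAF {maf:.3f}: {classify_variant(maf)}")
```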

Table 1: Comparative overview of the main similarities, differences, advantages, and limitations of research methods for genotype-phenotype relations. GWAS: genome-wide association studies; RVAS: rare variants association studies; PheWAS: phenome-wide association studies.

Research method | Genotyping approach | Results/findings | Advantages | Limitations
GWAS | Genome-wide SNP array | Common variants | Easily reusing data from other studies | Identifying only common, non-causal variants; not explaining all heritability; focussing on the presence or absence of a trait
RVAS | Whole genome sequencing (WGS); whole exome sequencing (WES); targeted sequencing of candidate genes; exome SNP array | Rare and low-frequency variants | Identifying causal, more reliable variants | Requiring very large sample populations; requiring more complex statistical analysis
PheWAS | Genome-wide SNP array | Common variants | Understanding phenotypic structure of continuous traits | Requiring extensive genotypic and phenotypic data

Rare variant association studies (RVAS) are specially designed for linking rare genotype variants to phenotypes [13]. The study design and analysis methods differ slightly from GWAS. Firstly, participant selection in RVAS needs careful consideration. As rare variants have very low frequencies in the general population, the study population must be either of sufficient size to detect these variants or enriched for rare variants [13]. Enrichment can be achieved through extreme phenotype sampling [13,18], where only individuals with the extreme phenotypic variants of a trait are selected. This is especially useful for common diseases with early onset, but less so for EVCs. Another enrichment method is to study isolated populations that have undergone reduced genetic diversity and increased genetic drift [13]. Frequencies of rare variants may differ in these isolated populations, and the variants may thus be easier to detect. Family studies provide another approach to increase the number of rare variants in the study population [13,14]. Genotyping families with multiple affected and unaffected individuals can give additional power to detect rare variants. Although enriching the study population might seem like a simple method to increase the power of the study, one must be careful to avoid selection bias.

Secondly, GWAS is usually performed with genome-wide SNP arrays, while RVAS can be done with whole genome sequencing (WGS), whole exome sequencing (WES), targeted sequencing of selected genes, or exome SNP arrays [13,18]. WGS would be the most thorough but also the most expensive; exome SNP arrays are the least expensive, but they are based on data from individuals of European ancestry and are thus less accurate for individuals from other ethnic groups [13].

Lastly, there are some general aspects of RVAS that should be mentioned. The statistical analysis for RVAS is more complicated than for GWAS, mainly due to the rarity of the variants [13,18]. Validation is also more difficult for rare variants, as it might be challenging to find an appropriate study population. Generally, there are two methods to validate RVAS findings: sequencing the entire gene in which the rare variant is located in a validation study population, or genotyping only the rare variant [13]. The sequencing approach is considered more powerful; however, when other aspects such as costs are taken into account, both approaches can be used for the task. All in all, RVAS is not a replacement for GWAS. On the contrary, RVAS can complement GWAS by identifying causal genes for GWAS findings.

Another approach to genotype-phenotype relation characterisation is phenome-wide association studies (PheWAS), where analysis of phenotypic structure is combined with genotype-phenotype investigation [20]. Pendergrass et al. (2011) [20], who propose PheWAS to complement GWAS investigations, mention that one of the limitations of GWAS is that it usually focusses on presence or absence of a trait, while most complex traits have a continuous spectrum of phenotypes. Moreover, many genetic loci of complex traits show pleiotropy, meaning that one genetic locus influences multiple traits. Including multiple phenotypes and the phenotypic structure of a trait is, for these reasons, essential in PheWAS.

As much phenotype data is needed, as well as genotype data, an appropriate study population is required [20]. With the PheWAS approach, many variables (genotypes, phenotypes, ethnic groups) can be compared to each other, allowing for many more comparisons than the GWAS approach, and thus for the discovery of novel associations and relations. However, because of the many comparisons, this approach requires extensive statistical analysis to keep the false-positive rate low. Pendergrass et al. mention both strengths and limitations of the PheWAS approach, concluding that a combination of GWAS and PheWAS would give the optimal results in understanding the genotype-phenotype relations of complex traits [20].
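To show why many comparisons inflate the number of false positives, the sketch below applies a Bonferroni correction. This is an assumption made for illustration: the text does not state which correction Pendergrass et al. use, and Bonferroni is only the simplest option. The function names, test labels and p-values are hypothetical.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of false positives if no tested association is real."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if n_tests < 1:
        raise ValueError("need at least one test")
    return alpha / n_tests


def significant_after_correction(p_values, alpha=0.05):
    """Return the labels of the tests that pass the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [label for label, p in p_values.items() if p < threshold]


# Hypothetical p-values for four genotype-phenotype comparisons.
p_values = {
    "snp1-height": 0.0004,
    "snp1-bmi": 0.03,
    "snp2-height": 0.2,
    "snp2-bmi": 0.04,
}
print(expected_false_positives(0.05, 1000))    # about 50 chance findings in 1000 tests
print(significant_after_correction(p_values))  # only tests below 0.05 / 4 remain
```

Three of the four hypothetical p-values fall below an uncorrected 0.05, but only one survives the corrected threshold of 0.0125.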

Each of the above discussed methods has its own strengths and limitations in characterising the genotype-phenotype relations of complex traits (see Table 1). GWAS is the most logical approach to gain information for reliable prediction, especially for the EVCs of lower complexity such as eye colour and hair colour. However, the common variants found with GWAS might not be enough to reliably predict the phenotypes of more complex EVCs such as body height and facial features. In my opinion, RVAS would then be a valuable addition to GWAS investigations due to its power to associate low-frequency variants with phenotypes. Moreover, rare variants often have larger effect sizes than common variants. PheWAS could provide additional information on phenotypic structure, information that GWAS and RVAS cannot provide. The question is whether, for a forensic application, the gain in knowledge on genotype-phenotype relations is worth the costs and effort of PheWAS. Keeping in mind the goal of FDP, I find it questionable how much knowledge on the phenotypic structure is needed and how applicable that knowledge is in the prediction of EVCs, especially if GWAS and RVAS research is already performed. Nevertheless, as the complexity of the EVCs investigated increases, more methods must be deployed in order to gather enough data to make reliable predictions. Thus, if GWAS and RVAS cannot provide enough information for reliable FDP, PheWAS could, in my opinion, be of additional value.

2.2 Development and validation of a forensic prediction model

When the genotype-phenotype relations of an EVC are sufficiently characterised, research can move towards the development of an FDP prediction model, which consists of a SNP genotyping array and a computational model. Examples of prediction models are the HIrisPlex model [4] and the Identitas v1 Forensic Chip [5]; chapter 4 will elaborate on these models. There are several stages in the development of a prediction model.

A prediction model will most likely not include all SNPs that are associated with an EVC but only the SNPs with the strongest association and largest effect sizes. This ensures that the costs of the genotyping array are kept low while still allowing accurate predictions. Generally, the more complex the EVC, the more SNPs must be included in an array to reach accurate predictions; for example, a set of 6 SNPs is enough for eye colour prediction, while 22 SNPs are used for hair colour prediction, a more complex EVC [4]. When the SNPs with the strongest association and largest effect size are selected, several subsets of these SNPs should be tested for their predictive value. Due to linkage disequilibrium, combinations of some SNPs might give better results than other combinations [1]. Through trial and error, the combination of SNPs with the highest predictive value can be determined. These SNPs can then be combined in a customised SNP array.

The SNP array is only part of the prediction model; a computational model is needed to transform the genotype data into phenotype predictions. When developing the computational model, it is important to consider how the data should be interpreted and reported to the investigating authorities. The Bayesian framework and likelihood ratios (LRs) are already often used in forensic investigations. Therefore, my opinion is that it would be best if the prediction model calculated LRs for each phenotypic variant of the EVC given the genotype data of the sample.
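As an illustration of what reporting an LR per phenotypic variant could look like, the sketch below compares, for each category, the probability of the observed genotype if the donor belongs to that category with the probability if the donor belongs to any other category. Everything in it is an assumption made for the example: the three height categories, the genotype probabilities, the population proportions and the function name `likelihood_ratios` are hypothetical and do not come from any existing model.

```python
def likelihood_ratios(genotype_probability, population_share):
    """LR per category: P(genotype | category) / P(genotype | any other category).

    genotype_probability: P(observed genotype | category), per category.
    population_share: proportion of the population in each category, used
        to weight the alternative categories.
    """
    ratios = {}
    for category in genotype_probability:
        others = [c for c in genotype_probability if c != category]
        share_others = sum(population_share[c] for c in others)
        p_genotype_others = sum(
            genotype_probability[c] * population_share[c] for c in others
        ) / share_others
        ratios[category] = genotype_probability[category] / p_genotype_others
    return ratios


# Hypothetical values for one crime scene sample.
genotype_probability = {"short": 0.01, "average": 0.02, "tall": 0.06}
population_share = {"short": 0.25, "average": 0.50, "tall": 0.25}

for category, lr in likelihood_ratios(genotype_probability, population_share).items():
    print(f"{category}: LR = {lr:.2f}")
```

An LR above 1 supports the category and an LR below 1 supports the alternative, which matches how LRs are usually read for other forensic evidence.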

There are several parameters that are important to test while developing an FDP prediction model. During the development, these aspects can then be adjusted, incorporated or fine-tuned in the model. Regarding forensic samples, it must be determined how much DNA is needed to obtain an accurate genotyping result from the SNP array [21]. Forensic samples are often minuscule; thus, an array that only needs a small amount of DNA is preferred. It should also be investigated whether the array gives accurate results on degraded samples [21]. In real criminal investigations, there is often a time gap between the deposit of the sample and the collection and subsequent analysis. Depending on environmental factors, the crime scene sample (and the DNA in it) can be in various stages of degradation. If the SNP array works well on degraded samples, the prediction model has more forensic relevance.

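[Figure: flowchart of the stages in the development of an FDP prediction model, with the boxes, in order: EVC-associated SNPs; customised SNP array; computational model; testing parameters regarding samples and accuracy; independent validation; EVC prediction model.]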

Then, the computational model must be tested for reliability, reproducibility and accuracy. The error rate must be determined by comparing the predicted probabilities of a phenotype to the actual measured phenotype. By performing repeated predictions for the same sample, the reproducibility of the model can be tested. It is especially important to test whether the prediction model is also accurate and reliable for individuals from ethnic groups other than the one it was developed for (usually European ancestry). All these parameters must be thoroughly investigated and fully understood before the model is applied in real criminal investigations.
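The two checks described above can be sketched as follows. This is a minimal illustration under my own assumptions: a prediction is a dictionary of probabilities per phenotype category, the category with the highest probability is taken as the model's call, and the error rate is the fraction of calls that disagree with the measured phenotype. The function names and all values are hypothetical.

```python
def predicted_category(probabilities):
    """Category with the highest predicted probability."""
    return max(probabilities, key=probabilities.get)


def error_rate(predictions, observed):
    """Fraction of samples whose predicted category differs from the measured one."""
    if not observed or len(predictions) != len(observed):
        raise ValueError("need one measured phenotype per prediction")
    wrong = sum(
        predicted_category(probs) != truth
        for probs, truth in zip(predictions, observed)
    )
    return wrong / len(observed)


def is_reproducible(repeated_predictions):
    """True if repeated runs on the same sample all give the same category."""
    return len({predicted_category(p) for p in repeated_predictions}) == 1


# Hypothetical predictions for four samples with known phenotypes.
predictions = [
    {"short": 0.70, "tall": 0.30},
    {"short": 0.40, "tall": 0.60},
    {"short": 0.55, "tall": 0.45},
    {"short": 0.20, "tall": 0.80},
]
observed = ["short", "tall", "tall", "tall"]
print(error_rate(predictions, observed))

# Hypothetical repeated runs on one sample.
print(is_reproducible([{"short": 0.70, "tall": 0.30}, {"short": 0.65, "tall": 0.35}]))
```

To check performance across ethnic groups, the same error rate could be computed separately per group rather than over the pooled samples.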

The last but essential stage in the development of a prediction model is the validation stage, where the parameters determined during development should be tested. The validation should be carried out entirely independently of the development of the model, in an independent study population [2,13].

2.3 Legal and ethical considerations

Besides technical considerations regarding the research methods to identify genotype-phenotype relations and to develop a prediction model for FDP, there are some legal and ethical issues that must be considered before FDP can be implemented into forensic practice. A thorough discussion of these issues is beyond the scope of this literature study, although it is important to take note of these issues.

The first and foremost legal issue is that in many countries, the legislation surrounding the use of DNA evidence for anything other than conventional STR typing is not clearly defined, although it is clearly defined in the Netherlands and Germany [1,3]. Furthermore, most countries forbid the use of coding variants in forensic DNA analysis, only allowing the use of non-coding variants that are non-informative of a person’s phenotype [1,3,22]. Before FDP can be applied in forensic practice, the legal basis must be firmly established. One might wonder why research effort should be spent on FDP when it is not allowed in forensic practice in some countries. My answer would be: to prove its value. The Netherlands, where FDP is allowed in some instances, can serve as an example for studying the added value of FDP in a criminal investigation.

Another legal issue, brought forward by Matheson (2016) [22], is that the conventional STR analysis was designed to contain no personal information and must be used for identification purposes only. According to Matheson, the DNA profile to DNA profile comparison is similar to the comparison of fingerprints or passport numbers. The basis of FDP, however, is the comparison of a genotype (the predicted EVCs) to a phenotype (the observable traits of each person). This type of comparison does contain personal information; the question is whether such a comparison is allowed.

Then, there is the issue of the reliability of FDP evidence [1,3]. FDP gives information on the EVCs of an individual, similar to the information that a human eyewitness could provide. The difference is that FDP has, or will have, a sound statistical basis, whereas it has been shown that human eyewitness statements are error prone [1]. In court, this might give FDP an advantage over eyewitness statements.

Lastly, it must be noted that FDP evidence will most likely not be used for definitive identification purposes, but only to produce new leads in criminal investigations that have come to a standstill [3,22]. It will thus not be used in court; conventional STR analysis is needed for the identification of the donor.

There are also several ethical issues that need careful consideration, of which privacy is the most prominent one [1,3]. The question raised is whether EVCs are private or public information. Some EVCs, such as hair colour and body height, might be viewed as public as they are clearly visible to the public, while other EVCs that are less visible, such as biological sex and biogeographical ancestry, might be viewed as private information.

Another ethical issue that must be solved is what is to be done with the information obtained from FDP. How long should the information be stored and in what form? Kayser and Schneider (2009) [3] suggest that only the predicted phenotypes of the EVCs should be stored, not the actual genotype data; this would reduce privacy concerns. Another suggestion is that FDP information should be stored only for the duration of the criminal investigation (and subsequent legal processes), as its relevance is related to a specific criminal investigation. When FDP information produces new leads or suspects and a suspect is subsequently identified using conventional STR analysis, the FDP information is no longer relevant and does not need to be stored any longer.

Lastly, an ethical issue that is often brought forward is the fear of racial profiling, which is an unfounded and inappropriate focus on a particular ethnic group in an investigation [3,22]. Inferring the biogeographical ancestry (ethnic group) of the donor of a crime scene sample might be viewed as racial profiling [22]. However, from my point of view, narrowing down the pool of suspects to an ethnic group based on FDP is not unfounded and inappropriate, as there is research supporting the inference of biogeographical ancestry. FDP could even exclude a particular ethnic group as donor of the crime scene sample, alleviating the societal prejudices about that particular ethnic group. An excellent example of this is the Dutch case of the murder of Marianne Vaatstra [23]. Her murder received much attention in the press, particularly because it was suspected that one of the asylum seekers from the nearby centre was the perpetrator. The centre was mainly inhabited by immigrants from non-European countries. However, after analysis of the biogeographical ancestry of the perpetrator’s DNA, it was concluded that the perpetrator was most likely of Western European ancestry, taking away the suspicion from the asylum seekers [23].

All these legal and ethical issues should be carefully considered; the discussion should not only be held in the scientific community, but also in the legal community and in society in general. After all, for the successful implementation of FDP, not only a sound scientific basis of the technique is needed, but also a general acceptance of its use by all parties involved in criminal investigations.

3. Genotype-phenotype relations of adult body height

As explained in the previous chapter, prediction of an EVC is based on the genotype-phenotype relations of that trait. Large datasets are available for body height research because height is often recorded in biomedical genotype-phenotype relation studies [1]. Combining data from multiple studies gives more power to identify new variants with lower frequencies in the population as well as variants with smaller effect sizes. Height is determined by both genetic and environmental factors, yet it is estimated that more than 80% of the variation in height that is seen in the population can be explained by genetic variants [11,24,25]. This is called the heritability of height, and this heritability encompasses both common and rare variants. Moreover, it is estimated that most variants each have a small effect on height. Unfortunately, only a portion of the heritability of height has been identified so far. This chapter will provide an overview of the research, both GWAS and RVAS, into genotype-phenotype relations of body height, focussing on the heritability that is explained by the variants identified. The main regions where most variants are located, and thus the main biological pathways and processes that are likely to be involved in height, are also briefly mentioned.

3.1 Common variants

The first three large GWAS meta-analyses on height were published in 2008 [26–28], together identifying 54 SNPs associated with height [24]. Although the datasets used in the studies were genotyped with different SNP arrays (Affymetrix 500K, Illumina 317K, Illumina 500K), the arrays are comparable, and it was thus possible to combine these datasets to give more power to the analysis.

Lettre et al. (2008) performed a meta-analysis on data from six studies with >15,000 individuals of European ancestry [26]. The researchers identified 10 new SNPs associated with height, which together explain ~2% of height heritability (see Table 2). Weedon et al. (2008) analysed data from five studies, encompassing >13,000 individuals of European ancestry [27]. They were able to associate 20 SNPs with height, explaining ~2.9% of the heritability of height. Lastly, Gudbjartsson et al. (2008) analysed data from five other studies with >33,000 individuals of European and African American ancestry [28]. The researchers identified SNPs in 27 loci, which explained ~3.7% of height heritability.

From these studies, it is clear that many variants have an effect on height, unlike an EVC such as eye colour, where only 6 variants explain most heritability. Being able to explain 2-4% of the variation in height is not enough for accurate FDP predictions. Therefore, the researchers point out that larger GWAS analyses are needed to identify more common SNPs associated with height [26–28]. Furthermore, GWAS on study populations from other ethnic groups are needed: these three were performed on study populations of mostly European ancestry, so the findings cannot be generalised to other ethnic populations.

As time progressed, more GWAS datasets became available for height; the Genetic Investigation of ANthropometric Traits (GIANT) consortium is an ever-increasing, combined dataset for research into height, body mass index (BMI), and waist circumference29. As part of the GIANT consortium research, Lango Allen et al. (2010) performed an analysis on data from 46 studies, encompassing >183.000 individuals30. They identified 180 SNPs associated with height, which together explain ~10,5% of the variation in height. This is still not enough to make accurate predictions (see also chapter 4).

An even larger study on the GIANT consortium was published in 2014 by Wood et al.11. The researchers performed a meta-analysis on 79 GWAS studies, consisting of >253.000 individuals of European ancestry. This study identified 697 SNPs, some of which were previously found and others newly identified. Together, these variants account for ~20% of height heritability11. Furthermore, the researchers estimated that all common SNPs, including non-height-associated variants, explain ~62,5% of the variation in height. This estimate is interesting, as it does not reach the estimate that more than 80% of the variation in height in the population is explained by genetics. Thus, this indicates that rare variants also have an important role in explaining the heritability of height, together with common variants12,24,27.

It seems that the common variants associated with body height have been very thoroughly researched. Each GWAS with an increased study population was able to identify more height-associated variants than the previous study (Table 2). Efforts such as the GIANT consortium have contributed greatly to the understanding of genotype-phenotype relations of common variants. Despite all these successes of GWAS into height, there is still some ‘missing’ heritability when taking into account only common variants. In my opinion, the genotype-phenotype research of height should shift its focus away from GWAS towards RVAS to find the ‘missing’ heritability.

Table 2: Overview of the number of variants associated with body height and the percentage of heritability that these variants explain. It is estimated that more than 80% of the variation in height in the population is accounted for by genetic variants (heritability).

| Number of variants | Heritability explained | Study |
| --- | --- | --- |
| 10 common SNPs | ~2% | Lettre et al. (2008)26 |
| 20 common SNPs | ~2,9% | Weedon et al. (2008)27 |
| 27 common SNPs | ~3,7% | Gudbjartsson et al. (2008)28 |
| 54 common SNPs | ~5% | Yang et al. (2010)25 |
| 180 common SNPs | ~10,5% | Lango Allen et al. (2010)30 |
| 697 common SNPs | ~20% | Wood et al. (2014)11 |
| All common SNPs, including non-height-associated variants | ~62,5% | Wood et al. (2014)11 |
| 83 rare variants | ~1,7% | Marouli et al. (2017)12 |
| All common and rare height-associated variants currently known | ~27,4% | Marouli et al. (2017)12 |

3.2 Rare variants

Although several studies indicate that rare variants might play a role in the variation of height in the general population, there has been almost no research into rare variants associated with height. One RVAS investigated rare variants associated with height in an isolated population, namely in >6.000 individuals from Sardinia31. The isolation of the population caused an enrichment of rare variants (see also section 2.1), which led to the significant association of two rare variants with height.

A larger and more thorough study into rare variants associated with height was performed by Marouli et al. (2017), as part of the GIANT consortium12. They reported 83 rare and low-frequency coding variants that were associated with height. Additionally, the researchers found 85 novel common variants, not previously associated with height. The analysis was performed on a dataset of Illumina ExomeChip data from >458.000 individuals, mostly of European ancestry. The ExomeChip array was used, as this approach is able to identify coding (causal) variants. Many variants were found to be located near common SNPs previously associated with height, although some variants are at novel loci. The researchers also estimated the heritability explained by these rare variants, which was ~1,7% (Table 2)12. All height-associated variants, previously known and newly identified in this study, together are estimated to account for ~27,4% of height variation in the population. The researchers conclude that rare variants are very important in genotype-phenotype relations of body height and that deep-sequencing techniques might be deployed to find even more and rarer variants.

Compared to the abundance of GWAS into height, RVAS research is scarce. Much more effort is needed to identify the most important rare and low-frequency variants associated with height. Especially for such a polygenic trait as height, where many variants contribute to the overall phenotype, both common and rare genotype-phenotype relations should be understood thoroughly to allow accurate predictions. Datasets are still increasing, which aids in the identification of rarer variants. It must, however, be noted that the genotyping method for common and rare variants may differ, and datasets might thus not be comparable.

3.3 Biological pathways and processes

As the variants associated with height are usually located in or near specific genes, these variants give information on the genes, and thus on the biological pathways and processes involved in body height11,27,30. Variants are located in pathways specifically involved in skeletal growth, as well as in general biological pathways12. Most GWAS and RVAS on height both identify height-associated variants and investigate if these variants indicate new loci, genes and biological pathways.

Several height-associated variants affecting skeletal growth pathways are located in or near genes involved in growth factor signalling. Growth factors are essential in regulating the growth of specific cells and tissues, and thus of the whole organism. Several genes are involved in transforming growth factor signalling11,30, which is important in cell proliferation, adhesion and differentiation32,33, and some play a role in fibroblast growth factor signalling11,30, which is important in wound healing and the growth of blood vessels34. There are also several variants in genes that are involved in various other growth factor signalling pathways12,27,30,31. Many variants are located in or near genes affecting growth plate development, for example in genes important in the general development of growth plates11,28,33,35. Other genes specifically affect cartilage and bone formation in growth plates11,28 or the formation of extracellular matrix11,27,28,30. Height-associated variants are also often located in or near genes involved in the Hedgehog signalling pathway11,26,27, a pathway involved in the differentiation of embryonic stem cells and important in limb formation33,36. As the Hedgehog pathway is involved in many processes, altering its function has a wide variety of effects; its deregulation is, for example, involved in the formation of many types of cancer.

One general biological process that is often indicated in height GWAS is the regulation of chromatin structure, affecting chromosome segregation, histone proteins and chromatin remodelling proteins11,26–28. The study by Lettre et al. (2008) also found that several height-associated variants were located near genes that are targets of the let-7 microRNA26. let-7 microRNA regulates the timing of stem cell division and differentiation. Possibly, let-7 microRNA is also involved in the regulation of height-associated genes. Lastly, height-associated variants also indicate that the WNT/β-catenin signalling pathway is involved in height11. This pathway is a general intracellular signalling pathway, involved in cell migration, proliferation and differentiation37.

Figure 2: Overview of the main biological pathways involved in height. The pathways are grouped in general biological processes and skeletal development pathways. Next to each pathway, specific genes that have height-associated variants located within or nearby are listed.

In short, the main pathways in which height-associated variants are located are growth factor signalling, growth plate development, Hedgehog signalling, chromatin structure, let-7 microRNA targets and WNT/β-catenin signalling (Figure 2). This illustrates why it is so difficult to identify the genotype-phenotype relations of height: a wide variety of biological processes is involved in height, all having a small effect, not to mention the environmental factors affecting body height.

It seems a daunting task to identify all genotype-phenotype relations of height, although it might not be necessary to identify all relations before accurate FDP predictions can be made for height. Of course, more research is needed to fully understand the genotype-phenotype relations, especially because this might also reveal information related to growth disorders. The question we must ask ourselves, however, is not whether we understand the genetic architecture of height completely but whether it is necessary to understand it completely before the knowledge can be applied in height FDP. I think not; although GWAS and RVAS research on height must continue to expand our understanding, in my opinion, forensic research should focus on the development of a prediction model for FDP of height.


4. Prediction of adult body height

Although not all height heritability is explained, several studies have attempted to predict height from genotype data24,38,39. Despite some research, an FDP prediction model for height has not yet been developed. In this chapter, I will first discuss the research done so far into height prediction. Then, the HIrisPlex model for eye and hair colour prediction and the Identitas v1 Forensic Chip for FDP of multiple traits will be explained and used as a guide for recommendations on the best approach for developing a height prediction model. Several aspects that are important to consider when developing a height prediction model for forensic use will also be discussed.

4.1 Research done into height prediction

The first research performed into height prediction was by Aulchenko et al. (2009)38. The researchers compared genomic predictions based on 54 height-associated SNPs to predictions based on a method by Galton, dating from the Victorian era. The 54 SNPs used in these predictions were those identified in the first GWAS into height26–28. The genomic profile was computed by taking the sum of the number of height-increasing alleles38. Galton’s method uses the height of both parents, taking the average to compute the mid-parental predicted height. All height profiles were adjusted for age and sex.
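The two predictors described above can be sketched in a few lines of Python. This is a minimal illustration only: the SNP names are hypothetical placeholders, the allele counts and parental heights are invented, and the age and sex adjustment is assumed to have been applied beforehand rather than modelled here.

```python
from typing import Mapping


def allele_count_score(height_increasing_alleles: Mapping[str, int]) -> int:
    """Genomic profile: sum of height-increasing alleles (0, 1 or 2 per SNP)."""
    for snp, count in height_increasing_alleles.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1 or 2, got {count}")
    return sum(height_increasing_alleles.values())


def mid_parental_height(father_cm: float, mother_cm: float) -> float:
    """Average of both parental heights (assumed already adjusted for age and sex)."""
    return (father_cm + mother_cm) / 2


# Hypothetical example data; the SNP names are placeholders, not real identifiers.
genotypes = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
score = allele_count_score(genotypes)
predicted = mid_parental_height(180.0, 166.0)
```

The unweighted sum treats every height-increasing allele as equally important; a weighted sum using per-SNP effect sizes would be a natural variation, but that goes beyond what is described here.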

The researchers attempted to predict very tall individuals, namely those who fell in the upper 5% of the population’s height distribution38. For this, a receiver-operating characteristic (ROC) curve was drawn and the area under the ROC curve (AUC) was calculated to assess the accuracy of the prediction. An AUC value of 0,5 means that the prediction is no better than a guess, while an AUC of 1,0 represents a completely accurate prediction. The genomic prediction of the 5% tallest individuals yielded an AUC of 0,65 (Table 3). Surprisingly, the predictions based on Galton’s method gave much higher accuracies, namely an AUC of 0,84.
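To make the AUC more concrete: it equals the probability that a randomly chosen case (here, a very tall individual) receives a higher prediction score than a randomly chosen control, with ties counted as half. The sketch below computes it directly from that definition; the function name and the toy data are illustrative assumptions and are not taken from the cited studies.

```python
from typing import Sequence


def auc(scores: Sequence[float], is_case: Sequence[bool]) -> float:
    """AUC as the share of case/control pairs in which the case scores higher."""
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# Toy data: scores that separate the groups perfectly, and scores that carry no information.
perfect = auc([1, 2, 3, 4], [False, False, True, True])
uninformative = auc([1, 1, 1, 1], [False, True, False, True])
```

The pairwise loop is quadratic in the number of individuals, which is fine for illustration; for large datasets a rank-based calculation gives the same value more efficiently.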

Moreover, it was estimated how much of the height variation in the study population was explained by the genomic profile (3,8%) as well as how much variation should be explained to reach certain AUC values for the predictions38. To predict the tallest and shortest 5% with an AUC of 0,8, approximately 25% of height variation should be explained by the genomic profile. In order to reach an even higher accuracy, an AUC of 0,95, it was estimated that a genomic profile explaining 68% of height variation is needed. Although not enough is known about genotype-phenotype relations of height to explain 68% of the variation in height (the heritability), it was estimated by Marouli et al. (2017) that all common and rare variants known at the moment would explain approximately 27,4% of height heritability12, as explained before in chapter 3. According to the estimate of Aulchenko et al.38, this should be enough for predictions reaching an AUC of at least 0,8.
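The relation between the proportion of variance explained and the attainable AUC can be explored with a small simulation. The sketch below is not the calculation used by Aulchenko et al.; it assumes that height and the genomic score are both normally distributed and standardised, that the score explains a fixed share of the variance, and that "tall" means exceeding the population cut-off for the upper 5%. The function name, sample size and seed are arbitrary choices.

```python
import random
from statistics import NormalDist


def simulated_auc(variance_explained: float, top_fraction: float = 0.05,
                  n: int = 40_000, seed: int = 1) -> float:
    """Simulate the AUC for detecting the tallest fraction from a genomic score."""
    if not 0.0 < variance_explained < 1.0:
        raise ValueError("variance_explained must lie strictly between 0 and 1")
    rng = random.Random(seed)
    sd_score = variance_explained ** 0.5          # part of height captured by the score
    sd_rest = (1.0 - variance_explained) ** 0.5   # everything the score does not capture
    scores = [rng.gauss(0.0, sd_score) for _ in range(n)]
    heights = [s + rng.gauss(0.0, sd_rest) for s in scores]  # total variance is 1
    cutoff = NormalDist().inv_cdf(1.0 - top_fraction)
    is_tall = [h > cutoff for h in heights]
    n_tall = sum(is_tall)
    n_rest = n - n_tall
    # Rank-based AUC: rank all individuals by score and sum the ranks of the tall group.
    order = sorted(range(n), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if is_tall[i])
    return (rank_sum - n_tall * (n_tall + 1) / 2) / (n_tall * n_rest)


auc_25 = simulated_auc(0.25)  # score explaining 25% of the variance
auc_68 = simulated_auc(0.68)  # score explaining 68% of the variance
```

The simulated values can be compared with the AUC figures quoted above; any differences would reflect the simplifying assumptions made here, such as the absence of sex and age effects.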

Thus, the research of Aulchenko et al. showed that Galton’s method dating from the Victorian era was much more accurate than the genomic method based on 54 loci38. However, the researchers also note that for Galton’s method information on the height of both parents is needed, which is not always available, especially not in forensic cases. In contrast, the genomic method only needs a DNA sample from the individual him/herself. Thus, if more than 54 variants are used for genomic height predictions, the accuracy might surpass that of Galton’s method and the genomic method might be deployed to make reliable predictions.

Liu et al. (2014) performed a second study on height prediction, using more height-associated variants24. The reason for this study was that the research by Aulchenko et al.38 showed that height predictions could be made, although not very accurately. As many more height-associated variants were identified in the GIANT study (Lango Allen et al. (2010)30; see chapter 3), the researchers decided to attempt height predictions using an updated set of height-associated variants24. The question was, of course, whether this updated set of loci would indeed lead to more accurate predictions.

The researchers attempted to predict the tallest 3% of the population, based on four different sets of genomic loci24. These sets were the 54 loci from Aulchenko et al., 180 random loci, the 180 loci from the GIANT study, and 182 loci (the 180 GIANT loci plus 2 loci newly identified by the researchers). The set of 54 height-associated variants predicted the tallest 3% with an AUC of 0,67 (Table 3). This is slightly higher than the result of Aulchenko et al., but this difference is easily explainable; Aulchenko et al. predicted the 5% tallest individuals while this study predicted the tallest 3%. By using a set of 180 random variants to predict height, the researchers showed that this set was not much better than basing the prediction on a coin toss (AUC of 0,52). Using the set of 180 loci from the GIANT study, on the other hand, gave an AUC of 0,75. This clearly shows that predicting height based on height-associated variants works better than predicting it based on random genomic loci. Lastly, the researchers24 found that adding 2 newly identified height-associated loci to the 180 loci from the GIANT study significantly increased the AUC from 0,75 to 0,76. This might seem a small difference, but it shows that adding even a small number of loci increases the accuracy of predictions.

All in all, this research shows that increasing the number of height-associated variants used for the prediction leads to more accurate predictions24. Although the accuracy of Galton’s method is not yet reached with 182 loci, the accuracy is much improved compared to the prediction based on 54 loci. This shows that if even more variants are included in the prediction set, even higher accuracies might be reached.

Table 3: Accuracy of height predictions based on different methods and varying numbers of genomic loci. The accuracy is indicated by the area under the receiver-operating characteristic (ROC) curve (AUC), where 0,5 is a random prediction and 1,0 is a completely accurate prediction.

| Prediction based on | Predicting | Accuracy (AUC) | Study |
| --- | --- | --- | --- |
| Galton’s method | Tallest 5% | 0,84 | Aulchenko et al. (2009)38 |
| 54 height-associated variants | Tallest 5% | 0,65 | Aulchenko et al. (2009)38 |
| 54 height-associated variants | Tallest 3% | 0,67 | Liu et al. (2014)24 |
| 180 random variants | Tallest 3% | 0,52 | Liu et al. (2014)24 |
| 180 height-associated variants | Tallest 3% | 0,75 | Liu et al. (2014)24 |
| 182 height-associated variants | Tallest 3% | 0,76 | Liu et al. (2014)24 |

Very recently, Lello et al. (2017) attempted genomic predictions of height using methods from machine learning39. The researchers used a very different approach from the two studies discussed above: exact height was predicted instead of a binary tall or not-tall classification, and many thousands of common SNPs were taken into account for the prediction instead of only height-associated variants. Furthermore, the statistical analysis of the data is much more elaborate due to the different approach to the prediction.

Approximately 20.000 SNPs were used for the height prediction39, which led to a correlation between actual height and predicted height of ~0,61. However, this correlation decreased to ~0,54 when multiple datasets were used instead of a single database. The correlation differs from the AUC value used in the aforementioned studies and, thus, the two cannot be compared directly. Moreover, the researchers claim that “the actual heights of most individuals are within about 3 cm of the predicted height”39. This seems a good prediction, but it is unclear from the article what the exact standard deviation is and how reliable these results are. Therefore, it seems to me that this could be a promising method for height prediction but that more research is needed to correctly assess its parameters and to make the prediction method fully transparent.
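One way to relate a correlation to a prediction error in centimetres is sketched below. It rests on assumptions that are not taken from Lello et al.: a linear predictor with normally distributed errors, for which the residual standard deviation equals the trait standard deviation multiplied by the square root of (1 − r²). The trait standard deviation is an input that the article does not report here, so the value used in the example is a purely illustrative placeholder.

```python
from statistics import NormalDist


def residual_sd(correlation: float, trait_sd_cm: float) -> float:
    """Standard deviation of prediction errors for a linear predictor."""
    if not -1.0 < correlation < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return trait_sd_cm * (1.0 - correlation ** 2) ** 0.5


def share_within(tolerance_cm: float, correlation: float, trait_sd_cm: float) -> float:
    """Expected share of individuals whose error is within +/- tolerance_cm."""
    error_distribution = NormalDist(0.0, residual_sd(correlation, trait_sd_cm))
    return 2.0 * error_distribution.cdf(tolerance_cm) - 1.0


# ASSUMPTION: placeholder standard deviation of (sex-adjusted) height, in cm.
ASSUMED_TRAIT_SD_CM = 6.5

relative_error = residual_sd(0.61, 1.0)  # error SD as a fraction of the trait SD
share = share_within(3.0, 0.61, ASSUMED_TRAIT_SD_CM)
```

Under these assumptions a correlation of 0,61 leaves an error standard deviation of roughly 79% of the trait standard deviation. The share of individuals within 3 cm depends strongly on the assumed trait standard deviation, which is exactly the kind of parameter that should be reported transparently.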

As can be seen from the studies discussed above, research into height prediction has already started. However, much must still be investigated before FDP of height may be used in forensic practice. In my opinion, it would be interesting to continue the research from Aulchenko et al. and Liu et al. with an even larger set of height-associated variants. All currently known common and rare variants should be taken into account in order to reach accuracies as high as possible.


4.2 Other FDP models as examples

Although there is no forensic prediction model for body height, there are FDP models for eye colour, hair colour, biological sex and biogeographical ancestry. These FDP models, the HIrisPlex model and the Identitas v1 Forensic Chip, could function as guides for the development of a height prediction model. Therefore, both models will briefly be discussed below (see also Table 4 for a comparative overview), as well as the lessons that can be learned from them.

The IrisPlex model was developed for the prediction of eye colour. It was later expanded with several SNPs to include the prediction of hair colour, and the combined model was renamed HIrisPlex1,4,21,40,41. In addition, there is an online web tool available for the HIrisPlex model21,42. The model is based on 24 loci typed with SNaPshot technology, of which 6 are used for eye colour prediction and 22 for hair colour prediction (4 SNPs are used for both predictions). The prediction of eye colour is categorised into blue, brown and intermediate eye colour1. Blue eye colour could be predicted with an accuracy of 94%, brown eyes with an accuracy of 95% and intermediate eye colour with an accuracy of 74%. The hair colour prediction is categorised into blond (accuracy 69,5%), brown (accuracy 78,5%), black (accuracy 87,5%) and red hair (accuracy 80%)4. The shade of hair colour, light or dark, is also predicted.

The HIrisPlex computational model is a Microsoft Excel macro that calculates the probabilities for each phenotype, using the number of minor alleles for each SNP as data input. The prediction is based on a large European dataset. The model includes guidelines for threshold probabilities and formulating a statement on the results4. Following these guidelines, results on the most likely phenotype are reported in a concise and clear manner (Figure 3).
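The general idea of turning minor allele counts into phenotype probabilities and a thresholded statement can be sketched as follows. The sketch assumes a multinomial logistic form with one reference category; the text above only states the inputs and outputs of the model, so this form is an assumption made for illustration. All SNP names, coefficients and the threshold are invented placeholders and are not the HIrisPlex parameters.

```python
import math
from typing import Dict, Mapping

# Invented placeholder coefficients; "brown" acts as the reference category.
COEFFICIENTS: Dict[str, Dict[str, float]] = {
    "blue": {"intercept": 1.0, "snp_a": -2.0, "snp_b": 0.5},
    "intermediate": {"intercept": -0.5, "snp_a": -0.5, "snp_b": 0.2},
    "brown": {"intercept": 0.0, "snp_a": 0.0, "snp_b": 0.0},
}


def phenotype_probabilities(minor_allele_counts: Mapping[str, int]) -> Dict[str, float]:
    """Softmax over per-category linear predictors of the minor allele counts."""
    linear = {
        category: coef["intercept"]
        + sum(coef[snp] * count for snp, count in minor_allele_counts.items())
        for category, coef in COEFFICIENTS.items()
    }
    largest = max(linear.values())  # subtract the maximum for numerical stability
    weights = {category: math.exp(value - largest) for category, value in linear.items()}
    total = sum(weights.values())
    return {category: weight / total for category, weight in weights.items()}


def report(probabilities: Mapping[str, float], threshold: float) -> str:
    """Name the most likely phenotype only if it reaches the threshold probability."""
    best = max(probabilities, key=lambda category: probabilities[category])
    return best if probabilities[best] >= threshold else "inconclusive"


ASSUMED_THRESHOLD = 0.7  # placeholder value, not a validated guideline
probs_clear = phenotype_probabilities({"snp_a": 2, "snp_b": 0})
probs_unclear = phenotype_probabilities({"snp_a": 0, "snp_b": 0})
statement_clear = report(probs_clear, ASSUMED_THRESHOLD)
statement_unclear = report(probs_unclear, ASSUMED_THRESHOLD)
```

Returning "inconclusive" below the threshold mirrors the idea of threshold probabilities in reporting guidelines: a phenotype is only stated when the model is sufficiently confident.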

Figure 3: Prediction results from the HIrisPlex model. The figure is adjusted from figure 8 of Walsh et al. (2013)4. The coloured boxes in the prediction result indicate the highest probabilities; the hair and eye colour phenotypes are photographs of the actual phenotypes of the participants; the report statements are based on the guidelines proposed for the HIrisPlex model.

The Identitas v1 Forensic Chip was developed as the first all-in-one prediction model for FDP, predicting biogeographical ancestry, biological sex, eye colour, hair colour and relatedness in one analysis5. A genome-wide SNP array is used, including both nuclear and mitochondrial loci, together analysing 201.173 SNPs. The prediction of biogeographical ancestry is categorised into 5 groups, namely European ancestry (accuracy 93%), African ancestry (accuracy 88%), East Asian ancestry (accuracy 94%), South Asian ancestry (accuracy 96%) and South American ancestry (accuracy 98%)5 (see also Table 4). Thus, the predictions of biogeographical ancestry are very accurate with the Identitas chip. The predictions of eye and hair colour are based on the HIrisPlex, using its SNP set and computational model. However, several SNPs included in the HIrisPlex SNP set could not be included in the Identitas chip and were thus not used in the prediction. Consequently, the accuracies for the prediction of eye and hair colour were lower with the Identitas chip compared to the HIrisPlex model (see Table 4 for Identitas chip accuracies). Unfortunately, the Identitas chip does not have a clearly defined computational model to calculate probabilities for each phenotype, as the HIrisPlex model does. The accuracies are based on the dataset used in the study (3.196 participants)5.

All in all, several things can be learned from the HIrisPlex model and the Identitas v1 Forensic Chip. Firstly, if more SNPs are genotyped, a larger amount of DNA is needed as input (Table 4). Only 63 pg DNA is needed for genotyping 24 SNPs in the HIrisPlex model, while the Identitas chip needs a minimum amount of 1,75 ng DNA to type more than 201.000 SNPs. It is important to find the right balance between genotyping enough SNPs to obtain accurate predictions and genotyping as few SNPs as possible to keep the amount of DNA needed low. Moreover, it can be seen that some phenotype prediction accuracies of HIrisPlex and the Identitas chip are similar to the height prediction accuracies reached by Liu et al. (2014)24. This indicates that we might know enough about the genotype-phenotype relations of height to make accurate and reliable predictions for forensic application. As described above, HIrisPlex and the Identitas chip have different approaches to the genotyping method, namely the SNaPshot technology for genotyping tens of SNPs and the genome-wide SNP array for genotyping hundreds of thousands of SNPs4,5. As it has already been established that many loci influence height, it seems to me that a genotyping approach similar to that of the Identitas chip would be more suitable for a SNP array for height. Lastly, the computational model of HIrisPlex could be very useful for a height prediction model, as well as the HIrisPlex guidelines for reporting the results. The HIrisPlex provides a concise and clear manner of reporting the results, using probabilities, which can be easily understood by the investigating authorities.

Table 4: Comparison of the HIrisPlex model and the Identitas v1 Forensic Chip.

| Characteristics | HIrisPlex | Identitas v1 Forensic Chip |
| --- | --- | --- |
| EVCs | Eye colour; Hair colour | Biogeographical ancestry; Biological sex; Eye colour; Hair colour; Relatedness (3rd degree) |
| Number of genotyped loci | 24 | 201.173 |
| Amount of DNA needed | 63 pg | 1,75 ng (1.750 pg) |
| Accuracy of eye colour predictions | Blue 94%; Brown 95%; Intermediate 74% | Blue 70%; Brown 85% |
| Accuracy of hair colour predictions | Blond 69,5%; Brown 78,5%; Black 87,5%; Red 80% | Blond 63%; Brown 72%; Black 58%; Red 48% |
| Accuracy of biogeographical ancestry predictions | - | European 93%; African 88%; East Asian 94%; South Asian 96%; South American 98% |

There are several aspects that are important to consider when developing an FDP model for height. As mentioned before, a high accuracy is crucial to rely on predictions in police investigations. Even so, the question is: when is the accuracy high enough? The accuracy of height prediction using all currently known height-associated variants should be assessed (see also section 4.1), and this should be compared to the accuracies of the prediction of other EVCs to determine the minimum accuracy, and thus the minimum number of SNPs and minimum amount of DNA, required for implementation into practice. In my opinion, this should be one of the next steps in the research into FDP of height. Another aspect that needs consideration is what type of prediction has the most value for police investigations. Most studies attempted to predict whether an individual fell within the top x% of the population’s height distribution. Likewise, the shortest x% could also be predicted. This method has some advantages over predicting exact body heights (with confidence intervals). It might be possible to reach higher accuracies with 2 or 3 categories compared to a continuous scale prediction, and it is easier for eyewitnesses to observe if an individual is very short, of normal height or very tall instead of observing an exact height. Furthermore, this approach is applicable to both male and female individuals: the method is the same, and only the height distribution and the cut-off values differ between men and women.
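The categorical approach with sex-specific cut-off values can be illustrated with a short sketch. The means and standard deviations below are invented placeholders rather than population data from the cited studies, heights are assumed to be normally distributed within each sex, and the category labels are arbitrary.

```python
from statistics import NormalDist

# ASSUMPTION: illustrative height distributions (mean, SD in cm), not real reference data.
HEIGHT_DISTRIBUTIONS = {
    "male": NormalDist(180.0, 7.0),
    "female": NormalDist(167.0, 6.5),
}


def height_category(height_cm: float, sex: str, tail_fraction: float = 0.05) -> str:
    """Classify a height as short, medium or tall using sex-specific cut-offs."""
    distribution = HEIGHT_DISTRIBUTIONS[sex]
    lower_cutoff = distribution.inv_cdf(tail_fraction)        # shortest x%
    upper_cutoff = distribution.inv_cdf(1.0 - tail_fraction)  # tallest x%
    if height_cm < lower_cutoff:
        return "short"
    if height_cm > upper_cutoff:
        return "tall"
    return "medium"


# The same height can fall into different categories for men and women.
category_man = height_category(180.0, "male")
category_woman = height_category(180.0, "female")
```

Only the lookup table changes between the sexes, which reflects the point made above: the method stays the same while the distribution and cut-off values differ.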

In summary, only a few studies into height prediction have been performed, resulting in increasing accuracies when taking into account more variants. Although an FDP prediction model for body height has not yet been developed, from my point of view, there is a sound foundation to start from. Many height-associated variants are known, initial studies are promising and there are other FDP prediction models that can serve as guides.


5. Conclusions and recommendations

As explained in chapter 1, forensic DNA phenotyping (FDP) can provide useful information to the police investigation when the investigation comes to a standstill. Height, in particular, can be of interest as it is easily observable, it is quite stable, and its prediction is not surrounded by racial or ethical issues. Furthermore, height prediction can also be applied in paediatric endocrinology, where it can be of importance in determining a treatment plan24. Thus, it is of great interest to research the genotype-phenotype relations and the prediction of adult body height.

5.1 Conclusions

This literature study aimed to provide an overview of the research into height prediction, and into the aspects necessary for the development of an accurate FDP prediction model for height. I evaluated the three main approaches to identify genotype-phenotype relations of height, namely genome-wide association studies (GWAS), rare variants association studies (RVAS) and phenome-wide association studies (PheWAS; chapter 2). These approaches each have their strengths and limitations. It is my view that GWAS should be applied first, that RVAS can subsequently be used to gather additional information, and that PheWAS could be deployed as a last option if the other two approaches do not provide enough information. Additionally, I identified many aspects that must be considered regarding the development of a prediction model, legislation, and ethics (chapter 2); examples are the genotyping method, the computational model, legislation on the use of FDP, reliability of the evidence, privacy, and racial issues. Many height-associated variants have been identified, almost 700 common and over 80 rare variants (chapter 3). It is estimated that over 80% of the variation in height in the population is caused by genetic variants (heritability); however, the height-associated variants identified up to now account for only 27,4% of the variation. Thus, more effort is needed to identify more genetic variants influencing body height. I also found that several studies have been performed into the prediction of height, all with promising results (chapter 4). Unfortunately, a height prediction model has not yet been developed. Lastly, there are FDP prediction models for other traits, namely the HIrisPlex model for eye and hair colour prediction and the Identitas v1 Forensic Chip for the prediction of biogeographical ancestry, biological sex, eye and hair colour, and relatedness (chapter 4). Valuable lessons can be learned from these models and, in my opinion, they could and should serve as guides for a height prediction model.

With these main findings in mind, the research question of this study, mentioned in chapter 1, must be revisited: “How far from implementation into practice is the research into forensic DNA phenotyping of adult body height?”. Quite some research has been performed in this field and a promising start has been made. However, much more research is needed, as a height prediction model has not even been developed. Thus, FDP of height is not yet available for forensic practice. In my opinion, it will take many more years of research before FDP of adult body height can be implemented into practice. When research into this field has developed further, possibly in a few years’ time, the research question should be investigated again, and hopefully, a more accurate answer can be given then.

5.2 Recommendations

As part of this study, I have formulated several specific recommendations on how research into height prediction should proceed towards implementation into forensic practice. In short, more (rare) height-associated variants should be identified, height prediction studies that include more variants should be performed, a height prediction model or a multi-trait model should be developed and validated for forensic use, and the support for implementation in practice should be assessed.

First of all, more research is needed into the genotype-phenotype relations of height, as literature shows that much of the variation in height seen in the population is still unexplained. As previously mentioned in chapter 3, a shift from genome-wide association studies (GWAS) towards rare variants association studies (RVAS) is needed to identify more height-associated variants. Including these newly identified variants will then hopefully lead to more accurate predictions of height. If these two research methods are not able to find enough height-associated variants, explaining much or most of the height variation, phenome-wide association studies (PheWAS) could be performed to gain more insight into the phenotypes. It is my view that this will not be necessary for height, but that PheWAS might prove useful for more complex externally visible characteristics (EVCs), such as facial features.

A second recommendation, related to the first, is that biomedical GWAS and RVAS studies should record information on forensically relevant EVCs. Very large data sets are needed to identify variants with low frequency and/or small effects. If information on EVCs is also recorded, the data from these studies can be reused for FDP-related studies. This would save costs, as the FDP-related studies would not have to collect their own data but would be able to use data from previous studies. Although data on height is already often recorded in biomedical research1, information on most EVCs (such as hair colour and structure or facial features) is not included. Thus, new biomedical research into genotype-phenotype relations should record information on EVCs to allow for a more efficient use of data.

Furthermore, a prediction study with all (common and rare) height-associated variants known up to now should be performed. As shown in chapter 4, several height prediction studies have been performed, although the last study that used only height-associated variants dates from 2014. Since then, many more variants have been identified. Therefore, it would be very interesting to investigate what accuracies can be reached by including the recently found variants in a new prediction study. In addition, this study should not only attempt to predict the tallest x%, as previous studies have done, but also the shortest x%. This would create 3 groups (shortest x%, medium height, tallest x%), which would be useful in a height prediction model.

Be that as it may, identifying more height-associated variants or performing height prediction studies should, in my opinion, not be the main focus of future research. The focus should be on the development and subsequent validation of a multi-trait FDP prediction model for the simultaneous prediction of adult body height, eye and hair colour, biogeographical ancestry and biological sex. Moreover, the components of the model, the genotyping method and the computational model, should be designed in such a way that it is relatively easy to add more EVCs and thereby expand the model. As forensic samples are often small and contain little DNA, it is undesirable to waste sample material on tests that give limited information. Developing prediction models for each EVC separately would do exactly that. Thus, to prevent wasting precious sample material, a multi-trait FDP model would be preferable.

As mentioned in chapter 4, the HIrisPlex model and Identitas v1 Forensic Chip should serve as guides for the development of this new prediction model. The genotyping method of the Identitas v1 Forensic Chip could aid the design of a customised array for the new model. This approach allows for the genotyping of many genome-wide variants at once. However, the method should be critically reviewed and adjusted, to make sure that as little DNA as possible is needed for accurate results. The HIrisPlex model could be of use in the development of a computational model. The guidelines for interpreting and reporting the results of the predictions could also be based on the HIrisPlex model as these are clear and provide concise and understandable results; this should also be the aim of the new model. In my opinion, this would create an FDP prediction model that is best suited for a forensic application.

Lastly, the support for the use of FDP among police officers, forensic scientists, legal practitioners and society in general should be investigated before FDP can be successfully implemented. It is important to assess whether the police force wants and would use FDP during an investigation, as it would be a waste of time and resources to pursue a method that will not be used in practice. Furthermore, a society-wide debate on legislation and ethics surrounding FDP should be held (see also chapter 2) to clarify how and when FDP could and should be deployed.

In short, much is already known about height-associated variants, prediction of height from DNA, and the development of an FDP prediction model. However, a height prediction model has not yet been developed. FDP of height has the potential to aid police investigations, but much research and development is still needed before FDP can be implemented into forensic practice.
