Separating archaeological and forensic bones by using fluorescence spectroscopy.

Academic year: 2021


Separating archaeological and forensic bones by using fluorescence spectroscopy

Daphne Muijderman (12376388)

Research project MSc Forensic Science (36 EC)

12th of August

Supervisor: Prof. dr. Maurice C. G. Aalders

Examiner: Prof. dr. Roelof-Jan Oostra

Biomedical Engineering and Physics (AMC)

Abstract

When bones are found, it is important to know whether they are forensically relevant or archaeological material. This distinction is based on the age of the bones: bones that date from before 1920 are treated as archaeological material, and bones that date from after 1920 are considered forensic. The Netherlands has had a missing persons database since 1920, so the donor of the bones may be traced when the bones date from after that year. The current method for dating bones is radiocarbon dating. This method is time-consuming, which creates the need for a faster method that lets investigators know immediately how to treat bones when they are found. In this research, the feasibility of a method based on fluorescence spectroscopy is investigated. The compounds collagen, hydroxyapatite (HAP) and flavin adenine dinucleotide (FAD) are present in bone tissue and are expected to decline after death due to the decomposition of bone. The differences in amounts can be measured with fluorescence spectroscopy. Bones from five different time periods, ranging from 1500 to 2003, are used in this research. Several models were designed to determine the age of bones: ratios of collagen, HAP and FAD, k-means clustering analysis, principal component analysis (PCA), linear discriminant analysis (LDA) and support vector machine (SVM) models. The HAP-FAD ratio showed a clear decrease over time, but due to the large variation within the time periods this model could not be used to determine the age of bones. The other models were used to distinguish between archaeological and forensic bones, and the SVM models proved to be the most promising for application in practice. One of the models resulted in a sensitivity of 86.76% to 100% and a specificity of 73.26% to 87.65%. However, many improvements are needed before this model could be used by the police and crime scene investigators and before it could replace radiocarbon dating. Overall, SVM has the potential to become an additional method to distinguish between archaeological and forensic bones, reducing the number of bone samples that need to be investigated with radiocarbon dating.

Keywords: Fluorescence spectroscopy, dating bones, forensics

Introduction

When bones are found, it is important for the police and crime scene investigators to know whether they need to treat the bones as forensically relevant or as archaeological material. This distinction is based on the age of the bones, which gives an indication of the time since death of the donor, also called the post-mortem interval (PMI). Bones that date from after 1920 are considered to be forensically relevant (2): the missing persons database stores the DNA of missing persons and/or their relatives, which enables a DNA comparison and makes it possible to trace the donor of the bones (1). For bones that date from before 1920, a DNA comparison with the database is no longer possible and the bones are regarded as archaeological material (2). Methods to estimate the age of bones are therefore important tools in the forensic field.

A common method for dating organic samples, such as wood, charcoal, peat, bones and other substances, is radiocarbon dating (3,4). This technique is based on the decay of radiocarbon in deceased organisms or substances (5). Plants take up radiocarbon through photosynthesis, and animals consume these plants. As a result, the amount of radiocarbon in living organisms is comparable to the level in the atmosphere. After death, radiocarbon is no longer taken up and its decay can be observed. The half-life of radiocarbon is relatively long, and small changes can be measured over time. These changes are used in the dating of bones (5,6). However, radiocarbon dating is a time-consuming method (7), and it is useful for the police and crime scene investigators to know instantly whether they need to treat the bones as archaeological or forensic. Fluorescence spectroscopy could make it possible to determine the age of bones at the crime scene in an easy manner. This approach, which measures changes in the bone matrix over time, is investigated in this research.

The compounds collagen, hydroxyapatite (HAP) and flavin adenine dinucleotide (FAD) are present in bone tissue and have fluorescent properties. The major organic component of bone is the protein collagen (8–10). Collagen fibres are strong and are responsible for the flexibility of the bone (11). Collagen is also mainly responsible for the fluorescent properties of bone, an effect largely caused by the amino acids tyrosine and tryptophan present in collagen (12). After death, collagen starts to decompose due to diagenesis until it is completely eliminated (13,14). This decrease is also visible as a decrease in the fluorescence intensity of collagen in bone tissue (12). FAD is an enzyme and consists, like collagen, of amino acids (15). It is therefore hypothesised that the amount of FAD follows a similar trend and declines after death as the bone decomposes. The major inorganic component of bone is the mineral HAP (Ca₅(PO₄)₃(OH)), which is present in its crystalline form (8,9,11). HAP is lost through the following post-mortem process: calcium ions, present in the HAP crystals, dissolve into the soil and are replaced by hydrogen ions. The exchange stops once an equilibrium of these hydrogen ions between the bone and the soil water is reached (13,14). In acidic environments, which are rich in hydrogen ions, this reaction occurs faster, since more ion exchanges need to occur before an equilibrium is reached (13). In addition, an equilibrium of calcium and phosphate ions between the soil water and the bone HAP is established. Both processes lead to a decline in HAP (13). The degradation of collagen and that of HAP are competitive, as both processes, i.e. the hydrolysis of collagen and the dissolution of HAP, require the presence of water (9). Each compound physically prevents water from reaching the other (16). Environmental conditions, such as the temperature and pH of the soil, influence how fast the compounds degrade and which compound decomposes first (16).

Fluorescence spectroscopy could be used to create a method to determine the age of bones. When excited with light of a given wavelength, chemical compounds differ in the intensity of the light they emit at each wavelength (17). Since the composition of bone tissue changes over time, differences in fluorescence intensity are expected. The aim of this research is to create a method based on fluorescence spectroscopy that can be used to determine the age of bones. It is hypothesised that the ratios between collagen, HAP and FAD depend on the time since death, and it is expected that these ratios can be used to create a model for estimating the age of bones.

In this research, two datasets are used: the first dates from research performed in 2018 (18), and the second was created specifically for this research. These datasets will from now on be referred to as dataset-2018 and dataset-2020. Both datasets contain bones dating from five different time periods: 1250-1500, 1650-1860, 1870-1930, 1970-1990 and 2003-2005. For both datasets, the fluorescence intensities of collagen, HAP and FAD were measured, leading to emission spectra for which the area under the curve (AUC) was calculated. The AUC is a representation of the fluorescence intensity, which is related to the amount of the compound present in the bone. This AUC was then used to test several approaches to creating a model for determining the PMI of recovered bones. The techniques tested are calculated ratios, k-means clustering analysis, principal component analysis (PCA), linear discriminant analysis (LDA) and support vector machine (SVM) models. The performance of these models is expressed as the sensitivity and specificity.

Materials and methods

Bone samples

The bones used to create dataset-2018 and dataset-2020 originate from the same bone collections and time periods. The bones dating from the first time period, 1250-1500, belong to a collection from the Netherlands Forensic Institute (NFI). The bones were recovered from clay soil at a graveyard located in Delft, the Netherlands. The bones dating from 1650-1860 belong to a collection of the city of Leiden, the Netherlands, originate from Central America, and were buried in sand. Recovered bones from 1870-1930 are part of a collection of the AMC and were recovered from a graveyard in Bloemendaal, the Netherlands, of which the soil is dune sand. The collections dating from 1970-1990 and 2003-2005 originate from graveyards containing coarse sand of which the first one is located in Stein and the second one in Breda, both located in the Netherlands. These two collections are owned by the NFI.

Datasets

Two datasets, dataset-2018 (number of bones = 12, number of measurements = 96) and dataset-2020 (number of bones = 27, number of measurements = 202), were used in this experiment. To create the first model, in which ratios between AUCs were calculated, the datasets were combined. All five time periods were taken into account in this model, to provide information on how the ratios change over time. For all the other models, the datasets were combined and then randomly divided into a training set and a test set. This division was carried out three times, to avoid results based on chance. The results obtained from the three divisions differed slightly and are therefore reported as a range. All figures shown were created using the first division. In these models, the first three time periods and the last two time periods were combined, leading to two time periods ranging from 1250 to 1930 and from 1970 to 2005. This was done to create a distinction around the boundary of 1920 between archaeological and forensic bones.
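The repeated division described above can be sketched as follows. This is a minimal Python illustration, not the MATLAB code used in this research. The function names, the seeds and the 50/50 split are assumptions, since the proportion of training and test data is not stated here, and the three sensitivities are made-up values.

```python
import random

def split_train_test(items, test_fraction=0.5, seed=0):
    """Shuffle a copy of `items` and return (training set, test set)."""
    rng = random.Random(seed)          # seeded so that a split can be reproduced
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)  # assumed fraction, see lead-in
    return shuffled[n_test:], shuffled[:n_test]

def as_range(values):
    """Summarise a metric over repeated splits as (lowest, highest)."""
    return min(values), max(values)

# 96 + 202 = 298 measurements, represented here only by their index.
measurement_ids = list(range(96 + 202))

# Three independent random divisions, as in the text.
splits = [split_train_test(measurement_ids, seed=s) for s in (1, 2, 3)]

# Hypothetical sensitivities of one model on the three test sets,
# reported as a range.
print(as_range([0.81, 0.76, 0.93]))
```

Each model is then fitted on the training part of a split and evaluated on its test part, and the lowest and highest value over the three splits form the reported range.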

Fluorescence measurements

The bone samples used in this research were created by cutting cross-sections of the diaphysis of human femurs. A handsaw was used to make the cross-sections, after which the samples were sanded. A Perkin Elmer LS-55 fluorescence spectrometer with an external fibre optic sensor was used to measure the fluorescence intensity of the bone samples. The excitation wavelengths were 330 nm, 375 nm and 445 nm for collagen, HAP and FAD respectively. The emission wavelengths were 350 – 600 nm for collagen, 400 – 600 nm for HAP and 470 – 600 nm for FAD. The distance between the optic fibre and the bone sample was between 2 and 5 mm. Each sample was measured on 4 spots on the front side and 4 spots on the back side, resulting in a total of 8 measurements (Figure 1). In dataset-2018, two samples were measured for each time period, except for time period 3 (1870-1930), for which 4 samples were measured. During the measurements of this dataset, the excitation and emission slits of the fluorescence spectrometer were set to 10.0 nm and the scanning speed was 500 nm/min. For the new dataset, an excitation slit of 10.0 nm, an emission slit of 15.0 nm and a scanning speed of 300 nm/min were applied. For each time period 5 samples were measured at the same 8 locations as in the dataset from 2018, except for the time period 1970-1990, for which 7 samples were measured. For the time period 1650-1860, five complete cross-sections were not available, so three incomplete samples were measured. These samples were sanded at most two hours before measurement to mimic the effects of sawing.

Figure 1. The locations on the bone sample that were measured. Locations 1, 2, 3 and 4 are on the front side of the cross-section and locations 5, 6, 7 and 8 are on the back side.

Calculate area under the curve

To create the models, the AUC of the measured samples needed to be calculated. First, the peaks of the compounds collagen, HAP and FAD, located around the wavelengths of 400 nm (17,19), 440 nm (17,20) and 535 nm (17,19) respectively, needed to be identified. The expected locations of the peaks according to the literature are visualised as black markers on the curves in Figure 2a. Subsequently, a baseline was created underneath the peaks of interest, leaving out any extra peaks caused by other excited fluorescent compounds or additional signal due to the excitation light. The ranges of these baselines were set according to previous research, at 370 – 470 nm for collagen, 420 – 520 nm for HAP and 475 – 575 nm for FAD (21) (Figure 2a). The area between the curve and the drawn baseline was calculated by subtracting the area underneath the baseline from the area underneath the total curve (Figure 2b). Both areas were calculated by integration with the trapezoidal method. When no peak was present above the set baseline, an AUC of zero was used for the analysis. When no clear peak was visible, the AUC of the small fluctuations visible above the baseline was still calculated and used. These calculations and all other analyses were performed in MATLAB (R2019b).
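The baseline subtraction can be illustrated with a short Python sketch. It is not the MATLAB code used in this research. It assumes that the baseline is a straight line between the spectrum values at the two ends of the window and that the wavelengths are sorted in ascending order. The function names and the three-point example spectrum are hypothetical.

```python
def trapezoid_area(x, y):
    """Integrate y over x with the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))

def baseline_corrected_auc(wavelengths, intensities, start_nm, end_nm):
    """Area between the spectrum and a straight baseline over [start_nm, end_nm]."""
    window = [(w, v) for w, v in zip(wavelengths, intensities)
              if start_nm <= w <= end_nm]
    if len(window) < 2:
        return 0.0
    x = [w for w, _ in window]
    y = [v for _, v in window]
    # Assumed baseline: straight line from the first to the last point of the window.
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    baseline = [y[0] + slope * (xi - x[0]) for xi in x]
    area = trapezoid_area(x, y) - trapezoid_area(x, baseline)
    # No peak above the baseline: an AUC of zero is used.
    return max(area, 0.0)

# Baseline window for collagen as given in the text (370 to 470 nm).
COLLAGEN_WINDOW = (370, 470)

# Hypothetical three-point spectrum with a single peak at 420 nm.
wavelengths = [370, 420, 470]
intensities = [1.0, 6.0, 1.0]
auc = baseline_corrected_auc(wavelengths, intensities, *COLLAGEN_WINDOW)
print(auc)
```

The same function can be applied with the windows 420 – 520 nm for HAP and 475 – 575 nm for FAD.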

Figure 2. How the AUC is calculated: a) The locations of the peaks of collagen (400 nm), HAP (440 nm) and FAD (535 nm) are shown as black markers, and the baseline for each compound is visualised. b) The AUC that remains after subtracting the baseline from the total AUC.

Ratios

The amounts of collagen, HAP and FAD decline over time as the bone decomposes, and it was expected that the rate of this decrease differs between the three compounds (13,14). To test this, the absolute AUCs of the three compounds were plotted over time. The way in which the compounds decline could help explain the models that were created. However, the absolute AUCs could not be used to determine the age of bones, since many factors besides the amount of the measured compound influence the fluorescence intensity. To correct for the fluctuations between the measurements, ratios between compounds were calculated. Such a ratio could provide a model for estimating the PMI of recovered bones. Previous research has shown that the HAP-FAD ratio (equation 1) results in a promising model (18,21). Besides this ratio, the collagen-HAP and collagen-FAD ratios (equations 2 and 3) were also investigated. The ratios were calculated according to the following equations:

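$$\text{HAP-FAD ratio} = \frac{\text{HAP} + \text{FAD}}{\text{Collagen} + \text{HAP} + \text{FAD}} \qquad [1]$$

$$\text{Collagen-HAP ratio} = \frac{\text{Collagen} + \text{HAP}}{\text{Collagen} + \text{HAP} + \text{FAD}} \qquad [2]$$

$$\text{Collagen-FAD ratio} = \frac{\text{Collagen} + \text{FAD}}{\text{Collagen} + \text{HAP} + \text{FAD}} \qquad [3]$$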

The ratios were plotted against the age of the bones after which an exponential curve has been fitted to the data. Since most biological processes follow an exponential increase or decrease, an exponential curve was chosen. Other trendlines, such as a linear and polynomial curve, were also fitted to the data but these results are not shown in this report.
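The ratio calculation and the exponential trendline can be sketched in Python as follows. This is an illustration only. The fit below is a least-squares fit on the logarithm of the ratios, which is an assumption and may give slightly different parameters than the curve fitting used in this research. All function names and numbers are hypothetical.

```python
import math

def compound_ratios(collagen, hap, fad):
    """Ratios of equations 1 to 3, computed from the three AUCs."""
    total = collagen + hap + fad
    if total == 0:
        raise ValueError("all three AUCs are zero, the ratios are undefined")
    return {
        "HAP-FAD": (hap + fad) / total,
        "collagen-HAP": (collagen + hap) / total,
        "collagen-FAD": (collagen + fad) / total,
    }

def fit_exponential(ages, ratios):
    """Fit ratio = a * exp(b * age) by linear regression on log(ratio).

    Requires strictly positive ratios.
    """
    logs = [math.log(r) for r in ratios]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical AUCs of one measurement.
print(compound_ratios(collagen=2.0, hap=5.0, fad=3.0))

# Hypothetical ratios that decay exactly exponentially with age (in years).
ages = [0, 100, 200, 300]
ratios = [0.9 * math.exp(-0.002 * t) for t in ages]
a, b = fit_exponential(ages, ratios)
```

Because every ratio shares the same denominator, the three ratios of one measurement always add up to 2, so they are not independent of each other.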

Normalising the data

The two datasets that were used showed some difference in the AUCs. Additionally, the ranges of the AUCs differed among the three compounds. To correct for these differences, the data was normalised before creating models based on clustering analysis, PCA, LDA and SVM. After normalisation, the AUCs of collagen, HAP and FAD all ranged from 0 to 1. The following equation (equation 4) was used to normalise the data (22,23).

$$\mathrm{AUC}_{\text{new}} = \frac{\mathrm{AUC} - \mathrm{AUC}_{\min}}{\mathrm{AUC}_{\max} - \mathrm{AUC}_{\min}} \qquad [4]$$
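Equation 4 is a min-max normalisation. A minimal Python sketch, with a hypothetical function name and made-up AUC values, shows the calculation for one compound. Returning zeros when all values are equal is a choice made here to avoid dividing by zero.

```python
def min_max_normalise(values):
    """Rescale values to the range 0 to 1 following equation 4."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values equal: the equation would divide by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]

# Hypothetical AUCs of one compound.
normalised = min_max_normalise([10.0, 20.0, 40.0])
print(normalised)
```

The normalisation is applied to each compound separately, so that the AUCs of collagen, HAP and FAD all range from 0 to 1.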

Clustering analysis

Another approach on which a model for dating bones can be based is clustering analysis. This method tries to find clusters in data with any number of dimensions (24,25). K-means clustering was used to estimate clusters based on the AUCs of collagen, HAP and FAD. The clusters were created based on the distances between the datapoints and the nearest cluster mean, while the method had no knowledge of the true groups in the data (24–26). Given the number of clusters that is expected in the data, which is two in the context of this research, k-means clustering separates the data into this number of clusters based only on these distances. The areas in which these clusters were positioned form the predicted regions in the graph for the archaeological and forensic bones. The boundary between the predicted regions was located exactly between the centres of the clusters, leading to a linear separation between archaeological and forensic bones. The location of a newly measured bone sample in the graph then determines whether a bone is predicted to be archaeological or forensic. The model was created using the training set, after which it was tested with the test set, of which the datapoints were considered to be new. The performance of the model was expressed as the sensitivity and specificity, which are respectively the percentage of forensic bones that are correctly classified as forensic and the percentage of archaeological bones that are correctly classified as archaeological.
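The procedure can be sketched in Python with NumPy. This is an illustration and not the MATLAB code used in this research. The deterministic choice of starting centres, the use of the training labels to decide which cluster is called forensic, and all data points are assumptions.

```python
import numpy as np

def assign(points, centres):
    """Index of the nearest cluster centre for every point."""
    X = np.asarray(points, dtype=float)
    distances = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return distances.argmin(axis=1)

def two_means(points, n_iter=100):
    """K-means (Lloyd's algorithm) for two clusters."""
    X = np.asarray(points, dtype=float)
    # Assumed start: the points with the smallest and largest coordinate sum.
    order = np.argsort(X.sum(axis=1))
    centres = X[[order[0], order[-1]]].copy()
    for _ in range(n_iter):
        labels = assign(X, centres)
        updated = np.array([X[labels == k].mean(axis=0) for k in range(2)])
        if np.allclose(updated, centres):
            break
        centres = updated
    return centres

def sensitivity_specificity(is_forensic, predicted_forensic):
    """Fraction of forensic and of archaeological samples classified correctly."""
    truth = np.asarray(is_forensic, dtype=bool)
    pred = np.asarray(predicted_forensic, dtype=bool)
    sensitivity = (truth & pred).sum() / truth.sum()
    specificity = (~truth & ~pred).sum() / (~truth).sum()
    return float(sensitivity), float(specificity)

# Hypothetical normalised AUCs (two compounds) of the training set.
train_points = np.array([[0.1, 0.2], [0.2, 0.1], [0.2, 0.3],
                         [0.8, 0.9], [0.9, 0.8], [0.7, 0.8]])
train_is_forensic = np.array([False, False, False, True, True, True])
centres = two_means(train_points)

# The clustering itself is unsupervised; the labels are only used afterwards
# to decide which of the two clusters is called "forensic".
in_cluster_1 = assign(train_points, centres) == 1
forensic_cluster = 1 if train_is_forensic[in_cluster_1].mean() > 0.5 else 0

# Hypothetical test set; the last forensic point lies in the archaeological region.
test_points = np.array([[0.15, 0.25], [0.3, 0.2], [0.85, 0.75], [0.25, 0.3]])
test_is_forensic = np.array([False, False, True, True])
predicted = assign(test_points, centres) == forensic_cluster
sens, spec = sensitivity_specificity(test_is_forensic, predicted)
```

Assigning each point to its nearest centre is equivalent to the linear boundary halfway between the two centres described above. In this made-up example one of the two forensic test points is misclassified, which gives a sensitivity of 50% and a specificity of 100%.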

Principal Component Analysis

Another method that can be used for classification is PCA. PCA enables the handling and plotting of datasets with many variables by describing these variables with fewer principal components (27). This method was chosen to combine all three variables, the AUCs of collagen, HAP and FAD, in one two-dimensional model, which is clearer than a three-dimensional model. PCA transforms the data in several steps. First, the average of each variable is calculated; the point defined by these averages is from now on called the centre. Thereafter, the data are shifted so that this centre is moved to the origin of the graph (28). Then a line is fitted through the origin in such a way that the distance between the datapoints and the line is minimised. This line is a so-called principal component, and several principal components can be fitted to the data (28). More specifically, the number of principal components is the same as the number of variables. When there are more than three variables, the described transformation may seem abstract; however, PCA is able to perform this transformation with many more variables (28). Each principal component has an eigenvalue, which is a number that indicates the amount of variation in the data that is explained by that single principal component (27). The two principal components that account for most of the variation in the data of this research were plotted, so that measurements with correlated values lie close together. K-means clustering was then applied to the training set to investigate whether clusters could be found. Thereafter, the test set was used to calculate the sensitivity and specificity.
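The steps above can be sketched in Python with NumPy. This is an illustration under the assumption that the principal components are obtained from an eigendecomposition of the covariance matrix of the centred data. It is not the MATLAB code used in this research, and the function name and the data are hypothetical.

```python
import numpy as np

def pca(data):
    """Return (scores, eigenvalues, explained fraction), sorted by eigenvalue."""
    X = np.asarray(data, dtype=float)
    centred = X - X.mean(axis=0)               # move the centre to the origin
    covariance = np.cov(centred, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]      # largest eigenvalue first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = centred @ eigenvectors            # data expressed in the components
    explained = eigenvalues / eigenvalues.sum()
    return scores, eigenvalues, explained

# Hypothetical normalised AUCs: columns are collagen, HAP and FAD.
aucs = np.array([[0.1, 0.2, 0.9],
                 [0.2, 0.3, 0.7],
                 [0.4, 0.5, 0.6],
                 [0.6, 0.8, 0.3],
                 [0.9, 0.9, 0.1]])
scores, eigenvalues, explained = pca(aucs)

# Keep the two components that explain most of the variation.
first_two = scores[:, :2]
print(explained)
```

With three variables there are three principal components. The first two columns of the scores are the coordinates that would be plotted and passed on to k-means clustering.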

Linear Discriminant Analysis

LDA is similar to PCA in that it enables the reduction of variables, which allows the researcher to investigate the data more easily. The difference is that LDA focuses on maximising the separability among the groups (29,30), in this case the archaeological and forensic bones, while PCA causes variables that correlate to cluster together. Therefore, LDA is able to find the best way to separate the two groups of archaeological and forensic bones in the data. An LDA model was trained on the training set, which resulted in a decision boundary separating the two groups of bones. Weights were implemented in the model: the weights for archaeological and forensic bones were 0.4 and 0.6 respectively. By adding weights, the influence of each group on the model can be adjusted and is no longer equal for both groups (31). This causes a slight shift of the boundary between the two predicted regions in favour of the forensic bones, which is expected to lead to a higher sensitivity and a lower specificity. The model then had to predict to which category each measurement of the test set belonged. The performance was again expressed as the sensitivity and specificity.
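The effect of the weights can be illustrated with a small two-group LDA in Python with NumPy. In this sketch the weights 0.4 and 0.6 are interpreted as prior probabilities of the two groups, which is an assumption about how they entered the model in this research. The function names and data points are hypothetical.

```python
import numpy as np

def fit_lda(X, y, priors):
    """Linear discriminant analysis with a pooled covariance matrix.

    `priors` holds one prior probability per class, in sorted class order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    scatter = sum((X[y == c] - m).T @ (X[y == c] - m)
                  for c, m in zip(classes, means))
    pooled_cov = scatter / (len(X) - len(classes))
    coef = means @ np.linalg.inv(pooled_cov)      # one row per class
    intercept = (-0.5 * np.sum(coef * means, axis=1)
                 + np.log(np.asarray(priors, dtype=float)))
    return classes, coef, intercept

def predict_lda(model, X):
    """Assign each point to the class with the highest discriminant score."""
    classes, coef, intercept = model
    scores = np.asarray(X, dtype=float) @ coef.T + intercept
    return classes[scores.argmax(axis=1)]

# Hypothetical normalised AUCs: 0 = archaeological, 1 = forensic.
X_train = [[0.1, 0.2], [0.2, 0.1], [0.2, 0.3], [0.3, 0.2],
           [0.7, 0.8], [0.8, 0.7], [0.8, 0.9], [0.9, 0.8]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_lda(X_train, y_train, priors=(0.4, 0.6))
# The last point lies exactly halfway between the two group means.
print(predict_lda(model, [[0.15, 0.15], [0.85, 0.85], [0.5, 0.5]]))
```

A point halfway between the two group means is assigned to the forensic group when that group has the larger prior, which shows how the weights enlarge the predicted forensic region.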

Support Vector Machine

An SVM model was created and tested. An SVM is a binary classification model that uses training data to create a decision boundary, a line between two predicted regions, that optimises the separation of the groups (32). After optimising the SVM model, test data were used to give an indication of the performance of the model. The model predicts to which group a new datapoint belongs from its location relative to the decision boundary. Both a linear and a radial basis function approach were used in order to create a clear distinction between the two time periods. In the linear approach, a linear decision boundary is drawn between the two predicted regions of archaeological and forensic bones. The radial basis function allows for a more flexible decision boundary, which results in predicted regions that follow the shape of the data more closely. Additionally, a box constraint was chosen, which is a number that controls the amount of misclassification allowed by the model. The higher this number, the stricter the model and the less misclassification is permitted (33). Since there is overlap between the archaeological and forensic bones, a strict classification that works perfectly for the training set will result in a low performance on the test set. Therefore, a box constraint of one was chosen. Similar to LDA, weights were set for the archaeological and forensic bones, which were 0.4 and 0.6 respectively. Again, this was done to obtain a smaller false negative rate. After creating the model, it was tested using the test set, and the sensitivity and specificity were calculated to evaluate the performance of the model.
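A comparable setup can be sketched in Python with scikit-learn, which is not the software used in this research. In this sketch the parameter `C` of `SVC` plays the role of the box constraint and `class_weight` is used as an approximation of the weights of 0.4 and 0.6; both mappings, the kernel width setting and the data points are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical normalised AUCs: 0 = archaeological, 1 = forensic.
X_train = np.array([[0.1, 0.1], [0.1, 0.2], [0.2, 0.1], [0.2, 0.2],
                    [0.8, 0.8], [0.8, 0.9], [0.9, 0.8], [0.9, 0.9]])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def fit_svm(X, y, kernel):
    """Fit an SVM with a box constraint of one and class weights 0.4 and 0.6."""
    model = SVC(kernel=kernel,
                C=1.0,                           # box constraint
                gamma="scale",                   # only used by the RBF kernel
                class_weight={0: 0.4, 1: 0.6})   # assumed weighting scheme
    return model.fit(X, y)

linear_svm = fit_svm(X_train, y_train, "linear")   # straight decision boundary
rbf_svm = fit_svm(X_train, y_train, "rbf")         # flexible decision boundary

# New datapoints are classified by their position relative to the boundary.
new_points = np.array([[0.15, 0.15], [0.85, 0.85]])
print(linear_svm.predict(new_points), rbf_svm.predict(new_points))
```

In scikit-learn the class weight multiplies `C` for each group, so misclassifying a forensic sample is penalised more heavily than misclassifying an archaeological one.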

Results

Ratios

First, the absolute AUCs of collagen, HAP and FAD were plotted against time, and a decline was observed for all three compounds (Figure 3). After calculating the HAP-FAD, collagen-HAP and collagen-FAD ratios, an exponential curve was fitted to the data. For the HAP-FAD ratio, a decrease was observed over time, meaning that older bones show a lower HAP-FAD ratio than younger bones (Figure 4). The collagen-FAD ratio did not show any differences between older and younger bones (Appendix 1, Figure 1). For the collagen-HAP ratio, a slight increase is visible over time (Appendix 1, Figure 2). This indicates that the ratio is lower for younger bones and higher for older bones.

Figure 4. The HAP-FAD ratio over time showing the means for each time period with the standard deviations. The curve is exponentially fitted to the data.

Clustering analysis

K-means clustering creates clusters based on the data without knowing the actual groups, after which these clusters can be compared with the true groups. The clusters found in the training set are visualised as predicted regions for both archaeological and forensic bones. This is shown for the compounds HAP and FAD in Figure 5. Subsequently, the test set was used to show the performance of the model (Figure 6). The sensitivity ranged from 19.05% to 24.53% and the specificity from 75.58% to 83.95%. The same was done for the compounds collagen and HAP, which resulted in a sensitivity of 20.75% to 26.47% and a specificity of 72.84% to 78.13% (Appendix 1, Figure 3 and 4), and for collagen and FAD, for which the outcome was a sensitivity of 20.59% to 24.53% and a specificity of 72.09% to 83.33% (Appendix 1, Figure 5 and 6). For all three models, the sensitivity was very low and the specificity relatively high. None of the models showed better results than the others.

Figure 5. The training data is plotted (archaeological bones in blue and forensic bones in red) together with the obtained k-means clustering model, based on HAP and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The centers of the clusters are marked with a cross (x).

Figure 6. The test data is plotted (archaeological bones in blue and forensic bones in red) together with the k-means clustering model, based on HAP and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown.

PCA

By using PCA, three principal components were found. Comparing their eigenvalues showed that the first and second principal component accounted for most of the variation in the data. These two components were then plotted against each other and k-means clustering was applied to the transformed training data (Figure 7). With the test set, a sensitivity of 20.59% to 26.42% and a specificity of 70.93% to 82.29% were found (Figure 8). These results were comparable with the sensitivity and specificity obtained by k-means clustering only, before data transformation by PCA.

Figure 7. The training data is plotted (archaeological bones in blue and forensic bones in red) after transformation with PCA. The archaeological (grey) and forensic (white) regions predicted by k-means clustering are shown.

Figure 8. The test data is plotted (archaeological bones in blue and forensic bones in red) after transformation with PCA. The archaeological (grey) and forensic (white) regions predicted by k-means clustering are shown.

LDA

In contrast to k-means clustering and PCA, LDA takes the true groups of archaeological and forensic bones into account when dividing the data. This method led to a higher sensitivity compared to the previous methods. The model based on HAP and FAD resulted in a sensitivity of 55.88% to 98.11% and a specificity of 79.07% to 97.53% (Figure 9 and 10). The model based on collagen and HAP gave a sensitivity of 58.82% to 96.23% and a specificity of 77.08% to 97.53% (Appendix 1, Figure 7 and 8). The combination of collagen and FAD resulted in a sensitivity ranging from 52.94% to 84.90% and a specificity ranging from 82.56% to 95.06% (Appendix 1, Figure 9 and 10). Although the sensitivity was higher, the ranges were very broad for each of the models, meaning that the way in which the datasets were randomly divided affected the performance of the model.

Figure 9. The training data is plotted (archaeological bones in blue and forensic bones in red) together with the obtained LDA model, based on HAP and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The black line is the decision boundary between the two predicted regions.

Figure 10. The test data is plotted (archaeological bones in blue and forensic bones in red) together with the LDA model, based on HAP and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The black line is the decision boundary between the two predicted regions.

Support Vector Machine

For the SVM method, two approaches for determining the decision boundary were used: the linear method and the radial basis function method. Figure 11 and 12 show the linear model based on the compounds HAP and FAD for the training and test set, for which a sensitivity of 76.47% to 100% and a specificity of 75.58% to 88.89% were reached. The background colour of the graphs indicates the posterior probability, which is the probability that a datapoint belongs to a certain group. For collagen and HAP, the sensitivity ranged from 76.47% to 98.11% and the specificity from 70.83% to 93.83% (Appendix 1, Figure 11 and 12). The sensitivity and specificity for collagen and FAD were 80.88% to 100% and 77.91% to 87.65% respectively (Appendix 1, Figure 13 and 14). Relatively high sensitivities were reached with the linear approach for all three compound combinations, and the ranges were much smaller than the sensitivity ranges of the LDA models. The specificity was slightly lower than the specificities observed in the previous models, but it was still relatively high. For the radial basis function approach, comparably high sensitivities were obtained. The HAP and FAD model showed a sensitivity of 86.76% to 100% and a specificity of 73.26% to 87.65% (Figure 13 and 14). The model of collagen and HAP resulted in a sensitivity of 83.82% to 100% and a specificity of 71.88% to 90.12% (Appendix 1, Figure 15 and 16). The model based on collagen and FAD gave a sensitivity ranging from 92.65% to 98.11% and a specificity ranging from 77.91% to 87.65% (Appendix 1, Figure 17 and 18). The lower bound of the sensitivity range was higher for all models than for the linear approach, suggesting that the radial basis function created a more consistent model. For this approach, too, the specificity was slightly lower than for the earlier mentioned models.

Figure 11. The predicted regions of archaeological and forensic bones, predicted with a linear SVM model, based on HAP and FAD. The model is plotted together with the data the model is based on (training set). The circles show the true category of the bones (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated with the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 12. The predicted regions of archaeological and forensic bones, predicted with a linear SVM model, based on HAP and FAD. The model is plotted together with the test set. The circles show the true category of the bones (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated with the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 13. The predicted regions of archaeological and forensic bones, predicted with a radial basis function SVM model, based on HAP and FAD. The model is plotted together with the data the model is based on (training set). The circles show the true category of the bones (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated with the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 14. The predicted regions of archaeological and forensic bones, predicted with a radial basis function SVM model, based on HAP and FAD. The model is plotted together with the test set. The circles show the true category of the bones (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated with the coloured background, show the probability that a datapoint belongs to a certain group.

Discussion

Discussion of the models

Several models for classifying recovered bones as archaeological or forensic have been tested. First, several ratios were investigated. The HAP-FAD ratio showed a decline over time, which is in accordance with previous research (18,21). However, there was a lot of variation within the data of each time period, which makes a specific age determination impossible. The cause of this variation will be discussed later. The collagen-FAD ratio did not change over time; therefore, this ratio cannot be used in dating bones. The collagen-HAP ratio increased as the bones became older, but again, the large amount of variation made the age determination problematic. These ratios were based on the AUC, which scales with the approximate amount of a compound. This research showed a decrease in the fluorescence of collagen, HAP and FAD. This is consistent with the literature, which states that proteins, such as collagen and FAD, decrease over time and that HAP is also decomposed (12–14).

The second method applied to the data was k-means clustering. The three models, based on HAP and FAD, collagen and HAP, and collagen and FAD, showed similar results. The sensitivity, the percentage of forensic bones that the model correctly predicted to be forensic, was in general very low. This resulted in high false negative rates, meaning that a high percentage of forensic bones were incorrectly predicted to be archaeological. The specificity of the k-means clustering models, the percentage of archaeological bones correctly predicted to be archaeological, was in general relatively high. A high specificity results in a low false positive rate, meaning that only a small percentage of the archaeological bones were predicted to be forensic by the model. Since all bones that are predicted to be forensic will be investigated further, false positives will be detected at a later stage of the investigation, which makes them less harmful than false negatives. Bones that are predicted to be archaeological are not subjected to further investigation, so false negatives remain unnoticed. The low sensitivity of the k-means clustering models is therefore problematic for determining the age of bones, because many forensic bones will be classified as archaeological and will consequently not be investigated. An explanation for this low sensitivity could be the way k-means clustering finds groups in the data. K-means clustering assigns each datapoint to the nearest cluster mean, without taking into account which bones are archaeological and which are forensic, which led to a linear separation of the data (24–26). Since the archaeological and forensic bones overlapped to some extent, a simple linear division was not possible for the fluorescence data in this experiment. Therefore, k-means clustering is not a good method to create a model for dating bones.
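The definitions of sensitivity and specificity used here, with forensic as the positive class, can be written out as a short Python sketch. The function name and the example labels are hypothetical and are not results from this research.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="forensic"):
    """Return (sensitivity, specificity) with `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # forensic bone predicted to be forensic
            else:
                fn += 1  # forensic bone predicted to be archaeological
        else:
            if predicted == positive:
                fp += 1  # archaeological bone predicted to be forensic
            else:
                tn += 1  # archaeological bone predicted to be archaeological
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in true_labels")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 4 forensic and 6 archaeological bones.
truth = ["forensic"] * 4 + ["archaeological"] * 6
predicted = (
    ["forensic"] + ["archaeological"] * 3       # 1 of 4 forensic bones found
    + ["archaeological"] * 5 + ["forensic"]     # 5 of 6 archaeological bones correct
)
sensitivity, specificity = sensitivity_specificity(truth, predicted)
false_negative_rate = 1.0 - sensitivity
false_positive_rate = 1.0 - specificity
```

In this made-up example the specificity is high while the sensitivity is low, which is the pattern described above: most forensic bones would be missed.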

The third method for estimating the age of bones was PCA in combination with k-means clustering. Since the archaeological and forensic bones still overlapped after the transformation of the data with PCA, k-means clustering could not clearly separate the data with a linear division. Therefore, these models again showed a low sensitivity and a high specificity. The same problem of a high false negative rate applies here, meaning that PCA in combination with k-means clustering is not the best method for estimating the age of bones.

The fourth method, LDA, took into account the groups that were present in the data, and this led to better models for estimating the age of bones. Like k-means clustering, LDA separated the data with a linear decision boundary. However, because the LDA model was trained on the actual groups present in the data, and because weights could be assigned to the groups, LDA was a more promising way of separating the data than k-means clustering. The LDA model indeed resulted in a higher sensitivity and specificity than the earlier models. Still, a number of forensic bones were classified as archaeological, which is not ideal in practice.
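The linear boundary and the role of the weights can be made explicit. The following is a sketch under the standard LDA assumptions (Gaussian classes with a shared covariance matrix), and it assumes that the group weights act as prior probabilities for the archaeological class (A) and the forensic class (F); how the weights were implemented in this research may differ. The symbols $\mu_A$, $\mu_F$, $\Sigma$, $\pi_A$ and $\pi_F$ (class means, shared covariance and priors) are introduced here for illustration only.

```latex
\delta_k(x) = x^\top \Sigma^{-1} \mu_k
            - \tfrac{1}{2}\, \mu_k^\top \Sigma^{-1} \mu_k
            + \log \pi_k ,
\qquad k \in \{A, F\}

\delta_F(x) = \delta_A(x)
\;\Longleftrightarrow\;
(\mu_F - \mu_A)^\top \Sigma^{-1} x
= \tfrac{1}{2} \left( \mu_F^\top \Sigma^{-1} \mu_F - \mu_A^\top \Sigma^{-1} \mu_A \right)
- \log \frac{\pi_F}{\pi_A}
```

A datapoint is assigned to the forensic class when the left-hand side exceeds the right-hand side. The boundary is linear in $x$, and increasing $\pi_F$ relative to $\pi_A$ lowers the right-hand side, so the forensic region grows. With $\Sigma = I$ and equal priors the rule reduces to assigning $x$ to the nearest mean, which is the assignment rule that k-means applies to its cluster centres; this is why both methods give a straight boundary in a two-feature plot.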

The last models, those based on SVM, gave the best results and therefore have the best potential for use in determining the age of bones. In particular, the model based on the radial basis function, in combination with the weights given to the archaeological and forensic bones, resulted in a high sensitivity and specificity. This is the only model created in this research that does not separate archaeological and forensic bones with a linear decision boundary. The radial basis function allows the decision boundary to be more flexible, leading to more specific predicted regions for the two groups of bones.
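A weighted radial basis function SVM of this kind can be sketched as follows. The example uses the SVC class of scikit-learn, which is an assumption and not necessarily the software used in this research; the synthetic datapoints, the class weights and the variable names are hypothetical and chosen only for illustration.

```python
import random

from sklearn.svm import SVC

random.seed(0)


def make_points(centre, n, spread=0.6):
    """Draw n synthetic two-feature datapoints around a centre."""
    return [
        [random.gauss(centre[0], spread), random.gauss(centre[1], spread)]
        for _ in range(n)
    ]


# Synthetic two-feature data: 0 = archaeological, 1 = forensic.
X_train = make_points((1.0, 1.0), 40) + make_points((2.5, 2.5), 40)
y_train = [0] * 40 + [1] * 40

# Errors on forensic bones are penalised three times as heavily (hypothetical
# weight), which pushes the model towards fewer false negatives.
model = SVC(
    kernel="rbf",
    gamma="scale",
    C=1.0,
    class_weight={0: 1.0, 1: 3.0},
    probability=True,  # enables posterior probability estimates
    random_state=0,
)
model.fit(X_train, y_train)

new_points = [[1.0, 1.2], [2.6, 2.4]]
labels = model.predict(new_points)
posteriors = model.predict_proba(new_points)  # one row per point, columns = classes
```

The class weights scale the penalty for misclassifying each group, so a larger weight for the forensic class trades some specificity for a higher sensitivity. The posterior probabilities returned by scikit-learn are estimated with an additional calibration step and are not a direct output of the SVM itself.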

Factors concerning the measurements

As mentioned earlier, there was a lot of variation within the AUCs at each time period. This made the calculated ratios less reliable. In addition, it caused the archaeological and forensic measurements to overlap when plotted in a two-dimensional scatterplot. There are several possible explanations for this variance. The first has to do with the sample preparation. In 2018, the cross-sections were sawn from the femurs just before the samples were measured. For dataset-2020, these cross-sections already existed, and the samples were only sanded before being measured. This slightly different sample preparation might have affected the fluorescence intensity. The second explanation relates to how the bone samples were measured using fluorescence spectroscopy. The distance between the fibre and the sample affects the intensity of the emission spectrum: the smaller the distance, the larger the intensity. Between the measurements of collagen, HAP and FAD, the height of the fibre was not changed, so taking ratios of these measurements rules out differences caused by this height. However, the other models did not use ratios, meaning that this height did influence those models. Moreover, the thickness of the cross-sections was not uniform across the whole surface, and the height of the fibre was therefore adjusted where necessary to maintain the same distance between the fibre and the sample. This caused differences in the height of the fibre within the measurements of one sample, creating a large variation in AUC between the different measurement locations. Furthermore, the fibre used to measure the samples had a large diameter, which resulted in sampling of the emission spectrum of both the spongy bone and the more compact cortical bone (11). These layers are expected to show a different fluorescence intensity when measured separately. The cortex of the bone matrix may be more affected by environmental factors, while the spongy bone on the inner side may remain more intact. The mixture of the different structures within one measurement could be a source of the variation. Additionally, the dataset of 2020 was created at two different moments, with a 3.5-month gap in between. It was not tested whether anything had changed in the fluorescence spectrometer or its light source, such as replaced elements, which may have caused more variation within the dataset. The last explanation could be found in the curves of collagen, HAP and FAD. Since the peaks of collagen (400 nm) and HAP (440 nm) are close to each other, the peaks might overlap to some extent. When the AUC of collagen is calculated, part of this area may actually belong to the AUC of HAP, and vice versa. The same applies to FAD (535 nm), whose peak lies on a slope that could actually belong to the peak of HAP.
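The possible size of such a peak overlap can be illustrated with a small Python sketch. It models the collagen and HAP peaks as Gaussian curves of equal height at the peak positions named above; the Gaussian shape, the peak widths and the integration window are hypothetical assumptions and were not determined in this research.

```python
import math


def gaussian_area(amplitude, centre, sigma, lower, upper):
    """Area of a Gaussian peak between lower and upper (closed form via erf)."""
    scale = sigma * math.sqrt(2.0)
    erf_part = math.erf((upper - centre) / scale) - math.erf((lower - centre) / scale)
    return amplitude * sigma * math.sqrt(math.pi / 2.0) * erf_part


def neighbour_fraction(target_centre, neighbour_centre, sigma, half_window, amplitude=1.0):
    """Fraction of the AUC in the target window that stems from the neighbouring peak."""
    lower = target_centre - half_window
    upper = target_centre + half_window
    target = gaussian_area(amplitude, target_centre, sigma, lower, upper)
    neighbour = gaussian_area(amplitude, neighbour_centre, sigma, lower, upper)
    return neighbour / (target + neighbour)


# Collagen window around 400 nm with the HAP peak at 440 nm as neighbour.
# Peak widths (sigma) and the 20 nm half-window are hypothetical.
broad_peaks = neighbour_fraction(400.0, 440.0, sigma=20.0, half_window=20.0)
narrow_peaks = neighbour_fraction(400.0, 440.0, sigma=10.0, half_window=20.0)
```

Under these assumptions, broader peaks let a larger share of the neighbouring peak leak into the integration window, so the AUC attributed to collagen would partly reflect HAP.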

Factors that influence decomposition

Besides the measurement-related factors that affect the variation in the dataset, other causes could be found in the environment in which the bones were buried, since many factors influence the decomposition of bones (10,14). Several physical factors, such as transport by animals and water, sandblasting, weathering, burial and mineralisation, affect the process of decomposition (14). The rate of decomposition, which influences the amounts of the different compounds present in bone, is affected by the temperature of the environment, rainfall, clothing of the individual, the soil composition, perimortem trauma to the bone, body weight, and chewing by animals (14,16). The bones in this experiment were all buried, so the models created are limited to use on buried bones. However, the burial conditions differed between the bone collections in terms of the type of soil, and these conditions could have influenced the decomposition of the bones. For example, the breakdown of collagen occurs faster in an alkaline environment than in an acidic environment (10), while the loss of HAP is faster in an acidic than in an alkaline environment (13). Additionally, the accessibility of the bone to water is an important factor, because the loss of both collagen and HAP requires the presence of water (9). In this experiment, no correction was made for these burial conditions and other environmental factors.

Problems and limitations of the models

The models that have been created have some problems and limitations. The first problem is that the boundary between archaeological and forensic bones in the SVM model is placed at the year 1930, while it should actually be set at 1920, because bones dating from after 1920 are considered to be forensically relevant in the Netherlands (2). The boundary in this experiment was set at 1930 because none of the five time periods from which the bones originate begins or ends at 1920. Additionally, there is a time gap between 1930 and 1970 for which no bones were available. Therefore, the models might classify some forensic bones as archaeological. This links to the second problem, that of false negatives. In the case of a false negative, forensic bones are incorrectly predicted to be archaeological by the model. Even though the false negative rate of the SVM models in this experiment is very low due to the high sensitivity, false negatives might occur in the future. This chance is increased by the fact that the age boundary of 1930 is later than the actual boundary of 1920. A further limitation of the models is that the effect of the environment on the composition of the bones, and therefore on their fluorescence measurement, is not taken into account. It is also difficult to state for which types of soil the models will work.

Improvements

The SVM model does not yet make it possible to completely replace the current method; however, it could be a good addition to radiocarbon dating. The low false negative rate makes it possible to conclude that most bones that are classified as archaeological do not need any additional research, which reduces the investigation time and costs. The bones that are predicted to be forensically relevant need to be investigated further, and a small portion of these bones will probably later turn out to be archaeological. The model produced during this research is thus a first step towards a model that could eventually be used in practice by crime scene investigators. A further improvement could be made by implementing the adjustment for different soil types discussed earlier, which would make the age determination more accurate. This could be done by investigating bones originating from different types of soil and determining which model works for which soil type. Other aspects that influence the fluorescence measurements could be included in the model as well. This could be investigated by looking into which factors influence the measurements and by creating an adjustable model for the factors with the largest influence. Eventually, the model should be tested on case examples before it can be applied in practice.

Future research

To make it possible to use the SVM model for dating bones in the future, additional research needs to be done. An important improvement in measuring the fluorescence of the bone samples would be to measure a more specific area on the cross-section. This could help to minimise the variation within and between bone samples of the same age. The excitation area of the fibre could be narrowed down by using a set-up that includes two or more lenses. In addition, the preparation of the bone samples could be improved and standardised further. The cross-sections should be sawn and sanded in water to avoid the influence of frictional heat. Additionally, future research could indicate how different soil types affect the fluorescence measurement and the AUC of collagen, HAP and FAD. Using this knowledge, the model could be adjusted for each soil type. Besides soil, many other factors could influence the fluorescence measurement of the recovered bones. These are all factors that affect the decomposition of bones, such as the environmental factors temperature, groundwater and rainwater, pH and bacteria, but also the age and sex of the donor, and whether the donor had any diseases or was injured (14,16). These factors might, for example, affect the porosity of the bone, and these effects could be investigated in future research. Finally, other fluorescent compounds in bone tissue, such as elastin and NADH, could be included in the models to improve the accuracy of the method (17).

Conclusion

To find a new method for determining the age of bones, several models were created, based on ratio curves, k-means clustering, PCA, LDA and SVM. SVM models with the radial basis function are the most promising method for determining the age of bones. The current method of radiocarbon dating is time-consuming, and SVM can be a good addition in dating bones to minimise the number of bones that need to be measured with radiocarbon dating. However, many improvements need to be made before this method can be implemented in casework. These improvements mainly concern standardising the fluorescence measurements, but more knowledge about the factors that influence these measurements and the bone composition is also required.

References

1. Nederlands Forensisch Instituut. DNA-databank vermiste personen [Internet]. [cited 2020 Jul 24]. Available from: https://dnadatabank.forensischinstituut.nl/dna-databanken/dna-databank-vermiste-personen

2. Nederlands Forensisch Instituut. Zandmotor brengt geheimen met zich mee [Internet]. 2016 [cited 2020 Feb 11]. Available from: https://magazines.forensischinstituut.nl/atnfi/2016/17/zandmotor-brengt-geheimen-met-zich-mee

3. Taylor RE, Myers Suchey J, Payen LA, Slota PJ. The Use of Radiocarbon (14C) to Identify Human Skeletal Materials of Forensic Science Interest. Journal of Forensic Sciences. 1989;34(5):1196–1205.

2008;50(2):249–75.

5. Coleman DC, Fry B. (ed.) Carbon Isotope Techniques. Academic Press Inc.; 1991. 267 p.

6. Taylor RE, Aitken MJ. Chronometric Dating in Archaeology. Springer US; 1997. 411 p.

7. Nederlands Forensisch Instituut. Koolstofdatering: hoe oud is een skelet? [Internet]. 2014 [cited 2020 Feb 11]. Available from: https://magazines.forensischinstituut.nl/atnfi/2014/04/koolstofdatering-hoe-oud-is-een-skelet

8. White TD, Folkens PA. The human bone manual. Elsevier; 2005.

9. Cox M, Mays S. Human Osteology: In Archaeology and Forensic Science. First Edition. Greenwich Medical Media; 2000.

10. Kendall C, Eriksen AMH, Kontopoulos I, Collins MJ, Turner-Walker G. Diagenesis of archaeological bone and tooth. Palaeogeography Palaeoclimatology Palaeoecology. 2018;491(May 2017):21–37.

11. Martini FH, Nath JL, Bartholomew EF. Fundamentals of Anatomy and Physiology. Tenth Edition. Pearson; 2015.

12. Swaraldahab MAH, Christensen AM. The Effect of Time on Bone Fluorescence: Implications for Using Alternate Light Sources to Search for Skeletal Remains. Journal of Forensic Sciences. 2016;61(2):442–4.

13. Pokines JT, Symes SA. (ed.) Manual of Forensic Taphonomy. CRC Press; 2013.

14. Haglund WD, Sorg MH. (ed.) Forensic Taphonomy: The Postmortem Fate of Human Remains. CRC Press; 1997.

15. Mewies M, McIntire WS, Scrutton NS. Covalent attachment of flavin adenine dinucleotide (FAD) and flavin mononucleotide (FMN) to enzymes: The current state of affairs. Protein Science. 1998;7(1):7–20.

16. Collins MJ, Nielsen-Marsh CM, Hiller J, Smith CI, Roberts JP, Prigodich RV, et al. The survival of organic matter in bone: A review. Archaeometry. 2002;44(3):383–94.

17. Bachmann L, Zezell DM, Ribeiro A da C, Gomes L, Ito AS. Fluorescence spectroscopy of biological tissues - A review. Applied Spectroscopy Reviews. 2006;41(6):575–90.

18. Van Hal R. Auto-fluorescence in human femora over time - validating the model for the forensic relevance of human skeletal remains. 2018.

19. Ramanujam N. Fluorescence spectroscopy in vivo. In: Encyclopedia of Analytical Chemistry. 2000.

20. Bachman CH, Ellis EH. Fluorescence of bone. Nature. 1965;206:1328–31.

21. Keuning R. Autofluorescentie van bot in relatie tot het PMI. 2016.

22. Jain A, Nandakumar K, Ross A. Score normalization in multimodal biometric systems. Pattern Recognition. 2005;38(12):2270–85.

23. Li W, Liu Z. A method of SVM with normalization in intrusion detection. Procedia Environmental Sciences. 2011;11(PART A):256–62.

24. Likas A, Vlassis N, Verbeek JJ. The global k-means clustering algorithm. Pattern Recognition. 2003;36(2):451–61.

25. Kanungo T, Mount DM, Netanyahu NS, Piatko CD, Silverman R, Wu AY. An Efficient k-Means Clustering Algorithm: Analysis and Implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002;24(7):881–92.

26. Alpaydin E. Introduction to Machine Learning. Third Edition. The MIT Press; 2014. 640 p.

27. Abdi H, Williams LJ. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics. 2010;2(4):433–59.

28. Jolliffe IT. Principal components analysis. Second Edition. Springer; 2010. 518 p.

29. Balakrishnama S, Ganapathiraju A. Linear Discriminant Analysis - A Brief Tutorial. Institute for Signal and Information Processing. 1998;8.

30. Ye J, Janardan R, Li Q. Two-dimensional linear discriminant analysis. Advances in Neural Information Processing Systems. 2005.

31. Xu L, Iosifidis A, Gabbouj M. Weighted linear discriminant analysis based on class saliency information. In: 2018 25th IEEE International Conference on Image Processing (ICIP). 2018. p. 2306–10.

32. Suykens JAK, Vandewalle J. Least Squares Support Vector Machine Classifiers. Neural Processing Letters. 1999;9:293–300.

33. Cristianini N, Shawe-Taylor J. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press; 2013. 204 p.

Appendix 1.

Figure 3. The collagen-FAD ratio over time showing the means for each time period with the standard deviations (n = 298). The curve is exponentially fitted to the data.

Figure 4. The collagen-HAP ratio over time showing the means for each time period with the standard deviations (n = 298). The curve is exponentially fitted to the data.

Figure 5. The training data is plotted (archaeological bones in blue and forensic bones in red) together with the obtained k-means clustering model, based on collagen and HAP. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The centers of the clusters are marked with a cross (x).

Figure 6. The test data is plotted (archaeological bones in blue and forensic bones in red) together with the k-means clustering model, based on collagen and HAP. The predicted regions of the archaeological (grey) and forensic bones (white) are shown.

Figure 7. The training data is plotted (archaeological bones in blue and forensic bones in red) together with the obtained k-means clustering model, based on collagen and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The centers of the clusters are marked with a cross (x).

Figure 8. The test data is plotted (archaeological bones in blue and forensic bones in red) together with the k-means clustering model, based on collagen and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown.

Figure 9. The training data is plotted (archaeological bones in blue and forensic bones in red) together with the obtained LDA model, based on collagen and HAP. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The black line is the decision boundary between the two predicted regions.

Figure 10. The test data is plotted (archaeological bones in blue and forensic bones in red) together with the LDA model, based on collagen and HAP. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The black line is the decision boundary between the two predicted regions.

Figure 11. The training data is plotted (archaeological bones in blue and forensic bones in red) together with the obtained LDA model, based on collagen and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The black line is the decision boundary between the two predicted regions.

Figure 12. The test data is plotted (archaeological bones in blue and forensic bones in red) together with the LDA model, based on collagen and FAD. The predicted regions of the archaeological (grey) and forensic bones (white) are shown. The black line is the decision boundary between the two predicted regions.

Figure 13. The predicted regions of archaeological and forensic bones, predicted with a linear SVM model based on collagen and HAP. The model is plotted together with the data on which it is based (training set). The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 14. The predicted regions of archaeological and forensic bones, predicted with a linear SVM model based on collagen and HAP. The model is plotted together with the test set. The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 13. The predicted regions of archaeological and forensic bones, predicted with a linear SVM model based on collagen and FAD. The model is plotted together with the data on which it is based (training set). The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 14. The predicted regions of archaeological and forensic bones, predicted with a linear SVM model based on collagen and FAD. The model is plotted together with the test set. The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 15. The predicted regions of archaeological and forensic bones, predicted with a radial basis function SVM model based on collagen and HAP. The model is plotted together with the data on which it is based (training set). The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 16. The predicted regions of archaeological and forensic bones, predicted with a radial basis function SVM model based on collagen and HAP. The model is plotted together with the test set. The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 15. The predicted regions of archaeological and forensic bones, predicted with a radial basis function SVM model based on collagen and FAD. The model is plotted together with the data on which it is based (training set). The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.

Figure 16. The predicted regions of archaeological and forensic bones, predicted with a radial basis function SVM model based on collagen and FAD. The model is plotted together with the test set. The circles show the real category to which the bones belong (archaeological or forensic) and the crosses indicate the prediction by the model. The posterior probabilities (P), indicated by the coloured background, show the probability that a datapoint belongs to a certain group.
