Classifying Raman Spectra of Extracellular Vesicles based on Convolutional Neural Networks for Prostate Cancer Detection


RESEARCH ARTICLE

Wooje Lee^1 | Aufried T.M. Lenferink^2 | Cees Otto^2 | Herman L. Offerhaus^1

^1 Optical Sciences, MESA+ Institute for Nanotechnology, University of Twente, Enschede, The Netherlands

^2 Medical Cell BioPhysics, Technical Medical Centre, University of Twente, Enschede, The Netherlands

Correspondence

Herman L. Offerhaus, Optical Sciences, MESA+ Institute for Nanotechnology, University of Twente, Enschede, The Netherlands.

Email: h.l.offerhaus@utwente.nl

Funding information: Stichting voor de Technische Wetenschappen, Grant/Award Number: 14197

Abstract

Since the early 2000s, machine learning algorithms have been widely used in many research and industrial fields, most prominently in computer vision. Lately, many fields of study have tried to use these automated methods, and there are several reports from the field of spectroscopy. In this study, we demonstrate a classification model based on machine learning to classify Raman spectra. We obtained Raman spectra from extracellular vesicles (EVs) to find tumor-derived EVs. The convolutional neural network (CNN) was trained on preprocessed Raman data and on raw Raman data. We compare the results from the CNN with results from principal component analysis, which is widely used in spectroscopy. The new model classifies EVs with an accuracy of >90%. Moreover, the new model based on CNN is also suitable for classifying the raw Raman data directly without preprocessing, with a minimum accuracy of 93%.

KEYWORDS

Cancer biomarker, convolutional neural network, extracellular vesicles, machine learning, Raman spectroscopy

1 | INTRODUCTION

Raman spectroscopy allows chemical information to be extracted from a sample without labeling.[1,2] When we obtain the vibrational spectrum from a pure sample like toluene, ethanol, or silicon, we can readily identify the chemical contents. In real life, samples are unlikely to include only one pure chemical component.[3–5] Especially in clinical or biological applications, samples include many different types of molecules indicative of their function or cellular origin.[3,6–10] Thus, we obtain very complex Raman spectra, and analyzing the spectral data requires an extended effort.[4,5,11–13]

To analyze these deeply convoluted data, principal component analysis (PCA) has commonly been employed. PCA is mostly used to reduce the dimension of the data and to make a prediction model. PCA calculates principal components of the data and projects given data onto a newly generated coordinate system.[14,15] PCA shows optimal performance if the spectral data are linearly correlated to their chemical content. Spontaneous Raman data are linear in first approximation, but practical Raman spectra are unlikely to be only linear because they contain background and other features that disturb the scaling. Therefore, signal processing is generally a prerequisite, and this can bias the result of PCA.[12,13]

The main sources of background signal in a biological sample are (a) autofluorescence[4,5,12,13] and (b) suspension solutions and the sample container, such as phosphate-buffered saline or cell culture medium. These solutions can contribute peaks or bands to the Raman spectrum.[12] Background signals strongly affect the result of the analysis, so we need to perform rigorous background removal to avoid biasing the result. However, removing background and noise, partially associated with the background, is challenging, because the background subtraction itself can induce peak shifting, leaning, rejection of small peaks, and distortion. It is difficult to distinguish background noise from the useful Raman spectrum in a complex sample. Because background removal is both essential and a source of errors, a type of data analysis that can automatically handle raw data directly is useful. We can use computing power where it is most useful, and we can eliminate a human source of artifacts.

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.

In our previous research, we classified Raman spectra obtained from extracellular vesicles (EVs) to detect prostate cancer without labeling.[16] EVs are small spherical particles (about 30 nm to 1 μm in diameter) secreted by mammalian cells.[17,18] Cells in general contain different molecules depending on their function or disease state. Thus, EVs originating from diseased cells are likely to have a different cellular content compared with EVs derived from healthy cells.[17,19–21] Finding the cellular origin of EVs therefore gives us an insight and a potential route to disease diagnosis.

We obtained Raman spectra from four types of EVs; two out of four are blood product-derived EVs, and the other two are prostate cancer cell line-derived EVs (also known as tumor-derived EVs [tdEVs]). In this research, we aim to discriminate the Raman spectra of tdEVs from those of blood-derived EVs, with tdEVs serving as a disease biomarker. Healthy people should have no tdEVs, or only a very low presence of tdEVs, in their body fluids.[22] Hematopoietic cell-derived EVs are always present in both healthy people and patients. Hence, we aim to distinguish between hematopoietic cell-derived EVs and tdEVs using an algorithmic analysis technique. We do not aim to distinguish between healthy prostate-derived EVs and prostate tumor-derived EVs. Although revealing those spectral changes is interesting in itself, it is not the clinically relevant distinction for diagnosis.

In the prior study, we demonstrated that PCA (after preprocessing) can classify EVs depending on their cellular origin. With PCA, we were able to classify the Raman spectra with 95% accuracy using the spectral fingerprint region (400–1,800 cm−1).[16] Although the method showed good results, it classified data that were preprocessed, and that model is not suitable for raw data. Here, we propose a prediction model based on a machine learning (ML) algorithm. ML is widely used in computer vision,[23,24] voice recognition, and voice synthesis,[25,26] and there have been some attempts to use ML in the field of spectroscopy.[27–31] We demonstrate an ML-based prediction model to classify the Raman spectra of EVs without data preprocessing. Specifically, we use a convolutional neural network (CNN) to build our prediction model.

The CNN[32–34] is inspired by the mammalian brain. Layers in the brain extract features from the input before this information enters the deeper areas of the brain for further processing. This was revealed by Hubel and Wiesel in the 1950s, who showed that feature extraction is used in pattern recognition tasks.[35] In 1998, LeCun et al. applied a feature extractor in their pioneering convolutional networks.[36] This neural network algorithm has a feature-extracting layer, known as the convolution layer, prior to the feed-forward neural network. In contrast to a plain artificial neural network, the convolution layer in a CNN allows the model to extract small details and to be trained on the extracted details of the input data, which improves its prediction accuracy.[36,37] Since then, CNNs have been widely used for image recognition and image classification, and there have been several attempts to use artificial intelligence algorithms to study Raman spectral data[27–31,38] as well as different types of spectroscopic data.[39–42] In this study, we suggest a platform for Raman signature classification of EVs based on a CNN. The classification performed in this article is aimed at finding the spectral differences between prostate cancer-derived EVs and blood cell-derived EVs, because the latter are the clinically relevant background of the measurement. The platform approach will provide an automated and robust classification tool for potential prostate cancer biomarker detection.

2 | EXPERIMENTAL

2.1 | Sample preparation

We prepared four different EV subtypes for this study: two from blood products (red blood cells and platelets) and two from prostate cancer cell lines (prostate cancer cell line [PC3] and lymph node carcinoma of the prostate [LNCaP]). The red blood cell concentrate and the platelet concentrate were obtained from the blood bank, Sanquin (Amsterdam, the Netherlands). The blood products were diluted 1:1 with filtered phosphate-buffered saline, followed by three rounds of centrifugation. The supernatant was pooled to collect the separated EVs. We used the same protocol to harvest EVs from the PC3 and LNCaP cell lines. The EVs from PC3 and LNCaP were used as a model system for prostate cancer-derived EVs. Cancer cell lines were cultured at 37°C and 5% CO2 for 48 hr. After 48 hr of cell culture, the culture medium was collected and centrifuged at 1,000 × g for 30 min to remove undesired particles, for instance apoptotic cells and bigger EV populations. After the centrifugation, the supernatant was pooled to obtain the EVs.

Transmission electron microscopy images were taken to verify the collection of the EVs. The size distribution and concentration of the EVs were measured by nano-tracking analysis (NS500; Nanosight, Amesbury, UK), see Figure 2. The nano-tracking analysis showed a mean size for red blood cell-derived EVs of 148 ± 3.7 nm at a concentration of 0.85 × 10^8 ± 0.03 × 10^8 particles per milliliter. For platelet-derived EVs, we found 89 ± 4.6 nm and 0.42 × 10^8 ± 0.02 × 10^8 particles per milliliter; for PC3-derived EVs, 172 ± 3.7 nm and 1.00 × 10^8 ± 0.03 × 10^8 particles per milliliter; and for LNCaP-derived EVs, 167 ± 4.4 nm and 1.06 × 10^8 ± 0.05 × 10^8 particles per milliliter. More details are available in our previous work.[16]

2.2 | Raman spectral signature collection

We obtained the Raman signal of EVs using a home-built confocal Raman microscope,[16,43] which provides both Raman measurements and optical trapping. The Raman microscope uses a Kr+ laser at a wavelength of 647 nm as the excitation source. The laser is focused onto the sample through a 40× objective. The same objective is also used to collect the back-scattered photons. The signal is dispersed in a home-built spectrograph. For the collection of the EVs' spectral fingerprint, we used glass slides with a small cavity. The cavity was filled with 50 μl of sample and covered by a cover glass. The excitation beam was focused in the middle of the cavity. A trapping event can be readily noticed by monitoring the Rayleigh scattering. Once the intensity of the Rayleigh scattering increased, we recorded 16 spectra with an exposure time of 10 s per spectrum (160 s in total). Since we measure the sample at a fixed position for a sufficiently long time, we strongly believe that we measured multiple EVs instead of a single EV. In this way, we obtained 300 spectra from the four EV subtypes (75 spectra per subtype). Figure 3 shows the averaged Raman spectra of each subtype. In the figure, the blue line shows the Raman spectrum after background removal, and the red curve represents the averaged raw data. The raw data are shifted for clarity. Data collection is described in detail in our previous work.[16]

2.3 | CNN architecture for training on Raman spectral data

The CNN architecture used in this study is illustrated in Figure 1. The network has three convolution layers, each followed by a max pooling layer, for feature extraction. The feature extractor is followed by a fully connected network for learning on the extracted features. The output from the fully connected layers is normalized by softmax into a probability distribution, that is, the set of probabilities of K possible outcomes. Thus, each normalized output lies in the interval (0, 1). The networks were realized in Python (Python Software Foundation. Python Language Reference, Version 3.6.6. Available at http://www.python.org) using TensorFlow (TensorFlow: large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org) (see Supporting Information).

FIGURE 1 Schematic diagram of the 1-D convolutional neural network used in this paper. The model has three convolution-max pooling layers and a fully connected network with four hidden layers. The convolution-max pooling layers extract features from the input spectral data, and the fully connected layers are trained on the extracted features. LNCaP, lymph node carcinoma of the prostate; PC3, prostate cancer cell line; RBC, red blood cell

Figure 4 shows a diagram of a convolution layer for 1-D input data. In our CNN architecture, the input spectral data in the moving window are convolved with n × 1 filter(s), and the filter size determines the size of the moving window. Next, the convolved input is activated by a leaky rectified linear unit (Leaky ReLU), given by

$$ f(x) = \begin{cases} x, & \text{if } x \ge 0 \\ ax, & \text{otherwise,} \end{cases} \tag{1} $$

where x is the input to the neuron and the parameter a is normally smaller than 1 (a = 0 gives the standard ReLU). After convolution, the convolved data are down-sampled by an operation known as max pooling. Max pooling reduces the spatial dimension of the convolved feature by selecting the maximum value in the moving window and allows the creation of a translation-invariant feature.
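As an illustration of the convolution, activation, and pooling steps described above, the following is a minimal pure-Python sketch rather than the authors' TensorFlow implementation. The stride of 1, the absence of padding, the non-overlapping pooling window, and the toy spectrum and filter values are all assumptions, since the text does not specify them. As is common in deep learning, the "convolution" here does not flip the filter.

```python
def leaky_relu(x, a=0.01):
    """Equation (1): pass x through if x >= 0, otherwise scale it by a."""
    return x if x >= 0 else a * x


def conv1d(signal, kernel):
    """Slide the filter over the signal (stride 1, no padding, both assumed)
    and take the scalar product of filter and input in each window."""
    n = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(n))
        for i in range(len(signal) - n + 1)
    ]


def max_pool1d(values, size=2):
    """Keep the maximum of each non-overlapping window of `size` points.
    A trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def conv_block(signal, kernel, a=0.01, pool=2):
    """One convolution -> Leaky ReLU -> max pooling stage."""
    convolved = conv1d(signal, kernel)
    activated = [leaky_relu(v, a) for v in convolved]
    return max_pool1d(activated, pool)


# Hypothetical toy input: seven intensity values and one 3 x 1 filter.
spectrum = [0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 0.0]
kernel = [1.0, 0.0, -1.0]
features = conv_block(spectrum, kernel, a=0.5, pool=2)
print(features)  # [0.0, 3.0]
```

In this toy case the convolution gives [-3, 0, 3, -1, 0], the Leaky ReLU with a = 0.5 halves the negative entries, and pooling keeps one value per pair, so the output is shorter than the input while the strongest local response survives.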

FIGURE 2 Size distribution and concentration of EV samples measured by nano-tracking analysis. (a), (b), (c), and (d) show nano-tracking analysis results of red blood cell-, platelet-, prostate cancer cell line-, and lymph node carcinoma of the prostate-derived EVs, respectively. The inset shows an image of the EVs taken by transmission electron microscopy. The scale bar in each panel is 500 nm. This figure is reused with permission and modified after its original work.[16] EVs, extracellular vesicles

FIGURE 3 The averaged Raman spectra of (a) red blood cell-, (b) platelet-, (c) PC3-, and (d) LNCaP-derived EVs. In each panel, the blue line represents preprocessed data, and the red line shows raw data. The shaded area shows the standard deviation of the measurement. All the spectra are normalized between 0 and 1

The extracted features enter the fully connected (FC) layers. In this study, the FC network is a feed-forward neural network[44–46] with four hidden layers. In the feed-forward neural network, I inputs are propagated to the adjacent hidden layer. This process is continued in every hidden layer until the end of the FC network. If the network has I inputs connected to a hidden layer with J neurons, the forward propagation can be expressed as

$$ a_j = \tanh\left( \sum_{i=1}^{I} W_{ji} \cdot x_i + b_n \right), \quad j = 1, 2, 3, \ldots, J, \tag{2} $$

where $W_{ji}$ is the weight between the $i$th input and the $j$th neuron, $x_i$ is the $i$th input, $b_n$ is the bias of the $n$th hidden layer, and $a_j$ is the output of the $j$th neuron in the hidden layer. The output of the hidden layer will be the input of the next hidden layer or of the output layer of the FC network with K classes; we have four classes in this study. The output of the FC network is given by

$$ y_k = \sum_{j=1}^{J} W_{kj} \cdot a_j + b_{\text{out}}, \quad k = 1, 2, 3, \ldots, K, \tag{3} $$

where $W_{kj}$ are the weights connected to the output of the FC network, $a_j$ is the output of the previous hidden layer, $b_{\text{out}}$ is the bias of the output layer, and $y_k$ is the $k$th element of the $K$-dimensional nonactivated output of the FC network. The output will be activated by the softmax function,[24,47] Equation (4).

$$ S(y_k) = \frac{e^{y_k}}{\sum_{m=1}^{K} e^{y_m}}, \quad k = 1, 2, 3, \ldots, K \tag{4} $$

The softmax calculates a probability distribution over the K different classes, whose entries sum to one. To train a model on a given input, the model calculates the distance between its prediction and the given label. The distance is called the cost, and the cost is calculated by the cross entropy function,[47,48] written as

$$ D(S, L) = -\sum_{k=1}^{K} L_k \log\left( S(y_k) \right), \tag{5} $$

where S is the probability of each class and L is the given label. Here, we used one-hot encoded labels, meaning that each class label is expressed as a vector; for example, "RBC-EVs" is expressed as [0 0 0 1], "platelet-EVs" as [0 0 1 0], and so on. The Adam optimizer updates the weights based on the cost to minimize the distance between the prediction result and the target.[49] The outcome of the cross entropy function will be closer to zero if the model is trained well. Then, the model propagates new data forward, and the new cost is back propagated iteratively for the training. During the network training, we applied dropout to avoid weight vanishing and overfitting to the training data.[50] The dropout algorithm randomly selects 50% of the neurons in each layer for every iteration.
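The forward pass and cost of Equations (2) to (5) can be traced with a small pure-Python sketch. This is an illustration under stated assumptions, not the authors' TensorFlow code: the layer sizes, weights, and input values are hypothetical, a single scalar bias per layer follows the notation of Equations (2) and (3), and the softmax subtracts the maximum logit for numerical stability, which leaves the result of Equation (4) unchanged.

```python
import math


def dense_tanh(x, W, b):
    """Equation (2): a_j = tanh(sum_i W[j][i] * x[i] + b)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row in W]


def dense_linear(a, W, b_out):
    """Equation (3): y_k = sum_j W[k][j] * a[j] + b_out (no activation)."""
    return [sum(w * aj for w, aj in zip(row, a)) + b_out for row in W]


def softmax(y):
    """Equation (4), computed after shifting by max(y) to avoid overflow."""
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(S, L):
    """Equation (5): D(S, L) = -sum_k L_k * log(S_k).
    Terms with L_k = 0 contribute nothing and are skipped."""
    return -sum(l * math.log(s) for s, l in zip(S, L) if l > 0)


# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 4 classes.
x = [0.5, -1.0, 2.0]
W_hidden = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
W_out = [[0.5, -0.5], [0.2, 0.1], [-0.3, 0.3], [0.0, 0.2]]

a = dense_tanh(x, W_hidden, 0.0)
probs = softmax(dense_linear(a, W_out, 0.0))
label = [0, 0, 0, 1]          # one-hot label for "RBC-EVs", as in the text
cost = cross_entropy(probs, label)
```

With a one-hot label, the cost reduces to the negative logarithm of the probability assigned to the true class, so it approaches zero as that probability approaches one.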

3 | RESULTS AND DISCUSSION

We performed PCA and CNN classification on both baseline-corrected data and raw data. To make a prediction model based on PCA, the EVs' Raman data are divided into two subsets, a training set and a testing set. The PCA training set consists of 240 spectra, and the testing set has 60 spectra. To make the training and testing sets, the spectra are selected evenly and at random from the four EV subtypes; we selected 15 spectra from each EV subtype to make the testing dataset. PCA is done on the training dataset, and we predict the testing set based on the PCA result of the training set. The PCA-based prediction model was realized in MATLAB R2016b (Version 9.1.0, The MathWorks, Natick, MA). The PCA and CNN models were trained on three different spectral regions to find the most relevant spectral area for the classification: 400–3,050 cm−1 (full spectrum), 400–1,800 cm−1 (fingerprint), and 2,700–3,050 cm−1 (high frequency, also known as the C-H stretch region).
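The class-balanced random split used for the PCA model (15 test spectra drawn from each of the four subtypes, leaving 240 for training) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code; the function name `stratified_split`, the fixed seed, and the ordering of the label list are assumptions.

```python
import random


def stratified_split(labels, n_test_per_class, seed=0):
    """Return (train_indices, test_indices) with the same number of
    randomly chosen test samples taken from every class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]   # copy so the input order is untouched
        rng.shuffle(indices)
        test.extend(indices[:n_test_per_class])
        train.extend(indices[n_test_per_class:])
    return train, test


# 75 spectra per subtype, 300 in total, as reported in Section 2.2.
subtypes = ["RBC", "platelet", "PC3", "LNCaP"]
labels = [s for s in subtypes for _ in range(75)]
train_idx, test_idx = stratified_split(labels, n_test_per_class=15)
print(len(train_idx), len(test_idx))  # 240 60
```

Splitting per class rather than over the pooled data keeps each subtype equally represented in the test set, which matters with only 75 spectra per subtype.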

FIGURE 4 A schematic diagram of the convolution and max pooling layer. In the convolution layer, a scalar product is taken of two vectors holding the input values and the filter values. Next, the result of the scalar product is activated by Leaky ReLU and down-sampled to reduce the size of the extracted feature. ReLU, rectified linear unit


The aforementioned 300 Raman spectra of EVs were also used for the CNN-based prediction model. Artificial neural network models usually require a large volume of data to learn small details and to avoid overfitting to the given data. Moreover, the dimension of each Raman spectrum is 1,152 × 1. This means that the dimension of the training data is far bigger than the number of spectra, which can readily cause overfitting. To solve this problem, we conducted data augmentation, a commonly used method to increase the amount of training data. For the data augmentation, we generated white Gaussian noise with signal to noise ratios (SNR) of 15, 25, and 30 and added it to the original signal, using the additive white Gaussian noise function provided by MATLAB. Figure 5 shows an example of the data augmentation done for this research. After the augmentation, we had 1,200 spectra: 300 original spectra and 900 spectra with random noise added. Then, the Raman spectral dataset was divided into three subsets: training, validation, and testing datasets. We randomly selected 90 spectra from each subtype. The testing set was prepared from 50% of these spectra, and the other 50% became the validation set. In the end, we had a training set, testing set, and validation set with 840, 180, and 180 Raman spectra, respectively.

The model for this particular analysis has three sets of a convolution layer followed by a max pooling layer. The feature extractor is followed by four hidden layers that have 1,000, 500, 200, and 200 neurons. The output layer has four neurons and is connected to softmax to convert the output scores to a probability distribution. Prior to the network training, all the weights of the network were initialized on a random basis. We assigned the weights with the Xavier initializer, which draws weights from a Gaussian distribution.[46] This initialization method keeps the variance of the weights the same in each hidden layer. CNN training time was about 10 to 70 min depending on the input dimension, and all the training was done on a graphics processing unit (NVIDIA GTX1080Ti).
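The noise-based augmentation described above was done with MATLAB's additive white Gaussian noise function; a rough Python equivalent is sketched below. The text does not state the SNR units or how signal power was defined, so this sketch assumes decibels and measures the power of each spectrum before scaling the noise. The function names, the seed, and the toy spectra are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise so that the ratio of measured
    signal power to noise power equals snr_db (decibels, assumed)."""
    power = sum(v * v for v in signal) / len(signal)
    noise_power = power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def augment(spectra, snr_levels=(15, 25, 30), seed=0):
    """Return the originals followed by one noisy copy per SNR level."""
    rng = random.Random(seed)
    augmented = [list(s) for s in spectra]
    for snr_db in snr_levels:
        augmented.extend(add_awgn(s, snr_db, rng) for s in spectra)
    return augmented


# Hypothetical stand-in for the 300 measured spectra (64 points each here,
# instead of the 1,152 points of the real data).
toy_spectra = [
    [1.0 + 0.5 * math.sin(0.2 * i + k) for i in range(64)] for k in range(300)
]
dataset = augment(toy_spectra)
print(len(dataset))  # 1200 = 300 originals + 3 x 300 noisy copies
```

The counts match those in the text: 4 × 90 = 360 of the 1,200 spectra are set aside and halved into 180 testing and 180 validation spectra, leaving 840 for training.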

Tables 1 and 2 show the classification results of PCA-linear discriminant analysis (LDA) and PCA-quadratic discriminant analysis (QDA) on preprocessed data (Table 1) and raw data (Table 2). In Table 1, PCA-QDA shows a fairly good classification, especially in the fingerprint region. However, the results show that the PCA-based model achieves a high classification accuracy only under certain conditions; it performed well on the fingerprint region (400–1,800 cm−1) of the background-corrected Raman data. In general, however, the PCA model classified the Raman spectra of the EVs poorly in the full spectral region and in the high frequency region. Table 2 shows the results of PCA-LDA and PCA-QDA trained on untreated data. As can be seen in the table, the classification accuracy of PCA on raw data is very low. The result shows that PCA is not suitable for handling the raw Raman data because PCA requires a decent background/noise removal process, as discussed in Section 1.

FIGURE 5 Example of data augmentation. Randomly generated white Gaussian noise is added to the Raman spectrum of red blood cell-derived EVs. In the figure, the blue curve represents the original Raman signal. The red line shows the noise-added signal, and the noise is plotted separately in yellow. All the spectra are normalized between 0 and 1. SNR, signal to noise ratio

TABLE 1 Prediction accuracy of principal component analysis on preprocessed dataset

Spectral region (cm−1)    LDA       QDA
400–3,050                 0.6500    0.7833
400–1,800                 0.8333    0.9500
2,700–3,050               0.7333    0.8667

Abbreviations: LDA, linear discriminant analysis; QDA, quadratic discriminant analysis.

TABLE 2 Prediction accuracy of principal component analysis on raw data

Spectral region (cm−1)    LDA       QDA
400–3,050                 0.6167    0.6833
400–1,800                 0.6167    0.6167
2,700–3,050               0.5833    0.6000

Abbreviations: LDA, linear discriminant analysis; QDA, quadratic discriminant analysis.

We trained the CNN model on preprocessed and raw Raman data, and Table 3 shows the prediction accuracy on both datasets. In both cases, the prediction accuracy is higher than 90%. Originally, we assumed that a CNN trained on a cleaner signal would show better classification accuracy because, after removing the background contribution, the remaining data should contain a cleaner EV contribution instead of noise and fluorescence. Although the model trained on preprocessed data classified the spectra with accuracies of 90.89% in the full spectral region, 90.22% in the fingerprint region, and 91.22% in the high frequency region, the model trained on raw Raman data shows better prediction accuracies of 95.22% in the full spectral region, 96.56% in the fingerprint region, and 93.11% in the high frequency region. We attribute this to small signals buried in the untreated spectral data, which are not clearly visible because of the low SNR.

The mean size of the EVs used in this study is about 150 nm.[16] A single particle is about 100-fold smaller than the focal volume of the Raman microscope. Thus, the solution in which the EVs are suspended contributes more to the Raman signal than the trapped particles do, which leads to a poor SNR of the Raman spectra of EVs, about 7 dB. At such an SNR, small spectral features are concealed by the background contribution, and small peaks might be eliminated by background correction. In other words, raw data retain small spectral features that are not clearly visible in the raw spectrum because of the poor SNR. This subtle information allows the CNN model to learn more details of the input signal.

We have tried to identify the spectral regions that contain most of the meaningful information for the classification. The results in Table 3 show that every spectral segment used in this study yields a high accuracy on raw data: 95.22%, 96.56%, and 93.11% in the full spectrum (400–3,050 cm−1), fingerprint (400–1,800 cm−1), and high frequency (2,700–3,050 cm−1) regions, respectively. The fact that the model trained on the fingerprint region performed better suggests that the spectral fingerprint region (400–1,800 cm−1) has more relevant information for the classification than the high frequency region (2,700–3,050 cm−1). However, it does not imply that the spectral information in the high frequency region is less important than the information in the spectral fingerprint region. Whereas protein and lipid contributions are more prominent in the high frequency region, many other biomolecules contribute to the fingerprint region. The PCA model shows a similar result, namely a classification accuracy on the fingerprint and high frequency regions of 95.00% and 86.67%, respectively.

4 | CONCLUSION

In this study, we have demonstrated that a CNN-based prediction model can be used as a classifier of Raman spectra of EVs and that the model is suitable for handling raw data. The study shows that a PCA-based prediction model can classify the spectral data by the EVs' cellular origin, but its classification ability is limited by background noise and the spectral range of the input signal. On the other hand, the CNN model suggested in this paper shows a better classification accuracy (>90%) on both preprocessed data and raw data. Interestingly, the model trained on raw data classifies the Raman spectra of EVs better than the model trained on preprocessed data. This suggests that using raw data is beneficial for the classification because the raw data keep more features to learn from and computing power can be saved.

ACKNOWLEDGEMENTS

This work is part of the research program (Cancer-ID) with project number (14197), which is financed by the Netherlands Organization for Scientific Research (NWO).

CONFLICT OF INTEREST

The authors declare no competing financial interest.

ORCID

Wooje Lee https://orcid.org/0000-0002-6238-2898

Cees Otto https://orcid.org/0000-0001-6955-4843

TABLE 3 Classification accuracy of the convolutional neural network-based prediction model

Spectral region (cm−1)    Preprocessed data    Raw data
400–3,050                 0.9089 ± 0.0101      0.9522 ± 0.0101
400–1,800                 0.9022 ± 0.0050      0.9656 ± 0.0091

REFERENCES

[1] C. V. Raman, Ind. J. Phys. 1928, 2, 387.

[2] K. Kneipp, H. Kneipp, I. Itzkan, R. R. Dasari, M. S. Feld, Chem. Rev. 1999, 99, 2957.

[3] Y. H. Ong, M. Lim, Q. Liu, Opt. Express 2012, 20, 22158.

[4] R. Petry, M. Schmitt, J. Popp, ChemPhysChem 2003, 4, 14.

[5] S. Nie, S. R. Emory, Science 1997, 275, 1102.

[6] X. Song, R. D. Airan, D. R. Arifin, A. Bar-Shir, D. K. Kadayakkara, G. Liu, A. A. Gilad, P. C. van Zijl, M. T. McMahon, J. W. Bulte, Nat. Commun. 2015, 6, 6719.

[7] L. A. Austin, S. Osseiran, C. L. Evans, Analyst 2016, 141, 476.

[8] R. E. Kast, S. C. Tucker, K. Killian, M. Trexler, K. V. Honn, G. W. Auner, Cancer Metastasis Rev. 2014, 33, 673.

[9] A. S. Haka, K. E. Shafer-Peltier, M. Fitzmaurice, J. Crowe, R. R. Dasari, M. S. Feld, Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 12371.

[10] Z. Huang, A. McWilliams, H. Lui, D. I. McLean, S. Lam, H. Zeng, Int. J. Cancer 2003, 107, 1047.

[11] R. Gautam, S. Vanga, F. Ariese, S. Umapathy, EPJ Tech. Instrum. 2015, 2, 8.

[12] F. Bonnier, S. M. Ali, P. Knief, H. Lambkin, K. Flynn, V. McDonagh, C. Healy, T. Lee, F. M. Lyng, H. J. Byrne, Vib. Spectrosc. 2012, 61, 124.

[13] J. Zhao, H. Lui, D. I. McLean, H. Zeng, Appl. Spectrosc. 2007, 61, 1225.

[14] H. Abdi, L. J. Williams, Wiley Interdiscip. Rev.: Comput. Stat. 2010, 2, 433.

[15] S. Wold, K. Esbensen, P. Geladi, Chemom. Intel. Lab. Syst. 1987, 2, 37.

[16] W. Lee, A. Nanou, L. Rikkert, F. A. W. Coumans, C. Otto, L. W. M. M. Terstappen, H. L. Offerhaus, Anal. Chem. 2018, 90, 11290.

[17] E. van der Pol, A. N. Böing, P. Harrison, A. Sturk, R. Nieuwland, Pharmacol. Rev. 2012, 64, 676.

[18] J. Park, M. Hwang, B. Choi, H. Jeong, J.-h. Jung, H. K. Kim, S. Hong, J.-h. Park, Y. Choi, Anal. Chem. 2017, 89, 6695.

[19] M. Verma, T. K. Lam, E. Hebert, R. L. Divi, BMC Clin. Pathol. 2015, 15, 6.

[20] F. Andre, N. E. Schartz, M. Movassagh, C. Flament, P. Pautier, P. Morice, C. Pomel, C. Lhomme, B. Escudier, T. Le Chevalier, The Lancet 2002, 360, 295.

[21] G. Rabinowits, C. Gerçel-Taylor, J. M. Day, D. D. Taylor, G. H. Kloecker, Clin. Lung Cancer 2009, 10, 42.

[22] A. Nanou, F. A. Coumans, G. van Dalum, L. L. Zeune, D. Dolling, W. Onstenk, M. Crespo, M. S. Fontes, P. Rescigno, G. Fowler, Oncotarget 2018, 9, 19283.

[23] E. Rosten, T. Drummond, in European Conference on Computer Vision, Springer, 2006, p. 430.

[24] C. M. Bishop, Pattern recognition and machine learning, Springer, New York, NY 2006.

[25] S. Kang, X. Qian, H. Meng, in IEEE Int. Conf. Acoust. Speech Signal Proces., IEEE, 2013, p. 8012.

[26] L. Muda, M. Begam, I. Elamvazuthi, arXiv preprint arXiv:1003.4083 2010.

[27] S. Sigurdsson, P. A. Philipsen, L. K. Hansen, J. Larsen, M. Gniadecka, H.-C. Wulf, IEEE Trans. Biomed. Eng. 2004, 51, 1784.

[28] M. G. Madden, A. G. Ryder, Opto-Ireland 2002: Optics and photonics technologies and applications, Vol. 4876, International Society for Optics and Photonics, Galway, Ireland 2003, p. 1130.

[29] T. A. Dolenko, S. A. Burikov, A. V. Sugonjaev, Opto-Ireland 2005: Optical sensing and spectroscopy, Vol. 5826, International Society for Optics and Photonics, Dublin, Ireland 2005, p. 298.

[30] J. Liu, M. Osadchy, L. Ashton, M. Foster, C. J. Solomon, S. J. Gibson, Analyst 2017, 142, 4067.

[31] M. Jermyn, J. Desroches, J. Mercier, M.-A. Tremblay, K. St-Arnaud, M.-C. Guiot, K. Petrecca, F. Leblond, J. Biomed. Opt. 2016, 21, 094002.

[32] N. Aloysius, M. Geetha, in Int. Conf. Commun. Sig. Process. (ICCSP), IEEE, 2017, p. 0588.

[33] R. Yamashita, M. Nishio, R. K. G. Do, K. Togashi, Insights Imaging 2018, 9, 611.

[34] W. Rawat, Z. Wang, Neural Comput. 2017, 29, 2352.

[35] D. H. Hubel, T. N. Wiesel, J. Physiol. 1962, 160, 106.

[36] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Proc. IEEE 1998, 86, 2278.

[37] A. Krizhevsky, I. Sutskever, G. E. Hinton, Adv. Neural. Inf. Process. Syst. 2012, 1097.

[38] X. Fan, W. Ming, H. Zeng, Z. Zhang, H. Lu, Analyst 2019, 144, 1789.

[39] J. Acquarelli, T. van Laarhoven, J. Gerretzen, T. N. Tran, L. M. Buydens, E. Marchiori, Anal. Chim. Acta 2017, 954, 22.

[40] S. Malek, F. Melgani, Y. Bazi, J. Chemometr. 2018, 32, e2977.

[41] C. Yuanyuan, W. Zhibin, Chemom. Intel. Lab. Syst. 2018, 181, 1.

[42] K. Ghosh, A. Stuke, M. Todorović, P. B. Jørgensen, M. N. Schmidt, A. Vehtari, P. Rinke, Adv. Sci. 2019, 6, 1801367.

[43] L. Hartsuiker, W. Petersen, R. G. Rayavarapu, A. Lenferink, A. A. Poot, L. W. Terstappen, T. G. Van Leeuwen, S. Manohar, C. Otto, Appl. Spectrosc. 2012, 66, 66.

[44] M. Minsky, S. A. Papert, Perceptrons: An introduction to computational geometry, MIT Press, Cambridge, MA 2017.

[45] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Neural Netw. 2004, 2, 985.

[46] X. Glorot, Y. Bengio, in Proceedings of the thirteenth international conference on artificial intelligence and statistics, 2010, p. 249.

[47] I. Goodfellow, Y. Bengio, A. Courville, Deep learning, MIT Press, Cambridge, MA 2016.

[48] R. Y. Rubinstein, D. P. Kroese, The cross-entropy method: A unified approach to combinatorial optimization, Monte-Carlo simulation and machine learning, Springer Science & Business Media 2013.

[49] D. P. Kingma, J. Ba, arXiv preprint arXiv:1412.6980 2014.

[50] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, J. Mach. Lear. Res. 2014, 15, 1929.

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of the article.

How to cite this article: Lee W, Lenferink ATM, Otto C, Offerhaus HL. Classifying Raman spectra of extracellular vesicles based on convolutional neural networks for prostate cancer detection. J Raman