Blackbox classifiers for preoperative discrimination between malignant
and benign ovarian tumors
C. Lu (1), T. Van Gestel (1), J. A. K. Suykens (1), S. Van Huffel (1), I. Vergote (2), D. Timmerman (2)
(1) Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium
(2) Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
Email address: chuan.lu@esat.kuleuven.ac.be
Table: Demographic, serum marker, color Doppler imaging (CDI) and morphologic variables
(continuous variables as mean ± SD; binary variables as percentage of cases)

Variable (symbol)                  Benign         Malignant
Demographic
  Age (age)                        45.6 ± 15.2    56.9 ± 14.6
  Postmenopausal (meno)            31.0 %         66.0 %
Serum marker
  CA 125 (log) (l_ca125)           3.0 ± 1.2      5.2 ± 1.5
CDI
  High color score (colsc3,4)      19.0 %         77.3 %
Morphologic
  Abdominal fluid (asc)            32.7 %         67.3 %
  Bilateral mass (bilat)           13.3 %         39.0 %
  Unilocular cyst (un)             45.8 %         5.0 %
  Multiloc/solid cyst (mulsol)     10.7 %         36.2 %
  Solid (sol)                      8.3 %          37.6 %
  Smooth wall (smooth)             56.8 %         5.7 %
  Irregular wall (irreg)           33.8 %         73.2 %
  Papillations (pap)               12.5 %         53.2 %
Figure: Biplot of Ovarian Tumor Data, visualizing the correlation between the variables and the relations between the variables and clusters.
1. Introduction
Ovarian masses are a common problem in gynecology. A reliable test for preoperative discrimination between benign and malignant ovarian tumors would considerably help clinicians choose the appropriate treatment for each patient.
In this study, we develop and evaluate several blackbox models, particularly multi-layer perceptrons (MLPs) and least squares support vector machines (LS-SVMs), both within the Bayesian evidence framework, to preoperatively predict malignancy of ovarian tumors. Model performance is assessed via Receiver Operating Characteristic (ROC) curve analysis.
2. Data
Biplot legend: o = benign case; x = malignant case
ROC curves
are constructed by plotting the sensitivity (true positive rate) versus 1 - specificity (false positive rate) for varying probability cutoff levels. They visualize the trade-off between the sensitivity and the specificity of a test.
Area under the ROC curve (AUC)
measures how well the classifier separates events from nonevents: it equals the probability that a randomly chosen event (malignant case) receives a higher predicted probability than a randomly chosen nonevent (benign case).
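The ROC construction and the AUC described above can be sketched in a few lines. This is a minimal illustration on made-up labels and scores (`y_toy` and `p_toy` are hypothetical, not study data); it computes the AUC twice, once as the trapezoidal area under the curve and once as the pairwise ranking probability, to show that the two readings agree.

```python
import numpy as np

def roc_points(y_true, scores):
    """Sweep the probability cutoff and return (fpr, tpr) arrays."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    # One cutoff above every score, then each distinct score from high to low.
    cutoffs = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    # A case is called positive when its score reaches the cutoff.
    tpr = np.array([np.sum((scores >= c) & (y_true == 1)) / n_pos for c in cutoffs])
    fpr = np.array([np.sum((scores >= c) & (y_true == 0)) / n_neg for c in cutoffs])
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def auc_rank(y_true, scores):
    """P(score of a random positive > score of a random negative); ties count 1/2."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Hypothetical toy data: 1 = malignant, 0 = benign.
y_toy = [0, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
fpr, tpr = roc_points(y_toy, p_toy)
auc_a = auc_trapezoid(fpr, tpr)
auc_b = auc_rank(y_toy, p_toy)
```

In the toy data, 8 of the 9 malignant/benign pairs are ranked correctly, so both functions give 8/9.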
Patient data
University Hospitals Leuven, 1994-1999
425 records, 25 features, 32% malignant

Procedure of developing models to predict the malignancy of ovarian tumors
- Data exploration
  - Univariate analysis: preprocessing, descriptive statistics
  - Multivariate analysis: PCA, factor analysis
- Input variable selection
  - Stepwise logistic regression
  - Bayesian LS-SVM (RBF, linear): forward selection (maximum evidence)
- Model development
  - Model building: Bayesian LS-SVM + sparse approximation, Bayesian MLP
  - Model evaluation: ROC analysis (AUC), cross-validation (temporal, random)

Goal: find a model
- with high sensitivity for malignancy and a low false positive rate;
- providing a probability of malignancy for each individual patient.
3. Methods
4. Bayesian MLPs and Bayesian LS-SVMs for classification
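LS-SVM Classifier (Van Gestel, Suykens 2002)

The following model is taken, with targets $y_i \in \{-1, +1\}$:

$$\min_{w,b} J(w,b) = \frac{\mu}{2} w^T w + \frac{\zeta}{2} \sum_{i=1}^{N} e_i^2$$

$$\text{s.t.} \quad y_i \left[ w^T \varphi(x_i) + b \right] = 1 - e_i, \quad i = 1, \ldots, N,$$

with regularizer $\gamma = \zeta / \mu$. The latent output is $f(x) = w^T \varphi(x) + b$.

The problem is solved in dual space:

$$\begin{bmatrix} 0 & Y^T \\ Y & \Omega + \gamma^{-1} I_N \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 1_v \end{bmatrix}$$

with $Y = [y_1, \ldots, y_N]^T$, $1_v = [1, \ldots, 1]^T$, $e = [e_1, \ldots, e_N]^T$, $\alpha = [\alpha_1, \ldots, \alpha_N]^T$ (support values), and $\Omega_{ij} = y_i y_j \varphi(x_i)^T \varphi(x_j) = y_i y_j K(x_i, x_j)$,
e.g. RBF kernel: $K(x, z) = \exp\{-\|x - z\|^2 / \sigma^2\}$; linear kernel: $K(x, z) = z^T x$.

Resulting classifier:

$$y(x) = \mathrm{sign}\left[ \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b \right]$$

Computing posterior class probabilities

Introduce new error variables $e_\pm = w^T (\varphi(x) - \hat{m}_\pm)$, with $\hat{m}_\pm$ the center of class $\pm$ in feature space. Then

$$p(x \mid y = \pm 1, D, H) = \left( 2\pi (\zeta_\pm^{-1} + \sigma_{e_\pm}^2) \right)^{-1/2} \exp\left( -\frac{m_{e_\pm}^2}{2 (\zeta_\pm^{-1} + \sigma_{e_\pm}^2)} \right),$$

where $m_{e_\pm} = w_{MP}^T (\varphi(x) - \hat{m}_\pm)$, and $\zeta_\pm^{-1}$, $\sigma_{e_\pm}^2$ are the variances of $e_\pm$ due to target noise and to uncertainty in $w$, respectively.

$$P(y \mid x, D, H) = \frac{P(y)\, p(x \mid y, D, H)}{P(y = +1)\, p(x \mid y = +1, D, H) + P(y = -1)\, p(x \mid y = -1, D, H)},$$

with $P(y)$ the prior class probability.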
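The dual-space solution of the LS-SVM classifier comes down to one linear system, which the following NumPy sketch makes concrete. It uses the standard LS-SVM dual system with Ω_ij = y_i y_j K(x_i, x_j) and an RBF kernel. The function names, the toy data and the values `gamma=10.0` and `sigma2=4.0` are illustrative assumptions, not the settings used in the study, and the Bayesian inference of the hyperparameters is not included.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """K(x, z) = exp(-||x - z||^2 / sigma2) for all rows of A against all rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / sigma2)

def lssvm_train(X, y, gamma, sigma2):
    """Solve [[0, Y^T], [Y, Omega + I/gamma]] [b; alpha] = [0; 1_v]."""
    N = X.shape[0]
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma2)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(N) / gamma
    rhs = np.concatenate(([0.0], np.ones(N)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma2):
    """y(x) = sign[sum_i alpha_i y_i K(x, x_i) + b]."""
    K = rbf_kernel(X_new, X_train, sigma2)
    return np.sign(K @ (alpha * y_train) + b)

# Hypothetical toy data: two well-separated Gaussian blobs with labels -1 / +1.
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
                   rng.normal(2.0, 0.5, size=(20, 2))])
y_toy = np.concatenate((-np.ones(20), np.ones(20)))

b, alpha = lssvm_train(X_toy, y_toy, gamma=10.0, sigma2=4.0)
pred = lssvm_predict(X_toy, y_toy, b, alpha, X_toy, sigma2=4.0)
train_acc = float(np.mean(pred == y_toy))
```

Unlike a standard SVM, every training point receives a nonzero support value here, which is why the flow chart pairs the Bayesian LS-SVM with a sparse approximation.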
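Bayesian Evidence Framework
Inferences are divided into distinct levels. Below, $\theta$ denotes the regularization hyperparameters and $H$ the model.

Level 1: infer $w$, $b$ for given $\theta$, $H$

$$p(w, b \mid D, \theta, H) = \frac{p(D \mid w, b, \theta, H)\, p(w, b \mid \theta, H)}{p(D \mid \theta, H)} \propto \exp(-J(w, b))$$

=> the Maximum A Posteriori estimates of $w$ and $b$ are the solution of the basic MLP/LS-SVM classifier.

Level 2: infer the hyperparameters $\theta$

$$p(\theta \mid D, H) = \frac{p(D \mid \theta, H)\, p(\theta \mid H)}{p(D \mid H)}$$

Level 3: compare models

$$p(H_j \mid D) = \frac{p(D \mid H_j)\, p(H_j)}{p(D)} \propto p(D \mid H_j)\, p(H_j)$$

Choose the $H_j$ which maximizes the posterior $p(H_j \mid D)$; with equal priors $p(H_j)$ this amounts to maximizing the model evidence $p(D \mid H_j)$.

The model $H$ stands for
MLP: network structure, e.g. #hidden neurons
LS-SVM: kernel parameter, e.g. $\sigma$ for the RBF kernel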
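As a worked illustration of why the Level 1 MAP estimate coincides with the basic classifier, consider the LS-SVM cost $J(w,b) = \frac{\mu}{2} w^T w + \frac{\zeta}{2} \sum_i e_i^2$ with $e_i = 1 - y_i [w^T \varphi(x_i) + b]$. The sketch below assumes, as is usual in this framework, a Gaussian prior on $w$, a flat prior on $b$, and independent Gaussian noise on the errors $e_i$; normalization constants are omitted.

```latex
p(w, b \mid \mu, H) \propto \exp\left( -\frac{\mu}{2} w^T w \right),
\qquad
p(D \mid w, b, \zeta, H) \propto \exp\left( -\frac{\zeta}{2} \sum_{i=1}^{N} e_i^2 \right)

-\log p(w, b \mid D, \mu, \zeta, H)
  = \frac{\mu}{2} w^T w + \frac{\zeta}{2} \sum_{i=1}^{N} e_i^2 + \text{const}
  = J(w, b) + \text{const}
```

Under these assumptions, maximizing the Level 1 posterior is the same as minimizing $J(w, b)$, so no separate optimization is needed to obtain the MAP estimate.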
MLP Classifiers (MacKay 1992)

Consider a binary classification problem, given $D = \{(x_i, y_i)\}_{i=1,\ldots,N}$, where $x_i \in R^p$, $y_i \in \{0, 1\}$ in case of MLP, $y_i \in \{-1, +1\}$ in case of LS-SVM.

Consider the one hidden layer MLP:

$$f(x) = g(a(x, w)), \quad \text{where} \quad a(x, w) = w^{(2)} g'(w^{(1)} x + b^{(1)}) + b^{(2)},$$

with activation function of
hidden layer: $g'(a) = \tanh(a) = \dfrac{\exp(a) - \exp(-a)}{\exp(a) + \exp(-a)}$ (the prime marks a second function, not a derivative),
output layer: logistic function $g(a) = \dfrac{1}{1 + \exp(-a)}$.

$$\min_{w,b} J(w, b) = \frac{\mu}{2} w^T w + G, \quad \text{with regularizer } \mu,$$

where the cross entropy error function is

$$G = -\sum_{i=1}^{N} \left\{ y_i \log f(x_i) + (1 - y_i) \log(1 - f(x_i)) \right\}.$$

The posterior class probability can be approximated:

$$P(y = 1 \mid x, D, H) \approx g\left( \kappa(s)\, a(x, w_{MP}) + \log \frac{N_0}{N_1} + \log \frac{P(y = 1)}{1 - P(y = 1)} \right),$$

where $\kappa(s) = 1 / \sqrt{1 + \pi s^2 / 8}$, $s^2$ is $\mathrm{var}(a \mid x)$, $N_1$ and $N_0$ are the numbers of training cases with $y = 1$ and $y = 0$, and $P(y = 1)$ is the prior class probability.

Computing posterior class probabilities for minimum risk decision making
Incorporate the different misclassification costs into the class priors, e.g. set the adjusted prior probabilities for the malignant and benign class to 2/3 and 1/3, respectively.
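The effect of MacKay's moderation factor κ(s) = 1/sqrt(1 + π s²/8) on the MLP output can be illustrated numerically. In this sketch the activation `a_mp` and the variance `s2` are hypothetical inputs; in practice the variance comes from the Bayesian treatment of the network weights, which is not reproduced here, and any prior correction is left out.

```python
import math

def logistic(a):
    """Output-layer activation g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def kappa(s2):
    """Moderation factor for an activation with posterior variance s2."""
    return 1.0 / math.sqrt(1.0 + math.pi * s2 / 8.0)

def moderated_probability(a_mp, s2):
    """Approximate P(y=1 | x, D) as g(kappa(s) * a_MP)."""
    return logistic(kappa(s2) * a_mp)

# Hypothetical values: the same most-probable activation, certain vs. uncertain.
p_certain = moderated_probability(2.0, 0.0)    # no uncertainty: plain logistic output
p_uncertain = moderated_probability(2.0, 4.0)  # large variance: pulled toward 0.5
```

The larger the variance of the activation, the closer the predicted probability moves to 0.5, so predictions for patients unlike the training cases become less extreme.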
5. Experimental results
RMI: risk of malignancy index = score_morph × score_meno × CA125

Performance from temporal validation
Training set: data from the first treated 265 patients
Test set: data from the latest treated 160 patients

Figure: ROC curves on test set

Performance on test set (accuracy, sensitivity and specificity in %)

Model type        AUC       Cutoff   Accuracy   Sensitivity   Specificity
RMI               0.8733    0.4      78.13      74.07         80.19
                            0.3      76.88      81.48         74.53
MLP (10-2-1)      0.9174    0.4      83.13      81.48         83.96
                            0.3      81.87      83.33         81.13
LS-SVM (LIN)      0.9141    0.4      81.25      77.78         83.02
                            0.3      81.88      83.33         81.13
LS-SVM (RBF)      0.9184    0.4      83.13      81.48         83.96
                            0.3      84.38      85.19         83.96
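Accuracy, sensitivity and specificity at a fixed cutoff all follow from the four cells of a confusion matrix. The sketch below checks the RMI row at cutoff 0.4. The counts are an inference, not stated on the poster: a test set of 54 malignant and 106 benign cases (160 in total) with 40 true positives and 85 true negatives is consistent with the reported percentages.

```python
def rates(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity in percent."""
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Inferred (hypothetical) counts for RMI at cutoff 0.4 on the temporal test set.
acc, sens, spec = rates(tp=40, fn=14, tn=85, fp=21)
```

With these counts the three rates agree with the tabulated 78.13, 74.07 and 80.19 to within rounding.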
Input variable selection
The forward selection procedure tries to maximize the model evidence of the LS-SVM for a given type of kernel.
Ten variables were selected using RBF kernels:
l_ca125, pap, sol, colsc3, bilat, meno, asc, shadows, colsc4, irreg
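The greedy structure of forward selection can be sketched independently of the criterion being maximized. In the sketch below, `score` is a placeholder callable: the study maximizes the Bayesian LS-SVM model evidence, whereas the toy example substitutes a BIC-like score of a linear least-squares fit purely for illustration. The data, names and stopping rule are assumptions.

```python
import numpy as np

def forward_selection(n_features, score, max_vars=None):
    """Greedily add the variable that most increases score(list_of_columns)."""
    selected, best = [], -np.inf
    remaining = list(range(n_features))
    while remaining and (max_vars is None or len(selected) < max_vars):
        # Evaluate every candidate extension of the current subset.
        s, j = max((score(selected + [k]), k) for k in remaining)
        if s <= best:  # stop when no candidate improves the criterion
            break
        selected.append(j)
        remaining.remove(j)
        best = s
    return selected, best

# Hypothetical toy problem: only columns 0 and 3 carry signal.
rng = np.random.default_rng(1)
N = 200
X = rng.normal(size=(N, 5))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.1 * rng.normal(size=N)

def bic_like_score(cols):
    """Stand-in criterion (NOT the LS-SVM evidence): fit quality minus a size penalty."""
    A = np.column_stack([np.ones(N), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    return -0.5 * N * np.log(rss / N) - 0.5 * len(cols) * np.log(N)

selected, best_score = forward_selection(5, bic_like_score)
```

Like the evidence, the stand-in criterion penalizes model size, so the search stops by itself once extra variables no longer pay for their added complexity.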
Adjusted class prior for minimum risk decision making:

$$P'(y = +1) = \frac{P(y = +1)\, c_+}{P(y = +1)\, c_+ + P(y = -1)\, c_-},$$

where $c_+$, $c_-$ denote the cost of misclassifying a case from class '+' and '-', respectively.
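A small numerical example shows how misclassification costs shift the class prior. The poster reports adjusted priors of 2/3 (malignant) and 1/3 (benign) but not the costs themselves. The values below (an original malignant prior of 1/3, close to the 32% prevalence in the data, and a cost ratio of 4:1) are assumptions chosen only because they reproduce those adjusted priors.

```python
def adjusted_prior(p_pos, c_pos, c_neg):
    """Cost-adjusted prior P'(y=+1) = P(+)c+ / (P(+)c+ + P(-)c-)."""
    p_neg = 1.0 - p_pos
    return p_pos * c_pos / (p_pos * c_pos + p_neg * c_neg)

# Hypothetical inputs: prior 1/3 for malignancy; missing a malignant tumor
# is taken to be four times as costly as a false alarm.
p_malignant_adj = adjusted_prior(p_pos=1.0 / 3.0, c_pos=4.0, c_neg=1.0)
p_benign_adj = 1.0 - p_malignant_adj
```

With equal costs the function returns the original prior unchanged, so the adjustment only takes effect when the two kinds of error are weighted differently.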
6. Conclusions
The forward selection procedure, which tries to maximize the evidence of the LS-SVM model, is able to identify the important variables.
The performance of LS-SVMs and MLPs is comparable.
Both models have the potential to give reliable preoperative predictions of malignancy of ovarian tumors.
A larger scale validation is needed.
References
1. C. Lu, T. Van Gestel, et al., Preoperative prediction of malignancy of ovarian tumors using least squares support vector machines (2002), submitted paper.
2. D. Timmerman, H. Verrelst, et al., Artificial neural network models for the preoperative discrimination between malignant and benign adnexal masses, Ultrasound Obstet Gynecol (1999).
3. J.A.K. Suykens, J. Vandewalle, Least squares support vector machine classifiers, Neural Processing Letters (1999), 9(3).
4. T. Van Gestel, J.A.K. Suykens, et al., Bayesian framework for least squares support vector machine classifiers, Gaussian processes and kernel Fisher discriminant analysis, Neural Computation (2002), 14(5).
5. D.J.C. MacKay, The evidence framework applied to classification networks, Neural Computation (1992), 4(5).
Performance from randomized cross-validation (30 runs)
Training set (n=265) and test set (n=160) are separated at random.
Stratified: #benign : #malignant ≈ 2:1 in each training and test set.
Repeated 30 times.

Averaged performance over the 30 validation runs (accuracy, sensitivity and specificity in %)

Model type        mAUC (SD)          Cutoff   Accuracy   Sensitivity   Specificity
RMI               0.8882 (0.0318)    100      82.65      81.73         83.06
                                     80       81.10      83.87         79.85
MLP (10-2-1)      0.9409 (0.0198)    0.6      84.46      87.20         83.21
                                     0.5      82.17      90.80         78.24
LS-SVM (LIN)      0.9405 (0.0236)    0.5      84.31      87.40         82.91
                                     0.4      82.77      90.47         79.27
LS-SVM (RBF)      0.9424 (0.0232)    0.5      84.85      86.53         84.09
                                     0.4      83.52      90.00         80.58
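The repeated stratified splitting used for the randomized cross-validation can be sketched as follows. The label vector is a placeholder built from the reported figures (425 records, about 32% malignant, taken here as 136 malignant and 289 benign); the function name and the seed are assumptions, and model fitting and AUC computation for each split are left out.

```python
import numpy as np

def stratified_split(y, n_train, rng):
    """Random train/test split that keeps the class proportions of y."""
    y = np.asarray(y)
    train_parts = []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        # Per-class share of the training set; rounding can shift the total slightly.
        k = int(round(n_train * len(idx) / len(y)))
        train_parts.append(idx[:k])
    train_idx = np.sort(np.concatenate(train_parts))
    test_idx = np.setdiff1d(np.arange(len(y)), train_idx)
    return train_idx, test_idx

# Placeholder labels: 1 = malignant (136 cases), 0 = benign (289 cases).
y_all = np.concatenate((np.ones(136, dtype=int), np.zeros(289, dtype=int)))
rng = np.random.default_rng(0)
splits = [stratified_split(y_all, 265, rng) for _ in range(30)]
```

Each of the 30 splits would then be used to train a model on the 265 training cases and to compute the AUC on the 160 test cases, after which the mean and standard deviation over the runs are reported.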