A review of multivariate calibration model transfer strategies


MSc Chemistry

Track Analytical Sciences

Literature study


By: Ali Ghamati

UvA#: 11806788

VU#: 2630227

Supervisor: Jan Gerretzen

Examiner: Bob Pirok

2nd Reviewer: Johan Westerhuis

February 21, 2020


Summary

Calibration transfer is relevant wherever the same calibration method is applied on multiple instruments. Calibration models are usually developed on one instrument (master) and cannot be transferred directly to another instrument (slave) with a different instrumental response. In such applications, time-consuming and costly recalibration can be avoided by performing a calibration transfer. Typically, a set of samples, referred to as transfer samples, is used to map the differences in instrumental response and subsequently yield transfer parameters. Different calibration transfer methods that calculate such transfer parameters are available in the literature. These include methods that correct the predicted y-values, the spectral x-values or the calibration models.

In general, applying methods for correction of x-values such as the piecewise direct standardization results in good predictive performance after model transfer to the slave instrument. The optimal method is dataset dependent. Thus, there is no universal calibration transfer method.

Correcting spectral x-values can be performed in two directions. New spectra measured on the slave instrument can be transformed to match those measured on the master instrument and can subsequently be predicted by the initial model developed on the master instrument. Alternatively, in the reverse direction, the spectra measured with the master instrument on which the calibration model was developed are transformed to match the slave instrument, and are subsequently used to compute a new model. The reverse direction is typically the preferred option in industrial applications due to easier implementation.

As an alternative, model updating can be used to include instrumental variation of the slave instrument in the calibration model. Compared to calibration transfer methods, it is less likely that this method can yield acceptable results. However, due to the simplicity and relatively low time investment it should be considered first before applying transfer methods.

Standard-free calibration transfer methods or the use of transfer samples based on generic standards are discussed as well. These strategies have shown successful results in a limited number of studies and should only be considered in exceptional cases.

To finalize this review, a flowchart is developed as listed in Appendix I, providing guidance in the process of calibration transfer.


Introduction

Developing robust multivariate calibration models is a time-consuming and costly process. It involves selecting and preparing calibration samples, measuring spectra, performing reference analyses, and developing and validating the calibration model. In some cases, several hundred production samples, collected over an extended period of time, are included in the model. Typically, calibration models are developed on one instrument (master/primary) with a specific instrumental response. When differences in the instrumental response (e.g. intensity, absorbance or shift) occur after the calibration, the developed calibration model might no longer be applicable. Such differences can occur on the master instrument itself, due to replacement of optical components or the detector, changes in sampling or physical conditions (e.g. a different process temperature or particle size), or ageing of the instrument (instrument drift). They also occur when the calibration model developed on the master instrument is implemented on other instruments (slave/secondary). In all cases, the differences in instrumental response can yield poor predictions for new samples and therefore require recalibration. Several methods, referred to as calibration transfer or standardization methods, have been developed to avoid a time-consuming and costly full recalibration.

Two main steps are involved in a calibration transfer. The first step is measuring a set of samples: when the instrumental response of the master instrument itself has changed, the samples are remeasured on the master instrument; when a model is transferred to another instrument, they are measured on both the master and the slave instrument. This set of samples, referred to as transfer or standardization samples, is used to map the differences between the instrumental responses. In the second step, the transfer or standardization parameters are calculated from the transfer samples. However, there are also standard-free methods, which do not require any transfer samples.

Section 2 of this report describes relevant theory on multivariate calibration models and how the performance (accuracy) of such models before and after calibration transfer is expressed. Section 3 discusses the importance of, and the criteria for, selecting transfer samples in order to achieve optimal transfer results. Section 4 gives an overview of the different methods found in the literature for transferring calibration models. Finally, Section 5 discusses methods that can potentially allow the development of transferable models.


2 Theory

2.1 Multivariate calibration using PLS-regression

Second order spectroscopic instruments (e.g. IR, NIR, Raman) cannot perform direct quantitative measurements. Therefore, such instruments require calibration using a reference technique. In the calibration procedure, a mathematical relationship is established between the absorbance or intensity values at specific wavelengths ($X_m$) and a certain property ($Y_n$) that is determined with the reference technique. Measurements by second order instruments often involve numerous wavelengths, so a multivariate calibration model is needed to describe this relationship. This requires arranging the data in two matrices. The spectral dataset (X) has one row per spectrum and one column per wavelength. The other dataset (Y) contains the corresponding reference values; its number of rows equals that of the X matrix and its number of columns equals the number of independent properties. Subsequently, the two matrices are correlated using linear regression, as shown in Figure 1.

Figure 1: Principle of multivariate calibration. A mathematical relationship is established between the Y-matrix and the X-matrix with n observations and $m$ variables. Via linear regression two matrices are computed: one containing the regression coefficients B and the other the residual error E of the regression model.

One of the best-established and most widely applied regression methods for multivariate calibration is partial least squares (PLS) regression, which applies data reduction in the form of latent components to describe the information in the X-matrix. The method constructs an m-dimensional space and computes a direction $t_1$ (the first PLS latent variable, LV) in this space that describes the maximum amount of variation in the X variables together with the highest correlation with the Y variables. Thus, the method aims to maximize covariance. The first LV ($t_1 = X W_1$) is computed in the direction that yields a maximum for $\mathrm{Cov}(t_1, y)$, with $W_1$ being the weight vector that explains maximum covariance. Subsequently, the loadings $p_1$ of the X variables on $t_1$ are computed. The contribution of the first LV is then subtracted from the initial matrix, yielding a residual X-matrix $E_1$ as given by Eq. 1.

$E_1 = X - t_1 p_1^{T}$ (1)

This procedure is then repeated for the second LV $t_2$ according to $t_2 = E_1 W_2$ so as to obtain a maximum for $\mathrm{Cov}(t_2, y)$. The second LV is orthogonal to the first one. This can be repeated $k$ times until a sufficient amount of correlation is captured by the regression model, as given by Eq. 2.

$Y = \sum_{i=1}^{k} t_i b_i + e = T b + E_y$ (2)

where $T$ contains the scores, $b$ the calculated regression coefficients and $E_y$ the regression model error.
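The sequential procedure of Eq. 1 and Eq. 2 can be illustrated with a minimal sketch for a single y-variable (PLS1). It assumes NumPy and mean-centred data; the function names `pls1_fit` and `pls1_predict` and the returned dictionary are hypothetical and do not come from any of the cited works. The coefficients per score correspond to $b_i$ in Eq. 2, while `b` is the resulting regression vector in the wavelength domain.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """Sketch of PLS1 by sequential deflation (assumed helper, not a library API)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E = X - x_mean                      # mean-centred X, deflated in every round
    f = y - y_mean                      # mean-centred y
    W, P, T, b_scores = [], [], [], []
    for _ in range(n_lv):
        w = E.T @ f                     # direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = E @ w                       # scores: t_k = E_(k-1) w_k
        tt = t @ t
        p = E.T @ t / tt                # loadings of the X variables on t_k
        b_k = f @ t / tt                # coefficient of y on t_k (b_i in Eq. 2)
        E = E - np.outer(t, p)          # Eq. 1: remove the contribution of t_k
        f = f - b_k * t
        W.append(w); P.append(p); T.append(t); b_scores.append(b_k)
    W, P, T = np.array(W).T, np.array(P).T, np.array(T).T
    b_scores = np.array(b_scores)
    # Regression vector for centred spectra: b = W (P^T W)^-1 b_scores
    b = W @ np.linalg.solve(P.T @ W, b_scores)
    return {"b": b, "x_mean": x_mean, "y_mean": y_mean, "T": T}

def pls1_predict(model, X):
    """Predict y for new spectra with a model returned by pls1_fit."""
    X = np.asarray(X, dtype=float)
    return (X - model["x_mean"]) @ model["b"] + model["y_mean"]
```

In this sketch the score vectors in `model["T"]` are mutually orthogonal, in line with the statement that each new LV is orthogonal to the previous ones.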


2.2 Multivariate model errors

The accuracy of multivariate calibration models is expressed as the Root Mean Squared Error of Calibration (RMSEC) given by Eq 3 which is based on 1 × standard deviation (68% confidence level).

$RMSEC = \sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - df}}$ (3)

where $y_i$ is the reference value for sample $i$, $\hat{y}_i$ is the value predicted by the multivariate model for that sample, $n$ is the total number of samples and $df$ is the number of degrees of freedom, defined as the number of principal components or latent variables used plus one for mean-centering. In principle, the RMSEC decreases as more latent variables are included in the model. The more latent variables are included, the more information from the X data set is described by the model. Thus, the number of latent variables can be increased until 100% of the X data is described by the model. However, including too many latent variables leads to overfitting and therefore poor predictive performance. Cross-validation procedures can be applied to assess the robustness of the chosen number of latent variables. Such procedures typically leave some of the data (samples) out of the model, fit a new model with the remaining samples and use the left-out samples as test samples to validate the model error. This is repeated until each sample has been left out once.

The accuracy of the model after cross-validation is expressed as the Root Mean Squared Error of Cross-Validation (RMSECV) given by Eq 4 which can increase in case of overfitting.

$RMSECV = \sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$ (4)

A more robust validation procedure is based on using an external (independent) prediction set. The accuracy of the prediction can then be expressed as the Root Mean Squared Error of Prediction (RMSEP) given by Eq. 5.

$RMSEP = \sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$ (5)

where $y_i$ and $\hat{y}_i$ correspond to the reference and predicted values of the samples in an independent prediction set.
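As a small illustration of Eq. 3 to Eq. 5, the error measures can be computed as sketched below. NumPy is assumed, and the function names `rmsec` and `rmse` are hypothetical. RMSECV and RMSEP share the same formula and differ only in where the predictions come from (cross-validation or an independent prediction set).

```python
import numpy as np

def rmsec(y_ref, y_pred, n_lv):
    """Eq. 3: calibration error with df = n_lv + 1 (one for mean-centering)."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    df = n_lv + 1
    return np.sqrt(np.sum((y_ref - y_pred) ** 2) / (y_ref.size - df))

def rmse(y_ref, y_pred):
    """Eq. 4 / Eq. 5: RMSECV or RMSEP, depending on the origin of y_pred."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_ref - y_pred) ** 2))
```

For the same residuals, `rmsec` is always larger than `rmse` because of the smaller denominator $n - df$.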


3 Transfer samples

Transfer samples are measured on the master and/or slave instrument to estimate the differences between responses. Such differences can arise either from different instrumental configurations or from the physical state of the samples (e.g. temperature, or powder versus tablet). The latter case sometimes applies to calibration transfer on the same instrument. Proper selection of transfer samples is important for successful calibration transfer. Therefore, attention is given to the requirements they have to meet and to the different selection methods available in the literature.

3.1 Requirements for transfer samples

For proper estimation of the differences in response, two criteria must be considered: the stability and the representativity of the transfer samples. Physical and chemical stability of these samples during calibration transfer ensures that the transfer parameters are computed from instrumental differences only, and not also from spectral differences resulting from sample degradation. Preferably, transfer samples are representative of the new samples to be predicted. If these two criteria are not met, the use of the transfer parameters leads to additional errors.

3.2 Selection of subsets for calibration transfer

There are three types of sample sets from which a subset of transfer samples can be selected: the calibration set, the prediction set, or a set of independent transfer samples.

3.2.1 Subset selection methods

Different subset selection methods are available for selecting representative transfer samples. Wang et al. [3] proposed a subset selection method based on selecting samples with high leverage. This procedure starts by selecting the sample with the highest leverage (after outliers have been excluded). The spectral information in this sample is removed from the rest of the samples by a linear transformation, so that the remaining samples are orthogonal to the selected sample. This stepwise procedure is repeated until the defined number of samples is obtained. However, selecting samples based on high leverage proved to be inadequate for covering the whole model space, which would lead to decreased representativity and thus poor calibration transfer results. Therefore, De Noord [4] proposed the Kennard-Stone method (KS split) as an alternative. This is a stepwise procedure that selects samples with a uniform distribution over the entire model space. It is the most applied subset selection method for calibration transfer found in the literature and is also available in the PLS_toolbox.
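The Kennard-Stone selection can be sketched as follows. This is a minimal illustration assuming NumPy, Euclidean distances and a start from the two most distant samples; the function name `kennard_stone` is hypothetical, and in practice the selection is often performed on scores rather than on raw spectra.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return the indices of n_select samples spread uniformly over the data space."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Start with the two samples that are furthest apart
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    while len(selected) < n_select:
        remaining = [k for k in range(len(X)) if k not in selected]
        # Distance of every remaining sample to its nearest selected sample
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        # Add the sample that is furthest from everything selected so far
        selected.append(remaining[int(np.argmax(min_d))])
    return selected[:n_select]
```

Each round adds the sample that lies furthest from the already selected set, which is what spreads the subset over the whole model space rather than concentrating it at the extremes.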

3.2.2 Subset selection from the calibration set

The advantage of subset selection from the calibration set is that the transfer samples and calibration samples are similar in terms of spectral information and variation. Thus, the accumulation of additional spectral error is less likely. However, subset selection from the calibration set is not always applicable. Some applications use unstable calibration samples, such as peroxides and fresh food products, so the original calibration samples are no longer available.

3.2.3 Subset selection from the prediction set

In this approach a number of transfer samples are selected from new samples that were not used in the calibration. These new samples are similar (e.g. same product) to the samples used in the calibration. However, they have not yet been measured on the master instrument. Therefore, these samples have to be measured on both instruments to perform a calibration transfer. The disadvantage of this method is that the calibration conditions on the master instrument must still exist.


Moreover, it must be feasible to measure the samples on both instruments without too much delay. Therefore, this approach is not applicable to unstable samples if the instruments are located very far from each other.

3.2.4 Generic transfer samples

The use of generic samples consists of measuring new samples on both instruments, similar to the approach that uses samples from the prediction set. However, the samples are now of a different nature than the calibration samples (e.g. a different product). The advantage of this approach is that it allows the use of physically and chemically stable transfer samples. Moreover, it can be applied to different calibration models. Therefore, the transfer sample set needs to be measured only once per instrument, instead of preparing and measuring a transfer set for each model that needs to be transferred. The disadvantage of this approach is that generic samples are less representative than the calibration samples. Numerous studies have applied generic standards, in which acceptable results were sometimes obtained. However, for most applications the results were not satisfactory. Moreover, considerable effort is required to investigate what type of generic samples and which additional data pretreatment can provide acceptable results for each calibration model. Whether this effort is worthwhile therefore depends on factors such as the scale and complexity of the calibration transfer, the accessibility of the instruments and the stability of the samples.

3.2.5 Number of transfer samples

The number of transfer samples must be carefully determined, as it is important that these samples capture sufficient information about the instrumental differences. Too few transfer samples may lack information and yield insufficient results. Using too many samples does not affect the results negatively, but it is unnecessary work. The required number of transfer samples is determined by the complexity of the differences between the instruments and by the calibration transfer method. For instance, Shenk and Westerhaus [5] suggested using 15 to 30 transfer samples for instruments that require wavelength correction with the Shenk-Westerhaus calibration transfer method, but also presented cases where a single transfer sample was sufficient for NIR spectrometers not requiring wavelength correction [6]. Later, Wang et al. [3] demonstrated that only 3 transfer samples are required when using the PDS calibration transfer method (Section 4.2.3). However, using more than 3 transfer samples is recommended to capture a larger experimental model space, which reduces artefacts in the transfer parameters.


4 Transferring multivariate calibration models

Different calibration transfer methods are available in literature that calculate transfer parameters for the correction of instrumental differences or measurement conditions. These corrections are applied to the y-values (Section 4.1), or to the spectral x-values (Section 4.2) or to the calibration models (Section 4.3).

Other methods have also been proposed to develop calibration models that are transferable between different instruments. Such methods do not require corrections in the transfer parameters (see Section 4.4).

4.1 Methods for correction of y-values

4.1.1 Univariate slope and bias correction

The univariate slope and bias correction (SBC) method consists of predicting y-values for transfer samples with the calibration model developed on the master instrument. With this calibration model the y-values are predicted with both master and slave instrument by Eq. 6-7.

$y_{std}^{m} = X_{std}^{m} \cdot b$ (6)

$y_{std}^{s} = X_{std}^{s} \cdot b$ (7)

The y-values predicted by the master instrument are then plotted against their corresponding values predicted by the slave instrument. A univariate correction is then applied to the bias or bias/slope by least squares regression. The predicted y-values for new spectra as measured on the slave instrument are then calculated by Eq. 8 and are bias/slope corrected by Eq. 9, yielding standardized predicted y-values.

$y_{p}^{s} = X_{p}^{s} \cdot b$ (8)

$y_{p,std}^{s} = bias + slope \cdot (X_{p}^{s} \cdot b)$ (9)
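Eq. 6 to Eq. 9 can be illustrated with the minimal sketch below. It assumes NumPy, a regression vector `b` that can be applied directly to the spectra (no separate intercept term), and a least-squares line that regresses the master predictions on the slave predictions. The function names `fit_sbc` and `apply_sbc` are hypothetical.

```python
import numpy as np

def fit_sbc(X_std_master, X_std_slave, b):
    """Estimate bias and slope from transfer samples measured on both instruments."""
    y_std_m = np.asarray(X_std_master, dtype=float) @ b   # Eq. 6
    y_std_s = np.asarray(X_std_slave, dtype=float) @ b    # Eq. 7
    # Least-squares line: master predictions as a function of slave predictions
    slope, bias = np.polyfit(y_std_s, y_std_m, deg=1)
    return bias, slope

def apply_sbc(X_new_slave, b, bias, slope):
    """Predict new slave spectra with the master model and correct them (Eq. 8-9)."""
    y_p_s = np.asarray(X_new_slave, dtype=float) @ b      # Eq. 8
    return bias + slope * y_p_s                            # Eq. 9
```

Because only two parameters are estimated, the correction can only remove a difference that is linear in the predicted value, which is consistent with the method's limitation to simple instrumental differences.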

This method was applied by J.A. Jones et al. [7] to transfer single and dual wavelength calibration models for moisture determination in pharmaceutical products between two NIR instruments. The two NIR spectrometers were installed at two different production sites (UK and USA). The data set of each site was split into a calibration and a prediction set. On each instrument a calibration model was built using Karl Fischer (KF) titration as reference. Subsequently, the calibration spectra of each site were predicted using the calibration model developed at the other site. A correction for the slope and bias was then computed and applied to the predicted moisture content of the prediction set. Details of the data sets are listed in Table 1 and a summary of the prediction set statistics is listed in Table 2. The study demonstrated that, by applying a bias/slope correction, accurate moisture predictions were obtained by NIR for the transferred models.


Table 1: Water content of calibration and prediction sets. Water content (% w/w) by KF.

| Sample set | No. in set | Mean | Range | %RSD |
|---|---|---|---|---|
| Calibration sets | | | | |
| UK | 28 | 1.66 | 0.90 - 3.10 | 33 |
| USA | 32 | 2.11 | 1.34 - 4.90 | 33 |
| Prediction sets | | | | |
| UK | 12 | 1.59 | 1.16 - 2.13 | 20 |
| USA1 | 25 | 1.84 | 1.27 - 2.70 | 20 |
| USA2 | 27 | 1.35 | 0.84 - 1.71 | 16 |
| USA3 | 19 | 1.39 | 1.12 - 1.71 | 12 |

Table 2: Statistics overview of NIR prediction data sets. Residuals (NIR-KF).

| Prediction set | Bias (mean) | Bias (SD) | Accuracy (mean) | Accuracy (SD) |
|---|---|---|---|---|
| (A) Using UK 2-wavelength calibration | | | | |
| UK | -0.04 | 0.11 | 0.09 | 0.07 |
| USA1 | -0.01 | 0.22 | 0.19 | 0.10 |
| USA2 | -0.06 | 0.20 | 0.16 | 0.14 |
| USA3 | -0.01 | 0.23 | 0.16 | 0.16 |
| USA3Q | -0.06 | 0.20 | 0.16 | 0.13 |
| (B) Using USA 1-wavelength calibration | | | | |
| UK | +0.02 | 0.11 | 0.09 | 0.06 |
| USA1 | 0.00 | 0.24 | 0.20 | 0.13 |
| USA2 | -0.05 | 0.25 | 0.19 | 0.17 |
| USA3 | -0.04 | 0.24 | 0.16 | 0.17 |
| USA3Q | -0.09 | 0.20 | 0.17 | 0.13 |
| (C) Using USA 2-wavelength calibration | | | | |
| UK | -0.03 | 0.17 | 0.14 | 0.08 |

The advantage of this calibration transfer method lies in the simplicity of the univariate correction. The disadvantage is that the correction has to be applied to each calibration model developed on the master instrument, which becomes rather time-consuming. Moreover, the y-values of the transfer samples are required as well. Furthermore, this correction of the predicted y-values has only been shown to be successful with identical instruments. If applied to different types of instruments, the differences in instrumental response are most likely too complex to yield satisfactory results [8]. Another limitation of this method is that it is not applicable to data sets containing samples whose response varies between different situations (e.g. particle size, temperature), because every sample type would have to be corrected differently. Therefore, this method is not recommended for more complex multivariate calibration models [9].

4.2 Methods for correction of x-values

4.2.1 The Shenk-Westerhaus algorithm

The Shenk-Westerhaus algorithm, also found in the literature as Shenk's patented method or the patented method, was the first calibration transfer method to apply corrections to spectra (x-values) [5]. The method applies a two-step univariate correction to all wavelengths: first the wavelength indices are corrected, then the spectral intensities.

The algorithm is described by Bouveresse et al. [10]. For each wavelength ($i$) of the master instrument, a spectral window ($i - w$, $i + w$) of wavelengths on the slave instrument is chosen. Within each window, the wavelength whose spectral intensities correlate most strongly with those of the master wavelength is determined. A quadratic model (Eq. 10) is fitted to the correlations at this wavelength ($m$) and its two neighboring wavelengths ($m - 1$ and $m + 1$), which gives a more precise estimate of the wavelength with the highest correlation. A second quadratic model (Eq. 11) is then fitted to relate the wavelengths of the master instrument to their matching wavelengths of the slave instrument as obtained from the first quadratic model. This yields the definitive wavelengths of the slave instrument $i'$ that match those of the master instrument $i$.

$Correlation = a + b \cdot i + c \cdot i^{2}$ (10)


By interpolation, the spectral intensities of the slave instrument are computed at the wavelengths suggested by the second quadratic model. Subsequently, the spectral intensities are corrected by Eq. 12: at each wavelength, a univariate linear regression relates the responses of the slave instrument after index correction, $X_{i}^{s\#}$, to those of the matching master instrument, $X_{i}^{m}$.

$X_{i}^{m} = a(i) + b(i) \cdot X_{i}^{s\#}$ (12)

For new spectra measured with the slave instrument, the response is corrected for intercept ($a$) and slope ($b$) at each wavelength ($i$) by Eq. 13, which yields the standardized spectra $X_{std,i}^{s}$. A calibration transfer file is generated containing the correction factors for the wavelength indices and spectral intensities.

$X_{std,i}^{s} = a(i) + b(i) \cdot X_{i}^{s\#}$ (13)
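The two univariate building blocks of this algorithm can be sketched as follows. The sketch assumes NumPy and covers only the peak refinement of Eq. 10 and the intensity correction of Eq. 12 and Eq. 13; the window search and the interpolation to the matched wavelengths are assumed to have been done already. All function names are hypothetical.

```python
import numpy as np

def refine_peak(indices, correlations):
    """Eq. 10: vertex of the parabola through three (index, correlation) points."""
    c, b, a = np.polyfit(indices, correlations, deg=2)   # c*i^2 + b*i + a
    return -b / (2.0 * c)

def fit_intensity_correction(X_std_master, X_std_slave):
    """Eq. 12: per-wavelength intercept a(i) and slope b(i) by least squares.

    X_std_slave is assumed to be index-corrected already (same columns as master).
    """
    Xm = np.asarray(X_std_master, dtype=float)
    Xs = np.asarray(X_std_slave, dtype=float)
    xs_mean, xm_mean = Xs.mean(axis=0), Xm.mean(axis=0)
    sxy = ((Xs - xs_mean) * (Xm - xm_mean)).sum(axis=0)
    sxx = ((Xs - xs_mean) ** 2).sum(axis=0)
    slope = sxy / sxx
    intercept = xm_mean - slope * xs_mean
    return intercept, slope

def apply_intensity_correction(X_slave, intercept, slope):
    """Eq. 13: standardize new, index-corrected slave spectra."""
    return intercept + slope * np.asarray(X_slave, dtype=float)
```

Each wavelength is corrected independently of all others, which illustrates the univariate character of the method.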

In the study of Bouveresse et al. [10] the Shenk-Westerhaus algorithm was used to transfer NIR MLR models of agricultural products from the master instrument to three slave instruments (UK, SP and DA). The effect of three different sets of transfer samples was investigated. The sets consist of samples similar to those used for calibration (STD1), generic samples (STD2), and pure organic and inorganic chemicals (STD3). STD1 therefore has spectral intensity values comparable to the prediction set, whereas the latter two sets cover a different intensity range. Three prediction sets (HE, MA and CO) were used to assess the accuracy of the prediction (RMSEP) before and after calibration transfer on the slave instruments. The STD2 set was also applied without a wavelength index correction, which is referred to as STD2'. For comparison, the standard error of calibration (SEC) obtained with the master instrument is provided as well. Details of the standardization and prediction sets are listed in Table 3. A summary of the prediction set statistics for HE and MA is listed in Table 4.

Table 3: Name and composition of the transfer and prediction sets.

| Set name | Description |
|---|---|
| STD1 | 30 sealed cups with agronomic products (ISI standardization set) |
| STD2 | 6 generic standards (4 from Labsphere + 2 made by SHB) |
| STD3 | 12 sealed cups with pure organic and inorganic products |
| HE | 6 grass samples |
| MA | 6 corn samples |
| CO | 17 colza samples |

Studied variables for each agronomic sample

| Agronomic samples | Studied variables |
|---|---|
| HE | Proteins, cellulose |
| MA | Proteins, cellulose |
| CO | Proteins, fat |

Table 4: Summary of the prediction set statistics.

| RMSEP | Instrument | RMSEC | Before | Std1 | Std2 | Std2' | Std3 |
|---|---|---|---|---|---|---|---|
| Proteins, HE set | UK | 0.85 | 0.66 | 0.21 | 0.35 | 0.45 | 1.18 |
| Proteins, HE set | SP | 0.85 | 0.37 | 0.14 | 0.73 | 0.69 | 1.68 |
| Proteins, HE set | DA | 0.85 | 0.53 | 0.45 | 1.38 | 0.47 | 3.87 |
| Proteins, MA set | UK | 0.43 | 0.40 | 0.12 | 0.95 | 0.04 | 1.24 |
| Proteins, MA set | SP | 0.43 | 0.23 | 0.10 | 0.89 | 0.14 | 5.71 |
| Proteins, MA set | DA | 0.43 | 0.39 | 0.22 | 1.58 | 0.69 | 7.65 |
| Cellulose, HE set | UK | 1.27 | 0.45 | 0.21 | 0.93 | 0.48 | 7.07 |
| Cellulose, HE set | SP | 1.27 | 1.17 | 0.35 | 1.02 | 0.52 | 0.85 |
| Cellulose, HE set | DA | 1.27 | 0.49 | 0.34 | 0.82 | 0.95 | 1.16 |
| Cellulose, MA set | UK | 1.04 | 0.47 | 0.30 | 0.38 | 0.60 | 2.89 |
| Cellulose, MA set | SP | 1.04 | 0.74 | 0.15 | 0.85 | 0.38 | 2.45 |
| Cellulose, MA set | DA | 1.04 | 0.92 | 0.33 | 1.22 | 0.61 | 1.72 |


The study demonstrated that by applying the Shenk-Westerhaus algorithm good predictions were obtained with the slave instruments but only if the standardization and prediction sets are similar (e.g. STD1).

The advantage of this method is that a relatively small standardization set (30 samples), without the need for reference values, can be used to transfer calibration models between instruments. The disadvantage lies in its univariate character: the method cannot correct for more complex instrumental differences between the master and slave instrument.

4.2.2 Direct standardization

The direct standardization (DS) method as introduced by Wang et al. [3] is a calibration transfer method that corrects the spectra measured with the slave instrument to match those measured with the master instrument. The calibration model is not corrected. The relationship between the spectral matrices (x-values) of both instruments is described by a transformation matrix $F$, as given by Eq. 14, with $E$ containing the unmodelled residuals.

$X^{m} = X^{s} \cdot F + E$ (14)

The transfer matrix is a square matrix, computed by multiplying the generalized inverse of the standardization set as measured on the slave instrument, $(X_{std}^{s})^{-}$, with the standardization set as measured on the master instrument, $X_{std}^{m}$, as given by Eq. 15.

$F = (X_{std}^{s})^{-} \cdot X_{std}^{m}$ (15)

New spectra measured on the slave instrument are then transferred ($X_{trf}^{s}$) by multiplication with the transfer matrix, as given by Eq. 16.

$X_{trf}^{s} = X^{s} \cdot F$ (16)

The generalized inverse of this method is calculated using PCR. An alternative method for calculating the generalized inverse is based on PLS and is referred to as two-block PLS. This method uses PLS regression twice. First, the relationship between spectra of the standardization set of the master and slave instruments is computed. Then PLS regression is performed for the second time to compute the relationship between spectral variables (x-values) and chemical variables (y-values).
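Eq. 15 and Eq. 16 can be illustrated with the sketch below. It assumes NumPy and computes the generalized inverse from a truncated singular value decomposition, which is one way to realize a PCR-style inverse; mean-centring is omitted for brevity. The function names and the `n_components` parameter are hypothetical.

```python
import numpy as np

def truncated_pinv(A, n_components):
    """Generalized inverse built from the first n_components singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = n_components
    return Vt[:k].T @ np.diag(1.0 / s[:k]) @ U[:, :k].T

def fit_ds(X_std_master, X_std_slave, n_components):
    """Eq. 15: F = (X_std^s)^- . X_std^m, a square wavelength-by-wavelength matrix."""
    Xm = np.asarray(X_std_master, dtype=float)
    Xs = np.asarray(X_std_slave, dtype=float)
    return truncated_pinv(Xs, n_components) @ Xm

def apply_ds(X_slave, F):
    """Eq. 16: transfer new slave spectra to the master domain."""
    return np.asarray(X_slave, dtype=float) @ F
```

With far fewer transfer samples than wavelengths, `n_components` cannot exceed the number of transfer samples, so the choice of this parameter controls how much of the slave spectrum is used to build $F$.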

In the study of Forina et al. [11] the two-block PLS variation of the direct standardization method was used to transfer NIR PLS models of soy flour between four instruments in four different laboratories. The procedure was applied to 60 soy flour samples, of which 40 samples were used for the standardization set and 20 for the prediction set. The paper studied the effect of the number of components used for the standardization set and for the prediction set, see Figure 2 (left). The effect of using 5 samples in the standardization set instead of 20 was studied as well, see Figure 2 (right).

Figure 2 (left) shows that if the number of components for the standardization set (X-X relationship) and prediction set (X-Y relationship) is selected with care, the NIR models can be transferred without losing predictive performance. For example, 5 PCs (X-X) in combination with 2 PCs (X-Y) yield comparable RMSEC and RMSEP values.

Figure 2 (right) shows an example of the results obtained for a standardization set containing 5 samples, using two or three PLS latent variables to compute $F$. The increase in RMSEP when using 5 transfer samples (right) is very limited compared to using 20 samples (left).


The disadvantage of direct standardization is that the whole spectrum measured on the slave instrument is used to compute the transferred spectrum, which may lead to overfitting. The method requires a relatively small standardization set; in this study even 5 samples could be used. The number of transfer samples depends on the complexity of the calibration model and should be at least as large as the number of latent variables used.

Figure 2: Overview with details of the studied parameters.

4.2.3 Piecewise direct standardization

Wang et al. [3] introduced piecewise direct standardization (PDS) as an improvement on direct standardization. The main difference between the two methods is the transfer matrix. The direct standardization method uses the entire spectrum of the slave instrument to fit each wavelength of the master instrument (Eq. 15). However, in spectroscopic data the spectral variation at a given wavelength is more strongly related to that at neighboring wavelengths than to that across the full spectrum. Therefore, each wavelength on the master instrument is more likely related to a window of neighboring wavelengths on the slave instrument than to its full spectrum. For the transferred spectrum, each wavelength $i$ is reconstructed from its corresponding wavelength window $W_i$, spanning $[i - g, i + h]$ on the slave instrument, as given by Eq. 17.

$W_i = [T_{i-g}^{s}, T_{i-g+1}^{s}, \ldots, T_{i}^{s}, \ldots, T_{i+h-1}^{s}, T_{i+h}^{s}]$ (17)

Using PCR or PLS, regression coefficients are computed that relate the spectral intensity of the wavelengths of the master instrument with the wavelength window on the slave instrument given by Eq. 18.

$T_{i}^{m} = W_i \cdot b_i + e_i$ (18)

Finally, the spectra as measured on the slave instrument are standardized by multiplying them with the regression coefficients matrix.
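The windowed regression of Eq. 17 and Eq. 18 can be sketched as follows. The sketch assumes NumPy and a symmetric window ($g = h$, here called `half_width`), and it replaces the PCR or PLS step with an ordinary least-squares fit per window to keep the example short; the function names are hypothetical.

```python
import numpy as np

def fit_pds(X_std_master, X_std_slave, half_width):
    """Build a banded transfer matrix from one local regression per master wavelength."""
    Xm = np.asarray(X_std_master, dtype=float)
    Xs = np.asarray(X_std_slave, dtype=float)
    n_wl = Xm.shape[1]
    F = np.zeros((n_wl, n_wl))
    for i in range(n_wl):
        lo = max(0, i - half_width)              # window is clipped at the edges
        hi = min(n_wl, i + half_width + 1)
        W_i = Xs[:, lo:hi]                       # Eq. 17: slave window around i
        # Eq. 18: regress master wavelength i on the slave window
        # (least squares used here instead of PCR/PLS, as a simplification)
        b_i, *_ = np.linalg.lstsq(W_i, Xm[:, i], rcond=None)
        F[lo:hi, i] = b_i                        # coefficients go into column i
    return F

def apply_pds(X_slave, F):
    """Standardize slave spectra by multiplication with the banded matrix."""
    return np.asarray(X_slave, dtype=float) @ F
```

Unlike the full matrix of direct standardization, the resulting matrix is zero outside a band of width `2 * half_width + 1`, so far fewer coefficients are estimated from the same transfer samples.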


In the study of Wang et al. [3] PDS was compared with four other calibration transfer methods along with subset recalibration using simulated data. In the case of subset recalibration, a new calibration model is computed based on subset samples only. Some of these calibration transfer methods are discussed in later chapters of this report. Figure 3 shows the accuracy of each method against the number of transfer samples.

Figure 3: Effect of subset size on RMSEP for different transfer methods. Recreated from [3].

The results show that all calibration transfer methods outperform subset recalibration (model updating). Whether the accuracy of the subset recalibration is acceptable depends on the application and it can potentially be improved by using a larger subset. Clearly, the obtained accuracy with transfer methods is better when a small number of transfer samples are applied. The direct and inverse model standardization methods work best with a relatively larger number of transfer samples (more than 5) compared to the other methods. For this dataset, the best results are obtained using the PDS method with an RMSEP of about 0.007. The optimal PLS model with 5 components yielded an RMSEP of 0.0055 for analyte A. Therefore, the PDS method does not outperform a full recalibration. However, considering that only a few transfer samples were used, the additional 1.2 - 1.6 times larger prediction error could be acceptable.

In another study of Wang et al. [12] the PDS method was applied for temperature compensating calibration transfer on a NIR instrument. Reflection spectra of an agricultural product were measured at 45 °C to predict analyte A and B. Subsequently, 20 out of the 120 samples were selected and measured at four temperatures (30, 50, 60 and 70 °C). From these 20 spectra, they selected 4 or 9 as transfer samples and standardized all temperatures to 45 °C. The remaining samples were applied as validation samples to obtain the RMSEP as a measure of the calibration transfer performance. Table 5 shows the performance for temperature compensating calibration at four temperatures on both analytes when different numbers of transfer samples are used.

[Figure 3 plot: RMSEP (0 to 0.05) against subset size (3 to 10). Legend: subset recalibration, classical model standardization, inverse model standardization, direct standardization, PDS, PDS with quadratic term, patented method.]


Table 5: RMSEP from temperature standardization.

Analytes                       % A            % B
Conc. range                    36.8 - 39.4    53.2 - 55.9
PLS CV (rank) at 45 °C         0.11 (4)       0.19 (5)
PCR CV (rank) at 45 °C         0.11 (6)       0.19 (7)
Standardization (4 samples)
  30 °C                        0.12           1.71
  50 °C                        0.17           0.95
  60 °C                        0.15           0.65
  70 °C                        0.16           1.11
Standardization (9 samples)
  30 °C                        0.16           0.36
  50 °C                        0.15           0.37
  60 °C                        0.18           0.33
  70 °C                        0.15           0.39

The PLS results listed in Table 5 show that after calibration transfer the accuracy of analyte A is comparable to that of a full recalibration for all temperatures. However, the obtained RMSEP for analyte B is clearly higher. This could mean that the correlating information for this analyte is less pronounced, so that additional transfer samples are required to yield more information. This is also in line with the cross-validation results, as a higher rank is required for analyte B. Furthermore, increasing the number of transfer samples from four to nine did not show a significant improvement in RMSEP for analyte A, while that for analyte B improved in all cases.

In general, better results are obtained with PDS compared to DS. This is due to the local multivariate models generated in PDS, which reduce the risk of overfitting compared to DS. Furthermore, several local multivariate models can model non-linearity effects better than the single model used in DS. Another advantage of the PDS method is that it requires only a few transfer samples to obtain good results. This can be explained by the small local rank of the moving window compared to that of the whole spectrum. However, the practical drawback of this method is that the optimal window size, rank and number of transfer samples for the combination of two PLS models must be carefully determined. In the literature, PDS is one of the most applied methods for calibration transfer and it is often used as a benchmark in comparative studies with novel transfer methods. A version of the PDS transfer method is available in Eigenvector’s PLS_Toolbox for MATLAB.

4.2.4 Double window piecewise direct standardization

The narrow spectral window used in PDS can result in less robust regression coefficients. The PDS method can be extended to a double-window (DW) PDS to include additional spectral data in the regression models. Here, a first window length is defined for the frequencies of the slave instrument and a second window length for those of the master instrument. PLS regression models are then computed as in PDS, whereby a regression coefficients matrix is obtained. A schematic comparison between PDS and DWPDS is illustrated in Figure 4. A version of DWPDS is available in Eigenvector’s PLS_Toolbox for MATLAB.


Figure 4: Schematic comparison between PDS and DWPDS [13].

In a study of Pereira et al. [14], the DWPDS method using 7 transfer samples was applied to transfer NIR calibration models of pharmaceutical powder mixtures to models for intact tablets. Three approaches were compared. First, tablet samples were predicted directly using the calibration model built with powder samples. Second, a hyphenated model was built containing both powder and tablet samples. Third, DWPDS as a calibration transfer method was applied to transfer the powder model to a tablet model. The accuracy for predicting tablet samples in RMSEP and the range of the errors were used to compare the performance of the three approaches, as shown in Table 6. The window size of the DWPDS method was assessed in the range from 7 to 15 variables for powders and from 9 to 21 variables for tablets. The best results were obtained using window sizes of 15 and 11 variables for powders and tablets, respectively.

Table 6: Results for three different approaches for the prediction of intact tablets.

Model             RMSEP (%)   Error range (%)
Powder            4.8         -5.1 to 8.7
Powder + tablet   3.3         -2.8 to 6.3
Powder (DWPDS)    2.6         -4.6 to 3.3

The results in Table 6 show that the use of DWPDS significantly decreased the RMSEP compared to the other two approaches. The accuracy for predicting powdered samples by the powder model was 1.7%. Therefore, the DWPDS method does not outperform a full recalibration. However, considering that only 7 transfer samples were used and that very small amounts of raw materials were used to prepare the tablets, the elimination of a full recalibration by applying a calibration transfer could be favorable.

4.2.5 Orthogonal Signal Correction

Orthogonal signal correction (OSC) is a pre-processing filter applied to spectroscopic data prior to building a calibration model. OSC corrects the X-matrix by subtracting variation that is orthogonal to the Y-matrix. This correction is also applied to new spectra for which the Y-values are going to be predicted by the model. Therefore, it is strictly speaking a pre-processing method and not a calibration transfer method. Nevertheless, the feasibility of applying OSC to remove instrumental differences in the spectra prior to building a calibration model was evaluated by Sjöblom et al. [15].

The strategy of OSC is to reduce the uncorrelated variation between the X-matrix and the Y-matrix. The X-matrix consists of two blocks namely, Xm (master instrument) and Xs (slave instrument). All matrices are first mean centered. Then PCA is applied to the X-matrix and the first PC is identified.


The scores of this PC are made orthogonal to the Y-matrix by rotating the loading. By doing so, the loading represents a response which is not affected by variation in the Y-matrix. After rotation, a PLS model is computed which is able to predict the orthogonal scores in the X-matrix. A certain number of latent variables are selected for the PLS model that capture a user specified level of variance for these scores. Subsequently, the weights, loadings, and predicted scores of this PC are used to remove the orthogonal component from the X-matrix. These corrections are stored and applied to new unknown samples. By removing the first orthogonal component, a deflated X-matrix is obtained. This process can then be repeated for the specified number of components.

Three parameters must be defined when applying OSC as pre-processing, namely the number of components, the number of iterations and the tolerance level. The first parameter defines how many times the process is repeated. The second parameter defines how many times the first PC loading is rotated to be as orthogonal to the Y-matrix as possible. The third parameter defines the acceptable percentage of variance for the orthogonal scores as captured by the PLS model [16].
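The OSC steps described above can be sketched as follows. This is a simplified illustration and not the PLS_Toolbox implementation: the function names `osc_fit` and `osc_apply` are hypothetical, and the inner PLS model (with its tolerance level) is replaced by a truncated least-squares fit, which is an assumption made to keep the example short. The number of components and the number of iterations correspond to the first two parameters mentioned above.

```python
import numpy as np

def osc_fit(X, Y, n_components=1, n_iter=20, rcond=1e-10):
    """Return OSC weights W, loadings P and the corrected, mean-centered X."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    W, P = [], []
    for _ in range(n_components):
        # Start from the scores of the first principal component of X
        U, s, _ = np.linalg.svd(Xc, full_matrices=False)
        t = U[:, 0] * s[0]
        for _ in range(n_iter):
            # Make the scores orthogonal to Y
            t_orth = t - Yc @ np.linalg.lstsq(Yc, t, rcond=None)[0]
            # Weights that reproduce the orthogonal scores from X
            # (assumption: least squares instead of the inner PLS model)
            w = np.linalg.lstsq(Xc, t_orth, rcond=rcond)[0]
            t = Xc @ w
        p = Xc.T @ t / (t @ t)
        Xc = Xc - np.outer(t, p)        # remove the orthogonal component
        W.append(w)
        P.append(p)
    return np.array(W).T, np.array(P).T, Xc

def osc_apply(X_new, x_mean, W, P):
    """Apply the stored corrections to new spectra, one component at a time."""
    Xc = X_new - x_mean
    for w, p in zip(W.T, P.T):
        Xc = Xc - np.outer(Xc @ w, p)
    return Xc

# Synthetic example with more wavelengths than samples
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 30))
Y = rng.normal(size=(8, 1))
W, P, X_osc = osc_fit(X, Y, n_components=1)
removed_scores = (X - X.mean(axis=0)) @ W[:, 0]
```

The removed scores are orthogonal to the centered Y-values, so the subtracted component carries no information that is linearly related to Y.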

In the study of Sjöblom et al. [15], OSC was used to transform the X-block such that instrumental differences between the master and slave instrument were removed. A PLS calibration model was then built using the transformed spectra of the master instrument to predict the water content in a pharmaceutical drug, as shown in Figure 5. Subsequently, a prediction set was measured on both instruments, transformed and predicted by the model. The results in RMSEP and bias as shown in Figure 6 were compared with several pre-processing methods and with the PDS method.

Figure 5: Schematic overview of the OSC workflow.

Figure 6: RMSEP (striped) and bias (black) for different correction methods. Models on master instrument using training set. Predictions for master and slave instrument using prediction set. With uncorrected and without spectral centering (UC 1), with local centering (UC 2), with global centering (OSC 1) and with local centering (OSC 2). The bias bars for Der+OSC and UC 2 are too small on this scale.


The results show that local centering of the data sets yields good results compared to more advanced pre-processing and transfer methods. Most likely, both instruments had a similar response, but the spectra were different due to wavelength shift. The methods OSC and SNV+PDS show good results without local centering. Moreover, OSC gives similar results for both global and local centering of the data. Therefore, it is concluded that OSC can correct for a linear wavelength shift.

A version of OSC is available in Eigenvector’s PLS_Toolbox for MATLAB both as pre-processing filter and as calibration transfer method. The main disadvantage of this method is that the Y-values of the transfer set must be available. Furthermore, it has some practical drawbacks as the number of components, iterations and the tolerance level must be carefully determined.

4.2.6 Generalized least squares weighting

Generalized least squares weighting (GLSW) is a pre-processing filter similar to OSC as it attempts to remove the difference between instruments or similar samples rather than making them alike. It identifies patterns in the X-block which should be down-weighted or removed. These are interfering signals referred to as clutter. The GLSW filter applies a “soft” orthogonalization to this clutter in the data prior to building a model. In the case of calibration transfer, these similar samples are the same samples measured on both instruments, which therefore have the same reference values. The reference values of the samples are not required. By applying GLSW filtering to the standardization set, the differences in instrumental response between the two instruments are down-weighted. Finally, a new calibration model is built using the filtered data, which can then be implemented on both instruments. The filtering must be applied to newly measured spectra prior to prediction.

The GLSW algorithm applies PCA to map and down-weight the variance of the clutter in the data. A clutter threshold (α) must be specified which determines the threshold for clutter distance to which down-weighting is applied. Decreasing the threshold from 0.1 to 0.001 will result in more variance being identified as clutter, therefore more down-weighting being applied. A typical threshold to start with is 0.02. An alternative and simplified version of this method applies a “hard” orthogonalization termed as External Parameter Orthogonalization (EPO) [16].

Applying the GLSW algorithm for calibration transfer starts by calculating the difference between the mean-centered transfer samples measured on the master and slave instrument given by Eq. 19-21.

$X^m_{mc,std} = X^m_{std} - \bar{X}^m_{std}$ (19)

$X^s_{mc,std} = X^s_{std} - \bar{X}^s_{std}$ (20)

$\Delta X = X^s_{mc,std} - X^m_{mc,std}$ (21)

The covariance matrix 𝐶 is then calculated given by Eq. 22. Singular-value decomposition is then applied to this matrix which yields the eigenvectors 𝑉, and the diagonal matrix 𝑆 given by Eq. 23.

$C = \Delta X^T \cdot \Delta X$ (22)

$C = V \cdot S^2 \cdot V^T$ (23)

Next, the weighted singular values are calculated given by Eq. 24, where $1_D$ is a diagonal matrix with ones and $\alpha$ is the weighting parameter. Finally, the inverse of the weighted eigenvalues is used to obtain the filtering matrix given by Eq. 25.

$D = \sqrt{\frac{S^2}{\alpha} + 1_D}$ (24)

$G = V \cdot D^{-1} \cdot V^T$ (25)
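Eq. 19 to 25 translate almost line by line into NumPy. The sketch below is illustrative only: the function name `glsw_filter` and the synthetic data are hypothetical, and an eigendecomposition of the symmetric matrix $C$ is used in place of an explicit singular-value decomposition, which yields the same $V$ and $S^2$.

```python
import numpy as np

def glsw_filter(X_std_master, X_std_slave, alpha=0.02):
    """Filtering matrix G built from transfer samples measured on both instruments."""
    Xm = X_std_master - X_std_master.mean(axis=0)    # Eq. 19
    Xs = X_std_slave - X_std_slave.mean(axis=0)      # Eq. 20
    dX = Xs - Xm                                     # Eq. 21
    C = dX.T @ dX                                    # Eq. 22
    # C is symmetric, so its eigendecomposition gives C = V S^2 V^T (Eq. 23)
    s2, V = np.linalg.eigh(C)
    s2 = np.clip(s2, 0.0, None)      # guard against tiny negative round-off
    d = np.sqrt(s2 / alpha + 1.0)    # Eq. 24: diagonal of D
    return (V / d) @ V.T             # Eq. 25: G = V D^-1 V^T

# Synthetic transfer set: the slave differs from the master in one channel only
rng = np.random.default_rng(2)
X_master = rng.normal(size=(5, 12))
clutter = np.zeros(12)
clutter[3] = 1.0
X_slave = X_master + np.outer(rng.normal(size=5), clutter)

G = glsw_filter(X_master, X_slave, alpha=0.02)
X_filtered = X_slave @ G             # the filter is applied to spectra as X @ G
```

Directions without clutter variance have a weight of one and pass through the filter unchanged, whereas the clutter direction is shrunk. A smaller $\alpha$ shrinks it more strongly.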

New spectra measured on the slave instrument are projected into the filtering matrix. This projection down-weights the correlations in the original covariance matrix. In a study by Martens et al. [17], GLSW was compared to OSC and PDS for transferring NIR calibration models of corn between 3 instruments using 5 transfer samples. The results in RMSECV, averaged over the instruments, are summarized in Table 7.

Table 7: Predictive performance of moisture, oil, protein and starch for different transfer methods.

Method               Moisture   Oil    Protein   Starch
Full recalibration   0.11       0.07   0.16      0.34
Unstandardized       0.97       0.20   0.88      1.97
PDS                  0.27       0.09   0.17      0.38
GLSW                 0.15       0.07   0.15      0.29
OSC                  0.16       0.07   0.16      0.32

The results show that the GLSW and OSC methods work very well for the corn data set, with their results being similar to a full recalibration for all 4 constituents. The results obtained with PDS are slightly higher compared to a full recalibration. However, only moisture and perhaps starch can be considered as significantly higher in this case.

A version of GLSW is available in Eigenvector’s PLS_Toolbox for MATLAB both as a pre-processing filter and as a calibration transfer method. A possible issue with this method is that it can affect the net analyte signal. Thus, it is recommended to evaluate a number of different clutter threshold values using cross-validation. The advantages of this method are that the Y-values of the transfer set are not required and that only one parameter is adjustable.

4.2.7 Spectral space transformation

Spectral space transformation (SST) aims to eliminate the spectral differences between instruments or measurement conditions with a transformation matrix. The transformation matrix is obtained by the spectral space regression of the transfer samples measured on the master instrument against the slave instrument by means of singular value decomposition.

The method starts by combining the spectral matrices of the transfer samples measured on the master and slave instrument given by Eq 26. The general equation for a singular value decomposition of a matrix is given by Eq 27.

$X_{comb} = [X^{master}_{std}, X^{slave}_{std}]$ (26)

$A = U D V^T$ (27)

Where $U$ contains the left singular vectors, $D$ is a diagonal matrix with the singular values and $V^T$ is the transposed matrix of right singular vectors. This can be expressed by Eq 28 in more detail for the combined matrix.

$X_{comb} = [U_s, U_n] \begin{bmatrix} \Sigma_s & 0 \\ 0 & \Sigma_n \end{bmatrix} [V_s, V_n]^T = T_s P_s^T + E = T_s [P^T_{master}, P^T_{slave}] + E$ (28)

Where $U$ is the singular vector matrix, $T_s = U_s \Sigma_s$, $P_s = V_s$ and $E = U_n \Sigma_n V_n^T$. The subscripts “s” and “n” represent spectral information and noise, respectively. According to the Beer-Lambert law, $X_{comb}$ can also be given by Eq 29.

$X_{comb} = [X_{master}, X_{slave}] = C [S^T_{master}, S^T_{slave}] + E$ (29)

Where $C$ is the concentration matrix. Each row in this matrix contains the concentrations of all the chemical components. $S_{master}$ and $S_{slave}$ are the pure spectral matrices, in which each column contains the pure spectrum of a chemical component. Combining Eq 28 with Eq 29 yields Eq 30 and Eq 31.

$C S^T_{master} = T_s P^T_{master}, \quad C S^T_{slave} = T_s P^T_{slave}$ (30)

$S^T_{master} = (C^T C)^{-1} C^T T_s P^T_{master} = R P^T_{master}, \quad S^T_{slave} = (C^T C)^{-1} C^T T_s P^T_{slave} = R P^T_{slave}$ (31)

Where $R$ represents a full rank square matrix. For a new sample measured on the slave instrument, the concentration vector $C_{new}$ of the chemical components in the new sample is estimated as $\hat{C}_{new} = X_{new} (S^T_{slave})^+$ (“+” is the generalized inverse). Its corresponding transformed spectrum $X_{trans}$ can be expressed by Eq 32 and is calculated by Eq 33.

$X_{new} - X_{trans} = \hat{C}_{new} (S^T_{slave} - S^T_{master})$ (32)

$X_{trans} = \hat{C}_{new} S^T_{master} + X_{new} - \hat{C}_{new} S^T_{slave} = X_{new} (S^T_{slave})^+ S^T_{master} + X_{new} - X_{new} (S^T_{slave})^+ S^T_{slave}$ (33)

Finally, substituting Eq 31 into Eq 33 yields Eq 34 [17].

$X_{trans} = X_{new} (P^T_{slave})^+ P^T_{master} + X_{new} - X_{new} (P^T_{slave})^+ P^T_{slave}$ (34)
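Eq. 34 can be written as a single transformation matrix that is applied to every new slave spectrum. The following sketch illustrates this under stated assumptions: the function name `sst_transfer_matrix` and the noise-free synthetic data are hypothetical, and both instruments are assumed to share the same number of wavelengths, which Eq. 34 requires.

```python
import numpy as np

def sst_transfer_matrix(X_std_master, X_std_slave, n_components):
    """Matrix F such that X_trans = X_new @ F (Eq. 34)."""
    p = X_std_master.shape[1]
    X_comb = np.hstack([X_std_master, X_std_slave])          # Eq. 26
    _, _, Vt = np.linalg.svd(X_comb, full_matrices=False)    # Eq. 27
    Ps_T = Vt[:n_components]         # P_s^T: leading right singular vectors
    P_master_T = Ps_T[:, :p]         # part belonging to the master block
    P_slave_T = Ps_T[:, p:]          # part belonging to the slave block
    P_slave_pinv = np.linalg.pinv(P_slave_T)                 # generalized inverse
    return P_slave_pinv @ P_master_T + np.eye(p) - P_slave_pinv @ P_slave_T

# Noise-free Beer-Lambert data with three components and 40 wavelengths
rng = np.random.default_rng(3)
S_master_T = rng.uniform(size=(3, 40))    # rows are pure spectra (master)
S_slave_T = rng.uniform(size=(3, 40))     # rows are pure spectra (slave)
C_std = rng.uniform(size=(6, 3))          # concentrations of 6 transfer samples
F = sst_transfer_matrix(C_std @ S_master_T, C_std @ S_slave_T, n_components=3)

C_new = rng.uniform(size=(4, 3))
X_new = C_new @ S_slave_T                 # new samples measured on the slave
X_trans = X_new @ F                       # transformed to the master domain
```

For noise-free data and a number of components equal to the chemical rank, the transformed slave spectra coincide with the spectra that the master instrument would have measured.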

In a study by Du et al. [18], the calibration transfer performance of a pharmaceutical tablet model by SST was compared to PDS, SBC and global PLS. A NIR PLS calibration model was built to predict the active ingredient concentration in tablets using 151 samples. The optimal model was selected using the minimal RMSEP value as obtained for the prediction set containing 451 samples. The first sample of the standardization set was randomly selected from the calibration set. Sequentially, a new sample was added to the standardization set containing the most spectral information not already included by the current selection of transfer samples. The effect of the number of transfer samples on the obtained accuracy for the four transfer methods is shown in Figure 7. Table 8 lists the lowest RMSEP values obtained on the slave instrument for each transfer method.


Figure 7: Predictive performance against number of transfer samples.

Table 8: RMSEP values (mg) of the API concentration in the tablet samples obtained on the slave instrument for each transfer method.

Method               RMSEP
No transfer          22.5
Full recalibration   2.7
SBC                  6.6
Global PLS           4.1
PDS                  3.2
SST                  2.9

These results show that, in general, the RMSEP decreases with an increasing number of transfer samples until a certain threshold has been reached. Including additional transfer samples beyond this threshold represents unnecessary work. This threshold is reached with 4 transfer samples for SST and 5 or 6 transfer samples for the other methods. For this data set, the performance of SST was better compared to the other methods. Moreover, it is comparable to a full recalibration.

PDS provides acceptable results as well. However, SST has some advantages over PDS. Both methods share a common parameter, which is the number of PCs used to compute the transformation matrix. The number of selected PCs is more critical in PDS than in SST. There is a certain optimal number of PCs, and including additional PCs beyond this optimum decreases the performance of PDS. In contrast, the performance of SST does not decrease with additional PCs. Furthermore, the window size has to be determined in PDS as well. Determining the optimal window size requires more effort and is difficult to achieve when only a few transfer samples are available. Therefore, it is easier to apply the SST method compared to PDS.

4.2.8 Wavelet Packet Transform Standardization

Wavelet packet transform standardization (WPTS) transforms the spectra from the wavelength domain to the wavelet domain. This is an exotic standardization method and not extensively reported in the literature. Therefore, it will be discussed briefly.

The wavelet packet transform is an abstract form of discrete wavelet transform (DWT) [19-20]. DWT decomposes a signal into a group of elementary signals, referred to as wavelets. Each wavelet has a defined frequency and time. Therefore, the signal can be analyzed in two dimensions (time and frequency domain), which provides information on the evolution of each frequency over time.

In a study by Tan et al. [21], the WPTS method was applied to transfer spectra of corn samples between two NIR spectrometers. The calibration transfer performance of WPTS was compared to other standardization methods. The obtained results in RMSEP for WPTS and PDS are given in Table 9. These results show that WPTS clearly outperforms the PDS method for this dataset.


Table 9: Comparison of predictive performance between different calibration transfer methods.

Method             Moisture     Oil           Protein       Starch
Without transfer   0.52         0.48          0.6           0.91
PDS (S/W)          0.4 (31/3)   0.16 (11/3)   0.51 (25/3)   0.74 (8/3)
WPTS (S)           0.21 (25)    0.1 (30)      0.14 (30)     0.43 (25)

S, W, C: Subset size, window size and number of coefficients correspond to the optimal RMSEP, respectively.

4.3 Methods for model correction

The strategy of these methods is to correct the existing calibration model before it is transferred from the master instrument to the slave instrument. Therefore, the spectra as measured on the slave instrument are predicted directly.

4.3.1 Classical calibration model

The classical calibration model (CCM) method was introduced by Wang et al. [3]. This method assumes that there is a linear relationship between the $Y$ matrix containing reference values (y-values) and both the $X_m$ and $X_s$ matrices containing the spectra collected with the master and slave instrument. If all absorbing contributions are known, a calibration model can be made according to the Beer-Lambert law given by Eq. 35 and 36.

$X_m = Y \cdot K_m + e_m$ (35)

$X_s = Y \cdot K_s + e_s = Y \cdot (K_m + \Delta K) + e_s$ (36)

Where $K_m$ and $K_s$ are sensitivity matrices for each instrument in which the rows represent pure component spectra. The matrix $\Delta K$ is the difference between them. This relationship should also hold for the standardization set given by Eq. 37 and 38.

$X^m_{std} = Y_{std} \cdot K_m$ (37)

$X^s_{std} = Y_{std} \cdot K_s = Y_{std} \cdot (K_m + \Delta K)$ (38)

By solving Eq. 37 and 38 for $\Delta K$, Eq. 39 is obtained, where $Y_{std}^+$ is the generalized inverse of $Y_{std}$. Substituting $\Delta K$ into Eq. 36 and using Eq. 35, $\hat{X}_s$ is estimated by Eq. 40.

$\Delta K = Y_{std}^+ \cdot (X^s_{std} - X^m_{std})$ (39)

$\hat{X}_s = X_m + Y \cdot Y_{std}^+ \cdot (X^s_{std} - X^m_{std})$ (40)

Subsequently, a new multivariate calibration model is built for the slave instrument using $\hat{X}_s$ and $Y$. This method therefore transfers the calibration spectra of the master instrument, and a new model for the slave instrument is then built using these transferred calibration spectra. This is a reversed calibration transfer methodology.

The disadvantages of this method are that some effort is again required to develop the new calibration model. Furthermore, the concentrations of all components in the transfer samples must be known.
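As an illustration of Eq. 39 and 40, the transfer of the master calibration spectra can be sketched as below. The function name `ccm_transfer_spectra` and the noise-free synthetic data are hypothetical and only serve to show the matrix operations involved.

```python
import numpy as np

def ccm_transfer_spectra(X_master, Y, X_std_master, X_std_slave, Y_std):
    """Estimate the slave calibration spectra from the master spectra."""
    delta_K = np.linalg.pinv(Y_std) @ (X_std_slave - X_std_master)   # Eq. 39
    return X_master + Y @ delta_K                                    # Eq. 40

# Noise-free example: 30 calibration samples, 3 components, 25 wavelengths
rng = np.random.default_rng(4)
Y = rng.uniform(size=(30, 3))                        # all concentrations known
K_master = rng.uniform(size=(3, 25))
K_slave = K_master + 0.1 * rng.normal(size=(3, 25))
X_master = Y @ K_master

std = [0, 7, 13, 21, 28]                             # indices of transfer samples
X_slave_hat = ccm_transfer_spectra(
    X_master, Y, X_master[std], Y[std] @ K_slave, Y[std]
)
```

A new calibration model for the slave instrument would then be fitted on `X_slave_hat` and `Y`. The example also makes the main requirement visible: the concentrations of all components must be available in `Y` and `Y_std`.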


4.3.2 Inverse calibration model

The inverse calibration model (ICM) introduced by Wang et al. [3] is given by Eq. 41 and 42, where $b_m$ and $b_s$ are regression vectors for a compound on each instrument.

$Y = X_m \cdot b_m$ (41)

$Y = X_s \cdot b_s = X_s \cdot (b_m + \Delta b)$ (42)

The same relationship can be given for the standardization set by Eq. 43 and 44, where $b^m_{std}$ and $b^s_{std}$ are the regression vectors obtained from the standardization set.

$Y_{std} = X^m_{std} \cdot b^m_{std}$ (43)

$Y_{std} = X^s_{std} \cdot b^s_{std} = X^s_{std} \cdot (b^m_{std} + \Delta b)$ (44)

By calculating $b_m$ from Eq. 41 and estimating $\Delta b$ by combining Eq. 43 with Eq. 44, a standardized regression vector is obtained given by Eq. 45.

$\hat{b}_s = b_m + \Delta b = b_m + (b^s_{std} - b^m_{std}) = X_m^+ \cdot Y + (X^{s+}_{std} - X^{m+}_{std}) \cdot Y_{std}$ (45)

With $\hat{b}_s$, which is an estimate of $b_s$, the concentration of new spectra measured on the slave instrument can be predicted. In contrast to calibration transfer with CCM, this method does a direct calibration model transfer. Moreover, only the concentration of the compound for which the model was developed is required. However, the performance of this method is often disappointing (see Figure 3), due to additional error introduced by estimating the standardized regression vector.
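Eq. 45 can be evaluated directly with generalized inverses, as in the sketch below. The function name `icm_regression_vector` and the synthetic data are hypothetical. In practice the generalized inverses would typically be computed through PCR or PLS; the plain pseudo-inverse used here is a simplifying assumption.

```python
import numpy as np

def icm_regression_vector(X_master, y, X_std_master, X_std_slave, y_std):
    """Standardized regression vector for the slave instrument (Eq. 45)."""
    b_master = np.linalg.pinv(X_master) @ y
    delta_b = (np.linalg.pinv(X_std_slave) - np.linalg.pinv(X_std_master)) @ y_std
    return b_master + delta_b

# Limiting case: every calibration sample is also used as a transfer sample
rng = np.random.default_rng(5)
X_master = rng.normal(size=(15, 6))
X_slave = 1.1 * X_master + 0.05 * rng.normal(size=(15, 6))
y = rng.normal(size=15)

b_slave_hat = icm_regression_vector(X_master, y, X_master, X_slave, y)
y_pred_slave = X_slave @ b_slave_hat      # prediction of slave spectra
```

In this limiting case the master terms cancel and the result equals a direct regression on the slave spectra. With only a few transfer samples, both regression vectors of the standardization set are poorly determined, which is consistent with the additional error mentioned above.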

4.3.3 Reverse (piecewise) direct standardization

The DS and PDS methods as discussed in the previous chapters can be used in the reversed direction as well. In reverse standardization the spectra measured on the master instrument are transformed to match their corresponding spectra on the slave instrument. Subsequently, a new calibration model is built with the transformed spectra to predict new measurements on the slave instrument. This procedure does require some additional effort to rebuild the new model. However, this allows the detection of outliers in the data and the selection of variables can be optimized for the slave instrument [9].

A comparative study of calibration transfer between two NIR instruments for the determination of naphthalenes and the Research Octane Number (RON) in fuel was performed by Pereira et al. [22]. Seven transfer methods were compared: DS, PDS, OSC, reverse standardization (RS), piecewise reverse standardization (PRS), SBC and MU (model updating, see section 4.4.3). The effect of two pre-processing methods was also investigated, namely standard normal variate (SNV) and multiplicative scatter correction (MSC). The master instrument was identical to the “S1” instrument, having a 1 mm path length flow cell, while the “S2” instrument was from a different vendor, was coupled via fiber optics and utilized a smaller path length. The number of transfer samples (TS) and the window size in the applicable techniques were carefully determined. The optimized results of each transfer method for the prediction of naphthalenes and RON are listed in Table 10.


Table 10: Results of different transfer methods for the prediction of naphthalenes and RON.

                          Naphthalenes                        RON
                          Instrument S1     Instrument S2     Instrument S1     Instrument S2
Method                    RMSEP       TS    RMSEP       TS    RMSEP       TS    RMSEP       TS
No calibration transfer   0.66        -     6.0         -     0.40        -     5.6         -
Complete recalibration    0.28        -     0.28        -     0.28        -     0.28        -
DS                        0.40        13    0.50        15    0.29ᵃ       12    0.27ᵃ       12
PDS                       0.43 (1)    10    0.32ᵃ (1)   12    0.26ᵃ (1)   8     0.24ᵃ (1)   11
OSC                       0.42 (3)    9     0.56 (4)    10    0.37ᵃ (5)   7     0.52 (3)    5
RS                        0.31ᵃ       15    0.34ᵃ       15    0.27ᵃ       15    0.25ᵃ       12
PRS                       0.46 (7)    14    0.65 (1)    6     0.34ᵃ (3)   10    0.27ᵃ (3)   10
SBC                       0.52        5     1.03        14    0.57        5     0.94        7
MU                        0.34ᵃ       20    0.52        20    0.31ᵃ       7     0.30ᵃ       5
MSC                       1.15        -     1.76        -     0.58        -     0.37ᵃ       -
SNV                       0.73        -     3.03        -     0.42        -     0.76        -

ᵃ These results are comparable to a full recalibration according to an F-test at a confidence level of 95%. The window size or number of OSC components is given in parentheses.

The results show that the best performance was obtained by reverse standardization for this data set. In fact, it was the only transfer method that yielded an accuracy comparable to a full recalibration in all four cases. Model updating (MU), which will be discussed in section 4.4.3, provided good results as well.

4.4 Standard-free calibration transfer methods

In this section several standard-free calibration transfer methods are discussed. These methods require corrections prior to implementation on a slave instrument. However, the methods do not require any transfer samples.

4.4.1 Multiplicative signal correction

Multiplicative signal correction, also known as multiplicative scatter correction (MSC), is a pre-processing method applied to correct light scattering losses in NIR reflectance spectroscopy. The effects of both scaling and offset are corrected. This is done by regressing each spectrum against a reference spectrum. The mean of the calibration spectra is often used as the reference spectrum. The intercept and the slope of the obtained regression line are used to correct newly measured spectra. MSC can be used as a calibration transfer method as well. The mean of the calibration samples measured on the master is used as the reference spectrum and the linear regression procedure is applied to the spectra of the slave instrument. No transfer samples are measured on the master instrument. Therefore, MSC can be classified as a standard-free calibration transfer method. Due to correction using a mean spectrum, this method works particularly well for similar looking spectra and linear concentration differences between the samples. The method assumes that the baseline and slope differences are constant for all wavelengths, which is not the case for most spectroscopic applications. Therefore, it offers minimal improvements and it can even provide worse transfer results compared to no calibration transfer, as shown in the study discussed in Section 4.2.5 [23-24].
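The regression step of MSC used as a transfer method can be sketched as follows. The function name `msc_transfer` and the synthetic spectra are hypothetical. The reference is assumed to be the mean calibration spectrum of the master instrument, as described above.

```python
import numpy as np

def msc_transfer(X_slave, reference):
    """Correct each slave spectrum for offset and scaling against the reference."""
    corrected = np.empty_like(X_slave, dtype=float)
    for i, spectrum in enumerate(X_slave):
        # Fit spectrum = intercept + slope * reference over all wavelengths
        slope, intercept = np.polyfit(reference, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected

# Synthetic example: slave spectra differ from the master by offset and gain
rng = np.random.default_rng(6)
X_cal_master = rng.uniform(0.2, 1.0, size=(12, 50))
reference = X_cal_master.mean(axis=0)
offsets = np.array([[0.10], [0.30], [-0.05]])
gains = np.array([[1.5], [0.8], [1.2]])
X_slave = offsets + gains * reference
X_slave_corrected = msc_transfer(X_slave, reference)
```

The correction is exact here because the offset and gain are the same at every wavelength. When the baseline or slope differences vary with wavelength, a single regression line per spectrum cannot capture them, which is the limitation noted above.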


4.4.2 Moving window multiplicative signal correction

An alternative version of MSC is termed moving window multiplicative signal correction (MW-MSC). This is a moving window variation of MSC, applied for a fixed window size. By doing so the baseline and slope correction factors are computed for each spectral window independently. In a comparison study by Kramer et al. [25], the calibration transfer of a jet fuel model between two NIR instruments was investigated using MW-MSC and MSC. A two-step procedure was applied to obtain an optimized window size. A cut-off window was assigned to prevent excessive processing followed by window size selection based on high leverage. Following this procedure, the optimal window size was determined at 441 points.

The two-step procedure is quite laborious compared to most methods discussed so far, as the 11 page article is mostly devoted to the procedure and discussion of the window size. Furthermore, the authors observed a saturated spectral region in the spectra of the master instrument. However, they opted to include this region in the window size selection procedure and compared the transfer results directly with non-saturated wavelength regions. This practice is questionable and therefore this review will focus on the most relevant results as listed in Table 11.

Table 11: Prediction results for D45 using different preprocessing. With (A) for master and (B) for slave.

Preprocessing method   No. of LVs   D45_A SEC   D45_A SEP   D45_B SEP
None                   4            0.0027      0.0034      0.52
MSC                    6            0.0020      0.0022      0.0110
MW-MSC                 6            0.0021      0.0022      0.0024

The results show that both MSC and MW-MSC yield better transfer results compared to no preprocessing. Furthermore, the obtained transfer result for MW-MSC is comparable to the calibration result. However, the calibration range is very small (0.79 – 0.82 g/cm³); therefore the applicability of this method for a larger calibration range is not proven. Moreover, the calibration set was also used to validate the calibration transfer, which is a questionable way of validation.
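The moving window idea can be sketched as a local version of the MSC regression. This is a simplified illustration and not the procedure of Kramer et al.: the function name `mw_msc_transfer` is hypothetical, the window is assumed to be centered on each wavelength and truncated at the spectral edges, and the two-step window size selection is not reproduced.

```python
import numpy as np

def mw_msc_transfer(X_slave, reference, window):
    """MSC with slope and intercept estimated in a window around each wavelength."""
    n_samples, n_wavelengths = X_slave.shape
    half = window // 2
    corrected = np.empty((n_samples, n_wavelengths))
    for i in range(n_wavelengths):
        lo = max(0, i - half)
        hi = min(n_wavelengths, i + half + 1)
        r = reference[lo:hi]
        A = np.column_stack([r, np.ones_like(r)])
        # One local regression per sample, solved for all samples at once
        coef, *_ = np.linalg.lstsq(A, X_slave[:, lo:hi].T, rcond=None)
        slope, intercept = coef
        corrected[:, i] = (X_slave[:, i] - intercept) / slope
    return corrected

# Synthetic example with a constant offset and gain per sample
rng = np.random.default_rng(7)
reference = rng.uniform(0.2, 1.0, size=60)
offsets = np.array([[0.10], [0.30]])
gains = np.array([[1.5], [0.8]])
X_slave = offsets + gains * reference
X_mw = mw_msc_transfer(X_slave, reference, window=5)
```

For a constant offset and gain the local fits reduce to ordinary MSC. The benefit of the window only appears when the baseline or slope differences change along the wavelength axis, at the cost of having to choose a window size.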

4.4.3 Model updating

Model updating (MUP) is often used to improve the calibration models of a single instrument by including additional calibration samples. These samples reflect new variation introduced by changes in sample composition or measuring conditions (e.g. temperature, pH). MUP can also be applied to add instrumental variation to the model by including spectra of samples measured on the slave instrument. Thus, the model is updated with instrumental variation. The updated model can then be applied on both instruments. By including additional variation, the complexity of the updated model increases, requiring more latent variables. Strictly speaking, it is not a standard-free calibration transfer method, and it can also be categorized as an alternative to calibration transfer. The newly included spectra measured on the slave instrument can be considered as transfer samples. However, unlike other methods discussed so far, these new samples are not necessarily identical samples that are measured on both instruments. Although the principle of MUP is simple, it is not always straightforward. There is a certain optimum in the number of update samples and their placement in the original model space, which vary with the number of samples in the initial calibration model. Model updating is well suited and perhaps the only alternative in applications where the samples are unstable or are impractical to store or transport [26].