Sensors & Actuators: B. Chemical 350 (2022) 130769
Available online 28 September 2021
0925-4005/© 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Global calibration models for temperature-modulated metal oxide gas 
sensors: A strategy to reduce calibration costs 
Albert Miquel-Ibarz a, Javier Burgues a, Santiago Marco a,b,* 
a Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028 Barcelona, Spain 
b Department of Electronics and Biomedical Engineering, University of Barcelona, Marti i Franques 1, 08028 Barcelona, Spain  
ABSTRACT
Tolerances in the fabrication of metal oxide (MOX) gas sensors lead to inter-device variability in baseline and sensitivity, even for sensors of the same fabrication
batch. This has traditionally forced the use of individual calibration models (ICMs) built specifically for each sensor unit, which requires an expensive and
time-consuming calibration process and hinders sensor replacement. We propose global calibration models (GCMs) built using the responses of multiple sensor units
and then applied to a new sensor unit that is not part of the calibration set. GCMs have already been successfully applied to transfer calibration models between sensor
arrays (electronic noses) for classification tasks. In this work, we investigate the use of such models for regression purposes in temperature-modulated sensors, aiming
at the quantification of low concentrations of carbon monoxide (CO) in the presence of variable humidity levels (20–80% r.h. at 26 ± 1 °C). Using a laboratory
dataset containing data from 6 replicas of the FIS SB-500-12 model, we evaluate the performance of global models built with data from 1 to 4 sensors when applied to
unseen sensor units. Results show that the performance of global models improves with an increasing number of sensors in the calibration set, approaching the
performance of individual calibration models (1.38 ± 0.15 ppm for GCM; 1.05 ± 0.24 ppm for ICM), and surpassing their performance only if few calibration
conditions per sensor are available (2.09 ± 0.10 ppm for GCM; 2.76 ± 0.22 ppm for ICM, if only 5 samples per sensor are used).
1. Introduction 
Metal oxide semiconductor (MOX or MOS) gas sensors are one of the 
most successful low-cost technologies in the market for measuring gases 
and volatile organic compounds (VOCs) at the parts-per-million (ppm) 
and sub-ppm levels. They are used in application fields such as envi-
ronmental monitoring, indoor air quality, industrial safety, automotive 
exhaust, and in-cabin air quality monitoring, residential alarms, and 
biomedicine, among others. MOX sensors are also often used in elec-
tronic noses (e-noses) or sensor arrays to quantify odor intensity and 
classify odor types [1]. Despite being broadly used in so many fields,
MOX sensors generally display certain limitations such as lack of
selectivity [2], non-linear response [3], cross-sensitivity to
environmental conditions [4,5], high power consumption [6], slow
response time [7,8], and inter-device variability [9,10]. The sensor
metrological performance can be improved by a combination of new
developments in sensor technology [11], innovative operation principles
[12–15], proper calibration methods [16] and, last but not least, signal
processing and machine learning [17].
In particular, the inter-device variability in baseline and sensitivity 
due to tolerances of the fabrication process has forced the use of 
individual calibration models, i.e., models tailored for each specific 
sensor unit, that require costly and time-consuming calibration
campaigns. Although some authors have presented methods to reduce the
calibration costs [18], the need for individual calibration hinders the
use of chemical sensors in mass applications, especially because faulty
sensors cannot be directly replaced [19,20]. We argue that the difficulty
of transferring calibration models to different (otherwise identical)
units is probably the highest barrier to the large-scale deployment of
not only temperature-modulated MOX sensors, but also sensor arrays and
electrochemical sensors, which also exhibit high inter-device variability.
When a calibration model trained on one device is applied to other
identical devices, there is an unacceptable degradation of performance.
To facilitate the application of predictive models trained with a
master instrument to other slave devices, the field of calibration
transfer has flourished. This field originally developed for
spectroscopic instrumentation [21], and it was first applied to electronic
noses by Balaban et al. [22]. They extended the calibration model of one
sensor array to an uncalibrated replica by defining mapping structures
(intercept-only functions, linear regression, and transfer matrices)
between the response spaces of the two units (master and slave). Since then,
many different calibration transfer strategies have been proposed to
* Corresponding author at: Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028 
Barcelona, Spain 
E-mail address: smarco@ibecbarcelona.eu (S. Marco).  
Contents lists available at ScienceDirect 
Sensors and Actuators: B. Chemical 
journal homepage: www.elsevier.com/locate/snb 
https://doi.org/10.1016/j.snb.2021.130769 
Received 26 May 2021; Received in revised form 13 September 2021; Accepted 14 September 2021   
extend the individual calibration models to uncalibrated replicas of a 
sensing instrument. The basic idea behind any calibration transfer 
strategy is to (1) build an individual calibration model with a set of 
calibration samples from a reference instrument (the master), (2) collect 
smaller sets of samples with uncalibrated replicas of the same instru-
ment (the slaves), (3) find possible transfer functions to map the re-
sponses of the slave and the master systems and (4) use the mapping of 
the slave responses to transfer the calibration model between replicas. In 
most cases the mapping structure has proven to be simple. 
The transformation of data from the slave device to match the data of
the master has been addressed with many different methods, including
direct standardization (DS) and piecewise direct standardization (PDS)
[23], but also regression algorithms such as artificial neural networks or
partial least squares [22]. For instance, Zhang et al. [24] found that
different e-nose systems are related by homogeneous linear functions
and performed a linear calibration transfer using six twin e-noses, with
four MOX sensors (of different models) each. Global affine
transformation (GAT) by robust weighted least squares (RWLS) fitting was
used to map slave units to the master, while a Kennard-Stone sequential
(KSS) algorithm [25] was applied to select an optimum subset of
representative samples. Artificial neural networks (ANNs) were used to
predict the gas concentrations based on the sensor responses; however, this
was only validated with calibration samples. One drawback of this
methodology is that it requires all sensors to be placed together in the
same controlled chamber, which might be impractical in realistic
scenarios. Fonollosa et al. [26] studied the calibration transfer between
MOX sensor arrays, each composed of 8 sensors, in a laboratory
experiment including several gas compounds and sensor drift. In their
work, they trained one sensing system (master) with a non-linear
multiclass regression model to classify the different gases and their
concentrations. Then, they used direct standardization (DS) to map the
samples from uncalibrated replicas (slaves) to the reference space of the
master device.
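As an illustration of the mapping step, direct standardization can be sketched as a least-squares fit between paired transfer samples. The snippet below is a minimal sketch on synthetic data, not the procedure of any of the cited works: the array sizes, the simulated distortion `true_map` and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical transfer set: the same 20 conditions measured by both units,
# each response described by 8 features (e.g. one per sensor in the array).
X_master = rng.normal(size=(20, 8))

# Simulated slave unit: a linear distortion of the master responses (assumption).
true_map = np.eye(8) + 0.1 * rng.normal(size=(8, 8))
X_slave = X_master @ true_map

# Direct standardization: least-squares matrix F such that X_slave @ F ~ X_master.
F = np.linalg.pinv(X_slave) @ X_master

# New slave measurements are mapped into the master space before the
# master calibration model is applied to them.
X_slave_new = rng.normal(size=(5, 8)) @ true_map
X_mapped = X_slave_new @ F
```

In this noise-free toy case the mapping is recovered exactly; with real sensors the fit is only approximate and depends on how representative the transfer samples are.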
In temperature modulated MOX sensors, tolerances in the heater 
resistance induce temperature shifts in the hotplate when excited at 
constant voltage. Fernandez et al., used DS, PDS, Orthogonal Signal 
Correction (OSC) and Generalized Least Squares Weighting for calibra-
tion transfer. In this case, PDS provided the best results with the mini-
mum number of transfer samples [27]. In recent years, the problem of 
calibration transfer has received a strong boost. Calibration transfer can 
be reformulated as a problem of multitask learning: the calibration 
model is learned in different domains that correspond to different de-
vices. The key point is to realize that these learning problems are related, 
and information can be shared to improve their performance. Calibration
transfer emphasizes those problems where the number of samples differs
across domains. In other words, the number of transfer samples is smaller
than the number of samples used for master calibration. This has
been named transfer-sample based multitask learning [28]. Similarly, in
other works the main idea has been to learn a domain adaptation
technique in such a way that the similarity of the feature distributions
is maximized while preserving the information content of the source. The
domain adaptation may be based on linear projections [29] or
autoencoders [30]. Calibration transfer is a very active topic of research,
and there is now sufficient evidence that the use of a limited number of
calibration transfer samples greatly improves the transferability of the
master calibration model. However, several difficulties remain. Despite
recent progress, the calibration transfer approach still requires acquiring
a number of transfer samples per slave instrument. Additionally, the
method gives excessive weight to the master instrument: an unlucky
selection of the master instrument can lead to bad performance for the
full batch. In other words, the obtained models for the slave devices may
inherit negative characteristics from an underperforming master device.
Finally, calibration transfer does not properly address the specific 
problem of sensor replacement. 
Global calibration methods try to overcome these issues by finding a
single calibration model that can be applied to multiple units of the
same sensor model without the need for calibration transfer samples. The
underlying hypothesis is that there is a common mapping structure be-
tween the responses of multiple units of the same sensor model and the 
gas concentrations. The challenge in practice is that this common 
structure is hidden by the systematic differences of the sensors and time- 
drifts. A promising solution to find this structure using global calibration 
methods is multi-unit calibration [31]. Multi-unit calibration consists of 
building a calibration model with the matrix of responses of multiple 
sensor units exposed to the same calibration conditions, and then 
applying this global model to new uncalibrated replicas. These models 
are optimized so that the prediction accuracy in new replicas is maxi-
mized. Obviously, homogenization of the different responses via 
normalization and scaling is key to achieve good results. Solorzano et al. 
[31] compared the performance of individual and global calibration 
models applied to a MOX sensor array (e-nose) aiming at the classifi-
cation of six gases under variable humidity levels in a controlled labo-
ratory experiment. Individual calibration models had the best overall 
classification rate (100%), followed by global models (99%) and direct 
transfer models (91%), the latter ones consisting of applying the indi-
vidual model of one unit to a different unit. Despite the overall good 
results, direct transfer models failed to correctly classify 62.5% of the 
carbon monoxide (CO) samples (with 25.6% false negatives to air), 
while global models predicted all the CO samples correctly and had 5% 
of false positives to air. Their results confirmed that individual models 
were local to the sensor array employed for calibration, indicating 
possible overfitting and master system dependencies when applied to 
different units. 
In this work we evaluate the feasibility of global calibration models 
(GCMs) for temperature-modulated MOX sensors aiming at the predic-
tion of CO concentration in variable humidity conditions. This is the first 
time that global models are studied for a regression problem or using 
temperature-modulated sensors. Our proposal involves the use of the
Orthogonalized Partial Least-Squares (O-PLS) calibration method [32]
in combination with Repeated Stratified K-Fold cross-validation for 
model optimization. The model performance is evaluated in external 
validation samples acquired several weeks after calibration using the 
limit of detection (LOD) as a figure of merit. We compare the prediction 
error and temporal stability of global models versus individual models, 
and study how the number of sensors and samples included in the 
calibration set affects the performance. 
2. Materials and methods 
2.1. Data set 
For this study we used a public dataset [32,33] containing recordings
from 6 replicas of a temperature-modulated MOX sensor (SB-500-12, FIS
Inc. [34]) exposed to gas mixtures of carbon monoxide (range 0–20
ppm) and humid synthetic air (range 20–80% r.h. at 26 ± 1 °C)
inside a controlled in-house gas mixing station. Each gas exposure lasted
for 15 min. Before each exposure the sensor chamber was cleaned for 15
min with synthetic air at an identical nominal flow. The sequence of gas
concentrations and relative humidity levels was randomized. The
SB-500-12 uses a mini-bead type sensing element of tin dioxide material
placed in an external housing which contains an active charcoal filter.
The multivariant sensor conductance is measured using a voltage
divider and a load resistor of 1 MΩ. The output voltage of the sensors
was sampled at 3.5 Hz using an Agilent HP34970A/34901A data
acquisition unit configured at 15 bits of precision and input impedance
greater than 10 GΩ.
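For readers unfamiliar with the voltage-divider readout, the conversion from output voltage to sensor conductance can be sketched as follows. This is only an illustrative sketch: the circuit voltage of 5 V and the assumption that the output voltage is measured across the load resistor are not stated in the text, and the function name is hypothetical.

```python
R_LOAD_OHM = 1e6   # load resistor stated in the text (1 MΩ)
V_CIRCUIT = 5.0    # assumed circuit voltage (not given in the text)

def sensor_conductance(v_out, v_circuit=V_CIRCUIT, r_load=R_LOAD_OHM):
    """Sensor conductance in siemens for a series sensor/load divider.

    Assumes v_out is the voltage measured across the load resistor.
    """
    if not 0.0 < v_out < v_circuit:
        raise ValueError("v_out must lie strictly between 0 and v_circuit")
    # The same current flows through both elements: I = v_out / r_load.
    # The sensor drops (v_circuit - v_out), so G = I / (v_circuit - v_out).
    return v_out / (r_load * (v_circuit - v_out))
```

With these assumed values, an output of half the circuit voltage corresponds to a sensor resistance equal to the load resistor.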
We refer the reader to Burgues et al. [32,33] for more detailed in-
formation about the experimental setup. According to the datasheet, the 
sensor can exhibit tolerances of a factor of 10 in baseline (4–40 kΩ) and a 
factor of 2 in sensitivity (1.05–2.1). The recommended operation mode 
is temperature cycling using a square heating waveform (0.2 V for 20 s
followed by 0.9 V for 5 s) with a total period of 25 s. In the dataset, the 
A. Miquel-Ibarz et al.                                                                                                                                                                                                                          
sensor conductance was continuously recorded at a sampling frequency 
of 1 Hz. Data are organized in 15 measurement campaigns, each lasting 24 h
and consisting of the same 100 measurement conditions (10
concentration levels × 10 humidity levels). For the purpose of this study,
we used the subset of the data where the CO concentration is <10 ppm, 
as previous studies suggested this is the most relevant range for LOD 
estimation [32,33] while also being a relevant CO range for air quality 
applications. We also disregarded sensor unit #2, as this was identified 
as an outlier in previous studies [32,33]. 
2.2. Individual vs global calibration models 
Individual calibration models were obtained by training a calibration 
model with data acquired by one sensor. Global models were built using 
signals from multiple sensor units, and the model was then applied to a 
new replica that has been left out of the calibration set (Figs. 1–2). The 
number of available measurements for building a global model increases 
with the number of sensors included in the model. For example, 
assuming a scenario in which m calibration samples are measured by a 
set of N sensors, we can build N individual models (one per sensor) with 
m calibration points each or build a single global model with m*N 
calibration points (Table 1). 
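The difference between the two calibration sets can be sketched with synthetic arrays. In the snippet below the responses are random placeholders, the number of features (one per second of a 25 s heating cycle) is an assumption, and `stack` is a hypothetical helper rather than code from this work.

```python
import numpy as np

rng = np.random.default_rng(0)
n_conditions = 50   # calibration conditions per sensor (Table 1)
n_features = 25     # assumed: one feature per second of the 25 s heating cycle

# Hypothetical responses for five sensor units and placeholder CO labels.
responses = {s: rng.normal(size=(n_conditions, n_features)) for s in (1, 3, 4, 5, 6)}
co_ppm = np.repeat(np.linspace(0.0, 9.0, 5), 10)

def stack(sensor_ids):
    """Stack the responses of several units into one calibration set."""
    X = np.vstack([responses[s] for s in sensor_ids])
    y = np.tile(co_ppm, len(sensor_ids))
    return X, y

X_icm, y_icm = stack([1])            # individual model: 50 rows
X_gcm, y_gcm = stack([1, 3, 4, 5])   # 4-sensor global model: 200 rows
X_test = responses[6]                # left-out unit, never used for training
```

The key design point is that the evaluation unit contributes no rows to the global calibration matrix.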
2.3. Preprocessing and transformation 
The raw sensor conductance signals were preprocessed to improve 
the signal-to-noise ratio (SNR), reduce non-linearity, and correct drift. 
Median filtering (window size of 3 s) followed by a moving average filter 
(window size of 3 s) was applied to remove spurious spikes and smooth 
the signals. The filtered signals were logarithmically transformed to 
linearize the relationship between sensor resistance and gas concentra-
tion. To correct for baseline drift (visible after several days of mea-
surements) and compensate differences in the nominal sensor 
resistances, the sensor conductance was divided by the conductance in 
clean air. The latter one was measured at the beginning of each daily 
measurement campaign when the sensor chamber was flushed with 
clean air. We acknowledge that obtaining clean-air samples can be a
limiting factor in certain open sampling conditions, where the sensors
are constantly exposed. Although we have not been able to explore such
cases, different reference points or preprocessing strategies might
then be needed.
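The preprocessing chain described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: a 3 s window is taken as 3 samples (1 Hz data), the logarithm is base 10, and the division by the clean-air conductance is applied before the logarithm, which is equivalent to subtracting the two logarithms. Function names are hypothetical.

```python
import numpy as np

def median_filter(x, size=3):
    """Running median with edge padding (odd window size)."""
    half = size // 2
    padded = np.pad(x, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, size)
    return np.median(windows, axis=-1)

def moving_average(x, size=3):
    """Running mean with edge padding (odd window size)."""
    half = size // 2
    padded = np.pad(x, half, mode="edge")
    return np.convolve(padded, np.ones(size) / size, mode="valid")

def preprocess(conductance, clean_air_conductance, size=3):
    """Filter, normalise by the clean-air conductance, and take the logarithm."""
    smooth = moving_average(median_filter(conductance, size), size)
    baseline = moving_average(median_filter(clean_air_conductance, size), size)
    return np.log10(smooth / baseline)
```

Both arrays are expected to hold one heating cycle sampled at the same heating points, so that the ratio is taken point by point.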
After being corrected, the preprocessed signals were transformed 
prior to building the calibration models. For individual models, the 
training signals were concatenated in one matrix and the heating points 
were standardized (mean-centered and scaled to unit variance), while 
for global models they were only mean-centered to improve
generalization (a complete empirical demonstration can be found in
Sections A and B of the Supplementary Material). The same process is
applied to the test data, always using the transformation parameters of 
the training data. Since the training pool for global models includes data 
from multiple sensors, two approaches can be considered: global and 
individual transformations. In global transformations, the preprocessed 
signals from the sensors in the training set are first arranged into the 
matrix of predictors X, and then X is transformed. In individual trans-
formations, each sensor in the training data matrix is transformed with 
its own individual parameters prior to building X. 
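The two options can be contrasted with a short sketch on synthetic data. The two simulated units below share the same sensitivity and differ only by a constant offset, which is an assumption chosen to make the effect visible; the function names are hypothetical.

```python
import numpy as np

def global_standardize(blocks):
    """Stack all sensors first, then centre and scale with pooled statistics."""
    X = np.vstack(blocks)
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    return (X - mean) / std, (mean, std)

def sensor_mean_center(blocks):
    """Centre each sensor with its own mean, then stack."""
    means = [b.mean(axis=0) for b in blocks]
    X = np.vstack([b - m for b, m in zip(blocks, means)])
    return X, means

rng = np.random.default_rng(1)
# Two hypothetical units with the same sensitivity but different baselines.
unit_a = rng.normal(size=(50, 25))
unit_b = unit_a + 3.0
X_global, _ = global_standardize([unit_a, unit_b])
X_indiv, _ = sensor_mean_center([unit_a, unit_b])
```

In this toy case the sensor-specific centering makes the two blocks identical, whereas the pooled transformation keeps the offset between units in the data.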
2.4. O-PLS calibration models 
Orthogonalized partial least squares (O-PLS) is a multivariate 
regression model based on PLS, the latter being one of the most popular 
multivariate regression models for gas sensor data due to its inherent 
ability to deal with high-dimensional and collinear sensor data [35]. By 
using PLS we can find which sets of measurements along the heating 
point provide relevant information to predict the gas concentrations. 
Such information is found by maximizing the correlation between the
variance of the X block (signals) and the variance of the Y block
(concentrations), i.e., the joint variability of X and Y, by means of a
least-squares fitting. After performing PLS, we obtain a set of weights,
{w1:k} (weighted linear combinations of heating points), that are used to
calculate a regression vector, β. Although the latter is often used to
make predictions, data are usually projected onto the set of weights to
visualize the scores and understand how the model behaves. However, to
make a proper visual analysis the score space must have low
dimensionality. To reduce the dimensions of the score space without
losing relevant predictive information, an Orthogonal Signal Correction
(OSC) filter can be applied over the PLS set of weights (O-PLS). The
resulting O-PLS model has the same
regression vector, β, but explains all the score information with just two
Fig. 1. Flow diagram of the proposed methodology to build and validate global calibration models.  
weights, {w1, w2⊥}. This allows an intuitive visualization of the full
structure of the model in the form of a two-dimensional score plot,
regardless of the number of latent variables of the original PLS model.
The underlying hypothesis behind OSC is that the first weight, w1, of
the PLS model is the direction that condenses the relevant predictive
information; thus, the information contained in any weight orthogonal
to w1 is considered an irrelevant source of noise. Since the PLS weights
{w2:k} are not orthogonal to w1, the first step in OSC is to orthogonalize
them using the Gram-Schmidt method [36]. The resulting orthogonal
weights, {w2:k}⊥, are then rotated using the Singular Value
Decomposition (SVD) [37] of the covariance matrix of the PLS scores of the
calibration data. The resulting first new weight, w2⊥, condenses almost
all the variance of the set of non-orthogonal weights, {w2:k}. Thus, an
O-PLS model is composed of only two weights orthogonal to each other:
the first one, w1, contains the relevant predictive information, and the
second one, w2⊥, contains all the noise-related variance. The
orthogonalization only improves model interpretability; it does not
change the regression vector of the model, so the performance is preserved.
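The orthogonalization step alone can be sketched as a Gram-Schmidt pass over the columns of a weight matrix. The snippet is illustrative only: the weight matrix is random rather than obtained from a fitted PLS model, the function name is hypothetical, and the subsequent SVD rotation that yields w2⊥ is not shown.

```python
import numpy as np

def orthogonalize_against_first(W):
    """Gram-Schmidt on the columns of W, keeping the first column's direction."""
    Q = np.zeros_like(W, dtype=float)
    for j in range(W.shape[1]):
        v = W[:, j].astype(float)
        for i in range(j):
            v = v - (Q[:, i] @ v) * Q[:, i]   # remove the part along earlier columns
        Q[:, j] = v / np.linalg.norm(v)
    return Q

rng = np.random.default_rng(2)
W = rng.normal(size=(25, 4))   # hypothetical weights w1..w4 over 25 heating points
Q = orthogonalize_against_first(W)
```

The first column of `Q` is w1 rescaled to unit length, and the remaining columns are orthogonal to it and to each other.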
2.5. Limit of detection (LOD) 
As a figure of merit to evaluate the prediction performance of a 
calibration model in the low concentration range, we use the limit of 
detection (LOD). The LOD indicates the minimum concentration of the 
target analyte that can be reliably distinguished from the absence of the 
same analyte, and is defined by the International Union of Pure and 
Applied Chemistry (IUPAC) [38] for univariate models as: 
\[
L_D \approx 2\, t_{1-\alpha,\nu}\, s_{y,x}\, b_A^{-1} \tag{1}
\]
where $t_{1-\alpha,\nu}$ is the one-sided t-critical value for the chosen
confidence level ($\alpha$) and degrees of freedom ($\nu$), $s_{y,x}$ is the
standard error of regression computed on cross-validation or external
validation samples, and $b_A$ is the slope of the calibration curve. The
beauty of the LOD is that it combines repeatability (scattering) and
sensitivity (slope) into a single number, allowing for a more precise
characterization of the measurement system than simply using the root
mean squared error (RMSE).
However, the univariate LOD formula cannot be directly applied to a 
multivariate model, such as O-PLS. To circumvent this limitation, we use 
a methodology proposed by us in a previous work [32] that computes 
the LOD in O-PLS models by using the scores along w1 (the first 
orthogonal O-PLS direction) as a surrogate scalar variable to which the 
univariate LOD formula (Eq. 1) can be applied. 
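As a worked illustration of Eq. 1 applied to a scalar surrogate variable, the sketch below fits a straight line to hypothetical scores versus concentration and evaluates the LOD. The data are invented, the fit and the residuals use the same samples for brevity (the text computes the standard error on validation samples), and the t-critical value is passed in by the caller; 1.860 is the tabulated one-sided 95% value for 8 degrees of freedom.

```python
import math

def univariate_lod(conc, score, t_crit):
    """LOD = 2 * t * s_yx / slope for a straight-line fit of score vs concentration."""
    n = len(conc)
    mean_c = sum(conc) / n
    mean_s = sum(score) / n
    sxx = sum((c - mean_c) ** 2 for c in conc)
    sxy = sum((c - mean_c) * (s - mean_s) for c, s in zip(conc, score))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_c
    residuals = [s - (intercept + slope * c) for c, s in zip(conc, score)]
    # Standard error of regression with n - 2 degrees of freedom.
    s_yx = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return 2.0 * t_crit * s_yx / abs(slope)

# Hypothetical scores: slope 0.5 per ppm with alternating +/-0.1 scatter.
conc = [0, 0, 2, 2, 4, 4, 6, 6, 8, 8]
score = [0.5 * c + (0.1 if i % 2 == 0 else -0.1) for i, c in enumerate(conc)]
lod = univariate_lod(conc, score, t_crit=1.860)
```

Halving the scatter or doubling the slope halves the resulting LOD, which is the trade-off between repeatability and sensitivity that the formula captures.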
2.6. Model optimization and validation 
The main parameter to optimize in an O-PLS model is the number of 
latent variables (LVs) of the underlying PLS model. To avoid overfitting, 
the dataset is split into two disjoint subsets: calibration and external 
validation. The first experimental day is used as the calibration set and 
the data from the remaining days for external validation. The optimi-
zation set was therefore composed of the sensor responses to the 50 
calibration conditions of Day 1 whereas the external validation set 
contained the responses to the remaining 450 conditions (50 conditions 
x 9 days). A cross-validation (CV) procedure based on Repeated Strati-
fied K-Fold [39] was applied to the optimization set to find the optimum 
number of LVs without risk of overfitting. 
To build an individual model for a given sensor using Stratified
K-Folds (without repetition), the optimization set is randomly divided into
K folds with a balanced distribution of CO concentrations (Fig. 3). In each
iteration, a different fold is left out of the training pool and used for
internal validation. Therefore, K-1 folds are transformed and used to train
PLS models with different numbers of LVs. Afterwards, the orthogonalized
models are applied to the left-out data to estimate the LOD. Once the K
iterations are completed, the K LODs obtained for every O-PLS model
with a given number of LVs, O-PLS_LV, are averaged to produce
LOD_LV. Visual inspection of the resulting LOD vs LV boxplot is used to
Fig. 2. Simplified flow diagram for a global calibration model that uses 4 training sensors. All possible training (blue), internal validation (green) and external 
validation (red) sensor sets are evaluated. For a given set, day 1 data is used to build and optimize an O-PLS model, which is externally validated using data from the 
following days. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) 
Table 1
Number of calibration samples available in the dataset for individual and global
models.

Model type            Individual   Global      Global      Global
Sensors in model      1 sensor     2 sensors   3 sensors   4 sensors
Calibration samples   50           100         150         200
choose the optimum number of LVs. To prevent overfitting, the stratified
K-fold process is repeated R times with a different shuffling of the data.
The K LODs obtained in each of the R repetitions are then averaged for
every O-PLS_LV model. After determining the optimum number of LVs, the
model is refitted using all calibration data. Then, the optimized model is
externally validated with samples collected by the same sensor in days
2, …, N. The described process is carried out independently for every
sensor in the dataset.
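The partitioning logic of a repeated stratified K-fold can be sketched with the standard library alone. This is a simplified illustration: the concentration levels are hypothetical, stratification is done on the discrete CO level only, and the model fitting and LOD estimation that would happen inside each split are omitted.

```python
import random
from collections import defaultdict

def stratified_folds(levels, k, rng):
    """Split sample indices into k folds with balanced concentration levels."""
    by_level = defaultdict(list)
    for idx, level in enumerate(levels):
        by_level[level].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for level in sorted(by_level):
        members = by_level[level]
        rng.shuffle(members)
        for idx in members:
            folds[position % k].append(idx)   # deal samples round-robin
            position += 1
    return folds

def repeated_stratified_kfold(levels, k, repeats, seed=0):
    """Yield (train_indices, validation_indices) for every fold of every repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        folds = stratified_folds(levels, k, rng)
        for i, held_out in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, held_out

# Hypothetical calibration set: 5 CO levels x 10 humidity levels = 50 samples.
co_levels = [c for c in (0, 2, 4, 6, 8) for _ in range(10)]
splits = list(repeated_stratified_kfold(co_levels, k=5, repeats=3))
```

Each repetition reshuffles the samples within every concentration level, so the K folds differ between repetitions while keeping the same balance of levels.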
To build m-sensor global models given a dataset of N sensors,
Repeated Stratified K-Fold CV is also used, but the data partitions and
optimization objectives are different (Fig. 4). First, an
external validation sensor is selected. The remaining sensors are used as
the calibration set: m sensors are selected to train the model and N-m-1
are used for internal validation. The K folds only contain data from the m
training sensors, stratified by [CO] and sensor. One fold is left out of the
Fig. 3. Example of a K-Fold cross-validation process used to optimize Individual Calibration Models. Only 4 folds have been represented for visualization purposes.  
Fig. 4. Flow diagram of the repeated stratified K-Fold process used to optimize the number of LVs of the global models.  
training pool. Once the training folds are transformed by sensor, PLS_LV
models are built, orthogonalized and applied to the internal validation
sensors. The resulting LOD_LV values are averaged across the sensors, and
this process is repeated for every fold. To avoid overfitting, this process
is repeated multiple times, reshuffling the data each time. As for
individual models, the boxplot of the resulting LOD vs LV curve is used to
determine the optimum number of LVs, and the model is then refitted with
all training data. Then, the optimized model is externally validated with
all the samples collected by the external validation sensor in
days 2, …, N. The described process is repeated until all sensors have
been externally validated.
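The enumeration of sensor roles can be sketched as follows. The sensor labels and the values N = 6 and m = 4 are hypothetical choices for the example, and `sensor_partitions` is an illustrative helper, not code from this work.

```python
from itertools import combinations

def sensor_partitions(sensor_ids, m):
    """Yield (training, internal_validation, external_validation) sensor sets."""
    for external in sensor_ids:
        remaining = [s for s in sensor_ids if s != external]
        for training in combinations(remaining, m):
            internal = tuple(s for s in remaining if s not in training)
            yield training, internal, external

# Hypothetical example: N = 6 sensors and m = 4 training sensors.
partitions = list(sensor_partitions(["A", "B", "C", "D", "E", "F"], m=4))
```

For every choice of external validation sensor, each combination of m training sensors leaves N-m-1 sensors for internal validation, so the external sensor never influences the choice of the number of LVs.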
3. Results and discussion 
3.1. Raw signals 
An example of the responses of one sensor unit to the generated 
mixtures of CO (0–20 ppm) and humid synthetic air (20–80% r.h. at
26 ± 1 °C) in the first measurement day is shown in Fig. 5a. The
conductance patterns clearly reflect the two temperatures of the heating 
pattern, i.e. high temperature for 5 s followed by low temperature for 
20 s. Most of the sensitivity to CO happens at the low temperature 
regime, but the high level is also highly sensitive to CO (see inset). The 
inter-sensor variability is shown in Fig. 5b, by plotting the responses of 
the six sensors in the dataset to a fixed concentration of 20 ppm. As can 
be seen, there are not only baseline differences between units but also 
changes in sensitivity along the heating pattern. For example, unit #3 
(green trace) shows the lowest conductance at high temperature but the 
highest conductance at low temperature. The baseline differences at 
high temperature amount to 20% in the worst case (unit #3 versus units
#1 and #6). This variability is within the specs provided in the sensor 
datasheet [34]. An elegant way to visualize the differences in the re-
sponses of different units is by means of Principal Component Analysis 
(PCA), as shown in Fig. 6. Here we can identify two distinct sensor 
behaviors according to how the data points spread in the space spanned 
by the first two principal components (PCs). Sensors 1, 4 and 5 (panels 
a-c) group into what we call Type-A response, whereas sensors 2, 3 and 6 
(panels d-f) group into a Type-B response. The first PC, which accu-
mulates nearly 94% of the total variance, captures the largest variations 
in the CO concentration. The second PC (2.06% variance) correlates 
slightly with the CO concentration mostly at low concentrations. 
Focusing on PC1 (the most important), we can see that the projection of
the scores onto this PC shows a better separability for Type-A sensors at
high concentrations and for Type-B sensors at low concentrations. This 
suggests that Type-B sensors could have a better performance in terms of 
LOD since PC1 will dominate the model parameters. One can also 
imagine that a calibration model built for a Type-A sensor will probably 
produce high prediction errors if directly applied to a Type-B unit. 
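A PCA of this kind can be sketched through the SVD of the centred training matrix, with later measurements projected onto the same loadings. The data below are synthetic (one dominant direction standing in for the CO response plus small noise), and the function names are hypothetical.

```python
import numpy as np

def pca_fit(X_train, n_components=2):
    """PCA through the SVD of the column-centred training matrix."""
    mean = X_train.mean(axis=0)
    U, s, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)   # fraction of variance per component
    return mean, Vt[:n_components].T, explained[:n_components]

def pca_project(X, mean, loadings):
    """Project new samples onto the loadings using the training mean."""
    return (X - mean) @ loadings

rng = np.random.default_rng(3)
# Hypothetical data: one dominant direction (e.g. CO level) plus small noise.
co = rng.uniform(0.0, 9.0, size=(50, 1))
direction = rng.normal(size=(1, 25))
X_day1 = co @ direction + 0.05 * rng.normal(size=(50, 25))
X_later = rng.uniform(0.0, 9.0, size=(20, 1)) @ direction

mean, loadings, explained = pca_fit(X_day1)
scores_day1 = pca_project(X_day1, mean, loadings)
scores_later = pca_project(X_later, mean, loadings)
```

Reusing the training mean and loadings for later data is what allows drift or unit-to-unit differences to show up as displacements in the score plot.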
3.2. Preprocessing and transformation 
As an illustrative example, the preprocessed signals of one sensor
unit are shown in Fig. 7. It can be seen that the first seconds after the
temperature transitions show higher sensitivity to the CO concentration,
which was already observed in a previous study of this same dataset
after an elaborate data analysis [33]. The signal processing that we use
in this study highlights this optimum detection point without any
complicated data processing or analysis.
Fig. 8 compares two transformation approaches for global models 
(global standardization and sensor-specific mean-centering) by means of 
a PCA score plot. In global standardization (Fig. 8a–b), the first PC 
condenses the signal variance correlated to the CO concentration 
(Fig. 8a), whereas the second PC mostly captures the inter-device
variability (Fig. 8b), hence modeling sensor dissimilarity. This variance,
which derives from the systematic differences between sensors, accounts
for up to 11% of the total variance, overshadowing other relevant sources
of variance. The scores along PC2 indicate which sets of sensors can be
considered quasi-identical measuring systems (Fig. 8b). The scores of Type-A (S1, S4 and
Fig. 5. (a) Logarithmic conductance patterns of sensor unit #1 during one heating cycle, colored by CO concentration (see color bar on top). The black dashed line 
represents the heater voltage and the corresponding operation temperature. (b) Conductance patterns of the six sensor units when exposed to 20 ppm of CO (colored 
by sensor). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) 
S5) and Type-B (S2, S3 and S6) sensors drift in the positive and negative 
PC2 direction, respectively, with increasing CO concentration. Although
the low number of sensor units in this study does not allow us to draw
strong conclusions, Type-A sensors seem to have higher inter-device
variability than Type-B sensors. A different scenario is portrayed
when sensor-specific mean-centering is performed (Fig. 8c–d). Now, the 
scores are clustered by CO concentration and not by sensor units, and the 
variance captured by PC1 has increased from ~84% to ~94%, probably 
leading to a better starting point for building a global calibration model. 
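As an illustration only, the two transformations compared in Fig. 8 can be sketched on synthetic data. The sketch assumes a data matrix with one row per measurement and one column per point of the heating cycle. The function names, the synthetic response pattern and the unit-specific offsets are hypothetical and are not taken from the dataset used in this work. PCA is computed through the SVD of the transformed matrix.

```python
import numpy as np

def pca_explained_variance(X):
    """Fraction of variance per principal component (X must be column-centered)."""
    s = np.linalg.svd(X, compute_uv=False)
    return s**2 / np.sum(s**2)

def global_standardize(X):
    """Autoscale each feature with the mean and std of the pooled data of all sensors."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def sensor_mean_center(X, sensor_ids):
    """Subtract from the rows of every sensor unit that unit's own mean pattern."""
    Xc = X.astype(float).copy()
    for s in np.unique(sensor_ids):
        rows = sensor_ids == s
        Xc[rows] -= Xc[rows].mean(axis=0)
    return Xc

# Synthetic stand-in for the sensor signals (all values are assumptions).
rng = np.random.default_rng(0)
n_sensors, n_conc, n_feat = 6, 5, 40
conc = np.tile(np.linspace(0.0, 8.89, n_conc), n_sensors)
sensor_ids = np.repeat(np.arange(n_sensors), n_conc)
pattern = np.sin(np.linspace(0.0, np.pi, n_feat))        # common CO response shape
offsets = rng.normal(0.0, 2.0, size=(n_sensors, n_feat))  # unit-specific baselines
noise = rng.normal(0.0, 0.05, size=(n_sensors * n_conc, n_feat))
X = np.outer(conc, pattern) + offsets[sensor_ids] + noise

ev_global = pca_explained_variance(global_standardize(X))
ev_centered = pca_explained_variance(sensor_mean_center(X, sensor_ids))
print("PC1 share, global standardization:", ev_global[0])
print("PC1 share, sensor-specific mean-centering:", ev_centered[0])
```

In this toy example the unit-specific offsets are removed entirely by sensor-specific mean-centering, so PC1 captures a larger share of the variance than after global standardization. Real sensor units also differ in sensitivity, which mean-centering alone does not remove.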
Relevant information about the sources of variance in each trans-
formation method can be extracted from the PCA loadings (Fig. 9). If 
global standardization is performed (solid line), PC1 seems to give 
Fig. 6. Principal component analysis of the preprocessed and standardized sensor signals (each subplot represents the scores of a different sensor) showing two 
distinct sensor behaviors (a–c versus d–f). All sensor-standardized data from Day 1 was used to build the model, and the measurements taken in Days 2–15 were 
projected into this model. The colormap indicates the CO concentration, linearly spaced from 0 to 9 ppm (dark blue to red, respectively). The values next to each axis 
label indicate the variance captured by each principal component. To perform PCA, multivariate sensor data was processed using Singular Value Decomposition and a 
scatter plot function. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) 
Fig. 7. Example of sensor signals after (a) baseline correction and (b) mean-centering. Data corresponds to sensor unit #4. The responses during the last heating cycle 
of all measurements taken in Day 1 are stacked into the same figure for visualization purposes. The colormap indicates the CO concentration (0–20 ppm). The black 
dashed line represents the heater voltage and the corresponding operation temperature. The signals in (b) are the ones used as input for the PLS model. (For 
interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) 
slightly more weight to the low temperature level and penalizes the 
temperature transitions, while PC2 captures the variability in the high 
temperature of the pattern. If sensor-specific mean-centering is per-
formed (dashed line), PC1 only weights the low temperature level (with 
higher weight to the spot with high sensitivity after the temperature 
transition), while PC2 captures the strong variance in the temperature 
transitions. Therefore, the inter-device differences are mostly caused by 
the variance of the responses in the high temperature regime, which was 
also seen in Fig. 5a–b. The reason why global or individual
standardization does not perform well is that these differences are
accentuated when all sensors are scaled using the mean standard deviation
of the data (see Section A of the Supplementary Material for details).
3.3. Model optimization 
Fig. 10 illustrates the optimization of a 4-sensor GCM and its external 
validation against an unseen sensor. In Fig. 10a, each box represents the 
internal validation LOD predictions when a given subset of four sensors 
is used for model training (averaged over 100 repetitions and 2 folds). In 
this example, four LVs could be a good choice to ensure the model is 
neither underfitted nor overfitted. Increasing the model complexity 
beyond four LVs brings slight performance improvements in internal 
validation but probably not in external validation. Fig. 10b shows the 
scores of the training, internal and external validation samples of an 
optimized model. The first O-PLS weight condenses 98% of the variance 
due to CO concentration, whereas the second weight condenses the 
orthogonal variation which only relates to the cross-sensitivities. The 
separability between the clusters along the first direction indicates the 
ability of the model to distinguish between concentration levels. The 
score dispersion along the second direction shows the noise, which ac-
cording to the second orthogonal weight pattern (red line in Fig. 10c) is 
mostly due to the misalignments of the sensor responses around the 
temperature transitions. Due to a slight lack of synchronization in 
sampling the conductance patterns of different sensor units, the fast 
conductance transitions under a high temperature gradient are an 
important source of interference. Interestingly, this noise does not seem
to be related to humidity changes, which the model can inherently reject
(further detail is provided in Section C of the Supplementary Material).
The first O-PLS weight direction in Fig. 10c concentrates the
weight in the steady-state part of the low temperature regime of the 
Fig. 8. Global standardization (a–b) and sensor-specific mean-centering (c–d) effects on the scores of the day 1 sensor signals seen in the reduced space of the first 
two principal components of a PCA. The scores are colored by concentration (panels a–c) and by sensor (panels b–d). The five CO concentrations are equally spaced, 
ranging from 0 to 8.89 ppm. The shapes of the scores also code the sensor number. (For interpretation of the references to color in this figure legend, the reader is 
referred to the web version of this article.) 
heating cycle; that is, it rejects the noise associated with the
temperature transitions and resembles the first loading of the PCA models
shown before.
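For readers less familiar with O-PLS, the separation between a predictive direction and an orthogonal direction (unrelated to the CO concentration) can be sketched as follows. This is a minimal single-response O-PLS step in its commonly used form, with one predictive and one orthogonal component. It is not the implementation used in this work, and the function name `opls_one_orthogonal`, the synthetic signals and the interference pattern are assumptions made for illustration.

```python
import numpy as np

def opls_one_orthogonal(X, y):
    """One predictive and one orthogonal O-PLS component.

    X (n x p) and y (n,) are assumed to be mean-centered.
    """
    w = X.T @ y
    w = w / np.linalg.norm(w)          # predictive weight
    t = X @ w                          # predictive scores
    p = X.T @ t / (t @ t)              # predictive loading
    w_o = p - (w @ p) * w              # part of the loading orthogonal to w
    w_o = w_o / np.linalg.norm(w_o)    # orthogonal weight
    t_o = X @ w_o                      # orthogonal scores
    p_o = X.T @ t_o / (t_o @ t_o)      # orthogonal loading
    X_filtered = X - np.outer(t_o, p_o)  # data with the orthogonal part removed
    return w, w_o, t, t_o, X_filtered

# Synthetic example (all values are assumptions).
rng = np.random.default_rng(1)
n, p = 60, 50
conc = rng.uniform(0.0, 9.0, n)        # hypothetical CO concentrations (ppm)
interf = rng.normal(0.0, 1.0, n)       # hypothetical interference, unrelated to CO
grid = np.linspace(0.0, 1.0, p)
X = (np.outer(conc, np.exp(-grid))
     + np.outer(interf, np.sin(6.0 * grid))
     + rng.normal(0.0, 0.01, (n, p)))
Xc = X - X.mean(axis=0)
yc = conc - conc.mean()

w, w_o, t, t_o, X_f = opls_one_orthogonal(Xc, yc)
print("w . w_o =", w @ w_o)            # numerically zero
print("t_o . y =", t_o @ yc)           # numerically zero
```

By construction the orthogonal weight is perpendicular to the predictive weight, and the orthogonal scores are uncorrelated with the response, which is why variation along the second direction can be treated as noise with respect to the concentration.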
It is important to note that the first orthogonalized weights of all the
possible trained 4-GCMs are remarkably similar. In fact, these weight
directions are practically parallel (Section B of the Supplementary
Material). This indicates that the inner structure that relates the
sensor signals and the CO concentration remains constant among 
models. The second orthogonalized directions of the different models 
are collinear too, which indicates that there is also a common noise- 
filtering structure for all sensors. This partially ensures model trans-
ferability between the sensor units used for training, internal validation, 
and external validation. 
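One simple way to quantify how parallel the weight directions of different models are is the pairwise cosine similarity. The sketch below is hypothetical: the weight vectors are synthetic perturbations of a common pattern, not the weights of the models trained in this work, and `pairwise_cosine` is an illustrative helper.

```python
import numpy as np
from itertools import combinations

def pairwise_cosine(vectors):
    """Absolute cosine similarity between every pair of weight vectors."""
    V = np.asarray(vectors, dtype=float)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    # The absolute value ignores a possible sign flip between models.
    return [float(abs(V[i] @ V[j])) for i, j in combinations(range(len(V)), 2)]

# Fifteen synthetic weight vectors, one per 4-sensor subset of six units.
rng = np.random.default_rng(2)
base = np.sin(np.linspace(0.0, np.pi, 40))
weights = base + rng.normal(0.0, 0.02, size=(15, 40))

cosines = pairwise_cosine(weights)
print("number of pairs:", len(cosines))
print("minimum cosine similarity:", min(cosines))
```

Values close to 1 for every pair indicate practically parallel directions.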
Small differences between the O-PLS scores obtained in training and
external validation will have a negligible effect on the LOD estimation as
long as these differences mostly occur in the second O-PLS direction.
This is derived from the fact that the LOD estimation only compares the 
linear relationship between the scores of each sensor along the first O- 
PLS weight and the real CO concentration. This can be seen in Fig. 10d, 
which shows the scores along w1 versus the concentration. The LOD in 
this example is 1.35 ppm. 
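To make the link between the score-versus-concentration regression and a detection limit concrete, the following sketch computes a simplified univariate LOD from a straight-line fit. It uses the common approximation LOD = k·s/|slope| with k = 3.3, which is an assumption made here for illustration. It is not the multivariate estimator used in this work, which is defined in the methods and in [32]. The scores are synthetic.

```python
import numpy as np

def lod_from_scores(conc, t1, k=3.3):
    """Simplified LOD from a linear fit of the scores t1 against concentration.

    k = 3.3 is a conventional factor and is an assumption of this sketch.
    """
    slope, intercept = np.polyfit(conc, t1, 1)
    residuals = t1 - (slope * conc + intercept)
    s = np.sqrt(np.sum(residuals**2) / (len(conc) - 2))  # residual std
    return k * s / abs(slope)

# Synthetic scores: linear in concentration with homoscedastic noise.
rng = np.random.default_rng(4)
conc = np.repeat(np.linspace(0.0, 8.89, 5), 10)
t1 = 2.0 * conc + rng.normal(0.0, 0.5, conc.size)

lod = lod_from_scores(conc, t1)
print("simplified LOD (ppm):", lod)
```

Because the estimate depends only on the ratio between the residual scatter and the slope, rescaling or shifting the scores leaves it unchanged, which is consistent with the observation that only the linear relationship along the first direction matters.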
The regression vector of the O-PLS model indicates how the sensor
responses are weighted by the model to predict the gas concentration, 
thus showing which are the most relevant features along the heating 
cycle. Fig. 11 compares the normalized regression vectors of individual 
and global models. The regression vector coefficients of the individual 
models show more dispersion than in the global models, probably due to 
the sensor-specific nature of the individual optimization objectives. In 
contrast, the regression vectors of global models are very stable and, to 
some extent, converge to the average value of the individual regression 
vectors. The low dispersion of the GCMs regression vectors encouraged 
us to use the averaged regression vector to predict the LOD of every 
sensor in all the calibration campaigns. The predicted daily LOD settled 
at 1.34 ± 0.13 ppm, which is slightly better than the one obtained using
any of the original regression vectors of the GCMs (1.38 ± 0.15 ppm).
3.4. Influence of training size and model complexity 
We now study how the training size of individual and global calibration
models affects the estimated daily LOD. To that end, N-sensor GCMs
were built using all possible combinations of sensors with
varying training sizes. The characteristics of the different models and 
their optimization parameters are shown in the Supplementary Material 
(Tables D1 and D2). The results of this study are summarized in Fig. 12. 
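The enumeration of training subsets described above can be sketched with the standard library. The sensor labels S1 to S6 and the 50 samples per sensor follow the text; how the held-out units are divided between internal and external validation is not specified here, so the sketch only separates training units from held-out units. The helper names are hypothetical.

```python
from itertools import combinations

SENSORS = ("S1", "S2", "S3", "S4", "S5", "S6")  # the six units of this study
SAMPLES_PER_SENSOR = 50                         # full Day 1 training size per unit

def gcm_splits(n_train):
    """Yield (training units, held-out units) for every N-sensor subset."""
    for train in combinations(SENSORS, n_train):
        held_out = tuple(s for s in SENSORS if s not in train)
        yield train, held_out

def samples_for(percent):
    """Samples per sensor for a training size given as an integer percentage."""
    return SAMPLES_PER_SENSOR * percent // 100

# Number of distinct N-sensor training subsets for N = 1..4.
n_models = {n: sum(1 for _ in gcm_splits(n)) for n in range(1, 5)}
print(n_models)
print(samples_for(10), samples_for(20))
```

With six units there are 6, 15, 20 and 15 possible training subsets for N = 1, 2, 3 and 4, respectively.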
In Fig. 12 we compare the LOD of N-GCMs and ICMs built with 
different training sizes. The results show that the performance of N- 
GCMs increases with the number of sensors, N, and samples, m, used for 
training (Fig. 12a). The largest improvement generally occurs when the 
training size increases from 10% (5 samples/sensor) to 20% (10 sam-
ples/sensor). The improvement of GCMs saturates after using 30–40% of 
the training size, while ICMs keep improving until ~80% of the
calibration samples have been used. ICMs and 4-GCMs produce a daily LOD
of 1.05 ± 0.24 ppm and 1.38 ± 0.15 ppm, respectively. That is, fully
trained ICMs outperform global models. However, when the number of
calibration samples is very low (e.g., 5 samples per sensor or 10% of the
training size), GCMs built with 2–4 sensors seem to outperform ICMs. In
the case of 4 sensors and 5 samples, the LOD is 2.09 ± 0.10 ppm for the
GCM and 2.76 ± 0.22 ppm for the ICM. The reduced calibration costs of
GCMs and their generic nature (i.e., applicable to new sensor units 
without acquiring transfer samples) could be, in some scenarios, more 
convenient than achieving the lowest possible LOD. In this dataset,
4-GCMs built with 10 samples/sensor (20% of the training size) represent
a good tradeoff between calibration effort and performance
(LOD of 1.61 ± 0.14 ppm).
Fig. 12b explains whether the improvement of the GCMs is due to the 
increase of the number of training samples per sensor or to the fact that 
these samples come from more sensors. Three main conclusions can be 
drawn from this panel. First, 1-GCMs (direct transfer models) have the 
highest LOD, probably because models containing just one sensor in the 
training set are not global models per se. Second, the LOD curves of the 
remaining GCMs look like shifted versions of each other, suggesting that 
adding new sensors to the training set might be the main driver for the 
observed performance enhancement. Third, GCMs exhibit lower and 
more stable standard deviations than ICMs for any training size 
Fig. 9. Global standardization (solid lines) and sensor-specific mean-centering (dotted lines) effects on the first two principal components of the shown PCAs.  
(0.20 ppm vs 0.33 ppm at 10% training size; 0.15 ppm vs 0.24 ppm at 
100% training size). These points support the utility of GCMs: it is
possible to learn the calibration models using only a few calibration
conditions applied to large sensor sets, instead of using many calibration
conditions for only a few sensors.
We may therefore argue that the developed GCMs succeed in 
capturing the underlying structure that relates the sensor responses to 
the steady-state gas concentrations. The presented methodology seems 
to be able to work as a generic calibrant for a FIS SB-500-12 sensor in the 
given controlled scenario and could potentially be extended to other 
sensing devices. 
3.5. Temporal stability of global and individual models 
Having studied the optimum training conditions for GCMs, we now 
benchmark the predictive performance of 4-GCMs against fully trained
ICMs (the most challenging competitor) over a two-week period (Fig. 13a–b).
As Day 1 data is used to optimize the models, its cross-validation LODs
are not included in the external validation LOD analysis, which is derived
from the subsequent unseen days. The mean daily LOD predicted using global
models (1.38 ± 0.15 ppm) is 31% higher than the one given by individual
models (1.05 ± 0.24 ppm). However, global models produce an LOD with
higher temporal stability in terms of mean value and dispersion, which
means that the LOD obtained with GCMs on Day 1 is less overfitted to the
training data and therefore more representative of future performance.
As the number of available samples for calibration
decreases (the intended use case of GCMs), the performance of ICMs
rapidly degrades (i.e., the LOD increases), while GCMs maintain
relatively good performance (Fig. 13c–d). A possible interpretation is
that ICMs trained with few samples (Fig. 13c) are more susceptible to
noisy samples, whereas GCMs (Fig. 13d) naturally use more samples, so
their resistance to noise is enhanced.
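The relative figures follow directly from the reported mean daily LODs. With full training, the 4-GCM LOD is about 31% higher than the ICM LOD. With 5 samples per sensor (mean values reported in Section 3.4), the 4-GCM LOD is about 24% lower than the ICM LOD. The second ratio is computed here for illustration and is not quoted in the text; both use only the reported means and ignore their dispersion.

```latex
\[
\frac{\mathrm{LOD}_{\mathrm{GCM}} - \mathrm{LOD}_{\mathrm{ICM}}}{\mathrm{LOD}_{\mathrm{ICM}}}
  = \frac{1.38 - 1.05}{1.05} \approx 0.31,
\qquad
\frac{\mathrm{LOD}_{\mathrm{ICM}} - \mathrm{LOD}_{\mathrm{GCM}}}{\mathrm{LOD}_{\mathrm{ICM}}}
  = \frac{2.76 - 2.09}{2.76} \approx 0.24 .
\]
```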
Fig. 10. (a) Day 1 LOD of a 4-sensor GCM versus the number of LVs. Each box represents the interquartile range (IQR) of the distribution of LODs for the five used 
sensors, being the bottom and top edges the 25th and 75th percentiles, respectively. The mean and median of the distribution are represented by a horizontal blue 
segment and a red dot, respectively. The whiskers extend to the most extreme data points not considered outliers. The latter are considered standard outliers (empty 
circles) or extreme outliers (filled circles) depending on whether their value exceeds the interquartile range by more than 1.5 or 3 times, respectively; (b) Score plot of 
an optimized 4-sensor O-PLS model (external validation samples from all days are also shown). The color indicates the CO concentration (ranging from 0 ppm to 
8.89 ppm). The values next to each axis label describe the percentage of the signal-block and response-block variance contained in each direction. (c) O-PLS weights of all trained
models; (d) Regression between t1 scores and true concentration, showing a linear dependency and homoscedasticity. (For interpretation of the references to color in
this figure legend, the reader is referred to the web version of this article.) 
4. Conclusions 
The main goal of this study was to investigate whether global
calibration models could approach the performance of individual
calibration models for temperature-modulated MOX sensors. The main
advantage of global models is that they can be directly transferred
among different units of the same sensor model, reducing the calibration
costs and effort, and allowing an easy replacement of faulty units. Using
an experimental dataset with temperature-modulated MOX sensors aimed at
predicting low levels of carbon monoxide in the presence of humidity
interference, we studied the performance of O-PLS calibration models
built with different numbers of sensors and varying training sizes.
The main result is that global models can be effective with a small
number of sensors in the training set. In the explored dataset, as few as 4
sensors were enough to learn the common underlying mapping between
the sensor responses and the gas concentration after proper signal
preprocessing. It is certainly possible that adding more sensors might
further improve the performance. We expect, though, that diminishing
returns will set in soon. In other words, the improvements will saturate
for a sensor set of small cardinality. While we have seen that global models
Fig. 11. Orthogonal partial least squares regression vector comparison between Individual and Global models. The solid line indicates the mean regression vector 
over all day 1 optimized global models used to estimate the external validation daily LOD of the different calibration campaigns. The shaded area spans up to twice 
the observed standard deviation. The black dashed line represents the heater voltage and the corresponding operation temperature. 
Fig. 12. Limit of detection of ICM and GCM models as a function of (a) number of samples per sensor; (b) total number of calibration samples. The solid line in (a) 
represents the mean daily LOD predicted by each model in external validation, while the shaded area indicates the standard deviation of the estimations. 
improved their performance with an increasing number of training 
sensors, the largest improvements occurred when doubling the number 
of samples per sensor from 5 to 10. In contrast, the performance of in-
dividual models did not easily saturate, and achieved better daily LODs 
than the global models if enough samples were available (e.g., more 
than 15–20 samples). However, global models provided lower LODs if 
the number of samples per sensor was lower than 10. Therefore, for 
global models it is more convenient to increase the number of sensors in 
the training pool rather than increasing the number of samples per 
sensor. 
Another finding of this study is that the LOD of global models seemed 
more stable in time than that of individual models. This means that the 
LOD computed during calibration was more representative of future 
model performance than the one obtained using individual models. Our 
interpretation of this result is that by rejecting the variability among 
sensor units, global models could also be partially rejecting drift 
directions. 
These results define a clear use case for global models in scenarios 
where reduced calibration costs and generic models (i.e., applicable to 
new sensors without acquiring more samples) are more important than 
achieving the lowest possible LOD. At the same time, being able to share
the calibration cost and effort of one single global model among
numerous replicas should also be especially attractive to sensor
manufacturers, who currently address inter-device variance through batch
screening and individual calibrations. We envision that the same
calibration strategy used in this paper for temperature-modulated sensors
could be feasible for isothermal sensor arrays. Follow-up studies
involving longer measuring campaigns, a larger number of sensors and
various interfering gases are necessary to confirm the results obtained
in this first study. The methodology presented in this paper is intuitive
and can be applied to other sensor technologies with vectorized responses
(such as gasFETs or isothermal sensor arrays), broadening the possible
applications of chemical sensors to large-scale problems.
Fig. 13. Temporal evolution of the predicted daily LOD using (a, c) optimized ICMs and (b, d) optimized 4-GCMs. Lower LOD values are better. The models are 
always optimized using data from day 1, with n = 50 samples/sensor in (a, b) and n = 5 samples/sensor in (c, d), to make predictions in the following days. Each box
represents the interquartile range (IQR) of the distribution of daily LODs for the six used sensors, being the bottom and top edges the 25th and 75th percentiles, 
respectively. The mean and median of the distribution are represented by a horizontal blue segment and a red dot, respectively. The whiskers extend to the most 
extreme data points not considered outliers. The latter are considered standard outliers (empty circles) or extreme outliers (filled circles) depending on whether their 
value exceeds the interquartile range by more than 1.5 or 3 times, respectively. (For interpretation of the references to color in this figure legend, the reader is 
referred to the web version of this article.) 
CRediT authorship contribution statement 
JB, SM: Conceptualization. JB, SM: Methodology. AM, JB: Software. 
AM, JB: Validation. AM, JB: Formal analysis. AM: Investigation. SM: 
Resources. JB: Data curation. AM: Writing – original draft. JB, SM: 
Writing – review & editing. AM, JB: Visualization. JB, SM: Supervision. 
SM: Project administration. SM: Funding acquisition. 
Declaration of Competing Interest 
The authors declare that they have no known competing financial 
interests or personal relationships that could have appeared to influence 
the work reported in this paper. 
Acknowledgements 
We would like to acknowledge the Departament d’Universitats,
Recerca i Societat de la Informacio de la Generalitat de Catalunya 
(expedient 2017 SGR 1721); the Comissionat per a Universitats i Recerca 
del DIUE de la Generalitat de Catalunya; and the European Social Fund 
(ESF). Additional financial support has been provided by the Institut de 
Bioenginyeria de Catalunya (IBEC). IBEC is a member of the CERCA 
Programme/Generalitat de Catalunya. 
Appendix A. Supporting information 
Supplementary data associated with this article can be found in the 
online version at doi:10.1016/j.snb.2021.130769. 
References 
[1] J.W. Gardner, P.N. Bartlett, Electronic noses. Principles and applications, Meas. 
Sci. Technol. 11 (1999) 1087. 
[2] A. Ponzoni, C. Baratto, N. Cattabiani, M. Falasconi, V. Galstyan, E. Nunez- 
Carmona, F. Rigoni, V. Sberveglieri, G. Zambotti, D. Zappa, Metal oxide gas 
sensors, a survey of selectivity issues addressed at the Sensor Lab, Brescia (Italy), 
Sensors 17 (2017) 714, https://doi.org/10.3390/s17040714. 
[3] P.K. Clifford, D.T. Tuma, Characteristics of semiconductor gas sensors I. Steady 
state gas response, Sens. Actuators 3 (1982) 233–254, https://doi.org/10.1016/ 
0250-6874(82)80026-7. 
[4] K. Kamarudin, V.H. Bennetts, S.M. Mamduh, R. Visvanathan, A.S.A. Yeon, A.Y.M. 
Shakaff, A. Zakaria, A.H. Abdullah, L.M. Kamarudin, Cross-sensitivity of metal 
oxide gas sensor to ambient temperature and humidity: effects on gas distribution 
mapping, in: Proceedings of the AIP Conf., (2017), 020025, https://doi.org/
10.1063/1.4975258.
[5] M. Holmberg, T. Artursson, Drift compensation, standards, and calibration 
methods, in: Handb. Mach. Olfaction, (2004), 325–346, https://doi.org/
10.1002/3527601597.ch13.
[6] J. Burgues, S. Marco, Santiago Marco, J. Burgues, S. Marco, J. Burgues, S. Marco, 
Low power operation of temperature-modulated metal oxide semiconductor gas 
sensors, Sensors 18 (2018) 339, https://doi.org/10.3390/s18020339. 
[7] D. Martinez, J. Burgues, S. Marco, Fast measurements with MOX sensors: a least- 
squares approach to blind deconvolution, Sensors 19 (2019), https://doi.org/ 
10.3390/s19184029. 
[8] J. Fonollosa, S. Sheik, R. Huerta, S. Marco, Reservoir computing compensates slow 
response of chemosensor arrays exposed to fast varying gas concentrations in 
continuous monitoring, Sens. Actuators B Chem. (2015) 618–629, https://doi.org/ 
10.1016/j.snb.2015.03.028. 
[9] W. Gopel, K.D. Schierbaum, SnO2 sensors: current status and future prospects, 
Sens. Actuators B Chem. 26 (1995) 1–12, https://doi.org/10.1016/0925-4005(94) 
01546-T. 
[10] M. Bruins, J.W. Gerritsen, W.W.J. Van De Sande, A. Van Belkum, A. Bos, Enabling a 
transferable calibration model for metal-oxide type electronic noses, Sens. 
Actuators B Chem. 188 (2013) 1187–1195, https://doi.org/10.1016/j. 
snb.2013.08.006. 
[11] P.T. Moseley, Progress in the development of semiconducting metal oxide gas 
sensors: a review, Meas. Sci. Technol. 28 (2017), 082001, https://doi.org/ 
10.1088/1361-6501/aa7443. 
[12] I. Sayhan, A. Helwig, T. Becker, G. Müller, I. Elmi, S. Zampolli, M. Padilla,
S. Marco, Discontinuously operated metal oxide gas sensors
for flexible tag microlab applications, IEEE Sens. J. 8 (2008) 176–181. 
[13] F. Palacio, J. Fonollosa, J. Burgues, J.M. Gomez, S. Marco, Pulsed-temperature 
metal oxide gas sensors for microwatt power consumption, IEEE Access 8 (2020) 
70938–70946, https://doi.org/10.1109/ACCESS.2020.2987066. 
[14] E. Martinelli, D. Polese, A. Catini, A. D’Amico, C. Di Natale, Self-adapted 
temperature modulation in metal-oxide semiconductor gas sensors, Sens. Actuators 
B Chem. 161 (2012) 534–541, https://doi.org/10.1016/j.snb.2011.10.072. 
[15] K.J. Johnson, S.L. Rose-Pehrsson, Sensor array design for complex sensing tasks, 
Annu. Rev. Anal. Chem. 8 (2015) 287–310, https://doi.org/10.1146/annurev- 
anchem-062011-143205. 
[16] T. Baur, M. Bastuck, C. Schultealbert, T. Sauerwald, A. Schütze, Random gas 
mixtures for efficient gas sensor calibration, J. Sens. Sens. Syst. 9 (2020) 411–424, 
https://doi.org/10.5194/jsss-9-411-2020. 
[17] S. Marco, A. Gutierrez-Galvez, Signal and data processing for machine olfaction 
and chemical sensing: a review, IEEE Sens. J. 12 (2012) 3189–3214, https://doi. 
org/10.1109/JSEN.2012.2192920. 
[18] I. Rodriguez-Lujan, J. Fonollosa, A. Vergara, M. Homer, R. Huerta, On the 
calibration of sensor arrays for pattern recognition using the minimal number of 
experiments, Chemom. Intell. Lab. Syst. 130 (2014) 123–134, https://doi.org/ 
10.1016/j.chemolab.2013.10.012. 
[19] M. Padilla, A. Perera, I. Montoliu, A. Chaudry, K. Persaud, S. Marco, Fault 
detection, identification, and reconstruction of faulty chemical gas sensors under 
drift conditions, using principal component analysis and multiscale-PCA, in: 
Proceedings of the Int. Jt. Conf. Neural Networks (IJCNN 2010), 2010. 
[20] O. Tomic, T. Eklov, K. Kvaal, J.-E. Haugen, Recalibration of a gas-sensor array 
system related to sensor replacement, Anal. Chim. Acta 512 (2004) 199–206, 
https://doi.org/10.1016/j.aca.2004.03.001. 
[21] R.N. Feudale, N.A. Woody, H. Tan, A.J. Myles, S.D. Brown, J. Ferre, Transfer of 
multivariate calibration models: a review, Chemom. Intell. Lab. Syst. 64 (2002) 
181–192. 
[22] M.O. Balaban, F. Korel, A.Z. Odabasi, G. Folkes, Transportability of data between 
electronic noses: Mathematical methods, Sens. Actuators B Chem. 71 (2000) 
203–211, https://doi.org/10.1016/S0925-4005(00)00617-1. 
[23] Y. Wang, D.J. Veltkamp, B.R. Kowalski, Multivariate instrument standardization, 
Anal. Chem. 63 (1991) 2750–2756, https://doi.org/10.1021/ac00023a016. 
[24] L. Zhang, F. Tian, C. Kadri, B. Xiao, H. Li, L. Pan, H. Zhou, On-line sensor 
calibration transfer among electronic nose instruments for monitoring volatile 
organic chemicals in indoor air quality, Sens. Actuators B Chem. 160 (2011) 
899–909, https://doi.org/10.1016/j.snb.2011.08.079. 
[25] R.W. Kennard, L.A. Stone, Computer aided design of experiments, Technometrics 
11 (1969) 137–148, https://doi.org/10.1080/00401706.1969.10490666. 
[26] J. Fonollosa, L. Fernandez, A. Gutierrez-Galvez, R. Huerta, S. Marco, Calibration 
transfer and drift counteraction in chemical sensor arrays using direct 
standardization, Sens. Actuators B Chem. 236 (2016) 1044–1053, https://doi.org/ 
10.1016/j.snb.2016.05.089. 
[27] L. Fernandez, S. Guney, A. Gutierrez-Galvez, S. Marco, Calibration transfer in 
temperature modulated gas sensor arrays, Sens. Actuators B Chem. 231 (2016) 
276–284. 
[28] D. Zhang, D. Guo, K. Yan, Learning classification and
regression models based on transfer samples, Breath. Anal. Med. Appl. (2017) 
113–135, https://doi.org/10.1007/978-981-10-4322-2_7. 
[29] L. Zhang, Y. Liu, P. Deng, Odor recognition in multiple E-nose systems with cross- 
domain discriminative subspace learning, IEEE Trans. Instrum. Meas. 66 (2017) 
1679–1692, https://doi.org/10.1109/TIM.2017.2669818. 
[30] D. Zhang, D. Guo, K. Yan, A transfer learning approach for correcting instrumental 
variation and time-varying drift, Breath. Anal. Med. Appl. (2017) 137–156, 
https://doi.org/10.1007/978-981-10-4322-2_8. 
[31] A. Solorzano, R. Rodríguez-Perez, M. Padilla, T. Graunke, L. Fernandez, S. Marco, 
J. Fonollosa, Multi-unit calibration rejects inherent device variability of chemical 
sensor arrays, Sens. Actuators B Chem. 265 (2018) 142–154, https://doi.org/ 
10.1016/j.snb.2018.02.188. 
[32] J. Burgues, S. Marco, Multivariate estimation of the limit of detection by 
orthogonal partial least squares in temperature-modulated MOX sensors, Anal. 
Chim. Acta 1019 (2018) 49–64, https://doi.org/10.1016/j.aca.2018.03.005. 
[33] J. Burgues, J.M. Jimenez-Soto, S. Marco, Estimation of the limit of detection in 
semiconductor gas sensors through linearized calibration models, Anal. Chim. Acta 
1013 (2018) 13–25, https://doi.org/10.1016/J.ACA.2018.01.062. 
[34] F.I.S. Inc, FIS Gas Sensor SB-500-12, 2017.
[35] P. Geladi, B.R. Kowalski, Partial least-squares regression: a tutorial, Anal. Chim. 
Acta 185 (1986) 1–17, https://doi.org/10.1016/0003-2670(86)80028-9. 
[36] S.J. Leon, Å. Bjorck, W. Gander, Gram-Schmidt orthogonalization: 100 years and 
more, Numer. Linear Algebr. Appl. 20 (2013) 492–532, https://doi.org/10.1002/ 
nla.1839. 
[37] G.H. Golub, C. Reinsch, Singular value decomposition and least squares solutions, 
in: Linear Algebr., Springer, 1971, pp. 134–151. 
[38] L.A. Currie, Detection: international update, and some emerging di-lemmas
involving calibration, the blank, and multiple detection decisions,
Chemom. Intell. Lab. Syst. 37 (1997) 151–181,
https://doi.org/10.1016/S0169-7439(97)00009-9.
[39] R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and 
model selection, in: Proceedings of the Fourteenth Int. Jt. Conf. Artif. Intell., 2 
(1995), 1137–1143. 
Albert Miquel-Ibarz is a Fundamental Physics student at the University of Barcelona. Since
2019 he has been an internship student at the Institute for Bioengineering of Catalonia (IBEC)
under the guidance of Dr. Santiago Marco and Dr. Javier Burgues. He is passionate about 
artificial intelligence, data science, philosophy, and their application to bioengineering, 
neuroscience and social sciences. 
Javier Burgues received the BSc. degree in Telecommunication Engineering from the 
University Autonoma of Madrid, in 2010, the MSc. degree in Computer Science from the 
University of Southern California, in 2013, and the Ph.D. degree in Engineering and 
Applied Sciences from the University of Barcelona, in 2019. He currently works as R&D 
Technical Lead at ScioSense Germany GmbH, where he is responsible for the development 
of next-gen environmental sensors for automotive applications. His main research interests 
include signal processing and pattern recognition for chemical sensor data, electronic 
design, artificial intelligence, integration of chemical sensors into robotic platforms, and 
algorithm development for localization and mapping of chemical sources. More at
https://javierburgues.com
Santiago Marco completed his university degree in Physics (1988) and Ph.D. (1993) from 
the University of Barcelona (UB). He held a European Human Capital Mobility grant for a 
postdoctoral position at the Department of Electronic Engineering at the University of 
Rome “Tor Vergata” working on Electronic Noses. In 1995, he became Associate Professor 
at the Department of Applied Physics and Electronics at UB. In 2004 he had a sabbatical 
leave at AIRBUS-Innovation Works, Munich, working on Ion Mobility Spectrometry. In 
2008 he was appointed leader of the Signal and Information Processing for Sensing
Systems Lab at the Institute for Bioengineering of Catalonia. Since 2020 he has been Full
Professor at the Department of Electronics and Biomedical Engineering at UB. His research
concerns the development of signal/data processing algorithmic solutions for smart chemical
sensing based on sensor arrays or microspectrometers, typically integrated using
Microsystem Technologies. He has published around 130 archival journal papers and around
250 conference papers (more at http://ibecbarcelona.eu/sensingsys).