Untargeted Profiling of Concordant/Discordant Phenotypes of High Insulin

This study explores the metabolic profiles of concordant/discordant phenotypes of high insulin resistance (IR) and obesity. Through untargeted metabolomics (LC-ESI-QTOF-MS), we analyzed the fasting serum of subjects with high IR and/or obesity (n = 64). A partial least-squares discriminant analysis with orthogonal signal correction, followed by univariate statistics and enrichment analysis, allowed exploration of these metabolic profiles. A multivariate regression method (LASSO) was used for variable selection, and a predictive biomarker model was built to identify subjects with high IR regardless of obesity. Adrenic acid and a diglyceride (DG) were shared by high IR and obesity. Uric and margaric acids, 14 DGs, ketocholesterol, and hydroxycorticosterone were unique to high IR, while arachidonic, hydroxyeicosatetraenoic (HETE), palmitoleic, triHETE, and glycocholic acids, HETE lactone, leukotriene B4, and two glutamyl peptides were unique to obesity. DGs and adrenic acid differed in concordant/discordant phenotypes, thereby revealing protective mechanisms against high IR also in obesity. A biomarker model formed by DGs and uric and adrenic acids presented a high predictive power to identify subjects with high IR [AUC 80.1% (68.9−91.4)]. These findings could become relevant for diabetes risk detection and unveil new potential targets in therapeutic treatments of IR, diabetes, and obesity. An independent validation cohort is needed to confirm these results.

can present IR and β-cell impairment.3,4 The inclusion of discordant phenotypes in research studies [...] Turbo Spray IonDrive source coupled to a Shimadzu Nexera X2 series HPLC system (Kyoto, Japan) (Atlantis T3 RP column 50 × 2.1 mm, 5 μm (Waters, Milford, MA)) was used. A linear gradient elution was used ( [...] (Tables S1 and S2). Raw data contained 3000 mass features, including redundant mass signals (isotopes, adducts, in-source fragments, etc.). The data sets were filtered to remove variables that did not appear in more than 25% of any of the groups.11 The final data sets presented 2607 (ESI+) and 2318 (ESI−) mass features. ESI+ and ESI− data sets were analyzed separately.
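The presence filter described above can be illustrated with a minimal Python sketch. It assumes one reading of the criterion, namely that a mass feature is kept when it is detected in more than 25% of the samples of at least one group, and it treats a missing or zero intensity as "not detected". The function name, group labels, and intensities are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def keep_feature(intensities, groups, min_fraction=0.25):
    """Return True if the feature is detected in more than `min_fraction`
    of the samples of at least one group.

    intensities: per-sample intensities; None or 0 means "not detected"
                 (an assumption of this sketch).
    groups:      per-sample group labels, same length as `intensities`.
    """
    detected = defaultdict(int)
    total = defaultdict(int)
    for value, group in zip(intensities, groups):
        total[group] += 1
        if value is not None and value > 0:
            detected[group] += 1
    return any(detected[g] / total[g] > min_fraction for g in total)


# Hypothetical toy data: two groups of four samples each.
groups = ["control"] * 4 + ["high_IR"] * 4
features = {
    "feature_a": [0, 0, 0, 0, 120.0, 98.5, 0, 0],  # detected in 50% of high_IR
    "feature_b": [0, 15.2, 0, 0, 0, 0, 0, 33.1],   # detected in 25% of each group
}
kept = [name for name, values in features.items() if keep_feature(values, groups)]
```

With this strict "more than 25%" threshold, `feature_b` is discarded because it reaches exactly 25% in both groups.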

Multivariate Statistical Analysis
Partial least-squares discriminant analysis with orthogonal signal correction (OSC-PLS-DA) was used to examine between-group differences in LC−MS data (SIMCA-P+ 13.0 software, Umetrics, Umeå, Sweden). Data were log-transformed and Pareto scaled,15,16 and an OSC filter was applied to remove the variability not associated with the diseases. Comparisons were performed by [...] 33). The robustness of the models was evaluated through the R²X (cum), R²Y (cum), and Q² (cum) parameters, cross-validation, and permutation tests (n = 200) (Table S3). As a final quality test, the data set was randomly split into ten equal-size subsamples, nine of which were used as a training set while the remaining one was used as a validation set. This process was repeated ten times (Table S4). Mass features explaining group separation were selected according to their variable importance for projection (VIP) values (cutoff ≥ 2).
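As an illustration of the pretreatment step, the following Python sketch log-transforms one feature column and then applies Pareto scaling, that is, mean-centering followed by division by the square root of the standard deviation. It is not the SIMCA-P+ implementation; the base-10 logarithm, the use of the sample standard deviation, and the example intensities are assumptions made for this sketch.

```python
import math
import statistics


def log_pareto_scale(column):
    """Log-transform one feature column, then Pareto scale it.

    Pareto scaling mean-centers the values and divides them by the
    square root of the standard deviation. All intensities must be
    strictly positive and at least two values are required.
    """
    logged = [math.log10(v) for v in column]  # base 10 is an assumption
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)             # sample standard deviation
    if sd == 0:
        # A constant column carries no variation; return centered zeros.
        return [0.0 for _ in logged]
    return [(v - mean) / math.sqrt(sd) for v in logged]


# Hypothetical intensities for a single mass feature.
intensities = [10.0, 100.0, 1000.0]
scaled = log_pareto_scale(intensities)
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of intense features without inflating low-variance noise to the same weight.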

Annotation of Metabolites
A cluster analysis, based on Pearson correlation and Ward's distance method,17 was used to determine possible clusters of mass features from the same metabolite (PermutMatrix 1.9.3). MetaNetter, a plugin for Cytoscape (v.2.8.0), was used to define adducts and fragments within the [...] Metabolite identity confirmation was carried out by matching peak chromatographic and MS responses (extracted ion chromatogram, product ion scan) to those of commercial reference standards, when available, spiked in Milli-Q water and plasma (50 ppb), on a QStar Elite system (AB Sciex). The analytical parameters were the same as described above.
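The idea of grouping redundant signals by characteristic mass differences can be sketched in Python as follows. This is not the MetaNetter algorithm; it is a simplified illustration that looks for pairs of co-eluting features whose m/z values differ by the approximate shift between a protonated ion and its sodium adduct. The tolerances, feature identifiers, and m/z and retention time values are hypothetical.

```python
# Approximate monoisotopic m/z shift between [M+Na]+ and [M+H]+, in Da.
NA_MINUS_H = 21.98194


def find_sodium_adduct_pairs(features, mz_tol=0.003, rt_tol=0.05):
    """Return (protonated_id, sodiated_id) pairs among co-eluting features.

    features: list of (feature_id, mz, rt_min) tuples.
    mz_tol:   m/z tolerance in Da (illustrative value).
    rt_tol:   retention time tolerance in min (illustrative value).
    """
    pairs = []
    for id_a, mz_a, rt_a in features:
        for id_b, mz_b, rt_b in features:
            if id_a == id_b:
                continue
            same_peak = abs(rt_b - rt_a) <= rt_tol
            sodium_shift = abs((mz_b - mz_a) - NA_MINUS_H) <= mz_tol
            if same_peak and sodium_shift:
                pairs.append((id_a, id_b))
    return pairs


# Hypothetical features: F1 and F2 co-elute, F3 has the same m/z as F2
# but elutes at a different retention time.
features = [
    ("F1", 593.5140, 7.42),
    ("F2", 615.4959, 7.43),
    ("F3", 615.4959, 3.10),
]
pairs = find_sodium_adduct_pairs(features)
```

A full annotation workflow would apply the same test to a list of adduct, isotope, and fragment mass differences and would also use the correlation of intensities across samples.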

Univariate Statistical Analysis
Univariate analysis was performed in R to describe differences in clinical and metabolic parameters. Clinical parameters were log-transformed prior to the analysis. Statistics on metabolic parameters were performed on the raw matrix. Prior to the analyses, data were log-normalized and [...] obesity and high IR on clinical variables. Fisher's exact test was used to evaluate differences in gender [...] reduce the probability of false positives.22 Gender, age, and drug consumption were considered as confounders in all the analyses. Only those metabolites with adjusted p-value ≤ 0.05 were considered significant.
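For reference, a two-sided Fisher's exact test on a 2 × 2 table can be written in a few lines of Python. The study performed this test in R; the sketch below only illustrates the calculation, using the common convention of summing the probabilities of all tables that are as likely as, or less likely than, the observed one. The counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the p-value obtained by summing the hypergeometric
    probabilities of every table with the same margins whose probability
    does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of observing x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-7)  # small tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical gender counts (women, men) in two groups.
p_value = fisher_exact_two_sided(12, 4, 6, 10)
```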

Enrichment Analysis
ChemRICH (http://chemrich.fiehnlab.ucdavis.edu/) was used to perform an enrichment analysis of the metabolites that presented VIP ≥ 2 and adjusted p-value ≤ 0.05. ChemRICH utilizes structure similarity and chemical ontologies to map all known metabolites and name metabolic modules. The ChemRICH statistical approach compares chemical similarities using the Medical Subject Headings database and Tanimoto chemical similarity coefficients to cluster metabolites into nonoverlapping chemical groups. The enrichment statistical analysis uses a test that does not depend on a background database, the Kolmogorov−Smirnov test, on the created clusters.23
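The Tanimoto coefficient mentioned above is the number of structural features shared by two compounds divided by the number of features present in either of them. The Python sketch below shows the calculation on fingerprints represented as sets of "on" bit indices. It is not ChemRICH code, and the fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints.

    Each fingerprint is an iterable of the indices of its "on" bits.
    The coefficient is |A ∩ B| / |A ∪ B| and ranges from 0 to 1.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen for this sketch when both are empty
    return len(a & b) / union


# Hypothetical fingerprints: 3 shared bits out of 6 distinct bits.
fp_1 = {1, 4, 7, 9}
fp_2 = {1, 4, 8, 9, 12}
similarity = tanimoto(fp_1, fp_2)  # 3 / 6 = 0.5
```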

Predictive Models of Combined Serum Markers
Variable selection was performed with all the metabolites that met both criteria, VIP ≥ 2 and adjusted p-value ≤ 0.05, for high IR, to select those compounds that best separate subjects with IS or high IR. A new metabolic variable, total diglycerides (tDG), was created as the arithmetic mean of all DGs. Variable selection was conducted with the least absolute shrinkage and selection operator (LASSO) logistic regression using leave-one-out cross-validation.24 Prior to the analysis, data were log-normalized and Pareto scaled, and adjusted by gender, age, and drug consumption. The lambda coefficient was used to choose the most predictive metabolites, and these were employed to build a new parameter, the multimetabolite biomarker model, as follows: Multimetabolite biomarker model [...] The global performance of this multimetabolite biomarker model was evaluated through receiver operating characteristic (ROC) curves. The area under the curve (AUC) value, 95% confidence intervals (CIs), sensitivity, and specificity were calculated in R with the pROC package.
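The evaluation step can be illustrated with a short Python sketch that computes tDG as the arithmetic mean of the DG levels of one subject, and the AUC, sensitivity, and specificity of a score against a binary outcome. The study used the pROC package in R; this sketch does not reproduce pROC or its confidence intervals, and it does not reconstruct the model equation. All function names, scores, labels, and the threshold are hypothetical.

```python
from statistics import fmean


def total_dg(dg_levels):
    """tDG for one subject: arithmetic mean of that subject's DG levels."""
    return fmean(dg_levels)


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative.

    labels: 1 for high IR, 0 for IS (an assumed coding). Ties count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as positive and return (sens, spec)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model scores for six subjects.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
tdg = total_dg([1.0, 2.0, 3.0])
```

The pairwise definition of the AUC used here is equivalent to the area under the empirical ROC curve and is convenient for small cohorts.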

Anthropometric and Biochemical Parameters
Individuals with high IR presented altered FG, fasting insulin, HOMA-IR index, and lipid metabolism indicators (total cholesterol, HDL and LDL cholesterol, and TG). Subjects with obesity had higher adiposity markers, systolic and diastolic pressure, and total cholesterol than individuals without obesity. No interaction effects between high IR and obesity were observed for any of the variables (Table 1). Differences between concordant and discordant phenotypes of high IR were mainly due to adiposity markers. Subjects with concordant and discordant phenotypes of obesity also presented metabolic differences including FG, fasting insulin, HOMA-IR index, and lipid metabolism (Table 1).

LC−MS Data Quality
Neither carryover nor apparent clustering due to the batch injection order was noticed (Figure S1). The run-to-run repeatability of the QCs across the whole data set met the quality criteria (retention time shift ≤ 0.05 min, mass accuracy deviation < 3 mDa, and peak area CV < 25%)11 (Table S1). The generation of the OSC filters removed six and five components (eigenvalue > 2), maintaining 54% and 76% of the non-orthogonal variation in the original ESI+ and ESI− data sets, respectively. The OSC-PLS-DA resulted in four robust models that discriminate metabolic differences between control individuals and subjects with high IR or obesity (Figure 1, Table S3). The PLS score plot showed that the control group and the high IR or obesity groups clearly separated in the first component. The plot also suggested that concordant and discordant phenotypes of each disorder (high IR-obesity vs high IR-non-obesity, and IS-non-obesity vs IS-obesity, respectively) might be metabolically different [...] the metabolites were lipids. We were not able to discern between a molecular ion and a sodium adduct in DGs since both species presented a small mass difference from the theoretical mass (< 3 mDa). Thus, we provided both annotations. A Student's t test confirmed that two of these compounds were shared by both metabolic statuses, 18 were only found in high IR, and nine in obesity. Adrenic acid and a DG (34:2/36:5) were common to high IR and obesity, with higher levels than in the control group. Metabolomics also revealed that the high IR group presented higher levels of DGs, margaric acid, ketocholesterol, and uric acid, and lower levels of hydroxycorticosterone. On the other hand, alterations in lipid metabolism were also found in obesity. For instance, the obesity group showed higher levels of arachidonic acid, HETE, HETE lactone, leukotriene B4, palmitoleic acid, and trihydroxyeicosatetraenoic acid (triHETE), and the dipeptides γ-glutamyl-γ-aminobutyraldehyde and glutamyl-valine than the control group, and lower levels of the bile acid glycocholic acid (Figure 2). An enrichment analysis was performed with ChemRICH to identify which chemical class was most enriched in each metabolic disorder. ChemRICH revealed that the most enriched chemical class in high IR was DGs (adjusted p-value = 2.2 × 10⁻²⁰), while HETEs and unsaturated fatty acids were the most enriched in obesity (adjusted p-values = 1.7 × 10⁻⁵ and 6.0 × 10⁻⁴, respectively) (Table 3). Therefore, we will mainly focus the discussion of the results on these chemical classes.

Metabolic Differences between Concordant/Discordant Phenotypic Groups
Comparisons between phenotypic groups confirmed that the main differences between groups were due to DG and polyunsaturated fatty acid (PUFA) levels, revealing that the degree of dyslipidemia and pro-inflammatory markers could differentiate subjects of distinct phenotypic groups (Figure 3). [...] (Table 4). This predictive model presented better performance than combinations of other lipid markers, such as cholesterol or TG, with each other and/or with uric acid and adrenic acid (Table S6).

DISCUSSION
The untargeted profiling of the serum of concordant/discordant phenotypes of high IR and/or obesity allowed us to explore the metabolic profiles of these two metabolic statuses and to describe their similarities and divergences. In addition, it allowed us to define a multimetabolite biomarker model to detect high IR regardless of obesity, which might predict the risk of developing diabetes. Large disturbances in lipid metabolism were observed in all the metabolic disorders.

Metabolic Profile of High IR
DGs were the most enriched chemical class in subjects with high IR. This group also presented differences in TG levels, which highly correlate with DG levels (Pearson's correlation coefficient: r = 0.90). However, TG species could not be detected in metabolomic profiles because of their very low polarity, which causes most TGs to remain adsorbed onto the protein precipitate during serum extraction. Furthermore, these neutral lipids are not readily ionized in ESI unless some modifier is added to the mobile phases (e.g., ammonium salts). Although adipocytokine-induced inflammation is the prevailing hypothesis of IR progression, the hypothesis of DG-mediated IR is becoming increasingly important.26,27 In line with this hypothesis, we observed higher levels of DGs in subjects with high IR regardless of obesity. An accumulation of DGs leads to a cascade of events such as the activation of isoforms of protein kinase C that inhibit the insulin sensitivity of insulin-responsive tissues, the reduction of fatty acid β-oxidation in the mitochondria, thereby limiting energy production, and lipodystrophy in tissues due to the redistribution of fat.26,27 Adrenic acid was the only PUFA whose levels were altered in subjects with high IR, suggesting a certain degree of proinflammatory response. Adrenic acid is an ω-6 PUFA. This class of lipids acts as inflammatory mediators by acting as ligands for immune receptors and triggers a perpetual low-grade inflammation [...] adipocyte growth and dysfunction, oxidative stress, and altered signaling.28,29 Uric acid, a product of the metabolic breakdown of purine nucleotides, was also higher in subjects with high IR. It is normally excreted in the urine, but high concentrations of uric acid in blood are associated with oxidative stress, inflammation, and alterations in carbohydrate and lipid metabolism. For instance, hyperuricemia promotes endothelial cell damage and dysfunction, decreases endothelial nitric oxide availability, which limits insulin action, increases reactive oxygen species, and blocks adiponectin synthesis. In addition, hyperuricemia alters gluconeogenesis and fatty acid oxidation and induces the production of pro-inflammatory mediators. Serum uric acid has been proposed as a risk marker in IR, cardiovascular disease, metabolic syndrome, and renal failure, among others.30,31 The precursor of aldosterone, hydroxycorticosterone, was lower in subjects with high IR. Hypoaldosteronism has been associated with adrenal insufficiency and diabetic nephropathy.32 Results from the Framingham Heart Study cohort described a linear relationship between the glycaemic index and the risk for renal alterations, even before the onset of diabetes.33 Therefore, alterations in uric acid and hydroxycorticosterone might reflect that subjects with high IR may be prone to develop renal alterations. Furthermore, higher levels of 7-ketocholesterol might also confirm oxidative processes in high IR. 7-Ketocholesterol, also known as 5-cholesten-3β-ol-7-one, is a sterol derived from the oxidation of cholesterol, and it has been proposed as a robust biomarker of oxidized LDL particles in [...] increase the production of free radicals, which might damage cellular structures and alter metabolic processes.35,36

[...] metabolic disorders, FFA increase in plasma due to the stress of the adipose tissue, which releases more FFA than in normal conditions.37 The enrichment analysis with ChemRICH revealed that HETEs and unsaturated fatty acids were the most enriched chemical classes in subjects with obesity. For instance, adrenic acid, arachidonic acid, HETE, HETE lactone, leukotriene B4 (diHETE), and triHETE levels were found to be higher in the obesity group. These metabolites belong to the ω-6 PUFA class and, as already mentioned, they are lipid mediators that trigger a perpetual low-grade inflammation. Arachidonic acid is considered the primary source of pro-inflammatory lipid mediators, and it is rapidly converted into potent inflammatory mediators such as prostaglandins, thromboxanes, leukotrienes, lipoxins, HETEs, and derivatives, which lead to a cascade of events, as described above.28,29 Therefore, the fact that we found more ω-6 PUFAs differentially expressed in obesity than in high IR with respect to the control group (Table 2), and that their levels were higher in concordant than in discordant phenotypes (Figure 3, Table S5), suggests that the inflammatory processes in high IR might be of a lower extent than in obesity. Inflammation and oxidative stress are tightly interconnected processes. For instance, inflammatory cells produce free radicals during the immune response.35,36 Although 7-ketocholesterol was not altered in obesity, the levels of two glutamyl peptides, namely glutamyl-γ-aminobutyraldehyde and glutamyl-valine, were higher in obesity.

Differences between Concordant/Discordant Phenotypes of High IR and Obesity
The main differences between the four phenotypic groups were in DG and PUFA levels. The highest levels of these metabolites were found in subjects with both high IR and obesity, while the lowest levels were found in individuals with both IS and non-obesity. In addition, this study also revealed that subjects with only one metabolic disorder, high IR or obesity, had lower levels of DGs, free fatty acids, and pro-inflammatory markers than individuals presenting both disorders. These results might indicate that obesity itself also implies the existence of protective mechanisms against high IR. In line with this observation, differences in pro-inflammatory markers between subjects with obesity and IS or IR have already been described. This observation is also known as the "obese healthy paradox".44,45 Among all the metabolites identified as potential markers of discordant phenotypes of high IR and obesity (Table S5), adrenic acid is particularly interesting since it is the only compound whose levels allowed differentiating the four phenotypic groups. Adrenic acid [...] IR sets in before disease markers appear, and it might remain undiagnosed for a long period, thereby increasing the risk of developing other metabolic alterations. Therefore, there is a need to detect IR rapidly and to monitor its progression to diabetes. Although current markers have a high predictive power, they also present some limitations.1 Current markers of high IR such as FG, fasting insulin, or HOMA-IR presented a high predictive power (not shown, AUC ≈ 95%). This may be because subjects were grouped according to their FG levels and HOMA-IR index. However, they may be late markers, since when insulin deficiency manifests as hyperglycaemia, considerable pancreatic β-cell insufficiency has already occurred.47 Thus, the third aim of this work was to identify new markers of high IR.
We selected those metabolites that presented a VIP ≥ 2 and adjusted p-value <0.

Strengths and Limitations
Although this study is an observational study, the high potential of untargeted metabolomics has provided a snapshot of the metabolome of subjects with high IR and/or obesity at a given time. Thus, we have explored in depth the metabolic profiles of these two metabolic disorders, described their [...] insights, and defined a predictive model for the risk of developing diabetes. Despite the low number of subjects enrolled in the study and the fact that some individuals were grouped in both high IR and obesity groups, results were robust and in line with those previously reported. Complementary metabolomics studies are necessary to provide a comprehensive overview of the metabolome of these metabolic disorders. The authors support large-scale and follow-up studies to replicate and validate the results.

CONCLUSION
Through an untargeted metabolomics-driven approach, we have explored the metabolic profiles of concordant and discordant phenotypes of subjects with high IR and/or obesity. Large alterations in lipid metabolism, oxidative stress, and inflammation were unveiled. In addition, these results allowed us to build a multimetabolite biomarker model to predict high IR regardless of obesity that includes the measurement of DGs, uric acid, and adrenic acid. It might also be employed to predict the risk of developing diabetes; however, it needs to be externally validated.

[...] (individuals with both IS and non-obesity) from patients with high IR (models 1 and 2) or subjects with obesity (models 3 and 4) in both ionization modes. White circles refer to the control group (nonobese IS), gray circles to high IR, and black circles to obesity. Abbreviations: ESI, electrospray ionization; IR, insulin resistance; IS, insulin sensitivity; OSC-PLS-DA, orthogonal signal correction partial least-squares discriminant analysis.

[...] (Table S5). Significances (p-values) are shown with asterisks when compared with the control group as follows: * p < 0.05, ** p < 0.01, *** p < 0.001; or with hash signs when compared with the group of subjects with high IR and obesity as follows: # p < 0.05, ## p < 0.01, ### p < 0.001. Abbreviations: IR, insulin resistance; IS, insulin sensitivity; OB, obesity.