UHPLC-HRMS (Orbitrap) Fingerprinting in the Classification and Authentication of Cranberry-based Natural Products and Pharmaceuticals using Multivariate Calibration Methods

A 100% classification and authentication rate for cranberry-based natural and pharmaceutical products by UHPLC-HRMS fingerprinting.


Introduction
It is now common practice worldwide to use plant-based and/or fruit-based pharmaceutical extracts, with or without regulated medicines, to help prevent several chronic diseases. This is the case of the American red cranberry (Vaccinium macrocarpon), a small evergreen shrub from the Ericaceae family that grows in acid swamps in humid forests. American cranberry fruits are composed mostly of water (>80%) and are a rich source of vitamin C and dietary polyphenols, such as flavonols, anthocyanins, organic acids and proanthocyanidins (PACs). Cranberries have been used for centuries as a flavoring, and by sailors to prevent scurvy owing to their high vitamin C content. Moreover, their consumption may be associated with a reduced risk of chronic diseases such as cancer, although strong evidence has not yet been established in humans, and several berry-based extracts have shown antitumor activity. Cranberry extracts enriched in polyphenols have also shown enhanced antiproliferative activity. These extracts may also play an important role in the treatment of oral infections by reducing the pathogenesis of dental caries, in the protection against cardiovascular diseases, and in the prevention of low-density lipoprotein oxidation and platelet aggregation [1][2][3][4][5].
Recently, cranberries have attracted much attention because of their high content of PACs and the capacity of some of them to prevent urinary tract infections (UTIs). This activity is attributed to the inhibition of the adhesion of pathogenic bacteria, such as Escherichia coli and Helicobacter pylori, to the cells of the urinary tract tissues, thus preventing bacterial colonization and the proliferation of infections [6][7][8][9]. PACs, also known as condensed tannins, are polymeric flavan-3-ol structures mainly based on (epi)catechin oligomers, called procyanidins, but other forms can contain (epi)gallocatechin units (i.e. prodelphinidins) or (epi)afzelechin units (i.e. propelargonidins) [10]. PACs can be classified according to the linkage between their units. PACs linked through C4-C8 or C4-C6 bonds are known as B-type PACs. If these structures have an additional ether linkage between C2-C5 or C2-C7, they are known as A-type PACs [11]. As an example, Fig. S1 (electronic supplementary information) shows the structure of a trimeric PAC with A-type and B-type linkages. Only A-type PACs, which are very abundant in American red cranberries, exhibit the bioactivity needed to prevent UTIs, whereas B-type PACs, which are found in other fruits such as grapes and blueberries, do not show this activity [12][13][14].
Many cranberry-based pharmaceutical preparations have recently appeared on the market to prevent UTIs, and there is a suspicion that some of them do not contain the necessary bioactive PACs. Only A-type PACs have the required bioactivity, yet pharmaceutical laboratories frequently assess the total PAC content by non-selective colorimetric methods [15][16] that cannot differentiate between A- and B-type PACs. This highlights the importance of developing analytical methods to characterize cranberry fruit-based extracts and pharmaceutical preparations, to authenticate the fruit of origin employed in these processed extracts, and to prevent fraud. Moreover, owing to the globalization of food trade and the increased complexity of supply chains, the need for effective systems to protect consumers from impure, contaminated and fraudulently presented processed food products has increased. Current food labeling and traceability systems cannot strictly guarantee that food is authentic, of good quality and safe. As a result, consumers are demanding verifiable traceability evidence as an important criterion of food quality and safety [17].
Liquid chromatography coupled to mass spectrometry (LC-MS), tandem mass spectrometry (LC-MS/MS) and high resolution mass spectrometry (LC-HRMS), in combination with chemometric methods, are today among the best analytical tools to characterize, classify and authenticate food products [18][19][20][21][22][23]. These platforms are one of the best ways to detect fraudulent practices in which the most valued components of processed fruit extracts are replaced by others of lower commercial value, with worse organoleptic characteristics, or without the intended beneficial properties for human health. Food fingerprinting, the non-targeted chemical analysis of food products combined with multivariate data analysis, is emerging as an innovative approach for food authentication [24][25][26]. This approach is based on the principle of metabolomics, the scientific study of the metabolites (small molecules below 1,500 Da) present in a biological system, with the aim of detecting as many components as possible. Although the main focus of metabolomics is in the fields of pharmacology and toxicology, the use of these approaches in food science is gaining acceptance. However, in the food field, an important distinction is made between the concepts of food fingerprinting and food profiling, in accordance with the corresponding definitions in metabolomics [24][27]. Researchers from the metabolomics field use "profiling" and "fingerprinting" differently from researchers devoted to food science. The arrival of the "foodomics" discipline was not enough to resolve this terminological problem, since authors keep using the terms with both meanings [26]. Food profiling focuses on the analysis of a group of known, selected metabolites, or a group of chemicals belonging to the same family or sharing a similar structural feature.
The concentrations (or peak signals) of these targeted compounds are then used as food features (markers) to address food authentication. In contrast, food fingerprinting does not deal with the identification of metabolites but with the recognition of patterns, the so-called "fingerprints" of the foodstuff [28]. After the patterns have been identified and mapped to individual food matrices, the objective is usually to differentiate between food fingerprints in terms of features such as botanical species or geographical origin, or with respect to possible food adulteration [24]. The fact that similar fruit extracts with different properties are used as ingredients in processed food products and pharmaceutical preparations makes targeted methods harder to apply, and non-targeted approaches may be needed to obtain specific fingerprints of the original products.
These so-called food fingerprinting approaches aim to capture as many compounds or features as technically possible to gain a comprehensive insight into the composition of the sample [25]. However, the large amount of chemical data obtained makes its treatment difficult. In this regard, several chemometric data processing software packages with different characteristics and algorithms have been introduced for MS users [29]. After data acquisition and processing, univariate and multivariate chemometric methods are used for sample characterization, classification and authentication [30][31].
The aim of this work was to develop a suitable method to characterize, classify and authenticate natural and pharmaceutical cranberry-based products by ultrahigh performance liquid chromatography-high resolution mass spectrometry (UHPLC-HRMS), using a non-targeted fingerprinting approach with a Q-Exactive Orbitrap analyzer. Different classes of fruit-based (cranberry, blueberry, raspberry and grape) products, including raw fruit extracts, fruit juices and raisins, as well as commercial cranberry-based pharmaceutical preparations, including raw extracts, powder capsules, syrups and sachets, were analyzed after a simple sample extraction procedure. The hypothesis of this work is that UHPLC-HRMS fingerprinting data, obtained in both positive and negative ESI modes and also combined by data fusion, can be considered a source of potential chemical descriptors for the characterization and classification of fruit-based natural products and pharmaceuticals by unsupervised principal component analysis (PCA) and supervised partial least squares-discriminant analysis (PLS-DA). Data were further treated by partial least squares (PLS) regression to quantify the percentages of fruit extracts (grape, blueberry and raspberry) used as adulterants in cranberry extracts.

Instrumentation and methods
The chromatographic fingerprints were obtained with an Accela UHPLC system [...] column re-equilibration for 6 min at initial conditions. The injection volume was 10 µL.

Sample treatment
A total of 106 natural and pharmaceutical products were analyzed in this work.
Natural products from different brands were purchased from markets in Barcelona, and pharmaceutical preparations and raw extracts were provided by Deiters S.L. Company. For authentication studies by PLS regression, three cases were evaluated in which cranberry extracts were adulterated with different amounts of grape, blueberry or raspberry, respectively. For this purpose, 3 cranberry, 3 grape, 3 blueberry and 3 raspberry fruit sample extracts were processed as indicated above.

Data analysis
The data treatment for the untargeted analysis was carried out with R

UHPLC-HRMS fingerprinting
In this work, a non-targeted UHPLC-HRMS fingerprinting analysis of fruit-based products and cranberry-based pharmaceuticals was evaluated in order to obtain suitable chemical descriptors for sample classification and authentication. For that purpose, 106 samples were processed with a simple extraction method and the resulting extracts were analyzed with a C18 reversed-phase UHPLC-HRMS method (see experimental section). The fingerprint of a fruit-based product depends on both the genotype of the fruit variety and the product phenotype (food attributes determined by ambient conditions, agricultural practices, food-processing procedures, etc.). Thus, these fingerprints are expected to provide good chemical descriptors for sample characterization and classification by means of chemometric methods.
Accordingly, the untargeted strategy relied on UHPLC-HRMS fingerprints consisting of peak intensities recorded as a function of m/z and retention time. Data were acquired in both negative and positive HRMS full scan mode (m/z 100-1500). As an example, Fig. 1 shows the total ion chromatograms (TIC) obtained in negative H-ESI mode for the four types of fruit analyzed (raspberry (a), grape (b), blueberry (c), and cranberry (d)). The figure also shows, as an arbitrary example, the full scan HRMS spectra obtained for each fruit extract at a retention time of 6.61 min. As can be seen, there were important differences in peak signals and abundances in both the total ion chromatograms and the HRMS spectra. Cranberry- and blueberry-based samples seemed to provide richer fingerprints (more signals), while those of grape-based products were simpler.
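The idea of a fingerprint as intensities indexed by m/z and retention time can be illustrated with a minimal Python sketch. It is not the authors' MSConvert/R workflow: the function name `build_fingerprint_matrix`, the bin widths `mz_step` and `rt_step`, and the toy peak lists are all hypothetical. Only the default threshold of 10^5 mirrors the absolute intensity filter reported in the exploratory PCA section.

```python
from collections import defaultdict

def build_fingerprint_matrix(samples, min_intensity=1e5, mz_step=0.01, rt_step=0.1):
    """Turn per-sample peak lists into a samples x features intensity matrix.

    samples: dict mapping sample name -> list of (mz, rt_min, intensity).
    Peaks below min_intensity are discarded; the rest are summed per
    (m/z bin, retention-time bin). Bin widths are illustrative assumptions.
    """
    per_sample = {}
    features = set()
    for name, peaks in samples.items():
        binned = defaultdict(float)
        for mz, rt, intensity in peaks:
            if intensity < min_intensity:
                continue  # drop low-abundance signals to reduce data complexity
            key = (round(mz / mz_step), round(rt / rt_step))
            binned[key] += intensity
        per_sample[name] = binned
        features.update(binned)
    names = sorted(per_sample)
    keys = sorted(features)
    # Features absent from a sample are filled with zero intensity
    matrix = [[per_sample[n].get(k, 0.0) for k in keys] for n in names]
    return names, keys, matrix

# Toy peak lists (made-up values, not measured data)
toy_samples = {
    "cranberry_1": [(200.00, 5.0, 4.0e6), (300.00, 5.0, 2.5e5), (150.00, 1.2, 8.0e4)],
    "grape_1": [(200.00, 5.0, 1.0e6), (150.00, 1.2, 3.0e5)],
}
names, keys, matrix = build_fingerprint_matrix(toy_samples)
```

Each row of `matrix` is one sample and each column one (m/z, retention time) feature, which is the shape expected by PCA, PLS-DA and PLS routines.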

Exploratory principal component analysis study
The raw UHPLC-HRMS fingerprint data were processed with MSConvert software to obtain a profile of peak intensities as a function of m/z values and retention times. In order to reduce data complexity, a threshold peak filter of absolute intensity 10^5 was applied. The converted data were then processed with R software to obtain a data matrix including the UHPLC-HRMS fingerprints [...] Nevertheless, samples tend to be grouped according to the fruit of origin. In general, UHPLC-HRMS fingerprinting in negative ionization mode seems to provide more discriminant chemical descriptors, which concentrate the samples in smaller regions of the score plot (see, for instance, the distribution of cranberry-based and, especially, raspberry-based natural products). Although complete discrimination among the four fruit types was not achieved at this point (for example, blueberry- and grape-based natural products tend to overlap in both score plots), cranberry-based natural products were completely separated from the other three types of fruit with both positive and negative UHPLC-HRMS fingerprints. This is an interesting result because cranberry natural products should be the specific source of the raw extracts employed in the preparation of pharmaceuticals, which are thus susceptible to adulteration with other fruit extracts, as discussed in the Introduction.
Finally, Fig. 2 also shows that cranberry-based samples (both natural products and pharmaceutical preparations) tend to be grouped, more or less, in the same region of the score plots, although with some discrimination depending on the pharmaceutical form (raw extract, capsules, syrups and sachets), and in some cases clearly differentiated from cranberry-based natural products. This is probably because the purification and preconcentration procedures followed by pharmaceutical companies yield raw extracts that are enriched in bioactive compounds compared with untreated cranberry fruit natural products, thus providing different patterns even though the fruit of origin is the same.
Taking into account that raspberry, blueberry and grape extracts are expected to be used as potential adulterants of cranberry extracts, a separate PCA restricted to cranberry-based natural products and the other three fruit families was also performed.
In this case, the dimensions of the data matrices were 84 × 469 for positive H-ESI mode and 84 × 641 for negative H-ESI mode. As an example, Fig. 3 shows the score plots that provided the best sample differentiation: (a) PC1 vs PC2 in positive H-ESI mode and (b) PC1 vs PC3 in negative H-ESI mode. Except for some outliers, which are expected when working with natural products, samples tend to be grouped in both cases according to the fruit of origin, although more overlap between groups was observed in positive ionization mode. Up to this point, UHPLC-HRMS fingerprinting in negative ionization mode seems to provide better discrimination between cranberry fruit products and the adulterant fruits, as can be seen in Fig. 3b. PC1 clearly differentiated cranberry-based fruit products (clustered at the right of the plot) from those obtained from other fruits (distributed at the left of the plot).
It was thus concluded from PCA that samples were reasonably well distinguished according to the fruit of origin. Hence, the data were expected to be suitable for further classification studies by PLS-DA.
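As an illustration of how score plots such as those discussed above are computed, the following Python sketch obtains PCA scores from a samples × features matrix by singular value decomposition. It is a minimal example under stated assumptions, not the authors' implementation: autoscaling (mean-centering and unit variance per feature) is assumed because the preprocessing is not specified in the text, `pca_scores` is a hypothetical helper name, and the data are synthetic.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores, loadings and explained-variance ratios.

    Assumption: columns are autoscaled (mean-centred, unit variance) first;
    the text does not state which preprocessing was applied.
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant features stay at zero instead of dividing by 0
    scaled = centred / std
    # SVD of the scaled matrix: scores = U * s, loadings = rows of Vt
    U, s, Vt = np.linalg.svd(scaled, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, Vt[:n_components], explained[:n_components]

# Synthetic demo (not measured data): two groups of 10 "samples" x 6 "features",
# where the second group is shifted in the first four features.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.5, size=(10, 6))
group_b = rng.normal(0.0, 0.5, size=(10, 6))
group_b[:, :4] += 10.0
X_demo = np.vstack([group_a, group_b])

scores, loadings, explained = pca_scores(X_demo, n_components=2)
pc1_a, pc1_b = scores[:10, 0], scores[10:, 0]
```

Plotting the first column of `scores` against the second (or third) gives the kind of PC1 vs PC2 (or PC1 vs PC3) score plot described in this section; in the synthetic demo the two groups fall on opposite sides of PC1.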

Supervised partial least squares-discriminant analysis study
UHPLC-HRMS fingerprints obtained in both negative and positive H-ESI acquisition modes were evaluated here as chemical descriptors to address sample classification by PLS-DA. For that purpose, no further treatment of the data matrices used for PCA was required. Therefore, the same X-data matrix employed in PCA was submitted to PLS-DA, while the Y-data matrix encoded the class membership of each sample (i.e., cranberry products, cranberry-based nutraceuticals, and grape, blueberry and raspberry products). Fig. 4 shows the 3D scatter plots of scores of LV1 vs LV2 vs LV3 from UHPLC-HRMS fingerprints in negative H-ESI (Fig. 4a) and in positive H-ESI (Fig. 4b). In addition, the data fusion of negative and positive fingerprints was also evaluated as a source of chemical descriptors for PLS-DA, and the resulting 3D scatter plot of scores of LV1 vs LV2 vs LV3 is depicted in Fig. 4c. As seen in the figures, a very acceptable discrimination among the analyzed sample groups was obtained in general (samples tend to be grouped according to the fruit of origin), independently of the H-ESI acquisition mode employed, as well as when data fusion of both ionization modes was considered.
In addition, cranberry-based products are differentiated into two groups, namely fruit-based and pharmaceutical-based products, in agreement with the purification and preconcentration procedures applied to nutraceuticals, as previously discussed. Nevertheless, both sample groups tend to be distributed in the same area of the 3D scatter plots of scores, opposite the grape-, blueberry- and raspberry-based samples. Therefore, the individual data sets from negative or positive H-ESI mode, or the data fusion set of both fingerprints, are adequate, a priori, for the characterization and authentication of cranberry-based natural products and pharmaceutical preparations.
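The two data-preparation steps mentioned for PLS-DA, coding class membership in a Y matrix and fusing the negative and positive fingerprints, can be sketched in Python as follows. This is only an illustration: the text does not say how the fusion was done, so simple column-wise concatenation (often called low-level data fusion) is assumed, and the helper names `one_hot_classes` and `fuse_fingerprints` and the toy values are hypothetical.

```python
def one_hot_classes(labels):
    """Build a dummy-coded Y matrix: one column per class, 1.0 marks membership."""
    classes = sorted(set(labels))
    Y = [[1.0 if label == c else 0.0 for c in classes] for label in labels]
    return classes, Y

def fuse_fingerprints(X_neg, X_pos):
    """Assumed low-level fusion: append positive-mode features to negative-mode ones.

    Both matrices must list the same samples in the same row order.
    """
    if len(X_neg) != len(X_pos):
        raise ValueError("both blocks must contain the same samples")
    return [list(neg_row) + list(pos_row) for neg_row, pos_row in zip(X_neg, X_pos)]

# Toy example (made-up values, not measured data)
labels = ["grape", "cranberry", "grape"]
classes, Y = one_hot_classes(labels)

X_neg = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 samples x 2 negative-mode features
X_pos = [[7.0], [8.0], [9.0]]                 # 3 samples x 1 positive-mode feature
X_fused = fuse_fingerprints(X_neg, X_pos)
```

With real data, the two blocks may have very different numbers of features and intensity ranges, so some block scaling before concatenation is a common precaution; whether that was done here is not stated.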
In order to validate the proposed methodology, and taking into consideration that raspberry, blueberry and grape extracts are expected to be used as potential adulterants of cranberry extracts, PLS-DA models using UHPLC-HRMS fingerprints in negative H-ESI mode were built by pairs (cranberry vs grape, cranberry vs blueberry and cranberry vs raspberry). The optimum number of latent variables of each PLS-DA model was established from the average cross-validation classification error, with approximately the first minimum taken as the most appropriate value (Fig. S2, electronic supplementary information). A good classification was obtained for the three studied pairs. Models were built using 70% of each group of samples as the calibration set and were validated with the remaining 30% of the samples. Fig. 5 depicts the PLS-DA plots of scores projected on LV1 vs LV2, as well as the classification plots of cranberry vs raspberry (a), blueberry (b) and grape (c), respectively. As seen in the figures, all samples were correctly assigned to their corresponding class, thus reaching a prediction rate of 100% in each studied case.
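The 70%/30% calibration/validation scheme and the reported prediction rate can be illustrated with a short Python sketch. It assumes that samples were drawn at random within each class, which the text does not specify; `stratified_split` and `classification_rate` are hypothetical helper names, and the labels are placeholders rather than the study's samples.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class contributes ~train_fraction to calibration."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)  # random selection within the class (assumption)
        n_train = round(train_fraction * len(indices))
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)

def classification_rate(true_labels, predicted_labels):
    """Percentage of samples assigned to their true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)

# Placeholder labels for one pairwise model (10 samples per class, made up)
pair_labels = ["cranberry"] * 10 + ["grape"] * 10
train_idx, test_idx = stratified_split(pair_labels)
train_counts = {c: sum(pair_labels[i] == c for i in train_idx) for c in set(pair_labels)}
```

The calibration indices would be used to fit the PLS-DA model and the validation indices to compute the prediction rate with `classification_rate`.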

Adulteration studies by partial least squares regression
UHPLC-HRMS fingerprints in negative H-ESI acquisition mode were also considered for the authentication and quantitation of the level of fruit extract adulterants in cranberry-based extracts. Thus, a cranberry fruit extract was adulterated at different levels (from 2 to 50%) with each of the other three fruits studied (blueberry, raspberry and grape). Adulterated samples were then processed with the proposed sample treatment procedure, and the resulting extract solutions were analyzed by UHPLC-HRMS to obtain the corresponding fingerprints as chemical data for partial least squares regression. A calibration set, as indicated in the experimental section, was first employed to establish the PLS model. The number of latent variables (LV) used in the PLS model was estimated by venetian blinds cross-validation with 2 data splits. The PLS model was then applied to quantify the percentage of adulteration in the samples belonging to the test set. Fig. 6 shows the PLS results when grape (a), blueberry (b), and raspberry (c) were the adulterants, showing the good performance of the PLS models. Calibration errors were below 0.01% in all cases and, in general, small prediction errors were also obtained in the validation study, with values of 0.17% and 0.47% when grape and blueberry were used as adulterants, respectively; when raspberry was the adulterant, the prediction error increased to 3.86%. However, taking into consideration that adulteration levels in nutraceuticals are expected to be high if an economic profit is intended, the proposed methodology showed a good performance for the authentication and quantitation of fraud, even at low adulteration levels.
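The PLS calibration and the venetian blinds cross-validation described above can be sketched in Python as follows. This is a minimal PLS1 (single response) implementation based on the NIPALS algorithm, offered as an illustration rather than the software actually used. The helper names (`pls1_fit`, `pls1_predict`, `venetian_blinds`, `rmsecv`), the adulteration levels and the simulated fingerprints are all hypothetical, and the mixtures are assumed to combine linearly.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, regression vector)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = w / norm
        t = E @ w                     # scores of this latent variable
        tt = t @ t
        p = E.T @ t / tt              # X loadings
        qk = f @ t / tt               # y loading
        E = E - np.outer(t, p)        # deflate X
        f = f - qk * t                # deflate y
        W.append(w); P.append(p); q.append(qk)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)   # B = W (P'W)^-1 q
    return x_mean, y_mean, B

def pls1_predict(model, X):
    x_mean, y_mean, B = model
    return (np.asarray(X, dtype=float) - x_mean) @ B + y_mean

def venetian_blinds(n_samples, n_splits=2):
    """Sample i goes to split i % n_splits (every n-th sample is held out together)."""
    return [[i for i in range(n_samples) if i % n_splits == s] for s in range(n_splits)]

def rmse(y_true, y_pred):
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def rmsecv(X, y, n_components, n_splits=2):
    """Cross-validated RMSE using venetian blinds splits."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    predictions = np.zeros(len(y))
    for held_out in venetian_blinds(len(y), n_splits):
        keep = np.setdiff1d(np.arange(len(y)), held_out)
        model = pls1_fit(X[keep], y[keep], n_components)
        predictions[held_out] = pls1_predict(model, X[held_out])
    return rmse(y, predictions)

# Simulated data (not measured): linear mixtures of two random "fingerprints"
rng = np.random.default_rng(1)
cranberry, adulterant = rng.random(50), rng.random(50)
levels = np.linspace(0.0, 50.0, 11)   # hypothetical adulteration levels in %
X_mix = np.array([(1 - p / 100) * cranberry + (p / 100) * adulterant for p in levels])
X_mix += rng.normal(0.0, 0.001, size=X_mix.shape)

cv_errors = {k: rmsecv(X_mix, levels, k) for k in (1, 2, 3)}
best_lv = min(cv_errors, key=cv_errors.get)
calibration_error = rmse(levels, pls1_predict(pls1_fit(X_mix, levels, best_lv), X_mix))
```

Here `cv_errors` plays the role of the cross-validation curve used to choose the number of latent variables, and `calibration_error` is the root-mean-square error of the fitted model on its own calibration samples, expressed in the same units as the adulteration percentage.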
Finally, the proposed methodology provides fingerprint chemical descriptors quickly and with less data processing than targeted UHPLC-HRMS profiling approaches, in which a known family of bioactive compounds needs to be characterized and detected, and their signals confirmed and quantified by employing standards. This makes non-targeted UHPLC-HRMS fingerprinting methods cheaper than targeted approaches, as chemical standards are not required to achieve sample classification.

Conflict of interest declaration
The authors declare no conflict of interest.