Current strategies to guarantee the authenticity of coffee

Abstract As they become more health conscious, consumers are paying increasing attention to food quality and safety. In coffee production, fraudulent strategies to reduce costs and maximize profits include mixing beans from two species of different economic value, the addition of other substances and/or foods, and mislabeling. Therefore, testing for coffee authenticity and detecting adulterants is required for value assessment and consumer protection. Here we provide an overview of the chromatography, spectroscopy, and single-nucleotide polymorphism-based methods used to distinguish between the major coffee species Arabica and Robusta. This review also describes the techniques applied to trace the geographical origin of coffee, based mainly on the chemical composition of the beans, an approach that can discriminate between coffee-growing regions on a continental or more local level. Finally, the analytical techniques used to detect coffee adulteration with other foods and/or coffee by-products are discussed, with a look at the practice of adding pharmacologically active compounds to coffee, and their harmful effects on health.


Introduction
Coffee is a beverage with a distinctive taste and aroma made from ground roasted coffee beans. Due to its aromatic flavor and the beneficial effects of caffeine and other bioactive components, millions of people consume coffee every day. The world produces 6.3 million tons of coffee per year in about 60 tropical and subtropical countries (mainly Hawaii, Jamaica, Ethiopia, Kenya, Brazil and Vietnam), some producing coffee as their main agricultural export. The coffee plant belongs to the Coffea genus of the Rubiaceae family, which has more than 100 species, although most of the coffee consumed is produced from Coffea arabica (Arabica) and Coffea canephora (Robusta) (Núñez et al. 2020).
The composition of green coffee beans is dominated by carbohydrates (approximately 60% dry weight) and lipids (8-18%), with a minor amount of proteins, peptides, and free amino acids (9-16%) (Ludwig et al. 2014). The phytochemical profile of green coffee beans is complex, with over 1000 different chemical classes, including diterpenes (cafestol and kahweol), methylxanthines (e.g., caffeine, theobromine, and theophylline), nicotinic acid (vitamin B3), and trigonelline (Jeszka-Skowron, Zgoła-Grześkowiak, and Grześkowiak 2015). For years, coffee has been valued for its stimulating effect, associated mainly with caffeine (Butt and Sultan 2011; George, Ramalakshmi, and Mohan Rao 2008). However, it is now known that coffee contains many other bioactive components with valuable health-promoting properties. Coffee is rich in antioxidant substances such as phenolic compounds, the most abundant being ellagic, caffeic, and chlorogenic acids (Butt and Sultan 2011; George, Ramalakshmi, and Mohan Rao 2008). Studies have attributed many potential health benefits to coffee intake, including the prevention of several chronic and degenerative diseases, such as cancer, type 2 diabetes, cardiovascular conditions and Parkinson's disease (Esquivel and Jimenez 2012; Ludwig et al. 2014; George, Ramalakshmi, and Mohan Rao 2008). Among the bioactive compounds responsible for these effects, polyphenols are the most important (Bułdak et al. 2018). Chlorogenic acid, the major polyphenol of coffee, is reported to have antibacterial, antifungal, antiviral, antioxidant, and chemo-protective properties (Bharath, Sowmya, and Mehta 2015; Hayakawa et al. 2020). Furthermore, caffeic acid exerts anticancer effects through the inhibition of DNA methylation and prevention of tumorigenic processes (Yu et al. 2011). Coffee polyphenols have also demonstrated potential anti-obesity effects and they can improve metabolic risk factors such as hypertension, abdominal obesity, and hyperglycemia (Ohishi et al. 2021; Gökcen and Şanlier 2019).
The chemical profile, and therefore the antioxidant characteristics of coffee, can vary depending on the origin, variety, degree of roasting, and storage conditions, among other factors (George, Ramalakshmi, and Mohan Rao 2008;Herawati et al. 2019). The frequent and diverse adulteration practices in coffee production can involve the quality of the coffee beans (substitution by beans of other species or geographical origin, or defective beans), or the addition of external agents (for example, coffee husks and stems, soybeans, maize, barley, brown sugar), strategies that reduce production costs and increase profits from the final product (Toci et al. 2016).
For the consumer, flavor is what matters most in a high-quality coffee, which is described as having a balanced combination of body, aroma and flavor without any defects (Sunarharum, Williams, and Smyth 2014). Whereas green coffee has a mild, bean-like aroma, the desirable fragrance associated with coffee beverages is developed during roasting. The air temperatures in standard roasting are in the range of 180-250 °C, and roasting time can vary from 25 min at the lowest temperatures to 2 min at the highest, depending on the desired degree of roasting and the technique employed (Parliment, Ho, and Schieberle 2000). The flavor and aroma of brewed coffee are intrinsically linked to this roasting process, during which the chemical composition changes profoundly due to Maillard and Strecker reactions (Flament 2001; Ishwarya S and Nisha 2021). The substances produced in these reactions are responsible for the characteristic aroma of coffee and its pleasant bitterness. The characteristic flavor and aroma that these components provide make it possible to classify coffee according to its quality based on sensory analysis. In this approach, trained panelists evaluate coffee quality from an olfactory and sensory perspective on a score scale developed by the Specialty Coffee Association of America (SCAA).
This review takes a look at the current strategies employed to assess the quality of coffee, including methods that can distinguish between the two main species used in its production, trace the geographical origin of coffee, and detect the addition of adulterants.

Discrimination between Arabica and Robusta coffee species
C. arabica (Arabica) and C. canephora (Robusta) differ in several aspects, for example, morphology, bean size and color, chemical components, and sensorial properties (Davis et al. 2006; Keidel et al. 2010; Feria-Morales 2002). Coffee is generally marketed as a mixture of the two species blended in different amounts to achieve the desired sensory characteristics (Martín, Pablos, and González 1998). Arabica is employed to enhance aroma, whereas Robusta is usually added to improve the body and foam of some coffee beverages (e.g., espresso coffee) and in instant coffee production (Wongsa et al. 2019; Clarke 2012).
Due to differences in price and organoleptic properties, Robusta can be considered as an adulterant of Arabica, and its illegal addition constitutes fraud. The more expensive Arabica coffee (reaching 20-25% higher market prices) has a more pronounced and refined flavor. On the other hand, Robusta crops are more resistant to disease, but the coffee they produce is considered to have an inferior flavor. It is therefore important to develop analytical methods that allow the reliable identification of both species and the estimation of their content in coffee products. Several approaches to coffee varietal identification have been applied with relative success, but many require techniques that are expensive and/or time-consuming.

Chromatographic techniques
Table 1 provides a general description of the chromatographic methods used to distinguish between Arabica and Robusta coffee species, highlighting the strengths and weaknesses of each method. Chromatography is one of the most versatile methods for detecting fraud in coffee (Wang, Lim, and Fu 2020). The triglyceride and tocopherol contents of green and roasted Arabica and Robusta coffee beans were determined by reversed-phase and normal-phase high-performance liquid chromatography (HPLC), respectively, after Soxhlet extraction with hexane. Applying principal component analysis (PCA) and linear discriminant analysis (LDA), species discrimination was achieved with both parameters, but only tocopherols allowed differentiation between green and roasted coffees. Similarly, the tocopherol profile in the two coffee species was analyzed by normal-phase HPLC/diode-array/fluorescence detection (Alves et al. 2009), and the higher content of β-tocopherol in Arabica after roasting permitted a clear separation; in Robusta, the mean degradation of this antioxidant was approximately 25% when expressed as dry weight. The ratio between α:β:γ tocopherol homologues determined by reversed phase-ultra HPLC electrospray ionization/mass spectrometry (RP-UHPLC-ESI/MSⁿ) was reported as an authentication marker able to distinguish between coffee varieties even in roasted samples (Górnaś et al. 2014). In this study, an alkaline saponification procedure followed by extraction with a mixture of organic solvents was necessary to improve the recovery of tocopherols from coffee beans.
HPLC was also employed to evaluate the content of hydrosoluble compounds (caffeine, trigonelline, 5-caffeoylquinic acid, and nicotinic acid) as a method to discriminate between Arabica and Robusta in coffee blends (Dias and Benassi 2015). The most efficient discriminator was caffeine, which was unaffected by the degree of roasting, unlike the other tested compounds, whose application as markers required an additional step to characterize the roasting. To circumvent these difficulties, in the HPLC-diode-array-based method developed by Casal et al. (2000), all samples were roasted to the same degree. Multivariate and nonparametric analysis of the chromatographic results revealed that trigonelline and caffeine effectively discriminated between Arabica and Robusta, but not nicotinic acid (Casal et al. 2000).
Other potential biomarkers for Arabica and Robusta coffee are biogenic amines (putrescine, cadaverine, serotonin, tyramine, spermidine, and spermine). Using a method based on reversed-phase HPLC after derivatization with dansyl chloride and multivariate analysis, it was determined that putrescine, the predominant biogenic amine in green beans, could be used for species discrimination, even after different post-harvest processes, but the statistical significance decreased considerably after roasting (Casal et al. 2004).
Recently, non-targeted approaches relying on HPLC-UV chromatographic fingerprints together with partial least squares regression-discriminant analysis (PLS-DA) have also been applied for the evaluation of varietal classification and authentication (Núñez et al. 2020; De Luca et al. 2018).
Gas chromatography (GC) has also been applied: the free amino acids, as well as the amino acids obtained after acid hydrolysis, were analyzed after derivatization. Multivariate analyses applied to the results showed that the free amino acids can serve as a tool to discriminate between Arabica and Robusta, especially L-glutamic acid, L-tryptophan, and pipecolic acid. Although they have less discriminatory capacity, the amino acid levels after acid hydrolysis can also be used.
In summary, chromatographic techniques allow the identification of a large number of biomarkers (triglycerides, tocopherols, hydrosoluble compounds, biogenic amines, amino acids and fatty acids (FA)) to discriminate between Arabica and Robusta coffee species. Another advantage is that only a small amount of sample is required compared to spectroscopic techniques.

Spectroscopic techniques
Spectroscopic techniques have emerged as an attractive and useful tool for varietal identification purposes: methods based on nuclear magnetic resonance (NMR) spectroscopy and Raman spectroscopy, also combined with near infrared (NIR) spectroscopy, have been developed. Table 2 provides a general description of the spectroscopic methods used to distinguish between Arabica and Robusta coffee species, highlighting the strengths and weaknesses of each. These methodologies have proved to be easily implemented in routine analysis. In most of these studies, multivariate methods such as PCA, LDA, or partial least squares regression (PLS) were employed to evaluate the complex spectral information and to identify the compounds responsible for differentiation.
An ultraviolet-visible (UV-Vis) spectroscopy-based determination of caffeine and chlorogenic acid contents to discriminate between green coffee beans of Arabica and Robusta was reported recently (Adnan et al. 2020). Seventy-four green coffee bean samples from Indonesia were analyzed in this study, and the data related to both compounds were processed using LDA, achieving an accuracy of 97%.
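To illustrate how two measured contents can feed a linear discriminant, the following minimal sketch implements a two-class Fisher LDA in plain Python. It is only an illustration of the general technique: the caffeine and chlorogenic acid values are invented, they are not data from Adnan et al. (2020), and the function names (fit_lda, predict) are hypothetical.

```python
# Two-class Fisher linear discriminant on two features, in plain Python.
# Each sample is (caffeine, chlorogenic acid); all values are invented
# for illustration and are NOT measurements from any cited study.

def mean_vec(rows):
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

def scatter(rows, m):
    # 2x2 scatter matrix: sum of outer products of deviations from the mean
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(class_a, class_b):
    ma, mb = mean_vec(class_a), mean_vec(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    # Pooled within-class scatter and its 2x2 inverse
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Discriminant direction w = Sw^-1 (mean_a - mean_b)
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Decision threshold: projection of the midpoint between class means
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def predict(model, sample):
    w, threshold = model
    score = w[0] * sample[0] + w[1] * sample[1]
    # Class a (Arabica here) projects above the threshold by construction
    return "arabica" if score > threshold else "robusta"

# Hypothetical training data (% dry weight)
arabica = [(1.1, 6.8), (1.3, 6.2), (1.2, 6.5), (1.0, 6.4), (1.4, 6.9)]
robusta = [(2.1, 9.4), (2.4, 8.8), (2.2, 9.1), (2.0, 9.0), (2.5, 9.6)]

model = fit_lda(arabica, robusta)
print(predict(model, (1.15, 6.6)))
print(predict(model, (2.3, 9.2)))
```

With real data, accuracy would normally be estimated on samples held out from model fitting rather than on the training set.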
The original NIR spectra of roasted coffee samples can be used directly to develop a classification model with a moderate to high discrimination ability for pure varieties. However, after applying orthogonal signal correction methods to remove spectral information unrelated to the variety, Esteban-Díez et al. obtained a notably less complex model with excellent classification power. The same research group applied NIR spectroscopy combined with multivariate calibration methods to quantify the content of Robusta in roasted coffee samples as a means of controlling coffee adulteration (Pizarro, Esteban-Díez, and González-Sáiz 2007). PLS regression and a wavelet-based pre-processing method (called OWAVEC) were applied in this case to perform simultaneously two crucial pre-processing steps in multivariate calibration: signal correction and data compression. Another study also showed NIR spectroscopy to be a very consistent and useful tool to classify coffee samples (Buratti et al. 2015). The practicability of the approach was demonstrated by LDA, and an external test set validation showed the samples were 100% correctly classified. More recently, this technique has been applied to intact beans, achieving high classification accuracy (95%) when wavelengths were selected by multivariate analysis (Adnan et al. 2020).
Fourier transform (FT) Raman spectroscopy, which is based on a light-scattering process, allows discrimination between coffee beans of different species, both green and roasted, through their lipid fraction, which is extracted by diethyl ether in a Soxhlet system (Rubayiza and Meurens 2005). Taking advantage of two specific scattering bands at 1567 and 1478 cm⁻¹ in the Raman spectra of the diterpene kahweol (present at 0.1-0.3% of dry matter in Arabica beans and only in traces in Robusta), a set of 86 green and 82 roasted coffees were grouped by species with a high degree of accuracy after PCA.
Table 1 (continued)
Characterization of roasting degree of the sample is required.
Dias and Benassi (2015) Hydrosoluble compounds: caffeine and trigonelline. Roasted coffee beans. 9 samples belong to Arabica, 20 to Robusta. Suitable for routine analysis in the coffee industry.
There was no association with the geographical origin of the samples.
Casal et al. (2000)
Fingerprint from solid-liquid extraction using water/methanol mixture. Useful and suitable tool to assess the amounts of Arabica and Robusta in a coffee blend.
The variability of FA composition in Robusta reduces applicability in blends containing a high percentage of Robusta.

NMR spectroscopy is a powerful tool for the qualitative and quantitative analysis of complex mixtures of small molecules in solution and has been used with great success to analyze foods and beverages. This approach is especially suitable for the quantification of minor components in complex matrices (Olmo-Cunillera et al. 2020). Using proton nuclear magnetic resonance (¹H NMR) spectroscopy, kahweol and 16-O-methylcafestol (16-OMC) were established as markers of Arabica and Robusta, respectively, in the lipophilic extracts of authentic roasted and green coffees (Monakhova et al. 2015). The integration of the 16-OMC signal (δ 3.165 ppm) was used to estimate the amount of Robusta in coffee blends with an approximate limit of detection of 1-3%. The method was successfully applied for the analysis of 77 commercial coffee samples (coffee pods, coffee capsules, and coffee beans). Another study revealed that the two species can be quickly discriminated by quantitatively evaluating the major metabolites of green coffee beans using carbon-13 nuclear magnetic resonance (¹³C NMR)-based metabolite profiling coupled with chemometric analysis (PCA or orthogonal partial least squares discriminant analysis (OPLS-DA)) and by applying signal assignment information. Additionally, ¹H NMR and multivariate statistical analysis were used to develop an OPLS model based on multiple chemical components, which successfully determined the composition of coffee blends of unknown Arabica and Robusta content, regardless of the geographical origin of the analyzed samples (Cagliani et al. 2013).
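Estimating the Robusta content of a blend from an integrated marker signal can be sketched as a simple linear calibration. The example below is a minimal illustration that assumes the integrated 16-OMC signal scales linearly with the Robusta fraction; the calibration points are invented, the function names are hypothetical, and this is not the procedure of Monakhova et al. (2015).

```python
# Linear calibration sketch: marker signal integral vs. Robusta content.
# Calibration values are invented for illustration only.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def estimate_robusta_percent(integral, slope, intercept):
    """Invert the calibration line and clamp the result to 0-100%."""
    estimate = (integral - intercept) / slope
    return min(100.0, max(0.0, estimate))

# Hypothetical standards: known Robusta content (%) and measured integrals
robusta_pct = [0.0, 10.0, 25.0, 50.0, 100.0]
integrals = [0.02, 0.52, 1.27, 2.52, 5.02]

slope, intercept = fit_line(robusta_pct, integrals)
print(round(estimate_robusta_percent(1.52, slope, intercept), 2))
```

In practice, the limit of detection would be derived from the noise of blank (pure Arabica) measurements rather than from the fitted line alone.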
A method based on direct-infusion electrospray ionization-mass spectrometry (ESI-MS) data calibrated by a PLS multivariate technique allowed the rapid detection and quantification of adulterations of Arabica coffee with Robusta (Garrett et al. 2012). A total of 16 PLS models were built using ESI(±) quadrupole time-of-flight (QToF) and ESI(±) Fourier transform ion cyclotron resonance (FT-ICR) MS data from hot aqueous extracts of certified coffee samples. The 30 most abundant ions accurately predicted the composition of commercial Robusta and Arabica coffee blends. In addition, ESI(±) FT-ICR MS analysis identified 22 compounds in Arabica and 20 compounds in Robusta, mostly phenolics, which were responsible for the distinction between the coffee varieties.
The proton transfer reaction-time of flight-mass spectrometry (PTR-ToF-MS) technique for the analysis of volatile organic compounds (VOCs) can be used for a rapid and correct classification of Arabica and Robusta coffee at different stages of processing, from the roasted beans to the brewed coffee, but not for green beans (Colzi et al. 2017). After multivariate statistical analysis, the identified VOCs (16 for roasted beans, 12 for ground coffee and 12 for brewed coffee) were able to characterize the different aromatic profiles of the two species and discriminate between them. The best results were obtained with roasted beans, which may therefore be the most suitable coffee matrix for authentication screening.
In brief, spectroscopic methods have been widely used to distinguish between Arabica and Robusta coffee species. Their main strengths are i) simplified measurement procedures, ii) high throughput, iii) speed and low cost, and iv) the range of markers that can be monitored (lipid fraction, caffeine and chlorogenic acid, 16-OMC and VOCs). However, these methods can be affected by environmental conditions, and their success depends on the signal pre-processing methods applied to minimize the spectral variation caused by differences in sample preparation and measurement conditions.

Single-nucleotide polymorphism-based methods
Single-nucleotide polymorphisms (SNPs) are single-base changes in DNA that discriminate between closely related species and/or varieties. SNP-based methods are therefore useful for authenticity testing of coffee beans by enabling the differentiation between Arabica and Robusta varieties. One reported method, based on the detection of an SNP in the chloroplastic trnL(UAA)-trnF(GAA) intergenic spacer, accurately determined the percentage of Arabica and Robusta beans in a mix. After polymerase chain reaction (PCR) amplification of this genomic region, the resulting DNA fragments were subjected to extension reactions by DNA polymerase using Robusta-specific and Arabica-specific primers. In the reaction, the extended strands were labeled with oligo(dA) tags and biotin. The products were immobilized in streptavidin-coated microtiter wells and hybridized with the oligo(dT)-conjugated photoprotein aequorin. The fragments were then quantified by measuring the presence of aequorin via its characteristic bioluminescent reaction following the addition of Ca²⁺.
In subsequent work, this SNP-based authentication assay was further developed into a low-cost, disposable, dipstick-type test that allows DNA-based coffee bean authenticity testing by the naked eye. After the described PCR amplification of the chloroplastic intergenic spacer region and fragment extension using species-specific primers, the fragments are applied to the dipstick, followed by a carrier buffer. While being transferred through a membrane, DNA fragments take up gold nanoparticles. Species-specific fragments are held back by immobilized streptavidin due to their biotin labeling, while unspecific fragments bind to a final zone on the membrane and serve as a control. The presence and quantity of labeled fragments can be easily assessed by the intensity of the nanoparticle staining.

FA: fatty acids.

To date, very few studies have used SNPs to discriminate between closely related coffee species and/or varieties. Nevertheless, SNP-based methods are useful for authenticity testing of coffee beans because they enable the differentiation between Arabica and Robusta varieties.

Geographical origin authenticity
The worldwide growth of the coffee market has increased the importance of the geographical origin of coffee, and this information is increasingly included on product labels. As the quality of this globally appreciated beverage is associated with specific growing areas, mislabeling has become another area of fraud.
Tracing the geographical origin of coffee is challenging, mainly because the chemical composition of beans is influenced not only by agronomic practices and the climate of the growing area, but also by the post-harvest processing methods, storage conditions, distribution, and roasting procedures (Alves et al. 2009). The choice of a discrimination technique depends not only on its performance, but also on the time required for analysis, the cost of the analytical equipment, and the possibility of automation (Anderson and Smith 2002; Perez, Lopez-Yerena, and Vallverdú-Queralt 2020). Table 3 provides an overview of the methods commonly used to distinguish the geographical origin of coffee.

Discrimination between major coffee-growing regions
NMR has emerged as a promising technique for the traceability of coffee from the largest growing areas. In this context, the metabolite content of Arabica roasted coffee samples from America, Africa, and Asia was investigated by NMR spectroscopy (Consonni, Cagliani, and Cogliati 2012). The samples were clearly separated according to origin when OPLS-DA models were applied to ¹H NMR data. The main compounds characterizing the American samples were FA, whereas chlorogenic acids and lactate were the key compounds for African coffee, and acetate and trigonelline for the Asian samples. On the other hand, the geographical origin of green coffee beans can be rapidly discriminated by quantitative ¹³C NMR-based metabolomics (Wei et al. 2012). The content of caffeine was found to be higher in Robusta green coffee beans from Vietnam compared to Indonesia, or in those from Central America compared to South America and Africa, therefore serving as an indicator of origin. Other reported indicators are chlorogenic acids, acetic acid and amino acid levels.
Coffee bean samples from three major coffee-growing regions (Indonesia, East Africa, and Central/South America) were analyzed by elemental analysis using inductively coupled plasma atomic emission spectroscopy (ICP-AES) (Anderson and Smith 2002). A computational evaluation of the data sets from 11 elements was carried out using statistical pattern recognition methods, including PCA, discriminant function analysis, and neural network modeling, with classification success rates of 70-86%. Similarly, the trace element composition of coffee beans from six different regions (Brazil, Colombia, Vietnam, Indonesia, Tanzania, and Guatemala) was analyzed using a high sensitivity X-ray fluorescence spectrometer with three-dimensional polarization optics (Akamine et al. 2010). After optimization of the experimental conditions and the construction of the linear calibration curves, the analytical results of six trace elements were used in the PCA to classify both roasted and green beans according to their growing area.
Regarding stable isotope ratios of elements, it was found that the isotope ratios of carbon, nitrogen, and boron in green coffee beans produced on three continents (Africa, Asia and America) were good indicators of geographical origin (Serra et al. 2005). The combination of the isotopic fingerprints of these three elements and the subsequent PCA successfully identified the continental origin of 88% of the analyzed samples. Although this approach has produced promising results, it fails to distinguish between adjacent countries with similar climatic environments. Moreover, samples from large countries with a variety of climatic areas may also result in an extensive range of isotope ratio values, and therefore a wide dispersion. Multi-element stable isotope analysis of caffeine isolated from green coffee beans of different geographical origins (Central and South America, Africa, Indonesia, Jamaica and Hawaii) was carried out using isotope ratio mass spectrometry (IRMS) and elemental analysis (EA) (Weckerle et al. 2002). Data evaluation by LDA and classification and regression tree (CART) analysis showed that the δ¹⁸O VSMOW values were highly significant for origin assessment.
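Stable isotope results such as δ¹⁸O are conventionally expressed in delta notation, that is, the relative deviation of the sample isotope ratio from that of a reference standard, in per mil: δ = (R_sample / R_standard - 1) × 1000. The short sketch below shows this arithmetic; the function name is hypothetical, and the ¹⁸O/¹⁶O ratio used for VSMOW is a commonly quoted value treated here as an assumption.

```python
# Delta notation for stable isotope ratios, in per mil.
# R_VSMOW_18O is a commonly quoted 18O/16O ratio for the VSMOW reference;
# it is used here as an assumption for illustration.
R_VSMOW_18O = 2005.20e-6

def delta_permil(r_sample, r_standard):
    """Relative deviation of a sample isotope ratio from a standard (per mil)."""
    return (r_sample / r_standard - 1.0) * 1000.0

# A hypothetical sample enriched by 2.5% in 18O/16O relative to the standard
sample_ratio = 1.025 * R_VSMOW_18O
print(round(delta_permil(sample_ratio, R_VSMOW_18O), 3))
```

A positive delta value indicates enrichment in the heavy isotope relative to the standard, and a negative value indicates depletion.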
Based on the volatile and semi-volatile profiles in coffee, ToF-MS has also been applied to trace the origin of coffee bean samples. In this regard, a rapid analytical method to distinguish the geographical origin of coffee samples from  supervised multivariate data analysis techniques, significant differences were found in the volatile profiles of the coffee according to origin, as visualized by PCA, and classification prediction accuracy was established by further PLS-DA. The geographical origin of green coffees from the major growing regions of America, Africa, Asia, and Oceania was also analyzed by HPLC coupled with UV spectrophotometry ( -Salces et al. 2009). Phenolic and methylxanthine profiles provided classification models that correctly identified all authentic Robusta green coffee beans from Cameroon and Vietnam and 94% of those from Indonesia after multivariate data analysis by LDA and PLS-DA. Moreover, PLS-DA afforded independent models for Robusta samples from these three countries with classification sensitivities and specificities close to 100%, and for Arabica samples from America and Africa with sensitivities of 86 and 70% and specificities of 90 and 97%, respectively. The contents of chlorogenic acids, caffeine and total polyphenols were analyzed by means of UHPLC coupled to an Exactive Orbitrap MS for the geographical assessment of coffee samples from China, India and Mexico (Mullen et al. 2013). Arabica and Robusta coffee from India and Mexico showed similar contents of chlorogenic acids and polyphenols, whereas significantly lower contents were found in samples from China.
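The sensitivities and specificities quoted for classification models follow directly from the counts of correct and incorrect assignments for a given class. The sketch below shows the calculation; the counts in the example are hypothetical and are not taken from any of the cited studies.

```python
# Sensitivity and specificity of a one-class-versus-rest classification model.
# The counts used in the example are hypothetical.

def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: samples of the target origin correctly assigned to it
    fn: samples of the target origin assigned elsewhere
    tn: samples of other origins correctly rejected
    fp: samples of other origins wrongly assigned to the target origin
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one sample")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=43, fn=7, tn=45, fp=5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Reporting both figures matters because a model can reach high sensitivity for one origin simply by assigning most samples to it, which lowers its specificity.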
To date, few published studies have compared the different analytical techniques applied to trace geographical origin. However, quite recently, Medina et al. published a collective comparative analysis of ¹H NMR, attenuated total reflectance-mid infrared (ATR-MIR), and NIR applied to detect fraud in Colombian coffee (Medina et al. 2017). For each technique, classification models were constructed for discrimination by origin and ATR-MIR emerged as the best candidate, as it showed the same ability as ¹H NMR to determine the Colombian origin, but more rapidly and at a lower cost; NIR fell short in comparison with the other methods.
In summary, NMR is the most powerful technique for the traceability of coffee from the largest growing areas, although IRMS and EA seem to have gained interest in the last few years. MS and UV coupled to GC and HPLC have also been used to determine the volatile and semivolatile profile of coffees, but further research is necessary to improve the applicability of these techniques.

Discrimination between local/regional growing areas
The effectiveness of chlorogenic acids, FA, and elements analyzed by HPLC, GC, and ICP-AES, respectively, for the discrimination of five (one traditional and four introgressed) Arabica varieties from three Colombian locations was compared using PCA and discriminant analysis (Bertrand et al. 2008). Although elements provided an excellent classification of the three locations studied, this chemical class was ineffective for discriminating between the Arabica varieties. Chlorogenic acids gave satisfactory results, but FA were clearly the most effective in distinguishing between varieties (Arabica versus Robusta) and regions, with very high percentages of correct classification (79 and 90%, respectively). On the other hand, green coffee samples from four different cities in the south of Brazil were successfully distinguished by NIR spectroscopy (Marquetti et al. 2016) after the complexity and quantity of information within the spectra were reduced by PLS-DA.
Recently, the phenolic profile obtained by UPLC-MS was applied to determine the geographical origin of green coffee beans produced in four Ethiopian regions (Mehari et al. 2021). PCA of the data identified 3-caffeoylquinic acid, 3,4-dicaffeoylquinic acid, 3,5-dicaffeoylquinic acid, and 4,5-dicaffeoylquinic acid as the most discriminating phenolic compounds for authentication, with a moderate classification efficiency (74% prediction success rate). On the other hand, the metabolite variability in coffee grown in Indonesia, a top exporter of Arabica coffee, was analyzed by means of non-targeted GC/MS according to species and geographical origin (Putri, Irifune, and Fukusaki 2019).
In summary, in an effort to confirm the validity of the origin information on product labels, numerous technologies have been applied to discriminate between major coffee-growing regions and between local/regional growing areas. While some biomarkers show high classification efficiency (e.g. chlorogenic acids, FA, lactate, acetate and trigonelline, caffeine, carbon, nitrogen, and boron), others (e.g. the phenolic profile) show only moderate classification efficiency.

Other adulteration practices in roasted coffee
Fraudulent or accidental adulteration is the most serious problem affecting the coffee trade (Nogueira and do Lago 2009). To lower production costs, beans from two species of different economic value may be mixed and other substances added. The major adulterants of coffee include roasted and unroasted coffee husks or parchments, coffee stems, maize, barley, chicory, cereals, wheat middlings, brown sugar, soybean, rye, triticale, acai (Toci et al. 2016), malt, starch, maltodextrins, glucose syrups, and caramelized sugar (Nogueira and do Lago 2009).
As well as devaluing the coffee product, the addition of substances could also affect consumer health, which has prompted the development of several analytical techniques to detect the presence of adulterants in coffee. Microscopy analysis and visual inspection have been traditionally used to examine roasted and ground coffee, but they are not suitable to identify impurities in processed coffee (Cai, Ting, and Jin-Lan 2016;Nogueira and do Lago 2009). Therefore, other methods have been developed that provide more reliable and reproducible results, including chromatographic, spectroscopic, voltammetric and biological techniques (Figure 1).

Chromatographic techniques
Adulteration in commercial coffee can be indicated by carbohydrate levels. Thus, by determining the concentration of free and total carbohydrates, it was possible to detect the deliberate contamination of coffee with coffee husk and ligneous material (sticks), as this resulted in a higher content of mannitol, xylose, glucose, and fructose; pure and adulterated products were also distinguished on this basis (Nogueira and do Lago 2009).
Carbohydrates are usually analyzed by HPLC. Accordingly, roasted soybean and wheat adulterations were revealed by a method combining HPLC-high performance anion exchange chromatography with pulsed amperometric detection (HPLC-HPAEC-PAD) and chemometric tools. After characterizing pure roasted coffee beans and adulterants by their carbohydrate profile and monosaccharide content (Cai, Ting, and Jin-Lan 2016), glucose and fructose were established as markers for adulteration with wheat and soybean, respectively. In another study, the standardized ISO 11292:1995 methodology (HPLC-HPAEC-PAD) for the determination of free and total carbohydrate content in soluble coffee was compared with HPLC coupled to UV-Vis to characterize the carbohydrate profile of the adulterants triticale and acai (Domingues et al. 2014). Although both chromatographic methods effectively detected the coffee adulterants, pulsed amperometry was superior for quantification. Nevertheless, the HPLC-UV-Vis system was faster, cheaper and easier to operate. Another study also demonstrated that HPLC-HPAEC-PAD associated with chemometrics has potential as a routine system for adulteration and authenticity tests in ground roasted coffee. It was found that pure roasted coffee has higher levels of galactose and mannose, and that glucose and fructose can indicate adulteration with wheat and soybean, respectively. A novel method developed by Cai, Ting, and Jin-Lan (2016) used ultra performance liquid chromatography-high resolution mass spectrometry (UPLC-HRMS) technology to determine the oligosaccharide composition of coffee and common adulterants. This approach identified up to 17 oligosaccharide markers and detected the presence of soybeans and rice in ground coffee when these adulterants were present in amounts of 5% (Cai, Ting, and Jin-Lan 2016). Based on chemometric analysis (PCA), HPLC was also used in a non-targeted analysis of coffee adulteration with soybeans and green mung beans.
Unlike targeted analysis, this method allowed the identification of unknown additive compounds without sample preparation. Compared to Fourier transform infrared spectroscopy (FTIR), HPLC provides more detailed information because the peaks in the chromatogram represent different compounds, whereas FTIR spectra only indicate functional groups. However, the detection limit of adulterants was 5%, whereas in many chemometric analyses with IR it is below 1% (Cheah and Fang 2020).
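To illustrate how PCA can separate pure from adulterated samples on the basis of a chromatographic profile, the following minimal sketch projects a small table of carbohydrate contents onto its first principal component. All numerical values are invented for illustration and follow only the qualitative trend reported above (higher mannose and galactose in pure coffee, higher glucose in adulterated samples); the power-iteration routine is a simple stand-in for the chemometric software used in the cited studies, not a reproduction of their workflow.

```python
import math

# Invented carbohydrate profiles (g/100 g), for illustration only.
# Columns: mannose, galactose, glucose, fructose.
pure = [
    [19.0, 9.5, 0.5, 0.4],
    [18.5, 9.8, 0.6, 0.5],
    [19.4, 9.2, 0.4, 0.3],
]
adulterated = [
    [15.0, 7.5, 4.0, 0.6],
    [14.2, 7.9, 4.6, 0.5],
    [15.5, 7.2, 3.8, 0.7],
]
samples = pure + adulterated


def center(rows):
    """Subtract the column mean from every value (mean-centering)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows]


def covariance(centered_rows):
    """Sample covariance matrix of mean-centered rows."""
    n, m = len(centered_rows), len(centered_rows[0])
    return [[sum(r[i] * r[j] for r in centered_rows) / (n - 1)
             for j in range(m)] for i in range(m)]


def leading_eigenvector(cov, iterations=200):
    """Approximate the first principal axis by power iteration."""
    m = len(cov)
    v = [1.0] + [0.0] * (m - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


centered = center(samples)
loadings = leading_eigenvector(covariance(centered))
# PC1 score of each sample: projection of its centered profile on the loadings.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

labels = ["pure"] * len(pure) + ["adulterated"] * len(adulterated)
for label, score in zip(labels, scores):
    print(f"{label:12s} PC1 score = {score:+.2f}")
```

In this toy data set the two groups fall on opposite sides of the first component, which is the kind of separation a score plot is used to reveal; real chromatographic data would normally require scaling, more components, and validation on independent samples.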
Tocopherol fingerprinting is another potential approach to detect coffee adulteration. In a study of tocopherol levels based on HPLC-PDA/UV and mean tests, regression analysis, PCA, LDA and SIMCA (Tavares et al. 2016), tocopherol ratios indicated the presence of maize, husks and cleaned husks, γ-tocopherol being the main descriptor for adulterations with both maize and coffee by-products. Another study analyzing α-, β-, γ- and δ-tocopherol by HPLC-UV also found that γ-tocopherol was the best indicator of coffee adulterations with corn (Jham et al. 2007).
In 2009, Oliveira et al. used solid-phase microextraction (SPME)-GC-MS and chemometrics to study coffee adulteration with roasted barley, carrying out a comparative analysis of the volatile profiles of both coffee and barley, pure and mixed, and at several roasting degrees. The results demonstrated that the higher the degree of roasting, the easier it was to distinguish the adulterated samples, allowing the detection of roasted barley in quantities as low as 1% (w/w) in dark roasted coffee samples.

Voltammetry
A new approach to detect coffee adulterations involves voltammetry coupled with chemometrics. This simple low-cost technique avoids the common disadvantages of physical, chemical, and biological methods, such as high costs, long analysis times and the need for skilled manpower. The voltammetric analysis is performed by an electronic tongue, an electronic system that generates complex data requiring chemometric tools to extract the information. This system was first used in coffee samples by Arrieta, Arrieta, and Mendoza (2019) for the detection of adulterations with roasted soybean and corn. They achieved sample discrimination using an electronic tongue equipped with a polypyrrole sensor array, followed by either PCA or cluster analysis. The method was also successfully applied for quantitative analysis by partial least squares regression (Arrieta, Arrieta, and Mendoza 2019;de Morais et al. 2019).

Capillary electrophoresis
Capillary electrophoresis is a powerful tool that can detect and quantify a wide range of food-related molecules with different chemical properties (Papetti and Colombo 2019). In processed coffee, this technique has been applied to detect adulterations with cereals and coffee husks (Nogueira and do Lago 2009), and with soybeans and corn (Daniel et al. 2018), by evaluating the monosaccharide profile. Even though capillary electrophoresis has proven to be a useful technique for the analysis of carbohydrates, it has the disadvantage that monosaccharides need to undergo acid hydrolysis and a neutralization step, which is time-consuming. However, Daniel et al. developed an optimized procedure by using Ba(OH)2 to neutralize the medium, as this reduces the amount of salt and the ionic conductivity of the sample (Daniel et al. 2018). In another study, a strong-base anion resin was used, because it exchanges chloride for hydroxide, which simultaneously neutralizes the medium and reduces the ionic strength (Nogueira and do Lago 2009).

Spectroscopic techniques
FT-MIR has been employed to determine the quality of different food products, including coffee (Karoui, Downey, and Blecker 2010). The spectral variability between pure and adulterated coffee samples is fundamental in building chemometric models (Flores-Valdez et al. 2020). Thus, the characteristic spectral regions of pure coffee (assigned to chlorogenic acid, lipids, lignin, quinic acid, amides, caffeine, among others) (Flores-Valdez et al. 2020; Reis, Franca, and Oliveira 2013a, 2013b; Craig, Franca, and Oliveira 2012), tocopherols (Winkler-Moser et al. 2015) and/or coffee adulterants such as sibutramine have been used (Cebi, Yilmaz, and Sagdic 2017). Both FT-MIR and FT-NIR are rapid, direct, and simple techniques, but the NIR bands are more difficult to interpret and less reproducible and specific. Moreover, the mid-infrared region is more sensitive to the chemical composition of the samples (de Oliveira et al. 2014). Flores-Valdez et al. (2020) developed a method based on FT-MIR spectroscopy coupled with chemometrics that allowed the identification and quantification of coffee adulterants (coffee husks, barley, corn, soy, oat and rice) at concentrations ranging from 1 to 30% (Flores-Valdez et al. 2020). A method based on FT-NIR spectral information has also been used to quantify the amount of barley added to coffee samples (Ebrahimi-Najafabadi et al. 2012). In this study, the excellent predictive ability obtained by multivariate calibration, which was confirmed by the low values of root mean square errors (RMSE), indicated that nondestructive NIR measurements can successfully detect and quantify the fraudulent addition of roasted barley (up to 2% w/w) to roasted coffee. In addition, variable selection using genetic algorithms helped to determine which spectral regions would be most useful to identify the adulteration. ATR-FTIR combined with PCA was also employed to detect sibutramine, an oral anorexiant that may be illicitly included in green coffee.
This method was based on the presence of an absorption band at 2698 cm⁻¹, which is specific to sibutramine hydrochloride monohydrate (Cebi, Yilmaz, and Sagdic 2017).
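The idea behind the multivariate calibration described above, in which spectra are regressed against known adulterant contents and the fit is judged by its RMSE, can be sketched with a single-component PLS model. The absorbance values, the four spectral channels, and the "unknown" sample below are all hypothetical and were invented for illustration; published calibrations use full spectra, several latent variables, and cross-validation or an external test set, none of which is reproduced here.

```python
import math

# Hypothetical calibration set: absorbance at four spectral channels (columns)
# for samples with known adulterant content in % w/w. All numbers are invented.
X = [
    [0.502, 0.299, 0.201, 0.100],
    [0.519, 0.332, 0.190, 0.116],
    [0.541, 0.360, 0.179, 0.129],
    [0.580, 0.418, 0.161, 0.161],
    [0.618, 0.481, 0.140, 0.189],
]
y = [0.0, 5.0, 10.0, 20.0, 30.0]


def fit_pls1_one_component(spectra, contents):
    """Fit a one-latent-variable PLS1 model (first NIPALS step only)."""
    n = len(spectra)
    x_mean = [sum(col) / n for col in zip(*spectra)]
    y_mean = sum(contents) / n
    xc = [[v - m for v, m in zip(row, x_mean)] for row in spectra]
    yc = [v - y_mean for v in contents]
    # Weight vector: spectral direction with maximal covariance with y.
    w = [sum(row[j] * yv for row, yv in zip(xc, yc)) for j in range(len(x_mean))]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores of the calibration samples and regression of y on the scores.
    t = [sum(a * b for a, b in zip(row, w)) for row in xc]
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    return x_mean, y_mean, w, q


def predict(model, spectrum):
    """Predict adulterant content (% w/w) for one spectrum."""
    x_mean, y_mean, w, q = model
    score = sum((v - m) * wj for v, m, wj in zip(spectrum, x_mean, w))
    return y_mean + q * score


def rmse(actual, predicted):
    """Root mean square error between known and predicted contents."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


model = fit_pls1_one_component(X, y)
rmsec = rmse(y, [predict(model, row) for row in X])   # calibration error
unknown = [0.560, 0.390, 0.170, 0.145]                # hypothetical new sample
estimate = predict(model, unknown)

print(f"RMSE of calibration: {rmsec:.2f} % w/w")
print(f"Predicted adulterant content: {estimate:.1f} % w/w")
```

A low calibration RMSE alone does not demonstrate predictive ability; in practice the error is also reported for cross-validation and for samples not used to build the model.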
A different FTIR procedure, known as diffuse reflectance Fourier transform infrared spectroscopy (DRIFTS), was used to determine roasted corn and coffee husks in roasted and ground coffee (Reis, Franca, and Oliveira 2013a). The same research group developed a method using DRIFTS and PLS that allowed the detection and quantification of roasted coffee husks, barley and corn (Reis, Franca, and Oliveira 2013b). To date, no published studies have compared ATR-FTIR and DRIFTS for the analysis of coffee adulteration. Comparisons of other coffee-related applications, such as discrimination by quality or maturity, have shown that DRIFTS provides a more effective differentiation and higher intensity spectra than ATR-FTIR (Craig, Franca, and Oliveira 2012).
Winkler-Moser et al. (2015) performed a comparative analysis of HPLC and NIR to detect adulteration with corn. HPLC analysis was based on the determination of tocopherol in coffee, as corn and coffee differ in their tocopherol profile. The sensitivity of both HPLC and NIR was about 5%, but NIR has the advantage of being a simple and faster technique that does not require sample treatment (Winkler-Moser et al. 2015).
NMR has been successfully employed to discriminate between coffee species and geographical origins, as already described, and in the authentication of other foods (Hong et al. 2017), but it has been underused for the identification of coffee adulterants. A methodology based on ¹H NMR combined with PCA and soft independent modeling of class analogies (SIMCA) for the identification and quantification of coffee contamination was recently developed (Milani et al. 2020). The technique was able to quantify six adulterants (coffee husks, soybean, corn, barley, rice, and wheat) in coffee with two different degrees of roasting.
A novel technique, laser-induced breakdown spectroscopy (LIBS), combined with PLS and PCA, has proven to be a reliable method to detect and quantify the coffee adulterants chickpeas, corn, and wheat. Based on a laser that detects atomic and molecular emission signals of elements, LIBS is a rapid technique that does not need any sample preparation and determines adulterations in coffee below 0.6% (Sezer et al. 2018).

Biological methods
DNA-based techniques have emerged in recent years as useful methods to guarantee food authenticity and safety (Laube et al. 2010; Fuchs, Cichna-Markl, and Hochegger 2012). PCR is a fast, specific and sensitive method that can be applied to DNA from roasted beans and instant coffee (Martellossi et al. 2005). This approach was adopted by Ferreira et al., who developed a real-time PCR-based method to detect and quantify barley, maize, and rice in roasted and soluble coffee. Marker genes for coffee and the targeted adulterants were tested using the Basic Local Alignment Search Tool (BLAST). Primer sensitivity and efficiency revealed that this method was suitable for authenticity control in the coffee industry (Ferreira et al. 2016).
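Quantification by real-time PCR commonly relies on a standard curve relating the threshold cycle (Ct) to the logarithm of the template quantity, from which the amplification efficiency is obtained as E = 10^(-1/slope) - 1 (a slope near -3.32 corresponds to about 100% efficiency). The following sketch shows this generic calculation only; the dilution series, Ct values, and the unknown sample are invented, and the procedure actually used by Ferreira et al. (2016) is not described here and may differ.

```python
# Hypothetical ten-fold dilution series of adulterant DNA (arbitrary units)
# and invented threshold cycles measured for each dilution.
log10_quantity = [4.0, 3.0, 2.0, 1.0, 0.0]
ct_values = [18.1, 21.5, 24.8, 28.2, 31.5]


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


slope, intercept = linear_fit(log10_quantity, ct_values)

# Amplification efficiency from the slope of the standard curve.
efficiency = 10 ** (-1.0 / slope) - 1.0


def quantity_from_ct(ct):
    """Invert the standard curve to estimate template quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


unknown_quantity = quantity_from_ct(26.0)  # hypothetical Ct of a test sample

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"efficiency = {efficiency * 100:.0f}%")
print(f"estimated quantity = {unknown_quantity:.1f} (arbitrary units)")
```

Efficiencies far from 100% or a poorly linear standard curve would indicate that the primers or reaction conditions need optimization before the assay is used for authenticity control.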
In summary, a large number of methods (chromatography, voltammetry, capillary electrophoresis, spectroscopy and biological methods) have been used by the scientific community and the coffee industry to identify lower-value substances added to coffee. However, more efforts are needed to curb adulteration in the coffee sector and support high-quality production.

Coffee adulteration and its effect on human health
Adulterants have been studied for their effect on the bioactive constituents of coffee, and it was found that levels of caffeine, chlorogenic acid and other phenolic compounds decreased with increasing adulterant concentration (de Pádua Gandra et al. 2017), as did the antioxidant capacity. The results therefore show that adding coffee hulls, coffee straw, and corn affects the health benefits of coffee beverages, reducing protection against oxidative stress.
Sibutramine is an oral anorexiant that may be illicitly included in herbal slimming foods and supplements marketed as "100% natural" to enhance weight loss. However, sibutramine consumption has been associated with increased blood pressure and heart rate (Bertholee et al. 2013), and heart attacks and strokes (Cebi, Yilmaz, and Sagdic 2017). Numerous efforts have therefore been invested in developing an effective and rapid method for its detection in weight-loss products such as green coffee (Cebi, Yilmaz, and Sagdic 2017), coffee (Suryoprabowo et al. 2020), and Brazil Potent Slimming Coffee (Bertholee et al. 2013) to guarantee the quality of functional foods and protect consumer health.
Phosphodiesterase-5 inhibitors (PDE-5i) are another family of drugs that have been used as adulterants in coffee. PDE-5i are employed for the medical treatment of erectile dysfunction, but they are known to have side effects, such as headaches, nausea, skin flushes, muscle pain or prolonged erection (Suryoprabowo et al. 2020). In recent years, the detection of illegal PDE-5i and analogues in herbal supplements has been reported in many regions, including Asia, Europe and North America (Dong et al. 2020). In order to protect public health, Suryoprabowo et al. (2020) developed a fluorescence-based method that allowed rapid and sensitive determination of tadalafil in coffee (Suryoprabowo et al. 2020).
Coffee seeds are liable to become contaminated with molds and the mycotoxins they produce, including ochratoxin A, especially if they are not dried properly or become rehydrated during any stage of drying, storage and transportation (Blanc 2004). As coffee is one of the most consumed beverages worldwide, this nephrotoxic and nephrocarcinogenic mycotoxin is a potential risk factor for human health. Notably, the levels of ochratoxin A were highest in soluble coffees that had been adulterated with coffee husks and/or coffee parchments (Pittet et al. 1996).

Conclusions
In this review, we have provided an extensive overview of analytical techniques and multivariate data analyses successfully applied to detect adulteration or authenticity in coffee, focusing on the most common species, Robusta and Arabica. Advances in technology have allowed the detection of fraudulent practices in coffee through the identification/quantification of specific chemical or biological markers with a higher sensitivity than ever before, although each method has its limitations. Additionally, we have comprehensively compared the capacity of the different analytical techniques to discriminate between Arabica and Robusta and trace geographical origin, pointing out their respective drawbacks. We have also looked at the advancements in methods to detect fraudulent or accidental adulteration with other foods and/or substances. It can be concluded that more efforts are necessary to protect coffee producers from the huge economic losses and consumers from the health risks these practices entail.