Metabolic profiling for the identification of Huntington biomarkers by on‐line solid‐phase extraction capillary electrophoresis mass spectrometry combined with advanced data analysis tools

In this work, an untargeted metabolomic approach based on sensitive analysis by on‐line solid‐phase extraction capillary electrophoresis mass spectrometry (SPE‐CE‐MS) in combination with multivariate data analysis is proposed as an efficient method for the identification of biomarkers of Huntington's disease (HD) progression in plasma. For this purpose, plasma samples from wild‐type (wt) and HD (R6/1) mice of different ages (8, 12, and 30 weeks), were analyzed by C18‐SPE‐CE‐MS in order to obtain the characteristic electrophoretic profiles of low molecular mass compounds. Then, multivariate curve resolution alternating least squares (MCR‐ALS) was applied to the multiple full scan MS datasets. This strategy permitted the resolution of a large number of metabolites being characterized by their electrophoretic peaks and their corresponding mass spectra. A total number of 29 compounds were relevant to discriminate between wt and HD plasma samples, as well as to follow‐up the HD progression. The intracellular signaling was found to be the most affected metabolic pathway in HD mice after 12 weeks of birth, when mice already showed motor coordination deficiencies and cognitive decline. This fact agreed with the atrophy and dysfunction of specific neurons, loss of several types of receptors, and changed expression of neurotransmitters.


Introduction
Huntington's disease (HD) is an inherited neurodegenerative disorder characterized by progressive motor and cognitive disturbances. HD is caused by an expansion of the cytosine-adenine-guanine (CAG) repeat in exon 1 of the Huntingtin gene (HTT), which encodes a stretch of glutamines in the Huntingtin protein [1][2][3][4][5][6][7][8]. Although the HTT gene is ubiquitously expressed as the Huntingtin protein in most tissues, HD pathology has primarily been localized to the basal ganglia and the neocortex. The pathology involves atrophy and dysfunction of specific neurons, loss of several types of receptors, changed expression of neurotransmitters and key proteins, as well as formation of ubiquitin-positive aggregates [1][2][3][4][5][6][7][8]. HD is a fatal disease, and the median interval between clinical diagnosis and death is typically given as 15-20 years [2,4,6,8].
By use of predictive genetic testing, it is possible to identify individuals who carry the HTT gene defect before the onset of symptoms, providing a unique window of opportunity for intervention aimed at preventing or delaying disease onset [4,7]. However, without robust and practical measures of disease progression, the efficacy of therapeutic interventions in this premanifest HD cannot be readily assessed. Neuroimaging and biochemical biomarkers are being investigated for their potential in clinical use and their value in the development of future treatments [4,7]. Modern neuroimaging techniques such as magnetic resonance imaging (MRI) enable high-quality images of brain structure and function to be obtained [9,10]. However, metabolites that can be quantified in biofluids, such as blood or urine, are appealing due to the improved selectivity, the minimal requirement for patient involvement, opportunity for rapid bulk processing of specimens, availability of reliable assays, and possibility of carrying out multiple analyses on a single sample [5,7,11].
Metabolomics aims to obtain a comprehensive coverage of low molecular mass compounds from biological systems [12][13][14]. Metabolomics studies can be approached using targeted or untargeted analysis [15][16][17]. In targeted analysis, a specified list of metabolites is analyzed. In contrast, untargeted analysis requires comprehensive metabolite measurements. Furthermore, it can implicate previously unrecognized metabolites or pathways with a unique phenotype and, therefore, is a powerful platform to elucidate novel biomarkers and gain insight into disease pathogenesis.
Different techniques are currently used for untargeted metabolomics, including NMR, GC-MS, LC-MS, and CE-MS [17][18][19][20][21]. For the first time to our knowledge, the use of on-line C18-SPE-CE-MS is proposed as an alternative sensitive method for metabolomic studies of biological fluids, which are complex diluted samples. CE is a versatile, high-performance separation technique with many desirable characteristics such as instrumental simplicity, full automation, high efficiency, low consumption of sample and reagents, and reduced analysis times. However, like many other microanalytical techniques, it has poor concentration sensitivity for most analytes, from low molecular mass compounds to biopolymers such as proteins [22,23]. Several strategies have been proposed to improve CE sensitivity. Today, SPE-CE is becoming widely recognized as a powerful approach that overcomes this major drawback [22][23][24][25][26][27]. In SPE-CE, a microcartridge placed inside and near the inlet of the separation capillary contains an appropriate extraction sorbent (in our case, C18). This sorbent selectively retains the target analyte, enabling large volumes of sample to be introduced (50-100 µL). The retained analyte is eluted in a small volume of an appropriate solution (25-50 nL), which results in sample cleanup and concentration enhancement with minimum sample handling before separation and detection, for example, by on-line MS (SPE-CE-MS) [22][23][24][25][26][27].
Chemometric methods play a crucial role in data processing, exploration, and classification of the massive datasets generated in metabolomic studies [28][29][30][31][32][33][34]. If the goal of the study is the compound detection, the use of resolution methods such as multivariate curve resolution alternating least squares (MCR-ALS) can be an excellent alternative. MCR-ALS can resolve overlapped electrophoretic/chromatographic peaks from the collected data and provide the separation profiles and mass spectra of the constituents in the analyzed samples. This approach allows overcoming problems such as retention time shifts, background noise contributions, and differences in S/Ns among different injections. Several published articles focus on the application of MCR-ALS to solve similar problems in LC-MS [32,33] and GC-MS [34]; but only a few studies have been previously reported combining CE-MS and MCR-ALS in metabolomic applications [35].
In this paper, we evaluate the capacity of SPE-CE-MS combined with advanced multivariate data analysis to preconcentrate, separate, detect, and identify low molecular mass metabolites in plasma samples from wild-type (wt) and HD (R6/1) mice of different ages (8, 12, and 30 weeks). A comparison between the different untargeted metabolomic profiles allows us to propose novel potential biomarker candidates involved in the progression of HD, which could be useful for prediction of disease onset or response to treatment.

Electrolyte solutions, sheath liquid, and standard solutions
Aqueous standard solutions (2500 µg/mL) of Dyn A, End 1, and Met peptides were prepared and stored in a freezer at −20°C when not in use. A 10 ng/mL standard mixture of the three peptides was prepared and analyzed at the beginning and at the end of each SPE-CE-MS sequence, in order to check the proper functioning of the on-line SPE microcartridges. The BGE contained 50 mM of HAc and 50 mM of HFor and was adjusted to pH 3.50 with ammonia. The sheath liquid solution consisted of a hydroorganic mixture of 2-propanol/water (60:40, v/v) with 0.05% v/v of HFor. All solutions were passed through a 0.45 µm nylon filter (MSI, Westboro, MA, USA) before analysis and were stored at 4°C when not in use. The sheath liquid was degassed for 10 min by sonication before use.

Mice blood plasma and sample preparation
Plasma samples from male wt mice and R6/1 transgenic mice (B6CBA background) expressing exon 1 of mutant Huntingtin with 145 repeats (HD, R6/1) of different ages (8, 12, and 30 weeks; early, middle, and late disease stage, respectively) were kindly supplied by the Department of Cellular Biology, Immunology and Neurosciences (Faculty of Medicine, University of Barcelona) [36]. Blood from mice was collected by cardiac puncture in standard clinical vials and placed on ice. Plasma was separated from the blood cells, pooled, deposited into polyethylene tubes, and frozen at −20°C. It is worth mentioning that, due to the small amount of blood that could be extracted from a single mouse (between 1 and 2 mL), each set of samples corresponded to the combination of the plasma obtained from four or five mice. All animal procedures were approved by the CEEA committee of the University of Barcelona and were in accordance with the European Communities Council Directive (2010/63/EU). The sample pretreatment used for the analysis of low molecular mass compounds in plasma samples was described elsewhere [22,37]. The off-line double-step pretreatment of plasma samples consisted of protein precipitation with cold ACN (plasma/ACN, 200:1200 µL) followed by centrifugal filtration with 10 000 Mr cutoff cellulose acetate filters (Amicon® Ultra-0.5, Millipore). Centrifugal filters were passivated before the first use with 5% v/v of PEG in water [37].

Apparatus and procedures
Measurements of pH were made with a Crison 2002 potentiometer and a Crison electrode 52-03 (Crison Instruments, Barcelona, Spain). Centrifugal filtration was carried out in a cooled Rotanta 460 centrifuge (Hettich Zentrifugen, Tuttlingen, Germany) for centrifugation at controlled temperature (25°C).

On-line SPE-CE-MS
The construction of the microcartridge or analyte concentrator for C18-SPE-CE-MS was carried out as described elsewhere [22,37]. All fused silica capillaries were supplied by Polymicro Technologies (Phoenix, AZ, USA). The microcartridge (7 mm LT × 250 µm id × 360 µm od) was inserted inside the separation capillary (72 cm LT × 75 µm id × 360 µm od), at 7.5 cm from the inlet, using two plastic sleeves. Previously, it was filled with the sorbent found in C18 Sep-pak cartridges (Waters, Milford, MA, USA). The sorbent particles were retained in the microcartridge between two frits (0.1 cm).
All capillary rinses were performed at high pressure (930 mbar). New separation capillaries were flushed with 1.0 M NaOH (20 min) and water (15 min) before inserting the microcartridge. This activation procedure was performed off-line to avoid the unnecessary entrance of NaOH into the MS system. Once the microcartridge was inserted, the SPE-CE-MS capillaries were first conditioned by consecutive flushes of water (1 min), methanol (1 min), water (1 min), and BGE (3 min) at 930 mbar. Standard peptide mixture (Dyn A, End 1, and Met) or mice plasma samples were then introduced at 930 mbar for 10 min (approximately 60 µL using the Hagen-Poiseuille equation [38]). A final rinse with the BGE (2 min at 930 mbar) eliminated nonretained molecules and equilibrated the capillary before the elution. Retained compounds were eluted by injecting a solution of methanol/water (60:40, v/v) with 50 mM HAc and 50 mM HFor at 50 mbar for 10 s (approximately 50 nL [38]). Separation was carried out at 25°C by applying a voltage of 17 kV (normal polarity, cathode in the outlet). Between runs, the capillary was rinsed for 2 min with water and 2 min with ACN, in order to avoid carry-over between consecutive analyses. In general, the different plasma samples (i.e. 8wt, 12wt, 30wt and 8HD, 12HD, and 30HD) were analyzed in triplicate (with the exception of 12wt, 12HD, and 30HD, for which only two replicates were analyzed due to the small volume of plasma sample available). Each series of replicate analyses was performed in a new SPE-CE-MS capillary due to the limited durability of the SPE microcartridges (10 analyses), a consequence of the complexity of the plasma matrix and the limited selectivity of the C18 sorbent. After these analyses, the extraction efficiency decreased and the microcartridge became progressively packed until it was completely clogged [39]. At the beginning and at the end of each sequence, a 10 ng/mL standard peptide mixture was analyzed as a quality control of the system.
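The loading and elution volumes quoted above can be checked with a short Hagen-Poiseuille estimate. The sketch below is only an order-of-magnitude illustration, not the authors' calculation: it assumes the viscosity of water at 25°C (0.89 mPa·s) for both solutions, ignores the extra flow resistance of the packed microcartridge, and uses the capillary dimensions and pressures given in the text. The function name `injected_volume_m3` is hypothetical.

```python
import math

def injected_volume_m3(pressure_pa, time_s, inner_diameter_m, length_m, viscosity_pa_s):
    """Hagen-Poiseuille volume delivered through an open tube: V = dP*pi*d^4*t / (128*eta*L)."""
    return (pressure_pa * math.pi * inner_diameter_m ** 4 * time_s) / (
        128.0 * viscosity_pa_s * length_m
    )

MBAR_TO_PA = 100.0
WATER_VISCOSITY = 0.89e-3   # Pa*s at 25 degC (assumed for both sample and eluent)
CAPILLARY_ID = 75e-6        # m, from the text
CAPILLARY_LENGTH = 0.72     # m, from the text

# Sample loading: 930 mbar for 10 min (1 m^3 = 1e9 uL)
load_uL = injected_volume_m3(930 * MBAR_TO_PA, 600, CAPILLARY_ID,
                             CAPILLARY_LENGTH, WATER_VISCOSITY) * 1e9

# Elution plug: 50 mbar for 10 s (1 m^3 = 1e12 nL)
elute_nL = injected_volume_m3(50 * MBAR_TO_PA, 10, CAPILLARY_ID,
                              CAPILLARY_LENGTH, WATER_VISCOSITY) * 1e12

print(f"loaded volume  ~ {load_uL:.0f} uL")
print(f"elution volume ~ {elute_nL:.0f} nL")
print(f"volume ratio   ~ {load_uL * 1000 / elute_nL:.0f}")
```

Under these assumptions both estimates fall in the same range as the values quoted in the text (tens of microlitres loaded, tens of nanolitres eluted). A more viscous eluent such as methanol/water would give a somewhat smaller elution plug than this water-based estimate.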
All CE-MS experiments were performed in an HP 3D CE system coupled with an orthogonal G1603A sheath-flow interface to a 6220 oa-TOF LC/MS spectrometer (Agilent Technologies, Waldbronn, Germany). The sheath liquid was delivered at a flow rate of 3.3 µL/min by a KD Scientific 100 series infusion pump (Holliston, MA, USA). ChemStation C.01.06 software (Agilent Technologies) was used for CE control and separation data acquisition (e.g. voltage, temperature, and current), and was run in combination with MassHunter B.04.00 workstation software (Agilent Technologies) for control of the mass spectrometer and MS data acquisition.
The TOF mass spectrometer was operated under optimum conditions in positive mode using the following parameters: capillary voltage 4000 V, drying gas temperature 200°C, drying gas flow rate 4 L/min, nebulizer gas 7 psig, fragmentor voltage 215 V, skimmer voltage 60 V, and OCT 1 RF Vpp voltage 300 V. Data were collected in profile at one spectrum/s between 40 and 1250 m/z, with the mass range set to high-resolution mode (4 GHz). A standard tune and an external mass calibration were performed daily at the beginning of the day following the manufacturer's instructions using the typical LC-MS sprayer and ESI-L tuning mix (Agilent Technologies).

Data analysis
SPE-CE-MS data were analyzed by a combination of advanced chemometric tools to evaluate the most significant metabolic changes involved in HD. Figure 1 shows a summary of the data analysis workflow, which is explained in detail in this section.

Data preprocessing of dataset
First, SPE-CE-MS raw data were converted to .txt format using the ProteoWizard software [40] and then imported into the MATLAB environment (The Mathworks, Natick, MA, USA) using in-house routines. During this import process, MS information was compressed to 0.01 Da/e resolution. Every sample provided a data matrix with 2490 rows (migration times). An automatic weighted least squares baseline correction was applied before the MCR-ALS analysis.

Full scan MS data arrangement and MCR-ALS analysis
MCR-ALS is a chemometric method especially useful to analyze multicomponent systems with strongly overlapping contributions, such as those present in CE separations, where the electrophoretic behavior of metabolites is rather similar [41].
In the case of SPE-CE-MS, the full scan MS data matrix D contains the experimental mass spectra at all migration times in its rows and the electropherograms at all m/z values in its columns. MCR-ALS analysis of the data matrix D, following a bilinear model, gives two factor matrices, C and S^T, as in Eq. (1):

$$\mathbf{D} = \mathbf{C}\,\mathbf{S}^{T} + \mathbf{E} \qquad (1)$$

where matrix C contains the electrophoretic profiles of the resolved contributions (components), matrix S^T contains the corresponding mass spectra of the resolved contributions, and matrix E contains the residuals unexplained by the model. The different samples can be simultaneously analyzed and compared by MCR-ALS using a column-wise augmented data matrix configuration (see matrix D_aug in Eq. (2) and Fig. 1B), following the strategy described in the work of Ortiz-Villanueva [35]:

$$\mathbf{D}_{aug} = \begin{bmatrix} \mathbf{D}_{1} \\ \mathbf{D}_{2} \\ \vdots \\ \mathbf{D}_{n} \end{bmatrix} = \begin{bmatrix} \mathbf{C}_{1} \\ \mathbf{C}_{2} \\ \vdots \\ \mathbf{C}_{n} \end{bmatrix} \mathbf{S}^{T} + \begin{bmatrix} \mathbf{E}_{1} \\ \mathbf{E}_{2} \\ \vdots \\ \mathbf{E}_{n} \end{bmatrix} = \mathbf{C}_{aug}\,\mathbf{S}^{T} + \mathbf{E}_{aug} \qquad (2)$$

where the subscripts 1 to n index the individual samples. This approach yielded a common matrix of the mass spectra of the resolved components (S^T) for all samples, and a set of matrices describing the resolved electrophoretic profiles (C_aug) in every sample. The electrophoretic peaks resolved in matrix C_aug are allowed to vary in position (shifts) and shape among samples because the only requirement for a proper resolution is that the resolved spectra are the same for the common constituents in the different samples [42]. This aspect is especially useful in the case of CE data, where migration shifts among samples occur and, hence, the alignment of electrophoretic peaks before analysis is not needed.
In this study, the electropherograms were partitioned in two time windows corresponding to the two regions with the most intense peaks (selected regions, depending on the sample, varied approximately from 10 to 25 min and from 30 to 40 min, respectively, Fig. 2). Then, the resulting data matrices were further reduced in their m/z mode dimension in 30 different m/z ranges (m/z widths for reduction were 20, 50, and 100 m/z in the m/z ranges 40-400, 400-800, and 800-1250 m/z, respectively; Fig. 1B) [35].
MCR-ALS analysis was carried out following standard procedures for the determination of the number of components (SVD, [43]) and initial estimates (SIMPLISMA, [44]). ALS optimization was performed under nonnegativity constraints for electrophoretic (C_aug) and spectral (S^T) profiles, and spectral normalization (equal height) [45,46].
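The bilinear decomposition of a column-wise augmented matrix can be illustrated with a minimal NumPy sketch. This is not the MCR-ALS toolbox used in this work: non-negativity is imposed here by simply clipping negative values to zero, the initial spectra are taken from hand-picked rows of the data rather than from SIMPLISMA, and the two-component data set is synthetic. The names `mcr_als`, `gaussian_peak`, and all numerical values are hypothetical.

```python
import numpy as np

def mcr_als(D, S0, n_iter=50):
    """Alternating least squares for D ~ C @ St with clipped non-negativity.

    D  : (n_times_total, n_mz) column-wise augmented data matrix
    S0 : (n_components, n_mz) initial spectral estimates
    """
    St = S0 / S0.max(axis=1, keepdims=True)            # equal-height spectra
    for _ in range(n_iter):
        C = np.clip(D @ np.linalg.pinv(St), 0, None)   # update electrophoretic profiles
        St = np.clip(np.linalg.pinv(C) @ D, 0, None)   # update mass spectra
        St = St / np.maximum(St.max(axis=1, keepdims=True), 1e-12)
    C = np.clip(D @ np.linalg.pinv(St), 0, None)       # profiles matching the final spectra
    residual = D - C @ St
    r2 = 1.0 - (residual ** 2).sum() / (D ** 2).sum()  # explained variance
    return C, St, r2

def gaussian_peak(t, center, width=5.0):
    return np.exp(-0.5 * ((t - center) / width) ** 2)

# Synthetic data: two overlapping components measured in two "samples";
# migration times and intensities differ between samples, spectra are shared.
t = np.arange(100.0)
rng = np.random.default_rng(0)
S_true = rng.random((2, 30))
C1 = np.column_stack([gaussian_peak(t, 40), 2.0 * gaussian_peak(t, 60)])
C2 = np.column_stack([gaussian_peak(t, 43), 0.5 * gaussian_peak(t, 64)])
D_aug = np.vstack([C1 @ S_true, C2 @ S_true])          # samples stacked row-wise

# Crude initial estimates: spectra at two times where one component dominates
S0 = D_aug[[20, 80], :]
C_aug, St, r2 = mcr_als(D_aug, S0)
print(C_aug.shape, St.shape, round(r2, 4))
```

Because only the spectra are shared between the stacked blocks, the shifted peaks of the second sample are recovered in their own block of `C_aug` without any prior alignment, which is the property exploited for CE data in the text.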

Detection and identification of potential metabolites
For every resolved MCR-ALS component, electropherogram (peak) profiles of the six sample sets (i.e. 8wt, 12wt, 30wt and 8HD, 12HD, and 30HD) were compared. Only resolved components of C_aug that showed S/Ns higher than 10% of the abundance of the most intense component were selected.
Next, their corresponding mass spectra profiles (S^T) were used to identify the m/z values causing the differentiation between wt and HD plasma samples at 8, 12, and 30 weeks. Finally, peak areas of these candidate m/z values were recovered from the full scan SPE-CE-MS data using the MassHunter workstation software, taking as a reference the m/z value and the migration time of the MCR-ALS resolved components (Fig. 1C). Areas were then normalized to the peak area of a compound present in all the samples that was not discriminant between control and HD samples (m/z of 72.9858, in the first time window). These areas were used to build a data matrix containing the area of each candidate (feature) in every sample. This data matrix was autoscaled in order to give equal weighting to all candidates in the measured samples. Finally, partial least squares discriminant analysis (PLS-DA) models were applied to the autoscaled data matrix to evaluate sample discrimination and to identify the most important features. There are numerous methods for feature selection when considering PLS models. In this work, the variable importance in the projection (VIP) method was used [47], because it is one of the preferred methods to deal with metabolomic data due to its ability to handle multicollinear data [48]. For each model, VIP scores estimate the importance of each feature in the projection. Only features with a VIP score over a particular threshold value (usually 1) are considered important and selected for further analysis. In all cases, leave-one-out cross-validation was used to assess the performance of the built models. Thereafter, the accurate experimental molecular mass values of the VIP-selected metabolites were searched in on-line database resources, such as the METLIN Metabolite Database [49] and the Human Metabolome Database [50].
A small error from the calculated (theoretical) molecular mass (Mr) was used to evaluate the accuracy of possible molecular formulas ($E_r \le 20$ ppm, where $E_r = |M_{r,\mathrm{experimental}} - M_{r,\mathrm{theoretical}}| / M_{r,\mathrm{theoretical}} \times 10^{6}$). Finally, the list of the tentatively identified metabolites was used to investigate the possible metabolic pathways and mechanisms involved in HD according to the KEGG database [51] (Fig. 1C).
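The two preprocessing steps applied to the peak-area matrix before PLS-DA (normalization to a non-discriminant reference compound, then autoscaling) can be sketched in plain Python. This is an illustrative sketch only: the original calculations were run in MATLAB with PLS Toolbox, and the function names and the small area matrix below are hypothetical.

```python
from statistics import mean, stdev

def normalize_to_reference(areas, ref_index):
    """Divide every peak area in a sample (row) by that sample's reference peak area."""
    return [[a / row[ref_index] for a in row] for row in areas]

def autoscale(matrix):
    """Mean-center each feature (column) and scale it to unit standard deviation.

    Constant columns (e.g. the reference itself after normalization) are set to 0.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in matrix
    ]

# Hypothetical peak areas: rows are samples, columns are features;
# column 0 plays the role of the reference compound.
areas = [
    [100.0, 10.0, 50.0],
    [200.0, 30.0, 40.0],
    [150.0, 15.0, 90.0],
]
scaled = autoscale(normalize_to_reference(areas, ref_index=0))
for row in scaled:
    print([round(x, 3) for x in row])
```

After autoscaling, every non-constant feature has zero mean and unit variance, so abundant and trace metabolites carry the same weight in the subsequent model.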

Software
Most of the calculations and data analysis were performed under MATLAB R2013a (The Mathworks). PLS Toolbox 7.3.1 (Eigenvector Research, Wenatchee, WA, USA) was used for PLS-DA and VIP calculations; and MCR-ALS toolbox [42] was used for resolution of electrophoretic and mass spectral metabolite profiles from full MS scan augmented data matrices.

Analysis of mice plasma by C18-SPE-CE-MS
Untargeted metabolomics analysis requires a comprehensive coverage of low molecular mass compounds from biological samples. However, very often sample amount limitations, matrix complexity, and metabolite concentration preclude direct analysis with CE-MS. With the aim of solving these issues, plasma samples from wt (control) and HD mice were analyzed by C18-SPE-CE-MS in order to preconcentrate, separate, detect and identify low molecular mass compounds, and establish significant differences between the global metabolite profiles from different groups of samples. In order to evaluate HD progression in individuals at the premanifest motor stage of the disease, plasma samples from wt and HD mice were analyzed at 8, 12, and 30 weeks of age. In HD mice, these samples corresponded to early (asymptomatic), middle (symptomatic), and late (terminal) disease stage mice, respectively, although this classification is only based on motor coordination deficiencies [52].
The applied C18-SPE-CE-MS method in positive ESI mode was developed for the analysis of peptides in human plasma in previous works [22,37], but preliminary experiments showed that it was also useful to obtain a rich fingerprint of low molecular mass compounds in mouse plasma. As shown in those studies, all the plasma samples were subjected to an off-line sample pretreatment before C18-SPE-CE-MS in order to prevent microcartridge saturation due to the limited selectivity of the C18 sorbent. A double-step pretreatment based on solvent precipitation and centrifugal filtration with Mr cutoff filters was applied to eliminate salts and high molecular mass compounds (i.e. proteins). This pretreatment allowed excellent recoveries for low molecular mass opioid peptides (>70%) [37]. Furthermore, LODs were improved by C18-SPE-CE-MS between 1000 and 10 000 times compared to CE-MS, depending on the peptides and the sample [39]. Figure 2 shows the total ion electropherograms obtained for the mice plasma samples by C18-SPE-CE-MS. As can be observed, separation resolution is not high because of the complexity of the sample. All the electropherograms present a characteristic profile with two time regions with the most intense peaks (approximately at 10-25 and 30-40 min, respectively), and advanced chemometrics methods are necessary for high throughput and reliable comparison between the different sets of plasma samples.

MCR-ALS analysis and detection of the most relevant metabolites
MCR-ALS was applied using a column-wise augmented data matrix containing simultaneously the information of the 15 samples (wt and HD, both at 8, 12, and 30 weeks) and allowed the resolution of the electropherogram profiles and corresponding mass spectra of the plasma metabolites. MCR-ALS analysis was performed separately on column-wise augmented data matrices of different m/z ranges (at the resolution of 0.01 Da/e), corresponding to the two selected time windows. A total number of 60 column-wise augmented matrices (two time windows × 30 m/z intervals) were separately analyzed. The number of components selected was related to the number of electrophoretic peaks, despite the fact that some of these resolved components could be due to contributions such as solvent background or instrumental noise. In most of the cases, MCR-ALS models showed an explained variance (R²) of almost 100%. The electropherogram profiles for the resolved MCR-ALS components in the six sample sets (i.e. 8wt, 12wt, 30wt and 8HD, 12HD, and 30HD) were compared and only resolved components of C_aug that showed S/Ns higher than 10% of the abundance of the most intense component were finally selected (in order to remove contributions such as solvent background or instrumental noise). The mass spectra of these components (from S^T) were used to identify the m/z values causing the discrimination between samples. After the resolution and analysis of the 60 augmented data matrices, a total number of 74 features were detected. Finally, peak areas of these candidate m/z values were recovered from the full scan raw C18-SPE-CE-MS data using the MassHunter workstation software, taking as a reference the m/z value and the migration time of features obtained from the MCR-ALS resolved components.
PLS-DA was then applied to identify the most important metabolites responsible for the sample discrimination considering the raw peak areas for the selected 74 candidate metabolites. In order to identify potential Huntington biomarkers that could be useful to discriminate between wt and HD samples, as well as to follow up the HD progression, three different PLS-DA models were built. Figure 3 shows the PLS-DA scores plot for the mice plasma samples taking into account the three mentioned models. As can be observed in Fig. 3A, the first PLS-DA model was applied to discriminate between control and HD samples. This model permitted us to propose possible biomarkers involved in HD. Two latent variables (LVs) explained 34 and 89% of the X and Y variances, respectively. The second PLS-DA model was applied to differentiate between wt samples of different ages and identify metabolites involved in aging of healthy controls. In order to improve the reliability of the PLS-DA model given the limited number of samples, a two-class model was used, which presented at least three samples in each class (8 and 12-30 weeks). These two sets of samples were also the best option to differentiate later between aging and early HD progression. A PLS-DA model with two LVs explained 47% of the X-variance and 99% of the Y-variance (Fig. 3B). Finally, the third PLS-DA model was applied to distinguish between HD samples of different ages and identify possible biomarkers that could be useful to follow up the disease progression. Again, the same two sets of samples were defined (8 and 12-30 weeks). In this case, two LVs explained 52 and 98% of the X and Y variances, respectively (see Fig. 3C). All PLS-DA models allowed class discrimination and the detection of the most relevant components for the differentiation of the samples. It is worth mentioning that HD samples of 12 and 30 weeks were slightly separated in the scores plot (Fig. 3C), whereas this separation was not observed for wt samples (Fig. 3B). Nevertheless, a three-class PLS-DA model was not recommended because of the limited number of samples. VIP score values higher than 1 were used as a feature selection tool in order to choose only the most relevant candidate metabolites for each PLS-DA model (33 out of 74).

[Figure caption fragment: "... Table 1) that were used to explain (A) HD progression (set HD), (B) aging of healthy controls (set wt), and (C) differences between wt and HD plasma samples (set wt/HD). The percent abundance of each metabolite was calculated by normalizing to the metabolite presenting the highest abundance. (*See Table 2 for the related metabolic pathways.)"]

Tentative metabolite identification and biological meaning
The metabolites contributing most to sample discrimination (33) were tentatively identified, taking advantage of the highly accurate experimental molecular mass values provided by the oa-TOF mass spectrometer. Only 29 features of the total of 33 were tentatively identified with an error ≤20 ppm (Table 1; the four nonidentified features were discarded for further discussion). As can be observed in Table 1, there were some ambiguities on the metabolite identities because this tentative identification was solely based on the agreement between the experimental and the theoretical molecular mass values. For example, in some cases several isobaric metabolites were proposed for a certain molecular formula and experimental molecular mass value (i.e. identification number (ID) 2, 3, 4, 6, 7, 9, 13, 15, 16, 17, 18, 19, and 29). In the future, analysis of standard samples and MS/MS measurements for structure characterization would be necessary to improve reliability of these identity assignments.
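The mass accuracy criterion used for tentative identification is a simple relative error in parts per million. A minimal sketch follows; the function names and the example masses are hypothetical and are not values from Table 1.

```python
def mass_error_ppm(m_experimental, m_theoretical):
    """Relative mass error in ppm: |M_exp - M_theo| / M_theo * 1e6."""
    return abs(m_experimental - m_theoretical) / m_theoretical * 1e6

def within_tolerance(m_experimental, m_theoretical, tol_ppm=20.0):
    """True when a database candidate matches the measured mass within tol_ppm."""
    return mass_error_ppm(m_experimental, m_theoretical) <= tol_ppm

# Hypothetical measured mass checked against two hypothetical database candidates
measured = 200.0010
candidates = {"candidate A": 200.0000, "candidate B": 200.0110}
for name, theoretical in candidates.items():
    error = mass_error_ppm(measured, theoretical)
    print(f"{name}: {error:.1f} ppm, accepted = {within_tolerance(measured, theoretical)}")
```

A match within the tolerance only supports a molecular formula, so isobaric candidates pass the same check, which is why the text calls for standards and MS/MS measurements to confirm identities.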
The Venn diagrams that appear as insets in Fig. 4A and B show the relations between the identified metabolites that explain HD progression and aging of healthy controls. As can be observed, seven metabolites (4 + 3) were useful to specifically explain HD progression (HD set, ID 1, 10, 28, 3, 13, 15, and 24 in Table 1 and Fig. 4). Similarly, eight metabolites (8 + 0, ID 5, 6, 14, 16, 20, 21, 22, and 23 in Table 1 and Fig. 4) were useful to specifically explain aging of healthy controls (wt set). The concentration trends of these specific metabolites were varied (Fig. 4A and B): some of them decreased, while others increased after 12 weeks of age. Finally, there were eight metabolites (7 + 1, ID 29, 4, 11, 26, 7, 8, 9, and 19 in Table 1 and Fig. 4) that explained both progression of HD and aging. Four of them showed a clearly different concentration trend in HD and wt plasma samples, but for the other four metabolites the trend was similar, indicating that the differences lay in their absolute concentration (e.g. ID 8 normalized areas in HD and wt plasma samples were 466.4072 and 431.2187, respectively). With regard to the general differentiation of wt and HD samples (wt/HD set), there were six metabolites that were useful to specifically distinguish between control and HD samples, four downregulated and two upregulated in HD samples, as shown in Fig. 4C (ID 2, 12, 18, 25, 17, and 27 in Table 1). For metabolites also explaining HD progression and/or aging (ID 10, 15, 1, and 11 in Table 1 and Fig. 4C), the concentration trends were varied (two were downregulated and two upregulated).
The identified metabolites were searched against different on-line databases to identify the potential metabolic pathways that could be involved in HD pathology. Different metabolic pathways were found to be related to 13 of the 29 identified metabolites (see Table 2). It is well known that HD could affect different metabolic pathways. Huntingtin is ubiquitously expressed and, in addition to neurological features, the peripheral phenotype of HD could include weight loss, energy disturbances, and alteration of endocrine function.
As shown in Fig. 4A, concentrations of phenylalanyl-arginine and arginyl-phenylalanine (ID 13, Table 1) were found increased in HD mice after 12 weeks of age. These metabolites, which were specific to explain HD progression, are incomplete breakdown products of protein digestion or protein catabolism known to have physiological or cell-signaling effects (Table 2) [53]. Similarly, prostaglandins, thromboxanes, lipoxins, and leukotrienes (ID 15, Table 1) were found upregulated after 12 weeks (Fig. 4A), but downregulated when all HD samples were compared to all controls (Fig. 4C), thus indicating a change of trend after 12 weeks. These metabolites are related to the regulation of inflammatory processes and signaling pathways, mainly the arachidonic acid metabolism, the neuroactive ligand-receptor interaction, the serotonergic synapse, the cAMP signaling pathway, and the oxytocin signaling pathway (see Table 2). The arachidonic acid metabolism has also been related to the synthesis of cytochromes involved in mitochondrial oxidative phosphorylation, and altered mitochondrial function has been associated with HD [54,55]. Furthermore, cAMP levels have been found reduced in the striatum of several HD mouse models [56], while the oxytocin signaling pathway has been related to changes in the hypothalamic and limbic systems that take place at early stages of HD [57]. The concentration of L-urobilinogen (ID 24, Table 1), which is related to the porphyrin metabolism (Table 2), was also found increased after 12 weeks of age (Fig. 4A). All these changes in 12-week-old HD mice suggest the onset of specific neuronal dysfunction, altered expression of several types of receptors, and changed expression of neurotransmitters and key proteins. Unbalanced activity within these pathways provides a potential mechanism for many of the pathological phenotypes associated with HD, such as transcriptional dysregulation, inflammation, and ultimately neurodegeneration [58][59][60].
With regard to metabolites explaining both progression of HD and aging (Fig. 4A and B), gangliosides (ID 29, Table 1), which are cell plasma membrane components that modulate cell signal transduction events, showed a different concentration trend in HD progression compared to aging. Ganglioside levels decreased after 12 weeks of birth in HD progression (Fig. 4A), whereas they increased in controls (Fig. 4B). Decreased ganglioside concentration has also been found in the cerebellum of R6/1 (HD) mice at 35-40 weeks [61], and in fibroblasts, cortex, and striatum of YAC128 mice [62]. Similarly, L-hexanoylcarnitine levels (ID 11, Table 1), which decreased with aging in healthy controls (Fig. 4B), increased with HD progression (Fig. 4A), and also when all HD samples were compared to all controls (Fig. 4C), suggesting that the disease involves disturbances in energy production, which are characterized by production and excretion of unusual acylcarnitines [63]. The concentration of PC(14:1(9Z)/14:1(9Z)) (ID 26, Table 1), which is related to signaling pathways (arachidonic acid metabolism and retrograde endocannabinoid signaling), glycerophospholipid metabolism, and linoleic acid metabolism, also increased with HD progression and decreased with wt aging (Table 2, Fig. 4A and B). In contrast, (−)-epinephrine and normetanephrine (ID 7, Table 1), which are metabolites related to tyrosine metabolism and signaling pathways (cAMP signaling pathway, adrenergic signaling in cardiomyocytes, and neuroactive ligand-receptor interaction; Table 2), showed no difference in concentration trend between HD progression and aging: both were decreased after 12 weeks in HD progression and in wt aging (Fig. 4A and B). The same trend was observed for vanylglycol and phosphorylcholine (ID 8, Table 1, Fig. 4A and B), which are related to tyrosine and glycerophospholipid metabolism, respectively (Table 2). Finally, the metabolites with ID 9 (Table 1) again showed a decreasing trend in both HD progression and wt aging (Fig. 4A and B). In this case, 3-indolebutyric acid is related to tryptophan metabolism, while the other metabolites are incomplete products of protein digestion or protein catabolism associated with cell-signaling effects (Table 2) [64].
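The distinction drawn above, between metabolites whose concentration trend diverges between HD progression and wt aging and those that follow the same trend in both, can be illustrated as a simple sign comparison. The following is only a minimal sketch, not the chemometric workflow used in this work: the function name `classify_trend` and the +1/-1 coding are assumptions introduced for illustration, and the directions are those described in the text for Fig. 4A and B.

```python
# Direction of change after 12 weeks as described in the text
# (+1 = increased, -1 = decreased). The numeric coding is an
# illustrative assumption, not a value taken from the data.
TRENDS = {
    "ID 29 gangliosides":           {"hd": -1, "wt": +1},
    "ID 11 L-hexanoylcarnitine":    {"hd": +1, "wt": -1},
    "ID 26 PC(14:1(9Z)/14:1(9Z))":  {"hd": +1, "wt": -1},
    "ID 7 epinephrine group":       {"hd": -1, "wt": -1},
    "ID 8 vanylglycol group":       {"hd": -1, "wt": -1},
    "ID 9 3-indolebutyric group":   {"hd": -1, "wt": -1},
}


def classify_trend(hd: int, wt: int) -> str:
    """Label a metabolite 'divergent' when HD progression and wt aging
    move in opposite directions, and 'shared' otherwise."""
    return "divergent" if hd * wt < 0 else "shared"


# Hypothetical helper output: one label per metabolite group.
classification = {name: classify_trend(**t) for name, t in TRENDS.items()}

for name, label in classification.items():
    print(f"{name}: {label}")
```

Under this coding, metabolites labeled "divergent" are the candidates that separate HD progression from normal aging, whereas "shared" ones change with age regardless of genotype.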
With regard to metabolites explaining only wt aging, dimethylbenzimidazole (ID 5, Table 1), which is related with the riboflavin and porphyrin metabolisms (Table 2), was found reduced after 12 weeks of birth (Fig. 4B). The same concentration trend was observed for 18-hydroxycorticosterone and cortisol (ID 16, Table 1, Fig. 4B), which are metabolites associated with the steroid hormone biosynthesis (Table 2).

Table 2
ID | Metabolite | Metabolic pathway
9 | 3-Indolebutyric acid | Tryptophan metabolism
9 | Glycyl-glutamine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
9 | Glycyl-gamma-glutamate | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
9 | Asparaginyl-alanine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
9 | Glutaminyl-glycine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
9 | Alanyl-asparagine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
9 | Gamma-glutamyl-glycine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
11 | L-Hexanoylcarnitine | Energy production
12 | Histidinyl-histidine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
13 | Phenylalanyl-arginine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
13 | Arginyl-phenylalanine | Incomplete breakdown product of protein digestion or protein catabolism with cell-signaling effects*
15 | Prostaglandin D2 | Arachidonic acid metabolism*/neuroactive ligand-receptor interaction*/serotonergic synapse*/Fc epsilon RI signaling pathway*
15 | Prostaglandin E2 | Arachidonic acid metabolism*/neuroactive ligand-receptor interaction*/serotonergic synapse*/cAMP signaling pathway*/oxytocin signaling pathway*/inflammatory mediator regulation of TRP channels*
15 | Prostaglandin H2 | Arachidonic acid metabolism*/serotonergic synapse*/retrograde endocannabinoid signaling*/oxytocin signaling pathway*
15 | Prostaglandin I2 | Arachidonic acid metabolism*/neuroactive ligand-receptor interaction*/cAMP signaling pathway*/VEGF signaling pathway*
15 | Thromboxane A2 | Arachidonic acid metabolism*/neuroactive ligand-receptor interaction*/serotonergic synapse*
15 | Lipoxin A4 | Arachidonic acid metabolism*/neuroactive ligand-receptor interaction*
15 | Lipoxin B4 | Arachidonic acid metabolism*
15 | 20-Hydroxy-leukotriene B4 | Arachidonic acid metabolism*
15 | 15-Keto-prostaglandin F2a | Arachidonic acid metabolism*
29 | Ganglioside GD1b (d18:1/12:0) | Signal transduction*
29 | Ganglioside GD1a (d18:1/12:0) | Signal transduction*
*All these metabolic pathways are included in the signaling pathway.
When all HD samples were compared with all controls, the concentration levels of m-cresol and p-cresol (ID 2, Table 1), which are involved in protein digestion and absorption, as well as in the degradation of aromatic compounds, were downregulated in HD samples (Fig. 4C). The same concentration trend was observed for histidinyl-histidine (ID 12, Table 1, Fig. 4C), an incomplete breakdown product of protein digestion or catabolism with cell-signaling effects [65,66].

Concluding remarks
An optimized sample pretreatment was applied to wt and R6/1 mice plasma samples (of 8, 12, and 30 weeks) prior to analysis by C18-SPE-CE-MS. The proposed methodology proved suitable for reliable and comprehensive metabolite profiling of the plasma samples. The combination of MCR-ALS with other chemometric tools, such as PLS-DA, allowed the comprehensive analysis of the C18-SPE-CE-MS metabolomic data, resolving the electrophoretic peaks and mass spectra of a large number of metabolites. Finally, a list of potential metabolites useful to discriminate between control and HD plasma samples, as well as to follow up the HD progression, was tentatively identified, and the most affected metabolic pathways were discussed. Although different pathways were found to be altered in HD, intracellular signaling was the most affected, especially after 12 weeks of birth, thus suggesting that the pathology involves dysfunction of specific neurons, altered expression of several types of receptors, and changed expression of neurotransmitters. In addition, although some of the identified metabolites have been previously described in the striatum of R6/1 (HD) mice or in rat models, attempts to find such biomarkers in plasma have hitherto been unsuccessful. In the present work, we propose direct brain-striatal metabolites as good biomarkers that can be found in the periphery (plasma samples). These results therefore provide a window of opportunity for prediction of disease onset, evaluation of early HD progression, or response to treatment.