Association and epistatic analysis of white matter related genes across the continuum schizophrenia and autism spectrum disorders: The joint effect of NRG1-ErbB genes

Abstract Background Schizophrenia-spectrum disorders (SSD) and Autism spectrum disorders (ASD) are neurodevelopmental disorders that share clinical, cognitive, and genetic characteristics, as well as particular white matter (WM) abnormalities. In this study, we aimed to investigate the role of a set of oligodendrocyte/myelin-related (OMR) genes and their epistatic effect on the risk for SSD and ASD. Methods We examined 108 SNPs in a set of 22 OMR genes in 1749 subjects divided into three independent samples (187 SSD trios, 915 SSD cases/control, and 91 ASD trios). Genetic association and gene-gene interaction analyses were conducted with PLINK and MB-MDR, and permutation procedures were implemented in both. Results Some OMR genes showed an association trend with SSD, while after correction, the ones that remained significantly associated were MBP, ERBB3, and AKT1. Significant gene-gene interactions were found between (i) NRG1*MBP (perm p-value = 0.002) in the SSD trios sample, (ii) ERBB3*AKT1 (perm p-value = 0.001) in the SSD case-control sample, and (iii) ERBB3*QKI (perm p-value = 0.0006) in the ASD trios sample. Discussion Our results suggest the implication of OMR genes in the risk for both SSD and ASD and highlight the role of NRG1 and ERBB genes. These findings are in line with the previous evidence and may suggest pathophysiological mechanisms related to NRG1/ERBBs signalling in these disorders.


Introduction
Neurodevelopmental processes are essential for the acquisition and maintenance of the human brain configuration and function efficiency. The interplay of inherent genetic programs with a wide range of environmental exposures determines the brain's diverse configurations and functional attributes. This is achieved by orchestrating different processes involved in, for example, synaptogenesis, synaptic plasticity, myelination, or connectivity (Tau and Peterson 2010).
These processes, which take place along the developmental timeline and generate the intrinsic human variability, are ultimately related to the ability of the brain to perceive and interpret the world and make adaptive changes (Markham and Greenough 2004). Deviations in such processes are involved in the aetiology of many psychiatric disorders (neurodevelopmental disorders, NDDs) (Paus et al. 2008); however, the specific mechanisms leading to these disorders remain unknown.
Current evidence suggests that a phenotypic continuum links neurodevelopmental disorders, such as schizophrenia-spectrum disorders (SSD) and autism spectrum disorders (ASD). Although they are considered separate disease entities, emerging data has increasing recognition of these conditions overlap. Epidemiological studies have revealed the co-occurrence of these disorders in the same subject and different family members (Stahlberg et al. 2004;Daniels et al. 2008;Sullivan et al. 2012Sullivan et al. , 2013. Concerning clinical characteristics, both SSD and ASD present deficits in social interaction and communication and show impairments in similar cognitive domains (i.e. attention, memory, executive function, and social cognition) (Kerns et al. 2008;Sasson et al. 2011;Martinez et al. 2019). At a neurobiological level, subjects with SSD and ASD show microscopic and macroscopic evidence of brain disruption in overlapping areas (de Lacy and King 2013). For example, recent studies focussing on induced pluripotent stem cells-derived neural cells from SSD and ASD patients support common alterations in glutamatergic synapse formation and function in both disorders (Habela et al. 2016). Also, from neuroimaging approaches, different studies have reported WM alterations in SSD and ASD (Dennis and Thompson 2013;Wheeler and Voineskos 2014). The observed differences in both disorders as compared to healthy subjects include structural connectivity deficits regarding long-distance and interhemispheric bundles in the corpus callosum and the superior longitudinal, the inferior fronto-occipital, and the inferior longitudinal fasciculi (Ford et al. 2002;Mueller et al. 2012;Karlsgodt 2020;Katz et al. 2016).
The importance of intact WM for brain functionality and the implication of WM alterations in numerous psychiatric disorders has been extensively described (e.g. Catani and Ffytche 2005;Konrad and Winterer 2008). At the histological level, WM is composed of bundles of myelinated axons connecting different parts of grey matter regions and contains the connections between specialised processing regions. Then, the oligodendrocytes, the brain cell type responsible for the synthesis of the myelin sheath that supports and isolates neuronal axons, appear to have a critical role in the white-grey matter cooperation essential for the correct functioning of the brain. Interestingly, neuropathological studies suggest that oligodendrocytes may be altered in SZ (Uranova et al. 2001) and ASD (Morgan et al. 2010;Cao et al. 2012).
Regarding the genetic background of SZ and ASD, several studies have confirmed cross-diagnosis genetic correlations (Lee et al. 2013;Hammerschlag et al. 2019). Interestingly, recent molecular studies have reported evidence for overlap in SZ GWAS regions and those with de novo non-synonymous mutations in ASD . Also, transcriptome data has allowed identifying gene-sets with shared dysregulated expression in cortical brain regions of individuals affected with SZ or ASD (Guan et al. 2019;Hammerschlag et al. 2019).
The candidate gene pathway and epistatic approaches have gained attention as a strategy to deep into the overlapping mechanisms in these disorders. Multiple lines of genetic evidence derived from candidate gene association studies or genome-wide approaches support the association of oligodendrocyte/myelination related (OMR) genes with SZ (Karoutzou et al. 2008;Roussos and Haroutunian 2014). OMR genes are mainly expressed in oligodendrocytes and are involved in myelination (Davis et al. 2003), trophic support to the axon (Segal et al. 2007), and axon-glial interactions (Pernet et al. 2008). For example, the NRG1 gene, which has a central role in cortico-cortical myelination during neurodevelopment , has been repeatedly associated with SZ (Agim et al. 2013;Mostaid et al. 2017).
Additionally, the SZ GWAS published by Ripke et al. (2014) reported that although no OMR gene reached genome-wide significance, a set of them showed suggestive association (ERBB4, NRG1, ANK3). Moreover, WM volume was highlighted as a candidate endophenotype linked to SZ-associated Single Nucleotide Polymorphisms (SNPs) (Terwisscha Van Scheltinga et al. 2013). Remarkably, these authors reported an effect of GWAS-identified SZ risk variants on WM volumes in SZ, when compiled as polygenic risk scores. However, these results were not replicated in an independent study, which suggests the need for further analyses of WM genes' role in the polygenic architecture of mental disorders (Papiol et al. 2014).
Also, astrocyte and oligodendrocyte gene-sets have been associated with an increased risk for SZ (Duncan et al. 2014;Goudriaan et al. 2014), suggesting that genetic alterations underlie specific glial cell type functions which increase susceptibility to SZ. However, other studies did not replicate such effects (O'dushlaine et al. 2015).
About ASD, gene network analyses have also shed light on the putative role of these gene sets. In detail, Li et al. (2014) identified a functional module linked to autism, revealing a subset of genes expressed in oligodendrocytes at the corpus callosum, a structure associated with ASD (Wolff et al. 2015). Accordingly, new genetic studies involving WM-related genes and their genetic interactions could provide new insights into the molecular and cellular mechanisms underlying each disorder and the shared aetiological pathways between both disorders.
Based on the above mentioned, we hypothesised that WM-related genes and their epistatic effects would contribute to both SSD and ASD. To test this hypothesis, we first explored the relationship between genetic variability among a selected set of 22 WMrelated genes and the risk for developing SSD or ASD. Secondly, we analysed second-order gene-gene interactions.

Sample description
Patients were included in this study based on the presence of a diagnosis of SSD or ASD. When possible, healthy patients' parents were also recruited to have a trio-based sample. A sample of healthy subjects was also included.
The sample was drawn and coordinated from three centres: (i) University of Barcelona  Probands and parents were evaluated by trained psychiatrists using: (i) the Comprehensive Assessment of Symptoms and History (CASH), Structured Clinical Interview for DSM Disorders and/or the Diagnostic Interview for Genetic Studies (DIGS) (Samples A and B), (ii) the Autism Diagnostic Interview (Sample C).
Controls were recruited from university students and staff and their acquaintances, plus independent sources in the community. They were interviewed and excluded if they reported a history of mental illness and/or treatment with psychotropic medication.
The exclusion criteria included: major medical illnesses that could affect brain function, substanceinduced psychotic disorder, neurological conditions, history of head trauma with loss of consciousness, and moderate or severe mental retardation.
All participants were of Caucasian origin, thereby reducing the possibility of confounding genetic differences by population stratification.
All participants provided written consent after being informed of the study procedures and implications. The study was performed following the institutions' guidelines and approved by the local ethics committee of each participating centre.

Gene selection
The selection of the 22 white matter related genes was based on: (i) their direct biological implication with WM structure/function, (ii) their relationship with WM related biological pathways and also, (iii) their previously reported association with both the risk for SSD/ASD and WM alterations.
Gene interaction graph network from STRING v.9 was implemented (Franceschini et al. 2013). STRING integrates protein-protein interactions from literature curation, computationally predicted interactions, and interactions transferred from model organisms based on orthology. As visible in Figure 1, most of the proteins included in our study are linked and form an interaction network, supporting the gene-gene interaction approach.

Genotyping
Genomic DNA was extracted from peripheral blood cells or buccal mucosa using standard methods. The genotyping was conducted at the Genomic Service of the Spanish National Cancer Research Centre (CEGEN-CNIO) by using the Open ArrayV R Genotyping System of Applied Biosystems.
The molecular genetics data for the sample include 108 Single Nucleotide Polymorphisms (SNPs) among the 22 selected genes ( Table 1). The optimal set of SNPs that contained maximum information about surrounding variants was selected using SYSNPs (http:// www.sysnps.org/) with a minor allele frequency (MAF) Figure 1. Protein-protein interaction network analysed by STRING software in a set of 22 white matter genes included in our study. All 22 genes were used as input for STRING analysis and a network was built based on high confidence (0.8) evidence from experimental protein-protein interaction and curated databases. The nodes of the network (the marbles in the figure) represent the proteins, while the edges of the network (the lines between the marbles) represent the predicted functional associations between the proteins. The colour of each of the edges represents the type of evidence that exists for that interaction: a red line indicates the presence of fusion evidence, a green line indicates neighbourhood evidence, a blue line indicates co-occurrence evidence, a magenta/purple line indicates experimental evidence, a yellow line indicates text-mining evidence, a light blue line indicates database evidence, and a black line indicates co-expression evidence. >5%, using the pairwise option tagger (threshold of r 2 ¼ 0.8). These SNPs were also selected due to either previous associated findings or functional implications. FASTSNP (function analysis and selection tool for single nucleotide polymorphisms) was used to identify and prioritise high-risk SNPs according to their phenotypic risks and putative functional effects (Yuan et al. 2006: http://fastsnp.ibms.sinica.edu.tw).
After genotyping, standard data cleaning and quality control were performed (see Supplementary Material).

Generating case/pseudo-controls data
In the family-based samples (Samples A and C), pseudo-controls were generated (with PLINKtucc command). This approach consists of generating pseudo-controls using the parent's untransmitted alleles, thus creating a matched case-control design where the observed case is compared to all possible genotypic combinations that could have arisen from the parental mating type (Cordell et al. 2004). For any single variant, there are three alternative genotypes for pseudo-controls that could have been transmitted to the case (case: pseudo-controls ratio is 1:3). Case/ pseudo-controls files were used to conduct both association and epistasis analyses.

Association analyses
Association analyses were conducted using PLINK 1.07 to explore the relationship between our 22 WM-related genes and the risk for developing SSD/ASD.
In those analyses with individuals from different sampling sites (Spain and France), Cochran-Mantel-Haenszel (CMH) and Breslow-Day (BD) tests were used for association analyses and heterogeneity testing, respectively. The Cochran-Mantel-Haenszel (CMH) association test allows the comparison of SNP allelic frequencies between groups while controlling for the country of collection. In addition, the asymptotic pvalue of the Breslow-Day test was used to analyse the heterogeneity of the odds ratios (ORs). After Bonferroni correction, none of the analysis reached statistical significance, meaning that any SNP was excluded due to population differences. CMH and BD were applied to conduct: (i) Association analysis in NDDs sample (Samples A þ C), ii) Association analysis in SSD (Samples A and B, separately). Since sample C was recruited from one centre, allelic association analysis was conducted directly. All the analyses were adjusted by sex. Holm-Bonferroni correction was applied to control for multiple testing in all association analyses.

Gene-gene interaction analysis
To capture second-order SNP-SNP interactions, we used the model-based multifactor dimensionality reduction (MB-MDR) method as implemented in the mbmdr R package (Calle et al. 2010). It is an extension of the multifactor dimensionality reduction (MDR) method in which risk categories are defined using a regression model that also allows adjustments for main effects and covariates. By this approach, first, logistic regressions analyses are performed to determine the nine possible genotypic combinations as high (H), low (L), or no risk (0). Then, genotypes of the same risk category are merged, and two Wald statistics (WH and WL, one for each risk) are calculated. Finally, the significance of a specified model is assessed through a permutation test on the maximum Wald statistic, implemented in the function mbmdr.PermTest.
Gene-gene interactions were tested in the whole group of patients affected by NDDs (Samples A þ C) and also in each sample separately. The permutation procedure (10,000 permutations) was applied to the interaction models with the threshold set at p < 0.05. In all interaction models (Table 3), sex was added as a covariate. In addition, when necessary (Samples A þ C and Sample B), the analyses were repeated including the centre as a covariate. Table 2 shows the characteristics of the three independent samples of our study. As sex distribution between groups (patients vs. relatives and patients vs. controls) showed significant differences, sex was added as a covariate in all the analyses.

Association analyses
First, when samples A and C were analysed together (considering all patients to belong to the Neurodevelopmental Disorders group), no significant associations were detected (data not shown).
Second, to explore diagnosis-specific effects, we conducted family-based association analyses separately in Sample A and C. In SSD families, the associated SNPs implicated six genes: ERBB4, ERBB2, MOG, NRG1, MBP, and MAG (uncorrected p-values p < 0.04; Table S1). After H-B correction MBP gene remained significant. Analyses in ASD families showed a trend association for SNPs implicating four genes: MOG, QKI, MAG, and MBP (uncorrected p-value p < 0.05; Table  S2). However, after H-B correction, none of these gene associations remained significant.

Epistatic analysis effect of WM related genes
When the analysis included the whole group of NDDs patients (Samples A þ C), the following epistasis was found: NRG1 (rs6989777) Ã ERBB4 (rs707284), Perm p-value ¼ 0.0003. This interaction reflects that some allelic combinations of these two SNPs are differentially distributed in patients compared to healthy subjects. Afterward, we conducted the epistatic analyses in each sample separately (Table 3). In SSD (Sample A), a significant interaction was detected involving MBP and

Discussion
The present study has focussed on exploring the role of a set of white matter related genes, individually or in interaction, in the risk of developing schizophrenia and autism spectrum disorders (SSD and ASD), psychiatric disorders with a neurodevelopmental component. The inclusion of the two conditions and the cross-disorders analyses combined with the diagnosis-specific ones represent a novel approach that has allowed studying a targeted gene-set across the continuum of these disorders.
We first developed single-gene-based analyses in three samples, including individuals diagnosed with SSD or ASD and healthy subjects (relatives or nonrelated). On the one hand, our results suggest that some of the analysed WM genetic risk variants seem to be shared across SSD-ASD while others seem to be diagnosis-specific. For example, MBP or MAG genes appear to be associated with either SSD or ASD, while AKT1 and NRG1 (and its receptors, ERBB2, ERBB3, and ERBB4) emerge only in SSD based samples. These results should be interpreted cautiously. First, because some do not remain significant after multiple testing correction; second, because some are not replicated in the two SSD samples; and third, because there is no ASD replication sample. However, our results gain interest when considering the gene-gene interactions analysed afterward.
In this regard, epistasis analyses identified significant effects that include the genes previously highlighted by the single-gene approach. While the specific genes involved in the second-order interactions differ across samples, it is remarkable that in the three samples as well as in the analyses combining both disorders, the identified epistatic effects include the interaction of NRG1 or ErbB receptors genes with OMR genes. These differences across samples could be attributed to diagnosis specificities and/or to design effects (case-control vs. family-based); however, our convergent results suggest the main role of Nrg-ErbB signalling across the continuum of these neurodevelopment disorders.
On the one hand, the ERBB3 gene appears to interact with AKT1 (in the SSD case-control sample) and with QKI (in the ASD trio sample). Although these interactions have not been previously described, it is interesting to note that QKI and AKT1 genes are related to myelination pathways (Åberg et al. 2006;Flores et al. 2008;Lee 2009). In this line, disruption of QKI gene expression has been related to downstream consequences on oligodendrocyte development and myelination, eventually increasing the susceptibility for suffering psychiatric disorders (Lee 2009). Of note, QKI Significant gene-gene interactions using model-based multifactor dimensionality reduction method (MB-MDR) for each sample are shown. The Risk category indicates that for a given gene-gene combination, there are allelic combinations that occur more frequently in patients than in healthy individuals (high-risk) or that are increased in healthy subjects compared to patients (low-risk). Perm p-value: refers to the p-value for the interaction model after applying 10,000 permutations. All the reported analyses were obtained including sex as a covariate. The analyses indicated ( þ ) were repeated adjusting by centre. Results were highly similar to those covaried by sex and remained significant (data not shown). Sample A refers to the SSD family-based sample [parents and an offspring with a diagnosis of schizophrenia-spectrum disorder (SSD)]. Sample B refers to the SSD case-control sample (individuals with a diagnosis of SSD and healthy unrelated subjects). Sample C refers to the ASD family-based sample [parents and an offspring with a diagnosis of autism-spectrum disorder (ASD)].
expression changes have been associated with the treatment with typical neuroleptics in SZ patients (Åberg et al. 2006). Also, Akt signalling has emerged as a fundamental player in both peripheral and central nervous systems myelination (Figlia et al. 2017) due to its implication in regulating several processes during the development of myelinating Schwann cells and oligodendrocytes (Norrm en and Suter 2013). On the other hand, a combined effect between NRG1 and ERBB4 has been observed in the combined group of Neurodevelopmental disorders. Genetic association studies have identified NRG1 and ERBB4 as SZ risk genes, and altered NRG-ERBB4 signalling has been associated with positive, negative, and cognitive symptoms (Stefansson et al. 2002;Li et al. 2006;Nicodemus et al. 2006;Harrison and Law 2006;Yin et al. 2018). In addition, NRG1xERBB4 interaction has already been described in SZ (Norton et al. 2006), but, to our knowledge, no study has previously studied it across neurodevelopmental disorders, further suggesting that.
In favour of the role of Nrg-ErbB signalling across this continuum, recent studies have revealed complex Nrg/ErbB signalling networks that regulate the assembly of neural circuitry, myelination, neurotransmission, and synaptic plasticity (Mei and Nave 2014). Moreover, evidence indicates that deviances in Nrg/ErbB signalling impair brain functions critically involved in neural development, including circuitry generation, peripheral myelination, neurotransmission, and homeostasis of CNS synaptic functions (Mei and Nave 2014). In this sense, OMR and Nrg/ErbB signalling pathways are highlighted as therapeutic targets for neuropsychiatric diseases (Wehr et al. 2017).
In this support, the disruption of the NRG1-ErbB4 pathway in oligodendrocytes of animal models has been observed to lead to an alteration of the myelin sheath of the white matter tracts, reduced conduction velocity, and cognitive changes (Roy et al. 2007). Moreover, post-mortem human-based studies have also reported that NRG1 expression is deregulated and ERBB4 hyperphosphorylated in the dorsolateral prefrontal cortex and hippocampus in SZ (Hashimoto et al. 2004;Law et al. 2006;Hahn et al. 2006;Weickert et al. 2012), which can also be related to a hypothetic unbalanced mechanism of expression underlying NRG1-ErbB4 interaction. Then, given the role of NRG1 and ErbB receptors in oligodendrocytes development, alterations in the corresponding genes could impact the oligodendrocytes functionality and thus, affect the white-grey matter relationship.
Even though the two SNPs detected in the secondorder interaction are intronic, it is of note that the vast majority of the genome has gene regulatory properties (Bernstein et al. 2012), in which intronic and intergenic variants are involved. The impact of the non-coding variants captured by this interaction NRG1 Â ERBB4 (rs6989777 Â rs707284) can be evaluated using HaploReg (Ward and Kellis 2012). This is a tool that uses LD information from the 1000 Genomes Project to provide data on the predicted chromatin state of the queried SNPs, their sequence conservation mammals, and their effect on regulatory motifs. In this case, both SNPs are associated with putative changes in regulatory motifs (multi-tissue eQTL data). For example, NRG1-rs6989777 (also associated with SSD risk in our single-gene approach) is in an intronic region, and it is predicted to alter several motifs that overlap the recognition sequences of transcription factors, such as AP-1, CTCF, and Maf. In this sense, AP-1 and CTCF have been proposed as key players in maintaining a chromatin conformation of gene regulatory elements (Park et al. 2012). Interestingly, there is also evidence that this SNP is included in one of the top identified genes through polygenic scoring and pathway analyses in SZ (Ayalew et al. 2012). Besides the putative regulatory properties of the identified SNPs, it has also to be considered that the detected effects could reflect the involvement of other SNPs in linkage disequilibrium.
Some limitations of this study must be acknowledged. The polygenic nature of mental disorders, the relatively small sample size, and the minor effect of the common genetic variants limit the power of our research. Also, the clinical heterogeneity derived from the inclusion based on diagnosis instead of more specific clinical traits restricts our analyses and could be potentially associated with error I type errors. Accordingly, as described in detail in the methodology, multiple testing corrections have been applied in each analysis. Nevertheless, although such procedures have been used, if multiple testing is addressed taking together all the analyses, probably not all the findings would remain significant. Then, further studies are required to validate our findings and determine the biological mechanisms underlying the detected gene interactions among the WM-related genes.
Overall, our results contribute from a biological approach to identifying putative genetic mechanisms involved in SSD and ASD and suggest the potential role of the genes involved in the relationship between OMR and Nrg-ErbB signalling. These findings help in the detection and characterisation of the biological pathways that underpin these disorders and add interest in investigating the interactions of genes to explain a substantial shared component of the risk.

Acknowledgements
We are deeply grateful to all the participants whose generosity made this work possible. We also sincerely acknowledge the psychiatrists, psychologists, and mental health staff from all clinical and research centers who collaborated in this study. We also thank Anna Valldeperas for her assistance with the molecular laboratory tasks.

Disclosure statement
All authors report no conflict of interest.

Funding
This study was supported by: i) Eranet Neuron Consortium "AUSZ: from Autism to SchiZophrenia: study of the genetic mechanisms underlying brain dysfunction and structural phenotypes in schizophrenia and autistic spectrum disorders" (ANR-2010-NEUR-002-01, PIM2010ERN-00642), ii) Fondation de France (Engt n 15142, 2011) and by GDR 3557, iii) the Comissionat per a Universitats i Recerca del DIUE of the Generalitat de Catalunya (Ag encia de Gesti o d'Ajuts Universitaris i de Recerca (AGAUR), 2017SGR1577 and 2017SGR1271); iv) travel grant from Aarhus University graduate school to C Prats; v) Ajuts de Personal Investigador Predoctoral en Formaci o (APIF-IBUB-Universitat de Barcelona) to C Prats, and vi) the Spanish Ministry of Economy and Competitivity, Instituto de Salud Carlos III through the project PI18/01535 and the Miguel Servet contract CP20/00072 to M Fatj o-Vilas (co-funded by European Regional Development Fund (ERDF)/European Social Fund "Investing in your future").