Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Torrents Arenales, David | -
dc.contributor.advisor | Mercader Bigas, Josep Maria | -
dc.contributor.author | Guindo Martínez, Marta | -
dc.contributor.other | Universitat de Barcelona. Facultat de Biologia | -
dc.description.abstract | [eng] Genome-wide association studies (GWAS) have proven useful for identifying thousands of associations between genetic variants and human complex diseases and traits. However, the identified loci account for only a small proportion of the estimated heritability (i.e., the proportion of variance for a particular phenotype that can be explained by genetic factors). The usually small effect sizes of common variants and the low frequencies of some variants with potentially larger effect sizes limit the statistical power of GWAS. The difficulty of identifying common variants with small effects and low-frequency variants with large effects can be overcome by analyzing larger sample sizes and imputing genotypes using dense reference panels. However, there is still room for improvement beyond increasing the sample size and the number of variants. Because current GWAS focus predominantly on the autosomes and test only the additive model, current strategies still constrain the full potential of GWAS. In this thesis, we hypothesized that a comprehensive analysis that improves on current GWAS strategies by 1) implementing the analysis of the X chromosome alongside the autosomes, 2) including genetic variants from a broader allele frequency spectrum and more types of variants, such as small insertions and deletions (INDELs), through genotype imputation using multiple reference panels, and 3) testing different models of inheritance in the association test, would improve our understanding of the genetic architecture of complex diseases. To test these hypotheses, we developed an integrated framework that implements our methodology, called GUIDANCE.
GUIDANCE integrates state-of-the-art tools for GWAS analysis, including the analysis of the X chromosome, a two-step imputation with multiple reference panels, association testing under additive, dominant, recessive, heterodominant and genotypic inheritance models, and cross-phenotype association analysis when more than one disease is available in the cohort under study. We used GUIDANCE to analyze the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort, a publicly available cohort that includes 62,281 subjects of European ancestry with an average age of 63 years and data on 22 diseases, representing the largest cohort for age-related diseases to date. After quality control, we analyzed 56,637 subjects of European descent. Following our methodology, we imputed genotypes using the 1000 Genomes Project (1000G) phase 3, the Genome of the Netherlands project (GoNL), the UK10K project, and the Haplotype Reference Consortium (HRC) as reference panels. Using this strategy, we identified 26 new associated loci for 16 phenotypes (p < 5 × 10⁻⁸), with 13 showing significant dominance deviation (p < 0.05). Importantly, we identified three recessive loci with large effects that could not have been identified with the additive model. These include a region led by an INDEL associated with cardiovascular disease in CACNB4 (rs201654520, minor allele frequency [MAF] = 0.017, odds ratio [OR] = 19.02, p = 4.32 × 10⁻⁸), a locus near PELO associated with type 2 diabetes with the greatest odds ratio for type 2 diabetes in Europeans reported to date (rs77704739, MAF = 0.036, OR = 4.32, p = 1.75 × 10⁻⁸), and a rare INDEL associated with age-related macular degeneration near THUMPD2 (rs557998486, MAF = 0.009, OR = 10.5, p = 2.75 × 10⁻⁸).
Despite the phenotype discrepancies and the different demographic characteristics of the GERA cohort and UK Biobank, four of the novel loci were replicated with an equivalent phenotype in UK Biobank, and we found additional supporting associations in related traits, treatments or biomarkers in UK Biobank for the remaining novel loci. Of note, the PELO and THUMPD2 recessive loci, which could not have been found with the additive model, were replicated using the recessive model in UK Biobank (combined results: PELO, rs77704739, OR = 2.46, p = 4.68 × 10⁻¹¹, and THUMPD2, rs557998486, OR = 26.51, p = 3.29 × 10⁻⁸). Overall, these results highlight the importance of performing a comprehensive analysis of the full spectrum of genetic variation and of considering non-additive models when performing GWAS, especially with well-powered biobanks and the increasing ability to impute low-frequency variants. For the benefit of the research community, we make available both GUIDANCE, to boost the analysis of existing and ongoing GWAS projects, and the GERA cohort results, which constitute the largest non-additive genetic variation association database to date, through the Type 2 Diabetes Knowledge Portal (
dc.format.extent | 267 p. | -
dc.publisher | Universitat de Barcelona | -
dc.rights | cc-by-nc-nd, (c) Guindo, 2019 | -
dc.subject.classification | Epidemiologia genètica | -
dc.subject.other | Genetic epidemiology | -
dc.title | A systematic and comprehensive approach for large-scale genome-wide association studies. Unraveling non-additive inheritance models in age-related diseases | -
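
Note on the inheritance models named in the abstract: the sketch below illustrates how a genotype at a biallelic variant is conventionally recoded before association testing under each model. It is not taken from the thesis or from GUIDANCE; the function name `encode_genotype` and the exact codings are assumptions based on the usual textbook definitions (in particular, "heterodominant" is assumed to mean heterozygotes compared against both homozygote groups).

```python
from typing import Dict, Tuple


def encode_genotype(minor_allele_count: int) -> Dict[str, Tuple[int, ...]]:
    """Recode a genotype (0, 1 or 2 copies of the minor allele) per model.

    Hypothetical helper for illustration only; codings follow the
    conventional definitions, not a verified GUIDANCE implementation.
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    g = minor_allele_count
    return {
        # Additive: effect scales with the number of minor alleles.
        "additive": (g,),
        # Dominant: one copy is enough to carry the effect.
        "dominant": (int(g >= 1),),
        # Recessive: only minor-allele homozygotes carry the effect.
        "recessive": (int(g == 2),),
        # Heterodominant (assumed): heterozygotes vs. both homozygotes.
        "heterodominant": (int(g == 1),),
        # Genotypic: separate indicators for heterozygotes and homozygotes.
        "genotypic": (int(g == 1), int(g == 2)),
    }


if __name__ == "__main__":
    for copies in (0, 1, 2):
        print(copies, encode_genotype(copies))
```

Under the recessive coding, heterozygotes are grouped with non-carriers, so an effect confined to minor-allele homozygotes is concentrated in a single contrast rather than spread across a linear trend. This is one intuition for why such loci can be missed when only the additive model is tested.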
Appears in Collections: Tesis Doctorals - Facultat - Biologia

Files in This Item:
File | Description | Size | Format
MGM_PhD-THESIS.pdf | | 45.54 MB | Adobe PDF

Embargoed: document embargoed until 18-12-2020

This item is licensed under a Creative Commons License.