Human Population Genetics

  • Diversity of Copy Number Variation (CNV) in Native American populations
  • Patterns of American colonization
  • Genome-wide association studies of admixed populations


  • Database for human genetic variation data
  • Data and tool integration


Exploring the biomedical consequences of genetic differentiation of Native Americans.

The goal of our laboratory is to identify genes involved in immunity and cancer that bear variants common in Native Americans and rare elsewhere, and to understand their effect on susceptibility to complex diseases. We study the pattern of genetic diversity of these genes, infer their evolutionary histories and functionally characterize the variants for which we have strong evidence of influence on complex diseases.

The EPIGEN-Brazil project and the use of admixture to map susceptibility to complex traits.

The EPIGEN-Brazil initiative is supported by the Brazilian Ministry of Health. We are genotyping ~6,600 Brazilians from the three largest cohorts in the country: the 1982 Pelotas birth cohort (n=3,900), the Bambui (Minas Gerais) cohort study of aging (n=1,500, baseline year: 1997) and the Salvador children cohort (n=1,200, baseline year: 1997). This initiative involves four other Brazilian groups: the Fundação Oswaldo Cruz from Belo Horizonte (Bambui cohort coordinator: Dr. Maria Fernanda Lima-Costa), the University of Pelotas (Dr. Bernardo Lessa-Horta), the University of Bahia (Dr. Mauricio L Barreto) and the University of São Paulo (Dr. Jorge E Krieger). Because we expect higher European ancestry in Pelotas, higher African ancestry in Salvador, and intermediate levels of each in Bambui, the association studies to be performed pose specific methodological challenges. This genome-wide dataset and its associated clinical information will be one of the first and largest in non-European populations, and it requires robust bioinformatics support. It will allow us to study gene-environment interactions and to perform admixture mapping of several biomedical outcomes such as rheumatoid arthritis, blood pressure, and anthropometric characteristics.

We also want to identify genes involved in the decimation of Native Americans by infectious diseases, using “natural selection-admixture mapping” (NSAM) [11]. When Europeans and Africans arrived in the New World after 1492, they brought pathogens to which the Native American immune system was naive. Smallpox, measles, typhus and influenza decimated the autochthonous populations across the Americas [1], and are still relevant pathogens today.


The production of biological data by high-throughput technologies has revolutionized biology. In genetics, classical and emerging scientific questions are being approached using SNP and CNV genotyping and Next Generation Sequencing platforms, as in this proposal. Today, investigators in biology fall into two distinct groups: a few large research groups that produce high-throughput data, and thousands of small- and medium-sized groups, such as our laboratory, that produce smaller amounts of data but also integrate them with the high-throughput data to answer relevant scientific questions. While large-scale genomics initiatives such as the HapMap and the 1000 Genomes projects rely on powerful computational and bioinformatics support for data production and analysis, very few bioinformatics platforms are oriented to small/medium-scale groups that need to store, handle, integrate, and analyze data from different sources and to combine different kinds of analyses. As a consequence, these tasks are frequently performed sub-optimally by manually handling data files, an error-prone practice that is seldom coupled with adequate quality control procedures. To support investigators from small/medium-sized research groups in human population genetics and genetic epidemiology, we developed a bioinformatics platform called DIVERGENOME. This platform is designed to manage and analyze large amounts of data, and includes two components: DIVERGENOMEdb and DIVERGENOMEtools. DIVERGENOMEdb is a relational database that can integrate data from different sources, including the investigators' own data.

More information about DIVERGENOME can be found at:

Magalhães, Wagner C. S.; Rodrigues, Maíra R.; Silva, Donnys; Soares-Souza, Giordano; Iannini, Márcia L.; Cerqueira, Gustavo C.; Faria-Campos, Alessandra C.; Tarazona-Santos, E. DIVERGENOME: a bioinformatics platform to assist population genetics and genetic epidemiology studies. Genetic Epidemiology, v. 36, 2012.

Or on the website pggenetica.icb.ufmg.br/divergenome


Bioinformatic Tools

We developed and maintain the following bioinformatics tools:

EPIGEN-SW is implemented as a web tool and facilitates access to computational resources through an integrative and interactive approach based on flowcharts, master scripts and auxiliary scripts. (To appear)

DANCE is a graph-based web tool that allows users to integrate and visualize information on human complex phenotypes and their GWAS hits, as well as their risk allele frequencies in different populations. (Araújo et al., 2015)

CNVice (Inbreeding Coefficients Estimation for CNV data) is a freely available R script for population genetics applications. (To appear)

DIVERGENOME is a bioinformatics platform to assist population genetics and genetic epidemiology studies performed by small- to medium-sized research groups. The platform manages a relational database where information on genotypes, polymorphism, laboratory protocols, individuals, populations, and phenotypes is organized in user projects. (Magalhães et al., 2012)

DIVERGENOMEtools is a set of tools that converts data into the formats required by popular software in population genetics and genetic epidemiology. It implements a new method proposed by our laboratory: a graph-based approach that composes pipelines automatically by assembling a specialised set of tools on demand, depending on the functionality required, instead of specifying every sequence of tools in advance. (Rodrigues et al., 2012)
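The idea of composing a conversion pipeline from a graph can be illustrated with a minimal sketch. It assumes that data formats are nodes and that each converter is an edge, and it uses a plain breadth-first search to find the shortest chain of tools. The format and tool names are hypothetical and are not taken from DIVERGENOMEtools; the published method may differ in its details.

```python
from collections import deque

# Hypothetical converters: (input format, output format) -> tool name.
# These names are illustrative only, not the DIVERGENOMEtools catalogue.
CONVERTERS = {
    ("divergenome", "ped"): "db_to_ped",
    ("ped", "phase"): "ped_to_phase",
    ("ped", "structure"): "ped_to_structure",
    ("phase", "fasta"): "phase_to_fasta",
}


def compose_pipeline(source, target, converters=CONVERTERS):
    """Return the shortest list of tools that converts source into target.

    Returns an empty list when source equals target and None when no
    chain of converters connects the two formats.
    """
    # Build an adjacency list: format -> [(next format, tool), ...]
    graph = {}
    for (src, dst), tool in converters.items():
        graph.setdefault(src, []).append((dst, tool))

    # Breadth-first search guarantees the first path found is the shortest.
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        fmt, tools = queue.popleft()
        if fmt == target:
            return tools
        for next_fmt, tool in graph.get(fmt, []):
            if next_fmt not in visited:
                visited.add(next_fmt)
                queue.append((next_fmt, tools + [tool]))
    return None


if __name__ == "__main__":
    print(compose_pipeline("divergenome", "fasta"))
```

With this design, adding a new converter only requires registering one more edge; every pipeline that can use it becomes available without being written out in advance.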

Implements a pipeline that facilitates data handling typical of re-sequencing studies. Functionalities: (1) consolidates different outputs produced by distinct Phred-Phrap-Consed contigs sharing a reference sequence; (2) checks for genotyping inconsistencies; (3) reformats genotyping data produced by PolyPhred into a matrix of genotypes with individuals as rows and segregating sites as columns; (4) prepares input files for haplotype inference using the popular software PHASE; and (5) handles PHASE output files, which contain only polymorphic sites, to reconstruct the inferred haplotypes including both polymorphic and monomorphic sites, as required by population genetics software for re-sequencing data such as DnaSP. (Machado et al., 2011)
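Functionalities (3) and (5) can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: genotype calls arrive as (individual, site, genotype) tuples rather than as PolyPhred output, sites are single-nucleotide substitutions at 0-based positions, and monomorphic sites carry the reference base. The function names are hypothetical and the real PolyPhred and PHASE file formats are not parsed here.

```python
def genotype_matrix(calls, missing="?"):
    """Pivot (individual, site, genotype) calls into a matrix.

    Rows are individuals and columns are segregating sites, both sorted.
    Cells without a call are filled with the `missing` symbol.
    """
    calls = list(calls)  # allow generators; we iterate more than once
    individuals = sorted({ind for ind, _, _ in calls})
    sites = sorted({site for _, site, _ in calls})
    lookup = {(ind, site): gt for ind, site, gt in calls}
    matrix = [
        [lookup.get((ind, site), missing) for site in sites]
        for ind in individuals
    ]
    return individuals, sites, matrix


def rebuild_haplotype(reference, positions, alleles):
    """Rebuild a full-length haplotype from alleles at polymorphic sites.

    `positions` are 0-based indices into `reference`; `alleles` holds the
    inferred allele at each of those positions, in the same order.
    Every other site keeps the reference base (assumed monomorphic).
    """
    if len(positions) != len(alleles):
        raise ValueError("positions and alleles must have the same length")
    sequence = list(reference)
    for pos, allele in zip(positions, alleles):
        sequence[pos] = allele
    return "".join(sequence)


if __name__ == "__main__":
    calls = [("ind2", 10, "A/G"), ("ind1", 10, "A/A"), ("ind1", 25, "C/T")]
    print(genotype_matrix(calls))
    print(rebuild_haplotype("ACGTACGT", [1, 5], "TG"))
```

Insertions and deletions would need a coordinate-aware treatment and are deliberately left out of this sketch.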

Publications of LDGH Members


  • Magalhães WCS, Araujo NM, Leal TP, Araujo GS, Viriato PJS, Kehdy FS, Costa GN, Barreto ML, Horta BL, Lima-Costa MF, Pereira AC, Tarazona-Santos E, Rodrigues MR; Brazilian EPIGEN Consortium. EPIGEN-Brazil Initiative resources: a Latin American imputation panel and the Scientific Workflow. Genome Res. 2018. 28(7):1090-1095.

  • Harris DN, Song W, Shetty AC, Levano KS, Cáceres O, Padilla C, Borda V, Tarazona D, Trujillo O, Sanchez C, Kessler MD, Galarza M, Capristano S, Montejo H, Flores-Villanueva PO, Tarazona-Santos E, O'Connor TD, Guio H. Evolutionary genomic dynamics of Peruvians before, during, and after the Inca Empire. Proc Natl Acad Sci U S A. 2018 Jul 10;115(28):E6526-E6535.

  • Rodrigues-Soares F, Kehdy FSG, Sampaio-Coelho J, Andrade PXC, Céspedes-Garro C, Zolini C, Aquino MM, Barreto ML, Horta BL, Lima-Costa MF, Pereira AC, LLerena A, Tarazona-Santos E. Genetic structure of pharmacogenetic biomarkers in Brazil inferred from a systematic review and population-based cohorts: a RIBEF/EPIGEN-Brazil initiative. Pharmacogenomics J. 2018 May 1. PubMed PMID: 29713005.

  • Silva TM, Fiaccone RL, Kehdy FSG, Tarazona-Santos E, Rodrigues LC, Costa GNO, Figueiredo CA, Alcantara-Neves NM, Barreto ML. Biogeographical ancestry is associated with socioenvironmental conditions and infections in a Latin American urban population. Population Health. 2018 April (4);301-306.

  • Fonseca PAS, Leal TP, Santos FC, Gouveia MH, Id-Lahoucine S, Rosse IC, Ventura RV, Bruneli FAT, Machado MA, Peixoto MGCD, Tarazona-Santos E, Carvalho MRS. Reducing cryptic relatedness in genomic data sets via a central node exclusion algorithm. Mol Ecol Resour. 2018 May;18(3):435-447. Epub 2017 Dec 22. PubMed PMID: 29271609.

  • Torres KCL, Rezende VB, Lima-Silva ML, Santos LJS, Costa CG, Mambrini JVM, Peixoto SV, Tarazona-Santos E, Martins Filho OA, Lima-Costa MF, Teixeira-Carvalho A. Immune senescence and biomarkers profile of Bambuí aged population-based cohort. Exp Gerontol. 2018 Mar;103:47-56. Epub 2017 Dec 14. PubMed PMID: 29247791.



  • Rahbari R, Zuccherato LW, Tischler G, Chihota B, Ozturk H, Saleem S, Tarazona-Santos E, Machado LR, Hollox EJ. Understanding the Genomic Structure of Copy-Number Variation of the Low-Affinity Fcγ Receptor Region Allows Confirmation of the Association of FCGR3B Deletion with Rheumatoid Arthritis. Hum Mutat. 2017 Apr;38(4):390-399. Epub 2017 Feb 15. PubMed PMID: 27995740.

  • Marques CR, Costa GN, da Silva TM, Oliveira P, Cruz AA, Alcantara-Neves NM, Fiaccone RL, Horta BL, Hartwig FP, Burchard EG, Pino-Yanes M, Rodrigues LC, Lima-Costa MF, Pereira AC, Gouveia MH, Sant Anna HP, Tarazona-Santos E, Lima Barreto M, Figueiredo CA. Suggestive association between variants in IL1RAPL and asthma symptoms in Latin American children. Eur J Hum Genet. 2017 Apr;25(4):439-445. Epub Jan 25. PubMed PMID: 28120837.

  • Zuccherato LW, Schneider S, Tarazona-Santos E, Hardwick RJ, Berg DE, Bogle H, Gouveia MH, Machado LR, Machado M, Rodrigues-Soares F, Soares-Souza GB, Togni DL, Zamudio R, Gilman RH, Duarte D, Hollox EJ, Rodrigues MR. Population genetics of immune-related multilocus copy number variation in Native Americans. J R Soc Interface. 2017 Mar;14(128).  PubMed PMID: 28356540

  • Lima-Costa MF, Melo Mambrini JV, Lima Torres KC, Peixoto SV, de Oliveira C, Tarazona-Santos E, Teixeira-Carvalho A, Martins-Filho OA. Predictive value of multiple cytokines and chemokines for mortality in an admixed population: 15-year follow-up of the Bambui-Epigen (Brazil) cohort study of aging. Exp Gerontol. 2017 Nov. PubMed PMID: 28803133.




  • Lima-Costa MF, Mambrini JV, Leite ML, Peixoto SV, Firmo JO, Loyola Filho AI, Gouveia MH, Leal TP, Pereira AC, Macinko J, Tarazona-Santos E. Socioeconomic Position, But Not African Genomic Ancestry, Is Associated With Blood Pressure in the Bambui-Epigen (Brazil) Cohort Study of Aging. Hypertension. 2016 Feb;67(2):349-55. Epub 2015 Dec 28. PubMed PMID: 26711733.
  • Lima-Costa MF, Macinko J, Mambrini JV, Peixoto SV, Pereira AC, Tarazona-Santos E, Ribeiro AL. Genomic African and Native American Ancestry and Chagas Disease: The Bambui (Brazil) Epigen Cohort Study of Aging. PLoS Negl Trop Dis. 2016 May 16;10(5):e0004724. eCollection 2016 May. PubMed PMID: 27182885.

  • Sosa-Macías M, Teran E, Waters W, Fors MM, Altamirano C, Jung-Cook H, Galaviz-Hernández C, López-López M, Remírez D, Moya GE, Hernández F, Fariñas H, Ramírez R, Céspedes-Garro C, Tarazona-Santos E, LLerena A. Pharmacogenetics and ethnicity: relevance for clinical implementation, clinical trials, pharmacovigilance and drug regulation in Latin America. Pharmacogenomics. 2016 Oct 28. PubMed PMID: 27790935



  • Kehdy FS, Gouveia MH, Machado M, Magalhães WC, Horimoto AR, Horta BL, Moreira RG, Leal TP, Scliar MO, Soares-Souza GB, Rodrigues-Soares F, Araújo GS, Zamudio R, Sant Anna HP, Santos HC, Duarte NE, Fiaccone RL, Figueiredo CA, Silva TM, Costa GN, Beleza S, Berg DE, Cabrera L, Debortoli G, Duarte D, Ghirotto S, Gilman RH, Gonçalves VF, Marrero AR, Muniz YC, Weissensteiner H, Yeager M, Rodrigues LC, Barreto ML, Lima-Costa MF, Pereira AC, Rodrigues MR, Tarazona-Santos E; Brazilian EPIGEN Project Consortium. Origin and dynamics of admixture in Brazilians and its effect on the pattern of deleterious mutations. Proc Natl Acad Sci U S A. 112(28):8696-701. PubMed PMID: 26124090.

  •  Lima-Costa MF, Rodrigues LC, Barreto ML, Gouveia M, Horta BL, Mambrini J, Kehdy FS, Pereira A, Rodrigues-Soares F, Victora CG, Tarazona-Santos E; Epigen-Brazil group. Genomic ancestry and ethnoracial self-classification based on 5,871 community-dwelling Brazilians (The Epigen Initiative). Sci Rep. 5:9812. PubMed PMID: 25913126.

  • Zamudio R, Pereira L, Rocha CD, Berg DE, Muniz-Queiroz T, Sant Anna HP, Cabrera L, Combe JM, Herrera P, Jahuira MH, Leão FB, Lyon F, Prado WA, Rodrigues  MR, Rodrigues-Soares F, Santolalla ML, Zolini C, Silva AM, Gilman RH, Tarazona-Santos E, Kehdy FS. Population, Epidemiological, and Functional Genetics of Gastric Cancer Candidate Genes in Peruvians with Predominant Amerindian Ancestry. Dig Dis Sci. PubMed PMID: 26391267.

  • Santos HC, Horimoto AV, Tarazona-Santos E, Rodrigues-Soares F, Barreto ML, Horta BL, Lima-Costa MF, Gouveia MH, Machado M, Silva TM, Sanches JM, Esteban N, Magalhaes WC, Rodrigues MR, Kehdy FS, Pereira AC. A minimum set of ancestry informative markers for determining admixture proportions in a mixed American population: the Brazilian set. Eur J Hum Genet. PubMed PMID: 26395555.

  • Sosa-Macias M, Moya GE, LLerena A, Ramírez R, Terán E, Peñas-LLedó EM, Tarazona-Santos E, Galaviz-Hernández C, Céspedes-Garro C, Acosta H. Population pharmacogenetics of Ibero-Latinoamerican populations (MESTIFAR 2014). Pharmacogenomics. 16(7):673-6. PubMed PMID: 25929854.

  • Araujo GS, Lima LHC, Schneider S, Leal TP, da Silva APC, Vaz de Melo POS, Tarazona-Santos E, Scliar MO, Rodrigues MR. Integrating, summarizing, and visualizing GWAS-hits and human diversity with DANCE (Disease ANCEstry Networks). Bioinformatics. 2016 Apr 15;32(8):1247-9. Epub 2015 Dec 15. PubMed PMID: 26673785.

  • Costa GN, Dudbridge F, Fiaccone RL, Silva TM, Conceicao JS, Strina A, Figueiredo CA, Magalhaes WCS, Rodrigues MR, Gouveia M, Kehdy F, Horimoto A, Lessa-Horta B, Burchard E, Pino-Yanes M, Del-Rio-Navarro B, Romieu I, Hancock D, London S, Lima-Costa MF, Pereira A, Tarazona-Santos E, Rodrigues LC, Barreto M. A Genome-wide association study of asthma symptoms in Latin American children. BMC Genet. 2015 Dec 3;16:141. PubMed PMID: 26635092.

  • Lima-Costa MF, Macinko J, Mambrini JM, Cesar CC, Peixoto SV, Magalhaes WCS, Lessa-Horta B, Barreto M, Castro-Costa E, Firmo JO, Proietti FA, Leal TP, Rodrigues MR, Pereira A, Tarazona-Santos E. Genomic ancestry, self-rated health and its association with mortality in an admixed population: 10 year follow-up of the Bambui-Epigen (Brazil) Cohort Study of Ageing. PLoS One. 2015 Dec 17;10(12):e0144456. PubMed PMID: 26680774




Prof. Eduardo Tarazona Santos CV

I graduated in Biological Sciences from the University of Bologna (Italy) and hold a Master's degree in Biochemistry from the Federal University of Minas Gerais (UFMG) and PhDs in Biochemistry (UFMG) and in Biological Anthropology (University of Bologna). I was a postdoc at the University of Maryland (supervisor: Sarah Tishkoff) and then at the National Cancer Institute (supervisor: Stephen Chanock), in the US. Currently, I am an Associate Professor at the Department of Biology of the UFMG, where I lead the Laboratory of Human Genetic Diversity, conducting research on the following topics: (1) Human genomic diversity in Latin America: evolutionary inferences and biomedical problems; (2) Genetic epidemiology of complex diseases and pharmacogenetics in Latin America; and (3) Development of bioinformatics tools for the study of genetic diversity. I was Head of the Graduate Program in Genetics at the UFMG (2010-2013). Currently, I am Head of the Department of Biology at UFMG (2016-2018) and a Member of the Biological Sciences Study Section of the Minas Gerais State Agency for Research (FAPEMIG, 2016-2018).

  Post docs

Dr. Giordano Bruno Soares Souza CV


  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts
  • Imputation in Latin American populations

  PhD Students

Hanaisa de Pla e Santanna CV


  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts
  • Exploring the biomedical consequences of genetic differentiation of Native Americans.

Isabela Oliveira dos Anjos Alvim CV


  • Human genomic diversity in Latin America: biomedical implications and evolutionary inferences
  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts

Marla Mendes de Aquino CV


  •  Human genomic diversity in Latin America: biomedical implications and evolutionary inferences



Nathalia Matta Araujo CV


  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts
  • Imputation in Latin American populations



Rennan Garcias Moreira CV


  •  EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts

Thiago Peixoto Leal CV


  • Population genetics analysis of Drug Metabolism Enzymes (DME) genes
  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts



Victor Octavio Borda Pua CV


  • Genetic history and natural selection in South America

  Master Students

Camila Zolini de Sá CV


  • Gastric Cancer (NIH)
  • EPIGEN-Brasil

  Undergraduate Students

Carolina Silva de Carvalho CV


  • Human genomic diversity in Latin America: biomedical implications and evolutionary inferences

Lucas Azevedo Birro Michelin CV




Former LDGH Members

Dr. Andrea Rita Marrero (now at UFSC) CV

Dr. Fernanda Kehdy (former EPIGEN postdoc, now Investigator at Instituto Oswaldo Cruz, Rio de Janeiro, current collaborator) CV

Dr. Fernanda Rodrigues Soares (former Graduate Student, now at UNINASSAU, current collaborator) CV

Dr. Gilderlanio Santana de Araújo (now postdoc at UFRN) CV

Dr. Luciana Zuccherato (Now at Baylor College of Medicine, Houston, Texas). CV

Dr. Maíra Ribeiro Rodrigues (former Graduate Student, now postdoc at UNICAMP) CV

Dr. Maria Clara Fernandes da Silva (Now Investigator at Hemominas, Belo Horizonte) CV

Dr. Marilia de Oliveira Scliar (now postdoc at Human Genome and Stem Cells Study Center - USP) CV

Dr. Mateus Gouveia (now Investigator at Instituto Oswaldo Cruz) CV

Dr. Moara Machado (former Graduate student, now investigator at National Cancer Institute - NIH, current collaborator) CV

Dr. Roxana Zamudio Zea (former Graduate Student, now postdoc at University of Leicester, current collaborator) CV

Dr. Wagner C. Santos Magalhães (former Graduate Student, now investigator at  Instituto de Ensino e Pesquisa da Fundação Mario Penna). CV




