Human Population Genetics

  • Diversity of Copy Number Variation (CNV) in Native American Populations
  • Patterns of American colonization
  • Genome-wide association studies of admixed populations


  • Database for human genetic variation data
  • Data and tool integration


Exploring the biomedical consequences of genetic differentiation of Native Americans.

The goal of our laboratory is to identify genes involved in immunity and cancer that bear variants common in Native Americans and rare elsewhere, and to understand their effect on susceptibility to complex diseases. We study the pattern of genetic diversity of these genes, infer their evolutionary histories and functionally characterize the variants for which we have strong evidence of influence on complex diseases.

The EPIGEN-Brazil project and the use of admixture to map susceptibility to complex traits.

The EPIGEN-BRAZIL initiative is supported by the Brazilian Ministry of Health. We are genotyping ~6,600 Brazilians from the three largest cohorts in the country: the 1982 Pelotas birth cohort (n=3,900), the Bambui (Minas Gerais) cohort study of aging (n=1,500, baseline year: 1997) and the Salvador children cohort (n=1,200, baseline year: 1997). This initiative involves four other Brazilian groups: the Fundação Oswaldo Cruz from Belo Horizonte (Bambui cohort coordinator: Dr. Maria Fernanda Lima-Costa), the University of Pelotas (Dr. Bernardo Lessa-Horta), the University of Bahia (Dr. Mauricio L Barreto) and the University of São Paulo (Dr. Jorge E Krieger). Because we expect higher European ancestry in Pelotas, higher African ancestry in Salvador, and intermediate levels of each in Bambui, there are specific methodological challenges in the association studies to be performed. This genome-wide dataset and its associated clinical information will be one of the first and largest in non-European populations, requiring robust bioinformatics support. It will allow us to study gene-environment interactions and to perform admixture mapping of several biomedical outcomes such as rheumatoid arthritis, blood pressure, and anthropometric characteristics. 

We also want to identify genes responsible for the decimation of Native Americans by infectious disease by performing “natural selection-admixture mapping” (NSAM) [11]. When Europeans and Africans arrived in the New World after 1492, they brought pathogens to which the Native American immune system was naive. Smallpox, measles, typhus and influenza decimated the autochthonous populations across the Americas [1], and remain relevant pathogens today.


The production of biological data by high-throughput technologies has revolutionized biology. In genetics, classical and emerging scientific questions are being approached using SNP and CNV genotyping and Next Generation Sequencing platforms, as in this proposal. Today, investigators in biology fall into two distinct groups: a few large research groups that produce high-throughput data, and thousands of small- and medium-sized groups, such as our laboratory, that produce smaller amounts of data but also integrate them with the high-throughput data to resolve relevant scientific questions. While large-scale genomics initiatives such as the HapMap and the 1000 Genomes projects rely on powerful computational and bioinformatics support for data production and analysis, there are very few bioinformatics platforms oriented to small/medium-scale groups that store, handle, integrate, and analyze data from different sources and help combine different kinds of analyses. As a consequence, these tasks are frequently performed sub-optimally by manually handling data files, an error-prone practice that is seldom coupled with adequate quality control procedures. To support investigators from small/medium-sized research groups in human population genetics and genetic epidemiology, we developed a bioinformatics platform called DIVERGENOME. This platform is designed to manage and analyze large amounts of data, and includes two components: DIVERGENOMEdb and DIVERGENOMEtools. DIVERGENOMEdb is a relational database that can integrate data from different sources, including the investigators' own data.

More information about DIVERGENOME can be found at:

Magalhães, Wagner C. S.; Rodrigues, Maíra R.; Silva, Donnys; Soares-Souza, Giordano; Iannini, Márcia L.; Cerqueira, Gustavo C.; Faria-Campos, Alessandra C.; Tarazona-Santos, Eduardo. DIVERGENOME: A bioinformatics platform to assist population genetics and genetic epidemiology studies. Genetic Epidemiology, v. 36, 2012.

Or on the website pggenetica.icb.ufmg.br/divergenome


Bioinformatic Tools

We developed and maintain the following bioinformatics tools:

EPIGEN-SW is implemented as a web tool and facilitates access to computational resources through an integrative and interactive approach based on flowcharts, masterscripts and auxiliary scripts. (To appear)

DANCE is a graph-based web tool that allows users to integrate and visualize information on human complex phenotypes and their GWAS hits, as well as their risk allele frequencies in different populations. (Araújo et al., 2015)

CNVice (Inbreeding Coefficients Estimation for CNV data) is a freely available R script for population genetics applications. (To appear)

DIVERGENOME is a bioinformatics platform to assist population genetics and genetic epidemiology studies performed by small- to medium-sized research groups. The platform manages a relational database where information on genotypes, polymorphisms, laboratory protocols, individuals, populations, and phenotypes is organized in user projects. (Magalhães et al., 2012)

DIVERGENOMEtools is a set of tools to convert data into the formats required by popular software in population genetics and genetic epidemiology. It implements a method proposed by our laboratory: a graph-based approach that composes pipelines automatically by assembling a specialised set of tools on demand, depending on the functionality required, instead of specifying every sequence of tools in advance. (Rodrigues et al., 2012)
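To illustrate the idea of composing a conversion pipeline on demand, here is a minimal sketch in Python. It is not the DIVERGENOMEtools implementation: the converter table, the tool names and the function `compose_pipeline` are hypothetical, and a plain breadth-first search stands in for whatever graph algorithm the published method uses. Formats are nodes, converters are edges, and the shortest chain of converters between two formats is found when requested.

```python
from collections import deque

# Hypothetical converters: (input format, output format, tool name).
# These names are illustrative only and are not part of DIVERGENOMEtools.
CONVERTERS = [
    ("divergenome", "ped", "db_to_ped"),
    ("ped", "phase", "ped_to_phase"),
    ("ped", "structure", "ped_to_structure"),
    ("phase", "fasta", "phase_to_fasta"),
]


def compose_pipeline(source, target, converters=CONVERTERS):
    """Return the shortest list of tool names converting source to target, or None."""
    # Build an adjacency list: format -> [(next format, tool), ...].
    graph = {}
    for src, dst, tool in converters:
        graph.setdefault(src, []).append((dst, tool))

    # Breadth-first search guarantees the first path found is the shortest.
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        fmt, path = queue.popleft()
        if fmt == target:
            return path
        for nxt, tool in graph.get(fmt, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [tool]))
    return None  # no chain of converters reaches the target format


pipeline = compose_pipeline("divergenome", "fasta")
print(pipeline)
```

With this design, supporting a new format only requires adding one edge to the table; every pipeline that can pass through it becomes available without being listed in advance.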

This tool implements a pipeline that facilitates the data handling typical of re-sequencing studies. Functionalities: (1) consolidates different outputs produced by distinct Phred-Phrap-Consed contigs sharing a reference sequence; (2) checks for genotyping inconsistencies; (3) reformats genotyping data produced by PolyPhred into a matrix of genotypes with individuals as rows and segregating sites as columns; (4) prepares input files for haplotype inference using the popular software PHASE; and (5) handles PHASE output files that contain only polymorphic sites to reconstruct the inferred haplotypes including both polymorphic and monomorphic sites, as required by population genetics software for re-sequencing data such as DnaSP. (Machado et al., 2011)
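Step (5) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published pipeline: the function `rebuild_haplotype`, the toy reference sequence and the haplotype labels are all hypothetical, and the inferred haplotypes are assumed to be already parsed into strings holding one allele per polymorphic site, with 1-based site positions relative to the reference.

```python
def rebuild_haplotype(reference, positions, alleles):
    """Place inferred alleles at their polymorphic positions in a copy of the reference."""
    if len(positions) != len(alleles):
        raise ValueError("one allele is needed per polymorphic position")
    seq = list(reference)
    for pos, allele in zip(positions, alleles):
        seq[pos - 1] = allele  # positions are 1-based
    return "".join(seq)


# Toy data (hypothetical): a 10-bp reference with segregating sites at positions 2 and 7.
reference = "ACGTACGTAC"
positions = [2, 7]
haplotypes = {"ind1_a": "CG", "ind1_b": "TA"}

# Full-length haplotypes, monomorphic sites included.
full = {name: rebuild_haplotype(reference, positions, h) for name, h in haplotypes.items()}
for name, seq in full.items():
    print(name, seq)
```

The monomorphic sites are simply copied from the reference, so the output sequences all have the same length and can be written out as an alignment for downstream software.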

Publications of LDGH Members


Prof. Eduardo Tarazona Santos

I graduated in Biological Sciences from the University of Bologna (Italy), and hold a Master's degree in Biochemistry from the Federal University of Minas Gerais (UFMG) and PhDs in Biochemistry (UFMG) and in Biological Anthropology (University of Bologna). I was a postdoc at the University of Maryland (supervisor: Sarah Tishkoff) and then at the National Cancer Institute (supervisor: Stephen Chanock), in the US. I am currently Associate Professor at the Department of Biology of the UFMG, where I lead the Laboratory of Human Genetic Diversity, which conducts research on the following topics: (1) human genomic diversity in Latin America: evolutionary inferences and biomedical problems; (2) genetic epidemiology of complex diseases and pharmacogenetics in Latin America; and (3) development of bioinformatics tools for the study of genetic diversity. I was Head of the Graduate Program of Genetics at the UFMG (2010-2013). I am also Head of the Department of Biology at UFMG (2016-2018) and Member of the Biological Sciences Study Section of the Minas Gerais State Agency for Research (FAPEMIG, 2016-2018).

  Postdocs

Dr. Giordano Bruno Soares Souza


  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts
  • Imputation in Latin American populations

  PhD Students

Hanaisa de Pla e Santanna


  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts
  • Exploring the biomedical consequences of genetic differentiation of Native Americans.

Isabela Oliveira dos Anjos Alvim


  • Human genomic diversity in Latin America: biomedical implications and evolutionary inferences
  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts

Marla Mendes de Aquino


  •  Human genomic diversity in Latin America: biomedical implications and evolutionary inferences



Nathalia Matta Araujo


  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts
  • Imputation in Latin American populations



Rennan Garcias Moreira


  •  EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts

Thiago Peixoto Leal


  • Population genetics analysis of Drug Metabolism Enzymes (DME) genes
  • EPIGEN/BRASIL - Genomic epidemiology of complex diseases in Brazilian population-based cohorts



Victor Octavio Borda Pua


  • Genetic history and natural selection in South America

  Master's Students

Camila Zolini de Sá


  • Gastric Cancer (NIH)
  • EPIGEN-Brasil

  Undergraduate Students

Carolina Silva de Carvalho


  • Human genomic diversity in Latin America: biomedical implications and evolutionary inferences

Lucas Azevedo Birro Michelin




Former LDGH Members

Dr. Andrea Rita Marrero (now at UFSC)

Dr. Fernanda Kehdy (former EPIGEN postdoc, now Investigator at Instituto Oswaldo Cruz, Rio de Janeiro, current collaborator)

Dr. Fernanda Rodrigues Soares (former Graduate Student, now at UNINASSAU, current collaborator)

Dr. Gilderlanio Santana de Araújo (now postdoc at UFRN)

Dr. Luciana Zuccherato (now at Baylor College of Medicine, Houston, Texas)

Dr. Maíra Ribeiro Rodrigues (former Graduate Student, now postdoc at UNICAMP)

Dr. Maria Clara Fernandes da Silva (now Investigator at Hemominas, Belo Horizonte)

Dr. Marilia de Oliveira Scliar (now postdoc at Human Genome and Stem Cells Study Center - USP)

Dr. Mateus Gouveia (now Investigator at Instituto Oswaldo Cruz)

Dr. Moara Machado (former Graduate Student, now Investigator at National Cancer Institute - NIH, current collaborator)

Dr. Roxana Zamudio Zea (former Graduate Student, now postdoc at University of Leicester, current collaborator)

Dr. Wagner C. Santos Magalhães (former Graduate Student, now Investigator at Instituto de Ensino e Pesquisa da Fundação Mario Penna)





