QGG Big Data Software

At QGG, we analyse large amounts of phenotypic and genomic data to help farmers breed healthier and more efficient farm animals and plants.

Such large-scale analyses provide information and statistics on genetic markers throughout the entire genome. This information can be used to improve genomic selection and thereby find the best breeding candidates across entire populations. Genomic selection can, for instance, increase the rate of genetic gain towards a lower risk of udder infection in cows, lower mortality in calves, or more stress-resistant grass varieties with longer roots that survive better in dry climates.

At the very core of QGG’s research is our most powerful and important research tool, an HPC (High Performance Computing) Linux cluster with approximately 500 cores, 100 terabytes of data and a fast network. The cluster is upgraded on a regular basis.

This powerful cluster enables us to develop and improve statistical models for genomic prediction, genomic selection and breeding programmes. So far, we have worked with animal species such as cattle, pigs, mink, horses, poultry and trout, and with plant species such as grasses, cereals, beans, clover and potatoes. We are increasingly focusing on research in human genetics.

Computation on such complex datasets demands special software, and at QGG we now have five custom-made software packages that enable us to extract different perspectives and values from a large variety of data.

ADAM

ADAM is a computer program that models selective breeding schemes for animals using stochastic simulation. The program simulates a population of animals and traces the genetic changes in the population under different selective breeding scenarios. It caters for different population structures, genetic models, selection strategies, and mating designs. ADAM can be used to evaluate breeding schemes and generate genetic data to test statistical tools.
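To give a feel for what stochastic simulation of a breeding scheme involves, here is a minimal sketch in Python. It is not ADAM's code or interface: the function name, the parameters and the simple model (truncation selection on phenotype, random mating among the selected animals, no sexes, no pedigree) are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_selection(generations=10, pop_size=200, n_selected=40,
                       var_a=1.0, var_e=1.0, seed=1):
    """Toy stochastic simulation of truncation selection on phenotype.

    Returns the mean true breeding value of each generation.
    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    sd_a, sd_e = var_a ** 0.5, var_e ** 0.5
    # Mendelian sampling term: half of the additive variance (inbreeding ignored).
    sd_mendel = (0.5 * var_a) ** 0.5

    # Founder generation: breeding values drawn from N(0, var_a).
    breeding_values = [rng.gauss(0.0, sd_a) for _ in range(pop_size)]
    means = [statistics.fmean(breeding_values)]

    for _ in range(generations):
        # Phenotype = breeding value + environmental noise.
        phenotyped = [(bv + rng.gauss(0.0, sd_e), bv) for bv in breeding_values]
        # Truncation selection: keep the animals with the highest phenotypes.
        phenotyped.sort(reverse=True)
        parents = [bv for _, bv in phenotyped[:n_selected]]

        # Random mating among selected parents (sexes are not modelled,
        # so the same animal may be drawn as both parents).
        breeding_values = [
            0.5 * (rng.choice(parents) + rng.choice(parents))
            + rng.gauss(0.0, sd_mendel)
            for _ in range(pop_size)
        ]
        means.append(statistics.fmean(breeding_values))
    return means
```

Calling `simulate_selection()` returns one mean breeding value per generation, so the genetic trend under a given scenario can be compared with the trend under another, for example by changing `n_selected`.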

ADAM is developed by: A.C. Sørensen, M. Henryon, S. Ansari-Mahyari, L.D. Pedersen and P. Berg

EVA

EVA is a tool for breeding management at the population level.

EVA performs two tasks:

1. Describes the history of populations in terms of

  • Individual inbreeding coefficients and completeness of the pedigree
  • Average inbreeding, co-ancestry, pedigree completeness and generation equivalents per cohort
  • Genetic contributions of:
    • All founders
    • Most contributing ancestors
    • Any user-specified individuals to any individual or cohort

2. Optimizes genetic contributions

  • Optimizes a linear function of genetic merit and average additive genetic relationships
  • Conditional on optimal contributions, individuals may be mated randomly or while minimizing inbreeding in the offspring
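The individual inbreeding coefficients listed under task 1 can be computed from a pedigree alone. The sketch below uses the classical recursive kinship rule (an animal's inbreeding coefficient equals the kinship between its parents). It is an illustration only, not EVA's implementation, and the dictionary layout of the pedigree is an assumption.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """Return {animal: F} for a pedigree given as {animal: (sire, dam)}.

    Unknown parents are None, every known parent must have its own entry,
    and parents must be listed before their offspring. This layout is an
    assumption made for the sketch.
    """
    order = {animal: i for i, animal in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        # An unknown parent is treated as unrelated to everyone.
        if a is None or b is None:
            return 0.0
        if a == b:
            sire, dam = pedigree[a]
            return 0.5 * (1.0 + kinship(sire, dam))
        # Recurse through the parents of the younger animal.
        if order[a] < order[b]:
            a, b = b, a
        sire, dam = pedigree[a]
        return 0.5 * (kinship(sire, b) + kinship(dam, b))

    # An animal's inbreeding coefficient is the kinship between its parents.
    return {animal: kinship(sire, dam)
            for animal, (sire, dam) in pedigree.items()}
```

For a mating between two full sibs with unrelated parents, the offspring receives an inbreeding coefficient of 0.25, which is the textbook value. The recursion is only practical for small pedigrees.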

By optimal balancing of inbreeding and genetic gain, EVA provides means for sustainable long-term breeding decisions regardless of population size or structure. EVA has been successfully applied and tested in both commercial and endangered populations.
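The balance between genetic gain and inbreeding can be written as a single objective over the contributions c of the candidates: a weight times the average merit c'g plus a (negative) weight times the average relationship c'Ac. The sketch below only evaluates such an objective for a given set of contributions. The function name and the weights are assumptions for illustration, and no optimization algorithm is shown.

```python
def contribution_objective(contributions, merit, relationships,
                           w_merit=1.0, w_relationship=-1.0):
    """Value of w_merit * c'g + w_relationship * c'Ac for contributions c.

    contributions: genetic contribution of each candidate (summing to 1)
    merit: estimated genetic merit of each candidate
    relationships: additive relationship matrix A as a list of rows
    The default weights are arbitrary illustrative values.
    """
    n = len(contributions)
    genetic_merit = sum(c * g for c, g in zip(contributions, merit))
    avg_relationship = sum(
        contributions[i] * relationships[i][j] * contributions[j]
        for i in range(n) for j in range(n)
    )
    return w_merit * genetic_merit + w_relationship * avg_relationship
```

With three candidates of merit 1.0, 0.8 and 0.5, where the first two are related (relationship 0.5) and the third is unrelated, using the two best candidates equally gives the higher average merit, but once relationships are penalized, pairing the best candidate with the unrelated one scores higher.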

To use EVA, you will need accurate and consistent pedigree information. In most cases an ordinary desktop computer has enough capacity to run EVA, with the exception of very large populations.

EVA is freely available for Windows, Linux and Mac.

EVA is developed by: Peer Berg, Morten Kargo

DMU

DMU is a package of statistical software developed for applications in quantitative genetics and genomics. The package implements powerful likelihood-based tools to estimate variance components and fixed effects (BLUE) and to predict random effects (BLUP), as well as tools for Bayesian inference about dispersion and location parameters. The development of DMU has been driven by the needs of research projects in applied quantitative animal genetics and genomics over a period of ~30 years. A general overview of the modules in DMU and their functions is given in Madsen et al. (2010), Madsen & Jensen (2013) and Madsen et al. (2014).
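The difference between BLUE and BLUP is easiest to see in the simplest mixed model, y_ij = mu + u_i + e_ij, with unrelated random group effects u_i and a known variance ratio. The sketch below is a worked illustration of that textbook case only. It is not DMU code, and the function name and input layout are assumptions.

```python
def one_way_blue_blup(groups, variance_ratio):
    """BLUE of the mean and BLUP of group effects for y_ij = mu + u_i + e_ij.

    groups: {group_label: [observations]}
    variance_ratio: sigma_e^2 / sigma_u^2, assumed known here.
    """
    # Shrinkage factor n_i / (n_i + lambda) for each group.
    shrink = {g: len(ys) / (len(ys) + variance_ratio)
              for g, ys in groups.items()}
    group_mean = {g: sum(ys) / len(ys) for g, ys in groups.items()}

    # BLUE of mu: generalized least squares mean, weighting groups by shrinkage.
    mu_hat = (sum(shrink[g] * group_mean[g] for g in groups)
              / sum(shrink.values()))

    # BLUP of each u_i: group deviation regressed towards zero.
    u_hat = {g: shrink[g] * (group_mean[g] - mu_hat) for g in groups}
    return mu_hat, u_hat
```

For example, `one_way_blue_blup({"A": [10, 12], "B": [8]}, 2.0)` estimates the mean as 9.8 and predicts group effects of 0.6 and -0.6. Group B has a single record, so its deviation of -1.8 is shrunk more strongly than that of group A.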

DMU is developed and maintained by: P. Madsen, J. Jensen, G. Su, O.F. Christensen, R. Labouriau and V. Milkevych.

Bayz

Bayz implements Bayesian models for estimation of variance components, genomic prediction and gene mapping. One of its key features is support for hierarchical models, which makes it possible to fit models to complex data structures and to specify complex variance and covariance structures. Bayz fits large multi-trait models, models with repeated observations, random regression models and social effects models, and can combine these features with versions of popular Bayesian prediction and mapping models such as SNP-BLUP, (power) LASSO and Bayesian Variable Selection models. A simple R interface is available for fitting the most commonly used models. Bayz is free to use for research purposes, but it is distributed with license keys.
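As a point of reference for the prediction models named above, SNP-BLUP with a known variance ratio amounts to ridge regression of phenotypes on marker genotypes. The sketch below solves those equations with a plain Gauss-Seidel iteration. It is a deterministic illustration of the model only, not Bayz's Bayesian machinery or its R interface, and all names are assumptions.

```python
def snp_blup(genotypes, phenotypes, ridge, sweeps=200):
    """Solve (X'X + ridge * I) b = X'y by Gauss-Seidel with residual updating.

    genotypes: list of rows, one per individual, columns already centred
    phenotypes: list of centred phenotypes
    ridge: sigma_e^2 / sigma_snp^2, assumed known and positive
    """
    n_snps = len(genotypes[0])
    columns = [[row[j] for row in genotypes] for j in range(n_snps)]
    effects = [0.0] * n_snps
    residuals = list(phenotypes)  # y - Xb, starting from b = 0

    for _ in range(sweeps):
        for j, x in enumerate(columns):
            xx = sum(v * v for v in x)
            # Right-hand side with SNP j's current contribution added back.
            rhs = sum(v * e for v, e in zip(x, residuals)) + xx * effects[j]
            new_effect = rhs / (xx + ridge)
            # Keep the residuals consistent with the updated effect.
            delta = new_effect - effects[j]
            residuals = [e - v * delta for v, e in zip(x, residuals)]
            effects[j] = new_effect
    return effects
```

The returned marker effects can be multiplied by the centred genotypes of a new individual and summed to give a genomic prediction. The fixed number of sweeps is a simplification; a real solver would check convergence.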

Bayz is developed by: Luc Janss

You will find the website for Bayz here.

LDAK

LDAK is a software package for analysing data from genetic association studies. It provides tools for estimating SNP heritability (both from individual-level data and summary statistics), detecting causal variants and genes (both via classical and mixed model analyses), constructing prediction models (including MultiBLUP and MegaPRS), and evaluating heritability models (including estimating heritability enrichments). LDAK is written in C, and designed to be run on Linux or Mac computers. 

LDAK is developed by: Doug Speed

You will find the website for LDAK here.