QGG Big Data Software

At QGG, we analyse large amounts of phenotypic and genomic data to help farmers breed healthier and more efficient farm animals and plants.

Such large-scale analyses provide information and statistics on genetic markers throughout the entire genome. This information can be used to improve genomic selection and thereby find the best breeding candidates across entire populations. Genomic selection can, for instance, increase the rate of genetic gain for traits such as a lower risk of udder infection in cows, lower mortality in calves, or more stress-resistant grass varieties with longer roots that survive better in dry climates.

At the very core of QGG’s research is our most powerful and important research tool: an HPC (High Performance Computing) Linux cluster with a fast, secure network and 1 PB of data storage. We proudly serve 200 users with 2048 CPU cores and 24.3 TiB of RAM, supported by a whole suite of scientific software. The cluster is upgraded on a regular basis, and the system also includes two GPU nodes equipped with NVIDIA L40S GPUs.

This powerful cluster enables us to develop and improve statistical models for genomic predictions, genomic selection and breeding programmes. So far, the animal species we work with are cattle, pigs, mink, horses, poultry and trout, and the plant species include grasses, cereals, beans, clover and potatoes. We are also increasingly focusing on research in human genetics.

Analysing such complex datasets demands special software, and at QGG we now have five custom-made software packages that enable us to extract different perspectives and values from a large variety of data.

ADAM

ADAM is a stochastic simulation program for modelling selective breeding programmes in animals and plants. It simulates populations over time and traces genetic change under alternative breeding strategies and operational workflows. A key strength of ADAM is its capability to simulate complex traits (mimicking important biological aspects of plant and animal genetics) and to combine many interacting steps in real breeding pipelines.

Find an extensive description of ADAM here.

EVA

EVA is a tool for breeding management at the population level.

EVA performs two tasks:

1. Describes the history of populations in terms of:

- Individual inbreeding coefficients and completeness of the pedigree
- Average inbreeding, co-ancestry, pedigree completeness and generation equivalents per cohort
- Genetic contributions of:
  - All founders
  - Most contributing ancestors
  - Any user-specified individuals to any individual or cohort

2. Optimizes genetic contributions:

- Optimizes a linear function of genetic merit and average additive genetic relationships
- Conditional on the optimal contributions, individuals may be mated randomly or while minimizing inbreeding in the offspring
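
To make the pedigree statistics in task 1 concrete, here is a minimal sketch of how individual inbreeding coefficients can be derived from a pedigree with the classical tabular method. It is an illustration only and not EVA's implementation: the function name, the tuple-based pedigree layout and the example pedigree are all assumptions made for this sketch.

```python
def relationship_matrix(pedigree):
    """Additive relationship matrix A built with the tabular method.

    pedigree: list of (sire, dam) index pairs, one per individual, ordered so
    that parents come before their offspring. None marks an unknown parent.
    (This layout is an assumption of the sketch, not an EVA file format.)
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        # Diagonal: 1 + F_i, where F_i is half the relationship between parents.
        both_known = sire is not None and dam is not None
        A[i][i] = 1.0 + (0.5 * A[sire][dam] if both_known else 0.0)
        # Off-diagonal: average of the relationships with the two parents.
        for j in range(i):
            a = 0.0
            if sire is not None:
                a += 0.5 * A[j][sire]
            if dam is not None:
                a += 0.5 * A[j][dam]
            A[i][j] = A[j][i] = a
    return A


# Hypothetical pedigree: 0 and 1 are founders, 2 and 3 are full sibs,
# and 4 is the offspring of a mating between the full sibs 2 and 3.
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
A = relationship_matrix(pedigree)

# Inbreeding coefficient of each individual is the diagonal of A minus 1.
inbreeding = [A[i][i] - 1.0 for i in range(len(pedigree))]

# Co-ancestry (kinship) between two individuals is half their relationship.
coancestry_2_3 = 0.5 * A[2][3]
```

In this example the offspring of the full-sib mating has an inbreeding coefficient of 0.25, and the co-ancestry between the two full sibs is also 0.25. Averaging such values over the individuals of a cohort gives the per-cohort statistics listed above.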

By optimally balancing inbreeding and genetic gain, EVA provides a means for making sustainable long-term breeding decisions regardless of population size or structure. EVA has been successfully applied and tested in both commercial and endangered populations.
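
The trade-off between genetic gain and relationships can be illustrated with a small sketch that scores a vector of genetic contributions by a weighted sum of average genetic merit and average additive relationship. This is a hedged illustration of the general idea only: the function name, the weights `w_merit` and `w_rel`, and all numbers are hypothetical, and the code merely evaluates candidate solutions rather than reproducing EVA's optimization procedure.

```python
def selection_objective(c, merit, A, w_merit=1.0, w_rel=-1.0):
    """Score contributions c as w_merit * c'g + w_rel * c'Ac.

    c:     genetic contributions of the candidates (assumed to sum to 1)
    merit: genetic merit (g) of each candidate
    A:     additive relationship matrix among the candidates
    The weights are hypothetical; a negative w_rel penalizes relatedness.
    """
    n = len(c)
    avg_merit = sum(c[i] * merit[i] for i in range(n))
    avg_rel = sum(c[i] * A[i][j] * c[j] for i in range(n) for j in range(n))
    return w_merit * avg_merit + w_rel * avg_rel


# Made-up example: candidates 0 and 1 are full sibs, candidate 2 is unrelated.
A = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
merit = [1.2, 1.0, 0.8]

best_only = [1.0, 0.0, 0.0]  # use only the candidate with the highest merit
spread = [0.5, 0.0, 0.5]     # share contributions with an unrelated candidate

score_best_only = selection_objective(best_only, merit, A)
score_spread = selection_objective(spread, merit, A)
```

With these made-up numbers and weights, `score_spread` is higher than `score_best_only`: giving up some average merit is outweighed by the lower average relationship. Changing the weights shifts the balance between gain and inbreeding.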

To use EVA, you will need accurate and consistent pedigree information. In most cases, an ordinary desktop computer has enough capacity to run EVA, with the exception of very large populations.

EVA is freely available for Windows, Linux and Mac.

EVA is developed by: Peer Berg, Morten Kargo

DMU

DMU is a statistical software package developed for applications in quantitative genetics and genomics. The package implements likelihood-based tools for estimating variance components and systematic effects (BLUE) and for predicting random effects (BLUP), as well as tools for Bayesian inference on dispersion and location parameters. Development of DMU has been driven primarily by the needs of research projects in applied quantitative animal genetics and genomics over a period of approximately 40 years, and the package is still under development.
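
For orientation, BLUE and BLUP are commonly obtained together by solving Henderson's mixed model equations. The block below writes them for a single-trait model with one random genetic effect. The symbols are chosen for this illustration and are not taken from the DMU documentation, and the block says nothing about how DMU solves such systems internally.

```latex
% Model: y = Xb + Zu + e, with u ~ N(0, A \sigma_u^2) and e ~ N(0, I \sigma_e^2)
% b: systematic (fixed) effects, u: random genetic effects,
% A: additive relationship matrix, \lambda = \sigma_e^2 / \sigma_u^2
\begin{equation*}
\begin{bmatrix}
X'X & X'Z \\
Z'X & Z'Z + A^{-1}\lambda
\end{bmatrix}
\begin{bmatrix}
\hat{b} \\
\hat{u}
\end{bmatrix}
=
\begin{bmatrix}
X'y \\
Z'y
\end{bmatrix}
\end{equation*}
```

Here the solution for b is the BLUE of the systematic effects and the solution for u is the BLUP of the random effects, given the variance ratio.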

The latest publicly available version can be downloaded here: DMU home page.

A User’s guide to DMU can be found here: dmuv6_guide.5.6.pdf.

DMU was primarily developed by Per Madsen and Just Jensen.

Bayz

Bayz implements Bayesian models for estimation of variance components, genomic prediction and gene mapping. One of its key features is support for hierarchical models, which makes it possible to fit models on complex data structures and to specify complex variance and covariance structures. Bayz fits large multi-trait models, models with repeated observations, random regression models and social effects models, and can combine these features with versions of popular Bayesian prediction and mapping models such as SNP-BLUP, (power) LASSO and Bayesian Variable Selection models. A simple R interface is available for fitting the most commonly used models. Bayz is free to use for research purposes, but license keys are distributed.

Bayz is developed by: Luc Janss

LDAK

LDAK is a software package for analysing data from genetic association studies. It provides tools for estimating SNP heritability (both from individual-level data and summary statistics), detecting causal variants and genes (both via classical and mixed model analyses), constructing prediction models (including MultiBLUP and MegaPRS), and evaluating heritability models (including estimating heritability enrichments). LDAK is written in C, and designed to be run on Linux or Mac computers.

LDAK is developed by: Doug Speed

You will find the website for LDAK here.

Revised 13.05.2026

Jette Odgaard Villemoes