An international collaboration spanning three decades has changed the way Australian grain breeders evaluate potential varieties and has had a major impact on international statistical science.
A biometrics pioneer at Rothamsted Research (RRes) in the United Kingdom, Professor Robin Thompson, was en route to a conference in New Zealand in 1992 when he was asked by two Australian researchers, Professor Brian Cullis and Dr Arthur Gilmour, to visit and discuss a problem they had.
They were working on a GRDC project to provide New South Wales grain growers with accurate predictions of variety performance using multi-environment trial data.
The then head of the Crop Variety Testing unit, Dick Gammie, and Wagga Wagga wheat breeder John Fisher had provided a unique dataset spanning many years and locations. But the software they had could not handle the size of the dataset or the complexity of the model.
This software, GENSTAT, used a technique called Residual Maximum Likelihood (REML), which had been devised by Professor Thompson and Professor Desmond Patterson in 1971.
REML is a statistical method to estimate sources of variation in unbalanced data, such as when varieties are not tested at all locations.
In this case, the task was to predict variety performance based on estimates of the variety-by-year, variety-by-location and variety-by-year-by-location variance components.
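To make the idea of REML on unbalanced data concrete, here is a minimal Python sketch. It is not the GENSTAT or ASReml implementation, and it uses a far simpler model than the one described above: a single random variety effect plus residual error. The yields, the function names and the simple compass search used to find the estimates are all assumptions made for illustration.

```python
import math

# Hypothetical yields (t/ha) per variety. Group sizes differ because not every
# variety was grown in every trial, which is what makes the data unbalanced.
yields = {
    "A": [4.1, 4.5, 3.9, 4.3],
    "B": [3.2, 3.6],
    "C": [5.0, 4.7, 5.2],
    "D": [3.9],
    "E": [4.4, 4.0, 4.6, 4.2, 4.5],
}


def neg2_reml_loglik(var_g, var_e, groups):
    """-2 x REML log-likelihood (constants dropped) for y_ij = mu + g_i + e_ij."""
    log_det_v = 0.0          # log determinant of the covariance matrix V
    within_ss = 0.0          # pooled within-variety sum of squares
    weighted = []            # (weight, mean) for each variety
    for obs in groups:
        n = len(obs)
        mean = sum(obs) / n
        within_ss += sum((y - mean) ** 2 for y in obs)
        d = var_e + n * var_g            # n times the variance of a variety mean
        log_det_v += (n - 1) * math.log(var_e) + math.log(d)
        weighted.append((n / d, mean))
    sum_w = sum(w for w, _ in weighted)                  # X'V^-1 X with X a column of ones
    mu = sum(w * m for w, m in weighted) / sum_w         # generalised least squares mean
    quad = within_ss / var_e + sum(w * (m - mu) ** 2 for w, m in weighted)
    return log_det_v + math.log(sum_w) + quad


def fit_reml(groups, start=(1.0, 1.0), tol=1e-6):
    """Minimise -2 x REML log-likelihood with a simple compass search (illustrative only)."""
    var_g, var_e = start
    best = neg2_reml_loglik(var_g, var_e, groups)
    step = 0.5
    while step > tol:
        improved = False
        for dg, de in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand_g, cand_e = var_g + dg, var_e + de
            if cand_g < 0 or cand_e <= 0:
                continue                 # keep the variances inside the parameter space
            value = neg2_reml_loglik(cand_g, cand_e, groups)
            if value < best:
                var_g, var_e, best = cand_g, cand_e, value
                improved = True
        if not improved:
            step /= 2                    # no neighbour is better, so refine the search
    return var_g, var_e


var_g, var_e = fit_reml(list(yields.values()))
print(f"variety variance: {var_g:.4f}, residual variance: {var_e:.4f}")
```

With balanced data and a non-negative solution, these REML estimates coincide with the classical analysis-of-variance estimates. The method earns its keep when the data are unbalanced, as in the hypothetical example here.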
A chance encounter with the UK's Professor Robin Thompson has given us the power to analyse complicated multi-environment crop variety trials.
GENSTAT's implementation of REML used an algorithm that was not suited to large, complex scenarios.
It was based on software developed by one of Professor Thompson's students in the early 1970s.
By chance, Professor Thompson told them he had devised another algorithm to obtain REML estimates of variance parameters, but had so far done nothing with it.
Professor Thompson referred to it as the average information (AI) algorithm.
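For readers curious about what an AI iteration looks like, the following Python sketch applies the generally published form of the average information update to a toy model with one random variety effect. It is an illustration under stated assumptions, not ASReml's code. It builds dense matrices that production software would be expected to avoid. The data, the function name `ai_reml` and the step-halving safeguard are hypothetical additions.

```python
import numpy as np


def ai_reml(y, X, Z, start=(1.0, 1.0), max_iter=50, tol=1e-8):
    """Average information REML for y = X b + Z u + e,
    assuming var(u) = var_g * I and var(e) = var_e * I."""
    n = len(y)
    dV = [Z @ Z.T, np.eye(n)]            # derivatives of V w.r.t. var_g and var_e
    theta = np.array(start, dtype=float)
    for _ in range(max_iter):
        V = theta[0] * dV[0] + theta[1] * dV[1]
        Vinv = np.linalg.inv(V)
        VinvX = Vinv @ X
        # P = V^-1 - V^-1 X (X' V^-1 X)^-1 X' V^-1
        P = Vinv - VinvX @ np.linalg.solve(X.T @ VinvX, VinvX.T)
        Py = P @ y
        # Score: dl/d(theta_i) = -1/2 [ tr(P V_i) - y' P V_i P y ]
        score = np.array([-0.5 * (np.trace(P @ Vi) - Py @ Vi @ Py) for Vi in dV])
        # Average information: AI_ij = 1/2 y' P V_i P V_j P y
        W = np.column_stack([Vi @ Py for Vi in dV])
        AI = 0.5 * W.T @ P @ W
        step = np.linalg.solve(AI, score)
        # Crude safeguard (an assumption of this sketch): halve the step
        # until both variances stay positive.
        while np.any(theta + step <= 0):
            step = step * 0.5
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta


# Hypothetical unbalanced yields (t/ha) for five varieties.
groups = [[4.1, 4.5, 3.9, 4.3], [3.2, 3.6], [5.0, 4.7, 5.2], [3.9],
          [4.4, 4.0, 4.6, 4.2, 4.5]]
y = np.array([v for obs in groups for v in obs])
X = np.ones((len(y), 1))                 # overall mean as the only fixed effect
Z = np.zeros((len(y), len(groups)))      # indicator matrix linking plots to varieties
row = 0
for col, obs in enumerate(groups):
    Z[row:row + len(obs), col] = 1.0
    row += len(obs)

start = (0.5 * y.var(), 0.5 * y.var())   # rough data-driven starting values
var_g, var_e = ai_reml(y, X, Z, start=start)
print(f"variety variance: {var_g:.4f}, residual variance: {var_e:.4f}")
```

The appeal of the approach is that the AI matrix needs only a few matrix-vector products with quantities already required for the score, rather than the trace terms needed for the observed or expected information.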
All three agreed to pool resources and work with Dr Sue Welham, at RRes, to develop a software platform to obtain REML estimates of variance parameters using the AI algorithm.
The project, titled 'ASReml', got under way, and version one of the software was released by NSW Department of Primary Industries (NSWDPI) and RRes in 1996.
ASReml was commercialised in 2002 and VSN International was given the rights to distribute the software, later buying the IP from NSWDPI.
Today ASReml and its more recent adaptation, ASReml-R, are widely used in plant, animal and human genetics.
The power of the AI algorithm in ASReml is the ability to devise and fit models that involve so-called correlated random effects, such as the genetic relationship between varieties.
Such models are fundamental in quantitative genetics and in genomics. Genetic relationships can be incorporated in routine analyses using ASReml for very large datasets using either pedigree-based or genomic-based information.
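As an illustration of what pedigree-based relationship information can look like, the sketch below builds an additive relationship matrix with the standard tabular method. It assumes a simple outbred pedigree in which parents are listed before their offspring. Inbred plant varieties generally need adjustments that are not shown here, and nothing in this sketch is taken from ASReml itself. The pedigree and the function name are hypothetical.

```python
def relationship_matrix(pedigree):
    """Additive relationship matrix by the tabular method.

    pedigree: list of (parent1, parent2) index pairs, one per individual,
    ordered so that parents appear before offspring; None marks an unknown parent.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (p1, p2) in enumerate(pedigree):
        for j in range(i):
            # Relationship with an earlier individual is the average of that
            # individual's relationships with the two parents.
            a = 0.0
            if p1 is not None:
                a += 0.5 * A[j][p1]
            if p2 is not None:
                a += 0.5 * A[j][p2]
            A[i][j] = A[j][i] = a
        # Diagonal is 1 plus the inbreeding coefficient (half the parents' relationship).
        both_known = p1 is not None and p2 is not None
        A[i][i] = 1.0 + (0.5 * A[p1][p2] if both_known else 0.0)
    return A


# Hypothetical pedigree: two unrelated founders (0, 1), two full sibs (2, 3)
# and one offspring of the full sibs (4).
pedigree = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
A = relationship_matrix(pedigree)
for row in A:
    print(["%.3f" % value for value in row])
```

In a mixed model, a matrix like this takes the place of the identity matrix in the variance of the random genetic effects, so that information on one line contributes to predictions for its relatives.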
Building the future
This collaboration, however, has not only been about implementing and developing statistical software, but also about developing and nurturing the next generation of biometricians to solve real problems arising in agriculture and plant and animal breeding.
The legacy of this collaboration continues to enable young statisticians to make advances, and the University of Wollongong recently created the CMajor-Drinkwater Training Award, a scholarship to assist young statisticians to undertake research into applying mixed model theory (for example, through ASReml) to animal or plant genetics.
The award was made possible through a significant philanthropic donation by Professor Thompson in support of the Centre for Bioinformatics and Biostatistics.
The inaugural recipient of the CMajor-Drinkwater Training Award was Daniel Tolhurst, who used the scholarship to fund, in part, research into single-step genomic selection for multi-environment trial datasets at the Roslin Institute at the University of Edinburgh.
Mr Tolhurst's subsequent PhD research will focus on efficient sampling schemes for REML estimation in large genomic datasets.
GRDC Research Code: UW00010
More information: Professor Brian Cullis, University of Wollongong, 02 4221 5641, email@example.com