Data partnership builds expertise to harness data treasure trove

Agriculture Victoria researcher Dr Glenn Fitzgerald says it is important that metadata is centralised and of a high standard.
Photo: Felicity Pritchard

A goldmine of data generated from thousands of RD&E projects supported by GRDC – on a range of topics including yield, soils, genetics and climate – is the target of GRDC’s Data Partnerships Initiative.

The initiative brings together 12 Australian research organisations which are currently working on projects in which GRDC has invested. Historically, data generated from across GRDC’s investment portfolio – hundreds of projects managed across decades – has been maintained by individual organisations and managed in diverse ways. This can limit how easy that data is to find, access and use.

The partners of the Data Partnerships Initiative want to change that. They are building personal expertise and organisational capacity in managing RD&E data, and collaborating to make the data available under the internationally recognised FAIR principles, which state that data should be findable, accessible, interoperable and reusable.

“The work GRDC and our partners are doing to improve RD&E data management will empower researchers and industry stakeholders to access and reuse these data to extract greater value, derive new insights and accelerate research outcomes,” says GRDC’s research data manager, Dr Washington Gapare, who leads the Data Partnerships Initiative.

The 18-month, $2.8 million Data Partnerships Initiative consists of two streams of activity. The first, a ‘data discovery’ stream, has focused on locating valuable data from GRDC co-investments over the past 10 years. Metadata describing that data will be included in the GRDC Data Catalogue, which is in development.

The second, an ‘organisational alignment’ stream, builds on GRDC’s existing relationships with its partners to establish a new community of practice that will collectively improve data management. This includes stakeholders learning best practice in data governance and introducing common standards for data collection and storage.

New insights and less duplication

Commonly, data generated through GRDC co-investments are stored by individual research partners or institutions using a range of systems, from the sophisticated to the rudimentary. This means that data are often siloed and not easily discoverable by others, and therefore are at risk of being lost to the collective research effort for the grains industry.

While the data itself will still be stored and managed by GRDC research partners, the planned GRDC Data Catalogue will bring searchable metadata together into a single portal, enabling information about projects to be easily discovered. This will in turn provide new insights that will support the direction of future research, with results flowing to growers.

“Both GRDC and its partners are making significant headway in identifying ways to upgrade RD&E data management – and by working together we’re learning faster,” Dr Gapare says.

“And while we are working with just 12 partners to start with, we will be sharing the knowledge generated from the Data Partnerships Initiative with all our research partners in the next phase of our work.”

Building a clearer picture

Agriculture Victoria crop agronomy research leader Associate Professor Glenn Fitzgerald is managing his department’s involvement in the initiative.

He says that improved data findability and accessibility, as well as improved internal management systems developed alongside the initiative, will help advance complex research projects around such issues as climate change. By sharing more data, a clearer picture of changes across the landscape can be built and more-accurate predictions made, Dr Fitzgerald says.

For example, data from his department’s world-first research looking at carbon dioxide impacts on grain production in a semi-arid environment over 11 years, as part of the GRDC AGFACE co-investment, will help to build a global picture of the issues if shared, he says.

“It’s the only dataset of its kind in the world and contains a lot of value,” he says. “Some of it has gone into international datasets, but through this initiative it could be accessed more broadly and cross-referenced with other projects, enabling scientists to improve crop models to help predict the impacts of climate change.”

To ensure data can be used and shared widely, Dr Fitzgerald says it is important that metadata is centralised and produced to certain consistent standards. This is where the Data Partnerships Initiative has been imperative.

How metadata helps locate relevant research

Metadata describes data and – when it is of high quality, consistent and precise – provides a powerful guide to useful information, says Associate Professor Helen Thompson, an expert in FAIR data management and director of the Centre for eResearch and Digital Innovation at Federation University – a foundation partner in the initiative.

Dr Thompson’s team has provided practical support to GRDC and its partners, guiding the process around developing effective metadata fields for new projects in line with international standards. These include a range of project descriptors, including locations and crop types, and will ensure the data is discoverable to agricultural scientists searching for information that is relevant to them.

From January 2023, GRDC partners must complete a Data Management Plan when working on GRDC investments, which will ensure they capture data according to GRDC Data Management Guidelines. Dr Thompson says it is important that GRDC has provided a range of resources to research partners as part of the initiative as they implement new requirements around data management.

This has included building capacity by seconding and supporting data management experts in institutions to manage storage and data collection in-house as well as facilitating data management workshops for researchers.

“It’s fantastic that GRDC has seen the opportunity to help its research partners improve their practice in this area, because it is going to be important for the whole grains research sector and is also a model for other RDCs to think about adopting,” she says.

Practical advice for managing data

Alexis Tindall, the manager of digital stewardship at the University of Adelaide Library, who supports data management planning for researchers across multiple disciplines, has collaborated on an informative workshop for University of Adelaide and South Australian Research and Development Institute (SARDI) researchers. This workshop focused on how to look after research data well, and comply with and use the new GRDC Data Management Guidelines, including how to use the metadata fields to ensure the information is discoverable.

Among her key messages for participants seeking to meet future data-reporting requirements for GRDC investments is to be well prepared: consider the metadata requirements in the planning phase of a project, ensure collaborators collect data in a consistent format, and actively manage data as the project progresses.

This will not only help to ensure data is kept safe and secure, but also make the recording process less of a burden, she says.

“Data management planning helps you set realistic milestones along the way to ensure that you are documenting what you need to complete the dataset,” she says. “And planning these things at the start absolutely makes it less onerous in the long run.”

Dr Fatima Naim, a molecular biologist and Data Partnerships Initiative project leader at Curtin University, says the GRDC Metadata Collection Form has been refined over the course of the initiative in consultation with researchers and data experts to ensure it is intuitive and user-friendly.

The payoff for this administrative effort, she says, is the opportunity to “shine a bigger light on our research”.

Information in a few clicks

A novice in data management, Dr Naim says that through the initiative she has developed a new appreciation of the ability of metadata to bring information on particular subjects together through simple keyword searches.

As a project leader in plant physiology at the university’s Centre for Crop and Disease Management (CCDM), she says the value of good RD&E data management is clear.

“If you’re interested in finding the type of data CCDM has generated in the crop disease space, for example, then you can easily find different and relevant datasets all in the one place instead of spending days searching,” she says. “It’s all there, just a few clicks away.”

Dr Naim says if research is more discoverable by others it will provide collaboration opportunities while helping to grow the collective knowledge base.

“The biggest innovation is that the platform will enable new connections and foster new networks,” she says.

Answering the big questions

Dr Fitzgerald also sees how the work GRDC is doing to improve RD&E data management could facilitate the scaling-up of research by, for example, showing a range of projects that have been undertaken on similar subjects and with similar objectives.

Ms Tindall, who manages the University of Adelaide data repository, says interest in data management has grown over the past decade as datasets become bigger and people realise the significance and potential of what they are generating.

By establishing the GRDC Data Partnerships Initiative, GRDC is acknowledging that data from its co-investments is a valuable asset, worthy of being kept safe and accessible as a resource for future use beyond the lives of particular projects and institutions, she says.

“We’re increasingly recognising that data is an asset and something that we need to look after as much as we look after our publications and our other research outputs that we create,” she says.

Organising data in this way will also ensure that, into the future, they will be suitable for analysis by machine learning or AI systems capable of drawing insights from larger, combined datasets from multiple sources, Dr Fitzgerald says.

“In 20 years, science will be done differently and unless you’re across the data and have it structured properly, you won’t be prepared to answer the bigger questions,” he says.

Partners in the GRDC Data Partnerships Initiative include: Agriculture Victoria, Curtin University, WA Department of Primary Industries and Regional Development, Federation University, Murdoch University, NSW Department of Primary Industries, Queensland Department of Agriculture and Fisheries, South Australian Research and Development Institute, University of Adelaide, University of Queensland, University of Sydney and University of Western Australia.
