The ocean is the world’s largest habitat, yet much of its biodiversity is still unknown. A study published in Frontiers in Science marks a significant breakthrough, reporting the largest and most comprehensive database of marine microbes to date – matched with biological function, location, and habitat type.
“The KMAP Global Ocean Gene Catalog 1.0 is a leap toward understanding the ocean’s full diversity, containing more than 317 million gene groups from marine organisms around the world,” said lead author Elisa Laiolo of the King Abdullah University of Science and Technology (KAUST) in Saudi Arabia. “The catalog focuses on marine microbes, which greatly impact human lives through their influence on the ocean’s health and the Earth’s climate.”
“The catalog is freely available through the KAUST Metagenomic Analysis Platform (KMAP),” added the study’s senior author, Prof Carlos Duarte, a faculty member at KAUST. “Scientists can access the catalog remotely to investigate how different ocean ecosystems work, track the impact of pollution and global warming, and search for biotechnology applications such as new antibiotics or new ways to break down plastics – the possibilities are endless!”
A feat of technological innovation and scientific collaboration
Researchers have been mapping marine biodiversity for hundreds of years, but have faced various challenges in creating a full atlas of ocean life. One is that most marine organisms cannot be studied in a laboratory. The advent of DNA sequencing technologies overcame this by allowing organisms to be identified directly from ocean water and sediments.
“Since each species has its own set of genes, we can identify which organisms are in an ocean sample by analyzing its genetic material,” Laiolo explained. “Two technological advances have made this possible at scale.
“The first is the enormous increase in speed, and decrease in cost, of DNA sequencing technologies. This has allowed researchers to sequence all the genetic material in thousands of ocean samples.
“The second is the development of massive computational power and AI technologies, which make it possible to analyze these millions of sequences.”
The team used KMAP to scan DNA sequences from 2,102 ocean samples taken at different depths and locations around the world. This advanced computing infrastructure identified 317.5 million gene groups, of which more than half could be classified according to organism type and gene function. By matching this information with the sample location and habitat type, the resulting catalog provides unprecedented information on which microbes live where and what they do.
“This achievement reflects the critical importance of open science,” said Duarte. “Building the catalog was only possible thanks to ambitious global sailing expeditions where the samples were collected and the sharing of the samples’ DNA in the open-access European Nucleotide Archive. We are continuing these collaborative efforts by making the catalog freely available.”
A wealth of scientific and industrial applications
The catalog has already revealed a difference in microbial activity between the water column and the ocean floor, as well as a surprising number of fungi living in the ‘twilight’ mesopelagic zone. These and other insights will help scientists understand how microbes living in different habitats shape ecosystems, contribute to ocean health, and influence the climate.
The catalog also serves as a baseline for tracking the effects of human pressures such as pollution and global warming on marine life. And it offers a wealth of genetic material that researchers can scan for novel genes that could be used for drug development, energy, and agriculture.
Toward a global ocean genome
The KMAP Global Ocean Gene Catalog 1.0 is a first step towards developing an atlas of the global ocean genome, which will document every gene from every marine species worldwide – from bacteria and fungi to plants and animals.
“Our analysis highlights the need to continue sampling the oceans, focusing on areas that are under-studied, such as the deep sea and the ocean floor. Also, since the ocean is forever changing – both due to human activity and to natural processes – the catalog will need continual updating,” said Laiolo.
Duarte cautions that despite its clear benefit, the future of the catalog is uncertain. A major obstacle is the status of international legislation on benefit-sharing from discoveries made in international waters.
“While the 2023 Treaty of the High Seas offers some solutions, it may inadvertently impede research by reducing incentives for companies and governments to invest. Such uncertainty must be resolved now we have reached the point where genetic and artificial intelligence technologies could unlock unprecedented innovation and progress in blue biotechnology,” he concluded.
The article is part of a Frontiers in Science multimedia article hub featuring an explainer as well as an editorial, viewpoint, and policy outlook from other eminent experts: Prof Enric Sala (National Geographic Society, USA), Prof Andreas Teske (University of North Carolina at Chapel Hill, USA), and Peggy Rodgers Kalas (International Ocean Policy Advisor to the Oceano Azul Foundation, and former Director of the High Seas Alliance).
Journal: Frontiers in Science
DOI: 10.3389/fsci.2023.1038696
Method of Research: Data/statistical analysis
Subject of Research: Animals
Article Title: Metagenomic probing toward an atlas of the taxonomic and metabolic foundations of the global ocean genome
Article Publication Date: 16-Jan-2024
COI Statement: The authors declare that the research was conducted in the absence of financial relationships that could be construed as a potential conflict of interest. The handling editor BB declared a shared consortium IMG/M Data Consortium with the author SA at the time of review. The authors IA, SA, TG, CD declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.