Thursday, November 20, 2025
BIOENGINEER.ORG

Designing Functional Genes with Genomic AI

Bioengineer by Bioengineer
November 20, 2025
in Health

In an advance that could reshape synthetic biology, researchers have used a genomic language model to design entirely new functional genes. The approach, called semantic design, draws on patterns learned from hundreds of billions of DNA bases sampled across prokaryotic life and allows biological functions to be engineered with a precision and creativity previously thought unattainable. It goes beyond traditional protein design methods by exploiting natural genomic contexts and the evolutionary information embedded within DNA sequences, pushing the boundaries of what synthetic genomics can achieve.

At its core, the work employs a genomic sequence model called Evo, trained on prokaryotic genomic data at an unprecedented scale. Whereas protein design typically focuses on narrow regions of sequence space or requires laborious structural predictions, Evo taps into the latent functional information encoded not just in isolated gene sequences but in the broader genomic neighborhoods they inhabit. This contextual understanding enables the model to generate de novo gene variants that encode the desired functions, with experimental success rates ranging from 17 to 50 percent after testing relatively few variants. Such rates surpass those of many existing protein engineering methodologies, highlighting the potency of conditioning on genomic context.
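The idea of conditioning on genomic context can be illustrated with a minimal sketch. The code below is not Evo and does not use its API; it assumes a toy order-k Markov model (the hypothetical `train_markov` and `generate` functions) as a stand-in for an autoregressive genomic language model, purely to show how a context prompt steers what is generated next.

```python
import random
from collections import defaultdict

def train_markov(sequences, k=3):
    """Count which base follows each k-mer (toy stand-in for a genomic language model)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts

def generate(counts, prompt, length, k=3, rng=None):
    """Autoregressively extend `prompt` by `length` bases and return only the new bases."""
    rng = rng or random.Random(0)
    out = prompt
    for _ in range(length):
        followers = counts.get(out[-k:])
        if not followers:
            # Unseen context: fall back to a uniformly random base.
            base = rng.choice("ACGT")
        else:
            bases, weights = zip(*followers.items())
            base = rng.choices(bases, weights=weights)[0]
        out += base
    return out[len(prompt):]

# Hypothetical usage: the prompt plays the role of the genomic neighborhood,
# and the continuation is the candidate "designed" sequence.
model = train_markov(["ACGTACGTACGT"], k=3)
candidate = generate(model, prompt="ACG", length=5, k=3)
```

In the real setting the prompt would be genomic sequence associated with the function of interest, and the continuation would be screened and tested experimentally; the toy model only mirrors the prompt-then-sample structure.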

Remarkably, many designed proteins from this approach display no significant sequence similarity to any known proteins, including those with related functions. This unprecedented novelty blurs the line between de novo protein design and evolution-guided diversification, presenting an ‘existence proof’ that these language models can generalize far beyond the natural sequence repertoires catalogued in biological databases. It opens new avenues to design proteins with unprecedented functional diversity, drawing upon evolutionary principles encoded in genome architecture yet generating sequences never before seen in nature.

What sets semantic design apart from prior techniques is its fundamentally different paradigm for creating functional biological molecules. It does not require task-specific fine-tuning, which risks overfitting to known examples, nor does it rely on natural language prompts derived from existing knowledge bases. Instead, semantic design draws on the rich reservoir of functional diversity hidden within genomic sequences and their ecological and evolutionary contexts. The method can therefore reach proteins and functions that science has not yet characterized, extending beyond current annotations and hypotheses.

A striking demonstration of the technique's versatility is the generation of novel antitoxins that imply a broader compatibility across diverse toxin–antitoxin systems than previously reported, as well as an anti-CRISPR protein linked to a protein family with a different presumed function. These findings exemplify how semantic design can reveal cross-functional relationships and hidden compatibilities that defy conventional wisdom in molecular biology. They also underscore the advantage of bypassing mechanistic or structural assumptions: filtering based on predicted structure quality would have discarded many of these successfully designed proteins.

Semantic design emerges not as a replacement but as a complementary strategy alongside classical protein engineering and directed evolution. Its ability to explore vast synthetic sequence space beyond the constraints of well-characterized natural genes presents an exciting toolkit for rational design and innovation. Particularly for functions like anti-CRISPR activity, where multiple structural and mechanistic paths exist, genomic conditioning can selectively guide design towards functional outputs less accessible to traditional approaches.

Crucially, although Evo 1.5 was the model employed for these results, the semantic design framework is agnostic to the particular model architecture or training dataset. Any language model sufficiently trained on prokaryotic or phage genomes can be integrated into this framework. As model capabilities improve and our understanding of gene synteny (the relative order and arrangement of genes in genomes) deepens, the power and scope of semantic design are expected to grow commensurately.

The traditional paradigm in biological sequence discovery relies heavily on the concept of “guilt by association,” where hypotheses about gene function are inferred from evolutionary conservation and similarity across species. This constraint limits exploration to the slowly accumulated diversity shaped over billions of years of life’s history. By contrast, semantic design enables a rapid and expansive sampling of synthetic sequences tailored to specific biological systems. To democratize access to this unprecedented resource, the team has released SynGenome, a publicly available database containing over 120 billion base pairs of AI-generated genomic sequences, providing a valuable platform for researchers worldwide to uncover novel synthetic biological parts.

Despite its transformative potential, semantic design faces inherent challenges. Autoregressive sequence generation methods sometimes produce repetitive or hallucinated sequences that appear plausible but lack true functionality. Moreover, genes generated through contextual conditioning may encode regulatory elements rather than the direct functional proteins initially targeted, necessitating rigorous in silico screening and empirical validation. The approach is currently most effective in prokaryotic systems, reflecting the genomic structures and functional architectures captured by training data, and extending semantic design to eukaryotic organisms will require novel strategies attuned to their complex genome organization.
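A minimal sketch of the kind of in silico screening mentioned above is given here. It is an illustrative assumption, not the authors' pipeline: the function names, the 6-mer window, and the 0.8 diversity threshold are all hypothetical. It discards candidates that are highly repetitive or that lack a single complete open reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def kmer_diversity(seq, k=6):
    """Fraction of k-mer windows that are distinct; low values indicate repeats."""
    total = len(seq) - k + 1
    if total <= 0:
        return 0.0
    return len({seq[i:i + k] for i in range(total)}) / total

def is_complete_orf(seq):
    """True if seq starts with ATG, ends with a stop codon, and has no internal stop."""
    if len(seq) % 3 != 0 or len(seq) < 6 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[-1] not in STOP_CODONS:
        return False
    return not any(c in STOP_CODONS for c in codons[1:-1])

def screen(candidates, min_diversity=0.8):
    """Keep candidates that are non-repetitive and encode one complete ORF."""
    return [
        s for s in candidates
        if kmer_diversity(s) >= min_diversity and is_complete_orf(s)
    ]

# Hypothetical candidates: one plausible ORF, one repetitive, one without a start codon.
good = "ATGGCTAAACGTTCCGATCTGTAA"
repetitive = "ATGATGATGATGATGATGTAA"
no_start = "GCTAAACGTTCCGATCTGTAA"
kept = screen([good, repetitive, no_start])
```

Filters like these only remove obvious failures; as the article notes, empirical validation is still needed to confirm that a surviving sequence is functional.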

Looking forward, the rapidly growing corpus of genomic data, coupled with advances in language model architectures and inference algorithms, promises to elevate semantic design to new heights. More sophisticated models that can generate entire multi-component biological systems, as demonstrated with toxin–antitoxin pairs, foreshadow the ability to engineer complex synthetic circuits, metabolic pathways, or even whole genomes. These capabilities could accelerate the creation of bespoke living systems tailored for medicine, industry, and environmental applications.

Beyond mere synthetic biology, semantic design opens a window into an expanded biological reality, uncovering sequences and functions veiled from natural observation. This synthetic genomic space is a frontier ripe for discovery, with the potential to reveal new molecular machines and evolutionary principles. By integrating rich semantic information encoded in genomes with computational creativity, scientists are reshaping our capacity to design life itself, heralding a new epoch of bioengineering that transcends the limits of natural evolution.

In essence, this work marks a profound shift in how we conceive and manipulate the building blocks of life. It boldly illustrates that language models trained on biological sequences are not only tools for data analysis but are potent generative engines capable of inventing functional, novel genes and proteins. As the field advances, semantic design could fundamentally alter the trajectory of biotechnology, synthetic biology, and our understanding of the molecular basis of life.

Subject of Research:
Design of functional de novo genes using genomic language models trained on prokaryotic DNA sequences.

Article Title:
Semantic design of functional de novo genes from a genomic language model.

Article References:
Merchant, A.T., King, S.H., Nguyen, E. et al. Semantic design of functional de novo genes from a genomic language model. Nature (2025). https://doi.org/10.1038/s41586-025-09749-7

Image Credits:
AI Generated

DOI:
https://doi.org/10.1038/s41586-025-09749-7

Tags: contextual understanding in genetics, de novo gene variant generation, Evo genomic sequence model, evolutionary information in DNA sequences, functional gene engineering techniques, genomic language models in gene design, high success rates in gene functionality, innovative approaches to synthetic biology, prokaryotic genomic data utilization, protein design versus gene design, synthetic biology advancements, synthetic genomics breakthroughs
