AI Meets Non-Coding DNA: Decoding Genomic Grammar Beyond AlphaFold
- DNA writer

- Aug 17, 2025
Researchers are applying advanced AI to non-coding regions of DNA in order to predict how short regulatory sequences influence gene expression.
Models such as Evo, regLM, and DeepMind’s AlphaGenome are trained on massive sequence datasets and can generate regulatory elements or predict their activity from sequence alone. These approaches give a more accurate view of how sequence variation alters gene regulation and open the door to designing novel elements with targeted functions. Some of them, notably AlphaGenome, can also model long-range interactions across as much as a megabase of DNA, a step toward understanding complex regulatory architecture, particularly when combined with single-cell and long-read sequencing data.
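To make the idea of scoring sequence variation concrete, here is a minimal sketch of in silico mutagenesis: every single-base substitution is scored against the reference sequence. The scorer `toy_activity` and the motif `TATAAA` are illustrative assumptions, a toy stand-in for a trained model, and not the API of Evo, regLM, or AlphaGenome.

```python
from typing import Callable, List, Tuple

# Assumption: a made-up motif used only to give the toy scorer something to detect.
TOY_MOTIF = "TATAAA"


def toy_activity(seq: str) -> float:
    """Hypothetical scorer: best fraction of motif bases matched at any offset.

    A real sequence-to-activity model would replace this function.
    """
    best = 0
    for i in range(len(seq) - len(TOY_MOTIF) + 1):
        window = seq[i:i + len(TOY_MOTIF)]
        matches = sum(a == b for a, b in zip(window, TOY_MOTIF))
        best = max(best, matches)
    return best / len(TOY_MOTIF)


def variant_effects(
    seq: str, score: Callable[[str], float] = toy_activity
) -> List[Tuple[int, str, str, float]]:
    """Return (position, ref_base, alt_base, score change) for every substitution."""
    ref_score = score(seq)
    effects = []
    for pos, ref_base in enumerate(seq):
        for alt in "ACGT":
            if alt == ref_base:
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects.append((pos, ref_base, alt, score(mutant) - ref_score))
    return effects


reference = "GGCTATAAAGGC"  # illustrative sequence, not real data
effects = variant_effects(reference)
# The substitution with the largest predicted drop in activity.
most_disruptive = min(effects, key=lambda e: e[3])
```

With a real model in place of `toy_activity`, the same loop would rank variants by their predicted effect on regulatory activity; here it only flags substitutions that break the toy motif.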
Evo was trained on a dataset of about 300 billion nucleotide bases, which allows it to generate realistic new DNA sequences. Similarly, regLM has been shown to design short regulatory elements that are predicted to be active across different human cell lines. Together, these advances show how AI can uncover subtle regulatory patterns and build functional sequences. That makes it a strong complement to long-read single-cell sequencing, which captures isoform diversity and splicing events at the level of individual cells.
