Artificial intelligence in nucleic acid drug design: advances, challenges, and future

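WANG Fan, XIE Zhi*
Zhongshan Ophthalmic Center, Sun Yat-sen University; State Key Laboratory of Ophthalmology; Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou 510060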

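Abstract:

Nucleic acid drugs, a new therapeutic platform following small-molecule and antibody drugs, have attracted wide attention for their flexible design and diverse mechanisms of action. In recent years, the rapid development of artificial intelligence has provided new ideas and technical tools for nucleic acid drug design, showing notable potential in sequence optimization, structure prediction, property evaluation, and delivery system improvement. Related research has focused mainly on using deep learning models for sequence generation and performance prediction; by learning from and mining large-scale biological data, these models have improved the efficiency and accuracy of design. However, insufficient algorithm interpretability, poor data quality, imbalanced sample distributions, and the gap between experimental validation and model generalization remain important factors limiting further application. As multi-omics data are integrated, interpretable models emerge, and design systems that combine experiment with computation mature, artificial intelligence is expected to play a more critical role in the design, screening, and clinical translation of nucleic acid drugs.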

Corresponding author: XIE Zhi, Email: xiezhi@gmail.com

Advances, challenges, and future perspectives of artificial intelligence in nucleic acid drug design
WANG Fan , XIE Zhi*
State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou 510060, China

Abstract:

Nucleic acid drugs, as an emerging therapeutic platform following small molecules and antibody drugs, have garnered widespread attention due to their flexible design and diverse mechanisms of action. In recent years, the rapid development of artificial intelligence (AI) technology has provided novel approaches and technical tools for nucleic acid drug design, demonstrating significant potential in sequence optimization, structure prediction, property assessment, and delivery system improvement. This review aims to systematically summarize the recent progress, challenges, and future directions of AI, particularly deep learning, in nucleic acid drug design. The scope of this review encompasses multiple aspects of AI-driven nucleic acid drug development. We first introduce the clinical rise of nucleic acid drugs, including mRNA vaccines and therapeutics, small interfering RNAs (siRNAs), antisense oligonucleotides (ASOs), and aptamers, which have achieved remarkable success in treating genetic diseases, metabolic disorders, infectious diseases, and cancers. However, their design faces inherent challenges including vast sequence space, high costs, off-target effects, in vivo stability issues, immunogenicity risks, and delivery efficiency bottlenecks, highlighting the limitations of traditional empirical methods and creating opportunities for AI applications.
We systematically review deep learning models and their applications in this field. Convolutional neural networks (CNNs) excel at extracting local sequence motifs for predicting siRNA efficacy and immunogenicity. Recurrent neural networks (RNNs) capture sequential dependencies for RNA coding potential prediction and codon optimization. Transformers handle long-range interactions efficiently, demonstrating advantages in siRNA design and mRNA degradation prediction. Graph neural networks (GNNs) model complex molecular topologies, enabling sophisticated analysis of chemical modifications and interaction networks.
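As a rough illustration of what "extracting local sequence motifs" means for a CNN, the sketch below one-hot encodes an RNA sequence and slides a single convolutional filter along it. It is a minimal sketch under stated assumptions, not any published model: the filter weights, the example sequence, and the function names (`one_hot`, `conv1d_scan`, `relu_max_pool`) are all hypothetical, and a real network would learn many such filters from data rather than use a hand-written one.

```python
# Illustrative only: the filter weights and example sequence are made up.
NUCLEOTIDES = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as rows of 4 values in A, C, G, U order."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]

def conv1d_scan(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence; return one score per window."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores

def relu_max_pool(scores):
    """ReLU followed by global max pooling: the strongest motif response."""
    return max((max(s, 0.0) for s in scores), default=0.0)

# Hypothetical filter that responds most strongly to the 3-mer "GGA".
GGA_FILTER = [
    [0.0, 0.0, 1.0, 0.0],  # position 1 prefers G
    [0.0, 0.0, 1.0, 0.0],  # position 2 prefers G
    [1.0, 0.0, 0.0, 0.0],  # position 3 prefers A
]

example_seq = "AUGGACUUCGGAUCA"  # made-up sequence
scores = conv1d_scan(one_hot(example_seq), GGA_FILTER)
best_position = scores.index(max(scores))  # first window with the top score
pooled = relu_max_pool(scores)
```

The pooled value is what a downstream layer would consume as a "motif present" feature; the position of the maximum is what makes such filters comparatively easy to inspect.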
In sequence design applications, deep learning optimizes mRNA codon usage and untranslated regions (UTRs) to enhance translation efficiency and stability, with computational algorithms demonstrating significant improvements in protein expression and vaccine efficacy. For siRNA, different modeling strategies have emerged: CNN-based approaches with thermodynamic features offer interpretability, GNN-based methods leverage topological modeling of RNA-RNA interactions, and Transformer-based frameworks utilize pretrained language models for transfer learning across diverse datasets. For ASO design, multi-stage frameworks combine sequence engineering with chemical modification optimization through advanced neural network architectures, achieving superior performance compared to traditional empirical approaches. For aptamers, machine learning-guided screening methods and generative models such as diffusion-based approaches accelerate the discovery of high-affinity molecular binders.
Beyond sequence design, AI predicts key drug properties. Advanced models assess RNA degradation rates at nucleotide resolution and evaluate sequence-structure stability through integrated computational frameworks. Despite challenges in data quality and mechanistic understanding, deep learning approaches predict immunogenicity through innate immune stimulation assessment and neoantigen identification. Targeting specificity prediction employs geometric deep learning frameworks for RNA-ligand binding analysis and various computational approaches for managing off-target effects in therapeutic oligonucleotides. Expression level prediction integrates multiple factors including mRNA stability, translation efficiency, and immune responses to forecast protein production.
Delivery system optimization represents another major application area. Deep learning enables rational design of novel ionizable lipids through virtual library generation and computational screening, successfully identifying superior candidates with minimal experimental synthesis. Advanced AI platforms achieve cell-type-specific lipid nanoparticle (LNP) design by combining neural networks with high-throughput screening, expanding therapeutic potential beyond traditional liver-targeted delivery to extrahepatic tissues.
Current AI applications face critical challenges. Data scarcity and quality issues remain primary bottlenecks due to limited dataset scales, heterogeneity, and reporting bias. Model interpretability poses obstacles for mechanistic understanding and clinical acceptance, though explainable AI techniques are emerging. High computational costs, generalization difficulties, and the complexity of integrating multi-scale biological factors limit practical applications. Looking forward, we identify promising directions including multi-omics data integration for precision medicine, development of interpretable hybrid models combining mechanistic knowledge with data-driven learning, personalized drug design based on individual genetic and transcriptional profiles, integration of automated experimental platforms for rapid iterative optimization cycles, and expansion to RNA-targeting small molecules and gene editing systems. Through interdisciplinary collaboration among computational scientists, molecular biologists, and clinicians, AI-assisted nucleic acid therapeutics are poised to deliver innovative treatments for major diseases, representing one of the most exciting frontiers in AI-enabled biomedical research.
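To make the codon-usage optimization mentioned in the sequence-design part of the abstract more concrete, the sketch below shows a classical, non-learned baseline: greedily choosing the most frequent synonymous codon and scoring the result with the codon adaptation index (CAI), the geometric mean of per-codon relative adaptiveness. This is not the deep learning approach the review covers. The usage table is hypothetical (made-up frequencies for three amino acids), the function names are illustrative, and, as a simplification, single-codon amino acids such as methionine are included in the CAI.

```python
import math

# Hypothetical relative codon frequencies (NOT a real organism's usage table).
CODON_USAGE = {
    "M": {"AUG": 1.00},
    "K": {"AAA": 0.40, "AAG": 0.60},
    "L": {"CUG": 0.50, "CUC": 0.25, "UUA": 0.25},
}

def relative_adaptiveness(usage):
    """w(codon) = frequency / frequency of the most-used synonymous codon."""
    weights = {}
    for codons in usage.values():
        top = max(codons.values())
        for codon, freq in codons.items():
            weights[codon] = freq / top
    return weights

def greedy_optimize(peptide, usage):
    """Back-translate a peptide, picking the most frequent codon per residue."""
    return "".join(max(usage[aa], key=usage[aa].get) for aa in peptide)

def cai(mrna, weights):
    """Codon adaptation index: geometric mean of the per-codon weights."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]
    log_sum = sum(math.log(weights[c]) for c in codons)
    return math.exp(log_sum / len(codons))

WEIGHTS = relative_adaptiveness(CODON_USAGE)
optimized = greedy_optimize("MKL", CODON_USAGE)  # most frequent codon at each position
naive = "AUGAAAUUA"                              # same peptide, rarer codons
cai_optimized = cai(optimized, WEIGHTS)
cai_naive = cai(naive, WEIGHTS)
```

A greedy CAI maximizer ignores mRNA secondary structure, UTR context, and immunogenicity, which is part of why the learned approaches summarized above treat codon choice as a joint optimization problem rather than a per-codon lookup.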

Corresponding author: XIE Zhi, Email: xiezhi@gmail.com
