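Progress in the application of artificial intelligence-assisted molecular modification of enzymes

Funding: National Natural Science Foundation of China (62373001); Natural Science Foundation of Anhui Province (1808085MC86); Key Natural Science Research Project for University Teachers in Anhui Province (2022AH052316, 2023AH050089)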
    Abstract:

    Natural enzymes often fail to meet the needs of application and research in terms of activity, enantioselectivity, or thermostability, so developing efficient molecular modification techniques to improve these properties is an important task in enzyme engineering. The main techniques are rational design, directed evolution, and artificial intelligence (AI)-assisted design. Directed evolution and rational design are experiment-driven strategies that have been applied successfully in enzyme engineering. However, because protein sequence space is vast and experimental data are scarce, current modification methods still face major challenges. With advances in next-generation sequencing, high-throughput screening, protein databases, and AI, data-driven enzyme engineering is emerging as a promising way to address these challenges. In particular, AI-assisted statistical learning methods build data-driven models that predict enzyme properties from sequence or structure; selecting promising mutants according to the model predictions greatly improves the efficiency of enzyme modification. Considering the application requirements of enzyme molecular modification, this paper reviews data acquisition methods and application examples of AI-assisted enzyme modification, with a focus on predicting protein thermostability with convolutional neural networks, aiming to provide a reference for researchers in this field.
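The data-driven workflow summarized in the abstract (fit a sequence-to-property model on measured variants, then pick mutants by predicted value) can be illustrated with a minimal sketch. It is not the method of the review, which emphasizes convolutional neural networks; a much simpler additive linear model on one-hot features is swapped in here so that the example stays short. The sequences, the property values, and all function names (`one_hot_features`, `fit_additive_model`, `rank_candidates`) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Feature = Tuple[int, str]  # (position, amino acid) = one one-hot feature

# Hypothetical training data: variant sequence -> measured property change
# (for example a shift in melting temperature). Values are invented.
TRAIN: List[Tuple[str, float]] = [
    ("ACDE", 0.0),   # assumed wild type
    ("GCDE", 1.0),   # single mutant, beneficial
    ("ACDK", 2.0),   # single mutant, beneficial
    ("AWDE", -1.0),  # single mutant, harmful
]

# Hypothetical untested double mutants to be ranked by the model.
CANDIDATES: List[str] = ["GWDE", "AWDK", "GCDK"]


def one_hot_features(seq: str) -> List[Feature]:
    """Return the active one-hot features of a sequence."""
    return [(i, aa) for i, aa in enumerate(seq)]


def predict(weights: Dict[Feature, float], bias: float, seq: str) -> float:
    """Additive model: bias plus one weight per (position, residue)."""
    return bias + sum(weights.get(f, 0.0) for f in one_hot_features(seq))


def fit_additive_model(
    train: List[Tuple[str, float]], lr: float = 0.1, epochs: int = 500
) -> Tuple[Dict[Feature, float], float]:
    """Fit the additive model by full-batch gradient descent on squared error."""
    weights: Dict[Feature, float] = {}
    bias = 0.0
    for _ in range(epochs):
        # Compute all residuals first so that the update is a true batch step.
        residuals = [(seq, predict(weights, bias, seq) - y) for seq, y in train]
        for seq, r in residuals:
            bias -= lr * r
            for f in one_hot_features(seq):
                weights[f] = weights.get(f, 0.0) - lr * r
    return weights, bias


def rank_candidates(
    weights: Dict[Feature, float], bias: float, candidates: List[str]
) -> List[str]:
    """Sort candidate variants from highest to lowest predicted property."""
    return sorted(candidates, key=lambda s: predict(weights, bias, s), reverse=True)


weights, bias = fit_additive_model(TRAIN)
for seq in rank_candidates(weights, bias, CANDIDATES):
    print(seq, round(predict(weights, bias, seq), 3))
```

On this toy data the double mutant `GCDK` ranks first because the model adds the two beneficial single-mutation effects. Real enzyme data are rarely this additive, which is one reason nonlinear models such as neural networks are of interest.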

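Cite this article

XU P, WANG WH, NING HW, CAO RF, LIU S, FAN PF, SONG XP. Progress in the application of artificial intelligence-assisted molecular modification of enzymes[J]. Chinese Journal of Biotechnology, 2024, 40(6): 1728-1741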

History
  • Received: 2023-10-30
  • Accepted: 2024-03-15
  • Published online: 2024-06-06
  • Publication date: 2024-06-25