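Application of machine learning in MALDI-TOF MS identification of microorganisms
Authors: Liu Hongsheng, Feng Huawei, Zhang Li, Meng Jinhui, Dong Xue
Funding:

Liaoning Province Higher Education Institutions Overseas Training Program (2018LNGXGJWPY-YB006); China Association for Science and Technology Outstanding Chinese and Foreign Youth Exchange Program (2018CASTQNJL50); Liaoning Province Key Research and Development Program (2019JH2/10300041); Shenyang Science and Technology Plan Project (18-014-4-34, F16-205-1-51, 17-65-7-00, 17-231-1-04)
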
    Abstract:

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is an emerging high-throughput technology that is widely used for the rapid identification of clinical, food, and aquatic microorganisms. A major current challenge is how to further improve its resolution in microbial identification. To process the large volumes of high-dimensional microbial MALDI-TOF MS data efficiently, a variety of machine learning algorithms have been applied. This paper reviews the application of machine learning to MALDI-TOF MS identification of microorganisms. It first introduces the machine learning workflow for classifying microbial MALDI-TOF MS data, and then describes the characteristics of MALDI-TOF MS data, MALDI-TOF MS databases, data preprocessing, and model performance evaluation. It then discusses the applications of typical machine learning classification algorithms and ensemble learning algorithms. Simple machine learning algorithms struggle to meet the high-resolution requirements of microbial MALDI-TOF MS classification, whereas combining different machine learning algorithms, or using ensemble learning algorithms, can achieve better classification performance. For preprocessing MALDI-TOF MS data, wavelet algorithms and genetic algorithms are the most widely used, and combining them with classification algorithms can effectively improve classification performance. As the volume of microbial MALDI-TOF MS data continues to grow, future work should place more emphasis on improving classification algorithms, on selecting or combining different algorithms, and on improving preprocessing algorithms.
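
    The workflow summarized in the abstract (preprocessing, classification, ensemble combination, performance evaluation) can be illustrated with a minimal sketch. This is not the authors' method: the peak lists, the species labels "A" and "B", the bin width, and the three base classifiers are all assumptions made up for illustration, and the wavelet and genetic-algorithm preprocessing steps named in the abstract are replaced here by simple m/z binning.

```python
import math
from collections import Counter

def bin_spectrum(peaks, mz_min=2000.0, mz_max=20000.0, bin_width=1000.0):
    """Preprocessing stand-in: sum peak intensities into fixed m/z bins,
    then normalize so the vector sums to 1 (assumed bin settings)."""
    n_bins = int((mz_max - mz_min) / bin_width)
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            vec[int((mz - mz_min) // bin_width)] += intensity
    total = sum(vec)
    return [v / total for v in vec] if total > 0 else vec

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb) if na and nb else 1.0

def class_centroids(X, y):
    """Mean binned spectrum per class label."""
    groups = {}
    for vec, label in zip(X, y):
        groups.setdefault(label, []).append(vec)
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in groups.items()}

def nearest_centroid(x, cents, dist):
    return min(cents, key=lambda label: dist(x, cents[label]))

def nearest_neighbor(x, X, y):
    return min(zip(X, y), key=lambda pair: euclidean(x, pair[0]))[1]

def ensemble_predict(x, X, y):
    """Majority vote over three simple base classifiers."""
    cents = class_centroids(X, y)
    votes = [
        nearest_centroid(x, cents, euclidean),
        nearest_centroid(x, cents, cosine_distance),
        nearest_neighbor(x, X, y),
    ]
    return Counter(votes).most_common(1)[0][0]

# Made-up peak lists: (m/z, intensity) pairs with hypothetical labels.
train_set = [
    ([(3100, 10), (5200, 8), (9100, 1)], "A"),
    ([(3050, 12), (5100, 6)], "A"),
    ([(7200, 9), (9300, 11), (3100, 1)], "B"),
    ([(7100, 10), (9050, 7)], "B"),
]
test_set = [
    ([(3150, 9), (5150, 9)], "A"),
    ([(7300, 8), (9200, 8)], "B"),
]

X_train = [bin_spectrum(peaks) for peaks, _ in train_set]
y_train = [label for _, label in train_set]
predictions = [ensemble_predict(bin_spectrum(peaks), X_train, y_train)
               for peaks, _ in test_set]

# Performance evaluation: fraction of test spectra classified correctly.
accuracy = sum(pred == label
               for pred, (_, label) in zip(predictions, test_set)) / len(test_set)
print(predictions, accuracy)
```

    Majority voting is only one way to combine classifiers. On real spectra, accuracy on a tiny hand-made test set like this says little, so a held-out or cross-validated evaluation would be needed.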

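Cite this article

Liu Hongsheng, Feng Huawei, Zhang Li, Meng Jinhui, Dong Xue. Application of machine learning in MALDI-TOF MS identification of microorganisms[J]. Acta Microbiologica Sinica, 2020, 60(5): 841-855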

History
  • Received: 2019-09-01
  • Last revised: 2019-12-10
  • Published online: 2020-05-11