Classification ability of an extended Kraken2 standard database for digestive tract microbiota in ruminants

    Abstract:

    Metagenomics has enriched our understanding of the composition and functions of the digestive tract microbiota in animals. Currently, read-level taxonomic classification of metagenomic sequencing data typically assigns only 15% to 45% of reads to a species. Raising this read-level classification rate would therefore allow more microbial information to be mined from metagenomic data. [Objective] To improve the classification of digestive tract microbiota in ruminants by extending the Kraken2 standard database, and thereby mine more microbial information from metagenomic data. [Methods] A total of 14 827 metagenome-assembled genomes (MAGs) from the rumen fluid, feces, and digestive tracts of cattle, sheep, and goats were collected. After quality control and filtering, 3 095 species-level genome bins (SGBs) were retained. Following taxonomic classification and functional prediction, the SGBs were integrated into the Kraken2 standard database, and the classification performance of the extended database was evaluated. [Results] Under the Genome Taxonomy Database (GTDB) taxonomy, 3 053 SGBs were bacteria belonging to 782 genera in 28 phyla, and 42 SGBs were archaea belonging to 8 genera in 2 phyla. In functional prediction with eggNOG, the SGBs were annotated to 26 functional categories of clusters of orthologous groups of proteins (COG). In the Kyoto Encyclopedia of Genes and Genomes (KEGG) functional prediction, the top 25 KEGG orthology (KO) entries fell into 14 pathway types. In the prediction of carbohydrate-active enzymes (CAZy), 593 SGBs were annotated with six enzyme classes: auxiliary activities (AA), carbohydrate esterases (CE), glycosyltransferases (GT), carbohydrate-binding modules (CBM), glycoside hydrolases (GH), and polysaccharide lyases (PL). Among them, GH was the most widespread class. Adding the 3 095 SGBs to the Kraken2 standard database (May 2024) increased the number of species in the database by 5.00% and its size from 87.2 Gb to 98.2 Gb. Furthermore, a metagenomic study of the effect of dietary concentrate-to-forage ratio on the rumen microbiota of Holstein dairy cows was re-evaluated. With the SGB-extended database, the species-level classification rate of rumen fluid metagenomic reads rose from (19.35±1.81)% to (51.04±2.05)%. Principal component analysis (PCA) at the species level indicated that the extended database improved the separation of rumen microbiota structures under the two dietary concentrate-to-forage ratios. Linear discriminant analysis effect size (LEfSe) indicated that, with the standard database, Xylanibacter ruminicola and Aristaeella hokkaidonensis were the microbial markers for the low-forage and high-forage diets, respectively, whereas with the extended database the markers were Prevotella sp. 902800365 and Prevotella sp. 900316445, respectively. [Conclusion] Extending the Kraken2 standard database with SGBs increases the species coverage of the database and improves the read-level species classification rate, thereby improving the understanding of the microbes represented in metagenomic data.
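The species-level classification rate reported in the abstract can be illustrated with a minimal sketch that reads a Kraken2 report. It assumes the default six-column, tab-separated report layout (percentage, clade read count, directly assigned read count, rank code, taxid, name) and takes the rate as reads falling in species-rank clades divided by all reads (unclassified plus root). The function names, the sample report, and its taxids and counts are hypothetical, and treating the "±" values as mean ± sample standard deviation is an assumption; the abstract does not say how the authors computed their figures.

```python
from statistics import mean, stdev

# Made-up miniature report: names, taxids, and counts are illustrative only.
SAMPLE_REPORT = "\n".join([
    " 40.00\t40\t40\tU\t0\tunclassified",
    " 60.00\t60\t5\tR\t1\troot",
    " 55.00\t55\t5\tD\t2\t  Bacteria",
    " 50.00\t50\t10\tG\t100\t    Genus_A",
    " 30.00\t30\t25\tS\t101\t      Genus_A species_1",
    "  5.00\t5\t5\tS1\t102\t        Genus_A species_1 strain_x",
    " 10.00\t10\t10\tS\t103\t      Genus_A species_2",
])


def species_rate(report_text):
    """Return the percentage of reads that fall in species-rank clades."""
    total_reads = 0
    species_reads = 0
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads = int(fields[1])
        rank = fields[3].strip()
        if rank in ("U", "R"):
            # Unclassified reads plus everything under the root = all reads.
            total_reads += clade_reads
        elif rank == "S":
            # A species clade count already includes its sub-ranks (S1, ...),
            # so only exact "S" rows are summed to avoid double counting.
            species_reads += clade_reads
    if total_reads == 0:
        return 0.0
    return 100.0 * species_reads / total_reads


def summarize(rates):
    """Mean and sample standard deviation of per-sample rates (assumed)."""
    return mean(rates), stdev(rates)


if __name__ == "__main__":
    print(species_rate(SAMPLE_REPORT))
```

Running `species_rate` on the report from each sample, once per database, and passing the resulting lists to `summarize` would give a before/after comparison in the same form as the figures quoted above.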

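Cite this article

Weng Yunan, Zhen Yongkang, Wang Mengzhi, Wang Hongrong. Classification ability of an extended Kraken2 standard database for digestive tract microbiota in ruminants. Acta Microbiologica Sinica (微生物学报), 2025, 65(1): 402-415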

History
  • Received: 2024-08-03
  • Published online: 2025-01-04
  • Publication date: 2025-01-04