Abstract: Metagenomics has enriched our understanding of the composition and functions of the digestive tract microbiota in animals. Currently, however, metagenomic sequencing generally classifies only 15% to 45% of reads at the species level. Improving the alignment rate of microbial reads can therefore help to mine more microbial information from metagenome data. [Objective] To enhance the classification of digestive tract microbiota in ruminants by extending the Kraken2 standard database, and thereby mine microbial information from metagenome data in greater depth. [Methods] A total of 14 827 metagenome-assembled genomes (MAGs) from the rumen fluid, feces, and digestive tracts of cattle, sheep, and goats were collected. After quality control and filtering, 3 095 species-level genome bins (SGBs) were retained. Following taxonomic classification and functional prediction, these SGBs were integrated into the Kraken2 standard database, and the classification performance was evaluated. [Results] According to the Genome Taxonomy Database (GTDB), the 3 095 SGBs comprised 3 053 bacterial SGBs belonging to 782 genera of 28 phyla and 42 archaeal SGBs belonging to 8 genera of 2 phyla. Functional prediction based on eggNOG annotated the SGBs into 26 clusters of orthologous groups of proteins (COGs). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment categorized the top 25 KEGG ortholog (KO) entries into 14 pathways. Prediction of carbohydrate-active enzymes (CAZymes) showed that 593 SGBs were annotated with six classes of CAZymes: auxiliary activities (AA), carbohydrate esterases (CE), glycosyltransferases (GT), carbohydrate-binding modules (CBM), glycoside hydrolases (GH), and polysaccharide lyases (PL). Among them, GH was the most common class. Adding the 3 095 SGBs to the Kraken2 standard database (May 2024) increased the number of species in the database by 5.00% and its size from 87.2 Gb to 98.2 Gb.
Furthermore, a published metagenomic study of the effect of dietary fiber-to-concentrate ratio on the rumen microbiota of Holstein cows was reanalyzed. Integrating the SGBs into the database raised the species-level alignment rate of rumen metagenome reads from (19.35±1.81)% to (51.04±2.05)%. Principal component analysis at the species level indicated that the extended database improved the ability to distinguish the rumen microbiota structures under the two dietary fiber-to-concentrate ratios. Linear discriminant analysis effect size results indicated that, with the standard database, the microbial markers for the low-fiber and high-fiber diets were Xylanibacter ruminicola and Aristaeella hokkaidonensis, respectively, whereas with the extended database they were Prevotella sp. 902800365 and Prevotella sp. 900316445, respectively. [Conclusion] Extending the Kraken2 standard database with SGBs increases species coverage and improves the species-level alignment rate of metagenome reads, thereby enhancing the understanding of microbial information in metagenome data.
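
As an illustration of the species-level read classification rate discussed above, the following minimal Python sketch computes such a rate from a Kraken2-style report. It is not the authors' pipeline. It assumes the standard six-column, tab-separated Kraken2 report layout (percentage, clade reads, directly assigned reads, rank code, taxonomy ID, name) and assumes that the rate is defined as the reads assigned at or below species rank divided by all reads. The function name `species_level_rate` and the toy report are hypothetical.

```python
def species_level_rate(report_lines):
    """Return the percentage of reads classified at or below species rank.

    Assumes the standard Kraken2 report layout (tab-separated):
    percent, clade_reads, direct_reads, rank_code, taxid, name.
    """
    total_reads = 0
    species_reads = 0
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads = int(fields[1])
        rank_code = fields[3].strip()
        if rank_code in ("U", "R"):
            # Unclassified reads plus reads under the root cover every read.
            total_reads += clade_reads
        elif rank_code == "S":
            # Count only exact species rows; sub-ranks such as S1 are
            # already included in the clade count of their parent species.
            species_reads += clade_reads
    if total_reads == 0:
        raise ValueError("report contains no reads")
    return 100.0 * species_reads / total_reads


# Toy report with made-up taxa and taxonomy IDs (illustration only).
TOY_REPORT = [
    " 60.00\t600\t600\tU\t0\tunclassified",
    " 40.00\t400\t10\tR\t1\troot",
    " 39.00\t390\t140\tD\t2\t  Bacteria",
    " 20.00\t200\t150\tS\t1001\t    Species A",
    "  5.00\t50\t50\tS1\t1002\t      Species A strain 1",
    "  5.00\t50\t50\tS\t1003\t    Species B",
]

rate = species_level_rate(TOY_REPORT)
print(f"Species-level classification rate: {rate:.2f}%")
```

Applying the same calculation to reports generated with the standard and the extended database for each sample would give a paired comparison of the kind summarized above, provided the study defines its alignment rate in this way.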