Abstract: Metagenomics extracts the total microbial genetic material directly from environmental samples, without the pure culture on growth media that traditional methods require, which enables in-depth study of the structure and function of microbial communities. It is also of great significance for the diagnosis and treatment of disease, environmental management, and the understanding of life. The genetic material extracted from the environment is sequenced to obtain reads, which read assembly tools then assemble into contigs. Binning these contigs, that is, grouping contigs that originate from the same organism, allows more complete genomes to be reconstructed from metagenomic samples. The quality of binning directly affects subsequent biological analysis, so how to effectively bin contigs that originate from different microbial genomes has become both a research hotspot and a challenge in metagenomics. Machine learning methods are widely used for binning metagenomic contigs and are generally divided into unsupervised contig clustering methods and supervised contig classification methods. This review introduces the methods for binning metagenomic contigs and analyzes their problems, including low classification accuracy, high time cost, and difficulty in reconstructing more microbial genomes from complex metagenomic datasets. We also summarize future research and development directions for these methods, and suggest that semi-supervised learning, ensemble learning, and deep learning be combined with more effective feature representations of the data to improve binning performance.