Standardized Approaches for Assessing Metagenomic Contig Binning Performance from Barnes-Hut t-Stochastic Neighbor Embeddings



The performance of unsupervised methods for metagenomic binning is often assessed using simulated microbial communities. However, the lack of well-characterized evaluation protocols, and of approaches to community construction that reflect biological realities, impedes rigorous assessment and standardization of the binning process. This work attempted to standardize performance evaluation using benchmark communities constructed according to the genome similarity metric Average Amino Acid Identity (AAI). This approach allowed us to extend and deepen our previous research on the unsupervised binning of metagenomic sequence fragments based on low-dimensional embeddings of pentamer frequency profiles. Experimental results indicate that our method for binning metagenomic contigs has the potential to become an alternative to state-of-the-art methods such as MetaCluster 3.0. © 2020, Springer Nature Switzerland AG.
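
To make the notion of a pentamer frequency profile concrete, the following is a minimal sketch of how such a profile might be computed for a single sequence fragment. It is an illustration under stated assumptions, not the authors' implementation: the function name `pentamer_profile`, the skipping of windows with ambiguous bases, and the normalization to relative frequencies are all hypothetical choices, and any merging of reverse-complement pentamers that the original method may perform is not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
K = 5  # pentamers, as named in the abstract

# Fixed lexicographic ordering of all 4**5 = 1024 pentamers,
# so every fragment maps to a vector with the same layout.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]


def pentamer_profile(sequence, k=K):
    """Return relative k-mer frequencies for one sequence fragment.

    Assumption (hypothetical): windows containing characters other than
    A, C, G, T (for example N) are ignored rather than imputed.
    """
    seq = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made of unambiguous bases contribute to the total.
    total = sum(counts[kmer] for kmer in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[kmer] / total for kmer in KMERS]
```

Each fragment then yields a 1024-dimensional vector. Such vectors could be passed to a Barnes-Hut t-SNE implementation to obtain the low-dimensional embedding on which binning operates; that step is omitted here because its parameters are not given in this abstract.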


