Most of the secondary metabolite clusters predicted by antiSMASH belonged to the following types: terpene, bacteriocin, NRPS (nonribosomal peptide synthetase cluster), T1PKS (type I PKS cluster), T3PKS (type III PKS cluster), hybrid peptide-polyketide clusters, and "other" clusters, which contain a secondary metabolite that does not fit into any other category (15). The less frequent cluster types were lantipeptide, arylpolyene, transatpks, otherks, proteusin, phosphonate, cyanobactin, thiopeptide, and indole.
Figure 2: Distribution of secondary metabolite classes predicted by antiSMASH. The results show a rich diversity of secondary metabolites in the genomes. Terpene, bacteriocin, NRPS, and T1PKS clusters are specifically enriched.
The majority of bacterial secondary metabolites are derived from small biosynthetic units in biosynthetic pathways. The intermediates assembled from these small units are further modified by numerous enzymes, yielding products with diverse structures. Thus, most of these secondary metabolites are classified by their biosynthetic origin as terpenoids, polyketides, alkaloids, non-ribosomal peptides, etc. (16).
For all clusters predicted by antiSMASH, we manually inspected the functional annotation of each putative gene cluster using the information within the predicted cluster boundaries.
Supplementary Table 1 shows the full results of all antiSMASH biosynthetic gene cluster predictions by phylogenetic grouping, providing evidence of enrichment in biosynthetic clusters.
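As an illustration of how a distribution of cluster types by phylogenetic grouping could be tallied, the following minimal Python sketch counts cluster types overall and per group. The records, the group names, and the `tally_cluster_types` helper are hypothetical placeholders introduced for illustration; they are not data or code from this study.

```python
from collections import Counter, defaultdict


def tally_cluster_types(records):
    """Count predicted cluster types overall and per phylogenetic group.

    `records` is an iterable of (group, cluster_type) pairs. Type labels are
    lower-cased so that 'NRPS' and 'nrps' are counted together.
    """
    overall = Counter()
    per_group = defaultdict(Counter)
    for group, cluster_type in records:
        label = cluster_type.strip().lower()
        overall[label] += 1
        per_group[group][label] += 1
    return overall, dict(per_group)


# Hypothetical example records (assumption: not results from this study).
example_records = [
    ("group_A", "terpene"),
    ("group_A", "NRPS"),
    ("group_A", "bacteriocin"),
    ("group_B", "terpene"),
    ("group_B", "nrps"),
    ("group_B", "t1pks"),
]

overall, per_group = tally_cluster_types(example_records)
for label, count in overall.most_common():
    print(f"{label}\t{count}")
```

Normalizing the labels before counting avoids splitting one cluster type across differently capitalized spellings.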
3.3. Investigation of homologous genomic
For the initial analysis of biosynthetic gene clusters (BGCs), we investigated homologous gene clusters because some biosynthetic gene clusters are highly modular. To understand the function of a gene cluster, we therefore compared each cluster with other gene clusters showing similarity to it and grouped the BGCs by homology. For each gene cluster, we selected the three to four closest homologs, which are listed in Table 1. The most frequently recurring structurally similar homologs were carotenoid, puwainaphycins, microsclerodermins, microcystins, and nostophycin. Investigating homologous genes across different genomes adds an extra dimension to sequence inspection and thereby helps assign functions to newly discovered genes (17).
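The selection of the closest homologs and the count of the most recurrent ones can be sketched as follows. This is a minimal illustration under stated assumptions: each cluster is assumed to come with a list of (homolog name, similarity score) pairs, and the scores, cluster identifiers, and the helpers `top_homologs` and `most_recurrent` are hypothetical, not values or code from this study.

```python
import heapq
from collections import Counter


def top_homologs(hits, k=4):
    """Return the k highest-scoring (name, score) pairs for one cluster."""
    return heapq.nlargest(k, hits, key=lambda hit: hit[1])


def most_recurrent(hits_by_cluster, k=4):
    """Count how often each homolog name appears among the top-k hits."""
    counts = Counter()
    for hits in hits_by_cluster.values():
        # Use a set so each name is counted once per query cluster.
        counts.update({name for name, _ in top_homologs(hits, k)})
    return counts


# Hypothetical similarity scores (assumption: not results from this study).
example_hits = {
    "cluster_1": [("carotenoid", 0.9), ("microcystin", 0.4), ("nostophycin", 0.2)],
    "cluster_2": [("puwainaphycin", 0.8), ("carotenoid", 0.5)],
}

counts = most_recurrent(example_hits, k=2)
print(counts.most_common())
```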
Moreover, each gene cluster was assigned the annotation of the homologous gene cluster with the highest cumulative BlastP bit score. The cumulative BlastP bit scores between the gene clusters, which ranged from above 1,000 to 21,000, are summarized in Table 2.
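One way to picture an assignment by cumulative bit score is the short Python sketch below. It assumes that the BlastP results are available as (query gene, hit cluster, bit score) triples and that only the best hit of each query gene in a given cluster contributes to the sum. The `min_score` cutoff of 1,000 simply echoes the lower end of the reported range and is an assumption, as are the helper names and the example values.

```python
from collections import defaultdict


def cumulative_bit_scores(blast_hits):
    """Sum BlastP bit scores per candidate homologous cluster.

    `blast_hits` is an iterable of (query_gene, hit_cluster, bit_score).
    For each query gene, only its best hit in a given cluster is counted.
    """
    best = {}
    for query_gene, hit_cluster, bit_score in blast_hits:
        key = (hit_cluster, query_gene)
        if bit_score > best.get(key, 0.0):
            best[key] = bit_score
    totals = defaultdict(float)
    for (hit_cluster, _), bit_score in best.items():
        totals[hit_cluster] += bit_score
    return dict(totals)


def assign_annotation(blast_hits, min_score=1000.0):
    """Return (cluster, score) with the highest cumulative score, or None."""
    totals = cumulative_bit_scores(blast_hits)
    if not totals:
        return None
    cluster, score = max(totals.items(), key=lambda item: item[1])
    # Assumed cutoff: skip assignments below the hypothetical minimum score.
    return (cluster, score) if score >= min_score else None


# Hypothetical BlastP hits (assumption: not results from this study).
example = [
    ("gene_1", "microcystin", 900.0),
    ("gene_1", "microcystin", 300.0),  # weaker duplicate hit, not counted
    ("gene_2", "microcystin", 650.0),
    ("gene_1", "nostophycin", 700.0),
]

print(assign_annotation(example))
```

Counting only the best hit per query gene keeps a single gene with many partial matches from inflating the cumulative score of one candidate cluster.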
3.4. Investigation of the predicted gene clusters responsible for biosynthesis of particular secondary metabolites
A total of 190 non-redundant putative gene clusters were predicted for the scaffolds we uploaded to antiSMASH. Figures 3, 4, and 5 show an overview of some predicted gene clusters encoding nonribosomal peptide synthetases (NRPS) and hybrid nonribosomal peptide-polyketide synthases (NRPS-PKS). Additional predicted gene clusters are provided in the supplementary information. Based on the presence of NRPS and NRPS-PKS