Abstract

The complete genome sequence of Bacillus subtilis strain DM2 isolated from petroleum-contaminated soil on the Tibetan Plateau was determined. The genome of strain DM2 consists of a circular chromosome of 4,238,631 bp encoding 4458 protein-coding genes and a plasmid of 84,240 bp coding for 103 genes. Thirty-four genomic islands coding for 330 proteins and 5 prophages are found in the genome. The DDH value shows that strain DM2 belongs to B. subtilis subsp. subtilis, but significant variations of the genome are also present. Comparative analysis showed that the genome of strain DM2 encodes some strain-specific proteins in comparison with B. subtilis subsp. subtilis str. 168, such as carboxymuconolactone decarboxylase family protein, gfo/Idh/MocA family oxidoreductases, GlsB/YeaQ/YmgE family stress response membrane protein, HlyC/CorC family transporters, LLM class flavin-dependent oxidoreductase, and LPXTG cell wall anchor domain-containing protein. Most of the strain-specific proteins common to strains DM2 and MJ01, or unique to strain DM2, are involved in pathways related to stress response, signaling, and hydrocarbon degradation. Furthermore, the strain DM2 genome contains 122 genes coding for well-developed two-component systems and 138 genes coding for ABC transporter systems. The prominent features of the strain DM2 genome reflect the evolutionary fitness of this strain to harsh conditions and hydrocarbon utilization.

1. Introduction

Petroleum exploitation and utilization have caused a widespread distribution of hydrocarbons in the environment. Petroleum comprises alkanes, aromatic hydrocarbons, and nonhydrocarbon compounds. These compounds pose a threat to the ecosystem and human health [1]. Many microbes have been characterized for their ability to degrade petroleum hydrocarbons and have been used in the bioremediation of petroleum-contaminated environments. However, the microbes inhabiting petroleum-contaminated environments in low-temperature, high-altitude biotopes remain to be studied and exploited.

Bacillus subtilis, an extensively studied gram-positive model bacterial species in the Bacillus genus, has been isolated from a variety of distinct environments, such as diverse soils and waters [2, 3], fermented foods [4], marine sand [5], rumen and intestinal tract [6], and plant endophytic bacteria [7, 8]. The versatile physiological functions of this bacterium have been explored for industrial production [9], bioremediation of polluted environments [10–12], plant growth promotion and pathogen control [7], and even for use as probiotics for humans [13]. The diverse habitats of these strains reflect versatile metabolic pathways and a robust capacity for environmental adaptation in a widely distributed species [3, 14]. Recently, comparative genomics has shown significant genetic variations among B. subtilis strains that inhabit diverse environments [3, 14–17]. Thus, the ecophysiological diversity of this species provides an ideal model for revealing the genetic and molecular basis of its successful environmental adaptability [14].

Recently, several Bacillus strains have been isolated from petroleum-contaminated soils and explored for their ability to degrade hydrocarbon compounds, such as B. subtilis A1 [18], Bacillus sp. M3 [19], and Bacillus sp. Q2 [20]. In addition, genome sequences of two such B. subtilis strains are available. B. subtilis strain B-1, which was isolated from an oil field, can form a thick biofilm with an extracellular matrix consisting mainly of gamma-polyglutamate [21]. The B-1 genome displays 50% sequence homology with that of the model laboratory strain B. subtilis 168. Another B. subtilis strain, MJ01, was isolated from oil-contaminated soil and evaluated as a new biosurfactant-producing strain [12]. Digital DNA-DNA hybridization showed that its genome has the highest similarity (94.7%) with the genome of B. subtilis subsp. spizizenii TU-B-10. In this study, we isolated a new B. subtilis strain from petroleum-contaminated soil on the Tibetan Plateau in China. To further understand its genetic traits for hydrocarbon degradation and adaptability to a low-temperature environment, we analyzed the whole genome sequence and compared it with the genomes of other B. subtilis strains representing distinct ecotypes or physiological traits. Our aim was to reveal the ecological fitness associated with microbial survival strategies that are relevant to petroleum-contaminated and low-temperature soil environments.

2. Materials and Methods

2.1. Strain Isolation and Measurement of Petroleum Degradation

B. subtilis strain DM2 was isolated from oil field soils in the town of Huatugou, which is located in the northwest of Qinghai province of China (90.71°E, 38.29°N, 2907 m). The strain was isolated and cultured in MM medium (3.5 g/L MgCl2, 1.0 g/L NH4NO3, 0.35 g/L KCl, 0.05 g/L CaCl2, 1.0 g/L KH2PO4, 1.0 g/L K2HPO4, 0.01 g/L FeCl3, 0.08 g/L KBr, -4 g/L ZnSO4·7H2O, and 24 mg/L SrCl2·6H2O, pH 7.5), with 2% (/) petroleum as a sole carbon source. To assess petroleum degradation, cells were inoculated in 100 mL liquid MM medium with 2% (/) petroleum and cultured on a rotary shaker at 20°C and 150 rpm. After 96 h of fermentation, the residual petroleum in the medium was extracted using petroleum ether. The extract was subsequently evaporated in a rotary evaporator at 40°C, and the amount of residual oil was measured using the gravimetric method described by Latha and Kalaivani [22], i.e., the amount of petroleum degraded equals the weight of petroleum added to the medium minus the weight of residual oil. The degradation rate was calculated accordingly.
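The gravimetric calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the degradation rate is expressed as the degraded mass divided by the mass added; the function name and the example masses are hypothetical and are not data from this study, and no correction for abiotic loss is included.

```python
def degradation_rate(added_g: float, residual_g: float) -> float:
    """Return the percentage of petroleum degraded (gravimetric estimate).

    added_g    -- mass of petroleum added to the medium, in grams
    residual_g -- mass of residual oil recovered after culture, in grams
    """
    if added_g <= 0:
        raise ValueError("added_g must be positive")
    if not 0 <= residual_g <= added_g:
        raise ValueError("residual_g must lie between 0 and added_g")
    degraded_g = added_g - residual_g  # amount degraded = added - residual
    return 100.0 * degraded_g / added_g


# Hypothetical masses, for illustration only.
example_rate = degradation_rate(added_g=2.0, residual_g=0.5)
print(f"{example_rate:.1f}% degraded")
```

In practice an uninoculated control flask would usually be processed in parallel so that evaporation and extraction losses are not counted as biodegradation.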

2.2. DNA Extraction and 16S rRNA Gene Sequencing

For DNA extraction, the strain was inoculated in liquid LB medium at 25°C and incubated at 150 rpm on a rotary shaker for 60 h. Genomic DNA was extracted using a Bacterial Genomic DNA Extraction Kit (AxyPrep, Corning Inc., NY, USA) according to its instructions. The 16S rRNA gene sequence was amplified using the primers 27F (5′-AGAGTTTGATCCTGGCTCAG-3′) and 1492R (5′-TACCTTGTTACGACTT-3′) [23]. The sequence was aligned against the NCBI database, and the 16S rRNA gene sequence obtained in this study was deposited into NCBI under accession number MK014304.

2.3. Genomic DNA Sequencing, Assembly, and Annotation

The genomic DNA library was prepared using TruSeq Nano DNA LT Library Preparation Kits (Illumina Inc., San Diego, CA, USA) after purification of the strain DNA and examination using a Nanodrop 2500. Sequencing was performed on the PacBio RS II and Illumina MiSeq platforms at Majorbio Inc. (Shanghai, China). After quality control of the raw reads generated from sequencing, the resulting clean reads were assembled de novo using Newbler (version 2.8) and Hierarchical Genome Assembly Process (HGAP) version 3.0. The protein-coding genes, tRNA genes, and rRNA genes within the assembled genomic sequence were predicted using Glimmer 3.02 (http://www.cbcb.umd.edu/software/glimmer/), tRNAscan-SE v1.3.1, and Barrnap 0.4.2, respectively. The tandem repeat and interspersed repeat sequences were predicted using TRF and RepeatMasker software, respectively. The predicted protein-coding genes were subjected to BLASTp against the Nr, string (v9.05), and GO databases using BLAST2.2.28+. The COG (Clusters of Orthologous Groups of proteins) annotation of the predicted genes was obtained by BLASTp search against the string database (http://string-db.org/), and the functional protein clustering was performed according to the COG results. The predicted genes were further compared by BLAST against the KEGG (Kyoto Encyclopedia of Genes and Genomes) database to obtain their KOs and pathways. Genomic islands (GIs) in the genome were predicted using IslandViewer 4 (https://www.pathogenomics.sfu.ca/islandviewer/), and prophages were predicted using PHAST software (version 1.5). The complete genome sequences generated in the present study were deposited in GenBank under the accession numbers CP030937 and CP030938.

2.4. Phylogenetic Analysis of the Strain

The protein sequences of 24 housekeeping genes, including CTP synthase, DNA primase, DNA-directed RNA polymerase beta-subunit, LSU ribosomal protein L3p, LSU ribosomal protein L4p, LSU ribosomal protein L5p, LSU ribosomal protein L6p, LSU ribosomal protein L7/L12, LSU ribosomal protein L11p, LSU ribosomal protein L13p, LSU ribosomal protein L16p, LSU ribosomal protein L20p, LSU ribosomal protein L27p, phosphoglycerate kinase, ribosome recycling factor, SSU ribosomal protein S2p, SSU ribosomal protein S3p, SSU ribosomal protein S5p, SSU ribosomal protein S9p, SSU ribosomal protein S10p, SSU ribosomal protein S11p, SSU ribosomal protein S13p, transcription termination protein NusA, and translation elongation factor Ts, from certain Bacillaceae members were downloaded from GenBank [13]. The protein sequences extracted from GenBank and the present isolate were aligned using MEGA 7.0, and a phylogenetic tree was consequently produced based on neighbor-joining method.

2.5. Comparative Genomics

To discern the characteristics of the DM2 genome, the genomes of six Bacillus strains, i.e., B. subtilis subsp. subtilis str. 168, B. subtilis PY79, B. subtilis TO-A JPC, B. subtilis MJ01, B. subtilis B-1, and B. subtilis UD1022, which were isolated from different biotopes and have genome sequences deposited in GenBank, were retrieved from NCBI. The genome of strain DM2 was submitted to the Integrated Microbial Genomes (IMG) database (https://img.jgi.doe.gov/) for comparative genome analysis.

3. Results and Discussion

3.1. Isolation and Identification of a Petroleum-Degrading Strain DM2

Strain DM2 was isolated from the soil of an oil field located in a cryogenic region at an altitude of 2909 m using MM medium with petroleum as the sole carbon source. The strain grew well in liquid LB medium and reached its maximum growth rate after 24 h of shaking culture (Figure 1(a)). The strain could also grow in the oligotrophic liquid MM medium containing 2% (/) of the mixture of alkanes (C12 : C14 : : 1 : 1) as the sole carbon source (Figure 1(b)). However, when 2% petroleum was added to MM medium as the sole carbon source, the strain exhibited better growth than with the alkane mixture as the carbon source (Figure 1(c)), suggesting a low degradation capacity for middle-chain alkanes. Further experiments indicated that, when the strain was incubated in liquid MM medium containing 2% petroleum as the carbon source at 20°C for 96 h, of petroleum in medium was degraded, suggesting a strong petroleum-degrading ability under these culture conditions. The 16S rRNA gene sequence of strain DM2 showed 99% similarity with that of Bacillus subtilis. Thus, the isolate was identified as B. subtilis DM2.

3.2. The Genome Organization of B. subtilis DM2

The genome of strain DM2 consists of a circular chromosome of 4,238,631 bp with a G+C content of 43.52% and a plasmid of 84,240 bp with a G+C content of 35.08%. The detailed information on the genome is summarized in Table 1 and Figure 2. To further discern the characteristics of the genome, we downloaded the genomic data of six B. subtilis strains from the NCBI database and comparatively analyzed their genomes (Table 2). Of those, B. subtilis subsp. subtilis str. 168 is the model strain of B. subtilis. B. subtilis PY79 is one of the most widely used laboratory strains [24]. B. subtilis B-1 is a petroleum-degrading and biofilm-producing strain isolated from oil field biofilms [21]. B. subtilis MJ01 is also a petroleum-degrading strain isolated from oil-contaminated soil [12]. B. subtilis TO-A JPC is a probiotic strain isolated from the probiotic drug Vibact® [13]. B. subtilis UD1022 is a plant growth-promoting strain isolated from plant rhizosphere soil [7]. Among them, DM2 has the largest genome and the highest number of predicted genes and protein-coding genes. Previous studies hypothesized that there is a correlation between microbial genome size and environmental adaptability [25, 26]. Whether these distinct genetic traits of strain DM2 are related to its successful adaptation to its habitat needs further study.
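The G+C content quoted for each replicon is a simple base-composition statistic. The following Python sketch shows one way to compute it from a nucleotide string; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption made here rather than a description of the software used in this study.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (for example N) is ignored. This is an assumption of the sketch.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only; not taken from the DM2 genome.
print(f"{gc_content('ATGCGCTTAA'):.2f}%")
```

Applied separately to the chromosome and the plasmid records of a GenBank or FASTA file, the same function would give one value per replicon, which is how the two percentages above are reported.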

3.3. Functional Protein Classification

The genome of strain DM2 encodes 4458 proteins, of which 712 are hypothetical proteins. The predicted protein sequences were aligned against the COG database using BLASTp. A total of 3163 proteins were annotated to at least one COG category. The top protein categories are amino acid transport and metabolism; carbohydrate transport and metabolism; general function prediction only; transcription; function unknown; translation, ribosomal structure and biogenesis; coenzyme transport and metabolism; cell wall/membrane/envelope biogenesis; inorganic ion transport and metabolism; signal transduction mechanisms; and energy production and conversion (Table S1). The protein numbers under the categories of extracellular structures; lipid transport and metabolism; secondary metabolite biosynthesis, transport and catabolism; posttranslational modification, protein turnover, chaperones; replication, recombination and repair; and signal transduction mechanisms are higher than those of the model strain B. subtilis subsp. subtilis str. 168 (Table S1). These categories include a large number of stress response and environmental adaptation proteins, implying that strain DM2 has a strong capacity to adapt to its environment.

3.4. Whole-Genome Alignments Reveal Heterogeneity within Strains of B. subtilis

In general, the genetic distance and gene similarity between two organisms can be determined by DNA-DNA hybridization (DDH). Recently, the Genome Blast Distance Phylogeny (GBDP) approach was improved for in silico genome-to-genome comparison [27, 28]. The principle of this approach is, firstly, two genomes are aligned using BLAST to generate a set of high-scoring segment pairs, and secondly, a single genome-to-genome distance value is calculated from the total number of identical base pairs by a specific distance formula [28]. The DDH values between the whole genomes of strain DM2 and other B. subtilis strains, which have publicly available complete genomes, were calculated using the genome-genome distance calculator (GGDC) server at http://ggdc.dsmz.de [28]. Because the length of high-scoring pairs was used for calculation instead of the genome length, the Formula II values were used as the analysis standards. Strain DM2 was the closest to B. subtilis PY79 with 89% DDH value followed by B. subtilis NCIB 3610 and B. subtilis subsp. subtilis str. 168, both with 88.6% DDH value, suggesting that strain DM2 belongs to B. subtilis subsp. subtilis subspecies. However, DDH , which is a threshold for species delimitation in Archaea and Bacteria [28], for strain DM2 and B. subtilis subsp. stercoris, B. subtilis subsp. spizizenii, B. subtilis subsp. inaquosorum, and B. subtilis subsp. spizizenii suggest significant genome variations among these B. subtilis strains (Table 3). Pairwise genome alignments show that the genomic organization of strain DM2 has high similarity with B. subtilis subsp. subtilis str. 168, B. subtilis PY79, and B. subtilis UD1022. No rearrangement is evident, but only a few chromosomal deletions between 1178 and 1375 Kbps are observed in the chromosome. However, chromosomal inversions are observed among strains DM2, B. subtilis MJ01, and B. subtilis TO-A JPC. 
Synteny analysis showed various genome rearrangements between strains DM2 and B-1 with numerous genomic insertions, deletions, and inversions (Figure 3). These results indicate that the core genomes of the Bacillus subtilis strains are conserved.
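The reasoning behind using an HSP-based formula can be illustrated with a simplified Python sketch. It assumes that the distance is one minus the number of identical positions divided by the total HSP length, which is independent of genome length; the class and function names are hypothetical, overlapping HSPs are not handled, and the sketch does not reproduce the statistical model that the GGDC server uses to convert a distance into a DDH estimate.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class HSP:
    """One high-scoring segment pair from a genome-to-genome BLAST run."""
    length: int      # alignment length in base pairs
    identities: int  # number of identical positions in the alignment


def hsp_identity_distance(hsps: Iterable[HSP]) -> float:
    """Distance = 1 - (identical base pairs / total HSP length).

    Simplified illustration only; assumes non-overlapping HSPs.
    """
    hsp_list = list(hsps)
    for hsp in hsp_list:
        if not 0 <= hsp.identities <= hsp.length:
            raise ValueError("identities must lie between 0 and length")
    total_length = sum(hsp.length for hsp in hsp_list)
    if total_length == 0:
        raise ValueError("no aligned positions")
    total_identities = sum(hsp.identities for hsp in hsp_list)
    return 1.0 - total_identities / total_length


# Hypothetical HSPs for illustration; not results from this study.
toy_hsps = [HSP(length=1000, identities=950), HSP(length=500, identities=400)]
print(round(hsp_identity_distance(toy_hsps), 3))
```

Because the denominator is the aligned length rather than the genome length, a distance of this kind is not inflated when one genome carries extra material such as a plasmid or prophages.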

3.5. Phylogenetic Analysis of B. subtilis Strain DM2

To understand the phylogenetic relationship of strain DM2, the protein sequences of 24 housekeeping genes of the members of B. subtilis and other Bacillus species were aligned using MEGAx. The neighbor-joining phylogenetic tree shows multiple clades (Figure 4). Although the DDH value indicates the closest similarity between the strains DM2 and B. subtilis subsp. subtilis str. 168, strain DM2 belongs to a separate clade in the phylogenetic tree compared with the other petroleum-degrading strains such as B. subtilis MJ01, B. subtilis B-1, and B. krulwichiae, which are only distantly related. Furthermore, an intermix of strains B. velezensis and B. amyloliquefaciens within the clades of B. subtilis implies that these strains have the closest phylogenetic relationship. Apart from B. krulwichiae, B. pumilus, and B. safensis, which have been reported as isolated from oil-contaminated environments [29–31], most of the oil-degrading Bacillus strains reported to date belong to B. subtilis, suggesting that B. subtilis possesses the functional diversity and adaptive capacity to inhabit various environments.

3.6. Core Proteome Analysis of Strain DM2

The orthologous proteins of four B. subtilis strains, which have the closest phylogeny or functional similarity, were aligned using Proteinortho V2.3 Perl script (Figure 5(a)). A total of 3501 proteins formed the core set of proteins of the four strains. Strain DM2 shares 3925, 3777, and 3651 common orthologous proteins with strains B. subtilis 168, PY79, and MJ01, respectively. To further unravel the differences in orthologous proteins among the Bacillus strains isolated from oil-contaminated environments, the orthologous proteins of strains DM2, MJ01, and B-1 were aligned. The result shows that 1131 orthologous proteins are shared by three strains; 3651 orthologous proteins are shared by strains DM2 and MJ01, but only 1161 orthologous proteins are shared by strains DM2 and B-1 (Figure 5(b)). This result indicates that there are great differences between strains DM2 and B-1, although both are the B. subtilis members capable of degrading petroleum. The 1131 orthologous proteins shared with the three strains were further aligned using BLASTp to identify the conserved function genes. The analysis shows that, apart from housekeeping genes, the genes responsible for sporulation/spore germination proteins, chaperones, membrane transport proteins, and transcriptional regulators are the functionally conserved Bacillus genes. Moreover, the genes encoding ring-cleaving dioxygenase, fatty acid desaturase, cytochrome P450, oxidoreductase, and FAD-binding oxidoreductase are highly conserved genes in these three petroleum-degrading strains, providing the molecular basis for petroleum biodegradation.
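The shared-ortholog counts reported above amount to set intersections over ortholog groups. The Python sketch below illustrates the idea on toy data; the group identifiers, the membership table, and the function name are hypothetical and do not reflect the output format of Proteinortho.

```python
from typing import Dict, Iterable, Set


def shared_groups(membership: Dict[str, Set[str]], strains: Iterable[str]) -> Set[str]:
    """Return the ortholog-group IDs that contain a member from every listed strain."""
    wanted = set(strains)
    return {group for group, present in membership.items() if wanted <= present}


# Toy membership table: ortholog group -> strains with a member in that group.
membership = {
    "OG0001": {"DM2", "MJ01", "B-1"},
    "OG0002": {"DM2", "MJ01"},
    "OG0003": {"DM2"},
    "OG0004": {"MJ01", "B-1"},
}

core = shared_groups(membership, ["DM2", "MJ01", "B-1"])  # shared by all three
dm2_mj01 = shared_groups(membership, ["DM2", "MJ01"])     # pairwise, includes the core
dm2_only = {g for g, present in membership.items() if present == {"DM2"}}

print(len(core), len(dm2_mj01), len(dm2_only))
```

Counted this way, a pairwise set always contains the groups shared by all strains, so a pairwise count can never be smaller than the three-way count, which is the pattern seen in the numbers above.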

3.7. Characterization of Proteins Encoded by Strain DM2 Genome

The above analyses indicate that strain DM2 has the most protein-coding genes among the B. subtilis strains compared in this selected panel. Comparison of the identified protein-coding regions of strain DM2, model strain 168, and another oil-degrading strain, MJ01, shows that strain DM2 shares several common proteins with strain MJ01, including DNA-binding response regulators, EamA family transporters, NAD(P)-dependent oxidoreductases, two-component sensor histidine kinases, two-component system response regulators, and methyl-accepting chemotaxis proteins. Most of these proteins may function in stress response and signal transduction. Some of them are also involved in hydrocarbon degradation, such as NAD(P)-dependent alcohol dehydrogenases. Furthermore, some proteins are DM2-specific, such as carboxymuconolactone decarboxylase family proteins, gfo/Idh/MocA family oxidoreductases, GlsB/YeaQ/YmgE family stress response membrane proteins, HlyC/CorC family transporters, LLM class flavin-dependent oxidoreductase, and LPXTG cell wall anchor domain-containing proteins (Table S2). Most of the abovementioned proteins are absent from strain 168. These strain-specific proteins further suggest a vigorous adaptability of strain DM2 to its harsh biotope.

3.8. Horizontal Gene Transfers in the Genome of Strain DM2

The analysis of nonorthologous proteins reveals that the genome contains many horizontally transferred genes. A total of 34 genomic islands (GIs) are found in the genome, ranging from 4000 bp to 100,000 bp in size (Table 4). Of the total 510 genes, 330 genes are annotated as hypothetical proteins with unknown function, and 37 are annotated as recombinase- and phage-related proteins, most of which are phage-specific site integrases. Most of the remaining genes in the GIs are associated with metabolism (22), transcriptional regulation (33), signal transduction, and membrane transport (10). Notably, several genes, including those for glycosyl transferase family A, fatty acid desaturase, short chain dehydrogenase family protein, stress response protein, and cold-shock protein, which are involved in glycosyl transfer, lipid metabolism, and stress response, are found in the GIs. The harboring of these genes in GIs suggests that horizontal gene transfer contributes to metabolic diversity [26] and confers several functional genes on strain DM2 to cope with the harsh environment and to promote petroleum degradation [32]. In addition, a total of 5 prophages are found in the genome of strain DM2, which comprise an intact (123 kb), two incomplete (30 kb), and two questionable (28 kb and 61 kb) prophages (Figure 6). The protein sequences of the prophages were further searched in the Nr database using BLASTp, and the result indicates that only the intact prophage proteins originated from B. subtilis, whereas the remaining protein sequences evolved from outgroup prophages of Bacillus. The presence of prophages in the genome reflects phage-related genetic modifications, and prophages are well known to regulate bacterial population density. Therefore, the gene transfer that occurred in strain DM2 may play critical roles in the acquisition of resistance genes and adaptation to harsh environments [33].

3.9. KEGG Pathway Enrichment Reveals the Metabolic Character of Strain DM2

3.9.1. Metabolism

A total of 600 genes are enriched in the metabolic and biosynthetic pathways, including the complete gene sets for glycolysis, the TCA cycle, and the pentose phosphate pathway; however, the key gene coding for 2-dehydro-3-deoxy-phosphogluconate aldolase in the Entner-Doudoroff pathway is lacking. Strain DM2 can use sucrose, fructose, galactose, rhamnose, mannose, and C5-branched dibasic acid as substrates. However, the key genes involved in the pathways of fucose, allose, and sorbitol are lacking. The genes involved in polysaccharide metabolism, such as dextranase and amylase genes, are found in the strain DM2 genome. All the genes required for the anabolic pathways of amino acid, purine, and pyrimidine synthesis are present in the genome. A total of 18 genes are involved in the nitrogen metabolic pathway, of which 7 genes (nirB, nirC, nirD, narG, narH, narI, and narJ) encode components of the nitrate and nitrite reduction machinery that converts nitrate to ammonia. In addition, 2 genes encode nitronate monooxygenases that convert nitroalkane to nitrite. The lack of a gene coding for nitrogenase suggests the absence of nitrogen fixation. All key enzymes involved in the synthesis of cysteine from sulfate are found in the genome, but the gene coding for the sulfate transport system substrate-binding protein (CysP) is absent. Thus, the presence of genes coding for sulfonate transport protein (ssuA) and alkanesulfonate monooxygenase (ssuD) in the DM2 genome implies that the strain uses alkanesulfonate rather than sulfate as a sulfur supply (Figure 7).

3.9.2. Osmoprotectant Transport Systems

The osmoprotectant transport system (Opu) in the genome of the strain comprises two opuA (orf3675 and orf3680), four opuBD (orf3672, orf3674, orf3677, and orf3679), and two opuC (orf3673 and orf3678) genes. The genes that are involved in the absorption and synthesis of glutamate, which acts as osmoprotectant, include three gltA (orf0968, orf2624, and orf3162), a gltD (orf2036), a gltB (orf2037), a gltC (orf2038), and two gltP/T (orf0240 and orf1048) genes.

3.9.3. Pathways for Degradation of Petroleum Hydrocarbons and Xenobiotics

Strain DM2 genome harbors a total of 37 genes that may be responsible for hydrocarbon degradation. Of those, 11 genes encode dioxygenases, 13 genes encode monooxygenases, 8 genes encode cytochrome P450 enzymes, and single genes encode fatty acid desaturase, dihydropteridine reductase, and NADH-dependent butanol dehydrogenase. Among them, catechol-2,3-dioxygenase, biphenyl-2,3-dioxygenase, 4-hydroxyphenylacetate 3-monooxygenase, cytochrome P450 CYP102A2_3, and ring-cleaving dioxygenase are the key aromatic degradation enzymes. Some monooxygenases, cytochrome P450 enzymes, NADH-dependent butanol dehydrogenases, fatty acid beta-hydroxylases, and fatty acid desaturases are involved in the degradation pathways of alkanes and alkenes (Table 5). The enzyme fatty acid desaturase is also an important member that plays roles in the adaptability to low temperature [13]. Interestingly, a ubiquitous gene coding for alkane monooxygenase (AlkB) is not found in the genome of strain DM2. Additionally, another gene coding for bacterial luciferase (LadA) involved in the degradation of long-chain hydrocarbon [34], which is harbored in the genomes of strains 168 and MJ01, is not found in DM2 genome.

The xenobiotic biodegradation and metabolism pathways consist of benzoate degradation, aminobenzoate degradation, chloroalkane and chloroalkene degradation, bisphenol degradation, azathioprine and 6-mercaptopurine degradation, fluorouracil degradation, and citalopram degradation (Table 6).

3.9.4. Stress Response and Signaling

The genomic analysis indicates that strain DM2 has well-developed systems of stress response and signal transduction. Many genes that are related to environmental stress response and signal transduction are found in the genome of strain DM2. A total of 122 genes encode two-component signal transduction proteins, including 19 histidine kinases, 21 response regulators, 11 sporulation and bacterial movement-related genes, and 8 diguanylate cyclases (Table 7). Of those, the histidine kinases act as stimulus sensors and play a critical role in signal transduction [35]. These histidine kinase genes are adjacent to the genes that code for response regulators in the genome, suggesting that strain DM2 has a highly efficient two-component signaling system and may respond efficiently to environmental signals [36]. Diguanylate cyclase catalyzes the formation of cyclic diguanylate monophosphate, a ubiquitous second messenger involved in various bacterial metabolic and growth processes [37]. The common Sec-dependent secretion system and twin-arginine targeting secretion system encoded in the genome facilitate substance exchange across the cell membrane and may even allow the strain to remodel the environment for its growth [38]. In addition, strain DM2 contains 18 chaperone genes, including those for RNA chaperone, molecular chaperone, nitrate reductase molybdenum cofactor assembly chaperone, copper chaperone, flagellar biosynthesis chaperone, heat shock proteins, and cold shock proteins. Chaperones stabilize protein conformations and have been shown to contribute to bacterial growth at low temperatures [39]. Strain DM2 also possesses two biotin carboxylase genes, which have been reported to be expressed at low temperature [40].
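The observation that histidine kinase genes lie next to response regulator genes can be checked programmatically from an ordered annotation table. The Python sketch below is a minimal illustration that matches product descriptions by keyword; the locus tags and products are invented, the function name is hypothetical, and strand orientation and the wrap-around of a circular chromosome are deliberately ignored.

```python
from typing import List, Tuple

Gene = Tuple[str, str]  # (locus tag, annotated product), in chromosomal order


def find_adjacent_pairs(genes: List[Gene]) -> List[Tuple[str, str]]:
    """Return neighbouring gene pairs made of a histidine kinase and a response regulator."""
    pairs = []
    for (tag_a, prod_a), (tag_b, prod_b) in zip(genes, genes[1:]):
        a, b = prod_a.lower(), prod_b.lower()
        kinase_then_regulator = "histidine kinase" in a and "response regulator" in b
        regulator_then_kinase = "response regulator" in a and "histidine kinase" in b
        if kinase_then_regulator or regulator_then_kinase:
            pairs.append((tag_a, tag_b))
    return pairs


# Invented annotation rows, for illustration only.
toy_genes = [
    ("gene_001", "ABC transporter permease"),
    ("gene_002", "two-component sensor histidine kinase"),
    ("gene_003", "DNA-binding response regulator"),
    ("gene_004", "hypothetical protein"),
    ("gene_005", "two-component system response regulator"),
    ("gene_006", "sensor histidine kinase"),
]

adjacent = find_adjacent_pairs(toy_genes)
print(adjacent)
```

Keyword matching depends on the annotation vocabulary, so a real analysis would more likely rely on domain assignments than on product names.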

3.9.5. ABC Transport Systems

Another prominent feature of the genome of strain DM2 is its powerful membrane transport systems, particularly the ABC transport systems. A total of 310 genes are related to the membrane transport systems. Among them, 138 genes encode ABC transporter components, including ATP-binding proteins, substrate-binding proteins, and permeases (Table 8). ABC transporters play important roles in active transmembrane transport, acting as alkanesulfonate transporters, glycine betaine/proline phosphate transporters, amino acid transporters, and osmoprotectant transporters. ABC transporters mediate the transport of glutamine/cystine/D-methionine, maltose/maltodextrin/galactose oligomer, raffinose/stachyose/melibiose, oligopeptide, dipeptide, biotin, and bacitracin, as well as iron complex/iron II, zinc, manganese, and Na+ (Table 6). In addition, the DM2 genome contains several phosphotransferase systems (PTS), which are major active transport systems for carbohydrates, indicating that these transporters are responsible for carbohydrate transport into cells.

4. Conclusion

The B. subtilis strain DM2 isolated from petroleum-contaminated soil on the Tibetan Plateau displays a great capacity to degrade petroleum at a low temperature. The complete genome sequencing and genomic analysis of strain DM2 help us to unravel its biological features that enable it to successfully utilize hydrocarbons as carbon source and potentially withstand other environmental challenges. Strain DM2 is clustered as a separate and a higher evolutionary clade in the phylogenetic tree based on 24 housekeeping protein sequences, implying its unique position with respect to other B. subtilis strains. Strain DM2 possesses the largest genome and the most protein-coding genes relative to the other compared B. subtilis strains. DDH values show that strain DM2 belongs to B. subtilis subsp. subtilis, but significant variations in the genome occurred with respect to the other strains or subspecies. Comparative genomic analysis identified the core proteome common to strain DM2, model strain B. subtilis subsp. subtilis 168, and other B. subtilis strains. Strain DM2 possesses almost the same strain-specific proteins as strain MJ01, which is another oil-degrading B. subtilis strain, unlike strain B. subtilis subsp. subtilis 168. Furthermore, many strain DM2-specific proteins were also identified, such as carboxymuconolactone decarboxylase family protein, gfo/Idh/MocA family oxidoreductases, GlsB/YeaQ/YmgE family stress response membrane protein, HlyC/CorC family transporters, LLM class flavin-dependent oxidoreductase, and LPXTG cell wall anchor domain-containing protein. Most of these strain-specific proteins have been shown to be involved in the pathways related to stress response, signaling, and hydrocarbon degradation, suggesting that the main feature of the DM2 genome is the evolutionary occurrence of many genes related to environmental adaptation and carbon utilization. 
The genomic information provided by the present study might help us to further reveal the genetic and genomic characters of Bacillus subtilis, which is a ubiquitous and important bacterial species.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Acknowledgments

This work was supported by the National Natural Science Funds of China (Grant Nos. 31760110 and 31560121).

Supplementary Materials

Supplementary Table S1: comparative analysis of COG categories between strain DM2 and strain 168. Supplementary Table S2: list of the strain DM2-specific proteins versus strains PY79, 168, and MJ01.