Abstract

Anaplasma marginale is the main etiologic agent of bovine anaplasmosis, and it is extensively distributed worldwide. We have previously reported the first genome sequence of a Mexican strain of A. marginale (MEX-01-001-01). In this work, we report the genomic analysis of one strain from Hidalgo (MEX-14-010-01), one from Morelos (MEX-17-017-01), and two strains from Veracruz (MEX-30-184-02 and MEX-30-193-01). We found that the average genome size is 1.16-1.17 Mbp with a GC content close to 49.80%. The genomic comparison reveals that most of the A. marginale genomes are highly conserved, and the phylogeny shows that Mexican strains cluster with Brazilian strains. The genomic information contained in the four draft genomes of A. marginale from Mexico will contribute to understanding the molecular landscape of this pathogen.

1. Introduction

Bovine anaplasmosis is an infectious, tick-borne disease caused mainly by Anaplasma marginale; typical signs include anemia, fever, abortion, weight loss, decreased milk production, jaundice, and potentially death. Although a sick bovine may recover when antibiotics are administered, it usually remains a carrier for life and poses a risk of infection for susceptible cattle. Anaplasma marginale is an obligate intracellular Gram-negative bacterium with a genetic composition that is highly diverse among geographical isolates [1]. Currently, there are no fully effective vaccines against bovine anaplasmosis; therefore, the economic losses due to the disease persist. Whole-genome sequencing (WGS) has been an applicable tool for many studies of pathogenic bacteria since 1995, when the first bacterial genomes were determined [2, 3]. Vaccine formulation has become a hard task for pathogens as diverse as Anaplasma marginale, and almost all efforts have been directed toward Outer Membrane Proteins (Omp), Type IV Secretion System (T4SS), and Major Surface Proteins (Msp) [4–8]. To date, several A. marginale genomes have been reported, but only one is from a Mexican strain [9]. New data could be useful for focusing on alternative antigens that induce specific and protective responses against bovine anaplasmosis. In this work, we present draft genomes from four Mexican strains of Anaplasma marginale. In addition, a first approach to comparative analyses between them and Brazilian, Australian, and North American strains is shown. Further analyses are necessary to advance in the identification of potential vaccine molecules; pathogenicity, transmission, and infection mechanisms; and the genetic diversity of Anaplasma marginale.

2. Materials and Methods

2.1. Strain Origin

Each of the four Mexican strains was isolated from the infected blood of animals from Atitalaquia, Hidalgo (MEX-14-010-01); Puente de Ixtla, Morelos (MEX-17-017-01); Tlapacoyan, Veracruz (MEX-30-184-02); and Veracruz, Veracruz (MEX-30-193-01). The infected blood was collected and kept at -80°C until use.

2.2. Genome Sequencing, Assembly, and Annotation

We used 200 μl of bovine blood for each isolate to extract genomic DNA using the UltraClean DNA BloodSpin kit (Mo Bio Laboratories). The library preparation was performed by the University of Arizona Genetics Core, using a DNA TruSeq library construction kit (Illumina). Two micrograms of genomic DNA for each isolate was sequenced with the MiSeq platform (Illumina), which uses sequencing-by-synthesis (SBS) chemistry. The Illumina adapter sequences were removed from paired-end reads using the ILLUMINACLIP trimming step of the Trimmomatic (version 0.36) program with default settings [10]. Low-quality bases were removed using the dynamictrim algorithm of the SolexaQA++ (version 3.1.7.1) suite [11] with a Phred quality score . The resulting paired-end reads were de novo assembled using the SPAdes (version 3.11.1) program [12] with the following options: (i) run only the assembly module (--only-assembler), (ii) reduce the number of mismatches (--careful), and (iii) k-mer lengths between 21 and 127. The G+C content of each assembled contig was calculated using a Python script (https://github.com/FernandoMtzMx/GC_content_MultiFasta). Because A. marginale genomes reported in databases have a G+C content between 46 and 52%, this value was used to differentiate contigs of the four Mexican strains from contigs that belong to other organisms (i.e., bovine genomes). In addition, we aligned the sequence of each assembled contig against the nucleotide collection (nr/nt) database, with Anaplasma marginale as the organism name, using the BLASTN suite [13]. Contigs whose alignments to A. marginale genomes had a coverage higher than 50% and an identity higher than 70% were considered “reasonably good” alignments [14]. The features of the four draft genomes were evaluated using the QUAST (version 4.6.2) program [15].
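The two contig-screening criteria described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script from the cited repository: the function names (read_fasta, gc_percent, in_gc_window, is_good_hit) and the toy contigs are hypothetical, ambiguous bases are assumed to be excluded from the G+C denominator, and the BLAST coverage and identity values are assumed to have been parsed already from the BLASTN output.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from the lines of a multi-FASTA file."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(seq: str) -> float:
    """G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def in_gc_window(seq: str, low: float = 46.0, high: float = 52.0) -> bool:
    """True if the contig falls in the 46-52% G+C range cited in the text."""
    return low <= gc_percent(seq) <= high


def is_good_hit(coverage_pct: float, identity_pct: float,
                min_cov: float = 50.0, min_id: float = 70.0) -> bool:
    """True if a BLASTN hit exceeds both thresholds (strictly higher than)."""
    return coverage_pct > min_cov and identity_pct > min_id


# Toy contigs (hypothetical sequences, for illustration only).
example = """>contig_1
ATGCATGCGA
>contig_2
ATATATATGC
>contig_3
GCGCGCGCAT
""".splitlines()

kept = [name for name, seq in read_fasta(example) if in_gc_window(seq)]
print(kept)
```

In this sketch the G+C window and the BLAST thresholds are applied as independent checks; how the two criteria were combined in practice is not specified in the text.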

The draft genomes of the four Mexican strains were annotated automatically using the RAST (version 2.0) server [16], and the 16S rRNA gene sequences were obtained using the RNAmmer (version 1.2) server [17].

2.3. Genomic Comparison

The Blast Ring Image Generator (BRIG) (v0.95) program [18] was used to compare the genomes of the Mexican A. marginale strains with those of six strains from Australia, Brazil, and the United States. The circular comparative genomic map was constructed by BRIG using the GenBank files (gbk format) with default parameters and the local NCBI blast-2.9.0+ suite.

2.4. Phylogenetic Analysis

The sequences of the housekeeping genes 16S rRNA, gyrA, gyrB, groEL, and rpoB were obtained from the genomes of MEX-01-001-01 (Aguascalientes, Aguascalientes), MEX-14-010-01 (Atitalaquia, Hidalgo), MEX-17-017-01 (Puente de Ixtla, Morelos), MEX-30-184-02 (Tlapacoyan, Veracruz), and MEX-30-193-01 (Veracruz, Veracruz). The gene sequence datasets of the Mexican strains were compared to 13 downloaded gene sequence datasets of A. marginale, A. centrale, A. ovis, A. phagocytophilum, and, as outgroups, Ehrlichia canis and E. ruminantium, which were obtained from the GenBank database (https://www.ncbi.nlm.nih.gov/) using the nucleotide BLAST suite (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Multiple alignments of all gene sequence datasets were made using the MUSCLE (v3.8.31) program [19]. The aligned sequences of each genome were concatenated using a Python script. The jModelTest (v2.1.10) program [20] was used to select the best model of nucleotide substitution using the Akaike information criterion (AIC). The phylogenetic tree was inferred by the maximum likelihood method using the PhyML (v3.1) program [21] with 1000 bootstrap replicates. The phylogenetic tree was visualized and edited using the FigTree (v1.4.3) program (http://tree.bio.ed.ac.uk/software/figtree/).
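The concatenation and model-selection steps above can be illustrated with a short Python sketch. It is not the authors' script: the function names, the genome labels (strain_A, strain_B), the toy aligned sequences, and the log-likelihood values are all hypothetical. The AIC is computed with the standard formula AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; padding a missing gene with gaps is an assumption of this sketch.

```python
from typing import Dict, List, Tuple


def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene aligned sequences into one concatenated row per genome.

    Each element of `alignments` maps a genome name to its aligned sequence
    for one gene. A genome absent from a gene is padded with gaps so that all
    concatenated rows keep the same length (an assumption of this sketch).
    """
    genomes = sorted({g for aln in alignments for g in aln})
    parts: Dict[str, List[str]] = {g: [] for g in genomes}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must share one length")
        width = lengths.pop()
        for g in genomes:
            parts[g].append(aln.get(g, "-" * width))
    return {g: "".join(p) for g, p in parts.items()}


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model_by_aic(models: Dict[str, Tuple[float, int]]) -> str:
    """Return the model name with the lowest AIC; values are (ln L, k)."""
    return min(models, key=lambda name: aic(*models[name]))


# Toy alignments for two genes (hypothetical sequences).
gene_1 = {"strain_A": "ATG-C", "strain_B": "ATGAC"}
gene_2 = {"strain_A": "GGT", "strain_B": "GGC"}
supermatrix = concatenate_alignments([gene_1, gene_2])

# Made-up likelihoods and parameter counts, for illustration only.
candidates = {"JC": (-1050.0, 0), "GTR+G": (-1000.0, 9)}
print(supermatrix, best_model_by_aic(candidates))
```

Programs such as jModelTest also count branch lengths among the free parameters; the sketch leaves that choice to the caller through the n_params argument.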

2.5. Genome Synteny Analysis

The presence of large-scale evolutionary events, such as rearrangement and inversion of genomic segments, was detected by aligning the ordered contigs of the four Mexican strains and the genomes of the Florida, Dawn, Gypsy Plains, Jaboticabal, and Palmeira strains (GenBank accession numbers CP001079.1, CP006847.1, CP006846.1, CP023731.1, and CP023730.1, respectively) against the reference genome of A. marginale St. Maries (GenBank accession number CP000030.1), using the “Align with progressiveMauve” algorithm of the Mauve program (v2.4.0) [22].

3. Results

3.1. Data Availability

The Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accessions VTSO00000000 (MEX-14-010-01), VTCX00000000 (MEX-17-017-01), VTCY00000000 (MEX-30-184-02), and VTCZ00000000 (MEX-30-193-01).

3.2. Genome Features

We obtained four random datasets of 1,418,888 (MEX-14-010-01); 567,482 (MEX-17-017-01); 671,050 (MEX-30-184-02); and 940,598 (MEX-30-193-01) paired-end reads of 300 bp, which were reported in the GenBank database. The numbers of reads obtained after the Trimmomatic and dynamictrim analyses are as follows: MEX-30-184-02, 669,396; MEX-30-193-01, 940,392; MEX-17-017-01, 567,434; and MEX-14-010-01, 1,418,550; the corresponding assembly coverage is 27X, 154X, 58X, and 22X, respectively.
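As a quick check on the effect of trimming, the proportion of reads retained for each strain can be computed directly from the counts reported above. The short Python sketch below uses only those read counts; the variable and function names are illustrative.

```python
# Read counts before and after trimming, as reported in the text.
raw_reads = {
    "MEX-14-010-01": 1_418_888,
    "MEX-17-017-01": 567_482,
    "MEX-30-184-02": 671_050,
    "MEX-30-193-01": 940_598,
}
trimmed_reads = {
    "MEX-14-010-01": 1_418_550,
    "MEX-17-017-01": 567_434,
    "MEX-30-184-02": 669_396,
    "MEX-30-193-01": 940_392,
}


def retained_percent(raw: int, trimmed: int) -> float:
    """Percentage of reads that remain after adapter and quality trimming."""
    return 100.0 * trimmed / raw


retained = {
    strain: retained_percent(raw_reads[strain], trimmed_reads[strain])
    for strain in raw_reads
}

for strain, pct in retained.items():
    print(f"{strain}: {pct:.2f}% of reads retained")
```

With these counts, more than 99.7% of the reads are retained for every strain, so trimming removed only a small fraction of each dataset.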

The GC content is 49.79% for the MEX-14-010-01, MEX-17-017-01, and MEX-30-184-02 strains and 49.80% for MEX-30-193-01. The GC percentages of the Mexican strains are very similar to that of the reference strain A. marginale St. Maries (49.80%). Other genomic characteristics of each strain are shown in Table 1.

According to the RNAmmer server, the 16S rRNA genes of all Mexican strains have a length of 1,491 bp, with 100% alignment coverage and between 99 and 100% identity to the 16S rRNA gene sequence of A. marginale strain St. Maries (GenBank accession CP000030). The number of rRNAs and tRNAs in the four strains is 3 and 37, respectively. The number of protein-coding genes for MEX-14-010-01, MEX-17-017-01, MEX-30-184-02, and MEX-30-193-01 is 1150, 1163, 1165, and 1138, respectively. The information derived from the SEED subsystem of the RAST server for each strain is shown in Table 2.

3.3. Genomic Comparison and Phylogeny

We compared the four Mexican draft genomes of A. marginale with those of Brazilian, Australian, and North American strains. The genomic comparison is shown in Figure 1. Although most of the genomes are highly conserved, the Dawn and Gypsy Plains strains showed some differences from the Mexican, North American, and Brazilian strains. We randomly selected fourteen ORFs that were predicted as membrane proteins in the genomic annotation and located them in the genomes; most of these proteins are conserved in all genomes (Figure 2).

3.4. Genome Synteny Analysis

The genome synteny of 10 A. marginale genomes of Australian, Brazilian, Mexican, and North American strains shows that the first 100,000 bases have a rearrangement of several small fragments (Figure 3). In addition, the genome synteny of A. marginale shows that the Australian, Brazilian, and North American strains have a highly conserved genome structure, while the genomes of the Mexican strains show some rearrangement and inversion of genomic segments (Figure 3). In general, the A. marginale genomes share a high percentage of coverage, and their structure is widely conserved across different geographical regions of the world.

4. Discussion

So far, only one draft genome of a Mexican strain of A. marginale has been reported [9]. In this work, we present the genomic information of four other strains: MEX-14-010-01, MEX-17-017-01, MEX-30-184-02, and MEX-30-193-01.

The genomic analysis reveals that the genome size (ranging from 1,167,111 bp to 1,176,681 bp) and GC content (about 49.79%) of these strains are very similar to those of other A. marginale strains reported in GenBank, such as the reference genome of the St. Maries strain, with a genome size of 1,197,690 bp and a GC content of 49.80%.

The numbers of genes and CDSs are very similar in the four strains. In the genome annotation, using the subsystem classification of the RAST server, we identified genes related to cell wall and capsule; virulence, disease, and defense; membrane transport; and protein and DNA metabolism, among others. In the virulence, disease, and defense category, we found genes associated with cobalt-zinc-cadmium resistance, fluoroquinolone resistance, copper homeostasis, and beta-lactamase. We also identified genes of the Mycobacterium virulence operon involved in protein synthesis (SSU and LSU ribosomal proteins) and of the Mycobacterium virulence operon involved in DNA transcription. The Mycobacterium operon is present in several species, including Mycobacterium tuberculosis, Streptococcus pneumoniae, Bartonella bovis, and Streptococcus suis, among other animal and plant pathogens [23–25].

In the stress response category, we found genes associated with oxidative stress, cold shock, heat shock, periplasmic stress response, and detoxification. For most of the obligate intracellular bacteria, the presence of peptidoglycan is not necessarily needed to maintain the integrity of the bacterial cell. In A. marginale, there are no reports of the analysis or isolation of its peptidoglycan [26]; however, we identified genes associated with the cell wall and capsule, specifically with the peptidoglycan biosynthesis. An interesting feature of A. marginale genomes is the role of the genes that we found in nitrogen metabolism. In alphaproteobacteria, the role of nitrogen metabolism may be essential for full virulence [27].

The phylogenetic analysis indicates that the Mexican strains are more closely related to Brazilian strains than to North American ones. The genomic comparison reveals a high percentage of identity between A. marginale genomes, as also observed in the genome synteny analysis, where most of the strains are highly conserved in their structure and the Mexican strains have some rearrangements and inversions in certain genomic segments.

The report of four draft genomes of A. marginale found in Mexico represents a first approach to unveil information that could help to develop new strategies for the design of vaccines against bovine anaplasmosis and new diagnostic methods. Still, more genomic analyses are needed to complete the molecular landscape of this pathogen.

5. Conclusions

Here, we present the genomic report and analyses of four Mexican strains of A. marginale, the causal agent of bovine anaplasmosis. So far, only one genome of a Mexican strain has been reported; with this contribution, we compare our results with information on strains from the USA, Brazil, and Australia and provide more information on this pathogen.

Data Availability

The Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accessions VTSO00000000 (MEX-14-010-01), VTCX00000000 (MEX-17-017-01), VTCY00000000 (MEX-30-184-02), and VTCZ00000000 (MEX-30-193-01).

Additional Points

Reference Numbers for Data Available in GenBank. MEX-30-184-02 Anaplasma marginale (GenBank ID: VTCY00000000); MEX-30-193-01 Anaplasma marginale (GenBank ID: VTCZ00000000); MEX-17-017-01 Anaplasma marginale (GenBank ID: VTCX00000000); MEX-14-010-01 Anaplasma marginale (GenBank ID: VTSO00000000).

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by Fondo INIFAP SIGI number 12353134456 and SEP-212 CONACYT number 168167. Scholarship number 293552 from the Consejo Nacional de Ciencia y Tecnología (CONACYT) was awarded to Fernando Martínez-Ocampo.