Papaya is a major fruit crop in the tropics and has recently evolved sex chromosomes. Towards sequencing the papaya sex chromosomes, two bacterial artificial chromosome (BAC) libraries were constructed from papaya male and female genomic DNA. The female BAC library was constructed using restriction enzyme BstY I and consists of 36,864 clones with an average insert size of 104 kb, providing 10.3x genome equivalents. The male BAC library was constructed using restriction enzyme EcoR I and consists of 55,296 clones with an average insert size of 101 kb, providing 15.0x genome equivalents. The male BAC library was used in constructing the physical map of the male-specific region of the Y chromosome (MSY) and in filling gaps and extending the physical map of the hermaphrodite-specific region of the Yh chromosome (HSY) and the X chromosome physical map. The female BAC library was used to extend the X physical map into its remaining gap. The MSY, HSY, and X physical maps offer a unique opportunity to study chromosomal rearrangements, Y chromosome degeneration, and dosage compensation of the papaya nascent sex chromosomes.

1. Introduction

Papaya (Carica papaya L.) is a tree crop grown in tropical and subtropical areas for its edible, vitamin-rich fruit. It is an ideal model species for genomic and evolutionary studies. Papaya is diploid, with 9 pairs of chromosomes, and has a relatively small genome of 372 Mb [1]. It is a fast-growing perennial tree that can produce fruit as early as 9 months from planting. Papaya trees flower continuously throughout the year. Papaya fruits have a high seed number per fruit, providing abundant progeny from controlled crosses for genetic and genomic research. Papaya has efficient genetic transformation systems with Agrobacterium- and microprojectile-mediated transformations, the latter of which was used to generate transgenic papaya that is resistant to papaya ringspot virus [2, 3].

Papaya is in the order Brassicales, which also includes Brassica rapa and Arabidopsis thaliana, two other popular model species. Brassica and Arabidopsis are in the family Brassicaceae, which diverged from a common ancestor with Caricaceae, the papaya family, about 72 million years ago [4, 5]. Papaya is interesting evolutionarily, because it has three sex types: male, female, and hermaphrodite, controlled by incipient sex chromosomes that were estimated to have diverged about 2-3 million years ago [6, 7]. The female papaya genome has been sequenced, making papaya genetic and genomic studies efficient [8].

Because papaya has nascent sex chromosomes, it offers a unique opportunity to study early events of sex chromosome evolution. Sex chromosomes are thought to have originated from closely linked male- and female-sterile mutations, leading to dioecy [9]. Suppression of recombination occurs around this sex determination region, allowing for mutations and chromosomal rearrangements, and eventually leading to the degeneration of the Y chromosome. Papaya sex chromosome evolution appears to follow this model. Female papaya trees have two X chromosomes, the males have an X and a Y chromosome, and the hermaphrodites have an X and a slightly different Y chromosome, denoted as Yh. The majority of the X and Y chromosomes are homologous, except for a small sex determination region around the centromere, which is unable to recombine. Degeneration of the MSY and HSY has occurred, as seen by the increased retroelements and extreme gene paucity that were found in 5 sequenced HSY BACs [10].

BAC libraries are essential tools for physical mapping and map-based cloning of target genes. BAC libraries have been constructed and utilized in the physical mapping of many plant species such as Arabidopsis [11, 12], apple [13, 14], and grapevine [15, 16]. The BAC library of the papaya hermaphrodite [17] played an integral role in the assembly of the scaffolds of the papaya whole genome sequence [8], the construction of the papaya physical map [18], and the annotation of genes found in the HSY and its corresponding X region (R. Ming, unpublished data).

We report here the construction and application of two papaya BAC libraries, one from male and the other from female. These two BAC libraries were constructed for physical mapping of the MSY, gap filling of the HSY and the corresponding region of the X chromosome physical maps, cloning of the sex determination genes, and comparison of the gene content and chromosomal rearrangements of these regions of interest.

2. Materials and Methods

2.1. Plant Material

The papaya cultivars SunUp (female) and AU9 (male) were the sources of the genomic DNA for the female and male BAC library construction, respectively. SunUp is the virus-resistant transgenic papaya line that was used to sequence the papaya female genome [8]. Plant tissue for both SunUp female and AU9 male was kept in dark conditions for 24 hours prior to harvesting. Approximately 50 grams of young, expanding leaf tissue were harvested and immediately flash frozen in liquid nitrogen prior to nuclei preparation.

2.2. Papaya Nuclear Isolation and BAC Vector Preparation

Papaya high molecular weight DNA was isolated from the young leaves according to the procedure described by Zhang et al. [19]. Approximately 5 g of leaves were homogenized by grinding in a mortar with liquid nitrogen. The powdered leaf tissue was transferred into 200 mL of HB (homogenization buffer: 10 mM Tris, 80 mM KCl, 10 mM EDTA, 1 mM spermine, 1 mM spermidine, and 0.5 M sucrose, pH 9.4–9.5) plus 0.15% β-mercaptoethanol and 0.5% Triton X-100 and gently stirred for 10 min. The homogenate was filtered into a beaker through one layer of miracloth and two layers of cheesecloth. The filtrate was divided into four 45 mL aliquots placed in 50 mL tubes and centrifuged at 1000 ×g for 20 min at 4°C. The supernatant was decanted and the pellet was resuspended in 45 mL of HB plus Triton and β-mercaptoethanol. The resuspended nuclei were filtered through a #50 mesh fine filter, rinsed with 200 mL HB, divided into 45 mL fractions in each of four 50 mL tubes, and centrifuged at 1000 ×g for 20 min at 4°C. The washing process was repeated three times. After the third wash, the pelleted nuclei in each tube were resuspended in 10 mL HB without Triton and β-mercaptoethanol and combined into one tube. The nuclei were then mixed with an equal volume of 1% low-melting-point agarose prewarmed to 45°C and poured into plug molds on ice, 100 µL per plug. When the agarose was completely solidified, the plugs were transferred into 5–10 volumes of lysis buffer (0.5 M EDTA pH 9.0–9.3, 1% sodium lauryl sarcosine, and 0.1 mg/mL proteinase) and stored at 4°C.

Partial digestion of DNA with the restriction enzymes EcoR I and BstY I, separation of the partially digested DNA using pulsed-field gel electrophoresis (PFGE), and selection of high molecular weight DNA fragments were conducted following Luo and Wing [20]. The preparation of the EcoR I and BstY I cloning-ready single copy pIndigoBAC536 and pIndigoBAC5 vectors from the high copy pCUGIBAC1 plasmid was performed according to Luo et al. [21].

2.3. BAC Library Construction

The size-selected high molecular weight DNA fragments were ligated into BAC vectors and transformed into E. coli strain DH10B (Invitrogen, Carlsbad, CA, USA). AU9 male genomic DNA was digested with EcoR I and ligated to BAC vector pIndigoBAC536. SunUp female genomic DNA was digested with BstY I and ligated to BAC vector pIndigoBAC5. White recombinant colonies were selected on LB plates containing chloramphenicol, X-Gal, and IPTG and picked robotically using the Genetix Q-bot (Genetix, UK). The picked colonies were transferred into individual wells of 384-well microtiter plates containing freezing medium and stored at −80°C. These BAC libraries are available at Texas A&M University on a cost-recovery basis by contacting Dr. Qingyi Yu.

2.4. BAC Library Screening

The BAC libraries were gridded in high density onto 11.25 × 22.25 cm Hybond N+ membranes (Genetix, Hampshire, UK) by Q-bot (Genetix), in 4 × 4 patterns with double spots (18,432 clones represented per filter). Some of the fields on the male BAC library membranes were of poor quality, so only 18,432 male clones, in 6 fields across the 3-membrane set, were screened with each probe. The 2 membranes for the female library were of high quality, so all 36,864 clones were screened per probe. To characterize the BAC inserts, BAC DNA was isolated according to standard alkaline lysis conditions in a 96-well format, digested with Not I, and separated by PFGE on a 1% agarose gel in 0.5× TBE buffer under the following conditions: 5–15 sec linear ramp time, 6 V/cm, and 14°C for 15 hours. Gels were stained with ethidium bromide.
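The gridding figures above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only: it assumes that "double spots" means each clone occupies two of the 16 positions in a 4 × 4 block and that each field carries one block per well position of a 384-well plate. Neither assumption is stated explicitly in the text, and the function names are hypothetical.

```python
# Illustrative check of the filter-gridding arithmetic.
# ASSUMPTIONS (not stated in the text): each clone is spotted twice within
# a 4 x 4 block, and each field holds one block per well of a 384-well plate.

WELLS_PER_PLATE = 384   # 384-well microtiter plates (from the text)
PATTERN_SIDE = 4        # 4 x 4 spotting pattern (from the text)
REPLICATES = 2          # "double spots": every clone is spotted twice


def clones_per_block(side: int = PATTERN_SIDE, replicates: int = REPLICATES) -> int:
    """Number of distinct clones that fit into one side x side block of spots."""
    return (side * side) // replicates


def clones_per_field(wells: int = WELLS_PER_PLATE) -> int:
    """Distinct clones in one field, assuming one block per well position."""
    return wells * clones_per_block()


def fields_per_filter(clones_per_filter: int) -> int:
    """Fields needed to hold the stated number of clones on one filter."""
    return clones_per_filter // clones_per_field()


print(clones_per_block())          # 8 clones per 4 x 4 block
print(clones_per_field())          # 3072 clones per field
print(fields_per_filter(18_432))   # 6 fields per filter
print(6 * clones_per_field())      # 18432 male clones in the 6 usable fields
```

Under these assumptions, six fields hold exactly 18,432 clones, which matches both the stated capacity of one filter and the number of male clones screened per probe.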

DNA probes were designed from BAC end sequences of hermaphrodite [18, 22] and male BACs of interest. One to four probes were hybridized to the appropriate library membranes following the protocol of the DIG High Primer DNA Labeling and Detection Starter Kit II (Roche). Positive BAC clones were checked for false positives by colony PCR with BAC-specific primers. Individual positive BAC clones were inoculated in 15 mL of LB and allowed to grow overnight. Cultures were centrifuged, and the pellets were resuspended in a glucose, EDTA, Tris-Cl resuspension solution, lysed in a 20% SDS and 4 N NaOH solution, and neutralized in KOAc, pH 5.0. The mixture was frozen and centrifuged, and the pellets were washed with isopropanol and 70% ethanol. The pellets were resuspended in TE buffer. Insert sizes were determined by Not I digestion followed by PFGE on a 1% agarose gel in TBE. BAC ends of confirmed positive clones were sequenced, primers were designed from the BAC end sequences to verify physical map locations, and the contigs were extended by chromosome walking.

3. Results

3.1. Construction of the BAC Libraries

The BAC library of papaya SunUp female was constructed by partial digestion of genomic DNA with BstY I and ligation to pIndigoBAC5. A total of 36,864 clones were picked and stored in ninety-six 384-well plates. The 36,864 clones were spotted onto a set of 2 high-density colony filters, each filter consisting of forty-eight 384-well microtiter plates, or 18,432 clones. Fifteen complete sets of high-density filters for the female library were created. To estimate insert size and distribution of the papaya female BAC library, 42 BACs were selected at random and analyzed by Not I digestion and PFGE (Figure 1(a)). All tested clones contained inserts. Insert sizes ranged from 85 to 125 kb with an average of 104 kb (Figure 1(b)). The female BAC library covered the equivalent of 10.3 papaya haploid-genomes, based on a papaya genome size of 372 Mb [1].

The AU9 male BAC library was constructed by partial digestion of genomic DNA with EcoR I and ligation to pIndigoBAC536. This library consists of 55,296 clones and is stored in one hundred forty-four 384-well plates. All 55,296 clones were gridded on a set of 3 filters (A, B, and C). Twelve complete sets of high-density filters for the male library were produced, along with thirteen additional A filters and eight additional B filters. Insert size and distribution of the male BAC library were determined by Not I digestion and PFGE of 42 randomly selected BACs (Figure 2(a)). All tested clones contained inserts and the insert sizes ranged from 33 kb to 150 kb with an average insert size of 101 kb (Figure 2(b)). The male library represents 15x genome equivalents.
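The genome-equivalent figures for both libraries follow from the clone counts, the average insert sizes, and the 372 Mb genome size. A minimal Python sketch of that calculation is given below; the function names are hypothetical, and all numbers are taken from the text.

```python
# Genome coverage of a BAC library:
#   coverage = (number of clones x mean insert size) / haploid genome size
# All input values come from the text; function names are illustrative.

GENOME_SIZE_KB = 372_000  # papaya genome size of 372 Mb [1]


def clone_count(plates: int, wells_per_plate: int = 384) -> int:
    """Total clones stored in a given number of 384-well plates."""
    return plates * wells_per_plate


def genome_equivalents(n_clones: int, mean_insert_kb: float,
                       genome_kb: float = GENOME_SIZE_KB) -> float:
    """Haploid genome equivalents represented by the library."""
    return n_clones * mean_insert_kb / genome_kb


female_clones = clone_count(96)    # ninety-six plates
male_clones = clone_count(144)     # one hundred forty-four plates

print(female_clones, male_clones)                              # 36864 55296
print(f"female: {genome_equivalents(female_clones, 104):.1f}x")  # female: 10.3x
print(f"male: {genome_equivalents(male_clones, 101):.1f}x")      # male: 15.0x
```

The calculation uses the average insert size of the 42 sampled clones per library as an estimate for the whole library, so the coverage values are estimates as well.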

3.2. BAC Library Screening

To screen the female BAC library, eight probes were designed from the end sequences of BACs that had been physically mapped to the X chromosome, and hybridized to the female BAC library. These eight probes hybridized to 147 BAC clones, with an average of 18.4 BACs per probe (Table 1).

To screen the male BAC library, a total of 206 probes, designed from HSY BAC ends, were hybridized to the male BAC library in pools of three to four probes. Eleven pools consisting of 43 probes hybridized to an unusually large number of clones, indicating that one or more probes in each of those 11 pools contained repetitive sequences; these pools were therefore excluded from calculating the number of positive clones (Table 1). The remaining 163 probes hybridized to 1,512 clones, averaging 9.3 clones per probe. These positive clones were used as seed BACs for physical mapping of the MSY. Probes were designed from the BAC ends of the seed BACs and hybridized to the male BAC library to continue the chromosome walking to map this region. A total of 55 male probes were designed. One pool of three probes hybridized to numerous positive clones and was excluded from further analysis. The remaining 52 probes detected 508 positive clones, averaging 9.8 clones per probe.
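The per-probe averages reported for the three screens can be reproduced from the probe and clone counts given above. The short Python sketch below does so; the function name and the dictionary labels are illustrative, not part of the original analysis.

```python
# Average number of positive clones per informative probe, after removing
# probes from pools that gave repetitive-sequence signals.
# Counts are from the text; names and labels are illustrative.

def clones_per_probe(total_probes: int, excluded_probes: int,
                     positive_clones: int) -> tuple[int, float]:
    """Return (informative probes, mean positive clones per informative probe)."""
    informative = total_probes - excluded_probes
    return informative, positive_clones / informative


screens = {
    "female library, X BAC-end probes": (8, 0, 147),
    "male library, HSY BAC-end probes": (206, 43, 1512),
    "male library, seed BAC-end probes": (55, 3, 508),
}

for label, counts in screens.items():
    n_probes, mean_hits = clones_per_probe(*counts)
    print(f"{label}: {n_probes} probes, {mean_hits:.1f} clones per probe")
```

Running this gives 18.4, 9.3, and 9.8 clones per probe for the three screens, in agreement with the values stated in the text.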

3.3. Application of the Male and Female BAC Libraries in Physical Mapping of the Papaya Sex Chromosomes

When the male and female BAC libraries were constructed, the HSY physical map contained three gaps [6], which could not be filled by screening the hermaphrodite BAC library. To fill the first gap, probes were designed from two hermaphrodite BACs, SH89M06 and SH93H03, located on either side of this gap in the physical map, and used to screen the male BAC library. A total of 12 positive BACs were detected by a probe designed from SH89M06 BAC end sequences and verified by sequencing the PCR fragments amplified from the positive BACs. A probe from SH93H03 detected 14 BACs, which were verified in the same fashion. The male BAC DM126G07, with an insert size of 98 kb, bridged the first gap of about 37 kb between SH93H03 and SH89M06: its sequence overlapped with SH93H03 by 57 kb and with SH89M06 by 3.7 kb (Tables 2 and 3, Figure 3).

To fill the second gap in the physical map, between BACs SH96A24 and SH88A07, probes were designed from hermaphrodite BAC SH96A24. The male BAC library was screened with a probe designed 8 kb from the end of SH96A24 that extended into the gap, and 19 clones were identified as potential positives. PCR was performed with primers designed from SH96A24 and SH88A07 BAC ends. DM57M14 was amplified by both primer pairs and the PCR products were sequenced for verification. The insert size of DM57M14 was about 100 kb, and the DM57M14 sequence overlapped with SH96A24 by 17 kb and with SH88A07 by 13 kb, indicating that it filled the 70 kb gap.
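The sizes of the first two gaps follow from the insert size of each bridging BAC minus its overlaps with the two flanking BACs. The Python sketch below illustrates this arithmetic with the values reported above; the function name is hypothetical, and the insert sizes are PFGE estimates, so the results are approximate.

```python
# Size of a gap bridged by a single BAC:
#   gap = insert size - overlap with left flank - overlap with right flank
# Values are from the text; the function name is illustrative.

def bridged_gap_kb(insert_kb: float, overlap_left_kb: float,
                   overlap_right_kb: float) -> float:
    """Length (kb) of the bridging insert that lies in neither flanking BAC."""
    gap = insert_kb - overlap_left_kb - overlap_right_kb
    if gap < 0:
        raise ValueError("overlaps exceed the insert size")
    return gap


# Gap 1: DM126G07 (98 kb) overlaps SH93H03 by 57 kb and SH89M06 by 3.7 kb.
print(round(bridged_gap_kb(98, 57, 3.7), 1))   # 37.3

# Gap 2: DM57M14 (about 100 kb) overlaps SH96A24 by 17 kb and SH88A07 by 13 kb.
print(bridged_gap_kb(100, 17, 13))             # 70
```

Both results agree with the reported gap sizes of about 37 kb and 70 kb.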

The third gap was quite large and lies on the border where Knob 1 is located. Knob 1 is the only Knob shared by both the X and Y chromosomes [23], and this gap was filled in the corresponding region of the X chromosome. In an attempt to fill this gap, a probe designed from end sequences of HSY BAC SH61K24 was used to screen the male BAC library. Male BAC DM125I09, with an insert size of 85 kb, was identified and confirmed as a true positive. It overlapped with SH61K24 by about 2 kb. To continue chromosome walking into this gap, a probe was designed from the DM125I09 BAC end in the gap and used to screen the hermaphrodite BAC library. Hermaphrodite BAC SH65C06 was identified and confirmed. Its insert size is 162 kb and it overlapped with DM125I09 by 3 kb. Further chromosome walking identified X chromosome BACs.

The male BAC library bridged two of the three gaps on the HSY physical map and extended the map into the third gap.

The physical map for the HSY-corresponding region of the X chromosome is almost complete with one gap remaining (unpublished data). A probe was designed from the BAC end sequence of SH54M13 that extended into the gap and was used to screen the female BAC library. Four BAC clones were detected and female BAC SF08K16, with an insert size of 70 kb, was confirmed by sequencing PCR fragments. It overlapped SH54M13 by 21 kb but did not bridge the gap to SH49N10. A probe was designed from the BAC end sequence of SH49N10, on the other side of the gap, and used to screen the female and hermaphrodite BAC libraries. The probe hybridized to only two BACs, but neither BAC was confirmed. The same SH49N10 probe was then hybridized to the male BAC library, resulting in 30 positive clones, one of which, male BAC DM136D11, was confirmed. DM136D11 overlaps about 63 kb with SH49N10 and has an insert size of 95 kb (Tables 2 and 3). Chromosome walking using all three BAC libraries yielded no additional positive clones. Though this gap was not filled, the male and female BAC libraries extended the physical map by 81 kb (Figure 3).
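The 81 kb extension of the X physical map can be recovered from the two confirmed BACs, one on each side of the gap. The Python sketch below is illustrative: the function name is hypothetical, and it assumes that the part of each insert not overlapping its anchoring BAC is new sequence extending into the gap.

```python
# New sequence contributed by a BAC that overlaps one end of a contig:
#   extension = insert size - overlap with the anchoring BAC
# Values are from the text; the function name is illustrative.

def extension_kb(insert_kb: float, overlap_kb: float) -> float:
    """Length (kb) of the insert that extends beyond the anchoring BAC."""
    return insert_kb - overlap_kb


sf08k16 = extension_kb(70, 21)    # female BAC anchored on SH54M13
dm136d11 = extension_kb(95, 63)   # male BAC anchored on SH49N10

print(sf08k16, dm136d11)          # 49 32
print(sf08k16 + dm136d11)         # 81
```

SF08K16 contributes 49 kb on one side and DM136D11 contributes 32 kb on the other, which together give the reported 81 kb.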

4. Discussion

Papaya is trioecious with three sex types: male, female, and hermaphrodite. Any given papaya breeding system is either dioecious, with male and female, or gynodioecious, with hermaphrodite and female. No papaya plant produces all three sex types due to the lethal effect of any combination of the Y and Yh chromosomes [4]. For this reason, it is not possible to construct the male, female, and hermaphrodite BAC libraries from a single variety. The female BAC library was constructed from female SunUp, the same variety used for the hermaphrodite BAC library [17] and for sequencing the female draft genome [8]. The male BAC library was constructed from male genomic DNA of the improved but unreleased variety AU9. AU9 has been used as a source of male genomic DNA for papaya sex chromosome research in our group for over a decade, and it shares 99% DNA sequence identity with SunUp in autosomes, based on a comparison of two orthologous BACs containing the papaya fruit flesh color gene, CpCYC-b [24]. The papaya X and Y chromosomes have diverged at an accelerated rate, sharing an average of 84–86% DNA sequence identity merely 2-3 million years after recombination was suppressed in the sex determination region [6, 7]. However, the Y and Yh chromosomes share an average of 98.8% DNA sequence identity and are nearly identical in the coding region, indicating recent divergence of these two Y chromosomes, estimated to have occurred 73,000 years ago [7]. There is no barrier between dioecious and gynodioecious breeding systems, and pollen from hermaphrodite or male trees can pollinate female trees from either system within the range of pollen dispersal. It is surprising that orthologous X BACs from AU9 and SunUp shared 99.997% sequence identity, higher than the 99% autosomal DNA sequence identity between these two varieties [7]. Nonetheless, the male and female BAC libraries are adequate to assist the physical mapping of the HSY and its X counterpart.

The addition of the male and female BAC libraries has been an asset to papaya sex chromosome and genomic research. The existing hermaphrodite BAC library was used for physical mapping of the HSY and its X counterpart, but this resource was exhausted with three gaps remaining on the HSY and one gap on the X physical maps. The male and female BAC libraries were constructed using the restriction enzymes EcoR I and BstY I, respectively, rather than Hind III, which was used for the construction of the hermaphrodite BAC library, in an attempt to cover the fractions of the sex chromosomes and genome not represented in the hermaphrodite BAC library.

As seen with the construction of the hermaphrodite BAC library [17], even though DNA fragments ranging from 100 to 300 kb were selected, the ligation and transformation efficiency for DNA fragments above 125 kb was low, and no inserts were found to be above 150 kb in the BACs tested. Inserts below 100 kb were also present in both the male and female BAC libraries. Because a large concentration of DNA containing fragments of varying sizes was separated simultaneously, some of the shorter fragments were retained with the larger ones, resulting in the selection of some BACs with inserts smaller than 100 kb. The tighter insert size distribution seen in the female BAC library versus the male can be explained by the different restriction enzymes that were used in the digestion of the male and female DNA and by sequence differences causing restriction sites to vary. Also, as seen with the hermaphrodite BAC library [17], even slight differences in DNA purity, digestion time, and concentrations of both DNA and enzymes can affect the insert sizes. There are advantages to having a more diverse range of inserts. For gap filling, if the gap is small, shorter inserts can fill it while minimizing overlap with neighboring BACs, reducing the cost of sequencing. Larger inserts can span larger gaps and cover more of the map with less overlap, also reducing the cost of sequencing.

Two gaps on the HSY were successfully filled, and the third gap on Knob 1 was extended by BACs from the male BAC library. There are five Knobs on the HSY [23]. Knobs 2 to 5 are HSY specific, and Knob 1 is the only Knob structure shared between the X and Yh chromosomes. Chromosome walking to extend the coverage of the third gap ended up walking into the X chromosome. It is possible that the genomic sequences of the X and Y chromosome in Knob 1 are homologous and too similar to allow the separation of the two physical maps.

The only gap on the X physical map was extended 81 kb by one BAC from the female BAC library and one BAC from the male BAC library (male has X and Y chromosomes), but chromosome walking to close this gap reached an end, despite the three BAC libraries that were used. This gap corresponds to a region between Knobs 4 and 5 in the HSY, where the centromere is thought to be located [23]. Centromeres are known to be highly repetitive and extremely difficult to clone using restriction digestion.

The male and female BAC libraries not only contributed to filling gaps on the hermaphrodite physical maps. The confirmed BACs on the hermaphrodite physical maps were also used to design probes to screen the BAC libraries and fish out male seed BACs, from which chromosome walking could proceed to construct an MSY physical map; this work is currently underway. About 60% of the MSY was mapped using the male BAC library. The papaya genome coverage by the male and female BAC libraries is comparable to that of the hermaphrodite BAC library. The genome coverage by the female BAC library fell short of the coverage by the hermaphrodite BAC library, with 10.3x genome equivalents compared to 13.7x genome equivalents [17]. The male BAC library, with 15x genome equivalents, surpassed the hermaphrodite BAC library genome coverage. All three of the papaya BAC libraries had very high haploid genome coverage.

Chloroplast sequence contamination was not determined for either the male or female BAC library, but the BAC libraries were constructed following the same methods as were used in the construction of the papaya hermaphrodite BAC library, where 1.4% of the BAC library contained chloroplast DNA [17]. Thus, a similar level is expected in the male and female papaya BAC libraries.

The completion of the male and female physical maps will provide abundant opportunities for in-depth studies of papaya’s nonrecombinant HSY, MSY, and corresponding X regions. Papaya’s nascent sex chromosomes have become a model system to study sex chromosome evolution. With the genomic resources available, such as the draft papaya genome sequence, the hermaphrodite BAC library, and now the male and female BAC libraries, the evolutionary events, starting from the restriction of recombination and leading up to the current chromosomal state, can be better understood.

Past studies considered pairs of homologous X and Yh BACs and detected chromosomal rearrangements and estimated a time of divergence for some gene pairs [6, 7]. The MSY physical maps will allow for the comparison of the MSY, HSY, and their X counterpart, making it possible to focus on large-scale chromosomal rearrangements across sex types, to unravel the evolutionary history of the sex chromosomes in this trioecious system with two slightly different Y chromosomes.

Not only will chromosomal rearrangements be of interest, but the variation in gene content will be as well. Once the BACs of the MSY physical maps are sequenced, genes of interest, such as those involved in sex determination, YY lethality, and long peduncles in males, will be more easily identified, cloned, and verified.

The identification of the papaya sex determination genes has many practical applications. Commercially, the hermaphrodite papaya trees are the most valuable, because every tree produces fruit and the hermaphrodite fruit has better postharvest qualities. However, the sex of papaya trees cannot be determined until after they flower, so resources are wasted in growing male and female trees. Time, labor, and money are squandered on planting seeds of unknown sex types and on culling male and female trees once they can be identified. Knowing more about the sex determination genes may eventually lead to engineering a true-breeding hermaphrodite variety that will increase fruit production with reduced input.


This work is supported by NSF Plant Genome Research Program (award number 0553417).