Abstract

Recent advances, such as the availability of extensive genome survey sequence (GSS) data and draft physical maps, are radically transforming the means by which we can dissect Brassica genome structure and systematically relate it to the Arabidopsis model. Hitherto, our view of the co-linearities between these closely related genomes had been largely inferred from comparative RFLP data, necessitating substantial interpolation and expert interpretation. Sequencing of the Brassica rapa genome by the Multinational Brassica Genome Project will, however, enable an entirely computational approach to this problem. Meanwhile we have been developing databases and bioinformatics tools to support our work in Brassica comparative genomics, including a recently completed draft physical map of B. rapa integrated with anchor probes derived from the Arabidopsis genome sequence. We are also exploring new ways to display the emerging Brassica–Arabidopsis sequence homology data. We have mapped all publicly available Brassica sequences in silico to the Arabidopsis TIGR v5 genome sequence and published this in the ATIDB database that uses Generic Genome Browser (GBrowse). This in silico approach potentially identifies all paralogous sequences and so we colour-code the significance of the mappings and offer an integrated, real-time multiple alignment tool to partition them into paralogous groups. The MySQL database driving GBrowse can also be directly interrogated, using the powerful API offered by the Perl Bio∷DB∷GFF methods, facilitating a wide range of data-mining possibilities.