Special Issue: Computational Systems Biology Methods in Molecular Biology, Chemistry Biology, Molecular Biomedicine, and Biopharmacy
Research Article | Open Access
Predicting Glycerophosphoinositol Identities in Lipidomic Datasets Using VaLID (Visualization and Phospholipid Identification)—An Online Bioinformatic Search Engine
The capacity to predict and visualize all theoretically possible glycerophospholipid molecular identities present in lipidomic datasets is currently limited. To address this issue, we expanded the search engine and compositional databases of the online Visualization and Phospholipid Identification (VaLID) bioinformatic tool to include the glycerophosphoinositol superfamily. VaLID v1.0.0 originally allowed exact and average mass libraries of 736,584 individual species from eight phospholipid classes (glycerophosphates, glyceropyrophosphates, glycerophosphocholines, glycerophosphoethanolamines, glycerophosphoglycerols, glycerophosphoglycerophosphates, glycerophosphoserines, and cytidine 5′-diphosphate 1,2-diacyl-sn-glycerols) to be searched for any mass-to-charge value (with adjustable tolerance levels) under a variety of mass spectrometry conditions. Here, we describe an update that now includes all possible glycerophosphoinositols, glycerophosphoinositol monophosphates, glycerophosphoinositol bisphosphates, and glycerophosphoinositol trisphosphates. This update expands the total number of lipid species represented in the VaLID v2.0.0 database to 1,473,168 phospholipids. Each phospholipid can be generated in skeletal representation. A subset of species curated by the Canadian Institutes of Health Research Training Program in Neurodegenerative Lipidomics (CTPNL) team is provided as an array of high-resolution structures. VaLID is freely available to all users through the CTPNL resources web site.
1. Introduction
The emerging field of lipidomics seeks to answer two seemingly simple questions: How many lipid species are there? What effect does lipid diversity have on cellular function? To address these questions, lipidomics requires a comprehensive assessment of cellular, regional, and systemic lipid homeostasis. This assessment expands beyond lipid profiling to include the transcriptomes and proteomes of lipid metabolic enzymes and transporters, as well as that of the protein targets that affect downstream lipid signalling. Lipidomic analyses also encompass an unbiased mechanistic assessment of lipid function ranging from the physicochemical basis of lipid behaviour to lipid-protein and lipid-lipid interactions triggered by intrinsic and extrinsic stimuli. The first step, however, lies in identifying the molecular identities of the lipid constituents in different membrane compartments.
Recent technological advances in electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI) mass spectrometry (MS), coupled to high performance liquid chromatography (LC), allow lipid diversity and membrane composition to be quantified at the molecular level [4–7]. Thousands of unique lipid species across the six major lipid structural categories in mammalian cells (fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, sterol lipids, and prenol lipids) and two lipid categories synthesized by other organisms (saccharolipids and polyketides) can now be identified using LC-ESI-MS and, in some cases, MALDI-MS imaging [1, 4, 8]. Yet, with these successes come new challenges. Turning raw MS spectral data into annotated lipidomic datasets is a time-consuming, labour-intensive, and highly inefficient process. Predicting identities of “new” species, not previously curated, is exceedingly difficult. Lipidomic investigations lack essential bioinformatic tools capable of enabling automated data processing and exploiting the rich compositional data present in MS lipid spectra.
The critical first step is to unambiguously assign molecular identities from the MS structural information present in large lipidomic datasets. Where genomics and proteomics capitalize on sequence-based signatures, lipids lack such easily definable molecular fingerprints. Identities must be reconstructed by analysis of (a) lipid mass-to-charge (m/z) ratios following “soft” ionization ESI and MALDI techniques and (b) defining fragmentation patterns obtained after collision-induced dissociation in various MS modes. Once these molecular identities are predicted, further information about stereospecificity of critical species can then be assessed (e.g., by tandem MS, analysis of lyso-form fragment ions, and product ion spectral evaluation) [10–13]. For example, membrane phospholipids are derivatives of sn-glycero-3-phosphate with (a) an acyl, an alkyl (ether-linked plasmanyl), or an alkenyl (alkyl-1′-enyl, vinyl ether-linked plasmenyl) carbon chain at the sn-1 position; (b) a long-chain fatty acid that is usually esterified to the sn-2 position; and (c) a polar headgroup composed of a nitrogenous base, a glycerol, or an inositol unit modifying the phosphate group at the sn-3 position. The polar headgroup defines membership in one of 20 different phospholipid classes (e.g., glycerophosphoserines (PS), glycerophosphoethanolamines (PE), glycerophosphocholines (PC), glycerophosphoinositols (PI), etc.). Molecular species are further distinguished by individual combinations of carbon residues (chain length and degree of unsaturation) and the nature of each sn-1 or sn-2 chemical linkage (acyl, alkyl, or alkenyl) to the glycerol backbone. PI(18:0/22:6), for example, defines a lipid with a phosphoinositol polar headgroup (PI), a fully saturated 18-carbon chain (referred to as :0) ester-linked at the sn-1 position, and a 22-carbon chain characterized by six unsaturations (indicated by :6) ester-linked at the sn-2 position (Figure 1).
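The shorthand described above can be unpacked mechanically. The following is a minimal sketch, not VaLID's own parser: the class name `Chain`, the function names, and the use of an "O-" prefix for alkyl chains and a "P-" prefix for alkenyl chains (LIPID MAPS-style notation) are assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    """One sn-1 or sn-2 carbon chain (hypothetical helper type)."""
    linkage: str       # "acyl", "alkyl" (O- prefix), or "alkenyl" (P- prefix)
    carbons: int       # chain length
    double_bonds: int  # number of unsaturations


# Assumed notation: HEAD(sn1/sn2), each chain written as [O-|P-]carbons:double_bonds
_LIPID = re.compile(r"^([A-Za-z0-9]+)\(([^/()]+)/([^/()]+)\)$")
_CHAIN = re.compile(r"^(O-|P-)?(\d+):(\d+)$")
_LINKAGE = {None: "acyl", "O-": "alkyl", "P-": "alkenyl"}


def parse_chain(text):
    """Parse a single chain such as '18:0' or 'O-16:0'."""
    match = _CHAIN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised chain: {text!r}")
    prefix, carbons, double_bonds = match.groups()
    return Chain(_LINKAGE[prefix], int(carbons), int(double_bonds))


def parse_lipid(name):
    """Split a name such as 'PI(18:0/22:6)' into headgroup, sn-1, and sn-2."""
    match = _LIPID.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised lipid name: {name!r}")
    head, sn1, sn2 = match.groups()
    return head, parse_chain(sn1), parse_chain(sn2)


head, sn1, sn2 = parse_lipid("PI(18:0/22:6)")
print(head, sn1, sn2)
```

For PI(18:0/22:6) this yields the headgroup PI, an acyl 18:0 chain at sn-1, and an acyl 22:6 chain at sn-2, matching the description in the text.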
Immediate PI metabolites (PIPx) are then produced by carbon-specific phosphorylation of the PI headgroup with unique fatty acyl, alkyl, and/or alkenyl sn-1 and sn-2 chains (Figures 1 and 2). The tight regulation of PI metabolism and its critical impact on cellular function clearly underline the importance of these compositional changes (Figure 1). Yet, to date, the biological significance of the astonishing number of potentially unique PIs and PIPxs is unknown. This is primarily due to the challenges associated with unambiguous compositional identification of PIs and PIPxs in biological membranes [1, 15–19].
Key advances in lipidomic bioinformatics have been led by the LIPID MAPS consortium both in the development of online spectral databases and the reorganization of lipid class ontologies [14, 20]. These toolsets and classification systems have recently been complemented by the in silico generation of a searchable library of all theoretically possible MS/MS lipid spectra in different ionization modes (LipidBlast). Such fundamental toolkits are supported by a growing compendium of targeted spectral tools, reviewed in [6, 7, 20, 22]. Few existing bioinformatic resources, however, provide necessary information on all potential acyl chain inversions (e.g., sn-1 versus sn-2), critical phospholipid linkages that define lipid function, or theoretically possible double bond positions for every possible species. To address this need, we have developed Visualization and Phospholipid Identification (VaLID), a web-based application that links a user-friendly online search engine, a structural composition database, and multiple visualization features, and that is capable of providing users with all theoretically possible phospholipids calculated from any m/z under a variety of MS conditions. VaLID version 1.0.0 was initially restricted to 736,584 unique PS, PE, PC, glycerophosphate (PA), glyceropyrophosphate (PPA), glycerophosphoglycerol (PG), glycerophosphoglycerophosphate (PGP), and cytidine 5′-diphosphate 1,2-diacyl-sn-glycerol (CDP-DG) identities (Table 1). At first release, we did not include the PI family or their bioactive metabolites given the significant challenges associated with automating the visualization of all theoretically possible combinations of sn-1 and sn-2 carbon chain lengths, linkages, and variations in phosphorylation of the phosphoinositol headgroup.
Here, we address this deficit through the development of VaLID version 2.0.0, now coded with an exhaustive PI and PIPx database, capable of computing and visualizing a total of 1,473,168 theoretically possible phospholipids predicted from any user-entered m/z value and MS condition. VaLID version 2.0.0 is freely available for commercial and noncommercial use at http://neurolipidomics.ca and http://neurolipidomics.com/resources.html.
The calculated number of species does not include lipids formed by changing the position of the double bond beyond those represented in VaLID’s structural models. Each lipid m/z has been calculated for exact and average masses and can be searched using even and odd carbon chains with mass tolerance ranging from ±0.0001 to ±2 and MS ion modes [M + H]+, [M + K]+, [M + Li]+, [M + Na]+, [M – H]−, or [M (Neutral)].
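The search described in this note can be illustrated with a short sketch: shift each neutral mass by the selected ion mode, then keep every species whose m/z falls within the tolerance. This is an illustration under stated assumptions rather than VaLID's implementation. The adduct shifts use standard monoisotopic values with the electron mass subtracted for cations (the source does not say whether VaLID applies this correction), and the three-entry `LIBRARY` with its rounded monoisotopic masses is a toy stand-in for the real database.

```python
# Standard monoisotopic constants (assumed, not taken from the source).
PROTON = 1.007276      # mass of H+ (hydrogen atom minus one electron)
ELECTRON = 0.000549

# m/z shift for each ion mode listed in the table note (singly charged ions).
ADDUCT_SHIFT = {
    "[M+H]+": PROTON,
    "[M+Li]+": 7.016003 - ELECTRON,
    "[M+Na]+": 22.989770 - ELECTRON,
    "[M+K]+": 38.963707 - ELECTRON,
    "[M-H]-": -PROTON,
    "[M]": 0.0,            # neutral mass, no shift
}

# Toy library: name -> neutral monoisotopic mass (hypothetical subset).
LIBRARY = {
    "PI(18:0/20:4)": 886.5571,
    "PI(16:0/22:4)": 886.5571,   # isobaric with the entry above
    "PI(18:0/22:6)": 910.5571,
}


def ion_mz(neutral_mass, ion_mode):
    """Return the m/z expected for a neutral mass under the given ion mode."""
    return neutral_mass + ADDUCT_SHIFT[ion_mode]


def search(library, observed_mz, ion_mode, tolerance):
    """Return (name, m/z) pairs within +/- tolerance, closest match first."""
    hits = []
    for name, mass in library.items():
        mz = ion_mz(mass, ion_mode)
        error = abs(mz - observed_mz)
        if error <= tolerance:
            hits.append((error, name, mz))
    hits.sort()
    return [(name, round(mz, 4)) for _, name, mz in hits]


for name, mz in search(LIBRARY, 885.5499, "[M-H]-", 0.01):
    print(name, mz)
```

In this toy example both PI(18:0/20:4) and PI(16:0/22:4) are returned, because species with the same elemental composition cannot be separated by m/z alone. This is why a single m/z query returns a list of candidate identities rather than one answer.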
2. Materials and Methods
2.1. Programming Language and Packages
VaLID version 2.0.0 was developed in Oracle’s Java programming language, version 6, using external Java libraries from JExcelApi. Structures are displayed within the program by ChemAxon’s Marvin View 126.96.36.199 software. The code was written using the Eclipse Kepler IDE and packaged using the Fat Jar Eclipse plugin, version 0.0.31. VaLID is a web-based Java applet and thus requires that Java be both installed and enabled in the user’s web browser. The most recent Java security update is recommended and can be downloaded from http://www.oracle.com/technetwork/java/index.html.
2.2. The PI and PIPx Compositional Database
Briefly, the underlying database contains masses of all theoretically possible PI and PIPx species calculated from both exact and average atomic masses. Component structural masses were first established for (a) the glycerol backbone, (b) PI polar headgroups with all phosphorylation possibilities, (c) sn-1 and sn-2 hydroxyl residues (lyso-lipids), and (d) sn-1 and sn-2 fatty chains ranging from 0 to 30 carbons with up to six unsaturations, considering (e) ester, ether, or vinyl ether linkages to the phosphoglyceride backbone (Figure 2). Composite masses were then calculated for every theoretically possible combination. Thus, the underlying database includes all PIs, as well as every acyl, alkyl, and alkenyl variant, for every carbon chain and double bond position, of all mono- (PIP), bis- (PIP2), and tris- (PIP3) phosphorylated PI headgroups modified on the hydroxyl group of carbons 3, 4, and/or 5.
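The component-wise construction described above can be sketched as a sum of elemental counts. The code below is an illustration, not the published implementation. It assumes a glycerophosphoinositol core of C9H17O11P (the free compound C9H19O11P minus the two hydroxyl hydrogens replaced by chains), one HPO3 unit per added headgroup phosphate, and standard monoisotopic masses. It covers exact masses only, and all function and variable names are hypothetical.

```python
# Standard monoisotopic atomic masses (assumed values, not from the source).
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "P": 30.97376200}

# Glycerol + phosphate + inositol core with both sn oxygens left unsubstituted.
BACKBONE = {"C": 9, "H": 17, "O": 11, "P": 1}


def chain_formula(linkage, carbons, double_bonds):
    """Elemental contribution of one sn chain attached to a backbone oxygen."""
    if carbons == 0:
        return {"H": 1}                      # lyso position: free hydroxyl
    hydrogens = 2 * carbons - 1 - 2 * double_bonds
    if linkage == "acyl":                    # ester: chain carries a carbonyl O
        return {"C": carbons, "H": hydrogens, "O": 1}
    if linkage == "alkyl":                   # ether: no carbonyl, two more H
        return {"C": carbons, "H": hydrogens + 2}
    if linkage == "alkenyl":                 # vinyl ether: one implicit C=C
        return {"C": carbons, "H": hydrogens}
    raise ValueError(f"unknown linkage: {linkage!r}")


def pi_mass(sn1, sn2, n_phosphates=0):
    """Monoisotopic mass of a PI species with 0 to 3 extra headgroup phosphates.

    sn1 and sn2 are (linkage, carbons, double_bonds) tuples.
    """
    counts = dict(BACKBONE)
    for part in (chain_formula(*sn1), chain_formula(*sn2)):
        for element, n in part.items():
            counts[element] = counts.get(element, 0) + n
    # Each phosphorylation adds one HPO3 unit to an inositol hydroxyl.
    counts["H"] += n_phosphates
    counts["P"] += n_phosphates
    counts["O"] += 3 * n_phosphates
    return sum(MONO[element] * n for element, n in counts.items())


print(round(pi_mass(("acyl", 18, 0), ("acyl", 20, 4)), 4))      # approx. 886.5571
print(round(pi_mass(("alkyl", 16, 0), ("acyl", 20, 4), 3), 4))  # PIP3(O-16:0/20:4)
```

Because phosphate position on the inositol ring does not change the elemental composition, positional isomers such as the three PIP2 forms share one mass in this scheme; only the number of phosphates enters the calculation.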
2.3. PI and PIPx Structural Visualizations
We have updated the automated representation drawing feature of VaLID so that it can draw all theoretically possible PI and PIPx molecular identities. Structures have been restricted to display only cis double bonds separated by a minimum of two carbons. To achieve this goal, the basic structure of the PI backbone was created manually and the atom placement corrected mathematically to match known structures. Slight adjustments to atom placement were further made to improve visibility. The locations of each atom in the headgroup were then established on a Cartesian plane and coded into the software. The automated drawing feature update was integrated into the database and search functions, allowing all PI and PIPx species to be visualized on demand. Chemical structures are displayed using ChemAxon’s MarvinView software (Marvin 188.8.131.52, 2011, http://www.chemaxon.com).
3. Results and Discussion
PI and PIPx are derivatives of sn-glycero-3-phosphate with (a) an acyl, an alkyl (ether-linked plasmanyl), or an alkenyl (alkyl-1′-enyl, vinyl ether-linked plasmenyl) carbon chain at the sn-1 position; (b) a fatty acid commonly esterified but also with possible alkyl or alkenyl linkages to the sn-2 position; and (c) a polar headgroup composed of an inositol unit modifying the phosphate group at the sn-3 position. Individual species are distinguished by their particular combination of carbon chains (chain length and degree of unsaturation) and by the nature of their sn-1 or sn-2 chemical linkages (acyl, alkyl, or alkenyl). PIP3(O-16:0/20:4), for example, defines a lipid species with a phosphoinositol polar headgroup (PI) phosphorylated at the 3rd, 4th, and 5th carbon positions, an ether linkage at the sn-1 position (O-), a fully saturated 16-carbon chain at the sn-1 position, and a 20-carbon chain with four unsaturations at the sn-2 position. The number of possible structural and biochemical combinations results in colossal structural diversity; however, PI and PIPx lipids account for less than 15 percent of the total phospholipid composition in eukaryotic cells. The molecular identities of these critical species have yet to be determined in different lipidomes despite emerging evidence that differences in carbon chain length, linkage, and phosphorylation status fundamentally alter biological activity [1, 15–19] (Figure 1).
Here, we enhanced VaLID’s capacity to (a) predict identities of glycerophosphoinositol species present in MS spectra from m/z under user-defined MS conditions and (b) automatically visualize every theoretically possible PI molecular species at a given m/z. The updated VaLID interface, showing all of the available search terms, is presented in Figure 3. From its inception, VaLID was designed to be a comprehensive glycerophospholipid database linking a convenient search engine with visualization features for identification and dissemination of large-scale lipidomic datasets. The intent of this tool was to aid lipid discovery across multiple MS methodologies and to significantly reduce the time required to validate critical phospholipid identities present in target lipidomes. The program initially contained eight phospholipid subclasses, excluding the PI subfamily. In VaLID version 2.0.0, this capacity is expanded to all theoretically possible PI and PIPx glycerophospholipids and comprises a total of 1,473,168 unique structures. These additions are meant to provide lipidomic researchers with the tools necessary to mine their lipidomes for PI and PIPx species with specific m/z under their particular MS experimental conditions, including the ion mode and the lipid subclass. Due to the complexity of the PI superfamily, and to accelerate searching, users can restrict searches to subclasses (PI, PIPx) or to sub-subclasses: PI, the three PIP positional isomers, the three PIP2 positional isomers, and PIP3. For example, if the PIP2 option for the 3rd and 4th carbon positions is chosen, all molecular species with an inositol backbone phosphorylated only at the 3rd and 4th carbon positions will be provided, and VaLID will not return species from the other two PIP2 positional isomers. The “PI + PIPx” option restricts searches to the entire PI superfamily, excluding other phospholipid families. The “All without the PIPx” option returns all of the phospholipids in the database, including PI structural precursors, with the exception of PIPx metabolites. Finally, the “All” option returns results from every headgroup. When more than one headgroup is being searched, the program will let the user know how many headgroups have been loaded and how many remain to be loaded.
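The eight PI sub-subclasses follow directly from choosing which of inositol carbons 3, 4, and 5 carry a phosphate. The sketch below enumerates them and shows how a search could be narrowed to a subclass or to one positional isomer. The function names and the tuple-based representation are assumptions made for illustration; VaLID's own menu labels are not reproduced here.

```python
from itertools import combinations

POSITIONS = (3, 4, 5)   # inositol carbons that can carry a phosphate


def phosphorylation_states():
    """All subsets of the phosphorylatable positions, smallest first."""
    return [
        combo
        for k in range(len(POSITIONS) + 1)
        for combo in combinations(POSITIONS, k)
    ]


def subclass_label(state):
    """Map a set of phosphorylated positions to PI, PIP, PIP2, or PIP3."""
    if not state:
        return "PI"
    return "PIP" if len(state) == 1 else f"PIP{len(state)}"


def restrict(states, subclass=None, positions=None):
    """Keep states matching a subclass label and/or an exact position set."""
    kept = []
    for state in states:
        if subclass is not None and subclass_label(state) != subclass:
            continue
        if positions is not None and state != tuple(sorted(positions)):
            continue
        kept.append(state)
    return kept


states = phosphorylation_states()
print([subclass_label(s) for s in states])
print(restrict(states, subclass="PIP2"))          # three positional isomers
print(restrict(states, positions=(3, 4)))         # only the 3,4-bisphosphate
```

As an arithmetic observation rather than a statement from the source, the totals quoted above are consistent with an equal number of chain combinations per headgroup: 1,473,168 − 736,584 = 736,584 PI-family species, and 736,584 / 8 = 92,073 for each of the eight phosphorylation states.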
With respect to the visualization features for PIP or PIP2, the program will draw the phosphate groups on the inositol ring in the locations that the user specified from the dropdown menu for the selected lipid species. As with the other subclasses, choosing the “Display All” button will draw all the theoretically possible structures associated with the selected lipid name. Potential variants in degrees of unsaturation are drawn sequentially in every location along the fatty acid chain, separated by at least two carbons, and in cis configuration. An example of this can be seen in Figure 4. If the selected lipid meets criteria for the “Best Prediction,” selecting this option will return only the lipids in VaLID’s “Predicted to be Common” database. These species are categorized based on the relative abundance of prevalent fatty acid chains in mammalian cells.
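The sequential placement of double bonds can be pictured as enumerating every set of positions that respects a minimum spacing. The sketch below is one possible reading, not VaLID's drawing code. It assumes that position p denotes a double bond between carbons p and p+1 counted from the chain's first carbon, that the earliest allowed position is 2, and that "separated by at least two carbons" corresponds to consecutive starting positions differing by at least 3 (the methylene-interrupted pattern). Both `first` and `min_gap` are exposed as parameters because the source does not define the spacing precisely.

```python
from itertools import combinations


def double_bond_layouts(carbons, double_bonds, min_gap=3, first=2):
    """Enumerate double-bond position sets for one chain.

    Position p means a double bond between carbons p and p + 1.
    Consecutive positions must differ by at least min_gap (assumed spacing).
    """
    last = carbons - 1                      # final bond joins the last two carbons
    layouts = []
    for combo in combinations(range(first, last + 1), double_bonds):
        if all(b - a >= min_gap for a, b in zip(combo, combo[1:])):
            layouts.append(combo)
    return layouts


layouts = double_bond_layouts(20, 4)
print(len(layouts))
print((5, 8, 11, 14) in layouts)   # arachidonic acid pattern is among them
```

Under these assumptions the familiar 5,8,11,14 arrangement of a 20:4 chain is one of many enumerated layouts, which illustrates why a "Display All" request can return a large number of candidate structures for a single lipid name.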
VaLID is, to our knowledge, the first search engine that has an exhaustive m/z and visualization database of all the theoretically possible glycerophospholipids, updated here from eight to twelve of the twenty phospholipid subclasses defined by the LIPID MAPS Consortium. The purpose of this update is to facilitate prediction and visualization of the identities of all unknown species, now including all PIs and their metabolites, that may be present in users’ lipidomes at a given m/z and MS condition.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Graeme S. V. McDowell and Alexandre P. Blanchard contributed equally to this work.
Acknowledgments
This resource was funded by the Canadian Institutes of Health Research (CIHR) MOP 89999 to DF and SALB and a Strategic Training Initiative in Health Research (STIHR) CIHR/Training Program in Neurodegenerative Lipidomics (CTPNL) and the Institute of Aging TGF 96121 to DF, SF, and SALB. APB received a FRSQ and CTPNL graduate scholarship; GSVM received a CTPNL graduate scholarship.
References
1. S. A. L. Bennett, N. Valenzuela, H. Xu et al., “Using neurolipidomics to identify phospholipid mediators of synaptic (dys)function in Alzheimer's Disease,” Frontiers in Physiology, vol. 4, p. 168, 2013.
2. C. Le Roy and J. L. Wrana, “Clathrin- and non-clathrin-mediated endocytic regulation of cell signalling,” Nature Reviews Molecular Cell Biology, vol. 6, no. 2, pp. 112–126, 2005.
3. L. C. Skwarek and G. L. Boulianne, “Great Expectations for PIP: phosphoinositides as regulators of signaling during development and disease,” Developmental Cell, vol. 16, no. 1, pp. 12–20, 2009.
4. D. Piomelli, G. Astarita, and R. Rapaka, “A neuroscientist's guide to lipidomics,” Nature Reviews Neuroscience, vol. 8, no. 10, pp. 743–754, 2007.
5. H. A. Brown and R. C. Murphy, “Working towards an exegesis for lipids in biology,” Nature Chemical Biology, vol. 5, no. 9, pp. 602–606, 2009.
6. M. Bou Khalil, W. Hou, H. Zhou et al., “Lipidomics era: accomplishments and challenges,” Mass Spectrometry Reviews, vol. 29, no. 6, pp. 877–929, 2010.
7. H. Xu, N. Valenzuela, S. Fai et al., “Targeted lipidomics – advances in profiling lysophosphocholine and platelet-activating factor second messengers,” FEBS Journal, vol. 280, pp. 5652–5667, 2013.
8. X. Han, K. Yang, and R. W. Gross, “Multi-dimensional mass spectrometry-based shotgun lipidomics and novel strategies for lipidomic analyses,” Mass Spectrometry Reviews, vol. 31, no. 1, pp. 134–178, 2012.
9. P. S. Niemelä, S. Castillo, M. Sysi-Aho, and M. Orešič, “Bioinformatics and computational methods for lipidomics,” Journal of Chromatography B, vol. 877, no. 26, pp. 2855–2862, 2009.
10. W. Hou, H. Zhou, M. B. Khalil, D. Seebun, S. A. L. Bennett, and D. Figeys, “Lyso-form fragment ions facilitate the determination of stereospecificity of diacyl glycerophospholipids,” Rapid Communications in Mass Spectrometry, vol. 25, no. 1, pp. 205–217, 2011.
11. J. C. Smith, W. Hou, S. N. Whitehead, M. Ethier, S. A. L. Bennett, and D. Figeys, “Identification of lysophosphatidylcholine (LPC) and platelet activating factor (PAF) from PC12 cells and mouse cortex using liquid chromatography/multi-stage mass spectrometry (LC/MS3),” Rapid Communications in Mass Spectrometry, vol. 22, no. 22, pp. 3579–3587, 2008.
12. S. N. Whitehead, W. Hou, M. Ethier et al., “Identification and quantitation of changes in the platelet activating factor family of glycerophospholipids over the course of neuronal differentiation by high-performance liquid chromatography electrospray ionization tandem mass spectrometry,” Analytical Chemistry, vol. 79, no. 22, pp. 8539–8548, 2007.
13. C.-H. Tang, P.-N. Tsao, C.-Y. Chen, M.-S. Shiao, W.-H. Wang, and C.-Y. Lin, “Glycerophosphocholine molecular species profiling in the biological tissue using UPLC/MS/MS,” Journal of Chromatography B, vol. 879, no. 22, pp. 2095–2106, 2011.
14. E. Fahy, D. Cotter, M. Sud, and S. Subramaniam, “Lipid classification, structures and tools,” Biochimica et Biophysica Acta, vol. 1811, no. 11, pp. 637–647, 2011.
15. U. Igbavboa, J. Hamilton, H.-Y. Kim, G. Y. Sun, and W. G. Wood, “A new role for apolipoprotein E: modulating transport of polyunsaturated phospholipid molecular species in synaptic plasma membranes,” Journal of Neurochemistry, vol. 80, no. 2, pp. 255–261, 2002.
16. M. J. Sharman, G. Shui, A. Z. Fernandis et al., “Profiling brain and plasma lipids in human apoe ε2, ε3, and ε4 knock-in mice using electrospray ionization mass spectrometry,” Journal of Alzheimer's Disease, vol. 20, no. 1, pp. 105–111, 2010.
17. R. B. Chan, T. G. Oliveira, E. P. Cortes et al., “Comparative lipidomic analysis of mouse and human brain with Alzheimer disease,” The Journal of Biological Chemistry, vol. 287, no. 4, pp. 2678–2688, 2012.
18. P. H. Axelsen and R. C. Murphy, “Quantitative analysis of phospholipids containing arachidonate and docosahexaenoate chains in microdissected regions of mouse brain,” Journal of Lipid Research, vol. 51, no. 3, pp. 660–671, 2010.
19. S. Osawa, S. Funamoto, M. Nobuhara et al., “Phosphoinositides suppress γ-secretase in both the detergent-soluble and -insoluble states,” The Journal of Biological Chemistry, vol. 283, no. 28, pp. 19283–19292, 2008.
20. E. Fahy, D. Cotter, R. Byrnes et al., “Bioinformatics for Lipidomics,” Methods in Enzymology, vol. 432, pp. 247–273, 2007.
21. T. Kind, K. H. Liu, D. Y. Lee et al., “LipidBlast in silico tandem mass spectrometry database for lipid identification,” Nature Methods, vol. 10, no. 8, pp. 755–758, 2013.
22. A. P. Blanchard, G. S. McDowell, N. Valenzuela et al., “Visualization and Phospholipid Identification (VaLID): online integrated search engine capable of identifying and visualizing glycerophospholipids with given mass,” Bioinformatics, vol. 29, no. 2, pp. 284–285, 2013.
23. J. R. De Laeter, J. K. Böhlke, P. De Bièvre et al., “Atomic weights of the elements: review 2000,” Pure and Applied Chemistry, vol. 75, no. 6, pp. 683–800, 2003.
24. G. Di Paolo and P. De Camilli, “Phosphoinositides in cell regulation and membrane dynamics,” Nature, vol. 443, no. 7112, pp. 651–657, 2006.
25. M. Miyazaki and J. M. Ntambi, “Fatty acid desaturation and chain elongation in mammals,” in Biochemistry of Lipids, Lipoproteins and Membranes, D. E. Vance and J. E. Vance, Eds., pp. 191–211, Elsevier, 2008.
26. E. Fahy, S. Subramaniam, R. C. Murphy et al., “Update of the LIPID MAPS comprehensive classification system for lipids,” Journal of Lipid Research, vol. 50, pp. S9–S14, 2009.
Copyright © 2014 Graeme S. V. McDowell et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.