Abstract

Pluripotent stem cells (PSCs) are a unique kind of stem cell: they can self-renew indefinitely and have the potential to differentiate into any derivative of the three germ layers. As such, human Embryonic Stem Cells (hESCs) and human induced Pluripotent Stem Cells (hiPSCs) provide a unique opportunity for studying the earliest steps of human embryogenesis and, at the same time, are of great therapeutic interest. The molecular mechanisms underlying pluripotency represent a major field of research. Recent evidence suggests that a complex network of transcription factors, chromatin regulators, and noncoding RNAs exists in pluripotent cells to regulate the balance between self-renewal and multilineage differentiation. Regulatory noncoding RNAs fall into two classes: short and long. The first class includes microRNAs (miRNAs), which are involved in the posttranscriptional regulation of cell cycle and differentiation in PSCs. In contrast, long noncoding RNAs (lncRNAs) represent a heterogeneous group of long transcripts that regulate gene expression at transcriptional and posttranscriptional levels. In this review, we focus on the role played by lncRNAs in the maintenance of pluripotency, emphasizing the interplay between lncRNAs and other pivotal regulators in PSCs.

1. Introduction

The term long noncoding RNAs (lncRNAs) refers to a heterogeneous class of RNA polymerase II (Pol II) transcripts greater than 200 nucleotides in length with no evident protein-coding capacity [1, 2]. They are generally spliced from multiexonic precursors, capped, polyadenylated, and localized to the nucleus, cytoplasm, or both [2, 3]. Based on the position of their transcription loci relative to adjacent genes, lncRNAs can be classified as intronic, intergenic, or overlapping (in sense or antisense orientation) transcripts. Even though lncRNAs are less conserved than mRNAs and small noncoding RNAs [4, 5], lack of conservation does not imply a lack of function [6]. Indeed, both the transcript length and the versatility of RNA to base-pair let these molecules fold into complex secondary structures [7, 8], which are interspersed with longer and less conserved stretches of nucleotide sequences. As highlighted by pioneering studies [1, 6], these structures allow lncRNAs to simultaneously interact with multiple complexes, thereby coordinating their activities.
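
The positional classification described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Locus` and `Gene` types, the 0-based half-open coordinates, and the rule of labelling a transcript by the first overlapping gene are all hypothetical simplifications, not a published annotation pipeline. A transcript is also called intronic when it lies entirely within one intron, regardless of strand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Locus:
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive
    strand: str  # "+" or "-"


@dataclass(frozen=True)
class Gene:
    locus: Locus
    introns: Tuple[Tuple[int, int], ...]  # (start, end) pairs on the same chromosome


def overlaps(a: Locus, b: Locus) -> bool:
    """True if the two loci share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify(lnc: Locus, genes: List[Gene]) -> str:
    """Label a lncRNA locus relative to annotated genes (simplified)."""
    for gene in genes:
        if not overlaps(lnc, gene.locus):
            continue
        # Fully contained in a single intron of the host gene.
        if any(s <= lnc.start and lnc.end <= e for s, e in gene.introns):
            return "intronic"
        # Otherwise it overlaps the gene body; orientation decides the label.
        if lnc.strand == gene.locus.strand:
            return "overlapping_sense"
        return "overlapping_antisense"
    # No annotated gene overlaps the locus.
    return "intergenic"


# Toy annotation: one gene on chr1 with a single intron.
toy_gene = Gene(Locus("chr1", 1000, 5000, "+"), ((2000, 3000),))
```

With the toy gene above, `classify(Locus("chr1", 2100, 2900, "+"), [toy_gene])` falls inside the intron, whereas a locus with no overlapping gene is reported as intergenic.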

Even though only a fraction of lncRNAs has been mechanistically characterized, several studies have shown that lncRNAs participate in different processes related to normal physiology and/or disease [4, 6]. As they are Pol II transcripts, lncRNA expression can be tightly regulated. Indeed, lncRNA transcripts are globally more tissue specific than protein-coding genes, suggesting potential roles in specifying cell identity [9, 10].

The intracellular localization of lncRNAs is predictive of their mode of action [6, 10]. Usually, nuclear lncRNAs can guide chromatin modification complexes to specific genomic loci and/or serve as molecular scaffolds that tether together distinct functionally related complexes [11, 12]. Due to their intrinsic ability to base-pair with other nucleic acids, both cis-acting (on neighboring genes) and trans-acting (on distant loci) lncRNAs can exert either repressive or promoting activities on target genes by coordinating protein and RNA interactions [12–14]. Based on known examples [12], nuclear lncRNAs can exert their regulatory function as decoys by simply titrating transcription factors and other proteins away from chromatin [15–17]. As a paradigm, depletion of the lncRNA PANDA substantially increased the occupancy of target genes by NF-YA, a nuclear transcription factor that triggers apoptosis upon DNA damage [17]. lncRNA binding on DNA can initiate the formation of heterochromatin by recruitment of DNA or histone methyltransferases (such as the histone H3 lysine 27 (H3K27) methyltransferase complex PRC2), resulting in repression of gene expression. Conversely, transcriptional activation can be induced by recruitment of different chromatin modifiers, such as the H3 lysine 4 (H3K4) methyltransferase MLL1, or by changing the 3D chromatin conformation [12, 13]. Among the cis-acting species, the enhancer-associated ncRNAs (eRNAs) are functional transcripts participating in many programs of gene activation. In particular, they play fundamental roles in targeting chromatin-remodeling complexes to specific promoters and in assisting the formation of chromatin loops [18]. Using an integrated epigenomic screening, Ounzain and colleagues recently established a catalogue of enhancer-associated noncoding RNAs dynamically expressed in ESCs during cardiac differentiation [19].
The expression of these transcripts correlated with the expression of target genes in their genomic proximity. Interestingly, the expression of the eRNAs was inhibited when the target mRNAs reached maximal levels. Overall, these data contributed substantially to understanding the functional impact of cardiac eRNAs on heart development and cardiac remodeling after injury.

Some other lncRNAs are localized in the cytoplasm, where they can regulate gene expression by base-pairing with complementary regions on target RNAs. In humans, several cytoplasmic lncRNAs transactivate Staufen1-mediated mRNA decay by duplexing with 3′UTRs via Alu elements [20]. Another example is BACE1-AS, the antisense RNA of β-site APP-cleaving enzyme 1 (BACE1), which binds to BACE1 mRNA, inducing its stabilization. By regulating BACE1 expression, this noncoding RNA plays a role at the boundary between physiology and the pathology of Alzheimer’s disease [21]. Base-pairing is also the principle that applies to the competing endogenous RNA (ceRNA) activity of lncRNAs [22]. In this case, lncRNAs can indirectly enhance protein translation by sequestering, or “sponging,” miRNAs that otherwise would inhibit their target mRNAs. This mechanism has been shown to be involved in differentiation and cancer [22, 23]. Finally, a peculiar class of sponging lncRNAs is represented by circular RNAs (circRNAs) [24, 25], whose unusual circular structure confers increased stability. Altogether, these different properties enable lncRNAs to operate through distinct modes of action and to exert a wide range of functions across diverse biological processes.

Embryonic Stem Cells (ESCs) are the in vitro counterpart of the pluripotent epiblast of the blastocyst and constitute a useful system to study the molecular mechanisms underlying pluripotency. A group of transcription factors (TFs), comprising OCT4, NANOG, and SOX2, has been proposed as the core regulatory circuitry in ESCs [26]. These are pluripotency factors that ensure the proper expression of genes involved in the maintenance of the undifferentiated state. At the same time, they repress many genes that play a role during subsequent development. Such developmental genes, however, are often kept in a silent but “poised” state by the establishment of bivalent chromatin domains, where histone repressive marks coexist with marks related to active transcription [27]. It is now becoming clear that the core pluripotency TFs operate in concert with miRNAs and lncRNAs [28–30]. One example of a miRNA family that plays a role at the crossroads between pluripotency and differentiation is the miR-302 family [31]. Among other activities, miR-302 regulates the balance between agonists and antagonists of TGFβ/BMP signalling, which is a crucial pathway for the choice between maintenance of pluripotency and differentiation [32]. In ESCs, the activity of miR-302 is counteracted by let-7, an opposing miRNA family that plays a prodifferentiative role [33]. Other miRNAs also facilitate differentiation by targeting pluripotency factors or chromatin modifiers [28]. In this review, we focus on recent evidence suggesting that lncRNAs also play an important role in the maintenance of pluripotency.

For a long time, ESCs represented the only system to model early human development. More recently, the Nobel Prize-awarded derivation of induced Pluripotent Stem Cells (iPSCs) provided an alternative source of pluripotent cells [34]. iPSCs can be derived from adult human somatic cells through a reprogramming process consisting of the ectopic expression of defined factors. As their derivation requires a simple skin biopsy (or blood sampling), human iPSCs overcome ethical and legislative issues that limit research based on human ESCs (hESCs). Importantly, iPSCs generated from human patients with genetic disorders represent a promising tool for both regenerative medicine and in vitro disease modeling.

2. The lncRNA Signature in Embryonic Stem Cells

As for protein-coding genes and miRNAs [31], Pluripotent Stem Cells express a characteristic set of lncRNAs. The lncRNA signature of mouse ESCs (mESCs) has been defined by microarray analysis [35] and genome-wide mapping of chromatin marks of actively transcribed genes, such as trimethylation of lysine 4 of histone H3 (H3K4me3) in the promoter coupled with trimethylation of lysine 36 of histone H3 in the transcribed region (K4-K36 domain) [36]. Work by Dinger et al. [35] identified several lncRNAs that are differentially expressed in proliferating mESCs and upon induction of hematopoietic differentiation. Analysis of K4-K36 domains located outside the known protein-coding loci allowed Guttman et al. [36] to identify over a thousand novel lncRNAs in mESCs and somatic cells. The catalogue of mESC lncRNAs was then expanded by including a substantial fraction of species transcribed from genes not marked by a K4-K36 domain, identified by a computational method that allowed the reconstruction of the whole transcriptome from RNA-Seq data (Scripture) [37]. A significant subset of these lncRNAs may be regulated at the transcriptional level by the ESC core TFs [29, 38].
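
The logic of searching for K4-K36 domains outside known protein-coding loci can be sketched in Python as follows. This is a toy illustration, not the procedure used in [36]: the `Interval` type, the `max_gap` tolerance between an H3K4me3 peak and an H3K36me3 domain, and the simple overlap filter against coding genes are assumptions introduced here, and strand and peak position within the domain are ignored.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def k4_k36_domains(k4_peaks: List[Interval],
                   k36_domains: List[Interval],
                   max_gap: int = 1000) -> List[Interval]:
    """Merge each H3K36me3 domain with the first H3K4me3 peak lying within max_gap of it."""
    merged = []
    for k36 in k36_domains:
        # Pad the K36 domain so that a nearby, non-overlapping K4 peak still counts.
        padded = Interval(k36.chrom, k36.start - max_gap, k36.end + max_gap)
        for k4 in k4_peaks:
            if overlaps(k4, padded):
                merged.append(Interval(k36.chrom,
                                       min(k4.start, k36.start),
                                       max(k4.end, k36.end)))
                break  # one K4 peak is enough to call the domain
    return merged


def candidate_lncrna_loci(domains: List[Interval],
                          coding_genes: List[Interval]) -> List[Interval]:
    """Keep only K4-K36 domains that do not overlap any annotated protein-coding gene."""
    return [d for d in domains if not any(overlaps(d, g) for g in coding_genes)]


# Toy data: two H3K4me3 peaks, three H3K36me3 domains, one coding gene.
k4 = [Interval("chr1", 100, 600), Interval("chr1", 50000, 50500)]
k36 = [Interval("chr1", 700, 9000), Interval("chr1", 50400, 60000), Interval("chr2", 100, 5000)]
coding = [Interval("chr1", 49000, 61000)]

domains = k4_k36_domains(k4, k36)
candidates = candidate_lncrna_loci(domains, coding)
```

In this toy data set, the domain on chr2 has no associated H3K4me3 peak and the second chr1 domain overlaps a coding gene, so only the first chr1 domain is retained as a candidate noncoding locus.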

As in the case of mouse ESCs, K4-K36 domain analysis allowed the initial identification of a characteristic set of lncRNA genes expressed in human ESCs [39]. This list was then further extended by integrating data from RNA-Seq analysis [4]. A more detailed characterization has shown that some human lncRNAs could be under the direct control of the core pluripotency TFs [40, 41].

3. lncRNAs Play a Role in the Maintenance of Pluripotency in ESCs

Increasing evidence points to a crucial role for lncRNAs in maintaining ESC self-renewal and pluripotency, thus preventing differentiation. In a large-scale functional study, individual knockdown of more than 90% of the 147 lncRNAs tested caused a significant perturbation of the transcriptome, often resulting in the loss of mESC pluripotency [29]. Interestingly, lncRNAs involved in the maintenance of ESC self-renewal are often transcriptionally regulated by core pluripotency TFs and act in regulatory networks. Examples of this mechanism include the AK028326/GOMAFU/MIAT (OCT4-activated) and AK141205 (NANOG-repressed) lncRNAs, whose alteration leads to robust changes in OCT4 and NANOG levels and affects the pluripotency of mESCs [38]. The lncRNA TUNA/MEGAMIND is required for mESC proliferation and maintenance of self-renewal [5]. TUNA binds a complex comprising several RNA-binding proteins and activates transcription of NANOG and SOX2 upon binding to their promoters [42]. The interplay between core TFs and lncRNAs has also been reported in hESCs for lncRNA_ES1, lncRNA_ES2, and lncRNA_ES3 [40]. Taken together, these examples indicate that lncRNAs are involved in the maintenance of the undifferentiated state and the repression of genetic programs that direct lineage commitment during differentiation.

The challenge now is to dissect the molecular mechanisms underlying the functions of these ESC lncRNAs. Mechanistically, nuclear lncRNAs may exert their function by binding and regulating the activity and/or target specificity of chromatin-modifying factors. It has been shown that ESC lncRNAs interact with all classes of histone modifiers (writers, readers, and erasers), as well as other chromatin-associated proteins [29]. This is in line with a possible role of these long transcripts as molecular scaffolds that bridge together different chromatin modification complexes [11]. Recent examples support the hypothesis that lncRNAs may be pivotal regulators of the activity of crucial chromatin modifiers, which play an essential role in the epigenetic regulation of ESC pluripotency and differentiation. Genome-wide analysis identified a multitude of potential lncRNA interactors of PRC2 in mESCs, and a somewhat promiscuous RNA-binding activity of this complex has been suggested [43, 44]. Recent work proposed that lncRNA binding might be important to modulate the interaction of PRC2 with its cofactors, thus modulating its activity and/or specificity. One such cofactor is JARID2, which belongs to the JUMONJI family of lysine demethylases (KDMs). JARID2 is peculiar in that its KDM catalytic domain is inactive, and it is particularly enriched in ESCs, where it regulates PRC2 activity and genome occupancy [45, 46]. It has recently been shown that JARID2 contains an RNA-binding region and directly interacts with about 100 previously annotated lncRNAs in mESCs [47]. Of particular interest among these interactors are MEG3 (also known as GTL2), RIAN, and MIRG, lncRNAs that are encoded within an imprinted locus on chromosome 12qF1, referred to as the Dlk1-Dio3 gene cluster.
Proper expression of these lncRNAs is required for embryonic development [48, 49] and to achieve full pluripotency during reprogramming, as iPSCs carrying aberrantly silenced Dlk1-Dio3 cluster genes are unable to fulfill stringent pluripotency tests, such as contribution to chimaeric mice development and complementation of a tetraploid blastocyst [50]. Functionally, by binding JARID2, MEG3 and other Dlk1-Dio3 gene cluster lncRNAs may modulate the activity of PRC2 in Pluripotent Stem Cells. Genome-wide analysis indeed showed that Meg3 stimulates PRC2 occupancy in trans at genomic loci encoding for factors involved in differentiation and development [44]. These genes are derepressed in human iPSC lines expressing low levels of MEG3, suggesting evolutionary conservation of the MEG3-JARID2 axis. Mechanistically, MEG3 and other lncRNAs work as scaffolds to increase the interaction between JARID2 and the PRC2 core component EZH2 and, therefore, PRC2 assembly on chromatin at JARID2 target sites. Moreover, it has been suggested that these lncRNAs may also guide the initial recruitment of PRC2/JARID2 at specific target sites in pluripotent cells via RNA-DNA base-pairing [47] (Figure 1).

Trithorax group (TrxG) factors, including mammalian MLL complexes, positively regulate transcription via the H3K4me3 mark. This activity is required to maintain pluripotency in ESCs. In particular, the WDR5 member of the MLL complex directly interacts with the core transcriptional regulatory circuitry and its depletion causes loss of self-renewal [51]. By taking advantage of an RNA-binding deficient mutant, Yang and coworkers recently demonstrated that the interaction with RNA is essential for WDR5 activity [52]. The half-life of the WDR5 mutant protein in the nucleus is reduced compared to wild-type, indicating that RNA binding positively regulates protein stability. Over 1000 RNAs might bind WDR5 in ESCs, including 23 previously annotated lncRNAs. Among these interactors, six were previously identified as lncRNAs required to maintain pluripotency in mESCs [29], providing a mechanistic explanation of their function. WDR5 also binds the promoters of two of these interacting lncRNAs, lincRNA-1592 and lincRNA-1552, suggesting a cis regulatory mechanism [49]. lincRNA-1552 expression may be under the direct control of many pluripotency transcription factors, including OCT4, NANOG, and KLF4, which bind its promoter, and its knockdown leads to misexpression of OCT4 and NANOG, among other mRNAs [29]. This evidence, together with the impairment of self-renewal in cells expressing the RNA-binding deficient WDR5 mutant [52], suggests that lncRNAs interacting with the Trithorax complex play a crucial role in the maintenance of ESC pluripotency (Figure 1).

The interplay between lncRNAs and Trithorax complexes may also direct specification towards specific cell fates upon ESC differentiation. The homeotic genes Hoxa6 and Hoxa7 are involved in the specification of mesoderm-derived tissues and organs [53, 54]. Bertani and colleagues demonstrated that the lncRNA MISTRAL (MIRA) mediates the transcriptional activation of the Hoxa6 and Hoxa7 genes by recruiting MLL to chromatin [55]. MIRA-mediated activation of Hoxa6 and Hoxa7 culminates in the expression of genes involved in early germ layer specification in differentiating mESCs. Another interesting example of a lncRNA involved in mESC differentiation is pRNA, which is localized in the nucleolus [56]. In Pluripotent Stem Cells, chromatin is globally in a transcriptionally permissive open state and becomes increasingly condensed and transcriptionally repressed upon differentiation (reviewed in [57]). Chromatin condensation also occurs at ribosomal genes and is promoted by pRNA, which guides the nucleolar repressor factor TIP5/BAZ2A to ribosomal DNA (rDNA) [56]. Interestingly, pRNA overexpression caused an increase of heterochromatin also outside rDNA, initiating the global epigenetic remodeling normally observed during differentiation.

In contrast to nuclear ESC lncRNAs, fewer examples exist of cytoplasmic lncRNAs that regulate pluripotency. During the initial steps of the reprogramming process, cells initiating their conversion to pluripotency must elude inhibitory hurdles, such as cell cycle arrest, senescence, and apoptosis, raised by the p53 activation that follows overexpression of the reprogramming factors [58]. Thus, any change in p53 activity is predicted to affect the efficiency of reprogramming by limiting the number of cells entering the process. In this context, the cytoplasmic linc-RoR (Regulator of Reprogramming) was initially identified as a lncRNA able to promote the reprogramming process [41] by acting as a negative regulator of p53 [59]. Subsequently, Wang and colleagues showed that endogenous linc-RoR also plays a key role in the maintenance of hESC self-renewal by acting as a ceRNA [60]. Previous work had shown that a single miRNA, miR-145, inhibits translation of core TFs during ESC differentiation [61]. According to the model by Wang et al., in human ESCs linc-RoR would trap miR-145, derepressing the translation of the core pluripotency transcription factors OCT4, SOX2, and NANOG and ensuring proper levels of expression in undifferentiated hESCs. Upon differentiation, the disappearance of linc-RoR releases miR-145, allowing it to repress the translation of core pluripotency factors [41]. Thus, this work strongly supports the idea that linc-RoR acts as a miRNA sponge. Since OCT4, at the transcriptional level, represses miR-145 and activates linc-RoR, these studies unraveled an interesting network comprising TFs and long and short regulatory RNAs, which act at the crossroads between self-renewal and differentiation (Figure 2).

More recently, Bao and colleagues [62] showed that lincRNA-p21, a nuclear noncoding transcript previously characterized as a global repressor of the p53-dependent transcriptional cascade [63], represents another example of a lncRNA regulating pluripotency. Interestingly, in the context of somatic cell reprogramming, lincRNA-p21 inhibits this process without inducing apoptosis or impairing cell proliferation. It was identified in a functional screening performed in mouse to examine events accompanying the pre-iPSC to iPSC conversion. This is a late step, required to achieve a self-sustaining fully reprogrammed status, in which the cells become independent of the activity of the exogenous reprogramming factors and turn on the expression of endogenous pluripotency regulators [64]. Three lncRNAs, including lincRNA-p21, had a negative effect on the pre-iPSC to iPSC conversion. Mechanistically, lincRNA-p21 has been suggested to sustain the heterochromatic state at pluripotency gene promoters by interacting with HNRNPK. HNRNPK and lincRNA-p21 together would form a repressive complex able to preserve H3K9me3 and CpG methylation at the promoters of key pluripotency regulators such as Nanog, Sox2, and Lin28 [62] (Figure 3). Besides lincRNA-p21, there are only limited examples of nuclear lncRNAs regulating gene expression by controlling DNA methylation. In a more recent paper, Wang and colleagues reported the identification of Dum, a Developmental pluripotency-associated 2 (Dppa2) Upstream binding Muscle lncRNA [65]. The lncRNA Dum was found to silence its neighboring gene Dppa2 in cis by recruiting Dnmt1, Dnmt3a, and Dnmt3b to its promoter. Although the cited work was mainly focused on myogenic differentiation, it is tempting to speculate that a similar regulatory mechanism might play a role in pluripotent cells as well.
Dppa2 is highly enriched in pluripotent cells and activation of endogenous Dppa2 during late steps of reprogramming specifically marks the small subset of cells that will achieve full pluripotency, in which Dppa2-mediated induction of Nanog transcription is a crucial event [66]. Therefore, it will be extremely interesting in the future to assess whether the lncRNA Dum regulates critical steps of reprogramming through modulation of Dppa2.

4. Concluding Remarks

Pluripotency is a unique property of ESCs and iPSCs, which are the only cell types able to undergo indefinite self-renewal and differentiation into derivatives of the three germ layers. Pluripotent cells therefore represent both ideal candidates for dissecting the mechanisms of early embryonic development and potential therapeutic tools for regenerative medicine. Patient-specific iPSCs also provide in vitro platforms to model human disease and to test drugs in preclinical studies. Such potential applications, however, depend on a deep understanding of the molecular mechanisms underlying pluripotency.

An orchestra of transcription factors, chromatin regulators, signaling transducers, miRNAs, and lncRNAs plays in concert in pluripotent cells. None of them can be considered a solo player. Complex networks and feedback loops exist, which comprise members of each class of regulatory factors. A large increase in transcriptome-wide analyses, facilitated by recent advancements in next-generation sequencing technologies, has uncovered a universe of long noncoding transcripts. While there is no general consensus on the extent of the global impact of lncRNAs on the regulation of cell identity and differentiation, there are a few examples in which selected lncRNAs have been analyzed in greater depth. As discussed in this review, at least a subset of known lncRNAs are as important as previously defined “core transcription factors” in the context of pluripotent cells (Table 1). The paucity of functional studies is in striking contrast with the number of annotated lncRNAs (thousands) that are specifically enriched in ESCs and/or described as interactors of crucial pluripotency regulators, such as Polycomb and Trithorax complexes. We expect, in the near future, a substantial increase in functional studies describing new examples of lncRNAs acting in networks with other master regulators in the definition of the pluripotent state.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors are grateful to Professor I. Bozzoni for helpful discussion and to J. Hughes for critical reading of the paper. This work was partially supported by a grant from Sapienza University (C26A14EH5H) to AR and by Institute Pasteur Fondazione Cenci-Bolognetti to MB.