Abstract

Thermophilic fungal cellulases are promising enzymes in protein engineering efforts aimed at optimizing industrial processes, such as biomass degradation and biofuel production. The cloning and expression in recent years of new cellulase genes from thermophilic fungi have led to a better understanding of cellulose degradation in these species. Moreover, crystal structures of thermophilic fungal cellulases are now available, providing insights into their function and stability. The present paper is focused on recent progress in cloning, expression, regulation, and structure of thermophilic fungal cellulases and the current research efforts to improve their properties for better use in biotechnological applications.

1. Introduction

Cellulose is one of the main components of plant cell wall material and is the most abundant renewable nonfossil carbon source on Earth. Degradation of cellulose to its constituent monosaccharides has attracted considerable attention for the production of food and biofuels [1, 2]. The degradation of cellulose to glucose is achieved by the cooperative action of endocellulases (EC 3.2.1.4), exocellulases (cellobiohydrolases, CBH, EC 3.2.1.91; glucanohydrolases, EC 3.2.1.74), and beta-glucosidases (EC 3.2.1.21). Endocellulases hydrolyze internal glycosidic linkages in a random fashion, which results in a rapid decrease in polymer length and a gradual increase in the reducing sugar concentration. Exocellulases hydrolyze cellulose chains by removing mainly cellobiose from either the reducing or the non-reducing ends, which leads to a rapid release of reducing sugars but little change in polymer length. Endocellulases and exocellulases act synergistically on cellulose to produce cellooligosaccharides and cellobiose, which are then cleaved by beta-glucosidase to glucose [3].
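The contrast between the two modes of attack can be illustrated with a deliberately idealized bookkeeping sketch in Python. It is not a kinetic model: it assumes a single hypothetical chain of 1000 glucose units, treats every endocellulase cut as internal, and lets the exocellulase remove one cellobiose (two glucose units) per cut from one chain end. The function names are illustrative only.

```python
def endo_mean_length(dp: int, cuts: int) -> float:
    """Mean chain length after `cuts` internal cleavages of one chain of `dp` units.

    Each internal cut adds one chain, so the mean length is dp / (cuts + 1).
    """
    if not 0 <= cuts < dp:
        raise ValueError("cuts must be between 0 and dp - 1")
    return dp / (cuts + 1)


def exo_remaining_length(dp: int, cuts: int) -> int:
    """Length of the remaining polymer after `cuts` cellobiose removals from one end."""
    if cuts < 0 or 2 * cuts > dp:
        raise ValueError("cannot remove more cellobiose units than the chain holds")
    return dp - 2 * cuts


if __name__ == "__main__":
    dp = 1000  # hypothetical starting chain length (assumption, not from the text)
    for cuts in (1, 10, 100):
        print(cuts, round(endo_mean_length(dp, cuts), 1), exo_remaining_length(dp, cuts))
```

In both cases each cut exposes one new reducing end, but under these assumptions ten endo-type cuts lower the mean chain length to about 91 units, whereas ten exo-type cuts shorten the polymer by only 20 units.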

Thermophilic fungi are species that grow at a maximum temperature of 50°C or above, and a minimum of 20°C or above [4]. Owing to their habitat, thermophilic fungi have received significant attention in recent years as a source of new thermostable enzymes for use in many biotechnological applications, including biomass degradation. Thermophilic cellulases are key enzymes for efficient biomass degradation. Their importance stems from the fact that cellulose swells at higher temperatures, thereby becoming easier to break down. A number of thermophilic fungi have been isolated in recent years, and the cellulases produced by these eukaryotic microorganisms have been purified and characterized at both the structural and functional levels. This review aims to present up-to-date information on molecular, structural, genetic, and engineering aspects of thermophilic fungal cellulases and to highlight their potential in biotechnological applications.

2. Cloning, Expression and Regulation of Cellulase Genes from Thermophilic Fungi

2.1. Regulation of Gene Expression

Production of fungal cellulases is induced mainly in the presence of cellulose and is controlled by a repressor/inducer system [5]. In this system, cellulose or other oligosaccharide products of cellulose degradation act as inducers while glucose or other easily metabolized carbon sources act as repressors [6–10]. It has been demonstrated that the upstream regulatory sequence (URS) in fungal cellulase gene promoters plays a key role in the regulation of glucose repression [11, 12]. In Trichoderma reesei, the protein product of the regulatory gene cre1 (a Cys2His2 zinc finger protein) is a negatively acting transcription factor that binds to the DNA consensus sequence SYGGRG (where S = C or G, Y = C or T, R = A or G) in the URS and represses transcription of cellulase genes in the presence of glucose [11]. In addition, three new transcription factors (ACEI, ACEII, and XYR1) have been identified in T. reesei and implicated in cellulase gene regulation [12]. Thermophilic fungal cellulases have also been found to possess a repressor/inducer system [4]. In contrast to T. reesei, the full repertoire of transcription factors influencing cellulase gene expression in thermophilic fungi has not been described to date. Nevertheless, potential regulatory element consensus sequences have been identified in the 5′ upstream region of thermophilic fungal cellulase genes [6, 9, 13–15], and CREI genes from two thermophilic fungi (Talaromyces emersonii and Thermoascus aurantiacus) have been cloned (GenBank AF440004 and AY604200, resp.). It is therefore likely that cellulase gene regulation in thermophilic fungi shares certain similarities with that in T. reesei.
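As an illustration of how candidate sites matching the SYGGRG consensus might be located in a promoter sequence, the following minimal Python sketch expands the degenerate symbols given above into a regular expression and scans both strands. The function names and the short test sequence are hypothetical and not taken from any cited study; a sequence match only flags a candidate site and does not demonstrate binding or repression.

```python
import re

# Degenerate symbols as defined in the text: S = C or G, Y = C or T, R = A or G.
IUPAC = {"S": "[CG]", "Y": "[CT]", "R": "[AG]",
         "A": "A", "C": "C", "G": "G", "T": "T"}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a degenerate consensus such as SYGGRG into a compiled regex."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead is used so that overlapping sites are all reported.
    return re.compile(f"(?=({body}))")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str, consensus: str = "SYGGRG") -> list[tuple[int, str, str]]:
    """Return (0-based forward-strand start, strand, matched sequence) per site."""
    pattern = consensus_to_regex(consensus)
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    width = len(consensus)
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((len(seq) - m.start() - width, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    promoter = "AATTCTGGGGTTAA"  # hypothetical sequence for illustration
    print(find_sites(promoter))
```

Scanning both strands is a design choice made here because a binding motif can occur in either orientation relative to the gene.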

As in mesophilic fungi, multiple forms of cellulases are produced in thermophilic fungi [4]. Humicola grisea, for example, has four cellobiohydrolases in family 7, while Aspergillus niger (a mesophilic fungus) has two. The observed multiplicity of cellulolytic enzymes may be the result of genetic redundancy [13, 14] or the outcome of differential posttranslational and/or postsecretion processing [4].

2.2. Heterologous Expression

About 50 genes encoding thermophilic fungal cellulases have been isolated, analyzed, and expressed. A brief summary is given in Table 1. Cellulases are glycosyl hydrolases classified into families 1, 3, 5, 6, 7, 8, 9, 10, 12, 16, 44, 45, 48, 51, and 61 (http://www.cazy.org/). Thermophilic fungal cellulases are found in families 1, 3, 5, 6, 7, 12, and 45.

Most cloned cellulase genes of thermophilic fungi are expressed well in host organisms, such as E. coli, yeast, and filamentous fungi. Expression of some thermophilic fungal cellulase genes in heterologous hosts is summarized in Table 1. Transformation of T. reesei with two endoglucanase genes from Melanocarpus albomyces resulted in cellulase activity several times higher than that of the parental M. albomyces strain [23]. The majority of the recombinant cellulases expressed in yeast and filamentous fungi are glycosylated [16, 18]. Both the strain and the culture conditions can affect the type and extent of glycosylation [29]. Notably, when a gene encoding a beta-glucosidase of T. emersonii was cloned into T. reesei, the secreted recombinant enzyme contained 17 potential N-glycosylation sites in its functionally active form [24]. Importantly, glycosylation of cellulases could further improve their thermostability, as previously reported [30]. However, extensive glycosylation in recombinant enzymes could lead to reduced activity and increased non-productive binding on cellulose [29].

3. Purification and Characterization of New Cellulases from Thermophilic Fungi

Purified thermophilic fungal cellulases have been characterized in terms of their molecular weight, optimal pH, optimal temperature, thermostability, and glycosylation. Usually, thermophilic fungal cellulases are single polypeptides although it has been reported that some beta-glucosidases are dimeric [31]. The molecular weight of thermophilic fungal cellulases spans a wide range (30–250 kDa) with different carbohydrate contents (2–50%). Optimal pH and temperature are similar for the majority of the purified cellulases from thermophilic fungi. Thermophilic fungal cellulases are active in the pH range 4.0–7.0 and have a high temperature maximum at 50–80°C for activity (Table 1). In addition, they exhibit remarkable thermal stability and are stable at 60°C with longer half-lives at 70, 80, and 90°C than those from other fungi.

The structural characteristics underpinning the increased stability of thermophilic proteins have been studied more extensively in thermophilic bacteria and hyperthermophilic archaea [32, 33]. It should be noted, however, that a common set of determinants for protein thermostability has not been established so far and several contributors to protein thermostability have been proposed. A recent analysis suggested that an increase in ion pairs on the protein surface and a stronger hydrophobic interior are the major factors supporting increased thermostability in proteins [34]. Compared with thermophilic proteins from thermophilic bacteria and hyperthermophilic archaea, the understanding of the nature and mechanism of thermostability of proteins from thermophilic fungi is relatively poor. Hence, further characterization of amino acid residues related to thermostability is necessary for comprehensive understanding of their role in the thermostability of cellulases from thermophilic fungi.

4. Structure of Thermophilic Fungal Cellulases

4.1. Primary Structure

A common characteristic of cellulases is their modular structure. Typically, endocellulases and cellobiohydrolases are composed of four domains or regions (Figure 1): a signal peptide that mediates secretion, a cellulose-binding domain (CBD) for anchorage to the substrate, a hinge region (linker) rich in Ser, Thr and Pro residues, and a catalytic domain (CD) responsible for the hydrolysis of the substrate. The mature proteins are O- and N-glycosylated in the hinge region and the CDs, respectively. The effect of the glycosylation sites in the hinge region is not clear yet but they may play a role in the flexibility and disorder of the linker [35].

Variations between cellulases within the same mechanistic class have been observed. An example is illustrated by T. emersonii CBHII, which is characterized by a modular structure [6] whereas CBH1 from the same fungus consists solely of a catalytic domain [7]. Similarly, Chaetomium thermophilum CBH1 and CBH2 consist of a typical CBD, a linker, and a catalytic domain. In contrast, CBH3 only comprises a catalytic domain and lacks a CBD and a hinge region [16].

Fungal CBDs are composed of fewer than 40 amino acid residues, and they interact with cellulose through a flat or platform-like hydrophobic binding site formed by three conserved aromatic residues. The binding site is thought to be complementary to the flat surfaces presented by cellulose crystals [36, 37]. The (110) faces of the cellulose crystalline microfibrils have been proposed as the putative CBD binding site [38]. With this arrangement, the glucopyranoside rings of cellulose are expected to be fully exposed and available for hydrophobic interactions.

Deletion of the CBDs from T. reesei Cel7A and Cel6A and H. grisea CBH1 greatly reduces enzymatic activity toward crystalline cellulose [48], suggesting that the tight binding to cellulose mediated by the CBD is necessary for the efficient hydrolysis of crystalline cellulose by these enzymes. Substitution of the three conserved aromatic residues (W494, W520, and Y521) in the H. grisea CBH1 CBD with other amino acids (G, F, or W) has demonstrated that the high activity of H. grisea CBH1 on crystalline cellulose depends on these residues and on high cellulose-binding ability [49].

4.2. Three-Dimensional (3D) Structure

Three-dimensional (3D) structures of thermophilic fungal cellulases from families 5, 6, 7, 12, and 45 have been reported (Table 2; Figure 2) and are briefly described below:

4.2.1. Family 5

Family 5 cellulases belong to the endoglucanase type. The overall fold of the enzymes is a common β/α-barrel. In this family, only one structure from a thermophilic fungus, that of T. aurantiacus Cel5A, is known [45]. The structure consists solely of a catalytic domain. A substrate-binding cleft is visible at the C-terminal end of the barrel. The size and shape of the cleft suggest the binding of seven glucose residues (−4 to +3). In contrast to other family 5 cellulase structures, Cel5A has only a few extrabarrel features, including a short two-stranded β-sheet in β/α-loop 3 and three one-turn helices.

4.2.2. Family 6

Family 6 comprises both endoglucanases and cellobiohydrolases. 3D structures have been reported for the endoglucanase Cel6B and the cellobiohydrolase Cel6A of this family from the thermophilic fungus H. insolens [39, 40]. The structures of these two cellulases exhibit a distorted β/α-barrel with the central β-barrel made up of seven instead of eight parallel β-strands. A substrate binding crevice is formed between strands I and VII. The crevice of Cel6A contains at least four substrate-binding sites, −2 to +2, whereas that of the Cel6B has six substrate-binding sites, −2 to +4. A significant difference between the endoglucanase Cel6B and the cellobiohydrolase Cel6A is that two extended surface loops enclose the active site in the Cel6A. These loops, however, are absent in Cel6B, resulting in an open substrate cleft in this endoglucanase. Because of this structural difference, endoglucanase can hydrolyze bonds internally in cellulose chains whilst cellobiohydrolase acts on chain ends.

4.2.3. Family 7

Similarly to family 6, family 7 contains endoglucanases and cellobiohydrolases. Only a few structures of family 7 thermophilic fungal cellulases are currently known, including T. emersonii CBHIB [7], H. insolens EGI [41, 42], and M. albomyces Cel7B [47]. The structure of M. albomyces Cel7B, similar to T. emersonii CBHIB, is a representative of the family 7 cellobiohydrolases [7]. It consists of two antiparallel β-sheets packed face-to-face to form a β-sandwich. Both β-sheets contain six β-strands. Owing to their strong curvature, these two β-sheets form the concave and convex surfaces of the sandwich. The loops connecting the strands extend from the concave face of the sandwich and form an enclosed substrate-binding tunnel. The tunnel is about 50 Å long and contains nine substrate-binding sites, −7 to +2 [47].

H. insolens EGI has a β-sandwich structure similar to that of M. albomyces Cel7B (a cellobiohydrolase). The structure of EGI comprises two large antiparallel β-sheets consisting of seven and eight β-strands, respectively [41, 42]. However, there are structural differences between EGI and Cel7B. EGI, for instance, has a long, open active site cleft in the center of a canyon formed by the curvature of the β-strands in the β-sandwich. In contrast, Cel7B has an enclosed substrate-binding tunnel [41, 47]. This distinction between an open cleft and an enclosed tunnel mirrors that between the endoglucanases and cellobiohydrolases of GH family 6. C. thermophilum CBH3 is a thermostable, single-module cellobiohydrolase with no 3D structure available [16]. This cellobiohydrolase shares high sequence identity (80%) with M. albomyces Cel7B. A homology model based on the M. albomyces Cel7B structure [47] showed that all the important residues in the catalytic site and substrate-binding site, as well as the disulphide bonds present in M. albomyces Cel7B, are also found in C. thermophilum CBH3.

4.2.4. Family 12

The structure of a family 12 fungal cellulase from the thermophilic fungus H. grisea has been reported [44, 51]. It comprises 15 β-strands that fold into two antiparallel β-sheets, which pack on top of each other to form a compact curved β-sandwich. The convex β-sheet consists of six antiparallel strands, and the concave β-sheet consists of nine antiparallel strands. The structure’s concave face creates a long substrate-binding cleft with six substrate-binding sites, −4 to +2.

4.2.5. Family 45

The structures of two endoglucanases from family 45 have been solved: H. insolens Cel45A (EGV) [43] and M. albomyces 20 kDa endoglucanase [46, 52]. These two endoglucanases have a similar overall fold. Their structure consists of a six-stranded β-barrel with interconnecting loops. The molecule has the shape of a flattened sphere with approximate dimensions 32 Å × 32 Å × 22 Å. The β-strands are connected with long disulfide-bonded loop structures while the remainder of the structure is completed by three helices. A substrate-binding groove is formed between the β-barrel and the loop structures. This groove, approximately 40 Å long, 10 Å deep, and 12 Å wide, is subdivided into six substrate-binding sites, −4 to +2 [46].
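The subsite ranges quoted in Sections 4.2.1 to 4.2.5 follow the usual convention for glycosyl hydrolases: subsites are labelled with negative numbers on one side of the scissile bond and positive numbers on the other, cleavage takes place between −1 and +1, and there is no subsite 0. A range such as −4 to +3 therefore corresponds to seven sites, not eight. The short Python sketch below makes this arithmetic explicit; the function names are illustrative only.

```python
def subsite_labels(first: int, last: int) -> list[str]:
    """List subsite labels from `first` (negative) to `last` (positive), skipping 0."""
    if not (first < 0 < last):
        raise ValueError("expected a negative first subsite and a positive last subsite")
    return [f"{i:+d}" for i in range(first, last + 1) if i != 0]


def count_subsites(first: int, last: int) -> int:
    """Number of subsites in the range; cleavage occurs between -1 and +1."""
    return len(subsite_labels(first, last))


if __name__ == "__main__":
    # Ranges quoted in the text for the families described above.
    for name, (first, last) in {
        "T. aurantiacus Cel5A": (-4, 3),
        "H. insolens Cel6A": (-2, 2),
        "H. insolens Cel6B": (-2, 4),
        "M. albomyces Cel7B": (-7, 2),
    }.items():
        print(name, count_subsites(first, last), subsite_labels(first, last))
```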

5. Improvement of Thermophilic Fungal Cellulases

The current challenge in biomass conversion by cellulases concerns the degradation of cellulose in an efficient and cheap way. To increase cellulase efficiencies and to lower the cost, cellulases need to be improved to have higher catalytic efficiency on cellulose, higher stability at elevated temperatures and at nonphysiological pH, and higher tolerance to end-product inhibition [53]. Currently, two main research approaches used in the improvement of cellulases through protein engineering are: structure-based rational site-directed mutagenesis and random mutagenesis through directed evolution. Site-directed mutagenesis requires detailed knowledge of the protein’s 3D structure. On the other hand, the directed evolution approach is not limited by the lack of the protein’s 3D structure but requires an efficient method for high throughput screening [54].

5.1. Improvement of Thermostability

Although cellulases from thermophilic fungi are thermostable, a further increase in their thermostability would be beneficial for industrial applications. Improvement of M. albomyces Cel7B has been pursued by error-prone PCR, and 49 positive mutant clones were identified among 14600 random clones by a robotic high-throughput thermostability screening method [55]. Two positive thermostable mutants, Ala30Thr and Ser290Thr, showed improvements in unfolding temperature (Tm) of 1.5 and 3.5°C, respectively. In addition, the optimum temperature on a soluble substrate for the Ala30Thr mutant was improved by 5°C. The amino acid alterations are located in the β-strands furthest away from the active site tunnel of the Cel7B enzyme, where they could improve protein packing. Recently, Cel7A cellobiohydrolase from the thermophilic fungus T. emersonii was engineered using rational mutagenesis to improve its thermostability and activity [25]. Additional disulphide bridges were introduced into the catalytic module of Cel7A. Three mutants had clearly improved thermostability, as reflected by an improvement in Avicel hydrolysis efficiency at 75°C.

Structural analysis of H. grisea Cel12A, a thermostable endoglucanase, has revealed three unusual free cysteines in the enzyme: Cys175, Cys206, and Cys216. Subsequently, the following Cel12A mutants were constructed by site-directed mutagenesis: Cys175Gly, Cys206Pro, and Cys216Val. It was found that the three free cysteines play a significant role in modulating the stability of the enzyme [56]. More specifically, mutation of Cys206 to Pro and Cys216 to Val caused a reduction in the unfolding temperature (Tm) of 9.1 and 5.5°C, respectively, compared to the wild-type enzyme. Moreover, when the free Cys175 was mutated to Gly, the Tm of the enzyme increased by 1.3°C. It has recently been reported that endoglucanases are characterized by variations in amino acid composition resulting in fold-specific thermostability [57], thus providing new strategies for improvement of thermostability.

A new computational approach, SCHEMA, which uses protein structure data to generate new purpose-specific sequences that minimize structure disruption when they are recombined in chimeric proteins, has been employed to create thermostable fungal cellulases [21, 22]. The high-resolution structure of H. insolens CBHII [39], used as a template for SCHEMA, yielded a collection of highly thermostable CBHII chimeras. Using the computer-generated sequences, a total of 31 new cellulase genes were synthesized and expressed in Saccharomyces cerevisiae; each of these cellulases was found to be more stable than the most stable parent cellulase from H. insolens, as measured either by half-life of inactivation at 63°C or by T50 (the temperature at which half of the initial activity is lost). These findings demonstrated the value of using structure-guided recombination to discover important sequence-function relationships for efficient generation of highly stable cellulases.
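For readers unfamiliar with the half-life of inactivation used as a stability measure here, the following Python sketch shows how such a value could be estimated from a single residual-activity measurement. It assumes simple first-order inactivation, A(t) = A0·exp(−kt), so that the half-life is ln 2 / k; this kinetic model is an assumption made for illustration and is not stated in the studies cited, and the numbers in the example are hypothetical.

```python
import math


def inactivation_rate(time_min: float, residual_fraction: float) -> float:
    """First-order rate constant k (per minute) from A(t)/A0 measured at `time_min`."""
    if time_min <= 0:
        raise ValueError("time must be positive")
    if not 0 < residual_fraction < 1:
        raise ValueError("residual fraction must lie strictly between 0 and 1")
    return -math.log(residual_fraction) / time_min


def half_life(time_min: float, residual_fraction: float) -> float:
    """Half-life (minutes) under the first-order assumption: t1/2 = ln 2 / k."""
    return math.log(2) / inactivation_rate(time_min, residual_fraction)


if __name__ == "__main__":
    # Hypothetical example: 25% of the initial activity remains after 60 min.
    print(round(half_life(60, 0.25), 1), "min")
```

In practice several time points would normally be fitted together; a single-point estimate is used here only to keep the relationship between residual activity and half-life visible.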

In addition to the improvement of cellulase thermostability, an increase in cellulase stability in detergent solutions following protein engineering has also been reported [58]. H. insolens Cel45 endoglucanase is used in the detergent industry, but is inactivated by the detergent C12-LAS (an anionic surfactant) owing to the positive charges on the enzyme surface. Based on the Cel45 crystal structure, different mutations of surface residues were introduced by site-directed mutagenesis. The data on these mutants showed that the introduction of positive charges or removal of negative charges greatly increases detergent sensitivity. The R158E mutation, in particular, gave the highest increase in stability against C12-LAS.

5.2. Improvement of Catalytic Activity

The improvement of cellulase catalytic activity using site-directed mutagenesis and directed evolution has attracted considerable attention in recent years. However, general rules for site-directed mutagenesis are lacking, and methods for screening directed-evolution libraries on solid cellulosic substrates remain limited; as a result, only a few cellulase mutants with significantly higher activity on insoluble substrates have been reported [53]. A 20% improvement in the activity of a modified endoglucanase Cel5A from the bacterium Acidothermus cellulolyticus has been reported on microcrystalline cellulose following site-directed mutagenesis [59]. A 5-fold higher specific activity in a Bacillus subtilis endoglucanase mutant was found following directed evolution [60]. An endocellulase gene from the termite Reticulitermes speratus was modified by site-directed mutagenesis, and three mutants, G91A, Y97W, and K429A, displayed higher activities towards carboxymethyl cellulose than the wild-type enzyme [61]. Similarly, few reports have been documented thus far on improving the catalytic activity of thermophilic fungal cellulases using either site-directed mutagenesis or directed evolution. As discussed above, the S290T mutant of M. albomyces Cel7B exhibits not only improved thermostability but also a 2-fold increase in the rate of Avicel hydrolysis at 70°C [62]. Similar results were also obtained with T. emersonii Cel7A following site-directed mutagenesis [25].

As mentioned previously, and highlighted by recent studies [37], CBDs of cellulases play important roles in enhancing enzymatic activities against crystalline cellulose. A basic approach in CBD engineering is to add or replace a CBD in order to improve hydrolytic activity. Indeed, addition of a CBD from T. reesei CBHII to a T. harzianum chitinase resulted in increased hydrolytic activity on insoluble substrates [63]. The thermophilic fungus H. grisea produces two endoglucanases, one with a CBD (EGL3) and one without CBD (EGL4). The fusion protein, EGL4CBD, which consists of the EGL4 catalytic domain and the EGL3 CBD, shows relatively high activity against carboxymethyl cellulose [18]. M. albomyces family 7 (Cel7A and Cel7B) and family 45 (Cel45A) glycosyl hydrolases lack a consensus CBD and its associated linker [23]. To improve their efficiency, these three cellulases were genetically modified to carry the CBD of T. reesei CBHI. The presence of the CBD was shown to improve their hydrolytic potential towards crystalline cellulose [64].

5.3. Conversion to Glycosynthases

An important development in cellulase engineering is the conversion of cellulases to glycosynthases by site-directed mutagenesis [65]. Glycosynthases are retaining glycosidase mutants in which the catalytic nucleophile has been replaced by a non-nucleophilic residue. The first glycosynthase reported from thermophilic fungi was derived from H. insolens Cel7B after E197 was mutated to Ala. The resultant Cel7B E197A glycosynthase was able to catalyze the regio- and stereoselective glycosylation of appropriate acceptors in high yield [66]. More recently, three mutants of the H. insolens Cel7B E197A glycosynthase were prepared by site-directed mutagenesis and characterized: the E197A/H209A and E197A/H209G double mutants, and the E197A/H209A/A211T triple mutant [67]. These second-generation glycosynthase mutants were rationally redesigned in the +1 subsite with the aim of broadening the substrate specificity of the glycosynthase. The results showed that the double mutants E197A/H209A and E197A/H209G preferentially catalyze the formation of a β-(1,4) linkage between the two disaccharides. In contrast, the single Cel7B mutant E197A and the triple Cel7B mutant E197A/H209A/A211T produce predominantly the β-(1,3)-linked tetrasaccharide. This work indicated that the regioselectivity of the glycosylation reaction catalyzed by the H. insolens Cel7B E197A glycosynthase could be modulated by appropriate active-site mutations.
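The mutant names used throughout Section 5 follow the common substitution notation: wild-type residue, sequence position, replacement residue, with multiple substitutions separated by a slash. Both one-letter (E197A) and three-letter (Cys206Pro) residue codes appear in the text. As a reading aid, the Python sketch below parses such labels into a uniform form; the function and class names are illustrative and do not come from any tool used in the cited work.

```python
import re
from typing import NamedTuple

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = set(THREE_TO_ONE.values())

# A residue is either a three-letter code (e.g. Cys) or a one-letter code (e.g. C).
_RESIDUE = r"[A-Z][a-z]{2}|[A-Z]"
_SUBSTITUTION = re.compile(rf"({_RESIDUE})(\d+)({_RESIDUE})")


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # residue number in the sequence
    mutant: str     # one-letter code of the replacement residue


def _to_one_letter(code: str) -> str:
    one = THREE_TO_ONE.get(code, code)
    if one not in ONE_LETTER:
        raise ValueError(f"unknown residue code: {code!r}")
    return one


def parse_variant(label: str) -> list[Substitution]:
    """Parse a label such as 'E197A/H209A/A211T' or 'Cys206Pro'."""
    substitutions = []
    for part in label.split("/"):
        match = _SUBSTITUTION.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"not a substitution: {part!r}")
        wild_type, position, mutant = match.groups()
        substitutions.append(
            Substitution(_to_one_letter(wild_type), int(position), _to_one_letter(mutant))
        )
    return substitutions


if __name__ == "__main__":
    print(parse_variant("E197A/H209A/A211T"))
    print(parse_variant("Cys206Pro"))
```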

6. Conclusions and Future Perspectives

Thermophilic fungal cellulases have recently emerged as promising alternatives in biotechnological applications. However, only a minority of thermophilic fungal cellulases has been characterized in detail so far. Site-directed mutagenesis and directed evolution have been employed and are currently the preferred approaches for obtaining novel thermostable mutants. A systematic characterization of cellulases from additional thermophilic fungi is necessary to better understand their thermostability and evolutionary relationships to mesophilic cellulases. Further improvement of thermophilic fungal cellulases will assist in developing better and more versatile cellulases for biotechnological applications and provide novel opportunities in protein engineering efforts.

Acknowledgments

This work was supported by the Chinese National Program for High Technology, Research and Development, the Chinese Project of Transgenic Organisms, the National Department Public Benefit Research Foundation, and the China National Special Fund of Sea Renewable Energy Sources (SDME2011SW01). A. C. Papageorgiou thanks the Academy of Finland for financial support (Grant no. 121278).