Abstract

Sorting of transmembrane proteins to various intracellular compartments depends on specific signals present within their cytosolic domains. Among these sorting signals, the tyrosine-based motif (YXXØ) is one of the best characterized and is recognized by μ-subunits of the four clathrin-associated adaptor complexes (AP-1 to AP-4). Despite their overlap in specificity, each μ-subunit has a distinct sequence preference dependent on the nature of the X-residues. Moreover, combinations of these residues exert cooperative or inhibitory effects towards interaction with the various APs. This complexity makes it impossible to predict a priori the specificity of a given tyrosine-signal for a particular μ-subunit. Here, we describe the results obtained with a computational approach based on the Artificial Neural Network (ANN) paradigm that addresses the issue of tyrosine-signal specificity, enabling the prediction of YXXØ-μ interactions with accuracies over 90%. Therefore, this approach constitutes a powerful tool to help predict mechanisms of intracellular protein sorting.

1. Introduction

A defining characteristic of eukaryotic cells is the presence of membrane-bound intracellular compartments. These membranous structures host specific biochemical processes by virtue of their distinctive lipid and protein composition [1]. Nevertheless, in order to be able to contribute to the physiology of the cell, this array of processing stations needs to be linked and coordinated by a robust trafficking system of membranous carriers [1, 2]. Indeed, the transport of cargo by this system plays a crucial role in the establishment/maintenance of each compartment’s identity and in the delivery of substrates [1, 2].

Given the outstanding relevance of protein trafficking for the onset of diseases, as well as the significance of trafficking in pathogenic infection [3, 4], understanding the mechanisms by which the cell targets its proteins to the appropriate compartment has been the focus of multiple labs [5–9]. A landmark achievement resulting from these efforts was the realization that some transmembrane proteins contain sorting signals embedded in the amino acid sequence of their cytoplasmic segments [9]. These signals are recognized by intracellular receptors that mediate the protein's inclusion in, or exclusion from, trafficking carriers [9]. Among this signal-recognition machinery, the tetrameric clathrin-associated Adaptor Proteins (APs) emerge as major players in the protein trafficking system [9, 10]. Four different AP complexes (AP-1 through AP-4) with distinctive intracellular localizations have been identified and they are believed to mediate different protein sorting events from and/or to several compartments [11, 12]. Whereas other subunits are engaged in interactions with various molecules, the medium AP μ subunit is in charge of recognizing tyrosine-based sorting signals fitting a XXXYXXØ consensus (where X = any amino acid; Y = tyrosine and Ø = residues with a bulky hydrophobic side chain such as phenylalanine, leucine, isoleucine, methionine, and valine) [6, 9, 13, 14].

Although the Y and Ø residues within these signals are critical for μ subunit binding, it is known that the less conserved X-positions play an important role in defining the specificity of different Y-signals for different AP complexes [14, 15]. In fact, the differential interaction of signals with APs is responsible for the ultimate intracellular localization of the corresponding cargo.

The two-hybrid technology was used by the Bonifacino lab at NIH to conduct the most comprehensive study of μ subunit specificity for Y-signals available to date [14, 15]. Specifically, this group used the different μ subunits (μ1–μ4 from AP-1 through AP-4, resp.) as “baits” to screen a two-hybrid XXXYXXØ signal-library. The sequences of the signals selected by each μ subunit were established and the data were statistically analyzed. Further, each set of signals selected by a particular μ subunit was tested against the other μ chains, generating a vast amount of data about the signal binding preferences of APs. These investigations provided unique and extremely valuable information about the signal specificity of μ subunits [14, 15]. However, they also highlighted the complexity of the μ/Y-signal recognition process, particularly by indicating that combinations of residues at certain X-positions display (positive or negative) cooperative effects, thereby affecting the overall ability of signals to interact with μ subunits [14, 15]. Unfortunately, these interdependence effects made it impossible to extract explicit rules for predicting recognition of Y-signals by AP μ subunits. A classical alternative to rule-based analytical models is the Artificial Neural Network (ANN) paradigm [16–18]. ANNs analyze existing examples of the phenomena under study and, through an iterative process (“training” or “learning”), mathematically encode their behavior for predictive purposes [19–21]. A key requirement for the success of ANN approaches is that a critical mass of information be available for training [22]. Since this precondition is satisfied in the case of Y-signal recognition by μ subunits [14], we designed, trained, and validated ANNs for the prediction of μ/Y-signal interactions.

Our results indicate that trained ANNs were capable of predicting the experimental outcomes of previously published two-hybrid experiments with over 90% accuracy. Further, ANNs also successfully forecasted the results from novel two-hybrid experiments involving Lamp2 and CD63 mutant signals with μ subunits. Importantly, ANNs were proficient at correctly predicting two-hybrid results even in the presence of positive or negative cooperativity effects among residues within a Y-signal. Indeed, the ANNs’ predictions were correlated with the intracellular localization of transmembrane proteins bearing the analyzed signals.

In summary, our results demonstrate that the ANN paradigm is suitable for the prediction of μ/Y-signal interactions and provides a solution to this important problem in cell biology. To further improve the system performance, we encourage our colleagues to submit their own experimental results to be used in future rounds of training and validation.

2. Materials and Methods

2.1. Plasmids and Strains
2.1.1. DNA Constructs

Plasmids used in this study were prepared using standard techniques and following the general design described in [14]. Thus, XXXYXXØ signals were cloned in-frame with the TGN38 cytoplasmic tail in the multiple-cloning site of the two-hybrid vector pGBT9 (Clontech).

Site-directed mutagenesis was done using the QuikChange kit (Stratagene, La Jolla, CA).

2.1.2. Yeast Culture Conditions and Transformation Procedures

Yeast two-hybrid strain AH109 (Clontech) was grown in standard yeast extract-peptone-dextrose (YPD) or synthetic medium with dextrose lacking the appropriate amino acids for plasmid maintenance at 30°C for 3-4 days unless indicated otherwise. Transformations were performed by standard Li-Acetate transformation procedures (Clontech yeast handbook).

2.2. HeLa Cell Culture and Transfection

HeLa cells (American Type Culture Collection, Manassas, VA) were cultured in DMEM supplemented with 10% (vol/vol) FBS/100 units/mL penicillin/100 mg/mL streptomycin (Biofluids, Rockville, MD). The night before transfection, cells were seeded onto six-well plates (Costar) in 2 mL of medium. The following day, the cells were transfected with the TAC constructs in pXS using Fugene-6 reagent (Roche Molecular Biochemicals). Twenty-four hours after transfection, cells were fixed and analyzed for expression of the TAC constructs by immunofluorescence microscopy with the 7G7 anti-TAC monoclonal antibody.

2.3. Immunofluorescence Microscopy

HeLa cells transiently transfected with TAC constructs were grown on coverslips, fixed with 4% formaldehyde and incubated with the 7G7 mouse monoclonal anti-TAC antibody diluted 1 : 500 in DMEM, 10% FCS, 0.1% saponin for 1 h at room temperature. After washing with PBS, coverslips were incubated with a goat anti-mouse IgG antibody conjugated to Alexa488 for 1 h. Coverslips were washed with PBS and mounted on slides using Aqua-PolyMount (Polysciences) and imaged in a Zeiss Axiovert 200 M microscope.

2.4. Two-Hybrid Experiments and Result Coding

Potential interactions between XXXYXXØ signals and a given AP μ subunit were tested using the two-hybrid technology as previously described [14]. Briefly, plasmids encoding GAL4 DNA Binding Domain (G4BD)-XXXYXXØ and GAL4 Activation Domain (G4AD)-μ fusion proteins were transformed into AH109 yeast cells bearing GAL4-based reporter genes. If the μ moiety is capable of binding the Y-signal of the DNA-bound G4BD-XXXYXXØ fusion, then the G4AD-μ will be recruited to the reporter gene, leading to gene activation (Figure 1). The presence of the reporter gene product, for example His3 (an enzyme involved in the biosynthesis of the amino acid histidine), will allow the cells to grow in selective media, that is, plates lacking histidine (−His, see Figure 1). Therefore, cell growth in −His media, visualized as yeast colony formation, constitutes the experimental readout that corresponds to μ/Y-signal interaction. The two-hybrid results were coded as follows: when visible colonies were formed, an Interaction Value V = 1 was assigned; if no colonies were observed, the Interaction Value was 0 (Figure 1).

2.5. Data Sets

In this work, we used AP μ-subunit/Y-signal interaction data coming from two-hybrid library screens, most of which have been previously published [14, 15].

(a) Training Set
We used extensive collections of about 200 μ/Y-signal interaction data per μ subunit [14, 15] to train neural networks for the prediction of the interaction of XXXYXXØ sorting motifs with different adaptor μ subunits. Since it has been recently demonstrated that μ4 is capable of binding two types of sorting signals via two different binding sites [23], we did not train an ANN for prediction of Y-signal interactions with this medium subunit. However, we used data corresponding to the analysis of cross-reactivity of other μ-subunits with Y-signals isolated in a μ4 screen.

(b) Validation Set
In order to test the generalization capabilities of our neural network, we used a second set of μ-sorting signal interaction data including a reserved group (not used for training) from the published screens [14] and also naturally occurring Y-based targeting motifs previously tested by using the two-hybrid technology [15, 24–26].

3. Results and Discussion

Here we describe a novel approach to the analysis of protein trafficking mediated by sorting signals. Specifically, we describe the design and application of an artificial intelligence approach based on the neural network paradigm.

We trained three different ANNs, which predict whether a given Y-based sorting signal will be recognized or not by three adaptor medium subunits (μ1, μ2, and μ3). Although it is clear that μ4 binds to Y-signals in a Y- and Ø-dependent manner, it recognizes at least two kinds of sorting signals [23]. Therefore, since μ4 two-hybrid screens for Y-signals may have produced mixed results corresponding to more than one type of signal selected, we excluded this medium subunit from the current development. Following training, ANNs (one per adaptor medium subunit) were assembled in a single system. The algorithm and current weight sets are freely available upon request.

3.1. Design of ANN for the Prediction of μ/Y-Signal Two-Hybrid Interaction

ANNs are algorithms capable of predicting the outcome of complex processes that cannot be deconvolved into simple sets of rules [19, 20]. Therefore, we reasoned that these approaches would be suitable for the analysis and prediction of μ/Y-signal two-hybrid Interaction Values (see Figure 1 and Section 2.4).

A typical artificial neural network (see Figure 2 for an example) is made up of independent computing units (“neurons”) organized in “layer” groups. Following adjustment by the corresponding “connection weights”, the computing results from lower layers are used by the upper layer neurons as inputs for their own calculations.

During “training”, a neural network uses iterative processes to adjust its internal parameters (connection weights) so that its output function can produce the expected response (e.g., interaction value) for each element of a large set of known experimental data. If properly trained and validated, the network will predict unknown experimental outcomes.

The tyrosine-signal neural network (TySNN) is a feed-forward ANN designed to address the question: “Does this AP μ subunit bind this Y-signal?” by predicting an interaction value, V.

After trying several network architectures (not shown), we concluded that the most robust system consisted of one hidden layer containing 2 neurons fully connected with the input layer as well as with the unique node within the output layer (Figure 2(a)). Therefore, TySNN is made up of three neuron layers: an input layer (106 neurons), a hidden layer (2 neurons, h1 and h2), and one output (o) neuron (Figure 2(a)). The input layer comprises 5 clusters, one for each X-position in a XXXYXXØ signal. Each cluster contains 20 neurons representing the 20 possible amino acids that can be found at that specific X-position. A sixth cluster of 5 neurons represents the 5 possible amino acids (F, M, I, L, and V) to be found at the Ø-position (Figure 2(a)). An extra, constitutively activated “bias” neuron [20] was added, yielding a total of 106 input neurons.
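The input layout described above (5 × 20 X-position units, 5 Ø-position units, and 1 bias unit) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' spreadsheet macro: the function name `encode_signal`, the alphabetical ordering of residues within each cluster, and the placement of the bias unit at the end of the vector are all hypothetical choices.

```python
# Hypothetical residue ordering inside each cluster (the paper does not specify one).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids
PHI_RESIDUES = "FMILV"                 # residues allowed at the Ø position
X_POSITIONS = (0, 1, 2, 4, 5)          # string indices of the X residues in XXXYXXØ


def encode_signal(signal):
    """Return a 106-element 0/1 list: 5 clusters of 20, one cluster of 5, one bias."""
    if len(signal) != 7 or signal[3] != "Y":
        raise ValueError("expected a 7-residue XXXYXXØ signal with Y as 4th residue")
    vector = []
    for pos in X_POSITIONS:
        cluster = [0] * len(AMINO_ACIDS)
        # str.index raises ValueError for a residue outside the alphabet
        cluster[AMINO_ACIDS.index(signal[pos])] = 1
        vector.extend(cluster)
    phi_cluster = [0] * len(PHI_RESIDUES)
    phi_cluster[PHI_RESIDUES.index(signal[6])] = 1
    vector.extend(phi_cluster)
    vector.append(1)  # constitutively active bias unit
    return vector


lamp2 = encode_signal("HTGYEQF")  # lamp2 signal quoted in Section 3.3
```

With this layout, exactly seven units are active for any valid signal: one per X-position cluster, one in the Ø cluster, and the bias unit.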

The network reads each position of the XXXYXXØ signal and sends inputs to every neuron in the corresponding position-cluster. Within a cluster, an input = 0 is sent to all neurons except the one representing the amino acid found at that position, which receives an input = 1 (Figure 2(b)). All neurons from the input layer send an output value to both hidden neurons equal to their input multiplied by the corresponding connection weights (Wih, Figure 2(a)). The resulting values constitute the input to the hidden layer. Each hidden neuron compiles a total input and elaborates an output following a sigmoidal activation function (see Appendix and [19, 20]) that is transmitted to the output neuron according to the corresponding Who weights (Figure 2(a)). In turn, the output neuron sums the inputs coming from both h1 and h2 and elaborates the network output (predicted Interaction Value, V) through its own sigmoidal activation function. The network-predicted V values are translated from a real number in the range (0.0; 1.0) into an appropriate binary output. Thus, an output value >0.5 (an arbitrary threshold) is considered a “yes” result while any value ≤0.5 means “no” (i.e., there is or there is not an interaction between the sorting signal and the μ subunit, resp.).
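The data flow just described (weighted sums into two sigmoidal hidden neurons, a sigmoidal output neuron, and a 0.5 threshold) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the list-based weight layout, and the default α = 1.0 are assumptions.

```python
import math


def sigmoid(net_input, alpha=1.0):
    """Sigmoidal activation; alpha = 1.0 is an assumed default."""
    return 1.0 / (1.0 + math.exp(-alpha * net_input))


def forward(inputs, w_ih, w_ho, alpha=1.0):
    """Feed-forward pass.

    inputs: list of 0/1 input-unit values (bias unit included)
    w_ih:   one list of input-to-hidden weights per hidden neuron
    w_ho:   one hidden-to-output weight per hidden neuron
    """
    hidden = [
        sigmoid(sum(x * w for x, w in zip(inputs, weights)), alpha)
        for weights in w_ih
    ]
    return sigmoid(sum(h * w for h, w in zip(hidden, w_ho)), alpha)


def predict(inputs, w_ih, w_ho, threshold=0.5):
    """Translate the real-valued output into a binary yes (1) / no (0) call."""
    return 1 if forward(inputs, w_ih, w_ho) > threshold else 0
```

Note that an output of exactly 0.5 maps to "no", matching the ≤0.5 rule stated in the text.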

3.2. Evaluation of the Artificial Neural Network Performance

The networks were initialized using small, randomly generated weight values following a normal distribution with mean = 0.00 and standard deviation = 1/[number of neurons]^(1/2) (i.e., ≈0.10) ([20] and Figure 3(a)). During training, the predicted binary V values (see above) were compared to the known experimental results (training set, [14]) and the weights were modified to minimize the differences (see Appendix for details). More specifically, training was performed following a “batch” scheme; that is, the weight changes were accumulated and only applied after one run of the whole set of training examples, or “epoch” (see Appendix for further details on the algorithm and network architecture). The process was repeated until convergence was attained (Figure 3(b)).
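The initialization step can be sketched in a few lines of Python. This is a hedged illustration only: the function name `init_weights`, the optional seed, and the reading of "number of neurons" as the 106 input units (which gives 1/√106 ≈ 0.097, consistent with the ≈0.10 quoted above) are assumptions.

```python
import math
import random


def init_weights(n_inputs=106, n_hidden=2, seed=None):
    """Draw small random weights from N(0, sd) with sd = 1/sqrt(n_inputs)."""
    rng = random.Random(seed)          # seed is only for reproducible examples
    sd = 1.0 / math.sqrt(n_inputs)     # about 0.097 for 106 input units
    w_ih = [[rng.gauss(0.0, sd) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w_ho = [rng.gauss(0.0, sd) for _ in range(n_hidden)]
    return w_ih, w_ho, sd


w_ih, w_ho, sd = init_weights(seed=0)
```

Keeping the initial weights small places the sigmoidal units near their linear region, where the gradients used during training are largest.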

Two parameters were used to measure the performance of the neural networks.

(1) Accuracy (A). Represents the ratio between the number of correctly predicted outcomes (C) and the total number of examples (N):

$$A = \frac{C}{N}. \tag{1}$$

(2) Matthews’ correlation coefficient (MCC) [27]:

$$\mathrm{MCC} = \frac{pn - eo}{\sqrt{(n+e)(n+o)(p+e)(p+o)}}, \tag{2}$$

where p is the number of true positive predictions, n the number of true negative predictions, e the number of false positives, and o the number of false negatives. MCC is used as a reliable performance indicator that is independent of the proportion of positive and negative results in the training set [28].
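Both performance measures are straightforward to compute from binary predictions. The sketch below is illustrative only; the function names and the `tp`/`tn`/`fp`/`fn` variable names (true positives, true negatives, false positives, false negatives) are assumptions, as is the convention of returning 0.0 when the MCC denominator vanishes.

```python
import math


def confusion_counts(predicted, observed):
    """Count true/false positives and negatives for paired 0/1 lists."""
    tp = tn = fp = fn = 0
    for pred, obs in zip(predicted, observed):
        if pred == 1 and obs == 1:
            tp += 1
        elif pred == 0 and obs == 0:
            tn += 1
        elif pred == 1 and obs == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Correctly predicted outcomes divided by the total number of examples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    denom = math.sqrt((tn + fp) * (tn + fn) * (tp + fp) * (tp + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data, not taken from the paper.
counts = confusion_counts([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

For this toy data set the counts are 2 true positives, 2 true negatives, 1 false positive, and 1 false negative, giving an accuracy of 4/6 and an MCC of 3/9.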

Accuracy and the total error E (see appendix) were also used to monitor the evolution of network learning during training (see Figure 3(c) for an example).

In general, the shape of the curves obtained indicated the presence of local minima (Figure 3). In fact, some of our networks’ current weight sets may correspond to low local, rather than global, minima.

Table 1 summarizes the performance of the networks following training. In all cases we observed above 90% accuracy in predicting the result of a potential μ/Y-signal interaction. These values support the suitability of the ANN paradigm for predicting Y-signal specificity for clathrin-associated adaptor complexes.

We believe the accuracy of the networks can be further improved with subsequent training, aiming to reach the global minima. However, in order to avoid overtraining with a single data set, new results should be used. Therefore, we encourage our colleagues to participate in this effort by submitting their own μ/Y-signal binding results. In addition, the spreadsheet macro that runs the ANN algorithm is freely available upon request.

3.3. Biologically Relevant Predictions and Detection of Cooperative Effects among Residues within a Signal

ANNs described in this work were trained using two-hybrid interaction data. Therefore, ANNs predict two-hybrid interaction values from experiments performed under similar conditions (see Section 2.4). It should be noted that two-hybrid results can significantly correlate with the targeting behavior of proteins expressed in cells [15].

Analysis of the relative relevance of residues within the signal suggests that positions Y − 3, Y − 2, Y + 2, and Ø usually have major effects on the overall ability of the Y-signal to interact with a μ subunit.

Importantly, TySNN was able to correctly predict the specificity of a subset of naturally occurring signals, including the sorting signals for lamp2 (HTGYEQF) and CD63 (RSGYEVM). Interestingly, these signals display a similar interaction pattern against the different μ subunits: both could bind μ2 and μ3 but showed negligible interaction with μ1 [15]. Although the residues immediately flanking the critical Y within these signals are identical (Y − 1 and Y + 1), the ones occupying the positions Y − 3, Y − 2, Y + 2, and Ø are different (Figure 4(a)).

In order to test the relevance of these residues for the interaction of these naturally occurring and highly similar Y-signals with μ subunits, we asked TySNN to predict the specificity of chimeric signals as indicated in Figure 4. TySNN predicted negligible reactivity of the chimeric signal HTGYEVM with μ2. This prediction was surprising, as μ2 has been described as the medium subunit with the most relaxed specificity [14]. Also, through this result, TySNN indicated the existence of negative cooperative effects among residues at different positions within a signal. Importantly, we tested this prediction experimentally and observed a complete correspondence with actual two-hybrid results (Figure 4(a)).

Further, we introduced both the lamp2 (HTGYEQF) and the chimeric (HTGYEVM) signals into the cytoplasmic tail of the interleukin-2 receptor α-subunit (also known as TAC) and expressed them in HeLa cells. Intracellular localization of TAC-fusion proteins can be easily detected by immunofluorescence with an anti-TAC antibody (7G7). In fact, the TAC-Lamp2 fusion protein showed a largely intracellular, perinuclear immunofluorescence staining, compatible with a late endosomal-lysosomal localization (Figure 4(b)). In contrast, the TAC-chimeric signal fusion protein showed a strong plasma membrane staining compatible with deficient internalization due to impaired recognition by μ2 (Figure 4(b)). These results support the applicability of the predictions of the ANN system to in vivo intracellular trafficking problems.

4. Conclusions

Our results indicate that ANNs can handle the complexity of the μ/Y-signal interaction process. Therefore, candidate protein cargo with a suitable Y-signal within their cytoplasmic tail can be identified based on their predicted ability to interact or not with the various μ subunits. However, the investigator should be aware that for a YXXØ motif to be recognized by APs in vivo, it must also satisfy other requirements, for example, proper spacing from the corresponding transmembrane domain [9]. As mentioned in previous sections, further training with additional naturally occurring Y-sorting signals should enhance the predictive power of this approach towards cytoplasmic domains of transmembrane proteins.

Importantly, trained ANNs have been successfully used to extract information about the principles ruling the phenomenon under study [29]. Therefore, we anticipate that upon further developments, results obtained with TySNN will contribute to the establishment of explicit rules for the analysis of Y-based sorting signals. In fact, this work already reports conclusions concerning the relative importance of certain X-positions for the recognition of the Y-signal by the different AP medium subunits. Moreover, improvements to the algorithm reported here will be directed at analyzing quantitative data rather than binary “Yes/No” results. Specifically, ANNs can be trained to predict the strength of μ/Y-signal interaction based on β-galactosidase activity or cell growth in the presence of different concentrations of the competitive inhibitor 3AT in two-hybrid experiments [24].

Finally, we envision that this approach may be used in the analysis of results from future screens. For example, there is almost no information regarding the specificity of APs for signals in plants and Saccharomyces cerevisiae. Therefore, we believe a systematic study of μ/Y-signal interactions, like the ones conducted by the Bonifacino lab [13–15], should be pursued in yeast and plants.

Along the same lines, a screen to define the specificity of APs for dileucine signals is also lacking. The Bonifacino lab also developed a successful three-hybrid approach [30] that should be adapted for the screening of putative combinatorial dileucine signal libraries. Further, a similar ANN-based approach can be adopted for screens involving signal/motif receptors other than APs. We anticipate that use of the ANN paradigm would be of great benefit for rapidly utilizing the information generated by all these efforts and for the analysis of data from other challenging endeavors in the area of vesicle trafficking.

Appendix

Neural Network Architecture and Data Flow

As described in Section 3.1, the identity of the residues within the XXXYXXØ signal determines which neuron within the X- and Ø-position clusters (at the input layer) is turned on (i.e., “on” output value = 1; “off” output value = 0). Then, each input neuron $i$ sends a “message” to each hidden neuron $j$, equal to its off/on output value ($O_i$) times the connection weight ($W_{ij}$). Thus, the total net input ($I$) received by each hidden neuron is

$$I_j = \sum_i O_i W_{ij}. \tag{A.1}$$

In turn, neurons from the hidden layer as well as the unique node in the output layer (Figure 2) produce a response according to a sigmoidal activation function

$$O_j = \frac{1}{1 + e^{-\alpha I_j}}, \tag{A.2}$$

where $O_j$ represents the output response from a given hidden or output neuron $j$ receiving a net input $I_j$ (modulated by an $\alpha$ factor) [21]. The final output ($O$) is then compared with the expected interaction value ($V$) (from the training data set) by using an error function ($E$) (Figure 2(a))

$$E = \sum^{k} (V - O)^2, \tag{A.3}$$

where $k$ is the number of examples in the training data set.
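As a worked step that the text leaves implicit, the gradient needed for the weight correction can be written out for the hidden-to-output weights. The derivation below is a sketch under two assumptions: that the error function is the sum of squared differences over the training examples, and that the example index $s$ and the symbol $W_{jo}$ (weight from hidden neuron $j$ to the output neuron) are notation introduced here for illustration.

```latex
% Derivative of the sigmoidal activation O = 1 / (1 + e^{-\alpha I}):
\frac{dO}{dI} = \alpha \, O \, (1 - O)

% With E = \sum_{s=1}^{k} (V_s - O_s)^2 and output net input
% I_s = \sum_j O_{j,s} W_{jo}, the chain rule gives
\frac{\partial E}{\partial W_{jo}}
  = -2 \sum_{s=1}^{k} (V_s - O_s) \, \alpha \, O_s (1 - O_s) \, O_{j,s}
```

Here $O_{j,s}$ is the output of hidden neuron $j$ for training example $s$. The sum over $s$ reflects the batch scheme, in which contributions from all examples are accumulated before the weights are changed.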

A weight correction to minimize the error function is estimated according to

$$\Delta W_{ij}(n) = -\eta \left( \frac{dE}{dW_{ij}} \right) + m \, \Delta W_{ij}(n-1), \tag{A.4}$$

where $\Delta W_{ij}(n)$ and $\Delta W_{ij}(n-1)$ represent the change of the weights calculated at iterations $n$ and $n-1$, respectively, $\eta$ is the learning rate parameter, and $m$ is the momentum constant [31]. The weight corrections are implemented, on the initially random $W_{ij}$, in the opposite direction to data flow (back-propagation), and then another feed-forward run is started by using the newly updated $W_{ij}$ values.
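The weight correction with momentum can be sketched as a single update function. This is an illustration only: the function name `momentum_step`, the flat-list representation of weights and gradients, and the numerical values in the example are assumptions, and the gradients are taken as already accumulated over one epoch.

```python
def momentum_step(weights, gradients, previous_deltas, learning_rate, momentum):
    """Apply one gradient-descent-with-momentum update.

    weights, gradients, previous_deltas: flat lists of equal length, where
    gradients holds dE/dW accumulated over the whole training set (one epoch).
    Returns the updated weights and the deltas to reuse at the next iteration.
    """
    deltas = [
        -learning_rate * grad + momentum * prev
        for grad, prev in zip(gradients, previous_deltas)
    ]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas


# Toy numbers, not taken from the paper.
new_w, new_d = momentum_step([0.5], [2.0], [0.1], learning_rate=0.1, momentum=0.9)
```

In this toy case the delta is −0.1 × 2.0 + 0.9 × 0.1 = −0.11, so the weight moves from 0.5 to 0.39. The momentum term reuses part of the previous change, which tends to smooth the trajectory across shallow local minima.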

The learning rate η is continuously optimized according to a “line search” algorithm [32] for maximal convergence efficiency to a minimum of E (Figure 3). The iterations continue until convergence is reached, leading the network to learn by backpropagation [33–35].

Acknowledgments

The authors are indebted to Dr. Juan S. Bonifacino (NIH) for sharing his lab data and for useful discussions. We also thank Dr. Darwin Reyes (National Institute of Standards and Technology), Dr. Lymarie Maldonado-Baez (NIH), Dr. Henry Chang (Purdue University), and members of the Aguilar and Chang labs for stimulating discussions and critical reading of the paper. Special thanks to Saikat Banerjee (Indian Institute of Management, Bangalore, India) for writing the spreadsheet macro that runs the ANN algorithm (freely available upon request). This work was supported by start-up funds from the Department of Biological Sciences, Purdue University and by the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.