Peptides that bind major histocompatibility complex (MHC) class I molecules serve as recognition targets for cytotoxic CD8+ T cells (CTLs). The major function of CTLs is the recognition and destruction of cells that are infected (e.g. by viruses, bacteria, parasites, or fungi), mutated (e.g. cancer), or foreign (e.g. transplants). CTLs recognize short antigenic peptides (T-cell epitopes) that are presented by MHC class I molecules and originate mainly from the degradation of cytosolic proteins. Intracellular antigen processing pathways determine which peptides are available for binding to MHC class I molecules and can thereby become targets of CTL responses [2].
The steps of the MHC class I antigen processing pathway include proteasomal cleavage of proteins into shorter peptides, translocation of peptides into the endoplasmic reticulum (ER) by the transporter associated with antigen processing (TAP), optional ER trimming by aminopeptidases, insertion of peptides into the binding groove of MHC molecules, and transport of peptide/MHC complexes to the cell surface for presentation to CTLs [3]. TAP is a transmembrane protein responsible for the transport of antigenic peptides into the ER. TAP binds peptides selectively, and the affinity of a particular peptide for TAP influences the probability of its presentation by MHC class I molecules. Peptides that are 8–16 amino acids long and have sufficient binding affinity are efficiently translocated by TAP into the ER, while longer peptides may be transported but with lower efficiency [4]. Human TAP (hTAP) is a heterodimer of two subunits, hTAP1 and hTAP2. TAP belongs to the ATP-binding cassette transporter family, and each subunit has one transmembrane domain and one ATP-binding domain. The genes for human TAP1 and TAP2 are located in the MHC class II locus of chromosome 6 and span 10 kb each [5]. A more detailed description of the function, structure, and expression of TAP can be found in [6].
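As a rough illustration of the length window described above, the following Python sketch enumerates every subsequence of a protein that falls within the 8–16 residue range. It is only a minimal sketch: the function names and the demo sequence are hypothetical, and the binding-affinity requirement is not modeled at all, so the output is a set of length-compatible candidates rather than predicted TAP binders.

```python
from typing import Iterator, Tuple

# Length window taken from the text: peptides of 8-16 residues are
# reported to be translocated efficiently by TAP.
MIN_LEN = 8
MAX_LEN = 16


def candidate_peptides(protein: str,
                       min_len: int = MIN_LEN,
                       max_len: int = MAX_LEN) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, peptide) for every window within the length range.

    This only applies the length criterion; affinity is NOT considered.
    """
    seq = protein.strip().upper()
    for length in range(min_len, max_len + 1):
        # range() is empty when the protein is shorter than the window.
        for start in range(len(seq) - length + 1):
            yield start, seq[start:start + length]


def count_candidates(protein_length: int,
                     min_len: int = MIN_LEN,
                     max_len: int = MAX_LEN) -> int:
    """Number of windows produced for a protein of the given length."""
    return sum(max(protein_length - length + 1, 0)
               for length in range(min_len, max_len + 1))


if __name__ == "__main__":
    # Hypothetical demo sequence (one of each standard amino acid),
    # not a real protein.
    demo = "ACDEFGHIKLMNPQRSTVWY"
    peptides = list(candidate_peptides(demo))
    print(len(peptides), "length-compatible candidates")
    print(peptides[0])
```

Even for a short sequence the number of candidates grows quickly, which is one reason a computational pre-screen of TAP binding is useful before experimental testing.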
The efficiency of TAP-mediated translocation of a peptide is proportional to its TAP-binding affinity [7, 8]. Mutations (such as premature stop codons) or deletions of either hTAP1 or hTAP2 impair peptide transport into the ER and significantly reduce surface expression of peptide/MHC complexes [9]. TAP-deficient cells have low cell-surface HLA class I expression, reported to range from 10% (HLA-A2) to 3% (HLA-B27 and -A3) [10]. The majority of the peptides presented by HLA class I molecules on the cell surface are thus dependent on TAP.
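The step from residual expression to TAP dependence is simple arithmetic, sketched below in Python. The sketch assumes that the reported percentages are relative to expression in TAP-competent cells (the text does not state the baseline explicitly) and that everything lost in TAP-deficient cells is TAP-dependent; the function name is hypothetical.

```python
def tap_dependent_percent(residual_percent: float) -> float:
    """Percentage of surface expression lost without TAP.

    ASSUMPTION: residual_percent is expressed relative to normal
    (TAP-competent) expression, taken here as 100%.
    """
    if not 0.0 <= residual_percent <= 100.0:
        raise ValueError("residual_percent must lie between 0 and 100")
    return 100.0 - residual_percent


# Residual HLA class I expression in TAP-deficient cells, as quoted in the text.
residual = {"HLA-A2": 10.0, "HLA-B27": 3.0, "HLA-A3": 3.0}

for allele, percent in residual.items():
    print(allele, tap_dependent_percent(percent))
```

Under these assumptions the TAP-dependent share is 90% for HLA-A2 and 97% for HLA-B27 and HLA-A3.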
Identification of T-cell epitopes is a highly combinatorial problem. The diversity of human immune responses to T-cell epitopes originates from two sources: high allelic variation of the host (both HLA molecules and T-cell receptors) and high variation of target antigens, particularly those derived from viruses. Computational models are routinely used to pre-screen potential T-cell epitopes and to minimize the number of necessary experiments. Most developments have focused on modeling and prediction of peptide binding to MHC molecules (see [11]). Computational models of peptide binding to hTAP that have been developed include binding motifs [7], quantitative matrices [12, 13, 14], artificial neural networks (ANN) [12, 15], and support vector machines (SVM) [16]. Combined computational methods that integrate multiple critical steps (proteasome cleavage, TAP transport, and MHC class I binding) have been proposed as a supporting methodology for prediction of high-probability targets for therapeutic peptides and vaccines [17]. Several combined computational applications of models of antigen processing and presentation have been reported [18, 19, 20, 21, 22]. Testing results indicate that these combined predictions produce fewer false positives and reduce the number of experiments required for identification of T-cell epitopes. However, these combined predictions should be interpreted with caution, because alternative pathways for both proteolytic degradation [23] and TAP transport [24] have been reported. In some cases TAP-deficient individuals have normal immune responses [25], suggesting that TAP-independent immune responses are sufficient to provide effective protection from some intracellular pathogens. Nevertheless, the proteasome-TAP-MHC class I pathway is responsible for 90–97% of the expression of peptide/MHC class I complexes and is therefore critical for the identification of target epitopes for immunotherapies and vaccines.
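To make the idea of a quantitative matrix concrete, the Python sketch below scores a 9-residue peptide as the sum of independent per-position contributions and ranks candidates by that score. It is a generic illustration of additive matrix scoring, not the method of any cited model: the matrix values are arbitrary placeholders chosen for readability, and all names (`TOY_MATRIX`, `score_peptide`, `rank_peptides`) are hypothetical.

```python
from typing import Dict, List, Tuple

# HYPOTHETICAL toy matrix for 9-mers: one dict per position mapping
# residue -> score contribution. The numbers are arbitrary placeholders,
# NOT measured TAP-binding data. Unlisted residues contribute 0.
TOY_MATRIX: List[Dict[str, float]] = [
    {"A": 0.5, "D": -1.0},   # position 1
    {"C": 0.25},             # position 2
    {}, {}, {}, {}, {}, {},  # positions 3-8: no preference in this toy
    {"Y": 1.0, "G": -0.5},   # position 9
]


def score_peptide(peptide: str, matrix: List[Dict[str, float]]) -> float:
    """Additive score: sum of the per-position residue contributions."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix length")
    return sum(matrix[i].get(res, 0.0) for i, res in enumerate(peptide.upper()))


def rank_peptides(peptides: List[str],
                  matrix: List[Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (peptide, score) pairs sorted from highest to lowest score."""
    scored = [(p, score_peptide(p, matrix)) for p in peptides]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical peptides, used only to exercise the scoring function.
    for peptide, score in rank_peptides(["DGGGGGGGG", "ACGGGGGGY"], TOY_MATRIX):
        print(peptide, score)
```

The additive form assumes that each position contributes independently to binding. Models such as ANN or SVM can in principle capture interactions between positions that a matrix of this kind cannot.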
We developed PREDTAP, a computational system that predicts peptides binding to hTAP. It uses ANN and hidden Markov models (HMM) as predictive engines. Extensive testing was performed to validate the prediction models and ensure that PREDTAP is both sensitive and specific. PREDTAP is available for public use at [1].