Major histocompatibility complex (MHC) class I molecules, HLA-A, -B, and -C, are cell surface glycoproteins consisting of a polymorphic heavy α chain non-covalently linked to a light chain, β2-microglobulin (β2m). HLA-A and -B molecules play critical roles in cell-mediated immune responses by binding short antigenic peptide fragments and presenting them on the surface of antigen-presenting cells for recognition by CD8+ cytotoxic T lymphocytes (CTLs). Although several HLA-C specificities with CTL epitopes have been reported [1, 2], much remains unknown about their role in the immune response against viral antigens, in part because of their poor expression at the cell surface [3, 4]. Recent research shows that HLA-C molecules play a major role in the control of human immunodeficiency virus type 1 (HIV-1) infection [5]. An improved understanding of peptide binding to HLA-C molecules is therefore important for the study of HIV-1 disease progression, as well as for the design of effective HIV peptide vaccines.
The HLA-C allele Cw*0401 is of particular interest in the study of HIV-1 disease progression because it serves as a restriction element for HIV-1 proteins [5]. Two HIV-1 proteins (p24gag and gp160gag) are currently known to be restricted by Cw*0401 [5]. Cw*0401 is present in approximately 10% of the general population [6]. The allele is expressed intracellularly in amounts comparable to those of HLA-A and -B molecules, but is poorly expressed at the cell surface [7, 8]. A better understanding of peptide binding to this molecule is important for elucidating its role in HIV-1 disease progression.
Computational strategies for predicting peptide binding to HLA-A and -B molecules are relatively advanced [9], whereas sequence-based predictive models for HLA-C molecules have had limited success because of the lack of experimental training data [10]. Two matrix-based prediction algorithms for Cw*0401 have been reported [11, 12], but a sequence-independent approach is still lacking. To overcome these limitations, we have developed a structure-based predictive technique that integrates the strengths of Monte Carlo simulations and homology modeling [13, 14, 15]. This method uses a probe, or "base fragment", to sample different regions of the receptor binding site, followed by loop closure and refinement of the entire class I peptide. The technique has been successfully applied to analyze peptides binding to a variety of MHC class II alleles [14, 15]. In this work, we extend our analysis to peptides presented by the class I HLA-C molecule. We investigated the HIV-1 p24gag and gp160gag peptide binding repertoire of Cw*0401 and show that areas with a high concentration of T-cell epitopes, or "immunological hot spots", are potentially well distributed throughout both HIV-1 p24gag and gp160gag. We also show that Cw*0401 can possibly bind antigenic peptides in amounts comparable to both HLA-A and -B molecules. Characterization of the predicted Cw*0401 binding sequences reveals that Cw*0401 may bind a large variety of amino acids at anchor positions, with common physico-chemical properties that correlate well with existing experimental studies [11].
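To make the notion of an "immunological hot spot" concrete, the following minimal Python sketch scans a protein sequence for overlapping peptides and reports regions where several predicted binders cluster. It is an illustration under stated assumptions, not the structure-based method used in this work: the function names (`enumerate_peptides`, `find_hot_spots`), the peptide length of 9, the window size, the binder threshold, and the `is_binder` callback are all hypothetical, and any real predictor would have to be supplied in place of the callback.

```python
from typing import Callable, List, Tuple


def enumerate_peptides(sequence: str, length: int = 9) -> List[Tuple[int, str]]:
    """Return (start_index, peptide) for every overlapping window of `length`."""
    return [(i, sequence[i:i + length])
            for i in range(len(sequence) - length + 1)]


def find_hot_spots(sequence: str,
                   is_binder: Callable[[str], bool],
                   length: int = 9,
                   window: int = 15,
                   min_binders: int = 3) -> List[Tuple[int, int]]:
    """Return merged (start, end) regions, 0-based with end exclusive.

    A region is reported when at least `min_binders` predicted binders lie
    entirely inside a stretch of `window` residues. `is_binder` is a
    placeholder for any predictor (hypothetical here); all default
    parameter values are illustrative assumptions.
    """
    # Start positions of every peptide the predictor accepts.
    starts = [i for i, pep in enumerate_peptides(sequence, length)
              if is_binder(pep)]
    regions: List[Tuple[int, int]] = []
    for w in range(max(len(sequence) - window + 1, 0)):
        # Binders fully contained in the stretch [w, w + window).
        hits = [s for s in starts if w <= s and s + length <= w + window]
        if len(hits) >= min_binders:
            lo, hi = min(hits), max(hits) + length
            if regions and lo <= regions[-1][1]:
                # Overlaps or touches the previous region: merge them.
                regions[-1] = (regions[-1][0], max(regions[-1][1], hi))
            else:
                regions.append((lo, hi))
    return regions
```

Passing the predictor as a callback keeps the scan independent of how binding is assessed, so the same routine could in principle be driven by a matrix-based score or by the outcome of a docking simulation.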