Key Points

  • Polymorphisms in HLA genes may impact the ability of the immune system to detect malignant cells and direct T cells to eliminate them.

  • Several HLA alleles and haplotypes are associated with development of chronic lymphocytic leukemia across different US populations.

Abstract

Chronic lymphocytic leukemia (CLL) displays remarkable ethnic predisposition for whites, with relative sparing of African-American and Asian populations. In addition, CLL displays among the highest familial predispositions of all hematologic malignancies, yet the genetic basis for these differences is not clearly defined. The highly polymorphic HLA genes of the major histocompatibility complex play a central role in immune surveillance and confer risk for autoimmune and infectious diseases and several different cancers, but their role in the development of CLL has not been extensively investigated. The National Marrow Donor Program/Be The Match has collected HLA typing from CLL patients in need of allogeneic hematopoietic stem cell transplant and has recruited millions of volunteers to potentially donate hematopoietic stem cells. HLA genotypes for 3491 US white, 397 African-American, and 90 Hispanic CLL patients were compared with 50 000 controls per population from the donor registry. We identified several HLA alleles associated with CLL susceptibility in each population, reconfirming predisposing roles of HLA-A*02:01 and HLA-DRB4*01:01 in whites. Associations for haplotype DRB4*01:01∼DRB1*07:01∼DQB1*03:03 were replicated across all 3 populations. These findings provide a comprehensive assessment of the role of HLA in the development of severe CLL.

Introduction

Chronic lymphocytic leukemia (CLL) is the most common form of leukemia in the United States and Europe1  and is characterized by clonal proliferation of malignant B lymphocytes.2  There is significant heterogeneity in disease characteristics and prognosis for CLL, with many patients never requiring treatment. The median age of diagnosis is 72 years, and the disease occurs more commonly in men.3  CLL occurs more commonly in Western populations, is less common in African Americans, and is relatively rare in Asians.4-6  Differences in incidence persist despite immigration,7-11  and first-degree relatives of CLL patients have a two- to sevenfold elevated risk of developing the disease,12,13  implicating genetic predisposition rather than environmental factors in the pathogenesis of CLL.

HLA plays a central role in immune surveillance, and HLA polymorphisms may impact the ability of the immune system to identify malignant cells and target them for T cell-mediated elimination.14  HLA class I proteins (HLA-A, -C, and -B) present peptides from endogenous proteins to cytotoxic T lymphocytes. HLA class II proteins (HLA-DRB3/4/5, -DRB1, and -DQB1) present peptides derived from exogenous proteins to CD4+ helper T cells. Downregulation of HLA is a well-defined mechanism of immune evasion by viral infections and in neoplastic cells.15,16  Specifically, downregulation of HLA class I proteins has been shown in CLL cells.17  HLA gene polymorphisms have been found to be associated with cancer, autoimmune, and infectious diseases.18 

HLA allele frequency varies considerably among world populations19 ; thus, population-specific studies are essential for identifying HLA associations with disease. The vast diversity of HLA alleles requires extremely large sample sizes for study, as the most common HLA allele within a population often has a frequency of only ∼25%, beyond which the frequency of other alleles decreases exponentially. Collection of sufficient CLL cases for population-based studies at a single center is not feasible, requiring the formation of cooperative groups of institutions to acquire adequate sample size.

The Genetic Epidemiology of CLL (GEC) and InterLymph Consortia have identified several single nucleotide polymorphisms (SNPs) in the major histocompatibility complex (MHC) near the HLA loci associated with CLL in genome-wide association studies (GWASs). Although the specific HLA alleles underlying these signals have not yet been identified,20-22  1 recent CLL GWAS used HLA assignments imputed from SNPs.23 

Classical HLA typing is best suited for HLA association studies because information from coding sequences is ascertained directly rather than imputed from linkage with nearby SNPs. However, most CLL association studies using classical HLA typing have had an inadequate number of samples to detect associations for less common alleles. Most of these studies have relied on a German cohort of <200 patients.24  Because HLA genes are highly polymorphic, even classical HLA typing cannot easily resolve all ambiguities within a locus; most HLA association studies for CLL have therefore reported HLA associations only at the allele family level. In addition, alleles cannot readily be phased across loci into haplotypes experimentally.25 

A large number of HLA-typed CLL cases have been accumulated by the National Marrow Donor Program (NMDP)/Be The Match from patients in need of allogeneic hematopoietic stem cell transplant (allo-SCT) since 1987. Although only a minority of CLL patients ultimately undergo HLA typing in anticipation of potential allo-SCT, sampling the frequency of HLA alleles in this selected patient population can provide significant insight into severe disease biology. To address the issue of unresolved allelic ambiguity and haplotype phasing in registry HLA typing, we developed methods to incorporate this typing ambiguity into a statistical model, allowing us to report high-resolution allele, haplotype, and genotype associations. We also apply factor analysis to group highly correlated associations in the context of the high linkage disequilibrium of the HLA system.

We find several HLA alleles that are associated with an increased susceptibility to the development of severe CLL, as well as several alleles that are protective across populations.

Methods

Study population

The Institutional Review Board of the National Marrow Donor Program approved this study. Informed consent was obtained in accordance with the Declaration of Helsinki. Cases consisted of CLL patients >18 years of age at diagnosis for whom a preliminary search had been conducted with the NMDP to identify a matched unrelated donor for possible allo-SCT. A total of 50 000 controls from each self-identified race/ethnic background were selected randomly from the NMDP volunteer adult donor registry recruited since 2005. Controls were matched for patient age quartiles and gender proportions, although the maximum control age (60 years) was younger than the oldest cases due to registry recruitment policy. Both cases and controls were HLA typed using DNA-based methods (either Sanger sequence-based typing or sequence-specific oligonucleotide).25 

HLA genotype information

This study used multivariate logistic regression on independent sets of cases and controls for each population to identify HLA associations that are either protective or predisposing with respect to the development of CLL.

To resolve case/control HLA typing ambiguity, we first enumerated a list of all high resolution 6-locus (A∼C∼B∼DRB3/4/5∼DRB1∼DQB1) phased genotypes consistent with the individual’s DNA-based typing.26  Probability mass was then assigned to each genotype in proportion to population haplotype frequencies calculated from the NMDP registry.19  Under the assumption of Hardy-Weinberg equilibrium, genotype probabilities were derived as the product of haplotype frequencies for each possible genotype. At this point, the HLA information for each individual is contained in the distribution of possible genotypes. In the context of this imputation using population haplotype frequencies, the probability distribution at each locus is highly skewed toward a single high resolution allele pair, given the HLA typing resolution of NMDP samples.27,28 
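The weighting step described above can be sketched as follows. This is a minimal illustration, not the registry's implementation: the function name, the dictionary layout, and the toy haplotype labels are hypothetical, and the factor of 2 for heterozygous pairs is the usual Hardy-Weinberg convention assumed here rather than a detail stated in the text.

```python
def genotype_probabilities(candidate_genotypes, hap_freq):
    """Return normalized probabilities for the phased genotypes consistent with a typing.

    candidate_genotypes: iterable of (haplotype_1, haplotype_2) pairs (hypothetical input).
    hap_freq: dict mapping haplotype label -> population frequency.
    """
    # Treat (h1, h2) and (h2, h1) as the same unordered genotype.
    unordered = {tuple(sorted(pair)) for pair in candidate_genotypes}
    weights = {}
    for h1, h2 in unordered:
        f1 = hap_freq.get(h1, 0.0)
        f2 = hap_freq.get(h2, 0.0)
        # Assumed Hardy-Weinberg weights: f^2 for homozygotes, 2*f1*f2 otherwise.
        weights[(h1, h2)] = f1 * f2 if h1 == h2 else 2.0 * f1 * f2
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("no candidate genotype has nonzero frequency")
    # Normalize so the probability mass over consistent genotypes sums to 1.
    return {g: w / total for g, w in weights.items()}


# Toy example with made-up haplotype labels and frequencies.
toy_freq = {"H1": 0.5, "H2": 0.3, "H3": 0.2}
toy_probs = genotype_probabilities([("H1", "H2"), ("H3", "H3")], toy_freq)
```

With real registry data, most of the mass would be expected to fall on a single genotype, which is the skew the paragraph above describes.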

To obtain a fixed genotype for analysis, we took a realization of this probability mass function to assign a single genotype for each individual. We repeated this realization to generate 5 imputed HLA datasets, each containing a single unambiguous phased HLA genotype per individual. We then applied traditional logistic regression methods to each of these five imputed HLA datasets and combined the results using multiple imputation methods designed for statistical inference in the context of incomplete data.29 
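As a rough sketch of this draw-then-pool procedure: the first function below samples one genotype per imputation from an individual's probability mass function, and the second pools per-dataset estimates. The pooling formula shown is the standard Rubin combination (mean estimate; total variance = within + (1 + 1/m) × between), which is assumed here to be what the cited multiple imputation methods use. All names are illustrative.

```python
import random
from statistics import mean, variance


def draw_imputations(genotype_probs, m=5, seed=0):
    """Draw m genotype realizations from a dict of genotype -> probability."""
    rng = random.Random(seed)
    genotypes = list(genotype_probs)
    weights = [genotype_probs[g] for g in genotypes]
    return rng.choices(genotypes, weights=weights, k=m)


def pool_estimates(estimates, variances):
    """Combine per-imputation estimates (eg, log ORs) and their variances."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)                          # average sampling variance
    between = variance(estimates) if m > 1 else 0.0   # variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Toy numbers only: three imputed datasets with identical sampling variance.
pooled_log_or, total_var = pool_estimates([0.1, 0.2, 0.3], [0.01, 0.01, 0.01])
```

The between-imputation term is what lets residual typing ambiguity widen the reported confidence intervals instead of being ignored.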

HLA variant assessments

Predisposing or protective effects with respect to CLL were identified at (1) the individual allele level (A, C, B, DRB3/4/5, DRB1, and DQB1 loci); (2) haplotype combinations of these loci; and (3) genotypes (both specific allele combinations and overall heterozygosity and homozygosity). Associations for these groupings were tested at 2 levels of HLA typing resolution: allele family and high resolution. All assessments were conditional on self-reported race (ie, run separately by race/ethnic group). DRBX*NNNN designates absence of any DRB3/4/5 gene on the chromosome.
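The two resolution levels can be illustrated with a small helper that collapses a high-resolution allele name to its allele family by keeping only the first field. This is a hypothetical sketch: the function names are not from the study, and the DRBX*NNNN placeholder is simply passed through unchanged.

```python
def to_allele_family(allele):
    """Collapse a high-resolution allele (eg, 'A*02:01') to its family ('A*02')."""
    locus, star, fields = allele.partition("*")
    if not star:
        raise ValueError(f"not an HLA allele name: {allele!r}")
    # Keep only the first colon-delimited field after the asterisk.
    return f"{locus}*{fields.split(':')[0]}"


def haplotype_to_family(haplotype, sep="∼"):
    """Apply the same reduction to every allele on a haplotype string."""
    return sep.join(to_allele_family(a) for a in haplotype.split(sep))


family_haplotype = haplotype_to_family("DRB4*01:01∼DRB1*07:01∼DQB1*03:03")
```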

We tested killer cell immunoglobulin-like receptor (KIR) ligand categories for HLA-B and HLA-C. HLA-B alleles were categorized as having either Bw4 or Bw6 epitopes. Bw4 epitopes were further subdivided by the amino acid at position 80 (eg, I80 for isoleucine at position 80). KIR3DL1 binds HLA-Bw4 group alleles but not HLA-Bw6 group alleles.30  For HLA-C alleles, we tested HLA-C1 or HLA-C2 KIR-binding epitopes based on the amino acid at position 80. KIR2DL2 binds HLA-C1 group alleles, whereas KIR2DL1 binds HLA-C2 group alleles.31 

Statistical analysis

The odds ratio (OR) of each HLA variant (listed in assessments) was estimated using logistic regression for the 5 imputed HLA datasets; reported ORs were adjusted for the covariates of age (continuous), gender, and geographical location (4 levels: west, east, midwest, and south) using an additive model specification. The model was further augmented in a secondary analysis to include an interaction term between the covariate of age and the predisposing/protective effect of interest.
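For intuition, the unadjusted analogue of these estimates can be computed from a 2 × 2 table of carrier counts; a logistic regression with a single binary predictor and no covariates yields the same OR. The sketch below is not the adjusted model used in the study, and the counts in the example are invented.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers, z=1.96):
    """Unadjusted OR with a Wald confidence interval from 2 x 2 counts."""
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    or_value = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper


# Invented counts: 20 of 100 cases and 10 of 100 controls carry the variant.
or_value, lower, upper = odds_ratio(20, 80, 10, 90)
```

An OR above 1 corresponds to a predisposing variant and an OR below 1 to a protective one, as used throughout the results.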

Statistical inference for model parameters was assessed using an F test on the test statistic derived from multiple imputation that accounts for the within-sample and between-sample variance of the 5 imputed HLA datasets.32 P values for protective and predisposing ORs reported throughout the results section were adjusted for multiple testing using a false discovery rate (FDR) method with a FDR threshold of 5%.33 P values for assessing the significance of the age and gender interaction terms were not adjusted for multiple testing; 2-tailed P < .05 was considered significant.
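The text does not name the specific FDR procedure; assuming it is the widely used Benjamini-Hochberg step-up method, the adjustment can be sketched as below. The function name and the example P values are illustrative only.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adjusted]  # 5% FDR threshold
```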

As a last step toward identifying the correlation structure of protective or predisposing HLA variants, factor analysis34  was applied to a reduced data set containing only the significant associations detected during analysis. Each significant association was coded with an indicator variable (present/absent) for every individual in the patient dataset (controls were excluded). Given that many HLA variants are highly correlated and, therefore, representative of a single underlying process, factor analysis was used both to determine the true number of underlying processes and to group HLA variants by these underlying processes.

For all significant groupings, factor analysis loadings were used to determine the number of groups representing underlying processes and to assign HLA variants to these groups. A scree plot was used to identify the number of active factors (ie, underlying processes), whereby a flattening or “knee” in the percent variation explained by sequential loadings was used to determine a cutoff for the number of active factors.35  Variants were assigned to the factor on which they had their maximum loading, provided the loading was >0.2; variants with weaker or negative loadings were assigned to an independent group.
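The assignment rule can be sketched as follows, given a loading matrix that has already been estimated and truncated to the active factors. Two points are assumptions rather than details from the text: each weakly loading variant is placed in its own independent group, and the group labels are invented for illustration.

```python
def assign_groups(loadings, threshold=0.2):
    """Map each variant to a factor group using its maximum loading.

    loadings: dict of variant -> list of loadings, one per active factor.
    """
    groups = {}
    for variant, row in loadings.items():
        best = max(range(len(row)), key=lambda k: row[k])
        if row[best] > threshold:
            groups[variant] = f"factor_{best + 1}"
        else:
            # Weak or negative maximum loading: treat as its own group (assumed).
            groups[variant] = f"independent_{variant}"
    return groups


# Made-up loadings on two factors for three variants.
toy_groups = assign_groups({
    "A*02:01": [0.10, 0.70],
    "C*12:03": [0.15, -0.30],
    "B*38:01": [0.50, 0.10],
})
```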

Results

Potential allo-SCT CLL cases are enriched for men of younger age than the general CLL population

Cases consisted of 3616 US white, 413 African-American, and 97 Hispanic CLL patients. For each population, 50 000 controls were randomly selected from the NMDP registry. Baseline demographics for cases and controls are shown in Table 1. Asian/Pacific Islander patients were excluded from this study due to inadequate sample size (n = 28) and high population substructure. Among whites, the median age of cases was 51 years (range, 18-76 years), which is considerably lower than the median age at CLL diagnosis of 72 years. The median age of controls was 51 years (range, 18-60 years). The US geographic distributions for cases and controls are also shown and were similar between the groups. The gender distribution of cases was 74% men, whereas the controls were 75% men. This proportion is slightly higher than the rate of 65% men for CLL in the general population.36 

Table 1

Case and control demographics

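| Population | Group | Total | Male (%) | Female (%) | Median age | Age low | Age high | East (%)* | West (%)* | Midwest (%)* | South (%)* |
|---|---|---|---|---|---|---|---|---|---|---|---|
| White | Cases | 3 616 | 74.1 | 25.9 | 51 | 18 | 76 | 20.4 | 17.4 | 29.4 | 32.7 |
| White | Controls | 50 000 | 75 | 25 | 50.5 | 18 | 60 | 20.7 | 25.2 | 25.0 | 29.2 |
| African American | Cases | 413 | 66.1 | 33.9 | 49 | 18 | 71 | 29.5 | 21.8 | 19.1 | 29.5 |
| African American | Controls | 50 000 | 75 | 25 | 44.5 | 18 | 60 | 14.7 | 11.6 | 18.4 | 55.3 |
| Hispanic | Cases | 97 | 73.2 | 26.8 | 50 | 25 | 74 | 27.8 | 20.6 | 21.7 | 29.9 |
| Hispanic | Controls | 50 000 | 75 | 25 | 47.5 | 18 | 60 | 24.1 | 28.8 | 5.8 | 41.2 |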
*When patient home zip code was not available, the transplant zip code was substituted. The United States is split into 4 geographical regions based on the first digit of the zip code (0 and 1 = east; 2, 3, and 7 = south; 4, 5, and 6 = midwest; and 8 and 9 = west).
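The region coding in this footnote amounts to a lookup on the first zip code digit, with the transplant zip code as a fallback. A minimal sketch, with hypothetical function names and zip codes handled as strings so that leading zeros are preserved:

```python
# First digit of the zip code -> region, as described in the Table 1 footnote.
REGION_BY_FIRST_DIGIT = {
    "0": "east", "1": "east",
    "2": "south", "3": "south", "7": "south",
    "4": "midwest", "5": "midwest", "6": "midwest",
    "8": "west", "9": "west",
}


def zip_to_region(zip_code):
    """Return the region for a zip code given as a string."""
    z = str(zip_code).strip()
    if not z or z[0] not in REGION_BY_FIRST_DIGIT:
        raise ValueError(f"unrecognized zip code: {zip_code!r}")
    return REGION_BY_FIRST_DIGIT[z[0]]


def region_for_patient(home_zip, transplant_zip):
    """Use the home zip code when available, otherwise the transplant zip code."""
    return zip_to_region(home_zip if home_zip else transplant_zip)
```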

US white population contains HLA alleles that are protective against development of CLL and others that confer increased risk

In whites, the population with the largest number of cases, we found many significant HLA associations. Our association study found 28 protective or predisposing high-resolution alleles for CLL in whites (Figure 1A). Twelve alleles were protective (A*01:01, C*05:01, C*07:01, C*16:01, B*27:05, DRB1*01:01, DRB1*04:03, DRB1*13:01, DQB1*03:01, DQB1*05:01, DQB1*06:03, and DQB1*06:04) and 16 predisposing (A*02:01, C*05:01, C*07:01, C*16:02, B*14:01, B*15:01, DRB4*01:01, DRB1*04:01, DRB1*04:02, DRB1*07:01, DRB1*08:01, DQB1*03:02, DQB1*03:03, DQB1*04:02, and DQB1*05:04). The frequency of each individual HLA allele in controls and cases is indicated, as are ORs and P values.

Figure 1

OR plot for significant HLA allelic associations. Protective alleles are indicated in blue and predisposing alleles in red for the (A) white population and the (B) African-American population. Alleles are organized in genomic order scanning across HLA loci in the MHC region of chromosome 6. ORs are indicated by a black dot, with a 95% FDR-adjusted confidence interval. P values are indicated above the 95% confidence interval for each allele. Allele frequencies in cases and controls are provided.

A full listing of significantly associated HLA alleles and haplotypes, grouped using factor analysis, is in supplemental Table 1 available on the Blood Web site. The most predisposing HLA allele was DQB1*05:04 (OR = 5.62, P = 6.59 × 10−4), whereas the allele with the greatest protective effect was DRB1*04:03 (OR = 0.67, P = .043).

We also identify many HLA alleles that had suggestive associations for CLL but did not reach statistical significance when the FDR was applied to consider multiple testing (supplemental Table 2).

Antigen level analysis reveals additional associations

Because many HLA alleles are infrequent, some associations can only be observed when alleles are grouped into allele families. These allele families were originally derived from serologic antigen groups, with new alleles named based on sequence similarity with already reported alleles.20  Although the vast majority of allele family level associations were highly correlated with other high-resolution associations (supplemental Table 1), we were able to detect some haplotype associations where no high-resolution variant within the allele family reached significance; for example, A*02∼C*03∼B*15∼DRB4*01∼DRB1*04∼DQB1*03 in whites (OR = 1.35, P = .025). Examination of allele family level associations also allows for more direct comparison with previous reports.

Increased statistical power may be achieved when grouping several low-frequency alleles that have similar impact on disease. However, some strong high-resolution allelic associations are not detected at the antigen level because of opposing effects of alleles within the same antigen group; for example, DRB1*04 in whites contains the predisposing alleles DRB1*04:01 (OR = 1.15) and DRB1*04:02 (OR = 1.63) but also a protective allele DRB1*04:03 (OR = 0.67).

Novel factor analysis groups correlated alleles and associated haplotypes

We applied factor analysis in an attempt to automatically group the multitude of significantly associated HLA alleles and haplotypes that are highly correlated by linkage disequilibrium. Considering these groupings, we can identify as many as 47 potentially independent associations for CLL in whites and 6 in African Americans. The longest extended haplotype associations within each factor analysis group are listed in Tables 2-4, with a full listing of allele and haplotype associations grouped by factor analysis in supplemental Table 1. For example, the first factor analysis grouping for whites included 35 different allele and haplotype associations, all contained in the extended haplotype A*26:01∼C*12:03∼B*38:01∼DRB4*01:01∼DRB1*04:02∼DQB1*03:02. Each factor analysis group tends to represent a single haplotype, and the groups often include some of the constituent associated alleles if those alleles are not frequently found on other haplotypes.

Table 2

Extended HLA haplotype associations from factor analysis for whites

| Variant | OR | Control frequency | Lower 95% CI | Upper 95% CI | FDR P | Factor analysis |
|---|---|---|---|---|---|---|
| A*26:01∼C*12:03∼B*38:01∼DRB4*01:01∼DRB1*04:02∼DQB1*03:02 | 2.05 | 0.0035 | 1.23 | 3.41 | 1.94E-04 | Group 1 |
| C*16:01∼B*44:03∼DRB4*01:01 | 0.79 | 0.0290 | 0.63 | 0.99 | 3.05E-02 | Group 2 |
| A*01:01∼C*07:01∼B*08:01∼DRB1*03:01∼DQB1*02:01 | 0.83 | 0.0636 | 0.70 | 0.99 | 2.07E-02 | Group 3 |
| DRB3*02:02∼DRB1*13:01∼DQB1*06:03 | 0.79 | 0.0401 | 0.65 | 0.96 | 6.79E-03 | Group 4 |
| B*52:01∼DRB5*01:02∼DRB1*15:02∼DQB1*06:01 | 1.54 | 0.0064 | 1.05 | 2.26 | 1.45E-02 | Group 5 |
| A*02∼C*03∼B*15 | 1.27 | 0.0280 | 1.01 | 1.58 | 2.13E-02 | Group 6 |
| DRBX*NNNN∼DRB1*01:01∼DQB1*05:01 | 0.84 | 0.0808 | 0.73 | 0.97 | 5.36E-03 | Group 7 |
| A*02∼C*03∼B*15∼DRB4*01∼DRB1*04∼DQB1*03 | 1.35 | 0.0142 | 1.02 | 1.78 | 2.55E-02 | Group 8 |
| C*02:02∼B*27:05∼DRB3*02:02 | 0.43 | 0.0047 | 0.20 | 0.92 | 1.59E-02 | Group 9 |
| C*02∼B*27∼DRB4*01∼DRB1*04∼DQB1*03 | 0.57 | 0.0071 | 0.34 | 0.95 | 2.25E-02 | Group 10 |
| C*03:04∼DRB4*01:01 | 1.18 | 0.0416 | 1.00 | 1.40 | 4.23E-02 | Group 11 |
| C*05:01∼B*44:02∼DRB1*11:01∼DQB1*03:01 | 0.48 | 0.0042 | 0.23 | 0.99 | 4.45E-02 | Group 12 |
| C*05:01∼B*44:02∼DRB1*13:01∼DQB1*06:03 | 0.62 | 0.0083 | 0.39 | 0.99 | 4.04E-02 | Group 13 |
| DRB3*03:01∼DRB1*13:02∼DQB1*06:04 | 0.82 | 0.0363 | 0.67 | 0.99 | 3.94E-02 | Group 14 |
| DRB5*01∼DRB1*15∼DQB1*05 | 4.80 | 0.0007 | 2.09 | 11.03 | 1.86E-06 | Group 15 |
| A*02:01∼DRB1*07:01 | 1.39 | 0.0314 | 1.13 | 1.70 | 9.57E-06 | Group 16 |
| C*01∼B*27∼DRBX*NN∼DRB1*01 | 0.62 | 0.0110 | 0.41 | 0.94 | 1.16E-02 | Group 17 |
| C*03∼B*15∼DRB4*01∼DRB1*04∼DQB1*03 | 1.37 | 0.0231 | 1.10 | 1.71 | 3.09E-04 | Group 18 |
| DRB3*02∼DRB1*11∼DQB1*03 | 0.89 | 0.1007 | 0.80 | 1.00 | 4.00E-02 | Group 19 |
| DRB4*01∼DRB1*04∼DQB1*02 | 2.02 | 0.0012 | 1.04 | 3.92 | 3.17E-02 | Group 20 |
| A*02:01 | 1.20 | 0.2670 | 1.08 | 1.34 | 3.29E-06 | Group 21 |
| C*03:03∼B*15:01∼DRB4*01:01 | 1.68 | 0.0057 | 1.11 | 2.52 | 2.39E-03 | Group 22 |
| C*12:03∼B*35:03 | 1.75 | 0.0027 | 1.06 | 2.86 | 1.78E-02 | Group 23 |
| C*07:01∼B*18:01 | 0.73 | 0.0216 | 0.56 | 0.95 | 8.97E-03 | Group 24 |
| C*01:02∼B*27:05∼DRBX*NNNN∼DRB1*01:01∼DQB1*05:01 | 0.60 | 0.0090 | 0.36 | 0.99 | 4.23E-02 | Group 25 |
| DRB4*01:01∼DRB1*07:01∼DQB1*03:03 | 1.49 | 0.0258 | 1.19 | 1.85 | 1.79E-07 | Group 26 |
| DRB3*02∼DRB1*11∼DQB1*06 | 8.14 | 0.0003 | 2.81 | 23.56 | 2.68E-08 | Group 27 |
| C*15:02∼B*51:01 | 0.73 | 0.0240 | 0.57 | 0.95 | 8.00E-03 | Group 28 |
| C*12:03 | 1.26 | 0.0473 | 1.07 | 1.50 | 4.73E-04 | Group 29 |
| A*02:01∼DRB1*14:01 | 1.71 | 0.0060 | 1.18 | 2.48 | 3.38E-04 | Group 30 |
| A*24:02∼C*07:01∼B*18:01 | 0.37 | 0.0040 | 0.14 | 0.94 | 2.52E-02 | Group 31 |
| C*12:03∼DRB4*01:01 | 1.63 | 0.0115 | 1.18 | 2.23 | 2.27E-05 | Group 32 |
| B*15∼DRB4*01∼DRB1*04∼DQB1*03 | 1.51 | 0.0090 | 1.08 | 2.11 | 3.66E-03 | Group 33 |
| DRB4*01:01∼DRB1*04:03 | 0.67 | 0.0100 | 0.46 | 0.99 | 4.53E-02 | Group 34 |
| DRB4*01∼DRB1*04∼DQB1*04 | 2.21 | 0.0015 | 1.17 | 4.18 | 6.80E-03 | Group 35 |
| DRB4*01:01∼DRB1*04:01∼DQB1*03:02 | 1.33 | 0.0383 | 1.12 | 1.58 | 1.86E-05 | Group 36 |
| C*03:04∼B*15:01∼DRB4*01:01 | 1.25 | 0.0191 | 1.01 | 1.54 | 4.01E-02 | Group 37 |
| DRB4*01:01∼DRB1*07:01 | 1.13 | 0.1322 | 1.01 | 1.27 | 1.46E-02 | Group 38 |
| C*16:02 | 1.83 | 0.0028 | 1.12 | 3.00 | 6.18E-03 | Group 39 |
| A*02∼B*14 | 1.55 | 0.0049 | 1.01 | 2.39 | 4.25E-02 | Group 40 |
| DQB1*06 | 0.91 | 0.2445 | 0.83 | 1.00 | 3.58E-02 | Group 41 |
| C*07∼B*44 | 1.33 | 0.0099 | 1.01 | 1.76 | 3.90E-02 | Group 42 |
| DRB1*15:01∼DQB1*06:03 | 12.90 | 0.0003 | 4.63 | 35.98 | 2.80E-14 | Group 43 |
| C*01∼B*15 | 2.66 | 0.0018 | 1.42 | 4.96 | 1.31E-05 | Group 44 |
| C*01:02∼DRB4*01:01 | 1.72 | 0.0047 | 1.10 | 2.69 | 4.58E-03 | Group 45 |
| C*04∼B*15 | 1.99 | 0.0025 | 1.16 | 3.41 | 3.06E-03 | Group 46 |
| A*02:05 | 1.37 | 0.0100 | 1.01 | 1.85 | 3.36E-02 | Group 47 |

The longest extended haplotype association for each factor analysis group is listed. When multiple haplotypes had the same number of loci within the same factor analysis group, the most significant association is provided. When no haplotypes were contained in a factor analysis group, the most significant allelic association is provided. Lower 95% CI and upper 95% CI indicate the ORs for the 95% CI. FDR P value is the significance after FDR correction for multiple testing. A complete listing of allele and haplotype associations for each factor analysis group with additional information on uncorrected P values, case frequencies, and interactions for age, geography, and gender is provided in supplemental Table 1. CI, confidence interval.

Table 3

Extended HLA haplotype associations from factor analysis for African Americans

| Variant | OR | Control frequency | Lower 95% CI | Upper 95% CI | FDR P | Factor analysis |
|---|---|---|---|---|---|---|
| DRB4*01:01∼DRB1*09:01∼DQB1*02:01 | 1.73 | 0.030 | 1.03 | 2.93 | 2.81E-02 | Group 1 |
| DRB3*03:01∼DRB1*13:02∼DQB1*06:04 | 2.27 | 0.009 | 1.03 | 4.97 | 3.31E-02 | Group 2 |
| DRB4*01∼DRB1*07∼DQB1*03 | 28.03 | 0.039 | 9.64 | 81.50 | 0.00E+00 | Group 3 |
| C*04:01 | 0.72 | 0.211 | 0.54 | 0.98 | 2.46E-02 | Group 4 |
| DRB4*01 | 1.37 | 0.007 | 1.04 | 1.80 | 1.39E-02 | Group 5 |
| A*03∼DRB1*07 | 2.24 | 0.084 | 1.00 | 4.99 | 4.71E-02 | Group 6 |

The longest extended haplotype association for each factor analysis group is listed. When multiple haplotypes had the same number of loci within the same factor analysis group, the most significant association is provided. When no haplotypes were contained in a factor analysis group, the most significant allelic association is provided. Lower 95% CI and upper 95% CI indicate the ORs for the 95% CI. FDR P value is the significance after FDR correction for multiple testing. A complete listing of allele and haplotype associations for each factor analysis group with additional information on uncorrected P values, case frequencies, and interactions for age, geography, and gender is provided in supplemental Table 1.

Table 4

Extended HLA haplotype associations from factor analysis for Hispanics

| Variant | OR | Control frequency | Lower 95% CI | Upper 95% CI | FDR P | Factor analysis |
|---|---|---|---|---|---|---|
| C*04:01∼B*44:03∼DRB4*01:01∼DRB1*07:01∼DQB1*02:01 | 4.18 | 0.007 | 1.48 | 11.78 | 3.93E-03 | Group 1 |
| DRB4*01:01∼DRB1*07:01∼DQB1*03:03 | 13.86 | 0.003 | 4.15 | 46.30 | 9.59E-09 | Group 2 |
| C*16:01∼B*45:01 | 3.47 | 0.009 | 1.16 | 10.37 | 1.70E-02 | Group 3 |
| B*35∼DRB4*01∼DRB1*07 | 5.76 | 0.006 | 2.07 | 16.03 | 1.27E-04 | Group 4 |
| A*24:02∼DRB1*07:01 | 4.27 | 0.008 | 1.65 | 11.08 | 1.29E-03 | Group 5 |
| C*07∼DRB5*01 | 2.25 | 0.040 | 1.04 | 4.86 | 2.96E-02 | Group 6 |
| C*07:04 | 5.60 | 0.006 | 1.67 | 18.74 | 5.71E-04 | Group 7 |

The longest extended haplotype association for each factor analysis group is listed. When multiple haplotypes had the same number of loci within the same factor analysis group, the most significant association is provided. When no haplotypes were contained in a factor analysis group, the most significant allelic association is provided. Lower 95% CI and upper 95% CI indicate the ORs for the 95% CI. FDR P value is the significance after FDR correction for multiple testing. A complete listing of allele and haplotype associations for each factor analysis group with additional information on uncorrected P values, case frequencies, and interactions for age, geography, and gender is provided in supplemental Table 1.

Linkage disequilibrium between specific HLA alleles is often very high, making it extremely challenging to distinguish causative HLA alleles from other alleles that are simply linked on the same haplotype with the true causative allele. Often there are not sufficient cases in which one allele is present and another allele is absent to separate the individual contributions of alleles. Haplotype-level associations were often more extreme than allele-level associations; for example, A*02:01∼B*15:01∼DRB1*04:01 had an OR of 1.41, but A*02:01, B*15:01, and DRB1*04:01 had lower ORs of 1.20, 1.26, and 1.15, respectively (supplemental Table 1).

DRB4*01:01∼DRB1*07:01∼DQB1*03:03 is a universally predisposing haplotype for CLL in whites, African Americans, and Hispanics

We identified an HLA class II haplotype, DRB4*01:01∼DRB1*07:01∼DQB1*03:03, with a strongly predisposing impact in whites (OR = 1.49, P = 1.79 × 10−7), African Americans (OR = 28.03, P < 2 × 10−16 at 2-digit resolution), and Hispanics (OR = 13.86, P = 9.59 × 10−9) (supplemental Table 1). The frequency of this haplotype is 3.4% in whites, 0.3% in African Americans, and 1.2% in Hispanics.19  The DRB4*01:01 and DQB1*03:03 alleles on this same haplotype were also found to be predisposing in both whites (OR = 1.17, P = 3.94 × 10−5 and OR = 1.40, P = 1.85 × 10−7, respectively) and African Americans (OR = 1.37, P = .014 and OR = 2.10, P = .016, respectively; Tables 2-4). Three-locus linkage disequilibrium between these alleles was high within each population (D′ > 0.88), indicating high correlation between individual allelic associations.
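For readers unfamiliar with the D′ statistic, its pairwise form is sketched below; the three-locus measure reported above generalizes this, and how it was computed is not specified in the text. The function name and example frequencies are illustrative.

```python
def d_prime(p_a, p_b, p_ab):
    """Pairwise normalized linkage disequilibrium D' between two alleles.

    p_a, p_b: frequencies of allele A and allele B at two loci.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when an allele is fixed or absent")
    return d / d_max


# Two alleles at 10% frequency that always occur together give D' close to 1.
complete_ld = d_prime(0.1, 0.1, 0.1)
# Alleles combining at random give D' of 0.
no_ld = d_prime(0.2, 0.5, 0.1)
```

Values near 1 mean the alleles almost never occur apart, which is why their individual associations cannot be separated statistically.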

Minority populations harbor specific HLA alleles that alter CLL risk

We also found that DRB1*09:01 (OR = 2.00, P = 2.31 × 10−4) predisposes African Americans to the development of CLL (Figure 1B). No DRB1*09:01 association was identified in whites, but the allele is in near complete linkage disequilibrium (D′ > 0.999) with DRB4*01:01,19  an allele found to be predisposing in both white and African-American populations. DRB1*09:01 is at 3% frequency in African Americans, but only 0.1% frequency in whites, limiting the statistical power to detect DRB1*09:01 associations across populations.

C*04:01 was found to be uniquely protective in African Americans (OR = 0.72, P = .025; Figure 1B), where the allele is at 21% frequency in controls but only 17.1% of cases.19  Interestingly, B*45:01 (OR = 2.98, P = .015) and C*07:04 (OR = 5.60, P = 5.71 × 10−4) were uniquely predisposing in Hispanics (supplemental Table 1).

Haplotype common in Ashkenazi Jews is associated with an increased risk of CLL

CLL incidence has been reported to be much higher in Jewish populations than in neighboring Arab populations.37  We identified as predisposing the haplotype A*26:01∼C*12:03∼B*38:01∼DRB4*01:01∼DRB1*04:02∼DQB1*03:02 (OR = 2.05, P = 1.94 × 10−4) and 4 predisposing alleles on this haplotype: C*12:03 (OR = 1.26, P = 4.73 × 10−4), DRB4*01:01 (OR = 1.17, P = 3.94 × 10−5), DRB1*04:02 (OR = 1.19, P = .033), and DQB1*03:02 (OR = 1.15, P = .0046) (supplemental Table 1). This is the most common haplotype in US Ashkenazi Jews, at a 6.0% frequency,38  indicating that HLA may play a role in the higher incidence of CLL in this population. Interestingly, this haplotype also contains the DRB4*01:01 allele that is universally predisposing among studied populations. However, Ashkenazi Jews were unable to self-identify to NMDP as a subpopulation distinct from whites, limiting our ability to validate this association.

Several HLA alleles have differing associations by age and gender

Because the median age of patients in this study is significantly lower than that of the CLL population in general, we stratified at age 50 to determine whether age at the onset of CLL was associated with individual HLA alleles. Several alleles were found to have differing associations by age (supplemental Table 1). For example, A*23:01 has a protective association only in whites <50 years of age (OR = 0.616, P = .014). Stratified ORs for variants with age interactions are reported in supplemental Table 3.

In addition, because of the well-documented male predominance of CLL, we sought to determine if gender was associated with any of the specific HLA alleles. Several alleles were found to have differing associations by gender (supplemental Table 1). B*27:05 may lead to lower CLL risk in women (OR = 0.55, P = .00056) than in men. Stratified ORs for variants with gender interactions are reported in supplemental Table 4.

Homozygosity at HLA class I is associated with development of CLL

We found overall homozygosity at all 3 HLA class I loci was associated with increased risk of CLL in whites: HLA-A (OR = 1.19, P = 1.67 × 10−4), HLA-B (OR = 1.16, P = .024), and HLA-C (OR = 1.16, P = .0076; Table 5). One specific homozygote genotype, A*02:01+A*02:01, was found to have increased risk in both whites (OR = 1.26, P = .0059) and Hispanics (OR = 1.84, P = .041 at 2-digit resolution). In addition, we found several other specific HLA genotype associations (supplemental Table 5).

Table 5

Locus-level HLA associations for overall heterozygosity and homozygosity in whites

Variant     OR     Lower 95% CI   Upper 95% CI   P
A*HOMO      1.19   1.07           1.33           .00016
A*HETERO    0.84   0.76           0.92           .00016
C*HOMO      1.16   1.02           1.31           .0077
C*HETERO    0.86   0.77           0.96           .0077
B*HOMO      1.16   1.02           1.32           .024
B*HETERO    0.86   0.74           1.00           .024

Lower 95% CI and upper 95% CI indicate the lower and upper bounds of the 95% CI for the OR. HOMO, associations for all homozygous genotypes at each locus; HETERO, all heterozygous genotypes.
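
Because every individual is either homozygous or heterozygous at a given locus, the two indicators in Table 5 are complementary, so the heterozygosity OR should be the reciprocal of the homozygosity OR. The short Python sketch below checks this relationship against the rounded ORs reported in Table 5. The variable and function names are illustrative assumptions, and small gaps are expected from rounding to 2 decimal places.

```python
# ORs for overall homozygosity and heterozygosity, copied from Table 5.
table5_ors = {
    "A": {"homo": 1.19, "hetero": 0.84},
    "C": {"homo": 1.16, "hetero": 0.86},
    "B": {"homo": 1.16, "hetero": 0.86},
}


def reciprocal_gap(homo_or: float, hetero_or: float) -> float:
    """Absolute difference between 1/OR(homo) and the reported OR(hetero)."""
    return abs(1.0 / homo_or - hetero_or)


for locus, ors in table5_ors.items():
    gap = reciprocal_gap(ors["homo"], ors["hetero"])
    print(
        f"HLA-{locus}: 1/OR_homo = {1.0 / ors['homo']:.3f}, "
        f"reported OR_hetero = {ors['hetero']:.2f}, gap = {gap:.3f}"
    )
```

This also explains why each HOMO and HETERO pair shares the same P value: the two rows describe a single test seen from opposite directions.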

Discussion

In this study, we reconfirmed several HLA associations seen in previous CLL genetic association studies and identified many others that were previously unobserved due to low statistical power. The large number of disease cases and donor controls across multiple populations coupled with high-quality classical HLA typing make the NMDP dataset a powerful source for inference of lower-frequency haplotype and genotype-level HLA associations with severe CLL for the first time.

GWASs for whites with CLL have previously identified associations with several SNPs in the HLA region. Because these SNPs are often in high linkage disequilibrium with specific HLA alleles, it is possible to impute HLA alleles from GWAS SNP data using software such as HLA*IMP.39  We reconfirm with very high confidence an A*02:01 association from a GWAS of 517 CLL cases and an additional replication study that specifically typed the associated rs6904029 SNP,23  finding close agreement in ORs, with OR = 1.32 in the study by Di Bernardo et al vs 1.20 in our study. A*02:01 is the most common HLA-A allele in whites, with a frequency of 26.7% in the general population19  and 30.4% in whites with severe CLL.
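
The A*02:01 frequencies quoted above can be turned into a crude, unadjusted odds ratio as a sanity check. The Python sketch below does this arithmetic. The function name is an illustrative assumption, and the calculation is not the regression model used in the study, so agreement with the reported OR is only expected to be approximate.

```python
def crude_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Unadjusted allele-level odds ratio from two allele frequencies."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# A*02:01 in whites: 30.4% in severe CLL cases, 26.7% in the general population.
a0201_or = crude_odds_ratio(freq_cases=0.304, freq_controls=0.267)
print(f"crude OR for A*02:01 = {a0201_or:.2f}")
```

The crude value comes out close to the OR of 1.20 reported in this study, which suggests that covariate adjustment has little effect on the estimate for this common allele.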

GWAS using GEC/Interlymph Consortium samples have yielded 6 SNP associations in the HLA region,20-22  but we are unable to compare results because these studies did not impute HLA alleles from the SNP data. HLA imputation, or more ideally classical HLA typing, on the GEC/Interlymph Consortium samples would enable adequate replication of our study. The GWAS cases included older patients and those with less severe disease than the cases in this study, which is based on data collected by the NMDP for a cohort of CLL patients who have likely failed prior therapies and were in need of allo-SCT. Candidate genes involved in apoptosis and hematopoietic cell homeostasis were identified as conferring increased risk in these GWAS.

Findings from previous CLL association studies using classical HLA typing are often consistent with our results. The first predisposing HLA association found with CLL using DNA-based typing was reported by British investigators.24  We reconfirm the A2∼B62∼DR4 haplotype reported (OR = 4.1) in our data at high resolution as A*02:01∼B*15:01∼DRB1*04:01 (OR = 1.41, P = .0049). A predisposing C*16 association (OR = 2.69) in a study of Southeastern Spanish patients40  was also seen in US whites for C*16:02 (OR = 1.83, P = .0062); however, C*16:01 was protective (OR = 0.79, P = .0098).

Previous reports have suggested that homozygosity of HLA is associated with the development of CLL41,42  and DLBCL,43  which we also report for all HLA class I loci. Although consanguinity among cases cannot be formally assessed, it would not account for the magnitude of this observed risk (OR = 1.16-1.19).

The predisposing association for DR53 (DRB4) identified in whites by Dorak et al24  was also replicated as DRB4*01:01 (OR = 1.17, P = 3.95 × 10−5), whereas the protective DR52 (DRB3) association was observed in several HLA class II haplotypes: DRB3*03:01∼DRB1*13:02∼DQB1*06:04 (OR = 0.82, P = .039), DRB3*02:02∼DRB1*13:01∼DQB1*06:03 (OR = 0.79, P = .0068), and DRB3*02∼DRB1*11∼DQB1*03 (OR = 0.89, P = .04). However, a predisposing DRB3-containing haplotype, DRB3*02∼DRB1*11∼DQB1*06 (OR = 8.14, P = 2.68 × 10−8), was also observed. The DRB4 locus was also found on multiple predisposing haplotypes containing alleles in the DRB1*04, DRB1*07, and DRB1*09 allele families. However, among studied populations, DRB1*09 was frequent enough only in African Americans to detect a predisposing effect of the DRB4*01:01∼DRB1*09:01 haplotype. Looking across populations, we observed a universal association for the DRB4*01:01∼DRB1*07:01∼DQB1*03:03 haplotype in whites, Hispanics, and African Americans. Concordance in associations across these populations suggests similar disease etiology in all ethnic groups studied.

The novel application of factor analysis to this HLA association study gives an automated determination of the number of unique HLA haplotypes associated with CLL. Allele-level ORs were often less extreme than haplotype-level effects, which may result either from additive effects of ORs from individual constituent alleles or from certain alleles also being in linkage disequilibrium with alleles at other loci that have an opposing impact on disease risk. The factor analysis groupings suggest that many HLA allele associations are best described in the context of their membership within a constellation of specific haplotypes rather than as independent findings. Our hybrid approach, combining the advantages of both high-resolution allele-level and antigen-level associations with factor analysis, allowed us to avoid any disadvantages that could result from choosing to analyze at any particular level of resolution over another.

Alleles that more avidly bind a wider range of peptides may protect from infectious diseases and cancers while predisposing for development of autoimmune diseases. We identified the B*27:05 allele as the second most protective allele for CLL (OR = 0.77). B*27:05 has been found to be predisposing for ankylosing spondylitis,44  whereas it is protective for HIV progression to AIDS,45  and now for CLL.

Dorak et al reported that the A1∼B8∼DR3 haplotype was identified as trending toward significantly protective,24  which was confirmed in our study as A*01:01∼C*07:01∼B*08:01∼DRB1*03:01∼DQB1*02:01 (OR = 0.83, P = .021). However, a recent CLL association result from MD Anderson indicated that the A1∼C7∼B8 haplotype was predisposing for CLL.41  We note the MD Anderson dataset included only patients that proceeded to allogeneic transplant, which likely biases cases toward more common HLA alleles, whereas the NMDP dataset included patients that did not identify a matched donor and/or did not proceed to transplant.

Differences in CLL incidence may also be caused by allelic differences in downregulation of HLA expression. Maintenance of HLA-Bw4 allelic expression is beneficial for leukemic cells because these HLA alleles provide an inhibitory signal to natural killer cells via the KIR3DL1 ligand.46  In CLL, HLA alleles in the Bw4 group are selectively downregulated less often than Bw6 alleles, protecting leukemic cells from natural killer-mediated lysis and increasing presentation of CLL-specific peptides to cytotoxic T lymphocytes.47  We found Bw4 to be protective with threonine at position 80 (OR = 0.88, P = .0013). In ovarian cancer, downregulation of A*02 has been associated with poor prognosis.48  Together, these observations suggest that immune evasion by HLA downregulation may be a mechanism by which specific HLA alleles confer an increased risk.

Incidence of CLL varies widely among world populations, occurring at higher rates in whites and found least often in populations of the Far East.49  For Hispanic and Asian populations, far fewer cases were available in the NMDP dataset than would be expected by overall patient population proportions, consistent with observations that CLL is seen at much lower incidence compared with whites. Due to low sample sizes, there has been no assessment of the role of HLA in the development of CLL among world populations; therefore, global CLL sample collection efforts and studies are necessary. Combining whole exome sequencing and HLA peptide binding analysis is a promising approach to identify interactions between potentially differing population frequencies of CLL-specific somatic mutations and presentation of resulting tumor neoantigens by HLA alleles, which are already known to differ in frequency across populations.50 

With an incidence rate ratio of 0.73, CLL is less common in African Americans compared with the US white population, which has an incidence ratio of 1.89.51,52  In a recent review of the clinical course of African Americans with CLL, investigators from the MD Anderson Cancer Center and Duke University Medical Center reported that African Americans tend to have more unfavorable clinical and biologic characteristics, an earlier need of treatment, fewer remissions, and inferior overall survival compared with nonblack patients.53 

Despite a lower overall incidence of CLL in African Americans, the somewhat more aggressive clinical course suggests that these differences may not be accounted for strictly by differences in socioeconomic status. We report for the first time that the HLA alleles predisposing to the development of CLL in African Americans are different from those for whites, and this may have implications for the mechanism of evasion of the immune surveillance that normally functions to suppress burgeoning malignancies. The predisposing DRB4*01:01 is less common in African Americans (18.2% allele frequency) than in whites (29.7%), which may explain some of the difference in incidence. In addition, C*04:01 is uniquely protective in African Americans and found at a relatively high frequency (21%) in this population.

In conclusion, we report the largest HLA association study of CLL to date. We find several HLA alleles that predispose to the development of severe disease in whites and African Americans, as well as others that are protective. Future studies are needed to confirm these HLA associations among unselected cases that would include those with less severe CLL and patients unfit for allo-SCT due to advanced age. Additional controls over age 60 could improve adjustment for age. Correlation of HLA alleles with cancer cytogenetics that were unavailable in this dataset could reveal prognostic indicators for progression of CLL to more severe disease.54,55 

Although knowledge of HLA associations in cancer does not immediately suggest clinical intervention, these findings may direct further study of the role of the adaptive immune response against CLL, including both allogeneic stem cell transplantation and adoptive immunotherapy through use of chimeric antigen receptor T cells.56  Identifying which proteins are mutated or differentially expressed in CLL cells and which endogenous peptides derived from these cancer-related proteins are presented to T cells by different HLA alleles50,57  is an important step in identification of specific peptide-HLA complexes that could be targeted for personalized immunotherapy.58 

The online version of this article contains a data supplement.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Acknowledgments

The authors thank the NMDP patients and the >11 million volunteer donors in the Be The Match Registry operated by the National Marrow Donor Program. The authors thank Mie-Jie Zhang of the Center for International Blood and Marrow Transplant Research for contributions to development of the statistical method of logistic regression with multiple imputation.

Bioinformatics methods development was funded by Office of Naval Research grant N00014-11-1-0339.

Authorship

Contribution: L.G. performed the HLA genetic analysis and drafted the manuscript; S.F. and M.A. performed the statistical analysis; M.M. guided the study design; M.K. and B.T.H. provided expertise on CLL disease from a clinical perspective and guided the study design; and all authors contributed to writing and/or review of the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Loren Gragert, Bioinformatics Research, National Marrow Donor Program/Be The Match, Minneapolis, MN 55413; e-mail: lgragert@nmdp.org.

References

1. Siegel R, Ma J, Zou Z, Jemal A. Cancer statistics, 2014. CA Cancer J Clin. 2014;64(1):9-29.
2. Hallek M, Cheson BD, Catovsky D, et al; International Workshop on Chronic Lymphocytic Leukemia. Guidelines for the diagnosis and treatment of chronic lymphocytic leukemia: a report from the International Workshop on Chronic Lymphocytic Leukemia updating the National Cancer Institute-Working Group 1996 guidelines. Blood. 2008;111(12):5446-5456.
3. Shanafelt TD, Geyer SM, Kay NE. Prognosis at diagnosis: integrating molecular biologic insights into clinical practice for patients with CLL. Blood. 2004;103(4):1202-1210.
4. Howlader N, Noone AM, Krapcho M, et al (eds). SEER Cancer Statistics Review, 1975-2011. Bethesda, MD: National Cancer Institute; 2014.
5. Boggs DR, Chen SC, Zhang Z-N, Zhang A. Chronic lymphocytic leukemia in China. Am J Hematol. 1987;25(3):349-354.
6. Mak V, Ip D, Mang O, et al. Preservation of lower incidence of chronic lymphocytic leukemia in Chinese residents in British Columbia: a 26-year survey from 1983 to 2008. Leuk Lymphoma. 2013;55(4):824-827.
7. Pan JWY, Cook LS, Schwartz SM, Weis NS. Incidence of leukemia in Asian migrants to the United States and their descendants. Cancer Causes Control. 2002;13(9):791-795.
8. Sava GP, Speedy HE, Houlston RS. Candidate gene association studies and risk of chronic lymphocytic leukemia: a systematic review and meta-analysis. Leuk Lymphoma. 2014;55(1):160-167.
9. Brown JR. Inherited susceptibility to chronic lymphocytic leukemia: evidence and prospects for the future. Ther Adv Hematol. 2013;4(4):298-308.
10. Moesta AK, Norman PJ, Yawata M, Yawata N, Gleimer M, Parham P. Synergistic polymorphism at two positions distal to the ligand-binding site makes KIR2DL2 a stronger receptor for HLA-C than KIR2DL3. J Immunol. 2008;180(6):3969-3979.
11. Coombs CC, Rassenti LZ, Falchi L, et al. Single nucleotide polymorphisms and inherited risk of chronic lymphocytic leukemia among African Americans. Blood. 2012;120(8):1687-1690.
12. Houlston RS, Sellick G, Yuille M, Matutes E, Catovsky D. Causation of chronic lymphocytic leukemia—insights from familial disease. Leuk Res. 2003;27(10):871-876.
13. Sellick GS, Catovsky D, Houlston RS. Familial chronic lymphocytic leukemia. Semin Oncol. 2006;33(2):195-201.
14. Dunn GP, Old LJ, Schreiber RD. The immunobiology of cancer immunosurveillance and immunoediting. Immunity. 2004;21(2):137-148.
15. Verheyden S, Ferrone S, Mulder A, et al. Role of the inhibitory KIR ligand HLA-Bw4 and HLA-C expression levels in the recognition of leukemic cells by Natural Killer cells. Cancer Immunol Immunother. 2009;58(6):855-865.
16. Cerwenka A, Lanier LL. Natural killer cells, viruses and cancer. Nat Rev Immunol. 2001;1(1):41-49.
17. Drénou B, Le Friec G, Bernard M, et al. Major histocompatibility complex abnormalities in non-Hodgkin lymphomas. Br J Haematol. 2002;119(2):417-424.
18. Trowsdale J, Knight JC. Major histocompatibility complex genomics and human disease. Annu Rev Genomics Hum Genet. 2013;14(July):301-323.
19. Gragert L, Madbouly A, Freeman J, Maiers M. Six-locus high resolution HLA haplotype frequencies derived from mixed-resolution DNA typing for the entire US donor registry. Hum Immunol. 2013;74(10):1313-1320.
20. Berndt SI, Skibola CF, Joseph V, et al. Genome-wide association study identifies multiple risk loci for chronic lymphocytic leukemia. Nat Genet. 2013;45(8):868-876.
21. Slager SL, Camp NJ, Conde L, et al. Common variants within 6p21.31 locus are associated with chronic lymphocytic leukaemia and, potentially, other non-Hodgkin lymphoma subtypes. Br J Haematol. 2012;159(5):572-576.
22. Slager SL, Rabe KG, Achenbach SJ, et al. Genome-wide association study identifies a novel susceptibility locus at 6p21.3 among familial CLL. Blood. 2011;117(6):1911-1916.
23. Di Bernardo MC, Broderick P, Harris S, et al. Risk of developing chronic lymphocytic leukemia is influenced by HLA-A class I variation. Leukemia. 2013;27(1):255-258.
24. Dorak MT, Machulla HK, Hentschel M, Mills KI, Langner J, Burnett AK. Influence of the major histocompatibility complex on age at onset of chronic lymphoid leukaemia. Int J Cancer. 1996;65(2):134-139.
25. Erlich H. HLA DNA typing: past, present, and future. Tissue Antigens. 2012;80(1):1-11.
26. Gourraud P-A, Lamiraux P, El-Kadhi N, Raffoux C, Cambon-Thomsen A. Inferred HLA haplotype information for donors from hematopoietic stem cells donor registries. Hum Immunol. 2005;66(5):563-570.
27. Paunić V, Gragert L, Madbouly A, Freeman J, Maiers M. Measuring ambiguity in HLA typing methods. PLoS ONE. 2012;7(8):e43585.
28. Madbouly A, Gragert L, Freeman J, et al. Validation of statistical imputation of allele-level multilocus phased genotypes from ambiguous HLA assignments. Tissue Antigens. 2014;84(3):285-292.
29. Rubin DB. Multiple Imputation for Nonresponse in Surveys. New York: Wiley and Sons; 1987.
30. Gumperz JE, Barber LD, Valiante NM, et al. Conserved and variable residues within the Bw4 motif of HLA-B make separable contributions to recognition by the NKB1 killer cell-inhibitory receptor. J Immunol. 1997;158(11):5237-5241.
31. Khakoo SI, Thio CL, Martin MP, et al. HLA and NK cell inhibitory receptor genes in resolving hepatitis C virus infection. Science. 2004;305(5685):872-874.
32. Schafer JL. Analysis of Incomplete Multivariate Data. New York: Chapman & Hall; 1997.
33. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57(1):289-300.
34. Fabrigar LR, Wegener DT. Exploratory Factor Analysis (Understanding Statistics). Oxford, UK: Oxford University Press; 2011.
35. Cattell RB. The Scree test for the number of factors. Multivariate Behav Res. 1966;1(2):245-276.
36. National Cancer Institute. SEER: Surveillance, Epidemiology, and End Results program. http://www.seer.cancer.gov/data. Accessed December 7, 2012.
37. Shvidel L, Shtarlid M, Klepfish A, Sigler E, Berrebi A. Epidemiology and ethnic aspects of B cell chronic lymphocytic leukemia in Israel. Leukemia. 1998;12(10):1612-1617.
38. Klitz W, Gragert L, Maiers M, et al. Genetic differentiation of Jewish populations. Tissue Antigens. 2010;76(6):442-458.
39. Leslie S, Donnelly P, McVean G. A statistical method for predicting classical HLA alleles from SNP data. Am J Hum Genet. 2008;82(1):48-56.
40. Montes-Ares O, Moya-Quiles MR, Montes-Casado M, et al. Human leucocyte antigen-C in B chronic lymphocytic leukaemia. Br J Haematol. 2006;135(4):517-519.
41. Shah N, Decker WK, Lapushin R, et al. HLA homozygosity and haplotype bias among patients with chronic lymphocytic leukemia: implications for disease control by physiological immune surveillance. Leukemia. 2011;25(6):1036-1039.
42. Mueller LP, Machulla HKG. Increased frequency of homozygosity for HLA class II loci in female patients with chronic lymphocytic leukemia. Leuk Lymphoma. 2002;43(5):1013-1019.
43. Wang SS, Abdou AM, Morton LM, et al. Human leukocyte antigen class I and II alleles in non-Hodgkin lymphoma etiology. Blood. 2010;115(23):4820-4823.
44. Benjamin R, Parham P. Guilt by association: HLA-B27 and ankylosing spondylitis. Immunol Today. 1990;11(4):137-142.
45. den Uyl D, van der Horst-Bruinsma IE, van Agtmael M. Progression of HIV to AIDS: a protective role for HLA-B27? AIDS Rev. 2004;6(2):89-96.
46. Guillaume N, Marolleau J-P. Is immune escape via human leukocyte antigen expression clinically relevant in chronic lymphocytic leukemia? Focus on the controversies. Leuk Res. 2013;37(4):473-477.
47. Demanet C, Mulder A, Deneys V, et al. Down-regulation of HLA-A and HLA-Bw6, but not HLA-Bw4, allospecificities in leukemic cells: an escape mechanism from CTL and NK attack? Blood. 2004;103(8):3122-3130.
48. Andersson E, Villabona L, Bergfeldt K, et al. Correlation of HLA-A02* genotype and HLA class I antigen down-regulation with the prognosis of epithelial ovarian cancer. Cancer Immunol Immunother. 2012;61(8):1243-1253.
49. Ruchlemer R, Polliack A. Geography, ethnicity and “roots” in chronic lymphocytic leukemia. Leuk Lymphoma. October 2012:1-9.
50. Rajasagi M, Shukla SA, Fritsch EF, et al. Systematic identification of personal tumor-specific neoantigens in chronic lymphocytic leukemia. Blood. 2014;124(3):453-462.
51. Dores GM, Anderson WF, Curtis RE, et al. Chronic lymphocytic leukaemia and small lymphocytic lymphoma: overview of the descriptive epidemiology. Br J Haematol. 2007;139(5):809-819.
52. Shenoy PJ, Malik N, Sinha R, et al. Racial differences in the presentation and outcomes of chronic lymphocytic leukemia and variants in the United States. Clin Lymphoma Myeloma Leuk. 2011;11(6):498-506.
53. Falchi L, Keating MJ, Wang X, et al. Clinical characteristics, response to therapy, and survival of African American patients diagnosed with chronic lymphocytic leukemia: joint experience of the MD Anderson Cancer Center and Duke University Medical Center. Cancer. 2013;119(17):3177-3185.
54. Döhner H, Stilgenbauer S, Benner A, et al. Genomic aberrations and survival in chronic lymphocytic leukemia. N Engl J Med. 2000;343(26):1910-1916.
55. Chavez JC, Kharfan-Dabaja MA, Kim J, et al. Genomic aberrations deletion 11q and deletion 17p independently predict for worse progression-free and overall survival after allogeneic hematopoietic cell transplantation for chronic lymphocytic leukemia [published online ahead of print April 28, 2014]. Leuk Res. doi:10.1016/j.leukres.2014.04.006.
56. Grupp SA, Kalos M, Barrett D, et al. Chimeric antigen receptor-modified T cells for acute lymphoid leukemia. N Engl J Med. 2013;368(16):1509-1518.
57. Hawkins OE, Vangundy RS, Eckerd AM, et al. Identification of breast cancer peptide epitopes presented by HLA-A*0201. J Proteome Res. 2008;7(4):1445-1457.
58. Weidanz JA, Hawkins O, Verma B, Hildebrand WH. TCR-like biomolecules target peptide/MHC Class I complexes on the surface of infected and cancerous cells. Int Rev Immunol. 2011;30(5-6):328-340.

Supplemental data