Genome-wide association studies (GWAS) in chronic lymphocytic leukemia (CLL) have identified thirteen single nucleotide polymorphisms (SNPs) that are associated with the risk of developing CLL but do not affect the coding regions of genes. The functional targets of these SNPs remain largely unknown, although they are thought to serve as regulatory elements for nearby genes. We have previously published the results of a high-resolution integrated genomic analysis of 161 CLLs with matched normal DNAs, using Affymetrix 6.0 SNP arrays and Affymetrix U133 Plus 2.0 expression arrays run on the CLL lymphocytes. Here we used this dataset to investigate whether SNP genotype at loci implicated in CLL risk by GWAS was associated with altered gene expression in the CLL lymphocyte expression arrays. We investigated 19 SNPs: either the SNP described in the GWAS itself, if present on the Affymetrix 6.0 SNP array, or one or more proxy SNPs for those not present on the array, chosen for their high linkage disequilibrium with the GWAS SNP (r² > 0.7, usually > 0.9). Regions studied included 2q13 (1 SNP), 2q37.1 (1 SNP), 2q37.3 (1 SNP), 6p21.3 (1 SNP), 6p25.3 (2 SNPs), 8q24.2 (6 proxy SNPs), 11q24.1 (1 SNP), 15q21.3 (1 SNP), 15q23 (1 proxy SNP), 15q25.2 (1 SNP), 18q21.1 (1 proxy SNP), and 19q13.32 (2 proxy SNPs). We hypothesized that the genes most likely to be regulated by these loci would be located nearby, and therefore used the Kruskal-Wallis test to explore associations between SNP genotype and the expression of genes located within 2 Mb of the relevant SNP. The number of genes evaluated ranged from 11 to 22 depending on the locus. The analysis was performed independently for SNP genotypes derived from the tumor/lymphocyte samples (n=143) and from the normal/saliva samples (n=70–80).
Genotypes that were discordant between tumor and normal samples were manually reviewed and either reconciled or excluded in cases of poor quality, indeterminate genotype, or altered genomic copy number at the locus. Using the SNP genotypes from the tumor samples, we identified 13 genes whose expression was significantly associated with a risk SNP (p < 0.05). Using the SNP genotypes from the normal samples, we identified 15 genes by the same criterion. Eight SNPs were associated with a total of seven genes in both the tumor and normal analyses. The most significant associations were found between the risk allele of rs674313 on 6p21 and higher expression of HLA-DQA1 (p<0.0001), and between the risk allele of rs4802322 on 19q13 and higher expression of FKRP (p<0.0001), although FKRP did not show a wide range of expression. IRF4 expression on 6p25 was also significantly associated with rs872071 (p=0.01), as we and others have previously shown. MYC expression was associated with two of the proxy SNPs at 8q24, rs17762878 (p=0.03) and rs7823764 (p<0.04). Additional significant associations were seen for rs4777184 on chromosome 15 with TLE3 expression (p<0.02), for rs783540 on chromosome 15 with CPEB1 expression (p<0.01), and for rs305088 on chromosome 16 with COX4NB expression (p<0.04). The regulation of IRF4 and MYC by GWAS SNP alleles is unsurprising; current work is focused on validating the associations with the other genes in an extension cohort and exploring their possible functions in CLL.
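The association analysis described in this abstract (selecting genes within 2 Mb of a SNP, then comparing expression across genotype groups with the Kruskal-Wallis test) might be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, data layout, gene names, coordinates, and expression values are all hypothetical, and measuring the 2 Mb window from the gene boundaries is an assumption, since the abstract does not say how distance was defined. The p-value uses the standard chi-square approximation, written in closed form for the two or three genotype groups a biallelic SNP can produce.

```python
import math
from collections import defaultdict


def genes_in_window(snp_pos, genes, window=2_000_000):
    """Return names of genes lying within `window` bp of the SNP.

    `genes` maps a gene name to (start, end) on the SNP's chromosome.
    Assumption: distance is measured from the gene boundaries.
    """
    return [name for name, (start, end) in genes.items()
            if start - window <= snp_pos <= end + window]


def kruskal_wallis(groups):
    """Tie-corrected Kruskal-Wallis H and p-value for 2 or 3 groups."""
    groups = [g for g in groups if g]
    k = len(groups)
    if k not in (2, 3):
        raise ValueError("expected 2 or 3 non-empty genotype groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sq = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sq - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all expression values are identical")
    h = max(h / correction, 0.0)
    # Chi-square survival function with k - 1 degrees of freedom:
    # exp(-h/2) for df = 2, erfc(sqrt(h/2)) for df = 1.
    p = math.exp(-h / 2) if k == 3 else math.erfc(math.sqrt(h / 2))
    return h, p


def associate_snp_gene(genotypes, expression):
    """Group expression values by genotype call and test for a difference.

    `genotypes` maps sample -> genotype call; `expression` maps
    sample -> expression value for a single gene.
    """
    by_genotype = defaultdict(list)
    for sample, call in genotypes.items():
        if sample in expression:
            by_genotype[call].append(expression[sample])
    return kruskal_wallis(list(by_genotype.values()))


# Toy data only: nine hypothetical samples and two hypothetical genes.
toy_genes = {"GENE_A": (6_500_000, 6_600_000), "GENE_B": (8_000_000, 8_100_000)}
toy_calls = dict(zip("abcdefghi", ["AA"] * 3 + ["AB"] * 3 + ["BB"] * 3))
toy_expr = dict(zip("abcdefghi", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]))

for gene in genes_in_window(5_000_000, toy_genes):
    h_stat, p_value = associate_snp_gene(toy_calls, toy_expr)
    print(gene, round(h_stat, 3), round(p_value, 4))
```

In a real analysis each SNP would be tested against 11 to 22 nearby genes, so a nominal threshold of p < 0.05 is applied many times per locus; whether and how to correct for that is a design choice the sketch leaves open.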
No relevant conflicts of interest to declare.