Fetal hemoglobin (HbF) inhibits the polymerization of sickle hemoglobin, thereby modulating some of the subphenotypes of sickle cell anemia. HbF concentrations vary considerably among patients, and this variation is regulated as a multigenic trait; previously identified regulatory regions are linked to the β-globin gene-like cluster or lie within quantitative trait loci (QTL) on chromosomes 6q, 8q and Xp. We genotyped panels of haplotype-tagging SNPs (htSNPs) in the β-globin gene-like cluster and across the 8q and Xp QTL in two independent patient samples, and applied two independent analytical methods to study their association with HbF levels. The first sample included 327 patients (135 htSNPs) and the second included 987 patients (102 htSNPs). Genotyping was done with a custom 384-multiplex design on the Illumina platform. Single-SNP association was investigated by multiple linear regression with simultaneous adjustment for age, sex and the co-existence of α thalassemia; a permutation procedure was used to correct for multiple testing. The nonlinear regression method Random Forest was used for a joint analysis of the covariates and all SNPs. In the smaller patient sample, 7 SNPs in the TOX gene (thymus high mobility protein; 8q12.1) showed nominally significant association with HbF (p < 0.05), but none passed the multiple-testing correction. Another 11 SNPs in TOX showed significant association in the larger patient sample, and SNP rs7817609 passed the multiple-testing correction. This SNP in TOX was significantly associated with HbF under codominant and dominant models (empirical p-values 0.0154 and 0.0067, respectively), with the C allele associated with a high level of HbF expression; the HbF level in subjects with the CC and CG genotypes was on average 25% higher than in subjects with the GG genotype. Three SNPs within the β-globin gene-like cluster, including the Xmn I −158 C→T SNP 5′ to HBG2, also showed association with HbF.
This SNP was the second most important SNP for predicting HbF expression in the Random Forest analysis, indicating that it might interact non-linearly with other regulatory factors. Joint analysis of all SNPs and covariates showed that the most important variables for predicting HbF matched most of the SNPs identified by the single-SNP association analyses in both data sets. Random Forest analysis also identified SNPs in EGFL6, GPM6B and FIGF, all within the Xp22.2–p22.3 QTL, that have a strong effect on predicting HbF level. This suggests that genes within the Xp QTL might interact with other genes to regulate the expression of HbF. TOX belongs to one of the high mobility group (HMG) box protein families, whose members contain a single HMG box and bind with high sequence specificity to variants of the DNA sequence (A/T)(A/T)CAAAG, inducing a sharp bend in the DNA, altering local chromatin structure and modulating the formation of multi-protein regulatory complexes. Within 50 kb 5′ and 40 kb 3′ of HBG2 there were 50 matches to this binding sequence. TOX might therefore bind to the β-globin gene cluster and form multi-protein regulatory complexes with other proteins to regulate the expression of HbF. We have identified new candidate genes, or variants in linkage disequilibrium with candidate genes, that might play a role in HbF regulation, but further genetic and biological studies are required for validation.
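The count of binding-sequence matches around HBG2 can be illustrated with a simple pattern scan. The sketch below is a minimal illustration, not the procedure used in the study: the abstract does not say how the matches were counted, so scanning both strands is an assumption, and the function names and the short example sequence are hypothetical.

```python
import re

# Consensus quoted in the abstract: (A/T)(A/T)CAAAG, written as a regex.
MOTIF = re.compile(r"[AT][AT]CAAAG")
MOTIF_LENGTH = 7

# Translation table used to complement a DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_sites(seq):
    """Return sorted (start, strand) pairs for every motif match.

    Start positions are 0-based and always given on the forward strand.
    Scanning the reverse strand as well is an assumption made for this
    sketch; the abstract does not state which strands were searched.
    """
    seq = seq.upper()
    n = len(seq)
    sites = [(m.start(), "+") for m in MOTIF.finditer(seq)]
    # A match starting at position i on the reverse complement covers
    # forward-strand positions n - i - MOTIF_LENGTH .. n - i - 1.
    for m in MOTIF.finditer(reverse_complement(seq)):
        sites.append((n - m.start() - MOTIF_LENGTH, "-"))
    return sorted(sites)


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the β-globin cluster.
    example = "GGATCAAAGTTCTTTGAACC"
    for start, strand in find_motif_sites(example):
        print(start, strand)
```

Applied to a real genomic window, `len(find_motif_sites(window))` would give the number of candidate sites under these assumptions. A short degenerate motif is expected to occur many times by chance in a 90 kb region, so such a count indicates possible binding sites rather than demonstrated ones.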

Disclosures: NIH; Medicolegal cases; From textbooks.
