In this issue of Blood, the study by Kammers et al1 identifies genetic variants associated with gene expression in 290 platelet and 185 induced pluripotent stem cell (iPSC) megakaryocyte (MK) samples to fill in missing links between genotype and platelet phenotypes.
Platelets are unique anucleate effector cells with roles in immunity, wound healing, hemostasis, and thrombosis. Heritable platelet traits, such as mean platelet volume, platelet count, and the propensity of platelets to aggregate and form clots, have been linked to disease.2,3 Platelet traits have also been associated with differences in platelet gene expression.4,5 Platelet gene expression varies between individuals, is heritable, and is highly repeatable (for example, stably retained over time).6 Genetic variants associated with gene expression, called expression quantitative trait loci (eQTLs) (see figure), account for a significant portion of this stable variation between individuals.6,7 A previous study by Simon et al of platelet gene expression in 154 European American and African American subjects identified eQTLs for 612 genes (called eGenes), many of which were unique to platelets.7
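To make the notion of an eQTL concrete, the sketch below regresses a gene's expression level on the number of copies of a variant allele each donor carries (0, 1, or 2). It is a minimal illustration on invented toy data, not the pipeline used in the study; the function name `eqtl_slope` and all values are hypothetical, and a real analysis would also adjust for covariates and correct for multiple testing.

```python
from math import sqrt


def eqtl_slope(dosages, expression):
    """Return (slope, Pearson r) for expression regressed on allele dosage.

    dosages: alternate-allele count per donor (0, 1, or 2).
    expression: expression level of one gene per donor, same order.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matched vectors with at least 3 donors")

    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))

    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or expression")

    slope = sxy / sxx          # change in expression per extra allele copy
    r = sxy / sqrt(sxx * syy)  # strength of the linear association
    return slope, r


# Hypothetical toy data: six donors, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, r = eqtl_slope(dosages, expression)
```

In this toy example, expression rises by about 1 unit per additional allele copy, which is the kind of variant-expression relationship an eQTL describes.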
However, eQTLs found for platelets alone may not account for all of the inherited variation in platelets. Anucleate platelets inherit much of their RNA and protein content from their parent MKs. From transcription in the MK, followed by loading into platelets and 7 to 10 days of circulation, differences in RNA levels inevitably accumulate (differential RNA sorting into platelets, turnover, uptake), whereas their protein products and functional consequences might remain. Therefore, MKs could harbor novel gene-variant associations that are lost at the RNA level in platelets yet are still responsible for differences in platelet proteins and function. Unfortunately, native platelet-forming MKs, which reside in the bone marrow and lung,8 are rare and difficult to obtain in sufficient numbers for eQTL analysis without invasive procedures. To address this gap, as outlined in the figure, Kammers et al generated iPSC-derived MKs from 185 African American and European American donors and compared their transcriptomes with those of platelets from 290 donors. Using RNA-seq on polyadenylated RNA (coding transcripts), the authors found that 91.3% of the genes expressed in platelets were expressed in iPSC-MKs, whereas only 53.1% of the iPSC-MK–expressed genes were expressed in platelets, with a modest Spearman correlation of 0.46 between the expression levels of shared genes.
Comparing RNA levels with DNA variants called by genome sequencing, the authors identified 1830 eQTLs in platelets. This replicated 85% of previously identified platelet eQTLs7 and added nearly 3 times as many new eQTLs. Also identified were 946 eQTLs for MKs. Of these, 323 matching genes were identified with significant eQTLs in both MKs and platelets, and 57 of these shared the same lead eQTL. This number is surprisingly small because the bulk of RNA in platelets presumably comes from their parent MKs. However, considering donor differences and that the directionality of effect was predominantly the same for those eQTLs that reached statistical significance in 1 tissue but not the other, the actual regulatory overlap is potentially larger.
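The distinction between a shared eGene and a shared lead eQTL can be illustrated with a small sketch. Each tissue is represented as a mapping from gene to its lead variant; a gene counts as shared if it has a significant eQTL in both tissues, and as sharing its lead eQTL only if the same variant is the lead in both. The gene and variant names and the function `compare_lead_eqtls` are hypothetical placeholders, not data or code from the study.

```python
def compare_lead_eqtls(platelet_leads, mk_leads):
    """Compare two gene -> lead-variant mappings.

    Returns (shared_genes, same_lead):
      shared_genes: genes with a lead eQTL in both tissues
      same_lead:    the subset whose lead variant is identical in both
    """
    shared_genes = sorted(set(platelet_leads) & set(mk_leads))
    same_lead = [g for g in shared_genes if platelet_leads[g] == mk_leads[g]]
    return shared_genes, same_lead


# Hypothetical toy mappings (gene -> lead variant identifier).
platelet_leads = {"GENE_A": "var1", "GENE_B": "var2", "GENE_C": "var3"}
mk_leads = {"GENE_B": "var2", "GENE_C": "var9", "GENE_D": "var4"}

shared_genes, same_lead = compare_lead_eqtls(platelet_leads, mk_leads)

# Fraction of shared genes whose lead eQTL is the same in both tissues.
fraction_same_lead = len(same_lead) / len(shared_genes)
```

Applied to the counts reported above, 57 of 323 shared genes having the same lead eQTL corresponds to roughly 18%.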
Variants coded within the RNA itself may affect transcription, structure, stability, localization, or the amino acid sequence of the protein. However, how do eQTLs found in noncoding regions distant from the gene body influence RNA expression? The authors found, for both platelets and MKs, an enrichment for lead eQTLs in regulatory regions previously identified in MKs that include long-range interactions such as DNase-sensitive enhancer elements. This points to eQTL-mediated transcriptional regulation initiated in MKs and carried over into platelets.
Genome-wide association studies (GWAS) have found DNA variants associated with thousands of diseases, including those in which platelets participate, such as cardiovascular disease.9,10 However, the relevance of these variants has been difficult to interpret. For one, DNA variants at neighboring genetic loci are coinherited (linkage disequilibrium), so the exact responsible variant among neighbors is difficult to define. In addition, between DNA at one end and disease outcome at the other is a chain of complex molecular, cellular, intercellular, and organismal interactions. Therefore, recent studies have emphasized linking DNA variants with more narrowly defined molecular and cellular intermediates, like gene expression, that in turn can ultimately be linked to disease. Databases of eQTLs found for many different tissues have helped link GWAS variants to gene expression, but the regulatory effect of eQTLs is often tissue specific; thus, for many variants linked with platelet traits, no eQTLs are found in current tissue databases. Consistent with this, the authors found a low overlap between platelet or MK lead eQTLs (the eQTL with the strongest association with a gene) and those previously found for 48 other tissues, suggesting platelet- and MK-specific regulation of gene expression.
To identify possible new links between DNA variants, gene expression, and platelet phenotypes, the authors compared their MK and platelet eQTLs with previously published GWAS of platelet traits. Of 1065 significant GWAS variants for platelet traits, 24 matched eQTLs in either platelets or MKs, and 6 matched in both. Additional eQTLs in linkage disequilibrium were also found for other GWAS single-nucleotide polymorphisms. The identification of these eQTLs provides a provocative intermediate link between DNA variants and GWAS phenotype (see figure panel B), an additional step toward understanding the functional consequences of a given genotype.
Some caveats should be considered when interpreting the study. In the iPSC-derived cell populations, ∼60% of cells were double positive for the MK markers CD41 and CD42a. Although there was significant overlap between iPSC-MK and platelet gene expression, substantial differences in gene expression will unquestionably remain between in vivo MKs and an impure population of in vitro–derived MKs. Because many eQTLs are tissue and context dependent, some of the eQTLs found for iPSC-MKs may be specific to in vitro cells. Nevertheless, the identified MK variants may prove useful when choosing which GWAS variants to rule in or out for follow-up studies to establish causation. Establishing causation with functional experimentation is an essential next step to deciphering gene-variant functions that will ultimately inform mechanisms of platelet function and forge new pathways toward disease prevention and treatment.
Conflict-of-interest disclosure: The authors declare no competing financial interests.