Background: Myelodysplastic syndromes (MDS) are heterogeneous, clonally derived bone marrow disorders characterized by ineffective hematopoiesis and a propensity to transform to acute myeloid leukemia. More than 15,000 new cases are identified yearly, and patients (pts) with MDS have a wide range of clinical manifestations and outcomes. Challenges in treating MDS include disease heterogeneity and the small number of effective treatments, particularly beyond first-line approaches and supportive care. Although mutational analysis of MDS provides prognostic information, the plethora of genetic events in a single case complicates the use of this information to guide clinical therapy. We hypothesized that these genetic events coalesce into a finite number of protein expression signatures and that these signatures would guide individualized therapy.

Methods: A custom Reverse Phase Protein Array (RPPA) with 378 samples (including replicates) from 123 newly diagnosed and 76 relapsed/refractory MDS pts, as well as 20 normal CD34+ bone marrow samples, was created and probed with 136 antibodies to determine relative protein expression. To assess the impact of source cell type on protein expression, 112 of the 378 samples (representing some replicates from 95 pts) had paired CD34+ and CD34+CD38- samples. Since proteins interact with each other and function within networks, proteins were first divided into 25 Protein Functional Groups (ProFnGp) based on their known functionality in the literature. Progeny clustering was then performed for each ProFnGp to determine the optimal number of protein clusters. Principal component analysis (PCA) was used to map global differences and similarities between protein clusters and normal CD34+ samples. Hierarchical clustering (HC) was performed on a compilation of all protein clusters in one binary matrix to identify recurrent protein expression signatures (PrSIG) that comprised similar combinations of protein constellations (PrCON). Associations between signature membership, clinical and molecular features, and outcome were assessed. Proteins that were universally over- or underexpressed, and proteins specific to a given signature, were identified.
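The binary-matrix step described above can be illustrated with a minimal sketch. It assumes each pt has already been assigned to one cluster per ProFnGp; the Progeny clustering step is not reproduced here. The group names, toy labels, Hamming distance, and average linkage are all assumptions made for illustration, since the abstract does not specify the distance metric or linkage used for HC.

```python
from itertools import combinations

def one_hot_membership(labels_by_group, clusters_per_group):
    """Build one binary row per pt: one block per functional group,
    with a single 1 marking the cluster that pt belongs to."""
    n_patients = len(next(iter(labels_by_group.values())))
    rows = [[] for _ in range(n_patients)]
    for group, labels in labels_by_group.items():
        k = clusters_per_group[group]
        for row, label in zip(rows, labels):
            block = [0] * k
            block[label] = 1
            row.extend(block)
    return rows

def hamming(a, b):
    """Number of positions at which two binary rows differ."""
    return sum(x != y for x, y in zip(a, b))

def agglomerate(rows, n_signatures):
    """Average-linkage agglomerative clustering (an assumed choice),
    merging the closest pair until n_signatures groups remain."""
    clusters = [[i] for i in range(len(rows))]

    def avg_dist(c1, c2):
        total = sum(hamming(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_signatures:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters

# Hypothetical toy input: 6 pts, 3 functional groups with 2, 3, and 2 clusters.
labels_by_group = {
    "apoptosis": [0, 0, 0, 1, 1, 1],
    "mTOR": [0, 0, 1, 2, 2, 2],
    "SMAD": [1, 1, 1, 0, 0, 0],
}
clusters_per_group = {"apoptosis": 2, "mTOR": 3, "SMAD": 2}

rows = one_hot_membership(labels_by_group, clusters_per_group)
signatures = agglomerate(rows, n_signatures=2)
print(signatures)
```

In this toy input every row contains exactly three 1s, one per functional group, which mirrors the structure of the full matrix in which each pt belongs to 25 of the 110 protein clusters.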

Results: Clustering of pts for each ProFnGp revealed distinct relative expression and activation states compared to normal CD34+ samples. For each ProFnGp, 2 to 6 distinct expression clusters were identified, providing 110 protein clusters for HC. Of the 25 ProFnGp, all had MDS-specific patterns and 19 had at least one cluster similar to normal CD34+ cells. HC revealed strong co-correlation between multiple groups of protein clusters from various ProFnGp and suggested 11 PrCON. Pts who expressed similar recurrent combinations of PrCON formed 11 PrSIG (Figure 1). Within PrSIG, no bias was observed in sample status (fresh or cryopreserved), source (peripheral blood or bone marrow), gender, or relapse status. Analysis of paired samples revealed that 84% of CD34+ samples fell in a different PrSIG from their corresponding CD34+CD38- samples, suggesting that cell type should be considered in future analyses. Structured cluster memberships were identified, suggesting ProFnGp targets. The distinct cluster identified in PrCON 4 x PrSIG 1 revealed associations with ProFnGp including the apoptosis, SMAD, PKC, mTOR, MEK, and Hippo pathways. Within this cluster, upregulation was identified in proteins including PKCα, PI3Kp110α, and SMAD6, and downregulation in SMAC, PKCδ, SMAD1, SMAD4, TSC2, and NF2, suggesting targets for directed combination therapy with agents such as selective PKCα and PI3K inhibitors or SMAC mimetics. Summation of expression for each protein across each signature revealed many proteins with either significantly higher or lower expression relative to CD34+ controls.
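The paired-sample comparison amounts to counting how often the two cell fractions from the same pt land in different signatures. A minimal sketch follows, assuming each pair is recorded as the PrSIG assigned to the CD34+ sample and the PrSIG assigned to the CD34+CD38- sample. The function name and the toy pairs are hypothetical and do not reproduce the study data.

```python
def discordant_fraction(pairs):
    """Fraction of paired samples whose two cell fractions fall in
    different signatures. Each pair is (CD34+ PrSIG, CD34+CD38- PrSIG)."""
    if not pairs:
        raise ValueError("no paired samples")
    discordant = sum(1 for cd34, cd34_cd38neg in pairs if cd34 != cd34_cd38neg)
    return discordant / len(pairs)

# Hypothetical toy assignments for five paired samples.
toy_pairs = [(1, 4), (2, 2), (3, 7), (5, 1), (6, 6)]
fraction = discordant_fraction(toy_pairs)
print(f"{fraction:.0%} of pairs fall in different signatures")
```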

Conclusions: A network-based analysis of protein expression levels classified MDS pts into a finite number of protein expression signatures based on recurrent protein constellations. Recognition of universally differentially expressed proteins, together with signature-specific proteins, suggests targets for personalized and directed combinatorial therapeutics.

Figure 1

HC based on binary ProFnGp cluster membership. Each vertical column represents one pt, who belongs to 25 of the 110 protein clusters (one per ProFnGp). Blue squares indicate positive cluster membership.



No relevant conflicts of interest to declare.
