Cutaneous T-cell lymphoma (CTCL) is defined by infiltration of activated and malignant T cells in the skin. The clinical manifestations and prognosis in CTCL are highly variable. In this study, we hypothesized that gene expression analysis in lesional skin biopsies can improve understanding of the disease and its management. Based on 63 skin samples, we performed consensus clustering, revealing 3 patient clusters. Of these, 2 clusters tended to differentiate limited CTCL (stages IA and IB) from more extensive CTCL (stages IB and III). Stage IB patients appeared in both clusters, but those in the limited CTCL cluster were more responsive to treatment than those in the more extensive CTCL cluster. The third cluster was enriched in lymphocyte activation genes and was associated with a high proportion of tumor (stage IIB) lesions. Survival analysis revealed significant differences in event-free survival between clusters, with poorest survival seen in the activated lymphocyte cluster. Using supervised analysis, we further characterized genes significantly associated with lower-stage/treatment-responsive CTCL versus higher-stage/treatment-resistant CTCL. We conclude that transcriptional profiling of CTCL skin lesions reveals clinically relevant signatures, correlating with differences in survival and response to treatment. Additional prospective long-term studies to validate and refine these findings appear warranted.

Cutaneous T-cell lymphoma (CTCL) refers to a heterogeneous group of non-Hodgkin lymphomas that share the common feature of infiltration of the skin with malignant T cells.2  The most common form of CTCL is historically known as mycosis fungoides (MF), and it has a variable clinical course. While many patients have a relatively indolent course, others experience rapidly progressive and often lethal disease. Like other lymphomas, prognosis is linked, albeit imprecisely, to clinical stage at presentation. Patients with stage IA disease have normal life expectancies, while those in stage IB have median survivals of 12.8 years, and median survival of stage IIB disease is only 3.2 years.3  Other variants of CTCL are less common, and include diseases with more or less aggressive clinical courses.2,4  Sézary syndrome (SS) is the leukemic variant of CTCL with a 5-year survival of only 11%.5 

In early disease, activated reactive but benign T cells greatly outnumber malignant cells in skin, and the clinical appearance is that of a patch or a thin plaque of an inflammatory skin disease.6  The cytokine milieu of the involved skin is dominated by type 2 T-cell cytokines, and the infiltrating cells are typically positive for cutaneous lymphocyte antigen (CLA) and CCR4.7  As disease progresses, the ratio of malignant cells to reactive cells increases, and thick plaques and tumors are characterized by loss of epidermotropism, increase in dermal lymphocytic infiltrate, and size and atypia of cells.8  The etiology of CTCL remains obscure, and we have incomplete understanding concerning the initiation of malignant disease and the role of processes such as genomic instability or antigenic stimulation in disease progression. At this time, it is impossible to predict which patients with early-stage CTCL will progress, and which will continue to manifest indolent skin-limited disease.9 

Transcriptional analysis of cancer has proven to be a powerful and increasingly useful tool in biomedical research.10-12  These patterns of gene expression have revealed unappreciated complexity in histologically homogeneous diseases like large-cell B-cell lymphoma. A goal of many studies is to link gene expression patterns revealed by transcriptional profiling to both understanding pathophysiology of disease and predicting prognosis and responsiveness to therapy.13-15  These approaches have been applied to CTCL in a limited way to characterize leukemic cells in SS,16  to distinguish MF from benign dermatoses,17  and to predict responsiveness to interferon.18 

In the present study, we use transcriptional analysis to gain insight into the pathophysiologic and prognostic features of CTCL. We evaluated 62 patients with various stages of CTCL for whom detailed clinical history was available. Lesional biopsies were analyzed via transcriptional profiling using oligonucleotide microarrays (Affymetrix HG-U133 chips; Santa Clara, CA). We present a robust unsupervised analysis of these samples with clustering into 3 groups, and explore the molecular pathways implied by the resultant clustering. We purposefully sought a rigorous assessment of optimal cluster number in order to better appreciate categories of disease process. We used supervised analysis to determine signatures of early- and late-stage CTCL and to evaluate responsiveness to therapy and clinical course. Taken together, these studies shed light on functional pathways in CTCL and suggest an approach to further predictive studies.

Samples

All patients were enrolled in a study protocol approved by the Dana-Farber Cancer Institute's Institutional Review Board. Informed consent was obtained in accordance with the Declaration of Helsinki. Patients were recruited from the Cutaneous Lymphoma Clinic at the Dana Farber Cancer Institute (DFCI)/Brigham and Women's Hospital (BWH). One sample was obtained through collaboration with MD Anderson. A total of 63 6-mm punch biopsies from lesional skin were collected from 62 patients between January 26, 2003, and June 1, 2005. One patient with patch/plaque disease was biopsied on 2 separate occasions from similar lesions at different body sites in order to provide methodologic verification through consistent clustering patterns. Patients enrolled were at various stages of disease, including 13 patients with limited patches and plaques covering less than 10% of body surface area (stage IA), 29 with generalized patches and plaques covering more than 10% of body surface area (stage IB), 8 with tumors (stage IIB), and 12 with generalized erythroderma (stage III). One-third of patients presented for diagnostic biopsy and were treatment naive, whereas the remainder had already started treatments with various therapies. Patients were diagnosed and staged based on clinical history and physical exam, biopsy findings, flow cytometry, and computed tomography (CT) studies. An extensive chart review was conducted to collect information on clinical parameters and outcomes. Data were stored in a study database. Mean age of patients with CTCL in this study was 60 years (range, 25-90 years). A total of 60% of patients were male.

Target preparation

Skin biopsies of CTCL skin lesions taken with a 6-mm punch were obtained and immediately snap-frozen in liquid nitrogen. Tissue was powdered in liquid nitrogen (Cryo-Press; Microtec Co, Chiba, Japan), and total RNA was extracted using Trizol (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. RNA was then labeled using the standard Affymetrix in vitro transcription labeling protocol, and 10 μg cRNA was hybridized to Human Genome U133A GeneChip oligonucleotide arrays, which consist of 22 283 probe sets representing approximately 13 500 named genes.

Data analysis and filtering

Gene expression data were processed using Robust Multi-Chip Average (RMA)19  as implemented in the R statistical package provided by Bioconductor (Seattle, WA).20,21  RMA data processing includes background correction, quantile normalization, and expression summarization.

Genes were filtered in order to eliminate those with cross-sample variation using a variation filter22  estimated on 14 HeLa cell lines and 5 Strategene samples (Stratagene, La Jolla, CA; Figure 1). We used this filtering to define 2 gene sets, 1 with 6997 genes (α = 0.999) and another with 4204 genes (α = 0.9999). Unsupervised analyses used the 4204-gene set, reflecting the more stringent statistical criteria required when samples and genes are allowed to cluster naturally. Supervised analyses used the 6997-gene set.

Figure 1

Gene filtering. A total of 22 283 probe sets were filtered using a noise envelope to eliminate genes with little cross-chip variability. The noise envelope was estimated on 14 HeLa cell lines and 5 Strategene samples. Two filters were used, resulting in gene sets of 6997 (α = 0.999) and 4204 (α = 0.9999) probe sets. Unsupervised analyses were done using the 4204-gene filter. Supervised analyses were done using the 6997-gene filter.

Figure 1

Gene filtering. A total of 22 283 probe sets were filtered using a noise envelope to eliminate genes with little cross-chip variability. The noise envelope was estimated on 14 HeLa cell lines and 5 Strategene samples. Two filters were used, resulting in gene sets of 6997 (α = 0.999) and 4204 (α = 0.9999) probe sets. Unsupervised analyses were done using the 4204-gene filter. Supervised analyses were done using the 6997-gene filter.

Close modal

Unsupervised analysis using consensus clustering

In this analysis, 2 unsupervised clustering techniques were used: hierarchical clustering (HC)23  and self-organizing maps (SOMs).24  The 2 clustering techniques were compared using consensus clustering25  in order to assess the stability of the identified clusters and to evaluate sensitivity to sampling variability. Confusion matrices for gene as well as sample assignment were generated to assess agreement between clusters produced by the different clustering algorithms.

All heat maps for visualization were generated using DChip.26  Fold changes were calculated based on average expression for class versus nonclass.

Supervised analysis

Statistical tests to compare predefined classes of samples based on phenotypic differences were performed using GenePattern (Broad Institute, MIT, Cambridge, MA, http://www.broad.mit.edu/cancer/software/genepattern/).27,28  In particular, we used the Comparative Marker Selection suite, which computes statistical significance for each gene and corrects for multiple hypothesis testing (MHT).29  A total of 10 000 permutations were performed for the estimation of empirical p-values, which were then corrected for MHT by the false discovery rate (FDR) procedure.30  Genes with an FDR value of less than or equal to .05 were considered significant in our study.

Gene pathway analysis

A total of 2 different algorithms were used to evaluate contribution of gene pathways to the transcriptional differentiation of samples.

GO analysis.

Gene Ontology (GO) is a collaborative and comprehensive gene annotation resource compiled by the Gene Ontology Consortium.31  GO annotations were obtained from Affymetrix and analyzed using GeneMerge.32  GeneMerge uses defined class gene markers and calculates enrichment for GO pathways.

GSEA.

Gene set enrichment analysis (GSEA) was performed by generating a ranked list of genes differentiating defined classes and comparing this ranked list with specified gene sets.33  Genes were ranked by signal to noise ratio. An enrichment score was generated for each gene set. Statistical significance was then estimated using phenotype-based permutations, with the attained P values adjusted for MHT. A total of 539 gene sets generated from several independent sources were used in our analysis. Among the resources used to compile these gene sets were (1) Biocarta (Biocarta, San Diego, CA), with 274 pathways involved in adhesion, apoptosis, cell activation, cell-cycle regulation, cell signaling, cytokines and chemokines, developmental biology, hematopoiesis, immunology, metabolism, and neurosciences; (2) GenMAPP (Gene MicroArray Pathway Profiler; www.genmapp.org), a resource for metabolic pathways, cell signaling processes, and physiologic functions, with 67 pathways; (3) Signal Transduction Knowledge Environment (STKE), a branch of the journal Science that serves as an online resource for signal transduction research and includes 29 signal transduction pathways; (4) SigmaAldrich (Sigma-Aldrich, St Louis, MO), with 12 pathways, involved in the cell cycle, mitogen-activated protein (MAP) kinase signaling, and apoptosis; (5) GO, from which were culled 23 pathways; and (6) the GEArray pathway-specific arrays made by SuperArray (SuperArray Bioscience, Frederick, MD), from which were derived 3 pathways involved in calcineurin-NFAT signaling, breast cancer and estrogen signaling, and bcl2 regulation. A total of 63 manually curated pathways involved in mitochondrial function and metabolism were also included in the analysis. An additional 59 pathways were generated from the current literature.

Identification of the consensus number of clusters

We used HC and SOM algorithms to differentiate biologically meaningful subsets of CTCL with similar transcription profiles. Based on consensus analysis, we determined that the most robust clustering structure included 3 distinct clusters (Figure 2). This clustering showed a high level of agreement between clusters produced by HC and SOMs, with approximately 89% of samples and more than 90% of genes assigned to the same clusters (Figure 3).

Figure 2

Consensus clustering. Consensus clustering indicates that 3 clusters is the greatest number of stable clusters across clustering techniques. Consensus heat maps produced by HC and SOMs for cluster number k = 3 are shown. The consensus heat map is a visual representation of the consensus matrix, which is a matrix of sample pairs. Each matrix entry is the proportion of times the pair's samples are clustered together across resampling iterations. In the heat map, a value of 1 (corresponding to 2 samples that are always clustered together) is represented by the color red, and a value of 0 (corresponding to 2 samples that are never clustered together) is represented by the color white.

Figure 2

Consensus clustering. Consensus clustering indicates that 3 clusters is the greatest number of stable clusters across clustering techniques. Consensus heat maps produced by HC and SOMs for cluster number k = 3 are shown. The consensus heat map is a visual representation of the consensus matrix, which is a matrix of sample pairs. Each matrix entry is the proportion of times the pair's samples are clustered together across resampling iterations. In the heat map, a value of 1 (corresponding to 2 samples that are always clustered together) is represented by the color red, and a value of 0 (corresponding to 2 samples that are never clustered together) is represented by the color white.

Close modal
Figure 3

Consensus clustering overlap matrices for the 3-cluster model.(A) Sample overlap comparing sample assignment between HC and SOMs. Entries are the number of samples in overlapping clusters. (B) Marker overlap between HC and SOMs. Gene markers for k = 3 clusters using HC and SOMs were evaluated to assess the number of markers held in common between clustering techniques. Entries are presented as: residual [no. observed | no. expected], where residual = (observed −expected) / sqrt (expected).

Figure 3

Consensus clustering overlap matrices for the 3-cluster model.(A) Sample overlap comparing sample assignment between HC and SOMs. Entries are the number of samples in overlapping clusters. (B) Marker overlap between HC and SOMs. Gene markers for k = 3 clusters using HC and SOMs were evaluated to assess the number of markers held in common between clustering techniques. Entries are presented as: residual [no. observed | no. expected], where residual = (observed −expected) / sqrt (expected).

Close modal

One patient was biopsied at 2 different time points and 2 different body sites to verify reproducibility of the clustering. Clustering performed with each sample separately resulted in the same cluster assignment (cluster 1) both times.

Characterization of the consensus clusters

We next examined the gene sets involved in the 3-cluster structure produced by consensus clustering. Using standard differential analysis, we identified 593 genes with a P value less than or equal to .025 and fold change greater than or equal to 2 (Figure 4). We then further characterized the clusters identified by unsupervised analysis based on sample phenotype as well as gene identity.

Figure 4

Hierarchical clustering heat map. This heat map shows partitioning of 63 samples and the top 593 genes with P values less than or equal to .025 and fold change greater than or equal to 2. The chart shows the top 15 GO terms for each cluster, ranked by FDR. Each GO term is accompanied by the number of up-regulated genes/the total number of genes included in the test group. All included GO terms have P values less than or equal to .05 and FDR less than or equal to .05.

Figure 4

Hierarchical clustering heat map. This heat map shows partitioning of 63 samples and the top 593 genes with P values less than or equal to .025 and fold change greater than or equal to 2. The chart shows the top 15 GO terms for each cluster, ranked by FDR. Each GO term is accompanied by the number of up-regulated genes/the total number of genes included in the test group. All included GO terms have P values less than or equal to .05 and FDR less than or equal to .05.

Close modal

Phenotype correlation.

Sample clustering appeared to correlate with clinical stage (Figure 5). Cluster 1 contained a mix of patients with stages IA, IB, IIB, and III disease. This cluster included 75% (6 of 8) of all stage IIB (tumor stage) samples. Cluster 2 contained primarily stage IA and IB disease, with only 2 patients with higher-stage disease. Cluster 3 contained only 1 stage IA patient and a relatively large proportion of stage III disease (58%, or 7 of 12 stage III patients).

Figure 5

Cluster composition based on clinical stage. Entries are presented as: residual [no. observed | no. expected], where residual = (observed −expected) /sqrt (expected). Yellow indicates relative deficiency of cluster with respect to the clinical stage, and green indicates enrichment of cluster with respect to the clinical stage, comparing observed with expected composition. Sample phenotype correlation showed cluster 1 to be composed of a roughly equivalent proportion of clinical stages. However, this breakdown was statistically enriched with stage IIB, or tumor stage, disease and lacking in stage IB disease. Cluster 2 was enriched with patch/plaque stage disease, with only 2 samples categorized as stage IIB and III. Cluster 3 showed depletion in stage IA and IIB disease and enrichment in stage IB and III disease. These findings were statistically significant, with a P value of .004.

Figure 5

Cluster composition based on clinical stage. Entries are presented as: residual [no. observed | no. expected], where residual = (observed −expected) /sqrt (expected). Yellow indicates relative deficiency of cluster with respect to the clinical stage, and green indicates enrichment of cluster with respect to the clinical stage, comparing observed with expected composition. Sample phenotype correlation showed cluster 1 to be composed of a roughly equivalent proportion of clinical stages. However, this breakdown was statistically enriched with stage IIB, or tumor stage, disease and lacking in stage IB disease. Cluster 2 was enriched with patch/plaque stage disease, with only 2 samples categorized as stage IIB and III. Cluster 3 showed depletion in stage IA and IIB disease and enrichment in stage IB and III disease. These findings were statistically significant, with a P value of .004.

Close modal

Among patients with CD30+ disease, all 5 were in cluster 1. Interestingly, none of the 5 patients with CD8+ disease were found in cluster 1. Of the patients with CD8+ disease, 2 were distributed to cluster 2 and 3 to cluster 3. Folliculotropic disease was found to be somewhat more predominant in cluster 3, with 2 of 7 patients found in cluster 1, 1 of 7 patients in cluster 2, and 4 of 7 patients in cluster 3.

The 3 clusters were also characterized by differences in survival. Cluster 1 included 4 deaths: 3 stage IIB patients and 1 stage IB patient. Deaths occurred at 6 months after biopsy for the stage IB patient, and at 3 months, 3 months, and 4 months, respectively, for the 3 stage IIB patients. In all cases, death occurred after developing metastatic and/or leukemic involvement and failing multiple treatment regimens, including systemic therapies. Cluster 3 included 2 deaths at 5 and 9 months, respectively, both of whom were patients with SS. In comparison, the single death in cluster 2 was due to an unrelated medical condition, occurring 20 months after biopsy.

We first examined the impact of prebiopsy treatments on the clusters, and did not find a significant correlation between particular treatment regimens and clusters, suggesting that the clustering patterns were not driven by treatment signatures (Table 1). Examination of treatment response showed that clusters 1 and 3 are associated with more aggressive, less treatment-responsive disease. Recalcitrant disease across stages was more commonly found in clusters 1 and 3. In particular, 2 patients assigned to cluster 1 have since undergone bone marrow transplantation as a last resort to control their disease, and 1 patient had progressed from stage IA disease to large-cell transformation after the biopsy was taken, with subsequent persistent disease despite several treatment regimens, including denileukin diftitox (ONTAK) total skin electron beam therapy, and a suberoylanilide hydroxamic acid (SAHA) trial. By contrast, patients in cluster 2 generally exhibited extremely good response to treatment across all stages of disease. All but 3 patients are currently in remission or under good control on topical therapies, including steroids, tacrolimus, bexarotene, and/or phototherapy, administered as PUVA, narrow-band UVB, or natural sunlight. The only stage IIB patient went into remission on PUVA alone. Among stage IB patients, those who were more difficult to treat, characterized by longer time to remission, failure to respond to standard treatment regimens, and use of therapies more commonly reserved for more advanced disease, tended to fall into clusters 1 and 3. Stage IB patients whose disease was more amenable to treatment tended to fall into cluster 2. Among stage IB patients in cluster 2, 9 of 11 were maintained on intermittent topical therapies, including topical steroids, bexarotene gel, and/or phototherapy, without progression of disease. The stage IB patients not in cluster 2 included 1 patient in cluster 1 who developed metastatic disease and died despite the use of ONTAK, oral bexarotene, total skin electron beam radiation, extracorporeal photopheresis, interferon, and single- and multiple-agent chemotherapy. A total of 7 of the 18 stage IB patients in clusters 1 and 3 were placed on systemic therapies, including oral bexarotene, prednisone, extracorporeal photopheresis, interferon, ONTAK, methotrexate, chemotherapy, and SAHA.

Table 1

Cluster composition compared with therapeutic interventions by the time of biopsy

TherapyPatients, %
P
Cluster 1Cluster 2Cluster 3
No therapy 33 47 23 .400 
Topical corticosteroids 62 53 73 .719 
Topical tacrolimus 14 16 .550 
Topical nitrogen mustard 10 11 .672 
Topical bexarotene 14 .653 
PUVA 29 26 23 .930 
UVB (narrow or broadband) 10 16 18 .746 
Electron beam radiation (total skin or local) 14 .653 
Oral bexarotene 29 11 23 .449 
ONTAK (denileukin diftitox) 24 14 .304 
Extracorporeal photopheresis 14 11 18 .814 
Interferon 19 23 .351 
Oral corticosteroids 29 23 .223 
Chemotherapy (single or multiple agent) 19 11 23 .642 
Suberoylanilide hydroxamic acid (SAHA) .556 
TherapyPatients, %
P
Cluster 1Cluster 2Cluster 3
No therapy 33 47 23 .400 
Topical corticosteroids 62 53 73 .719 
Topical tacrolimus 14 16 .550 
Topical nitrogen mustard 10 11 .672 
Topical bexarotene 14 .653 
PUVA 29 26 23 .930 
UVB (narrow or broadband) 10 16 18 .746 
Electron beam radiation (total skin or local) 14 .653 
Oral bexarotene 29 11 23 .449 
ONTAK (denileukin diftitox) 24 14 .304 
Extracorporeal photopheresis 14 11 18 .814 
Interferon 19 23 .351 
Oral corticosteroids 29 23 .223 
Chemotherapy (single or multiple agent) 19 11 23 .642 
Suberoylanilide hydroxamic acid (SAHA) .556 

Table shows percentage of patients in each cluster who had been treated with the given therapy by the time of biopsy. Clustering did not correlate with particular therapies at the time of biopsy. The chi-squared distribution was used to calculate the P value.

Although differences in survival among clusters did not reach significance, differences in event-free survival, such as disease progression or large-cell transformation, did. Event-free survival curves are shown in Figure 6.

Figure 6

Event-free Kaplan-Meier curves. Survival was evaluated using Kaplan-Meier analysis. Events included disease progression, large-cell transformation, and death. There were 10 events, including 7 deaths, 1 large-cell transformation after biopsy, and 3 patients who progressed despite systemic treatments. The log-rank test was used to determine significance of differences between survival curves. Event-free survival is shown (A) by clinical stage (P < .001); (B) by hierarchical cluster, using a 3-cluster partition (P = .028); and (C) by hierarchical cluster, using a 2-cluster partition (P = .02).

Figure 6

Event-free Kaplan-Meier curves. Survival was evaluated using Kaplan-Meier analysis. Events included disease progression, large-cell transformation, and death. There were 10 events, including 7 deaths, 1 large-cell transformation after biopsy, and 3 patients who progressed despite systemic treatments. The log-rank test was used to determine significance of differences between survival curves. Event-free survival is shown (A) by clinical stage (P < .001); (B) by hierarchical cluster, using a 3-cluster partition (P = .028); and (C) by hierarchical cluster, using a 2-cluster partition (P = .02).

Close modal

Genes and gene pathways implicated in differentiating CTCL subgroups.

We identified dominant genes differentiating the 3 CTCL clusters by applying the t statistic and determining fold change.

Cluster 1, composed of a mix of clinical stages, showed significant up-regulation of many genes involved in immune response and T-cell activation (Table 2). Notable among these are all 3 components of the IL-2 receptor, lymphocyte-specific protein tyrosine kinase (LCK), PIK3CD, T-cell receptor α and β chains, CD3, ζ-chain–associated protein kinase (ZAP70), FYN-binding protein (FYB), T-cell receptor–interacting molecule (TCRIM), CD69, and linker for activation of T cells. SH2D1A, along with SLAMF1 (fold change, 2.99; P = .003), has been implicated in TCR activation; deficiency has been associated with the development of fatal X-linked lymphoproliferative disorder after Epstein-Barr virus (EBV) infection.34  CD8+ T-cell markers, including granulysin, were also up-regulated in Cluster 1.

Table 2

Cluster 1 selected up-regulated genes

Gene symbolGene nameFold changeP
IGLC1; IGLC2 Immunoglobulin lambda constant and variable 8.63 .035 
IL26 Interleukin-26 8.44 .043 
IGHM Immunoglobulin heavy constant mu 7.16 .025 
POU2AF1 POU domain, class 2, associating factor 1 5.93 .016 
TNFRSF17 Tumor necrosis factor receptor superfamily, member 17 5.78 .040 
T3JAM TRAF3-interacting Jun N-terminal kinase (JNK)–activating modulator 4.73 < .001 
LEF1 Lymphoid enhancer-binding factor 1 4.66 .004 
SH2D1A SH2 domain protein 1A, Duncan disease 4.61 .006 
GNLY Granulysin 4.56 .041 
PTPN7 Protein tyrosine phosphatase, nonreceptor type 7 4.45 < .001 
FYB FYN binding protein (FYB-120/130) 4.44 < .001 
TNFSF14 Tumor necrosis factor (ligand) superfamily, member 14 4.16 < .001 
LCK Lymphocyte-specific protein tyrosine kinase 4.11 < .001 
RAC2 Ras-related C3 botulinum toxin substrate 2 4.10 < .001 
IL21R Interleukin-21 receptor 4.09 < .001 
TCRIM T-cell receptor interacting molecule 4.01 < .001 
ZAP70 ζ-chain (TCR)–associated protein kinase 70 kDa 3.89 < .001 
TNFSF4 Tumor necrosis factor (ligand) superfamily, member 4 3.86 .010 
CCR8 Chemokine (C-C motif) receptor 8 3.65 .031 
FUT7 Fucosyltransferase 7 (alpha (1,3) fucosyltransferase) 3.64 .007 
SELL Selectin L (lymphocyte adhesion molecule 1) 3.64 .012 
CD3G CD3G antigen, gamma polypeptide (TiT3 complex) 3.64 < .001 
CCL4 Chemokine (C-C motif) ligand 4 3.62 .042 
ITK IL-2–inducible T-cell kinase 3.61 < .001 
CXCL13 Chemokine (C-X-C motif) ligand 13 (B-cell chemoattractant) 3.60 .016 
IL2RA Interleukin-2 receptor α 3.49 .015 
TNIK TRAF2 and NCK interacting kinase 3.40 < .001 
IL2RB Interleukin-2 receptor β 3.29 < .001 
TNFRSF7 Tumor necrosis factor receptor superfamily, member 7 3.23 < .001 
TRBC1 T-cell receptor β constant 1 3.20 < .001 
LAT Linker for activation of T cells 3.16 < .001 
ST8SIA1 ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1 3.01 .006 
IL2RG Interleukin-2 receptor γ (severe combined immunodeficiency) 3.01 < .001 
LILRB4 Leukocyte immunoglobulin-like receptor, subfamily B, member 4 2.95 .007 
CCR4 Chemokine (C-C motif) receptor 4 2.91 .003 
ITGAL Integrin αL 2.91 < .001 
CD3E CD3E antigen ϵ polypeptide (TiT3 complex) 2.89 < .001 
CD3Z CD3Z antigen ζ polypeptide (TiT3 complex) 2.88 < .001 
PRF1 Perforin 1 (pore-forming protein) 2.86 .009 
CCR5 Chemokine (C-C motif) receptor 5 2.84 .008 
LTB Lymphotoxin β (TNF superfamily, member 3) 2.82 .001 
CCR7 Chemokine (C-C motif) receptor 7 2.72 < .001 
BIRC3 Baculoviral IAP repeat–containing 3 2.71 .005 
MAP4K1 Mitogen-activated protein kinase 1 2.68 < .001 
IL27RA Interleukin-27 receptor α 2.67 < .001 
CD69 CD69 antigen (p60, early T-cell activation antigen) 2.67 .001 
IFNG Interferonγ 2.66 .009 
CXCR4 Chemokine (C-X-C motif) receptor 4 2.66 < .001 
CD3D CD3D antigen δ polypeptide (TiT3 complex) 2.58 < .001 
ST8SIA4 ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 4 2.57 .001 
MYB v-myb myeloblastosis viral oncogene homolog (avian) 2.55 .006 
ITGB7 Integrin β7 2.50 .002 
LPXN Leupaxin 2.40 < .001 
PTPRC Protein tyrosine phosphatase receptor type C 2.27 < .001 
TRA@ T-cell receptor α locus 2.25 < .001 
EVI2B Ecotropic viral integration site 2B 2.19 < .001 
TNF Tumor necrosis factor (TNF superfamily, member 2) 1.90 < .001 
TNFAIP8 Tumor necrosis factorα–induced protein 8 1.87 < .001 
WIF1 WNT inhibitory factor 1 4.93 .003 
IL1F7 Interleukin-1 family, member 7 (ζ) 4.17 < .001 
C1orf68 Chromosome 1 open reading frame 68 3.71 < .001 
PSORS1C2 Psoriasis susceptibility 1 candidate 2 3.06 < .001 
Gene symbolGene nameFold changeP
IGLC1; IGLC2 Immunoglobulin lambda constant and variable 8.63 .035 
IL26 Interleukin-26 8.44 .043 
IGHM Immunoglobulin heavy constant mu 7.16 .025 
POU2AF1 POU domain, class 2, associating factor 1 5.93 .016 
TNFRSF17 Tumor necrosis factor receptor superfamily, member 17 5.78 .040 
T3JAM TRAF3-interacting Jun N-terminal kinase (JNK)–activating modulator 4.73 < .001 
LEF1 Lymphoid enhancer-binding factor 1 4.66 .004 
SH2D1A SH2 domain protein 1A, Duncan disease 4.61 .006 
GNLY Granulysin 4.56 .041 
PTPN7 Protein tyrosine phosphatase, nonreceptor type 7 4.45 < .001 
FYB FYN binding protein (FYB-120/130) 4.44 < .001 
TNFSF14 Tumor necrosis factor (ligand) superfamily, member 14 4.16 < .001 
LCK Lymphocyte-specific protein tyrosine kinase 4.11 < .001 
RAC2 Ras-related C3 botulinum toxin substrate 2 4.10 < .001 
IL21R Interleukin-21 receptor 4.09 < .001 
TCRIM T-cell receptor interacting molecule 4.01 < .001 
ZAP70 ζ-chain (TCR)–associated protein kinase 70 kDa 3.89 < .001 
TNFSF4 Tumor necrosis factor (ligand) superfamily, member 4 3.86 .010 
CCR8 Chemokine (C-C motif) receptor 8 3.65 .031 
FUT7 Fucosyltransferase 7 (alpha (1,3) fucosyltransferase) 3.64 .007 
SELL Selectin L (lymphocyte adhesion molecule 1) 3.64 .012 
CD3G CD3G antigen, gamma polypeptide (TiT3 complex) 3.64 < .001 
CCL4 Chemokine (C-C motif) ligand 4 3.62 .042 
ITK IL-2–inducible T-cell kinase 3.61 < .001 
CXCL13 Chemokine (C-X-C motif) ligand 13 (B-cell chemoattractant) 3.60 .016 
IL2RA Interleukin-2 receptor α 3.49 .015 
TNIK TRAF2 and NCK interacting kinase 3.40 < .001 
IL2RB Interleukin-2 receptor β 3.29 < .001 
TNFRSF7 Tumor necrosis factor receptor superfamily, member 7 3.23 < .001 
TRBC1 T-cell receptor β constant 1 3.20 < .001 
LAT Linker for activation of T cells 3.16 < .001 
ST8SIA1 ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1 3.01 .006 
IL2RG Interleukin-2 receptor γ (severe combined immunodeficiency) 3.01 < .001 
LILRB4 Leukocyte immunoglobulin-like receptor, subfamily B, member 4 2.95 .007 
CCR4 Chemokine (C-C motif) receptor 4 2.91 .003 
ITGAL Integrin αL 2.91 < .001 
CD3E CD3E antigen ϵ polypeptide (TiT3 complex) 2.89 < .001 
CD3Z CD3Z antigen ζ polypeptide (TiT3 complex) 2.88 < .001 
PRF1 Perforin 1 (pore-forming protein) 2.86 .009 
CCR5 Chemokine (C-C motif) receptor 5 2.84 .008 
LTB Lymphotoxin β (TNF superfamily, member 3) 2.82 .001 
CCR7 Chemokine (C-C motif) receptor 7 2.72 < .001 
BIRC3 Baculoviral IAP repeat–containing 3 2.71 .005 
MAP4K1 Mitogen-activated protein kinase 1 2.68 < .001 
IL27RA Interleukin-27 receptor α 2.67 < .001 
CD69 CD69 antigen (p60, early T-cell activation antigen) 2.67 .001 
IFNG Interferonγ 2.66 .009 
CXCR4 Chemokine (C-X-C motif) receptor 4 2.66 < .001 
CD3D CD3D antigen δ polypeptide (TiT3 complex) 2.58 < .001 
ST8SIA4 ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 4 2.57 .001 
MYB v-myb myeloblastosis viral oncogene homolog (avian) 2.55 .006 
ITGB7 Integrin β7 2.50 .002 
LPXN Leupaxin 2.40 < .001 
PTPRC Protein tyrosine phosphatase receptor type C 2.27 < .001 
TRA@ T-cell receptor α locus 2.25 < .001 
EVI2B Ecotropic viral integration site 2B 2.19 < .001 
TNF Tumor necrosis factor (TNF superfamily, member 2) 1.90 < .001 
TNFAIP8 Tumor necrosis factorα–induced protein 8 1.87 < .001 
WIF1 WNT inhibitory factor 1 4.93 .003 
IL1F7 Interleukin-1 family, member 7 (ζ) 4.17 < .001 
C1orf68 Chromosome 1 open reading frame 68 3.71 < .001 
PSORS1C2 Psoriasis susceptibility 1 candidate 2 3.06 < .001 

LEF1 and TCF7, both of which were up-regulated in cluster 1, are downstream members of the WNT signaling pathway. A recent study has shown that several isoforms of LEF1 and TCF7 are up-regulated in naive CD8+ T cells, and down-regulated by TCR activation.35  Both LEF1 and TCF7 have been shown to be expressed in lymphocytic leukemias, including a subset of peripheral T-cell lymphomas and T-cell acute lymphocytic leukemia (T-ALL).36,37  Based on activation markers, studies suggest that LEF1 and TCF7 are up-regulated in the setting of T-helper 1 (Th1) as compared with Th2 activation.37 

Also up-regulated in cluster 1 were several central memory T-cell markers and genes involved in lymphocyte homing: SELL (L-selectin), CCR7, and FUT7 (fucosyltransferase 7), an important regulator of selectin ligands, including CLA.38,39 

Tumor necrosis factor (TNF) pathway genes, such as TNFRSF17, TNFRSF14, TNFSF4, TNFRSF7, LTB(lymphotoxin β), TNF, TRAF1, T3JAM, and TNFAIP8, are also up-regulated in this cluster. TNFRSF14, also known as herpesvirus entry mediator (HVEM), has been shown to interact with TNFSF14(also known as LIGHT) and LTA (lymphotoxin A) resulting in costimulation of T cells.40,41 BIRC1, BIRC3, and BIRC5 were also significantly up-regulated in this cluster. BIRC3 is a member of the inhibitor of apoptosis (IAP) gene family, and inhibits apoptosis by binding to TNF receptor–associated factors TRAF1 and TRAF2.42  The previous work of Tracey et al17  also emphasized TNF antiapoptotic pathway activation in MF, consistent with these results.

Interestingly, several B-cell–related genes were found to be up-regulated in this lymphocyte activation cluster. Immunoglobulin genes are among the most up-regulated by fold change, although the significance of this result (P ≤ .05) is lower than that of many other genes. POU2AF1 has been implicated in CLL.43 

Cluster 2, consisting of mostly stage IA and IB patients, showed significant up-regulation of several genes involved in keratinocyte and epidermal differentiation and proliferation. Selected genes are shown in Table 3. Among the genes up-regulated in this cluster were keratinocyte-related genes, including loricrin (LOR), late cornified envelope 2B (LCE2B), and keratin 2A (KRT2A). Members of the WNT signaling pathway were also up-regulated, including catenin β-interacting protein 1 (CTNNBIP1), transcription factor 7-like 2 (TCF7L2), and transcription factor 7-like 1 (TCF7L1).

Table 3

Cluster 2 selected up-regulated genes

Gene symbolGene nameFold changeP
PPFIBP1 PTPRF interacting protein, binding protein 1 (liprin beta 1) 2.71 < .001 
CST6 Cystatin E/M 2.70 < .001 
NFIC Nuclear factor I/C (CCAAT-binding transcription factor) 2.69 < .001 
FCGBP Fc fragment of IgG-binding protein 2.67 < .001 
LAMC3 Laminin, γ 3 2.53 .003 
LOR Loricrin 2.47 < .001 
EPS8L1 Epidermal growth factor receptor pathway substrate 8 (EPS8)–like 1 2.47 < .001 
SEMA3B Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B 2.45 < .001 
LCE2B Late cornified envelope 2B 2.31 < .001 
CTNNBIP1 Catenin, β-interacting protein 1 2.26 < .001 
PARD3 Par-3 partitioning defective 3 homolog (C elegans2.20 < .001 
MATN4 Matrilin 4 2.18 .033 
HLA-DQB2 Major histocompatibility complex, class II, DQ β 2 2.10 .001 
TCF7L1 Transcription factor 7-like 1 (T-cell specific, HMG-box) 2.05 < .001 
CD207 CD207 antigen, langerin 2.05 .001 
KRT2A Keratin 2A (epidermal ichthyosis bullosa of Siemens) 2.02 < .001 
COL7A1 Collagen, type VII, α 1 2.01 < .001 
LTBP4 Latent transforming growth factor β–binding protein 4 2.00 < .001 
FLG Filaggrin 1.91 < .001 
ST6GALNAC2 ST6-N-acetylgalactosaminide alpha-2,6-sialyltransferase 2 1.89 < .001 
Gene symbolGene nameFold changeP
PPFIBP1 PTPRF interacting protein, binding protein 1 (liprin beta 1) 2.71 < .001 
CST6 Cystatin E/M 2.70 < .001 
NFIC Nuclear factor I/C (CCAAT-binding transcription factor) 2.69 < .001 
FCGBP Fc fragment of IgG-binding protein 2.67 < .001 
LAMC3 Laminin, γ 3 2.53 .003 
LOR Loricrin 2.47 < .001 
EPS8L1 Epidermal growth factor receptor pathway substrate 8 (EPS8)–like 1 2.47 < .001 
SEMA3B Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B 2.45 < .001 
LCE2B Late cornified envelope 2B 2.31 < .001 
CTNNBIP1 Catenin, β-interacting protein 1 2.26 < .001 
PARD3 Par-3 partitioning defective 3 homolog (C elegans2.20 < .001 
MATN4 Matrilin 4 2.18 .033 
HLA-DQB2 Major histocompatibility complex, class II, DQ β 2 2.10 .001 
TCF7L1 Transcription factor 7-like 1 (T-cell specific, HMG-box) 2.05 < .001 
CD207 CD207 antigen, langerin 2.05 .001 
KRT2A Keratin 2A (epidermal ichthyosis bullosa of Siemens) 2.02 < .001 
COL7A1 Collagen, type VII, α 1 2.01 < .001 
LTBP4 Latent transforming growth factor β–binding protein 4 2.00 < .001 
FLG Filaggrin 1.91 < .001 
ST6GALNAC2 ST6-N-acetylgalactosaminide alpha-2,6-sialyltransferase 2 1.89 < .001 

Genes up-regulated in cluster 3 included several genes related to keratinocyte function, but also included T-cell–specific genes. Selected genes are shown in Table 4. Up-regulated genes included keratins 6 and 16, indicating keratinocyte hyperproliferation.44 SERPINB3, CEACAM1, and WNT 5A were also significantly up-regulated. These genes have been previously associated with benign hyperplasia in the setting of psoriasis as compared with squamous cell carcinoma.45 

Table 4

Cluster 3 selected up-regulated genes

Gene symbolGene nameFold changeP
FOSL1 FOS-like antigen 1 3.80 < .001 
SERPINB13 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 13 3.66 < .001 
MMP12 Matrix metalloproteinase 12 (macrophage elastase) 3.41 .004 
SERPINB4 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 4 3.33 < .001 
EVA1 Epithelial V–like antigen 1 3.12 < .001 
SELE Selectin E (endothelial adhesion molecule 1) 3.08 < .001 
IL13RA2 Interleukin-13 receptor α2 3.05 .006 
TP73L Tumor protein p73–like 2.94 < .001 
FGFBP1 Fibroblast growth factor–binding protein 1 2.86 < .001 
KRT17 Keratin 17 2.71 < .001 
IL1RN Interleukin-1 receptor antagonist 2.58 < .001 
CD24 CD24 antigen (small-cell lung carcinoma cluster 4 antigen) 2.57 < .001 
KRT16 Keratin 16 (focal nonepidermolytic palmoplantar keratoderma) 2.54 < .001 
TNFAIP6 Tumor necrosis factor α-induced protein 6 2.51 .006 
ID1 Inhibitor of DNA binding 1, dominant-negative helix-loop-helix protein 2.38 < .001 
SERPINB3 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3 2.36 < .001 
DUSP1 Dual specificity phosphatase 1 2.24 .001 
CCL2 Chemokine (C-C motif) ligand 2 2.22 .002 
ITGA6 Integrin α6 2.14 < .001 
CEACAM1 Carcinoembryonic antigen-related cell adhesion molecule 1 2.07 .002 
SOCS3 Suppressor of cytokine signaling 3 2.00 .001 
LTB4R Leukotriene B4 receptor 1.95 < .001 
DUSP3 Dual specificity phosphatase 3 (vaccinia virus phosphatase VH1–related) 1.91 < .001 
ITGB4 Integrin β4 1.89 < .001 
KRT6A, B, C, E Keratin 6A, keratin 6B, keratin 6C, keratin 6E 1.89 < .001 
JAG1 Jagged 1 (Alagille syndrome) 1.88 < .001 
VEGF Vascular endothelial growth factor 1.88 < .001 
IVL Involucrin 1.87 .002 
MCL1 Myeloid cell leukemia sequence 1 (BCL2-related) 1.82 .008 
TGFA Transforming growth factor α 1.80 .004 
SERPINB1 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 1 1.79 .001 
S100A2 S100 calcium-binding protein A2 1.79 < .001 
YES1 v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1 1.77 < .001 
FGFR2 Fibroblast growth factor receptor 2 1.72 < .001 
TNFRSF21 Tumor necrosis factor receptor superfamily, member 21 1.69 < .001 
RAB6A RAB6A, member RAS oncogene family 1.69 .003 
JUNB jun B proto-oncogene 1.68 < .001 
WNT5A Wingless-type MMTV integration site family, member 5A 1.65 .006 
Gene symbolGene nameFold changeP
FOSL1 FOS-like antigen 1 3.80 < .001 
SERPINB13 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 13 3.66 < .001 
MMP12 Matrix metalloproteinase 12 (macrophage elastase) 3.41 .004 
SERPINB4 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 4 3.33 < .001 
EVA1 Epithelial V–like antigen 1 3.12 < .001 
SELE Selectin E (endothelial adhesion molecule 1) 3.08 < .001 
IL13RA2 Interleukin-13 receptor α2 3.05 .006 
TP73L Tumor protein p73–like 2.94 < .001 
FGFBP1 Fibroblast growth factor–binding protein 1 2.86 < .001 
KRT17 Keratin 17 2.71 < .001 
IL1RN Interleukin-1 receptor antagonist 2.58 < .001 
CD24 CD24 antigen (small-cell lung carcinoma cluster 4 antigen) 2.57 < .001 
KRT16 Keratin 16 (focal nonepidermolytic palmoplantar keratoderma) 2.54 < .001 
TNFAIP6 Tumor necrosis factor α-induced protein 6 2.51 .006 
ID1 Inhibitor of DNA binding 1, dominant-negative helix-loop-helix protein 2.38 < .001 
SERPINB3 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3 2.36 < .001 
DUSP1 Dual specificity phosphatase 1 2.24 .001 
CCL2 Chemokine (C-C motif) ligand 2 2.22 .002 
ITGA6 Integrin α6 2.14 < .001 
CEACAM1 Carcinoembryonic antigen-related cell adhesion molecule 1 2.07 .002 
SOCS3 Suppressor of cytokine signaling 3 2.00 .001 
LTB4R Leukotriene B4 receptor 1.95 < .001 
DUSP3 Dual specificity phosphatase 3 (vaccinia virus phosphatase VH1–related) 1.91 < .001 
ITGB4 Integrin β4 1.89 < .001 
KRT6A, B, C, E Keratin 6A, keratin 6B, keratin 6C, keratin 6E 1.89 < .001 
JAG1 Jagged 1 (Alagille syndrome) 1.88 < .001 
VEGF Vascular endothelial growth factor 1.88 < .001 
IVL Involucrin 1.87 .002 
MCL1 Myeloid cell leukemia sequence 1 (BCL2-related) 1.82 .008 
TGFA Transforming growth factor α 1.80 .004 
SERPINB1 Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 1 1.79 .001 
S100A2 S100 calcium-binding protein A2 1.79 < .001 
YES1 v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1 1.77 < .001 
FGFR2 Fibroblast growth factor receptor 2 1.72 < .001 
TNFRSF21 Tumor necrosis factor receptor superfamily, member 21 1.69 < .001 
RAB6A RAB6A, member RAS oncogene family 1.69 .003 
JUNB jun B proto-oncogene 1.68 < .001 
WNT5A Wingless-type MMTV integration site family, member 5A 1.65 .006 

We also analyzed genes in our clusters according to established functional and molecular pathways. As our results have already shown, several pathways are easily identified ad hoc. However, we used 2 rigorous techniques to identify enriched pathways because with a large set of genes, it is often difficult to pick out patterns relevant to pathophysiology of disease, as many genes, despite relatively low fold changes, may contribute as members of an up-regulated set rather than individually. We compared GO pathways and a set of pathways curated by the Broad Institute at MIT to our dataset using the GeneMerge and GSEA tools, respectively. GeneMerge scores GO pathways according to a predefined list of cluster markers. GSEA, in comparison, uses processed expression data and class phenotype information to discover pathway enrichment.

Analysis of GO pathways gave more strength to the patterns suggested by the cluster marker gene lists. The top 15 pathways for each cluster, ranked by P value and corrected for multiple testing, are shown with the heat map in Figure 4. Cluster 1 showed up-regulation of pathways involving immune response, endogenous and exogenous antigen presentation and processing, major histocompatibility complex (MHC) receptor activity, chemokine receptor activity, TCR binding, and TNF-receptor binding. In addition, proliferative pathways are also highlighted, including mitosis, cell division, nucleus, chromosome, DNA repair, T-cell proliferation, and B-cell proliferation. In contrast, cluster 2 revealed up-regulation of several pathways involved in extracellular matrix, collagen, epidermis development, transcription, keratin filament, WNT signaling, melanin metabolism, keratinocyte differentiation, and cornified envelope. Cluster 3 showed up-regulation of many varied pathways, including angiogenesis, oxygen binding, inflammatory response, chemokine activity, chemotaxis, interleukin-1 receptor binding, and response to virus.

GSEA resulted in the identification of significantly enriched pathways in only 1 cluster—cluster 1. For all pathways containing 5 or more genes, 8 were significantly enriched in cluster 1, with FDR less than or equal to 0.05, and 3 genes with FDR less than 0.01 (Table 5). The enriched pathways involve T-cell function, TCR signaling, and T-cell activation. Also enriched was a pathway up-regulated in GATA1 mutant megakaryoblasts harvested from mice, previously implicated in leukemogenesis.46 

Table 5

GSEA-enriched pathways for cluster 1

PathwayDescriptionCuratorSize, no. genesEnrichment scoreFDR
CSK pathway Csk inhibits T-cell activation by phosphorylating Lck; Csk is regulated by cAMP-dependent kinases and is opposed by the T-cell activator CD45. BioCarta 24 0.8398 < .001 
T helper pathway Helper T cells coordinate the actions of B cells, macrophages, and other immune cells via surface molecules such as T-cell receptor/CD3 and their characteristic marker CD4. BioCarta 23 < .001.8698 < .001 
T cytotoxic pathway Cytotoxic T cells release perforin and granzyme to lyse foreign cell targets and express Fas ligand to promote Fas-induced apoptosis. BioCarta 23 < .001.8513 .003 
GATA1 mutant megakaryoblasts Genes up-regulated in megakaryoblasts of GATA1-mutant mice. GATA1 (a hematopoietic transcription factor) has been implicated in leukemogenesis in patients with Down syndrome. Li et al46  22 < .001.6708 .017 
T-cell receptor pathway T-cell receptors bind to foreign peptides presented by MHC molecules and induce T-cell activation. BioCarta 24 < .001.7998 .020 
IL-17 pathway Activated T cells secrete IL-17, which stimulates fibroblasts and other cells to secrete inflammatory and hematopoietic cytokines. BioCarta 16 < .001.7848 .020 
T-cell receptor activation pathway The kinases Lck and Fyn phosphorylate and activate the T-cell receptor, which recognizes antigen-bound MHC II and leads to T-cell activation. BioCarta 17 < .001.9049 .020 
T-cell signal transduction On activation of the T-cell receptor, phospholipase C is activated to produce second messengers DAG and PIP3, both required for T-cell activation. STKE 29 < .001.7062 .034 
PathwayDescriptionCuratorSize, no. genesEnrichment scoreFDR
CSK pathway Csk inhibits T-cell activation by phosphorylating Lck; Csk is regulated by cAMP-dependent kinases and is opposed by the T-cell activator CD45. BioCarta 24 0.8398 < .001 
T helper pathway Helper T cells coordinate the actions of B cells, macrophages, and other immune cells via surface molecules such as T-cell receptor/CD3 and their characteristic marker CD4. BioCarta 23 < .001.8698 < .001 
T cytotoxic pathway Cytotoxic T cells release perforin and granzyme to lyse foreign cell targets and express Fas ligand to promote Fas-induced apoptosis. BioCarta 23 < .001.8513 .003 
GATA1 mutant megakaryoblasts Genes up-regulated in megakaryoblasts of GATA1-mutant mice. GATA1 (a hematopoietic transcription factor) has been implicated in leukemogenesis in patients with Down syndrome. Li et al46  22 < .001.6708 .017 
T-cell receptor pathway T-cell receptors bind to foreign peptides presented by MHC molecules and induce T-cell activation. BioCarta 24 < .001.7998 .020 
IL-17 pathway Activated T cells secrete IL-17, which stimulates fibroblasts and other cells to secrete inflammatory and hematopoietic cytokines. BioCarta 16 < .001.7848 .020 
T-cell receptor activation pathway The kinases Lck and Fyn phosphorylate and activate the T-cell receptor, which recognizes antigen-bound MHC II and leads to T-cell activation. BioCarta 17 < .001.9049 .020 
T-cell signal transduction On activation of the T-cell receptor, phospholipase C is activated to produce second messengers DAG and PIP3, both required for T-cell activation. STKE 29 < .001.7062 .034 

In addition to the pathway name and description, curator (source of the pathway), size (number of genes in the pathway), ES (calculated by the GSEA algorithm, giving a statistical measure of the enrichment of the sample class by pathway genes), P value, and FDR, which corrects the P value for multiple-testing correction, are also provided. All P values were 0.

Histologic and immunophenotypic correlation.

In some cases, patients had 2 biopsies available from the same lesion, 1 processed for gene expression profiling and 1 processed for histologic analysis. We examined the histology in these cases for any correlation to clustering. In particular, we were interested to determine if the relative number of T cells present in a biopsy correlated with clustering. Figure 7 shows histology and CD3 immunostaining of representative biopsies taken at the same site as for the clustering. As illustrated in the images, there appears to be variability in the number of lymphocytes with respect to cluster, suggesting that the number of T cells in a biopsy does not dominate the clustering.

Figure 7

Histologic and immunophenotypical correlation with cluster. Representative biopsies are shown. Left panels are the hematoxylin-eosin stain, and right panels are immunohistochemical stains using monoclonal antibody to CD3. (A) Cluster 2: stage IA, untreated erythematous plaque on the chest. (B) Cluster 3: top 2 panels depict a stage IB lesion, while the bottom panel shows a stage III lesion. All images were acquired on a Nikon Eclipse E600 microscope (Nikon, Melville, NY) with a 10×/0.30 objective using a SPOT digital camera and image acquisition software (Diagnostic Instruments, Sterling Heights, MI).

Figure 7

Histologic and immunophenotypical correlation with cluster. Representative biopsies are shown. Left panels are the hematoxylin-eosin stain, and right panels are immunohistochemical stains using monoclonal antibody to CD3. (A) Cluster 2: stage IA, untreated erythematous plaque on the chest. (B) Cluster 3: top 2 panels depict a stage IB lesion, while the bottom panel shows a stage III lesion. All images were acquired on a Nikon Eclipse E600 microscope (Nikon, Melville, NY) with a 10×/0.30 objective using a SPOT digital camera and image acquisition software (Diagnostic Instruments, Sterling Heights, MI).

Close modal

Markers of clinical stage and response to treatment

Differential analysis was performed on a number of different phenotypes in order to identify genes associated with particular clinical differences. We first evaluated markers differentiating clinical stage. In a comparison of patch/plaque disease (stages IA and IB) compared with more advanced disease (stages IIB and III), we identified 480 genes with FDR less than or equal to 0.05, and 50 genes with FDR less than or equal to 0.01 (Figure 8). Genes ranked by signal-to-noise ratio contributing to differentiation of clinical stages are shown in Table 6. Genes up-regulated in earlier stage disease include several involved in epidermal development, WNT signaling pathway, and keratinocyte development. Genes up-regulated in later stage disease include several genes involved in immune response, mitosis, cell proliferation, DNA replication, apoptosis, response to virus, chemotaxis, and interferon (IFN) and TNF-alpha signaling. Although GSEA was unable to show any significant enrichment of evaluated pathways, GO analysis did show significant up-regulation in particular pathways. The top 20 up-regulated GO pathways are shown in Figure 8.

Figure 8

Heat map showing gene selection for stage IA/IB compared with stage IIB/III. Our analysis yielded gene sets of 480 genes with FDR less than or equal to 0.05, using the gene filter for 6997 genes, ranked by signal-to-noise ratio. The chart shows the top 20 GO terms for each stage, ranked by FDR. Each GO term is accompanied by the number of up-regulated genes/the total number of genes included in the test group. All included GO terms have cutoffs of .05 for both P value and FDR.

Figure 8

Heat map showing gene selection for stage IA/IB compared with stage IIB/III. Our analysis yielded gene sets of 480 genes with FDR less than or equal to 0.05, using the gene filter for 6997 genes, ranked by signal-to-noise ratio. The chart shows the top 20 GO terms for each stage, ranked by FDR. Each GO term is accompanied by the number of up-regulated genes/the total number of genes included in the test group. All included GO terms have cutoffs of .05 for both P value and FDR.

Close modal
Table 6

Genes differentiating clinical stage

Gene symbolGene nameFold changePFDR
Up-regulated in stage IA/IB     
    HLA-DRB4 mMajor histocompatibility complex, class II, DR β 4 4.53 .003 .048 
    WIF1 WNT inhibitory factor 1 4.41 < .001 < .001 
    CD207 CD207 antigen, langerin 3.63 < .001 < .001 
    FCGBP Fc fragment of IgG binding protein 2.87 < .001 .017 
    MATN4 Matrilin 4 2.79 < .001 < .001 
    HLA-DQB2 Major histocompatibility complex, class II, DQ β 2 2.74 < .001 < .001 
    CD1A CD1A antigen, a polypeptide 2.60 < .001 < .001 
    FCER1A Fc fragment of IgE, high affinity I, receptor for α polypeptide 2.45 < .001 .029 
    S100B S100 calcium-binding protein β (neural) 2.17 < .001 .017 
    KRT18 Keratin 18 2.06 .003 .047 
    LAMC3 Laminin γ 3 2.05 .002 .043 
    CTSG Cathepsin G 2.04 < .001 < .001 
    CMA1 Chymase 1, mast cell 2.03 < .001 < .001 
    GSTT1 Glutathione S-transferase θ 1 2.03 < .001 .029 
    HLF Hepatic leukemia factor 2.00 .002 .039 
    CD1C CD1C antigen, c polypeptide 1.94 .002 .043 
    FRZB Frizzled-related protein 1.94 < .001 .029 
    KRT15 Keratin 15 1.86 .001 .036 
    MMP27 Matrix metalloproteinase 27 1.85 .003 .044 
    PADI2 Peptidyl arginine deiminase, type II 1.78 < .001 .017 
    SCEL Sciellin 1.78 .002 .043 
    TCF7L1 Tanscription factor 7-like 1 (T-cell specific, HMG-box) 1.73 < .001 .023 
    LTBP4 Latent transforming growth factor β–binding protein 4 1.70 < .001 .033 
    CTNNBIP1 Catenin β interacting protein 1 1.68 .003 .046 
    COL21A1 Collagen, type XXI α 1 1.67 .003 .047 
    KIT v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog 1.66 < .001 < .001 
    S100A1 S100 calcium-binding protein A1 1.64 < .001 < .001 
    NFIB Nuclear factor I/B 1.60 .002 .039 
    IL11RA Interleukin-11 receptor α 1.59 < .001 < .001 
    MATN2 Matrilin 2 1.53 < .001 .017 
Up-regulated in stage IIA/III     
    AIM2 Absent in melanoma 2 4.62 < .001 < .001 
    LAG3 Lymphocyte-activation gene 3 4.42 < .001 < .001 
    APOBEC3B Apolipoprotein B mRNA–editing enzyme, catalytic polypeptide-like 3B 3.37 < .001 .023 
    SH2D1A SH2 domain protein 1A, Duncan disease (lymphoproliferative syndrome) 3.04 .002 .039 
    TNIP3 TNFAIP3 interacting protein 3 2.75 < .001 .023 
    IGH Ig heavy locus 2.68 .002 .041 
    IL21R Interleukin-21 receptor 2.67 < .001 .017 
    TNFSF14 Tumor necrosis factor (ligand) superfamily, member 14 2.64 .002 .039 
    LILRA3 Leukocyte Ig-like receptor, subfamily A (without TM domain), member 3 2.64 .002 .043 
    CCR4 Chemokine (C-C motif) receptor 4 2.52 < .001 .023 
    G1P2 Interferon α–inducible protein (clone IFI-15K) 2.49 < .001 .029 
    ISG20 Interferon-stimulated gene 20 kDa 2.42 < .001 < .001 
    TNIK TRAF2 and NCK interacting kinase 2.39 < .001 .017 
    IGL Ig lambda locus, constant, variable and joining regions 2.36 .002 .042 
    PRKCQ Protein kinase C θ 2.24 .002 .041 
    OAS2 2′-5′-oligoadenylate synthetase 2, 69/71 kDa 2.18 < .001 < .001 
    OAS1 2′,5′-oligoadenylate synthetase 1, 40/46 kDa 2.12 .002 .039 
    IGKC; IGKV1-5 Ig κ constant and variable 1-5 2.09 .003 .050 
    CCR7 Chemokine (C-C motif) receptor 7 2.08 .001 .036 
    ZAP70 ζ-chain (TCR)–associated protein kinase 70 kDa 2.05 .002 .041 
    CCND2 Cyclin D2 2.00 .002 .039 
    GBP1 Guanylate-binding protein 1, interferon-inducible, 67 kDa 1.95 .003 .046 
    ZC3HAV1 Zinc finger CCCH type, antiviral 1 1.91 < .001 < .001 
    IL2RB Interleukin-2 receptor β 1.90 .002 .041 
    PTPRCAP Protein tyrosine phosphatase, receptor type, C-associated protein 1.90 .003 .048 
    C1GALT1 Core 1 synthase 1.89 < .001 .017 
    C5R1 Complement component 5 receptor 1 (C5a ligand) 1.88 < .001 .033 
    PIM2 pim-2 oncogene 1.88 .003 .050 
    ITGB7 Integrin β7 1.84 .002 .042 
    DDX58 DEAD (Asp-Glu-Ala-Asp) box polypeptide 58 1.84 .002 .041 
    RARRES3 Retinoic acid receptor responder (tazarotene induced) 3 1.83 .003 .047 
    IFRG28 28 kDa interferon-responsive protein 1.82 .001 .036 
    MX2 Myxovirus (influenza virus) resistance 2 (mouse) 1.75 .002 .043 
    SP100 (IFI41) Nuclear antigen Sp100 1.71 < .001 < .001 
    VAV1 vav 1 oncogene 1.68 .001 .036 
    LCP2 Lymphocyte cytosolic protein 2 1.65 .002 .042 
    NFATC2IP NFAT, cytoplasmic, calcineurin-dependent 2 interacting protein 1.65 < .001 < .001 
    DUSP5 Dual specificity phosphatase 5 1.64 < .001 .017 
    IL10 Interleukin-10 1.64 .002 .042 
    BAX BCL2-associated X protein 1.63 .002 .043 
    FYB FYN binding protein (FYB-120/130) 1.61 < .001 .023 
    NUP98 Nucleoporin 98 kDa 1.60 .002 .041 
    HSPD1 heat shock 60 kDa protein 1 (chaperonin) 1.56 .003 .048 
    IFNAR2 Interferon (α, β, and ω) receptor 2 1.53 < .001 < .001 
    RAP140 Retinoblastoma-associated protein 140 (CTCL tumor antigen se89-1) 1.52 < .001 .033 
    FGFR1OP FGFR1 oncogene partner 1.51 < .001 < .001 
Gene symbolGene nameFold changePFDR
Up-regulated in stage IA/IB     
    HLA-DRB4 mMajor histocompatibility complex, class II, DR β 4 4.53 .003 .048 
    WIF1 WNT inhibitory factor 1 4.41 < .001 < .001 
    CD207 CD207 antigen, langerin 3.63 < .001 < .001 
    FCGBP Fc fragment of IgG binding protein 2.87 < .001 .017 
    MATN4 Matrilin 4 2.79 < .001 < .001 
    HLA-DQB2 Major histocompatibility complex, class II, DQ β 2 2.74 < .001 < .001 
    CD1A CD1A antigen, a polypeptide 2.60 < .001 < .001 
    FCER1A Fc fragment of IgE, high affinity I, receptor for α polypeptide 2.45 < .001 .029 
    S100B S100 calcium-binding protein β (neural) 2.17 < .001 .017 
    KRT18 Keratin 18 2.06 .003 .047 
    LAMC3 Laminin γ 3 2.05 .002 .043 
    CTSG Cathepsin G 2.04 < .001 < .001 
    CMA1 Chymase 1, mast cell 2.03 < .001 < .001 
    GSTT1 Glutathione S-transferase θ 1 2.03 < .001 .029 
    HLF Hepatic leukemia factor 2.00 .002 .039 
    CD1C CD1C antigen, c polypeptide 1.94 .002 .043 
    FRZB Frizzled-related protein 1.94 < .001 .029 
    KRT15 Keratin 15 1.86 .001 .036 
    MMP27 Matrix metalloproteinase 27 1.85 .003 .044 
    PADI2 Peptidyl arginine deiminase, type II 1.78 < .001 .017 
    SCEL Sciellin 1.78 .002 .043 
    TCF7L1 Tanscription factor 7-like 1 (T-cell specific, HMG-box) 1.73 < .001 .023 
    LTBP4 Latent transforming growth factor β–binding protein 4 1.70 < .001 .033 
    CTNNBIP1 Catenin β interacting protein 1 1.68 .003 .046 
    COL21A1 Collagen, type XXI α 1 1.67 .003 .047 
    KIT v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog 1.66 < .001 < .001 
    S100A1 S100 calcium-binding protein A1 1.64 < .001 < .001 
    NFIB Nuclear factor I/B 1.60 .002 .039 
    IL11RA Interleukin-11 receptor α 1.59 < .001 < .001 
    MATN2 Matrilin 2 1.53 < .001 .017 
Up-regulated in stage IIA/III     
    AIM2 Absent in melanoma 2 4.62 < .001 < .001 
    LAG3 Lymphocyte-activation gene 3 4.42 < .001 < .001 
    APOBEC3B Apolipoprotein B mRNA–editing enzyme, catalytic polypeptide-like 3B 3.37 < .001 .023 
    SH2D1A SH2 domain protein 1A, Duncan disease (lymphoproliferative syndrome) 3.04 .002 .039 
    TNIP3 TNFAIP3 interacting protein 3 2.75 < .001 .023 
    IGH Ig heavy locus 2.68 .002 .041 
    IL21R Interleukin-21 receptor 2.67 < .001 .017 
    TNFSF14 Tumor necrosis factor (ligand) superfamily, member 14 2.64 .002 .039 
    LILRA3 Leukocyte Ig-like receptor, subfamily A (without TM domain), member 3 2.64 .002 .043 
    CCR4 Chemokine (C-C motif) receptor 4 2.52 < .001 .023 
    G1P2 Interferon α–inducible protein (clone IFI-15K) 2.49 < .001 .029 
    ISG20 Interferon-stimulated gene 20 kDa 2.42 < .001 < .001 
    TNIK TRAF2 and NCK interacting kinase 2.39 < .001 .017 
    IGL Ig lambda locus, constant, variable and joining regions 2.36 .002 .042 
    PRKCQ Protein kinase C θ 2.24 .002 .041 
    OAS2 2′-5′-oligoadenylate synthetase 2, 69/71 kDa 2.18 < .001 < .001 
    OAS1 2′,5′-oligoadenylate synthetase 1, 40/46 kDa 2.12 .002 .039 
    IGKC; IGKV1-5 Ig κ constant and variable 1-5 2.09 .003 .050 
    CCR7 Chemokine (C-C motif) receptor 7 2.08 .001 .036 
    ZAP70 ζ-chain (TCR)–associated protein kinase 70 kDa 2.05 .002 .041 
    CCND2 Cyclin D2 2.00 .002 .039 
    GBP1 Guanylate-binding protein 1, interferon-inducible, 67 kDa 1.95 .003 .046 
    ZC3HAV1 Zinc finger CCCH type, antiviral 1 1.91 < .001 < .001 
    IL2RB Interleukin-2 receptor β 1.90 .002 .041 
    PTPRCAP Protein tyrosine phosphatase, receptor type, C-associated protein 1.90 .003 .048 
    C1GALT1 Core 1 synthase 1.89 < .001 .017 
    C5R1 Complement component 5 receptor 1 (C5a ligand) 1.88 < .001 .033 
    PIM2 pim-2 oncogene 1.88 .003 .050 
    ITGB7 Integrin β7 1.84 .002 .042 
    DDX58 DEAD (Asp-Glu-Ala-Asp) box polypeptide 58 1.84 .002 .041 
    RARRES3 Retinoic acid receptor responder (tazarotene induced) 3 1.83 .003 .047 
    IFRG28 28 kDa interferon-responsive protein 1.82 .001 .036 
    MX2 Myxovirus (influenza virus) resistance 2 (mouse) 1.75 .002 .043 
    SP100 (IFI41) Nuclear antigen Sp100 1.71 < .001 < .001 
    VAV1 vav 1 oncogene 1.68 .001 .036 
    LCP2 Lymphocyte cytosolic protein 2 1.65 .002 .042 
    NFATC2IP NFAT, cytoplasmic, calcineurin-dependent 2 interacting protein 1.65 < .001 < .001 
    DUSP5 Dual specificity phosphatase 5 1.64 < .001 .017 
    IL10 Interleukin-10 1.64 .002 .042 
    BAX BCL2-associated X protein 1.63 .002 .043 
    FYB FYN binding protein (FYB-120/130) 1.61 < .001 .023 
    NUP98 Nucleoporin 98 kDa 1.60 .002 .041 
    HSPD1 heat shock 60 kDa protein 1 (chaperonin) 1.56 .003 .048 
    IFNAR2 Interferon (α, β, and ω) receptor 2 1.53 < .001 < .001 
    RAP140 Retinoblastoma-associated protein 140 (CTCL tumor antigen se89-1) 1.52 < .001 .033 
    FGFR1OP FGFR1 oncogene partner 1.51 < .001 < .001 

Selected genes with up-regulated expression in the specified stages are shown, with fold change greater than or equal to 1.5, P value less than or equal to .05, and FDR less than or equal to .05.

We also evaluated patients in terms of response to treatment, identifying those with disease persistence or progression despite stage-appropriate treatment as difficult to treat, and those whose disease was responsive to stage-appropriate therapies as easy to treat. Transcriptional profiling revealed a treatment response signature shown in Table 7. Changes are similar to those differentiating early- and later-stage disease: genes down-regulated in poor treatment response were involved in pathway extracellular matrix, WNT signaling pathway, epidermal development, and frizzled signaling, whereas up-regulated genes included those involved in mitosis, immune response, response to virus, apoptosis, and T-cell activation.

Table 7

Genes up- and down-regulated in patients with poor response to treatment

Gene symbolGene nameFold changePFDR
In patients with poor response to treatment     
    IL21R Interleukin-21 receptor 2.81 < .001 .048 
    MYO7A Myosin VIIA (usher syndrome 1B [autosomal recessive, severe]) 2.44 < .001 .048 
    RGS16 Regulator of G-protein signaling 16 2.29 < .001 < .001 
    ANP32E Acidic (leucine-rich) nuclear phosphoprotein 32 family, member E 2.00 < .001 < .001 
    SLA Src-like adaptor 1.91 < .001 .048 
    PPP1R16B Protein phosphatase 1, regulatory (inhibitor) subunit 16B 1.79 < .001 .048 
    C1GALT1 Core 1 synthase 1.64 < .001 .048 
    IL10 Interleukin-10 1.60 < .001 < .001 
    TRIB2 Tribbles homolog 2 (Drosophila1.55 < .001 < .001 
    MTHFD2 Methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2, methenyltetrahydrofolate cyclohydrolase 1.51 < .001 < .001 
    EHD1 EH-domain containing 1 1.51 < .001 < .001 
In patients with good response to treatment     
    WIF1 WNT inhibitory factor 1 4.02 < .001 .048 
    KRT15 Keratin 15 1.92 < .001 < .001 
    GSTA3 Glutathione S-transferase A3 1.87 < .001 .048 
    PTN Pleiotrophin (heparin-binding growth factor 8, neurite growth-promoting factor 1) 1.79 < .001 .048 
Gene symbolGene nameFold changePFDR
In patients with poor response to treatment     
    IL21R Interleukin-21 receptor 2.81 < .001 .048 
    MYO7A Myosin VIIA (usher syndrome 1B [autosomal recessive, severe]) 2.44 < .001 .048 
    RGS16 Regulator of G-protein signaling 16 2.29 < .001 < .001 
    ANP32E Acidic (leucine-rich) nuclear phosphoprotein 32 family, member E 2.00 < .001 < .001 
    SLA Src-like adaptor 1.91 < .001 .048 
    PPP1R16B Protein phosphatase 1, regulatory (inhibitor) subunit 16B 1.79 < .001 .048 
    C1GALT1 Core 1 synthase 1.64 < .001 .048 
    IL10 Interleukin-10 1.60 < .001 < .001 
    TRIB2 Tribbles homolog 2 (Drosophila1.55 < .001 < .001 
    MTHFD2 Methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2, methenyltetrahydrofolate cyclohydrolase 1.51 < .001 < .001 
    EHD1 EH-domain containing 1 1.51 < .001 < .001 
In patients with good response to treatment     
    WIF1 WNT inhibitory factor 1 4.02 < .001 .048 
    KRT15 Keratin 15 1.92 < .001 < .001 
    GSTA3 Glutathione S-transferase A3 1.87 < .001 .048 
    PTN Pleiotrophin (heparin-binding growth factor 8, neurite growth-promoting factor 1) 1.79 < .001 .048 

Shown are genes with fold change greater than or equal to 1.5, P value less than or equal to .05, and FDR less than or equal to 0.05.

Comparison of patients with leukemic disease versus patients with no evidence of peripheral blood involvement identified only 3 genes with significant fold changes based on multiple testing. These genes included 2 genes up-regulated in leukemic disease, lymphocyte-activation gene 3 (LAG3; fold change 4.24) and B-factor (BF; fold change 2.36), and 1 gene down-regulated in leukemic disease, collagen XI A (COL11A1; fold change −9.84). LAG3 has been shown to be involved in lymphocyte activation, and has a close relationship to CD4; BF is a component of the alternative pathway of complement activation; and COL11A1 is an extracellular matrix component, also reported to be expressed in some T-cell lines. To further investigate these potential markers of leukemic disease, we performed reverse transcription–polymerase chain reaction (RT-PCR) to evaluate expression of these genes in peripheral blood mononuclear cell RNA samples of patients with CTCL with and without leukemic disease (as opposed to the skin lesion RNA used in our GeneChip analysis). We found no correlation between expression of these genes in blood samples and the presence of peripheral blood involvement with CTCL (data not shown).

Evaluation of SS in comparison with all other patient samples revealed significant down-regulation of only the patched homolog gene (PTCH; fold change −1.72). PTCH encodes the receptor for sonic hedgehog, and functions as a tumor suppressor.

Correlation with TREC and spectratyping data

We have previously shown that T-cell receptor excision circles (TRECs), which decrease with T-cell division, are significantly decreased in the peripheral blood of patients with CTCL, indicating increased T-cell turnover.47  We have also previously shown that the diversity of the T-cell population appears to be decreased in CTCL, by measuring length distributions of T-cell receptor β variable regions (“spectratyping”).47,48  We compared these parameters of T-cell immunity with clustering results in 39 of the patients of this study for whom TREC and spectratyping data were available. As shown previously, mean TRECs decreased significantly with increasing clinical stage, and mean spectratype contracture increased, consistent with loss of TCR complexity (Table 8). In this study, we compared TRECs and spectratypes by cluster and found the greatest loss of TRECs in cluster 1 and the least in cluster 2 (Table 8). Loss of T-cell diversity on spectratyping was greatest in cluster 3 and least in cluster 2.

Table 8

Evaluation of TRECs

Mean no. TRECsMean contractureSample no.
Clinical stage    
    IA 6432.46 1.86 
    IB 5293.29 2.44 18 
    IIB 534.62 2.33 
    III 268.12 6.56 
Cluster    
    1 1551.66 3.18 11 
    2 6980.11 2.03 15 
    3 1837.65 4.46 13 
Mean no. TRECsMean contractureSample no.
Clinical stage    
    IA 6432.46 1.86 
    IB 5293.29 2.44 18 
    IIB 534.62 2.33 
    III 268.12 6.56 
Cluster    
    1 1551.66 3.18 11 
    2 6980.11 2.03 15 
    3 1837.65 4.46 13 

Shown are TRECs with respect to clinical stage (P = .034) and cluster (P = .011). As shown previously,47  TRECs decreased significantly across all clinical stages, and contracture increased from early-stage to late-stage CTCL, indicating loss of T-cell receptor diversity. Clusters 1 and 3 had the lowest TREC levels and highest contractures, while cluster 2 had the highest TREC levels and the lowest contractures. These findings significantly correlate with the clinical observation of poorer outcome in clusters 1 and 3. P values calculated based on ANOVA.

This study shows that gene expression profiling in lesional skin biopsies provides information which may be useful in the evaluation and management of patients with CTCL. We determined the most natural clustering of 62 patients with CTCL, and examined in detail the genes and gene sets defining each group. We further correlated these clusters with other parameters including clinical stage, event-free survival, treatment resistance, TREC levels, and spectratype diversity. Our results demonstrate that grouping patients based on gene expression reveals many interesting similarities and differences with currently accepted clinical staging. This work can serve as a framework for further studies aimed at understanding the diverse spectrum of CTCL in terms of gene expression rather than clinical manifestations alone.

Using consensus analysis, we found 3 clusters to be the most statistically robust grouping of our samples, with approximately one-third of patients in each group. One cluster (cluster 2) was associated with less aggressive disease by all measures. Another cluster (cluster 3) was associated with more extensive disease, decreased event-free survival, and poor response to treatment. A third cluster (cluster 1) was associated with an even poorer prognosis and included many patients in tumor stage. A previous microarray study of lesional CTCL skin biopsies by Tracey et al17  divided subjects into 2 clusters using hierarchical clustering. These clusters were generally characterized by clinical parameters to be associated with more and less aggressive disease. They did not perform consensus clustering and did not publish any results of clustering with an explicit 3-cluster model. However, when we examined their data in a 3-cluster division for comparison with the results of our consensus clustering, we found that a 3-cluster structure for their data were highly consistent with our results, based on sample as well as gene partitioning (Figure 9).

Figure 9

Comparison with published results. Tracey et al17  examined genes included on the CNIO Oncochip microarray platform, a subset of the genes included on the Affymetrix HG-U133 microarray platform. They identified 2 main subgroups of patients using hierarchical clustering. We compared this to our 3-cluster division determined through consensus clustering. Sample clustering in the space of genes identified by Tracey et al showed significant consistency with our results, despite only 21% overlap between genes used in clustering. We also examined a less optimized 2-cluster structure of our data for comparison, and found relatively good overlap with the 2-cluster structure of Tracey et al. (A) 3-cluster structure. 79% of samples overall were correctly classified. This result was significant (P < .001). (B) 2-cluster structure. 87% of samples overall were correctly classified using a 2-cluster structure. The result was again significant (P < .001). (C) Gene correlation. Genes were similarly classified for 71% of 130 common genes. Genes associated with more aggressive disease by Tracey et al were found to be associated with more advanced disease in our study as well. This finding was significant (P < .001).

Figure 9

Comparison with published results. Tracey et al17  examined genes included on the CNIO Oncochip microarray platform, a subset of the genes included on the Affymetrix HG-U133 microarray platform. They identified 2 main subgroups of patients using hierarchical clustering. We compared this to our 3-cluster division determined through consensus clustering. Sample clustering in the space of genes identified by Tracey et al showed significant consistency with our results, despite only 21% overlap between genes used in clustering. We also examined a less optimized 2-cluster structure of our data for comparison, and found relatively good overlap with the 2-cluster structure of Tracey et al. (A) 3-cluster structure. 79% of samples overall were correctly classified. This result was significant (P < .001). (B) 2-cluster structure. 87% of samples overall were correctly classified using a 2-cluster structure. The result was again significant (P < .001). (C) Gene correlation. Genes were similarly classified for 71% of 130 common genes. Genes associated with more aggressive disease by Tracey et al were found to be associated with more advanced disease in our study as well. This finding was significant (P < .001).

Close modal

In addition to examining individual gene differences, we examined groups of genes organized according to functional pathways. Pathways were tested using the GO and GSEA tools, which group changes in gene expression based on molecular pathways in order to uncover changes in cell functions that might otherwise be difficult to delineate because of small fold changes of individual genes. Cluster 1 demonstrated significant up-regulation of many other genes involved in lymphocyte activation, as well as up-regulation of TNF pathway genes. Gene pathways that were predominant in cluster 2, consisting of patients with relatively limited disease, tended to be associated with epidermal development and differentiation. Cluster 3, consisting of patients with more extensive disease, emphasized pathways involved in inflammation and benign epidermal hyperproliferation. Although clusters 2 and 3 appear to be clinically divergent, it is possible that they lie on a clinical spectrum, with cluster 3 tending toward more extensive skin involvement (stages IB and III, with very few stage IA patients) and with cluster 2 tending toward more limited surface area stage IA disease. Cluster 1 was found to be enriched with lymphocyte activation genes, and correlated with poorer outcome, including disease-related deaths and progression despite systemic therapies. This cluster also had lower TREC levels across all clinical stages, consistent with lymphocyte clonal expansion, as well as contracted spectratypes, indicating decreased TCR diversity. This significant correlation between our clustering and TREC/spectratype measurements supports the idea that our clustering does correlate with the biology of the disease.

Biopsies showed that the number of CD3+ cells in the epidermal and dermal infiltrates is highly variable across clusters, suggesting that up-regulation of T-cell activation genes is unlikely due solely to increased numbers of lymphocytes.

Markers of clinical stage in the supervised analysis were largely in line with those of the natural clusters, with lower stage disease enriched in epidermal differentiation pathways and higher-stage disease enriched in lymphocyte activation pathways, among others. More work with larger numbers of patients will be required to characterize subgroups more clearly and to isolate causative genes. However, this does appear to be a useful framework for additional study of the pathways and genes involved in clinical stage progression and response to treatment.

This study suggests that gene expression profiles of lesional skin provide information which correlates with disease progression and response to therapy. This information may prove useful in the future in combination with traditionally accepted clinical staging criteria to improve classification of CTCL and to help guide therapy, such as more aggressive treatment of those patients with gene expression profiles enriched in lymphocyte activation genes relative to epidermal differentiation and proliferation genes.

Presented in abstract form at the 67th annual meeting of the Society for Investigative Dermatology, Philadelphia, PA, May 4, 2006.1 

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

We are indebted to Marianne Tawa and Lisa Batchelder for their help with identifying and recruiting eligible patients for our study. We would also like to thank Helen Kuo for her help extracting RNA. We thank the Broad Institute for help with target preparation and hybridization.

This work was supported by a grant from the National Institutes of Health SPORE program (P50 CA093683 to T.S.K.).

National Institutes of Health

Contribution: All authors participated in designing the research; D.A. and M.D. collected samples; D.A., M.D., and J.S. collected clinical information; J.S. and D.J. examined pathology slides; J.S. and S.M. performed statistical analyses; J.S. and D.J. wrote the paper; and all authors reviewed the final version of the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: David A. Jones or Thomas S. Kupper, Department of Dermatology, Brigham and Women's Hospital, 77 Ave Louis Pasteur, HIM Room 660, Boston, MA 02115; e-mail: dajones@partners.org or tkupper@partners.org.

1
Shin
 
J
Jones
 
DA
Hurwitz
 
D
Kupper
 
TS
Gene expression profiling in cutaneous T cell lymphoma skin lesions.
J Invest Dermatol
2006
, vol. 
126
 
Supplement
pg. 
24
 
2
Willemze
 
R
Jaffe
 
ES
Burg
 
G
, et al. 
WHO-EORTC classification for cutaneous lymphomas.
Blood
2005
, vol. 
105
 (pg. 
3768
-
3785
)
3
Zackheim
 
HS
Amin
 
S
Kashani-Sabet
 
M
McMillan
 
A
Prognosis in cutaneous T-cell lymphoma by skin stage: long-term survival in 489 patients.
J Am Acad Dermatol
1999
, vol. 
40
 (pg. 
418
-
425
)
4
Diamandidou
 
E
Colome-Grimmer
 
M
Fayad
 
L
Duvic
 
M
Kurzrock
 
R
Transformation of mycosis fungoides/Sezary syndrome: clinical characteristics and prognosis.
Blood
1998
, vol. 
92
 (pg. 
1150
-
1159
)
5
Kim
 
YH
Liu
 
HL
Mraz-Gernhard
 
S
Varghese
 
A
Hoppe
 
RT
Long-term outcome of 525 patients with mycosis fungoides and Sezary syndrome: clinical prognostic factors and risk for disease progression.
Arch Dermatol
2003
, vol. 
139
 (pg. 
857
-
866
)
6
Kim
 
EJ
Hess
 
S
Richardson
 
SK
, et al. 
Immunopathogenesis and therapy of cutaneous T cell lymphoma.
J Clin Invest
2005
, vol. 
115
 (pg. 
798
-
812
)
7
Kupper
 
TS
Fuhlbrigge
 
RC
Immune surveillance in the skin: mechanisms and clinical consequences.
Nat Rev Immunol
2004
, vol. 
4
 (pg. 
211
-
222
)
8
Kazakov
 
DV
Burg
 
G
Kempf
 
W
Clinicopathological spectrum of mycosis fungoides.
J Eur Acad Dermatol Venereol
2004
, vol. 
18
 (pg. 
397
-
415
)
9
Morales Suarez-Varela
 
MM
Llopis Gonzalez
 
A
Marquina Vila
 
A
Bell
 
J
Mycosis fungoides: review of epidemiological observations.
Dermatology
2000
, vol. 
201
 (pg. 
21
-
28
)
10
Quackenbush
 
J
Microarray analysis and tumor classification.
N Engl J Med
2006
, vol. 
354
 (pg. 
2463
-
2472
)
11
Ebert
 
BL
Golub
 
TR
Genomic approaches to hematologic malignancies.
Blood
2004
, vol. 
104
 (pg. 
923
-
932
)
12
Tefferi
 
AM
Bolander
 
MEM
Ansell
 
SMM
Wieben
 
EDP
Spelsberg
 
TCP
Primer on medical genomics part III: microarray experiments and data analysis.
Mayo Clin Proc
2002
, vol. 
77
 (pg. 
927
-
940
)
13
Sorlie
 
T
Perou
 
CM
Tibshirani
 
R
, et al. 
Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.
Proc Natl Acad Sci U S A
2001
, vol. 
98
 (pg. 
10869
-
10874
)
14
van 't Veer
 
LJ
Dai
 
H
van de Vijver
 
MJ
, et al. 
Gene expression profiling predicts clinical outcome of breast cancer.
Nature
2002
, vol. 
415
 (pg. 
530
-
536
)
15
Monti
 
S
Savage
 
KJ
Kutok
 
JL
, et al. 
Molecular profiling of diffuse large B-cell lymphoma identifies robust subtypes including one characterized by host inflammatory response.
Blood
2005
, vol. 
105
 (pg. 
1851
-
1861
)
16
Kari
 
L
Loboda
 
A
Nebozhyn
 
M
, et al. 
Classification and prediction of survival in patients with the leukemic phase of cutaneous T cell lymphoma.
J Exp Med
2003
, vol. 
197
 (pg. 
1477
-
1488
)
17
Tracey
 
L
Villuendas
 
R
Dotor
 
AM
, et al. 
Mycosis fungoides shows concurrent deregulation of multiple genes involved in the TNF signaling pathway: an expression profile study.
Blood
2003
, vol. 
102
 (pg. 
1042
-
1050
)
18
Tracey
 
L
Villuendas
 
R
Ortiz
 
P
, et al. 
Identification of genes involved in resistance to interferon-alpha in cutaneous T-cell lymphoma.
Am J Pathol
2002
, vol. 
161
 (pg. 
1825
-
1837
)
19
Irizarry
 
RA
Bolstad
 
BM
Collin
 
F
Cope
 
LM
Hobbs
 
B
Speed
 
TP
Summaries of Affymetrix GeneChip probe level data.
Nucleic Acids Res
2003
, vol. 
31
 pg. 
e15
 
20
R Development Core Team
R: a language and environment for statistical computing.
2005
Vienna, Austria
R Foundation for Statistical Computing
 
21
Gentleman
 
R
Carey
 
V
Bates
 
D
, et al. 
Bioconductor: open software development for computational biology and bioinformatics.
Genome Biology
2004
, vol. 
5
 pg. 
R80
 
22
Shipp
 
MA
Ross
 
KN
Tamayo
 
P
, et al. 
Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning.
Nat Med
2002
, vol. 
8
 (pg. 
68
-
74
)
23
Eisen
 
MB
Spellman
 
PT
Brown
 
PO
Botstein
 
D
Cluster analysis and display of genome-wide expression patterns.
Proc Natl Acad Sci U S A
1998
, vol. 
95
 (pg. 
14863
-
14868
)
24
Tamayo
 
P
Slonim
 
D
Mesirov
 
J
, et al. 
Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation.
Proc Natl Acad Sci U S A
1999
, vol. 
96
 (pg. 
2907
-
2912
)
25
Monti
 
S
Tamayo
 
P
Mesirov
 
J
Golub
 
T
Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data.
Machine Learning
2003
, vol. 
52
 (pg. 
91
-
118
)
26
Li
 
C
Wong
 
WH
Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection.
Proc Natl Acad Sci U S A
2001
, vol. 
98
 (pg. 
31
-
36
)
27
Reich
 
M
Liefeld
 
T
Gould
 
J
Lerner
 
J
Tamayo
 
P
Mesirov
 
JP
GenePattern 2.0.
Nat Genet
2006
, vol. 
38
 (pg. 
500
-
501
)
28
Golub
 
TR
Slonim
 
DK
Tamayo
 
P
, et al. 
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Science
1999
, vol. 
286
 (pg. 
531
-
537
)
29
Gould
 
J
Getz
 
G
Monti
 
S
Reich
 
M
Mesirov
 
JP
Comparative gene marker selection suite.
Bioinformatics
2006
, vol. 
22
 (pg. 
1924
-
1925
)
30
Benjamini
 
Y
Hochberg
 
Y
Controlling the false discovery rate: a practical and powerful approach to multiple testing.
J R Stat Soc Ser B
1995
, vol. 
57
 (pg. 
289
-
300
)
31
Ashburner
 
M
Ball
 
CA
Blake
 
JA
, et al. 
Gene Ontology: tool for the unification of biology.
Nat Genet
2000
, vol. 
25
 (pg. 
25
-
29
)
32
Castillo-Davis
 
CI
Hartl
 
DL
GeneMerge: post-genomic analysis, data mining, and hypothesis testing.
Bioinformatics
2003
, vol. 
19
 (pg. 
891
-
892
)
33
Subramanian
 
A
Tamayo
 
P
Mootha
 
VK
, et al. 
From the cover: gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.
Proc Natl Acad Sci U S A
2005
, vol. 
102
 (pg. 
15545
-
15550
)
34
Mehrle
 
S
Frank
 
S
Schmidt
 
J
Schmidt-Wolf
 
IGH
Marten
 
A
SAP and SLAM expression in anti-CD3 activated lymphocytes correlates with cytotoxic activity.
Immunol Cell Biol
2005
, vol. 
83
 (pg. 
33
-
39
)
35
Willinger
 
T
Freeman
 
T
Herbert
 
M
Hasegawa
 
H
McMichael
 
AJ
Callan
 
MFC
Human naive CD8 T cells down-regulate expression of the WNT pathway transcription factors lymphoid enhancer binding factor 1 and transcription factor 7 (T cell factor-1) following antigen encounter in vitro and in vivo.
J Immunol
2006
, vol. 
176
 (pg. 
1439
-
1446
)
36
Castrop
 
J
van Wichen
 
D
Koomans-Bitter
 
M
, et al. 
The human TCF-1 gene encodes a nuclear DNA-binding protein uniquely expressed in normal and neoplastic T-lineage lymphocytes.
Blood
1995
, vol. 
86
 (pg. 
3050
-
3059
)
37
Dorfman
 
DM
Greisman
 
HA
Shahsafaei
 
A
Loss of expression of the WNT/β-catenin-signaling pathway transcription factors lymphoid enhancer factor-1 (LEF-1) and T cell factor-1 (TCF-1) in a subset of peripheral T cell lymphomas.
Am J Pathol
2003
, vol. 
162
 (pg. 
1539
-
1544
)
38
Martinez
 
M
Joffraud
 
M
Giraud
 
S
, et al. 
Regulation of PSGL-1 interactions with L-selectin, P-selectin, and E-selectin: role of human fucosyltransferase-IV and -VII.
J Biol Chem
2005
, vol. 
280
 (pg. 
5378
-
5390
)
39
Nakayama
 
F
Teraki
 
Y
Kudo
 
T
, et al. 
Expression of cutaneous lymphocyte-associated antigen regulated by a set of glycosyltransferases in human T cells: involvement of α1,3-fucosyltransferase VII and β1,4-galactosyltransferase I.
J Invest Dermatol
2000
, vol. 
115
 (pg. 
299
-
306
)
40
Sedy
 
JR
Gavrieli
 
M
Potter
 
KG
, et al. 
B and T lymphocyte attenuator regulates T cell activation through interaction with herpesvirus entry mediator.
Nat Immunol
2005
, vol. 
6
 (pg. 
90
-
98
)
41
Shi
 
G
Luo
 
H
Wan
 
X
Salcedo
 
TW
Zhang
 
J
Wu
 
J
Mouse T cells receive costimulatory signals from LIGHT, a TNF family member.
Blood
2002
, vol. 
100
 (pg. 
3279
-
3286
)
42
Wang
 
C-Y
Mayo
 
MW
Korneluk
 
RG
Goeddel
 
DV
Baldwin
 
AS
NF-B antiapoptosis: induction of TRAF1 and TRAF2 and c-IAP1 and c-IAP2 to suppress caspase-8 activation.
Science
1998
, vol. 
281
 (pg. 
1680
-
1683
)
43
Auer
 
RL
Starczynski
 
J
McElwaine
 
S
, et al. 
Identification of a potential role for POU2AF1 and BTG4 in the deletion of 11q23 in chronic lymphocytic leukemia.
Genes Chromosomes Cancer
2005
, vol. 
43
 (pg. 
1
-
10
)
44
Weiss
 
RA
Eichner
 
R
Sun
 
TT
Monoclonal antibody analysis of keratin expression in epidermal diseases: a 48- and 56-kdalton keratin as molecular markers for hyperproliferative keratinocytes.
J Cell Biol
1984
, vol. 
98
 (pg. 
1397
-
1406
)
45
Haider
 
AS
Peters
 
SB
Kaporis
 
H
, et al. 
Genomic analysis defines a cancer-specific gene expression signature for human squamous cell carcinoma and distinguishes malignant hyperproliferation from benign hyperplasia.
J Invest Dermatol
2006
, vol. 
126
 (pg. 
869
-
881
)
46
Li
 
Z
Godinho
 
FJ
Klusmann
 
J-H
Garriga-Canut
 
M
Yu
 
C
Orkin
 
SH
Developmental stage-selective effect of somatically mutated leukemogenic transcription factor GATA1.
Nat Genet
2005
, vol. 
37
 (pg. 
613
-
619
)
47
Yamanaka
 
K-i
Yawalkar
 
N
Jones
 
DA
, et al. 
Decreased T-cell receptor excision circles in cutaneous T-cell lymphoma.
Clin Cancer Res
2005
, vol. 
11
 (pg. 
5748
-
5755
)
48
Yawalkar
 
N
Ferenczi
 
K
Jones
 
DA
, et al. 
Profound loss of T-cell receptor repertoire complexity in cutaneous T-cell lymphoma.
Blood
2003
, vol. 
102
 (pg. 
4059
-
4066
)

Sign in via your Institution