Abstract

The malignant Reed-Sternberg cell of Hodgkin’s disease, first described a century ago, has resisted in-depth analysis due to its extreme rarity in lymphomatous tissue. To directly study its genome-wide gene expression, approximately 11,000,000 bases (27,518 cDNA sequences) of expressed gene sequence were determined from living single Reed-Sternberg cells, Hodgkin’s tissue, and cell lines. This approach increased the number of genes known to be expressed in Hodgkin’s disease by 20-fold to 2,666 named genes. The data here indicate that Reed-Sternberg cells from both nodular sclerosing and lymphocyte predominant Hodgkin’s disease were derived from an unusual B-cell lineage based on a comparison of their gene expression to approximately 40,000,000 bases (10⁵ sequences) of expressed gene sequence from germinal center B cells (GCB) and dendritic cells. The data set of expressed genes, reported here and on the World Wide Web, forms a basis to understand the genes responsible for Hodgkin’s disease and develop novel diagnostic markers and therapies. This study of the rare Reed-Sternberg cell, concealed in its heterogeneous cellular context, also provides a formidable test case to advance the limit of analysis of differential gene expression to the single disease cell.

HODGKIN’S DISEASE stands apart from other cancers by the extraordinary and unexplained scarcity of its neoplastic (Reed-Sternberg) cell in involved tissues. Because Reed-Sternberg cells are outnumbered by surrounding nonneoplastic cells by approximately 1,000:1,1-3 direct extraction approaches fail to determine either the gene expression profile or genetics of the Reed-Sternberg cell. Here, we applied a genome-wide strategy to examine regulated gene expression in single Reed-Sternberg cells and infer their likely cell of origin.

Reed-Sternberg cells are clonal, aneuploid, and pathognomonic of Hodgkin’s disease.1-3 Reed-Sternberg cells, which are thought to be biologically active, presumably secrete peptides that elicit the surrounding inflammatory cell infiltrate and consequent systemic symptoms.1,3 Rearrangement and somatic hypermutation of their Ig heavy chain variable (VH) genes suggest that Reed-Sternberg cells are germinal center B lymphocytes (GCB) that carry nonproductive Ig genes but resist culling by apoptosis.4 Clustering of Hodgkin’s disease cases may be a clue to an infectious etiology, and Epstein-Barr virus (EBV) is present in the Reed-Sternberg cells of many, but not all, cases of Hodgkin’s disease.1-3 However, a pathogenetic relationship of EBV with Hodgkin’s disease has not been formally established.2 Little genetic information is known because of the difficulty in obtaining cells for molecular or cytogenetic analysis, and pathogenetic associations with specific genes or cytogenetic abnormalities have not yet been established. Hodgkin’s disease is sometimes familial, and the genetically identical twin siblings of affected monozygous twins carry a 100-fold increased risk of Hodgkin’s disease,5 but no specific genetic locus has been identified.

Gene expression studies of Hodgkin’s disease have reported results for only approximately 100 gene products and have been largely limited to in situ microscopy, a technique restricted to those proteins and genes for which specific antibodies or nucleic acid probes have been prepared.1-3 In an attempt to globally analyze gene expression in Reed-Sternberg cells, we developed a single cell strategy whereby cDNA libraries were prepared from individual, viable Reed-Sternberg cells selected by micropipette from cell suspensions of primary tissue suitable for analysis by blot probing6 and specific polymerase chain reaction (PCR).7 Sequencing of cDNA libraries of single Reed-Sternberg cells has been combined here with sequence analysis of Hodgkin’s-derived cell lines and primary Hodgkin’s tissues and compared with putative normal cells of origin to determine the regulated gene expression profile of the Reed-Sternberg cell.

MATERIALS AND METHODS

Cell Sources

Hodgkin’s-derived cell lines.

cDNA was prepared as described8 from two continuous cell lines, L428 and KMH2, that were derived from 2 relapsed patients with Hodgkin’s disease and that share phenotypic characteristics with primary Reed-Sternberg cells.9 

Hodgkin’s disease tissue.

Two cDNA libraries were prepared (designated HD) from one previously fresh frozen, unfixed lymph node from a patient with nodular sclerosing Hodgkin’s disease, as described.8 

Single Reed-Sternberg cells.

cDNA libraries were used that were previously prepared from four single Reed-Sternberg cells obtained by viable cell micromanipulation from two primary Hodgkin’s specimens, as reported.6 These correspond to cell numbers A1 and A14 from classical lymphocyte-predominant Hodgkin’s disease (cells 14 and 25, respectively) and cells L1 and L8 from nodular sclerosing Hodgkin’s disease (cells 34 and 41, respectively). Primary amplification of the library and the addition of restriction sites to the ends of the cDNA occurred during the first PCR reaction with 36-mers to create an Xho I site at the 5′ end and an EcoRI site at the 3′ end.6,7 These were digested and ligated into λ phage using Stratagene Lambda Zap vector (Stratagene, La Jolla, CA). Reamplification was performed using a primer consisting only of the 36 nt clamp sequence without the dT24 sequence for subsequent sequencing. cDNA libraries from single cells contained actin mRNA signals by PCR and at approximately equal levels but did not contain genomic (intronic) DNA (not shown).

GCB.

The preparation of GCB cDNA libraries has been reported by the Cancer Genome Anatomy Project (CGAP) on the World Wide Web (www.ncbi.nlm.nih.gov/CGAP) as tissue sample CGAP123.1 and libraries CGAP-GCB0 and CGAP-GCB1. A nonneoplastic human tonsil was dispersed into cell suspension and flow sorted into a GCB fraction on the basis of surface membrane phenotype, IgDneg, CD20dim. Normalized cDNA libraries were prepared from polyA-selected mRNA and sequenced. Sequences are deposited on the CGAP Web site.

Dendritic cells.

Human dendritic cells were generated from peripheral blood mononuclear cells by adherence plating and cultured for 7 days with 200 ng/mL recombinant granulocyte-macrophage colony-stimulating factor (rGM-CSF) and 200 U/mL recombinant interleukin-4 (rIL-4), as described.10 The culture was depleted of CD2+ and CD19+ cells by means of immunomagnetic beads coated with specific antibodies. This procedure gave a greater than 97% pure CD1a+ and CD14− dendritic cell preparation. Cells were validated as dendritic by the following phenotype: HLA-DR+, CD14−, CD1a+, CD1b+, CD83+, CD80+, CD86+, CD19−, CD3−, and CD16−. A cDNA library was prepared.8 

Sequence and Analysis

Sequencing was performed by automated sequencers (ABI 371; Applied Biosystems, Perkin-Elmer, Norwalk, CT), and sequence information was compared, using the BLAST algorithm, against an expressed gene database of approximately 2 × 10⁶ human sequences stored in a Sybase relational database. Computational analyses were performed on central servers, including a Convex computer and Sun SPARC 200 computers.8 We have previously applied this database search strategy to the detection and cloning of tissue-specific genes; eg, a prostate-specific gene, NKX3.1, was cloned and mapped to chromosome 8p21.8 To estimate the number of distinct genes, the following calculations were made. Because expressed sequence tags (ESTs) with different GenBank accession matches may represent one gene, all matched GenBank sequences were compared with each other after masking for repetitive elements (RepeatMasker program; http://ftp.genome.washington.edu/RM/RepeatMasker.html). Any two GenBank sequences with a BLASTN match11 that included an ungapped stretch of at least 48 of 50 identically matching nucleotides were grouped together as a cluster representing a distinct gene. GenBank numbers and occurrence frequency can be found on the World Wide Web at www.hodgkins.georgetown.edu (Fig 1).
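The grouping rule described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, the input is assumed to be an ungapped pair of aligned segments of equal length, and merging matches transitively (single linkage via union-find) is an assumption, because the text states only that matching sequences "were grouped together."

```python
WINDOW = 50         # window length stated in the text
MIN_IDENTICAL = 48  # identical nucleotides required within one window


def has_qualifying_stretch(aligned_a, aligned_b,
                           window=WINDOW, min_identical=MIN_IDENTICAL):
    """Return True if two ungapped aligned segments share any window of
    `window` positions containing at least `min_identical` identities."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("ungapped segments must have equal length")
    matches = [a == b for a, b in zip(aligned_a, aligned_b)]
    if len(matches) < window:
        return False
    count = sum(matches[:window])  # identities in the first window
    if count >= min_identical:
        return True
    # Slide the window one position at a time, updating the count.
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        if count >= min_identical:
            return True
    return False


def cluster_accessions(accessions, matched_pairs):
    """Group accessions into clusters (assumed: single linkage).

    `matched_pairs` holds (accession, accession) tuples that already
    passed has_qualifying_stretch; each cluster stands for one gene.
    """
    parent = {acc: acc for acc in accessions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for acc in accessions:
        clusters.setdefault(find(acc), set()).add(acc)
    return list(clusters.values())
```

With hypothetical accessions acc1 to acc4 and qualifying matches acc1/acc2 and acc2/acc3, the sketch returns two clusters, {acc1, acc2, acc3} and {acc4}, which would be counted as two distinct genes.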

Fig. 1.

Genes expressed in Hodgkin’s disease. The expressed sequences from Hodgkin’s cell and tissue sources are listed by their GenBank assignment and gene name. A total of 3,784 distinct GenBank accessions were found in the Hodgkin’s dataset and accounted for 2,666 distinct genes (see text). The number of individual sequences (templates), along with the number of libraries (in parentheses) and occurrence in single cells (*), is provided for each gene expressed in the Hodgkin’s dataset. To compare expression of Hodgkin’s disease sources with the 2 × 10⁶ expressed sequences from other human cells, shown are the number of sequences (“Templates in other libraries”), the calculated R(g) value (“Estimated factor of Hodgkin’s: other libraries”), and 95% confidence interval. Genes are ranked in descending order by their lower limit of the 95% confidence interval. Detail of the distribution of expressed sequence counts of each gene among the Hodgkin’s libraries is obtained by a link from the template number for that gene. GenBank numbers are linked to the reported sequences in GenBank and published references are linked to PubMed. Complete tables are available at www.hodgkins.georgetown.edu.

Library Comparisons

ESTs sampled from different cell cDNA libraries were compared with those of Hodgkin’s disease cells and scored for relative abundance according to the following formula: R(g) = [f1(g)/M]/[f2(g)/N], where library 1 consists of M distinct genes, f1(g) is the count of ESTs encoding a gene g in library 1, N is the count of distinct genes in library 2, and f2(g) is the count of ESTs representing gene g in library 2. The proportion of gene g transcripts in cell source 1 is f1(g)/M; for source 2, the estimate is f2(g)/N. When f1(g) and f2(g) are large, the 95% confidence interval of R(g) narrows and the data more precisely estimate the abundance ratio. If the 95% confidence interval for R(g) is (L,U), where L and U are the lower and upper endpoints, respectively, then when L is greater than 1, there is 95% confidence that R(g) is greater than 1 and the gene is overexpressed in source 1 relative to source 2; if U is less than 1, there is 95% confidence that R(g) is less than 1 and the gene is overexpressed in source 2 relative to source 1. To consider the effect of small counts, genes were ranked by confidence interval in the accompanying tables according to L when R(g) ≥ 1 and by U when R(g) < 1.
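The scoring can be sketched in a few lines of Python. The ratio follows the formula given above. The method used to compute the confidence interval is not stated in the text, so the sketch assumes a normal approximation on the log of the ratio with standard error sqrt(1/f1 + 1/f2). Function names are hypothetical, and genes with a zero count would need a correction that the text does not describe, so the sketch rejects them.

```python
import math


def abundance_ratio(f1, m, f2, n):
    """R(g) = [f1(g)/M] / [f2(g)/N], as defined in the text."""
    if m <= 0 or n <= 0 or f2 <= 0:
        raise ValueError("M, N, and f2 must be positive")
    return (f1 / m) / (f2 / n)


def log_ratio_interval(r, f1, f2, z=1.96):
    """Assumed 95% interval: normal approximation on log R(g),
    with standard error sqrt(1/f1 + 1/f2). Not stated in the source."""
    if f1 <= 0 or f2 <= 0:
        raise ValueError("zero counts need a correction not covered here")
    half_width = z * math.sqrt(1.0 / f1 + 1.0 / f2)
    return r * math.exp(-half_width), r * math.exp(half_width)


def classify(lower, upper):
    """Apply the decision rule on the interval endpoints (L, U)."""
    if lower > 1:
        return "overexpressed in source 1"
    if upper < 1:
        return "overexpressed in source 2"
    return "inconclusive"


if __name__ == "__main__":
    # MAGE-4a values quoted in Results: 20 vs 5 templates, R(g) = 219.
    low, high = log_ratio_interval(219.0, f1=20, f2=5)
    print(f"{low:.1f} to {high:.1f}: {classify(low, high)}")
```

Under this assumed method, the MAGE-4a example gives an interval of about 82.2 to 583.5, close to the 82 to 583 reported in Results, which suggests (but does not prove) that a similar approximation was used.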

RESULTS AND DISCUSSION

Expressed Genes of Hodgkin’s Disease

ESTs were sequenced from a total of 27,518 cDNA clones from libraries prepared from Hodgkin’s disease sources. To determine the number of distinct genes in the Hodgkin’s libraries, sequences were compared with a nonredundant database of all human nucleotide sequences in GenBank. In all, 11,072 sequences had GenBank assignments, comprising 3,784 different GenBank accession numbers that encompassed 2,666 distinct, named genes.

Expressed sequences obtained from single cells (n = 4,618) had a broad distribution of genes with no unexpected overrepresentation of any particular sequence (see table “Hodgkin’s single cells vs. Hodgkin’s cell lines” at www.hodgkins.georgetown.edu and Fig 1). Taken together with the gene expression seen by hybridization and PCR,6,7 the sequence data confirm the general representationality of the single cell expressed sequence libraries. To expand the basis of gene expression among Hodgkin’s-derived cells, 11,109 sequences were determined from cDNA libraries of two cell lines. Few genes were abundantly expressed by single cells that were not found in cell lines (see the accompanying table “Hodgkin’s single cells vs. Hodgkin’s cell lines” at www.hodgkins.georgetown.edu). Excluding unique sequences of Igs, histocompatibility antigens, and repetitive endogenous retroviral sequences, there were only four distinct genes found with more than two occurrences in single cells but not observed in cell lines. Thus, sequences from single Reed-Sternberg cells and the cell lines were grouped together for subsequent analysis.

Examples of expressed sequences with relative overrepresentation in the Hodgkin’s libraries are shown in Table 1. Ig was a frequent message in the single Reed-Sternberg cells and accounted for as much as 19% of the messages sequenced in one case, consistent with a B-cell phenotype. Further support of a B-cell lineage in single cells and cell lines was abundant Ig messages and B-cell–associated genes BL34,12 B7.1-CD80,13 and CD20.14 The known Reed-Sternberg cell expression of a number of genes was confirmed, eg, tumor necrosis factor β (TNFβ), CD30, and nuclear factor κ-B (NFκB).1-3 However, based on prior reporting of the expression of approximately 100 genes in Hodgkin’s disease, greater than 95% of the 2,666 named genes reported here were not previously known to be expressed in Hodgkin’s disease.

Table 1.

Genes Expressed in Hodgkin’s Disease

| GenBank | Gene Name | Templates, HD | Templates, Other | HD:Other R(g) | 95% Confidence Interval |
| --- | --- | --- | --- | --- | --- |
| U10687 | MAGE-4a | 13 | 1 | 711 | 93, 5435 |
| M77844 | Oculorhombin (aniridia PAX6) | 5 | 0 | 91 | 22, 381 |
| AF044197 | B-lymphocyte chemoattractant | 3 | 0 | 383 | 20, 7411 |
| L43400 | Chromosome 5 P1 clone 792C12 | 3 | 0 | 383 | 20, 7411 |
| X86174 | SSX1 | 4 | 3 | 73 | 16, 326 |
| U54777 | MSH6 | 8 | 13 | 34 | 14, 81 |
| AF026692 | Frizzled-related protein frpHE | 9 | 15 | 33 | 14, 75 |
| U41206 | MSH2 |  | 0 | 274 | 13, 5696 |
| U83171 | Macrophage-derived chemokine (MDC) | 19 | 73 | 14 | 9, 24 |
| U03187 | IL-12 receptor | 6 | 19 | 17 | 7, 43 |
| U90582 | Chromosome 11p15.5 |  | 6 | 27 | 7, 109 |

A sample of genes not previously known to be expressed in Hodgkin’s disease selected for Hodgkin’s disease-association [relatively high R(g) values, see text] and biological interest. Templates refer to the number of times a particular sequence was detected in Hodgkin’s disease (HD) and all other human cell sequences (Other). The relative incidence of expression in Hodgkin’s disease as compared with other cells [R(g)] and 95% confidence interval were calculated as described. Genes are ranked in descending order of their lower (L) limit of the 95% confidence interval of the R(g) value. A complete list of the 2,666 distinct genes is deposited on the World Wide Web (www.hodgkins.georgetown.edu, table entitled “Hodgkin’s cells/tissues vs. entire database”).

The origin of the Reed-Sternberg cell of Hodgkin’s disease was addressed by comparing the relatedness of Hodgkin’s cell gene expression to other cell types on the basis of R(g) value and 95% confidence interval. When compared with the entire dataset of 2 × 10⁶ sequences from more than 800 human cell cDNA libraries, several genes emerged that were overrepresented in Hodgkin’s libraries (Table 1, complete list at www.hodgkins.georgetown.edu, table entitled “Hodgkin’s cells/tissues vs. other cell types”). For example, the melanoma-associated tumor antigen, MAGE-4a,15 was encountered 20 times in the Hodgkin’s libraries, but only 5 times in all other libraries. The relative frequency [R(g)] of MAGE-4a was 219, with a 95% confidence interval of 82 to 583. Several genes disproportionately expressed in Hodgkin’s disease included some that were expected, such as CD30, bcl-6, and NFκB,1-3 but other genes were encountered whose expression in Hodgkin’s disease was not known, eg, the oculorhombin (aniridia, Pax-6) gene, a paired box transcription factor found in the developing neuroretina that is not known to be expressed in the immune system or its neoplasms.16 

Reed-Sternberg and Germinal Center B Cells

The comparison of Hodgkin cell gene expression with that of GCBs was performed with approximately 49,000 sequences from cDNA libraries of cell-sorted GCBs (library codes NCI_CGAP_GCB0 and NCI_CGAP_GCB1 at www.ncbi.nlm.nih.gov/CGAP). The GCB libraries contained 5,139 different GenBank accession numbers that comprised an estimated 4,465 distinct genes (Table 2; complete list on the website, see table entitled “Hodgkin’s cells vs. germinal center B cells”). For the purpose of the specific comparison between Hodgkin’s cells and GCB, the whole tissues (HD) were excluded, because they were largely composed of T cells and would exaggerate differences between the two library classes.

Table 2.

Genes Overrepresented in Hodgkin’s Disease Versus Germinal Center B Cells

| GenBank | Gene Name | Templates, HD | Templates, B | HD:B R(g) | 95% Confidence Interval |
| --- | --- | --- | --- | --- | --- |
| X03558 | Elongation factor 1-α | 252 | 3 | 794 | 50, 12740 |
| Z28407 | Ribosomal protein L8 | 102 | 1 | 195 | 17, 882 |
| X635526 | Elongation factor 1-γ | 60 | 0 | 232 | 14, 3745 |
| M17885 | Ribosomal phosphoprotein P0 | 56 | 2 | 54 | 13, 220 |
| L19739 | Metallopanstimulin | 30 | 0 | 117 | 7, 1909 |
| U83171 | Macrophage-derived chemokine (MDC) | 16 | 0 | 63 | 4, 1052 |
| L15320 | Nucleophosmin B23 (NPM) | 22 | 4 | 11 | 4, 31 |
| U10687 | MAGE-4a | 11 | 0 | 44 | 3, 747 |
| M64241 | Wilm’s tumor-related (QM) | 10 | 0 | 40 | 2, 686 |
| X07417 | Retrotransposon SINE-R11 | 9 | 0 | 36 | 2, 625 |

Listed are examples of genes whose adjusted relative occurrence values [R(g)] in Hodgkin’s disease (HD) were high compared with germinal center B cells (B). A complete list can be found at www.hodgkins.georgetown.edu (“Hodgkin’s cells vs. germinal center B cells”). Germinal center B cell (GCB) cDNA libraries are reported at www.ncbi.nlm.nih.gov/CGAP.

A concern regarding the comparison of libraries was that GCB libraries were normalized, whereas the Hodgkin’s libraries were not. Normalization is a smoothing effect achieved by self-subtracting sequences through reassociation to reduce redundancy of very highly expressed genes and bring their levels closer to the levels of intermediately expressed genes.17 It has little effect on intermediately or rarely expressed genes and does not completely remove cDNA clones. For genes whose expression was high in Hodgkin’s cells but low in GCB (top of the list in table entitled “Hodgkin’s cells vs. germinal center B cells”), the sequences that were rare in GCB after normalization must also have been rare before. The rarest of sequences, less than 0.02% (defined as components II and III in Soares et al17), would be expected to have less than 10 counts each among the 49,000 GCB sequences. Of the top 500 sequences in the comparison, “Hodgkin’s cells vs. germinal center B cells” (high in Hodgkin’s cells relative to GCB), 490 (98%) cDNAs had fewer than 10 counts each in GCB. Thus, the comparison is valid for the vast majority of genes with high relative expression in Hodgkin’s cells. For sequences at the bottom of the table “Hodgkin’s cells vs. germinal center B cells”, ie, those high in GCB relative to Hodgkin’s cells, the data are unlikely to be influenced by normalization, because normalization only increases the average frequency of rare species by about 50%.17 Thus, a Hodgkin’s cell:GCB ratio of 0.10 would only be modestly increased to 0.15, which is still a significant difference between the two library classes. These assumptions are conservative, because the GCB libraries were subjected to only one round of normalization, compared with two in Soares et al.17 Therefore, the effect of normalization on the comparison of GCB and Hodgkin’s cell libraries is likely less than estimated above.
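The arithmetic behind this argument can be checked with a short Python sketch. It uses only the figures quoted in the paragraph above; the variable and function names are hypothetical, and the 50% inflation is applied as a simple multiplicative correction, which is an assumption about how the adjustment was reasoned.

```python
RARE_FRACTION = 0.0002   # "less than 0.02%" expressed as a proportion
GCB_SEQUENCES = 49_000   # approximate number of GCB sequences

# Expected count of a sequence at the rare-class threshold.
expected_rare_count = RARE_FRACTION * GCB_SEQUENCES   # about 9.8, under 10

# Share of the top 500 cDNAs with fewer than 10 counts in GCB.
share_rare_in_top_500 = 490 / 500                     # 0.98


def corrected_ratio(observed_ratio, inflation=0.5):
    """Undo an assumed normalization inflation of the GCB frequency.

    If normalization raised the GCB frequency by `inflation` (50%),
    the unnormalized Hodgkin:GCB ratio is larger by that same factor.
    """
    return observed_ratio * (1.0 + inflation)


if __name__ == "__main__":
    print(expected_rare_count, share_rare_in_top_500, corrected_ratio(0.10))
```

An observed ratio of 0.10 therefore corresponds to about 0.15 before normalization, which matches the adjustment quoted in the text.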

Examples of genes with overexpression in Hodgkin’s disease as compared with GCB are displayed in Table 2 (taken from website table “Hodgkin’s cells vs. germinal center B cells”). A marked difference was the frequent expression of macrophage-derived chemokine (MDC) in Hodgkin’s cells, whereas this sequence did not occur in GCB libraries. MDC was thought to have differentially regulated expression restricted to dendritic cells.18 Several B-cell lineage genes expressed in GCB were not found in Hodgkin’s disease cells (Table 3; excerpted from the last genes listed in the website table entitled “Hodgkin’s cells vs. germinal center B cells”). Genes expressed by GCB but that were not detected in the Hodgkin’s libraries included those whose expression, when reduced or lost, might release cells from normal growth or apoptosis controls, such as Rb, BUB3, and PCD2 (Table 3).

Table 3.

Genes Underrepresented in Hodgkin’s Disease Versus Germinal Center B Cells

| GenBank | Gene Name | Templates, HD | Templates, B |
| --- | --- | --- | --- |
| U92436 | Mutated in multiple advanced cancers protein (MMAC1) | 0 | 29 |
| X66087 | a-myb | 0 | 25 |
| AF047472 | Spleen mitotic checkpoint BUB3 | 0 | 18 |
| M16038 | Lyn tyrosine kinase | 0 | 17 |
| S78085 | Programmed cell death-2/Rp8 (PCD2) | 0 | 16 |
| U34360 | Lymphoid nuclear protein (LAF-4) | 0 | 15 |
| X52056 | spi-1 proto-oncogene | 0 | 12 |
| L04288 | Cyclophillin-related protein | 0 | 11 |
| D14540 | Mixed lineage leukemia (MLL) | 0 | 10 |
| J03779 | Common acute lymphoblastic leukemia antigen (CALLA) | 0 | 10 |
| L78132 | Prostate cancer tumor antigen (pcta-1) | 0 | 10 |
| M27866 | Retinoblastoma protein (Rb) | 0 | 10 |
| S75217 | CD79A | 0 | 4 |
| M89957 | CD79B | 0 |  |
| D83597 | RP105 signaling molecule | 0 | 8 |
| M28170 | CD19 | 0 | 8 |
| S76617 | Blk | 0 | 9 |
| U07349 | Germinal center kinase | 0 |  |

Examples of genes whose expression was not detected in Hodgkin’s cells and cell lines but that were expressed by GCB. The number of encounters is given as templates for Hodgkin’s disease (HD: single cells, cell lines) and GCB (B). Genes listed here are found in the table “Hodgkin’s cells vs. germinal center B cells,” which can be seen at the website given above (see legend for Table 2).

A Dendritic Cell Origin of the Reed-Sternberg Cell?

Dendritic cells have been proposed as the origin of Reed-Sternberg cells based on their similarities of immunological phenotype and function as antigen-presenting cells.1,3,19 From more than 50,000 sequences determined from a dendritic cell cDNA library, 25,823 had known GenBank assignments accounting for 5,516 distinct GenBank accessions and 4,399 distinct genes (deposited at the website table entitled “Hodgkin’s cells vs. dendritic cells”). The Hodgkin’s/Reed-Sternberg single cell and cell line sequence dataset (omitting the whole Hodgkin’s lymph node due to its many cell types) was notable for its differences when compared with dendritic cells. In contrast to the many encounters of Ig genes in the Hodgkin’s cells, Ig messages were absent in dendritic cells. Conversely, dendritic cells, but not Reed-Sternberg cells or cell lines, expressed the macrophage-dendritic cell lineage genes Mac1 (CD11b)20 and CD68.21 Hodgkin’s cells and cell lines expressed MDC, a gene enriched in dendritic cells, but not another CC chemokine, monocyte chemoattractant protein-4 precursor (MCP-4).22 Two conclusions can be drawn: (1) the single Reed-Sternberg cells are not dendritic in origin and (2) the single cells obtained by micromanipulation were not mistakenly macrophages.

The Single Cancer Cell

The search for genomic mutation and modulated gene expression accounting for the malignant state has been a painstaking process involving genetic linkage analysis, loss of heterozygosity and screening with specific probes for expression of known genes at the mRNA and protein levels. Recent advances in the capacity to address cancer cell-associated gene expression, using representational difference analysis (RDA),23 serial analysis of gene expression (SAGE),24 and EST sequencing,25 have begun to improve the efficiency of screening differential gene expression between cancer and normal cells. Still, each technique faces the same natural obstacle to the study of primary tissue; namely, cellular populations in cancer are not homogeneous. As a consequence, RNA extracted from tumor tissue is derived from cell types in addition to neoplastic cells, such as stromal, endothelial, and inflammatory cells. Although the amount of mRNA in a single cell is insufficient for either conventional poly-A mRNA purification or SAGE analysis,24 it is adequate for high throughput automated sequencing and database analysis, as shown here. The ability to explore genes at the level of a single, defined cell has broad potential for complex processes such as the nervous system, the developing organism, and pathological conditions. However, gene expression analysis of single cells is not stochastic and may be biased in its amplification of some mRNAs.26 Conversely, lack of detection of expected messages may be a consequence of mRNA copy number, stability of mRNA, lack of poly-A tails, or specificity of the sequence itself. The statistical power of single cell gene expression technology should be enhanced by the efficiency of in situ cell collection techniques, such as laser capture microdissection, once its resolution has consistently reached the single cell.27 A frequently represented message provides insight into differential gene expression in vivo. For example, the high frequency of Ig messages and restriction to a single light chain type suggests a (clonal) B cell and corroborates independent evidence of clonal Ig gene rearrangements in the genomic DNA of single Reed-Sternberg cells.4 

With increasingly high throughput technologies, such as capillary-based sequencing28 and gene expression microarrays,29 large numbers of sequences should be accessible from small, carefully selected cells. Defined computer search logic for comparison of gene expression to other cell types and placement of short sequences into known fuller-length cDNAs creates a platform for the targeted study of regulated genes in a pathological cell. The present investigation demonstrates a model in which a rare cell, obscured by heterogeneous tissue, can be unveiled by gene sequencing and statistical analysis to disclose disease-associated genes.

ACKNOWLEDGMENT

The cell lines, L428 and KMH2, were generously provided by Drs Volker Diehl and Hiroshi Kamesaki, respectively. The authors gratefully acknowledge the technical contribution of Maria Fergusson and the helpful advice of Drs Reinhard Ebner and Steven Ruben and are particularly indebted to the staff of the HGS sequencing facility.

Supported in part by the American Cancer Society, DHP112 (to J.C.), and the O. Benwood Hunter Endowment (to J.C.).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. section 1734 solely to indicate this fact.

REFERENCES

1. Gruss HJ, Kadin ME: Pathophysiology of Hodgkin’s disease: Functional and molecular aspects. Balieres Clin Hematol 9:417, 1996
2. Stein H, Hummel M, Marafioti T, Anagnostopoulos I, Foss HD: Molecular biology of Hodgkin’s disease. Cancer Surv 30:107, 1997
3. Cossman J, Messineo C, Bagg A: Reed-Sternberg cell: Survival in a hostile sea. Lab Invest 78:229, 1998
4. Kuppers R, Rajewsky K: The origin of Hodgkin and Reed/Sternberg cells in Hodgkin’s disease. Annu Rev Immunol 16:471, 1998
5. Mack TM, Cozen W, Shibata DK, Weiss LM, Nathwani BN, Hernandez AM, Taylor CR, Hamilton AS, Deapen DM, Rappaport EB: Concordance for Hodgkin’s disease in identical twins suggesting genetic susceptibility to the young-adult form of the disease. N Engl J Med 332:413, 1995
6. Trumper LH, Brady G, Loke SL, Greisser H, Wagman R, Braziel R, Gascoyne RD, Vicini S, Iscove NN, Cossman J, Mak T: Single-cell analysis of Hodgkin and Reed-Sternberg cells: Molecular heterogeneity of gene expression and p53 mutations. Blood 81:3097, 1993
7. Messineo C, Jamerson MH, Hunter E, Bagg A, Irving S, Cossman J: Gene expression by single Reed-Sternberg cells: Pathways of apoptosis and activation. Blood 91:2443, 1998
8. He WW, Sciavolino PJ, Wing J, Augustus M, Hudson P, Meissner PS, Curtis RT, Shell BK, Bostwick DG, Tindall DJ, Gelmann EP, Abate-Shen C, Carter KC: A novel human prostate-specific, androgen-regulated homeobox gene (NKX3.1) that maps to 8p21, a region frequently deleted in prostate cancer. Genomics 43:69, 1997
9. Drexler HG: Recent results on the biology of Hodgkin and Reed-Sternberg cells. II. Continuous cell lines. Leuk Lymphoma 9:1, 1993
10. Sallusto F, Lanzavecchia A: Efficient presentation of soluble antigen by cultured human dendritic cells is maintained by granulocyte/macrophage colony-stimulating factor plus interleukin 4 and downregulated by tumor necrosis factor alpha. J Exp Med 179:1109, 1994
11. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 215:403, 1990
12. Hong JX, Wilson GL, Fox CH, Kehrl JH: Isolation and characterization of a novel B cell activation gene. J Immunol 150:3895, 1993
13. Freeman GJ, Freedman AS, Segil JM, Lee G, Whitman JF, Nadler LM: B7, a new member of the Ig superfamily with unique expression on activated and neoplastic B cells. J Immunol 143:2714, 1989
14. Tedder TF, Klejman G, Schlossman SF, Saito H: Structure of the gene encoding the human B lymphocyte differentiation antigen CD20 (B1). J Immunol 142:2560, 1989
15. De Plaen E, Arden K, Traversari C, Gaforio JJ, Szikora JP, De Smet C, Brasseur F, van der Bruggen P, Lethe B, Lurquin C, Brasseur R, Chomez P, De Backer O, Cavenee W, Boon T: Structure, chromosomal localization, and expression of 12 genes of the MAGE family. Immunogenetics 40:360, 1994
16. Plaza S, Grevin D, MacLeod K, Stehelin D, Saule S: Pax-QNR/Pax-6, a paired- and homeobox-containing protein, recognizes Ets binding sites and can alter the transactivating properties of Ets transcription factors. Gene Exp 4:43, 1994
17. Soares MB, Bonaldo MF, Jelene P, Su L, Lawton L, Estrafatiadis A: Construction and characterization of a normalized cDNA library. Proc Natl Acad Sci USA 91:9228, 1994
18. Godiska R, Chantry D, Raport CJ, Sozzani S, Allavena P, Leviten D, Mantovani A, Gray PW: Human macrophage-derived chemokine (MDC), a novel chemoattractant for monocytes, monocyte-derived dendritic cells, and natural killer cells. J Exp Med 185:1595, 1997
19. Pinkus GS, Pinkus JL, Langhoff E, Matsumura F, Yamashiro S, Mosialos G, Said JW: Fascin, a sensitive new marker for Reed-Sternberg cells of Hodgkin’s disease. Evidence for a dendritic or B cell derivation? Am J Pathol 150:543, 1997
20. Corbi AL, Kishimoto TK, Miller LJ, Springer TA: The human leukocyte adhesion glycoprotein Mac-1 (complement receptor type 3, CD11b) alpha subunit. Cloning, primary structure, and relation to the integrins, von Willebrand factor and factor B. J Biol Chem 263:12403, 1988
21. Holness CL, Simmons DL: Molecular cloning of CD68, a human macrophage marker related to lysosomal glycoproteins. Blood 81:1607, 1993
22. Berkhout TA, Sarau HM, Moores K, White JR, Elshourbagy N, Appelbaum E, Reape RJ, Brawner M, Makwana J, Foley JJ, Schmidt DB, Imburgia C, McNulty D, Matthews J, O’Donnell K, O’Shannessy D, Scott M, Groot PHE, Macphee C: Cloning, in vitro expression, and functional characterization of a novel human CC chemokine of the monocyte chemotactic protein (MCP) family (MCP-4) that binds and signals through the CC chemokine receptor 2B. J Biol Chem 272:16404, 1997
23. Liang P, Pardee AB: Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 257:967, 1992
24. Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B, Kinzler KW: Gene expression profiles in normal and cancer cells. Science 276:1268, 1997
25. Adams MD, Kerlavage AR, Fleischmann RD, Fuldner RA, Bult CJ, Lee NH, Kirkness EF, Weinstock KG, Gocayne JD, White O: Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature 377:3, 1995 (suppl)
26. McAdams HH, Arkin A: Stochastic mechanisms in gene expression. Proc Natl Acad Sci USA 94:814, 1997
27. Schutze K, Lahr GN: Identification of expressed genes by laser-mediated manipulation of single cells. Nat Biotechnol 16:737, 1998
28. Venter JC, Adams MD, Sutton GG, Kerlavage AR, Smith HO, Hunkapiller M: Shotgun sequencing of the human genome. Science 280:1540, 1998
29. Brown PO, Botstein D: Exploring the new world of the genome with DNA microarrays. Nat Genet 21:33, 1999

Author notes

Address reprint requests to Jeffrey Cossman, MD, NW 103 Medical-Dental Bldg, Georgetown University Medical Center, 3900 Reservoir Rd, NW, Washington, DC 20007; e-mail: cossmanj@gunet.georgetown.edu; website: www.hodgkins.georgetown.edu.