Cytogenetic abnormalities (CA) are a hallmark of multiple myeloma (MM) and other cancers and are commonly used as clinical parameters for determining disease stage and guiding therapy decisions. Traditional techniques, including fluorescence in situ hybridization (FISH) and karyotyping, as well as the more recently developed array-based comparative genomic hybridization (aCGH), are expensive and time-consuming. As gene expression profiling (GEP) becomes more integrated into the diagnostic workup of MM and is increasingly used for risk stratification and for tailoring therapy, we are presented with vast amounts of data that should reflect disease-associated alterations of the genome. We therefore sought to develop a GEP-based virtual CA (vCA) model to predict CA in MM.
We determined genome-wide gene expression profiles and DNA copy numbers (CNs) in purified plasma cell samples obtained from 92 newly diagnosed MM patients, using the Affymetrix GeneChip and the Agilent aCGH platforms, respectively. We identified 1,114 CN-sensitive genes by computing Pearson's correlation coefficient (PCC) between gene expression levels and the copy numbers of the corresponding DNA loci, keeping the false discovery rate below 5%. On the basis of these CN-sensitive genes, we developed a vCA model for predicting CA in MM patients by means of GEP. The model focuses particularly on chromosomes 3, 5, 7, 9, 11, 13, 15, 19, and 21, as well as the 1p, 1q, and 6q segments, which are the most commonly altered chromosome regions in MM plasma cells. The reference CA (rCA) of a given chromosome region were determined from the mean signal of the aCGH probes located in that region. The rCA values could be used to distinguish among amplification, deletion, and normal copy number. The predicted CA (pCA) of a given chromosome region were determined as follows. First, we calculated the mean expression level of the CN-sensitive genes within the region. Then, by training the model on the GEP data set of 92 MM samples, we set a cutoff on this mean expression level for each chromosome region so that the pCA were most consistent with the rCA in terms of the Matthews correlation coefficient, a measure of the quality of binary (two-class) classifications. The mean prediction accuracy was 0.88 (0.59–0.99) when the model was applied to the training data set. To check for overfitting in the vCA model, we applied the model to an independent data set of 23 MM samples for which both GEP and aCGH data were available. The mean prediction accuracy was 0.89 (0.74–1.00), indicating that overfitting, if present at all, was negligible.
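The cutoff-selection step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the toy expression values, the choice of candidate cutoffs (midpoints between sorted region scores), and the reduction to a single binary call (altered versus not altered, with higher expression taken to mean amplification) are all assumptions made for illustration only.

```python
from math import sqrt
from statistics import mean


def mcc(truth, pred):
    """Matthews correlation coefficient for two equal-length boolean sequences."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def region_scores(expr, region_genes):
    """Per-sample mean expression of the CN-sensitive genes in one region.

    expr maps a gene name to a list of per-sample expression values.
    """
    n_samples = len(expr[region_genes[0]])
    return [mean(expr[g][i] for g in region_genes) for i in range(n_samples)]


def best_cutoff(scores, rca_altered):
    """Return (cutoff, mcc) for the cutoff that best reproduces the reference calls.

    Assumption: a sample is predicted as altered when its score exceeds the
    cutoff; candidate cutoffs are midpoints between distinct sorted scores.
    """
    values = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_c, best_m = None, -1.0
    for c in candidates:
        m = mcc(rca_altered, [s > c for s in scores])
        if m > best_m:
            best_c, best_m = c, m
    return best_c, best_m


# Toy data (hypothetical): two CN-sensitive genes, four samples.
expr = {
    "geneA": [1.0, 1.0, 3.0, 3.0],
    "geneB": [1.0, 2.0, 3.0, 4.0],
}
rca = [False, False, True, True]  # reference calls derived from aCGH

scores = region_scores(expr, ["geneA", "geneB"])
cutoff, score = best_cutoff(scores, rca)
print(scores, cutoff, score)  # [1.0, 1.5, 3.0, 3.5] 2.25 1.0
```

In this sketch a deletion call would use the same scan with the comparison reversed (score below the cutoff), and a cutoff chosen on training samples would then be applied unchanged to independent samples to estimate prediction accuracy.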
We further validated the model with a FISH data set compiled from 262 independent MM samples for which both FISH records and GEP data were available. The mean prediction accuracy was 0.87. The consistency between vCA-predicted chromosomal alterations and karyotyping findings was lower, at 0.65. However, this underperformance may reflect the fact that karyotyping is limited by the low proliferation rate of terminally differentiated plasma cells in vitro.
Our results provide a proof of concept that GEP data alone can reveal all the information provided by conventional cytogenetic techniques. We show that re-purposing gene expression data using our model is a fast and economical way to obtain cytogenetic information that is accurate and can be used for diagnosis and observation in MM and potentially other malignancies. GEP can serve as a one-stop genomic data source for information from the level of specific genes to whole chromosomes.
Barlogie: Celgene: Consultancy, Honoraria, Research Funding; IMF: Consultancy, Honoraria; MMRF: Consultancy; Millennium: Consultancy, Honoraria, Research Funding; Genzyme: Consultancy; Novartis: Research Funding; NCI: Research Funding; Johnson & Johnson: Research Funding; Centocor: Research Funding; Onyx: Research Funding; Icon: Research Funding. Shaughnessy: Myeloma Health, Celgene, Genzyme, Novartis: Consultancy, Employment, Equity Ownership, Honoraria, Patents & Royalties.