STAT1 is essential for HSC function and maintains MHCIIhi stem cells that resist myeloablation and neoplastic expansion

Key Points
• STAT1 is essential for normal HSC function and maintenance of an MHCIIhi HSC subset that is less responsive to stress-induced proliferation.
• MHCIIhi and MHCIIlo subsets both contain functional HSCs, but MHCIIlo HSCs show increased Mk potential and are expanded in mutant CALR mice.


Competitive transplantation assays
C57BL/6 (CD45.1+) and CD45.1+ W41 (cKit^W41/W41) recipients were irradiated with 2 × 550 cGy and 400 cGy, respectively. For competitive repopulation assays, bone marrow cells (5 × 10^5, or a lower dose of 5 × 10^4) were injected into recipient mice together with 0.5 × 10^6 nucleated competitor BM cells obtained from CD45.1/CD45.2 F1 mice. To evaluate qualitative differences between HSCs from WT and mutant mice, or between different subsets of HSCs, equal numbers of FACS-isolated ESLAM HSCs were mixed with 0.3 × 10^6 nucleated competitor BM cells obtained from CD45.1/CD45.2 F1 mice before being injected into CD45.1+ recipient mice. At 16 weeks or longer post transplantation, bone marrow cells from the recipient mice were assessed for donor-derived HSC chimerism by flow cytometry, staining with ESLAM markers as above as well as CD45.1 and CD45.2.
Secondary transplantation was then performed using 5 × 10^6 nucleated BM cells from primary recipients of ESLAM HSCs. In all cases, peripheral blood was obtained and analyzed by flow cytometry for donor contribution to myeloid (Ly6g and Mac1) and lymphoid (B220 and CD3e) lineages, together with antibodies for CD45.1 and CD45.2 to distinguish the donor origin of repopulated cells.

ESLAM HSCs
Single cell suspensions of BMMNCs were first lineage depleted using the EasySep™ Mouse Hematopoietic Progenitor Cell Isolation Kit (STEMCELL Technologies). The cells were then stained with ESLAM HSC markers as above and I-A/I-E APC antibody (BioLegend). Gating for MHCIIhi ESLAM HSCs was determined using STAT1KO ESLAM HSCs as a negative control, because STAT1-deficient HSCs lack MHCII expression.

MHCIIhi and MHCIIlo ESLAM HSC single cell in vitro differentiation assays
Single ESLAM HSCs were cultured and the derived clones were classified as previously described with minor modifications (Prins et al., 2020). Briefly, single MHCIIhi and MHCIIlo ESLAM HSCs were FACS sorted into round-bottom 96-well plates (Corning, Corning, USA) preloaded with 50 µL StemSpan SFEM (serum-free expansion medium) (STEMCELL Technologies). A further 50 µL of medium was subsequently added to each well to a final concentration of 10% FBS (STEMCELL Technologies), 1% penicillin/streptomycin (Sigma-Aldrich), 1% L-glutamine (Sigma-Aldrich), stem cell factor (SCF; 250 ng/mL), IL-3 (10 ng/mL), IL-6 (10 ng/mL; STEMCELL Technologies) and 0.1 mM β-mercaptoethanol. At day 7 of culture, single cell-derived clones were visually inspected. Wells with surviving cells were classified into one of three categories: (i) wells containing only one or more enlarged cells, which have been characterized as becoming megakaryocytes in this culture system (Prins et al., 2020); (ii) mixed expansion, with both small and enlarged cells; and (iii) expansion with only small cells.
The experimenter was blinded to the identity of the cells sorted into the well.

Cell cycle analysis of steady state ESLAM HSCs
Bone marrow cells from untreated STAT1KO and WT control mice were lineage depleted using EasySep™ Mouse Hematopoietic Progenitor Cell Isolation Kit.
The cells were then stained for 45 minutes with CD45 BV785 (BioLegend), EPCR PE (STEMCELL Technologies), CD150 PE/Cy7 (BioLegend), CD48 BV605 and Zombie NIR to eliminate dead cells. The stained cells were washed with FACS buffer (PBS/2% FBS), then fixed and permeabilized using the Cytofix/Cytoperm™ Fixation/Permeabilization kit (BD Biosciences). The cells were then stained with Ki-67 FITC (BioLegend) on ice for 1 hour and then with DAPI (Thermo Fisher) for 1 hour at RT. Flow cytometric analysis was carried out at a low flow rate using an LSRFortessa flow cytometer (BD).

Apoptosis analysis of ESLAM HSCs
Bone marrow cells from either steady state or 5-FU treated mice were mixed with an equal volume of ammonium chloride (STEMCELL Technologies) and incubated on ice for 3 minutes to lyse red blood cells. BMMNCs were then stained with CD45 BV785 (BioLegend), EPCR PE (STEMCELL Technologies), CD150 PE/Cy7 (BioLegend), CD48 BV605 and I-A/I-E APC (BioLegend) for 45 minutes. The stained cells were washed with FACS buffer (PBS/2% FBS), then stained with 5 µL Annexin V FITC (BioLegend) in 100 µL PBS/2% FBS/5 mM EDTA at room temperature in the dark for 15 minutes. DAPI (Thermo Fisher) was then added to the samples. Flow cytometric analysis was carried out using an LSRFortessa flow cytometer (BD).

Analysis of division kinetics in single HSC in vitro cultures
ESLAM HSCs were sorted into round-bottom 96-well plates preloaded with 50 µL serum-free StemSpan medium (STEMCELL Technologies). A further 50 µL of medium was subsequently added to each well to a final concentration of 10% FCS, 1% Pen/Strep, 1% L-Glut, 250 ng/mL SCF (STEMCELL Technologies), 20 ng/mL IL-3 and IL-6 (Peprotech) and 0.1 mM β-mercaptoethanol. For cell division kinetics, the number of cells in each well was counted manually every day for up to 5 days in culture.

STAT1KO and WT ESLAM cells Smart-seq2 data analysis
Bone marrow cells were harvested from STAT1KO and WT control mice and single ESLAM HSCs were FACS sorted as described above and processed as described for Smart-seq2.

Preprocessing
The reads resulting from the Smart-seq2 experiments were mapped against Ensembl genes (release 81) (Zerbino et al., 2018) using GSNAP (version 2015-09-29) (Wu and Nacu, 2010) and quantified using HTSeq (version 0.6.0) (Anders et al., 2015). Quality control (QC) filtering was then applied: reads mapping to nuclear genes had to account for at least 20% of the mapping reads, and cells with fewer than 50,000 reads mapping to nuclear genes were rejected. In addition, the maximum allowed fraction of reads mapping to mitochondrial genes was set at 20%. The levels of technical variance were estimated using the ERCC spike-ins as described by Brennecke et al. (2013), with highly variable genes (HVGs) defined as those whose squared coefficient of variation exceeded technical noise. The resulting dataset was also batch-corrected using the removeBatchEffect function of the R limma package (Ritchie et al., 2015). The raw sequencing reads and gene count tables were deposited at the NCBI GEO (accession number: GSE180904).
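The three QC thresholds above can be sketched as a per-cell boolean filter. This is a minimal illustration rather than the pipeline actually used: the function name `qc_pass` and its arguments are hypothetical, and it assumes that both fractions are taken relative to each cell's total read count, which the text does not state explicitly.

```python
import numpy as np

def qc_pass(total_reads, nuclear_reads, mito_reads,
            min_nuclear_frac=0.20, min_nuclear_reads=50_000,
            max_mito_frac=0.20):
    """Return a boolean array marking the cells that pass all three QC rules.

    Assumption: fractions are computed against the per-cell total read count.
    """
    total = np.asarray(total_reads, dtype=float)
    nuclear = np.asarray(nuclear_reads, dtype=float)
    mito = np.asarray(mito_reads, dtype=float)

    # Cells with no reads get NaN fractions, which fail every comparison.
    safe_total = np.where(total > 0, total, np.nan)
    nuclear_frac = nuclear / safe_total
    mito_frac = mito / safe_total

    return ((nuclear_frac >= min_nuclear_frac)
            & (nuclear >= min_nuclear_reads)
            & (mito_frac <= max_mito_frac))
```

The returned mask can be used to subset a cells-by-genes count matrix before estimating technical noise.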

Downstream processing
Further processing of the dataset was performed with the Python package Scanpy (Wolf et al., 2018). The dataset was log-transformed and scaled, and a PCA reduction to the top 50 components was computed using the HVGs as input. A diffusion map embedding was also calculated for the top 15 diffusion components.
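The log-transform, scaling and PCA steps can be illustrated with plain NumPy. This is a sketch of the arithmetic only, not the Scanpy calls used for the analysis: `log_scale_pca` is a hypothetical helper, a natural-log(1 + x) transform is assumed, and the diffusion map step is omitted.

```python
import numpy as np

def log_scale_pca(counts, hvg_mask, n_comps=50):
    """Log-transform, z-score and PCA-reduce a cells x genes matrix.

    Only columns flagged in hvg_mask (the highly variable genes) are used.
    """
    hvg_mask = np.asarray(hvg_mask, dtype=bool)
    X = np.log1p(np.asarray(counts, dtype=float))[:, hvg_mask]

    # Scale each gene to zero mean and unit variance (constant genes stay 0).
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    Z = (X - mean) / std

    # PCA via SVD of the centred matrix; cap components at the usable rank.
    n_comps = min(n_comps, min(Z.shape) - 1)
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_comps] * S[:n_comps]
```

The returned array holds one row of principal-component coordinates per cell, ordered by decreasing explained variance.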

Differential expression
Differential expression on the Smart-seq2 sequencing data was performed using the R package DESeq2 (Love et al., 2014). The input expression matrices were filtered to include only genes with average expression above 1 (resulting in 14,080 genes in the filtered matrix). Differential expression results were selected at a significance level of 0.01. Results were visualized as a volcano plot generated with the EnhancedVolcano R package (Blighe et al., 2018).

Gene set enrichment analysis (GSEA)
The Wald statistic from the DESeq2 results was used to generate pre-ranked gene lists.
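Building a pre-ranked list amounts to sorting genes by their Wald statistic. The sketch below assumes the statistics have been exported as a gene-to-value mapping; the helper name `preranked_list`, the descending sort order and the handling of missing values are illustrative assumptions, not details given in the text.

```python
import math

def preranked_list(wald_by_gene):
    """Return (gene, statistic) pairs sorted from most positive to most negative.

    Genes with a missing statistic (None or NaN) are dropped.
    Ties are broken alphabetically so the ranking is reproducible.
    """
    kept = {
        gene: stat
        for gene, stat in wald_by_gene.items()
        if stat is not None and not math.isnan(stat)
    }
    return sorted(kept.items(), key=lambda item: (-item[1], item[0]))
```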

STAT1KO and WT LK cells 10x Genomics data analysis
Bone marrow cells were harvested from STAT1KO and WT control mice, and lineage-negative, c-Kit+ (LK) cells were FACS sorted and processed according to the manufacturer's protocol for 10x Chromium (10x Genomics, Pleasanton, CA) experiments.

10x Preprocessing
The sequenced reads were processed with Cell Ranger (version 2.1.1) and aligned to the 10x Genomics-built mouse mm10 reference (version 1.2.0). From the two libraries processed, a total of 13,770 cells were recovered. The raw sequencing reads and gene count tables were deposited at the NCBI GEO (accession number: GSE180905).

10x Downstream analysis
The remainder of the downstream analysis was performed using the Scanpy package. The Scrublet package was used to estimate doublets; 287 were identified and removed. The cells were further filtered to retain those with less than 5% mitochondrial UMI counts and at least 500 expressed genes. Genes were retained if they were expressed in at least 3 cells. This resulted in a filtered dataset with 13,301 cells and 16,142 genes. The filtered dataset was subsequently normalized to the median UMI counts per cell across the two processed libraries and log-transformed. Highly variable genes were then computed with parameters min_mean=0.001, max_mean=5, min_disp=0.05, and the dataset was scaled.

[...] space between all cells in both datasets was then calculated. Our dataset consists of two samples (WT and KO), and for each sample the top k (100,000/size of sample) reference neighbors were considered for each of our cells. Within each sample, every time a reference cell was identified as one of the k closest neighbors, its projection counter was incremented. This allowed us to create projection densities on the reference embedding. These projection counters were then smoothed by a factor of 1000 and used to calculate the log2-fold difference between the conditions, which was then plotted on the reference embedding.
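The projection-density comparison can be sketched as follows. This is one possible reading, not the authors' code: `projection_counts` and `log2_fold_difference` are hypothetical helpers, cells are assumed to already share a common PCA space with the reference, and "smoothed by a factor of 1000" is interpreted here as adding a pseudocount of 1000 to every counter before taking the ratio.

```python
import numpy as np

def projection_counts(query_pcs, ref_pcs, total_votes=100_000):
    """Count how often each reference cell is among the k nearest
    neighbours of a query cell, with k = total_votes / number of query cells."""
    query = np.asarray(query_pcs, dtype=float)
    ref = np.asarray(ref_pcs, dtype=float)
    k = max(1, min(ref.shape[0], round(total_votes / query.shape[0])))

    # Squared Euclidean distances, query cells in rows, reference in columns.
    d2 = ((query ** 2).sum(axis=1)[:, None]
          + (ref ** 2).sum(axis=1)[None, :]
          - 2.0 * query @ ref.T)

    # Indices of the k closest reference cells for every query cell.
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return np.bincount(nn.ravel(), minlength=ref.shape[0]).astype(float)

def log2_fold_difference(counts_a, counts_b, pseudocount=1000.0):
    """log2 ratio of two projection-count vectors.

    Assumption: smoothing is implemented as an additive pseudocount."""
    return np.log2((counts_a + pseudocount) / (counts_b + pseudocount))
```

Scaling k by sample size gives each sample the same total number of votes, so the two count vectors are directly comparable even when the samples contain different numbers of cells.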

Visualizations
The embedding plots shown in this publication were generated through Scanpy's plotting methods. Violin plots were also generated using the Scanpy or seaborn Python packages. Heatmaps were generated with the seaborn clustermap method using the Euclidean distance metric with the 'ward' method for linkage.

Nestorowa dataset (GSE81682)
The Smart-seq2 HSPC dataset was obtained from Nestorowa et al. (Blood, 2016). Differential pathway analysis between CD74 high and CD74 low LT-HSCs was performed using the GO biological processes database (v7.1) downloaded from GSEA. Geometric means of the genes in each term were calculated for all Nestorowa LT-HSCs, and a t-test was applied using rank_genes_groups in Scanpy to test whether the means are the same between CD74 high cells and CD74 low cells. The significant terms were extracted from the MA plot with log2FC >= 0.5 and mean log2 expression >= -1.
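The per-term scoring and group comparison can be sketched in NumPy. This is an illustration under stated assumptions, not the analysis code: `term_scores` and `welch_t` are hypothetical helpers, an offset of 1 is added so that zero expression values do not collapse the geometric mean, and an unequal-variance (Welch-style) t statistic stands in for the test run through rank_genes_groups.

```python
import numpy as np

def term_scores(expr, gene_names, term_genes, offset=1.0):
    """Per-cell geometric mean of the genes belonging to one term.

    expr is a cells x genes matrix; offset avoids log(0) (an assumption)."""
    wanted = set(term_genes)
    idx = [i for i, gene in enumerate(gene_names) if gene in wanted]
    if not idx:
        raise ValueError("none of the term genes are present in the matrix")
    sub = np.asarray(expr, dtype=float)[:, idx]
    return np.exp(np.log(sub + offset).mean(axis=1)) - offset

def welch_t(a, b):
    """Unequal-variance t statistic for the difference in means of a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / se
```

In use, `term_scores` would be evaluated once per GO term, and `welch_t` applied to the scores of the CD74 high cells versus the CD74 low cells.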

Dahlin dataset (GSE107727)
The 10x LK/LSK dataset was obtained from Dahlin et al. (Blood, 2018). The data was normalized to a total count of 10K and log-transformed.
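Normalization to a total count of 10K followed by log transformation can be written out as below. The helper name `normalize_log` is hypothetical, and the natural-log(1 + x) transform is an assumption about how the log step was applied.

```python
import numpy as np

def normalize_log(counts, target_sum=1e4):
    """Scale every cell (row) to target_sum total counts, then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave empty cells as all zeros
    return np.log1p(counts / totals * target_sum)
```

After this step, a gene contributing the same share of a cell's counts gets the same value regardless of that cell's sequencing depth.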

Mann dataset (GSE100426)
The HSC raw counts were downloaded from GEO, and only unstimulated LT-HSCs from young mice were extracted for this study. The cells were filtered using the filter_cells function in Scanpy with min_genes=500, and the genes were filtered using the filter_genes function with min_cells=1. The data was then normalized to a total count of 10K and log-transformed. In total, there are 88 young LT-HSCs.
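The two filtering thresholds can be illustrated on a raw count matrix. This NumPy sketch mirrors the logic described above rather than reproducing Scanpy itself; `filter_counts` is a hypothetical helper, and applying the cell filter before the gene filter follows the order in which the text lists them.

```python
import numpy as np

def filter_counts(counts, min_genes=500, min_cells=1):
    """Drop cells expressing fewer than min_genes genes, then drop genes
    detected in fewer than min_cells of the remaining cells."""
    counts = np.asarray(counts)

    # A gene counts as expressed in a cell when its count is above zero.
    keep_cells = (counts > 0).sum(axis=1) >= min_genes
    filtered = counts[keep_cells]

    keep_genes = (filtered > 0).sum(axis=0) >= min_cells
    return filtered[:, keep_genes], keep_cells, keep_genes
```

With min_cells=1, the gene filter only removes genes that are undetected in every retained cell.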

Haltalli dataset (GSE156410)
Four samples from the Plasmodium infection study (Haltalli et al.; GSE156410) were downloaded from the GEO database as filtered Cell Ranger output h5 files, representing 2 experimental groups (infected and control) with 2 biological replicates each. Preprocessing was done using the Scanpy pipeline. Each cell was normalized to 10K for comparison and log-transformed. Cell cycle scores were calculated using the score_genes_cell_cycle function. Highly variable genes were selected from each sample using highly_variable_genes, with min_mean=0.02, max_mean=3, min_disp=0.3, and merged. In total, 1612 genes were considered highly variable. In order to capture cell type differences as the most dominant component, cell cycle phases, number of genes, number of counts and percentage of mitochondrial genes were regressed out. The four samples were further integrated using the reducedMNN function in the batchelor R package. Cell type assignment was done using the Nestorowa landscape as a reference for the HSC and immature populations and the Dahlin landscape as a reference for the mature populations. Cells were clustered using Leiden clustering with resolution=1. LT-HSCs were extracted using 2 criteria: 1) the cell type was defined as LT-HSC; 2) the cell belonged to Leiden cluster 1, which contains HSC and immature populations, to further exclude non-specific cells.
Violin plots were generated using the ggplot2 package in R. Differences in means between infected and control samples were tested using a t-test.

Score calculation
G2M, S and MHC scores were calculated using the score_genes function in Scanpy. G2M and S gene lists were defined in Tirosh et al., 2015. The MHC genes used are Cd74, H2-Aa, H2-Ab1 and H2-Eb1. The data was log-normalized, and [...]

[...] MHCIIhi and MHCIIlo ESLAM HSCs following 5-FU or poly-IC treatment
5-FU (150 mg/kg) or poly-IC (10 mg/kg) was administered intraperitoneally (i.p.) to mice. At 70 hours post 5-FU injection or 16 hours post poly-IC injection, bone marrow cells were harvested and stained for 45 minutes with CD45 BV785 (BioLegend), EPCR PE (STEMCELL Technologies), CD150 PE/Cy7 (BioLegend), CD48 BV605, I-A/I-E antibodies and Zombie NIR to eliminate dead cells. The stained cells were washed with FACS buffer (PBS/2% FBS), then fixed and permeabilized using the Cytofix/Cytoperm™ Fixation/Permeabilization kit (BD Biosciences). The cells were then stained with Ki-67 FITC (BioLegend) on ice for 1 hour and then with DAPI (Thermo Fisher) for 1 hour at RT. Flow cytometric analysis was carried out at a low flow rate using an LSRFortessa flow cytometer (BD).
[...] cells were projected onto the Dahlin landscape by a PCA projection. Euclidean distances were calculated between CALR cells and the Dahlin data based on the top 50 PCA components. For each CALR cell, the 15 nearest neighbor cells from the Dahlin landscape (those with the shortest Euclidean distance) were identified. The cell type annotation was then assigned as the most frequent cell type among these nearest neighbors. To annotate HSPCs in more detail, the HSCs and immature populations of the CALR dataset were extracted and projected again onto the Nestorowa landscape using the same procedure. Finally, LT-HSCs were extracted for this study: 669 and 506 LT-HSCs were extracted from the WT and CALR mutant datasets, respectively. The data was then normalized to a total count of 10K and log-transformed.
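The nearest-neighbor annotation step can be sketched as a majority vote over reference labels. This is a minimal illustration, not the authors' implementation: `transfer_labels` is a hypothetical helper, and it assumes the query cells have already been projected into the reference PCA space (the top 50 components in the text).

```python
from collections import Counter

import numpy as np

def transfer_labels(query_pcs, ref_pcs, ref_labels, k=15):
    """Assign each query cell the most frequent label among its k nearest
    reference cells, using Euclidean distance in the shared PCA space."""
    query = np.asarray(query_pcs, dtype=float)
    ref = np.asarray(ref_pcs, dtype=float)
    k = min(k, ref.shape[0])

    # Squared Euclidean distances, query cells in rows, reference in columns.
    d2 = ((query ** 2).sum(axis=1)[:, None]
          + (ref ** 2).sum(axis=1)[None, :]
          - 2.0 * query @ ref.T)

    # Neighbour indices sorted from closest to farthest.
    nn = np.argsort(d2, axis=1)[:, :k]

    # On a tied vote, the label whose first member is closest wins.
    return [Counter(ref_labels[j] for j in row).most_common(1)[0][0]
            for row in nn]
```

The same function can be applied a second time to a subset of cells against a different reference, as described for the re-projection of HSCs and immature populations onto the Nestorowa landscape.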