Gain-of-function mutations in RPA1 cause a syndrome with short telomeres and somatic genetic rescue

Telomere biology disorders (TBDs) are heterogeneous syndromes caused by mutations in telomere-associated genes. Sharma et al report a novel TBD gene responsible for varying TBD clinical features in 4 unrelated individuals. One patient manifested somatic genetic rescue in hematopoietic cells, leading to loss of the mutant allele and stabilization of blood counts. This report expands the genetic spectrum of TBDs.

Supplemental Figure 2. P3 family pedigree. Pedigree of P3 displaying family members suffering from clinical features consistent with TBD/STS: early hair graying and cirrhosis in maternal grandmother and mother, respectively (grey shading); idiopathic pulmonary fibrosis in two sisters (black circles), both of whom passed away from pulmonary fibrosis associated complications at ages 64 and 71 years. Additionally, cancers resulting in death were reported in 3 sisters (unknown pediatric cancer occurrence in a 7-year-old, pancreatic cancer in a 71-year-old and breast cancer in a 67-year-old) and congenital malformations were noted in two brothers who passed away during infancy.
Supplemental Figure 3. A, Telomere Shortest Length Assay (TeSLA) reveals the length of individual telomeres from all chromosome ends in P2 and P3. Nine TeSLA PCRs were performed for each peripheral blood DNA sample obtained from the clinically unaffected RPA1 c.680T>C (V227A) carrier sister of P2 and from P2 (left panel), in addition to the adult patient P3 and an age-matched healthy control individual (right panel). The characteristics of telomere length distributions are indicated below each blot. B, Violin plots of telomere length distribution obtained by TeSLA show an increased fraction of very short telomeres (< 1 kb) in P2 and P3 compared to P2 sister and P3 age-matched control, respectively. A two-sided t-test for the differences between the means was performed, *** p<0.001. Percentage of each population is denoted in each quadrant. C, Telomere length assessed in early and late passaged (P = passage) RPA1 WT and RPA1 E240K iPSCs using quantitative fluorescence in situ hybridization (Q-FISH). Spot analysis was performed on maximum intensity projections using custom FIJI macros based on local maxima detection. Approximately 300 interphase nuclei were analyzed individually per sample. Graphs represent mean + SEM of one of two independent experiments (two-sided unpaired t-test, ****p<0.0001). D, Telomere restriction fragment (TRF) analysis in early and late passaged RPA1 WT and RPA1 E240K iPSCs, digested with HinfI and RsaI enzymes followed by separation on 0.7% agarose gel.
Figure 7. Flow cytometry gating strategy for hematopoietic studies that assess HPs, erythroid and myeloid cells derived from RPA1 WT and RPA1 E240K iPSC. Gating strategy used for plots shown in main manuscript, Figure 3, is outlined for the analysis of hematopoietic progenitors (HP) at days 16 (A) and 21 (B), erythroid cells (C) and myeloid cells (D).
Figure 10. P1 CD34+ single cell colony assay and SNP array analysis in lymphocytes and granulocytes.
A, P1 bone marrow collected at age 20 years was FACS-sorted for CD34+ cells followed by seeding of 500 cells per well of a 6-well plate (total of 12 wells) for 14 days in semi-solid methylcellulose-based media. 88 single colonies were picked followed by DNA isolation and Sanger sequencing for both germline (c.718) and somatic (c.1735) positions. Successful analysis was achieved in 32 colonies revealing 3 independent clones: a major clone (78%, blue bar) with wt sequence at both positions (c.718G and c.1735G), corresponding to UPD17p rescue clone; a smaller clone (16%, red bar) with mutant germline c.718A and wt c.1735G somatic positions, corresponding to native hematopoiesis; and a second rescue clone (6%, grey bar) positive for mutant sequences at both positions. B, Single nucleotide polymorphism (SNP) array analysis shown from control and P1 lymphocyte and granulocyte peripheral blood DNA collected at the age of 25 years. Allelic distribution of SNPs on chromosome 17 (encompassing RPA1 locus) is depicted, with equal distribution of SNPs in control DNA while a small UPD17p clone was identified in lymphocytes compared to a large UPD17p clone in granulocytes from P1.
Filtering criteria are outlined in the methods. P3 not shown (due to lack of parental material). The non-de novo RPA1 mutations in P2 and P3 are depicted in Table 1. Abbreviations: Chr, chromosome; VAF, variant allelic frequency; MAF, minor allelic frequency; AC, allele count; AN, allele number; n.a., not available. *Confirmed as somatic (NRAS) or germline (RPA1) based on comparative sequencing of bone marrow and hair follicles.

SUPPLEMENTAL TABLES
Supplemental Table 2. Immunoglobulins and lymphocyte subsets over time in P4. # Immunoglobulin subsets at 3 weeks of age represent maternal transfer; + IgG replacement therapy. *Absolute subsets that are lower than the reference range.
Supplemental Table 3: Results of the FRET-based DNA binding and G-quadruplex melting experiments. 1 Binding stoichiometry X0 values were calculated from the inflection points of the two-line linear regression fit of the FRET change upon binding of the respective protein to dT30 ssDNA (Figure 2b); the errors represent fitting uncertainty. 2 The equilibrium dissociation constants were calculated by fitting the respective binding isotherms shown in Figures 2c and 2d to a quadratic binding equation; the errors represent fitting uncertainty. 3 The extents and rates of the h-telG4 unfolding were calculated using data shown in supplemental Fig. 4. The extents and rates were plotted as functions of protein concentration (Figure 2e and 2f) and fitted to a quadratic equation to yield apparent Kds.
Table 4.

Supplemental
Assessment of germline variant status RPA1 germline status was ascertained by sequencing of skin-derived fibroblasts in P1, hair follicles in P2, and based on VAF in the germline range (~50%) in the absence of a clonal blood disease in P3 and P4.
RNA sequencing Ribo-depleted RNA from P1 BM was sequenced on an Illumina HiSeq platform with 75-bp paired-end reads, followed by data processing with the St. Jude institutional automapper pipeline. Briefly, the raw reads were first trimmed (Trim-Galore version 0.60) and mapped to the human genome assembly (GRCh37/hg19) (STAR v2.7), and the gene-level values were then quantified (RSEM v1.31) based on GENCODE annotation (v19). Low-count genes were removed from analysis using a cutoff of 10 reads; only genes with confident annotation (level 1 and 2) that were protein-coding were used for differential gene expression (DGE) analysis. The individualized differentially expressed genes (iDEG) method was used for single-subject DGE analysis 3 . The parameters in iDEG were set as default except estimated Baseline = T. The significantly up- and down-regulated genes were defined by the thresholds local FDR < 0.05 and fold change > 2 3 . Primer sequences provided in Supplemental Table 5.
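The thresholding step described above can be illustrated with a minimal sketch. It is not the iDEG method itself: it assumes that a per-gene fold change and local FDR have already been computed, and it assumes that "fold change > 2" is applied symmetrically (more than 2-fold up or more than 2-fold down). The `GeneResult` type and `classify` function are hypothetical names used only for illustration.

```python
import math
from typing import NamedTuple


class GeneResult(NamedTuple):
    """Hypothetical per-gene summary; field names are illustrative only."""
    gene: str
    fold_change: float  # expression ratio, must be > 0
    local_fdr: float


def classify(results, fdr_cutoff=0.05, fc_cutoff=2.0):
    """Split genes into up- and down-regulated lists.

    Assumption: the fold-change threshold is symmetric on the log2 scale,
    i.e. a gene is called when |log2(fold change)| > log2(fc_cutoff).
    """
    log_cutoff = math.log2(fc_cutoff)
    up, down = [], []
    for r in results:
        if r.local_fdr >= fdr_cutoff:
            continue  # fails the local FDR threshold
        log2fc = math.log2(r.fold_change)
        if log2fc > log_cutoff:
            up.append(r.gene)
        elif log2fc < -log_cutoff:
            down.append(r.gene)
    return up, down
```

Working on the log2 scale keeps the up and down thresholds symmetric, so a 4-fold decrease is treated the same way as a 4-fold increase.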
RPA1 allelic quantification of genomic DNA and cDNA using ultra deep sequencing in P1 Target enrichment for germline (RPA1 c.718G>A) and somatic (RPA1 c.1735G>T) mutations for P1 on genomic DNA or cDNA was performed using PCR enrichment followed by library preparation with NEBNext Ultra II DNA library prep kit (New England BioLabs, cat no. E7645S/L, E7600S) per manufacturer's instructions. Samples were sequenced on an Illumina HiSeq 2500 with 2 × 150 bp reads with a median depth of 20,000 reads per sample. The TrimGalore read trimming tool was used to remove adapter sequences and bases with low sequencing quality. Following trimming, the resulting reads were aligned to the GRCh37/hg19 reference genome with BWA-MEM followed by indel realignment, duplicate removal, and SNP/INDEL calling using Freebayes with standard filtering parameters. Primer sequences provided in Supplemental Table 4.
Haplotype phasing using digital droplet (dd) PCR No dUTP ddPCR supermix (Bio-Rad Laboratories, cat no. 1863024) was used for all experiments. Primers and probes were designed to amplify the germline RPA1 c.718 and somatic RPA1 c.1735 genomic positions. Primers and probes were used at concentrations of 900 nM and 250 nM, respectively, with water (negative control), duplex G-blocks (positive controls) and 1 ng or 10 ng genomic DNA from P1 at age 19. Droplets were generated using the QX200 AutoDG Droplet Generator (Bio-Rad Laboratories), sealed with a pierceable foil heat seal (Bio-Rad Laboratories), and cycled in a C1000 Thermal Cycler (Bio-Rad Laboratories).
Single cell (sc) DNA and protein sequencing library preparation and sequencing Single cell DNA sequencing with antibody-oligonucleotide staining was performed using the Tapestri single-cell DNA sequencing V2 platform, per manufacturer's instructions (MissionBio). Briefly, a custom targeted scDNA panel was designed and manufactured by MissionBio to amplify 298 amplicons including RPA1 variants found in P1: germline c.718G>A (chr17:1782314:G>A) and somatic c.1735G>T (RPA1:chr17:1798378:G>T) with oligonucleotide-conjugated antibodies (AOC) targeting cell surface proteins of interest. Cryopreserved BM samples from P1 were thawed, washed with RPMI and quantified using a Cellometer Auto T4 (Nexcelom Bioscience). 1.0 × 10^6 viable cells were then resuspended in phosphate buffered saline (PBS, Gibco) and incubated with TruStain FcX and 1X Tapestri staining buffer for 3 minutes at room temperature. The customized pool of 7 oligonucleotide-conjugated antibodies (CD3, CD11b, CD19, CD34, CD38, CD45RA, CD90) was then added and incubated for 30 minutes at room temperature. Following multiple washes with PBS supplemented with 5% fetal bovine serum (FBS; Gibco), cells were recounted and diluted to a concentration of 4,000,000 cells/mL in Tapestri cell buffer.
Next, 50 μL of the cell suspension was loaded onto a microfluidics cartridge and cells were encapsulated on the Tapestri instrument, followed by cell lysis and protease digestion in a thermal cycler within the individual droplets. The cell lysate was then barcoded such that each cell had a unique label. Amplification of the targeted DNA regions and antibody oligonucleotide tags was performed by a targeted PCR on the barcoded DNA emulsions. Emulsions were broken and DNA was digested and purified with 0.7X Ampure XP beads (Beckman Coulter). The beads were pelleted, and the supernatant was retained for antibody library preparation, while the remaining beads were washed with 80% ethanol and the DNA targets eluted in nuclease-free water. The supernatant containing the antibody tags was incubated with a biotinylated capture oligo (/5Biosg/CGAGATGACTACGCTACTCATGG/3C6/, Integrated DNA Technologies (IDT)) at 96°C for 5 min, followed by ice for 5 min, and recovered with streptavidin beads (Dynabead MyOne Streptavidin C1, Thermo Fisher Scientific). Indexed Illumina libraries were generated by amplifying DNA libraries with MissionBio V2 index primers and protein libraries bound to streptavidin beads with i5 and i7 index primers (IDT). All libraries were quantified using an Agilent Bioanalyzer and pooled for sequencing on an Illumina NovaSeq6000 with 150-base paired-end multiplexed runs. Adaptor sequence trimming, sequence read alignment to the human genome (GRCh37/hg19), sequence read assignment to cell barcodes, and genotype calling with GATK were performed for all FASTQ files for single cell DNA libraries using the Tapestri analysis pipeline. Using a loom file for subsequent processing, low quality genotypes or cells were filtered in Tapestri Insights v2.0 and Mosaic (https://github.com/MissionBio/mosaic), where a whitelist was used to filter out clones that did not have successful coverage for both RPA1 c.718 and RPA1 c.1735 positions within the same cell.
We defined genetic clones based on their genotype status at both positions of interest.
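The clone definition above can be sketched as a simple grouping of cells by their paired genotype. This is an illustrative stand-in written in plain Python, not the Tapestri Insights or Mosaic API; the genotype labels ("WT", "HET", "HOM") and the use of `None` for an uncovered position are assumptions made for the example.

```python
from collections import Counter


def clone_fractions(cells):
    """Group cells into clones by genotype at c.718 and c.1735.

    cells: iterable of (genotype_718, genotype_1735) tuples, one per cell.
    Assumed encoding: 'WT', 'HET' or 'HOM' for a called genotype, and
    None when the position was not successfully covered in that cell.
    Cells lacking coverage at either position are excluded, mirroring
    the requirement that both positions be covered in the same cell.
    """
    covered = [c for c in cells if c[0] is not None and c[1] is not None]
    if not covered:
        return {}
    counts = Counter(covered)
    total = len(covered)
    return {clone: n / total for clone, n in counts.items()}
```

Each key of the returned dictionary is one genotype pair, so a cell carrying the germline variant with a wild-type somatic position falls into a different clone than a cell that is wild-type at both positions.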

Telomere length studies
Flow cytometry-based fluorescence in situ hybridization (Flow-FISH) Telomere length analysis was done by flow-FISH as previously described 5 . Briefly, peripheral blood samples were stained with a telomere-specific (CCCTAA)3-peptide nucleic acid (PNA) FITC-labeled FISH probe (Panagene) for DNA hybridization, followed by DNA counterstaining with LDS 751 (Sigma). Lymphocytes and granulocytes were identified based on forward scatter and LDS 751 staining. All measurements were carried out in triplicate and mean telomere length was calculated in kilobases (kb) in relation to the internal control (bovine thymocytes) with known telomere length. Healthy controls for calculation of the percentile curves were used as described previously 5 .
Telomere Restriction Fragment (TRF) analysis using Southern blot Genomic DNA (800 ng-1000 ng) extracted from peripheral blood cells of P2 and family, as well as P3 and age-matched control, was digested with HinfI and RsaI enzymes, resolved on a 0.7% agarose gel, and transferred to a nylon membrane. Hybridization at 42°C for 16 hours was performed using EasyHyb solution (Roche) and γ-32P-labeled (TTAGGG)4 probe. After washes, membranes were exposed over a PhosphorImager (AGFA). PhosphorImager exposures of telomere-probed Southern blots were analyzed with the ImageJ program. The digitalized signal data were then transferred to Microsoft Excel and served as the basis for calculating mean TRF length using the formula $L = \sum_i \mathrm{OD}_i \,/\, \sum_i (\mathrm{OD}_i / L_i)$, where $\mathrm{OD}_i$ = integrated signal intensity at position i and $L_i$ = length of DNA fragment at position i as determined by DNA ladders (1kb plus lambda HindIII ladder, Invitrogen).
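As a worked illustration of the mean TRF calculation, the following minimal sketch computes the ratio of the summed signal intensities to the summed intensity-to-length ratios. The function name and the example values are hypothetical; the source performed this calculation in Microsoft Excel.

```python
def mean_trf_length(od, lengths_kb):
    """Mean TRF length = sum(OD_i) / sum(OD_i / L_i).

    od:         integrated signal intensity at each gel position i
    lengths_kb: DNA fragment length (kb) at the same positions,
                as read from the ladder
    """
    if len(od) != len(lengths_kb):
        raise ValueError("od and lengths_kb must have the same length")
    if not od:
        raise ValueError("at least one position is required")
    numerator = sum(od)
    denominator = sum(o / l for o, l in zip(od, lengths_kb))
    return numerator / denominator


# Hypothetical example: equal signal at the 2 kb and 4 kb positions.
example = mean_trf_length([1.0, 1.0], [2.0, 4.0])
```

Dividing each intensity by its fragment length corrects for the fact that longer fragments bind more probe, so the result is weighted by the number of fragments rather than by raw signal.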

Quantitative fluorescence in situ hybridization (Q-FISH)
Telomere length in iPSC and HP was analyzed by Q-FISH as reported previously 6 . Briefly, 500,000 interphase cells/genotype were hybridized using the peptide nucleic acid-FISH method followed by telomere staining with FITC-labeled (CCCTAA)3 peptide nucleic acid (PNA) probe (Telomere PNA FISH kit/FITC, DAKO) and DAPI counterstain. 300 interphase nuclei from each sample were analyzed individually to enumerate fluorescence (spot intensity × spot area) per telomere on a Zeiss Axio Imager.Z2 with a Zeiss Plan-Apochromat 63x objective and Applied Spectral Imaging SpotScan software v8.1.1 (ASI), with fixed exposure times for all samples. Q-FISH analysis for telomere length in early and late passaged (P = passage) RPA1 WT and RPA1 E240K iPSCs was performed using Z-stack images acquired on a Zeiss 980 LSM with a 63x 1.40 NA oil objective, using the same settings for all samples. Spot analysis was performed on maximum intensity projections using custom FIJI macros based on local maxima detection (code available upon request).
Telomere Shortest Length Assay (TeSLA) TeSLA was performed as previously described by Lai et al. 7 . Briefly, an equimolar mixture (50 pM each) of the six TeSLA-T oligonucleotides (containing seven nucleotides of telomeric C-rich repeats at the 3′ end and 22 nucleotides of the unique sequence at the 5′ end) was annealed to and ligated with 50 ng of undigested genomic DNA at 35°C for 14 h. Then, genomic DNA was digested with CviAII, BfaI, NdeI, and MseI, restriction enzymes that create short AT or TA overhangs. Digested DNA was then treated with Shrimp Alkaline Phosphatase to remove the 5′ phosphate from the DNA fragments and so avoid their ligation to each other during the subsequent step. Upon heat-inactivation of the phosphatase, partially double-stranded AT and TA adapters were added (final concentration 1 μM each) and ligated to the dephosphorylated fragments of genomic DNA at 16°C overnight. Following ligation of the adapters, genomic DNA was diluted to a final concentration of 20 pg/μL, and 2-4 μL of it was used in a 25 μL PCR reaction to amplify terminal fragments using primers complementary to the unique sequences at the 5′ ends of the TeSLA-T oligonucleotides and the AT/TA adapters. FailSafe polymerase mix (Epicentre) with 1× FailSafe buffer H was used to efficiently amplify G-rich telomeric sequences. Entire PCR reactions were then loaded onto a 0.9% agarose gel for separation of the amplified fragments. To specifically visualize telomeric fragments, the DNA was transferred from the gel onto a nylon membrane by Southern blotting and hybridized with the 32P-labeled (CCCTAA)3 probe. The sizes of the telomeric fragments were quantified using TeSLA Quant software 7 .
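Once band sizes have been quantified, the summary statistics reported with the TeSLA blots (such as the fraction of very short telomeres below 1 kb, the cutoff named in the legend to Supplemental Figure 3B) reduce to simple arithmetic. The sketch below is a hypothetical post-processing helper, not part of TeSLA Quant; the function name, the dictionary keys and the example sizes are assumptions.

```python
def tesla_summary(band_sizes_kb, short_cutoff_kb=1.0):
    """Summarize a list of telomere fragment sizes (kb).

    Returns the number of bands, their mean size, and the percentage
    of bands strictly shorter than short_cutoff_kb.
    """
    n = len(band_sizes_kb)
    if n == 0:
        raise ValueError("no bands to summarize")
    n_short = sum(1 for size in band_sizes_kb if size < short_cutoff_kb)
    return {
        "n": n,
        "mean_kb": sum(band_sizes_kb) / n,
        "percent_short": 100.0 * n_short / n,
    }


# Hypothetical example: four detected bands, two of them below 1 kb.
summary = tesla_summary([0.5, 0.8, 2.0, 4.7])
```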

RPA biochemical studies
Expression and purification of wild type and mutant RPA Site-directed mutagenesis (Agilent) was performed with custom synthesized primers (IDT) to introduce the RPA1 c.680T>C (p.V227A), RPA1 c.718G>A (p.E240K), and RPA1 c.808A>G (p.T270A) variants independently into the pET11d-Human RPA construct to yield RPA V227A, RPA E240K, and RPA T270A, respectively 8 . The presence of each mutation was confirmed by sequencing. Both wild type and mutant proteins were expressed and purified as previously described 8 . Primer sequences provided in Supplemental Table 6.
FRET-based ssDNA binding assays FRET-based assays were used to monitor binding of RPA WT, RPA V227A, RPA E240K, and RPA T270A to ssDNA dually labeled with Cy3 and Cy5 fluorophores, as previously described 9 . All DNA oligonucleotides were purchased from Integrated DNA Technologies (Coralville, IA, USA). FRET efficiency was calculated from the averaged acceptor (Cy5) intensity and the averaged donor (Cy3) intensity after subtracting the background fluorescence 9 . The calculated FRET efficiency was plotted against protein concentration and analyzed using GraphPad Prism software. Stoichiometric binding curves (dT30 ssDNA) were fitted with two lines to determine the inflection point, and equilibrium binding curves (dT15 ssDNA) were fitted using a quadratic binding equation to determine the Kds.
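For readers unfamiliar with the quadratic binding equation, the sketch below shows its standard 1:1 form, which accounts for ligand depletion when the DNA concentration is not negligible relative to Kd. It is an illustration of the model only, not the GraphPad Prism fit used in the study; the function names, parameter names and the assumption of a single binding site are ours.

```python
import math


def fraction_bound(protein_nM, dna_nM, kd_nM):
    """Fraction of DNA in complex for 1:1 binding (quadratic solution).

    Solves [PD]^2 - (P + D + Kd)[PD] + P*D = 0 for the physically
    meaningful root, where P and D are total concentrations.
    """
    b = protein_nM + dna_nM + kd_nM
    complex_nM = (b - math.sqrt(b * b - 4.0 * protein_nM * dna_nM)) / 2.0
    return complex_nM / dna_nM


def fret_signal(protein_nM, dna_nM, kd_nM, fret_free, fret_bound):
    """Predicted FRET efficiency as a linear mix of free and bound states."""
    f = fraction_bound(protein_nM, dna_nM, kd_nM)
    return fret_free + (fret_bound - fret_free) * f
```

In a fit, `kd_nM`, `fret_free` and `fret_bound` would be the adjustable parameters, with the total protein and DNA concentrations fixed by the experimental design.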
FRET-based G4 quadruplex melting assays RPA-mediated melting of the folded h-telG4 DNA (Cy5-(TTAGGG)5-Cy3) quadruplex was monitored in the reaction buffer containing 30 mM HEPES-KOH, pH 7.5, 1 mM DTT, 5 mM Mg-acetate and 100 mM KCl, which stabilizes the quadruplex 10 . For each RPA concentration, the averaged FRET efficiency from 3 different experiments was plotted against time and fitted to a double exponential whose combined amplitude was compared to baseline FRET efficiency value for Cy5-(TTAGGG)5-Cy3 to determine the extent of h-telG4 melting. The initial rate of G4 melting was determined as the slope of the linear portion of the progress curve (10-40 s depending on the protein concentration) for each assay, divided by 0.38 (FRET efficiency difference between fully folded and fully stretched h-telG4 DNA) and multiplied by the total amount of DNA present (1 nM). Both the rate and extent of h-telG4 melting were plotted against protein concentration and analyzed using GraphPad Prism software.
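The initial-rate calculation described above can be written out as a short sketch: fit a straight line to the early part of the progress curve, divide the slope by 0.38, and multiply by the DNA concentration. The function name and the example data are hypothetical, and reporting the rate as a positive magnitude (regardless of the sign of the FRET change) is an assumption of this sketch.

```python
def initial_melting_rate(times_s, fret, dna_nM=1.0, delta_fret=0.38):
    """Initial G4 melting rate in nM/s from the linear part of a progress curve.

    times_s, fret: time points (s) and FRET efficiencies within the
                   linear window (10-40 s in the source, depending on
                   protein concentration)
    delta_fret:    FRET difference between fully folded and fully
                   stretched h-telG4 DNA (0.38 in the source)
    dna_nM:        total DNA present (1 nM in the source)
    """
    n = len(times_s)
    if n < 2 or n != len(fret):
        raise ValueError("need at least two matched time/FRET points")
    mean_t = sum(times_s) / n
    mean_f = sum(fret) / n
    # Ordinary least-squares slope of FRET versus time.
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fret))
    slope = sxy / sxx
    # Assumption: report the magnitude of the change as the rate.
    return abs(slope) / delta_fret * dna_nM


# Hypothetical example: FRET falls by 0.0038 per second.
rate = initial_melting_rate([0, 10, 20, 30], [0.600, 0.562, 0.524, 0.486])
```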

Modeling and characterization of RPA1 p.E240K in iPSC
Generation, quality control and maintenance of human iPSC ATCC normal adult human primary dermal fibroblasts (PCS-201-012) were reprogrammed into iPSC using the integration-free CytoTune2.0 Sendai virus reprogramming kit (Thermo Fisher Scientific, cat no. A16517) following methods previously reported 11,12 . G-banding karyotype was performed in all cell lines prior to experiments and pluripotency was confirmed in iPSC using flow cytometric analysis for SSEA-4 stem cell surface marker as previously described 13 . iPSC were maintained on 6-well tissue culture plates thinly coated with Matrigel® (Corning, cat no. 354230) in cGMP, feeder-free maintenance mTeSR1 medium (Stem Cell Technologies, cat no. 85850). Culture medium was changed daily and iPSC were split every three to four days. Briefly, medium was aspirated, and wells were washed with PBS (Gibco) followed by cell dissociation in ReLeSR (Stem Cell Technologies, cat no. 05872) for 3 min. 1 mL mTeSR1 medium containing 1.25 μM ROCK inhibitor (Sigma, cat no. Y0503) was administered per well and cells were pipetted 3-4 times using a 5 mL serological pipette to dissociate them into small to medium sized clusters, which were split onto new 6-well plates.
Assessment of iPSC pluripotency state by immunofluorescence Immunofluorescence analysis of undifferentiated human iPSCs in monolayer culture was performed. For this purpose, cells were fixed for 10 min at room temperature with 4% paraformaldehyde (Sigma-Aldrich) and washed twice with PBS. Cells were blocked for 1 hour at room temperature with 0.1% Triton (Sigma-Aldrich), 10% FBS (Sigma-Aldrich) and 1% BSA (Sigma-Aldrich) for 15 min. Primary antibodies diluted in Triton block solution were added and incubated for 3 hours at room temperature. Thereafter, cells were washed with PBS and secondary antibodies in Triton block were added for one-hour incubation in the dark. Slides were washed with PBS and mounted using Vectashield containing DAPI (Biocompare, cat no. H1200) followed by applying a coverslip. All samples were imaged the following day using a Zeiss LSM 980 confocal scanning microscope (Zeiss). Primary antibodies used were Nanog (NL493, R&D systems, cat
per well into a Matrigel-coated six-well plate and cultured overnight. On day zero of differentiation, medium A was added to promote mesodermal differentiation and a half-medium A change was done on day two. Supernatant was removed on day three and hematopoietic differentiation medium B was added, followed by half-medium change on days five, seven, 10, 12, 14, 17, and 19. Full supernatant was harvested on days 16 and 21 for flow cytometric analysis of pan-hematopoietic markers (CD43, CD45). For erythroid and myeloid terminal differentiation, day 10 hematopoietic progenitors (HP) were flow-sorted (BD FACS Aria, Becton Dickinson) based on CD43 and CD34 expression and 30,000 cells in triplicate were seeded for erythroid differentiation (Stem Cell Technologies, cat no. 02692; EPO, IL-3, and SCF) and myeloid differentiation (Stem Cell Technologies, cat no. 02693; G-CSF, GM-CSF, SCF, and TPO) in StemSpan SFEM II hematopoietic cell expansion media (Stem Cell Technologies, cat no. 09655).
Following 14-day differentiation, erythroid (CD45- CD71+ CD235+) and myeloid cells (CD45+ CD18+ CD11b+) were counted using Cellometer Auto T4 (Nexcelom Bioscience) and stained with cell surface antibodies detailed in supplemental material for flow cytometric analysis.
Cytospin and Wright-Giemsa stain of myeloid and erythroid cells Day 14 myeloid and erythroid cells from 1 well were centrifuged at 300 rpm at room temperature followed by media aspiration and resuspension in phosphate buffered saline (PBS, Gibco). Separate cytofunnels were loaded with different cell populations and cytospun on a Thermo Fisher Scientific Cytospin 4 at 60 g for 5 minutes at room temperature. All slides were then Wright-Giemsa stained for microscopic imaging. Digital images were taken with a 60x oil objective and Olympus DP22 camera and scaled identically.
Statistical analysis Data for all experiments from biological and technical replicates are presented as mean values ± standard deviation (SD) or standard error of the mean (SEM), as specified in each figure legend. Statistical significance was assessed using GraphPad Prism v 7.04 software employing paired and unpaired Student's t-tests. P values < 0.05 were considered statistically significant.