The human IGHV4-34 gene encodes antibodies which are intrinsically autoreactive when the VH domain is unmutated. Therefore, B cells expressing IGHV4-34 B-cell receptor immunoglobulins (BcR IG) are normally under close scrutiny in order to avoid unwanted autoreactivity, especially against DNA. The IGHV4-34 gene is frequently utilized in chronic lymphocytic leukemia (CLL), where it typically shows a high load of somatic hypermutation (SHM). We have previously reported distinctive SHM patterns amongst IGHV4-34 CLL, especially for subsets with stereotyped BcR IG. However, although a large number of cases (~2000) were studied, even the largest subsets account for only ~3% of CLL, so meaningful conclusions could not be reached for smaller subsets. Here we revisit this issue in a series of 16,528 CLL cases and focus on IGHV4-34 expressing subsets: #4 (IGHV4-34/IGHD5-18/IGHJ6 | 156 cases, 0.9%); #11 (IGHV4-34/IGHD3-10/IGHJ4 | 16 cases, 0.1%); #16 (IGHV4-34/IGHD2-15/IGHJ6 | 41 cases, 0.25%); #29 (IGHV4-34/IGHD: unassignable/IGHJ3 | 39 cases, 0.24%); and #201 (IGHV4-34/IGHD: unassignable/IGHJ3 | 43 cases, 0.26%). Focusing on codons 27-104 within the VH domain (from CDR1-IMGT to FR3-IMGT), we calculated the sequence distance between subsets, and between each subset and the corresponding IGHV4-34 germline sequence, based on a pairwise qualitative and quantitative comparison of the respective amino acid composition. The minimum distance, and hence the greatest identity, was observed between subsets #4 and #16, both of which concern IgG-switched cases (IgG-CLL); this is notable given the overall rarity of IgG-CLL. In contrast, the maximum distance, implying the least identity, was between subsets #16 and #201, the latter concerning IgM/D-CLL. Extreme variations between subsets were noted in codons spanning the entire VH domain. This result is consistent with our finding of a subset-biased distribution of mutations over the VH domain.
More specifically, while subsets #11, #16, #29 and #201 had a lower frequency of mutations within VH CDR1 compared to VH CDR2, the exact opposite was seen in subset #4, with 40% of mutations in VH CDR1 versus 27% in VH CDR2. In addition, subsets #4, #11, #16 and #29 had a similar distribution of mutations in VH FR2 and VH FR3, in contrast to subset #201, which showed a preference for VH FR3 over VH FR2. Consequently, we noted that certain positions were targeted in a subset-specific manner, e.g., codon 28 in VH CDR1 was heavily targeted in subsets #4 (68.6%) and #16 (87.8%), with most cases carrying an acidic amino acid (AA) introduced by SHM through a glycine to glutamic acid change (G>E): 51.3% for subset #4 and 78% for subset #16. The high prevalence of acidic AA introduced by SHM in these subsets is notable considering the electropositive nature of their VH CDR3 (especially of subset #4), strongly recalling edited anti-DNA antibodies. Interestingly, the G>E change was identified at a much lower frequency in the other IGHV4-34 subsets (18.75% for subset #11; 2.6% for subset #29; 7% for subset #201), all of which carried electronegative VH CDR3. Further, we noted that certain positions were heavily targeted in all subsets, e.g., 56-86% targeting for SHM at codon 92 in VH FR3, where serine is encoded by the agc triplet, the “hottest of hotspots”. This result could be viewed as sequence- rather than subset-dependent and linked to the molecular features of this codon, a view supported by the low targeting of codon 93 (0-6%), which also encodes serine but by the tct triplet. Other positions were targeted in all subsets but at vastly different frequencies, e.g., codon 64 was targeted in 37.8% of subset #4 cases, rising to 100% in subset #29. Finally, positions heavily targeted by SHM in certain subsets were unmutated in others, e.g., codon 36 in VH CDR1 remained unmutated in subset #16, whereas 76.9% of subset #29 cases were mutated at this position, resulting in an AA change.
In conclusion, we document different spectra of SHM and AA changes between stereotyped IGHV4-34 CLL subsets. The finding of subset-biased, recurrent AA changes at certain codons indicates that the respective progenitor cells may have responded in a specific manner to the selecting antigen(s), despite expressing the same IGHV gene, and points to a functional purpose for these modifications. This is exemplified by the molecular characteristics of the recurrent AA changes in subset #4, thereby offering interesting pathogenetic hints.
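As an illustration of how per-codon figures such as those quoted above (e.g., targeting of codon 28 and the share of G>E changes) could be tabulated, the following is a minimal sketch and not the analysis pipeline used in this study. It assumes that the sequences of one subset are already aligned to the germline so that each string position maps to one IMGT codon, and it counts amino acid replacements only, so silent mutations would go undetected. The function name `codon_targeting` and the three-residue toy sequences are hypothetical and do not represent real IGHV4-34 data.

```python
from collections import Counter


def codon_targeting(germline, sequences, codon_numbers):
    """Per-codon replacement statistics for one subset.

    germline      -- germline amino acid string (hypothetical toy input)
    sequences     -- aligned amino acid strings, same length as germline
    codon_numbers -- codon number for each string position
    """
    n = len(sequences)
    stats = {}
    for idx, codon in enumerate(codon_numbers):
        germ_aa = germline[idx]
        # Count every residue that differs from the germline at this position.
        changes = Counter(
            seq[idx] for seq in sequences if seq[idx] != germ_aa
        )
        mutated = sum(changes.values())
        stats[codon] = {
            # Fraction of cases carrying any replacement at this codon.
            "frequency": mutated / n if n else 0.0,
            # Fraction of cases carrying each specific change, e.g. "G>E".
            "changes": {
                f"{germ_aa}>{aa}": count / n for aa, count in changes.items()
            },
        }
    return stats


# Toy example: three positions labelled as codons 27-29 (not real data).
toy_germline = "GGS"
toy_subset = ["GES", "GES", "GGS", "GDS"]
stats = codon_targeting(toy_germline, toy_subset, [27, 28, 29])
print(stats[28])
```

In this toy input, three of the four sequences differ from the germline at the position labelled codon 28, two of them by a G>E change. Running the same tabulation separately for each subset would allow the kind of between-subset comparison described above.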


