Abstract

The human genome uses alternative pre-mRNA splicing as an important mechanism to encode a complex proteome from a relatively small number of genes. An unknown number of these genes also possess multiple transcriptional promoters and alternative first exons that contribute another layer of complexity to gene expression mechanisms. Using a collection of more than 100 erythroid-expressed genes as a test group, we used genome browser tools and genetic databases to assess the frequency of alternative first exons in the genome. Remarkably, 35% of these erythroid genes show evidence of alternative first exons. The majority of the candidate first exons are situated upstream of the coding exons, whereas a few are located internally within the gene. Computational analyses predict transcriptional promoters closely associated with many of the candidate first exons, supporting their authenticity. Importantly, the frequent presence of consensus translation initiation sites among the alternative first exons suggests that many proteins have alternative N-terminal structures whose expression can be coupled to promoter choice. These findings indicate that alternative promoters and first exons are more widespread in the human genome than previously appreciated and that they may play a major role in regulating expression of selected protein isoforms in a tissue-specific manner. (Blood. 2006;107: 2557-2561)

Introduction

Alternative pre-mRNA splicing is an important cellular mechanism by which functionally diverse proteins can be synthesized from a single gene. As many as 60% of human genes use alternative RNA processing to generate, from a single gene, mature mRNAs that differ in exon composition at the 5′ end, within the internal coding regions, or at the 3′ end.1-3  By producing different splice variants of a single gene, multiple protein isoforms with diverse structural/functional properties can be generated; examples of this abound among molecules that are involved in transcriptional activation, ligand interaction at the cell surface, and intracellular binding interactions among cytoskeletal components.4,5  Therefore, the use of alternative exons is an essential mechanism for proper regulation of many cellular processes.

The erythroid protein 4.1R is a prime example of how splicing can significantly affect a molecule's characteristics. Many different 4.1R isoforms can be produced from this single complex locus during erythroid development. For example, exon 16, which encodes an essential part of the spectrin-actin binding domain of the 4.1R molecule, is excluded in early erythroid progenitors but included in erythrocytes of later stages; the inclusion of this critical spectrin-binding region allows 4.1R proteins to stabilize the developing membrane skeleton in mature red cells, by participating in associations with skeletal protein networks as well as contacts with integral membrane proteins in the lipid bilayer.6 In addition to a host of internal exons, the 4.1R gene also has considerable complexity at the 5′ end. Specifically, there are 3 mutually exclusive first exons that map far upstream of the coding exons, with each of these alternative exons having its own transcriptional promoter. These exons splice to different acceptor sites downstream in exon 2, producing 2 protein isoforms (80 kDa and 135 kDa) that possess distinct N-terminal structure and function.7

Besides 4.1R, several other erythroid genes are also known to have alternative first exons. The gene encoding uroporphyrinogen-III synthase, an enzyme essential for heme biosynthesis, has 2 alternative 5′ exons, one of which is specific for erythropoietic tissues.8  Two other genes vital for erythroid functioning, acetylcholinesterase9  and ankyrin-1,10  also exhibit complexity at the 5′ terminus. In the current study, we show that this genomic theme is present among many other erythroid genes as well. Genetic database analysis revealed that many erythroid genes possess 2 or more alternative first exons; the authenticity of these previously unappreciated first exons was supported by the presence of computationally predicted transcriptional promoters found near a large majority of the 5′ ends. The results of this study suggest that alternative first exons may represent an important mechanism by which the expression of erythroid genes is regulated.

Materials and methods

Erythroid genes analyzed

Many genes essential for human red cells were analyzed in this study. They were categorized into 6 different groups: (1) globins, (2) heme biosynthetic enzymes, (3) transcription factors, (4) cytoskeletal proteins, (5) plasma membrane proteins, and (6) glycolytic enzymes plus other cytoplasmic proteins. The majority of these genes were adopted from Hembase, a database of erythroid genes.11  A complete catalog of the genes analyzed in this study is displayed in Table 1.

Table 1.

List of erythroid genes analyzed for alternative first exons


Type (no. genes)  Genes
Globins (9)  HBA1, HBA2, HBB, HBD, HBG1, HBG2, HBE1, HBQ1, HBZ
Heme biosynthetic enzymes (8)  ALAS2, ALAD, CPO, FECH, HMBS, PPOX, UROD, UROS
Transcription factors (11)  BKLF, FLI-1, FOG-1, GATA1, GATA2, GADD153, KLF-1, MAFG, NFE2L1, NFE2L2, NFE2
Cytoskeletal proteins (13)  ACTB, ADD1, ADD2, ADD3, ANK1, EPB41, EPB42, EPB49, P55, SPTA1, SPTB, TMOD1, TPM1
Plasma membrane proteins (33)  ACHE, AQP1, AQP3, AE1, CD44, CD47, CD55, CD99, CD147, CD151, CR1, DO, ERMAP, FUT1, FUT2, FUT3, FY, GLUT1, GYPA, GYPB, GYPC, GYPE, ICAM4, KEL, LU, RHAG, RHCE, RHD, SLC14A1, STOM, TRFC, XG, XK
Glycolytic enzymes and other proteins (29)  A4GALT, ABC-Me, ABO, ADA, AHSP, ALDOA, ALOX15, B3GALT, BPGM, CA2, DYRK3, ENO1, G6PD, GCLC, GCNT2, GP1, GPX1, GSR, GSS, GSTT1, HK1, HMOX1, NT5C3, PGD, PGK1, PKLR, PFKM, PRDX2, TP11

Shown here is the complete list of erythroid genes analyzed in this study. In all, we evaluated 103 genes, sorted into 6 distinct categories: (1) globins, (2) heme biosynthesis, (3) transcription factors, (4) cytoskeletal proteins, (5) plasma membrane proteins, and (6) glycolytic enzymes plus other cytoplasmic proteins. The genes are listed in abbreviated form, alphabetically; the number of genes in each category is shown in parentheses.

Method of analysis

To investigate whether a particular gene has alternative first exons, the name of the gene (or its official gene symbol) was typed into the “Genome Search” option in the UCSC genome database.12  From the search results, choosing the “RefSeq” option provided extensive genomic information, including alignment of cDNA and EST clones. Clones with alternative 5′ ends were included in subsequent analyses only if they contained a majority of the gene's exons. For the erythroid genes that do possess alternative first exons, the accession IDs of their cDNA clones showing 5′ variability were recorded.
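
The clone-selection rule described above can be illustrated with a minimal sketch. It assumes that each cDNA or EST clone has already been reduced to an ordered list of exon labels (5′ to 3′). The function name, clone identifiers, and exon labels are hypothetical, and "a majority of the gene's exons" is interpreted here as more than half, which the text does not state explicitly. The analysis itself was carried out interactively in the genome browser, not with this code.

```python
from collections import defaultdict


def alternative_first_exons(gene_exons, clones):
    """Group qualifying clones by the exon found at their 5' end.

    gene_exons: set of exon labels in the reference gene model (hypothetical labels).
    clones: dict mapping accession ID -> ordered list of exon labels, 5' to 3'.
    """
    groups = defaultdict(list)
    for accession, exons in clones.items():
        shared = gene_exons.intersection(exons)
        # Assumed majority rule: more than half of the gene's exons are present.
        if len(shared) * 2 > len(gene_exons):
            groups[exons[0]].append(accession)
    return dict(groups)


# Illustrative input only; these are not real accession IDs.
gene_exons = {"1A", "2", "3", "4", "5"}
clones = {
    "CLONE_A": ["1A", "2", "3", "4", "5"],
    "CLONE_B": ["1B", "2", "3", "4"],
    "CLONE_C": ["1C", "2"],  # covers too little of the gene and is dropped
}
result = alternative_first_exons(gene_exons, clones)
```

A gene would be flagged for follow-up when the returned dictionary has 2 or more keys, that is, when qualifying clones begin with different first exons.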

Analysis of promoter regions

DNA sequences flanking alternative first exons were analyzed using PROSCAN v1.7 (BioInformatics and Molecular Analysis Section, National Institutes of Health [NIH]; http://thr.cit.nih.gov/molbio/proscan), a program designed to find putative eukaryotic RNA polymerase II promoter sequences in primary sequence data.13 Whenever possible, 3000 bp upstream and 1500 bp downstream of each alternative first exon were submitted to the program to search for potential promoter sites. For some genes, the first exons are clustered so closely together that these windows cannot be delineated separately; in these cases, the search regions were adjusted to avoid overlap. Finally, a negative control was performed for each gene that exhibited 5′ end complexity: internal gene sequences far from the first exon were searched for potential promoters using PROSCAN.
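
The choice of search regions can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: coordinates are taken on the forward strand, the 3000-bp window is measured upstream of the exon start and the 1500-bp window downstream of the exon end, and overlapping windows are trimmed at the midpoint between neighboring exons. The text says only that adjustments were made to avoid overlap, so the midpoint rule and the function name are hypothetical.

```python
def search_windows(first_exons, upstream=3000, downstream=1500):
    """Return one (start, end) search window per first exon, without overlaps.

    first_exons: list of non-overlapping (exon_start, exon_end) pairs,
    forward-strand coordinates.
    """
    exons = sorted(first_exons)
    windows = [[max(0, start - upstream), end + downstream] for start, end in exons]
    for i in range(len(windows) - 1):
        left, right = windows[i], windows[i + 1]
        if left[1] >= right[0]:
            # Assumed adjustment: split the region between the two exons at its midpoint.
            midpoint = (exons[i][1] + exons[i + 1][0]) // 2
            left[1] = midpoint
            right[0] = midpoint + 1
    return [tuple(window) for window in windows]


# Hypothetical coordinates: two clustered first exons and one distant exon.
windows = search_windows([(10_000, 10_200), (12_000, 12_150), (40_000, 40_300)])
```

In this example the first two windows would otherwise overlap, so they are cut at position 11100, while the distant third exon keeps its full window.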

Results

Identification of alternative first exons in erythroid genes

We examined a total of 103 genes that are expressed in the human red cell, drawn mainly from the Hembase database,11 and supplemented with a number of transcription factor genes from the literature. A complete inventory of the genes studied is shown in Table 1. These genes are categorized as globins, heme biosynthetic enzymes, transcription factors, cytoskeletal proteins, plasma membrane proteins, and glycolytic enzymes plus other cytoplasmic proteins. For each of these genes, we analyzed all of the known cDNA clones in the UCSC genome browser database12 to look for cDNAs that share coding exons but contain alternative 5′ termini. Such termini were considered strong candidates for alternative first exons if they were located upstream of, and were correctly spliced to, coding exons of the gene. Computational evidence for adjacent transcriptional promoter activity, when available, increased confidence that these 5′ sequences represent authentic first exons (see “Promoter scanning of alternative first exons”).

As shown in Table 2, our major novel finding was that a high proportion of erythroid-enriched genes possess candidate alternative first exons (36 of 103, 35%). Although several of these have been reported in earlier studies, the vast majority represent previously unrecognized alternative first exons. This result suggests that alternative first exons are much more common than previously appreciated. However, it seems likely that even this figure may be an underestimate, because a few previously reported alternative first exons were not apparent in the database analysis. Among the missing first exons were one in the protein 4.1R gene,7  2 in the acetylcholinesterase gene (ACHE),9  a muscle-specific promoter in the ankyrin-1 (ANK1) gene,10  a kidney-specific promoter in the band 3 gene (AE1),14  and a testis-specific promoter in the GATA1 gene.15  Therefore, additional first exons will likely be discovered when a more complete set of cDNAs is available for mapping against the human genome assembly.

Table 2.

Summary of genes that exhibit alternative first exons


Category and gene  No. of alternative first exons  Protein isoforms
Globins
None  NA  NA
Heme biosynthetic enzymes
ALAS2  2  No
HMBS  2  Yes
UROS  2  No
Transcription factors
GATA-1*  2  No
GATA-2  2  No
MAFG  2  No
NFE2L1  2  Yes
NFE2L2  2  Yes
NFE2  2  No
Cytoskeletal proteins
ADD1  2  No
ADD2  2  No
ADD3  2  No
ANK1*  3  Yes
EPB41*  3  Yes
EPB49  4  No
TPM1  2  Yes
Plasma membrane proteins
ACHE*  4  Yes
AE1*  2  Yes
CD47  2  Yes
CD147  2  Yes
CD151  2  No
FUT3  2  Yes
GYPE  2  Yes
SLC14A1  3  Yes
HK1  3  Yes
NT5C3  2  No
Glycolytic enzymes etc
A4GALT  2  Yes
ABO  2  Yes
ALDOA  3  Yes
B3GALT3  2  Yes
G6PD  2  Yes
GCNT2  3  Yes
GSR  2  Yes
GSS  2  No
PFKM  2  Yes
PKLR  2  Yes

A considerable number of the 103 genes we analyzed in this study display 5′ end complexity. In all, we found 36 genes that show alternative first exons (35%), and they are listed here by functional category. Information about the number of alternative exons, as well as whether these exons lead to the synthesis of protein isoforms with different N-termini, is also provided.

NA indicates not applicable.

*Knowledge of one or more of these genes' alternative first exons came from empirical studies.7,9,10,14,15

Comparison of the various erythrocytic functional categories indicates that certain gene classes appear more likely than others to display 5′ complexity (Table 3). None of the prototypical erythroid globin genes, for example, shows alternative first exons. In contrast, more than 50% of the genes encoding cytoskeletal proteins and transcription factors exhibited alternative first exons. The other gene classes exhibited intermediate values between these 2 extremes. Furthermore, there is a pronounced tendency for individual members of paralogous gene families to exhibit similar structures. For example, genes within the adducin and NFE2-like families exhibited complex 5′ structures, whereas the Rhesus blood group genes did not. Presently, it is not known whether the differences among the functional categories represent a genuine phenomenon or an artifact of the limited sample size; additional studies on a more expansive, genome-wide scale will be necessary to confirm this observation.

Table 3.

Prevalence of alternative first exons in various functional gene classes


Gene category  Total no. genes  No. genes with alternative first exons  Genes with alternative first exons, %
Globins  9  0  0.0
Heme biosynthesis  8  3  37.5
Transcription factors  11  6  54.5
Cytoskeletal proteins  13  7  53.8
Plasma membrane proteins  33  10  30.3
Glycolytic enzymes etc  29  10  34.5

We compared the prevalence of genes with alternative first exons among the various gene groups analyzed in this study. Certain categories appear more likely to exhibit alternative first exons. For example, none of the globin genes possess alternative 5′ ends, whereas more than half of the erythroid transcription factors and cytoskeletal proteins display such a feature. The data suggest that 5′ complexity may represent an important genetic mechanism for certain cellular processes.
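
The percentages in Table 3, and the overall figure of 36 of 103 genes (35%), follow directly from the per-category counts. The short sketch below recomputes them as a consistency check; the variable names are illustrative and the counts are copied from Table 3.

```python
# Per-category counts from Table 3: (total genes, genes with alternative first exons)
table3 = {
    "Globins": (9, 0),
    "Heme biosynthesis": (8, 3),
    "Transcription factors": (11, 6),
    "Cytoskeletal proteins": (13, 7),
    "Plasma membrane proteins": (33, 10),
    "Glycolytic enzymes etc": (29, 10),
}

# Percentage of genes with alternative first exons in each category
percentages = {
    category: round(100 * with_alt / total, 1)
    for category, (total, with_alt) in table3.items()
}

total_genes = sum(total for total, _ in table3.values())
total_with_alt = sum(with_alt for _, with_alt in table3.values())
overall_percentage = round(100 * total_with_alt / total_genes, 1)
```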

Promoter scanning of alternative first exons

An essential feature of a genuine first exon is the presence of an adjacent transcriptional promoter. To search computationally for potential promoter sites among the predicted first exons of these erythroid genes, we used a promoter-scanning program, PROSCAN (v1.7). This program predicts eukaryotic RNA polymerase II promoter sites based on recognition of the number and type of transcriptional elements that are typically associated with RNA polymerase II.13  It recognizes approximately 70% of primate promoter sequences. Therefore, if putative promoter elements can be found for a similar proportion of first exons analyzed in this study, it would lend credence to their being authentic, bona fide 5′ ends.

Excluding a small number of 5′ exons that have been analyzed experimentally in previous studies, we scanned the flanking regions of all of the alternative first exons identified in this study and found that 72.6% of them possessed a predicted promoter in the vicinity. These promoters are indicated by arrows in Figure S1 (available at the Blood website; see the Supplemental Figure link at the top of the online article). Most of these promoter sequences are found within 500 bp upstream of the first base pair of the cDNA, although a few were predicted as far as 1500 bp upstream or even slightly downstream from the first exon. Precedent for downstream sequences being required for promoter activity exists in the case of α-spectrin.16 For several transcripts, 2 candidate promoters were predicted, likely indicating a high concentration of transcription factor binding sites in the region.

To independently ascertain the fidelity of the program, we performed both positive and negative controls in the promoter-scanning analysis. As a positive control, we analyzed the DNA flanking the first exons of the protein 4.1R gene; in a previous study, using luciferase reporter constructs, we showed definitively that the DNA regions flanking exons 1A, 1B, and 1C function as transcriptional promoters.7  Computer scanning of these regions did in fact predict promoters in the expected locations for 2 of the 3 5′ exons (Figure S1). The negative control was done by surveying randomly selected internal 3-kb (kilobase) sequences in each of the 36 genes with alternative first exons. Because these sequences were distant from any known 5′ terminus, they were not expected to possess promoter activity. Indeed, only 1 (2.8%) of the 36 negative control sequences contained a candidate promoter region. These controls suggest that the program, although accurate enough to predict known promoters, is not undesirably permissive.

Taken together, these results substantiate the view that the majority of predicted first exons are indeed legitimate because they possess nearby, flanking elements that may function as promoters. Because the proportion of first exons predicted to possess associated promoter activity in this data set is similar to the known accuracy of the algorithm, it suggests that the large majority of candidate first exons represent authentic 5′ ends. Positive and negative controls further reinforce this notion by showing that the program does recognize known promoters while not being promiscuous in its identifications.
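
The reasoning in this paragraph can be made explicit with a back-of-envelope calculation. The sketch below is an illustration rather than part of the original analysis. It assumes that the observed hit rate is a mixture of true promoters detected at the program's nominal sensitivity (about 70%) and spurious hits occurring at the negative-control rate (1 of 36), and it solves for the fraction of candidate first exons that would have to be authentic. The function name is hypothetical.

```python
def authentic_fraction(observed_rate, sensitivity, false_positive_rate):
    """Solve observed = f * sensitivity + (1 - f) * false_positive_rate for f."""
    return (observed_rate - false_positive_rate) / (sensitivity - false_positive_rate)


negative_control_rate = 1 / 36  # 1 of 36 internal control sequences scored positive
raw_estimate = authentic_fraction(
    observed_rate=0.726,        # fraction of candidate first exons with a predicted promoter
    sensitivity=0.70,           # approximate recognition rate for primate promoters
    false_positive_rate=negative_control_rate,
)
# A fraction cannot exceed 1, so the estimate is clamped to the interval [0, 1].
estimate = min(1.0, max(0.0, raw_estimate))
```

Under these assumptions the raw estimate slightly exceeds 1, which is consistent with nearly all candidates being authentic. Because the 70% sensitivity is itself approximate, the calculation should be read as a plausibility check only.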

Alternative first exons with unique features

Analysis of the genes possessing alternative first exons revealed that 5′ complexity can assume several different forms. Of the 36 genes found to have alternative first exons, most (28) exhibit 2 such exons, whereas 6 genes have 3 and 2 genes have 4. In terms of gene organization, the most common arrangement was the presence of mutually exclusive first exons that map upstream of, and splice accurately to, a common set of downstream exons (26 of 36, 72.2%). These upstream first exons can be located close together or tens of kilobases apart, differences that presumably reflect the regulatory requirements of the individual genes. The EPB41 gene, for example, has first exons as far as 101 kb upstream of the shared coding exon, whereas the distance is much less in EPB49 and NRF1. A complete listing of the genes belonging to this class is shown in Figure S1.

A second class of mutually exclusive alternative first exons differs from the conventional class by virtue of localization internally within the gene. That is, at least one of the apparent first exons occurs downstream of one or more internal coding exon(s), sometimes quite deep within the gene. This type of internal first exon has been reported previously in a number of nonerythroid genes as well as the erythroid AE114  and ANK110  genes; in the latter case, direct experimental evidence confirmed the presence of a transcriptional promoter in intron 39 associated with this novel 5′ end, located 228 kb from the most upstream first exon.10  Additional examples of this class include aldolase A (ALDOA) and TPM1 (tropomyosin-1), shown in Figure S1.

Finally, several genes possess “dual-capacity” 5′ exons predicted to function either as alternative first exons or as internal exons, depending on differential usage of transcriptional promoters and pre-mRNA splice sites. Precedent for this type of arrangement has been reported for the UROS (uroporphyrinogen synthase) gene, including experimental confirmation of erythroid-specific promoter activity associated with the previously designated exon 2.8 In the context of transcripts that initiate upstream at exon 1A, the erythroid promoter near exon 2 is ignored and instead the exon 2 splice acceptor site is used, converting this sequence into internal exon 2. In contrast, transcripts derived from the erythroid promoter express a 5′-extended variant of “exon 2” sequence that represents instead an alternative first exon. Among the erythroid genes analyzed here, a similar organization is present in the ACHE (acetylcholinesterase)9 and SLC14A1 (hUT-A1 urea transporter; Figure S1) genes.
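
The three arrangements described in this section (upstream, internal, and dual-capacity first exons) can be distinguished from exon coordinates alone. The sketch below shows one way to express that logic. It is not the procedure used in this study, where the classes were assigned by inspection. All coordinates are hypothetical, a forward-strand gene is assumed, and the function names are illustrative.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_first_exon(first_exon, other_transcripts, first_coding_exon):
    """Assign a candidate first exon to one of the three classes.

    first_exon: (start, end) of the candidate first exon.
    other_transcripts: exon lists (5' to 3') of the gene's other transcripts.
    first_coding_exon: (start, end) of the most upstream coding exon.
    """
    # Dual-capacity: the same sequence serves as an internal exon in another transcript.
    for exons in other_transcripts:
        if any(overlaps(first_exon, exon) for exon in exons[1:]):
            return "dual-capacity"
    # Internal: the first exon begins downstream of the most upstream coding exon.
    if first_exon[0] > first_coding_exon[1]:
        return "internal"
    return "upstream"


# Hypothetical gene with three transcripts (forward-strand coordinates).
coding = (50_000, 50_200)
t_upstream = [(1_000, 1_150), (50_000, 50_200), (60_000, 60_100)]
t_internal = [(70_000, 70_120), (80_000, 80_150)]
t_extended = [(49_800, 50_200), (60_000, 60_100)]  # 5'-extended version of the coding exon

labels = [
    classify_first_exon(t_upstream[0], [t_internal, t_extended], coding),
    classify_first_exon(t_internal[0], [t_upstream, t_extended], coding),
    classify_first_exon(t_extended[0], [t_upstream, t_internal], coding),
]
```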

Alternative first exons versus protein N-termini and gene complexity

We next investigated the effects of alternative first exon utilization on the structures of the encoded proteins. Translation effects may occur directly, by insertion of alternative in-frame translation initiation sites, or indirectly by effects on downstream alternative splicing patterns of exons that contain initiation sites.7 Analysis of the erythroid alternative splicing data set in this study reveals that in about two thirds of the cases (23 of 36, 64%), genes with alternative first exons have the capacity to alter translation so as to encode protein isoforms with distinct N-terminal domains. In conventional cases of mutually exclusive upstream first exons, the alternative N-termini were most frequently in the range of 10 to 50 amino acids in length. At the other extreme, the internal alternative first exon in the ANK1 gene encodes a protein lacking 1726 amino acids,10 and the isoforms created by the various 4.1R first exons differ by 210 amino acids at the N-terminus.7 Therefore, it is clear that alternative first exons may have widespread effects on protein structure and function.
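
The summary counts quoted in the Results (28, 6, and 2 genes with 2, 3, and 4 alternative first exons, and 23 of 36 genes with distinct N-termini) can be recomputed from Table 2. The sketch below transcribes the table as it appears here and tallies both quantities; it is offered as a consistency check, and the variable names are illustrative.

```python
from collections import Counter

# (gene, no. of alternative first exons, distinct N-terminal isoforms?) from Table 2
TABLE2 = [
    ("ALAS2", 2, False), ("HMBS", 2, True), ("UROS", 2, False),
    ("GATA-1", 2, False), ("GATA-2", 2, False), ("MAFG", 2, False),
    ("NFE2L1", 2, True), ("NFE2L2", 2, True), ("NFE2", 2, False),
    ("ADD1", 2, False), ("ADD2", 2, False), ("ADD3", 2, False),
    ("ANK1", 3, True), ("EPB41", 3, True), ("EPB49", 4, False), ("TPM1", 2, True),
    ("ACHE", 4, True), ("AE1", 2, True), ("CD47", 2, True), ("CD147", 2, True),
    ("CD151", 2, False), ("FUT3", 2, True), ("GYPE", 2, True), ("SLC14A1", 3, True),
    ("HK1", 3, True), ("NT5C3", 2, False),
    ("A4GALT", 2, True), ("ABO", 2, True), ("ALDOA", 3, True), ("B3GALT3", 2, True),
    ("G6PD", 2, True), ("GCNT2", 3, True), ("GSR", 2, True), ("GSS", 2, False),
    ("PFKM", 2, True), ("PKLR", 2, True),
]

# How many genes have 2, 3, or 4 alternative first exons
exon_count_distribution = Counter(n_exons for _, n_exons, _ in TABLE2)

# How many genes can encode isoforms with distinct N-termini
genes_with_isoforms = sum(1 for _, _, has_isoforms in TABLE2 if has_isoforms)
isoform_percentage = round(100 * genes_with_isoforms / len(TABLE2))
```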

One final observation is that the complexity of a gene (defined by the total number of exons in its transcription unit) is not correlated with its likelihood of having 5′ complexity. We tabulated the average number of exons of the 36 genes that were found to have alternative first exons and that of the 67 genes that did not show the feature (10.20 ± 1.30 and 9.93 ± 1.04, respectively). Comparison of the 2 values indicates no correlation between gene complexity and alternative first exon occurrence. This is a somewhat surprising result, because one would intuitively expect a gene with more exons to be more prone to 5′ complexity.
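
The comparison of the 2 means can be illustrated with a simple calculation from the reported summary values. The text does not state whether the ± values are standard errors or standard deviations, so the sketch below computes an approximate z statistic under each interpretation. This is an illustrative check with hypothetical variable names, not the authors' statistical method.

```python
import math

mean_with, spread_with, n_with = 10.20, 1.30, 36            # genes with alternative first exons
mean_without, spread_without, n_without = 9.93, 1.04, 67    # genes without

difference = mean_with - mean_without

# Interpretation A (assumed): the ± values are standard errors of the mean.
z_if_sem = difference / math.sqrt(spread_with**2 + spread_without**2)

# Interpretation B (assumed): the ± values are standard deviations.
z_if_sd = difference / math.sqrt(
    spread_with**2 / n_with + spread_without**2 / n_without
)

# 1.96 is the conventional two-sided 5% threshold for a z statistic.
significant_either_way = abs(z_if_sem) > 1.96 or abs(z_if_sd) > 1.96
```

Under either reading the statistic falls well below 1.96, in line with the conclusion that the 2 groups do not differ detectably in exon number.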

Discussion

The results presented here demonstrate that the incidence of alternative first exons is more widespread among erythroid genes than has been recognized previously. In this survey of 103 genes, 35% exhibited apparent alternative 5′ ends that fulfill the major requirements expected of authentic first exons: they map upstream of the coding exons, splice accurately to these exons, and are computationally predicted to possess closely associated transcriptional promoters. However, it seems likely that the true frequency of alternative first exons is even higher than that predicted by database analysis, because no complete cDNA data set is yet available for mapping against the genome sequence.

The complex 5′ exon structure characteristic of many erythroid genes likely influences expression of the encoded proteome by several mechanisms. The presence of multiple transcriptional promoters, each potentially regulated by separate enhancer elements and responsive to different environmental cues, can facilitate proper spatial and temporal regulation of gene expression in the specialized differentiation programs unique to various cell types. For a few of the genes, it has been shown that one promoter serves a general housekeeping function to facilitate expression in nonerythroid cells, whereas the other is erythroid-specific and apparently functions to ensure proper quantitative, stage-specific expression during erythropoiesis. Included in this category are the genes UROS,8 GATA1,15 ANK1,10  and AE1.14  For the majority of cases described here, however, specific regulatory functions have yet to be attributed to the individual promoters.

Besides spatial and temporal regulation from alternative promoters, the unique sequence features of alternative first exons may also allow additional levels of gene regulation. The simplest way this is accomplished is through the inclusion of different start codons among the first exons; as this produces protein isoforms with different N-termini, the biochemical properties of a particular gene product can be greatly altered. However, first exons that do not alter translation products can still affect a gene's expression in other ways, such as altering the efficiency of mRNA translation. An estimated 20% to 48% of human transcripts contain at least one upstream open reading frame (uORF) preceding the “real” start codon.17-20  These uORFs can manipulate the translation initiation efficiency of the main, downstream ORF sometimes in tissue-specific patterns,21-25  or they may even influence initiation at different AUG codons.26  Alternatively, 5′ UTRs can dramatically alter translation efficiency of the downstream ORF through the action of specific binding proteins such as iron regulatory protein 1 (IRP1). IRP1 binding to iron responsive elements in the 5′ UTRs of ferritin and ALAS2 can inhibit translation in an iron-dependent fashion.27-29 

It is important to note that the phenomenon of 5′ gene complexity and alternative promoters is not limited to erythroid cells, but rather is a general feature of human genes expressed in many cell types. Recent studies have used high-throughput microarray strategies to exhaustively map complete transcriptome sequences against the cognate genome assemblies, leading to identification of many novel alternative 5′ and 3′ exons in previously known genes.30,31  A general conclusion of this approach is that vertebrate genes frequently possess alternative promoters/alternative first exons.32  Alternatively, “ChIP on chip” assays can be used to map RNA polymerase II binding sites within the genome, representing an independent approach to the identification of transcription start sites.33  We speculate that these complementary strategies will ultimately reveal that alternative usage of first exons may prove to be as widespread and important in gene regulation as is alternative splicing of internal coding exons.

Prepublished online as Blood First Edition Paper, November 17, 2005; DOI 10.1182/blood-2005-07-2957.

Supported by the National Institutes of Health (grant DK32094) (N.M.) and the Director of the Office of Biological and Environmental Research, U.S. Department of Energy, under contract DE-AC03-765F0098.

The online version of the article contains a data supplement.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.

We thank Christina Schleupen for discussions and suggestions.

1. Black DL. Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell. 2000;103:367-370.
2. Mironov AA, Fickett JW, Gelfand MS. Frequent alternative splicing of human genes. Genome Res. 1999;9:1288-1293.
3. Hanke J, Brett D, Zastrow I, et al. Alternative splicing of human genes: more the rule than the exception? Trends Genet. 1999;15:389-390.
4. Lopez AJ. Alternative splicing of pre-mRNA: developmental consequences and mechanisms of regulation. Annu Rev Genet. 1998;32:279-305.
5. Smith CW, Valcarcel J. Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem Sci. 2000;25:381-388.
6. Pinder JC, Chung A, Reid ME, Gratzer WB. Membrane attachment sites for the membrane cytoskeletal protein 4.1 of the red blood cell. Blood. 1993;82:3482-3488.
7. Parra MK, Gee SL, Koury MJ, Mohandas N, Conboy JG. Alternative 5′ exons and differential splicing regulate expression of protein 4.1R isoforms with distinct N-termini. Blood. 2003;15:4164-4171.
8. Aizencang G, Constanza S, Bishop D, Warner C, Desnick R. Human uroporphyrinogen-III synthase: genomic organization, alternative promoters, and erythroid-specific expression. Genomics. 2000;70:223-231.
9. Meshorer E, Toibert D, Zurel D, et al. Combinatorial complexity of 5′ alternative acetylcholinesterase transcripts and protein products. J Biol Chem. 2004;279:29740-29751.
10. Gallagher P, Forget B. An alternative promoter directs expression of a truncated, muscle-specific isoforms of the human ankyrin 1 gene. J Biol Chem. 1998;273:1339-1348.
11. National Institute of Diabetes and Digestive and Kidney Diseases. Hembase: database of human erythroid gene activity. http://hembase.niddk.nih.gov/. Accessed January 17, 2005.
12. Genome Bioinformatics Group of UC Santa Cruz. Genome Browser. http://www.genome.ucsc.edu. Accessed March 20, 2005.
13. Prestridge DS. Predicting Pol II promoter sequences using transcription factor binding sites. J Mol Biol. 1995;249:923-932.
14. Sahr KE, Taylor WM, Daniels BP, Rubin HL, Jarolim P. The structure and organization of the human erythroid anion exchanger (AE1) gene. Genomics. 1994;24:491-501.
15. Vannuchi AM, Linari S, Lin C-S, Koury MJ, Bondurant MC, Migliaccio AR. Increased expression of the distal, but not of the proximal, GATA1 transcripts during differentiation of primary erythroid cells. J Cell Physiol. 1999;180:390-401.
16. Wong EY, Lin J, Forget BG, Bodine DM, Gallagher PG. Sequences downstream of the erythroid promoter are required for high level expression of the human alpha-spectrin gene. J Biol Chem. 2004;279:55024-55033.
17. Kozak M. An analysis of 5′-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res. 1987;15:8125-8148.
18. Pesole G, Liuni S, Grillo G, Saccone C. Structural and compositional features of untranslated regions of eukaryotic mRNAs. Gene. 1997;205:95-102.
19. Pesole G, Gissi C, Grillo G, Licciulli F, Liuni S, Saccone C. Analysis of oligonucleotide AUG start codon context in eukaryotic mRNAs. Gene. 2000;261:85-91.
20. Suzuki Y, Ishihara D, Sasaki M, et al. Statistical analysis of the 5′-untranslated region of human mRNA using 'oligo-capped' cDNA libraries. Genomics. 2000;64:286-297.
21. Hohn T, Corsten S, Dominguez D, et al. Shunting is a translation strategy used by plant pararetroviruses (Caulimoviridae). Micron. 2001;32:51-57.
22. Pooggin MM, Hohn T, Futterer J. Role of a short open reading frame in ribosome shunt on the cauliflower mosaic virus RNA leader. J Biol Chem. 2000;275:17288-17296.
23. Reynolds K, Zimmer AM, Zimmer A. Regulation of RARβ2 mRNA expression: evidence for an inhibitory peptide encoded in the 5′-untranslated region. J Cell Biol. 1996;134:827-835.
24. Zimmer A, Zimmer AM, Reynolds K. Tissue-specific expression of the retinoic acid receptor-β2: regulation by short open reading frames in the 5′-noncoding region. J Cell Biol. 1994;127:1111-1119.
25. Meijer HA, Thomas AM. Control of eukaryotic protein synthesis by upstream open reading frames in the 5′-untranslated region of an mRNA. Biochem J. 2002;367:1-11.
26. Calkhoven CF, Müller C, Leutz A. Translational control of C/EBPα and C/EBPβ isoforms expression. Genes Dev. 2000;14:1920-1932.
27. Nunez MT, Garate MA, Arredondo M, Tapia V, Munoz P. The cellular mechanisms of body iron homeostasis. Biol Res. 33:133-142.
28. Pantopoulos K. Iron metabolism and the IRE/IRP regulatory system: an update. Ann N Y Acad Sci. 2004;1012:1-13.
29. Harigae H, Nakajima O, Suwabe N, et al. Aberrant iron accumulation and oxidized status of erythroid-specific delta-aminolevulinate synthase (ALAS2)-deficient definitive erythroblasts. Blood. 2003;101:1188-1193.
30. Kapranov P, Cawley SE, Drenkow J, et al. Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002;296:916-919.
31. Cheng J, Kapranov P, Drenkow J, et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005;308:1149-1154.
32. Carninci P, Kasukawa T, Katayama S, et al. The transcriptional landscape of the mammalian genome. Science. 2005;309:1559-1563.
33. Kim TH, Barrera LO, Qu C, et al. Direct isolation and identification of promoters in the human genome. Genome Res. 2005;15:830-839.