Eukaryotic gene example

Many genomes have been sequenced, and their gene sequences are stored in general DNA sequence databases (e.g. GenBank) and in species-specific databases (e.g. The Arabidopsis Information Resource, TAIR).

The figures show the sequence of one gene (AMY1) of the approximately 25,000 genes in Arabidopsis thaliana, the thale cress plant. This gene encodes an alpha-amylase enzyme.

These images show the gene, cDNA, and coding sequence (CDS) views that researchers use to study the gene further. Such research might involve seed germination or plant flavour.

cDNA
The image below shows a screenshot of the AMY1 cDNA, obtained from TAIR.

Note that TAIR provide three views of the 'Nucleotide Sequence': 'full length CDS', 'full length cDNA' (Fig. 1), and 'full length genomic' (Fig. 2). Each of them uses the DNA alphabet, although strictly the CDS, as a region of the mRNA, should be shown as RNA (AUG etc.).
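The difference between the DNA and RNA alphabets is only the substitution of U for T. A minimal sketch of that conversion follows; the function name and the toy sequence are illustrative assumptions and are not taken from the AMY1 record.

```python
def dna_to_rna(seq: str) -> str:
    """Rewrite a coding-strand DNA sequence in the RNA alphabet (T -> U).

    Case is preserved, so upper/lowercase annotation survives the conversion.
    """
    return seq.translate(str.maketrans("Tt", "Uu"))


# Made-up 9-base sequence for illustration only (not the AMY1 CDS).
toy_cds = "ATGGCTTAA"
print(dna_to_rna(toy_cds))  # AUGGCUUAA
```

Databases usually keep the DNA alphabet for all views so that the CDS, cDNA, and genomic sequences can be compared directly.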

A typical eukaryotic gene is transcribed into an RNA that is then processed into a mature mRNA by removal of introns and by 5' and 3' end processing (addition of a 5' cap and a 3' poly(A) tail).
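Intron removal can be pictured as joining the exon segments of the primary transcript in order. The sketch below assumes the exon coordinates are already known (0-based, end-exclusive) and uses a made-up toy transcript with a lowercase intron; it does not model the 5' cap or the 3' tail, and the names are hypothetical.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join exon slices in order, dropping everything between them (the introns).

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy transcript, not real AMY1 data: exon, lowercase intron, exon.
toy_transcript = "ACATGGCT" + "gtaagtctttag" + "GAATAAGC"
toy_exons = [(0, 8), (20, 28)]

print(splice(toy_transcript, toy_exons))  # ACATGGCTGAATAAGC
```

In a real annotation the exon coordinates would come from the database record rather than being written by hand.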



Gene
The mRNA comprises a 5' UTR (red), the CDS (uppercase, yellow), and a 3' UTR (red again). All three of these regions are exonic, not just the CDS. Introns are shown in purple (lowercase). For convenience, neither the 5' cap nor the 3' tail is shown in the cDNA (Fig. 1), although the mRNA will have them. The gene sequence is also shown as the coding strand, in which the codons can be read directly (ATG...), rather than as the template DNA strand, which is the one actually copied into mRNA.
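The relationship between the displayed coding strand and the template strand can be sketched as follows. The template strand is the reverse complement of the coding strand, and transcribing the template gives an RNA that matches the coding strand with U in place of T. The function names and the toy sequence are assumptions for illustration only.

```python
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def template_strand(coding: str) -> str:
    """Return the template strand (written 5'->3') for a coding strand written 5'->3'."""
    return coding.translate(DNA_COMPLEMENT)[::-1]


def transcribe(template: str) -> str:
    """Return the RNA (written 5'->3') that base-pairs with an uppercase DNA template."""
    return template.translate(RNA_COMPLEMENT)[::-1]


# Made-up coding-strand sequence (not AMY1).
toy_coding = "ATGGCTTAA"
toy_template = template_strand(toy_coding)

print(toy_template)              # TTAAGCCAT
print(transcribe(toy_template))  # AUGGCUUAA
```

The RNA produced from the template reads the same as the coding strand apart from U replacing T, which is why databases display the coding strand.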



This gene structure view is typical of a eukaryotic gene. Similar views can be obtained for human, fruit fly, or yeast genes.