Genes, DNA->RNA->protein
DNA {A,T,C,G}:

5' [upstream] | [promoter(s)] | [exon1] | [gt intron1 ag] | [exon2] | [gt intron2 ag] | [exon3] | [downstream] 3'

is transcribed to RNA {A,U,C,G}:

5' [exon1] | [gu intron1 ag] | [exon2] | [gu intron2 ag] | [exon3] 3'

The RNA is spliced (edited), removing the introns:

5' [exon1] | [exon2] | [exon3] 3'

and is then translated to protein.

atg~aug -> Met (and starts translation)

Stop-translation codons: taa~uaa, tag~uag, tga~uga

UTS = UnTranslated Sequence
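The transcription and splicing steps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the DNA is given as the coding strand (so transcription amounts to replacing T with U), and intron positions are supplied by the caller as half-open index pairs rather than discovered. The function names `transcribe` and `splice` are hypothetical, chosen for this example.

```python
def transcribe(dna):
    """Coding-strand DNA -> RNA: the same sequence with T replaced by U."""
    return dna.upper().replace("T", "U")


def splice(pre_mrna, introns):
    """Remove introns from an RNA string.

    introns: list of (start, end) half-open index pairs (an assumption of
    this sketch; real intron boundaries have to be predicted or annotated).
    Each intron is checked against the gu...ag pattern shown above.
    """
    parts, pos = [], 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}:{end} is not of the form gu...ag")
        parts.append(pre_mrna[pos:start])  # keep the exon before this intron
        pos = end
    parts.append(pre_mrna[pos:])  # keep the final exon
    return "".join(parts)


# exon1 = ATGGCC, intron1 = GTAAGTTTAG, exon2 = TTTTAA
dna = "ATGGCC" + "GTAAGTTTAG" + "TTTTAA"
pre_mrna = transcribe(dna)
mrna = splice(pre_mrna, [(6, 16)])
print(pre_mrna)  # AUGGCCGUAAGUUUAGUUUUAA
print(mrna)      # AUGGCCUUUUAA
```

Note that gu...ag is a necessary pattern, not a sufficient one: many gu...ag stretches are not introns, which is why the sketch takes the boundaries as input.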
Genetic Code
There are four DNA (RNA) bases {A,T(U),C,G}.
There are twenty (common) amino acids making up proteins.
mRNA is read in codons - groups of three bases -
and translated into protein according to
the genetic code below:
Rows give position 1, the four middle columns give position 2, and the last column gives position 3. U/T and C are pyrimidines; A and G are purines.

position 1 | U/T | C | A | G | position 3
---|---|---|---|---|---
U/T | UUU Phe[F] | UCU Ser[S] | UAU Tyr[Y] | UGU Cys[C] | U
U/T | UUC Phe[F] | UCC Ser[S] | UAC Tyr[Y] | UGC Cys[C] | C
U/T | UUA Leu[L] | UCA Ser[S] | UAA Stop! | UGA Stop! | A
U/T | UUG Leu[L] | UCG Ser[S] | UAG Stop! | UGG Trp[W] | G
C | CUU Leu[L] | CCU Pro[P] | CAU His[H] | CGU Arg[R] | U
C | CUC Leu[L] | CCC Pro[P] | CAC His[H] | CGC Arg[R] | C
C | CUA Leu[L] | CCA Pro[P] | CAA Gln[Q] | CGA Arg[R] | A
C | CUG Leu[L] | CCG Pro[P] | CAG Gln[Q] | CGG Arg[R] | G
A | AUU Ile[I] | ACU Thr[T] | AAU Asn[N] | AGU Ser[S] | U
A | AUC Ile[I] | ACC Thr[T] | AAC Asn[N] | AGC Ser[S] | C
A | AUA Ile[I] | ACA Thr[T] | AAA Lys[K] | AGA Arg[R] | A
A | AUG Met[M] | ACG Thr[T] | AAG Lys[K] | AGG Arg[R] | G
G | GUU Val[V] | GCU Ala[A] | GAU Asp[D] | GGU Gly[G] | U
G | GUC Val[V] | GCC Ala[A] | GAC Asp[D] | GGC Gly[G] | C
G | GUA Val[V] | GCA Ala[A] | GAA Glu[E] | GGA Gly[G] | A
G | GUG Val[V] | GCG Ala[A] | GAG Glu[E] | GGG Gly[G] | G
There are three stop codons, which end translation: UAA, UAG and UGA. Coding regions normally begin with AUG (Met).
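As a rough illustration of reading mRNA codon by codon, the sketch below builds the standard genetic code as a Python dictionary and translates from the first AUG up to the first stop codon. The names `CODON_TABLE` and `translate` are hypothetical, and starting at the first AUG found anywhere in the string is a simplifying assumption of this example, not a model of how start sites are actually chosen.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the order UUU, UUC, UUA, UUG, UCU, ...
# (position 3 varies fastest); '*' marks a stop codon.
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA spelling as well
    start = mrna.find("AUG")
    if start < 0:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("GGAUGUUUGGCUAAUU"))  # MFG  (AUG UUU GGC, then UAA stops)
```

If the sequence runs out before a stop codon is reached, the sketch simply returns what it has translated so far.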
Any "sufficiently long" stretch of DNA that, in some reading frame (an offset of 0, 1 or 2), contains no stop codon is called an open reading frame (ORF) and is a candidate for being part of a gene.
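The ORF definition above can be sketched as a scan over the three reading frames, collecting maximal stop-free runs of codons. This is a minimal example under stated assumptions: only the given strand is scanned (not its reverse complement), "sufficiently long" is a caller-supplied threshold `min_codons`, and the function name `open_reading_frames` is hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def open_reading_frames(dna, min_codons=3):
    """Return (frame, start, end) for each maximal stop-free run of codons.

    start and end are half-open indices into dna; a run is reported only if
    it contains at least min_codons complete codons.
    """
    dna = dna.upper().replace("U", "T")  # accept RNA spelling as well
    orfs = []
    for frame in range(3):
        run_start = frame  # start of the current stop-free run
        last = frame       # index just past the last complete codon seen
        for i in range(frame, len(dna) - 2, 3):
            if dna[i:i + 3] in STOPS:
                if (i - run_start) // 3 >= min_codons:
                    orfs.append((frame, run_start, i))
                run_start = i + 3  # next run begins after the stop codon
            last = i + 3
        # a run may also be ended by the end of the sequence
        if (last - run_start) // 3 >= min_codons:
            orfs.append((frame, run_start, last))
    return orfs


print(open_reading_frames("ATGAAACCCGGGTAA", min_codons=4))
# [(0, 0, 12), (2, 2, 14)]
```

In the example, frame 1 begins with TGA, so its remaining stop-free run has only three codons and falls below the threshold of four. In practice the threshold would be far larger, since short stop-free stretches occur by chance very often.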
AA Properties
Rows run from hydrophilic (top) to hydrophobic (bottom); columns run from large (left) to small (right).

large | . | . | . | . | . | small
---|---|---|---|---|---|---
. | . | K | Q | E D | . | .
. | H | R | N | . | . | .
. | . | . | . | . | P | G
W | . | . | * | T | S A | .
. | . | M | . | . | . | .
. | F L | . | I V | . | . | .
. | Y | . | . | . | . | C

Approx(!) AA similarity ~ Swanson 84