Genes, DNA->RNA->protein

DNA {A,T,C,G}
5' [upstream][promoter(s)][exon1][gt intron1 ag][exon2][gt intron2 ag][exon3] 3' [downstream]
 
transcribed to RNA {A,U,C,G}
5' [exon1][gu intron1 ag][exon2][gu intron2 ag][exon3] 3'
 
RNA spliced (edited)
[    exon1    ] [exon2] [    exon3    ]
[UTS aug...   ] [exon2] [   ...    UTS]
 
translated to protein
protein
N = any base

              R purine    Y pyrimidine
small  2xH    A           U/T
large  3xH    G           C

(2xH, 3xH: hydrogen bonds in the A-U/T and G-C base pairs.)
atg~aug -> Met (& starts translation)
Stop translation codons:
taa~uaa, tag~uag, tga~uga
UTS = UnTranslated Sequence
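The transcription and splicing steps in the diagram above can be sketched in a few lines of Python. This is a toy illustration, not a model of the real machinery: it assumes the input is the coding (sense) strand, so the RNA is the same sequence with U for T, and it assumes the intron positions are already known. The names `transcribe`, `splice`, `DNA`, `INTRONS` and `MRNA`, and the toy sequence itself, are made up for this example.

```python
def transcribe(dna: str) -> str:
    """Return the RNA matching the given coding-strand DNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as half-open (start, end) index pairs.

    Each intron is checked for the gu...ag ends shown in the diagram.
    """
    pieces = []
    pos = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}:{end} lacks gu...ag ends")
        pieces.append(pre_mrna[pos:start])  # exon before this intron
        pos = end
    pieces.append(pre_mrna[pos:])  # last exon
    return "".join(pieces)


# Toy gene: exon1 + intron1 + exon2 + intron2 + exon3 (hypothetical data).
DNA = "GGATGGCT" + "GTAACAAG" + "TTTCCC" + "GTCCTTAG" + "TGACC"
INTRONS = [(8, 16), (22, 30)]

MRNA = splice(transcribe(DNA), INTRONS)
print(MRNA)
```

In a real gene the intron boundaries are not given in advance; finding them is a separate prediction problem, and gt...ag ends alone are far too common to identify introns.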

Genetic Code

There are four DNA (RNA) bases {A,T(U),C,G}. There are twenty (common) amino acids making up proteins. mRNA is read in codons - groups of three bases - and translated into protein according to the genetic code below:


                              position 2
                   pyrimidine              purine
position 1         U/T         C           A           G           position 3

pyrimidine  U/T    UUU Phe[F]  UCU Ser[S]  UAU Tyr[Y]  UGU Cys[C]  U
                   UUC         UCC         UAC         UGC         C
                   UUA Leu[L]  UCA         UAA Stop!   UGA Stop!   A
                   UUG         UCG         UAG         UGG Trp[W]  G

            C      CUU         CCU Pro[P]  CAU His[H]  CGU Arg[R]  U
                   CUC         CCC         CAC         CGC         C
                   CUA         CCA         CAA Gln[Q]  CGA         A
                   CUG         CCG         CAG         CGG         G

purine      A      AUU Ile[I]  ACU Thr[T]  AAU Asn[N]  AGU Ser[S]  U
                   AUC         ACC         AAC         AGC         C
                   AUA         ACA         AAA Lys[K]  AGA Arg[R]  A
                   AUG Met[M]  ACG         AAG         AGG         G

            G      GUU Val[V]  GCU Ala[A]  GAU Asp[D]  GGU Gly[G]  U
                   GUC         GCC         GAC         GGC         C
                   GUA         GCA         GAA Glu[E]  GGA         A
                   GUG         GCG         GAG         GGG         G

(A label applies to its own codon and to the unlabelled codons below it in the same column, up to the next label.)
Amino acid codes

One-letter ---> three-letter:
A Ala    C Cys    D Asp    E Glu    F Phe
G Gly    H His    I Ile    K Lys    L Leu
M Met    N Asn    P Pro    Q Gln    R Arg
S Ser    T Thr    V Val    W Trp    Y Tyr

Three-letter ---> one-letter:
Ala A    Arg R    Asn N    Asp D    Cys C
Gln Q    Glu E    Gly G    His H    Ile I
Leu L    Lys K    Met M    Phe F    Pro P
Ser S    Thr T    Trp W    Tyr Y    Val V

There are three stop codons, which end translation. All coding regions begin with AUG (Met).
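As a rough sketch of how the genetic code is applied, the following Python builds the 64-entry codon table and translates from the first AUG up to the first stop codon. The names `CODON_TABLE` and `translate` are invented for this example, and the function assumes the input contains only the letters A, C, G and U (or T); any other symbol would raise a `KeyError`.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons, enumerated with
# position 1 varying slowest and position 3 fastest; '*' marks a stop codon.
AMINO = (
    "FFLLSSSSYY**CC*W"  # position 1 = U
    "LLLLPPPPHHQQRRRR"  # position 1 = C
    "IIIMTTTTNNKKSSRR"  # position 1 = A
    "VVVVAAAADDEEGGGG"  # position 1 = G
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first stop codon (or the end)."""
    seq = mrna.upper().replace("T", "U")  # accept DNA spelling as well
    start = seq.find("AUG")
    if start < 0:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("GGAUGGCUUUUCCCUGACC"))  # AUG GCU UUU CCC UGA
```

Storing the table as one 64-character string keeps it easy to check against the printed table: each group of four letters is one cell, read top to bottom.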

Any "sufficiently long" stretch of DNA, in some reading frame (offset of 0, 1 or 2), not containing a stop codon is called an open reading frame (ORF) and is a potential candidate for being a part of a gene.
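The ORF definition can be turned into a simple scan, sketched below under some stated assumptions: only the given strand is examined (the reverse complement would supply three more reading frames), "sufficiently long" is an arbitrary `min_codons` threshold, and a start codon is not required, as in the definition above. The function name `open_reading_frames` and its return format are made up for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}


def open_reading_frames(dna: str, min_codons: int = 30) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for each stop-free run of >= min_codons codons.

    start and end are half-open indices into the sequence; end is the index
    of the terminating stop codon, or the end of the last whole codon.
    """
    seq = dna.upper().replace("U", "T")  # accept RNA spelling as well
    found = []
    for frame in range(3):
        run_start = frame
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                if (i - run_start) // 3 >= min_codons:
                    found.append((frame, run_start, i))
                run_start = i + 3  # next run begins after the stop codon
        # A run may also reach the end of the sequence without hitting a stop.
        end = frame + max(0, (len(seq) - frame) // 3) * 3
        if (end - run_start) // 3 >= min_codons:
            found.append((frame, run_start, end))
    return found


print(open_reading_frames("ATGAAACCCGGGTAA", min_codons=4))
```

On random sequence a stop codon turns up about once every 21 codons (3 of 64), which is why long stop-free stretches are worth a closer look.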

AA Properties

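[Figure: approximate (!) amino acid similarity diagram, after Swanson 84. The vertical axis runs from hydrophilic (top) to hydrophobic (bottom); the diagram also carries "large" and "small" labels. Amino acids shown, in reading order from the hydrophilic edge to the hydrophobic edge: K Q E D H R N P G W T S A M F L I V Y C.]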