### Edit Distance

 LA home Computing FP  λ-calc.   Intro   Syntax   Examples    Ints    Bools    Lists(1)    arithmetic    Y (lazy)    Y (strict)    Lists(2)    Trees    Primes(1)    Fibonacci(1)    Unique    Hamming#s    Composites    Fibonacci(2)    Thue seqs.    Edit-dist.    edit dist.
The first lambda calculus solution of the string edit distance problem follows directly from the mathematical definition. It can be seen that it involves ternary recursion and is therefore exponentially slow in terms of the length of the input strings:
```let rec
length = lambda L. if null L then 0 else 1 + length tl L,
min    = lambda x. lambda y. if x < y then x else y,

A =
'A'::'C'::'G'::'T'::nil

,B=
'A'::'G'::'C'::'T'::nil

in let rec

Distance = lambda A. lambda B.
if      null A then length B
else if null B then length A
else
let As = tl A, Bs = tl B
in  if hd A = hd B then Distance As Bs
else 1 + min (Distance As Bs)
(min (Distance As B)
(Distance A Bs))

in Distance A B

{\fB Edit Distance,                                  \fP}
{\fB best case (A=B) O(|A|), worst case exponential. \fP}

```

The next version of edit distance avoids doing repeated work by storing partial results in an "array" (actually a list of lists) giving the well-known dynamic programming algorithm (DPA). This reduces the time complexity to O(|A|*|B|) where the two strings are A and B.
```let rec
count  = lambda L. lambda B.
if null B then nil
else (1 + hd L) :: count tl L tl B,
last = lambda L. if null tl L then hd L else last tl L,
min  = lambda x. lambda y. if x < y then x else y,

A = 'a'::'c'::'g'::'t'::'a'::'c'::
'g'::'t'::'a'::'c'::'g'::'t'::nil  {e.g.}

,B = 'a'::'g'::'c'::'t'::'a'::'c'::
't'::'a'::'c'::'t'::'g'::'t'::nil {e.g.}

in let

Distance = lambda A. lambda B.
let rec
Rows = (0 :: count  hd Rows  B)  {the first row }
:: EachRow A  hd Rows        {the other rows},

EachRow = lambda A. lambda lastrow.
if null A then nil
else
let rec
Ach = hd A,

DoRow = lambda B. lambda NW. lambda W. {NW N}
if null B then nil                   {W  .}
else
let    N  = tl NW
in let me = if Ach = hd B then hd NW
else 1 + min W (min hd N hd NW)
in me :: DoRow tl B  tl NW  me,

thisrow = (1 + hd lastrow)
:: DoRow B lastrow  hd thisrow

in thisrow :: EachRow tl A  thisrow

in last (last Rows)

in Distance A B

{\fB Edit Distance, O(|A|*|B|) time and space. \fP}

```

The final edit distance program reduces the time complexity of O(n*D(A,B)) where the strings are of length ~n, and D(A,B) is the edit distance of A and B.

This program is fast if the strings are similar in which case the edit distance is small. It relies on lazy evaluation or 'call by need' to get this speed up. For a full explanation, see:
L. Allison. Lazy dynamic programming can be eager, Information Processing Letters, 43, pp.207-212, Sept 1992.
```let rec
min    = lambda x. lambda y. if x < y then x else y,
length = lambda L. if null L then 0 else 1+length tl L,
last   = lambda L. if null tl L then hd L else last tl L,
index  = lambda n. lambda L.
if n=1 then hd L else index (n-1) tl L,

acgt = lambda n.
if n > 0 then 'a'::'c'::'g'::'t'::(acgt (n-4)) else nil,

mutate = lambda L. lambda mutn.
let rec
n = length L,
step = if mutn=0 then 2*n+1 else n/mutn,
ch = lambda L. lambda st. lambda mtype.
if null L then nil
else if st = 0 then
if mtype=1 or mtype=3 then            {2:1:1}
'x'::(ch  tl L  step (mtype+1))    {change}
else if mtype=2 then (ch tl L step 3) {delete}
else 'y'::(ch L step 1)               {insert}
else (hd L)::(ch tl L (st-1) mtype)     {copy}
in ch L (step/2) 1,

A = acgt 100     {e.g.}
,B = mutate A 4  {e.g.}

in let

Distance = lambda A. lambda B.
let rec
MainDiag = OneDiag A B  hd Uppers  (-1 :: hd Lowers),
Uppers   = EachDiag A B (MainDiag::Uppers), {upper diags}
Lowers   = EachDiag B A (MainDiag::Lowers), {lower diags}

OneDiag = lambda A. lambda B.
lambda diagAbove. lambda diagBelow.
let rec
DoDiag= lambda A. lambda B. lambda NW. lambda N. lambda W.
if null A  or  null B then nil
else                                   { NW N  }
let me = if hd A = hd B then NW        { W  me }
{fast}          else 1+if hd W < NW then hd W else min hd N NW
{slow}        { else 1+min NW (min hd N  hd W) }

in me::DoDiag  tl A  tl B  me  tl N  tl W, {along diag}
{hope these ^^^^  ^^^^not evaluated}

thisdiag = (1+hd diagBelow)
:: DoDiag A B  hd thisdiag  diagAbove  tl diagBelow

in thisdiag,

EachDiag =  lambda A. lambda B. lambda Diags.
if null B then nil
else (OneDiag A tl B hd tl tl Diags hd Diags) {one diag &}
:: EachDiag A tl B tl Diags                {the others}

in let LAB = (length A) - (length B)
in last if      LAB=0 then MainDiag
else if LAB > 0 then index   LAB  Lowers
else   {LAB < 0}     index (-LAB) Uppers

in Distance A B

{\fB Edit-Distance, diagonal orientation. \fP}

```

#### For the record, the last algorithm in Haskell

```module Edit_IPL_V43_N4 (d) where
-- compute the edit distance of sequences a and b.
d a b =
let
-- diagonal from the top-left element
mainDiag = oneDiag  a b (head uppers) ( -1 : (head lowers))

-- diagonals above the mainDiag
uppers   = eachDiag a b (mainDiag : uppers)

-- diagonals below the mainDiag
lowers   = eachDiag b a (mainDiag : lowers)  -- !

oneDiag a b diagAbove diagBelow =  -- \   \   \
let                               --  \   \   \
doDiag [] b nw n w = []         --   \  nw   n
doDiag a [] nw n w = []         --    \   \
doDiag (a:as) (b:bs) nw n w =   --      w   me
let me = if a==b then nw       -- dynamic programming DPA
else 1+min3 (head w) nw (head n)
in  me : (doDiag as bs me (tail n) (tail w))

firstelt = 1+(head diagBelow)
thisdiag =
firstelt:(doDiag a b firstelt diagAbove (tail diagBelow))

in thisdiag

min3 x y z =
-- see L. Allison, Lazy Dynamic-Programming can be Eager
--     Inf. Proc. Letters 43(4) pp207-212, Sept' 1992
if x < y then x else min y z   -- makes it O(|a|*D(a,b))
-- min x (min y z)             -- makes it O(|a|*|b|)
-- the fast one does not always evaluate all three values.

eachDiag a [] diags = []
eachDiag a (b:bs) (lastDiag:diags) =
let nextDiag = head(tail diags)
in  (oneDiag a bs nextDiag lastDiag):(eachDiag a bs diags)

-- which is the diagonal containing the bottom R.H. elt?
lab = (length a) - (length b)

in last( if      lab == 0 then mainDiag
else if lab > 0 then lowers !! ( lab-1)
else                 uppers !! (-lab-1) )

-- module under Gnu 'copyleft' GPL General Public Licence.
```

See: L. Allison. Lazy dynamic programming can be eager, Information Processing Letters, 43, pp.207-212, Sept 1992.