Fast, Optimal Alignment of Three Sequences Using Linear Gap Cost

LA home
Computing
Publications
 JTB'00
  software

Also see
 JTB93

Bioinformatics
 Three

Journal of Theoretical Biology, 207(3), pp.325-336, Dec., 2000

D. R. Powell, L. Allison & T. I. Dix

School of Computer Science and Software Engineering, Monash University, Australia 3800,

Abstract

Alignment algorithms can be used to infer a relationship between sequences when the true relationship is unknown. Simple alignment algorithms use a cost function that gives a fixed cost to each possible point mutation-mismatch, deletion, insertion. These algorithms tend to find optimal alignments that have many small gaps. It is more biologically plausible to have fewer longer gaps rather than many small gaps in an alignment. To address this issue, linear gap cost algorithms are in common use for aligning biological sequence data. More reliable inferences are obtained by aligning more than two sequences at a time. The obvious dynamic programming algorithm for optimally aligning k sequences of length n runs in O(nk) time. This is impractical if k>3 and n is of any reasonable length. Thus, for this problem there are many heuristics for aligning k sequences, however, they are not guaranteed to find an optimal alignment. In this paper, we present a new algorithm guaranteed to find the optimal alignment for three sequences using linear gap costs. This gives the same results as the dynamic programming algorithm for three sequences, but typically does so much more quickly. It is particularly fast when the (three-way) edit distance is small. Our algorithm uses a speed-up technique based on Ukkonen's greedy algorithm (Ukkonen, 1983) which he presented for two sequences and simple costs.

Full paper here: [doi:10.1006/jtbi.2000.2177][4/'03], [sciencedirect][4/'03].

[Software (click)]

www #ad:

↑ © L. Allison, www.allisons.org/ll/   (or as otherwise indicated).
Created with "vi (Linux)",  charset=iso-8859-1,   fetched Friday, 19-Apr-2024 14:00:33 UTC.

Free: Linux, Ubuntu operating-sys, OpenOffice office-suite, The GIMP ~photoshop, Firefox web-browser, FlashBlock flash on/off.