[Computers and Chemistry, Vol. 24 (1) (2000) pp. 43-55] © January 2000, Elsevier Science Ltd. All rights reserved.

Sequence complexity for biological sequence analysis

L. Allison (a), L. Stern (b), T. Edgoose (a) and T.I. Dix (a)

(a) School of Computer Science and Software Engineering, Monash University, Melbourne, 3168 Australia
(b) Department of Computer Science and Software Engineering, The University of Melbourne, Melbourne, 3052 Australia
Received 7 August 1998; accepted 18 February 1999

Full paper: ScienceDirect, doi:10.1016/S0097-8485(00)80006-6.

Also see ISMB'98, pp. 8-16, 1998, and Mol. Biochem. Parasitology 18(2), pp. 175-186, 2001.


Abstract: A new statistical model for DNA considers a sequence to be a mixture of regions with little structure and regions that are approximate repeats of other subsequences, i.e. instances of a repeat need not match each other exactly. Both forward and reverse-complementary repeats are allowed. The model has a small number of parameters, which are fitted to the data. In general there are many explanations for a given sequence, and it is shown how to compute the total probability of the data given the model. Computer algorithms are described for these tasks. The model can be used to compute the information content of a sequence, either in total or base by base. This amounts to looking at sequences from a data-compression point of view, and it is argued that this is a good way to tackle intelligent sequence analysis in general.

Keywords: Algorithm; DNA; Complexity; Entropy; Pattern discovery; Sequence analysis
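
The idea of information content "in total or base by base" can be illustrated with a much simpler model than the one in the paper. The sketch below is not the authors' approximate-repeats model. It assumes a plain adaptive order-k Markov model over the alphabet ACGT, with hypothetical function names, and charges each base -log2(p) bits, where p is the probability the model assigns to that base before seeing it. Low-cost stretches correspond to predictable (for example, repetitive) regions.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: the input contains only these four bases


def per_base_information(seq, order=2):
    """Return the cost in bits of each base under an adaptive order-k Markov model.

    Illustrative baseline only, not the approximate-repeats model of the paper.
    Every count starts at 1 and is updated only after the base has been costed,
    so a decoder could reproduce exactly the same probabilities.
    """
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 1))
    bits = []
    for i, base in enumerate(seq):
        context = seq[max(0, i - order):i]   # up to `order` preceding bases
        table = counts[context]
        p = table[base] / sum(table.values())
        bits.append(-math.log2(p))           # information content of this base
        table[base] += 1                     # adapt the model
    return bits


def total_information(seq, order=2):
    """Total information content of the sequence, in bits."""
    return sum(per_base_information(seq, order))
```

With no prior data, the first base costs exactly 2 bits, the cost of one of four equally likely symbols. A highly repetitive input such as `"AC" * 50` costs far less than 2 bits per base in total, because the model quickly learns the pattern.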


© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)