Bioinformatics involves the manipulation, searching, and data mining of DNA sequence data. The development of techniques to store and search DNA sequences has led to widely applied advances in computer science, especially in string searching algorithms, machine learning, and database theory. String searching or matching algorithms, which find an occurrence of a sequence of letters inside a larger sequence of letters, were developed to search for specific sequences of nucleotides. In other applications such as text editors, even simple algorithms for this problem usually suffice, but DNA sequences cause these algorithms to exhibit near-worst-case behaviour because DNA uses only a small number of distinct characters (the four nucleotides A, C, G, and T). The related problem of sequence alignment aims to identify homologous sequences and locate the specific mutations that make them distinct. These techniques, especially multiple sequence alignment, are used in studying phylogenetic relationships and protein function. Data sets representing entire genomes' worth of DNA sequences, such as those produced by the Human Genome Project, are difficult to use without annotations, which label the locations of genes and regulatory elements on each chromosome. Regions of DNA sequence that have the characteristic patterns associated with protein- or RNA-coding genes can be identified by gene finding algorithms, which allow researchers to predict the presence of particular gene products in an organism even before they have been isolated experimentally.
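The point about simple string searching and small alphabets can be illustrated with a minimal sketch. The function name naive_search and the sample strings below are made up for illustration and do not come from any particular bioinformatics tool. The sketch counts character comparisons made by a naive left-to-right scan, so that a repetitive low-alphabet input can be compared with a more varied one.

```python
def naive_search(text, pattern):
    """Naive string search (illustrative sketch).

    Returns a tuple (positions, comparisons): the start indices where
    `pattern` occurs in `text`, and the number of character comparisons made.
    """
    positions = []
    comparisons = 0
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        matched = 0
        # Compare the pattern against the text at this alignment,
        # stopping at the first mismatch.
        while matched < m:
            comparisons += 1
            if text[start + matched] != pattern[matched]:
                break
            matched += 1
        if matched == m:
            positions.append(start)
    return positions, comparisons


# Hypothetical inputs: a nucleotide string and a short motif.
print(naive_search("ACGTACGTAC", "GTA")[0])  # [2, 6]

# Repetitive, single-letter text: every alignment matches three characters
# before failing on the fourth, so the scan does 17 * 4 comparisons.
print(naive_search("A" * 20, "AAAT")[1])  # 68

# Varied text: every alignment fails on its first character.
print(naive_search("abcdefghijklmnopqrst", "xyzw")[1])  # 17
```

With only four letters available, long partial matches are far more common in DNA than in ordinary text, which is one reason more sophisticated algorithms (for example Knuth-Morris-Pratt or Boyer-Moore, named here only as well-known alternatives) are usually preferred for sequence data.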
 1:34 AM
 kotesh


