Computational genomics

Computational genomics is the study of deciphering biology from genome sequences, including both DNA and RNA, using computational analysis. It focuses on understanding the human genome and, more generally, the principles of how DNA controls the biology of any species at the molecular level. With the current abundance of massive biological datasets, computational studies have become one of the most important means of biological discovery.

History
Computational genomics began in spirit, if not in name, during the 1960s with the research of Margaret Dayhoff and others at the National Biomedical Research Foundation, who first assembled a database of protein sequences. Their research developed a phylogenetic tree that determined the evolutionary changes that were required for a particular protein to change into another protein based on the underlying amino acid sequences. This led them to create a scoring matrix that assessed the likelihood of one protein being related to another.

Beginning in the 1980s, genome sequences began to be recorded in databases, which presented new challenges in searching and comparing gene information. Unlike text-searching algorithms used on websites such as Google or Wikipedia, searching for regions of genetic similarity requires finding strings that are not simply identical, but similar. This need is met by methods such as the Needleman-Wunsch algorithm (published in 1970), a dynamic programming algorithm for comparing amino acid sequences with each other, which can use scoring matrices derived from the earlier research by Dayhoff. Later, the BLAST algorithm was developed for performing fast, optimized searches of gene sequence databases. BLAST and its derivatives are probably the most widely used algorithms for this purpose.
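The dynamic programming idea behind Needleman-Wunsch can be illustrated with a minimal Python sketch. This is a simplified illustration, not a reference implementation: it assumes fixed match, mismatch and gap scores (the parameter names and default values are chosen here for illustration only) in place of a substitution matrix such as those derived from Dayhoff's work, and it reports only one of the possibly many optimal alignments.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not taken from any published matrix.
    """
    n, m = len(a), len(b)

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    # Fill the table: each cell extends the best of three neighbours.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] against a gap
                score[i][j - 1] + gap,      # b[j-1] against a gap
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, needleman_wunsch("ACGT", "AGT") aligns the two strings by placing a single gap opposite the unmatched C. Because the table has one cell per pair of prefixes, the running time grows with the product of the two sequence lengths, which is one reason faster heuristic tools such as BLAST are preferred for searching large databases.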

The first meeting of the Annual Conference on Computational Genomics was held in 1998, providing a forum for this speciality and effectively distinguishing this area of science from the more general fields of genomics and computational biology. The first use of this term in scientific literature, according to MEDLINE abstracts, was just one year earlier in Nucleic Acids Research.

The development of computer-assisted mathematics (using products such as Mathematica or Matlab) has helped engineers, mathematicians and computer scientists to start working in this domain, and a public collection of case studies and demonstrations is growing, ranging from whole genome comparisons to gene expression analysis. This has brought in a wider range of ideas, including concepts from systems and control, information theory, string analysis and data mining. It is anticipated that computational approaches will become and remain a standard topic for research and teaching, while students fluent in both biology and computation are being trained in the many courses created in the past few years.

Contributions of computational genomics research to biology
Contributions of computational genomics research to biology include:
 * discovering subtle patterns in genomic sequences
 * proposing cellular signalling networks
 * proposing mechanisms of genome evolution
 * predicting precise locations of all human genes using comparative genomics techniques with several mammalian and vertebrate species
 * predicting conserved genomic regions that are related to early embryonic development
 * discovering potential links between repeated sequence motifs and tissue-specific gene expression
 * measuring regions of genomes that have undergone unusually rapid evolution