翻訳と辞書 |
GeneMark
GeneMark is a generic name for a family of ab initio gene prediction programs developed at the Georgia Institute of Technology in Atlanta. Developed in 1993, original GeneMark was used in 1995 as a primary gene prediction tool for annotation of the first completely sequenced bacterial genome of ''Haemophilus influenzae'', and in 1996 for the first archaeal genome of ''Methanococcus jannaschii''. The algorithm introduced inhomogeneous three-periodic Markov chain models of protein-coding DNA sequence that became standard in gene prediction as well as Bayesian approach to gene prediction in two DNA strands simultaneously. Species specific parameters of the models were estimated from training sets of sequences of known type (protein-coding and non-coding). The major step of the algorithm computes for a given DNA fragment posterior probabilities of either being "protein-coding" (carrying genetic code) in each of six possible reading frames (including three frames in complementary DNA strand) or being "non-coding". Original GeneMark (developed before the HMM era in Bioinformatics) is an HMM-like algorithm; it can be viewed as approximation to known in the HMM theory posterior decoding algorithm for appropriately defined HMM. ==Prokaryotic gene prediction==
The GeneMark.hmm algorithm (1998) was designed to improve gene prediction accuracy in finding short genes and gene starts. The idea was to integrate the Markov chain models used in GeneMark into a hidden Markov model framework, with transition between coding and non-coding regions formally interpreted as transitions between hidden states. Additionally, the ribosome binding site model was used to improve accuracy of gene start prediction. Next step was done with development of the self-training gene prediction tool GeneMarkS (2001). GeneMarkS has been in active use by genomics community for gene identification in new prokaryotic genomic sequences. GeneMarkS+, extension of GeneMarkS integrating information on homologous proteins into gene prediction is used in the NCBI pipeline for prokaryotic genomes annotation; the pipeline can annotate up to 2000 genomes daily ().
抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「GeneMark」の詳細全文を読む
スポンサード リンク
翻訳と辞書 : 翻訳のためのインターネットリソース |
Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.
|
|