Gorban, A.N. and Popova, T.G. and Zinovyev, A.Yu. (2004) Four basic symmetry types in the universal 7-cluster structure of 143 complete bacterial genomic sequences. [Preprint]
Abstract
Coding information is the main source of heterogeneity (non-randomness) in the sequences of bacterial genomes. This information can be naturally modeled by analysing cluster structures in the "in-phase" triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in all 143 completely sequenced bacterial genomes available in GenBank in August 2004, and explained its properties. The 7-cluster structure is responsible for the main part of sequence heterogeneity in bacterial genomes. In this sense, the 7-cluster structure is the basic model of a bacterial genome sequence. We demonstrated that there are four basic "pure" types of this model observed in nature: "parallel triangles", "perpendicular triangles", the degenerate case, and the flower-like type. We show that the codon usage of bacterial genomes is, with high accuracy, a multi-linear function of their genomic G+C content (more precisely, it is described by two similar functions, one for eubacterial genomes and the other for archaea). Animated 3D scatter plots of the cluster structure for all 143 genomes are collected in a database and made available on our website (http://www.ihes.fr/~zinovyev/7clusters). The finding can be readily introduced into any software for gene prediction, sequence alignment, or bacterial genome classification.
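As a rough illustration of what a triplet distribution of a short fragment means, the following minimal sketch computes 64-dimensional triplet frequency vectors over sliding windows of a sequence. It is not the authors' code: the function names, the 300 bp default window (chosen from the 200-400 bp range quoted in the abstract), and the 3 bp step are assumptions made for this example, and the clustering step itself is not shown.

```python
from collections import Counter
from itertools import product

# All 64 triplets over the DNA alphabet, in a fixed order (A, C, G, T).
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(fragment, phase=0):
    """Frequencies of non-overlapping triplets read from offset `phase`.

    Hypothetical helper: triplets containing symbols other than A, C, G, T
    (for example N) are ignored when normalising.
    """
    counts = Counter(
        fragment[i:i + 3] for i in range(phase, len(fragment) - 2, 3)
    )
    total = sum(counts[c] for c in CODONS)
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


def sliding_window_vectors(sequence, window=300, step=3):
    """One 64-dimensional frequency vector per window position.

    `window` and `step` are illustrative defaults, not values taken
    from the paper.
    """
    sequence = sequence.upper()
    return [
        triplet_frequencies(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, step)
    ]
```

Each window yields one point in a 64-dimensional space; a cluster analysis of the kind the abstract describes would then be run on the collection of such points for a whole genome.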
| Item Type: | Preprint |
|---|---|
| Keywords: | codon usage, cluster structure, mean field, frequency dictionary |
| Subjects: | Biology > Theoretical Biology |
| ID Code: | 3915 |
| Deposited By: | Gorban, Prof Alexander N. |
| Deposited On: | 06 Nov 2004 |
| Last Modified: | 19 Dec 2009 19:20 |

