Prokaryote Pseudogenes
We have carried out a comprehensive analysis of the occurrence of pseudogenes (disabled copies of genes) in a diverse selection of 64 prokaryote genomes. We find a total of ~7000 candidate prokaryotic pseudogenes. Moreover, in all the genomes surveyed, pseudogenes account for at least 1 to 5% of all gene-like sequences, with some genomes having a considerably higher proportion. The relevant data and texts can be found here.
Downloadable Files
These are the pseudogene databases for each prokaryote organism. Pseudogene annotations are available in GTF files and plain text files. This is a simple, tab-delimited data file. The fields are as follows:

These are the original genome sequences used for the analysis. The references for them are given in the paper. The coordinates in the above pseudogene list (e.g. in fields 4 and 5) should correspond exactly to positions in these files. The files are stored as simple gzipped text files using a naming convention based on the organism name in field 2 of the above file, with all lowercase letters and with spaces and punctuation changed to dashes. For instance, the file for "Escherichia coli O157:H7" is called Escherichia_coli_O157:H7_EDL933 __complete_genome.fasta.
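As a rough illustration of how the coordinates in a pseudogene record line up with a gzipped genome file, the following Python sketch reads a FASTA sequence and cuts out the span given by fields 4 and 5. It rests on several assumptions that the text above does not confirm: that the coordinates are 1-based and inclusive, that each genome file holds a single sequence record, and that fields 1 and 3 can be ignored here (the demo record fills them with placeholder values). The function names and the demo data are hypothetical and are not taken from the database.

```python
import gzip
import io


def read_fasta(handle):
    """Return the sequence of the first record in a FASTA text handle."""
    chunks = []
    seen_header = False
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if seen_header:
                break  # stop at the start of a second record
            seen_header = True
        elif line:
            chunks.append(line)
    return "".join(chunks)


def pseudogene_span(record_line):
    """Split one tab-delimited record into (organism, start, end).

    Field 2 is the organism name; fields 4 and 5 are the coordinates.
    The meaning of the other fields is not assumed here.
    """
    fields = record_line.rstrip("\n").split("\t")
    return fields[1], int(fields[3]), int(fields[4])


def extract(sequence, start, end):
    """Cut the span out of the genome sequence.

    ASSUMPTION: coordinates are 1-based and inclusive.
    """
    return sequence[start - 1:end]


# Made-up demo data standing in for a real gzipped genome file.
fasta_bytes = gzip.compress(b">demo genome\nACGTACGTAC\nGGTTAACC\n")
with gzip.open(io.BytesIO(fasta_bytes), "rt") as handle:
    genome = read_fasta(handle)

# Hypothetical record: fields 1 and 3 are placeholders.
record = "id1\tDemo organism\t+\t3\t12\n"
organism, start, end = pseudogene_span(record)
fragment = extract(genome, start, end)
print(organism, start, end, fragment)
```

For a real file, the in-memory `io.BytesIO` object would be replaced by the path to the downloaded gzipped genome. If the extracted fragment does not resemble the annotated pseudogene, the coordinate convention assumed here is the first thing to check.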
Associated Publications