Pseudogene Annotation in the Rice Genome and Pseudogene-derived Natural Antisense Small RNAs

We have applied our PseudoPipe pipeline to the rice genome (Oryza sativa ssp. japonica) and annotated 11,956 non-TE-related pseudogenes. Most of these pseudogenes are derived from past gene duplications. About 12% of the rice non-TE-related genes have produced a pseudogene, and about half of these genes belong to singleton families. A survey of ~1.5 million small RNAs identified 145 pseudogenes that potentially encode antisense small RNAs in developing rice grains. The majority (>50%) of these RNAs are 24 nt long, a feature often seen in plant repeat-associated siRNAs that are produced by RNA-dependent RNA polymerase 2 (RDR2) and Dicer-like protein 3 (DCL3). Multiple lines of evidence suggest that some of the pseudogene-derived RNAs might function as natural antisense siRNAs, either by interacting with the complementary sense RNAs from functional genes (38 cases) or by forming double-stranded RNAs in transcripts from adjacent paralogous pseudogenes (2 cases). A subsequent examination of five additional small RNA libraries showed that pseudogene-derived antisense siRNAs were often produced in specific rice developmental stages or under specific physiological growth conditions, suggesting potential roles in normal rice development.
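
The antisense survey and the 24-nt length observation described above can be illustrated with a small sketch. This is not the pipeline used in the study; it only shows the general idea under simple assumptions: each small RNA has a single genomic alignment, strands are given as "+" or "-", coordinates are 1-based and inclusive, and a read counts as antisense when it lies entirely within a pseudogene on the opposite strand. The names `Interval`, `is_antisense`, and `antisense_length_profile`, and all coordinates, are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Interval(NamedTuple):
    """A stranded genomic interval (1-based, inclusive). Hypothetical type."""
    chrom: str
    start: int
    end: int
    strand: str  # assumed to be "+" or "-"


def is_antisense(read: Interval, pgene: Interval) -> bool:
    """True if the read lies fully inside the pseudogene on the opposite strand."""
    return (
        read.chrom == pgene.chrom
        and {read.strand, pgene.strand} == {"+", "-"}
        and read.start >= pgene.start
        and read.end <= pgene.end
    )


def antisense_length_profile(reads, pseudogenes):
    """Count antisense reads by length and record which pseudogenes they hit."""
    lengths = Counter()
    hit_pseudogenes = set()
    for read in reads:
        matched = False
        for idx, pgene in enumerate(pseudogenes):
            if is_antisense(read, pgene):
                hit_pseudogenes.add(idx)
                matched = True
        if matched:
            # Count each read once, even if it overlaps several pseudogenes.
            lengths[read.end - read.start + 1] += 1
    return lengths, hit_pseudogenes


# Toy data only; these coordinates are made up for illustration.
pseudogenes = [
    Interval("chr1", 100, 500, "+"),
    Interval("chr2", 1000, 1400, "-"),
]
reads = [
    Interval("chr1", 120, 143, "-"),    # 24 nt, antisense to pseudogene 0
    Interval("chr1", 200, 220, "-"),    # 21 nt, antisense to pseudogene 0
    Interval("chr1", 300, 323, "+"),    # 24 nt, sense strand, ignored
    Interval("chr2", 1100, 1123, "+"),  # 24 nt, antisense to pseudogene 1
    Interval("chr3", 10, 33, "-"),      # no pseudogene on this chromosome
]

lengths, hits = antisense_length_profile(reads, pseudogenes)
total = sum(lengths.values())
frac_24 = lengths[24] / total
print(f"pseudogenes with antisense reads: {len(hits)}")
print(f"fraction of antisense reads that are 24 nt: {frac_24:.2f}")
```

The nested loop is quadratic and only suitable for toy input; for genome-scale data an interval index or sorted sweep would be the more likely choice.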

Supplementary Data

Source of small RNA libraries used in this study

  • Primary
    • Developing rice grains, 1.5 million small RNAs (matching uniquely to the rice genome; the same applies to the counts below). Zhu et al., 2008, Genome Res, 18, 1456-1465. GSE11014.
  • Additional
    • Dehulled mature grain, 141,370 small RNAs. Heisel et al., 2008, PLoS ONE, 3, e2871. GSE13152.
    • 23-day-old seedlings, 58,863 small RNAs. Heisel et al., 2008, PLoS ONE, 3, e2871. GSE13152.
    • A mixture of RNAs for MPSS from (1) seedlings treated with ABA, (2) Nipponbare immature panicles (90-day-old plants), (3) germinating seedlings infected with Magnaporthe grisea, (4) germinating seedlings, (5) stem, and (6) seedling control for ABA treatment. 136,870 small RNAs. Nobuta et al., 2007, Nat Biotechnol, 25, 473-477.
    • Four-week-old seedlings, 73,174 small RNAs. Zhou et al., 2008, Genome Res, 19, 70-78. GSE12317.
    • CRSDB, a mixture of RNAs isolated from 30 to 60 day leaves (~16.5%), 10, 25 and 30 day seedlings (~11%), 47 cm inflorescences (~16.5%) and 25 day seedling polysomes (~16.5%), 5,521 small RNAs. Johnson et al., 2006, Nucleic Acids Res, 35, D829-833. CRSDB (Cereal small RNA database,

Genome-wide distribution of rice pseudogenes, siRNAs from developing rice grains, and repeats.