Description of the data in the files.

Each file contains the pseudogene data for a single RP gene, so there are 79 files in total. Within each file, each block (enclosed by ">>" and "<<") corresponds to a single pseudogene, and each line within a block holds one attribute, as described below.

line 1: >>"pseudogene id" (which includes the chromosome and the parent RP gene), G+C content in the 100 kb region, isochore class
line 2: >Func_aa (amino acid sequence of the functional RP protein, including gaps that represent insertions in the pseudogene)
line 3: actual sequence
line 4: >Pgene_aa (predicted amino acid sequence of the pseudogene, including frameshifts and gaps that represent deletions in the pseudogene)
line 5: actual sequence
line 6: >CDS_unmasked (coding sequence of the functional gene; covers the entire gene)
line 7: actual sequence
line 8: >CDS_masked (coding sequence of the functional gene; covers the entire gene. Nucleotides that differ between human and mouse are marked as "R" (purine), "Y" (pyrimidine) or "V" (others))
line 9: actual sequence
line 10: >CDS_codons (codon phases for the nucleotides in CDS_masked and CDS_unmasked; "1" marks the position after the first nucleotide in a codon)
line 11: actual data
line 12: >Func_DNA_unmasked (same as CDS_unmasked, except that it covers only the region that matches the pseudogene)
line 13: actual sequence
line 14: >Func_DNA_masked (same as CDS_masked, except that it covers only the region that matches the pseudogene)
line 15: actual sequence
line 16: >Func_DNA_codons (codon frames for the sequence in Func_DNA_masked; covers only the region that matches the pseudogene)
line 17: actual data
line 18: >Pgene_DNA (DNA sequence of the pseudogene)
line 19: actual sequence
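The block layout described above can be read with a short script. The following is only a minimal sketch in Python, not a tool distributed with the data. It assumes that each block starts on a line beginning with ">>", ends on a line beginning with "<<", and that every attribute label sits on its own line beginning with ">" with its data on the following line(s). The function name parse_pseudogene_blocks and the "header" key are made up for illustration. The first line of each block is kept as one unsplit string, because the exact delimiters between the pseudogene id, G+C content and isochore class are not specified here.

```python
from typing import Dict, Iterable, Iterator, Optional


def parse_pseudogene_blocks(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per pseudogene block.

    Assumed layout (not verified against the real files):
      - a line starting with ">>" opens a block; the rest is the header
      - a line starting with "<<" closes the block
      - a line starting with ">" names an attribute (e.g. ">Func_aa")
      - any other line inside a block is data for the current attribute
    """
    record: Optional[Dict[str, str]] = None
    label: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">>"):
            # Start of a new pseudogene block; keep the header unsplit.
            record = {"header": line[2:].strip()}
            label = None
        elif line.startswith("<<"):
            # End of the block: hand the finished record to the caller.
            if record is not None:
                yield record
            record, label = None, None
        elif record is None:
            # Text outside any block is ignored.
            continue
        elif line.startswith(">"):
            # Attribute label: first token after ">", without a trailing ":".
            tokens = line[1:].split()
            label = tokens[0].rstrip(":") if tokens else None
            if label is not None:
                record[label] = ""
        elif label is not None:
            # Data line; concatenated in case a sequence spans several lines.
            record[label] += line


# Usage (the file name is hypothetical):
# with open("RPL21.txt") as fh:
#     for rec in parse_pseudogene_blocks(fh):
#         print(rec["header"], len(rec.get("Pgene_DNA", "")))
```

Each record is a plain dictionary keyed by the attribute labels (Func_aa, Pgene_aa, CDS_unmasked and so on), so aligned pairs such as Func_DNA_masked and Func_DNA_codons can be looked up by name rather than by line position.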