Problem 6.10
Translate RNA
Write a function translate that takes an RNA sequence as a string and translates it into an amino acid sequence according to the codon table.
During translation, the RNA sequence is read as a series of codons of 3 nucleotides each. Each codon translates to one amino acid, with the exception of UAG, UGA, and UAA, which are known as stop codons. Each amino acid is represented by a single letter, and a stop codon is represented by the * character.
You can refer to DNA and RNA codon tables on Wikipedia for the codon table.
Your translate function should take an RNA sequence as input and return the corresponding sequence of amino acids as a string.
>>> translate("AUGAUCUCG")
'MIS'
In the above example, the RNA translation works as follows:
AUG -> M (Methionine)
AUC -> I (Isoleucine)
UCG -> S (Serine)
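The split into codons used in the example above can be reproduced with plain string slicing. This is only a sketch of that one step; the helper name split_codons is hypothetical and not part of the exercise.

```python
def split_codons(seq):
    # Take consecutive, non-overlapping slices of 3 nucleotides each.
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

print(split_codons("AUGAUCUCG"))  # ['AUG', 'AUC', 'UCG']
```

Each resulting codon can then be looked up individually to obtain its one-letter amino acid code.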
Here are some more examples:
>>> translate("AUGAUCUCGUAA")
'MIS*'
>>> translate("GUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG")
'VAIVMGR*KGAR*'
Please note that you are expected to implement this without using external modules such as Biopython.
Hint
You could represent the codon table as a dictionary.
codon_table = {
'AAA': 'K',
'AAU': 'N',
'AAG': 'K',
'AAC': 'N',
'AUA': 'I',
'AUU': 'I',
'AUG': 'M',
'AUC': 'I',
'AGA': 'R',
'AGU': 'S',
'AGG': 'R',
'AGC': 'S',
'ACA': 'T',
'ACU': 'T',
'ACG': 'T',
'ACC': 'T',
'UAA': '*',
'UAU': 'Y',
'UAG': '*',
'UAC': 'Y',
'UUA': 'L',
'UUU': 'F',
'UUG': 'L',
'UUC': 'F',
'UGA': '*',
'UGU': 'C',
'UGG': 'W',
'UGC': 'C',
'UCA': 'S',
'UCU': 'S',
'UCG': 'S',
'UCC': 'S',
'GAA': 'E',
'GAU': 'D',
'GAG': 'E',
'GAC': 'D',
'GUA': 'V',
'GUU': 'V',
'GUG': 'V',
'GUC': 'V',
'GGA': 'G',
'GGU': 'G',
'GGG': 'G',
'GGC': 'G',
'GCA': 'A',
'GCU': 'A',
'GCG': 'A',
'GCC': 'A',
'CAA': 'Q',
'CAU': 'H',
'CAG': 'Q',
'CAC': 'H',
'CUA': 'L',
'CUU': 'L',
'CUG': 'L',
'CUC': 'L',
'CGA': 'R',
'CGU': 'R',
'CGG': 'R',
'CGC': 'R',
'CCA': 'P',
'CCU': 'P',
'CCG': 'P',
'CCC': 'P'
}
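If typing out all 64 entries feels error-prone, the same mapping can be generated from a 64-letter string. The sketch below assumes the bases are ordered U, C, A, G and that the amino acid string follows the standard genetic code in that order; the names BASES, AMINO_ACIDS and compact_table are illustrative choices, not part of the exercise.

```python
from itertools import product

BASES = "UCAG"
# One letter per codon, in the order UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last base fastest, matching the string above.
codons = ["".join(triple) for triple in product(BASES, repeat=3)]
compact_table = dict(zip(codons, AMINO_ACIDS))

print(len(compact_table))    # 64
print(compact_table["AUG"])  # M
```

The explicit dictionary is easier to read and check by eye; the generated one is shorter but depends on getting the ordering exactly right.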
Solution
codon_table = {
'AAA': 'K',
'AAU': 'N',
'AAG': 'K',
'AAC': 'N',
'AUA': 'I',
'AUU': 'I',
'AUG': 'M',
'AUC': 'I',
'AGA': 'R',
'AGU': 'S',
'AGG': 'R',
'AGC': 'S',
'ACA': 'T',
'ACU': 'T',
'ACG': 'T',
'ACC': 'T',
'UAA': '*',
'UAU': 'Y',
'UAG': '*',
'UAC': 'Y',
'UUA': 'L',
'UUU': 'F',
'UUG': 'L',
'UUC': 'F',
'UGA': '*',
'UGU': 'C',
'UGG': 'W',
'UGC': 'C',
'UCA': 'S',
'UCU': 'S',
'UCG': 'S',
'UCC': 'S',
'GAA': 'E',
'GAU': 'D',
'GAG': 'E',
'GAC': 'D',
'GUA': 'V',
'GUU': 'V',
'GUG': 'V',
'GUC': 'V',
'GGA': 'G',
'GGU': 'G',
'GGG': 'G',
'GGC': 'G',
'GCA': 'A',
'GCU': 'A',
'GCG': 'A',
'GCC': 'A',
'CAA': 'Q',
'CAU': 'H',
'CAG': 'Q',
'CAC': 'H',
'CUA': 'L',
'CUU': 'L',
'CUG': 'L',
'CUC': 'L',
'CGA': 'R',
'CGU': 'R',
'CGG': 'R',
'CGC': 'R',
'CCA': 'P',
'CCU': 'P',
'CCG': 'P',
'CCC': 'P'
}
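def group(values, n):
    # Split values into consecutive chunks of length n.
    return [values[i:i+n] for i in range(0, len(values), n)]

def translate(seq):
    # Look up each 3-nucleotide codon in the codon table.
    result = [codon_table[codon] for codon in group(seq, 3)]
    return "".join(result)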
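The solution above raises a KeyError if the sequence length is not a multiple of 3 or if it contains an unknown codon, because the leftover or invalid chunk is not a key in the table. The problem statement does not say how such input should be handled, so the following is only one possible variant, assuming we want an explicit error message and case-insensitive input. The name translate_checked and the small demo_table are hypothetical and used here only to keep the example short.

```python
def translate_checked(seq, table):
    # Normalize case and surrounding whitespace before translating.
    seq = seq.strip().upper()
    if len(seq) % 3 != 0:
        raise ValueError(f"sequence length {len(seq)} is not a multiple of 3")
    result = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon not in table:
            raise ValueError(f"unknown codon {codon!r} at position {i}")
        result.append(table[codon])
    return "".join(result)

# A tiny subset of the codon table, for demonstration only.
demo_table = {"AUG": "M", "AUC": "I", "UCG": "S", "UAA": "*"}
print(translate_checked("augaucucguaa", demo_table))  # MIS*
```

Passing the table as an argument keeps the function easy to test with a small dictionary; in the exercise itself you would pass the full codon_table.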