Problem 6.10

Translate RNA

Write a function translate that takes an RNA sequence as a string and translates it into an amino acid sequence according to the codon table.

In the translation process, the RNA sequence is treated as a sequence of codons, with 3 nucleotides in each codon. Each codon translates to one amino acid, with the exception of UAG, UGA, and UAA, which are known as stop codons. Each amino acid is represented by a single letter, and a stop codon is represented by a * character.

You can refer to DNA and RNA codon tables on Wikipedia for the codon table.

Your translate function should take an RNA sequence as input and return the sequence of amino acids as a string.

>>> translate("AUGAUCUCG")
'MIS'

In the above example, the RNA translation works as follows:

AUG -> M (Methionine)
AUC -> I (Isoleucine)
UCG -> S (Serine)

Here are some more examples:

>>> translate("AUGAUCUCGUAA")
'MIS*'
>>> translate("GUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG")
'VAIVMGR*KGAR*'

Please note that you are expected to implement this without using any external modules such as Biopython.

Hint

You could represent the codon table as a dictionary.

codon_table = {
 'AAA': 'K',
 'AAU': 'N',
 'AAG': 'K',
 'AAC': 'N',
 'AUA': 'I',
 'AUU': 'I',
 'AUG': 'M',
 'AUC': 'I',
 'AGA': 'R',
 'AGU': 'S',
 'AGG': 'R',
 'AGC': 'S',
 'ACA': 'T',
 'ACU': 'T',
 'ACG': 'T',
 'ACC': 'T',
 'UAA': '*',
 'UAU': 'Y',
 'UAG': '*',
 'UAC': 'Y',
 'UUA': 'L',
 'UUU': 'F',
 'UUG': 'L',
 'UUC': 'F',
 'UGA': '*',
 'UGU': 'C',
 'UGG': 'W',
 'UGC': 'C',
 'UCA': 'S',
 'UCU': 'S',
 'UCG': 'S',
 'UCC': 'S',
 'GAA': 'E',
 'GAU': 'D',
 'GAG': 'E',
 'GAC': 'D',
 'GUA': 'V',
 'GUU': 'V',
 'GUG': 'V',
 'GUC': 'V',
 'GGA': 'G',
 'GGU': 'G',
 'GGG': 'G',
 'GGC': 'G',
 'GCA': 'A',
 'GCU': 'A',
 'GCG': 'A',
 'GCC': 'A',
 'CAA': 'Q',
 'CAU': 'H',
 'CAG': 'Q',
 'CAC': 'H',
 'CUA': 'L',
 'CUU': 'L',
 'CUG': 'L',
 'CUC': 'L',
 'CGA': 'R',
 'CGU': 'R',
 'CGG': 'R',
 'CGC': 'R',
 'CCA': 'P',
 'CCU': 'P',
 'CCG': 'P',
 'CCC': 'P'
}
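
Typing all 64 entries by hand is error-prone. As an alternative sketch, the same table can be generated from the four bases and a 64-letter string of amino acids, listed in the U, C, A, G order used by the usual layout of the standard codon table. The names BASES, AMINO_ACIDS, and codons are illustrative and not part of the problem.

```python
# Bases in the order used by the usual layout of the standard codon table.
BASES = "UCAG"

# One letter per codon, ordered by first, then second, then third base
# (UUU, UUC, UUA, UUG, UCU, ...). '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)

# Enumerate all 64 codons in the same order and pair each with its letter.
codons = [a + b + c for a in BASES for b in BASES for c in BASES]
codon_table = dict(zip(codons, AMINO_ACIDS))
```

This should produce the same 64 entries as the dictionary above; for example, codon_table['AUG'] is 'M' and codon_table['UGA'] is '*'.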

Solution

codon_table = {
 'AAA': 'K',
 'AAU': 'N',
 'AAG': 'K',
 'AAC': 'N',
 'AUA': 'I',
 'AUU': 'I',
 'AUG': 'M',
 'AUC': 'I',
 'AGA': 'R',
 'AGU': 'S',
 'AGG': 'R',
 'AGC': 'S',
 'ACA': 'T',
 'ACU': 'T',
 'ACG': 'T',
 'ACC': 'T',
 'UAA': '*',
 'UAU': 'Y',
 'UAG': '*',
 'UAC': 'Y',
 'UUA': 'L',
 'UUU': 'F',
 'UUG': 'L',
 'UUC': 'F',
 'UGA': '*',
 'UGU': 'C',
 'UGG': 'W',
 'UGC': 'C',
 'UCA': 'S',
 'UCU': 'S',
 'UCG': 'S',
 'UCC': 'S',
 'GAA': 'E',
 'GAU': 'D',
 'GAG': 'E',
 'GAC': 'D',
 'GUA': 'V',
 'GUU': 'V',
 'GUG': 'V',
 'GUC': 'V',
 'GGA': 'G',
 'GGU': 'G',
 'GGG': 'G',
 'GGC': 'G',
 'GCA': 'A',
 'GCU': 'A',
 'GCG': 'A',
 'GCC': 'A',
 'CAA': 'Q',
 'CAU': 'H',
 'CAG': 'Q',
 'CAC': 'H',
 'CUA': 'L',
 'CUU': 'L',
 'CUG': 'L',
 'CUC': 'L',
 'CGA': 'R',
 'CGU': 'R',
 'CGG': 'R',
 'CGC': 'R',
 'CCA': 'P',
 'CCU': 'P',
 'CCG': 'P',
 'CCC': 'P'
}

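def group(values, n):
    # Split values into consecutive chunks of n items; the last chunk may be shorter.
    return [values[i:i+n] for i in range(0, len(values), n)]

def translate(seq):
    # Look up each 3-letter codon in codon_table; stop codons map to '*'.
    # A trailing chunk shorter than 3 letters raises KeyError.
    result = [codon_table[codon] for codon in group(seq, 3)]
    return "".join(result)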
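
The solution above assumes that the input length is a multiple of 3 and that every codon appears in the table. The problem does not say what should happen otherwise, so the following is only a sketch of one possible policy: convert the input to upper case and raise ValueError on malformed input. The name translate_checked, its table parameter, and the four-entry small_table are assumptions made to keep the example short; in practice you would pass the full codon_table.

```python
def translate_checked(seq, table):
    """Translate an RNA string using table, rejecting malformed input.

    Hypothetical variant: the error-handling policy is an assumption,
    not something the problem statement requires.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    result = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if codon not in table:
            raise ValueError("unknown codon: " + codon)
        result.append(table[codon])
    return "".join(result)

# A small excerpt of the codon table, enough for the demonstration below.
small_table = {'AUG': 'M', 'AUC': 'I', 'UCG': 'S', 'UAA': '*'}
print(translate_checked("augaucucguaa", small_table))
```

With this variant, the lower-case input "augaucucguaa" translates to 'MIS*', while an input such as "AUGA" raises ValueError instead of KeyError.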