Problem 7.3
GenBank to FASTA
Write a program genbank_to_fasta.py to convert a file in GenBank format to FASTA format.
The program should take a GenBank file as a command-line argument and write its records to a FASTA file. The optional flag -o or --output specifies the destination path. If it is omitted, the destination path is constructed from the source file path by replacing the extension with .fasta.
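The default-destination rule can be tried out with pathlib alone before any sequence parsing is involved. The following is a minimal sketch; the helper name default_destination is hypothetical and not part of the exercise. It shows that with_suffix replaces only the last extension and appends one when the source path has none.

```python
from pathlib import Path


def default_destination(source):
    """Return the source path with its last extension replaced by .fasta.

    Hypothetical helper for illustration; the exercise does not require it.
    """
    return Path(source).with_suffix(".fasta")


# Only the final suffix is replaced, so "a.tar.gz" keeps its ".tar" part.
for name in ["files/ls_orchid.gbk", "a.tar.gz", "sequences"]:
    print(name, "->", default_destination(name))
```

Note that the result is a Path object rather than a string, which matters if the program later compares or concatenates it with plain strings.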
$ python genbank_to_fasta.py files/ls_orchid.gbk
converted files/ls_orchid.gbk to files/ls_orchid.fasta
$ python genbank_to_fasta.py files/ls_orchid.gbk -o output.fasta
converted files/ls_orchid.gbk to output.fasta
$ python genbank_to_fasta.py --help
usage: genbank_to_fasta.py [-h] [-o OUTPUT] filename
...
Solution
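import argparse
from pathlib import Path

from Bio import SeqIO

# Command-line interface: one positional input file and an optional output path.
p = argparse.ArgumentParser()
p.add_argument("filename", help="the genbank file to convert")
p.add_argument("-o", "--output", help="output file")
args = p.parse_args()

# Default destination: the source path with its extension replaced by .fasta.
destination = args.output or Path(args.filename).with_suffix(".fasta")

# SeqIO.parse yields records lazily; SeqIO.write streams them out as FASTA.
records = SeqIO.parse(args.filename, "genbank")
SeqIO.write(records, destination, "fasta")
print(f"converted {args.filename} to {destination}")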
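The argument handling in the solution can be checked without Biopython or a real GenBank file by passing an explicit argument list to parse_args. The sketch below assumes the same two arguments as the solution; the function names build_parser and resolve_destination are hypothetical, introduced only to make the logic callable from a test.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser with the same arguments as the solution (hypothetical helper)."""
    p = argparse.ArgumentParser(prog="genbank_to_fasta.py")
    p.add_argument("filename", help="the genbank file to convert")
    p.add_argument("-o", "--output", help="output file")
    return p


def resolve_destination(argv):
    """Return the output path for the given argument list (hypothetical helper)."""
    args = build_parser().parse_args(argv)
    # Fall back to the source path with a .fasta extension when -o is absent.
    return args.output or Path(args.filename).with_suffix(".fasta")


print(resolve_destination(["files/ls_orchid.gbk"]))
print(resolve_destination(["files/ls_orchid.gbk", "-o", "output.fasta"]))
print(build_parser().format_usage(), end="")
```

Because args.output or ... tests truthiness, an empty string passed to -o also falls back to the default path. Whether that is desirable depends on how strictly the program should validate its input.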