Problem 6.6
Uniq Command
Write a program uniq.py that takes a filename as an argument and prints its lines to the output, collapsing each run of identical adjacent lines into a single line.
The command should support the following command-line flags.
-d Only output lines that are repeated in the input.
-u Only output lines that are not repeated in the input.
-c Precede each output line with the count of the number of times
the line occurred in the input, followed by a single space.
The program should also print an appropriate help message when used with the -h or --help flag.
Use the standard library module argparse for this. You may want to check out the argparse tutorial to learn how to use that module.
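As a rough illustration of how argparse declares boolean flags, here is a minimal sketch. The program name demo.py and the help strings are placeholders chosen for this example, and the argument list is passed explicitly so the snippet runs without a real command line. argparse adds the -h and --help flags on its own.

```python
import argparse


def build_parser():
    # "demo.py" is a placeholder program name used only in this illustration.
    parser = argparse.ArgumentParser(
        prog="demo.py",
        description="Show how boolean command-line flags are declared.",
    )
    parser.add_argument("filename", help="file to read")
    # store_true makes the attribute False unless the flag is present.
    parser.add_argument("-c", action="store_true",
                        help="prefix each line with its count")
    return parser


parser = build_parser()
# Passing a list stands in for sys.argv[1:].
args = parser.parse_args(["-c", "files/animals.txt"])
print(args.filename, args.c)  # files/animals.txt True
```

Calling parser.print_help() shows the message a user would see with -h.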
A sample input file, files/animals.txt, with the following content is provided along with this problem.
cat
cat
cat
dog
dog
cat
rat
Expected Output
$ python uniq.py files/animals.txt
cat
dog
cat
rat
$ python uniq.py -d files/animals.txt
cat
dog
$ python uniq.py -u files/animals.txt
cat
rat
$ python uniq.py -c files/animals.txt
3 cat
2 dog
1 cat
1 rat
Hints
Look at the uniq command of Unix.
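One possible way to find runs of identical adjacent lines is itertools.groupby from the standard library, which groups consecutive equal items. The sketch below applies it to the contents of the sample file, written out as a list for convenience; it is an illustration of the idea, not the only approach.

```python
from itertools import groupby

# The contents of files/animals.txt, one list item per line.
lines = ["cat", "cat", "cat", "dog", "dog", "cat", "rat"]

# groupby starts a new group whenever the value changes, so the second
# "cat" run is reported separately from the first.
runs = [(value, len(list(group))) for value, group in groupby(lines)]
print(runs)  # [('cat', 3), ('dog', 2), ('cat', 1), ('rat', 1)]
```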
Solution
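import argparse
import itertools


def parse_args():
    p = argparse.ArgumentParser(
        description="Print the lines of a file, collapsing identical adjacent lines.")
    p.add_argument("filename", help="file to read")
    p.add_argument("-u", action="store_true",
                   help="only output lines that are not repeated in the input")
    p.add_argument("-d", action="store_true",
                   help="only output lines that are repeated in the input")
    p.add_argument("-c", action="store_true",
                   help="precede each output line with its occurrence count")
    return p.parse_args()


def uniq(lines, unique=False, dups=False, count=False):
    # groupby yields one (line, chunk) pair per run of identical adjacent lines.
    for line, chunk in itertools.groupby(lines):
        n = len(list(chunk))
        if unique and n > 1:
            continue
        if dups and n == 1:
            continue
        # Each line keeps its own trailing newline, hence end="".
        if count:
            print(n, line, end="")
        else:
            print(line, end="")


def main():
    args = parse_args()
    # The with block closes the file once all lines have been processed.
    with open(args.filename) as f:
        uniq(f, unique=args.u, dups=args.d, count=args.c)


if __name__ == "__main__":
    main()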
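The solution compares raw lines, so a final line that lacks a trailing newline would not merge with an identical line before it. A hedged variant, using the hypothetical name uniq_lines, strips the newline before grouping and yields the output lines instead of printing them, which also makes the logic easy to check without a file.

```python
from itertools import groupby


def uniq_lines(lines, unique=False, dups=False, count=False):
    """Yield output lines without trailing newlines (illustrative variant)."""
    # Strip newlines first so "dog\n" and a final "dog" compare equal.
    stripped = (line.rstrip("\n") for line in lines)
    for text, run in groupby(stripped):
        n = sum(1 for _ in run)
        if unique and n > 1:
            continue
        if dups and n == 1:
            continue
        yield f"{n} {text}" if count else text


# The last line has no trailing newline, as in a file saved without one.
sample = ["cat\n", "cat\n", "dog\n", "dog"]
print(list(uniq_lines(sample, count=True)))  # ['2 cat', '2 dog']
```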