Problem 6.6

Uniq Command

Write a program uniq.py that takes a filename as its argument and prints the lines of that file, collapsing each run of identical adjacent lines into a single line.

The command should support the following command-line flags.

 -d      Only output lines that are repeated in the input.

 -u      Only output lines that are not repeated in the input.

 -c      Precede each output line with the count of the number of times
        the line occurred in the input, followed by a single space.

The program should also print an appropriate help message when invoked with the -h or --help flag.

Use the standard library module argparse for this. You may want to check out the argparse tutorial to learn how to use that module.
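
As a rough illustration of the argparse pieces this problem needs, the sketch below builds a parser with one positional argument and one on/off flag. The function name build_parser and the help strings are illustrative assumptions, not part of the problem. The argument list is passed explicitly so the snippet runs without a real command line.

```python
import argparse

def build_parser():
    # Hypothetical helper for illustration only.
    # argparse adds the -h/--help flags on its own.
    p = argparse.ArgumentParser(description="Demo parser")
    p.add_argument("filename", help="file to read")
    # store_true makes -c a boolean flag that defaults to False.
    p.add_argument("-c", action="store_true", help="show counts")
    return p

# Parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-c", "files/animals.txt"])
print(args.filename, args.c)  # prints: files/animals.txt True
```

Each flag becomes an attribute on the returned namespace, named after the flag without its leading dash.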

A sample input file files/animals.txt with the following content is provided along with this problem.

cat
cat
cat
dog
dog
cat
rat

Expected Output

$ python uniq.py files/animals.txt
cat
dog
cat
rat

$ python uniq.py -d files/animals.txt
cat
dog

$ python uniq.py -u files/animals.txt
cat
rat

$ python uniq.py -c files/animals.txt
3 cat
2 dog
1 cat
1 rat

Hints

Look at the Unix uniq command.
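
One possible building block, offered here as a suggestion rather than a requirement of the problem, is itertools.groupby from the standard library, which groups consecutive equal items. The sketch below applies it to the contents of the sample file, written out as a list; the variable names are illustrative.

```python
import itertools

# The lines of files/animals.txt, without trailing newlines.
lines = ["cat", "cat", "cat", "dog", "dog", "cat", "rat"]

# groupby starts a new group whenever the value changes, so only
# adjacent duplicates are merged; the later "cat" forms its own group.
runs = [(line, len(list(group))) for line, group in itertools.groupby(lines)]
print(runs)  # prints: [('cat', 3), ('dog', 2), ('cat', 1), ('rat', 1)]
```

The run lengths are enough to decide what each of the -d, -u and -c flags should print.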

Solution

import argparse
import itertools

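def parse_args():
    # argparse adds the -h/--help flags automatically.
    p = argparse.ArgumentParser(
        description="Print lines of a file, collapsing identical adjacent lines.")
    p.add_argument("filename", help="file to read")
    p.add_argument("-u", action="store_true",
                   help="only output lines that are not repeated")
    p.add_argument("-d", action="store_true",
                   help="only output lines that are repeated")
    p.add_argument("-c", action="store_true",
                   help="precede each output line with its count")

    return p.parse_args()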


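def uniq(lines, unique=False, dups=False, count=False):
    # groupby yields one (line, group) pair per run of identical adjacent lines.
    for line, chunk in itertools.groupby(lines):
        n = len(list(chunk))  # length of the run
        if unique and n > 1:  # -u: skip repeated lines
            continue
        if dups and n == 1:   # -d: skip lines that are not repeated
            continue

        # Lines read from a file keep their trailing newline, hence end="".
        if count:
            print(n, line, end="")
        else:
            print(line, end="")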

def main():
    args = parse_args()
    # The with statement closes the file once uniq has consumed it.
    with open(args.filename) as f:
        uniq(f, unique=args.u, dups=args.d, count=args.c)

if __name__ == "__main__":
    main()
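
To check the logic against the expected output without touching the file system, one option is a variant that yields its output lines instead of printing them. The sketch below is an assumption-laden alternative, not the solution above: the name uniq_lines is hypothetical, and it strips trailing newlines before comparing, so a final line without a newline still matches its neighbours.

```python
import itertools

def uniq_lines(lines, unique=False, dups=False, count=False):
    # Hypothetical variant: yields output lines instead of printing them.
    # Strip newlines first so "cat\n" and a final "cat" compare equal.
    stripped = (line.rstrip("\n") for line in lines)
    for line, chunk in itertools.groupby(stripped):
        n = sum(1 for _ in chunk)  # length of the run
        if unique and n > 1:       # -u: skip repeated lines
            continue
        if dups and n == 1:        # -d: skip lines that are not repeated
            continue
        yield f"{n} {line}" if count else line

# Contents of files/animals.txt; the last line has no trailing newline here.
sample = ["cat\n", "cat\n", "cat\n", "dog\n", "dog\n", "cat\n", "rat"]

assert list(uniq_lines(sample)) == ["cat", "dog", "cat", "rat"]
assert list(uniq_lines(sample, dups=True)) == ["cat", "dog"]
assert list(uniq_lines(sample, unique=True)) == ["cat", "rat"]
assert list(uniq_lines(sample, count=True)) == ["3 cat", "2 dog", "1 cat", "1 rat"]
```

The four assertions mirror the four sample invocations in the Expected Output section.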