
How do I eliminate duplicate lines from my "data" file?

Sep 25th, 2002 20:04
Brian Coogan, Jonathan de Boyne Pollard


This is really an exercise in simple text file processing with 
Unix tools, and not something that is specific to "djbdns".  
There are three popular answers:
1.  sort -u
This is the most common answer to this question.  Unfortunately, it 
sorts the records in the file as a side effect, which may not be 
desirable if one wants all of the records (of different types) for a 
particular domain name to stay grouped together (this is illustrated 
at the end of this entry).
2.  Dan Bernstein's "cleanup" script 
   http://cr.yp.to/dnsroot/cleanup
This, too, has the side effect of sorting the records in the 
file.  Furthermore, it is unable to cope with the more unusual 
record types, failing if it encounters them.
3.  nawk '!x[$0]++'
It's simple.  It doesn't sort the data.  It's faster than the 
"cleanup" script.  It's one line long.  And it works.  (-:
The Perl equivalent for those without nawk is:
    perl -ne 'print unless $seen{$_}++;'
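Here is a minimal sketch of putting the one-liner to work on a 
tinydns data file.  It assumes the conventional /service/tinydns/root 
directory and the stock Makefile that runs tinydns-data; the 
temporary file name is arbitrary, so adjust both to suit your own 
installation:
    cd /service/tinydns/root
    nawk '!x[$0]++' data > data.dedup && mv data.dedup data
    make
Writing to a temporary file first matters: redirecting the output of 
nawk straight back onto "data" would truncate the file before nawk 
had read it.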
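To see why the sorting performed by the first two answers can be a 
nuisance, consider this hypothetical fragment of a "data" file (the 
names and addresses are invented purely for illustration):
    +www.alpha.example:192.0.2.1
    @alpha.example:192.0.2.1:a
    +www.beta.example:192.0.2.2
    @beta.example:192.0.2.2:a
After "sort -u" the records are grouped by their leading type 
character rather than by domain name, so the two records for 
alpha.example are no longer adjacent:
    +www.alpha.example:192.0.2.1
    +www.beta.example:192.0.2.2
    @alpha.example:192.0.2.1:a
    @beta.example:192.0.2.2:a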