Friday, 9 April 2010

Perl One Liner: Filtering Files

Perl one-liners are a really useful way of filtering a file and returning only the lines that pass some criterion, for example removing low-quality lines. These first examples all assume numeric data in whitespace-separated columns.


This will only show lines from test.txt where the values in the second column are less than 6.
perl -ane 'print if $F[1] < 6' test.txt

This will only show lines where the sum of columns one and two is less than five.

perl -ane 'print if $F[0]+$F[1] < 5' test.txt


This one returns the number of lines in test.txt where the first column is less than the second column, without printing the lines themselves (although you could print them too). The 0+ makes the count print as 0 rather than a blank line when nothing matches.
perl -ane '$sum++ if $F[0] < $F[1];END{print 0+$sum, "\n"}' test.txt

These next examples filter character, or text, files. Maybe you only want to return lines that have your name in them:

perl -ne 'print if m/Stew/' test_names.txt

Or maybe you want the lines with your name in, but you also want to know how many didn't have your name in:

perl -ne 'if (m/Stew/) {print} else {$notMe++};END{print 0+$notMe, "\n"}' test_names.txt

Or how about only showing those lines that have exactly two columns:

perl -ane 'print if @F == 2' test.txt

The possibilities are endless. 
