compBiomeBlog: one liner


Thursday, 28 July 2011

Perl One Liner: Delete files with wrong number of lines

Just a quick perl one liner for future reference. I needed to delete some text files that didn't have the correct number of lines, as they would break the downstream R script that parses the results.

for f in *.txt;do perl -ne 'END{unlink $ARGV unless $.==200}' "${f}" ;done

I always forget that $ARGV is the variable for the input file name in a one-liner.
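For comparison, the same clean-up can be sketched in plain shell. This is only an illustration: the function name remove_wrong_length and its arguments are made up here, and it counts newline characters with wc -l, so a final line without a trailing newline is not counted (perl's $. would count it).

```shell
# Hypothetical helper: delete every *.txt file in a directory whose
# line count differs from the expected value.
remove_wrong_length() (
    dir=$1
    expected=$2
    for f in "$dir"/*.txt; do
        [ -f "$f" ] || continue              # skip when the glob matches nothing
        n=$(( $(wc -l < "$f") ))             # arithmetic strips any padding from wc
        if [ "$n" -ne "$expected" ]; then
            echo "removing $f ($n lines)"
            rm -- "$f"
        fi
    done
)
```

Something like remove_wrong_length . 200 would then mirror the one liner above.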

Friday, 18 June 2010

R: Command Line Calculator using Rscript

I currently use an awesome little bash trick to get a command line calculator that was posted on lifehacker, and that I blogged about previously.

calc(){ awk "BEGIN{ print $* }" ;}

You just add this to your .bashrc file and then you can use it just like calc 2+2.
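A few example calls, as a quick sketch of what the function accepts. The function is repeated so the block runs on its own, and the values in the comments are ordinary awk arithmetic.

```shell
# Same awk-based function as above, repeated so this block is self-contained.
calc(){ awk "BEGIN{ print $* }" ;}

calc 2+2       # prints 4
calc "3*4"     # prints 12; the quotes stop the shell expanding * as a glob
calc "2^10"    # prints 1024
calc "10/4"    # prints 2.5
```

Quoting the expression is a good habit, since characters such as * and ( mean something to the shell before awk ever sees them.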

This is really useful, however I recently stumbled upon Rscript. This comes with the standard R install and allows you to make scripts similar to perl or bash with the shebang #!/usr/bin/Rscript, or wherever your Rscript is (you can check with a whereis Rscript command). The nice thing is that it also has a -e option for evaluating an expression at the command line, just like the perl -e for perl one liners. For example:

Rscript -e "round(runif(10,1,100),0)"

[1] 17 23 21 36 10 47 90 81 83 5

This gives you 10 random numbers uniformly distributed between 1 and 100. You can use any R functions this way, even plot for making figures.

Anyway, it seemed that Rscript would be really useful as a command line calculator too. So after a bit of playing and Googling I adapted a nice alias found in a comment on this blog post. Here it is:

alias Calc='Rscript -e "cat( file=stdout(), eval( parse( text=paste( commandArgs(TRUE), collapse=\"\"))),\"\n\")"'

So now you can type things like Calc "-log10(0.05)", whereas my above mentioned calc would just stare at me blinking, looking a bit embarrassed. You can really go to town if you like:

Calc "round(log2(sqrt(100)/exp(0.05)*(1/factorial(10))),2)"

Calc "plot(hist(rnorm(1E6),br=100))"

I think I will probably keep the calc version too as it is a bit quicker, with its lower overhead, but Calc should be useful for more complex things too.

Thursday, 15 April 2010

Perl one liner: Command Line Google

I stumbled across this really cool perl one liner here while looking for something else.

function google () { u=`perl -MURI::Escape -wle 'print "http://google.com/search?q=".
uri_escape(join " ", @ARGV)' $@`; links $u; }

Add this to your .bashrc file and then you can just google at the command line!
For example :

google cambridge weather

You may have to change links to be w3m or lynx depending on which text browsers you have on your system.

I will probably never ever use this, but I feel happier knowing that I can.

Tuesday, 13 April 2010

Perl One Liner: Remove whitespace from file name

It is often a pain to have spaces in file names when working at the command line, so remove them, or at least replace them with underscores. This one liner works on all bed files in the current directory.

ls *.bed |perl -ne 'use File::Copy;chomp;$old=$_;s/\s+/_/g;move($old,$_);'

You could replace the ls with a find for more control.
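A shell-only variant is sketched below for cases where parsing ls output feels fragile. The function name underscore_bed_names is invented for this example, and the check that refuses to overwrite an existing file is an addition, not part of the one liner above.

```shell
# Hypothetical helper: replace each run of whitespace in *.bed file names
# with a single underscore, skipping any rename that would clobber a file.
underscore_bed_names() (
    cd "$1" || exit 1
    for old in *.bed; do
        [ -e "$old" ] || continue                       # glob matched nothing
        new=$(printf '%s' "$old" | sed 's/[[:space:]][[:space:]]*/_/g')
        [ "$new" = "$old" ] && continue                 # nothing to change
        if [ -e "$new" ]; then
            echo "skipping $old: $new already exists" >&2
            continue
        fi
        mv -- "$old" "$new"
    done
)
```

The glob hands each file name to the loop intact, so names with spaces never get split the way they can when a list of names is piped around.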

Friday, 9 April 2010

Perl One Liner: Filtering Files

Perl one liners can be a really useful way of filtering a file and only returning lines that pass some criteria, for example removing low-quality lines. The first examples all assume numeric data.

This will only show lines from test.txt where the values in the second column are less than 6.

perl -ane 'print if $F[1] < 6' test.txt

This will only show lines where the sum of the columns one and two is less than five.

perl -ane 'print if $F[0]+$F[1] < 5' test.txt

This one will return the number of lines in test.txt where the first column is less than the second column, without returning the lines themselves (although you could print them too).

perl -ane '$sum++ if $F[0] < $F[1];END{print "$sum\n"}' test.txt

These next examples are to filter character, or text files. Maybe you only want to return lines that have your name in them:

perl -ne 'print if m/Stew/' test_names.txt

Or maybe you want the lines with your name in, but you also want to know how many didn't have it:

perl -ne 'if (m/Stew/) {print} else {$notMe++};END{print "$notMe\n"}' test_names.txt

Or how about only showing those lines that have two columns:

perl -ane 'print if @F == 2' test.txt

The possibilities are endless.
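For anyone who prefers awk, the numeric filters translate almost word for word. This is a rough sketch rather than a drop-in replacement: the function names are invented, awk columns count from 1 where perl's @F counts from 0, and the counter prints 0 when nothing matches (the perl version prints an empty line).

```shell
# Hypothetical awk equivalents of the perl -ane filters above.

# Lines whose second column is below a threshold.
col2_below() { awk -v max="$1" '$2 < max' "$2"; }

# Number of lines where column one is less than column two.
count_col1_lt_col2() { awk '$1 < $2 { n++ } END { print n + 0 }' "$1"; }

# Lines with exactly n whitespace-separated columns.
with_n_fields() { awk -v n="$1" 'NF == n' "$2"; }
```

Usage would be along the lines of col2_below 6 test.txt or with_n_fields 2 test.txt.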

Tuesday, 24 November 2009

LSF: Job Array Modification

I use job arrays on LSF to control running a large number of jobs. One nice feature of job arrays is being able to control the maximum number of jobs running, and so be nice to my fellow cluster users. I use the following BASH one liner to modify the maximum number of jobs on all of my job arrays at once.

bjobs -A |cut -f 1 -d " " |grep -v JOBID |while read seq;do bmod -J "%11" $seq;done

Just change the %11 part to whatever number you want, or at least whatever number you can get away with.

Friday, 20 November 2009

BASH: randomize the lines in a file

A colleague needed to randomize the lines in a text file, and as usual Google has the answer. I removed the sed and replaced it with cut. It works thanks to the $RANDOM variable, which returns a pseudo-random number each time you call it. Nice.

while IFS= read -r line;do echo "$RANDOM $line";done < textFile.csv |sort -n -k 1,1|cut -f 2- -d " "

So it adds a random number before each line, then sorts on this number. Simple but clever.
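The same decorate, sort, undecorate idea can be sketched without $RANDOM, which only exists in some shells. The version below swaps in awk's rand() for the random key; the function name shuffle_lines is made up, and because srand() seeds from the clock, two runs within the same second may give the same order.

```shell
# Hypothetical helper: shuffle the lines of a file by prefixing each line
# with a random key, sorting on that key, then cutting the key off again.
shuffle_lines() {
    awk 'BEGIN { srand() } { printf "%.10f\t%s\n", rand(), $0 }' "$1" |
        sort -k1,1n |
        cut -f 2-
}
```

Using a tab between the key and the line means spaces inside the original lines pass through untouched.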

Thursday, 5 November 2009

Perl one liner: Rename a file with some of its contents

I had some microarray files that the scanner had given not very useful names. I wanted them renamed with their chip barcode, which was inside the file.

perl -ne '`mv $ARGV $1.txt` if m/(1234567890(\d+_\d+_\d+))/;' *.txt

This little one liner does just that. The current file name is stored in $ARGV by default, so I simply rename the file from that to the extracted text I want.

Wednesday, 4 November 2009

Perl one liner: Random Lines from a File

I have some bed files that are too large to process in a reasonable time, so I need to randomly sample lines from them to create files of a workable size.

I used some bash and perl magic for this.

for f in *.bed;do export WC=`wc ${f} -l |cut -f 1 -d " "`;perl -i -ne 'srand;print if rand() <1500/$ENV{'WC'}' ${f} ;done

Basically, it checks the length of the file and stores the result in the environment variable WC, then it reads in the file line by line and only prints out the line if a random number between 0 and 1 is less than the ratio of the required size (1500 in this case) to the file length (WC). The result is therefore roughly, rather than exactly, 1500 lines, and because of -i each file is overwritten in place.

This is looped round all bed files in the current directory.

Edit:
You could also do something like this:

perl -ne 'print rand;print "\t";print;' FILENAME |sort |head -n 100 |cut -f 2- >NEWFILENAME

Which will return a random 100 lines from the file.
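If an exact sample size matters and the file is too big to sort comfortably, reservoir sampling is one alternative. To be clear, this is a different technique from either approach above, sketched in awk under the assumption that a single pass and a fixed-size buffer are what you want. The function name sample_lines is invented.

```shell
# Hypothetical helper: pick n lines uniformly at random in one pass
# (reservoir sampling). Output order is not the original file order.
sample_lines() {
    awk -v n="$1" '
        BEGIN { srand() }
        NR <= n { keep[NR] = $0; next }           # fill the reservoir first
        {
            i = int(rand() * NR) + 1              # uniform slot in 1..NR
            if (i <= n) keep[i] = $0              # replace with probability n/NR
        }
        END {
            m = (NR < n) ? NR : n                 # short files: return everything
            for (i = 1; i <= m; i++) print keep[i]
        }
    ' "$2"
}
```

A call such as sample_lines 1500 big.bed > small.bed would leave the original file untouched, unlike the -i version.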

Tuesday, 25 August 2009

Perl One Liner: Lower case file names

If you want to make file names lower case this will make it so.

find . -maxdepth 1 -type f | while read -r seq;do mv "${seq}" "$(echo "${seq}" |perl -ne 'tr/A-Z/a-z/;print')"; done

or with the easier ls

ls | while read -r seq;do mv "${seq}" "$(echo "${seq}" |perl -ne 'tr/A-Z/a-z/;print')"; done

You can add, for example, -name "*.txt" to the find part to only rename files that end in .txt. Changing -type f to -type d makes it work on directories instead.

Edit:
You could also move the file within perl rather than in the shell, using File::Copy

ls | perl -ne 'use File::Copy;chomp;$old=$_;tr/A-Z/a-z/;move($old,$_)'

You could replace the ls with a find statement as well for more control.
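A shell-only sketch of the same rename, using tr for the case change. The function name lowercase_names is made up, and the check that skips a rename when the lowercase name already exists is an extra safety net, not something the one liners above do. It assumes a case-sensitive file system.

```shell
# Hypothetical helper: lowercase the names of regular files in a directory,
# leaving a file alone if its lowercase name is already taken.
lowercase_names() (
    cd "$1" || exit 1
    for old in *; do
        [ -f "$old" ] || continue
        new=$(printf '%s' "$old" | tr '[:upper:]' '[:lower:]')
        [ "$new" = "$old" ] && continue               # already lowercase
        if [ -e "$new" ]; then
            echo "skipping $old: $new already exists" >&2
            continue
        fi
        mv -- "$old" "$new"
    done
)
```

Running lowercase_names . would cover the current directory.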

Tuesday, 11 August 2009

One Liner: Convert PDF to PNG

I needed to convert all the pdf files in a few directories to png files, so I used the following one liner:

find . -maxdepth 2 -type f -name "*.pdf"|while read -r file;do convert "${file}" "$(echo "${file}" |perl -pe 's/\.pdf$/.png/')";done

It requires that you have ImageMagick installed.
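The output name can also be built with shell parameter expansion, which only touches the extension at the end of the path. That matters because an unanchored substitution of pdf would also alter a directory called, say, ./pdfs/. The sketch below uses invented function names and still assumes ImageMagick's convert is on the PATH.

```shell
# Hypothetical helper: swap a trailing .pdf for .png, leaving the rest
# of the path alone.
pdf_to_png_name() { printf '%s\n' "${1%.pdf}.png"; }

# Hypothetical wrapper: convert every PDF up to two levels below a directory.
convert_pdfs() {
    find "$1" -maxdepth 2 -type f -name '*.pdf' | while IFS= read -r file; do
        convert "$file" "$(pdf_to_png_name "$file")"
    done
}
```

For example, pdf_to_png_name ./pdfs/figure1.pdf gives ./pdfs/figure1.png.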

Monday, 10 August 2009

One Liner: Remove File Extensions

Just a quick one. I downloaded the latest genome build already repeat masked, but the script I am running required the files to be named just chromosome.fa (not chromosome.fa.masked). This quick bash one liner removes the masked part using basename.

for f in *.masked;do mv "${f}" "$(basename "${f}" .masked)";done
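The suffix can also be stripped with the shell's own ${f%suffix} expansion instead of basename. A small sketch, with an invented function name and the directory and suffix passed in as arguments:

```shell
# Hypothetical helper: remove a given suffix from every matching file name
# in a directory, e.g. chr1.fa.masked -> chr1.fa.
strip_suffix() (
    cd "$1" || exit 1
    suffix=$2
    for f in *"$suffix"; do
        [ -e "$f" ] || continue                 # glob matched nothing
        mv -- "$f" "${f%"$suffix"}"             # quotes keep the suffix literal
    done
)
```

Here strip_suffix . .masked would do the same job as the loop above.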

Sunday, 9 August 2009

One Liner: Count occurrences of multiple patterns in multiple files

There may be a more elegant solution to this, but I wanted to count the number of times a number of sequences occur in a number of files. Replace FILES with the list of files you want to search in (e.g. *.txt) and replace PATTERNS, with a file containing the things you want to search for, one entry per line. This BASH script should do the rest.

for f in FILES;do cat PATTERNS |while read seq;do grep ${seq} ${f} |wc -l|xargs echo ${f} ${seq};done;done

It is basically two loops, one that goes through the files, the other through each line in the PATTERNS file, then it just uses xargs to output the results in a sensible order. If you only care about the total rather than the count for each individual pattern, grep's -f option would work.
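The two loops can be written out a little more explicitly, letting grep -c do the counting. This is a sketch with a made-up function name, and it adds one assumption the original does not make: -F treats each pattern as a fixed string rather than a regular expression, which seems reasonable for sequences.

```shell
# Hypothetical helper: for each file and each pattern, print
# "file pattern count", where count is the number of matching lines.
count_patterns() (
    patterns=$1
    shift
    for f in "$@"; do
        while IFS= read -r seq; do
            [ -n "$seq" ] || continue                      # ignore blank pattern lines
            n=$(grep -c -F -e "$seq" "$f" || true)         # grep exits 1 on zero matches
            printf '%s %s %s\n' "$f" "$seq" "$n"
        done < "$patterns"
    done
)
```

It would be called as count_patterns PATTERNS *.txt. Like grep piped into wc -l, it counts matching lines, so a sequence that occurs twice on one line is counted once.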

Friday, 7 August 2009

One Liner: Remove whitespace from filenames

Spaces in filenames can make all kinds of things break. It is good practice not to use them, but people do. So here is a good fix.

find . -type f -name "* *" -maxdepth 1| while read src; do mv "$src" `echo $src | tr " " "_"`; done

This script looks for all files in the current directory with a space in and moves them (which is how to rename things in unix) with spaces replaced by underscores.

You could do the same with perl too, which includes a check to stop overwriting of any existing files.

find . -type f -maxdepth 1| perl -ne 'chomp; $o = $_; s/ /_/g; next if -e; rename $o, $_';

You could replace the find part with a simple ls but this method makes it easy to change the -type f into a -type d to do the same thing on directories, or change the -maxdepth to work on a whole tree of directories.

Friday, 31 July 2009

Perl one liner: lowercase a file

If you want to make a text file all lowercase, this perl one liner will do the job.

perl -i -ne 'tr/A-Z/a-z/;print;' file

The -i part modifies the file in place, so only use that if you are sure.

Tuesday, 28 July 2009

Quick one liner: Remove empty lines from files

This quick bash and perl one liner will remove any empty lines from all the files in the current directory.

for f in *;do perl -i -ne 'print unless /^$/' ${f};done

It will not remove lines that contain only white space however. For that you would need:

for f in *;do perl -i -ne 'print if /\S/' ${f};done
