Thursday, 1 April 2010

Timing the Command Line: time

I have been working more and more on ChIP sequencing data recently, and the files can be huge. Even simple tasks such as counting the number of lines in a file, sorting or filtering now have a considerable time cost.

To assess the most efficient way of performing some operations I have been using the time command at the command line. For example:

wc -l test.txt
19050959 test.txt


time sort test.txt >test_normalsort.txt
real    1m59.395s


time distSort test.txt
real    2m18.901s


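In this case the normal sort was faster than a distributed sort and merge, but that may simply be because our cluster was really busy when I ran this: the "real" figure is wall-clock time, so it includes any time spent waiting on a loaded machine. Either way, time is very useful.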
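
If you want to keep the timings for later comparison rather than read them off the screen, something like the following might work. It is only a minimal sketch and assumes bash, since TIMEFORMAT and the time keyword are bash features. The temporary input file and the variable names (input, elapsed) are made up for illustration and stand in for the real data.

```shell
#!/usr/bin/env bash
# Sketch: capture the wall-clock time reported by bash's `time` keyword.

# Hypothetical stand-in for the real data file; a few lines so this runs anywhere.
input=$(mktemp)
printf '%s\n' chr2 chr10 chr1 chrX > "$input"

# TIMEFORMAT is a bash variable; %R prints only the elapsed (real) seconds.
TIMEFORMAT='%R'

# `time` reports on the shell's stderr, so group the command and redirect
# the group's stderr into the command substitution.
elapsed=$( { time sort "$input" > "$input.sorted"; } 2>&1 )

echo "sort: ${elapsed}s elapsed"
```

Running the same capture a few times, ideally when the machine is quiet, should give a fairer comparison than a single run.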
