Thursday, June 4, 2009

Shell commands to count field separators

So I suspected that a few pipe-separated CSV files were truncated inappropriately.

How could I count the number of fields in the last 2 rows? Counting the separators is enough: a row with N pipes has N+1 fields.

tail -1 # will give you the last record
tail -2 | head -1 # will give you the 2nd to last record
tr -d -c '|' # deletes everything except pipe chars. -d is delete, -c is the 'complement' or inverse set of the list of chars I supply, which is only a single '|'

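wc # of course, provides counts: lines, words and bytes, in that order. The last number is the pipe count (wc -c prints only that one)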
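
To see the tr step in isolation, here is a quick check on a made-up three-field line (the sample data is only an illustration, not taken from the real files). Note that tr also deletes the newline, so only the pipes are left to count:

```shell
# A made-up record with three fields, so two pipe separators.
printf 'a|b|c\n' | tr -d -c '|'            # prints: ||  (no trailing newline)
echo
printf 'a|b|c\n' | tr -d -c '|' | wc -c    # prints: 2  (some systems pad it with spaces)
```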

The final commands:

tail -2 somefile.txt | head -1 | tr -d -c '|' | wc -c
tail -1 somefile.txt | tr -d -c '|' | wc -c

And I did find that the files were truncated.
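
If this check comes up often, the same pipeline could be wrapped in a small shell function. This is only a sketch: count_pipes and check_last_line are hypothetical helper names, and it assumes the first line of the file is complete, so its separator count can serve as the expected value.

```shell
# Count the '|' separators on one line of a file.
# count_pipes is a hypothetical helper, not a standard command.
#   $1 = file name
#   $2 = which line, counted from the end (1 = last line; defaults to 1)
count_pipes() {
    # tr -d ' ' strips the padding some wc implementations add
    tail -n "${2:-1}" "$1" | head -n 1 | tr -d -c '|' | wc -c | tr -d ' '
}

# Compare the last line against the first line of the file.
# Assumption: the first line is complete, so its separator count is
# what every record should have.
check_last_line() {
    expected=$(head -n 1 "$1" | tr -d -c '|' | wc -c | tr -d ' ')
    actual=$(count_pipes "$1" 1)
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $actual separators"
    else
        echo "suspect: last line has $actual separators, expected $expected"
    fi
}
```

Running check_last_line somefile.txt reports whether the last record has as many pipes as the first, and count_pipes somefile.txt 2 gives the count for the 2nd to last record.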