awk to count lines in a column of a file

July 15, 2012

I have a large file and want to use awk to count the lines in a specific column: the part of $5 before the colon, counting only unique entries. I seem to be having trouble getting the syntax correct. Thank you :).

Sample input

chr1    955542  955763  +   agrn:exon.1 1   0
chr1    955542  955763  +   agrn:exon.1 2   0
chr1    955542  955763  +   agrn:exon.1 3   0
chr1    955542  955763  +   agrn:exon.1 4   1
chr1    955542  955763  +   agrn:exon.1 5   1

My attempt

awk -F: ' NR > 1 { count += $5 } -uniq' input

Desired output

1

The following command prints that count:

$ awk -F'[ \t:]+' '{a[$5]=1;} END{for (k in a)n++; print n;}' input
1

-F'[ \t:]+'

This tells awk to use any run of spaces, tabs, or colons as the field separator, so the part of the fifth column before the colon becomes $5.
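To see what that separator does to one row, here is a small sketch. It assumes the columns are tab-separated (the original spacing is not certain) and simply prints the field count together with fields 5 and 6.

```shell
# One sample row from the question; tab separation is an assumption.
line=$(printf 'chr1\t955542\t955763\t+\tagrn:exon.1\t1\t0')

# NF is the number of fields; $5 and $6 are the two halves of "agrn:exon.1".
split_demo=$(printf '%s\n' "$line" | awk -F'[ \t:]+' '{ print NF, $5, $6 }')
echo "$split_demo"   # prints: 8 agrn exon.1
```

Because the colon is part of the separator, the row has eight fields instead of seven, and everything after the gene name shifts right by one.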
a[$5]=1

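As awk loops through each line, this adds an entry to the associative array a for each value of $5 encountered. A repeated value overwrites the same key, so each distinct value is stored only once.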
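The deduplicating effect of the array can be checked by printing its keys rather than counting them. The three rows below are hypothetical (the second gene name, sdf4, is made up for illustration), and the output is piped through sort because awk does not guarantee any order for "for (k in a)".

```shell
# Hypothetical rows: agrn appears twice, sdf4 once.
keys=$(printf 'chr1 1 2 + agrn:exon.1 1 0\nchr1 1 2 + agrn:exon.2 2 0\nchr1 3 4 + sdf4:exon.1 1 1\n' |
  awk -F'[ \t:]+' '{ a[$5] = 1 } END { for (k in a) print k }' |
  sort)
echo "$keys"
# prints:
# agrn
# sdf4
```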
END{for (k in a)n++; print n;}

After awk has finished reading the file, this counts the number of keys in the associative array a and prints the total.
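Putting the pieces together, the sketch below recreates the sample rows in a temporary file and runs the count. One small variation from the command above, which is my own addition: it prints n + 0 instead of n, so that an empty input gives 0 rather than a blank line (an unset awk variable prints as an empty string).

```shell
# Recreate the five sample rows from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
chr1    955542  955763  +   agrn:exon.1 1   0
chr1    955542  955763  +   agrn:exon.1 2   0
chr1    955542  955763  +   agrn:exon.1 3   0
chr1    955542  955763  +   agrn:exon.1 4   1
chr1    955542  955763  +   agrn:exon.1 5   1
EOF

# Same logic as above; n + 0 forces a numeric result even when n was never set.
prog='{ a[$5] = 1 } END { for (k in a) n++; print n + 0 }'

count=$(awk -F'[ \t:]+' "$prog" "$input")
echo "$count"         # prints: 1

empty_count=$(awk -F'[ \t:]+' "$prog" /dev/null)
echo "$empty_count"   # prints: 0

rm -f "$input"
```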
