
awk '!seen[$0]++'

awk 'NR==FNR{A[$0]; next} $0 in A' file1.txt file2.txt



Thanks, I was trying to remember that first trick just last week. For log files I had to make a slight adjustment to strip the timestamp at the front of each line (gawk only, since it uses gensub):

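  !seen[gensub($1, "", "g", $0)]++

For anyone wondering, the first one filters out duplicate lines: seen[$0]++ evaluates to 0 the first time a line appears, so the negated pattern is true and the line is printed; on every later occurrence the count is nonzero and the line is skipped.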
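
In case it helps, here is a rough sketch of both variants wrapped as shell functions. The function names are made up for illustration, and the second one assumes the timestamp is the first whitespace-separated field. It strips that field with a fixed pattern rather than passing $1 to gensub as a regex, so it should also work in non-GNU awks and with timestamps that contain characters like "[" or ".".

```shell
# Hypothetical helper: keep only the first occurrence of each input line.
# seen[$0]++ is 0 (false) on first sight, so !seen[$0]++ is true and the
# line is printed; later occurrences are skipped.
dedupe() {
  awk '!seen[$0]++'
}

# Hypothetical helper for logs: compare lines with the first field removed.
# Assumption: the timestamp is the first whitespace-separated field.
# The copy in "key" is edited, so the printed line keeps its timestamp.
dedupe_ignoring_first_field() {
  awk '{ key = $0; sub(/^[ \t]*[^ \t]+[ \t]*/, "", key) } !seen[key]++'
}
```

Usage would be something like dedupe_ignoring_first_field < app.log (app.log being a placeholder name). Only the first line for each distinct message is kept, with its original timestamp.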



