You can easily do this with awk:
awk '!seen[$0]++'
Introduction
This is a quick note on how to remove duplicate lines so that each line is shown only once, using a single command.
Note: This article was translated from my original post.
Remove Duplicate Lines with One Command
The following command removes duplicate lines and outputs each one only once:
awk '!seen[$0]++'
Command Breakdown
awk '!seen[$0]++'
- $0: The entire current line being processed
- seen[$0]: An element of the associative array seen, using the entire line as the key
- seen[$0]++: Post-increment. The expression yields the current count for the line, then raises the stored count by 1
  - If the line is seen for the first time: the entry is unset, which awk treats as 0
  - If the line has already been seen: the value is 1 or more
- !seen[$0]++: Boolean evaluation with the NOT operator
  - If the line is seen for the first time: !0 = true
  - If the line has already been seen: !1 (or more) = false
- awk '!seen[$0]++': The whole expression is a pattern with no action, so awk prints the line whenever it is true, which happens only the first time the line is seen
So this command removes duplicate lines and shows each line only once, in the order of its first appearance.
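To make the breakdown concrete, the one-liner can be written out in long form. This is only an illustrative sketch: the sample input fed in with printf and the shell variable name result are assumptions for the demo, not part of the original command.

```shell
# Long-form sketch of awk '!seen[$0]++'.
# The printf input and the variable name "result" are made up for this demo.
result=$(printf 'apple\nbanana\napple\norange\nbanana\ngrape\napple\n' | awk '
{
    # An unset array entry compares equal to 0, so this is true only
    # the first time a given line appears.
    if (seen[$0] == 0) {
        print $0
    }
    # Count this occurrence so later copies of the line are skipped.
    seen[$0]++
}')
printf '%s\n' "$result"
```

The short form folds the test, the print, and the increment into a single pattern, which is why it fits in one line.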
Example
Let's try using the command to remove duplicates from a file called list.txt:
apple
banana
apple
orange
banana
grape
apple
Command:
cat list.txt | awk '!seen[$0]++'
Output:
apple
banana
orange
grape
As you can see, each line is shown only once, with duplicates removed.
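To see why those four lines survive, it can help to print the counter alongside each input line. The sketch below is a hypothetical debugging variant, not something from the original command: the names before and trace are invented for illustration, and the input is recreated with printf so the snippet does not depend on list.txt existing.

```shell
# Debugging sketch: show the counter value awk sees for each input line.
# "before" and "trace" are illustrative names, not part of the one-liner.
trace=$(printf 'apple\nbanana\napple\norange\nbanana\ngrape\napple\n' | awk '
{
    # Post-increment: "before" holds the count prior to this occurrence.
    before = seen[$0]++
    # A count of 0 means first occurrence, which the one-liner would print.
    print before, (before ? "skip" : "print"), $0
}')
printf '%s\n' "$trace"
```

Every line whose counter is 0 is marked print, and those are exactly the lines in the output above. The third apple shows a counter of 2, which is the "1 (or more)" case from the breakdown.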
Conclusion
That was a quick note on how to remove duplicate lines and show them only once using a command.
Hope it helps someone!