sed, awk or perl: remove all repeated FILE NAME lines except the first occurrence (shell script)
I have the following file. How can I use sed to remove every repeated FILE NAME line, keeping only the first occurrence of each unique FILE NAME? For example, after the cleanup only one copy of each of these lines should remain in the file:
FILE NAME: /dir1/dir2/dir3/dir4/dir5/file
FILE NAME: /dirA/dirB/dirC/dirD/dirE/file
The file:
FILE NAME: /dir1/dir2/dir3/dir4/dir5/file
PARAMETER NAME: blablabla
TARGET FILE: 12
SOURCE FILE: 565
FILE NAME: /dir1/dir2/dir3/dir4/dir5/file
PARAMETER NAME: blablabla
TARGET FILE: 18
SOURCE FILE: 552
FILE NAME: /dir1/dir2/dir3/dir4/dir5/file
PARAMETER NAME: blablabla
TARGET FILE: 14
SOURCE FILE: 559
FILE NAME: /dirA/dirB/dirC/dirD/dirE/file
PARAMETER NAME: blablabla
TARGET FILE: 134
SOURCE FILE: 344
FILE NAME: /dirA/dirB/dirC/dirD/dirE/file
PARAMETER NAME: blablabla
TARGET FILE: 13
SOURCE FILE: 445
FILE NAME: /dirA/dirB/dirC/dirD/dirE/file
PARAMETER NAME: blablabla
TARGET FILE: 13
SOURCE FILE: 434
awk '!(/^FILE NAME:/ && seen[$NF]++)' infile
In Python:
import sys

seen = set()  # FILE NAME lines that have already been printed
for line in sys.stdin:
    if line.startswith('FILE NAME: '):
        if line in seen:
            continue  # repeated FILE NAME line: skip it
        seen.add(line)
    sys.stdout.write(line)
sys.stdout.flush()
I'll have a think about sed and get back to you in a few hours, hopefully.
To be honest, though, this is not a very seddish task - sed likes tasks where you can process each line based only on the contents of that line (and perhaps one thing you've seen before and put in the hold buffer). This job fundamentally involves a more complex body of knowledge that needs to be carried through the file.
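To make that "body of knowledge" concrete: it is just the set of FILE NAME lines that have already been printed. Below is a minimal sketch of the same idea as a reusable generator, run on a shortened copy of the sample data. The function name `drop_repeated_headers` and its `prefix` parameter are made up for illustration, and it assumes that two FILE NAME lines count as duplicates only when their text is exactly identical.

```python
def drop_repeated_headers(lines, prefix="FILE NAME:"):
    """Yield lines, skipping any prefix line whose exact text was already seen.

    Hypothetical helper for illustration; not part of any library.
    """
    seen = set()  # the state that has to be carried through the whole file
    for line in lines:
        if line.startswith(prefix):
            if line in seen:
                continue  # repeated header: drop it
            seen.add(line)
        yield line  # first-time headers and all other lines pass through


# Shortened version of the sample file from the question.
sample = [
    "FILE NAME: /dir1/dir2/dir3/dir4/dir5/file",
    "TARGET FILE: 12",
    "FILE NAME: /dir1/dir2/dir3/dir4/dir5/file",
    "TARGET FILE: 18",
    "FILE NAME: /dirA/dirB/dirC/dirD/dirE/file",
    "TARGET FILE: 134",
]

result = list(drop_repeated_headers(sample))
for line in result:
    print(line)
```

The set grows with the number of distinct FILE NAME values, which is exactly the kind of unbounded state that the awk `seen[...]` array holds and that sed's single hold buffer handles awkwardly.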