Remove junk from a file
I have a CSV file with some junk at the beginning of the file. How do I get rid of it?
sh-3.2# more data_combined.csv
84252,1,A ROSEAL
The file should start with the number 842...
For the data shown, this should do the trick (assuming a single-byte codeset such as ISO 8859-1, and not UTF-8, for example):
sed '1s/^...//' data_combined.csv
If it is UTF-8, then there are 6 bytes of garbage at the start. If sed is run with a UTF-8 locale, the '.' metacharacter matches a UTF-8 character (2 bytes each in the case shown), so the same expression works fine. If sed is run with an SBCS (single-byte code set) such as 8859-1, then you'd need to use a pattern like:
sed '1s/^.\{6\}//' data_combined.csv
Admittedly, writing six dots would take just as many characters, but the \{6\} form is perhaps clearer and generalizes to other counts.
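To see the byte-counting case in action, here is a rough sketch. The temporary file and the six junk bytes (three two-byte UTF-8 characters) are made up for illustration, since the exact garbage in the real file is not known. Forcing LC_ALL=C is my assumption for getting sed to match one byte per dot regardless of the shell's default locale.

```shell
#!/bin/sh
# Hypothetical sample: 6 junk bytes (three 2-byte UTF-8 characters), then the real data.
sample=$(mktemp)
printf '\303\257\302\273\302\27784252,1,A ROSEAL\n' > "$sample"

# Dump the first 6 bytes in hex to see what is really at the start of the file.
head -c 6 "$sample" | od -An -tx1

# In the C locale each dot matches a single byte, so .\{6\} removes all 6 junk bytes.
bytewise=$(LC_ALL=C sed '1s/^.\{6\}//' "$sample")
printf '%s\n' "$bytewise"

rm -f "$sample"
```

Checking the leading bytes first is worthwhile, because the right count depends on what is actually in the file rather than on how the terminal happens to display it.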
As Dennis Williamson correctly said in the all too brief interval while I slept, to remove non-digits from the start of the first line, use:
sed '1s/^[^0-9]*//' data_combined.csv
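As a quick check that this only touches the first line, here is a small sketch. The temporary file, the ASCII junk "xyz", and the second record are invented for the example and are not from the real data.

```shell
#!/bin/sh
# Hypothetical sample: line 1 has leading junk; line 2 legitimately starts with a non-digit.
sample=$(mktemp)
printf 'xyz84252,1,A ROSEAL\nN/A,2,B EXAMPLE\n' > "$sample"

# The leading 1 restricts the substitution to the first line of input.
cleaned=$(sed '1s/^[^0-9]*//' "$sample")
printf '%s\n' "$cleaned"

rm -f "$sample"
```

Note that sed writes the result to standard output and leaves the input file alone, so redirect the output to a new file if you want to keep it.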