
Remove junk from a file

I have a CSV file with some junk at the beginning of the file. How do I get rid of it?

sh-3.2# more data_combined.csv
84252,1,A ROSEAL

The file should start with the number 842...


For the data shown, with three junk characters in front of the number, this should do the trick (assuming a single-byte code set such as ISO 8859-1 rather than UTF-8):

sed '1s/^...//' data_combined.csv

If the file is UTF-8, then there are 6 bytes of garbage at the start. If sed is run in a UTF-8 locale, the '.' metacharacter matches a whole UTF-8 character (2 bytes each in the case shown), so the same expression works fine. If sed is run in a single-byte code set (SBCS) locale such as ISO 8859-1, '.' matches one byte at a time, so you'd need a pattern like:

sed '1s/^.\{6\}//' data_combined.csv

Actually, writing six dots would take exactly as many characters as '.\{6\}', but the counted form is perhaps clearer and generalizes to other lengths.
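If you are not sure which case applies, it can help to look at the raw bytes before picking a pattern. This is only a sketch: the junk bytes from the question are not reproduced here, so sample.csv and by_bytes.csv are hypothetical names and the six leading bytes are an assumed stand-in (three 2-byte UTF-8 characters). Setting LC_ALL=C is one way to ask sed to treat '.' as a single byte regardless of your usual locale.

```shell
# Hypothetical sample: three 2-byte UTF-8 characters of junk, then the data.
printf '\303\257\302\273\302\27784252,1,A ROSEAL\n' > sample.csv

# Dump the first 8 bytes in hex so the junk bytes can be counted by eye.
od -A n -t x1 -N 8 sample.csv

# In the C locale '.' matches one byte, so exactly six bytes are removed
# from the start of line 1.
LC_ALL=C sed '1s/^.\{6\}//' sample.csv > by_bytes.csv
cat by_bytes.csv
```

If the hex dump shows a different number of bytes before the first digit, adjust the count in '\{6\}' to match.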


As Dennis Williamson correctly said in the all too brief interval while I slept, to remove non-digits from the start of the first line, use:

sed '1s/^[^0-9]*//' data_combined.csv
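All of these commands write the cleaned data to standard output and leave the file itself alone. A minimal sketch of saving the result back, assuming a hypothetical data_sample.csv with made-up junk bytes and a made-up second record (try it on a copy before touching the real file):

```shell
# Hypothetical sample: six junk bytes ahead of the first record.
printf '\303\257\302\273\302\27784252,1,A ROSEAL\n84253,2,B EXAMPLE\n' > data_sample.csv

# Write to a temporary file first, and replace the original only if sed
# succeeded. LC_ALL=C makes the bracket expression work byte by byte.
LC_ALL=C sed '1s/^[^0-9]*//' data_sample.csv > data_sample.tmp &&
  mv data_sample.tmp data_sample.csv

# The first line should now begin with the number.
head -n 1 data_sample.csv
```

Because the substitution is restricted to line 1, later lines pass through unchanged even if they contain non-digit text.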
