Excluding a column in a CSV file with regex
Is there any way to exclude, delete, or replace one field in a CSV file with a regexp in Notepad++?
I have a csv file with some data like this:
'1','data1','data2','data3','data4','data5','data6','data7','data8','data9',
'data10','data11','data12','data13','data14','data15','data16','data17','data18',
'data19','data20','data21','data22','data23','\'data24 with some commas,
here and there and some "double quotes", and fullstops.','data25','data26'
The only problem I am facing is with data24, where I encounter \' followed by double quotes ("") and other troublesome characters such as commas (,) and full stops (.). The problem field is always the 24th.
For clarity, I have inserted line breaks here, but the entire text above is on just one line.
Any ideas on how to solve this?
Thanks.
Not reliably. It is probably easiest to edit the file with a tool that knows how to handle CSV (such as OpenOffice).
If you still want to use a regex, take a look at negative lookbehind, so that you match a single quote only if it is not preceded by a backslash.
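As a rough illustration of the lookbehind idea, here is a minimal Python sketch (Python rather than Notepad++, so that it can be run and checked). The helper name drop_field and the shortened sample line are made up for this example, with field 3 standing in for field 24. It assumes every field is wrapped in single quotes and that a field never ends in an escaped backslash (\\), which this simple pattern would misread.

```python
import re

# A single quote closes a field only when no backslash precedes it.
FIELD = re.compile(r"'(.*?)(?<!\\)'", re.DOTALL)

def drop_field(line, index):
    """Return the line with the 1-based field `index` removed (hypothetical helper)."""
    fields = FIELD.findall(line)      # raw field contents, escapes left untouched
    del fields[index - 1]
    return ",".join("'" + f + "'" for f in fields)

# Shortened, made-up sample: field 3 plays the role of field 24.
sample = r"""'1','data1','\'text, with "quotes" and a comma.','data25','data26'"""
print(drop_field(sample, 3))
```

The same pattern, '(.*?)(?<!\\)', can be tried in any regex search that supports lookbehind, but whether a given editor version supports it is something to check rather than assume.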
I'm not sure if I understand you correctly. Do you want to remove field number 24?
To keep only the first L fields from the left and the last R fields from the right (thus excluding fields L+1, ..., NF-R, where NF is the number of fields), without worrying about odd characters in the fields in between, you can use the following awk command:
awk 'BEGIN {FS=","; L=23; R=2} { for(i=1; i<=L; i++) printf("%s,", $i); for(i=NF-R+1; i<=NF; i++) printf("%s%s", $i, (i<NF ? "," : "\n")) }' your_file
As Dave M mentioned, you can get tools like cut (and awk) for Windows from here (this particular package contains gawk, which should work just as well with the same command).
Edit: Yes, the download link at SourceForge does not seem to work. You can get awk and cut from here:
awk: http://gnuwin32.sourceforge.net/packages/gawk.htm
cut: http://gnuwin32.sourceforge.net/packages/coreutils.htm
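If installing awk on Windows is inconvenient, the same "keep L fields from the left and R from the right" idea can be sketched in Python. This is only an illustration: the function name keep_outer_fields and the sample line are invented here, and it assumes, as in the question, that only the one field between the two kept groups contains commas.

```python
def keep_outer_fields(line, left=23, right=2):
    """Keep the first `left` and last `right` comma-separated pieces of a line."""
    parts = line.rstrip("\n").split(",")
    # Everything between the two slices (the messy field, however many
    # commas it contains) is dropped.
    return ",".join(parts[:left] + parts[len(parts) - right:])

# Made-up sample: 23 simple fields, one messy field, then two more fields.
sample = ",".join(f"'d{i}'" for i in range(1, 24)) \
    + r""",'\'messy, "text", here.','data25','data26'"""
print(keep_outer_fields(sample))
```

Because the line is split on every comma, the messy field becomes several pieces, but they all fall between the two slices and are discarded together.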
I suggest using something like Ruby's CSV library to read the file in, process it programmatically, and write it out again.
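The same read, process, and write approach can be sketched with Python's standard csv module instead of Ruby's (a substitution made for this example, not the library named above). The sketch assumes the file uses a single quote as the quote character and a backslash as the escape character, as the sample in the question suggests. The function name drop_column and the sample text are hypothetical.

```python
import csv
import io

# Dialect settings assumed from the sample: single-quoted fields, \' escapes.
DIALECT = dict(quotechar="'", escapechar="\\", doublequote=False)

def drop_column(text, index):
    """Parse CSV text, delete the 1-based column `index`, and return new CSV text."""
    reader = csv.reader(io.StringIO(text, newline=""), **DIALECT)
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n", **DIALECT)
    for row in reader:
        del row[index - 1]        # the parser has already dealt with commas and quotes
        writer.writerow(row)
    return out.getvalue()

# Shortened, made-up sample: column 3 plays the role of column 24.
sample = r"""'1','data1','\'text, with "quotes".','data25','data26'""" + "\n"
print(drop_column(sample, 3), end="")
```

For a real file, the two StringIO objects would be replaced by files opened with newline="", and index would be 24.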