Converting non-delimited text into name/value pairs in Delphi
I've got a text file that arrives at my application as many lines of the following form:
<row amount="192.00" store="10" transaction_date="2009-10-22T12:08:49.640" comp_name="blah " comp_ref="C65551253E7A4589A54D7CCD468D8AFA" name="Accrington "/>
and I'd like to turn this 'row' into a series of name/value pairs in a given TStringList (there could be dozens of these <row>s in the file, so eventually I will want to iterate through the file breaking each row into name/value pairs in turn).
The problem I've got is that the data isn't obviously delimited (technically, I suppose it's space delimited). If it weren't for the fact that some of the values contain leading or trailing spaces, I could probably make a few reasonable assumptions and code something to break a row up based on spaces. But as the values themselves may or may not contain spaces, I don't see an obvious way to do this. Delphi's TStringList.CommaText doesn't help, and I've tried playing around with Delimiter, but I get caught out by the spaces inside the values each time.
Does anyone have a clever Delphi technique for turning the sample above into something resembling this?
amount="192.00" store="10" transaction_date="2009-10-22T12:08:49.640" comp_name="blah " comp_ref="C65551253E7A4589A54D7CCD468D8AFA" name="Accrington "
Unfortunately, as is usually the case with this kind of thing, I don't have any control over the format of the data to begin with - I can't go back and 'make' it comma delimited at source, for instance. Although I guess I could probably write some code to turn it into comma delimited - would rather find a nice way to work with what I have though.
This would be in Delphi 2007, if it makes any difference.
You say it's not "obviously delimited," but to me, it's very obviously delimited because it's very obviously XML. So use an XML parser. You could start with Delphi's TXmlDocument. You could pass each "row" string to the parser separately, but my suspicion is that all those rows are enclosed by some other angle-bracket tag. Feed that entire file to the parser, and it can help you get a list of objects representing rows, and then you can ask for the values of their attributes by name.
If you try to parse XML without regard to the nuances of XML parsing, sooner or later you're going to get burned.
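As a rough sketch of the XML-parser approach described above, the following passes a single row string to Delphi's XML support and copies its attributes into a TStrings. The helper name RowXmlToStrings is made up for illustration, and the code assumes the row is well-formed XML on its own and that the default MSXML DOM vendor is available.

```delphi
uses
  Classes, XMLIntf, XMLDoc;

// Hypothetical helper (not from the original post): parses one <row .../>
// element and appends each attribute to list as name=value, without quotes.
// Assumes COM is already initialised (true in a VCL forms application;
// a console program or worker thread must call CoInitialize first when
// the default MSXML DOM vendor is used).
procedure RowXmlToStrings(const rowXml: string; list: TStrings);
var
  doc : IXMLDocument;
  row : IXMLNode;
  attr: IXMLNode;
  i   : Integer;
begin
  doc := LoadXMLData(rowXml);     // parse the XML text held in the string
  row := doc.DocumentElement;     // the <row> element itself
  for i := 0 to row.AttributeNodes.Count - 1 do begin
    attr := row.AttributeNodes[i];
    list.Add(attr.NodeName + '=' + attr.Text);
  end;
end;
```

Because the entries are stored as name=value, something like list.Values['comp_name'] should then look a value up by name. If the rows do turn out to be wrapped in a single root element, LoadXMLDocument(fileName) followed by a loop over DocumentElement.ChildNodes would be the analogous way to handle the whole file in one pass.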
procedure RowToStrings(const row: string; list: TStrings);
var
  i       : integer;
  iDelim  : integer;
  inQuotes: boolean;
begin
  iDelim := 0;        // index of the last delimiter seen (0 = before the string)
  inQuotes := false;
  for i := 1 to Length(row) do begin
    // A space only ends a pair when it lies outside a quoted value.
    if (row[i] = ' ') and (not inQuotes) then begin
      list.Add(Copy(row, iDelim+1, i-iDelim-1));
      iDelim := i;
    end
    else if row[i] = '"' then
      inQuotes := not inQuotes;
  end;
  // Add whatever follows the last delimiter.
  list.Add(Copy(row, iDelim+1, Length(row)-iDelim));
end;
procedure TForm37.Test;
var
  row: string;
begin
  row := 'amount="192.00" store="10" transaction_date="2009-10-22T12:08:49.640" ' +
         'comp_name="blah " ' +
         'comp_ref="C65551253E7A4589A54D7CCD468D8AFA" ' +
         'name="Accrington "';
  RowToStrings(row, ListBox1.Items);
end;
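Note that RowToStrings keeps the double quotes as part of each value. If plain name=value entries are wanted, a small follow-up pass along these lines might do; DequoteValues is a hypothetical name, and the sketch assumes values never contain escaped or embedded quotes.

```delphi
uses
  Classes, SysUtils;

// Hypothetical helper (not from the original post): strips the surrounding
// double quotes from each value produced by RowToStrings.
procedure DequoteValues(list: TStrings);
var
  i: Integer;
begin
  for i := 0 to list.Count - 1 do
    // Leave entries without a name/value separator untouched.
    if Pos('=', list[i]) > 0 then
      list[i] := list.Names[i] + '=' +
                 AnsiDequotedStr(list.ValueFromIndex[i], '"');
end;
```

After this pass, list.Values['store'] should give the bare value rather than the quoted one. Leading and trailing spaces inside the quotes are kept.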