How to parse and colorize keyword/value pairs in a URL?
I tried to colorize a URL in Rebol like this:
content: "http://domain.com/test.php?keyword=hdhdf&hdhd=sdcfsv&sbcfsv=sdncfd&sncfsdv=dncsv&cnsv=dshdkd&scsv=12334&DXV=D&SWJDJJDFDJQKKKKKKKKKKKK&DFG=V&DJJF=DJVNVV&DJFFFFFFFFFF=33333"
rule-keyword-0: [to "?" thru "?" mark: (insert mark {<font color="red">}) 19 skip to "=" mark: (insert mark "</font>") thru "="]
rule-keyword-1: [to "&" thru "&" mark: (insert mark {<font color="red">}) 19 skip to "=" mark: (insert mark "</font>") thru "="]
rule-value-0: [to "</font>=" thru "</font>=" mark: (insert mark {<font color="blue">}) 20 skip to "&" mark: (insert mark "</font>") thru "&"]
rule-value-1: [to "</font>=" thru "</font>=" mark: (insert mark {<font color="blue">}) 20 skip to end mark: (insert mark "</font>")]
rule-keyword: [any [rule-keyword-0 | rule-keyword-1] to end]
rule-value: [any [rule-value-0 | rule-value-1] to end]
parse content rule-keyword
parse content rule-value
But the output is not right (see, for example, the doubled <font color="blue"> at the end):
http://domain.com/test.php?<font color="red">keyword</font>=<font color="blue">hdhdf</font>&<font color="red">hdhd</font>=<font color="blue">sdcfsv</font>&<font color="red">sbcfsv</font>=<font color="blue">sdncfd</font>&<font color="red">sncfsdv</font>=<font color="blue">dncsv</font>&<font color="red">cnsv</font>=<font color="blue">dshdkd</font>&<font color="red">scsv</font>=<font color="blue">12334</font>&<font color="red">DXV</font>=<font color="blue">D&<font color="red">SWJDJJDFDJQKKKKKKKKKKKK</font>&DFG</font>=<font color="blue">V&<font color="red">DJJF</font>=DJVNVV</font>&<font color="red">DJFFFFFFFFFF</font>=<font color="blue"><font color="blue">33333</font>
What is the correct rule?
There are probably more elegant rules, but this seems to work for your data, assuming that I have guessed what you want.
content: "http://domain.com/test.php?keyword=hdhdf&hdhd=sdcfsv&sbcfsv=sdncfd&sncfsdv=dncsv&cnsv=dshdkd&scsv=12334&DXV=D&SWJDJJDFDJQKKKKKKKKKKKK&DFG=V&DJJF=DJVNVV&DJFFFFFFFFFF=33333"
result: parse content [
    thru "?"
    some [
        ; we should be at the beginning of the pairs
        mark1:
        copy stuff to "=" mark2: (
            ; to ensure that there is a pair here
            if stuff [
                insert mark2 </font>
                insert mark1 <font color="red">
            ]
        )
        ; find the = sign
        thru </font> thru #"="
        mark1:
        [ copy stuff to #"&" | copy stuff to end ]
        mark2:
        (
            if stuff [
                insert mark2 </font>
                insert mark1 <font color="blue">
            ]
        )
        thru </font>
        [ thru "&" | end ]
    ]
]
?? result
?? content
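A different way to sidestep the insertion bookkeeping is to leave content untouched and build the colorized text in a second string. The following is only a sketch under assumptions: it targets Rebol 2 (hence parse/all), the sample URL is shortened and made up for illustration, and the names out, prefix, key, val, name-chars and value-chars are hypothetical, not taken from the question.

```rebol
; Hypothetical, shortened sample; "flag" has no "=" on purpose.
content: "http://domain.com/test.php?keyword=hdhdf&flag&x=1"

out: copy ""                                ; result is built here; content is not modified
name-chars:  complement charset "=&"        ; characters allowed in a keyword
value-chars: complement charset "&"         ; characters allowed in a value

parse/all content [
    ; copy everything up to and including "?" unchanged
    copy prefix thru "?" (append out prefix)
    any [
        copy key some name-chars
        [
            ; keyword followed by "=": wrap keyword and value in font tags
            "=" copy val any value-chars (
                append out rejoin [
                    {<font color="red">} key {</font>=}
                    {<font color="blue">} any [val ""] {</font>}
                ]
            )
            ; no "=": copy the token through uncolored
            | (append out key)
        ]
        ["&" (append out "&") | end]
    ]
]

print out
```

Each action in parentheses sits after everything it depends on has already matched, so nothing is appended for an alternative that later fails. In this sketch a token without "=" (flag above, or SWJDJJDFDJQKKKKKKKKKKKK in the question's data) is passed through uncolored; whether that is the desired treatment is a guess, since the question does not say.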
You didn't specify what the correct output would look like, and so submitting incorrect code and asking us to guess what you were trying to do is a bit much! I will, as usual, suggest reducing your example to the smallest possible that reproduces your problem. (That will often lead you to the solution before you have to ask the question!)
http://catb.org/esr/faqs/smart-questions.html#code
But off the cuff, I suspect you are running into the fact that code in parentheses is executed as soon as the parser reaches it, whether or not the enclosing rule goes on to match. Look at this simple example:
>> rule-1: ["a" (print "a matched in rule-1") "b"]
== ["a" (print "a matched in rule-1") "b"]
>> rule-2: ["a" (print "a matched in rule-2") "c"]
== ["a" (print "a matched in rule-2") "c"]
>> parse "ac" [any [rule-1 | rule-2]]
a matched in rule-1
a matched in rule-2
== true
Though the first rule failed, you get both printouts! The printout from rule-1 happened because the code in parentheses executed before the failure had been determined.
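Your "any" runs two alternative rules that may or may not match, and each one does its insertion before the full match has been determined. Parse backtracks the input position when a rule fails, but it does not undo an insertion that has already happened. That looks like your problem.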
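One way around this, sketched here as a variation on the rule-1/rule-2 example above rather than as a fix for the full URL rules, is to move the code in parentheses to the end of each alternative so that it only runs once everything before it has matched:

```rebol
; Same two rules as before, with the action moved after the last match.
rule-1: ["a" "b" (print "rule-1 matched")]
rule-2: ["a" "c" (print "rule-2 matched")]

; rule-1 fails at "b" before reaching its parentheses, so only rule-2 prints.
parse "ac" [any [rule-1 | rule-2]]
```

The same idea applies to insertions: record the positions with set-words while matching, and perform the insert only in an action placed after the last thing that can fail.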