Need help iterating through regex matches in VB.NET Regular Expressions
I just can't get my head around this. Please help! I've got this regex:
(?<=Photo:)(.+?)(?=Stock)|(?<=Stock Code:)(.+?)(?=Make:)|(?<=Make:)(.+?)(?=Model:)|(?<=Model:)(.+?)(?=Year:)|(?<=Year:)(.+?)(?=Price:)|(?<=Price:)(.+?)(?=Description:)|(?<=Description:)(.+?)(?=Photo:)|(?<=Description:)(.+?)(?=Page:)
And I've got this sample data:
Photo:http://xxxx.xxx/images/DSC_0039.JPGStock Code:435Make:BMWModel:X5 3.0 I A/TYear:2002Price:169900.00Description:Neat,160000KM
Photo:http://xxxx.xxx/images/206.JPGStock Code:453Make:Renault Model:Scenic 1.6 Year:2006Price:99900.00Description:Expression 76000km
Photo:http://xxxx.xxx/images/DSC_0058.JPGStock Code:372Make:Renault Model:ScenicYear:2005Price:89900.00Description:Nice Family Car
Photo:http://xxxx.xxx/images/j.JPGStock Code:399Make:NissanModel:Micra 1.4Year:2008Price:102900.00Description:Accenta ,neat
Photo:http://xxxx.xxx/images/207.JPGStock Code:454Make:Renault Model:Scenic 1.6 Year:2001Price:49900.00Description:Expression 185000km
Photo:http://xxxx.xxx/images/DSC_0040.JPG_dcef66ac215bd9e8c4e3535e458b280b.JPGStock Code:442Make:M/BenzModel:C270 CDIYear:2003Price:122900.00Description:A/T 154000 KM
Photo:http://xxxx.xxx/images/DSC_0008.JPG_fa489cfd99436c6b9323cfa8e34ed460.JPGStock Code:480Make:Opel AstraModel:2.0 T SportYear:2007Price:154900.00Description:126000KM Black
Photo:http://xxxx.xxx/images/DSC_0010.JPG_cfe5eb4763cbf568e73697e2cd8dd30e.JPGStock Code:462Make:SeatModel:1.4Year:2008Price:8590.00Description:54000km
Photo:http://xxxx.xxx/stockimage.jpgStock Code:339Make:BMWModel:320iYear:2005Price:109900.00Description:Man. White 155000 km
Photo:http://xxxx.xxx/images/192.JPGStock Code:192Make:MitsibushiModel:Colt 2000Year:2008Price:99900.00Description:Workhorse
Photo:http://xxxx.xxx/images/HPIM1461.JPGStock Code:204Make:FordModel:BroncoYear:1989Price:59900.00Description:Neat
Photo:http://xxxx.xxx/stockimage.jpgStock Code:445Make:M/BenzModel:Vito 2.2CRDI Year:2006Price:169900.00Description:Crewbus 140000km,White
Photo:http://xxxx.xxx/images/Picture 384.jpgStock Code:180Make:FiatModel:SienaYear:2000Price:35900.00Description:Family Car
Photo:http://xxxx.xxx/images/202.JPGStock Code:441Make:MazdaModel:6 2.0 Year:2005Price:99900.00Description:Origenal 104000 km
I need to iterate through each group, get the matched content of each record, and then add it to a vehicle class property depending on which group it is.
Here is my most successful attempt so far. It's only a test to try to extract the data. It kind of works (it collects the data correctly for every 8th record):
Dim pattern As String = "(?<=Photo:)(.+?)(?=Stock)|(?<=Stock Code:)(.+?)(?=Make:)|(?<=Make:)(.+?)(?=Model:)|(?<=Model:)(.+?)(?=Year:)|(?<=Year:)(.+?)(?=Price:)|(?<=Price:)(.+?)(?=Description:)|(?<=Description:)(.+?)(?=Photo:)|(?<=Description:)(.+?)(?=\r)"
Dim GroupCounter As Integer = 1
Dim GroupName As String = ""
For Each match As Match In Regex.Matches(html, pattern)
    If GroupCounter = 1 Then
        GroupName = "Photo:"
    ElseIf GroupCounter = 2 Then
        GroupName = "Stock Code:"
    ElseIf GroupCounter = 3 Then
        GroupName = "Make:"
    ElseIf GroupCounter = 4 Then
        GroupName = "Model:"
    ElseIf GroupCounter = 5 Then
        GroupName = "Year:"
    ElseIf GroupCounter = 6 Then
        GroupName = "Price:"
    ElseIf GroupCounter = 7 Then
        GroupName = "Desc:"
    ElseIf GroupCounter = 8 Then
        GroupName = "Last Desc:"
    Else
        GroupName = "Unknown:"
    End If
    If match.Groups.Item(GroupCounter).Success And GroupCounter > 0 Then
        export = export & GroupName & match.Groups.Item(GroupCounter).Value & "|"
    End If
    GroupCounter += 1
    If GroupCounter = 9 Then
        GroupCounter = 1
    End If
Next
The Firebug output that I get looks like what I want, except that it only returns every 8th record:
{"d":"Photo:http://xxxx.xxx/images/DSC_0039.JPG|Stock Code:435|Make:BMW|Model:X5 3.0 I A/T|Year:2002|Price:169900.00|Desc:Neat,160000KM|Photo:http://xxxx.xxx/image.jpg|Stock Code:339|Make:BMW|Model:320i|Year:2005|Price:109900.00|Desc:Man. White 155000 km|Photo:http://xxxx.xxx/images/g.JPG|Stock Code:395|Make:V/wagen|Model:Citi 1.4i|Year:2003|Price:49900.00|Desc:A/C|Photo:http://xxxx.xxx/images/1 (2).JPG|Stock Code:402|Make:BMW|Model:530I|Year:2004|Price:169900.00|Desc:Nice Family Car,A/T|Photo:http://xxxx.xxx/images/DSC_0001 (2).JPG_9a8aa2faebf77bcd7f021dc9ef602552.JPG|Stock Code:471|Make:Mitsibushi|Model:Colt 2800 C/Cab 4x4|Year:2005|Price:109900.00|Desc:179000 km|Photo:http:/xxxx.xxx/images/DSC_0011.JPG_5343615443cf449ae70b684c45e0964a.JPG|Stock Code:474|Make:Audi|Model:A3|Year:2005|Price:165900.00|Desc:A3 3.2 QUATRO 6 SPEED|Photo:http://xxxx.xxx/images/HPIM1731.JPG|Stock Code:304|Make:Ford|Model:Laser |Year:1997|Price:35900.00|Desc:Tracer 1.6 Sedan|Photo:http://xxxx.xxx/images/002.JPG|Stock Code:70|Make:PEUGEOT|Model:307|Year:2006|Price:117900.00|Desc:2.0 XS"}
Please help me. Many thanks, Jacques
Your regex only matches one field at a time, when it should be matching a whole record. And there's no need to iterate through the groups by number and assign names to them, when you can use named groups. I don't speak VB, so here's an example in C#:
Regex r = new Regex(@"
Photo:(?<Photo>.+?)
Stock\s+Code:(?<StockCode>.+?)
Make:(?<Make>.+?)
Model:(?<Model>.+?)
Year:(?<Year>.+?)
Price:(?<Price>.+?)
Description:(?<Description>[^\r\n]+)",
RegexOptions.IgnorePatternWhitespace);
foreach (Match m in r.Matches(data))
{
    Console.WriteLine();
    foreach (string name in r.GetGroupNames())
    {
        Console.WriteLine("{0} = {1}", name, m.Groups[name]);
    }
}
In addition to the names you assign, there will always be a group named "0", representing the whole match.
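Since the question asks for VB.NET, here is a rough sketch of how the same named-group approach might look there. The Vehicle class, its property names, and the ParseVehicles helper are assumptions made up for illustration, not anything from the original code. The auto-implemented properties assume VB 2010 or later. Trimming the captured values is optional; it is there because some of the sample fields (such as "Renault ") carry a trailing space.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Hypothetical container class; the property names are assumptions.
Public Class Vehicle
    Public Property Photo As String
    Public Property StockCode As String
    Public Property Make As String
    Public Property Model As String
    Public Property Year As String
    Public Property Price As String
    Public Property Description As String
End Class

Module NamedGroupDemo
    ' Same pattern as the C# example: one match covers one whole record.
    ' VB strings do not treat the backslash as an escape, so \s, \r and \n
    ' reach the regex engine unchanged.
    Private Const RecordPattern As String = _
        "Photo:(?<Photo>.+?)" & _
        "Stock\s+Code:(?<StockCode>.+?)" & _
        "Make:(?<Make>.+?)" & _
        "Model:(?<Model>.+?)" & _
        "Year:(?<Year>.+?)" & _
        "Price:(?<Price>.+?)" & _
        "Description:(?<Description>[^\r\n]+)"

    ' Hypothetical helper: builds one Vehicle per matched record.
    Function ParseVehicles(ByVal data As String) As List(Of Vehicle)
        Dim vehicles As New List(Of Vehicle)()
        For Each m As Match In Regex.Matches(data, RecordPattern)
            Dim v As New Vehicle()
            ' Each named group is looked up by name, so no counter is needed.
            v.Photo = m.Groups("Photo").Value.Trim()
            v.StockCode = m.Groups("StockCode").Value.Trim()
            v.Make = m.Groups("Make").Value.Trim()
            v.Model = m.Groups("Model").Value.Trim()
            v.Year = m.Groups("Year").Value.Trim()
            v.Price = m.Groups("Price").Value.Trim()
            v.Description = m.Groups("Description").Value.Trim()
            vehicles.Add(v)
        Next
        Return vehicles
    End Function

    Sub Main()
        ' Two records taken from the sample data in the question.
        Dim data As String = _
            "Photo:http://xxxx.xxx/images/DSC_0039.JPGStock Code:435Make:BMWModel:X5 3.0 I A/TYear:2002Price:169900.00Description:Neat,160000KM" & vbCrLf & _
            "Photo:http://xxxx.xxx/images/206.JPGStock Code:453Make:Renault Model:Scenic 1.6 Year:2006Price:99900.00Description:Expression 76000km"

        For Each v As Vehicle In ParseVehicles(data)
            Console.WriteLine("{0} | {1} | {2} | {3}", v.StockCode, v.Make, v.Model, v.Price)
        Next
    End Sub
End Module
```

Because every match is a complete record, the loop body runs once per vehicle and the counter-based bookkeeping from the question goes away.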
On a side note, I noticed you used (.+?)(?=\r) to match the final field. I assume you did that because the records are separated by \r\n and you don't want to include the \r in the match. But what if the producer of the data changes the format so the lines end with just \n, and fails to notify you? Suddenly your regex doesn't work any more, and you can't see why. If you use [^\r\n]+ like I did, you don't have to worry about that.
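To make the line-ending point concrete, here is a small VB.NET sketch that runs both styles of pattern against data whose lines end in a bare \n. The two-line sample string is an assumption, cut down from the Description fields in the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LineEndingDemo
    Sub Main()
        ' Two Description fields separated by a bare line feed (no carriage return).
        Dim data As String = "Description:Neat,160000KM" & vbLf & "Description:Workhorse" & vbLf

        ' Lookahead for \r: the data contains no carriage return, so nothing matches.
        Dim withLookahead As MatchCollection = _
            Regex.Matches(data, "(?<=Description:)(.+?)(?=\r)")

        ' Negated character class: stops at either \r or \n, so both fields match.
        Dim withCharClass As MatchCollection = _
            Regex.Matches(data, "(?<=Description:)[^\r\n]+")

        Console.WriteLine("Lookahead matches: {0}", withLookahead.Count)        ' 0
        Console.WriteLine("Character class matches: {0}", withCharClass.Count)  ' 2
    End Sub
End Module
```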
The regular expression I'd use for this case is
^Photo:(.*?)Stock Code:(.*?)Make:(.*?)Model:(.*?)Year:(.*?)Price:(.*?)Description:(.*?)$
with RegexOptions.Multiline enabled. For each line, it will have the relevant data in its capturing groups. Unfortunately, my VB.NET is more than shaky, so I'll give a short snippet in C#. Please feel free to edit in a VB version.
String data = "Photo: .....";
String pattern = "^Photo:(.*?)Stock Code:(.*?)Make:(.*?)Model:(.*?)Year:(.*?)Price:(.*?)Description:(.*?)$";
MatchCollection matches = Regex.Matches(data, pattern, RegexOptions.Multiline);
foreach (Match match in matches)
{
    YourObject item = new YourObject();
    item.Photo = match.Groups[1].Value;
    item.StockCode = match.Groups[2].Value;
    // ....
}
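A possible VB.NET version of the snippet above, offered as a sketch rather than a definitive translation. YourObject is a hypothetical stand-in with only a few fields. The pattern here includes a Model: group and an optional \r before the $; both are assumptions on top of the answer's pattern. The \r? is there because in .NET, $ under RegexOptions.Multiline matches just before \n, so with \r\n line endings the last group would otherwise capture the trailing \r.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical stand-in for the answer's YourObject; only a few fields shown.
Public Class YourObject
    Public Photo As String
    Public StockCode As String
    Public Make As String
    Public Model As String
End Class

Module MultilineDemo
    Sub Main()
        ' Two records from the question's sample data, separated by CR+LF.
        Dim data As String = _
            "Photo:http://xxxx.xxx/images/192.JPGStock Code:192Make:MitsibushiModel:Colt 2000Year:2008Price:99900.00Description:Workhorse" & vbCrLf & _
            "Photo:http://xxxx.xxx/images/HPIM1461.JPGStock Code:204Make:FordModel:BroncoYear:1989Price:59900.00Description:Neat"

        ' One capturing group per field, anchored to a whole line.
        Dim pattern As String = _
            "^Photo:(.*?)Stock Code:(.*?)Make:(.*?)Model:(.*?)Year:(.*?)Price:(.*?)Description:(.*?)\r?$"

        For Each m As Match In Regex.Matches(data, pattern, RegexOptions.Multiline)
            Dim item As New YourObject()
            ' Groups are numbered left to right, starting at 1.
            item.Photo = m.Groups(1).Value
            item.StockCode = m.Groups(2).Value
            item.Make = m.Groups(3).Value
            item.Model = m.Groups(4).Value
            Console.WriteLine("{0}: {1} {2}", item.StockCode, item.Make, item.Model)
        Next
    End Sub
End Module
```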