Find a date within a string that is before a match
Given the following string, I need to find the last occurrence of the string [12] Solution Confirmed
and then traverse backwards until I hit a date. The date will always be in the format dd-MM-yyyy.
<tr><td>17-05-2011 16:28 </td><td>DB </td>
<td>(YB) 0 </td><td>75% </td><td> </td>
<td>[10] Pending - Probable</td></tr><tr><td>15-05-2011 22:40 </td>
<td>YB </td><td>(YB) 0 </td><td>90% </td><td> </td>
<td>[12] Solution Confirmed</td></tr>
In the example above the date I would expect would be 15-05-2011.
<tr><td>18-07-2011 10:10 </td>
<td>YB </td><td>(YB) 56650 </td>
<td>90% </td><td> </td><td>[12] Solution Confirmed</td></tr>
In the example above I would expect the date to be 18-07-2011.
I can't be 100% sure that the string I am looking at is HTML compliant. Would a regex suit me best? Can anyone provide a working example?
Edit: I have looked into this, and it looks like the date is always in this format:
<td>dd-MM-yyyy HH:mm </td>
I was confirming this in a console app, but my thinking was the same as @Jason's:
string x = "<tr><td>17-05-2011 16:28 </td><td>DB </td><td>(YB) 0 </td><td>75% </td>" +
"<td> </td><td>[10] Pending - Probable</td></tr><tr><td>15-05-2011 22:40 </td>" +
"<td>YB </td><td>(YB) 0 </td><td>90% </td><td> </td>" +
"<td>[12] Solution Confirmed</td></tr>";
// Cut the string off at the last "Solution Confirmed" so only earlier text is searched.
int searchBeforeLocation = x.LastIndexOf("Solution Confirmed");
x = x.Substring(0, searchBeforeLocation);
Regex r = new Regex(@"\d{2}-\d{2}-\d{4}");
MatchCollection matches = r.Matches(x);
// The last date before the cut is the one closest to the marker.
Console.WriteLine(matches[matches.Count - 1].Value);
Console.Read();
The date nearest to "Solution Confirmed" will be the last match in the truncated string.
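A slightly more defensive variant of the same idea is sketched below. The names DateFinder and FindDateBeforeLastMarker are made up for illustration; the sketch assumes you would rather get null back than an exception when the marker or a date is missing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateFinder
{
    // Hypothetical helper: returns the dd-MM-yyyy date closest before the last
    // occurrence of marker, or null if the marker or a date cannot be found.
    public static string FindDateBeforeLastMarker(string input, string marker)
    {
        int markerIndex = input.LastIndexOf(marker, StringComparison.Ordinal);
        if (markerIndex < 0)
            return null; // marker not present

        // Only look at the text in front of the marker.
        string head = input.Substring(0, markerIndex);
        MatchCollection matches = Regex.Matches(head, @"\d{2}-\d{2}-\d{4}");

        // The last match in the truncated text is the nearest date.
        return matches.Count > 0 ? matches[matches.Count - 1].Value : null;
    }
}
```

Calling DateFinder.FindDateBeforeLastMarker(x, "[12] Solution Confirmed") on the first sample string should give 15-05-2011.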
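You should be able to use .*(\d{2}-\d{2}-\d{4}).*?\[12\] Solution Confirmed (note the escaped brackets; an unescaped [12] is a character class that matches a single 1 or 2). The first .* (any character) is greedy, so it will consume as much text as it can; the second .*? is lazy, so it will consume as little text as it can. This should ensure that you get the date closest to the "Solution Confirmed". If the input spans several lines, use RegexOptions.Singleline so that . also matches line breaks.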
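In C#, a pattern of this shape could be applied roughly as follows. This is only a sketch: GreedyCapture and LastDateBeforeMarker are illustrative names, and the marker text is hard-coded on the assumption that it is always "[12] Solution Confirmed".

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyCapture
{
    // Hypothetical helper: returns the captured date, or null when nothing matches.
    public static string LastDateBeforeMarker(string input)
    {
        // \[12\] matches the literal text "[12]".
        // Singleline lets '.' match newlines, so multi-line input works too.
        Match m = Regex.Match(
            input,
            @".*(\d{2}-\d{2}-\d{4}).*?\[12\] Solution Confirmed",
            RegexOptions.Singleline);

        // Group 1 holds the date picked out by the greedy/lazy combination.
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

The greedy prefix backtracks from the end of the input, so group 1 ends up holding the last date that still has the marker somewhere after it.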
This should do the trick:
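// Match every date that is followed, somewhere later, by the marker; the last such date is the closest one.
// RegexOptions.Singleline lets '.' cross line breaks in multi-line input.
MatchCollection matches = Regex.Matches(inputData,
    @"\d{2}-\d{2}-\d{4}(?=.*?\[12\]\sSolution\sConfirmed)", RegexOptions.Singleline);
string selectedValue = matches[matches.Count - 1].Value;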
I think the best way is to run the regex, get all the matches, and then take the last value from them. I don't think there is a way to get that straight from the regex unless you have something unique in front of your last match that you can use as a reference.
The simplest regular expression is \d{2}-\d{2}-\d{4}
update
string content = @"<tr><td>17-05-2011 16:28 </td><td>DB </td>
<td>(YB) 0 </td><td>75% </td><td> </td>
<td>[10] Pending - Probable</td></tr><tr><td>15-05-2011 22:40 </td>
<td>YB </td><td>(YB) 0 </td><td>90% </td><td> </td>
<td>[12] Solution Confirmed</td></tr>";
MatchCollection matches = Regex.Matches(content, @"\d{2}-\d{2}-\d{4}");
// Take the last match, as described above.
string lastDate = matches[matches.Count - 1].Value;
Try this:
var htmlData = "<tr><td>17-05-2011 16:28 </td><td>DB </td> <td>(YB) 0 </td><td>75% </td><td> </td> <td>[10] Pending - Probable</td></tr><tr><td>15-05-2011 22:40 </td> <td>YB </td><td>(YB) 0 </td><td>90% </td><td> </td> <td>[12] Solution Confirmed</td></tr> ";
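// The greedy leading .* pushes the capture to the last date before "Solution Confirmed".
// Note: if the pattern does not match, Replace returns the input unchanged.
var date = Regex.Replace(htmlData, @".*(\d{2}-\d{2}-\d{4}).*Solution Confirmed.*$", "$1");
Console.WriteLine(date);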