Find a date within a string that is before a match

Given the following string, I need to find the last occurrence of the string [12] Solution Confirmed and then traverse backwards until I hit a date. The date will always be in the format dd-MM-yyyy.

<tr><td>17-05-2011&nbsp;16:28&nbsp;</td><td>DB&nbsp;</td>
<td>(YB)&nbsp;0&nbsp;</td><td>75%&nbsp;</td><td>&nbsp;</td>
<td>[10] Pending - Probable</td></tr><tr><td>15-05-2011&nbsp;22:40&nbsp;</td>
<td>YB&nbsp;</td><td>(YB)&nbsp;0&nbsp;</td><td>90%&nbsp;</td><td>&nbsp;</td>
<td>[12] Solution Confirmed</td></tr>

In the example above, the date I would expect is 15-05-2011.

<tr><td>18-07-2011&nbsp;10:10&nbsp;</td>
<td>YB&nbsp;</td><td>(YB)&nbsp;56650&nbsp;</td>
<td>90%&nbsp;</td><td>&nbsp;</td><td>[12] Solution Confirmed</td></tr>

In the example above, I would expect the date to be 18-07-2011.

I can't be 100% sure that the string I am looking at is HTML compliant. Would a regex suit me best? Can anyone provide a working example?

Edit: I have looked into this, and it looks like the date is always in this format:

<td>dd-MM-yyyy&nbsp;HH:mm&nbsp;</td>


I was confirming this in a console app, but my thinking was the same as @Jason's:

// Requires: using System; using System.Text.RegularExpressions;
string x = "<tr><td>17-05-2011&nbsp;16:28&nbsp;</td><td>DB&nbsp;</td><td>(YB)&nbsp;0&nbsp;</td><td>75%&nbsp;</td>" +
           "<td>&nbsp;</td><td>[10] Pending - Probable</td></tr><tr><td>15-05-2011&nbsp;22:40&nbsp;</td>" +
           "<td>YB&nbsp;</td><td>(YB)&nbsp;0&nbsp;</td><td>90%&nbsp;</td><td>&nbsp;</td>" +
           "<td>[12] Solution Confirmed</td></tr>";
// Keep only the text before the last "Solution Confirmed".
// This assumes the marker is present; LastIndexOf returns -1 otherwise.
int searchBeforeLocation = x.LastIndexOf("Solution Confirmed");
x = x.Substring(0, searchBeforeLocation);
// Collect every dd-MM-yyyy date in that prefix.
Regex r = new Regex(@"\d{2}-\d{2}-\d{4}");
MatchCollection matches = r.Matches(x);
Console.WriteLine(matches[matches.Count - 1].Value);
Console.Read();

The date nearest to "Solution Confirmed" will be the last match.


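You should be able to use .*(\d{2}-\d{2}-\d{4}).*?\[12\] Solution Confirmed. Note that the square brackets must be escaped; an unescaped [12] is a character class that matches a single "1" or "2". The first .* (any character) is greedy, so it will use as much text as it can; the second .*? is lazy, so it will use as little text as it can. This should ensure that you get the date closest to the last "[12] Solution Confirmed". If the input contains line breaks, use RegexOptions.Singleline so that . also matches newlines.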
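As a rough illustration of the greedy/lazy idea, here is a minimal sketch that assumes the input is available as a single string and that the square brackets are escaped so [12] is matched literally. The class and method names (LastConfirmedDate, Find) are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastConfirmedDate
{
    // Hypothetical helper: returns the last date that is followed by
    // "[12] Solution Confirmed", or null when there is no such date.
    public static string Find(string html)
    {
        Match m = Regex.Match(
            html,
            // Greedy .* pushes the captured date as far right as possible;
            // lazy .*? then reaches forward to the marker.
            @".*(\d{2}-\d{2}-\d{4}).*?\[12\] Solution Confirmed",
            // Singleline lets . match newlines in multi-line input.
            RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Calling LastConfirmedDate.Find on the first sample from the question should yield 15-05-2011, because the greedy prefix skips past the earlier 17-05-2011 row.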


This should do the trick:

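// The lookahead keeps only dates that are followed, later on the same line, by the marker.
MatchCollection matches = Regex.Matches(inputData,
                      @"\d{2}-\d{2}-\d{4}(?=.*?\[12\]\sSolution\sConfirmed)");

// The last match is the date closest to the last "[12] Solution Confirmed".
string selectedValue = matches[matches.Count - 1].Value;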

I think the best way is to run the regex, get all the matches, and then take the last value from the matches. I don't think there is a way to get that straight from the regex, unless you have something unique in front of your last match that you can use as a reference.
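One possible alternative to collecting every match, offered as a sketch rather than a definitive answer: .NET's RegexOptions.RightToLeft makes the engine search from the end of the input, so the first match it reports is the right-most one. The example below assumes the marker text is exactly "[12] Solution Confirmed"; the class and method names are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RightToLeftDate
{
    // Hypothetical helper: returns the date nearest before the last marker,
    // or null when either the marker or a date is missing.
    public static string Find(string html)
    {
        int marker = html.LastIndexOf("[12] Solution Confirmed", StringComparison.Ordinal);
        if (marker < 0)
        {
            return null; // no marker in the input
        }

        // Search only the text before the marker, starting from its end.
        Match m = Regex.Match(
            html.Substring(0, marker),
            @"\d{2}-\d{2}-\d{4}",
            RegexOptions.RightToLeft);
        return m.Success ? m.Value : null;
    }
}
```

This avoids building a MatchCollection when only the final date is needed, though for inputs of this size the difference is unlikely to matter.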


The simplest regular expression is \d{2}-\d{2}-\d{4}

Update:

string content = @"<tr><td>17-05-2011&nbsp;16:28&nbsp;</td><td>DB&nbsp;</td>
<td>(YB)&nbsp;0&nbsp;</td><td>75%&nbsp;</td><td>&nbsp;</td>
<td>[10] Pending - Probable</td></tr><tr><td>15-05-2011&nbsp;22:40&nbsp;</td>
<td>YB&nbsp;</td><td>(YB)&nbsp;0&nbsp;</td><td>90%&nbsp;</td><td>&nbsp;</td>
<td>[12] Solution Confirmed</td></tr>";

MatchCollection matches = Regex.Matches(content, @"\d{2}-\d{2}-\d{4}");


Try this:

        var htmlData = "<tr><td>17-05-2011&nbsp;16:28&nbsp;</td><td>DB&nbsp;</td> <td>(YB)&nbsp;0&nbsp;</td><td>75%&nbsp;</td><td>&nbsp;</td> <td>[10] Pending - Probable</td></tr><tr><td>15-05-2011&nbsp;22:40&nbsp;</td> <td>YB&nbsp;</td><td>(YB)&nbsp;0&nbsp;</td><td>90%&nbsp;</td><td>&nbsp;</td> <td>[12] Solution Confirmed</td></tr> ";
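        // Replace the whole input with capture group 1: the last date followed by "Solution Confirmed".
        // If nothing matches, Regex.Replace returns the input unchanged.
        var date = Regex.Replace(htmlData, @".*(\d{2}-\d{2}-\d{4}).*Solution Confirmed.*$", "$1");
        Console.WriteLine(date);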