regex to match <Key>....<Value> pattern

2023-01-06 00:09

I have the following data, sent by an external system, which needs to be parsed for a particular key:

<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key><Value>MyName</Value></Context>
</ContextDetails>

I tried parsing this with the following regex to fetch the value for the key Name:

<Context><Key>Name</Key><Value>.</Value></Context>

but the result is blank.

What change do I need to make to fix this regex?

If this is XML, load it into an XDocument and query that.

See the answer from @Jens for details on how to do this.

To expand on Oded's answer, you should do this along the following lines:

using System.Linq;
using System.Xml.Linq;

XDocument doc = XDocument.Parse(@"<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key><Value>MyName</Value></Context>
</ContextDetails>");

// Find the single <Context> whose <Key> is "Name" and read its <Value>.
string name = doc.Root.Elements("Context")
                      .Where(xe => xe.Element("Key").Value == "Name")
                      .Single()
                      .Element("Value").Value;
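
Note that Single() throws if no Context has the key Name, or if more than one does. Below is a minimal sketch of a more forgiving lookup, assuming a missing key should give null instead of an exception. The ContextLookup class and FindValue method are hypothetical names used only for illustration; they are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ContextLookup
{
    // Hypothetical helper: returns the <Value> text of the first <Context>
    // whose <Key> equals the given key, or null if there is no such entry.
    public static string FindValue(string xml, string key)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root
                  .Elements("Context")
                  // The explicit (string) conversion yields null for a missing element.
                  .Where(xe => (string)xe.Element("Key") == key)
                  .Select(xe => (string)xe.Element("Value"))
                  .FirstOrDefault();
    }

    public static void Main()
    {
        string xml = @"<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key><Value>MyName</Value></Context>
</ContextDetails>";

        Console.WriteLine(FindValue(xml, "Name"));                      // MyName
        Console.WriteLine(FindValue(xml, "Missing") ?? "(not found)");  // (not found)
    }
}
```

Whether to throw or return null on a missing key depends on how much you trust the external system; Single() is the stricter choice.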

In my opinion you are doing it wrong. You should use an XML Parser. http://www.tutorialspoint.com/ruby/ruby_xml_xslt.htm It's just a guide. It can help.

I think the regular expression you want, to match all key-value pairs, is:

<Context>\s*?<Key>(.*?)\</Key>\s*?<Value>(.*?)</Value>\s*?</Context>

Description:

// <Context>\s*?<Key>(.*?)\</Key>\s*?<Value>(.*?)</Value>\s*?</Context>
// 
// Match the characters "<Context>" literally «<Context>»
// Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "<Key>" literally «<Key>»
// Match the regular expression below and capture its match into backreference number 1 «(.*?)»
//    Match any single character that is not a line break character «.*?»
//       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the character "<" literally «\<»
// Match the characters "/Key>" literally «/Key>»
// Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "<Value>" literally «<Value>»
// Match the regular expression below and capture its match into backreference number 2 «(.*?)»
//    Match any single character that is not a line break character «.*?»
//       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "</Value>" literally «</Value>»
// Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "</Context>" literally «</Context>»

Usage:

using System;
using System.Text.RegularExpressions;

public static class Snippet
{
    public static void RunSnippet()
    {
        // Note: RegexOptions.Multiline only changes the meaning of ^ and $,
        // so it has no effect on this pattern.
        Regex regexObj = new Regex("<Context>\\s*?<Key>(.*?)\\</Key>\\s*?<Value>(.*?)</Value>\\s*?</Context>",
            RegexOptions.IgnoreCase | RegexOptions.Multiline);
        Match matchResults = regexObj.Match(@"<ContextDetails>
            <Context><Key>ID</Key><Value>100</Value></Context>
            <Context><Key>Name</Key>   <Value>MyName</Value></Context>
            </ContextDetails>
            ");
        // Walk through every match and print the captured key and value.
        while (matchResults.Success)
        {
            Console.WriteLine("Key: " + matchResults.Groups[1].Value);
            Console.WriteLine("Value: " + matchResults.Groups[2].Value);
            Console.WriteLine("----");
            matchResults = matchResults.NextMatch();
        }
    }
}
/*
Output:

    Key: ID
    Value: 100
    ----
    Key: Name
    Value: MyName
    ----
*/

The regular expression to match only the value for the key "Name":

<Context>\s*?<Key>Name</Key>\s*?<Value>(.*?)</Value>\s*?</Context>

Description:

// <Context>\s*?<Key>Name</Key>\s*?<Value>(.*?)</Value>\s*?</Context>
// 
// Match the characters "<Context>" literally «<Context>»
// Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "<Key>Name</Key>" literally «<Key>Name</Key>»
// Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "<Value>" literally «<Value>»
// Match the regular expression below and capture its match into backreference number 1 «(.*?)»
//    Match any single character that is not a line break character «.*?»
//       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "</Value>" literally «</Value>»
// Match a single character that is a "whitespace character" (spaces, tabs, line breaks, etc.) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the characters "</Context>" literally «</Context>»

Usage:

string subjectString = @"<ContextDetails>
            <Context><Key>ID</Key><Value>100</Value></Context>
            <Context><Key>Name</Key>   <Value>MyName</Value></Context>
            </ContextDetails>
            ";
// Group 1 holds the text between <Value> and </Value>.
Console.WriteLine(Regex.Match(subjectString,
    "<Context>\\s*?<Key>Name</Key>\\s*?<Value>(.*?)</Value>\\s*?</Context>",
    RegexOptions.IgnoreCase | RegexOptions.Multiline).Groups[1].Value);

Can you use an XML parser? If so, then use it, it's the Right Tool For This Job.

If you just have, say, a text editor and are willing to check every match by hand, then you could possibly use a regex. The error in your regex is that . only matches one character (any character except newline). So you'd need to replace this by .*? (match any number of characters, but as few as possible) or, better, [^<]*.

The latter means "zero or more characters except <" (which is the delimiting character). Of course, this can only work if there is never a < inside the value you're looking for.

Your regex also assumes that the entire match is on one single line with no whitespace between tags - so it will fail in all other cases.
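
As a rough sketch of the two fixes described above (using [^<]* instead of . and allowing whitespace between the tags), assuming the value never contains a < character. The NameRegexDemo class and ExtractName method are illustrative names only, not something from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameRegexDemo
{
    // [^<]* captures everything up to the next '<';
    // \s* tolerates whitespace (including line breaks) between the tags.
    public const string Pattern =
        @"<Context>\s*<Key>Name</Key>\s*<Value>([^<]*)</Value>\s*</Context>";

    // Returns the captured value, or null if the pattern does not match.
    public static string ExtractName(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string input = @"<ContextDetails>
<Context><Key>ID</Key><Value>100</Value></Context>
<Context><Key>Name</Key>
   <Value>MyName</Value></Context>
</ContextDetails>";

        Console.WriteLine(ExtractName(input)); // MyName
    }
}
```

This still breaks on attributes, CDATA sections, or entity-encoded values, which is why the parser remains the better option.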

Update: I just saw your edit: you do have access to an XML parser then - go with Oded's answer.
