RegEx help - converting from JavaScript to C#

I have a string that I need to do multiple search and replaces to remove leading and trailing spaces inside an attribute. The before and after effect is shown here (visually and with a JS example of it working):

http://lloydi.com/x/re/

Now, I need to do the equivalent in C# - replace all references in a string. But I am really stuck. I know the pattern is correct, as shown in the JS version, but the syntax/escape syntax is doing my head in.

Here's what I have, but of course it doesn't work ;-)

//define the string
string xmlString = "<xml><elementName specificattribute=" 111 222 333333 " anotherattribute="something" somethingelse="winkle"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>";

// here's the regExPattern - the syntax checker doesn't like this at all
string regExPattern = "/(specificattribute=)"\s*([^"]+?)\s*"/g";

// here's the replacement
string replacement = "$1\"$2\"";

Regex rgx = new Regex(regExPattern);
string result = rgx.Replace(xmlString, replacement);

Can someone tell me the error of my ways?

Many thanks!


Don't use regular expressions for this task. .NET has powerful tools for manipulating XML documents. Try this instead:

XDocument doc = XDocument.Load("input.xml");
foreach (XAttribute attr in doc.Descendants("elementName")
                               .Attributes("specificattribute"))
{
    attr.Value = attr.Value.Trim();
}
doc.Save("output.xml");


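Remove the leading / and the trailing /g from regExPattern. That's the first mistake I see for certain. Those are JavaScript regex-literal delimiters and flags, not part of the pattern itself. .NET's regex implementation has no global modifier; Regex.Replace replaces every match by default.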
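As a minimal sketch of that point, the snippet below runs the pattern without the JavaScript delimiters over a string containing two matches and relies on Regex.Replace replacing both. The input string and the element names a and b are made up for illustration. The pattern is written as a C# verbatim string (@"..."), where backslashes are literal and "" stands for one double quote, which may ease the escaping pain mentioned in the question.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical input with two elements, to show that every match is replaced.
        string xml = "<a specificattribute=\" 1 2 \"/><b specificattribute=\"  3 \"/>";

        // Verbatim string: \s stays as typed, and "" is a single double quote.
        // Equivalent regex: (specificattribute=)"\s*([^"]+?)\s*"
        string pattern = @"(specificattribute=)""\s*([^""]+?)\s*""";

        // No global flag is needed: Regex.Replace replaces all matches.
        string result = Regex.Replace(xml, pattern, "$1\"$2\"");

        Console.WriteLine(result);
        // prints: <a specificattribute="1 2"/><b specificattribute="3"/>
    }
}
```

To replace only the first match, the instance overload Replace(input, replacement, count) with a count of 1 can be used.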

UPDATE:

I think this should work:

            // define the string (inner quotes escaped with \")
            string xmlString = "<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\" somethingelse=\"winkle\"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>";

            // the pattern: no /.../g delimiters, backslashes doubled, closing quote included
            string regExPattern = "(specificattribute=)\"\\s*([^\"]+?)\\s*\"";

            // here's the replacement
            string replacement = "$1\"$2\"";

            Regex rgx = new Regex(regExPattern);
            string result = rgx.Replace(xmlString, replacement);

Although this may actually work for you, XML's nested/context-specific nature makes regular expressions ill-suited to parse it properly and efficiently. It's certainly not the best tool for the job, let's put it that way.

From the look of things, you should really use something like XPath or LINQ to XML to parse and modify these attributes.

I'm practically stealing Mark Byers' answer, but since his example works with XML files and you're doing this in memory, it should be more like this:

XDocument doc = XDocument.Parse("<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\" somethingelse=\"winkle\"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>");
foreach (XAttribute attr in doc.Descendants("elementName")
                               .Attributes("specificattribute"))
{
    attr.Value = attr.Value.Trim();
}
string result = doc.ToString();


Seriously, you should be using the classes in the System.Xml namespace for this. Here's another example using XPath:

    string xmlString = "<xml><elementName specificattribute=\" 111 222 333333 \" anotherattribute=\"something\" somethingelse=\"winkle\"><someotherelement>value of some kind</someotherelement><yetanotherelement>another value of some kind</yetanotherelement></elementName></xml>";

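    XmlDocument xml = new XmlDocument();
    xml.LoadXml(xmlString);

    // select every specificattribute attribute, wherever it appears
    foreach (XmlAttribute el in xml.SelectNodes("//@specificattribute"))
    {
        el.Value = el.Value.Trim();
    }

    // serialize the modified document back to a string
    string result = xml.OuterXml;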