Need help inserting commas after each character in a specific part of a string
In the program I'm working on, I need to strip the tags around certain parts of a string, and then insert a comma after each character WITHIN the tag (but not after any other characters in the string). In case this doesn't make sense, here's an example of what needs to happen:
This is a string with a < a > tag < /a > (please ignore the spaces within the tag)
(needs to become)
This is a string with a t,a,g,.
Can anyone help me with this? I've managed to strip the tags using RegEx, but I can't figure out how to insert the commas only after the characters contained within the tag. If someone could help that would be great.
@Dour High Arch I'll elaborate a little bit. The code is for a text-to-speech app that won't recognize SSML tags. When the user enters a message for the text-to-speech app, they have the option of enclosing a word in a < a > tag to make the speaker say the word as an acronym. Because the acronym SSML tag won't work, I want to remove the < a > tag whenever present, and place a comma after each character contained in the tag to fake it out (ex: < a > test< /a > becomes t,e,s,t,). The non-tagged words in the string do not need commas after them, just those enclosed in tags (see my first example if need be).
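If you have already figured out the regex, it should be simple to capture the inner text of the tag. After that, inserting the commas is a one-liner. Note that string.Join only puts the separator between elements, so append a "," yourself if you need the trailing comma:
    // "test" becomes "t,e,s,t"; ToList() needs a using System.Linq directive
    var commaString = string.Join(",", capturedString.ToList());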
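To tie the two steps together, here is a rough sketch that does the capture and the comma insertion in a single Regex.Replace call with a match evaluator. The class name AcronymExpander, the method name Expand, and the pattern itself are my own assumptions, not something taken from the question. The pattern tolerates spaces inside the tags, and trimming the captured text is a guess at what you would want for input like < a > test< /a >. Unlike the string.Join line, this version adds the trailing comma shown in your example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
static class AcronymExpander
{
    // Matches <a>...</a>, allowing spaces inside the tags ("< a >", "< /a >").
    // The lazy group (.*?) captures the text between the tags.
    private static readonly Regex AcronymTag = new Regex(
        @"<\s*a\s*>(.*?)<\s*/\s*a\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Expand(string input)
    {
        // The evaluator runs once per match, and its return value replaces
        // the whole match (tags included), so untagged text is left alone.
        return AcronymTag.Replace(input, m =>
            string.Concat(
                m.Groups[1].Value
                    .Trim()                  // assumption: ignore padding inside the tag
                    .Select(c => c + ","))); // comma after every character
    }
}
```

Calling AcronymExpander.Expand("This is a string with a <a>tag</a>.") should give "This is a string with a t,a,g,.". Like any regex approach, this does not handle nested tags or the other XML complications.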
Assuming your RegEx has already extracted the target string (i.e. there are no tags around it):
    using System;
    using System.Linq;

    namespace ConsoleApplication32
    {
        class Program
        {
            static void Main(string[] args)
            {
                // set up a test string
                string stringToProcess = "Test";

                // actual solution here: append a comma to every character
                string result = String.Concat(stringToProcess.Select(c => c + ","));

                // result: T,e,s,t,
                Console.WriteLine(result);
            }
        }
    }
Parsing XML is very problematic because you may have to deal with things like CDATA sections, nested elements, entities, surrogate characters, and on and on. I would use a state-based parser like ANTLR.
However, if you are just starting out with C#, it is instructive to solve this using the built-in .NET string and array classes. No ANTLR, LINQ, or regular expressions needed:
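    using System;

    class ReplaceAContentsWithCommaSeparatedChars
    {
        static readonly string acroStartTag = "<a>";
        static readonly string acroEndTag = "</a>";

        static void Main(string[] args)
        {
            string s = "Alpha <a>Beta</a> Gamma <a>Delta</a>";
            while (true)
            {
                // Find the next opening tag; stop when none remain.
                int start = s.IndexOf(acroStartTag);
                if (start < 0)
                    break;

                // Find the closing tag. If it is missing, treat the rest of
                // the string as the tag contents.
                int contentStart = start + acroStartTag.Length;
                int end = s.IndexOf(acroEndTag, contentStart);
                bool closed = end >= 0;
                if (!closed)
                    end = s.Length;

                string contents = s.Substring(contentStart, end - contentStart);
                string[] chars = Array.ConvertAll<char, string>(contents.ToCharArray(), c => c.ToString());

                // Skip past the closing tag only if one was found; otherwise
                // Substring would be given an index past the end of the string.
                string rest = closed ? s.Substring(end + acroEndTag.Length) : "";
                s = s.Substring(0, start)
                    + string.Join(",", chars)
                    + rest;
            }
            // Prints: Alpha B,e,t,a Gamma D,e,l,t,a
            Console.WriteLine(s);
        }
    }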
Please be aware this does not deal with any of the issues I mentioned. But then, none of the other suggestions do either.
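If you want the trailing comma from the question (t,e,s,t, rather than t,e,s,t), one possible variation is to walk the string once and build the result with a StringBuilder. This is only a sketch under the same simplifying assumptions as above: the tags are exactly <a> and </a> with no spaces or attributes, and an unclosed tag runs to the end of the string. The names AcronymScanner and Expand are made up for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper; the names are illustrative only.
static class AcronymScanner
{
    const string StartTag = "<a>";
    const string EndTag = "</a>";

    public static string Expand(string s)
    {
        var result = new StringBuilder(s.Length * 2);
        int pos = 0; // index of the first character not yet copied

        while (pos < s.Length)
        {
            int start = s.IndexOf(StartTag, pos, StringComparison.Ordinal);
            if (start < 0)
                break; // no more tags

            // Copy the untagged text before the tag unchanged.
            result.Append(s, pos, start - pos);

            int contentStart = start + StartTag.Length;
            int end = s.IndexOf(EndTag, contentStart, StringComparison.Ordinal);
            int contentEnd = end < 0 ? s.Length : end;

            // Emit each tagged character followed by a comma.
            // Note: this works per UTF-16 char, so surrogate pairs get split.
            for (int i = contentStart; i < contentEnd; i++)
                result.Append(s[i]).Append(',');

            // Continue after the closing tag, or stop if it was missing.
            pos = end < 0 ? s.Length : end + EndTag.Length;
        }

        // Copy whatever untagged text is left.
        result.Append(s, pos, s.Length - pos);
        return result.ToString();
    }
}
```

With the sample string from above, AcronymScanner.Expand("Alpha <a>Beta</a> Gamma <a>Delta</a>") should return "Alpha B,e,t,a, Gamma D,e,l,t,a,". It still ignores CDATA, entities, and nesting.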