What does char 160 mean in my source code?

I am formatting numbers as strings using the format string "# #.##". At some point I need to turn these number strings, such as 1 234 567, back into something like 1234567. I am trying to strip out the empty chars, but found that

value = value.Replace(" ", "");

has no effect for some reason, and the string remains 1 234 567. After inspecting the string I found that

value[1] is 160.

What does the value 160 mean?


The answer is to look in the Unicode Code Charts, where you'll find the Latin-1 Supplement chart; this shows that U+00A0 (160 as per your title, not 167 as per the body) is a non-breaking space.
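To check this from code rather than from the charts, a minimal sketch along these lines should work. The class name Char160Demo and the helper Describe are made up for illustration; the sample string assumes the formatted value contains U+00A0 between the digit groups, as described in the question.

```csharp
using System;
using System.Globalization;

public static class Char160Demo
{
    // Hypothetical helper: reports the code point, the Unicode category,
    // and whether .NET treats the character as white space.
    public static string Describe(char c)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "U+{0:X4} {1} whitespace={2}",
            (int)c, char.GetUnicodeCategory(c), char.IsWhiteSpace(c));
    }

    public static void Main()
    {
        // Assumed sample: digit groups separated by U+00A0 (char 160).
        string value = "1\u00A0234\u00A0567";

        Console.WriteLine(Describe(value[1])); // U+00A0 SpaceSeparator whitespace=True
        Console.WriteLine(Describe(' '));      // U+0020 SpaceSeparator whitespace=True
        Console.WriteLine(value[1] == ' ');    // False
    }
}
```

Both characters are in the same Unicode category, but they are different code points, which is why Replace(" ", "") leaves char 160 untouched.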


Char code 160 is the non-breaking space, written as &nbsp; in HTML.


Maybe you could use a regex to replace those empty chars:

Regex.Replace(input, @"\p{Z}", "");

This will remove "any kind of whitespace or invisible separator".
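As a rough, self-contained sketch of that call applied to the question's data: the class name SeparatorStripper, the method StripSeparators, and the sample string are assumptions for illustration, and the sample deliberately mixes a non-breaking space with a normal space.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorStripper
{
    // Hypothetical helper: removes every character in Unicode category Z
    // (space, line, and paragraph separators), including U+00A0.
    public static string StripSeparators(string input)
    {
        return Regex.Replace(input, @"\p{Z}", "");
    }

    public static void Main()
    {
        // Assumed sample: one non-breaking space and one normal space.
        string formatted = "1\u00A0234 567";
        Console.WriteLine(StripSeparators(formatted)); // 1234567
    }
}
```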


value = value.Replace(Convert.ToChar(160).ToString(), "");


This is a fast (and fairly readable) way of removing any characters classified as white space using Char.IsWhiteSpace:

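// Requires: using System.Text;
// Copy every character that is not white space (this includes char 160).
StringBuilder sb = new StringBuilder(value.Length);
foreach (char c in value)
{
    if (!char.IsWhiteSpace(c))
        sb.Append(c);
}
value = sb.ToString(); // assign to the existing variable; redeclaring it would not compile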

As dbemerlin points out, if you know you will only need numbers from your data, you would be better off using Char.IsNumber or the even more restrictive Char.IsDigit:

// Keep only characters that Unicode classifies as numbers.
StringBuilder sb = new StringBuilder(value.Length);
foreach (char c in value)
{
    if (char.IsNumber(c))
        sb.Append(c);
}
value = sb.ToString();

If you need numbers and decimal separators, something like this should suffice:

string separator = System.Globalization.NumberFormatInfo.CurrentInfo.NumberDecimalSeparator;
StringBuilder sb = new StringBuilder(value.Length);
foreach (char c in value)
{
    // NumberDecimalSeparator is a string, so compare it with the character as a string.
    if (char.IsNumber(c) || c.ToString() == separator)
        sb.Append(c);
}
value = sb.ToString();


I would suggest using the char overload version:

value = value.Replace(Convert.ToChar(160), ' ');


Solution with an extension method:

public static class ExtendedMethods
{
    public static string NbspToSpaces(this string text)
    {
        return text.Replace(Convert.ToChar(160), ' ');
    }
}

And it can be used with this code:

value = value.NbspToSpaces();


Wouldn't the preferred method be to replace all empty characters (which is what the questioner wanted to do) with the Regex method that Rubens already posted?

Regex.Replace(input, @"\p{Z}", "");

or what Expresso suggests:

Regex.Replace(input, @"\p{Zs}", "");

The difference is that \p{Z} matches any kind of whitespace or invisible separator, whereas \p{Zs} matches only whitespace characters that are invisible but take up space. See the section on Unicode categories here:

http://www.regular-expressions.info/unicode.html

Using a regex has the advantage that a single call also removes normal spaces, not only the non-breaking space as in some of the answers above.

If performance matters, other methods should of course be considered, but that is out of scope here.
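To make the difference between the two patterns concrete, here is a small sketch. The class and method names are invented for illustration, and the sample string is an assumption chosen to contain a non-breaking space (U+00A0), a normal space, a line separator (U+2028), and a tab. Note that the tab is a control character, not a separator, so neither pattern removes it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorCategories
{
    // \p{Z}: all separators (space, line, and paragraph separators).
    public static string RemoveAllSeparators(string input)
    {
        return Regex.Replace(input, @"\p{Z}", "");
    }

    // \p{Zs}: space separators only (normal space, non-breaking space, ...).
    public static string RemoveSpaceSeparators(string input)
    {
        return Regex.Replace(input, @"\p{Zs}", "");
    }

    public static void Main()
    {
        // Assumed sample: nbsp, normal space, line separator, tab.
        string sample = "1\u00A02 3\u20284\t5";

        // \p{Z} removes the nbsp, the space, and U+2028, but keeps the tab.
        Console.WriteLine(RemoveAllSeparators(sample) == "1234\t5");          // True

        // \p{Zs} removes the nbsp and the space, but keeps U+2028 and the tab.
        Console.WriteLine(RemoveSpaceSeparators(sample) == "123\u20284\t5");  // True
    }
}
```

For number strings like the one in the question, both patterns give the same result, since only space separators are involved.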
