开发者

Parse an integer from a string with trailing garbage

I need to parse a decimal integer that appears at the start of a string.

There may be trailing garbage following the decimal number. This needs to be ignored (even if it contains other numbers.)

e.g.

"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2

Is there a built-in method in the .NET framework to do this?

int.TryParse() is not suitable. It allows t开发者_运维知识库railing spaces but not other trailing characters.

It would be quite easy to implement this but I would prefer to use the standard method if it exists.


You can use Linq to do this, no Regular Expressions needed:

public static int GetLeadingInt(string input)
{
   return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray()));
}

This works for all your provided examples:

string[] tests = new string[] {
   "1",
   " 42 ",
   " 3 -.X.-",
   " 2 3 4 5"
};

foreach (string test in tests)
{
   Console.WriteLine("Result: " + GetLeadingInt(test));
}


foreach (var m in Regex.Matches(" 3 - .x. 4", @"\d+"))
{
    Console.WriteLine(m);
}

Updated per comments

Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.

To get first int:

Match match = Regex.Match(" 3 - .x. - 4", @"\d+");
if (match.Success)
    Console.WriteLine(int.Parse(match.Value));


There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).

Will the result always be non-negative (which would make things easier)?

To be honest, regular expressions are the easiest option here, but...

public static string RemoveCruftFromNumber(string text)
{
    int end = 0;

    // First move past leading spaces
    while (end < text.Length && text[end] == ' ')
    {
        end++;
    }

    // Now move past digits
    while (end < text.Length && char.IsDigit(text[end]))
    {
        end++;
    }

    return text.Substring(0, end);
}

Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).


I like @Donut's approach.

I'd like to add though, that char.IsDigit and char.IsNumber also allow for some unicode characters which are digits in other languages and scripts (see here).
If you only want to check for the digits 0 to 9 you could use "0123456789".Contains(c).

Three example implementions:

To remove trailing non-digit characters:

var digits = new string(input.Trim().TakeWhile(c =>
    ("0123456789").Contains(c)
).ToArray());

To remove leading non-digit characters:

var digits = new string(input.Trim().SkipWhile(c =>
    !("0123456789").Contains(c)
).ToArray());

To remove all non-digit characters:

var digits = new string(input.Trim().Where(c =>
    ("0123456789").Contains(c)
).ToArray());

And of course: int.Parse(digits) or int.TryParse(digits, out output)


This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:

for (int p = input.Length;  p > 0;  p--)
{
    int  num;
    if (int.TryParse(input.Substring(0, p), out num))
        return num;
}
throw new Exception("Malformed integer: " + input);

Of course, this will be slow if input is very long.

ADDENDUM (March 2016)

This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:

for (int p = input.Length;  p > 0;  p--)
{
    char  ch;
    do
    {
        ch = input[--p];
    } while ((ch < '0'  ||  ch > '9')  &&  ch != ' '  &&  p > 0);
    p++;

    int  num;
    if (int.TryParse(input.Substring(0, p), out num))
        return num;
}
throw new Exception("Malformed integer: " + input);


string s = " 3 -.X.-".Trim();
string collectedNumber = string.empty;
int i;

for (x = 0; x < s.length; x++) 
{

  if (int.TryParse(s[x], out i))
     collectedNumber += s[x];
  else
     break;     // not a number - that's it - get out.

} 

if (int.TryParse(collectedNumber, out i))
    Console.WriteLine(i); 
else
    Console.WriteLine("no number found");


This is how I would have done it in Java:

int parseLeadingInt(String input)
{
    NumberFormat fmt = NumberFormat.getIntegerInstance();
    fmt.setGroupingUsed(false);
    return fmt.parse(input, new ParsePosition(0)).intValue();
}

I was hoping something similar would be possible in .NET.

This is the regex-based solution I am currently using:

int? parseLeadingInt(string input)
{
    int result = 0;
    Match match = Regex.Match(input, "^[ \t]*\\d+");
    if (match.Success && int.TryParse(match.Value, out result))
    {
        return result;
    }
    return null;
}


Might as well add mine too.

        string temp = " 3 .x£";
        string numbersOnly = String.Empty;
        int tempInt;
        for (int i = 0; i < temp.Length; i++)
        {
            if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
            {
                numbersOnly += temp[i];
            }
        }

        Int32.TryParse(numbersOnly, out tempInt);
        MessageBox.Show(tempInt.ToString());

The message box is just for testing purposes, just delete it once you verify the method is working.


I'm not sure why you would avoid Regex in this situation.

Here's a little hackery that you can adjust to your needs.

" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);

public static class CharArrayExtensions
{
    public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
    {
        foreach (var c in array)
        {
            if(char.IsNumber(c))
                yield return c;
        }
    }
}

EDIT: That's true about the incorrect result (and the maintenance dev :) ).

Here's a revision:

    public static int FindFirstInteger(this IEnumerable<char> array)
    {
        bool foundInteger = false;
        var ints = new List<char>();

        foreach (var c in array)
        {
            if(char.IsNumber(c))
            {
                foundInteger = true;
                ints.Add(c);
            }
            else
            {
                if(foundInteger)
                {
                    break;
                }
            }
        }

        string s = string.Empty;
        ints.ForEach(i => s += i.ToString());
        return int.Parse(s);
    }


    private string GetInt(string s)
    {
        int i = 0;

        s = s.Trim();
        while (i<s.Length && char.IsDigit(s[i])) i++;

        return s.Substring(0, i);
    }


Similar to Donut's above but with a TryParse:

    private static bool TryGetLeadingInt(string input, out int output)
    {
        var trimmedString = new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray());
        var canParse = int.TryParse( trimmedString, out output);
        return canParse;
    }
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜