Parse a number from a string with non-digits in between

I am working on a .NET project and I am trying to parse only the numeric value from a string. For example,

string s = "12ACD";
int t = someparefun(s); 
print(t) //t should be 12

A couple of assumptions:

  1. The string pattern will always be a number followed by characters.
  2. The number portion will always be a one- or two-digit value.

Is there a predefined C# function to parse the numeric value from a string?


There's no such function, at least none I know of. But one method would be to use a regular expression to remove everything that is not a number:

using System;
using System.Text.RegularExpressions;

int result =
    // The Convert (System) class comes in pretty handy every time
    // you want to convert something.
    Convert.ToInt32(
        Regex.Replace(
            "12ACD",  // Our input
            "[^0-9]", // Select everything that is not in the range of 0-9
            ""        // Replace that with an empty string.
    ));

This will yield 12 for "-12ABC", so if you need to be able to process negative numbers, you'll need a different solution. It is also not safe: if you pass it only non-digits, it will throw a FormatException. Here is some example data:

"12ACD"  =>  12
"12A5"   =>  125
"CA12A"  =>  12
"-12AD"  =>  12
""       =>  FormatException
"AAAA"   =>  FormatException

A little bit more verbose but safer approach would be to use int.TryParse():

using System;
using System.Text.RegularExpressions;

public static int ConvertToInt(String input)
{
    // Replace everything that is not a digit.
    String inputCleaned = Regex.Replace(input, "[^0-9]", "");

    int value = 0;

    // Tries to parse the int, returns false on failure.
    if (int.TryParse(inputCleaned, out value))
    {
        // The result from parsing can be safely returned.
        return value;
    }

    return 0; // Or any other default value.
}

Some example data again:

"12ACD"  =>  12
"12A5"   =>  125
"CA12A"  =>  12
"-12AD"  =>  12
""       =>  0
"AAAA"   =>  0

Or if you only want the first number in the string, basically stopping at meeting something that is not a digit, we suddenly also can treat negative numbers with ease:

using System;
using System.Text.RegularExpressions;

public static int ConvertToInt(String input)
{
    // Matches the first number, with or without a leading minus.
    Match match = Regex.Match(input, "-?[0-9]+");

    if (match.Success)
    {
        // No need to TryParse here, the match has to be at least
        // a 1-digit number.
        return int.Parse(match.Value);
    }

    return 0; // Or any other default value.
}

And again we test it:

"12ACD"  =>  12
"12A5"   =>  12
"CA12A"  =>  12
"-12AD"  =>  -12
""       =>  0
"AAAA"   =>  0

Overall, if we're talking about user input, I would consider not accepting invalid input at all: use only int.TryParse() without any additional magic and, on failure, inform the user that the input was invalid (and maybe prompt again for a valid number).
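
A minimal sketch of that stricter approach, in case it helps. The class name StrictInput, the method name TryReadNumber and the error message are made up for illustration; only int.TryParse is the real API. Instead of cleaning the input, it rejects anything that is not a plain integer and hands back a message the caller can show before prompting again:

```csharp
using System;

public static class StrictInput
{
    // Hypothetical helper: accepts the input only if int.TryParse does,
    // otherwise returns false and a message for the user.
    public static bool TryReadNumber(string input, out int value, out string error)
    {
        if (int.TryParse(input, out value))
        {
            error = null;
            return true;
        }

        // value has been set to 0 by TryParse at this point.
        error = "'" + input + "' is not a valid whole number.";
        return false;
    }
}
```

With this, "12" and "-12" are accepted, while "12ACD", "" and "AAAA" are rejected rather than silently turned into 12 or 0.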


Regex is one approach, as demonstrated by Bobby.

Another approach, given your assumptions, is to use TakeWhile in this fashion (with a TryParse for extra safety):

using System;
using System.Linq;

string input = "12ACD";
string digits = new string(input.TakeWhile(c => Char.IsDigit(c)).ToArray());
int result;
if (Int32.TryParse(digits, out result))
{
    Console.WriteLine(result);
}

Granted, the purpose of the code doesn't immediately pop out to the reader since most of their time will be spent deciphering the TakeWhile portion being converted to a string.


The regex method as described by Bobby is probably the best way to handle this but if you are particularly wary of regular expressions you could use a combination of LINQ and the Convert.ToInt32 method:

    string test = "12ACD";
    int number = Convert.ToInt32(new String(test.Where(x => char.IsNumber(x)).ToArray()));


Using Sprache:

int t = Parse.Number.Select(int.Parse).Parse("12ACD");
print(t) // t should be 12, of type Int32.


Since you know the only characters you care about are either the first two or just the first, you could use int.TryParse on a Substring of the first two characters.

If that returns false (i.e. the second character wasn't a digit), then just call int.Parse on a Substring of the first character.

There's probably a cleaner way, but based on your assumptions, that should accomplish the job.
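
Roughly, the two steps above could look like the sketch below. The names LeadingNumber and ParseLeading are hypothetical, and it relies on the question's assumptions (one or two leading digits, followed by non-digit characters):

```csharp
using System;

public static class LeadingNumber
{
    // Hypothetical helper; assumes the input starts with one or two digits.
    public static int ParseLeading(string input)
    {
        int value;

        // Step 1: try the first two characters together.
        if (input.Length >= 2 && int.TryParse(input.Substring(0, 2), out value))
        {
            return value;
        }

        // Step 2: the second character was not a digit, so parse only the first.
        // Throws FormatException if the first character is not a digit either,
        // and ArgumentOutOfRangeException if the input is empty.
        return int.Parse(input.Substring(0, 1));
    }
}
```

For "12ACD" the first step succeeds and returns 12; for "7XYZ" the first step fails on "7X" and the second returns 7.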


Even if there were such an intrinsic function in the CLI, you'd either find that it worked only on one specific form, or you'd have to tell it the form and/or the behaviour(s) to use with that form. In other words, what would you want your solution to do with "AB123CD456EF"? Parse only the first occurrence, concatenate all the numeric characters together and parse that, or parse every occurrence into an element of an enumerable result?

Regular expressions handle any of these cases quite adequately. Whichever of the good suggestions already given you choose, I'd recommend wrapping it in readable, well-documented functions.
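
To make the three behaviours concrete, here is a rough sketch with one function per interpretation. The class and method names are invented for this example, and returning null when there are no digits is just one possible choice:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberExtraction
{
    // Parses only the first run of digits; null when there are none.
    public static int? First(string input)
    {
        Match match = Regex.Match(input, "[0-9]+");
        return match.Success ? int.Parse(match.Value) : (int?)null;
    }

    // Concatenates every digit in the string and parses the result.
    public static int? Concatenated(string input)
    {
        string digits = Regex.Replace(input, "[^0-9]", "");
        return digits.Length > 0 ? int.Parse(digits) : (int?)null;
    }

    // Parses every run of digits into its own element.
    public static List<int> All(string input)
    {
        return Regex.Matches(input, "[0-9]+")
            .Cast<Match>()
            .Select(m => int.Parse(m.Value))
            .ToList();
    }
}
```

For "AB123CD456EF" these give 123, 123456 and the list 123, 456 respectively. Note that int.Parse will throw an OverflowException if a run of digits does not fit into an int.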


Ahmad's solution led me to this, assuming the string is always one or two digits followed by at least one non-digit character:

int number = Int32.Parse(
    Char.IsDigit(foo, 1)  ?  foo.Substring(0, 2)  :  foo.Substring(0, 1), 
    CultureInfo.InvariantCulture);

The logic is the following: If the character at index 1 (position 2) is a digit, get the first two characters, then parse them. If the character at index 1 is not a digit, get the first character, then parse it.


How about just:

    public int ReadStartingNumber(string text)
    {
        if (string.IsNullOrEmpty(text) || !char.IsDigit(text[0]))
            throw new FormatException("Text does not start with any digits");

        int result = 0;
        foreach (var digit in text.TakeWhile(c => char.IsDigit(c)))
        {
            result = 10*result + (digit - '0');
        }

        return result;
    }


You can use Regex.Match (regular expressions); read the MSDN article on them. It is simple.


Int32.Parse()

There are equivalents for other number types as well.

Edit: After rereading, I saw that your string is not just that number. In that case you will need to pull out the digits first with a regular expression before using parse.


The most direct code based on your assumptions would be as follows...

string s = "13AD";
string s2 = s.Substring(0, s.Length - 2);
int i = int.Parse(s2);

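If your assumptions are guaranteed and the number is always followed by exactly two characters (as in "13AD"), this is the most readable way of doing it. With the question's own "12ACD" it would pass "12A" to int.Parse and throw a FormatException. No need for regex or fancy LINQ stuff. LINQ is great, but it seems to get overused way too often.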
