Fastest way to get the number at the tail of a string
Given that we have strings of this kind: "XXXXXXX XXXXX 756", "XXXXX XXXXXX35665" (X is any character), which is the fastest way to get the number at the end of the string?
EDIT: well, this is a just-for-fun question. Solving this is quite simple, but I want to know the fastest algorithm to achieve it. Rock on!
In C, a quick O(n), one-pass algorithm (it ignores negative signs) is:
int suffixedNumber(char* string) {
    int result = 0;
    char ch;
    while ((ch = *string++))
        // Check whether <= '9' first, because most characters are > '9'.
        // Any non-digit resets result, so 0 is returned when there is no trailing number.
        result = (ch <= '9' && ch >= '0') ? 10*result + (ch - '0') : 0;
    return result;
}
If you're alright with gotos, you can get a ≈20% faster algorithm that (in order of importance):
- returns -1 when there is no number at the end of string
- avoids checking for end-of-string when ch >= '0'
- avoids resetting result to zero when ch is nonnumeric
- avoids multiplying result by ten when a number starts
- avoids setting result to zero at the beginning
int suffixedNumber(char* string) {
    int result;
    char ch;
nonnumber: // STATE: Waiting for the start of a number.
    ch = *string++;
    if (ch > '9') goto nonnumber; // Decide this boundary first (> '9' most frequent)
    if (ch < '0') { // Decide this boundary next
        if (ch == '\0') return -1; // Decide this boundary last ('\0' least frequent)
        goto nonnumber;
    }
    result = ch - '0';
number: // STATE: In the middle of a number.
    ch = *string++;
    if (ch > '9') goto nonnumber; // Decide this boundary first (> '9' most frequent)
    if (ch < '0') { // Decide this boundary next
        if (ch == '\0') return result; // Decide this boundary last ('\0' least frequent)
        goto nonnumber;
    }
    result = 10*result + (ch - '0');
    goto number;
}
Assuming the text can be streamed in reverse order (a reasonable assumption, since strings in most languages are backed by an array of characters with O(1) access), construct the number by reading the text backwards until you hit a character that is not a digit or the text has been consumed entirely.
numDigits = 0
number = 0
// Indices are 0-based, so the last character is at length - 1.
while (numDigits <> length and characterAt[length - 1 - numDigits] is a digit)
    number = number + (parseCharacterAt[length - 1 - numDigits] * (10 ^ numDigits))
    numDigits = numDigits + 1
end while
if (numDigits is 0)
    Error("No digits at the end")
else return number
Note: (10 ^ numDigits) can be trivially optimized with another variable.
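As a rough illustration of the backward scan with the power of ten kept in a running variable, here is a minimal C sketch. The function name tail_number, the explicit length parameter, and the out-parameter are assumptions made for this example, not part of the answer above. It assumes ASCII digits and does not detect overflow (unsigned arithmetic simply wraps).

```c
#include <stddef.h>

/* Hypothetical helper: reads the trailing digits of s (length len) backwards.
   Stores the value in *out and returns 1 if s ends in at least one digit,
   otherwise returns 0 and leaves *out untouched.
   Assumption: the trailing number fits in unsigned long long. */
int tail_number(const char *s, size_t len, unsigned long long *out) {
    unsigned long long number = 0;
    unsigned long long place = 1;   /* running 10^numDigits */
    size_t i = len;

    while (i > 0 && s[i - 1] >= '0' && s[i - 1] <= '9') {
        number += (unsigned long long)(s[i - 1] - '0') * place;
        place *= 10;                /* replaces the repeated exponentiation */
        i--;
    }
    if (i == len)
        return 0;                   /* no digits at the end */
    *out = number;
    return 1;
}
```

Passing the length explicitly matters in C: if only a NUL-terminated pointer is available, finding the end already costs one forward pass, which is why the forward-scanning answers can be competitive there.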
Without knowing the language or context: if the number of digits or characters is fixed, a simple substring would do; otherwise use a regex matching consecutive digits at the end of the string (i.e. /\d+$/).
There is probably a faster algorithm if you drop down to the C++ level, but I favour expressiveness.
I would just call lastIndexOf('X') on your String object and proceed from there. No backward-looping mess.
Use a regular expression:
/^.*?(\d+)$/
and take the first capture group, (\d+). (With .+? instead of .*?, a string consisting only of digits would lose its first digit.)
Edit: if you can't use regular expressions, the fastest way will be something like this (0-based indices):
i = string.len
while i > 0:
    break if string[i - 1].isNotNum
    i--
end
out = substring(string, i, string.len)
Stating the problem:
We want to find the index of the last non-digit character.
Reflection:
This implies that we check that each character after this point is a digit, which means we will need to perform at least k comparisons, where k is the number of digits at the end of the string.
Implementation:
Linear backward search, possibly involving bitwise trickery to "vectorize" the operations (comparing multiple characters at once) or leveraging multithreading.
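One way the "comparing multiple characters at once" idea might look in C is sketched below. It uses a well-known SWAR (SIMD within a register) check that tests eight bytes for being ASCII digits in a single expression, then falls back to a byte-by-byte scan. The names all_digits8 and digit_tail_start are made up for this example; it assumes ASCII input and 8-bit bytes, and whether it actually beats the plain loop depends on how long the digit tails are.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if the 8 bytes at p are all ASCII digits.
   A byte is a digit iff its high nibble is 3 and the high nibble of
   (byte + 6) is still 3, i.e. its low nibble is at most 9. */
static int all_digits8(const char *p) {
    uint64_t v;
    memcpy(&v, p, 8);   /* safe unaligned load */
    return ((v & 0xF0F0F0F0F0F0F0F0ULL) |
            (((v + 0x0606060606060606ULL) & 0xF0F0F0F0F0F0F0F0ULL) >> 4))
           == 0x3333333333333333ULL;
}

/* Hypothetical helper: index of the first character of the trailing
   digit run in s (length len); returns len if s does not end in a digit. */
size_t digit_tail_start(const char *s, size_t len) {
    size_t i = len;
    /* Step back eight characters at a time while all eight are digits. */
    while (i >= 8 && all_digits8(s + i - 8))
        i -= 8;
    /* Finish the remaining (fewer than eight) digits one by one. */
    while (i > 0 && s[i - 1] >= '0' && s[i - 1] <= '9')
        i--;
    return i;
}
```

For short tails such as "756" the eight-byte check fails immediately and the work is done by the scalar loop, so the benefit, if any, only shows up on long runs of digits.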
Definitely don't use regex: it's way too slow. Just loop backwards until you find the first non-numeric character:
string s = "XXXXX XXXXXX35665";
int i = s.Length;
// Char.IsDigit is used rather than Char.IsNumber, which also accepts characters such as '½'.
while (--i >= 0 && Char.IsDigit(s[i])) ;
s = s.Substring(i + 1);
That should do the trick.
Extracting the tail number:
str = "XXXXXx....XXXX333";
parseInt(str.match(/\d*$/)[0], 10);
If a floating-point number is wanted:
parseFloat(str.match(/\d*\.?\d*$/)[0]);
Incrementing the tail number:
str.replace(/\d+$/, function (m) { return parseInt(m, 10) + 1; });