Any way to improve this string slice method?
I wrote this string extension a while back, and I'm actually getting quite a bit of use out of it.
public static string Slice(this string str, int? start = null, int? end = null, int step = 1)
{
if (step == 0) throw new ArgumentException("Step cannot be zero.", "step");
if (start == null)
{
if (step > 0) start = 0;
else start = str.Length - 1;
}
else if (start < 0)
{
if (start < -str.Length) start = 0;
else start += str.Length;
}
else if (start > str.Length) start = str.Length;
if (end == null)
{
if (step > 0) end = str.Length;
else end = -1;
}
else if (end < 0)
{
if (end < -str.Length) end = 0;
else end += str.Length;
}
else if (end > str.Length) end = str.Length;
if (start == end || start < end && step < 0 || start > end && step > 0) return "";
if (start < end && step == 1) return str.Substring((int)start, (int)(end - start));
int length = (int)(((end - start) / (float)step) + 0.5f);
var sb = new StringBuilder(length);
for (int i = (int)start, j = 0; j < length; i += step, ++j)
sb.Append(str[i]);
return sb.ToString();
}
Since it's in all my projects now, I'm wondering if I could have done it better. Could it be more efficient, and are there cases where it would produce unexpected results?
Slice works like Python's slice notation, "string"[start:end:step]. Many other languages have something like this too.
string.Slice(1) is equivalent to string.Substring(1).
string.Slice(1,-1) trims off the first and last character.
string.Slice(null,null,-1) will reverse the string.
string.Slice(step:2) will return a string with every other character.
It is also similar to JS's slice, but with an extra argument.
Re-revised based on your suggestions:
public static string Slice(this string str, int? start = null, int? end = null, int step = 1)
{
if (step == 0) throw new ArgumentException("Step size cannot be zero.", "step");
if (start == null) start = step > 0 ? 0 : str.Length - 1;
else if (start < 0) start = start < -str.Length ? 0 : str.Length + start;
else if (start > str.Length) start = str.Length;
if (end == null) end = step > 0 ? str.Length : -1;
else if (end < 0) end = end < -str.Length ? 0 : str.Length + end;
else if (end > str.Length) end = str.Length;
if (start == end || start < end && step < 0 || start > end && step > 0) return "";
if (start < end && step == 1) return str.Substring(start.Value, end.Value - start.Value);
var sb = new StringBuilder((int)Math.Ceiling((end - start).Value / (float)step));
for (int i = start.Value; step > 0 && i < end || step < 0 && i > end; i += step)
sb.Append(str[i]);
return sb.ToString();
}
If you have plenty of test cases, detecting unexpected results shouldn't be a problem when you experiment with different implementations.
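As a rough illustration of what such test cases might look like, here is a minimal table-driven sketch. The class name SliceChecks and the method FindFailures are hypothetical, and the expected values are only the examples already mentioned in this thread (the Python results for "string" and "abcdefghijklmn"). The implementation under test is passed in as a delegate so that different versions can be compared.

```csharp
using System;
using System.Collections.Generic;

public static class SliceChecks
{
    // Hypothetical helper: runs a handful of Python-style expectations against
    // the supplied implementation and describes every mismatch it finds.
    public static List<string> FindFailures(Func<string, int?, int?, int, string> slice)
    {
        var cases = new[]
        {
            new { Input = "string", Start = (int?)1, End = (int?)null, Step = 1, Expected = "tring" },
            new { Input = "string", Start = (int?)1, End = (int?)-1, Step = 1, Expected = "trin" },
            new { Input = "string", Start = (int?)null, End = (int?)null, Step = -1, Expected = "gnirts" },
            new { Input = "string", Start = (int?)null, End = (int?)null, Step = 2, Expected = "srn" },
            new { Input = "abcdefghijklmn", Start = (int?)null, End = (int?)null, Step = 6, Expected = "agm" },
        };

        var failures = new List<string>();
        foreach (var c in cases)
        {
            string actual = slice(c.Input, c.Start, c.End, c.Step);
            if (actual != c.Expected)
                failures.Add($"\"{c.Input}\"[{c.Start}:{c.End}:{c.Step}] gave \"{actual}\", expected \"{c.Expected}\"");
        }
        return failures;
    }
}
```

With the extension from the question in scope, a call such as SliceChecks.FindFailures((s, a, b, step) => s.Slice(a, b, step)) should return an empty list when every case passes.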
From an API perspective I would consider optional arguments rather than nullable ints.
Update
After reading the code closely, I can see that passing null for "start" or "end" has a special meaning that depends on "step". They therefore cannot be represented as plain optional int parameters, although they can still be optional parameters.
After looking at the code more closely, it's a bit of a funky API, as the values of individual parameters have an effect on each other. My previous comment alludes to this. You really have to know the implementation to work this out, which is generally not a good property for an API. It may also make calling code harder to read.
I can see how "step" can be used to reverse a string, which is potentially useful. But wouldn't a Reverse extension method be better for this? Much more readable and less of a mental speedbump.
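A dedicated method along those lines might look like the following minimal sketch. The name Reversed is hypothetical (chosen here to avoid confusion with LINQ's Reverse, which returns an IEnumerable<char> rather than a string), and like Slice with a step of -1 it reverses UTF-16 code units, so it assumes the text contains no surrogate pairs or combining sequences.

```csharp
using System;

public static class StringReverseExtensions
{
    // Hypothetical helper, not part of the question's code.
    // Reverses the UTF-16 code units of the string; surrogate pairs and
    // combining sequences are not kept intact.
    public static string Reversed(this string str)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        char[] chars = str.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }
}
```

A call such as "string".Reversed() then reads more directly than "string".Slice(null, null, -1).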
I can see three things, all very minor:
Change the inner ifs into ternaries, like this:
if (start == null)
{
start = step > 0 ? 0 : str.Length - 1;
}
else if (start < 0)
{
start = start < -str.Length ? 0 : str.Length + start;
}
else if (start > str.Length)
start = str.Length;
Maybe change the (int) casts on int? values into .Value.
change
var sb = new StringBuilder(length);
into
StringBuilder sb = new StringBuilder(length);
And the big question is: if it does what it needs to, why fix it?
Update to show how to do it with LINQ, which is way slower (is there a way to speed it up?):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
Stopwatch sw;
string str;
sw = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
str = "Step cannot be zero.".Slice(null, null, -3, true);
sw.Stop();
Console.WriteLine("LINQ " + sw.Elapsed.TotalSeconds.ToString("0.#######") + " seconds");
sw = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
str = "Step cannot be zero.".Slice(null, null, -3, false);
sw.Stop();
Console.WriteLine("MANUAL " + sw.Elapsed.TotalSeconds.ToString("0.#######") + " seconds");
Console.ReadLine();
}
}
static class test
{
public static string Slice(this string str, int? start, int? end, int step, bool linq)
{
if (step == 0) throw new ArgumentException("Step cannot be zero.", "step");
if (linq)
{
if (start == null) start = 0;
else if (start > str.Length) start = str.Length;
if (end == null) end = str.Length;
else if (end > str.Length) end = str.Length;
if (step < 0)
{
str = new string(str.Reverse().ToArray());
step = Math.Abs(step);
}
}
else
{
if (start == null)
{
if (step > 0) start = 0;
else start = str.Length - 1;
}
else if (start < 0)
{
if (start < -str.Length) start = 0;
else start += str.Length;
}
else if (start > str.Length) start = str.Length;
if (end == null)
{
if (step > 0) end = str.Length;
else end = -1;
}
else if (end < 0)
{
if (end < -str.Length) end = 0;
else end += str.Length;
}
else if (end > str.Length) end = str.Length;
}
if (start == end || start < end && step < 0 || start > end && step > 0) return "";
if (start < end && step == 1) return str.Substring(start.Value, end.Value - start.Value);
if (linq)
{
return new string(str.Skip(start.Value).Take(end.Value - start.Value).Where((s, index) => index % step == 0).ToArray());
}
else
{
int length = (int)(((end.Value - start.Value) / (float)step) + 0.5f);
var sb = new StringBuilder(length);
for (int i = start.Value, j = 0; j < length; i += step, ++j)
sb.Append(str[i]);
return sb.ToString();
}
}
}
}
When I ask Python for "abcdefghijklmn"[::6] it returns 'agm', but when I ask your function for "abcdefghijklmn".Slice(step:6) it returns "ag".
I would recommend removing the incorrect length calculation and just performing your loop like this:
var sb = new StringBuilder((end - start).Value / step);
for (int i = start.Value; step > 0 && i < end || step < 0 && i > end; i += step)
sb.Append(str[i]);
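One more difference from Python seems worth noting. With a negative step, the question's code clamps a start beyond the end of the string to str.Length, so a call like "abc".Slice(10, null, -1) would index str[3] and throw, whereas Python returns 'cba'. Below is a minimal sketch of clamping that follows Python's rules more closely. The names PySlice and Clamp are hypothetical, and this is one possible approach under those assumptions rather than a drop-in replacement.

```csharp
using System;
using System.Text;

public static class SliceSketch
{
    // Hypothetical Python-style slice; names are illustrative only.
    public static string PySlice(this string str, int? start = null, int? end = null, int step = 1)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (step == 0) throw new ArgumentException("Step cannot be zero.", nameof(step));

        int len = str.Length;
        // For a negative step the valid bounds shift down by one, so that
        // -1 can mean "one before the first character".
        int lower = step > 0 ? 0 : -1;
        int upper = step > 0 ? len : len - 1;

        int from = start.HasValue ? Clamp(start.Value, len, lower, upper) : (step > 0 ? lower : upper);
        int to = end.HasValue ? Clamp(end.Value, len, lower, upper) : (step > 0 ? upper : lower);

        // Number of characters selected, using integer arithmetic only.
        int count = 0;
        if (step > 0 && from < to) count = (to - from - 1) / step + 1;
        else if (step < 0 && to < from) count = (from - to - 1) / -step + 1;

        if (count == 0) return "";
        if (step == 1) return str.Substring(from, count);

        var sb = new StringBuilder(count);
        for (int i = from, n = 0; n < count; i += step, ++n)
            sb.Append(str[i]);
        return sb.ToString();
    }

    // Negative indices count from the end; the result is limited to [lower, upper].
    private static int Clamp(int index, int len, int lower, int upper)
    {
        if (index < 0)
        {
            index += len;
            return index < lower ? lower : index;
        }
        return index > upper ? upper : index;
    }
}
```

Computing the count with integer division avoids the float rounding that produced "ag" instead of "agm", and it also gives the StringBuilder an exact capacity.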