string split by index / params?
Before I write my own function, I wanted to check whether the .NET library already has a function like string.split(string input, params int[] indexes).
This function should split the string at the indexes I pass to it.
Edit: I shouldn't have added the string.join sentence; it was confusing.
You could use the String instance method Substring.
string a = input.Substring(0, 10);
string b = input.Substring(10, 5);
string c = input.Substring(15, 3);
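As a quick check of how the (startIndex, length) arguments line up, here is a minimal sketch. The 18-character sample value of input is an assumption chosen so that the three calls consume it exactly; it does not come from the question.

```csharp
using System;

public static class SubstringDemo
{
    public static void Main()
    {
        // Hypothetical sample input: 10 digits followed by 8 letters.
        string input = "0123456789ABCDEFGH";

        string a = input.Substring(0, 10);  // start at 0, take 10 characters
        string b = input.Substring(10, 5);  // start at 10, take 5 characters
        string c = input.Substring(15, 3);  // start at 15, take 3 characters

        Console.WriteLine(a); // 0123456789
        Console.WriteLine(b); // ABCDE
        Console.WriteLine(c); // FGH
    }
}
```

Each call's start index is the previous start plus the previous length, which is the bookkeeping a general split-by-index helper has to automate.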
All other answers just seemed too complicated, so I took a stab.
using System.Linq;
public static class StringExtensions
{
/// <summary>
/// Returns a string array that contains the substrings in this instance that are delimited by specified indexes.
/// </summary>
/// <param name="source">The original string.</param>
/// <param name="index">An index that delimits the substrings in this string.</param>
/// <returns>An array whose elements contain the substrings in this instance that are delimited by one or more indexes.</returns>
/// <exception cref="ArgumentNullException"><paramref name="index" /> is null.</exception>
/// <exception cref="ArgumentOutOfRangeException">An <paramref name="index" /> is less than zero or greater than the length of this instance.</exception>
public static string[] SplitAt(this string source, params int[] index)
{
index = index.Distinct().OrderBy(x => x).ToArray();
string[] output = new string[index.Length + 1];
int pos = 0;
for (int i = 0; i < index.Length; pos = index[i++])
output[i] = source.Substring(pos, index[i] - pos);
output[index.Length] = source.Substring(pos);
return output;
}
}
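A short usage sketch of the method above. The method body is repeated under the hypothetical class name SplitAtDemo only so that the snippet compiles on its own, and the sample strings are made up for illustration.

```csharp
using System;
using System.Linq;

public static class SplitAtDemo
{
    // Same logic as SplitAt above, repeated so this snippet is self-contained.
    public static string[] SplitAt(this string source, params int[] index)
    {
        index = index.Distinct().OrderBy(x => x).ToArray();
        string[] output = new string[index.Length + 1];
        int pos = 0;
        for (int i = 0; i < index.Length; pos = index[i++])
            output[i] = source.Substring(pos, index[i] - pos);
        output[index.Length] = source.Substring(pos);
        return output;
    }

    public static void Main()
    {
        // Indexes may arrive unsorted or duplicated; both are normalized first.
        Console.WriteLine(string.Join("|", "abcdefgh".SplitAt(5, 2))); // ab|cde|fgh
        Console.WriteLine(string.Join("|", "abcdefgh".SplitAt(2, 2))); // ab|cdefgh

        // An index equal to the length yields a trailing empty string.
        Console.WriteLine(string.Join("|", "abcdefgh".SplitAt(8)));    // abcdefgh|
    }
}
```

Note that the result always has one more element than the number of distinct indexes, so callers that split at the very end of the string should expect an empty last element.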
The Split method divides a string based on a recognition pattern. It is perfect for breaking down comma-separated lists and the like.
But you are right, there are no built in string methods to achieve what you want.
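To make the distinction concrete, here is a small sketch contrasting the two kinds of split. The sample strings are assumptions for illustration only.

```csharp
using System;

public static class SplitByDelimiterDemo
{
    public static void Main()
    {
        // Split looks for a separator inside the content...
        string[] colors = "red,green,blue".Split(',');
        Console.WriteLine(colors.Length); // 3
        Console.WriteLine(colors[1]);     // green

        // ...whereas a positional split has no separator to look for,
        // so it has to be expressed with Substring and explicit offsets.
        string text = "redgreenblue";
        Console.WriteLine(text.Substring(3, 5)); // green
    }
}
```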
This doesn't directly answer your generalized question, but in what is most likely the common case (or at least the case for which I was searching for an answer when I came upon this SO question), where indexes is a single int, this extension method is a little cleaner than returning a string[] array, especially in C# 7.
For what it's worth, I benchmarked using string.Substring() against creating two char[] arrays, calling text.CopyTo() and returning two strings by calling new string(charArray). Using string.Substring() was roughly twice as fast.
C# 7 syntax
using System;
using System.Runtime.CompilerServices;
public static class StringExtensions
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static (string left, string right) SplitAt(this string text, int index) =>
(text.Substring(0, index), text.Substring(index));
}
public static class Program
{
public static void Main()
{
var (left, right) = "leftright".SplitAt(4);
Console.WriteLine(left);
Console.WriteLine(right);
}
}
C# 6 syntax
Note: Using Tuple<string, string> in versions prior to C# 7 doesn't save much in the way of verbosity, and it might actually be cleaner to just return a string[2] array.
using System;
using System.Runtime.CompilerServices;
public static class StringExtensions
{
// I'd use one or the other of these methods, and whichever one you choose,
// rename it to SplitAt()
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Tuple<string, string> TupleSplitAt(this string text, int index) =>
Tuple.Create<string, string>(text.Substring(0, index), text.Substring(index));
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static string[] ArraySplitAt(this string text, int index) =>
new string[] { text.Substring(0, index), text.Substring(index) };
}
public static class Program
{
public static void Main()
{
Tuple<string, string> stringsTuple = "leftright".TupleSplitAt(4);
Console.WriteLine("Tuple method");
Console.WriteLine(stringsTuple.Item1);
Console.WriteLine(stringsTuple.Item2);
Console.WriteLine();
Console.WriteLine("Array method");
string[] stringsArray = "leftright".ArraySplitAt(4);
Console.WriteLine(stringsArray[0]);
Console.WriteLine(stringsArray[1]);
}
}
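For reference, the char[] alternative mentioned in the benchmark note could look roughly like the sketch below. The class and method names (CopyToSplitExtensions, SplitAtCopyTo) are hypothetical, and this is a guess at the shape of the code that was measured, not the actual benchmarked code.

```csharp
using System;

public static class CopyToSplitExtensions
{
    // Hypothetical name; splits by copying characters into two buffers.
    public static (string left, string right) SplitAtCopyTo(this string text, int index)
    {
        // Allocate one buffer per half.
        char[] leftChars = new char[index];
        char[] rightChars = new char[text.Length - index];

        // string.CopyTo(sourceIndex, destination, destinationIndex, count)
        text.CopyTo(0, leftChars, 0, index);
        text.CopyTo(index, rightChars, 0, text.Length - index);

        // Each new string(char[]) copies its buffer again.
        return (new string(leftChars), new string(rightChars));
    }
}
```

This version allocates two arrays and then two strings, while the Substring version allocates only the two strings, which is at least consistent with Substring coming out ahead in the measurement described above.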
One possible solution:
public static class StringExtension
{
public static string[] Split(this string source, params int[] sizes)
{
var length = sizes.Sum();
if (length > source.Length) return null;
var resultSize = sizes.Length;
if (length < source.Length) resultSize++;
var result = new string[resultSize];
var start = 0;
for (var i = 0; i < resultSize; i++)
{
if (i + 1 == resultSize)
{
result[i] = source.Substring(start);
break;
}
result[i] = source.Substring(start, sizes[i]);
start += sizes[i];
}
return result;
}
}
public static IEnumerable<string> SplitAt(this string source, params int[] index)
{
var indices = new[] { 0 }.Union(index).Union(new[] { source.Length });
return indices
.Zip(indices.Skip(1), (a, b) => (a, b))
.Select(_ => source.Substring(_.a, _.b - _.a));
}
var s = "abcd";
s.SplitAt(); // "abcd"
s.SplitAt(0); // "abcd"
s.SplitAt(1); // "a", "bcd"
s.SplitAt(2); // "ab", "cd"
s.SplitAt(1, 2); // "a", "b", "cd"
s.SplitAt(3); // "abc", "d"
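One thing to be aware of: Union removes duplicates but does not sort, so indexes passed out of order (for example s.SplitAt(2, 1)) lead to a negative length being passed to Substring once the result is enumerated. A minimal sketch that sorts first, assuming that out-of-order input should be tolerated; SplitAtSorted and its class are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedSplitExtensions
{
    // Hypothetical name, chosen to avoid clashing with the SplitAt above.
    public static IEnumerable<string> SplitAtSorted(this string source, params int[] index)
    {
        // Add both ends, drop duplicates, sort, and materialize once so the
        // sequence is not rebuilt separately for Zip and Skip.
        int[] indices = new[] { 0 }
            .Concat(index)
            .Concat(new[] { source.Length })
            .Distinct()
            .OrderBy(i => i)
            .ToArray();

        // Pair each cut point with the next one and take the text in between.
        return indices.Zip(indices.Skip(1), (a, b) => source.Substring(a, b - a));
    }
}
```

With this variant, "abcd".SplitAtSorted(2, 1) yields "a", "b", "cd", the same pieces as passing (1, 2).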
There are always regular expressions.
Here's an example that one can expand upon:
string text = "0123456789ABCDEF";
Match m = new Regex("(.{7})(.{4})(.{5})").Match(text);
if (m.Success)
{
var result = new string[m.Groups.Count - 1];
for (var i = 1; i < m.Groups.Count; i++)
result[i - 1] = m.Groups[i].Value;
}
Here's a function that encapsulates the above logic:
public static string[] SplitAt(this string text, params int[] indexes)
{
var pattern = new StringBuilder();
var lastIndex = 0;
foreach (var index in indexes)
{
pattern.AppendFormat("(.{{{0}}})", index - lastIndex);
lastIndex = index;
}
pattern.Append("(.+)");
var match = new Regex(pattern.ToString()).Match(text);
if (! match.Success)
{
throw new ArgumentException("text cannot be split by given indexes");
}
var result = new string[match.Groups.Count - 1];
for (var i = 1; i < match.Groups.Count; i++)
result[i - 1] = match.Groups[i].Value;
return result;
}
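Two details of the function above are worth knowing. By default `.` does not match a newline, and the pattern is not anchored, so multi-line input can fail to match or match at an unexpected position. Also, `(.+)` needs at least one character after the last index, so an index equal to text.Length throws. The sketch below is one possible variation under those observations, not a replacement endorsed by the answer; the name SplitAtSingleline is hypothetical, and the indexes are assumed to be ascending.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexSplitExtensions
{
    // Hypothetical name; a variation on the regex-based SplitAt above.
    public static string[] SplitAtSingleline(this string text, params int[] indexes)
    {
        var pattern = new StringBuilder(@"\A"); // anchor at the start of the input
        var lastIndex = 0;
        foreach (var index in indexes)
        {
            // One group per segment, e.g. "(.{7})" for a 7-character segment.
            pattern.Append("(.{").Append(index - lastIndex).Append("})");
            lastIndex = index;
        }
        pattern.Append("(.*)"); // the remainder is allowed to be empty

        // Singleline makes '.' match '\n' as well.
        var match = Regex.Match(text, pattern.ToString(), RegexOptions.Singleline);
        if (!match.Success)
            throw new ArgumentException("text cannot be split by given indexes");

        var result = new string[match.Groups.Count - 1];
        for (var i = 1; i < match.Groups.Count; i++)
            result[i - 1] = match.Groups[i].Value;
        return result;
    }
}
```

For example, "ab\ncdefg".SplitAtSingleline(2, 5) gives "ab", "\ncd" and "efg", and "abcd".SplitAtSingleline(4) gives "abcd" followed by an empty string.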
This was written rather quickly, but I believe it illustrates my points and emphasizes them to the author of the comment, Michael.
A version that returns List<string>.
Caller:
string iTextLine = "02121AAAARobert Louis StevensonXXXX";
int[] tempListIndex = new int[] {
// 0 - // 0number (exclude first)
5, // 1user
9, // 2name
31 // role
};
// GET - words from indexes
List<string> tempWords = getListWordsFromLine(iTextLine, tempListIndex);
Method:
/// <summary>
/// GET - split line in parts using index cuts
/// </summary>
/// <param name="iListIndex">Input List of indexes</param>
/// <param name="iTextLine">Input line to split</param>
public static List<string> getListWordsFromLine(string iTextLine, int[] iListIndex)
{
// INIT
List<string> retObj = new List<string>();
int currStartPos = 0;
// GET - clear index list from dupl. and sort it
int[] tempListIndex = iListIndex.Distinct()
.OrderBy(o => o)
.ToArray();
// CTRL
if (tempListIndex.Length != iListIndex.Length)
{
// ERR
throw new Exception("Input iListIndex contains duplicate indexes");
}
for (int jj = 0; jj < tempListIndex.Length; ++jj)
{
try
{
// SET - line chunk
retObj.Add(iTextLine.Substring(currStartPos,
tempListIndex[jj] - currStartPos));
}
catch (Exception)
{
// SET - line is shorter than expected
retObj.Add(string.Empty);
}
// GET - update start position
currStartPos = tempListIndex[jj];
}
// SET
retObj.Add(iTextLine.Substring(currStartPos));
// RET
return retObj;
}
I wanted to use the Range class to implement a solution.
My use case was to convert standard property names (e.g. CustomerName, WindowSize, etc.) into JSON property names that would still be easy to read, as in customer_name, window_size.
Creating a JsonNamingPolicy descendant, I overrode the ConvertName method with the following implementation:
/// <summary>
/// Converts a property name like "CustomerName" to "customer_name"
/// </summary>
/// <param name="name">the property name</param>
/// <returns>property conversion</returns>
public override string ConvertName(string name) {
// using Regex to look for caps: "([A-Z]+)"
Match[] matches = regex.Matches(name)
.ToArray();
if (!matches.Any()) {
// no capitals to match
return name;
}
if (matches.Length == 1) {
// one match
return name.ToLower();
}
// multiple matches - we could use StringBuilder
string[] parts = new string[matches.Length];
int index = 0;
// this is somewhat verbose for debugging purposes
while (index < matches.Length) {
// get our match
Match m = matches[index];
// calculate the exclusive end index of the range
int length = index + 1 < matches.Length ?
// return the start of the next match
(matches[index + 1]).Index :
// return the end of the string
name.Length;
// create the range
Range range = (m.Index..length);
// insert the part
parts[index] = (name[range]).ToLower();
// increment the indexer
++index;
}
// construct property name
return string.Join("_", parts);
}
}
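The snippet above refers to a regex field that is not shown. Here is a compact, self-contained sketch of the same idea, assuming the pattern is the "([A-Z]+)" mentioned in the comment. SnakeCase and ToSnakeCase are hypothetical names, and the Range syntax requires C# 8 or later.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SnakeCase
{
    // Assumed pattern, taken from the comment in the snippet above.
    private static readonly Regex Caps = new Regex("([A-Z]+)");

    public static string ToSnakeCase(string name)
    {
        // Start index of every run of capital letters.
        int[] starts = Caps.Matches(name).Cast<Match>().Select(m => m.Index).ToArray();

        if (starts.Length == 0) return name;           // no capitals to match
        if (starts.Length == 1) return name.ToLower(); // one match

        var parts = new string[starts.Length];
        for (int i = 0; i < starts.Length; i++)
        {
            // Each part runs up to the next capital run, or to the end of the string.
            int end = i + 1 < starts.Length ? starts[i + 1] : name.Length;
            Range range = starts[i]..end;
            parts[i] = name[range].ToLower();
        }
        return string.Join("_", parts);
    }
}
```

Like the implementation above, this drops any text that precedes the first capital letter when there are several matches, which is fine for PascalCase property names but would need handling for camelCase input.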
Note: I could use StringBuilder, as some people will likely prefer. I don't anticipate performance problems, as this is a one-and-done scenario.
That being said, if I needed to serialize tons of data to go across the wire, I would likely forgo this process altogether and design my properties with the desired naming convention.
For completeness, here is the source class:
// trimmed to the necessary bits for brevity
public class LaunchParameters : ILoadable {
#region properties
[JsonIgnore]
string ILoadable.Directory { get; } = CONFIG_DIR;
[JsonIgnore]
string ILoadable.FileName { get; } = CONFIG_FILE;
public Size WindowSize { get; set; } = new(1024, 768);
public string Title { get; init; } = "GLX Game";
[JsonIgnore]
public string Application => Title.Replace(" ", "_");
public string Label { get; init; }
public Version Version { get; init; }
[JsonIgnore]
public string WindowTitle => $"{Title} Window";
public string LogPath { get; init; } = @".\.logs";
public string CrashLogPath { get; init; } = @".\.crash_logs";
#endregion properties
}
... and the resulting JSON:
{
"window_size": {
"is_empty": false,
"width": 1024,
"height": 768
},
"title": "GLX Game",
"label": null,
"version": null,
"log_path": ".\\.logs",
"crash_log_path": ".\\.crash_logs"
}