Splitting strings at specific positions
I have a small problem: I'm looking for a better way to split strings. For example, I receive a string that looks like this:
0000JHASDF+4429901234ALEXANDER
I know the pattern the string is built with, and I have an array of field lengths like this:
4,6,4,7,9
so the string should be split like this:
0000 - JHASDF - +442 - 9901234 - ALEXANDER
It is easy to split the whole thing up with the String Mid command, but it seems to be slow when I receive a file containing 8000 - 10000 datasets. Any suggestions on how I can make this faster and get the data into a List or an array of strings? I'd also be interested if anyone knows how to do this with a regex, for example.
var lengths = new[] { 4, 6, 4, 7, 9 };
var parts = new string[lengths.Length];
// if you're not using .NET4 or above then use ReadAllLines rather than ReadLines
foreach (string line in File.ReadLines("YourFile.txt"))
{
    int startPos = 0;
    for (int i = 0; i < lengths.Length; i++)
    {
        parts[i] = line.Substring(startPos, lengths[i]);
        startPos += lengths[i];
    }
    // do something with "parts" before moving on to the next line
}
Isn't Mid a VB method? In C# you can call Substring on your string (named input here):
string firstPart = input.Substring(0, 4);
string secondPart = input.Substring(4, 6);
string thirdPart = input.Substring(10, 4);
//...
Perhaps something like this:
string[] SplitString(string s, int[] parts)
{
    string[] result = new string[parts.Length];
    int start = 0;
    for (int i = 0; i < parts.Length; i++)
    {
        int len = parts[i];
        result[i] = s.Substring(start, len);
        start += len;
    }
    // every character must belong to exactly one part
    if (start != s.Length)
        throw new ArgumentException("String length doesn't match sum of part lengths");
    return result;
}
(I didn't compile it, so it probably contains some minor errors)
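As a quick check of the method above, here is a minimal, self-contained sketch that applies it to the sample string from the question. The class name FixedWidthDemo is made up for the example, and the call is spelled Substring because C# is case-sensitive.

```csharp
using System;

public static class FixedWidthDemo
{
    // Same logic as the SplitString method above: cut s into consecutive
    // pieces whose lengths are given by parts.
    public static string[] SplitString(string s, int[] parts)
    {
        var result = new string[parts.Length];
        int start = 0;
        for (int i = 0; i < parts.Length; i++)
        {
            result[i] = s.Substring(start, parts[i]);
            start += parts[i];
        }
        // Reject input that is longer than the declared layout.
        if (start != s.Length)
            throw new ArgumentException("String length doesn't match sum of part lengths");
        return result;
    }

    public static void Main()
    {
        var fields = SplitString("0000JHASDF+4429901234ALEXANDER", new[] { 4, 6, 4, 7, 9 });
        Console.WriteLine(string.Join(" - ", fields));
        // prints: 0000 - JHASDF - +442 - 9901234 - ALEXANDER
    }
}
```

Note that a line shorter than the sum of the lengths never reaches the final check: Substring itself throws an ArgumentOutOfRangeException first.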
As the Mid() function is VB, you could simply try
input.Substring(0, 4);
(where input is your string) and so on.
Regex.Split would be a possibility, but since the string has no specific delimiter, I doubt it will be of any use, and it is unlikely to be any faster.
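For comparison, a regex can still pick out fixed-width fields if it uses one capturing group per field instead of Regex.Split. The following is only a sketch under the assumption that the lengths are the ones from the question; the class and method names are invented for illustration, and nothing here suggests it would beat Substring on speed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexFixedWidth
{
    // Builds a pattern such as ^(.{4})(.{6})(.{4})(.{7})(.{9})$ from the lengths.
    public static Regex BuildPattern(int[] lengths)
    {
        string groups = string.Concat(lengths.Select(n => "(.{" + n + "})"));
        return new Regex("^" + groups + "$", RegexOptions.Compiled);
    }

    // Returns the captured fields, or null when the line does not fit the pattern.
    public static string[] Split(Regex pattern, string line)
    {
        Match m = pattern.Match(line);
        if (!m.Success) return null;
        // Group 0 is the whole match, so skip it and keep the field groups.
        return m.Groups.Cast<Group>().Skip(1).Select(g => g.Value).ToArray();
    }

    public static void Main()
    {
        Regex pattern = BuildPattern(new[] { 4, 6, 4, 7, 9 });
        string[] fields = Split(pattern, "0000JHASDF+4429901234ALEXANDER");
        Console.WriteLine(string.Join(" - ", fields));
        // prints: 0000 - JHASDF - +442 - 9901234 - ALEXANDER
    }
}
```

Building the Regex once and reusing it for every line avoids re-parsing the pattern for each of the 8000 - 10000 records.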
String.Substring is also a possibility. You use it like: var myFirstString = fullString.Substring(0, 4)
I know this is late, but in the Microsoft.VisualBasic.FileIO namespace you can find the TextFieldParser class, which would do a better job of handling your issue. The MSDN page at https://msdn.microsoft.com/en-us/library/zezabash.aspx has an explanation. The code is in VB, but you can easily convert it to C#. You will need to add a reference to the Microsoft.VisualBasic assembly (which contains that namespace) as well. Hope this helps anyone stumbling on this question in the future.
Here is what it would look like in VB for the questioner's issue:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\TestFolder\test.log")
    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
    Reader.SetFieldWidths(4, 6, 4, 7, 9)
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            currentRow = Reader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using
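A rough C# equivalent of the VB snippet above could look like the sketch below. It reads from a StringReader instead of a file so that it runs on its own, and it writes to the console instead of showing message boxes; the class name, method name and sample data are made up for the example. It assumes the project can see the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TextFieldParserDemo
{
    // Reads every line from source and splits it into fixed-width fields.
    public static List<string[]> ParseFixedWidth(TextReader source, params int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);
            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException ex)
                {
                    // Mirrors the VB sample: report the bad line and keep going.
                    Console.WriteLine("Line " + ex.Message + " is not valid and will be skipped.");
                }
            }
        }
        return rows;
    }

    public static void Main()
    {
        var data = new StringReader("0000JHASDF+4429901234ALEXANDER");
        foreach (string[] row in ParseFixedWidth(data, 4, 6, 4, 7, 9))
            Console.WriteLine(string.Join(" - ", row));
    }
}
```

To read the real file, pass a StreamReader (or use the TextFieldParser constructor that takes a path) in place of the StringReader.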