Splitting strings at specific positions

I have a small problem: I'm looking for a better way to split strings. For example, I receive a string that looks like this:

0000JHASDF+4429901234ALEXANDER

I know the pattern the string is built with, and I have an array of field lengths like this:

4,6,4,7,9
0000 - JHASDF - +442 - 9901234 - ALEXANDER

It is easy to split the whole thing up with the String MID command, but it seems to be slow when I receive a file containing 8000 - 10000 datasets. Any suggestions on how I can make this faster and get the data into a List or an array of strings? I'd also be interested if anyone knows how to do this with RegEx, for example.


var lengths = new[] { 4, 6, 4, 7, 9 };
var parts = new string[lengths.Length];

// if you're not using .NET4 or above then use ReadAllLines rather than ReadLines
foreach (string line in File.ReadLines("YourFile.txt"))
{
    int startPos = 0;
    for (int i = 0; i < lengths.Length; i++)
    {
        parts[i] = line.Substring(startPos, lengths[i]);
        startPos += lengths[i];
    }

    // do something with "parts" before moving on to the next line
}


Isn't Mid a VB method?

// "input" is the full string you received
string firstPart = input.Substring(0, 4);
string secondPart = input.Substring(4, 6);
string thirdPart = input.Substring(10, 4);
//...


Perhaps something like this:

string[] SplitString(string s,int[] parts)
{
  string[] result=new string[parts.Length];
  int start=0;
  for(int i=0;i<parts.Length;i++)
  {
    int len=parts[i];
    result[i]=s.Substring(start, len);
    start += len;
  }
  if(start!=s.Length)
    throw new ArgumentException("String length doesn't match sum of part lengths");
  return result;
}

(I didn't compile it, so it probably contains some minor errors)


As the Mid() function is VB, you could simply try

input.Substring(0, 4);

and so on.


The Regex.Split method would be a possibility, but since there is no specific delimiter in the string, I doubt it will be of much use, and it is unlikely to be any faster.

String.Substring is also a possibility. You use it like: var myFirstString = fullString.Substring(0, 4)
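
If you still want to try the regex route, one option is to build a pattern of fixed-width capture groups from the lengths array instead of using Regex.Split. This is only a sketch: FixedWidthRegex, Build and Split are names I made up for the example, and I haven't benchmarked it, so I would not assume it beats a plain Substring loop.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class FixedWidthRegex
{
    // Builds a pattern such as ^(.{4})(.{6})(.{4})(.{7})(.{9})$ from the field widths.
    // Build it once and reuse it for every line of the file.
    public static Regex Build(int[] lengths)
    {
        string groups = string.Concat(lengths.Select(n => "(.{" + n + "})"));
        return new Regex("^" + groups + "$", RegexOptions.Compiled);
    }

    // Returns one string per field, or throws if the line does not fit the widths.
    public static string[] Split(Regex regex, string line)
    {
        Match m = regex.Match(line);
        if (!m.Success)
            throw new ArgumentException("Line does not match the expected field widths.");

        // Group 0 is the whole match, so the fields start at group 1.
        var parts = new string[m.Groups.Count - 1];
        for (int i = 1; i < m.Groups.Count; i++)
            parts[i - 1] = m.Groups[i].Value;
        return parts;
    }
}
```

Usage would be something like: var regex = FixedWidthRegex.Build(new[] { 4, 6, 4, 7, 9 }); and then FixedWidthRegex.Split(regex, line) for each line.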


I know this is late, but in the Microsoft.VisualBasic.FileIO namespace you can find the TextFieldParser class, which is built for fixed-width data like this. Here is a link to MSDN - https://msdn.microsoft.com/en-us/library/zezabash.aspx with an explanation. The code is in VB, but you can easily convert it to C#. You will need to add a reference to the Microsoft.VisualBasic assembly as well. Hope this helps anyone stumbling on this question in the future.

Here is what it would look like in VB for the questioner's issue:

Using Reader As New Microsoft.VisualBasic.FileIO.
   TextFieldParser("C:\TestFolder\test.log")

   Reader.TextFieldType =
      Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
   Reader.SetFieldWidths(4, 6, 4, 7, 9)
   Dim currentRow As String()
   While Not Reader.EndOfData
      Try
         currentRow = Reader.ReadFields()
         Dim currentField As String 
         For Each currentField In currentRow
            MsgBox(currentField)
         Next 
      Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
         MsgBox("Line " & ex.Message &
         " is not valid and will be skipped.")
      End Try 
   End While 
End Using  
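
A rough C# equivalent of the VB snippet above might look like the following. Treat it as a sketch: FixedWidthFile and ReadRows are names invented for this example, it collects the rows into a list instead of showing message boxes, and it assumes the project references the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.VisualBasic.FileIO;

static class FixedWidthFile
{
    // Reads every line of the file and splits it into fields of the given widths.
    public static List<string[]> ReadRows(string path, params int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(path))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);

            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException ex)
                {
                    // Report the bad line and carry on with the next one.
                    Console.Error.WriteLine(
                        "Line " + ex.LineNumber + " is not valid and will be skipped.");
                }
            }
        }
        return rows;
    }
}
```

For the data in the question, the call would be FixedWidthFile.ReadRows("YourFile.txt", 4, 6, 4, 7, 9).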