
C# Programming: How to not filter out spaces within directories using a regular expression?

I have a program that uses regular expressions to tokenize a log file line, splitting on both spaces (' ') and commas (",").

However, the directory path inside the log line also contains spaces, so it gets split as well. Could someone please offer some advice on a regular expression that I could use? Thanks!

*Please note that the spaces and commas are there because the date, time and contents have to be tokenized. Do not assume that I placed the spaces for fun and start giving negative votes, like someone did.

One such line of the log file would be:

Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat

The output of the program would be:

Thu
Mar
02
1995
21:31:00
2245107
m...
r/rrwxrwxrwx
0
0
8349-128-3
C:/Program
Files/AccessData/AccessData
Forensic
Toolkit/Program/wordnet/Adj.dat

Therefore the path "C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat" is separated, because the regular expression also splits on spaces.

The program code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;


namespace Testing
{
class Program
{
    static void Main(string[] args)
    {

        String value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
        //
        // Split the string on whitespace and commas.
        // ... The return value from Regex.Split is a string[] array.
        //

        //foreach (String r in lines)
        //{
            String rex = @"[\s,]";

            String[] token = Regex.Split(value, rex);

            foreach (String line in token)
            {
                Console.WriteLine(line);
            }
        //}
    }
}
}


Don't split on spaces; they are part of the values. Split on commas only:

string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
string[] token = value.Split(',');
foreach (String line in token) {
  Console.WriteLine(line);
}

If you want the components of the date as separate values, you can split that on spaces:

string[] dateComponents = token[0].Split(' ');
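
Putting the two steps together gives the token list the question asks for. This is only a minimal sketch: the class name LogLineTokenizer and the method Tokenize are made up for illustration, and it assumes the timestamp is always the first comma-separated field.

```csharp
using System;
using System.Collections.Generic;

static class LogLineTokenizer
{
    // Hypothetical helper: splits the line on commas, then splits only the
    // first field (assumed to be the timestamp) on spaces.
    public static List<string> Tokenize(string line)
    {
        string[] fields = line.Split(',');
        var tokens = new List<string>();

        // Date and time parts: "Thu", "Mar", "02", "1995", "21:31:00".
        tokens.AddRange(fields[0].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

        // Every remaining field is kept whole, so spaces in the path survive.
        for (int i = 1; i < fields.Length; i++)
        {
            tokens.Add(fields[i]);
        }
        return tokens;
    }
}

class TokenizerDemo
{
    static void Main()
    {
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
        foreach (string token in LogLineTokenizer.Tokenize(value))
        {
            Console.WriteLine(token);
        }
    }
}
```

For the sample line this prints 12 tokens, with the whole path as the last one.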


If you have to do it in a single regex, and if the only instance where you do want to split on spaces is in the first item (i.e. the date string), then you can do

splitArray = Regex.Split(subjectString, @",|(?<=^[^,]*)\s+");

This regex splits either on a comma or on whitespace, but on whitespace only if no comma occurs anywhere earlier in the string.

Explanation:

,       # match a comma
|       # or
(?<=    # assert that the following can be matched immediately before the current position:
 ^      # start of string
 [^,]*  # any number of characters except commas
)       # end of positive lookbehind assertion
\s+     # then match one or more whitespace characters

Be aware, though, that filenames might contain commas, too (they are legal there; whether they appear in your data is something only you can judge).
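
If commas in filenames turn out to be a concern, one possible workaround is to cap the number of fields so that everything after the last expected comma stays in the final field. The sketch below assumes every line has exactly 8 comma-separated fields with the path last, as in the sample line; the names LogLineFields, SplitFields and ExpectedFieldCount, and the path C:/Temp/a,b.dat, are made up for illustration.

```csharp
using System;

static class LogLineFields
{
    // Assumption for illustration: 8 comma-separated fields, path last.
    private const int ExpectedFieldCount = 8;

    // Splits on commas but returns at most ExpectedFieldCount fields;
    // the last field receives the rest of the line, commas included.
    public static string[] SplitFields(string line)
    {
        return line.Split(new[] { ',' }, ExpectedFieldCount);
    }
}

class FieldsDemo
{
    static void Main()
    {
        // Hypothetical line whose path contains a comma.
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Temp/a,b.dat";
        foreach (string field in LogLineFields.SplitFields(value))
        {
            Console.WriteLine(field);
        }
    }
}
```

The date field can then be split on spaces separately, as before. This only helps if the field count really is fixed; if it varies, the log format itself would need quoting or a different delimiter.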
