C# Programming: How to not filter out spaces within directory paths using a regular expression?
I have a program that uses both tokenizing and regular expressions to split a log file line on spaces (' ') and commas (",").
However, the directory path inside the log line also contains spaces, and those get split too. Could someone please offer some advice on a regular expression I could use? Thanks!
*Please note that the SPACES and COMMAS are there because of the date, time and other contents that have to be tokenized. Do not assume that I placed the spaces for fun and start giving negative votes, like someone did.
One such string line of the log text file would be:
Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat
The output of the program would be:
Thu
Mar
02
1995
21:31:00
2245107
m...
r/rrwxrwxrwx
0
0
8349-128-3
C:/Program
Files/AccessData/AccessData
Forensic
Toolkit/Program/wordnet/Adj.dat
So the path "C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat" is separated into four pieces because the regular expression also splits on its spaces.
The program code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;

namespace Testing
{
    class Program
    {
        static void Main(string[] args)
        {
            String value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
            //
            // Split the string on line breaks.
            // ... The return value from Split is a string[] array.
            //
            //foreach (String r in lines)
            //{
            String rex = @"[\s,]";
            String[] token = Regex.Split(value, rex);
            foreach (String line in token)
            {
                Console.WriteLine(line);
            }
            //}
        }
    }
}
Don't split on spaces; they are part of the values. Split on commas only:
string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
string[] token = value.Split(',');
foreach (String line in token) {
    Console.WriteLine(line);
}
If you want the components of the date as separate values, you can then split the first token on spaces:
string[] dateComponents = token[0].Split(' ');
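The two steps above can be put together into one small program. This is only a sketch: it assumes the date is always the first comma-separated field, and the names `SplitLogLine` and `Tokenize` are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public class SplitLogLine
{
    // Splits the line on commas, then splits only the first field
    // (assumed to be the date) on spaces. All other fields keep their spaces.
    public static List<string> Tokenize(string line)
    {
        string[] fields = line.Split(',');
        var tokens = new List<string>(fields[0].Split(' '));
        for (int i = 1; i < fields.Length; i++)
        {
            tokens.Add(fields[i]);
        }
        return tokens;
    }

    public static void Main()
    {
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
        foreach (string token in Tokenize(value))
        {
            Console.WriteLine(token);
        }
    }
}
```

For the sample line this prints twelve tokens, and the last one is the complete path with its spaces intact.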
If you have to do it in a single regex, and if the only place where you do want to split on spaces is the first item (i.e. the date string), then you can do
string[] splitArray = Regex.Split(subjectString, @",|(?<=^[^,]*)\s+");
This regex splits either on a comma or on whitespace, but on whitespace only if no comma occurs anywhere before it in the string. It relies on an unbounded lookbehind, which the .NET regex engine supports.
Explanation:
,          # match a ,
|          # or
(?<=       # assert that it is possible to match the following before the current position:
  ^        # start of string
  [^,]*    # any number of characters except commas
)          # end of positive lookbehind assertion
\s+        # then match one or more whitespace characters
Be aware, though, that file names might contain commas, too (they are legal there; whether they appear in your data is something only you can judge).
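If commas in file names turn out to be a concern, one possible workaround (not part of the regex approach above) is to cap the number of fields when splitting, so that everything after the last expected comma stays in one piece. The sketch below assumes that every log line has exactly eight comma-separated fields and that the path is always the last one; `CommaSafeSplit`, `FieldCount`, `SplitFields` and the sample file name `a,b.dat` are made up for this example.

```csharp
using System;

public class CommaSafeSplit
{
    // Assumption: each log line has exactly 8 comma-separated fields,
    // and the file path is the last one.
    public const int FieldCount = 8;

    public static string[] SplitFields(string line)
    {
        // With a count limit, Split returns at most FieldCount substrings;
        // the last one holds the rest of the line, including any commas.
        return line.Split(new[] { ',' }, FieldCount);
    }

    public static void Main()
    {
        // Hypothetical line whose file name contains a comma.
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/a,b.dat";
        string[] fields = SplitFields(value);
        string[] dateParts = fields[0].Split(' ');

        Console.WriteLine(dateParts.Length);           // prints 5
        Console.WriteLine(fields[fields.Length - 1]);  // prints C:/Program Files/a,b.dat
    }
}
```

This only helps if the field count really is fixed. If other fields can also contain commas, the log format itself is ambiguous and no split rule can recover the original values reliably.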