Help me parse a text file and extract specific values
I have a file called lijst.txt. The file is the output of a print-message event log. All the lines have the same format.
I want to extract the username from each line, which is between the words "owned by" and "was". I also want to extract the page count, which is between the words "pages printed:" and ".". I would like to put these values in a new text file.
Regards,
Dennis (new in F#)
I would recommend using a regular expression for this, something like:
open System.Text.RegularExpressions

// Verbatim string (@"...") so that \s reaches the regex engine unchanged.
// The lazy .+? stops at the first "was" that follows the username.
let usernameRegex = Regex(@"owned by\s+(?<username>.+?)\s+was")

/// Tries to extract the username from a given line of text. Returns None if the line is malformed.
// Note: You could also use failwith in the else branch or throw an exception or ...
let extractUsername (line: string) =
    let regexMatch = usernameRegex.Match(line)
    if regexMatch.Success then Some regexMatch.Groups.["username"].Value else None

// In reality you would load this from the file using File.ReadAllLines
let sampleLines =
    [ "Some text some text owned by DESIRED USERNAME was some text some text"
      "Some text line not containing the pattern"
      "Another line owned by ANOTHER USER was" ]

// Seq.choose keeps the Some values and drops the None ones
let extractUsernames lines =
    lines |> Seq.choose extractUsername

// You can now save the usernames to a file using
// File.WriteAllLines("FileName", extractUsernames sampleLines)
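The question also asks for the page count, so here is a rough sketch that pulls both values out of a line with one regular expression. The exact line format is not shown in the question, so the sample lines, the names lineRegex, tryParseLine and demoLines, and the output file name are all made up for illustration; adjust the pattern to the real log text.

```fsharp
open System.Text.RegularExpressions

// Assumed line shape: "... owned by <user> was ... pages printed: <n>."
// IgnoreCase is used because the exact casing in the real log is not known here.
let lineRegex =
    Regex(@"owned by\s+(?<username>.+?)\s+was\b.*?pages printed:\s*(?<pages>\d+)",
          RegexOptions.IgnoreCase)

/// Returns Some (username, pageCount) for a matching line, None otherwise.
let tryParseLine (line: string) =
    let m = lineRegex.Match(line)
    if m.Success then
        Some (m.Groups.["username"].Value, int m.Groups.["pages"].Value)
    else
        None

// Hypothetical sample lines, invented for illustration only.
let demoLines =
    [ "Document 1, report.doc owned by dennis was printed on HP01. Size in bytes: 2048; pages printed: 3."
      "A line that does not match" ]

// One tab-separated "username<TAB>pages" entry per matching line.
let outputLines =
    demoLines
    |> Seq.choose tryParseLine
    |> Seq.map (fun (user, pages) -> sprintf "%s\t%d" user pages)
    |> Seq.toList

// To write the result to a file (file name is an assumption):
// System.IO.File.WriteAllLines("output.txt", outputLines)
```

Lines that do not match are simply skipped here; if a malformed line should be an error instead, replace the None branch with failwith.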
You can do something like:
open System
open System.IO

// Split on the whole marker strings; a.ToCharArray() would split on every single character of the marker.
let getBetween (a: string) (b: string) (str: string) =
    str.Split([| a |], StringSplitOptions.None).[1].Split([| b |], StringSplitOptions.None).[0].Trim()

let total (a: string seq) =
    (a |> Seq.map Int32.Parse |> Seq.sum).ToString()

File.ReadAllLines("inFile")
|> Seq.map (fun l -> (getBetween "owned by" "was" l, getBetween "Pages printed:" "." l))
|> Seq.groupBy fst
|> Seq.map (fun (user, counts) -> user + "\t" + (counts |> Seq.map snd |> total))
|> (fun s -> File.WriteAllLines("outFile", s))
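Note that indexing into the result of Split throws if a line does not contain the markers. If some lines might be malformed, a variant based on IndexOf that returns an option may be safer. This is only a sketch: tryGetBetween, tryInt and pagesPerUser are names I made up, and the marker text (including the casing of "Pages printed:") is taken from the snippet above and may need adjusting.

```fsharp
open System

/// Returns the trimmed text between the first startMarker and the next endMarker,
/// or None when either marker is missing.
let tryGetBetween (startMarker: string) (endMarker: string) (text: string) =
    let startIdx = text.IndexOf(startMarker, StringComparison.Ordinal)
    if startIdx < 0 then None
    else
        let from = startIdx + startMarker.Length
        let endIdx = text.IndexOf(endMarker, from, StringComparison.Ordinal)
        if endIdx < 0 then None
        else Some (text.Substring(from, endIdx - from).Trim())

/// Parses an integer, returning None instead of throwing on bad input.
let tryInt (s: string) =
    match Int32.TryParse(s) with
    | true, n -> Some n
    | _ -> None

/// Sums the page counts per user, skipping lines where a field is missing or not a number.
let pagesPerUser (lines: seq<string>) =
    lines
    |> Seq.choose (fun line ->
        let user = tryGetBetween "owned by" "was" line
        let pages = tryGetBetween "Pages printed:" "." line |> Option.bind tryInt
        match user, pages with
        | Some u, Some n -> Some (u, n)
        | _ -> None)
    |> Seq.groupBy fst
    |> Seq.map (fun (user, entries) -> sprintf "%s\t%d" user (Seq.sumBy snd entries))
    |> Seq.toList
```

It can be used the same way as above, for example File.WriteAllLines("outFile", pagesPerUser (File.ReadAllLines "inFile")). One caveat that applies to both versions: a username that itself contains "was" would be cut short by the plain string search.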