Parsing a comma-separated file using PowerShell
I have a text file which contains several lines, each of which is a comma separated string. The format of each line is:
<Name, Value, Bitness, OSType>
Bitness and OSType are optional.
For example the file can be like this:
Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
....
....
I want to parse each line into 4 variables and perform some operation on them. This is the PowerShell script I use:
Get-Content $inputFile | ForEach-Object {
    $Line = $_;
    $_var = "";
    $_val = "";
    $_bitness = "";
    $_ostype = "";
    # Split on commas; trailing fields may be missing, so only assign what exists
    $envVarArr = $Line.Split(",");
    For ($i = 0; $i -lt $envVarArr.Length; $i++) {
        Switch ($i) {
            0 { $_var = $envVarArr[$i].Trim(); }
            1 { $_val = $envVarArr[$i].Trim(); }
            2 { $_bitness = $envVarArr[$i].Trim(); }
            3 { $_ostype = $envVarArr[$i].Trim(); }
        }
    }
    # perform some operation using the 4 temporary variables
}
However, I wanted to know if it is possible to do this using regex in PowerShell. Would you please provide sample code for doing that? Note that the 3rd and 4th values in each line can be optionally empty.
You can specify an alternate column header row for the imported file with the -Header parameter of the Import-Csv cmdlet:
Import-Csv .\test.txt -Header Col1,Col2,Bitness,OSType
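A minimal sketch of how the imported rows might then be consumed. The temporary file path, the header names Name and Value, and the loop body are illustrative assumptions, not part of the question. Each row is an object whose properties are named after the -Header values; because a field missing from the end of a line may come back as $null, every value is cast to a string before trimming.

```powershell
# Write a few sample lines to a temporary file (illustrative path).
$inputFile = Join-Path ([System.IO.Path]::GetTempPath()) 'import-csv-sample.txt'
@(
    'Name1, Value1, X64, Windows7'
    'Name4, Value3, , Windows7'
    'Name5, Value5, X64'
) | Set-Content -Path $inputFile

$rows = Import-Csv -Path $inputFile -Header Name, Value, Bitness, OSType |
    ForEach-Object {
        # "$(...)" turns a possible $null into an empty string so Trim() is safe.
        [pscustomobject]@{
            Name    = "$($_.Name)".Trim()
            Value   = "$($_.Value)".Trim()
            Bitness = "$($_.Bitness)".Trim()
            OSType  = "$($_.OSType)".Trim()
        }
    }

$rows | Format-Table
Remove-Item -Path $inputFile
```

The per-row operation from the question would go inside the ForEach-Object block, using $_.Name, $_.Value and so on in place of the four temporary variables.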
Wouldn't it be better to use Import-Csv, which does all this (and more reliably) for you?
As Tim suggests, you can use Import-Csv. The difference is that Import-Csv reads from a file, whereas ConvertFrom-Csv parses strings that are already in memory:
@"
Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
"@ | ConvertFrom-Csv -header var, val, bitness, ostype
# Result
var   val    bitness                                 ostype
---   ---    -------                                 ------
Name1 Value1 X64                                     Windows7
Name2 Value2 X86                                     XP
Name3 Value3 X64                                     XP
Name4 Value3                                         Windows7
Name4 Value3 X64 /*Note that no comma follows X64 */
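Since the question asks specifically about regex, here is a minimal sketch of that approach using the -match operator with named capture groups. The pattern, the group names, and the sample lines are assumptions made for illustration. It treats the first two fields as mandatory and the last two as optional, and it does not handle quoted fields containing commas.

```powershell
$lines = @(
    'Name1, Value1, X64, Windows7'
    'Name4, Value3, , Windows7'
    'Name5, Value5, X64'
)

# Two mandatory fields followed by up to two optional ones.
# [^,]*? also matches an empty field; the surrounding \s* trims whitespace.
$pattern = '^\s*(?<var>[^,]*?)\s*,\s*(?<val>[^,]*?)\s*' +
           '(?:,\s*(?<bitness>[^,]*?)\s*(?:,\s*(?<ostype>[^,]*?)\s*)?)?$'

$parsed = foreach ($line in $lines) {
    if ($line -match $pattern) {
        # Groups that did not take part in the match are absent from $Matches,
        # so the string cast turns the resulting $null into ''.
        [pscustomobject]@{
            Var     = $Matches['var']
            Val     = $Matches['val']
            Bitness = "$($Matches['bitness'])"
            OSType  = "$($Matches['ostype'])"
        }
    } else {
        Write-Warning "Skipping malformed line: $line"
    }
}

$parsed | Format-Table
```

For anything beyond simple unquoted fields, the CSV cmdlets are likely the more robust choice, as the other answers point out.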
Slower than molasses, but after spending 20 years cobbling together a dozen or more partial solutions, I decided to tackle it definitively. Of course, over the course of time all sorts of parser libraries have become available.
function SplitDelim($Line, $Delim=",", $Default=$Null, $Size=$Null) {
    # 4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # "4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # ,4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    $Field = ""
    $Fields = @()
    $Quotes = 0
    $State = 'INF' # INFIELD, INQFIELD, NOFIELD
    $NextState = $Null
    for ($i=0; $i -lt $Line.length; $i++) {
        $Char = $Line.substring($i,1)
        if ($State -eq 'NOF') {
            # NOF and Char is Quote
            # NextState becomes INQ
            if ($Char -eq '"') {
                $NextState = 'INQ'
            }
            # NOF and Char is Delim
            # NextState becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }
            # NOF and Char is not Delim, Quote or space
            # NextState becomes INF
            elseif ($Char -ne " ") {
                $NextState = 'INF'
            }
        } elseif ($State -eq 'INF') {
            # INF and Char is Quote
            # Error
            if ($Char -eq '"') {
                return $Null
            }
            # INF and Char is Delim
            # NextState Becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }
        } elseif ($State -eq 'INQ') {
            # INQ and Char is Delim and consecutive Quotes mod 2 is 0
            # NextState is NOF
            if ($Char -eq $Delim -and $Quotes % 2 -eq 0) {
                $NextState = 'NOF'
                $Char = $Null
            }
        }
        # Track consecutive quote for purposes of mod 2 logic
        if ($Char -eq '"') {
            $Quotes++
        } elseif ($NextState -eq 'INQ') {
            $Quotes = 0
        }
        # Normal duty
        if ($State -ne 'NOF' -or $NextState -ne 'NOF') {
            $Field += $Char
        }
        # Push to $Fields and clear
        if ($NextState -eq 'NOF') {
            $Fields += (IfBlank $Field $Default)
            $Field = ''
        }
        if ($NextState) {
            $State = $NextState
            $NextState = $Null
        }
    }
    $Fields += (IfNull $Field $Default)
    while ($Size -and $Fields.count -lt $Size) {
        $Fields += $Default
    }
    return $Fields
}
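SplitDelim calls two helpers, IfBlank and IfNull, that are not shown in the answer. The definitions below are only a guess at their intent, based on the names and on how they are called (a value followed by a default). Treat them as hypothetical stand-ins rather than the author's originals.

```powershell
# Hypothetical helper: return $Default when $Value is $null, empty,
# or consists only of whitespace; otherwise return $Value unchanged.
function IfBlank($Value, $Default = $Null) {
    if ([string]::IsNullOrWhiteSpace($Value)) { return $Default }
    return $Value
}

# Hypothetical helper: return $Default only when $Value is $null;
# an empty string is passed through as-is.
function IfNull($Value, $Default = $Null) {
    if ($Null -eq $Value) { return $Default }
    return $Value
}
```

With these two functions defined in the same session, the calls inside SplitDelim resolve. Whether the resulting fields match what the author intended depends on how close these guesses are to the real helpers.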