Parsing a comma-separated file using PowerShell

I have a text file containing several lines, each of which is a comma-separated string. The format of each line is:

<Name, Value, Bitness, OSType>

Bitness and OSType are optional.

For example the file can be like this:

Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
....
....

I want to parse each line into 4 variables and perform some operation on them. This is the PowerShell script I use:

Get-Content $inputFile | ForEach-Object {
    $Line = $_;

    $_var = "";
    $_val = "";
    $_bitness = "";
    $_ostype = "";

    $envVarArr = $Line.Split(",");
    For($i=0; $i -lt $envVarArr.Length; $i++) {
        Switch ($i) {
            0 {$_var = $envVarArr[$i].Trim();}
            1 {$_val = $envVarArr[$i].Trim();}
            2 {$_bitness = $envVarArr[$i].Trim();}
            3 {$_ostype = $envVarArr[$i].Trim();}
        }                                    
    }
    # Perform some operation using the 4 temporary variables
}

However, I would like to know whether it is possible to do this using a regex in PowerShell. Would you please provide sample code for doing that? Note that the 3rd and 4th values in each line may be empty or missing.


You can specify a column header row for the imported file with the -Header parameter of the Import-Csv cmdlet:

Import-Csv .\test.txt -Header Col1,Col2,Bitness,OSType
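To connect this back to the loop in the question, here is a minimal sketch of consuming the imported rows. The temp file name, the sample lines, and the header names Name and Value are assumptions made for illustration; the casts to [string] are there because a row with fewer columns than headers yields $null for the missing properties.

```powershell
# Hypothetical sample data written to a temp file so the sketch is self-contained
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'envvars-sample.txt'
Set-Content -Path $path -Value @(
    'Name1, Value1, X64, Windows7',
    'Name4, Value3, , Windows7',
    'Name4, Value3, X64'
)

$summary = Import-Csv -Path $path -Header Name, Value, Bitness, OSType | ForEach-Object {
    # Cast to string first: missing trailing columns come back as $null
    $name    = ([string]$_.Name).Trim()
    $value   = ([string]$_.Value).Trim()
    $bitness = ([string]$_.Bitness).Trim()
    $ostype  = ([string]$_.OSType).Trim()

    # Stand-in for "perform some operation": build a one-line summary
    "$name=$value [$bitness|$ostype]"
}

Remove-Item -Path $path
$summary
```

Each row arrives as an object with four named properties, so the index-based Switch from the question is no longer needed.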


Wouldn't it be better to use Import-Csv which does all this (and more reliably) for you?


As Tim suggests, you can use Import-Csv. The difference is that Import-Csv reads from a file, whereas ConvertFrom-Csv accepts CSV text that is already in memory, such as a here-string:

@"
Name1, Value1, X64, Windows7
Name2, Value2, X86, XP
Name3, Value3, X64, XP
Name4, Value3, , Windows7
Name4, Value3, X64 /*Note that no comma follows X64 */
"@ | ConvertFrom-Csv -header var, val, bitness, ostype

# Result

var   val    bitness                                 ostype  
---   ---    -------                                 ------  
Name1 Value1 X64                                     Windows7
Name2 Value2 X86                                     XP      
Name3 Value3 X64                                     XP      
Name4 Value3                                         Windows7
Name4 Value3 X64 /*Note that no comma follows X64 */         
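Since the question asks specifically about regex, here is a rough sketch of that approach using the -match operator with named capture groups. The pattern and the sample lines are my own assumptions, not something taken from the answers above; it requires the first two fields and treats the third and fourth as optional, and it does not handle quoted fields containing commas.

```powershell
# Hypothetical sample lines standing in for Get-Content $inputFile
$lines = @(
    'Name1, Value1, X64, Windows7',
    'Name4, Value3, , Windows7',
    'Name4, Value3, X64'
)

# Two required fields, then up to two optional ones.
# The lazy [^,]*? plus the surrounding \s* trims padding around each value.
$pattern = '^\s*(?<var>[^,]*?)\s*,\s*(?<val>[^,]*?)\s*' +
           '(?:,\s*(?<bitness>[^,]*?)\s*(?:,\s*(?<ostype>[^,]*?)\s*)?)?$'

$parsed = foreach ($line in $lines) {
    if ($line -match $pattern) {
        # Groups that did not participate are absent from $Matches;
        # casting $null to [string] turns them into empty strings.
        [pscustomobject]@{
            Var     = $Matches['var']
            Val     = $Matches['val']
            Bitness = [string]$Matches['bitness']
            OSType  = [string]$Matches['ostype']
        }
    }
}

$parsed | Format-Table -AutoSize
```

For real CSV data the cmdlets above remain the safer choice, because they already deal with quoting and embedded delimiters.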


Slower than molasses, but after spending 20 years cobbling together a dozen or more partial solutions, I decided to tackle it definitively. Of course, all sorts of parser libraries have become available in the meantime.


function SplitDelim($Line, $Delim=",", $Default=$Null, $Size=$Null) {

    # 4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # "4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a
    # ,4956968,"""Visible,"" 4D ""Human"" Torso Anatomy Kit (4-5/8)",FDV-26051,"" ,"",,,,,,a

    $Field = ""
    $Fields = @()
    $Quotes = 0
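    # States: INF (in field), INQ (in quoted field), NOF (no field).
    # Start in NOF so that a line may begin with a quoted field;
    # starting in INF would return $Null on a leading quote.
    $State = 'NOF'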
    $NextState = $Null

    for ($i=0; $i -lt $Line.length; $i++) {
        $Char = $Line.substring($i,1)

        if($State -eq 'NOF') {

            # NOF and Char is Quote
            # NextState becomes INQ
            if ($Char -eq '"') {
                $NextState = 'INQ'
            }

            # NOF and Char is Delim
            # NextState becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }

            # NOF and Char is not Delim, Quote or space
            # NextState becomes INF
            elseif ($Char -ne " ") {
                $NextState = 'INF'
            }

        } elseif ($State -eq 'INF') {

            # INF and Char is Quote
            # Error
            if ($Char -eq '"') {
                return $Null
            }

            # INF and Char is Delim
            # NextState Becomes NOF
            elseif ($Char -eq $Delim) {
                $NextState = 'NOF'
                $Char = $Null
            }

        } elseif ($State -eq 'INQ') {

            # INQ and Char is Delim and consecutive Quotes mod 2 is 0
            # NextState is NOF
            if ($Char -eq $Delim -and $Quotes % 2 -eq 0) {
                $NextState = 'NOF'
                $Char = $Null
            }
        }

        # Track consecutive quote for purposes of mod 2 logic
        if ($Char -eq '"') {
            $Quotes++
        } elseif ($NextState -eq 'INQ') {
            $Quotes = 0
        }

        # Normal duty
        if ($State -ne 'NOF' -or $NextState -ne 'NOF') {
            $Field += $Char
        }

        # Push to $Fields and clear
        if ($NextState -eq 'NOF') {
            $Fields += (IfBlank $Field $Default)
            $Field = ''
        }

        if ($NextState) {
            $State = $NextState
            $NextState = $Null
        }
    }

    $Fields += (IfNull $Field $Default)

    while ($Size -and $Fields.count -lt $Size) {
        $Fields += $Default
    }

    return $Fields
}
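
SplitDelim calls two helpers, IfBlank and IfNull, that are not shown in the post. The definitions below are only a guess at what they do, based on their names: return the default when the value is blank (or null), otherwise return the value unchanged. Treat them as hypothetical stand-ins.

```powershell
# Hypothetical helper: return $Default when $Value is null, empty, or only whitespace
function IfBlank($Value, $Default) {
    if ([string]::IsNullOrWhiteSpace($Value)) {
        return $Default
    }
    return $Value
}

# Hypothetical helper: return $Default only when $Value is $null
function IfNull($Value, $Default) {
    if ($null -eq $Value) {
        return $Default
    }
    return $Value
}

# Quick check of the assumed behaviour
$a = IfBlank '   ' 'n/a'   # blank input falls back to the default
$b = IfBlank 'X64' 'n/a'   # non-blank input passes through
$c = IfNull  $null 'n/a'   # null input falls back to the default
$d = IfNull  ''    'n/a'   # empty string is not null, so it passes through
```

With these two functions defined, a call such as SplitDelim 'Name4, Value3, X64' -Size 4 should pad the result to four elements using $Default, assuming the helpers behave as guessed here.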