How do I get rid of this "(" using regex?

I was moving along on a regex and have hit a roadblock I can't seem to get around. I am trying to get rid of a "(" in the middle of a line of text using a regex. There were two of them, and I figured out how to drop the one at the end of the line. It's the one in the middle I can't hack out.

Here is a more complete snippet of the file I am searching through.

ide1:0.present = "TRUE"
ide1:0.clientDevice = "TRUE"
ide1:0.deviceType = "cdrom-raw"
ide1:0.startConnected = "FALSE"
floppy0.startConnected = "FALSE"
floppy0.clientDevice = "TRUE"
ethernet0.present = "TRUE"
ethernet0.virtualDev = "e1000"
ethernet0.networkName = "solignis.local"
ethernet0.addressType = "generated"
guestOSAltName = "Ubuntu Linux (64-bit)"
guestOS = "ubuntulinux"
uuid.location = "56 4d e8 67 57 18 67 04-c8 68 14 eb b3 c7 be bf"
uuid.bios = "56 4d e8 67 57 18 67 04-c8 68 14 eb b3 c7 be bf"
vc.uuid = "52 c7 14 5c a0 eb f4 cc-b3 69 e1 6d ad d8 1a e7"

Here is the entire foreach loop I am working on.

my @virtual_machines;
foreach my $vm (keys %virtual_machines) {
    push @virtual_machines, $vm;
}
foreach my $vm (@virtual_machines) {
    my $vmx_file = $ssh1->capture("cat $virtual_machines{$vm}{VMX}");

    if ($vmx_file =~ m/^\bguestOSAltName\b\s+\S\s+\W(?<GUEST_OS> .+[^")])\W/xm) {
        $virtual_machines{$vm}{"OS"} = "$+{GUEST_OS}";
    } else {
        $virtual_machines{$vm}{"OS"} = "N/A";
    }
    if ($vmx_file =~ m/^\bguestOSAltName\b\s\S\s.+(?<ARCH> \d{2}\W\bbit\b)/xm) {
        $virtual_machines{$vm}{"Architecture"} = "$+{ARCH}";
    } else {
        $virtual_machines{$vm}{"Architecture"} = "N/A";
    }
}

I think the problem is that I cannot match the "(" because the expression before it is ".+", which matches everything in the line of text, whether alphanumeric, whitespace, or symbols like hyphens.

Any ideas how I can get this to work?

This is what I am getting for an output from a hash dump.

$VAR1 = {
      'NS02' => {
                  'ID' => '144',
                  'Version' => '7',
                  'OS' => 'Ubuntu Linux (64-bit',
                  'VMX' => '/vmfs/volumes/datastore2/NS02/NS02.vmx',
                  'Architecture' => '64-bit'
                },

The part of the code block that handles ARCH works flawlessly, so what I really need is to hack off the "(64-bit)" part, if it exists, when the search runs into the "(", and to remove the whitespace preceding the "(" as well.

What I am wanting is to turn the above hash dump into this.

$VAR1 = {
      'NS02' => {
                  'ID' => '144',
                  'Version' => '7',
                  'OS' => 'Ubuntu Linux',
                  'VMX' => '/vmfs/volumes/datastore2/NS02/NS02.vmx',
                  'Architecture' => '64-bit'
                },

Same thing minus the (64-bit) part.


You can simplify your regex to /^guestOSAltName\s+=\s+"(?<GUEST_OS>.+)"/m. What this does:

  • ^ forces the match to start at the beginning of a line.
  • guestOSAltName is a string literal.
  • \s+ matches 1 or more whitespace characters; the = between the two \s+ is a literal equals sign.
  • (?<GUEST_OS>.+) captures the text between the quotes and names the group GUEST_OS. Because .+ is greedy, the capture runs up to the last " on the line, so a trailing "(64-bit)" stays in it. If the line could have comments, you might want to change .+ to [^#]+.
  • The "s around the group are literal quotes.
  • The m at the end turns on multi-line matching, so ^ matches at the start of every line rather than only at the start of the string.

Code:

if ($vmx_file =~ /^guestOSAltName\s+=\s+"(?<GUEST_OS>.+)"/m) {
    print "$+{GUEST_OS}";
} else {
    print "N/A";
}

See it here: http://ideone.com/1xH5J
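
For the sample line, the pattern above captures "Ubuntu Linux (64-bit)" in full. A minimal sketch of one way to drop the suffix afterwards, assuming the parenthesised part only ever appears at the end of the value. The helper name guest_os and the follow-up substitution are illustrative additions, not part of the answer above:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the OS name without a trailing "(...)" suffix,
# or "N/A" when no guestOSAltName line is found.
sub guest_os {
    my ($vmx_file) = @_;
    if ($vmx_file =~ /^guestOSAltName\s+=\s+"(?<GUEST_OS>.+)"/m) {
        my $os = $+{GUEST_OS};          # copy before running another regex
        # Strip optional whitespace plus a parenthesised group at the end.
        $os =~ s/\s*\([^)]*\)\s*$//;
        return $os;
    }
    return "N/A";
}

my $sample = join "\n",
    'ethernet0.addressType = "generated"',
    'guestOSAltName = "Ubuntu Linux (64-bit)"',
    'guestOS = "ubuntulinux"';

print guest_os($sample), "\n";          # prints "Ubuntu Linux"
```

Values without a parenthesised suffix pass through the substitution unchanged.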


So you want to match the contents of the string after guestOSAltName up to (and not including) the first ( if present?

Then replace the first if condition in your loop with

if ($vmx_file =~ m/^guestOSAltName\s+=\s+"(?<GUEST_OS>[^"()]+)/xm) {

If there is always a whitespace character before a potential opening parenthesis, then you can use

if ($vmx_file =~ m/^guestOSAltName\s+=\s+"(?<GUEST_OS>[^"()]+)[ "]/xm) {

so you don't need to strip trailing whitespace if present.
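
To see the difference between the two patterns, here is a small side-by-side sketch. The helper capture_os and the two sample lines are assumptions made for illustration; the patterns themselves are the ones given above:

```perl
use strict;
use warnings;

# The two patterns from this answer, precompiled for a side-by-side check.
my $plain   = qr/^guestOSAltName\s+=\s+"(?<GUEST_OS>[^"()]+)/xm;
my $trimmed = qr/^guestOSAltName\s+=\s+"(?<GUEST_OS>[^"()]+)[ "]/xm;

# Hypothetical helper: apply a pattern and return the capture, or "N/A".
sub capture_os {
    my ($re, $line) = @_;
    return $line =~ $re ? $+{GUEST_OS} : "N/A";
}

for my $line ('guestOSAltName = "Ubuntu Linux (64-bit)"',
              'guestOSAltName = "Microsoft Windows Server 2003, Standard Edition"') {
    # Brackets make any trailing whitespace in the capture visible.
    printf "[%s] [%s]\n", capture_os($plain, $line), capture_os($trimmed, $line);
}
```

For the Ubuntu line the first pattern yields "Ubuntu Linux " with a trailing space, while the second backtracks one character and yields "Ubuntu Linux". For the Windows line both yield the full name.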


Something like this should work:

$match =~ s/^(.*?)\((.*?)$/$1$2/;
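
Note that this substitution removes only the first "(" character and keeps whatever follows it. A short sketch comparing it with a variant that drops everything from the parenthesis onward; the variant and both helper names are assumptions added for illustration:

```perl
use strict;
use warnings;

# The substitution from this answer: removes only the first "(".
sub drop_first_paren {
    my ($match) = @_;
    $match =~ s/^(.*?)\((.*?)$/$1$2/;
    return $match;
}

# Hypothetical variant: removes optional whitespace, the first "(",
# and everything after it.
sub drop_from_paren {
    my ($match) = @_;
    $match =~ s/\s*\(.*$//;
    return $match;
}

my $os = 'Ubuntu Linux (64-bit';        # value from the hash dump in the question
print drop_first_paren($os), "\n";      # prints "Ubuntu Linux 64-bit"
print drop_from_paren($os), "\n";       # prints "Ubuntu Linux"
```

Which of the two is appropriate depends on whether the architecture text should stay in the OS field.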


I generally find that .* is too powerful (as you are discovering!). Two suggestions:

Be more explicit on what you are looking for

    my $text = '( something ) ( something else) ' ;

    $text =~ /
      \(
      ( [\s\w]+ )
      \)
        /x ;

    print $1 ;

Use non greedy matching

    my $text = '( something ) ( something else) ' ;

    $text =~ /
      \(
      ( .*? )   # non greedy match
      \)
        /x ;

    print $1 ;

General observation - involved regexps are far easier to read if you use the /x option as this allows spacing and comments.
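
Applied to the guestOSAltName line from the question, a commented /x pattern might look like the sketch below. The sub name parse_guest_os and the choice to capture the architecture in the same pattern are assumptions, not something taken from the question's code:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (OS name, architecture) for one line or a
# whole file; either value is "N/A" when it cannot be found.
sub parse_guest_os {
    my ($text) = @_;
    if ($text =~ /
            ^guestOSAltName \s* = \s*       # key and equals sign
            "                               # opening quote
            (?<GUEST_OS> [^"(]+? )          # OS name, lazy, no quote or paren
            \s*                             # whitespace before the paren, if any
            (?: \( (?<ARCH> [^)]+ ) \) )?   # optional parenthesised part
            "                               # closing quote
        /xm) {
        return ($+{GUEST_OS}, $+{ARCH} // 'N/A');
    }
    return ('N/A', 'N/A');
}

my ($os, $arch) = parse_guest_os('guestOSAltName = "Ubuntu Linux (64-bit)"');
print "$os | $arch\n";                      # prints "Ubuntu Linux | 64-bit"
```

The explicit character classes follow the first suggestion, the lazy +? follows the second, and the comments are only possible because of /x.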


Put a ? after your quantifier. The ? makes the quantifier non-greedy.

The regex is /^guestOSAltName[^"]+"(?<GUEST_OS>.+?)\s*[\("]+.*$/:

#!/usr/bin/env perl

foreach my $x ('guestOSAltName = "Ubuntu Linux (64-bit)"', 'guestOSAltName = "Microsoft Windows Server 2003, Standard Edition"') {
    if ($x =~ m/^guestOSAltName[^"]+"(?<GUEST_OS>.+?)\s*[\("]+.*$/xm) {
        print "$+{GUEST_OS}\n";
    } else {
        print "N/A\n";
    }
    if ($x =~ m/^guestOSAltName[^(]+\((?<ARCH>\d{2}).*/xm) {
         print "$+{ARCH}\n";
    } else {
         print "N/A\n";
    }
}

Start the demo:

$ perl t.pl
Ubuntu Linux
64
Microsoft Windows Server 2003, Standard Edition
N/A
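
To isolate the effect of the ?, here is a reduced sketch that runs the same pattern with and without it. The shortened patterns and the sub names are assumptions made for the comparison, not the full regex used above:

```perl
use strict;
use warnings;

# Greedy: .+ grabs as much as it can, then backtracks just far enough.
sub greedy_capture {
    my ($line) = @_;
    my ($value) = $line =~ /"(.+)\s*[("]/;
    return $value;
}

# Non-greedy: .+? grabs as little as it can, stopping before the first "(" or quote.
sub lazy_capture {
    my ($line) = @_;
    my ($value) = $line =~ /"(.+?)\s*[("]/;
    return $value;
}

my $line = 'guestOSAltName = "Ubuntu Linux (64-bit)"';
print greedy_capture($line), "\n";      # prints "Ubuntu Linux (64-bit)"
print lazy_capture($line), "\n";        # prints "Ubuntu Linux"
```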