
awk file manipulation

I have the following lines in my text file and I want to extract fields from them as follows.

device1 te rfe3 -1     10.1.2.3   device1 te rfe3
device2 cdr thr        10.2.5.3   device2 cdr thr
device4                10.6.0.8   device4
device3 hrdnsrc dhe    10.8.3.6   device3 hrdnsrc dhe

My objective is to extract the device name and the IP address and strip everything else away. There is no pattern after the device name: some lines have 2-3 words there and some have nothing. I also don't need the 3rd column. I am looking for a result like this.

device1   10.1.2.3
device2   10.2.5.3 
device3   10.8.3.6 
device3   10.8.9.4 

Is this possible? Thanks in advance.


 sed -r 's/^([^ ]*) .* (([0-9]{1,3}\.){3}[0-9]{1,3}).*$/\1 \2/'

Proof of Concept

$ sed -r 's/^([^ ]*) .* (([0-9]{1,3}\.){3}[0-9]{1,3}).*$/\1 \2/' ./infile
device1 10.1.2.3
device2 10.2.5.3
device4 10.6.0.8
device3 10.8.3.6


In awk, this is something like

$ awk '{
         for (f = 2; f <= NF; f++) {
           if ($f ~ /^([0-9]+\.){3}[0-9]+$/) {
             print $1, $f
             break
           }
         }
       }' file

Here's a transcript:

mress:10192 Z$ cat pffft.awk
{
  for (f = 2; f <= NF; f++) {
    if ($f ~ /^([0-9]+\.){3}[0-9]+$/) {
      print $1, $f
      break
    }
  }
}
mress:10193 Z$ cat pfft.in 
device1 te rfe3 -1     10.1.2.3   device1 te rfe3
device2 cdr thr        10.2.5.3   device2 cdr thr
device4                10.6.0.8   device4
device3 hrdnsrc dhe    10.8.3.6   device3 hrdnsrc dhe
mress:10194 Z$ awk -f pffft.awk pfft.in
device1 10.1.2.3
device2 10.2.5.3
device4 10.6.0.8
device3 10.8.3.6
mress:10195 Z$ _
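
For comparison, here is a rough Python sketch of the same field-scanning idea as the awk script: split the line on whitespace, keep the first field, and print the first later field that looks like a dotted quad. The function name name_and_ip and the two sample rows are only illustrative assumptions; the regex is the one from the awk script.

```python
import re

# Same test the awk script applies to each field: four dot-separated digit runs.
IP_FIELD = re.compile(r"^([0-9]+\.){3}[0-9]+$")

def name_and_ip(line):
    """Return (first field, first IP-looking field), or None if nothing matches."""
    fields = line.split()
    # Start at the second field, like `for (f = 2; f <= NF; f++)` in awk.
    for field in fields[1:]:
        if IP_FIELD.match(field):
            return fields[0], field
    return None

sample = """\
device1 te rfe3 -1     10.1.2.3   device1 te rfe3
device4                10.6.0.8   device4
"""
for line in sample.splitlines():
    result = name_and_ip(line)
    if result:
        print(*result)
```

Like the awk version, this does not care how many words sit between the name and the address, and it silently skips lines with no IP-looking field.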


In Perl (printing only when the regex matches, so a non-matching line cannot reuse the previous line's $1 and $2):

perl -ne 'next if /^\s*$/; print "$1\t$2\n" if /^(\w+).*?(\d+(\.\d+){3})/' test_file

For sorted results you could pipe the output to the sort command:

perl -ne 'next if /^\s*$/; print "$1\t$2\n" if /^(\w+).*?(\d+(\.\d+){3})/' test_file | sort

Updated, script-style version:

use strict;
use warnings;

my $test_file = shift or die "no input file provided\n";

# open a filehandle to your test file
open my $fh, '<', $test_file or die "could not open $test_file: $!\n";

while (<$fh>) {
    # ignore the blank lines
    next if /^\s*$/;

    # skip lines that do not match, so stale $1 and $2 are never printed
    next unless
    /               # regex starts
    ^               # beginning of the string
    (\w+)           # store the first word in $1
    \s+             # followed by whitespace
    .*?             # match anything but don't be greedy until...
    (\d+(\.\d+){3}) # expands to (\d+\.\d+\.\d+\.\d+) and stored in $2
    /x;             # regex ends

    # print first and second match
    print "$1\t$2\n";
}

close $fh;


Python's not on your list, but something like this might work.

import sys
import re
pattern = re.compile(r"^(\w+)\s.*?\s(\d+\.\d+\.\d+\.\d+)\s.*$")
for line in sys.stdin:
    match = pattern.match(line)
    if match:  # skip blank or non-matching lines instead of crashing on None
        sys.stdout.write("{0} {1}\n".format(match.group(1), match.group(2)))

It should work on most Linux platforms, since Python is usually installed by default.


Assuming the input file has the fields always aligned to the same columns, the shortest POSIX solution would be

$ cut -c1-8,23-33 x
device1  10.1.2.3
device2  10.2.5.3
device4  10.6.0.8
device3  10.8.3.6
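
If you want the same fixed-column idea outside of cut, a minimal Python sketch could look like the following. It assumes, as the cut answer does, that the name sits in columns 1-8 and the address in columns 23-33; the function name cut_fixed and the padded sample rows are made up for illustration.

```python
def cut_fixed(line):
    """Mimic `cut -c1-8,23-33`: characters 1-8 and 23-33, counted from 1."""
    # cut counts from 1 and includes both ends; Python slices count from 0
    # and exclude the end, so 1-8 becomes [0:8] and 23-33 becomes [22:33].
    return line[0:8].strip(), line[22:33].strip()

# Sample rows padded so the IP starts in column 24, as in the question's file.
rows = [
    f"{'device1 te rfe3 -1':<23}10.1.2.3   device1 te rfe3",
    f"{'device4':<23}10.6.0.8   device4",
]
for row in rows:
    print(*cut_fixed(row))
```

The same caveat applies as with cut: a device name longer than 8 characters is truncated, and a shifted column breaks the result without any warning.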


Depending on how close the cruft gets to looking like an IP address, this may or may not cut it:

sed -re 's/^([^ ]*).* ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\1 \2/g'
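
To see what "how close the cruft gets" can mean in practice, here is a small Python sketch that uses the re module as a stand-in for sed. The two engines are not identical, so treat this as an illustration only. The row for device5 is hypothetical and not from the question; it places a second address inside the cruft. With the greedy .* the last address on the line is kept, while a lazy .*? keeps the first one.

```python
import re

IP = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
# Same shape as the sed expression: name, anything, a space, an address, the rest.
GREEDY = re.compile(r"^([^ ]*).* (" + IP + r").*")
# Variant with a lazy middle part, for comparison only.
LAZY = re.compile(r"^([^ ]*).*? (" + IP + r").*")

def strip_cruft(line, pattern=GREEDY):
    """Replace the whole line with 'name address', like s/.../\\1 \\2/."""
    return pattern.sub(r"\1 \2", line)

clean = "device2 cdr thr        10.2.5.3   device2 cdr thr"
# Hypothetical row, not from the question: an extra address sits in the cruft.
noisy = "device5 gw 10.0.0.1   10.9.9.9   device5"

print(strip_cruft(clean))
print(strip_cruft(noisy))        # greedy: keeps the last address on the line
print(strip_cruft(noisy, LAZY))  # lazy: keeps the first address on the line
```

Which of the two is "right" depends on the data, so it is worth checking a few awkward rows before trusting either form.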


As with the cut solution, in Perl you could use "unpack" if the file always has the same column layout:

perl -nE 'say unpack("A8 x14 A9")' data.txt

Or use a regular expression that captures the first word followed by a whitespace character, ^(\w+\s), and then a run of digits followed three times by a dot and more digits, (\d+(\.\d+){3}):

perl -nE '/^(?<name>\w+\s).*?(?<ip>\d+(\.\d+){3})/; 
         say "$+{name} $+{ip}" '  data.txt

The named captures ($+{name} $+{ip}) are just for fun :-)
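
Python has named groups as well, written (?P<name>...). A minimal sketch of the same idea follows; the function name extract is made up for illustration, and the pattern is adapted from the Perl one above rather than copied exactly (the whitespace after the name is left out of the capture).

```python
import re

# Named groups: (?P<name>...) in Python plays the role of (?<name>...) in Perl.
PATTERN = re.compile(r"^(?P<name>\w+)\s.*?(?P<ip>\d+(?:\.\d+){3})")

def extract(line):
    """Return (name, ip) from one line, or None if the line does not match."""
    m = PATTERN.match(line)
    return (m.group("name"), m.group("ip")) if m else None

print(extract("device3 hrdnsrc dhe    10.8.3.6   device3 hrdnsrc dhe"))
```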
