awk file manipulation

2023-02-16 05:15

I have the following lines in my text file and I want to extract data from them as follows.

device1 te rfe3 -1     10.1.2.3   device1 te rfe3
device2 cdr thr        10.2.5.3   device2 cdr thr
device4                10.6.0.8   device4
device3 hrdnsrc dhe    10.8.3.6   device3 hrdnsrc dhe

My objective is to extract the device name and the IP address and strip everything else away. There is no pattern after the device name: some lines have 2-3 words there and some have nothing at all. I also don't need the 3rd column. I am looking for a result like this:

device1   10.1.2.3
device2   10.2.5.3 
device3   10.8.3.6 
device3   10.8.9.4

Is this possible? Thanks in advance.

 sed -r 's/^([^ ]*) .* (([0-9]{1,3}\.){3}[0-9]{1,3}).*$/\1 \2/'

Proof of Concept

$ sed -r 's/^([^ ]*) .* (([0-9]{1,3}\.){3}[0-9]{1,3}).*$/\1 \2/' ./infile
device1 10.1.2.3

device2 10.2.5.3

device4 10.6.0.8

device3 10.8.3.6

In awk, this is something like

$ awk '{
         for (f = 2; f <= NF; f++) {
           if ($f ~ /^([0-9]+\.){3}[0-9]+$/) {
             print $1, $f
             break
           }
         }
       }' file

Here's a transcript:

mress:10192 Z$ cat pffft.awk
{
  for (f = 2; f <= NF; f++) {
    if ($f ~ /^([0-9]+\.){3}[0-9]+$/) {
      print $1, $f
      break
    }
  }
}
mress:10193 Z$ cat pfft.in 
device1 te rfe3 -1     10.1.2.3   device1 te rfe3
device2 cdr thr        10.2.5.3   device2 cdr thr
device4                10.6.0.8   device4
device3 hrdnsrc dhe    10.8.3.6   device3 hrdnsrc dhe
mress:10194 Z$ awk -f pffft.awk pfft.in
device1 10.1.2.3
device2 10.2.5.3
device4 10.6.0.8
device3 10.8.3.6
mress:10195 Z$ _
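
For comparison, the field-scanning idea in the awk answer above can be sketched in Python. This is only a rough equivalent, not part of the original answer: the function name name_and_ip and the embedded sample are made up for illustration. It splits on whitespace the way awk's default field separator does, starts at the second field, and stops at the first field shaped like a dotted quad.

```python
import re

# Same shape test as the awk answer: four dot-separated runs of digits.
IPV4_SHAPE = re.compile(r"([0-9]+\.){3}[0-9]+")

def name_and_ip(line):
    """Return (first_field, first_ip_shaped_field), or None if there is none."""
    fields = line.split()            # whitespace split, like awk's default FS
    for field in fields[1:]:         # start at the second field, like f = 2
        if IPV4_SHAPE.fullmatch(field):
            return fields[0], field  # stop at the first hit, like break
    return None

# Hypothetical sample mirroring the question's input.
sample = """\
device1 te rfe3 -1     10.1.2.3   device1 te rfe3
device2 cdr thr        10.2.5.3   device2 cdr thr
device4                10.6.0.8   device4
device3 hrdnsrc dhe    10.8.3.6   device3 hrdnsrc dhe
"""

for line in sample.splitlines():
    hit = name_and_ip(line)
    if hit:
        print(*hit)
```

Like the awk version, this only checks the shape of the field, so something like 999.1.2.3 would still be accepted.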

In Perl:

perl -ne 'next if /^\s*$/ ; /^(\w+).*?(\d+(\.\d+){3})/; print "$1\t$2\n"' test_file

For sorted results, you can pipe the output to the sort command:

perl -ne 'next if /^\s*$/ ; /^(\w+).*?(\d+(\.\d+){3})/; print "$1\t$2\n"' test_file | sort

Updated, script-style version:

my $test_file = shift or die "no input file provided\n";

# open a filehandle to your test file
open my $fh, '<', $test_file or die "could not open $test_file: $!\n";

while (<$fh>) {
    # ignore the blank lines
    next if /^\s*$/;

    # regex matching; skip lines without a match so stale $1/$2 are never printed
    next unless
    /               # regex starts
    ^               # beginning of the string
    (\w+)           # store the first word in $1
    \s+             # followed by a space
    .*?             # match anything but don't be greedy until...
    (\d+(\.\d+){3}) # expands to (\d+\.\d+\.\d+\.\d+) and stored in $2
    /x;             # regex ends 

    # print first and second match
    print "$1\t$2\n"
}

Python's not on your list, but something like this might work.

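import sys
import re

# Raw string so the backslashes reach the regex engine unchanged.
pattern = re.compile(r"^(\w+)\s.*?\s(\d+\.\d+\.\d+\.\d+)\s.*$")
for line in sys.stdin:
    match = pattern.match(line)
    if match is None:
        continue  # skip blank lines and lines without an address
    sys.stdout.write("{0} {1}\n".format(match.group(1), match.group(2)))

It should work on most Linux platforms, since Python is usually preinstalled.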

Assuming the input file has the fields always aligned to the same columns, the shortest POSIX solution would be

$ cut -c1-8,23-33 x
device1  10.1.2.3

device2  10.2.5.3

device4  10.6.0.8

device3  10.8.3.6
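
The same fixed-column idea can be sketched in Python with plain string slices. This is an illustration only: the function name cut_columns is made up, and the column positions are taken from the cut command above (1-based, inclusive), so it rests on the same assumption that every line is aligned like the sample.

```python
def cut_columns(line):
    """Mimic cut -c1-8,23-33: 1-based inclusive columns become 0-based slices."""
    name = line[0:8]     # columns 1-8
    ip = line[22:33]     # columns 23-33
    return name.strip(), ip.strip()

# Hypothetical rows; ljust(23) pads the left part so the address starts at column 24,
# which is where it sits in the question's sample.
rows = [
    ("device1 te rfe3 -1", "10.1.2.3   device1 te rfe3"),
    ("device2 cdr thr", "10.2.5.3   device2 cdr thr"),
    ("device4", "10.6.0.8   device4"),
    ("device3 hrdnsrc dhe", "10.8.3.6   device3 hrdnsrc dhe"),
]

for left, right in rows:
    line = left.ljust(23) + right
    print(*cut_columns(line))
```

As with cut, anything that does not fit the window is lost: an address longer than the 11-column slice would be truncated, and a device name longer than 8 characters would be cut short.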

Depending on how close to an IP number the cruft gets, this may or may not cat your cake:

sed -re 's/^([^ ]*).* ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\1 \2/g'
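
One more caveat with shape-only patterns like the one above: [0-9]{1,3} also accepts out-of-range octets such as 999. If that matters for your data, here is a rough Python sketch (not part of the sed answer; the names CANDIDATE and extract_valid are made up) that finds a candidate with a similar regex and then lets the standard ipaddress module decide whether it is a real IPv4 address.

```python
import ipaddress
import re

# Shape-only pattern: first word, then the first dotted quad that follows it.
CANDIDATE = re.compile(r"^(\S+)\s.*?\b(\d{1,3}(?:\.\d{1,3}){3})\b")

def extract_valid(line):
    """Return (name, ip) if the line holds a valid IPv4 address, else None."""
    m = CANDIDATE.search(line)
    if m is None:
        return None
    try:
        ipaddress.IPv4Address(m.group(2))   # raises ValueError on e.g. 999.1.2.3
    except ValueError:
        return None
    return m.group(1), m.group(2)

# Hypothetical input: one good line and one with an out-of-range octet.
for line in ["device1 te rfe3 -1     10.1.2.3   device1 te rfe3",
             "device9 junk           999.1.2.3  device9 junk"]:
    print(extract_valid(line))
```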

À la the cut solution: with Perl you could use "unpack" if the file always has the same column layout:

perl -nE 'say unpack("A8 x14 A9")' data.txt

Or use a regular expression: ^(\w+\s) captures the first word plus the whitespace character after it, and (\d+(\.\d+){3}) captures a run of digits followed by three more groups of a dot and digits:

perl -nE '/^(?<name>\w+\s).*?(?<ip>\d+(\.\d+){3})/; 
         say "$+{name} $+{ip}" '  data.txt

The named captures ($+{name} $+{ip}) are just for fun :-)
