Is it better to use Perl or Unix commands to parse this string?

Is there a good Unix or Perl one-liner that can reformat this string from:

<?xml version="1.0" encoding="UTF-8"?><org.apache.Summary length="200429142" fileCount="197184" dirCount="50" quota="-1" spaceUsed="601287428" spaceQuota="-1"/>

To:

length=200429142
fileCount=197184
dirCount=50
quota=-1
spaceUsed=601287428
spaceQuota=-1


Here's a one-liner, broken up into separate lines for clarity:

perl -MXML::Simple -l \
    -e '$a = XMLin shift; print "$_=$a->{$_}" for ' \
    -e 'qw(length fileCount dirCount quota spaceUsed spaceQuota)' \
    (your XML string here)

This requires that you have the XML::Simple module installed.


Just a fast shot: What about this?

sed -r 's/.*<org\.apache\.Summary\s+([^>]+)\/>/\1/; s/"//g' | tr " " "\n"


 sed -e 's/.*Summary //;s/\/.*$//' temp|perl -p -e 's/ /\n/g'

length="200429142"
fileCount="197184"
dirCount="50"
quota="-1"
spaceUsed="601287428"
spaceQuota="-1"

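If you want to edit the file in place, let perl do all the substitutions, since -i only takes effect when perl is given a file name rather than piped input:

perl -pi -e 's/.*Summary //;s/\/.*$//;s/ /\n/g' temp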

If you do not need the double quotes, then:

 sed -e 's/.*Summary //;s/\/.*$//' temp|perl -p -e 's/ /\n/g;s/\"//g'
length=200429142
fileCount=197184
dirCount=50
quota=-1
spaceUsed=601287428
spaceQuota=-1


A refined version based on @bmk

sed -r 's/<\?.*\?>//' | sed -r 's/<[a-z\.]+//I' | \
sed -r 's/\/>//' | sed -r 's/ ([a-z]+)="(-?[0-9]+)"/\1=\2\n/Ig'

Four sed commands are used in total:

  1. remove the <?xml?>
  2. remove the <org.apache.Summary
  3. remove the />
  4. extract the XML attributes into pairs.
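
The same result can probably be reached with a different technique: let grep pick out the attributes instead of deleting everything around them. The following is only a sketch, assuming every attribute of interest has an integer value (as in the sample) and that your grep supports -o (GNU and BSD grep do, but it is not in POSIX). The function name summary_pairs is made up for illustration.

```shell
#!/bin/sh
# Sample input copied from the question.
xml='<?xml version="1.0" encoding="UTF-8"?><org.apache.Summary length="200429142" fileCount="197184" dirCount="50" quota="-1" spaceUsed="601287428" spaceQuota="-1"/>'

# summary_pairs is a hypothetical helper name, not part of any tool.
# grep -o prints each name="integer" match on its own line;
# tr -d then strips the double quotes.
summary_pairs() {
    grep -oE '[A-Za-z]+="-?[0-9]+"' | tr -d '"'
}

printf '%s\n' "$xml" | summary_pairs
```

Because the pattern only accepts integer values, the version and encoding attributes of the XML declaration are skipped without a separate step. Attributes with non-numeric values would need a looser pattern.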


This should do what you need.

perl -0777 -E'given(<>){/\?>/g; say "$1$2" while /(\w+=)"(.*?)"/g}' myfile.xml

Output:

length=200429142
fileCount=197184
dirCount=50
quota=-1
spaceUsed=601287428
spaceQuota=-1