How to print a section of a file between two regular expressions only if a line within the section contains a certain string
I have a file of events that contains many multi-line events between <event> and </event> tags. I want to print the entire event, from <event> to </event>, only if a line within that event contains either the string uniqueId="1279939300.862594_PFM_1_1912320699" or uniqueId="1281686522.353435_PFM_1_988171542". The file has 100000 events in it and each event has between 20 and 35 lines (the attributes within an event vary its length). I started off using sed but need a little help beyond:
cat xmlEventLog_2010-03-23T* | sed -nr "/<event eventTimestamp/,/<\/event>/"
What do I need to do to finish this? Also is sed the best way of doing this given the size of the files?
Thanks in advance
A
I wanted to edit this to update. For certain reasons I want to do this with sed. I tried Denis's solution but it does not seem to work:
bash$ grep 1279939300.862594_PFM_1_1912320699 xmlEventLog*
xmlEventLog_2010-03-23T02:41:15_PFM_1_1.xml: <event eventTimestamp="2010-03-23T02:41:40.861" originalReceivedMessageSize="0" uniqueId="1279939300.862594_PFM_1_1912320699">
bash$ grep 1281686522.353435_PFM_1_988171542 xmlEventLog*
xmlEventLog_2010-03-23T07:47:38_PFM_1_1.xml: <event eventTimestamp="2010-03-23T08:02:02.299" originalReceivedMessageSize="685" uniqueId="1281686522.353435_PFM_1_988171542">
bash$ time sed -n ':a; /<event>/,/<\/event>/ N; /<event>/,/<\/event>/!b; /<\/event>/ {/uniqueId="1279939300.862594_PFM_1_1912320699"\|uniqueId="1281686522.353435_PFM_1_988171542"/p;d}; ba' xmlEventLog*
real 1m13.134s
user 1m12.463s
sys 0m0.659s
bash$
Which obviously returned nothing. So is it possible to do this with sed?
A
awk -v RS='</event>' '/<event / && /1279939300\.862594_PFM_1_1912320699|1281686522\.353435_PFM_1_988171542/ {sub(/^\n/, ""); print $0 RS}' file
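A multi-character RS is an extension (gawk and mawk support it; other awks may use only the first character), and the record separator is stripped from each record. If that is a concern, a minimal sketch that buffers lines instead could look like the following. The function name print_matching_events_awk is made up for illustration, and it assumes the opening tag always carries attributes (so it looks like "<event ...") and that each event spans more than one line, as in the sample output in the question.

```shell
# Hypothetical helper: print every <event ...> ... </event> block that
# contains one of the two uniqueId values from the question.
print_matching_events_awk() {
    awk '
    /<event /   { buf = ""; inside = 1 }   # start of an event: reset the buffer
    inside      { buf = buf $0 "\n" }      # collect every line of the event
    /<\/event>/ {
        # end of the event: print the buffer only if it holds a wanted id
        if (inside && buf ~ /uniqueId="(1279939300\.862594_PFM_1_1912320699|1281686522\.353435_PFM_1_988171542)"/)
            printf "%s", buf
        inside = 0
    }' "$@"
}
```

It would be called as: print_matching_events_awk xmlEventLog_2010-03-23T*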
Give this a try:
sed -n ':a; /<event>/,/<\/event>/ N; /<event>/,/<\/event>/!b; /<\/event>/ {/uniqueId="1279939300.862594_PFM_1_1912320699"\|uniqueId="1281686522.353435_PFM_1_988171542"/p;d}; ba'
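One thing to check: the address /<event>/ only matches a bare tag, while the grep output in the question shows opening tags with attributes (<event eventTimestamp=...), so the range may never start and nothing gets printed. Below is a minimal sketch that collects each event in the hold space and prints it only when it contains one of the ids. It is an illustration under assumptions, not a tested fix for the real logs: the function name print_matching_events is hypothetical, the opening tag is assumed to look like "<event ...", every event is assumed to span more than one line, and -E (extended regular expressions) is assumed to be available, as it is in current GNU and BSD sed.

```shell
# Hypothetical helper: print whole events that contain a wanted uniqueId.
# Inside the <event ...> to </event> range:
#   h  on the opening line starts a fresh copy in the hold space
#   H  on every other line appends that line to the hold space
#   g  on the closing line pulls the whole event back for the id test
print_matching_events() {
    sed -n -E '
    /<event /,/<\/event>/ {
        /<event /h
        /<event /!H
        /<\/event>/ {
            g
            /uniqueId="(1279939300\.862594_PFM_1_1912320699|1281686522\.353435_PFM_1_988171542)"/p
        }
    }' "$@"
}
```

It would be called as: print_matching_events xmlEventLog_2010-03-23T*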
You should be able to embed the unique ids directly into the regular expression, using the | character to allow either uniqueid. I did a quick test and the following regular expression seems to find the correct entries:
<event.*?uniqueid=("1279939300\.862594_PFM_1_1912320699"|"1281686522\.353435_PFM_1_988171542").*?</event>