开发者

regex pattern to match insert sql query

I have some log file which contains many rows of similar:

[26-Nov-2010 07:33:08] query error: INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
                                        VALUES(8416961,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
                    0,1,0,0,0,'',0,1,0,0,'abc','abc',
                                        '0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1
[26-Nov-2010 08:33:08] query error: INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
                                        VALUES(8416962,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
                    0,1,0,0,0,'',0,1,0,0,'abc','abc',
                                        '0', 'abc');|http://www.example.com/|Duplicate entry '8388607' for key 1

What I would like to do is to run a regex to match ALL the insert queries only (ignoring the time, url, and the duplicate message.

So it should return:

INSERT INTO members (id,name,member_login_key,email,mgroup,posts开发者_开发百科,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
                                    VALUES(8416961,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
                0,1,0,0,0,'',0,1,0,0,'abc','abc',
                                    '0', 'abc');
INSERT INTO members (id,name,member_login_key,email,mgroup,posts,joined,ip_address,time_offset,view_sigs,email_pm,view_img,view_avs,restrict_post,view_pop,msg_total,new_msg,coppa_user,language,dst_in_use,allow_admin_mails,hide_email,subs_pkg_chosen,members_l_username,members_l_display_name, item_id, members_display_name)
                                    VALUES(8416962,'abc','3857b123a1a67ce1fc4a39fd7ae47355','test@email.com',1,0,1290756788,'127.0.0.1','',1,1,1,1,
                0,1,0,0,0,'',0,1,0,0,'abc','abc',
                                    '0', 'abc');

Anyone can assist? Thank you in advance!


If all the inserts span 4 lines of log file then you could use this regexp:

 (.*)(INSERT INTO.*\n.*\n.*\n.*\))(;.*)

with this matching replace string:

 \2\n


This should be possible, it very much depends if the whole file is identical to that.

This is to get just the INSERTs, if you want the log entries, then the regexp needs to change slightly.

$logFile = file_get_contents('inserts.log');

$matches = array();
preg_match_all("/(?P<insert>INSERT .+?;)/s", $logFile, $matches);

foreach ($matches['insert'] as $cQuery) {
    echo $cQuery . "\n";
}

See preg_match_all documentation for further info on this approach.


Do you want to extract parts of it, or just match?

Just match is easy, it doesn't require a regex at all, just the substring INSERT INTO.

grep 'INSERT INTO' foo.log

If you want to extract details, or have a more specific match, please give a little more information.

If you want to have the three following lines, you can do this.

grep -A 3 'INSERT INTO' foo.log

And if you want to trim some stuff from the start and end (it's pretty ugly, but works on your examples)

grep -A 3 'INSERT INTO' foo.log | sed -e 's/^.*INSERT INTO/INSERT INTO/' -e 's/);|.*/);/'
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜