A Perl or Gawk script that returns a Keyword, the word before, and the word after?
I need a simple script to run on Windows that searches large XML files for a keyword and returns the word before it, the keyword itself, and the word after it.
For example, given "How can I extract keywords in context", I want "extract keywords in".
I'm a novice with enough knowledge to return each line with the keyword, and the lines before and after, but I'm stumped on getting the individual words I need out.
Anyone have any clever ideas?
Here's one way:
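#!/usr/bin/perl
use 5.12.0;    # enables the say feature
use warnings;

my $keyword = 'keywords';

while (<DATA>) {
    # Print every "word keyword word" triple on this line. Punctuation
    # directly after the keyword (e.g. "keywords,") is allowed. A keyword
    # at the very start or end of a line has no neighbour and is skipped.
    say for /\b(\S+\s+\b\Q$keyword\E[[:punct:]]*\s+\S+)\b/g;
}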
__END__
How can I extract keywords in context, even if there are many keywords to
extract? So many keywords, no idea how to deal with them.
grep -o is enough:
grep -Po '(\S+\s)?keywords(\s\S+)?' << END
How can I extract keywords in context
How can I extract keywords
keywords in context
END
returns
extract keywords in
extract keywords
keywords in
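One caveat: as written, the pattern also matches the keyword inside a longer word (for instance "mykeywords"), and it allows only a single whitespace character on each side. A possible variant, assuming GNU grep with -P support, adds \b word boundaries and \s+. The keyword variable and the file name sample_grep.txt are illustrative assumptions, and the keyword is assumed to contain no regex metacharacters.

```shell
keyword='keywords'   # assumed search term

# Hypothetical sample input, including a near miss ("mykeywords").
printf '%s\n' \
  'How can I extract keywords in context' \
  'keywords in context' \
  'no mykeywords here' > sample_grep.txt

# -o prints only the matched text; -P enables Perl-style \S, \s and \b.
# Double quotes let the shell expand ${keyword}; the backslashes reach grep as-is.
grep_matches=$(grep -Po "(\S+\s+)?\b${keyword}\b(\s+\S+)?" sample_grep.txt)

printf '%s\n' "$grep_matches"
```

This should print "extract keywords in" and "keywords in"; the third line produces nothing because "keywords" there is only part of a longer word.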