Delete unimportant words

I have a file of words, importantwords.txt (multiple lines, space-separated words). Example:

ALMOST
APPARENTLY
COULD
DEPEND
.
.
.

I also have text files 01news.txt, ..., 10news.txt (news as plain text). Example:

During the short period of time between acquisition and allocation, the executive directors of the Company are deemed to be interested in those shares. The Company announces that the following transactions took place in relation to the SIP on Tuesday.

Now I want to delete from 01news.txt, ..., 10news.txt all the words that are not in importantwords.txt.

How could I do that? I tried it with sed, but I am a newbie. Can you help, please?


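for file in *news.txt
do
   # First file (FNR==NR): store every important word as an array key.
   # News file: print only the words whose upper-case form is a key.
   awk 'FNR==NR{for(i=1;i<=NF;i++) impt[$i];next }
   {
      for(j=1;j<=NF;j++) {
        if ( toupper($j) in impt)  {
           printf "%s ", $j
        }
      }
      print ""
   } ' importantwords.txt "$file" > tmp && mv tmp "$file"
done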
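
One thing to watch with the loop above: a token such as "could," is looked up with its comma attached, so it never matches COULD. The following is only a sketch of one way around that, not part of the original answer. The function name filter_words and the sample sentence are made up for illustration, and it assumes plain ASCII words. It prints to stdout instead of overwriting the file, so the result can be checked first.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print only the words of the
# text file ($2) that appear in the word list ($1), ignoring case and any
# punctuation stuck to the start or end of a word.
filter_words() {
    awk '
        # First file: remember every listed word, upper-cased.
        FNR == NR { for (i = 1; i <= NF; i++) impt[toupper($i)]; next }
        # Text file: look each word up without its surrounding punctuation.
        {
            out = ""
            for (j = 1; j <= NF; j++) {
                word = toupper($j)
                # Assumption: ASCII text; strip leading/trailing non-alphanumerics.
                gsub(/^[^A-Z0-9]+|[^A-Z0-9]+$/, "", word)
                # Keep the token as written, joined by single spaces.
                if (word in impt) out = out (out == "" ? "" : " ") $j
            }
            print out
        }
    ' "$1" "$2"
}

# Small demo on made-up sample data in a temporary directory.
dir=$(mktemp -d)
printf 'ALMOST\nAPPARENTLY\nCOULD\nDEPEND\n' > "$dir/importantwords.txt"
printf 'It could, apparently, depend on that.\n' > "$dir/01news.txt"
filter_words "$dir/importantwords.txt" "$dir/01news.txt"
# prints: could, apparently, depend
rm -rf "$dir"
```

Once the output looks right, the same redirect-and-move step as in the loop above (> tmp && mv tmp "$file") can be used to replace each file. Note that both versions rely on importantwords.txt being non-empty, since FNR==NR only identifies the first file if it contains at least one line.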