ignoring header content of a text
I want to discard everything in a file before "this word" and write only the text after it to disk.
text = <<EOT
I only want
the last lines
of the file after
this word. This is
the content I
want. Yet this
word can appear again.
EOT
puts text.scan("this word")
expected output:
This is
the content I
want. Yet this
word can appear again.
What is the most efficient way to do this?
Any help appreciated
Ted.
If 'this word' appears only once in your text, or if you want to remove everything up to the last occurrence:
text.sub(/.*this word\W*/m, '')
If 'this word' may appear several times and you want to remove only up to the first occurrence:
text.sub(/.*?this word\W*/m, '')
The m flag is needed because in Ruby . does not match a newline without it, so the pattern could not reach back over the earlier lines.
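A minimal sketch of the lazy pattern on the sample text from the question. In Ruby, . only matches newlines when the m flag is given, so the flag is included here. The variable names after_first and without_m are just for illustration.

```ruby
# Sample text from the question.
text = <<~EOT
  I only want
  the last lines
  of the file after
  this word. This is
  the content I
  want. Yet this
  word can appear again.
EOT

# /m lets "." match newlines, so the pattern can span the lines before the phrase.
# The lazy ".*?" stops at the first "this word"; "\W*" consumes the ". " after it.
after_first = text.sub(/.*?this word\W*/m, '')

# Without /m, "." stops at line breaks, so only the phrase itself is removed
# and the earlier lines stay in place.
without_m = text.sub(/.*?this word\W*/, '')

puts after_first
```

The later "this" / "word" pair in the sample is split across a line break, so the literal 'this word' does not match it.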
Like this?
irb(main):001:0> text = <<EOT
irb(main):002:0" I only want
irb(main):003:0" the last lines
irb(main):004:0" of the file after
irb(main):005:0" this word. This is
irb(main):006:0" the content I
irb(main):007:0" want.
irb(main):008:0" EOT
=> " I only want\n the last lines\n of the file after\n this word. This is\n the content I \n want.\n"
irb(main):009:0> needle = 'this word. '
=> "this word. "
irb(main):012:0> text[text.index(needle)+needle.length..-1]
=> "This is\n the content I \n want.\n"
irb(main):013:0> print text[text.index(needle)+needle.length..-1]
This is
the content I
want.
=> nil
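One caveat with the index approach: String#index returns nil when the needle is not found, and adding to nil raises an error. A small sketch with a guard follows; falling back to the whole text is an assumption on my part, so pick whatever suits your case.

```ruby
text = "I only want\nthe last lines\nof the file after\nthis word. This is\nthe content I\nwant.\n"
needle = 'this word. '

# index gives the offset of the first occurrence, or nil if there is none.
start = text.index(needle)

# Assumption: keep the full text when the needle is missing.
tail = start ? text[(start + needle.length)..-1] : text

print tail
```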
One easy way is to use the partition method:
text.partition("this word. ").last
(This assumes that "this word. " actually appears in the variable text. If it doesn't, this code returns an empty string.)
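If the empty-string result is not what you want, the middle element of the partition result tells you whether the separator was found. A rough sketch, where keeping the original text as the fallback is my assumption:

```ruby
text = "header line\nthis word. This is\nthe content I want.\n"

# partition returns [before, separator, after]; the separator slot is ""
# when the needle does not occur in the string.
_before, sep, rest = text.partition("this word. ")

# Assumption: return the text unchanged if the separator is missing.
result = sep.empty? ? text : rest

puts result
```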
One way:
text.split('this word. ')[1]
Or
text.split('this word. ').last
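These two differ once the separator shows up more than once: [1] gives only the piece between the first and second occurrence, and .last gives the piece after the final one. Passing a limit of 2 to split keeps everything after the first occurrence. A short sketch with a made-up string:

```ruby
sep = 'this word. '
text = "a this word. b this word. c"

pieces = text.split(sep)          # every occurrence is a split point
second = text.split(sep)[1]       # only the piece between occurrences 1 and 2
last   = text.split(sep).last     # the piece after the final occurrence
rest   = text.split(sep, 2)[1]    # limit 2: everything after the first occurrence

# When the separator is absent, split returns the whole string as one piece,
# so [1] is nil while .last is the unchanged text.
missing = "no marker".split(sep)

p pieces, second, last, rest, missing
```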