Replace content in a file between two markers

2023-02-08 00:51

Using Ruby (not Rails), I'm trying to figure out how to replace (not append) a certain block in a static file with a string. For example, in static_file.html I want to replace everything between the HTML comments "start" and "end":

<p>lorem ipsum blah blah ipsum</p>

<!--start-->
REPLACE MULTI-LINE
CONTENT HERE...
<!--end-->

<p>other stuff still here...</p>

Some of the answers here are helpful for inserting text at a certain spot, but they do not handle replacing the text between two markers.

Here's a function to handle it for you. Just pass it a file path and the contents to place between those HTML comment blocks. As long as your comment blocks are always formatted the same, <!--start--> and <!--end-->, this will work:

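# Replace everything between <!--start--> and <!--end--> in the file at
# file_path with contents, keeping the two marker comments in place.
def replace(file_path, contents)
    html = File.read(file_path)

    # Non-greedy .*? stops at the nearest <!--end-->; /m lets . match newlines.
    # The block form inserts contents literally, so backslashes in it are not
    # treated as backreferences.
    updated = html.gsub(/(<!--start-->).*?(<!--end-->)/im) do
        "#{$1}\n#{contents}\n#{$2}"
    end

    # Write the result back so the file itself is changed, then return it.
    File.write(file_path, updated)
    updated
end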
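
If the markers are not always written exactly the same way, the pattern can be loosened a little. The following is only a sketch under stated assumptions, not part of the answer above: `replace_between` and its `start_name`/`end_name` parameters are hypothetical names, it tolerates optional whitespace inside the comments, and it works on a string rather than a file.

```ruby
# Hypothetical helper (not from the original answer): replaces whatever sits
# between the start and end comments, tolerating whitespace inside them.
def replace_between(html, contents, start_name = "start", end_name = "end")
  # Regexp.escape keeps marker names containing special characters from
  # being read as regex syntax.
  opening = /<!--\s*#{Regexp.escape(start_name)}\s*-->/
  closing = /<!--\s*#{Regexp.escape(end_name)}\s*-->/

  # .*? is non-greedy and /m lets . match newlines. The block form inserts
  # contents literally and puts the matched marker comments back.
  html.gsub(/(#{opening}).*?(#{closing})/m) { "#{$1}\n#{contents}\n#{$2}" }
end

html = "<p>a</p>\n<!-- start -->\nold\n<!--end-->\n<p>b</p>"
puts replace_between(html, "new text")
```

Because the replacement comes from a block, a string such as `'a\1b'` is inserted as-is instead of being expanded as a backreference.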

The simple answer would be:

str = "FOO\nBAR\nblah \nblah BAZ\nBLOOP"
str.gsub(/BAR.*BAZ/m, "SEE")  # => "FOO\nSEE\nBLOOP"

I'm not sure if that's robust enough for what you are trying to do. The key here is the 'm' at the end of the regexp, which makes . match newlines so the pattern can span multiple lines. If this is to template some values, you may want to look at something like ERB templates instead of this gsub. Also, be careful about what you need to escape in your regular expressions.
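
For the templating case, a minimal ERB sketch might look like the following. It assumes you control the file and can swap the marker comments for an ERB tag; the template text and the `body` key are made up for illustration.

```ruby
require 'erb'

# Hypothetical template: the marker block is replaced by an ERB output tag.
template = ERB.new(<<~HTML)
  <p>lorem ipsum blah blah ipsum</p>

  <%= body %>

  <p>other stuff still here...</p>
HTML

# result_with_hash exposes each hash key as a local variable in the template.
rendered = template.result_with_hash(body: "hello world!")
puts rendered
```

With this approach there is no pattern to escape at all, at the cost of the static file becoming a template that has to be rendered.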

This is a simplified example of how to do it using a parser:

require 'nokogiri'

html = '<p>lorem ipsum blah blah ipsum</p>

<!--start-->
REPLACE MULTI-LINE
CONTENT HERE...
<!--end-->

<p>other stuff still here...</p>'

doc = Nokogiri.HTML(html)
puts doc.to_html

After parsing we get:

# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body>
# >> <p>lorem ipsum blah blah ipsum</p>
# >> 
# >> <!--start-->
# >> REPLACE MULTI-LINE
# >> CONTENT HERE...
# >> <!--end-->
# >> 
# >> <p>other stuff still here...</p>
# >> </body></html>

doc.at('//comment()/following-sibling::text()').content = "\nhello world!\n"
puts doc.to_html

After finding the comment, stepping to the next text() node and replacing it:

# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body>
# >> <p>lorem ipsum blah blah ipsum</p>
# >> 
# >> <!--start-->
# >> hello world!
# >> <!--end-->
# >> 
# >> <p>other stuff still here...</p>
# >> </body></html>

If your HTML is always going to be simple, with no possibility of having strings that break your search patterns, then you can go with search/replace.

If you check around, you'll see that for any non-trivial HTML manipulation you should go with a parser. That's because parsers deal with the actual structure of the document, so if the document changes, there's a better chance the parser won't be confused.
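
One concrete way a search pattern can break, sketched under the assumption that the page contains two marker pairs (the sample string is invented): a greedy `.*` runs from the first `<!--start-->` to the last `<!--end-->` and swallows everything in between, while a non-greedy `.*?` stops at the nearest `<!--end-->`.

```ruby
html = "<!--start-->A<!--end--><p>keep me</p><!--start-->B<!--end-->"

# Greedy: one match spanning both blocks, so the paragraph between them is lost.
greedy = html.gsub(/<!--start-->.*<!--end-->/m, "X")

# Non-greedy: each marker pair is matched separately.
lazy = html.gsub(/<!--start-->.*?<!--end-->/m, "X")

puts greedy  # X
puts lazy    # X<p>keep me</p>X
```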
