
Find and replace text in a 47GB large file [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.

Closed 2 years ago.


I have to do some find and replace tasks on a rather big file, about 47 GB in size.

Does anybody know how to do this? I tried tools like TextCrawler, EditPad Lite and more, but nothing supports a file this large.

I'm assuming this can be done via the command line.

Do you have any idea how this can be accomplished?


Sed (stream editor for filtering and transforming text) is your friend.

sed -i 's/old text/new text/g' file

Sed performs text transformations in a single pass.
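Note that -i edits the file in place by writing a temporary copy, so you need roughly as much free disk space again as the file itself. If you want to keep the original, GNU sed also accepts a backup suffix (the .bak suffix here is just an example):

sed -i.bak 's/old text/new text/g' file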


I use FART - Find And Replace Text by Lionello Lunesu.

It works very well on Windows 7 x64.

You can find and replace the text using this command:

fart -c big_filename.txt "find_this_text" "replace_to_this"

github


On Unix or Mac:

sed 's/oldstring/newstring/g' oldfile.txt > newfile.txt

fast and easy...


I solved the problem by first using split to break the large file into smaller pieces of 100 MB each, as sketched below.
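A rough sketch of that approach (the file names are placeholders, and the loop assumes GNU sed for -i; splitting on byte boundaries with -b can cut a match in two, so split on line boundaries instead if that matters for your pattern):

split -b 100m bigfile.txt part_
for f in part_*; do sed -i 's/old text/new text/g' "$f"; done
cat part_* > bigfile_replaced.txt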


If you are using a Unix-like system then you can use cat | sed to do this

cat hosted_domains.txt | sed 's/com/net/g'

This example replaces com with net in a list of domain names; you can then redirect the output to a file, as shown below.
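For example, writing the result to a new file (the output file name here is just a placeholder):

cat hosted_domains.txt | sed 's/com/net/g' > hosted_domains_new.txt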


For me, none of the tools suggested here worked well. TextCrawler ate all of my computer's memory, sed didn't work at all, and EditPad complained about memory...

The solution is to create your own script in Python, Perl or even C++.
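If you don't want to write a full script, a Perl one-liner does much the same job, streaming the file line by line instead of loading it all into memory (the file names are placeholders):

perl -pe 's/old text/new text/g' bigfile.txt > newfile.txt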

Or use the tool PowerGrep; this is the easiest and fastest option.

I haven't tried FART; it's command line only and maybe not very friendly.
Some hex editors, such as UltraEdit, also work well.


I used

sed 's/[nN]//g' oldfile.fasta > newfile.fasta

to replace all instances of n's in my 7 GB file.

If I omitted the > newfile.fasta redirection it took ages, because every line of the file scrolled up the screen.

With the > newfile.fasta redirection it ran in a matter of seconds on an Ubuntu server.
