Deleting words that appear multiple times in a file

Posted 2023-02-28 18:44

How can I delete words that appear multiple times in a file, keeping only the first occurrence of each and deleting the duplicates?

A simple algorithm is to iterate over all words in the input, adding each one to a set of words you've already seen. If a word is already in the set, skip it; otherwise keep it.

Here's an example:

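words = "a b a c b".split()  # sample input; in practice, the words read from the file
seen_words = set()
for word in words:
    if word not in seen_words:
        # First occurrence: output it and remember it
        print(word)
        seen_words.add(word)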

You can also use a dictionary like this:

mydict = {}
mylist = [1, 2, 2, 3, 4, 5, 5]
for item in mylist:
    # Keys are unique, so assigning a duplicate just overwrites the entry
    mydict[item] = ""
for item in mydict:
    print(item)

Output:

1
2
3
4
5

But of course you would need to integrate that into file reading/writing.
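As a rough sketch of that integration, the function below reads a file line by line and writes a copy in which each word appears only the first time it is seen. The name dedupe_file and both path parameters are hypothetical, and the sketch assumes that "word" means a whitespace-separated token, so punctuation and letter case are not normalized.

```python
def dedupe_file(src_path, dst_path):
    """Copy src_path to dst_path, keeping only the first occurrence of each word."""
    seen = set()  # words encountered so far, across all lines
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            kept = []
            for word in line.split():
                if word not in seen:
                    seen.add(word)
                    kept.append(word)
            # Kept words are re-joined with single spaces, so the
            # original spacing within a line is not preserved.
            dst.write(" ".join(kept) + "\n")
```

Because only one line is held in memory at a time, this should also cope with fairly large files, as long as the set of distinct words fits in memory. A line made up entirely of repeated words comes out as an empty line.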

You can use a set:

set('these are all the words the words all are these'.split())

Output (in arbitrary order, since sets are unordered): {'these', 'the', 'all', 'are', 'words'}
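A set removes the duplicates but does not guarantee any ordering, so you cannot tell which occurrence came first. If the original order matters, one possible alternative (not part of the answer above) is dict.fromkeys, which keeps keys in insertion order on Python 3.7 and later. The helper name unique_in_order is hypothetical.

```python
def unique_in_order(text):
    # dict keys are unique and, on Python 3.7+, keep insertion order,
    # so each word stays at the position of its first occurrence.
    return list(dict.fromkeys(text.split()))

print(unique_in_order("these are all the words the words all are these"))
# ['these', 'are', 'all', 'the', 'words']
```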

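fileText = "some words with duplicate words"
fileWords = fileText.split()
# Start with the first word, then append each word not seen before
output = fileWords[0]
words = [output]
for word in fileWords[1:]:
    if word not in words:
        output += " " + word
        words.append(word)
print(output)  # some words with duplicate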

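If your file is not EXTREMELY big, you can read it all into memory and remove every occurrence of a given word after the first one:

word = "word"
with open("file") as f:
    data = f.read()
ind = data.find(word)
if ind != -1:  # keep the first occurrence, remove later ones
    data = data[:ind + len(word)] + data[ind + len(word):].replace(word, "")
print(data)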
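One caveat with plain substring search is that it also matches inside longer words, so looking for "word" would hit "words" too. A possible variation, sketched here with the standard re module, matches whole words only; the function name keep_first is hypothetical, and it assumes \b word boundaries are an acceptable definition of a word.

```python
import re

def keep_first(text, word):
    """Remove every whole-word occurrence of `word` except the first."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    seen = False

    def repl(match):
        nonlocal seen
        if seen:
            return ""          # later occurrence: drop it
        seen = True
        return match.group(0)  # first occurrence: keep it

    return pattern.sub(repl, text)

print(keep_first("word words word", "word"))
```

The surrounding whitespace of a removed word is left in place, so the result may contain doubled or trailing spaces.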
