Removing repeated characters in a string without using recursion

You are given a string. Develop a function to remove duplicate characters from that string. The string could be of any length. Your algorithm must work in place; if you wish, you may use a constant amount of extra space that does not depend in any way on the string's size. Your algorithm must have a complexity of O(n).

My idea was to define an integer array of size 26, where index 0 corresponds to the letter a and index 25 to the letter z, and to initialize all the elements to 0. We then traverse the entire string once, incrementing the value at the corresponding index whenever we encounter a letter.

Then we traverse the string a second time: if the value at the corresponding index is 1, we print the letter; otherwise we do not.

In this way the time complexity is O(n) and the space used is constant, irrespective of the length of the string!
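As a concrete illustration of this idea, here is a minimal sketch in Ruby (assuming, as described, that the input contains only the lowercase letters a–z; the function name remove_dups is just for illustration):

def remove_dups(s)
  counts = Array.new(26, 0)                                        # fixed-size array, independent of string length
  s.each_char { |c| counts[c.ord - 'a'.ord] += 1 }                 # first pass: count occurrences
  result = ""
  s.each_char { |c| result << c if counts[c.ord - 'a'.ord] == 1 }  # second pass: keep characters seen exactly once
  result
end

puts remove_dups("programming")   # => "poain"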

If anyone can come up with ideas of better efficiency, it would be very helpful!


Your solution definitely meets the O(n) time requirement. Instead of an array, which would be very, very large if the allowed alphabet is large (Unicode has over a million characters), you could use a plain hash. Here is your algorithm in (unoptimized!) Ruby:

def undup(s)
  seen = Hash.new(0)                                # default count of 0 for unseen characters
  s.each_char { |c| seen[c] += 1 }                  # first pass: count each character
  result = ""
  s.each_char { |c| result << c if seen[c] == 1 }   # second pass: keep only characters that occur once
  result
end

puts(undup "")
puts(undup "abc")
puts(undup "Olé")
puts(undup "asdasjhdfasjhdfasbfdasdfaghsfdahgsdfahgsdfhgt")

It makes two passes through the string, and since a hash lookup takes expected constant time, you're good.

You can say the Hashtable (like your array) uses constant space, albeit large, because it is bounded above by the size of the alphabet. Even if the size of the alphabet is larger than that of the string, it still counts as constant space.

There are many variations to this problem, many of which are fun. To do it truly in place, you can sort first; this gives O(n log n). There are variations on merge sort where you ignore dups during the merge. In fact, this "no external hashtable" restriction appears in "Algorithm: efficient way to remove duplicate integers from an array" (also tagged interview question).
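For illustration, here is a minimal sketch of the sort-first variant (the name undup_sorted is illustrative; it assumes, as in the question, that removing duplicates means dropping every character that occurs more than once; sorting discards the original order, and this version still builds the output in a separate string for clarity):

def undup_sorted(s)
  chars = s.chars.sort                    # O(n log n); no hash table needed
  result = ""
  chars.each_with_index do |c, i|
    # keep c only if it differs from both neighbours, i.e. it occurs exactly once
    result << c if (i == 0 || chars[i - 1] != c) && (i == chars.size - 1 || chars[i + 1] != c)
  end
  result
end

puts undup_sorted("abcabd")   # => "cd"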

Another common interview question starts with a simple string, then they say, okay now a million character string, okay now a string with 100 billion characters, and so on. Things get very interesting when you start considering Big Data.

Anyway, your idea is pretty good. It can generally be tweaked as follows: use a set, not a dictionary. Go through the string; for each character, if it is not in the set, add it, and if it is, delete it. Sets take up less space, don't need counters, can be implemented as bitsets if the alphabet is small, and this algorithm does not need two passes.

Python implementation: http://code.activestate.com/recipes/52560-remove-duplicates-from-a-sequence/
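The linked recipe keeps one copy of each element (the first occurrence) in a single pass. A rough Ruby equivalent of that set-based, single-pass approach might look like this (the name dedup is illustrative; note that keeping the first occurrence is a slightly different reading of "remove duplicates" than the counting approach above, which drops every repeated character entirely):

require 'set'

def dedup(s)
  seen = Set.new
  result = ""
  s.each_char do |c|
    next if seen.include?(c)   # already emitted once; skip further occurrences
    seen.add(c)
    result << c                # keep the first occurrence only
  end
  result
end

puts dedup("asdasjhdfasjhdfasbfdasdfaghsfdahgsdfahgsdfhgt")   # => "asdjhfbgt"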


You can also use a bitset instead of the additional array to keep track of found chars. Depending on which characters (a–z or more) are allowed, you size the bitset accordingly. This requires less space than an integer array.
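A minimal sketch of that idea in Ruby, assuming only lowercase a–z input and using plain integers as bitsets (one mask for characters seen at least once, one for characters seen more than once; the name undup_bitset is just for illustration):

def undup_bitset(s)
  seen = 0    # bit i set => letter ('a' + i) seen at least once
  dups = 0    # bit i set => letter ('a' + i) seen more than once
  s.each_char do |c|
    bit = 1 << (c.ord - 'a'.ord)
    dups |= bit if (seen & bit) != 0
    seen |= bit
  end
  result = ""
  s.each_char do |c|
    bit = 1 << (c.ord - 'a'.ord)
    result << c if (dups & bit) == 0    # keep only characters that were never repeated
  end
  result
end

puts undup_bitset("abcabd")   # => "cd"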
