开发者

Minimum window width in string x that contains all characters of string y

Find minimum window width in string x that contains all characters of another string y. For example:

String x = "coobdafceeaxab"
String y = "abc"

The answer should be 5, because the shortest substring in x that contains all three letters of y is "bdafc".

I can think of a naive solution with complexity O(n^2 * log(m)), where n = len(x) and m = len(y). Can anyone suggest a better solution? Thanks.

Update: now think of it, if I change my set to tr1::unordered_map, then I can cut the complexity down to O(n^2), becaus开发者_StackOverflow社区e insertion and deletion should both be O(1).


time: O(n) (One pass)
space: O(k)

This is how I would do it:
Create a hash table for all the characters from string Y. (I assume all characters are different in Y).

First pass:
Start from first character of string X.
update hash table, for exa: for key 'a' enter location (say 1).
Keep on doing it until you get all characters from Y (until all key in hash table has value).
If you get some character again, update its newer value and erase older one.

Once you have first pass, take smallest value from hash table and biggest value.
Thats the minimum window observed so far.

Now, go to next character in string X, update hash table and see if you get smaller window.


Edit:

Lets take an example here:
String x = "coobdafceeaxab"
String y = "abc"

First initialize a hash table from characters of Y.
h[a] = -1
h[b] = -1
h[c] = -1

Now, Start from first character of X.
First character is c, h[c] = 0
Second character (o) is not part of hash, skip it.
..
Fourth character (b), h[b] = 3
..
Sixth character(a), enter hash table h[a] = 5.
Now, all keys from hash table has some value.
Smallest value is 0 (of c) and highest value is 5 (of a), minimum window so far is 6 (0 to 5).
First pass is done.

Take next character. f is not part of hash table, skip it.
Next character (c), update hash table h[c] = 7.
Find new window, smallest value is 3 (of b) and highest value is 7 (of c).
New window is 3 to 7 => 5.

Keep on doing it till last character of string X.

I hope its clear now.


Edit

There are some concerns about finding max and min value from hash.
We can maintain sorted Link-list and map it with hash table.
Whenever any element from Link list changes, it should be re-mapped to hash table.
Both these operation are O(1)

Total space would be m+m


Edit

Here is small visualisation of above problem: For "coobdafceeaxab" and "abc"

step-0:
Initial doubly linked-list = NULL
Initial hash-table = NULL

step-1:
Head<->[c,0]<->tail
h[c] = [0, 'pointer to c node in LL']

step-2:
Head<->[c,0]<->[b,3]<->tail
h[c] = [0, 'pointer to c node in LL'], h[b] = [3, 'pointer to b node in LL'],

Step-3:
Head<->[c,0]<->[b,3]<->[a,5]<->tail
h[c] = [0, 'pointer to c node in LL'], h[b] = [3, 'pointer to b node in LL'], h[a] = [5, 'pointer to a node in LL']
Minimum Window => difference from tail and head => (5-0)+1 => Length: 6

Step-4:
Update entry of C to index 7 here. (Remove 'c' node from linked-list and append at the tail)
Head<->[b,3]<->[a,5]<->[c,7]<->tail
h[c] = [7, 'new pointer to c node in LL'], h[b] = [3, 'pointer to b node in LL'], h[a] = [5, 'pointer to a node in LL'],
Minimum Window => difference from tail and head => (7-3)+1 => Length: 5

And so on..

Note that above Linked-list update and hash table update are both O(1).
Please correct me if I am wrong..


Summary:

TIme complexity: O(n) with one pass
Space Complexity: O(k) where k is length of string Y


I found this very nice O(N) time complexity version here http://leetcode.com/2010/11/finding-minimum-window-in-s-which.html, and shortened it slightly (removed continue in a first while , which allowed to simplify condition for the second while loop). Note, that this solution allows for duplicates in the second string, while many of the above answers do not.

private static String minWindow(String s, String t) {
    int[] needToFind = new int[256];
    int[] hasFound = new int[256];

    for(int i = 0; i < t.length(); ++i) {
       needToFind[t.charAt(i)]++;
    }

    int count = 0;
    int minWindowSize = Integer.MAX_VALUE;
    int start = 0, end = -1;
    String window = "";

    while (++end < s.length()) {
        char c = s.charAt(end);
        if(++hasFound[c] <= needToFind[c]) {
           count++;
        }

        if(count < t.length()) continue;
        while (hasFound[s.charAt(start)] > needToFind[s.charAt(start)]) {
           hasFound[s.charAt(start++)]--;
        }

        if(end - start + 1 < minWindowSize) {
           minWindowSize = end - start + 1;
           window = s.substring(start, end + 1);
        }
    }
    return window;
}


Here's my solution in C++:

int min_width(const string& x, const set<char>& y) {
  vector<int> at;
  for (int i = 0; i < x.length(); i++)
    if (y.count(x[i]) > 0)
      at.push_back(i);

  int ret = x.size();
  int start = 0;
  map<char, int> count;

  for (int end = 0; end < at.size(); end++) {
    count[x[at[end]]]++;
    while (count[x[at[start]]] > 1)
      count[x[at[start++]]]--;
    if (count.size() == y.size() && ret > at[end] - at[start] + 1)
      ret = at[end] - at[start] + 1;
  }
  return ret;
}

Edit: Here's an implementation of Jack's idea. It's the same time complexity as mine, but without the inner loop that confuses you.

int min_width(const string& x, const set<char>& y) {
  int ret = x.size();
  map<char, int> index;
  set<int> index_set;

  for (int j = 0; j < x.size(); j++) {
    if (y.count(x[j]) > 0) {
      if (index.count(x[j]) > 0)
        index_set.erase(index[x[j]]);
      index_set.insert(j);
      index[x[j]] = j;
      if (index.size() == y.size()) {
        int i = *index_set.begin();
        if (ret > j-i+1)
          ret = j-i+1;
      }
    }
  }
  return ret;
}

In Java it can be implemented nicely with LinkedHashMap:

static int minWidth(String x, HashSet<Character> y) {
    int ret = x.length();
    Map<Character, Integer> index = new LinkedHashMap<Character, Integer>();

    for (int j = 0; j < x.length(); j++) {
        char ch = x.charAt(j);
        if (y.contains(ch)) {
            index.remove(ch);
            index.put(ch, j);
            if (index.size() == y.size()) {
                int i = index.values().iterator().next();
                if (ret > j - i + 1)
                    ret = j - i + 1;
            }
        }
    }
    return ret;
}

All operations inside the loop take constant time (assuming hashed elements disperse properly).


There is an O(n solution to this problem). It very well described in this article. http://www.leetcode.com/2010/11/finding-minimum-window-in-s-which.html Hope it helps.


This is my solution in C++, just for reference.

Update: originally I used std::set, now I change it to tr1::unordered_map to cut complexity down to n^2, otherwise these two implementations look pretty similar, to prevent this post from getting too long, I only list the improved solution.

#include <iostream>
#include <tr1/unordered_map>
#include <string>

using namespace std;
using namespace std::tr1;

typedef tr1::unordered_map<char, int> hash_t;

// Returns min substring width in which sentence contains all chars in word
// Returns sentence's length + 1 if not found
size_t get_min_width(const string &sent, const string &word) {
    size_t min_size = sent.size() + 1;
    hash_t char_set; // char set that word contains
    for (size_t i = 0; i < word.size(); i++) {
        char_set.insert(hash_t::value_type(word[i], 1));
    }
    for (size_t i = 0; i < sent.size() - word.size(); i++) {
        hash_t s = char_set;
        for (size_t j = i; j < min(j + min_size, sent.size()); j++) {
            s.erase(sent[j]);
            if (s.empty()) {
                size_t size = j - i + 1;
                if (size < min_size) min_size = size;
                break;
            }
        }
    }
    return min_size;
}

int main() {
    const string x = "coobdafceeaxab";
    const string y = "abc";
    cout << get_min_width(x, y) << "\n";
}


An implementation of Jack's idea.

public int smallestWindow(String str1, String str2){
        if(str1==null || str2==null){
            throw new IllegalArgumentException();
        }
        Map<String, Node> map=new HashMap<String, Node>();
        Node head=null, current=null;

        for(int i=0;i<str1.length();i++){
            char c=str1.charAt(i);
            if(head==null){
                head=new Node(c);
                current=head;
                map.put(String.valueOf(c), head);
            }
            else{
                current.next=new Node(c);
                current.next.pre=current;
                current=current.next;
                map.put(String.valueOf(c), current);
            }
        }

        Node end=current;
        int min=Integer.MAX_VALUE;
        int count=0;

        for(int i=0;i<str2.length();i++){
            char c = str2.charAt(i);
            Node n=map.get(String.valueOf(c));
            if(n!=null){
                if(n.index==Integer.MAX_VALUE){
                    count++;
                }
                n.index=i;
                if(n==head){
                    Node temp=head;
                    head=head.next;
                    if(head==null){//one node
                        return 1;
                    }
                    head.pre=null;        
                    temp.pre=end;
                    end.next=temp;
                    temp.next=null;
                    end=temp;
                }
                else if(end!=n){
                    n.pre.next=n.next;
                    n.next.pre=n.pre;
                    n.pre=end;
                    n.next=null;
                    end.next=n;
                    end=n;
                }
                if(count==str1.length()){
                    min=Math.min(end.index-head.index+1, min);
                }
            }
        }
        System.out.println(map);
        return min;
    }


Simple java solution using the sliding window. Extending NitishMD's idea above:

public class StringSearchDemo {

public String getSmallestSubsetOfStringContaingSearchString(String toMatch,
        String inputString) {
    if (inputString.isEmpty() || toMatch.isEmpty()) {
        return null;
    }
    //      List<String> results = new ArrayList<String>(); // optional you can comment this out

    String smallestMatch = "";
    //      String largestMatch = ""; 
    int startPointer = 0, endPointer = 1;
    HashMap<Character, Integer> toMatchMap = new HashMap<Character, Integer>();

    for (char c : toMatch.toCharArray()) {
        if (toMatchMap.containsKey(c)) {
            toMatchMap.put(c, (toMatchMap.get(c) + 1));
        } else {
            toMatchMap.put(c, 1);
        }
    }

    int totalCount = getCountofMatchingString(toMatchMap, toMatch);
    for (int i = 0; i < inputString.length();) {
        if (!toMatchMap.containsKey(inputString.charAt(i))) {
            endPointer++;
            i++;
            continue;
        }
        String currentSubString = inputString.substring(startPointer,
                endPointer);
        if (getCountofMatchingString(toMatchMap, currentSubString) >= totalCount) {
            //              results.add(currentSubString); // optional you can comment this out
            if (smallestMatch.length() > currentSubString.length()) {
                smallestMatch = currentSubString;
            } else if (smallestMatch.isEmpty()) {
                smallestMatch = currentSubString;
            }
            //              if (largestMatch.length() < currentSubString.length()) {
            //                  largestMatch = currentSubString;
            //              }
            startPointer++;
        } else {
            endPointer++;
            i++;
        }

    }
    //      System.out.println("all possible combinations = " + results); // optional, you can comment this out
    //      System.out.println("smallest result = " + smallestMatch);
    //      System.out.println("largest result = " + largestMatch);
    return smallestMatch;
}

public int getCountofMatchingString(HashMap<Character, Integer> toMatchMap,
        String toMatch) {
    int match = 0;
    HashMap<Character, Integer> localMap = new HashMap<Character, Integer>();

    for (char c : toMatch.toCharArray()) {
        if (toMatchMap.containsKey(c)) {
            if (localMap.containsKey(c)) {
                if (localMap.get(c) < toMatchMap.get(c)) {
                    localMap.put(c, (localMap.get(c) + 1));
                    match++;
                }
            } else {
                localMap.put(c, 1);
                match++;
            }
        }
    }
    return match;
}

public static void main(String[] args) {
    String inputString = "zxaddbddxyy由ccbbwwaay漢字由来"; 
    String matchCriteria = "a由";
    System.out.println("input=" + matchCriteria);
    System.out.println("matchCriteria=" + inputString);
    String result = (new StringSearchDemo())
            .getSmallestSubsetOfStringContaingSearchString(matchCriteria, inputString);
    System.out.println("smallest possbile match = " + result);
}

}

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜