Anagram algorithm with minimum complexity
I recently was asked to design an algorithm that checks if two strings are anagrams of one another. My goal was to minimize space and time complexity, so I came up with this algorithm:
- Create an array of 26 elements, each initialized to zero.
- Traverse the first string and for each character, increment the array element corresponding开发者_运维知识库 to that character.
- Traverse the second string and for each character, decrement the array element corresponding to that character.
- Scan over the array. If all elements are 0, the two strings are anagrams.
However, the time complexity of this algorithm is O(n) and I cannot come up with an algorithm with lower complexity. Does anybody know of one?
Your algorithm is asymptotically optimal. It's not possible to solve this problem in any better than Ω(n) time. To see this, suppose that an algorithm A exists that can solve the problem in o(n) time (note that this is little-o of n here). Then for any 1 > ε > 0, there is some n such that for any input of size at least n, the algorithm must terminate in at most εn steps. Set ε = 1/3 and consider any inputs to the algorithm that are of length at least n for the aforementioned n for this ε. Since the algorithm can look at most 1/3 of the characters in the two strings, then there must be two different inputs to the function, one that is a pair of anagrams and one that isn't, such that the algorithm looks at the same subset of the characters of each input. The function would then have to produce the same output in each case, and thus would be wrong on at least one of the inputs. We've reached a contradiction, so no such algorithm must exist.
You could possibly improve average performance with early exits. While scanning the 2nd string, if count[char] is 0 before you decrement, you don't have an anagram and you can stop scanning.
Also, if the strings are shorter than 26 chars, then in the last step, check only the chars in the first string for zeroes.
This doesn't change the big O, but it can change your average runtime to something less than the 2N+26 o the proposed solution, depending on your data.
Let's take a question: Given two strings s and t, write a function to determine if t is an anagram of s.
For example, s = "anagram", t = "nagaram", return true. s = "rat", t = "car", return false.
Method 1(Using HashMap ):
public class Method1 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
System.out.println(isAnagram(a, b ));// output => true
}
private static boolean isAnagram(String a, String b) {
Map<Character ,Integer> map = new HashMap<>();
for( char c : a.toCharArray()) {
map.put(c, map.getOrDefault(c, 0 ) + 1 );
}
for(char c : b.toCharArray()) {
int count = map.getOrDefault(c, 0);
if(count == 0 ) {return false ; }
else {map.put(c, count - 1 ) ; }
}
return true;
}
}
Method 2 :
public class Method2 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
System.out.println(isAnagram(a, b));// output=> true
}
private static boolean isAnagram(String a, String b) {
int[] alphabet = new int[26];
for(int i = 0 ; i < a.length() ;i++) {
alphabet[a.charAt(i) - 'a']++ ;
}
for (int i = 0; i < b.length(); i++) {
alphabet[b.charAt(i) - 'a']-- ;
}
for( int w : alphabet ) {
if(w != 0 ) {return false;}
}
return true;
}
}
Method 3 :
public class Method3 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
System.out.println(isAnagram(a, b ));// output => true
}
private static boolean isAnagram(String a, String b) {
char[] ca = a.toCharArray() ;
char[] cb = b.toCharArray();
Arrays.sort( ca );
Arrays.sort( cb );
return Arrays.equals(ca , cb );
}
}
Method 4 :
public class Method4 {
public static void main(String[] args) {
String a = "protijayi";
String b = "jayiproti";
//String c = "gini";
System.out.println(isAnagram(a, b ));// output => true
}
private static boolean isAnagram(String a, String b) {
Map<Integer, Integer> map = new HashMap<>();
a.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) + 1));
b.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) - 1));
//System.out.println(map.values());
for(int count : map.values()) {
if (count<0) return false;
}
return true;
}
}
To be sure the strings are anagrams you need to compare the whole strings - so how could that be faster than o(n)?
int anagram (char a[], char b[]) {
char chars[26];
int ana = 0;
int i =0;
for (i=0; i<26;i++)
chars[i] = 0;
if (strlen(a) != strlen(b))
return -1;
i = 0;
while ((a[i] != '\0') || (b[i] != '\0')) {
chars[a[i] - 'a']++;
chars[b[i] - 'a']--;
i++;
}
for (i=0; i<26;i++)
ana += chars[i];
return ana;
}
void main() {
char *a = "chimmy\0";
char *b = "yimmch\0";
printf ("Anagram result is %d.\n", anagram(a,b));
}
精彩评论