
Display the number of the characters in a string

I have a Java question: I am writing a program to read a string and display the number of characters in that string. I found some example code but I don't quite understand the last part - can anyone help?

int[] count = countLetters(line.toLowerCase());

for (int i=0; i<count.length; i++)
{
    if ((i + 1) % 10 == 0)
            System.out.println( (char) ('a' + i)+ " " + count[i]);
    else
        System.out.print( (char) ('a' + i)+ " " +  count[i]+ " ");
}

public static int[] countLetters(String line)
{
    int[] count = new int[26];

    for (int i = 0; i<line.length(); i++)
    {
        if (Character.isLetter(line.charAt(i)))
            count[(int)(line.charAt(i) - 'a')]++;
    }

    return count;
}


Your last loop does the following:

For every character, we test whether it is a letter; if it is, we increment the counter for that character. The array index is the character's offset from 'a', which means 'a' is 0, 'b' is 1, and so on (in other words, 'a' - 'a' is 0, 'b' - 'a' is 1, ...).

This is a common way to count the number of occurrences of characters in a string.
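
As a minimal sketch of that index arithmetic (the class and method names are made up for illustration and are not part of the question's code):

```java
// Hypothetical demo class; the name is not part of the original program.
class LetterIndexDemo {

    // Maps a lowercase ASCII letter to its slot in a 26-element array.
    static int indexOf(char letter) {
        // char values are numeric, so subtracting 'a' (97) yields 0..25
        return letter - 'a';
    }

    public static void main(String[] args) {
        System.out.println(indexOf('a')); // 0
        System.out.println(indexOf('b')); // 1
        System.out.println(indexOf('z')); // 25
    }
}
```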


The code you posted counts not the length of the string, but the number of occurrences of alphabet letters that occur in the lowercased string.

Character.isLetter(line.charAt(i))

retrieves the character at position i and returns true if it is a letter.

count[(int)(line.charAt(i) - 'a')]++;

increments the count at index character - 'a', which ranges from 0 to 25 for the lowercase letters a to z.

The result of the function is an array of 26 integers containing the counts per letter.

The for loop over the count array starts a new output line after every 10th count and uses

(char) ('a' + i)

to print the letter that the count belongs to.
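
To see the layout that loop produces, here is a rough sketch that builds the same text into a string instead of printing it piece by piece. The class and method names are hypothetical, and it uses '\n' where the original println would emit the platform line separator:

```java
// Hypothetical demo class; not part of the original program.
class PrintCountsDemo {

    // Builds "letter count" pairs, ten per line, like the question's loop.
    static String format(int[] count) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < count.length; i++) {
            // (char) ('a' + i) turns the index back into its letter
            out.append((char) ('a' + i)).append(' ').append(count[i]);
            // every 10th entry ends the line, all others are followed by a space
            out.append((i + 1) % 10 == 0 ? '\n' : ' ');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int[] count = new int[26];
        count['e' - 'a'] = 3; // pretend the input contained three e's
        System.out.print(format(count));
    }
}
```

With 26 entries this gives two full lines of ten letters (a to j, k to t) and a final partial line (u to z).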


I guess you are counting the occurrences of letters, not characters ('5' is also a character).

The last part:

for (int i = 0; i<line.length(); i++)
{
    if (Character.isLetter(line.charAt(i)))
        count[(int)(line.charAt(i) - 'a')]++;
}

It iterates over the input line and checks for each character whether it is a letter. If it is, it increments the count for that letter. The count is kept in an array of 26 integers (for the 26 letters in the Latin alphabet). The count for letter 'a' is kept at index 0, letter 'b' at 1, 'z' at 25. To get the index, the code subtracts the value of 'a' from the letter value (each character is not only a character/glyph, but also a numeric value). So if the letter is 'a', it subtracts the value of 'a' from itself, which gives 0, and so on.
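
One caveat worth knowing: Character.isLetter also returns true for letters outside a to z (accented letters, for example), and for those the computed index falls outside 0..25, so the original code would throw an ArrayIndexOutOfBoundsException. Below is a sketch of a guarded variant that checks the range explicitly; the class and method names are hypothetical and this is an alternative, not the code from the question:

```java
// Hypothetical demo class; not part of the original program.
class SafeCountDemo {

    // Counts only the lowercase ASCII letters 'a'..'z'; everything else is skipped.
    static int[] countAsciiLetters(String line) {
        int[] count = new int[26];
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c >= 'a' && c <= 'z') { // explicit range check instead of isLetter
                count[c - 'a']++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The string ends in an accented e followed by " 5"; neither is counted.
        int[] count = countAsciiLetters("caf\u00e9 5");
        System.out.println(count['c' - 'a']); // 1
        System.out.println(count['a' - 'a']); // 1
        System.out.println(count['f' - 'a']); // 1
    }
}
```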


In the method countLetters, the for loop goes through all characters in the line. The if checks to make sure it's a letter, otherwise it will be ignored.

line.charAt(i) yields the single character at position i. Its type is char.

Now deep inside Java, a char is just a number corresponding to a character code. Lowercase 'a' has a character code of 97, 'b' is 98 and so on. (int) forces conversion from char to int. So we take the character code, let's say it's a 'b' so the code is 98, and we subtract the code for 'a', which is 97, so we get the offset 1 (from the beginning of the alphabet). For any letter in the alphabet, the offset will be between 0 and 25 (inclusive).

So we use that offset as an index into the array count and use ++ to increment it. Then later the loop in the top part of the program can print out the counts.

The loop at the top is using the reverse "trick" to convert those offsets from 0 to 25 back into letters from a to z.
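
Putting both directions together, a small sketch could look like this. countLetters follows the question's logic; the summary method and the class name are hypothetical additions for illustration:

```java
// Hypothetical demo class tying the two loops together.
class RoundTripDemo {

    // Same counting logic as the question's countLetters.
    static int[] countLetters(String line) {
        int[] count = new int[26];
        for (int i = 0; i < line.length(); i++) {
            if (Character.isLetter(line.charAt(i))) {
                count[line.charAt(i) - 'a']++; // letter -> offset 0..25
            }
        }
        return count;
    }

    // Lists only the letters that actually occur, in alphabetical order.
    static String summary(String line) {
        int[] count = countLetters(line.toLowerCase());
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < count.length; i++) {
            if (count[i] > 0) {
                // offset 0..25 -> letter, the reverse of the subtraction above
                out.append((char) ('a' + i)).append('=').append(count[i]).append(' ');
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(summary("Hello")); // e=1 h=1 l=2 o=1
    }
}
```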


The 'last part', the implementation of the loop, is really hard to understand. Close to obfuscation ;) Here's a refactoring of the count method, split into two methods: a general one for all chars and a special one for just the lowercase letters:

// Counts every char whose code is below 256, indexed by char code.
public static int[] countAllASCII(String line) {
  int[] count = new int[256];
  char[] chars = line.toCharArray();

  for (char c : chars) {
    int index = (int) c;
    if (index < 256) { // skip chars that do not fit in the array
      count[index]++;
    }
  }

  return count;
}

// Extracts the 26 counts for 'a'..'z' from the full table.
public static int[] countLetters(String line) {
    int[] countAll = countAllASCII(line);
    int[] result = new int[26];
    System.arraycopy(countAll, (int) 'a', result, 0, 26);

    return result;
}

General idea: the countAllASCII method just counts all chars. Yes, the array is bigger, but at these sizes nobody cares today. The advantage: I don't have to test whether each char is a letter. The second method just copies the range of interest into a new (result) array and returns it.
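
For comparison, the same slice can be taken with java.util.Arrays.copyOfRange instead of System.arraycopy. This is only a sketch of an alternative, wrapped in a hypothetical class so it runs on its own; it is not the code from the answer above:

```java
import java.util.Arrays;

// Hypothetical demo class; the wrapper name is an assumption for illustration.
class CopyRangeDemo {

    // Counts every char whose code is below 256, indexed by char code.
    static int[] countAllASCII(String line) {
        int[] count = new int[256];
        for (char c : line.toCharArray()) {
            if (c < 256) {
                count[c]++;
            }
        }
        return count;
    }

    // Copies the range for 'a'..'z' (the end index is exclusive, hence 'z' + 1).
    static int[] countLetters(String line) {
        return Arrays.copyOfRange(countAllASCII(line), 'a', 'z' + 1);
    }

    public static void main(String[] args) {
        int[] count = countLetters("abba z1");
        System.out.println(count[0]);  // 2 (a)
        System.out.println(count[1]);  // 2 (b)
        System.out.println(count[25]); // 1 (z)
    }
}
```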

EDIT

I'd changed my code for a less unfriendly comment as well. Thanks anyway, Bombe.
