Splitting an array and counting values
I'm trying to figure out how to implement a method that does the following:
The method takes in a text file and counts the number of words up until a specified character "#".
An array is passed to the method as a parameter; it contains a list of sequential numbers. Each number in the array corresponds to the position at which a word occurs in the text file (disregarding #'s), so the 4th word corresponds to a value of 3 (n-1).
The method counts the number of times the word before a # occurs in the array and divides it by the total number of entries between #'s. It then takes the average of these ratios over all the #'s.
So an example to make this clear:
Say you have a text file containing:
Hi my name # is something #
A corresponding array would be:
0,0,1,1,1,2,2,2,2,3,4,4 (a number for each letter in sequence)
The first hash would occur between the 2's and the 3. So the 2's represent the word occurring before the #. So we would calculate (total number of 2's)/(total number of 0's, 1's and 2's). This would be 4/9.
We would then calculate the same between the two hashes (# is something #). 'something' corresponds to a 4, so we would have (total number of 4's)/(total number of 3's and 4's). This would be 2/3.
We would then take the average of 2/3 and 4/9, which is 5/9.
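For reference, here is a minimal Python sketch of the calculation described above. The function name `average_hash_ratio` is made up for illustration, and it assumes that every "#" is separated from the surrounding words by whitespace and that the file contents have already been read into a string.

```python
from collections import Counter

def average_hash_ratio(text, positions):
    """Average, over each '#'-terminated segment, of
    (entries for the word just before '#') / (entries for all words in the segment)."""
    counts = Counter(positions)   # how often each word index appears in the array
    ratios = []
    segment = []                  # word indices seen since the previous '#'
    word_index = 0                # '#' tokens are not counted as words
    for token in text.split():
        if token == "#":
            if segment:           # skip empty segments, e.g. two '#' in a row
                total = sum(counts[i] for i in segment)
                if total:         # avoid dividing by zero
                    ratios.append(counts[segment[-1]] / total)
            segment = []
        else:
            segment.append(word_index)
            word_index += 1
    return sum(ratios) / len(ratios) if ratios else 0.0

text = "Hi my name # is something #"
positions = [0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4]
print(average_hash_ratio(text, positions))  # approximately 0.556, i.e. 5/9
```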
I hope this is clear, let me know if you need any clarifications.
I'd split() the string on any whitespace chars. Now I have every word in an array, with each cell's index representing the corresponding number, and the length of the cell's content (the word) is the 'how many 0's or 1's or ..' I have.
That should solve the first part of your problem.
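A rough illustration of that first step, in Python (the variable names are just for the example, and it assumes each # is surrounded by whitespace so it ends up as its own cell):

```python
text = "Hi my name # is something #"

# split() with no arguments splits on any run of whitespace
tokens = text.split()

# length of each cell's content; the '#' cells are still in the array here
lengths = [len(token) for token in tokens]

print(tokens)
print(lengths)
```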
Then you need to find where each # is, its offset that is, but you want that offset expressed in words, not chars. So I'd iterate through the previously created array and check whether the word I stored is a #. If it is, I'd update a marker variable (this should hold the position/index of the last seen #), and calculate the division you want (4/8, 2/3 w/e). That is the length of the previous cell's content divided by the sum of the lengths from the marker until the current index-1.
I think that's about it for the logic. It's not that hard to implement. Just don't forget to check the bounds.
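Something along these lines, as a sketch in Python rather than a finished solution. The function name `ratios_by_word_length` is made up, and it assumes each # is a separate whitespace-delimited token. It uses word lengths as the counts, as described above, so its results depend on the actual letter counts and not on a separately supplied array.

```python
def ratios_by_word_length(text):
    tokens = text.split()
    marker = -1                # index of the last '#' seen (-1 means none yet)
    ratios = []
    for i, token in enumerate(tokens):
        if token != "#":
            continue
        # words strictly between the previous '#' and this one
        segment = tokens[marker + 1:i]
        marker = i
        if not segment:        # bounds check: '#' first, or two '#' in a row
            continue
        total = sum(len(word) for word in segment)
        # length of the word just before '#' over the segment's total length
        ratios.append(len(segment[-1]) / total)
    return ratios

ratios = ratios_by_word_length("Hi my name # is something #")
average = sum(ratios) / len(ratios) if ratios else 0.0
print(ratios, average)
```

For the sample sentence the first ratio is 4/8 ('name' over 'Hi', 'my', 'name').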