Comparing two sets of text
I have two paragraphs of text: one is saved in a file, and the other is entered by a user who is trying to type out the same paragraph. I want to compare the two and tell the user how accurately they copied it. Are there any techniques for doing this? These are the cases I think make it complex:
- What if the user misspelled a word?
- What if the user skipped a word in between?
- What if the user skipped two words and the rest of the text is the same?
Do a diff between the input and the file. There is a JavaScript library for that here: http://code.google.com/p/google-diff-match-patch/. It will tell you exactly what is different, and you can then use that information to determine how accurate the copy is.
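As a rough sketch of that last step, here is one way the diff could be turned into a score. It assumes the diff arrives as a list of [op, text] pairs with op equal to -1 (deleted), 0 (equal) or 1 (inserted), which is the shape diff-match-patch's diff_main returns. The helper name copyAccuracy and the scoring formula are my own assumptions, not part of the library, and the sample diff is written by hand so the snippet runs without the library loaded.

```javascript
// Op codes in the style of diff-match-patch: -1 = delete, 0 = equal, 1 = insert.
const DIFF_DELETE = -1;
const DIFF_EQUAL = 0;
const DIFF_INSERT = 1;

// Hypothetical helper (not part of any library): returns the fraction of
// characters that match, relative to the longer of the two texts (0..1).
function copyAccuracy(diffs) {
  let equal = 0;
  let deleted = 0;  // characters in the original that the user left out
  let inserted = 0; // characters the user typed that are not in the original
  for (const [op, text] of diffs) {
    if (op === DIFF_EQUAL) equal += text.length;
    else if (op === DIFF_DELETE) deleted += text.length;
    else if (op === DIFF_INSERT) inserted += text.length;
  }
  const longest = Math.max(equal + deleted, equal + inserted);
  return longest === 0 ? 1 : equal / longest;
}

// Hand-written diff of "the quick brown fox" (file) vs "the quik fox" (user).
const sampleDiffs = [
  [DIFF_EQUAL, "the qui"],
  [DIFF_DELETE, "c"],
  [DIFF_EQUAL, "k "],
  [DIFF_DELETE, "brown "],
  [DIFF_EQUAL, "fox"],
];

// 12 of the 19 original characters survive in the typed text.
console.log((copyAccuracy(sampleDiffs) * 100).toFixed(1) + "% accurate");
```

With the real library you would pass in the result of diff_main(fileText, userText) instead of sampleDiffs. Dividing by the longer text is only one possible choice; it penalizes both missing and extra characters.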
You're looking for friendly diff output. Try something like this: JavaScript Diff Algorithm
The sample should be simple enough:
var diff = diffString(
"The red brown fox jumped over the rolling log.",
"The brown spotted fox leaped over the rolling log"
);
Working example: http://jsbin.com/uhalo3
You can do this in two ways:
1. This one gives quite a precise report:
- Measure the time the user took to write.
- Use split to make an array of every word in your file, and do the same for the entered text.
- Compare each word entered by the user with the word at the same position in your list, and also with the one before and the one after it (you need to see whether a word was skipped, otherwise every comparison from that point on will be misaligned).
- Count the errors (you can use Levenshtein distance to measure how many mistakes there were in each word).
- Give the report.
2. Use Levenshtein distance over the two strings (yes, treat the whole text as a single string). This one is much easier to use, but the report is not as precise.
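Both approaches can be sketched in a few lines. This is only an illustration under some assumptions of mine: the function names levenshtein and compareWords, the shape of the report object, and the choice to look ahead by up to two words (to catch one or two skipped words) are not from any library. Timing is left out.

```javascript
// Levenshtein distance via dynamic programming, keeping two rows at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Word-by-word comparison (hypothetical helper). When words differ, look
// ahead up to maxSkip words in the original to detect skipped words;
// otherwise treat the pair as a misspelling and count character errors.
function compareWords(original, typed, maxSkip = 2) {
  const want = original.trim().split(/\s+/);
  const got = typed.trim().split(/\s+/);
  const report = { skipped: 0, misspelled: 0, charErrors: 0, extra: 0 };
  let i = 0;
  let j = 0;
  while (i < want.length && j < got.length) {
    if (want[i] === got[j]) { i++; j++; continue; }
    let skip = 0;
    for (let k = 1; k <= maxSkip; k++) {
      if (want[i + k] === got[j]) { skip = k; break; }
    }
    if (skip > 0) {
      report.skipped += skip; // realign on the next matching word
      i += skip;
    } else {
      report.misspelled++;
      report.charErrors += levenshtein(want[i], got[j]);
      i++; j++;
    }
  }
  report.skipped += want.length - i; // original words never reached
  report.extra += got.length - j;    // words typed past the end
  return report;
}

const fileText = "the quick brown fox jumps";
const userText = "the quik fox jumps";
console.log(compareWords(fileText, userText)); // way 1: per-word report
console.log(levenshtein(fileText, userText));  // way 2: one number for the whole text
```

The word-level report separates skipped words from misspellings, while the whole-string distance folds everything into a single edit count, which is why it is easier to compute but less informative.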