How to count jargon and abbreviation links in an HTML web page using Java
How can I count jargon link text, meaning words that are unfamiliar to the user (e.g., UNHCR, USR), in a web page? I would also like to know how to detect and count abbreviations (undefined acronyms) in a web page. I would like to count jargon and abbreviation words in an HTML web page using an automated tool. Thank you.
You can use the Java SAXParser API (an XML parser) to parse the HTML web page, provided the page is well-formed XHTML, since an XML parser rejects markup that is not well-formed. Walk through the page with SAXParser and, for each node that you want to check, examine its text with ordinary Java string handling, which should be simple enough. For example, if you want to count the jargon links in a page, you can find all the link (<a>) nodes in the HTML page using the XML parser and then check the text value of these nodes for jargon or abbreviation words.
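A minimal sketch of this approach is shown here, under some assumptions that are mine rather than part of the question: the input is well-formed XHTML, an "abbreviation" is any token of 2 to 6 uppercase letters, and a link counts as jargon if it contains such a token that is not in a caller-supplied set of known terms. The class name JargonLinkCounter and its methods are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JargonLinkCounter {

    // Assumption: an abbreviation is a standalone token of 2 to 6 uppercase letters.
    private static final Pattern ACRONYM = Pattern.compile("\\b[A-Z]{2,6}\\b");

    // Collects the text content of every <a> element in a well-formed XHTML string.
    static List<String> linkTexts(String xhtml) throws Exception {
        final List<String> texts = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xhtml)), new DefaultHandler() {
            private StringBuilder current; // non-null only while inside an <a> element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("a".equalsIgnoreCase(qName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("a".equalsIgnoreCase(qName) && current != null) {
                    texts.add(current.toString().trim());
                    current = null;
                }
            }
        });
        return texts;
    }

    // Counts links whose text contains at least one acronym not listed in knownTerms.
    static int countJargonLinks(String xhtml, Set<String> knownTerms) throws Exception {
        int count = 0;
        for (String text : linkTexts(xhtml)) {
            Matcher m = ACRONYM.matcher(text);
            while (m.find()) {
                if (!knownTerms.contains(m.group())) {
                    count++;
                    break; // count each link at most once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><p>"
                + "<a href=\"/a\">UNHCR</a> <a href=\"/b\">Contact us</a> "
                + "<a href=\"/c\">USR report</a> <a href=\"/d\">HTML guide</a>"
                + "</p></body></html>";
        Set<String> known = new HashSet<>(Arrays.asList("HTML"));
        System.out.println(countJargonLinks(page, known)); // prints 2 (UNHCR and USR)
    }
}
```

Real-world pages are often not well-formed, and SAXParser will throw an exception on them, so in practice the page may first need to be cleaned into XHTML or parsed with a lenient HTML parser instead. The uppercase-token rule is only a rough heuristic; a dictionary of defined terms for the site would give better results.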
Hope this helps.