Is there an isSomething() library for Java? [closed]
I want to know if there is a library for Java with implementations like isTemperature(), isPercentual(), isDistanceUnit(), isWeightUnit(), isProperName(), isDate(), isYear(), isPhone(), isLocation(), and whatever else can be defined.
I am interested not only in units but in every kind of classification that is possible, whether the token is a number or a word.
This will be used to classify words in a text.
Not that I am aware of. However, you can still create methods yourself that do the same thing...
isTemperature()
boolean isTemperature(String check) {
    // A temperature is taken to be a string whose first (and only) degree sign
    // is its last character, e.g. "25°".
    int degree = check.indexOf('°');
    return degree >= 0 && degree == check.length() - 1;
}
isPercentual()
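boolean isPercentual(String check) {
    String trimmed = check.trim();
    // A percentage is a number followed by a trailing '%', e.g. "12.5%".
    if (!trimmed.endsWith("%")) {
        return false;
    }
    try {
        // Parse only the part before the '%'; parsing the whole string always fails.
        Double.parseDouble(trimmed.substring(0, trimmed.length() - 1));
    } catch (NumberFormatException e) {
        return false;
    }
    return true;
}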
isDistanceUnit()
boolean isDistanceUnit(String check, boolean customary) {
    String[] customaryUnits = {"mi", "yd", "ft", "in"};
    String[] metricUnits = {"mm", "cm", "dm", "m", "km"};
    String[] units = customary ? customaryUnits : metricUnits;
    String lower = check.toLowerCase();
    // Loose substring match: "m" also matches any word containing that letter.
    // Use lower.equals(unit) instead if check is a single token.
    for (String unit : units) {
        if (lower.contains(unit)) {
            return true;
        }
    }
    return false;
}
isWeightUnit()
boolean isWeightUnit(String check, boolean customary) {
    // "t" (ton) is lowercase because check is lowercased before comparing.
    String[] customaryUnits = {"lb", "oz", "t"};
    String[] metricUnits = {"kg"}; // I'm sorry, that's all I know :(
    String[] units = customary ? customaryUnits : metricUnits;
    String lower = check.toLowerCase();
    // Loose substring match, as in isDistanceUnit().
    for (String unit : units) {
        if (lower.contains(unit)) {
            return true;
        }
    }
    return false;
}
isProperName()
boolean isProperName(String check) {
    // A proper name is taken to be any word starting with a capital A-Z.
    if (check.isEmpty()) {
        return false;
    }
    char first = check.charAt(0);
    return first >= 'A' && first <= 'Z';
}
isDate()
UPDATE: Now I can give you this one. A quick note before I do, though. Add these lines at the beginning of your code or this method won't work.
import java.text.SimpleDateFormat;
import java.text.ParseException;
//--------------------------------------------------------------------------------
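boolean isDate(String check) {
    SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
    String trimmed = check.trim();
    // Reject anything whose length differs from the pattern, e.g. "2011-3-5".
    if (trimmed.length() != dateFormat.toPattern().length()) {
        return false;
    }
    // Strict parsing rejects impossible dates such as 2011-02-30.
    dateFormat.setLenient(false);
    try {
        dateFormat.parse(trimmed);
    } catch (ParseException pe) {
        return false;
    }
    return true;
}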
isYear()
boolean isYear(String check) {
    // A year is taken to be exactly four digits, e.g. "1999".
    if (check.length() != 4) {
        return false;
    }
    // Checking digits directly also rejects signed input such as "-123".
    for (char c : check.toCharArray()) {
        if (c < '0' || c > '9') {
            return false;
        }
    }
    return true;
}
isPhone()
You haven't really said what kind of phone number you want to check for. I'm guessing you want one in this form (865-867-5309).
boolean isPhone(String check) {
    // Matches exactly three digits, a hyphen, three digits, a hyphen, four digits.
    return check.matches("\\d{3}-\\d{3}-\\d{4}");
}
isLocation()
I apologize I can't give you this one right now. :(
isEmail()
boolean isEmail(String check) {
    String[] emailDomains = {".com", ".net", ".org"};
    String[] emailProviders = {"gmail", "yahoo", "hotmail", "aol", "tds", "comcast", "charter", "peoplepc"}; // add more if you want
    int at = check.indexOf('@');
    // Require a non-empty local part and exactly one '@'.
    if (at <= 0 || at != check.lastIndexOf('@')) {
        return false;
    }
    String host = check.substring(at + 1).toLowerCase();
    // The part after '@' must be a known provider followed by a known domain.
    for (String provider : emailProviders) {
        for (String domain : emailDomains) {
            if (host.equals(provider + domain)) {
                return true;
            }
        }
    }
    return false;
}
You can add more units for the isDistanceUnit() and isWeightUnit() methods if you wish. If you need any more methods or if you have any questions, just ask. :)
Is there an equivalent library in a language you have experience with already?
I can't think of many applications where you'd need to check for the presence of all of those formats in a single data source but I'd use regular expressions to do the job.
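As a rough illustration of the regular-expression approach, here is a minimal sketch. The class name TokenClassifier, the category labels, and the patterns themselves are my own assumptions, not part of any existing library, and the patterns only cover the simplest form of each format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not an existing library: returns the first category
// whose regular expression matches the whole token.
class TokenClassifier {
    // LinkedHashMap keeps insertion order, so more specific rules are tried first.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();

    static {
        RULES.put("DATE", Pattern.compile("\\d{4}-\\d{2}-\\d{2}"));
        RULES.put("PHONE", Pattern.compile("\\d{3}-\\d{3}-\\d{4}"));
        RULES.put("PERCENT", Pattern.compile("[+-]?\\d+(\\.\\d+)?%"));
        // \u00B0 is the degree sign; an optional C or F may follow it.
        RULES.put("TEMPERATURE", Pattern.compile("[+-]?\\d+(\\.\\d+)?\u00B0[CF]?"));
        RULES.put("YEAR", Pattern.compile("\\d{4}"));
        RULES.put("PROPER_NAME", Pattern.compile("[A-Z][a-z]+"));
    }

    static String classify(String token) {
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            // matches() requires the entire token to fit the pattern.
            if (rule.getValue().matcher(token).matches()) {
                return rule.getKey();
            }
        }
        return "UNKNOWN";
    }
}
```

Calling TokenClassifier.classify("42%") yields "PERCENT", and anything that fits no rule comes back as "UNKNOWN". Keeping the rules in one ordered table means a new format is a single extra line rather than a new method.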
EDIT:
If your data is heterogeneous and you want to just work out what it is, you probably need some kind of classifier. Try jBNC or classifier4j.
Try either of the first two links that appear above yours:
http://www.google.com/search?q=java+units+library&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
What you really need is some ontology to add a great deal of context. Check out Protege and Cyc. What you're asking for goes far beyond mere validation and regular expressions. After re-reading your question, it sounds like you want to read a document and somehow have your parser pick out tokens that match those units and discern what they are from the context. If that hits the mark, you've got a very difficult problem on your hands. It's much more along the lines of natural language processing.
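To show in the smallest possible way why context matters, here is a sketch that only accepts a unit when a number comes right before it. The class ContextTagger, its label format, and the unit lists are assumptions made for illustration; this is nowhere near real natural language processing, it just shows that "in" or "m" means nothing on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: tags "<number> <unit>" pairs and ignores unit-like
// words that are not preceded by a number.
class ContextTagger {
    private static final Set<String> DISTANCE =
            new HashSet<>(Arrays.asList("mm", "cm", "dm", "m", "km", "mi", "yd", "ft", "in"));
    private static final Set<String> WEIGHT =
            new HashSet<>(Arrays.asList("kg", "lb", "oz"));

    static List<String> tag(String text) {
        List<String> found = new ArrayList<>();
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i + 1 < tokens.length; i++) {
            // Context check: the current token must be a plain number.
            if (!tokens[i].matches("\\d+(\\.\\d+)?")) {
                continue;
            }
            // Strip trailing punctuation from the candidate unit, then lowercase it.
            String unit = tokens[i + 1].replaceAll("[.,;:!?]+$", "").toLowerCase();
            if (DISTANCE.contains(unit)) {
                found.add("DISTANCE:" + tokens[i] + " " + unit);
            } else if (WEIGHT.contains(unit)) {
                found.add("WEIGHT:" + tokens[i] + " " + unit);
            }
        }
        return found;
    }
}
```

For the sentence "I walked 12 km in the rain with 3 kg." this reports the 12 km and 3 kg pairs but not the preposition "in", which a plain unit lookup would have flagged as inches.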