How to remove whitespace while scanning text in Java
I've implemented several different "scanners" in Java, from the Scanner class to simply using
String.split("\ss+")
but when there are several whitespace characters in a row, as in "the_quick____brown___fox" (imagine the underscores are whitespace),
they all end up treating some of the whitespace as tokens. Any suggestions?
I'm not sure what you are talking about. For example,
String[] parts = "the quick brown fox".split("\\s+");
correctly tokenizes the string, with no leading or trailing whitespace on any token and no empty tokens. If the input string may have leading or trailing whitespace, calling String.trim()
before splitting removes the possibility of empty tokens.
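To illustrate the trim-then-split idea, here is a minimal sketch. The class name TrimSplitDemo and the helper tokenize are made up for this example; only String.trim() and String.split() come from the answer itself.

```java
import java.util.Arrays;

class TrimSplitDemo {
    // Hypothetical helper: trim first, then split on runs of whitespace.
    static String[] tokenize(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            // "".split("\\s+") would return one empty token, so handle it here.
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Prints [the, quick, brown, fox]
        System.out.println(Arrays.toString(tokenize("  the quick    brown   fox  ")));
        // Without trim(), leading whitespace yields an empty first token:
        // prints [, the, quick]
        System.out.println(Arrays.toString("  the quick".split("\\s+")));
    }
}
```

Trailing whitespace alone does not cause trouble, because split() drops trailing empty strings; it is the leading whitespace that produces the empty token.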
EDIT I surmise from your other comment that you are reading the input a line at a time and then tokenizing the lines. You probably need to trim
each line before tokenizing.
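Assuming the input really is read a line at a time, the per-line trim could look roughly like this. LineTokenizerDemo and tokenizeLines are hypothetical names, and a StringReader stands in for whatever file or stream is actually being read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineTokenizerDemo {
    // Hypothetical helper: reads the text line by line, trims each line,
    // skips blank lines, and splits the rest on runs of whitespace.
    static List<String> tokenizeLines(String text) {
        List<String> tokens = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // nothing to tokenize on a blank line
                }
                for (String token : trimmed.split("\\s+")) {
                    tokens.add(token);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [the, quick, brown, fox]
        System.out.println(tokenizeLines("   the quick  \n\n  brown    fox\n"));
    }
}
```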
Use java.util.Scanner.
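A short sketch of what that might look like, with ScannerDemo and tokens as made-up names. With its default delimiter, Scanner treats any run of whitespace as a single separator, so next() never returns an empty or whitespace-only token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerDemo {
    // Hypothetical helper: collects every whitespace-delimited token.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [the, quick, brown, fox]
        System.out.println(tokens("  the quick    brown   fox  "));
    }
}
```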