StringTokenizer split at "<br/>"
Maybe I am stupid, but I don't understand the behaviour of StringTokenizer here:
import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;
String object = (String) value;
String escaped = escapeHtml(object);
StringTokenizer tokenizer = new StringTokenizer(escaped, escapeHtml("<br/>"));
If, for example, value is
Hej<br/>$user.get(0).name Har vundet<br/><table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table><br/>
Then the result is
Hej
$use
.
e
(0).name Ha
vunde
a
e
o
de
='1'
h
Name
h
h
P
ayed
h
h
B
ewed
h
#fo
each( $u in $use
)
d
$u.name
d
d
$u.p
ayed
d
d
$u.
ewed
d
#end
a
e
It makes no sense to me.
How can I make it behave as I expect?
From the documentation:
The characters in the delim argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.
In other words, the characters that tell StringTokenizer when to separate the string are:
- <
- b
- r
- /
- >
When it matches any of those characters in the string (the variable escaped in your code), the StringTokenizer instance splits the string at that point and drops the delimiter character. You can confirm this by noting that the letter r does not occur in the output.
Use String.split instead, as others suggest.
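To make the character-set behaviour concrete, here is a minimal sketch. The class name TokenizerDelimiterDemo, the helper tokens, and the sample input are made up for illustration; only StringTokenizer itself comes from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDelimiterDemo {
    // Hypothetical helper: collects every token StringTokenizer yields
    // for the given input and delimiter set.
    public static List<String> tokens(String input, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // "<br/>" is read as five separate delimiters: '<', 'b', 'r', '/', '>'.
        // The 'b' and 'r' inside the words are therefore dropped as well.
        System.out.println(tokens("bar<br/>rob", "<br/>")); // prints [a, o]
    }
}
```

Only "a" and "o" survive, because every other character of the input belongs to the delimiter set.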
Each character in the delimiter string is treated as a separate delimiter. So your code splits on each "&", "l", "t", ";", "b", "r", "/" or "g" character (since escapeHtml will replace the "<" and ">" with "&lt;" and "&gt;" respectively).
You probably want to use String.split, which takes a regular expression as the thing to split on:
String[] parts = object.split("<br/>");
or
String[] parts = escaped.split(escapeHtml("<br/>"));
Just make sure that there are no regex special characters in your split token.
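If the delimiter might contain regex special characters, one option is to quote it with java.util.regex.Pattern.quote before passing it to split. The sketch below is only an illustration; the class name LiteralSplitDemo and the helper splitLiteral are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralSplitDemo {
    // Hypothetical helper: splits on the delimiter taken literally,
    // even when it contains regex metacharacters.
    public static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // '|' means alternation in a regex, so it has to be quoted.
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // prints [a, b, c]
        // "<br/>" has no metacharacters, so quoting it is harmless.
        System.out.println(Arrays.toString(splitLiteral("Hej<br/>Har vundet<br/>", "<br/>"))); // prints [Hej, Har vundet]
    }
}
```

Note that split drops trailing empty strings, which is why the final "<br/>" does not produce an extra empty element.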
If you want to split a string by a whole word rather than by individual characters, you are better off using String.split.
I've done the test:
public static void main(String[] args) {
    String s = "Hej<br/>$user.get(0).name Har vundet<br/><table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table><br/>";
    // split takes a regular expression; "<br/>" contains no special characters
    String[] lines = s.split("<br/>");
    for (String line : lines) {
        System.out.println(line);
    }
}
and here you have the result:
Hej
$user.get(0).name Har vundet
<table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table>
Hi,
StringTokenizer splits on each character of the delimiter string.
You need to use split (be careful, though, as it takes a regular expression):
String[] lines = "some html string<br/>with line breaks<br/>".split("<br/>");
You cannot use StringTokenizer with a multi-character delimiter. One possible solution to your problem is to replace "<br/>" with a character that you can guarantee will not appear in your string, and then use StringTokenizer with that character as the delimiter.
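A rough sketch of that replacement approach follows. It assumes the control character U+0001 never occurs in the input; that choice, the class name SentinelTokenizerDemo, and the helper splitOn are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class SentinelTokenizerDemo {
    // Assumption: this control character (U+0001) never appears in the text.
    static final String SENTINEL = String.valueOf((char) 1);

    // Hypothetical helper: replaces the multi-character delimiter with the
    // sentinel, then tokenizes on that single character.
    public static List<String> splitOn(String input, String delimiter) {
        // String.replace treats both arguments literally (no regex involved).
        String marked = input.replace(delimiter, SENTINEL);
        StringTokenizer st = new StringTokenizer(marked, SENTINEL);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(splitOn("Hej<br/>Har vundet<br/>", "<br/>")); // prints [Hej, Har vundet]
    }
}
```

StringTokenizer never returns empty tokens, so consecutive "<br/>" markers collapse into a single split point with this approach.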