StringTokenizer split at "<br/>"

2023-02-01 10:57

Maybe I am stupid, but I don't understand the behaviour of StringTokenizer here:

import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;
import java.util.StringTokenizer;

String object = (String) value;
String escaped = escapeHtml(object);
StringTokenizer tokenizer = new StringTokenizer(escaped, escapeHtml("<br/>"));

If, for example, value is

Hej<br/>$user.get(0).name Har vundet<br/><table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table><br/>

Then the result is

Hej
$use
.
e
(0).name Ha
 vunde
a
e 
o
de
='1'
h
Name
h
h
P
ayed
h
h
B
ewed
h
#fo
each( $u in $use
 )
d
$u.name
d

d
$u.p
ayed
d

d
$u.
ewed
d
#end
a
e

It makes no sense to me.

How can I make it behave as I expect?

From the documentation:

The characters in the delim argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.

In other words, the characters that tell StringTokenizer when to separate the string are the individual characters of the escaped delimiter &lt;br/&gt;, namely &, l, t, ;, b, r, / and g.

Whenever it meets any of those characters in the string (the variable escaped in your code), the StringTokenizer instance ends the current token and discards the delimiter character. You can confirm this by noting that the letter r does not occur in the output.
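A minimal sketch of that behaviour, for illustration only. The class name TokenizerDemo and the helper tokens are made up for this example, and the escaped delimiter is written out literally (it is what escapeHtml("<br/>") returns) so the code runs without commons-lang:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original question.
public class TokenizerDemo {
    // Collects every token StringTokenizer produces for the given delimiter set.
    static List<String> tokens(String text, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // The escaped form of "<br/>", written literally.
        String delims = "&lt;br/&gt;";
        // Each of & l t ; b r / g is a delimiter on its own,
        // so the trailing "r" of "user" is swallowed as well.
        System.out.println(tokens("Hej&lt;br/&gt;user", delims)); // prints [Hej, use]
    }
}
```

The input is cut wherever any single delimiter character occurs, which is why the output in the question is chopped at every l, t, b, r and g.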

Use String.split, instead, as others suggest.

Each character in the delimiter string is treated as a separate delimiter. So your code splits on each "&", "l", "t", ";", "b", "r", "/" or "g" character (since escapeHtml will replace the "<" and ">" with "&lt;" and "&gt;" respectively).

You probably want to use String.split, which takes a regular expression as the thing to split on:

String[] parts = object.split("<br/>");

or, if you need to split the escaped string:

String[] parts = escaped.split(escapeHtml("<br/>"));

Just make sure that there are no regex special characters in your split token.
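If the delimiter might contain regex special characters, one option is to wrap it in Pattern.quote so that split treats it literally. A small sketch, with the class name QuotedSplit and the helper splitLiteral invented for the example:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
public class QuotedSplit {
    // Splits on the delimiter taken literally, whatever regex metacharacters it contains.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // "<br/>" has no metacharacters, so quoting changes nothing here.
        System.out.println(Arrays.toString(splitLiteral("a<br/>b<br/>", "<br/>"))); // prints [a, b]
        // "|" is a regex metacharacter; unquoted it would split between every character.
        System.out.println(Arrays.toString(splitLiteral("1.5|2.5", "|"))); // prints [1.5, 2.5]
    }
}
```

Note that split drops trailing empty strings, which is why the final "<br/>" does not produce an extra element.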

If you want to split a string on a whole word rather than on individual characters, you are better off using String.split.

I've done the test:

public static void main(String[] args){
    String s = "Hej<br/>$user.get(0).name Har vundet<br/><table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table><br/>";
    String[] lines = s.split("<br/>");
    for(String ss:lines)
        System.out.println(ss);
}

and here you have the result:

Hej
$user.get(0).name Har vundet
<table border='1'><tr><th>Name</th><th>Played</th><th>Brewed</th></tr>#foreach( $u in $user )<tr><td>$u.name</td> <td>$u.played</td> <td>$u.brewed</td></tr>#end</table>

Hi,

StringTokenizer treats every character of the delimiter string as a separate delimiter.

You need to use split (be careful, though, as it takes a regular expression):

String[] lines = "some html string<br/>with line breaks<br/>".split("<br/>");

You cannot use StringTokenizer with a multi-character delimiter. One possible solution to your problem is to replace "<br/>" with a single character that you can guarantee will not appear in your string, and then use StringTokenizer with that character as the delimiter.
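A rough sketch of that workaround, assuming the control character U+0001 never occurs in the input (pick a different sentinel if it can). The class name SentinelTokenizer and the helper splitOn are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, for illustration only.
public class SentinelTokenizer {
    // Assumption: this character never appears in the text being split.
    private static final String SENTINEL = "\u0001";

    static List<String> splitOn(String text, String delimiter) {
        // String.replace(CharSequence, CharSequence) replaces literally, not as a regex.
        String marked = text.replace(delimiter, SENTINEL);
        StringTokenizer st = new StringTokenizer(marked, SENTINEL);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(splitOn("Hej<br/>Har vundet<br/>", "<br/>")); // prints [Hej, Har vundet]
    }
}
```

StringTokenizer never returns empty tokens, so two consecutive "<br/>" delimiters collapse into one; use String.split if empty pieces matter.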
