开发者

How to get a substring in some length for special chars like Chinese

For example, I can get 80 chars with {description?substring(0, 80)} if description is in English, but for Chinese chars, I can get开发者_C百科 only about 10 chars, and there is a garbage char at the end always.

How can I get 80 chars for any language?


FreeMarker relies on String#substring to do the actual (UTF-16-chars-based?) substring calculation, which doesn't work well with Chinese characters. Instead one should uses Unicode code points. Based on this post and FreeMarker's own substring builtin I hacked together a FreeMarker TemplateMethodModelEx implementation which operates on code points:

public class CodePointSubstring implements TemplateMethodModelEx {

    @Override
    public Object exec(List args) throws TemplateModelException {
        int argCount = args.size(), left = 0, right = 0;
        String s = "";
        if (argCount != 3) {
            throw new TemplateModelException(
                    "Error: Expecting 1 string and 2 numerical arguments here");
        }
        try {
            TemplateScalarModel tsm = (TemplateScalarModel) args.get(0);
            s = tsm.getAsString();
        } catch (ClassCastException cce) {
            String mess = "Error: Expecting numerical argument here";
            throw new TemplateModelException(mess);
        }

        try {
            TemplateNumberModel tnm = (TemplateNumberModel) args.get(1);
            left = tnm.getAsNumber().intValue();

            tnm = (TemplateNumberModel) args.get(2);
            right = tnm.getAsNumber().intValue();

        } catch (ClassCastException cce) {
            String mess = "Error: Expecting numerical argument here";
            throw new TemplateModelException(mess);
        }
        return new SimpleScalar(getSubstring(s, left, right));
    }

    private String getSubstring(String s, int start, int end) {
        int[] codePoints = new int[end - start];
        int length = s.length();
        int i = 0;
        for (int offset = 0; offset < length && i < codePoints.length;) {
            int codepoint = s.codePointAt(offset);
            if (offset >= start) {
                codePoints[i] = codepoint;
                i++;
            }
            offset += Character.charCount(codepoint);
        }
        return new String(codePoints, 0, i);
    }
}

You can put an instance of it into your data model root, e.g.

SimpleHash root = new SimpleHash();
root.put("substring", new CodePointSubstring());
template.process(root, ...);

and use the custom substring method in FTL:

${substring(description, 0, 80)}

I tested it with non-Chinese characters, which still worked, but so far I haven't tried it with Chinese characters. Maybe you want to give it a try.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜