Get Ant concat to ignore BOMs?

I have an Ant build that concatenates my JavaScript into one file and then compresses it. The problem is that Visual Studio's default encoding attaches a BOM to every file. How do I configure Ant to strip out the BOMs that would otherwise appear in the middle of the resulting concatenated file?

My googling turned up this discussion, which describes exactly the problem I'm having but doesn't provide a solution: http://marc.info/?l=ant-user&m=118598847927096


The Unicode byte order mark is the code point U+FEFF. This concat task strips out every U+FEFF character while concatenating two files:

<concat encoding="UTF-8" outputencoding="UTF-8" destfile="nobom-concat.txt">
  <filelist dir="." files="bom1.txt,bom2.txt" />
  <filterchain>
    <deletecharacters chars="&#xFEFF;" />
  </filterchain>
</concat>

This form of the concat command tells the task to decode the files as UTF-8 character data. I'm assuming UTF-8 as this is usually where Java/BOM issues occur.

In UTF-8, the BOM is encoded as the bytes EF BB BF. If you needed it to appear at the start of the resultant file, you could use a subsequent concatenation to prefix the output file with a BOM again.
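
As a rough sketch of that second step: this assumes the nested header element of concat passes the character reference through unchanged, which I haven't verified on every Ant version. The output name bom-concat.txt is made up for illustration.

```xml
<!-- Sketch: re-add a single BOM at the start of the already stripped output. -->
<!-- bom-concat.txt is a hypothetical output name, not part of the original build. -->
<concat encoding="UTF-8" outputencoding="UTF-8" destfile="bom-concat.txt">
  <!-- U+FEFF supplied as header text; UTF-8 output should encode it as EF BB BF -->
  <header>&#xFEFF;</header>
  <filelist dir="." files="nobom-concat.txt" />
</concat>
```

This second concat has no filterchain, so nothing removes the header character again. Check the first three bytes of the result with a hex viewer before relying on it.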

Other UTF encodings encode U+FEFF differently: FE FF in UTF-16BE, FF FE in UTF-16LE, 00 00 FE FF in UTF-32BE, and FF FE 00 00 in UTF-32LE.
