How to read a file containing UTF-8 hex encoded characters and then decode the characters to HTML hex numbers?
I have a file containing UTF-8 hex encoded characters, as below:
<root>
<element>1 \xc3\x97 2 = 2</element>
</root>
I want to read the file, transform all the \xhh sequences into the equivalent HTML hex numbers (hexadecimal character references), and write the result to a new file. So, given a file with the above contents, the new file must look like:
<root>
<element>1 &#xd7; 2 = 2</element>
</root>
Thanks!
Assuming you've opened the input stream with the :utf8 layer, so that the data has already been decoded into characters, this substitution will fix the data:
s/([^\x00-\x7F])/sprintf "&#x%x;", ord $1/ge;
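For context, here is a minimal sketch of a complete script built around the substitution above. It assumes the file contains the literal four-character text \xhh, as the question shows, rather than raw UTF-8 bytes. For that reason it reads the file with :raw and decodes explicitly with Encode, instead of using the :utf8 layer. The helper name to_html_hex and the command-line file names are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (the name is not from the question or the answer).
sub to_html_hex {
    my ($line) = @_;
    # Assumption: the file holds literal backslash-x escapes; turn each into one byte.
    $line =~ s/\\x([0-9A-Fa-f]{2})/chr hex $1/ge;
    # Decode the resulting byte string as UTF-8 to get real characters.
    my $text = decode('UTF-8', $line);
    # Same substitution as in the answer: each non-ASCII character becomes &#x...;
    $text =~ s/([^\x00-\x7F])/sprintf "&#x%x;", ord $1/ge;
    return $text;
}

# Usage: perl convert.pl input.xml output.xml (both file names are placeholders).
if (@ARGV == 2) {
    my ($in_path, $out_path) = @ARGV;
    open my $in,  '<:raw', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>',     $out_path or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        print {$out} to_html_hex($line);
    }
    close $out or die "Cannot close $out_path: $!";
}
```

If the file actually holds raw UTF-8 bytes, the first substitution matches nothing and the decode step alone does the work, so the sketch should cover both readings of the question. Note that it would also convert any \xhh text that was meant to stay literal.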