How do I write character ALT-0146 to an XML file using Perl?

The character is ’ (U+2019, the right single quotation mark), and I cannot find a way to detect, replace, or write it properly to an XML file. At first I was using string concatenation, then I wised up and switched to XML::Writer, but it still doesn't work: the XML is still broken afterward. (I need the output in UTF-8.)

This is a test I wrote that still breaks:

    my $output = new IO::File(">$foundFilePath");
    my $writer = new XML::Writer(OUTPUT => $output);
    $writer->xmlDecl("UTF-8");
    $writer->startTag("xml");
    $writer->startTag("test");
    $writer->characters("’");
    $writer->endTag("test");
    $writer->endTag("xml");
    $writer->end();
    $output->close();

To be more specific, I am trying to get the data from this page: http://investing.businessweek.com/businessweek/research/stocks/private/snapshot.asp?privcapId=4439466

And Mr. William O’Keefe is messing everything up.


There are two things you need to do. If you want to write UTF-8 to a file, you need to say so:

    my $output = IO::File->new($foundFilePath, ">:utf8");

And if you want to use literal UTF-8 strings in your source code, you need to say

    use utf8;

at the beginning of your program. Otherwise, Perl assumes your source code is Latin-1.
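
To see what the output layer does, here is a minimal sketch using only core modules. It writes the character as the escape `\x{2019}` so that it does not depend on how the script file is saved, and it uses the stricter `:encoding(UTF-8)` spelling of the layer. The scratch file name `layer_demo.txt` is a placeholder for illustration, not something from the question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+2019 RIGHT SINGLE QUOTATION MARK, written as an escape so this sketch
# does not depend on the encoding of the source file itself.
my $chars = "O\x{2019}Keefe";

# Explicit encoding: 7 characters become 9 bytes (U+2019 is E2 80 99 in UTF-8).
my $bytes = encode('UTF-8', $chars);
printf "characters: %d, bytes: %d\n", length($chars), length($bytes);

# Hypothetical scratch file, used only for this demonstration.
my $path = 'layer_demo.txt';

# The :encoding(UTF-8) layer applies the same encoding on every print.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} $chars;
close($out) or die "Cannot close $path: $!";

# Read the file back as raw bytes to check what actually landed on disk.
open(my $in, '<:raw', $path) or die "Cannot read $path: $!";
my $on_disk = do { local $/; <$in> };
close($in);
unlink $path;

print $on_disk eq $bytes ? "file holds UTF-8\n" : "unexpected bytes\n";
```

The first line printed is `characters: 7, bytes: 9`, which is the distinction between Perl's character strings and the encoded bytes that end up in the file.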

Here's a complete example script:

    use utf8;       # this source file contains a literal UTF-8 character
    use strict;
    use warnings;
    use IO::File;
    use XML::Writer;

    my $foundFilePath = 'test.xml';
    # Open the file with a UTF-8 layer so characters are encoded on output.
    my $output = IO::File->new($foundFilePath, ">:utf8");
    my $writer = XML::Writer->new(OUTPUT => $output);
    $writer->xmlDecl("UTF-8");
    $writer->startTag("xml");
    $writer->startTag("test");
    $writer->characters("’");    # U+2019, a single character thanks to "use utf8"
    $writer->endTag("test");
    $writer->endTag("xml");
    $writer->end();
    $output->close();
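
For text that comes from a downloaded page rather than from a literal in the script, the bytes have to be decoded before they reach the writer. The sketch below assumes the page is served as Windows-1252, where this apostrophe is the single byte 0x92 (decimal 146, which is what ALT-0146 types). That encoding is a guess, not something confirmed for the page in the question, so check the response's charset first.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the fetched bytes are Windows-1252, so the apostrophe
# arrives as the single byte 0x92.
my $raw = "William O\x92Keefe";

# Decode bytes into Perl characters; 0x92 becomes U+2019.
my $text = decode('cp1252', $raw);
printf "U+%04X\n", ord(substr($text, 9, 1));    # prints U+2019

# Encoding the decoded string as UTF-8 yields the bytes E2 80 99
# in place of the original 0x92.
my $utf8 = encode('UTF-8', $text);
print join(' ', map { sprintf '%02X', ord } split //, substr($utf8, 9, 3)), "\n";
```

With a filehandle opened as in the script above, the decoded `$text` (not `$raw` or `$utf8`) is what would be passed to `$writer->characters`, since the output layer does the encoding.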