Corrupt Spanish characters when saving variables to a text file in Perl

I think I have an encoding problem. My knowledge of Perl is not great; I am much better with other languages. I have tried everything I can think of and checked lots of other posts.

  1. I am collecting a name and address. These can contain non-English characters, in this case Spanish.
  2. A PHP process uses curl to execute a .pl script and passes the values URL-encoded.
  3. The .pl executes a function in a .pm which writes the data to a text file. No database is involved.

Both the .pl and .pm have

use Encode;
use utf8;

binmode (STDIN, 'utf8');
binmode (STDOUT, 'utf8');

defined. Below is the function that writes the text to a file:

sub bookingCSV(@){
my $filename = "test.csv";
utf8::decode($_[1]{booking}->{LeadNameFirst});
open OUT, ">:utf8", $filename;
$_="\"$_[1]{booking}->{BookingNo}¦¦$_[1]{booking}->{ShortPlace}¦¦$_[1]{booking}->{ShortDev}¦¦$_[1]{booking}->{ShortAcc}¦¦$_[1]{booking}->{LeadNameFirst}¦¦$_[1]{booking}->{LeadNameLast}¦¦$_[1]{booking}->{Email}¦¦$_[1]{booking}->{Telephone}¦¦$_[1]{booking}->{Company}¦¦$_[1]{booking}->{Address1}¦¦$_[1]{booking}->{Address2}¦¦$_[1]{booking}->{Town}¦¦$_[1]{booking}->{County}¦¦$_[1]{booking}->{Zip}¦¦$_[1]{booking}->{Country}¦¦";
print OUT $_;
close (OUT);
}

All Spanish characters are corrupted in the text file. I have tried decode on one specific field "LeadNameFirst" but that has not made a difference. I left the code in place just in case it is useful.

Thanks for any help.


What is the encoding of the input? If the input is not UTF-8, then decoding it as UTF-8 will not do you any good.
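One rough way to find out what you are actually receiving is to attempt a strict UTF-8 decode and see whether it fails. This is only a sketch, assuming the Encode module; the helper name `looks_like_utf8` and the sample byte strings are made up for illustration and are not from your script:

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: returns 1 if the byte string is well-formed UTF-8, else 0.
sub looks_like_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a check flag may modify its argument
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

# The same name, "Jose Munoz" with accents, in two different encodings.
my $latin1_bytes = "Jos\xE9 Mu\xF1oz";            # ISO-8859-1 bytes
my $utf8_bytes   = "Jos\xC3\xA9 Mu\xC3\xB1oz";    # UTF-8 bytes

for my $sample ([latin1 => $latin1_bytes], [utf8 => $utf8_bytes]) {
    my ($label, $bytes) = @$sample;
    printf "%s sample: %s\n", $label,
        looks_like_utf8($bytes) ? "valid UTF-8" : "not valid UTF-8";
}
```

Pure ASCII input passes this check as well, so a positive result only tells you the bytes are not obviously in some other encoding.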

Does the input come from an HTML form? Then the encoding probably matches the encoding of the web page it came from. ISO-8859-1 is a common default encoding for American/European locales. Anyway, once you discover the encoding, you can decode the input with it:

$name = decode('iso-8859-1',$_[1]{booking}->{LeadNameFirst});
print OUT "name is $name\n"; # utf8 layer already enabled
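To see the whole round trip in one place, here is a minimal sketch. It assumes the value really does arrive as ISO-8859-1 bytes, and it writes to an in-memory buffer that stands in for your test.csv; the variable names are mine, not from your module:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the incoming value is ISO-8859-1 bytes ("Jose" with an accented e).
my $raw_first_name = "Jos\xE9";

# Turn the bytes into Perl characters using the encoding they arrived in.
my $first_name = decode('iso-8859-1', $raw_first_name);

# In-memory "file" standing in for test.csv; the :encoding(UTF-8) layer
# encodes characters to UTF-8 bytes on the way out.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "Cannot open buffer: $!";
print {$out} "name is $first_name\n";
close($out) or die "Cannot close buffer: $!";

# The accented e (one byte in ISO-8859-1) is now the two-byte UTF-8 sequence C3 A9.
printf "stored bytes: %s\n", join ' ', map { sprintf '%02X', ord } split //, $buffer;
```

The point is that decoding happens once, with the real input encoding, and encoding happens once, in the output layer. Decoding with the wrong encoding, or encoding twice, is what usually produces the corrupted characters.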

Some browsers look for and respect an accept-charset attribute inside a <form> tag, e.g.,

<form action="/my_form_processor.php" accept-charset="UTF-8"> 
...
</form>

This will (cross your fingers) cause you to receive the form input encoded as UTF-8.
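If the values do reach the .pl script URL-encoded and in UTF-8, the Perl side might look roughly like the sketch below. The `url_unescape` helper and the sample value are hypothetical; in a real script a CGI or URI module would normally handle the unescaping for you:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: undo percent-encoding, yielding raw bytes (not characters).
sub url_unescape {
    my ($value) = @_;
    $value =~ tr/+/ /;                                # '+' means space in form data
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX becomes a single byte
    return $value;
}

# Assumed sample: an accented two-word name, sent as URL-encoded UTF-8.
my $encoded = 'Jos%C3%A9+Mu%C3%B1oz';

my $bytes = url_unescape($encoded);    # 12 bytes
my $name  = decode('UTF-8', $bytes);   # 10 characters

printf "bytes: %d, characters: %d\n", length($bytes), length($name);
```

Once `$name` holds characters rather than bytes, printing it to a handle opened with a UTF-8 layer should write the accents correctly.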
