Corrupt Spanish characters when saving variables to a text file in Perl
I think I have an encoding problem. My knowledge of Perl is not great (I am much better with other languages), but I have tried everything I can think of and checked lots of other posts.
- I am collecting a name and address. This can contain non-English characters, in this case Spanish.
- A PHP process uses curl to execute a .pl script and passes the values URL-encoded.
- The .pl executes a function in a .pm which writes the data to a text file. No database is involved.
Both the .pl and .pm have
use Encode;
use utf8;
binmode (STDIN, 'utf8');
binmode (STDOUT, 'utf8');
defined. Below is the function that writes the text to a file:
sub bookingCSV(@){
my $filename = "test.csv";
utf8::decode($_[1]{booking}->{LeadNameFirst});
open OUT, ">:utf8", $filename;
$_="\"$_[1]{booking}->{BookingNo}¦¦$_[1]{booking}->{ShortPlace}¦¦$_[1]{booking}->{ShortDev}¦¦$_[1]{booking}->{ShortAcc}¦¦$_[1]{booking}->{LeadNameFirst}¦¦$_[1]{booking}->{LeadNameLast}¦¦$_[1]{booking}->{Email}¦¦$_[1]{booking}->{Telephone}¦¦$_[1]{booking}->{Company}¦¦$_[1]{booking}->{Address1}¦¦$_[1]{booking}->{Address2}¦¦$_[1]{booking}->{Town}¦¦$_[1]{booking}->{County}¦¦$_[1]{booking}->{Zip}¦¦$_[1]{booking}->{Country}¦¦";
print OUT $_;
close (OUT);
}
All Spanish characters are corrupted in the text file. I have tried decode on one specific field "LeadNameFirst" but that has not made a difference. I left the code in place just in case it is useful.
Thanks for any help.
What is the encoding of the input? If the input encoding is not utf-8, then it will not do you any good to decode it as utf-8 input.
Does the input come from an HTML form? Then the encoding probably matches the encoding of the web page it came from. ISO-8859-1 is a common default encoding for American/European locales. Anyway, once you discover the encoding, you can decode the input with it:
$name = decode('iso-8859-1',$_[1]{booking}->{LeadNameFirst});
print OUT "name is $name\n"; # utf8 layer already enabled
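If you cannot find out the input encoding ahead of time, one possible fallback is to try a strict UTF-8 decode first and only then assume ISO-8859-1. This is a rough sketch, not a definitive fix: the helper name decode_input and the choice of ISO-8859-1 as the fallback are assumptions for illustration, and the guess can be wrong, because every byte sequence is valid ISO-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: turn raw input bytes into a Perl character string.
# It tries strict UTF-8 first. If the bytes are not valid UTF-8, it assumes
# ISO-8859-1 (an assumption; substitute the encoding your form really uses).
sub decode_input {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; eval traps that.
    # LEAVE_SRC keeps decode() from modifying $bytes in place.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $text ? $text : decode('iso-8859-1', $bytes);
}
```

You would call it once per field, for example on $_[1]{booking}->{LeadNameFirst}, before building the output line. The result is a character string, so the ">:utf8" layer on the output handle encodes it exactly once.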
Some browsers look for and respect an accept-charset attribute inside a <form> tag, e.g.,
<form action="/my_form_processor.php" accept-charset="UTF-8">
...
</form>
This will (cross your fingers) cause you to receive the form input as utf-8 encoded.
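Since the values reach the .pl script URL-encoded, the Perl side then has two steps to perform: undo the percent-encoding to get raw bytes, and decode those bytes as UTF-8. Below is a minimal sketch of both steps, assuming the input really is UTF-8. The helper name url_decode_utf8 is hypothetical, and if you read parameters through a CGI library that already unescapes them, you would skip the first two substitutions.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: undo URL encoding, then decode the bytes as UTF-8.
sub url_decode_utf8 {
    my ($value) = @_;
    $value =~ tr/+/ /;                              # '+' stands for a space in form data
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # %XX becomes one raw byte
    return decode('UTF-8', $value);                 # bytes become characters
}
```

For example, url_decode_utf8("Jos%C3%A9") returns a four-character string ending in é (code point U+00E9), which can then be printed to a handle opened with ">:utf8".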