NGINX, FastCGI, UTF-8 encoding: output is ISO-8859-1 instead of UTF-8
I hope you can give me an idea about what's going wrong.
The scenario: I run gitweb (CGI) with a script in FastCGI mode:
#!/bin/sh
export FCGI_SOCKET_PATH=127.0.0.1:7001
su git -c "/var/www/vh_[vhost]/htdocs/gitweb.cgi --fastcgi &"
Then I use nginx to serve that content:
...
fastcgi_pass 127.0.0.1:7001;
...
Everything works as expected, but here's the problem:
$ wget "http://git.[host].de/?p=[repo].git;a=summary" -O /tmp/test.txt && file --mime-encoding /tmp/test.txt
> /tmp/test.txt: iso-8859-1
$ su git -c "./gitweb.cgi \"?p=[repo].git;a=summary\" > ./test" && file --mime-encoding ./test
> ./test: utf-8
This suggests that the output of gitweb.cgi itself is UTF-8, while the content served through nginx is ISO-8859-1.
Firebug's response headers:
Server nginx
Date Fri, 02 Sep 2011 14:14:08 GMT
Content-Type application/xhtml+xml; charset=utf-8
Transfer-Encoding chunked
Connection close
It looks like the transfer using the socket leads to an encoding problem.
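For reference, `file --mime-encoding` reports utf-8 roughly when the bytes form valid UTF-8 sequences, so a page containing a lone Latin-1 byte such as 0xFC (ü) gets classified as iso-8859-1 even though the Content-Type header still declares charset=utf-8. A minimal Python sketch of that distinction follows. The function name `looks_like_utf8` and the sample string are chosen here for illustration, and this is not how `file` is actually implemented:

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name = "Guido Günther"  # sample text with one non-ASCII character

as_utf8 = name.encode("utf-8")         # ü becomes the two bytes c3 bc
as_latin1 = name.encode("iso-8859-1")  # ü becomes the single byte fc

print(as_utf8.hex(" "))
print(as_latin1.hex(" "))
print(looks_like_utf8(as_utf8), looks_like_utf8(as_latin1))

# What a client shows when it trusts "charset=utf-8" but receives Latin-1 bytes:
print(as_latin1.decode("utf-8", errors="replace"))
```

The last line prints the name with a replacement character in place of the ü, which is the typical symptom of this kind of mismatch in a browser.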
I've tested a lot but can't figure out how to solve this.
Although you aren't using PHP, I fixed my issue by wrapping the pieces that were being output as ISO-8859-1 with utf8_encode(): http://php.net/manual/en/function.utf8-encode.php
If your CGI is in Perl, maybe http://perldoc.perl.org/utf8.html will solve your problem. It solved mine ... Zürich
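PHP's utf8_encode() converts an ISO-8859-1 byte string to UTF-8. As an illustration of that transformation only (this is not gitweb's code, and the helper name `latin1_to_utf8` is made up here), a Python sketch that also shows the usual pitfall of applying the conversion to data that is already UTF-8:

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret ISO-8859-1 bytes as text and re-encode them as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


latin1_bytes = "Zürich".encode("iso-8859-1")  # b'Z\xfcrich'
fixed = latin1_to_utf8(latin1_bytes)          # b'Z\xc3\xbcrich'
print(fixed.decode("utf-8"))

# Pitfall: converting bytes that are already UTF-8 double-encodes them.
double = latin1_to_utf8(fixed)
print(double.decode("utf-8"))
```

So a wrapper like this only helps if the wrapped pieces really are ISO-8859-1 at that point; otherwise it trades one kind of mojibake for another.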
Another option could be to add the following to the http { } block in your nginx.conf:
charset utf-8;
-sd
I can make it work by using fcgiwrap.
I thought some environment variables were different between the two methods, so I added the following code to the gitweb.cgi dispatch() sub:
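# Dump the environment to a file so the two spawn methods can be compared.
open my $tmplogfile, ">", "/tmp/gitweb-env.txt"
    or die "Cannot open /tmp/gitweb-env.txt: $!";
foreach my $varkey (sort keys %ENV) {
    print $tmplogfile "$varkey = $ENV{$varkey}\n";
}
close $tmplogfile;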
but the environment was the same.
fcgiwrap may be doing something differently, but I have not yet found what.
Here are the commands I use and the differences I found using tcpdump on the fcgi socket:
# gitweb spawned by fcgiwrap outputs utf-8
/usr/bin/spawn-fcgi -d /usr/share/gitweb -a 127.0.0.1 -p 3000 -u www-data -g gitolite -P /run/gitweb/gitweb.cgi.pid -- /usr/sbin/fcgiwrap
# Requires the following nginx gitweb_fastcgi_params
# fastcgi_param QUERY_STRING $query_string;
# fastcgi_param REQUEST_METHOD $request_method;
# fastcgi_param SCRIPT_NAME $fastcgi_script_name;
# fastcgi_param DOCUMENT_ROOT $document_root;
# With the following nginx configuration
# upstream gitweb {
#     server 127.0.0.1:3000;
# }
#
# server {
#     listen 80;
#
#     server_name git.example.net;
#
#     root /usr/share/gitweb;
#
#     access_log /var/log/nginx/gitweb-access.log;
#     error_log /var/log/nginx/gitweb-errors.log;
#
#     location / {
#         alias /usr/share/gitweb/gitweb.cgi;
#         include gitweb_fastcgi_params;
#         fastcgi_pass gitweb;
#     }
#
#     location /static {
#         alias /usr/share/gitweb/static;
#         expires 31d;
#     }
# }
# STDOUT captured on lo
# Beginning of the FCGI answer
# 00000000 01 06 00 01 1f f8 00 00 53 74 61 74 75 73 3a 20 ........ Status:
# 00000010 32 30 30 20 4f 4b 0d 0a 43 6f 6e 74 65 6e 74 2d 200 OK.. Content-
# 00000020 54 79 70 65 3a 20 61 70 70 6c 69 63 61 74 69 6f Type: ap plicatio
# 00000030 6e 2f 78 68 74 6d 6c 2b 78 6d 6c 3b 20 63 68 61 n/xhtml+ xml; cha
# 00000040 72 73 65 74 3d 75 74 66 2d 38 0d 0a 0d 0a 3c 3f rset=utf -8....<?
# 00000050 78 6d 6c 20 76 65 72 73 69 6f 6e 3d 22 31 2e 30 xml vers ion="1.0
# [...]
#
# "Guido Günther" as UTF-8
# 00000FA0 6c 65 3d 22 53 65 61 72 63 68 20 66 6f 72 20 63 le="Sear ch for c
# 00000FB0 6f 6d 6d 69 74 73 20 61 75 74 68 6f 72 65 64 20 ommits a uthored
# 00000FC0 62 79 20 47 75 69 64 6f 20 47 c3 bc 6e 74 68 65 by Guido G..nthe
# 00000FD0 72 22 20 63 6c 61 73 73 3d 22 6c 69 73 74 22 20 r" class ="list"
Before that, gitweb --fastcgi was spawned directly by spawn-fcgi:
# gitweb spawned by spawn-fcgi outputs iso-8859-1
/usr/bin/spawn-fcgi -d /usr/share/gitweb -a 127.0.0.1 -p 3000 -u www-data -g gitolite -P /run/gitweb/gitweb.cgi.pid -- /usr/share/gitweb/gitweb.cgi --fastcgi
# STDOUT captured on lo
# Beginning of the FCGI answer, with "00 46 02" in place of the "1f f8 00" seen in the utf-8 capture
# 00000000 01 06 00 01 00 46 02 00 53 74 61 74 75 73 3a 20 .....F.. Status:
# 00000010 32 30 30 20 4f 4b 0d 0a 43 6f 6e 74 65 6e 74 2d 200 OK.. Content-
# 00000020 54 79 70 65 3a 20 61 70 70 6c 69 63 61 74 69 6f Type: ap plicatio
# 00000030 6e 2f 78 68 74 6d 6c 2b 78 6d 6c 3b 20 63 68 61 n/xhtml+ xml; cha
# 00000040 72 73 65 74 3d 75 74 66 2d 38 0d 0a 0d 0a 00 00 rset=utf -8......
# 00000050 01 06 00 01 02 88 00 00 3c 3f 78 6d 6c 20 76 65 ........ <?xml ve
# 00000060 72 73 69 6f 6e 3d 22 31 2e 30 22 20 65 6e 63 6f rsion="1 .0" enco
# 00000070 64 69 6e 67 3d 22 75 74 66 2d 38 22 3f 3e 0a 3c ding="ut f-8"?>.<
# [...]
#
# "Guido Günther" as ISO-8859-1
# 00001128 74 6c 65 3d 22 53 65 61 72 63 68 20 66 6f 72 20 tle="Sea rch for
# 00001138 63 6f 6d 6d 69 74 73 20 61 75 74 68 6f 72 65 64 commits authored
# 00001148 20 62 79 20 47 75 69 64 6f 20 47 fc 6e 74 68 65 by Guid o G.nthe
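The first eight bytes of each capture are a FastCGI record header: version, record type, a two-byte request id, a two-byte content length, a padding length and a reserved byte. Decoding them suggests that the header bytes which differ between the two captures ("1f f8 00" versus "00 46 02") are record length and padding rather than anything related to character encoding; the encodings actually diverge in the body bytes (c3 bc versus fc). A small Python sketch, assuming the header layout from the FastCGI specification; `parse_fcgi_header` and `FcgiHeader` are names invented here:

```python
import struct
from typing import NamedTuple

FCGI_STDOUT = 6  # record type for response data, per the FastCGI spec


class FcgiHeader(NamedTuple):
    version: int
    type: int
    request_id: int
    content_length: int
    padding_length: int


def parse_fcgi_header(raw: bytes) -> FcgiHeader:
    """Decode the 8-byte header that starts every FastCGI record."""
    version, rtype, request_id, content_length, padding, _reserved = struct.unpack(
        "!BBHHBB", raw[:8]
    )
    return FcgiHeader(version, rtype, request_id, content_length, padding)


# Header bytes copied from the two captures above.
via_fcgiwrap = parse_fcgi_header(bytes.fromhex("01 06 00 01 1f f8 00 00"))
via_spawn_fcgi = parse_fcgi_header(bytes.fromhex("01 06 00 01 00 46 02 00"))

print(via_fcgiwrap)
print(via_spawn_fcgi)
```

Read this way, fcgiwrap sends the HTTP headers and the start of the page in a single 8184-byte STDOUT record, while gitweb's own FastCGI mode sends the 70-byte header block as a separate record padded by 2 bytes, followed by a second record starting at offset 0x50. Both are valid framing, so the cause of the Latin-1 output is more likely in how the body is written to the FastCGI stream than in the record headers.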