Lowercase urls in Varnish (inline C)
In Varnish (3.0), URLs are treated case-sensitively. By that I mean http://test.com/user/a4556 is treated differently from http://test.com/user/A4556. On my web server they're treated as the same URL. What I'd like to do is have Varnish lowercase all request URLs as they come in.
I managed to find this discussion but the creator of Varnish indicates that I will have to use inline C to do it. I could achieve this in a simplistic way using multiple regexes but that just seems like it's bound to fail.
Ideally, what I'd like is a VCL configuration to do this (an example of this can be found here), but I'd settle for a C function that takes in a const char * and returns a const char * (I'm not a C programmer, so forgive me if I get the syntax wrong).
It must be mentioned that Varnish includes the ability to uppercase and lowercase strings in the std vmod ( https://www.varnish-cache.org/docs/trunk/reference/vmod_std.generated.html#func-tolower )
This is much cleaner than the embedded C route (which is disabled by default in Varnish 4). Here's an example I use to normalize the request Host and URL:
import std;
sub vcl_recv {
    # normalize Host header
    set req.http.Host = std.tolower(regsub(req.http.Host, ":[0-9]+", ""));
    ....
}
sub vcl_hash {
    # set cache key to lowercased req.url
    hash_data(std.tolower(req.url));
    ....
}
Okay, I went ahead and solved this for myself. Here's the VCL:
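C{
    #include <ctype.h>
    //lovingly lifted from:
    //https://github.com/cosimo/varnish-accept-language/blob/master/examples/accept-language.vcl
    // Lowercase a NUL-terminated string in place.
    static void strtolower(char *s) {
        char *c;
        for (c = s; *c; c++) {
            // Cast to unsigned char: passing a negative char to tolower() is undefined.
            *c = tolower((unsigned char)*c);
        }
    }
}C
sub vcl_recv {
    C{
        // VRT_r_req_url() returns const char *; the cast makes the in-place write explicit.
        strtolower((char *)VRT_r_req_url(sp));
    }C
}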
I put this in a separate VCL file and then added an include for it.
I'll just share my solution, which expands Richard's code into a complete solution.
If the URL contains upper-case letters, we redirect the user to the lower-case URL instead of simply normalizing the URL before it enters the cache machinery. This prevents search engines from indexing mixed-case URLs separately from the lower-case ones.
# Define a function that converts a string to lower-case in-place.
# http://stackoverflow.com/questions/6857445
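C{
    #include <ctype.h>
    static void strtolower(char *c) {
        // A missing header comes back as NULL; do nothing in that case.
        if (c == NULL) {
            return;
        }
        for (; *c; c++) {
            // Cast to unsigned char: passing a negative char to tolower() is undefined.
            *c = tolower((unsigned char)*c);
        }
    }
}C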
sub vcl_recv {
    if (req.http.host ~ "[A-Z]" || req.url ~ "[A-Z]") {
        # Convert host and path to lowercase in-place.
        C{
            strtolower(VRT_GetHdr(sp, HDR_REQ, "\005host:"));
            strtolower((char *)VRT_r_req_url(sp));
        }C
        # Use req.http.location as a scratch register; any header will do.
        set req.http.location = "http://" req.http.host req.url;
        error 999 req.http.location;
    }
    # Fall-through to default
}
sub vcl_error {
    # Check for redirects - redirects are performed using: error 999 "http://target-url/"
    # Thus we piggyback the redirect target in the error response variable.
    if (obj.status == 999) {
        set obj.http.location = obj.response;
        set obj.status = 301;
        set obj.response = "Moved permanently";
        return(deliver);
    }
    # Fall-through to default
}
There's an ugly cast from const char * to char * when converting req.url to lower-case... basically, we're modifying the string in-place despite Varnish telling us not to. It seems to work. :-)
Nearly 5 years after the original question was asked I think we have a cleaner answer available now. This SO question still comes up top in a search for "lowercase Varnish".
Here is a simplified variation on the example that Fastly recommends:
# at the top of your VCL
import std;
sub vcl_recv {
    # Lowercase all incoming URLs. It will also be lowercase by the time the hash is computed.
    set req.url = std.tolower(req.url);
}
https://www.fastly.com/blog/varnish-tip-case-insensitivity
If you are looking for a C function that converts an upper case string to lower case, this will do:
#include <ctype.h>
static char *
to_lower (char *str)
{
  char *s = str;
  while (*s)
    {
      /* Cast to unsigned char: passing a negative char to tolower() is undefined. */
      *s = tolower ((unsigned char) *s);
      s++;
    }
  return str;
}
Note that this modifies the string in-place. So you may want to pass a copy of the original string as argument.
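If you would rather not touch the original at all, one option is to lowercase into a fresh copy. The following is only a minimal sketch: the name to_lower_copy is made up for illustration (it is not part of Varnish or the C library), and it assumes heap allocation is acceptable where you call it.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Varnish or libc function): returns a newly
 * allocated lower-case copy of src, or NULL if src is NULL or the
 * allocation fails. The caller is responsible for free()ing the result. */
static char *
to_lower_copy (const char *src)
{
  size_t len, i;
  char *dst;

  if (src == NULL)
    return NULL;
  len = strlen (src);
  dst = malloc (len + 1);
  if (dst == NULL)
    return NULL;
  /* Cast to unsigned char: passing a negative char to tolower() is undefined. */
  for (i = 0; i < len; i++)
    dst[i] = (char) tolower ((unsigned char) src[i]);
  dst[len] = '\0';
  return dst;
}
```

The input stays const, so no cast is needed at the call site; the trade-off is that the caller has to remember to free() the copy.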
Note that to set the URL from the C block without crashing, use:
VRT_l_req_url(sp,"new-string", vrt_magic_string_end);
(Pulled this detail from "varnishd -C" output.) Here's an untested revision to the first answer:
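C{
    #include <ctype.h>
    #include <string.h>
    //lovingly lifted from:
    //https://github.com/cosimo/varnish-accept-language/blob/master/examples/accept-language.vcl
    static void strtolower(char *s) {
        char *c;
        for (c = s; *c; c++) {
            // Cast to unsigned char: passing a negative char to tolower() is undefined.
            *c = tolower((unsigned char)*c);
        }
    }
}C
sub vcl_recv {
    C{
        const char *url = VRT_r_req_url(sp);
        char urlRewritten[1000];
        // Only rewrite URLs that fit in the buffer, terminator included;
        // strcpy (not strcat) because the buffer starts out uninitialized.
        if (url != NULL && strlen(url) < sizeof(urlRewritten)) {
            strcpy(urlRewritten, url);
            strtolower(urlRewritten);
            VRT_l_req_url(sp, urlRewritten, vrt_magic_string_end);
        }
    }C
}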