C++ URLencode library (Unicode capable)?

I need a library that can URLencode a string/char array.

Now, I can hex encode an ASCII array like here: http://www.codeguru.com/cpp/cpp/cpp_mfc/article.php/c4029

But I need something that works with Unicode. Note: it must work on Linux AND on Windows!

CURL has a quite nice function for this:

 char *encodedURL = curl_easy_escape(handle,WEBPAGE_URL, strlen(WEBPAGE_URL));

but first, that needs CURL, and second, it is not Unicode capable, as one sees from the strlen call.


If I read the question correctly and you want to do this yourself without using curl, I think I have a solution (assuming UTF-8). I believe this is a conformant and portable way of URL-encoding query strings:

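#include <boost/function_output_iterator.hpp>
#include <boost/bind.hpp>
#include <algorithm>
#include <cctype>
#include <sstream>
#include <iostream>
#include <iterator>
#include <iomanip>
#include <string>

namespace {
  // Encode a single byte: alphanumerics pass through, everything else becomes %XX.
  std::string encimpl(std::string::value_type v) {
    // Cast first: passing a negative char (any UTF-8 byte >= 0x80 where char
    // is signed) to isalnum is undefined behaviour.
    if (std::isalnum(static_cast<unsigned char>(v)))
      return std::string()+v;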

    std::ostringstream enc;
    enc << '%' << std::setw(2) << std::setfill('0') << std::hex << std::uppercase << int(static_cast<unsigned char>(v));
    return enc.str();
  }
}

std::string urlencode(const std::string& url) {
  // Find the start of the query string
  const std::string::const_iterator start = std::find(url.begin(), url.end(), '?');

  // If there isn't one there's nothing to do!
  if (start == url.end())
    return url;

  // store the modified query string
  std::string qstr;

  std::transform(start+1, url.end(),
                 // Append the transform result to qstr
                 boost::make_function_output_iterator(boost::bind(static_cast<std::string& (std::string::*)(const std::string&)>(&std::string::append),&qstr,_1)),
                 encimpl);
  return std::string(url.begin(), start+1) + qstr;
}

It has no non-standard dependencies other than Boost, and if you don't like the Boost dependency, it's not that hard to remove.
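As a rough illustration of removing the Boost dependency, the same logic can be written as a plain loop. This is only a sketch under the same assumptions as the code above (UTF-8 input, only the part after the first '?' is encoded, only ASCII letters and digits are left alone). The names urlencode_noboost and is_ascii_alnum are made up for this example and are not part of the original answer.

```cpp
#include <string>

namespace {
  // Hypothetical helper: locale-independent check for ASCII letters and digits.
  bool is_ascii_alnum(unsigned char c) {
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
  }
}

// Hypothetical Boost-free variant of urlencode(): percent-encodes every byte
// after the first '?' that is not an ASCII letter or digit.
std::string urlencode_noboost(const std::string& url) {
  static const char hex[] = "0123456789ABCDEF";

  // Find the start of the query string; without one there is nothing to do.
  const std::string::size_type q = url.find('?');
  if (q == std::string::npos)
    return url;

  // Copy everything up to and including the '?' unchanged.
  std::string out(url, 0, q + 1);

  for (std::string::size_type i = q + 1; i < url.size(); ++i) {
    // Work on the unsigned byte value so bytes >= 0x80 are handled correctly.
    const unsigned char c = static_cast<unsigned char>(url[i]);
    if (is_ascii_alnum(c)) {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += hex[c >> 4];    // high nibble
      out += hex[c & 0x0F];  // low nibble
    }
  }
  return out;
}
```

Calling urlencode_noboost on the three test URLs used below should give the same results as the Boost version. Like the original, it also encodes '=' and '&', so it is only suitable when the whole query string is meant to be treated as one opaque value.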

I tested it using:

int main() {
    const char *testurls[] = {"http://foo.com/bar?abc<>de??90   210fg!\"$%",
                              "http://google.com",
                              "http://www.unicode.com/example?großpösna"};
    std::copy(testurls, &testurls[sizeof(testurls)/sizeof(*testurls)],
              std::ostream_iterator<std::string>(std::cout,"\n"));
    std::cout << "encode as: " << std::endl;
    std::transform(testurls, &testurls[sizeof(testurls)/sizeof(*testurls)],
                   std::ostream_iterator<std::string>(std::cout,"\n"),
                   // A plain function pointer works here; std::ptr_fun needs
                   // <functional> and was removed in C++17.
                   urlencode);
}

Which all seemed to work:

http://foo.com/bar?abc<>de??90   210fg!"$%
http://google.com
http://www.unicode.com/example?großpösna

Becomes:

http://foo.com/bar?abc%3C%3Ede%3F%3F90%20%20%20210fg%21%22%24%25
http://google.com
http://www.unicode.com/example?gro%C3%9Fp%C3%B6sna

Which squares with these examples


Consider converting your Unicode URL to UTF-8 first. UTF-8 represents every Unicode code point as a sequence of one to four bytes, and plain ASCII characters keep their single-byte values. Once the URL is in UTF-8, you can percent-encode it byte by byte with whichever API you prefer.
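A minimal sketch of that two-step approach, assuming C++11 and that the text is available as Unicode code points in a std::u32string (on Windows, wchar_t strings are UTF-16 and would have to be converted to code points first). The function names to_utf8 and percent_encode are invented for this example. The set of characters left unencoded is the RFC 3986 "unreserved" set, which is a choice made here and slightly wider than the alphanumeric-only rule used in the answer above.

```cpp
#include <string>

// Hypothetical helper: encode Unicode code points as UTF-8 bytes.
// Assumption: the input holds valid code points; surrogates and values
// above U+10FFFF are not checked for in this sketch.
std::string to_utf8(const std::u32string& text) {
  std::string out;
  for (char32_t cp : text) {
    if (cp < 0x80) {                       // 1 byte: plain ASCII
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}

// Hypothetical helper: percent-encode every byte except the RFC 3986
// unreserved characters (ASCII letters, digits, '-', '.', '_', '~').
std::string percent_encode(const std::string& bytes) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (char ch : bytes) {
    const unsigned char c = static_cast<unsigned char>(ch);
    const bool unreserved =
        (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
        (c >= 'a' && c <= 'z') ||
        c == '-' || c == '.' || c == '_' || c == '~';
    if (unreserved) {
      out += ch;
    } else {
      out += '%';
      out += hex[c >> 4];
      out += hex[c & 0x0F];
    }
  }
  return out;
}
```

For example, percent_encode(to_utf8(U"gro\u00DFp\u00F6sna")) yields gro%C3%9Fp%C3%B6sna, which matches the encoded query string shown in the answer above. Note that percent_encode is meant for a single component such as a query value, not for a whole URL, since it would also encode ':' and '/'.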