How to replace all occurrences of one character with two characters using std::string?

Is there a nice simple way to replace all occurrences of "/" in a std::string with "\/" to escape all the slashes in a std::string?


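Probably the simplest way to get this done is with the Boost String Algorithms library (include <boost/algorithm/string/replace.hpp>). The first call below modifies the string in place; the second returns a modified copy: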

  boost::replace_all(myString, "/", "\\/");

  std::string result = boost::replace_all_copy(myString, "/", "\\/");


The answer is no... there is no "easy" way if you mean a one-liner already provided by the standard library. However, it's not hard to implement that function.

First of all I think that probably you will also need to replace \ with \\ and other special characters. In this case using the replaceAll implementation given by ildjarn is going to be annoying (you'll need to replace the same string several times).
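To illustrate why repeated replacement gets annoying, here is a minimal sketch that assumes a replaceAll loop like the one shown elsewhere on this page. The wrapper name escapeByRepeatedReplace is hypothetical. The point is that the order of the calls matters: backslashes have to be handled first, otherwise the backslashes inserted in front of the slashes would be doubled again.

```cpp
#include <string>

// Find/replace loop in the style of the replaceAll shown on this page.
// The empty-pattern guard is an addition: without it the loop never ends.
std::string& replaceAll(std::string& s, const std::string& from, const std::string& to)
{
    if (from.empty())
        return s;
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos)
    {
        s.replace(pos, from.size(), to);
        pos += to.size();   // continue after the inserted text
    }
    return s;
}

// Hypothetical wrapper: one full pass over the string per special character.
std::string escapeByRepeatedReplace(std::string s)
{
    replaceAll(s, "\\", "\\\\");  // backslashes first...
    replaceAll(s, "/", "\\/");    // ...then slashes, so the new backslashes stay untouched
    return s;
}
```

Every additional special character costs another pass, and swapping the two calls would silently produce a different (wrong) result.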

In my opinion there are many cases of string processing where nothing beats using an explicit char * approach. In this specific case however probably just using an index is fine:

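#include <string>
#include <vector>

std::string escape(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t wp = 0;                 // write position in result
    std::vector<char> result(n * 2);    // worst case: every character gets escaped
    for (std::size_t i = 0; i < n; i++)
    {
        if (s[i] == '/' || s[i] == '\\')
            result[wp++] = '\\';        // prefix special characters with a backslash
        result[wp++] = s[i];
    }
    // Iterator range: avoids indexing into an empty vector when s is empty
    return std::string(result.begin(), result.begin() + wp);
}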

Basically the idea is to move over the string and add an extra \ character before any special character (in the above I just handled / and \, but you get the idea). The result is known to be at most 2*n in length, so I preallocate it, making the whole processing O(n) (the replaceAll approach instead keeps moving the rest of the string to the right, making it O(n^2) in the worst case). Even for short strings like "this is a test with /slashes/ that should be /escaped/" the above function is more efficient on my PC (about 1.3x faster), even though replaceAll is called just once while escape handles two special characters.

Note also that this function naturally returns a separate string instead of modifying the string in place (IMO a better interface) and in the timing comparison I had to create a string for each call so the results are even shifted toward equality because of that added constant time.

The above read/write approach can also be easily extended to more complex substitutions (e.g. replacing > with &gt; or characters not in printable range with %xx encoding) still maintaining a good efficiency for big strings (just one pass).
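As a rough sketch of those two extensions, the same one-pass idea might look like this. The function names escapeHtml and percentEncodeNonPrintable are made up for illustration, and the choice of which characters to encode is an assumption, not a complete HTML or URL encoder.

```cpp
#include <string>

// Hypothetical helper: replace a few characters with multi-character entities
// in a single pass over the input.
std::string escapeHtml(const std::string& s)
{
    std::string result;
    result.reserve(s.size());   // lower bound; the string grows if entities are emitted
    for (char c : s)
    {
        switch (c)
        {
            case '&': result += "&amp;"; break;
            case '<': result += "&lt;";  break;
            case '>': result += "&gt;";  break;
            default:  result += c;       break;
        }
    }
    return result;
}

// Hypothetical helper: %xx-encode bytes outside the printable ASCII range.
// '%' itself is encoded as well so that the output stays unambiguous.
std::string percentEncodeNonPrintable(const std::string& s)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string result;
    result.reserve(s.size());
    for (unsigned char c : s)
    {
        if (c >= 0x20 && c < 0x7F && c != '%')
        {
            result += static_cast<char>(c);
        }
        else
        {
            result += '%';
            result += hex[c >> 4];     // high nibble
            result += hex[c & 0x0F];   // low nibble
        }
    }
    return result;
}
```

Both functions read each input character exactly once and only ever append to the output, so nothing already written has to be shifted.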


An example on how to do this is given on the cppreference.com std::string::replace page:

#include <string>

std::string& replaceAll(std::string& context, std::string const& from, std::string const& to)
{
    if (from.empty())
        return context;   // an empty pattern matches everywhere and would loop forever
    std::size_t lookHere = 0;
    std::size_t foundHere;
    while((foundHere = context.find(from, lookHere)) != std::string::npos)
    {
        context.replace(foundHere, from.size(), to);
        lookHere = foundHere + to.size();   // skip past the text just inserted
    }
    return context;
}


std::string::replace


To replace all the occurrences of a sub-string in a string by another sub-string:

#include <iostream>
#include <string>

void replace_all(std::string& input, const std::string& from, const std::string& to) {
  if (from.empty()) return;  // an empty pattern would never terminate
  size_t pos = 0;
  while ((pos = input.find(from, pos)) != std::string::npos) {
    input.replace(pos, from.size(), to);
    pos += to.size();        // continue after the replacement
  }
}

int main() {
  std::string str("i am a geek/nerd/crazy person.");
  replace_all(str, "/", "\\/");
  std::cout << str << '\n';
}

Output:

$ g++-6.1.0 -std=c++17 -g -Og -Werror -Wall -Wextra -pedantic -Wold-style-cast -Wnon-virtual-dtor -Wshadow -Wcast-align -Wunused -Woverloaded-virtual -Wconversion -Wsign-conversion -Wmisleading-indentation -fsanitize=address,leak,undefined; ./a.out
i am a geek\/nerd\/crazy person.


I extrapolated on the question to make a streaming implementation that allows you to escape a variety of characters.

Streaming really takes the biscuit for large volumes[1], because otherwise you will end up in heap fragmentation/performance hell. It also allows you to escape strings stored in just about any source, as the samples show.


#include <iostream>
#include <iterator>
#include <set>
#include <sstream>
#include <string>

template <class InIt, class OutIt>
    static OutIt escapeSomeChars(const InIt inIt, const InIt endIt, OutIt outIt)
{
    for (InIt it=inIt; it!=endIt; ++it)
        switch (*it)
        {
            case '\0': *outIt++ = '\\'; *outIt++ = '0'; break;
            case '\n': *outIt++ = '\\'; *outIt++ = 'n'; break;
            case '\\': 
            case '"' : 
            case '$' : 
            case '/' : *outIt++ = '\\'; // deliberate fall-through: the character itself follows
            default  : *outIt++ = *it;
        }

    return outIt;
}

static std::string escapeSomeChars(const std::string& input)
{
    std::ostringstream os;
    escapeSomeChars(input.begin(), input.end(), std::ostream_iterator<char>(os));
    return os.str();
}

namespace /*anon*/ {
    struct rawchar {   // helper - see e.g. http://bytes.com/topic/c/answers/436124-copy-istream_iterator-question
        char _c; rawchar(char c=0) : _c(c) {} 
        operator const char&() const { return _c; }
        friend std::istream& operator>>(std::istream& is, rawchar& out) { return is.get(out._c); }
    };
}

int main()
{
    static const char data[] = "\"I will \\$one day \\have \\all \\\\my slash\\es escaped, much \\like\\ in the source!\n\"";

    // use the overload for std::string
    std::cout << escapeSomeChars(data);
    std::cout << std::endl;

    // streaming in & out:
    std::istringstream is(data);
    escapeSomeChars(std::istream_iterator<rawchar>(is), std::istream_iterator<rawchar>(), std::ostream_iterator<char>(std::cout));
    std::cout << std::endl;

    // but you don't need an istream, you can use any STL iterator range
    // (this range includes the array's terminating NUL, which is emitted as \0)
    escapeSomeChars(data, data+sizeof(data)/sizeof(data[0]), std::ostream_iterator<char>(std::cout));
    std::cout << std::endl;

    // but any source and target will do:
    std::string asstring(data);
    std::set<char> chars(asstring.begin(), asstring.end());

    asstring.clear();
    escapeSomeChars(chars.begin(), chars.end(), std::back_inserter(asstring));

    std::cout << "Unique characters in data: '" << asstring << "', but properly escaped!" << std::endl;
    return 0;
}

I chose a switch, because it will be optimized by the compiler. For dynamic sets of escapable characters, I'd prefer some kind of lookup (a vector with std::find would do, although for large sets a std::set with set::find would become the better choice).
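For the dynamic case, a minimal sketch with a std::set lookup could look like the following. The name escapeDynamic and its signature are assumptions made for illustration; for a handful of characters, std::find over a std::vector<char> would serve equally well.

```cpp
#include <set>
#include <string>

// Hypothetical helper: prefix every character found in `special` with a
// backslash. The set is supplied at run time instead of being hard-coded
// in a switch.
std::string escapeDynamic(const std::string& input, const std::set<char>& special)
{
    std::string result;
    result.reserve(input.size() * 2);   // worst case: every character is escaped
    for (char c : input)
    {
        if (special.find(c) != special.end())
            result += '\\';
        result += c;
    }
    return result;
}
```

A caller would build the set once, for example from '/', '\\' and '$', and reuse it for every string. Unlike the switch version, this sketch only prefixes characters and does not rewrite '\n' or '\0' into letter escapes.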

Hope this helps

[1] see e.g. this beautiful bug I recently encountered: GParted: Simplified cleanup_cursor() implementation
