C++ String-type independent algorithms
I'm trying to derive a technique for writing string-algorithms that is truly independent of the underlying type of string.
Background: the prototypes for GetIndexOf and FindOneOf are either overloaded or templated variations on:
int GetIndexOf(const char * pszInner, const char * pszString);
const char * FindOneOf(const char * pszString, const char * pszSetOfChars);
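For concreteness, here is a minimal sketch of what the narrow-character overloads might look like. The question does not show their bodies, so both are assumptions: FindOneOf is assumed to behave like strpbrk, and GetIndexOf is assumed to return -1 when pszInner is null.

```cpp
#include <cstring>

// Assumed behavior: pointer to the first character of pszString that
// appears in pszSetOfChars, or a null pointer if there is none.
inline const char * FindOneOf(const char * pszString, const char * pszSetOfChars)
{
    return std::strpbrk(pszString, pszSetOfChars);
}

// Assumed behavior: offset of pszInner within pszString, or -1 if
// pszInner is null (i.e. the search found nothing).
inline int GetIndexOf(const char * pszInner, const char * pszString)
{
    return pszInner ? static_cast<int>(pszInner - pszString) : -1;
}
```

With these definitions, GetIndexOf(FindOneOf(s, "lo"), s) yields 2 for a pointer s to "hello", provided the same pointer is passed to both calls.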
This issue comes up in the following template function:
// return index of, or -1, the first occurrence of any given char in target
template <typename T>
inline int FindIndexOfOneOf(const T * str, const T * pszSearchChars)
{
return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
Objectives:
1. I would like this code to work for CStringT<>, const char *, and const wchar_t * (and it should be trivial to extend to std::string).
2. I don't want to pass anything by copy (only by const & or const *).
In an attempt to solve these two objectives, I thought I might be able to use a type-selector of sorts to derive the correct interfaces on the fly:
namespace details {
template <typename T>
struct char_type_of
{
// typedef T type; error for invalid types (i.e. anything for which there is not a specialization)
};
template <>
struct char_type_of<const char *>
{
typedef char type;
};
template <>
struct char_type_of<const wchar_t *>
{
typedef wchar_t type;
};
template <>
struct char_type_of<CStringA>
{
typedef CStringA::XCHAR type;
};
template <>
struct char_type_of<CStringW>
{
typedef CStringW::XCHAR type;
};
}
#define CHARTYPEOF(T) typename details::char_type_of<T>::type
Which allows:
template <typename T>
inline int FindIndexOfOneOf(T str, const CHARTYPEOF(T) * pszSearchChars)
{
return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
This should guarantee that the second argument is passed as const *, and should not determine T (rather only the first argument should determine T).
But the problem with this approach is that when str is a CStringT<>, T is deduced as CStringT<> itself, so the first parameter is passed by value rather than by reference: hence we have an unnecessary copy.
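The extra copy can be made visible with a small sketch. CountingString is a hypothetical stand-in for CStringA (it only counts copy constructions and converts to const char *), and the function body uses strpbrk directly instead of the real FindOneOf/GetIndexOf, so treat it as an illustration only.

```cpp
#include <cstring>

// Hypothetical stand-in for CStringA: counts how often it is copied.
struct CountingString {
    const char * data;
    static int copies;
    explicit CountingString(const char * s) : data(s) {}
    CountingString(const CountingString & other) : data(other.data) { ++copies; }
    operator const char *() const { return data; }
};
int CountingString::copies = 0;

namespace details {
    template <typename T> struct char_type_of {};
    template <> struct char_type_of<const char *> { typedef char type; };
    template <> struct char_type_of<CountingString> { typedef char type; };
}

// Same shape as the by-value version above: T is deduced from str only.
template <typename T>
inline int FindIndexOfOneOfByValue(T str, const typename details::char_type_of<T>::type * pszSearchChars)
{
    const char * begin = str;  // CountingString converts implicitly
    const char * hit = std::strpbrk(begin, pszSearchChars);
    return hit ? static_cast<int>(hit - begin) : -1;
}
```

Calling FindIndexOfOneOfByValue(s, "cb") with a CountingString s leaves CountingString::copies at 1, which is the copy the question wants to avoid.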
Trying to rewrite the above as:
template <typename T>
inline int FindIndexOfOneOf(T & str, const CHARTYPEOF(T) * pszSearchChars)
{
return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
Makes it impossible for the compiler (VS2008) to generate a correct instance of FindIndexOfOneOf<> for:
FindIndexOfOneOf(_T("abc"), _T("def"));
error C2893: Failed to specialize function template 'int FindIndexOfOneOf(T &,const details::char_type_of<T>::type *)'
With the following template arguments: 'const char [4]'
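The deduction difference behind this error can be checked in isolation. The sketch below assumes a C++11 compiler (newer than VS2008) for static_assert and decltype; DeducedAs, ByValue and ByRef are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

// Hypothetical tag type that records which T was deduced.
template <typename T>
struct DeducedAs { typedef T type; };

template <typename T>
DeducedAs<T> ByValue(T) { return DeducedAs<T>(); }

template <typename T>
DeducedAs<T> ByRef(T &) { return DeducedAs<T>(); }

// By value, the array decays before deduction: T is const char *.
static_assert(std::is_same<decltype(ByValue("abc"))::type, const char *>::value,
              "by-value deduction decays the literal");

// By reference, no decay happens: T is const char [4].
static_assert(std::is_same<decltype(ByRef("abc"))::type, const char [4]>::value,
              "by-reference deduction keeps the array type");
```

Since char_type_of has no specialization for const char [4], the CHARTYPEOF(T) lookup fails for the by-reference version.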
This is a generic problem I've had with templates since they were introduced (yes, I'm that old): it's been essentially impossible to construct a way to handle both old C-style arrays and newer class-based entities (perhaps best highlighted by const char [4] vs. CString<> &).
The STL/std library "solved" this issue (if one can really call it solving) by using pairs of iterators everywhere instead of a reference to the thing itself. I could go this route, except it sucks IMO, and I don't want to litter my code with two arguments everywhere a single, properly handled argument should have sufficed.
Basically, I'm interested in an approach - such as using some sort of stringy_traits - that would allow me to write FindIndexOfOneOf<> (and other similar template functions) where the argument is the string (not a pair of [begin, end) arguments), and the template that is then generated is correct for that string argument type (either const * or const CString<> &).
So the Question: How might I write FindIndexOfOneOf<> such that its arguments can be any of the following without ever creating a copy of the underlying arguments:
1. FindIndexOfOneOf(_T("abc"), _T("def"));
2. CString str; FindIndexOfOneOf(str, _T("def"));
3. CString str; FindIndexOfOneOf(_T("abc"), str);
4. CString str; FindIndexOfOneOf(str, str);
Related threads that have led me to this point:
A better way to declare a char-type appropriate CString<>
Templated string literals
Try this.
#include <type_traits>
template <typename T>
inline int FindIndexOfOneOf(T& str, const typename details::char_type_of<typename std::decay<T>::type>::type* pszSearchChars)
The problem is that when you make the first argument a reference type, T is deduced as an array type (const char [4] for the literal in your example):
const char [N]
but you want
const char*
You can use the following to make this conversion.
std::decay<T>::type
The documentation for std::decay says:
If is_array<U>::value is true, the modified-type type is remove_extent<U>::type *.
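Put together, a minimal sketch of this approach could look as follows. It assumes a C++11 <type_traits> (on VS2008 the TR1 or Boost equivalent would be needed), uses std::string as a stand-in for CStringA, and replaces FindOneOf/GetIndexOf with strpbrk and pointer subtraction; c_str_of is a hypothetical helper introduced only for this example.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

namespace details {
    template <typename T> struct char_type_of {};
    template <> struct char_type_of<const char *> { typedef char type; };
    // std::string stands in for CStringA in this sketch.
    template <> struct char_type_of<std::string> { typedef char type; };

    // Hypothetical helper: uniform access to the character buffer.
    inline const char * c_str_of(const char * s) { return s; }
    inline const char * c_str_of(const std::string & s) { return s.c_str(); }
}

// T is deduced from str only; std::decay maps const char [N] to
// const char * (and const std::string to std::string) before the lookup.
template <typename T>
inline int FindIndexOfOneOf(T & str,
    const typename details::char_type_of<typename std::decay<T>::type>::type * pszSearchChars)
{
    const char * begin = details::c_str_of(str);
    const char * hit = std::strpbrk(begin, pszSearchChars);
    return hit ? static_cast<int>(hit - begin) : -1;
}
```

Here FindIndexOfOneOf("abc", "cb") deduces T as const char [4] and binds the literal by reference, while a std::string argument is bound by reference without a copy. Unlike CString, std::string has no implicit conversion to const char *, so passing a std::string as the second argument is not covered by this sketch.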
You can use Boost's enable_if and type_traits for this:
#include <boost/type_traits.hpp>
#include <boost/utility/enable_if.hpp>
// Just for convenience
using boost::enable_if;
using boost::disable_if;
using boost::is_same;
// Version for C strings takes param #1 by value
template <typename T>
inline typename enable_if<is_same<T, const char*>, int>::type
FindIndexOfOneOf(T str, const CHARTYPEOF(T) * pszSearchChars)
{
return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
// Version for other types takes param #1 by ref
template <typename T>
inline typename disable_if<is_same<T, const char*>, int>::type
FindIndexOfOneOf(T& str, const CHARTYPEOF(T) * pszSearchChars)
{
return GetIndexOf(FindOneOf(str, pszSearchChars), str);
}
You should probably expand the first case to handle both char and wchar_t strings, which you can do using or_ from Boost's MPL library.
I would also recommend making the version that takes a reference take a const reference instead. This just avoids instantiation of 2 separate versions of the code (as it stands, T will be inferred as a const type for const objects, and a non-const type for non-const objects; changing the parameter type to T const& str means T will always be inferred as a non-const type).
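The same by-value/by-reference split can be sketched without Boost, assuming a C++11 compiler. In this sketch std::enable_if replaces boost::enable_if, a hypothetical is_c_string trait plays the role of or_, std::string and std::wstring stand in for CStringA and CStringW, and FindOneOf/GetIndexOf are assumed to behave like strpbrk/wcspbrk and pointer subtraction.

```cpp
#include <cstring>
#include <cwchar>
#include <string>
#include <type_traits>

namespace details {
    template <typename T> struct char_type_of {};
    template <> struct char_type_of<const char *>    { typedef char type; };
    template <> struct char_type_of<const wchar_t *> { typedef wchar_t type; };
    template <> struct char_type_of<std::string>     { typedef char type; };
    template <> struct char_type_of<std::wstring>    { typedef wchar_t type; };

    // Hypothetical trait: true for the raw pointer types passed by value.
    template <typename T>
    struct is_c_string : std::integral_constant<bool,
        std::is_same<T, const char *>::value ||
        std::is_same<T, const wchar_t *>::value> {};

    inline const char * FindOneOf(const char * s, const char * set) { return std::strpbrk(s, set); }
    inline const wchar_t * FindOneOf(const wchar_t * s, const wchar_t * set) { return std::wcspbrk(s, set); }

    template <typename C>
    inline int GetIndexOf(const C * inner, const C * str)
    {
        return inner ? static_cast<int>(inner - str) : -1;
    }
}

// C strings (narrow or wide): first parameter by value.
template <typename T>
inline typename std::enable_if<details::is_c_string<T>::value, int>::type
FindIndexOfOneOf(T str, const typename details::char_type_of<T>::type * pszSearchChars)
{
    return details::GetIndexOf(details::FindOneOf(str, pszSearchChars), str);
}

// Everything else: first parameter by const reference, so no copy is made.
template <typename T>
inline typename std::enable_if<!details::is_c_string<T>::value, int>::type
FindIndexOfOneOf(const T & str, const typename details::char_type_of<T>::type * pszSearchChars)
{
    return details::GetIndexOf(details::FindOneOf(str.c_str(), pszSearchChars), str.c_str());
}
```

For a string literal, the by-reference overload deduces T as an array type, fails the char_type_of lookup, and drops out via SFINAE, which leaves the by-value overload as the only candidate.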
Based on your comments about iterators it seems you've not fully considered options you may have. I can't do anything about personal preference, but then again...IMHO it shouldn't be a formidable obstacle to overcome in order to accept a reasonable solution, which should be weighed and balanced technically.
template < typename Iter >
void my_iter_fun(Iter start, Iter end)
{
...
}
template < typename T >
void my_string_interface(const T& str) // by const reference, so class-type strings are not copied
{
my_iter_fun(str.begin(), str.end());
}
template < typename T >
void my_string_interface(T* chars)
{
my_iter_fun(chars, chars + strlen(chars));
}
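Applied to the function from the question, the wrapper idea might look like the sketch below. It is narrow-character only, uses std::string in place of CString, and IndexOfOneOf is a hypothetical name for the iterator-pair core; the search itself is delegated to std::find_first_of.

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>
#include <string>

// Iterator-pair core: index of the first element of [first, last) that
// also occurs in [setFirst, setLast), or -1 if there is none.
template <typename Iter, typename SetIter>
int IndexOfOneOf(Iter first, Iter last, SetIter setFirst, SetIter setLast)
{
    Iter hit = std::find_first_of(first, last, setFirst, setLast);
    return hit == last ? -1 : static_cast<int>(std::distance(first, hit));
}

// Single-argument wrapper for class-type strings (taken by const reference).
template <typename S>
int FindIndexOfOneOf(const S & str, const char * pszSearchChars)
{
    return IndexOfOneOf(str.begin(), str.end(),
                        pszSearchChars, pszSearchChars + std::strlen(pszSearchChars));
}

// Single-argument wrapper for C strings.
inline int FindIndexOfOneOf(const char * str, const char * pszSearchChars)
{
    return IndexOfOneOf(str, str + std::strlen(str),
                        pszSearchChars, pszSearchChars + std::strlen(pszSearchChars));
}
```

Callers still pass one argument per string; the iterator pairs only appear inside the wrappers. String literals and const char * arguments select the non-template overload because a non-template function is preferred when the conversions are otherwise equally good.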
Alternative to my previous answer, if you don't want to install tr1.
Add the following template specializations to cover the deduced T type when the first argument is a reference.
template<unsigned int N>
struct char_type_of<const wchar_t[N]>
{
typedef wchar_t type;
};
template<unsigned int N>
struct char_type_of<const char[N]>
{
typedef char type;
};
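As a sketch of how these specializations fit the by-reference signature from the question (narrow characters only; std::size_t is used for the array bound here, and strpbrk stands in for FindOneOf/GetIndexOf, both of which are assumptions of this example):

```cpp
#include <cstddef>
#include <cstring>

namespace details {
    template <typename T> struct char_type_of {};
    template <> struct char_type_of<const char *> { typedef char type; };
    // Covers T deduced as const char [N] when a literal binds to T &.
    template <std::size_t N> struct char_type_of<const char [N]> { typedef char type; };
}

template <typename T>
inline int FindIndexOfOneOf(T & str, const typename details::char_type_of<T>::type * pszSearchChars)
{
    const char * begin = str;  // array or pointer, both convert to const char *
    const char * hit = std::strpbrk(begin, pszSearchChars);
    return hit ? static_cast<int>(hit - begin) : -1;
}
```

With the array specialization in place, FindIndexOfOneOf("abc", "cb") compiles with T deduced as const char [4], and no TR1 header is required.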