Converting sets of integers into ranges

2022-12-20 16:57 问答作者：

What's the most idiomatic way to convert a set of integers into a set of ranges?

E.g. given the set {0, 1, 2, 3, 4, 7, 8, 9, 11} I want to get { {0,开发者_运维技巧4}, {7,9}, {11,11} }.

Let's say we are converting from std::set<int> into std::vector<std::pair<int, int>>. I treat Ranges as inclusive on both sides, since it's more convenient in my case, but I can work with open-ended ranges too if necessary.

I've written the following function, but I feel like reinventing the wheel. Please tell maybe there's something in STL or boost for this.

typedef std::pair<int, int> Range;

void setToRanges(const std::set<int>& indices, std::vector<Range>& ranges)
{
    Range r = std::make_pair(-INT_MAX, -INT_MAX);

    BOOST_FOREACH(int i, indices)
    {
           if (i != r.second + 1)
           {
            if (r.second >= 0) ranges.push_back(r);
            r.first = i;                    
           }

           r.second = i;
    }

    ranges.push_back(r);
}

Now one can use interval_set from Boost.ICL (Boost > 1.46)

#include <set>
#include <iostream>
#include <algorithm>

#include <boost/icl/discrete_interval.hpp>
#include <boost/icl/closed_interval.hpp>
#include <boost/icl/interval_set.hpp>

typedef std::set<int> Set;
typedef boost::icl::interval_set<int> IntervalSet;

void setToInterval(const Set& indices, IntervalSet& intervals)
{
    Set::const_iterator pos;
    for(pos = indices.begin(); pos != indices.end(); ++pos)
    {
        intervals.insert(boost::icl::construct<boost::icl::discrete_interval<int> >(*pos, *pos, boost::icl::interval_bounds::closed()));
    }
}

int main()
{
    std::cout << ">>Interval Container Library Rocks! <<\n";
    std::cout << "----------------------------------------------------\n";

    Set indices = {0, 1, 2, 3, 4, 7, 8, 9, 11};
    IntervalSet intervals;

    setToInterval(indices, intervals);

    std::cout << "  intervals joined:    " << intervals  << "\n";

    return 0;
}

Output:

  intervals joined:    {[0,4][7,9][11,11]}

I don't think there's anything in the STL or Boost that does this.

One thing you can do is to make your algorithm a little bit more general:

template<class InputIterator, class OutputIterator>
void setToRanges(InputIterator first, InputIterator last, OutputIterator dest)
{
    typedef std::iterator_traits<InputIterator>::value_type item_type;
    typedef typename std::pair<item_type, item_type> pair_type;
    pair_type r(-std::numeric_limits<item_type>::max(), 
                -std::numeric_limits<item_type>::max());

    for(; first != last; ++first)
    {
        item_type i = *first;
        if (i != r.second + 1)
        {
            if (r.second >= 0) 
                *dest = r;
            r.first = i;                    
        }
        r.second = i;
    }
    *dest = r;
}

Usage:

std::set<int> set;
// insert items

typedef std::pair<int, int> Range;
std::vector<Range> ranges;

setToRanges(set.begin(), set.end(), std::back_inserter(ranges));

You should also consider using the term interval instead of range, because the latter in STL parlance means "any sequence of objects that can be accessed through iterators or pointers" (source).

Finally, you should probably take at look at the Boost Interval Arithmetic Library, which is currently under review for Boost inclusion.

No shrinkwrapped solution I'm afraid, but an alternative algorithm.

Store your items in a bitvector - O(n) if you know the maximum item at the start and preallocate the vector.

Translate that vector into a vector of transition point flags - exclusive-or the bitvector with a bitshifted version of itself. Slightly fiddly at the word boundaries, but still O(n). Logically, you get a new key at the old max + 1 (the transition back to zeros after all your keys are exhausted), so it's a good idea to allow for that in the preallocation of the vector.

Then, iterate through the bitvector finding the set bits. The first set bit indicates the start of a range, the second the end, the third the start of the next range and so on. The following bit-fiddling function (assuming 32 bit int) may be useful...

int Low_Bit_No (unsigned int p)
{
  if (p == 0)  return -1;  //  No bits set

  int           l_Result = 31;
  unsigned int  l_Range  = 0xffffffff;
  unsigned int  l_Mask   = 0x0000ffff;

  if (p & l_Mask)  {  l_Result -= 16;  }  else  {  l_Mask ^= l_Range;  }
  l_Range &= l_Mask;
  l_Mask  &= 0x00ff00ff;
  if (p & l_Mask)  {  l_Result -=  8;  }  else  {  l_Mask ^= l_Range;  }
  l_Range &= l_Mask;
  l_Mask  &= 0x0f0f0f0f;
  if (p & l_Mask)  {  l_Result -=  4;  }  else  {  l_Mask ^= l_Range;  }
  l_Range &= l_Mask;
  l_Mask  &= 0x33333333;
  if (p & l_Mask)  {  l_Result -=  2;  }  else  {  l_Mask ^= l_Range;  }
  l_Mask  &= 0x55555555;
  if (p & l_Mask)  {  l_Result -=  1;  }

  return l_Result;
}

I'd use adjacent_find with a predicate that defines "adjacency" as two elements that are not sequential. This solution doesn't depend on INT_MAX. Still feels kinda clunky.

bool notSequential(int a, int b) { return (a + 1) != b; }

void setToRanges(const std::set<int>& indices, std::vector<Range>& ranges)
{
  std::set<int>::iterator iter = indices.begin();
  std::set<int>::iterator end = indices.end();
  int first;
  while (iter != end)
  {
    first = *iter;
    iter = std::adjacent_find(iter, end, notSequential);
    if (iter != end)
    {
      ranges.push_back(std::make_pair(first, *iter));
      ++iter;
    }
  }
  ranges.push_back(std::make_pair(first, *--iter));
}

That tests against end more than necessary. adjacent_find can never return the last element of a list, so the incremented iterator will never be end and thus can still be dereferenced. It could be rewritten as:

void setToRanges(const std::set<int>& indices, std::vector<Range>& ranges)
{
  std::set<int>::iterator iter = indices.begin();
  std::set<int>::iterator end = indices.end();
  if (iter == end) return; // empty set has no ranges
  int first;
  while (true)
  {
    first = *iter;
    iter = std::adjacent_find(iter, end, notSequential);
    if (iter == end) break;
    ranges.push_back(std::make_pair(first, *iter++));
  }
  ranges.push_back(std::make_pair(first, *--iter));
}

继续阅读：algorithm range

Converting sets of integers into ranges

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？