开发者

Best tree/heap data structure for fixed set of nodes with changing values + need top 20 values?

I'm writing something like a game in C++ where I have a database table containing the current score for each user. I want to read that table into memory at the start of the game, quickly change each user's score while the game is being played in response to what each user does, and then when the game ends write the current scores back to the database. I also want to be able to find the 20 or so users with the highest scores. No users will be added or deleted during the short period when the game is being played. I haven't tried it yet, but updating the database might take too much time during the period when the game is being played.

  • Fixed set of users (might be 10,000 to 50,000 users)
  • Will map user IDs to their score and other user-specific information.
  • User IDs will be auto_increment values.
  • If the structure has a high memory overhead that's probably 开发者_如何学运维not an issue.
  • If the program crashes during gameplay it can just be re-started.
  • Greatly prefer something already available, such as open source/public domain code.
  • Quickly get a user's current score.
  • Quickly add to a user's current score (and return their current score)
  • Quickly get 20 users with highest score.
  • No deletes.
  • No inserts except when the structure is first created, and how long that takes isn't critical.
  • Getting the top 20 users will only happen every five or ten seconds, but getting/adding will happen much more frequently.

If not for the last, I could just create a memory block equal to sizeof(user) * max(user id) and put each user at user id * sizeof(user) for fast access. Should I do that plus some other structure for the Top 20 feature, or is there one structure that will handle all of this together?


Use a std::map. In the incredibly unlikely event that it ever shows up in your profiling, you could maybe think about changing to something more exotic. Memory overhead for 50k users will be around a megabyte or two.

I doubt that iterating over a map with 50k entries every 5-10 seconds, to find the top scores, will introduce significant overhead. If it does, though, either use a Boost multi-index container, or maintain a separate structure for the hi-scores (a heap, or just an array of pointers to the current top 20, in order). Just with an array / vector of 20, the code to increment a score might look something like this (assuming scores only go up, not down):

player.score += points;
if (player.score > hiscores[19]->score) {
    hiscore_dirty = true;
}

And the code to get the hi-scores:

if (hiscore_dirty) {
    recalculate_hiscores();
    hiscore_dirty = false;
}
std::for_each(hiscores.begin(), hiscores.end(), do_something);

If your "auto-increment" and "no delete" policies are fixed forever (i.e. you will never delete users from the DB), and therefore user ids truly are a contiguous range from 0 to the limit, then you should just use a std::vector instead of a std::map.


You might be interested in Fibonacci Heap. This has O(1) (amortized) increaseKey and findMax.

For more info on Heap in general refer: Heap Data Structure, especially the table which compares different heaps.

An implementation of Fibonacci Heap can be found here which you can perhaps use/get inspired from: http://resnet.uoregon.edu/~gurney_j/jmpc/fib.html


First of all, given that you have a Key/Value scenario, you should probably use an Associative Container.

If you are using plain old C++ and do not have Boost available, follow Steve Jessops's suggestion and simply use a std::map, if you have either C++0x or Boost, you'd better use a hash_map or unordered_map: it just matches your requirements better (you don't need to order the players by id after all, you just want to find them quickly) and will probably be faster given the number of players.

For managing the top20 you have 2 choices:

  • You could use the Boost.MultiIndex library to create one unique container that both offers fast lookup on ID (using a hash map) and an ordered index on the score... however it's a bit of a waste to order all players when you only need 20 of them
  • You can simply manages a separate structure, like a vector of pointers to users, and each time you modify the score of a user check it should replace a user in the vector

The last solution, though simple, assumes that a player cannot lose points... it's much more difficult if that may happen.

class UsersCollection;

class User
{
public:
  void incrementScore(size_t term);

private:
  size_t mId;
  size_t mScore;
  UsersCollection& mCollection;
};


class UsersCollection
{
public:
  static const size_t MNumberHiScores = 20;
  static const size_t MNotAChampion = -1;

  UsersCollection(DBConnection const&);

  // returns either the position of the user in
  // the hi scores vector or MNotAChampion
  size_t insertUserInHiScores(User const& user);

private:
  std::unordered_map<size_t, User> mUsers;
  std::vector<User const*> mHiScores;         // [1]
};

void User::incrementScore(size_t term)
{
  mScore += term;
  mCollection.insertUserInHiScores(*this);
}

struct UserSort: std::binary_function<User const*, User const*, bool>
{
  bool operator()(User const* lhs, User const* rhs) const
  {
    return lhs->score() > rhs->score();
  }
};

size_t UsersCollection::insertUserInHiScores(User const& user)
{
  std::vector<User const*>::const_iterator it =
    std::find(mHiScores.begin(), mHiScores.end(), &user);

  if (it == mHiScores.end()) // not among the hiscores
  {
    mHiScores.push_back(&user);
  }

  std::sort(mHiScores.begin(), mHiScores.end(), UserSort());

  if (mHiScores.size() > MNumberHiScores) // purge if too many users
  {
    User const* last = mHiScores.back();
    mHiScores.pop_back();

    if (&user == last) return MNotAChampion;
  }

  // return position in the vector in the [0, MNumberHiScores) range
  return std::find(mHiScores.begin(), mHiScores.end(), &user)
         - mHiScores.begin();
}

Note (1): using a set may seem a good idea however a set presumes that the elements do not change and it is not the case. It could work if we were very careful:

  • remove the user from the set before changing the score
  • putting the user back in once it has changed
  • optionally popping the last elements if there are too many of them
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜