开发者

maintaing a sorted list that is bigger than memory

I have a list of tuples.

[
  "Bob": 3,
  "Alice": 2,
  "Jane": 1,
]

When incrementing the counts

 "Alice" += 2

the order should be maintained:

[
  "Alice": 4,
  "Bob": 3,
  "Jane": 1,
]

When all is in memory there rather simple ways (some more or some less) to efficiently implement this. (using an index, insert-sort etc) The question though is: What's the most promising approach when the list does not fit i开发者_StackOverflow中文版nto memory.

Bonus question: What if not even the index fits into memory?

How would you approach this?


B+ trees order a number of items using a key. In this case, the key is the count, and the item is the person's name. The entire B+tree doesn't need to fit into memory - just the current node being searched. You can set the maximum size of the nodes (and indirectly the depth of the tree) so that a node will fit into memory. (In practice nodes are usually far smaller than memory capacity.)

The data items are stored at the leaves of the tree, in so-called blocks. You can either store the items inline in the index, or store pointers to external storage. If the data is regularly sized, this can make for efficient retrieval from files. In the question example, the data items could be single names, but it would be more efficient to store blocks of names, all names in a block having the same count. The names within each block could also be sorted. (The names in the blocks themselves could be organized as a B-tree.)

If the number of names becomes large enough that the B+tree blocks are becoming ecessively large, the key can be made into a composite key, e.g. (count, first-letter). When searching the tree, only the count needs to be compared to find all names with that count. When inserting, or searcing for a specific name with a given count, then the full key can be compared to include filtering by name prefix.

Alternatively, instead of a composite key, the data items can point to offsets/blocks in an external file that contains the blocks of names, which will keep the B+tree itself small.

If the blocks of the btree are linked together, range queries can be efficiently implemented by searching for the start of the range, and then following block pointers to the next block until the end of the range is reached. This would allow you to efficiently implement "find all names with a count between 10 and 20".

As the other answers have noted, an RDBMS is the pre-packaged way of storing lists that don't fit into memory, but I hope this gives an insight into the structures used to solve the problem.


A relational database such as MySQL is specifically designed for storing large amounts of data the sum of which does not fit into memory, querying against this large amount of data, and even updating it in place.

For example:

CREATE TABLE `people` (
    `name`    VARCHAR(255),
    `count`   INT
);

INSERT INTO `people` VALUES
('Bob', 3),
('Alice', 2),
('Jane', 1);

UPDATE `people` SET `count` = `count` + 2;

After the UPDATE statement, the query SELECT * FROM people; will show:

+-------+-------+
| name  | count |
+-------+-------+
| Bob   |     5 |
| Alice |     4 |
| Jane  |     3 |
+-------+-------+

You can save the order of people in your table by adding an autoincrementing primary key:

CREATE TABLE `people` (
    `id`      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    `name`    VARCHAR(255),
    `count`   INT,

    PRIMARY KEY(`id`)
);

INSERT INTO `people` VALUES
(DEFAULT, 'Bob', 3),
(DEFAULT, 'Alice', 2),
(DEFAULT, 'Jane', 1);


RDMS? Even flat file versions like SQLite. Otherwise a combination utilizing lazy loading. Only keep X records in memory the top Y records & the Z most recent ones that had counts updated. Otherwise a table of Key, Count columns where you run UPDATEs changing the values. The ordered list can be retrieved with a simple SELECT ORDER BY.


Read about B-trees and B+-trees. With these, the index can always be made small enough to fit into memory.


An interesting approach quite unlike BTrees is the Judy Tree


What you seem to be looking for are out of core algorithms for container classes, specifically an out of core list container class. Check out the stxxl library for some great examples of out of core alogorithms and processing.

You may also want to look at this related question


As far as "implementation details tackling this 'by hand'", you could read about how database systems do this by searching for the original papers on database design or finding graduate course notes on database architecture.

I did some searching and found a survey article by G. Graefe titled "Query evaluation techniques for large databases". It somewhat exhaustively covers every aspect of querying large databases, but the entire section 4 addresses how "query evaluation systems ... access base data stored in the database". Also, Graefe's survey was linked to by the course page for CPS 216: Advanced Databases Systems at Duke, Fall 2001. Week 5 was on Physical Data Organization which says that most commerical DBMS's organize data on-disk using blocks in the N-ary Storage Model (NSM): records are stored from the beginning of each block and a "directory" exists at the end.

See also:

  • Spring 2004 CPS 216 lecture notes
  • MIT OCW 6.830 Database Systems


Of course I know I could use a database. This question was more about the implementation details tackling this "by hand"

So basically, you are asking "How does a database do this?" To which the answer is, it uses a tree (for both the data and the index), and only stores part of the tree in memory at any one time.

As has already been mentioned, B-Trees are especially useful for this: since hard-drives always read a fixed amount at a time (the "sector size"), you can make each node the size of a sector to maximize efficiency.


You do not specify that you need to add or remove any elements from the list, just keep it sorted.

If so, a straightforward flat-file approach - typically using mmap for convenience - will work and be faster than a more generic database.

You can use a bsearch to locate the item, or maintain a set of the counts of slots with each value.

As you access an item, so the part of the file it is in (think in terms of memory 'pages') gets read into RAM automatically by the OS, and the slot and it's adjacent slots even gets copied into the L1 cache-line.

You can do an immediate comparison on it's adjacent slots to see if the increment or decrement causes the item to be out-of-order; if it is, you can use a linear iteration (perhaps augmented with a bsearch) to locate the first/last item with the appropriate count, and then swap them.

Managing files is what the OS is built to do.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜