Printing the names of all the people greater than age 18?
This was a pretty good question that was posed to me recently. Suppose we have a hypothetical (insert your favorite data storage tool here) database that consists of the names, ages and addresses of all the people residing on this planet. Your task is to print out the names of all the people whose age is greater than 18 within an HTML table. How would you go about doing that? Let's say that hypothetically the population is growing at the rate of 1,200 per second and the database is updated accordingly (don't ask how). What would be your strategy to print the names of all these people and their addresses in an HTML table?
Storing the ages in a DB table sounds like a recipe for trouble to me - it would be impossible to maintain. You would be better off storing the birth dates, then building an index on that column/attribute.
You have to get an initial dump of the table for display. Just calculate the date 18 years ago (let's say D0) and use a query for any person born earlier than that.
Use DB triggers to receive notifications about deaths, so that you can remove them from the table immediately.
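As a rough illustration of the initial dump, here is a minimal Python sketch using an in-memory SQLite database. The table name `people`, its columns, and the helper names `years_ago` and `adults` are all made up for the example, and the sample rows are invented. It also assumes that a person counts from their 18th birthday onward (`<=`); use `<` if you want strictly "born earlier than D0".

```python
import sqlite3
from datetime import date


def years_ago(today: date, years: int = 18) -> date:
    """Return the date `years` years before `today` (D0 in the text)."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # `today` is Feb 29 and the target year is not a leap year.
        return today.replace(year=today.year - years, day=28)


def adults(conn: sqlite3.Connection, today: date):
    """Fetch everyone born on or before D0, using the birth_date index."""
    d0 = years_ago(today).isoformat()
    # ISO-8601 date strings sort in chronological order, so a plain
    # string comparison works here.
    return conn.execute(
        "SELECT name, address FROM people "
        "WHERE birth_date <= ? ORDER BY name",
        (d0,),
    ).fetchall()


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, address TEXT, birth_date TEXT)")
conn.execute("CREATE INDEX idx_people_birth_date ON people (birth_date)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Ada", "1 Example St", "1990-05-01"),
        ("Bob", "2 Example St", "2015-01-01"),
        ("Cy", "3 Example St", "2006-06-15"),
    ],
)

rows = adults(conn, date(2024, 6, 15))
print(rows)
```

With the sample data, Ada and Cy are returned and Bob is not. Death notifications via triggers are not covered by this sketch.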
Since people only get older (unfortunately?), you can use ranged queries to get new additions (i.e. people that have turned 18 since you last queried the table). E.g. if you want to update the display the next day, you issue a query for the people that were born on day D0 + 1 only - no need to request the whole table again.
You could even prefetch the people who reach 18 years of age the next day, keep the entries in memory, and add them to the display at the exact moment they reach that age.
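The incremental update could look something like the following Python sketch, again against a made-up `people` table in SQLite. The function name `newly_eligible` and the sample rows are assumptions for illustration. The idea is simply to query the half-open range between the previous cutoff date and the new one.

```python
import sqlite3
from datetime import date, timedelta


def newly_eligible(conn: sqlite3.Connection, prev_cutoff: date, new_cutoff: date):
    """Fetch people born after prev_cutoff and on or before new_cutoff."""
    return conn.execute(
        "SELECT name, address FROM people "
        "WHERE birth_date > ? AND birth_date <= ? "
        "ORDER BY birth_date, name",
        (prev_cutoff.isoformat(), new_cutoff.isoformat()),
    ).fetchall()


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, address TEXT, birth_date TEXT)")
conn.execute("CREATE INDEX idx_people_birth_date ON people (birth_date)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Ada", "1 Example St", "2006-06-15"),  # already in the initial dump
        ("Bob", "2 Example St", "2006-06-16"),  # turns 18 the next day
        ("Cy", "3 Example St", "2006-06-17"),   # not yet
    ],
)

d0 = date(2006, 6, 15)  # cutoff used for the initial dump
next_day = newly_eligible(conn, d0, d0 + timedelta(days=1))
print(next_day)
```

Running the same query one day ahead of time would give you the entries to prefetch and hold in memory.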
BTW, even with 2 KB of data for each person, you get an 18 TB database (assuming 50% overhead). Any slightly beefed up server should be able to handle this kind of DB size. On the other hand, the thought of a 12 TB HTML table terrifies me...
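For anyone who wants to check the numbers, here is the back-of-the-envelope arithmetic in Python. It assumes a population of 6 billion (the figure used elsewhere in this thread), decimal units (1 KB = 1,000 bytes), and the 1,200 new records per second from the question.

```python
POPULATION = 6_000_000_000      # assumed world population
BYTES_PER_PERSON = 2_000        # "2 KB" per record, decimal units
OVERHEAD = 0.5                  # 50% storage overhead
INSERTS_PER_SECOND = 1_200      # growth rate from the question

raw_tb = POPULATION * BYTES_PER_PERSON / 1e12   # raw data, in TB
total_tb = raw_tb * (1 + OVERHEAD)              # including overhead
new_records_per_day = INSERTS_PER_SECOND * 24 * 60 * 60

print(f"raw data: {raw_tb:.0f} TB")
print(f"with overhead: {total_tb:.0f} TB")
print(f"new records per day: {new_records_per_day:,}")
```

That gives 12 TB of raw data (hence the 12 TB HTML table), 18 TB with overhead, and roughly 104 million new records per day.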
Oh, and beware of timezone and DST issues - time is such a relative thing these days...
I don't see what the problem is. You don't have to worry about new records being added at all, since none of them will be included in your query unless that query takes 18 or more years to run. If you have an index on age, and presumably any DB technology sufficient to handle that much data and 1200 inserts a second updates indexes on insert, it should just work.
In the real world, using existing technologies or something like them, I would create a snapshot once a day and run queries against that read-only snapshot, which would not include records added that day. That snapshot would certainly be good enough for this query, and most others.
Are you forced to aggregate all of the entries into one table?
It would be simpler if you were to create a table for each age group (only around 120 tables would be needed) and just insert the inputs into those, as it's computationally simpler to look over 120 tables when you insert an entry than to look over 6,000,000,000 when looking for entries.