Database design: Low overhead solution for managing daily inventories / capacities?
Here is the scenario: (MySQL 5.1+, PHP, Apache)
I am planning a SaaS application that will let CLIENTS visit SHOPS and book TRIPS (entities are in ALL CAPS; a TRIP is a transactional record). SHOPS offer TRIPS, but each SHOP has only a certain number of EMPLOYEES to guide them. Essentially, the problem is managing a daily capacity for each SHOP based on the number of available EMPLOYEES. What is the best DB design for delivering this functionality with the lowest overhead?
Here is a simplified view of the database entities:
table.clients
    client_id (pk, ai)
table.shops
    shop_id (pk, ai)
table.employees
    employee_id (pk, ai)
    shop_id (fk)
table.trips
    trip_id (pk, ai)
    client_id (fk)
    shop_id (fk)
    trip_date (date)
SCENARIO 1
I could run a query on TRIPS for every request when a user wants to view the calendar, like:
    SELECT COUNT(*),
           trips.trip_date,
           trips.shop_id
    FROM trips
    WHERE shop_id=1
    GROUP BY trips.trip_date, trips.shop_id
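As a rough sketch of how the Scenario 1 count could be turned into a remaining-capacity figure, the example below compares the per-day booking count with the shop's employee count. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs as-is. The rule that one EMPLOYEE guides one TRIP per day, the date-range filter, the sample rows, and the function names `remaining_capacity` and `demo` are all assumptions made for illustration and are not part of the original schema.

```python
import sqlite3


def remaining_capacity(conn, shop_id, start, end):
    """Return {trip_date: free_slots} for days with at least one booking.

    Assumes (hypothetically) that each employee can guide one trip per day,
    so a shop's daily capacity equals its employee count.
    """
    (staff,) = conn.execute(
        "SELECT COUNT(*) FROM employees WHERE shop_id = ?", (shop_id,)
    ).fetchone()
    rows = conn.execute(
        """
        SELECT trip_date, COUNT(*) AS booked
        FROM trips
        WHERE shop_id = ? AND trip_date BETWEEN ? AND ?
        GROUP BY trip_date
        ORDER BY trip_date
        """,
        (shop_id, start, end),
    ).fetchall()
    return {day: staff - booked for day, booked in rows}


def demo():
    """Build a tiny in-memory data set and query shop 1 for July."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (
            employee_id INTEGER PRIMARY KEY,
            shop_id     INTEGER
        );
        CREATE TABLE trips (
            trip_id   INTEGER PRIMARY KEY,
            client_id INTEGER,
            shop_id   INTEGER,
            trip_date TEXT
        );
        INSERT INTO employees (shop_id) VALUES (1), (1), (1), (2);
        INSERT INTO trips (client_id, shop_id, trip_date) VALUES
            (10, 1, '2024-07-01'),
            (11, 1, '2024-07-01'),
            (12, 1, '2024-07-02'),
            (13, 2, '2024-07-01');
        """
    )
    return remaining_capacity(conn, 1, "2024-07-01", "2024-07-31")
```

Days without any booking do not appear in the result; under the same assumption they would simply have the full employee count available.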
SCENARIO 2
Create a summary table that stores info on every day, but this strategy seems nightmarish with overhead issues. For instance, imagine that there are 1000 shops, each booking 1000 trips per 365-day year, and the table should store info for the next 2 years (730 days). It seems like that would 1/ create a huge summary table (730,000 rows) that would 2/ be queried 1,000,000+ times per year (1000 shops * 1000 trips per shop). When a CLIENT booked a TRIP it would increment the number for that day (and when a trip was cancelled it would decrement it), which would effectively maintain a daily inventory/capacity.
So, my question is this: Which method is the best? Or is there a better way to accomplish this?
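For reference, the increment/decrement bookkeeping described in Scenario 2 might look like the sketch below. It again uses Python's built-in sqlite3 module in place of MySQL so that it is runnable. The table name `daily_capacity`, its columns, and the helper names are hypothetical. Rows are created lazily, so a row exists only for a day that has had at least one booking. In MySQL the two statements in `adjust` could probably be collapsed into a single `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the hypothetical summary table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE daily_capacity (
            shop_id   INTEGER NOT NULL,
            trip_date TEXT    NOT NULL,
            booked    INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (shop_id, trip_date)
        )
        """
    )
    return conn


def adjust(conn, shop_id, trip_date, delta):
    """Add delta (+1 for a booking, -1 for a cancellation) to one day's counter."""
    with conn:  # commits on success, rolls back if an exception is raised
        # Create the row on first use so only days with activity are stored.
        conn.execute(
            "INSERT OR IGNORE INTO daily_capacity (shop_id, trip_date) "
            "VALUES (?, ?)",
            (shop_id, trip_date),
        )
        conn.execute(
            "UPDATE daily_capacity SET booked = booked + ? "
            "WHERE shop_id = ? AND trip_date = ?",
            (delta, shop_id, trip_date),
        )


def booked(conn, shop_id, trip_date):
    """Return the stored counter, or 0 if the day has no row yet."""
    row = conn.execute(
        "SELECT booked FROM daily_capacity WHERE shop_id = ? AND trip_date = ?",
        (shop_id, trip_date),
    ).fetchone()
    return row[0] if row else 0
```

A booking would call `adjust(conn, shop_id, trip_date, +1)` and a cancellation `adjust(conn, shop_id, trip_date, -1)`. The calendar view then reads one row per day instead of counting TRIPS, at the cost of keeping the counter in sync with the TRIPS table.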
Thanks!
Sounds like fun!
Firstly, I know you've given us a simplified version of the schema, so I assume there's a lot more elsewhere, but your "trips" table looks wrong: if shops have one and only one client, you don't need the client ID in the trips table.
However, you do need a "booked_trips" table, to record which trip is booked to which employee - you could store that against the "trips" table too, but typically a booking has lots of other stuff like an invoice, a booked date etc. so you may want to separate those things out.
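One possible shape for such a "booked_trips" table is sketched below, using Python's built-in sqlite3 module as a runnable stand-in for MySQL. The column names (`booking_id`, `booked_on`), the one-booking-per-trip constraint, the sample rows, and the `free_guides` helper are assumptions made for illustration; a real booking record would likely carry more fields, such as invoice details.

```python
import sqlite3

SCHEMA = """
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    shop_id     INTEGER NOT NULL
);
CREATE TABLE trips (
    trip_id   INTEGER PRIMARY KEY,
    client_id INTEGER NOT NULL,
    shop_id   INTEGER NOT NULL,
    trip_date TEXT    NOT NULL
);
-- Hypothetical booking record: at most one row per trip, naming the guide.
CREATE TABLE booked_trips (
    booking_id  INTEGER PRIMARY KEY,
    trip_id     INTEGER NOT NULL UNIQUE REFERENCES trips (trip_id),
    employee_id INTEGER NOT NULL REFERENCES employees (employee_id),
    booked_on   TEXT    NOT NULL
);
"""


def free_guides(conn, shop_id, trip_date):
    """List employee_ids of the shop that have no booking on trip_date."""
    rows = conn.execute(
        """
        SELECT e.employee_id
        FROM employees AS e
        WHERE e.shop_id = ?
          AND e.employee_id NOT IN (
              SELECT b.employee_id
              FROM booked_trips AS b
              JOIN trips AS t ON t.trip_id = b.trip_id
              WHERE t.trip_date = ?
          )
        ORDER BY e.employee_id
        """,
        (shop_id, trip_date),
    ).fetchall()
    return [r[0] for r in rows]


def demo():
    """Book employee 1 on one day, then list free guides for two days."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript(
        """
        INSERT INTO employees (employee_id, shop_id) VALUES (1, 1), (2, 1), (3, 2);
        INSERT INTO trips (trip_id, client_id, shop_id, trip_date)
            VALUES (100, 10, 1, '2024-07-01');
        INSERT INTO booked_trips (trip_id, employee_id, booked_on)
            VALUES (100, 1, '2024-06-15');
        """
    )
    return (
        free_guides(conn, 1, "2024-07-01"),
        free_guides(conn, 1, "2024-07-02"),
    )
```

Keeping the guide assignment in its own table leaves "trips" unchanged while still letting a query answer which employees are free on a given day.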
I'd recommend something like your Scenario 1: use queries to derive the data from normalized tables, rather than Scenario 2, which is effectively a denormalization for speed.
It's worth defining "overhead" in your question. Pretty much all of these design questions trade space against speed; if by overhead you mean disk space, you get a different answer than if you mean "time to run my queries".
Generally, my advice is to work with a normalized approach and measure performance; only denormalize if you know you have a problem.
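A minimal way to try the "measure first" advice is sketched below: load synthetic rows into the normalized trips table, inspect the query plan, and time the calendar query with and without an index. The composite index `idx_trips_shop_date`, the data volumes, and the helper names are assumptions and do not come from the question. The sketch runs on Python's built-in sqlite3 module; on MySQL the counterpart would be the `EXPLAIN` statement, and actual timings will differ.

```python
import sqlite3
import time
from datetime import date, timedelta

QUERY = (
    "SELECT trip_date, COUNT(*) FROM trips "
    "WHERE shop_id = ? GROUP BY trip_date"
)


def build(with_index, shops=50, days=30):
    """Create an in-memory trips table with one trip per shop per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (trip_id INTEGER PRIMARY KEY, client_id INTEGER, "
        "shop_id INTEGER, trip_date TEXT)"
    )
    if with_index:
        # Hypothetical index matching the WHERE and GROUP BY columns.
        conn.execute(
            "CREATE INDEX idx_trips_shop_date ON trips (shop_id, trip_date)"
        )
    first = date(2024, 1, 1)
    rows = [
        (shop, shop, (first + timedelta(days=d)).isoformat())
        for shop in range(1, shops + 1)
        for d in range(days)
    ]
    conn.executemany(
        "INSERT INTO trips (client_id, shop_id, trip_date) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def plan(conn):
    """Return the query plan as text (the detail column of each plan row)."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,))
    return " | ".join(row[3] for row in cursor)


def timed(conn):
    """Run the calendar query once; return (elapsed_seconds, rows)."""
    start = time.perf_counter()
    rows = conn.execute(QUERY, (1,)).fetchall()
    return time.perf_counter() - start, rows
```

Calling `plan(build(True))` and `plan(build(False))` shows whether the index is picked up, and `timed` gives a first, rough number to compare before deciding that a summary table is needed.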