Database schema for a library
I'm designing a library management system for a department at my university and I wanted to enlist your eyes on my proposed schema. This post is primarily con开发者_开发知识库cerned with how we store multiple copies of each book; something about what I've designed rubs me the wrong way, and I'm hoping you all can point out better ways to tackle things.
For dealing with users checking out books, I've devised three tables: book, customer, and book_copy. The relationships between these tables are as follows:
- Every book has many book_copies (to avoid duplicating the book's information while storing the fact that we have multiple copies of that book).
- Every user has many book_copies (the other end of the relationship)
The tables themselves are designed like this:
------------------------------------------------
book
------------------------------------------------
+ id
+ title
+ author
+ isbn
+ etc.
------------------------------------------------
------------------------------------------------
customer
------------------------------------------------
+ id
+ first_name
+ first_name
+ email
+ address
+ city
+ state
+ zip
+ etc.
------------------------------------------------
------------------------------------------------
book_copy
------------------------------------------------
+ id
+ book_id (FK to book)
+ customer_id (FK to customer)
+ checked_out
+ due_date
+ etc.
------------------------------------------------
Something about this seems incorrect (or at least inefficient to me) - the perfectionist in me feels like I'm not normalizing this data correctly. What say ye? Is there a better, more effective way to design this schema?
Thanks!
It's an OK schema. However, it doesn't model the possibility that a work can have several different presentations -- that is, a book can have many editions (and translations, and formats).
How you slice this -- the granularity you use -- depends, as always in data modeling, on on your usage. In other words, is it "true", for you, for your usage, that a German translation of Alice in Wonderland is "different" from the English orioginal? (Yes, probably). Is a paperback version "different" from a hardcover?
The simple answer to this is to just use ISBN as the key -- letting the publishing industry makes these decisions for you. Then, anything with the same ISBN is equal and fungible.
You may also want to model something like "acceptable substitute", "this ISBN is an acceptable substitute for that one, because the only difference is binding" or "this ISBN (Darwin's Origins 6th edition) is the sixth edition of that one (darwin;s original The Origin of Species)", or "this ISBN is a translation of that one" or even "this ISBN (the KJV Bible) is similar to that one (the NIV Bible)." This gets into subtle gradations.
The other, more fundamental problem, is that copies of the same book are conflated with checkouts of those copies. If you unfortunately ordered have 10 copies of, say, Herb Schildt's The Annotated ANSI C Standard, but they are mercifully not checked out because students at your uni read Pete Seebach's excellent review of that terrible book, what is the customer_id for those copies in book_copy?
You want (at least) tables for book (work, isbn); copy; user; and the relation user-checksout-copy.
A couple of things.
I think the only obvious normalization error is the one tpdi pointed out: you shouldn't be combining the book_copy entity with the record of a checkout. Checkouts should be recorded in a separate table which cross-references a customer to a book_copy:
checkout
- customer_id (FK: customer.id)
- book_copy_id (FK: book_copy.id)
- date_checked_out
You don't strictly need due_date in this table, since you can always calculate the due date by looking at the date_checked_out and adding however many days the title is allowed to be checked out for. This raises the topic of having various media types and varying checkout limits. For instance DVDs may have a 1 week limit while books have a 2 week limit. To track this information, you'd have another table:
checkout_limit
- media_type
- checkout_limit_days
And if you do have multiple types of media available, then of course you have to think about whether you want more tables (one for each media type) or want all media types in the same table (then you'd have to rename it from "book" to something generic). In the latter case you will have some redundancy since some types of media won't have attributes that other types have (e.g. books have ISBN but DVDs don't).
What about a master table titles
for the books, which, as tpdi suggested be keyed on ISBN numbers, which would have the title, author, and other details. A subtable book
could then be the list of actual inventory, where you have a primary key for each copy (10 entries if you have 10 copies of The Annotated ANSI C Standard) and a foreign key to the titles
table, which links those copies to their appropriate title and ISBN number. Finally, a checkout
table would be your many-to-many relationship wherein the id of users is linked to the id of the book (rather than title).
The primary difference between how you've done it so far, and how I suggest, is that you've moved the customer_id
out of the book table and employed a new checkout table. You could still have datetimes for each book, showing whether it is out or not, but to find out to whom it is lent, you must consult the checkout table.
I think what he is trying to do is to ease the way he will query for any available book copy.
Say there's 10 copies of Herb Schildt's The Annotated ANSI C Standard, there will be 10 records on the book_copy table.
Am I right?
so when a student checks-out a book, in particular, copy #3, he can just put the customer_id of the student.
The other 9 records will still have an empty customer_id.
This will definitely make your queries easier, but it is not the correct practice
It is correct what the others has pointed out, you need some sort of transaction table, check-outs table to keep track of any books going out.
精彩评论