Data Modeling: vehicle's year, make, and model?
I'm trying to model vehicles at a basic level. Here's how I see the data:
- A "year" (e.g. 2010, 2011) has 0 or more "make" (e.g. Nissan, Honda)
- A "make" has 0 or more "model" (e.g. Nissan has Sentra, Altima, Maxima)
It wouldn't make sense to have a "year" table containing just 1 column, so I think it would be combined with "make" to create:
TABLE: year_make
- year
- make
I guess that "year" and "make" columns would make up a composite key.
Then I would have a "model" table t开发者_运维问答hat somehow is associated to "year_make" table. The problem is that I don't know what in "year_make" to put in "model" to associate the 2 tables.
Do I make a PK: year_make->year_make_id and use that? It would mean that "year" and "make" columns no longer make a composite key, right?
UPDATE:
I suppose I should have a lookup table "lookup_make", then "year_make" would have a "lookup_make_id" instead of "make" column.
UPDATE 2:
Per nate c:
TABLE: make
- make_id
- name
TABLE: model
- model_id
- make_id
- name
TABLE: model_year
- model_id
- year
I think you'll be better prepared for design if you create some sample data. The goal is to come up with representative sample data. You can learn a lot from representative sample data. (And you can make some pretty expensive mistakes if your sample data isn't representative.)
Year Make Model ?? ?????
--
2011 Honda Accord LX Sedan
2011 Honda Accord SE Sedan
2011 Honda Accord EX Coupe
2011 Toyota Yaris 3-door liftback
2011 Toyota Yaris Sedan
2011 Toyota Yaris 5-door liftback
2011 Lexus IS 350 Sedan
2011 Lexus IS 250 Sedan
- What should you call the ? columns?
- Do "Yaris" and "IS 350" belong in the same column?
- Do "IS" and "350" belong in two different columns?
- What do you do about columns that don't apply to all models?
Resist the temptation to throw a handful of id numbers at data like this. Identify keys and functional dependencies first. Normalize based on the keys and dependencies.
After looking at some more I think you are right to model the year as a many to many join. I put both ways just to show as examples of what of I am talking about. If you have an attribute that covers the whole model over years such as car class (economy, truck, luxury, etc.) You would need a many to many table for normalization and to avoid data duplication.
-- id did not use auto-incs as I am just showing the relational model
create table make(
makename varchar primary key
);
-- 1. many to many
create table model(
modelname varchar not null,
makename varchar not null references make(makename),
-- if carclass changes - one update changes every model/year combo
carclass varchar not null -- economy, suv, truck etc ...
primary key (makename, modelname)
);
create table model_year(
year integer not null,
modelname varchar not null references model(modelname)
baseprice integer not null
primary key (year, modelname)
);
-- 2. year /w/ model
create table model(
modelname varchar not null,
makename varchar not null references make(makename),
year integer not null,
-- update anomaly - you would have to update every model/year combo
carclass varchar -- economy, suv, truck etc ...
-- baseprice is OK since it is tied to the year
baseprice integer not null
primary key (makename, modelname, year)
);
... So #1 would be the more 'correct' and robust way, especially if you planning to hold attributes on the model that would be independent of what year it was made. Either way would get you the same queries. In fact, its harder / more work (not really after you get it down) to join the the tables of a normalized one than a non-normalized one. But that is not the point. You are putting in the DB because you want your data to be correct (I hope).
Note: those are the real primary keys even if you use auto-inc primary keys. You would want to change the primary keys to unique ones to ensure data consistency and change the referencing foreign keys to integers.
IMO, I wouldn't make a table for the years. Doing SQL on an integer column (such as year) is incredibly fast, especially with the right indexes in place. Much faster than a join.
However, I do see the merit to having a separate table for the Make and doing join based on that. This way you'd be able to enter manufacturer details like history, years of production, website, etc.
So I'd do:
Manufacturers: (or Makes)
- id
- name
- website
- country
Cars:
- manufacturer_id (or make_id)
- model
- year
- doors
- trim_class
Looks good so far, like your thinking. In the cars table though, if you have to enter Camry 2001, camry 2002 canry 2003, camree 2004. Seems like you would have redundant data along with the risk of data integrity with misspelling. IMO you would also need somekind of Entity with makeID and modelID, with modelID it is entered just once or model is just entered once in a table. 'Camry' will always be 'Camry' represented with modelID.
If I'm understanding you correctly, you would need to create one make
table and one model
table.
The model
table would have columns for id
, make_id
, name
, year
.
Obviously, make_id would be the foreign key pointing to the make
table.
It really depends on what your situation is and how much you would need to normalize your data.
精彩评论