开发者

Data Modeling: vehicle's year, make, and model?

I'm trying to model vehicles at a basic level. Here's how I see the data:

  • A "year" (e.g. 2010, 2011) has 0 or more "make" (e.g. Nissan, Honda)
  • A "make" has 0 or more "model" (e.g. Nissan has Sentra, Altima, Maxima)

It wouldn't make sense to have a "year" table containing just 1 column, so I think it would be combined with "make" to create:

TABLE: year_make
- year
- make

I guess that "year" and "make" columns would make up a composite key.

Then I would have a "model" table t开发者_运维问答hat somehow is associated to "year_make" table. The problem is that I don't know what in "year_make" to put in "model" to associate the 2 tables.

Do I make a PK: year_make->year_make_id and use that? It would mean that "year" and "make" columns no longer make a composite key, right?

UPDATE:

I suppose I should have a lookup table "lookup_make", then "year_make" would have a "lookup_make_id" instead of "make" column.

UPDATE 2:

Per nate c:

TABLE: make
 - make_id
 - name

TABLE: model
 - model_id
 - make_id
 - name

TABLE: model_year
 - model_id
 - year  


I think you'll be better prepared for design if you create some sample data. The goal is to come up with representative sample data. You can learn a lot from representative sample data. (And you can make some pretty expensive mistakes if your sample data isn't representative.)

Year  Make    Model    ??  ?????
--
2011  Honda   Accord   LX  Sedan
2011  Honda   Accord   SE  Sedan
2011  Honda   Accord   EX  Coupe
2011  Toyota  Yaris        3-door liftback
2011  Toyota  Yaris        Sedan
2011  Toyota  Yaris        5-door liftback
2011  Lexus   IS 350       Sedan
2011  Lexus   IS 250       Sedan
  • What should you call the ? columns?
  • Do "Yaris" and "IS 350" belong in the same column?
  • Do "IS" and "350" belong in two different columns?
  • What do you do about columns that don't apply to all models?

Resist the temptation to throw a handful of id numbers at data like this. Identify keys and functional dependencies first. Normalize based on the keys and dependencies.


After looking at some more I think you are right to model the year as a many to many join. I put both ways just to show as examples of what of I am talking about. If you have an attribute that covers the whole model over years such as car class (economy, truck, luxury, etc.) You would need a many to many table for normalization and to avoid data duplication.

    -- id did not use auto-incs as I am just showing the relational model

    create table make(
            makename        varchar                         primary key
    );
    -- 1.  many to many
    create table model(
            modelname       varchar not null,
            makename        varchar not null        references make(makename),
            -- if carclass changes - one update changes every model/year combo
            carclass        varchar not null -- economy, suv, truck etc ...
            primary key (makename, modelname)
    );

    create table model_year(
            year            integer not null,
            modelname       varchar not null        references model(modelname)
            baseprice       integer not null
            primary key (year, modelname)
    );
    -- 2. year /w/ model
    create table model(
            modelname       varchar not null,
            makename        varchar not null        references make(makename),
            year            integer not null,
            -- update anomaly - you would have to update every model/year combo
            carclass        varchar -- economy, suv, truck etc ...
            -- baseprice is OK since it is tied to the year 
            baseprice       integer not null
            primary key (makename, modelname, year)
    );

... So #1 would be the more 'correct' and robust way, especially if you planning to hold attributes on the model that would be independent of what year it was made. Either way would get you the same queries. In fact, its harder / more work (not really after you get it down) to join the the tables of a normalized one than a non-normalized one. But that is not the point. You are putting in the DB because you want your data to be correct (I hope).

Note: those are the real primary keys even if you use auto-inc primary keys. You would want to change the primary keys to unique ones to ensure data consistency and change the referencing foreign keys to integers.


IMO, I wouldn't make a table for the years. Doing SQL on an integer column (such as year) is incredibly fast, especially with the right indexes in place. Much faster than a join.

However, I do see the merit to having a separate table for the Make and doing join based on that. This way you'd be able to enter manufacturer details like history, years of production, website, etc.

So I'd do:

Manufacturers: (or Makes)
  - id
  - name
  - website
  - country

Cars:
  - manufacturer_id (or make_id)
  - model
  - year
  - doors
  - trim_class


Looks good so far, like your thinking. In the cars table though, if you have to enter Camry 2001, camry 2002 canry 2003, camree 2004. Seems like you would have redundant data along with the risk of data integrity with misspelling. IMO you would also need somekind of Entity with makeID and modelID, with modelID it is entered just once or model is just entered once in a table. 'Camry' will always be 'Camry' represented with modelID.


If I'm understanding you correctly, you would need to create one make table and one model table. The model table would have columns for id, make_id, name, year.

Obviously, make_id would be the foreign key pointing to the make table.

It really depends on what your situation is and how much you would need to normalize your data.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜