开发者

Simple SQL Data Modeling question

Given that I have a table that holds vehicle information and one of those pieces of information is VehicleType (usually 6-20 characters), what are the technical reasons why it is better to design the tables like this:

Vehicles

VehicleID

VehicleTypeID (INT) (relates to an INT in the VehicleTypes table)

versus this:

Vehicles

VehicleID

VehicleType (NVARCHAR(50))

I can think of a few...

1) If the description of the Vehicle Type changes, it only has to be changed in one record.

2) It takes less space to store an INT than a NVARCHAR (depending, of course, on the string length and especially if I change to TINYINT.)

A few questions...

1) Any indexing considerations? I'm assuming if I'm going to be indexing on the VehicleType, it will be faster and take less space if I'm using INTs rather than NVARCHARs.

2) Any query optimization issues? I know the former method requires a JOIN, but I don't expect that to be taxing to SQL 2008.

I'm going to be defending my position and want to have as much information as possible.

Thanks for taking the time to respond.

Thanks,

开发者_如何学运维 Darvis


It also is beneficial in the data entry forms if you have a list of "approved" vehicle types for the users to select from that is driven from the "VehicleType" table. If you don't you'll end up with misspellings and vehicle types that aren't on the list. Plus when new vehicle types are added you don't have to change the data entry front end if you are populating a drop down list from a query, the query will just get all the vehicle types in the table.


The former is 3NF, properly normalized data.

1) Any indexing considerations?

Indexes are not automatically created for foreign keys. Index creation on a foreign key makes sense - it's a very likely to be used as criteria, but the data and access to it should be considered. MySQL has a limit on the amount of space for allocating indexes (no others do that I'm aware of), and while indexes help data retrieval, they also impact INSERT/UPDATE/DELETE statements. If dealing with SQL Server, I highly recommend reading Kim Tripp's The Tipping Point series.

2) Any query optimization issues? I know the former method requires a JOIN, but I don't expect that to be taxing to SQL 2008.

A join is the most preferred means of data retrieval and manipulation, vs say a subquery...


remember that nvarchar requires 2 bytes per character so if an int is 4 character then to use the same space you could only use 2 characters in the nvarchar column.

I would go with int or even smallint (up to 32,767 and 2 bytes of storage) if tinyint (up to 255) is not enough

So I would use the first table in this case

don't know what your queries will look like but might be an idea of switching the columns around and starting with typeid


If the description of the Vehicle Type changes, it only has to be changed in one record.

Correct. And also what about currently unused "vehicle types" eg fuel cell car.

These are "data modification anomalies"

Other folk have answered the index questions...


Basically what you are trying to do is to De-normalize the table design. This is essentially useful when it comes to reporting purpose (when there are millions and millions of records) but for application purposes I would go with a normalized database with proper indexes built on them. This also aids in Referential integrity.

Hope this helps.


Be careful changing names in 'pick lists' (domains) such as Vehicle Type.

If you are using a foreign key then all existing entries for that type will be impacted - would this be valid?

I don't know what Vehicle Type is meant to be, however if Vehicle Type = Vehicle, Make problems can occur if, for instance, Datsun changes its name to Nissan - the existing vehicles in the table are still Datsuns.....

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜