Database design : two different column in one table or add another table?

2023-02-07 17:30 问答作者：

I'm developing a web application to store data about car accident on a sql server database.

I need a way to store the roads where the accident happened. initially i thought to have two different column in the accident table (say FirstRoad, SecondRoad) (the second one is used for crossroads). A friend of mine says that is better to have a different Road table .

I don't know which solution is better .

The accident table is estimated to have about 1000 record per year :could the join with another table be expensive in terms of performance?

One goal of the application is to create statistical reports; one of these is the list of the most dangerous roads and probably having two columns in database coulbe be a problem to create that report. What do you think about?

thanks.

Update: Road name is not the only information i need to store. An important data is the street number (if the accident happens within a city) or the mile (outside the city). At this 开发者_如何学JAVAstage i don't have the geographical coordinates.

What about having the Accident table with these fields ?:

AccidentId (PK), Road1Id (FK), <-- point to the Road table storing all road names Road2Id (FK), Road1_StreetNumber_or_Mile, Road2_StreetNumber_or_Mile etc...

I personally would have seperate tables for everything, especially for statistical analysis. A road can have a name, postcode, town, etc, and each of these could help with the analysis. However if all you are storing is the road name, I see no need for you to have a seperate table. In saying that, 1000 records per year is absolutely nothing. And a simple join to another table should have a negligible impact on performance.

It is best to use 2 tables. One with 2 ID columns, one for each road. The second table would have all of the road names. This decreases the actual amount of information you have to entered over the long term and decreases the chance of entering a road name 2 times with a different name.

It will also make reporting fast as you can sum by the ID of the road instead of the name.

There is a whole database theory/math system behind this called Data Normalization. This, http://en.wikipedia.org/wiki/Database_normalization is a good starting point to learning about it.

From todays standards of Database design you want to eliminate redundancies.

So a seperate Road table is definitely a good suggestion.

The next question is whether to modify with or without null values. Null values are generally faster because you don't need cross joins and are preferrable if they are often populated.

As one accident can occour only in one or two roads i would recommend one accident table with roadA, roadB and a road table with roads.

If you only every expect to see two roads maximum then I probably wouldn't normalise road into a separate table but would include Road1 and Road2 fields in the primary table. If however, as may be likely, that you'll see more than 2 roads then you may want to think about a 1 to many relationship between an accident table (1) and a roads table (many).

As you are planning to do statistical analysis then you may have issues as certainly here in the UK road name alone is insufficient to identify a location. Depending on how much data you have on the accident location, you might want to expand how you record the physical location to enable analysis beyond just the road names.

I don't know which solution is better .

Step 1. Buy a good book on relational database design before you do another thing. Seriously. This is a well-understood problem, and you need a lot more background.

Step 2. After learning about ordinary database design, buy Kimball's The Data Warehouse Toolkit.

The accident table is estimated to have about 1000 record per year :could the join with another table be expensive in terms of performance?

That's microscopically small. Seriously. Until you're talking about 100,000's of records, you don't really have a very big database.

One goal of the application is to create statistical reports; one of these is the list of the most dangerous roads and probably having two columns in database could be be a problem to create that report. What do you think about?

Buy Kimball's The Data Warehouse Toolkit immediately.

You have a "fact" table which is an accident. There may be no measurable quantity to the accident. It becomes a strange "fact-less fact table" because you have no measurements associated with the accident. Ideally, you have a fact (like cost or something).

You have dimensions of the accident:

time
road (actually two references from the accident fact to this dimension)
vehicle?
type of accident (property damage only, injury, death)
cause (driver error, mechanical problem, etc.)

Each dimension is a separate table, as is the central fact table.

Maybe you can start with something like this -- just to give you an idea.

Database design : two different column in one table or add another table?

继续阅读：database-design

Database design : two different column in one table or add another table?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？