开发者

Many to many - junction table

I am doing this design. Users can speak one or more languages (limit five). first language is the native and the other开发者_运维问答s are in order of fluency.

user 1 speak 3 languages

    user_id|first language|secondlanguage|thirdlanguage|fourthlanguage| fifthlanguage 
    1        english (1)     franch(2)    portuguese (3)      null           null 

Many to many - junction table

What do you think about this scheme? Sounds a bit strange or it is the correct way to do that?


The answer on this one is that it really depends. Relational purists will say this is a repeating group and this is a terrible design, while the quick-and-dirty crowd will say, "if you're only ever going to support 5 languages, then do it". But, really - how can you be sure you'd only support 5 languages forever and always? And what sorts of queries will you be doing? Imagine you want to find all the people who speak English - your query will be:

select * from single_user_has_spoken_languages where lang1 = @englishID or lang2 = @englishID or... etc

If you want to find people that speak both English and French, then your where clause grows again. And every single query will have to change if you ever add a 6th language.

Consider the alternative:

select distinct UserID from single_user_has_spoken_languages where langID = @englishID

Once you stop and think about how your queries will work, good design begins to become much clearer.


I would normalize out the languages.

So, for example (non-key columns omitted for brevity):

Language Table
id:  INT

Single_User Table
id:  INT

User_Language Table
user_id: INT
language_id: INT
language_num:  SMALLINT

You can put unique constraints on both user_id, language_id as well as user_id, language_num to ensure no duplicates. You can also put a check constraint on language_num to be between 1 and 5 to ensure only 5 languages are selected.

You can also choose to add a surrogate key which would make referencing rows easier.


This is a "denormalized" form of the "correct" or "normaised" design, which would have a separate table to store languages that had a column for the fluency order, ie:

Table User: User_ud, name, etc
Table Language: user_id, fluency, name

Your approach is usually used for performance reasons, but is usually to be avoided at the design stage.

Consider how your design would handle adding more language slots, or storing how well the language was spoken. The normalized design can handle this easily. You approach can not.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜