Database Design :: Normalization in 2 Participant Event :: Join Table or 2 Column?
Edit: u开发者_如何学JAVApdating generalized question to reflect actual domain: sport of hockey.
The actual event then is a games schedule, and the participants are teams.
Teams are the ultimate "owners" (i.e. when team deleted so should any related scheduled games, results, players, and player stats).
The problem discussed so far in this thread covers the decision to combine an event into a single row with 2 columns (team1, team2) or to break out into a join table. The consensus so far is: stay with 2 column approach. However, given that the original question applied to generic events and not scheduled games WITH related results, there may be a change in the approach (for example, some might say that the game schedule should record BOTH schedule info [date, time, location, teams] and game outcome/result info (score, win-loss-tie, penalty minutes power plays, etc.) and therefore a join table should be added to generate unique gameIDs).
Responses so far have been excellent ;--) Will flag as answered pending any updates. Thanks everyone!
ORIGINAL QUESTION:
Puzzled over how to solve this problem.
What is the normalized approach to handling an event (taking place on given date & location) where there will always be exactly 2 participants?
The non-normalized approach would be to create an events table:
1) eventID PK (autonum)
2) two columns, participant1 & participant2, PKs (autonum) from a participants table.While this approach does consolidate event creation in a single table record (no join table to create eventIDs), one problem with this design is that technically participants should be the owning side of the equation; i.e. when deleting a participant, any related event should be deleted since orphaned events are not allowed.
The alternative, a join table, would as I see it generate a unique eventID along with date, time, and location. The revised events table would then consist of the join table eventID and 2 separate rows, one for each participant. With this approach I can easily add an FK to the events table for participantID (PK from participants table) and thereby have proper constraints in place.
How would you approach this problem? I should point out that we have been using the non-normalized design in production for a couple of years without issue (data constraints pushed to code level), but we are looking re-architect from the ground (Database) up (Code), thus the question ;--)
Why do you think that's not normalised? Both participant IDs depend on the event in a way that makes it 3NF (depending on the key, the whole key and nothing but the key, so help me, Codd).
However, if the date and location also depend on the event ID, they too should be placed in this table (with perhaps the location being a foreign key lookup to another table, similar to the participants). Something akin to:
Events:
EventId primary key
Date
LocationId references Locations(LocationId)
ParticipantOneId references Participants(ParticipantId)
ParticipantTwoId references Participants(ParticipantId)
Locations:
LocationId primary key
<<Other location stuff>>
Participants:
ParticipantId primary key
<<Other participant stuff>>
Just because you have something that looks like an array, that doesn't automatically violate 3NF, it's just a warning sign that should be looked at.
Now, if you ever discovered an event that could have 0, 1, 3 or greater participants, then you'd want to rework the schema.
Until then, YAGNI.
Take the constraints back out of application code and put them where they belong--under control of the dbms.
ID numbers have nothing to do with normalization.
You need a table of participants. Set foreign keys referencing participants from the table of events with
ON DELETE CASCADE
; that will let the dbms delete events when users delete one of the participants. This has a lot to do with data integrity, but nothing at all to do with normalization.
Can you detail the other option you're proposing? Or is that what you're asking for? Without knowing all of the tables it is hard to say, but I think the way you're doing it is fine, especially since it is always a 1-to-2 relationship (game to teams). Index by your IDs (gameId, teamId, etc.) and you'll be good and won't need to go through and refactor just for the sake of it.
Consider, the following, what advantage do you really get?
games
=========
game_id PK
game_date
game_time
location
games2teams <-- possibly overkill, since this is a 1-to-2, not 1-to-N or M-to-N
===========
game_id
team_id
schools
==========
team_id
player_name
results
==========
game_id
team_id
score
Both your schemas are normalized, in the case that there will always be two and exactly two participants for each event.
I'd rather go with the approach that does not use a relationship table, and will set FK's on the Participant1 & Participant2 fields, to ensure proper integrity.
It looks like a m-to-n relationship (one person could create/participate to multiple events, to an event could be more participants), thus you should have 3 tables:
Table events:
- id - PK - autoincrement
- other details (start date, duration, location, etc)
Table people:
- id - PK - autoincrement
- other details
Table person_event :
- id - PK - autoincrement (this is a surrogate PK, needed because table holds supplementary information beside the person and event)
- event_id - FK on events
- person_id - FK on people
- participant_role - this can be organizer, simple participant, honoured guest, etc
Now, you can have as many participants to an event as you want, you can have multiple creators of the event, you can have as many roles as you want, enforcing whatever business logic you/the client need
Not sure if you were wanting to keep data on the stats of players on both sides of the game but assumed you do. Also assumed that possible locations is not defined and not particularly important so just have a text field in the game entity to store this. If you do want to report on events by location then you should add a location entity so you can book a game to a particular location from a list. Here is a basic ER model that I think covers all your stated rules. It should help you arrive at a more detailed schema. The "Diary" or log entity stores misc unstructured content about the game. e.g. "It was a rainy day and one player was sent off in the 9th minute. blah.. blah..."
精彩评论