SQL: Normalizing Code / Export - Import
I've decided to rewrite a database I have that is poorly normalized. I've created the new database but now need to move data from the old database into the new one. I'm not sure exactly how to accomplish though. For example, in the old database I have a webDorms table that looks like this:
id (PK, int)
room_type (varchar)
description (char)
available (varchar)
max_occupancy (varchar)
current_occupancy (varchar)
dorm_room (varchar)
dorm_building (varchar)
building_code (varchar)
In the new database I split this out into two tables Buildings and Housing. Housing looks like this:
id (PK, int)
building (FK, int)
room (nvarchar)
current_occupancy (int)
max_occupancy (int)
is_available (bit)
gender (nvarchar)
room开发者_JS百科_type (nvarchar)
Buildings looks like this:
id (PK, int)
building_code (nvarchar)
building_name (nvarchar)
I've manually repopulated the Buildings table (it was around twenty rows) but the Housing table (which is most similar to the previous webDorms table) contains around three hundred rows - and I'd rather not rekey all that data.
Any suggestions on the best way to accomplish this import?
If I'm missing something obvious in your question, please let me know with a comment, but what's wrong with using INSERT queries to copy the data over instead of retyping the data?
This is one reason why SQL has SELECT DISTINCT -- normalization.
Two SELECT DISTINCT queries from the source table should create the resulting two normalized tables.
You have two approaches.
INSERT INTO new_table_1(...) AS SELECT DISTINCT a, b, c FROM old table
INSERT INTO new_table_2(...) AS SELECT DISTINCT d, e, f FROM old table
Or do two separate queries to create two separate extract files (in CSV format). Then load those two files.
you would want to do something like this:
SELECT CAST(current_occupancy AS int) AS current_occupancy,
CAST(max_occupancy AS int) AS max_occupancy,
CAST(available AS bit) AS available
You need to take a look at books online and specifically CAST and CONVERT.
You mentioned normalisation as the reason for the re-design of the table, but you have not yet normalised all the possibilities. Room type and gender are candidates here, also the data types for max and current occupancy are rather large, unless you are wanting to have rooms that allow 2 billion occupants. :0)
It would look similar to this:
INSERT INTO ['database'].dbo.[housing] (id, building, current_occupancy, max_occupancy, room_type, is_available)
SELECT id, [building].id, cast(current_occupancy as int), cast(max_occupancy as int), room_type, CASE available WHEN 'Y' THEN 1 ELSE 0 END)
FROM ['database'].dbo.[webDorms]
INNER JOIN ['database'].dbo.[building] ON [building].building_code = [webDorms].building_code
[Room] and [Gender] would need to be added manually, or you can populate them with defaults during the insert.
This is of course after you've populated your [building] table, which could have been done by:
INSERT INTO ['database'].dbo.[building] (building_code)
SELECT DISTINCT building_code
FROM ['database'].dbo.[webDorms]
Then populating your [building_name] manually (instead of keying in all the data).
精彩评论