How to unzip text files and insert upload them to a database
Every morning we receive a Zipped folder called Statistics.tar.gz (downloaded via ftp server) which contains 200 text files with today and yesterday date appended to each file name such :
Customer_20112508_20112408.txt,
Customer_Contact_ContactType_20112508_20112408.txt,
Customer_Orders_20112508_20112408.txt,
Sales_Team_20112508_20112408.txt,
Team_20112508_20112408.txt,
.
.
.
and I have database contains 200 Tables a table for each file as following
Customer,
Customer_Contact_ContactType,
Customer_Orders,
Sales_Team,
Team,
I need to unzip the folder and trim file names (removing the dates) then upload each file to its correspondent table in the db and finally move the folder (Statistics.tar.gz) to an archive folder.
I have not used SSIS before but I am told it is the best solution for this scenario, I am also thinking about using Windows Services Application.
Note: Performance is not important here as the text files usually small.
Update
I have managed to unzip the the files and move them to another folder (C:\temp).And created a stored procedure to bulk insert each file to its correspondent table for example ( insert Customer.text into Customer ).
开发者_如何学JAVAI am trying to use foreach loop container with Execute SQL Task but it is not working.
As always, decompose your problems into smaller and smaller chunks until it's easily solved. If you need to have this by tomorrow and you've never touched SSIS, as much as I love the tool, I do not think you will have success using it.
Setup tasks
Once you create a new SSIS package you will need to create a Flat File connection manager for each of the unique file types. The name of the file can change day to day but the format of the Customer, Sales, etc files can't change or SSIS will fail miserably.
You will want to create an OLE DB connection manager to be able to talk to the destination database.
You will need a File Connection connection manager to hold a reference to the tar/zip file.
You will need to create some variables to hold the connection strings for the above connection managers and assign them to the ConnectionString expression on those Connection Managers.
Right-click in the background of your package, select properties and find the property that says Package Protection Level, change that from EncryptSaveSensitiveWithUserKey to DontSaveSensitive.
Rename Package.dtsx to something useful, like DailyCustomerSalesImport
Unzipping the file
Grab a utility like 7-zip and untar/unzip the file programmatically extract tar.gz in a single step (on windows with 7zip) To make that work in SSIS, you will want to use the "Execute Process Task"
Importing data
SSIS can work from wildcard names so there's no real need to strip the date off the file name unless you really just want to (that'd be handled by "File System Task" as a rename)
From the Unzip task, I'd have a Foreach Loop Container defined to load each file type (Customer, Customer_ContactType, etc) Each Foreach Container would be responsible for looking for the right type of file and assigning that to a variable, which is used as an expression on the flat file connection manager defined for each file type. Looping over files with the Foreach loop
Inside each of those Foreach container, you now have your package ready to do something and that something is a Data Flow Task. Each of those data flows will most likely look the same, flat file source directly mapped to an OLE DB destination, with little to no transformation involved.
Archiving the original
You will need a connection manager pointing to the tarred zip file and use the "File System Task" Wire the output from the last Foreach Loop container if you put them in series or drag all the outputs from the Foreach loop containers to the File System Task if you have them in parallel.
They pushed a patch onto my machine or I'd show you a basic wire-frame of the package. I will need my head examined for spending this much time on a question that will never be reviewed again ;)
精彩评论