开发者

Can we use SSIS to parse data from application before we store in db...?

Exploring the feasibility of something here... We have a UI page in the mvc3 web-application that the admin will use to upload a 'csv' file. Is it possible to have an SSIS package that will be called by the application..that will validate/transform all these data and store into db if its all valid or throw back error details to application if there are invalid data?

Just trying to leverage the capabilities of SSIS to d开发者_JAVA百科o the parsing of file and transformation of data. If this is not possible, is there a better way?

Please provide your thought..much appreciated!

update: yes, need to execute ssis on the fly.. and give back a message on the UI page ..telling the 'uploader' if his upload failed or not based on message that SSIS would give back to the application.


You could use SSIS to parse a text file. You could also use a sledgehammer to drive finishing nails. Better bet would be something like FileHelpers to keep you out of the text-parsing mud of parsing the text file yourself.


Given the edits and debate, I'll elaborate a bit. Mainly that there are a few big problems I see calling SSIS directly from your MVC controller:

a) Massive external dependency to ship around. Your dev, CI and staging environments will need full blown SQL server with SSIS enabled and probably a slew of other configuration depending on implementation details.

b) Speed -- SSIS is a big job engine, takes a while to spin up before it even starts working.

c) Messaging -- if the point of the operation is letting a user know what is wrong, SSIS is the wrong tool. Unless you want to write a tool to translate SSIS import errors into readable English. Then you should probably stop whatever it is you are doing and sell that tool rather than solving this problem.

On the flip side, most of the .NET based file parsing stuff and C# give you plenty of error handling capabilities that give you a much better shot of getting an understandable error message to a user quickly.

Finally, if you are looking at large files here, you really need to think about some sort of queuing and messaging system. In that context, SSIS could make a bit more sense though I still think the error messages border on useless.


Certainly an SSIS package can do that, that sort of thing is its purpose. It's easier of course if you have an experienced SSIS developer as SSIS can be quite daunting. You can create logging inthe package and exception tables to throw out the bad records and then after the process is complete, check the logs/exception tables to return any bad records to the user who kicked off the process. We have a database that stores meta data about each time an SSISI pacakage is run and you might need that too in order to be able to determin if the error record relate to the file that user uploaded or someone else's being run at about the same time.


You can execute SSIS packages from code. We have a similar process of users uploading CSV files to execute a package but the package is doing a lot more than just validating/parsing logic.

If it's just a little validation and parsing logic then take a look at Rhino ETL and FileHelpers. In fact, for us, these two tools are compelling enough that we are considering using this period but we have don't have a lot of SSIS packages, a dedicated SSIS resource and most of our packages are simple. FileHelpers alone is a great tool if it's really simple parsing.

Here's some articles on Rhino ETL.

http://ayende.com/Blog/archive/2008/01/16/Rhino-ETL-2.0.aspx

http://www.codeproject.com/KB/cs/ETLWithCSharp.aspx

https://github.com/ayende/rhino-etl

If you want to execute an SSIS package you will want to look into the Microsoft.SqlServer.Dts.Runtime namespace.

Here's an article on executing SSIS from c#.


You are describing a fairly standard ETL procedure and that is exactly what SSIS was designed for. SSIS will indeed read CSV files and be able to perform validation on individual fields. What you do with the result of that validation (good or bad, pass no pass) is entirely up to you.

And yes, you can fire off SSIS packages on the fly... as necessary. Check out the dtexec utility. You could also set up a SQL Server Agent job which would monitor the folder that receives the uploaded CSV files and trips off the SSIS processes against them automatically. You ASP.NET code only have to worry about the upload. Everything else would then be a SQL Server/SQL Agent-based ETL operation.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜