开发者

SSIS ETL Design

SSIS ETL Design

I'm trying to build a ETL job. Below are my primary design considerations

  1. One master table to track data pull and st开发者_如何学Catus

Table Details

Table Name - JobStatus

  • Columns - JobId, StartTime, EndTime, Status

Table Name - JobDetails

  • Job_Detail_id, JobId, TableName, StartTime, EndTime, Status

Table Name - ErrorLog

  • ErrorLogid, ErrorDesc, Time, Comments

How it Works

  1. Whenever job fails, Job details table would help to figure out which is the last errored table, ETL job can start from that step

  2. SSIS package would have DataFlow Tasks defined for each table between source and staging DB

  3. Example of how it works

    • Put Entry in Job Status Table to start package execution

    • Each Dataflow task for individual table would put an entry in JobDetails table. After Loading data End Time and Status would be updated

    • On Error Status updated to Error and entry logged in Errorlog table

I would need your comments / feedback for this approach. Any bookmarks/blogposts for ETL design relevant to this example also would be helpful


I think that the requirements that you have mentioned about restarting the package execution from point of failure already exists in SSIS. You can use Checkpoint file to achieve that functionality. The in-built SSIS logging feature can help you to track the error messages and will help resolving the issue.

Click here to see an article that I have written on my website with respect to your question. I hope that might give you an idea about the functionality that already exist in SSIS and how you can make best use of it.

I think you can spend quality time on fine tuning your core business functionality than reinventing the process that already exists for the most part. The standard functionality might require some minor alterations based on our requirements but I don't think that you need to start from the ground up.

Hope that helps.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜