How to load a flat file with header and detail data into a database using SSIS package?
I have to load a flat file that has different header and detail with variable number of columns. These have parent child relations. How to load the data into SQL Server? The file looks like this:
DEP*0116960*20110511***01*061000104*DA*1000022220940
AMT*3*13006.05
QTY*41*3
QTY*42*5
BAT*20110511**STAWRRY11051101
AMT*2*9332.33
QTY*42*2
BPR*I*4799*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001002
BPR*I*4533.33*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001001
BAT*20110511**STAWRRY11051102
AMT*2*1986.99
QTY*42*2
BPR*I*853.97*C*CHK*PBC*01*111000614*DA开发者_JS百科*708340062
REF*CK*0010736416
RMR*ST*00090183**853.97*12199.61
REF*BE*01*0123181825*0
REF*SL*NA*191219012318
DTM*810*20110430
BPR*I*1133.02*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736417
RMR*ST*00090184**1133.02*16186.04
REF*BE*01*0123181825*0
REF*SL*NA*191219012318
DTM*810*20110430
BAT*20110511**STAWRRY11051103
AMT*2*1686.73
QTY*42*1
BPR*I*1686.73*C*CHK*PBC*01*075911742*DA*0100461755
REF*CK*0000002795
RMR*ST*00094075**1686.73*42168.16
REF*BE*01*0123181825*0
REF*SL*NA*191219012318
DTM*810*20110331
SE*39*000000088
GE*1*88
IEA*1*000000088
Here is one possible way of loading this file into SQL Server. Below shown example reads the contents of EDI 823 Lockbox file and loads into multiple tables along with the relationship. I am sure that there are other better ways of doing this. This is just one example of loading an EDI file into SQL Server.
The example reads the EDI file line by line and then splits them based on the character asterisk (*). The script component assigns the value to the variables in the Script Component to populate the tables. After the data is populated by the Data Flow Task
, the Execute SQL Task
will update the ParentId column in the tables dbo.AMT
, dbo.DTM
, dbo.QTY
and dbo.REF
using the stored procedure dbo.UpdateHierarchy
. The table 'ISA' contains the data of all other segments. Screenshots show how the data is stored in the child tables.
SetId
columns used in all the tables will group data of a file in order to avoid wrong parent id being referred from data pulled from another file. SetId
will be unique for each file loaded into these tables.
Step-by-step process:
Create 5 tables named
dbo.AMT
,dbo.DTM
,dbo.ISA
,dbo.QTY
anddbo.REF
and a stored procedure nameddbo.UpdateHierarchy
in the database using the scripts provided under SQL Scripts section. TablesAMT
,DTM
,QTY
andREF
will store the data of the segments named similarly and the tableISA
will contain all other segment data.Create an
OLE DB Connection
named SQLServer to connect to the SQL Server instance and create aFlat File Connection
namedSource
as shown in screenshots #""1"" - #4. Flat File connection will use the EDI file. Remove theColumn delimiter
because the file has varying number of elements. This example will split the elements using theScript Component
.On the SSIS package, create 5 variables as shown in screenshot #5. Also, place a
Data Flow task
and anExecute SQL Task
on theControl Flow
tab of the package as shown in screenshot #5.Configure the Data Flow Task with
Flat File Source
,Script Component
,Multicast
,Conditional Split
and 5OLE DB Destinations
as shown in screenshot #6.Configure the
Flat File Source
to read the EDI file using the Flat File connection nameSource
.Configure the
Script Component
as Transformation task as shown in screenshots #7 and #8. Create all the other variables using the names and data types using the data provided under Script Component Variables section. Replace the class ScriptMain present within theScript Component Transformation
task with the code provided under Script Component Code.Configure the
Conditional Split
as shown in screenshot #9.Configure the
OLE DB Destination
AMT to insert data into the tabledbo.AMT
and map fields as shown in screenshot #10.Configure the
OLE DB Destination
QTY to insert data into the tabledbo.QTY
and map fields as shown in screenshot #11.Configure the
OLE DB Destination
REF to insert data into the tabledbo.REF
and map fields as shown in screenshot #12.Configure the
OLE DB Destination
DTM to insert data into the tabledbo.DTM
and map fields as shown in screenshot #13.Configure the
OLE DB Destination
Other to insert data into the tabledbo.ISA
and map fields as shown in screenshot #14.On the Data Flow tab, configure the
Execute SQL Task
as shown in screenshot #15.Screenshots #16 and #17 shows sample package execution. File data used for testing this example is provided under section
EDI File Content
.Screenshot #18 shows data in table
dbo.AMT
in comparison with data in tabledbo.ISA
.Screenshot #19 shows data in table
dbo.QTY
in comparison with data in tabledbo.ISA
.Screenshot #20 shows data in table
dbo.REF
in comparison with data in tabledbo.ISA
.Screenshot #21 shows data in table
dbo.DTM
in comparison with data in tabledbo.ISA
.
Hope that helps.
SQL Scripts:
CREATE TABLE [dbo].[AMT](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NULL,
[LineNumber] [int] NULL,
[AmountQualifierCode] [varchar](3) NULL,
[MonetaryAmount] [numeric](18, 2) NULL,
[SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_AMT] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
CREATE TABLE [dbo].[DTM](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NULL,
[LineNumber] [int] NULL,
[DateTimeQualifier] [varchar](3) NULL,
[Date] [varchar](10) NULL,
[Time] [varchar](10) NULL,
[TimeCode] [varchar](2) NULL,
[PeriodFormatQualifier] [varchar](3) NULL,
[DateTimePeriod] [varchar](35) NULL,
[SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_DTM] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
CREATE TABLE [dbo].[ISA](
[Id] [int] IDENTITY(1,1) NOT NULL,
[LineNumber] [int] NULL,
[SegmentCode] [varchar](3) NULL,
[ReferenceId] [varchar](30) NULL,
[Date] [varchar](10) NULL,
[Time] [varchar](10) NULL,
[IdNumberQualifier] [varchar](2) NULL,
[IdentificationNumber] [varchar](12) NULL,
[AccountNumberQualifier] [varchar](3) NULL,
[AccountNumber] [varchar](35) NULL,
[TransactionHandlingCode] [varchar](2) NULL,
[CreditDebitFlag] [varchar](1) NULL,
[PaymentMethod] [varchar](3) NULL,
[ReferenceIdQualifier] [varchar](3) NULL,
[MonetaryAmount1] [numeric](18, 2) NULL,
[MonetaryAmount2] [numeric](18, 2) NULL,
[AuthorizeInfoQualifier] [varchar](2) NULL,
[AuthorizeInfo] [varchar](10) NULL,
[SecurityInfoQualifier] [varchar](2) NULL,
[SecurityInfo] [varchar](10) NULL,
[InterchangeSenderIdQualifier] [varchar](2) NULL,
[InterchangeSenderId] [varchar](15) NULL,
[InterchangeReceiverIdQualifier] [varchar](2) NULL,
[InterchangeReceiverId] [varchar](15) NULL,
[InterchangeStandardsId] [varchar](1) NULL,
[InterchangeVersionId] [varchar](5) NULL,
[InterchangeControlNumber] [varchar](9) NULL,
[Acknowledge] [varchar](1) NULL,
[TestIndicator] [varchar](1) NULL,
[SubElementSeparator] [varchar](1) NULL,
[FunctionalId] [varchar](2) NULL,
[ApplicationSenderCode] [varchar](12) NULL,
[ApplicationReceiverCode] [varchar](12) NULL,
[Responsbility] [varchar](2) NULL,
[Version] [varchar](12) NULL,
[TransactionIdCode] [varchar](3) NULL,
[TransactionSetControlNumber] [varchar](9) NULL,
[Counter] [int] NULL,
[SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_ISA] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
CREATE TABLE [dbo].[QTY](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NULL,
[LineNumber] [int] NULL,
[QuantityQualifier] [varchar](2) NULL,
[Quantity] [numeric](15, 0) NULL,
[SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_QTY] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
CREATE TABLE [dbo].[REF](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NULL,
[LineNumber] [int] NULL,
[ReferenceIdQualifier] [varchar](3) NULL,
[ReferenceId] [varchar](30) NULL,
[SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_REF] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
CREATE PROCEDURE [dbo].[UpdateHierarchy]
AS
BEGIN
SET NOCOUNT ON;
UPDATE AMT
SET AMT.ParentId = COALESCE(PAR.ParentId, 0)
FROM dbo.AMT AMT
CROSS APPLY (
SELECT MAX(Id) AS ParentId
FROM dbo.ISA PAR
WHERE PAR.SetId = AMT.SetId
AND PAR.LineNumber < AMT.LineNumber
AND PAR.SegmentCode IN ('DEP', 'BAT')
) PAR;
UPDATE QTY
SET QTY.ParentId = COALESCE(PAR.ParentId, 0)
FROM dbo.QTY QTY
CROSS APPLY (
SELECT MAX(Id) AS ParentId
FROM dbo.ISA PAR
WHERE PAR.SetId = QTY.SetId
AND PAR.LineNumber < QTY.LineNumber
AND PAR.SegmentCode IN ('DEP', 'BAT')
) PAR;
UPDATE REF
SET REF.ParentId = COALESCE(PAR.ParentId, 0)
FROM dbo.REF REF
CROSS APPLY (
SELECT MAX(Id) AS ParentId
FROM dbo.ISA PAR
WHERE PAR.SetId = REF.SetId
AND PAR.LineNumber < REF.LineNumber
AND PAR.SegmentCode IN ('BPR', 'RMR')
) PAR;
UPDATE DTM
SET DTM.ParentId = COALESCE(PAR.ParentId, 0)
FROM dbo.DTM DTM
CROSS APPLY (
SELECT MAX(Id) AS ParentId
FROM dbo.ISA PAR
WHERE PAR.SetId = DTM.SetId
AND PAR.LineNumber < DTM.LineNumber
AND PAR.SegmentCode IN ('BPR', 'RMR')
) PAR;
END
GO
Script Component Variables:
S.no. Variable name Data Type Length/Precision
1. LineNumber four-byte signed integer [DT_I4]
2. ParentId four-byte signed integer [DT_I4]
3. SegmentCode string [DT_STR] 3
4. ReferenceId string [DT_STR] 30
5. Date string [DT_STR] 10
6. Time string [DT_STR] 10
7. IdNumberQualifier string [DT_STR] 2
8. IdentificationNumber string [DT_STR] 12
9. AccountNumberQualifier string [DT_STR] 3
10. AccountNumber string [DT_STR] 35
11. AmountQualifierCode string [DT_STR] 3
12. TransactionHandlingCode string [DT_STR] 2
13. CreditDebitFlag string [DT_STR] 1
14. PaymentMethod string [DT_STR] 3
15. DateTimeQualifier string [DT_STR] 3
16. TimeCode string [DT_STR] 2
17. PeriodFormatQualifier string [DT_STR] 3
18. DateTimePeriod string [DT_STR] 35
19. QuantityQualifier string [DT_STR] 2
20. Quantity numeric [DT_NUMERIC] 15, 0
21. ReferenceIdQualifier string [DT_STR] 3
22. MonetaryAmount1 numeric [DT_NUMERIC] 18,2
23. MonetaryAmount2 numeric [DT_NUMERIC] 18,2
24. AuthorizeInfoQualifier string [DT_STR] 2
25. AuthorizeInfo string [DT_STR] 10
26. SecurityInfoQualifier string [DT_STR] 2
27. SecurityInfo string [DT_STR] 2
28. InterchangeSenderIdQualifier string [DT_STR] 2
29. InterchangeSenderId string [DT_STR] 15
30. InterchangeReceiverIdQualifier string [DT_STR] 2
31. InterchangeReceiverId string [DT_STR] 15
32. InterchangeStandardsId string [DT_STR] 1
33. InterchangeVersionId string [DT_STR] 5
34. InterchangeControlNumber string [DT_STR] 9
35. Acknowledge string [DT_STR] 1
36. TestIndicator string [DT_STR] 1
37. SubElementSeparator string [DT_STR] 1
38. FunctionalId string [DT_STR] 2
39. ApplicationSenderCode string [DT_STR] 12
40. ApplicationReceiverCode string [DT_STR] 12
41. Responsbility string [DT_STR] 2
42. Version string [DT_STR] 12
43. TransactionIdCode string [DT_STR] 3
44. TransactionSetControlNumber string [DT_STR] 9
45. Counter four-byte signed integer [DT_I4]
46. SetId unique identifier [DT_GUID]
Script Component Code:
VB.NET code that can be used in SSIS 2005 and above
.
Public Class ScriptMain
Inherits UserComponent
Public Enum Segments
ISA
GS
ST
DEP
AMT
QTY
BAT
BPR
REF
RMR
DTM
SE
GE
IEA
End Enum
Dim arrLine As String()
Dim segmentCode As String
Dim counter As Integer
Dim lineNumber As Integer = 1
Dim segmentSequence As Hashtable = New Hashtable()
Dim setId As Guid = Guid.NewGuid()
Public Overrides Sub PreExecute()
MyBase.PreExecute()
End Sub
Public Overrides Sub PostExecute()
MyBase.PostExecute()
End Sub
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
arrLine = Row.LineText.Split("*"c)
segmentCode = SegmentValue(arrLine, 0)
Row.SegmentCode = segmentCode
Row.LineNumber = lineNumber
Row.SetId = setId
counter = arrLine.Length
lineNumber += 1
Select Case (segmentCode.ToUpper())
Case Segments.ISA.ToString.ToUpper()
Row.AuthorizeInfoQualifier = SegmentValue(arrLine, 1)
Row.AuthorizeInfo = SegmentValue(arrLine, 2)
Row.SecurityInfoQualifier = SegmentValue(arrLine, 3)
Row.SecurityInfo = SegmentValue(arrLine, 4)
Row.InterchangeSenderIdQualifier = SegmentValue(arrLine, 5)
Row.InterchangeSenderId = SegmentValue(arrLine, 6)
Row.InterchangeReceiverIdQualifier = SegmentValue(arrLine, 7)
Row.InterchangeReceiverId = SegmentValue(arrLine, 8)
Row.Date = SegmentValue(arrLine, 9)
Row.Time = SegmentValue(arrLine, 10)
Row.InterchangeStandardsId = SegmentValue(arrLine, 11)
Row.InterchangeVersionId = SegmentValue(arrLine, 12)
Row.InterchangeControlNumber = SegmentValue(arrLine, 13)
Row.Acknowledge = SegmentValue(arrLine, 14)
Row.TestIndicator = SegmentValue(arrLine, 15)
Row.SubElementSeparator = SegmentValue(arrLine, 16)
Case Segments.GS.ToString.ToUpper()
Row.FunctionalId = SegmentValue(arrLine, 1)
Row.ApplicationSenderCode = SegmentValue(arrLine, 2)
Row.ApplicationReceiverCode = SegmentValue(arrLine, 3)
Row.Date = SegmentValue(arrLine, 4)
Row.Time = SegmentValue(arrLine, 5)
Row.InterchangeControlNumber = SegmentValue(arrLine, 6)
Row.Responsbility = SegmentValue(arrLine, 7)
Row.Version = SegmentValue(arrLine, 8)
Case Segments.ST.ToString.ToUpper()
Row.TransactionIdCode = SegmentValue(arrLine, 1)
Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)
Case Segments.DEP.ToString.ToUpper()
Row.ReferenceId = SegmentValue(arrLine, 1)
Row.Date = SegmentValue(arrLine, 2)
Row.IdNumberQualifier = SegmentValue(arrLine, 5)
Row.IdentificationNumber = SegmentValue(arrLine, 6)
Row.AccountNumberQualifier = SegmentValue(arrLine, 7)
Row.AccountNumber = SegmentValue(arrLine, 8)
Case Segments.AMT.ToString.ToUpper()
Row.AmountQualifierCode = SegmentValue(arrLine, 1)
Row.MonetaryAmount1 = SegmentValue(arrLine, 2)
Case Segments.QTY.ToString.ToUpper()
Row.QuantityQualifier = SegmentValue(arrLine, 1)
Row.Quantity = Convert.ToDecimal(SegmentValue(arrLine, 2))
Case Segments.BAT.ToString.ToUpper()
Row.Date = SegmentValue(arrLine, 1)
Row.ReferenceId = SegmentValue(arrLine, 3)
Case Segments.BPR.ToString.ToUpper()
Row.TransactionHandlingCode = SegmentValue(arrLine, 1)
Row.MonetaryAmount1 = SegmentValue(arrLine, 2)
Row.CreditDebitFlag = SegmentValue(arrLine, 3)
Row.PaymentMethod = SegmentValue(arrLine, 4)
Row.IdNumberQualifier = SegmentValue(arrLine, 6)
Row.IdentificationNumber = SegmentValue(arrLine, 7)
Row.AccountNumber = SegmentValue(arrLine, 9)
Case Segments.REF.ToString.ToUpper()
Row.ReferenceIdQualifier = SegmentValue(arrLine, 1)
Row.ReferenceId = SegmentValue(arrLine, 2)
Case Segments.RMR.ToString.ToUpper()
Row.ReferenceIdQualifier = SegmentValue(arrLine, 1)
Row.ReferenceId = SegmentValue(arrLine, 2)
Row.MonetaryAmount1 = SegmentValue(arrLine, 4)
Row.MonetaryAmount2 = SegmentValue(arrLine, 5)
Case Segments.DTM.ToString.ToUpper()
Row.DateTimeQualifier = SegmentValue(arrLine, 1)
Row.Date = SegmentValue(arrLine, 2)
Row.Time = SegmentValue(arrLine, 3)
Row.TimeCode = SegmentValue(arrLine, 4)
Row.PeriodFormatQualifier = SegmentValue(arrLine, 5)
Row.DateTimePeriod = SegmentValue(arrLine, 6)
Case Segments.SE.ToString.ToUpper()
Row.Counter = Convert.ToInt32(SegmentValue(arrLine, 1))
Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)
Case Segments.GE.ToString.ToUpper()
Row.Counter = Convert.ToInt32(SegmentValue(arrLine, 1))
Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)
Case Segments.IEA.ToString.ToUpper()
Row.Counter = Convert.ToInt32(SegmentValue(arrLine, 1))
Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)
End Select
End Sub
Public Function SegmentValue(ByRef LineArray As String(), ByVal Counter As Integer) As String
If LineArray.Length > Counter Then
Return LineArray(Counter).ToString().Trim()
End If
Return String.Empty
End Function
End Class
EDI File Content:
ISA*00* *00* *12*2562379521 *08*YOUR ID *19981223*1056*U*00401*000000017*0*T*>
GS*PD*2562379521*YOUR ID*19981223*1056*000000017*X*004010VICS
ST*852*000000001
DEP*0116960*20110511***01*061000104*DA*1000022220940
AMT*3*13006.05
QTY*41*3
QTY*42*5
BAT*20110511**STAWRRY11051101
AMT*2*9332.33
QTY*42*2
BPR*I*4799*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001002
BPR*I*4533.33*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001001
BAT*20110511**STAWRRY11051102
AMT*2*1986.99
QTY*42*2
BPR*I*853.97*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736416
RMR*ST*00090183**853.97*12199.61
REF*BE*01*0123181825*0
REF*SL*NA*191219012318
DTM*810*20110430
BPR*I*1133.02*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736417
RMR*ST*00090184**1133.02*16186.04
REF*BE*01*0123181825*0
REF*SL*NA*191219012318
DTM*810*20110430
BAT*20110511**STAWRRY11051103
AMT*2*1686.73
QTY*42*1
BPR*I*1686.73*C*CHK*PBC*01*075911742*DA*0100461755
REF*CK*0000002795
RMR*ST*00094075**1686.73*42168.16
REF*BE*01*0123181825*0
REF*SL*NA*191219012318
DTM*810*20110331
SE*39*000000088
GE*1*88
IEA*1*000000088
Screenshot #1:
Screenshot #2:
Screenshot #3:
Screenshot #4:
Screenshot #5:
Screenshot #6:
Screenshot #7:
Screenshot #8:
Screenshot #9:
Screenshot #10:
Screenshot #11:
Screenshot #12:
Screenshot #13:
Screenshot #14:
Screenshot #15:
Screenshot #16:
Screenshot #17:
Screenshot #18:
Screenshot #19:
Screenshot #20:
Screenshot #21:
You can use a script component source to build the rules on handling the file. That's well described here and here. You may be able to use the EDI Source from CozyRoc, but I don't believe they have indicated it has been tested with EDI 823.
This is why translation engines are available and so widely used. I love the detail of the solution above, but also feel like there's some reinvention of the wheel.
Take a look at Jitterbit. It is open source, and pretty user friendly for data transformations.
You can model your source, model your target and then do a transformation. Based on your models, you'll have much more control over the transformation that what SSIS will give you. Sure there's a learning curve, but you can automate the communication / transformation. There also would be ZERO code to write.
精彩评论