
Duplicate Records in SSIS Flat File Destination

I am writing to a flat file destination in a 2008 SSIS package. 99.99% of it works correctly. However, I get one duplicate record in the destination file.

Here is the basic flow of the package:

1. Read two ISO-8859-1 encoded files and encode their text to UTF8 in memory

2. Combine the two files together in memory and load them into a lookup cache

3. Read another source file from disk

4. Match an ID column from the source file to an ID column in the lookup cache

5. If the ID matches an ID in the lookup cache, write the row to a match file; if it does not match, write it to another file
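To make the five steps above easier to reason about outside SSIS, here is a minimal Python sketch of the same flow. It is not the package itself: the column name `id`, the CSV layout, and the helper names `decode_latin1`, `build_lookup`, and `split_rows` are all assumptions made for illustration, and in-memory byte strings stand in for the files on disk.

```python
import csv
import io


def decode_latin1(raw: bytes) -> str:
    """Step 1: decode ISO-8859-1 bytes into Unicode text held in memory."""
    return raw.decode("iso-8859-1")


def build_lookup(texts, id_column="id"):
    """Step 2: combine the decoded files into one set of IDs (the 'lookup cache')."""
    lookup = set()
    for text in texts:
        for row in csv.DictReader(io.StringIO(text)):
            lookup.add(row[id_column])
    return lookup


def split_rows(source_text, lookup, id_column="id"):
    """Steps 3-5: read the source rows and route each one by whether its ID is cached."""
    matched, unmatched = [], []
    for row in csv.DictReader(io.StringIO(source_text)):
        if row[id_column] in lookup:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


# Hypothetical sample data standing in for the two lookup files and the source file.
file_a = b"id,name\n1,Caf\xe9\n"   # \xe9 is 'e with acute accent' in ISO-8859-1
file_b = b"id,name\n2,Other\n"
source = "id,value\n1,x\n3,y\n1,z\n"

lookup = build_lookup([decode_latin1(file_a), decode_latin1(file_b)])
matched, unmatched = split_rows(source, lookup)
```

Note that a plain membership test like this passes legitimate duplicates through untouched (ID `1` appears twice in the sample source and twice in `matched`), so an unexpected extra row has to come from the input rows or from how the key is compared, not from the routing itself.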

Everything works from beginning to end. However, I am getting a duplicate in the match file. I have begun to suspect that the duplicate is caused by an end-of-file (or other) special character in the lookup cache text files when they are joined. These files are produced on a UNIX system (but I am encoding them to UTF-8 when I read them). The duplicate record is the same record every time. How do I keep from getting the duplicate (or figure out where it is coming from)? I cannot simply remove duplicates, because there are legitimate duplicates in the destination. I have been trying to figure this out for a few weeks.


Start by loading the data into staging tables that you can query. You may then be able to see how joining the data together produces the duplication. Also, how do you know this is an invalid duplicate if there are valid ones? What makes it invalid?
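One way to act on this suggestion is to stage both the source IDs and the match-file IDs and compare per-ID counts, so that legitimate duplicates (present in the source) are separated from rows that appear more often in the output than in the input. The sketch below uses Python's built-in `sqlite3` as a stand-in for real staging tables; the table names `stg_source` and `stg_match` and the function `find_extra_rows` are hypothetical.

```python
import sqlite3


def find_extra_rows(source_ids, match_ids):
    """Return (id, match_count, source_count) for IDs over-represented in the match output."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stg_source (id TEXT)")
    conn.execute("CREATE TABLE stg_match (id TEXT)")
    conn.executemany("INSERT INTO stg_source VALUES (?)", [(i,) for i in source_ids])
    conn.executemany("INSERT INTO stg_match VALUES (?)", [(i,) for i in match_ids])

    # Count rows per ID on each side, then keep IDs whose output count
    # exceeds the input count. IDs missing from the source count as 0.
    rows = conn.execute(
        """
        SELECT m.id, m.n AS match_count, COALESCE(s.n, 0) AS source_count
        FROM (SELECT id, COUNT(*) AS n FROM stg_match GROUP BY id) AS m
        LEFT JOIN (SELECT id, COUNT(*) AS n FROM stg_source GROUP BY id) AS s
          ON s.id = m.id
        WHERE m.n > COALESCE(s.n, 0)
        ORDER BY m.id
        """
    ).fetchall()
    conn.close()
    return rows
```

For example, `find_extra_rows(["1", "1", "2"], ["1", "1", "2", "2"])` reports only ID `2`, because ID `1` is duplicated in the source as well and is therefore a legitimate duplicate.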


I figured out the issue. When reading the source, I did not set a field to an empty string, which would have eliminated that row. As a result, the row was matched to a random row in the lookup transform, continued through the data flow, and was written to the destination.
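The post does not say which field was involved or how the row was filtered, so the following is only a rough Python sketch of the general idea: normalize the field to an empty string when reading the source, then drop rows where it is empty before they reach the lookup. The column name `id`, the helpers `normalize_id` and `rows_for_lookup`, and the whitespace stripping are assumptions, not details from the original package.

```python
def normalize_id(value):
    """Turn a missing or whitespace-only value into an empty string."""
    if value is None:
        return ""
    return value.strip()  # stripping is an assumption; the post only mentions empty strings


def rows_for_lookup(rows, id_column="id"):
    """Keep only rows whose normalized ID is non-empty, so blank rows never reach the lookup."""
    kept = []
    for row in rows:
        clean = dict(row)  # copy so the caller's row is not modified
        clean[id_column] = normalize_id(row.get(id_column))
        if clean[id_column] != "":
            kept.append(clean)
    return kept
```

With this in place, a row such as `{"id": None}` or `{"id": "  "}` is filtered out up front instead of carrying an undefined value into the match step.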
