Creating SSIS Custom Data Flow Component + SSIS 2008
Am a newbie to to SSIS Custom Component. Just started to code a component where the input row count will never be same as output row count. foreach input row it does some validation and generates n rows which need to be mapped to output buffer.
So after with Design time validation coding everything is fine.
my RunTime code is as below:
public override void PreExecute()
{
IDTSInput100 input = ComponentMetaData.InputCollection[0];
inputBufferColumnIndex = new int[input.InputColumnCollection.Count];
for (int x = 0; x < input.InputColumnCollection.Count; x++)
{
IDTSInputColumn100 column = input.InputColumnCollection[x];
inputBufferColumnIndex[x] = BufferManager.FindColumnByLineageID (input.Buffer, column.LineageID);
}
IDTSOutput100 output = ComponentMetaData.OutputCollection[0];
outputBufferColumnIndex = new int[output.OutputColumnCollection.Count];
for (int x = 0; x < output.OutputColumnCollection.Count; x++)
{
IDTSOutputColumn100 outcol = output.OutputColumnCollection[x];
outputBufferColumnIndex[x] = BufferManager.FindColumnByLineageID(input.Buffer, outcol.LineageID);
}
}
public override void ProcessInput(int inputID, PipelineBuffer buffer)
{
if(!buffer.EndOfRowset)
{
while (buffer.NextRow())
{
var rec = new Record
{
Source = buffer[0].ToString(),
Nk = buffer[1].ToString(),
Guid = new Guid(buffer[2].ToString()),
FromDate = Convert.ToDateTime(buffer[3].ToString()),
ToDate = Convert.ToDateTime(buffer[4].ToString())
};
sourceRecords.Add(rec);
}
ProcessArray(sourceRecords,buffer);
}
}
public void ProcessArray(List<Record> records, PipelineBuffer buffer)
{
//Get Distinct NKs from the source Records
List<str开发者_运维问答ing> nKs = (from c in records select c.Nk).Distinct().ToList();
foreach (var nk in nKs)
{
//Get all the record for particular NK
List<Record> filteredRecords = (from c in sourceRecords where c.Nk == nk select c)
.OrderBy(c => c.Source)
.ThenBy(c => c.FromDate)
.ThenBy(c => c.ToDate).ToList();
foreach (var filteredRecord in filteredRecords)
{
_start = filteredRecord.FromDate;
_end = filteredRecord.ToDate;
while (filteredRecord.WriteComplete == false)
{
foreach (var record in filteredRecords)
{
if (record.FromDate > _start && record.FromDate < _end) _end = record.ToDate;
if (record.ToDate < _end && record.ToDate > _start) _end = record.ToDate;
}
//Output0Buffer.AddRow();
//Output0Buffer.outSource = filteredRecord.Source;
//Output0Buffer.outNK = filteredRecord.Nk;
//Output0Buffer.outRecid = filteredRecord.Guid;
//Output0Buffer.outFromDate = _start;
//Output0Buffer.outToDate = _end;
buffer.SetString(5,filteredRecord.Source);
buffer.SetString(6,filteredRecord.Nk);
buffer.SetGuid(7,filteredRecord.Guid);
buffer.SetDateTime(8,filteredRecord.FromDate);
buffer.SetDateTime(9,filteredRecord.ToDate);
_start = _end;
_end = filteredRecord.ToDate;
if (_start == _end) filteredRecord.WriteComplete = true;
}
}
}
}
}
public class Record
{
public Guid Guid { get; set; }
public string Nk { get; set; }
public string Source { get; set; }
public DateTime FromDate { get; set; }
public DateTime ToDate { get; set; }
public bool WriteComplete { get; set; }
}
In my ProcessArray method am trying to populate the output buffer. Am not even sure this can be done.
Any guidance will be appreciated.
Thanks
Yes this type of transform can be done, it is called an asynchronous transform. Your code looks good to me. it wasn't clear from your question if you were experiencing a specific problem.
You may want to try and create a Asynchronous Script Component transform so you don't have to fumble with all of the SSIS plumbing.
more info here: http://msdn.microsoft.com/en-us/library/ms136133.aspx
http://msdn.microsoft.com/en-us/library/ms135931.aspx
I'm not sure I understand what you are trying to achieve, but looks like you are trying to sort all the data and then sequentially process the sorted list. Note that your ProcessInput method is called multiple times, each with a new buffer. Any sorting you do with received buffer only applies to this particular buffer - the data is not sorted globally, so your results may vary depending on buffer boundaries.
Is this OK for particular scenario? If not, use Sort transform to sort all the data for you, add your transform after Sort and just process the data row by row - it is already sorted. So just read it row-by-row, then modify the current row after reading - this is what buffer.SetString is for.
Also, don't hardcode column indixes, like buffer.SetString(5, ...) - the numbers may change, it is better to get and save the column index in PreExecute, then use something like buffer.SetString(nkColumnIndex, nkColumnValue);
精彩评论