Kettle: How do I call putRow() multiple times in processRow() correctly?
I'm processing an /etc/group file from a system. I load it with a CSV input step using the delimiter ":". It has four fields: group, pwfield, gid, members. The members field is a comma-separated list of account names; the number of names is not fixed and may be zero.
I would like to produce a list of records with three fields: group, gid, account. In the first step I use User Defined Java Class, in the second I use Select values.
Example Input:
root:x:0:
first:x:100:joe,jane,zorro
second:x:101:steve
Example output (XLS) - expected:
group gid account
first 100 joe
first 100 jane
first 100 zorro
second 101 steve
Example output (XLS) - actual, wrong:
group gid account
first 100 zorro
first 100 zorro
first 100 zorro
second 101 steve
User Defined Java Class:
public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    // boilerplate
    Object[] r = getRow();
    if (r == null) {
        setOutputDone();
        return false;
    }
    String tmp = get(Fields.In, "members").getString(r);
    if (null == tmp)
        return true;
    String accounts[] = tmp.split(",");
    for (int i = 0; i < accounts.length; ++i) {
        Object[] out_row = createOutputRow(r, data.outputRowMeta.size());
        String account = accounts[i];
        get(Fields.Out, "account").setValue(out_row, account);
        putRow(data.outputRowMeta, out_row);
    }
    return true;
}
I believe I either forgot to call some administrative function, or I should use something other than createOutputRow(). Google did not help.
Mysteriously, if I create a transformation like the one illustrated, then XLS debug A has correct account values in each row, while XLS debug B has repeating account values like the example output above.
If I place a Dummy step before Select values 7, XLS debug B becomes correct and XLS debug A becomes bad.
The problem is with the following line (first line in the for loop):
Object[] out_row = createOutputRow(r, data.outputRowMeta.size());
It should be replaced with these three lines:
Object[] out_row = RowDataUtil.allocateRowData(data.outputRowMeta.size());
for (int j = 0; j < r.length; ++j)
    out_row[j] = r[j];
UPDATE: An easier way, which is essentially the same:
Object[] out_row = RowDataUtil.createResizedCopy(r, data.outputRowMeta.size());
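The symptoms (every row showing the last account, and the "bad" output moving depending on which step sits downstream) are what you would expect if every putRow() call handed out the same array instance. This is an assumption about the cause; the thread does not confirm it. Here is a minimal plain-Java sketch of that effect with no Kettle classes involved. The names RowAliasingDemo, emitShared and emitCopied are made up for illustration, and Arrays.copyOf merely stands in for the per-row copy that the fix above performs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowAliasingDemo {

    // Reuses ONE array for every emitted row (the suspected bug, as an assumption).
    static List<Object[]> emitShared(Object[] inputRow, String[] accounts, int accountIndex) {
        List<Object[]> emitted = new ArrayList<>();
        for (String account : accounts) {
            Object[] row = inputRow;          // same reference on each iteration
            row[accountIndex] = account;      // overwrites what earlier "rows" show
            emitted.add(row);
        }
        return emitted;
    }

    // Allocates a fresh copy per emitted row (the idea behind the fix).
    static List<Object[]> emitCopied(Object[] inputRow, String[] accounts, int accountIndex) {
        List<Object[]> emitted = new ArrayList<>();
        for (String account : accounts) {
            Object[] row = Arrays.copyOf(inputRow, inputRow.length);
            row[accountIndex] = account;
            emitted.add(row);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Layout: group, pwfield, gid, members, account (last slot is the new field).
        Object[] input = {"first", "x", 100L, "joe,jane,zorro", null};
        String[] accounts = "joe,jane,zorro".split(",");

        for (Object[] row : emitShared(input.clone(), accounts, 4)) {
            System.out.println("shared: " + row[0] + " " + row[2] + " " + row[4]);
        }
        for (Object[] row : emitCopied(input.clone(), accounts, 4)) {
            System.out.println("copied: " + row[0] + " " + row[2] + " " + row[4]);
        }
    }
}
```

With the shared array, all three emitted rows end up reading "zorro"; with a copy per row, they read joe, jane and zorro in turn. In a real transformation, whether a downstream step sees the overwritten value would depend on when it reads the row, which would fit the observation that adding a Dummy step moves the problem from one output to the other.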