DB2/iSeries SQL clean up CR/LF, tabs etc

2023-04-09 07:19

I need to find and clean up line breaks, carriage returns, tabs and "SUB" characters in a set of 400k+ string records, but this DB2 environment is taking a toll on me.

I thought I could do some searching and replacing with the REPLACE() and CHR() functions, but it seems CHR() is not available on this system (Error: CHR in *LIBL type *N not found). Using escapes such as \t, \r and \n doesn't work either. The characters can be in the middle of strings or at the end of them.

DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC

Any hints on what I can do with this?

I used this SQL to find x'25' and x'0D':

SELECT 
     <field>
    , LOCATE(x'0D', <field>) AS "0D" 
    , LOCATE(x'25', <field>) AS "25" 
    , length(trim(<field>)) AS "Length"
FROM <file> 
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0

And I used this SQL to replace them:

UPDATE <file> 
SET <field> = REPLACE(REPLACE(<field>, x'0D', ' '), x'25', ' ')
WHERE   LOCATE(x'25', <field>) > 0 
    OR  LOCATE(x'0D', <field>) > 0
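
The same LOCATE check can be extended to the tabs and SUB characters mentioned in the question. This is a minimal sketch, assuming a single-byte EBCDIC CCSID such as 37, where horizontal tab is x'05' and SUB is x'3F'. MYLIB.MYFILE and MYFIELD are hypothetical names standing in for <file> and <field>:

```sql
-- Hypothetical names: MYLIB.MYFILE (table) and MYFIELD (column).
-- Assumes a single-byte EBCDIC CCSID: x'05' = horizontal tab, x'3F' = SUB.
SELECT
      MYFIELD
    , LOCATE(x'05', MYFIELD) AS "05"
    , LOCATE(x'3F', MYFIELD) AS "3F"
FROM MYLIB.MYFILE
WHERE   LOCATE(x'05', MYFIELD) > 0
    OR  LOCATE(x'3F', MYFIELD) > 0
```

If the table was loaded from an ASCII source without conversion, the code points may differ, so it is worth checking the raw hex first.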

If you want to clear up specific characters like carriage return (EBCDIC x'0D') and line feed (EBCDIC x'25'), find their EBCDIC code points and then use the TRANSLATE() function to replace them with spaces.

If you just want to remove undisplayable characters then look for anything under x'40'.
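
One possible way to list rows that hold anything under x'40' is to compare the column with a copy in which all 64 of those code points have been translated to blanks. This is only a sketch: it assumes a single-byte EBCDIC column, the default *HEX sort sequence for the comparison, and the hypothetical names MYLIB.MYFILE and MYFIELD. It also relies on TRANSLATE() padding a short "to" string with blanks.

```sql
-- Hypothetical names: MYLIB.MYFILE (table) and MYFIELD (column).
-- The "from" string lists every code point x'00' through x'3F';
-- the one-blank "to" string is padded, so each of them becomes a blank.
-- A row qualifies when the translated copy differs from the original.
SELECT MYFIELD
FROM MYLIB.MYFILE
WHERE MYFIELD <> TRANSLATE(
          MYFIELD
        , ' '
        , x'000102030405060708090A0B0C0D0E0F'
          CONCAT x'101112131415161718191A1B1C1D1E1F'
          CONCAT x'202122232425262728292A2B2C2D2E2F'
          CONCAT x'303132333435363738393A3B3C3D3E3F'
      )
```

Columns with a mixed or DBCS CCSID use x'0E' and x'0F' as shift characters, so this approach should not be applied to them as is.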

Here is a sample script that replaces X'41' with X'40', something that was creating issues at our shop:

UPDATE [yourfile]
SET [yourfield] = TRANSLATE([yourfield], X'40', X'41')
WHERE [yourfield] LIKE '%' CONCAT X'41' CONCAT '%'

If you need to replace more than one character, extend the "to" and "from" hexadecimal strings to the values you need in the TRANSLATE function.
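
As a rough illustration of extending the two strings, the sketch below maps five characters to blanks in one pass. The code points are assumptions for a single-byte EBCDIC CCSID (x'0D' CR, x'25' LF, x'15' NL, x'05' tab, x'3F' SUB), and MYLIB.MYFILE and MYFIELD are hypothetical names:

```sql
-- Hypothetical names: MYLIB.MYFILE (table) and MYFIELD (column).
-- "to" string:   five blanks (x'40'), one per unwanted character.
-- "from" string: CR, LF, NL, tab, SUB (assumed EBCDIC code points).
UPDATE MYLIB.MYFILE
SET MYFIELD = TRANSLATE(MYFIELD, x'4040404040', x'0D2515053F')
WHERE   LOCATE(x'0D', MYFIELD) > 0
    OR  LOCATE(x'25', MYFIELD) > 0
    OR  LOCATE(x'15', MYFIELD) > 0
    OR  LOCATE(x'05', MYFIELD) > 0
    OR  LOCATE(x'3F', MYFIELD) > 0
```

The WHERE clause keeps the update from touching rows that contain none of these characters, which matters on a table with 400k+ records.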

Try TRANSLATE or REPLACE.

The brute force method involves using POSITION to find the errant character, then SUBSTR before and after it. CONCAT the two substrings (less the undesirable character) to re-form the column.

The character encoding is almost certainly one of the EBCDIC character sets. Depending on how the table got loaded in the first place, the CR may be x'0D' and the LF x'15' or x'25'. An easy way to find out is to get to a green screen and do a DSPPFM against the table. Press F10 then F11 to view the table in raw, hexadecimal (over/under) format.
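
If a green screen is not handy, the raw bytes can probably also be inspected from SQL with the HEX() scalar function. A small sketch, with MYLIB.MYFILE and MYFIELD as hypothetical names:

```sql
-- Hypothetical names: MYLIB.MYFILE (table) and MYFIELD (column).
-- HEX() returns two hex digits per byte, so CR/LF/NL show up
-- as 0D, 25 or 15 in the second column.
SELECT MYFIELD, HEX(MYFIELD) AS MYFIELD_HEX
FROM MYLIB.MYFILE
FETCH FIRST 20 ROWS ONLY
```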

For details on the available functions see the DB2 for i5/OS SQL Reference.

Perhaps the TRANSLATE() function will serve your needs.

    TRANSLATE( data, tochars, fromchars )

...where fromchars is the set of characters you don't want, and tochars is the corresponding characters you want them replaced with. You may have to write this out in hex format, as x'nnnnnn...' and you will need to know what character set you are working with. Using the DSPFFD command on your table should show the CCSID of your fields.
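
The CCSID can most likely also be read from the SQL catalog instead of DSPFFD. The query below is a sketch that assumes the QSYS2.SYSCOLUMNS catalog view is available on your release; MYLIB and MYFILE are hypothetical names:

```sql
-- Hypothetical names: MYLIB (library/schema) and MYFILE (table).
-- Lists each column of the table with its data type, length and CCSID.
SELECT COLUMN_NAME, DATA_TYPE, LENGTH, CCSID
FROM QSYS2.SYSCOLUMNS
WHERE TABLE_SCHEMA = 'MYLIB'
  AND TABLE_NAME   = 'MYFILE'
```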

We struggled a lot to remove the newline and carriage return characters from data loaded from a flat file.

In the end we used the SQL below to sort out the issue.

REPLACE(REPLACE(COLUMN_NAME, CHR(13), ''), CHR(10), '')

Try it out

CR = CHR(13)
LF = CHR(10)
