IBM HPU 5.2 - Converting EBCDIC Characters to Blanks

ojdiaz · New User Joined: 19 Nov 2008 Posts: 99 Location: Spain

Hi,

We are using the latest available HPU in our shop, version 5.2.

We need to unload some tables that are quite large (hundreds of millions of records) and transfer the resulting files to an IBM DataStage system for further processing. That system requires the files to be encoded with CCSID 1145. The data we are unloading is DB2 EBCDIC data.

After the data is unloaded and SFTP'd to an IBM DataStage process, it is loaded into a UTF-8 Oracle table, and it is also processed by some Python jobs.

We have a problem with a VARCHAR field that contains several characters (for example, x'15', x'0d', x'0a') that cause issues when the file is processed after the DataStage ETL, which encodes the fields into UTF-8. For example, some of these characters end up encoded as line feeds and carriage returns, which makes the downstream processes fail.

We have no control over validating the data that goes into the DB2 tables, so we need to "clean" those characters. The simple approach was to use a column function to change those values to x'40' (blanks), and it worked fine. However, as we increased the number of characters to replace, we hit a roadblock: the unload failed with a -905 error because it was using too much CPU time. We asked the DBAs to temporarily raise the limit for the batch job, but that wasn't an option.
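To make the intended cleanup concrete, here is a minimal Python sketch of the same replacement done at byte level with a 256-entry translation table. It is only an illustration, not HPU syntax: the names `BAD_BYTES`, `build_table` and `clean_record` are made up for this example, and it assumes the bytes passed in are pure EBCDIC character data (no binary length prefixes or packed fields) that have not yet been converted to UTF-8.

```python
# Hypothetical illustration: blank out unwanted EBCDIC byte values.
# The values below are the ones named in this post; extend as needed.
BAD_BYTES = bytes([0x15, 0x0D, 0x0A])
EBCDIC_BLANK = 0x40  # x'40' is the EBCDIC space


def build_table(bad=BAD_BYTES, blank=EBCDIC_BLANK):
    """Return a 256-byte table mapping each bad byte to blank, others to themselves."""
    table = bytearray(range(256))
    for b in bad:
        table[b] = blank
    return bytes(table)


TABLE = build_table()


def clean_record(record: bytes) -> bytes:
    """Replace every bad byte in an EBCDIC character field with x'40'.

    Assumes the input holds character data only; running this over
    binary fields would corrupt them.
    """
    return record.translate(TABLE)
```

Because the table maps one byte to one byte, the field length never changes, which is the same effect we were getting from the column function.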

So I started reading the HPU manual, but honestly, this is one of the hardest manuals I've come across. I think the padding option may help with this, but I'm not sure of the correct syntax to use it.

This is the current unload step we have: