IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

IBM HPU 5.2 - Converting EBCDIC Characters to Blanks


IBM Mainframe Forums -> DB2
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
ojdiaz

New User


Joined: 19 Nov 2008
Posts: 99
Location: Spain

PostPosted: Fri Dec 27, 2024 5:11 pm
Reply with quote

Hi,

We are using the latest available HPU in our shop, version 5.2.

We need to unload some tables that are quite large (hundreds of millions of records) and transfer those files to an IBM Datastage system for further processing. This system requires the files to be encoded with CCSID 1145. We are downloading DB2 EBCDIC data.

After the data is donloaded and SFTP'd to an IBM datastate process, it is loaded into an UTF-8 Oracle table, as well it is processed by some python jobs.

We have a problem in a VARCHAR field which contains several characters that cause issues when the file is processed after the Datastage ETL (for examplo, x'15', x'0d', x'0a', etc) , which encodes the fields into UTF-8. For example, some characters are encoded as line feeds and carriage returns, which makes the processes fail.

We don't have control to validate the input data for the DB2 tables, so we need to "clean" those characters. The simple task was using a column function to change the values to x'40' (BLANKS), and it worked fine. However, as we started to increase the number of characters to change, we finally hit a roadblock and got a -905 error, since the unload was using too much CPU time. We talked with the DBAs to increase temporarily the time for the Batch job but that wasn't an opcion

So, I started reading the manual for HPU, but honestly, this is one of the hardest manuals I've come across. I think the padding option may help with this, but I'm not sure of the current syntax to use it.

This is the current unload step we have:


Code:
//P001 EXEC DBSDHPU1,JOB=XXXXXX,GR=INDBKES,TS=D240129,P=2,S=2
//SYSIN    DD  *                                               
  TEMPLATE DDUNL DSN 'XXX.XXX.XXX.XXX'   
  UNIT 3390DISK                                             
  DISP(NEW,CATLG,DELETE)                                 
  SPACE(5,1) CYL                                         
  UNLOAD TABLESPACE                                       
  DB2 FORCE LOCK NO QUIESCE NO                                                   
  SELECT CHAR('H'),                                           
  A.FIELD1,
  A.FIELD2,
  A.FIELD3,
  B.FIELD4,
  B.FIELD5,
  B.FIELD6,
  B.FIELD7,
  B.FIELD8,
  TRANSLATE(B.CHAR_FIELD,X'40',                               
  X'150D2511000102030405060708090A0B0C0E0F1012131416171819' ||
   '1A1B1C1D1E' ||                                             
   '1F2021222324262728292A2B2C2D2E2F303132333435363738393A' ||
   '3B3C3D3E3F')                                               
  FROM TABLE1 A                                 
  JOIN TABLE2 B
   ON B.KEY_FIELD1 = A.KEY_FIELD1
  AND B.KEY_FIELD2 = A.KEY_FIELD2                               
  AND B.KEY_FIELD3 = A.KEY_FIELD3
  WHERE (A.DATE BETWEEN '01.01.2023' AND '31.12.2023')   
  AND A.COMPANY <> 'STRING 1';
   OPTIONS                                                       
   LENGTH MAX                                                     
   PIC('P',LEAD,'.','00.0')                                       
   REFORMAT (                                                     
    TYPE DECIMAL INTO CHAR,                                       
    TYPE DATE INTO DATE 'DD/MO/YYYY',                             
    TYPE TIME INTO TIME 'HH:MI:SS',                               
    TYPE TIMESTAMP INTO TIMESTAMP 'DD/MO/YYYY HH:MI:SS,NNNNNN000')
   OUTDDN DDUNL                                                   
   FORMAT DSNTIAUL                                               
   EBCDIC CCSID(1145) 


So what we want is to change the translate instruction by a padding or some reformat at columen level, if it is possible, to remove the translate function.

We have a plan b which is to pass the file trhough a IFTHEN sort function, to remove those characters, but given that the file is that huge, we would like to save some CPU time.

We tried this at the options block, but it only filled the null values with x'15' at the end of the field.

Code:
OPTIONS                                                       
LENGTH MAX                                                     
PIC('P',LEAD,'.','00.0')                                       
PADDING(X'15')                                                 
REFORMAT (                                                     
 TYPE DECIMAL INTO CHAR,                                       
 TYPE DATE INTO DATE 'DD/MO/YYYY',                             
 TYPE TIME INTO TIME 'HH:MI:SS',                               
 TYPE TIMESTAMP INTO TIMESTAMP 'DD/MO/YYYY HH:MI:SS,NNNNNN000')
OUTDDN DDUNL                                                   
FORMAT DSNTIAUL                                               
EBCDIC CCSID(1145)   



This is the example of the data in the table:

Code:
Key    Column         Data                                                   
       VARCHAR500     9.............99999999999999999999999999999999999999999
                      F1020000000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
                      95D512345679AB99999999999999999999999999999999999999999
                                                                             
                      9999999999999999999999999999999999999999999999999999999
                      FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
                      9999999999999999999999999999999999999999999999999999999
                                                                             
                      9999999999999999999999999999999999999999999999999999999
                      FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
                      9999999999999999999999999999999999999999999999999999999


And this example the output generated with the latest option block

First part of the file

Code:
 ------------------------------------------------------------------------------
H012826/12/20245A.99.09999999999.00001.999.v9.............9999999999999999999999
CFFFFFF6FF6FFFFFC1FF1FFFFFFFFFFF1FFFFF1FFF0AF1020000000000FFFFFFFFFFFFFFFFFFFFFF
8012826112120245159950999999999950000159990595D512345679AB9999999999999999999999
 ------------------------------------------------------------------------------


Last part of the file:


Code:
 ------------------------------------------------------------------------------
99999999999999999999999999999999999.............................................
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF111111111111111111111111111111111111111111111
99999999999999999999999999999999999555555555555555555555555555555555555555555555
 ------------------------------------------------------------------------------


Any hints about how we could approach this would be welcome

Thanks a have a happy holidays
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DB2

 


Similar Topics
Topic Forum Replies
No new posts Format Binary file to EBCDIC JCL & VSAM 4
No new posts Issues Converting From ZD to Signed N... DFSORT/ICETOOL 4
No new posts Substring number between 2 characters... DFSORT/ICETOOL 2
No new posts Reading dataset in Python - New Line ... All Other Mainframe Topics 22
No new posts Converting fixed length file to excel... IBM Tools 7
Search our Forums:

Back to Top