IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Exclude rows with > than x occurrences of a specified char


IBM Mainframe Forums -> DFSORT/ICETOOL
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
Beinlich Dahl

New User


Joined: 14 May 2021
Posts: 4
Location: Norway

PostPosted: Tue Oct 24, 2023 11:30 am
Reply with quote

Hi.

Context of question.
I'm loading a db2 table using a CSV file, where the column delimitor is semicolon (x'5E' for most/all ebcdic tables). The problem is that there is a free text column where someone has entered semicolon, this confuses db2 load of course which then interpret this as extra column delimitor. I want to remove rows with ; in free text from input file.

The question
Is there a sort which can take a VB file with LRECL 27994 as input, then exclude the rows not containing exactly 109 occurences of ;?
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1306
Location: Bamberg, Germany

PostPosted: Tue Oct 24, 2023 11:55 am
Reply with quote

Please provide a sample to test with. It should/will be possible I guess.
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1306
Location: Bamberg, Germany

PostPosted: Tue Oct 24, 2023 12:05 pm
Reply with quote

See if this is of help. It includes only 9 semicolons for output, if additional are found they are dropped.

Code:
//WHATEVER EXEC PGM=ICEMAN                     
//SORTIN   DD *                                 
;;;;;;;;;                                       
;;;;;;;;;;                                     
/*                                             
//SYSOUT   DD SYSOUT=*                         
//SORTOUT  DD SYSOUT=*                         
//SYSIN    DD *                                 
  OPTION COPY                                   
  INREC IFTHEN=(WHEN=INIT,                     
    PARSE=(%=(STARTAT=C';';FIXLEN=1,REPEAT=9), 
           %01=(STARTAT=C';',FIXLEN=1)),       
    BUILD=(81:%01))                             
  OUTFIL FNAMES=(SORTOUT),                     
    INCLUDE=(81,1,CH,EQ,C' ')                   
  END                                           
/*
Back to top
View user's profile Send private message
Beinlich Dahl

New User


Joined: 14 May 2021
Posts: 4
Location: Norway

PostPosted: Tue Oct 24, 2023 5:50 pm
Reply with quote

The job ran fine, and in SORTOUT there is one row as expected. The problem is that the single row in SORTOUT is only spaces, I would hope to see the original input there.
Back to top
View user's profile Send private message
Beinlich Dahl

New User


Joined: 14 May 2021
Posts: 4
Location: Norway

PostPosted: Tue Oct 24, 2023 5:52 pm
Reply with quote

Sample data, I use VB 27994 as input file.

aa;g;dd;kkk;111;;;uu;
aa;;g;dd;kkk;111;;;uu;;
aa;;g;dd;kkk;111;;;uu;

In my actual problem, the rows has thousands of characters, and the rows I want to keep has exactly 108 occurences of ;
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2119
Location: USA

PostPosted: Tue Oct 24, 2023 8:09 pm
Reply with quote

Beinlich Dahl wrote:
Hi.

Context of question.
I'm loading a db2 table using a CSV file, where the column delimitor is semicolon (x'5E' for most/all ebcdic tables). The problem is that there is a free text column where someone has entered semicolon, this confuses db2 load of course which then interpret this as extra column delimitor. I want to remove rows with ; in free text from input file.

The question
Is there a sort which can take a VB file with LRECL 27994 as input, then exclude the rows not containing exactly 109 occurences of ;?


If this semicolon as a part of free text field is not quoted within single or double quotes, then there is no chance to distinguish between normal separator and this fake one, even by human mind, to say nothing about AI.

AFAIR the DB2 LOAD/UNLOAD utilities do recognize quoted free text automatically, don't they?
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1306
Location: Bamberg, Germany

PostPosted: Wed Oct 25, 2023 5:44 am
Reply with quote

Beinlich Dahl wrote:
The job ran fine, and in SORTOUT there is one row as expected. The problem is that the single row in SORTOUT is only spaces, I would hope to see the original input there.

Code:
OPTION COPY                                   
INREC IFTHEN=(WHEN=INIT,                     
  PARSE=(%=(STARTAT=C';';FIXLEN=1,REPEAT=9), 
         %01=(STARTAT=C';',FIXLEN=1)),       
  OVERLAY=(81:%01))                           
OUTFIL FNAMES=(SORTOUT),                     
  INCLUDE=(81,1,CH,EQ,C' '),                 
  BUILD=(1,80)                               
END
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DFSORT/ICETOOL

 


Similar Topics
Topic Forum Replies
No new posts Reflection 3270 screen shows only 23 ... Compuware & Other Tools 6
No new posts sort to find out the char which repea... Mainframe Interview Questions 10
No new posts To get the count of rows for every 1 ... DB2 3
No new posts SFTP Issue - destination file record ... All Other Mainframe Topics 2
No new posts Convert single row multi cols to sing... DFSORT/ICETOOL 6
Search our Forums:

Back to Top