View previous topic :: View next topic
|
Author |
Message |
Beinlich Dahl
New User
Joined: 14 May 2021 Posts: 4 Location: Norway
|
|
|
|
Hi.
Context of question.
I'm loading a db2 table using a CSV file, where the column delimitor is semicolon (x'5E' for most/all ebcdic tables). The problem is that there is a free text column where someone has entered semicolon, this confuses db2 load of course which then interpret this as extra column delimitor. I want to remove rows with ; in free text from input file.
The question
Is there a sort which can take a VB file with LRECL 27994 as input, then exclude the rows not containing exactly 109 occurences of ;? |
|
Back to top |
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1337 Location: Bamberg, Germany
|
|
|
|
Please provide a sample to test with. It should/will be possible I guess. |
|
Back to top |
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1337 Location: Bamberg, Germany
|
|
|
|
See if this is of help. It includes only 9 semicolons for output, if additional are found they are dropped.
Code: |
//WHATEVER EXEC PGM=ICEMAN
//SORTIN DD *
;;;;;;;;;
;;;;;;;;;;
/*
//SYSOUT DD SYSOUT=*
//SORTOUT DD SYSOUT=*
//SYSIN DD *
OPTION COPY
INREC IFTHEN=(WHEN=INIT,
PARSE=(%=(STARTAT=C';';FIXLEN=1,REPEAT=9),
%01=(STARTAT=C';',FIXLEN=1)),
BUILD=(81:%01))
OUTFIL FNAMES=(SORTOUT),
INCLUDE=(81,1,CH,EQ,C' ')
END
/* |
|
|
Back to top |
|
|
Beinlich Dahl
New User
Joined: 14 May 2021 Posts: 4 Location: Norway
|
|
|
|
The job ran fine, and in SORTOUT there is one row as expected. The problem is that the single row in SORTOUT is only spaces, I would hope to see the original input there. |
|
Back to top |
|
|
Beinlich Dahl
New User
Joined: 14 May 2021 Posts: 4 Location: Norway
|
|
|
|
Sample data, I use VB 27994 as input file.
aa;g;dd;kkk;111;;;uu;
aa;;g;dd;kkk;111;;;uu;;
aa;;g;dd;kkk;111;;;uu;
In my actual problem, the rows has thousands of characters, and the rows I want to keep has exactly 108 occurences of ; |
|
Back to top |
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2141 Location: USA
|
|
|
|
Beinlich Dahl wrote: |
Hi.
Context of question.
I'm loading a db2 table using a CSV file, where the column delimitor is semicolon (x'5E' for most/all ebcdic tables). The problem is that there is a free text column where someone has entered semicolon, this confuses db2 load of course which then interpret this as extra column delimitor. I want to remove rows with ; in free text from input file.
The question
Is there a sort which can take a VB file with LRECL 27994 as input, then exclude the rows not containing exactly 109 occurences of ;? |
If this semicolon as a part of free text field is not quoted within single or double quotes, then there is no chance to distinguish between normal separator and this fake one, even by human mind, to say nothing about AI.
AFAIR the DB2 LOAD/UNLOAD utilities do recognize quoted free text automatically, don't they? |
|
Back to top |
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1337 Location: Bamberg, Germany
|
|
|
|
Beinlich Dahl wrote: |
The job ran fine, and in SORTOUT there is one row as expected. The problem is that the single row in SORTOUT is only spaces, I would hope to see the original input there. |
Code: |
OPTION COPY
INREC IFTHEN=(WHEN=INIT,
PARSE=(%=(STARTAT=C';';FIXLEN=1,REPEAT=9),
%01=(STARTAT=C';',FIXLEN=1)),
OVERLAY=(81:%01))
OUTFIL FNAMES=(SORTOUT),
INCLUDE=(81,1,CH,EQ,C' '),
BUILD=(1,80)
END |
|
|
Back to top |
|
|
|