View previous topic :: View next topic
|
Author |
Message |
prashantshinde
New User
.jpg)
Joined: 25 Nov 2005 Posts: 49 Location: Pune
|
|
|
|
I have input CSV file having multiple primary and secondary key value fields. I need to sort it on Primary and then secondary key with trailer record inserted after each primary key change. I am able to handle one primary key using simple OUTFIL and IFTRAIL functions.
Also, the date from input CSV file is not always in same format. It can be M/D/YYYY or MM/D/YYYY or M/DD/YYYY as well as MM/DD/YYYY. I am currently handling varying position of '/' for possible positions they would appear. Is there any function to handle all four formats of date and give output in MMDDYYYY?
File details -
Primary Key - 1 -9 bytes, Secondary Key - 11-20 , Date - 22 -41, Indicator - 43-47,Amount - variable length upto 15 digits including decimal,Name - 50 characters
1 2 3 4 5 6
123456789012345678901234567890123456789012345678901234567890
Code: |
987654321,1000023457,8/27/2021 12:00:00 AM,False,10.01,NAME1
987654321,1000023456,8/2/2021 12:00:00 AM,False,10.02,NAME2
987654320,1000023455,10/27/202112:00:00 AM,True ,20.01,NAME3
987654320,1000023454,8/27/2021 12:00:00 AM,False,20.02,NAME4
987654320,1000023453,8/27/2021 12:00:00 AM,False,20.03,NAME5 |
Desired output is -
Code: |
987654321$$$$1000023457100108272021 NAME1 -> when 'False' indicator, one character after date and before Name field is blank.
987654321$$$$1000023456100208022021 NAME2
T987654321$$$$$$$000000002$$$$$$$$000000002003 -> Trailer record for same Primary key and space value for indicator - starting with 'T',Primary Key,spaces,number of records,spaces,sum of amount
987654320$$$$1000023454200208272021 NAME4
987654320$$$$1000023453200308272021 NAME5
T987654320$$$$$$$000000002$$$$$$$$000000004005 -> Trailer record for same Primary key and space value for indicator - starting with 'T',Primary Key,spaces,number of records,spaces,sum of amount
987654320$$$$1000023455200110272021VNAME3 -> when 'True ' indicator, one character after date and before Name field is 'V'.
T987654320$$$$$$$000000001$$$$$$$$000000002003V -> Trailer record for same Primary key and 'V' value for indicator - starting with 'T',Primary Key,spaces,number of records,spaces,sum of amount,indiactor 'V' |
Coded for you |
|
Back to top |
|
 |
sergeyken
Senior Member

Joined: 29 Apr 2008 Posts: 2209 Location: USA
|
|
|
|
1) use forum's code tags to present ANY code/data
2) if the data fields are aligned (to the same positions within records) then it doesn't matter: are they comma-separated, or not. It's all is absolutely not clear from your samples.
3) Major question: did you try to do something by yourself? Where are the results of your attempts? |
|
Back to top |
|
 |
enrico-sorichetti
Superior Member

Joined: 14 Mar 2007 Posts: 10896 Location: italy
|
|
Back to top |
|
 |
Rohit Umarjikar
Global Moderator

Joined: 21 Sep 2010 Posts: 3087 Location: NYC,USA
|
|
|
|
Please use code tags going forward. Please use SEARCH button to find similar problems discussed and solution provided. Come up with what you tried and one may help further to get it fixed.
ibmmainframes.com/about59559.html |
|
Back to top |
|
 |
Joerg.Findeisen
Senior Member

Joined: 15 Aug 2015 Posts: 1387 Location: Bamberg, Germany
|
|
|
|
How comes that,
Code: |
987654320$$$$1000023455200110272021VNAME3 -> when 'True ' indicator, one character after date and before Name field is 'V'.
T987654320$$$$$$$000000001$$$$$$$$000000002003V -> Trailer record for same Primary key and 'V' value for indicator - starting with 'T',Primary Key,spaces,number of records,spaces,sum of amount,indiactor 'V' |
an input of 2001 (amount) gives a sum of 2003 for just one record?
Besides the funny part of it, the sample input can be processed with less than 20 lines of code to achieve the desired result.
Code: |
****** **************************** Datenanfang **********
000001 987654321$$$$1000023457100108272021 NAME1
000002 987654321$$$$1000023456100208022021 NAME2
000003 T987654321$$$$$$$000000002$$$$$$$$000000002003
000004 987654320$$$$1000023454200208272021 NAME4
000005 987654320$$$$1000023453200308272021 NAME5
000006 T987654320$$$$$$$000000002$$$$$$$$000000004005
000007 987654320$$$$1000023455200110272021VNAME3
000008 T987654320$$$$$$$000000001$$$$$$$$000000002001V
****** **************************** Datenende *********** |
Excuse eventually missing details, it's late over here. |
|
Back to top |
|
 |
prashantshinde
New User
.jpg)
Joined: 25 Nov 2005 Posts: 49 Location: Pune
|
|
|
|
Below is actual input file (without Name field) and my SORT JCL
Code: |
----+----1----+----2----+----3----+----4----+----5----+-
242124321,0000007396,1/1/2021 12:00:00 AM,False,8661.52
242124321,0000007397,1/2/2021 12:00:00 AM,True ,719.88
242124321,0000007406,1/3/2021 12:00:00 AM,False,890.00
|
SORT JCL -
Sort step1 - All IFTHEN-INOUT sortcard lines are not mentioned. only first line for 01 is mentioned. For 02 thru 09, similar sort card lines are present. Also in the OUTREC, time field is omitted as well as expanded date is handled
Code: |
OPTION COPY
INREC IFTHEN=(WHEN=INIT,FINDREP=(INOUT=(C'1/',C'01/'),
STARTPOS=22,ENDPOS=23)),
IFTHEN=(WHEN=INIT,FINDREP=(INOUT=(C'1/',C'01/'),
STARTPOS=24,ENDPOS=25)),
IFTHEN=(WHEN=INIT,FINDREP=(INOUT=(C'1/',C'01/'),
STARTPOS=25,ENDPOS=26)),
IFTHEN=(WHEN=INIT,
FINDREP=(INOUT=(C'False',C' '))),
IFTHEN=(WHEN=INIT,
FINDREP=(INOUT=(C'True ',C'V')))
OUTREC IFTHEN=(WHEN=(32,2,CH,EQ,X'4040'),
BUILD=(1,31,34,46)),
IFTHEN=(WHEN=(32,1,CH,EQ,X'40'),
BUILD=(1,31,33,46))
|
Step 2
Code: |
OPTION COPY
OUTFIL FNAMES=SRT1OUT,INCLUDE=(44,1,CH,EQ,C' '),
BUILD=(1,9,4X,11,10,46,20,SFF,EDIT=(TTTTTTTTTT),
22,2,25,2,28,4,44,1),
TRAILER1=(C'T',1,9,9X,COUNT=(M11,LENGTH=09)),REMOVECC
OUTFIL FNAMES=SRT2OUT,INCLUDE=(44,1,CH,EQ,C'V'),
BUILD=(1,9,4X,11,10,46,20,SFF,EDIT=(TTTTTTTTTT),
22,2,25,2,28,4,44,1),
TRAILER1=(C'T',1,9,9X,COUNT=(M11,LENGTH=09),
47:C'VOIDS'),REMOVECC
|
Step 3- Output files of Step2 are updated separately for Trailer Record count and amount
Code: |
OPTION COPY
OUTFIL IFTRAIL=(TRLID=(1,1,CH,EQ,C'T'),
TRLUPD=(20:COUNT=(M11,LENGTH=10),
35:TOT=(24,10,ZD,TO=ZD,LENGTH=12)))
|
Step 4 - Merged two output files to get final output as below
Code: |
----+----1----+----2----+----3----+----4----+----5----+
242124321 0000007396000086615201012021
242124321 0000007406000008900001032021
T242124321 0000000002 000000955152
242124321 0000007397000007198801022021V
T242124321 0000000001 000000071988VOIDS
|
[/code] |
|
Back to top |
|
 |
Joerg.Findeisen
Senior Member

Joined: 15 Aug 2015 Posts: 1387 Location: Bamberg, Germany
|
|
|
|
prashantshinde wrote: |
SORT JCL -
Sort step1 - All IFTHEN-INOUT sortcard lines are not mentioned. only first line for 01 is mentioned. For 02 thru 09, similar sort card lines are present. Also in the OUTREC, time field is omitted as well as expanded date is handled
Code: |
OPTION COPY
INREC IFTHEN=(WHEN=INIT,FINDREP=(INOUT=(C'1/',C'01/'),
STARTPOS=22,ENDPOS=23)),
IFTHEN=(WHEN=INIT,FINDREP=(INOUT=(C'1/',C'01/'),
STARTPOS=24,ENDPOS=25)),
IFTHEN=(WHEN=INIT,FINDREP=(INOUT=(C'1/',C'01/'),
STARTPOS=25,ENDPOS=26)),
IFTHEN=(WHEN=INIT,
FINDREP=(INOUT=(C'False',C' '))),
IFTHEN=(WHEN=INIT,
FINDREP=(INOUT=(C'True ',C'V')))
OUTREC IFTHEN=(WHEN=(32,2,CH,EQ,X'4040'),
BUILD=(1,31,34,46)),
IFTHEN=(WHEN=(32,1,CH,EQ,X'40'),
BUILD=(1,31,33,46)) |
|
Learn about PARSE and CHANGE from DFSORT Manuals. FINDREP/IFTRAIL is NOT needed here at all. Also two separate DSN that have to be merged, NOT needed. Save the Planet, avoid unnecessary CPU cycles!
Also your amount fields seem to not have negatives, right? |
|
Back to top |
|
 |
sergeyken
Senior Member

Joined: 29 Apr 2008 Posts: 2209 Location: USA
|
|
|
|
Try to fix the issues one by one, not altogether.
1) Learn and train yourself: how to convert ONLY floating date values to the fixed format:
Code: |
1/1/2021 -> 01/01/2021
01/1/2021 -> 01/01/2021
1/01/2021 -> 01/01/2021
01/01/2021 -> 01/01/2021 |
2) Learn and train yourself: how to convert ONLY floating CSV values to the fixed position format:
Code: |
...10.01,NAME1 -> ...10.01 ,NAME1
...100.01,NAME2 -> ...100.01 ,NAME2
...1000.01,NAME3 -> ...1000.01 ,NAME3
...10000.01,NAME4 -> ...10000.01 ,NAME4
...100000.01,NAME5 -> ...100000.01 ,NAME5 |
3) Learn and train yourself: how to sum ONLY floating point numeric values, both positive, and negative ones!
Code: |
10.01
100.01
1000.01
10000.01
100000.01
-1000.01
-----------
110110.04 |
You need to LEARN all these tricks separately, and by yourself, to be able to do your own job. Otherwise it looks like a homeless beggar on a street.
If you don't know how to do each specific operation, then ask about it. Requesting your whole job to be done by others from start to finish - it is unashamed, and frustrating. |
|
Back to top |
|
 |
Rohit Umarjikar
Global Moderator

Joined: 21 Sep 2010 Posts: 3087 Location: NYC,USA
|
|
|
|
I would simply finish this task by writing up COBOL program which would be easy to maintain for you and others at your site for any future changes in these rules. |
|
Back to top |
|
 |
Joerg.Findeisen
Senior Member

Joined: 15 Aug 2015 Posts: 1387 Location: Bamberg, Germany
|
|
|
|
Rohit Umarjikar wrote: |
I would simply finish this task by writing up COBOL program which would be easy to maintain for you and others at your site for any future changes in these rules. |
It requires, as I have written, less than 20 lines in DFSORT. No need to write a program for that. |
|
Back to top |
|
 |
prashantshinde
New User
.jpg)
Joined: 25 Nov 2005 Posts: 49 Location: Pune
|
|
|
|
I tried PARSE to format CSV records. It removed comma correctly but with different date formats, how can I get one digit day or month converted to two digit?
Input -
Code: |
----+----1----+----2----+----3----+----4----+----5----+--
202124321,0000007396,1/30/2021 12:00:00 AM,FALSE,8661.52
202124321,0000007397,12/1/2021 12:00:00 AM,TRUE ,719.88
202124321,0000007398,7/2/2021 12:00:00 AM,FALSE,76.26
202124321,0000007399,3/1/2021 12:00:00 AM,FALSE,36.81
202124321,0000007400,7/30/2021 12:00:00 AM,FALSE,1136.30
202124321,0000007401,11/2/2021 12:00:00 AM,FALSE,984.99
202124321,0000007402,6/5/2021 12:00:00 AM,TRUE ,1563.65
202124321,0000007405,7/12/2021 12:00:00 AM,FALSE,3315.32
202124321,0000007406,10/4/2021 12:00:00 AM,FALSE,890.00
202124321,0000007407,4/9/2021 12:00:00 AM,FALSE,967.00
|
SORT CARD -
Code: |
INREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=09),
%02=(ENDBEFR=C',',FIXLEN=10),
%03=(ENDBEFR=C':',FIXLEN=12),
%04=(ENDBEFR=C',',FIXLEN=08),
%05=(ENDBEFR=C',',FIXLEN=06),
%06=(ENDBEFR=C' ',FIXLEN=12)),
BUILD=(%01,%02,%03,%05,%06)
SORT FIELDS=COPY
|
Output -
Code: |
----+----1----+----2----+----3----+----4----
89202432130000073961/30/2021 12FALSE 8661.52
892024321300000739712/1/2021 12TRUE 719.88
89202432130000073987/2/2021 12FALSE 76.26
89202432130000073993/1/2021 12FALSE 36.81
89202432130000074007/30/2021 12FALSE 1136.30
892024321300000740111/2/2021 12FALSE 984.99
89202432130000074026/5/2021 12TRUE 1563.65
89202432130000074057/12/2021 12FALSE 3315.32
892024321300000740610/4/2021 12FALSE 890.00
89202432130000074074/9/2021 12FALSE 967.00
|
I have used ':' from time field as when date is in MM/DD/YYYY format there is no space left between date and time.
Secondly, I tried using SECTIONS to control break after change in Primary Key. I gave Trailer record with SUM of counts/amounts but results are not as expected. It produced Trailer record after every change in Primary key. Not like one trailer record for all primary key records from the entire file.
SORT
Code: |
INREC BUILD=(1,9,4X,11,10,46,20,SFF,EDIT=(TTTTTTTTTT),
22,2,25,2,28,4,44,1,C'00000001')
SORT FIELDS=(1,9,CH,A,11,10,CH,A)
SUM FIELDS=(43,8,ZD)
OPTION COPY
OUTREC BUILD=(1,50)
OUTFIL REMOVECC,NODETAIL,
SECTIONS=(1,9,
TRAILER3=(C'T',1,9,18:TOT=(24,20,SFF,EDIT=(TTTTTTTTTT))))
OUTPUT -
----+----1----+----2----+--
T202124321 9726043110
T202124291 2363268840
T202124321 2079636480
T202124291 8940361050
T202124321 5539074050
T202124291 0102020210
T202124321 5431406200
T202124291 2800310440
T202124321 5327056520
T202124291 1992733320
T202124321 0886095320
T202124291 1306660660
T202124321 0506445250
T202124291 2017552530
|
Note that input file to sort for SECTIONS is huge having blocks of records for both primary keys.[/code] |
|
Back to top |
|
 |
prashantshinde
New User
.jpg)
Joined: 25 Nov 2005 Posts: 49 Location: Pune
|
|
|
|
prashantshinde wrote: |
I tried PARSE to format CSV records. It removed comma correctly but with different date formats, how can I get one digit day or month converted to two digit?
Input -
Code: |
----+----1----+----2----+----3----+----4----+----5----+--
202124321,0000007396,1/30/2021 12:00:00 AM,FALSE,8661.52
202124321,0000007397,12/1/2021 12:00:00 AM,TRUE ,719.88
202124321,0000007398,7/2/2021 12:00:00 AM,FALSE,76.26
202124321,0000007399,3/1/2021 12:00:00 AM,FALSE,36.81
202124321,0000007400,7/30/2021 12:00:00 AM,FALSE,1136.30
202124321,0000007401,11/2/2021 12:00:00 AM,FALSE,984.99
202124321,0000007402,6/5/2021 12:00:00 AM,TRUE ,1563.65
202124321,0000007405,7/12/2021 12:00:00 AM,FALSE,3315.32
202124321,0000007406,10/4/2021 12:00:00 AM,FALSE,890.00
202124321,0000007407,4/9/2021 12:00:00 AM,FALSE,967.00
|
SORT CARD -
Code: |
INREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=09),
%02=(ENDBEFR=C',',FIXLEN=10),
%03=(ENDBEFR=C':',FIXLEN=12),
%04=(ENDBEFR=C',',FIXLEN=08),
%05=(ENDBEFR=C',',FIXLEN=06),
%06=(ENDBEFR=C' ',FIXLEN=12)),
BUILD=(%01,%02,%03,%05,%06)
SORT FIELDS=COPY
|
Output -
Code: |
----+----1----+----2----+----3----+----4----
89202432130000073961/30/2021 12FALSE 8661.52
892024321300000739712/1/2021 12TRUE 719.88
89202432130000073987/2/2021 12FALSE 76.26
89202432130000073993/1/2021 12FALSE 36.81
89202432130000074007/30/2021 12FALSE 1136.30
892024321300000740111/2/2021 12FALSE 984.99
89202432130000074026/5/2021 12TRUE 1563.65
89202432130000074057/12/2021 12FALSE 3315.32
892024321300000740610/4/2021 12FALSE 890.00
89202432130000074074/9/2021 12FALSE 967.00
|
I have used ':' from time field as when date is in MM/DD/YYYY format there is no space left between date and time.
Secondly, I tried using SECTIONS to control break after change in Primary Key. I gave Trailer record with SUM of counts/amounts but results are not as expected. It produced Trailer record after every change in Primary key. Not like one trailer record for all primary key records from the entire file.
SORT
Code: |
INREC BUILD=(1,9,4X,11,10,46,20,SFF,EDIT=(TTTTTTTTTT),
22,2,25,2,28,4,44,1,C'00000001')
SORT FIELDS=(1,9,CH,A,11,10,CH,A)
SUM FIELDS=(43,8,ZD)
OPTION COPY
OUTREC BUILD=(1,50)
OUTFIL REMOVECC,NODETAIL,
SECTIONS=(1,9,
TRAILER3=(C'T',1,9,18:TOT=(24,20,SFF,EDIT=(TTTTTTTTTT))))
OUTPUT -
----+----1----+----2----+--
T202124321 9726043110
T202124291 2363268840
T202124321 2079636480
T202124291 8940361050
T202124321 5539074050
T202124291 0102020210
T202124321 5431406200
T202124291 2800310440
T202124321 5327056520
T202124291 1992733320
T202124321 0886095320
T202124291 1306660660
T202124321 0506445250
T202124291 2017552530
|
Note that input file to sort for SECTIONS is huge having blocks of records for both primary keys.[/code] |
I removed 'NODETAIL' from SORT CARD and got detail + trailer records. I just need to format them now.
With only problem remaining is to handle different type of date formats. |
|
Back to top |
|
 |
Joerg.Findeisen
Senior Member

Joined: 15 Aug 2015 Posts: 1387 Location: Bamberg, Germany
|
|
|
|
Quote: |
I have used ':' from time field as when date is in MM/DD/YYYY format there is no space left between date and time. |
Parse in smaller parts and it should work.
Code: |
:
%03=(FIXLEN=2,ENDBEFR=C'/',REPEAT=2),
%05=(FIXLEN=4,ENDBEFR=C','),
:
BUILD=(..%03,UFF,M11,%04,UFF,M11,%05,UFF,M11,..) |
|
|
Back to top |
|
 |
Rohit Umarjikar
Global Moderator

Joined: 21 Sep 2010 Posts: 3087 Location: NYC,USA
|
|
|
|
Joerg.Findeisen wrote: |
Rohit Umarjikar wrote: |
I would simply finish this task by writing up COBOL program which would be easy to maintain for you and others at your site for any future changes in these rules. |
It requires, as I have written, less than 20 lines in DFSORT. No need to write a program for that. |
I have already shared the link to use via DFSORT which has similar problem and the approach can be grabbed with little twists. My point was it is sometimes easier to save time and maintain quite easily if one writes the program considering future changes and skills people possesses. Let us not argue on this to divert from the original post but thanks for the views. |
|
Back to top |
|
 |
prashantshinde
New User
.jpg)
Joined: 25 Nov 2005 Posts: 49 Location: Pune
|
|
|
|
Thank you Joerg
Date formatting is resolved now! Using sections, I am able to get Trailer record after each block of primary key and value in column 42 (originally False or True value). The only problem remained is I am getting two different files based on value in 42nd byte. So, in one file different primary key records are present but they all have same value on 42nd byte. Now while merging the files, records of same primary key from both files should come together.
Input File -
Code: |
----+----1----+----2----+----3----+----4----+----5----+---
202124321,0000007396,1/30/2021 12:00:00 AM,FALSE,8661.52
202124321,0000007397,12/1/2021 12:00:00 AM,TRUE ,719.88
202124321,0000007398,7/2/2021 12:00:00 AM,FALSE,76.26
202124321,0000007399,3/1/2021 12:00:00 AM,FALSE,36.81
202124321,0000007400,7/30/2021 12:00:00 AM,FALSE,1136.30
202124331,0000007401,11/2/2021 12:00:00 AM,FALSE,984.99
202124331,0000007402,6/5/2021 12:00:00 AM,TRUE ,1563.65
202124331,0000007405,7/12/2021 12:00:00 AM,FALSE,3315.32
202124331,0000007406,10/4/2021 12:00:00 AM,FALSE,890.00
202124331,0000007407,4/9/2021 12:00:00 AM,FALSE,967.00
|
SORT
Code: |
INREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=09),
%02=(ENDBEFR=C',',FIXLEN=10),
%03=(FIXLEN=2,ENDBEFR=C'/',REPEAT=2),
%05=(FIXLEN=4,ENDBEFR=C','),
%06=(ENDBEFR=C',',FIXLEN=05),
%07=(ENDBEFR=C' ',FIXLEN=12)),
BUILD=(%01,4X,%02,%07,SFF,EDIT=(TTTTTTTTTT),%03,UFF,M11,%04,
UFF,M11,%05,UFF,M11,%06,38X)
SORT FIELDS=(1,9,CH,A,38,5,CH,A)
OUTREC IFTHEN=(WHEN=INIT,FINDREP=(IN=C'FALSE',OUT=C' ')),
IFTHEN=(WHEN=INIT,FINDREP=(IN=C'TRUE ',OUT=C'V'))
OUTFIL FNAMES=SRT1OUT,REMOVECC,INCLUDE=(42,1,CH,EQ,C' '),
SECTIONS=(1,9,42,1,
TRAILER3=(C'T',1,9,9X,20:COUNT=(M11,LENGTH=10),
35:TOT=(24,10,ZD,TO=ZD,LENGTH=12)))
OUTFIL FNAMES=SRT2OUT,REMOVECC,INCLUDE=(42,1,CH,EQ,C'V'),
SECTIONS=(1,9,42,1,
TRAILER3=(C'T',1,9,9X,20:COUNT=(M11,LENGTH=10),
35:TOT=(24,10,ZD,TO=ZD,LENGTH=12),C'VOIDS'))
|
Output 1
Code: |
----+----1----+----2----+----3----+----4----+-
202124321 0000007396000086615201302021
202124321 0000007398000000762607022021
202124321 0000007399000000368103012021
202124321 0000007400000011363007302021
T202124321 0000000004 000000991089
202124331 0000007401000009849911022021
202124331 0000007405000033153207122021
202124331 0000007406000008900010042021
202124331 0000007407000009670004092021
T202124331 0000000004 000000615731
|
Output 2
Code: |
----+----1----+----2----+----3----+----4----+----5-
202124321 0000007397000007198812012021V
T202124321 0000000001 000000071988VOIDS
202124331 0000007402000015636506052021V
T202124331 0000000001 000000156365VOIDS
|
I am thinking of adding sequence number based on primary key value and then combine it together based on the same sequence number. Is there any other way to solve this issue? Also, I tried to above SORT without having two output files but not able to do that. |
|
Back to top |
|
 |
|
|