I need to get the record count of a specific type of record and write that count to a summary record. There is one summary record for each group of detail records. E.g. my input file format is as below:
Code:
D|COUNT|111111111111|123.23|XYZ COMMERCE|NEW YORK CITY
R|11111110444A|15022016|0.34|XYZ AFFILIATE|BOSTON
R|11111110444A|15032016|2.44|XYZ AFFILIATE|BOSTON
R|11111110444B|15022016|12.76|XYZ BROTHERS|LA CITY
R|11111110444B|15032016|112.76|XYZ BROTHERS|LA CITY
D|COUNT|222222211111|73.23|ABC CONSTRUCTION|NEW YORK CITY
R|22222220111A|15022016|0.34|ABC CONSTRUCTION|BOSTON
R|22222220111B|15032016|71.63|ABC CONS.|BOSTON
Output should be:
Code:
D|4|111111111111|123.23|XYZ COMMERCE|NEW YORK CITY
R|11111110444A|15022016|0.34|XYZ AFFILIATE|BOSTON
R|11111110444A|15032016|2.44|XYZ AFFILIATE|BOSTON
R|11111110444B|15022016|12.76|XYZ BROTHERS|LA CITY
R|11111110444B|15032016|112.76|XYZ BROTHERS|LA CITY
D|2|222222211111|73.23|ABC CONSTRUCTION|NEW YORK CITY
R|22222220111A|15022016|0.34|ABC CONSTRUCTION|BOSTON
R|22222220111B|15032016|71.63|ABC CONS.|BOSTON
In short, I just need the count of "R" records under each "D" record, placed in the second field of the pipe-delimited "D" record.
Could you please suggest the simplest way to do this using DFSORT?
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
Atul Banke,
The below DFSORT should give you something to start with. I have assumed the input to be FB with LRECL=80. You might want to modify it as per your requirements. It produces the following output:
D|4|111111111111|123.23|XYZ COMMERCE|NEW YORK CITY
R|11111110444A|15022016|0.34|XYZ AFFILIATE|BOSTON
R|11111110444A|15032016|2.44|XYZ AFFILIATE|BOSTON
R|11111110444B|15022016|12.76|XYZ BROTHERS|LA CITY
R|11111110444B|15032016|112.76|XYZ BROTHERS|LA CITY
D|2|222222211111|73.23|ABC CONSTRUCTION|NEW YORK CITY
R|22222220111A|15022016|0.34|ABC CONSTRUCTION|BOSTON
R|22222220111B|15032016|71.63|ABC CONS.|BOSTON
Under no circumstances can you get the count of the following records before you have read all of those records. You cannot guess in advance what comes next in your file. It's the same as getting your salary in advance, before starting the job.
At least two passes are required for this kind of record re-ordering.
The more natural way would be to create the totals under (after) your detail records.
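The two-pass idea described above can be sketched in Python (purely illustrative; the actual solutions in this thread use DFSORT). Pass 1 counts the "R" records under each "D" record; pass 2 rewrites the "D" records with their counts:

```python
def add_counts(lines):
    """Two-pass sketch: count R records per D group, then fill in the counts."""
    # Pass 1: one counter per D group, in order of appearance.
    counts = []
    for line in lines:
        if line.startswith("D|"):
            counts.append(0)
        elif line.startswith("R|"):
            counts[-1] += 1
    # Pass 2: replace the COUNT placeholder (field 2) in each D record.
    out, it = [], iter(counts)
    for line in lines:
        if line.startswith("D|"):
            fields = line.split("|")
            fields[1] = str(next(it))
            line = "|".join(fields)
        out.append(line)
    return out
```

This is exactly why at least two passes (or buffering) are unavoidable: the count is not known until the last record of the group has been read.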
Thanks a lot, it works perfectly, just as you suggested. I was also thinking that WHEN=GROUP followed by a join should work, but since I did not know about the SECTIONS parameter I could not get it working.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
At a brief view, you don't need two steps: all of the processing of the first step can be specified in the JNF1CNTL and JNF2CNTL files for your single input file (used twice).
magesh23586,
AFAIK it would require the same number of passes over the data.
Besides, you have used the SUM operator, which requires sorting the data. What I posted earlier was a SORT COPY, which does not require any sorting. The aim should always be to avoid sorting data when there is no need to. Saving resources should be the priority; we do not always gain by doing something in 'one step'.
magesh23586,
There is no justification for building a 207-byte record in the REFORMAT just to save a step. Keeping the intermediate record lengths to the minimum possible is as important as getting rid of an unnecessary sort.
Learn to keep things simple and short. As I mentioned earlier, doing everything in 'one step' might not always help, and it should not be the prime objective of any application.
Below is a "one step" ICETOOL version of the DFSORT solution posted earlier.
Though you do it in ICETOOL, your solution is a 4-pass solution (1 for the COPY plus 3 for the JOINKEYS), which is not at all recommended when a 1-step (JOINKEYS, 3-pass) solution is possible.
What happens when the second JOINKEYS step fails for some reason? We would have to start the COPY plus JOINKEYS again, which means wasting resources unnecessarily.
Bill, please correct me if I am wrong.
The following code would be optimal and simple: I am sorting only 12-byte records, which should be much better than reading the complete file with SECTIONS. Below is the post I asked Bill to update.
Bill Woodger wrote:
So any solution involving SORT is absurd: wasteful of resources. Unless absolutely necessary :-)
magesh23586,
I would assume that the above note by Bill, from one of the older topics, still holds true for your "optimal + simple" solution 1, which involves SORT and SUM.
Moreover, I would think we don't have to restart or abend on a daily basis. So if the COPY solution runs fine 99% of the time, I don't see any problem with it, unless the job fails every day.
Edit: your second "solution" still has unused bytes, in case you have not noticed it yet.
The problem is that the context of the original quote was the ability to use the OUTFIL reporting features: "Don't use SORT just to be able to use SUM." In JNFnCNTL there is no possibility of using the OUTFIL reporting features, but perhaps that in itself is not enough to justify the SUM.
It's not the SUM itself that is the problem; it is doing a SORT just to be able to do a SUM, because sorting is an expensive operation.
I think, starting from Arun's original solution: drop the first OUTFIL and do the appending in the JNF1CNTL. That avoids the "write", and the BUILD (which I can't see why it was needed).
I have certainly seen and used SUM in JNFnCNTL for this type of task. On balance, though, I think a separate step is going to be a more efficient way to do it than using the SUM.
From both the suggested solutions, I have another idea. Have to check it first...
Regarding solution 2 by magesh23586: we don't really need to complicate things by adding another identifier 'A'.
I would do something like this, if it has to be a "one step" DFSORT job.
Code:
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTJNF1 DD DSN= >> Input data set (FB/80)
//SORTJNF2 DD DSN= >> Same Input data set (FB/80)
//SORTOUT DD DSN= >> Output data set (FB/80)
//*
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(81,4,A,01,01,A),SORTED,NOSEQCK
JOINKEYS FILE=F2,FIELDS=(01,4,A,10,01,A),SORTED,NOSEQCK
JOIN UNPAIRED,F1
REFORMAT FIELDS=(F1:85,5,1,80,F2:5,5)
SORT FIELDS=COPY
INREC IFTHEN=(WHEN=(8,5,CH,EQ,C'COUNT'),
OVERLAY=(8:86,5),HIT=NEXT),
IFTHEN=(WHEN=ANY,
FINDREP=(IN=C' ',OUT=C'',STARTPOS=08,ENDPOS=11))
OUTFIL REMOVECC,NODETAIL,SECTIONS=(1,5,TRAILER3=(6,80)),BUILD=(80X)
/*
//JNF1CNTL DD *
INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'D'),
PUSH=(81:ID=4,SEQ=5))
/*
//JNF2CNTL DD *
INREC IFOUTLEN=10,
IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'D'),
PUSH=(01:ID=4,SEQ=5)),
IFTHEN=(WHEN=INIT,OVERLAY=(5:5,5,ZD,SUB,+1,M10,LENGTH=5,C'D'))
/*
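For anyone less familiar with WHEN=GROUP: the PUSH=(ID,SEQ) used in both JNFnCNTL files stamps every record with its group number and its sequence number within the group. Roughly, in a Python sketch (illustrative only, not DFSORT semantics in full):

```python
def push_id_seq(lines):
    """Mimic INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'D'),PUSH=(ID=4,SEQ=5)):
    each record gets a 4-digit group id and a 5-digit in-group sequence number."""
    gid, seq, out = 0, 0, []
    for line in lines:
        if line.startswith("D"):   # the BEGIN condition starts a new group
            gid += 1
            seq = 0
        seq += 1
        out.append((line, f"{gid:04d}", f"{seq:05d}"))
    return out
```

JNF2CNTL then subtracts 1 from the sequence, so the last record of each group carries exactly the count of R records under its D record; the SECTIONS/TRAILER3 on the F1 sequence keeps one output record per original record.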
My personal preference is not to generate duplicate records and then have to roll them up later. I would have to run a performance comparison between the two-step DFSORT and the one above; I will post the results here when I have them.
Just a thought: amid all these performance discussions, the problem may be solvable with a simpler approach.
1. Assuming the header (D) record is always unique.
Here is my approach:
a. I am not sure a sequence number is required, but I have added one in case the records need to be sorted back into their original order, along with a '1' for each record.
b. Whenever we get a hit on a header, push the entire header onto each record (append).
c. Group them (the pushed headers) and count; this gives the detail record count to overlay.
d. Overlay the count from step c when it is ONLY a header (D), and include only records which have a space at 51:1.
I stopped right here as I had to go; perhaps someone can modify this to finish it.
Hi Rohit,
Just in case you missed it: the approach described above for arriving at the detail record count is already used in one of the tested, working solutions posted earlier in this topic.
Rohit Umarjikar wrote:
c. Group them (the pushed headers) and count; this gives the detail record count to overlay.
d. Overlay the count from step c when it is ONLY a header (D), and include only records which have a space at 51:1.
I could not see how this piece can possibly produce the OP's expected result.
Rohit, the D record has been written out before all its subordinate records have been counted.
Sure, with multiple passes of the data you can do anything, but beyond three passes (at most) it is extremely likely to be more resource-effective to just write a program. Such a program would fail only if a group contained so many records that they could not fit in the storage available to it.
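A minimal sketch of such a program (Python, purely illustrative; it buffers one group at a time, which is exactly the storage limitation mentioned above):

```python
def rewrite(infile, outfile):
    """Buffer each D group, then write the D record with its R count filled in.
    Holds only one group in memory at a time; a huge group could exhaust storage."""
    group = []                            # current D record plus its R records

    def flush():
        if not group:
            return
        fields = group[0].rstrip("\n").split("|")
        fields[1] = str(len(group) - 1)   # R count goes into field 2 of the D record
        outfile.write("|".join(fields) + "\n")
        outfile.writelines(group[1:])

    for line in infile:
        if line.startswith("D|"):
            flush()                       # previous group is complete; emit it
            group = [line]
        else:
            group.append(line)
    flush()                               # emit the final group
```

The point stands: the D record cannot be written until its last subordinate record has been seen, so either the group is buffered (as here) or the data is passed over more than once.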